
Points of departure

There are six points of departure:

  1. The transcription is intended to be an orthographic, lexical transcription, with a few added details that represent audible acoustic events (speech and nonspeech) present in the corresponding waveform files. These extra marks aid in interpreting the text form of the utterance.
  2. The transcription is intended to be a quick and broad transcription; transcribers should not have to agonise over decisions, but rather realise that their transcription is intended to be a rough guide that others may examine further for details.
  3. Transcriptions should be made in two passes: a first pass in which WORDS are transcribed, and a second in which the additional details are added. Background noises and uh's are easy to miss unless specifically attended to. It is recommended that transcribers have some background in phonetics and/or linguistics, or that their training and preparation for the transcription task cover the basics of acoustic phonetics and of dialect and style variation.
  4. The overall aim is to keep as much speech in the corpus as possible and to avoid deleting recordings from the corpus merely because of extra noises, dysfluencies, etc.
  5. The conventions comprise both mandatory and optional transcriptions. All transcriptions should follow the mandatory guidelines precisely. Optional transcriptions are marked OPTIONAL in this document; if provided, they should be documented and should likewise follow these guidelines precisely. This requirement is intended to regulate the task of external validation. Optional markings have been chosen so that they can easily be removed or translated by automatic means to yield the base transcription form.
  6. The documentation provided with the database transcriptions should state accurately which optional transcriptions were performed and give all relevant additional information, such as the standard dictionary used, preferred spelling variants, etc.

In summary, the principles are "Keep it simple" and "Document everything adequately".
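
Point 5 states that optional markings can be removed by automatic means to yield the base transcription form. The following is a minimal sketch of what such a step might look like. The notation is purely hypothetical and is not taken from these guidelines: it assumes that optional markings are written as bracketed tags beginning with "opt:" (for example [opt:breath]), while other bracketed tags such as [noise] stand for mandatory markings and are left untouched. The function name to_base_form is likewise an illustration only.

```python
import re

# ASSUMPTION (illustration only): optional markings are bracketed tags that
# start with "opt:", e.g. [opt:breath]. The actual symbols are defined by the
# transcription conventions themselves, not by this sketch.
OPTIONAL_TAG = re.compile(r"\[opt:[^\]]*\]")


def to_base_form(transcription: str) -> str:
    """Remove optional tags and collapse the whitespace they leave behind."""
    without_tags = OPTIONAL_TAG.sub(" ", transcription)
    # split() with no arguments splits on any run of whitespace, so joining
    # with single spaces normalises the gaps left by the removed tags.
    return " ".join(without_tags.split())


if __name__ == "__main__":
    line = "yes [opt:breath] I would [noise] like that"
    print(to_base_form(line))
```

Because the hypothetical optional tags share one unambiguous prefix, a single pattern is enough to strip them; a real convention would need one pattern per optional symbol, as listed in the documentation that point 6 calls for.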



EAGLES SWLG SoftEdition, May 1997.