
Segmentation and labelling in the VERBMOBIL project

     

In the German VERBMOBIL project, segmentation and labelling of recorded speech data is a fundamental part of the research. The following procedures are adopted [Kohler et al. (1995)]. The phonemic labels are based on the SAMPA symbols for German, augmented by extra labels for phonetic segments such as plosive release, aspiration after a plosive, creak, and nasalisation of a vowel as a reflex of a deleted nasal. Hence the labelling is basically phonemic, but is partly carried out at the narrow phonetic level.

During the labelling process, a label will be aligned with the start of the portion of speech that is considered to represent its chief acoustic correlates. Labels are discrete and non-overlapping, except in the following cases:

  1. Labels for creak and nasalisation are always superimposed on other labels, which they modify.
  2. A special label (-MA) is used to indicate that the phonetic correlates of one or more deleted segments are present in the surrounding material. For example, where an unstressed rounded vowel has been deleted, labialisation may still be present in a neighbouring consonant, and will be marked in this way.

Inter-labeller consistency is maintained in three ways, as follows:

  1. The inventory of possible labels is restricted mainly to the list of German phonemes. This restriction minimises the possibility of error.
  2. The labeller works from a citation-phonemic form of the utterance that has been previously prepared. This eliminates gross errors.
  3. There are restrictions on the types of modification allowed. The labeller is permitted to mark the following: deletion (where the initially provided label is followed by a hyphen); insertion (where the new label is prefixed with a hyphen); and substitution (where the new label is inserted after the one initially provided, separated from it by a hyphen).
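
The hyphen notation for the three permitted modifications can be illustrated with a short sketch. It is not the project's own software: the function name `parse_label`, the `Label` record, and the example tokens are hypothetical, and the sketch assumes that no base symbol itself contains a hyphen and that the special label -MA is handled separately from ordinary insertions.

```python
from typing import NamedTuple, Optional


class Label(NamedTuple):
    kind: str                  # "unchanged", "deletion", "insertion", "substitution" or "marker"
    canonical: Optional[str]   # symbol from the citation-phonemic form, if any
    realised: Optional[str]    # symbol actually observed, if any


def parse_label(token: str) -> Label:
    """Classify one label token according to the hyphen conventions (illustrative only)."""
    if token == "-MA":
        # Assumption: the special label is treated as a marker, not as an insertion.
        return Label("marker", None, None)
    if "-" not in token:
        return Label("unchanged", token, token)
    if token.count("-") > 1:
        raise ValueError(f"more than one hyphen in label: {token!r}")
    canonical, _, realised = token.partition("-")
    if canonical and realised:
        return Label("substitution", canonical, realised)  # e.g. "n-m"
    if canonical:
        return Label("deletion", canonical, None)          # e.g. "t-"
    if realised:
        return Label("insertion", None, realised)          # e.g. "-p"
    raise ValueError("a bare hyphen is not a valid label")


if __name__ == "__main__":
    for tok in ["a", "t-", "-p", "n-m", "-MA"]:
        print(tok, parse_label(tok))
```

Keeping the canonical and realised symbols in separate fields means the citation form can always be recovered from a modified transcription, which is the property the restricted notation is designed to preserve.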

The checking of segmented and labelled speech files is carried out partly by a program developed at IPDS Kiel that detects invalid sequences of symbols, and partly by experienced labellers checking the work of less experienced transcribers [Kohler et al. (1995)]. All segmenting and labelling is carried out manually. The initial citation-phonemic transcription is generated at IPDS Kiel with the help of the grapheme-to-phoneme converter of the Rulsys/Infovox TTS system for German [Kohler et al. (1995)], and is subsequently checked manually for mistakes. A system of prosodic labelling has also been developed: PROLAB [Kohler et al. (1995)].
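
The rules applied by the Kiel checking program are not described here, so the following is only a sketch of the general idea of automatic sequence checking. The symbol inventory, the forbidden pair, and the function name `check_sequence` are all hypothetical placeholders, not the project's actual data or code.

```python
# Hypothetical, deliberately tiny inventory; a real one would list all permitted labels.
INVENTORY = {"a", "e", "I", "n", "m", "t", "d", "@", "S", "p", "h"}

# Hypothetical example of an adjacent pair that is treated as invalid.
INVALID_PAIRS = {("h", "h")}


def check_sequence(symbols):
    """Return a sorted list of (position, message) for every problem found."""
    problems = []
    # Check 1: every symbol must belong to the restricted inventory.
    for i, sym in enumerate(symbols):
        if sym not in INVENTORY:
            problems.append((i, f"unknown symbol {sym!r}"))
    # Check 2: no adjacent pair may be listed as an invalid sequence.
    for i, pair in enumerate(zip(symbols, symbols[1:])):
        if pair in INVALID_PAIRS:
            problems.append((i, f"invalid sequence {pair[0]!r} {pair[1]!r}"))
    return sorted(problems)


if __name__ == "__main__":
    print(check_sequence(["t", "a", "n"]))
    print(check_sequence(["t", "x", "h", "h"]))
```

A check of this kind catches only mechanical errors, which is consistent with the role of experienced labellers in reviewing the remaining judgement calls.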



EAGLES SWLG SoftEdition, May 1997.