Physical transcription


The most detailed level of representation is the physical level. It need not relate only to an acoustic record: it could have separate tiers for different types of input (e.g. nasal transmission detectors, palatography). However, acoustic parameters are likely to be the most frequent type of representation, and different parameters may be needed for particular application areas (e.g. filter bank output energies, formant frequencies, LPC and cepstral coefficients, fundamental frequency, or electroglottograph output waveform). The physical events may be overlapping or discrete in time, with each parameter allotted a separate annotation row (e.g. nasal resonance, periodicity, high-frequency noise). This level of labelling has not been generally used in speech technology research to date, but it has the potential to serve as a resource for developing speech synthesisers with greater "naturalness", and speech recognisers which include more speech knowledge in their algorithms.
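The row-per-parameter layout described above can be illustrated with a minimal sketch. It is not a format defined by this handbook: the class names `Event` and `PhysicalAnnotation`, the method names, the time values, and the labels are all hypothetical, chosen only to show how events on different rows may overlap in time.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """One labelled interval on an annotation row (times in seconds)."""
    start: float
    end: float
    label: str


@dataclass
class PhysicalAnnotation:
    """Hypothetical container: one annotation row (tier) per physical parameter."""
    tiers: dict[str, list[Event]] = field(default_factory=dict)

    def add(self, tier: str, start: float, end: float, label: str = "") -> None:
        """Append an event to the named row, creating the row if needed."""
        if end <= start:
            raise ValueError("event must have positive duration")
        self.tiers.setdefault(tier, []).append(Event(start, end, label))

    def active_at(self, t: float) -> dict[str, list[str]]:
        """Return the labels of all events covering time t, grouped by row."""
        result: dict[str, list[str]] = {}
        for name, events in self.tiers.items():
            hits = [e.label for e in events if e.start <= t < e.end]
            if hits:
                result[name] = hits
        return result


# Illustrative values only (assumed, not measured data).
ann = PhysicalAnnotation()
ann.add("nasal resonance", 0.10, 0.25, "present")
ann.add("periodicity", 0.05, 0.30, "voiced")
ann.add("high-frequency noise", 0.28, 0.40, "present")

at_020 = ann.active_at(0.20)  # nasal resonance and periodicity overlap here
at_029 = ann.active_at(0.29)  # periodicity and high-frequency noise overlap here
```

Keeping each parameter on its own row means that overlapping events need no special treatment: a query at a given time simply collects whatever is active on each row independently.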

EAGLES SWLG SoftEdition, May 1997.