Physical transcription



The most detailed level of representation is the physical level. It need not relate only to an acoustic record, but could have separate tiers for different types of input (e.g. nasal transmission detectors, palatography). However, acoustic parameters are likely to be the most frequent type of representation, and different parameter types may be needed for particular application areas (e.g. filter bank output energies, formant frequencies, LPC and cepstral coefficients, fundamental frequency or electroglottograph output waveform). The physical events may be overlapping or discrete in time, with each parameter allotted a separate annotation row (e.g. nasal resonance, periodicity, high-frequency noise). This level of labelling has not been generally used in speech technology research to date, but it has the potential to serve as a resource for developing speech synthesisers with greater "naturalness", and speech recognisers which include more speech knowledge in their algorithms.
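The row-per-parameter layout described above can be illustrated with a small sketch. This is only one possible representation, not a format defined by this handbook: the names Event, Tier, overlaps and co_occurring are hypothetical, and the times and labels are invented purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """One physical event on a tier; times are in seconds (assumed unit)."""
    start: float
    end: float
    label: str


@dataclass
class Tier:
    """One annotation row, holding the events for a single parameter."""
    name: str
    events: list = field(default_factory=list)

    def add(self, start, end, label=""):
        if end <= start:
            raise ValueError("event must have positive duration")
        self.events.append(Event(start, end, label))

    def active_at(self, t):
        """Return the events on this tier that span time t."""
        return [e for e in self.events if e.start <= t < e.end]


def overlaps(a, b):
    """True if two events share some stretch of time."""
    return a.start < b.end and b.start < a.end


def co_occurring(tier_a, tier_b):
    """Pairs of events, one from each tier, that overlap in time."""
    return [(a, b) for a in tier_a.events for b in tier_b.events
            if overlaps(a, b)]


# Hypothetical annotation: three rows, as in the examples in the text.
nasal = Tier("nasal resonance")
periodicity = Tier("periodicity")
noise = Tier("high-frequency noise")

nasal.add(0.10, 0.25, "present")
periodicity.add(0.05, 0.30, "voiced")
noise.add(0.30, 0.42, "frication")

# Nasal resonance lies inside the voiced stretch, so the two overlap;
# the noise event starts where voicing ends, so those two are discrete.
nasal_and_voiced = co_occurring(nasal, periodicity)
voiced_and_noise = co_occurring(periodicity, noise)
```

Keeping each parameter on its own row means that overlap between parameters is computed from the time stamps rather than encoded in the labels, so further rows (for example an articulatory tier) could be added without changing the existing ones.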

EAGLES SWLG SoftEdition, May 1997.