Linguistic vs. acoustic

The more complex TTS systems can roughly be divided into a linguistic interface, which transforms spelling into an abstract phonological code, and an acoustic interface, which converts this symbolic representation into an audible waveform. The quality of the intermediate representation can be tested directly at the symbolic-linguistic level or indirectly at the level of the acoustic output. Testing the audio output has the advantage that only those errors in the symbolic representation that have audible consequences will affect the evaluation. Its disadvantage is that it requires human listeners and is therefore costly and time-consuming. Moreover, the results of acoustic testing are unspecific: they do not tell the designer whether a problem originates at the linguistic or at the acoustic level. As an alternative, the intermediate representations in the linguistic interface are often evaluated at the symbolic level. Comparing the symbolic output of a linguistic module with a pre-stored key or model representation and determining the discrepancies is a relatively easy task, and this is what is normally done. The non-trivial problem is where to obtain the model representations. These will generally have to be compiled manually (or semi-automatically at best), and a given input often has more than one correct solution.
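The symbolic comparison described above can be illustrated with a minimal sketch. It assumes that the linguistic module outputs one transcription string per word and that the key stores a set of accepted transcriptions per word, so that multiple correct solutions are allowed. The function name `score_transcriptions`, the data layout, and the SAMPA-like transcriptions are hypothetical choices made for illustration only; they are not taken from any particular TTS system.

```python
from typing import Dict, List, Set, Tuple


def score_transcriptions(
    predicted: Dict[str, str],
    reference_key: Dict[str, Set[str]],
) -> Tuple[float, List[Tuple[str, str, Set[str]]]]:
    """Compare module output with a key that may list several correct forms.

    Returns the proportion of key words whose predicted transcription
    matches any accepted form, together with the list of discrepancies.
    """
    mismatches = []
    for word, accepted in reference_key.items():
        # A word missing from the module output counts as an error.
        output = predicted.get(word, "")
        if output not in accepted:
            mismatches.append((word, output, accepted))
    total = len(reference_key)
    accuracy = (total - len(mismatches)) / total if total else 0.0
    return accuracy, mismatches


# Hypothetical, manually compiled key (illustrative transcriptions only).
reference_key = {
    "either": {"i: D @", "aI D @"},  # two accepted pronunciations
    "cat": {"k { t"},
    "read": {"r i: d", "r E d"},     # form depends on tense
}

# Hypothetical output of the linguistic module under test.
predicted = {"either": "aI D @", "cat": "k { t", "read": "r e I d"}

accuracy, mismatches = score_transcriptions(predicted, reference_key)
for word, output, accepted in mismatches:
    print(f"{word}: got {output!r}, expected one of {sorted(accepted)}")
```

In this sketch the comparison itself takes only a few lines; the effort lies in compiling `reference_key`, which is the non-trivial problem noted in the text.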

EAGLES SWLG SoftEdition, May 1997.