The last difference, and the most important one, must be looked at from two
different angles. The first point to understand is that the relevant category
of the data (which determines its collection) is already inherently given in the
case of NL, but entirely unknown in the case of physically recorded speech.
The ASCII symbols of a given text are elementary categories in themselves,
and they are used directly to form syntactically analysable expressions
representing all the different linguistically relevant categories. Relevant
categorical information can thus be inferred directly from categorically
given data and their ASCII representations. In contrast to this NL situation,
the data of a digital speech signal do not signal any such categories,
because they represent only a measured time function without any inherent
categorical interpretation. At the present stage in the development of SLP it
is not yet possible even to decide automatically whether a given digital
signal is a speech signal or not. The necessary categorical annotations for
SL data must therefore still be produced by human annotators (with the
increasing support of semi-automatic procedures).
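The contrast can be made concrete with a small sketch. The following Python
fragment is only illustrative: the file name utterance.wav and the label tier
are invented, and the point is merely that a text arrives as directly
interpretable categories, while a speech recording arrives as bare amplitude
samples whose categories must be added by annotation.

```python
import wave

# NL data: each ASCII character of a text is already an elementary
# category (a letter, a digit, a punctuation mark) and can be
# interpreted directly.
text = "The cat sat."
for ch in text:
    print(repr(ch), "alphabetic" if ch.isalpha() else "non-alphabetic")

# SL data: a digital speech signal is only a sampled time function;
# the raw amplitude values carry no inherent categorical interpretation.
# ("utterance.wav" is a hypothetical recording.)
with wave.open("utterance.wav", "rb") as f:
    samples = f.readframes(f.getnframes())  # amplitude values, nothing more

# The categorical structure must therefore be supplied by human
# annotators, e.g. as a time-aligned label tier (all values invented):
annotation = [
    (0.00, 0.12, "sil"),  # leading silence
    (0.12, 0.21, "dh"),   # first segment of "the"
    (0.21, 0.29, "ax"),
]
```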
The second point to consider in judging the different roles of categories and
time functions in speech technology is that speech signals contain relevant
prosodic and paralinguistic information that is not represented in the pure
text of what was pronounced in a given utterance. As long as NLP is restricted
to the processing of written language, the restriction to NL data poses no
severe problems. But as soon as real speech utterances are to be processed in
an information technology application, these other, non-linguistic but
communicatively highly relevant categories cannot be ignored. They must be
represented in future SL data collections, and much effort still has to be
invested by the international scientific community to deal with all these
information-bearing aspects of any given speech utterance.
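One kind of such signal-borne information can be illustrated with a short
sketch. The following Python fragment is a deliberate simplification: it uses
a synthetic two-tone signal instead of real speech and a crude autocorrelation
estimator, but it shows how a fundamental-frequency (pitch) contour, a basic
prosodic category, is recoverable only from the time function and not from any
transcript of the words spoken.

```python
import numpy as np

def estimate_f0(frame, sr, fmin=75.0, fmax=400.0):
    """Crude autocorrelation F0 estimate for one voiced frame (sketch only)."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)
    lag = lo + int(np.argmax(ac[lo:hi]))  # strongest periodicity in range
    return sr / lag

sr = 16000
t = np.arange(0, 0.3, 1 / sr)
# Synthetic "utterance": a 150 Hz stretch followed by a 220 Hz stretch,
# mimicking a pitch rise across two words.
signal = np.concatenate([np.sin(2 * np.pi * 150 * t),
                         np.sin(2 * np.pi * 220 * t)])

# Frame-wise F0 contour: prosodic information recoverable from the time
# function, but absent from the text of the utterance.
frame_len = 480  # 30 ms at 16 kHz
for start in range(0, len(signal) - frame_len, frame_len):
    f0 = estimate_f0(signal[start:start + frame_len], sr)
    print(f"{start / sr:5.2f} s  F0 = {f0:6.1f} Hz")
```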