Speaker classification tasks


If the goal is to decide whether a given speech utterance was produced by a male or a female speaker, this particular problem of speaker classification can be referred to as sex identification.
When the goal is to classify a speaker within an age group, from a spoken utterance, the problem can be called age identification.
Some health professionals are interested in detecting pathologies using voice samples (for instance, vocal cord dysfunctions). This problem of pathology identification is a particular case of health state identification.
Any task that consists in determining whether a speaker is angry, sad, stressed, calm, happy, relaxed, etc. falls under mood identification.
By the term accent identification we will understand any process that consists in determining some aspects of the sociological background of the speaker. The most realistic goal is certainly to try to identify a regional accent for a native speaker, or a linguistic origin for a non-native speaker.
For some applications, it is necessary to assign a speaker to one of several categories whose characteristics cannot be expressed in objective terms. This task can be covered by the general term speaker cluster selection.

In the special case where the goal is to identify the language in which a given speech utterance has been produced, we recommend using the term spoken language identification instead of the usual expression language identification, as the latter can be confused with written language identification.

Finally, if the task consists in finding information about the identity of the speaker from a speech signal, it is classically designated as speaker recognition.

For speaker classification and recognition tasks, a general distinction must be made between identification and verification. While identification consists in finding to which class or speaker a speech utterance is most likely to belong, verification aims at validating or dismissing the hypothesis that the utterance pertains to a given class or speaker.

Examples of speaker class identification are given above. For speaker class verification, a typical problem of age verification would consist in checking whether a speaker is an adult or not, and spoken language verification would aim at checking whether an utterance was pronounced in a given language (the expected language of an application, for instance).
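
The distinction between identification and verification can be illustrated with a minimal Python sketch. It assumes that some scoring method, not specified here, has already assigned each candidate class a numeric score for the utterance (higher meaning more likely). The function names, the example scores and the threshold value are all hypothetical and serve only to show the two kinds of decision.

```python
from typing import Mapping


def identify(scores: Mapping[str, float]) -> str:
    """Identification: return the class (or speaker) with the highest score."""
    return max(scores, key=scores.get)


def verify(scores: Mapping[str, float], claimed: str, threshold: float) -> bool:
    """Verification: accept the hypothesis that the utterance belongs to
    the claimed class (or speaker) if its score reaches the threshold."""
    return scores[claimed] >= threshold


# Hypothetical scores for one utterance (assumed values, for illustration only).
scores = {"adult": -41.2, "child": -47.9}

identified_class = identify(scores)                        # open choice among all classes
adult_accepted = verify(scores, "adult", threshold=-45.0)  # test of a single hypothesis
```

Identification always returns one of the known classes, whereas verification yields a yes/no decision about a single claimed class, so its behaviour depends on the chosen threshold.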

In the rest of this chapter, we will mainly focus on speaker identification and verification. However, most concepts are easy to generalise to other speaker classification problems.


EAGLES SWLG SoftEdition, May 1997.