Applications of spoken language corpora


As mentioned earlier, speech corpora are always designed for specific purposes, and these purposes determine the content and design of a corpus. Thus, a speech therapist interested in pathological speech will collect a completely different corpus from a designer of a telephone response application. In the first case, high-fidelity recordings are most probably needed in order to study properties of voice quality, whereas in the latter case realistic speech should be collected over the telephone, which will result in rather poor speech quality.
In this section we present a non-exhaustive list of possible users of speech corpora, together with the specific types of speech corpora they would need. A distinction is made between corpora for research purposes and those meant for technological applications. Of course, this does not mean that corpora gathered in one field cannot be used in the other, although how readily a corpus can be reused across fields will differ from corpus to corpus. We cannot cover all the details of specific corpora here, and will indicate only some general properties.

