As mentioned earlier, speech corpora are always designed for specific purposes. These
purposes determine the content and design of a corpus. Thus, a speech therapist
interested in pathological speech will collect a completely different corpus from
that of a designer of a telephone response application. For example, in the first
case high-fidelity speech recordings are most probably needed in order to study
properties of voice quality, whereas in the latter case realistic speech should be
collected over the telephone, which will result in rather poor speech quality.
In this section we will present a non-exhaustive list of possible users of speech corpora together
with the specific types of speech corpora they would need.
A distinction will be made between corpora for research purposes and those
meant for technological applications. Of course, this does not mean that corpora
gathered in one field cannot be used in the other, although how readily a corpus
can be reused in the other field will vary from corpus to corpus.
It should be clear that we cannot cover all the details of specific corpora; we will indicate
only some general properties.