There is a wide range of technologies which fall under the general banner of ``spoken language processing'' (SLP) including:
Many of these technologies rely heavily on the availability of substantial quantities of recorded speech material: first, as a source of data from which to derive the parameters of their constituent models (manually or automatically), and second, in order to assess their behaviour under controlled (repeatable) test conditions.
Of course, very few spoken language processing applications involve stand-alone spoken language technology. Spoken language provides an essential component of the more general human-computer interface alongside other input/output modalities such as handwriting, typing, pointing, imaging and graphics (see Figure 1.3). This means that the actions and behaviours of the speech-specific components of a spoken language system inevitably have to be orchestrated with respect to the other modalities and to the application itself by some form of interactive dialogue process (simultaneously taking into account the wide range of human factors involved).
The complexity of the human-computer interface, and the subtle role of speech and language processing within it, have been (and continue to be) a prime source of difficulty in deploying spoken language systems in ``real'' applications. Not only are field conditions very different to laboratory conditions, but there has been a serious lack of agreed protocols for testing such systems and for measuring their overall effectiveness.
Figure 1.3: Multimodal human-computer interface (HCI) including speech/language input/output