Speaker-adapted systems

The system may incorporate a speaker adaptation procedure that allows it to learn the characteristics of the current speaker and thus improve its performance during the interaction. Initially the system may run in a degraded mode (either speaker-independent, or speaker-dependent but trained on another speaker) and end up as an optimised speaker-adapted system.

The adaptation may have to be done by the application developer in order to tune the system to a specific application. Usually two approaches are used:

A static adaptation process:
This starts from an off-line recording of data followed by a training phase, both completed before the system is used. The system references are adapted to the new speaker once and for all. The duration of this process matters: it may run in real time or last for hours. The speech data needed may be acoustic data without any manual labelling or manual pre-processing, or it may have to be labelled (orthographic plus phonetic). The speech corpus may range from a few minutes to a few hours of speech.
A dynamic adaptation process:
The system learns the characteristics of the current speaker while the speaker is using the system. This may be done at the user's request if errors occur during the application, or the system may automatically take into account the speech uttered by the current speaker.
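
The difference between the two approaches can be illustrated with a minimal Python sketch. The text above does not name an adaptation algorithm, so the sketch assumes a simple MAP-style update in which each system reference is a mean feature vector, interpolated between a speaker-independent prior and the new speaker's data. The names AdaptedReference, adapt_static and tau are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Sequence

Vector = List[float]


@dataclass
class AdaptedReference:
    """One system reference, assumed here to be a mean feature vector."""
    prior_mean: Vector   # speaker-independent starting point
    tau: float = 10.0    # hypothetical weight given to the prior
    count: int = 0       # number of adaptation frames seen so far
    total: Vector = field(default_factory=list)  # running sum of those frames

    def __post_init__(self) -> None:
        if not self.total:
            self.total = [0.0] * len(self.prior_mean)

    def observe(self, frame: Sequence[float]) -> None:
        """Dynamic adaptation: fold in one frame while the system is in use."""
        self.count += 1
        self.total = [s + x for s, x in zip(self.total, frame)]

    @property
    def mean(self) -> Vector:
        """Interpolate between the prior and the speaker data seen so far."""
        denom = self.tau + self.count
        return [(self.tau * p + s) / denom
                for p, s in zip(self.prior_mean, self.total)]


def adapt_static(prior_mean: Sequence[float],
                 frames: Sequence[Sequence[float]],
                 tau: float = 10.0) -> Vector:
    """Static adaptation: process a pre-recorded corpus once, before use."""
    ref = AdaptedReference(prior_mean=list(prior_mean), tau=tau)
    for frame in frames:
        ref.observe(frame)
    return ref.mean  # fixed "once and for all" after this call


if __name__ == "__main__":
    prior = [0.0, 0.0]
    recorded = [[1.0, 2.0]] * 10          # off-line adaptation corpus
    print(adapt_static(prior, recorded))  # [0.5, 1.0]

    live = AdaptedReference(prior_mean=prior)
    live.observe([1.0, 2.0])              # one frame from the current speaker
    print(live.mean)                      # moves gradually away from the prior
```

In this sketch the static variant consumes the whole recorded corpus before the system is used, whereas the dynamic variant calls observe as speech arrives, so the reference drifts towards the current speaker as more data is seen. A real recogniser would adapt far richer models than a single mean vector.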

If such a procedure is available, the application developer has to know how to use it and how to collect and process the required data.

EAGLES SWLG SoftEdition, May 1997.