
Training

The following description applies to both speaker verification and speaker identification. The system may require the speaker to utter a sentence or a sequence of words in order to provide samples of his or her speech. During the exploitation phase, the comparison uses a reference dictionary obtained during the training phase.
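The relation between the two phases can be illustrated with a minimal sketch. It assumes that each speech sample has already been reduced to a fixed-length feature vector; the mean-vector model, the Euclidean distance and all function names are illustrative assumptions, not the method of any particular system.

```python
import math


def train(samples_by_speaker):
    """Training phase: build a reference dictionary {speaker_id: model}.

    Assumption: the model is simply the mean of the speaker's feature vectors.
    """
    reference = {}
    for speaker, vectors in samples_by_speaker.items():
        dim = len(vectors[0])
        reference[speaker] = [
            sum(v[i] for v in vectors) / len(vectors) for i in range(dim)
        ]
    return reference


def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(reference, vector):
    """Identification: return the enrolled speaker whose model is closest."""
    return min(reference, key=lambda s: distance(reference[s], vector))


def verify(reference, claimed, vector, threshold):
    """Verification: accept the claimed identity if the distance is small enough."""
    return distance(reference[claimed], vector) <= threshold


# Hypothetical two-dimensional feature vectors for two speakers.
reference = train({
    "alice": [[1.0, 2.0], [1.2, 1.8]],
    "bob": [[4.0, 0.5], [3.8, 0.7]],
})
test_vector = [1.0, 2.1]
print(identify(reference, test_vector))
print(verify(reference, "bob", test_vector, threshold=0.5))
```

Both exploitation functions read the same reference dictionary, which is why the training description covers verification and identification alike.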

The training phase may be carried out off-line on a dedicated platform, or on-line while the application is operating. The application developer needs to know whether the training can be done in-house (or by the end-user), or whether the data must be delivered to the technology provider, who will then supply the speech models.

The training material can be specified by the technology provider as a list of phonetically balanced sentences, well-chosen sequences of words, or data selected according to particular criteria (e.g. phonetic coverage of the language). In some cases this material has to be collected and modelled by the application developer. In other cases this is done automatically during a training session that is treated as a black-box procedure. In all cases the technology provider should indicate the size and characteristics of the speech database needed to achieve the required performance.

The system documentation should also indicate the know-how needed to make the best use of the technology when training is carried out by the application developer. This may include a list of appropriate phonetically balanced sentences for each language, where required, a tool to generate a minimal set of sentences or words, a selected list of words, etc.
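A tool for generating a small set of sentences could work along the following lines. This is only a sketch: it assumes that each candidate sentence has already been transcribed into the set of phones it contains, and it uses a greedy selection, which yields a small covering set but not necessarily the smallest one. The function name and the toy data are hypothetical.

```python
def select_sentences(phones_by_sentence, target_phones):
    """Greedily pick sentences until the target phone inventory is covered.

    Returns the chosen sentences and any phones that no candidate covers.
    """
    remaining = set(target_phones)
    chosen = []
    while remaining:
        # Pick the sentence that covers the most still-missing phones.
        best = max(
            phones_by_sentence,
            key=lambda s: len(remaining & phones_by_sentence[s]),
        )
        gained = remaining & phones_by_sentence[best]
        if not gained:
            break  # no candidate covers any of the remaining phones
        chosen.append(best)
        remaining -= gained
    return chosen, remaining


# Toy data: sentence identifiers mapped to placeholder phone symbols.
candidates = {
    "s1": {"a", "b", "c"},
    "s2": {"c", "d"},
    "s3": {"d", "e"},
    "s4": {"a"},
}
chosen, uncovered = select_sentences(candidates, {"a", "b", "c", "d", "e"})
print(chosen, uncovered)
```

Any phones returned as uncovered indicate that further candidate sentences are needed before the material reaches the intended phonetic coverage.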

If the training is done off-line using a database that has to be recorded beforehand, the application developer has to know what interventions are necessary to obtain a usable corpus. These may include speech segmentation, speech labelling using phonetic labels, orthographic transcription, etc. Consequently the application developer should request an adequate development platform with suitable tools, such as a speech recording and analysis environment.
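As an illustration of what a usable corpus entry might contain, the sketch below stores the segmentation, the phonetic labels and the orthographic transcription of one recording, and runs a basic consistency check. The record layout, field names and checks are assumptions made for this example, not a format prescribed by any provider.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    start: float  # segment start time in seconds
    end: float    # segment end time in seconds
    label: str    # phonetic label


@dataclass
class Utterance:
    audio_file: str
    speaker_id: str
    transcription: str  # orthographic transcription
    segments: list = field(default_factory=list)


def annotation_problems(utt, duration):
    """Return a list of problems that would make the utterance unusable."""
    problems = []
    if not utt.transcription.strip():
        problems.append("missing orthographic transcription")
    previous_end = 0.0
    for seg in utt.segments:
        if seg.start < previous_end:
            problems.append(f"segment '{seg.label}' overlaps the previous one")
        if seg.end <= seg.start or seg.end > duration:
            problems.append(f"segment '{seg.label}' has invalid boundaries")
        previous_end = max(previous_end, seg.end)
    return problems


# Hypothetical entries: one consistent, one with two annotation problems.
good = Utterance("rec001.wav", "spk01", "he",
                 [Segment(0.0, 0.1, "h"), Segment(0.1, 0.3, "e")])
bad = Utterance("rec002.wav", "spk01", "",
                [Segment(0.0, 0.2, "h"), Segment(0.1, 0.3, "e")])
print(annotation_problems(good, duration=0.3))
print(annotation_problems(bad, duration=0.3))
```

Checks of this kind are typically part of the recording and analysis environment, so that faulty annotations are caught before the models are trained.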




EAGLES SWLG SoftEdition, May 1997.