Sources of recognition errors

Looking at the basic architecture shown in Figure 7.1, we see that there are several distinct reasons why a speech recognition system, in particular a large-vocabulary continuous-speech system, can make a recognition error:

acoustic-phonetic modelling: This covers all parts of the system that deal with the acoustic signal:
- signal analysis;
- phoneme modelling:
  - the inventory of context independent and context dependent phoneme units;
  - in most cases, the phoneme units are represented by Hidden Markov models [Levinson et al. (1983), Bahl et al. (1983)]; any of their details, such as topology and emission probabilities, may affect the error rate;
- pronunciation lexicon: the pronunciation lexicon serves as the link between the word level and the phoneme units.
It is obvious that any of these three levels of acoustic-phonetic modelling can cause recognition errors. For example, a word whose entry in the pronunciation lexicon is incorrect is unlikely to be recognised correctly.
language modelling: If the language model is poor, it can do little to resolve the ambiguities left by acoustic recognition.
search errors: A full, i.e. globally optimal, search is prohibitively expensive for large-vocabulary speech recognition. It is therefore abandoned and replaced by a suboptimal search. Failing to find the globally optimal word sequence may cause additional recognition errors. These search errors will disappear if the search effort is increased so that more hypotheses about the spoken word sequence are evaluated.
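
The distinction between a search error and a modelling error can be illustrated with a small sketch. The two-slot word lattice, the acoustic log-scores and the bigram log-probabilities below are all invented for illustration, and beam pruning stands in for whatever suboptimal search a real system uses. With a beam of one hypothesis, the pruned search misses the sequence that the models themselves score best; widening the beam recovers it.

```python
from itertools import product

# Hypothetical acoustic log-scores for each word candidate in two slots.
ACOUSTIC = [
    {"recognise": -1.0, "wreck a nice": -0.8},
    {"speech": -1.0, "beach": -0.9},
]

# Hypothetical bigram language model log-probabilities.
BIGRAM = {
    ("recognise", "speech"): -0.2,
    ("recognise", "beach"): -3.0,
    ("wreck a nice", "speech"): -3.0,
    ("wreck a nice", "beach"): -1.5,
}


def score(words):
    """Combined acoustic and language model log-score of a (partial) sequence."""
    total = sum(ACOUSTIC[t][w] for t, w in enumerate(words))
    total += sum(BIGRAM[(a, b)] for a, b in zip(words, words[1:]))
    return total


def full_search():
    """Globally optimal search: score every complete word sequence."""
    candidates = product(*(slot.keys() for slot in ACOUSTIC))
    return max(candidates, key=score)


def beam_search(beam_width):
    """Suboptimal search: keep only the best `beam_width` partial hypotheses."""
    beam = [()]
    for slot in ACOUSTIC:
        extended = [hyp + (word,) for hyp in beam for word in slot]
        extended.sort(key=score, reverse=True)
        beam = extended[:beam_width]
    return beam[0]


if __name__ == "__main__":
    print("full search:  ", full_search())
    print("beam width 1: ", beam_search(1))
    print("beam width 2: ", beam_search(2))
```

In this toy setting the narrow beam discards "recognise" after the first slot because its acoustic score alone is slightly worse, so the language model never gets the chance to favour "recognise speech". That is a search error: the models rank the correct sequence highest, but the search never evaluates it. Had the full search itself returned the wrong sequence, the fault would lie with the acoustic-phonetic or language models instead.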

EAGLES SWLG SoftEdition, May 1997.