
The lexicon in spoken language recognition systems

   

A spoken language recognition system is generally divided into two components: the recognition component and the search component (see Chapters 7 and 10). In the recognition component, intervals of the speech signal are mapped to word hypotheses by probabilistic systems such as Hidden Markov Models, Neural Networks, Dynamic Programming algorithms, or Fuzzy Logic knowledge bases; the resulting mapping is organised as a word lattice or word graph, i.e. a set of word hypotheses, each assigned in principle to a temporal interval in the speech signal. The term word is used here in the sense of "lexical lookup key". The keys are traditionally represented by orthography, but in a spoken language system they would be better represented by phonemic transcriptions, in order to avoid orthographic noise due to heterophonous homographs (forms that are spelled alike but pronounced differently, such as English "lead" the verb and "lead" the metal).

The search component enhances the information from the speech signal with top-down information from a language model in order to narrow down the lexical search space.

In spoken language recognition system development, a corpus-based lexicon of orthographically transcribed forms is used as the basis for a pronunciation lexicon (pronunciation dictionary); the lexicon is often supplemented by rules for generating pronunciation variants due to informal speech styles (phonostylistics) or to speaker and dialect variation. The pronunciation lexicon is required in order to tune the recognition system to a specific corpus by statistical training: the frequency distributions of words in a corpus are interpreted as the prior (a priori) probabilities of words in a given context. These prior probabilities may be based on the absolute frequencies of words, or on their frequencies relative to a given context, e.g. digram (bigram) frequencies.
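The interpretation of corpus frequencies as prior probabilities can be illustrated with a minimal sketch. It assumes simple maximum-likelihood estimation (relative frequency, no smoothing) over a tokenised corpus; the function names `unigram_priors` and `bigram_priors` and the toy corpus are hypothetical and serve only as illustration, not as a description of any particular recognition system.

```python
from collections import Counter


def unigram_priors(tokens):
    """Context-free priors: each word's share of all tokens in the corpus."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def bigram_priors(tokens):
    """Context-dependent priors: P(word | previous word) as a relative frequency."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    # Count each word only where it occurs as a left context,
    # i.e. everywhere except the final corpus position.
    context_counts = Counter(tokens[:-1])
    return {
        (prev, word): count / context_counts[prev]
        for (prev, word), count in pair_counts.items()
    }


# Toy corpus (an assumption for illustration only).
corpus = "the cat sat on the mat".split()
unigrams = unigram_priors(corpus)
bigrams = bigram_priors(corpus)
```

In this toy corpus "the" accounts for two of six tokens, and it is followed once by "cat" and once by "mat", so the bigram estimate for each continuation is one half. Unsmoothed estimates of this kind assign zero probability to unseen word pairs, which is one reason practical language models usually apply some form of smoothing.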

The functionality of spoken language lexica may be summarised in the following terms.

   




EAGLES SWLG SoftEdition, May 1997.