A spoken language recognition system is generally divided into two components: the recognition component and the search component (see Chapters 7 and 10). In the recognition component, intervals of the speech signal are mapped to word hypotheses by probabilistic systems such as Hidden Markov Models, Neural Networks, Dynamic Programming algorithms, or Fuzzy Logic knowledge bases; the resulting mapping is organised as a word lattice or word graph, i.e. a set of word hypotheses, each assigned in principle to a temporal interval of the speech signal. The term word is used here in the sense of ``lexical lookup key''. The keys are traditionally represented by orthography, but would be better represented in a spoken language system by phonemic transcriptions, in order to avoid orthographic noise due to heterophonous homographs. The search component enhances the information from the speech signal with top-down information from a language model in order to narrow down the lexical search space.

In spoken language recognition system development, a corpus-based lexicon of orthographically transcribed forms is used as the basis for a pronunciation lexicon (pronunciation dictionary); the lexicon is often supplemented by rules for generating pronunciation variants due to informal speech styles (phonostylistics) or to speaker and dialect variation. The pronunciation lexicon is required in order to tune the recognition system to a specific corpus by statistical training: frequencies of distribution of words in a corpus are interpreted as the prior (a priori) probabilities of words in a given context. These prior probabilities may be based on the absolute frequencies of words, or on their frequencies relative to a given context, e.g. digram (bigram) frequencies.
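The estimation of contextual prior probabilities from corpus frequencies can be sketched as follows; this is a minimal illustration, not a component of any particular recognition system, and the function name and toy corpus are chosen here purely for exposition. Relative bigram frequencies count(w1, w2) / count(w1) serve as estimates of the conditional prior P(w2 | w1):

```python
from collections import Counter

def bigram_priors(tokens):
    """Estimate bigram prior probabilities from a tokenised corpus.

    tokens: a list of lexical lookup keys (orthographic or phonemic).
    Returns a dict mapping (w1, w2) to the relative frequency
    count(w1, w2) / count(w1), a simple estimate of P(w2 | w1).
    Note: the final token is counted in count(w1) although it has no
    successor; a real system would handle sentence boundaries explicitly.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return {pair: count / unigrams[pair[0]]
            for pair, count in bigrams.items()}

# Toy corpus of orthographic keys (illustrative only).
corpus = "the cat sat on the mat".split()
priors = bigram_priors(corpus)
```

In practice such raw relative frequencies would be smoothed to assign non-zero probability to bigrams unseen in the training corpus, but the counting step shown here is the core of the statistical training mentioned above.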
The functionality of spoken language lexica may be summarised in the following terms.