To illustrate the broad range of
language model types, we mention some of them:
- no or uniform language model: Here, the idea is to use the
same probability for all events; the events can be either
the words of the vocabulary or the sentences, if the
number of sentences is limited.
If all words of a vocabulary of size W are equiprobable,
there is an implied model for the duration of a sentence:
a sentence of N words then has a probability of (1/W)^N.
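A minimal sketch of this calculation (the function name and the example numbers are illustrative, not from the original text):

```python
def uniform_sentence_prob(vocab_size: int, num_words: int) -> float:
    """Probability of any fixed N-word sentence under a uniform model:
    each of the N words is drawn with probability 1/W."""
    return (1.0 / vocab_size) ** num_words

# e.g. any 3-word sentence over a 10-word vocabulary
print(uniform_sentence_prob(10, 3))
```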
- finite state language model:
The set of legal
word sequences is represented as a finite state network
(or regular grammar) whose edges stand for the spoken
words, i.e. each path through the network results in
a legal word sequence. To make this approach
correct from a probabilistic point of view,
the edges have to
be assigned probabilities.
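The idea above can be sketched as follows; the toy network, state names, and word choices are hypothetical, not from the original text. A sequence is legal only if it traces a path of edges from the start state to a final state, and its probability is the product of the edge probabilities along that path:

```python
# Each state maps a word to (edge probability, next state).
NETWORK = {
    "S0": {"show": (0.6, "S1"), "list": (0.4, "S1")},
    "S1": {"flights": (1.0, "END")},
}
FINAL = {"END"}

def score(words, start="S0"):
    """Probability of a word sequence under the finite state model."""
    state, prob = start, 1.0
    for w in words:
        edges = NETWORK.get(state, {})
        if w not in edges:
            return 0.0            # illegal sequence: no matching edge
        p, state = edges[w]
        prob *= p
    return prob if state in FINAL else 0.0

print(score(["show", "flights"]))  # a legal path
print(score(["flights"]))          # illegal: no such edge from the start state
```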
- m-gram language models: In m-gram language
models, all word sequences
are possible, and the probability of the predicted
word depends only on the (m-1) immediate predecessor
words (see above).
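As a minimal sketch, a bigram model (m = 2) estimated by relative frequencies from a toy corpus; the corpus and the sentence markers are illustrative assumptions:

```python
from collections import Counter

# Toy corpus: two sentences delimited by <s> ... </s>.
corpus = ["<s>", "a", "b", "</s>", "<s>", "a", "a", "</s>"]
bigrams = Counter(zip(corpus, corpus[1:]))   # counts of word pairs
unigrams = Counter(corpus)                   # counts of single words

def bigram_prob(prev, word):
    """P(word | prev) as a relative frequency."""
    return bigrams[(prev, word)] / unigrams[prev]

def sentence_prob(words):
    """Product of the conditional probabilities of each word
    given its single predecessor (m - 1 = 1)."""
    p = 1.0
    for prev, w in zip(words, words[1:]):
        p *= bigram_prob(prev, w)
    return p

print(sentence_prob(["<s>", "a", "b", "</s>"]))
```

Unseen word pairs get probability zero here; real m-gram models smooth these counts so that all word sequences remain possible.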
- grammar based language models:
Typically, these models
are based on variants of
stochastic context free grammars or
other phrase structure grammars.
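A minimal sketch of a stochastic context-free grammar; the grammar, its rules, and their probabilities are hypothetical. The probabilities of each nonterminal's rules sum to 1, and the probability of a derivation is the product of the probabilities of the rules applied:

```python
# Each nonterminal maps to a list of (right-hand side, probability) rules.
RULES = {
    "S":  [(("NP", "VP"), 0.9), (("VP",), 0.1)],
    "NP": [(("flights",), 1.0)],
    "VP": [(("depart",), 1.0)],
}

def derivation_prob(rules_used):
    """rules_used: list of (lhs, rhs) pairs naming the rules applied,
    in the order of the derivation."""
    p = 1.0
    for lhs, rhs in rules_used:
        p *= dict(RULES[lhs])[rhs]
    return p

# derive "flights depart" via S -> NP VP, NP -> flights, VP -> depart
print(derivation_prob([("S", ("NP", "VP")),
                       ("NP", ("flights",)),
                       ("VP", ("depart",))]))
```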
- other types: There are language models
that make use of still other concepts,
such as CART (classification and regression trees)
[Breiman et al. (1984), Bahl et al. (1989)]
and maximum entropy [Lau et al. (1993), Rosenfeld (1994)].
It should be noted that this classification of
language models is not exhaustive,
and a specific language model may belong to several
of these categories at the same time.
EAGLES SWLG SoftEdition, May 1997.