Recommendations: Refined language models


The following recommendations concern the use of refined language models in specific recognition tasks:

Experiments indicate that none of the usual language model refinements is likely to reduce the perplexity by more than 10% over a standard trigram model (or a bigram model, if the amount of training data is small). Therefore, in every application, it should first be checked whether a trigram model combined with a cache component is already sufficient. In a number of recognition tasks, the perplexity improvements obtained from the language model refinements are not worth the additional effort with today's algorithms.
In some applications, the amount of training data may be very small. In these cases, it can be useful to base the language model on word classes rather than on the words themselves. These word classes can be defined either by an automatic clustering procedure or by linguistic prior knowledge, e.g. parts of speech (POS).
If it is appropriate to combine two language models of different types, e.g. a word bigram model and a class bigram model, the first choice should be a linear interpolation of the two models.
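
The last two recommendations can be illustrated with a minimal sketch. It assumes add-alpha smoothing, a hand-made word-to-class map, and a toy corpus, none of which come from the text above. The function names (train_bigram, train_class_bigram, interpolate, perplexity) and the interpolation weight lam are hypothetical. The class bigram model uses the usual factorisation p(w | v) = p(w | c(w)) * p(c(w) | c(v)), and the combined model is lam * p_word + (1 - lam) * p_class.

```python
import math
from collections import Counter


def train_bigram(tokens, vocab_size, alpha=1.0):
    """Add-alpha smoothed bigram model; returns a function prob(prev, word)."""
    context_counts = Counter(tokens[:-1])
    bigram_counts = Counter(zip(tokens[:-1], tokens[1:]))

    def prob(prev, word):
        return (bigram_counts[(prev, word)] + alpha) / (
            context_counts[prev] + alpha * vocab_size
        )

    return prob


def train_class_bigram(tokens, word2class, alpha=1.0):
    """Class bigram model: p(w | v) = p(w | c(w)) * p(c(w) | c(v))."""
    classes = [word2class[w] for w in tokens]
    n_classes = len(set(word2class.values()))
    class_prob = train_bigram(classes, n_classes, alpha)

    word_counts = Counter(tokens)                  # occurrences of each word
    class_counts = Counter(classes)                # occurrences of each class
    class_sizes = Counter(word2class.values())     # vocabulary words per class

    def prob(prev, word):
        c_prev, c = word2class[prev], word2class[word]
        # Smoothed membership probability p(w | c), normalised within the class.
        emission = (word_counts[word] + alpha) / (
            class_counts[c] + alpha * class_sizes[c]
        )
        return class_prob(c_prev, c) * emission

    return prob


def interpolate(p_a, p_b, lam):
    """Linear interpolation of two models with weight lam in [0, 1]."""

    def prob(prev, word):
        return lam * p_a(prev, word) + (1.0 - lam) * p_b(prev, word)

    return prob


def perplexity(prob, tokens):
    """Perplexity of a bigram-type model on a token sequence."""
    log_sum = sum(math.log(prob(v, w)) for v, w in zip(tokens[:-1], tokens[1:]))
    return math.exp(-log_sum / (len(tokens) - 1))


# Toy data (assumption): six words in three hand-assigned classes.
word2class = {
    "the": "DET", "a": "DET",
    "cat": "NOUN", "dog": "NOUN",
    "sees": "VERB", "likes": "VERB",
}
train = "the cat sees a dog the dog likes a cat".split()
held_out = "a cat likes the dog".split()

p_word = train_bigram(train, vocab_size=len(word2class))
p_class = train_class_bigram(train, word2class)
p_mixed = interpolate(p_word, p_class, lam=0.5)

for name, model in [("word", p_word), ("class", p_class), ("mixed", p_mixed)]:
    print(name, round(perplexity(model, held_out), 3))
```

In this sketch the weight lam is fixed at 0.5 for simplicity. In practice it would normally be estimated on held-out data, for instance by choosing the value that minimises held-out perplexity.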

EAGLES SWLG SoftEdition, May 1997.