To show how the trigram model works in recognition, F. Jelinek [Jelinek (1991)] gave the example shown in Table 7.1.
Rank   Candidate words (one column per position of the 13-word spoken sentence)
  1    The     are     to     know    the     issues   necessary   role   and   the   next   be   meeting
  2    This    will    have    this    problems    data    thing    from    two    months
  3    One     the     understand    these    the    information    that    in    years
  4    Two     would   do     problems    above    to    to    meetings
  5    A       also    get    any    other    contact    are    to
  6    Three   do      the    a    time    parts    with    week
  7    Please  need    use    problem    people    point    where    days
  8    In      provide    them    operators    for    requiring
  9    We      insert     all     tools        issues    still
 ...    ...     ...     ...
 61    ...     ...     being
 62    ...     ...     during
 63    ...     ...     I
 64    ...     ...     involved
 65    ...     ...     would
 66    ...     ...     within
 ...    ...     ...
 93    request    factors
 94    respond    facts
 95    supply     I
 96    write      jobs
 97    me         MVS
 98    resolve    old
 ...    ...
636    mailroom
637    marketplace
638    provision
639    reception
640    shop
641    important
The spoken sentence was: ``We need to resolve all the important issues within the next two months.'' For each word position, the table lists, in order of decreasing probability, all the words to which the trigram language model assigns a higher probability than the word actually spoken; the spoken word itself appears at its rank, and a column ends there, so the rows grow shorter as columns are completed. The conditioning events are always the two words actually spoken. For example, given the two predecessor words ``all the'', the most likely word is ``necessary'', whereas the word actually spoken, ``important'', appears only at position 641.

The table shows that function words such as prepositions and articles tend to be predicted better than content words. The reason is that function words occur more often in a corpus, so their trigram estimates are more reliable. At the same time, function words are more difficult to recognise from the acoustic-phonetic point of view because of coarticulation effects. There is thus an interesting symbiosis between trigram language models and acoustic-phonetic models: where the acoustic-phonetic model tends to be poor, as for function words, the trigram model tends to be strong; where the trigram model is weak, as for content words, the acoustic-phonetic model is more reliable, because content words are long and less subject to coarticulation.
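To make the ranking mechanism concrete, the following is a minimal sketch of how such a column of Table 7.1 could be computed. It is not Jelinek's original setup: his model used smoothed probability estimates trained on a large corpus, whereas this sketch uses raw maximum-likelihood relative frequencies, and the names (`train_trigram`, `rank_of_spoken_word`) are illustrative, not taken from the text.

```python
from collections import defaultdict

def train_trigram(corpus):
    """Collect trigram and history (bigram) counts from a list of
    tokenised sentences.  Relative frequencies give the unsmoothed
    maximum-likelihood estimate p(w | u, v) = N(u, v, w) / N(u, v)."""
    trigram_counts = defaultdict(int)
    history_counts = defaultdict(int)
    vocabulary = set()
    for sentence in corpus:
        tokens = ["<s>", "<s>"] + sentence   # pad so every word has two predecessors
        for i in range(2, len(tokens)):
            u, v, w = tokens[i - 2], tokens[i - 1], tokens[i]
            trigram_counts[(u, v, w)] += 1
            history_counts[(u, v)] += 1
            vocabulary.add(w)
    return trigram_counts, history_counts, vocabulary

def rank_of_spoken_word(trigram_counts, vocabulary, u, v, spoken):
    """Rank the whole vocabulary by trigram count given the history
    (u, v) -- equivalent to ranking by p(w | u, v), since the
    denominator N(u, v) is the same for all candidates -- and return
    the 1-based rank of the word actually spoken, as in the columns
    of Table 7.1."""
    ranked = sorted(vocabulary,
                    key=lambda w: trigram_counts[(u, v, w)],
                    reverse=True)
    return ranked.index(spoken) + 1
```

Applied to the example, the call with history ``all the'' and spoken word ``important'' would return 641 under Jelinek's model. With unsmoothed counts, all words unseen after a given history tie at probability zero, which is one reason a real system needs smoothing, for instance interpolation of the trigram estimate with bigram and unigram estimates.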