Why does the trigram language model work?

To show how the trigram model works in recognition, F. Jelinek [Jelinek (1991)] gave the example shown in Table 7.1.

	1	2	3	4	5	6	7	8	9	10	11	12	13
1	The	are	to	know	the	issues	necessary	role	and	the	next	be	meeting
2	This	will		have	this	problems	data	thing	from			two	months
3	One	the		understand	these	the	information	that	in				years
4	Two	would		do	problems		above	to	to				meetings
5	A	also		get	any		other	contact	are				to
6	Three	do		the	a		time	parts	with				week
7	Please	need		use	problem		people	point	where				days
8	In			provide	them		operators	for	requiring
9	We			insert	all		tools	issues	still
...				...			...		...
61				...			...		being
62				...			...		during
63				...			...		I
64				...			...		involved
65				...			...		would
66				...			...		within
...				...			...
93				request			factors
94				respond			facts
95				supply			I
96				write			jobs
97				me			MVS
98				resolve			old
...							...
636							mailroom
637							marketplace
638							provision
639							reception
640							shop
641							important

Table 7.1: Effect of trigram model on recognition [Jelinek (1991)]

The spoken sentence was: "We need to resolve all the important issues within the next two months." For each position in the sentence, the table lists all the words to which the trigram language model assigns a higher probability than the word actually spoken. Each column ends with the spoken word itself, so the row number of the last entry is the rank of that word. The probabilities are conditioned on the words actually spoken, that is, on the correct two-word history. For example, for the two predecessor words "all the", the most likely word is "necessary", whereas the actually spoken word "important" is only in position 641.

The table shows that function words such as prepositions and articles tend to be predicted better than content words. The reason is that function words occur more often in a corpus, so their trigram estimates are more reliable. At the same time, function words are harder to recognise from the acoustic-phonetic point of view because of coarticulation effects. There is thus an interesting symbiosis between trigram language models and acoustic-phonetic models. Where the acoustic-phonetic model tends to be poor, as for function words, the trigram model tends to be strong. Where the trigram model is weak, as for content words, the acoustic-phonetic models are more reliable, because content words are long and less subject to coarticulation.
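The ranking behind Table 7.1 can be illustrated with a minimal Python sketch. It is not Jelinek's setup: the toy corpus, the function names `train_trigram_counts` and `rank_of_spoken_word`, and the use of raw, unsmoothed trigram counts are all assumptions made for illustration. For a fixed two-word history, ranking by count is equivalent to ranking by relative frequency, so the sketch counts how many successor words score strictly higher than the spoken word.

```python
from collections import Counter, defaultdict


def train_trigram_counts(sentences):
    """Count how often each word follows each two-word history."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Two start symbols give the first words a well-defined history.
        words = ["<s>", "<s>"] + sentence.lower().split()
        for u, v, w in zip(words, words[1:], words[2:]):
            counts[(u, v)][w] += 1
    return counts


def rank_of_spoken_word(counts, history, spoken):
    """Return 1 plus the number of words scored strictly higher than `spoken`."""
    followers = counts.get(tuple(history), Counter())
    target = followers[spoken]  # 0 if the trigram was never observed
    return 1 + sum(1 for c in followers.values() if c > target)


# Hypothetical toy corpus, chosen only so that "necessary" outranks "important".
corpus = [
    "we need all the necessary data",
    "we need all the necessary tools",
    "they want all the necessary information",
    "resolve all the important issues",
    "all the people",
]
counts = train_trigram_counts(corpus)

print(rank_of_spoken_word(counts, ("all", "the"), "necessary"))  # 1
print(rank_of_spoken_word(counts, ("all", "the"), "important"))  # 2
```

In a real recogniser the counts would be replaced by smoothed probability estimates over a large vocabulary, which is how a content word such as "important" can end up as far down as position 641.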
