When smoothing a trigram model with a bigram model, we have to keep in mind
that the backing-off distribution itself requires smoothing.
So the bigram model is itself smoothed by a unigram model,
which in turn may be smoothed by a zerogram model.
Thus, we can define the following levels for a trigram event (u,v,w):
 the trigram level, which takes the relative trigram frequencies
as the starting point;
 the bigram level;
 the unigram level;
 the zerogram level, used if the unigram estimates are unreliable.
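
The chain of levels can be illustrated with a minimal sketch. It assumes linear interpolation with a single fixed weight, which is chosen here only for illustration and is not necessarily the smoothing method used in this text. The names build_model and lam are hypothetical, the zerogram is taken to be the uniform distribution over the vocabulary, and vocab is assumed to contain every word in the training tokens.

```python
from collections import Counter

def build_model(tokens, vocab, lam=0.3):
    """Return p(w | u, v), smoothed level by level.

    lam is a hypothetical fixed interpolation weight (an assumption).
    vocab must contain every word that occurs in tokens.
    """
    tri = Counter(zip(tokens, tokens[1:], tokens[2:]))
    bi = Counter(zip(tokens, tokens[1:]))
    uni = Counter(tokens)

    # History counts: how often each context was followed by any word.
    tri_hist = Counter()
    for (u, v, _), c in tri.items():
        tri_hist[(u, v)] += c
    bi_hist = Counter()
    for (v, _), c in bi.items():
        bi_hist[v] += c
    total = sum(uni.values())

    def zerogram(w):
        # Lowest level: uniform over the vocabulary (assumption).
        return 1.0 / len(vocab)

    def unigram(w):
        if total == 0:
            return zerogram(w)
        return (1 - lam) * uni[w] / total + lam * zerogram(w)

    def bigram(v, w):
        if bi_hist[v] == 0:
            # Unseen history: rely entirely on the lower level.
            return unigram(w)
        return (1 - lam) * bi[(v, w)] / bi_hist[v] + lam * unigram(w)

    def trigram(u, v, w):
        if tri_hist[(u, v)] == 0:
            return bigram(v, w)
        return (1 - lam) * tri[(u, v, w)] / tri_hist[(u, v)] + lam * bigram(v, w)

    return trigram

tokens = "a b a b c a b".split()
vocab = ["a", "b", "c", "d"]
p = build_model(tokens, vocab)
# "d" never occurs, yet it receives probability mass via the zerogram level.
print(p("a", "b", "d") > 0)
```

Each level falls back completely to the next lower one when its history was never observed, so the resulting distribution still sums to one over the vocabulary.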
It is helpful to write down explicitly the notation
used in the following, in particular
the definitions of the so-called singletons
and the unseen events:

 N(u,v,w): number of observations for trigram uvw;

 N(u,v,·): number of observations for bigram uv;

 n_0(u,v,·): number of unseen trigrams starting with uv;

 n_1(·,v,w): number of trigram singletons ending in vw;

 n_1(·,v,·): number of trigram singletons
having v in the middle.
The definitions at the bigram and unigram level are similar:

 N(u): number of observations for unigram u;

 n_0(v,·): number of unseen bigrams starting with v;

 n_0(·): number of unseen unigrams.
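
As an illustration, the quantities listed above could be computed from a token sequence roughly as follows. This is a sketch under stated assumptions: the class NgramCounts and its method names are hypothetical, the vocabulary is assumed to be a fixed closed set containing every training word, an unseen event is one with count zero, a singleton is one with count exactly one, and the number of observations for bigram uv is taken here as the number of trigram observations starting with uv.

```python
from collections import Counter

class NgramCounts:
    """Hypothetical helper for counts, singletons, and unseen events."""

    def __init__(self, tokens, vocab):
        self.V = len(vocab)
        self.tri = Counter(zip(tokens, tokens[1:], tokens[2:]))
        self.bi = Counter(zip(tokens, tokens[1:]))
        self.uni = Counter(tokens)

    def trigram_obs(self, u, v, w):
        # Number of observations for trigram uvw.
        return self.tri[(u, v, w)]

    def history_obs(self, u, v):
        # Number of trigram observations starting with uv.
        return sum(c for (a, b, _), c in self.tri.items() if (a, b) == (u, v))

    def unseen_trigrams_starting_with(self, u, v):
        # Vocabulary words never observed after the history uv.
        seen = sum(1 for (a, b, _) in self.tri if (a, b) == (u, v))
        return self.V - seen

    def singletons_ending_in(self, v, w):
        # Trigrams (x, v, w) observed exactly once, for any x.
        return sum(1 for (_, b, c), n in self.tri.items()
                   if (b, c) == (v, w) and n == 1)

    def singletons_with_middle(self, v):
        # Trigrams (x, v, y) observed exactly once, for any x and y.
        return sum(1 for (_, b, _), n in self.tri.items() if b == v and n == 1)

    def unigram_obs(self, u):
        # Number of observations for unigram u.
        return self.uni[u]

    def unseen_bigrams_starting_with(self, v):
        seen = sum(1 for (a, _) in self.bi if a == v)
        return self.V - seen

    def unseen_unigrams(self):
        # Vocabulary words that never occur in the training tokens.
        return self.V - len(self.uni)

counts = NgramCounts("a b a b c a b".split(), ["a", "b", "c", "d"])
print(counts.unseen_trigrams_starting_with("a", "b"))
print(counts.singletons_ending_in("a", "b"))
```

The linear scans keep the sketch short; for a real corpus one would precompute these statistics per history in a single pass.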
