Linear interpolation computes a weighted average of the relative frequencies and the more general distribution.
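
One common way to write this weighted average is sketched here. The notation is an assumption and is not taken from the text: $N(h, w)$ is the training count of word $w$ after history $h$, $N(h)$ is the count of $h$, $\beta(w \mid \bar{h})$ is the more general distribution conditioned on a shortened history $\bar{h}$, and $\lambda$ is the interpolation weight.

$$
p(w \mid h) \;=\; (1 - \lambda)\,\frac{N(h, w)}{N(h)} \;+\; \lambda\,\beta(w \mid \bar{h}),
\qquad 0 \le \lambda \le 1 .
$$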

In other words, unlike linear discounting combined with backing-off,
linear interpolation uses the more general distribution
in *all* cases, not only for unseen events [Jelinek & Mercer (1980), Nadas (1984)].
Estimating the unknown parameters of linear interpolation
requires a rather complex mathematical framework.
In most cases, the EM (*expectation-maximisation*) algorithm is
used, as described in the appendix.
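
To make the role of EM concrete, the following is a minimal sketch for the simplest case of a single interpolation weight. It is not the procedure from the appendix. It assumes that the weight is tuned on held-out events that were not used to compute the relative frequencies, and that every event receives a non-zero probability from at least one of the two models. The function names, the variable `lam`, and the toy numbers are hypothetical.

```python
import math


def log_likelihood(p_rel, p_gen, lam):
    """Log-likelihood of the held-out events under the interpolated model."""
    return sum(math.log((1.0 - lam) * r + lam * g) for r, g in zip(p_rel, p_gen))


def em_interpolation_weight(p_rel, p_gen, lam=0.5, iterations=50):
    """Estimate lam in p = (1 - lam) * p_rel + lam * p_gen by EM.

    p_rel[i]: relative-frequency probability of held-out event i.
    p_gen[i]: probability of the same event under the more general distribution.
    Assumes each event has a non-zero probability under the mixture.
    """
    for _ in range(iterations):
        # E-step: posterior probability that each event was generated
        # by the more general distribution rather than the relative frequencies.
        posteriors = [
            lam * g / ((1.0 - lam) * r + lam * g) for r, g in zip(p_rel, p_gen)
        ]
        # M-step: the new weight is the average posterior.
        lam = sum(posteriors) / len(posteriors)
    return lam


# Toy held-out data (made up for illustration): two events were never seen
# in training, so their relative frequency is zero.
held_out_rel = [0.5, 0.0, 0.2, 0.0]
held_out_gen = [0.1, 0.05, 0.1, 0.02]

estimated_lam = em_interpolation_weight(held_out_rel, held_out_gen)
print(f"estimated interpolation weight: {estimated_lam:.4f}")
```

Each EM iteration cannot decrease the held-out likelihood, so the loop can also be stopped once `log_likelihood` no longer improves noticeably instead of running a fixed number of iterations.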

