In linear interpolation, a
weighted average of the relative frequencies and the
more general distribution
is computed.
In other words, unlike linear discounting combined
with backing-off, linear interpolation uses the more general
distribution in all cases [Jelinek & Mercer (1980), Nadas (1984)].
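
The weighted average described above can be written out explicitly. The notation here is an assumption for illustration and is not taken from the source: N(h,w) is the count of word w after history h, N(h) is the count of h, beta is the more general distribution over a shortened history, and lambda is the interpolation weight.

```latex
p(w \mid h) \;=\; (1 - \lambda)\,\frac{N(h, w)}{N(h)} \;+\; \lambda\,\beta(w \mid \bar{h}),
\qquad 0 \le \lambda \le 1
```

Under this reading, the second term contributes to every event, including those with N(h,w) > 0, which is the contrast with backing-off drawn in the text.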
Estimating the unknown parameters in linear interpolation
requires a rather complex mathematical framework.
In most cases, the so-called EM (expectation-maximisation)
algorithm is used, as described in the appendix.
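
To make the estimation step more concrete, the following is a minimal sketch of EM for a single interpolation weight. It is not the procedure from the appendix. It assumes one global weight lam, a list of held-out events for which both the relative frequency and the general-distribution probability are already known, and strictly positive general-distribution probabilities. The function names and the toy numbers are hypothetical.

```python
import math


def log_likelihood(rel_freqs, general_probs, lam):
    """Log-likelihood of the events under p = (1 - lam) * f + lam * b."""
    return sum(
        math.log((1.0 - lam) * f + lam * b)
        for f, b in zip(rel_freqs, general_probs)
    )


def em_interpolation_weight(rel_freqs, general_probs, lam=0.5, iterations=50):
    """Estimate lam by EM (sketch; assumes every general prob is > 0)."""
    if not rel_freqs or len(rel_freqs) != len(general_probs):
        raise ValueError("inputs must be non-empty and of equal length")
    for _ in range(iterations):
        # E-step: posterior probability that each event was generated by
        # the general distribution rather than the relative frequencies.
        posteriors = [
            lam * b / ((1.0 - lam) * f + lam * b)
            for f, b in zip(rel_freqs, general_probs)
        ]
        # M-step: the new weight is the average posterior.
        lam = sum(posteriors) / len(posteriors)
    return lam


# Toy held-out events (hypothetical numbers): relative frequency and
# general-distribution probability of each observed event.
f = [0.5, 0.0, 0.2, 0.0]
b = [0.1, 0.05, 0.1, 0.02]
lam_hat = em_interpolation_weight(f, b)
```

Events whose relative frequency is zero pull the weight towards the general distribution, while well-observed events pull it towards the relative frequencies. Fitting the weight on the same data that produced the relative frequencies would drive it towards zero, which is why the sketch assumes separate held-out events.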