For isolated word recognisers, we can define a more specific measure
than the various contributions to the error rate alone. The class of
substitutions can be divided into all possible confusions
between words. The confusion $C_{ij}$ is defined as the probability
that word $i$ is recognised as word $j$. (Incidentally, the value
$C_{ii}$ is the fraction of times word $i$ is correctly
recognised.) These probabilities can be estimated from a large test
sample in the same manner as the basic error rates, by
counting the number of times each confusion took place:

$$C_{ij} = \frac{N_{ij}}{\sum_{j'} N_{ij'}}$$

where $N_{ij}$ is the number of times word $j$ was recognised for the
input word $i$.
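
The estimate above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `confusion_matrix`, the toy vocabulary and the test sample are all hypothetical, and rows without any observations are arbitrarily set to zero.

```python
from collections import Counter

def confusion_matrix(pairs, vocabulary):
    """Estimate C[i][j] = N[i][j] / sum over j' of N[i][j'].

    pairs is an iterable of (input word, recognised word) tuples.
    """
    counts = Counter(pairs)  # N_ij, keyed by (i, j)
    matrix = {}
    for i in vocabulary:
        row_total = sum(counts[(i, j)] for j in vocabulary)
        # Assumption: a word that never occurs as input gets an all-zero row.
        matrix[i] = {
            j: (counts[(i, j)] / row_total if row_total else 0.0)
            for j in vocabulary
        }
    return matrix

# Hypothetical toy test sample: (spoken word, recognised word)
sample = [
    ("yes", "yes"), ("yes", "yes"), ("yes", "no"),
    ("no", "no"), ("no", "no"), ("no", "no"), ("no", "yes"),
    ("stop", "stop"),
]
C = confusion_matrix(sample, ["yes", "no", "stop"])
print(C["yes"]["no"])  # about 0.333: "yes" was heard as "no" once in three trials
print(C["no"]["no"])   # 0.75: "no" was recognised correctly three times in four
```

With so few trials per word each entry rests on a handful of counts, so a realistic test sample needs many repetitions of every vocabulary word.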
The confusion matrix gives more detailed information than the
error rates, but its individual entries are statistically much less reliable, as the counts involved
are normally low. If we want to include insertions and
deletions in this matrix, a null word $i=0$ should be added
(formally not in the vocabulary), so that the row $C_{0j}$ contains the
false alarms (insertions), the column $C_{i0}$ the deletions,
and $C_{00} = 0$. From
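this expanded confusion matrix, the error rate can be calculated from
the diagonal of the matrix, i.e. as one minus the fraction of all counted events that lie on the diagonal,

$$E = 1 - \frac{\sum_{i} N_{ii}}{\sum_{i,j} N_{ij}}$$

with $N_{ij}$ the number of times input word $i$ was recognised as word $j$. The elements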
$C_{ij}$ for $i \neq j$ are called
the off-diagonal elements.
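
A minimal sketch of the expanded matrix, again in Python. It assumes that the error rate is taken as the fraction of all counted events lying off the diagonal, so insertions are included in the normalisation; other conventions divide by the number of input words instead. The label `NULL`, the helper names and the event list are hypothetical.

```python
from collections import Counter

NULL = "<null>"  # hypothetical label for the null word (i = 0), not in the vocabulary

def expanded_counts(events):
    """Count N[i, j] over (input, recognised) pairs.

    NULL as input marks an insertion, NULL as output marks a deletion.
    """
    counts = Counter(events)
    counts[(NULL, NULL)] = 0  # "nothing in, nothing out" is never observed
    return counts

def error_rate(counts):
    """One minus the fraction of counted events on the diagonal (assumed definition)."""
    total = sum(counts.values())
    correct = sum(n for (i, j), n in counts.items() if i == j)
    return 1.0 - correct / total

# Hypothetical test events: five correct, one substitution,
# one insertion (false alarm) and one deletion.
events = [
    ("yes", "yes"), ("yes", "no"),
    ("no", "no"), ("no", "no"), ("no", "no"),
    (NULL, "yes"),
    ("stop", NULL), ("stop", "stop"),
]
N = expanded_counts(events)
insertions = sum(n for (i, j), n in N.items() if i == NULL and j != NULL)
deletions = sum(n for (i, j), n in N.items() if j == NULL and i != NULL)
substitutions = sum(
    n for (i, j), n in N.items() if i != j and NULL not in (i, j)
)
print(insertions, deletions, substitutions, error_rate(N))  # 1 1 1 0.375
```

In this sketch the off-diagonal counts are exactly the substitutions, insertions and deletions, so their sum divided by the total number of events gives the same error rate.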