For isolated word recognisers, we can define a more specific measure
than the various contributions to the error rate alone. The class of
substitutions can be divided into all possible confusions
between words. The confusion $C_{ij}$ is defined as the probability
that word $i$ is recognised as word $j$. (Incidentally, the value
$C_{ii}$ is the fraction of times word $i$ is correctly
recognised.) These probabilities can be estimated from a large test
sample in the same manner as the basic error rates, by counting the
number of times each confusion took place:

$$C_{ij} = \frac{N_{ij}}{\sum_{j'} N_{ij'}}$$

where $N_{ij}$ is the number of times word $j$ was recognised on the
input word $i$.
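
As an illustration of the count-based estimate, the following is a minimal Python sketch. The function name `confusion_matrix`, the two-word vocabulary and the test sample are hypothetical, and assigning 0.0 to words that never occur in the sample is an assumption made here, not something the text prescribes.

```python
from collections import Counter


def confusion_matrix(pairs, vocabulary):
    """Estimate the probability that input word i is recognised as word j.

    pairs is an iterable of (spoken, recognised) tuples from a test sample.
    Each row is normalised by the number of times the input word occurred.
    """
    counts = Counter(pairs)  # counts[(i, j)]: times j was recognised for input i
    matrix = {}
    for i in vocabulary:
        row_total = sum(counts[(i, j)] for j in vocabulary)
        # Assumption: a word absent from the sample gets 0.0 everywhere,
        # because no estimate is possible without observations.
        matrix[i] = {
            j: (counts[(i, j)] / row_total if row_total else 0.0)
            for j in vocabulary
        }
    return matrix


# Hypothetical test sample of (spoken word, recognised word) pairs.
sample = [("yes", "yes"), ("yes", "yes"), ("yes", "no"), ("no", "no")]
C = confusion_matrix(sample, ["yes", "no"])
print(C["yes"]["no"])  # estimated probability that "yes" is heard as "no"
```

With only a handful of occurrences per word, each estimate rests on very few counts, so a realistic test sample needs to be much larger than this one.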
The confusion matrix gives more detailed information than the error rates, but has much worse statistics, as the numbers involved are normally low. If we want to include insertions and deletions in this matrix, a null word $i=0$ should be added (formally not in the vocabulary), so that the row $C_{0j}$ contains the false alarms, the column $C_{i0}$ the deletions, and $C_{00} = 0$. From this expanded confusion matrix, the error rate can be calculated from the diagonal of the matrix, i.e. from the elements $C_{ii}$, since every event off the diagonal is a substitution, an insertion or a deletion. The elements $C_{ij}$ for $i \neq j$ are called the off-diagonal elements.
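
The expanded matrix with a null word can be sketched in the same way. The example below works on raw counts and treats the error rate as the fraction of all counted events that lie off the diagonal. That normalisation is an assumption made for this sketch (other conventions divide by the number of spoken words only), and the `NULL` marker, the function name `error_rate` and the counts are all hypothetical.

```python
NULL = None  # hypothetical marker for the null word (index 0 in the text)


def error_rate(counts):
    """Return the fraction of counted events that lie off the diagonal.

    counts maps (spoken, recognised) to a number of occurrences.
    (NULL, word) is an insertion (false alarm); (word, NULL) is a deletion.
    Assumption: the denominator is the total number of counted events.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no events counted")
    # Diagonal entries are correct recognitions; the null-null cell is excluded.
    correct = sum(
        n
        for (spoken, recognised), n in counts.items()
        if spoken == recognised and spoken is not NULL
    )
    return 1.0 - correct / total


# Hypothetical counts for a two-word vocabulary.
counts = {
    ("yes", "yes"): 8,  # correct
    ("yes", "no"): 1,   # substitution
    ("no", "no"): 7,    # correct
    ("no", NULL): 2,    # deletion
    (NULL, "yes"): 2,   # insertion (false alarm)
}
rate = error_rate(counts)
print(rate)
```

Here 15 of the 20 counted events are on the diagonal, so the remaining quarter are errors of one kind or another.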