Confusions

For isolated word recognisers, we can define a more specific measure than the various contributions to the error rate alone. The class of substitutions can be divided into all possible confusions between words. The confusion $C_{ij}$ is defined as the probability that word i is recognised as word j. (Incidentally, the value $C_{ii}$ is the fraction of times word i is correctly recognised.) These probabilities can be estimated from a large test sample in the same way as the basic error rates, by counting the number of times each confusion took place:

$$C_{ij} = \frac{N_{ij}}{\sum_{j'} N_{ij'}}$$

where $N_{ij}$ is the number of times word j was recognised on the input word i.
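
As a rough illustration, the estimate can be sketched in a few lines of Python: count how often each word j was recognised for each input word i, then divide every row by its total. The function name `confusion_matrix`, the `(spoken, recognised)` pair format and the three-word vocabulary are assumptions made for this sketch and do not come from the handbook.

```python
def confusion_matrix(pairs, vocabulary):
    """Estimate confusions from (spoken, recognised) word pairs.

    Returns the raw counts (row = input word i, column = recognised word j)
    and the row-normalised estimates: each count divided by its row total.
    """
    index = {word: k for k, word in enumerate(vocabulary)}
    size = len(vocabulary)
    counts = [[0] * size for _ in range(size)]
    for spoken, recognised in pairs:
        counts[index[spoken]][index[recognised]] += 1

    matrix = []
    for row in counts:
        total = sum(row)
        # A word that never occurs in the test sample has no estimate;
        # its row is left as zeros rather than dividing by zero.
        matrix.append([n / total if total else 0.0 for n in row])
    return counts, matrix


# Made-up test sample, for illustration only.
vocabulary = ["yes", "no", "stop"]
pairs = [
    ("yes", "yes"), ("yes", "yes"), ("yes", "no"),
    ("no", "no"), ("no", "no"), ("no", "no"), ("no", "yes"),
    ("stop", "stop"),
]
counts, C = confusion_matrix(pairs, vocabulary)
```

In this made-up sample "no" was spoken four times and recognised as "yes" once, so the estimated confusion of "no" with "yes" is 1/4. With so few observations per cell, such estimates are only indicative.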

The confusion matrix gives more detailed information than the error rates, but has much worse statistics, because the counts in the individual cells are normally low. If we want to include insertions and deletions in this matrix, a null word i=0 should be added (formally not in the vocabulary), so that the row $C_{0j}$ contains the false alarms, the column $C_{i0}$ the deletions, and $C_{00} = 0$. From this expanded confusion matrix, the error rate can be calculated from the diagonal of the matrix, i.e. in terms of the counts, $E = 1 - \sum_i N_{ii} / \sum_{i,j} N_{ij}$. The elements $C_{ij}$ for $i \neq j$ are called the off-diagonal elements.
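
The expanded matrix can be sketched in the same way, working directly on counts. The sketch below assumes the null word occupies row and column 0, and it assumes the error rate is taken as the fraction of all counted events that fall off the diagonal; both conventions, the function names and the counts themselves are hypothetical.

```python
NULL = 0  # assumed convention: the null word occupies row and column 0


def error_rate(counts):
    """Fraction of counted events that lie off the diagonal."""
    total = sum(sum(row) for row in counts)
    correct = sum(counts[k][k] for k in range(len(counts)))
    return 1.0 - correct / total


def off_diagonal_breakdown(counts):
    """Split the off-diagonal counts into insertions, deletions, substitutions."""
    size = len(counts)
    # Row 0: something was recognised although nothing was spoken (false alarms).
    insertions = sum(counts[NULL][j] for j in range(1, size))
    # Column 0: a word was spoken but nothing was recognised.
    deletions = sum(counts[i][NULL] for i in range(1, size))
    # Remaining off-diagonal cells: one real word recognised as another.
    substitutions = sum(
        counts[i][j]
        for i in range(1, size)
        for j in range(1, size)
        if i != j
    )
    return insertions, deletions, substitutions


# Made-up counts for a two-word vocabulary plus the null word.
expanded = [
    [0, 1, 0],    # null input: one false alarm, recognised as word 1
    [2, 45, 3],   # word 1: 2 deletions, 45 correct, 3 recognised as word 2
    [1, 4, 44],   # word 2: 1 deletion, 4 recognised as word 1, 44 correct
]
rate = error_rate(expanded)
insertions, deletions, substitutions = off_diagonal_breakdown(expanded)
```

For these made-up counts, 11 of the 100 events lie off the diagonal: 1 insertion, 3 deletions and 7 substitutions.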

EAGLES SWLG SoftEdition, May 1997.