The previous methodology will give a measure of performance, typically the accuracy, word error rate or sentence error rate. This is but one measure of the performance. Another measure can be based upon making the task more difficult and seeing how much difficulty has to be added in order to obtain a certain level of word error rate. (One of the problems with the final application assessment can be that the word error rate is so low that the test must be made very large in order to get statistically significant results.) Conversely, the reference system (i.e. the system to which the recognition system is compared) can be given a more difficult task.
An example of the latter is the measure introduced by [Moore (1977)], the ``Human Equivalent Noise Ratio'', HENR. The performance of a system is compared to human performance scores. For humans, noise is added in order to decrease the scores. The signal-to-noise ratio at which the human score equals the recogniser's score is defined to be the HENR of the recognition system. The advantage of this method is that the performance measure is relatively independent of the test vocabulary, and that by definition it gives a comparison to human recognition. A disadvantage is that the method is very laborious, as the human calibration has to be carried out for each new test database, with various subjects.
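The HENR calibration step can be sketched as follows. This is a minimal illustration, not Moore's original procedure: it assumes that human word accuracy has already been measured at a handful of signal-to-noise ratios in listening tests, and simply interpolates along that calibration curve to find the SNR at which human accuracy matches the recogniser's accuracy. The accuracy figures below are invented for the example.

```python
import numpy as np

# Hypothetical human word accuracies measured at several signal-to-noise
# ratios (in dB); in practice these come from listening tests with subjects.
snr_db = np.array([-5.0, 0.0, 5.0, 10.0, 15.0, 20.0])
human_accuracy = np.array([0.40, 0.62, 0.80, 0.91, 0.96, 0.98])

def henr(recogniser_accuracy, snr_db, human_accuracy):
    """Return the SNR (dB) at which the interpolated human accuracy
    equals the recogniser's accuracy, i.e. the recogniser's HENR."""
    # np.interp requires the x-axis (here: human accuracy) to be
    # increasing, which holds because accuracy rises with SNR.
    return float(np.interp(recogniser_accuracy, human_accuracy, snr_db))

# A recogniser scoring 85% word accuracy is equivalent to a human
# listener at roughly 7 dB SNR on this calibration curve.
print(henr(0.85, snr_db, human_accuracy))
```

A lower HENR thus indicates a better recogniser: its performance matches that of a human listening under noisier conditions.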
A reference speech recognition system can also be used as a benchmark. [Chollet &amp; Gagnoulet (1981)] developed such a software-based recogniser for assessment purposes.