
Scoring method

For continuous speech, it is not trivial to decide how ``deletions'', ``substitutions'' and ``insertions'' should be assigned. The process by which this is done is called alignment. If the recogniser's segmentation is available, i.e. if the start and end times of each recognised word are available, the alignment can be done in a way comparable to isolated word recogniser assessment.

Generally, such labelling information is not available in the recognition output. In this case, the alignment process uses a dynamic programming algorithm to minimise the misalignment of two strings of words (symbols): the reference sentence and the recognised sentence. The resulting alignment depends on the relative weights assigned to substitutions, insertions and deletions. [Hunt (1990)] discusses the theory of word-symbol alignment and analyses some experiments on alignment.
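The dynamic programming alignment described above can be illustrated with a minimal sketch. This is not the NIST implementation; the function name `align` and the weight values (4 for a substitution, 3 for an insertion or deletion) are assumptions chosen only for illustration.

```python
def align(ref, hyp, sub_cost=4, ins_cost=3, del_cost=3):
    """Align a reference and a recognised word list by dynamic programming.

    Returns (ops, counts). Each op is (label, ref_word, hyp_word) with label
    C (correct), S (substitution), D (deletion) or I (insertion).
    The default weights are illustrative assumptions.
    """
    n, m = len(ref), len(hyp)
    # cost[i][j]: minimum cost of aligning ref[:i] with hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    # back[i][j]: label of the last step on a minimum-cost path
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0], back[i][0] = i * del_cost, "D"
    for j in range(1, m + 1):
        cost[0][j], back[0][j] = j * ins_cost, "I"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = ref[i - 1] == hyp[j - 1]
            candidates = [
                (cost[i - 1][j - 1] + (0 if same else sub_cost),
                 "C" if same else "S"),
                (cost[i - 1][j] + del_cost, "D"),
                (cost[i][j - 1] + ins_cost, "I"),
            ]
            cost[i][j], back[i][j] = min(candidates)

    # Trace back from the end of both strings to recover the alignment.
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        label = back[i][j]
        if label in ("C", "S"):
            ops.append((label, ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif label == "D":
            ops.append((label, ref[i - 1], None))
            i -= 1
        else:
            ops.append((label, None, hyp[j - 1]))
            j -= 1
    ops.reverse()
    counts = {k: sum(1 for op in ops if op[0] == k) for k in "CSDI"}
    return ops, counts


reference = "the cat sat on the mat".split()
recognised = "the cat sit on mat today".split()
ops, counts = align(reference, recognised)
print(counts)
```

With these weights, the example pair is aligned with one substitution (sat/sit), one deletion (the) and one insertion (today). Changing the relative weights can change which alignment is chosen, which is why the weights matter for scoring.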

NIST has developed freely available software for the analysis of continuous speech recognition systems. It consists of two main parts: an alignment program and a statistics package.

The alignment can be performed at the word level and, if the dictionary is available, also at the phone level (so-called phonetic alignment). It is a standard alignment procedure and is therefore recommended for competitive assessment.

The software was developed for the ARPA evaluations, but the programs have been designed to be generally applicable. The alignment program generates a (binary) file with all alignment information, which can be printed by another utility at various levels of detail. Overall results can be compiled, as well as results per speaker. The statistics program can compare the results of different recognition systems pairwise and decide whether the difference in performance is significant, using four different statistical tests.
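The text does not name the four tests, so the following is only a sketch of the general idea of a pairwise significance test on per-speaker results. It uses an exact two-sided sign test, which is an assumption made for illustration and not necessarily one of the tests in the NIST package; the function name `sign_test` and the error rates are hypothetical.

```python
from math import comb


def sign_test(errors_a, errors_b):
    """Exact two-sided sign test on paired per-speaker error rates.

    Returns the p-value for the null hypothesis that system A and
    system B are equally likely to be the better one for a speaker.
    """
    wins_a = sum(1 for a, b in zip(errors_a, errors_b) if a < b)
    wins_b = sum(1 for a, b in zip(errors_a, errors_b) if b < a)
    n = wins_a + wins_b  # speakers with identical error rates are discarded
    if n == 0:
        return 1.0
    k = min(wins_a, wins_b)
    # One-sided tail P(X <= k) for X ~ Binomial(n, 0.5), doubled and capped.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical per-speaker word error rates for two systems.
system_a = [0.10, 0.12, 0.08, 0.15, 0.11, 0.09, 0.13, 0.10]
system_b = [0.14, 0.13, 0.11, 0.19, 0.12, 0.12, 0.15, 0.13]
p_value = sign_test(system_a, system_b)
print(f"p = {p_value:.4f}")
```

A small p-value suggests that the difference between the two systems is unlikely to be due to chance across speakers. The threshold for calling a difference significant (for example 0.05) is a choice left to the evaluator.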



EAGLES SWLG SoftEdition, May 1997.