Recommendations

 

These recommendations indicate how the performance of a speaker recognition  system should be scored.

  1. For closed-set identification 
    -
    Besides the test set misclassification rate, report the average misclassification and mistrust rates, and also provide the corresponding gender-balanced rates if the test population is composed of both male and female speakers.
    -
    As the number of registered speakers is a crucial factor in performance, it is essential to indicate the number of speakers in the registered speaker population. Also mention, for information, the proportion of male and female registered speakers.
    -
    For statistical validity information, indicate the number and male/female proportion of speakers in the test population and the average number of test utterances  per test speaker.

  2. For verification
    -
    For static evaluation, besides the test set false rejection rate and the test set false acceptance rate, provide the average false rejection rate and the average false acceptance rate (the latter computed against known or unknown impostors, depending on whether the impostors' identities are known). The corresponding gender-balanced false rejection and false acceptance rates should also be reported.
    -
    For dynamic evaluation with a speaker-independent threshold, the system ROC curve should be obtained as:
    either the average ROC curve, relating the average false rejection rate to the average false acceptance rate (or to its unknown-impostor variant if the impostors' identities are not known),
    or the gender-balanced ROC curve, relating the gender-balanced false rejection rate to the gender-balanced false acceptance rate (again with the unknown-impostor variant if necessary),
    or the test set ROC curve, relating the test set false rejection rate to the test set false acceptance rate.
    Summarise a ROC curve by its traditional equal error rate (respectively the average, gender-balanced and test set equal error rate). Investigate the possibility of finding a ROC curve model, and report the model equal error rates together with the false rejection rate validity domain over which the model holds with a given accuracy. Find a reasonable compromise between the model accuracy and the extent of this validity domain. (A sketch of the ROC curve and equal error rate computation is given after this list.)
    -
    For dynamic evaluation with speaker-dependent thresholds, compute the individual equal error rate of each speaker's ROC curve and give the gender-balanced equal error rate, the average equal error rate and the test set equal error rate. Investigate the possibility of fitting a common ROC curve model by adjusting a model equal error rate individually for each curve. Here, either an accuracy is fixed and speaker-dependent validity domains are computed, or the validity domain is fixed in a speaker-independent manner and the individual accuracies are computed. In either case, compute the global model equal error rates (gender-balanced, average and test set). Then give, accordingly, either the average validity domain for a speaker-independent accuracy or the average accuracy for a speaker-independent validity domain. (See the sketches following this list.)
    -
    For statistical validity information, indicate the number of registered speakers, the proportion of male and female registered speakers, the number of genuine test speakers, the proportion of male and female genuine test speakers, and the average number of genuine test utterances per genuine test speaker. Also give a relevant description of the test impostor configuration and population.

  3. For open-set identification 
    -
    For static evaluation, score separately the false rejections, the false acceptances and the misclassifications, reporting in each case the test set, average and gender-balanced rates (using the unknown-impostor variants of the false acceptance rates when the impostors' identities are not known).
    -
    For dynamic evaluation with a speaker-independent threshold, project the three-dimensional ROC curve into two two-dimensional curves: one relating false rejections to false acceptances, and one relating false rejections to misclassifications. Summarise the first by its equal error rate and the second by its extremity. Investigate the possibility of using a parametric approach.
    -
    For dynamic evaluation with speaker-dependent thresholds, average the individual equal error rates and the individual curve extremities. Investigate the possibility of using a parametric approach.
    -
    As for closed-set identification  and for verification, give all relevant information concerning the registered population, the genuine test population and the impostor population and test configuration.
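
A minimal sketch of the dynamic verification evaluation with a speaker-independent threshold is given below. It is written in Python (the recommendations above do not prescribe any language) and assumes that per-trial detection scores for genuine speakers and impostors are available, and that higher scores favour acceptance; the function and variable names are illustrative only.

    # Sketch: ROC curve and equal error rate for speaker verification
    # with a single, speaker-independent threshold.
    import numpy as np

    def roc_curve(genuine_scores, impostor_scores):
        """Sweep one common threshold over all scores and return, for each
        threshold value, the false rejection and false acceptance rates."""
        genuine = np.asarray(genuine_scores, dtype=float)
        impostor = np.asarray(impostor_scores, dtype=float)
        thresholds = np.unique(np.concatenate([genuine, impostor]))
        # Convention (assumed): a trial is accepted when score >= threshold.
        fr = np.array([(genuine < t).mean() for t in thresholds])    # false rejections
        fa = np.array([(impostor >= t).mean() for t in thresholds])  # false acceptances
        return thresholds, fr, fa

    def equal_error_rate(fr, fa):
        """Operating point where the false rejection and false acceptance
        rates are (nearly) equal."""
        i = np.argmin(np.abs(fr - fa))
        return (fr[i] + fa[i]) / 2.0

    # Hypothetical usage with pooled test set scores:
    # thresholds, fr, fa = roc_curve(genuine_scores, impostor_scores)
    # eer = equal_error_rate(fr, fa)

The same threshold sweep, applied either to all pooled trials or to per-speaker rates that are subsequently averaged, yields the test set, average or gender-balanced variants of the curve mentioned above.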

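Along the same lines, a sketch for speaker-dependent thresholds: one ROC curve and one individual equal error rate per registered speaker, then an unweighted average over speakers. It reuses the hypothetical roc_curve and equal_error_rate functions from the previous sketch, and the per-speaker score dictionaries are assumed inputs, not part of the recommendations.

    # Sketch: individual equal error rates with speaker-dependent
    # thresholds, then averaged over the registered speakers.
    def individual_eers(genuine_by_speaker, impostor_by_speaker):
        """Return {speaker: individual equal error rate}, one threshold
        sweep per registered speaker."""
        eers = {}
        for speaker, genuine in genuine_by_speaker.items():
            _, fr, fa = roc_curve(genuine, impostor_by_speaker[speaker])
            eers[speaker] = equal_error_rate(fr, fa)
        return eers

    def average_eer(eers):
        """Unweighted mean of the individual equal error rates."""
        return sum(eers.values()) / len(eers)

    # Hypothetical usage:
    # eers = individual_eers(genuine_by_speaker, impostor_by_speaker)
    # print(average_eer(eers))

A gender-balanced counterpart can be obtained by weighting the same individual values as in the following sketch.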
In practice, the gender-balanced, average and test set scores are obtained very easily as linear combinations of the individual speaker scores, as the sketch below illustrates.
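
The sketch below makes this remark concrete; the (gender, number of trials, number of errors) input format is an assumption for illustration, not something specified by the recommendations.

    # Sketch: average, gender-balanced and test set error rates as
    # linear combinations of individual speaker error rates.
    def combine_rates(per_speaker):
        """per_speaker: list of (gender, n_trials, n_errors) tuples,
        one per test speaker; both genders assumed to be represented."""
        rates = [e / t for _, t, e in per_speaker]

        # Average rate: each speaker contributes with equal weight.
        average = sum(rates) / len(rates)

        # Gender-balanced rate: mean of the male and female averages.
        male = [e / t for g, t, e in per_speaker if g == "M"]
        female = [e / t for g, t, e in per_speaker if g == "F"]
        gender_balanced = (sum(male) / len(male) + sum(female) / len(female)) / 2

        # Test set rate: speakers weighted by their number of trials,
        # i.e. all trials pooled together.
        test_set = (sum(e for _, _, e in per_speaker)
                    / sum(t for _, t, _ in per_speaker))

        return average, gender_balanced, test_set

    # Hypothetical usage:
    # avg, gb, ts = combine_rates([("M", 20, 1), ("F", 15, 2), ("M", 25, 0)])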

 

