These recommendations indicate how the performance of a speaker
recognition system should be scored.
- For closed-set identification:
  - Besides the test set misclassification rate, report the average misclassification and mistrust rates, and also provide gender-balanced rates if the test population is composed of male and female speakers (see the sketch below).
  - As the number of registered speakers is a crucial factor of performance, it is essential to indicate the number of speakers in the registered speaker population. Also mention, for information, the proportion of male and female speakers.
  - For statistical validity information, indicate the number and male/female proportion of speakers in the test population and the average number of test utterances per test speaker.
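
The following sketch (Python, with invented trial counts and helper names; the handbook itself prescribes no implementation) illustrates how the test set, average and gender-balanced misclassification rates can be derived from per-speaker results:

    from collections import namedtuple

    # Hypothetical record of closed-set identification trials for one test
    # speaker: how many utterances were presented and how many of them were
    # attributed to another registered speaker (misclassifications).
    Speaker = namedtuple("Speaker", ["name", "gender", "n_trials", "n_misclassified"])

    speakers = [
        Speaker("f1", "F", 20, 1),
        Speaker("f2", "F", 25, 4),
        Speaker("m1", "M", 30, 2),
        Speaker("m2", "M", 15, 3),
    ]

    # Individual misclassification rate of each test speaker.
    individual = {s.name: s.n_misclassified / s.n_trials for s in speakers}

    # Test set rate: all test utterances pooled together.
    test_set_rate = (sum(s.n_misclassified for s in speakers)
                     / sum(s.n_trials for s in speakers))

    def mean(xs):
        xs = list(xs)
        return sum(xs) / len(xs)

    # Average rate: every test speaker contributes equally.
    average_rate = mean(individual.values())

    # Gender-balanced rate: average within each gender, then across genders.
    gender_balanced_rate = mean(
        mean(individual[s.name] for s in speakers if s.gender == g)
        for g in ("F", "M")
    )

    print(test_set_rate, average_rate, gender_balanced_rate)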
- For verification:
  - For static evaluation, besides the test set false rejection rate and the test set false acceptance rate, provide the average false rejection rate and the average false acceptance rate (distinguishing the cases where the impostors' identities are known or unknown). Gender-balanced false rejection and false acceptance rates should also be reported.
  - For dynamic evaluation with a speaker-independent threshold, the system ROC curve should be obtained by plotting, as the decision threshold varies, the false rejection rate against the false acceptance rate:
    - either with the gender-balanced rates,
    - or with the average rates,
    - or with the test set rates,
    using in each case the false acceptance variant corresponding to known or unknown impostors, as appropriate. Summarise a ROC curve by its traditional equal error rate (the gender-balanced, average or test set equal error rate, respectively; see the sketch below). Investigate the possibility of finding a ROC curve model, and report the model equal error rates (gender-balanced, average and test set) together with the false rejection rate validity domain within which the model achieves a given accuracy. Find a reasonable compromise between the accuracy and the extent of the validity domain.
  - For dynamic evaluation with speaker-dependent thresholds, compute the individual equal error rate of each speaker's ROC curve and give the gender-balanced equal error rate, the average equal error rate and the test set equal error rate. Investigate the possibility of fitting a common ROC curve model by adjusting individually a model equal error rate for each curve. Here, either an accuracy is fixed and speaker-dependent validity domains are computed, or the validity domain is fixed in a speaker-independent manner and the individual accuracies are computed. In either case, compute the global model equal error rates (gender-balanced, average and test set). Then give accordingly either the average validity domain for a speaker-independent accuracy, or the average accuracy for a speaker-independent validity domain.
  - For statistical validity information, indicate the number of registered speakers, the proportion of male and female registered speakers, the number of genuine test speakers, the proportion of male and female genuine test speakers, and the average number of genuine test utterances per genuine test speaker. Also give a relevant description of the test impostor configuration and population.
- For open-set identification:
  - For static evaluation, score separately the false rejections, the false acceptances (against known or unknown impostors, as appropriate) and the misclassifications, giving in each case the gender-balanced, average and test set rates (a counting sketch is given after this list).
  - For dynamic evaluation with a speaker-independent threshold, project the three-dimensional ROC curve into two curves: one relating the false rejection rate to the false acceptance rate, and one involving the misclassification rate. Summarise the first one by its equal error rate and the second one by its extremity. Investigate the possibility of using a parametric approach.
  - For dynamic evaluation with speaker-dependent thresholds, average over speakers the individual equal error rates and the individual extremities. Investigate the possibility of using a parametric approach.
  - As for closed-set identification and for verification, give all relevant information concerning the registered population, the genuine test population, the impostor population and the test configuration.
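
For the open-set case, the sketch below (again with invented trials and a simplified decision rule) shows how, at a given threshold, each genuine trial falls into one of three outcomes (correct identification, false rejection or misclassification) and each impostor trial into one of two (rejection or false acceptance); sweeping the threshold then traces the three rates that make up the three-dimensional ROC curve:

    def open_set_rates(threshold, genuine_trials, impostor_trials):
        # genuine_trials:  (true_speaker, identified_speaker, score) triples
        # impostor_trials: (identified_speaker, score) pairs
        # A trial is rejected when its score falls below the threshold.
        fr = mc = 0
        for true_spk, found_spk, score in genuine_trials:
            if score < threshold:
                fr += 1                    # genuine speaker wrongly rejected
            elif found_spk != true_spk:
                mc += 1                    # accepted, but wrongly identified
        fa = sum(score >= threshold for _, score in impostor_trials)
        return (fr / len(genuine_trials),   # false rejection rate
                fa / len(impostor_trials),  # false acceptance rate
                mc / len(genuine_trials))   # misclassification rate

    # Hypothetical trials; sweeping the threshold yields the joint behaviour of
    # the three rates, which can then be projected onto two-dimensional curves.
    genuine = [("a", "a", 2.3), ("a", "b", 1.9), ("b", "b", 0.7), ("b", "b", 1.6)]
    impostor = [("a", 1.8), ("b", 0.4), ("a", 0.9)]
    for t in (0.5, 1.0, 1.5, 2.0):
        print(t, open_set_rates(t, genuine, impostor))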
In practice, gender-balanced, average and test set scores are obtained very easily as various linear combinations of individual speaker scores, as illustrated below.
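
For instance, writing $s_i$ for the individual score (an error rate, say) of test speaker $i$, $n_i$ for the number of trials behind it, $N$ for the number of test speakers, and $M$ and $F$ for the sets of male and female test speakers (notation chosen here purely for illustration), the three aggregates are:

\[
s_{\mathrm{test\ set}} = \frac{\sum_i n_i\, s_i}{\sum_i n_i},
\qquad
s_{\mathrm{average}} = \frac{1}{N} \sum_i s_i,
\qquad
s_{\mathrm{gender\mbox{-}balanced}} = \frac{1}{2} \left( \frac{1}{|M|} \sum_{i \in M} s_i + \frac{1}{|F|} \sum_{i \in F} s_i \right),
\]

each of which is a linear combination of the individual scores $s_i$, with weights depending only on trial counts and gender.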