An open-set identification system can be viewed as a function which
assigns to any test utterance z an estimated speaker index, corresponding
to the identified speaker in the set of registered speakers, or outputs 0
if the applicant speaker is considered an impostor.
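As a minimal sketch of such a function, assume (hypothetically) that the
system produces one non-negative match score per registered speaker for the
utterance z, and that the best-scoring speaker is accepted only if that score
reaches a decision threshold. The names identify_open_set, scores and threshold
are illustrative and are not part of the original description.

```python
from typing import Sequence


def identify_open_set(scores: Sequence[float], threshold: float) -> int:
    """Return the 1-based index of the identified speaker, or 0 for an impostor.

    scores[i - 1] is the (assumed non-negative) match score of the test
    utterance against registered speaker i; higher means a better match.
    """
    if not scores:
        return 0  # no registered speakers: every applicant is rejected
    best = max(range(len(scores)), key=lambda i: scores[i])
    # Reject the applicant when even the best match is below the threshold.
    return best + 1 if scores[best] >= threshold else 0
```

Under these assumptions, a threshold of 0 accepts every applicant, so the
function then behaves like a closed-set identifier.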
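In open-set identification, three types of error can be distinguished:
a misclassification error, in which a registered speaker is accepted but
identified as another registered speaker; a false acceptance, in which an
impostor is accepted as one of the registered speakers; and a false
rejection, in which a registered speaker is rejected as an impostor.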
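The outcome of a single trial can be labelled with a small helper such as the
one sketched here. The function name, the label strings and the convention
that index 0 means "impostor" or "rejected" are assumptions made for
illustration.

```python
def label_trial(true_index: int, decided_index: int) -> str:
    """Label one open-set identification trial.

    Index 0 stands for "impostor" (true_index) or "rejected" (decided_index);
    positive indices denote registered speakers.
    """
    if true_index == 0:
        # The applicant is an impostor: any acceptance is an error.
        return "correct_rejection" if decided_index == 0 else "false_acceptance"
    if decided_index == 0:
        return "false_rejection"      # registered speaker turned away
    if decided_index != true_index:
        return "misclassification"    # accepted, but as the wrong speaker
    return "correct_identification"
```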
Here, two points of view can be adopted.
The first is to consider a misclassification error as a false acceptance
(while a correct identification is treated as a true acceptance). In this
case, open-set identification can be scored in the same way as verification,
namely by evaluating a false rejection rate and a false acceptance rate. The
concept of ROC curve can be extended to this family of systems, and in
particular, an equal error rate can be computed. However, when the threshold
tends to 0, the false acceptance rate is now bounded by a value determined by
the closed-set misclassification rate of the system, i.e. the performance that
the open-set identification system would provide if it were functioning in
closed-set mode. Therefore, a parametric approach for dynamic evaluation would
require a specific class of ROC curve models (with at least two parameters).
Moreover, merging misclassification errors with false acceptances may not be
appropriate if the two types of error are not equally harmful.
An alternative solution is to keep the three types of error distinct, and to
measure them by three rates: a false rejection rate, a false acceptance rate
and a misclassification rate. The
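ROC curve is now a curve in the three-dimensional space of the three error
rates. Its two extremities are the point at which the false rejection rate is
0, the false acceptance rate is 1 and the misclassification rate equals its
closed-set value, and the point at which the false rejection rate is 1 and the
other two rates are 0. The ROC curve can be projected as two planar curves:
the false acceptance rate as a function f of the false rejection rate, and the
misclassification rate as a function g of the false rejection rate. The first
projection is a monotonically decreasing curve such that f(0) = 1 and
f(1) = 0, whereas the second projection is also monotonically decreasing, and
satisfies g(1) = 0, with g(0) equal to the closed-set misclassification rate.
A minimal description of the curve could then be the equal error rate of
function f and the closed-set identification score of function g. Parametric
models of the curve with two degrees of freedom could be envisaged, but to our
knowledge, this remains an unexplored research topic.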
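The following sketch shows one way the three rates and the two summary figures
could be estimated from a set of scored trials. It rests on several
assumptions that are not taken from the text: scores are non-negative and an
applicant is accepted when the best score reaches the threshold; the false
rejection and misclassification rates are computed over registered-speaker
trials and the false acceptance rate over impostor trials; and the equal error
rate of f is approximated on a finite grid of thresholds. All names are
hypothetical.

```python
from typing import Sequence, Tuple

# (true index, one score per registered speaker); true index 0 = impostor
Trial = Tuple[int, Sequence[float]]


def three_rates(trials: Sequence[Trial], threshold: float) -> Tuple[float, float, float]:
    """Return (false rejection, false acceptance, misclassification) rates.

    Assumes at least one registered-speaker trial and one impostor trial.
    """
    n_reg = n_imp = fr = fa = mis = 0
    for true_index, scores in trials:
        best = max(range(len(scores)), key=lambda i: scores[i])
        decided = best + 1 if scores[best] >= threshold else 0
        if true_index == 0:
            n_imp += 1
            fa += decided != 0                      # impostor accepted
        else:
            n_reg += 1
            fr += decided == 0                      # registered speaker rejected
            mis += decided not in (0, true_index)   # accepted as wrong speaker
    return fr / n_reg, fa / n_imp, mis / n_reg


def minimal_description(trials: Sequence[Trial],
                        thresholds: Sequence[float]) -> Tuple[float, float]:
    """Approximate the equal error rate of f and the closed-set rate g(0)."""
    points = [three_rates(trials, t) for t in thresholds]
    # Equal error rate of f: the grid point where the false rejection and
    # false acceptance rates are closest, averaged.
    fr, fa, _ = min(points, key=lambda p: abs(p[0] - p[1]))
    eer = (fr + fa) / 2
    # Threshold 0 accepts every applicant, i.e. closed-set operation.
    closed_set_mis = three_rates(trials, 0.0)[2]
    return eer, closed_set_mis
```

The second returned value is a misclassification rate; the corresponding
closed-set identification score would be one minus that value if it is defined
as a rate of correct identification, which is again an assumption.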
Of the two possibilities, we believe that the second is to be preferred, though it is slightly more complex.