Speaker population size and typology

In this section, we indicate how, and to what extent, the composition of the speaker population, in terms of size and typology, can affect the performance of a speaker recognition system, and how it should be taken into account when designing an evaluation experiment.

When the goal is closed-set speaker identification, the complexity of the task clearly increases with n, the size of the registered speaker population. However, the proportion of men and women in the population also has a direct influence, as same-sex confusions are usually much more likely than cross-sex errors. If additional geographical, physiological, or even psychological and sociological information seems particularly relevant or clearly specific to the tested population, the experimenter should be aware of it and state it explicitly.
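One way to make the influence of the male/female split visible in an evaluation report is to break identification errors down into same-sex and cross-sex confusions. The following is a minimal sketch only: the function name `tally_confusions`, the speaker labels, and the toy trial list are hypothetical and serve purely as illustration.

```python
from collections import Counter

def tally_confusions(trials, sex_of):
    """Count correct decisions, same-sex errors and cross-sex errors.

    trials: iterable of (true_speaker, identified_speaker) pairs
    sex_of: dict mapping each registered speaker to "F" or "M"
    """
    counts = Counter(correct=0, same_sex_error=0, cross_sex_error=0)
    for true_spk, hyp_spk in trials:
        if hyp_spk == true_spk:
            counts["correct"] += 1
        elif sex_of[hyp_spk] == sex_of[true_spk]:
            counts["same_sex_error"] += 1
        else:
            counts["cross_sex_error"] += 1
    return counts

# Hypothetical toy data: four registered speakers, five identification trials.
sex_of = {"s1": "F", "s2": "F", "s3": "M", "s4": "M"}
trials = [("s1", "s1"), ("s1", "s2"), ("s3", "s4"), ("s2", "s3"), ("s4", "s4")]
confusions = tally_confusions(trials, sex_of)
```

Reporting the three counts alongside the number of male and female registered speakers lets a reader judge how much of the error rate is tied to the composition of the population.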

As far as speaker verification is concerned, the level of performance does not depend on the size of the registered speaker population, since for each trial the complexity of the task corresponds to open-set speaker identification with n = 1. A large, representative population of registered speakers only guarantees a higher statistical validity of the evaluation results, whereas general conclusions will be less reliable with a small, specific population.

However, a relevant issue for speaker verification (and open-set identification) is the number and typology of pseudo-impostors, i.e. speakers used to model impostors during the registration phase. With more pseudo-impostors, the modelling of imposture is usually more accurate. The way pseudo-impostors are selected, and the ways in which they differ from authorised users, are also essential.

In general, each registered speaker has a corresponding impostor model, which represents real impostors who could claim his identity. The impostor model can be common to all registered speakers, or specific to each authorised user, if the pseudo-impostor population varies across subscribers. Pseudo-impostors can be chosen within the population of registered speakers, or originate from an external population. We will use the term pseudo-impostor bundle to refer to the group of speakers who have been used to build the impostor model of a given registered speaker.
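The notion of a pseudo-impostor bundle can be pictured as a mapping from each registered speaker to a set of speakers. The sketch below shows two possible configurations under stated assumptions: a common bundle shared by everyone, and speaker-specific bundles in which each speaker's bundle consists of all other registered speakers. The function names and the leave-one-out selection rule are hypothetical choices made for illustration; other ways of selecting pseudo-impostors are equally possible.

```python
def common_bundles(registered, pool):
    """Give every registered speaker the same bundle (a common impostor model).

    pool may be the registered population itself or an external population.
    """
    shared = frozenset(pool)
    return {spk: shared for spk in registered}

def leave_one_out_bundles(registered):
    """Speaker-specific bundles drawn from the registered population.

    Assumption for this sketch: each bundle is every other registered speaker.
    """
    everyone = frozenset(registered)
    return {spk: everyone - {spk} for spk in registered}

# Hypothetical speaker labels.
registered = ["a", "b", "c"]
external = ["x", "y"]

internal_common = common_bundles(registered, registered)   # bundle includes the speaker himself
external_common = common_bundles(registered, external)     # bundle from an external population
specific = leave_one_out_bundles(registered)                # bundle varies across speakers
```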

From a practical point of view, when impostor models are built from other registered speakers, the recording burden is lighter, but the impostor models may be less representative of imposture in general. If an additional population of external speakers is used, the number of additional pseudo-impostors, their population typology, and the speech quantity and number of sessions required from each of them should be specified.

Incidentally, for the evaluation of a speaker verification system, a test impostor should not be part of the pseudo-impostor bundle of the speaker he is claiming to be; otherwise, the real rejection abilities of the system may be overestimated. On the other hand, there is no objection to having a registered speaker belong to his own pseudo-impostor bundle, as is the case when the whole registered population is used to build a common impostor model.
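The constraint on test impostors can be enforced mechanically when the list of impostor trials is drawn up. The following is a minimal sketch, assuming that bundles are stored as a dictionary from each registered speaker to a set of pseudo-impostors; the function name `impostor_trials` and the speaker labels are hypothetical.

```python
def impostor_trials(test_impostors, bundles):
    """Return the (impostor, claimed_speaker) pairs that are valid for evaluation.

    bundles: dict mapping each registered speaker to his pseudo-impostor bundle
    """
    trials = []
    for claimed, bundle in bundles.items():
        for imp in test_impostors:
            if imp == claimed:
                continue  # a speaker claiming his own identity is not an impostor trial
            if imp in bundle:
                continue  # the system has already seen this speaker as a pseudo-impostor
            trials.append((imp, claimed))
    return trials

# Hypothetical toy configuration: "x" is an external pseudo-impostor,
# "z" is a speaker unseen during the registration phase.
bundles = {"a": {"b", "x"}, "b": {"a", "x"}}
valid = impostor_trials(["b", "x", "z"], bundles)
```

In this toy configuration only the unseen speaker "z" yields valid impostor trials, which shows how quickly the pool of usable test impostors shrinks when bundles are built from the same speakers that are available for testing.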

EAGLES SWLG SoftEdition, May 1997.