Another feature classically used to specify a speaker recognition system is its level of text dependence, i.e. the constraints imposed on the linguistic material of a test utterance. A main distinction is conventionally drawn between text-dependent systems and text-independent systems. Although this basic distinction is not precise enough to cover the range of practical possibilities, we give below a definition of these two terms according to the usage found in the literature. To simplify, in text-dependent systems the linguistic content of the training and test material is identical, while in text-independent systems test utterances vary across trials (at least in terms of word order).
However, a closer study of the strategies used in practice shows that at least five levels of text dependence should be distinguished. Two of them are text-dependent approaches, which differ in whether they use a personal password or a common password. The other three can be viewed as variants of text-independent approaches, using either fixed words in a random order (fixed-vocabulary systems), a specific linguistic event, wherever it occurs (event-dependent systems), or completely unrestricted text (unrestricted text-independent systems).
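The five levels can be summarized as a small data structure. The following Python sketch is purely illustrative: the names `TextDependence`, `is_text_dependent`, and `fixed_vocabulary_prompt` are hypothetical and do not come from any particular system, and the prompt generator assumes that a fixed-vocabulary system simply presents its fixed word list in a random order.

```python
import random
from enum import Enum


class TextDependence(Enum):
    """The five levels of text dependence distinguished in the text."""
    PERSONAL_PASSWORD = "personal password"
    COMMON_PASSWORD = "common password"
    FIXED_VOCABULARY = "fixed vocabulary"
    EVENT_DEPENDENT = "event dependent"
    UNRESTRICTED = "unrestricted text-independent"


# The two password-based levels are the text-dependent ones;
# the remaining three are variants of text-independent approaches.
TEXT_DEPENDENT_LEVELS = frozenset(
    {TextDependence.PERSONAL_PASSWORD, TextDependence.COMMON_PASSWORD}
)


def is_text_dependent(level: TextDependence) -> bool:
    """Return True for the two text-dependent levels."""
    return level in TEXT_DEPENDENT_LEVELS


def fixed_vocabulary_prompt(vocabulary, rng):
    """Return the fixed words in a random order (assumed prompting scheme)."""
    words = list(vocabulary)  # copy so the caller's list is not modified
    rng.shuffle(words)
    return words


if __name__ == "__main__":
    digits = ["one", "two", "three", "four", "five"]
    print(is_text_dependent(TextDependence.COMMON_PASSWORD))
    print(fixed_vocabulary_prompt(digits, random.Random(0)))
```

In this sketch the vocabulary stays the same from trial to trial and only the word order changes, which is the sense in which fixed-vocabulary systems count as text-independent here.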