If the acoustic part of a recogniser is to be assessed, one can use specific vocabularies that concentrate on certain linguistic features. One method was proposed by [Steeneken (1987)], where the test vocabulary consists of so-called CVC-words (consonant-vowel-consonant). Within the test vocabulary, words differ only in the initial consonant, in the vowel or in the final consonant. Thus, by measuring the confusion matrix, one can get diagnostic information on which consonants (or vowels) are hard to distinguish. This CVC-database assessment method is based upon speech intelligibility evaluation.
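As a minimal sketch of how such a confusion matrix yields diagnostic information, the Python fragment below counts (spoken, recognised) pairs and lists the most frequent off-diagonal cells. The function names and the small set of initial-consonant results are hypothetical and serve only as illustration; they are not taken from [Steeneken (1987)].

```python
from collections import Counter


def confusion_matrix(pairs):
    """Count how often each spoken phoneme was recognised as each response."""
    return Counter(pairs)


def most_confused(matrix, top=3):
    """Return the off-diagonal cells (errors) with the highest counts."""
    errors = [
        (count, spoken, heard)
        for (spoken, heard), count in matrix.items()
        if spoken != heard
    ]
    # Largest count first; ties are broken alphabetically for a stable order.
    errors.sort(key=lambda e: (-e[0], e[1], e[2]))
    return [(spoken, heard, count) for count, spoken, heard in errors[:top]]


# Hypothetical results for the initial consonant: (spoken, recognised).
results = [
    ("p", "p"), ("p", "b"), ("p", "b"), ("b", "b"), ("b", "p"),
    ("t", "t"), ("t", "t"), ("t", "d"), ("d", "d"), ("m", "n"),
]

matrix = confusion_matrix(results)
for spoken, heard, count in most_confused(matrix):
    print(f"{spoken} recognised as {heard}: {count} time(s)")
```

In this made-up data the pair p/b dominates the errors, which is the kind of observation the diagnostic test is meant to surface.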
One of the major advantages of using a CVC-type database is that the recognition scores will generally be low. This might seem counterintuitive, but for diagnostic and development purposes it is useful, because the recognition score can then be measured reasonably accurately with a relatively small test. For instance, if the purpose of the recognition test is to tune some technical recogniser parameters (such as an energy threshold), one wants a recognition score that does not saturate at 100%. In this way, the effect of a parameter change will be apparent even on a small test set.
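One way to make this argument concrete is to assume, purely for illustration, that each test item is an independent trial with the same error probability (a binomial model; this assumption is not stated in the source). Under that assumption the relative standard error of the measured error rate e on N items is sqrt((1 - e) / (e N)), so a test with few errors gives a far less precise error estimate than a test of the same size with many errors. The function name and the example figures below are hypothetical.

```python
import math


def relative_std_error(error_rate, n_items):
    """Relative standard error of an error rate estimated from n_items
    independent trials (binomial assumption, for illustration only)."""
    return math.sqrt((1.0 - error_rate) / (error_rate * n_items))


# Hypothetical comparison on a test of 100 items.
for error_rate in (0.01, 0.40):
    rel = relative_std_error(error_rate, 100)
    print(f"error rate {error_rate:.0%}: relative standard error {rel:.0%}")
```

With 100 items, an error rate of 1% is known only to within roughly its own size, whereas an error rate of 40% is known to within about 12% of its value, so a change caused by retuning a parameter is much easier to detect in the second case.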
Another advantage of the use of CVC-databases is the small size of the vocabulary. For Dutch CVCs, for instance, lists of 17 initial consonants, 15 vowels and 11 final consonants are representative of the language. Such a small vocabulary allows confusions to be studied accurately, and also makes a quick tune-test cycle possible.
As a result of the SAM effort, a speech database on CD-ROM called EUROM-1 has been produced, which contains CVC-tests for various languages. There are also embedded CVC-tests, where the test words are embedded in carrier sentences [Steeneken (1991)]. These can be used to test both word spotting systems and connected word recognisers.
Another approach was proposed by [Simpson & Ruth (1987a), Simpson & Ruth (1987b)]. Their test set is based on Phonetic Discrimination with 100 words (PD-100). The test words have been designed to differ minimally in phonetic respects, for pairwise comparison. As in the related Diagnostic Rhyme Test (DRT), the response set is closed (i.e. the choices are forced), and this may lead to misleading results.