Most of the detail on background procedures has been given in the section on segmentation. The main issues to be covered here are the extent to which ``judges'' agree about a category to be labelled and which psychophysical effects influence that judgment.
If the judge is a phonetician, he might well be influenced in boundary placement by the sound just heard. For example, this sort of expert judge knows the effects of plosives on the duration of the following vowel, and those of pre-pausal lengthening. This knowledge might influence his categorisation of events in a way that is atypical of the population of listeners from (in our ANN example) the EU country at large. (It is presumed that the recogniser is to be a model of representative listeners, not a model of listeners trained to hear things in ways that might be coloured by alternative theories.)
If the judge is to locate the phonemes of a language, some of the judgments will depend upon the duration of the events (e.g.\ /t/ -- // and vowel quality). In addition, different speakers speak at different rates, and even the same speaker adjusts his rate within an utterance. These influences can lead to variable labelling of phonemes because of well-known psychophysical effects which affect human (but not machine) judgments. This makes the machine's task of duplicating human performance a difficult one.
These effects are called range effects [Parducci (1965)]. They are ubiquitous features of human judgment behaviour, but here they will be illustrated for the judgment of speech segments in contexts spoken at different rates. Generally speaking, judgments about the attribute value of an event are affected by the range of that attribute in the contextual material presented for judgment at the same time. So, here, a judgment about the temporal characteristics of an event to determine, for example, whether it is a /ta/ or /a/ will be affected by the temporal properties of the rest of the material: a sound will have to be longer to be judged /a/ in a slow context than in a fast one. The resulting changes in /ta/ and /a/ counts are due to judges being influenced by the context; they are not due to changes in speaker behaviour. The net effect is a spurious decrease in the /a/ count when the rate is slow.
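
The range effect described above can be made concrete with a small numerical sketch. It is not a model taken from the text: it assumes a deliberately simplified range principle, in which a listener places a sound by where its duration falls between the shortest and longest durations in the context, and assigns the category that requires greater duration when that relative position exceeds a fixed criterion. The function names, the criterion of 0.5, and all durations (in milliseconds) are hypothetical values chosen only for illustration.

```python
# Illustrative sketch of a range effect on duration-based labelling.
# ASSUMPTIONS (not from the source text): a pure "range principle" judge,
# a criterion of 0.5, and invented durations in milliseconds.

def range_position(duration_ms, context_ms):
    """Relative position of a duration within the range of the context."""
    lo, hi = min(context_ms), max(context_ms)
    if hi == lo:
        return 0.5  # no spread in the context: treat the sound as mid-range
    return (duration_ms - lo) / (hi - lo)

def judged_long(duration_ms, context_ms, criterion=0.5):
    """True if the sound is labelled as the category needing more duration."""
    return range_position(duration_ms, context_ms) > criterion

# Hypothetical contexts: the same kind of material spoken fast and slowly.
fast_context = [60, 80, 100, 120, 140]
slow_context = [100, 130, 160, 190, 220]

# One physically identical sound, judged against each context.
target = 120
target_long_in_fast = judged_long(target, fast_context)
target_long_in_slow = judged_long(target, slow_context)

# The same set of tokens, counted under each context.
tokens = [90, 110, 130, 150]
long_count_fast = sum(judged_long(t, fast_context) for t in tokens)
long_count_slow = sum(judged_long(t, slow_context) for t in tokens)

print(target_long_in_fast, target_long_in_slow)
print(long_count_fast, long_count_slow)
```

Under these assumptions the tokens themselves never change, yet fewer of them receive the longer label in the slow context. That is the kind of spurious shift in counts attributed above to the judges rather than to the speaker.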