The spectrographic method for speaker recognition relies on an instrument that converts the speech signal into a visual display. For many years the reference instrument was the ``Voice Identification Inc., Sound Spectrograph, model 700''. This instrument produces a permanent record of how the energy-frequency distribution of a speech wave changes over time. Typically, the frequency range is 0-4000 Hz and the analysis filter bandwidth is 300 Hz. Since spectrograms are visual representations of the speech signal, they convey information about the message spoken as well as about the speaker. For this reason, these patterns were thought to be usable as a means of identifying speakers. For example, given recordings of the voices of two individuals, an examiner may be able to give an opinion about the similarity between the two recordings, provided that they share common phonetic elements. This method for speaker identification was originally proposed in [Gray & Kopp (1944)], but its use in forensic applications was not considered until 1962, when [Kersta (1962)] published the results of experiments on one-word spectral comparison in closed-set tests. Further studies were carried out by [Stevens et al. (1968)] and by [Tosi et al. (1972)], who presented the results of research at Michigan State University based on a ``forensic model'' with open-set tests. These results were analysed by [Bolt (1970)] in terms of the error rates of false identification and of false elimination. That analysis found that the error rate depends on many factors, such as differing environmental noise conditions, changes in the psychological state of the speaker, attempts by the speaker to alter his voice, the recording conditions, and whether the talker's voice is orthophonic or telephonic. In particular, the error rate depends strongly on the examiner and increases when untrained rather than trained examiners are used.
Owing to these factors and to other restrictive conditions that affect the error rate of examiners, this method is today of little interest to researchers working on speaker recognition.
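The kind of display produced by the spectrographic method can be approximated digitally with a short-time Fourier analysis. The following is a minimal sketch in Python, not a model of the Sound Spectrograph itself. The function name `wideband_spectrogram`, the 8000 Hz sampling rate (chosen so that the analysed band ends at 4000 Hz), the Hann window, the hop size, and the rule of thumb that a window of roughly `sample_rate / bandwidth_hz` samples corresponds to an analysis bandwidth of about 300 Hz are all assumptions made for illustration.

```python
import cmath
import math


def wideband_spectrogram(samples, sample_rate=8000, bandwidth_hz=300.0, hop=8):
    """Return (times, freqs, frames) for a short-time Fourier analysis.

    frames[i][k] is the magnitude in dB at time times[i] and frequency freqs[k].
    Assumption: window length ~ sample_rate / bandwidth_hz (rough rule of thumb).
    """
    win_len = max(2, round(sample_rate / bandwidth_hz))
    # Hann window to limit spectral leakage (an illustrative choice).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (win_len - 1))
              for n in range(win_len)]
    n_bins = win_len // 2 + 1  # bins from 0 Hz up to at most sample_rate / 2
    freqs = [k * sample_rate / win_len for k in range(n_bins)]

    times, frames = [], []
    for start in range(0, len(samples) - win_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(win_len)]
        row = []
        for k in range(n_bins):
            # Direct DFT; the window is short, so this stays cheap.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / win_len)
                      for n in range(win_len))
            row.append(20 * math.log10(abs(acc) + 1e-12))
        frames.append(row)
        times.append((start + win_len / 2) / sample_rate)  # frame centre, seconds
    return times, freqs, frames


# Usage: a synthetic 1000 Hz tone lasting 0.1 s (hypothetical test signal).
tone = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(800)]
times, freqs, frames = wideband_spectrogram(tone)
middle = frames[len(frames) // 2]
print("strongest bin near", freqs[middle.index(max(middle))], "Hz")
```

With these settings the window is 27 samples long (about 3.4 ms), so the frequency bins are spaced roughly 296 Hz apart. Such a short window follows rapid changes in the signal over time at the cost of coarse frequency resolution, which is the trade-off that a wide analysis filter implies.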
Both the listening and the spectrographic methods are subjective techniques: the first is based on aural comparison of recordings, the second on visual examination of spectrograms, in both cases with the aim of deciding whether two voice samples can be attributed to the same talker. These decisions are made by one or several experts, following a process that is clearly impossible to formulate explicitly. Moreover, the process by which a particular individual comes to be designated an ``expert'' is itself far from calibrated. Very often, experts are also asked to attach a probability, or confidence level, to their own judgments. Are forensic speech experts subjected to a benchmark test before they are recruited? Does it make sense to put a figure on one's confidence in one's own decision? Several aspects of the forensic application of human speaker recognition clearly need to be called into question.