Reproducibility can also be supported by applying standardised signals or channels. This increases the chance of establishing a common basis of signal generation techniques that can also be applied by audio non-professionals. Additionally, the variety of the available speech databases can be decreased.
The primary way to take advantage of standardised signals is to rely on speech signals that have previously been produced and collected in a spoken language corpus (cf. Chapters 3 to 5 on Spoken Language Corpora). A description of public domain spoken language corpora is given in Appendix L; for an up-to-date overview of available corpora and their distribution we recommend contacting the relevant speech agencies such as ESCA, ELRA or LDC via the Internet.
If special signals that are not already available are needed for measurement purposes, a standardised scheme for signal generation should be considered. Various hardware and software signal generators exist that produce, for instance, sinusoidal signals at an adjustable frequency, or wide-band or narrow-band noise. Noise with constant power density over constant-bandwidth intervals is called white noise; noise with constant power per third-octave or octave (constant relative bandwidth) is called pink noise. Another common reference signal is the artificial voice, which is made up of a sequence of standardised glottal pulses reflecting typical long-term frequency characteristics of speech [CCITT (1988a)].
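The difference between the two noise types can be illustrated with a minimal software sketch. It assumes NumPy and shapes the pink noise in the frequency domain, which is only one of several possible methods; the function names `white_noise`, `pink_noise` and `octave_band_power` are hypothetical and chosen for this illustration only.

```python
import numpy as np


def white_noise(n_samples, rng):
    """Gaussian white noise: constant power density per Hz."""
    return rng.standard_normal(n_samples)


def pink_noise(n_samples, rng):
    """Pink noise obtained by shaping white noise in the frequency domain.

    The power density is made proportional to 1/f, so the amplitude
    spectrum is scaled by 1/sqrt(f). This is an illustrative method,
    not a standardised generator.
    """
    spectrum = np.fft.rfft(rng.standard_normal(n_samples))
    freqs = np.fft.rfftfreq(n_samples)      # normalised frequency, cycles/sample
    scale = np.zeros_like(freqs)            # DC bin stays at zero
    scale[1:] = 1.0 / np.sqrt(freqs[1:])
    pink = np.fft.irfft(spectrum * scale, n=n_samples)
    return pink / np.std(pink)              # normalise to unit standard deviation


def octave_band_power(signal, sample_rate, f_low):
    """Summed spectral power in the octave band [f_low, 2 * f_low)."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    band = (freqs >= f_low) & (freqs < 2.0 * f_low)
    return power[band].sum()
```

Comparing the power in adjacent octave bands shows the expected behaviour: for white noise the power roughly doubles from one octave to the next, because the band is twice as wide, whereas for pink noise it stays roughly constant.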
For the measurement of sound intensity, for instance the intensity of environmental noise, a sound level meter has to be used. This instrument yields the sound level in dB and can optionally apply a weighting curve over frequency. Three weighting curves have been defined, referred to as A, B and C, that more or less correspond to the equal-loudness contours of the human ear at three different sound levels (cf. Section 8.2.3 about isophones). In practice, only the A-weighting is used, yielding an A-weighted sound pressure level expressed in dB(A).
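As a rough illustration of how an A-weighting can be applied in software, the following sketch uses the commonly quoted analytic form of the A-curve (as given in the sound level meter standard IEC 61672) and applies it to the spectrum of a digital signal. It assumes NumPy; the function names are hypothetical, and the resulting level is uncalibrated, that is, it is relative to a digital RMS value of 1.0 and not a sound pressure level in dB(A). A calibrated sound level meter remains necessary for absolute measurements.

```python
import numpy as np


def a_weighting_gain(freq_hz):
    """Linear amplitude gain of the A-weighting curve (about 1.0 at 1 kHz)."""
    f2 = np.asarray(freq_hz, dtype=float) ** 2
    num = 12194.0**2 * f2**2
    den = ((f2 + 20.6**2)
           * np.sqrt((f2 + 107.7**2) * (f2 + 737.9**2))
           * (f2 + 12194.0**2))
    # The +2.00 dB offset normalises the curve to approximately 0 dB at 1 kHz.
    return num / den * 10.0 ** (2.0 / 20.0)


def a_weighting_db(freq_hz):
    """A-weighting in dB for frequencies above 0 Hz."""
    return 20.0 * np.log10(a_weighting_gain(freq_hz))


def a_weighted_level_db(signal, sample_rate):
    """Uncalibrated A-weighted level in dB relative to an RMS value of 1.0."""
    n = len(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    power = np.abs(np.fft.rfft(signal)) ** 2 / n**2
    # Account for the negative-frequency half of the spectrum.
    power[1:] *= 2.0
    if n % 2 == 0:
        power[-1] /= 2.0                    # the Nyquist bin has no mirror image
    weighted = power * a_weighting_gain(freqs) ** 2
    return 10.0 * np.log10(weighted.sum())
```

For example, `a_weighting_db(100.0)` yields an attenuation of roughly 19 dB, which reflects the reduced sensitivity of the ear at low frequencies, while a pure 1 kHz tone passes almost unchanged.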
Suitable reference signals have to be chosen, but even so, the channel characteristics must not be neglected. To ensure a clearly defined input to the channel, an anechoic room is recommended for positioning the sound source (the speaker). Furthermore, the acoustic characteristics of a human speaker may be simulated by an artificial mouth, and the human receiver (head with two ears) may be simulated by an artificial head (the ear) [Blauert (1983)]. Both influences can also be emulated by signal processing techniques (cf. Section 8.8).
It is often neglected that the behaviour of electro-acoustic transducers such as microphones does not remain constant over long periods. Therefore these channel components have to be calibrated at least at the recommended intervals. Check the data sheet or ask the manufacturer of the transducer for calibration intervals and methods.