Any frequency spectrum that passes through a linear system can undergo only linear distortions. In practice these distortions manifest themselves as a change in signal amplitude and/or a change in signal phase for each frequency component. Consequently, the distortions of linear systems are divided into two categories: amplitude distortions and phase distortions.
The characteristic of this kind of distortion is a frequency-dependent change of the ratio of output (response) amplitude to input (excitation) amplitude, the so-called transmission factor
\[
T(f) = \frac{\hat{y}(f)}{\hat{x}(f)},
\]
where $\hat{x}(f)$ denotes the excitation amplitude and $\hat{y}(f)$ the response amplitude at frequency $f$.
For an acoustic or mechanical output quantity the transmission factor is also called efficiency, while it is called sensitivity in the electrical case.
The dependency $T(f)$ is called the frequency response, and may be obtained by calculation or by measurement. A graphic representation of the amplitude-frequency curve is called a Bode diagram. The transmission factor at a test frequency $f$ compared to the transmission factor at a reference frequency $f_r$ (commonly 1000Hz for audio devices) is called damping distortion. In case of the same input amplitude for the test and the reference frequency, the linear and the logarithmic damping distortion are defined respectively by
\[
D(f) = \frac{T(f)}{T(f_r)}
\qquad \mbox{and} \qquad
D_{\mathrm{dB}}(f) = 20 \log_{10} \frac{T(f)}{T(f_r)} \; \mathrm{dB}.
\]
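As a minimal numerical sketch (the measurement values and variable names are illustrative assumptions, not taken from the text), the transmission factor and the linear and logarithmic damping distortion relative to the 1000Hz reference could be computed as follows:
\begin{verbatim}
import numpy as np

# Hypothetical measurement: test frequencies in Hz, excitation (input)
# amplitudes x_hat and response (output) amplitudes y_hat.
freqs = np.array([40.0, 100.0, 1000.0, 5000.0, 10000.0])
x_hat = np.array([1.00, 1.00, 1.00, 1.00, 1.00])
y_hat = np.array([0.50, 0.85, 1.00, 0.95, 0.70])

# Transmission factor T(f): ratio of response to excitation amplitude.
T = y_hat / x_hat

# Reference transmission factor at 1000 Hz.
T_ref = T[np.argmin(np.abs(freqs - 1000.0))]

# Linear damping distortion D(f) = T(f)/T(f_r) and its logarithmic form.
D_lin = T / T_ref
D_dB = 20.0 * np.log10(D_lin)

for f, d, d_db in zip(freqs, D_lin, D_dB):
    print(f"{f:7.0f} Hz   D = {d:4.2f}   ({d_db:+5.1f} dB)")
\end{verbatim}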
For instance, a system may be described by ``frequency range 40 to 10000Hz 1dB''. This means that at all frequencies within the cited frequency range $B$ the damping distortions are lower than 1dB. If a frequency range is cited without a specified damping distortion, we can assume a maximum distortion of $-3$dB compared to 1000Hz, corresponding to a decrease of the transmission factor to $1/\sqrt{2} \approx 0.71$ of its reference value. Where, at constant input amplitude, the output amplitude of a system changes in proportion to the frequency $f$, the damping distortion is specified in ``dB per octave''.
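For example, if the output amplitude is proportional to $f$, doubling the frequency (one octave) doubles the output amplitude, which corresponds to a slope of
\[
20 \log_{10} \frac{T(2f)}{T(f)} = 20 \log_{10} 2 \approx 6 \; \mathrm{dB~per~octave}.
\]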
In general terms, $B$ is the bandwidth with an approximately horizontal amplitude-frequency response and may also be specified by the lower and upper cut-off frequencies $f_l$ and $f_u$. Below and above these frequencies the response differs appreciably from the response within $B$. We should bear in mind that a specification of a lower and an upper cut-off frequency gives no information on either the strength or the kind of distortion (dips/notches or peaks) below and above $B$. Additionally, the quality of linearity within $B$ is not fully specified. Therefore a more practicable description of the amplitude distortion may be obtained from a tolerance mask, i.e. an amplitude-frequency region that must contain the measured frequency response, as illustrated by the sketch below.
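The tolerance-mask idea can be sketched as a simple check of a measured response against frequency-dependent limits; the mask values and the function name below are hypothetical:
\begin{verbatim}
import numpy as np

def inside_tolerance_mask(response_dB, lower_dB, upper_dB):
    """True if the response (in dB re. the 1000 Hz reference) stays
    between the lower and upper mask limits at every test frequency."""
    response_dB = np.asarray(response_dB)
    return bool(np.all((response_dB >= np.asarray(lower_dB)) &
                       (response_dB <= np.asarray(upper_dB))))

# Hypothetical mask of +/-1 dB within B, relaxed to -6 dB at the band edges.
response_dB = [-4.5, -0.3,  0.0,  0.2, -0.9, -5.0]   # measured at 6 frequencies
lower_dB    = [-6.0, -1.0, -1.0, -1.0, -1.0, -6.0]
upper_dB    = [ 1.0,  1.0,  1.0,  1.0,  1.0,  1.0]
print(inside_tolerance_mask(response_dB, lower_dB, upper_dB))   # True
\end{verbatim}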
The frequency range required for high listening quality is to a large extent application dependent. High-quality reproduction of music may be obtained within a range from 40 to 15000Hz, whereas a range from 20 to 20000Hz is recommended for professional audio applications. In the case of speech signals the required frequency range depends on the intended quality: 300 to 3400Hz (telephone quality), 70 to 8000Hz (reasonable quality) and 40 to 15000Hz (high quality). A reduction of the speech frequency range becomes audible to 80% of listeners when the range is limited to 120 to 7900Hz for a male speaker and to 220 to 10500Hz for a female speaker [Webers (1985)].
In the case of digital audio, attention should be paid to the choice of the sampling rate $f_s$, which must be equal to or higher than twice the upper cut-off frequency, i.e. $f_s \geq 2 f_u$. Therefore a compromise has to be found between better listening quality obtained from higher-frequency signal energy on the one hand, and higher storage and processing demands on the other. For professional speech applications or application-independent ``flawless'' recordings, we recommend a sampling rate of 32000, 44100 or 48000Hz. This ensures high quality as well as flexible conversion of the data to standardised digital audio formats.
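A trivial sketch of the condition $f_s \geq 2 f_u$ applied to the standard rates mentioned above (the helper name is a hypothetical illustration):
\begin{verbatim}
def min_standard_rate(upper_cutoff_hz, candidates=(32000, 44100, 48000)):
    """Lowest standard sampling rate satisfying f_s >= 2 * f_u,
    or None if no candidate is high enough."""
    for fs in sorted(candidates):
        if fs >= 2 * upper_cutoff_hz:
            return fs
    return None

print(min_standard_rate(15000))   # high-quality speech/music band -> 32000
print(min_standard_rate(20000))   # professional audio band        -> 44100
\end{verbatim}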
The perceived degradation caused by irregularities in the amplitude-frequency response depends on the form of the deviation from the horizontal ``ideal'' curve. Listening tests have shown that distortion peaks are much more disturbing than distortion valleys or dips: at 90Hz a peak of 10dB was detected in 40% of the test cases, whereas a dip of 25dB was detected in only 30% [Rossi (1988)]. This holds for music, white noise and speech; in this evaluation the signals were presented via headphones. Consequently, dips are considered acceptable, whereas distortion peaks are perceived as very unpleasant. Additionally, distortions at high or medium frequencies are more annoying than the same distortions at low frequencies [Eargle (1976)]. These facts, which can also be corroborated by speech intelligibility tests, have been taken into consideration in standards for high-fidelity electro-acoustic equipment.
So far we have considered one conformity condition of the recording chain: a constant, i.e. frequency-independent, gain $G$. Under this condition the theoretical transfer of an impulse over the recording chain leads to an output impulse that is a scaled version of the excitation. Additionally, the output impulse may be delayed in time by a frequency-independent delay $\tau$, corresponding to a linear phase response $\varphi(f) = -2\pi f \tau$ in the frequency domain.
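A minimal sketch of this relation, assuming SciPy is available: a symmetric (linear-phase) FIR filter delays all frequency components by the same amount, here $(N-1)/2$ samples.
\begin{verbatim}
import numpy as np
from scipy import signal

# A symmetric FIR low-pass filter has an exactly linear phase response,
# i.e. a constant group delay of (N - 1) / 2 samples.
N = 101
b = signal.firwin(N, cutoff=0.25)        # cutoff relative to the Nyquist frequency
w, gd = signal.group_delay((b, [1.0]))   # group delay in samples vs. frequency

# Check within the passband (deep stopband nulls are numerically delicate).
passband = w < 0.2 * np.pi
print(np.allclose(gd[passband], (N - 1) / 2))   # True: frequency-independent delay
\end{verbatim}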
If a speech spectrum is split into two components below and above a cutting frequency $f_c$, the smallest audible delays between these components are approximately 10 ms and are found for cutting frequencies between 500 and 2000Hz [Webers (1985)]. The smallest perceivable delays are speaker-dependent and increase in a reverberant environment.
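A stimulus of this kind could be generated roughly as follows; the crossover design, sampling rate and the noise placeholder for the speech signal are assumptions for illustration, not details of the cited experiment.
\begin{verbatim}
import numpy as np
from scipy import signal

fs = 16000            # sampling rate in Hz (assumption)
f_c = 1000            # cutting frequency in Hz
delay_ms = 10         # delay applied to the upper component

speech = np.random.randn(fs)             # placeholder for one second of speech

# Complementary 4th-order Butterworth low-pass/high-pass split at f_c.
b_lo, a_lo = signal.butter(4, f_c, btype="low", fs=fs)
b_hi, a_hi = signal.butter(4, f_c, btype="high", fs=fs)
low_band = signal.lfilter(b_lo, a_lo, speech)
high_band = signal.lfilter(b_hi, a_hi, speech)

# Delay the upper component by 10 ms and recombine both components.
n = int(round(delay_ms * 1e-3 * fs))
stimulus = low_band + np.concatenate([np.zeros(n), high_band[:-n]])
\end{verbatim}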
Today the effects of linear phase distortions on listeners are still under discussion. There are no defined requirements with respect to this kind of linear distortion.