What do we hear? The logarithmic sound pressure level in dB SPL does not correspond directly to the subjective level of perception. The perceived magnitude is called loudness (measured in sones) and is roughly proportional to the 0.6th power of the sound pressure over a wide range of amplitudes and frequencies. The definition of loudness is based on the loudness level (measured in phons). While a ratio of perceived intensities can only be expressed with the loudness measure, determining loudness levels across frequency leads to the so-called equal-loudness contours, or isophons: curves of constant loudness level, drawn for different intensity levels as a function of frequency (cf. isobars as pressure contours). Although the averaged isophons are standardised, both measures originally have to be determined by listening tests in which subjects compare sounds with a standard reference stimulus. For this stimulus (a 1000 Hz sine wave), the loudness level in phons is defined to be equal to the sound pressure level in dB.
For speech signals, the rule of thumb is as follows: doubling the sound pressure increases the sound pressure level, and also the loudness level (in phons) at 1000 Hz, by 6 dB. Doubling the loudness (in sones), and thereby the perceived magnitude, is on the other hand equivalent to an increase of the sound pressure level by 10 dB.
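The two rules of thumb can be checked numerically. The following is a minimal sketch, not a standardised loudness model: the function names are hypothetical, and the sone/phon conversion assumes the usual convention that 1 sone corresponds to 40 phons, which is not stated in the text above and only holds approximately above 40 phons.

```python
import math


def spl_change_db(pressure_ratio):
    """Change in sound pressure level (dB) for a given sound pressure ratio."""
    return 20.0 * math.log10(pressure_ratio)


def loudness_ratio_power_law(pressure_ratio, exponent=0.6):
    """Approximate loudness ratio (sones) from the 0.6th-power rule."""
    return pressure_ratio ** exponent


def phon_to_sone(phon):
    """Loudness in sones from loudness level in phons.

    Assumption: 1 sone = 40 phons, and loudness doubles every 10 phons.
    """
    return 2.0 ** ((phon - 40.0) / 10.0)


def sone_to_phon(sone):
    """Inverse of phon_to_sone (same assumptions)."""
    return 40.0 + 10.0 * math.log2(sone)


# Doubling the sound pressure: roughly +6 dB.
print(f"pressure x2   -> {spl_change_db(2.0):.2f} dB")

# A 10 dB increase is a pressure ratio of 10**0.5; the power law then
# predicts a loudness ratio close to 2, i.e. "twice as loud".
ratio_10_db = 10.0 ** (10.0 / 20.0)
print(f"+10 dB        -> loudness x{loudness_ratio_power_law(ratio_10_db):.3f}")

# At 1000 Hz the loudness level in phons equals the level in dB SPL.
print(f"60 phons      -> {phon_to_sone(60.0):.1f} sones")
```

Note that the 0.6th-power rule and the 10 dB rule are the same statement in different units: a pressure ratio of 10^0.5 raised to the power 0.6 gives 10^0.3, which is very nearly 2.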
The loudness level can be estimated on the basis of the frequency spectrum, taking into account the frequency response and the masking properties of the auditory system. Frequency response means that the sensitivity of the auditory system varies with frequency: the human ear is most sensitive at frequencies between 2000 and 5000 Hz, and least sensitive at low and high frequencies. This frequency dependence is more pronounced at low than at high sound pressure levels. Additionally, masking properties have to be considered: estimating these measures for real-life sounds such as speech is far more complex, because the human ear suppresses, or masks, signal components in a time- and frequency-dependent way.
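To give a feel for the frequency dependence of hearing sensitivity, the sketch below evaluates the standard A-weighting curve. A-weighting is not mentioned in the text above; it is used here only as a coarse, fixed stand-in for the ear's frequency response at moderate levels. It ignores both the level dependence of the isophons and all masking effects, so it is not a loudness model. The function name is hypothetical.

```python
import math


def a_weighting_db(frequency_hz):
    """A-weighting gain in dB at a given frequency (must be > 0 Hz).

    Uses the common closed-form expression with corner frequencies
    20.6, 107.7, 737.9 and 12194 Hz, normalised to about 0 dB at 1000 Hz.
    """
    f2 = frequency_hz ** 2
    numerator = 12194.0 ** 2 * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(numerator / denominator) + 2.00


# Low and very high frequencies are attenuated; the curve peaks
# slightly above 0 dB in the region of a few kilohertz.
for f in (50, 100, 1000, 3000, 10000, 16000):
    print(f"{f:>6} Hz: {a_weighting_db(f):+.1f} dB")
```

A fuller estimate of the loudness level would replace this single fixed curve with level-dependent equal-loudness contours and add a masking model across time and frequency, as described above.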