Today most signal processing and transmission techniques are carried out in the digital domain, which offers greater robustness than the analog domain and allows the flexible (programmable) and reliable use of digital computers and associated digital hardware to arbitrary accuracy. It is therefore recommended that digital data be used wherever possible. In many cases it is necessary to switch into the digital domain, for instance for storing speech data on Digital Audio Tapes (DATs) or CD-ROMs, or for further processing on computers. In order to understand the characteristics and limits of digital signal representation, the basic concepts of sampling and quantisation must be understood.
An analog signal is continuous in both time and amplitude. The transition to
a time-discrete but amplitude-continuous signal is performed by the sampling
process: by taking one amplitude value (or sample) every $T$ seconds, the
original waveform is converted into a train of pulses. This signal
representation is called Pulse Amplitude Modulation (PAM),
and all coding methods that try to reconstruct this pulse train are called
waveform coding.
$T$ stands for the sampling interval,
and $f_s = 1/T$ stands for the sampling rate or sampling frequency.
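The sampling step described above can be sketched in a few lines of Python. This is only an illustration: the function names `sample_signal` and `tone`, the 1 kHz test tone and the 8 kHz sampling rate are assumptions chosen for the example and do not come from the text.

```python
import math

def sample_signal(signal, sampling_rate_hz, duration_s):
    """Return the samples x(n*T), n = 0, 1, ..., taken over duration_s seconds."""
    T = 1.0 / sampling_rate_hz  # sampling interval in seconds
    n_samples = int(round(duration_s * sampling_rate_hz))
    return [signal(n * T) for n in range(n_samples)]

def tone(t):
    """Hypothetical 'analog' input: a 1 kHz sine tone, defined for any time t."""
    return math.sin(2.0 * math.pi * 1000.0 * t)

# One millisecond of the tone at an assumed sampling rate of 8 kHz
samples = sample_signal(tone, sampling_rate_hz=8000.0, duration_s=0.001)
```

Each entry of `samples` is one pulse amplitude of the PAM representation: discrete in time, but still continuous in amplitude.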
For choosing $T$ without any loss of
information, the sampling theorem has to be borne in mind: a band-limited
analog signal may be represented by time-discrete sampling values at constant
time intervals $T$
without any information loss if
$T \le 1/(2 f_c)$, with sampling rate
$f_s = 1/T \ge 2 f_c$.
(This is only defined for low-pass signals below a specified cut-off frequency $f_c$,
with spectrum
$X(f) = 0$ for $|f| \ge f_c$.)
This means that the highest frequency component
in the analog signal to be sampled has to be lower than half
of the sampling rate $f_s$.
If this cannot be taken for granted, it has to be guaranteed by low-pass
filtering of the signal before the sampling process. Otherwise,
the analog signal cannot be
reconstructed from the samples without severe errors, commonly called aliasing.
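A small numerical experiment illustrates why aliasing makes reconstruction impossible. In this sketch the sampling rate of 8 kHz and the two tone frequencies are assumed values, and `sample_tone` is a hypothetical helper. A 7 kHz cosine lies above half the sampling rate and produces the same samples as a 1 kHz cosine, so the two can no longer be told apart.

```python
import math

def sample_tone(freq_hz, sampling_rate_hz, n_samples):
    """Samples of cos(2*pi*f*t) taken at t = n / sampling_rate_hz."""
    return [math.cos(2.0 * math.pi * freq_hz * n / sampling_rate_hz)
            for n in range(n_samples)]

fs = 8000.0                          # assumed sampling rate in Hz
low = sample_tone(1000.0, fs, 16)    # below fs/2: represented correctly
high = sample_tone(7000.0, fs, 16)   # above fs/2: aliases onto 1000 Hz

# Largest difference between the two sample sequences (rounding error only)
max_diff = max(abs(a - b) for a, b in zip(low, high))
```

Because `max_diff` is at the level of floating-point rounding, no processing applied after sampling can separate the two tones. This is why the low-pass filter has to be applied before sampling.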
While PAM offers time-multiplexing capabilities, the pulse amplitudes are still sensitive to noise.
In a second step, quantisation, the sample amplitudes are binary coded
with a word length of $w$ bits per sample, in order to achieve
an amplitude-discrete representation (linear Pulse Code Modulation or Lin-PCM).
Consequently, for each sample the
closest of the $2^w$ possible amplitude values has to be chosen, the
difference from the original amplitude being the quantisation
error.
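As an illustration of the quantisation step, the following sketch implements a uniform quantiser with $2^w$ levels. The mid-rise characteristic, the normalised amplitude range of -1 to 1 and the name `quantise` are assumptions made for the example; the text does not prescribe a particular quantiser layout.

```python
import math

def quantise(x, w, x_max=1.0):
    """Uniform mid-rise quantiser with 2**w levels over [-x_max, x_max)."""
    levels = 2 ** w
    step = 2.0 * x_max / levels           # size of one quantisation step
    index = int(math.floor(x / step))     # interval that contains x
    # Clip to the available range: amplitudes beyond it cause overload
    index = max(-levels // 2, min(levels // 2 - 1, index))
    return (index + 0.5) * step           # reconstruct at interval midpoint

x = 0.3
x_q = quantise(x, w=3)   # 8 levels, step size 0.25
error = x - x_q          # quantisation error of this sample
```

Within the amplitude range, the magnitude of `error` never exceeds half a quantisation step. Outside the range the quantiser clips, and the error grows without bound.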
While the bandwidth requirements increase by coding $w$ bits (pulses)
per sample, the digital signal is resistant to added
noise, provided that the
noise does not exceed one quantisation step and that the signal amplitude does
not exceed the maximum discrete amplitude range.
In order to waste no quantisation
steps or bits, the recording level has to be controlled so that the full
recording range is used without overload.
The quality of linear
PCM
is commonly described by the signal-to-noise ratio (SNR),
that is, the ratio of signal power to noise power.
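The SNR is usually quoted in decibels. A minimal sketch of the computation follows, assuming that the noise is taken to be the sample-by-sample difference between the original and the coded signal; the helper names `power` and `snr_db` and the toy sample values are hypothetical.

```python
import math

def power(values):
    """Mean power (mean of the squared amplitudes) of a sample sequence."""
    return sum(v * v for v in values) / len(values)

def snr_db(original, coded):
    """SNR in dB, treating (original - coded) as the noise signal."""
    noise = [o - c for o, c in zip(original, coded)]
    return 10.0 * math.log10(power(original) / power(noise))

# Toy data: every coded sample is off by 0.1 from the original
original = [1.0, -1.0, 1.0, -1.0]
coded = [0.9, -0.9, 0.9, -0.9]
ratio = snr_db(original, coded)   # about 20 dB (power ratio of about 100)
```

For quantised speech, `coded` would be the quantiser output, so that the noise term is exactly the quantisation error.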
In addition to this linear time-invariant coding of the original sampled
signal, various modifications have been proposed to take full advantage of
the long-time or short-time characteristics of the speech signal [Rabiner & Schafer (1978)]. One
of these methods is logarithmic PCM (A-law or $\mu$-law Log-PCM), which uses
a higher quantiser resolution at small signal amplitudes and larger
quantisation steps at high amplitudes. Another improvement can be achieved by
continuously adapting the range of the quantiser to the short-time signal
amplitude. A different category of so-called parametric coding
strategies
applies assumptions about the speech production process within an "intelligent"
speech coder, thereby shifting the costs from the transmission line (where very low
bit rates can be achieved) to the signal analysis and synthesis stage.
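The logarithmic characteristic of $\mu$-law Log-PCM can be sketched with the continuous companding formula. This is an idealised illustration. The value $\mu = 255$ is an assumption (it is the value commonly used in telephony), the function names are hypothetical, and practical codecs use a piecewise-linear approximation of this curve rather than the formula itself.

```python
import math

MU = 255.0  # assumed compression parameter

def mu_law_compress(x, mu=MU):
    """Map x in [-1, 1] onto [-1, 1] with a logarithmic characteristic."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=MU):
    """Inverse mapping of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

small = mu_law_compress(0.01)       # small amplitudes are stretched
large = mu_law_compress(1.0)        # full scale stays at full scale
restored = mu_law_expand(small)     # recovers 0.01 up to rounding error
```

Applying a uniform quantiser to the compressed value and expanding afterwards yields fine quantisation steps for small amplitudes and coarse steps for large ones, which is the behaviour described above.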
For further reading, consult [O'Shaughnessy (1987), Rabiner & Schafer (1978), Pierce (1991)].