Waveform Coding

Waveform coding is an approximately lossless form of coding: it treats the speech signal as ordinary data, and the reconstructed signal is as close as possible to the original. Codecs using this technique generally have low complexity and give high quality at rates of 16 kbps and above. The simplest form of waveform coding is Pulse Code Modulation (PCM), which involves sampling and quantizing the input waveform. Narrow-band speech is typically band-limited to 4 kHz and sampled at 8 kHz.

Many codecs try to predict the value of the next sample from the previous samples. This is possible because speech samples are correlated, owing to the nature of the speech signal. An error signal is computed as the difference between the original and predicted signals. Since in most cases this error signal is small relative to the original, it has a lower variance, so fewer bits are required to encode it. This is the basis of Differential Pulse Code Modulation (DPCM) codecs: they quantize the difference between the original signal and the signal predicted from past samples. Adaptive coding is an enhancement to DPCM in which the predictor and quantizer are made adaptive, so that they change to match the characteristics of the speech being coded. The best-known codec using this technique is Adaptive DPCM (ADPCM).

It is also possible to encode in the frequency domain instead of the time domain used by the techniques above. In Sub-Band Coding (SBC), the original speech signal is divided into a number of frequency bands, or sub-bands. Each one is coded independently using any time-domain coding technique, such as an ADPCM encoder. One advantage of doing this is that not all sub-bands influence the perceptual quality of the signal in the same way.
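The prediction-and-difference idea behind DPCM can be illustrated with a minimal sketch. It assumes the simplest possible predictor (the previous reconstructed sample) and a fixed uniform quantizer step; the function names, the 200 Hz test tone and the step size are all hypothetical choices for illustration, and no adaptation (as in ADPCM) is attempted.

```python
import math

def dpcm_encode(samples, step):
    """Quantize the difference between each sample and its prediction."""
    codes = []
    prediction = 0.0  # assumed initial state, shared with the decoder
    for x in samples:
        error = x - prediction
        code = round(error / step)   # uniform quantizer index
        codes.append(code)
        prediction += code * step    # track what the decoder will reconstruct
    return codes

def dpcm_decode(codes, step):
    """Rebuild the signal by accumulating the quantized differences."""
    out = []
    prediction = 0.0
    for code in codes:
        prediction += code * step
        out.append(prediction)
    return out

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# Hypothetical test signal: a 200 Hz tone sampled at 8 kHz.
samples = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(400)]
step = 0.01
codes = dpcm_encode(samples, step)
decoded = dpcm_decode(codes, step)

residual = [c * step for c in codes]  # quantized error signal
max_error = max(abs(a - b) for a, b in zip(samples, decoded))
print("variance of original:", variance(samples))
print("variance of residual:", variance(residual))
print("max reconstruction error:", max_error)
```

Because the encoder predicts from its own reconstruction rather than from the clean input, the quantization error does not accumulate: each decoded sample stays within half a quantizer step of the original. For a strongly correlated input such as this tone, the residual variance is much smaller than that of the signal itself, which is why fewer bits suffice.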
In SBC, more bits are therefore used to encode the sub-bands with the greatest perceptual effect on quality than those in which noise is less perceptually important. Adaptive bit allocation schemes may be used to further exploit these ideas. SBC produces good quality at bit rates ranging from 16 to 32 kbps. However, SBC codecs are considerably more complex than DPCM codecs.

As in spatial video coding, the Discrete Cosine Transform (DCT) is also used in speech coding. The coding scheme that employs it is Adaptive Transform Coding (ATC). Blocks of the speech signal are divided into a large number of frequency bands, and the number of bits used to code each transform coefficient is adapted to the spectral properties of the speech. ATC maintains good signal quality at bit rates of about 16 kbps.
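The adaptive bit allocation mentioned for SBC (and, per coefficient, for ATC) can be sketched with a simple greedy rule. This is only one possible scheme, not necessarily the one used by any particular codec: it assumes band energy as a stand-in for perceptual importance, and it uses the common rule of thumb that each extra quantizer bit lowers the noise power by about 6 dB (a factor of 4). The function name and the band energies are hypothetical.

```python
def allocate_bits(band_energies, total_bits):
    """Greedily hand out bits, one at a time, to the noisiest band."""
    bits = [0] * len(band_energies)
    # With zero bits a band is not coded at all, so its "noise" is its energy.
    noise = list(band_energies)
    for _ in range(total_bits):
        worst = max(range(len(noise)), key=lambda i: noise[i])
        bits[worst] += 1
        noise[worst] /= 4.0  # assumed: one extra bit gives about 6 dB less noise
    return bits

# Hypothetical energies for four sub-bands, from low to high frequency.
energies = [16.0, 4.0, 1.0, 0.25]
bits = allocate_bits(energies, 8)
print(bits)
```

Bands carrying more energy receive more bits, and the weakest band may receive none. A real codec would replace the raw energies with a perceptually weighted measure and recompute the allocation as the spectrum of the speech changes.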

Samir Mohamed 2003-01-08