
Book Review

High Fidelity Multichannel Audio Coding, by Dai Tracy Yang, Chris Kyriakakis, and C. Jay Kuo, Hindawi Publishing Corp., 2004, ISBN 977-5945-08-9

Reviewed by Vladimir Botchev [vladimir.botchev@analog.com]

When it comes to media data compression, one can find many books on static- and motion-video compression and on speech compression, but only two books (both recently published) devoted entirely to high-performance audio compression: the above volume and Introduction to Digital Audio Coding and Standards by Marina Bosi. Both books cover the topic of audio coding in a broadly equivalent manner; however, the present one could be termed more advanced. Its technical explanations are also noticeably better at times, for example on the topics of quantization and lossless coding (Huffman and arithmetic), although some topics, such as temporal noise shaping, are treated in less detail than in Bosi’s book.

The book begins with an introduction to digital audio coding, including some basic signal-processing operations and a short multichannel-audio primer. Chapter 2 presents the topic of quantization in a very approachable manner, covering scalar and vector quantization as well as the basic concepts of bit allocation. Chapter 3 introduces lossless coding techniques, such as Huffman and adaptive Huffman encoding, arithmetic coding, and the QM coder, a variant of arithmetic coding (the authors term it a successor). The codec procedures are presented in detail, with snippets of pseudocode.

Chapter 4, while not a substitute for Zwicker’s Psychoacoustics, provides a concise introduction to what makes high-fidelity audio coding possible at all: human hearing and psychoacoustics. Chapter 5 covers the important topic of perceptual quality assessment. Similar chapters exist in other books, such as Applications of Digital Signal Processing to Audio and Acoustics, but here the material reads more as a recipe than as a set of theoretical considerations. The next two chapters deal with MPEG audio coding tools. They are no substitute for the standards, but they give enough background for one to feel more comfortable when working with them.

The material introduced in Chapter 8 and detailed in subsequent chapters is not part of a standard and could be skipped by practitioners. However, it provides a detailed overview of coding enhancements and gives clues on where audio coding research might be headed (something partially lacking in Bosi’s book). Chapter 9 teaches how interchannel redundancies can be removed using an adaptive Karhunen-Loeve transform. Chapter 10 focuses on the performance of the adaptive Karhunen-Loeve transform and on quantization issues.

Chapter 11 introduces the concept of a scalable bit stream for audio coding, not unlike some video compression schemes, e.g., the embedded zerotree wavelet (EZW) method. Chapter 12 discusses error resiliency in audio codec design. Personally, I have never been able to fully appreciate this particular topic in audio, since it seems that a Reed-Solomon codec with interleaving, perhaps preceded by a Viterbi/Turbo codec, is almost the best one can get, as demonstrated by the use of these principles in the harsh environment of DSL.

In conclusion, there are not many books devoted entirely to high-fidelity digital audio coding. The present one is a good reference on current standards and their implementations, and it may be very useful both for people implementing the standards on various platforms and for those studying the state of the art in digital audio coding. This is a book for audio engineers to consider, even if they have already adopted the earlier book by Dr. Bosi.


Copyright 1995- Analog Devices, Inc. All rights reserved.