Since the introduction of the compact disc in the early 1980s, digital technology has become the standard for the recording and storage of high-fidelity audio. It is not difficult to see why. Digital signals are robust. Digital signals can be transmitted and copied without distortion. Digital signals can be played back without degrading the carrier. Who would want to go back to scraping a needle along a vinyl groove now?
Another advantage of digital audio signals is the ease with which they can be manipulated. Digital Signal Processing (DSP) technology has advanced to such an extent that almost any audio product, from a mobile phone to a professional mixing console, contains a DSP chip. Once again the reasons for the success of DSP are simple: stability, reliability, enhanced performance and programmability. Signal processing functions can be implemented for a fraction of the cost, and in a fraction of the space required by analog circuitry, as well as providing functionality that simply couldn't be done in analog. In fact, so ubiquitous has it now become that, for many people, the word "digital" has become synonymous with "high quality".
The ever-increasing performance and falling cost of DSP hardware have generated new applications and new markets for digital audio in both the consumer and professional audio sectors. Digital Versatile Disk (DVD) and digital surround sound in the home, and digital radio and hands-free cellular phones in the car, are just a few of the DSP-based technologies which have appeared in the last few years. The demands on the quality, speed and flexibility of DSPs have also increased as more functionality is added to DSP products: a DSP might now be required for mixing, equalization, dynamic range compression and data decompression, all in one product, implemented on one chip.
16-bit, 44.1 kHz PCM digital audio continues to be the standard for high quality audio in most current applications such as CD, DAT, and high-quality PC audio. Recent technological developments and improved knowledge of human hearing, however, have created a demand for greater data word lengths. Analog-to-digital converters now available support 18-, 20-, and 24-bits and are capable of exceeding the 96dB dynamic range available using 16-bit data words. Many recording studios now routinely master their recordings using 20- or 24-bit recorders. These technological developments are beginning to make their way into the consumer and "prosumer" audio applications. The most obvious consumer audio impact is DVD which is capable of carrying audio with up to 24-bit resolution at sample rates well above 48 kHz. Another example is a 16-channel digital home studio recorder, capable of sampling at a 96 kHz sample rate with 24-bit resolution. In fact, three trends can be identified which have influenced the current generation of digital audio formats which are set to replace CD digital audio. These can be summarized as follows:
- Higher resolution - either 20- or 24-bits per data word
- Higher sampling frequency - typically 96 kHz and 192 kHz
- More audio channels for a more realistic "3D" sound experience
Low-cost, higher-performance digital signal processors are now appearing on the market to satisfy the high dynamic range requirements for processing or synthesizing audio signals. How many bits are required for processing audio signals? Is it 16, 20, 24, or 32 bits? Does the audio application require fixed-point or floating-point arithmetic? What undesirable side effects of quantization should the audio designer look out for?
The first section in this report briefly reviews desirable characteristics of a DSP for use in audio applications, and then discusses the differences in data formats for fixed- and floating-point processors. Next, the relationship of dynamic range to data word size in processing audio signals is examined. This will aid in determining how many bits would be required for your application, whether it is a lower-cost, low-fidelity consumer device or high-performance, high-fidelity professional audio gear. Finally, for the design of a system with either CD-quality or professional-quality audio, it is suggested that for a digital filter routine to operate transparently, the resolution of the processing system must be considerably greater than that of the input signal. For the highest-quality, professional audio systems, a 32-bit DSP is offered as a suggested solution.
1. What are the Benefits of Using a DSP to Process Audio Signals?
A digital signal processor has one purpose: to operate on quantized signal data as quickly and efficiently as possible. Compared to a typical CPU or microcontroller, a well-architected DSP usually contains the following desirable characteristics to perform real-time DSP computations on audio signals:
Fast and Flexible Arithmetic
Single-cycle computation for multiplication with accumulation, arbitrary amounts of shifting, and standard arithmetic and logical operations.
Extended Dynamic Range for Extended Sum-of-Products Calculations
Extended sums-of-products, common in DSP algorithms, are supported in multiply-accumulate units. Extended precision in the multiplier's accumulator provides extra bits for protection against overflow in successive additions to ensure that no loss of data or range occurs.
Single-cycle Fetch of Two Operands For Sum-of-Products Calculations
In extended sums-of-products calculations, two operands are needed on each cycle to feed the calculation. The DSP should be able to sustain two-operand data throughput, whether the data is stored on-chip or off.
Hardware Circular Buffer Support For Efficient Storage and Retrieval of Samples
A large class of DSP algorithms, including digital filters, requires circular data buffers. A circular buffer is a finite segment of the DSP's memory defined by the programmer that is used to store samples for processing. Hardware Circular Buffering is designed to allow automatic address pointer wraparounds to the beginning of the buffer for simplifying circular buffer implementations, and thus reducing overhead and improving performance. When circular buffering is implemented in hardware, the DSP programmer does not have to be concerned with the additional overhead of testing and resetting the address pointer so that it does not go beyond the boundary of the buffer.
Efficient Looping and Branching for Repetitive DSP Operations
DSP algorithms are repetitive and are most logically expressed as loops. For digital filter routines, a running sum of MAC operations is typically executed in fast, efficient loop structures. A DSP's program sequencer, or control unit, should allow looping of code with minimal or zero overhead. Any loop branching, loop decrementing, and termination test operations are built into the DSP control unit hardware. Also, no overhead penalties should result for conditional branching instructions which branch based on a computation unit's status bits.
All of the above architectural features are used to implement DSP-type operations. For example, convolution is a common signal processing operation in which an input sequence is multiplied by a shifted version of a system's impulse response while a running sum of the products is kept. This is seen in the following convolution equation [17, 18, 19, 20]:

$$y(n) = \sum_{k=0}^{N-1} h(k)\, x(n-k)$$

where x(n) is the input sequence, h(k) holds the N impulse-response (filter) coefficients, and y(n) is the output.
DSP architectural features are designed to perform these types of discrete mathematical operations as quickly as possible, usually within a single instruction cycle. Examining this equation closely shows elements required for implementation. The filter coefficients and input samples required to implement the above equation can be stored in two memory arrays defined as circular buffers. Both circular buffers need to be multiplied together and added to the results of previous iterations. To perform the operation shown above, the DSP architecture should allow one multiplication to be executed, along with an addition to a previous result in a single instruction cycle. Within the same cycle, the architecture should also contain enough parallelism in the compute units to enable memory reads of the next sample and filter coefficient for the next loop iteration. Hardware looping circuitry included in the architecture would allow efficient looping through the number of iterations with zero-overhead. When used in a zero-overhead loop, digital filter implementations become extremely optimized since no explicit software decrement, test and jump instructions are required. Thus, for actual implementation of the convolution operation, two circular buffers, multipliers, adders, and a zero-overhead loop construct are required. A digital signal processor contains the necessary building blocks to accomplish implementation of discrete-time filter operations.
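The building blocks named above (a coefficient array, a circular sample buffer, and a multiply-accumulate loop) can be sketched in software. The following Python is only an illustration under stated assumptions: the class name `CircularFIR` and the helper `fir_block` are hypothetical, and the modulo operation stands in for the pointer wraparound that a DSP would perform in hardware at no cost.

```python
class CircularFIR:
    """FIR filter whose delay line is a software circular buffer (illustrative)."""

    def __init__(self, coeffs):
        self.h = list(coeffs)           # filter coefficients h(0)..h(N-1)
        self.x = [0.0] * len(self.h)    # delay line holding the last N samples
        self.pos = 0                    # index of the most recent sample

    def process(self, sample):
        n = len(self.h)
        # Advance the write pointer and overwrite the oldest sample.
        # The modulo models the automatic wraparound of hardware buffers.
        self.pos = (self.pos + 1) % n
        self.x[self.pos] = sample
        acc = 0.0                       # accumulator for the running sum
        idx = self.pos
        for k in range(n):              # one multiply-accumulate per tap
            acc += self.h[k] * self.x[idx]   # h(k) * x(n - k)
            idx = (idx - 1) % n         # step back to the next older sample
        return acc


def fir_block(coeffs, samples):
    """Filter a whole list of samples, starting from an empty delay line."""
    fir = CircularFIR(coeffs)
    return [fir.process(s) for s in samples]
```

Feeding a unit impulse through `fir_block` returns the coefficients themselves, which is a quick sanity check that the indexing matches the convolution sum.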
In performing these types of repetitive DSP calculations, quantization errors from truncation and rounding can accumulate over time, degrading the quality of the DSP algorithmic result. The number of bits of resolution used in the arithmetic computations, along with a given filter structure realization, will determine the robustness of a filter algorithm's signal manipulation. The rest of this article will discuss how many bits would potentially be required for a particular audio application, as this is determined by the complexity of the processing and the desired target signal quality.
2. DSP Numeric Formats: Do I Require Fixed or Floating Point Arithmetic for my Audio Application?
Depending on the complexity of the application, the audio system designer must decide on how much computational accuracy and dynamic range will be needed. The most common native data types are explained briefly in this section. 16- and 24-bit fixed-point DSPs are designed to compute integer or fractional arithmetic. 32-bit DSPs like the Analog Devices ADSP-2106x SHARC family were traditionally offered as floating-point devices; however, this popular family of DSPs can perform floating-point arithmetic as well as integer or fractional fixed-point arithmetic.
2.1 16-, 24-, and 32-Bit Fixed-Point Arithmetic
DSPs that can perform fixed-point operations typically use a two's complement binary notation for representing signals. The representation of the fixed-point format can be signed (twos-complement) or unsigned integer or fractional notation. Most DSP operations are optimized for signed fractional notation. For example, the Analog Devices ADSP-21161 is capable of 32-bit fractional arithmetic.
Signed fractional notation makes sense in DSP computations because a fractional value corresponds directly to a ratio of the full-scale range of samples produced by, for example, a 5 Volt A/D converter, as shown in Figure 1 below. It is harder to overflow a fractional result, because multiplying a fraction by a fraction yields a smaller number, which is then either truncated or rounded. The highest full-scale positive fractional number would be 0.99999, while the full-scale negative number is -1.0. Any level below the highest representable signal from the converter is then expressed as a fraction of that "loudest" signal. For example, the midway positive amplitude for a converter would be 1/2, and this would be interpreted as a fractional value of 0x4000 by a 16-bit DSP.
In the fractional format, the binary point is assumed to be to the right of the MSB (sign bit). In the integer format, the binary point is to the right of the LSB (Figure 2).
Fractional math is more intuitive for signal manipulation. It is the least significant bits of a fractional result that we will examine in this article, since these lower-order bits are the ones that suffer from quantization errors due to finite word length effects. The more bits that are used to represent a given audio signal, the more accurate the arithmetic result. This is discussed in Section 3.
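To make the signed fractional format concrete, here is a minimal Python sketch assuming a 16-bit word with one sign bit and 15 fraction bits (often written 1.15). The function names are hypothetical and the clamping behaviour is an illustrative choice, not a description of any particular DSP.

```python
Q15_SCALE = 1 << 15   # 32768 steps per unit in the assumed 1.15 format


def float_to_q15(value):
    """Convert a value in [-1.0, 1.0) to a 16-bit two's-complement word."""
    n = int(round(value * Q15_SCALE))
    n = max(-Q15_SCALE, min(Q15_SCALE - 1, n))   # clamp to representable range
    return n & 0xFFFF                            # keep the low 16 bits


def _signed(word):
    """Interpret a 16-bit word as a signed integer."""
    return word - 0x10000 if word & 0x8000 else word


def q15_to_float(word):
    """Interpret a 16-bit word as a signed 1.15 fraction."""
    return _signed(word) / Q15_SCALE


def q15_multiply(a, b):
    """Multiply two 1.15 words and truncate the product back to 1.15."""
    product = (_signed(a) * _signed(b)) >> 15    # drop the low-order bits
    product = max(-Q15_SCALE, min(Q15_SCALE - 1, product))
    return product & 0xFFFF
```

With these definitions, one half maps to 0x4000, negative full scale maps to 0x8000, and the largest positive word 0x7FFF decodes to 32767/32768, just under 1.0. Multiplying 0x4000 by itself gives 0x2000 (one quarter), illustrating that a product of two fractions stays within range.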
2.2 32-/40-bit Floating-Point Arithmetic
Floating point math offers flexibility in programming because it is much harder to overflow a result, while the programmer is less concerned about scaling inputs to prevent overflow. IEEE 754/854 Floating-point data is stored in a format that is 32 bits wide, where 24 bits represent the mantissa and 8 bits represent the exponent. The 24-bit mantissa is used for precision while the exponent is for extending the dynamic range. For 40-bit extended precision, 32 bits are used for the mantissa while 8 bits are used to represent the exponent (figures 3 and 4).
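A 32-bit floating point number is represented in decimal as:

$$x = (-1)^{s} \times 1.f \times 2^{\,e-127}$$

where s is the sign bit, f is the stored fraction and e is the 8-bit biased exponent.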
Its binary numeric IEEE format representation is stored on the 32-bit floating point DSP as:
It is important to know that the IEEE standard always refers to the mantissa in signed-magnitude format, and not in twos-complement format. The extra hidden bit effectively improves the precision to 24 bits and also ensures that the mantissa of any normalized number ranges from 1 (binary 1.00...0) to just under 2 (binary 1.11...1), since the hidden bit is always assumed to be a 1.
Figure 7 shows the 40-bit extended-precision format that is also supported on the ADSP-2106x family of DSPs. With extended precision, the mantissa is extended to 32 bits. In all other respects, it is the same format as the IEEE standard format. 40-bit extended precision binary numeric format representation is stored as:
For audio processing, the dynamic range of floating point may be unnecessary for some algorithms, but the programming flexibility of floating point makes it desirable, especially for high-level programming languages like C. Keep in mind that many of the fixed-point precision issues discussed in later sections would still apply for a DSP that supports floating-point arithmetic, at least in terms of truncation and coefficient quantization. The programmer still has to convert the fixed-point data coming from an A/D converter to its floating-point representation, while the floating-point result has to be converted back to its fixed-point equivalent when the data is sent to a D/A converter.
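The role of the hidden bit can be checked by pulling a single-precision value apart. The sketch below uses Python's standard `struct` module; the function names are hypothetical. It assumes the standard IEEE 754 single-precision layout of 1 sign bit, 8 exponent bits with a bias of 127, and 23 stored fraction bits, and it handles normalized numbers only.

```python
import struct


def decode_float32(value):
    """Split a number into IEEE 754 single-precision fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                 # 1 sign bit
    exponent = (bits >> 23) & 0xFF    # 8-bit biased exponent
    fraction = bits & 0x7FFFFF        # 23 stored fraction bits
    return sign, exponent, fraction


def rebuild_normal(sign, exponent, fraction):
    """Rebuild a normalized value, restoring the hidden leading 1."""
    mantissa = 1.0 + fraction / (1 << 23)     # always in [1.0, 2.0)
    return (-1.0) ** sign * mantissa * 2.0 ** (exponent - 127)
```

For example, `decode_float32(-0.75)` yields a sign of 1, a biased exponent of 126 and a fraction of 0x400000, which `rebuild_normal` turns back into -1.5 x 2^-1 = -0.75.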
Floating-point arithmetic was traditionally used for applications that have very high dynamic range requirements, like image processing, graphics and military/space applications. The dynamic range offered for 32-bit IEEE floating-point arithmetic is 1530 dB. Typically in the past, trade-offs were considered with price vs. performance when deciding on the use of floating-point processors. Until recently, the higher cost made 32-bit floating point DSPs unreasonable for use in audio. Today, designers can achieve high-quality audio using either 32-bit fixed or floating point processing with the introduction of the lower-cost 32-bit processors like the ADSP-21161, at a cost comparable to 16-bit and 24-bit DSPs.
3. The Relationship of Dynamic Range to Data Word Size in Digital Audio
One of the top considerations when designing an audio system is determining acceptable signal quality for the application. Table 1 below shows some comparisons of signal quality for some audio applications, devices and equipment.
Table 1. Dynamic Ranges
|Audio Device/Application||Dynamic Range|
|Analog Broadcast TV
|Analog Cassette Player
|ADI SoundPort Codecs
|16-bit Audio Converters
||90 to 95 dB
|Digital Broadcast TV
||92 to 96 dB
|18-bit Audio Converters
|Digital Audio Tape (DAT)
|20-bit Audio Converters
|24-bit Audio Converters
||110 to 120 dB
Audio equipment retailers and consumers often use the phrase 'CD-quality sound' when referring to high dynamic range audio. Compare the sound quality of a CD player to that of an AM radio broadcast. For higher-quality CD audio, noise is not audible, especially during quiet passages in music. Lower-level signals are heard clearly. But the AM radio listener can easily hear low-level noise, at levels audible enough to be a distraction. As an audio signal's dynamic range increases, low-level audio signals become easier to distinguish, while the noise floor is lowered and becomes undetectable to the listener ("noise floor" is a term used to describe the point where the audio signal cannot be distinguished from low-level white noise).
"Recent advancements within the past decade in human hearing indicate the sensitivity of the human ear is such that the dynamic range between the quietest sound detectable and the maximum sound which can be experienced without pain is approximately 120dB. Further studies suggest there is critically important audio information at frequencies up to 40 kHz and possibly 80 kHz"
To achieve CD-type signal quality, the trend in recent years has been to design a system that processes audio signals digitally, using 16-bit A/D and D/A converters with a signal-to-noise ratio (SNR) and dynamic range of around 90-93 dB. When processing these signals, the programmer should normally design the algorithm with enough computational precision, which usually means more than the 16 bits of the compact disc signal itself. CD-quality audio is just one example. Whatever the application, the audio system designer must first determine what is an acceptable SNR and then decide how much precision is required to produce acceptable results for the intended application.
3.1 What Is The SNR and Dynamic Range for a DSP?
In analog and digital terms, SNR (S/N ratio) and dynamic range are often used synonymously. In pure analog terms, SNR is defined as the ratio of the largest known signal that exists to the noise present when no signal exists. In digital terms, SNR and dynamic range are used synonymously to describe the ratio of the largest representable number to the quantization error. A well-designed digital filter should have a maximum signal-to-noise ratio (SNR) that is greater than the converter SNR. Thus, the DSP designer must be sure that the noise floor of a filter is not higher than that implied by the precision of the ADC or DAC.
Figure 5 below shows the relationship between dynamic range, SNR and headroom:
Here is a summary of the terms shown in figure 9 as defined by Davis and Jones (we will be referring to many of these terms frequently throughout this article):
Decibel - Used to describe sound level (sound pressure level) ratio, or power and voltage ratios:
$\mathrm{dB}_{\mathrm{Volts}} = 20\log_{10}(V_o/V_i)$, $\mathrm{dB}_{\mathrm{Watts}} = 10\log_{10}(P_o/P_i)$, $\mathrm{dB}_{\mathrm{SPL}} = 20\log_{10}(P_o/P_i)$
Dynamic Range - The difference between the loudest and quietest representable signal level, or if noise is present, the difference between the loudest (maximum level) signal to the noise floor. Measured in dB.
Dynamic Range = (Peak Level) - (Noise Floor) dB
SNR (Signal-To-Noise Ratio, or S/N Ratio) - The difference between the nominal level and the noise floor. Measured in dB. Other authors define this for analog systems as the ratio of the largest representable signal to the noise floor when no signal is present, which more closely parallels SNR for a digital system.
Headroom - The difference between nominal line level and peak level where signal clipping occurs. Measured in dB. The larger the headroom, the better the audio system will handle very loud signal peaks before distortion occurs.
Peak Operating Level - The maximum representable signal level at which point clipping of the signal will occur.
Line Level - Nominal operating level (0 dB, or more precisely between -10 dB and +4 dB)
Noise Floor - The noise floor for human hearing is the average level of 'just audible' white noise. Analog audio equipment can generate noise from components. With a DSP, noise can be generated from quantization errors. [One can make an assumption that the headroom + S/N ratio of an electrical analog signal equals the dynamic range (although not entirely accurate since signals can still be audible below the noise floor)].
"In theoretical terms, there is an increase in the signal-to-quantization noise or dynamic range by approximately 6 dB for each bit added to the word-length of an ADC, DAC or DSP."
Table 2: An n-bit data word yields 2^n quantization levels
|N Quantization Levels for n-bit data words (N = 2^n levels)
|2^8 = 256
|2^16 = 65,536
|2^20 = 1,048,576
|2^24 = 16,777,216
|2^32 = 4,294,967,296
|2^64 = 18,446,744,073,709,551,616
A higher number of bits used to represent a sample results in a better approximation of the audio signal and a reduction in quantization error (noise), which produces an increase in the SNR. In theoretical terms, there is an increase in the signal-to-quantization noise or dynamic range by approximately 6 dB for each bit added to the word length of an ADC, DAC or DSP.
Note that the "6-dB-Per-Bit-Rule" is an approximation of the actual dynamic range for a given word width. The ratio of the maximum representable signal amplitude to the maximum quantization error of an ideal A/D converter or DSP-based digital system is actually calculated as:

$$\mathrm{SNR}\ (\mathrm{dB}) = 6.02\,n + 1.76$$

The 1.76 dB term is based on sinusoidal waveform statistics and would vary for other waveforms; n represents the data word length of the converter or the digital signal processor.
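As a rough numerical check of the relationship between word length and ideal SNR, the Python sketch below compares the 6 dB-per-bit approximation with the 6.02n + 1.76 dB expression. The third function derives the same figure from first principles under a common textbook assumption that is not stated in this article: a full-scale sine wave and quantization error uniformly distributed over one step. All function names are hypothetical.

```python
import math


def rule_of_thumb_db(bits):
    """6 dB-per-bit approximation."""
    return 6 * bits


def ideal_snr_db(bits):
    """Ideal SNR for a full-scale sine wave, using 6.02n + 1.76 dB."""
    return 6.02 * bits + 1.76


def derived_snr_db(bits):
    """Same quantity from RMS values (assumed model, see lead-in).

    Sine RMS is 2**(bits - 1) * q / sqrt(2); noise RMS is q / sqrt(12).
    Their ratio simplifies to 2**bits * sqrt(1.5).
    """
    return 20 * math.log10(2 ** bits * math.sqrt(1.5))


def snr_table(word_lengths=(16, 20, 24, 32)):
    """Return (bits, rule of thumb, ideal) rows for a few word lengths."""
    return [(n, rule_of_thumb_db(n), round(ideal_snr_db(n), 2))
            for n in word_lengths]
```

For a 16-bit word the approximation gives 96 dB, while the fuller expression gives about 98.08 dB, so the rule of thumb slightly understates the ideal sine-wave figure.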
In undithered DSP-based systems, the SNR definition above is not directly applicable since there is no noise present when there is no signal. In digital terms, dynamic range and SNR (Figure 6) are often both used synonymously to describe the ratio of the largest representable signal to the quantization error or noise floor. Therefore, when referring to SNR or dynamic range in terms of DSP data word size and quantization errors, both terms mean the same thing.
Now the question arises: how many bits are required to design a high-quality audio system? In terms of dynamic range and SNR, what is the best precision one can choose without sacrificing low cost in a given design? Let's first compare the dynamic ranges of DSPs with different native data word sizes. Figure 7 shows the dynamic range relationship between the three most common fixed-point DSP data-word widths: 16, 24 and 32 bits. The quantization level comparisons are also given. As stated earlier, the number of data-word bits used to represent a signal directly affects the SNR and quantization noise introduced during the sample conversions and arithmetic computations.
Table 3. Dynamic Range vs. Resolution (Fixed-Point Binary Representation; # of bits per data word x 6 dB/bit of resolution)
Each additional bit of resolution that is used by the DSP for calculations will reduce the quantization noise power by 6dB. 16-bit fixed-point numeric precision yields 96 dB [16 x 6dB per bit], 24-bit fixed-point precision yields 144 dB [24 x 6dB per bit], while 32-bit fixed-point precision will yield 192 dB [32 x 6dB per bit]. Note that for native single-precision math, a 16-bit DSP is not adequate for accurately representing the full dynamic range required for 'higher-fidelity' audio signals around 120 dB.
In terms of quantization levels, figure 8 demonstrates how 32-bit and 24-bit processing can represent a processed audio signal more accurately than 16-bit processing. 24-bit processing offers 256 times the resolution of 16-bit processing, while 32-bit processing offers 65,536 times the resolution of 16-bit processing and 256 times that of 24-bit processing.
Using the "6-dB-Per-Bit-Rule," 32-bit IEEE floating point dynamic range is determined to be 1530 dB. For floating point this is calculated by the size of the exponent - 6 dB x 255 exponent levels = 1530 dB. (255 levels come from the fact that there is an 8-bit exponent). For floating-point audio processing, we can see there is much more dynamic range available than the 120 dB required for covering the full audio dynamic range capabilities of the human ear.
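The figures quoted in this section follow directly from the 6 dB-per-bit approximation and from powers of two, as the short Python sketch below shows. The function names are hypothetical, and the floating-point calculation simply mirrors the article's own method of multiplying 6 dB by the 255 exponent levels.

```python
DB_PER_BIT = 6   # the approximation used throughout this section


def fixed_point_range_db(bits):
    """Dynamic range of an n-bit fixed-point word under the 6 dB rule."""
    return bits * DB_PER_BIT


def float_range_db(exponent_bits):
    """Dynamic range attributed to the exponent, as computed in the text."""
    levels = 2 ** exponent_bits - 1   # 255 levels for an 8-bit exponent
    return levels * DB_PER_BIT


def resolution_ratio(bits_a, bits_b):
    """How many times more quantization levels bits_a offers than bits_b."""
    return 2 ** (bits_a - bits_b)
```

Calling `fixed_point_range_db` with 16, 24 and 32 reproduces 96, 144 and 192 dB, `float_range_db(8)` reproduces 1530 dB, and `resolution_ratio(24, 16)` reproduces the factor of 256.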
3.2 Additional Fixed Point MAC Unit Dynamic Range for DSP Overflow Prevention
Computation overflow/underflow is a hardware limitation that occurs when the numerical result of the fixed-point computation exceeds the largest or smallest number that can be represented by the DSP. Many DSPs include additional bits in the MAC unit to prevent overflow in intermediate calculations. Extended sums-of-products, which are common in DSP algorithms, are achieved in the MAC unit with single cycle multiply accumulates placed in an efficient loop structure. The extra bits of precision in the accumulator result register provide extended dynamic range for the protection against overflow in successive multiplies and additions. Thus, no loss of data or range occurs. Table 4 shows a comparison of the extended dynamic ranges of 16-bit, 24-bit, and 32-bit DSPs. Note that the ADSP-21161 SHARC 32-bit DSP has a much higher extended dynamic range than 16- and 24-bit DSPs when executing fixed-point multiplication instructions. The MAC unit on the SHARC contains dual accumulators that can produce an 80-bit fixed-point result when multiplying two 32-bit fixed point values. There are 16 bits of additional precision for the 64-bit MAC result. The SHARC's 80-bit result can yield a fixed-point dynamic range as high as 480 dB for intermediate calculations.
Table 4. Comparison of Extended Dynamic Range in Fixed-Point DSP Multiplier Unit
|N-bit DSP||N-bit x
|Additional MAC Result Bits
||Precision in MAC Result Register
||Additional Dynamic Range Gained
||Resulting MAC Dynamic Range|
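The arithmetic behind the extended accumulator figures can be sketched as follows. This Python is illustrative only: `mac_summary` is a hypothetical helper, the 6 dB-per-bit approximation is assumed, and the "worst-case sums" entry relies on the common rule of thumb that g extra accumulator bits allow roughly 2^g full-scale products to be accumulated before overflow becomes possible.

```python
def mac_summary(word_bits, guard_bits, db_per_bit=6):
    """Summarize accumulator width and range for an N x N fixed-point MAC."""
    product_bits = 2 * word_bits                 # N-bit x N-bit product width
    accumulator_bits = product_bits + guard_bits
    return {
        "product_bits": product_bits,
        "accumulator_bits": accumulator_bits,
        "extra_range_db": guard_bits * db_per_bit,
        "total_range_db": accumulator_bits * db_per_bit,
        # Rule-of-thumb count of full-scale products that fit (assumption).
        "worst_case_sums": 2 ** guard_bits,
    }
```

Using the values given in the text for the 32-bit SHARC (a 64-bit product plus 16 additional bits), `mac_summary(32, 16)` reports an 80-bit accumulator and a 480 dB intermediate dynamic range.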
4. Considering Data Word Length Issues When Developing Audio Algorithms Free from Noise Artifacts
Digital Signal Processing is often discussed as if the signals to be processed and the filter arithmetic used to process them are both of infinite precision. However, all implementations of DSP necessarily use words of finite length to represent each and every value, be it a digital audio input sample, a filter coefficient or the result of a multiplication. This finite precision of representation means that any digital signal processing performed to generate a desired result introduces inaccuracy into the result. If a signal goes through several stages of DSP, then each stage will add more inaccuracy.
The effects of a finite word length can severely affect signal quality (i.e. lower the system S/N ratio) and produce unacceptable error when performing DSP calculations. Undesirable effects of finite precision can result from any of the following:
- A/D Conversion Noise
Finite precision of an input data word sample will introduce some inaccuracy for the DSP computation as a result of the nonlinearities inherent in the A/D Conversion Process. Therefore, the accuracy of the result of an arithmetic computation can not be greater than the resolution of the quantized sample. In other words, the A/D conversion process will establish the noise floor for the DSP (unless the D/A converter has a lower noise floor). The DSP programmer must ensure that the noise floor of the processing algorithm does not exceed the noise floor of the A/D converter.
- Quantization Error of Arithmetic Computations From Truncation and Rounding
DSP algorithms such as digital filters generate results that must be truncated or rounded (i.e. re-quantized). When a processing result needs to be stored, it must be quantized to the native data-word length of the processor, introducing an error. For recursive DSP algorithms these re-quantized values are part of a feedback loop, so arithmetic errors can build up, which then reduces the dynamic range of the filter. The smaller the data word of the DSP, the more likely these types of errors will show up in the D/A-converted analog output signal.
In an n-bit fixed-point system, quantization of results may be considered as the addition of noise to the result. Consider a multiplication operation in a digital filter, including re-quantization of the result. This can be modeled as an infinite-precision multiplication followed by an addition stage where quantization noise is added to the product so that the result is equal to an n-bit number.
In a digital signal processing system, multiplication, addition and shift operations are performed on a sequence of n-bit input values. These operations generate results which would require more than n bits to be represented accurately. The solution to this problem is generally to eliminate the low-order bits resulting from an arithmetic operation in order to produce an n-bit value which can be stored by the system.
The two most common methods for eliminating the low-order bits are truncation and rounding. Truncation is accomplished by simply discarding all bits less significant than the least significant bit that is retained. Rounding is performed by choosing the n-bit number which is closest to the original unrounded quantity.
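The difference between the two methods can be seen on plain integers. In the hedged Python sketch below, the function names are hypothetical, values are treated as two's-complement integers, and rounding is implemented as "add half of the retained LSB, then discard", which is one common way to round to nearest.

```python
def truncate(value, drop_bits):
    """Discard the low-order bits (arithmetic shift, rounds toward -infinity)."""
    return value >> drop_bits


def round_nearest(value, drop_bits):
    """Add half of the retained LSB, then discard the low-order bits."""
    return (value + (1 << (drop_bits - 1))) >> drop_bits


def mean_error(quantizer, drop_bits, samples):
    """Average quantization error, in units of the retained LSB."""
    scale = 1 << drop_bits
    errors = [quantizer(v, drop_bits) - v / scale for v in samples]
    return sum(errors) / len(errors)
```

Dropping 4 bits from every value in `range(256)` gives a mean error of about -0.47 LSB for truncation but only about +0.03 LSB for rounding. Truncation therefore adds a bias to the signal as well as noise, which matters when results are fed back through a recursive filter.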
- Computational Overflow
Whenever the result of an arithmetic computation is larger than the highest positive or negative full-scale value, an overflow will occur and the true result will be lost.
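What "the true result will be lost" looks like in practice depends on how the hardware handles the out-of-range value. The Python sketch below contrasts two behaviours for a 16-bit word: two's-complement wraparound, and saturation (clipping at full scale). Saturation is not discussed in this article; it is included here as a common mitigation and should be read as an assumption. The function names are hypothetical.

```python
def wrap16(value):
    """Two's-complement wraparound, as if the result were stored in 16 bits."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def saturate16(value):
    """Clip to the 16-bit full-scale limits instead of wrapping."""
    return max(-32768, min(32767, value))
```

Adding 30000 and 10000 gives 40000, which does not fit in 16 bits: `wrap16(40000)` returns -25536, a large sign error, whereas `saturate16(40000)` returns 32767. Both results are wrong, but the clipped value is far closer to the true one.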
- Coefficient Quantization
Finite Word Length (n-bit data word size) of a filter coefficient can affect pole/zero placement and a digital filter's frequency response. This imprecision can cause distortion in the frequency response of the filter and, in the worst case, instability.
Errors in the values of a filter's coefficients cause alterations in the positions of the transfer function poles and zeros and therefore are manifested as changes to the frequency and phase response characteristics of the filter. In a DSP system of finite precision, such deviations cannot be avoided. It can, however, be reduced by using greater precision for the representation of coefficients. This issue is particularly important for poles close to the unit circle in the z-plane, where an inaccuracy could make the difference between stability and instability.
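The sensitivity of poles near the unit circle can be illustrated with a second-order section whose denominator is 1 + a1 z^-1 + a2 z^-2. The Python sketch below is an assumption-laden example, not a design procedure: the pole radius of 0.999, the pole angle of 0.3 radians, and the function names are all chosen purely for illustration.

```python
import cmath
import math


def quantize(coeff, frac_bits):
    """Round a coefficient to the nearest multiple of 2**-frac_bits."""
    step = 2.0 ** -frac_bits
    return round(coeff / step) * step


def max_pole_radius(a1, a2):
    """Largest pole magnitude of 1 / (1 + a1 z^-1 + a2 z^-2)."""
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    return max(abs((-a1 + disc) / 2.0), abs((-a1 - disc) / 2.0))


def design_resonator(radius, angle):
    """Denominator coefficients for a complex pole pair at radius and angle."""
    return -2.0 * radius * math.cos(angle), radius * radius


def quantized_radius(radius, angle, frac_bits):
    """Pole radius after both coefficients are quantized."""
    a1, a2 = design_resonator(radius, angle)
    return max_pole_radius(quantize(a1, frac_bits), quantize(a2, frac_bits))
```

With 15 fraction bits the quantized poles stay very close to the intended radius of 0.999. With only 6 fraction bits, a2 rounds up to exactly 1.0 and the poles land on the unit circle, so the filter is no longer strictly stable.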
- Limit Cycles
Limit cycles occur in IIR filters from truncation and rounding of multiplication results or from addition overflow. They often cause periodic oscillations in the output, even when the input is zero.
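A zero-input oscillation of this kind, often called a limit cycle, is easy to reproduce with a first-order recursive filter. The sketch below assumes a hypothetical filter y(n) = round(-0.9 y(n-1)) with integer state and zero input. Rounding is done in integer arithmetic (round half up) so that the result does not depend on floating-point behaviour.

```python
def rounded_feedback(y_prev, num=-9, den=10):
    """Return (num/den) * y_prev rounded to the nearest integer (half up)."""
    return (2 * num * y_prev + den) // (2 * den)


def zero_input_response(y0, steps, num=-9, den=10):
    """Run the filter with zero input from initial state y0."""
    out = [y0]
    for _ in range(steps):
        out.append(rounded_feedback(out[-1], num, den))
    return out
```

With infinite precision the output would decay toward zero, since the coefficient magnitude is below 1. With rounding, `zero_input_response(10, 12)` decays only as far as 4 and then alternates between 4 and -4 indefinitely.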
Other than A/D conversion noise, the effects of a finite data-word size depend mainly on the precision of the re-quantization of data and on the type of arithmetic operations used in the DSP algorithm. Any given filter structure can offer a significantly lower noise floor than another structure which accomplishes the same task.
"The overall DSP-based audio system dynamic range is only as good as its weakest link."
In a DSP-based audio system, this means that any one of the following sources or devices in the audio signal chain will determine the dynamic range of the overall audio system:
- The "real world" analog input signal, typically from a microphone or line-level source
- The A/D converter word size and conversion errors
- DSP finite word length effects such as quantization errors resulting from truncation and rounding, and filter coefficient quantization
- The D/A converter word size
- The analog output circuitry connecting to a speaker
- or, another device in the signal path that will further process the audio signal
So the choice of components and the digital filter implementation will also determine the overall quality of the processed signal. For example, if we have a 75 dB D/A converter and a DSP which can maintain 144 dB dynamic range, the overall 'System' dynamic range will still only be 75 dB. So the D/A converter is the limiting factor. Even though the DSP could compute a given algorithm and maintain a result with 122 dB of precision and dynamic range, the result would have to be truncated in order for the DAC to properly convert it back to an analog signal. Now, if the choice is made to use high-quality analog, ADC, and DAC components, wouldn't one want to ensure the signal quality is maintained by the DSP algorithm? Care must then be taken in a digital system to ensure the DSP is not the weakest link in the 'signal chain'.
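The weakest-link argument amounts to taking the minimum over the stages of the signal chain. A minimal Python sketch, with a hypothetical helper name and stage labels, using the 75 dB and 144 dB figures from the example above:

```python
def system_dynamic_range(stages):
    """Return (limiting stage, dynamic range in dB) for a signal chain.

    `stages` maps a stage name to the dynamic range it can maintain.
    """
    limiting = min(stages, key=stages.get)
    return limiting, stages[limiting]


chain = {"DSP": 144, "D/A converter": 75}
limiting_stage, overall_db = system_dynamic_range(chain)
```

Here `limiting_stage` is the D/A converter and `overall_db` is 75, no matter how much precision the DSP maintains internally.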
"For a digital filter routine to operate transparently, the resolution of the processing system must be considerably greater than that of the input signal so that any errors introduced by the arithmetic computations are smaller than the precision of the ADC or DAC."
If a digital signal processing algorithm produces quantization noise artifacts which are above the noise floor of the input signal, then these artifacts will be audible under certain circumstances, especially when an input signal is of low intensity or limited frequency. Therefore, whatever the dynamic range of a high-quality audio input, be it 16-, 20- or 24-bit input samples, the digital processing which is performed on it should be designed to prevent processing noise from reaching levels at which it may appear above the noise floor of the input, and thus become audible content [see 2-Wilson and 5-Chen]. For a digital filter routine to operate transparently, the resolution of the processing system must be considerably greater than that of the input signal so that any errors introduced by the arithmetic computations are smaller than the precision of the ADC or DAC. In order for the DSP to maintain the SNR established by the A/D converter, all intermediate DSP calculations must use a precision greater than the input sample word size [see 2-Wilson, 3-Dattorro, 4-Zolzer, 5-Chen, 6-Kloker, Lindsley, and Thompson].
What are the dynamic ranges that must be maintained for CD-quality and professional-quality audio designs? Fielder [9] demonstrated that consumer CD audio requires 16-bit conversion/processing, while the minimum requirement for professional audio is 20 bits (based on perceptual tests performed on human auditory capabilities). Traditional dynamic range application requirements for high-fidelity audio processing can be categorized into two groups:
'Consumer CD-Quality' audio systems use 16-bit conversion with typical dynamic ranges between 85 and 93 dB.
'Professional-Quality' audio systems use 20- to 24-bit conversion with dynamic ranges between 110 and 122 dB.
5. Maintaining 16-bit 'CD-Quality' Accuracy During DSP Processing
As we saw in the last section, when using a DSP to process audio signals, the DSP designer must ensure that any quantization errors introduced by the arithmetic calculations executed on the processor are lower than the converter noise floor. Consider a 'CD-quality' audio system. If the DSP is to process audio data from a 16-bit A/D converter (ideal case), a 96 dB SNR must be maintained through the algorithmic process in order to maintain a CD-quality audio signal (6 dB/bit x 16 bits = 96 dB). Therefore, it is important that all intermediate calculations be performed with higher precision than the 16-bit ADC or DAC resolution. Errors introduced by the arithmetic calculations can be minimized by using larger data-word sizes for processing audio signals. For fractional fixed-point math, we can visualize extra 'footroom' bits added to the right of the least significant bit of the input sample. The larger word sizes used in the arithmetic operations will ensure that truncation or round-off errors are lower than the noise floor of the D/A converter, as long as 'optimal' algorithms (better filter structures) are used in conjunction with the larger word width.
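The 6 dB-per-bit rule of thumb used above follows from the ratio between full scale and one LSB. The short sketch below computes that ratio for several word lengths; the function name is an assumption for the example, and converter-specific terms are deliberately left out.

```python
import math


def ideal_dynamic_range_db(bits):
    """Ratio of full scale to one LSB for a word of the given length, in dB."""
    return 20.0 * math.log10(2.0 ** bits)


# Each extra bit adds about 6.02 dB, which the text rounds to 6 dB per bit.
for bits in (16, 20, 24, 32):
    print(bits, round(ideal_dynamic_range_db(bits), 1))
```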
When selecting a processor for implementation, a choice therefore has to be made. Should one use a DSP with a smaller data word and double-precision math, or a DSP with a larger data word that supports single-precision math, which is more efficient? It is estimated that double-precision math operations can carry 4 to 5 times the overhead of single-precision math [5, 6]. Double precision not only adds computation overhead to a digital filter, it also doubles the memory storage requirements for the filter coefficient buffer and the input delay line buffer. Every application is different, and although a processor with a smaller native data-word width may suffice for some applications, the use of double-precision computations, coefficients and intermediate storage comes at the expense of a drastic reduction in processing throughput.
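One way to see where the double-precision overhead comes from is to build a 32 x 32-bit multiply out of 16 x 16-bit products, as a 16-bit multiplier would have to. The sketch below is a simplified illustration for unsigned values, with an assumed function name. It does not model any particular DSP instruction set or count cycles.

```python
MASK16 = 0xFFFF


def mul32_from_16bit_parts(a, b):
    """Multiply two unsigned 32-bit values using only 16x16-bit products."""
    a_hi, a_lo = a >> 16, a & MASK16
    b_hi, b_lo = b >> 16, b & MASK16
    # Four partial products replace the single multiply of a native 32-bit unit.
    lo_lo = a_lo * b_lo
    lo_hi = a_lo * b_hi
    hi_lo = a_hi * b_lo
    hi_hi = a_hi * b_hi
    # Align each partial product to its weight and add them up.
    return (hi_hi << 32) + ((lo_hi + hi_lo) << 16) + lo_lo


print(hex(mul32_from_16bit_parts(0x12345678, 0x9ABCDEF0)))
```

The four multiplies plus the extra shifts and additions are consistent with the rough 4 to 5 times overhead quoted above, before counting the doubled storage for coefficients and delay lines.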
To visually see the benefits of a larger DSP word size, let's take a look at the processing of audio signals from a 16-bit A/D converter that has a dynamic range close to its theoretical maximum, in this case with a 92 dB signal-to-noise ratio (see Figure 9 below). Figure 10 below shows a conceptual view of a 16-bit data word that is transferred from an A/D converter to the DSP's internal memory. Typically, the data transfer would occur through a serial port interface from the serial A/D converter, and the DSP may be configured to automatically perform a direct memory access (DMA) transfer of the sample from the serial port circuitry to internal memory for processing. Notice that for the 24-bit and 32-bit processors, there are adequate 'footroom' bits below the noise floor (to the right) to protect against quantization errors.
The 16-bit DSP has only 4 dB higher SNR than the A/D converter's 92 dB, so not much room for error would be allowed in arithmetic computations. We can easily see that for moderate to complex audio processing using single-precision arithmetic, the 16-bit DSP data path will not be adequate for precise processing of 16-bit samples, because truncation and round-off errors can accumulate during the execution of the algorithm. As shown in Figure 11, errors resulting from the arithmetic computations can easily be seen by the output D/A converter and thus become audible noise. For example, complex recursive computations can easily introduce 18 dB of quantization noise, and with the 16-bit DSP word width, the errors are seen by the DAC and hence will be easily heard by the listener.
Double-precision math can obviously still be used for the 16-bit DSP if software overhead is available, but the real performance of the processor will be compromised. A 16-bit DSP using single-precision processing would only suffice for low-cost audio applications where processing is not too complex and SNR requirements are around 75 dB (audio-cassette quality).
The same algorithm implemented on a 24-bit or 32-bit DSP would ensure these errors are not seen by the D/A converter. As can be seen in Figure 11, even though 18 dB of quantization noise was introduced by the computations in the 24-bit and 32-bit DSPs, the errors remain well below the noise floor of the 16-bit DAC when these two processors run the exact same algorithm.
The 24-bit DSP has 8 bits below the converter noise floor to allow for errors. In other words, we have 8 digits to the right of the least significant bit in the 16-bit input sample. It would take 256 (2^8) multiplicative processing operations before the noise floor of the algorithm rises above the resolution of the input sample.
A 32-bit DSP (e.g. the ADSP-21161) has 16-bits below the noise floor when executing 32-bit fractional math, allowing for the greatest computation flexibility in developing stable, noise-free audio algorithms. There are 16 digits to the right of the least significant bit in the 16-bit input sample. It would take 65,536 multiplicative processing operations before the noise floor of the algorithm would go above the resolution of the 16-bit input. With more room for quantization errors, filter implementation restrictions seen with 16- or 24-bit DSPs are now removed.
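The footroom figures quoted for the 24-bit and 32-bit cases can be reproduced with a little arithmetic. The sketch below assumes, as the operation counts in the text imply, that each operation contributes at most one LSB of error in the processing word. The function names are hypothetical.

```python
def footroom_bits(dsp_word_bits, sample_bits):
    """Bits available to the right of the input sample's LSB."""
    return dsp_word_bits - sample_bits


def worst_case_operations(dsp_word_bits, sample_bits):
    """Operations before errors of one LSB each add up to one input-sample LSB."""
    return 2 ** footroom_bits(dsp_word_bits, sample_bits)


def footroom_db(dsp_word_bits, sample_bits, db_per_bit=6.0):
    """Margin below the sample noise floor, using the 6 dB-per-bit rule."""
    return db_per_bit * footroom_bits(dsp_word_bits, sample_bits)


for word in (24, 32):
    print(word, footroom_bits(word, 16), worst_case_operations(word, 16),
          footroom_db(word, 16))
```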
So, using a higher number of bits to process an audio signal reduces quantization error (noise). If these errors remain below the noise floor, the overall 'digital system SNR' established by the converters is maintained. The DSP should not be the limiting factor in signal quality! When using a 16-bit converter for 'CD-quality' audio, the widely accepted recommendation is to use a higher-resolution processor (24- or 32-bit), since additional bits of precision give the DSP the ability to maintain the 96 dB SNR of the audio converters [2, 5, 9].
5.1 Is 24-Bit Processing Always Enough For Maintaining 16-bit Sample Accuracy?
Now it would appear in some cases, 32-bit processing would be unnecessary for minimal processing of 16-bit data. In order to maintain a 96-dB dynamic range, 24 bits would appear to be sufficient to process a 16-bit signal without any double-precision math requirement. But the question is then asked: Is a 24-bit DSP sufficient in all cases to guarantee that noise introduced in a DSP computation will never go above a 16-bit noise floor? For moderate and non-recursive DSP operations, 24-bits should normally be sufficient. However, research conducted in recent years has clearly shown that for precise processing of 16-bit signals in recursive audio processing, a 24-bit DSP may not be sufficient. Recursive filters are necessary for a wide variety of audio applications such as graphic equalizers, parametric equalizers, and comb filters.
In a 1993 AES Journal publication, R. Wilson [2] demonstrated that even for recursive second-order IIR filter computations on a 24-bit DSP, the noise floor of the digital filter can still go above that of the 16-bit sample and hence become audible. To compensate for this, the use of error feedback schemes (error spectrum shaping) or double-precision arithmetic was recommended, especially for extremely critical frequency response designs. The use of double-precision math can increase processor computational overhead by more than a factor of five in the filter computations, while doubling memory storage requirements.
A March 1996 AES Journal publication by W. Chen [5] came to the same conclusion. In order to maintain the 96-dB signal-to-noise ratio for 24-bit processing of second-order IIR filters, a double-precision filter structure was required to ensure that the SNR at the digital equalizer's output was greater than 96 dB. Chen researched various second-order realizations to determine the best structure when performing 24-bit processing on 16-bit input. In one test case, he implemented a single high-pass second-order filter using direct-form-1 structures, finding these implementations to yield an SNR between 85 and 88 dB, which is lower than the 96 dB theoretical maximum of the ideal 16-bit A/D converter.
Chen's second example consisted of cascading second-order structures to implement a sixteenth-order digital equalizer. He then measured the noise floor of the equalizer using an Audio Precision System One tester in order to find an adequate second-order IIR filter structure to meet his target 96-dB requirement. The results of using the 24-bit DSP on a 16-bit sample are shown in Table 5.
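To make the idea of error feedback concrete, the sketch below applies first-order error feedback to a simple one-pole recursive filter in integer arithmetic. It is not Wilson's filter topology or a second-order design; the Q15 coefficient value and all names are assumptions chosen for illustration. The bits discarded by truncation on one sample are added back into the accumulator on the next.

```python
FRAC_BITS = 15            # fractional bits of the Q15 coefficients
ONE = 1 << FRAC_BITS
A_Q = 32440               # feedback coefficient, about 0.99 in Q15 (assumed)
B_Q = ONE - A_Q           # input coefficient chosen for unity DC gain


def one_pole(samples, error_feedback):
    """Run y[n] = a*y[n-1] + b*x[n] with the accumulator truncated each sample."""
    y = 0
    err = 0
    out = []
    for x in samples:
        acc = A_Q * y + B_Q * x
        if error_feedback:
            acc += err            # re-inject the bits discarded last sample
        y = acc >> FRAC_BITS      # truncate back to the state word length
        err = acc & (ONE - 1)     # bits lost by this truncation
        out.append(y)
    return out


step = [1000] * 5000              # constant input, long enough to settle
print(one_pole(step, False)[-1], one_pole(step, True)[-1])
```

With plain truncation the output stalls short of the input level, a deadband caused by the discarded bits. With error feedback the discarded bits accumulate until they move the output, so it settles at the input level. The cost is one extra addition and one extra stored word per filter.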
Table 5. Chen's Results of 24-bit 2nd Order IIR Processing on 16-bit Data [March 1996 Journal of AES]
| Second-Order Filter Structure | S/N Ratio (dB) Results for 16th-Order Equalizer |
| 1 | -75 dB |
| 2 | -63 dB |
| Cascaded Transposed Form 1 | -70 dB |
| Double Precision Cascaded Form 1 | |
| Parallel Form 1 | |
Chen's conclusion: to maintain a signal-to-noise ratio greater than 96 dB when cascading multiple second-order stages, double-precision arithmetic was required. His optimal implementation of the double-precision direct-form-1 filter required more instruction cycles (a 3x increase) and more memory space (a 2x increase) for storing internal filter states.
Recall that a 32-bit DSP has 8 extra bits of precision compared to a 24-bit processor. If a second-order filter structure implemented on a 24-bit processor is instead implemented on a 32-bit fixed-point processor, the noise floor of the arithmetic result should drop by 48 dB (8 bits x 6 dB/bit). Direct-form 1 filter structures are generally the best choice for audio because of their better noise performance [2, 3]. For example, in Chen's results (Table 5), the Parallel Form 1 structure used to construct the equalizer provided the best result for single-precision 24-bit computation. However, this is still short of the ideal 96-dB case. The 24-bit processor's 144-dB ideal noise floor is raised significantly, by 70 to 80 dB, and as a result it lies above the 16-bit converter's noise floor. If this same algorithm is implemented on a 32-bit fixed-point processor, the noise floor of the filter output is lowered by 48 dB (with the 8 extra 'footroom' bits) to 133 dB. This is not only sufficient for remaining lower than a 16-bit converter's noise floor; a 32-bit implementation of the single-precision direct-form 1 structure would be adequate for a 24-bit converter's noise floor as well.
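The 48 dB figure can be checked with a small simulation. The sketch below runs the same direct-form-1 second-order section with its output rounded to 23 and to 31 fractional bits, and compares each against a double-precision reference. The resonator coefficients, the random test signal, and all names are assumptions for illustration. Coefficients are kept in double precision and overflow is not modelled, so this is not a reproduction of Chen's measurements.

```python
import math
import random


def quantize(value, frac_bits):
    """Round to a fixed-point grid with the given number of fractional bits."""
    scale = float(1 << frac_bits)
    return math.floor(value * scale + 0.5) / scale


def biquad_df1(samples, b, a, frac_bits=None):
    """Direct-form-1 biquad; the output is re-quantized once per sample."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        acc = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        y = acc if frac_bits is None else quantize(acc, frac_bits)
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def noise_floor_db(test, reference):
    """Mean power of the difference between two signals, in dB re full scale."""
    power = sum((t - r) ** 2 for t, r in zip(test, reference)) / len(reference)
    return 10.0 * math.log10(power)


# Assumed test filter: resonator at 1 kHz for a 48 kHz sample rate.
r, theta = 0.99, 2.0 * math.pi * 1000.0 / 48000.0
a = (-2.0 * r * math.cos(theta), r * r)
g = (1.0 - r * r) / 2.0
b = (g, 0.0, -g)

rng = random.Random(0)
x = [quantize(rng.uniform(-0.5, 0.5), 15) for _ in range(20000)]  # 16-bit input

ref = biquad_df1(x, b, a)
n24 = noise_floor_db(biquad_df1(x, b, a, 23), ref)
n32 = noise_floor_db(biquad_df1(x, b, a, 31), ref)
print(round(n24, 1), round(n32, 1), round(n24 - n32, 1))
```

The difference between the two noise floors should come out close to 48 dB, since the rounding step shrinks by a factor of 2^8 while the filter's noise gain is unchanged.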
When processing of 16-bit samples with a 32-bit processor versus a 24-bit processor, the 8 additional bits available below the noise floor and the use of 32-bit filter coefficients will ensure that double-precision overhead is not necessary when using any standard second-order IIR filter realization.
6. Processing 110-120 dB, 20-/24-bit Professional-Quality Audio
When the compact disc was launched in the early 1980s, the digital format of 16-bit words sampled at 44.1 kHz was chosen for a mixture of technical and commercial reasons. The choice was limited by the quality of available analog-to-digital converters, by the quality and cost of other digital components, and by the density at which digital data could be stored on the medium itself. It was thought that the format would be sufficient to record audio signals with all the fidelity required for the full range of human hearing. However, research since the introduction of CD technology has shown that this format is imperfect in some respects.
New research conducted within the last decade indicates that the sensitivity of the human ear is such that the dynamic range between the quietest sound detectable and the maximum sound which can be experienced without pain is approximately 120dB. Therefore, 16-bit CD-quality audio is no longer thought to be the highest-quality audio that can be stored and played back. Also, many audiophiles claimed that CD-quality audio lacked a certain warmth that a vinyl groove offered. This may have been due to a combination of the dynamic range limitation of 16-bits as well as the chosen sample rate of 44.1 kHz. The 16-bit words used for CD allow a maximum dynamic range of 96 dB although with the use of dither this is reduced to about 93 dB. Digital conversion technology has now advanced to the stage where recordings with a dynamic range of 120dB or greater may be made, but compact disc is unable to accurately carry them.
Recent technological developments and improved knowledge of human hearing have created a demand for greater word lengths and faster sampling rates in the professional and consumer audio sectors. It has long been assumed that the human ear was capable of hearing sounds up to a frequency of about 20 kHz and was completely insensitive to frequencies above this value. This assumption was a major factor in the selection of a 44.1 kHz sampling rate. New research has suggested that many people can distinguish the quality of audio at frequencies of up to 25 kHz, and that humans are also sensitive to a degree to frequencies above even this value. This research is mainly empirical, but if it holds, a substantially higher sampling frequency is necessary. D. E. Blackmer [7] has suggested that in order to fully meet the requirements of human auditory perception, a sound system must be designed to cover frequencies up to 40 kHz (and possibly up to 80 kHz) with over 120 dB dynamic range to handle transient peaks. This is beyond the capabilities of many of today's digital audio systems. As a result, 18-, 20- and even 24-bit analog-to-digital converters are now widely available which are capable of exceeding the 96 dB dynamic range available using 16 bits.
6.1 The Race Toward The Use of 24-bit A/D and D/A Conversion
Multibit sigma-delta converters capable of 24-bit conversion are now in production by various manufacturers (Analog Devices, Crystal Semiconductor, and AKM Semiconductor, to name a few). The popularity of 24-bit D/A converters is increasing for both professional and high-end consumer applications. The reason for using these higher-precision A/D and D/A converters for audio processing is clear: the distortion performance (linearity) of these higher-resolution converters is much better than that of 16-bit converters. The other obvious reason is the increase in SNR and dynamic range that they provide over 16- to 20-bit technology.
"24-bit A/D and D/A converter technology is capable of 120-122 dB dynamic range, fully supporting the dynamic range capability of the human ear up to the threshold of pain of 120 dB, at sample rates of 96 kHz and 192 kHz."
Many 24-bit converters on the market range from 110 to 120 dB, which is professional quality and close to the range of the human ear. The higher-end converters range from 117 dB to 122 dB (conversion errors such as intermodulation distortion introduced by the 24-bit converters limit the final SNR from the theoretical 148 dB maximum). These newer 24-bit converters have up to 120-122 dB dynamic range, easily accommodating input sources such as a 120 dB low-noise condenser microphone.
At many AES conventions in recent years, professional equipment manufacturers have showcased equipment with 24-bit conversion at 96 kHz sample rates. New DVD standards are extending the digital formats to 24 bits at sample rates of 96 kHz and 192 kHz. Professional-quality audio is emerging in the consumer audio market sector, traditionally a market with less stringent audio specifications. The race is on for audio equipment manufacturers to include 24-bit, 96 kHz converters to maintain signal quality up to 120 dB.
6.2 Comparing 24-bit and 32-bit Processing of Audio Signals with 24-bit Resolution
For years it has been widely accepted that in most cases 24-bit DSP processing offers adequate precision for 16-bit samples. With higher-precision 24-bit converters emerging to support newer professional and consumer audio standards, what will become the recommended processor word-width required to maintain 24-bit precision? For 24-bit conversion, a 24-bit DSP may no longer be able to adequately process 24-bit samples without resorting to double-precision math, especially for recursive second-order IIR algorithms. Newer 24-bit converter technology is making a strong case for 32-bit processing. The use of a 32-bit DSP has already become the logical processor-of-choice for many audio equipment manufacturers when using a 24-bit signal conversion. Let's examine why this is the case.
Figure 12 visually demonstrates a typical situation that can result from moderately complex or recursive processing of 24-bit samples. Note that the 24-bit sample in this case is assumed to be a 1.23 fractional number as delivered by the 24-bit converters. The extra bits of precision that 32-bit fixed-point processing provides lie to the right of the 24-bit input's LSB. For example, the parallel combination of second-order IIR filters can result in significant quantization artifacts in the lower-order bits of the data word. If both the 24-bit and 32-bit DSPs end up producing errors that introduce 24 dB of noise (4 bits x 6 dB/bit), the error will show up on the 24-bit DAC for the 24-bit DSP, since its result is above the noise floor. Single-precision computations with 24-bit processing can limit the result of a processed input to about 15-bit accuracy. Should one use double-precision routines on the 24-bit processor, or should one opt for a 32-bit processor when using a 24-bit converter? Using a 32-bit processor, the errors produced during the computations will never be seen by a 120 dB, 24-bit D/A converter.
Recall from section 5 that Wilson's and Chen's research demonstrated that even for second-order IIR filter designs using a 24-bit processor, one may require additional error feedback computations or double-precision math to ensure the noise floor remains lower than that of a 16-bit converter. If 24-bit computations can introduce noise artifacts that go above a 16-bit noise floor for complex second-order filters, what does that mean? We can conclude that when a 24-bit DSP processes 24-bit samples, the noise floor of the digital filter will always be greater than the 24-bit converter's noise floor unless methods are implemented to reduce it. The costly methods of implementing error-feedback schemes and double-precision arithmetic then become unavoidable and can add significant overhead in processing 24-bit audio data.
With many converter manufacturers introducing 24-bit A/D and D/A converters to meet emerging consumer and professional audio standards, audio systems using these higher-resolution converters will require at least 32-bit processing in order to offer sufficient precision to ensure that a filter algorithm's quantization noise artifacts will not exceed the noise floor of the 24-bit input signal. If optimal filter routines are used for complex processing, any quantization noise introduced in the 32-bit computations will never be seen by the 24-bit output D/A converter. In many cases, the audio designer can choose from a number of second-order structures because the resulting SNR will still be greater than 120 dB. 32-bit processing will guarantee that the noise artifacts remain below the 120-dB noise floor, and hence provide a dynamic range of the audio signal up to the human ear's threshold of pain. Therefore, the goal of developing robust audio algorithms is accomplished, and the only limiting factor when examining the signal quality (SNR) of the digital audio system is the precision of the 24-bit A/D and D/A converters.
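For readers unfamiliar with the 1.23 notation, the sketch below shows one common interpretation: a 24-bit two's-complement word with one sign bit and 23 fractional bits, covering the range from -1 up to just under +1. The helper names are hypothetical and the rounding and saturation choices are assumptions, not a description of any converter's interface.

```python
def q23_to_float(word):
    """Interpret a 24-bit two's-complement word as a 1.23 fraction in [-1, 1)."""
    word &= 0xFFFFFF
    if word & 0x800000:           # sign bit set: value is negative
        word -= 1 << 24
    return word / float(1 << 23)


def float_to_q23(value):
    """Round a value to 1.23 format, saturating at the ends of the range."""
    word = int(round(value * (1 << 23)))
    word = max(-(1 << 23), min((1 << 23) - 1, word))
    return word & 0xFFFFFF


print(q23_to_float(0x400000), q23_to_float(0x800000), hex(float_to_q23(0.5)))
```

In a 32-bit fixed-point word the same sample occupies the upper 24 bits, leaving 8 bits to the right of its LSB for accumulated rounding error.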
7. Summary of Data Word Size Requirements for Processing Audio Signals
To maintain high audio-signal quality well above the noise floor, all intermediate DSP calculations should be done using higher precision than the bit length of the quantized input data. High precision storage should also be used between the DSP's memory and computation units. The use of "optimal" filter algorithms, higher precision filter coefficients, and higher precision storage of intermediate samples (available with extended precision in the MAC unit) will ensure that errors introduced by the arithmetic computations are much smaller than the error introduced by the conversion of the results by a DAC. Therefore, the noise floor of the digital filter algorithm will be lower than the resolution of the A/D and D/A converters.
A 16-bit DSP may suffice for low-cost audio applications where processing is not complex and SNR requirements are around 75 dB. However, 16-bit DSPs using single-precision computations will not be adequate for precise processing of 16-bit signals. When using 16-bit A/D and D/A converters in an audio system that will process 'CD-quality' signals having a dynamic range of 90 to 96 dB, a 16-bit data path may not be adequate as a result of truncation and rounding errors accumulating during execution of the DSP algorithm. Double-precision routines can be used to lower the digital filter's noise floor as long as the software overhead is available.
As the complexity of new DSP algorithms increases along with audio standards and requirements, designers are looking to 18-bit, 20-bit, and 24-bit converters to increase signal quality. A 16-bit DSP will not be adequate, because the dynamic range of these higher-resolution converters exceeds that of a 16-bit DSP. A 16-bit DSP may still be able to interface to these higher-precision converters, but this would then require the use of double-precision arithmetic. Double-precision operations slow down the true performance of the processor while increasing programming complexity. Memory requirements for double-precision math are doubled. Even if double-precision math can be used, the interfaces to these higher-precision converters would in many cases require glue logic to move the data to and from the DSP.
At least 24 bits are required in processing if the quality of 16 bits is to be preserved. However, even with 24-bit processing, it has been demonstrated that care would need to be taken to ensure the noise floor of the digital filter algorithm is not greater than the established noise floor of the 16 bit signal, especially for recursive IIR audio filters. Recursive IIR filters can introduce quantization noise above the noise floor of a 16-bit converter when using a 24-bit DSP [2, 5] and therefore 24-bit processing requires software overhead to lower the digital filter's noise floor. Again, double precision math is an option, but this can add overhead by as much as a factor of five.
Using a 32-bit, fixed-point DSP gives the additional benefit of ensuring that 16-bit signal quality is not impaired during arithmetic computations. The higher resolution of the 32-bit DSP will prevent quantization noise from showing up in the D/A converter output, providing an improved signal-to-noise ratio (SNR) over 16- and 24-bit DSPs.
When processing 16-bit audio data, the use of 32-bit processing is especially useful for complex recursive processing using IIR filters. For example, parametric and graphic equalizer implementations using cascaded second-order IIR filters, and comb/allpass filters for audio, are more robust using 32-bit math. A 32-bit processor operating on 16- or 20-bit data removes the filter structure implementation restrictions that are present for 24-bit processors. Any filter structure of choice can then be used without worrying about the level of the noise floor. The need for double-precision and error-feedback schemes is therefore eliminated. With 16 bits below the noise floor on a 32-bit DSP, quantization errors would have to accumulate up to 96 dB from the LSB before these errors can be seen by the 16-bit D/A converter.
At least 32 bits are required if 24-bit signals are to be preserved with complex, math-intensive, or recursive processing. Using 24-bit A-D and D-A converters will require a 32-bit DSP in order to offer sufficient precision to ensure that the noise floor of the algorithm will not exceed the 24-bit input signal.
The ADSP-21161's 32-bit capability reduces the implementation burden on the DSP programmer by ensuring that the quantization error from computations does not go above the ADC/DAC noise floor. The ADSP-21161's 32-bit processing can give an additional 48 dB with 8 extra 'guard' bits in the LSBs compared to a 24-bit processor, ensuring that 16-bit signal quality is not impaired during recursive filter computations or multiple processing stages before the final result is obtained for the DAC. The ADSP-21161 enables more precise placement of poles/zeros with its 32-bit accuracy using native single-precision arithmetic.
32-bit floating-point operations contain 24-bit precision, with over 1500 dB dynamic range. The wider dynamic range of floating-point computations can virtually eliminate the need for scaling input samples to prevent overflow. The ADSP-21161's 40-bit floating point operations have as much accuracy as a 32-bit fixed point computation with a 32-bit mantissa. Dynamic range is equivalent to that of 32-bit floating-point operations.
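Both figures quoted for 32-bit floating point can be checked against the IEEE 754 single-precision format. The sketch below uses only the Python standard library. The constant and function names are assumptions, and the dynamic range is computed between the largest finite value and the smallest normalized value.

```python
import math
import struct


def to_float32(value):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


FLOAT32_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127   # largest finite value
FLOAT32_MIN_NORMAL = 2.0 ** -126                # smallest normalized value


def float32_dynamic_range_db():
    """Ratio of largest to smallest normalized magnitude, in dB."""
    return 20.0 * math.log10(FLOAT32_MAX / FLOAT32_MIN_NORMAL)


# 24-bit precision: 2**24 is exact, but 2**24 + 1 needs a 25th bit and is rounded.
print(to_float32(2.0 ** 24), to_float32(2.0 ** 24 + 1.0))
print(round(float32_dynamic_range_db()))
```

The precision of 24 bits comes from the 23 stored mantissa bits plus the implied leading one, while the wide dynamic range comes from the 8-bit exponent.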
8. ADSP-21161 SIMD SHARC DSP - The 32-bit Processor of Choice for Present and Future Audio DSP
The 16-, 20- and even 24-bit, fixed-point digital signal processors in use today in the majority of digital audio products are reaching the point where their performance is no longer sufficient to meet the needs both of established and emerging digital audio markets.
To fully realize the potential of the latest digital audio formats now and into the future requires faster, more flexible DSPs with more accurate and more powerful arithmetic. One such processor is the Analog Devices ADSP-21161, capable of both fixed- and floating-point arithmetic. The ADSP-21161 processor contains the ADSP-2116x SHARC SIMD core (a SIMD processor uses two identical sets of ALU, MAC and shifter units), and its dual computational units support the following data types:
- 32-bit fixed-point
- 32-bit IEEE 754/854 floating-point
- 40-bit floating-point
"32-bit processing is required if 24-bit audio signals are to be preserved for complex, computationally-intensive or recursive audio processing. A 32-bit DSP like the ADSP-21161 offers sufficient precision to ensure that the noise floor of the algorithm will not exceed the 24-bit input signal."
The majority of DSP applications in the consumer audio sector currently use 16- or 24-bit fixed-point DSPs for audio processing. However, as the professional and consumer audio market expands in terms of both variety and requirements for high fidelity, these DSP technologies will no longer be adequate to deliver the accuracy and flexibility of DSP processing required. The three data types supported by ADSP-21161 make it ideal for satisfying the demand for improved sound quality. In addition, the ADSP-21161 includes many other features which make it highly flexible and capable of meeting the needs of developers for a wide variety of applications. These other features include:
- 100 MHz provides 200 MIPS, 600 MFLOPS
- 1 Megabit of internal memory
- 2 link ports for bytewide dedicated interprocessor communication at 100 MHz
- 8 bidirectional serial data paths
- I2S support provides 16 programmable direction audio channels, configurable as inputs or outputs
- 12 programmable I/O pins for performing 'microcontroller-type' housekeeping tasks
- 2 external port and 8 serial port DMA channels
- Glueless Multiprocessing with up to six ADSP-21161s in a cluster
- SDRAM interface for bulk storage of lengthy audio delay lines
 Gary Davis & Ralph Jones, "Sound Reinforcement Handbook, 2nd Edition", Ch. 14, pp. 259-278, Yamaha Corporation of America, (1989, 1990)
 R. Wilson, "Filter Topologies", J. Audio Engineering Society, Vol 41, No. 9, September 1993
 J. Dattorro, "The Implementation of Digital Filters for High Fidelity Audio", Audio in Digital Times, Proc. Audio En g. Soc. 7th Inter. Conf., Toronto, Ont., Canada, May 14th-17th 1989, pp. 165-180
 Udo Zolzer, "Roundoff Error Analysis of Digital Filters", J. Audio Engineering Society, Vol 42, No. 4, April 1994
 W. Chen, "Performance of Cascade and Parallel IIR Filters," J. Audio Engineering Society, Vol 44, No. 3, March 1996
 K. L. Kloker, B. L. Lindsley, C.D. Thompson, "VLSI Architectures for Digital Audio Signal Processing," Audio in Digital Times, Proc. Audio En g. Soc. 7th Inter. Conf., Toronto, Ont., Canada, May 14th-17th 1989, pp. 313-325
 D. E. Blackmer, "The World Beyond 20 kHz", Studio Sound, pp. 92 - 94, January 1999
 E. Cooper and R. Price, "Minimizing Quantization Effects in Digital Signal Processors", Proceedings of the 1994 DSPx Technical Program, May 15-18, 1995, San Jose Convention Center, San Jose, CA, pp. 53 - 72, (1995)
 L. D. Fielder, "Human Auditory Capabilities and Their Consequences in Digital-Audio Converter Design", Audio in Digital Times, Proc. Audio En g. Soc. 7th Inter. Conf., Toronto, Ont., Canada, May 14th-17th 1989, pp. 45-62.
 W. A. Yost and D.W. Nielsen, Fundamentals of Hearing, Second Edition, Holt, Rinehart and Winston, Inc., Chicago, IL, (1985). ISBN 0-03-069621-6
 W. R. Zemlin, Speech and Hearing Science - Anatomy and Physiology, Third Edition, Prentice Hall, Englewood, Cliffs, New Jersey 07632, (1988), ISBN 0-13-827429-0
 J. Katz, Handbook of Clinical Audiology, Third Edition, Williams and Wilkins, Baltimore, MD, (1985), ISBN 0-683-04549-0
 J. Tomarakos and D. Ledger, "DSPs for Digital Audio Applications: Part 1," Multmedia Systems Design, Miller Freeman, Inc., San Francisco, CA, July, 1999
 J. Tomarakos and C. Duggan, "32-bit SIMD SHARC Architecture for Digital Audio Signal Processing Applications", J. Audio Engineering Society, Vol 48, No. 3, March 2000
 Analog Devices Whitepaper, ADSP-21065L: Low-Cost 32-bit Processing for High Fidelity Digital Audio, Analog Devices, 3 Technology Way, Norwood, MA, November 1997
 S. P. Lipshitz, R. A. Wannamaker, and J. Vanderkooy, "Quantization and Dither: A Theoretical Survey", J. Audio Engineering Society, Vol 40, No. 5, May 1992
 Steven W. Smith, The Scientist and Engineer's Guide to Digital Signal Processing. California Technical Publishing, San Diego, CA, (1998)
 S. J. Orfanidis, Introduction to Signal Processing, Chapter 8, Sec 8.2, pp. 355-383, Prentice Hall, Englewood Cliffs, NJ, (1996)
 J. G. Proakis & D. G. Manolakis, Introduction To Digital Signal Processing, Macmillan Publishing Company, New York, NY, (1988)
 A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing, Prentice Hall, Englewood Cliffs, NJ, (1989)
 P. Lapsley, J. Bier, A. Shoham and E. A. Lee, DSP Processor Fundamentals: Architectures and Features, Berkeley Design Technology, Inc., Fremont, CA, (1996)
 J. Bier, P. Lapsley, and G. Blalock, "Choosing a DSP Processor," Embedded Systems Programming, pp. 85-97, (October 1996)
 K. Bogdanowicz & R. Belcher, "Using Multiple Processors for Real-Time Audio Effects", Audio in Digital Times, Proc. Audio Eng. Soc. 7th Inter. Conf., Toronto, Ont., Canada, May 14th-17th 1989, pp. 337-342
 R. Bristow-Johnson, "The Equivalence of Various Methods of Computing Biquad Coefficients for Audio Parametric Equalizers", presented at AES 97th Convention, J. Audio Engineering Soc. (Abstracts Preprint 3096), Vol 42, pp. 1062-1063, (December 1994)
 D. J. Shpak, "Analytical Design of Biquadratic Filter Sections for Parametric Filters", J. Audio Engineering Soc., Vol 40, No 11, pp. 876-885, (November 1992)
 S. J. Orfanidis, "Digital Parametric Equalizer Design with Prescribed Nyquist-Frequency Gain," J. Audio Engineering Soc., Vol. 45, No. 6, pp. 444 - 455, June 1997
 Analog Devices, Inc, ADSP-21065L SHARC User's Manual, Second Edition, Analog Devices, 3 Technology Way, Norwood, MA (1996)
 D. C. Massie, "An Engineering Study of the Four-Multiply Normalized Ladder Filter," J. Audio Engineering Soc., Vol.41, No. 7/8, pp. 564-582, July/August 1993
 D. P. Weiss, "Experiences with the AT&T DSP32 Digital Signal Processor in Digital Audio Applications," Audio in Digital Times, Proc. Audio Eng. Soc. 7th Inter. Conf., Toronto, Ont., Canada, May 14th-17th 1989, pp. 343-351
 C. Anderton, Home Recording for Musicians, Amsco Publications, New York, NY, (1996)
 B. Gibson, The AudioPro Home Recording Course, MixBooks, Emeryville, CA, (1996)
 Dominic Milano, Multi-Track Recording, A Technical And Creative Guide For The Musician And Home Recorder, Reprinted from The Keyboard Magazine, Ch. 2, pp. 37 - 50. Hal Leonard Books, 8112 W. Bluemound Road, Milwaukee, WI (1988)
Glossary for Some Common A/D and D/A Audio Converter Terms
Signal-to-noise Ratio (SNR or S/N)
This is the ratio of the input signal S to the background noise N in a system. For an ideal A-D converter with a full-scale sine wave input, the SNR as a function of the resolution n (in bits) is SNR(RMS) = 6.02n + 1.76 dB.
Thus, the resolution and quantization level will establish the noise floor. Random system noise will reduce the SNR.
Every A-D converter has at least a minimum error, because the analog input is represented by a finite number of discrete levels. This quantization error is set by the resolution: it is proportional to the size of one LSB, which halves with each additional bit.
Quantization Uncertainty Error = +/- 0.5 LSB
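The two relations above can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function names and the 2.0 V input range are illustrative assumptions.

```python
def ideal_snr_db(n_bits: int) -> float:
    """Ideal SNR in dB of an n-bit converter with a sine wave input (6.02n + 1.76)."""
    return 6.02 * n_bits + 1.76


def lsb_size(full_scale_range: float, n_bits: int) -> float:
    """Size of one quantization step (1 LSB) for the given input range."""
    return full_scale_range / (2 ** n_bits)


def max_quantization_error(full_scale_range: float, n_bits: int) -> float:
    """Worst-case quantization uncertainty, i.e. 0.5 LSB in magnitude."""
    return 0.5 * lsb_size(full_scale_range, n_bits)


if __name__ == "__main__":
    # Assumed example: a converter with a 2.0 V peak-to-peak input range.
    for bits in (16, 20, 24):
        print(bits, "bits:",
              round(ideal_snr_db(bits), 2), "dB ideal SNR,",
              max_quantization_error(2.0, bits), "V max error")
```

For the word lengths discussed in this paper, the formula gives roughly 98 dB at 16 bits, 122 dB at 20 bits and 146 dB at 24 bits; a real converter will fall short of these figures because of random system noise.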
(Spurious Free) Dynamic Range
This is the ratio of the full-scale input or output signal to the highest harmonic or spurious input/output noise component amplitude. Essentially, this is an indication of how far it is possible to go below the full-scale input signal without hitting noise or distortion. This is usually measured from 0 to 20 kHz and is expressed in decibels (dB). Dynamic range is measured with a -60 dB input signal and is calculated as follows:
Dynamic Range = (S/[THD+N]) + 60 dB
Dynamic Range of a digital signal is defined as the ratio of the maximum full-scale signal representation to the smallest signal the DSP or converter can represent. For an N-bit system, the ratio is theoretically equal to 6.02N dB.
Note: Spurious harmonics are below the noise with a -60 dB input, so the noise level establishes the dynamic range. This is the recommendation of AES and EIAJ.
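As a rough illustration of the two dynamic range figures above, the sketch below applies the -60 dB measurement formula and the 6.02N rule. It is written in Python with hypothetical function names, and the 36 dB reading is an invented example value, not a measurement of any real device.

```python
def measured_dynamic_range_db(s_over_thdn_db: float,
                              input_level_db: float = -60.0) -> float:
    """Dynamic range from S/(THD+N) measured with a reduced-level input.

    With the usual -60 dB test signal this is S/(THD+N) + 60 dB.
    """
    return s_over_thdn_db - input_level_db


def theoretical_dynamic_range_db(n_bits: int) -> float:
    """Theoretical dynamic range of an N-bit digital word: 6.02N dB."""
    return 6.02 * n_bits


if __name__ == "__main__":
    # Assumed reading: S/(THD+N) of 36 dB with a -60 dB input signal.
    print(measured_dynamic_range_db(36.0))                # 96.0
    print(round(theoretical_dynamic_range_db(16), 2))     # 96.32
```

Comparing the measured value with the theoretical one shows how close a converter comes to the limit imposed by its word length.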
Total Harmonic Distortion
A very important specification in audio systems, the THD is defined as the ratio of the RMS (root-mean-square) sum of all harmonic distortion components to the amplitude of the original full-scale input signal. It is caused by nonlinearities in the A-D converter.
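A minimal sketch of this calculation, assuming the RMS amplitudes of the fundamental and of each harmonic have already been measured (for example from an FFT of the converter output). The function name and the amplitude values are hypothetical.

```python
import math


def thd_ratio(fundamental_rms: float, harmonic_rms: list[float]) -> float:
    """THD as a linear ratio: RMS sum of the harmonics over the fundamental."""
    distortion = math.sqrt(sum(h * h for h in harmonic_rms))
    return distortion / fundamental_rms


if __name__ == "__main__":
    # Assumed amplitudes: full-scale fundamental of 1.0, two harmonics.
    thd = thd_ratio(1.0, [0.003, 0.004])
    print("THD ratio:", thd)
    print("THD in dB:", 20.0 * math.log10(thd))
    print("THD in %:", 100.0 * thd)
```

The harmonics are combined as a root-sum-of-squares because their powers add, not their amplitudes.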
Total Harmonic Distortion + Noise (THD+N)
The ratio of the root-mean-square value of a full-scale fundamental input signal to the RMS sum of all other spectral components in the passband, expressed in decibels (dB) or as a percentage.
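The decibel and percentage forms describe the same measurement. The Python sketch below shows one way to compute both, assuming the RMS values of the fundamental and of every other passband component (harmonics and noise) are already known; the helper name and the sample values are illustrative only.

```python
import math


def signal_to_thd_plus_n(fundamental_rms: float,
                         other_rms: list[float]) -> tuple[float, float]:
    """Return (ratio in dB, residual as a percentage of the fundamental).

    other_rms holds the RMS value of each non-fundamental component in the
    passband, harmonics and noise alike.
    """
    residual = math.sqrt(sum(c * c for c in other_rms))
    ratio_db = 20.0 * math.log10(fundamental_rms / residual)
    percent = 100.0 * residual / fundamental_rms
    return ratio_db, percent


if __name__ == "__main__":
    # Assumed values: fundamental of 1.0 and two residual components.
    db, pct = signal_to_thd_plus_n(1.0, [0.0006, 0.0008])
    print(round(db, 2), "dB,", round(pct, 4), "%")
```

In this form every 20 dB of improvement corresponds to a tenfold reduction in the percentage figure.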