Designing Efficient, Real-Time Audio Systems with VisualAudio The VisualAudio design and development environment is a new software tool for designing and developing audio systems. Its real-time-architecture is especially well-suited to the challenges of audio product development. This article briefly introduces VisualAudio, and then describes its framework, audio modules, and application to audio product development. Audio Product-Development Challenges
To deal with these factors, developers are turning to digital signal processors (DSPs), because their programmability allows systems to be customized for specific market niches and applications. The SHARC® Processor family from Analog Devices (ADI) is particularly well-suited for this task, since it offers features such as large internal memory, floating-point precision, and high-performance computation units. The recently announced third-generation SHARC processors take this one step further by integrating additional features that are specifically introduced to facilitate audio product design. These features include hardware sample-rate converters, encryption and decryption, a sophisticated digital audio interface, and an on-chip ROM containing multiple audio decoders. The historical challenge faced by DSP users has been the development of software that makes optimum use of processor clock cycles and efficient use of memory. The long-used and laborious approach of hand-coding audio signal processing algorithms in assembly language has become less and less viable. This is particularly true when a large portion of the required effort goes into creating standard checklist and me-too functions instead of focusing on differentiating the product with value-added features. A better approach to developing audio product software was required. To fulfill this need, ADI has developed a graphical environmentVisualAudioas an aid to designing and developing audio systems that use the SHARC processor family. VisualAudio provides audio system developers with most of the software building blockstogether with an intuitive, graphical interface, shown in Figure 1for designing, developing, tuning, and testing audio systems.
Figure 1. Example of VisualAudio graphical-interface screens. VisualAudio comprises a PC-based graphical user interface (GUI, the graphical tool), a DSP kernel (the framework), and an extensible library of audio algorithms (audio modules). Working in conjunction with ADIs VisualDSP++ integrated development and debugging environment (IDDE), VisualAudio generates product-ready code that is optimized for both speed, in millions of instructions per second (MIPS), and memory usage. By simplifying the process of developing complex digital signal-processing software, VisualAudio reduces development cost, risk, and time. As a result, audio-system developers are able to focus on adding value by differentiating their audio products from the competition. At the core of VisualAudio is a real-time software architecture that handles audio I/O and post-processing. In order to be viable, the generated DSP code must be efficient in terms of MIPS and memory, and be flexible enough to handle a variety of audio product categories. The VisualAudio real-time architecture is described below, first the framework and then the audio-processing modules. The Framework Audio products have specific requirements that govern the design of the framework. Each audio product has two primary functions: (1) real-time audio processing, and (2) control of this processing. The time scales for these two functions are vastly different. The real-time processing (with all internal operations completed) must occur at the sampling rate, otherwise there will be unacceptable pops and clicks in the output audio. The control functionality can occur at a much slower rate, 10 Hz to 100 Hz, and still be acceptable. The bulk of the MIPS usage thus occurs within the real-time processing, while the bulk of the software complexity is within the control functions. In order to simplify product design and development, VisualAudio separates the real-time and control functions into separate threads. Efficiency is achieved by hand-optimized, real-time audio processing modules, while the complexity of control code is managed by allowing the developer to write it in C and run it in a separate thread. Traditionally, two different approaches to audio processing have been taken. In stream processing, audio samples are processed one at a time as they arrive, while in block processing several audio samples are buffered and then processed as a group. Each method has distinct advantages and disadvantages. Stream processing is efficient in terms of data memory, since no buffering of audio data is required. The major limitation of stream processing is that the overhead of multiple function calls cannot be tolerated. This forces the audio processing code to be written in-line, usually in assembly language. Such code is difficult to modularize and maintain. Block processing requires additional buffering memory for I/O and scratch memory. Typical block sizes are in the range of 32 to 256 samples. Since many samples are processed at once, the overhead of function calls is amortized over a large number of samples. This leads to MIPS-efficient implementationat the expense of additional memorybut is preferred, since structured programming methodologies may be employed. Block processing is also a natural fit to audio decoders that generate blocks of audio. For example, both Dolby Digital and DTS decoders generate audio in 256-sample blocks. Block processing, the approach used by VisualAudio, has several additional advantages. All audio I/O within VisualAudio is double-buffered and managed using direct memory access (DMA). The processor receives an interrupt once per blocknot once per sampleresulting in much less interrupt overhead than with stream processing. Also, by utilizing the chained DMA capabilities of the SHARC processors, double buffering is managed by the DMA controller, significantly increasing the allowable latency when servicing an audio input/output (I/O) interrupt. The VisualAudio framework delivers audio to the post-processing network in blocks. Certain restrictions are placed on the block size. First, it must be an even number, due to the single-instruction, multiple-data (SIMD) behavior of some audio modules. Second, the minimum block size is eight samplesdue to pipelining within some audio modules. Finally, in systems with an audio decoder, the post-processing block size must be a factor of the decoder block size. For example, with Dolby Digital, possible block sizes are 8, 16, 32, 64, 128, and 256 samples. Audio I/O and buffering can be seen within the example of a VisualAudio automotive framework, shown in Figure 2. Audio arrives from either the MOST network or from A/D converters, and is separated into multiple streams. The primary entertainment stream is generated by the DVD player, and additional monaural streams are produced by the telematics system or by chimes. The DVD data first undergoes digital-transmission copy-detection (DTCP) decryption, and is then fed to the bit stream detector. The output of the bit stream detector is packed into blocks; when a complete frame of data is available, the audio decoder is executed. The DVD player generates its own sampling rate, which is distinct from the sampling rate used by the post-processing. Thus, the output of the audio decoder must pass through an asynchronous sampling-rate converter. This block converts all input data streams to a fixed output sampling rate. At this point, the audio post-processing is executed at a fixed block size of 32 samples. As a final step, the audio channels are fed to D/A converters or returned to the MOST network.
Figure 2. Audio flow within the VisualAudio automotive framework The automotive framework contains multiple audio decoders, only one of which is active at a time. To reduce decoder memory requirements, VisualAudio manages a decoder memory pool that is shared among all possible decoders. The output of the decoder is fed to the post-processing network; and this drives the D/A converters. VisualAudio uses a simple interrupt driven kernel to manage multiple threads. For example, the sample automotive framework contains a total of six threads. From highest to lowest priority they are:
Typical thread activity within the sample AVR framework is shown in Figure 3. Each horizontal slice represents a different thread. The audio I/O, decoder, and the post-processing are run at regular intervals, and the UCC runs at the lowest priority. Note that the UCC might not be run for several millisecondswhile the processor is busy with audio processing.
Figure 3. Thread activity within the VisualAudio AVR framework. The threads are ordered from high priority (at the top) to the lowest priority (at the bottom). VisualAudio partitions control functionality between the host microcontroller and the DSP. This partitioning is arbitrary, and can even support systems without a dedicated host microcontroller. As described above, the UCC executes at lowest priorityusing any free cycles not consumed by the interrupt handlers. The UCC is called periodically by the framework with messages and notifications. Messages are commands sent from the host microcontroller to the DSP. These commands are used to control audio processing (e.g., Set the volume to 20 dB, Set the bass tone-control to +3 dB) or to query the state of the audio processing (e.g., Is the system limiting?). The VisualAudio framework handles some commands internally, while the remainder are passed to the UCC. At each point in time, there can be only one pending message between the host and the DSPand the DSP must send an acknowledgement after each command has been processed. Notifications are generated asynchronously by the framework and occur under several conditions. The first notification occurs during system initializationbefore any real-time processing is enabled. System- or application-specific initialization may be done at this time. A second notification, generated periodically at a rate of approximately 200 Hz, is used for control of the real-time audio processing, for example, automatic gain-control (AGC) calculations and updates. A final class of notifications is generated by the audio decoders in response to changes in the encoded bit stream. Notifications of this type occur when the sampling rate changes, the number of input channels changes, or if a cyclic redundancy-check (CRC) error is detected in the incoming bit stream. These notifications allow the UCC to make appropriate changes in the audio processing. Since the UCC is pre-empted by real-time audio processing, it may not be executed until several milliseconds have gone byas shown in Figure 3. VisualAudio includes several features that simplify writing a UCC that will be constantly subjected to interrupts. First, since the host-communication interface only allows a single host message to be pending between the host and DSP, there is no danger of overflowing a message buffer, or of host messages overwriting each other. Another feature is a notification queue, in which notifications of the same type overwrite each other. For example, if two sampling-rate notifications are generated closely spaced in time before the UCC is executed, then the UCC will only receive the second notificationthe final sampling rate. Also, since there are a finite number of notifications, the notification queue is necessarily of finite length. The UCC must also be carefully written for updating certain audio module parameters. Some module parameters, such as infinite impulse-response (IIR) filter coefficients, must be updated automatically without being interrupted by the audio processing. Each VisualAudio framework has an associated XML platform file that describes the capabilities of the target platform to the VisualAudio application, and also contains a list of the source, object, and library files needed to build the executable. Software design and development usually begins on a readily available evaluation or development platform, such as an EZ-KIT Lite evaluation kit from ADI, and then migrates to the actual target hardware once it is complete. VisualAudios Change Platform Wizard automates the process of migrating software between hardware platforms. The Audio Modules Figure 4 demonstrates the efficiency of block processing for a 10th order IIR filterimplemented as a cascade of five biquad filters. The number of operations per sample is plotted as a function of the block size. Most efficiency gains have been realized by a 64-sample block, with diminishing returns for larger blocks. For this filter, the core inner loop contains 21 multiply-accumulates (MACs) per sample; but by the use of SIMD instructions that operate on two pieces of data simultaneously, the loop is reduced to roughly 15 cycles.
Figure 4. Processing efficiency in an IIR filter as a function of the block size. Each audio module is described to the VisualAudio application by an associated XML file. This file contains a description of the modules in-memory data structure, memory allocation rules, input- and output audio channel list, high-level interface variables (shown on the modules graphical representation, or inspector), and design equations. The XML language consists of tags that label structured data. For example, consider a simple audio module that scales a monaural signal by a fixed gain. This module would contain a single render Variableampthat specifies the gain to apply. Within the modules XML file, the variable amp is described by the XML code shown below: <renderVariable> The description tag provides a short summary of the variable. The name tag indicates the variable name within the data structure. The variable is described as being floating-point, having a default range of [1 +1] that can be modified, and a default value of 1.0. VisualAudio will treat this variable as a design-settable and tunable parameter, allowing it to be modified at design time, and later, during real-time tuning. This monaural fixed-gain example illustrates one of the many variable types that can be described by the VisualAudio XML format. VisualAudio creates a separate data structure for each instance of an audio module within the post-processing layout. All data structures share a common 5-word header that describes the modules run-time interface to the framework. This is followed by module-specific parameters and -state variables. For example, the type declaration of the data structure for the scaler module described above is: typedef struct The audio processing functions within VisualAudio follow a uniform calling sequence. Each function is called with three arguments: a pointer to the instance data structure, a pointer to an array of input and output buffer pointers (input buffers first, then output buffers), and an integer specifying the block size. Continuing the example of the scaler, its real-time processing function is defined below. void AMF_Scaler_Render(AMF_Scaler *instance, float ** buffers,int blockSize) { Note that this example is written in C to clarify its description. The actual render function included with VisualAudio is written in optimized assembly language for speed of execution. SUMMARY More about VisualAudio 1The bit-stream detector, used with digital inputs such as an S/PDIF interface, monitors the incoming data stream and distinguishes between uncompressed PCM audio and compressed audio. <return to text> Trademarks and registered trademarks are the property of their respective owners. Copyright 1995- Analog Devices, Inc. All rights reserved. |