Volume 36, Number 2, March-April, 2002
Interfacing a Blackfin DSP to High-Speed Converters for Wireless Applications

INTRODUCTION

Until recently, most designers had to interface high-speed parallel converters to application-specific ICs (ASICs) or fast field-programmable gate arrays (FPGAs). Devices like these can perform the many simultaneous parallel digital operations required, but they are often inflexible and can be prohibitively expensive. Now, with the recent introduction of Blackfin™ DSPs, such as the ADSP-21535, users have available a programmable general-purpose (GP) 16-bit fixed-point vector DSP, with a 300-MHz-capable core, that can handle the sustained input/output (I/O) and core throughputs required to process data from the many available high-speed converters. Depending on the core clock frequency, a maximum system clock (SCLK) of 133 MHz can be achieved. [This SCLK should not be confused with the serial clock for the serial peripheral interface (SPI).]

Why Choose a General-Purpose DSP?

The ADSP-21535, the first member of ADI’s Blackfin family, was designed to work optimally in a computer-bus environment, while newer designs available soon (within the year) will have a parallel peripheral interface (PPI), specifically designed to work with I/O data. In the interim, however, the ADSP-21535’s power can be made available for urgently needed designs, such as wireless applications, by using it with a small amount of readily available external circuitry.

What are the issues?

In general, to guarantee sufficient data-processing bandwidth, the DSP needs a minimum clock speed an order of magnitude (10x) faster than the converter’s sample rate. The amount of processing bandwidth needed depends in turn on the DSP’s interface capabilities, which are themselves influenced by several other factors. These considerations include: block processing versus sample processing, the existence of a direct memory-access (DMA) controller, multi-ported memory, and whether external FIFOs are used.
Fortunately, the ADSP-21535 has a full DMA controller that operates independently of the core, with multi-ported level-1 (L1) and level-2 (L2) memories. The combination of core speed, an independent DMA controller, and a large multi-ported on-board memory (308 Kbytes) allows the ADSP-21535 to perform efficient block processing at high data rates. For example, if the Revision 2.2-compliant, 33-MHz, 32-bit (4 bytes) peripheral component interconnect (PCI) interface is used (not shown in this application), transfer bandwidths can be achieved that approach 132 MB/s.
Figure 1. External logic connections between the ADSP-21535 and the AD9860/AD9862.

The ADSP-21535’s external bus interface unit (EBIU) provides interfaces to asynchronous (ASYNC) external memories. If the PCI bus must be used for other system communications, the EBIU is the only available parallel interface to connect the ADSP-21535 to a high-speed converter. Combining the DSP-mastered, asynchronous control of this port with the synchronous, continuous data stream of converters can be a challenge for a system designer.

This article describes one particular hardware implementation, utilizing low-pin-count, low-cost, commonly available “glue logic” devices, such as a programmable array logic chip (PAL), a complex programmable logic device (CPLD), or an FPGA. This logic performs the control functions between the AD9860/62 Mixed Signal Front-End (MxFE™) and the ASYNC external memory bus of the ADSP-21535. The application depicted in Figure 1 is for an orthogonal frequency-division multiplexed (OFDM) wireless portable terminal. The ADC and DAC are time-shared (time-division multiplexed, or TDM) over the ASYNC interface of the DSP. (The information given here applies equally to other parallel high-speed ADCs and DACs.)

Engineer to Engineer Note EE-162* is available, describing the details of the interconnection scheme. It assumes that the reader has information on hand about both the ADSP-21535 and the AD9860/2, including the “ADSP-2153x/21535 Blackfin™ DSP Hardware Reference” and the datasheet for the AD9860/AD9862. These can be found at

*“Interfacing the ADSP21535 to High-Speed Converters (like those on the AD9860/2) over the External Memory Bus.” A PDF of this file and a ZIP file of software code for EE-162 can be found on the DSP Application Notes Page under EE-162.

Design Goals

Design Challenges

Some of the major design constraints are:
In addition, when the ADSP-21535 ASYNC interface is connected to devices that do not contain FIFOs or memory, all latencies must be thoroughly understood. For example, every time the memDMA relinquishes the bus after a burst of 8 transfers, it requires 10 SCLK cycles to begin the next transfer.

Future Blackfin family members will have programmable priority levels for the DMA controller, as well as a dedicated high-speed parallel interface with DMA-request and DMA-grant signaling. With a dedicated PPI, these future Blackfin products will not require the ASYNC memory interface to connect with parallel converters.

The approach used here assumes that the memory interface is dedicated to the converters. Multiplexing external SRAM/SDRAM memory with the converter(s) would be difficult and is not recommended, especially considering that there is only one memDMA, and it would need to be shared. The existence of a large on-board L2 memory (256K bytes) minimizes the need for any external memory. However, it is permissible to multiplex the parallel converter(s) with a Flash or EPROM for the initial booting process.

This design uses a TDM time-slice approach to share the external bus between the ADCs and the DACs. Simultaneous access is not possible here, because the single memory interface performs either a read or a write at any one time, and there is only one set of memDMA channels (source and destination).

The ADSP-21535 will support a maximum SCLK of 133 MHz (peak DMA bandwidth). At this rate, and with no external FIFO, the memDMA could sustain a transfer (32-bit word) rate of 133 MHz/10 (nine cycles are required for bus acquisition and one for the next transfer), or 13.3 M words/s. However, the SCLK of the ADSP-21535 is derived from the core clock (CCLK). CCLK in turn is generated via the PLL multiplier, whose available ratios are 1 to 31, and only four CCLK/SCLK divide ratios are available: 2.0, 2.5, 3.0, and 4.0.
So one possible combination of CCLK and divisor that will allow a 133-MHz SCLK is CCLK = 266 MHz and CCLK/SCLK = 2. But if the core must run at 300 MHz, as in this application, the highest SCLK that can be obtained while staying under the 133-MHz maximum is 120 MHz (divisor = 2.5).

Now, since the ASYNC memory interface is 32 bits wide, up to two 16-bit samples (in this case I and Q) can be packed into each word. This effectively halves the word rate that the DSP must process (with a 15.36-MSPS converter sample rate, the DSP will “see” 7.68 MSPS). The highest external converter sample rate that the memDMA will support under these conditions is 2 x 120 MHz/10 = 24 MSPS. Furthermore, the SCLK must be an integer multiple of the converter sample rate to ensure proper phase alignment between converter timing and DSP timing and to eliminate the need for any external FIFOs.

Therefore, the highest converter sample rate that the ADSP-21535 will support at a 300-MHz core rate is 2 x 120 MHz/10 = 24 MSPS, or twice the memDMA rate, as discussed in Commandment #10. Since the DSP processes the packed data at only half this rate, the maximum rate that the memDMA can sustain is 12 M words/s. Higher sample rates can be processed by the ADSP-21535 if small external FIFOs are included between the converter(s) and the EBIU.

Table 1. Possible parameter scenarios for the ADSP-21535.
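The clock arithmetic above can be sketched in a few lines of Python. This is only an illustrative calculation under the figures quoted in the text (133-MHz SCLK ceiling, four divide ratios, 10 SCLK cycles per 32-bit word, two 16-bit samples per word). The function and constant names are hypothetical and do not come from any ADI tool or library.

```python
# Illustrative sketch only; names are hypothetical, numbers are from the text.
SCLK_MAX_MHZ = 133.0
DIVIDE_RATIOS = (2.0, 2.5, 3.0, 4.0)  # available CCLK/SCLK ratios
CYCLES_PER_WORD = 10                  # 9 for bus acquisition + 1 for the transfer
SAMPLES_PER_WORD = 2                  # 16-bit I and Q packed into one 32-bit word


def highest_sclk_mhz(cclk_mhz, sample_rate_msps=None):
    """Return (sclk_mhz, ratio) for the fastest usable SCLK, or None.

    If sample_rate_msps is given, SCLK must also be an integer multiple
    of the converter sample rate (the no-FIFO phase-alignment condition).
    """
    for ratio in sorted(DIVIDE_RATIOS):  # smallest ratio gives the fastest SCLK
        sclk = cclk_mhz / ratio
        if sclk > SCLK_MAX_MHZ + 1e-9:
            continue  # exceeds the 133-MHz ceiling
        if sample_rate_msps is not None:
            multiple = sclk / sample_rate_msps
            if abs(multiple - round(multiple)) > 1e-6:
                continue  # SCLK is not an integer multiple of the sample rate
        return sclk, ratio
    return None


def memdma_word_rate(sclk_mhz):
    """Sustained memDMA rate in M words/s with no external FIFO."""
    return sclk_mhz / CYCLES_PER_WORD


def max_converter_rate_msps(sclk_mhz):
    """Highest converter sample rate when two samples share each word."""
    return SAMPLES_PER_WORD * memdma_word_rate(sclk_mhz)


if __name__ == "__main__":
    sclk, ratio = highest_sclk_mhz(300.0)
    print(sclk, ratio, memdma_word_rate(sclk), max_converter_rate_msps(sclk))
```

For a 300-MHz core this selects the 2.5 divide ratio and a 120-MHz SCLK, which corresponds to 12 M words/s and a 24-MSPS converter rate, in agreement with the figures in the text.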
Recall now that OFDM requirements dictated a 15.36-MSPS converter sample rate. To obtain an SCLK that is an integer multiple of this converter sample rate, one must choose a phase-locked-loop (PLL) multiplier that is an integer multiple of one of the four available divisor ratios (2.0, 2.5, 3.0, or 4.0). With a PLL multiplier of 18, the maximum CCLK allowed is 276.48 MHz. This, in turn, limits the SCLK divide ratio to 3, giving 276.48/3 = 92.16 MHz: a divide ratio of 2 would give an SCLK over the 133-MHz maximum, and a ratio of 2.5 would give 110.592 MHz, which is not an integer multiple of 15.36 MHz. Under these constraints, the maximum sustained rate that the memDMA can support is 92.16/10 = 9.216 M words/s.

DMA Considerations

Table 2: Arbitration Priority
Analysis of the DMA engine within the ADSP-21535 reveals a few other considerations. While the DMA engine supports two types of DMA transfers, descriptor-based and autobuffer-based, the memDMA controller does not support autobuffer-based DMA. Therefore, descriptor-based transfers must be used. The descriptor fetch from L1/L2 memory involves two 5-word block moves, one for the source descriptor and another for the destination descriptor. In addition, the memDMA has a 16-entry 32-bit FIFO that is filled from the source and emptied to the destination. If both descriptors are loaded simultaneously, 39 SCLK cycles (worst case) are required from L2. The destination descriptor load has priority over the source load to avoid overrunning the FIFO. Thus, in this example, the amount of time required to load both descriptors simultaneously is (1/92.16 MHz) x 39 = 423 ns.

The DMA engine descriptor load performs best when the descriptors are loaded from L2 memory. If the descriptors are located in L1 memory, there are additional delays. The worst-case source-plus-destination descriptor load time from L1 is 65 SCLK cycles.

To process data effectively at these sample rates, ping-pong buffers are normally used (in this design, two 1024-word buffers are utilized). This technique allows data to be filled into one buffer while the core processes the other buffer. As a reference, the complete VisualDSP++™ 2.0 project program is available from ADI.

There are two phases of operation that must be analyzed: samples must be received by the DSP from the ADC (receiver TDM phase), and samples must be transmitted from the DSP to the DAC (transmitter TDM phase).

Receiver TDM Phase

Transmitter TDM Phase

Logic Overview and Timing

All data movement is controlled or mastered by the memDMA within the DSP. When the ADC data is read (see Figure 2), the external logic must drive the data and the ARDY signal. The external logic must sample the /AOE pin to check when data can be driven to the ADSP-21535.
The /AOE signal indicates to the external logic that the DMA controller is ready to take data. The receiver three-state machine is shown at the bottom of the figure.
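As a numerical cross-check of the descriptor-load latencies quoted under DMA Considerations, the short sketch below converts SCLK cycle counts to time at the 92.16-MHz SCLK and compares them with the time needed to fill one 1024-word ping-pong buffer at 7.68 M words/s. It assumes, purely for illustration, that one source-plus-destination descriptor load occurs per buffer; the text does not state this, and the function names are hypothetical.

```python
# Illustrative sketch only; names are hypothetical, numbers are from the text.
SCLK_MHZ = 92.16          # SCLK chosen for the 15.36-MSPS OFDM design
L2_DESC_CYCLES = 39       # worst-case load of both descriptors from L2
L1_DESC_CYCLES = 65       # worst-case load of both descriptors from L1
BUFFER_WORDS = 1024       # size of each ping-pong buffer
WORD_RATE_MWORDS = 7.68   # 15.36 MSPS with two samples packed per word


def cycles_to_ns(cycles, sclk_mhz):
    """Convert a number of SCLK cycles to nanoseconds."""
    return cycles * 1000.0 / sclk_mhz


def buffer_fill_time_us(words, word_rate_mwords):
    """Time in microseconds to fill a buffer at the given word rate."""
    return words / word_rate_mwords


def overhead_fraction(desc_cycles):
    """Descriptor-load time as a fraction of one buffer fill time.

    Assumes one descriptor-pair load per buffer (an assumption, not
    something stated in the article).
    """
    load_us = cycles_to_ns(desc_cycles, SCLK_MHZ) / 1000.0
    return load_us / buffer_fill_time_us(BUFFER_WORDS, WORD_RATE_MWORDS)


if __name__ == "__main__":
    print(round(cycles_to_ns(L2_DESC_CYCLES, SCLK_MHZ)), "ns from L2")
    print(round(cycles_to_ns(L1_DESC_CYCLES, SCLK_MHZ)), "ns from L1")
    print(round(buffer_fill_time_us(BUFFER_WORDS, WORD_RATE_MWORDS), 1), "us per buffer")
```

Under that assumption, the descriptor load is well under 1% of the buffer fill time in either case. The larger concern with no external FIFO is that the converter keeps producing samples during the load.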
Figure 2. Receive timing and state machine.

When data is being sent out to the DAC (see Figure 3), the external logic has to sample the /AWE signal and then drive ARDY. /AWE indicates to the external logic when the DMA controller is ready with new data. The transmitter four-state machine is shown at the bottom of the figure.
Figure 3. Transmit timing and state machine.

Conclusions

The following set of interfacing rules, or “ten commandments,” will help in optimizing performance of the ADSP-21535 when used with high-speed converters:

The ADSP-21535’s Ten Commandments