Abstract
A 12-bit 10-GS/s interleaved (IL) pipeline analog-to-digital converter (ADC) is described in this paper. The ADC achieves a signal-to-noise-and-distortion ratio (SNDR) of 55 dB and a spurious-free dynamic range (SFDR) of 66 dB with a 4-GHz input signal, is fabricated in 28-nm CMOS technology, and dissipates 2.9 W. Eight pipeline sub-ADCs are interleaved to achieve the 10-GS/s sample rate, and mismatches between the sub-ADCs are calibrated in the background. The pipeline sub-ADCs employ a variety of power-saving techniques, such as avoiding a dedicated sample-and-hold amplifier (SHA-less operation), residue scaling, flash background calibration, dithering, and inter-stage gain error background calibration. A push–pull input buffer optimized for high-frequency linearity drives the interleaved sub-ADCs to enable >7-GHz bandwidth. A fast turn-ON bootstrapped switch enables 100-ps sampling. The ADC also includes the ability to randomize the sub-ADC selection pattern to further reduce residual interleaving spurs.
Index Terms—Calibration, CMOS, digitally assisted analog design, direct RF sampling analog-to-digital converter (ADC), gigahertz data conversion, interleaved (IL) ADC, pipeline ADC, switched capacitor.
Introduction
Recent progress in high-speed analog-to-digital converter (ADC) design, with resolutions greater than 10 b and sample rates well into the gigahertz range, has made software-defined radios practical for a variety of applications, including communication systems and data acquisition systems [1]–[5]. While narrower band radios, such as the heterodyne receiver shown in the upper half of Fig. 1, have traditionally been used, developments in data conversion technology have enabled a wideband ADC to replace a significant part of the signal chain, as shown in the lower half of Fig. 1, thereby lowering system complexity, power, and cost.
Wireless infrastructure systems such as macro cellular base stations and satellite communication systems, as well as electronic warfare systems and high-performance bench measurement systems, are driving the demand for directly digitizing gigahertz-wide bands (sometimes resulting from merging multiple separate sub-bands co-existing at different carrier frequencies) located at RF frequencies as high as 3.2 GHz, with fairly high linearity (e.g., SFDR of the order of 70 dB at 1 GHz or higher) and low noise (e.g., noise spectral density (NSD) of the order of –150 dBFS/Hz or better). Unfortunately, as the sample rate (fs) of an ADC is increased, its power consumption increases, first linearly and then super-linearly with fs, making the ADC increasingly inefficient and eventually making its implementation impractical. Interleaved (IL) ADCs can enable higher sample rate conversion while keeping power consumption manageable. However, multiple design trade-offs are involved, and many architectural and circuit design challenges need to be overcome.
In this paper, a 12-b 10-GS/s IL pipeline ADC that is fabricated in 28-nm CMOS technology is described [6]. The ADC interleaves an array of eight 12-b pipeline sub-ADCs that are driven by a single input buffer, and employs a variety of calibration, dithering, and randomization techniques to improve spectral performance.
This paper is organized as follows. In Section II, some of the architectural trade-offs and challenges associated with interleaving at gigahertz sample rates are outlined. Section III begins with a description of the overall architecture employed in this design and the various interleaving calibrations. Next, in Section III-A, the architecture and circuits associated with the sub-ADCs, along with their calibration and dithering, are discussed. Section III-B covers the front-end circuit design, including the input buffer, and Section III-C discusses the residual effects of sequentially interleaving the sub-ADCs and the benefits of randomizing the sub-ADC selection. The measurement results from a prototype IC are reported in Section IV. A comparison with similar state-of-the-art ADCs is the topic of Section V. Finally, summary and conclusions are provided in Section VI.
Interleaving and Architectural Tradeoffs
A commonly used figure-of-merit (FOM) to assess the power efficiency of an ADC, known as the Schreier FOM, is

    FOM = SNDRdB + 10 · log10(fSNYQ / (2 · P))    (1)
where SNDRdB is the signal-to-noise and distortion ratio expressed in decibels, fSNYQ is the Nyquist sample rate (corresponding to the sample rate fs divided by the oversampling ratio), and P is the power consumption [7]. A scatter plot of this FOM [8], depicted in Fig. 2, shows how the highest sample rate ADCs rapidly decline in efficiency: they lie along the asymptotic diagonal dashed line commonly known as the "technology front" and are mainly limited by the transistors' speed in a given process technology. So, as newer ADCs adopt finer lithography CMOS processes with faster devices, the technology front shifts toward the right [7] and higher sample rate ADCs become practical.
While this is true for non-IL (or single core) ADC architectures, IL (or parallel) ADCs offer the theoretical potential to extend the limit imposed by the process technology’s speed [9]. In fact, at least in principle, by interleaving M identical ADCs (called sub-ADCs), each one clocked at fsc and consuming Pc watts, an IL ADC sampling at fs = M · fsc and consuming P = M · Pc watts could be designed.
At first, one could conclude that an IL ADC should be just as power efficient as its sub-ADCs, since the FOM for the IL ADC is

    FOMIL = SNDRIL + 10 · log10(fs / (2 · M · Pc))    (2)

and since, in principle, the SNDRC of the sub-ADCs is the same as the SNDRIL of the IL ADC, then by substitution in (2)

    FOMIL = SNDRC + 10 · log10((M · fsc) / (2 · M · Pc)) = SNDRC + 10 · log10(fsc / (2 · Pc))    (3)

and finally

    FOMIL = FOMsub-ADC.    (4)
Therefore, referring back to Fig. 2, starting a design from a sub-ADC with FOMsub-ADC located to the left of the technology front, and interleaving with increasing M, one could conceive of building faster and faster IL ADCs with constant FOM (adding new points to the graph at greater fs but with constant ordinate), eventually crossing over the limit set by the technology front.
In practice, however, to build the IL ADC out of the sub-ADCs, quite a bit of additional circuit overhead is necessary. This includes signal buffering, routing, references, clocking and controls, the front-end interface to the input signal source, the digital back-end de-multiplexing, the power supplies for the different sections, and calibration circuitry. All of this consumes additional power Po, which grows linearly or super-linearly with both M and fs, and hence, when introduced in the denominator of the argument of the logarithm of (2), reduces the actual efficiency of the IL ADC

    FOMIL = SNDRIL + 10 · log10(fs / (2 · (M · Pc + Po))) < FOMsub-ADC.    (5)
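This FOM bookkeeping can be sketched numerically. The sub-ADC figures below (1.25 GS/s, 55-dB SNDR, 0.25 W per core) and the 0.9-W overhead are illustrative assumptions, not values reported for this design:

```python
import math

def fom_schreier(sndr_db, fs_hz, power_w):
    """Schreier high-frequency FOM: SNDR_dB + 10*log10((fs/2)/P)."""
    return sndr_db + 10 * math.log10(fs_hz / (2 * power_w))

# Hypothetical sub-ADC: 1.25 GS/s, 55-dB SNDR, 0.25 W per core.
sndr, fsc, pc, M = 55.0, 1.25e9, 0.25, 8

fom_sub = fom_schreier(sndr, fsc, pc)
# Ideal interleaving: fs and P both scale by M, so the FOM is unchanged.
fom_il_ideal = fom_schreier(sndr, M * fsc, M * pc)
# Overhead power Po lowers the FOM by 10*log10(1 + Po/(M*Pc)).
po = 0.9
fom_il_real = fom_schreier(sndr, M * fsc, M * pc + po)

print(round(fom_sub, 2), round(fom_il_ideal, 2), round(fom_il_real, 2))
```

With these assumed numbers, the ideal IL FOM equals the sub-ADC FOM exactly, and the overhead term costs about 1.6 dB.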
Since the highest FOMs are obtained in non-IL ADC architectures with lower sample rates, such as successive approximation (SAR) converters, it is becoming common, in order to lower power dissipation, to find such SAR sub-ADCs in IL ADCs implemented with a large interleaving order M [9]–[14]. However, neither the above model nor other, much more complex analytical representations [15], [16] capture other important architectural considerations that impact the resulting IL ADC's spectral performance and drivability.
While very power efficient, high interleaving order ADCs (say, M > 10) suffer from a number of practical implementation challenges that limit their resolution to about 10 b [17], [18]. Moreover, as M increases, the input buffer must drive more sub-ADCs, thereby increasing front-end loading, which degrades input bandwidth (BW) and linearity and increases power consumption [16], [17]. Conversely, higher sample rate sub-ADC architectures, such as pipelined sub-ADCs [19], while slightly less power efficient, reduce M (for the same fs), which decreases front-end loading and reduces implementation challenges, complexity, and overhead; such architectures have also demonstrated higher resolution [20]–[22].
So, while both higher M SAR arrays and lower M pipelined arrays have merits, based on the stringent spectral performance and wide BW targets, a pipeline sub-ADC architecture is chosen for this work, and various techniques are employed to lower the power dissipation of the pipeline sub-ADCs. A recently published 14-b 2.5-GS/s pipeline ADC [5] is the fastest non-IL pipeline ADC at these performance levels, and serves as an important data point on the speed limit for such pipeline ADCs in 28-nm CMOS. Our assessment indicated that, in the 28-nm CMOS technology, power efficient pipeline sub-ADCs can be designed for sample rates below 2 GS/s. Further, using a binary (power-of-two) number of sub-ADCs in an IL ADC generally allows a better-matched layout. Considering all of this, eight sub-ADCs are interleaved in this work to achieve 10 GS/s, and this architectural choice has similarities to other IL pipeline ADCs [18], [20].
Interleaved ADC Architecture
The overall ADC architecture is shown in Fig. 3. Eight pipeline sub-ADCs are interleaved to achieve the 10-GS/s sample rate. A single common input buffer drives the input signal, VIN, to all eight sub-ADCs. The digital outputs of the eight sub-ADCs go to individual sub-ADC digital calibration blocks that correct for sub-ADC imperfections. The individually corrected sub-ADC outputs then go to a common IL calibration block, which estimates and corrects mismatches between the sub-ADCs that would otherwise cause mismatch tones [15], [16]. Both the estimation and correction aspects of all calibrations are implemented on-chip. Offset, gain, and timing mismatches are calibrated in the background to ensure good spectral performance. Offset and gain mismatches are both estimated and corrected in the digital domain [23]. For timing mismatches, however, the estimation is done digitally but the correction is done in the analog domain [12], [16], [24]. To estimate the timing skew, it is assumed that the IL offset and gain are already calibrated. If all sub-ADCs sample at equally spaced instants in time, then they all have, on average, the same correlation to neighboring sub-ADC samples. If a sub-ADC is skewed early, then it is, again on average, more correlated with the samples immediately before it and less correlated with the samples immediately after it [16]. For each sub-ADC, a correlation is performed between its output and the sample immediately after it. If ADC[n]sub-ADC_M is the nth overall ADC sample, taken with the Mth sub-ADC, then the correlation value of interest is

    CM = E[ADC[n]sub-ADC_M · ADC[n + 1]]    (6)
where E is the expected value, or mean. One of the sub-ADCs is taken as a reference, and all other sub-ADC timing skews are periodically adjusted, based on their difference from this reference correlation, using a recursive digital feedback loop that operates on an average of samples continuously in the background [16]. The correction of the timing mismatch could be done digitally with finite-impulse response filters [22], but even in an advanced process like 28-nm CMOS, the power dissipation of such a filter with 10-fs timing resolution would be substantially higher than that of analog skew correction, which is accomplished by loading the sample clock driver with a capacitive digital-to-analog converter (DAC) [22]. The complete timing-skew digital feedback loop and the DAC it controls within each sub-ADC are shown in Fig. 4. The sample time is adjusted by turning on (or off) a switch to load (or unload) the inverter, thereby delaying (or advancing) the sampling clock.
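The skew estimator can be sketched in a behavioral simulation. The tone frequency, skew magnitude, faulty sub-ADC index, and record length below are arbitrary assumptions chosen only to make the effect visible:

```python
import math

M, N, TS = 8, 1 << 18, 1.0 / 10e9       # 8 sub-ADCs at a 10-GS/s aggregate rate
skew = [0.0] * M
skew[3] = -2e-12                        # sub-ADC 3 samples 2 ps early (assumed fault)

F_IN = 1.234e9                          # arbitrary non-harmonic test tone
x = [math.sin(2 * math.pi * F_IN * (n * TS + skew[n % M])) for n in range(N)]

# Per-sub-ADC correlation with the sample taken immediately after it.
corr, count = [0.0] * M, [0] * M
for n in range(N - 1):
    corr[n % M] += x[n] * x[n + 1]
    count[n % M] += 1
corr = [c / k for c, k in zip(corr, count)]

# Relative to the reference sub-ADC 0, the early sub-ADC is *less*
# correlated with the following sample, and its predecessor is *more*.
diff = [c - corr[0] for c in corr]
print([round(d, 4) for d in diff])
```

The feedback loop in Fig. 4 would drive each `diff` entry toward zero by adjusting the corresponding clock-loading DAC.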
To complete the architecture description, as shown in Fig. 3, a differential clock receiver (Rcvr) is driven from an off-chip 10-GHz clock (CLK) generator, and the output of the clock receiver goes to clock generation circuitry that generates all interleaving clock phases to control the sub-ADC operations such as sampling, coarse quantization in the flash, and multiplying DAC (MDAC) residue generation. The ADC includes on-chip reference voltage generation and bias current generation circuitry.
A. Sub-ADC Architecture and Circuit Implementation
As discussed earlier in Section I, this work interleaves relatively fast (>1.25-GS/s) pipeline sub-ADCs. The pipeline architecture in advanced CMOS technologies like 28 nm enables such gigahertz sub-ADCs with very good spectral performance. In this work, several techniques are employed to minimize the power consumption of the pipeline sub-ADCs without sacrificing performance. To minimize power consumption, the pipeline sub-ADC is designed to operate from the core 1-V supply. One of the key challenges with a low-voltage pipeline is the design of an MDAC amplifier with sufficient swing, gain accuracy, and linearity [25]. The pipeline sub-ADCs are designed to handle an input signal swing of 1.4 Vpp-differential, which makes designing an MDAC on a 1-V supply challenging. Using a higher supply for the MDAC amplifier would result in higher power and complexity, including additional circuitry for voltage stress mitigation when low-voltage MDAC amplifier transistors are used with a higher-than-rated supply, and increased supply routing complexity due to multiple supply voltages. Further, with multiple supply domains in a switched-capacitor MDAC circuit, the clocks and boosters may need further level shifting (LS). All of this would translate into a larger area for the sub-ADC design, which in turn increases overall IL ADC power consumption in terms of clocking parasitics and the parasitics that the input buffer has to drive. In this work, the MDAC amplifier is designed to operate from the 1-V supply to minimize area and power, and a combination of analog circuit techniques and digital calibration techniques is used to ensure good performance.
The architecture of the pipeline sub-ADC is shown in Fig. 5. The pipeline consists of a 4-b first stage, followed by three 3-b stages and a final 3-b flash. The choice of MDAC resolution (bits per stage) between 2 and 4 b generally represents a reasonably shallow optimum in thermal-noise-limited designs [26]–[28]. The pipeline sub-ADC is SHA-less, which avoids the power, noise, and distortion overhead of the SHA but introduces stringent matching requirements in terms of track BW between the MDAC and the flash [29].
The implementation details of the first stage of the pipeline sub-ADC, stage1, are also shown in Fig. 5. The input signal VINX is sampled on the sampling capacitor CS while a 4-b flash simultaneously quantizes VINX coarsely. The output of the 4-b flash drives a DAC capacitor CDAC, and CDAC subtracts charge from CS. The use of a separate DAC capacitor, as opposed to reusing CS to also perform the DAC function, has well-known trade-offs [28], [30]. The benefits of a separate CDAC are: 1) the charge glitch on the reference buffer is signal independent, which allows the use of a low-power reference buffer and 2) CS does not carry non-linear quantization charge at the end of the hold phase, which removes the need for an explicit reset phase before CS goes back to track, thereby saving power.
The disadvantages of a separate CDAC are increased noise and a lower feedback factor. The MDAC amplifier, Amp1, generates the residue, VRES, for the next stage. Dither is injected in stage1 to linearize the sub-ADC transfer function [28], and inter-stage gain error (IGE) calibration is also performed in the background to correct MDAC gain errors [31]. The reference buffer, which is not explicitly shown in Fig. 5, is implemented as a complementary push–pull source follower to ensure fast settling of the CDAC capacitor when the MDAC is in the hold phase. Each MDAC stage within each sub-ADC has its own reference buffer, and the mismatches between the reference buffers are corrected as part of the background digital calibrations. Using a common reference buffer for all MDACs would have required that buffer to drive the routing parasitic capacitance to each MDAC, resulting in higher power dissipation.
The comparators in the 4-b flash use small devices for low power and small area, so their process mismatches consume significant correction range. To overcome this, the 4-b flash employs a background calibration scheme to correct comparator offsets, as shown in Fig. 6. The 4-b flash in this work normally requires 16 comparators (the MDAC transfer function with the 16 comparator transitions is explained later in this section [28]); however, an extra 17th comparator is added for this background calibration scheme. At any given time, only 16 comparators are needed for the main signal path operation, so one of the 17 comparators is taken offline and calibrated in the background. All the comparators are rotated through sequentially to ensure that all their offsets are periodically calibrated. In Fig. 6, the comparator being calibrated is highlighted.
The reference taps and the output data bits of the comparators are multiplexed, as shown around the comparator under calibration, to ensure that the signal path functionality is not affected by taking a comparator offline for calibration. The comparator being auto-zeroed has its inputs disconnected from the sampling network and shorted together to provide a zero input. The background offset calibration removes not only offsets over process, supply, and temperature variations but also offset drift due to transistor aging, which can be severe in an advanced CMOS technology like 28 nm.
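The 17-comparator rotation can be sketched as follows; the specific tap-to-comparator routing order is a hypothetical choice, since the paper does not spell out the exact mux ordering:

```python
def flash_assignment(step, n_comps=17):
    """At each calibration step, one comparator is offline for auto-zero
    and the remaining 16 carry the signal path (hypothetical mux order)."""
    offline = step % n_comps
    online = [c for c in range(n_comps) if c != offline]
    # Reference tap i (0..15) is routed to the ith remaining comparator.
    routing = {tap: comp for tap, comp in enumerate(online)}
    return offline, routing

offline, routing = flash_assignment(5)
print(offline, len(routing))   # comparator 5 offline, 16 still in the signal path
```

Stepping `step` once per calibration cycle visits every comparator within 17 cycles, so each offset is refreshed periodically, tracking supply, temperature, and aging drift.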
An important SHA-less consideration is that, since both the flash and the MDAC in stage1 sample gigahertz inputs, small BW mismatches can consume significant correction range. To minimize BW mismatches, a sampling comparator architecture [29] is used in flash1, as shown in Fig. 7. VINX is first passively sampled in both the MDAC and the flash, and then the latch fires to generate the comparator output. While this sequential operation adds delay in the comparator, as opposed to directly sampling at the latch, the benefit is the ability to very closely match the track BWs of the MDAC and the flash, since both are distributed RC networks when tracking the input signal. Further, to be able to correct any BW mismatches that do exist between the MDAC and the flash, the sample clocks of the MDAC (q1p) and the flash (q1p_FL) are split, and a delay line is inserted in the flash sample clock path to allow trimming of the flash sample time. This trim is done in the foreground by monitoring the stage1 residue VRES and minimizing its amplitude under high-frequency input signal conditions while trimming the flash sample clock delay.
In a 4-b stage, the gain of the MDAC is typically set to 2^(4−1) = 8. However, to enable the stage1 MDAC to operate off the core supply, the residue gain is reduced to 4 in this work, as shown in Fig. 5 with the ratio CS/CF = 4. Fig. 8 compares the stage1 transfer function (TF) of a typical 4-b stage with a gain of 8 against that of the implemented 4-b stage with a gain of 4. While this residue gain reduction halves the swing at the output of Amp1 and improves linearity, it doubles the back-end (i.e., stages 2 to 5) noise referred to the input. However, the power increase in the back-end stages to reduce their noise contribution was smaller than the power savings obtained in the stage1 MDAC by halving its swing. Fig. 8 also shows the locations of the 16 comparator flash transitions in the 4-b MDAC.
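A normalized behavioral sketch of the two residue gains illustrates the swing argument; the threshold and DAC levels below are idealized assumptions on a ±1 full scale, not the actual circuit levels:

```python
def residue(vin, gain):
    """Idealized stage-1 residue on a +/-1 full scale: 16 flash decisions
    with thresholds every 1/8, a matching 16-level DAC, then the stage gain."""
    code = max(0, min(15, int((vin + 1) * 8)))   # 4-b flash decision
    vdac = (code - 7.5) / 8.0                    # DAC level for that code
    return gain * (vin - vdac)

vins = [v / 1000.0 for v in range(-999, 1000)]
swing8 = max(abs(residue(v, 8)) for v in vins)   # conventional gain of 8
swing4 = max(abs(residue(v, 4)) for v in vins)   # this work's gain of 4
print(swing8, swing4)   # the gain-4 stage halves the residue swing
```

The halved swing relaxes the Amp1 output range on the 1-V supply, at the cost of referring twice as much back-end noise to the input, as noted above.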
The MDAC amplifier's simplified transistor-level implementation is shown in Fig. 9. The amplifier is a two-stage design with a split cascode compensation scheme. Both stages use a push–pull complementary architecture to double the power efficiency (i.e., double gm/I). However, the push–pull architecture requires different bias points for the PMOS and NMOS devices, which is implemented by using dynamically level-shifted capacitors (CLS1 and CLS2). Each level-shift capacitor is charged to the desired level-shift voltage using a switched-capacitor circuit that operates on non-overlapping complementary clocks q1 and q2 [32]. As shown in Fig. 9, a small capacitor CSMALL is charged to the desired level-shift bias voltages (VBIASP and VBIASN), and this small capacitor is periodically switched in parallel with the level-shift capacitor to refresh its charge and thereby establish the level-shift voltage. The first stage of the MDAC amplifier uses active cascodes, and both stages use independent common-mode feedback circuits for better common-mode settling and stability. The amplifier is designed for fast linear settling and optimized for low power, which is made possible by taking advantage of the reduced swing, dithering, and IGE calibration techniques.
Dither is added to both the MDAC (using the CDITHER capacitor shown in Fig. 5) and the flash [28]. The dither added to the flash linearizes both residual IGEs and non-linearities in the stage1 MDAC residue. The dither added to the MDAC propagates down the pipeline and linearizes differential non-linearity (DNL) errors in the back-end ADC. Only mismatches between the MDAC dither and the flash dither consume correction range, and these mismatches are small relative to the correction range. A 1-b random generator (labeled IGE in Fig. 5) drives a capacitor CIGE to inject charge into the MDAC, which is used to digitally estimate the IGEs in the MDAC [31]. Once estimated, the IGE is digitally corrected in the background.
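The IGE estimation by correlation against the injected random bit can be sketched behaviorally. The nominal gain, actual gain, and dither amplitude below are assumed values, and the residual is modeled as uniform noise, which is a simplification of the real signal-dependent residue:

```python
import random

random.seed(1)
G_NOM, G_ACT, DITHER = 4.0, 3.9, 0.01   # assumed values; G_ACT models a gain error

# Behavioral back-end samples: residue = G_ACT * (q + s*DITHER), where q is
# the quantization residual and s the known +/-1 injected bit.
N = 200_000
acc = 0.0
for _ in range(N):
    q = random.uniform(-0.0625, 0.0625)   # stage-1 quantization residual
    s = random.choice((-1.0, 1.0))
    backend = G_ACT * (q + s * DITHER)
    acc += backend * s                    # correlate against the known bit

g_est = acc / (N * DITHER)                # E[backend*s] / DITHER -> actual gain
print(round(g_est, 2))                    # close to G_ACT, exposing the gain error
```

Because `q` is uncorrelated with `s`, the correlation isolates the injected charge and converges to the actual inter-stage gain; the digital correction then scales the back-end output by the nominal-to-estimated gain ratio.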
B. Front End
The front end of the eight-way IL ADC is shown in Fig. 10. A common input buffer drives the sampling networks within each of the eight sub-ADCs. This isolates the input VIN from the loading of the eight sub-ADCs, which improves BW and linearity. To minimize loading on the input buffer and crosstalk between sub-ADCs, only one of the eight sub-ADCs is connected to, and tracking, the output of the buffer at any time. That is, only one of the eight VBTSTRP[1:8] signals is asserted at any time. The seven off input switches present a significant non-linear parasitic load to the input buffer that degrades high-frequency linearity; to reduce this impact, the back gates of these input switches are biased at –1 V to reduce the CSB non-linearity.
A trade-off exists between the choice of a single common input buffer, as used in this work, and separate input buffers driving each of the eight sub-ADCs. The gm, and hence power, of the buffer needed to achieve the target BW and linearity at high input frequencies is determined by the capacitive loading. When the loading of the buffer is dominated by the sampling capacitor CS, and with only one sub-ADC sampling at any time, one can argue that a single common buffer consumes 8× less power than eight separate buffers, since each of those separate buffers would have to burn the same power to supply the ac current needed when its sub-ADC is sampling with a loading of CS. In reality, the common buffer does not save the full 8×, since the metal routing to the eight sub-ADCs and the seven off input switches add parasitic capacitance. However, as long as these two additional parasitic capacitances are significantly smaller than 7 × CS, using a common buffer yields a significant net power saving. Further, with separate input buffers, the total capacitance presented to VIN would also increase, which would significantly lower the BW. Based on BW, power, and linearity considerations, a single common buffer is used in this work.
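The loading argument can be put in back-of-envelope form; all capacitance values below are illustrative assumptions, not numbers from this design:

```python
# Back-of-envelope front-end loading comparison (illustrative values only).
CS, C_ROUTE, C_OFF_SW = 500e-15, 150e-15, 30e-15

# Shared buffer: one CS tracks at a time, plus routing to all eight
# sub-ADCs and the parasitics of the seven off input switches.
c_common = CS + C_ROUTE + 7 * C_OFF_SW
# Eight private buffers: each must still drive a full CS when its
# sub-ADC samples, so total buffer power scales with 8 such loads.
c_separate_total = 8 * (CS + C_ROUTE / 8)

ratio = c_separate_total / c_common
print(round(ratio, 2))   # net advantage of the shared buffer, below the ideal 8x
```

With these assumed parasitics, the shared buffer retains most, but not all, of the ideal 8× saving, consistent with the qualitative argument above.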
The input buffer implementation details are shown in Fig. 11. A pseudo-differential complementary push–pull architecture is used, which doubles gm/I. However, a push–pull design requires different bias points for the NMOS and PMOS devices, which are implemented with a level shifting (LS) circuit comprised of a current source developing a bias voltage across a high valued resistor that is bypassed with a large feed-forward capacitor. Two levels of cascodes are bootstrapped to the input VIN to reduce drain modulation of the input devices of the buffer, which improves linearity, but necessitates the use of higher supply voltages for the buffer.
The input buffer is powered from 2- and –1-V supply rails, and each transistor is biased to ensure it remains in saturation with at least 150 mV of VDS – VDSAT margin. While the input devices and the inner cascodes are driven directly from the input VIN through LS circuits, the outer cascodes are driven from the sources of the inner cascodes. Two other choices for driving the outer cascode gates are the input VIN or the buffer output VINX. Both of those choices degrade buffer linearity, since the drains of the outer cascodes are not bootstrapped to the input, resulting in a large non-linear gate current at high input frequencies. The back gates of the various transistors in the buffer are bootstrapped, as shown in Fig. 11, to further improve buffer linearity. While the innermost input devices have their back gates tied locally to their sources, the back gates of the cascodes are tied to the equivalent small-signal points on the complementary side of the stack, which increases the reverse bias on the back-gate diodes, thereby reducing non-linearity.
With each sub-ADC allocated 100 ps at 10 GS/s for tracking and sampling the output of the buffer, a fast turn-ON bootstrapped switch is essential. The traditional bootstrapped switch is shown in Fig. 12 [25]. The operation of this switch follows the sequence of steps indicated in Fig. 12 from 1 to 5. When CLKB and its boosted version CLKBBST are high, the bootstrap capacitor CBTSTRP is charged. When CLKB goes low, first V1 goes high, and then the output VBTSTRP is weakly pulled up to VDD – VTH,NMOS, which weakly turns on MN2 and MN1, thereby pulling down the gate of MP0, which eventually pulls VBTSTRP high by connecting it to the charged CBTSTRP capacitor. This circuit exhibits positive feedback during turn-ON, so once VBTSTRP is high enough, MN1 and MN2 strongly pull down the gate of MP0 until the entire circuit reaches bootstrapped steady-state operation. Note that the turn-ON speed of this bootstrap generator could be improved if the gate of MP0 could be pulled low earlier in the sequence.
In this work, that is achieved by adding a separate transistor MN0 to directly pull down the gate of MP0 when CLKB goes low, as shown in Fig. 13. However, if MN0 remained on when MN1 and MN2 fully turn ON, it would disrupt the bootstrapping operation by presenting a low impedance to the input VINX. To avoid this contention, MN0 is turned off with a delayed version of CLKB, CLKBDELAY, thereby preventing MN0 from affecting the bootstrapping operation once it has accelerated the circuit's turn-ON.
C. Sequential and Random Interleaving
IL ADCs typically cycle through the sub-ADCs in a sequential (rotational) pattern. The eight sub-ADCs in Fig. 3 sample the input signal VIN in a rotate-by-eight sequential pattern, as shown in the sub-ADC selection pattern in the top half of Fig. 14. With sequential interleaving, any mismatches between the sub-ADCs cause spurs in the output spectrum; these mismatches are calibrated in this work, as explained earlier in this section. However, despite calibration, residual interleaving spurs remain, owing to the very high sensitivity of these spurs to mismatches left uncorrected after calibration. Further, some second-order interleaving mismatches, such as linearity mismatches between the sub-ADCs, are not calibrated due to their complexity. For large-signal inputs, the SFDR of a sequentially IL ADC with interleaving mismatch calibrations is typically limited by HD2 or HD3 spurs caused by sampling distortion.
However, as the input signal becomes smaller, HD2 and HD3 typically improve as the square and cube of the signal reduction, respectively, so the small-signal SFDR can quickly become limited by residual interleaving spurs, which is undesirable for many broadband applications. Further, in some applications, HD2 and HD3 spurs may be frequency-planned to fall outside the desired frequency band of interest, but residual interleaving spurs may fall in band, which is again undesirable.
To overcome this residual interleaving spur limitation, this work includes the ability to randomize the sub-ADC selection pattern at the full 10-GS/s sample rate. Randomization converts any residual interleaving spurs into noise, thereby producing a cleaner spectrum, with the trade-off being an increase in the noise floor. To allow randomization, each of the eight sub-ADCs is designed to run at 1.43 GS/s [= (10 GS/s)/7], so that a sub-ADC is available for selection again seven clock periods after it samples. This redundancy results in two sub-ADCs being available for selection at any time, and the selection between these two sub-ADCs is controlled by a 1-b pseudo-random generator (PRND). The bottom half of Fig. 14 describes the random selection sequence pictorially. Assuming an initial starting sequence of 1 through 7, for the 8th sample both sub-ADCs 8 and 1 are available. If, for example, the PRND selects sub-ADC 1 for the 8th sample, then sub-ADC 8 remains at the same position in the stack and sub-ADC 2 is added to the stack. For the 9th sample, if sub-ADC 8 is selected, then sub-ADC 2 takes its place and sub-ADC 3 takes the place of sub-ADC 2 in the stack. After the sub-ADC conversions, the samples are reassembled in the correct sequence, thereby reversing the random scrambling. The timing for an example sub-ADC selection sequence is shown in Fig. 15, wherein MDAC1 tracks (T) the input signal for one period, then takes a sample and goes into hold (H) for a minimum of six periods, which includes the time required for flash data generation and MDAC amplification to create the residue [28]. The IL calibration algorithms for gain, offset, and timing mismatches are unchanged when randomizing. For timing-skew estimation, it was mentioned earlier that a correlation is performed between a given sub-ADC output and the sample immediately after it; when randomizing, that next sample is produced randomly by one of the other seven sub-ADCs.
This correlation, on average, still accurately estimates the timing skew of the given sub-ADC even when randomizing.
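The selection scheme above can be sketched behaviorally; the initial sequential ramp and the choice of seed are illustrative assumptions:

```python
import random

random.seed(0)
M, RECYCLE = 8, 7               # eight sub-ADCs; each is reusable 7 periods later

ready_at = [0] * M
selection, n_avail = [], []
for n in range(10_000):
    avail = [s for s in range(M) if ready_at[s] <= n]
    n_avail.append(len(avail))
    pick = n if n < 7 else random.choice(avail)   # sequential ramp, then 1-b PRND
    selection.append(pick)
    ready_at[pick] = n + RECYCLE

# After the ramp, exactly two sub-ADCs are selectable at every instant.
print(min(n_avail[7:]), max(n_avail[7:]))   # prints: 2 2
```

The sketch confirms the two properties claimed above: running each sub-ADC at one-seventh of the aggregate rate leaves exactly two candidates per sample for the 1-b PRND, and no sub-ADC is reselected before its seven-period conversion completes.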
Measurement Results
The 12-b 10-GS/s ADC is fabricated in a 28-nm CMOS technology. The die photograph of the ADC is shown in Fig. 16, with the key sections of the chip highlighted. The input buffer is at the top, followed by the eight IL sub-ADCs below it, followed by the digital section. The clock receiver and all the clock phase generation circuitry are on the right, and the bias generation is on the left.
Fig. 17 shows the measured digital stage1 residue transfer function (DSRTF) of one of the sub-ADCs, with the back-end (stages 2 to 5) codes on the y-axis and the flash1 output codes on the x-axis. With the flash comparator background offset calibration enabled, a substantial portion of the correction range remains unused. Next, Fig. 18 shows the DSRTF with all of the correction range used when the ADC samples a 4-GHz signal before the flash sample clock delay is trimmed to match the MDAC. Finally, Fig. 19 shows the DSRTF with the flash sample clock delay trimmed, where, even with a 4-GHz input signal, a substantial portion of the correction range remains unused.
Fig. 20 shows the measured Integral Non-Linearity (INL) transfer function of one of the eight sub-ADCs for three cases. With IGE calibration and dithering disabled, the INL has sharp discontinuities exceeding ±2 LSBs. Enabling IGE calibration reduces that to about ±1.5 LSBs. Finally with dithering also enabled the INL is less than ±0.7 LSBs. Dithering and IGE calibration significantly improve sub-ADC linearity and ensure a smooth INL transfer function. Having linear sub-ADCs with no transfer function discontinuities is a prerequisite to achieving good interleaving performance.
Fig. 21 shows a measured fast Fourier transform (FFT) of the IL ADC sampling a 4-GHz input signal at 10 GS/s, with the sub-ADC calibrations and dithering enabled but without the interleaving calibrations. The spectrum shows large interleaving mismatch spurs that limit the SFDR. When the interleaving calibrations are enabled, as shown in Fig. 22, the interleaving mismatch spurs are reduced below the 80-dB level, and the SFDR is limited by HD2 to 66 dB, with HD3 at 69 dB, while the achieved SNR is 56 dB and the SNDR is 55 dB. An input frequency sweep of SNR, SNDR, and SFDR is shown in Fig. 23. Table I summarizes the performance of this 12-b 10-GS/s ADC and lists both the Schreier FOM (FOMS_HF) and the Walden FOM (FOMW_HF) [8].
TABLE I
Performance Summary

| Specification | Value |
| --- | --- |
| Resolution | 12 b |
| fSAMPLE | 10 GS/s |
| SNR | 56 dB |
| SNDR | 55 dB |
| SFDR (fIN = 4 GHz) | 66 dB |
| Power | 2.9 W |
| FOMS_HF | 147 dB |
| FOMW_HF | 631 fJ/conv-step |
| BW | 7.4 GHz |
| DR | 60 dB |
| NSD (small-signal) | –157 dBFS/Hz |
| Technology | 28 nm CMOS |
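The two FOM entries in Table I follow directly from the measured SNDR, sample rate, and power, using the standard high-frequency Schreier and Walden definitions (the Nyquist bandwidth fs/2 is assumed in the Schreier FOM):

```python
import math

def fom_schreier_hf(sndr_db, fs_hz, power_w):
    # FOM_S,HF = SNDR + 10*log10(BW / P), with BW taken as Nyquist (fs/2)
    return sndr_db + 10 * math.log10((fs_hz / 2) / power_w)

def fom_walden_hf(sndr_db, fs_hz, power_w):
    # FOM_W,HF = P / (2**ENOB * fs), with ENOB = (SNDR - 1.76) / 6.02
    enob = (sndr_db - 1.76) / 6.02
    return power_w / (2 ** enob * fs_hz)

print(fom_schreier_hf(55, 10e9, 2.9))        # ~147 dB
print(fom_walden_hf(55, 10e9, 2.9) * 1e15)   # ~631 fJ/conv-step
```

Plugging in SNDR = 55 dB, fs = 10 GS/s, and P = 2.9 W reproduces both tabulated figures of merit.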
Fig. 24 shows the measured –3 dB BW of this ADC, which is about 7.4 GHz. The primary circuits determining the BW performance are the front-end push–pull input buffer and the fan-out driving the sampling networks within each sub-ADC.
As discussed in Section III-C, this ADC can randomize the sub-ADC selection order to improve spectral performance by reducing the magnitude of the residual interleaving spurs. To illustrate the effect of randomization, a sequence of measured FFT spectra is shown next. Fig. 25 shows an FFT of the ADC sequentially sampling a close-to-full-scale 1-GHz signal at 10 GS/s; the SFDR is limited by the HD3 component to 71 dBc, and the interleaving mismatch spurs are suppressed to the –80-dB level by the calibrations. However, when the input amplitude drops by 6 dB, as shown in Fig. 26, the HD2 and HD3 components fall as the square and the cube of the signal amplitude, respectively, and the SFDR is now limited to 70 dBc by the interleaving mismatch spurs. This is undesirable, since many applications expect the SFDR to improve at smaller signal amplitudes. When randomization of the sub-ADC selection is enabled, as shown in Fig. 27, these residual interleaving mismatch spurs are smeared into the noise floor and, for the case shown, the SFDR improves by 10 dB to 80 dBc, at the cost of a 1.5-dB degradation in NSD.
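The spur-smearing behavior can be reproduced with a behavioral model: a small residual offset mismatch among eight round-robin sub-ADCs creates discrete tones at multiples of fs/8, while randomized selection converts the same mismatch energy into a slightly elevated noise floor. The sketch below is a simplification (selection with replacement, offset mismatch only; the mismatch magnitude and record length are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 2**14, 8                         # FFT length and number of sub-ADCs
fs, fin = 10e9, 1e9                     # sample rate and input tone (as in Fig. 25)
n = np.arange(N)
x = np.sin(2 * np.pi * fin / fs * n)
offsets = rng.normal(0, 1e-3, M)        # small residual per-sub-ADC offset mismatch

y_seq = x + offsets[n % M]                    # fixed round-robin sub-ADC order
y_rand = x + offsets[rng.integers(0, M, N)]   # randomized sub-ADC selection

def spectrum_db(y):
    # Hann-windowed magnitude spectrum, roughly normalized to dBFS
    s = np.abs(np.fft.rfft(y * np.hanning(N))) / (N / 4)
    return 20 * np.log10(np.maximum(s, 1e-12))

spur_bins = [k * N // M for k in (1, 2, 3)]   # offset spurs land at k*fs/8
seq_spur = max(spectrum_db(y_seq)[b] for b in spur_bins)
rand_spur = max(spectrum_db(y_rand)[b] for b in spur_bins)
# seq_spur sits tens of dB above rand_spur: randomization trades discrete
# interleaving tones for a small broadband noise-floor penalty.
```

This captures the measured trade-off: the fs/8-family tones disappear from the randomized spectrum, while the total mismatch power reappears as the NSD degradation noted above.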
Finally, the power consumption of the ADC at 10 GS/s is 2.9 W, which includes about 400 mW for the input buffer, 1800 mW for the eight sub-ADCs, 650 mW for clocking and digital, and 50 mW for reference and bias generation.
Comparison With State-Of-The-Art ADCs
The Schreier FOM, shown in (1), is used to compare the performance of this 12-b 10-GS/s ADC with other ADCs in the literature. Fig. 28 shows an FOM comparison plot based on data from Murmann [8], where the ADCs have been filtered by the condition SNDR ≥ 50 dB, and Table II compares this work with recently published ADCs with fS ≥ 2.5 GS/s from Fig. 28. This work achieves almost twice the sample rate of [5] and [33], both in 28-nm CMOS, while achieving a similar FOM. While [21] and [34] achieve a better FOM in 16-nm CMOS, they are 2.5× slower than this work. Almost all of these ADCs use an IL pipeline architecture, and the process technologies used for them range from 130-nm BiCMOS to 16-nm CMOS.
TABLE II
Comparison With Recently Published ADCs

| Specification | This Work | [5] Ali | [33] Wu | [21] Wu | [22] Straayer | [34] Vaz | [35] Chen | [20] Setterberg |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| fSAMPLE | 10 GS/s | 5 GS/s | 5.4 GS/s | 4 GS/s | 4 GS/s | 4 GS/s | 3 GS/s | 2.5 GS/s |
| Input fIN | 4 GHz | 2 GHz | 2.7 GHz | 1.9 GHz | 1.8 GHz | 1.9 GHz | 1.5 GHz | 1 GHz |
| SNDR @ fIN | 55 dB | 58 dB | 50 dB | 56 dB | 56 dB | 57 dB | 51 dB | 61 dB |
| SFDR @ fIN | 66 dB | 70 dB | 65 dB | 68 dB | 64 dB | 67 dB | — | 78 dB |
| Power (W) | 2.9 | 2.3 | 0.5 | 0.3 | 2.2 | 0.5 | 0.5 | 24 |
| FOMS @ fIN | 147 dB | 148 dB | 147 dB | 154 dB | 145 dB | 153 dB | 146 dB | 138 dB |
| BW | 7.4 GHz | 5 GHz | — | — | 4 GHz | — | — | — |
| Process | 28 nm | 28 nm | 28 nm | 16 nm | 65 nm | 16 nm | 40 nm | 130 nm BiCMOS |
| Architecture | IL Pipe | IL Pipe | IL Pipe | IL Pipe | IL Pipe | IL Pipe/SAR | IL Pipe | IL Pipe |
Summary and Conclusion
A 12-b 10-GS/s ADC that interleaves eight pipeline sub-ADCs in a 28-nm CMOS technology is described in this paper. The SHA-less pipeline sub-ADCs, including the MDAC amplifiers, operate from the core power supply for low power dissipation, which is made possible by techniques such as residue scaling, flash background calibration, dithering, and IGE calibration. The challenges of achieving wide bandwidth and high linearity in an IL ADC are addressed with a push–pull complementary input buffer that drives the IL sub-ADCs, and a fast turn-ON bootstrapped switch enables 10-GS/s sampling operation. Interleaving mismatches are addressed with background calibration techniques, and random selection of the sub-ADCs is shown to reduce the residual interleaving spurs.