A DOUBLE-DIFFERENTIAL-INPUT / DIFFERENTIAL-OUTPUT FULLY COMPLEMENTARY AND SELF-BIASED ASYNCHRONOUS CMOS COMPARATOR

Vladimir Milovanović and Horst Zimmermann

Institute of Electrodynamics, Microwave and Circuit Engineering
Faculty of Electrical Engineering and Information Technology
Vienna University of Technology (TU Wien)
Gußhausstraße 27, A-1040 Wien, Austria

Abstract: A novel fully complementary and fully differential asynchronous CMOS comparator architecture, consisting of a two-stage preamplifier cascaded with a latch, achieves a sub-100 ps propagation delay for input signal amplitudes of 50 mV\(_{pp}\) and higher under a 1.1 V supply and 2.1 mW power consumption. The proposed voltage comparator topology features two differential pairs of inputs (four inputs in total), thereby increasing the signal-to-noise ratio (SNR) and noise immunity through rejection of coupled noise components, reducing even-order harmonic distortion, and doubling the output voltage swing. In addition, the comparator is truly self-biased via a negative feedback loop, which eliminates the need for a voltage reference and suppresses the influence of process, supply voltage and ambient temperature variations. The described analog comparator prototype occupies 0.001 mm\(^2\) in a purely digital 40 nm LP (low power) CMOS process technology. These merits make it highly attractive as a building block in leading-edge system-on-chip (SoC) data transceivers and data converters.

Keywords: Comparator, preamplifier, latch, CMOS, fully-differential, PVT variations, noise immunity, self-biasing, data converters, ADC, transceivers.
1 Introduction

After amplifiers, comparators are perhaps the second most widely used analog electronic component. Analog comparators can be used to determine whether one input value is higher or lower than the other at specific time points (predefined by a clock signal), or to perform the comparison in an asynchronous manner, that is, to detect the time point at which the difference of the two input signals changes its sign. These two comparator types are usually classified as dynamic (clocked) comparators and asynchronous (or open-loop) comparators, respectively. Further, the compared signal may be any analog physical (i.e., electrical) quantity, such as current or voltage, but also charge or even time. This paper contributes to the field of asynchronous (non-clocked) analog voltage comparators.

Both asynchronous/open-loop [2] and dynamic/synchronous [3] comparator types are in widespread use in switched-mode power supplies as well as in present-day data conversion [4] and/or transmission circuits [5]. After all, a comparator is nothing but a single-bit analog-to-digital converter (ADC). Comparators are often the critical design components since, for example, a data converter’s bandwidth and maximum (over-)sampling rate directly depend on the comparator’s propagation delay. Moreover, an analog-to-digital converter’s resolution, expressed in terms of signal-to-noise and distortion ratio or effective number of bits, is largely influenced by the comparator’s noise figure and its input-referred noise. Finally, comparators should on the one hand be fast and low-noise, while on the other hand, for use in battery-powered applications, they should consume as little power as possible.

The basic idea behind high-speed analog voltage comparators is to combine the best aspects of a preamplifier, which has a negative exponential step response, with those of a latch, which exhibits a positive exponential rise. The preamplifier is used to build up the input voltage difference to a certain point where the latch takes over and brings the signal to the rail. Both clocked and non-clocked comparators can exploit these speed-up principles. A block-level representation of a high-speed asynchronous comparator consisting of a preamplifier-latch cascade is given in Fig. 1.

![Figure 1](image-url)

Fig. 1. Fully differential asynchronous voltage comparator that exploits a preamplifier-latch cascade to achieve fast decision making and thereby high operating speeds.

Fig. 2. Fully differential high-speed preamplifier-latch asynchronous voltage comparator that features two pairs of differential inputs (four in total) on the preamplifier.

It is advantageous for high-speed asynchronous voltage comparators to utilize fully differential signaling, as it brings increased noise immunity through rejection of coupled noise components, reduced even-order harmonic distortion, and doubled output voltage swing. Besides using a differential output like the one in Fig. 1, the overall noise performance can also benefit from the comparator version of Fig. 2, whose preamplifier stage has two pairs of differential inputs (four in total).

This article presents a high-speed asynchronous CMOS voltage comparator implementation which exploits two differential pairs of inputs and is suitable for incorporation in the cutting-edge systems on chip (SoCs).

2 Four-Input Asynchronous Comparator Topology

Transistor-level and block-level schematics of the proposed complementary and fully differential self-biased asynchronous CMOS voltage comparator that features two pairs of inputs are shown in Fig. 3 and Fig. 4, respectively.

The comparator comprises three fully differential self-biased CMOS voltage amplifiers that share an identical circuit topology, and a CMOS latch. The inputs of two of the amplifiers (four in total) act as the comparator inputs, while the biasing nodes and the respective outputs of these two amplifiers are connected to each other in parallel, thus constituting the first preamplifier stage. The third amplifier is cascaded to the outputs of the first two, effectively forming the preamplifier’s second stage. The amplifiers constructing the first preamplifying stage are mutually identical (corresponding transistor sizes are matched), but differ from the one serving as the second preamplifying stage (that is, its transistor sizes are optimized independently). Finally, the preamplifier is cascaded with a simple latch whose outputs are at the same time the comparator outputs.

![Block-level schematic of the proposed self-biased asynchronous analog voltage comparator](image)

Fig. 4. Block-level schematic of the proposed self-biased asynchronous analog voltage comparator which features two pairs of differential inputs and differential output of Fig. 3.

Inputs of each of the three fully differential self-biased inverter-based CMOS amplifiers [5, 6] are amplified through the push-pull inverters consisting of transistors \(N_{Out}^{\text{in}}\) and \(P_{Out}^{\text{in}}\), thus rendering the outputs of that particular amplifier. The CMOS inverters at the inputs bring inherent advantages such as very high input impedance and nominally doubled transconductance. The biasing of each stage is accomplished through complementary transistor pairs \(N_{bias}^{\text{in}}\) and \(P_{bias}^{\text{in}}\), which are controlled by \(v_{bias}\) and operate deep within the linear region. This potential is in turn stabilized through the negative feedback loop utilizing \(N_{Bias}^{\text{in}}\) and \(P_{Bias}^{\text{in}}\). Namely, any variation in processing parameters or operating conditions (change of supply voltage or ambient temperature) that shifts \(v_{bias}\) from its nominal value is immediately attenuated [7] to an extent proportional to the loop gain. As the biasing transistors operate in the triode region, potentials \(v_{down}\) and \(v_{up}\) are very close to the negative and the positive supply rail, respectively. In such a configuration, self-biasing does not compromise the output voltage swing, which is nearly equal to the difference between the two supply rails.

Resistors \(R'\) and \(R''\) serve to avoid the establishment of low-resistance paths through the \(v_{bias}^{\text{in}}\) and \(v_{bias}^{\text{out}}\) nodes, respectively, for high (in absolute value) input voltage differences. Placed in the biasing part, the resistors have no impact on comparator performance, except that they drastically reduce dissipation while mutually distant potentials are applied to the comparator inputs. A problem of the same kind also occurs in the path through the $v_{out}^{'+}$ and $v_{out}^{'-}$ nodes, but there it cannot be avoided using the resistor trick; instead, these metal lines must be made thicker in order to sustain higher current values.

As already stated, the output of the last preamplifier stage is connected to the input of the latch stage. The latch itself is implemented as a cross-coupled connection of two CMOS inverters (composed of transistors $N_{\text{latch}}^x$ and $P_{\text{latch}}^x$). The coupling between the preamplifier’s output and the latch is done through inverters consisting of transistors $N_{\text{inv}}^x$ and $P_{\text{inv}}^x$. Without transistors $N_{\text{rail}}^x$ and $P_{\text{rail}}^x$, the coupling inverters would have to be large/strong enough to pull the latch out of positive feedback saturation, but still small/weak enough not to firmly dictate the output voltage (because having a latch in that case would be pointless). Connecting these four field-effect transistors to the supply rails relaxes the latter requirement and consequently increases the design’s reliability and robustness.

Besides being fully complementary, the proposed asynchronous voltage comparator circuit with two pairs of inputs is also perfectly symmetrical with respect to the vertical and the horizontal axis in Fig. 3 and Fig. 4, respectively. This is the reason why the biasing transistors on each preamplifier stage are drawn separately. Symmetry implies beneficial repercussions on the process of laying the circuit out, as one can naturally match paired devices and the propagation delay through separate circuit blocks.

3 Circuit Analysis of the Comparator Architecture

Analysis of the proposed comparator topology can be accomplished by analyzing two of its subcomponents, namely the preamplifier and the latch.

3.1 Preamplifier

If the voltage drops across the biasing transistors are neglected, that is, if $v_{\text{down}}$ and $v_{\text{up}}$ are approximately at the supply rails, then the small-signal differential transfer function of the comparator’s preamplifier is just equal to that of the two cascaded push-pull inverters and hence it can be written as

$$
H_{\text{preamplifier}}(s) = \frac{V_{out}^{'+} - V_{out}^{'-}}{(V_{in1}^+ - V_{in1}^-) - (V_{in2}^+ - V_{in2}^-)}(s) = \frac{R'_{o}R''_{o}\,C'_{gd}C''_{gd}\,(s - g'_{m}/C'_{gd})(s - g''_{m}/C''_{gd})}{R'_{o}R''_{o}\,\zeta\,s^2 + \left[ R'_{o}\left( C''_{gd}(1 + g''_{m}R''_{o}) + C'_{gd} + C_{i2o1} \right) + R''_{o}(C''_{gd} + C_L) \right] s + 1}, \tag{1}
$$

where $g'_m = g'_{mN} + g'_{mP}$ and $g''_m = g''_{mN} + g''_{mP}$ are the total transconductances of the first and the second preamplifier’s stage inverter, respectively. $R'_o$ and $R''_o$ are the total resistances seen at the output of the first and at the output of the second preamplifier stage, while $C'_{gd} = C'_{gdN} + C'_{gdP}$ and $C''_{gd} = C''_{gdN} + C''_{gdP}$ are the sums of the gate-drain capacitances of the nMOS and pMOS of the first and the second preamplifier’s stage, respectively. For simplicity, the abbreviation

$$\zeta = C_L \left( C'_{gd} + C''_{gd} + C_{i2o1} \right) + C''_{gd} \left( C'_{gd} + C_{i2o1} \right)$$

is introduced, while $C_{i2o1}$ is the total capacitance at the output of the first and the input of the second preamplifier stage and $C_L$ is the total load capacitance at the output of the preamplifier or at the input of the latch.

It may be observed that the transfer function (1) in which $s = \sigma + i\omega$ is the complex angular frequency, is of the second order with two real left complex half-plane poles. It also possesses two real high frequency right complex half-plane zeroes at frequencies $z_1 = g'_m / C'_{gd}$ and $z_2 = g''_m / C''_{gd}$.
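The pole and zero locations can be checked numerically. The following is a minimal sketch that assumes the standard two-stage small-signal model, with denominator $R'_o R''_o \zeta s^2 + [R'_o(C''_{gd}(1 + g''_m R''_o) + C'_{gd} + C_{i2o1}) + R''_o(C''_{gd} + C_L)]s + 1$ and $\zeta = C_L(C'_{gd} + C''_{gd} + C_{i2o1}) + C''_{gd}(C'_{gd} + C_{i2o1})$. All component values and function names are hypothetical and chosen only for illustration; they are not taken from the fabricated design.

```python
import math

# Hypothetical small-signal values for illustration only (not from the paper).
GM1, GM2 = 5e-3, 5e-3        # g'_m, g''_m in siemens
RO1, RO2 = 1e3, 1e3          # R'_o, R''_o in ohms
CGD1, CGD2 = 2e-15, 2e-15    # C'_gd, C''_gd in farads
C_MID = 10e-15               # C_i2o1 in farads
C_LOAD = 10e-15              # C_L in farads


def denominator_coefficients():
    """Return (a2, a1, a0) of a2*s^2 + a1*s + a0 for the assumed model."""
    zeta = C_LOAD * (CGD1 + CGD2 + C_MID) + CGD2 * (CGD1 + C_MID)
    a2 = RO1 * RO2 * zeta
    a1 = RO1 * (CGD2 * (1 + GM2 * RO2) + CGD1 + C_MID) + RO2 * (CGD2 + C_LOAD)
    return a2, a1, 1.0


def poles():
    """Return (dominant, non-dominant) real poles in rad/s."""
    a2, a1, a0 = denominator_coefficients()
    disc = a1 * a1 - 4 * a2 * a0
    if disc < 0:
        raise ValueError("poles are complex for these component values")
    root = math.sqrt(disc)
    return (-a1 + root) / (2 * a2), (-a1 - root) / (2 * a2)


def zeros():
    """Return the two right half-plane zeros z1 = g'_m/C'_gd, z2 = g''_m/C''_gd."""
    return GM1 / CGD1, GM2 / CGD2


def dc_gain():
    """Low-frequency gain H(0) = g'_m R'_o g''_m R''_o."""
    return GM1 * RO1 * GM2 * RO2


p_dom, p_fast = poles()
print(f"poles [rad/s]: {p_dom:.3e}, {p_fast:.3e}")
print(f"zeros [rad/s]: {zeros()[0]:.3e}, {zeros()[1]:.3e}")
print(f"DC gain: {dc_gain():.1f}")
```

For these assumed values both poles are real and negative, and the zeros lie more than a decade above the faster pole, which is the situation in which neglecting the zeros is reasonable.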

The step response of the preamplifier can be predicted from its transfer function. If the effect of the two high-frequency zeroes, $z_1$ and $z_2$, is neglected and the dominant pole approximation is applied, the system’s step response may be written as

\[ v_{\text{out}}^+(t) - v_{\text{out}}^-(t) = \mathcal{L}^{-1}\left\{H_{\text{preamplifier}}(s)/s\right\} \approx G_{\text{preamplifier}} \left[ (v_{\text{in}1}^+ - v_{\text{in}1}^-) - (v_{\text{in}2}^+ - v_{\text{in}2}^-) \right] \left[ 1 - \kappa \exp\left(-t/\tau_A\right) \right] u(t), \tag{2} \]

where \( G_{\text{preamplifier}} \) and \( \tau_A \) are the preamplifier low frequency gain and time constant which is inversely proportional to the value of the dominant pole, \( \kappa \) is a constant dependent on coefficients of the polynomial found in the transfer function denominator, while \( u(t) \) and \( \mathcal{L}^{-1} \) represent the Heaviside step function and the inverse Laplace transform operator, respectively.
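To make the roles of $\kappa$ and $\tau_A$ concrete, the sketch below compares the exact step response of a two-pole system (zeros ignored) with the single-exponential form of the dominant pole approximation. It assumes two real negative poles $p_1$ (dominant) and $p_2$, for which partial-fraction expansion gives $\tau_A = -1/p_1$ and $\kappa = p_2/(p_2 - p_1)$. The pole values and function names are hypothetical, and the responses are normalized to the final value, i.e., to the low-frequency gain times the input difference.

```python
import math

# Hypothetical real, negative pole locations in rad/s (illustration only).
P_DOMINANT = -3.3e10
P_FAST = -1.9e11


def exact_step(t, p1=P_DOMINANT, p2=P_FAST):
    """Normalized step response of 1 / ((1 - s/p1)(1 - s/p2))."""
    if t < 0:
        return 0.0
    return (1.0
            + p2 / (p1 - p2) * math.exp(p1 * t)
            + p1 / (p2 - p1) * math.exp(p2 * t))


def dominant_pole_step(t, p1=P_DOMINANT, p2=P_FAST):
    """Single-exponential approximation 1 - kappa * exp(-t / tau_A)."""
    if t < 0:
        return 0.0
    kappa = p2 / (p2 - p1)   # weight of the slow exponential
    tau_a = -1.0 / p1        # time constant set by the dominant pole
    return 1.0 - kappa * math.exp(-t / tau_a)


for t_ps in (0, 10, 30, 100):
    t = t_ps * 1e-12
    print(f"t = {t_ps:3d} ps  exact = {exact_step(t):.4f}  "
          f"approx = {dominant_pole_step(t):.4f}")
```

The two curves differ only by the fast-pole term, so the approximation is poor right at $t = 0$ (it starts at $1 - \kappa$ rather than at zero) and becomes accurate once a few time constants of the faster pole have elapsed.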

3.2 Latch

If the initial voltage that is applied to the latch output nodes (through the preamplifier-latch coupling inverters) at specified time point \( t' \) is \( v_{\text{out}}^+(t') - v_{\text{out}}^-(t') \), then the time response of the linearized latch approximation on this initial condition (for \( t \geq t' \) and \( \Delta t = t - t' \)) has the form of an exponentially increasing function of time \( \Delta t \) and can be written as

\[ v_{\text{out}}^+(t) - v_{\text{out}}^-(t) = \exp(\Delta t/\tau_L) \left[ v_{\text{out}}^+(t') - v_{\text{out}}^-(t') \right]. \tag{3} \]

The time constant of the portrayed cross-coupled CMOS inverter latch is approximately equal to \( \tau_L \approx C/g_{mL} \), where \( C \) is the total capacitance seen at the output of the latch, i.e., comparator, while \( g_{mL} = g_{mNL} + g_{mPL} \) is the total transconductance of the latch complementary transistor pair. Note that this is a typical temporal response of positive-feedback systems which have a single or a dominant real right complex half-plane pole.
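As a rough illustration of the exponential regeneration, the sketch below evaluates $\tau_L \approx C/g_{mL}$ and the time the linearized latch needs to grow an initial differential voltage to a target level, $\Delta t = \tau_L \ln(v_{\text{target}}/v_{\text{initial}})$, which follows from inverting the latch response. The capacitance, transconductance and voltage values are hypothetical, as are the function names.

```python
import math


def latch_time_constant(c_total, gm_latch):
    """tau_L ~ C / g_mL for the cross-coupled inverter latch."""
    return c_total / gm_latch


def latch_output(v_initial, dt, tau_l):
    """Linearized latch response: initial voltage grown by exp(dt / tau_L)."""
    return v_initial * math.exp(dt / tau_l)


def regeneration_time(v_initial, v_target, tau_l):
    """Time for the differential voltage to grow from v_initial to v_target."""
    if v_initial <= 0 or v_target <= 0:
        raise ValueError("voltages must be positive magnitudes")
    return tau_l * math.log(v_target / v_initial)


# Hypothetical values: 20 fF total output capacitance, 2 mS latch transconductance.
tau_l = latch_time_constant(20e-15, 2e-3)
dt = regeneration_time(v_initial=0.1, v_target=1.0, tau_l=tau_l)
print(f"tau_L = {tau_l * 1e12:.1f} ps, regeneration time = {dt * 1e12:.1f} ps")
```

Because the growth is exponential, each tenfold increase of the differential voltage costs the same time, $\tau_L \ln 10$, which is why a larger initial voltage handed over by the preamplifier shortens the latch delay only logarithmically.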

4 Operating Principles of the Described Comparator

As already stated in the introduction, the basic idea behind the presented comparator is to combine the best aspects of the preamplifier, which is characterized by the negative exponential step response (2), with those of the latch, which has the positive exponential response (3). The preamplifier builds up the voltage to a certain point where the latch takes over and brings the signal to a rail.

These principles are illustrated in Fig. 5. In this figure, the preamplifier gain times the input voltage alone is not sufficient for the output to reach the rail. Nevertheless, the preamplifier achieves a high enough output value to pull the latch out of one saturation state and trigger its positive feedback loop, which drives the comparator to the saturation state at the other supply rail, thus producing a firm logical level (high or low) at the output.

With the total propagation delay through the comparator being the sum of the propagation delays of the cascaded components it consists of, namely,

$$ t_{\text{total}} = t_{\text{preamplifier}} + t_{\text{latch}}, \tag{4} $$

it is obvious that reducing the time constants of the separate comparator subcircuits ($\tau_A$ and $\tau_L$) is essential to increase its speed of operation. Additionally, it can be proven that there exists an optimum preamplifier-latch takeover point ($t_x, v_x$), located where the first derivatives of the preamplifier and the latch functions are equal. This is to be expected: for high-speed applications the comparator should be optimized such that, of the two subcomponent functions, the one with the larger first derivative is used for the corresponding part of the characteristic.
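The equal-derivative condition can be verified numerically. The sketch below assumes the linearized models only: a preamplifier output $V_\infty[1 - \kappa\exp(-t/\tau_A)]$ that hands over to a latch growing as $\exp(\Delta t/\tau_L)$ until a rail-level target is reached. Under these assumptions, equating the slopes gives the closed form $t_x = \tau_A \ln[\kappa(\tau_A + \tau_L)/\tau_A]$ and $v_x = V_\infty \tau_L/(\tau_A + \tau_L)$, which is compared against a brute-force search for the minimum total delay. All numerical values and names are hypothetical.

```python
import math

# Hypothetical values for illustration only (not measured or simulated data).
TAU_A = 30e-12      # preamplifier time constant [s]
TAU_L = 10e-12      # latch time constant [s]
KAPPA = 1.0         # constant of the step response
V_PRE_FINAL = 0.5   # preamplifier gain times input difference [V]
V_RAIL = 1.1        # output level at which the decision counts as made [V]


def preamp_output(t):
    """Preamplifier step response under the dominant pole approximation."""
    return V_PRE_FINAL * (1.0 - KAPPA * math.exp(-t / TAU_A))


def total_delay(t_x):
    """Preamplifier time t_x plus latch regeneration time from v(t_x) to V_RAIL."""
    return t_x + TAU_L * math.log(V_RAIL / preamp_output(t_x))


def optimum_closed_form():
    """Takeover point where preamplifier and latch slopes are equal."""
    t_x = TAU_A * math.log(KAPPA * (TAU_A + TAU_L) / TAU_A)
    v_x = V_PRE_FINAL * TAU_L / (TAU_A + TAU_L)
    return t_x, v_x


def optimum_brute_force(steps=20000, t_max=5 * TAU_A):
    """Grid search over the takeover time for the minimum total delay."""
    best_t, best_delay = None, math.inf
    for i in range(1, steps + 1):
        t = t_max * i / steps
        if preamp_output(t) <= 0:
            continue  # latch needs a positive initial voltage in this model
        delay = total_delay(t)
        if delay < best_delay:
            best_t, best_delay = t, delay
    return best_t, best_delay


t_cf, v_cf = optimum_closed_form()
t_bf, d_bf = optimum_brute_force()
print(f"closed form: t_x = {t_cf * 1e12:.2f} ps, v_x = {v_cf:.3f} V")
print(f"brute force: t_x = {t_bf * 1e12:.2f} ps, total = {d_bf * 1e12:.2f} ps")
```

Handing over earlier than this point wastes the steep initial slope of the preamplifier, while handing over later keeps the signal on a flattening exponential when the latch would already be faster.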

Apart from acceleration, another role of the latch block is to align the comparator’s complementary output fall-time and rise-time edges.

[Figure 7: oscilloscope screenshot. Panels: Buffers Only OUT + & - [V]; Buffers Only IN + & - [V]; Comparator OUT + & - [V]; Comparator IN2 + & - [V]; Comparator IN1 + & - [V]. Horizontal axis: Time Elapsed after the Fixed Moment in Time $t$ [ns].]

Fig. 7. Measured inputs and outputs of the on-chip structure containing asynchronous voltage comparator featuring two pairs of differential inputs with output drivers and the corresponding on-chip dummy comparator structure containing the output drivers alone.

5 On-Chip Measurement Setup for Propagation Delay

The output of the latch, which is at the same time the comparator output, has rail-to-rail swing and is hence designed to be followed by digital circuitry, which typically has a relatively low input capacitance compared with the pad capacitance. To measure the comparator characteristics in a realistic configuration, a chain of several inverters, which drives the pad capacitance and the 50 $\Omega$ measurement equipment, follows each of the comparator outputs as shown in Fig. 6. Both transistors in the last inverter are designed to have an on-resistance of $R_{on} = 50 \Omega$ to avoid reflections, thus halving the output signal amplitude to $V_{DD}/2$. For the same reason, all four inputs have 50 $\Omega$ on-chip termination to ground. To enable indirect delay measurement of the comparator, the output drivers are also placed on chip on their own, as explained by Fig. 6. Special attention is paid so that the metal lines routed to and from the comparator (with the output drivers) and the output drivers alone are identical in every aspect. This enables the use of identical printed circuit boards, identical coaxial cables and identical measurement equipment to drive and characterize both on-chip structures. Thus, the delay of the comparator is obtained as the difference between the delay of the structure with the comparator plus output buffers and the delay of the dummy structure containing the buffers only. This subtraction cancels the influence of coaxial cables, printed circuit board microstrip lines, on-chip metal lines, etc., which are identical for both measurements. Additionally, the output drivers are optimized for small propagation delay variation, the standard deviation of which is $\sigma_{\text{delay}} < 5 \text{ ps}$ based on one thousand Monte-Carlo simulations and a sample of ten relative on-chip measurements. Also, the comparator and the buffers have separate supply pads (i.e., analog and digital, respectively) to enable power consumption measurement of the comparator alone.
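The subtractive extraction can be summarized in a few lines. The sketch below models each on-chip structure as a shared path delay (cables, board traces, metal lines, output buffers) plus, for the structure under test, the comparator delay, and shows that the shared part cancels in the difference. The function names and all delay values are hypothetical placeholders, not measured data.

```python
import math


def measured_delay(shared_path_ps, comparator_ps=0.0):
    """Input-to-output delay of one structure: shared path plus optional comparator."""
    return shared_path_ps + comparator_ps


def extract_comparator_delay(with_comparator_ps, buffers_only_ps):
    """Comparator delay as the difference of the two measured structure delays."""
    return with_comparator_ps - buffers_only_ps


# Hypothetical comparator delay and three different shared-path delays, in ps.
comparator_ps = 80.0
for shared_ps in (500.0, 1234.5, 2000.0):
    dut = measured_delay(shared_ps, comparator_ps)
    dummy = measured_delay(shared_ps)
    print(shared_ps, extract_comparator_delay(dut, dummy))
```

The extracted value does not depend on the shared path delay, which is why the method requires that path to be identical for both structures; any mismatch between the two paths appears directly as an error in the extracted comparator delay.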

Measured inputs and outputs of the on-chip characterization structures depicted in Fig. 6, driven by a pseudorandom binary sequence signal with a frequency of 3.33 GHz, are shown in Fig. 7 in the form of an oscilloscope screenshot. It can be observed that the structure containing the buffers only is always driven with a rail-to-rail signal resembling the comparator outputs. The time difference between the outputs of the two structures yields the comparator propagation delay.
6 Measurement Results of the Proposed Comparator

Having in mind reasonable power consumption, the described comparator is optimized for speed and is fabricated in a standard 1P8M digital 40 nm low power multi-threshold CMOS process technology shrunk to 90% (minimum transistor gate length 36 nm). To allow latency and power to be optimized, the technology offers transistors with three different values of threshold voltage. Threshold voltages for the low-$V_T$ transistor types, which are used in the design to minimize propagation delay, are around $V_{Tn}/V_{Tp} \approx 0.33\,V/0.28\,V$, while the nominal supply voltage for the given process is $V_{DD} = 1.1\,V$.

The propagation delay of the comparator with two pairs of inputs, measured as described above, is lower than 100 ps for a 50 mVpp step applied at both of its differential inputs. The total power dissipation of the comparator under these circumstances equals 2.1 mW and is dominated by the preamplifier’s static consumption, that is, the DC current accounts for the major part of the comparator’s total power consumption.

The measured eye diagram of the comparator at 3.33 GHz, which was the limit of the stimulus equipment, is shown in Fig. 8. Based on the propagation delay measurements, however, the eye opening should be present up to 10 GHz.

The test chip photomicrograph is given in Fig. 9. The proposed four-input comparator implementation occupies an area of $39.2 \times 25.5\,\mu\text{m}^2$.
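The reported figures can be cross-checked with simple arithmetic. The sketch below converts the layout dimensions to mm$^2$, derives the average supply current from the reported power and supply voltage, and takes the reciprocal of the 100 ps delay bound. Reading the 10 GHz figure as this reciprocal is an assumption made here for illustration, not a statement from the measurements; the variable names are likewise illustrative.

```python
# Values quoted in the text.
WIDTH_UM, HEIGHT_UM = 39.2, 25.5   # layout dimensions [um]
POWER_W = 2.1e-3                   # total comparator power [W]
VDD_V = 1.1                        # nominal supply voltage [V]
DELAY_BOUND_S = 100e-12            # upper bound on propagation delay [s]

area_um2 = WIDTH_UM * HEIGHT_UM
area_mm2 = area_um2 * 1e-6         # 1 mm^2 = 1e6 um^2
supply_current_a = POWER_W / VDD_V
reciprocal_delay_hz = 1.0 / DELAY_BOUND_S   # assumed link to the 10 GHz figure

print(f"area = {area_um2:.1f} um^2 = {area_mm2:.4f} mm^2")
print(f"average supply current = {supply_current_a * 1e3:.2f} mA")
print(f"1 / delay bound = {reciprocal_delay_hz / 1e9:.1f} GHz")
```

The area rounds to the 0.001 mm$^2$ quoted in the abstract, and the average supply current is just under 2 mA.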

Fig. 9. Test chip photomicrograph. Abbreviations: (G) ground, (A) analog supply, (D) digital supply, (I) input, (O) output. Left – output buffers; Right – four-input comparator.
7 Conclusions

The article presents a prototype of a novel fully differential asynchronous comparator topology that features two pairs of inputs and is implemented in 40 nm LP CMOS technology. The comparator consists of a preamplifier-latch cascade and is completely self-biased, thus removing the need for a reference circuit and reducing the influence of PVT variations. The comparator propagation delay is extracted using a subtractive method which exploits on-chip dummy output driver structures. Measurements indicate that, depending on the actual input signal amplitude and common-mode level, the comparator can operate at frequencies beyond 10 GHz under a dissipation of 2.1 mW. Although both the comparator delay and its power consumption depend greatly on the input signal amplitude and common-mode value, this still places it among the fastest non-clocked comparators published to date. Finally, the proposed comparator circuit is well suited for implementation in cutting-edge system-on-chip (SoC) data transceivers and data converters.

Acknowledgements

The authors would like to express their gratitude to Lantiq A and Austrian BMVIT for their financial support of the FIT-IT project xPLC via FFG.

References

