# A 5.2 Gb/s Receiver for Next-Generation 8K Displays in 180 nm CMOS Process

Tianyu Wang, Da Wei, *Student Member, IEEE*, Ranick Ng, Gaurav Malhotra, Anup P. Jose, *Member, IEEE*, Amir Amirkhany, *Senior Member, IEEE*, and Pavan Kumar Hanumolu

Abstract—This article presents a high-speed receiver for next-generation 8K ultra-high-definition TVs. The receiver supports error-free communication between the timing controller and the display driver integrated circuits (DDIs) across various channels. Because the receiver must be co-integrated with pixel drivers in the DDI, it must be implemented in a process with high-voltage devices, which poses significant challenges in achieving beyond 5-Gb/s operation. We propose techniques for overcoming such process-induced speed limitations. They include a level-shifting passive continuous-time linear equalizer (CTLE), an active CTLE with extended bandwidth using a negative capacitor, a speculative decision feedback equalizer with a down-sampled edge-sampling path, and a low-dropout regulator with parallel error amplifiers to achieve all-band power supply rejection. A reference-less clock and data recovery circuit with a new frequency detector is also described. Fabricated in a 180-nm CMOS process, the prototype receiver operates at 5.2 Gb/s and can compensate up to 29-dB channel loss while consuming 120 mA from a 1.8-V supply.

*Index Terms*—Clock and data recovery (CDR), decision feedback equalizer (DFE), display drivers, low-dropout (LDO) regulator, passive continuous-time linear equalizer (CTLE), serial links, wide-panel displays.

# I. INTRODUCTION

Meeting the ever-increasing demand for TVs with higher resolution, color depth, and refresh rate requires high-throughput display driver integrated circuits (DDIs). A DDI receives high-speed data from the timing controller (TCON) and drives it to the pixel drivers [1]–[3]. In a typical 4K (UHD) TV, the throughput for each DDI is 3 Gb/s and is expected to increase to 5.2 Gb/s for the next-generation 8K (Quad-UHD) TVs. Designing a DDI that is capable of receiving such high-speed data is very challenging for multiple reasons. First, the DDI must be implemented in a low-cost technology with high-voltage devices, such as the 180-nm CMOS process, to integrate high-voltage pixel drivers and the high-speed receiver on the same IC. This constraint imposes severe speed bottlenecks that make it highly challenging to

Manuscript received 25 August 2021; revised 27 December 2021 and 14 February 2022; accepted 17 February 2022. Date of publication 18 March 2022; date of current version 25 July 2022. This article was approved by Associate Editor Yunzhi Dong. This work was supported by Samsung Electronics. (*Corresponding author: Tianyu Wang.*)

Tianyu Wang and Pavan Kumar Hanumolu are with the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA (e-mail: wang439@illinois.edu).

Da Wei, Ranick Ng, Gaurav Malhotra, Anup P. Jose, and Amir Amirkhany are with the Display America Lab, Samsung Electronics, San Jose, CA 95134 USA.

Color versions of one or more figures in this article are available at https://doi.org/10.1109/JSSC.2022.3155514.

Digital Object Identifier 10.1109/JSSC.2022.3155514

Fig. 1. Signal distribution across a wide-screen TV.

achieve the target data rate of >5 Gb/s. Second, large display panels require receivers that can operate with a wide range of channels. For instance, to distribute the data across a 105-in panel (see Fig. 1), the insertion loss (IL) could vary from 6 dB for the DDI located closest to the TCON to 29 dB for the DDI farthest from the TCON. Finally, power consumption must be low to avoid overheating, especially in the absence of a dedicated cooling system in the display panel.

We presented design techniques in [2] for achieving a high data rate by overcoming process speed limitations. Using a five-stage continuous-time linear equalizer (CTLE) and a quarter-rate five-tap decision feedback equalizer (QDFE), we could compensate up to 24-dB channel loss and achieve an excellent bit error rate (BER). However, the receiver consumed considerable power (138-mA current from a 1.8-V supply) and required a forwarded clock for data recovery. This article supplements the information provided in [2] and offers new design techniques that significantly improve performance. Specifically, we present a power-efficient two-stage CTLE that replaces the power-hungry five-stage CTLE in [2] and a clock and data recovery (CDR) circuit to enable embedded clocking. The prototype receiver, fabricated in a 180-nm CMOS process, achieves 5.2-Gb/s throughput subject to an IL of up to 29 dB while drawing 120-mA current from a 1.8-V supply. Compared to [2], the proposed receiver can tolerate significantly more loss without TX pre-emphasis and provide embedded clocking functionality without degrading the power efficiency.

The rest of this article is organized as follows. The proposed architecture, along with the equalizer details, is described in Section II. CDR architecture and its key building blocks are presented in Section III, and the experimental results obtained from the receiver prototype are provided in Section IV.





Fig. 2. Simplified block diagram of the proposed receiver.

# II. PROPOSED ARCHITECTURE

A simplified block diagram of the proposed receiver is shown in Fig. 2. The channel is terminated by an on-chip resistor and dc-coupled to the input of a two-stage CTLE, which is followed by a QDFE. The QDFE outputs are deserialized by a factor of 2 and then fed to the reference-less $2\times$ oversampling CDR circuit, which provides the sampling clocks to the QDFE. While the architecture is rather conventional, the main contributions, as described next, are in the circuit implementation of key building blocks (CTLE, QDFE, and CDR) to achieve >5-Gb/s data rate in a low-cost, low-$f_T$ 180-nm CMOS process.

#### A. Continuous-Time Linear Equalizer

The CTLE must provide an adequate high-frequency boost [4] to compensate for significant channel loss while driving a relatively large load presented by the QDFE. In addition, unique to the DDI, the CTLE must level-shift the relatively low transmitter output common-mode voltage (0.5 V) up to the relatively high input common-mode voltage (1.35 V) of the receiver. The discrepancy in the common-mode voltages stems from implementing the TCON in a relatively newer technology with a 1-V supply and the receiver in a 1.8-V 180-nm process for reasons described earlier. This issue can be addressed using a PMOS-based CTLE input stage [5] or ac coupling [6]. However, PMOS transistors are significantly slower than the already slow NMOS transistors, severely limiting the achievable high-frequency boost. On the other hand, ac coupling could not be employed without baseline wander correction because the data are not guaranteed to be dc-balanced. To this end, Hekmat et al. [2] employed an NMOS common-gate (CG) amplifier. Unfortunately, the CG amplifier was highly sensitive to process, voltage, and temperature (PVT) variations and degraded the S11 performance. Due to these challenges, we seek to perform level shifting using a passive CTLE stage.

The proposed CTLE is shown in Fig. 3. It is composed of a passive first stage (PCTLE) followed by an active second stage (ACTLE). Referring to the PCTLE portion in Fig. 3, the channel output common-mode voltage is shifted to about 1.35 V using resistors $R_1$ and $R_2$ ($R_1 = 2R_2 = 2000\ \Omega$). Because $R_1$ and $R_2$ are relatively large, the PCTLE has minimal impact on the S11 performance. Note that the PCTLE does



Fig. 3. Proposed CTLE schematic.

not limit the bandwidth of the receiver due to the feed-forward capacitor. The transfer function of the PCTLE is studied in [7] and made programmable by digitally switching a portion of $C_F$ from the feed-forward path to ground. Since the total capacitance at the output of the PCTLE remains constant and is independent of the $C_F$ setting, the output pole of the PCTLE remains fixed. As shown in Fig. 4(a), the PCTLE can provide up to 3-dB programmable peaking. Note that the PCTLE has the most impact on the overall CTLE response at low frequencies.

The second stage of the CTLE (Fig. 3) is a conventional RC-degenerated current-mode logic (CML)-based active CTLE (ACTLE) stage [7], [8]. It converts the passive CTLE output into a CML signal suitable for processing along the rest of the signal path. The value of $R_S$ is made programmable to tune the amount of peaking from 3.8 to 9.2 dB in the mid-band frequency range [Fig. 4(b)]. A major challenge in the design of the CTLE in [2] was to drive a large load while maintaining a sufficient bandwidth in a low-$f_T$ technology. Five stages were needed to meet these design requirements. The five stages also behave like a chain of buffers, with each stage larger than the previous stage. In this work, the loading is reduced by down-sampling the edge-sampling path in the DFE and by using negative capacitance to partially cancel the DFE input capacitance. Together, these measures reduce the number of stages while maintaining a sufficient bandwidth. The negative capacitor is implemented by degenerating a negative-$G_m$ stage with capacitor $C_N$ [9]. A CTLE followed by a negative capacitor was employed in [10]. However, Pan et al. [10] did not consider the impact of the bandwidth limitation of the cross-coupled pair on the negative capacitor's behavior. Ideally, if the bandwidth



Fig. 4. CTLE transfer function w/dc level normalized to 0 dB. (a) Passive CTLE code sweep. (b) ACTLE  $R_S$  sweep. (c) Negative capacitance sweep. (d) Entire programmable space.

of the negative-$G_m$ stage is larger than the CTLE's output pole frequency, the negative capacitor simply reduces the effective output capacitance and shifts the output pole to $1/(R_L(C_L - C_N))$ [11]. However, in our design, technology limitations caused the bandwidth of the negative-$G_m$ stage to be significantly lower than the ACTLE's output pole frequency. Under this condition, the CTLE's output impedance, $Z_{OUT}(s)$, can be calculated as

$$Z_{\text{OUT}}(s) = \frac{R_L\left(1 + \frac{s}{\omega_1}\right)}{1 + s\left(\frac{1}{\omega_1} + \frac{C_L - C_N}{C_L \omega_2}\right) + \frac{s^2}{\omega_1 \omega_2}} \quad (1)$$

where  $\omega_1 = (g_{mN}/C_N)$ ,  $\omega_2 = (1/R_LC_L)$ , and  $g_{mN}$  is the transconductance of the cross-coupled pair used to implement the negative- $G_m$  stage. Equation (1) shows that  $Z_{OUT}$  exhibits a second-order response with a damping factor  $\zeta$  given by

$$\zeta = \frac{1}{2}\left(\sqrt{\frac{\omega_2}{\omega_1}} + \frac{C_L - C_N}{C_L} \sqrt{\frac{\omega_1}{\omega_2}}\right). \quad (2)$$

Because  $\zeta$  plays a vital role in setting the magnitude response of  $Z_{OUT}$ , it has to be appropriately set to achieve the desired response. To this end,  $C_N$  is made digitally programmable, and  $\zeta$  is controlled by varying the number of negative capacitance taps shown in Fig. 3. Each tap can be enabled/disabled by turning on/off the current sources  $(I_N/I_P)$ . PMOS current sources  $(I_P)$  are added to provide all the current drawn by the NMOS tail current source in the negative capacitance stage so that enabling more taps of the negative capacitance stages does not result in drawing dc current from the ACTLE stage. Note that including the negative- $G_m$  stage in each of the programmable taps keeps  $\omega_1$  constant independent of the number of selected taps. Simulations indicate that the tunable negative capacitance can be used to adjust peaking in the high-frequency band ( $\approx 2.2 \text{ GHz}$ ) by up to 3 dB [see Fig. 4(c)]. Due to the separate knobs available for tuning the frequency



Fig. 5. DFE architecture.

response, CTLE can be optimized to compensate for various channel loss profiles. The entire programmable space of the CTLE transfer functions is shown in Fig. 4(d), where only the extreme codes (0 and 3) for each tuning knob are shown for brevity.
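As an illustration of (1) and (2), the following Python sketch evaluates the damping factor and $|Z_{OUT}|$ as negative-capacitance taps are enabled. The component values (`R_L`, `C_L`, `C_TAP`, `GM_TAP`) are placeholders chosen only for illustration and are not taken from the prototype. Each tap is assumed to add both capacitance and transconductance, so that $\omega_1$ stays fixed as described above.

```python
import math

# Hypothetical values for illustration only (not from the prototype).
R_L = 200.0       # ACTLE load resistance, ohms
C_L = 400e-15     # load capacitance at the ACTLE output, farads
C_TAP = 80e-15    # negative capacitance added per enabled tap, farads
GM_TAP = 0.5e-3   # cross-coupled-pair transconductance per tap, siemens


def corner_frequencies():
    """Return (w1, w2) in rad/s; w1 = g_mN / C_N does not depend on the tap count."""
    return GM_TAP / C_TAP, 1.0 / (R_L * C_L)


def damping_factor(n_taps):
    """Damping factor of Z_OUT from (2), with C_N = n_taps * C_TAP."""
    w1, w2 = corner_frequencies()
    k = (C_L - n_taps * C_TAP) / C_L
    return 0.5 * (math.sqrt(w2 / w1) + k * math.sqrt(w1 / w2))


def z_out(f_hz, n_taps):
    """Complex output impedance from (1) at frequency f_hz."""
    w1, w2 = corner_frequencies()
    k = (C_L - n_taps * C_TAP) / C_L
    s = 1j * 2.0 * math.pi * f_hz
    return R_L * (1 + s / w1) / (1 + s * (1 / w1 + k / w2) + s * s / (w1 * w2))


if __name__ == "__main__":
    for n in range(4):
        mag_db = 20 * math.log10(abs(z_out(2.2e9, n)) / R_L)
        print(f"taps={n}  zeta={damping_factor(n):.3f}  "
              f"|Z_OUT|/R_L at 2.2 GHz = {mag_db:.2f} dB")
```

With these placeholder values, enabling more taps lowers $\zeta$ and raises $|Z_{OUT}|$ near 2.2 GHz, which is the qualitative behavior the tunable negative capacitance is meant to provide. With zero taps, (1) collapses to the single-pole response $R_L/(1 + s/\omega_2)$.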

#### B. Decision Feedback Equalizer

Fig. 5 shows the block diagram of the proposed QDFE [2]. The interleaving factor of four (quarter rate) is dictated by the power consumption considerations in the clock distribution network. The QDFE contains four lanes (DQ<sub>0</sub>, DQ<sub>1</sub>, DQ<sub>2</sub>, and DQ<sub>3</sub>) for data sampling and one lane (DQ<sub>X</sub>) for edge sampling. The data path in each lane consists of a speculative first tap and four direct-feedback taps, while the edge path consists of only the speculative first tap. The feedback coefficients are annotated as  $\alpha_i$  in the figure, and their values are set using registers that were written using a scan chain. To meet the critical timing constraints associated with the first ( $h_1$ ) and second post-cursor ( $h_2$ ) taps, a bulk-biased CML slicer and a merged mux/latch (MuxL) are adopted in the data lanes, and their details are discussed in the following.

The stringent timing constraint associated with DFE's first tap is alleviated using speculation [12]. Simulations indicate that this constraint could not be met using a sense-amplifier flip-flop (SAFF)-based slicer because its delay exceeds 1 UI in our process. Thus, the slicers are implemented using an active-inductor-loaded CML latch topology [see Fig. 6(a)]. The speed penalty incurred by the additional differential pair used to set the slicer threshold ( $\alpha_1$ ) [13] is mitigated by setting the threshold using the bulk potential of the input NMOS differential pair [14], [15]. Simulations show that bulk biasing can set the slicer threshold voltage in a range of ±110 mV and reduce the slicer delay by approximately 20%. While this improvement is significant, the bulk bias poses the risk of unintentional forward biasing of the body–source junction of the input devices. In [14], such risk is avoided by carefully



Fig. 6. Slicer implementation details. (a) CML latch with active inductor loads and bulk biasing for setting threshold voltage equal to post-cursor ISI. (b) Bulk bias voltage generation circuit.

sizing the devices. However, this is highly sensitive to PVT variations. In this work, we ensured that the bulk voltage can never exceed the source voltage,  $V_{\text{TAIL}}$ , using the bulk-bias generation circuit shown in Fig. 6(b). The bulk-bias generation circuit employs a negative feedback loop to ensure  $V_{\text{DUMMY}} = V_{\text{TAIL}}$ . Consequently, even if all the current is steered to one side, the bulk-bias voltage remains below the source voltage.

The delay of the data-selection mux also turned out to be a significant bottleneck in meeting the speculative DFE timing constraint in our process. The delay of the mux, implemented using a PMOS switch pair, is typically much smaller than the latch delay and can often be ignored. However, the slow PMOS transistors in our process significantly increase the mux delay, making it a significant fraction of the available delay budget. To alleviate this issue, the mux and the latch are merged to form a MuxL [16], as shown in Fig. 7. When $CK_N$ is high, the tail current is steered to the input differential pair, and when $CK_P$ is high ($CK_N$ is low), the current is steered to the cross-coupled pair, which regenerates the output to full swing. Because the selection is accomplished by current steering, the MuxL alleviates the speed penalty of PMOS devices.

Ideally, edge samples must also be equalized to nearly the same extent as data samples for low-jitter operation [17]. However, this requires a complex power-hungry multi-tap edge-DFE (XDFE). Therefore, we employed two techniques that simplify the design of XDFE and reduce its power consumption: 1) XDFE was implemented with only one speculative tap and 2) edge samples are taken only once every 4 UI. The power savings compared to the full-rate edge-sampling scheme is about 78.8 mW. If the power savings from reduced DFE



Fig. 7. Schematic of the merged MuxL.



Fig. 8. CDR architecture.

loading were to be included, the estimated power saving can be as much as 91.8 mW. The impact of these design choices on the CDR will be quantified in Section III.

# III. CLOCK AND DATA RECOVERY

The block diagram of the CDR circuit is shown in Fig. 8. Using the eight data samples $(D_0 - D_7)$ and two edge samples $(X_0/X_4)$ provided by the DFE, the CDR performs reference-less clock recovery. The mode-select signal controls the CDR's mode of operation between frequency-locking loop (FLL) and phase-locking loop (PLL). In the FLL mode, UP/DN signals generated by the frequency detector (FD) drive the charge-pump (CP)-based integrator whose output is buffered by a low-dropout (LDO) regulator and used as the supply voltage to the voltage-controlled oscillator (VCO). The FLL brings the VCO's oscillation frequency within the PLL's pull-in range. In the PLL mode, UP/DN signals generated by the phase detector (PD) drive the CP-based integrator and the VCO, thus implementing integral and proportional control, respectively. Driving the VCO directly with early/late (E/L) signals minimizes loop latency and decreases dithering jitter. The VCO is implemented using a four-stage CMOS inverter-based ring oscillator whose frequency can be tuned through its supply voltage and the varactors present at the delay stage outputs. The design details of the important building blocks of



Fig. 9. Illustration of the concept behind frequency detection logic.

the CDR, namely, the FD, PD, VCO, LDO, and the circuitry that interfaces the CP with the LDO, are described next. The impact of bang-bang phase detector (BBPD) quantization error, mismatches in the proportional path, and CP current leakage on the CDR's consecutive identical digit (CID) performance was analyzed and found to have no significant effect.

#### A. Frequency Detector

Frequency detection is performed by leveraging the 12-UI clock pattern training sequence that is defined in the interface protocol. Referring to Fig. 9, the FD's operating principle is based on detecting transitions between data samples $[D_0, D_2]$, $[D_2, D_4]$, and $[D_4, D_6]$. The corresponding digital circuit is shown in Fig. 10. An absence of a transition indicates that the VCO is running fast, while the presence of more than one transition means that the VCO is running slow. Theoretically, there is no upper bound on the magnitude of the frequency error that can be detected. On the other hand, the lower limit of the detection range is $f_{\text{VCO,init}} > (\text{Data rate}/20)$, where $f_{\text{VCO,init}}$ is the initial VCO frequency. When $f_{\text{VCO,init}} = (\text{Data rate}/20)$, the clock samples the data every 5 UI. Because the FD takes every other bit as input $(D_0, D_2, D_4, \text{ and } D_6)$, the effective sampling period of the FD is 10 UI. The sampled data exhibit a repeating 000111 pattern at this sampling period, which erroneously indicates that the frequency error is zero. Ideally, the FD output is zero when the frequency error is zero. However, in practice, the slicers could make erroneous data decisions when the sampling clock is close to the data transition edge. Even though such errors would have close to zero mean (i.e., statistically an equal number of slow/fast decisions), the mismatch between the up and down current sources in the CP can cause the FLL to lock with a residual frequency offset. Therefore, it is important to minimize the current mismatch in the CP to ensure that the recovered frequency is within the pull-in range of the PLL. The "mode-select" signal in Fig. 8 was provided externally using the scan chain. In practice, lock-detect (LD) and loss-of-lock-detect (LOLD) functions are needed to automatically switch between the FLL and PLL modes [18].
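The transition-counting rule described above can be sketched in Python as follows. This is a behavioral illustration, not the gate-level circuit of Fig. 10. It assumes that the 12-UI clock pattern consists of six 1s followed by six 0s, that sampling is ideal (no slicer errors), and that `ui_per_sample` is the spacing between adjacent data samples in UI (1.0 at the correct frequency, larger when the VCO is slow). The function names are hypothetical.

```python
def pattern_bit(t_ui):
    """Training pattern value at time t_ui (assumed: six 1s then six 0s, 12-UI period)."""
    return 1 if (t_ui % 12.0) < 6.0 else 0


def fd_decision(d0, d2, d4, d6):
    """Return -1 if the VCO looks fast, +1 if it looks slow, 0 otherwise."""
    transitions = (d0 != d2) + (d2 != d4) + (d4 != d6)
    if transitions == 0:
        return -1   # no transition: samples are too close together (VCO fast)
    if transitions > 1:
        return +1   # more than one transition: samples are too far apart (VCO slow)
    return 0


def mean_fd_output(ui_per_sample, n_phases=1200):
    """Average FD decision over uniformly spaced starting phases of the pattern."""
    total = 0
    for k in range(n_phases):
        t0 = 12.0 * (k + 0.5) / n_phases
        # D0, D2, D4, D6 are every other data sample.
        samples = [pattern_bit(t0 + 2 * i * ui_per_sample) for i in range(4)]
        total += fd_decision(*samples)
    return total / n_phases


if __name__ == "__main__":
    for r in (0.9, 1.0, 1.1, 5.0):
        print(f"UI per sample = {r}: mean FD output = {mean_fd_output(r):+.3f}")
```

Under these assumptions, the average output is negative when the VCO is fast, positive when it is slow, and zero at the correct frequency. It is also zero at `ui_per_sample = 5.0`, which corresponds to the aliasing case at Data rate/20 discussed above.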

#### B. Phase Detector

Phase detection is performed with a conventional Alexander BBPD [19]. Using the eight data bits  $(D_0-D_7)$  and two



Fig. 10. Schematic of the FD.

edge samples $(X_0/X_4)$ produced by the data-DFE and the reduced-complexity XDFE, respectively, the PD produces two sets of E/L signals every 8 UI. As described earlier, partial cancellation of ISI and down-sampling help lower the power and hardware complexity associated with phase detection but could detrimentally impact the CDR's jitter performance. Simulations indicate that the residual ISI present in the $h_{2.5}$-to-$h_{5.5}$ taps of the edge sample amounts to only 20% of the total ISI power, i.e., $\sum_{i=2.5}^{5.5} h_i^2 \approx 0.2 \sum_{i=1.5}^{\infty} h_i^2$. Consequently, reducing the number of post-cursor taps in the XDFE from 5 to 1 resulted in only a slight increase in the ISI-induced recovered clock jitter.

We now quantify the impact of down-sampling the edge samples by calculating its effect on the PD gain, $K_{PD}$. Typically, a sub-rate BBPD performs phase detection using $N$ edge samples and the corresponding data samples. It outputs $N$ E/L decisions that are converted into a three-level PD output $(\pm 1, 0)$ using majority voting. The number of edge samples considered by the PD, $N$, is equal to 8 for the conventional case and equals 2 when edge samples are down-sampled by four. The BBPD output (PD<sub>OUT</sub>) can be represented as

$$\text{PD}_{\text{OUT}} = \text{sign}\left(\sum_{m=0}^{N-1} T_{m}\right) \quad (3)$$

where  $T_m$  denotes the *m*th E/L decision and can take one of the three values: +1, -1, and 0, indicating early, late, and no transition with probabilities of  $P(T_m = 1) = 0.5p$ ,  $P(T_m = -1) = 0.5(1 - p)$ , and  $P(T_m = 0) = 0.5$ , respectively, where *p* ranges from 0 to 1 and its value is determined by the sampling phase. Assuming that  $T_m$  is an independent identically distributed random variable, the mean of the PD output is equal to

$$\begin{aligned}
E[\text{PD}_{\text{out}}] &= P\left(\sum_{m=0}^{N-1} T_m > 0\right) - P\left(\sum_{m=0}^{N-1} T_m < 0\right) \\
&= \sum_{i=1}^{N} P\left(\sum_{m=0}^{N-1} T_m = i\right) - \sum_{i=1}^{N} P\left(\sum_{m=0}^{N-1} T_m = -i\right) \quad (4)
\end{aligned}$$

where $P((\sum_{m=0}^{N-1} T_m) = i)$ can be calculated using the multinomial distribution as shown in the following for the



Fig. 11. MATLAB-simulated mean PD output versus input phase error.

case i = 6 and N = 8 (no down-sampling):

$$\begin{aligned}
P\left(\sum_{m=0}^{7} T_{m} = 6\right) &= \frac{8!}{7!\,1!}P(T_{m} = 1)^{7}P(T_{m} = -1)^{1} + \frac{8!}{6!\,2!}P(T_{m} = 1)^{6}P(T_{m} = 0)^{2} \\
&= 8(0.5p)^{7}\bigl(0.5(1-p)\bigr) + 28(0.5p)^{6}(0.5)^{2}. \quad (5)
\end{aligned}$$

The ratio of the PD gains without  $(K_{PD,1 \text{ UI}})$  and with  $(K_{PD,4 \text{ UI}})$  down-sampling can be expressed as

$$\begin{aligned}
\frac{K_{\text{PD,1 UI}}}{K_{\text{PD,4 UI}}} &= \frac{dE[\text{PD}_{\text{out,1 UI}}]/d\phi}{dE[\text{PD}_{\text{out,4 UI}}]/d\phi} \\
&= \frac{dE[\text{PD}_{\text{out,1 UI}}]/dp}{dE[\text{PD}_{\text{out,4 UI}}]/dp} = 2.1 \text{ when } p = 0.5. \quad (6)
\end{aligned}$$

Equation (6) indicates that down-sampling reduces PD gain by a factor of 2.1 when CDR is locked (p = 0.5).
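The ratio in (6) can be checked numerically with a short Python sketch. It builds the distribution of $\sum T_m$ by repeated convolution, using the probabilities $0.5p$, $0.5(1-p)$, and $0.5$ stated above, and estimates $dE[\text{PD}_{\text{out}}]/dp$ with a central difference. The function names and the step size `dp` are illustrative choices, and the i.i.d. assumption is the same one made in the analysis.

```python
def sum_distribution(p, n):
    """Distribution of the sum of n i.i.d. E/L decisions T_m in {+1, -1, 0}."""
    step = {+1: 0.5 * p, -1: 0.5 * (1.0 - p), 0: 0.5}
    dist = {0: 1.0}
    for _ in range(n):
        nxt = {}
        for total, prob in dist.items():
            for t, pt in step.items():
                nxt[total + t] = nxt.get(total + t, 0.0) + prob * pt
        dist = nxt
    return dist


def mean_pd_output(p, n):
    """E[PD_out] = P(sum > 0) - P(sum < 0), as in (4)."""
    dist = sum_distribution(p, n)
    positive = sum(q for s, q in dist.items() if s > 0)
    negative = sum(q for s, q in dist.items() if s < 0)
    return positive - negative


def pd_gain(n, p=0.5, dp=1e-4):
    """Central-difference estimate of dE[PD_out]/dp at the given p."""
    return (mean_pd_output(p + dp, n) - mean_pd_output(p - dp, n)) / (2.0 * dp)


if __name__ == "__main__":
    ratio = pd_gain(8) / pd_gain(2)
    print(f"K_PD ratio (N = 8 versus N = 2) at p = 0.5: {ratio:.2f}")
```

The same `sum_distribution` helper reproduces (5): for any $p$, the entry for a sum of 6 with $N = 8$ equals $8(0.5p)^7(0.5(1-p)) + 28(0.5p)^6(0.5)^2$.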

A time-domain model of the CDR, which includes imperfections such as the phase noise of the VCO and clock distribution network and the CDR loop latency, was developed to verify the accuracy of the above analysis and understand the impact of reduced PD gain on CDR jitter performance. Simulation results obtained from the model and shown in Fig. 11 indicate that the simulated $K_{PD,1 UI}$-to-$K_{PD,4 UI}$ ratio is slightly higher (2.3 compared to the calculated value of 2.1) because the above analysis does not account for the residual recovered clock jitter. The simulated recovered clock jitter (Fig. 12) shows that partial edge equalization and down-sampling of edge samples increase recovered clock jitter by only 0.2%UI and 0.7%UI, respectively. The overall increase in the recovered clock jitter is 0.9%UI, which is small enough to make the significant power savings offered by the reduced-complexity XDFE attractive.

#### C. Low-Dropout Regulator

The high supply sensitivity of the CMOS inverter-based VCO significantly degrades the CDR's jitter performance. An LDO is typically used to shield the VCO from supply perturbations and improve the CDR's immunity to supply noise.



Fig. 12. Simulated histograms of recovered clock jitter.

Because the CDR's performance is most sensitive to the VCO's supply noise near its bandwidth, the LDO must provide adequate supply noise rejection in the vicinity of the CDR's bandwidth $(\sim 2 \text{ MHz})$ [20]. However, achieving such wideband power supply rejection (PSR) with conventional LDOs is challenging. To elucidate this issue further, consider the traditional LDO shown in Fig. 13(a). Typically, such an LDO is stabilized by making the pole at the output of the error amplifier (EA), $\omega_{p1}$, dominant, which results in poor PSR at high frequencies [21]. On the other hand, making the pole at the LDO output, $\omega_{po}$, dominant results in superior high-frequency PSR at the expense of increased power consumption and area [21]. In this case, $\omega_{p1}$ must be several times higher than the unity gain frequency ($\omega_{\text{UGF}}$) of the loop to ensure adequate phase margin. Satisfying this stability criterion ($\omega_{\text{UGF}} < \omega_{p1}$) requires significant reduction of the EA's gain. Consequently, low loop bandwidth and small EA gain combined with the large gate capacitance of the pass transistor $(M_P)$ limit the PSR to about -30 dB even with a quiescent current of 10 mA and 100-pF output capacitance ($C_{OUT}$). Unfortunately, this level of PSR is inadequate to sufficiently suppress the VCO's supply noise.

Given the above drawbacks, we propose the LDO shown in Fig. 13(b) that significantly improves the PSR (-40 dB at 2 MHz). The proposed EA is composed of two amplifiers, a low-gain wide-bandwidth EA (EA<sub>ORIG</sub>) and a high-gain low-bandwidth EA (EA<sub>ADD</sub>), that operate in parallel. EA<sub>ORIG</sub> is equivalent to the EA used in a conventional LDO, and EA<sub>ADD</sub> is the auxiliary amplifier that provides the additional gain needed to improve PSR. A unity-gain buffer is added to shield the EAs from the large gate capacitance of the pass device $(M_P)$. Since the buffer's output impedance is low, the pole at the gate node of the pass transistor, $\omega_{p3}$, is pushed to high frequency and will be ignored in the subsequent analysis. Output voltage $V_{\text{SUP}}$ is level shifted by $V_{\text{SHFT}}$ before feeding it back to the EAs to ensure that both the VCO and the CP are optimally biased. Further details are provided later in Section III-D. Note that all the simulation results shown include the level



Fig. 13. (a) Conventional LDO. (b) Proposed LDO.

shifter. Modeling  $EA_{ORIG}$  and  $EA_{ADD}$  as single-pole stages, the transfer function of the EA can be written as

$$\begin{aligned}
A_{\rm EA}(s) &= \frac{A_{\rm ORIG}}{1 + \frac{s}{\omega_{p1}}} + \frac{A_{\rm ADD}}{1 + \frac{s}{\omega_{p2}}} \\
&= \frac{(A_{\rm ORIG} + A_{\rm ADD}) + s(A_{\rm ADD}/\omega_{p1} + A_{\rm ORIG}/\omega_{p2})}{\left(1 + \frac{s}{\omega_{p1}}\right)\left(1 + \frac{s}{\omega_{p2}}\right)} \quad (7)
\end{aligned}$$

where $A_{\text{ORIG}}/A_{\text{ADD}}$ and $\omega_{p1}/\omega_{p2}$ represent the dc gain and pole frequencies of EA<sub>ORIG</sub>/EA<sub>ADD</sub>, respectively. The Bode magnitude plot of the EAs, output pass-transistor stage, and the complete LDO loop is shown in Fig. 14. At dc, the loop gain is equal to $(A_{\text{ORIG}} + A_{\text{ADD}})A_{\text{PASS}} \approx A_{\text{ADD}}A_{\text{PASS}}$, which is significantly larger than the loop gain of a conventional LDO. For $\omega > \omega_{p2}$, the loop gain rolls off at 20 dB/decade until the frequency $\omega_z = (A_{ADD}/A_{ORIG})\omega_{p2}$, at which point the gain of EA<sub>ADD</sub> falls below that of EA<sub>ORIG</sub>. As a result, the loop gain stays fixed at $A_{\text{ORIG}}A_{\text{PASS}}$ until $\omega > \omega_{\text{po}}$. For $\omega > \omega_{\text{po}}$, the loop gain again exhibits a first-order roll-off and crosses unity gain (0 dB) at $\omega_{UGF}$. Based on this discussion, the stability of the proposed LDO can be guaranteed by ensuring $\omega_{p2} < \omega_z < \omega_{po} < \omega_{UGF} < \omega_{p1}$. When $\omega_z \ll \omega_{UGF}$, the phase margin of the proposed LDO is the same as that of the conventional design, while the low-frequency PSR is greatly improved. To ensure this, EA<sub>ADD</sub> and EA<sub>ORIG</sub> are designed for small and large gain-bandwidth (GBW) products, respectively. Consequently, EA<sub>ADD</sub> can be designed with negligible power penalty. The simulated PSR of the conventional and proposed LDOs is shown in Fig. 15. The proposed LDO achieves better than -40 dB PSR at 2 MHz ($\approx$ CDR bandwidth) while consuming less than 10 mA with $C_{OUT}$ < 100 pF.
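The behavior of (7) can be illustrated with a small Python sketch that evaluates the parallel-amplifier gain and compares the exact zero of the numerator of (7) with the approximation $\omega_z \approx (A_{ADD}/A_{ORIG})\omega_{p2}$. The gains and pole frequencies below are hypothetical placeholders chosen so that $A_{ADD} \gg A_{ORIG}$ and $\omega_{p1} \gg \omega_{p2}$; they are not the prototype's values.

```python
import math

# Hypothetical values for illustration only (not from the prototype).
A_ORIG = 10.0     # dc gain of the low-gain, wide-bandwidth amplifier
A_ADD = 1000.0    # dc gain of the high-gain, low-bandwidth amplifier
F_P1 = 500e6      # pole frequency of EA_ORIG, Hz
F_P2 = 10e3       # pole frequency of EA_ADD, Hz


def ea_gain(f_hz):
    """Complex gain of the two parallel single-pole amplifiers, per (7)."""
    s = 1j * 2.0 * math.pi * f_hz
    w_p1 = 2.0 * math.pi * F_P1
    w_p2 = 2.0 * math.pi * F_P2
    return A_ORIG / (1 + s / w_p1) + A_ADD / (1 + s / w_p2)


def zero_exact_hz():
    """Zero frequency obtained from the numerator of (7)."""
    return (A_ORIG + A_ADD) / (A_ADD / F_P1 + A_ORIG / F_P2)


def zero_approx_hz():
    """Approximation f_z = (A_ADD / A_ORIG) * f_p2 used in the text."""
    return (A_ADD / A_ORIG) * F_P2


if __name__ == "__main__":
    print(f"dc gain             : {abs(ea_gain(0.0)):.0f}")
    print(f"zero (exact)        : {zero_exact_hz() / 1e6:.3f} MHz")
    print(f"zero (approximation): {zero_approx_hz() / 1e6:.3f} MHz")
    print(f"|A_EA| at 20 MHz    : {abs(ea_gain(20e6)):.2f}")
```

For these placeholder numbers, the gain starts at $A_{ORIG} + A_{ADD}$ at dc and flattens near $A_{ORIG}$ above the zero, which matches the piecewise description of the loop gain given above.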



Fig. 14. Gain Bode plot of error amplifiers, pass-transistor output stage, and the complete LDO.



Fig. 15. Simulated PSR plots of the conventional and proposed LDOs.

EA<sub>ORIG</sub> and EA<sub>ADD</sub> are implemented using conventional five-transistor and folded-cascode amplifier stages, respectively. A critical challenge in the circuit realization of the proposed LDO is implementing the wide-bandwidth summer that adds the EAs' outputs. We overcome this challenge using the circuit shown in Fig. 16, in which the EA<sub>ORIG</sub>/EA<sub>ADD</sub> outputs are summed implicitly by feeding the output of EA<sub>ADD</sub> to the gate of the tail current source of EA<sub>ORIG</sub>. Because the gain from V<sub>TAIL</sub> to V<sub>EA</sub> is $A_{V,TAIL} \approx -(g_{m3}/2g_{m4,5})$, by superposition, the total EA gain equals $A_{EA} = A_{ORIG} + A_{ADD}A_{V,TAIL}$. Note that the biasing of EA<sub>ORIG</sub> is unconventional. The sizes of $M_3$, $M_{4/5}$, and $M_P$ (pass device) must be chosen carefully to ensure that $M_3$ is biased in saturation. Because the gain of EA<sub>ADD</sub> is large, the LDO can be simplified to Fig. 17, where EA<sub>ORIG</sub> is represented by its half-circuit. $M_3$ will operate in saturation if

$$V_{\rm OV,3} < V_{\rm DS,3} = V_{\rm REF} - V_{\rm THN} - V_{\rm OV,1/2}. \quad (8)$$

By choosing $(W/L)_{M3} = 2(W/L)_{M1/2}$ and $V_{OV,3} = V_{OV,1/2}$, (8) can be written as

$$V_{\rm OV,3} < \frac{V_{\rm REF} - V_{\rm THN}}{2}. \quad (9)$$

Because  $M_{4,5}$  and  $M_P$  form a current mirror, LDO's negative feedback forces  $V_{\text{TAIL}}$  such that

$$I_{M3} = f(V_{\text{OV},3}) = I_{\text{MP}} \cdot \frac{\left(\frac{W}{L}\right)_{M4,5}}{\left(\frac{W}{L}\right)_{\text{MP}}}. \quad (10)$$



Fig. 16. Schematic of the EA.



Fig. 17. LDO schematic simplified for dc biasing.

Using (9) and (10), choosing the dimensions as shown in the following ensures that  $M_3$  is biased in saturation

$$\left(\frac{W}{L}\right)_{M3} > \frac{I_{\rm MP} \cdot \dfrac{(W/L)_{M4,5}}{(W/L)_{\rm MP}}}{\dfrac{1}{2}\mu_N C_{\rm OX} \left(\dfrac{V_{\rm REF} - V_{\rm THN}}{2}\right)^2}. \quad (11)$$

#### D. LDO-CP Interface

The integral control voltage, $V_{CP}$, can be used as the reference voltage of the LDO, as shown in Fig. 8. However, the desired nominal supply voltage of the VCO of 1.4 V is relatively high (i.e., $V_{CP} = 1.4$ V in steady state), which makes it very difficult to match the PMOS and NMOS current sources of the CP. Dynamic changes in the VCO supply voltage needed to compensate for noise- and temperature-induced frequency variations exacerbate the CP mismatch issue. Another challenge (not directly related to LDO or CP design) is the large capacitor needed to guarantee an adequate CDR phase margin. For instance, to achieve a CDR bandwidth of 2 MHz with a phase margin of 70°, a C<sub>CP</sub> of 875 pF is required, even with a relatively small CP current of 5 $\mu$A, $K_{PD} = 1.6/\text{rad}$, and $K_{\rm VCO} = 1$ GHz/V. Both the CP current mismatch and large capacitor area issues are addressed using the circuit shown in Fig. 18. The output voltage of the LDO, $V_{SUP}$, is equal to

$$\begin{aligned}
V_{\text{SUP}} &= V_{\text{REF}} + I_{\text{SHFT}} R_{\text{SHFT}} \\
&\approx V_{\text{REF}} + \left(I_B + G_{\text{MS}}(V_{\text{CP}} - V_{\text{REF,CP}})\right) R_{\text{SHFT}}
\end{aligned} \tag{12}$$

where  $G_{\text{MS}}$  is the transconductance of the differential pair. The tail current (2 $I_B$ ), reference voltage ( $V_{\text{REF,CP}}$ ), and resistor ( $R_{\text{SHFT}}$ ) are chosen such that  $V_{\text{CP}}$  is forced close to 0.5 V<sub>DD</sub> in steady state, thus minimizing the dynamic current mismatch in the CP. Because the interface circuit also scales the effective VCO gain by approximately  $G_{\text{MS}}R_{\text{SHFT}}$ , the size of  $C_{\text{CP}}$  can be reduced accordingly. In the prototype,  $C_{\text{CP}}$  was reduced by almost 40% by making  $G_{\text{MS}}R_{\text{SHFT}} = 0.6$ .

Fig. 18. Schematic of the LDO-CP interface.
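The capacitor sizing quoted in this subsection can be approximated with a simple linearized loop model. The sketch below is an assumption, not the authors' stated analysis: it models the loop gain as $L(s) = 2\pi K_{\text{PD}}\left(\Delta f_{\text{PROP}}/s + I_{\text{CP}}K_{\text{VCO}}/(C_{\text{CP}}s^2)\right)$, takes the unity-gain frequency as set by the proportional path alone, and estimates the phase margin as $\arctan(\omega_u/\omega_z)$. The function name `ccp_required` and the symbol $\Delta f_{\text{PROP}}$ (proportional-path frequency step) are introduced here for illustration.

```python
import math


def ccp_required(bw_hz, pm_deg, i_cp, k_pd, k_vco):
    """Estimate the integral-path capacitor for a target bandwidth and phase margin.

    Assumed model (not from the paper): crossover is set by the proportional
    path, bw_hz ~= k_pd * df_prop, and the phase margin is atan(w_u / w_z),
    where w_z = i_cp * k_vco / (c_cp * df_prop) is the loop-filter zero.
    """
    df_prop = bw_hz / k_pd                        # proportional frequency step [Hz]
    w_u = 2.0 * math.pi * bw_hz                   # unity-gain frequency [rad/s]
    w_z = w_u / math.tan(math.radians(pm_deg))    # zero needed for the target PM
    return i_cp * k_vco / (df_prop * w_z)


# Values quoted in the text: 2-MHz bandwidth, 70-degree phase margin,
# 5-uA CP current, K_PD = 1.6/rad, K_VCO = 1 GHz/V.
c_direct = ccp_required(2e6, 70.0, 5e-6, 1.6, 1e9)

# With the interface, the integral path sees K_VCO scaled by G_MS * R_SHFT = 0.6.
c_interface = ccp_required(2e6, 70.0, 5e-6, 1.6, 0.6 * 1e9)

print(f"C_CP without interface: {c_direct * 1e12:.0f} pF")
print(f"C_CP with interface:    {c_interface * 1e12:.0f} pF")
print(f"reduction:              {100.0 * (1.0 - c_interface / c_direct):.0f} %")
```

Under these assumptions the estimate lands close to the 875 pF quoted above, and scaling the effective VCO gain by 0.6 shrinks the capacitor by the same factor, consistent with the reported reduction of almost 40%.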

# E. Voltage-Controlled Oscillator

Fig. 19(a) shows the schematic of the four-stage pseudodifferential ring VCO. The delay cell is implemented using two CMOS inverters that are coupled by feed-forward resistors ( $R_F$ ), as shown in Fig. 19(b), to reduce the common-mode gain and ensure differential oscillation [22]. The delay is tuned by varying the load capacitance of the inverters. To this end, two sets of capacitors,  $C_{TUNE}$  and  $C_{PROP}$ , are implemented.  $C_{TUNE}$  is used to tune the VCO's free-running frequency to bring it within the pull-in range of the CDR.  $C_{PROP}$  is switched ON/OFF by the E/L signals of the PD, thus implementing the proportional control portion of the proportional–integral loop filter.

It is desirable to make the proportional path gain programmable to adjust the CDR bandwidth and optimize the jitter performance [23]. Such bandwidth tuning can be achieved by making  $C_{\text{PROP}}$  digitally programmable. However, the minimum  $C_{\text{PROP}}$  step size that can be achieved in our process (due to the minimum size limitation) resulted in a VCO frequency step of  $\Delta f_{\text{OSC}} \approx 1$  MHz, which translates to a change in CDR bandwidth of  $\Delta BW_{\text{CDR}} = 2\pi K_{\text{PD}} \Delta f_{\text{OSC}} \approx 1.6$  MHz. This coarse bandwidth tuning resolution was not adequate and had to be improved. To this end, a 5× capacitor is added, as shown in Fig. 19(c), such that the load capacitance is switched between 1× and 5/6×, resulting in a 6× improvement in tuning resolution. The VCO consumes 10 mA, and the spot phase noise at 1-MHz frequency offset is −102 dBc/Hz.

Fig. 19. (a) Four-stage pseudo-differential VCO. (b) Schematic of the delay cell. (c) Load capacitor implementation.

Fig. 20. Die photograph of the receiver.
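The 6× resolution improvement follows from simple capacitor arithmetic. The sketch below assumes that the 5× capacitor is placed in series with the unit capacitor when switched in, which is one reading of the 1× to 5/6× description, and that the VCO frequency step scales linearly with the switched capacitance step. Both are assumptions made for illustration; the bandwidth step is expressed in hertz as $K_{\text{PD}}\,\Delta f_{\text{OSC}}$, which matches the 1.6 MHz quoted above.

```python
from fractions import Fraction


def series(c1, c2):
    """Equivalent capacitance of two capacitors in series."""
    return c1 * c2 / (c1 + c2)


K_PD = 1.6          # linearized PD gain [1/rad], from the text
DF_UNIT_HZ = 1e6    # VCO frequency step for one unit capacitor, from the text

unit = Fraction(1)                     # normalized unit capacitor (1x)
c_low = series(unit, 5 * unit)         # 1x in series with 5x gives 5/6x
step_fraction = unit - c_low           # switched capacitance step, 1/6x
improvement = unit / step_fraction     # resolution improvement factor

# Assumes the frequency step is proportional to the capacitance step.
bw_step_coarse_hz = K_PD * DF_UNIT_HZ
bw_step_fine_hz = bw_step_coarse_hz / float(improvement)

print(f"low-state capacitance: {c_low} of the unit capacitor")
print(f"resolution improvement: {improvement}x")
print(f"bandwidth step: {bw_step_coarse_hz / 1e6:.2f} MHz -> {bw_step_fine_hz / 1e6:.2f} MHz")
```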

## **IV. EXPERIMENTAL RESULTS**

The prototype receiver was fabricated in a 180-nm CMOS process, and its micrograph is shown in Fig. 20. It occupies an active area of 0.75 mm<sup>2</sup> of which CTLE, DFE, and CDR occupy 0.0875, 0.158, and 0.475 mm<sup>2</sup>, respectively. The receiver was characterized using a four-layer printed circuit board (PCB) and different channels with varying amounts of IL (see Fig. 21). All the reported results were obtained at a data rate of 5.2 Gb/s unless otherwise specified.

The most lossy channel measured (Fig. 22) closely mimics the worst case channel in a real display panel. Its IL at Nyquist is 29 dB, and it consists of two source PCBs and three flat-flexible connectors (FFCs). The total PCB length is 27 in and the connector length is 4.74 in. The least lossy channel consists of only one FFC, and its loss is 6 dB. The CDR starts in the FLL mode and acquires frequency lock while operating with a 12-UI clock-like data pattern. The accuracy of the FLL is quantified by the measured recovered clock spectrum shown in Fig. 23, which indicates that the residual frequency error is less than 1 MHz. Upon frequency acquisition, the CDR is switched to the PLL mode, and the BER was measured with different pseudo-random bit sequences (PRBSs) provided by an external BER tester without any transmitter pre-emphasis.

![](_page_8_Figure_9.jpeg)

Fig. 21. IL of the least and most lossy channels.

![](_page_8_Picture_11.jpeg)

Fig. 22. Setup of the channel.

![](_page_8_Figure_13.jpeg)

Fig. 23. Spectra of the recovered clock.

![](_page_9_Figure_1.jpeg)

Fig. 24. Phase noise plot of the recovered clock.

![](_page_9_Figure_3.jpeg)

Fig. 25. JTOL measured under different channel loss conditions.

Without pre-setting the DFE coefficients, the CDR locked with a BER of  $10^{-4}$ . With proper DFE settings, the receiver achieves error-free operation (BER <  $10^{-12}$ ) under all conditions (PRBS7, PRBS15, and PRBS31 data), where the DFE tap weights are found by manually sweeping the code. Ideally, the optimal setting would be determined by plotting BER versus the tap coefficient. However, measuring very low BER is very time-consuming. Thus, each DFE tap is set to the average of the two codes that achieved BER =  $10^{-10}$ .
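The measurement-time argument and the tap-centering rule can be illustrated with a short sketch. It uses the standard zero-error confidence bound $N = -\ln(1 - CL)/\text{BER}$ for the number of error-free bits needed, evaluated at a 95% confidence level as an assumption, and it interprets "the two codes that achieved BER = $10^{-10}$" as the lowest and highest codes of a sweep that meet that threshold. The sweep data below are hypothetical and are not measurements from the prototype.

```python
import math

DATA_RATE_BPS = 5.2e9  # data rate from the text


def bits_for_confidence(ber, confidence):
    """Error-free bits needed to claim the BER is below `ber` at `confidence`."""
    return -math.log(1.0 - confidence) / ber


def tap_midpoint(codes, bers, threshold):
    """Average of the lowest and highest tap codes whose BER meets the threshold."""
    passing = [code for code, ber in zip(codes, bers) if ber <= threshold]
    if not passing:
        raise ValueError("no tap code meets the BER threshold")
    return (min(passing) + max(passing)) / 2.0


# Time per measurement point at the two BER levels mentioned in the text.
for target in (1e-10, 1e-12):
    seconds = bits_for_confidence(target, 0.95) / DATA_RATE_BPS
    print(f"BER < {target:.0e}: about {seconds:.1f} s per point")

# Hypothetical sweep of one DFE tap (illustrative numbers only).
codes = [0, 1, 2, 3, 4, 5, 6, 7]
bers = [1e-3, 1e-6, 1e-10, 1e-12, 1e-12, 1e-10, 1e-5, 1e-2]
print("tap setting:", tap_midpoint(codes, bers, 1e-10))
```

Under this bound, each point at the $10^{-12}$ level takes roughly one hundred times longer than a point at $10^{-10}$, which is why sweeping every tap code at the lower level is impractical. A non-integer midpoint would need to be rounded to an available code.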

The measured phase noise plot of the recovered clock is shown in Fig. 24, and the integrated jitter (10 kHz–30 MHz) is 5.73 ps. The jitter tolerance (JTOL) curves (BER threshold of  $10^{-10}$ , PRBS31 data, 95% confidence level) measured under different IL conditions are shown in Fig. 25. The JTOL corner frequency is about 3 MHz, which is close to the expected jitter transfer bandwidth. The measured total receiver power consumption is 216 mW. Because the entire receiver operates from a single voltage supply, we do not have a measured power breakdown. In post-layout simulations, the CTLE, DFE, and CDR consume 21.6, 131.4, and 63 mW, respectively. Table I summarizes the receiver performance. Compared to [2], the prototype receiver can operate with a higher loss channel and provides embedded clocking functionality without degrading the power efficiency.

TABLE I Performance Summary of the Receiver

|                             | This Work | ISSCC 16 [2] |
|-----------------------------|-----------|--------------|
| Technology [nm]             | 180       | 180          |
| Data Rate [Gb/s]            | 5.2       | 6            |
| Supply [V]                  | 1.8       | 1.8          |
| Area [mm<sup>2</sup>]       | 0.75      | 0.225        |
| Power Efficiency [pJ/b]     | 41.5      | 41.4         |
| Insertion Loss [dB]         | 6-29      | 24           |
| Clocking Scheme             | Embedded  | Forwarded    |
| JTOL Corner Freq. [MHz]     | 3         | NA           |
| JTOL floor [%UI]            | 14        | NA           |
| Recovered Clock Jitter [ps] | 5.73      | NA           |

The metrics reported for [2] included only AFE and DFE.
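The reported power figures are mutually consistent, as the following short check shows. It uses only numbers stated in the article (120 mA from a 1.8-V supply, a 5.2-Gb/s data rate, and the simulated per-block breakdown); the variable names are introduced here for illustration.

```python
SUPPLY_V = 1.8           # supply voltage [V]
SUPPLY_A = 0.120         # total supply current [A]
DATA_RATE_BPS = 5.2e9    # data rate [b/s]

# Post-layout simulated power per block [mW], as quoted in the text.
breakdown_mw = {"CTLE": 21.6, "DFE": 131.4, "CDR": 63.0}

total_mw = SUPPLY_V * SUPPLY_A * 1e3
efficiency_pj_per_bit = (total_mw * 1e-3) / DATA_RATE_BPS * 1e12
simulated_sum_mw = sum(breakdown_mw.values())

print(f"total power:      {total_mw:.1f} mW")
print(f"power efficiency: {efficiency_pj_per_bit:.1f} pJ/b")
print(f"simulated sum:    {simulated_sum_mw:.1f} mW")
for block, power_mw in breakdown_mw.items():
    print(f"  {block}: {100.0 * power_mw / simulated_sum_mw:.1f} % of the total")
```

The DFE accounts for the largest share of the simulated power, roughly three-fifths of the total.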

#### REFERENCES

- [1] Y.-H. Kim, T. Lee, H.-K. Jeon, D. Lee, and L.-S. Kim, "An input data and power noise inducing clock jitter tolerant reference-less digital CDR for LCD intra-panel interface," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 64, no. 4, pp. 823–835, Apr. 2017.
- [2] M. Hekmat *et al.*, "A 6 Gb/s 3-tap FFE transmitter and 5-tap DFE receiver in 65 nm/0.18 μm CMOS for next-generation 8K displays," in *IEEE ISSCC Dig. Tech. Papers*, Jan. 2016, pp. 402–403.
- [3] Y. Lee *et al.*, "12-Gb/s over four balanced lines utilizing NRZ braid clock signaling with no data overhead and spread transition scheme for 8K UHD intra-panel interfaces," *IEEE J. Solid-State Circuits*, vol. 54, no. 2, pp. 463–475, Feb. 2019.
- [4] P. S. Sahni, S. C. Joshi, N. Gupta, and G. S. Visweswaran, "An equalizer with controllable transfer function for 6-Gb/s HDMI and 5.4-Gb/s DisplayPort receivers in 28-nm UTBB-FDSOI," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 24, no. 8, pp. 2803–2807, Aug. 2016.
- [5] T. Musah *et al.*, "A 4–32 Gb/s bidirectional link with 3-tap FFE/6-tap DFE and collaborative CDR in 22 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 49, no. 12, pp. 3079–3090, Dec. 2014.
- [6] M. Hossain and A. C. Carusone, "A 14-Gb/s 32 mW AC coupled receiver in 90-nm CMOS," in *Proc. IEEE Symp. VLSI Circuits*, Jun. 2007, pp. 32–33.
- [7] P. K. Hanumolu, G.-Y. Wei, and U.-K. Moon, "Equalizers for high-speed serial links," *Int. J. High Speed Electron. Syst.*, vol. 15, no. 2, pp. 429–458, 2005.
- [8] K. Jung, Y. Lu, and E. Alon, "Power analysis and optimization for high-speed I/O transceivers," in *Proc. IEEE 54th Int. Midwest Symp. Circuits Syst. (MWSCAS)*, Aug. 2011, pp. 1–4.
- [9] V. S. Kshatri, J. M. C. Covington, J. W. Shehan, T. P. Weldon, and R. S. Adams, "Capacitance and bandwidth tradeoffs in a cross-coupled CMOS negative capacitor," in *Proc. IEEE Southeastcon*, Apr. 2013, pp. 1–4.
- [10] Q. Pan, Y. Wang, Y. Lu, and C. P. Yue, "An 18-Gb/s fully integrated optical receiver with adaptive cascaded equalizer," *IEEE J. Sel. Topics Quantum Electron.*, vol. 22, no. 6, pp. 361–369, Nov. 2016.
- [11] H. Kim and C. Yoo, "A 6-Gb/s wireline receiver with intrapair skew compensation and three-tap decision-feedback equalizer in 28-nm CMOS," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 28, no. 5, pp. 1107–1117, May 2020.
- [12] R. S. Kajley, P. J. Hurst, and J. E. C. Brown, "A mixed-signal decision-feedback equalizer that uses a look-ahead architecture," *IEEE J. Solid-State Circuits*, vol. 32, no. 3, pp. 450–459, Mar. 1997.
- [13] A. Abidi and H. Xu, "Understanding the regenerative comparator circuit," in *Proc. IEEE Custom Integr. Circuits Conf.*, Sep. 2014, pp. 1–8.
- [14] S. Babayan-Mashhadi and R. Lotfi, "An offset cancellation technique for comparators using body-voltage trimming," *Anal. Integr. Circuits Signal Process.*, vol. 73, no. 3, pp. 673–682, Dec. 2012.

- [15] S. A. Z. Murad, A. Harun, M. M. Ramli, M. N. Md Isa, and R. Sapawi, "High-speed low power CMOS comparator using forward body bias technique in 0.13  $\mu$ m technology," *J. Telecommun., Electron. Comput. Eng.*, vol. 10, pp. 25–29, Jul. 2018.
- [16] W.-Y. Tsai, C.-T. Chiu, J.-M. Wu, S. S. H. Hsu, and Y.-S. Hsu, "A novel low gate-count pipeline topology with multiplexer-flip-flops for serial link," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 59, no. 11, pp. 2600–2610, Nov. 2012.
- [17] K. L. J. Wong, E. H. Chen, and C. K. K. Yang, "Edge and data adaptive equalization of serial-link transceivers," *IEEE J. Solid-State Circuits*, vol. 43, no. 9, pp. 2157–2169, Sep. 2008.
- [18] G. Shu *et al.*, "A 4-to-10.5 Gb/s continuous-rate digital clock and data recovery with automatic frequency acquisition," *IEEE J. Solid-State Circuits*, vol. 51, no. 2, pp. 428–439, Feb. 2016.
- [19] B. Razavi, Clock Recovery From Random Binary Signals. Hoboken, NJ, USA: Wiley, 1996, pp. 242–243.
- [20] A. Elshazly, R. Inti, W. Yin, B. Young, and P. K. Hanumolu, "A 0.4-to-3 GHz digital PLL with PVT insensitive supply noise cancellation using deterministic background calibration," *IEEE J. Solid-State Circuits*, vol. 46, no. 12, pp. 2759–2771, Dec. 2011.
- [21] A. Arakali, S. Gondi, and P. K. Hanumolu, "Low-power supply-regulation techniques for ring oscillators in phase-locked loops using a split-tuned architecture," *IEEE J. Solid-State Circuits*, vol. 44, no. 8, pp. 2169–2181, Aug. 2009.
- [22] G. Shu *et al.*, "A reference-less clock and data recovery circuit using phase-rotating phase-locked loop," *IEEE J. Solid-State Circuits*, vol. 49, no. 4, pp. 1036–1047, Apr. 2014.
- [23] J.-Y. Lee, J.-H. Yoon, and H.-M. Bae, "A 10-Gb/s CDR with an adaptive optimum loop-bandwidth calibrator for serial communication links," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 61, no. 8, pp. 2466–2472, Aug. 2014.

![](_page_10_Picture_10.jpeg)

**Tianyu Wang** received the B.S. and M.S. degrees from the University of Illinois at Urbana–Champaign, Urbana, IL, USA, in 2015 and 2018, respectively, where he is currently pursuing the Ph.D. degree.

His research interest is power-efficient wireline serializer/deserializer (SerDes) systems.

![](_page_10_Picture_13.jpeg)

**Da Wei** (Student Member, IEEE) received the B.S. degree (Hons.) and the M.S. and Ph.D. degrees in electrical and computer engineering from the University of Illinois at Urbana–Champaign, Urbana, IL, USA, in 2012, 2014, and 2019, respectively.

He is currently a Senior Analog Circuit Design Engineer with Samsung Semiconductor, Inc., San Jose, CA, USA. His research interests are energy-efficient high-speed wireline communication systems and fast transient response power converters.

![](_page_10_Picture_16.jpeg)

**Ranick Ng** received the M.Eng. degree in electrical engineering from the University of Oxford, Oxford, U.K., in 1998, and the Ph.D. degree in semiconductor devices physics from the University of Cambridge, Cambridge, U.K., in 2002.

His current interests include CMOS analog, high-speed serial links, and transceiver design.

![](_page_10_Picture_19.jpeg)

**Gaurav Malhotra** received the M.S. degree in electrical engineering from Penn State, State College, PA, USA, in 2005. His thesis was focused on error correction codes for 10G-BaseT Ethernet.

Since 2006, he has been working on physical layer system architecture for Ethernet, serializer/deserializer (SerDes), and display technologies, developing communication and signal processing schemes for equalization, echo and noise cancellation, clock recovery, and error correction. He is currently working as a Principal Engineer with the Display America Lab, Samsung Electronics, San Jose, CA, USA. He has published over ten articles and holds more than 30 patents.

![](_page_10_Picture_23.jpeg)

**Anup P. Jose** (Member, IEEE) received the B.Tech. degree from IIT Madras, Chennai, India, in 2001, and the M.S. and Ph.D. degrees from Columbia University, New York, NY, USA, in 2003 and 2006, respectively, all in electrical engineering. His doctoral dissertation was focused on low-latency, low-power interconnects for on-chip networks.

He has held various design engineering positions at AMD, Boxborough, MA, USA, K-Micro (Megachips), San Jose, CA, USA, and Xilinx, San Jose, from 2006 to 2015 with experience in multiple aspects of serial link design for numerous interfaces, including FBDIMM, DDR2/3, PON, eDP, and gigabit transceivers for field-programmable gate arrays (FPGAs). Since 2015, he has been with the ASIC Group, Samsung Display America Labs (SDAL), San Jose, as the Analog Design/Chip Lead for multiple research chips focusing on high-speed interfaces for displays along with display driving and sensing. He is currently the Director of the Analog Team, Samsung High Speed Interface Lab. He has authored or coauthored over 15 IEEE publications.

Dr. Jose was a co-recipient of the 2005 Best Paper Award at the European Solid-State Circuits Conference (ESSCIRC) and the 2014 Best Paper Award at the Custom Integrated Circuits Conference (CICC).

![](_page_10_Picture_28.jpeg)

**Amir Amirkhany** (Senior Member, IEEE) received the B.S. degree from the Sharif University of Technology, Tehran, Iran, in 1999, the M.Sc. degree from the University of California at Los Angeles, Los Angeles, CA, USA, in 2001, and the Ph.D. degree from Stanford University, Stanford, CA, USA, in 2008, all in electrical engineering.

He is currently the Head of the Samsung High Speed Interface Lab, Samsung Electronics, San Jose, CA, USA. In this role, he is in charge of high-speed interface research and development for display and camera applications. Prior to Samsung Electronics, he was with Rambus Inc., Sunnyvale, CA, USA, where he led the development of proprietary high-speed memory interfaces. He has over 25 IEEE publications and over 50 issued patents.

Dr. Amirkhany was a recipient of several IEEE and Samsung best paper awards.

![](_page_10_Picture_33.jpeg)

**Pavan Kumar Hanumolu** received the Ph.D. degree from the School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA, in 2006.

He was a Faculty Member with Oregon State University until 2013. He is currently a Professor with the Department of Electrical and Computer Engineering, University of Illinois at Urbana–Champaign, Urbana, IL, USA. His research interests are energy-efficient integrated circuit implementation of wireline communication systems, analog and digital signal processing, sensor interfaces, and power conversion.

Dr. Hanumolu has served as a Technical Program Committee Member of the International Solid-State Circuits Conference, Custom Integrated Circuits Conference, and VLSI Circuits Symposium. He is the Editor-in-Chief of the IEEE JOURNAL OF SOLID-STATE CIRCUITS.