# A Hybrid-PLL (ADPLL/Charge-Pump PLL) Using Phase Realignment With 0.6-us Settling, 0.619-ps Integrated Jitter, and -240.5-dB FoM in 7-nm FinFET

Tsung-Hsien Tsai, Ruey-Bin Sheen, Chih-Hsien Chang, Kenny Cheng-Hsiang Hsieh, and Robert Bogdan Staszewski, *Fellow, IEEE*

Abstract-All-digital PLLs (ADPLLs) based on a ring oscillator (RO) provide very fast settling, but they suffer from quantization noise due to the discrete tuning of their digitally controlled oscillator (DCO). Although RO charge-pump PLLs (CP-PLLs) do not exhibit quantization noise thanks to their continuous VCO tuning, they are quite slow and require a huge VCO gain to cover the frequency drift due to temperature variations. Further, in CP-PLLs, the reset pulse of the phase detector (PD) must be wide for proper PLL functioning, but this sets a lower limit on the reference spurs. We propose a hybrid PLL in 7-nm FinFET CMOS that combines the advantages of the ADPLL and the CP-PLL. We introduce periodic phase realignment by the reference clock and an ultrashort pulse for resetting the PD. The hybrid PLL covers 0.2-4 GHz and settles in 0.6 us. It achieves a low reference spur of -52 dBc in the conventional mode, and an integrated jitter of 1.05 ps and 0.62 ps in the conventional and realignment modes, respectively.

Index Terms-All-digital PLL (ADPLL), charge-pump PLL (CP-PLL), fast settling, hybrid PLL, realignment and injection locking, reference spur.

## I. INTRODUCTION

Wide-tuning-range ring-oscillator (RO)-based PLLs are widely used in DDR, SoC, and SerDes applications. Modern products push for lower power consumption, smaller die size, and higher operating frequency. These requirements can be satisfied by advanced process technology [1], which provides a lower supply voltage, larger device drive current, and smaller devices than mature technology. Although conventional RO-based ADPLLs have numerous advantages over their analog charge-pump (CP) counterparts in advanced technology [2], [3], they still have the following limitations. The step-size resolution of the digitally controlled oscillator (DCO) must be very fine, which is quite challenging to meet; otherwise, its quantization noise will degrade the ADPLL's jitter performance. Further, ADPLLs employ a deterministic sequencer to switch from frequency acquisition to phase tracking [2], [3]. If the allocated acquisition interval happens to be insufficient, the PLL might run out of range during phase tracking. Hence, conventional ADPLLs need hundreds of reference cycles to complete the frequency acquisition. Although RO-type CP-PLLs avoid quantization noise thanks to the continuous tuning of a VCO, they suffer from two issues.

1) The effective voltage tuning range of the VCO becomes very narrow (<0.3 V) because of the lower supply voltage, which causes a huge VCO gain ($K_{VCO}$). In general, a 4-GHz frequency tuning range requires a $K_{VCO}$ of more than 13 GHz/V within a 0.3-V voltage range. A huge VCO gain degrades the jitter and reference-spur performance when the PLL suffers from a noisy supply or ground.

2) Long acquisition times are caused by a limited PLL bandwidth (BW).

Manuscript received May 18, 2020; revised July 1, 2020; accepted July 13, 2020. Date of publication July 20, 2020; date of current version August 4, 2020. This article was approved by Associate Editor Shuo-Wei (Mike) Chen. The work of Tsung-Hsien Tsai was supported in part by the Science Foundation Ireland under Grant 14/RP/I2921. (Corresponding author: Tsung-Hsien Tsai.)

Tsung-Hsien Tsai is with Taiwan Semiconductor Manufacturing Company, Hsinchu 300, Taiwan, and also with University College Dublin, Dublin 4, Ireland (e-mail: thtsaic@tsmc.com).

Ruey-Bin Sheen, Chih-Hsien Chang, and Kenny Cheng-Hsiang Hsieh are with Taiwan Semiconductor Manufacturing Company, Hsinchu 300, Taiwan.

Robert Bogdan Staszewski is with the School of Electrical and Electronic Engineering, University College Dublin, Dublin 4, Ireland.

Digital Object Identifier 10.1109/LSSC.2020.3010278

To meet the stability requirements, the PLL-BW must be smaller than one-tenth of the reference clock frequency (FREF). In contrast, traditional ADPLLs [2], [3] can provide the shortest frequency acquisition times because the difference between the outputs of the reference accumulator and the variable accumulator (called the phase error, PHE) is large at the beginning and thus produces a large frequency jump. When the RO frequency is close to the target, the PHE becomes small, so the frequency jumps become small as well. In general, modern products call for a PLL locking time of <3 us [4]. Furthermore, in traditional CP-PLLs, a wide reset pulse in the phase detector (PD) is adopted to guarantee proper PLL function, but the current mismatch between the CP's head and tail current sources then increases the reference spurs. A digital self-calibration algorithm of the VCO center frequency was presented in [5] to reduce the VCO gain. It uses a binary search to find the target band, but it needs many reference cycles to complete the calibration. Furthermore, the digital coarse tuning of the VCO in [5] is composed of a voltage-to-current converter (V2I), a current multiplier, and a current-controlled oscillator (ICO). Hence, the VCO gain ($K_{VCO}$) varies with the tuning word, which causes the PLL bandwidth to vary excessively over the wide tuning range.

In this letter, we propose a hybrid PLL architecture that *dynamically* combines the digital manner of frequency acquisition in ADPLLs with the fine continuous phase tracking in CP-PLLs. It provides better overall performance: faster locking, better power supply rejection, lower oscillator sensitivity to noise, and digital relocking in case of running out of the tracking range, while avoiding the oscillator's quantization noise. In this PLL, the digital control loop of a counter-based ADPLL is used for fast frequency acquisition, after which a CP-PLL is seamlessly switched in for fine phase tracking (see Fig. 1). Hence, it combines the advantages of the ADPLL and the CP-PLL to reach a fast tracking time (<40 reference cycles) while being free from DCO quantization noise. This PLL also includes a block called the tracking trend detector. It automatically hands over from digital acquisition to analog phase tracking at the earliest possible opportunity, when the instantaneous oscillating frequency fluctuates around the desired target (see Fig. 2). The switchover back to digital coarse tuning is also done automatically in case of running out of the tracking range. Benefiting from the hybrid architecture, the proposed hybrid RO (see Fig. 3) uses collaborative digital and analog tuning controls. It obtains over an octave of tuning range with a low analog RO gain ($K_{RO}$) of merely 3 GHz/V. Moreover, a low $K_{RO}$ means that the voltage-to-current device can be smaller, which reduces the gate leakage current and improves the reference spur as well.

The proposed PLL is realized in a 7-nm FinFET process, whose devices exhibit excellent $g_m$ and $I_{ON}$ [1], thus allowing it to be clocked at a much higher reference frequency (FREF) while consuming less power and producing a shorter inverter delay. The reference spurs are improved by means of an ultrashort reset pulse in the PD and faster switching devices in the charge pump. However, the low-frequency device noise becomes worse with process technology migration [6].

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/



Fig. 1. Hybrid PLL. (a) Block diagram. (b) Tracking behavior with the trend-detector. (c) Tracking behavior using the conventional solution.

The realignment technique is introduced into this PLL to suppress the in-band noise of the PLL and improve the jitter performance.

## II. ARCHITECTURE AND CIRCUIT DESIGN

Fig. 1(a) shows the overall hybrid-PLL architecture. It consists of a digitally controlled frequency acquisition path and an analog-controlled phase-tracking path, jointly tuning the hybrid RO. The former includes a supply voltage divider. The latter comprises a PD, a CP, an RC loop filter (RC-LF), a feedback divider ($\div N$), and a short-pulse generator (RL-Pulse-GEN) for the phase realignment. During frequency acquisition (TRK = 0), the voltage divider feeds the VCO<sub>IN</sub> input of the hybrid RO with $V_{DD}/2$ while the analog-controlled loop is disabled by breaking the connection between the CP and the RC-LF. The digital frequency detector compares the phases of FREF and CKV and produces PHE, which is then filtered and normalized as NTW. The RO controller with gain normalization converts NTW to a thermometer code for frequency tuning with a precisely controlled bandwidth ($= \alpha \cdot \mathrm{FREF}/2\pi$, $\alpha = 2^{-3} \sim 1$) [3]. The optimum switchover is decided by the tracking trend detector. The PD and the feedback divider operate throughout so that they respond quickly to the acquisition/tracking switchover. At the moment of switchover, the FREF and FBK clocks are roughly phase aligned, as shown in Fig. 1(b). During phase tracking, TRK = 1 turns off the voltage divider and enables the analog-controlled loop (the path from the CP to the RC-LF) to produce the continuous tuning voltage (VCO<sub>IN</sub>). At the same time, the latch of the RO controller becomes opaque, which disconnects the digital loop while maintaining the NTW state. The overall locking time, including frequency acquisition and phase tracking, is only 0.6 us. Benefiting from this architecture, the frequency tuning range of the VCO in the phase-tracking mode can be narrow, and thus the V2I device receiving VCO<sub>IN</sub> can be small to reduce the leakage current.

Moreover, to efficiently reduce the leakage current on the VCO<sub>IN</sub> path, the capacitor of the RC-LF is implemented with MOM capacitors. Because the device channel length is smaller than 1 um in advanced technology [7], stacked-gate devices are used to increase the output resistance. To achieve fast switching times with a high output resistance in the CP, a regulated wide-swing current mirror with current switching is used. Benefiting from the 7-nm FinFET, $I_{REP}$ can switch quickly to $I_{PUMP}$ with the incoming UP/DN. In addition, the short-channel devices of the 7-nm FinFET provide a shorter delay [7], which can produce the narrow reset pulse for the PD. Hence, the combination of the narrow reset pulse in the PD and the fast current switching in the charge pump reduces the mismatch current injected from the CP into the RC-LF, improving the reference spur. The RL pulse is generated by the RL-Pulse-GEN, which is composed of a latch and a few buffer cells that are also used in the PD, thus ensuring precise timing between the two functions.

Fig. 2. Tracking trend detector: (a) schematic and timing diagram (all flip-flops are reset on power-up). (b) Behavior simulation with FCW change.
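As a rough illustration of the digital acquisition path described in this section, the following Python sketch models a counter-based, type-I frequency-acquisition loop: a reference accumulator sums FCW, a variable accumulator counts CKV cycles, and their difference (PHE) is scaled by $\alpha$ and normalized to RO steps. Only the 75-MHz average step, the 31 tuning steps, and $\alpha = 2^{-2}$ come from the text; the base frequency `f_base_hz`, the mid-code starting point, and the rounding to an integer code are assumptions made for illustration, not the authors' implementation.

```python
def acquire(fcw, fref_hz, alpha, n_cycles,
            f_base_hz=1.53e9,   # hypothetical RO frequency at code 0
            step_hz=75e6,       # average digital step quoted in the text
            n_codes=31):        # thermometer-coded tuning steps
    """Behavioral sketch of counter-based frequency acquisition.

    Returns a list of (code, RO frequency in Hz), one entry per FREF cycle.
    """
    ref_acc = 0.0               # reference accumulator (sums FCW)
    var_acc = 0.0               # variable accumulator (counts CKV cycles)
    code0 = n_codes // 2        # assumed mid-range reset state
    code = code0
    trace = []
    for _ in range(n_cycles):
        f_ro = f_base_hz + code * step_hz
        trace.append((code, f_ro))
        ref_acc += fcw
        var_acc += f_ro / fref_hz
        phe = ref_acc - var_acc                  # phase error in CKV cycles
        # Gain normalization: alpha * PHE expressed in RO steps.
        ntw = alpha * phe * fref_hz / step_hz
        code = min(n_codes, max(0, code0 + round(ntw)))
    return trace


# Example: 3-GHz target from a 200-MHz reference (FCW = 15), alpha = 2^-2.
trace = acquire(fcw=15, fref_hz=200e6, alpha=2 ** -2, n_cycles=40)
final_code, final_freq = trace[-1]
print(final_code, final_freq / 1e9)
```

In this model the PHE is large at the start, so the code moves by one step per cycle, and the steps shrink as the RO approaches the target. Because the target falls between two codes here, the code ends up toggling between two adjacent values, which is the fluctuation around the target that the tracking trend detector looks for.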

Conventional ADPLLs employ a *deterministic* sequencer to switch from frequency acquisition to phase tracking [2], [3]. If the allocated acquisition interval happens to be insufficient, the PLL might run out of range during phase tracking. Due to PVT uncertainties and the wide range of generated frequencies, conventional ADPLLs take hundreds of reference cycles to complete the frequency acquisition just to avoid any failure caused by prematurely entering phase tracking. We propose a behavioral sequencer, the tracking trend detector in Fig. 2, which automatically finds the earliest safe switchover point by observing the oscillator tuning word (NTW). The tracking behaviors of the conventional and proposed PLL arrangements are shown in Fig. 1(c) and (b), respectively. The proposed circuitry comprises the activity generator, which produces a strobe only when the NTW changes; the initial trend detector, "clocked" by the strobe; and the switchover activator, which asserts the switchover control, TRK. At first, the registers I\_NTW1~3 are reset and the nonzero detector (NZD) outputs logic low to latch1. Three different NTW samples are captured by the strobe clock. Latch1 locks the three initial NTW samples as soon as the NZD detects a nonzero value in register I\_NTW3. Then, the following logic circuitry decides on the initial trend direction from sign1 and sign2 (sign2/sign1: 2'b00 for a negative trend, 2'b11 for a positive trend, and 2'b01/10 for a flat trend). In the switchover activator, a reversal of the trend direction is detected by comparing two consecutive samples, and the switchover command (TRK) is locked by latch2. Hence, the proposed hybrid PLL with the behavioral sequencer takes only dozens of reference cycles to complete the frequency acquisition. If FCW is changed, the hybrid PLL returns from phase tracking to frequency acquisition by tying VCO<sub>IN</sub> to $V_{DD}/2$ (enabling the voltage divider again). Furthermore, the frequency acquisition resumes from the last NTW state, rather than from the middle reset state, as shown in Fig. 2(b).

Fig. 3. Ideal schematic of the hybrid RO, highlighting the tuning range coverage and phase realignment schemes.
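The behavior of the tracking trend detector can be paraphrased in software. The sketch below is not the gate-level circuit of Fig. 2: it only mirrors the described steps (strobe on NTW change, initial trend from three captured samples, TRK on a trend reversal). Two points are assumptions not stated in the text: the first observed NTW value is treated as a captured sample, and a flat initial trend is taken to trigger the switchover immediately. The function names are hypothetical.

```python
def sign(delta):
    """Return +1, -1, or 0 for the sign of delta."""
    return (delta > 0) - (delta < 0)


def switchover_index(ntw_samples):
    """Return the sample index at which TRK would be asserted, or None."""
    captured = []      # NTW values captured by the strobe
    trend = None       # initial trend: +1 (positive) or -1 (negative)
    for i, ntw in enumerate(ntw_samples):
        if captured and ntw == captured[-1]:
            continue   # activity generator: no strobe without an NTW change
        captured.append(ntw)
        if len(captured) < 3:
            continue   # still filling the three initial-trend registers
        if trend is None:
            sign1 = sign(captured[1] - captured[0])
            sign2 = sign(captured[2] - captured[1])
            if sign1 != sign2:
                return i          # flat trend (assumed: switch over at once)
            trend = sign1
        elif sign(captured[-1] - captured[-2]) != trend:
            return i              # trend reversal: assert TRK
    return None


# Rising acquisition that overshoots at index 9 (20 -> 19).
print(switchover_index([15, 16, 17, 18, 18, 19, 19, 19, 20, 19, 20]))
```

Because the decision depends on the observed NTW behavior rather than on a fixed cycle count, the switchover point moves with PVT and with the target frequency, which is the stated advantage over a deterministic sequencer.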

Fig. 3 shows a circuit diagram of the hybrid RO. The proposed technique combines the digital $I_{DIG}$ and analog $I_{ANA}$ tuning currents to generate the RO supply current, $I_{RO}$, while using an active current mirror to increase robustness to supply variations. The RO is implemented with ultralow-$V_T$ devices such that its supply voltage ($V_D$) can be as low as 0.45 V. A voltage margin of 200 mV keeps the pMOS device of the active current mirror in saturation, which sets the supply voltage of the hybrid PLL to 0.65 V. The digital tuning path comprises a regulated current mirror and a programmable current bank with 31 thermometer-coded tuning steps that produce a combined current ($I_{DIG} = I_{BASE} + I_{BASE} \cdot \mathrm{stages}$). The regulated current mirror provides enough output resistance against ground noise, and it reduces the current mismatch between current banks in advanced technology. This manner of current control is chosen over the conventional capacitance control as it is more precise for tuning, with less phase noise variability. The unit current, $I_{BASE}$, of 20 uA generates an average frequency step of 75 MHz, and its quantization error (residue) is passed to the analog tuning path during phase tracking. This handoff feature is distinct from conventional "hybrid" solutions and is instrumental in achieving the ultrafast settling. The total digital tuning range is 2.33 GHz. The analog path comprises an nMOS device (called MV2I) that translates the tuning voltage (VCO<sub>IN</sub>) to the current $I_{ANA}$ for phase tracking. The RO drift due to temperature is $\sim$430 MHz from -40 °C to 125 °C (i.e., 2.6 MHz/°C). The analog tuning range of 940 MHz covers 2× the RO frequency drift, and so it results in an average $K_{RO}$ of 3.15 GHz/V in the 0.3-V tuning range, which is a $3.6\times$ gain reduction versus conventional CP-PLLs. Moreover, the MV2I can be small because of the small $K_{RO}$, which reduces the leakage current on the path from the CP to the hybrid RO. Furthermore, the $I_{DIG}$ current is tuned by the oscillator tuning word in thermometer code (OTW), which is independent of $I_{ANA}$. Hence, it provides uniform frequency steps and its $K_{RO}$ does not vary with the tuning word, thus providing a stable PLL bandwidth (unlike in [5]). The $I_{BIAS}$ current is generated by a current generator with an active current mirror (ACM) to provide high supply-noise rejection for maintaining the clock quality. Capacitor 1 (CAP1) compensates the regulated current mirror for stability. The combination of capacitor 2 (CAP2) in the current banks and the parasitic resistance filters the current noise; its bandwidth is set around 10 MHz because the noise of the hybrid RO in closed loop is suppressed within the PLL-BW.

In the realignment mode, a very narrow pulse (RL) triggered by the rising edge of FREF is injected into the RO. The first RO cell functions as a loop inverter when RL = 0 and the RL device is turned off. During the brief moment when RL = 1, the RO loop is broken and the ZN net is pulled to ground, so CKV gets edge-aligned with FREF. As a result, the RO's accumulated jitter is reset [8], [9].

Fig. 4. Behavioral simulations in the conventional mode with 30-ps reset pulse (left) and the realignment mode with *unsuitable* RL strength (right).
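The hybrid-RO tuning figures quoted in this section can be cross-checked with simple arithmetic. The short Python sketch below uses only numbers from the text; the constant and function names are hypothetical and chosen for this example.

```python
# Values quoted in the text.
STEP_HZ = 75e6            # average digital frequency step
N_STEPS = 31              # thermometer-coded tuning steps
I_BASE_A = 20e-6          # unit current I_BASE
DRIFT_HZ = 430e6          # RO drift from -40 C to 125 C
ANALOG_RANGE_HZ = 940e6   # analog tuning range
V_TUNE_V = 0.3            # usable tuning-voltage range


def digital_range_hz():
    """Total digital tuning range: number of steps times the average step."""
    return N_STEPS * STEP_HZ


def digital_current_a(stages):
    """I_DIG = I_BASE + I_BASE * stages."""
    return I_BASE_A * (1 + stages)


def drift_per_degc_hz(t_min_c=-40.0, t_max_c=125.0):
    """Average temperature drift of the RO frequency."""
    return DRIFT_HZ / (t_max_c - t_min_c)


def gain_hz_per_v(range_hz, v_range_v=V_TUNE_V):
    """Average oscillator gain for a frequency range over a voltage range."""
    return range_hz / v_range_v


print(digital_range_hz() / 1e9)             # digital range in GHz
print(digital_current_a(N_STEPS) * 1e6)     # maximum I_DIG in uA
print(drift_per_degc_hz() / 1e6)            # drift in MHz per degree C
print(gain_hz_per_v(ANALOG_RANGE_HZ) / 1e9) # analog K_RO in GHz/V
print(gain_hz_per_v(4e9) / 1e9)             # single-path K_VCO for 4 GHz
print(ANALOG_RANGE_HZ / DRIFT_HZ)           # analog range over drift
```

The results agree with the quoted 2.33-GHz digital range and 2.6 MHz/°C drift, give an analog gain close to the quoted average of 3.15 GHz/V, and reproduce the "more than 13 GHz/V" figure for a 4-GHz range tuned entirely within 0.3 V.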

The strength of realignment (RL) dominates the period-jitter and reference-spur performance. Although a sufficient strength can reset the accumulated jitter of the oscillator to reduce the in-band noise of the PLL, this strength distorts the ZN waveform during phase alignment, causing a transient difference between period1 and period2 that degrades the period jitter. Moreover, if the latency from FREF to RL differs from the latency from FREF to UP/DN, the locked phase of the CP loop is disturbed by the RL strength. The resulting phase difference between UP and DN generated by the PFD appears as a voltage variation on VCO<sub>IN</sub> that adjusts the RO's phase and degrades the reference spur, as shown in Fig. 4. Therefore, a suitable RL strength can efficiently reduce the in-band noise of the PLL without degrading the reference spur and period jitter. Fig. 4 compares behavioral simulations of the conventional mode with a 30-ps reset pulse and the realignment mode with an unsuitable realignment strength. The reference clock is set to 300 MHz to reduce the simulation time, so the reference spur appears at a 300-MHz offset; the target frequency is 3 GHz in both simulations. In the conventional mode, the reference spur is -55 dBc and the envelope of the spectrum is drawn with a yellow dotted line. In the realignment mode, an unsuitable (too strong) realignment strength is applied. The in-band spectrum is efficiently suppressed (compared with the yellow dotted line), but the reference spur is degraded to -41 dBc. The RL-Pulse-GEN is integrated into the PFD and implemented by auto-place-and-route (APR), while the RL device is implemented with a custom layout. Unfortunately, the latencies from FREF to RL and from FREF to UP/DN are not constrained in APR. The RL device uses the same W/L as the ring cell, and no adjustable options are built into this prototype. This means that the prototype has enough capability to reduce the in-band noise of the PLL, but its reference-spur and period-jitter performance is degraded by the unconstrained realignment strength.
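To give a feel for why resetting the accumulated jitter lowers the in-band noise, the following Python sketch uses a simple random-walk variance model of periodic realignment. It is an open-loop illustration only: it ignores the CP loop, the waveform distortion, and the latency mismatch discussed above. The realignment factor `beta` (the fraction of accumulated timing error removed at each reference edge) and the per-cycle jitter value are hypothetical parameters, not quantities reported in this letter.

```python
import math


def accumulated_jitter_rms(sigma_cycle_s, n_div, beta, n_ref_cycles=200):
    """RMS accumulated timing error just before a reference edge.

    sigma_cycle_s: assumed uncorrelated jitter added per RO cycle (seconds)
    n_div:         RO cycles per reference cycle
    beta:          assumed fraction of the error removed by realignment
    """
    var = 0.0
    peak = 0.0
    for _ in range(n_ref_cycles):
        var += n_div * sigma_cycle_s ** 2   # random-walk growth over N cycles
        peak = var                          # variance just before realignment
        var *= (1.0 - beta) ** 2            # realignment scales the error
    return math.sqrt(peak)


sigma = 10e-15   # hypothetical per-cycle jitter of 10 fs
for beta in (0.0, 0.5, 1.0):
    print(beta, accumulated_jitter_rms(sigma, n_div=15, beta=beta))
```

With `beta = 0` the error keeps growing with the number of reference cycles, whereas any nonzero `beta` bounds it; with `beta = 1` the error never accumulates beyond one reference period. In the actual PLL the CP loop also bounds the accumulated error, so this model only illustrates the trend.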

## III. MEASUREMENT RESULTS AND CONCLUSION

The proposed hybrid PLL is implemented in TSMC 7-nm FinFET CMOS. Fig. 5 shows the measured settling behavior. The overall settling time is only 0.6 us at 200-MHz FREF (0.8 us at 100-MHz FREF), comprising 0.12 us for the frequency acquisition (with $\alpha = 2^{-2}$) and 0.48 us for the phase acquisition.
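The measured settling figures can be restated in reference cycles, and the acquisition bandwidth follows from the $\alpha \cdot \mathrm{FREF}/2\pi$ expression given in Section II. The symbols $N_{freq}$, $N_{phase}$, and $f_{BW}$ are introduced here only for this worked example:

$$
\begin{aligned}
N_{freq} &= 0.12\ \mu\mathrm{s} \times 200\ \mathrm{MHz} = 24\ \text{cycles},\\
N_{phase} &= 0.48\ \mu\mathrm{s} \times 200\ \mathrm{MHz} = 96\ \text{cycles},\\
f_{BW} &= \frac{\alpha \cdot \mathrm{FREF}}{2\pi} = \frac{2^{-2} \times 200\ \mathrm{MHz}}{2\pi} \approx 7.96\ \mathrm{MHz}.
\end{aligned}
$$

The 24-cycle frequency acquisition is consistent with the "dozens of reference cycles" stated for the behavioral sequencer.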



Fig. 5. Measured settling behavior at 200-MHz FREF.



Fig. 6. Measured spectrum at 4-GHz FPLL at 200-MHz FREF and comparison of reset pulse of PD and performance of reference spur.



Fig. 7. Measured phase noise of conventional mode and realignment mode at 3-GHz PLL output with 200-MHz FREF.



Fig. 8. Power consumption allocation among various blocks and die photograph of the 7-nm FinFET hybrid PLL.

Fig. 6 shows the measured spectrum at the 4-GHz PLL output with 200-MHz FREF. The measured reference spur is -52.3 dBc in the conventional mode with an ultrashort PD reset pulse of 30 ps (verified to degrade to -41 dBc if the reset-pulse width is increased to 90 ps). In Fig. 7, the measured integrated phase jitter at the 3-GHz PLL output with 200-MHz FREF is 0.619 ps in the realignment mode and 1.05 ps in the conventional mode. In the conventional mode, the in-band PN of -110 dBc/Hz is limited by the CP noise. With realignment, the in-band noise is effectively suppressed to -120 dBc/Hz, which is bounded merely by the FREF noise scaled by 20log(FCW). The spurs at 4 MHz and 6 MHz exist in both the conventional and realignment modes and are caused by the testing environment. Fig. 8 shows the power consumption allocation among various blocks and the die micrograph. The core area is 0.012 mm<sup>2</sup> plus 0.008 mm<sup>2</sup> for decoupling capacitors dedicated to filtering out heavy digital noise from the host supply. The total power at the 3-GHz PLL output with 200-MHz FREF is 2.3 mW at 0.65 V (1.67 mW for the hybrid RO, 0.33 mW for the CP, 0.2 mW for the digital logic, and 0.1 mW for the bias circuit), leading to the record-high FoM of -240.5 dB in the realignment mode and -236 dB in the conventional mode.

TABLE I

Performance Summary and Comparison to State-of-the-Art

| Reference                                     | This Work             |                        | ISSCC'16 [4]     | JSSC'13 [10] | ISSCC'16 [11] |
|-----------------------------------------------|-----------------------|------------------------|------------------|--------------|---------------|
| Architecture                                  | Hybrid PLL            |                        | Charge-pump PLL  | MDLL         | DMDLL         |
| Technology                                    | 7 nm                  |                        | 14 nm            | 0.13 um      | 28 nm         |
| Supply                                        | 0.65 V                |                        | 0.95 V           | 1.1 V        | NA            |
| Reference frequency                           | 200 MHz               |                        | 100 MHz          | 375 MHz      | 75 MHz        |
| Output frequency                              | 0.2-4 GHz             |                        | 0.15-5 GHz       | 1.5 GHz      | 2.4 GHz       |
| Settling time @F<sub>PLL</sub>                | 0.6 us @3 GHz         |                        | 2.96 us @1.2 GHz | NA           | NA            |
| Area (mm<sup>2</sup>)                         | 0.012                 |                        | 0.022            | 0.25         | 0.024         |
| F<sub>RO</sub> for jitter/power measurements  | 3 GHz@RL<sup>2</sup>  | 3 GHz@Con<sup>3</sup>  | 4 GHz            | 1.5 GHz      | 2.4 GHz       |
| Integrated jitter (ps)                        | 0.619                 | 1.05                   | 1.26             | 0.4          | 0.7           |
| Power (mW)                                    | 2.3                   |                        | 2.56             | 0.9          | 1.51          |
| FoM<sup>1</sup> (dB)                          | -240.5                | -236                   | -233.9           | -248.4       | -241          |

<sup>1</sup>FoM = $10\log_{10}[(\sigma_{jitter}/1\,\mathrm{s})^2 \cdot (P_{DC}/1\,\mathrm{mW})]$; <sup>2</sup>RL: realignment mode; <sup>3</sup>Con: conventional mode.
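The FoM values in Table I follow from the footnote definition, with the jitter expressed in seconds and the power normalized to 1 mW. The short Python check below reproduces the two entries for this work from the quoted jitter and power; the function name `jitter_fom_db` is hypothetical.

```python
import math


def jitter_fom_db(sigma_rms_s, power_w):
    """FoM = 10*log10[(sigma / 1 s)^2 * (P_DC / 1 mW)] in dB."""
    return 10.0 * math.log10(sigma_rms_s ** 2 * (power_w / 1e-3))


fom_rl = jitter_fom_db(0.619e-12, 2.3e-3)    # realignment mode
fom_con = jitter_fom_db(1.05e-12, 2.3e-3)    # conventional mode
print(round(fom_rl, 1), round(fom_con, 1))   # -240.5 -236.0
```

Since both modes are evaluated at the same 2.3 mW, the FoM difference between them comes entirely from the jitter reduction, i.e., $20\log_{10}(1.05/0.619) \approx 4.6$ dB.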

Table I compares the proposed hybrid PLL with state-of-the-art RO-based designs. Among the PLLs listed, it achieves the fastest settling of 0.6 us and the best integrated jitter and FoM of 0.619 ps and -240.5 dB, respectively.

#### REFERENCES

- [1] S.-Y. Wu *et al.*, "A 7nm CMOS platform technology featuring 4<sup>th</sup> generation FinFET transistors with a 0.027μm<sup>2</sup> high-density 6-T SRAM cell for mobile SoC applications," in *Proc. IEEE Int. Electron Devices Meeting (IEDM)*, San Francisco, CA, USA, 2016, pp. 1–4.
- [2] T.-H. Tsai, M.-S. Yuan, C.-H. Chang, C.-C. Liao, C.-C. Li, and R. B. Staszewski, "14.5 A 1.22ps integrated-jitter 0.25-to-4GHz fractional-N ADPLL in 16nm FinFET CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Techn. Papers*, San Francisco, CA, USA, 2015, pp. 450–451.
- [3] R. Staszewski and P. T. Balsara, "All-digital PLL with ultra fast settling," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 54, no. 2, pp. 181–185, Feb. 2007.
- [4] K.-Y. Shen *et al.*, "19.4 A 0.17-to-3.5mW 0.15-to-5GHz SoC PLL with 15dB built-in supply noise rejection and self-bandwidth control in 14nm CMOS," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC)*, San Francisco, CA, USA, 2016, pp. 330–331.
- [5] W. B. Wilson, U.-K. Moon, K. R. Lakshmikumar, and L. Dai, "A CMOS self-calibrating frequency synthesizer," *IEEE J. Solid-State Circuits*, vol. 35, no. 10, pp. 1437–1444, Oct. 2000.
- [6] S.-H. Lee *et al.*, "Investigation of low-frequency noise in p-type nanowire FETs: Effect of switched biasing condition and embedded SiGe layer," *IEEE Electron Device Lett.*, vol. 35, no. 7, pp. 702–704, Jul. 2014.
- [7] A. L. S. Loke *et al.*, "Analog/mixed-signal design challenges in 7-nm CMOS and beyond," in *Proc. IEEE Custom Integr. Circuits Conf.* (CICC), San Diego, CA, USA, 2018, pp. 1–8.
- [8] S. Ye, L. Jansson, and I. Galton, "A multiple-crystal interface PLL with VCO realignment to reduce phase noise," *IEEE J. Solid-State Circuits*, vol. 37, no. 12, pp. 1795–1803, Dec. 2002.
- [9] S. Choi, S. Yoo, Y. Lim, and J. Choi, "A PVT-robust and low-jitter ring-VCO-based injection-locked clock multiplier with a continuous frequency-tracking loop using a replica-delay cell and a dual-edge phase detector," *IEEE J. Solid-State Circuits*, vol. 51, no. 8, pp. 1878–1889, Aug. 2016.
- [10] A. Elshazly, R. Inti, B. Young, and P. K. Hanumolu, "Clock multiplication techniques using digital multiplying delay-locked loops," *IEEE J. Solid-State Circuits*, vol. 48, no. 6, pp. 1416–1428, Jun. 2013.
- [11] H. Kim, Y. Kim, T. Kim, H. Park, and S. Cho, "19.3 A 2.4GHz 1.5mW digital MDLL using pulse-width comparator and double injection technique in 28nm CMOS," in *Proc. IEEE Int. Solid-State Circuits Conf.* (ISSCC), San Francisco, CA, USA, 2016, pp. 328–329.
- [12] T.-H. Tsai, R.-B. Sheen, C.-H. Chang, and R. B. Staszewski, "A 0.2GHz to 4GHz hybrid PLL (ADPLL/charge-pump-PLL) in 7nm FinFET CMOS featuring 0.619ps integrated jitter and 0.6us settling time at 2.3mW," in *Proc. IEEE Symp. VLSI Circuits*, Honolulu, HI, USA, 2018, pp. 184–185.