# A DPLL-Centric Bluetooth Low-Energy Transceiver With a 2.3-mW Interference-Tolerant Hybrid-Loop Receiver in 65-nm CMOS

Hanli Liu, Student Member, IEEE, Zheng Sun, Student Member, IEEE, Dexian Tang, Hongye Huang, Student Member, IEEE, Tohru Kaneko, Zhijie Chen, Member, IEEE, Wei Deng, Senior Member, IEEE, Rui Wu, Member, IEEE, and Kenichi Okada, Senior Member, IEEE

Abstract—This paper presents a Bluetooth low-energy (BLE) transceiver (TRX) achieving ultra-low-power operation for Internet-of-Things (IoT) applications. The proposed TRX utilizes a wide-bandwidth low-power fractional-N digital phase-locked loop (DPLL) as a central component to perform multiple roles: a direct frequency modulator for the transmitter (TX), an analog-to-digital converter (ADC) for the receiver (RX), a local oscillator for RX, and a frequency and phase synchronizer for the RX. The DPLL-based single-path downconversion method is adapted to halve the analog baseband circuit to further reduce the power consumption while maintaining a high interference rejection. A DPLL-based ADC with a digital-to-analog converter feedback greatly improves the ADC dynamic range, which improves the RX sensitivity and interference tolerance. By maximally reducing the required radio frequency and analog front-end components in RX, an RX power consumption of 2.3 mW is achieved with a -94 dBm sensitivity. The RX also satisfied all interference requirements with a certain margin. A 5.0-mW TX is achieved when delivering an output power of 0 dBm with a frequency-shift keying error of only 1.89%.

Index Terms—ADPLL, analog-to-digital converter (ADC), blocker, Bluetooth low energy (BLE), delta sigma, digital-to-time converter (DTC), direct frequency-modulation (DFM), digital phase-locked loop (DPLL), fractional-N, Gaussian frequency-shift keying (GFSK), hybrid loop, IIP2, IIP3, interference, low-noise amplifier (LNA), PLL, sensitivity, single-point modulation, time-to-digital converter (TDC), transceiver (TRX), ultra-low power (ULP).

#### I. INTRODUCTION

In a wireless world, the radio frequency (RF) transceiver (TRX) plays a major role in connecting devices over the air. Because the TRX consumes a significant amount of power in a wireless chip, ultra-low-power (ULP) operation is especially important in Internet-of-Things (IoT) applications. Bluetooth low energy (BLE) is one of the most popular wireless standards for IoT applications. A BLE TRX must support a very long battery life, which means its power consumption should be minimized as much as possible. In addition, a low receiver (RX) sensitivity level is needed in order to increase the communication range. The BLE RX should also tolerate strong interference in order to keep working even in a crowded wireless environment.

Manuscript received May 6, 2018; revised August 10, 2018 and October 14, 2018; accepted October 18, 2018. Date of publication November 15, 2018; date of current version December 21, 2018. This paper was approved by Guest Editor Arun Natarajan. (*Corresponding author: Hanli Liu.*)

H. Liu, Z. Sun, D. Tang, H. Huang, T. Kaneko, and K. Okada are with the Department of Physical Electronics, Tokyo Institute of Technology, Tokyo 152-8552, Japan (e-mail: liu@ssc.pe.titech.ac.jp).

Z. Chen is with the Faculty of Information, Beijing University of Technology, Beijing 100124, China.

W. Deng is with Apple Inc., Cupertino, CA 95014 USA.

R. Wu is with the Institution of Electronics, Chinese Academy of Sciences, Beijing 100190, China.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JSSC.2018.2878822

Low-IF and zero-IF architectures [1]-[3] are among the most common architectures for narrow-band RXs, as shown in Fig. 1(a). They achieve excellent sensitivity and blocker tolerance by utilizing both I and Q channels to demodulate the Gaussian frequency-shift keying (GFSK) data. However, using both of the branches consumes significant amounts of power and area. Sliding-IF (SIF) is another popular architecture for low-power design, although this architecture causes severe image problems [4]–[7]. Reference [8] proposed a hybrid-loop RX to improve the blocker tolerance from the SIF phase-to-digital converter (SIF-PDC) architecture [4]. In order to realize single-path downconversion demodulation, the GFSK constellation is transformed into a differential phase-shift keying (DPSK) constellation at the RX mixer output by shifting the RX local oscillator (LO) frequency by 250 kHz from its carrier frequency [8]. A digital phase-locked loop (DPLL) is used as an LO, and an analog-to-digital converter (ADC) is used to digitize the analog-baseband (ABB) data from the ADC path, as shown in Fig. 1(b). This removes the power consumption from the Q-channel and two ADCs. However, the signal-to-noise-plus-distortion ratio (SNDR) of the ADC path suffers from highly non-linear varactor gain as well as gain variation due to process, voltage, and temperature (PVT) variation. The RX sensitivity level and the interference tolerances are degraded due to the SNDR degradation of the ADC path. The RX also suffers from an unknown carrier phase by using only the I-channel for data demodulation, which decreases the signal-to-noise ratio (SNR) of the demodulated data. Furthermore, the ADC path consumes a lot of power from the time-to-digital converter (TDC) because of its linearity and resolution requirements. A digitally controlled

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/



Fig. 1. (a) Low-IF RX architecture. (b) Hybrid-loop RX with a DPLL-based ADC. (c) Proposed hybrid-loop RX with a dynamic range-enhanced DPLL-based ADC.

oscillator (DCO)-based phase-tracking RX is proposed [9] to improve the power efficiency by adopting the single-path demodulation method. This RX achieved an excellent power efficiency while maintaining a good sensitivity. The carrier frequency drift issue is also solved thanks to the phase tracking functionality of the RX. However, the adjacent blocker tolerance is degraded by the sidelobe energy from the free running DCO [10].

In this paper, we address the above issues with the following techniques. These techniques are verified with a 2.3-mW BLE RX that achieves a sensitivity of -94 dBm while satisfying all blocker requirements, and a 5.0-mW single-point direct frequency-modulation (DFM) transmitter (TX) with an FSK error of 1.89% at an output of 0 dBm in a 65-nm CMOS process. The digital-to-analog converter (DAC) feedback path is proposed in the DPLL-based ADC to mitigate the linearity degradation and the gain variation from the varactor. This greatly improves the dynamic range of the DPLL-based ADC as shown in Fig. 1(c). As such, the RX sensitivity level and the interference tolerance performances are improved. A digital-to-time converter (DTC)-assisted fractional-N DPLL is implemented. Because the DTC reduces the required range of the TDC, the TDC achieves a fine resolution with low power consumption, which improves the in-band phase noise. The highly linear constant-slope DTC operation ensures a good fractional spur performance. A 5-MHz bandwidth (BW) is realized



Fig. 2. (a) Concept of conventional open-loop DPLL-based ADC. (b) Conversion diagrams.

by utilizing the proposed loop-latency reduction technique. The single-path demodulation is supported by a phase-and-frequency synchronization loop in the digital domain when a carrier frequency offset is present.

#### II. DPLL-CENTRIC RECEIVER

## A. DPLL-Based ADC With Dynamic Range Enhancement

The dynamic range of the DPLL-based ADC has a considerable influence on the sensitivity level and the blocker tolerance of the hybrid-loop architecture, so it needs to be improved. The DPLL-based ADC uses an oscillator and a varactor as a voltage-to-frequency (V2F) converter, and the DPLL acts as a frequency quantizer. Fig. 2(a) shows the conventional implementation of the DPLL-based ADC [8]. Fig. 2(b) illustrates the concept of the digitization process: the ABB data  $V_{ABB}$  modulates the varactor in the oscillator. If the oscillator is free running,  $V_{ABB}$  will produce a frequency disturbance of  $K_{\rm VCO} \cdot V_{\rm ABB}$ . However, due to the negative feedback operation of the DPLL, the loop suppresses this disturbance and corrects it at the digital capacitor bank (PLL path). The compensated frequency of  $K_{\text{DCO}} \cdot D_{\text{OUT}}$  almost equals  $K_{\rm VCO} \cdot V_{\rm ABB}$  within the BW of the DPLL. Hence,  $D_{\text{OUT}}$  can be used as ADC data. In addition,  $K_{\text{DCO}} \cdot D_{\text{OUT}}$  cancels  $K_{\rm VCO} \cdot V_{\rm ABB}$ , which produces a stable oscillator output frequency of  $f_{OSC} = K_{VCO} \cdot V_{ABB} - K_{DCO} \cdot D_{OUT} + f_{LO} \approx f_{LO}$ . Hence,  $f_{OSC}$  can be used as an LO, as shown by the LO path in Fig. 1(b). The relation between the amplitude of the input  $V_{ABB}$  and the output  $D_{OUT}$  is

$$\frac{D_{\text{OUT}}}{V_{\text{ABB}}} = \frac{K_{\text{VCO}}}{K_{\text{DCO}}} \cdot \frac{H_{\text{OL,DPLL}}(z)}{H_{\text{OL,DPLL}}(z) + 1}.$$
 (1)

The voltage-to-digital (V2D) conversion strongly depends on the varactor gain  $K_{\rm VCO}$  and the digital capacitor bank gain  $K_{\rm DCO}$ . This operation is conducted in an open-loop



Fig. 3. (a) Proposed closed-loop DPLL-based ADC with improved varactor linearity. (b) Conversion diagrams.

manner from an ADC viewpoint and easily suffers from the non-ideality of the loop components. One of the major problems comes from the non-linearity of the varactor at  $V_{ABB}$  input, as shown in Fig. 2(b). Because of the full-range output from a programmable gain amplifier (PGA), the varactor gain varies a lot as the input voltage changes. The conversion spurious-free dynamic range (SFDR) is degraded due to the intermodulation distortion (IMD) and the harmonic distortion (HD) when performing V2F conversion. Moreover, the dc voltage of  $V_{ABB}$  also influences the SFDR as the linearity becomes much worse at both ends. From (1), a larger  $K_{\rm VCO}$  is desired for achieving a better SNR of V2D conversion. However, the larger the  $K_{\rm VCO}$  is, the worse the linearity will be. Furthermore, the varactor gain variation due to the PVT potentially degrades the SNR performance. As a result, the dynamic-range of the DPLL-based ADC is greatly degraded due to the open-loop operation. For achieving better V2F linearity, a varactor array can be implemented with resistor-interpolated voltage biases [8]. Sixteen varactor banks are used to achieve a  $K_{\rm VCO}$  of 800 kHz/V, which consumes a large chip area and produces the large parasitic capacitance of the LC oscillator. In simulation, an SFDR of only 44 dB is achieved by this linearization technique, and the SFDR will become worse under PVT variation.

As shown in Fig. 3(a), the proposed DPLL-based ADC works in a closed-loop manner. A DAC is connected to the output of the DPLL, and a pre-distortion signal  $V_{FB}$  is fed back to the varactor input. Then,  $V_{FB}$  is subtracted from  $V_{ABB}$  by a signal adder at the varactor input. Due to the negative feedback of the DPLL, the loop forces  $V_{FB} \approx V_{ABB}$ . The voltage range of  $V_{ABB}$  is attenuated to  $V_{tune} = V_{ABB} - V_{FB}$  at the varactor input, as shown in Fig. 3(b). If the DPLL BW is very large, the large feedback gain of the DPLL forces  $V_{tune}$  to a dc value. Hence, the V2D conversion is not degraded by the varactor non-linearity. The DAC feedback path is also used for phase locking. The digital capacitor bank path is used as a frequency-locked loop to lock the frequency at the correct BLE channel and is turned off after the frequency is locked. This ensures that the dc voltage of the DAC is always around 0.5 V [11].

Fig. 4 illustrates the operating principle of the proposed DPLL-based ADC. The DPLL is locked to the required LO frequency  $f_{RX,LO}$  at the LO path, and  $V_{ABB}$  is extracted from the  $V_{\rm RF}$  signal by the downconversion mixer, a low-pass filter (LPF), and a PGA.  $V_{ABB}$  is input to the DPLL-based ADC at the ADC path and is digitized to  $D_{OUT}$ . In Fig. 4, two loops are present, one for the downconversion process and one for the digitization process. The downconversion loop consists of a mixer, an LPF, a PGA, and an oscillator. The PGA output  $V_{ABB}$  controls the oscillator frequency through a varactor. The oscillator frequency is also controlled by the negative feedback loop of the DPLL. The downconversion loop and the DPLL independently control the oscillator frequency to synchronize it with their respective inputs. Conflicts will occur if both loops have comparable BWs. The analysis in [8] shows that a DPLL with a wider BW than that of the LPF can properly stabilize the two loops, i.e., a stabilized  $f_{RX,LO}$  can be realized. However, excessively increasing the BW of the DPLL will decrease the stability of the DPLL due to the limited sampling frequency and the loop latency. This causes a large peaking near the DPLL BW, which degrades the phase noise. As shown in the discrete-time model of the proposed ADC path in Fig. 4, the varactor input voltage is  $V_{\text{tune}}$  and the quantization noise of the DPLL quantizer is  $Q_n$ . We have

$$\frac{D_{\text{OUT}}}{V_{\text{ABB}}} = \frac{H_{\text{OL,DPLL}}(z)}{H_{\text{OL,DPLL}}(z) + 1} = \frac{T_{\text{REF}}^2 K_{\text{OSC}}(K_P(1 - z^{-1}) + K_I)}{t_{\text{RES}} N(1 - z^{-1})^2 + T_{\text{REF}}^2 K_{\text{OSC}}(K_P(1 - z^{-1}) + K_I)}$$
(2)

$$\frac{D_{\text{OUT}}}{Q_n} = \frac{1}{H_{\text{OL,DPLL}}(z) + 1} = \frac{t_{\text{RES}}N(1 - z^{-1})^2}{t_{\text{RES}}N(1 - z^{-1})^2 + T_{\text{REF}}^2K_{\text{OSC}}(K_P(1 - z^{-1}) + K_I)}.$$
(3)

 $H_{OL,DPLL}(z)$  is the open-loop transfer function of the DPLL, *N* is the divide ratio of the frequency divider, and  $t_{RES}$  is the time-resolution of the TDC. Equation (2) is the signal transfer function (STF) of the proposed ADC with a low-pass characteristic that has the same BW as the DPLL, as shown in Fig. 5. In (2), the factor of  $K_{VCO}/K_{DCO}$  is removed, as compared with that in (1). The varactor gain dependence for  $D_{OUT}$  is completely mitigated by this closed-loop operation. Equation (3) shows the noise transfer function (NTF) of the proposed ADC, which has a high-pass characteristic up to the DPLL BW. The NTF has a second-order noise shaping around the dc to provide more suppression of the quantization noise. We can also write the attenuation factor from the signal input



Fig. 4. Discrete-time model of the proposed DPLL-based ADC.



Fig. 5. Plots of the STF, NTF, and the attenuation factor with a DPLL BW of 5 MHz.



Fig. 6. Simulated required varactor linear range versus ADPLL BW.

 $V_{ABB}$  to the varactor input  $V_{tune}$ 

$$\frac{V_{\text{tune}}}{V_{\text{ABB}}} = \frac{1}{H_{\text{OL,DPLL}}(z) + 1}$$
$$= \frac{t_{\text{RES}}N(1 - z^{-1})^2}{t_{\text{RES}}N(1 - z^{-1})^2 + T_{\text{REF}}^2K_{\text{OSC}}(K_P(1 - z^{-1}) + K_I)}.$$
(4)
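The low-pass STF in (2) and the high-pass attenuation factor in (4) can be checked numerically. The Python sketch below evaluates both on the unit circle. The lumped gains `g_p` and `g_i` (standing for $T_{\text{REF}}^2 K_{\text{OSC}} K_P/(t_{\text{RES}}N)$ and the same expression with $K_I$), their values, and the 52-MHz sampling rate used here are illustrative assumptions, not the design values of this work.

```python
import cmath
import math

F_REF = 52e6  # assumed loop sampling rate in Hz (the doubled reference)


def open_loop(f_hz, g_p, g_i, f_ref=F_REF):
    """H_OL,DPLL(z) evaluated at z = exp(j*2*pi*f/f_ref); f_hz must be nonzero."""
    d = 1 - cmath.exp(-2j * math.pi * f_hz / f_ref)  # the (1 - z^-1) term
    return (g_p * d + g_i) / (d * d)


def stf(f_hz, g_p, g_i):
    """Signal transfer function D_OUT / V_ABB, as in (2)."""
    h = open_loop(f_hz, g_p, g_i)
    return h / (h + 1)


def attenuation(f_hz, g_p, g_i):
    """Attenuation factor V_tune / V_ABB, as in (4)."""
    h = open_loop(f_hz, g_p, g_i)
    return 1 / (h + 1)


G_P, G_I = 0.5, 0.02  # hypothetical lumped proportional and integral gains

for f in (10e3, 100e3, 750e3):
    print(f"{f / 1e3:6.0f} kHz  |STF| = {abs(stf(f, G_P, G_I)):.3f}  "
          f"|V_tune/V_ABB| = {abs(attenuation(f, G_P, G_I)):.5f}")
```

Because (2) and (4) share the same denominator, the two functions always sum to one, and the attenuation grows with signal frequency, which is the trend the text describes for Fig. 5.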

Since the DPLL has a finite BW, the  $V_{tune}$  still has some amplitude instead of a dc value. As the signal frequency becomes higher, the attenuation will be smaller as shown in Fig. 5. Therefore, a wide-BW DPLL is preferred in order to help reduce the amplitude of  $V_{tune}$ . This mitigates the varactor non-linearity and improves the SFDR performance of the ADC. The BW of the DPLL is decided by considering the required linear range of the varactor and the phase margin (PM) of the DPLL at a large BW. Fig. 6 shows the simulated results of the required linear range for the varactor. As explained by (4), a larger BW of the DPLL can help reduce the required linear range from the varactor. For  $V_{ABB}$  with a



Fig. 7. Schematic of the DAC feedback path.

maximum peak-to-peak amplitude of 500 mV at 750 kHz, the attenuation of  $V_{ABB}$  is very weak for a DPLL BW of 1 MHz. This results in a linear range requirement of 280 mV for the varactor. The required range can be decreased to 80 mV at the DPLL BW of 5 MHz. Ideally, this range can be realized by a single varactor without the linearization techniques. However, as the DPLL BW keeps increasing, the PM will be degraded. In this paper, the DPLL BW is designed to be 5 MHz and an estimated PM of 70° ensures the stability of the DPLL with sufficient margin. Another advantage of the wide-BW operation is the attenuation of the adjacent interference outside the baseband signal BW. If the BW is narrow, the LO frequency  $f_{RX,LO}$  may be pulled to the blocker frequency due to the large signal strength of  $V_{\text{tune}}$ . Fig. 7 shows a schematic of the proposed DAC feedback path, in which a resistor DAC (RDAC) is used to convert  $D_{\text{OUT}}$  to the analog signal. An operational amplifier (OPA) is used to perform the linear addition at node X, and the common-mode voltage of  $V_{\text{tune}}$  is set to  $V_{\text{COM}}$ . The RDAC converts the varactor gain of  $K_{\rm VCO}$  into the digitally controlled gain of  $K_{OSC}$ . A large  $K_{OSC}$  will degrade the quantization noise of the oscillator [12], which worsens the phase noise of the DPLL. A reduced  $K_{\rm VCO}$  will help reduce the required bits of the RDAC. However, a smaller  $K_{VCO}$  will cause a smaller frequency coverage. If the DPLL suffers from a large frequency drift, it will easily fail to lock. On the other hand, the quantization noise from the RDAC will also degrade the SNR of the DPLL-based ADC. From simulation results of the DPLL-based ADC, an 8-bit RDAC is enough for achieving an SNR of 48 dB. 
The 8-bit RDAC converts the optimized  $K_{\rm VCO}$  of 4 MHz/V into a digitally controlled gain of  $K_{\rm OSC}$ = 4 MHz/V $\times$ 2<sup>-8</sup> V/LSB $\approx$ 16 kHz/LSB for minimizing the phase noise degradation. The thermal noise from the RDAC is small enough that it will not degrade the phase noise of the DPLL. Because the distortions resulting from the non-linearity of the RDAC will appear directly at the ADC output, it is desired



Fig. 8. Test bench of the V2F conversion gain.



Fig. 9. Simulated linearity of (a) V2F conversion w/o DAC feedback, (b) V2F conversion with ideal DAC feedback, (c) nonideal DAC, and (d) V2F conversion with non-ideal DAC feedback.

to have a good linearity of the RDAC to improve the SFDR of the ADC. For the process used in this paper, an 8-bit RDAC with over 54-dB SFDR can be applicable when delivering an output of 500 mV<sub>pp</sub>. The DPLL BW is calibrated by the least mean square (LMS) algorithm in the background, as shown in Fig. 3(a). After calibration,  $H_{OL,DPLL}(z)$  will be maintained constant, regardless of the  $K_{OSC}$  variation, and the SNR of the V2D conversion will no longer suffer from the varactor gain variation caused by PVT in the conventional open-loop design.
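The conversion of the varactor gain into a digitally controlled gain through the RDAC is simple arithmetic, sketched below in Python. The 1-V full-scale range is an assumption inferred from the 2<sup>-8</sup> V/LSB step quoted in the text, and the function name is hypothetical.

```python
def k_osc_hz_per_lsb(k_vco_hz_per_v, n_bits, v_full_scale=1.0):
    """Oscillator gain seen by the digital code: K_VCO times the RDAC step size."""
    lsb_volts = v_full_scale / 2 ** n_bits  # RDAC step (assumed 1-V full scale)
    return k_vco_hz_per_v * lsb_volts


# 4 MHz/V through an 8-bit RDAC gives 15.625 kHz/LSB, i.e. about 16 kHz/LSB.
print(k_osc_hz_per_lsb(4e6, 8))  # 15625.0
```

Each extra RDAC bit halves $K_{OSC}$ for a given $K_{\rm VCO}$, which is why a smaller $K_{\rm VCO}$ relaxes the required number of bits.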

The linearity improvement with the proposed DAC feedback is validated by carrying out IMD simulations of the V2F conversion gain on the non-ideal model shown in Fig. 8, the results of which are shown in Fig. 9. The non-linear DAC shown in Fig. 8 is modeled using curve fitting from post-layout simulations. It must be noted that the effects of noise are not included in the aforementioned model in order to ensure accurate characterization of the DAC nonlinearity. For characterizing the non-linearity of the DAC, IMD simulations are carried out with two ideal sinusoidal test signals ( $V_{\text{TEST}}$ ), each with an amplitude of 250 mV<sub>pp</sub>. Without the proposed DAC feedback (excluding the shaded DAC Feedback block in Fig. 8), the presence of a non-linear



Fig. 10. (a) Conventional data digitization process by the full-range TDC. (b) Phase domain diagram.

varactor with large input amplitude limits the linearity of the V2F conversion gain. This is evident from the IMD2 and IMD3 simulation results on the V2F conversion gain shown in Fig. 9(a). For evaluating the V2F conversion gain linearity with the proposed DAC feedback, the DPLL model is included as an ideal delay cell with a delay value based on system simulation. The simulation results based on an ideal DAC is shown in Fig. 9(b), which shows a significant improvement in V2F conversion gain linearity with 22-dB improvement in IMD2 and 30-dB improvement in IMD3 as compared with the system without the DAC feedback. However, the DAC also contributes non-linearity in the feedback path. The simulated non-linearity of the non-ideal DAC is presented in Fig. 9(c) and the simulation carried out using this non-ideal DAC feedback reveals that the V2F conversion gain linearity is limited by IMD2. However, the degradation in IMD2 while using the non-ideal DAC as compared with the ideal DAC is observed to be under 4 dB in the simulation results presented in Fig. 9(d), which is still 18 dB better as compared with the system without the DAC feedback. Note that the linearity degradation from the DAC will not significantly degrade the SNDR performance of the DPLL-based ADC.

To take a closer look at the DPLL quantizer, a detailed phase-domain block diagram is shown in Fig. 10(a). Conventionally, only a TDC is used as the phase quantizer inside the DPLL [13]. The LC oscillator consists of a voltage-to-phase (V2P) portion and a digital-to-phase (D2P) portion. The V2P portion converts the input  $V_{ABB}$  to  $\Phi_{V2P}$ , and the D2P portion converts  $D_{OUT}$  to  $\Phi_{D2P}$ .  $\Phi_{D2P}$  is subtracted from  $\Phi_{V2P}$ , which produces a phase variation of  $\Phi_{OSC}$  at the LC oscillator output. In order to realize fractional frequency synthesis at the BLE channels, the fractional controller dithers the multi-modulus divider (MMD) to generate the target fractional phase. This dither operation generates a large peak-to-peak quantization noise of  $\Phi_{Q,\text{DIV}} = 2\pi$ , where  $2\pi$  represents one DCO period. Together with  $\Phi_{OSC}$ , this results in a large phase error  $\Phi_{\text{TDC}} = \Phi_{\text{OSC}} + \Phi_{Q,\text{DIV}}$  at the TDC input, as shown in Fig. 10(b). Therefore, a TDC range of



Fig. 11. (a) Proposed data digitization process assisted by DTC. (b) Phase domain diagram.

over  $2\pi$  is required. Just like an ADC, the TDC will consume a significant amount of power due to the resolution and linearity requirements. A poor TDC resolution degrades the in-band phase noise, and TDC non-linearity produces in-band fractional spurs. The adjacent channel rejection (ACR) performance will be degraded by the in-band phase noise and fractional spurs due to reciprocal mixing [14]. In this design, a wide BW is required for the DPLL, which requires a spur level of less than -40 dBc and a phase noise of less than -99 dBc/Hz at 3-MHz offset when considering the most stringent ACR performance at 3 MHz from system simulations. To maintain a sufficient design margin, a 5-MHz-BW DPLL with a worst case fractional spur of -50 dBc and a -110-dBc/Hz in-band phase noise will require a resolution of 2.5 ps and a normalized integral non-linearity (INL) of less than 0.5%, according to the system simulation. These requirements will easily cause a power consumption of more than 1 mW for the TDC alone [15]. However, from simulation results, an input signal (V<sub>ABB</sub>) with 500 mV<sub>pp</sub> will only produce a  $\Phi_{OSC}$  with a maximum phase variation of  $0.2\pi$ . A TDC with a range of  $2\pi$  therefore wastes TDC range and greatly degrades the power efficiency of the DPLL-based ADC. Since the TDC resolution and linearity are important for both the ADC operation and the fractional-N DPLL operation, they should be enhanced with little power overhead. In this paper, a DTC is used to reduce the required TDC range [11], [16]–[18] as shown in Fig. 11(a), which helps improve the resolution and the linearity of the TDC without power overhead. The fractional controller produces a pre-distorted phase signal that copies  $\Phi_{Q,\text{DIV}}$  and controls the DTC to produce  $\Phi_{\text{DTC}}$ . The DTC adds a quantization noise of  $\Phi_{Q,\text{DTC}}$  to its output.
As a result, the input at the TDC will be  $\Phi_{\text{TDC}} = \Phi_{\text{OSC}} + \Phi_{Q,\text{DIV}} - \Phi_{\text{DTC}} + \Phi_{Q,\text{DTC}} = \Phi_{\text{OSC}} + \Phi_{Q,\text{DTC}}$ . Since  $\Phi_{Q,\text{DTC}}$  is much less than  $\Phi_{Q,\text{DIV}}$ , the TDC is only required to cover a phase variation of  $0.2\pi$ . To leave some safety margin, a TDC with a range of around  $0.4\pi$  is designed with a 2.5-ps resolution. The power consumption of the TDC is 150  $\mu$W in post-layout simulation. The constant-slope charging method [19] is utilized to fundamentally improve the linearity of the DTC. The DTC achieves a normalized INL of 0.3% with 1-ps resolution in the post-layout simulation. The proposed DPLL-based ADC achieves a power consumption of around 1.0 mW and a simulated SNR of 48 dB and SFDR of 54 dB thanks to the DAC feedback path and the TDC resolution enhancement technique.
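The range reduction obtained by cancelling the divider quantization error with a DTC can be illustrated with a small behavioural model. The Python sketch below is only an illustration under stated assumptions: the fractional controller is modelled as a first-order accumulator (the order of the actual controller is not given here), the DCO period corresponds to an assumed 2.44-GHz carrier, the fractional word `0.3137` is arbitrary, and the DTC is ideal apart from its 1-ps step.

```python
T_DCO = 1 / 2.44e9  # assumed DCO period in seconds (mid-band 2.4-GHz carrier)
DTC_RES = 1e-12     # DTC step in seconds, the 1-ps resolution quoted in the text


def dtc_cancel(frac, n_cycles, dtc_res=DTC_RES, t_dco=T_DCO):
    """Return (divider error, residual after DTC) in seconds for each reference cycle."""
    acc = 0.0
    rows = []
    for _ in range(n_cycles):
        acc = (acc + frac) % 1.0               # accumulated fractional phase, in DCO periods
        q_div = acc * t_dco                    # divider quantization error as a time offset
        dtc_delay = round(q_div / dtc_res) * dtc_res  # nearest delay the DTC can produce
        rows.append((q_div, q_div - dtc_delay))       # the TDC only sees the residual
    return rows


rows = dtc_cancel(frac=0.3137, n_cycles=1000)  # hypothetical fractional word
span_before = max(q for q, _ in rows) - min(q for q, _ in rows)
span_after = max(r for _, r in rows) - min(r for _, r in rows)
print(f"span without DTC: {span_before * 1e12:.1f} ps")
print(f"span with DTC:    {span_after * 1e12:.2f} ps")
```

In this model the error seen without the DTC spans almost a full DCO period, whereas the residual stays within one DTC step, so the TDC range can be sized for $\Phi_{OSC}$ plus margin rather than for $2\pi$.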

#### B. Wide Loop-Bandwidth Fractional-N DPLL

A 5-MHz BW for the DPLL is specified by the DPLL-based ADC, which brings about various challenges in phase noise, fractional spurs, and power consumption. The standard 26-MHz reference would only be able to support a BW of around 2.6 MHz for a type-II PLL due to Gardner's limit [20]. The reference doubler technique is adopted to double the reference frequency from 26 to 52 MHz. The duty cycle issue is calibrated by using the method in [21]. A time amplifier (TA) [22] is used to improve the coarse TDC resolution from 20 to 2.5 ps. It improves the theoretical TDC quantization noise [23] from -98 to -116 dBc/Hz within the DPLL BW. Other noise sources, such as the reference noise, the DTC noise, and the TA noise, will be superposed on the TDC quantization noise and worsen the in-band phase noise. In the simulation, a phase noise of less than -110 dBc/Hz at 500 kHz can be achieved at the 52-MHz reference because of the improved in-band phase noise. Another factor that limits the wide-BW operation is the forward loop latency from the TDC input to the digital loop filter (DLF) output, as shown in Fig. 12(a). The larger the D in  $Z^{-D}$  is, the worse the stability will be at a wide BW. In [24], a latency of nearly  $3T_{\text{REF}}$  limits its maximum BW to around 4 MHz at a 50-MHz reference clock (CLK). In a conventional design [25], the DLF is separated into two paths, the proportional path ( $K_P$  path) and the integral path ( $K_I$  path), without digital summing at the DLF output. Both paths are fed into different varactors in the VCO through DACs. This reduces the latency from the TDC inputs to the oscillator interfaces. However, the gains of the two paths vary with PVT, which requires two gain calibration units. Moreover, the lack of retiming at the proportional path will produce glitches and hence worsen the phase noise. In the proposed technique, as shown in Fig. 12(b), the 5-bit coarse quantizer output is retimed by its own output. As shown in Fig. 12(c), after the quantizer finishes the quantization and acquires the thermal bit data (raw data), the quantizer output CLK T<sub>ON</sub> is reused as the TDC CLK to retime raw data at the TDC decoder. The quantizer produces the 5-bit TDC data with the aligned CLK. Overall, the proposed TDC has a latency of  $\Delta T_{\text{TDC}} + \Delta T_1 \approx 2$ ns, in which  $\Delta T_1 > \Delta T_{\text{Dec}}$ , where  $\Delta T_{\text{Dec}}$  is the operation time of the decoder. To reduce the DLF latency, the TDC CLK is further reused as the DLF CLK with a  $\Delta T_2$  delay. If  $\Delta T_2 > \Delta T_{\text{DLF}}$ , where  $\Delta T_{\text{DLF}}$  is the operation time of the DLF, the DCO code is still retimed by the same CLK edge of the TDC output. As a result, the reduced loop latency



Fig. 12. (a) Forward-loop of DPLL with  $D \cdot T_{REF}$  latency. (b) TDC with the proposed loop-latency reduction. (c) Timing chart of the loop-latency reduction.



Fig. 13. Phase noise simulations of the 5-MHz-BW DPLL with different loop latencies.

 $\Delta T_{\text{Latency}}$  is  $\Delta T_{\text{TDC}} + \Delta T_1 + \Delta T_2 + \Delta T_{\text{DAC}} < 0.5T_{\text{REF}}$ , where  $\Delta T_{\text{DAC}}$  is the latency from the RDAC. Since all the data from the TDC and the DLF are retimed by a clean edge, the glitches are removed and the phase noise will not be degraded. The meta-stability issues due to the retiming operations of the loop-latency reduction path have been checked in post-layout simulations. To demonstrate the effect of the loop latency reduction, different latencies are added to the forward PLL loop as shown in Fig. 13. With a  $3T_{\text{REF}}$  latency, large jitter peaking appears adjacent to the corner frequency of the DPLL BW. With the proposed loop latency reduction, the phase noise peaking is completely eliminated even at a BW of 5 MHz, and the integrated phase noise is improved by more than 12 dB compared with the  $3T_{\text{REF}}$  case.
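The numbers quoted in this subsection can be cross-checked with a few lines of Python. The sketch below rests on three assumptions that are not stated in the text: Gardner's limit is taken as the rule of thumb BW ≤ f<sub>REF</sub>/10, the TDC quantization noise uses the widely quoted uniform-quantization expression $(2\pi)^2/12 \cdot (t_{\text{RES}}/T_{\text{DCO}})^2 / f_{\text{REF}}$, and the DCO runs at 2.44 GHz. The extra phase lag of a pure delay, $360^\circ \cdot f \cdot \tau$, is standard.

```python
import math

F_DCO = 2.44e9  # assumed DCO frequency in Hz
F_REF = 52e6    # doubled reference frequency in Hz


def gardner_bw_limit(f_ref_hz):
    """Rule-of-thumb maximum loop BW, taken here as f_ref / 10 (an assumption)."""
    return f_ref_hz / 10


def tdc_noise_dbc_hz(t_res_s, f_dco_hz=F_DCO, f_ref_hz=F_REF):
    """In-band phase noise from uniform TDC quantization (assumed expression)."""
    noise = (2 * math.pi) ** 2 / 12 * (t_res_s * f_dco_hz) ** 2 / f_ref_hz
    return 10 * math.log10(noise)


def delay_phase_lag_deg(f_hz, delay_s):
    """Phase lag that a pure loop delay adds at frequency f_hz."""
    return 360 * f_hz * delay_s


print(gardner_bw_limit(26e6) / 1e6, "MHz BW limit at 26 MHz")
print(gardner_bw_limit(52e6) / 1e6, "MHz BW limit at 52 MHz")
print(round(tdc_noise_dbc_hz(20e-12), 1), "dBc/Hz at 20-ps resolution")
print(round(tdc_noise_dbc_hz(2.5e-12), 1), "dBc/Hz at 2.5-ps resolution")
print(round(delay_phase_lag_deg(5e6, 3 / F_REF), 1), "deg lag at 5 MHz for 3 T_REF")
print(round(delay_phase_lag_deg(5e6, 0.5 / F_REF), 1), "deg lag at 5 MHz for 0.5 T_REF")
```

Under these assumptions the quantization-noise figures land close to the quoted -98 and -116 dBc/Hz, and a $3T_{\text{REF}}$ delay alone costs roughly 100° of phase at 5 MHz, which is consistent with the peaking described for Fig. 13.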



Fig. 14. Proposed BLE RX baseband with DPLL-based ADC and phase/frequency synchronization loop.



Fig. 15. (a) Simulated result w/o frequency and phase synchronization loop. (b) Simulated results w/ frequency and phase synchronization loop.

## C. Hybrid-Loop RX With Phase and Frequency Recovery Loop

As mentioned earlier, the single-path downconversion method [8] reduces by half the energy consumption and the area from the ABB and ADC in the RX. However, the unknown carrier phase and frequency will degrade the SNR of the down-converted signal. If there is a constant-phase mismatch between the LO and  $V_{\rm RF}$ , the amplitude of the down-converted ABB signal  $V_{\rm ABB}$  will be degraded.  $V_{\rm ABB}$  is digitized to  $D_{\rm DBB}$  and is further processed by a DPSK decoder in the digital baseband (DBB) to acquire the 0/1 data. With the noise associated with the decoder in the DBB, the threshold will be a Gaussian distribution instead of a constant value. The reduced amplitude of  $V_{\rm ABB}$  and the noise will significantly degrade the bit error rate (BER) of the RX.
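The GFSK-to-DPSK transformation that underlies the single-path method can be illustrated with a short calculation. The Python sketch below assumes a 1-Msymbol/s BLE signal with the nominal ±250-kHz deviation (modulation index 0.5), ignores the Gaussian pulse shaping, and places the LO 250 kHz below the carrier; the sign convention and the function name are hypothetical.

```python
import math

SYMBOL_RATE = 1e6  # assumed BLE symbol rate in symbols/s
F_DEV = 250e3      # assumed nominal GFSK deviation in Hz (modulation index 0.5)
LO_SHIFT = 250e3   # LO offset from the carrier in Hz, as described in the text


def phase_advance(bit, lo_shift=LO_SHIFT):
    """Baseband phase rotation over one symbol after the LO shift, in radians."""
    f_tx = F_DEV if bit else -F_DEV  # transmitted offset from the carrier
    f_bb = f_tx + lo_shift           # offset from the shifted LO (assumed sign)
    return 2 * math.pi * f_bb / SYMBOL_RATE


print(phase_advance(0))  # no rotation over the symbol
print(phase_advance(1))  # a rotation of pi over the symbol
```

With these assumptions one bit value leaves the phase unchanged and the other rotates it by π per symbol, so the data can be recovered from a single mixer output by a DPSK decoder.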

Fig. 14 shows the proposed RX baseband, and a phase and frequency synchronization loop is implemented to improve the SNR of the down-converted signal. Fig. 15 shows the simulated results with and without the synchronization loop. The worst case phase shift of  $\pi/2$  is assumed in the I-channel signal as discussed in [8]. As shown in Fig. 15(a), without synchronization, the amplitude of the down-converted signal



Fig. 16. Proposed DPLL-centric BLE TRX.

of  $V_{ABB}$  is greatly degraded. As a result,  $D_{DBB}$  will be falsely decoded. With the synchronization, a timing error detector (TED) is placed after the FIR filter to detect the amplitude degradation. When the amplitude of  $V_{ABB}$  is recovered to its maximum value, we have

$$(x[n \cdot T_S] - x[(n-1) \cdot T_S]) \cdot (x[(n-0.5) \cdot T_S]) = 0 \quad (5)$$

where T<sub>S</sub> is the 13-MHz sampling CLK and  $x[(n-0.5) \cdot T_S]$  is the half-symbol point between the current symbol  $x[n \cdot T_S]$  and the previous symbol  $x[(n-1) \cdot T_S]$ , as shown in Fig. 15(b). The detected phase error is filtered, converted into a control code, and added to the DPLL frequency control word (FCW) to instantaneously change the DPLL phase by varying the output frequency. The  $V_{ABB}$  amplitude is largely recovered, as shown in Fig. 15(b). A settling time of six data symbols is achieved in the simulation. However, due to the long delay from V<sub>RF</sub> to the TED input in Fig. 14, which is dominated by the fourth-order LPF, the settling time of the phase and frequency recovery loop is degraded if a large carrier frequency offset is present. When simulated with a carrier frequency offset of  $\pm 100$  kHz, nearly 30  $\mu$s is required for proper settling. This settling time exceeds the 8-symbol preamble time required by the BLE specification. The settling-time requirement can be met by dynamically changing the BW of the LPF. When the received signal is detected, the BW of the LPF is widened to minimize the delay from V<sub>RF</sub> to the TED input for fast settling of the synchronization loop, at the cost of degraded ACR performance. After the synchronization loop has settled, the BW of the LPF is minimized. In the proposed architecture, there is thus a tradeoff between the required preamble length and the ACR performance. In the present study, the LPF is optimized for better ACR performance. Another issue is that large interference causes additional noise in (5), which degrades the BER performance. However, a higher order LPF can be adopted to suppress this extra noise.
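The detector condition in (5) has the same form as a Gardner-type error detector, and its behaviour is easy to see on a toy waveform. The Python sketch below is an illustration only: the cosine waveform standing in for an alternating symbol pattern and the helper names are hypothetical, and no loop filter or FCW update is modelled.

```python
import math


def ted_error(prev_sym, mid_sample, curr_sym):
    """Left-hand side of (5): (x[n] - x[n-1]) * x[n-0.5]."""
    return (curr_sym - prev_sym) * mid_sample


def sample(t_over_ts):
    """Hypothetical baseband waveform for alternating symbols, peaking at integer t/T_S."""
    return math.cos(math.pi * t_over_ts)


def error_at_offset(offset, n=2):
    """Detector output when every sampling instant is shifted by offset (in units of T_S)."""
    return ted_error(sample(n - 1 + offset),
                     sample(n - 0.5 + offset),
                     sample(n + offset))


for off in (-0.1, 0.0, 0.1):
    print(f"offset {off:+.1f} T_S -> error {error_at_offset(off):+.3f}")
```

When the symbol samples sit on the waveform peaks the mid-point sample is a zero crossing and the product vanishes; an early or late offset gives an error of opposite sign, which is what lets the loop steer toward the condition in (5).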

#### III. BUILDING BLOCKS OF THE BLE TRANSCEIVER

Fig. 16 shows the proposed BLE TRX, which uses multiple loops for supporting the GFSK data modulation

and demodulation. The proposed RX adopts the concept of single-channel demodulation by transferring the GFSK constellation to a DPSK constellation [8]. The DPLL-based ADC is used as the LO source and also performs the digitization. The DAC feedback path is used to improve the dynamic range of the DPLL-based ADC. The synchronization path is utilized to synchronize the phase and frequency between the LO and RX input (RX IN). The reference doubler and loop latency reduction techniques are used to support 5-MHz-BW operation of the DPLL. The coarse TDC and a gated loop filter are used to increase the phase locking speed while saving energy after the phase is locked by the PLL path [26]. The digitized data  $D_{OUT}$, with a sampling rate of 52 MHz, is decimated by a factor of four by a cascaded integrator-comb (CIC) decimation filter. The multiplierless structure of the CIC filter reduces the power. However, the magnitude response of the CIC filter is not flat in the passband; it shows a small attenuation toward the band edge. Hence, a CIC-compensation filter is required to compensate this attenuation and obtain a flat in-band response. The CIC-compensation filter has 27 taps, and the overall channel-select filter achieves a BW of 1 MHz with 10 dB stopband attenuation. The filtered data are further processed by a symbol timing recovery block [27], which recovers the symbol timing and sends the correct timing to the DPSK decoder. The DFM modulation path serves the frequency modulation in the TX. All of these functions are completed by the low-power fractional-N DPLL acting as the central component of the TRX. Reusing the low-power DPLL saves a significant amount of power and minimizes the TRX power consumption without sacrificing performance.
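
The multiplierless property of the CIC decimator and its passband attenuation can be illustrated with a short sketch. The decimation factor of four and the 52-MHz input rate follow the text; the number of stages (three) and the function names are assumptions made only for illustration, since the filter order is not stated here.

```python
import math

def cic_decimate(samples, rate=4, stages=3):
    """Decimate integer samples with a CIC filter using only adds/subtracts."""
    data = list(samples)
    # Integrator section, running at the input rate.
    for _ in range(stages):
        acc = 0
        integrated = []
        for value in data:
            acc += value
            integrated.append(acc)
        data = integrated
    # Keep every `rate`-th sample.
    data = data[rate - 1::rate]
    # Comb section, running at the output rate (differential delay of 1).
    for _ in range(stages):
        previous = 0
        combed = []
        for value in data:
            combed.append(value - previous)
            previous = value
        data = combed
    return data

def cic_droop_db(freq_hz, fs_hz=52e6, rate=4, stages=3):
    """CIC magnitude at freq_hz relative to DC, in dB.

    Valid for 0 < freq_hz below the first null at fs_hz / rate.
    """
    num = math.sin(math.pi * freq_hz * rate / fs_hz)
    den = rate * math.sin(math.pi * freq_hz / fs_hz)
    return 20 * stages * math.log10(abs(num / den))

settled = cic_decimate([1] * 40)[2:]   # skip the start-up transient
droop_at_500k = cic_droop_db(500e3)
```

After the start-up transient, a constant input of 1 produces the DC gain `rate ** stages` at the output, and `cic_droop_db` returns a small negative value inside the channel, which is the attenuation a compensation filter would need to equalize.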

## A. Receiver Front-End Design

The radio frequency front-end and the analog front-end are the most power-consuming parts of a ULP TRX, so their power consumption should be minimized. As reported in [28] and [29], various low-power front-end structures have been proposed for ULP TRXs. The low-noise amplifier (LNA) is the most power-hungry component due to its noise, linearity, and gain requirements. In [8], the differential LNA alone consumes 0.97 mW from a low supply voltage of 0.6 V.



Fig. 17. (a) RX front-end implementation with a total supply of 1 V. (b) Low-power source-degenerated LNA with stacked gm-cell. (c) LC oscillator.



Fig. 18. (a) Small area and highly balanced stacked balun. (b) EM simulation of the stacked balun.

To improve the power efficiency of the LNA in [8], a supply of 0.6 V is used instead of the 1.1 V used for the other analog circuits. However, the LNA then requires an additional dc-dc converter and a low-dropout (LDO) regulator to obtain the 0.6-V supply.

In order to avoid using a different supply voltage while maintaining the power efficiency, a new LNA topology is needed. Fig. 17(a) shows the entire RX-FE implementation. To realize a significant power reduction while achieving the required performance without lowering the supply voltage, a new current-reused single-to-differential LNA is proposed, as shown in Fig. 17(b). With a fully on-chip matching network, a single-ended LNA with a stacked differential transconductance amplifier (gm-cell) is implemented. A balun is inserted between the LNA and the gm-cell to serve both as the inductive load of the LNA and as a single-to-differential converter. The proposed topology can share the same supply voltage with other analog building blocks without the need for an additional low-voltage supply to maintain the current efficiency. The input signal is amplified in the voltage domain by the LNA and transformed into a differential signal by the passive balun. To save chip area, a stacked single-to-differential balun architecture is adopted [30]. As shown in Fig. 18(a), this balun is composed of three turns of primary windings in the top metal (M9) and four turns of secondary windings in M8. This stacked balun structure has a high coupling factor while occupying the same area as a single inductor. With the center tap of the secondary windings connected to ground, the single-ended input signal is transformed into differential signals, which are directly connected to the stacked differential gm-cell inputs. In electromagnetic

simulation, the phase imbalance between the differential ports is only 0.9° and the amplitude imbalance is less than 0.1 dB at the operating frequency of interest, as shown in Fig. 18(b). To ensure all transistors operate in the linear region, the biases and the transistor sizes of the gm-cell and the LNA are optimized in simulations. The total current flowing in the gm-cell and its bias condition decide the drain voltage (VDD<sub>LNA</sub>) for the LNA transistor. Consequently, the dc current is reused between the gm-cell and the source-degenerated LNA. To handle mismatch between the two branches of the gm-cell, a 10-pF capacitor is implemented at the LNA's VDD to realize an AC ground. Using a 1-V supply, the power consumption of this stacked structure is only 0.7 mW. With fully on-chip impedance matching, the minimum noise figure of this LNA with a stacked gm-cell is 4 dB. With an inverter-based gm-cell, the RF signal is transformed into the current domain, which relaxes the linearity requirements for the mixers and the analog front-end. A passive double-balanced mixer is implemented to avoid the flicker noise and the power overhead of an active mixer in the voltage domain. A fourth-order LPF with a BW of 750 kHz is implemented for higher blocker rejection. The gain of the RX chain can be controlled to allow different input levels, as shown in Fig. 17(a). The switched-capacitor bank ( $C_{\text{BANK}}$ ) is used for accurately controlling the passband of the LNA, as shown in Fig. 17(b). The gain control technique in [31] is used to digitally control the LNA gain. The measured gain of the LNA and gm-cell can be adjusted from 12 to 46 dB, and the PGA has a measured gain control range of 28 dB. The measured 1-dB compression point of the RX is -14.2/-22.0/-45.5 dBm and the measured in-band IIP3 of the RX is -3.5/-11.5/-32.5 dBm in the low/medium/high gain setting of the LNA.
The out-of-band IIP3 (OBIIP3) of +2 dBm is measured by feeding two-tone signals at 2.5000 and 2.5661 GHz to the RX input with an LO frequency of 2.434 GHz. The OBIIP2 of +58 dBm is measured by feeding two-tone signals at 2.5000 and 2.5001 GHz to the RX input with an LO frequency of 2.434 GHz. The detailed oscillator implementation is shown in Fig. 17(c). A CMOS-type LC architecture is utilized for the low-power operation. Both the digital capacitor bank and the varactor bank are implemented for the DPLL-based ADC. The varactor bank consists of four identical varactor cells to perform over a 100-mV linear range. The wide BW DPLL operation relaxes the oscillator phase noise requirement. The simulated phase noise is -110 dBc/Hz at a 1-MHz offset with a power consumption of 0.21 mW. The tuning range of the oscillator is designed from 2.2 to 2.6 GHz to cover the 80-MHz BLE band.
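
The two-tone frequencies quoted above can be checked with a few lines of arithmetic. The sketch below uses the tone and LO frequencies from the text; the helper names are hypothetical, and frequencies are kept as integers in hertz to avoid rounding.

```python
LO_HZ = 2_434_000_000
LPF_BW_HZ = 750_000  # fourth-order LPF bandwidth quoted in the text

def third_order_products(f1_hz, f2_hz):
    """Return the two close-in third-order intermodulation frequencies."""
    return 2 * f1_hz - f2_hz, 2 * f2_hz - f1_hz

def second_order_difference(f1_hz, f2_hz):
    """Return the second-order difference-frequency product."""
    return abs(f2_hz - f1_hz)

# OBIIP3 test: tones at 2.5000 GHz and 2.5661 GHz.
im3_low, im3_high = third_order_products(2_500_000_000, 2_566_100_000)
im3_offset_from_lo = im3_low - LO_HZ

# OBIIP2 test: tones at 2.5000 GHz and 2.5001 GHz.
im2_baseband = second_order_difference(2_500_000_000, 2_500_100_000)
```

Under these numbers the lower third-order product falls 100 kHz from the LO and the second-order product falls at 100 kHz in baseband, so both land inside the 750-kHz LPF bandwidth, which is why these tone spacings exercise the out-of-band linearity.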

## B. Single-Point Direct Frequency-Modulation TX

The DFM-TX [32]–[34] has drawn considerable attention for BLE applications because of its simplicity compared with the Cartesian TX when performing FSK modulation. A wide-BW DPLL is capable of realizing a wider TX modulation BW. However, a DPLL generally requires additional power to increase its BW. In this paper, thanks to the low-power wide-BW DPLL with low spurs and good in-band phase noise proposed in Section II-B, the single-point DFM TX with low-power



Fig. 19. Block diagrams of single-point DFM TX.



Fig. 20. Chip photo of BLE TRX.



Fig. 21. Measured DPLL phase noise at 2441.75 MHz with the TX/RX OFF and with different signal power of the input when the RX is ON.

consumption can be realized. The pulling effect from the PA to the oscillator at PA startup becomes severe if the oscillator and PA work at the same frequency. This effect becomes dominant at a very large PA output power and degrades the settling time of the DPLL. The wide-BW operation of the DPLL helps reduce the frequency settling time of the DPLL at PA startup. Fig. 19 shows the single-point DFM TX design. A class-D PA [35] is implemented to improve the power efficiency, but it results in a large third-order harmonic at the PA output. Hence, an off-chip filter is used to suppress this harmonic. For test purposes, the 1-Mb/s data are generated by a data pattern generator that is not synchronized with the on-chip reference CLK of 26 MHz. To avoid metastability, two D flip-flops (DFFs) working at 13 MHz are used to retime the 1-Mb/s TX data. The encoder converts each 1-bit datum into a 10-bit signed fixed-point number.



Fig. 22. Measured stability of the RX when the large in-band blockers and the desired signal are fed to the RX.



Fig. 23. Measurement result of the ADC SNDR.

The data are filtered by a digital GFSK filter with a bandwidth-time product (BT) of 0.5 and a modulation index of 0.5. The GFSK filter output $y$ is normalized using a constant gain of  $\eta$ , which yields a modulation code of  $c_{\text{mod}} = y \cdot \eta$ . The modulation code is added to the FCW, which has a 6-bit integer part and an 18-bit fractional part. The fractional-N DPLL has a gain of  $K_{\text{DPLL}} = 52 \text{ MHz}/2^{18} \text{ LSB} \approx 200 \text{ Hz/LSB}$  at the FCW input. At the output of the PLL,  $f_{\text{out}} = (\text{FCW} + c_{\text{mod}}) \cdot K_{\text{DPLL}} = f_{\text{LO}} + c_{\text{mod}} \cdot K_{\text{DPLL}}$ .
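
As a worked example of the FCW arithmetic, the sketch below evaluates $K_{\text{DPLL}}$ and the modulation code for a peak deviation of 250 kHz, which is what a modulation index of 0.5 implies at 1 Mb/s. It assumes that the filter output $y$ is normalized to $|y| \le 1$; that normalization, the carrier of 2441.75 MHz used as an example, and the function names are illustrative choices rather than details confirmed by the text.

```python
F_REF_HZ = 52_000_000          # frequency in the K_DPLL expression
FRAC_BITS = 18                 # fractional bits of the FCW
K_DPLL_HZ_PER_LSB = F_REF_HZ / 2 ** FRAC_BITS

def frequency_to_fcw(f_hz):
    """Quantize a target frequency to the nearest FCW code."""
    return round(f_hz / K_DPLL_HZ_PER_LSB)

def fcw_to_frequency(fcw):
    """Output frequency produced by a given FCW code."""
    return fcw * K_DPLL_HZ_PER_LSB

def modulation_gain(peak_deviation_hz):
    """Gain eta mapping y = +/-1 to +/- peak_deviation_hz (assumes |y| <= 1)."""
    return peak_deviation_hz / K_DPLL_HZ_PER_LSB

fcw_lo = frequency_to_fcw(2_441_750_000)
integer_part = fcw_lo >> FRAC_BITS      # must fit in the 6-bit integer field
eta = modulation_gain(250_000)
c_mod = round(1.0 * eta)                # code for full-scale y = +1
deviation_hz = fcw_to_frequency(fcw_lo + c_mod) - fcw_to_frequency(fcw_lo)
```

The exact step is slightly below 200 Hz/LSB, so the realized deviation differs from the 250-kHz target by less than half an LSB of the FCW.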

## IV. MEASUREMENT

The prototype of the proposed BLE TRX is implemented in a 65-nm CMOS technology. The chip micrograph is shown in Fig. 20. The measured phase noise of the fractional-N DPLL is shown in Fig. 21. The DPLL achieves a phase noise of -110 dBc/Hz at 1-MHz offset frequency with around



Fig. 24. (a) Measurement result of the PGA output without phase and frequency synchronization loop. (b) Measurement result of the PGA output with phase and frequency synchronization loop.

a 5-MHz BW at a frequency of 2441.75 MHz, while no significant jitter peaking is observed thanks to the loop-latency reduction technique. The worst case in-band phase noise is -108 dBc/Hz at 1-MHz offset under 80 °C. The measured worst in-band fractional spur of the DPLL over all BLE channels is -51.7 dBc. To validate the input power tolerance of the RX, BLE signals of different power levels are applied at the LNA input port, as shown in Fig. 21. An input power of up to



Fig. 25. Measured BER with phase and frequency synchronization loop when the carrier frequency offset is presented in the TX signal.

-10 dBm at 2442 MHz is applied at the LNA input in order to demonstrate the specified maximum input power. With the gain adaptation of the LNA and the PGA as well as the wide-BW DPLL operation, the PLL remains locked even with a -10-dBm input. The integrated phase noise degrades by around 1 dB at the desired input of -67 dBm. The single-path downconversion RX is stabilized by the fourth-order LPF and the 5-MHz wide-BW DPLL loop, as explained in Section II-A, when large in-band blockers are present. The phase noise of the DPLL serves as an indicator of the stability of the RX, since it reflects the stability of the LO frequency. In Fig. 22, the desired signal of -67 dBm at 2442 MHz and different levels of in-band blockers at  $\pm 1/\pm 2/\pm 3$  MHz are fed to the RX input at a fixed RX gain. To satisfy the ACR specification, the required levels of in-band blockers are -82/-50/-40 dBm at the adjacent frequencies of  $\pm 1/\pm 2/\pm 3$  MHz. For the blocker at 1 MHz with -40 dBm, which is 42 dB higher than the BLE specification, the LO frequency is still stable, as shown in Fig. 22(a). The blocker at -1 MHz, shown in Fig. 22(d), has the biggest impact on the RX system, as it is suppressed less by the LPF due to the 250-kHz shift of the RX LO frequency used by the single-path downconversion demodulation method [8]. However, this -50-dBm blocker level is still much higher than the requirement of -82 dBm in the BLE specification. A -20-dBm blocker at  $\pm 2$  MHz degrades the stability of the RX and generates noise at the lower offset frequencies, as shown in Fig. 22(b) and (e). However, sufficient margin is left for the ACR specification. With the help of the fourth-order LPF, the blockers at  $\pm 3$  MHz do not degrade the stability of the loop even at -20 dBm, as shown in Fig. 22(c) and (f). A higher order LPF can be adopted to achieve better RX stability and higher blocker tolerance, at the cost of more power.
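
The blocker levels discussed above can be put side by side with the required levels. The sketch below only restates numbers from this paragraph; the dictionary and function names are hypothetical, and the tabulated levels are the blocker powers applied in the LO-stability test of Fig. 22, not BER-based ACR results.

```python
DESIRED_DBM = -67

# Required in-band blocker levels at |offset| = 1/2/3 MHz, from the text.
REQUIRED_BLOCKER_DBM = {1: -82, 2: -50, 3: -40}

# Blocker levels discussed for each offset (MHz) in the stability test.
TESTED_BLOCKER_DBM = {+1: -40, -1: -50, +2: -20, -2: -20, +3: -20, -3: -20}

def carrier_to_interference_db(offset_mhz):
    """C/I implied by the required blocker level at this offset."""
    return DESIRED_DBM - REQUIRED_BLOCKER_DBM[abs(offset_mhz)]

def margin_db(offset_mhz):
    """How far the tested blocker level sits above the required level."""
    return TESTED_BLOCKER_DBM[offset_mhz] - REQUIRED_BLOCKER_DBM[abs(offset_mhz)]

margins = {offset: margin_db(offset) for offset in TESTED_BLOCKER_DBM}
```

With these inputs the +1-MHz case reproduces the 42-dB figure quoted in the text, and the -1-MHz case shows the smaller headroom that follows from the 250-kHz LO shift.
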
In order to evaluate the dynamic range of the DPLL-based ADC, a pure sine wave at 250 kHz is applied. The ADC output is monitored using a 10-bit DAC to save pins, and an FFT is performed to calculate the SNDR. To verify the improvement of the proposed method, the DPLL-based ADC can be configured as either the conventional open-loop type with a digital capacitor control path or the proposed closed-loop type. In the conventional open-loop method, a maximum SNDR of 25 dB is achieved at an input of approximately -18 dBFS, as shown in Fig. 23. As the input increases further, the varactor linearity will become worse and the SFDR



Fig. 26. Measured demodulator with the symbol timing recovery decoder.

will degrade dramatically. After the loop is closed through the DAC feedback path, the SNDR continues to increase beyond -18 dBFS and reaches around 43 dB at an input of -6 dBFS. The SNDR starts to degrade above -6 dBFS due to the saturation of the TDC code and the linearity degradation of the varactor. The linearity improvement provided by the DAC feedback path enhances the dynamic range of the ADC by 18 dB, i.e., an improvement of three effective bits. The dynamic-range improvement directly improves the sensitivity and the in-band blocker tolerance.
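
The correspondence between the 18-dB improvement and three effective bits follows from the standard relation ENOB = (SNDR - 1.76 dB) / 6.02 dB. A minimal sketch, using the two peak SNDR values quoted above (the function name is illustrative):

```python
def enob(sndr_db):
    """Effective number of bits for a full-scale sine-wave SNDR in dB."""
    return (sndr_db - 1.76) / 6.02

open_loop_bits = enob(25.0)     # conventional open-loop configuration
closed_loop_bits = enob(43.0)   # with the DAC feedback path
improvement_bits = closed_loop_bits - open_loop_bits
```

The constant 1.76 dB cancels in the difference, so the improvement is simply 18 dB divided by 6.02 dB per bit.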

The phase and frequency synchronization loop is evaluated by turning it on and off, as shown in Fig. 24. Without synchronization, even a very small phase and frequency error degrades the amplitude of the down-converted data at the PGA output, as shown in Fig. 24(a). The analog data cannot be distinguished at the digital baseband, and the decoded data are wrong, as indicated by the question marks. With the synchronization shown in Fig. 24(b), the amplitude of the data is recovered. The BER in the presence of a carrier frequency offset is measured at the desired input power of -67 dBm, as shown in Fig. 25. The synchronization loop covers a carrier frequency offset range of  $\pm 100$  kHz while still satisfying the BER requirement of 0.1%. If a large blocker of -40 dBm accompanies the desired -67-dBm signal, the synchronization loop is affected and the coverage decreases to  $\pm 50$  kHz. The digital baseband is evaluated in Fig. 26. The PGA output, decoded data, and recovered data CLK are measured using an oscilloscope. Due to the constellation transform from GFSK to DPSK, the TX data can be read out as shown in Fig. 26. The symbol recovery circuit after the FIR filter extracts the correct sampling CLK and provides the recovered CLK to the DPSK decoder. The sensitivity is measured by evaluating the BER performance. The data points of the recovered data and the recovered CLK shown in Fig. 26 are exported from the oscilloscope. A total of 10000 symbols are recorded for the data post-processing performed using MATLAB. The BER is computed by comparison with PRBS9 data. A sensitivity of -94 dBm is achieved with the BER still below 0.1%. The blocker performance is measured by setting the desired signal to -67 dBm and applying different levels of blocker power.
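
The BER post-processing described above can be sketched as follows. This is not the MATLAB script used for the measurement; it is an illustrative Python version that assumes the captured bits are already aligned with the reference sequence, and it uses a 9-stage LFSR with taps at stages 9 and 5 as the PRBS9 generator. The function names and the injected error positions are hypothetical.

```python
def prbs9(length, seed=0x1FF):
    """Generate `length` bits from a 9-stage LFSR with taps at stages 9 and 5."""
    if seed & 0x1FF == 0:
        raise ValueError("seed must be non-zero")
    state = seed & 0x1FF
    bits = []
    for _ in range(length):
        feedback = ((state >> 8) ^ (state >> 4)) & 1
        bits.append(feedback)
        state = ((state << 1) | feedback) & 0x1FF
    return bits

def bit_error_rate(received, reference):
    """Fraction of positions where the received bits differ from the reference."""
    if len(received) != len(reference):
        raise ValueError("sequences must have equal length")
    errors = sum(r != t for r, t in zip(received, reference))
    return errors / len(reference)

reference = prbs9(10_000)
received = list(reference)
for index in (10, 500, 2500, 7000, 9999):   # inject five illustrative bit errors
    received[index] ^= 1

ber = bit_error_rate(received, reference)
meets_requirement = ber < 1e-3              # 0.1% BER requirement
```

With 10000 recorded symbols, the 0.1% requirement corresponds to fewer than ten bit errors in the capture, so the resolution of a single measurement is limited to 0.01%.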



Fig. 27. (a) Measurement result of the RX ACR with and without DAC feedback loop. (b) Measurement result of the out-of-band blocker tolerance.



Fig. 28. (a) Measurement result of the TX spectrum mask. (b) Measured eye diagram of the single-point DFM TX. (c) Settling time of DPLL at 0-dBm PA output when DPLL and PA startup simultaneously.

The maximum tolerable blocker level is taken as the blocker power at which the BER exceeds 0.1%. The ACR, one of the most important specifications for a BLE RX, is shown in Fig. 27(a). To demonstrate the dynamic-range improvement of the DPLL-based ADC with and without the DAC feedback



Fig. 29. Measured power consumptions of each building block.

path, the ACRs of both cases are measured. Without the DAC feedback path, the ACR drops below the value specified in the BLE standard at -3 MHz. With the DAC feedback path, the ACR is improved by almost 9 dB at -3 MHz and all points satisfy the BLE standard with sufficient margin. The out-of-band blocker performance, shown in Fig. 27(b), is measured using the same method as the ACR measurement. This performance is mainly limited by the out-of-band rejection of the matching network and the RX linearity.

The single-point DFM TX is measured using a vector signal analyzer. The spectrum at the PA output at the BLE channel of 2434 MHz is shown in Fig. 28(a). The eye pattern is shown in Fig. 28(b). The TX achieves a 1.89% FSK error. The measured worst case GFSK modulation deviation for a 11110000 data pattern, i.e.,  $\Delta f_1$, is  $\pm 249$  kHz. The measured worst case GFSK modulation deviation for a 10101010 data pattern, i.e.,  $\Delta f_2$, is  $\pm 215$  kHz. The measured HD2 and HD3 are -43.0 and -41.4 dBm, respectively, for a PA output of 0 dBm. The settling time of the DPLL is measured at a PA output of 0 dBm when the enable signals of the DPLL and the PA are turned on simultaneously. Due to the large DPLL BW, the mutual pulling effect between the oscillator and the PA is reduced as compared with [2], [3], and [6], and a settling time of less than 5  $\mu$s is achieved, as shown in Fig. 28(c). The measured power consumption breakdowns of the RX and TX, including the DBB, are shown in Fig. 29. A power consumption of 2.6 mW is achieved for the RX at maximum gain, while 5.2 mW is consumed by the TX when delivering 0-dBm output power. A detailed comparison with the state-of-the-art BLE 4.0 TX/RX is shown in Table I.
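
The TX figures above can be combined into two derived quantities: the TX efficiency at 0-dBm output and the ratio of the two measured deviations. The sketch below uses the 5.2-mW total quoted in this paragraph and the 5.0-mW figure quoted in the abstract; treating the latter as the TX power without the DBB is an assumption, and the function names are illustrative.

```python
def dbm_to_mw(power_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (power_dbm / 10)

def tx_efficiency(p_out_dbm, p_dc_mw):
    """Ratio of RF output power to consumed dc power."""
    return dbm_to_mw(p_out_dbm) / p_dc_mw

efficiency_without_dbb = tx_efficiency(0, 5.0)   # 5.0 mW from the abstract
efficiency_with_dbb = tx_efficiency(0, 5.2)      # 5.2 mW including the DBB

# Deviation of the fast-alternating pattern relative to the slow pattern.
deviation_ratio = 215 / 249
```

The deviation ratio below one reflects the Gaussian filtering with BT = 0.5: the 10101010 pattern never settles to the full deviation that the 11110000 pattern reaches.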

JSSC'16 JSSC'17 ISSCC'15 ISSCC'15 ISSCC'15 ISSCC'18 ISSCC'17 This Work [8] [2] [6] [5] [3] [10] [9] Technology 65nm 65nm 28nm 40nm 40nm 55nm 40nm 40nm MSK/HS-OOPSK Modulation(Data Rate) GFSK(1Mbps) GFSK(1Mbps) GFSK(1Mbps) GFSK(1Mbps) GFSK(1Mbps) GFSK(1Mbps) GFSK(1Mbps) (2Mbps) ANT SW1/ ANT SW/RF/ RF/ADPLL/ ANT SW/RF/PLL/ RF/ADPLL/ Integration level RF/DPLL/DBB RF/ADPLL/DBB RF/DCO PLL/PMU<sup>3</sup> **RF/ADPLL** DBB/XO<sup>2</sup> DBB/PMU/XO DBB/XO -90dBm -87dBm **RX** sensitivity -94dBm -95dBm -94dBm -94.5dBm -94.5dBm -95dBm 4/25/35 dB 2/32/N.A. dB N.A./18/30 dB 20 dB@5MHz RX ACR(1/2/3MHz) 1/31/36 dB N.A./24/29 dB N.A. N.A. **Blocker** Power -1dBm, -6dBm, -20dBm, -42dBm, -18dBm, 4.5dBm, -1dBm, -22dBm, (30~2000MHz. -13dBm, -22dBm. -25dBm, -25dBm. -28dBm, -9dBm. -15dBm. -33dBm, 2003~2399MHz, -12dBm, -24dBm. -17dBm. -16dBm. -24dBm. -28dBm. -9dBm. N.A. 2484~2997MHz. 1dBm 0dBm -7dBm N.A. -13dBm >9dBm -8dBm N.A. 3000~12750MHz)  $1.46 \text{ps}^4$ PLL Integrated Jitter 0.85ps 0.98ps 1.71ps N.A. N.A. N.A. N.A. (Integration BW) (10kHz-20MHz) (10kHz-10MHz) (10kHz-10MHz) (1kHz-100MHz) 2-point DFM Single-point DFM 2-point DFM 2-point DFM Up conversion N.A. **TX** Architecture Up conversion N.A. **TX FSK Error** 1 89% N.A. 2 67% 48% N.A N.A. 20% N.A **TX Output Power** 0dBm N.A. 0dBm -2dBm 0dBm 0dBm 1.8dBm N.A 15% 20% N.A. 21% 13% 10% 25% N.A TX Efficiency Supply Voltage 1V0.6/1.1V 0.5/1V 1V1.1V 0.9~3.3V 0.8V 0.85 DBB 0.3mW 0.5mW N.A. 0.4mW N.A. 0.74mW N.A. RX 11.2mW 2.3mW 3.75mW 6.3mW 1.55mW 5.5mW 3.3mW 2.3mW Analog **Power Consumption** DBB 0.2mW N.A. N.A. 0.2mW N.A. N.A. N.A. ΤХ 10.1mW Analog 5.0mW N.A. 4.7mW 4.2mW 7.7mW 6.1mW N.A. TRX Active Area 1.64mm<sup>2</sup> N.A. 1.9mm<sup>2</sup> 1.3mm<sup>2</sup> 1.1mm<sup>2</sup> 2.9mm 0.8mm<sup>2</sup> 0.3mm

1. Integrated antenna switch.

2. Integrated crystal oscillator.

3. Integrated power management unit.

4. Estimated from phase noise plot.

The RX consumes less power while achieving better blocker performance.

## V. CONCLUSION

A BLE TRX for IoT applications is demonstrated in a 65-nm CMOS technology. A wide-BW fractional-N DPLL plays a central role in the BLE TRX, which minimizes the number of required circuit blocks and thereby the power consumption. A DPLL-based ADC with the dynamic-range enhancement technique is proposed and greatly improves the sensitivity level and the interference tolerance. The proposed DPLL-based ADC can be utilized in narrow-band wireless applications. Loop-latency reduction and the reference doubler help to mitigate the jitter peaking at the 5-MHz BW of the DPLL using only a 26-MHz reference CLK and improve the stability of the RX. The phase and frequency synchronization loop assists the proper demodulation in the single-path downconversion RX. For the single-point DFM TX, the wide BW of the DPLL improves the settling time of the DPLL at TX startup.

#### **ACKNOWLEDGMENTS**

This paper is based on results obtained from a project commissioned by the New Energy and Industrial Technology Development Organization (NEDO).

#### References

- [1] Y.-H. Liu *et al.*, "A 1.9 nJ/b 2.4 GHz multistandard (Bluetooth Low Energy/Zigbee/IEEE802.15.6) transceiver for personal/body-area networks," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2013, pp. 446–447.

- [2] F.-W. Kuo *et al.*, "A Bluetooth low-energy transceiver with 3.7-mW all-digital transmitter, 2.75-mW high-IF discrete-time receiver, and TX/RX switchable on-chip matching network," *IEEE J. Solid-State Circuits*, vol. 52, no. 4, pp. 1144–1162, Apr. 2017.
- [3] J. Prummel et al., "A 10 mW Bluetooth low-energy transceiver with on-chip matching," *IEEE J. Solid-State Circuits*, vol. 50, no. 12, pp. 3077–3088, Dec. 2015.
- [4] Y. H. Liu, A. Ba, J. H. C. V. D. Heuvel, K. Philips, G. Dolmans, and H. D. Groot, "A 1.2 nJ/bit 2.4 GHz receiver with a sliding-IF phaseto-digital converter for wireless personal/body area networks," *IEEE J. Solid-State Circuits*, vol. 49, no. 12, pp. 3005–3017, Dec. 2014.
- [5] T. Sano et al., "A 6.3mW BLE transceiver embedded RX image-rejection filter and TX harmonic-suppression filter reusing on-chip matching network," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2015, pp. 1–3.
- [6] Y.-H. Liu et al., "A 3.7 mW-RX 4.4 mW-TX fully integrated Bluetooth Low-Energy/IEEE802.15.4/proprietary SoC with an ADPLL-based fast frequency offset compensation in 40 nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2015, pp. 236–237.
- [7] A. Wong et al., "A 1 V 5 mA multimode IEEE 802.15.6/Bluetooth low-energy WBAN transceiver for biotelemetry applications," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2012, pp. 300–302.
- [8] A. Sai *et al.*, "A 5.5 mW ADPLL-based receiver with a hybrid loop interference rejection for BLE application in 65 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 51, no. 12, pp. 3125–3136, Dec. 2016.
- [9] Y.-H. Liu et al., "A 770 pJ/b 0.85 V 0.3 mm<sup>2</sup> DCO-based phase-tracking RX featuring direct demodulation and data-aided carrier tracking for IoT applications," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2017, pp. 408–409.
- [10] M. Ding et al., "A 0.8 V 0.8 mm<sup>2</sup> Bluetooth 5/BLE digital-intensive transceiver with a 2.3 mW phase-tracking RX utilizing a hybrid loop filter for interference resilience in 40 nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2018, pp. 446–448.
- [11] D. Tasca, M. Zanuso, G. Marzin, S. Levantino, C. Samori, and A. L. Lacaita, "A 2.9–4.0-GHz fractional-N digital PLL with bang-bang phase detector and 560-fs<sub>rms</sub> integrated jitter at 4.5-mW power," *IEEE J. Solid-State Circuits*, vol. 46, no. 12, pp. 2745–2758, Dec. 2011.

 TABLE I

 Comparison Table of the State-of-the-Art BLE 4.0 TX/RX

- [12] P. Madoglio, M. Zanuso, S. Levantino, C. Samori, and A. L. Lacaita, "Quantization effects in all-digital phase-locked loops," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 54, no. 12, pp. 1120–1124, Dec. 2007.
- [13] C. W. Yao and A. N. Willson, "A 2.8–3.2 GHz fractional-N digital PLL with ADC-assisted TDC and inductively coupled fine-tuning DCO," *IEEE J. Solid-State Circuits*, vol. 48, no. 3, pp. 698–710, Mar. 2013.
- [14] H. Darabi, Radio Frequency Integrated Circuits and Systems. Cambridge, U.K.: Cambridge Univ. Press, 2015.
- [15] Z. Xu et al., "A 3.6 GHz low-noise fractional-N digital PLL using SAR-ADC-based TDC," *IEEE J. Solid-State Circuits*, vol. 51, no. 10, pp. 2345–2356, Oct. 2016.
- [16] V. K. Chillara *et al.*, "An 860 μW 2.1-to-2.7 GHz all-digital PLL-based frequency modulator with a DTC-assisted snapshot TDC for WPAN (Bluetooth Smart and ZigBee) applications," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2014, pp. 172–173.
- [17] Y. He *et al.*, "A 673μW 1.8-to-2.5 GHz dividerless fractional-N digital PLL with an inherent frequency-capture capability and a phase-dithering spur mitigation for IoT applications," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2017, pp. 420–421.
- [18] Y.-H. Liu *et al.*, "An ultra-low power 1.7–2.7 GHz fractional-N subsampling digital frequency synthesizer and modulator for IoT applications in 40 nm CMOS," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 64, no. 5, pp. 1094–1105, May 2017.
- [19] J. Z. Ru, C. Palattella, P. Geraedts, E. Klumperink, and B. Nauta, "A high-linearity digital-to-time converter technique: Constant-slope charging," *IEEE J. Solid-State Circuits*, vol. 50, no. 6, pp. 1412–1423, Jun. 2015.
- [20] F. M. Gardner, "Charge-pump phase-lock loops," *IEEE Trans. Commun.*, vol. COM-28, no. 11, pp. 1849–1858, Nov. 1980.
- [21] Y.-L. Hsueh et al., "A 0.29 mm<sup>2</sup> frequency synthesizer in 40 nm CMOS with 0.19 ps<sub>rms</sub> jitter and <-100 dBc reference spur for 802.11ac," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2014, pp. 472–473.
- [22] A. Elkholy, T. Anand, W. S. Choi, A. Elshazly, and P. K. Hanumolu, "A 3.7 mW low-noise wide-bandwidth 4.5 GHz digital fractional-N PLL using time amplifier-based TDC," *IEEE J. Solid-State Circuits*, vol. 50, no. 4, pp. 867–881, Apr. 2015.
- [23] R. B. Staszewski and P. T. Balsara, All-Digital Frequency Synthesizer in Deep-Submicron CMOS. Hoboken, NJ, USA: Wiley, 2006.
- [24] A. Sai, S. Kondo, T. T. Ta, H. Okuni, M. Furuta, and T. Itakura, "A 65 nm CMOS ADPLL with 360 μW 1.6ps-INL SS-ADC-based period-detection-free TDC," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Jan. 2016, pp. 336–337.
- [25] T.-K. Kuan and S.-I. Liu, "A bang bang phase-locked loop using automatic loop gain control and loop latency reduction techniques," *IEEE J. Solid-State Circuits*, vol. 51, no. 4, pp. 821–831, Apr. 2016.
- [26] H. Liu *et al.*, "A 0.98 mW fractional-N ADPLL using 10b isolated constant-slope DTC with FOM of -246 dB for IoT applications in 65 nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2018, pp. 246–248.
- [27] F. Gardner, "A BPSK/QPSK timing-error detector for sampled receivers," *IEEE Trans. Commun.*, vol. COM-34, no. 5, pp. 423–429, May 1986.
- [28] J. Masuch and M. Delgado-Restituto, "A 1.1-mW-RX-81.4-dBm sensitivity CMOS transceiver for Bluetooth low energy," *IEEE Trans. Microw. Theory Techn.*, vol. 61, no. 4, pp. 1660–1673, Apr. 2013.
- [29] Z. Lin, P.-I. Mak, and R. P. Martins, "A 2.4 GHz ZigBee receiver exploiting an RF-to-BB-current-reuse Blixer + hybrid filter topology in 65 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 49, no. 6, pp. 1333–1344, Jun. 2014.
- [30] S. Akhtar, R. Taylor, and P. Litmanen, "A high magnetic coupling, low loss, stacked balun in digital 65 nm CMOS," in *Proc. IEEE Radio Freq. Integr. Circuits Symp.*, Jun. 2009, pp. 513–516.
- [31] F.-H. Chen, S.-Y. Lin, X.-Z. Duo, and X.-W. Sun, "A L-band gain controllable CMOS LNA," in *Proc. Asia–Pacific Microw. Conf.*, Dec. 2009, pp. 1124–1127.
- [32] X. Peng, J. Yin, P.-I. Mak, W.-H. Yu, and R. P. Martins, "A 2.4-GHz ZigBee transmitter using a function-reuse class-F DCO-PA and an ADPLL achieving 22.6% (14.5%) system efficiency at 6-dBm (0dBm) P<sub>out</sub>," *IEEE J. Solid-State Circuits*, vol. 52, no. 6, pp. 1495–1508, Jun. 2017.
- [33] H. Liu *et al.*, "An ADPLL-centric Bluetooth low-energy transceiver with 2.3 mW interference-tolerant hybrid-loop receiver and 2.9 mW single-point polar transmitter in 65 nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2018, pp. 444–445.

- [34] J. Yin, S. Yang, H. Yi, W.-H. Yu, P.-I. Mak, and R. P. Martins, "A 0.2 V energy-harvesting BLE transmitter with a micropower manager achieving 25% system efficiency at 0dBm output and 5.2 nW sleep power in 28 nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2018, pp. 450–452.
- [35] W. Yu, X. Peng, P.-I. Mak, and R. P. Martins, "A high-voltage-enabled class-D polar PA using interactive AM-AM modulation, dynamic matching, and power-gating for average PAE enhancement," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 64, no. 11, pp. 2844–2857, Nov. 2017.



**Hanli Liu** (S'16) received the B.S. degree from the University of Electronic Science and Technology of China, China, in 2013, and the M.S. degree from the Tokyo Institute of Technology, Japan, in 2015, where he is currently pursuing the Ph.D. degree, with a focus on transceivers for Internet-of-Things and low-power low-jitter digital PLLs.

He was an Intern with the Mixed-Signal IC Group, Toshiba Corporate Research and Development Center, Kawasaki, Japan, in 2017, where he was involved in digital PLL architectures. His research interests

include ultra-low-power wireless transceivers for Bluetooth low energy, low-power low-jitter digital PLLs, ultra-low-jitter PLLs for 5G cellular, and high-FOM oscillators.

Mr. Liu was a recipient of the SSCS Predoctoral Achievement Award 2017–2018. He serves as a Reviewer of the IEEE JOURNAL OF SOLID-STATE CIRCUITS and the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION SYSTEMS.



**Zheng Sun** (S'16) received the B.S. degree in information engineering from Southeast University, Nanjing, China, in 2014, and the M.S. degree in information, production and systems engineering from Waseda University, Tokyo, Japan, in 2015. He is currently pursuing the Ph.D. degree in electrical and electronic engineering with the Tokyo Institute of Technology, Tokyo.

He is/was involved in low-power RF, mixed-signal, and digital PLL designs. His current interests include

transceivers for Bluetooth low energy, *LC*-VCO for Internet of Things applications, and harmonic suppression techniques for the power amplifier.



**Dexian Tang** received the B.Eng. degree from the University of Electronic Science and Technology of China, Chengdu, China, in 2014, and the M.Eng. degree from the Tokyo Institute of Technology, Tokyo, Japan, in 2017.

His current research interests include low-power digital PLLs for Internet of Things applications.



**Hongye Huang** (S'16) was born in Guilin, China, in 1994. He received the B.Eng. degree from the University of Electronic Science and Technology of China, Chengdu, China, in 2016, and the M.Eng. degree from the Tokyo Institute of Technology, Tokyo, Japan, in 2018, where he is currently pursuing the Ph.D. degree. His current research interests include mixed-signal integrated circuits and frequency synthesizers.



**Tohru Kaneko** received the B.Eng. degree in electrical and electronic engineering and the M.Eng. and Ph.D. degrees in physical electronics from the Tokyo Institute of Technology, Tokyo, Japan, in 2013, 2015, and 2018, respectively.



**Rui Wu** (S'07–M'15) received the B.S. and M.S. degrees from the University of Electronic Science and Technology of China, Chengdu, China, in 2006 and 2009, respectively, and the Ph.D. degree from the Tokyo Institute of Technology, Tokyo, Japan, in 2015.

Since 2015, he has been a Post-Doctoral Researcher with the Tokyo Institute of Technology. His current research interests include RF/mmW transceivers for high-data-rate wireless communications.



**Zhijie Chen** (S'13–M'16) received the B.E. degree from China Agricultural University, Beijing, China, in 2009, the M.E. degree from Tsinghua University, Beijing, in 2012, and the Ph.D. degree from the Tokyo Institute of Technology, Tokyo, Japan, in 2016.

From 2010 to 2011, he was involved in sigma-delta modulator design with the University of Macau, Macau, as an Exchange Student. Since 2017, he has been an Associate Professor with the Faculty of Information, Beijing University of Technology, Beijing. His current research interests include RF CMOS and mixed-signal circuits.



**Wei Deng** (S'08–M'13–SM'17) received the B.S. and M.S. degrees from the University of Electronic Science and Technology of China, Chengdu, China, in 2006 and 2009, respectively, and the Ph.D. degree from the Tokyo Institute of Technology, Tokyo, Japan, in 2013, all in electronic engineering.

From 2013 to 2014, he was a Post-Doctoral Researcher with the Tokyo Institute of Technology. He has been with Apple Inc., Cupertino, CA, USA, since 2015, where he is currently involved in RF/mm-wave phased-array transceiver architecture and IC design for multi-Gb/s wireless SoC and mixed-signal/analog IC design for Apple A-series processors. He has authored or co-authored over 60 IEEE journal and conference papers and holds four issued U.S. patents. His current research interests include RF/mm-wave wireless transceiver IC and system design.

Dr. Deng was a recipient of several national and international awards, including the China Youth Science and Technology Innovation Award, the IEEE SSCS Predoctoral Achievement Award, the Chinese Government Award for Outstanding Self-Financed (non-government sponsored) Students Abroad, the Tejima Research Award, and the Asia and South Pacific Design Automation Conference (ASP-DAC) Best Design Award.



**Kenichi Okada** (S'99–M'03–SM'16) received the B.E., M.E., and Ph.D. degrees in communications and computer engineering from Kyoto University, Kyoto, Japan, in 1998, 2000, and 2003, respectively.

From 2000 to 2003, he was a Research Fellow with the Japan Society for the Promotion of Science, Kyoto University. From 2003 to 2007, he was an Assistant Professor with the Precision and Intelligence Laboratory, Tokyo Institute of Technology, Tokyo, Japan. Since 2007, he has been an Associate Professor with the Department of Physical Electronics and then the Department of Electrical and Electronic Engineering, Tokyo Institute of Technology, Tokyo, Japan. He has authored or co-authored over 400 journal and conference papers. His current research interests include millimeter-wave CMOS wireless transceivers for 20/28/39/60/77/79/100/300 GHz for WiGig, 5G, satellite, and future wireless systems, digital PLLs, synthesizable PLLs, atomic clocks, and ultra-low-power wireless transceivers for Bluetooth low energy and sub-GHz applications.

Dr. Okada is a member of the Institute of Electronics, Information and Communication Engineers, the Information Processing Society of Japan, and the Japan Society of Applied Physics. He received the Ericsson Young Scientist Award in 2004, the Asian Solid-State Circuits Conference (A-SSCC) Outstanding Design Award in 2006 and 2011, the ASP-DAC Special Feature Award in 2011, the Best Design Award in 2014 and 2015, the Japan Society for the Promotion of Science (JSPS) Prize in 2014, the Suematsu Yasuharu Award in 2015, the MEXT Prizes for Science and Technology in 2017, and over 40 other international and domestic awards. He is/was a member of the technical program committees of ISSCC, Very Large Scale Integration Circuits, and ESSCIRC. He also is/was a Guest Editor and an Associate Editor of the IEEE JOURNAL OF SOLID-STATE CIRCUITS.