

Received 12 October 2023, accepted 26 October 2023, date of publication 30 October 2023, date of current version 7 November 2023. Digital Object Identifier 10.1109/ACCESS.2023.3328772

## **RESEARCH ARTICLE**

# **Design of a High-Efficiency Low-Ripple Buck Converter for Low-Power System-On-Chips**

## SIWAKORN THONGMARK AND WORADORN WATTANAPANITCH, (Member, IEEE)

Department of Electrical Engineering, Faculty of Engineering, Kasetsart University, Bangkok 10900, Thailand

Corresponding author: Woradorn Wattanapanitch (woradorn.w@ku.th)

This work was supported in part by the Faculty of Engineering, Kasetsart University (the Scholarship for Master students); in part by the National Research Council of Thailand under Grant N41A640139; and in part by Kasetsart University.

**ABSTRACT** This paper presents the design of a buck converter for use in low-power system-on-chips (SoCs) with a ripple voltage small enough to directly power sensitive analog circuits without the help of low-dropout voltage regulators (LDOs). To minimize the ripple voltage while maximizing the converter's light-load efficiency, we employ a pulse-frequency modulation (PFM) scheme and a fast duty-cycled comparator to control the converter's output voltage. The duty cycling of the comparator, automatically performed by the Sleep State Controller (SSC), helps improve the light-load efficiency by 48%. Fabricated in a 0.18-$\mu$m CMOS process with an active area of 0.42 mm<sup>2</sup>, the proposed low-ripple buck converter achieves a ripple voltage of 1.6 mV<sub>pp</sub> and an overall efficiency higher than 74.4% over the load-current range from 1.2 $\mu$A to 1.8 mA.

**INDEX TERMS** Switching DC-DC converter, buck converter, low-power, power-management circuits, high efficiency, low ripple voltage, system-on-chips, sensor interfaces, Internet of Things (IoT).

#### I. INTRODUCTION

Over the past few decades, we have witnessed technological innovations in all areas of life imaginable, thanks to the invention of integrated-circuit (IC) technology, which makes possible the development of various system-on-chips (SoCs) that, with low power consumption, can pack tremendous computing power into very small footprints. Small but smart and low-power, SoCs are thus essential in most battery-powered portable/wearable devices requiring distributed intelligence, such as in wireless sensor nodes for environmental and structural monitoring [1], [2] or in wearable/implantable sensors for healthcare [3], [4], [5].

The associate editor coordinating the review of this manuscript and approving it for publication was Artur Antonyan.

Energy being precious, small battery-powered devices require efficient means for distributing energy from their power sources (batteries) to the working electronics. Additionally, in mixed-signal SoCs, "clean" power supplies are often required for powering sensitive analog circuits (e.g., low-noise analog front ends and RF communication circuits) to ensure their proper operation. Therefore, low-dropout voltage regulators (LDOs) [6], [7], [8], [9] are often used to power on-chip sensitive analog circuits due to their linear-feedback operation producing smooth/clean output voltages. However, an LDO offers poor conversion efficiency for SoCs requiring a large step-down of the battery voltage to produce their on-chip supply voltages (high voltage-conversion ratio), as its energy loss is proportional to the voltage drop across its power device. On the contrary, a step-down switching dc-dc converter [10], [11] ("buck converter") can offer much higher conversion efficiency regardless of the voltage step since, theoretically, a buck converter can achieve a conversion efficiency of 100%, although most practical buck converters achieve slightly more than 90% at large load currents [12], [13], [14]. However, due to the switching activities required to achieve such high conversion efficiency, the outputs of buck converters often incur large ripple voltage, thus making them unsuitable for directly powering on-chip sensitive analog circuits. Hence, to produce clean power supplies for on-chip sensitive circuits while minimizing energy loss in the LDOs, a hybrid two-stage power-delivery approach such as that illustrated in Fig. 1 can be employed [15], [16]. In the first step, a buck converter



FIGURE 1. An example of a power-management unit in mixed-signal SoCs.

(buck converter 1) transforms the battery's voltage (around 2.7-4.2 V for lithium-ion batteries [9]) to a lower value (1.2 V), which is higher than the desired target by approximately one minimum dropout voltage required across the LDOs (around 200 mV); in the second step, LDOs transform buck converter 1's output to the targeted value (1 V) for powering sensitive circuits (analog front end and RF circuits). Digital circuits, which are quite tolerant of switching noise on their supply voltages, may obtain their power directly from a buck converter (buck converter 2).

Although the hybrid two-stage approach can significantly reduce power loss in the LDO compared to the traditional approach, we are still curious whether there is a more efficient method of generating clean supply voltages for sensitive analog circuits. Even as the hybrid approach tries to minimize the dropout voltage across the LDO, such voltage is still significant enough to incur a sizable efficiency loss, especially for an LDO producing a low supply voltage. For example, even in an ideal situation in which no extra bias current is consumed besides what is delivered to the load, an LDO producing a 1.2-V supply voltage with a small dropout voltage of 200 mV would still incur an efficiency loss of 14.3%. Such conversion loss could be even worse for LDOs generating the lower supply voltages required by many of today's SoCs [17], [18], and for other light-load applications in which the LDO's bias current is a sizable fraction of the load current. For an SoC designed to operate fully on energy harvested from the environment, the loss in its power-delivery system at light load may dictate the SoC's feasibility in such an energy-harvesting application [19]. The potential to eliminate such conversion loss thus motivates our investigation into the feasibility of eliminating the LDO entirely and employing the buck converter as the sole component of the power-delivery system. Removing the LDO also frees precious chip area for other circuit blocks within the SoC.
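The 14.3% figure above can be reproduced with a short calculation. The sketch below assumes the ideal case described in the text (no bias current, so the LDO's input current equals its load current); the function name and the 0.6-V example are illustrative and do not come from the paper.

```python
def ldo_ideal_efficiency(v_out: float, v_dropout: float) -> float:
    """Best-case LDO efficiency when the bias current is neglected.

    With equal input and output currents, efficiency reduces to
    V_out / V_in, where V_in = V_out + V_dropout.
    """
    v_in = v_out + v_dropout
    return v_out / v_in


# Case quoted in the text: 1.2-V output, 200-mV dropout.
loss_1v2 = 1.0 - ldo_ideal_efficiency(1.2, 0.2)
print(f"efficiency loss at 1.2 V: {loss_1v2:.1%}")  # 14.3%

# Hypothetical lower supply voltage with the same dropout (illustration only).
loss_0v6 = 1.0 - ldo_ideal_efficiency(0.6, 0.2)
print(f"efficiency loss at 0.6 V: {loss_0v6:.1%}")  # 25.0%
```

The second case illustrates why a fixed dropout voltage costs proportionally more as the supply voltage is lowered.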

Another challenge in the design of practical buck converters is that high conversion efficiency can often be achieved only at large load currents, where the power consumed in the control and gate-drive circuits constitutes only a small fraction of the overall power delivered to the load. At small load currents, however, the power consumed by this peripheral circuitry may constitute a significant fraction of the overall load power, leading to a drastic drop in the converter's conversion efficiency compared to that at high load currents. In many applications, the current consumed by an SoC may vary by several orders of magnitude across its various modes of operation. For instance, an SoC used in a wireless sensor node may possess two modes of operation: 1) a low-power mode while it only performs data logging, and 2) a high-power mode while it wirelessly communicates with the base station [3], [20]. Between these two modes, the current drawn from the battery may vary from as low as a few microamps in the low-power mode to over a milliamp in the high-power mode. Though maximizing the buck converter's efficiency at high load current is important, doing so for light load should also be addressed since the SoC often operates in the low-power mode most of the time, thus making the total energy consumed in that mode dominant. Therefore, it is advantageous to maximize the converter's efficiency not only at high load, but also across all levels of the load current.

Maximizing the converter's efficiency across all load-current levels becomes even more challenging with the additional requirement that the converter also exhibit low ripple voltage. This is because achieving low ripple voltage requires responsive (fast) control circuitry within the converter, and fast control circuitry consumes high power. How small, then, must the converter's ripple voltage be for the LDO to become unnecessary? To answer this question, recall that the maximum ripple voltage on the supply that an analog circuit block can tolerate depends on its power-supply-rejection ratio (PSRR): the input-referred supply noise should be lower than the circuit's intrinsic input-referred noise. With this in mind, let us consider the most sensitive circuit block, i.e., the analog front end (AFE), in wearable electrocardiogram (ECG) recording devices, whose input-referred intrinsic noise is on the order of 1 $\mu V_{\rm rms}$ [21], [22], [23], [24]. Implemented with fully-differential topologies, these AFEs exhibit PSRR of at least 60 dB. Thus, for the supply noise to be lower than the intrinsic noise when referred to the AFE's input, the ripple voltage on the AFE's supply must be smaller than around 4 mV<sub>pp</sub>. This number is obtained from the condition $V_{\rm r,rms}/{\rm PSRR} < 1\ \mu V_{\rm rms}$, in which $V_{\rm r,rms}$ is the root-mean-squared ripple voltage and the ripple is assumed to be a sawtooth. In this work, we shall target a ripple voltage of around 1 mV<sub>pp</sub> to provide some safety margin against other nonidealities.
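As a rough check of the ripple budget above, the following sketch evaluates the stated condition for a zero-mean sawtooth, whose RMS value is $V_{pp}/(2\sqrt{3})$. The function name is illustrative; the 1-$\mu V_{\rm rms}$ noise level and the 60-dB PSRR are the values quoted in the text.

```python
import math


def max_ripple_vpp(noise_rms: float, psrr_db: float) -> float:
    """Largest sawtooth ripple (peak-to-peak) whose input-referred
    RMS value stays below the circuit's intrinsic noise."""
    psrr = 10 ** (psrr_db / 20)          # dB to linear voltage ratio
    v_r_rms_max = noise_rms * psrr       # allowed RMS ripple on the supply
    return v_r_rms_max * 2 * math.sqrt(3)  # sawtooth: Vpp = 2*sqrt(3)*Vrms


vpp = max_ripple_vpp(noise_rms=1e-6, psrr_db=60.0)
print(f"max ripple: {vpp * 1e3:.2f} mVpp")  # 3.46 mVpp
```

The result, roughly 3.5 mV<sub>pp</sub>, is in line with the "around 4 mV<sub>pp</sub>" bound quoted in the text, and the 1-mV<sub>pp</sub> target leaves a margin of more than a factor of three.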

There are many existing high-efficiency buck converters with relatively small ripple voltage reported in the literature [12], [13], [14], [25], [26], but none of them exhibits a ripple voltage low enough to warrant removing the LDOs from the SoCs.

Therefore, this paper presents the design of a buck converter to achieve the following objectives. First, the converter should exhibit an output ripple voltage so small that it can directly power sensitive analog circuits without the need for an LDO. Second, the converter should exhibit high efficiency over a wide range of relatively low load currents (1.2 $\mu$A to 1.8 mA) to make it attractive for use in low-power applications such as low-power sensor nodes for the Internet of Things (IoT) and smart healthcare. Finally, even with small output ripple and high light-load efficiency, the converter should allow reliable implementation in a small footprint with a sufficiently small on-board inductor and capacitor.

The paper is organized as follows: Section II investigates various design considerations of the proposed buck converter. Section III explains the designs of the whole converter, providing detailed mechanisms of different circuit components. Section IV shows measured results and, finally, Section V concludes the paper.

## **II. DESIGN CONSIDERATIONS**

#### A. MODE OF CONDUCTION

For low ripple voltage, it is natural to operate the converter in a continuous-conduction mode (CCM) [27], [28] in which the inductor current supplying the load is continuous (always greater than zero). By using a large inductor and a high switching frequency, the inductor current can be made smooth, thus producing small ripple voltage. However, the use of a large inductor makes the realization of small electronic devices impractical, and a high switching frequency incurs significant power loss in the control and gate-drive circuitry, thus degrading the converter's overall efficiency, especially at low load current. On the contrary, operating the converter in a discontinuous-conduction mode (DCM) [25], [29], in which the inductor current is allowed to be zero for some intervals, requires neither as large an inductor nor as high a switching frequency as in the CCM. Therefore, in this work targeting low-power portable devices, we opt for using DCM in our proposed converter. Nevertheless, limiting the ripple voltage to within 4 mV<sub>pp</sub> in a DCM converter requires careful attention to the choices of the converter's various design parameters, which we shall discuss in Section III-B.

## **B. VOLTAGE-CONTROL SCHEME**

There are two commonly-used schemes for controlling the output voltage of DCM converters: 1) pulse-width modulation (PWM) and 2) pulse-frequency modulation (PFM). Let's consider which of the two schemes is appropriate for our proposed low-ripple converter. Fig. 2(a) illustrates how a converter employing the PWM scheme adjusts its switching behavior to stabilize the output voltage to a desired value: the converter employs a fixed switching frequency  $(1/T_s)$  while varying the ON times of the power transistors ( $T_{chg}$  and  $T_{dchg}$ ) to alter the pulse width of the inductor current ( $I_L$ ) such that its average value balances with the load current. Though easier to design, the PWM scheme is not suitable for the realization of our low-ripple converter due to two main disadvantages: first, since the ripple's amplitude depends on



FIGURE 2. Commonly-used voltage control schemes for DCM converters: (a) Pulse Width Modulation (PWM) (b) Pulse Frequency Modulation (PFM).

the amount of charge delivered to the output capacitor per pulse of the inductor current, variability in the inductor current's pulse width makes the control of the ripple voltage's amplitude difficult; second, the switching frequency, which is constant across all the load conditions, must be chosen to be high enough to ensure that, even at high load current, enough inductor current can still be delivered to the load to maintain the output voltage at the desired level. As a result, the switching frequency is normally chosen to be higher than optimal for the light-load condition, which increases the power consumption of the peripheral circuitry and, in turn, degrades the converter's light-load efficiency.

To simultaneously ease the control of the ripple voltage's amplitude and maximize the converter's efficiency across all its load conditions, employing the PFM scheme for controlling our proposed converter is more appropriate. As illustrated in Fig. 2(b), the PFM scheme keeps the width of each inductor-current pulse fixed (by keeping the ON times of the power transistors constant) while varying the switching frequency to alter the average value of the inductor current to balance with the load current. Since the amount of charge delivered by each inductor-current pulse is now constant, the ripple's amplitude becomes constant across the entire load-current range. In addition, since the switching frequency is adapted to suit the load current, significant power saving can be achieved at light load due to the low switching activity that ensues; hence, the improvement in the light-load efficiency. Therefore, we adopt a DCM-PFM scheme for controlling our proposed converter in this work.

## C. THE COMPARATOR

To maximize conversion efficiency at light load, most modern buck converters employ a comparator and associated digital control circuitry instead of a conventional opamp-based circuit to help regulate their outputs to the desired values in a feedback fashion [12], [13], [14], [25], [26]. As shown in Fig. 3(a), the comparator compares the output voltage,  $V_{out}$ , to the reference voltage,  $V_{ref}$ , and, once  $V_{out}$  falls below  $V_{ref}$ ,



**FIGURE 3.** (a) A comparator-based buck converter. (b) Ripple voltages at low and high load currents.

informs the control circuitry (which controls the gate driver) to switch ON the power transistors ( $M_n$  and  $M_p$ ) to deliver inductor-current pulses to the load such that  $V_{out}$  is brought back above  $V_{ref}$ .

Since the decision on when to activate the power transistors starts from the comparator, its design is crucial in determining the converter's performance. One important concern is the tradeoff between the comparator's speed (hence, its power consumption) and the output ripple voltage at various load conditions. To understand this tradeoff, let's consider two loading conditions where a comparator-based converter tries to maintain its output voltage, $V_{out}$, at the reference value, $V_{ref}$. Let $t_d$ be the delay from when the comparator detects the crossing between its two inputs to when the converter initiates the delivery of an inductor-current pulse to the load. For simplicity, we will assume that $t_d$ is dominated by the comparator's delay such that we can think of $t_d$ as just the comparator's delay (see Table 1). On the left of Fig. 3(b), the load current, $I_{\text{load}}$, is so small that it discharges $V_{\text{out}}$ only slowly such that, after the comparator's delay of $t_d$, $V_{out}$ falls just slightly below $V_{ref}$. As a result, only one pulse of the inductor current suffices to bring $V_{out}$ back above $V_{ref}$. In this scenario, what determines the ripple voltage's amplitude is the amount of charge in a single inductor-current pulse (and the values of L and C). Hence, by controlling the amount of charge per pulse via appropriately chosen ON times of the power transistors, we can control the amplitude of the ripple voltage.

However, the situation is quite different when the load current is large, as seen toward the right of Fig. 3(b). In this scenario, a large load current causes $V_{out}$ to droop rapidly such that it falls significantly below $V_{ref}$ after the same comparator's delay of $t_d$. As a result, one pulse of the inductor current is insufficient to bring $V_{out}$ back above $V_{ref}$. Hence, the control circuitry will instruct the power transistors to fire inductor-current pulses successively to bring $V_{out}$ back above $V_{ref}$. Such a multiple-ripple scenario effectively causes the amplitude of the ripple voltage to be quite large, as it is now determined by the total amount of charge in multiple inductor-current pulses.
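The two loading conditions can be compared with a first-order estimate. The sketch below assumes the output capacitor alone supplies the load during the delay, so $V_{out}$ droops by $I_{load} t_d / C$, and treats one inductor-current pulse as raising $V_{out}$ by a fixed step. The capacitance (22 $\mu$F) and the 1-mV step are taken from Section III-B; the delay values of 1 $\mu$s and 50 $\mu$s are hypothetical and chosen only for illustration.

```python
def droop_during_delay(i_load: float, t_d: float, c_out: float) -> float:
    """Voltage lost on the output capacitor while waiting for the comparator."""
    return i_load * t_d / c_out


def single_pulse_suffices(i_load: float, t_d: float,
                          c_out: float, v_step: float) -> bool:
    """True if one pulse (raising V_out by v_step) can lift V_out back above V_ref."""
    return droop_during_delay(i_load, t_d, c_out) < v_step


C_OUT = 22e-6    # output capacitor from Section III-B
V_STEP = 1e-3    # targeted per-pulse ripple from Section III-B

# Hypothetical comparator delays: a slow one (50 us) and a fast one (1 us).
for i_load in (1.2e-6, 1.8e-3):
    for t_d in (50e-6, 1e-6):
        ok = single_pulse_suffices(i_load, t_d, C_OUT, V_STEP)
        droop_uv = droop_during_delay(i_load, t_d, C_OUT) * 1e6
        print(f"I_load={i_load:.1e} A, t_d={t_d:.0e} s: "
              f"droop={droop_uv:.1f} uV, single pulse ok={ok}")
```

Under these assumptions, the slow comparator is adequate at 1.2 $\mu$A but not at 1.8 mA, which is the speed-versus-ripple tradeoff the text describes.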

Preventing such a multiple-ripple scenario requires a comparator fast enough to quickly respond to the crossing events between $V_{out}$ and $V_{ref}$. The type of comparator most appropriate for our DCM-PFM converter is a continuous-time one, as it allows the comparator to continuously monitor $V_{\text{out}}$ to determine the moment of $V_{\text{out}}$-$V_{\text{ref}}$ crossing such that the converter's switching frequency can be adjusted accordingly. However, a high-speed continuous-time comparator consumes significant static power since it is powered ON all the time, thus degrading the converter's efficiency, especially at light load. As a result, many buck converters for light-load applications still employ latch-based comparators controlled by relatively high-speed clocks [13], [29] to minimize static power dissipation. However, due to the limited number of available clock frequencies on-chip, the comparator often operates with too high a clock frequency for a given load current to ensure its correct operation, thus resulting in extra switching loss in its oscillator and digital control circuitry. For example, even at its light load ($\approx$ 50 $\mu$A, which is considered quite a high load by our standard), the converter in [13] still employs a 100-kHz clock for its comparator, which degrades the converter's light-load efficiency tremendously. The design in [29] incorporates an on-chip oscillator with its frequency scalable from 6 Hz to 1.2 MHz to provide the comparator's clock optimized for a particular load current. However, determining such an optimal clock frequency is still performed off-chip, and the power associated with such computation has not been accounted for in the efficiency calculation.

In this work, we chose to combine both the continuous-time and the clocked methods to minimize the power consumed in the comparator: "continuous-time" in the sense that the comparator is powered ON to monitor the $V_{\text{out}}$-$V_{\text{ref}}$ crossing events, and "clocked" in the sense that the comparator is also shut down to minimize static power. Next, we describe the rationale behind this proposed power-saving scheme.

For the comparator to respond sufficiently fast to the $V_{out}$-$V_{ref}$ crossings over the entire load-current range, the comparator must unavoidably be biased with a high current. Nevertheless, two observations may pave the way to minimizing the comparator's average power while keeping it responsive. First, most of the time, the load current appears virtually static over several switching periods (e.g., while an environmental-monitoring sensor logs data from its surroundings); hence, the converter's switching period (assuming the PFM scheme) is highly predictable. Second, in the light-load scenario, the crossings between $V_{out}$ and $V_{ref}$ occur



FIGURE 4. Overall schematic of the proposed low-ripple buck converter.

relatively infrequently compared to the high-load case such that the comparator spends most of its time waiting, during which its power is wasted; this explains the poor light-load efficiency. With these observations, it can be reasoned that if we can predict the moments of the $V_{out}$-$V_{ref}$ crossings, we can turn the comparator ON only within the vicinity of these crossings while turning it OFF the rest of the time to save power. Therefore, as part of our proposed low-ripple buck converter, we propose a power-saving scheme that utilizes a time-based technique to adaptively predict the crossings between $V_{out}$ and $V_{ref}$ and subsequently duty-cycle the comparator to minimize its power and improve the converter's light-load efficiency.
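The benefit of duty cycling can be illustrated with a simple average-power estimate. This is only a sketch: it assumes the comparator draws a fixed power while ON and none while OFF, and every numerical value below (ON power, ON time per cycle, switching period) is hypothetical rather than taken from the paper.

```python
def avg_comparator_power(p_on: float, t_on_per_cycle: float, t_s: float) -> float:
    """Average comparator power when it is ON for t_on_per_cycle out of
    every switching period t_s and fully OFF otherwise."""
    duty = min(1.0, t_on_per_cycle / t_s)  # cannot be ON longer than the period
    return p_on * duty


# Hypothetical values, for illustration only.
P_ON = 10e-6     # comparator power while enabled (W)
T_ON = 100e-6    # total ON time per switching cycle (s)
T_S = 10e-3      # switching period at light load (s)

p_avg = avg_comparator_power(P_ON, T_ON, T_S)
print(f"always-ON: {P_ON * 1e6:.2f} uW, duty-cycled: {p_avg * 1e6:.2f} uW")
```

The longer the switching period at light load, the smaller the duty cycle becomes for a given ON window, which is why predicting the crossings pays off most in the light-load regime.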

## **III. DESIGN OF THE CONVERTER**

Fig. 4 shows the high-level schematic of our proposed low-ripple buck converter designed to operate from the input voltage  $V_{in} = 3.3$  V to produce the output voltage  $V_{out} =$ 1.2 V. To reduce the comparator's power as discussed in Section II-C, we incorporate the Sleep-State Controller (SSC) to determine the duration in which the comparator should be put to sleep to save power and to turn the comparator ON only for sufficient duration to detect the crossings between  $V_{out}$  and  $V_{ref}$ . An asynchronous state machine (ASM) is used to provide the converter with a smooth operation at a switching frequency appropriate for the load current. The pulse generator receives control signals from the ASM to generate the timing pulses,  $D_p$  and  $D_n$ , which are then buffered by the gate driver to drive the pMOS and nMOS power transistors, respectively.

During steady state, the SSC, the ASM, and the pulse generator operate from the supply voltage of $V_{sup} = 1.2$ V, which is connected directly to $V_{out}$. To prevent the converter from failing to start when $V_{out}$ has not yet settled to 1.2 V, $V_{sup}$ is temporarily connected to the external reference voltage $V_{ref} = 1.2$ V during the start-up. Only when $V_{out}$ reaches its steady-state value is $V_{sup}$ switched to being connected to $V_{out}$. The connection from $V_{sup}$ to either $V_{ref}$ or $V_{out}$ is controlled by the supply multiplexer via an external control signal SC, which in this version is still supplied manually.

## A. OVERALL OPERATION

The converter's operation can be divided into the following four states:

- 1) *Alert*: During the *Alert* state, the comparator is turned ON, ready to detect the  $V_{out}$ - $V_{ref}$  crossings while both power transistors are OFF.
- 2) *Up*: During the *Up* state, $M_p$ is turned ON (while $M_n$ is OFF) to ramp up the inductor current (charging phase). The converter enters the *Up* state from the *Alert* state after the comparator detects that $V_{out}$ falls below $V_{ref}$. During the *Up* state, the comparator is turned OFF to save power since it is not needed.
- 3) *Down*: During the *Down* state,  $M_n$  is turned ON (while  $M_p$  is OFF) to ramp down the inductor current (discharging phase). The converter enters the *Down* state right after it exits the *Up* state. Most of the time during the *Down* state, the comparator is turned OFF to save power; only shortly toward the end of the *Down* state is the comparator turned ON to detect the multiple-ripple scenario (more in Section III-F).
- 4) *Sleep*: During the *Sleep* state, both  $M_p$  and  $M_n$  are OFF. The comparator is also turned OFF to save power since the SSC has predicted that  $V_{out}$  will remain above  $V_{ref}$  for a certain duration.



**FIGURE 5.** (a) Timing diagram of the state-control signals. (b) Timing diagram illustrating the proposed power-saving scheme for minimizing the comparator's power.

To instruct other circuit components to operate according to the converter's state, the ASM provides four control signals—Up, Down, Sleep, Alert—each asserted high during its namesake's state as shown in the timing diagram of Fig. 5(a). Therefore, the width of each control signal corresponds to the duration the converter operates in a particular state for each switching cycle—i.e.,  $T_{up}$ ,  $T_{dn}$ ,  $T_{sleep}$ ,  $T_{alert}$  are the durations the converter operates in the Up, Down, Sleep, and Alert states, respectively. The ASM also takes as its inputs the flag signals—UpExit, DownExit, SleepExit—as feedback from other circuit components to notify of the moment the converter should exit a particular state. The explanation of how each flag signal is generated will be provided during the explanation of the relevant circuit component.

The timing diagram in Fig. 5(b) illustrates how the proposed converter, after it reaches a steady state, minimizes the power consumed by the comparator. The converter operates in the *Alert* state, in which the comparator is turned ON (seen from the comparator's enable signal, CP<sub>on</sub>, being asserted high), only in the vicinity of $V_{out}$ falling below $V_{ref}$: for $t \in [t_0, t_1]$, with $t = t'_0$ being a $V_{out}$-$V_{ref}$ crossing moment. Once the comparator's output CP<sub>out</sub> reaches the valid level at $t = t_1$, the converter goes through the *Up* and *Down* states (e.g., $t \in [t_1, t'_2]$) to deliver a pulse of inductor current to the load. Toward the end of the *Down* state, the comparator is also briefly turned ON to detect the multiple-ripple scenario (e.g., $t \in [t_2, t'_2]$). If, during this time, the multiple-ripple



FIGURE 6. A simplified model for determining the design parameters to achieve the required ripple voltage.

scenario is detected (CP<sub>out</sub> is asserted high), the ASM will assert the MR\_det flag, whose effect is to keep the comparator on high alert to promptly resolve the multiple-ripple issue (more on this in Section III-F). Then, after the *Down* state, the converter enters the *Sleep* state in which the comparator is powered OFF to save power, as it is predicted that $V_{out}$ will remain above $V_{\text{ref}}$ for a specific duration (e.g., $t \in [t'_2, t_3]$). After the converter operates in the *Sleep* state for a duration determined by the SSC, it enters the *Alert* state once again at $t = t_3$, ready to detect another $V_{out}$-$V_{ref}$ crossing moment at $t = t'_3$, and so on. It can therefore be reasoned from the timing diagram in Fig. 5(b) that, to minimize the comparator's power, we should have the converter operate as long as possible in the *Sleep* state (and as briefly as possible in the *Alert* state), while still ensuring the correct functionality of the converter. Section III-E will explain how we design the SSC to achieve this objective.
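The steady-state cycle described in this subsection can be summarized as a small behavioral model. This is only an illustrative sketch, not the asynchronous circuit itself: the `State` enum, `next_state`, and `comparator_on` are hypothetical names, the flag arguments mirror the CP<sub>out</sub>, UpExit, DownExit, and SleepExit signals named in the text, and the multiple-ripple handling (MR\_det) is deliberately omitted because its behavior is defined in Section III-F.

```python
from enum import Enum


class State(Enum):
    ALERT = "Alert"
    UP = "Up"
    DOWN = "Down"
    SLEEP = "Sleep"


def next_state(state: State, cp_out: bool = False, up_exit: bool = False,
               down_exit: bool = False, sleep_exit: bool = False) -> State:
    """Return the next converter state for the basic single-pulse cycle."""
    if state is State.ALERT and cp_out:      # comparator saw V_out < V_ref
        return State.UP
    if state is State.UP and up_exit:        # charging phase finished
        return State.DOWN
    if state is State.DOWN and down_exit:    # discharging phase finished
        return State.SLEEP
    if state is State.SLEEP and sleep_exit:  # SSC's predicted sleep time elapsed
        return State.ALERT
    return state                             # no flag raised: hold state


def comparator_on(state: State, near_end_of_down: bool = False) -> bool:
    """Comparator is enabled in Alert and briefly toward the end of Down."""
    return state is State.ALERT or (state is State.DOWN and near_end_of_down)


# Walk through one switching cycle.
s = State.ALERT
trace = [s]
for flags in ({"cp_out": True}, {"up_exit": True},
              {"down_exit": True}, {"sleep_exit": True}):
    s = next_state(s, **flags)
    trace.append(s)
print(" -> ".join(st.value for st in trace))
```

Running the walk-through prints the sequence Alert, Up, Down, Sleep, Alert, matching the order of the control signals in Fig. 5(a).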

## B. DESIGN PARAMETERS FOR THE TARGETED RIPPLE VOLTAGE

Various design parameters, including *L*, *C*, $T_{chg}$, and $T_{dchg}$, affect the ripple voltage's amplitude, which, in turn, determines the converter's switching frequency ($f_s = 1/T_s$) for a particular load current. This section will explain how we select their values for our proposed design. Fig. 6 shows a simplified circuit model of our converter along with the zoomed-in views of the converter's output $V_{out}$ and the inductor current $I_L$. Since this is a DCM-PFM buck converter with fixed values of $T_{chg}$ and $T_{dchg}$, we will assume that, for each switching cycle, $I_L$ ramps up from zero to reach the peak value of $I_p$ before ramping down to zero once again to deliver a fixed amount of charge to the load per switching cycle. As a result, each inductor-current pulse creates one cycle of the ripple voltage with a peak-to-peak amplitude of $V_{rpp}$. To simplify our calculation, let us also assume that the pulse width of $I_L$ ($T_{chg} + T_{dchg}$) is very brief compared to the switching period ($T_s$). We can therefore estimate that the ripple voltage is a direct result of the total charge in one inductor-current pulse being delivered to the output capacitor C, i.e., the role of $I_{load}$ is insignificant in determining $V_{rpp}$.

From the requirement that the average voltage across L must be zero in steady state, we obtain the relationship between  $T_{chg}$  and  $T_{dchg}$ :

$$\frac{T_{\rm chg}}{T_{\rm chg} + T_{\rm dchg}} = \frac{V_{\rm out}}{V_{\rm in}}. \tag{1}$$
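As a brief sketch of the step behind (1), assume ideal switches and a constant $V_{\rm out}$ over the pulse. The inductor then sees $V_{\rm in} - V_{\rm out}$ during $T_{\rm chg}$, $-V_{\rm out}$ during $T_{\rm dchg}$, and zero for the rest of the period (when $I_L = 0$), so the zero-average condition reads

$$\left(V_{\rm in} - V_{\rm out}\right) T_{\rm chg} - V_{\rm out}\, T_{\rm dchg} = 0 \quad\Longrightarrow\quad V_{\rm in}\, T_{\rm chg} = V_{\rm out}\left(T_{\rm chg} + T_{\rm dchg}\right),$$

which rearranges to the ratio in (1).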

Assuming that, for our converter, the amplitude of the ripple voltage is so small that we can approximate $V_{out}$ as being constant (hence, $I_L$ changes linearly during $T_{chg}$ and $T_{dchg}$), we can write an expression for the inductor's peak current as

$$I_{\rm p} = \frac{V_{\rm in} - V_{\rm out}}{L} \cdot T_{\rm chg},\tag{2}$$

and we can estimate the total charge delivered to C per pulse of the inductor current as

$$Q_{\rm u} = \frac{1}{2} I_{\rm p} \left( T_{\rm chg} + T_{\rm dchg} \right). \tag{3}$$

Since  $Q_u$  in (3) causes the voltage across C to fluctuate by  $V_{rpp}$ , we have  $Q_u = V_{rpp}C$ . Then, from (1)-(3) and from  $Q_u = V_{rpp}C$ , we can solve for an expression of  $T_{chg}$  as

$$T_{\rm chg} = \sqrt{2 \cdot V_{\rm rpp} \cdot LC \cdot \frac{V_{\rm out}/V_{\rm in}}{V_{\rm in} \left(1 - V_{\rm out}/V_{\rm in}\right)}}. \tag{4}$$
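The intermediate steps leading to (4) can be spelled out as follows. From (1), $T_{\rm chg} + T_{\rm dchg} = T_{\rm chg}\, V_{\rm in}/V_{\rm out}$; substituting this and (2) into (3) and setting $Q_{\rm u} = V_{\rm rpp} C$ gives

$$V_{\rm rpp}\, C = \frac{1}{2} \cdot \frac{V_{\rm in} - V_{\rm out}}{L} \cdot T_{\rm chg}^2 \cdot \frac{V_{\rm in}}{V_{\rm out}} \quad\Longrightarrow\quad T_{\rm chg}^2 = 2 \cdot V_{\rm rpp} \cdot LC \cdot \frac{V_{\rm out}}{V_{\rm in}\left(V_{\rm in} - V_{\rm out}\right)},$$

and writing $V_{\rm in} - V_{\rm out} = V_{\rm in}\left(1 - V_{\rm out}/V_{\rm in}\right)$ recovers the form in (4).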

And from (1), we can write an expression of  $T_{dchg}$  as a function of  $T_{chg}$  as

$$T_{\rm dchg} = T_{\rm chg} \left( \frac{1 - V_{\rm out} / V_{\rm in}}{V_{\rm out} / V_{\rm in}} \right). \tag{5}$$

Substituting the expression of  $T_{chg}$  in (4) into that of  $I_p$  in (2), we then obtain an expression of the inductor's peak current as

$$I_{\rm p} = \sqrt{2 \cdot \frac{V_{\rm rpp}}{L/C} \cdot V_{\rm out} \left(1 - \frac{V_{\rm out}}{V_{\rm in}}\right)}. \tag{6}$$

Finally, we can determine  $f_s$  for a given load current,  $I_{load}$ , by realizing that, in steady state, the average inductor current must balance with the load current:  $I_{load} = Q_u \cdot f_s$ . And from  $Q_u = V_{rpp}C$ , we can solve for  $f_s$  as

$$f_{\rm s} = \frac{I_{\rm load}}{V_{\rm rpp}C}. \tag{7}$$

Recall the three design goals of our proposed converter: 1) achieving small output ripple voltage, 2) preserving a high light-load conversion efficiency, and 3) allowing the converter to be reliably implemented in a small footprint (with *L* and *C* in small packages on the PCB). While keeping the small output ripple voltage in mind, let's consider how we could maintain a high conversion efficiency at light load. Recall that, at smaller  $I_{load}$ , a larger fraction of the converter's overall power consumption is attributed to that



FIGURE 7. (a)  $f_s$  vs. C. (b)  $I_p$  vs.  $\sqrt{L/C}$ . (c)  $T_{chg}$  vs.  $\sqrt{LC}$ .

of the digital-control circuits (the digital power), which is proportional to the converter's switching frequency $f_s$ (assuming that the digital power is dominated by the converter's switching activities). Hence, to keep the light-load efficiency high, we should aim to minimize $f_s$ at small $I_{load}$. The expression in (7) informs us that, for a given $I_{\text{load}}$ and $V_{\text{rpp}}$, reducing $f_s$ can be achieved by making C large. To help find a suitable C to keep $f_s$ sufficiently low at small $I_{load}$, we plot in Fig. 7(a) the resulting $f_s$ as a function of C for a few values of $I_{\text{load}}$ from 1.2 $\mu$A to 20 $\mu$A. We then chose $C = 22\ \mu$F, a capacitance still small enough to be available in a chip-scale package, to keep $f_s$ below 1 kHz for this range of $I_{load}$. In fact, the simulation result to be presented in Section III-G confirms that, at $I_{\text{load}} = 1.2\ \mu$A where the penalty inflicted by the digital power is most severe, choosing $C = 22\ \mu$F results in a digital power of 0.136 $\mu$W, which is equivalent to only around 9.4% of the converter's output power.
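The choice of C can be checked numerically with (7). The sketch below assumes the targeted $V_{\rm rpp}$ of 1 mV and the chosen $C = 22\ \mu$F; the intermediate load current of 5 $\mu$A and the function name are illustrative only.

```python
def switching_frequency(i_load: float, v_rpp: float, c_out: float) -> float:
    """Eq. (7): PFM switching frequency that balances the load current."""
    return i_load / (v_rpp * c_out)


V_RPP = 1e-3   # targeted ripple (V)
C_OUT = 22e-6  # chosen output capacitor (F)

for i_load in (1.2e-6, 5e-6, 20e-6):
    f_s = switching_frequency(i_load, V_RPP, C_OUT)
    print(f"I_load = {i_load * 1e6:5.1f} uA -> f_s = {f_s:6.1f} Hz")

# Digital-power fraction at the lightest load, using the 0.136-uW figure
# quoted in the text and P_out = V_out * I_load with V_out = 1.2 V.
fraction = 0.136e-6 / (1.2 * 1.2e-6)
print(f"digital power / output power = {fraction:.1%}")  # 9.4%
```

At 20 $\mu$A the frequency evaluates to about 909 Hz, consistent with the stated goal of staying below 1 kHz over this range.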

Now with the value of C determined, our next task is to determine an appropriate value of L while keeping in mind the availability of the inductor in a chip-scale package, i.e., a small value of L is preferred. Also, for an inductor to be physically small, the inductor's peak current $I_p$ must be kept minimal. Hence, minimizing $I_p$ becomes one of our criteria in choosing L. From the expression of $I_{\rm p}$ in (6), we learn that, for a given $V_{\rm rpp}$, $V_{\rm in}$, and $V_{\rm out}$, $I_{\rm p}$ is inversely proportional to $\sqrt{L/C}$. Fig. 7(b) plots the resulting $I_{\rm p}$ from (6) as a function of $\sqrt{L/C}$ for a few values of $V_{\rm rpp}$ from 1 mV to 10 mV. It can be seen from this plot that, for a given $V_{\rm rpp}$, the $I_{\rm p}$-vs.-$\sqrt{L/C}$ curve is very steep when $\sqrt{L/C}$ is small and becomes relatively flat as the value of $\sqrt{L/C}$ gets large. Hence, if an inductor with too small an inductance L is chosen, it may need to endure a large $I_p$ which, in turn, requires that it be implemented in a large form factor. On the other hand, the flat $I_{\rm p}$-vs.-$\sqrt{L/C}$ curve at high $\sqrt{L/C}$ suggests that the added benefit of increasing L to reduce $I_p$ becomes minimal. In this work, with the targeted $V_{rpp}$ of 1 mV, we have observed from Fig. 7(b) that $\sqrt{L/C} \in [1, 1.5]\ \Omega$ represents the optimal range providing a balance between the inductor's peak current and its inductance value. For $C = 22\ \mu$F, this optimal range of $\sqrt{L/C}$ translates to a value of L in the range from 22 $\mu$H to around 50 $\mu$H.

FIGURE 8. Schematic of the comparator.

Finally, we can narrow down the value of L by considering the ON times of the power transistors ($T_{chg}$ and $T_{dchg}$) that must be implemented on chip. From the expression of $T_{chg}$ in (4), we find that, for a given set {$V_{rpp}$, $V_{in}$, $V_{out}$}, $T_{chg}$ is proportional to $\sqrt{LC}$, as plotted in Fig. 7(c) for a few values of $V_{rpp}$ between 1 and 10 mV. For $C = 22 \ \mu$F and the range of L between 22 and 50 $\mu$H determined earlier (which results in $\sqrt{LC} \in [22, 33] \ \mu$s), the resulting $T_{chg}$ for $V_{rpp} = 1$ mV ranges from 409 to 614 ns. In this work, we have opted for $T_{chg}$ at the higher end of this range to ensure that it can be reliably implemented in our process technology. By choosing $L = 47 \ \mu$H, as this value is available in a chip-scale package (1008) (Coilcraft, Inc., Vishay Intertech., Inc.), we obtain, for $V_{rpp} = 1$ mV, $T_{chg}$ and $T_{dchg}$ of 600 ns and 1.05 $\mu$s, respectively.
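The arithmetic behind the quoted ranges of L and $\sqrt{LC}$ can be reproduced with the short Python sketch below. It only converts between $\sqrt{L/C}$, L, and $\sqrt{LC}$ for $C = 22\ \mu$F. It does not evaluate $T_{chg}$ itself, since (4) lies outside this passage. The helper names are hypothetical.

```python
import math

C_OUT = 22e-6  # output capacitance in farads (from the text)


def inductance_from_ratio(ratio_ohm, c_out=C_OUT):
    """Return L (henries) such that sqrt(L / C) equals ratio_ohm."""
    return c_out * ratio_ohm ** 2


def sqrt_lc(l_ind, c_out=C_OUT):
    """Return sqrt(L * C) in seconds."""
    return math.sqrt(l_ind * c_out)


# Ends of the sqrt(L/C) range read off Fig. 7(b) in the text.
l_min = inductance_from_ratio(1.0)   # 22 uH
l_max = inductance_from_ratio(1.5)   # 49.5 uH, i.e. "around 50 uH"
print(f"L range: {l_min * 1e6:.1f} uH to {l_max * 1e6:.1f} uH")

# sqrt(LC) for the range ends and for the chosen 47-uH inductor.
for l_ind in (22e-6, 50e-6, 47e-6):
    print(f"L = {l_ind * 1e6:.0f} uH -> sqrt(LC) = {sqrt_lc(l_ind) * 1e6:.2f} us")
```

For the chosen $L = 47\ \mu$H, the sketch gives $\sqrt{LC} \approx 32.2\ \mu$s and $\sqrt{L/C} \approx 1.46\ \Omega$, i.e., near the upper end of both ranges discussed above.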

#### C. COMPARATOR DESIGN

Fig. 8 shows the schematic of the comparator used in this design. To avoid the multiple-ripple scenario, which leads to large ripple voltage, the comparator must react sufficiently quickly once $V_{out}$ falls below $V_{ref}$. Provided that each inductor-current pulse can increase the value of $V_{out}$ by only about $V_{rpp}$, it follows that the value of $V_{out}$ must not fall below $V_{ref}$ by more than $V_{rpp}$ over the entire comparator's delay $t_d$. Since $V_{out}$ droops most rapidly at the highest load current ($I_{load,max} = 1.8$ mA in our design), we can formulate, using $I_{load,max}$, the maximum comparator's delay tolerable and, hence, the minimum comparator's bias current, to prevent the multiple-ripple scenario over the entire load-current range.

We can use the drawing of  $V_{out}$  in Fig. 6 to help determine the maximum tolerable  $t_d$ . At the maximum load current, the amount that  $V_{out}$  falls below  $V_{ref}$  over the duration of  $t_d$  ( $V_{drop}$ ) is given by

$$V_{\rm drop} = I_{\rm load,max} \cdot t_{\rm d}/C. \tag{8}$$

To ensure that  $V_{\text{out}}$  can be brought up well above  $V_{\text{ref}}$  by each inductor-current pulse, we may enforce that such  $V_{\text{drop}}$ only account for around half of  $V_{\text{rpp}}$ —i.e.,  $V_{\text{drop}} \approx 0.5 V_{\text{rpp}}$ . Such condition leads to the following requirement on the comparator's delay:

$$t_{\rm d} < \frac{1}{2} \frac{V_{\rm rpp} \cdot C}{I_{\rm load, max}}. \tag{9}$$

With $V_{rpp} = 1$ mV, $C = 22\ \mu$F, and $I_{load,max} = 1.8$ mA, we have a maximum tolerable comparator's delay of around 6.11 $\mu$s. For the design of the comparator in Fig. 8, we provide the comparator with a total bias current of 1 $\mu$A to achieve a smaller delay of around 4.5 $\mu$s in simulation, which leaves some margin for the delay incurred in the ASM and the gate-drive circuitry. It is thus evident that, if such a comparator were kept continuously ON in a light-load case, its power ($\approx 3.3 \ \mu$W) would be so prohibitive that the converter's light-load efficiency would become very low; e.g., the efficiency at a 1.2-$\mu$A load current would be only about 30% even when the power loss in the other circuits within the converter is ignored. Therefore, to boost the converter's light-load efficiency, it is imperative that we drastically reduce the average power consumed by the comparator during light load, which is the reason for the SSC to be described in Section III-E.
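The two numbers quoted above, the delay bound from (9) and the roughly 30% efficiency with an always-on comparator, can be checked with the following Python sketch. Two values are assumptions inferred from the figures in the text rather than stated in this passage: an output voltage of 1.2 V, and a 3.3-V supply for the comparator (which makes 1 $\mu$A of bias current correspond to about 3.3 $\mu$W).

```python
V_RPP = 1e-3        # target ripple voltage in volts (from the text)
C_OUT = 22e-6       # output capacitance in farads (from the text)
I_LOAD_MAX = 1.8e-3 # maximum load current in amperes (from the text)


def max_comparator_delay(v_rpp=V_RPP, c_out=C_OUT, i_load_max=I_LOAD_MAX):
    """Upper bound on the comparator delay per Eq. (9), in seconds."""
    return 0.5 * v_rpp * c_out / i_load_max


def efficiency_always_on(i_load, v_out=1.2, i_bias=1e-6, v_supply=3.3):
    """Best-case efficiency if only the always-on comparator dissipates power.

    v_out and v_supply are ASSUMED values inferred from the text,
    not parameters stated in this passage.
    """
    p_out = i_load * v_out
    p_cmp = i_bias * v_supply
    return p_out / (p_out + p_cmp)


print(f"t_d,max = {max_comparator_delay() * 1e6:.2f} us")
print(f"efficiency at 1.2 uA = {efficiency_always_on(1.2e-6) * 100:.1f} %")
```

Under these assumptions, the sketch returns about 6.11 $\mu$s and about 30.4%, in line with the values in the text.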

#### D. PULSE GENERATOR AND GATE DRIVE

Fig. 9(a) shows the schematic of the pulse generator responsible for the generation of the pulses $D_p$ and $D_n$ for controlling the ON durations ($T_{chg}$ and $T_{dchg}$) of $M_p$ and $M_n$, respectively, and the UpExit and DownExit flags to inform the ASM of the moment the converter should exit the *Up* and *Down* states, respectively; the pulse generator is also responsible for generating the CP<sub>on,dn</sub> signal with a pulse width of $T_{cmp}$ to turn the comparator ON toward the end of the *Down* state to check if the multiple-ripple scenario occurs. The pulse generator employs three delay gates, DL<sub>p</sub>, DL<sub>n</sub>, and DL<sub>cmp</sub>, whose schematic is shown in Fig. 9(b) and whose delays determine the values of $T_{chg}$, $T_{dchg}$, and $T_{cmp}$, respectively. To prevent conduction loss in the body diode of $M_n$ due to it being turned OFF while the inductor current is nonzero, a zero-current detection circuit (ZCD) is incorporated to help adjust $T_{dchg}$ to achieve the zero-current-switching condition [12].

**FIGURE 9.** (a) Overall schematic of the pulse generator. (b) Schematic of all the delay gates used in the pulse generator.

First, let us consider the $D_p$-pulse-gen portion of the pulse generator, with the timing operation shown in Fig. 10(a), for generating the $D_p$ pulse. We use the Up signal as the $D_p$ pulse to ramp the inductor current up over the entire duration of the Up state. To inform the converter of the moment at which it should exit the Up state (to enter the Down state), we employ the delay gate, DL<sub>p</sub>, to generate the UpExit flag as a delayed version of the Up signal. Once the ASM receives the UpExit flag, it immediately sets the converter's state to Down, causing the Up signal to go low (after a finite delay through the ASM of $t_{d1}$). Hence, ignoring the delay through the ASM, the duration between the rising edges of the Up signal and the UpExit flag, which is the delay through the DL<sub>p</sub> gate, defines the duration of the Up state. Therefore, we can make the width of the $D_p$ pulse equal to $T_{chg}$ by making the DL<sub>p</sub> gate's delay equal to $T_{chg}$. Moreover, once the Up signal goes low, the delay gate DL<sub>p</sub> quickly resets the UpExit flag to low (after a finite delay of $t_{d2}$) to help prepare the delay gate for its subsequent operation.

**FIGURE 10.** (a) Timing diagram for the generation of the $D_p$ pulse. (b) A 2,000-sample Monte Carlo simulation at room temperature of $T_{chg}$. (c) Simplified timing diagram for the generation of the $D_n$ pulse.

Fig. 9(b) shows the design of the delay gate $DL_p$, which consists of the current-starved inverter ($M_1$, $M_2$, and $I_d$) and the capacitor $C_d$ for determining the delay, three high-threshold inverters (HI<sub>1</sub>-HI<sub>3</sub>), and the reset transistors $M_3$-$M_5$. The low-to-high propagation delay of $DL_p$ responsible for determining $T_{chg}$ is then approximately $C_{\rm d}(V_{\rm DD} - V_{\rm t1})/I_{\rm d}$, where $V_{\rm t1}$ is the switching threshold of HI<sub>1</sub>. The three high-threshold inverters, HI<sub>1</sub>-HI<sub>3</sub> (made with stacked transistors, as shown in the inset of Fig. 9(b), to raise their switching threshold), are employed to minimize the $V_{\rm DD}$-to-ground feedthrough current when their inputs linger near the middle of the supply rails due to the slow edge of the current-starved inverter's output. In this work, we design the delay gate $DL_p$ with $C_d = 300$ fF and $I_d = 300$ nA to achieve the required $T_{chg}$ of 600 ns for $V_{DD} = 1.2$ V and $V_{t1} \approx 0.6$ V. Fig. 10(b) shows a Monte Carlo simulation for 2,000 samples at room temperature for $T_{chg}$, illustrating a nearly Gaussian distribution with a mean of 600 ns and a standard deviation of around 6.7%. This statistical result suggests that about 99.7% ($\pm 3\sigma$) of all the chip samples should have $T_{chg}$ within the range of 480 ns to 720 ns. Since $T_{\rm chg}$ determines the amount of charge delivered to the load per switching cycle, the variation in $T_{chg}$ also affects the ripple voltage's amplitude. With the help of (4), it can be inferred that the change in $V_{rpp}$ from its nominal value (1 mV) due to the variation in $T_{chg}$ should be in the range of -36% to +45%, which still keeps $V_{rpp}$ under 2 mV. Finally, to speed up the high-to-low transition and prepare the delay gate for its next operation as promptly as possible, we employ the reset transistors $M_3$-$M_5$ to asynchronously reset the outputs of all the high-threshold inverters to their appropriate supply rails once the delay gate's input returns to zero.

FIGURE 11. Schematic of the gate driver.
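The delay-gate sizing and the $\pm 3\sigma$ spread can be reproduced with the Python sketch below. The delay formula and component values are taken from the text. The final step, which estimates the ripple change by assuming that $V_{rpp}$ scales with the square of $T_{chg}$, is our reading of the dependence in (4), which lies outside this passage. It is an assumption, and it happens to give figures close to the quoted range of -36% to +45%.

```python
def delay_gate_lh(c_d, i_d, v_dd=1.2, v_t1=0.6):
    """Approximate low-to-high delay of the gate: C_d * (V_DD - V_t1) / I_d."""
    return c_d * (v_dd - v_t1) / i_d


# DL_p sizing from the text: C_d = 300 fF, I_d = 300 nA.
t_chg = delay_gate_lh(300e-15, 300e-9)

SIGMA_REL = 0.067  # relative standard deviation from the Monte Carlo run
t_lo = t_chg * (1 - 3 * SIGMA_REL)
t_hi = t_chg * (1 + 3 * SIGMA_REL)

# ASSUMPTION: V_rpp is proportional to T_chg squared (our reading of Eq. (4)).
ripple_change_lo = (t_lo / t_chg) ** 2 - 1
ripple_change_hi = (t_hi / t_chg) ** 2 - 1

print(f"T_chg nominal: {t_chg * 1e9:.0f} ns")
print(f"T_chg +/-3 sigma: {t_lo * 1e9:.0f} ns to {t_hi * 1e9:.0f} ns")
print(f"V_rpp change (assumed quadratic): "
      f"{ripple_change_lo * 100:+.0f} % to {ripple_change_hi * 100:+.0f} %")
```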

Next, let us consider the $D_n$-pulse-gen portion of the circuit, whose simplified timing diagram is shown in Fig. 10(c). The part consisting of the delay gate $DL_n$, the inverter, and the AND gate (AND1) serves to create from the Down signal the gate-drive pulse $D_n$, whose width, $T_{dchg}$, is approximately equal to the low-to-high propagation delay of the $DL_n$ gate. The design of the $DL_n$ gate is mostly similar to that of the $DL_p$ gate shown in Fig. 9(b), except that the size of $C_d$ can be automatically tuned by the ZCD circuit (through the 5-bit signal A[4:0]) according to the scheme proposed in [12] to achieve the $T_{dchg}$ that provides the zero-current-switching condition. In this work, we design the delay gate DL<sub>n</sub> with $I_d = 200$ nA and $C_d$ programmable at 5-bit resolution with a unit capacitance of 20 fF, thus allowing $T_{\rm dchg}$ to be tuned to within the vicinity of 1.05 $\mu$s at 60-ns resolution. Finally, the delay gate DL<sub>cmp</sub>, with a low-to-high propagation delay of $T_{\rm cmp}$, and the other AND gate (AND2) are used for creating, toward the end of the Down state before the DownExit flag is asserted, the comparator-enabling signal, $CP_{on,dn}$, with a pulse width of $T_{\rm cmp}$. In this work, we also employ the delay gate's topology in Fig. 9(b) to realize DL<sub>cmp</sub>, but with $C_d = 300$ fF and $I_d = 200$ nA to provide a $T_{\rm cmp}$ of around 900 ns, a comparator ON time just long enough for detecting the occurrence of the multiple-ripple scenario but not so long as to significantly increase the comparator's power.
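The 60-ns tuning step of DL<sub>n</sub> and the 900-ns value of $T_{cmp}$ follow from the same delay expression used for DL<sub>p</sub>. The Python sketch below illustrates this under two assumptions that the text does not state explicitly: the 0.6-V swing ($V_{DD} - V_{t1}$) carries over from DL<sub>p</sub>, and the programmable $C_d$ equals the 5-bit code times the 20-fF unit with no fixed offset capacitance.

```python
V_SWING = 0.6  # V_DD - V_t1 in volts, ASSUMED equal to the DL_p case


def dl_n_delay(code, c_unit=20e-15, i_d=200e-9, v_swing=V_SWING):
    """Delay of DL_n for a 5-bit capacitor code.

    ASSUMPTION: C_d = code * c_unit, i.e. no fixed offset capacitance.
    """
    if not 0 <= code <= 31:
        raise ValueError("code must fit in 5 bits (0..31)")
    return code * c_unit * v_swing / i_d


step = dl_n_delay(1)                     # tuning resolution
t_cmp = 300e-15 * V_SWING / 200e-9       # DL_cmp: C_d = 300 fF, I_d = 200 nA

print(f"DL_n step: {step * 1e9:.0f} ns")
for code in (17, 18):                    # the two codes bracketing 1.05 us
    print(f"code {code}: T_dchg = {dl_n_delay(code) * 1e6:.2f} us")
print(f"T_cmp: {t_cmp * 1e9:.0f} ns")
```

Under these assumptions, codes 17 and 18 give 1.02 $\mu$s and 1.08 $\mu$s, which bracket the nominal 1.05-$\mu$s value of $T_{dchg}$.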

Finally, Fig. 11 shows the schematic of the gate driver for driving the power transistors  $M_p$  and  $M_n$ . Two level shifters are used to convert the 1.2-V  $D_p$  and  $D_n$  pulses to their 3.3-V ( $V_{in}$ ) counterparts. Due to  $M_p$  and  $M_n$  being sized large to provide up to 1.8 mA of load current while minimizing conduction loss—(9  $\mu$ m/180 nm)×240 for both transistors—the converted pulses are buffered by inverter chains to minimize the rise/fall times of the signals driving the gates of  $M_p$  and  $M_n$ .

#### E. THE SLEEP-STATE CONTROLLER (SSC)

Recall from the timing diagram of Fig. 5(b) that, to minimize the power consumed by the comparator for a given  $I_{load}$ , we should maximize the duration of the *Sleep* state while minimizing that of the *Alert* state, hence, the task of the SSC. Fig. 12 shows the high-level design of the SSC, which consists of four main components: 1) Alert-Time Quantizer (ATQ), 2) Sleep-Time Register (STR), 3) Sleep-State Enabler (SSE), and 4) Digitally-Controlled Oscillator (DCO). In this section, we describe the design of the SSC to illustrate how it maximizes  $T_{sleep}$  while minimizing  $T_{alert}$  for a given load current.

With $T_{\text{alert}}$ being the pulse width of the Alert signal, the ATQ quantizes $T_{alert}$ into a three-bit digital signal, $\{T_2, T_1, T_0\}$. Conceptually, the ATQ acts as a time-to-digital converter transforming the Alert state's duration into a digital representation. The STR, which houses two 6-bit registers ($T_{\rm sl,crs}$ and $T_{\rm sl,fne}$) that jointly represent $T_{\rm sleep}$, then takes the quantized value of $T_{alert}$ and decides if it is longer than necessary. If so, the STR increments the value of one of the two 6-bit registers, which effectively lengthens $T_{\text{sleep}}$ (and shortens $T_{\text{alert}}$): if $T_{\text{alert}}$ remains larger than a certain threshold, the STR increments the value of $T_{\rm sl,crs}$ to lengthen $T_{\text{sleep}}$ in a coarse manner; otherwise, it increments the value of $T_{\rm sl,fne}$ instead to lengthen $T_{\rm sleep}$ in a fine manner. Conversely, if the STR finds $T_{alert}$ to be too short, which risks causing error in the comparator's operation, it will decrement the value stored in the $T_{\rm sl,fne}$ register, thus shortening $T_{\rm sleep}$ (and lengthening $T_{alert}$) in a fine manner. The update of the $T_{\rm sl,crs}$ and $T_{\rm sl,fne}$ registers proceeds cycle by cycle until $T_{\rm alert}$ reaches an optimal value (around 4-10 $\mu$s, which is short enough for saving the comparator's power but still long enough to ensure correct comparator operation), at which point the update ceases. The information on $T_{sleep}$ in the $T_{\rm sl,crs}$ and $T_{\rm sl,fne}$ registers is then passed to the SSE, whose role is to implement an actual wait time of $T_{\text{sleep}}$ before asserting the SleepExit flag to inform the converter to leave the Sleep state and enter the Alert state. In other words, the SSE acts as a digital-to-time converter that transforms the digital output of the STR into an actual duration of the Sleep state. To implement the actual $T_{sleep}$, the SSE employs the timing reference from the DCO, whose oscillation period corresponds in real time to the state of time tracking by the SSE. To understand the operation of the SSC, let us consider the design of each circuit block in more detail.

#### 1) ALERT-TIME QUANTIZER (ATQ)

Fig. 13(a) shows the schematic of the ATQ for quantizing $T_{\text{alert}}$ into its digital representation. The ATQ consists of three D flip-flops (DFFs) for producing the 3-bit output, $\{T_2, T_1, T_0\}$. To produce each output bit $T_i$ ($i \in \{0, 1, 2\}$), the corresponding DFF, which operates on the falling edge of the Alert signal, receives as its input a delayed-by-$T_{di}$ version of the Alert signal. Hence, each ATQ's output bit, $T_i$, will be asserted only when the rising edge of the Alert signal's delayed-by-$T_{di}$ version occurs before the falling edge of the Alert signal, i.e., if $T_{alert} > T_{di}$. In this work, we design the three delay gates to produce $T_{d2}$, $T_{d1}$, and $T_{d0}$ of approximately 400 $\mu$s, 10 $\mu$s, and 4 $\mu$s, respectively. Therefore, the ATQ in Fig. 13(a) effectively quantizes $T_{alert}$ into four levels: level 1 corresponding to $T_{alert} < T_{d0}$, level 2 to $T_{d0} < T_{alert} < T_{d1}$, level 3 to $T_{d1} < T_{alert} < T_{d2}$, and level 4 to $T_{alert} > T_{d2}$, as illustrated in Fig. 13(b).

FIGURE 12. High-level schematic of the Sleep-State Controller (SSC).

FIGURE 13. (a) Schematic of the Alert Time Quantizer (ATQ). (b) The ATQ's output levels and their corresponding STR's actions.
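The quantization performed by the ATQ can be summarized behaviorally as a thermometer code over three thresholds. The Python sketch below models only this input-output mapping, using the nominal delays quoted in the text. It is not a model of the flip-flop circuit, and the function names are hypothetical.

```python
# Nominal delays of the three ATQ delay gates (from the text), in seconds.
T_D0, T_D1, T_D2 = 4e-6, 10e-6, 400e-6


def atq_quantize(t_alert):
    """Return the ATQ output bits (T2, T1, T0) for an Alert pulse width."""
    t0 = int(t_alert > T_D0)
    t1 = int(t_alert > T_D1)
    t2 = int(t_alert > T_D2)
    return (t2, t1, t0)


def atq_level(bits):
    """Map the thermometer code to the level number (1..4) used in the text."""
    return 1 + sum(bits)


# Illustrative pulse widths, one per level.
for t_alert in (2e-6, 6e-6, 50e-6, 1e-3):
    bits = atq_quantize(t_alert)
    print(f"T_alert = {t_alert * 1e6:7.1f} us -> bits {bits}, level {atq_level(bits)}")
```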

Fig. 14(a)-(c) show the Monte Carlo simulation results (2,000 samples) at room temperature for $T_{d2}$, $T_{d1}$, and $T_{d0}$, with standard deviations of 0.7%, 1.5%, and 1.1% from the means, respectively. The small standard deviations thus ensure the robustness of the quantization levels of $T_{alert}$ against process variations.

#### 2) SLEEP-TIME REGISTER (STR)

The STR, with the schematic shown in Fig. 15, takes the ATQ's output to adjust the values of its two internal 6-bit registers (up/down counters), $T_{sl,crs}$ and $T_{sl,fne}$, and thereby adjusts the value of $T_{sleep}$ at an appropriate rate as indicated in Fig. 13(b). If the ATQ indicates that $T_{alert}$ is within level 4 ($T_2T_1T_0 = 111$), the STR will adjust $T_{sleep}$ in a coarse manner by incrementing the value of $T_{sl,crs}$ by one, an equivalent of increasing $T_{\text{sleep}}$ by a coarse unit time, $T_{\text{crs}}$, to speed up the adjustment. To prevent error due to the comparator not being properly turned ON to make a decision, we must ensure that $T_{alert}$ remains sufficiently long for the Alert state to cover, with some margin, the moment when $V_{out}$ crosses $V_{ref}$. We thus incorporate two preventive measures to ensure that $T_{\text{alert}}$ is sufficiently wide: 1) if the ATQ identifies that $T_{\text{alert}}$ is smaller than $T_{d2}$ but still significant, i.e., $T_{alert}$ is within level 3 ($T_2T_1T_0 = 011$), the STR will adjust $T_{sleep}$ in a finer manner by incrementing the value of $T_{\rm sl,fne}$ by one, an equivalent of increasing $T_{sleep}$ by a fine unit time, $T_{fne}$; 2) if the ATQ identifies that $T_{alert}$ is already too short but still positive, i.e., $T_{\text{alert}}$ is within level 1 ($T_2T_1T_0 = 000$), the STR will decrement the value of $T_{\text{sl,fne}}$ by one, an equivalent of decreasing $T_{\text{sleep}}$ (and lengthening $T_{\text{alert}}$) by $T_{\text{fne}}$.

**FIGURE 14.** Monte Carlo simulation results (2,000 samples) at room temperature for: (a) $T_{d2}$. (b) $T_{d1}$. (c) $T_{d0}$.

FIGURE 15. Schematic of the Sleep-Time Register (STR).

Such preventive measures are necessary because, as will be shown in Section III-E4, the two unit times, $T_{crs}$ and $T_{fne}$, are generated by the DCO, whose oscillation period cannot be precisely controlled with respect to the delays of the ATQ's delay gates ($T_{di}$). Once the ATQ identifies that $T_{alert}$ is now optimally small, i.e., $T_{alert}$ is within level 2 ($T_2T_1T_0 = 001$, suggesting that 4 $\mu$s < $T_{alert}$ < 10 $\mu$s), the STR will update neither $T_{sl,crs}$ nor $T_{sl,fne}$.

FIGURE 16. (a) Schematic of the Sleep State Enabler (SSE). (b) Timing diagram of the SSE.
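The update rule of the STR described above amounts to a four-way decision on the ATQ's output. A minimal behavioral sketch in Python is given below. Clamping the registers to the 6-bit range 0..63 is an assumption made here for illustration, since the text does not describe how overflow or underflow is handled, and the function name is hypothetical.

```python
MAX_CODE = 63  # largest value of a 6-bit register


def str_update(bits, t_sl_crs, t_sl_fne):
    """Apply one STR update given the ATQ bits (T2, T1, T0).

    ASSUMPTION: both registers saturate at 0 and 63; the paper does not
    specify the overflow/underflow behavior.
    """
    if bits == (1, 1, 1):        # level 4: coarse increment
        t_sl_crs = min(t_sl_crs + 1, MAX_CODE)
    elif bits == (0, 1, 1):      # level 3: fine increment
        t_sl_fne = min(t_sl_fne + 1, MAX_CODE)
    elif bits == (0, 0, 0):      # level 1: fine decrement
        t_sl_fne = max(t_sl_fne - 1, 0)
    # level 2, bits == (0, 0, 1): T_alert is in the optimal window, so hold.
    return t_sl_crs, t_sl_fne


# Illustrative sequence of ATQ outputs over successive converter cycles.
crs, fne = 0, 0
for bits in [(1, 1, 1), (1, 1, 1), (0, 1, 1), (0, 1, 1), (0, 0, 0), (0, 0, 1)]:
    crs, fne = str_update(bits, crs, fne)
print(crs, fne)
```

In this illustrative sequence, two coarse increments are followed by two fine increments, one fine decrement, and a hold, leaving the registers at 2 and 1.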

#### 3) SLEEP-STATE ENABLER (SSE)

The role of the SSE, with the schematic shown in Fig. 16(a), is to allow the converter to operate in the *Sleep* state for the duration of  $T_{sleep}$  as determined by the STR. Recall from the timing diagram in Fig. 5(b) that the converter automatically enters the *Sleep* state right after it exits the *Down* state. After the converter has operated in the *Sleep* state for a duration of  $T_{sleep}$  as specified by  $T_{sl,crs}$  and  $T_{sl,fne}$  registers, the SSE will assert the flag signal, SleepExit, to notify the converter to exit the *Sleep* state.

To understand the SSE's operation, let us first suppose that $T_{\rm sl,crs}$ and $T_{\rm sl,fne}$ assume the values of M and N, respectively, with M, N > 0. The SSE's job then is to implement a waiting period of $T_{\rm sleep} = MT_{\rm crs} + NT_{\rm fne}$ before asserting the SleepExit flag. To implement such a $T_{\rm sleep}$, the SSE employs two anti-phasic clock signals CLK<sub>1</sub> and CLK<sub>2</sub>, a 6-bit counter $T_{\rm sl\_crs\_cnt}$, and other logic circuits. The two anti-phasic clock signals are generated by the DCO, whose oscillation period can be programmed based on the information provided by the STR ($T_{\rm sl,fne}$) and the SSE (the signal cnt\_fne\_en). Section III-E4 will explain the DCO's operation in more detail.

Fig. 16(b) shows the SSE's timing diagram to help understand its operation. Before the converter enters the Sleep state ($t < t_0$), the Sleep signal is low; thus, the 6-bit counter $T_{sl\_crs\_cnt}$ and the DFF $Q_{fne}$ are set to zero. Since, initially, $T_{sl\_crs\_cnt}$ is zero and not yet equal to M, the output of the digital comparator Cmp1, A, is asserted high, which consequently sets the signal cnt\_fne\_en to low, which, in turn, instructs the DCO to produce CLK<sub>1</sub> and CLK<sub>2</sub> with a period of $T_{\rm crs}$. Once the converter enters the *Sleep* state at $t = t_0$, $T_{sl\_crs\_cnt}$ and $Q_{\rm fne}$ are no longer in reset. With A being its enable signal, $T_{sl\_crs\_cnt}$ will count up on the rising edge of CLK<sub>1</sub> as long as A remains high, i.e., as long as $T_{sl\_crs\_cnt} < M$. Once $T_{sl\_crs\_cnt}$ reaches M at $t = t_1$, the comparator Cmp1 will force A to low, thus asserting the signal cnt\_fne\_en high and causing $T_{sl\_crs\_cnt}$ to stop counting and remain at M. Thus, the elapsed time from the rising edge of the Sleep signal ($t = t_0$) to the moment when the signal cnt\_fne\_en is asserted high ($t = t_1$) is equal to $MT_{crs}$, i.e., the counting by $T_{sl\_crs\_cnt}$ from zero to M fulfills the $MT_{crs}$ portion of the required $T_{\text{sleep}}$. Let us now look at how the SSE implements the $NT_{\text{fne}}$ portion.

With SleepExit = $\overline{A} \cdot \overline{B}$, the fact that A remains high for the duration of $MT_{\rm crs}$ ensures that SleepExit is kept low for that duration. Once A goes low at $t = t_1$, keeping SleepExit low for an additional duration of $NT_{\text{fne}}$ requires that B be kept high. To do so, we first use the signal cnt\_fne\_en = $\overline{A}$ to instruct the DCO to change the period of CLK<sub>1</sub> and CLK<sub>2</sub> to $NT_{\text{fne}}$ (more on this in Section III-E4). Since $B = C \cdot D$ and C is already high due to the comparator Cmp2 determining that N is greater than zero, B is determined solely by D. With cnt\_fne\_en being the clock-gating signal for the DFF $Q_{\text{fne}}$, cnt\_fne\_en being high then allows $Q_{\text{fne}}$ to be asserted high on the rising edge of CLK<sub>2</sub> at $t = t_2$, which, in turn, causes the signal D to go low on the rising edge of CLK<sub>1</sub> at $t = t_3$, which then immediately causes B to go low. As a consequence of B going low, the SleepExit flag is asserted high at around $t = t_3$. Notice that the duration between the rising edge of CLK<sub>1</sub> when $T_{sl\_crs\_cnt}$ reaches M at $t = t_1$ and the moment at which the SleepExit flag goes high on the next rising edge of CLK<sub>1</sub> at $t = t_3$ is exactly one period of CLK<sub>1</sub>, which is now $NT_{\rm fne}$. Therefore, the total Sleep state's duration, $T_{\text{sleep}}$, determined by the SSE is $MT_{\text{crs}} + NT_{\text{fne}}$ as intended.
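The behavior described above can be condensed into two small Python functions: one for the exit condition, which follows the prose (SleepExit stays low while either A or B is high), and one for the resulting sleep duration. This is a behavioral sketch with hypothetical function names, not a gate-level model. The unit times in the example (400 $\mu$s and 4 $\mu$s) are the rough figures quoted in the load-change discussion of Section III-F and are used only for illustration.

```python
def sleep_exit(a, b):
    """SleepExit is high only once both A and B have gone low."""
    return (not a) and (not b)


def sse_sleep_time(m, n, t_crs, t_fne):
    """Sleep duration: m coarse periods, then one DCO period of n * t_fne."""
    return m * t_crs + n * t_fne


# Flag sequence over one Sleep state: coarse counting, fine wait, exit.
phases = [("coarse counting", True, True),
          ("fine wait", False, True),
          ("exit", False, False)]
for name, a, b in phases:
    print(f"{name:16s} A={int(a)} B={int(b)} SleepExit={int(sleep_exit(a, b))}")

# Illustrative register values M = 3 and N = 10 with rough unit times.
t_sleep = sse_sleep_time(3, 10, 400e-6, 4e-6)
print(f"T_sleep = {t_sleep * 1e3:.2f} ms")
```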

#### 4) DIGITALLY-CONTROLLED OSCILLATOR (DCO)

The DCO, with the schematic shown in Fig. 17(a), provides time references for realizing  $T_{\text{sleep}}$ . The DCO generates as its output the clock signals CLK<sub>1</sub> and CLK<sub>2</sub>, whose period can be programmed to either  $T_{\text{crs}}$  or an appropriate multiple of  $T_{\text{fne}}$  ( $NT_{\text{fne}}$ ). Programming the DCO's period is achieved via programming the propagation delays of the two identical delay gates DL<sub>1</sub> and DL<sub>2</sub>.

The timing diagram in Fig. 17(b) can help illustrate the DCO's operation. Before the converter enters the *Sleep* state at $t = t_1$ (when Sleep=0), the two DFFs, DFF<sub>1</sub> and DFF<sub>2</sub>, are reset such that all the DCO's internal signals ($V_1$, $V_2$, CLK<sub>1</sub>, and CLK<sub>2</sub>) are low. Shortly after $t = t_1$, since CLK<sub>1</sub> and CLK<sub>2</sub> remain low, the Sleep signal being high then sets DFF<sub>1</sub>'s output, $V_1$, high. As $V_1$ is the input of the delay gate DL<sub>1</sub>, whose output is CLK<sub>2</sub>, the low-to-high transition on $V_1$ then causes CLK<sub>2</sub> to go high at $t = t_2$, after the low-to-high propagation delay of DL<sub>1</sub> ($T_{LH}$). The low-to-high transition on CLK<sub>2</sub> at $t = t_2$ then results in two important subsequent events: 1) since CLK<sub>1</sub> remains low at $t = t_2$, CLK<sub>2</sub> acts as the clock input into DFF<sub>2</sub> (through an OR gate) such that its rising edge causes DFF<sub>2</sub>'s output, $V_2$, to toggle, i.e., to make a low-to-high transition; 2) since CLK<sub>2</sub> is already high when $V_2$ makes a low-to-high transition, the rising edge of $V_2$ at $t = t_2$ generates the rising edge of the clock input (through a series of an AND gate and an OR gate) into DFF<sub>1</sub>, thus causing $V_1$ to toggle, i.e., to make a high-to-low transition. Note that these two events occur quickly in the vicinity of $t = t_2$ (over a few propagation delays of the associated logic gates and DFFs) such that, on the time scale of interest, we can approximate them as occurring at $t = t_2$. The high-to-low transition on $V_1$ at $t = t_2$ then causes CLK<sub>2</sub> to go low after one high-to-low propagation delay of DL<sub>1</sub> ($T_{HL}$).

All the events from $V_1$ going high at $t = t_1$ to $V_1$ coming back low at $t = t_2$ constitute half of the DCO's oscillation cycle, covering a duration of around $T_{LH}$. The same process then repeats with the roles of $V_1$ and $V_2$ reversed, and likewise those of CLK<sub>1</sub> and CLK<sub>2</sub>: the low-to-high transition on $V_2$ at $t = t_2$ causes CLK<sub>1</sub> to go high at $t = t_3$ after DL<sub>2</sub>'s low-to-high propagation delay (which is also equal to $T_{LH}$ since the two delay gates are assumed to be identical); then the low-to-high transition on CLK<sub>1</sub> at $t = t_3$ sets $V_1$ high, and so on. We can now see from the timing diagram in Fig. 17(b) that CLK<sub>1</sub>, CLK<sub>2</sub>, $V_1$, and $V_2$ have the same period, which is approximately equal to $2T_{LH}$. Hence, by programming $T_{LH}$ of the two delay gates, we can program the period of both CLK<sub>1</sub> and CLK<sub>2</sub>.

Fig. 17(c) shows the schematic of the two programmable delay gates, each of which consists of a current-starved inverter (INV1), a regular inverter (INV2), and a programmable capacitor bank. Let $C_d$ be the total capacitance from the internal node $V_{int}$ to ground as provided by programming the capacitor bank. Since the current source $I_d$ limits INV1's current in discharging the node $V_{int}$, the low-to-high propagation delay $T_{LH}$ of the delay gate is proportional to $C_d/I_d$, which can be programmed via the value of $C_d$. In this work, we construct the capacitor bank $C_d$ from six binary-weighted capacitors, with the *i*<sup>th</sup> capacitor having the size of $2^i C_u$, $i \in \{0, ..., 5\}$, where $C_u = 120$ fF is the unit capacitance. Programming $C_d$'s size can then be achieved by controlling which binary-weighted capacitors are connected to the node $V_{int}$.

**FIGURE 17.** (a) Schematic of the Digitally-Controlled Oscillator (DCO). (b) The DCO's timing diagram. (c) Schematic of the delay cells, DL<sub>1</sub> and DL<sub>2</sub>. (d) Monte Carlo simulation of $T_{LH}$ at room temperature for 2,000 samples.

Whether to connect the *i*<sup>th</sup> binary-weighted capacitor to the node $V_{\text{int}}$ is determined by the signal cnt\_fne\_en and the $i^{\text{th}}$ bit of the STR's fine register, $T_{sl,fne}[i]$. If cnt\_fne\_en is low, i.e., the SSE's counter $T_{sl\_crs\_cnt}$ is still counting up and has not yet reached the value stored in the STR's coarse register $T_{\rm sl,crs}$, all the binary-weighted capacitors are connected to the capacitor bank such that $C_d = 63C_u$. As a result, DL<sub>1</sub> and DL<sub>2</sub> experience the largest possible delay, causing the DCO to operate with the longest oscillation period $T_{\rm crs} \propto 63 C_{\rm u}/I_{\rm d}$. On the other hand, if the counter $T_{sl\_crs\_cnt}$ reaches the value stored in the $T_{\rm sl,crs}$ register such that cnt\_fne\_en is now high, the switch connecting each $C_{d,i}$ to the overall $C_d$ will now be controlled by the *i*<sup>th</sup> bit of the STR's fine register $T_{\rm sl,fne}[i]$ ($C_{d,prog}[i] = T_{sl,fne}[i]$). Hence, for $T_{sl,fne} = N$, the total capacitance of the bank $C_d$ is equal to $NC_u$ and the low-to-high propagation delay, $T_{LH}$, of DL<sub>1</sub> and DL<sub>2</sub> is now proportional to $NC_u/I_d$. As a result, the period of CLK<sub>1</sub> and CLK<sub>2</sub> is now proportional to $NT_{\rm fne}$, where $T_{\rm fne} \propto C_{\rm u}/I_{\rm d}$ is the fine unit time. Finally, Fig. 17(d) shows the Monte Carlo simulation of $T_{\rm LH}$ at room temperature for 2,000 samples when $C_d$ is programmed to its maximum value ($63C_u$). The maximum $T_{\rm LH}$ exhibits a mean of 170.2 $\mu$s with a standard deviation of 3.5%.
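The selection logic of the capacitor bank described above can be sketched in a few lines of Python. The sketch models only which binary-weighted capacitors are connected and the resulting $C_d$. Absolute delays are not computed because the value of $I_d$ for these gates is not given in this passage. The function name is hypothetical.

```python
C_U = 120e-15  # unit capacitance in farads (from the text)
N_BITS = 6     # six binary-weighted capacitors, sizes 2**i * C_U


def bank_capacitance(cnt_fne_en, t_sl_fne):
    """Total C_d connected to V_int for the given control signals.

    cnt_fne_en low  -> every capacitor is connected (C_d = 63 * C_U).
    cnt_fne_en high -> capacitor i is connected iff bit i of t_sl_fne is set.
    """
    if not 0 <= t_sl_fne < 2 ** N_BITS:
        raise ValueError("t_sl_fne must fit in 6 bits (0..63)")
    code = t_sl_fne if cnt_fne_en else 2 ** N_BITS - 1
    return sum(((code >> i) & 1) * (2 ** i) * C_U for i in range(N_BITS))


n = 10  # illustrative value of the fine register
c_coarse = bank_capacitance(False, n)
c_fine = bank_capacitance(True, n)
print(f"coarse phase: C_d = {c_coarse / C_U:.0f} C_u")
print(f"fine phase:   C_d = {c_fine / C_U:.0f} C_u")
print(f"period ratio fine/coarse = {c_fine / c_coarse:.3f}")
```

Since $T_{LH}$ is proportional to $C_d/I_d$, the ratio of the fine-phase period to the coarse-phase period is simply N/63 in this idealized model.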

#### F. LOAD-CHANGE DETECTION

This section explains the rationale behind turning ON the comparator toward the end of the *Down* state to detect the multiple-ripple scenario. In conventional PFM schemes, the converter is relatively quick in responding to changes in the load current since the comparator is always ON to detect the moments when  $V_{out}$  falls below  $V_{ref}$ . However, in our proposed scheme, optimizing  $T_{sleep}$  is equivalent to predetermining the converter's switching frequency to suit a particular level of the load current—because the comparator cannot be turned ON until the converter exits the *Sleep* state. Hence, when the load current changes,  $T_{sleep}$  needs to be re-optimized for the converter to function properly.

The ATQ's operation in Fig. 13(b) illustrates that the adjustment of $T_{sleep}$ is asymmetric between the accumulation (increasing) and relaxation (decreasing) directions: the accumulation of $T_{sleep}$ is much faster than its relaxation since, in one converter's cycle, an increment in the value of the $T_{sl,crs}$ register by one (accumulation) is equivalent to an increase in $T_{sleep}$ of around 400 $\mu$s, while a decrement in the value of the $T_{sl,fne}$ register by one (relaxation) is equivalent to a decrease in $T_{sleep}$ of only around 4 $\mu$s. Therefore, if the load current abruptly increases, the converter needs to know whether it should re-optimize $T_{sleep}$ in a drastic fashion.

To determine if $T_{\text{sleep}}$ needs to be re-optimized, we turn the comparator ON for the duration of $T_{\text{cmp}}$ toward the end of the *Down* state, which is right after an inductor-current pulse has been delivered to the load. If the comparator detects that $V_{out}$ remains smaller than $V_{ref}$ at the end of the *Down* state, it means that the load current must have increased abruptly within the previous switching cycle such that the current switching frequency is not sufficient in balancing the average inductor current with the load current. As a result, $T_{sleep}$ needs to be re-optimized (decreased) to suit the new load current.

FIGURE 18. Timing diagram for detecting the load change.

Consider the illustrative timing diagram in Fig. 18(a) to understand the proposed concept. Before $t = t_2$, the load current has been steadily low for an extended period such that $T_{\text{sleep}}$ has reached its optimal value. Thus, after an inductor-current pulse has been delivered to the load by $t = t_0$, $V_{out}$ droops in a predictable manner such that it crosses $V_{\rm ref}$ at an instant predicted by the SSC, as seen from the comparator being on alert near $t = t_1$. Then, at $t = t_2$, the load current abruptly increases, causing $V_{out}$ to droop more rapidly afterward. Because of this faster droop of $V_{out}$, the inductor-current pulse delivered to the load during $t \in [t_4, t_5]$ is not sufficient to bring $V_{out}$ back above $V_{\rm ref}$ (i.e., the multiple-ripple scenario). As a result, the comparator being ON during $t \in [t_5, t_5']$ informs the SSC that the previous value of $T_{sleep}$ is no longer valid and needs to be re-optimized. To re-optimize $T_{sleep}$, the ASM asserts the MR\_det flag to reset both the $T_{\rm sl,crs}$ and $T_{\rm sl,fne}$ registers in the STR to zero (see Fig. 4, Fig. 12, and Fig. 15), as seen from the $T_{\rm sl,crs}$ register being reset to zero at $t = t_5'$ in Fig. 18(a). Then, the scheme for saving the comparator's power explained in Section III-E proceeds to find a new optimal value of $T_{\text{sleep}}$.
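The decision taken at the end of the *Down* state can be summarized in a few lines of Python. This is only a behavioral sketch under our reading of the description above, not the on-chip logic: the names `SleepTimeRegisters` and `end_of_down_check` are hypothetical, and the voltages in the usage lines are illustrative floats rather than measured values.

```python
from dataclasses import dataclass


@dataclass
class SleepTimeRegisters:
    """Hypothetical stand-in for the STR's coarse and fine registers."""
    t_sl_crs: int = 0  # coarse register (one step is around 400 us of T_sleep)
    t_sl_fne: int = 0  # fine register (one step is around 4 us of T_sleep)


def end_of_down_check(v_out: float, v_ref: float, regs: SleepTimeRegisters) -> bool:
    """Return the MR_det flag; reset both registers to zero when it is asserted."""
    mr_det = v_out < v_ref  # V_out still below V_ref right after a current pulse
    if mr_det:
        regs.t_sl_crs = 0
        regs.t_sl_fne = 0
    return mr_det


# Illustrative values only (assumptions, not taken from the measurements).
regs = SleepTimeRegisters(t_sl_crs=5, t_sl_fne=12)
flag_normal = end_of_down_check(1.2005, 1.2, regs)   # pulse restored V_out
regs_after_normal = (regs.t_sl_crs, regs.t_sl_fne)
flag_step = end_of_down_check(1.1990, 1.2, regs)     # multiple-ripple scenario
regs_after_step = (regs.t_sl_crs, regs.t_sl_fne)
```

In the first call the registers are left untouched, whereas in the second call both are cleared so that the search for a new $T_{\text{sleep}}$ restarts from zero.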

Note that the efficacy of the proposed load-change detection scheme still depends on the *Sleep* state's duration and on when in the *Sleep* state the abrupt increase in the load current occurs. This is because, once the converter is in the *Sleep* state, it will not turn the comparator ON to monitor $V_{out}$ until the next *Alert* state. As a result, if the change in the load current occurs early in the *Sleep* state, $V_{out}$ may have dropped significantly over the *Sleep* state's remaining duration. The problem is most severe when the load current abruptly changes from a very low value to a very high one, since the low load current (before the increase) causes $T_{\text{sleep}}$ to be very long and the high load current (after the increase) causes $V_{out}$ to droop very rapidly. At worst, the drop in $V_{out}$ can be so significant that it causes the circuits powered by the converter to malfunction. Nevertheless, this problem can be eliminated in the next version of the chip with the realization that, in most SoCs, the moment of an abrupt increase in the load current can be precisely determined (e.g., when the SoC's control unit wakes up the RF communication unit to transmit the stored data to a base station). Therefore, by providing a flag signal from the load circuit indicating the imminent abrupt increase in the load current, we can modify the ASM's MR\_det flag to respond to the load circuit's flag such that the STR's registers can be immediately reset to zero upon the anticipation of an abrupt increase in the load current.

TABLE 1. Simulated delays and powers among the circuit blocks at the two extremes of the load-current range.

| Circuit Blocks                 | Delay       | Power          |
|--------------------------------|-------------|----------------|
| *Load current = 1.8 mA*        |             |                |
| Comparator                     | 4.5 $\mu$s  | 3.07 $\mu$W    |
| Control Circuitry              | 1.05 ns     | 5.13 $\mu$W    |
| Gate Driver                    | 1.6 ns      | 13.4 $\mu$W    |
| *Load current = 1.2 $\mu$A*    |             |                |
| Comparator                     | 4.5 $\mu$s  | 6 nW           |
| Control Circuitry              | 0.8 ns      | 136 nW         |
| Gate Driver                    | 1.6 ns      | 10.7 nW        |

## G. POWER AND DELAY DISTRIBUTIONS AMONG THE CIRCUIT BLOCKS

It is instructive to investigate how the power and delay are distributed among various circuit blocks over the entire range of the load current. Table 1 shows the simulated values, at the two extremes of the load current range (1.2 $\mu$A and 1.8 mA), of the time delay through and the power distributions among the comparator, the digital control circuitry (the ASM, the SSC, and the pulse generator), and the gate driver. It is evident from Table 1 that, among all the delays within each switching cycle, the comparator's is by far the most dominant, thus justifying our effort in minimizing it via biasing the comparator with a high bias current.

**FIGURE 19.** Chip micrograph of the proposed low-ripple buck converter fabricated in a 0.18- $\mu$m CMOS process.

Since the power consumption of each circuit block should be proportional to the converter's switching frequency, one would expect it to scale linearly with the load current (see (7)). However, a closer look at Table 1 reveals that, although the powers of all three circuit blocks decrease as the load current decreases, the comparator's and the control circuitry's powers track the load current less closely than the gate driver's does: as the load current scales by a factor of $6.67 \times 10^{-4}$ (from 1.8 mA to 1.2 $\mu$A), the comparator's and control circuitry's powers scale only by factors of $1.95 \times 10^{-3}$ and $26 \times 10^{-3}$, respectively, compared to $8 \times 10^{-4}$ for the gate driver. That the comparator's and control circuitry's powers do not decrease as much as the gate driver's is due to the leakage current in the control circuitry becoming more dominant at a very small load current. As a result, the converter needs to compensate for this leakage current by operating faster than in its ideal condition (with no leakage current), which raises the comparator's and control circuitry's powers above the values extrapolated from the load current.
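The scaling factors quoted above follow directly from the entries of Table 1. The short Python check below recomputes them; the dictionary layout and variable names are ours, and only the numbers come from the table.

```python
# Simulated powers from Table 1 in watts: (at 1.8 mA, at 1.2 uA)
powers = {
    "comparator": (3.07e-6, 6e-9),
    "control circuitry": (5.13e-6, 136e-9),
    "gate driver": (13.4e-6, 10.7e-9),
}
i_high, i_low = 1.8e-3, 1.2e-6  # load-current extremes in amperes

# Factor by which the load current scales when going from 1.8 mA to 1.2 uA
load_ratio = i_low / i_high

# Factor by which each block's power scales over the same change
ratios = {name: p_low / p_high for name, (p_high, p_low) in powers.items()}

print(f"load current: {load_ratio:.2e}")
for name, ratio in ratios.items():
    # A value of 1.0x would mean perfectly linear scaling with the load current
    print(f"{name}: {ratio:.2e} ({ratio / load_ratio:.1f}x the load-current factor)")
```

The gate driver's factor stays close to that of the load current, while the comparator's and, even more so, the control circuitry's factors are noticeably larger, which is the deviation attributed to leakage in the text.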

## **IV. MEASUREMENT RESULTS**

The proposed low-ripple buck converter, with a micrograph shown in Fig. 19, has been fabricated in a 0.18-$\mu$m CMOS process from the United Microelectronics Corp. (UMC) and occupies an active area of 0.42 mm<sup>2</sup> (the overall chip's size is 1.5 mm $\times$ 1.5 mm). All the capacitors used for implementing the on-chip delay gates (DL<sub>p</sub>, DL<sub>n</sub>, and DL<sub>cmp</sub> for the pulse generator, the three delay gates for the ATQ, and DL<sub>1</sub> and DL<sub>2</sub> for the DCO) are of the metal-insulator-metal (MiM) type with a per-area capacitance of around 1 fF/$\mu$m<sup>2</sup>. The total areas occupied by the capacitors within the pulse generator and the SSC are 0.011 mm<sup>2</sup> and 0.075 mm<sup>2</sup>, respectively; these areas amount to 24.5% and 34.7% of the pulse generator's and the SSC's total areas, respectively.

FIGURE 20. The converter's test setup on a printed circuit board.
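As a quick sanity check, the capacitor areas and area fractions quoted above can be converted into approximate total capacitances and block areas. The sketch below assumes the nominal density of 1 fF/$\mu$m<sup>2</sup> applies uniformly; the resulting capacitances and block areas are our back-of-the-envelope estimates, not values reported for the chip.

```python
CAP_DENSITY_F_PER_UM2 = 1e-15  # around 1 fF per square micrometer (MiM), assumed uniform
UM2_PER_MM2 = 1e6              # square micrometers in one square millimeter

blocks = {
    # name: (capacitor area in mm^2, capacitor share of the block's total area)
    "pulse generator": (0.011, 0.245),
    "SSC": (0.075, 0.347),
}

results = {}
for name, (cap_area_mm2, share) in blocks.items():
    cap_farads = cap_area_mm2 * UM2_PER_MM2 * CAP_DENSITY_F_PER_UM2
    block_area_mm2 = cap_area_mm2 / share  # total block area implied by the share
    results[name] = (cap_farads, block_area_mm2)
    print(f"{name}: about {cap_farads * 1e12:.0f} pF of MiM capacitance, "
          f"block area about {block_area_mm2:.3f} mm^2")
```

Under this assumption, the delay capacitors total roughly 11 pF in the pulse generator and 75 pF in the SSC.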

Fig. 20 shows a test setup built on a printed circuit board (PCB) for evaluating the proposed converter's performance. The on-chip converter takes as inputs a 3.3-V supply voltage ($V_{in}$) and a 1.2-V reference voltage to produce an output voltage of 1.2 V supplying a resistive load. To study the converter's dynamics, the load current can be changed abruptly by digitally switching between two on-board load resistors (labeled R on the board). The values of the off-chip inductor (L) and capacitor (C) are 47 $\mu$H and 22 $\mu$F, respectively, as discussed in Section III-B. Though an off-chip inductor with a relatively large footprint is used in this version of the test setup (Bourns Inc., SRR1280-470M), a chip-scale inductor with a low series resistance could be used in a later version without hurting the overall efficiency. For example, a 47-$\mu$H chip-scale inductor (1008LS-473XJR from Coilcraft, Inc.) exhibits a maximum DC resistance of only 10.7 $\Omega$, which causes negligible power dissipation in the inductor compared to that delivered to the load even at the minimum load current of 1.2 $\mu$A.

All the converter's timing signals (the pulses $D_p$ and $D_n$ and the SSC's CLK<sub>1</sub>) are automatically generated on-chip. A field-programmable gate array (CMOD A7, Digilent Inc.) is incorporated in the test setup for collecting internal digital signals to assess the converter's operation, for generating the control signal to switch the resistive load, and for generating serial programming data to configure the converter. The converter's output voltage and internal digital signals are probed via a USB mixed-signal oscilloscope (PicoScope, 3204D MSO).

Fig. 21(a) illustrates the converter's operation (with the Alert and Sleep signals shown for identifying the converter's state) as the load current abruptly decreases from 12 $\mu$A to 1.2 $\mu$A. For t < 17.17 ms when $I_{\text{load}} = 12\ \mu$A, the converter has reached its steady state, as seen from the Alert and Sleep signals exhibiting regular periods; the pulse width of the Alert signal is also much shorter than that of the Sleep signal since the converter is disabled most of the time to save power. After the load current abruptly decreases to 1.2 $\mu$A at t = 17.17 ms, the converter's switching frequency decreases dramatically, as seen from the periods of the Alert and Sleep signals becoming longer. During this transient period, the duration of the Sleep state starts to accumulate while that of the Alert state relaxes, as seen from the progressive changes in the pulse widths of the Sleep and Alert signals. Eventually, at t = 207.58 ms, the converter reaches its new steady state with a new, lower switching frequency; in addition, the Alert signal becomes very short and the Sleep signal very wide, suggesting that the converter has been put to sleep for most of the switching period to save power. Fig. 21(b) illustrates the opposite scenario, in which $I_{load}$ suddenly increases from 1.2 $\mu$A to 12 $\mu$A. At t = 21.64 ms, when $I_{load}$ abruptly increases, the load-change detection scheme detects multiple ripples in $V_{\text{out}}$, which results in the ASM resetting $T_{\text{sleep}}$ to zero. The SSC then re-optimizes $T_{\text{sleep}}$, which reaches its steady-state value 1.91 ms after the detection of the load change. Notice that, compared to when $I_{\text{load}} = 1.2\ \mu$A, the value of $T_{\text{sleep}}$ for $I_{\text{load}} = 12\ \mu$A is much shorter due to the increase in the converter's switching frequency.

**FIGURE 21.** The proposed converter's dynamics as the load current changes from: (a) 12 $\mu$A to 1.2 $\mu$A. (b) 1.2 $\mu$A to 12 $\mu$A.

To evaluate the converter's response to large step changes in the load current, we stepped the load current from 1.2 $\mu$A to 1 mA, then back to 1.2 $\mu$A, while observing the converter's output voltage. Fig. 22 shows the result. The converter still exhibits fairly strong load regulation, as seen from the output voltage drop of about 24 mV as the load current increases by approximately 1 mA. This level of load regulation is to be expected from most DCM-PFM converters (such as [29]), as two effects combine to lower the average value of the output voltage: per switching cycle, 1) the higher load current reduces the net current charging the output capacitor, and 2) the higher load current causes the output voltage to droop faster. The load regulation can be improved in a later version of the chip via the regulation enhancement scheme proposed in [12], which moves the comparison point between $V_{\rm out}$ and $V_{\rm ref}$ upward to compensate for the drop in $V_{\rm out}$ as the load current increases. Nevertheless, thanks to the load-change detection scheme discussed in Section III-F, the output voltage can adapt quickly to its steady-state value as the load current suddenly changes from 1.2 $\mu$A to 1 mA. Even when the load current suddenly decreases from 1 mA to 1.2 $\mu$A, which requires accumulating $T_{\text{sleep}}$, it takes only around 50 ms for the converter's output to reach its steady-state value, a negligible transient duration if the converter is to operate in the low-power mode for an extended period afterward.

FIGURE 22. The converter's output as the load current changes between 1.2 $\mu$A and 1 mA.
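To put the measured drop in perspective, the figures above can be expressed as an equivalent output resistance and as a fraction of the nominal output voltage. The snippet below is a simple arithmetic sketch; it treats the "about 24 mV" and the 1.2-$\mu$A-to-1-mA step as exact values, which is an assumption.

```python
V_OUT_NOMINAL = 1.2      # nominal output voltage (V)
DELTA_V = 24e-3          # output voltage drop read from Fig. 22 (V), approximate
I_LOW, I_HIGH = 1.2e-6, 1e-3  # load-current step (A)

delta_i = I_HIGH - I_LOW
r_equivalent = DELTA_V / delta_i         # equivalent DC output resistance (ohms)
relative_drop = DELTA_V / V_OUT_NOMINAL  # drop as a fraction of the nominal output

print(f"equivalent output resistance: about {r_equivalent:.1f} ohms")
print(f"relative drop: about {relative_drop * 100:.1f} % of the nominal output")
```

The step therefore corresponds to a drop of roughly 2% of the nominal 1.2-V output.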

Regarding the amplitude of the output ripple voltage, it can be reasoned that the lower the load current, the higher the ripple amplitude, since a lower load current leaves a higher fraction of the total charge in each inductor-current pulse for charging the output capacitor. Hence, to capture the worst-case ripple amplitude, we show in Fig. 23(a) the converter's output voltage, $V_{out}$, in steady state at the load current of 1.2 $\mu$A, while Fig. 23(b) shows its zoomed-in view against the Alert and Sleep control signals. It is evident from Fig. 23(b) that the ripple voltage is in phase with the two control signals, with the Alert signal preceding the charging of $V_{out}$ in every switching cycle. Direct measurement of the peak-to-peak ripple amplitude gives $V_{rpp} = 1.6\ \text{mV}_{pp}$, which is quite close to our targeted value of 1 mV<sub>pp</sub>.
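The reasoning above suggests a rough charge-balance estimate. Assuming that, at the lightest load, essentially all the charge in one inductor-current pulse ends up on the output capacitor, and that the ripple is purely capacitive (no contribution from the capacitor's series resistance or from measurement noise), the charge per pulse is about $C \cdot V_{rpp}$ and the steady-state switching frequency is about $I_{load}$ divided by that charge. Both simplifications and the function names are ours, so the results should be read as order-of-magnitude estimates only.

```python
C_OUT = 22e-6            # off-chip output capacitor (F)
V_RIPPLE = 1.6e-3        # measured peak-to-peak ripple at 1.2 uA (V)
V_RIPPLE_TARGET = 1e-3   # targeted peak-to-peak ripple (V)


def pulse_charge(c_out: float, v_ripple: float) -> float:
    """Charge per pulse (C), assuming it all goes to the output capacitor."""
    return c_out * v_ripple


def switching_frequency(i_load: float, q_pulse: float) -> float:
    """Steady-state pulse rate (Hz) from charge balance: I_load = f * Q_pulse."""
    return i_load / q_pulse


q_measured = pulse_charge(C_OUT, V_RIPPLE)
q_target = pulse_charge(C_OUT, V_RIPPLE_TARGET)
f_light = switching_frequency(1.2e-6, q_measured)

print(f"charge per pulse from measured ripple: about {q_measured * 1e9:.1f} nC")
print(f"charge per pulse from targeted ripple: about {q_target * 1e9:.1f} nC")
print(f"estimated switching frequency at 1.2 uA: about {f_light:.0f} Hz")
```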

To show the effectiveness of our scheme in reducing the comparator's power, we measured the average power consumed by the comparator, once the converter had reached the steady state in which $T_{\text{alert}}$ has been minimized by the SSC, as the load current varies from 1.2 $\mu$A to 1.8 mA, as shown in Fig. 24. At the load current of 1.8 mA, the comparator consumes around 2.98 $\mu$W, suggesting that the comparator is ON most of the time because the converter operates close to the CCM. As the load current

 TABLE 2. Performance summary and comparison to previous works.

| Parameters              | [12] JSSC'12                  | [13] JSSC'14           | [14] JSSC'16           | [25] TPEL'17                                        | [26] JSSC'18            | This Work               |
|-------------------------|-------------------------------|------------------------|------------------------|-----------------------------------------------------|-------------------------|-------------------------|
| Technology (nm)         | 250                           | 40                     | 180                    | 130                                                 | 130                     | 180                     |
| Area (mm<sup>2</sup>)   | 0.39                          | 0.084                  | 1.44                   | 0.66                                                | 0.14                    | 0.42                    |
| Inductor (µH)           | 3.3                           | 220                    | 4.7                    | 3                                                   | 18                      | 47                      |
| Capacitor (µF)          | 4.7                           | 1                      | 4.7                    | 3                                                   | 0.056                   | 22                      |
| Input Voltage (V)       | 1.2-2.5                       | 0.6-1.1                | 0.55-1                 | 2.2-3.3                                             | 1.8-3.3                 | 3.3                     |
| Output Voltage (V)      | 1                             | 0.3-0.55               | 0.35-0.5               | 1.7                                                 | 1.2                     | 1.2                     |
| Load Range              | 1 µA-50 mA                    | 50 µA-10 mA            | 100 nA-20 mA           | 10 µA-20 mA                                         | 0.1 µA-2.65 mA          | 1.2 µA-1.8 mA           |
| Quies. Power (µW)       | 0.22                          | 15 <sup>(1)</sup>      | 0.04 <sup>(2)</sup>    | 0.5 <sup>(3)</sup> ($V_{\rm in} = 2.2$-$3.3$ V)     | 0.44-12.32              | 0.53                    |
| Voltage Ripple (mV)     | < 30 @ $V_{\rm in} = 1.2$ V   | 11                     | 10                     | 180                                                 | 29                      | 1.6                     |
| Eff. @ 1.2 µA (%)       | 65.24                         | -                      | 77                     | -                                                   | 26                      | 74.4                    |
| Eff. @ 1.8 mA (%)       | 94                            | 93                     | 84                     | 84                                                  | 86                      | 86.2                    |
| High Eff. Range         | > 61% (1 µA-50 mA)            | > 50% (0.05 mA-10 mA)  | > 70% (400 nA-20 mA)   | > 74.2% (10 µA-20 mA)                               | > 75% (0.1 mA-2.65 mA)  | > 74.4% (1.2 µA-1.8 mA) |

<sup>(1)</sup> Calculated from 50- $\mu$ A load current and  $V_{\rm in} = 0.3$  V with efficiency of 50%.

<sup>(2)</sup> Calculated from 100-nA load current and  $V_{in} = 0.35$  V with efficiency of 47%.

<sup>(3)</sup> Reported here only for the retention mode (load current of 10  $\mu$ A-500  $\mu$ A).



**FIGURE 23.** (a) The converter's measured output voltage  $V_{out}$  at  $I_{load} = 1.2 \ \mu$ A. (b) The zoomed-in view of  $V_{out}$  against the Alert and Sleep control signals.

becomes smaller, the average current drawn by the comparator decreases because of the shorter duration of the *Alert* state and, hence, the comparator's shorter ON duration. At the lowest load current of 1.2 $\mu$A, the total power consumed by the comparator is only 10 nW. Therefore, thanks to the proposed power-saving scheme, the comparator's power is only a small fraction of the power delivered to the load even at a very small load current.
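The "small fraction" can be quantified from the numbers quoted above by comparing the measured comparator power with the power delivered to the load, taken here as $V_{out} \times I_{load}$ with $V_{out} = 1.2$ V. The helper name below is ours, and the calculation ignores the small deviation of $V_{out}$ from its nominal value.

```python
V_OUT = 1.2  # nominal output voltage (V)

# (load current in A, measured comparator power in W) at the two extremes
operating_points = {
    "1.2 uA": (1.2e-6, 10e-9),
    "1.8 mA": (1.8e-3, 2.98e-6),
}


def comparator_fraction(i_load: float, p_comparator: float) -> float:
    """Comparator power divided by the power delivered to the load."""
    return p_comparator / (V_OUT * i_load)


fractions = {name: comparator_fraction(i, p) for name, (i, p) in operating_points.items()}

for name, fraction in fractions.items():
    print(f"I_load = {name}: comparator uses about {fraction * 100:.2f} % of the load power")
```

At both extremes the comparator's share stays below 1% of the power delivered to the load.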



FIGURE 24. Comparator's power vs. load current.



FIGURE 25. Efficiency of the proposed converter compared to those of the existing low-ripple converters.

Fig. 25 shows the efficiency of our proposed converter within our load-current range of interest, both when the SSC is disabled and when it is enabled, compared to other existing buck converters with relatively small output ripple voltage. It is evident that the converter's light-load efficiency improves tremendously when the SSC is enabled (e.g., from 26% to 74% at the 1.2-$\mu$A load current). However, at high load current (> 100 $\mu$A), the efficiency improvement is almost negligible, since the converter operates close to the CCM regardless of the SSC; also, at high load current, the comparator's power is only a small fraction of the power delivered to the load, making its contribution to the efficiency calculation negligible. It should also be noted that our two reported efficiency cases mark the bounds of the actual efficiency to be achieved in practice. If the load current keeps changing rapidly such that the SSC never finishes minimizing the comparator's power, the actual efficiency would fall between these two curves.
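The efficiency figures at the 1.2-$\mu$A load can be translated into input power and total loss, which makes the effect of the SSC easier to appreciate. The sketch below assumes an output power of $1.2\ \text{V} \times 1.2\ \mu\text{A}$ and uses 26% (SSC disabled) and 74.4% (SSC enabled, as listed in Table 2); the function name is ours.

```python
V_OUT = 1.2       # nominal output voltage (V)
I_LOAD = 1.2e-6   # lightest load current (A)
p_out = V_OUT * I_LOAD  # power delivered to the load (W)


def input_power_and_loss(p_out: float, efficiency: float) -> tuple[float, float]:
    """Return (input power, total loss) in watts for a given efficiency."""
    p_in = p_out / efficiency
    return p_in, p_in - p_out


p_in_off, loss_off = input_power_and_loss(p_out, 0.26)   # SSC disabled
p_in_on, loss_on = input_power_and_loss(p_out, 0.744)    # SSC enabled

print(f"SSC disabled: input {p_in_off * 1e6:.2f} uW, loss {loss_off * 1e6:.2f} uW")
print(f"SSC enabled:  input {p_in_on * 1e6:.2f} uW, loss {loss_on * 1e6:.2f} uW")
print(f"loss reduced by a factor of about {loss_off / loss_on:.1f}")
```

Under these assumptions, enabling the SSC cuts the total loss at the lightest load from roughly 4 $\mu$W to about 0.5 $\mu$W.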

#### **V. CONCLUSION AND DISCUSSION**

In this work, we have presented the design of a low-ripple and high-light-load-efficiency buck converter for use in low-power SoCs. We have described how to achieve very small ripple voltage (1.6 mV<sub>pp</sub>) and high light-load efficiency via the following strategies. First, to keep the ripple voltage within our requirement, we operate the converter in the DCM-PFM scheme, in which each pulse of the inductor current delivers a fixed amount of charge to the load. Second, we carefully choose the values of L and C to minimize the converter's switching frequency (to lower the digital power) and the inductor's peak current (to minimize the inductor's form factor), while keeping the ON times of the power transistors sufficiently long for a reliable on-chip implementation. Third, to prevent the occurrence of large ripple voltage due to multiple ripples, we employ a fast comparator to monitor the output voltage against the reference value. Because a fast comparator consumes high power, which leads to poor light-load efficiency, we propose the use of the SSC to minimize the comparator's power by putting the comparator in the Sleep state when it is not needed and only turning it ON right before it needs to make a decision.

Table 2 summarizes the proposed converter's performance compared with that of existing low-ripple converters. It is evident that, with respectable efficiency even when operating from a much higher $V_{in}$, the proposed buck converter exhibits more than $6 \times$ smaller ripple voltage than the lowest-ripple one among them [14]. Such small ripple voltage may warrant using the proposed converter for directly powering sensitive analog circuits without the help of LDOs.

#### REFERENCES

- [1] N. Ahmed, A. Radchenko, D. Pommerenke, and Y. R. Zheng, "Design and evaluation of low-cost and energy-efficient magneto-inductive sensor nodes for wireless sensor networks," *IEEE Syst. J.*, vol. 13, no. 2, pp. 1135–1144, Jun. 2019.
- [2] F. Di Nuzzo, D. Brunelli, T. Polonelli, and L. Benini, "Structural health monitoring system with narrowband IoT and MEMS sensors," *IEEE Sensors J.*, vol. 21, no. 14, pp. 16371–16380, Jul. 2021.
- [3] O. Omeni, A. C. W. Wong, A. J. Burdett, and C. Toumazou, "Energy efficient medium access protocol for wireless medical body area sensor networks," *IEEE Trans. Biomed. Circuits Syst.*, vol. 2, no. 4, pp. 251–259, Dec. 2008.

- [4] Z. Cao, R. Zhu, and R.-Y. Que, "A wireless portable system with microsensors for monitoring respiratory diseases," *IEEE Trans. Biomed. Eng.*, vol. 59, no. 11, pp. 3110–3116, Nov. 2012.
- [5] Y.-L. Zheng, X.-R. Ding, C. C. Y. Poon, B. P. L. Lo, H. Zhang, X.-L. Zhou, G.-Z. Yang, N. Zhao, and Y.-T. Zhang, "Unobtrusive sensing and wearable devices for health informatics," *IEEE Trans. Biomed. Eng.*, vol. 61, no. 5, pp. 1538–1554, May 2014.
- [6] G. A. Rincon-Mora and P. E. Allen, "A low-voltage, low quiescent current, low drop-out regulator," *IEEE J. Solid-State Circuits*, vol. 33, no. 1, pp. 36–44, Jan. 1998.
- [7] R. J. Milliken, J. Silva-Martinez, and E. Sanchez-Sinencio, "Full on-chip CMOS low-dropout voltage regulator," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 54, no. 9, pp. 1879–1890, Sep. 2007.
- [8] S. K. Lau, P. K. T. Mok, and K. N. Leung, "A low-dropout regulator for SoC with Q-reduction," *IEEE J. Solid-State Circuits*, vol. 42, no. 3, pp. 658–664, Mar. 2007.
- [9] G. A. Rincon-Mora, *Analog IC Design With Low-Dropout Regulators*, 2nd ed. New York, NY, USA: McGraw-Hill, 2014.
- [10] M. Wens and M. Steyaert, *Design and Implementation of Fully-Integrated Inductive DC–DC Converters in Standard CMOS*. Dordrecht, The Netherlands: Springer, 2011.
- [11] M. Steyaert, T. Van Breussegem, H. Meyvaert, P. Callemeyn, and M. Wens, "DC–DC converters: From discrete towards fully integrated CMOS," in *Proc. Eur. Solid-State Device Res. Conf. (ESSDERC)*, Sep. 2011, pp. 59–66.
- [12] T.-C. Huang, C.-Y. Hsieh, Y.-Y. Yang, Y.-H. Lee, Y.-C. Kang, K.-H. Chen, C.-C. Huang, Y.-H. Lin, and M.-W. Lee, "A battery-free 217 nW static control power buck converter for wireless RF energy harvesting with α-calibrated dynamic on/off time and adaptive phase lead control," *IEEE J. Solid-State Circuits*, vol. 47, no. 4, pp. 852–862, Apr. 2012.
- [13] X. Zhang, P.-H. Chen, Y. Okuma, K. Ishida, Y. Ryu, K. Watanabe, T. Sakurai, and M. Takamiya, "A 0.6 V input CCM/DCM operating digital buck converter in 40 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 49, no. 11, pp. 2377–2386, Nov. 2014.
- [14] P.-H. Chen, C.-S. Wu, and K.-C. Lin, "A 50 nW-to-10 mW output power tri-mode digital buck converter with self-tracking zero current detection for photovoltaic energy harvesting," *IEEE J. Solid-State Circuits*, vol. 51, no. 2, pp. 523–532, Feb. 2016.
- [15] J. Gjanci and M. H. Chowdhury, "A hybrid scheme for on-chip voltage regulation in system-on-a-chip (SOC)," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 19, no. 11, pp. 1949–1959, Nov. 2011.
- [16] F. U. Ahmed, Z. T. Sandhie, L. Ali, and M. H. Chowdhury, "A brief overview of on-chip voltage regulation in high-performance and high-density integrated circuits," *IEEE Access*, vol. 9, pp. 813–826, 2021.
- [17] G. Chen, M. A. Anders, H. Kaul, S. K. Satpathy, S. K. Mathew, S. K. Hsu, A. Agarwal, R. K. Krishnamurthy, V. De, and S. Borkar, "A 340 mV-to-0.9 V 20.2 Tb/s source-synchronous hybrid packet/circuit-switched 16 × 16 network-on-chip in 22 nm tri-gate CMOS," *IEEE J. Solid-State Circuits*, vol. 50, no. 1, pp. 59–67, Jan. 2015.
- [18] N. Couniot, G. de Streel, F. Botman, A. K. Lusala, D. Flandre, and D. Bol, "A 65 nm 0.5 V DPS CMOS image sensor with 17 pJ/frame.pixel and 42 dB dynamic range for ultra-low-power SoCs," *IEEE J. Solid-State Circuits*, vol. 50, no. 10, pp. 2419–2430, Oct. 2015.
- [19] S. Bose, B. Shen, and M. L. Johnston, "A batteryless motion-adaptive heartbeat detection system-on-chip powered by human body heat," *IEEE J. Solid-State Circuits*, vol. 55, no. 11, pp. 2902–2913, Nov. 2020.
- [20] H. Bhamra, Y.-W. Huang, Q. Yuan, and P. Irazoqui, "An ultra-low power 2.4 GHz transmitter for energy harvested wireless sensor nodes and biomedical devices," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 68, no. 1, pp. 206–210, Jan. 2021.
- [21] N. Van Helleputte, S. Kim, H. Kim, J. P. Kim, C. Van Hoof, and R. F. Yazicioglu, "A 160-μA biopotential acquisition IC with fully integrated IA and motion artifact suppression," *IEEE Trans. Biomed. Circuits Syst.*, vol. 6, no. 6, pp. 552–561, Dec. 2012.
- [22] N. Van Helleputte, M. Konijnenburg, J. Pettine, D.-W. Jee, H. Kim, A. Morgado, R. Van Wegberg, T. Torfs, R. Mohan, A. Breeschoten, H. de Groot, C. Van Hoof, and R. F. Yazicioglu, "A 345 μW multi-sensor biomedical SoC with bio-impedance, 3-channel ECG, motion artifact reduction, and integrated DSP," *IEEE J. Solid-State Circuits*, vol. 50, no. 1, pp. 230–244, Jan. 2015.

- [23] J. Xu, M. Konijnenburg, H. Ha, R. van Wegberg, S. Song, D. Blanco-Almazán, C. Van Hoof, and N. Van Helleputte, "A 36 μW 1.1 mm<sup>2</sup> reconfigurable analog front-end for cardiovascular and respiratory signals recording," *IEEE Trans. Biomed. Circuits Syst.*, vol. 12, no. 4, pp. 774–783, Aug. 2018.
- [24] S.-Y. Lee, P.-W. Huang, J.-R. Chiou, C. Tsou, Y.-Y. Liao, and J.-Y. Chen, "Electrocardiogram and phonocardiogram monitoring system for cardiac auscultation," *IEEE Trans. Biomed. Circuits Syst.*, vol. 13, no. 6, pp. 1471–1482, Dec. 2019.
- [25] Y.-J. Park, J.-H. Park, H.-J. Kim, H. Ryu, S. Kim, Y. Pu, K. C. Hwang, Y. Yang, M. Lee, and K.-Y. Lee, "A design of a 92.4% efficiency triple mode control DC–DC buck converter with low power retention mode and adaptive zero current detector for IoT/Wearable applications," *IEEE Trans. Power Electron.*, vol. 32, no. 9, pp. 6946–6960, Sep. 2017.
- [26] F. Santoro, R. Kuhn, N. Gibson, N. Rasera, T. Tost, H. Graeb, B. Wicht, and R. Brederlow, "A hysteretic buck converter with 92.1% maximum efficiency designed for ultra-low power and fast wake-up SoC applications," *IEEE J. Solid-State Circuits*, vol. 53, no. 6, pp. 1856–1868, Jun. 2018.
- [27] J.-C. Tsai, T.-Y. Huang, W.-W. Lai, and K.-H. Chen, "Dual modulation technique for high efficiency in high-switching buck converters over a wide load range," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 58, no. 7, pp. 1671–1680, Jul. 2011.
- [28] Y. Wang, J. Xu, F. Qin, and D. Mou, "A capacitor current and capacitor voltage ripple controlled SIDO CCM buck converter with wide load range and reduced cross regulation," *IEEE Trans. Ind. Electron.*, vol. 69, no. 1, pp. 270–281, Jan. 2022.
- [29] A. Paidimarri and A. P. Chandrakasan, "A wide dynamic range buck converter with sub-nW quiescent power," *IEEE J. Solid-State Circuits*, vol. 52, no. 12, pp. 3119–3131, Dec. 2017.





**SIWAKORN THONGMARK** received the B.Eng. degree (Hons.) in electrical engineering from Kasetsart University, Bangkok, Thailand, in 2018, where he is currently pursuing the M.Eng. degree in electrical engineering. From 2018 to 2022, he was a Researcher with the Kasetsart's Low Power Integrated Circuits and Systems Laboratory. His research interests include low-power analog and mixed-signal circuit design for biomedical applications.

**WORADORN WATTANAPANITCH** (Member, IEEE) received the B.S. degree (summa cum laude) in electrical and computer engineering from Cornell University, Ithaca, NY, USA, in 2005, and the M.Sc. and Ph.D. degrees in electrical engineering and computer science from the Massachusetts Institute of Technology (MIT), Cambridge, in 2007 and 2011, respectively.

He worked on developing ultra-low-power electronics for biomedical applications with MIT.

He joined the Department of Electrical Engineering, Kasetsart University, Bangkok, Thailand, in 2011, as a Faculty Member, where he currently leads the Kasetsart's Low Power Integrated Circuits and Systems Research Group. His research interests include low-power analog and mixed-signal circuit design for biomedical applications, efficient power management systems, adaptive circuit techniques, and control theory.

...