# An Edge-Combining Frequency-Multiplying Class-D Power Amplifier

Hieu Minh Nguyen, Student Member, IEEE, Feifei Zhang, Member, IEEE, Ivan O'Connell, Senior Member, IEEE, R. Bogdan Staszewski, Fellow, IEEE, and Jeffrey Sean Walling, Senior Member, IEEE

Abstract—The class-D power amplifier (PA) is commonly implemented in CMOS, but its operating frequency is often limited due to the power loss of parasitic capacitances and the lower transition frequency of the PMOS transistor. In this brief we demonstrate edge-combining frequency-multiplication embedded directly in the output-stage, allowing higher-frequency operation of the class-D PA, while maintaining similar performance to a lower-frequency PA. A 65 nm CMOS prototype achieves output power and system efficiency of 22.3 dBm and 30.2%, respectively. The prototype is tested with a D-BPSK signal and achieves an EVM of 2%-rms. Although the prototype was not embedded with amplitude modulation capability, it can be readily adapted for such operation using switched-capacitor PA techniques.

*Index Terms*—CMOS power amplifier (PA), class-D PA, edge-combining, frequency multiplier, switching PA.

## I. INTRODUCTION

SCALING of the CMOS transistor has been primarily motivated by economic considerations related to minimizing the cost per transistor, rather than improving its performance as a transconductor. As such, CMOS devices show better performance as low-resistance switches than they do as current sources. This motivates the use of CMOS devices in switched-mode RF power amplifiers (PAs) and transmitters (TXs), which are typically more energy efficient than their counterparts operating in linear modes. Switched-mode RF PAs are typically limited in maximum frequency of operation for three primary reasons. First, for optimal performance, they require sharp switching-edge transitions at the input and output of the switching device to create the pulse-shaping that makes them more energy efficient. Second, the parasitic capacitance at the input and output must be driven, which requires proportionally higher power at higher frequency.

Manuscript received 7 April 2022; accepted 27 April 2022. Date of publication 29 April 2022; date of current version 9 February 2023. This brief was recommended by Associate Editor J. Goes. (*Corresponding author: Jeffrey Sean Walling.*)

Hieu Minh Nguyen is with the University College Dublin, Dublin 4, D04 V1W8 Ireland.

Feifei Zhang and R. Bogdan Staszewski are with the School of Electrical and Electronic Engineering, University College Dublin, Dublin 4, D04 V1W8 Ireland.

Ivan O'Connell is with MCCI, Tyndall National Institute, Cork, T12 R5CP Ireland.

Jeffrey Sean Walling is with the Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061 USA (e-mail: jeffrey.s.walling@gmail.com).

Color versions of one or more figures in this article are available at https://doi.org/10.1109/TCSII.2022.3171495.

Digital Object Identifier 10.1109/TCSII.2022.3171495


Fig. 1. Cascoded class-D PA with parasitic capacitance.

Third, transistors can operate efficiently in the switching mode up to  $\sim f_T/10$  [1]. When implemented in CMOS, the class-D PA is a static inverter followed by a series resonant filter, tuned to the output frequency,  $f_o$ , as shown in Fig. 1. The above limitations are exacerbated in CMOS because of the use of a high-side PMOS switch, which typically has lower  $f_T$  than an NMOS, and often must be scaled larger due to the lower carrier mobility.

Frequency-multiplying output stages have been proposed ever since vacuum tubes predominated [2], [3]. In these multiplying stages, the tube/transistor is either switched or partially conducts. The resulting pulsed waveform at the output of the transistor is rich in harmonics; hence, using a tuned pulse-shaping filter enables the desired output frequency to be selected and delivered to the load. In a class-D harmonic multiplier, the series resonant circuit could be tuned to the desired harmonic output (e.g.,  $3 \times f_o$ ), to select this as an output, rather than the fundamental,  $f_o$ . The challenge with such multipliers is that the signal amplitude at the harmonic frequency is typically a fraction of the amplitude available at the fundamental frequency. For instance, for a 50% duty-cycle square-wave at  $V_D$ , the third harmonic component has 9.5 dB less power than does the fundamental. Hence, it is difficult to output high RF power using such harmonic multiplying. Nevertheless, the technique has shown usefulness in sub-harmonic switching architectures to extend the dynamic range of digital transmitters [4], [5].
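The 9.5-dB figure quoted above follows from the Fourier series of an ideal square wave, whose odd harmonics fall off as $1/n$ in amplitude. The short sketch below reproduces that number; the function names are illustrative, and an ideal 50% duty-cycle waveform with zero rise and fall time is assumed.

```python
import math

def square_wave_harmonic_amplitude(n: int, v_peak: float = 1.0) -> float:
    """Fourier amplitude of the n-th harmonic of an ideal 50% duty-cycle
    square wave switching between 0 and v_peak (even harmonics vanish)."""
    if n % 2 == 0:
        return 0.0
    return 2.0 * v_peak / (n * math.pi)

def harmonic_power_relative_db(n: int) -> float:
    """Power of the n-th harmonic relative to the fundamental, in dB."""
    amplitude = square_wave_harmonic_amplitude(n)
    if amplitude == 0.0:
        return float("-inf")  # even harmonics carry no power in the ideal case
    return 20.0 * math.log10(amplitude / square_wave_harmonic_amplitude(1))

third_harmonic_db = harmonic_power_relative_db(3)
print(f"3rd harmonic vs. fundamental: {third_harmonic_db:.2f} dB")  # about -9.54 dB
```

The same function shows why higher-order harmonic multiplication becomes progressively less attractive: the fifth harmonic is roughly 14 dB below the fundamental under the same idealization.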

Digital edge combining has been proposed to embed frequency multiplication directly in a PA output stage, primarily in ultra-low-power applications. In [6], stacked-NMOS current sources are controlled with a nine-phase input signal that sequentially closes different branches, whose shared outputs are summed into a class-E pulse-shaping network, resulting in an output frequency  $9 \times$  the input frequency.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/





Fig. 2. XOR-based frequency-multiplying class-D PA with parasitic capacitance.

An XOR-gate can act as an edge-combining frequency multiplier, as shown in Fig. 2. In this configuration, the input signals, A and B, are at frequency  $f_o$  and 90° out-of-phase. The resulting output signal,  $V_D$, is at  $2 \times f_o$, and the resonant tank can be tuned to this frequency. In this case, the edge-combining multiplication results in a signal where the dominant component at the *output* of the switch network is operating at the desired output frequency and hence does not suffer the same power degradation as in the harmonic-based multiplication. XOR multiplication can be generalized to more than two input phases, allowing the potential for output frequencies much higher than the previous limits would dictate. This has recently been demonstrated in applications up to low mm-wave range [7]. In this brief, an XOR-based edge-combining frequency multiplication is introduced and demonstrated with a  $3 \times$  multiplying prototype class-D amplifier. The technique can be generalized to  $N \times$  multiplication and can be embedded with amplitude modulation if it replaces the static inverter used in the switched-capacitor power amplifier (SCPA) [8].
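The doubling behavior of the XOR gate can be checked with a purely logical model. The sketch below samples two ideal square waves offset by 90°, XORs them, and counts rising edges per input cycle. It is a behavioral illustration only: the helper names are hypothetical, and switch resistance, parasitics and the resonant tank are ignored.

```python
def square(phase_deg: float) -> int:
    """Ideal 50% duty-cycle square wave: 1 during the first half of each cycle."""
    return 1 if (phase_deg % 360.0) < 180.0 else 0

def count_rising_edges(samples: list) -> int:
    """Number of 0 -> 1 transitions, treating the sequence as one periodic cycle."""
    previous = samples[-1:] + samples[:-1]
    return sum(1 for prev, cur in zip(previous, samples) if prev == 0 and cur == 1)

STEPS_PER_CYCLE = 360  # one sample per degree of the input cycle
phases = [float(k) for k in range(STEPS_PER_CYCLE)]
a = [square(p) for p in phases]          # input A
b = [square(p - 90.0) for p in phases]   # input B, delayed by 90 degrees
v_d = [x ^ y for x, y in zip(a, b)]      # XOR edge combining

input_edges = count_rising_edges(a)
output_edges = count_rising_edges(v_d)
print(input_edges, output_edges)  # 1 rising edge in, 2 rising edges out
```

The XOR output also keeps a 50% duty cycle in this idealized model, which is why its dominant spectral component sits at twice the input frequency rather than at the input frequency.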

This brief is organized as follows. Theoretical operation of edge-combining frequency multiplication is described in Section II. Details of a proposed frequency tripling class-D PA prototype, implemented in a 65-nm RF CMOS process, are described in Section III. In Section IV, measurement results of the prototype are summarized. Finally, conclusions and future work are presented in Section V.

## II. THEORETICAL OPERATION

Edge-combining frequency multiplication has previously been embedded in the output stages of low-power transmitters intended for medical implants, with the purpose of reducing the power cost associated with clock generation [6], [9]. The XOR-gate with offset phases (Fig. 2) shows the principle of edge-combining when applied to the class-D amplifier. In the case of the XOR-gate multiplier, one pull-up branch is closed, followed by one pull-down branch, every half-cycle of the input waveform. Using two pull-up branches and two pull-down branches allows two positive edges and two negative edges to be created per cycle and hence the frequency of the waveform at the drain node  $V_D$  of the switch is  $f_o = 2 \times f_i$.

Generally, the edge-combining frequency multiplication principle of the two-input XOR-gate can be extended to multiplication by $N$ by creating $N$ parallel pull-up and pull-down paths, as shown in Fig. 3. When $N$ is even,  $2 \times N$  clock phases control the individual pull-up and pull-down paths. When $N$ is odd, $N$ clock phases control the individual pull-up and pull-down paths. By stacking the devices in series, each pull-up/down branch must only close once per input cycle; hence the  $f_T$  limitation of the devices is alleviated. It should be noted that higher frequency multiplication factors can be achieved either by adding more parallel pull-up/down branches, or by stacking more devices, or a combination of the two methods. However, stacking up more devices is limited for two primary reasons. First, each device in the stack adds more switching resistance and hence the devices must be made larger to compensate, requiring more power to switch the individual devices. Second, unless isolated wells or SOI devices are used, the maximum voltage in the stack is limited, which makes it more difficult to deliver power as the number of stacked devices is increased.
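The extension to multiplication by $N$ can be illustrated with the same kind of logical model. In the sketch below, $N$ square-wave phases spaced by $180°/N$ are combined by parity (a multi-input XOR), and the rising edges of the result are counted over one input cycle. The parity function is an assumption standing in for the stacked pull-up/pull-down branches of Fig. 3, which are not modeled here; all names are illustrative.

```python
from functools import reduce
from operator import xor

def square(phase_deg: float) -> int:
    """Ideal 50% duty-cycle square wave: 1 during the first half of each cycle."""
    return 1 if (phase_deg % 360.0) < 180.0 else 0

def edge_combined_output(n: int, steps_per_cycle: int = 360) -> list:
    """Parity of n input phases spaced 180/n degrees apart, over one input cycle."""
    offsets = [k * 180.0 / n for k in range(n)]
    output = []
    for step in range(steps_per_cycle):
        phase = 360.0 * step / steps_per_cycle
        output.append(reduce(xor, (square(phase - off) for off in offsets)))
    return output

def count_rising_edges(samples: list) -> int:
    """Number of 0 -> 1 transitions, treating the sequence as one periodic cycle."""
    previous = samples[-1:] + samples[:-1]
    return sum(1 for prev, cur in zip(previous, samples) if prev == 0 and cur == 1)

# Rising edges per input cycle equal the multiplication factor in this model.
multiplication = {n: count_rising_edges(edge_combined_output(n)) for n in (2, 3, 5)}
print(multiplication)  # {2: 2, 3: 3, 5: 5}
```

For odd $N$, the phase set $\{k \cdot 180°/N\}$ together with its complements coincides with $N$ phases spaced by $360°/N$ and their complements, which is consistent with the odd case needing only $N$ distinct clock phases.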

Although the edge-combining principle can be theoretically extended to any arbitrary value of $N$, there are practical limitations as to the maximum value of $N$. First, the overall power consumption due to switching is not reduced in the edge-combining multiplication. This is because, although the frequency is reduced by a factor of $N$, there are now  $N \times$  as many devices to be switched. Second, each branch that is added increases complexity in the layout, resulting in additional parasitics due to routing. The parasitics at the output are switched at the higher output frequency and hence worsen the energy efficiency. Third, by increasing $N$, more clock phases are required to be generated and distributed, thus increasing the power consumption in the clock generation and distribution, and requiring higher precision in the relative delay of each phase as more phases are added. As a result, edge-combining frequency multiplication has been limited to  $N \leq 9$  [6], [10], and output frequencies up to the Ka-band have been achieved [7].

The edge-combining, frequency-tripling class-D PA has an output power given by the following:

$$P_{\rm out} = \alpha^2 \beta \frac{2}{\pi^2} \frac{V_{\rm DD}^2}{R_L}. \tag{1}$$

 $R_L$  is the termination resistance and  $\alpha$  and  $\beta$  are loss factors associated with the switching resistance and output matching network, respectively.  $\alpha$  arises due to voltage division between the switch and termination and is given as follows:

$$\alpha = \frac{R_L}{R_L + r_{sw}} \tag{2}$$

where  $r_{sw}$  is the equivalent switching resistance of the output stage.  $R_L$  and the series resonant filter are typically realized with an output matching network that transforms the load impedance (e.g.,  $Z_{\text{antenna}} = 50 \Omega$ ) to  $R_L$ . Because of the multiplication by N, the output matching network is tuned at  $N \times f_i$ .  $\beta$  is the attenuation of the output matching network. For a two-element down-converting impedance match,  $\beta$  is given as follows [11]:

$$\beta \approx \frac{1 - \frac{Q_{\rm NW}}{Q_C}}{1 + \frac{Q_{\rm NW}}{Q_L}} \tag{3}$$

where  $Q_C$  and  $Q_L$  are the quality factors of the passive components in the matching network. Typically,  $Q_C > 100$  and  $Q_L < 20$.  $Q_{\rm NW}$  is the network's loaded quality factor and, assuming the load impedance is 50  $\Omega$, it is given by the following:

$$Q_{\rm NW} = \sqrt{\frac{50}{R_L} - 1} \tag{4}$$

Fig. 3. (left) Schematic of a generalized edge-combining frequency-multiplying class-D PA and (right) input and output pulse waveforms.

Fig. 4. Block diagram of the implemented edge-combining, frequency-tripling class-D PA.

 $R_L$  can be found for the desired output power level by substituting (2), (3) and (4) into (1) and iterating  $R_L$  until the desired output power is met. Next, details of the circuit blocks of a class-D PA using edge-combining frequency multiplication to achieve  $f_o = 3 \times f_i$  will be described.
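The iteration over $R_L$ described above can be sketched as a simple downward scan from the 50-$\Omega$ load impedance, evaluating (1) through (4) at each step. The supply voltage, switch resistance, quality factors and target power below are illustrative assumptions, not the design values of the prototype, and the function names are hypothetical.

```python
import math

Z_LOAD = 50.0  # load (antenna) impedance in ohms, as assumed in (4)

def output_power(r_l, v_dd, r_sw, q_c, q_l):
    """Evaluate (1) using the loss factors (2)-(4) for a termination resistance r_l."""
    alpha = r_l / (r_l + r_sw)                          # (2): switch/termination divider
    q_nw = math.sqrt(max(0.0, Z_LOAD / r_l - 1.0))      # (4): loaded network Q
    beta = (1.0 - q_nw / q_c) / (1.0 + q_nw / q_l)      # (3): matching-network loss
    return alpha ** 2 * beta * (2.0 / math.pi ** 2) * v_dd ** 2 / r_l  # (1)

def find_termination_resistance(target_w, v_dd, r_sw, q_c, q_l, step_ohm=0.01):
    """Step r_l down from the load impedance until (1) first meets target_w.
    Returns None if the target is never reached."""
    n_steps = int(round(Z_LOAD / step_ohm))
    for i in range(n_steps, 0, -1):
        r_l = i * step_ohm
        if output_power(r_l, v_dd, r_sw, q_c, q_l) >= target_w:
            return r_l
    return None

# Illustrative, assumed values (not the prototype's design values).
PARAMS = dict(v_dd=2.4, r_sw=1.0, q_c=150.0, q_l=15.0)
TARGET_W = 0.1
r_l_found = find_termination_resistance(TARGET_W, **PARAMS)
print(f"R_L = {r_l_found:.2f} ohm gives "
      f"{output_power(r_l_found, **PARAMS) * 1e3:.1f} mW")
```

A scan is used rather than a root finder because (1) is not monotonic in $R_L$ over the whole range: the term $\alpha^2/R_L$ peaks near $R_L = r_{sw}$, so the largest $R_L$ that meets the target is the solution of interest.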

## III. CIRCUIT IMPLEMENTATION

A block diagram of an edge-combining, frequency-tripling class-D PA can be found in Fig. 4. The PA system consists of an LVDS clock receiver, an injection-locked multi-phase ring oscillator for phase generation, drivers and an edge-combining class-D output stage with an on-chip output match. Circuit details for each of the implemented blocks follow, starting at the output stage and ending at the input.

### A. Edge-Combining, Frequency-Tripling Class-D PA Output Stage and Drivers

Fig. 5. (left) Schematic of the edge-combining, frequency-tripling output stage and matching network, and (right) input 3-phase clock waveforms.

The output stage, as shown in Fig. 4, is pseudo-differential, consisting of a pair of edge-combining, frequency-tripling class-D PAs that are driven by complementary phases. The half-circuit schematic for the output stage is shown in Fig. 5. The PMOS transistors are driven with level-shifted signals, switching between  $V_{\text{DD}}$  and  $V_{\text{DD2}}(= 2 \times V_{\text{DD}})$, while the NMOS transistors are switched between  $V_{\text{GND}}$  and  $V_{\text{DD}}$. Each of the NMOS transistors has an aspect ratio of 978  $\mu$m/65 nm, while the PMOS transistors are 2245  $\mu$m/65 nm. The output matching network converts the load impedance  $(R_L = 50 \Omega)$  to the optimum termination impedance  $(7 \Omega)$  using a two-element, high-pass match, where  $C_{ser} \approx 2.7$  pF and  $L_{sh} \approx 475$  pH. The half-circuit inductor is merged, with the center of the inductor acting as a virtual ground; hence the total spiral for the differential pair has an inductance of 950 pH. The insertion loss of the matching network is  $\approx 0.5$  dB at the carrier frequency. The clock waveforms are also shown in Fig. 5. To create the level-shifted signals for the PMOS and NMOS transistors, while ensuring equal delay, star-connected level-shifters are used, similar to the one proposed in [12]. Following the level-shifter, buffers are sized using standard logical-effort scaling techniques, requiring six stages scaled with an average taper factor  $\approx 3$.
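To give a sense of scale for the driver chain, the sketch below computes the relative stage sizes and the overall load-to-input capacitance ratio for a geometrically tapered chain. It assumes ideal inverters and a uniform taper of exactly 3, whereas the text reports only an average taper of about 3; the function names are illustrative.

```python
def buffer_chain_sizes(stages: int, taper: int) -> list:
    """Relative size of each stage in a geometrically tapered chain (first stage = 1)."""
    return [taper ** k for k in range(stages)]

def total_drive_ratio(stages: int, taper: int) -> int:
    """Load-to-input capacitance ratio driven when every stage has the same fan-out."""
    return taper ** stages

sizes = buffer_chain_sizes(stages=6, taper=3)
ratio = total_drive_ratio(stages=6, taper=3)
print(sizes, ratio)  # [1, 3, 9, 27, 81, 243] 729
```

Under these assumptions, six stages with a taper of 3 let a minimum-size input drive a load roughly 729 times its own input capacitance, which is the order of magnitude needed for millimeter-wide output devices.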

### B. Injection-Locked Ring-Oscillating Clock Generator

The 3-phase (+3-complementary-phase) clock generation is realized with an injection-locked ring oscillator (ILRO), as shown in Fig. 6. A three-stage differential ring-oscillator is designed where the delay cells are based upon the Maneatis delay cell [13]. The oscillator is injection-locked to an input clock frequency that is provided from off-chip via an LVDS interface. The delay cells are current-starved and the delay can be tuned using a control voltage, which increases the locking range so that injection locking can be guaranteed across a broad range of input frequencies. Bi-directional buffering is connected to each of the output ports of the oscillator (e.g., A, B, C and their complements). This minimizes the impact of the injection locking on the phase difference between A, B, and C. The clock routing is buffered for global routing to the drivers/level-shifters that are located at the output stage.



Fig. 6. Input injection-locked oscillator.



Fig. 7. Chip microphotograph of the proposed 65-nm RF PA prototype.

### C. LVDS Clock Receiver

The ILRO's injection signal is obtained from an off-chip signal generator. Given the high frequency of the injection signal (e.g.,  $f_o/3$ ), an LVDS clock receiver (RX) is implemented to receive the injection signal and drive it into the ILRO. The LVDS comprises a differential pair with a regenerative (positive-feedback) latch, similar to what is implemented in [14]. Following the LVDS clock RX is a self-biased inverter acting as a trans-impedance amplifier (TIA). The output of the TIA drives the ILRO via a pair of input buffers.

## IV. MEASUREMENT AND DISCUSSION

A prototype edge-combining, frequency-tripling class-D PA is fabricated in a 65-nm RF CMOS process with MiM capacitors and ultra-thick metal (UTM). The chip microphotograph is shown in Fig. 7, and it occupies an area of  $2 \times 2$ mm<sup>2</sup>. Next, the static PA characterization is discussed.

### A. Static Performance

Fig. 8. Measured $P_{\text{out}}$ and SE vs. output frequency.

Fig. 9. Measured spectrum and constellation for D-BPSK rotation modulation at 4.5 GHz  $f_{\rm out}$.

The PA is designed for an output frequency centered at  $f_o = 3 \times f_i = 4.5$  GHz. The PA is a switched-mode PA, so its output amplitude is not meant to be proportional to its input amplitude (i.e., it is nonlinear). Amplitude control can be added using programmable charge division, as in switched-capacitor PAs [8], or using outphasing [15] or supply modulation [16]. The output power,  $P_{\text{out}}$, achieves a peak value of 22.3 dBm at 4.5 GHz. The system efficiency (SE) is the ratio of the output power to the total power supplied to the chip. The SE has a peak value of 30.2%. The  $P_{\text{out}}$  and SE are plotted in Fig. 8. For comparison, we simulated a conventional class-D PA with the same switch periphery and matching network, but without the edge-combining, and were able to show that the simulated SE of the edge-combining PA was  $\approx 6\%$  higher than that of the conventional PA.
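The SE definition above can be applied to the reported peak values to estimate the total power drawn by the chip. The sketch below is a unit conversion only: the resulting supply power is inferred from the stated $P_{\text{out}}$ and SE, assuming both peaks occur at the same operating point, and is not a separately reported measurement.

```python
def dbm_to_mw(p_dbm: float) -> float:
    """Convert a power level in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)

P_OUT_DBM = 22.3        # reported peak output power
SYSTEM_EFF = 0.302      # reported peak system efficiency

p_out_mw = dbm_to_mw(P_OUT_DBM)
p_total_mw = p_out_mw / SYSTEM_EFF  # SE = P_out / P_total

print(f"P_out = {p_out_mw:.1f} mW, implied total supply power = {p_total_mw:.0f} mW")
```

With these inputs the output power is about 170 mW and the implied total supply power is about 562 mW, which includes the clock receiver, ILRO and drivers as well as the output stage.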

### B. Dynamic Performance

The prototype PA does not have an amplitude modulator, hence it is measured with a constant envelope 10-MS/s D-BPSK modulation. Due to the tripling of the frequency, *the input phase must be divided by 3* in the digital signal processing. The measured output spectrum, constellation and eye-diagram for the modulation measurement at 4.5 GHz are shown in Fig. 9, using an input sample rate of 100 MS/s.
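The phase pre-division can be illustrated with an ideal multiplier model in which the output phase is exactly three times the input phase. The sketch below builds a D-BPSK phase trajectory, divides it by 3 for the PA input, and confirms that tripling restores the intended symbols. The bit-to-phase convention (a 1 toggles the phase by $\pi$) and the function names are assumptions made for illustration; the source does not specify the mapping.

```python
import cmath
import math

def dbpsk_output_phases(bits):
    """Desired carrier phase per symbol: a 1 bit advances the phase by pi (assumed mapping)."""
    phase = 0.0
    phases = []
    for bit in bits:
        if bit:
            phase += math.pi
        phases.append(phase)
    return phases

def predivide_phase(phases, n=3):
    """Input phase to apply so that an n-times multiplier yields the desired phase."""
    return [p / n for p in phases]

def ideal_multiplier(phases, n=3):
    """Ideal frequency multiplier: scales the instantaneous phase by n."""
    return [n * p for p in phases]

bits = [1, 0, 1, 1, 0]
target_phases = dbpsk_output_phases(bits)
input_phases = predivide_phase(target_phases)
recovered_phases = ideal_multiplier(input_phases)
symbols = [round(cmath.exp(1j * p).real) for p in recovered_phases]
print(symbols)  # [-1, -1, 1, -1, -1]
```

In this model a 180° step at the output corresponds to only a 60° step at the input, so the phase modulator ahead of the PA needs correspondingly finer phase resolution.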

The linearity can be optimized at different frequencies by optimizing the delay of the ILRO unit cells. This is done using the delay control (Fig. 6). The EVM for the same 10-MS/s D-BPSK signal is measured across frequency when the delay control is optimized for output at 3.9, 4.2 and 4.5 GHz, with the results presented in Fig. 10. The optimal EVM has a floor of  $\sim 2\%$ -rms. It is noted that even for sub-optimal delay cell tuning in the ILRO, the EVM does not degrade significantly.



Fig. 10. Measured EVM versus frequency optimized at 3.9, 4.2 and 4.5 GHz.



Fig. 11. Measured ACLR and EVM versus baseband sampling frequency.

TABLE I

COMPARISON TO RECENT SWITCHED-MODE PAs

| Ref.      | Topology              | Tech.  | Supply [V] | f<sub>in</sub> [GHz] | f<sub>out</sub> [GHz] | P<sub>out</sub> [dBm] | SE<sup>*</sup>/PAE<sup>#</sup>/η<sup>^</sup> [%] |
|-----------|-----------------------|--------|------------|----------------------|-----------------------|-----------------------|--------------------------------------------------|
| This work | Class-D               | 65 nm  | 2.4        | 1.5                  | 4.5                   | 22.3                  | 30.2*                                            |
| [17]      | Class-D<sup>-1</sup>  | 65 nm  | 2.1        | 3.3                  | 3.3                   | 23                    | 35*                                              |
| [18]      | Class-D<sup>-1</sup>  | 65 nm  | 3.0        | 4.5                  | 4.5                   | 26.7                  | 27^                                              |
| [19]      | Class-D               | 180 nm | 1.8        | 0.6                  | 0.6                   | 21                    | 47#                                              |

Finally, the EVM and ACLR are evaluated as a function of the input baseband sampling rate, as shown in Fig. 11. As expected, the ACLR and EVM improve with increased sampling rate, and approach their floor values for  $f_S = 100$  MHz.

## V. CONCLUSION

Outputting high frequencies is difficult in switching amplifiers, because switching becomes more difficult to achieve as frequencies approach  $\sim f_T/10$  [1]. This is exacerbated in class-D amplifiers that utilize a high-side PMOS switch. In this brief, an edge-combining frequency-multiplying class-D PA is introduced. A prototype was fabricated in 65-nm RF CMOS and measurement results demonstrate that the technique enables efficient operation at moderate output power after tripling the input frequency. A comparison to recent class-D and class-D<sup>-1</sup> RF PAs [17]–[19] is made in Table I. The output power and efficiency compare favorably to the prior art, noting that the  $f_{in}$  in the presented technique is reduced relative to the  $f_{out}$. Additionally, [17], [18] use class-D<sup>-1</sup> networks, which are known to place significant voltage stress on the switching transistors [20]. Although the amplifier does not include provisions for amplitude modulation, the architecture can be embedded in the SCPA [7], [8], outphasing [15] or supply-modulation [16] power amplifiers to provide for linear transmission of non-constant envelope modulation.

## ACKNOWLEDGMENT

The authors would like to thank Microelectronic Circuits Centre Ireland (MCCI) for technical and administrative support.

## REFERENCES

- [1] E. McCune, "A technical foundation for RF CMOS power amplifiers: Part 5: Making a switch-mode power amplifier," *IEEE Solid-State Circuits Mag.*, vol. 8, no. 3, pp. 57–62, 2016.
- [2] R. I. Sarbacher, "Power-tube performance in class C amplifiers and frequency multipliers as influenced by harmonic voltage," *Proc. IRE*, vol. 31, no. 11, pp. 607–625, Nov. 1943.
- [3] R. Zulinski and J. Steadman, "Idealized operation of class E frequency multipliers," *IEEE Trans. Circuits Syst.*, vol. CS-33, no. 12, pp. 1209–1218, Dec. 1986.
- [4] K. Cho and R. Gharpurey, "A 25.6 dBm wireless transmitter using RF-PWM with carrier switching in 130-nm CMOS," in *Proc. IEEE RFIC Symp.*, 2015, pp. 139–142.
- [5] A. Zhang and M. S.-W. Chen, "A watt-level phase-interleaved multisubharmonic switching digital power amplifier," *IEEE J. Solid-State Circuits*, vol. 54, no. 12, pp. 3452–3465, Dec. 2019.
- [6] J. Pandey and B. P. Otis, "A sub-100 μW MICS/ISM band transmitter based on injection-locking and frequency multiplication," *IEEE J. Solid-State Circuits*, vol. 46, no. 5, pp. 1049–1058, May 2011.
- [7] H. M. Nguyen, J. S. Walling, A. Zhu, and R. B. Staszewski, "A Ka-band switched-capacitor RFDAC using edge-combining in 22nm fd-SOI," in *Proc. IEEE VLSI Symp.*, 2021, pp. 1–2.
- [8] S.-M. Yoo, J. S. Walling, E. C. Woo, B. Jann, and D. J. Allstot, "A switched-capacitor RF power amplifier," *IEEE J. Solid-State Circuits*, vol. 46, no. 12, pp. 2977–2987, Dec. 2011.
- [9] R. R. Manikandan, A. Kumar, and B. Amrutur, "A digital frequency multiplication technique for energy efficient transmitters," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 23, no. 4, pp. 781–785, Apr. 2015.
- [10] G. Chien and P. R. Gray, "A 900-MHz local oscillator using a DLL-based frequency multiplier technique for PCS applications," *IEEE J. Solid-State Circuits*, vol. 35, no. 12, pp. 1996–1999, Dec. 2000.
- [11] Y. Han and D. J. Perreault, "Analysis and design of high efficiency matching networks," *IEEE Trans. Power Electron.*, vol. 21, no. 5, pp. 1484–1491, May 2006.
- [12] L. G. Salem, J. F. Buckwalter, and P. P. Mercier, "A recursive switched-capacitor house-of-cards power amplifier," *IEEE J. Solid-State Circuits*, vol. 52, no. 7, pp. 1719–1738, Jul. 2017.
- [13] J. G. Maneatis and M. A. Horowitz, "Precise delay generation using coupled oscillators," *IEEE J. Solid-State Circuits*, vol. 28, no. 12, pp. 1273–1282, Dec. 1993.
- [14] J. S. Walling *et al.*, "A class-E PA with pulse-width and pulse-position modulation in 65 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 44, no. 6, pp. 1668–1678, Jun. 2009.
- [15] H. Xu, Y. Palaskas, A. Ravi, M. Sajadieh, M. A. El-Tanani, and K. Soumyanath, "A flip-chip-packaged 25.3 dBm class-D outphasing power amplifier in 32 nm CMOS for WLAN application," *IEEE J. Solid-State Circuits*, vol. 46, no. 7, pp. 1596–1605, Jul. 2011.
- [16] P. Reynaert and M. S. J. Steyaert, "A 1.75-GHz polar modulated CMOS RF power amplifier for GSM-EDGE," *IEEE J. Solid-State Circuits*, vol. 40, no. 12, pp. 2598–2608, Dec. 2005.
- [17] S. Zheng and H. C. Luong, "A WCDMA/WLAN digital polar transmitter with low-noise ADPLL, wide-band PM/AM modulator and linearized PA in 65nm CMOS," in *Proc. ESSCIRC*, 2014, pp. 375–378.
- [18] J. S. Park, S. Hu, Y. Wang, and H. Wang, "A highly linear dual-band mixed-mode polar power amplifier in CMOS with an ultra-compact output network," in *Proc. IEEE CICC*, 2015, pp. 1–4.
- [19] J. Hur *et al.*, "A multilevel class-D CMOS power amplifier for an outphasing transmitter with a nonisolated power combiner," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 63, no. 7, pp. 618–622, Jul. 2016.
- [20] H. Kobayashi, J. M. Hinrichs, and P. M. Asbeck, "Current-mode class-D power amplifiers for high-efficiency RF applications," *IEEE Trans. Microw. Theory Techn.*, vol. 49, no. 12, pp. 2480–2485, Dec. 2001.