

Received November 9, 2021, accepted November 26, 2021, date of publication November 30, 2021, date of current version December 8, 2021.

Digital Object Identifier 10.1109/ACCESS.2021.3131819

# A 16.4-dBm 20.3% PAE 22-dB Gain 77 GHz Power Amplifier in 65-nm CMOS Technology

# VAN-SON TRINH, (Graduate Student Member, IEEE), AND JUNG-DONG PARK, (Senior Member, IEEE)

Division of Electronics and Electrical Engineering, Dongguk University, Seoul 04620, Republic of Korea

Corresponding author: Jung-Dong Park (jdpark@dongguk.edu)

This work was supported in part by the National Research Foundation of Korea (NRF) Grant by the Korean Government through the Ministry of Science, ICT and Future Planning (MSIP) under Grant 2019M3F6A1106118, in part by the Korea Institute of Energy Technology Evaluation and Planning (KETEP), and in part by the Ministry of Trade, Industry and Energy (MOTIE) of the Republic of Korea under Grant 20194030202320.

**ABSTRACT** We present a compact W-band power amplifier (PA) for automotive radar applications in 65-nm CMOS technology. The circuit adopts a pseudo-differential push-pull configuration based on transformers (TFs), which offer highly efficient and flexible matching networks with minimal area occupancy. With the optimal output resistance set close to 50 $\Omega$, design guidelines for sizing the active devices of each stage and the corresponding transformers are presented for optimal power efficiency, based on an analysis of the surrounding matching networks. Operating from a 1.3-V supply, the implemented 77-GHz PA achieved a 3-dB gain bandwidth of 9 GHz (72.5–81.5 GHz), a peak gain of 22.4 dB, a saturated power ($P_{sat}$) of 16.4 dBm, and a peak power-added efficiency (PAE) of 20.3%. The core layout occupies only 0.05 mm<sup>2</sup>, which is among the highest power densities of recently reported W-band CMOS PAs.

**INDEX TERMS** CMOS technology, millimeter-wave circuits, power amplifiers, transformers.

#### I. INTRODUCTION

Collision avoidance systems (CAS) based on radar, previously known from ships and aircraft, are now widely used in road vehicles to assist drivers in many situations. Such a system aims to enhance driving safety and provide better user convenience [1], [2]. The W-band is well suited to radar sensors for road vehicles because of two key properties. First, the short wavelength improves the detection resolution of radar sensors so that they can detect small objects the size of a human, a small car, or a traffic pole. The use of high frequencies also helps the sensors measure higher velocities, which is essential for a collision-avoidance system [2]. Second, owing to the strong penetrating properties of electromagnetic waves at high frequency, W-band radars are highly reliable in extreme environments (e.g., bad weather such as heavy rain, snow, or dense fog) [3]. Hence, the ITU has recommended the 76-81 GHz band for automotive radar applications.

Among various integrated circuit technologies, CMOS technology is preferred for full implementations of W-band radar sensors since it can provide adequate power and efficiency performance with the virtues of low cost and high integration capability. However, designing a high-performance PA in CMOS at such high frequencies is challenging due to low device breakdown voltages. Moreover, millimeter-wave circuitry suffers from a lossy substrate environment for passive devices as well as the inferior power gain of the active devices.

The associate editor coordinating the review of this manuscript and approving it for publication was Rocco Giofrè.

To cover the standard detection distance of 250 m, the required transmitter power is estimated to be 13 dBm for a typical channel with the radar cross-section (RCS) of a mid-size car ($\sim$30 m<sup>2</sup>) [4]. Nevertheless, a PA that can provide higher output power is preferred for reliable operation considering packaging losses. Also, an output power below saturation can be used under normal conditions for the best efficiency, while the peak power may be necessary in adverse environments.

Recently, it has become possible to achieve an output power larger than 13 dBm from a CMOS PA without a complex power-combining network, which could degrade the power efficiency due to the extra loss of the power combiners at the output [4], [5]. Therefore, it is natural to employ a PA without such a complex combiner/splitter to gain advantages in efficiency and area occupancy with less design effort [3]. To exploit the maximum possible power from a single-way PA, the active device at the output stage should be sized as large as possible while keeping impedance matching with the surrounding circuits feasible. Therefore, it is critical to select an optimal active device size to achieve the largest possible output power with minimal area occupancy.

FIGURE 1. Simplified schematic of the 77-GHz PA.

This paper presents a transformer-based push-pull PA design at 77 GHz for automotive radar applications in a 65-nm CMOS process whose back-end-of-line (BEOL) includes an ultra-thick metal (UTM) layer of copper. The circuit comprises three push-pull amplifier stages, targeting a power gain higher than 20 dB and an output power better than 13 dBm. Herein, the design procedure is emphasized as a guideline for choosing an optimal active device with proper transformer (TF) sizing for a highly efficient mm-wave PA.

#### **II. MILLIMETER-WAVE PUSH-PULL PA DESIGN**

The simplified schematic of the proposed PA is presented in Fig. 1. It consists of three stages of the push-pull amplifier, including the input stage, driving stage, and output stage. Besides various advantages in a compact size and power delivery, the transformer also provides galvanic isolation and electrostatic discharge (ESD) protection at the input and output ports.

In each amplifying stage, neutralization capacitors are included to improve the stability factor, which eventually results in better impedance matching. This architecture also increases the isolation between input and output and the gain of each push-pull amplifier stage. In this design, metal-oxide-metal (MOM) capacitors were used to achieve a high-precision embedding network instead of more compact MOS capacitors with a lower Q-factor [6]. The value of the neutralization capacitor is chosen to be roughly $C_{gd}$ of the transistor [7]. Specifically, Rollett's stability factor (K-$\Delta$) and the maximum available gain ($G_{ma}$) of the amplifier were investigated carefully to ensure stable operation [8].

The gate bias voltage ($V_{GS}$) for the three stages was chosen considering the trade-off between the dc power dissipation and the maximum output power. $V_{GS}$ for the output (third) stage was set to 0.7 V to achieve good output power, while $V_{GS}$ for the input and driving stages was set to 0.6 V for better power gain [8].


#### A. TRANSFORMER-BASED MATCHING NETWORK DESIGN

For the push-pull amplifier with transformers at the input and output, a proper design of the matching network is crucial in achieving optimal power efficiency. To model an on-chip transformer, the low-frequency model with five parameters $(L_1, Q_1, L_2, Q_2, \text{ and } k)$ has been widely used to characterize the coil inductances, quality factors, and the mutual inductive coupling factor between the two coils [9]. The source and load of the transformer can be either a 50-$\Omega$ terminal or the gates or drains of the MOSFETs in the PA. The optimal source and load for a given transformer are reported in [9], which are written in terms of admittances as

$$G_{S} = \frac{1}{R_{1}} \cdot \frac{\sqrt{1 + k^{2}Q_{1}Q_{2}}}{1 + Q_{1}^{2} + k^{2}Q_{1}Q_{2}}; \quad B_{S} = \frac{1}{\omega L_{1}} \cdot \frac{Q_{1}^{2}}{1 + Q_{1}^{2} + k^{2}Q_{1}Q_{2}}, \quad (1.1)$$

$$G_{L} = \frac{1}{R_{2}} \cdot \frac{\sqrt{1 + k^{2}Q_{1}Q_{2}}}{1 + Q_{2}^{2} + k^{2}Q_{1}Q_{2}}; \quad B_{L} = \frac{1}{\omega L_{2}} \cdot \frac{Q_{2}^{2}}{1 + Q_{2}^{2} + k^{2}Q_{1}Q_{2}}, \quad (1.2)$$

where  $Y_S = G_S + jB_S$  is the source admittance and  $Y_L = G_L + jB_L$  is the load admittance of the transformer. Throughout this work, impedance from a node is represented with admittance which characterizes parallel connections effectively.

In (1), when $Q_1 = Q_2 \gg 1$ and $k = 1$, the expressions for $B_S$ and $B_L$ simplify to $1/(2\omega L_1)$ and $1/(2\omega L_2)$, respectively. Although the solution given in (1) indicates specific optimal load and source values that maximize the efficiency of the transformer, the constraints on the real parts (i.e., the source and load conductance values) are not so stringent, especially when the quality factors are high. Intuitively, an ideal transformer can be viewed as an impedance transformer that only requires a specific ratio between the load and source resistances, set by its turns ratio, to maximize its efficiency.
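As a quick numerical check of (1), the following minimal sketch evaluates the optimal terminations and compares $B_S$ with the high-Q limit $1/(2\omega L_1)$. It assumes that $R_1$ and $R_2$ are the series coil resistances, $R_i = \omega L_i / Q_i$, which the text does not state explicitly; the function name and the 100-pH coil values are illustrative and not taken from the paper.

```python
import math

def optimal_terminations(L1, Q1, L2, Q2, k, freq):
    """Evaluate (1.1) and (1.2): optimal source/load admittances in siemens."""
    w = 2 * math.pi * freq
    # Assumption: R1 and R2 are the series coil resistances, R = w*L/Q.
    R1 = w * L1 / Q1
    R2 = w * L2 / Q2
    x = k**2 * Q1 * Q2
    root = math.sqrt(1 + x)
    G_S = (1 / R1) * root / (1 + Q1**2 + x)
    B_S = (1 / (w * L1)) * Q1**2 / (1 + Q1**2 + x)
    G_L = (1 / R2) * root / (1 + Q2**2 + x)
    B_L = (1 / (w * L2)) * Q2**2 / (1 + Q2**2 + x)
    return G_S, B_S, G_L, B_L

# Illustrative values only (not from the paper): 100-pH coils at 77 GHz.
F0 = 77e9
L_COIL = 100e-12
G_S, B_S, G_L, B_L = optimal_terminations(L_COIL, 200.0, L_COIL, 200.0, 1.0, F0)
limit = 1 / (2 * 2 * math.pi * F0 * L_COIL)  # 1/(2*w*L1)
print(f"B_S = {B_S * 1e3:.3f} mS, high-Q limit = {limit * 1e3:.3f} mS")
```

With a large Q and k = 1 the two printed values should nearly coincide, which is the simplification used in the text.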

For example, an ideal 1:2 TF presents the load resistance to the source side scaled down to a quarter of its value. To demonstrate that the real parts do not strongly affect the TF efficiency, we examined the TF efficiency (i.e., the power gain or $S_{21}$) of a symmetric 1:1 TF with $Q_1 = Q_2 = Q$ and $L_1 = L_2$ under various source and load resistance values (i.e., $R_{Sp}$ and $R_{Lp}$, respectively). Owing to the TF's symmetry, the two terminal resistances were set equal and denoted by $R_p$. For a fair comparison, $R_p$ was varied from its optimal value $R_{p(opt)}$, and optimal parallel capacitances were used to keep the optimal susceptance in each case.

**FIGURE 2.** Simulated efficiency of an ideally symmetric TF with  $Q = Q_1 = Q_2$  and  $L_1 = L_2$  at 77 GHz.

**FIGURE 3.** Simulated  $G_{ma}$  and  $S_{21}$  in several cases of load and source for the implemented TF ( $D_{in} = 26 \ \mu m$  in Fig. 6).

Fig. 2 presents the simulated efficiency of the TF versus the ratio $R_p/R_{p(opt)}$ for several values of the quality factor $Q$ and coupling factor $k$ at the center frequency. As can be seen, with $k = 1$ and $Q = 50$, the efficiency of the TF varied by merely 0.6 dB when $R_p$ was increased or decreased by a factor of 10 from its optimal value, $R_{p(opt)}$. This insensitivity of the efficiency to $R_p$ weakened drastically as the coupling factor $k$ decreased. Moreover, the quality factor $Q$ strongly affected the intrinsic insertion loss. Nevertheless, the overall trend of the efficiency versus $R_p$ was largely independent of $Q$. With the typical values of $Q = 10$ and $k = 0.7$, the efficiency of the TF decreased by around 2 dB (from -1.2 dB to -3.2 dB) when $R_p$ was changed by a factor of four from its optimal value (i.e., $\pm 6$ dB).
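The trend in Fig. 2 can be approximated with a lumped two-port model. The sketch below is not the authors' simulation. It assumes a symmetric 1:1 transformer described by $Z_{11} = Z_{22} = R + j\omega L$ and $Z_{21} = j\omega k L$ with $R = \omega L/Q$, terminates both ports with $G_{opt}/\text{ratio} + jB_{opt}$ from (1), and evaluates the standard transducer-gain expression. The function name `tf_gain_db` and the normalization $\omega L = 1$ are illustrative choices.

```python
import math

def tf_gain_db(Q, k, ratio):
    """Transducer gain (dB) of a symmetric 1:1 TF when both terminations use
    R_p = ratio * R_p(opt) while keeping the optimal parallel susceptance."""
    wL = 1.0                       # normalized coil reactance (assumption)
    R = wL / Q                     # assumed series coil resistance
    x = k**2 * Q**2
    G_opt = (1 / R) * math.sqrt(1 + x) / (1 + Q**2 + x)   # conductance from (1)
    B_opt = (1 / wL) * Q**2 / (1 + Q**2 + x)              # susceptance from (1)
    Z_t = 1 / complex(G_opt / ratio, B_opt)  # termination seen as an impedance
    Z11 = complex(R, wL)           # equals Z22 by symmetry
    Z21 = complex(0.0, k * wL)     # mutual reactance
    num = 4 * Z_t.real**2 * abs(Z21)**2
    den = abs((Z11 + Z_t)**2 - Z21**2)**2
    return 10 * math.log10(num / den)

for Q, k in ((50, 1.0), (10, 0.7)):
    for ratio in (0.1, 0.25, 1.0, 4.0, 10.0):
        print(f"Q={Q:>2}, k={k}, Rp/Rp(opt)={ratio:>5}: {tf_gain_db(Q, k, ratio):6.2f} dB")
```

In this simplified model the $Q = 10$, $k = 0.7$ case gives roughly -1.2 dB at the optimum and a little over -3 dB after a fourfold change in $R_p$, in line with the values quoted above.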

Figure 3 shows the simulated $G_{ma}$ and $S_{21}$ for several termination cases of the implemented TF with $D_{in} = 26 \ \mu \text{m}$ connected in the differential-to-differential configuration. When $Y_S$ and $Y_L$ were set to their optimal values, $S_{21}$ was maximized at the target frequency of 77 GHz. The operating frequency of the TF shifted to around 108 GHz when the parallel reactive terminations were reduced to half of their optimal values. Although changes in $R_{Sp}$ and $R_{Lp}$ caused only minor shifts in the peak frequency, there was a meaningful degradation of the power gain. When the optimal reactance values of the TF were applied at both the source and load, a decreased $R_p$ provided a wider bandwidth with a larger degradation of the power gain, whereas an increased $R_p$ made the bandwidth narrower with better power gain near the center frequency. Therefore, compensating the imaginary parts of the load and source admittances is crucial for setting the operating frequency of the TF, while matching their real parts has only a minor impact on the TF efficiency. From this, we could achieve power matching of the active device by marginally sacrificing the TF efficiency while improving the overall power efficiency of the designed PA.

TABLE 1. Performance of the output transistor at different sizes.

| Output transistor width (µm) | Suitable input transformer size (µm) | $P_{max}$ (mW) | PAE (%) | $R_{op}$ ($\Omega$) |
|------------------------------|--------------------------------------|----------------|---------|---------------------|
| 80                           | ~28                                  | 41.7           | 45.9    | 71.7                |
| 104                          | ~25                                  | 50.9           | 45.4    | 61.5                |
| 128                          | ~22                                  | 58.9           | 44.5    | 52.3                |
| 152                          | ~20                                  | 65.3           | 43.3    | 45.0                |
| 176                          | ~18                                  | 70.5           | 41.6    | 38.6                |
| 200                          | ~17                                  | 73.1           | 39.1    | 34.2                |

FIGURE 4. A simplified PA with matching input and output networks.

The significance of a matching network depends on its position in the PA, i.e., at the input, inter-stage, or output. To evaluate this, let us consider a PA whose gain is 20 dB. If the output matching network suffers from 1 dB more insertion loss, then the PAE of the PA is scaled by a factor of $\sim 0.794$ (i.e., a 20.6% degradation). By contrast, if the additional 1-dB loss is applied to the input matching network, then the PAE is reduced by merely 0.3%. With this understanding, we can perform reasonable trade-offs between the insertion loss and other factors such as bandwidth or compactness of the matching networks.

#### B. THE EFFECTS OF MATCHING NETWORK LOSS DEPENDING ON GAIN

Let us consider a PA whose gain stage has input and output matching networks, as presented in Fig. 4. The gain stage has a transducer power gain of $G_T$ (= $G_{\text{ma}} - IL_{Min} - IL_{Mout}$ in dB) with well-matched input/output ports, assuming that $G_{\text{ma}}$ is the maximum available gain of an unconditionally stable device and that the input and output matching networks (TMN<sub>in</sub> and TMN<sub>out</sub>, respectively) provide good enough impedance matching with insertion losses of $IL_{Min}$ and $IL_{Mout}$, respectively. The effects of TMN<sub>in</sub> and TMN<sub>out</sub> on the overall PA performance are quite different.

Let us evaluate their effect by assuming that the insertion loss of either TMN<sub>in</sub> or TMN<sub>out</sub> increases by 1 dB. Since the PAE needs to be compared at the same output power level for a fair comparison, we keep the output power constant. Thus, if $IL_{Min}$ is increased by 1 dB, then $P_{in}$ should be increased by 1 dB accordingly. Therefore, the ratio of the new PAE ($PAE_{pk(new)}$) to the original one, for a power-gain variation $\Delta G_{TdB}$ caused by the TMNs, can be calculated as

$$r_{PAE} = \frac{PAE_{pk(new)}}{PAE_{pk}} = \frac{P_{out} - 10^{-\Delta G_{TdB}/10} \times P_{in}}{P_{out} - P_{in}} = \frac{G_T - 10^{-\Delta G_{TdB}/10}}{G_T - 1}. \quad (2)$$

From (2), the effect of $IL_{Min}$ on the PAE is quite minor when $G_T$ is relatively large. If $G_T$ is reduced from $G_{TdB} = 20$ dB to 19 dB (i.e., $\Delta G_{TdB} = -1$ dB) due to the increase in $IL_{Min}$, the calculated $r_{PAE}$ is merely 0.997, while $r_{PAE} = 0.88$ for a PA with $G_{TdB} = 5$ dB and the same degradation in TMN<sub>in</sub> ($\Delta G_{TdB} = -1$ dB). It can be seen that the influence of TMN<sub>out</sub> on the PAE is more direct and stronger than that of TMN<sub>in</sub>. Thus, the influence of each matching network on the PAE of any PA can be evaluated from the gain of the PA. The impact of each block on the PAE of the PA is inversely proportional to the gain of each stage that provides the overall gain. Since the effect of the TMNs (except for the output stage) on the power efficiency is minor, we can perform a reasonable trade-off between the insertion loss and other factors such as bandwidth or compactness of the matching networks. With this understanding, the resistance matching issue in the inter-stage and the input stage presented in the previous sub-section can be alleviated.
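The numbers quoted for (2) can be reproduced with a few lines. The sketch below is only a restatement of (2) and of the output-loss factor mentioned in the previous sub-section; the function names are illustrative.

```python
def r_pae_input_loss(gain_db, delta_gt_db):
    """PAE ratio from (2): extra loss in the input network at constant P_out.

    gain_db is the original transducer gain; delta_gt_db is the gain change
    caused by the matching network (negative for additional loss).
    """
    g_t = 10 ** (gain_db / 10)
    return (g_t - 10 ** (-delta_gt_db / 10)) / (g_t - 1)

def r_pae_output_loss(extra_loss_db):
    """PAE scaling when the output network loses extra_loss_db more power,
    following the 1-dB -> ~0.794 statement in the text."""
    return 10 ** (-extra_loss_db / 10)

for gain_db in (20, 5):
    print(f"G_T = {gain_db:>2} dB, 1 dB more input loss: "
          f"r_PAE = {r_pae_input_loss(gain_db, -1.0):.3f}")
print(f"1 dB more output loss: r_PAE = {r_pae_output_loss(1.0):.3f}")
```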

#### C. OUTPUT STAGE DESIGN CONSIDERATIONS

It is natural to design the PA from the output stage back to the input stage, given the importance of the larger signal toward the output stage. There are trade-offs in choosing the active device size for the output stage. A large transistor is preferable for high output power. However, two issues need to be considered regarding its output and input impedance matching. The output impedance of a transistor can be modeled by a parasitic capacitor ($C_{op}$) in parallel with an output resistor ($R_{op}$), and this model applies to large-signal operation as well. When the output transformer (i.e., TF<sub>4</sub>) has an impedance transformation ratio of $T_{im}$ and its primary inductance perfectly resonates out $C_{op}$, then $R_{op}$ should be $R_{opTF} = R_L \cdot T_{im}$ ($R_L$ is the load impedance) to attain the maximum efficiency $\eta_{max}$. However, the device size can be further increased to enhance the output power in a trade-off with degradation of the power efficiency. When the device size is increased by a factor of $n$ ($n > 1$), the output resistor, $R_{op}$, decreases roughly by a factor of $n$. Then, the new efficiency $\eta$ can be related to the maximum efficiency $\eta_{\text{max}}$ by the ratio $r_e$ as

$$r_e = \frac{\eta}{\eta_{\text{max}}} = 1 - \left(\frac{n-1}{n+1}\right)^2 = \frac{4n}{(n+1)^2}. \quad (3)$$

The ratio of the new saturated output power $P_{sat}$ to $P_{sat0}$ at the maximum efficiency becomes

$$r_p = \frac{P_{sat}}{P_{sat0}} = n \left( 1 - \left( \frac{n-1}{n+1} \right)^2 \right) = \frac{4n^2}{(n+1)^2}. \quad (4)$$

**FIGURE 5.** Power gain and efficiency compression ratios ($r_e$ and $r_p$) versus the increase ratio of the active device size ($n$).

Figure 5 presents the efficiency ratio ($r_e$) and the power ratio ($r_p$) versus $n$, which shows that $r_p$ increases faster than $r_e$ decreases, particularly for small $n$. Thus, a small amount of efficiency degradation can be traded for a relatively larger increase in output power.
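To make the trade-off in (3) and (4) concrete, the short sketch below tabulates both ratios for a few size factors. The chosen values of $n$ are illustrative and the function names are not from the paper.

```python
def efficiency_ratio(n):
    """r_e from (3): efficiency relative to the matched-size device."""
    return 4 * n / (n + 1) ** 2

def power_ratio(n):
    """r_p from (4): saturated power relative to the matched-size device."""
    return n * efficiency_ratio(n)

for n in (1.0, 1.25, 1.5, 2.0):
    print(f"n = {n:4.2f}: r_e = {efficiency_ratio(n):.3f}, r_p = {power_ratio(n):.3f}")
```

For example, doubling the device size ($n = 2$) gives $r_e = 8/9$ and $r_p = 16/9$, so the efficiency penalty is small compared with the gain in output power.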

There is another aspect to be considered when choosing the output active device size which is related to its preceding transformer (i.e., TF<sub>3</sub>). A larger transistor size (M<sub>3</sub>) requires a smaller transformer (TF<sub>3</sub>) to resonate out its increased gate capacitance. However, the reduced magnetic coupling of the small size results in a high-loss transformer implementation. To investigate the effects of the reduced magnetic coupling, we simulated various transformers of different inner diameters  $(D_{in})$ . The realized structure of the transformers is shown in Fig. 6. Herein, the on-chip transformer is constructed from three metal layers. The ultra-thick metal layer (UTM) forms the primary coil, aiming to carry the large drain quiescent current. Meanwhile, the two metal layers below the UTM are combined for the secondary coil. The inner diameter of the transformer is denoted by  $D_{in}$  and the width of the winding is  $W = 6 \ \mu m$ . The length of the two ports is fixed to be  $25-\mu m$  to keep a certain distance between the windings and the surrounding ground. Each winding of the transformer has a center tap for VDD and gate biasing.

The extracted optimal load susceptance ( $B_{Lopt}$  in (1.2)) and maximum available gain ( $G_{ma}$ ) of the transformers in different sizes are presented in Fig. 7. We can observe that the transformer efficiency is degraded quickly as the transformer size decreases due to the reduced magnetic coupling. When we reduce the transformer diameter  $D_{in}$  from 32  $\mu$ m to 16  $\mu$ m,  $G_{ma}$  drops by about 20%, and the extracted  $B_{Lopt}$  increases from 14.8 mS to 43.6 mS. This means the output transistor size supported by the 32  $\mu$ m transformer is expected to be nearly three times smaller than that of the 16  $\mu$ m transformer.
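The susceptance values above can be translated into the equivalent capacitance that each transformer can resonate with. The sketch below assumes a purely capacitive load and a 77-GHz operating frequency for the extracted $B_{Lopt}$ values; both are assumptions made for illustration, and the helper name is hypothetical.

```python
import math

F0 = 77e9  # assumed operating frequency in Hz

def susceptance_to_capacitance(b_siemens, freq_hz=F0):
    """Equivalent capacitance C = B / (2*pi*f) of a capacitive susceptance."""
    return b_siemens / (2 * math.pi * freq_hz)

c_32um = susceptance_to_capacitance(14.8e-3)  # D_in = 32 um transformer
c_16um = susceptance_to_capacitance(43.6e-3)  # D_in = 16 um transformer
print(f"32 um: {c_32um * 1e15:.1f} fF, 16 um: {c_16um * 1e15:.1f} fF, "
      f"ratio = {c_16um / c_32um:.2f}")
```

The capacitance ratio of about 2.9 is the basis for the "nearly three times" statement about the supported transistor size.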



FIGURE 6. Implemented transformer structure in 65-nm CMOS.



**FIGURE 7.** The simulated optimal load susceptance ( $B_L$ ), and the maximum available gain (Gma) of transformers with different inner diameter size ( $D_{in}$ ).



FIGURE 8. Simplified layout of transistor using table structure.

In this analysis, it was assumed that the maximum generated output power and the parasitics of the transistor are linearly proportional to its size. However, in practice, the efficiency of a large transistor can be noticeably degraded due to the long routing line with bottom metal layers in the device layout. We designed various transistors at different sizes using the 'table structure' with eight cells to investigate this effect as shown in Fig. 8. The gate capacitance of the transistors was extracted to select the suitable preceding resonant transformer (i.e.,  $TF_3$ ). Load-pull simulations were performed on the output transistors with their selected transformer-based input matching networks, and the simulation results are shown in Table 1.

From Table 1, the required impedance transformation ratio, $T_{im}$, of the output transformer (TF<sub>4</sub>) is close to unity for optimal power efficiency. Thus, a 1:1 turns ratio is selected for TF<sub>4</sub>. The optimal size of M<sub>3</sub> for the output impedance matching is expected to be around $W = 128 \ \mu m$. Based on the analysis, the width of M<sub>3</sub> was slightly increased from the optimal size to $W = 168 \ \mu m$ to achieve higher output power. With the selected output transistor, $D_{in} = 18 \ \mu \text{m}$ was chosen for TF<sub>3</sub>, which could resonate with the large output transistor M<sub>3</sub> to achieve a good trade-off between the expected output power and efficiency. The output transformer (TF<sub>4</sub>) was designed to be as large as possible for the given transistor to improve the overall power efficiency. By using the impedance matching formulas for transformers in [9], the output transformer was designed to be 24 $\mu$m so that the susceptance of the single-ended terminal compensates for the parasitic capacitance of the RF pad at the output port. Through the proposed approach, the maximum possible size of the output transformer can be chosen for improved power efficiency. On the primary side of TF<sub>4</sub>, an additional capacitor $C_4 = 4$ fF is required to compensate for its primary coil inductance. A MOM capacitor with a tailored layout was used for the compact matching of the primary coil, and its capacitance was extracted using Calibre<sup>TM</sup>. $C_2$ and $C_3$ were also implemented in the same way.
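The selection of the 1:1 turns ratio can be illustrated from the $R_{op}$ column of Table 1. The sketch below assumes a 50-$\Omega$ load ($R_L$) and uses $T_{im} = R_{op}/R_L$, which follows from $R_{opTF} = R_L \cdot T_{im}$; the variable names are illustrative.

```python
R_LOAD = 50.0  # assumed load resistance in ohms

# (output transistor width in um, simulated R_op in ohms), copied from Table 1
TABLE_1 = [(80, 71.7), (104, 61.5), (128, 52.3),
           (152, 45.0), (176, 38.6), (200, 34.2)]

def transformation_ratio(r_op, r_load=R_LOAD):
    """Impedance transformation ratio T_im needed so that R_op = R_L * T_im."""
    return r_op / r_load

for width, r_op in TABLE_1:
    print(f"W = {width:>3} um: T_im = {transformation_ratio(r_op):.2f}")

# Width whose required T_im is closest to unity (i.e., a 1:1 transformer)
best_width, best_r_op = min(TABLE_1,
                            key=lambda row: abs(transformation_ratio(row[1]) - 1))
print(f"Closest to 1:1 -> W = {best_width} um (R_op = {best_r_op} ohm)")
```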

#### D. GAIN STAGES DESIGN

The active device sizes of the first ($M_1$) and the second ($M_2$) driving stages were determined considering the optimal efficiency. The device sizes were reduced compared with that of the output MOSFET, but each device must be large enough to drive its load (i.e., the next stage). In this 65-nm CMOS process, each amplifier stage had an estimated gain of around 7 to 8 dB after impedance matching, and a power gain compression of 3 to 4 dB was observed when the output power ($P_{out}$) became saturated at a large input power level. Thus, it is roughly estimated that the driving stage should provide an output power 3-4 dB below that of the output stage to achieve full drive. Assuming that the maximum output power is proportional to the device size, we can initially set the active device size of the driving stage to half of that of the output stage. Because the gate bias voltages for M<sub>1</sub> and M<sub>2</sub> were set to 0.6 V for improved efficiency, the device size was set slightly larger than the expected size.

To ensure that the two driving stages can drive the output stage to its maximum saturated power and achieve a good OP1dB level, the device sizes of $M_1$ and $M_2$ were iterated starting from the initial estimates. All other transformer-based matching networks were designed with the same procedure as for TF<sub>4</sub> at the output stage. The final device sizes for $M_1$ and $M_2$ were 60 and 88 $\mu$m, respectively. Notably, the DC current consumed by M<sub>1</sub> is marginal compared with that of $M_3$. Hence, we could choose a larger $M_1$ size than expected to provide a higher gain. The relatively large gate capacitances of M<sub>3</sub>, M<sub>2</sub>, and M<sub>1</sub> determine the sizes of TF<sub>3</sub>, TF<sub>2</sub>, and TF<sub>1</sub>, respectively, so that each gate capacitance resonates out the secondary inductance of the corresponding transformer. In this way, it was not necessary to add tuning capacitors at the gate of each transistor. However, on the primary side of TF<sub>2</sub> and TF<sub>3</sub>, additional capacitors $C_2 = 30$ fF and $C_3 = 45$ fF were added to the corresponding drains to ensure the matching. Specifically, in the case of TF<sub>1</sub> with a single-ended-to-differential configuration, the center tap of the primary winding is connected to the ground to reduce the parasitic capacitance. Because of this connection, an extra capacitor $C_1$ of 34 fF was needed to resonate with the primary inductance of TF<sub>1</sub> along with the parasitic capacitance from the input RF pad. The gate bias lines for TF<sub>1</sub>, TF<sub>2</sub>, and TF<sub>3</sub> were connected in series with 5-k$\Omega$ resistors to avoid a potential common-mode oscillation caused by the parasitic inductances of the biasing lines [10].

FIGURE 9. A photograph of the fabricated 77-GHz PA in 65-nm CMOS.

**FIGURE 10.** Measurement setup for S-parameters (a) and large-signal parameters (P<sub>sat</sub>, OP1dB, and PAE) (b).

#### E. DESIGN PROCEDURE

To summarize, the design sequence of the initial three-stage push-pull PA in this work is as follows:

- *Step 1*: Choose M<sub>3</sub> by considering the output power and efficiency with the corresponding TF<sub>3</sub>. Design TF<sub>3</sub> based on M<sub>3</sub> so that the gate capacitance compensates for the secondary inductance of TF<sub>3</sub>.
- *Step 2*: Design TF<sub>4</sub> based on the extracted capacitance from the output RF pad. Calculate *C*<sub>4</sub> based on TF<sub>4</sub> and M<sub>3</sub>.
- *Step 3*: Choose the size of M<sub>2</sub> to be around half of M<sub>3</sub>; choose the size of M<sub>1</sub> to be around 2/3 of M<sub>2</sub>.
- *Step 4*: Design TF<sub>2</sub> and TF<sub>1</sub> based on the gate capacitance of M<sub>2</sub> and M<sub>1</sub>, respectively.
- *Step 5*: Calculate *C*<sub>3</sub> based on TF<sub>3</sub> and M<sub>2</sub>; calculate *C*<sub>2</sub> based on TF<sub>2</sub> and M<sub>1</sub>; calculate *C*<sub>1</sub> based on TF<sub>1</sub>.

FIGURE 11. Simulated and measured S-parameters of the 77-GHz PA.

**FIGURE 12.** Measured saturated output power (P<sub>sat</sub>), output 1-dB gain compression point, and PAE versus frequency.
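The initial sizing rule in Step 3 can be written as a short sketch. It only gives starting points; as described in the gain-stage design, the final widths (88 and 60 $\mu$m) were reached by iteration and are somewhat larger. The function name is illustrative.

```python
def initial_driver_widths(w_m3_um):
    """Starting widths for M2 and M1 from the Step 3 rule of thumb."""
    w_m2 = w_m3_um / 2      # M2 is about half of M3
    w_m1 = w_m2 * 2 / 3     # M1 is about two thirds of M2
    return w_m2, w_m1

w_m2_init, w_m1_init = initial_driver_widths(168.0)  # M3 width used in this work
print(f"Initial M2 = {w_m2_init:.0f} um, initial M1 = {w_m1_init:.0f} um")
print("Final (after iteration, from the text): M2 = 88 um, M1 = 60 um")
```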

To demonstrate the validity of the design approach, a W-band push-pull PA was fabricated in 65-nm CMOS process. The photograph of the fabricated chip is presented in Fig. 9. The core size of the designed PA is only 0.05 mm<sup>2</sup> while the total area including RF pads is 0.435 mm<sup>2</sup>.

#### **III. MEASUREMENT RESULTS**

In the measurement, the PA consumed a DC-current of 95 mA from a 1.3-V supply without input signals. The measurement setup for S-parameters and large-signal performance is illustrated in Fig. 10. A vector network analyzer (VNA), Keysight N5224A (10 MHz to 43.5 GHz) combined with an extension module was used with an on-wafer probe station to measure the S-parameters of the PA. The on-wafer setup was calibrated using a calibration kit (CS-5). The measured S-parameters of the PA are presented in Fig. 11 in comparison with the simulation results. It achieved a peak power gain of 22.6 dB at 77-GHz and a 3-dB bandwidth of 9 GHz (72.5–81.5 GHz), which corresponds well with the simulation results. The measured reverse isolation (- $S_{12}$ ) is better than 45 dB.

In the large-signal measurement, a signal generator with a stand-alone frequency multiplier was used to generate

| Ref. | CMOS Tech. | Combination<br>way | Freq. (GHz)    | P <sub>sat</sub><br>(dBm) | Gain<br>(dB) | Peak PAE<br>(%) | OP1dB<br>(dBm) | Core Area<br>(mm <sup>2</sup> ) | DC-Diss.<br>(mW) | $P_{\text{sat}}/\text{Area}$<br>(mW/mm <sup>2</sup> ) |
|------|------------|--------------------|----------------|---------------------------|--------------|-----------------|----------------|---------------------------------|------------------|-------------------------------------------------------|
| This | 65 nm      | 1-way              | 72.5-81.5@77   | 16.4                      | 22.6         | 20.3            | 13.6           | 0.05                            | 124              | 873                                                   |
| [3]  | 65-nm      | 1-way              | 77             | 13.2                      | NA           | 17.6            | NA             | NA                              | 0.17             | -                                                     |
| [11] | 65-nm      | 2-ways             | 84.0-88.8@87   | 11.9                      | 18.6         | 9.0             | 9.6            | 0.23*                           | NA               | 67                                                    |
| [12] | 65-nm      | 1-way              | 68-78@75       | 17.3                      | 21.4         | 18.9            | 14.6           | 0.09*                           | 284.7            | 597                                                   |
| [13] | 65-nm      | 2-ways             | 74-82.5@77     | 15.8                      | 26.4         | 15.9            | 11.5           | 0.14*                           | 240              | 272                                                   |
| [14] | 40-nm      | 4-ways             | 73             | 22.6                      | 25.3         | 19.3            | 18.9           | 0.25*                           | NA               | 728                                                   |
| [15] | 65-nm      | 1-way              | 76.8-83.8@81.6 | 16.3                      | 28.3         | 14.1            | 13.6           | 0.121                           | 234              | 353                                                   |
| [16] | 65-nm      | 2-way              | 76-81          | 16.1                      | 30           | 12.8            | 12.2           | 0.34                            | 365              | 120                                                   |
| [17] | 65-nm      | 8-way              | 74.3-86.2@77   | 15.4                      | 24.4         | 10.4            | 12.1           | 0.42*                           | 336              | 83                                                    |
| [5]  | 65-nm      | 1-way              | 73             | 14.29                     | 26-31        | 22.37           | 12.03          | 0.033                           | 120              | 813                                                   |
| [19] | 40-nm      | 4-way              | 72             | 21                        | NA           | 13.6            | 19.2           | 0.19                            | NA               | 663                                                   |
| [20] | 40-nm      | 4-way              | 70.3-85.5      | 20.9                      | 18.1         | 22.3            | 17.8           | 0.19                            | 375              | 648                                                   |
| [21] | 28-nm      | 2-way              | 78             | 15.7                      | 13.8         | 8.9             | NA             | NA                              | NA               | -                                                     |
| [18] | 22-nm SOI  | 1-way              | 76             | 17.8                      | 17.8         | 17.3            | 13.3           | 0.02                            | 260              | 3049                                                  |
| [4]  | 28-nm SOI  | 1-way              | 77             | 13.5                      | 26.5         | 14.5            | 10             | 0.14                            | 150              | 160                                                   |

TABLE 2. Summary of state-of-the-art mm-Wave CMOS PAs around 77 GHz.

\* Estimated from the chip photo



FIGURE 13. Measured output power (Pout), gain, and PAE versus input power.

W-band signals and a tunable attenuator was used to sweep the input power level. The insertion losses of the probe tips and the WR-10 waveguides were measured and calibrated out of the raw data. The measured output power, output 1-dB gain compression point (*OP1dB*), and power-added efficiency (*PAE*) as functions of frequency are presented in Fig. 12. The measured output power, gain, and PAE versus input power at 77 GHz and 79 GHz are shown in Fig. 13. The fabricated PA achieved a maximum $P_{sat}$ of 16.4 dBm with a peak *OP1dB* of 13.6 dBm and a peak PAE of 20.3%, recorded at 79 GHz. Over the band of interest (76-81 GHz), the measured saturated output power varies within 0.6 dB of its peak.
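For readers less familiar with the metric, PAE relates the RF power added by the amplifier to the DC power it consumes. The following is a minimal sketch of the standard definition, PAE = (Pout − Pin) / Pdc. The function names and the operating-point values in the example are hypothetical and are not taken from the measurements reported here.

```python
def dbm_to_mw(p_dbm: float) -> float:
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def pae_percent(pout_dbm: float, pin_dbm: float, pdc_mw: float) -> float:
    """Power-added efficiency in percent: (Pout - Pin) / Pdc * 100.

    pout_dbm and pin_dbm are RF powers in dBm; pdc_mw is the DC power in mW.
    """
    return 100.0 * (dbm_to_mw(pout_dbm) - dbm_to_mw(pin_dbm)) / pdc_mw


if __name__ == "__main__":
    # Hypothetical operating point, for illustration only (not measured data).
    print(f"PAE = {pae_percent(pout_dbm=16.0, pin_dbm=0.0, pdc_mw=200.0):.1f} %")
```

Because the input power is subtracted in linear units, PAE approaches the drain efficiency only when the gain at the operating point is high; near saturation, gain compression makes the difference noticeable.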

The performance of the proposed PA is summarized and compared with recently reported CMOS PAs at similar frequencies in Table 2. The implemented 77-GHz PA attained well-balanced small-signal and large-signal performance and, to the best of our knowledge, its power density is among the highest reported for a bulk CMOS PA in the W-band.
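The power-density comparison can be reproduced with a short calculation. The sketch below assumes that power density is the saturated output power in mW divided by the core area in mm²; this assumption is consistent with the tabulated values for [12] and [15]. The function names are illustrative.

```python
def dbm_to_mw(p_dbm: float) -> float:
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def power_density_mw_per_mm2(psat_dbm: float, area_mm2: float) -> float:
    """Saturated output power (mW) per unit core area (mm^2)."""
    return dbm_to_mw(psat_dbm) / area_mm2


if __name__ == "__main__":
    # (Psat in dBm, core area in mm^2), taken from the text and Table 2.
    entries = {
        "This work": (16.4, 0.05),
        "[12]": (17.3, 0.09),
        "[15]": (16.3, 0.121),
    }
    for name, (psat_dbm, area_mm2) in entries.items():
        density = power_density_mw_per_mm2(psat_dbm, area_mm2)
        print(f"{name}: {density:.0f} mW/mm^2")
```

Under this assumption, 16.4 dBm over 0.05 mm² corresponds to roughly 873 mW/mm².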

#### **IV. CONCLUSION**

This paper reports a three-stage push-pull power amplifier (PA) for 77-GHz automotive radar applications in 65-nm bulk CMOS technology. A design strategy with detailed guidelines for sizing the active devices and the transformers was presented to achieve a good trade-off between output power and efficiency. In measurement, the fabricated PA exhibits an output power of 16.4 dBm, a power gain of 22.6 dB, and a peak PAE of 20.3% while occupying only 0.05 mm<sup>2</sup> for the core block. The well-balanced performance of the implemented W-band PA demonstrates the feasibility of single-way CMOS PAs for automotive radar applications, which benefit from the low cost and high integration level of CMOS.

#### ACKNOWLEDGMENT

The chip fabrication and EDA tool were supported by the IC Design Education Center (IDEC).

#### REFERENCES

- [1] D. M. Grimes and T. O. Jones, "Automotive radar: A brief review," *Proc. IEEE*, vol. 62, no. 6, pp. 804–822, Jun. 1974.
- [2] S. M. Patole, M. Torlak, D. Wang, and M. Ali, "Automotive radars: A review of signal processing techniques," *IEEE Signal Process. Mag.*, vol. 34, no. 2, pp. 22–35, Mar. 2017.
- [3] H. Jia, L. Kuang, W. Zhu, Z. Wang, F. Ma, Z. Wang, and B. Chi, "A 77 GHz frequency doubling two-path phased-array FMCW transceiver for automotive radar," *IEEE J. Solid-State Circuits*, vol. 51, no. 10, pp. 2299–2311, Oct. 2016.
- [4] C. Nocera, G. Papotto, A. Cavarra, E. Ragonese, and G. Palmisano, "A 13.5-dBm 1-V power amplifier for W-band automotive radar applications in 28-nm FD-SOI CMOS technology," *IEEE Trans. Microw. Theory Techn.*, vol. 69, no. 3, pp. 1654–1660, Mar. 2021.
- [5] L. Chen, L. Zhang, Y. Wang, and Z. Yu, "A compact E-band power amplifier with gain-boosting and efficiency enhancement," *IEEE Trans. Microw. Theory Techn.*, vol. 68, no. 11, pp. 4620–4630, Nov. 2020.
- [6] W. Ye, K. Ma, K. S. Yeo, and Q. Zou, "A 65 nm CMOS power amplifier with peak PAE above 18.9% from 57 to 66 GHz using synthesized transformer-based matching network," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 62, no. 10, pp. 2533–2543, Oct. 2015.
- [7] Z. Wang, P.-Y. Chiang, P. Nazari, C.-C. Wang, Z. Chen, and P. Heydari, "A CMOS 210-GHz fundamental transceiver with OOK modulation," *IEEE J. Solid-State Circuits*, vol. 49, no. 3, pp. 564–580, Mar. 2014.
- [8] V.-S. Trinh and J.-D. Park, "A 25.1 dBm 25.9-dB gain 25.4% PAE X-band power amplifier utilizing voltage combining transformer in 65-nm CMOS," *IEEE Access*, vol. 9, pp. 6513–6521, 2021.

- [9] V.-S. Trinh and J.-D. Park, "Theory and design of impedance matching network utilizing a lossy on-chip transformer," *IEEE Access*, vol. 7, pp. 140980–140989, 2019.
- [10] V.-S. Trinh and J.-D. Park, "Common-mode stability test and design guidelines for a transformer-based push-pull power amplifier," *IEEE Access*, vol. 8, pp. 42243–42250, 2020.
- [11] H. Jia, B. Chi, L. Kuang, and Z. Wang, "A W-band power amplifier utilizing a miniaturized Marchand balun combiner," *IEEE Trans. Microw. Theory Techn.*, vol. 63, no. 2, pp. 719–725, Feb. 2015.
- [12] T. Xi, S. Huang, S. Guo, P. Gui, D. Huang, and S. Chakraborty, "High-efficiency E-band power amplifiers and transmitter using gate capacitance linearization in a 65-nm CMOS process," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 64, no. 3, pp. 234–238, Mar. 2017.
- [13] L. Chen, L. Zhang, and Y. Wang, "A 26.4-dB gain 15.82-dBm 77-GHz CMOS power amplifier with 15.9% PAE using transformer-based quadrature coupler network," *IEEE Microw. Wireless Compon. Lett.*, vol. 30, no. 1, pp. 78–81, Jan. 2020.
- [14] D. Zhao and P. Reynaert, "A 40-nm CMOS E-band 4-way power amplifier with neutralized bootstrapped cascode amplifier and optimum passive circuits," *IEEE Trans. Microw. Theory Techn.*, vol. 63, no. 12, pp. 4083–4089, Dec. 2015.
- [15] V.-S. Trinh and J.-D. Park, "A 16.3 dBm 14.1% PAE 28-dB gain W-band power amplifier with inductive feedback in 65-nm CMOS," *IEEE Microw. Wireless Compon. Lett.*, vol. 30, no. 2, pp. 193–196, Feb. 2020.
- [16] D. Pan, Z. Duan, L. Sun, S. Guo, L. Cheng, and P. Gui, "A 76-81 GHz CMOS PA with 16-dBm PSAT and 30-dB amplitude control for MIMO automotive radars," in *Proc. IEEE 45th Eur. Solid State Circuits Conf.* (ESSCIRC), Cracow, Poland, Sep. 2019, pp. 329–332.
- [17] Y.-H. Hsiao, Y.-C. Chang, C.-H. Tsai, T.-Y. Huang, S. Aloui, D.-J. Huang, Y.-H. Chen, P.-H. Tsai, J.-C. Kao, Y.-H. Lin, B.-Y. Chen, J.-H. Cheng, T.-W. Huang, H.-C. Lu, K.-Y. Lin, R.-B. Wu, S.-J. Chung, and H. Wang, "A 77-GHz 2T6R transceiver with injection-lock frequency sextupler using 65-nm CMOS for automotive radar system application," *IEEE Trans. Microw. Theory Techn.*, vol. 64, no. 10, pp. 3031–3048, Oct. 2016.
- [18] U. Celik and P. Reynaert, "An E-band compact power amplifier for future array-based backhaul networks in 22 nm FD-SOI," in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, Jun. 2019, pp. 187–190.
- [19] E. Kaymaksut, D. Zhao, and P. Reynaert, "Transformer-based Doherty power amplifiers for mm-wave applications in 40-nm CMOS," *IEEE Trans. Microw. Theory Techn.*, vol. 63, no. 4, pp. 1186–1192, Apr. 2015.
- [20] D. Zhao and P. Reynaert, "An E-band power amplifier with broadband parallel-series power combiner in 40-nm CMOS," *IEEE Trans. Microw. Theory Techn.*, vol. 63, no. 2, pp. 683–690, Feb. 2015.
- [21] N.-C. Kuo and A. M. Niknejad, "An E-band QPSK transmitter element in 28-nm CMOS with multistate power amplifier for digitally-modulated phased arrays," in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, Jun. 2018, pp. 184–187.



**VAN-SON TRINH** (Graduate Student Member, IEEE) received the Bachelor of Science degree from the Hanoi University of Science and Technology (HUST), Hanoi, Vietnam, in 2015. He is currently pursuing the Ph.D. degree in electronics and electrical engineering with Dongguk University, Seoul, South Korea.

His current research interests include various analog and RF integrated circuits.



**JUNG-DONG PARK** (Senior Member, IEEE) received the Bachelor of Science degree from Dongguk University, Seoul, South Korea, in 1998, the M.S. degree from the Gwangju Institute of Science and Technology (GIST), Gwangju, South Korea, in 2000, and the Ph.D. degree in electrical engineering and computer science (EECS) from the University of California, Berkeley, in 2012.

From 2000 to 2002, he worked at the Institute for Advanced Engineering (IAE), Yongin, South Korea, where he was involved with the design of 35 GHz radar/radiometer transceivers. From 2002 to 2007, he was a Senior Researcher at the Agency for Defense Development (ADD), Daejeon, South Korea, where he was responsible for the development of millimeter-wave (mmW) passive/active sensors and related mmW modules. From 2007 to 2012, he was with the Berkeley Wireless Research Center (BWRC), where he worked on silicon-based RF/millimeter-wave/terahertz circuits and systems. From 2012 to 2015, he worked at Qualcomm Inc., San Jose, CA, USA, where he designed various RF/analog integrated circuits. He is currently an Associate Professor at the Division of Electronics and Electrical Engineering, Dongguk University. His research interests include wireless communications, remote sensors, microwave electronics, analog, RF, mixed-signal, and millimeter-wave circuits.

Prof. Park was a recipient of the 2017 Most Frequently Cited Papers Award (for papers published from 2010 to 2016) as a Lead Author at the 2017 IEEE Symposium on VLSI Circuits, Kyoto.