

Received 22 November 2022; revised 29 December 2022; accepted 10 January 2023. Date of publication 12 January 2023; date of current version 2 March 2023. Digital Object Identifier 10.1109/OJCAS.2023.3236567

# An Inductorless Optical Receiver Front-End Employing a High Gain-BW Product Differential Transimpedance Amplifier in 16-nm FinFET Process

MILAD HAGHI KASHANI<sup>1</sup> (Member, IEEE), HOSSEIN SHAKIBA<sup>2</sup> (Senior Member, IEEE), AND ALI SHEIKHOLESLAMI<sup>1</sup> (Senior Member, IEEE)

<sup>1</sup> Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON M5S 1A1, Canada

<sup>2</sup>HiLink, Huawei Technologies Canada, Markham, ON L3R 5A4, Canada

This article was recommended by Associate Editor J. Han.

CORRESPONDING AUTHOR: M. H. KASHANI (e-mail: miladhk@ece.utoronto.ca)

This work was supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC) and in part by Huawei Technologies.

**ABSTRACT** In this paper, a fully-differential transimpedance amplifier (TIA) providing a high gain-BW product (GBP) is introduced. In the proposed architecture, a cascode cross-coupled structure is employed to double the effective transconductance of the cascode devices, improving the BW of the TIA. Moreover, a differential architecture is implemented using an RC high-pass filter along with a buffer stage, which requires smaller capacitance and resistance. Furthermore, a single-ended negative capacitance generation (NCG) circuit is employed at the input of the TIA to partially compensate for the input parasitic capacitances. A TIA including the proposed techniques, designed and laid out in a 16-nm FinFET process, demonstrates 57% and 79% better figure-of-merit than cascode and conventional TIAs, respectively, which were designed alongside the proposed TIA for a fair comparison. Post-layout simulations together with statistical analysis are employed to verify the effectiveness of the proposed architecture. In simulation, the optical receiver achieves a peak transimpedance gain of 58.5 dB$\Omega$, a BW of 14.8 GHz, an input-referred noise of 33.6 pA/$\sqrt{\mathrm{Hz}}$, and an eye-opening of 30 mV at a data rate of 56 Gbps PAM4 and a bit-error-rate (BER) of 1E-6. The whole circuit consumes 49 mW and occupies an active area of 0.0076 mm<sup>2</sup>.

**INDEX TERMS** Inductor-less transimpedance amplifier, cascode TIA, inverter-based TIA, PAM4, statistical eye-diagram.

## I. INTRODUCTION

RECENTLY, there has been an increasing demand for higher data rate transmission over short and long-reach communication channels, terabit/s switching systems, cloud computing, and larger data volumes [1]. In this regard, optical interconnects such as optical fibers have become more popular thanks to their lower high-frequency loss and lower cross-talk noise compared to electrical counterparts such as copper [2]. This lower level of signal dispersion reduces the design complexity of channel equalization in optical communication. Ethernet systems are typically implemented by multiplexing parallel channels, although there are still challenges in implementing 400 Gbps and higher [3], [4], [5]. The intention here is to lower the number of parallel channels, which in turn results in area and power savings. CMOS integrated circuit technology is considered one of the best choices due to its high level of integration and a reasonable compromise between cost, speed, and power consumption [6]. Today, monolithic implementations (CMOS silicon photonics) have attracted more attention thanks to the promise of co-integration of electronics with optics, ranging from co-packaging to eventually full silicon integration.

In the optical receiver chain, the transimpedance amplifier (TIA) is the first block and typically dictates the bandwidth, noise, and sensitivity of the whole receiver at high data rates [7], [8]. The TIA is driven by a photodiode (PD), which contributes a large parasitic capacitance. This capacitance, in conjunction with the input parasitic capacitance of the TIA, forms the dominant pole of the system. In this regard, a TIA with a smaller input resistance is desired to achieve a wider bandwidth [9]. Moreover, the input-referred noise of the TIA directly affects the receiver bit-error-rate (BER) [10]. To lower the noise, larger devices should be utilized at the cost of larger parasitics, power consumption, and area. Consequently, a low-power, low-noise TIA that operates at high speed and provides a large transimpedance gain is in high demand.

A common approach to improve the BW of the TIA is inductive peaking. Several inductive peaking techniques are reported, including series-inductive peaking (SIP) [3], [4], [10], shunt-inductive peaking (SHIP) [3], [4], [10], $\pi$-type inductive peaking (PIP) [6], and T-type inductive peaking (TIP) [7]. However, having inductors and transformers [10] in the design increases the silicon area and impedes implementation in low-cost digital processes due to the lack of thick metal layers and high quality factor passive components [11]. Moreover, the inductors increase the coupling to the substrate, which results in higher crosstalk [11]. In multi-channel parallel optical receivers, multiple TIAs are placed in close proximity, demanding low substrate coupling and compact TIAs to reduce the crosstalk [11].

In this work, we propose a new inductor-less differential-output TIA design which provides a sufficiently large gain-BW product without using T-coils or transformers. In the proposed TIA chain, the single-ended to differential block and the core TIA are combined into one block, and differential signaling is utilized at the first stage of the receiver chain. Compared to [10], this work implements an inductorless optical receiver front-end including a novel fully-differential TIA (the TIA and single-ended to differential conversion are combined into one block) together with a negative capacitance generation circuit to improve the gain-BW product. The paper is organized as follows: Section II provides the background. Sections III and IV describe the analysis of the proposed TIA and its circuit implementation, respectively. Simulation results are provided in Section V, and finally, Section VI concludes the paper.

## II. BACKGROUND

As mentioned earlier, a low input resistance TIA is desired to push the dominant pole of the TIA to higher frequencies. A common-gate TIA can provide a low input resistance at the cost of larger power consumption and input-referred noise [12], [13]. To alleviate these challenges, a regulated-cascode (RGC) [14] TIA is introduced to lower the input resistance by increasing the effective transconductance; however, it increases the power consumption [12], [13]. The shunt-feedback TIA with a common-source amplifier can be an alternative thanks to providing an inductive impedance at the input in addition to lowering the input resistance [12], [13]; however, it shows a trade-off between the transimpedance gain and power consumption. Another implementation of the shunt-feedback TIA is the inverter-based TIA, which achieves lower noise than the shunt-feedback TIA with a common-source amplifier due to the current-reuse technique [12], [13]. It is also suitable for low supply voltage applications. For these reasons, the CMOS inverter-based structure is used as the core TIA in this work.

The conventional inverter-based TIA [10] with a capacitive load and an electrical model of the PD is shown in Fig. 1(a). In this figure, C<sub>T</sub>, C<sub>g1</sub> (and C<sub>g2</sub>), C<sub>gd1</sub> (and C<sub>gd2</sub>), and $C_{d1}$ (and $C_{d2}$) represent the total parasitic capacitance at the input (including the parasitics of the PD, pad, and electrostatic discharge (ESD) protection), the gate parasitic capacitance, the gate-drain parasitic capacitance, and the drain parasitic capacitance of M<sub>1</sub> (and M<sub>2</sub>), respectively. There are some challenges associated with the design of inverter-based TIAs. First, C<sub>gd1</sub> and C<sub>gd2</sub> are in parallel with the feedback resistor (R<sub>F</sub>), weakening the feedback impedance at higher frequencies and appearing at the input as Miller capacitors [11]. Since the gain is typically large, the equivalent Miller capacitance is large, limiting the BW. One solution to overcome the Miller effect is to implement a cascode structure [2], shown in Fig. 1(b), at the cost of degraded linearity for the same supply voltage. Second, the transimpedance gain (which is $\sim R_F$) is limited in order to obtain a maximally flat frequency response without using peaking techniques. Third, the single-ended structure is vulnerable to substrate and power supply noise [7]. Therefore, differential signaling is required at the input stage of the TIA to lower the impact of common-mode noise. To address the first two challenges, we note that the transimpedance gain must satisfy the following inequality, derived in [15], to obtain a maximally flat frequency response without equalization:

$$R_F \le \frac{A \times f_A}{2\pi C_T B W_T^2} \tag{1}$$

where A, $f_A$, and $BW_T$ denote the forward voltage gain of the amplifier, the bandwidth of the amplifier, and the closed-loop bandwidth of the TIA, respectively. From (1), the upper bound on the low-frequency transimpedance gain ($\sim R_F$) falls with the square of the TIA's BW, which in turn degrades the noise performance of the TIA. To partially relax the $R_F$ limit, the forward voltage gain of the amplifier (A) is increased in the cascode TIA due to the larger output resistance. Also, $C_T$ is reduced thanks to the smaller Miller capacitance, resulting in a wider BW. Furthermore, $R_F$ is no longer in parallel with $C_{gd1}$ and $C_{gd2}$, and the gain from the gate of $M_1$ (and $M_2$) to the drain of $M_1$ (and $M_2$) is lower than that of the conventional inverter-based TIA. As a result, the feedback impedance degradation at higher frequencies is smaller.
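As a minimal numerical sketch of how (1) constrains the feedback resistor, the snippet below evaluates the bound for one operating point. The gain `A` and amplifier bandwidth `F_A` are illustrative assumptions rather than values reported in this paper; only `C_T` is taken from the design parameters listed in Table 1.

```python
import math


def max_feedback_resistance(gain, f_amp_hz, c_total_f, bw_tia_hz):
    """Upper bound on R_F from (1): R_F <= A * f_A / (2*pi*C_T*BW_T^2)."""
    return gain * f_amp_hz / (2.0 * math.pi * c_total_f * bw_tia_hz ** 2)


# Illustrative operating point (A and F_A are assumptions, not simulated values).
A = 10.0        # forward voltage gain of the amplifier (V/V), assumed
F_A = 20e9      # amplifier bandwidth (Hz), assumed
C_T = 250e-15   # total input capacitance (F), as listed in Table 1
BW_T = 11e9     # target closed-loop TIA bandwidth (Hz)

rf_max = max_feedback_resistance(A, F_A, C_T, BW_T)
rf_max_double_bw = max_feedback_resistance(A, F_A, C_T, 2.0 * BW_T)

print(f"R_F bound at {BW_T / 1e9:.0f} GHz: {rf_max:.0f} ohm")
print(f"R_F bound at {2 * BW_T / 1e9:.0f} GHz: {rf_max_double_bw:.0f} ohm")
```

Doubling the target bandwidth lowers the bound by a factor of four, which is the square-law dependence discussed above; raising `A` or lowering `C_T` relaxes the bound proportionally.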

FIGURE 1. (a) Inverter-based TIA [10] (b) Cascode inverter-based TIA [2].

To address the third challenge, that is, to create immunity against common-mode noise, the simplest method is to create a complementary signal using a single-ended inverter [16], [17]. However, this design is asymmetric, as the delay between the two paths may create a non-zero phase shift beyond the required 180 degrees. Moreover, this approach is sensitive to PVT variations. In [18], [19], two photodetectors together with two replica TIAs are implemented. However, the design requires a larger silicon area and power consumption, and a more complex design of the printed circuit board (PCB) [7]. An alternative approach is to implement a pseudo-differential architecture requiring either a replica TIA or a passive low-pass filter [20], [21], which increases the power consumption and silicon area. A fully differential RGC TIA is introduced in [22]; however, its noise is rather high because of the input current source. Another solution is to implement a single-ended to differential block including a pair of common-source and common-gate amplifiers [23] next to the TIA, at the cost of considerable power and area overhead. To overcome the aforementioned challenges, we propose a novel fully differential TIA, discussed next.

## III. PROPOSED TIA ARCHITECTURE

Fig. 2(a) shows a schematic of the proposed TIA, which includes two cascode TIAs coupled through a single capacitor $C_C$. The single-ended PD current creates a single-ended $V_{ip}$. To make a differential output, we need to create a complementary $V_{in}$ (= $-V_{ip}$). A brute-force approach would require a gain stage of -1 between V<sub>ip</sub> and V<sub>in</sub>. To do this without a dedicated stage, M<sub>1</sub> can act as a common-source amplifier with the cascode device $M_3$ as its load to produce a voltage gain of -1 at $V_1$ without power and area overhead. We then couple the signal at node $V_1$ to the second TIA through the coupling capacitor $C_C$. However, there are three concerns that need to be addressed. 1) R<sub>F2</sub> and C<sub>C</sub> form a high-pass filter from V<sub>1</sub> to V<sub>in</sub> whose cut-off frequency must be low enough to provide a differential output within the MHz frequency range. This would require large values for R<sub>F2</sub> and C<sub>C</sub>, which occupy a large silicon area and contribute a large parasitic capacitance to the substrate, limiting the BW of the TIA. 2) The architecture is asymmetric since the top PMOS circuitry does not contribute to making V<sub>in</sub>. Also, R<sub>F2</sub> loads node V<sub>1</sub>, which causes asymmetry between nodes V<sub>1</sub>, V<sub>2</sub>, and V<sub>5</sub> (drain of M<sub>5</sub>). 3) We must maintain $|V_{op}/V_{on}|$ close to 1 in the face of PVT variations and mismatch. We address the first two concerns here and the last one in Section V.

The complete schematic of the proposed differential TIA is shown in Fig. 2(b). In this figure, M<sub>1-2</sub> (and M<sub>5-6</sub>) and $M_{3-4}$ (and $M_{7-8}$) are the main transconductors and the cascode devices, respectively. The single-ended input current (I<sub>pd</sub>) from the photodetector flows into the NMOS and PMOS cascode stages ($M_{1-2}$, $M_{3-4}$, and $R_{F1}$), resulting in a negative-excursion output voltage (V<sub>on</sub>). Then, the signals at the drains of M<sub>1</sub> and M<sub>2</sub> are AC coupled to the right half circuit through a buffer stage. With the buffer in place, the two drains experience the same load, and R<sub>F2</sub> no longer loads node V<sub>1</sub>. This means that the signals at nodes V<sub>1</sub> and V<sub>5</sub> are also symmetric. Note that the parasitic capacitance of $C_{C1}$ (200 fF in our design) does not have a noticeable impact on the output within the BW due to the low resistance at node $V_1$. This addresses the second concern raised with respect to the simplified design. Note that the asymmetry between node V<sub>1</sub> and the drains of M<sub>5</sub> and $M_6$ in the circuit of Fig. 2(a) can also be improved by inserting dummy loads. Insertion of a buffer between the two half circuits in Fig. 2(a) enables the use of a considerably smaller coupling capacitor, C<sub>C</sub>, and a smaller R<sub>F2</sub>, which reduces the area and eliminates a potentially large parasitic capacitance to the substrate. As an example, by employing the proposed technique shown in Fig. 2(b), the value of $C_{C1}$ and $C_{C2}$ can be decreased from 20 pF to 200 fF (a reduction by a factor of 100) to achieve the same cut-off frequency with the same value of R<sub>F2</sub>. This addresses the first concern raised with respect to the simplified design.

Finally, the output of the buffer (i.e., $V_{in}$), which is designed to have the same magnitude as and opposite phase to $V_{ip}$, together with $R_{F2}$ develops a positive-excursion output voltage ($V_{op}$). Furthermore, by coupling the source of a cascode device (such as $M_3$) to the gate of the cascode device on the opposite side (such as $M_7$) through $R_{Bi}$ (such as $R_{B2}$) and $C_{Bi}$ (such as $C_{B2}$), where $R_{Bi}$ and $C_{Bi}$ refer to the i<sup>th</sup> bias resistor and capacitor for i = 1 to 4, we effectively double the transconductance of the cascode device. As an illustration, when the voltage at the source of $M_3$ is equal to $V_1$, the voltage at the source of $M_7$ is equal to $-V_1$, which is connected to the gate of M<sub>3</sub>. This means that the voltage across the source-gate junction of M<sub>3</sub> is $2V_1$, effectively doubling its transconductance. Two supply voltages of 1.2 V (HVDD) and 0.9 V (LVDD) are used in the proposed design.

FIGURE 2. (a) A simplified and (b) a complete circuit schematic of the proposed differential TIA.

Some design considerations for the proposed design are discussed next. First, there is an additional phase delay from nodes $V_1$ and $V_2$ to $V_{in}$; however, this extra phase delay is small thanks to the small output resistance of the source-follower architecture. Based on the simulation results, the phase delay is less than 7.3 degrees within the BW of interest (15 GHz). Second, node $V_{in}$ shows a lower BW compared to $V_{ip}$ because it experiences larger parasitics. However, nodes $V_{op}$ and $V_{on}$ show the same BW, since the BW of both signal paths is limited by the pole at the output node, which results from the cascode output resistance and the load capacitance. Third, the loading on nodes $V_1$ and $V_2$ is the same after the insertion of the buffer stage under the following assumptions: $M_1 = M_2 = M_5 = M_6$, $M_3 = M_4 = M_7 = M_8$, and $R_{BBi}$, $C_{Ci}$, $R_{Bi}$, and $C_{Bi}$ shown in Fig. 2(b) are equal for corresponding values of "i".

FIGURE 3. (a) Simplified half small-signal circuit of the proposed TIA. (b) $Z_{eq}$ calculation.

### A. DIFFERENTIAL OUTPUT REQUIREMENT

Before we analyze the proposed TIA, we derive the requirement for its proper differential operation. In the following small-signal analysis, we have neglected the bulk transconductance ($g_{mb}$) of the MOSFETs, and have assumed that $M_1$ and $M_2$ (and $M_3$ and $M_4$) are matched for simplicity. Moreover, $r_{dsi}$ and $g_{mi}$ are the drain-source resistance and transconductance of the i<sup>th</sup> transistor shown in Fig. 2(b), respectively. The low-frequency single-ended voltage gain of the left half of the amplifier without feedback can be written as:

$$A = \frac{V_{on}}{V_{ip}} = -g_{m1}g_{m3}r_{ds1}r_{ds3} \tag{2}$$

For proper differential operation, a negative unity gain is required from $V_{ip}$ to $V_{in}$. To facilitate this, $Z_{eq}$, the impedance seen at the drain of $M_1$ when $V_{ip}$ is grounded, must be calculated. This is shown in Fig. 3(a) and (b), from which $Z_{eq}$ can be obtained as:

$$Z_{eq} \approx r_{ds1} || \frac{r_{ds3} + \left[ R_{F1} || (2g_{m4}r_{ds4}r_{ds2}) \right]}{1 + 2g_{m3}r_{ds3}} \tag{3}$$

Consequently,

$$\frac{V_{in}}{V_{ip}} = \left(-g_{m1}Z_{eq}\right) \cdot \frac{(g_{m9} + g_{m10})\left(\frac{R_{F2}}{1+|A|}||r_{ds11}\right)}{1 + (g_{m9} + g_{m10})\left(\frac{R_{F2}}{1+|A|}||r_{ds11}\right)} \approx -1 \tag{4}$$

To satisfy (4), one design approach is to choose $-g_{m1}Z_{eq}$ to be -1 (i.e., $g_{m1}Z_{eq} = 1$) and the gain of the buffer stage to be $\sim 1$, which results in (5).

$$\frac{V_{in}}{V_{1,2}} = \frac{(g_{m9} + g_{m10}) \left(\frac{R_{F2}}{1+|A|} || r_{ds11}\right)}{1 + (g_{m9} + g_{m10}) \left(\frac{R_{F2}}{1+|A|} || r_{ds11}\right)} \approx 1 \tag{5}$$

Therefore, the transconductance of the cascode transistors can be found as:

$$g_{m3} \approx \frac{g_{m1}r_{ds1}([R_{F1}||(2g_{m4}r_{ds4}r_{ds2})] + r_{ds3})}{2r_{ds1}r_{ds3}} - \frac{[R_{F1}||(2g_{m4}r_{ds4}r_{ds2})] + r_{ds1} + r_{ds3}}{2r_{ds1}r_{ds3}} \tag{6}$$
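Equation (6) follows from setting $g_{m1}Z_{eq} = 1$ in (3) and solving for $g_{m3}$. The short sketch below checks this numerically. All device values are hypothetical placeholders chosen only for illustration, and the term $R_{F1}||(2g_{m4}r_{ds4}r_{ds2})$ is treated as a known quantity (`r_load`) rather than being tied to $g_{m3}$ through matching.

```python
def parallel(r_a, r_b):
    """Parallel combination of two resistances."""
    return r_a * r_b / (r_a + r_b)


def z_eq(gm3, rds1, rds3, r_load):
    """Eq. (3): impedance at the drain of M1 with V_ip grounded.

    r_load stands for R_F1 || (2 * gm4 * rds4 * rds2).
    """
    return parallel(rds1, (rds3 + r_load) / (1.0 + 2.0 * gm3 * rds3))


def gm3_for_unity_gain(gm1, rds1, rds3, r_load):
    """Eq. (6): cascode transconductance that makes gm1 * Z_eq equal to 1."""
    numerator = gm1 * rds1 * (r_load + rds3) - (r_load + rds1 + rds3)
    return numerator / (2.0 * rds1 * rds3)


# Hypothetical small-signal values (assumptions, not extracted from the design).
GM1 = 20e-3     # S
GM4 = 15e-3     # S, only used to form r_load
RDS1 = RDS2 = RDS3 = RDS4 = 500.0   # ohm
R_F1 = 1e3      # ohm, as listed in Table 1

r_load = parallel(R_F1, 2.0 * GM4 * RDS4 * RDS2)
gm3 = gm3_for_unity_gain(GM1, RDS1, RDS3, r_load)
gain_to_v1 = -GM1 * z_eq(gm3, RDS1, RDS3, r_load)

print(f"gm3 from (6): {gm3 * 1e3:.2f} mS")
print(f"V1/Vip = -gm1*Zeq: {gain_to_v1:.6f}")
```

With $g_{m3}$ taken from (6), the gain from $V_{ip}$ to $V_1$ evaluates to -1 for any consistent set of inputs, which is the condition required by (4).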

From (6), the size and gate bias voltage of the cascode transistors can be calculated for a desired bias current. Using Miller's theorem, the low-frequency input resistance of the proposed TIA can be found as:

$$R_{in} = \frac{R_{F1}}{1 + |A|} = \frac{R_{F1}}{1 + g_{m1}g_{m3}r_{ds1}r_{ds3}} \tag{7}$$

Based on the aforementioned assumptions, the transimpedance gain of each path is calculated in (8) and (9):

$$\frac{V_{on}}{I_{pd}} = \frac{V_{ip}}{I_{pd}} \cdot \frac{V_{on}}{V_{ip}} = R_{in} \cdot A = \frac{A \cdot R_{F1}}{1 + |A|} \approx -R_{F1} \tag{8}$$

$$\begin{aligned}
\frac{V_{op}}{I_{pd}} &\approx \frac{V_{ip}}{I_{pd}} \cdot \frac{V_{1}}{V_{ip}} \cdot \frac{V_{in}}{V_{1}} \cdot \frac{V_{op}}{V_{in}} \\
&\approx R_{in} \cdot \left(-g_{m1} Z_{eq}\right) \cdot \frac{(g_{m9} + g_{m10}) \left(\frac{R_{F2}}{1 + |A|} ||r_{ds11}\right)}{1 + (g_{m9} + g_{m10}) \left(\frac{R_{F2}}{1 + |A|} ||r_{ds11}\right)} \cdot A \approx R_{F1}
\end{aligned} \tag{9}$$

Note that the final approximation in (9) is based on satisfying the conditions given in (4). To realize (5), one may pick a large value of  $R_{F2}$  which would save power and lower the input-referred noise of the TIA without affecting its transimpedance gain (refer to (8) and (9)). However, such a large resistor would be bulky and have a large parasitic to the substrate. As discussed earlier, to relax the requirement on the large value of  $R_{F2}$ , we insert the buffer stage between  $R_{F2}$  and  $C_C$ . This buffer consumes a simulated 0.9 mW to achieve the above goal.

A concern regarding (4) is its sensitivity to the PVT variations and the mismatch between the devices. In this regard, satisfying (5) requires the product of  $(g_{m9}+g_{m10})$  and  $r_{ds11}$  to be much larger than 1 (considering that  $R_{F2}$  is large enough). This makes (5) less sensitive to the PVT and mismatch variations. The robustness of the differential operation over PVT variations and mismatch is verified by the simulation results provided in Section V.
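The desensitization argument for (5) can be illustrated with a small numerical sketch. The buffer gain has the form $x/(1+x)$, so once $x \gg 1$ a given relative change in $x$ produces a much smaller relative change in the gain. The transconductance, $|A|$, and $r_{ds11}$ values below are assumptions made only for this illustration; $R_{F2}$ is the value listed in Table 1.

```python
def parallel(r_a, r_b):
    """Parallel combination of two resistances."""
    return r_a * r_b / (r_a + r_b)


def buffer_gain(gm_sum, r_f2, a_mag, rds11):
    """Eq. (5): buffer gain x / (1 + x) with x = (gm9 + gm10) * (R_F2/(1+|A|) || rds11)."""
    x = gm_sum * parallel(r_f2 / (1.0 + a_mag), rds11)
    return x / (1.0 + x)


# Assumed values for illustration only.
GM_SUM_NOMINAL = 40e-3   # gm9 + gm10 (S), assumed
A_MAG = 10.0             # |A| of the second half circuit, assumed
RDS11 = 2e3              # ohm, assumed
R_F2 = 4e3               # ohm, as listed in Table 1

# Sweep gm9 + gm10 by +/-20% as a crude stand-in for PVT variation.
gains = {
    scale: buffer_gain(scale * GM_SUM_NOMINAL, R_F2, A_MAG, RDS11)
    for scale in (0.8, 1.0, 1.2)
}
relative_spread = (gains[1.2] - gains[0.8]) / gains[1.0]

for scale, gain in gains.items():
    print(f"gm scale {scale:.1f}: buffer gain = {gain:.4f}")
print(f"relative gain spread for a 40% gm spread: {relative_spread * 100:.1f}%")
```

Under these assumptions, a 40% total spread in transconductance maps to a gain spread of only a few percent, and the spread shrinks further as the product in (5) grows.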

TABLE 1. Design parameters used in the analysis.

| Parameter            | Value  | Parameter       | Value  | Parameter                        | Value |
|----------------------|--------|-----------------|--------|----------------------------------|-------|
| C<sub>L</sub>        | 40 fF  | C<sub>T</sub>   | 250 fF | C<sub>gd1</sub>, C<sub>gd2</sub> | 11 fF |
| C<sub>D</sub>        | 11 fF  | R<sub>F</sub>   | 600 Ω  | C<sub>g1</sub>, C<sub>g2</sub>   | 90 fF |
| HVDD                 | 1.2 V  | VDN             | 0.0 V  | VDD                              | 0.2 V |
| LVDD                 | 0.9 V  | VDIN            | 0.9 V  | v Dr                             | 0.5 V |
| M<sub>1,2,3,4</sub>  | 9.8 µm | R<sub>F1</sub>  | 1 kΩ   | R<sub>F2</sub>                   | 4 kΩ  |

FIGURE 4. Simulated forward voltage gain comparison.

To fairly compare the BW amongst the discussed TIA designs, the input resistance and capacitance are calculated next. The input resistance of the cascode and conventional inverter-based TIAs (Fig. 1) with the same size counterpart devices can be written as:

Cascode:

$$R_{in} = \frac{R_{F1}}{1 + g_{m1}g_{m3}r_{ds1}r_{ds3}} \tag{10}$$

Conventional:

$$R_{in} = \frac{R_{F1}}{1 + g_{m1}r_{ds1}} \tag{11}$$

The total input parasitic capacitance of each of the three TIA designs can be written as:

Proposed:

$$C_T \approx C_{PD} + C_{gs1} + C_{gs2} + \left(1 + \frac{g_{m1}}{2g_{m3}}\right) (C_{gd1} + C_{gd2}) \tag{12}$$

Cascode:

$$C_T \approx C_{PD} + C_{gs1} + C_{gs2} + \left(1 + \frac{g_{m1}}{g_{m3}}\right) (C_{gd1} + C_{gd2}) \tag{13}$$

Conventional:

$$C_T \approx C_{PD} + C_{gs1} + C_{gs2} + (1 + g_{m1}r_{ds1}) (C_{gd1} + C_{gd2}) \tag{14}$$

From (7) and (10) to (14), the proposed TIA has the smallest product of input resistance and input capacitance, and therefore achieves the largest BW of the three designs given the same size counterpart devices.
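The comparison drawn from (7) and (10) to (14) can be sketched numerically with a first-order input-pole estimate, $1/(2\pi R_{in} C_T)$. This is only a rough proxy for the closed-loop BW. The transconductances, output resistances, and the lumped $C_{PD} + C_{gs1} + C_{gs2}$ value are assumptions for illustration, and the gate-drain capacitance is taken as 11 fF per device on the basis of Table 1.

```python
import math


def input_resistance(r_f, open_loop_gain_mag):
    """Eqs. (7), (10), (11): R_in = R_F / (1 + |A|)."""
    return r_f / (1.0 + open_loop_gain_mag)


def total_input_capacitance(c_fixed, c_gd_sum, miller_factor):
    """Eqs. (12)-(14): C_T = C_fixed + (1 + miller_factor) * (C_gd1 + C_gd2)."""
    return c_fixed + (1.0 + miller_factor) * c_gd_sum


# Assumed small-signal values, identical devices in all three designs.
GM1 = GM3 = 20e-3        # S, assumed
RDS1 = RDS3 = 500.0      # ohm, assumed
R_F = 1e3                # ohm, same feedback resistor in every design
C_FIXED = 190e-15        # C_PD + C_gs1 + C_gs2 (F), assumed lumped value
C_GD_SUM = 22e-15        # C_gd1 + C_gd2 (F), 11 fF each assumed from Table 1

cascode_gain = GM1 * GM3 * RDS1 * RDS3   # |A| for proposed and cascode
conventional_gain = GM1 * RDS1           # |A| for conventional

# name -> (open-loop gain magnitude, Miller factor multiplying C_gd)
designs = {
    "Proposed": (cascode_gain, GM1 / (2.0 * GM3)),
    "Cascode": (cascode_gain, GM1 / GM3),
    "Conventional": (conventional_gain, GM1 * RDS1),
}

input_pole_hz = {}
for name, (gain_mag, miller_factor) in designs.items():
    r_in = input_resistance(R_F, gain_mag)
    c_t = total_input_capacitance(C_FIXED, C_GD_SUM, miller_factor)
    input_pole_hz[name] = 1.0 / (2.0 * math.pi * r_in * c_t)
    print(f"{name:12s} R_in = {r_in:6.1f} ohm, C_T = {c_t * 1e15:6.1f} fF, "
          f"input pole = {input_pole_hz[name] / 1e9:6.1f} GHz")
```

For these assumed values the ordering of the input poles matches the conclusion in the text: the proposed design is highest, followed by the cascode and then the conventional TIA.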

TABLE 2. Performance comparison amongst different TIA designs.

|                               | Proposed | Cascode | Conventional |
|-------------------------------|----------|---------|--------------|
| Gain (dB$\Omega$)             | ~66      | ~56     | ~56          |
| BW (GHz)                      | ~11      | ~10.2   | ~10.2        |
| Power (mW)                    | 11.9     | 5.5     | 6.25         |
| FOM = GBP/Power (GHz·Ω/mW)    | 1844.4   | 1170.2  | 1029.7       |
| PSRR (dB)                     | <-25     | <-5     | <-5          |

FIGURE 5. Simulated transimpedance gain comparison.

In the following, we have simulated the three TIA designs using the parameters provided in Table 1. This table lists the extracted design parameters of an inverter-based TIA (W = 9.8 $\mu$m and L = 16 nm), used in the three TIA designs, implemented in a 16-nm FinFET process. The simulated forward voltage gains of the three designs are plotted in Fig. 4 and verify that the proposed design provides the largest open-loop gain-BW product compared to the cascode and conventional TIAs. Considering (1), the proposed TIA achieves a larger numerator and a smaller denominator. This implies that the proposed design allows for a larger feedback resistor, which in turn increases the transimpedance gain and reduces the input-referred noise current while maintaining a wide BW. In addition, the simulated closed-loop transimpedance gains of the three designs are plotted and compared in Fig. 5. As can be observed, the proposed TIA achieves $\sim 10$ dB larger gain with $\sim 8\%$ larger BW (11 GHz compared to 10.2 GHz) than the cascode and conventional TIAs. For a fair comparison, a figure-of-merit (FOM) [24] that captures the trade-off between gain, BW, and power is used and is calculated in Table 2. From this table, the proposed design achieves 57% and 79% better FOM compared to the cascode and conventional TIAs, respectively. In other words, given the same GBP for the three designs, the proposed design would use 37% and 45% less power than the cascode and conventional TIAs. The simulated power supply rejection ratios (PSRR) of the three TIAs are also compared in Table 2, highlighting the improvement in PSRR for a differential TIA architecture. This means that supply noise and other common-mode noise in the supply domain are attenuated by about 20 dB more in the proposed design than in the other designs. The noise performance comparison is discussed next.
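The FOM entries in Table 2 can be reproduced from the tabulated gain, BW, and power. The sketch below assumes that the gain in dB$\Omega$ is converted to ohms as $10^{G/20}$ before forming the gain-BW product, which is the interpretation that matches the tabulated numbers.

```python
# (gain in dBOhm, BW in GHz, power in mW), taken from Table 2.
TABLE_2 = {
    "Proposed": (66.0, 11.0, 11.9),
    "Cascode": (56.0, 10.2, 5.5),
    "Conventional": (56.0, 10.2, 6.25),
}


def figure_of_merit(gain_dbohm, bw_ghz, power_mw):
    """FOM = GBP / power in GHz*Ohm/mW, with gain converted from dBOhm to Ohm."""
    gain_ohm = 10.0 ** (gain_dbohm / 20.0)
    return gain_ohm * bw_ghz / power_mw


fom = {name: figure_of_merit(*row) for name, row in TABLE_2.items()}

for name, value in fom.items():
    print(f"{name:12s} FOM = {value:7.1f} GHz*Ohm/mW")

for reference in ("Cascode", "Conventional"):
    improvement = (fom["Proposed"] / fom[reference] - 1.0) * 100.0
    print(f"Proposed vs {reference}: {improvement:.1f}% higher FOM")
```

The computed values agree with Table 2 to within rounding, and the ratios reproduce the 57% and 79% improvements quoted above.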

### B. NOISE ANALYSIS

The noise of the inverter-based TIA is mainly dominated by the noise contribution of the transistor channel and the feedback resistor [10]. The noise of the feedback resistor directly contributes to the input noise [10]. For the noise calculation, the short-circuit output current noise must be divided by the noise transfer function of the TIA. The equivalent input noise current spectral density of the proposed TIA can be written as;

$$\begin{aligned}
\overline{I_{n,in}^{2}}(f) &\approx \overline{I_{n,R_{F1}}^{2}} + \frac{1 + (2\pi f R_{F1} C_{T})^{2}}{R_{F1}^{2}} \times \left[ \frac{\overline{I_{n,D_{1}}^{2}} + \overline{I_{n,D_{2}}^{2}}}{(g_{m1} + g_{m2})^{2}} + \frac{\overline{I_{n,D_{9}}^{2}} + \overline{I_{n,D_{10}}^{2}} + \overline{I_{n,D_{11}}^{2}} + \overline{I_{n,R_{F2}}^{2}} + \frac{\overline{I_{n,D_{5}}^{2}} + \overline{I_{n,D_{6}}^{2}}}{((g_{m5} + g_{m6})R_{F2})^{2}}}{(g_{m9} + g_{m10})^{2}} \right] \\
&\approx \frac{4kT}{R_{F1}} + \frac{4kT\Gamma}{2g_{m1}R_{F1}^{2}} + \left(\frac{4kT\Gamma}{2g_{m1}} + \frac{4kT\Gamma}{2g_{m9}} + \frac{4kT\Gamma g_{m11}}{4g_{m9}^{2}} + \frac{4kT}{4g_{m9}^{2}R_{F2}}\right) \times (2\pi f C_{T})^{2}
\end{aligned} \tag{15}$$

where k, T, and $\Gamma$ represent Boltzmann's constant, the temperature, and the excess noise coefficient, respectively. Then, the input-referred noise current spectral densities of the cascode and conventional TIAs are calculated as:

Cascode:

$$\overline{I_{n,in}^{2}}(f) = \frac{4kT}{R_{F}} + \frac{4kT\Gamma}{2g_{m1}R_{F}^{2}} + \frac{4kT\Gamma}{2g_{m1}} (2\pi f C_{T|Cascode})^{2} \tag{16}$$

Conventional:

$$\overline{I_{n,in}^{2}}(f) = \frac{4kT}{R_{F}} + \frac{4kT\Gamma}{2g_{m1}R_{F}^{2}} + \frac{4kT\Gamma}{2g_{m1}} \left(2\pi f C_{T|Conventional}\right)^{2} \tag{17}$$

By observing (15), (16), and (17) and considering the same size counterpart devices for the three designs, we conclude the following. First, the white noise levels of the proposed and cascode TIAs are roughly the same and slightly larger than that of the conventional TIA due to the smaller $g_{m1}$ (because of the cascode architecture and the smaller drain-source voltage given the same supply voltage). Note that the white noise contributions of the buffer stage, M<sub>5</sub> (and M<sub>6</sub>), and R<sub>F2</sub> are negligible compared to the other terms. Second, the slope of the high-frequency noise contribution curve is larger in the proposed TIA due to the noise contribution of R<sub>F2</sub> and the buffer stage, although the input capacitance ($C_T$) is smaller. Third, the white noise contribution of R<sub>F2</sub> is negligible compared to the other terms due to the large gain. Fourth, the high-frequency noise contribution of R<sub>F2</sub> can be reduced by increasing $R_{F2}$ while keeping the transimpedance gain (refer to (8) and (9)) intact. The simulated input-referred noise currents of the three TIAs with the same feedback resistor (600 $\Omega$) and transconductors (60 fingers) are compared in Fig. 6, confirming the aforementioned statements. This figure indicates that the input-referred noise currents of the proposed, cascode, and conventional TIAs within their noise bandwidths are 23.8, 20.1, and 15.3 pA/$\sqrt{\mathrm{Hz}}$, respectively.
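To make the structure of (16) and (17) concrete, the following sketch evaluates the white floor and the frequency-dependent ($f^2$) term of the input-referred noise density. The transconductance, excess noise coefficient, and temperature are assumptions for illustration, so the printed numbers are not expected to match the simulated values of Fig. 6; the feedback resistor and input capacitance use the values quoted in the text and Table 1.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant (J/K)


def input_noise_psd(freq_hz, r_f, gm1, c_t, gamma=1.0, temp_k=300.0):
    """Eqs. (16)/(17): input-referred noise current PSD in A^2/Hz."""
    four_kt = 4.0 * K_B * temp_k
    white = four_kt / r_f + four_kt * gamma / (2.0 * gm1 * r_f ** 2)
    shaped = four_kt * gamma / (2.0 * gm1) * (2.0 * math.pi * freq_hz * c_t) ** 2
    return white + shaped


# R_F from the text (600 ohm) and C_T from Table 1; gm1 and gamma are assumed.
R_F = 600.0
C_T = 250e-15
GM1 = 20e-3   # S, assumed

floor_psd = input_noise_psd(0.0, R_F, GM1, C_T)
for freq in (1e9, 5e9, 10e9):
    density = math.sqrt(input_noise_psd(freq, R_F, GM1, C_T))
    print(f"f = {freq / 1e9:4.0f} GHz: {density * 1e12:5.2f} pA/sqrt(Hz)")
print(f"white floor: {math.sqrt(floor_psd) * 1e12:5.2f} pA/sqrt(Hz)")
```

The white floor is set mainly by $4kT/R_F$, while the $f^2$ term scales with $C_T^2/g_{m1}$, which is why a smaller input capacitance or a larger feedback resistor lowers the integrated noise.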

### C. STABILITY ANALYSIS

FIGURE 6. Simulated input-referred noise current comparison.

To analyze the stability of the proposed TIA, we use a commonly used criterion known as rate of closure (ROC) [25]. In this approach, we consider the rate at which the open-loop gain $20\log|A(j\omega)|$ and the inverse of the feedback factor ($-20\log|\beta(j\omega)|$) curves intersect. A phase margin better than 45 degrees is guaranteed if, at the intersection of the two curves (i.e., when $|A\beta(j\omega)| = 1$), the difference of slopes does not fall below -20 dB/decade [25]. In the case of a constant feedback factor, this implies that the slope of $20\log|A(j\omega)|$ at the point of intersection must be larger than -20 dB/decade to ensure more than 45 degrees of phase margin. Considering that the drain parasitic capacitances of the transistors are small, this approach simplifies the stability test of the proposed and cascode TIAs, because their feedback factor is frequency independent ($\beta = 1/R_F$), in contrast to the conventional TIA ($\beta_{Conv} = \frac{1 + sR_F(C_{gd1} + C_{gd2})}{R_F}$), where C<sub>gd1</sub> and C<sub>gd2</sub> in parallel with R<sub>F</sub> form the feedback network. In the conventional TIA, $C_{gd1}$, $C_{gd2}$, and $R_F$ create a pole in $20\log|1/\beta(j\omega)|$, which pushes the intersection point to a higher frequency and in turn reduces the phase margin, although this effect is usually negligible because this pole lies at a higher frequency than the first pole of $20\log|A(j\omega)|$. Fig. 7(a) shows plots of $20\log|A(j\omega)|$ and $20\log|1/\beta(j\omega)|$ for the three TIA designs, demonstrating a $\sim -20$ dB/dec rate of closure for all of them. Fig. 7(b) shows the open-loop phase of the three TIA designs, indicating a phase margin of 62 degrees for the proposed and cascode TIAs and a phase margin of 77 degrees for the conventional design with the same feedback resistor (600 $\Omega$), transconductors (60 fingers), and power supply for all designs (the same transimpedance gain and power consumption for all designs).

The conventional design achieves a better phase margin than the other two designs thanks to its larger loop gain and phase. The reason is that, with the same power supply, the conventional TIA has a higher transconductance than the cascode and proposed TIAs. Although the proposed design shows a lower phase margin than the conventional design, its key advantage is that it allows for a larger feedback resistor, which increases the gain without sacrificing power and lowers the noise while still providing a wide BW (better FOM).
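The link between the rate of closure and the phase margin can be illustrated with a generic two-pole loop gain and a constant feedback factor. This sketch is not a model of the simulated TIAs: the DC loop gain and pole frequencies are arbitrary assumptions. It only shows that the phase margin exceeds 45 degrees when the second pole lies above the unity-loop-gain frequency (closure at roughly -20 dB/decade) and drops below 45 degrees when the second pole lies well below it.

```python
import math


def loop_gain_magnitude(omega, dc_loop_gain, poles):
    """|T(j*omega)| for a loop gain with real poles and constant feedback factor."""
    magnitude = dc_loop_gain
    for pole in poles:
        magnitude /= math.sqrt(1.0 + (omega / pole) ** 2)
    return magnitude


def phase_margin_deg(dc_loop_gain, poles):
    """Find |T| = 1 by bisection in log-frequency and return (PM in degrees, crossover)."""
    low, high = min(poles) * 1e-3, max(poles) * 1e6
    for _ in range(200):
        mid = math.sqrt(low * high)
        if loop_gain_magnitude(mid, dc_loop_gain, poles) > 1.0:
            low = mid
        else:
            high = mid
    crossover = math.sqrt(low * high)
    phase = -sum(math.degrees(math.atan(crossover / pole)) for pole in poles)
    return 180.0 + phase, crossover


# Arbitrary illustrative numbers (assumptions).
DC_LOOP_GAIN = 20.0
FIRST_POLE = 2.0 * math.pi * 1e9

pm_far, wc_far = phase_margin_deg(DC_LOOP_GAIN, [FIRST_POLE, 2.0 * math.pi * 40e9])
pm_near, wc_near = phase_margin_deg(DC_LOOP_GAIN, [FIRST_POLE, 2.0 * math.pi * 5e9])

print(f"second pole at 40 GHz: PM = {pm_far:.1f} deg")
print(f"second pole at  5 GHz: PM = {pm_near:.1f} deg")
```

With a frequency-independent $\beta$, as in the proposed and cascode TIAs, the check reduces to inspecting the slope of the open-loop gain at the crossover, which is the simplification used in the text.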



FIGURE 7. (a) ROC stability criterion comparison (b) Loop phase comparison amongst the three TIA designs.

## IV. IMPLEMENTATION OF NEGATIVE CAPACITANCE GENERATION CIRCUIT

As mentioned earlier, TIAs typically have a dominant pole at the input due to the large parasitics. Inductive peaking methods can lower the amount of parasitics at the cost of a large area and coupling to the substrate. An alternative approach is to employ an active inductor, however, this would increase the power consumption and the amount of inductance is limited. A negative capacitance generation circuit that neutralizes the parasitic capacitance can be an alternative solution. Compared to the peaking techniques using inductors, the negative capacitance generation technique using active circuits has been shown to be more flexible and cost-effective [26], [27], [28], [29], [30].

A popular implementation of the NCG circuit, which creates a negative capacitance between gates and drains of a cross-coupled differential pair, cannot be employed in the single-ended-input TIA. An active-based single-ended negative capacitance circuit employed at the output of the gain stage is proposed in [26] based on a differential amplifier, however half of the current is wasted due to the differential architecture. In addition, the input impedance is low and cannot be implemented at the input of the TIA, which would load the output of the photodiode and reduce the transimpedance gain of the TIA. Other negative capacitance circuits such as those reported in [27] and [29] are



FIGURE 8. (a) Implemented NCG mechanism (b) Simplified input impedance (c) NCG circuit schematic.

also differential. A negative impedance compensation technique is introduced in [29] to adjust the output impedance of the TIA and maximize the GBP; however, the design is differential. A single-ended negative impedance generator is introduced in [30] and placed at the output of the shunt-feedback TIA to reduce the loading effect of the feedback resistor and the amplifier output impedance, at the cost of larger power, silicon area, and noise level. Moreover, that design does not give the designer enough freedom to adjust the amount of negative capacitance.

To overcome the aforementioned challenges, we have implemented the single-ended NCG circuit shown in Fig. 8(a) and connected it to the input of the TIA. In this case, the NCG circuit directly compensates for the input parasitic capacitance of the TIA at the cost of a larger input-referred noise, which will be discussed later. In this architecture, positive feedback is implemented by using an amplifier with a negative gain of  $-A_v$  between the drain and gate of M<sub>12</sub>. The top current source is utilized to prevent the leakage of PD current into ground. The bottom current source together with capacitor  $C_s$  forms an RC source degeneration network. The single-ended input impedance of the proposed structure is given as:

$$Z_{NCG}(s) \approx \frac{-1}{A_{v}g_{m12}} + \frac{-1}{A_{v}C_{s} s} = R_{NCG} + \frac{1}{C_{NCG} s} \tag{18}$$

As can be seen from (18), a series negative capacitance and resistance are generated, as shown in Fig. 8(b). The negative capacitance depends on  $C_s$  and  $A_v$ , while the negative resistance is controlled by  $A_v$  and  $g_{m12}$ . From (18),  $A_vg_{m12}$ 



FIGURE 9. Simulated BW versus the control voltage.

must be maximized to minimize the series resistor  $R_{NCG}$ , while  $A_v$  and  $C_s$  must be chosen to achieve a proper negative capacitance  $C_{NCG}$  to neutralize the input capacitance of the TIA ( $C_T$ ).
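The mapping in (18) from circuit parameters to the equivalent series elements can be sketched numerically as follows. This is a minimal illustration only: the function names are hypothetical, and the values of $A_v$, $g_{m12}$, $C_s$, and $C_T$ are assumed for the example and are not taken from this design. The net-capacitance estimate further assumes frequencies low enough that $|R_{NCG}|$ is negligible compared with $1/(\omega |C_{NCG}|)$.

```python
import math


def ncg_elements(a_v, g_m12, c_s):
    """Return (R_NCG, C_NCG) of the series model in (18)."""
    r_ncg = -1.0 / (a_v * g_m12)
    # Matching 1/(C_NCG s) = -1/(A_v C_s s) gives C_NCG = -A_v C_s.
    c_ncg = -a_v * c_s
    return r_ncg, c_ncg


def z_ncg(freq_hz, a_v, g_m12, c_s):
    """Evaluate the approximate input impedance of (18) at one frequency."""
    s = 1j * 2.0 * math.pi * freq_hz
    return -1.0 / (a_v * g_m12) - 1.0 / (a_v * c_s * s)


# Assumed, illustrative values (not from the paper).
A_V = 2.0        # magnitude of the feedback amplifier gain
G_M12 = 20e-3    # transconductance of M12 in siemens
C_S = 30e-15     # source degeneration capacitance in farads
C_T = 150e-15    # total TIA input capacitance in farads

r_ncg, c_ncg = ncg_elements(A_V, G_M12, C_S)
# Low-frequency approximation: the series branch looks like C_NCG alone,
# so it adds directly to the parallel input capacitance C_T.
c_net = C_T + c_ncg

print(f"R_NCG = {r_ncg:.1f} ohm, C_NCG = {c_ncg * 1e15:.1f} fF")
print(f"Net input capacitance = {c_net * 1e15:.1f} fF")
print(f"Z_NCG at 1 GHz = {z_ncg(1e9, A_V, G_M12, C_S):.1f} ohm")
```

With these assumed numbers, a larger $A_v g_{m12}$ shrinks $|R_{NCG}|$, while the product $A_v C_s$ sets how much of $C_T$ is cancelled.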

The circuit implementation of the NCG for our proposed TIA is shown in Fig. 8(c). In this circuit, M<sub>13</sub> and M<sub>14</sub> are the top and bottom current sources, and M<sub>12</sub> is the main amplifier. Transistor M<sub>15</sub> and resistor R<sub>D</sub> form the feedback amplifier. Capacitor C<sub>s</sub> in Fig. 8(a) is replaced with a variable capacitor (varactor), C<sub>var</sub> in Fig. 8(c), which makes the design tunable so that it can compensate for different PD modules with different parasitics. An analog tuning mechanism is employed in this work: the varactor is controlled by an off-chip analog voltage (V<sub>ctrl</sub>) to tune the BW across PVT variations. Alternatively, the gain of the amplifier could be adjusted by changing its tail bias current to achieve a desired negative capacitance; however, the power consumption would then vary from state to state and may increase in some states. The circuit is designed to be connected to the input of the TIA using direct dc coupling. Note that using an ac-coupling capacitor would limit the BW improvement due to its large parasitics to the substrate. The simulated transimpedance BW of the proposed TIA including the NCG mechanism versus the control voltage over PVT variations is shown in Fig. 9. By sweeping  $V_{ctrl}$  from 0 to 1 V, the BW can be improved by 29% (fast corner), 23% (typical), and 19% (slow corner). A current mirror circuit including M<sub>16</sub>, M<sub>17</sub>, and M<sub>18</sub> is implemented to ensure that the top and bottom current sources provide the same current. Since the input node of the NCG circuit is high-impedance, the dc voltage of this net is set by the TIA and its feedback resistor. In addition, a feedback loop is implemented at the input to set the input node voltage to the desired value (VDD/2), compensating for the PD dc current. The implemented NCG circuit consumes 1.8 mW from a 0.9 V supply.

The output noise current of the NCG directly contributes to the input-referred noise of the TIA. Thus, it is important to consider the noise behavior of the proposed NCG circuit.



FIGURE 10. Effect of the implemented NCG circuit on the input-referred noise of the TIA.

The short-circuit output noise current of the proposed NCG can be written as:

$$\overline{I^{2}_{n,out,NCG}}(f) = \overline{I^{2}_{n,M_{13}}} + \overline{I^{2}_{n,M_{14}}} + \overline{I^{2}_{n,M_{12}}} \left(\frac{r_{ds12}}{r_{ds12} + r_{ds14}}\right)^{2} + \left(\overline{I^{2}_{n,M_{15}}} + \overline{I^{2}_{n,R_{D}}}\right) \left(\frac{g_{m12}}{1 + g_{m12}r_{ds14}}\right)^{2} R_{D}^{2} \tag{19}$$

Considering (19), the following conclusions can be made. First, the noise of M<sub>13</sub> and M<sub>14</sub> contributes directly to the output noise. For this reason, two large on-chip MOM capacitors,  $C_{n1}$  and  $C_{n2}$  (~2 pF), are used at the gates of  $M_{13}$  and M<sub>14</sub> to eliminate their gate noise contribution. Second, the noise contribution of M<sub>12</sub> decreases as r<sub>ds14</sub> increases. Therefore, a larger channel length is chosen for M<sub>14</sub> to increase its drain-source resistance. Finally, the noise contribution of  $M_{15}$  and  $R_D$  is minimized by increasing  $g_{m12}$  and  $r_{ds14}$ . To show the effect of the proposed NCG on the input-referred noise of the TIA, the input-referred noise current spectral densities of the conventional TIA with and without the NCG are compared in Fig. 10. From this figure, the NCG circuit at the input of the TIA increases the input-referred noise by  $\sim 10 \text{ pA}/\sqrt{\text{Hz}}$  (33.6 pA/ $\sqrt{\text{Hz}}$  – 23.8 pA/ $\sqrt{\text{Hz}}$ ). This means that the signal-to-noise ratio (SNR) of the TIA, which is defined as the ratio of the minimum PD current to the input-referred noise current of the TIA on a logarithmic scale [31], is degraded by 3 dB ( $20\log_{10}(33.6/23.8)$ ). From (18) and (19), there is a trade-off between the generated negative capacitance and the NCG noise contribution. As discussed earlier, the proposed TIA allows a larger R<sub>F1</sub> to be chosen compared to the conventional designs, which results in a better noise performance. We have designed the NCG circuit to improve the BW of the TIA as much as possible while keeping the overall noise level of the receiver low enough to achieve the target BER, which is chosen as 1E-6 (pre-forward error correction (FEC) BER) for 4-PAM modulation. The pre-FEC BER required by the IEEE 802.3bs 400GbE standard is 2.4E-4 [32], [33], [34].
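The dependence of the weighting factors in (19) on $r_{ds14}$ and the 3 dB SNR figure quoted above can be checked with a short script. This is a sketch under stated assumptions: the function names are hypothetical, and the device values ($g_{m12}$, $r_{ds12}$, $R_D$, and the swept $r_{ds14}$) are illustrative rather than taken from the design. Only the two noise densities, 33.6 and 23.8 pA/√Hz, come from the text.

```python
import math


def ncg_noise_weights(g_m12, r_ds12, r_ds14, r_d):
    """Factors multiplying the M12 term and the (M15, R_D) term in (19)."""
    w_m12 = (r_ds12 / (r_ds12 + r_ds14)) ** 2
    w_feedback = (g_m12 / (1.0 + g_m12 * r_ds14)) ** 2 * r_d ** 2
    return w_m12, w_feedback


def snr_penalty_db(noise_with_ncg, noise_without_ncg):
    """SNR degradation in dB for a ratio of input-referred noise currents."""
    return 20.0 * math.log10(noise_with_ncg / noise_without_ncg)


# Assumed, illustrative device values (not from the paper).
G_M12 = 20e-3   # siemens
R_DS12 = 5e3    # ohms
R_D = 500.0     # ohms

# Both weights shrink as r_ds14 grows, which is why a longer channel
# is attractive for M14.
for r_ds14 in (2e3, 5e3, 10e3):
    w_m12, w_fb = ncg_noise_weights(G_M12, R_DS12, r_ds14, R_D)
    print(f"r_ds14 = {r_ds14 / 1e3:4.0f} kohm: "
          f"M12 weight = {w_m12:.3f}, feedback weight = {w_fb:.4f}")

# Noise densities reported in the text, in pA/sqrt(Hz).
penalty = snr_penalty_db(33.6, 23.8)
print(f"SNR penalty = {penalty:.2f} dB")
```

The last line evaluates $20\log_{10}(33.6/23.8)$, which rounds to the 3 dB stated in the text.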

# V. PERFORMANCE VALIDATION OF THE PROPOSED TIA IN AN OPTICAL RECEIVER

The block diagram and detailed circuit schematics of the optical receiver implemented to validate the proposed TIA design are shown in Fig. 11(a) and (b). The proposed differential TIA is chosen as the core TIA as well as the single-ended-to-differential converter in the receiver chain. The NCG circuit is employed at the input to further improve the BW of the receiver. The TIA is followed by a variable gain amplifier (VGA) to prevent linearity degradation for high-level input signals. A differential amplifier with a variable current source is implemented as the VGA. The bias of its tail current source is set by the voltage provided by the AGC loop. The AGC loop uses a differential peak detector implemented using two diode-connected NMOS transistors connected to V<sub>outp</sub> and V<sub>outn</sub>. The peak of the signal is stored on a capacitor and compared with the desired value, V<sub>ref2</sub>, provided off-chip to achieve a desired total harmonic distortion. A sink dc current source is utilized to provide a bleeding path for the capacitor to calibrate the AGC loop. Finally, the error amplifier provides the required voltage for the VGA based on the comparison. A differential amplifier with  $50\Omega$  loads is employed as the output buffer for impedance matching and measurement purposes. A feedback loop is implemented at the input of the TIA to compensate for the PD dc bias current. This feedback loop measures the dc output voltage of the TIA and compares it with the desired value (LVDD/2, which is V<sub>ref1</sub> shown in Fig. 11); after amplification and level shifting, the result provides the required gate voltage for the variable sink current biasing the PD. Finally, a dc-offset cancellation feedback loop is implemented to cancel the dc level mismatch between the differential outputs V<sub>outp</sub> and V<sub>outn</sub> by adjusting the voltage levels at V<sub>op1</sub> and V<sub>on1</sub>.

Fig. 12 shows the layout of the implemented receiver including all the blocks shown in Fig. 11. The active area of the circuit is 85  $\mu$m × 90  $\mu$m (0.0076 mm<sup>2</sup>). The total power consumption is 49 mW. One critical design consideration here is the connection between the photodiode and the TIA, since they are not typically integrated on the same die. We consider two options for this connection: flip-chip and wirebond. Flip-chip is preferred in this work because it allows the PD and TIA to be placed closer to each other, which usually translates to better signal integrity. For the PD, a 56 GBaud PIN-based PD circuit model, which has a BW of 35 GHz and a responsivity of 0.7 A/W at a wavelength of 1310 nm, is used in the simulation test bench. Other non-idealities such as the PD transit BW limitation, the S-parameter model of the bump and the package trace  ($\sim 300\ \mu$m)  between the PD and the TIA, the pad capacitance (100 fF), and the ESD protector parasitic (80 fF) are included in the simulation test bench shown in Fig. 13. Note that in our co-packaged design there is no ball grid array (BGA) in the high-speed signal path. Moreover, the parasitic inductance of the chip supply domain and the cathode bias voltage of the PD are included in the simulations. The cathode of the PD is connected to an off-chip 3 V supply. To reduce the PD





FIGURE 11. (a) Block diagram (b) Detailed circuit schematics of the implemented front-end receiver.

noise, the cathode of the PD is filtered using an on-chip RC low-pass filter.

The following simulation results (Fig. 14 to Fig. 19) are post-layout (extracted-netlist) results for the layout of Fig. 12 with all of the blocks connected together. Fig. 14 shows the receiver frequency response and output return loss, indicating a maximum gain of 58.5 dB $\Omega$  with a BW of 14.8 GHz and an S<sub>22</sub> of better than -10 dB up to 20 GHz. The simulated output noise voltage spectral density is shown in Fig. 15. From Fig. 15, the input-referred noise current of the receiver is calculated as 33.6 pA/ $\sqrt{Hz}$  over a noise BW of 30 GHz. Furthermore, the integrated noise at the output of the receiver is calculated as 5.1 mV rms.
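As a rough cross-check of these figures, the gain in dBΩ can be converted to ohms and the integrated output noise referred back to the input. This is only a back-of-the-envelope sketch: it assumes the peak gain applies across the whole 30 GHz noise bandwidth, which is not how the 33.6 pA/√Hz value is obtained in the paper (that value is derived from the simulated spectrum in Fig. 15). The function names are hypothetical.

```python
import math


def dbohm_to_ohm(gain_dbohm):
    """Convert a transimpedance gain from dB-ohm to ohms."""
    return 10.0 ** (gain_dbohm / 20.0)


def avg_input_noise_density(v_n_out_rms, z_t_ohm, noise_bw_hz):
    """Average input-referred noise density in A/sqrt(Hz), flat-gain assumption."""
    return v_n_out_rms / (z_t_ohm * math.sqrt(noise_bw_hz))


Z_T_DBOHM = 58.5      # peak transimpedance gain from the text
V_N_OUT_RMS = 5.1e-3  # integrated output noise from the text, in volts rms
NOISE_BW = 30e9       # noise bandwidth from the text, in hertz

z_t = dbohm_to_ohm(Z_T_DBOHM)
i_n_rms = V_N_OUT_RMS / z_t
i_n_density = avg_input_noise_density(V_N_OUT_RMS, z_t, NOISE_BW)

print(f"Z_T = {z_t:.0f} ohm")
print(f"Input-referred rms noise = {i_n_rms * 1e6:.2f} uA")
print(f"Average density = {i_n_density * 1e12:.1f} pA/sqrt(Hz)")
```

Under the flat-gain assumption the estimate lands at roughly 35 pA/√Hz, in the same range as the reported 33.6 pA/√Hz.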

The simulated amplitude and phase mismatch of the differential output of the implemented receiver shown in Fig. 12 over PVT variations are shown in Fig. 16. As can be seen, the amplitude and phase mismatch are better than 0.4 dB and 4 degrees up to 20 GHz. It is worth noting that the output transmission lines (TL) are implemented using two single coplanar waveguides (CPW). They are matched by optimizing



FIGURE 12. Layout of the implemented optical receiver front-end.

the signal metal layer, the width of the trace, and the space between the signal trace and the ground plane. Fig. 17(a) and (b) show the TIA differential output amplitude mismatch in the linear scale ( $|V_{op}/V_{on}|$ ) and that of the optical receiver front-end ( $|V_{outp}/V_{outn}|$ ) over the possible variations of the threshold voltage and W/L ratio of cascode devices. This figure indicates a variation of less than 5% and 0.1% from the target value in the TIA and receiver outputs, respectively. To

| Publication               | Tech (nm) | Gain (dBΩ) | DR (Gbps) | Area (mm<sup>2</sup>) | Noise (pA/√Hz) | BW (GHz) | AGC/DC offset | Power (mW) | Power/bit (pJ/bit) | Modulation |
|---------------------------|-----------|------------|-----------|------------|----------------|----------|---------------|------------|--------------------|------------|
| ISCAS'14<sup>*</sup> [5]  | 28        | 56         | 40        | 0.05       | 25             | 10       | Both          | 56         | 1.4                | PAM-4      |
| TCASI'18 [9]              | 130       | 94         | 10        | 0.08       | 32             | 7        | No            | 108        | 10.8               | NRZ        |
| OJCAS'22<sup>#</sup> [10] | 16        | 58         | 64        | 0.023      | 17.4           | 17.4     | Both          | 19         | 0.29               | PAM-4      |
| JSSC'10 [11]              | 130       | 62         | 10        | 0.06       | 20             | 6        | No            | 98         | 9.8                | NRZ        |
| TCASI'15 [18]             | 130       | 50         | 10        | 0.016      | 31             | 7        | No            | 7.5        | 0.75               | NRZ        |
| JSSC'17 [35]              | 40        | NA         | 25        | 0.007      | 14             | 9        | No            | 39.6       | 1.6                | NRZ        |
| MWCAS'14 [36]             | 28        | 31         | 30        | 0.4        | 53             | 25       | No            | 4.2        | 0.14               | NRZ        |
| TVLSI'16 [37]             | 28        | 65         | 10        | 0.05       | 21             | 7.2      | No            | 68.5       | 6.85               | NRZ        |
| Access'21 [38]            | 130       | 53.2       | None      | 0.021      | 16.8           | 14       | No            | 9.8        | None               | None       |
| This Work<sup>*</sup>     | 16        | 58.5       | 56        | 0.0076     | 33.6           | 14.8     | Both          | 49         | 0.87               | PAM-4      |

TABLE 3. Performance summary of state-of-the-art inductorless TIAs.

\*Simulation

#Including inductors and transformers



FIGURE 13. Simulation test bench used in this work.



FIGURE 14. Simulated frequency response and output return loss of the implemented receiver.

consider mismatch between the devices, Monte-Carlo simulations with 200 iterations have been performed. Fig. 18(a) and (b) show the histograms of the TIA and receiver differential output amplitude mismatch, respectively. The simulated sensitivity of the TIA is better than -6 dBm (considering a responsivity of 0.8 A/W) at a BER of 1E-6. The input PAM4 peak-to-peak current is 300  $\mu$A. The simulated eye-diagram of a 56 Gbps PAM4 PRBS15 signal at the outputs of the receiver, obtained from transient simulation using Cadence, is shown in Fig. 19(a). An eye-opening of 45 mV and a width of 0.2 UI at a BER of 1E-3 are achieved. The output differential swing voltage is 180 mV. We have also performed statistical eye analysis using the pulse response obtained



FIGURE 15. Simulated output noise voltage spectrum of the implemented receiver.



FIGURE 16. Simulated amplitude and phase mismatch of the implemented receiver versus frequency over PVT variations.

from Cadence with a 0.1 UI jitter and a noise standard deviation of 5.1 mV. The contours generated by the statistical eye analysis for 56 Gbps PAM4 at different BERs are shown in Fig. 19(b). Vertical eye-openings of 30 mV and 40 mV with widths of 0.14 UI and 0.2 UI are achieved at BERs of 1E-6 and 1E-3, respectively, which is consistent with the transient simulation results. Also, a robust performance is observed at various data rates and PD current conditions. Fig. 19(c)



FIGURE 17. Simulated |V<sub>op</sub>/V<sub>on</sub>| and |V<sub>outp</sub>/V<sub>outn</sub>| over the variations of the (a) threshold voltage (b) W/L ratio of cascode devices.

shows a 64 Gbps PAM4 statistical eye-diagram when a 3-tap feedforward equalization (FFE) is included. A vertical eye-opening of 20 mV with a width of 0.1 UI at a BER of 1E-6 is achieved. Finally, Table 3 compares the performance of the proposed TIA with the state-of-the-art inductorless TIAs [35], [36], [37], [38]. Reference [10] employs inductors and transformers to improve the GBP without sacrificing the power consumption. Inductor-based TIAs can achieve a higher data rate with a lower power consumption compared to the inductorless TIAs at the cost of a larger silicon area ( $\sim$ 3 times) and a larger coupling to the substrate (which results in a larger cross-talk).
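The Power/bit column of Table 3 follows from dividing the power by the data rate, since 1 mW/Gbps equals 1 pJ/bit. The short script below recomputes that column from the table's own Power and DR entries. The function and dictionary names are hypothetical, and the entry without a data rate ([38]) is omitted.

```python
def energy_per_bit_pj(power_mw, data_rate_gbps):
    """Energy per bit in pJ/bit; mW divided by Gbps is numerically pJ/bit."""
    return power_mw / data_rate_gbps


# (power in mW, data rate in Gbps, Power/bit as listed in Table 3)
TABLE3 = {
    "ISCAS'14 [5]": (56.0, 40.0, 1.4),
    "TCASI'18 [9]": (108.0, 10.0, 10.8),
    "OJCAS'22 [10]": (19.0, 64.0, 0.29),
    "JSSC'10 [11]": (98.0, 10.0, 9.8),
    "TCASI'15 [18]": (7.5, 10.0, 0.75),
    "JSSC'17 [35]": (39.6, 25.0, 1.6),
    "MWCAS'14 [36]": (4.2, 30.0, 0.14),
    "TVLSI'16 [37]": (68.5, 10.0, 6.85),
    "This Work": (49.0, 56.0, 0.87),
}

for name, (power_mw, rate_gbps, listed) in TABLE3.items():
    computed = energy_per_bit_pj(power_mw, rate_gbps)
    print(f"{name:14s} computed = {computed:6.3f} pJ/bit, listed = {listed}")
```

The recomputed values agree with the listed ones to within the rounding used in the table.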

# VI. CONCLUSION

In this paper, a differential inductorless TIA is proposed in which a cascode cross-coupled structure is utilized to improve the BW. A buffer stage together with an RC high-pass filter is employed to generate a differential output. Moreover, a single-ended negative capacitance generation circuit is utilized at the input of the TIA to further enhance the BW. An optical receiver front-end that employs the proposed TIA is simulated in a 16-nm FinFET process. Based on the post-layout simulations and statistical analysis, the proposed receiver achieves a peak transimpedance gain of 58.5 dB $\Omega$  with a bandwidth of 14.8 GHz, which supports up to 56 Gbps PAM4 at a BER of 1E-6 without the need



FIGURE 18. Histogram of the (a) TIA (b) receiver differential output amplitude mismatch.



FIGURE 19. (a) 56 Gbps transient eye-diagram (b) 56 Gbps stat-eye (c) 64 Gbps stat-eye with a 3-tap FFE.

for additional equalization. Higher data rates and/or lower BERs can be achieved by adding equalization.

## ACKNOWLEDGMENT

The authors would like to thank Canadian Microelectronics Corporation (CMC) Microsystems for access to CAD tools.

## REFERENCES

- [1] J. Han et al., "A 20-Gb/s transformer-based current-mode optical receiver in 0.13-μm CMOS," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 57, no. 5, pp. 348–352, May 2010.

- [2] M. Atef, H. Chen, and H. Zimmermann, "10Gb/s inverter based cascode transimpedance amplifier in 40nm CMOS technology," in *Proc. IEEE 16th Int. Symp. Des. Diagnos. Electron. Circuits Syst. (DDECS)*, Karlovy Vary, Czech Republic, 2013, pp. 72–75.
- [3] Y. Wang et al., "A 3-mW 25-Gb/s CMOS transimpedance amplifier with fully integrated low-dropout regulator for 100GbE systems," in *Proc. IEEE Radio Freq. Integr. Circuits Symp.*, Tampa, FL, USA, 2014, pp. 275–278.
- [4] H. Li, G. Balamurugan, J. Jaussi, and B. Casper, "A 112 Gb/s PAM4 Linear TIA with 0.96 pJ/bit Energy Efficiency in 28 nm CMOS," in *Proc. IEEE 44th Eur. Solid-State Circuits Conf. (ESSCIRC)*, Dresden, Germany, 2018, pp. 238–241.
- [5] N. A. Quadir, P. D. Townsend, and P. Ossieur, "An inductorless linear optical receiver for 20Gbaud/s (40Gb/s) PAM-4 modulation using 28nm CMOS," in *Proc. IEEE Int. Symp. Circuits Syst. (ISCAS)*, 2014, pp. 2473–2476.
- [6] P. Sinsoontornpong and A. Worapishet, " $\pi$ -peaking shunt-feedback transimpedance amplifier with bandwidth enhancement," in *Proc. IEEE Int. Conf. Electron Devices Solid-State Circuit (EDSSC)*, Bangkok, Thailand, 2012, pp. 1–4.
- [7] S. G. Kim, C. Hong, Y. S. Eo, J. Kim, and S. M. Park, "A 40-GHz mirrored-cascode differential transimpedance amplifier in 65-nm CMOS," *IEEE J. Solid-State Circuits*, vol. 54, no. 5, pp. 1468–1474, May 2019.
- [8] C. Li and S. Palermo, "A low-power 26-GHz transformer-based regulated cascode SiGe BiCMOS transimpedance amplifier," *IEEE J. Solid-State Circuits*, vol. 48, no. 5, pp. 1264–1275, May 2013.
- [9] S. Ray and M. M. Hella, "A 53 dBΩ 7-GHz inductorless transimpedance amplifier and a 1-THz+ GBP limiting amplifier in 0.13-μm CMOS," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 65, no. 8, pp. 2365–2377, Aug. 2018.
- [10] M. H. Kashani, H. Shakiba, and A. Sheikholeslami, "A low-noise high-gain broadband transformer-based inverter-based transimpedance amplifier," *IEEE Open J. Circuits Syst.*, vol. 3, pp. 72–81, 2022.
- [11] O. Momeni, H. Hashemi, and E. Afshari, "A 10-Gb/s inductorless transimpedance amplifier," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 57, no. 12, pp. 926–930, Dec. 2010.
- [12] B. Razavi, Design of Integrated Circuits for Optical Communications. Hoboken, NJ, USA: Wiley, 2012.
- [13] E. Säckinger, Broadband Circuits for Optical Fiber Communication. Hoboken, NJ, USA: Wiley, 2005.
- [14] B. A. Bodriguez, G. C. Temes, K. W. Martin, S. M. L. Law, R. Handy, and N. Kadekodi, "An NMOS buffer amplifier," *IEEE J. Solid-State Circuits*, vol. SC-19, no. 1, pp. 69–71, Feb. 1984.
- [15] S. S. Mohan, M. D. M. Hershenson, S. P. Boyd, and T. H. Lee, "Bandwidth extension in CMOS with optimized on-chip inductors," *IEEE J. Solid-State Circuits*, vol. 35, no. 3, pp. 346–355, Mar. 2000.
- [16] D. R. Patel, "Co-packaged 100+ Gbps optical communication receiver front-ends in FinFET CMOS," Ph.D. dissertation, Dept. Electr. Comput. Eng., Univ. Toronto, Toronto, ON, Canada, 2020.
- [17] K. R. Lakshmikumar et al., "A process and temperature insensitive CMOS linear TIA for 100 Gb/s/λ PAM-4 optical links," *IEEE J. Solid-State Circuits*, vol. 54, no. 11, pp. 3180–3190, Nov. 2019.
- [18] A. Awny et al., "23.5 A dual 64Gbaud 10kΩ 5% THD linear differential transimpedance amplifier with automatic gain control in 0.13µm BiCMOS technology for optical fiber coherent receivers," in *IEEE ISSCC Dig. Tech. Papers*, Feb. 2016, pp. 406–407.
- [19] J. S. Weiner et al., "SiGe differential transimpedance amplifier with 50-GHz bandwidth," *IEEE J. Solid-State Circuits*, vol. 38, no. 9, pp. 1512–1517, Sep. 2003.
- [20] W.-Z. Chen, Y.-L. Cheng, and D.-S. Lin, "A 1.8-V 10-Gb/s fully integrated CMOS optical receiver analog front-end," *IEEE J. Solid-State Circuits*, vol. 40, no. 6, pp. 1388–1396, Jun. 2005.
- [21] R. G. Meyer and W. D. Mack, "A wideband low-noise variable-gain BiCMOS transimpedance amplifier," *IEEE J. Solid-State Circuits*, vol. 29, no. 6, pp. 701–706, Jun. 1994.
- [22] S. G. Kim et al., "A 50-Gb/s differential transimpedance amplifier in 65nm CMOS technology," in *Proc. IEEE Asian Solid-State Circuits Conf. (A-SSCC)*, Nov. 2014, pp. 357–360.
- [23] M. H. Kashani, A. Tarkeshdouz, E. Afshari, and S. Mirabbasi, "A 53–67 GHz low-noise mixer-first receiver front-end in 65-nm CMOS," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 66, no. 6, pp. 2051–2063, Jun. 2019.

- [24] J.-D. Jin and S. S. H. Hsu, "A 40-Gb/s transimpedance amplifier in 0.18-µm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 43, no. 6, pp. 1449–1457, Jun. 2008.
- [25] A. S. Sedra, K. C. Smith, T. C. Carusone, and V. Gaudet, *Microelectronic Circuits*, 8th ed. New York, NY, USA: Oxford Univ. Press, Nov. 2019, p. 857.
- [26] D. J. Comer, D. T. Comer, J. B. Perkins, K. D. Clark, and A. P. C. Genz, "Bandwidth extension of high-gain CMOS stages using active negative capacitance," in *Proc. 13th IEEE Int. Conf. Electron. Circuits Syst.*, 2006, pp. 628–631.
- [27] S. Goswami, T. Copani, B. Vermeire, and H. Barnaby, "BW extension in shunt feedback transimpedance amplifiers using negative Miller capacitance," in *Proc. IEEE Int. Symp. Circuits Syst.*, 2008, pp. 61–64.
- [28] R. Tagawa and Y. Takahashi, "5.3 GHz, 69.6 dBΩ transimpedance amplifier with negative impedance converter," in *Proc. Int. Symp. Intell. Signal Process. Commun. Syst. (ISPACS)*, 2018, pp. 396–400.
- [29] C.-M. Tsai and W.-T. Chen, "A 40mW 3.5kΩ 3Gb/s CMOS differential transimpedance amplifier using negative-impedance compensation," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, 2007, pp. 52–586.
- [30] C.-M. Tsai and L.-R. Huang, "A 24mW 1.25Gb/s 13kΩ transimpedance amplifier using active compensation," in *ISSCC Dig. Tech. Papers*, Feb. 2006, pp. 238–239.
- [31] H. Hashemi, "Transimpedance amplifiers (TIA): Choosing the best amplifier for the job," Texas Instrum., Dallas, TX, USA, Application Rep. SNOA942A, Nov. 2015.
- [32] E. Sentieri et al., "12.2 A 4-channel 200Gb/s PAM-4 BiCMOS transceiver with silicon photonics front-ends for gigabit Ethernet applications," in *Proc. ISSCC*, Feb. 2020, pp. 210–212.
- [33] H. Li, J. Sharma, C.-M. Hsu, G. Balamurugan, and J. Jaussi, "11.6 A 100Gb/s-8.3dBm-sensitivity PAM-4 optical receiver with integrated TIA, FFE and direct-feedback DFE in 28nm CMOS," in *Proc. ISSCC*, Feb. 2021, pp. 190–192.
- [34] D. Patel, A. Sharif-Bakhtiar, and A. C. Carusone, "A 112 Gb/s -8.2 dBm sensitivity 4-PAM linear TIA in 16nm CMOS with copackaged photodiodes," in *Proc. CICC*, Apr. 2022, pp. 1–2.
- [35] S.-H. Huang and W.-Z. Chen, "A 25 Gb/s 1.13 pJ/b –10.8 dBm input sensitivity optical receiver in 40 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 52, no. 3, pp. 747–756, Mar. 2017.
- [36] L. Szilagyi, R. Henker, and F. Ellinger, "An inductor-less ultra-compact transimpedance amplifier for 30 Gbps in 28 nm CMOS with high energy-efficiency," in *Proc. IEEE 57th Int. Midwest Symp. Circuits Syst. (MWSCAS)*, 2014, pp. 957–960.
- [37] O. T.-C. Chen, C.-T. Chan, and R. R.-B. Sheen, "Transimpedance limit exploration and inductor-less bandwidth extension for designing wideband amplifiers," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 24, no. 1, pp. 348–352, Jan. 2016.
- [38] V. Kumar, G. S. Saravanan, P. Duraiswamy, and S. K. Selvaraja, "Single stage low noise inductor-less TIA for RF over fiber communication," *IEEE Access*, vol. 9, pp. 141504–141512, 2021.



MILAD HAGHI KASHANI (Member, IEEE) received the B.Sc. degree (Hons.) in electrical engineering from the Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran, in 2016, and the M.Sc. degree in electrical engineering from the University of British Columbia (UBC), Vancouver, BC, Canada, in 2018. He is currently pursuing the Ph.D. degree with the University of Toronto, Toronto, ON, Canada.

His research interests include RF/mm-wave integrated circuits and systems, analog and digital integrated circuits and systems, and high-speed signaling. He was the recipient of Iran's National Elites Foundation Fellowship in 2014, the Graduate Student Research Fellowship from the University of California at Santa Barbara in 2016, the IEEE RESMIQ Best Student Paper Award in the International NEWCAS Conference in 2018, the Four Year Doctoral Fellowship from UBC in 2018, the Finalist of TEXPO Graduate Student Research Competition Award in 2019, the Analog Devices Inc. Outstanding Student Designer Award in 2021, and the Ontario Graduate Scholarship in 2021.





**HOSSEIN SHAKIBA** (Senior Member, IEEE) received the B.Sc. and M.Sc. degrees in electrical engineering from the Department of Electrical and Computer Engineering, Isfahan University of Technology, Iran, in 1985 and 1989, respectively, and the Ph.D. degree in electrical engineering from the Department of Electrical and Computer Engineering, University of Toronto, Canada, in 1997. He has over 35 years of teaching, research, design, and management experience in the area of analog circuit and system design for various applications with focus on wireline communication in both the industry and academia. He is currently working on system and circuit development for next-generation serial links with Huawei Canada in collaboration with the wireline industry with emphasis on link design, modeling, and analysis, including statistical and signal integrity. He is also actively involved in conducting research with various universities and co-supervises several graduate students.



ALI SHEIKHOLESLAMI (Senior Member, IEEE) received the B.Sc. degree in electrical engineering from Shiraz University, Iran, in 1990, and the M.A.Sc. and Ph.D. degrees in electrical engineering from the University of Toronto, Canada, in 1994 and 1999, respectively.

In 1999, he joined the Department of Electrical and Computer Engineering, University of Toronto, where he is currently a Professor. He was on Research Sabbatical with Fujitsu Labs from 2005 to 2006, and with Analog Devices, Toronto, ON, Canada, from 2012 to 2013. He has coauthored over 70 journal and conference papers, ten patents, and a graduate-level textbook titled "Understanding Jitter and Phase Noise." His research interests are in analog and digital integrated circuits, high-speed signaling, and CMOS annealing. He has received numerous teaching awards, including the 2005-2006 Early Career Teaching Award and the 2010 Faculty Teaching Award both from the Faculty of Applied Science and Engineering at the University of Toronto. He served on the Memory, Technology Directions, and Wireline Subcommittees of the ISSCC in 2001-2004, 2002-2005, and 2007-2013, respectively. He was an SSCS Distinguished Lecturer from 2018 to 2019. He currently serves as the Education Chair for ISSCC and the Vice President of Education for SSCS. He is an Associate Editor for the IEEE Solid-State Circuits Magazine, in which he has a regular column titled "Circuit Intuitions." He was an Associate Editor for the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-PART I: REGULAR PAPERS from 2010 to 2012, and the Program Chair for the 2004 IEEE ISMVL. He is a Registered Professional Engineer in Ontario, Canada.