# Design of a Low-Power, High-Data-Rate, and Crystal-Less All-Digital IR-UWB Transmitter for High-Density Neural Implants

Jiaxin Lei, Student Member, IEEE, Xiliang Liu, Wei Song, Graduate Student Member, IEEE, Heng Huang, Xiaoyan Ma, Junliang Wei, and Milin Zhang, Senior Member, IEEE

Abstract-This article presents an all-digital, crystal-less impulse radio ultra-wideband (IR-UWB) transmitter for the high-density neural implants with several thousand channels. A six-order hybrid modulation scheme combining the differential 16-pulse-position modulation (D16PPM), pulsewidth modulation (PWM), and differential bi-phase shift keying (DBPSK) is proposed to achieve a high data rate and guarantee the transcutaneous data transmission. A detailed theoretical analysis of the signal-to-noise ratio (SNR) and symbol error rate (SER) relationship is included, along with the analysis of removing the crystal. The proposed transmitter uses a 42-stage bi-phase ring oscillator (RO) to provide all the edges to combine. A highly symmetrical pulse generator and a pulse shaper (PS) with an edge detector (ED) chain are designed, respectively, to realize the proposed modulation scheme efficiently. The transmitter achieves a data rate of 1.8 Gbps with a power consumption of only 4.09 mW. Thus, a power efficiency of 2.3 pJ/bit is achieved. In an ex vivo test, a transcutaneous transmission range of 20 cm was measured with 18-mm pork tissue applied.

*Index Terms*—Brain-machine interfaces (BMIs), crystalless, digital transmitter, hybrid modulation, impulse radio ultra-wideband (IR-UWB), neural implants.

## I. INTRODUCTION

NEURAL implants enable neural signal acquisition from the inner brain, offering new possibilities for interaction between man-made devices and the neural system. Since there are billions of neurons in the human brain, more sensing channels are needed for higher spatial resolution. Recently, a batch of thousand-channel high-density electrodes has been reported in the literature [1], [2], [3], [4]. The data throughput of the neural implants increases proportionally with the number of channels, reaching the Gbps level. For instance, a neural implant with 3000 sensing channels, a sampling rate of 30 kSps per channel, and a 16-bit quantization generates a total data throughput of 1.44 Gbps. Such a large data throughput becomes a key challenge in transferring the acquired neural data out of the body, across the skull and skin [5], [6]. Lossy compression technologies, such as on-chip spike sorting, spike-driven data compression, and compressed sensing, have been proposed to reduce the data load by $6\times$ to $10\times$ [7], [8], [9], [10]. However, the additional power required for data compression may exceed the power saved by the wireless data link [5]. In addition, since there is not yet a consensus on the specifications of the compressed neural data [11], lossless raw data are still preferred in current scientific research. A high-data-rate wireless telemetry module that can transmit all the raw acquired neural data is therefore in strong demand for high-density neural implants.

Manuscript received 1 September 2023; revised 23 November 2023; accepted 24 December 2023. Date of publication 24 May 2024; date of current version 28 June 2024. This article was approved by Associate Editor Pui-In Mak. This work was supported in part by the Natural Science Foundation of China under Grant 92164202, in part by Beijing Innovation Center for Future Chip, and in part by Beijing National Research Center for Information Science and Technology. (*Corresponding author: Milin Zhang.*)

Jiaxin Lei, Wei Song, and Heng Huang are with the Department of Electronic Engineering, Tsinghua University, Beijing 100084, China.

Xiliang Liu, Xiaoyan Ma, and Junliang Wei are with Beijing Ningju Technology Ltd., Beijing 100190, China.

Milin Zhang is with the Department of Electronic Engineering, Beijing National Research Center for Information Science and Technology, Institute for Precision Medicine, Tsinghua University, Beijing 100084, China (e-mail: zhangmilin@tsinghua.edu.cn).

Color versions of one or more figures in this article are available at https://doi.org/10.1109/JSSC.2023.3349077.

Digital Object Identifier 10.1109/JSSC.2023.3349077

Various wireless technologies have been explored for implantable scenarios, such as custom Medical Implant Communication System (MICS) or Industrial Scientific Medical (ISM) band transmission [12], [13], [14], [15], inductive coupling transmission [16], [17], [18], [19], ultrasonic communication [20], [21], [22], optical communication [23], [24], [25], human body communication (HBC) [26], [27], [28], [29], [30], and impulse radio ultra-wideband (IR-UWB) [31], [32], [33], [34], [35], [36], [37]. Most of these solutions achieve a data rate from several Mbps to $\sim 300$ Mbps, while IR-UWB outperforms them, achieving the highest data rate of more than 1 Gbps. This performance is due to its >500 MHz available bandwidth, which allows Gbps throughput with a relatively low-complexity and low-power design. Therefore, IR-UWB is a promising option to meet the requirements of high-density neural implants.

Nevertheless, several challenges remain to be addressed. One of the key design challenges is that the transmit power spectral density (PSD) is limited to -41.3 dBm/MHz to comply with the Federal Communications Commission (FCC) UWB regulation. For a UWB transmitter operating with an exact bandwidth of 500 MHz, the maximum TX power is limited to -14.3 dBm. For IR-UWB, the TX power is equal to the pulse repetition frequency (PRF) multiplied by the energy per pulse. While the PRF should typically be higher to boost the data rate, an increase in the PRF in turn reduces the energy per pulse, which weakens the ability to cope with the additional path loss due to the biological tissue. Thus, a contradiction arises between the high-data-rate demand and the practicality of transcutaneous data transmission. Two methods have been proposed to alleviate this contradiction. One is to further increase the bandwidth. Kim and Rabaey [38] introduced triple-channel UWB with a total bandwidth of 1 GHz. Geng et al. [39], Ko and Gharpurey [40], and Lee et al. [41], [42] each introduced frequency hopping technology, achieving a total bandwidth of more than 2 GHz. With a larger bandwidth, the total TX power can be larger to guarantee the energy per pulse. The other way to balance the data rate and the transcutaneous transmission is to increase the modulation order. Lee et al. [43] proposed digitalized multi-pulse-position modulation (D-MPPM). With a sync pulse and a data pulse with 32 possible positions in one symbol, the modulation order increased to 2.5. Lee et al. [41], [42] further added one more data pulse in a symbol and introduced extended multi-pulse-position modulation (E-MPPM), resulting in a modulation order of 3. However, the sync pulse in E-MPPM still reduced the modulation order by 1/3. With 50% more pulses and a nearly halved pulse-position step of $\sim 50$ ps, the modulation order of E-MPPM increased by only 0.5 compared with D-MPPM. Song et al. [34], [35] improved the modulation order to 7 by introducing a 4PAM-8PSK-4PPM hybrid modulation, achieving a data rate of 1.66 Gbps with a 2-cm transcutaneous transmission range. Compared with a standalone modulation scheme, the adoption of hybrid modulation relaxes the signal-to-noise ratio (SNR) requirement [35]. However, the hybrid modulation in [34] and [35] requires a deliberate reduction of the TX pulse amplitude due to the use of PAM, resulting in a reduced transmission range.
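The tradeoff between PRF and energy per pulse described above can be illustrated numerically. The following is a minimal sketch, not part of the original design flow; the function names are hypothetical, and the PSD mask is assumed to be filled uniformly over the stated bandwidth.

```python
import math


def max_tx_power_dbm(psd_limit_dbm_per_mhz: float, bandwidth_mhz: float) -> float:
    """Total power allowed when a flat PSD mask is filled over the bandwidth."""
    return psd_limit_dbm_per_mhz + 10.0 * math.log10(bandwidth_mhz)


def energy_per_pulse_pj(tx_power_dbm: float, prf_hz: float) -> float:
    """Energy per pulse, using TX power = PRF x energy per pulse."""
    tx_power_w = 1e-3 * 10.0 ** (tx_power_dbm / 10.0)
    return tx_power_w / prf_hz * 1e12


# FCC mask of -41.3 dBm/MHz over an exact 500-MHz bandwidth.
p_max = max_tx_power_dbm(-41.3, 500.0)

# Tripling the PRF at a fixed power cap cuts the energy per pulse to one third.
e_100mhz = energy_per_pulse_pj(p_max, 100e6)
e_300mhz = energy_per_pulse_pj(p_max, 300e6)
print(f"max TX power: {p_max:.1f} dBm")
print(f"energy/pulse: {e_100mhz:.3f} pJ at 100 MHz, {e_300mhz:.3f} pJ at 300 MHz")
```

The first result reproduces the -14.3-dBm limit quoted above for a 500-MHz bandwidth.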

Considering the power consumption, high-frequency carrier generation is usually power-hungry in the conventional upconversion architecture [33], [34], [35], [38], [39], [40]. Alternatively, IR-UWB transmitters can adopt an all-digital edge-combine technique to replace the high-frequency carrier and save power [32], [42], [43], [44], [45], [46]. The edge-combine architecture typically uses a delay line to generate the edges and digital combining logic to assemble the edges into UWB pulses. However, the modulation order of the digital edge-combine transmitters is usually limited to 2-3 due to the lack of phase modulation [41], [42], [43]. In addition, the delay of the delay cells in the delay line varies with the process corners. Switched capacitors are widely used to calibrate the delay cells [32], [41], [42], [43], which is especially complicated because an extra high-speed oscilloscope is usually needed to quantify the $\sim 100$-ps delay of each stage.

The third challenge is achieving a small form factor. For an IR-UWB transmitter, there are usually three kinds of external components that increase the volume of the system: the external matching network, the antenna, and the crystal oscillator. Prior UWB works have proposed matching-network-free [32], [41], [43] or on-chip-matching [34], [35] transmitters and even on-chip antennas [33], [47]. Notably, some of these works manage to achieve a data rate of $\sim 1$ Gbps. However, the data rate of the existing crystal-less UWB transmitters is usually limited to the Mbps level [48], [49], [50]. This limitation is primarily due to the need for precise timing in high-speed transmitters.

Fig. 1. Diagram of the D16PPM–PWM–DBPSK hybrid modulation scheme. A total modulation order of 6 is achieved.

This article proposes an IR-UWB transmitter for high-density neural implants as a possible solution to the above challenges [36]. A hybrid modulation scheme combining the differential 16-pulse-position modulation (D16PPM), pulsewidth modulation (PWM), and differential bi-phase shift keying (DBPSK) is proposed to balance the data rate and transcutaneous data transmission. A total modulation order of 6 is achieved. This six-order modulation scheme is realized by an all-digital transmitter, with a ring oscillator (RO) providing all the edges to combine. The calibration of the RO frequency also calibrates all the delay cells, which is much easier than the switched capacitor method. The proposed modulation scheme and the transmitter are also optimized for crystal-less working scenarios. The transmitter features a data rate of 1.8 Gbps with a power consumption of only 4.09 mW. Thus, an energy efficiency of 2.3 pJ/bit is achieved. A transcutaneous transmission range of 20 cm is measured with 18-mm pork tissue applied.

The rest of this article is organized as follows. Section II introduces the proposed hybrid modulation scheme, including theoretical analysis among the SNR, symbol-error rate (SER), and the link budget, along with the impact of removing the crystal. The architecture of the transmitter and the detailed circuit implementation are introduced in Section III. Section IV explains the experimental results, including an ex vivo transcutaneous transmitting test, while Section V concludes the entire work.

## II. HYBRID MODULATION SCHEME

### A. D16PPM-PWM-DBPSK Modulation

The diagram of the proposed D16PPM–PWM–DBPSK hybrid modulation scheme is illustrated in Fig. 1. The D16PPM modulates the data as the pulse-position difference of adjacent pulses. There are 16 possible positions in every symbol. The pulse-position code (PPC) of one symbol can be expressed as

$$\text{PPC}_i \equiv \text{D16PPM}_i + \text{PPC}_{i-1} \pmod{16}. \tag{1}$$

In addition, the PWM modulates the data into a varying number of sinusoidal sub-pulses in the TX wave. In the proposed design, there are five and four sinusoidal sub-pulses for the PWM codes 0 and 1, respectively, and the corresponding signal models are denoted as $S_0$ and $S_1$. DBPSK further modulates the data with a 0° or 180° phase shift between adjacent pulses for code 0 or 1, respectively. As shown in Fig. 1, the two optional phases for a UWB TX wave are denoted as phase code (PC) 0 or 1. The 180° phase shift of a signal can be expressed by multiplying it by a parameter $p_i \in \{-1, 1\}$. Thus, the TX signal can be expressed as

$$S(t) = \sum_{i=0}^{\infty} p_i \cdot S_{\text{PWM}}(t - i \cdot T_s - \delta \cdot \text{PPC}_i) \tag{2}$$

where $T_s$ is the symbol duration and $\delta$ is the D16PPM time step.

D16PPM features a modulation order of 4. Together with PWM and DBPSK, the hybrid modulation scheme achieves a total modulation order of 6.
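As an illustration of how one 6-bit symbol splits across the three modulations, the following minimal sketch applies (1) and the differential phase rule to a stream of symbols. The bit layout inside a symbol (upper four bits for D16PPM, then PWM, then DBPSK) and the initial states are assumptions made for this example only; the function name is hypothetical.

```python
def modulate(symbols, ppc_prev=0, p_prev=1):
    """Map 6-bit symbols to (PPC, phase sign p_i, number of sub-pulses).

    Assumed bit layout per symbol: bits [5:2] = D16PPM code,
    bit [1] = PWM code, bit [0] = DBPSK code.
    """
    out = []
    for s in symbols:
        d16ppm = (s >> 2) & 0xF
        pwm = (s >> 1) & 0x1
        dbpsk = s & 0x1

        # Eq. (1): the position code accumulates modulo 16.
        ppc = (d16ppm + ppc_prev) % 16
        # DBPSK: code 1 flips the phase relative to the previous pulse.
        p = -p_prev if dbpsk else p_prev
        # PWM: five sub-pulses for code 0, four for code 1.
        sub_pulses = 4 if pwm else 5

        out.append((ppc, p, sub_pulses))
        ppc_prev, p_prev = ppc, p
    return out


tx = modulate([0b111111, 0b000100])
print(tx)
```

Because both the position and the phase are encoded differentially, each output tuple depends on the previous symbol, which is why the function carries `ppc_prev` and `p_prev` forward.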

To decrease the symbol duration and to increase the PRF, the time step of the D16PPM pulse positions ($\delta$) can be made shorter than the pulsewidth, as indicated in Fig. 1. In the proposed design, the time step of the D16PPM is half a sinusoidal period. Given that a pair of adjacent rising and falling edges in the edge-combine circuit corresponds exactly to half a sinusoidal period, this design allows the generation of pulses and the regulation of pulse positions to be performed by a single module. Compared with the previous two-step approach [41], [42], [43], which uses a stand-alone digital-to-time converter (DTC) to delay the trigger signal, followed by a stand-alone pulse generator to generate the pulses, a better efficiency can be expected for the proposed design. In addition, both the PWM and DBPSK in the hybrid modulation scheme are also suitable for digital edge-combine transmitters. This is because the pulsewidth variation in PWM equates to an integer number of sub-pulses. For DBPSK, a phase difference of 180° can be simply created with an inverter, or with a bi-phase RO as in this work.

In this work, the PRF is designed as 300 MHz, enabling a  $T_s$  of 3.33 ns. The data rate is specified at 1.8 Gbps. Every symbol duration is divided equally into 28 time slots, with each time slot corresponding to half a sinusoidal period. Thus, the time step of D16PPM ( $\delta$ ) is 119 ps. The time step of PWM is 238 ps. The carrier frequency of the TX signal is established at 4.2 GHz.
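The timing parameters above follow from the PRF, the modulation order, and the number of time slots. A minimal sketch of this arithmetic is given below; the variable names are illustrative only.

```python
PRF_HZ = 300e6          # pulse repetition frequency
BITS_PER_SYMBOL = 6     # 4 (D16PPM) + 1 (PWM) + 1 (DBPSK)
SLOTS_PER_SYMBOL = 28   # each slot is half a sinusoidal period

symbol_duration_s = 1.0 / PRF_HZ
data_rate_bps = PRF_HZ * BITS_PER_SYMBOL

# D16PPM time step: one slot, i.e. half a carrier period.
delta_s = symbol_duration_s / SLOTS_PER_SYMBOL
# PWM time step: one full sub-pulse, i.e. two slots.
pwm_step_s = 2.0 * delta_s
# Carrier frequency: one full period spans two slots.
carrier_hz = 1.0 / (2.0 * delta_s)

print(f"T_s     = {symbol_duration_s * 1e9:.2f} ns")
print(f"rate    = {data_rate_bps / 1e9:.1f} Gbps")
print(f"delta   = {delta_s * 1e12:.0f} ps")
print(f"PWM     = {pwm_step_s * 1e12:.0f} ps")
print(f"carrier = {carrier_hz / 1e9:.1f} GHz")
```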

### B. SNR and Link Budget Analysis

In assessing the viability of a new modulation scheme, a crucial step is to provide a theoretical analysis of the SNR and SER relationship, followed by a link budget analysis. This analysis is vital to determine whether the new approach is suitable for the intended working scenario and to identify the required performance of the circuits involved.

Fig. 2 illustrates the logical diagram of a modulator and demodulator pair for SNR analysis. For the modulator, the clock source first generates the system clock. The DTC delays the clock according to the D16PPM codes and generates the trigger signal for the Pulse Gen (PG). The PG then produces pulses, based on the DBPSK codes and PWM codes. Finally, the pulse amplifier (PA) completes the TX process by shaping and amplifying the pulses and generating the final TX signal.

Fig. 2. (a) Logical diagram of a modulator for the proposed hybrid modulation scheme. (b) Diagram of a non-optimal demodulator for the D16PPM–PWM–DBPSK modulation scheme. The SNR analysis is based on the modulator and demodulator in this figure.

Fig. 3. Effect of the envelope noise on the trigger time of the ED, taking the rising edge as an example. The noise value  $(\Delta v_{rn})$  is considered to be constant in a very short duration.

For the demodulator, an optimal demodulator with the maximum likelihood strategy offers the best achievable performance on the additive white Gaussian noise (AWGN) channel. However, for the proposed hybrid modulation scheme, the extremely high hardware complexity of the optimal demodulator makes it impractical. Alternatively, non-coherent receivers with an asynchronous edge detector (ED) are widely adopted for PPM-based UWB systems [41], [42], [43]. Fig. 2(b) illustrates a suitable demodulator for the proposed modulation scheme. The non-coherent self-mixing branch extracts the signal envelope (env(t)) and detects the rising and falling edges of the envelope for D16PPM and PWM demodulation. The coherent I/Q branches are implemented for DBPSK demodulation. The sampling events of the coherent branches are triggered by the ED in the self-mixing branch. Thus, the phase of the received signal can be synchronized in every symbol.

Although the relationship between the SNR and SER of asynchronous edge-detecting systems has been widely discussed, the existing analyses primarily focus on situations with no overlap between adjacent pulse positions [51], [52], [53]. For a modulation scheme with an overlap, it is necessary to first derive the probability density function (pdf) of the ED trigger time deviation. Fig. 3 illustrates the effect of the envelope noise $(\Delta v_{rn})$ on the ED trigger time, taking the rising edge as an example. The subscript *n* denotes the symbol index, and *r* denotes the rising edge. $V_{th}$ is the threshold voltage of the ED. $t_{rn}$ and $\hat{t}_{rn}$ are the ideal trigger time and the actual trigger time, respectively. Since the low-pass filter (LPF) in the non-coherent branch features a cutoff frequency of ~1.6 GHz for the proposed modulation, with a typical time deviation ($\Delta t_{rn}$) of only tens of picoseconds, the noise value ($\Delta v_{rn}$) can be considered to be the same at $t_{rn}$ and $\hat{t}_{rn}$. In addition, the pdf of a squared signal with the original amplitude *A* and noise variance $\sigma^2$ can be expressed by the non-central chi-square distribution [52], [54] as

$$p_1(x, A, \sigma) = \frac{1}{2\sigma^2} \left(\frac{x}{A^2}\right)^{-\frac{1}{4}} \exp\left(-\frac{x+A^2}{2\sigma^2}\right) I_{-\frac{1}{2}} \left(\frac{A\sqrt{x}}{\sigma^2}\right) \tag{3}$$

where  $I_n(x)$  is the modified Bessel function of the first kind with an order of *n*. Thus, the pdf of  $\Delta v_{rn}$  is

$$p_2(\Delta v_{\rm rn}) = p_1(V_{\rm th} + \Delta v_{\rm rn}, \sqrt{V_{\rm th}}, \sigma). \tag{4}$$

Since  $\Delta v_{rn} = V_{th} - \text{env}_n(\hat{t}_{rn})$ , the pdf of  $\hat{t}_{rn}$  is given as

$$p_3(\hat{t}_{\rm rn}) = p_2(V_{\rm th} - {\rm env}_n(\hat{t}_{\rm rn})) \cdot \left| \frac{{\rm d}\Delta v_{\rm rn}}{{\rm d}\hat{t}_{\rm rn}} \right|. \tag{5}$$

Thus, the pdf of  $\Delta t_{rn}$  is

$$p_4(\Delta t_{\rm rn}) = p_3(\Delta t_{\rm rn} + t_{\rm rn}). \tag{6}$$

The above derivation only considered the thermal noise at the input of the receiver, while for the transmitter, the timing noise of the equivalent DTC affects the TX signal quality the most. If the variance of the DTC timing jitter is  $\sigma_t^2$ , the final expression of the pdf of  $\Delta t_{rn}$  is

$$p_n(\Delta t_{\rm rn}) = \int_{-\infty}^{\infty} p_4(\Delta t_{\rm rn} - x) \cdot p_t(x, \sigma_t^2) \, {\rm d}x \tag{7}$$

where  $p_t(x, \sigma_t^2)$  is the pdf of the Gaussian distribution with 0 mean and  $\sigma_t^2$  variance. The expression of the pdf of the falling edge trigger time deviation ( $\Delta t_{fn}$ ) is the same as (7).

For any  $\Delta t_{rn}$  value in symbol *n*, D16PPM is demodulated correctly only if the rising edge trigger time deviation of symbol n - 1 meets the following condition:

$$\left|\Delta t_{\rm rn} - \Delta t_{r(n-1)}\right| < 0.5 \cdot \delta \tag{8}$$

where  $\delta$  is the time step of D16PPM in (2).
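To give a feel for condition (8), the sketch below evaluates the probability of violating it when only the DTC timing jitter is present. This is a simplification and not the full derivation: the receiver envelope noise is ignored, and the jitter of adjacent symbols is assumed to be independent, zero-mean, and Gaussian with standard deviation $\sigma_t$. The function name is hypothetical.

```python
import math


def d16ppm_jitter_error_prob(sigma_t_over_delta: float) -> float:
    """P(|dt_n - dt_(n-1)| >= 0.5*delta) for i.i.d. Gaussian jitter.

    The difference of two independent N(0, sigma^2) variables is
    N(0, 2*sigma^2), and P(|X| > a) = erfc(a / (std * sqrt(2))).
    """
    std_diff = math.sqrt(2.0) * sigma_t_over_delta
    return math.erfc(0.5 / (std_diff * math.sqrt(2.0)))


for ratio in (0.06, 0.08, 0.10):
    print(f"sigma_t/delta = {ratio:.2f}: "
          f"P(violate (8)) = {d16ppm_jitter_error_prob(ratio):.2e}")
```

Under these assumptions the violation probability rises steeply with $\sigma_t/\delta$, which is consistent with the DTC jitter being treated as a key design parameter in this section.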

For DBPSK, the probability of demodulating a DBPSK code correctly is

$$P_{\text{DBPSK}}(\psi_1 < \psi < \psi_2 \mid \Delta \theta_n) = F_{\Delta \theta_n}(\psi_2) - F_{\Delta \theta_n}(\psi_1) + 1 \tag{9}$$

where $\Delta \theta_n$ is the input phase shift of adjacent UWB pulses, $\psi$ is the demodulated phase shift, and $\psi_1$ and $\psi_2$ are the boundaries for code decisions. $F_{\Delta\theta_n}(\varphi)$ can be expressed as

$$F_{\Delta\theta_n}(\varphi) = \frac{\sin(\Delta\theta_n - \varphi)}{4\pi} \times \int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} \frac{\exp\left\{-\frac{E_S}{N_0}\left[1 - \cos\left(\Delta\theta_n - \varphi\right) \cdot \cos t\right]\right\}}{1 - \cos\left(\Delta\theta_n - \varphi\right) \cdot \cos t} \, {\rm d}t. \tag{10}$$

Fig. 4. Derived and simulated SER versus  $E_S/N_0$  curves of the proposed hybrid modulation scheme with the demodulator in Fig. 2.  $\sigma_t/\delta$  is the standard deviation of the DTC jitter in comparison to the D16PPM time step.

Ideally, when the DBPSK code is 0, $\Delta \theta_n$ is 0. When the DBPSK code is 1, $\Delta \theta_n$ is $\pi$. However, for the proposed modulation scheme, the synchronization of adjacent I/Q data is realized by the ED in the non-coherent branch. Thus, $\Delta t_{rn}$ and $\Delta t_{r(n-1)}$ introduce a synchronization error to $\Delta \theta_n$

$$\Delta \hat{\theta}_n = \Delta \theta_n + \frac{\Delta t_{\text{rn}} - \Delta t_{r(n-1)}}{\delta} \cdot \pi. \tag{11}$$

As a result, for any $\Delta t_{rn}$, the probability of a correct D16PPM and DBPSK demodulation is

$$P_{1}(\Delta t_{rn}) = \int_{-\frac{\delta}{2} + \Delta t_{rn}}^{\frac{\delta}{2} + \Delta t_{rn}} P_{\text{DBPSK}}(-\pi/2 < \psi < \pi/2 \mid \Delta \hat{\theta}_{n}) \cdot p_{n}(\Delta t_{r(n-1)}) \, {\rm d}\Delta t_{r(n-1)}. \tag{12}$$

Similar to D16PPM, the probability of a correct PWM demodulation with any $\Delta t_{rn}$ is

$$\begin{aligned} P_{2}(\Delta t_{rn}) &= 0.5 \cdot (P_{\text{PWM}=0} + P_{\text{PWM}=1}) \\ &= 0.5 \cdot \left[ \int_{\Delta t_{fn} < \delta + \Delta t_{rn}} p_{n}(\Delta t_{fn}) \, {\rm d}\Delta t_{fn} + \int_{\Delta t_{fn} > -\delta + \Delta t_{rn}} p_{n}(\Delta t_{fn}) \, {\rm d}\Delta t_{fn} \right]. \end{aligned} \tag{13}$$

The final expression of the SER is given as

$$\text{SER} = 1 - \int P_1(\Delta t_{rn}) \cdot P_2(\Delta t_{rn}) \cdot p_n(\Delta t_{rn}) \, {\rm d}\Delta t_{rn}. \tag{14}$$

Fig. 4 illustrates the derived SER versus $E_S/N_0$ curves, as well as several simulated results, with different DTC jitter values ($\sigma_t$) relative to the D16PPM time step ($\delta$). Although some approximations are used in the above derivation, the derived results agree well with the simulation results.

Fig. 5. Impact of the frequency deviation to the SER. This result is derived with an SNR of 21.3 dB and a DTC jitter of 7.14 ps.

Fig. 6. Ideal DSB phase noise spectrum of a free-running 300-MHz CMOS oscillator. The corner frequency of this spectrum is set at 1 MHz, where the phase noise is -100 dBc/Hz.

For a time step of 119 ps, a DTC jitter of 0.06 least significant bit (LSB) (7.14 ps) is reasonable to achieve. Thus, the required SNR for an SER of $10^{-3}$ is 21.3 dB. The link budget can be calculated as

$$\frac{E_S}{N_0} + \mathrm{NF} + N + \mathrm{IL} = P_{\mathrm{TX}} + G_i - \mathrm{PL} - \mathrm{LM} \tag{15}$$

where NF is the noise figure of the RX, $N$ is the noise power in the target bandwidth, IL is the implementation loss, $P_{\text{TX}}$ is the TX power, $G_i$ is the antenna gain, PL is the path loss, and LM is the link margin. For a target bandwidth of ~1.5 GHz, the noise power $N$ is -82.2 dBm. The TX power of a triangular-shaped UWB signal with the proposed modulation is -11.9 dBm, if it adheres to the -41.3-dBm/MHz spectrum mask. Assuming NF, IL, and $G_i$ are 5 dB, 3 dB, and 7 dBi, respectively, the sum of PL and LM is 48 dB. In addition, according to the measurement results in this work and in the literature [55], we can estimate a path loss of roughly 40 dB for a 4.2-GHz UWB signal transmitting through the skull and over a distance of several centimeters, which leaves a link margin of about 8 dB. Therefore, it can be concluded that the proposed hybrid modulation scheme is suitable for the targeted high-density neural implants.
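The link budget in (15) can be checked with a few lines of arithmetic. The sketch below is illustrative only; it assumes a thermal noise floor of -174 dBm/Hz (a room-temperature value not stated in the text), and the function names are hypothetical.

```python
import math


def thermal_noise_dbm(bandwidth_hz: float, n0_dbm_per_hz: float = -174.0) -> float:
    """Noise power in the given bandwidth (assumed -174 dBm/Hz floor)."""
    return n0_dbm_per_hz + 10.0 * math.log10(bandwidth_hz)


def path_loss_plus_margin_db(es_n0_db, nf_db, noise_dbm, il_db, p_tx_dbm, gain_dbi):
    """Rearranged Eq. (15): PL + LM = P_TX + G_i - (E_S/N_0 + NF + N + IL)."""
    return p_tx_dbm + gain_dbi - (es_n0_db + nf_db + noise_dbm + il_db)


noise = thermal_noise_dbm(1.5e9)
budget = path_loss_plus_margin_db(
    es_n0_db=21.3, nf_db=5.0, noise_dbm=noise, il_db=3.0,
    p_tx_dbm=-11.9, gain_dbi=7.0,
)
margin = budget - 40.0  # with the roughly 40-dB path loss estimated above

print(f"N       = {noise:.1f} dBm")
print(f"PL + LM = {budget:.1f} dB")
print(f"LM      = {margin:.1f} dB")
```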

### C. Impact of Removing the Crystal

According to the above analysis, the performance of the equivalent DTC in the transmitter is crucial for the hybrid modulation scheme. This work aims to implement a crystal-less transmitter to reduce the system volume. The absence of the crystal oscillator and the associated phase-locked loop (PLL) may have two consequences: frequency deviation of the system clock and degradation of the timing noise.



Fig. 7. Cumulative noise ratio  $r_n(f_u)$  of the free-running oscillator, with the individual contribution of white noise and flicker noise presented.

The influence of the frequency deviation can be quantified by incorporating this factor into the theoretical analysis discussed in Section II-B. In our implementation, the deviations of the D16PPM time step, the PWM time step, and the carrier frequency correspond directly to the deviation of the oscillator's frequency. Taking these factors into account, the relationship between the frequency deviation and the SER is shown in Fig. 5. This analysis assumes a crystal-less transmitter paired with a frequency-accurate receiver. The results are based on an SNR of 21.3 dB and a DTC jitter of 7.14 ps. According to Fig. 5, a frequency deviation of 1 MHz, corresponding to about 0.3% of the 300-MHz target frequency, results in a 2.7% increase in the SER. This slight increase can be considered tolerable. In addition, the simulation results indicate that a calibrated 300-MHz RO may feature a frequency drift of 0.5 MHz/mV, or 1%/°C. Nevertheless, the supply voltage drift can be compensated by common techniques, including voltage regulators or controlled current sources. Regarding the temperature drift, the internal temperature of the targeted implant area is relatively stable, and a temperature drift of more than 3 °C is unlikely. Thus, the frequency deviation of the crystal-less system is acceptable.

As for the timing noise, a free-running oscillator may suffer from a root mean square (rms) jitter of up to tens of picoseconds. However, the proposed modulation scheme primarily uses the position or phase difference of adjacent UWB pulses to encode data. Crucially, it is the timing noise accumulated within a few symbol periods that really degrades the quality of the TX signal. In the following, a theoretical and quantified analysis will be given, based on the logical modulator depicted in Fig. 2(a).

In every TX cycle, the DTC in Fig. 2(a) is triggered by the system clock from the oscillator. Thus, the initial timing noise that actually affects the difference of adjacent pulse positions is the period jitter of the free-running oscillator, which can be quantified as the standard deviation of the oscillation period. The period jitter can be calculated from the self-referenced phase noise variance accumulated within a symbol period

$$\sigma_{T_s} = \frac{\sqrt{\sigma_{\phi \text{osc}}^2(T_s)}}{2\pi f_c} \tag{16}$$

where $\sigma_{T_s}$ is the period jitter, $\sigma_{\phi \text{osc}}^2(T)$ is the self-referenced phase noise variance accumulated within time *T*, and $f_c$ is the oscillation frequency. The phase noise primarily comes from two noise sources: the white noise, which is characterized by its uniform spectral distribution, and the flicker noise, which dominates in the low-frequency band. The relationship between the phase noise variance and the noise sources can be expressed as follows [56]:

$$\sigma_{\phi \text{osc}}^2(T_s) = 2 \cdot \int_0^\infty \frac{\sin^2(\pi f T_s)}{\pi^2 f^2} S_n(f) \, \mathrm{d}f \tag{17}$$

where $S_n(f)$ is the PSD of the stationary noise sources, including the white noise and the flicker noise.

Fig. 8. Diagram of the proposed transmitter for the D16PPM-PWM-DBPSK hybrid modulation scheme.

As a specific calculation, we can consider the double-sideband (DSB) phase noise spectrum of a free-running 300-MHz CMOS oscillator shown in Fig. 6. Assume that the corner frequency, where the white noise and the flicker noise contribute equally, is 1 MHz, and that the DSB phase noise at a 1-MHz offset is -100 dBc/Hz. Then the calculated period jitter of this free-running oscillator is 2.04 ps. Considering the 7.14-ps jitter limit introduced in the previous section, the remaining jitter budget for the DTC is

$$\begin{aligned} \sigma_{(\text{budget})} &= \sqrt{\frac{\sigma_{(2\ \text{DTC conversions})}^2 - \sigma_{(\text{trigger})}^2}{2}} \\ &= \sqrt{\frac{2 \times \sigma_{(\text{DTC limit})}^2 - \sigma_{T_s}^2}{2}} \\ &= \sqrt{\frac{2 \times 7.14^2 - 2.04^2}{2}} = 6.99 \text{ ps.} \end{aligned} \tag{18}$$

Thus, there is enough budget for the transmitter to achieve the targeting timing performance, even if the oscillator is running free.
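The budget in (18) is a root-sum-square subtraction and can be reproduced directly. The following minimal sketch uses the values quoted above; the function name is hypothetical.

```python
import math


def dtc_jitter_budget_ps(dtc_limit_ps: float, period_jitter_ps: float) -> float:
    """Eq. (18): per-conversion DTC jitter left after removing the trigger jitter.

    Two DTC conversions (adjacent symbols) share the total variance
    2 * dtc_limit^2, from which the oscillator period jitter is subtracted.
    """
    total_var = 2.0 * dtc_limit_ps ** 2
    return math.sqrt((total_var - period_jitter_ps ** 2) / 2.0)


delta_ps = 119.0
dtc_limit_ps = 0.06 * delta_ps          # 0.06 LSB of the D16PPM time step
budget_ps = dtc_jitter_budget_ps(dtc_limit_ps, period_jitter_ps=2.04)
print(f"DTC limit = {dtc_limit_ps:.2f} ps, remaining budget = {budget_ps:.2f} ps")
```

Because the oscillator contribution enters as a variance, a 2.04-ps period jitter consumes only about 0.15 ps of the 7.14-ps limit.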

In addition, the impact of phase noise with different offset frequencies on the period jitter can also be quantified. First, the cumulative noise ratio, denoted as  $r_n(f_u)$ , can be defined as follows:

$$r_n(f_u) = \frac{2 \cdot \int_0^{f_u} \frac{\sin^2(\pi f T_s)}{\pi^2 f^2} S_n(f) \, \mathrm{d}f}{\sigma_{\phi \text{osc}}^2(T_s)} \tag{19}$$

where $r_n(f_u)$ denotes the proportion of the phase noise accumulated from zero offset frequency to $f_u$, in relation to the total phase noise for the period jitter. For the free-running oscillator depicted in Fig. 6, the calculated $r_n(f_u)$ is illustrated in Fig. 7. This figure also highlights the individual contributions of the white noise and flicker noise. As observed from Fig. 7, the predominant contributor to the period jitter is the white noise in the frequency range exceeding 1 MHz. Notably, the noise in the frequency band above 100 MHz still accounts for nearly 40% of the total. Therefore, if a crystal oscillator were implemented, it would be necessary to design a PLL with a bandwidth close to the oscillation frequency, which is challenging while giving a limited benefit in return.
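The white-noise part of (19) can be evaluated numerically as a rough cross-check of the trend described above. The sketch below is a simplification: it assumes a flat $S_n(f)$ and ignores the flicker noise, so the constant PSD cancels in the ratio and the denominator of (19) reduces to $T_s$ (using $\int_0^\infty \sin^2(ax)/x^2 \, \mathrm{d}x = \pi a/2$). The function name and step count are illustrative.

```python
import math


def white_noise_cumulative_ratio(f_u_hz: float, t_s: float = 1.0 / 300e6,
                                 steps: int = 20000) -> float:
    """r_n(f_u) of Eq. (19) for a flat S_n(f), via midpoint integration.

    For flat S_n, the full integral 2 * int_0^inf sin^2(pi f T)/(pi f)^2 df
    equals T, so the ratio is (2 / T) * int_0^f_u sin^2(pi f T)/(pi f)^2 df.
    """
    df = f_u_hz / steps
    acc = 0.0
    for k in range(steps):
        f = (k + 0.5) * df  # midpoint avoids the removable singularity at f = 0
        acc += math.sin(math.pi * f * t_s) ** 2 / (math.pi * f) ** 2
    return 2.0 * acc * df / t_s


r_1mhz = white_noise_cumulative_ratio(1e6)
r_100mhz = white_noise_cumulative_ratio(100e6)
print(f"accumulated below 1 MHz:   {r_1mhz:.4f}")
print(f"accumulated below 100 MHz: {r_100mhz:.3f}")
print(f"remaining above 100 MHz:   {1.0 - r_100mhz:.3f}")
```

Under this white-noise-only assumption, less than 1% of the variance accumulates below 1 MHz and roughly 40% remains above 100 MHz, in line with the observation from Fig. 7.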

In conclusion, while the performance of the DTC in the transmitter is critical for the proposed modulation scheme, it is still feasible to eliminate the crystal oscillator. The proposed modulation scheme offers the potential to reduce the system's volume without compromising the needed timing performance.

## III. CIRCUITS IMPLEMENTATION

### A. Transmitter Architecture and Timing Diagram

The diagram of the proposed digital edge-combine transmitter is shown in Fig. 8. In contrast to the open-loop delay lines commonly adopted in prior work, this study proposes a 42-stage bi-phase RO in the transmitter to generate the edges for combination. The oscillation frequency of the RO is regulated and calibrated by a current digital-to-analog converter (DAC). This current DAC uses a design with folded cascode mirrors and source degeneration resistors to minimize the impact of the noise. Calibrating the RO frequency also directly calibrates the delay cells within the RO, presenting a simpler approach compared with the switched capacitor method typically used for open-loop delay lines.

Fig. 9. Timing diagram of the transmitter. The hybrid modulation scheme is realized by the PG, PS, and PAs efficiently.

Fig. 10. Diagram of the PG module. The PG is highly symmetrical to realize a precise timing control during the edge combination.

The RO generates a total of 84 phases. One in every three phases is routed out for edge combination. In every TX cycle, the PG combines the 28 phases from the RO into six pulses. The position of the six pulses is determined by the Pulse Gen Ctrl (PGC) according to the 4-bit PPC code and the 1-bit PC code. The six output pulses from the PG are fed into three PAs. Each PA consists of an nMOS switch array and a pMOS switch array that are controlled by their gate voltages, GN and GP, respectively. When one of the MOS switches is turned on, it pulls up or pulls down the dc voltage at the output to form the TX wave. The drive strength of the switch arrays is also configurable to fine-tune the envelope of the TX pulses. A pair of masks, PM and NM, are used to control which portion of the six pulses from the PG reaches the gates of the MOS switches. PM and NM are generated by the pulse shaper (PS) according to the six pulses from the PG, the PC, and the PWM code. Fig. 9 illustrates the detailed timing diagram. Within the PG, the PC introduces a 180° phase shift of the six output pulses. Subsequently, the PS detects every rising and falling edge of the six pulses to generate the mask signals, PM1-3 and NM1-3, for the three PAs, respectively. The number of pulses in the mask is decided by the PWM code. The PC determines the sequence of NM and PM, which in turn controls the activation order of the pMOS and nMOS switches. As a result, the three PAs are successively activated, each generating two or three sub-pulses with different phases in a single TX cycle. Ultimately, the sum of the three PA outputs is the modulated TX wave.
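The phase counts quoted above tie directly to the modulation timing in Section II-A. The following minimal sketch makes that link explicit; it assumes that the RO oscillates at the 300-MHz PRF and that its 84 phases are uniformly spaced over one period, and the variable names are illustrative.

```python
RO_STAGES = 42
RO_FREQ_HZ = 300e6       # assumption: the RO runs at the 300-MHz PRF
ROUTE_EVERY = 3          # one in every three phases is routed out

total_phases = 2 * RO_STAGES              # bi-phase: two outputs per stage
routed_phases = total_phases // ROUTE_EVERY

ro_period_s = 1.0 / RO_FREQ_HZ
phase_step_s = ro_period_s / total_phases         # spacing of adjacent phases
routed_step_s = ROUTE_EVERY * phase_step_s        # spacing of routed phases

print(f"{total_phases} phases, {routed_phases} routed out")
print(f"phase step  = {phase_step_s * 1e12:.1f} ps")
print(f"routed step = {routed_step_s * 1e12:.0f} ps")
```

Under these assumptions, the spacing of the routed phases equals the 119-ps D16PPM time step, so tuning the RO frequency tunes the pulse-position step at the same time.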

#### B. Pulse Gen and Pulse Gen Ctrl

The PG combines the 28 phases into six pulses in every TX cycle. The detailed diagram of the PG is illustrated in Fig. 10. There are four layers of logic gates.

Fig. 11. Diagram of the PGC. An XNOR gate is used to generate the control signal G. The four-layer selection circuit generates the control signals O1–O14.

Fig. 12. Design of the PS. (a) FED and the RED. (b) Diagram of the PS with an edge-detector chain of FEDs and REDs.

Fig. 13. Microphoto of the chip die.

Fig. 14. Measured performance of the free-running RO. (a) DSB phase noise. (b) Timing jitter.

In the first layer, 28 MUXs select a combination of the 28 RO phases from two options according to the PC and the PPC. This selection mechanism enables the 180° phase shift of the PG output upon changes in the PC, as indicated in Fig. 9. The second layer generates 14 individual pulses from the input RO phases. XOR gates are widely used to combine two edges into a single pulse. However, conventional XOR gates, comprising multiple NAND gates, suffer from imprecise timing control, since the input-to-output delays across different inputs are usually unequal. The proposed PG instead generates a pulse by performing a NOR operation on a phase and its adjacent anti-phase. No additional inverter is needed, since the anti-phases of the 28 phases are naturally within the 28 phases. In the third layer, 14 MUXs select six consecutive pulses from the 14 outputs of the second layer, according to the PPC and the PC. The fourth layer combines the six selected pulses using symmetric NOR gates and NAND gates to improve the symmetry of the input-to-output delay across different pathways.
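The NOR-based pulse generation in the second layer can be illustrated with an idealized discrete-time model. The sketch below assumes 28 evenly spaced, 50%-duty square-wave phases, so that the inverse of phase k is simply phase k + 14, and it assumes that the "adjacent" anti-phase is that of phase k + 1. The time resolution and all function names are hypothetical, and gate delays and mismatch are ignored.

```python
N_PHASES = 28          # phases routed out of the RO
TICKS_PER_PHASE = 10   # arbitrary time resolution per phase step (assumption)
PERIOD = N_PHASES * TICKS_PER_PHASE


def phase(k: int, t: int) -> int:
    """Ideal 50%-duty square wave, delayed by k phase steps."""
    return 1 if (t - k * TICKS_PER_PHASE) % PERIOD < PERIOD // 2 else 0


def anti_phase_index(k: int) -> int:
    """The inverse of phase k is already available as phase k + 14."""
    return (k + N_PHASES // 2) % N_PHASES


def nor_pulse(k: int, t: int) -> int:
    """NOR of phase k and the anti-phase of its neighbour k + 1."""
    a = phase(k, t)
    b = phase(anti_phase_index((k + 1) % N_PHASES), t)
    return 0 if (a or b) else 1


def pulse_ticks(k: int) -> list:
    """Time ticks within one period where the NOR output is high."""
    return [t for t in range(PERIOD) if nor_pulse(k, t)]
```

In this idealized model every `pulse_ticks(k)` is exactly one phase step wide, and no inverter is needed because the anti-phase is taken from the existing set of 28 phases.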

As discussed in Section II-B, the performance of the equivalent DTC in the transmitter is pivotal to the overall system performance. In the proposed transmitter, this equivalent DTC is primarily realized by the PG. To mitigate the non-linearity of the PG, transmission-gate-based MUXs are used in layers 1 and 3 to minimize the mismatch, offering an advantage over traditional inverter-based MUXs. Furthermore, all the logic gates in the PG use transistors that are several times larger than those typically used in standard digital circuits. According to post-layout Monte Carlo simulation results, the LSB standard deviation of the equivalent DTC due to local mismatch is only 3.6 ps.

Fig. 11 depicts the architecture of the PGC, which generates all the MUX control signals in the PG. An XNOR gate generates the control signal G for the 28 MUXs in the first layer of the PG. A four-layer selection circuit with two types of selection cells generates O1–O14 for the 14 MUXs in the third layer of the PG. Separate control signals C1–C4 are applied to each layer. With all the MUXs in the PG as its load, the PGC consumes a total power of 42 μW, with an input-to-output delay of only 120 ps.

#### C. Pulse Shaper

The PS detects every rising and falling edge of the six pulses from the PG and generates the PM1-3 and NM1-3 mask signals for the PAs. To realize this function, an SR NOR latch-based falling ED (FED) and an SR NAND latch-based rising ED (RED) are designed, as shown in Fig. 12(a). The behavior of the FED and RED is controlled by the CLR signal and the SET signal. When CLR is 1 and SET is 0, the output flips to 0 at the targeted edge. When CLR is 0 and SET is 1, the output flips to 1 at the targeted edge.
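The CLR/SET behavior described above can be captured in a small behavioral model. This is only a sketch: it works on sampled logic levels rather than on the SR-latch circuits of Fig. 12(a), the class name `EdgeDetector` and its interface are hypothetical, and holding the output for the CLR/SET combinations not covered in the text is an assumption.

```python
class EdgeDetector:
    """Behavioral model of an edge detector (ED); not a gate-level model."""

    def __init__(self, edge: str, initial_output: int = 0):
        if edge not in ("rising", "falling"):
            raise ValueError("edge must be 'rising' or 'falling'")
        self.edge = edge          # 'rising' models the RED, 'falling' the FED
        self.out = initial_output
        self._prev = None         # previous input sample

    def step(self, x: int, clr: int, set_: int) -> int:
        """Feed one input sample and return the (possibly updated) output."""
        prev, self._prev = self._prev, x
        if prev is None:
            return self.out       # no edge can be seen on the first sample
        target = (0, 1) if self.edge == "rising" else (1, 0)
        if (prev, x) == target:
            if clr == 1 and set_ == 0:
                self.out = 0      # clear on the targeted edge
            elif clr == 0 and set_ == 1:
                self.out = 1      # set on the targeted edge
            # other CLR/SET combinations: hold (assumption, not in the text)
        return self.out
```

Driving a rising-edge instance with the samples 0, 0, 1, 1, 0 while CLR = 0 and SET = 1 sets the output to 1 at the 0-to-1 transition and holds it afterwards; a later rising edge with CLR = 1 and SET = 0 returns it to 0.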

The PS consists of an edge-detector chain incorporating the proposed FEDs and REDs. In every transmit cycle, each ED enables the subsequent ED. The outputs of all the EDs first change to 1 at their targeted edges. A return path enables a second detection round, causing the outputs to revert to 0 at their targeted edges. The masks are generated at the outputs of the EDs. A MUX controlled by the PWM code decides which rising-edge detection is fed back to the first ED. This control mechanism adjusts the duration of the PM and NM signals, thereby realizing the PWM control. In addition, three MUXs, under the control of the PC, decide the order of NM and PM, enabling the DBPSK modulation.

### IV. EXPERIMENTAL RESULTS

The proposed IR-UWB transmitter was fabricated in 40-nm CMOS technology, occupying a silicon area of 0.058 mm<sup>2</sup>. A microphotograph of the die is shown in Fig. 13.

The measured performance of the free-running RO is illustrated in Fig. 14. According to the DSB phase noise spectrum in Fig. 14(a), the RO has a corner frequency of about 1.25 MHz, where the phase noise is -104 dBc/Hz. Thus, the calculated period jitter by (16) and (17) is 1.64 ps. This result matches the directly measured period jitter of 1.71 ps in Fig. 14(b).

TABLE I. COMPARISON WITH RELATED WORKS

| Parameter | This Work | ISSCC 22 [34] | JSSC 22 [42] | JSSC 19 [43] | TCAS I 18 [37] | JSSC 16 [38] |
|---|---|---|---|---|---|---|
| Process | 40 nm | 28 nm | 65 nm | 65 nm | 130 nm | 65 nm |
| Frequency Band | 3.1-5 GHz | 6-9 GHz | 3.5-6 GHz | 3.1-5 GHz | 3.5-4.5 GHz | 3.1-10.6 GHz |
| Architecture | Edge Combine | Up-Conversion | Edge Combine | Edge Combine | Edge Combine | Up-Conversion |
| Modulation | D16PPM + PWM + DBPSK | 4PPM + 8PSK + 4PAM | E-MPPM | D-MPPM | OOK | BPSK |
| Modulation Order | 6-bit | 7-bit | 3-bit | 2.5-bit | 2-bit | 1-bit |
| Crystal-less TX | Yes | No | No | No | No | No |
| Data Rate | 1.80 Gbps | 1.66 Gbps | 1.125 Gbps | 0.5 Gbps | 1 Gbps | 1 Gbps |
| TX Power Consumption | 4.09 mW | 9.69 mW | 9.2 mW | 7 mW | 5 mW | 21.4 mW |
| TX Energy Efficiency | 2.3 pJ/bit | 5.8 pJ/bit | 8.2 pJ/bit | 14 pJ/bit | 5 pJ/bit | 21.4 pJ/bit |
| Area | 0.058 mm<sup>2</sup> | 0.155 mm<sup>2</sup> | - | 0.27 mm<sup>2</sup> \* | 0.04 mm<sup>2</sup> | 1.1 mm<sup>2</sup> \* |
| Max Output Amp. | 140 mV | 120 mV | 200 mV | 200 mV | 50 mV | 80 mV |
| Tissue Type | 18 mm skin/fat | 15 mm skin/fat | No Tissue | No Tissue | - | No Tissue |
| TX Antenna Gain | 2.6 dBi | -8.5 dBi \*\* | 3 dBi | 3 dBi | - | 1.5 dBi |
| RX Antenna Gain | 3.5 dBi | 5 dBi | 3 dBi | 3 dBi | - | 1.5 dBi |
| Transmission Range | 15 cm (10<sup>-4</sup> BER)<br>20 cm (10<sup>-3</sup> BER) | 2 cm @ 1.66 Gbps (10<sup>-4</sup> BER)<br>15 cm @ 1.43 Gbps (10<sup>-4</sup> BER) \*\*\* | 2 m (10<sup>-3</sup> BER) | 1 m (10<sup>-3</sup> BER) | - | 1 m (10<sup>-3</sup> BER) |

\* Estimated from the chip die photo. \*\* Attenuation of a human head model included. \*\*\* 4PAM decreased to 2PAM to ensure the transmission range.

Fig. 15. Measured TX output waveform with different modulation codes.

Regarding the power supply rejection, the measurement results show that a 20-mV drift in the supply voltage of the current DAC results in a 0.12-MHz frequency drift of the RO. This translates to a mere 1.33-ps deviation in the oscillation period. As a result, the precision afforded by the free-running RO is sufficient for the timing requirements of the proposed hybrid modulation scheme.
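The 1.33-ps figure can be checked with a first-order sensitivity estimate. The relation below assumes a nominal RO frequency $f_0$ of about 300 MHz, which is not stated in this section but is inferred here from the 1.8-Gbps data rate at 6 bits per TX cycle; the symbols $f_0$, $\Delta f$, and $\Delta T$ are introduced only for this illustration.

$$
\Delta T \approx \left|\frac{\mathrm{d}T}{\mathrm{d}f}\right| \Delta f
= \frac{\Delta f}{f_0^{2}}
= \frac{0.12\ \text{MHz}}{(300\ \text{MHz})^{2}}
\approx 1.33\ \text{ps}
$$

Under this assumption, the estimate agrees with the period deviation quoted above.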

The TX output waveform, measured by a high-speed oscilloscope, is shown in Fig. 15. The TX pulses with different PPC, PC, and PWM codes are also illustrated. The measured TX power is approximately -12.4 dBm. The TX power spectrum also satisfies the FCC UWB mask (Fig. 16).

The power consumption break-down is illustrated in Fig. 17. The current DAC and the RO consume 0.8 mW. The PAs consume 2.15 mW. The pulse generator and other logic consume 1.14 mW. As a result, the 1.8-Gbps transmitter consumes a total power of only 4.09 mW, realizing an energy efficiency of 2.3 pJ/bit.
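As a quick consistency check on the figures above, the block powers can be summed and divided by the data rate. The symbols $P_{\text{total}}$ and $E_{\text{bit}}$ are introduced here only for illustration.

$$
P_{\text{total}} = 0.8 + 2.15 + 1.14 = 4.09\ \text{mW}, \qquad
E_{\text{bit}} = \frac{P_{\text{total}}}{R} = \frac{4.09\ \text{mW}}{1.8\ \text{Gbps}} \approx 2.27\ \text{pJ/bit}
$$

The result rounds to the reported 2.3 pJ/bit.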

Fig. 16. Measured TX power spectrum.

Fig. 17. Power consumption break-down.

To quantify the modulation performance of the transmitter, a demodulation algorithm based on the demodulator in Fig. 2(b) is applied to the measured TX transient signal. The performance of the equivalent DTC for D16PPM is presented in Fig. 18. The equivalent DTC achieves a maximum differential non-linearity (DNL) of 11.7 ps and a maximum integral non-linearity (INL) of 15.5 ps. Regarding the timing noise, the measured jitter of the DTC is only 5.59 ps, which satisfies the design requirements in Section II-B.
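For readers who want to reproduce this kind of linearity figure from their own measurements, the sketch below computes DNL and INL in picoseconds from a list of mean DTC delays. It is not the authors' procedure: the endpoint-fit definition of the LSB and the function names are assumptions, and the numbers in the usage note are made up.

```python
from typing import List, Tuple


def dnl_inl_ps(delays_ps: List[float]) -> Tuple[List[float], List[float]]:
    """Return (DNL, INL) in ps for DTC delays listed in input-code order.

    The LSB is taken as the average step (endpoint fit), which is an assumed
    convention; INL[k] is the deviation of code k from the endpoint line.
    """
    if len(delays_ps) < 2:
        raise ValueError("need at least two codes")
    n_steps = len(delays_ps) - 1
    lsb = (delays_ps[-1] - delays_ps[0]) / n_steps
    dnl = [delays_ps[k + 1] - delays_ps[k] - lsb for k in range(n_steps)]
    inl = [delays_ps[k] - (delays_ps[0] + k * lsb)
           for k in range(len(delays_ps))]
    return dnl, inl


def worst_case(values: List[float]) -> float:
    """Largest magnitude in a DNL or INL list."""
    return max(abs(v) for v in values)
```

With the hypothetical delays `[0, 100, 210, 300]` ps, the average step is 100 ps, the DNL list is `[0, 10, -10]` ps, and the worst-case INL is 10 ps at the third code.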

The performance of PWM and DBPSK is illustrated in Fig. 19. For PWM, the standard deviation of the pulsewidth is approximately 10 ps. For DBPSK, $\Delta IQ$, which is the difference between adjacent IQ values, is plotted in the constellation diagram. Since the free-running RO has a frequency offset, the $\Delta IQ$ points form several full circles on the constellation diagram. There are three circles on the outside, which is due to the cross modulation from the combination of adjacent PWM codes. Nevertheless, the $\Delta IQ$ points from different DBPSK codes are clearly separated.

Fig. 18. Measured equivalent DTC performance. (a) Measured linearity results. (b) Output histogram of the DTC with different input codes.

Fig. 19. Measured pulsewidth histogram and DBPSK $\Delta IQ$ constellation diagram.

Fig. 20. Ex vivo TRX test. (a) Experimental setup. (b) Measured BER versus distance curve.

The DBPSK code can be decided by checking whether the $\Delta IQ$ point is located inside the circle of radius 1. For the error-vector magnitude (EVM) calculation, one approach is to set the reference position to the origin for $\Delta IQ$ points with DBPSK = 0. For DBPSK = 1, the reference position is set to the circle where the $\Delta IQ$ points would be if there were no frequency deviation. Thus, the calculated EVM is -21.2 dB.
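The decision rule and the reference positions described above can be sketched as follows. The radius of the reference circle (`ref_radius`), the normalization of the RMS error by that radius, and all function names are assumptions made for illustration; the text does not specify them, so the sketch is not expected to reproduce the reported EVM.

```python
import math
from typing import Iterable


def dbpsk_decide(delta_iq: complex, threshold: float = 1.0) -> int:
    """Return 0 if the point lies inside the threshold circle, else 1."""
    return 0 if abs(delta_iq) < threshold else 1


def error_magnitude(delta_iq: complex, ref_radius: float) -> float:
    """Distance to the reference: the origin for code 0, a circle for code 1."""
    if dbpsk_decide(delta_iq) == 0:
        return abs(delta_iq)
    return abs(abs(delta_iq) - ref_radius)


def evm_db(points: Iterable[complex], ref_radius: float) -> float:
    """RMS error normalized to ref_radius, in dB (normalization is assumed)."""
    errs = [error_magnitude(p, ref_radius) for p in points]
    rms = math.sqrt(sum(e * e for e in errs) / len(errs))
    return 20.0 * math.log10(rms / ref_radius)
```

Measuring the code-1 error radially, rather than against a fixed constellation point, keeps the rotation caused by the frequency offset of the free-running RO out of the error term.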

To test the transcutaneous transmission, an ex vivo TRX test was conducted with 18-mm pork tissue applied, as shown in Fig. 20. A small-size 2.6-dBi antenna (NN01-107 from Ignion) was used as the TX antenna. The discrete RX system consisted of a 3.5-dBi antenna, an RF filter (VHF-2700+ from Mini-Circuits), a 1.6-dB-NF LNA (QPM1000 from Qorvo), and a high-speed oscilloscope (DSAZ594A from Keysight) operated as an ADC. The received data were decoded by the demodulator shown in Fig. 2. The measured bit error rate (BER) versus the transmission distance is plotted in Fig. 20(b). At a distance of 20 cm, the measured total path loss is about 45 dB, where the BER is approximately $10^{-3}$. Consequently, the proposed transmitter can provide stable transcutaneous transmission for the targeted high-density neural implants.
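For a rough sense of the signal level at the receiver, the measured TX power of about -12.4 dBm can be combined with the 45-dB path loss. The estimate below assumes that the total path loss is referenced to the TX output power and already includes the antenna gains and the tissue; the symbols $P_{\text{TX}}$, $P_{\text{RX}}$, and $L_{\text{path}}$ are introduced only for this illustration.

$$
P_{\text{RX}} \approx P_{\text{TX}} - L_{\text{path}} = -12.4\ \text{dBm} - 45\ \text{dB} = -57.4\ \text{dBm}
$$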

Table I compares the proposed UWB transmitter with related low-power, short-range UWB transmitters in prior art. The proposed transmitter achieves a data rate of 1.8 Gbps at 4.09 mW, i.e., 2.3 pJ/bit, giving the highest data rate, the lowest power consumption, and the best energy efficiency among the works listed in the table.
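The comparison can be re-derived from the data-rate and power rows of Table I. The sketch below copies those two rows into a dictionary; the names `TABLE_I`, `energy_per_bit_pj`, and `best_by` are hypothetical helpers for illustration.

```python
# (data rate in Gbps, TX power in mW), copied from Table I
TABLE_I = {
    "This Work": (1.80, 4.09),
    "ISSCC 22 [34]": (1.66, 9.69),
    "JSSC 22 [42]": (1.125, 9.2),
    "JSSC 19 [43]": (0.5, 7.0),
    "TCAS I 18 [37]": (1.0, 5.0),
    "JSSC 16 [38]": (1.0, 21.4),
}


def energy_per_bit_pj(rate_gbps: float, power_mw: float) -> float:
    """mW divided by Gbps gives pJ/bit directly (1e-3 W / 1e9 bit/s)."""
    return power_mw / rate_gbps


def best_by(metric) -> str:
    """Name of the Table I entry that minimizes the given metric."""
    return min(TABLE_I, key=lambda name: metric(*TABLE_I[name]))
```

Calling `best_by(energy_per_bit_pj)`, `best_by(lambda r, p: p)`, and `best_by(lambda r, p: -r)` each returns `"This Work"`, matching the three claims for energy efficiency, power consumption, and data rate.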

### V. CONCLUSION

This article proposed a hybrid modulation scheme combining D16PPM, PWM, and DBPSK to achieve a high data rate while enabling transcutaneous data transmission for high-density neural implants. The hybrid modulation scheme achieves a total modulation order of 6. A detailed theoretical analysis of the SNR and SER relationship, as well as of the timing noise that affects the modulation, is presented. The modulation scheme is realized by a crystal-less all-digital transmitter. An RO provides the phases for edge combination, which makes calibrating the delay cells simpler than with the switched capacitors in conventional open-loop delay lines. The transmitter features a pulse generator that precisely combines the edges and a PS with an ED chain, achieving high efficiency. The transmitter achieves a data rate of 1.8 Gbps with a power consumption of only 4.09 mW, corresponding to an energy efficiency of 2.3 pJ/bit. In the ex vivo test, a transcutaneous transmission range of 20 cm was measured with 18-mm pork tissue applied. Consequently, the proposed UWB transmitter shows high potential for high-density neural implants.

### REFERENCES

- [1] E. Musk and Neuralink, "An integrated brain-machine interface platform with thousands of channels," *J. Med. Internet Res.*, vol. 21, no. 10, Oct. 2019, Art. no. e16194.
- [2] J. Abbott et al., "A nanoelectrode array for obtaining intracellular recordings from thousands of connected neurons," *Nature Biomed. Eng.*, vol. 4, no. 2, pp. 232–241, Sep. 2019.
- [3] X. Yuan, A. Hierlemann, and U. Frey, "Extracellular recording of entire neural networks using a dual-mode microelectrode array with 19 584 electrodes and high SNR," *IEEE J. Solid-State Circuits*, vol. 56, no. 8, pp. 2466–2475, Aug. 2021.
- [4] N. A. Steinmetz et al., "Neuropixels 2.0: A miniaturized high-density probe for stable, long-term brain recordings," *Science*, vol. 372, no. 6539, Apr. 2021, Art. no. eabf4588, doi: 10.1126/science.abf4588.
- [5] P.-M. Wang et al., Challenges Design Large-Scale, High-Density, Wireless Stimulation Recording Interface. Cham, Switzerland: Springer, 2020, pp. 1–28, doi: 10.1007/978-3-030-34467-2\_1.
- [6] A. B. Rapeaux and T. G. Constandinou, "Implantable brain machine interfaces: First-in-human studies, technology challenges and trends," *Current Opinion Biotechnol.*, vol. 72, pp. 102–111, Dec. 2021. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S095816692100183X
- [7] X. Liu et al., "A fully integrated wireless compressed sensing neural signal acquisition system for chronic recording and brain machine interface," *IEEE Trans. Biomed. Circuits Syst.*, vol. 10, no. 4, pp. 874–883, Aug. 2016.

- [8] S.-J. Kim et al., "A sub-μW/Ch analog front-end for δ-neural recording with spike-driven data compression," *IEEE Trans. Biomed. Circuits Syst.*, vol. 13, no. 1, pp. 1–14, Feb. 2019.
- [9] Z. Zhang and T. G. Constandinou, "Adaptive spike detection and hardware optimization towards autonomous, high-channel-count BMIs," *J. Neurosci. Methods*, vol. 354, Apr. 2021, Art. no. 109103. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0165027021000388
- [10] T. Wu, W. Zhao, E. Keefer, and Z. Yang, "Deep compressive autoencoder for action potential compression in large-scale neural recording," *J. Neural Eng.*, vol. 15, no. 6, Dec. 2018, Art. no. 066019.
- [11] N. Even-Chen et al., "Power-saving design opportunities for wireless intracortical brain–computer interfaces," *Nature Biomed. Eng.*, vol. 4, no. 10, pp. 984–996, Aug. 2020.
- [12] M. A. A. Ibrahim and M. Onabajo, "A low-power BFSK transmitter architecture for biomedical applications," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 67, no. 5, pp. 1527–1540, May 2020.
- [13] M.-C. Lee et al., "A CMOS MedRadio transceiver with supply-modulated power saving technique for an implantable brain-machine interface system," *IEEE J. Solid-State Circuits*, vol. 54, no. 6, pp. 1541–1552, Jun. 2019.
- [14] H. Huang et al., "A 2 nJ/bit, 2.3% FSK error fully integrated sub-2.4 GHz transmitter with duty-cycle controlled PA for medical band," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 69, no. 12, pp. 5018–5029, Dec. 2022.
- [15] Y. Guo, Y. Li, Z. Weng, H. Jiang, and Z. Wang, "A 0.66 mW 400 MHz/900 MHz transmitter IC for in-body bio-sensing applications," *IEEE Trans. Biomed. Circuits Syst.*, vol. 16, no. 2, pp. 252–265, Apr. 2022.
- [16] P. Yeon, S. A. Mirbozorgi, J. Lim, and M. Ghovanloo, "Feasibility study on active back telemetry and power transmission through an inductive link for millimeter-sized biomedical implants," *IEEE Trans. Biomed. Circuits Syst.*, vol. 11, no. 6, pp. 1366–1376, Dec. 2017.
- [17] V. W. Leung et al., "A CMOS distributed sensor system for high-density wireless neural implants for brain-machine interfaces," in *Proc. IEEE 44th Eur. Solid State Circuits Conf. (ESSCIRC)*, Sep. 2018, pp. 230–233.
- [18] T.-C. Yu, W.-H. Huang, and C.-L. Yang, "Design of dual frequency mixed coupling coils of wireless power and data transfer to enhance lateral and angular misalignment tolerance," *IEEE J. Electromagn., RF Microw. Med. Biol.*, vol. 3, no. 3, pp. 216–223, Sep. 2019.
- [19] W. Li, Y. Duan, and J. Rabaey, "A 200-Mb/s energy efficient transcranial transmitter using inductive coupling," *IEEE Trans. Biomed. Circuits Syst.*, vol. 13, no. 2, pp. 435–443, Apr. 2019.
- [20] T. C. Chang, M. L. Wang, J. Charthad, M. J. Weber, and A. Arbabian, "27.7 A 30.5 mm<sup>3</sup> fully packaged implantable device with duplex ultrasonic data and power links achieving 95 kb/s with <10<sup>-4</sup> BER at 8.5 cm depth," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2017, pp. 460–461.
- [21] S. F. Alamouti, M. M. Ghanbari, N. T. Ersumo, and R. Müller, "High throughput ultrasonic multi-implant readout using a machine-learning assisted CDMA receiver," in *Proc. 42nd Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. (EMBC)*, Jul. 2020, pp. 3289–3292.
- [22] T. Bos, M. Verhelst, and W. Dehaene, "A flexible end-to-end dual ASIC transceiver for OFDM ultrasound in-body communication," in *Proc. IEEE Biomed. Circuits Syst. Conf. (BioCAS)*, Oct. 2022, pp. 21–25.
- [23] K. A. Ng et al., "A wireless multi-channel peripheral nerve signal acquisition system-on-chip," *IEEE J. Solid-State Circuits*, vol. 54, no. 8, pp. 2266–2280, Aug. 2019.
- [24] G. Di Patrizio Stanchieri, G. Battisti, A. De Marcellis, M. Faccio, E. Palange, and T. G. Constandinou, "A new multilevel pulsed modulation technique for low power high data rate optical biotelemetry," in *Proc. IEEE Biomed. Circuits Syst. Conf. (BioCAS)*, Oct. 2021, pp. 1–5.
- [25] A. De Marcellis, G. D. P. Stanchieri, M. Faccio, E. Palange, and T. G. Constandinou, "A 300 Mbps 37 pJ/bit pulsed optical biotelemetry," *IEEE Trans. Biomed. Circuits Syst.*, vol. 14, no. 3, pp. 441–451, Jun. 2020.
- [26] H. Cho et al., "A 79 pJ/b 80 Mb/s full-duplex transceiver and a 42.5  $\mu$ W 100 kb/s super-regenerative transceiver for body channel communication," *IEEE J. Solid-State Circuits*, vol. 51, no. 1, pp. 310–317, Jan. 2016.
- [27] B. Kim, B. Yuk, and J. Bae, "A wirelessly-powered 10 Mbps 46-pJ/b body channel communication system with 45% PCE multi-stage and multi-source rectifier for neural interface applications," in *Proc. IEEE Asian Solid-State Circuits Conf. (A-SSCC)*, Nov. 2021, pp. 1–3.

- [28] C. Shi, M. Song, Z. Gao, A. Bevilacqua, G. Dolmans, and Y.-H. Liu, "Galvanic-coupled trans-dural data transfer for high-bandwidth intracortical neural sensing," *IEEE Trans. Microw. Theory Techn.*, vol. 70, no. 10, pp. 4579–4589, Oct. 2022.
- [29] B. Chatterjee, A. Datta, M. Nath, K. G. Kumar, N. Modak, and S. Sen, "A 65 nm 63.3 μW 15 Mbps transceiver with switched-capacitor adiabatic signaling and combinatorial-pulse-position modulation for body-worn video-sensing AR nodes," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, vol. 65, Feb. 2022, pp. 276–278.
- [30] C. Lee et al., "A miniaturized wireless neural implant with body-coupled data transmission and power delivery for freely behaving animals," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, vol. 65, Feb. 2022, pp. 1–3.
- [31] H. Ando, K. Takizawa, T. Yoshida, K. Matsushita, M. Hirata, and T. Suzuki, "Wireless multichannel neural recording with a 128-Mbps UWB transmitter for an implantable brain-machine interfaces," *IEEE Trans. Biomed. Circuits Syst.*, vol. 10, no. 6, pp. 1068–1078, Dec. 2016.
- [32] Y.-J. Lin et al., "A 3.1–5.2 GHz, energy-efficient single antenna, cancellation-free, bitwise time-division duplex transceiver for high channel count optogenetic neural interface," *IEEE Trans. Biomed. Circuits Syst.*, vol. 16, no. 1, pp. 52–63, Feb. 2022.
- [33] H. Rahmani and A. Babakhani, "A wirelessly powered reconfigurable FDD radio with on-chip antennas for multi-site neural interfaces," *IEEE J. Solid-State Circuits*, vol. 56, no. 10, pp. 3177–3190, Oct. 2021.
- [34] M. Song et al., "A 1.66 Gb/s and 5.8 pJ/b transcutaneous IR-UWB telemetry system with hybrid impulse modulation for intracortical brain-computer interfaces," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, vol. 65, Feb. 2022, pp. 394–396.
- [35] M. Song, Y. Huang, H. J. Visser, J. Romme, and Y.-H. Liu, "An energy-efficient and high-data-rate IR-UWB transmitter for intracortical neural sensing interfaces," *IEEE J. Solid-State Circuits*, vol. 57, no. 12, pp. 3656–3668, Dec. 2022.
- [36] J. Lei et al., "A 1.8 Gb/s, 2.3 pJ/bit, crystal-less IR-UWB transmitter for neural implants," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2023, pp. 464–466.
- [37] M. Crepaldi, G. N. Angotzi, A. Maviglia, F. Diotalevi, and L. Berdondini, "A 5 pJ/pulse at 1-Gpps pulsed transmitter based on asynchronous logic master–slave PLL synthesis," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 65, no. 3, pp. 1096–1109, Mar. 2018.
- [38] N.-S. Kim and J. M. Rabaey, "A high data-rate energy-efficient triple-channel UWB-based cognitive radio," *IEEE J. Solid-State Circuits*, vol. 51, no. 4, pp. 809–820, Apr. 2016.
- [39] S. Geng, D. Liu, Y. Li, H. Zhuo, W. Rhee, and Z. Wang, "A 13.3 mW 500 Mb/s IR-UWB transceiver with link margin enhancement technique for meter-range communications," *IEEE J. Solid-State Circuits*, vol. 50, no. 3, pp. 669–678, Mar. 2015.
- [40] J. Ko and R. Gharpurey, "A pulsed UWB transceiver in 65 nm CMOS with four-element beamforming for 1 Gbps meter-range WPAN applications," *IEEE J. Solid-State Circuits*, vol. 51, no. 5, pp. 1177–1187, May 2016.
- [41] G. Lee, S. Lee, J.-H. Kim, and T. W. Kim, "21.1 A 1.125 Gb/s 28 mW 2 m-radio-range IR-UWB CMOS transceiver," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC)*, vol. 64, Feb. 2021, pp. 302–304.
- [42] G. Lee, J. Jang, J.-H. Kim, and T. W. Kim, "An IR-UWB CMOS transceiver with extended pulse position modulation," *IEEE J. Solid-State Circuits*, vol. 57, no. 8, pp. 2281–2291, Aug. 2022.
- [43] G. Lee, J. Park, J. Jang, T. Jung, and T. W. Kim, "An IR-UWB CMOS transceiver for high-data-rate, low-power, and short-range communication," *IEEE J. Solid-State Circuits*, vol. 54, no. 8, pp. 2163–2174, Aug. 2019.
- [44] D. D. Wentzloff and A. P. Chandrakasan, "A 47 pJ/pulse 3.1-to-5 GHz all-digital UWB transmitter in 90 nm CMOS," in *Proc. IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2007, pp. 118–591.
- [45] T. Haapala and K. Halonen, "A fully integrated digitally programmable pulse shaping 6.0–8.5 GHz UWB IR transmitter front-end for energy harvesting applications," in *Proc. IEEE Int. Symp. Circuits Syst. (ISCAS)*, May 2018, pp. 1–5.
- [46] X. Tong, D. An, and J. Li, "Area- and energy-efficient sub-GHz impulse radio UWB transmitter with output amplitude enhancement for biomedical implants," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 68, no. 6, pp. 1807–1811, Jun. 2021.
- [47] V. Kulkarni, M. Muqsith, H. Ishikuro, and T. Kuroda, "A 750 Mb/s 12 pJ/b 6-to-10 GHz digital UWB transmitter," in *Proc. IEEE Custom Integr. Circuits Conf.*, Sep. 2007, pp. 647–650.

- [48] J. K. Brown, K.-K. Huang, E. Ansari, R. R. Rogel, Y. Lee, and D. D. Wentzloff, "An ultra-low-power 9.8 GHz crystal-less UWB transceiver with digital baseband integrated in 0.18 μm BiCMOS," in *Proc. IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2013, pp. 442–443.
- [49] X. Y. Wang, R. K. Dokania, and A. B. Apsel, "A crystal-less self-synchronized bit-level duty-cycled IR-UWB transceiver system," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 60, no. 9, pp. 2488–2501, Sep. 2013.
- [50] H. Rahmani and A. Babakhani, "An integrated battery-less wirelessly powered RFID tag with clock recovery and data transmitter for UWB localization," in *IEEE MTT-S Int. Microw. Symp. Dig.*, Aug. 2020, pp. 460–463.
- [51] N. V. Kokkalis, P. T. Mathiopoulos, G. K. Karagiannidis, and C. S. Koukourlis, "Performance analysis of M-ary PPM TH-UWB systems in the presence of MUI and timing jitter," *IEEE J. Sel. Areas Commun.*, vol. 24, no. 4, pp. 822–828, Apr. 2006.
- [52] M. Crepaldi and P. R. Kinget, "Error ratio model for synchronised-OOK IR-UWB receivers in AWGN channels," *Electron. Lett.*, vol. 49, no. 1, pp. 25–27, Jan. 2013.
- [53] X. Shi et al., "BER performance analysis of non-coherent Q-ary pulse position modulation receivers on AWGN channel," *Sensors*, vol. 21, no. 18, p. 6102, Sep. 2021.
- [54] J. G. Proakis, Digital Communications. New York, NY, USA: McGraw-Hill, 2008.
- [55] O. Novak and R. B. Brown, "An empirical model of UWB large-scale signal fading in neocortical research," in *Proc. IEEE Int. Symp. Circuits Syst. (ISCAS)*, May 2016, pp. 2439–2442.
- [56] A. Demir, "Computing timing jitter from phase noise spectra for oscillators and phase-locked loops with white and 1/f noise," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 53, no. 9, pp. 1869–1884, Sep. 2006.



**Jiaxin Lei** (Student Member, IEEE) received the B.S. degree in electronic engineering from Tsinghua University, Beijing, China, in 2019, where he is currently pursuing the Ph.D. degree, focusing on low-power circuit and system design.

His research interests include low-power circuit and system design for biomedical applications.



**Heng Huang** received the B.S. and Ph.D. degrees in electronic engineering from Tsinghua University, Beijing, China, in 2016 and 2023, respectively. His current research interests include mixed-signal and RF integrated circuits.



**Xiaoyan Ma** received the B.S. degree in telecommunication engineering from Beijing University of Post and Telecommunication, Beijing, China, in 2005, and the M.S. degree in mobile communication from China Academy of Telecommunication Technology, Beijing, in 2008.

She is currently with Beijing Ningju Technology Ltd., Beijing, where she is involved in circuit design.



**Junliang Wei** received the B.S. degree in communication engineering from Huanghe Science and Technology University, Zhengzhou, China, in 2006, and the M.S. degree in electronics and communication engineering from Beijing University of Technology, Beijing, China, in 2012.

His current research interests include transceiver and RF circuit design.



**Xiliang Liu** received the B.S. degree in biomedical engineering from Shanghai Jiao Tong University, Shanghai, China, in 2012, and the M.S. degree in integrated circuit engineering from Tsinghua University, Beijing, China, in 2015.

He is currently with Beijing Ningju Technology Ltd., Beijing, where he is involved in mixed-signal and RF integrated circuits. His research is on low-power high-efficiency transceivers and digital phase-locked loops.



**Wei Song** (Graduate Student Member, IEEE) received the B.S. and Ph.D. degrees in electronic engineering from Tsinghua University, Beijing, China, in 2017 and 2023, respectively.

His current research interests include various kinds of integrated circuit designs and electronic system designs, such as brain-machine interface.



**Milin Zhang** (Senior Member, IEEE) received the B.S. and M.S. degrees in electronic engineering from Tsinghua University, Beijing, China, in 2004 and 2006, respectively, and the Ph.D. degree from the Electronic and Computer Engineering Department, The Hong Kong University of Science and Technology (HKUST), Hong Kong.

She is an Associate Professor with the Department of Electronic Engineering, Tsinghua University, Beijing, China. After the Ph.D. studies, she worked as a Post-Doctoral Researcher with the University of Pennsylvania (UPenn), Philadelphia, PA, USA. She joined Tsinghua University in 2016. Her research interests include the design of biomedical sensing-oriented SoCs and various nontraditional imaging sensors.

Dr. Zhang serves and has served as a TPC Member for ISSCC, CICC, A-SSCC, and CASS. She received the Best Paper Award of the BioCAS Track of the 2014 International Symposium on Circuits and Systems (ISCAS), the Best Paper Award (first place) of the 2015 Biomedical Circuits and Systems Conference (BioCAS), and the Best Student Paper Award (second place) of ISCAS 2017. She is the Chapter Chair of the SSCS Beijing Chapter.