

Received 28 May 2023, accepted 10 June 2023, date of publication 19 June 2023, date of current version 28 June 2023.

Digital Object Identifier 10.1109/ACCESS.2023.3287776

# **RESEARCH ARTICLE**

# **Bandwidth, Power and Carrier Configuration Resilient Neural Networks Digital Predistorter**

# ASMA ALI AND OUALID HAMMI, (Member, IEEE)

Department of Electrical Engineering, American University of Sharjah, Sharjah, United Arab Emirates

Corresponding author: Oualid Hammi (ohammi@aus.edu)

This work was supported by the Research Office at the American University of Sharjah under Grant FRG20-M-E85.

**ABSTRACT** This paper proposes a neural networks predistorter based on the bidirectional long-short-term memory (BiLSTM) structure. The proposed predistorter was trained while ensuring that it captures the full intrinsic behavior of the device under test including its memory effects and nonlinear distortions. For this purpose, the device under test was characterized while operating at peak power level with a test signal that emulates strong memory effects. Extensive experimental validation carried out on a commercial Gallium Nitride power amplifier prototype demonstrated the ability of the proposed predistorter to maintain standard-compliant adjacent channel leakage ratio over a wide range of operating conditions including operating average power, signal bandwidth, and carriers' configurations. It has been shown that a digital predistorter (DPD) derived from one single training condition was able to linearize the device under test for 72 different test conditions with signal bandwidths between 10MHz and 40MHz, and an operating power range of 5dB. Furthermore, benchmarking results showed that the BiLSTM DPD is unable to maintain satisfactory performance when trained with a sub-optimal signal which does not emulate the full behavior of the device under test. Moreover, it has been shown that the use of the optimal characterization signal along with a generalized memory polynomial predistorter does not lead to satisfactory performance. Hence, the resilience of the predistorter is obtained by combining the suitable model structure with the appropriate training approach. Such a resilient DPD presents a paradigm shift in predistortion techniques which significantly reduces the need for updates. It is anticipated that this work will pave the way for a new generation of DPDs resilient to a wide range of operating conditions.

**INDEX TERMS** 5G NR, BiLSTM, bidirectional LSTM, digital predistortion, distortions, long-short-term memory (LSTM), memory effects, neural networks, nonlinearity, power amplifier, predistortion.

### I. INTRODUCTION

The fifth-generation (5G) mobile communication networks are set to revolutionize the way we communicate and access information. With the increasing demand for higher data rates, greater reliability, and lower latency, 5G systems require higher spectral efficiency, improved energy efficiency, and increased network capacity. One critical component in the radio frequency (RF) front-end is the power amplifier (PA) which can behave nonlinearly. The nonlinear behavior of RF PAs is observed when the input signal has an amplitude varying envelope which is the case of all modern communication systems. Moreover, as the bandwidth of the signals being transmitted increases, the dynamic nonlinearity of the power amplifier (also known as memory effects) becomes more pronounced.

The associate editor coordinating the review of this manuscript and approving it for publication was Vittorio Camarchia.

The nonlinear behavior of RF power amplifiers is critical since it affects the spectral and energy efficiency of the communication system. Spectral efficiency is altered by the generation of out of band energy caused by the spectrum regrowth resulting from nonlinear amplification. Energy efficiency is reduced when the power amplifier is forced to operate in its linear region in order to circumvent its nonlinear behavior. In order to minimize the impact of power amplifiers nonlinearity on the overall performance of the communication system, and meet the requirements of 5G standards, the use of linearization techniques is unavoidable. Among the various linearization techniques, digital predistortion stands out as the most suited approach for 5G base stations. This is mainly due to its ability to achieve satisfactory linearity performance for the considered applications without significant impact on the overall power efficiency.

FIGURE 1. Simplified block diagram of the BiLSTM based DPD.

The principle of digital predistortion consists of applying, in the digital domain, a nonlinear function that is complementary to the nonlinear behavior of the PA such that the cascade made of the predistorter and the amplifier behaves as a linear amplification system. Hence, the performance of the digital predistorter and its ability to cancel the distortions of the power amplifier rely heavily on the match between the predistortion function and the nonlinear characteristics of the power amplifier. This problem becomes more pronounced in the case of 5G systems where the operating conditions of the power amplifier are expected to vary often, thus requiring a very fast adaptation of the predistortion function in order to continuously ensure optimal linearization performance.
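As an illustration of this principle, the following minimal sketch uses a hypothetical memoryless amplifier model (not the DUT considered in this work) whose AM/AM compression has a closed-form inverse. The functions `toy_pa` and `toy_predistorter` and the chosen signal statistics are assumptions made only for this example.

```python
import numpy as np

def toy_pa(x):
    """Hypothetical memoryless PA: compressive AM/AM, no AM/PM, unit small-signal gain."""
    return x / np.sqrt(1.0 + np.abs(x) ** 2)

def toy_predistorter(x):
    """Closed-form inverse of toy_pa, valid for |x| < 1 (the toy PA saturation level)."""
    return x / np.sqrt(1.0 - np.abs(x) ** 2)

rng = np.random.default_rng(0)
# Complex baseband samples whose envelope stays below the saturation level of 1.
envelope = 0.9 * rng.uniform(0.0, 1.0, 1000)
phase = 2.0 * np.pi * rng.uniform(0.0, 1.0, 1000)
x = envelope * np.exp(1j * phase)

y_raw = toy_pa(x)                    # output without predistortion (compressed)
y_lin = toy_pa(toy_predistorter(x))  # cascade: predistorter followed by the PA

err_raw = np.max(np.abs(y_raw - x))  # clearly non-zero: the PA alone is nonlinear
err_lin = np.max(np.abs(y_lin - x))  # numerically negligible: the cascade is linear
print(err_raw, err_lin)
```

The cascade is linear only because the predistorter matches the amplifier exactly; any change in the amplifier behavior breaks this match, which is the motivation for the update or resilience strategies discussed next.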

Two approaches can be considered in order to maintain the match between the predistortion function and the amplifier nonlinearity. In the first approach, the predistortion function is updated following changes in the PA behavior. This allows for the use of mildly complex predistortion functions which are suitable only for a particular set of operating conditions such as a specific average power level and a given signal bandwidth. Such predistorters require frequent updates, hence this approach is often considered as computationally demanding [1], [2]. On the other hand, the use of scalable digital predistortion systems can be perceived as a better approach for ensuring continuous match between the predistorter's and the amplifier's nonlinear functions [3], [4], [5], [6], [7], [8]. Scalable predistortion systems aim at minimizing the number of predistorter parameters (for example polynomial function coefficients) that need to be updated following a change in the PA operating conditions. In [3], a bandwidth and power scalable digital predistorter using two-box structure was proposed. This model reduced the update complexity of the two-box predistorter architecture by using a set of pre-calculated memoryless lookup tables indexed as a function of the operating average power.



FIGURE 2. LSTM cell internal structure.

The second box of the model, namely the memory polynomial, was then updated to fine tune the predistortion function and ensure its scalability while updating only a small portion of the overall model parameters. Feature-based modeling has also been investigated for the design of scalable predistorters [4], [5], [6]. In [4], a complexity reduced adaptation technique was proposed to enable digital predistorters to track changes in PA behavior with reduced complexity updates. This technique was based on a pre-training of the digital predistorter over a wide range of operating conditions in order to extract common features of the PA behavior, and identify through principal components analysis technique the model parameters that need to be transformed to track changes in the PA behavior. This approach was then extended to the case of multi-input multi-output transmitters [5]. More recently, the concept of model features extraction was also applied for the linearization of active arrays power amplifiers for various beam conditions [6]. The commonality between these scalable DPD approaches is in the fact that a pre-training over a wide range of operating conditions is needed, and that the predistortion function is then updated with reduced complexity overhead. Neural networks (NN) were also used for scalable digital predistorters [7], [8]. In [7], transfer learning was used to devise a bandwidth scalable digital predistorter in which only a limited number of fine-tuning layers are identified to update the model and ensure scalability. The fine tuning layers represent a small portion of the full transfer learning neural network model. Hence, this approach was found to significantly reduce the computational cost associated with the model update while maintaining its linearization capability. In contrast, the uniform neural network model uses two neural network sub-blocks to implement a power and bandwidth scalable digital predistortion function [8]. 
The first sub-block is built for typical conditions while the second sub-block is built for scalable conditions. To ensure the accuracy of the overall neural network predistorter, the second sub-block was trained using a wide range of operating conditions for which the scalability is desired. In this prior work [3], [4], [5], [6], [7], [8], at least one sub-set of the model coefficients needs to be updated. Moreover, the model identification and training requires the use of more than one training signal [3], [4], [5], [6], [7], [8]. These two aspects will be addressed in the proposed predistorter which does not require a pre-training over several operating conditions, and does not require an update of the predistortion function.

FIGURE 3. Experimental setup (a) Simplified block diagram, (b) Photograph of the actual setup.

In this paper, a neural networks based predistorter is proposed for power amplifiers operating using 5G new radio signals under a wide range of average powers, signal bandwidths and carriers' configurations. The proposed DPD eliminates the need for frequent updates of the predistortion function following changes in the PA operating conditions. Another main advantage of the proposed model is that it is trained using a single signal, and is then able to perform satisfactorily over a wide range of average power levels, signal bandwidths and even carriers' configurations. In section II, the use of neural networks for power amplifiers linearization is discussed, and the proposed DPD model is introduced. In section III, the device under test (DUT) and the experimental setup are presented and the initial results of the proposed NN DPD are reported. Section IV thoroughly discusses the synthesis of the scalable predistortion function and provides comprehensive performance assessment analysis. Finally, the conclusions are summarized in section V.

# II. PROPOSED NEURAL NETWORKS BASED DIGITAL PREDISTORTER

Neural networks have been extensively explored in recent years and have gained popularity due to their excellent capability of acting as a black-box. Not only can they hide the intrinsic nature of the system very well, but they can also mimic the system behavior with high accuracy. Models based on real-valued time delay neural networks (RVTDNN) [9], their variation with augmented input (namely AVTDNN) [10], and other variants stemming from densely connected networks have been successfully used for digital predistortion applications. More recently, there has been a noticeable trend in using long-short-term memory (LSTM) networks. In fact, LSTM networks were found to better cater to the memory effects observed in power amplifiers behavior [11], [12], [13]. LSTM models were also employed for the modeling of Gallium Nitride (GaN) high-electron-mobility transistors [14]. Bidirectional LSTM (BiLSTM) networks are an enhanced version of the LSTM that incorporates a bidirectional dependency within the input data. BiLSTM networks have shown superior linearization capabilities when compared to the conventional LSTM [12], [13].

TABLE 1. Properties of the 5G NR signals used.

| Signal      | Carriers' Configuration        | Bandwidth<br>(MHz) | PAPR<br>(dB) |
|-------------|--------------------------------|--------------------|--------------|
| 40M_4C_1001 | 1001<br>4 carriers 10 MHz each | 40                 | 10.28        |
| 40M_4C_1101 | 1101<br>4 carriers 10 MHz each | 40                 | 10.57        |
| 40M_4C_1111 | 1111<br>4 carriers 10 MHz each | 40                 | 10.69        |
| 40M_2C      | 11<br>2 carriers 20 MHz each   | 40                 | 10.49        |
| 40M_1C      | 1<br>1 carrier 40 MHz          | 40                 | 10.45        |
| 30M_2C      | 11<br>2 carriers 15 MHz each   | 30                 | 10.50        |
| 30M_1C      | 1<br>1 carrier 30 MHz          | 30                 | 10.45        |
| 20M_4C_1101 | 1101<br>4 carriers 5 MHz each  | 20                 | 10.40        |
| 20M_2C      | 11<br>2 carriers 10 MHz each   | 20                 | 10.50        |
| 20M_1C      | 1<br>1 carrier 20 MHz          | 20                 | 10.40        |
| 10M_2C      | 11<br>2 carriers 5 MHz each    | 10                 | 10.32        |
| 10M_1C      | 1<br>1 carrier 10 MHz          | 10                 | 10.37        |

In this work, the BiLSTM structure was adopted to synthesize the digital predistortion function. The proposed model architecture is depicted in Fig. 1. The input signal was fed to the model through the signal shaping block which was adopted to generate the vector of input samples to be used to predict each output sample. In fact, when a memory depth of M samples is considered, the input vector to the BiLSTM network at instant n will be given by:

$$X_{n,M} = [\,I(n)\;\;Q(n)\;\;I(n-1)\;\;Q(n-1)\;\;\cdots\;\;I(n-M)\;\;Q(n-M)\,]$$
(1)

Typically, the input vector covers M + 1 samples, namely the current sample and the preceding M samples, each represented by its I and Q components. The vector of input samples is then applied to the BiLSTM structure. The model shown in Fig. 1 includes a single BiLSTM layer which is made of M + 1 unit cells for the forward propagation path, and another M + 1 unit cells for the backward propagation path. A standard LSTM unit cell structure was used as depicted in Fig. 2. The output of the BiLSTM layer is then fed into a deep neural network (DNN). Finally, the output of the DNN is fed into the output layer which will generate the predistorter's output signal components  $I_{out}$  and  $Q_{out}$ . The number of BiLSTM layers, the number of neurons in the DNN layer as well as the activation functions are all expected to be device dependent parameters that can be varied to ensure adequate accuracy. The number of cells in the BiLSTM layer, though, can be selected to be equal to the memory depth of the device under test.

FIGURE 4. Measured AM/AM of the DUT under two 5G NR test signals (a) 10MHz signal (10M\_1C), (b) 40MHz test signal (40M\_4C\_1001).

TABLE 2. DPD performance under matched conditions.

| Signal      | DPD<br>NMSE (dB) | ACLR 1<br>(dBc) | ACLR 2<br>(dBc) |
|-------------|------------------|-----------------|-----------------|
| 40M_4C_1001 | -50.2            | -46.8           | -45.9           |
| 40M_4C_1111 | -50.6            | -49.1           | -47.7           |
| 40M_1C      | -52.0            | -49.4           | -50.7           |
| 30M_2C      | -49.8            | -51.2           | -49.2           |
| 20M_2C      | -47.4            | -50.7           | -50.8           |
| 20M_1C      | -54.8            | -51.2           | -52.1           |
| 10M_1C      | -52.4            | -51.7           | -52.2           |
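The signal shaping step of Eq. (1) can be sketched as follows. The function name `shape_input` and the choice of arranging each input vector as M + 1 time steps with two features (I and Q), which is a common layout for recurrent layers, are assumptions of this example and not a description of the exact implementation used in this work.

```python
import numpy as np

def shape_input(iq, memory_depth):
    """Build the input vector of Eq. (1) for every instant n >= M.

    Returns an array of shape (N - M, M + 1, 2): for each instant n, one
    [I, Q] row per time step, ordered from sample n back to sample n - M.
    """
    iq = np.asarray(iq, dtype=complex)
    m = memory_depth
    n_out = len(iq) - m
    # Column k holds the sample delayed by k, that is iq[n - k] for n = m .. N - 1.
    delayed = np.stack([iq[m - k : m - k + n_out] for k in range(m + 1)], axis=1)
    return np.stack([delayed.real, delayed.imag], axis=-1)

# Toy waveform with I(n) = n and Q(n) = 10 + n, so the ordering is easy to read.
iq = np.arange(6) + 1j * (10 + np.arange(6))
X = shape_input(iq, memory_depth=2)
print(X.shape)      # (4, 3, 2)
print(X[0].ravel()) # I(2), Q(2), I(1), Q(1), I(0), Q(0)
```

Flattening one row of `X` reproduces the ordering of Eq. (1); keeping the time-step axis instead lets each of the M + 1 unit cells receive one [I, Q] pair.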



**FIGURE 5.** Memory effects intensity of the DUT for different 5G NR test signals.

The LSTM structure shown in Fig. 2 is a standard cell made of a forget gate, an input gate, and an output gate. It uses the sigmoid (*sig*) and *tanh* activation functions. The input variables  $C_{n-1}$  and  $h_{n-1}$  refer to the previous cell's output and the previous hidden state, respectively. Similarly, the output variables  $C_n$  and  $h_n$  correspond to the cell's output and the hidden state, respectively. The equations relating the outputs of the LSTM cell to its inputs are standard relationships that can be found in [15].
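For reference, the standard LSTM cell relationships can be sketched in a few lines. This is the textbook formulation with stacked gate weights, not code from this work; the weight shapes, the gate ordering, the random initialization, and the helper names (`lstm_cell`, `run_direction`) are assumptions of the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell.

    W: (4H, D), U: (4H, H), b: (4H,), stacked as [forget, input, candidate, output].
    Returns the new hidden state h_n and the new cell state C_n.
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    f = sigmoid(z[0:H])            # forget gate
    i = sigmoid(z[H:2 * H])        # input gate
    g = np.tanh(z[2 * H:3 * H])    # candidate cell state
    o = sigmoid(z[3 * H:4 * H])    # output gate
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c

def run_direction(seq, W, U, b, H):
    """Propagate a sequence of [I, Q] pairs through the cell; return the last h."""
    h, c = np.zeros(H), np.zeros(H)
    for x in seq:
        h, c = lstm_cell(x, h, c, W, U, b)
    return h

rng = np.random.default_rng(0)
D, H, steps = 2, 4, 5              # [I, Q] features, hidden size, M + 1 time steps
seq = rng.normal(size=(steps, D))
# Forward and backward paths use separate (here random) parameter sets.
fwd = (rng.normal(size=(4 * H, D)), rng.normal(size=(4 * H, H)), np.zeros(4 * H))
bwd = (rng.normal(size=(4 * H, D)), rng.normal(size=(4 * H, H)), np.zeros(4 * H))

h_bi = np.concatenate([run_direction(seq, *fwd, H), run_direction(seq[::-1], *bwd, H)])
print(h_bi.shape)  # (8,)
```

The bidirectional output simply concatenates the hidden states of the forward pass and of the pass over the time-reversed sequence, which is what is then handed to the DNN in Fig. 1.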

### **III. EXPERIMENTAL SETUP AND TEST CONDITIONS**

The proposed digital predistortion function was validated experimentally on a commercial Gallium Nitride based power amplifier prototype. The device under test is the CGH40010-AMP demonstration amplifier from Wolfspeed Inc., Durham, NC. The DUT was driven by 5G new radio (NR) test signals of various bandwidths, carriers' configurations, and average powers. The tests were performed while the DUT operated at a carrier frequency of 2425MHz. The experimental setup used in this work is depicted in Fig. 3. Figure 3 (a) illustrates a simplified block diagram of the experimental setup showing the various functional blocks. A photograph of the experimental setup is presented in Figure 3 (b).

In DUT characterization mode, the complex baseband digital waveform is downloaded into the vector signal generator, which generates the corresponding analog RF signal used to drive the power amplifier lineup. To boost the power level of the analog signal in order to operate the CGH40010-AMP over its entire power range, a ZHL-42 amplifier, from Mini-Circuits, Brooklyn, NY, was used as a driver. The output of the DUT was first attenuated and then fed into a vector signal analyzer which was used to perform signal down-conversion and demodulation. The resulting in-phase and quadrature components of the baseband signal were then used along with the input digital waveform to model the device under test and synthesize the predistortion function. The analog signal generation and analysis functions were performed within the Anritsu MS2830A which includes a vector signal generator and a vector signal analyzer in the same chassis. In predistortion mode, the NN based predistortion function is run on the computer and the resulting predistorted signal waveform is downloaded into the vector signal generator. The output of the DUT is fed into the vector signal analyzer where the DPD performance is assessed by observing the corresponding spectra and measuring the adjacent channel leakage ratio (ACLR).

FIGURE 6. Measured ACLR 1 under mismatched DPD conditions. (Legend refers to the test signal used to derive the DPD).
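In this work the ACLR is measured by the vector signal analyzer. For offline inspection of captured baseband data, a comparable figure can be estimated as sketched below; the single-periodogram estimator, the function names, and the synthetic two-tone check are assumptions of this example (a practical estimate would typically average several windowed periodograms).

```python
import numpy as np

def band_power(x, fs, f_lo, f_hi):
    """Power of x in the band [f_lo, f_hi) Hz, estimated from one periodogram."""
    spectrum = np.abs(np.fft.fft(x)) ** 2
    freqs = np.fft.fftfreq(len(x), d=1.0 / fs)
    return spectrum[(freqs >= f_lo) & (freqs < f_hi)].sum()

def aclr_dbc(x, fs, channel_bw, offset):
    """Power in the channel centered at `offset` relative to the main channel (dBc)."""
    main = band_power(x, fs, -channel_bw / 2, channel_bw / 2)
    adjacent = band_power(x, fs, offset - channel_bw / 2, offset + channel_bw / 2)
    return 10.0 * np.log10(adjacent / main)

# Synthetic check: an in-band tone plus a leakage tone 40 dB below it,
# both placed exactly on FFT bins so that no spectral leakage occurs.
fs = 100e6
n = np.arange(1000)
x = np.exp(2j * np.pi * 1e6 * n / fs) + 0.01 * np.exp(2j * np.pi * 11e6 * n / fs)

aclr_upper = aclr_dbc(x, fs, channel_bw=10e6, offset=10e6)
print(round(aclr_upper, 2))  # -40.0
```

With this convention, the first and second adjacent channels would correspond to offsets of one and two channel bandwidths, on either side of the carrier.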

A total of 12 5G NR test signals were used for the validation of the proposed digital predistorter. The signal bandwidths span from 10MHz to 40MHz with various carriers' configurations. The characteristics of the test signals used in this work are summarized in Table 1. The test signals were sampled at 153.6Msps and have a total of 153600 samples corresponding to a time duration of 1ms.
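The waveform duration follows directly from the sampling rate and the number of samples, and the PAPR figures of the kind listed in Table 1 follow from the usual peak-to-average power definition. The short sketch below shows both computations; the function name `papr_db` and the toy test vector are assumptions of the example.

```python
import numpy as np

fs = 153.6e6           # sampling rate in samples per second
n_samples = 153_600    # waveform length in samples
duration_ms = 1e3 * n_samples / fs
print(duration_ms)     # 1.0

def papr_db(x):
    """Peak-to-average power ratio of a complex baseband waveform, in dB."""
    p = np.abs(x) ** 2
    return 10.0 * np.log10(p.max() / p.mean())

# Toy check: powers [1, 1, 1, 9] have a mean of 3 and a peak of 9, a ratio of 3.
toy = np.array([1.0, 1.0, 1.0, 3.0])
toy_papr = papr_db(toy)
```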

The measured AM/AM of the device under test for the 10M\_1C and the 40M\_4C\_1001 test signals are reported in Fig. 4. These characteristics show a pronounced nonlinear distortion along with mild memory effects as it can be seen through the reduced dispersion for the 10MHz test signal. Conversely, the measured AM/AM of the device under test for the 40MHz signal exhibits significantly stronger memory effects as expected due to the wide bandwidth of this latter test signal. Similar behavior is also observed in the AM/PM characteristics. These are not reported for conciseness.
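AM/AM and AM/PM characteristics such as those of Fig. 4 are obtained from time-aligned input and output baseband samples. The sketch below shows one common way to compute them; the function name and the assumption that both waveforms are already time-aligned and expressed on a common scale are specific to this example.

```python
import numpy as np

def am_am_am_pm(x_in, y_out):
    """Return input power (dB), instantaneous gain (dB), and phase shift (degrees)."""
    x_in = np.asarray(x_in, dtype=complex)
    y_out = np.asarray(y_out, dtype=complex)
    keep = np.abs(x_in) > 0            # skip zero-valued input samples
    gain = y_out[keep] / x_in[keep]    # complex instantaneous gain
    p_in_db = 20.0 * np.log10(np.abs(x_in[keep]))
    am_am_db = 20.0 * np.log10(np.abs(gain))
    am_pm_deg = np.degrees(np.angle(gain))
    return p_in_db, am_am_db, am_pm_deg

# Toy check: a linear device with a gain of 2 and a 90 degree phase rotation.
x_demo = np.array([1.0, 1.0j, 2.0, 0.0])
p_in_db, am_am_db, am_pm_deg = am_am_am_pm(x_demo, 2.0j * x_demo)
```

For a memoryless device, plotting `am_am_db` against `p_in_db` gives a single curve; the dispersion around that curve is the visual signature of the memory effects discussed above.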

To assess the ability of the BiLSTM DPD structure in linearizing the DUT, a DPD was identified for 7 out of the 12 test signals. These signals were selected randomly in order to cover a wide range of bandwidths and carriers' configurations to provide a comprehensive overview of the DPD capabilities. For all these signals, the DPD structure and parameters as well as the identification process were identical. The only difference is that for each test, the DPD was trained and tested with the same (matched) test signal. As customary in DPD training, only part of the test signal was used to train the DPD. Neural networks DPDs require extensive training data when compared to memory polynomial based DPDs, therefore the training of the DPD was done using 70% of the signal waveform. The results are reported in Table 2. This table includes the calculated NMSE after the DPD training as well as the measured ACLR at the output of the linearized DUT. These results show that overall the DPD is able to achieve an ACLR better than -45dBc for all test cases. However, the DPD performance tends to be best for the narrowband signals and degrades as the bandwidth of the test signal increases.

TABLE 3. DPD performance comparison (benchmark vs. proposed).

| Signal      | Matched DPD (Benchmark)<br>ACLR 1 (dBc) | Matched DPD (Benchmark)<br>ACLR 2 (dBc) | Proposed DPD<br>ACLR 1 (dBc) | Proposed DPD<br>ACLR 2 (dBc) |
|-------------|-----------------------------------------|-----------------------------------------|------------------------------|------------------------------|
| 40M_4C_1001 | -46.8                                   | -45.9                                   | -46.8                        | -45.9                        |
| 40M_4C_1111 | -49.1                                   | -47.7                                   | -48.8                        | -48.0                        |
| 40M_1C      | -49.4                                   | -50.7                                   | -52.0                        | -49.4                        |
| 30M_2C      | -51.2                                   | -49.2                                   | -51.7                        | -49.5                        |
| 20M_2C      | -50.7                                   | -50.8                                   | -51.3                        | -52.3                        |
| 20M_1C      | -51.2                                   | -52.1                                   | -53.4                        | -52.2                        |
| 10M_1C      | -51.7                                   | -52.2                                   | -51.4                        | -52.3                        |

FIGURE 7. Measured ACLR 1 for the proposed DPD under mismatched power and bandwidth conditions. (Legend refers to the operating IPBO during the linearization step).
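The NMSE figures reported in Table 2 follow the usual normalized mean squared error definition, and the 70% training portion amounts to 107520 of the 153600 waveform samples. The sketch below illustrates both; taking the first 70% of the waveform as one contiguous training block is an assumption of this example, since the exact partitioning is not detailed here.

```python
import numpy as np

def nmse_db(y_ref, y_est):
    """Normalized mean squared error between a reference and an estimate, in dB."""
    err = np.sum(np.abs(y_ref - y_est) ** 2)
    return 10.0 * np.log10(err / np.sum(np.abs(y_ref) ** 2))

def split_train_test(x, train_fraction=0.7):
    """Split a waveform into a leading training block and a trailing test block."""
    n_train = int(round(train_fraction * len(x)))
    return x[:n_train], x[n_train:]

rng = np.random.default_rng(0)
waveform = rng.normal(size=153_600) + 1j * rng.normal(size=153_600)
train, test = split_train_test(waveform)
print(len(train), len(test))  # 107520 46080

# A 1% amplitude error on every sample corresponds to an NMSE of -40 dB.
nmse_demo = nmse_db(test, 1.01 * test)
```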

The objective of this work is to derive a DPD that is able, from a single training, to linearize the amplifier while being driven with any of the above mentioned 12 test signals to ensure a bandwidth and carrier configuration resilient DPD. Moreover, since the operating average power of the amplifier might change, an additional requirement of the DPD is to be able to linearize the power amplifier at various operating power levels. Initial tests showed that a power range corresponding to an input power back off (IPBO) between 0 and 5dB is sufficient. Here, the IPBO is defined with respect to the saturation power of the DUT. An IPBO of 0dB implies that the DUT is pushed up to its saturation power. The range of variations of the input power was limited to 5dB since the DUT was able to meet the ACLR requirements without any linearization for lower operating power levels.

## IV. BANDWIDTH, POWER AND CARRIER CONFIGURATION RESILIENT BILSTM BASED NEURAL NETWORK DPD

Neural networks are known for their ability to learn the intrinsic behavior of the system being modeled. Hence, to ensure that the designed DPD performs well over all the operating conditions described in the previous section, the following approach was adopted. First, to ensure that the model captures the full power-dependent nonlinearity of the DUT, it was decided to train the DPD for the case where the PA is operated at 0dB IPBO. Moreover, in order to capture the full extent of the DUT memory effects, it is required to train the model with the test signal that emulates the strongest memory effects in the DUT. Hence, the memory effects intensity (MEI) of the DUT was characterized for all considered test signals [16].

FIGURE 8. Measured ACLR 1 for the "sub-optimal" DPDs under mismatched power and bandwidth conditions. Test signals used for DPD identification (a) 10M\_1C, (b) 20M\_2C, (c) 30M\_2C, (d) 40M\_4C\_1111. (Legend refers to the operating IPBO during the linearization step).

The memory effects intensity can be derived from the characterization data of the DUT by applying a memoryless post-compensator that will cancel out the effects of the static nonlinear distortions. Therefore, the residual distortions observed can be attributed to the memory effects. Fig. 5 presents the memory effects intensity quantified for each of the test signals. From the data presented in this figure, one can conclude that the MEI gets stronger as the bandwidth of the test signal increases. Furthermore, for signals with the same bandwidth, the memory effects get stronger as the number of carriers increases. For the same number of carriers, the more empty carriers are present, the stronger the memory effects exhibited by the DUT will get. These results helped identify the proper test condition that will emulate the strongest memory effects. In fact, driving the DUT with a 40MHz test signal having 4 carriers with a carrier configuration of 1001 (test signal 40M\_4C\_1001) will lead to the strongest memory effects.
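The exact MEI metric is defined in [16] and is not reproduced here. The sketch below only illustrates the procedure described above, using the residual NMSE left after a least-squares memoryless post-compensator as a stand-in measure. The polynomial order, the two synthetic amplifier models, and all function names are assumptions made for illustration.

```python
import numpy as np

def memoryless_basis(y, order=7):
    """Odd-order memoryless polynomial basis: y, y|y|^2, ..., y|y|^(order-1)."""
    return np.stack([y * np.abs(y) ** k for k in range(0, order, 2)], axis=1)

def residual_after_static_compensation_db(x, y, order=7):
    """Fit a memoryless post-compensator y -> x; return the residual NMSE in dB."""
    B = memoryless_basis(y, order)
    coeffs, *_ = np.linalg.lstsq(B, x, rcond=None)
    residual = x - B @ coeffs
    return 10.0 * np.log10(np.sum(np.abs(residual) ** 2) / np.sum(np.abs(x) ** 2))

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 4096) * np.exp(2j * np.pi * rng.uniform(0.0, 1.0, 4096))

# Hypothetical amplifier models: static compression, then the same with a memory tap.
y_static = x - 0.1 * x * np.abs(x) ** 2
y_memory = y_static + 0.2 * np.roll(y_static, 1)

res_static = residual_after_static_compensation_db(x, y_static)
res_memory = residual_after_static_compensation_db(x, y_memory)
print(res_static, res_memory)
```

For the purely static model the post-compensator removes almost all of the distortion, whereas the residual of the model with a memory tap stays much higher; ranking test signals by such a residual mirrors the way the strongest-memory condition is selected.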

VOLUME 11, 2023

Based on this analysis, the proposed BiLSTM DPD was trained from the characterization data in which the PA was driven by the 40M\_4C\_1001 test signal while operating at its peak output power (IPBO=0dB). This corresponds to the case that emulates the strongest memory effects and nonlinearity. This DPD was trained using 70% of the waveform corresponding to the 40M\_4C\_1001 test signal, and was later applied to linearize the power amplifier driven by the other 11 test signals, while operating at 0dB IPBO. These 11 test signals represent unseen data for the DPD since they were not used during the training step. The measured ACLR data at the output of the linearized amplifier with this mismatched DPD are reported in Fig. 6. For comparison purposes, this figure also includes the results derived when applying the 6 other DPDs derived from the characterization of the DUT with the test signals listed in Table 2. For a better representation of the results, the signals on the x-axis are listed in descending memory effects intensity. Table 3 compares the performance of the benchmark DPDs when applied under matched conditions and that of the proposed DPD. This table shows that overall the proposed DPD maintains the same linearization performance as the benchmark DPDs.
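The comparison of Table 3 can be checked numerically with a few lines. The values below are copied from Table 3; the variable names and the -45 dBc threshold variable are the only additions of this example.

```python
# ACLR values in dBc copied from Table 3, per signal:
# (matched ACLR 1, matched ACLR 2, proposed ACLR 1, proposed ACLR 2)
table3 = {
    "40M_4C_1001": (-46.8, -45.9, -46.8, -45.9),
    "40M_4C_1111": (-49.1, -47.7, -48.8, -48.0),
    "40M_1C": (-49.4, -50.7, -52.0, -49.4),
    "30M_2C": (-51.2, -49.2, -51.7, -49.5),
    "20M_2C": (-50.7, -50.8, -51.3, -52.3),
    "20M_1C": (-51.2, -52.1, -53.4, -52.2),
    "10M_1C": (-51.7, -52.2, -51.4, -52.3),
}

threshold_dbc = -45.0

# The proposed DPD is compliant if both of its ACLR values are at or below the threshold.
compliant = all(max(p1, p2) <= threshold_dbc for (_, _, p1, p2) in table3.values())

# Largest amount (in dB) by which the proposed DPD is worse than the matched one.
worst_gap = max(max(p1 - m1, p2 - m2) for (m1, m2, p1, p2) in table3.values())

print(compliant)            # True
print(round(worst_gap, 1))  # 1.3
```

The largest shortfall of the proposed DPD relative to a matched DPD is 1.3 dB (ACLR 2 of the 40M\_1C signal), while for several signals the proposed DPD is slightly better than the matched one.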

A general trend that can be observed is that the BiLSTM structure can overall learn the DUT behavior fairly well for signals with comparable conditions. However, its performance quickly degrades when applied to linearize the PA driven by significantly more stressful test signals. Most importantly, the proposed approach is the only one that leads to a DPD able to meet the 45dBc ACLR threshold for all test signals. It is important to emphasize that this DPD was trained with only one test signal, and was then able to linearize the amplifier operating with 12 different test signals with bandwidths ranging from 10MHz to 40MHz.

To further assess the resilience of the proposed DPD to the operating conditions, the IPBO was varied from 0dB to 5dB in steps of 1dB by controlling the average power of the input signals applied to the predistorter. In these tests, the proposed predistorter was used to linearize the DUT while operating under 72 different conditions (12 different test signals with power levels varied over 6 values for each test signal). The measured ACLR at the output of the DUT when linearized by the proposed DPD are reported in Fig. 7.
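The 72 test conditions and the corresponding input scaling can be enumerated as sketched below. The signal names are those of Table 1; the helper `ipbo_scale`, which converts a back-off in dB into a linear amplitude factor applied to the waveform defined at 0dB IPBO, is an assumption of this example.

```python
from itertools import product
import math

signals = [
    "40M_4C_1001", "40M_4C_1101", "40M_4C_1111", "40M_2C", "40M_1C",
    "30M_2C", "30M_1C", "20M_4C_1101", "20M_2C", "20M_1C", "10M_2C", "10M_1C",
]
ipbo_values_db = range(0, 6)  # 0 dB to 5 dB in steps of 1 dB

conditions = list(product(signals, ipbo_values_db))
print(len(conditions))  # 72

def ipbo_scale(ipbo_db):
    """Linear amplitude factor that lowers the input power by ipbo_db decibels."""
    return 10.0 ** (-ipbo_db / 20.0)

# Power change, in dB, produced by the amplitude factor at the largest back-off.
power_change_db = 20.0 * math.log10(ipbo_scale(5))
```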

The data of Fig. 7 show that, for the proposed DPD, the ACLR of 45dBc is met under all operating conditions. For comparison purposes, the same test was repeated with the other DPDs and the results are summarized in Fig. 8 for 4 of these cases. As it can be seen in the plots of Fig. 8, a performance deterioration is observed. This performance deterioration is mainly a function of the signal bandwidth but not of the operating power level. This indicates that the predistorters were able to predict the nonlinear behavior of the DUT (since they were trained from peak power condition at an IPBO of 0dB); however, their performance degradation is mainly due to their inability to predict the memory effects. Therefore, for a given test signal there is no significant performance degradation with respect to the power level.

The measured spectra at the output of the linearized amplifier are reported in Fig. 9. This figure includes the measured spectra at the output of the linearized amplifier obtained using the proposed DPD, and using the matched DPD. The matched DPD corresponds to the DPD derived when the PA was characterized with the same test signal that is being used for performance assessment. For example, for the 40M\_1C case, the matched DPD is the one derived from a characterization data in which the PA was driven by the 40M\_1C test signal. In all cases, the proposed DPD corresponds to the DPD derived from the PA characterization using the 40M\_4C\_1001 test signal. To assess the performance of the designed DPDs, the output spectra before linearization are also included.

FIGURE 10. Measured ACLR 1 for the GMP DPD under mismatched power and bandwidth conditions.

The spectra reported in Fig. 9 clearly show that the proposed predistorter leads to a performance similar to that of a matched predistorter for bandwidths ranging from 10MHz to 40MHz, and for various carriers' configurations.

To further compare the proposed DPD to other standard DPDs, a generalized memory polynomial (GMP) DPD was considered [17]. The GMP DPD relates the input signal to output signal according to:

$$\begin{aligned} x_{GMP\_Out}(n) = {} & \sum_{i=1}^{N} \sum_{j=0}^{M} a_{ij} \cdot x_{in}(n-j) \cdot |x_{in}(n-j)|^{i-1} \\ & + \sum_{k=1}^{K} \sum_{i=1}^{N_K} \sum_{j=0}^{M_K} b_{ijk} \cdot x_{in}(n-j) \cdot |x_{in}(n-j-k)|^{i-1} \\ & + \sum_{l=1}^{L} \sum_{i=1}^{N_L} \sum_{j=0}^{M_L} c_{ijl} \cdot x_{in}(n-j) \cdot |x_{in}(n-j+l)|^{i-1} \end{aligned}$$
(2)

where  $x_{GMP\_Out}$  (*n*) and  $x_{in}$  (*n*) are the output and input samples of the DPD, respectively. *N*, *M* and  $a_{ij}$  are the nonlinearity order, the memory depth, and the coefficients of the time-aligned sub-function of the generalized memory polynomial DPD. *K* and *L* represent the order of the lagging and leading cross-terms of the DPD function, respectively.  $N_K$ ,  $M_K$  and  $b_{ijk}$  are the nonlinearity order, the memory depth, and the coefficients of the lagging cross-terms sub-function of the GMP DPD, respectively. Similarly,  $N_L$ ,  $M_L$  and  $c_{ijl}$  are the nonlinearity order, the memory depth, and the coefficients of the leading cross-terms sub-function of the GMP DPD, respectively.
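A minimal sketch of Eq. (2) is given below: it builds one basis column per coefficient and identifies the coefficients by least squares. The zero padding at the record edges, the function names, the chosen orders, and the synthetic target are assumptions of this example, not the identification procedure used in this work.

```python
import numpy as np

def delayed(x, d):
    """Return x(n - d) with zero padding outside the record (d may be negative)."""
    out = np.zeros_like(x)
    if d >= 0:
        out[d:] = x[: len(x) - d]
    else:
        out[:d] = x[-d:]
    return out

def gmp_basis(x, N, M, K=0, NK=0, MK=0, L=0, NL=0, ML=0):
    """One column per coefficient of Eq. (2): aligned, lagging, and leading terms."""
    cols = []
    for i in range(1, N + 1):                      # time-aligned terms, a_ij
        for j in range(M + 1):
            xd = delayed(x, j)
            cols.append(xd * np.abs(xd) ** (i - 1))
    for k in range(1, K + 1):                      # lagging cross-terms, b_ijk
        for i in range(1, NK + 1):
            for j in range(MK + 1):
                cols.append(delayed(x, j) * np.abs(delayed(x, j + k)) ** (i - 1))
    for l in range(1, L + 1):                      # leading cross-terms, c_ijl
        for i in range(1, NL + 1):
            for j in range(ML + 1):
                cols.append(delayed(x, j) * np.abs(delayed(x, j - l)) ** (i - 1))
    return np.stack(cols, axis=1)

def fit_gmp(u, target, **orders):
    """Least-squares coefficients such that gmp_basis(u) @ coeffs approximates target."""
    coeffs, *_ = np.linalg.lstsq(gmp_basis(u, **orders), target, rcond=None)
    return coeffs

rng = np.random.default_rng(2)
u = 0.5 * (rng.normal(size=512) + 1j * rng.normal(size=512))
u1 = delayed(u, 1)
target = 2.0 * u - 0.3 * u1 * np.abs(u1) ** 2      # synthetic response inside the model

orders = dict(N=3, M=1, K=1, NK=2, MK=1, L=1, NL=2, ML=1)
B = gmp_basis(u, **orders)
coeffs = fit_gmp(u, target, **orders)
fit_error = np.max(np.abs(B @ coeffs - target))
print(B.shape)  # (512, 14)
```

With the summation limits written as in Eq. (2), the cross-term columns for i = 1 coincide with linear time-aligned columns, so the basis is rank deficient; the minimum-norm solution returned by `np.linalg.lstsq` handles this without special treatment.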



**FIGURE 11.** Measured ACLR 1 under mismatched DPD conditions. (a) Benchmark GMP DPD, (b) Proposed DPD.

The GMP model was derived from the same characterization data used to generate the proposed model. This corresponds to the DUT operating at peak power level with the signal leading to the strongest memory effects intensity. The corresponding GMP DPD was then applied to linearize the DUT while operating under the same conditions used to derive the data of Fig. 7. The ACLR values measured under all these test conditions are compiled in Fig. 10. This figure shows that the GMP DPD performance degrades as a function of the signal type and most importantly as a function of the operating power level. Fig. 11 summarizes the performance of the GMP DPD and the proposed DPD in terms of ACLR at the output of the linearized PA as a function of the signal type and its operating IPBO. In these tests, the GMP DPD and BiLSTM DPD remained unchanged while the DUT operating conditions were varied. The flatness of the ACLR results obtained with the proposed DPD drastically contrasts with that of the GMP based DPD for which the ACLR degrades significantly, especially with changes in the IPBO.

Fig. 12 includes the measured spectra at the output of the linearized amplifier using the proposed BiLSTM DPD and the GMP DPD. Both DPDs were generated from the same characterization data (that is, with the 40M\_4C\_1001 test signal). During the test, the PA was operated at an IPBO of 2dB. The spectra reported in this figure show the ability of the proposed predistorter to maintain its performance under power and bandwidth mismatch conditions, while they clearly point out the limitation of the GMP DPD in such cases.

**FIGURE 12.** Sample of measured spectra at the output of the linearized amplifier under mismatched power and bandwidth conditions at 2dB IPBO using the GMP and the BiLSTM predistorters. (a) 10M\_2C, (b) 30M\_1C, (c) 40M\_2C, (d) 40M\_4C\_1101.

Considering all spectra corresponding to the matched DPD in Figures 9 and 12 illustrates the robustness of the proposed predistorter in maintaining the linearity performance under a wide range of operating conditions.

When considering all the results, it appears that, compared to the GMP DPD, the BiLSTM structure is able to better capture the inherent behavior of the DUT that is emulated during the characterization process. Most importantly, it has been demonstrated that the proper selection of the DPD structure along with the test conditions used to generate the DPD can lead to a power, bandwidth, and carrier configuration resilient predistorter. It is very important to emphasize that the proposed DPD performs well over a wide range of unseen test signals and that it does not require any update (even partial) when the operating conditions change. Therefore, the proposed solution is far more attractive and useful than scalable predistorters, which require one form or another of update and scaling.

It is important to note here that the maximum bandwidth of the experimental setup used in this work was limited to 160MHz. Hence, the maximum bandwidth of the input signal was limited to 40MHz. However, it is anticipated that the approach and results presented in this work can be extended to wider bandwidths.

#### **V. CONCLUSION**

In this paper, neural networks were used to develop a power, bandwidth, and carriers' configuration resilient digital predistorter suitable for 5G applications. This predistorter was derived using bidirectional long-short-term memory neural networks. It has been shown that the proposed model can maintain satisfactory linearization performance when used to linearize the power amplifier driven by unseen data with various bandwidths and average power levels. The training of the proposed DPD was performed only once using a test signal that emulates a comprehensive dynamic nonlinear behavior of the DUT. For experimental validation, the proposed DPD was trained once and then applied to linearize a GaN-based power amplifier operating under 72 different conditions that covered variations in the signal power, signal bandwidth and carriers' configuration. The proposed DPD was able to maintain standard compliant ACLR over all considered test conditions. Such a predistorter is an enabling technology for future communication infrastructure where a key requirement is a fast update of the predistortion function. Not only does this predistorter achieve this requirement, it completely eliminates the need for update following a change in the signal power, bandwidth or carriers' configuration. Future work can build on these results to propose a DPD that is resilient to a wider range of varying conditions, such as temperature.

#### REFERENCES

- [1] F. M. Ghannouchi, O. Hammi, and M. Helaoui, *Behavioral Modeling and Predistortion of Wideband Wireless Transmitters*. Hoboken, NJ, USA: Wiley, 2015.
- [2] F. M. Ghannouchi and O. Hammi, "Behavioral modeling and predistortion," *IEEE Microw. Mag.*, vol. 10, no. 7, pp. 52–64, Dec. 2009.
- [3] O. Hammi, A. Kwan, and F. M. Ghannouchi, "Bandwidth and power scalable digital predistorter for compensating dynamic distortions in RF power amplifiers," *IEEE Trans. Broadcast.*, vol. 59, no. 3, pp. 520–527, Sep. 2013.
- [4] Y. Li, X. Wang, and A. Zhu, "Complexity-reduced model adaptation for digital predistortion of RF power amplifiers with pretraining-based feature extraction," *IEEE Trans. Microw. Theory Techn.*, vol. 69, no. 3, pp. 1780–1790, Mar. 2021.
- [5] X. Wang, Y. Li, H. Yin, C. Yu, Z. Yu, W. Hong, and A. Zhu, "Digital predistortion of 5G multiuser MIMO transmitters using low-dimensional feature-based model generation," *IEEE Trans. Microw. Theory Techn.*, vol. 70, no. 3, pp. 1509–1520, Mar. 2022.
- [6] M. Mengozzi, G. P. Gibiino, A. M. Angelotti, C. Florian, and A. Santarelli, "Beam-dependent active array linearization by global feature-based machine learning," *IEEE Microw. Wireless Technol. Lett.*, vol. 33, no. 6, pp. 895–898, Jun. 2023.
- [7] F. Jalili, F. F. Tafuri, O. K. Jensen, Q. Chen, M. Shen, and G. F. Pedersen, "Bandwidth-scalable digital predistortion of active phased array using transfer learning neural network," *IEEE Access*, vol. 11, pp. 13877–13888, 2023.
- [8] H. Wu, W. Chen, X. Liu, Z. Feng, and F. M. Ghannouchi, "A uniform neural network digital predistortion model of RF power amplifiers for scalable applications," *IEEE Trans. Microw. Theory Techn.*, vol. 70, no. 11, pp. 4885–4899, Nov. 2022.
- [9] M. Rawat, K. Rawat, and F. M. Ghannouchi, "Adaptive digital predistortion of wireless power amplifiers/transmitters using dynamic real-valued focused time-delay line neural networks," *IEEE Trans. Microw. Theory Techn.*, vol. 58, no. 1, pp. 95–104, Jan. 2010.
- [10] D. Wang, M. Aziz, M. Helaoui, and F. M. Ghannouchi, "Augmented real-valued time-delay neural network for compensation of distortions and impairments in wireless transmitters," *IEEE Trans. Neural Netw. Learn. Syst.*, vol. 30, no. 1, pp. 242–254, Jan. 2019.
- [11] D. Phartiyal and M. Rawat, "LSTM-deep neural networks based predistortion linearizer for high power amplifiers," in *Proc. Nat. Conf. Commun.* (*NCC*), Feb. 2019, pp. 1–5.
- [12] H. Li, Y. Zhang, G. Li, and F. Liu, "Vector decomposed long short-term memory model for behavioral modeling and digital predistortion for wideband RF power amplifiers," *IEEE Access*, vol. 8, pp. 63780–63789, 2020.
- [13] J. Sun, W. Shi, Z. Yang, J. Yang, and G. Gui, "Behavioral modeling and linearization of wideband RF power amplifiers using BiLSTM networks for 5G wireless systems," *IEEE Trans. Veh. Technol.*, vol. 68, no. 11, pp. 10348–10356, Nov. 2019.
- [14] M. Geng, G. Crupi, and J. Cai, "Accurate and effective nonlinear behavioral modeling of a 10-W GaN HEMT based on LSTM neural networks," *IEEE Access*, vol. 11, pp. 27267–27279, 2023.

- [15] S. Hochreiter and J. Schmidhuber, "Long short-term memory," *Neural Comput.*, vol. 9, no. 8, pp. 1735–1780, Nov. 1997.
- [16] O. Hammi, S. Carichner, B. Vassilakis, and F. M. Ghannouchi, "Power amplifiers' model assessment and memory effects intensity quantification using memoryless post-compensation technique," *IEEE Trans. Microw. Theory Techn.*, vol. 56, no. 12, pp. 3170–3179, Dec. 2008.
- [17] D. R. Morgan, Z. Ma, J. Kim, M. G. Zierdt, and J. Pastalan, "A generalized memory polynomial model for digital predistortion of RF power amplifiers," *IEEE Trans. Signal Process.*, vol. 54, no. 10, pp. 3852–3860, Oct. 2006.



**ASMA ALI** received the B.Sc. degree in electrical engineering from Alfaisal University, Saudi Arabia, in 2018. She is currently pursuing the M.S. degree with the American University of Sharjah, United Arab Emirates. Her current research interests include deep learning, electronics, in particular power amplifiers, and the fusion of the latter.



**OUALID HAMMI** (Member, IEEE) received the B.Eng. degree in electrical engineering from École Nationale d'Ingénieurs de Tunis, Tunis, Tunisia, in 2001, the M.Sc. degree in electrical engineering from École Polytechnique de Montréal, Montréal, QC, Canada, in 2004, and the Ph.D. degree in electrical engineering from the University of Calgary, Calgary, AB, Canada, in 2008.

From 2010 to 2015, he was a Faculty Member with the Department of Electrical Engineering, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia. He is currently a Professor with the Electrical Engineering Department, American University of Sharjah, Sharjah, United Arab Emirates. He is the coauthor of two books, more than 100 articles, and inventor/co-inventor on 13 U.S. patents. His current research interests include the design of energy-efficient linear transmitters for wireless communication and satellite systems, and the characterization, behavioral modeling, and linearization of radiofrequency power amplifiers and transmitters.

...