Received 10 December 2023; accepted 22 December 2023. Date of publication 29 December 2023; date of current version 19 January 2024. Digital Object Identifier 10.1109/OJCOMS.2023.3348190

# Pre-DFT Multiplexing of Reference Signals and Data in DFT-s-OFDM Systems

## KOTESWARA RAO GUDIMITLA, M. SIBGATH ALI KHAN, SAIDHIRAJ AMURU (Member, IEEE), AND KIRAN KUCHI (Member, IEEE)

Department of Electrical Engineering, Indian Institute of Technology Hyderabad, Hyderabad 502284, India CORRESPONDING AUTHOR: K. R. GUDIMITLA (e-mail: ee16resch01005@iith.ac.in)

ABSTRACT The current 5<sup>th</sup> Generation-New Radio (5G-NR) systems are designed to meet the demands of various applications, offering high data rates, low latency, and high reliability. In the current 5G-NR systems, the Discrete Fourier Transform-spread-Orthogonal Frequency Division Multiplexing (DFT-s-OFDM) waveform is the most commonly employed waveform to transmit user data in uplink transmissions, specifically in coverage-limited scenarios. To enable coherent data demodulation, pilot signals, commonly referred to as Demodulation Reference Signals (DMRS), are transmitted along with data-carrying DFT-s-OFDM symbols. However, in the current DFT-s-OFDM architecture, the DMRS and data are transmitted on distinct OFDM symbols. Because of this time separation, demodulation can commence only after the initial DMRS symbol is received, resulting in significant processing delays. Furthermore, the existing architecture consumes more DMRS resources and necessitates complex time interpolation techniques to support users with high mobility. To address these challenges, we propose an improved DFT-s-OFDM architecture that enables instantaneous data demodulation, leading to a substantial reduction in processing delays. Furthermore, despite not utilizing any time interpolation techniques, the proposed method effectively caters to high-speed users, thereby conserving computational resources and hence contributing to minimizing the system complexity and latency. We thoroughly investigate the proposed architecture and evaluate its performance in different simulation settings. The results demonstrate that the proposed architecture significantly improves packet error performance, particularly in high Doppler scenarios, while using 16% less DMRS overhead. This reduction in DMRS overhead frees up additional resources for data transmission, ultimately enhancing data rates.

INDEX TERMS DFT-s-OFDM, demodulation reference signals, high Doppler, low latency.

## I. INTRODUCTION

The 5<sup>th</sup> Generation New Radio (5G-NR) technology caters to a wide range of applications, including Enhanced Mobile Broadband (eMBB) and Ultra-Reliable Low Latency Communications (URLLC), demanding high data rates and low latency, respectively. To achieve efficient uplink transmissions in 5G-NR, especially in coverage-limited scenarios, the commonly utilized waveform is Discrete Fourier Transform-spread-Orthogonal Frequency Division Multiplexing (DFT-s-OFDM). In the existing DFT-s-OFDM architecture [1], uplink user data is transmitted in discrete time units referred to as slots. Each slot comprises 14 DFT-s-OFDM symbols. Specifically, each user is assigned a specific set of *M* frequency subcarriers and $P\ (\leq 14)$ consecutive symbols. Notably, among these symbols, at least one is exclusively designated for Demodulation Reference Signals (DMRS) to facilitate channel estimation. Symbols carrying DMRS are commonly known as DMRS symbols, while the remaining symbols are referred to as Data symbols. The channel estimates derived from the DMRS symbols play a pivotal role in subsequently decoding the user data carried by the Data symbols. The terms "existing DFT-s-OFDM architecture" and "existing architecture" are used interchangeably throughout this section.

In the existing architecture, users can be configured with a maximum of 4 DMRS symbols. However, the exact number and placement of DMRS symbols within the slot are not fixed [1]. Furthermore, DMRS symbols may not always be positioned at the beginning of the allocated *P* symbols. Since the demodulation of user data depends on essential channel estimates, the process commences only when the initial DMRS symbol becomes available. This results in substantial processing delays, which can be critical, particularly in URLLC applications.

<sup>© 2024</sup> The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.

In the existing architecture, the transmission of DMRS and data takes place on distinct symbols. This means that the channel estimates derived from DMRS symbols are either replicated onto the data symbols or undergo time interpolation to adapt to channel variations between symbols. Particularly for high-speed users, where channel characteristics undergo significant changes between symbols, sophisticated interpolation techniques such as cubic or spline interpolation [2] become necessary. However, implementing these interpolation techniques requires the inclusion of more DMRS symbols, leading to a substantial increase in DMRS overhead. Moreover, these interpolation techniques introduce higher computational complexity and processing delays, which make them impractical to be deployed in real-time systems.

Hence, there is a pressing need for an enhanced architecture for the existing DFT-s-OFDM waveform transmission. This enhanced design should cater to the requirements of high-speed users and latency-critical applications while effectively managing computational and latency overheads to maintain acceptable levels of performance.

Numerous improvements have been suggested in previous studies to enhance the performance of the existing DFT-s-OFDM architecture. Some of these contributions are specifically directed towards facilitating high-speed user support for DFT-s-OFDM waveforms, while others focus on designing improved DMRS sequences to better support DFT-s-OFDM waveforms. In [3], the authors present an improved design catering to high-speed users. This design emphasizes adapting to changing channel conditions by reducing DMRS density in the frequency domain while simultaneously increasing DMRS density in the time domain. It is specifically optimized for low delay spread channels and lower-order modulation schemes, such as Quadrature Phase Shift Keying (QPSK) and 16-Quadrature Amplitude Modulation (16-QAM). In a separate investigation [4], a model for Doppler frequency compensation is introduced to address use cases involving high mobility. This model calculates the Doppler frequency curve based on the actual speed of the user. Notably, this model is customized for high-speed railway communications, assuming that the receiver possesses knowledge of the user speed. In [5], a modified version of Orthogonal Time Frequency Space (OTFS) modulation is suggested. This modification aims to generate a transmit signal with a low Peak-to-Average Power Ratio (PAPR), making it particularly suitable for integrated sensing and communications, especially in scenarios involving high Doppler effects. Furthermore, in [6], the authors introduce a pilot design to estimate the channel and mitigate Inter-Carrier-Interference (ICI) generated in high Doppler OFDM systems. This is achieved by exploiting the band structure of the proposed pilot design. To further eliminate ICI, the authors propose a modified version of Minimum Mean Square Error (MMSE) equalization that relies on the sparse and banded structure of the coupling matrix.

Another set of contributions focuses on the design of DMRS sequences for DFT-s-OFDM waveforms. For smaller DMRS lengths ($l \le 24$), [1] specifies QPSK-based Computer Generated Sequences (CGS), and for larger DMRS lengths ($l > 24$) it specifies Zadoff-Chu-based (ZC-based) sequences, regardless of the modulation scheme employed for data transmission. However, the authors in [7] highlight that transmitting ZC or CGS-based DMRS in conjunction with $\pi/2$-Binary Phase Shift Keying ($\pi/2$-BPSK) data may compromise the PAPR advantages associated with $\pi/2$-BPSK data transmissions. As a remedy, their subsequent work [8] recommends the utilization of Pseudo-Noise (PN)-based $\pi/2$-BPSK sequences for DMRS transmissions, particularly when the data is modulated using $\pi/2$-BPSK modulation.

Moreover, the DFT-s-OFDM waveforms are vulnerable to distortions caused by phase noise, particularly at higher carrier frequencies, which can result in potential packet losses. To tackle this challenge, [9] introduces dedicated reference signals known as Phase Tracking Reference Signal (PTRS). These signals play a crucial role in facilitating phase noise tracking, enabling the estimation and subsequent correction of phase noise to ensure precise data decoding. In a related study [10], the authors explored various interpolation techniques, including Discrete Cosine Transform and Kalman filtering, to track phase noise variations. These techniques empower the receiver to implement correction mechanisms based on the estimated phase noise, effectively minimizing packet losses.

However, in all the previously mentioned designs, the transmission of DMRS and data occurs on separate symbols, necessitating the use of interpolation or extrapolation techniques. Additionally, most contributions do not account for processing delays and latencies, which can impact the overall system's performance.

In our contribution, we present an improved architecture designed for the transmission of DFT-s-OFDM waveforms, with a specific emphasis on accommodating high-speed users and latency-critical applications. Our primary innovation involves the efficient multiplexing of DMRS and data within a single DFT-s-OFDM symbol, with DMRS transmission occurring on each configured symbol. This approach eliminates the requirement for dedicated symbols for DMRS transmissions. Additionally, as DMRS is available on each configured symbol, the channel estimates derived from DMRS can be directly utilized for decoding the corresponding data symbols. This eliminates the necessity for complex time interpolation techniques. Despite not utilizing any time interpolation techniques, the proposed method effectively caters to high-speed users, thereby conserving computational resources and hence contributing to an overall reduction in system complexity and latency. Through multiplexing DMRS with data, the proposed design enables instantaneous data detection, leading to a substantial reduction in processing delays. This feature is particularly essential for URLLC applications.

FIGURE 1. Proposed transceiver design for multiplexing DMRS with CP, CS, and data in single DFT-s-OFDM symbol.

The idea of multiplexing data and DMRS within a single DFT-s-OFDM symbol was originally introduced by the same authors in a 3GPP proceeding [11]. However, the primary emphasis of that study was on enhancing the coverage area of a cell specifically for control data transmissions using  $\pi/2$ -BPSK modulation. In subsequent works, the authors in [12] and [13] proposed a method in which data alone undergoes DFT precoding and the precoded data is punctured with DMRS. While this method facilitates a simplified channel estimation procedure, it results in PAPR degradation and packet losses. To address the packet losses, the authors suggested an iterative data decoding algorithm, albeit at the expense of increased complexity and processing delays. Another proposal, in [14], involves multiplexing ghost pilots with data before DFT precoding. This approach operates under the assumption that ghost pilots closely resemble those inserted after the Inverse Fast Fourier Transform (IFFT) at the transmitter. However, none of the works have extensively studied the multiplexing scheme in generic scenarios.

Building on our proposal in [11], in our current research, we have conducted a comprehensive study on the concept of multiplexing. We have thoroughly investigated how the transmission of data and DMRS on the same symbol can effectively cater to distinct use cases. Further, we have presented a modified symbol structure that can enable instantaneous channel estimation with minimal intra-symbol leakages. We have also demonstrated how this data-DMRS multiplexing can enhance packet error performance in high Doppler scenarios and address the needs of low-latency applications. Furthermore, we present a systematic method for determining the minimum DMRS length for a given allocation size to ensure the required channel estimation accuracy.

The remainder of the paper is structured as follows: Section II introduces the symbol structure and transmitter design of the proposed method. Section III covers the receiver architecture and the channel estimation procedure. Section IV presents the link-level simulation results, comparing the performance of the proposed architecture with that of the existing one. Finally, in Section V, we conclude with key remarks and outline potential avenues for future work.

*Note 1:* The subscripts '*t*' and '*f*' denote the time domain and the frequency domain, respectively. Further, considering the readability of the paper, DFT-s-OFDM symbols are referred to as symbols.

#### **II. TRANSMITTER DESIGN**

In this section, we discuss the proposed transmitter architecture. As mentioned in Section I, the core concept of our design is to multiplex DMRS and user data within a single symbol. The transmitter block diagram is illustrated in Fig. 1.

To facilitate data transmission, each user is assigned a specific set of M subcarriers and P consecutive symbols. In conventional systems, data and DMRS are transmitted on separate symbols. Data symbols utilize all M subcarriers for data transmission, while DMRS symbols use all M subcarriers exclusively for DMRS transmission. However, in our method, we assign a subset of  $l_r$  subcarriers for DMRS transmission within each of the P symbols, while the remaining subcarriers are allocated for data transmission.

Let $\mathbf{r}_t^i \in \mathbb{C}^{l_r \times 1}$ denote a DMRS block of length $l_r$ on symbol *i*, and let $\mathbf{d}_t^i$ represent a block of $l_d$ modulation alphabets on symbol *i*, satisfying $l_r + l_d = M$. Let $\mathbf{x}_t^i$ represent the $i^{th}$ data-DMRS multiplexed symbol of length *M*:

$$\mathbf{x}_t^i = \begin{bmatrix} \mathbf{r}_t^i \\ \mathbf{d}_t^i \end{bmatrix}; \quad i \in \{0, 1, 2, \dots, P-1\}. \tag{1}$$

The multiplexed symbol in (1) undergoes DFT (Discrete Fourier Transform) precoding using an *M*-point DFT matrix ( $\mathbf{D}_M$ ). The resulting DFT precoded multiplexed symbol is represented below:

$$\mathbf{x}_{f}^{i} = \frac{1}{\sqrt{M}} \mathbf{D}_{M} \mathbf{x}_{t}^{i}. \tag{2}$$
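As a minimal sketch of (1) and (2), the multiplexing and precoding steps might look as follows. The sketch assumes NumPy, toy block sizes, and a placeholder constant-modulus sequence standing in for the DMRS; none of these come from the paper, and the function name `multiplex_and_precode` is hypothetical. $\mathbf{D}_M$ is taken to be the unnormalized DFT matrix, so the $1/\sqrt{M}$ factor makes the transform unitary.

```python
import numpy as np

def multiplex_and_precode(dmrs_t, data_t):
    """Stack DMRS over data as in (1), then apply the unitary M-point DFT of (2)."""
    x_t = np.concatenate([dmrs_t, data_t])       # x_t = [r_t; d_t], length M = l_r + l_d
    x_f = np.fft.fft(x_t) / np.sqrt(x_t.size)    # (1/sqrt(M)) * D_M * x_t
    return x_t, x_f

# Hypothetical toy sizes: l_r = 4 DMRS samples and l_d = 12 QPSK samples (M = 16).
rng = np.random.default_rng(0)
dmrs_t = np.exp(1j * np.pi * np.arange(4) ** 2 / 4)   # placeholder sequence, not an NR DMRS
bits = rng.integers(0, 2, size=(12, 2))
data_t = ((1 - 2 * bits[:, 0]) + 1j * (1 - 2 * bits[:, 1])) / np.sqrt(2)

x_t, x_f = multiplex_and_precode(dmrs_t, data_t)
```

With this scaling the precoder preserves the symbol energy, which is a convenient sanity check on the normalization.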

The proposed architecture considers two key factors for enhancing DMRS efficiency and minimizing block errors. First, the length of DMRS is strategically optimized, and second, Cyclic Suffix (CS) and Cyclic Prefix (CP) around DMRS are introduced to mitigate channel leakages.

The length of DMRS is a crucial parameter within the proposed architecture. It should be carefully selected to ensure that it is adequate for capturing the maximum channel energy during the channel estimation process. Further details about the method employed to establish an optimal DMRS length for a given allocation size *M* are provided in the receiver section.

FIGURE 2. Slot representing the proposed symbol architecture.

Since data and DMRS are multiplexed on the same symbol and undergo DFT precoding, there is a potential for the energy of DMRS to leak onto data subcarriers and vice versa during receiver processing. These leakages lead to inaccurate channel estimation, ultimately impacting block decoding performance (further elaborated in the receiver section).

To mitigate this issue, the proposed architecture includes the introduction of a guard band between DMRS and data using CP and CS for each of the *P* symbols. Specifically, a trailing portion of DMRS is duplicated at the start of DMRS in the form of CP, and an initial part of DMRS is duplicated after the end of DMRS in the form of CS. While this approach incurs additional overhead, it significantly improves channel estimation performance. The multiplexed symbol after CP and CS insertion is represented as given below

$$\mathbf{x}_{t}^{i} = \begin{bmatrix} \mathbf{r}_{t,CP}^{i} \\ \mathbf{r}_{t}^{i} \\ \mathbf{r}_{t,CS}^{i} \\ \mathbf{d}_{t}^{i} \end{bmatrix}; \quad i \in \{0, 1, 2, \dots, P-1\}. \tag{3}$$

Through comprehensive analysis encompassing all standard channel models and essential modulation techniques that must be supported by any 5G-NR system, we have determined that the maximum CP and CS lengths should be half of the DMRS length.
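The CP and CS construction in (3) can be sketched as below. This is an illustration only: the function names are hypothetical, and the half-length bound is interpreted here as applying to the CP and the CS separately, which is an assumption about the statement above.

```python
import numpy as np

def add_dmrs_cp_cs(dmrs_t, l_cp, l_cs):
    """Return [CP; DMRS; CS]: the CP copies the DMRS tail, the CS copies the DMRS head."""
    l_r = dmrs_t.size
    # Assumption: each of CP and CS is capped at half the DMRS length.
    if not (0 <= l_cp <= l_r // 2 and 0 <= l_cs <= l_r // 2):
        raise ValueError("CP and CS lengths must not exceed half the DMRS length")
    cp = dmrs_t[l_r - l_cp:]    # last l_cp DMRS samples (empty when l_cp == 0)
    cs = dmrs_t[:l_cs]          # first l_cs DMRS samples
    return np.concatenate([cp, dmrs_t, cs])

def build_symbol(dmrs_t, data_t, l_cp, l_cs):
    """Assemble the pre-DFT symbol of (3): [CP; DMRS; CS; data]."""
    return np.concatenate([add_dmrs_cp_cs(dmrs_t, l_cp, l_cs), data_t])
```

For example, with a DMRS of `[0, 1, ..., 7]`, `l_cp = 2`, and `l_cs = 3`, the guard-protected DMRS block is `[6, 7, 0, 1, ..., 7, 0, 1, 2]`, and the data samples follow it.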

The symbol in (3) undergoes DFT precoding, and the DFT precoder output is fed into the subcarrier mapper, which assigns the precoder output to the designated subcarriers. Subsequently, an *N*-point Inverse Fast Fourier Transform (IFFT) is applied to the subcarrier mapper's output, followed by symbol-level CP insertion. The aforementioned steps are iterated over all the allocated *P* symbols. A slot with 14 OFDM symbols representing the proposed symbol structure is shown in Fig. 2.
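The per-symbol transmit chain described above (DFT precoding, subcarrier mapping, *N*-point IFFT, symbol-level CP) can be sketched as follows. The contiguous mapping starting at `first_sc`, the unitary IFFT scaling, and all parameter names are assumptions made for illustration; the paper does not specify them.

```python
import numpy as np

def dft_s_ofdm_modulate(x_t, n_fft, first_sc, l_cp_sym):
    """Map one pre-DFT symbol x_t (length M) to a time-domain DFT-s-OFDM symbol."""
    m = x_t.size
    if first_sc < 0 or first_sc + m > n_fft:
        raise ValueError("allocation does not fit in the FFT grid")
    x_f = np.fft.fft(x_t) / np.sqrt(m)                # M-point DFT precoding
    grid = np.zeros(n_fft, dtype=complex)
    grid[first_sc:first_sc + m] = x_f                 # contiguous subcarrier mapping (assumed)
    s_t = np.fft.ifft(grid) * np.sqrt(n_fft)          # N-point IFFT with unitary scaling
    return np.concatenate([s_t[n_fft - l_cp_sym:], s_t])   # prepend symbol-level CP

# Hypothetical toy configuration: M = 12, N = 64, CP of 4 samples.
rng = np.random.default_rng(1)
x_t = np.exp(1j * np.pi / 2 * rng.integers(0, 4, size=12))
tx = dft_s_ofdm_modulate(x_t, n_fft=64, first_sc=10, l_cp_sym=4)
```

Repeating the call for each of the *P* allocated symbols produces the slot; over an ideal channel, removing the CP and applying the *N*-point FFT returns the precoded samples on the mapped subcarriers.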

## **III. RECEIVER DESIGN**

The block diagram in Fig. 1 illustrates the essential operations carried out in the receiver. This section provides a comprehensive discussion of each of these operations, and also outlines the method we adopted to determine the DMRS length.

After undergoing receiver front-end processing, which also includes CP removal and *N*-point FFT, the allocated *M* subcarriers designated for a specific user are extracted through the subcarrier demapper module. The $i^{th}$ demapped data-DMRS multiplexed symbol can be represented as follows:

$$\mathbf{y}_{f}^{i} = \operatorname{diag}\left(\mathbf{h}_{f}^{i}\right) \, \mathbf{x}_{f}^{i} + \mathbf{w}_{f}^{i}. \tag{4}$$

In the above equation,  $\mathbf{y}_{f}^{i}$  is the received data-DMRS vector on symbol *i*, and  $\mathbf{w}_{f}^{i}$  denotes the zero-mean Additive White Gaussian Noise (AWGN) with variance  $\sigma^{2}$ , and  $\mathbf{h}_{f}^{i} = [\mathbf{h}_{f}^{i}(0), \mathbf{h}_{f}^{i}(1), \dots, \mathbf{h}_{f}^{i}(M-1)]^{T}$  represents the frequency response of the L-tap wireless channel (impulse response of length '*L*') observed on the demapped subcarriers.

#### A. CHANNEL ESTIMATION

In this subsection, we discuss the channel estimation procedure employed to derive channel estimates for a particular symbol *i*. In conventional systems, data and DMRS are transmitted over separate symbols. Hence, frequency domain DMRS are readily available at the receiver for channel estimation, which can be carried out through one of the estimation methods defined in [15]. However, in the proposed method, DMRS and data are multiplexed within the same symbol, followed by DFT precoding. It is crucial to note that although data and DMRS are temporally separated, the DFT output represents a composite mixture of data and DMRS samples. Consequently, isolating DMRS from data after DFT precoding becomes infeasible. As a result, direct frequency domain DMRS are not available at the receiver for channel estimation. Hence, the channel estimation process becomes relatively complex, necessitating a few additional steps. The initial step recovers the multiplexed symbol $\mathbf{y}_{t}^{i}$ by applying an *M*-point IDFT to the received demapped symbol $\mathbf{y}_{f}^{i}$, so that the DMRS component can be isolated. The outcome can be represented as

$$y_t^i(n) = h_t^i(n) \circledast x_t^i(n) + w_t^i(n), \tag{5}$$

where $\circledast$ denotes *M*-point circular convolution, $\mathbf{x}_t^i$ represents the transmitted multiplexed symbol as defined in (3), and $\mathbf{h}_t^i$ represents the effective channel impulse response acting on $\mathbf{x}_t^i$. The vector $\mathbf{y}_t^i$ comprises stacked DMRS and data, along with CP and CS, as shown below:

$$\mathbf{y}_{t}^{i} = \begin{bmatrix} \mathbf{y}_{t,CP}^{i} \\ \mathbf{y}_{t,DMRS}^{i} \\ \mathbf{y}_{t,CS}^{i} \\ \mathbf{y}_{t,d}^{i} \end{bmatrix}. \tag{6}$$

FIGURE 3. BLER performance comparison of the proposed architecture with and without CS for DMRS.

The DMRS from the resulting $\mathbf{y}_t^i$ is extracted using a rectangular window $\mathbf{P}_r$, with the length of $\mathbf{P}_r$ aligned to the DMRS length. This windowing operation isolates the DMRS portion within the symbol $\mathbf{y}_{t}^{i}$:

$$\mathbf{y}_{t,DMRS}^{i} = \mathbf{P}_{r} \odot \mathbf{y}_{t}^{i} \tag{7}$$

where $\odot$ denotes the element-by-element product, $\mathbf{P}_r = [\mathbf{O}_{l_{CP} \times 1}^{T}\ \mathbf{I}_{l_r \times 1}^{T}\ \mathbf{O}_{(l_{CS}+l_d) \times 1}^{T}]^{T}$ with $\mathbf{O}$ and $\mathbf{I}$ denoting all-zeros and all-ones column vectors, and $l_{CP}$, $l_{CS}$ are the lengths of the CP and CS for DMRS, respectively.
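The IDFT and windowing steps leading to (7) can be sketched as follows. The helper names are hypothetical, and the function returns only the $l_r$ samples selected by the window (dropping the zeroed positions), which is an implementation choice rather than something stated in the text.

```python
import numpy as np

def dmrs_window(l_cp, l_r, l_cs, l_d):
    """Rectangular window P_r: zeros over the CP, ones over the DMRS, zeros over CS and data."""
    return np.concatenate([np.zeros(l_cp), np.ones(l_r), np.zeros(l_cs + l_d)])

def extract_dmrs(y_f, l_cp, l_r, l_cs):
    """Apply the M-point IDFT to y_f, window as in (7), and keep the l_r DMRS samples."""
    m = y_f.size
    y_t = np.fft.ifft(y_f) * np.sqrt(m)              # inverse of the unitary DFT precoder
    p_r = dmrs_window(l_cp, l_r, l_cs, m - l_cp - l_r - l_cs)
    windowed = p_r * y_t                             # element-by-element product
    return windowed[l_cp:l_cp + l_r]

# Hypothetical ideal-channel check: l_r = 8, l_cp = l_cs = 2, l_d = 20 (M = 32).
r_t = np.exp(2j * np.pi * np.arange(8) ** 2 / 8)     # placeholder DMRS
d_t = np.ones(20, dtype=complex)                     # placeholder data
x_t = np.concatenate([r_t[-2:], r_t, r_t[:2], d_t])
y_f = np.fft.fft(x_t) / np.sqrt(x_t.size)            # no channel, no noise
recovered = extract_dmrs(y_f, l_cp=2, l_r=8, l_cs=2)
```

Over an ideal channel the recovered block equals the transmitted DMRS, which confirms that the window indices line up with the layout in (6).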

Note 2: In practical systems, achieving perfect time synchronization is often challenging, leading to the presence of non-zero time offsets and time dispersion. To address this issue, a compensatory guard interval is incorporated by placing a CP at the beginning of the DMRS. This CP effectively mitigates the effects of time dispersion. Furthermore, when a windowing operation is applied to the multiplexed symbols, it leads to the unintended spread of the DMRS signal onto adjacent subcarriers. To tackle this, we introduce a guard interval positioned between the DMRS tail and the start of the data, forming what we refer to as a CS. By inserting this CS, we effectively prevent signal leakage and maintain separation between DMRS and data subcarriers. Importantly, both the CP and CS are subsequently discarded during receiver operations. This process ensures that signal interference is avoided and accurate data extraction is achieved. Supporting the argument, Fig. 3 shows the BLER performance of the proposed architecture with and without CP, CS for DMRS generated using parameters defined in Table 1. It can be observed from the figure that the absence of CP and CS for DMRS results in the worst performance with a loss of around 4 dB at 1% BLER when compared to the case where DMRS is appended with CP and CS. The results further indicate that when CS is not appended to DMRS, there is a noticeable loss of approximately 1.8 dB at 1% BLER compared to the case where CS is included with DMRS.

#### TABLE 1. Simulation parameters for BLER comparisons.

| Parameter                           | Value              |
|-------------------------------------|--------------------|
| System bandwidth                    | 400 MHz            |
| Carrier frequency (f<sub>c</sub>)   | 30 GHz             |
| Sub-carrier spacing                 | 120 kHz            |
| Allocated PRBs                      | 200                |
| Number of symbols                   | 14                 |
| Mapping type                        | A                  |
| Modulation                          | 16-QAM, 64-QAM     |
| Channel model                       | 3GPP channel model |
| Number of user transmitter antennas | 1                  |
| Number of BS receiver antennas      | 2                  |
| DMRS sequences                      | Zadoff-Chu         |
| Channel coding                      | 3GPP LDPC          |
| Equalizer                           | MMSE               |

The subsequent channel estimation process involves generating an *M*-length frequency domain channel vector from the extracted DMRS segment. This involves transforming the windowed symbol, $\mathbf{y}_{t,DMRS}^{i}$, into the frequency domain and employing the least squares method to infer an $l_r$-length frequency domain channel vector. From this vector, the corresponding $l_r$-length channel impulse response is obtained using an $l_r$-point IDFT. Eventually, the *M*-length frequency domain channel vector is derived by subjecting the channel impulse response to an *M*-point DFT. The entire process can be represented as follows:

$$\mathbf{y}_{f,r}^{i} = \frac{1}{\sqrt{l_r}} \, \mathbf{D}_{l_r}^{\dagger} \, \mathbf{y}_{t,DMRS}^{i} \tag{8}$$

$$\mathbf{h}_{f,r}^{i} = \operatorname{diag}\left(\left(\mathbf{r}_{f}^{i}\right)^{\dagger}\right) \, \mathbf{y}_{f,r}^{i} \tag{9}$$

where $\mathbf{r}_{f}^{i}$ is given by

$$\mathbf{r}_{f}^{i} = \frac{1}{\sqrt{l_{r}}} \, \mathbf{D}_{l_{r}}^{\dagger} \, \mathbf{r}_{t}^{i}, \tag{10}$$

and the *M*-length frequency domain channel estimate is

$$\widehat{\mathbf{H}}_{f}^{i} = \mathbf{D}_{M} \ \mathbf{D}_{l_{r}}^{\dagger} \ \mathbf{h}_{f,r}^{i}. \tag{11}$$
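A minimal sketch of the estimation chain in (8) to (11) is given below, under several assumptions that are not taken from the paper: NumPy's forward FFT is used for the time-to-frequency steps, the least squares step divides by $|\mathbf{r}_f^i|^2$ so that it also works for sequences without a flat spectrum, the $l_r$-tap impulse response is zero-padded before the *M*-point DFT, and the scaling is chosen so that the estimate equals the true response in a noiseless test. The test channel is a hypothetical 3-tap channel shorter than the DMRS CP.

```python
import numpy as np

def estimate_channel(y_dmrs_t, r_t, m):
    """Least-squares channel estimate from the windowed DMRS, expanded to M subcarriers."""
    l_r = r_t.size
    y_fr = np.fft.fft(y_dmrs_t) / np.sqrt(l_r)         # received DMRS in frequency, cf. (8)
    r_f = np.fft.fft(r_t) / np.sqrt(l_r)               # reference DMRS in frequency, cf. (10)
    h_fr = np.conj(r_f) * y_fr / np.abs(r_f) ** 2      # least squares, cf. (9)
    h_t = np.fft.ifft(h_fr)                            # l_r-tap impulse response
    return np.fft.fft(h_t, n=m)                        # zero-pad to M points, cf. (11)

# Noiseless toy setup (all values hypothetical).
l_r, l_cp, l_cs, l_d = 7, 2, 2, 21
m = l_cp + l_r + l_cs + l_d
n = np.arange(l_r)
r_t = np.exp(-1j * np.pi * n * (n + 1) / l_r)          # Zadoff-Chu sequence, root 1
d_t = np.exp(1j * np.pi / 2 * np.arange(l_d))          # placeholder QPSK-like data
x_t = np.concatenate([r_t[-l_cp:], r_t, r_t[:l_cs], d_t])

h = np.array([1.0, 0.5j, 0.25])                        # 3-tap channel, memory <= l_cp
y_t = np.fft.ifft(np.fft.fft(h, m) * np.fft.fft(x_t))  # M-point circular convolution
h_est = estimate_channel(y_t[l_cp:l_cp + l_r], r_t, m)
```

Because the channel memory does not exceed the DMRS CP, the windowed DMRS sees an $l_r$-point circular convolution and the estimate matches the *M*-point frequency response of `h`. With noise or longer channels the match is only approximate.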

#### B. LENGTH OF DMRS

In this subsection, we outline a technique for determining the suitable DMRS length for an allocation size of *M* subcarriers, which can ensure the desired channel estimation performance. The DMRS length must be carefully considered to strike a balance between DMRS overhead and block error performance. The study conducted by the authors in [16] indicates that the optimal channel estimation performance is achieved when the DMRS length adequately captures the predominant energy of the channel impulse response while estimating the channel.

As previously discussed, a user is allocated with M subcarriers, which undergo an N-point IFFT, followed by symbol-level CP insertion. The resulting OFDM symbol is transmitted through a wireless channel characterized by the impulse response  $h_L(n)$ . At the receiver, the demapper module extracts the M subcarriers from the N-point FFT output. The effective channel impulse response experienced by the demapped M subcarriers can be expressed as follows:

$$h_t^i(n') = \frac{1}{M} \sum_{n=0}^{N-1} h_L(n) \sum_{k=0}^{M-1} e^{j2\pi k \left(\frac{n'}{M} - \frac{n}{N}\right)}. \tag{12}$$

Given that a typical wireless channel consists of only *L* taps (where $L \ll N$), (12) reduces to

$$h_t^i(n') = \frac{1}{M} \sum_{n=0}^{L-1} h_L(n) \sum_{k=0}^{M-1} e^{j2\pi k \left(\frac{n'}{M} - \frac{n}{N}\right)}. \tag{13}$$

The sum of M complex exponentials can be expressed as the ratio of two sinusoidal functions:

$$\sum_{k=0}^{M-1} e^{jkx} = \frac{\sin(\frac{Mx}{2})}{\sin(\frac{x}{2})} e^{\frac{jx(M-1)}{2}}. \tag{14}$$

Hence, using (14), we can represent (13) as follows:

$$h_{t}^{i}(n') = \frac{1}{M} \sum_{n=0}^{L-1} h_{L}(n) e^{j\pi \left(\frac{n'}{M} - \frac{n}{N}\right)(M-1)} \frac{\sin\left(M\pi(\frac{n'}{M} - \frac{n}{N})\right)}{\sin\left(\pi(\frac{n'}{M} - \frac{n}{N})\right)}. \tag{15}$$

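Since $M \gg 1$, the sinusoid in the denominator varies much more slowly than the sinusoid in the numerator. Over the region where the response is significant, the denominator argument is small, so $\sin(\pi u) \approx \pi u$ with $u = \frac{n'}{M} - \frac{n}{N}$, and the $1/M$ factor from (13) is absorbed into the sinc, where $\operatorname{sinc}(x) = \sin(x)/x$. Hence, (15) can be approximated by a sinc function, as depicted below:

$$h_{t}^{i}(n') = \sum_{n=0}^{L-1} h_{L}(n) \, e^{j\pi \left(\frac{n'}{M} - \frac{n}{N}\right)(M-1)} \operatorname{sinc}\left(M\pi \left(\frac{n'}{M} - \frac{n}{N}\right)\right). \tag{16}$$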
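The chain from (13) to (16) can be checked numerically for a single tap. The sketch below compares the direct sum of exponentials, the closed form obtained from (14), and the sinc approximation, with the $1/M$ normalization from (13) included throughout. The value of *M* and the fractional offset of 0.3 samples are hypothetical.

```python
import numpy as np

M = 480
n_prime = np.arange(1, 25)
u = (n_prime - 0.3) / M          # hypothetical values of n'/M - n/N (never exactly zero)

# Direct evaluation: (1/M) * sum_k exp(j 2 pi k u), as in (13) for one tap.
k = np.arange(M)
exact = np.exp(2j * np.pi * np.outer(u, k)).sum(axis=1) / M

# Closed form via (14): phase term times a ratio of sines.
phase = np.exp(1j * np.pi * u * (M - 1))
dirichlet = phase * np.sin(M * np.pi * u) / (M * np.sin(np.pi * u))

# Sinc approximation as in (16); np.sinc(x) = sin(pi x) / (pi x), so sinc(M pi u) = np.sinc(M u).
sinc_approx = phase * np.sinc(M * u)

max_err = np.max(np.abs(dirichlet - sinc_approx))
```

The closed form matches the direct sum to floating-point precision, while the sinc approximation error stays small for offsets close to the tap location and grows as $|u|$ increases.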

The effective channel impulse response in (16),  $h_t^i(n')$  for the demapped M subcarriers, is represented as scaled and shifted sinc functions, where the scaling constants are tap weights from  $h_L(n)$ . While  $h_t^i(n')$  extends infinitely, its significant energy is concentrated only within specific segments.

By leveraging the characteristics of a typical wireless channel, where the impulse response  $h_L(n)$  is of finite length '*L*,' and the channel power delay profile exhibits a decreasing trend, particularly with  $|h_L(L)|$  being much smaller than  $|h_L(0)|$ , it becomes possible to efficiently extract a significant amount of energy from  $h_L(n)$ . This can be achieved by systematically prioritizing energies from distinct sinc pulses, where a major portion of the energy is considered from the sinc pulse associated with  $|h_L(0)|$ , while only a minimal portion of the energy is considered from the sinc pulse associated with  $h_L(L)$ .

The same can be represented as given below; in this case, we assume that  $h_L(L)$  is real, primarily for enhancing readability and convenience:

$$h_{t}^{i}(n') = \sum_{n=0}^{L-1} h_{L}(n) \, e^{j\pi \left(\frac{n'}{M} - \frac{n}{N}\right)(M-1)} \operatorname{sinc}\left(M\pi \left(\frac{n'}{M} - \frac{n}{N}\right)\right) \operatorname{rect}(p). \tag{17}$$



FIGURE 4. Realization of the wireless channel and sinc interpolated channel.

It can be seen that the sinc pulses are truncated for n' > p, where 'p' corresponds to the index of the first zero-crossing of the 'L<sup>th</sup>' sinc pulse. This approach allows us to retain the maximum energy contribution from the 0<sup>th</sup> sinc pulse and gradually decrease across subsequent pulses until the energy contained only in the main lobe is considered from the L<sup>th</sup> sinc pulse.

This first zero crossing of the  $L^{th}$  sinc pulse occurs at

$$\pi \left(\frac{n'}{M} - \frac{L}{N}\right) M = \pi \implies p = n' = 1 + \frac{LM}{N}. \tag{18}$$

It can be implied from (18) that obtaining the initial n' samples of  $h_t^i(n')$  allows capturing its dominant channel energy. The findings from [16] recommend that the DMRS length should encompass the dominant energy of the channel impulse response. Furthermore, [16] proposes that for accurate estimation of an impulse response with a length of n', the corresponding DMRS sequence should be of minimum length n'. Hence, the minimum length of DMRS in the proposed method can be determined as

$$l_r = 1 + \frac{LM}{N}. \tag{19}$$

Furthermore, the length of the symbol-level cyclic prefix is typically set to exceed the maximum delay spread of a wireless channel, i.e., $L \leq L_{CP}$. Moreover, as per current 5G standards, the cyclic prefix length is approximately one-sixteenth of the symbol length, meaning $L_{CP} = N/16$, where *N* represents the symbol length and $L_{CP}$ is the corresponding CP length. Substituting this bound into (19) gives

$$\begin{aligned} &\implies L \leq \frac{N}{16} \\ &\implies l_r \approx \left\lfloor 1 + \frac{M}{16} \right\rfloor. \end{aligned} \tag{20}$$
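As a small worked example of (19) and (20), the helper below computes the DMRS length for a given allocation. The function name is hypothetical, and rounding (19) up to the next integer is an assumption, since the text does not say how a fractional result should be handled.

```python
def dmrs_length(m, l_taps=None, n_fft=None):
    """DMRS length from (20), or from (19) when the tap count L and FFT size N are given."""
    if l_taps is None or n_fft is None:
        return 1 + m // 16                      # (20): floor(1 + M/16)
    # (19): 1 + L*M/N, rounded up with integer arithmetic (assumed rounding rule).
    return 1 + -(-l_taps * m // n_fft)

print(dmrs_length(480))                         # M = 480 subcarriers
print(dmrs_length(2400))                        # 200 PRBs x 12 subcarriers
```

For $M = 480$ this gives 31 samples, and for the 200-PRB allocation of Table 1 ($M = 2400$) it gives 151 samples.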

#### C. EXAMPLE

In this section, we demonstrate through an illustrative example that the DMRS length, as determined by (20), adeptly captures the predominant channel energy component. Additionally, we highlight that the energy loss incurred due to the truncation of  $h_t^i(n')$  is negligible.

In this example, we consider a 3GPP channel model with a Power Delay Profile (PDP) illustrated in Fig. 4(a). An allocation size of $M = 480$ subcarriers is considered in this example. The resulting effective channel observed on the demapped subcarriers can be expressed as follows:

FIGURE 5. Relative energy of the channel taps that are not captured in channel estimation with respect to the total channel energy.

$$h_t^i(n') = \sum_{n=0}^{L-1} h_L(n) \, e^{j\pi \left(\frac{n'}{M} - \frac{n}{N}\right)(M-1)} \operatorname{sinc}\left(M\pi \left(\frac{n'}{M} - \frac{n}{N}\right)\right). \tag{21}$$

Fig. 4(b) illustrates the magnitude response of $h_t^i(n')$, which is obtained through the channel estimation process. Notably, the majority of the energy within $h_t^i(n')$ is concentrated in the initial few samples. By applying (20), we can calculate the minimum required DMRS length, $l_r$. For this scenario, $l_r = 1 + 480/16 = 31$ samples. Moreover, from Fig. 4(b), it is evident that most of the channel energy resides within the initial 30–34 samples. Consequently, we can conclude that our proposed method for determining the DMRS length effectively captures the dominant portion of the channel energy, thereby facilitating a decent channel estimation performance.

Additionally, we quantify the extent of energy loss resulting from truncation. This evaluation involves calculating the proportion of channel energy existing beyond the encompassed energy range relative to the entire channel energy. It can be seen from Fig. 5 that when the DMRS length is set to 31 samples, the channel energy lost due to truncation is quite minimal. Furthermore, this loss diminishes if the DMRS length is further extended.
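The truncation loss described above can be illustrated with the sketch below. It does not reproduce Fig. 5: the exponentially decaying tap profile, the FFT size, and the function names are all hypothetical stand-ins for the 3GPP PDP used in the paper. The effective channel is computed exactly from (13), assuming the allocation occupies subcarriers $0$ to $M-1$.

```python
import numpy as np

def effective_channel(h_l, m, n_fft):
    """Effective impulse response on M demapped subcarriers, evaluated exactly as in (13)."""
    h_f = np.fft.fft(h_l, n=n_fft)[:m]     # channel response on subcarriers 0..M-1
    return np.fft.ifft(h_f)                # (1/M) * sum_k H(k) exp(j 2 pi k n'/M)

def truncation_loss(h_t, length):
    """Fraction of channel energy outside the first `length` samples."""
    energy = np.abs(h_t) ** 2
    return energy[length:].sum() / energy.sum()

N_FFT, M = 4096, 480
h_l = np.exp(-np.arange(N_FFT // 16) / 40.0)    # hypothetical decaying taps, L = N/16
h_t = effective_channel(h_l, M, N_FFT)
losses = [truncation_loss(h_t, length) for length in range(1, M + 1)]
loss_at_31 = truncation_loss(h_t, 31)           # loss at the length suggested by (20)
```

By construction the loss can only shrink as the retained length grows, which mirrors the trend described for Fig. 5. The actual values depend entirely on the assumed tap profile.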

#### **IV. RESULTS**

In this section, we present Block Error Rate (BLER) vs. Signal to Noise Ratio (SNR) performance plots across various simulation scenarios. These plots illustrate the block error performance of the proposed architecture, offering a comparative analysis against the current state-of-the-art. Additionally, we also provide a comparison of DMRS overhead and latency requirements. For the sake of comparison, we evaluate the proposed architecture against the DFT-s-OFDM architecture presented in [1]. This architecture is a standard in practical 5G systems, and we refer to it as the existing architecture throughout this section.

The simulation settings considered for the analysis are taken from [17], which outlines the evaluation conditions and performance metrics to be employed in conducting the performance evaluation. We have tabulated some of the simulation details in Table 1. To enable a thorough comparison, we have opted for the best possible configuration of the existing architecture. This configuration involves the allocation of 14 symbols in the time domain, with 4 symbols exclusively designated for DMRS transmission [1]. It is noteworthy that the existing architecture permits a maximum of 4 DMRS symbols. In contrast, the proposed architecture allocates  $\lfloor 1 + M/16 \rfloor$  subcarriers for DMRS within each of the 14 symbols, encompassing both the CP and the CS.

Fig. 6 illustrates the BLER performance of the proposed and existing architectures for various user speeds. The comparison involves utilizing a first-order time interpolation within the existing architecture. This interpolation is applied across DMRS symbols to deduce channel estimates for non-DMRS symbols (Data symbols). It is, nevertheless, essential to highlight that the proposed architecture does not employ any time interpolation.

As depicted in Fig. 6, both the proposed and existing designs exhibit similar performance at lower speeds. However, at higher speeds, the wireless channel experiences significant temporal variations, making it challenging for first-order (Linear) time interpolation to track these changes accurately. Consequently, this leads to inaccurate channel estimates on Data symbols, eventually resulting in a notable degradation in block error performance for the existing design, as is evident in Fig. 6. In contrast, the proposed method maintains a decent block error performance even at higher speeds.

Furthermore, since 4 DMRS symbols are considered to evaluate the existing architecture, there exists an opportunity to explore advanced interpolation techniques beyond the conventional first-order interpolation. To facilitate a comprehensive comparison, we have incorporated second and third-order interpolation techniques alongside the linear interpolator. In Fig. 7, we present the BLER performance with various interpolation techniques in comparison with the proposed architecture, particularly focusing on a user speed of 300 Kmph. Notably, at higher speeds, employing only first-order interpolation may inadequately capture the temporal variations in the channel, resulting in a degradation of BLER performance. The implementation of higher-order interpolations, specifically second and third order, showcases a significant enhancement in performance.
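To make the role of the interpolation order concrete, the following sketch fits polynomials of order 1 to 3 through four DMRS-symbol estimates of a single subcarrier and evaluates them on all 14 symbols. It is a simplified illustration under stated assumptions, not the interpolator used in the simulations: the DMRS positions [2, 5, 8, 11] are the Type-A positions quoted in this section, the channel is a hypothetical noiseless single tap with a slowly rotating phase, and the helper name `interpolate_channel` is made up.

```python
import numpy as np

def interpolate_channel(dmrs_pos, h_dmrs, all_pos, order):
    """Least-squares polynomial fit of the given order through the DMRS
    estimates (real and imaginary parts separately), evaluated on all symbols."""
    re = np.polyval(np.polyfit(dmrs_pos, h_dmrs.real, order), all_pos)
    im = np.polyval(np.polyfit(dmrs_pos, h_dmrs.imag, order), all_pos)
    return re + 1j * im

symbols = np.arange(14)
dmrs_pos = np.array([2, 5, 8, 11])      # assumed DMRS symbol indices

# Hypothetical single-tap channel whose phase rotates across the slot.
phase_per_symbol = 0.1                  # radians per symbol, illustrative value
h_true = np.exp(1j * phase_per_symbol * symbols)

errors = {}
for order in (1, 2, 3):
    h_hat = interpolate_channel(dmrs_pos, h_true[dmrs_pos], symbols, order)
    errors[order] = float(np.mean(np.abs(h_hat - h_true) ** 2))
```

In this toy setting the mean squared error shrinks as the order grows, at the cost of a larger fit per subcarrier. With noisy estimates the trade-off may differ, which is one reason the sketch should not be read as a performance claim.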

Fig. 7 highlights that the optimal performance is achieved when employing a third-order interpolation across the DMRS symbols within the existing architecture. However, it is noteworthy that the proposed architecture, despite not utilizing any time interpolation techniques, exhibits superior channel estimation performance. This not only conserves computational resources but also minimizes overall latency, as the computational complexity tends to increase with the interpolation order.

FIGURE 6. BLER performance of the proposed architecture in comparison to the existing system with linear interpolation between the DMRS symbols.

FIGURE 7. BLER performance of the DFT-s-OFDM system with different interpolation techniques in comparison with proposed architecture.

FIGURE 8. BLER performance of 16-QAM modulation at different speeds.

To determine the maximum user speeds supported by both proposed and existing architectures, we conducted a thorough analysis of the BLER performance under various user speeds. The comparison considers the implementation of first-order time interpolation within the existing architecture. Fig. 8(a) showcases the BLER performance of the proposed architecture employing 16-QAM modulation across a range of user speeds. It can be noticed from the figure that, despite not utilizing any interpolation techniques, the proposed architecture maintains satisfactory BLER performance even when subjected to speeds up to 500 Kmph. Notably, the proposed architecture can accommodate speeds of up to 750 Kmph for 16-QAM, albeit with a marginal BLER degradation.

Further, Fig. 8(b) illustrates the BLER performance of the existing design with 16-QAM modulation under varying user speeds. It can be noticed from the figure that the existing architecture effectively decodes 16-QAM constellations up to speeds of 200 Kmph but experiences significant packet losses beyond 200 Kmph.

It is crucial to highlight that, in most practical systems, the use of interpolation techniques, including linear interpolation, is often avoided to minimize complexity and processing latencies. Instead, channel estimates obtained from DMRS are replicated onto the Data symbols. While this approach reduces computational complexity and processing latencies, it compromises the quality of the channel estimates on data symbols, leading to sub-optimal block error performance, particularly for high-speed users. Fig. 9 demonstrates this phenomenon, revealing that the existing architecture introduces an error floor for speeds exceeding 160 Kmph.
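The replication approach can be sketched as follows. The text only states that DMRS estimates are replicated onto the data symbols; copying from the nearest DMRS symbol is an assumption made here for illustration, as are the function name `replicate_nearest` and the made-up estimate values.

```python
import numpy as np

def replicate_nearest(dmrs_pos, h_dmrs, n_symbols=14):
    """Copy the estimate of the closest DMRS symbol onto every symbol."""
    symbols = np.arange(n_symbols)
    distance = np.abs(symbols[:, None] - np.asarray(dmrs_pos)[None, :])
    nearest = distance.argmin(axis=1)       # index of the closest DMRS symbol
    return np.asarray(h_dmrs)[nearest]

dmrs_pos = [2, 5, 8, 11]                    # assumed DMRS symbol indices
h_dmrs = np.array([1.0, 0.8 + 0.2j, 0.5 + 0.5j, 0.1 + 0.7j])  # made-up estimates
h_slot = replicate_nearest(dmrs_pos, h_dmrs)
```

Because each data symbol reuses a stale estimate, any channel variation between the DMRS symbol and the data symbol appears directly as estimation error, which is consistent with the error floor reported at higher speeds.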

As outlined in Table 1, the performance analysis discussed in Figs. 6 to 9 is based on DMRS mapping Type-A. However, it is essential to note that the existing architecture supports an additional DMRS mapping type, namely Type-B. In order to provide a more comprehensive analysis, we have also examined the BLER performance with DMRS Type-B and compared it to Type-A and the proposed architecture. It is worth emphasizing that, similar to Type-A, we have considered the best possible configuration within DMRS Type-B for this analysis. According to this configuration, the 4 DMRS symbols are positioned at symbol numbers [0, 3, 6, 9], while for Type-A, they are positioned at symbols [2, 5, 8, 11] (see [1, Sec. 6.4.1.1.3] for details).

Fig. 10 illustrates the BLER performance of Mapping types A and B in the existing system and the proposed architecture, considering a user's speed of 200 Kmph. The figure depicts that Mapping type A closely aligns with the performance of the proposed architecture, whereas Mapping type B exhibits a notable performance degradation.

FIGURE 9. BLER performance of the proposed architecture in comparison to the existing system without interpolation between the DMRS symbols.

FIGURE 10. BLER performance comparison of the proposed architecture and existing DFT-s-OFDM system with different Mapping types at 200 Kmph.

FIGURE 11. BLER performance comparison of the proposed architecture and existing DFT-s-OFDM system with different DMRS symbol densities at 200 Kmph.

The primary cause for performance degradation in Type-B stems from the structure of the DMRS symbols. As previously discussed, time interpolation is applied across the DMRS symbols to generate channel estimates for non-DMRS symbols. In the case of Type-B mapping, the last DMRS symbol is positioned at the 9<sup>th</sup> position, necessitating extrapolation to obtain channel estimates for the final 4 symbols (i.e., symbols 10 through 13). In high-speed scenarios, extrapolation struggles to accurately capture time variations in the channel, leading to sub-optimal channel estimates that adversely affect data decoding. It is important to note that an increase in user speed also results in performance degradation for Type-A. However, the proposed architecture demonstrates satisfactory block error performance even at higher speeds, as illustrated in the figure.
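The difference between the two mapping types can be seen by listing which symbols lie outside the span of the DMRS symbols and how far they are from the nearest one. The sketch below uses the DMRS positions quoted in this section; the helper `extrapolated_symbols` and the "distance in symbols" metric are illustrative assumptions rather than quantities defined in the paper.

```python
def extrapolated_symbols(dmrs_pos, n_symbols=14):
    """Map each symbol outside [min(dmrs_pos), max(dmrs_pos)] to its
    distance (in symbols) from the nearest DMRS symbol."""
    first, last = min(dmrs_pos), max(dmrs_pos)
    outside = {}
    for s in range(n_symbols):
        if s < first:
            outside[s] = first - s
        elif s > last:
            outside[s] = s - last
    return outside

type_a = extrapolated_symbols([2, 5, 8, 11])   # {0: 2, 1: 1, 12: 1, 13: 2}
type_b = extrapolated_symbols([0, 3, 6, 9])    # {10: 1, 11: 2, 12: 3, 13: 4}
```

Both mappings leave four symbols outside the DMRS span, but Type-B has to extrapolate up to four symbols away from the last DMRS symbol, whereas Type-A never extrapolates more than two symbols.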

The performance plots in this section, up to this point, have examined scenarios involving 4 DMRS symbols. It is important to note that while the current architecture supports a maximum of 4 DMRS symbols, the actual allocation can range anywhere from 1 to 4 DMRS symbols. To enhance the comprehensiveness of our analysis, we have also explored the BLER performance with a varying number of DMRS symbols, as illustrated in Fig. 11. In this analysis, we allocated 14 symbols, utilized DMRS mapping Type-A, set the user speed at 200 Kmph, and varied the number of DMRS symbols from one to a maximum of four.

It is evident from the figure that the error performance improves with an increase in the number of DMRS symbols. Specifically, when only one DMRS symbol is utilized, time domain interpolation cannot be applied to obtain the channel on non-DMRS symbols. This results in higher block error rates. Conversely, employing more DMRS symbols provides the opportunity to utilize interpolation techniques, thereby improving decoding performance. Nevertheless, it is important to note that the increase in the number of DMRS symbols is accompanied by an increase in computational complexity and processing delays.

#### A. DMRS OVERHEAD

In this subsection, we provide a brief analysis of the DMRS overhead for both the proposed and existing designs. As explained in Section II, for an allocation size of *M* resource elements, the proposed architecture allocates approximately  $2(\alpha M + 1)$  ( $\alpha = 1/16$ ) subcarriers for DMRS, including CP and CS, within each symbol. The DMRS overhead is therefore measured relative to the total number of subcarriers per symbol. In the proposed architecture, the combined length of CP and CS is taken to be equal to the length of the DMRS. Hence, for an allocation of  $P = 14$  OFDM symbols, the DMRS overhead across all the symbols is around  $\frac{2(\alpha M+1)P}{PM} \times 100\% \approx 12.5\%$ . In contrast, the existing design dedicates 4 OFDM symbols solely to DMRS, so its DMRS overhead is  $\frac{4}{14} \times 100\% \approx 28.57\%$ .
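The overhead arithmetic above can be checked with a few lines of code. This is a minimal sketch: the function names are hypothetical, and  $M = 480$  is assumed from the DMRS-length example earlier in the paper. For finite *M* the proposed overhead is slightly above the limiting value  $2\alpha = 12.5\%$ .

```python
def proposed_overhead_pct(M, alpha=1 / 16):
    """Per-symbol DMRS overhead (DMRS plus CP/CS) as a percentage of M."""
    return 2 * (alpha * M + 1) / M * 100

def existing_overhead_pct(n_dmrs_symbols, n_symbols=14):
    """Overhead when whole OFDM symbols are dedicated to DMRS."""
    return n_dmrs_symbols / n_symbols * 100

M = 480                                   # assumed allocation size
proposed = proposed_overhead_pct(M)       # about 12.9% for M = 480
existing = existing_overhead_pct(4)       # about 28.6%
saving = existing - proposed              # difference in percentage points
```

Since every symbol of the proposed design carries the same DMRS fraction, the number of symbols cancels out and the overhead depends only on *M* and  $\alpha$ .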



FIGURE 12. BLER performance of the proposed architecture with different DMRS overheads.



FIGURE 13. Comparison of % of DMRS allocation for the existing and proposed architecture in different scenarios.

In Fig. 12, we present a comprehensive analysis of the BLER performance for the proposed architecture under different DMRS overhead conditions. Notably, an optimal equilibrium between DMRS overhead and BLER performance is obtained when the overhead is 12.5% of the allocated resources. Subsequently, when the overhead is reduced to 50% of the optimal value, an inadequacy in channel estimation leads to a noticeable degradation in BLER performance. Conversely, no improvement in performance is observed by increasing the DMRS overhead by 50% from the optimal value.

Fig. 13 presents a comparison of the DMRS overhead between the existing and the proposed architectures. The analysis includes two scenarios. In Scenario 1, the existing design employs an allocation size of 14 OFDM symbols with 4 symbols dedicated to DMRS, while the proposed architecture uses DMRS in every symbol. However, when prior knowledge indicates a low user speed with slow channel variations across symbols, the proposed architecture can be further optimized. In Scenario 2, the proposed architecture is optimized by configuring DMRS on every alternate symbol rather than each symbol, thereby reducing processing delays and DMRS overhead. Further, it is noteworthy that the slowly varying nature of the channel allows channel estimates from DMRS-containing symbols to be immediately applied to subsequent symbols without DMRS.

Likewise, the existing design can also be optimized with fewer than 4 DMRS symbols. Therefore, in Scenario 2, the proposed architecture uses 14 OFDM symbols with DMRS in alternate symbols, while the existing design employs 14 symbols with 2 DMRS symbols. Fig. 13 demonstrates the superior performance of the proposed architecture over the existing one in terms of DMRS resource consumption. In essence, the proposed architecture facilitates more efficient resource allocation for data transmission, resulting in considerably higher throughputs compared to the existing designs with the same allocation size.
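As a rough worked example of Scenario 2, assume that the per-symbol overhead of the proposed design stays at about 12.5% and that DMRS is present in 7 of the 14 symbols. These are back-of-the-envelope values derived from the stated configurations, not numbers read from Fig. 13:

$$\text{proposed: } \frac{7}{14} \times 12.5\% = 6.25\%, \qquad \text{existing: } \frac{2}{14} \times 100\% \approx 14.29\%$$

Under these assumptions, the proposed architecture retains a lower DMRS overhead than the existing design in the low-speed configuration as well.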

## **B. LATENCY**

5G-NR is designed to support applications with latencies of less than 1 ms. In [18], the authors identified use cases needing microsecond-range latencies. The current 5G-NR design transmits data and DMRS in separate symbols, necessitating  $\geq 2$  symbols for data communication, thus introducing  $\geq 1$  symbol additional latency. The proposed method multiplexes data and DMRS in one symbol, facilitating data communication with just one symbol, which is particularly beneficial for low-latency applications. Furthermore, the existing systems halt data decoding until DMRS is received and processed, whereas the proposed architecture enables instant data decoding.

## C. PRE-EMPTION SCENARIO

The proposed method also proves valuable in pre-emption scenarios, where some or all of the time-frequency resources originally designated for a non-latency-critical user, such as an eMBB user, are reallocated to serve a latency-critical user. In this situation, the performance of the pre-empted user is significantly affected, primarily because some of the resources that the user assumes to carry its own data suddenly hold data for another device.

For instance, consider an eMBB user that is initially allotted *P* symbols, some of which are set aside for DMRS. As a result of pre-emption, some of these symbols, including those intended for the DMRS of the eMBB user, are later reassigned to a latency-critical device. Any pre-emption of DMRS symbols not only affects the pre-empted symbols but also impairs data demodulation on the remaining non-pre-empted symbols due to inaccurate channel estimates. However, the proposed method, where each symbol has its dedicated DMRS, limits the impact of pre-emption solely to the pre-empted symbols.

## D. LIMITATIONS OF PROPOSED ARCHITECTURE

Although the proposed DFT-s-OFDM architecture demonstrates satisfactory block error performance for lower-order modulation schemes, such as 16-QAM, even at higher user speeds, it falls short in accommodating higher-order constellations like 64-QAM, as illustrated in Fig. 14.

FIGURE 14. BLER performance of the proposed architecture and DFT-s-OFDM systems for 16-QAM, and 64-QAM modulation at 200 Kmph.

The figure compares the performance of both the proposed and existing architectures using 16-QAM and 64-QAM constellations at a user speed of 200 Kmph. Notably, the proposed architecture successfully decodes the 16-QAM constellation with slightly improved performance compared to the existing architecture. However, it encounters substantial packet losses when dealing with the higher order constellation, i.e., the 64-QAM constellation.

These limitations stem primarily from the following reasons.

- Intra-symbol leakages: In contrast to the traditional systems, where data and DMRS are transmitted on distinct OFDM symbols, the proposed architecture multiplexes data and DMRS within the same symbol. This introduces a higher likelihood of DMRS samples leaking into data and vice versa. Despite the incorporation of CP and CS to mitigate intra-symbol leakages, there remains the possibility of minimal leakages that can potentially affect the block error performance in higher-order constellations.
- Frequency interpolation: As outlined in the receiver section, the channel impulse response, estimated using the  $l_r$  length DMRS sequence, undergoes interpolation to derive *M* length frequency domain channel estimates. These estimates are then used to equalize the *M* length data samples. However, this frequency interpolation introduces a risk of degradation in channel estimation performance, adding another layer to the challenges faced in supporting higher-order constellations.
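The frequency interpolation step in the second item can be sketched as follows. The receiver section is not reproduced here, so the snippet assumes one common realization, namely zero-padding the  $l_r$ -length impulse-response estimate to *M* samples and taking an *M*-point DFT. The function name `freq_interpolate` and the tap values are hypothetical, and the authors' receiver may differ.

```python
import numpy as np

def freq_interpolate(h_time, M):
    """Zero-pad an l_r-length impulse-response estimate to M samples and
    take an M-point DFT to obtain M frequency-domain channel estimates."""
    h_time = np.asarray(h_time, dtype=complex)
    padded = np.zeros(M, dtype=complex)
    padded[: h_time.size] = h_time
    return np.fft.fft(padded)

M, l_r = 480, 31                           # values from the example scenario
h_time = np.zeros(l_r, dtype=complex)
h_time[[0, 3, 7]] = [1.0, 0.5j, -0.25]     # made-up sparse taps
H = freq_interpolate(h_time, M)            # M-length frequency-domain estimates
```

Any channel energy beyond the first  $l_r$  samples is discarded before the DFT, so truncation and estimation errors are spread over all *M* subcarriers, which is one way to see why denser constellations are more sensitive to this step.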

## **V. CONCLUSION**

This paper addresses the limitations of the existing DFT-s-OFDM architecture in supporting specific 5G-NR applications that require low latency and high-speed capabilities. We introduced an improved DFT-s-OFDM architecture that involves multiplexing data and DMRS within the same DFT-s-OFDM symbol, enabling instantaneous channel estimation and subsequent data demodulation. Furthermore, we have presented a step-wise approach for determining the optimal DMRS length for the proposed system. We have also demonstrated that setting the DMRS length to this determined value provides an optimal balance between the DMRS overhead and BLER performance. Through block error performance results, we have demonstrated that the proposed architecture, with 16% fewer DMRS resources than the existing architecture, supports user speeds up to 750 Kmph. It is noteworthy that the existing architectures experience a significant degradation in BLER performance at higher user speeds, particularly above 200 Kmph. Nonetheless, the proposed architecture faces a constraint in accommodating higher-order constellations due to challenges in channel estimation accuracy. Exploring this limitation further and devising potential solutions remains a promising avenue for future research.

#### REFERENCES

- [1] "NR; Physical channels and modulation; (Release 15), Version 17.4.0," 3GPP, Sophia Antipolis, France, Rep. TS 38.211, Jan. 2023.
- [2] E. Mozo, A. A. Gómez, F. Parrila, and M. Mendicute, "Performance analysis of pilot patterns for channel estimation for OFDM systems in high-speed trains scenarios," in *Proc. IEEE 30th Int. Symp. Pers., Indoor Mobile Radio Commun. (PIMRC)*, 2019, pp. 1–7, doi: 10.1109/PIMRCW.2019.8880812.
- [3] G. Noh, B. Hui, J. Kim, H. S. Chung, and I. Kim, "DMRS design and evaluation for 3GPP 5G new radio in a high speed train scenario," in *Proc. IEEE Global Commun. Conf. (GLOBECOM)*, 2017, pp. 1–6, doi: 10.1109/GLOCOM.2017.8254568.
- [4] T. Thi Huong and N. Duc Anh, "Doppler frequency compensation basing the velocity of train and high-speed railway scenario," in *Proc. Int. Conf. Adv. Technol. for Commun. (ATC)*, 2019, pp. 155–159, doi: 10.1109/ATC.2019.8924519.
- [5] Y. Wu, C. Han, and Z. Chen, "DFT-spread orthogonal time frequency space system with superimposed pilots for Terahertz integrated sensing and communication," *IEEE Trans. Wireless Commun.*, vol. 22, no. 11, pp. 7361–7376, Nov. 2023, doi: 10.1109/TWC.2023.3250267.
- [6] F. J. Martín-Vega and G. Gómez, "A low-complexity pilot-based frequency-domain channel estimation for ICI mitigation in OFDM systems," *Electronics*, vol. 10, no. 12, p. 1404, 2021, doi: 10.3390/electronics10121404.
- [7] M. S. A. Khan, K. Rao, S. Amuru, and K. Kuchi, "Low PAPR DMRS sequence design for 5G-NR uplink," in *Proc. IEEE Int. Conf. Commun. Syst. Net.*, 2020, pp. 207–212, doi: 10.1109/COMSNETS48256.2020.9027415.
- [8] M. S. Ali Khan, K. Rao, S. Amuru, and K. Kuchi, "Low PAPR reference signal transceiver design for 3GPP 5G NR uplink," *J. Wireless Commun. Netw.*, vol. 2020, p. 182, Sep. 2020, doi: 10.1186/s13638-020-01787-1.
- [9] "Study on new radio access technology-physical layer aspects; (Release 14), Version 14.2.0," 3GPP, Sophia Antipolis, France, Rep. TR 38.802, Sep. 2017.
- [10] J.-C. Sibel, "Pilot-based phase noise tracking for uplink DFT-s-OFDM in 5G," in *Proc. 25th Int. Conf. Telecommun. (ICT)*, 2018, pp. 52–56, doi: 10.1109/ICT.2018.8464891.
- [11] "Pre-DFT multiplexing of RS and data: Results on short duration one OFDM symbol uplink," IITH, Sangareddy, Telangana, document TSG RAN WG1 Meeting #88, 3GPP TR R1-1701913, Feb. 2017.
- [12] T. P. Chandrashekhar, S. Amuru, J. Nair, and A. Guchhait, "Multiplexing reference signals and data in a DFT-s-OFDM symbol," in *Proc. Int. Conf. Signal Process. Commun. (SPCOM)*, 2018, pp. 277–281, doi: 10.1109/SPCOM.2018.8724483.

- [13] A. Sahin, E. Bala, R. Yang, and R. L. Olesen, "DFT-spread OFDM with frequency domain reference symbols," in *Proc. IEEE Global Telecommun. Conf. (GLOBECOM)*, 2017, pp. 1–6, doi: 10.1109/GLOCOM.2017.8254241.
- [14] A. Bouttier, "A flexible OFDM-like DFT-s-OFDM reference symbol," in Proc. 10th Adv. Satell. Multimedia Syst. Conf. 16th Signal Process. Space Commun. Workshop (ASMS/SPSC), 2020, pp. 1–7, doi: 10.1109/ASMS/SPSC48805.2020.9268819.
- [15] J.-J. Van De Beek, O. Edfors, M. Sandell, S. K. Wilson, and P. O. Borjesson, "On channel estimation in OFDM systems," in *Proc. IEEE 45th Veh. Technol. Conf. Countdown Wireless 31st Century*, 1995, pp. 815–819, doi: 10.1109/VETEC.1995.504981.
- [16] H. Xie, Y. Wang, G. Andrieux, and X. Ren, "Efficient compressed sensing based non-sample spaced sparse channel estimation in OFDM system," *IEEE Access*, vol. 7, pp. 133362–133370, 2019, doi: 10.1109/ACCESS.2019.2941152.
- [17] "NR; NR support for high speed train scenario in frequency range 2; (Release 16), Version 17.1.0," 3GPP, Sophia Antipolis, France, Rep. TS 38.854, Jun. 2022.
- [18] H. Tataria, M. Shafi, A. F. Molisch, M. Dohler, H. Sjöland, and F. Tufvesson, "6G wireless systems: Vision, requirements, challenges, insights, and opportunities," *Proc. IEEE*, vol. 109, no. 7, pp. 1166–1199, Jul. 2021, doi: 10.1109/JPROC.2021.3061701.



**KOTESWARA RAO GUDIMITLA** received the B.Tech. degree in electrical and electronics engineering from the Sri Venkateswara University College of Engineering, Tirupati, India, in 2015. He is currently pursuing the Ph.D. degree with the Indian Institute of Technology Hyderabad, Hyderabad. His research interests include physical-layer algorithms and the transceiver design for fifth-generation advanced systems.



**M. SIBGATH ALI KHAN** received the B.Tech. degree from JNTU Hyderabad in 2009, and the Ph.D. degree in electrical engineering from the Indian Institute of Technology (IIT) Hyderabad in 2021. From 2012 to 2017, he worked with the Cyber-Physical Systems Lab, IIT Hyderabad, where his responsibilities encompassed the design and real-time implementation of Physical layer algorithms for 4G and 5G systems. In 2018, he joined WiSig Networks (incubated at IITH), where he is currently a Senior Lead Engineer, taking charge of designing and developing L1 and L2 algorithms for Massive MIMO Systems. He holds over ten patents and is currently working towards developing a real-time 5G system.



**SAIDHIRAJ AMURU** (Member, IEEE) received the B.Tech. degree from the Indian Institute of Technology Madras in 2009, and the Ph.D. degree in electrical and computer engineering from Virginia Tech in 2015. From 2009 to 2011, he was with Qualcomm, India, as a Modem Engineer. He is the Head of the Research and Development with WiSig Networks, working on 4G and 5G cellular systems. Also, he is an Adjunct Assistant Professor with the Indian Institute of Technology Hyderabad (IITH), where he teaches courses and leads the physical layer design and research in the 5G Testbed project. He represents IITH in 3GPP and ITU-R-WP5D (a United Nations group) meetings and represents WiSig Networks in the ORAN alliance and TSDSI meetings. He was awarded the Exemplary Reviewer Award for the IEEE WIRELESS COMMUNICATIONS LETTERS in 2019 and won the Best Paper Award (Honorable Mention) at COMSNETS 2020. He was given the IP Creator Award for 2017 at Samsung for creating the most valuable patents.



**KIRAN KUCHI** (Member, IEEE) received the B.Tech. degree in electronics and communications engineering from the Sri Venkateswara University College of Engineering, Tirupati, India, in 1995, and the M.S. and Ph.D. degrees in electrical engineering from The University of Texas at Arlington, Arlington, TX, USA, in 1997 and 2006, respectively. From 2000 to 2008, he was with Nokia Research, Irving, TX, USA, where he contributed to the development of a global system for mobile communication/EDGE, WiMax, and long-term evolution systems. From 2008 to 2011, he was with the Centre of Excellence in Wireless Technology, where he led fourth-generation research and standardization efforts. He was also an Adjunct Faculty Member with the Department of Electrical Engineering, Indian Institute of Technology (IIT) Madras, Chennai, India. He is currently a Professor with the Department of Electrical Engineering, IIT Hyderabad, Hyderabad, India. He holds more than 20 U.S. patents. His current research interests include physical-layer algorithms and the development of prototypes for fifth-generation systems.