Introduction
Utilizing a substantial number of antennas at the transmitter, massive multiple-input multiple-output (MIMO) can significantly enhance network capacity, spectral efficiency, and overall signal coverage for wireless communication networks [1], [2]. An effective approach to mitigate hardware expenses and reduce power consumption in massive MIMO is to utilize lower resolutions (such as 1, 2, or 3 bits) for analog-to-digital converters (ADCs). The complexity and energy consumption of an ADC both increase exponentially with resolution [3], since the number of comparators in a k-bit ADC grows exponentially with k. Consequently, lower-resolution ADCs are significantly less expensive and consume less power than higher-resolution ADCs. When lower-resolution ADCs are employed, the related RF chain elements can also be streamlined or eliminated. For example, 1-bit ADCs do not require automatic gain control, as they only preserve the sign of the real and imaginary components of the received signals. The use of low-resolution ADCs in real-world massive MIMO communication networks therefore offers significant advantages on the hardware front [4]. However, the use of purely low-resolution ADCs can hinder overall performance, leading to error floors in linear multi-user detection [5], degradation of data rates in higher SNR regions [6], and difficulties in channel estimation (CE) [7]. Therefore, it is crucial to develop effective signal-processing techniques for data detection and CE in these systems to facilitate their transition into commercially viable solutions.
Numerous studies have addressed CE in massive MIMO systems, with particular attention to the use of 1-bit ADCs in various environments [8], [9], [10], [11], [12], [13]. For instance, performance limitations of mmWave massive MIMO CE with 1-bit ADCs were revealed in [8]. The study in [9] introduced a 1-bit Bussgang-aided minimum mean-squared error (BMMSE) CE method utilizing the Bussgang decomposition. Furthermore, the authors in [10] investigated angular-domain CE for 1-bit massive MIMO. The work in [14] proposed a variational Bayesian-sparse Bayesian learning-based CE algorithm for the multi-user massive MIMO system where hybrid analog-digital processing and low-resolution ADCs are utilized at the BS. In [11], researchers studied multi-cell analysis, considering pilot interference and spatially/temporally correlated channels. CE using maximum a posteriori and maximum likelihood estimation was explored in [12] and [13], respectively, focusing on sparse mmWave MIMO communication systems. For CE in a massive MIMO system, the authors in [15] first characterized the target parameter information and channel state information from the standpoint of sensing and communication channels in a single framework. The framework works well for time-varying channels, frequency-selective channels, and beam squint effects. Regarding CE with 1-bit ADCs, researchers in [16] presented an amplitude retrieval technique that restores the missing amplitudes and recovers the direction of arrival, facilitating CE. While the aforementioned ADC structure can reduce hardware expenses and energy consumption, the lower-resolution ADC also limits the performance of the transceiver, particularly in CE scenarios with less pilot overhead.
Recently, deep learning (DL) has demonstrated effective performance in wireless communication systems for CE [17], [18], [19], pilot design [4], [20], [21], [22], data detection [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], and channel state information feedback [33], [34], [35], [36]. For CE in a 1-bit massive MIMO system, DL techniques have been shown to be more effective than traditional methods. Significant research has recently been conducted on DL-based CE for 1-bit and mixed-resolution ADCs [37], [38], [39], [40], [41], [42], [43], [44], [45], [46]. Specifically, to calculate the channel matrix from 1-bit quantized received signals, a conditional generative adversarial network (cGAN) was constructed in [39]. This system outperforms basic neural networks (NNs), such as the naive convolutional neural network (CNN), as well as conventional CE methods. In [40], a deep neural network (DNN)-based CE and a learned signal configuration for lower-resolution quantization in MIMO networks were developed. However, the NN performs poorly due to its straightforward fully connected structure. Furthermore, the relationship between more antennas and fewer pilots is investigated in the multilayer perceptron (MLP)-based CE for large MIMO systems with 1-bit ADCs in [41]. The authors in [42] proposed a two-stage model-driven system called OBMNet for efficient data detection in the massive MIMO system. The proposed model structure is based on the DNN architecture and exhibits excellent performance compared to existing methods. A joint pilot design and CE strategy for mixed-ADC massive MIMO based on DL is presented in [43]. They construct a pilot layout NN whose weights explicitly reflect optimized pilots and establish a meticulously connected model driven by a Runge-Kutta method for CE.
In [44], a DNN-based CE for mixed-ADC massive MIMO systems was proposed. To simplify the input signals of all antennas fed into the fully connected layers of the DNN, a directed-input model called DI-DNN was proposed. Additionally, the authors in [45] proposed a two-phase DNN model with mixed-ADC and low-ADC for CE in the uplink MIMO system. In the first phase, a recovering DNN model is used for coarse CE with fewer-ADC antennas, while in the second phase, a refining DNN is employed for CE with all antennas. In [46], the authors proposed a modified DNN-based CE with a mixed-ADC architecture, where the majority of the antennas are fitted with low-resolution ADCs and the remainder with high-resolution ADCs. However, as mentioned in [31], the recurrent neural network (RNN) is a DL model better suited than the DNN and CNN for handling periodic and sequential data.
An RNN extends the feed-forward NN with recurrent connections and can process input sequences of varying lengths over time. RNNs utilize past information to predict future outcomes by retaining memories of past events. An RNN variant known as long short-term memory (LSTM) is designed with specialized gating techniques to control access to memory cells [47]. RNNs such as the LSTM are commonly used to address sequence problems. The LSTM employs a gate architecture instead of the hidden unit of a standard RNN, enabling it to select and retain crucial data while disregarding irrelevant data. Using the LSTM mechanism to solve CE with 1-bit ADCs in massive MIMO systems, the authors in [48] designed an integrated model named LSTM-gated recurrent unit (GRU), demonstrating higher CE accuracy than other existing models. In contrast to the LSTM, the bidirectional LSTM (BiLSTM) network enables bidirectional data flow, making it advantageous for sequence classification. Compared to the LSTM, the BiLSTM delivers improved accuracy by utilizing input from both past and future time steps simultaneously. This design captures additional input features, thereby enhancing learning performance as the flow of data in the BiLSTM network is improved [49]. The study in [50], which compared the two models, found that the BiLSTM extracts features better than the LSTM. For MIMO OFDM communication, the authors in [51] compared the BiLSTM approach with CNNs and DNNs and concluded that the BiLSTM produces more accurate CE results than the other models. To recover the transmitted data, the study in [32] proposed multiuser uplink CE and signal detection in NOMA-OFDM systems based on the BiLSTM model. Yu et al. [52] presented a stacked BiLSTM framework for RIS-assisted unmanned aerial vehicle communication systems. However, the BiLSTM faces challenges in complexity, data dependence, and hyperparameter tuning.
Despite this, BiLSTM networks offer advantages for CE by capturing sequential context through bidirectional information flow. In addition, automatic feature representation and context awareness make the BiLSTM promising for CE tasks. In this article, effective CE using 1-bit ADCs in massive MIMO is addressed through a DL model based on the BiLSTM. The proposed BiLSTM model exhibits improved performance due to its two LSTM units, which operate in the forward and backward directions. The rationale behind using the BiLSTM lies in its ability to embed knowledge into long short-term memory, which aids in retaining crucial information. Moreover, the bidirectional mechanism of the proposed model enables training on more data while preserving more information, thus enhancing overall CE performance.
The studies reviewed above show that CE with 1-bit ADCs is strongly affected when fewer pilot signals are employed. However, the BiLSTM can be an effective approach to improving CE performance with 1-bit ADCs in massive MIMO, particularly in scenarios with fewer pilots and more available antennas. Motivated by this research gap and by the advantages of the BiLSTM over other DL frameworks in the literature, this paper proposes an effective BiLSTM-based approach for CE using fewer pilots with 1-bit ADCs in massive MIMO systems. The proposed approach first involves the design of the massive MIMO system. Subsequently, the transmission frame, comprising both pilot and data tones, is sent through the uplink channel. The next step involves offline training of the BiLSTM model using the real and imaginary values of the received data. The effectiveness of the proposed model, with reduced pilot overhead, is then observed during the online phase.
The main contributions to the suggested framework can be summed up as follows:
The problem of uplink CE using 1-bit ADCs in massive MIMO is addressed through the implementation of an effective DL model known as BiLSTM. This framework learns to map incoming quantized measurements (QM) to the channels by leveraging DL techniques and past channel estimation data. To establish this mapping, an appropriate pilot sequence (PS) length and architecture are determined, ensuring the existence of these QMs. It is worth noting that a reduced number of pilots is required for a specific set of user locations when more antennas are employed to ensure the existence of this mapping. This might seem counterintuitive, but it indicates that as the number of base station (BS) antennas increases, fewer pilots are necessary for CE. This is supported by the fact that the QM vectors corresponding to different channels become more distinct with an increased number of antennas. Consequently, there is a decreased probability of errors when effectively matching them to their respective channels.
Unlike previous studies, our proposed model learns long sequences of input data in both directions of the hidden layers, thereby improving overall training performance. Consequently, the proposed model achieves better CE accuracy, measured by the normalized mean-squared error (NMSE), in scenarios with more BS antennas and lower pilot overhead.
To observe the effectiveness of the proposed model, we conduct simulations involving various analyses and compare its performance against other existing methods. This comparison is based on measuring NMSE across different antenna numbers, signal-to-noise ratios (SNR), and pilot lengths. The simulation results reveal that the proposed model consistently outperforms other methods across all analyses. Furthermore, in order to maximize the model’s efficacy, we fine-tune its hyperparameters by experimenting with different learning rates, three optimization algorithms, and varying minibatch sizes during the training phase. Consequently, we identify the optimal parameter settings that significantly enhance the efficiency of CE performance.
The rest of the paper is organized as follows: the proposed system model is presented in Section II. Section III describes the channel modeling and data transmission. In Section IV, the proposed model with channel estimation is described. The simulation outcome of the proposed system is presented in Section V. Finally, Section VI concludes the paper.
Notations: The boldface letter in lower case and upper case denotes a vector and a matrix, respectively; the subscript on the lowercase letter
System Model
In the proposed system, we consider the uplink of a massive MIMO system with 1-bit ADCs, where a single-antenna user equipment (UE), denoted U, is considered and the BS is equipped with G antennas. The proposed system model is illustrated in Fig. 1, where the channel is estimated through uplink training and is then utilized for downlink data transmission in a time division duplexing mechanism. Consider N as the pilot sequence length and the U UE sends an uplink PS of \begin{equation*} {\mathbf {y}}= \rm {sgn}({\mathbf {h}} {\mathbf {x}} ^{T} + {\mathbf {s}}), \tag {1}\end{equation*}
\begin{align*} \mathrm {sgn}(x)=\begin{cases} \displaystyle 1, & \text {if} \, x \geq 0 \\ \displaystyle -1, & \text {otherwise}. \end{cases} \tag {2}\end{align*}
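The quantization in (1) and (2) can be illustrated with a minimal sketch. It assumes that sgn is applied separately to the real and imaginary parts of each received sample, as described for 1-bit ADCs in the Introduction. The function names and the nested-list layout are illustrative choices, not part of the original system description.

```python
def sgn(x: float) -> float:
    """Eq. (2): +1 for x >= 0, -1 otherwise."""
    return 1.0 if x >= 0 else -1.0


def one_bit_quantize(z: complex) -> complex:
    """Apply sgn separately to the real and imaginary parts of one sample."""
    return complex(sgn(z.real), sgn(z.imag))


def quantized_measurements(h, x, s):
    """Eq. (1): y = sgn(h x^T + s), evaluated element by element.

    h: list of G complex channel coefficients (one per BS antenna)
    x: list of N complex pilot symbols
    s: G x N nested list of complex noise samples
    """
    return [[one_bit_quantize(h[g] * x[n] + s[g][n]) for n in range(len(x))]
            for g in range(len(h))]
```

Each entry of the returned G x N matrix carries only two bits of information (one sign per real dimension), which is what makes CE from y difficult.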
The widespread usage of 1-bit ADCs in base station receivers for massive MIMO systems. A DL model that estimates the channel vector
Channel Modeling and Data Transmission
For channel model h, we utilize a generic geometric channel and assume that there are P possible pathways for the signal to travel before reaching the BS from the UE. Each pathway p includes an angle of arrival \begin{equation*} {\mathbf {h}} = \sum \limits _{p =1}^{P}\theta _{p }{\mathbf {a}}(\phi _{p}). \tag {3}\end{equation*}
The formulation of BS array response vector \begin{equation*} \mathbf {a}(\phi _{p})\!=\!\frac {1}{\sqrt {P}}\left [{{1,e^{-j2\pi \frac {d}{\lambda }\!\sin (\phi _{p})},\ldots,e^{-j2\pi \frac {d}{\lambda }(M\!-\!1)\!\sin (\phi _{p})}}}\right ]^{T}\!,\! \tag {4}\end{equation*}
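As a hedged illustration of (3) and (4), the following sketch builds a channel vector from a list of path gains and angles of arrival. The half-wavelength spacing (d/λ = 0.5) is an assumption made only for this example, and the function and argument names are hypothetical.

```python
import cmath
import math


def array_response(phi, num_antennas, num_paths, d_over_lambda=0.5):
    """Eq. (4): array response vector for an angle of arrival phi (radians)."""
    scale = 1.0 / math.sqrt(num_paths)
    return [scale * cmath.exp(-2j * math.pi * d_over_lambda * m * math.sin(phi))
            for m in range(num_antennas)]


def geometric_channel(gains, angles, num_antennas):
    """Eq. (3): h = sum over paths p of theta_p * a(phi_p)."""
    num_paths = len(gains)
    h = [0j] * num_antennas
    for theta_p, phi_p in zip(gains, angles):
        a = array_response(phi_p, num_antennas, num_paths)
        h = [h_m + theta_p * a_m for h_m, a_m in zip(h, a)]
    return h
```

For example, `geometric_channel([1 + 0j], [0.0], 4)` models a single broadside path, for which every antenna sees the same phase.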
The downlink beamforming of \begin{equation*} \mathbf {C_{b}}=\frac {\hat {\mathbf {h^{*}}}}{||\hat {\mathbf {h}}||}. \tag {5}\end{equation*}
\begin{equation*} \textsf {SNR}_{\mathrm {ant}}=\frac {\gamma }{G} \frac {\left |{{\widehat {\mathbf {h}}^{H} {\mathbf {h}}}}\right |^{2}}{\|\widehat {\mathbf {h}}\|^{2}}, \tag {6}\end{equation*}
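The beamformer in (5) and the per-antenna SNR in (6) can be evaluated directly from an estimated channel. The sketch below is a minimal illustration; γ is treated as a given scalar because its definition is not recoverable from the surrounding text, and the function names are hypothetical.

```python
import math


def downlink_beamformer(h_hat):
    """Eq. (5): conjugate of the estimated channel, normalized to unit norm."""
    norm = math.sqrt(sum(abs(v) ** 2 for v in h_hat))
    return [v.conjugate() / norm for v in h_hat]


def snr_per_antenna(h_hat, h, gamma):
    """Eq. (6): (gamma / G) * |h_hat^H h|^2 / ||h_hat||^2."""
    G = len(h)
    inner = sum(a.conjugate() * b for a, b in zip(h_hat, h))
    norm_sq = sum(abs(a) ** 2 for a in h_hat)
    return gamma / G * abs(inner) ** 2 / norm_sq
```

With a perfect estimate the expression reduces to γ‖h‖²/G, while an estimate orthogonal to the true channel gives zero, so (6) directly reflects CE quality.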
Proposed Model Based Channel Estimation
Utilizing the highly quantized signal y, we aim to develop an effective CE approach for generating the channel vector h. Our objective is to establish a CE technique that minimizes the NMSE between the predicted channel
For massive MIMO systems employing lower-resolution ADCs, traditional CE approaches such as [7] attempt to compute the channel from the quantized received signal alone, without leveraging previous measurements. However, the channels are effectively functions of various environmental factors, such as geometry, resource allocation, and transmitter/receiver placements [53]. This implies that comparable channels will probably be seen more than once by a BS deployed in a given environment. In the proposed study, the mapping from the received quantized measurements to the channels is learned by utilizing the DL model and prior channel estimation data. Prior experience can therefore be utilized to find the fundamental relationship between the channels and the quantized received signals, which may result in a large reduction in the pilot length. The research presented in [54] showed that there are significant correlations between the channels of adjacent subcarriers. The received signals become highly correlated and their practical performance is degraded if two adjacent subcarriers are simultaneously dedicated to the pilot. Therefore, to obtain higher diversity gain and less correlation between subcarriers, the authors used the widely adopted uniform pilot placement. This paper proposes utilizing the BiLSTM model to establish the mapping between the quantized measurement matrix y and h. In the subsequent section, we first define the conditions under which this mapping exists before highlighting an intriguing finding: increasing the number of antennas reduces the required number of pilots.
A. Mapping from Quantized Measurements to Channels
We assume an indoor or outdoor environment with the massive MIMO system described in Section II, where the BS serves a single-antenna UE. Consider the channel of candidate set \begin{equation*} \boldsymbol {\psi }~:~\{ {\mathbf {y}}\} \rightarrow \{ {\mathbf {h}}\}. \tag {7}\end{equation*}
The channel vector h can be predicted from the QM matrices y if this mapping has been identified. Therefore, we aim to establish the existence of the aforementioned mapping in Hypothesis 1 and explain the procedure to enhance our understanding of it.
Hypothesis 1: The system and channel of the proposed study are as follows:\begin{align*} {\mathbf {y}}& = \rm {sgn}({\mathbf {h}} {\mathbf {x}} ^{T} + {\mathbf {s}}), \tag {8}\\ {\mathbf {h}} & = \sum \limits _{p =1}^{P}\theta _{p }{\mathbf {a}}(\phi _{p}). \tag {9}\end{align*}
In equation (8), let the value of \begin{align*} {\theta } = \min \limits _{\substack {\forall {\mathbf {h}} _{\mathrm { u}}, {\mathbf {h}}_{\mathrm { v}} \in \{ {\mathbf {h}}\} \\ u \neq v}} ~\max \limits _{\forall g}\left |{{\angle \left [{{ {\mathbf {h}}_{\mathrm { u}}}}\right ]_{\mathrm { g}} - \angle \left [{{ {\mathbf {h}}_{\mathrm { v}}}}\right ]_{\mathrm { g}}}}\right |. \tag {10}\end{align*}
The mapping function
According to Hypothesis 1, once the PS of x is constructed following the specified structure outlined in the proposal, there exists a one-to-one mapping
B. Analysis: Fewer Pilots are Needed as There Are More Antennas
According to Hypothesis 1 and its demonstration, the PS should be long enough to ensure that every pair of channels in the candidate set yields two distinct QM matrices. It makes sense that, with more antennas installed at the BS and an identical uplink PS length, the probability that these antennas produce distinct measurement matrices increases. It is interesting to note that additional antennas improve CE, as shown in the simulation results in Section V. This indicates that fewer pilots are required if more antennas are used at the BS to ensure a one-to-one (bijective) mapping from h to y. Analytical descriptions of this intriguing relationship are feasible for various channel models. The following Corollary 1 demonstrates that fewer pilots are required when more antennas are added in the LOS channel model. Corollary 1: It is assumed that a BS has a single-path channel scheme (\begin{equation*} N=\left \lceil {{\frac {1}{(G-1)(4 \sin ^{2} (\tau \phi /2))}}}\right \rceil. \tag {11}\end{equation*}
The proof of (11) is explained in the study by [41], and it is implied by Hypothesis 1. The intriguing advantage of our suggested BiLSTM strategy becomes apparent in Corollary 1, which states that a greater number of antennas require fewer pilots to ensure the presence of
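The intuition behind Corollary 1 can be checked numerically with a small sketch. It assumes a noise-free single-path channel with unit gain, a single pilot x = 1, half-wavelength antenna spacing, and a hypothetical grid of 25 user directions; none of these values come from the original setup. The sketch counts how many candidate channels produce distinct 1-bit sign patterns as the number of BS antennas G grows.

```python
import cmath
import math


def sign_pattern(phi, G):
    """Noise-free 1-bit measurement of a single-path channel with one pilot x = 1."""
    pattern = []
    for g in range(G):
        r = cmath.exp(-1j * math.pi * g * math.sin(phi))  # d / lambda = 0.5 (assumption)
        pattern.append((r.real >= 0, r.imag >= 0))        # signs kept by the 1-bit ADCs
    return tuple(pattern)


def distinct_patterns(angles, G):
    """Number of candidate channels that remain distinguishable after quantization."""
    return len({sign_pattern(phi, G) for phi in angles})


# 25 hypothetical user directions between -60 and 60 degrees.
angles = [math.radians(a) for a in range(-60, 61, 5)]
for G in (1, 2, 4, 8, 16, 32):
    print(G, distinct_patterns(angles, G))
```

Because the pattern for G antennas is a prefix of the pattern for any larger array, the number of distinguishable channels can only stay the same or grow with G. This is the mechanism that allows the pilot length to shrink as antennas are added.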
C. Proposed Deep Learning Model
In this study, to effectively calculate the channel matrix, a time series-based model called BiLSTM is used. It is seen from the equation (1) that all of the parameters such as \begin{equation*} {\mathbf {y}}= \rm {sgn}({\mathbf {h}} {\mathbf {x}} + {\mathbf {s}}), \tag {12}\end{equation*}
\begin{align*} \mathbf {y} & = \left [{{\Re _{e} \lbrace {\mathbf {y}}\rbrace, \Im _{m} \lbrace {\mathbf {y}}\rbrace }}\right ],~\mathbf {h}= \left [{{\Re _{e} \lbrace {\mathbf {h}}\rbrace, \Im _{m} \lbrace {\mathbf {h}}\rbrace }}\right ], \\ \mathbf {s} & = \left [{{\Re _{e} \lbrace {\mathbf {s}}\rbrace, \Im _{m} \lbrace {\mathbf {s}}\rbrace }}\right ],~\mathbf {x} = \begin{bmatrix}\Re _{e} \lbrace {\mathbf {x}}\rbrace & \Im _{m} \lbrace {\mathbf {x}}\rbrace \\ -\Im _{m} \lbrace {\mathbf {x}}\rbrace & \Re _{e} \lbrace {\mathbf {x}}\rbrace \end{bmatrix}.\end{align*}
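The real-valued decomposition above can be verified on a single antenna and a single pilot symbol. The sketch below is an illustrative check with hypothetical helper names, showing that multiplying the stacked row [Re h, Im h] by the 2 x 2 pilot matrix reproduces the real and imaginary parts of the complex product h·x.

```python
def to_real_row(z: complex):
    """Stack a complex scalar as [Re z, Im z]."""
    return [z.real, z.imag]


def pilot_real_matrix(x: complex):
    """Real-valued 2 x 2 representation of a complex pilot symbol x."""
    return [[x.real, x.imag],
            [-x.imag, x.real]]


def row_times_matrix(row, mat):
    """Multiply a 1 x 2 row vector by a 2 x 2 matrix."""
    return [sum(row[k] * mat[k][j] for k in range(2)) for j in range(2)]


h, x = 1 + 2j, 3 - 1j
real_product = row_times_matrix(to_real_row(h), pilot_real_matrix(x))
print(real_product, to_real_row(h * x))
```

Both printed rows agree, which is why the network can operate on real-valued inputs and outputs without losing information about the complex model.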
The objective of this study is to estimate the channel h from the one-bit quantized data y, analyze the best possible CE, and explore optimal training sequences x. To achieve this goal, we describe an effective time-series BiLSTM model that enhances CE performance with reduced pilot sequences. Leveraging the capabilities of gated units, particularly the BiLSTM, we aim to map quantized incoming signals to complex-valued channels. BiLSTM models were selected because they represent the state of the art in sequence translation. The core idea is that, with more antennas and higher SNR but less pilot overhead, the proposed model should exhibit improved performance due to its two LSTM units, which operate in the forward and backward directions. The rationale behind using the BiLSTM lies in its ability to embed knowledge into long short-term memory, which aids in retaining crucial information. Moreover, the bidirectional mechanism of the proposed model enables training on more data while preserving more information, thus enhancing overall CE performance with fewer pilots. Consequently, it can denoise the pre-processed y. For prolonged sequential training, conventional back-propagation suffers from vanishing gradients. The proposed BiLSTM utilizes an effective gradient-based method with a structure designed to ensure continuous error flow through the internal states of specialized units, addressing these issues related to error back-flow.
1) Data Pre-Processing and Preparation
Pre-processing of the network inputs and outputs is necessary for effective training before any training occurs. Whether the channels are in the training or testing datasets, the initial stage of pre-processing uses the maximum absolute channel value from the training set to normalize those channels to the range [
Algorithm 1 Data Preparation Algorithm
Begin
Initialize the parameters G, M, T, U, and N.
Generate channel matrices:
Generate uplink pilot symbols:
Transmit pilot symbols:
Simulate 1-bit ADC operation:
Feature extraction:
Construct training dataset:
Split the training and testing dataset as 70% and 30%, respectively.
End
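The steps of Algorithm 1 can be sketched end to end as follows. This is a minimal illustration under stated assumptions rather than the exact procedure of the paper: path gains are drawn as complex Gaussian values, angles are uniform, pilots are unit-modulus symbols with random phases, antenna spacing is half a wavelength, and the noise variance is set from a nominal SNR. All function and parameter names are hypothetical.

```python
import cmath
import math
import random


def make_sample(G, N, P, snr_db, rng):
    """Return (features, target) for one random channel realization."""
    # Geometric channel as in Eqs. (3)-(4); random gains and angles are assumptions.
    h = [0j] * G
    for _ in range(P):
        theta = complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2)
        phi = rng.uniform(-math.pi / 2, math.pi / 2)
        for g in range(G):
            h[g] += theta * cmath.exp(-1j * math.pi * g * math.sin(phi)) / math.sqrt(P)
    # Unit-modulus pilots with random phases (assumption).
    x = [cmath.exp(2j * math.pi * rng.random()) for _ in range(N)]
    noise_std = math.sqrt(10 ** (-snr_db / 10) / 2)
    features = []
    for g in range(G):
        for n in range(N):
            r = h[g] * x[n] + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
            # 1-bit ADC as in Eqs. (1)-(2): keep only the signs of real and imaginary parts.
            features.append(1.0 if r.real >= 0 else -1.0)
            features.append(1.0 if r.imag >= 0 else -1.0)
    # Target: real and imaginary parts of the channel, interleaved.
    target = [v for c in h for v in (c.real, c.imag)]
    return features, target


def build_dataset(num_samples, G, N, P=2, snr_db=10.0, train_frac=0.7, seed=0):
    """Generate samples, split them 70/30, and normalize the targets."""
    rng = random.Random(seed)
    data = [make_sample(G, N, P, snr_db, rng) for _ in range(num_samples)]
    split = round(train_frac * num_samples)
    train, test = data[:split], data[split:]
    # Normalize all targets by the maximum absolute value found in the training set.
    scale = max(abs(v) for _, tgt in train for v in tgt)

    def normalize(part):
        return [(feat, [v / scale for v in tgt]) for feat, tgt in part]

    return normalize(train), normalize(test)
```

The scale factor is computed from the training targets only and then reused for the test targets, so no information from the test set leaks into pre-processing.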
2) Proposed Model Architecture
The BiLSTM architecture consists of two LSTM units operating in the forward and backward direction. These LSTM units comprise multiple memory units known as cells. The input gate, output gate, and forget gate are the three gates that regulate the cell. The forget gate discards unnecessary information from the cell state, while the input gate incorporates new information into it. Finally, the output gate selects crucial data from the current cell state and presents it as the output. We trained the BiLSTM model to learn channel mapping from quantized data, leveraging the efficiency of these gated units known for their memory capabilities. The proposed model and its corresponding gate structure are depicted in Fig. 2. The description of the proposed model structure is articulated as follows:
ImageInputLayer: After preprocessing of y with size \begin{align*} \overrightarrow {Frd (H_{t})} & = LT_{layer}({\mathbf {F}}_{\mathrm {i}}, \overrightarrow {H}_{t-1}) \\ \overleftarrow {Bkd (H_{t})} & = LT_{layer}({\mathbf {F}}_{\mathrm {i}}, \overleftarrow {H}_{t+1}) \\ H_{t} & = [ \overrightarrow {Frd (H_{t})};\overleftarrow {Bkd (H_{t})}], \tag {13}\end{align*}
\begin{align*} \mathrm {drop}(k,p)=\begin{cases} \displaystyle k & \text {with probability} \, 1-p \\ \displaystyle 0 & \text {with probability} \, p. \end{cases} \tag {14}\end{align*}
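The bidirectional recurrence in (13) and the dropout rule in (14) can be sketched as follows. To keep the example short, a scalar tanh recurrence stands in for the full LSTM layer; this substitution and all names and weights are assumptions made purely for illustration.

```python
import math
import random


def simple_cell(f_t, h_prev, w_in=0.5, w_rec=0.5):
    """Stand-in for the LSTM layer in Eq. (13): a scalar tanh recurrence."""
    return math.tanh(w_in * f_t + w_rec * h_prev)


def bidirectional(features, cell=simple_cell):
    """Eq. (13): run the cell forward and backward, then pair the two states."""
    T = len(features)
    fwd, bwd = [0.0] * T, [0.0] * T
    h = 0.0
    for t in range(T):             # forward pass: state depends on H_{t-1}
        h = cell(features[t], h)
        fwd[t] = h
    h = 0.0
    for t in reversed(range(T)):   # backward pass: state depends on H_{t+1}
        h = cell(features[t], h)
        bwd[t] = h
    return [(fwd[t], bwd[t]) for t in range(T)]


def drop(k, p, rng=random):
    """Eq. (14): keep k with probability 1 - p, output 0 with probability p."""
    return 0.0 if rng.random() < p else k
```

At every step t the paired state combines information from the inputs before t (forward state) and after t (backward state), which is the property the proposed model relies on.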
D. Objective Function
The objective of the BiLSTM is to reduce the loss and maximize CE performance with less pilot overhead. The proposed BiLSTM model is deployed at the BS side for uplink CE, where the estimation is performed through a non-linear mapping relationship from the acquired pilot signal N to the uplink channel and is expressed as follows:\begin{equation*} \hat {\mathbf {h}} = { \boldsymbol {\psi }}_{ \boldsymbol {\theta }}({\mathbf {y}}^{N}), \tag {15}\end{equation*}
\begin{equation*} \mathcal {L}(\boldsymbol {\theta }) = \frac {1}{M}\sum \limits _{i=1}^{M}{\parallel \hat {\mathbf {h}}_{i}-{\mathbf {h}}_{i} \parallel }_{2}^{2}. \tag {16}\end{equation*}
The training of the BiLSTM is to optimize the weights \begin{equation*} \mathop {\min }\limits _{ \boldsymbol {\theta }}\mathcal {L}(\boldsymbol {\theta }) = \frac {1}{M}\sum \limits _{i=1}^{M}{\parallel { \boldsymbol {\psi }}_{ \boldsymbol {\theta }}({\mathbf {y}_{i}}^{N}) -{\mathbf {h}}_{i} \parallel }_{2}^{2}. \tag {17}\end{equation*}
Iterative training of the BiLSTM on the training dataset is driven by this objective function. The weights \begin{equation*} { \boldsymbol {\theta }}_{t+1} = { \boldsymbol {\theta }}_{t} - {\eta }_{t} {\mathbf {g}}({ \boldsymbol {\theta }}_{t}), \tag {18}\end{equation*}
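The loss in (16) and the update rule in (18) can be illustrated on a toy problem. In the sketch below, a one-parameter linear estimator stands in for the BiLSTM mapping, and plain gradient descent stands in for the optimizer; both are simplifying assumptions, and the function names are hypothetical.

```python
def loss(estimates, targets):
    """Eq. (16): mean over M samples of the squared Euclidean error."""
    M = len(targets)
    return sum(sum((e - t) ** 2 for e, t in zip(est, tgt))
               for est, tgt in zip(estimates, targets)) / M


def gradient_step(theta, grad, lr):
    """Eq. (18): theta_{t+1} = theta_t - eta_t * g(theta_t)."""
    return [w - lr * g for w, g in zip(theta, grad)]


def fit_toy(ys, hs, lr=0.1, steps=50):
    """Fit h_hat = theta * y by repeated gradient steps on the loss in Eq. (16)."""
    theta = [0.0]
    M = len(ys)
    for _ in range(steps):
        # Gradient of (1/M) * sum (theta * y - h)^2 with respect to theta.
        grad = [2.0 / M * sum((theta[0] * y - h) * y for y, h in zip(ys, hs))]
        theta = gradient_step(theta, grad, lr)
    return theta[0]
```

With quantized inputs of ±1 and targets equal to twice the input, `fit_toy([1, -1, 1, -1], [2, -2, 2, -2])` moves the weight toward 2, and the loss shrinks at every step.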
E. Training and Testing of the Proposed Model
The number of antennas at the BS and the number of pilots used for estimation determine the dimensions of the input and output layers. For instance, if there are 200 antennas and 10 pilot symbols, the input and output sizes will be 4000 (2GN) and 400 (2G), respectively, since the real and imaginary parts are stacked. The training samples are arranged as
Algorithm 2 BiLSTM Training and Channel Estimation Algorithm
Initialize estimation function
Load training dataset:
Initialize the weights and biases of forward and backward as (
Calculate loss function
Select the optimizer Adam as follows operation:
where
for
for
Perform estimation:
Compute loss function
Gradient descent:
Update parameters:
end for
end for
Test dataset
Estimated outputs
Initialize NMSE = 0
for
Perform estimation:
Compute NMSE for i th samples:
end for
We use the NMSE to quantify the divergence between the actual channel h and the predicted channel matrix \begin{align*} \textsf {NMSE} = \mathbb {E}\left [{{\frac {\|{\hat {\mathbf {h}}}-{\mathbf {h}}\|^{2}}{\|{\mathbf {h}}\|^{2}}}}\right ],\quad \textsf {NMSE}_{\mathrm {dB}} = 10 \log _{10} \left \lbrace {{\mathop {\mathbb {E}} \left [{{ \frac {||\hat {\mathbf {h}} - {\mathbf {h}}||^{2}}{||\mathbf {h}||^{2}} }}\right ] }}\right \rbrace. \tag {19}\end{align*}
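A minimal sketch of the NMSE computation is given below, with the expectation replaced by an average over test samples. The function names are hypothetical; the linear value and its decibel form are returned by separate functions to keep the two scales apart.

```python
import math


def nmse(estimates, channels):
    """Average over samples of ||h_hat - h||^2 / ||h||^2 (linear scale)."""
    ratios = []
    for h_hat, h in zip(estimates, channels):
        err = sum(abs(a - b) ** 2 for a, b in zip(h_hat, h))
        ref = sum(abs(b) ** 2 for b in h)
        ratios.append(err / ref)
    return sum(ratios) / len(ratios)


def nmse_db(estimates, channels):
    """NMSE expressed in decibels: 10 * log10 of the linear value."""
    return 10.0 * math.log10(nmse(estimates, channels))
```

For example, an estimate that is uniformly 10% too small in amplitude has a linear NMSE of about 0.01, which corresponds to roughly -20 dB.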
All simulations maintain the same network topology and training settings, except for variations in input and output dimensions, ensuring equitable evaluation. This study conducts simulations within the MATLAB R2022b environment, utilizing a 12th Gen Intel(R) Core(TM) i7-
Simulation Results
In this section, we describe the performance of the proposed DL model for CE in massive MIMO systems employing lower-resolution ADCs under different simulation parameter settings. The performance of the proposed model is compared with other studies, and our solution surpasses the alternative methods due to the enhanced learning capability and sequence prediction offered by the BiLSTM model. In this study, we evaluate the simulation outcomes using BiLSTM-based CE and adopt the massive MIMO system of Section II.
A. Impact of SNRs
To evaluate the model performance across different SNR ranges during training, we conduct simulations based on SNR per antenna with the number of antennas represented as G. Additionally, we consider different pilot signal quantities denoted as N, specifically 2, 5, and 10, aiming to distinguish their impact on performance, as depicted in Fig. 3(a), (b), and (c) respectively. In Fig. 3(a), a range of SNR (0 to 30) dB is analyzed for received matrices. It illustrates that with
Simulation results with different SNR and pilot signals to analyze the performance of SNR per Antenna versus the number of antennas.
B. Impact of Pilots and SNRs for NMSE
To observe the impact of
Simulation results with different SNR and pilot signals to analyze the performance of NMSE versus the number of antennas.
C. Comparison for the Performance of the Proposed Model With Other Studies in Terms of Different SNRs and Pilot Length
Fig. 5 shows the comparative performance of the proposed model with MLP [41], cGAN [39], LSTM-GRU [48], FBM-CENet [4], SVM [59], and CNN [60] models, respectively. From Fig. 5, it is evident that the proposed model outperforms the others across different SNR values. For instance, with the proposed model using
Simulation results of the proposed model and other studies for evaluation of NMSE versus different SNRs.
Additionally, to demonstrate the proposed model’s CE solution, we conduct simulations using different numbers of pilots, denoted as
Performance comparison of the proposed model with others in terms of pilot lengths versus NMSE.
D. Impact of Optimization Algorithms
The choice of the optimal optimization method to address a particular problem is a challenging task. To achieve the best performance for the BiLSTM-based CE with reduced pilot overhead, it is crucial to evaluate the effectiveness of various optimizer functions based on the model and the dataset at hand. In this section, we present simulated comparisons of three optimization techniques to assist in the selection of the most suitable method for CE challenges. The three optimization techniques employed in this simulation are Adam, RMSprop, and SGDm. Fig. 7 illustrates the NMSE performance of the proposed BiLSTM model using three optimization algorithms concerning different numbers of antennas (G), with pilots (N) set to 2, 5, and 10. The results indicate that, up to
Simulation results for the effect of optimization algorithm for the proposed system.
E. Impact of Learning Rates
In order to achieve the best predictive effectiveness, adjusting training parameters is a crucial aspect during the model-learning process. To attain optimal CE performance in this study, we train and predict the proposed model using different learning rates (LRs). Fig. 8 displays the NMSE performance against the number of antennas (G) for the proposed BiLSTM model with three LRs: 0.0001, 0.01, and 0.000001, corresponding to pilots
F. Impact of Minibatch Size
During the model training process, the minibatch size (MB) significantly influences the optimal prediction rate of the DL model. In this study, the proposed model undergoes tuning using four distinct MB sizes: 50, 100, 500, and 900, respectively. Fig. 9 illustrates the impact of MB size on the NMSE measurement performance concerning the number of antennas G. To assess the effectiveness of MB size on NMSE, the results are gathered using different pilot signal values of
Conclusion
In this paper, a DL-based BiLSTM model is proposed to estimate the channel matrix from highly quantized received signals using 1-bit ADCs in a massive MIMO system. To ensure a mapping from quantized data to channels, we determined the length and structure of the PS. Subsequently, we demonstrated that, as the number of antennas increases, fewer pilots are required. Both the analysis and the simulation results show that massive MIMO channels can be estimated efficiently with only a small number of pilots. The proposed BiLSTM enhances CE performance with limited pilot signals by training on long input sequences using a bidirectional framework. The inclusion of forward and backward passes in the hidden layers of the BiLSTM enhances training ability and improves the CE of the proposed system. In the simulation, we evaluated the performance of the proposed model based on NMSE and SNR per antenna for different numbers of antennas. Additionally, simulations of NMSE with different pilot lengths and SNRs were conducted. The results demonstrated that the proposed model outperforms MLP, CNN, cGAN, LSTM-GRU, FBM-CENet, and SVM-based CE methods across various simulation scenarios. To achieve optimal outcomes, we tuned the proposed model by adjusting the learning rate, comparing three optimization algorithms, and varying the minibatch size. Consequently, we selected the best model settings to achieve maximum CE accuracy with lower pilot overhead. The performance of the proposed system shows promising improvements for large MIMO systems. The system model that we considered is designed for discrete angle samples, and the communication happens within these angle spaces. The study primarily focused on the performance of BiLSTM-based CE with 1-bit ADCs and reduced pilot overhead; the continuous angle space was not considered.
Evaluating the proposed model over a continuous angle space is therefore left for future work. In addition, this system could be applied to promising physical-layer technologies, such as intelligent reflecting surface-based systems.