Deep Learning-Based Positioning With Multi-Task Learning and Uncertainty-Based Fusion

Abstract:

Deep learning (DL) methods have been shown to improve the performance of several use cases for the fifth-generation (5G) New Radio (NR) air interface. In this paper, we investigate user equipment (UE) positioning using the channel state information (CSI) fingerprints between a UE and multiple base stations (BSs). In such a setup, we consider two different fusion techniques: early and late fusion. With early fusion, a single DL model is trained for UE positioning by combining the CSI fingerprints of the multiple BSs as input. With late fusion, a separate DL model is trained at each BS using the CSI specific to that BS, and the outputs of these individual models are then combined to determine the UE’s position. In this work, we compare these fusion techniques and show that fusing the outputs of separate models achieves higher positioning accuracy, especially in a dynamic scenario. We also show that the combination of multiple outputs further benefits from considering the uncertainty of the output of the DL model at each BS. For more efficient training of the DL models across BSs, we additionally propose a multi-task learning (MTL) scheme that shares some parameters across the models while jointly training all of them. This method improves not only the accuracy of the individual models but also that of the final combined estimate. Lastly, we evaluate the reliability of the uncertainty estimation to determine which of the fusion methods provides the highest quality of uncertainty estimates.

Published in: IEEE Transactions on Machine Learning in Communications and Networking (Volume: 2)

Page(s): 1127 - 1141

Date of Publication: 09 August 2024

Electronic ISSN: 2831-316X

DOI: 10.1109/TMLCN.2024.3441521

CCBY - IEEE is not the copyright holder of this material. Please follow the instructions via https://creativecommons.org/licenses/by/4.0/ to obtain full-text articles and stipulations in the API documentation.

SECTION I.

Introduction

Accurate user positioning is an enabler of several future services and technologies [1], [2], [3], [4], such as location-aware communication, vehicle-to-everything (V2X) applications, the industrial internet of things (IIoT), cooperating robots, and commercial applications. For this purpose, radio-based positioning of user equipment (UE) in wireless communication networks can be considered [5]. Multiple base stations (BSs) deployed in such networks allow the collection of channel state information (CSI) over distributed links, which can be exploited for positioning a UE. The CSI describes the channel across the spatial and frequency domains, where the large number of antennas and the large available bandwidth of current and future communication networks [4], e.g., fifth generation (5G) or upcoming sixth generation (6G), can provide the high angular and temporal resolution needed for high-accuracy positioning.

Conventional radio-based positioning methods are generally model-based and usually follow a two-step approach. With CSI estimated at one BS [6] or at multiple BSs [7], relevant parameters or measurements, e.g., path delay, angle of arrival (AoA), reference signal received power (RSRP), or time difference of arrival (TDoA), are determined in a first step, and the UE’s position is computed from them in a second step. Recently, machine learning (ML) and artificial intelligence (AI)-based techniques have also been proposed for radio-based UE positioning [8], [9], [10], [11], [12]; these are primarily data-driven rather than model-based. In particular, deep learning (DL) methods, especially convolutional neural networks (CNNs), have shown promising results [13], [14], [15], [16], achieving sub-meter accuracy. In such data-driven models, the CSI over subcarriers and antennas of a UE at a given position is considered as a fingerprint associated with the UE’s position. By leveraging the ability of wireless networks to collect large amounts of data, a database of CSI fingerprints associated with different UE positions, along with the respective position labels, can be constructed. With DL-based positioning methods, a neural network (NN) is trained on such a database, so that afterwards the NN can estimate a UE’s position when the CSI of the UE is provided as its input. Different types of fingerprints have been considered in the literature, including the received signal strength (RSS) and the magnitude and/or phase of the CSI over subcarriers in the frequency domain and across antennas in the spatial domain [15], [16], [17], [18], [19].

With the CSI of a UE available across multiple BSs, early fusion or late fusion can be considered for DL-based positioning methods [20], [21]. In early fusion, the CSI fingerprints from multiple BSs are collected and bundled together to constitute a single CSI fingerprint associated with the UE’s position. Thus, with early fusion, only one NN needs to be trained, using a database of fingerprints that comprise the CSI across multiple BSs [20]. With late fusion, on the other hand, one NN is assumed at each BS, where the CSI is considered as a fingerprint of the UE’s location associated only with the given BS [21]. The NN associated with that BS is trained with a database of CSI fingerprints from that BS, enabling the NN to determine the UE’s position based only on the CSI estimated by that BS. Afterwards, a final estimate of the UE’s position is obtained by combining the position estimates obtained by the NNs across the multiple BSs [21], [22], e.g., with a weighted average.

The choice between early and late fusion generally depends on the application [23]. However, when considering changes in the UE-BS channel between the training phase and the deployment phase, e.g., due to a blockage of the line of sight (LOS) between a UE and a BS, late fusion can benefit from uncertainty estimation [21]. In particular, the NN at each BS can be trained to estimate the uncertainty of the UE position that it determines. This enables the late fusion approach to determine the final position estimate for the UE while considering the uncertainty of the multiple position estimates obtained across the BSs. In practice, the most reliable position estimates then have a larger impact in determining the final UE position. Uncertainty can be estimated with simple approaches like Monte Carlo Dropout (MCD) [24] and Deep Ensembles (DEs) [25], which characterize the uncertainty based on the variance of the position estimates obtained with multiple NNs, i.e., similar position estimates across the different NNs indicate lower uncertainty. Uncertainty estimation methods have also been proposed in [26] and [27] to detect corrupted fingerprints.
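The two ideas in this paragraph, deriving an uncertainty from the spread of several NN outputs and letting more reliable estimates dominate the fused position, can be sketched as follows. This is only a minimal illustration: inverse-variance weighting is an assumed fusion rule chosen for simplicity, not necessarily the one used in [21], and the function names and numeric values are hypothetical.

```python
import numpy as np

def ensemble_estimate(member_predictions):
    """Mean position and total variance from M ensemble members, shape (M, 2).

    Similar predictions across members give a small variance (low uncertainty).
    """
    preds = np.asarray(member_predictions, dtype=float)
    mean = preds.mean(axis=0)
    variance = preds.var(axis=0).sum()  # sum of the x and y variances
    return mean, variance

def fuse_inverse_variance(positions, variances, eps=1e-9):
    """Weighted average of per-BS estimates with weights proportional to 1/variance.

    Assumed fusion rule for illustration; eps guards against zero variance.
    """
    positions = np.asarray(positions, dtype=float)   # (N_B, 2)
    variances = np.asarray(variances, dtype=float)   # (N_B,)
    weights = 1.0 / np.maximum(variances, eps)
    weights /= weights.sum()
    return weights @ positions, weights

# Hypothetical per-BS estimates: the third BS is far off but reports high uncertainty.
positions = [[10.0, 5.0], [10.4, 5.2], [14.0, 9.0]]
variances = [0.5, 0.5, 8.0]
fused, weights = fuse_inverse_variance(positions, variances)
plain_mean = np.mean(positions, axis=0)
```

In this toy example the uncertain third estimate receives a small weight, so `fused` stays close to the two consistent estimates, whereas `plain_mean` is pulled towards the outlier.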

Most conventional approaches to positioning require a strong line-of-sight (LOS) path and may be impaired in non-LOS (NLOS) conditions or in the presence of strong multipath. Recent works such as [6] and [28] have shown how to take advantage of the multipath information for single-anchor UE positioning, but they are limited to multiple-input multiple-output (MIMO) systems and require prior knowledge of the nature of the incoming paths (i.e., LOS or NLOS). DL-based methods, on the other hand, can still be employed in strong multipath scenarios and do not require multiple antennas at both receiver and transmitter. Nevertheless, because the multipath profile is susceptible to environmental changes, a DL model trained with CSI fingerprints from one environment may perform poorly for UE positioning in another environment [21], [29].

The lack of direct transferability of the knowledge acquired in one environment to other environments is one of the challenges of DL-based positioning [30]. The most straightforward way to address this is to retrain the NN from scratch with CSI fingerprints from the new environment, which may, however, be resource-expensive and not always feasible. The resource-intensive position labeling that this requires can be reduced by employing channel charting [31] and by considering distance metrics between CSI fingerprints to create a map of the deployment scenario [32], [33] using no or very few position labels. On the other hand, several approaches can be considered for improving the generalizability of a trained model, so that it adapts to environmental changes or to a new environment, including transfer learning, domain generalization, multi-task learning, and meta-learning. With transfer learning, a previously trained model is used as an initial model that is fine-tuned with a reduced amount of training data from a new environment [19], [29], which speeds up the training and improves the performance compared to training from scratch.

Furthermore, with multi-task learning (MTL), the aim is to jointly learn multiple models by training them while also sharing some or all of their parameters, thereby benefiting from regularization [34]. Consequently, by considering positioning in different environments as different tasks, the positioning across multiple environments can be improved. When training an MTL scheme, the relative importance of each task has to be chosen. The tasks that are hardest to learn should be weighted less, so that the model focuses more on tasks that are easier to learn. A method that sets the importance of each task based on its uncertainty was proposed in [35]. This method not only provides a way to tune the importance of different tasks but also simultaneously learns the uncertainty for each task, which, as shown in [21], is beneficial for DL-based positioning using CSI fingerprints.
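As a rough illustration of what uncertainty-based task weighting can look like, one commonly used form for regression tasks is sketched below. The symbols are assumptions introduced here and do not appear in the text above: $T$ is the number of tasks, $\mathcal{L}_{t}$ is the loss of task $t$, and $\sigma_{t}$ is a learnable per-task uncertainty parameter. Whether [35] uses exactly this expression is also an assumption.

```latex
\begin{equation*}
\mathcal{L}_{\mathrm{total}} = \sum_{t=1}^{T} \left( \frac{1}{2\sigma_{t}^{2}} \, \mathcal{L}_{t} + \log \sigma_{t} \right)
\end{equation*}
```

In this form, a task with a large $\sigma_{t}$ contributes less to the total loss, while the $\log \sigma_{t}$ term discourages the trivial solution of inflating every $\sigma_{t}$.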

Another approach aiming at improving the generalizability of NN models is meta-learning. With meta-learning, a model is trained on multiple tasks or environments such that the loss function of an unseen task can be minimized more efficiently. Training is done by considering a meta-level objective such as the average positioning error across the multiple environments [30], [36]. Meta-learning aims at a trained model that not only generalizes better across the trained tasks but also facilitates learning an unseen task with a lower number of training samples, in contrast to MTL, which only aims at learning the trained tasks better.

Motivated by the two-step approach of conventional positioning methods, i.e., with parameter extraction from the CSI in a first step and a position determination in a second step, a two-part model trained with multi-task learning and a meta-level objective has recently been proposed in [37]. For UE positioning in different environments, i.e., different training tasks, different models are assumed, with the first part of the models being common across all tasks and trained with CSI samples from all tasks (multi-task learning), aiming at minimizing the sum positioning error across all tasks (meta-level objective). The second part of the model of each task is trained to be environment-specific by using only training data from that environment. The approach proposed in [37] improves the positioning accuracy in the trained environments and achieves better generalizability when transferring the first part of the model and fine-tuning the two-part model with CSI samples of a new environment.
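The structure of such a two-part model can be sketched as below, under several assumptions: the layer sizes, the ReLU nonlinearity, the linear task-specific parts, and all variable names are hypothetical and are not taken from [37]. Only the forward pass and the sum objective are shown, not the training.

```python
import numpy as np

rng = np.random.default_rng(3)
num_tasks, in_dim, hidden_dim = 3, 32, 16  # hypothetical sizes

# First part: shared by all tasks. Second part: one set of parameters per task.
shared = {"W": rng.standard_normal((in_dim, hidden_dim)) * 0.1,
          "b": np.zeros(hidden_dim)}
heads = [{"W": rng.standard_normal((hidden_dim, 2)) * 0.1, "b": np.zeros(2)}
         for _ in range(num_tasks)]

def forward(x, task):
    """Position estimates for flattened fingerprints x of shape (batch, in_dim)."""
    z = np.maximum(x @ shared["W"] + shared["b"], 0.0)  # shared features (ReLU)
    return z @ heads[task]["W"] + heads[task]["b"]      # task-specific output

def sum_loss(batches):
    """Sum over tasks of the mean squared position error of each task."""
    return sum(np.mean(np.sum((forward(x, t) - p) ** 2, axis=1))
               for t, (x, p) in enumerate(batches))

# Synthetic (fingerprint, label) batches, one per task, for illustration only.
batches = [(rng.random((5, in_dim)), rng.random((5, 2))) for _ in range(num_tasks)]
total = sum_loss(batches)
```

Because every term of `sum_loss` depends on `shared`, gradients from all tasks would update the first part, while each entry of `heads` would only receive gradients from its own task.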

A. Contributions

As proposed in [35], MTL benefits from uncertainty estimation. The training in MTL can be improved by determining the relative weighting of the losses of each task based on the associated uncertainty estimate [35]. For this reason, in this paper we combine the results from [21] and [37] to benefit from the MTL of different positioning tasks and from late fusion using uncertainty estimation. For a setup with multiple BSs, and considering the positioning of a UE using each BS as a separate task, we show that employing an MTL scheme with uncertainty estimation and late fusion achieves high positioning accuracy. Additionally, even though this is outside the scope of the current paper, it was shown in [37] that a model trained with the MTL scheme can be further used for transfer learning in a new environment, reducing the training time and the amount of data that needs to be gathered.

Moreover, we extend the work in [21] by employing a method described in [38] for sensor fusion that takes into account the possibility that one or more sensors may be spurious. In the case of DL-based positioning, a model estimate could be spurious if the purported uncertainty is low but the real error is high. We employ this method in a late fusion scheme and show that it is beneficial in improving the positioning accuracy especially in dynamic environments.

Lastly, we aim not only to minimize the positioning error but also to evaluate the reliability of the uncertainty estimation. Ideally, the estimated uncertainty truly reflects the model’s uncertainty about the current measurement, such that a high uncertainty indicates a high positioning error and vice versa. To evaluate the quality of uncertainty estimates, we consider the area under sparsification error (AUSE) metric [27], [39]. In addition to the AUSE metric, we evaluate the integrity of the positioning results with respect to the integrity risk (IR), which is used in global navigation satellite system (GNSS) applications and has recently been proposed in the Third Generation Partnership Project (3GPP) as a positioning key performance indicator (KPI) for 5G positioning [40].

The paper is structured as follows. In Section II, the considered system model is described along with the different types of fusion, and the MTL scheme is introduced. In Section III, the simulation setup and the structure of the DL model are described. The results and the conclusion are then presented in Sections IV and V, respectively.

SECTION II.

System Model

We consider an uplink setup with $N_{B}$ BSs each with $N_{R}$ receive antennas and a single transmit antenna at the UE. The UE transmits a reference signal on $N_{C}$ subcarriers within an orthogonal frequency division multiplexing (OFDM) symbol. The received uplink signal is used to estimate the CSI matrix between UE and each BS. The estimated channel at the $n$-th BS over the $N_{C}$ subcarriers is described as:
\begin{equation*} \tilde {\boldsymbol {H}}_{n} = [\tilde {\boldsymbol {h}}_{0}^{n}, \tilde {\boldsymbol {h}}_{1}^{n}, \ldots, \tilde {\boldsymbol {h}}_{N_{C}-1}^{n} ] \in \mathbb {C}^{N_{R} \times N_{C}}, \tag {1}\end{equation*}
where $\tilde {\boldsymbol {h}}_{l}^{n} \in \mathbb {C}^{N_{R} \times 1}$ is a column vector that describes the estimated uplink channel between the UE and the $N_{R}$ antennas of the $n$-th BS at the $l$-th subcarrier. The estimated channels can be considered as a unique fingerprint of the position of the UE and depend on the multipath between the UE and each BS. To transform the raw complex CSI data to meaningful inputs for the NN, we stack the matrices $\Re {\{\tilde {\boldsymbol {H}}_{n}\}}$ and $\Im {\{\tilde {\boldsymbol {H}}_{n}\}}$ in the third dimension to obtain a new real-valued 3D matrix $\boldsymbol {H}_{n} \in \mathbb {R}^{N_{R} \times N_{C} \times 2}$. The symbols $\Re {\{\cdot \}}$ and $\Im {\{\cdot \}}$ denote the real and imaginary values of each of the matrix elements respectively. The values of each matrix are then normalized in the range $[{0, 1}]$. This transformation is a widely adopted practice in the literature for AI positioning using fingerprints as it allows the network to learn from both the magnitude and phase information, which are crucial for exploiting the multipath propagation effects captured by CSI [16], [19], [41]. We input the matrix $\boldsymbol {H}_{n}$ to the DL-model without applying manual feature engineering and we leverage the model’s ability to autonomously learn relevant features from the data [42].
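The transformation from the complex CSI estimate to the real-valued input can be sketched as follows. The function name and the example dimensions are hypothetical, and applying min-max scaling separately to the real and the imaginary plane is only one possible reading of the normalization to $[0, 1]$ described above.

```python
import numpy as np

def csi_to_real_tensor(H_est):
    """Map a complex (N_R, N_C) CSI matrix to a real (N_R, N_C, 2) tensor.

    Real and imaginary parts are stacked along the third dimension, and each
    of the two planes is min-max scaled to [0, 1] (assumed normalization).
    """
    parts = np.stack([H_est.real, H_est.imag], axis=-1).astype(np.float64)
    lo = parts.min(axis=(0, 1), keepdims=True)
    hi = parts.max(axis=(0, 1), keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (parts - lo) / span

# Synthetic channel estimate for illustration only.
rng = np.random.default_rng(0)
N_R, N_C = 8, 64
H_est = rng.standard_normal((N_R, N_C)) + 1j * rng.standard_normal((N_R, N_C))
H_n = csi_to_real_tensor(H_est)
```

Keeping both planes preserves the phase information that would be lost if only the magnitude of the CSI were used.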

A. DL Based Positioning With Fingerprints

Deep learning based localization using CSI fingerprints as inputs consists of two phases, namely the training phase and the deployment phase, which are often alternatively termed the offline and online phases, respectively. During the training phase, CSI fingerprints are collected throughout the area of interest, along with a label corresponding to the UE position associated with each CSI fingerprint. To collect fingerprints along with their labels, positioning reference units (PRUs) can be employed, i.e., devices whose position is known, for example because it was obtained with another positioning method or with sensors [43]. Without loss of generality, we assume that the UEs lie on a two-dimensional plane. Subsequently, the CSI fingerprints and the position labels $\boldsymbol {p} =[x, y] \in \mathbb {R}^{2}$ are used to train the parameters $\epsilon $ of a neural network (NN) $f_{\epsilon } (\cdot )$. Training is accomplished by minimizing the mean squared error between the position labels and the output of the NN with the labeled CSI fingerprints as input. The trained NN is then used during the deployment phase to estimate the position of a UE based on the estimated CSI fingerprint, where $\tilde {\boldsymbol {p}} = [\tilde {x}, \tilde {y}]\in \mathbb {R}^{2}$ denotes the position estimate for the UE.
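The two phases can be illustrated with a deliberately small sketch. It assumes a single linear layer as a stand-in for the NN $f_{\epsilon}$, plain gradient descent, and a synthetic database; none of these choices are taken from the paper, and all names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
num_samples, N_R, N_C = 200, 4, 16  # hypothetical sizes

# Training database: real-valued fingerprints in [0, 1] and 2-D position labels.
H = rng.random((num_samples, N_R, N_C, 2))
X = H.reshape(num_samples, -1)                 # flatten each fingerprint
P = X @ rng.standard_normal((X.shape[1], 2))   # synthetic labels p = [x, y]

def f(params, X):
    """Stand-in for the NN f_eps: one linear layer with weights W and bias b."""
    W, b = params
    return X @ W + b

def mse(params, X, P):
    """Mean squared error between position labels and model outputs."""
    return np.mean(np.sum((f(params, X) - P) ** 2, axis=1))

# Training (offline) phase: gradient descent on the mean squared error.
params = [np.zeros((X.shape[1], 2)), np.zeros(2)]
loss_start = mse(params, X, P)
lr = 0.01
for _ in range(500):
    err = f(params, X) - P                     # (num_samples, 2)
    params[0] -= lr * 2.0 * X.T @ err / num_samples
    params[1] -= lr * 2.0 * err.mean(axis=0)
loss_end = mse(params, X, P)

# Deployment (online) phase: estimate the position for one fingerprint.
p_hat = f(params, X[:1])[0]
```

A CNN trained with a standard DL framework would replace `f` and the manual gradient step in practice; the sketch only shows how the labeled database, the loss, and the deployment step relate to each other.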

The key idea behind positioning with CSI fingerprints is that the CSI for each position is considered unique to that specific position. This stems from the fact that the channel between UE and BS is a rich source of information, since it is influenced by various environmental factors such as walls, objects, or other obstacles. All this information is indirectly incorporated into the multipath propagation of the channel, which includes direct paths (LOS) and indirect paths (NLOS), and is extracted during the training phase of the NN. Consequently, positioning using fingerprints belongs to the same class of modern positioning techniques as [28], which leverages both LOS and NLOS paths. Additionally, as shown in [44], a LOS path is not necessarily needed at all, since NLOS paths already contain information that can make the fingerprints unique and useful for positioning. The basic assumption is that the propagation environment does not significantly change between the training and deployment phases, since that would degrade the performance of the NN.

Two different approaches for positioning using CSI fingerprints from the $N_{B}$ BSs can be considered [21], namely early and late fusion.

1) Early Fusion

In early fusion, a single DL-model is trained for the UE positioning, having as input the concatenation of the CSI fingerprints from all BSs, i.e., the single NN model $f_{\epsilon } (\boldsymbol {H}) = \tilde {\boldsymbol {p}}$, where $\boldsymbol {H}=[\boldsymbol {H}_{1}, \boldsymbol {H}_{2}, \ldots, \boldsymbol {H}_{N_{B}}] \in \mathbb {R}^{(N_{B} \cdot N_{R}) \times N_{C} \times 2}$. Although this is a straightforward way to combine the information from all BSs and perform localization using a DL-model, it has some disadvantages. First, a large signaling overhead is required in order to transmit the relevant CSI data to a central server that hosts the single NN model. Second, if the setup changes (e.g., a BS is removed), a new NN model has to be trained from scratch. A block diagram of early fusion is shown in Fig. 1.
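The construction of the early fusion input can be sketched as below, assuming the per-BS fingerprints are concatenated along the antenna dimension, as the stated shape $(N_{B} \cdot N_{R}) \times N_{C} \times 2$ suggests. The dimensions and the random data are placeholders.

```python
import numpy as np

N_B, N_R, N_C = 3, 8, 64  # hypothetical setup
rng = np.random.default_rng(2)

# One real-valued fingerprint of shape (N_R, N_C, 2) per BS (synthetic data).
H_per_bs = [rng.random((N_R, N_C, 2)) for _ in range(N_B)]

# Early fusion input: concatenate along the antenna axis.
H = np.concatenate(H_per_bs, axis=0)           # (N_B * N_R, N_C, 2)

# Removing one BS changes the input shape expected by the single NN.
H_reduced = np.concatenate(H_per_bs[:-1], axis=0)
```

The different shapes of `H` and `H_reduced` illustrate the second disadvantage: a model built for the first input cannot be applied directly once a BS is removed.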

FIGURE 1. Early fusion.
