
Power Allocation Schemes Based on Deep Learning for Distributed Antenna Systems

We consider a new system model of deep learning for the scenario of the distributed antenna system (DAS). To reduce the computational complexity of system, we used deep n...

Abstract:

In recent years, many power allocation algorithms have been proposed to maximize spectral efficiency (SE) and energy efficiency (EE) for distributed antenna systems (DAS). However, traditional iterative power allocation algorithms are difficult to implement in practice because of their high computational complexity. Machine learning methods have been shown to offer strong learning ability at low computational complexity, so they can approximate traditional iterative power allocation well and are easy to implement in practice. In this paper, we propose a new deep neural network (DNN) model for DAS. From the perspective of machine learning, a traditional iterative algorithm can be regarded as a nonlinear mapping between user channel realizations and optimal power allocation schemes. Therefore, we train the DNN to learn the nonlinear mapping between the user channel realizations and the corresponding power allocation schemes produced by the traditional iterative algorithm. A DNN-based power allocation scheme is then developed to maximize SE and EE for DAS. The simulation results show that the proposed scheme not only achieves performance close to that of the traditional iterative algorithm, but also greatly reduces the online computational time.

Published in: IEEE Access ( Volume: 8)

Page(s): 31245 - 31253

Date of Publication: 11 February 2020

Electronic ISSN: 2169-3536

DOI: 10.1109/ACCESS.2020.2973253

CCBY - IEEE is not the copyright holder of this material. Please follow the instructions via https://creativecommons.org/licenses/by/4.0/ to obtain full-text articles and stipulations in the API documentation.

SECTION I.

Introduction

In recent years, with the improvement of communication technology, data transmission in cellular networks has grown rapidly. The explosive growth in data transmission demand has put tremendous pressure on the fifth-generation (5G) wireless system [1], [2]. To provide high data rates, many researchers have proposed the DAS, which can effectively reduce the access distance between the remote access unit (RAU) and the user equipment (UE) [3], [4]. Existing research has shown that the DAS has many advantages, such as increasing SE [4]–[7] and improving EE [8]–[12].

To provide high data rate transmission, it is also important to optimize power allocation for SE and EE in DAS. Many algorithms for SE and EE power allocation have been proposed in past years. In [13], the authors exploited the water-filling algorithm to maximize the EE of MIMO systems. Moreover, a sub-gradient algorithm and fractional programming were proposed in [14] to optimize three different objectives for DAS: maximum SE, minimum transmit power, and maximum EE. However, most of these works rely on traditional methods to obtain the optimal power allocation, which usually require complex calculations and a large amount of online computational time. It is unrealistic to use traditional iterative algorithms in actual systems whose user channel state information is constantly changing. Therefore, better schemes are needed to reduce the online computational complexity of the system.

Deep learning has been successfully applied in many areas, including computer vision and natural language processing. Because of these successes, many researchers have tried to use deep learning algorithms to solve problems in wireless communication. Many studies have found that deep learning algorithms can not only achieve excellent performance in wireless communication, but also reduce the computational complexity, which makes real-time power allocation in actual systems possible. In [15], the authors used deep learning for channel estimation and signal detection in an orthogonal frequency division multiplexing (OFDM) system and showed that the deep learning algorithm outperforms the traditional algorithm in reducing time and computational complexity. In [16], the authors took the weighted minimum mean square error (WMMSE) algorithm as the traditional iterative algorithm and trained a deep neural network to learn the nonlinear mapping between the channel realizations and the resource allocation of the WMMSE algorithm. The experimental results showed that the deep neural network not only achieves performance close to that of the WMMSE algorithm, but also greatly reduces the online computational time. The authors in [17] exploited deep learning to solve the EE optimization problem through power control in wireless interference networks. The results showed that the neural network solution can satisfy the demands with less online complexity. The authors in [18] proposed ensemble deep neural networks to solve the optimal power control problem and trained the deep learning model with the noise level as one of the input neurons, so that it can better cope with different noise levels in practice. In [19], the authors used a convolutional neural network (CNN) to solve the device-to-device (D2D) wireless scheduling problem by using user geographic location instead of user channel state information (CSI). The experimental results showed that the performance of the CNN method is very close to that of the traditional iterative method that uses CSI, while saving the large cost of measuring the user CSI. However, to the best of the authors' knowledge, little work has studied DNN-based power allocation schemes for DAS. We hope to use deep learning algorithms to realize real-time power allocation in DAS so that it can be applied in actual wireless communication systems.

In this paper, we therefore investigate a deep learning algorithm for the downlink DAS. We refer to the iterative algorithms as traditional optimization algorithms. One of the most outstanding algorithms for the DAS is the sub-gradient algorithm presented in [14]. Therefore, we take the sub-gradient algorithm as the traditional algorithm and design a DNN to approximate it. We randomly generate a large number of channel realizations and use the traditional sub-gradient algorithm to obtain the corresponding optimal power allocation schemes. We then use these channel realizations and the corresponding optimal power allocation schemes as the inputs and labels of the DNN, respectively, and train the DNN so that it learns the nonlinear mapping between the channel realizations and the results of the traditional sub-gradient power allocation algorithm. We focus on designing a suitable structure and appropriate parameters for the DNN so that its performance is close enough to that of the traditional sub-gradient algorithm while the online computation time is greatly reduced.

The remainder of this paper is organized as follows. In Section II, the DAS model is presented, including the system configuration and the channel model, and the maximum SE and maximum EE optimization problems of DAS are formulated. In Section III, we present the system architecture, the network structure, the data generation process, and the training stage of the DNN. In Section IV, we present the parameter selection of the deep neural network and the simulation results that support our approach. We conclude this paper in Section V.

SECTION II.

System Model and Problem Formulation

A. System Model

In this section, we consider a downlink scenario of DAS. There are $N$ single-antenna RAUs and $K$ single-antenna cellular UEs in the DAS. The $N$ RAUs are uniformly deployed in the cell and connected to the central base station, and the $K$ UEs are randomly distributed in the cell, as shown in Fig. 1. We use $h_{n,k}$ to denote the channel frequency response between the $n$th RAU and the $k$th UE, which consists of small-scale and large-scale fading [9] and can be expressed as

$\begin{equation*} h_{n,k}=g_{n,k}w_{n,k},\tag{1}\end{equation*}$

where $g_{n,k}$ represents the small-scale fading between the $n$th RAU and the $k$th UE, and $w_{n,k}$ represents the large-scale fading, which is independent of $g_{n,k}$ [20].
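The product form in (1) can be illustrated with a short sketch. The fading distributions are not specified at this point in the text, so the following are assumptions made only for illustration: Rayleigh small-scale fading (a unit-variance complex Gaussian $g_{n,k}$) and a pure distance-based path loss for $w_{n,k}$ with no shadowing. The function name `sample_channel`, the `distances` input, and the path-loss exponent are hypothetical.

```python
import math
import random

def sample_channel(distances, path_loss_exp=3.0, rng=None):
    """Return h[n][k] = g[n][k] * w[n][k] as in (1).

    distances[n][k] is the RAU-to-UE distance (hypothetical input).
    Assumptions, not taken from the source: Rayleigh small-scale fading g
    (unit-variance complex Gaussian) and a pure path-loss large-scale
    term w = d^(-path_loss_exp / 2), with no shadowing.
    """
    if rng is None:
        rng = random.Random()
    h = []
    for row in distances:
        h_row = []
        for d in row:
            # Small-scale fading: CN(0, 1), drawn independently of w.
            g = complex(rng.gauss(0.0, math.sqrt(0.5)),
                        rng.gauss(0.0, math.sqrt(0.5)))
            # Large-scale fading: amplitude attenuation with distance.
            w = d ** (-path_loss_exp / 2.0)
            h_row.append(g * w)
        h.append(h_row)
    return h
```

Calling `sample_channel` with an $N \times K$ table of distances returns one channel realization; passing a seeded `random.Random` makes the draw reproducible.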

FIGURE 1. System model.

B. Maximum SE Optimization

In this part, the maximum SE optimization problem of DAS is presented. We assume that perfect channel state information (CSI) is available at both the transmitter and the receiver. To reduce the computational time of the traditional algorithm so that we can generate a sufficiently large training dataset for the DNN, we assume that the channels are orthogonal, so that users do not interfere with each other. The problem of maximizing SE for the downlink DAS can be modeled as [14]

$\begin{align*} &\max _{p_{n,k}} ~\sum _{k=1}^{K}\log _{2}\left ({1 + \frac {\sum _{n=1}^{N}p_{n,k}|h_{n,k}|^{2}}{\sigma ^{2}} }\right) \tag{2}\\ & {\mathrm{ s.t.}} ~ \sum _{k=1}^{K}p_{n,k} \leq P_{max}^{n}, \tag {2a} \\ &\hphantom {\mathrm{ s.t. ~}}p_{n,k} \in \left [{ 0,P_{max}^{n} }\right],\tag{2b}\end{align*}$

where $P_{max}^{n}$ is the maximum transmit power of the $n$th RAU, $p_{n,k}$ is the transmit power from the $n$th RAU to the $k$th cellular UE, and $\sigma ^{2}$ represents the power of the complex additive white Gaussian noise (AWGN) at the UE [14].
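As a minimal sketch, the objective (2) and the constraints (2a) and (2b) can be evaluated for a candidate allocation as follows. The function names `spectral_efficiency` and `is_feasible`, the nested-list layout `p[n][k]`, and the tolerance `tol` are illustrative choices rather than part of the original formulation.

```python
import math

def spectral_efficiency(p, h, sigma2):
    """Objective (2): sum over UEs of log2(1 + sum_n p[n][k] |h[n][k]|^2 / sigma2)."""
    n_rau, n_ue = len(p), len(p[0])
    total = 0.0
    for k in range(n_ue):
        rx_power = sum(p[n][k] * abs(h[n][k]) ** 2 for n in range(n_rau))
        total += math.log2(1.0 + rx_power / sigma2)
    return total

def is_feasible(p, p_max, tol=1e-9):
    """Check constraints (2a) and (2b) for per-RAU power budgets p_max[n]."""
    for n, row in enumerate(p):
        # (2b): every entry lies in [0, P_max^n].
        if any(x < -tol or x > p_max[n] + tol for x in row):
            return False
        # (2a): the powers of RAU n sum to at most P_max^n.
        if sum(row) > p_max[n] + tol:
            return False
    return True
```

For example, with two RAUs each serving one of two UEs at unit power over unit-gain channels and $\sigma^{2}=1$, each UE contributes $\log_{2}(2)=1$, so the SE is 2.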

C. Maximum EE Optimization

From [11], the problem of maximizing EE can be modeled as

$\begin{align*} &\max _{p_{n,k}} ~\frac {\sum _{k=1}^{K}\log _{2} \left ({1+\frac {\sum _{n=1}^{N}p_{n,k}|h_{n,k}|^{2}}{\sigma ^{2}}}\right)}{\frac {1}{\tau }\sum _{k=1}^{K}\sum _{n=1}^{N}p_{n,k}+NP_{d}+P_{c}+P_{o}} \tag{3}\\ & {\mathrm{ s.t.}}~\sum _{k=1}^{K}p_{n,k} \leq P_{max}^{n}, \tag {3a} \\ &\hphantom {\mathrm{ s.t.~}} p_{n,k} \in \left [{0,P_{max}^{n} }\right],\tag{3b}\end{align*}$

where $P_{d}$ is the constant circuit power consumption per RAU, $P_{c}$ denotes the constant basic power consumption, $P_{o}$ denotes the power dissipated by the optical fiber transmission, and $\tau$ represents the power amplifier efficiency [14].
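The ratio in (3) can be evaluated in the same way. This is a minimal sketch; the function name `energy_efficiency` and the argument names `tau`, `p_d`, `p_c`, and `p_o` simply mirror the symbols $\tau$, $P_{d}$, $P_{c}$, and $P_{o}$ and are illustrative.

```python
import math

def energy_efficiency(p, h, sigma2, tau, p_d, p_c, p_o):
    """Objective (3): sum rate divided by total power consumption."""
    n_rau, n_ue = len(p), len(p[0])
    # Numerator: the same sum rate as in (2).
    rate = 0.0
    for k in range(n_ue):
        rx_power = sum(p[n][k] * abs(h[n][k]) ** 2 for n in range(n_rau))
        rate += math.log2(1.0 + rx_power / sigma2)
    # Denominator: amplifier-scaled transmit power plus fixed consumption terms.
    transmit = sum(sum(row) for row in p)
    total_power = transmit / tau + n_rau * p_d + p_c + p_o
    return rate / total_power
```

With the two-RAU, two-UE allocation at unit power, unit-gain channels, $\sigma^{2}=1$, $\tau=0.5$, and $P_{d}=P_{c}=P_{o}=1$, the sum rate is 2 and the total power is $4+2+1+1=8$, giving an EE of 0.25.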

D. Traditional Sub-Gradient Algorithm

According to [11], the optimal power allocation for the maximum SE problem can be found with the sub-gradient algorithm. When the objective is maximizing EE, the non-convex fractional objective function should first be transformed into an equivalent objective function in subtractive form by using fractional programming. However, the time complexity of the sub-gradient algorithm is still high even without considering the interference between users, and it becomes extremely high once fractional programming is combined with it to maximize EE. Therefore, these algorithms cannot be used directly in actual systems due to their high computational complexity.
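The subtractive form mentioned here can be written out explicitly. The following is a sketch that assumes a Dinkelbach-type transformation, a standard fractional programming technique; the text does not state which variant is used in [11]. The symbols $R$, $P_{tot}$, $F$, and $q$ are introduced only for illustration.

```latex
\begin{align*}
R(\mathbf{p}) &= \sum_{k=1}^{K}\log_{2}\left(1+\frac{\sum_{n=1}^{N}p_{n,k}|h_{n,k}|^{2}}{\sigma^{2}}\right), \\
P_{tot}(\mathbf{p}) &= \frac{1}{\tau}\sum_{k=1}^{K}\sum_{n=1}^{N}p_{n,k}+NP_{d}+P_{c}+P_{o}, \\
F(q) &= \max_{\mathbf{p}}\; R(\mathbf{p}) - q\,P_{tot}(\mathbf{p})
\quad \text{s.t. (3a), (3b)}.
\end{align*}
```

Under this assumption, the optimal EE $q^{\star}$ is the value for which $F(q^{\star})=0$. For a fixed $q$, the inner problem is concave in $\mathbf{p}$ because $R$ is concave and $P_{tot}$ is affine, so it can be solved with the same machinery as the SE problem, after which $q$ is updated to $R(\mathbf{p})/P_{tot}(\mathbf{p})$. The nested iteration is what makes the EE case more expensive than the SE case.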

SECTION III.

Deep Neural Network Based Method

In this part, we therefore consider a deep learning approach. We regard the DNN as a "black box" and use it to approximate the sub-gradient algorithm in an end-to-end fashion. In the proposed model, the sub-gradient algorithm is treated as an unknown nonlinear mapping between the channel realizations and the corresponding power allocation schemes, and a deep neural network is well suited to learning such a mapping. Therefore, we design a DNN model that learns this mapping from a large amount of data, with the aim of realizing real-time power allocation in the DAS.

A. System Architecture

In this paper, to show that DNN-based power allocation can be applied under different maximum transmission powers of the RAUs, we compare its performance with that of the traditional sub-gradient algorithm under different maximum transmission powers, denoted as $P_{max1}, P_{max2}, \ldots, P_{maxi}$. Because the optimal power allocation differs with the maximum transmission power of the RAUs, feeding the same channel realizations into the traditional sub-gradient algorithm yields a different power allocation scheme for each maximum transmission power. The DNN method should therefore also produce different power allocation schemes from the same channel realizations. To achieve this, we train a separate DNN for each maximum transmission power, using the same channel realizations as inputs and the corresponding results of the traditional algorithm as labels. The power allocations of the traditional sub-gradient algorithm under the different maximum transmission powers, which serve as the labels for the different DNNs, are denoted as $P_{(P_{max1})}, P_{(P_{max2})}, \ldots, P_{(P_{maxi})}$. The system architecture of the DNN method for DAS is illustrated in Fig. 2.

FIGURE 2. The structure of the deep neural network.

B. Data Generation

Because the deep neural network is a data-dependent method, it is very important to prepare a large amount of training data and corresponding labels. First, the channel model described above is used to generate a large number of channel matrices $H^{(t)}$, where $t$ is the index of the training sample. Each $H^{(t)}$ is then passed to the traditional sub-gradient algorithm to obtain the optimal power allocation matrices $P_{P_{max1}}^{(t)}, P_{P_{max2}}^{(t)}, \ldots, P_{P_{maxi}}^{(t)}$, which are the labels for the corresponding $H^{(t)}$ under each maximum transmission power. For convenience, we use $P^{(t)}$ to represent the label under a given maximum transmission power, so that $[H^{(t)},P^{(t)}]$ denotes the $t$th sample. By repeating this process, we generate a large number of training samples. In addition, we use the cross-validation method during training and randomly assign 99% of the samples to the training dataset and the remaining 1% to the validation dataset. The validation dataset plays an important part in avoiding over-fitting during training. Finally, to test the performance of the DNN method, we generate a large number of additional samples as the testing dataset.
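The data generation and the 99%/1% split described above can be sketched as follows. The function `build_dataset` is hypothetical, and its two callable arguments are placeholders: `sample_channel` stands in for the channel model and `solve_power` stands in for the sub-gradient solver that produces the label, neither of which is implemented here.

```python
import random

def build_dataset(num_samples, sample_channel, solve_power,
                  val_fraction=0.01, seed=0):
    """Generate [H, P] pairs and split them into training and validation sets.

    sample_channel() draws one channel matrix H; solve_power(H) returns the
    label P. Both are placeholders supplied by the caller, not the
    implementations used in the paper.
    """
    samples = []
    for _ in range(num_samples):
        H = sample_channel()
        samples.append((H, solve_power(H)))
    # Shuffle with a fixed seed so the random split is reproducible.
    rng = random.Random(seed)
    rng.shuffle(samples)
    num_val = int(round(num_samples * val_fraction))
    return samples[num_val:], samples[:num_val]
```

With `num_samples=50000` and the default `val_fraction`, this yields 49500 training samples and 500 validation samples.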

C. Network Architecture

Once the training and testing datasets are prepared, we need to design a deep neural network that approximates the sub-gradient algorithm for SE-maximizing and EE-maximizing power allocation in DAS. A fully connected neural network is proposed, which includes one input layer, three hidden layers, and one output layer, as shown in Fig. 3.

FIGURE 3. The structure of the deep neural network.

FIGURE 4. Data generation and training process.

The inputs of the DNN are the channel realizations $h_{n,k}$ and the outputs are the power allocation schemes $\hat {p}_{n,k}$. In the DAS model, we train the DNN to learn the nonlinear mapping between the channel realizations $h_{n,k}$ and the power allocation $p_{n,k}$ of the traditional sub-gradient algorithm. Since the outputs of the DNN are continuous values, this is a nonlinear regression problem. To enhance the nonlinear fitting ability of the deep neural network, the ReLU function is used as the activation function for the three hidden layers. The ReLU function can be expressed as

$\begin{equation*} \mathrm{ReLU}(x) = \max (0,x).\tag{4}\end{equation*}$

Since we first normalized the training data and labels, the output of the neural network should be between 0 and 1. We use the sigmoid function as the activation function of the output layer.

$\begin{equation*} \mathrm{sigmoid}(x) = \frac {1}{1+e^{-x}}.\tag{5}\end{equation*}$
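The activation functions in (4) and (5) can be combined into a forward pass through a fully connected network. This is a minimal pure-Python sketch, not the authors' implementation; the helper names `dense` and `forward` and the `(weights, biases)` layer layout are assumptions made for illustration.

```python
import math

def relu(x):
    """Eq. (4)."""
    return max(0.0, x)

def sigmoid(x):
    """Eq. (5), evaluated in a form that avoids overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """layers is a list of (weights, biases) pairs; the last is the output layer."""
    x = inputs
    for weights, biases in layers[:-1]:
        x = dense(x, weights, biases, relu)       # hidden layers use ReLU
    weights, biases = layers[-1]
    return dense(x, weights, biases, sigmoid)     # output layer uses sigmoid
```

Because the sigmoid maps every output into $(0, 1)$, the network outputs are on the same scale as the normalized labels.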

D. Training Stage

The training stage of the DNN mainly includes two processes: the feed-forward operation and back propagation. The feed-forward operation calculates the loss value of the DNN, while back propagation updates the weights and biases of the DNN by minimizing the loss value. Both depend on an important part of the deep neural network, i.e., the loss function, which can be expressed as

$\begin{align*} loss=&\mathbb {E} [loss_{mse} + loss_{const}] \\=&\mathbb {E} \big[\lambda _{1}\sum _{n=1}^{N}\sum _{k=1}^{K} (\hat {p}_{n,k}-p_{n,k})^{2} \\&+\,\lambda _{2} \sum _{n=1}^{N} {\mathrm{ReLU}}\left({\sum _{k=1}^{K}\hat {p}_{n,k}-P_{max}^{n}}\right)\big],\tag{6}\end{align*}$

where $p_{n,k}$ represents the optimal power allocation of the traditional sub-gradient algorithm and $\hat {p}_{n,k}$ represents the output of the DNN.

The loss function contains two parts: the mean square error between the outputs of the DNN and the labels, $loss_{mse}$, and the constraint error, $loss_{const}$. The scaling factors $\lambda _{1}$ and $\lambda _{2}$ balance $loss_{mse}$ and $loss_{const}$ so that the DNN can be trained well. The first part, $loss_{mse}$, reduces the error between $\hat {p}_{n,k}$ and $p_{n,k}$ so that the DNN approaches the performance of the traditional sub-gradient algorithm. The second part, $loss_{const}$, penalizes outputs of the deep neural network that violate the constraints (2a) and (3a). If $\sum _{k=1}^{K}\hat {p}_{n,k} > P_{max}^{n}$, the constraints (2a) and (3a) are not satisfied and ReLU $\left({\sum _{k=1}^{K}\hat {p}_{n,k}-P_{max}^{n}}\right)>0$, so $loss_{const}$ forces the network parameters to be updated toward satisfying the constraints. Conversely, if $\sum _{k=1}^{K}\hat {p}_{n,k} \leq P_{max}^{n}$, the constraints (2a) and (3a) are satisfied and ReLU $\left({\sum _{k=1}^{K}\hat {p}_{n,k}-P_{max}^{n}}\right)=0$, so $loss_{const}$ does not influence the network training.
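The two terms of (6) can be computed directly, which makes the role of the penalty easier to see. This is a minimal sketch in which the expectation is taken as a batch average; the function names `sample_loss` and `batch_loss` are illustrative, the default scaling factors are the values reported in Section IV, and `p_max` is assumed to be expressed in the same (normalized) units as the network outputs.

```python
def relu(x):
    return max(0.0, x)

def sample_loss(p_hat, p_label, p_max, lambda1=1.0, lambda2=0.1):
    """Value of the bracket in (6) for one sample.

    p_hat[n][k] is the DNN output, p_label[n][k] the sub-gradient label,
    and p_max[n] the power budget of RAU n (same units as p_hat; assumed).
    """
    # First term: squared error between outputs and labels.
    mse_term = sum((ph - pl) ** 2
                   for row_hat, row_lab in zip(p_hat, p_label)
                   for ph, pl in zip(row_hat, row_lab))
    # Second term: positive only when a RAU exceeds its power budget.
    const_term = sum(relu(sum(row_hat) - p_max[n])
                     for n, row_hat in enumerate(p_hat))
    return lambda1 * mse_term + lambda2 * const_term

def batch_loss(batch_hat, batch_label, p_max, lambda1=1.0, lambda2=0.1):
    """Expectation in (6), approximated by the average over a batch."""
    losses = [sample_loss(ph, pl, p_max, lambda1, lambda2)
              for ph, pl in zip(batch_hat, batch_label)]
    return sum(losses) / len(losses)
```

For one RAU with outputs $(0.6, 0.6)$, labels $(0.5, 0.5)$, and a budget of 1, the squared error is 0.02 and the budget is exceeded by 0.2, so the loss is $0.02 + 0.1 \times 0.2 = 0.04$. When the outputs respect the budget, only the squared-error term remains.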

We adopt the RMSprop algorithm as the optimization algorithm, which is an efficient variant of mini-batch gradient descent, and we choose 0.9 as the decay rate [21]. To improve the performance of the DNN, we use the Xavier initialization [22] to initialize the weights. A large learning rate speeds up training but leads to a high convergence error, whereas a small learning rate slows training but yields a low convergence error. Similarly, a large batch size increases the convergence error, whereas a small batch size decreases it but makes training unstable. Therefore, we try different batch sizes and learning rates and choose appropriate values based on the validation error over the first 300 rounds of DNN training, as shown in Fig. 5 and Fig. 6, respectively.
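For reference, a single RMSprop update with the decay rate of 0.9 can be sketched as follows. This is the standard form of the update rule rather than the authors' code; the function name `rmsprop_step`, the flat parameter list, and the stabilizing constant `eps` are assumptions.

```python
import math

def rmsprop_step(params, grads, cache, lr=0.001, decay=0.9, eps=1e-8):
    """Apply one RMSprop update in place and return (params, cache).

    cache holds the running average of squared gradients, one entry per
    parameter; eps (an assumed value) guards against division by zero.
    """
    for i, g in enumerate(grads):
        # Exponential moving average of the squared gradient.
        cache[i] = decay * cache[i] + (1.0 - decay) * g * g
        # Step scaled by the root of the running average.
        params[i] -= lr * g / (math.sqrt(cache[i]) + eps)
    return params, cache
```

Dividing by the running root-mean-square of the gradient gives each parameter its own effective step size, which is why the learning rate and batch size still need to be tuned jointly against the validation error.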

FIGURE 5. Batch size selection.

FIGURE 6. Learning rate selection.

SECTION IV.

Simulation Results

In this section, we generate 50000 samples, of which 500 are randomly assigned to the validation dataset and 49500 to the training dataset. Then, to reduce the influence of randomness on the experimental results, we generate 5000 samples as the testing dataset, run both methods on all 5000 samples, and report their respective averages.

A. Scenario I and Parameter Selection

In this part, we consider scenario I, which contains 5 RAUs and 3 UEs. A fully connected deep neural network with five layers is proposed for DAS: one input layer with 15 nodes, three hidden layers with 50, 100, and 50 nodes, and one output layer with 15 nodes. We use ReLU as the activation function for both the hidden layers and the output layer. We set the batch size and the learning rate to 512 and 0.001, respectively. In addition, we set the scaling factors $\lambda _{1}$ and $\lambda _{2}$ to 1 and 0.1, respectively.

B. Scenario II and Parameter Selection

In this part, to verify the scalability of the DNN-based power allocation, we test the model in scenario II, which contains 5 RAUs and 10 UEs. The input layer has 50 nodes, the three hidden layers have 100, 150, and 100 neurons, and the output layer has 50 neurons. We use the same batch size, learning rate, and scaling factors as in scenario I.
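To give a sense of the size of the two networks, the number of trainable parameters implied by the layer widths can be counted as follows. The sketch assumes that every layer after the input has a bias term, which the text does not state explicitly; `count_parameters` is an illustrative helper.

```python
def count_parameters(layer_sizes):
    """Number of weights and biases in a fully connected network.

    Assumes one bias per neuron in every layer after the input layer.
    """
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    biases = sum(layer_sizes[1:])
    return weights + biases

scenario_1 = count_parameters([15, 50, 100, 50, 15])    # 5 RAUs x 3 UEs
scenario_2 = count_parameters([50, 100, 150, 100, 50])  # 5 RAUs x 10 UEs
print(scenario_1, scenario_2)  # prints "11715 40400"
```

The input and output widths equal $N \times K$ in both scenarios (15 and 50), so the network grows with the number of RAU-UE pairs.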

C. Results

The simulation parameters are listed in Table 1. In scenario I, when the objective is maximizing SE, the SE and EE performance of the sub-gradient algorithm and the DNN-based power allocation is shown in Fig. 7 and Fig. 8, respectively. When the objective is maximizing EE, the SE and EE performance of the two methods is shown in Fig. 11 and Fig. 12, respectively.

TABLE 1 Simulation Parameters

FIGURE 7. SE versus maximum transmit power in scenario I when the objective problem is maximizing SE.

FIGURE 8. EE versus maximum transmit power in scenario I when the objective problem is maximizing SE.

FIGURE 9. SE versus maximum transmit power in scenario II when the objective problem is maximizing SE.

FIGURE 10. EE versus maximum transmit power in scenario II when the objective problem is maximizing SE.

FIGURE 11. SE versus maximum transmit power in scenario I when the objective problem is maximizing EE.

FIGURE 12. EE versus maximum transmit power in scenario I when the objective problem is maximizing EE.

In the case of scenario II, when the objective problem is maximizing SE, the SE and EE performance of the two methods is shown in Fig. 9 and Fig. 10. When the objective problem is maximizing EE, the SE and EE performance of the two methods is shown in Fig. 13 and Fig. 14.

FIGURE 13. SE versus maximum transmit power in scenario II when the objective problem is maximizing EE.

FIGURE 14. EE versus maximum transmit power in scenario II when the objective problem is maximizing EE.

To show that the performance of the DNN-based power allocation is very close to that of the sub-gradient algorithm, the accuracy of the DNN method under the two scenarios is shown in Table 2 and Table 3, respectively. In addition, to better compare the computational complexity of the two methods, their computational times under the two scenarios are shown in Table 4 and Table 5, respectively.

TABLE 2 Performance Comparison for the Two Methods in Scenario I

TABLE 3 Performance Comparison for the Two Methods in Scenario II

TABLE 4 Computational Times for the Two Methods in Scenario I

TABLE 5 Computational Times for the Two Methods in Scenario II

D. Results Analysis

As shown in Fig. 7 to Fig. 14, the performance of the DNN method is very close to that of the traditional sub-gradient algorithm. As Table 2 and Table 3 show, in both scenarios described above, whether the objective is maximizing SE or maximizing EE, the DNN method reaches more than 92% of the performance of the traditional sub-gradient algorithm.
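The text does not define the accuracy figure formally. One plausible reading, offered here only as an assumption, is the ratio of the average SE (or EE) achieved by the DNN to the average achieved by the sub-gradient algorithm on the same test samples. A sketch under that assumption, with the hypothetical helper `relative_accuracy`:

```python
def relative_accuracy(dnn_values, reference_values):
    """Average DNN metric divided by average reference metric, in percent.

    This ratio is an assumed definition of "accuracy"; the source does not
    state the exact formula.
    """
    if len(dnn_values) != len(reference_values) or not dnn_values:
        raise ValueError("inputs must be non-empty and of equal length")
    dnn_mean = sum(dnn_values) / len(dnn_values)
    ref_mean = sum(reference_values) / len(reference_values)
    return 100.0 * dnn_mean / ref_mean
```

For instance, DNN values of 9.2 and 9.4 against reference values of 10.0 and 10.0 give a relative accuracy of 93%.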

Table 4 and Table 5 compare the computational times. First, for the DNN method, the online computational time is almost the same at different maximum transmission powers, because the structure of the DNN is the same within a scenario. For the traditional sub-gradient algorithm, the online computational time varies with the maximum transmission power, because the number of iterations required to find the optimal power allocation may differ. In addition, for the sub-gradient algorithm, the computational time required to maximize EE is much higher than that required to maximize SE. Moreover, the online computational time of the DNN method is many times less than that of the traditional sub-gradient algorithm.

To further understand the advantages of the DNN method, we compare the time complexity of the DNN method with that of the traditional sub-gradient algorithm. According to [23], the online time complexity of a trained fully connected neural network is $O(n)$. When the objective is maximizing SE, the online time complexity of the traditional sub-gradient method is $O(n^{3})$. When the objective is maximizing EE, the algorithm needs to combine fractional programming with the sub-gradient algorithm, so the online time complexity is higher than when maximizing SE.

In conclusion, the DNN method achieves more than 92% of the performance of the traditional sub-gradient algorithm while providing a speedup of at least three orders of magnitude.

SECTION V.

Conclusion

In this paper, we exploited a DNN to solve the power allocation problem in DAS. First, we introduced a system model and assumed that perfect channel state information is known at both the transmitter and the receiver and that the channels are orthogonal, so that users do not interfere with each other. Second, we randomly generated a large number of channel realizations, used the traditional sub-gradient algorithm to solve the SE and EE maximization problems, and saved the optimal power allocation schemes. Third, we fed the channel realizations and the corresponding power allocation schemes into the DNN and used the loss function to train it. Fourth, we generated additional channel realizations as the testing dataset, fed them into the trained DNN, and obtained its outputs. The same testing dataset was also fed into the traditional sub-gradient algorithm to calculate the optimal power allocation schemes. Finally, we compared the performance and the computational time of the DNN method with those of the sub-gradient algorithm.

In the future, we will explore further machine learning algorithms to solve the power allocation problem in DAS. For example, we will try to use a CNN to realize power allocation by using user geographic location instead of user channel information. In addition, we will try to use transfer learning to deal with the problem of insufficient training data in wireless communication systems. Moreover, we will try to use ensemble learning to strengthen the generalization ability of the trained model so that it can cope with various complicated situations in actual wireless communication systems. Finally, we hope to find more machine learning methods with excellent performance and low computational complexity so that power allocation algorithms can truly be implemented in actual wireless communication systems.
