I. Introduction
The rapid growth of data traffic in wireless networks imposes great challenges on future wireless communication systems [2]–[4], in particular on improving both the spectral efficiency and the energy efficiency. At the same time, users expect future networks to provide a uniform quality of service (QoS) over the coverage area. In many challenging scenarios, e.g., in shopping malls, in dense urban environments, or during traffic jams, the users are non-uniformly distributed over the network [5]. One widely acknowledged cost- and energy-efficient approach to tackle these challenges is the concept of heterogeneous dense networks, where the traditional macro base stations (BSs) are complemented with a dense deployment of low-cost and low-power BSs [6]–[8]. With such a large number of small cells, the low-power BSs can offload traffic from the macro BSs and reduce the average distance between users and transmitters, thereby improving the data rates and/or reducing the average transmit power. Since the data traffic load fluctuates greatly over the day [9], both macro and small cells might be needed at peak hours, while there is an opportunity to turn off some BSs when there is little traffic in the corresponding coverage areas. Load balancing is the technique that maps the current traffic load to the available transmission resources, i.e., associates users with BSs. Mathematically speaking, the goal is to find the BS association that maximizes some performance metric, subject to the QoS requirements of all users being fulfilled.
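As a rough sketch of the load balancing problem described above, the association task can be written as a generic constrained optimization problem. The notation is an assumption made purely for illustration and is not taken from the source: $L$ BSs and $K$ users, a binary variable $x_{l,k}$ that equals one if BS $l$ serves user $k$, a placeholder performance metric $f(\cdot)$, an achievable rate $R_k(\cdot)$ of user $k$, and a QoS target $\xi_k$. The constraint that each user is served by at least one BS is likewise an illustrative assumption.

\begin{equation*}
\begin{aligned}
\underset{\{x_{l,k}\}}{\text{maximize}} \quad & f\bigl(\{x_{l,k}\}\bigr) \\
\text{subject to} \quad & R_k\bigl(\{x_{l,k}\}\bigr) \geq \xi_k, \quad k = 1, \ldots, K, \\
& \sum_{l=1}^{L} x_{l,k} \geq 1, \quad k = 1, \ldots, K, \\
& x_{l,k} \in \{0, 1\}, \quad l = 1, \ldots, L, \; k = 1, \ldots, K.
\end{aligned}
\end{equation*}

Under this illustrative formulation, a BS $l$ with $x_{l,k} = 0$ for all $k$ serves no user and is therefore a candidate for being turned off when the traffic load is low.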