Introduction
With the increasing popularity of smartphones and Internet-of-Things (IoT) devices, many new applications that require significant computation and prompt responses have been developed. However, mobile devices and general embedded systems usually have limited computation capabilities. One solution to this problem is to offload individual tasks to cloud servers because, in general, cloud servers have massive computation resources [1]. Unfortunately, the servers are often located far from the users, resulting in high transmission latency. Recently, edge computing has been proposed to enhance the efficiency of the task offloading process [2], [3]. Edge computing deploys computational and storage resources to the edge of a network. Because the computational resources are relocated closer to the users, the communication latency can be significantly reduced. Nevertheless, given the rapid development of smartphone and IoT device applications, offloading demand can increase significantly and an unbalanced workload among edge servers may still result in longer computational latency when offloading tasks.
Some previous studies have considered load balancing by monitoring utilization metrics for the load generated by server-side computation [4], [5], but they have rarely treated communication distance as a main factor in workload balancing. The transmission time is usually determined by the data size and the data transmission rate [6], [7], and longer transmission distances also create greater load within the network. Traditional load-balancing models usually consider only the computational load on the servers and ignore the transmission load induced by distance.
In this paper, a task-offloading paradigm is considered. A number of edge servers are deployed at edge nodes within a network, and offloading requests from the other edge nodes can be allocated to these edge servers to fulfill the demand for computational power. Edge server placement and task allocation are determined by considering both the workload balance among the edge servers and the transmission distance of the offloaded tasks; balancing the workload and reducing the transmission distance correspond to reducing the computational and the communication latency, respectively. The problem is formulated as a mixed-integer linear programming problem. The Edge Server Placement (ESP) algorithm is then proposed as an efficient strategy for finding a better solution for edge server placement and task allocation. The ESP algorithm is based on simulated annealing and jointly considers edge server placement and task allocation: simulated annealing searches the enormous solution space of edge server locations, while task allocation for the selected locations is resolved using the Lagrangian duality theory [8] and the sub-gradient method. Simulations are conducted to evaluate the performance of the proposed scheme, and the results indicate that the ESP algorithm provides great flexibility in balancing the computational load among the edge servers against the transmission distance.
The main contributions of this paper are as follows:
This paper proposes a load balancing model that considers both the computational load on the edge servers and the transmission distance for task offloading in edge-computing networks.
An efficient and effective scheme is developed to provide a better solution for both task allocation and server placement.
The Lagrangian duality theory is integrated with the simulated annealing process to improve the search efficiency.
The rest of this paper is organized as follows. Section II briefly reviews related work. Section III presents the problem formulation and solution in detail. Section IV describes the simulation experiments and evaluates the performance. Finally, Section V concludes the paper.
Related Work
Task allocation and server placement problems have been intensively studied in recent decades. Both problems are often associated with optimization objectives and constraints that make them non-trivial and very challenging.
Some research has assumed that servers are already deployed and has focused on allocating clients to servers with fixed locations. For example, Ye et al. [9] studied user association policies in heterogeneous networks (HetNets) for load balancing and presented an efficient distributed algorithm that obtained a near-optimal solution. In [10], Athanasiou et al. modeled a client association problem that assigns clients to access points in a 60-GHz wireless access network. However, these studies did not consider the transmission distance between the clients and servers. Heuristic algorithms are often used to discover an effective association between clients and servers.
The studies reported in [11], [12], and [13] employed genetic algorithms for server placement and task allocation. Lim and Lee [11] aimed to find an effective strategy based on graph coloring for offloading tasks to edge servers so as to balance the load. Tang and Pan [12] focused on the energy consumption of the communication network in a data center and proposed a hybrid genetic algorithm that improved performance and efficiency by optimizing the placement of virtual machines. Xu et al. [13] presented a computational model based on vehicle-to-everything (V2X) communication in edge computing, and a genetic algorithm-based method was proposed as a balanced offloading strategy.
Game theory has also been widely adopted in task allocation for mobile-edge cloud computing [14], [15], [16], [17]. With multiple distributed users, every user is modeled as a game player, each of whom can independently determine its own offloading strategy. In [14], Chen presented a computation offloading model in which each task is assumed to be either processed locally or offloaded entirely to a single cloud server. The problem was formulated as a decentralized computation offloading game that can achieve a Nash equilibrium. Based on [14], Chen [15] further considered the problem of deciding whether to forward a user's tasks through a single access point to a remote cloud server in each round of the game. In contrast, Ma et al. [16] took multiple access points into account instead of a single one. In addition, in [17], a multi-user offloading problem was formulated as a stochastic game, and a stochastic learning algorithm was proposed in order to reach a Nash equilibrium.
Other researchers have proposed task-offloading schemes based on machine-learning techniques in edge computing. Li et al. [18] and Shuja et al. [19] surveyed machine-learning-based solutions for caching in an edge network. These approaches train a model to offload computationally intensive applications to a specified edge server. However, the offloading model can be complex, and a huge amount of data is required to train it.
In terms of server placement, previous studies have mostly concentrated on searching for candidate server positions as cluster heads to reduce the response time. Recently, studies have investigated the
Li and Wang [22] proposed an energy-aware edge server placement model to reduce energy consumption and computing resource utilization; a discrete particle swarm optimization algorithm was developed for both server placement and task allocation. In [23], Xu et al. proposed a model that minimizes a multi-objective function for social media services within the Cognitive Internet of Vehicles, and an integrated genetic algorithm was adopted to improve the quality of the services. Guo et al. [24] described a multi-objective optimization problem that minimizes the communication delay while balancing the load between devices. Although the above studies optimize multiple objectives, the weights of the terms in the objective function are usually selected intuitively and can thus dramatically change the final results. In contrast, the model proposed in this paper seamlessly integrates the costs of computation and communication and provides more meaningful control over the trade-off between the computational and communication loads.
Task Allocation and Server Placement
In this section, the system model for the server placement and task allocation problem is illustrated. An integer programming problem is then formulated based on the system model.
A. System Model
Consider a network composed of nodes and links, as shown in Fig. 1. The network topology can be modeled as an undirected graph $G = (V, E)$, where $V$ is the set of nodes and $E$ is the set of links.
B. Problem Formulation
Assume that there are $K$ edge servers to be deployed in the network. When a task is offloaded from node $i$ to an edge server at node $j$, the cost is weighted by the distance between the two nodes, defined as \begin{equation*} d_{ij} = {(hop_{ij} + 1)}^{\alpha}, \tag{1}\end{equation*} where $hop_{ij}$ is the number of hops between node $i$ and node $j$, and $\alpha$ is the distance factor that controls how heavily the transmission distance is weighted.
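To make the distance metric concrete, the following minimal Python sketch computes $hop_{ij}$ with a breadth-first search and applies Eq. (1); the toy topology and the value of $\alpha$ are illustrative only and are not taken from the paper's experiments.

```python
from collections import deque

def hop_counts(adj, src):
    """Breadth-first search returning the hop count from src to every reachable node."""
    hops = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    return hops

def distance_matrix(adj, alpha):
    """d_ij = (hop_ij + 1)^alpha for every ordered pair of nodes, as in Eq. (1)."""
    nodes = sorted(adj)
    d = {}
    for i in nodes:
        hops = hop_counts(adj, i)
        for j in nodes:
            d[(i, j)] = (hops[j] + 1) ** alpha
    return d

# Toy 5-node topology (adjacency lists) and an illustrative distance factor.
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 4], 3: [1], 4: [2]}
d = distance_matrix(adj, alpha=1.0)
print(d[(3, 4)])  # 3 -> 1 -> 2 -> 4 is 3 hops, so d = (3 + 1)^1 = 4.0
```

Note that $d_{ii} = 1$, so a node that processes its own tasks still contributes its request rate to the server's workload with unit weight.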
Let $\lambda_{i}$ denote the task request rate of node $i$. The largest weighted workload among the edge servers is then \begin{equation*} \eta = \max _{j \in V} \sum _{i \in V} d_{ij} \lambda _{i} y_{ij}, \tag{2}\end{equation*} where \begin{align*} y_{ij} = \begin{cases} 1, & \text{if the task originating from node } i \\ & \text{is allocated to the edge server at node } j \\ 0, & \text{otherwise}. \end{cases} \tag{3}\end{align*}
Note that the workload is weighted by the distance from the node to the edge server. Let $x_{i}$ indicate whether an edge server is deployed at node $i$: \begin{align*} x_{i} = \begin{cases} 1, & \text{if node } i \text{ is deployed with an edge server}\\ 0, & \text{otherwise}.\end{cases} \tag{4}\end{align*}
The goal is to minimize the largest workload among all the edge servers. Specifically, the server placement and task allocation problem can be formally expressed as follows:\begin{align*}&{\text {minimize}}~\eta \tag{5}\\&\text {subject to}~\sum _{i \in V} d_{ij} \lambda _{i} y_{ij} \leq \eta,\quad \forall j \in V \tag{6}\\&\hphantom {\text {subject to}~}\sum _{j \in V} y_{ij} = 1,\quad \forall i \in V \tag{7}\\&\hphantom {\text {subject to}~} \sum _{i \in V} x_{i} = K \tag{8}\\&\hphantom {\text {subject to}~} y_{ij} \leq x_{j},\quad \forall {i,j} \in V \tag{9}\\&\hphantom {\text {subject to}~} y_{ij} \in \{0,1\},\quad \forall i,j \in V \tag{10}\\&\hphantom {\text {subject to}~} x_{i} \in \{0,1\},\quad \forall i \in V. \tag{11}\end{align*}
Constraint (6) guarantees that the weighted workload of every edge server is less than or equal to $\eta$, so minimizing $\eta$ minimizes the largest workload. Constraint (7) ensures that the tasks originating from each node are allocated to exactly one edge server. Constraint (8) fixes the number of deployed edge servers at $K$, and constraint (9) allows tasks to be allocated only to nodes at which an edge server is deployed. Constraints (10) and (11) define the binary decision variables.
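For small instances, the program (5)-(11) can be handed directly to an off-the-shelf MILP solver. The sketch below expresses the formulation with the open-source PuLP library; the node set, request rates, distance values, and $K$ are placeholder data used purely for illustration.

```python
import pulp

def solve_placement(nodes, d, lam, K):
    """Mixed-integer formulation of (5)-(11) using PuLP (default CBC backend)."""
    prob = pulp.LpProblem("edge_server_placement", pulp.LpMinimize)
    eta = pulp.LpVariable("eta", lowBound=0)                      # objective variable (5)
    x = pulp.LpVariable.dicts("x", nodes, cat="Binary")           # server placement (11)
    y = pulp.LpVariable.dicts("y", (nodes, nodes), cat="Binary")  # task allocation (10)

    prob += eta
    for j in nodes:                                               # constraint (6)
        prob += pulp.lpSum(d[(i, j)] * lam[i] * y[i][j] for i in nodes) <= eta
    for i in nodes:                                               # constraint (7)
        prob += pulp.lpSum(y[i][j] for j in nodes) == 1
    prob += pulp.lpSum(x[i] for i in nodes) == K                  # constraint (8)
    for i in nodes:                                               # constraint (9)
        for j in nodes:
            prob += y[i][j] <= x[j]

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    servers = [i for i in nodes if x[i].value() > 0.5]
    alloc = {i: next(j for j in nodes if y[i][j].value() > 0.5) for i in nodes}
    return eta.value(), servers, alloc

# Illustrative data: 5 nodes, unit request rates, placeholder distances, K = 2.
nodes = [0, 1, 2, 3, 4]
lam = {i: 1.0 for i in nodes}
d = {(i, j): abs(i - j) + 1 for i in nodes for j in nodes}
print(solve_placement(nodes, d, lam, K=2))
```

Because the number of binary variables grows quadratically with the number of nodes, this direct approach is only practical for small topologies, which motivates the two-phase heuristic described next.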
The proposed solution can be separated into two phases: (1) the location of the edge servers is selected, and (2) the tasks from each node are allocated to the edge servers. To simplify the problem, the task allocation is resolved under the assumption that the edge server locations have already been selected. The edge server locations are then selected by integrating the task-allocation scheme.
C. Task Allocation Problem
Assume that $X \subseteq V$ is the set of nodes at which the edge servers have been deployed. Given $X$, the task allocation problem can be formulated as \begin{align*}&{\text {minimize}}~\eta \tag{12}\\&\text {subject to}~\sum _{i \in V} d_{ij} \lambda _{i} y_{ij} \leq \eta,\quad \forall j \in X \tag{13}\\&\hphantom {\text {subject to}~}\sum _{j \in X} y_{ij} = 1,\quad \forall i \in V \tag{14}\\&\hphantom {\text {subject to}~} y_{ij} \in \{0,1\},\quad \forall i \in V,\quad \forall j \in X. \tag{15}\end{align*}
This is a combinatorial problem; in the worst case, the complexity of conventional algorithms for solving it grows exponentially with the topology size.
The task allocation problem differs from the general assignment problem in that the objective is to reduce the largest load among the edge servers rather than the sum of the assignment costs. The Lagrangian duality approach applied to the association problem in [10] is adopted to solve the problem; for completeness, the derivation is briefly described as follows. Lagrangian duality solves the original problem by finding the solution of a dual function derived from the original one. The dual function is usually easier to solve, and its solution provides a bound on the optimal value of the primal problem. First, denote the Lagrangian of problem (12) as \begin{equation*} L(\eta, \mathbf {y}, \boldsymbol {u}) = \eta \left({1-\sum _{j \in X} u_{j} }\right)+\sum _{i \in V}\sum _{j \in X} d_{ij} \lambda _{i} u_{j} y_{ij}. \tag{16}\end{equation*}
To simplify the notation, let \begin{equation*} \mathcal {Y} = \mathcal {Y}_{1} \times \mathcal {Y}_{2} \times \cdots \times \mathcal {Y}_{n}, \tag{17}\end{equation*} where $\mathcal{Y}_{i}$ is the set of feasible allocation vectors for node $i$: \begin{equation*} \mathcal {Y}_{i} = \left\{ \mathbf {y}_{i} = (y_{ij})_{j \in X} \;\middle|\; \sum _{j \in X} y_{ij} = 1,\; y_{ij} \in \{0,1\} \right\}. \tag{18}\end{equation*}
Furthermore, the Lagrange dual function $g(\boldsymbol{u})$ is given by \begin{align*} g(\boldsymbol {u})=&\inf _{\mathbf {y} \in \mathcal {Y}} L(\eta, \mathbf {y}, \boldsymbol {u}) \tag{19}\\=&\begin{cases} \inf _{\mathbf {y} \in \mathcal {Y}}\sum _{i \in V}\sum _{j \in X} d_{ij} \lambda _{i} u_{j} y_{ij}, & \sum _{j \in X} u_{j} = 1 \\ - \infty, & \mathrm {otherwise}\end{cases} \tag{20}\\=&\begin{cases} \sum _{i \in V}\inf _{\mathbf {y}_{i} \in \mathcal {Y}_{i}}\sum _{j \in X} d_{ij} \lambda _{i} u_{j} y_{ij}, & \sum _{j \in X} u_{j} = 1 \\ - \infty, & \mathrm {otherwise}\end{cases} \tag{21}\\=&\begin{cases} \sum _{i \in V} g_{i}(\boldsymbol {u}), &\sum _{j \in X} u_{j} = 1 \\ - \infty, & \mathrm {otherwise}. \end{cases} \tag{22}\end{align*}
In Eq. (16), $u_{j}$, $j \in X$, are the Lagrange multipliers associated with the constraints in (13), and $g_{i}(\boldsymbol{u})$ in Eq. (22) is the optimal value of the following subproblem for node $i$: \begin{align*}&\min \sum _{j \in X} d_{ij} \lambda _{i} u_{j} y_{ij} \tag{23}\\&\textrm {s.t. } \mathbf {y}_{i} \in \mathcal {Y}_{i}. \tag{24}\end{align*}
Finally, the Lagrange dual problem can be formulated as \begin{align*}&\max g(\boldsymbol {u}) = \sum _{i \in V} g_{i}(\boldsymbol {u}) \tag{25}\\&\textrm {s.t. } \sum _{j \in X} u_{j} = 1 \tag{26}\\&\hphantom {\textrm {s.t. }}u_{j} \geq 0, j \in X. \tag{27}\end{align*}
The load-balancing problem in (12) is thus converted into the Lagrange dual problem, which is a convex problem.
According to the properties of Lagrangian duality, if the primal problem is convex, the optimal values of the primal problem and the corresponding dual problem are the same. In contrast, if the primal problem is non-convex, there is a duality gap between the optimal values of the two problems. Even though such a gap exists, a feasible solution approximating the optimal solution of the primal problem can still be obtained by solving the dual problem.
The dual problem is solved iteratively: for a given multiplier vector $\boldsymbol{u}$, the per-node subproblem (23) is solved first, and $\boldsymbol{u}$ is then updated with the sub-gradient method.
Although problem (23) is combinatorial, its solution can be obtained trivially as \begin{align*} y^{*}_{ij}= \begin{cases} 1, & j = \arg \min _{k \in X} d_{ik} \lambda _{i} u_{k} \\ 0, & \mathrm {otherwise}. \end{cases} \tag{28}\end{align*}
The solution $y^{*}_{ij}$ simply allocates the tasks of node $i$ to the edge server $j$ that minimizes the weighted cost $d_{ij} \lambda_{i} u_{j}$.
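A minimal sketch of this inner step, assuming the distances, request rates, and multipliers are available as Python dictionaries (the names and data are illustrative):

```python
def allocate(d, lam, u, nodes, servers):
    """Eq. (28): assign each node i to the server j minimizing d_ij * lambda_i * u_j."""
    return {i: min(servers, key=lambda j: d[(i, j)] * lam[i] * u[j]) for i in nodes}

# Example with illustrative data: three nodes, two candidate servers, uniform multipliers.
d = {(i, j): abs(i - j) + 1 for i in range(3) for j in range(2)}
print(allocate(d, lam={i: 1.0 for i in range(3)}, u={0: 0.5, 1: 0.5},
               nodes=range(3), servers=[0, 1]))
# {0: 0, 1: 1, 2: 1}
```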
The sub-gradient method is used for the problem in (25). Let \begin{equation*} s_{j} = - \sum _{i \in V} d_{ij} \lambda _{i} y^{*}_{ij} \tag{29}\end{equation*} be the $j$-th component of the search direction $\boldsymbol{s}$, where $y^{*}_{ij}$ is the solution obtained from (28) under the current multipliers.
The multipliers are then updated iteratively as \begin{equation*} \boldsymbol {u}^{(k+1)} = P\left(\boldsymbol {u}^{(k)} - \beta _{k} \boldsymbol {s}^{(k)}\right), \tag{30}\end{equation*} where $P(\cdot)$ denotes the projection onto the feasible set $\{\boldsymbol{u} \mid \sum_{j \in X} u_{j} = 1,\; u_{j} \geq 0\}$ and $\beta_{k}$ is the step size at iteration $k$.
However, as mentioned previously, the primal problem (12) is non-convex, so a feasible solution for the primal problem cannot be obtained from the dual problem directly. In our problem, the allocation $\mathbf{y}^{*}$ obtained from (28) already satisfies constraints (14) and (15); it therefore provides a feasible allocation whose objective value $\max_{j \in X} \sum_{i \in V} d_{ij} \lambda_{i} y^{*}_{ij}$ approximates the optimal value of the primal problem.
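The overall dual procedure of (28)-(30) can be sketched as follows. The projection $P(\cdot)$ onto the simplex is implemented with the standard sorting-based method, and the diminishing step size $\beta_k = 1/k$, the iteration count, and the random test instance are illustrative choices rather than the ones used in the paper; a feasible primal value is recorded at every iterate.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {u : sum(u) = 1, u >= 0} (sorting method)."""
    n = len(v)
    s = np.sort(v)[::-1]
    css = np.cumsum(s)
    rho = np.nonzero(s * np.arange(1, n + 1) > (css - 1))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def subgradient_allocation(d, lam, iters=200):
    """d: |V| x |X| distance matrix, lam: request rates; returns best feasible (eta, y)."""
    n, m = d.shape
    cost = d * lam[:, None]                           # c_ij = d_ij * lambda_i
    u = np.full(m, 1.0 / m)                           # start from uniform multipliers
    best_eta, best_y = np.inf, None
    for k in range(1, iters + 1):
        y = np.argmin(cost * u[None, :], axis=1)      # Eq. (28): per-node argmin
        load = np.bincount(y, weights=cost[np.arange(n), y], minlength=m)
        eta = load.max()                              # feasible primal value for this allocation
        if eta < best_eta:
            best_eta, best_y = eta, y.copy()
        s = -load                                     # Eq. (29): search direction component
        u = project_simplex(u - (1.0 / k) * s)        # Eq. (30): projected multiplier update
    return best_eta, best_y

# Illustrative instance: 8 nodes, 3 candidate servers, unit request rates.
rng = np.random.default_rng(0)
d = rng.integers(1, 5, size=(8, 3)).astype(float)
lam = np.ones(8)
print(subgradient_allocation(d, lam))
```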
D. The Proposed Scheme
In the next phase, the edge server locations are selected, and the task allocation scheme presented above is integrated with the server placement. Selecting locations for the edge servers is important because different edge server locations may lead to different workloads, which are weighted by the distance between the servers and the allocated nodes. As the topology size increases, it becomes very challenging to deploy edge servers optimally because the number of combinations of task allocations and server locations grows exponentially. To deal with this large combinatorial optimization problem, simulated annealing is adopted in the search for the global optimal solution for edge server placement and task allocation. The proposed algorithm is listed in Algorithm 2.
Initially, the $K$ edge server locations are selected at random. The simulated annealing process then requires several components. First, the temperature $T$ is set to an initial value and is gradually lowered according to a cooling schedule. Second, an acceptance criterion is defined: a new configuration is accepted with probability \begin{equation*} P = \min \left(1, e^{- \Delta f / T}\right), \tag{31}\end{equation*} where $\Delta f$ is the change in the objective value between the new configuration and the current one. Hence, a worse configuration may still be accepted, with a probability that decreases as the temperature drops.
Third, a generation mechanism for new configurations is required. To search the large combinatorial space efficiently, simulated annealing explores different configurations to obtain better results: at every iteration, a new configuration is generated and compared with the current one so that the better configuration can be retained. In the proposed algorithm, the configuration is updated at each iteration as follows. One of the edge servers, together with the group of nodes allocated to it (its cluster), is selected. Within that cluster, the node that would have the least workload if it served as the edge server is chosen as the new edge server. Algorithm 3 lists this server-updating process. The nodes in the network are then reallocated according to the new set of servers in the next iteration. The algorithm stops when the temperature falls below a preset threshold.
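The overall placement search can be sketched as a simulated annealing loop in the spirit of Algorithms 2 and 3. For brevity, this sketch allocates each node to its cheapest server instead of invoking the Lagrangian allocation procedure described above, and the initial temperature, cooling rate, and stopping threshold are placeholder values.

```python
import math
import random

def objective(servers, d, lam, nodes):
    """Largest weighted server load when every node offloads to its cheapest server."""
    load = {j: 0.0 for j in servers}
    for i in nodes:
        j = min(servers, key=lambda s: d[(i, s)] * lam[i])   # placeholder allocation rule
        load[j] += d[(i, j)] * lam[i]
    return max(load.values())

def update_server(servers, d, lam, nodes):
    """Move one randomly chosen server to the cluster member that would carry the
    least weighted load if it served that cluster (cf. Algorithm 3)."""
    clusters = {j: [] for j in servers}
    for i in nodes:
        clusters[min(servers, key=lambda s: d[(i, s)] * lam[i])].append(i)
    old = random.choice(list(servers))
    members = [c for c in clusters[old] if c == old or c not in servers] or [old]
    new = min(members, key=lambda c: sum(d[(i, c)] * lam[i] for i in clusters[old]))
    return [new if j == old else j for j in servers]

def esp_sa(nodes, d, lam, K, T=10.0, cooling=0.95, T_min=1e-3):
    """Simulated annealing over server placements with the acceptance rule in (31)."""
    servers = random.sample(list(nodes), K)                  # random initial placement
    f = objective(servers, d, lam, nodes)
    best, best_f = list(servers), f
    while T > T_min:
        cand = update_server(servers, d, lam, nodes)
        cf = objective(cand, d, lam, nodes)
        # Eq. (31): improvements are always accepted; worse moves with prob exp(-delta_f / T).
        if cf <= f or random.random() < math.exp(-(cf - f) / T):
            servers, f = cand, cf
            if f < best_f:
                best, best_f = list(servers), f
        T *= cooling
    return best, best_f
```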
E. Time Complexity
In the ESP algorithm, time is primarily spent on the allocation algorithm. At each iteration, the server placement and task allocation are updated, with the complexity of updating the server in Algorithm 3 being about
Simulations
In this section, the proposed algorithm is verified and evaluated using topologies of various sizes. The experimental results are discussed to determine the effectiveness of the proposed algorithm in terms of the placement and allocation problem.
A. Environmental Setup
In the experiments, 100 different topologies are generated with different numbers of links ranging from
The performance of the proposed scheme is compared with classic K-medoids clustering [27] and a greedy algorithm. The greedy algorithm deploys the edge servers one by one: in each iteration, it places an edge server at the node with the highest request rate among the remaining nodes that have not yet been allocated to an edge server, and the nodes in its vicinity are allocated to that edge server, starting with the closest node, until the sum of the allocated task request rates reaches the average, i.e., $\sum_{i \in V} \lambda_{i} / K$.
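A sketch of this greedy baseline is given below; where the description leaves details open (tie-breaking and the handling of leftover nodes), the choices made here are assumptions for illustration.

```python
def greedy_placement(nodes, d, lam, K):
    """Greedy baseline: place servers one by one at the highest-rate unallocated node,
    then fill the new cluster with its closest unallocated nodes until the cluster's
    total request rate reaches the per-server average."""
    target = sum(lam[i] for i in nodes) / K          # average request rate per server
    unallocated = set(nodes)
    servers, alloc = [], {}
    for _ in range(K):
        if not unallocated:
            break
        s = max(unallocated, key=lambda i: lam[i])   # highest remaining request rate
        servers.append(s)
        total = 0.0
        for i in sorted(unallocated, key=lambda n: d[(n, s)]):   # closest nodes first
            if total >= target:
                break
            alloc[i] = s
            total += lam[i]
            unallocated.discard(i)
    for i in list(unallocated):                      # assumption: leftovers go to the nearest server
        alloc[i] = min(servers, key=lambda j: d[(i, j)])
        unallocated.discard(i)
    return servers, alloc
```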
B. Simulation Results
Fig. 2 presents the results for the largest server workload against the number of nodes in the topology. Twenty edge servers are deployed in the topology. Intuitively, the average objective value (
In Fig. 3,
In addition, the algorithms are also compared with the approximate solution obtained using CPLEX, an IBM tool for optimization problems. CPLEX is used as a comparison because its solution is considered to be close to the optimal. The community edition of the optimizer is employed in the experiments, so the solver can only handle small topologies. The number of edge servers is fixed at
The computational efficiency of each algorithm is also compared. Fig. 7 shows the average execution time according to the number of nodes.
Experiments are also conducted to evaluate the impact of the distance factor $\alpha$.
In Fig. 9, the difference in the weighted load between servers is evaluated with different values of the distance factor $\alpha$.
In Fig. 10, an example of the effect of the distance factor $\alpha$ is illustrated.
Conclusion
Edge server placement strongly affects the efficiency of task allocation. In this paper, a server placement and task allocation method for an edge-computing network is studied. We formulate the scenario as a mixed-integer linear programming problem and propose a simulated annealing-based edge server placement and task allocation algorithm. To evaluate the performance of the proposed algorithm, different topology sizes are considered, including real-world networks. The impact of the distance factor is also examined in detail. The results show that, by adopting the proposed ESP algorithm, the cost of the most heavily loaded server, i.e., the objective value in this paper, can be effectively reduced while the runtime remains manageable.
In reality, the capacity of each server to handle the workload may differ. Furthermore, mobile users may offload different types of tasks, and those tasks could be processed by different edge servers. Thus, a more advanced model with different task types and different computing capacities for the edge servers should be considered in future work.