
Ant Colony Optimization Based Memetic Algorithm to Solve Bi-Objective Multiple Traveling Salesmen Problem for Multi-Robot Systems


Multiple-ACO and local search in bounded neighborhoods for bi-objective MTSP.

Abstract:

This paper considers the problem of using a team of mobile robots to visit a set of target locations, known as the multi-robot patrolling problem. The problem is formulated as a multiple traveling salesman problem (MTSP) with a single depot or multiple depots, which is an NP-hard problem. Unlike most previous research works, we address the requirement, common in real-world applications, of optimizing the maximum traveled distance and the total traveled distance simultaneously. A bi-objective ant colony optimization (ACO) based memetic algorithm is proposed to solve the problem. In the algorithm, a simple multi-objective ACO is integrated with a sequential variable neighborhood descent. A powerful local optimization method for the bi-objective MTSP is proposed to improve the candidate solutions. In addition, we adopt the technique for order preference by similarity to an ideal solution (TOPSIS) to select a reasonable solution from the optimal Pareto front. Through computational experiments, we demonstrate the benefits of our algorithm compared with four other existing algorithms. Computational results show that the proposed algorithm is promising and effective for bi-objective MTSPs.
Published in: IEEE Access ( Volume: 6)
Page(s): 21745 - 21757
Date of Publication: 19 April 2018
Electronic ISSN: 2169-3536

SECTION I.

Introduction

Research on real-world problems of multi-robot systems has received widespread interest [1]–[4]. In multi-robot systems, robots need to collaborate to efficiently accomplish complex missions that consist of multiple subtasks. The subtasks may need to be accomplished concurrently or in sequence [5]. The multi-robot task allocation (MRTA) problem is one of the main challenges remaining in the control of multi-robot systems. The MRTA problem addresses minimizing the overall system cost by finding an efficient assignment of the tasks to the robots.

In this paper, we consider patrol missions in a security patrol or disaster situation. The missions are to use the robots to take photos and to collect sensor data at several target locations to help the rescuers or the police take appropriate actions in time. According to the taxonomy given in [6] and [7], the problem belongs to the category of single-task (ST), single-robot (SR), time-extended assignment (TA).

The specific problem investigated in this paper is that of assigning m target locations to n<m robots with limited battery life. The objective is to minimize the total traveled distance and the maximum tour length simultaneously. Each target location requires only one robot to visit it. Each robot may be assigned multiple target locations, which it visits in sequence. Each robot should return to its base to recharge before its battery runs out.

Existing methods to solve this type of task allocation problem fall into two categories: centralized methods and distributed methods. In a centralized task allocation system, a central server gathers information from each robot and computes the assignments. In a distributed system, the algorithm runs on each robot simultaneously and the robots exchange information to reach better solutions. Compared to centralized methods, distributed methods avoid overloading the central server with communication and computation. On the other hand, centralized methods may find it easier to achieve global optimization and to ensure the consistency of information [8].

This problem is similar to the multiple traveling salesman problem (MTSP), a variation of the well-known traveling salesman problem (TSP). As the TSP, which has only one salesman, is a special case of the MTSP, the MTSP is strongly NP-hard. In this paper, we model the problem as a multiple depot multiple traveling salesman problem (MD-MTSP) with an additional limit on the distance a robot can travel after leaving its depot. In an MD-MTSP, each robot starts from a certain depot location and returns to that location after having visited all its assigned targets. During this time, if the robot needs to be recharged, it also returns to its depot. The special case in which all robots start from the same depot, called the single-depot multiple traveling salesman problem (SD-MTSP), is also considered in this paper. Most previous research on this issue considered a mission cost equal to either the total travel distance or the maximum tour length. In real-world problems, both the total travel distance and the maximum tour length need to be optimized simultaneously to save cost and improve efficiency. In this paper, we therefore focus on the bi-objective combinatorial optimization problem. In a multi-objective problem, none of the feasible solutions is simultaneously optimal for all objectives. Generally, there are two kinds of approaches to multi-objective problems. One is to reduce the problem to a scalar optimization problem, for example by combining all objectives in a single cost function or by optimizing the objectives in sequence according to their importance [9]. The other is first to find a set of Pareto-optimal solutions, whose image in the objective space is the Pareto front, and then to let a decision maker pick the final solution from the Pareto front according to its preference [10], [11]. Among the existing approaches for multi-objective problems, only a few take effective local search into account. To our knowledge, few local search methods have been developed that take every objective into account.

This paper proposes an ant colony optimization (ACO) based memetic algorithm to solve the bi-objective multiple traveling salesmen problem. The proposed algorithm is a centralized method, but with reasonable calculation time. We consider the two objectives of total travel distance and maximum tour length in capturing the optimal Pareto front. A simple multi-objective ACO (mACO) is used. In the mACO, ants are divided into groups. Each group is responsible for one single-objective subproblem (a weighted sum of the objectives), whose optimal solution corresponds to a particular point on the Pareto front in the objective space. Some neighboring points on the front may be reached by a group before it reaches its optimal point. A local search procedure that integrates the sequential variable neighborhood descent (seq_VND) and takes both objectives into account is proposed to further improve the solutions on the Pareto front.

The main contributions of this paper are:

  • It proposes an ACO based memetic algorithm to optimize both the total travel distance and the maximum tour length simultaneously for the MTSP.

  • The proposed algorithm is suitable for both the MD-MTSP and the SD-MTSP.

  • It defines a neighborhood sequence for the seq_VND on the bi-objective MTSP.

  • It proposes a method to locally improve a solution on the Pareto front while taking both objectives into account.

The rest of this paper is organized as follows: Section 2 presents a comprehensive literature review of relevant works. Section 3 describes the problem. Section 4 presents the ACO based memetic algorithm. Section 5 discusses the experimental results. Finally, Section 6 provides some conclusions and directions for future study.

SECTION II.

Related Works

The multi-robot task allocation problem is considered a variant or an instance of the multiple traveling salesman problem by many researchers [12]. Kivelevitch et al. [13] proposed a market-based algorithm in which agents and tasks operate in a market to achieve near-optimal solutions. Spensieri et al. [14] proposed an iterative and decoupled approach to dispatch the tasks among the robots and to route and schedule the robots in a collision-free way. Miao et al. [15] established a dynamic task allocation model and proposed a distributed immune multi-agent algorithm (DIMAA) based on an immune multi-agent network framework. Cheikhrouhou et al. [16], [17] proposed a distributed approach called Move-and-Improve to solve the allocation problem under a limited communication range.

Several approaches to solve the MTSP exist in the literature. Lixin et al. [18] proposed a modified genetic algorithm (MGA) with a one-chromosome representation for the MTSP and used the algorithm to solve the hot rolling production scheduling problem. Genetic algorithms with a two-chromosome representation are used to solve the vehicle scheduling problem (VSP) in [19] and [20]; these approaches can also be adapted to the MTSP. The MTSP is a grouping problem: it seeks an optimal assignment of targets into different groups according to the objective. The grouping genetic algorithm (GGA) is a modified genetic algorithm designed for grouping problems [21]. Several researchers have proposed grouping genetic algorithms to solve the MTSP. Carter and Ragsdale [22] presented a genetic algorithm with a two-part chromosome similar to the grouping genetic algorithm. Brown et al. [23] also proposed a grouping genetic algorithm to solve the MTSP. The superiority of both approaches has been shown by comparing their results with those of the two genetic algorithms with one-chromosome and two-chromosome representations. Singh and Baghel [24] proposed a new grouping genetic algorithm using a different chromosome representation and different genetic operators from those used in [23]. Simulation results have shown the advantages of this algorithm compared with all the previous approaches.

Generally, a local search procedure (LSP) can improve the search performance of an algorithm. Weimin et al. [25] proposed an ant colony optimization (ACO) algorithm that employs a local search using both inter-tour and intra-tour operators for solving the MTSP. Venkatesh and Singh [26] proposed an artificial bee colony based algorithm and an invasive weed optimization based algorithm, and combined them with a simple LSP. However, the experimental results showed that the simple LSP cannot achieve great improvements. Soylu [27] proposed a general variable neighborhood search algorithm, and the experimental results show the effectiveness of the algorithm. Yongzhen et al. [28] proposed a memetic algorithm based on sequential variable neighborhood descent for the min-max MTSP. The memetic algorithm integrates a sequential variable neighborhood descent (seq_VND), a powerful local search procedure including three local search operators derived from [27] and one extended operator.

However, the above-mentioned research works consider mono-objective problems: the objectives of total travel distance and maximum tour length are considered separately. Usually, in real-world situations, one criterion is not sufficient to achieve a desirable solution, so it is highly necessary to consider the bi-objective MTSP. Solving a multi-objective problem typically means obtaining a Pareto front and using a decision maker (DM) to determine the preferred solution. Few research works have solved the MTSP as a multi-objective problem. Shim et al. [29] integrated the estimation of distribution algorithm (EDA) into a decomposition framework and hybridized it with three local search techniques to solve the multi-objective MTSP. Ke et al. [11] proposed a multi-objective evolutionary algorithm combining ant colony optimization (ACO) and the MOEA/D framework [30] (MOEA/D-ACO). The kernel of this algorithm is to decompose the multi-objective problem into single-objective problems. The ants are divided into several groups: the number of groups is equal to the number of objectives, and the number of ants is equal to the number of single-objective subproblems. All ants in a group maintain a pheromone matrix, and each ant has a heuristic information matrix. Experimental results have shown the efficiency of MOEA/D-ACO. Trigui et al. [31] proposed a fuzzy logic approach to solve the multi-objective MTSP. The approach is centralized, yet fast: the experiments show that its execution time is always shorter than that of the MDMTSP-GA [32] approach, with a ratio of 89%. However, because the approach reduces the multi-objective problem to a single-objective one, it obtains only one of the Pareto-optimal solutions for each problem.

Due to the complexity of multi-objective problems, few research works use local search procedures to improve their algorithms for the multi-objective multiple traveling salesman problem. The main contribution of this paper is to combine a simple multi-ACO algorithm with a seq_VND and to propose a local optimization method for the bi-objective MTSP. In this way, the candidate solutions found by the algorithm can be further improved by local search procedures, and the improved solutions, in turn, improve the search capability of the multi-ACO.

SECTION III.

Problem Description

The multiple traveling salesmen problem has received wide attention in multi-robot applications. In this paper, the bi-objective multiple traveling salesmen problem (MTSP) is studied. The scope of the MTSP in this paper is confined to assigning a set of m targets to a set of n robots having a single depot or multiple depots. In this problem, every target should be visited once by one of the robots, and every robot should visit at least one target. The objective is to simultaneously optimize the total distance and the maximum distance traveled by the robots. The SD/MD-MTSP is NP-hard, and near-optimal solutions can be obtained by heuristic approaches. The problem is usually solved in two steps:

  1. Assign each target to a robot;

  2. Determine the visit order of each robot.

We assume that the locations of all targets and depots are given at the beginning and do not change during the whole process. Assume there are $n$ robots and $m$ targets. Let $R_{i}$ denote robot $i$ and $T_{j}$ denote target $j$. The $k$th route of robot $R_{i}$ is written as ${P_{i,k}} = \{ {R_{i}},{T_{i_{1}}},\ldots,{T_{i_{m_{ik}}}},{R_{i}}\}$, where $m_{ik}$ is the number of targets assigned to robot $i$'s $k$th route and $p_{i}$ is the number of routes of robot $R_{i}$. $D(P_{i,k})$ denotes the total length of $P_{i,k}$:\begin{align*} D({P_{i,k}}) = distance({R_{i}},{T_{i_{1}}}) + \sum \limits _{j = 2}^{m_{ik}} {distance({T_{i_{j - 1}}},{T_{i_{j}}})} + distance({T_{i_{m_{ik}}}},{R_{i}}). \tag{1}\end{align*} In addition, a binary variable is defined:\begin{equation*} {x_{ij}} = \begin{cases} 1 & \text{if } T_{j} \text{ is assigned to } R_{i}\\ 0 & \text{otherwise.} \end{cases} \end{equation*} The objective is to optimize the two main performance criteria of the MTSP, the maximum tour length and the total travel distance, simultaneously:\begin{align*}&\min ~\sum \limits _{i = 1}^{n} \sum \limits _{k = 1}^{p_{i}} D({P_{i,k}}), \tag{2}\\&\min ~\max _{i}\left[\sum \limits _{k=1}^{p_{i}} D({P_{i,k}})\right], \tag{3}\end{align*} subject to the following constraints:\begin{align*} D({P_{i,k}}) &\le dmax_{i},\quad i = 1,2,\ldots,n,~k=1,2,\ldots,p_{i}, \tag{4}\\ \sum \limits _{i = 1}^{n} {x_{ij}} &= 1,\quad j = 1,2,\ldots,m, \tag{5}\\ \sum \limits _{i = 1}^{n} \sum \limits _{j = 1}^{m} {x_{ij}} &= m, \tag{6}\\ \sum \limits _{j = 1}^{m} {x_{ij}} &\ge 1,\quad i = 1,2,\ldots,n. \tag{7}\end{align*} Constraint (4) bounds the length of each route by the robot's cruising ability, where $dmax_{i}$ is the distance robot $i$ can travel per charge. Constraints (5) and (6) imply that every target is assigned to exactly one robot. Constraint (7) implies that every robot is assigned at least one target.
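For illustration only, the following Python sketch evaluates objectives (1)-(3) for a candidate solution; the point-based representation, the Euclidean distance, and the function names are assumptions made for this example and are not part of the original formulation.

import math

def dist(a, b):
    # Euclidean distance between two (x, y) locations (assumption for this sketch)
    return math.hypot(a[0] - b[0], a[1] - b[1])

def route_length(depot, route):
    # Eq. (1): depot -> assigned targets in visiting order -> back to depot
    points = [depot] + list(route) + [depot]
    return sum(dist(points[j - 1], points[j]) for j in range(1, len(points)))

def objectives(depots, solution):
    # solution[i] is the list of routes of robot i; each route is a list of target points
    per_robot = [sum(route_length(depots[i], r) for r in routes)
                 for i, routes in enumerate(solution)]
    total_distance = sum(per_robot)   # Eq. (2): total travel distance
    max_tour = max(per_robot)         # Eq. (3): maximum tour length
    return total_distance, max_tour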

SECTION IV.

The ACO-Based Memetic Algorithm For Bi-Objective MTSP

Generally, local search algorithms can explore a neighborhood of a solution and further improve the solution quality. The proposed algorithm is a combination of an ACO algorithm and local search algorithms. Features of our ACO algorithm and local search algorithms are described in subsequent sections.

A. The ACO Algorithm for MTSP

Our approach is based on an ACO algorithm with the MAX-MIN Ant System [33]. In real ant colonies, ants share information and construct their routes via pheromones. In ACO-based algorithms, the heuristic desirability is an artificially added factor that guides ants in constructing solutions. The main features of our ACO algorithm for the MTSP are as follows.

1) Solution Construction

In ACO, every ant constructs a solution in every iteration. In the proposed algorithm, solution construction follows the procedure below.

  1. Initialize every robot’s target assignment sequence ${L_{i}}=\emptyset $ . Initialize every robot’s tour ${P_{i}} = [{R_{i}},{R_{i}},{R_{i}}],\;i = 1,2,\ldots,N$ ;

  2. For each robot, select an unassigned target, insert it into the second position of the robot’s tour and add it to the assignment sequence;

  3. Select an unassigned target and add it to the end of the assignment sequence. All possible insertion positions (all positions except the first and the last of a tour) in the tours must be checked to determine the insertion position of the target. The target is inserted into the position with the least cost increase. If the target is inserted into $P_{i}$, add it to the end of $L_{i}$. If the target is inserted into the second-to-last position of $P_{i}$, an extra ”$R_{i}$ ” should be added to the end of $P_{i}$;

  4. Repeat step 3 until all targets are assigned.

In this paper, to construct a solution means to construct every robot's tour. Step 2 ensures that every robot visits at least one target in all generated solutions. The target insertion in step 3 guarantees that only one solution can be generated from a given target assignment order. Adding an extra ”$R_{i}$ ” to the end of $P_{i}$ in step 3 makes it possible to calculate multiple tours of a robot under the cruising-ability limit.
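A minimal sketch of the cheapest-insertion rule of step 3 is given below, assuming each tour is stored as a list that starts and ends with the robot's depot and that dist is a distance function; the helper names are illustrative only.

def insertion_cost(tour, pos, target, dist):
    # Cost increase of inserting `target` between tour[pos-1] and tour[pos]
    a, b = tour[pos - 1], tour[pos]
    return dist(a, target) + dist(target, b) - dist(a, b)

def cheapest_insert(tours, target, dist):
    # Check every allowed position (all positions except before the first and
    # after the last element of a tour) and insert where the increase is smallest.
    best = None  # (cost, tour_index, position)
    for ti, tour in enumerate(tours):
        for pos in range(1, len(tour)):
            c = insertion_cost(tour, pos, target, dist)
            if best is None or c < best[0]:
                best = (c, ti, pos)
    _, ti, pos = best
    tours[ti].insert(pos, target)
    return ti  # index of the tour the target was added to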

In ACO-based algorithms, the heuristic desirability and the pheromones guide ants in choosing where to go next from their departure place. In the MTSP, tours are, to some extent, independent of each other. Therefore, in the proposed algorithm, the departure place for an ant at each step is selected randomly from among the most recently selected targets of the robots.

The probability of ant $a$ selecting $T_{j}$ after selecting $T_{i}$ as the departure target in step 3 is calculated as (8), where $\eta (O_{i},T_{j}) = 1/distance(O_{i},T_{j})$ is the visibility from a node $O_{i}$ (a target or a robot) to a target, $\tau ^{target}$ is the existing pheromone trail on the path between two targets, and $\tau ^{robot}$ is the existing pheromone trail on the path between a target and the robots.

The probability of ant $a$ selecting $T_{j}$ for $R_{i}$ in step 2 (i.e., choosing $T_{j}$ as the next node from $R_{i}$) is calculated as follows:\begin{align*} P({R_{i}},{T_{j}}) = \begin{cases} {\dfrac {{[{\tau ^{robot}}({T_{j}})] }^\alpha {[\eta ({R_{i}},{T_{j}})] }^\beta }{\sum \limits _{T_{o} \in {S_{a}}} {[{\tau ^{robot}}({T_{o}})] }^\alpha {[\eta ({R_{i}},{T_{o}})] }^\beta },}&{\text {if } T_{j} \in {S_{a}}}\\ 0,&{\text {otherwise.}} \end{cases} \tag{9}\end{align*} The existing pheromone trail on the path between a target and every robot is the same, so the method is suitable for both the single depot and the multiple depot scenarios of the MTSP. $\alpha $ and $\beta $ are the two coefficients that determine the relative importance of the pheromone trail and the heuristic desirability, and $S_{a}$ is the set of targets not yet assigned by ant $a$.
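As an illustration of the selection rules (8) and (9), the sketch below performs a roulette-wheel draw in which each unassigned target is weighted by pheromone^alpha times visibility^beta; here tau and eta are assumed callables returning the relevant pheromone value and visibility, and the function name is hypothetical.

import random

def select_next(departure, candidates, tau, eta, alpha=1.0, beta=2.0):
    # Weight each unassigned target as in (8)/(9), then draw one at random
    # with probability proportional to its weight.
    weights = [(tau(departure, j) ** alpha) * (eta(departure, j) ** beta)
               for j in candidates]
    r = random.uniform(0.0, sum(weights))
    acc = 0.0
    for j, w in zip(candidates, weights):
        acc += w
        if acc >= r:
            return j
    return candidates[-1]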

2) Pheromone Updating Rule

In this paper, the ACO algorithm follows the MAX-MIN Ant System scheme. $\tau _{0}$, $\tau _{min}$, and $\tau _{max}$ denote the initial value and the lower and upper limits of the pheromone trail values on the paths between every two targets or between a target and a robot. $\tau _{0}$ is set equal to $\tau _{max}$; $\tau _{max}$ and $\tau _{min}$ are defined as follows:\begin{align*} {\tau _{\max }}&=\frac {1}{\rho }\frac {1}{{f({s^{gb}})}}, \tag{10}\\ {\tau _{\min }}&={\tau _{\max }}/2M. \tag{11}\end{align*} $\rho $ is the evaporation rate of the pheromone trail and $f({s^{gb}})$ is the objective value of the globally best solution. At each iteration, pheromone trails evaporate automatically:\begin{equation*} {\tau ^{robot}} \leftarrow (1 - \rho){\tau ^{robot}},\quad {\tau ^{target}} \leftarrow (1 - \rho){\tau ^{target}}. \tag{12}\end{equation*} The globally best ant deposits pheromone on its routes according to its target assignment sequences as in (13) and (14):\begin{align*} {\tau ^{robot}}({T_{i}})&\leftarrow {\tau ^{robot}}({T_{i}}) + 1/f({s^{gb}}), \\ {T_{i}}&\in \{ L_{j}^{gb}[1]\mid j = 1,2,\ldots,N\}, \tag{13}\\ {\tau ^{target}}({T_{i}},{T_{j}})&\leftarrow {\tau ^{target}}({T_{i}},{T_{j}}) + 1/f({s^{gb}}), \\ ({T_{i}},{T_{j}})&\in \{ (L_{g}^{gb}[k],L_{g}^{gb}[k + 1])\mid k = 1,2,\ldots,len(L_{g}^{gb}) - 1,\ g = 1,2,\ldots,N\}. \tag{14}\end{align*}
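A compact sketch of the MAX-MIN style update in (10)-(14) follows; the edge-keyed dictionary representation and the function name are assumptions, and only the structure of the update is shown.

def update_pheromone(tau, best_edges, f_best, rho, M):
    # tau: dict mapping an edge (robot-target or target-target pair) to its trail value
    # best_edges: edges used by the globally best solution; f_best: its objective value
    tau_max = 1.0 / (rho * f_best)                 # Eq. (10)
    tau_min = tau_max / (2.0 * M)                  # Eq. (11)
    for e in tau:                                  # Eq. (12): evaporation on every edge
        tau[e] *= (1.0 - rho)
    for e in best_edges:                           # Eqs. (13)-(14): deposit by the best ant
        tau[e] = tau.get(e, 0.0) + 1.0 / f_best
    for e in tau:                                  # clamp to [tau_min, tau_max]
        tau[e] = min(max(tau[e], tau_min), tau_max)
    return tau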

B. The Multi-ACO Algorithm for Bi-Objective MTSP

In this paper, we address the bi-objective MTSP; in other words, both the total travel distance and the maximum tour length should be optimized at the same time. In ACO algorithms, an ant colony is improved toward a single optimization objective over iterations. To solve the bi-objective problem, the proposed algorithm divides the colony into $G$ groups. Each group uses its own objective function and pheromone matrix. For the $g$th group, the objective function is defined as:\begin{equation*} {f^{g}}(s) = \lambda _{1}^{g}{f_{1}}(s) + \lambda _{2}^{g}{f_{2}}(s), \tag{15}\end{equation*} where ${f_{1}}(s)$ and ${f_{2}}(s)$ are the first and second criterion of solution $s$, respectively. $\lambda _{1}^{g}$ and $\lambda _{2}^{g}$ are the weighting coefficients of the two objectives, defined as:\begin{equation*} {\lambda _{1}^{g} = \frac {g - 1}{G - 1},}\quad {\lambda _{2}^{g} = 1 - } \lambda _{1}^{g}. \tag{16}\end{equation*} A public Pareto set $PS$ containing non-dominated solutions is updated by all ants in all groups at each iteration. Solutions in the Pareto set are sorted in ascending order of the first objective function. The complete framework of our proposed ACO algorithm for the bi-objective MTSP is shown in Algorithm 1.
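The decomposition in (15)-(16) can be written in a few lines; this sketch only shows how the weight vectors are spread over the groups, with illustrative function names.

def group_weights(G):
    # Eq. (16): the g-th group uses (lambda1, lambda2) = ((g-1)/(G-1), 1-(g-1)/(G-1))
    return [((g - 1) / (G - 1), 1.0 - (g - 1) / (G - 1)) for g in range(1, G + 1)]

def scalarized(f1, f2, lam):
    # Eq. (15): weighted-sum objective used by one ant group
    return lam[0] * f1 + lam[1] * f2

# For example, G = 3 yields the weight vectors (0, 1), (0.5, 0.5) and (1, 0),
# i.e. one group per single objective and one balanced group.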

Algorithm 1 The Proposed Multi-ACO for Bi-Objective MTSP

1: Initialize parameters $\alpha $, $\beta $, $\rho $, $G$, $colony\_size$
2: Initialize pheromone trails for each group
3: while termination condition not satisfied do
4:   for $i:=1$ to $colony\_size$ do
5:     Construct a solution by ant $i$ using the pheromone trails of its group
6:     Update the Pareto set
7:   end for
8:   Evaporate pheromone using (12)
9:   for $g:=1$ to $G$ do
10:    $best:=$ find the best ant using (15)
11:    Deposit pheromone regarding ant $best$ on group $g$'s pheromone trails using (13) and (14)
12:    Limit pheromone values to $[\tau _{min}, \tau _{max}]$
13:  end for
14: end while

C. Local Search for Bi-Objective MTSP

Generally, in ACO-based memetic algorithms, after all ants finish constructing their solutions, the best ant is selected to carry out local search [34]. In our work, to solve the bi-objective optimization problem, the local search is carried out on a randomly chosen solution in the Pareto set.

1) Locally Improve a Non-Dominated Solution

Each local search operator corresponds to a neighborhood of a solution. To locally improve a solution by a local search operator is to find the best solution in the corresponding neighborhood. In single-objective problems, it is easy to find the best solution, namely the one with the minimum or maximum value. However, in multi-objective problems, the best among non-dominated solutions cannot be found in the same way. Fig.1 shows a Pareto set. For each solution in the Pareto set, we narrow its neighborhoods to a reasonable range. For example, for solution $s_{5}$ in the Pareto set, a neighbor that is not in the shaded part cannot be a locally optimal result of $s_{5}$ (Fig.1). Adding a solution from the shaded part to the Pareto set decreases either the spacing or the generational distance of the Pareto set.

FIGURE 1. An illustrative example of a Pareto set.

Let $n_{i,j}^{l}$ be a neighbor of $s_{i}$ in the $l$th neighborhood, and let $N_{i,j}^{l}({f_{1}}(n_{i,j}^{l}),{f_{2}}(n_{i,j}^{l}))$ and ${P_{i}}({f_{1}}({s_{i}}),{f_{2}}({s_{i}}))$ be their coordinates in the objective space. $\varphi (i,j,l)$ and $\phi (i,j,l)$ are two criteria for evaluating $n_{i,j}^{l}$'s improvement:\begin{align*} \varphi (i,j,l)&=\frac {\overrightarrow {P_{i}N_{i,j}^{l}} \cdot \overrightarrow {P_{i}O} }{\overline {P_{i}O} }, \tag{17}\\ \phi (i,j,l)&=\frac {\left |{\overrightarrow {P_{i}O} \times \overrightarrow {P_{i}N_{i,j}^{l}} }\right |}{\overline {P_{i}O} }. \tag{18}\end{align*} The bounds of the two criteria, ${\varphi _{left}}$, ${\varphi _{right}}$, ${\phi _{left}}$, and ${\phi _{right}}$, are defined according to ${s_{i - 1}}$ and ${s_{i + 1}}$:\begin{align*} {\varphi _{left}}&=\frac {\overrightarrow {P_{i}{P_{i - 1}}} \cdot \overrightarrow {P_{i}O} }{\overline {P_{i}O} },\quad {\varphi _{right}} = \frac {\overrightarrow {P_{i}{P_{i + 1}}} \cdot \overrightarrow {P_{i}O} }{\overline {P_{i}O} }, \tag{19}\\ {\phi _{left}}&=\frac {\left |{\overrightarrow {P_{i}O} \times \overrightarrow {P_{i}{P_{i - 1}}} }\right |}{\overline {P_{i}O} },\quad {\phi _{right}} = \frac {\left |{\overrightarrow {P_{i}O} \times \overrightarrow {P_{i}{P_{i + 1}}} }\right |}{\overline {P_{i}O} }. \tag{20}\end{align*} The $l$th narrowed neighborhood of $s_{i}$, $N{r_{l}}({s_{i}})$, consists of two subsets $NL_{i}^{l}$ and $NR_{i}^{l}$:\begin{align*} NL_{i}^{l}&=\left \{ n_{i,j}^{l}\ \middle |\ \varphi (i,j,l) \ge {\varphi _{left}},\ {\phi _{left}} \le \phi (i,j,l) \le 0,\ j = 1,2,\ldots,\left |{N_{l}({p_{i}})}\right | \right \}, \tag{21}\\ NR_{i}^{l}&=\left \{ n_{i,j}^{l}\ \middle |\ \varphi (i,j,l) \ge {\varphi _{right}},\ 0 < \phi (i,j,l) \le {\phi _{right}},\ j = 1,2,\ldots,\left |{N_{l}({p_{i}})}\right | \right \}, \tag{22}\\ N{r_{l}}({s_{i}})&=NL_{i}^{l} \cup NR_{i}^{l}. \tag{23}\end{align*} In this paper, the optimization result of a non-dominated solution is generated as follows:

First, a set of 'best' neighbors is found from $NL_{i}^{l}$ and $NR_{i}^{l}$, respectively:\begin{align*} BL_{i}^{l1}&=\{ n_{i,j}^{l}\mid \varphi (i,j,l) = \max \varphi (i,j,l),\ n_{i,j}^{l} \in NL_{i}^{l}\}, \tag{24}\\ BR_{i}^{l1}&=\{ n_{i,j}^{l}\mid \varphi (i,j,l) = \max \varphi (i,j,l),\ n_{i,j}^{l} \in NR_{i}^{l}\}. \tag{25}\end{align*} Then, another set of 'best' neighbors is found from $BL{_{i}^{l1}}$ and $BR{_{i}^{l1}}$, respectively:\begin{align*} BL{_{i}^{l2}}&=\{ n_{i,j}^{l}\mid \phi (i,j,l) = \min \phi (i,j,l),\ n_{i,j}^{l} \in BL{_{i}^{l1}}\}, \tag{26}\\ BR{_{i}^{l2}}&=\{ n_{i,j}^{l}\mid \phi (i,j,l) = \min \phi (i,j,l),\ n_{i,j}^{l} \in BR{_{i}^{l1}}\}. \tag{27}\end{align*} $BL{_{i}^{l2}}$ and $BR{_{i}^{l2}}$ each contain one or more solutions with the same objective values. We randomly choose one solution from $BL{_{i}^{l2}} \cup BR{_{i}^{l2}}$ as the optimization result.
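The two criteria in (17)-(18) can be read as the component of the move toward the origin and the offset perpendicular to the line $P_{i}O$. The sketch below computes them for points in the bi-objective space; treating the cross product as signed so that the left/right subsets in (21)-(22) can be distinguished is an interpretation of the paper's notation, and the function name is illustrative.

import math

def improvement_criteria(P, N, O=(0.0, 0.0)):
    # P: objective vector of the current solution s_i
    # N: objective vector of a neighbor; O: reference point (here the origin)
    vx, vy = O[0] - P[0], O[1] - P[1]     # vector P -> O
    ux, uy = N[0] - P[0], N[1] - P[1]     # vector P -> N
    norm = math.hypot(vx, vy)
    phi = (ux * vx + uy * vy) / norm      # Eq. (17): projection onto P -> O
    psi = (vx * uy - vy * ux) / norm      # Eq. (18): (signed) perpendicular offset
    return phi, psi

A neighbor is then kept only if its (phi, psi) pair falls inside the bounds (19)-(20), and among the retained neighbors those with the largest phi and then the smallest psi are selected, as in (24)-(27).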

2) Sequential Variable Neighborhood Descent

In this paper, the sequential variable neighborhood descent (seq_VND) procedure is used to further locally improve one of the solutions in the Pareto set after all ants complete their solution construction and the Pareto set is updated at each iteration. The neighborhoods in the seq_VND are visited one by one in a specific order until a local minimum is reached [26]. $l_{max}$ is the maximum number of different operators in the seq_VND. In this local search procedure, we use two types of heuristics, relocation [35] and 2-opt [36]. As shown in Fig.2, the relocation operators remove one target or several adjacent targets from one tour and reinsert the target(s) at a different position of the tour or into a different tour. It is worth mentioning that both directions of the adjacent targets should be considered when reinserting them (Fig.2(c), Fig.2(d)). The 2-opt operator removes two edges from one tour and reconnects the two resulting paths using another two edges, as shown in Fig.3.

FIGURE 2. The sketch of the relocation operator. (a) One target moves between different tours. (b) One target moves within a tour. (c) Two or more adjacent targets move between different tours. (d) Two or more adjacent targets move within a tour.

FIGURE 3. The sketch of the 2-opt operator.
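For completeness, a minimal sketch of the intra-tour 2-opt operator follows, assuming a tour is a list whose first and last entries are the depot and dist is a distance function; it is an illustration of the standard operator rather than the paper's exact implementation.

def two_opt(tour, dist):
    # Repeatedly remove two edges and reconnect the tour the other way
    # (reverse the segment in between) while this shortens the tour.
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 2):
            for j in range(i + 1, len(tour) - 1):
                before = dist(tour[i - 1], tour[i]) + dist(tour[j], tour[j + 1])
                after = dist(tour[i - 1], tour[j]) + dist(tour[i], tour[j + 1])
                if after + 1e-9 < before:
                    tour[i:j + 1] = tour[i:j + 1][::-1]
                    improved = True
    return tour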

As target assignment sequences are used to update the pheromone trails in the multiple-ACO, the solutions' target assignment sequences should be modified according to the changes in the tours after every operator is applied. Both the relocation operator and the 2-opt operator remove one or several adjacent targets and reinsert them into a tour as a whole. The modification of the target assignment sequences is to remove all the moved targets from their assignment sequence, rearrange them into a sequence, and add that sequence to the end of the assignment sequence corresponding to the tour into which they were inserted. The pseudo-code for rearranging adjacent targets is given in Algorithm 2.

Algorithm 2 The Pseudo-Code of Rearranging Adjacent Targets

Require: the sub-tour containing the adjacent targets, $P1$; a tour containing the two targets next to the adjacent targets, $P2$
Ensure: a sequence of the moved targets, $S$
1: $S=\emptyset $
2: while $size(P1) \ne 0$ do
3:   Calculate the cost of adding the first and the last target of $P1$ to the second position of $P2$, respectively
4:   if the cost of the first target is lower then
5:     Remove the first target from $P2$
6:     Remove the first target from $P1$, add it to the first position of $P2$ and add it to the end of $S$
7:   else
8:     Remove the second target from $P2$
9:     Remove the last target from $P1$, add it to the second position of $P2$ and add it to the end of $S$
10:  end if
11: end while

Several relocation operators, differentiated by the number of adjacent targets being relocated, can be adopted. The neighborhood sequence of the relocation operators is ordered by this number. ${N_{l}}(s)$ $(l = 1,2,\ldots,{l_{\max }})$ denotes the set of neighborhoods of solution $s$, where $N_{l}$ corresponds to relocation of $l$ targets. If no improvement is made by any of the relocation operators, the 2-opt operator is applied to each tour. The pseudo-code of our local search procedure is given in Algorithm 3.

Algorithm 3 The Pseudo-Code of the Local Search Procedure

Require: $s$, $s\_left$, $s\_right$, $l_{max}$
Ensure: a Pareto set $ND$
1: $ND:=\{s\_left, s, s\_right\}$
2: $l:=1$; $improved:=0$
3: while $l \le l_{max}$ and $s.label \ne 1$ do
4:   $s':=Local\_Improve(s, l, s\_left, s\_right)$
5:   if $s'$ dominates $s$ then
6:     $improved:=1$
7:     $s:=s'$
8:   else if $s'$ is non-dominated then
9:     add $s'$ to $ND$
10:    $s\_left:=find\_the\_left\_solution(s, ND)$
11:    $s\_right:=find\_the\_right\_solution(s, ND)$
12:  end if
13:  $l:=l+1$
14: end while
15: if $improved==0$ then
16:   $s.label:=1$
17:   for $i:=1$ to $N$ do
18:     $s.P_{i}:=2\_opt(s.P_{i})$
19:   end for
20:   if no improvement has been made by 2-opt then
21:     $s.label:=2$
22:   end if
23: end if

After all the iterations have finished, the Pareto set is further improved by examining all non-dominated neighbors of every solution in it. The pseudo-code of the proposed ACO based memetic algorithm is given in Algorithm 4.

Algorithm 4 The Pseudo-Code of the Proposed ACO Based Memetic Algorithm

Require: a set of parameters and an MTSP instance
Ensure: a Pareto set $PS$
1: Initialize pheromone trails
2: while termination condition 1 not satisfied do
3:   Construct solutions
4:   Update $PS$
5:   Randomly choose a solution $s$ with $s.label \ne 2$ from $PS$
6:   Apply local search to $s$
7:   Update pheromone
8: end while
9: while termination condition 2 not satisfied do
10:  Randomly choose a solution $s$ with $s.label \ne 2$ from $PS$
11:  Apply local search to $s$
12: end while
13: $PS':=PS$
14: for $i:=1$ to $size(PS)$ do
15:  find all non-dominated neighbors of $PS_{i}$
16:  update $PS'$
17: end for
18: $PS:=PS'$

D. The TOPSIS Method

The TOPSIS method chooses the solution with the shortest distance to the ideal solution and the farthest distance from the worst solution [37]. Commonly, in the last step of a multi-objective algorithm, a decision maker should choose the most appropriate solution from the resulting optimal Pareto front. In this paper, we assume that the two criteria are equally important, and the weight vector applied to the decision matrix of the TOPSIS method is set to $w=[1/2, 1/2]$.
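A minimal sketch of a standard TOPSIS selection for two minimization criteria with equal weights is given below; the vector normalization used here is a common choice and an assumption, since the paper does not detail its decision-matrix setup.

import numpy as np

def topsis(F, w=(0.5, 0.5)):
    # F: array of shape (num_solutions, 2) with the two objective values (both minimized)
    F = np.asarray(F, dtype=float)
    V = np.asarray(w) * F / np.linalg.norm(F, axis=0)   # normalize each criterion, then weight
    ideal = V.min(axis=0)                                # best value per criterion
    worst = V.max(axis=0)                                # worst value per criterion
    d_best = np.linalg.norm(V - ideal, axis=1)
    d_worst = np.linalg.norm(V - worst, axis=1)
    closeness = d_worst / (d_best + d_worst)
    return int(np.argmax(closeness))                     # index of the selected solution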

SECTION V.

Computational Results

The proposed ACO based memetic algorithm is designed for the bi-objective multiple traveling salesmen problem, including the single depot and the multiple depot variants. To evaluate the performance of the proposed algorithm on the SD-MTSP and the MD-MTSP, we conduct two experiments across typical sets of test problems and compare the proposed algorithm with four existing algorithms. The well-known NSGA-II [38] is implemented using the relocation operator for mutation and the partially mapped crossover (PMX) operator. The effective multi-objective evolutionary algorithm presented in [11] is referred to as MOEA/D-ACO. A fuzzy logic approach solving the multi-objective MTSP with short calculation time, presented in [31], is referred to as FL-MTSP. A centralized GA approach based on [39] by Kivelevitch [32] is referred to as MDMTSP-GA. The original optimization results of the proposed MA, NSGA-II, and MOEA/D-ACO are Pareto sets; the final solutions of these three algorithms are obtained by the TOPSIS method from the Pareto fronts they capture. Besides comparing the final solutions, we also compare the Pareto sets.

All algorithms used in this paper are coded or recoded in MATLAB. All simulations are run on a single laptop with an Intel Core i5 CPU @ 2.40 GHz and 8 GB of RAM. The parameters of the proposed MA are as follows: $colony\_size = 100$; $\alpha = 1$; $\beta = 2$; $G = 3$; $\rho = 0.9$; $l_{max} = 2$. Termination condition 1 is that none of the groups in the colony obtains a better solution in 30 successive iterations. Termination condition 2 is that the total number of iterations does not exceed 1000. The parameter setting of MOEA/D-ACO follows [11]: total number of ants = 24; number of groups = 3; neighborhood size = 10; $\rho = 0.95$; $\alpha = 1$; $\beta = 2$; $r = 0.9$; $e = 1/(2n)$; $\delta = 0.05\,\tau _{max}$. The algorithm stops after 200 iterations. To get relatively good results from the MDMTSP-GA, the population size was set to 240 and the number of iterations to 1000. The parameters of NSGA-II are as follows: population size = 100; crossover probability = 0.5; mutation probability = 0.1. The algorithm terminates after 200 iterations. Battery life is not considered in the experiments.

A. Experiment for Single Depot Cases

In this experiment, we consider the single depot case, that is, all the salesmen start and end at the same depot. We adopt two instances from the TSPLIB [40]. Table 1 shows the details of the instances. In total, there are 6 test problems. For each instance, the first city is set as the depot. As FL-MTSP and MDMTSP-GA are designed for multiple depot cases, we compare the proposed algorithm with MOEA/D-ACO and NSGA-II in this experiment. Each of the three algorithms was run 20 times on each test problem.

TABLE 1. The test problems.

Visual comparisons are depicted in Fig.4, which shows the Pareto fronts obtained over all 20 runs for the 6 test problems. As shown in Fig.4, the solutions obtained by the proposed algorithm lie much closer to each other and to the coordinate axes, which demonstrates the stability and performance of the algorithm. We can easily see that the Pareto fronts obtained by the proposed MA are more competitive than those obtained by MOEA/D-ACO and NSGA-II. In Fig.4, we also depict the average solution of MASVND on the test problems as recorded in [28] to evaluate the obtained Pareto fronts. MASVND is a state-of-the-art algorithm for the min-max multiple traveling salesman problem. The results show that our algorithm can achieve approximately optimal solutions for the min-max objective.

FIGURE 4. Pareto fronts obtained by the three algorithms over 20 runs. (a) Target number=51, robot number=3. (b) Target number=51, robot number=5. (c) Target number=51, robot number=10. (d) Target number=100, robot number=3. (e) Target number=100, robot number=5. (f) Target number=100, robot number=10.

We introduce two metrics to evaluate the Pareto solutions of the proposed MA, MOEA/D-ACO, and NSGA-II: the average ratio (AR) and the spacing metric (SP). Let $P1$, $P2$, and $P3$ be the Pareto solution sets obtained by merging the 20 Pareto fronts of the three algorithms, respectively, and let $P$ be obtained by merging $P1$, $P2$, and $P3$. The AR metric is calculated as follows:\begin{equation*} A{R_{i}} = \frac {\left |{Pi - \left \{ x \in Pi\mid \exists y \in P:y \succ x \right \}}\right |}{\left |{Pi}\right |}, \tag{28}\end{equation*} where $y \succ x$ means solution $y$ dominates $x$. Obviously, a solution set with a higher AR is better. Let $d$ denote the distance from a solution to the next solution in a Pareto front. The SP metric is calculated as follows:\begin{equation*} SP = \sqrt {\frac {1}{\left |{PS}\right | - 1}\sum \limits _{i = 1}^{\left |{PS}\right |} {(\overline d - {d_{i}})}^{2}}, \tag{29}\end{equation*} where $PS$ is a Pareto set. The SP of a test problem and an algorithm is the average of the 20 SP values. Obviously, a lower SP is better. Table 2 shows the comparison between the three algorithms under the two metrics. All the AR values of the proposed MA in Table 2 are 1 while the AR values of NSGA-II and MOEA/D-ACO are 0; that is, every non-dominated solution obtained by NSGA-II or MOEA/D-ACO is dominated by one or more non-dominated solutions obtained by the proposed MA. The smaller SP values of the proposed MA in Table 2 again demonstrate its high performance. A smaller SP value means the non-dominated solutions are distributed more densely and more evenly.
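For reference, the two metrics can be computed as in the sketch below; reading $d_{i}$ as the distance to the next solution on the front sorted by the first objective follows the description above and is an assumption where the paper is ambiguous, and the function names are illustrative.

import math

def dominates(a, b):
    # a dominates b for two minimization objectives
    return a[0] <= b[0] and a[1] <= b[1] and a != b

def average_ratio(Pi, P):
    # Eq. (28): fraction of solutions in Pi that no solution in P dominates
    kept = [x for x in Pi if not any(dominates(y, x) for y in P)]
    return len(kept) / len(Pi)

def spacing(front):
    # Eq. (29): spread of the consecutive gaps along the sorted front
    front = sorted(front)
    d = [math.dist(front[k], front[k + 1]) for k in range(len(front) - 1)]
    mean = sum(d) / len(d)
    return math.sqrt(sum((mean - di) ** 2 for di in d) / (len(front) - 1))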

TABLE 2. Comparison of the proposed MA, the MOEA/D-ACO, and the NSGA-II.

B. Experiment for Multiple Depot Cases

In this experiment, we follow the same rules as [17] and [31] to generate the test problems. We simulate different scenarios with 10 to 20 robots and 30 to 300 targets, each scenario corresponding to a fixed number of targets and robots. Each scenario is repeated 30 times, with target and robot locations randomly generated in a 1000×1000 area in each run. Table 3 shows the comparison between the proposed MA, MOEA/D-ACO, and NSGA-II under the AR and SP metrics; the AR and SP values are averages over the 30 runs. Most of the AR values of the proposed algorithm in Table 3 are close to 1, which shows the high performance of the proposed algorithm in terms of solution quality, especially on large instances. The smaller SP values of the proposed algorithm compared with the other two algorithms in Table 3 mean that the solutions it captures are more evenly distributed. Moreover, the gap between the SP values of the proposed algorithm and those of the other two algorithms is larger on large instances. The average values of the total travel distance and maximum tour length criteria of the final solutions are calculated with 95% confidence intervals. The final solutions of the proposed MA, MOEA/D-ACO, and NSGA-II are obtained by the TOPSIS method. Fig.5 and Fig.6 show the comparison between the five algorithms in terms of the average values of the two criteria, respectively. The average total travel distances of FL-MTSP, MOEA/D-ACO, and our proposed algorithm in Fig.5 are close to each other. In Fig.6, the average maximum tour lengths of FL-MTSP are longer than those of MOEA/D-ACO and our proposed algorithm.

TABLE 3. Comparison of the proposed MA, the MOEA/D-ACO, and the NSGA-II.

FIGURE 5. The comparison between the five algorithms (maximum tour length). (a) Robot number=10. (b) Robot number=20. (c) Robot number=30.

FIGURE 6. The comparison between the five algorithms (total travel cost). (a) Robot number=10. (b) Robot number=20. (c) Robot number=30.

By means of one-way ANOVA (95% confidence), Table 4 summarizes the number of test problems in which one algorithm obtains a statistically longer (<), equivalent (=), or shorter (>) average total travel distance than the others, and Table 5 summarizes the number of test problems in which one algorithm obtains a statistically longer (<), equivalent (=), or shorter (>) average maximum tour length than the others. It can be seen that the proposed MA obtains a shorter average total travel distance and a shorter average maximum tour length than the other algorithms on most of the test problems.

TABLE 4. The summary of the superiority of one algorithm over the others (total travel distance).

TABLE 5. The summary of the superiority of one algorithm over the others (maximum tour length).

Fig.7 shows the execution time of the five algorithms. It is clear that FL-MTSP is the fastest algorithm, MOEA/D-ACO the slowest, and the proposed algorithm lies in between. Among them, FL-MTSP and MDMTSP-GA provide a single optimal solution in every run, while the proposed algorithm, MOEA/D-ACO, and NSGA-II provide an optimal Pareto set in every run.

FIGURE 7. Time comparison between the five algorithms. (a) Robot number=10. (b) Robot number=20. (c) Robot number=30.

In summary, we can conclude that the proposed ACO based memetic algorithm has a better, or at least competitive, capacity for solving bi-objective multiple traveling salesmen problems compared with the other algorithms. In terms of calculation time, the proposed algorithm can provide optimal Pareto fronts in a reasonable time.

SECTION VI.

Conclusion

In this paper, we have proposed an ACO based memetic algorithm that integrates the sequential variable neighborhood descent. A local search procedure for bi-objective problems is proposed and implemented to solve the bi-objective multiple traveling salesmen problem. We have evaluated and compared our algorithm with four state-of-the-art or classical algorithms on small-scale and large-scale test problems. Experimental results demonstrate the superiority of our algorithm over the other compared algorithms in terms of solution quality, for both single depot and multiple depot problems. In terms of execution time, the proposed algorithm is not competitive with FL-MTSP, which decomposes the multi-objective problem into a single-objective one and provides a single solution; our algorithm instead aims to provide the optimal Pareto set to meet the needs of different situations. Compared to MOEA/D-ACO, the proposed algorithm significantly reduces the execution time for generating approximately optimal Pareto sets.

The bi-objective multiple traveling salesmen problem studied in this paper is static. However, in the real world, a robot system needs to deal with uncertain events. In future research, the problem under dynamic environments can be considered.
