Introduction
With the rapid development of economy, the transportation industry has made remarkable progress and the number of vehicles has also increased dramatically. Due to the restriction of land resources, traffic problems in large and medium cities are continuously exacerbated, which is manifested in serious traffic congestion and frequent traffic accident [1] and causes the waste of resources, environment pollution and huge economic loss to society [2]. Because the urban road resources are very limited, improving road utilization is an effective way to cope with the growing requirements of transportation, in addition to building new transport infrastructures. However, in real life, road utilization is often unsatisfactory due to the lack of information that can help drivers make correct and accurate choice. Therefore, it is meaningful to predict real-time traffic situation, and the predicted data can provide the drivers with effective and timely congestion information so that to obtain the optimal path with shortest travel time [3].
The existing path planning algorithms include classical and dynamic algorithms. The classical shortest path planning algorithms, such as Dijkstra [4] and the A* algorithm [5], take the current state of road segments into consideration, e.g., the travel time is usually taken as the weight of the road segment. The dynamic path planning algorithms consider more influence factors, which usually use big data prediction technology to analyze the characteristics of traffic and predict future situation of roads to get the optimal path [6]. Dynamic path planning algorithms often consider special situations, e.g., how to plan a path with the shortest travel time under the premise of meeting a prescribed [7], how to rearrange a path quickly in the case of deviating from the original path [8], and how to protect the source location privacy in the routing for industrial internet of things (IIoTs) [9]. However, all the existing traffic path planning algorithms have not taken the problem of concurrent requests into consideration, which brings two problems:
The load balance of the whole traffic network will be affected, for when a certain road segment is recommended too many times at the same time, its optimality will be lost.
The response time will be affected, for a large number of concurrent requests cause a big challenge to the runtime of the traffic path planning system.
Based on the above, a traffic path planning algorithm based on data prediction (TPPDP) is proposed in this paper, in order to achieve the optimal path with the shortest travel time. Also, from the viewpoint of load balancing, an improved path planning algorithm and a specific distributed big data processing framework is designed to improve the processing speed of the system, because the common distributed computing framework cannot meet the parallel processing requirements of the algorithm. The main works of this paper are listed as follows:
A model for traffic information prediction is designed to predict the traffic state of the road segments in the future based on historical traffic data and current traffic information, and a traffic path planning algorithm based on data prediction (TPPDP) is proposed to arrange a path with the shortest travel time.
A traffic path planning algorithm based on data prediction with load balancing (TPPDP-LB) is proposed, which combines the predicted information and the recommended number of the road segments to plan a path with the shortest travel time while keeping global load balancing.
A distributed computing framework is designed for running TPPDP-LB algorithm to respond to a large number of concurrent path planning requests quickly. This distributed processing framework is carried out through simulation.
The results of simulation show that TPPDP algorithm has the advantage of shortest travel time while TPPDP-LB algorithm also achieves load balancing, which can effectively avoid congested sections in the future. It is also proved that the distributed computing framework designed for TPPDP-LB can improve the system operation in terms of big data processing and response time.
The rest of the paper is organized as follows: the related works are introduced in Section II. Section III introduces two proposed path planning algorithms in details: TPPDP and TPPDP-LB algorithm. The distributed computing framework designed for TPPDP-LB algorithm is introduced in Section IV. The simulation is implemented in Section V and the results are analyzed and discussed in terms of the performance of the proposed algorithms and the distributed computing framework. Finally, conclusions are drawn in Section VI.
Related Works
A. Traditional Path Planning Algorithms
Traditional path planning problems are often simplified to the shortest path finding problem, which aims at find out the shortest path (SP) between two nodes in a graph. The classical shortest path planning algorithms include Dijkstra [4], A-Star (A*) [5] and their family algorithms.
Dijkstra algorithm is used to find out the shortest path between two nodes in a graph. The algorithm flow is described as follows. Firstly, a directed graph is constructed. Starting from a current node, a shortest path is found out from this node to one of its neighboring nodes, taking the length of the path as the weight of each neighboring node. Then the node with minimum weight value is selected as the next hop node and the shortest path from it to its neighbor nodes is chosen. Repeat this step, until the shortest path to the destination is determined.
Wang [10] proposed an improved Dijkstra algorithm, redefining the weight value between two nodes and introducing the concept of equivalent path instead of the actual path which can achieve adaptive improvement of Dijkstra algorithm. However, this paper is applied to find the best evacuation route when a mine is flooded and the weight value between nodes is calculated based on the length, type and condition of the roadways, which cannot be applied to modern transportation system.
Wei et al. [11] proposed an improved Dijkstra algorithm, which introduces the concept of congestion weight function which only considers the road conditions at the current time and not realizes real-time performance.
A* algorithm is a popular heuristic search algorithm. It estimates the superiority of the nodes by using function. The innovation of the algorithm is that a known global variable is introduced when selecting the next node, which estimates the distance between the current node and the destination. The global variable is used as a measure to evaluate the possibility of the node on the optimal path, which leads to search for more likely nodes, so that to increase the efficiency of the search process.
Li et al. [12] proposed an improved A* search algorithm. Through inverse search and optimized evaluation function, the undirected search is transformed into directed search and the global evaluation is transformed into local evaluation, so that to improve the efficiency of the algorithm and make it more suitable for processing large scale path planning problems. Shi et al. [13] took the distance between the head and the end nodes as the cost function of the element instead of the direction, and the algorithm improves the validity of A*.
Classical path planning algorithms are usually used in graph theory to find the shortest path in the graph, which cannot be directly applied to the actual complex traffic environments.
B. Dynamic Path Planning Algorithms
With the increasing of real-time traffic information, dynamic spatial networks are getting popular and path planning in dynamic spatial networks is becoming more and more important. Dynamic route planning problem (DRPP) aims at solving the optimization problem of path planning for vehicles traveling from a given source point to given destination. Although path planning algorithms have been widely studied, it is not enough for dynamic route planning in real life, for most of the route planning algorithms are designed by finding the SP. However, SP algorithm is ineffective for dynamic route planning, i.e., due to the difference and complexity of road conditions, the travel time calculated by SP algorithm might not equal to the overall path with shortest time (ST).
In recent years, dynamic path planning algorithms are often applied in special application scenario. He et al. [14] and Zhang et al. [15] took experiential routes collected from taxis into consideration, and proposed a route planning algorithm that best fits the driver’s experience based on the fact that the driver’s wisdom is always hidden in a large amount of taxis GPS data. Similarly, Luo et al. [16] and Wei et al. [17] also proposed a popular path planning algorithm based on taxi trajectory aimed at providing the most frequent route. Han et al. [18] proposed a novel charging path planning schema for wireless rechargeable sensor networks, which takes the energy of nodes into consideration to reduce the number of dead nodes and prolong the network lifetime.
Considering that different drivers have different driving preferences, i.e., some prefer to time-efficient driving and the others prefer to fuel-efficient driving, Dai et al. [19] proposed a personalized route recommendation algorithm to meet the different preferences of drivers.
Kanoh [20] proposed a dynamic path planning algorithm which is based on the genetic algorithm, and uses the latest recorded travel time as a prediction of travel time at a certain time in the future.
Li et al. [21] proposed an optimal route queries algorithm that can deal with the arbitrary order constraints. It uses two different methodologies, namely backward search and forward search, to achieve optimal path query efficiently with the order constraints.
Sun et al. [22] proposed a robust optimal path selection method, which mainly considers the reliability of the road section. It aims to find a relatively stable path to ensure that the arrival time is within a certain range.
Sommer et al. [23] evaluated a traffic information system based on vehicle-to-infrastructure communications that provide information about traffic jams for vehicles and help to optimize route planning.
Li et al. [7] defined a path discovery problem named temporal-sensitive optimal route discovery (TORD), and they aimed to find an optimal route that covers a set of user-specified locations and satisfies the specified sorting constraints and time constraints.
Liu et al. [8] proposed spatio-temporal outliers (STO) trees, an algorithm for discovering spatiotemporal anomalies and their causal relationship. Similarly, Shang et al. [24] proposed a dynamic shortest path monitoring algorithm in spatial network. When the cost of driving changes or the route deviates from the planned route, the algorithm can update the path efficiently using the pre-planned route.
Most the above algorithms aim at solving specific problems in certain scenarios, not the path planning problem with shortest time which is the most fundamental demand in real traffic environment. Furthermore, these algorithms do not take the impact of the recommended number of road segment for consideration. When a certain road segment is recommended too many times at the same time, its optimality will be lost, for in the following time this section may be blocked because of too many recommendations. Therefore, the research of dynamic path planning algorithm in the actual traffic environment is still significant.
C. Prediction Methods of Traffic Information
For dynamic path planning algorithm, it is meaningful to predict real-time traffic situation, and the forecast data can provide the driver with effective and timely congestion information to obtain the optimal path which usually means the path with the shortest travel time. Short-term traffic prediction refers to estimating basic parameters of traffic within 15 minutes in the future. It is the key to realizing intelligent transport system (ITS) control and induction. Vlahogianni et al. [25] have proposed many solutions to deal with parameter prediction problems of short-term traffic flow. However, the performance of prediction is disturbed by various random conditions, such as weather, human activities, etc. Besides, short-term traffic flow has its own random and fuzzy pattern. Based on the above reasons, the problem of short-term traffic forecasting cannot be completely solved [26].
There are two kinds of short-term traffic state prediction schemes. The first type is parameter prediction method, which is based on a certain mathematical model. It includes linear regression, time series and kalman filter model prediction methods, etc. Linear regression method analyzes historical data to find out the functional relationship between independent variables and dependent variables.
Fontanelli et al. [27] proposed a dynamic route planning algorithm based on three future travel estimation. They used the combination of the latest travel time and the number of vehicles currently in a road segment as a prediction of future travel time.
Kong et al. [28] proposed a TLR model which uses Gaussianprocess regression and statistical approaches to predict traffic information including average travel time and so on.
Guo et al. [29] proposed a prediction algorithm including an adaptive Kalman recursion for stochastic short-term traffic flow rate prediction.
In view of various traffic conditions, Fei et al. [30] proposed an adaptive dynamic linear prediction model, which can identify traffic conditions and use real-time data to adjust parameters to achieve accurate prediction. However, this kind of method cannot reflect the non-linearity and uncertainty in traffic prediction and cannot avoid the influence of random factors [31].
The second type is the nonparametric prediction method, which is a knowledge-based intelligent prediction method. Its structure is not predefined compared with the parameter prediction method. In addition, this method relies on the relationship between related or non-related business variables and is regarded as a dynamic classification model with strong non-linear processing ability. Among them, artificial neural network has attracted much attention due to its superior prediction performance.
Kong et al. [32] proposed a congestion prediction method based on support vector machine to predict traffic flow parameters including traffic volume and average traffic speed. The particle swarm optimization sub-module is used to optimize the punish coefficients and the parameters.
Kang et al. [33] considers that the traffic flow, speed and occupancy of existing and neighboring stations can be used as input to improve traffic flow prediction performance. They used the long short-term memory (LSTM) model to analyze the impact of different input settings on traffic information prediction performance and proposed a traffic information prediction algorithm with LSTM recurrent neural network.
Extreme learning machine (ELM) is an efficient learning paradigm for a wide range of fields. Xing et al. [34] used the kernel function method instead of the hidden layer method to overcome the variation problem and proposed a traffic flow prediction method to improve the accuracy of traffic flow prediction.
Due to the complexity of the transportation system, it may be interfered by random factors, such as random weather, traffic events, etc. These uncertain factors increase the uncertainty of traffic, and it is difficult to find a suitable mathematical model to describe the traffic characteristics. The k-nearest neighbor (KNN) model is one of the most commonly used methods in nonparametric regression. Cai [35] and Su [36] both used KNN-based methods to predict traffic flow. The results showed that compared with the other prediction model, the model could improve the prediction accuracy and avoid the influence of subjective classification in traffic state forecasting.
Due to the excellent performance in traffic prediction, the KNN-based model has been chosen as the prediction model in this paper. Also, the input parameters of KNN is selected in this paper to improve the accuracy of prediction.
TPPDP: Traffic Path Planning Algorithm Based on Data Prediction
A. Network Model
In this paper, the actual traffic road is abstracted as a graphic structure and referred to as a road network. The definition used in this paper is as follows:
Definition 1:
Road network: As shown in Figure 1, the urban road network is defined as a directed graph
Definition 2:
Road segment: a continuous road between two intersections in the road network. For example,
Definition 3:
Path:
Definition 4:
B. Overall Design
The overall process of TPPDP algorithm is shown in Figure 2. The users initiate path planning requests to the system, and the input value is the departure place and the destination. Once receiving parallel data requests, the system sequentially executes the data prediction algorithm and path planning algorithm. The data prediction algorithm runs based on the current traffic data an historical data, and finally computes the predicted average travel speed of each road segment. Then the travel time of the road segment is regarded as the selection criterion of the road segment in the path planning algorithm. The travel time of each segment is calculated based on the segment speed, which is determined by two factors: one is the predicted average travel speed, and the other is the recommended times of each road segment which is based on the path planning results at the same time. The optimal path planning result is the path with shortest travel time.
TPPDP algorithm studies the dynamic and global path planning algorithm, combines the path planning algorithm with traffic prediction technology. In addition, considering the overall coordination of the road network, load-balanced paths are recommended dynamically from a global perspective. On the one hand, it avoids recommending the same section to too many users, causing possible traffic congestion in the future; On the other hand, it can respond to unexpected situations such as traffic control and traffic accidents and update the planned path in time.
C. Traffic Speed Prediction Method Based on Historical Data
Since the driving time reflects the effects of traffic situation, the path planning algorithm usually uses travel time of the path as the criterion for determining the optimal path, and the driving time is determined by the average travel speed. Therefore, in this paper a prediction algorithm is proposed to predict the speed on the road. The traffic prediction described in this paper refers to the prediction of the average travel speed of a road segment in a specific time slot.
This algorithm is based on the historical data of traffic, which mainly records the average travel speed of road segment. In order to summarize the rules for prediction, a traffic speed forecasting model based on KNN is proposed in this paper, which takes the factors that affect the travel speed as the characteristic attributes of input data, such as the time slot, current speed and so on, and takes the prediction speed of a specific time slot in the future as the output data.
In this paper, a data prediction model was used. As shown in Figure 3, the current road segment data is filtered out from the historical database and the step is based on the ID of road segment. Then the
1) Parameters in KNN
There are two parameters in KNN algorithm: the dimension n of the input data (eigenvector) and the number
2) Input Data (X)
The attributes of the input data are the factors that affect driving on the road segment. Usually, the factors affecting the driving on the road segment include driving time slot, holidays, rainfall and so on. Therefore, the input instance of the KNN algorithm in this paper is determined as a 5-dimensional eigenvector: {Time slot, Day, Holiday, Rainfall, Current speed}, as shown in Table 1.
According to [38], 10 minutes is regarded as a unit of traffic status, so one day is divided into 144 time slots as a time interval is 10 minutes. The reason for choosing this attribute is that the travelling of urban residents is regular and the traffic pattern of urban roads is also periodic, similar and dynamic.
Day represents the day of the week, and the value is from 1 to 7. Since the urban residents’ travel within a week has certain regularity, the road traffic mode also has periodicity, similarity and dynamics with the unit of day.
Holiday is a boolean value to mark it’s a holiday or not. People have different travel habits on holidays or non-holidays. Therefore, there is also a significant difference between holiday and non-holiday traffic flows.
The values of Rainfall consist of 6 levels:
Light rain: 10 millimeter (24h).
Moderate rain: 10-24.9 millimeter (24h).
Heavy rain: 25-49.9 millimeter (24h).
Rainstorm: >50 millimeter (24h).
Heavy rainstorm: >100 millimeter (24h).
Torrential rain: >250 millimeter (24h).
The current speed reflects the average travel speed in this road segment in this time slot, and it generally does not exceed the maximum speed limit of the road. Since the current speed has an important influence on the speed of the next time slot, this attribute can combine current traffic data with historical data to make the prediction more realistic.
The prediction of traffic requires robust to short-term and long-term changes in traffic conditions. If these changes are unexpected, such as traffic accidents and bad weather, it is very difficult to construct a responsive prediction. Above all, unexpected changes are not considered in this paper.
3) Output Data (Y)
The output data is the predicted speed \begin{equation*} d_{i}=\sqrt {\sum _{j=1}^{5}(x_{j}-x_{i,j})^{2}}\tag{1}\end{equation*}
The weight of each neighbor is the reciprocal of the distance:
Sum these weights to get \begin{equation*} D=\sum _{i=1}^{k}\left({\frac {1}{d_{i}}}\right)\tag{2}\end{equation*}
The output \begin{equation*} D=\sum _{i=1}^{k}\left({\frac {1/d_{i}}{D}\times {y_{i}}}\right)\tag{3}\end{equation*}
The output data is the predicted speed
D. Process of TPPDP Algorithm
The TPPDP algorithm is an improved Dijkstra algorithm. The road network is abstracted as a graph shown in Figure 4. The point indicates the starting point of the road segment and the edge refers to road segment. The algorithm is designed as follows.
1) Edge Weight Setting
The travel time intuitively reflects the real time traffic condition and also meets the query requirements of most people, so the weight of the road segment in Figure 4 is set as the travel time. The weight of the road segment is expressed as
The cost of the node is the weight of road segment, that is, the travel time of its upstream road segment. The optimal path planned by this way is with the least travel time.
In real life the traffic situation of the road segment changes over time, and correspondingly the weight of the road segment is different at each time slot of one day. According to Section III-C, one day can be divided into 144 time slots, with every 10 minutes as one unit. Therefore, each road segment has 144 weights:
The average travel time of the road segment is determined by the average travel speed
2) Path Planning Algorithm
The process of path planning executes as follows: Firstly, a graph for road network is constructed as Figure 4 shown and the weight of edge is set as described in Section III-D.1. The path planning request from user is received, including the start point and the destination. Starting from the requested source node, the optimal path with minimum weights from the current node to its neighboring node is obtained. Then take the node with the least cost as the current node, recursively, until the optimal path to the destination point is determined. Since the average travel time is the weight of the road segment, the optimal path obtained by this way is the path with the minimum total travel time.
The above algorithm is a global query algorithm, and in every recursive step it is possible to scan all the nodes to find the shortest path to the destination. In the case of big data for urban traffic, the global query will seriously restrict the calculation speed of the path planning algorithm proposed above. Therefore, it is necessary to design the method of accelerating calculation.
According to [19], the method of building a local road network can be used to accelerate the calculation speed of path planning algorithm. The local road network is a sub graph of the global road network which contains all the nodes in the city map. In this paper, the local road network can be constructed by searching for the set of nodes \begin{equation*} Node_{SD}=\{node_{i}|node_{i} ~in ~the ~paths ~from ~S ~to ~D\}\tag{4}\end{equation*}
The process is described by the following example. Suppose that Figure 5(a) represents the global road network, now the path from node C to B needs to be planned. Instead of scanning all nodes in the graph, the algorithm firstly searches for the nodes include in the paths starting from C to B in the historical database. The nodes formed by search result construct the local road network. Then the path planning algorithm executes in the local graph. Assuming that the paths from C to B include the following:\begin{align*} P_{CB1}=&\{\overrightarrow {CE},\overrightarrow {EB}\} \\ P_{CB2}=&\{\overrightarrow {CF},\overrightarrow {FA},\overrightarrow {AB}\} \\ P_{CB3}=&\{\overrightarrow {CG},\overrightarrow {GD},\overrightarrow {DB}\}\tag{5}\end{align*}
The nodes contained in these paths can be added to the set of \begin{equation*} Node_{CB}=\{A,E,D,F,G\}\tag{6}\end{equation*}
Therefore, the local road network from C to B is constructed by
The above process aims to simplify the original road network. By using local road network instead of global one for searching, this algorithm greatly reduces the computation load.
E. TPPDP-LB: Traffic Path Planning Algorithm Based on Data Prediction With Load Balancing
The average road speed required by the path planning algorithm is studied in this section. Ge [39] considers that path recommendation algorithm is usually targeted at a single vehicle, and it leads to overlapping segments recommended by the path planning system to different users. E.g., the same optimal path is recommended to a large number of users without considering the overall coordination, when they request path planning from same source and destination simultaneously. It causes overlapping recommended road segments could generate new traffic congestion and the capacity of these sections will decrease as the vehicles grow in the next time slot. The path planning system is eventually caught in the problem of local optimization.
For this reason, the recommendation factor is introduced as a correction of the average speed predicted in Section III-C. It represents the impact of the recommended times on the average speed of the road segment. Combined with the recommendation factors, the average speed predicted in Section III-C can be effectively adjusted, avoiding certain road segments being repeatedly recommended. The combination of average prediction speed and recommendation factors reflects the load balancing of path planning algorithm.
According to Lu et al. [40], the relationship between driving speed and the traffic density is as follows:\begin{equation*} V(\rho)=V_{\rho =10}-13.375\times ln(\rho)+30.797\tag{7}\end{equation*}
The average speed of a certain segment in time slot \begin{equation*} V(\rho)_{m}=V_{\rho =10}-13.375\times ln(\rho)+30.797\tag{8}\end{equation*}
The system continuously recommends the path to the users and these paths could contain the same road segments. The number of recommended times \begin{equation*} \rho _{inc}=r/L\tag{9}\end{equation*}
Considering this impact, the average speed of the road segment is corrected to \begin{equation*} V(\rho \!+\!\rho _{inc})_{m}\!=\!V_{\rho =10}\!-\!13.375\!\times \! ln(\rho \!+\!\rho _{inc})\!+\!30.797\tag{10}\end{equation*}
It can be obtained:\begin{equation*} V(\rho +\rho _{inc})_{m}-V(\rho)_{m}=13.375\times ln\left({\frac {\rho }{\rho +\rho _{inc}}}\right)\tag{11}\end{equation*}
That is:\begin{align*} V(\rho +\rho _{inc})_{m}=&V(\rho)_{m}+13.375\times ln\left({\frac {\rho }{\rho +\rho _{inc}}}\right) \qquad \tag{12}\\ V=&V(\rho +\rho _{inc})_{m}\tag{13}\end{align*}
As mentioned above, the average travel speed is corrected to \begin{equation*} weight=L/V\tag{14}\end{equation*}
F. Dynamic Adaption
TPPDP and TPPDP-LB algorithms only consider the static path planning problem, and cannot be adapted to the situation where the traffic situation changes after the navigation starts. Therefore, the dynamic adaption to sudden traffic changes, such as traffic accident and control, is required. In order to obtain real-time traffic information and change the path planning dynamically, an index of drivers and road segments is used. The format is defined as follows:\begin{equation*} Tral=\{Seg_{i}|i\epsilon (1,2, {\dots },n)\}\tag{15}\end{equation*}
The above formula indicates that the planned path for the driver is composed of the road segments from 1 to
Distributed Computing Framework for TPPDP-LB
A. The Requirement of Distributed Computing for TPPDP-LB
In modern transportation systems, TPPDP and TPPDP-LB algorithms usually face many concurrent path planning requests from different uses, so this is a problem of big data processing and it is particularly important to improve the execution speed of the algorithms by using a distributed computing framework. The condition of a distributed parallel computing framework is that the processing between each part of the data set is independent and isolated, such as Map-Reduce [41]. However, in TPPDP-LB algorithms, the number of recommended times is a key factor to determine the predicted average speed of road segment in next time slot, that is, the result of each path planning request will have an impact on the next path planning request. Therefore, TPPDP-LB algorithm is actually a serial calculation as shown in Figure 6(a): One job corresponds to one path planning request and the jobs are divided into
The improved distributed computing framework proposed in this section mainly solves the problem of computation decomposition. The concept of “correlation” is defined firstly. If the path based on a path planning request does not overlap or less overlaps with the path based on another path planning request, the two requests are considered to have low correlation. Path planning requests with less correlation have little effect on each other, so they can be distributed to different compute nodes for parallel computing. As shown in Figure 6(b), the path planning requests with high correlation are distributed to the same compute node for serial computing, while the path planning requests with low correlation are distributed to different compute nodes for parallel computing. This framework can both take advantage of the distributed computing framework and handle data-dependent calculations.
B. Design of Distributed Computing Framework for TPPDP-LB
1) Clustering Method
In order to solve the problem of computation decomposition, K-means++ algorithm [42] is chosen, for it augments the K-means algorithm with a randomized seeding technique and improves the speed and accuracy of clustering efficiently. The path planning requests are classified by using K-means++ algorithm and the purpose of grouping the requests with larger correlation is realized. K-means++ algorithm divides
The clustering method in this paper is based on the path planning request which includes starting node and destination node. It is regarded that the path planning requests with close starting node and destination node have larger correlation. Two mappings in this clustering method are defined as follows.
: the path planning requests are mapped to vectors in a two-dimensional space, and the coordinates are composed of latitude and longitude of the starting node and the destination node. One vector in the coordinates corresponds to one path planning request:The \; first \; mapping \begin{equation*} Req_{i}=\overrightarrow {(S_{LOi},S_{LAi}),(D_{LOi},D_{LAi})}\tag{16}\end{equation*} View Source\begin{equation*} Req_{i}=\overrightarrow {(S_{LOi},S_{LAi}),(D_{LOi},D_{LAi})}\tag{16}\end{equation*}
: the vectorThe \;~second \; mapping is mapped to a point with four-dimensional coordinates, which are respectively the longitude of the starting pointReq_{i} , the latitude of the starting pointS_{LOi} , the longitude of the destination pointS_{LAi} , the latitude of the starting pointD_{LOi} .D_{LAi}
For example, the vectors in Figure 7 are mapped to the points in a four-dimensional space. Corresponding vector and coordinate point represent the same path planning request.
Then the points in the coordinate system are clustered by using K-means++ algorithm and the points close to each other are divided into one cluster according to Euclidean distance. That is, the path planning requests with close starting point and the destination point are divided into a cluster.
Since the path planning requests in the same cluster have close starting points and the destination, the correlation between them is large. Therefore, the results of these requests have more overlapping road segments, so each result of a request will affect results of other requests in the cluster. According to TPPDP-LB, serial computing is required in one cluster.
While the path planning requests are in different clusters, the correlation between them is small. Then the results of these requests will not affect each other, that is, they have less overlapping road segments. Therefore, parallel computing can be performed between different clusters to increase executing speed.
2) Optimization of Clustering Method
The value
Simulation
A. Simulation Settings
The digital map data used in the simulation is downloaded from the website OpenStreetMap [43], where the map data is uploaded by the public and free for public non-commercial use. As shown in Figure 8(a), the simulation map selects the area of Yanta West Road in Xi’an, China, and the map size is about
The path planning algorithms are simulated in the above map, and the distributed computing framework is implemented in multiple virtual machines with the same configuration on two physical computers. All simulation works are implemented with Matlab R2014a.
B. Simulation of Path Planning Algorithms
In this part, the simulation compared the following four path planning algorithms: Dijkstra, improved Dijkstra (i-Dijkstra) [10], TPPDP and TPPDP-LB in terms of travel time, dynamic adaptability and load balancing, etc. The simulation experiments in this part run on a single computing node, without considering distributed parallel computing.
1) Travel Time
The simulations in this part compare the travel time planned by four algorithms, in the cases of different time periods and different number of path planning requests. It is assumed that all requests start from source point
The second simulation compares the travel time with different number of concurrent requests in same time slot to evaluate the abilities of four algorithms to handle concurrent requests. The departure time is set at 7am, when the traffic is heavy. From Figure 11, it can be concluded that the number of concurrent requests has a certain impact on the travel time of the path planned by different algorithm. The four algorithms all cost more travel time as the number of concurrent requests increases, because the road will become congested as the number of concurrent requests increases. It is also proved that TPPDP and TPPDP-LB algorithms achieve shorter travel time in any case while TPPDP-LB algorithm has best performance when number of requests reaches the maximum. TPPDP-LP algorithm is least affected by the number of concurrent requests because the algorithm takes the recommended times of road segments into account as that to implement load balancing of path planning.
As is shown in Figure 12, the path length generated by four algorithms is compared. The path planned by the Dijkstra algorithm is always the shortest, and the paths planned by the TPPDP and TPPDP-LB algorithm are longer because the two algorithms concern more about the travel time instead of the length of path. It can be seen from the results that the path lengths calculated by TPPDP and TPPDP-LB algorithms are not much longer than Dijkstra algorithm, and are even the same at some time slot. Therefore, these two algorithms sacrifice the cost of a little distance and achieve the minimum travel time, and as mentioned above, travel time is the most important issue for querying users.
2) Dynamic Adaptability
In Section III-F, a dynamic adaptive optimization algorithm is proposed to improve the TPPDP and TPPDP-LB algorithms, in order to deal with road sudden conditions in time. The simulation in this part is implemented to verify the dynamic adaptability in the case of traffic control.
The path planning request is set from node 22 to 201 and the departure time is 9 am. By running TPPDP algorithm, a planned path represented by red solid lines is obtained as shown in Figure 13(a), which has the shortest travel time of 34.1466 seconds. Then we set up road traffic control between sections 184 to 202, and by using dynamic adaptive optimization algorithm, the planned route becomes as shown in Figure 13(b), avoiding the traffic control sections, and the travel time reaches 37.0532 seconds. Therefore, it can be inferred that the algorithm proposed in this paper can realize the dynamic planning path. When sudden traffic control or traffic accidents happen, paths can be update in time to avoid congestion.
3) Load Balancing
As mentioned in Section III, TPPDP-LB algorithm implements load balancing for path planning by considering the number of concurrent requests. The simulation in this part aims to verify the capacity of load balancing of TPPDP-LB compared to TPPDP. The settings of the simulation experiment are shown in Table 4. 20 concurrent path planning requests from node 30 to node 122 are sent to the simulation system at the same time, and then TPPDP algorithm and TPPDP-LB algorithm respectively plan different paths to these requesting users.
Heat map is used to describe the traffic density in the next period, and the result is shown in Figure 14. Figure 14(a) is the traffic hot zone map generated by the path planned by TPPDP Algorithm, while Figure 14(b) is generated by TPPDP-LB Algorithm. The hot zone map uses different colors to represent different traffic density. Red represents the most congested traffic conditions, and yellow, green, cyan, and blue in turn represent a decrease in traffic density. Comparing Figure 14 (a) and (b), it shows that the paths planned by TPPDP-LB algorithm achieves better load balancing than the paths planned by TPPDP algorithm. It is because TPPDP-LB algorithm will recommend users to other paths if a road segment is recommended to many users at the same time, so as to ensure the load balancing of these road segments in the next time. The larger the number of concurrent path planning requests is, the better the performance of load balancing is.
C. Simulation of Distributed Computing Framework
In Section IV, a distributed computing framework is designed for implementing TPPDP-LB algorithm to solve the problem that the algorithm is difficult to implement in a distributed mode because of serial computing. The simulation experiments in this section run the TPPDP-LB algorithm on the distributed computing framework to verify the performance of the framework in terms of runtime and result validity. The frequency of algorithm execution can be adjusted according to the frequency of request generation. The algorithm is executed once per second in the simulation.
1) Runtime
Reducing algorithm runtime is the main purpose of using a distributed computing framework, therefore the first simulation experiment is to compare the runtime with different numbers of distributed compute nodes. As mentioned in Section IV, the number of distributed compute nodes depends on the
Figure 15 shows the runtime of TPPDP-LB algorithm with different
2) Algorithm Accuracy
The TPPDP-LB algorithm is essentially a serial calculation process, because each path planning request needs take into account the results of the previous requests. By dividing the request data into clusters with no correlation, the distributed computing framework achieves parallel computing. However, the actual path planning request is difficult to achieve no correlation completely, and some road sections will always overlap. In this case, simulation experiments are required to prove that the difference between the recommended path generated by the distributed computing framework and the path generated by serial computing is very small.
The simulation experiment setting is as follows: Concurrent path planning requests are randomly generate, and the number of requests are 100, 500, 1000 and 5000 respectively. Then the distributed computing framework is used to execute the TPPDP-LB algorithm, taking the values of
The simulation experiment setting is as follows: Concurrent path planning requests are randomly generate, and the number of requests are 100, 500, 1000 and 5000 respectively. Then the distributed computing framework is used to execute the TPPDP-LB algorithm, taking the values of
The difference between results produced by distributed computing framework and serial computing.
In summary, the simulation results proved that TPPDP algorithm can produce the planned path with the shortest travel time, based on historical traffic data, as well as the TPPDP-LB algorithm can achieve load balancing and avoid localized congestion by considering the situation where some road sections are reused by concurrently requested paths. In addition, the simulation also proves that the distributed computing framework to execute TPPDP-LB algorithm can implement parallel computing to reduce runtime, without affecting the superiority of the TPPDP-LB algorithm.
Conclusion
In this work, a traffic path planning algorithm based on data prediction named TPPDP is proposed. A prediction model is designed to predict the traffic state of the road segment in the future based on historical traffic data and current traffic information, and based on the predicted result, which is the average travel speed of road segments in next timeslot, TPPDP algorithm is proposed to plan a path with the shortest travel time. Furthermore, considering that there are overlaps in path segments between planned paths produced by concurrent requests, a path planning algorithm based on data prediction with load balancing named TPPDP-LB is proposed, which combines the predicted information and the concurrent recommended number of the road segments to plan a path with the shortest travel time as well as keeping global load balancing. A specific distributed computing framework for TPPDP-LB algorithm is designed to improve the processing speed of the system, because the processing of TPPDP-LB algorithm is serial and the common distributed computing framework cannot meet the parallel processing requirements of TPPDP-LB algorithm.
The results of simulation show that TPPDP algorithm can find the path with shortest travel time while TPPDP-LB algorithm can achieve load balancing of computing, which can effectively avoid congested sections in the future. It is also shown that the distributed computing framework designed for TPPDP-LP algorithm can effectively reduce the runtime of system operation by 80% as well as keep the accuracy of TPPDP-LB algorithm.