Introduction
In recent years, as information technology and the transportation industry have become increasingly integrated, intelligent transportation systems have attracted wide attention. Intelligent transportation systems aim to improve traffic efficiency, optimize traffic flow, and improve traffic safety. The prerequisite for achieving these functions is a clear understanding of the location of vehicles in transit. Therefore, in-transit vehicle position estimation is a key foundational technology in the construction of intelligent transportation systems, with important applications in path navigation, vehicle collaborative control, vehicle collision warning, and other areas [1], [2].
For highways, accurately estimating the location of vehicles can monitor traffic flow and congestion in real-time, which helps traffic managers understand the condition of the road, take timely measures to alleviate congestion, guide traffic flow to idle roads, and more effectively optimize road conditions. Secondly, by estimating the location of vehicles in transit, more accurate navigation services can be provided for vehicles based on their real-time location information and traffic conditions, optimizing their driving routes, avoiding congestion and cumbersome routes, and thereby improving driving efficiency and safety [3], [4], [5]. In addition, by collecting vehicle location data, traffic management departments can analyze the traffic flow patterns of different periods and regions, providing a basis for traffic policy formulation and rational planning. This helps to plan road construction and transportation facility layout reasonably to meet the growing transportation demand. Therefore, accurately obtaining the location of vehicles in transit is of great significance for ensuring the safety of highway driving and improving the efficiency of highway operation [6], and is also a key technology necessary for building a smart highway.
The accelerated integration of new-generation information technology and the automotive industry has further promoted the development of the intelligent automotive industry. In-transit vehicle position estimation is a crucial part of intelligent driving systems and is of great significance for the upgrading of the intelligent driving industry [7]. One of the foundations for achieving high-level intelligent driving is the real-time perception of the driving environment by intelligent vehicles [8]. On highways, the element of the environment that has the greatest impact on a vehicle is the surrounding vehicles. Therefore, perceiving the positions of vehicles in transit is the core task of intelligent vehicles in perceiving the driving environment, and it is also an important guarantee for the safety and comfort of intelligent vehicles [9]. The currently popular approach is to perceive the location of surrounding vehicles through sensors such as LiDAR and cameras. However, these methods are limited by the mechanical performance of the equipment, weather, and other factors, so their perception range is limited and unstable. Where road design obstructs the sensors’ view or visibility is low, dangerous situations can easily arise. In addition, because of the limited perception range, intelligent vehicles cannot perceive very slow vehicles beyond the line of sight, so they cannot plan their speed in advance, which degrades the driving experience. Therefore, stable vehicle position perception is the key to the upgrading and development of the intelligent vehicle industry.
In summary, accurately obtaining vehicle positions is of great significance for the comprehensive construction of intelligent transportation systems and promoting the upgrading of China’s automotive industry towards intelligent driving. With the continuous progress of social and economic levels, the number of private cars is bound to increase significantly, which will bring severe challenges to the transportation system. At that time, the demand for functions such as traffic flow guidance and road condition evaluation of intelligent transportation systems will become even stronger. How to achieve large-scale vehicle position estimation without increasing costs has become a key issue.
The paper is organized as follows. The next section reviews the relevant literature on vehicle position estimation at home and abroad and summarizes its drawbacks for position estimation within highway segments. The third section discusses the differences and innovations between our research and existing research. The fourth section introduces the implementation of our proposed method for estimating the position of vehicles in transit on highways, which takes into account road features and short-term driving styles. The fifth section presents the application effect of our method in actual highway environments. The final section summarizes the application effectiveness of our method.
Related Research
With the rapid development of the automotive industry and the increasing complexity of road design and traffic situations, vehicle location estimation methods have shown a trend of upgrading from relying solely on the Global Positioning System (GPS), to multi-sensor collaboration, and then to collaborative iteration of sensors and machine learning methods.
The precise positioning of vehicles was initially achieved by the Global Navigation Satellite System (GNSS) and the GPS. However, due to communication interruptions and multipath errors in tunnels, canyons, and other road segments, GPS and GNSS have poor performance and cannot meet the requirements of modern intelligent transportation systems [10], [11], [12], [13]. Therefore, for many application scenarios that require precise and reliable positioning, independent GNSS or GPS is considered unreliable [2], [14], [15], [16].
To solve the problem that GPS or GNSS alone is prone to interference, Zongwei Wu et al. [15] proposed a method to improve the attitude estimation accuracy of a low-cost inertial navigation system/GPS (INS/GPS) integrated vehicle by utilizing the heading angle measured by the GPS, which effectively reduces the effects of yaw angle error, sideslip angle, and the noise of the GPS measurements, and improves the positioning accuracy compared to GPS/GNSS alone. Due to the complementary nature of sensors [18], [19], [20], information from GPS and vehicle motion sensors is widely used for vehicle position estimation to obtain reliable and accurate vehicle position. Rezaei and Sengupta [21] proposed a vehicle position estimation scheme based on the integration of GPS and vehicle sensors and used a Kalman filter for data fusion to improve the vehicle position estimation accuracy. However, the installation rate and the cost of vehicle motion sensors are difficult to balance.
To address the problem of GPS positioning accuracy degradation or signal loss when a vehicle moves in an area where GPS signals cannot be received (tunnels or underpasses) or in an area where very strong multipath propagation occurs (areas surrounded by tall glass-covered buildings), Omar et al. [22] designed a position estimation method for the integration of GPS and waypoint projection navigation systems and verified the effectiveness of their method through simulation. Wang et al. [23] first proposed a robust Direction Of Arrival (DOA) estimation method based on sparse Bayesian learning (SBL) to achieve DOA estimation of target vehicles under nonuniform noise conditions. Then, based on the DOA estimation results, every two base stations in the localization system cross-localize the target vehicle once. Finally, based on the results of three cross-localizations, robust localization can be achieved. A large number of simulation results show the effectiveness and superiority of the method. Guo et al. [24] introduced frequency diverse array multiple-input multiple-output (FDA-MIMO) radar into an intelligent transportation system (ITS) and used tensor decomposition to process transportation big data (TBD) to improve the real-time performance of target position estimation. They proposed an algorithm for angle and distance estimation in such radar systems where array gain-phase error and spatially colored noise coexist. Firstly, a four-dimensional tensor was constructed using the temporal uncorrelatedness of colored noise to eliminate the impact of colored noise on ITS. Secondly, a direction matrix containing target information was obtained through parallel factor decomposition. An optimization problem was constructed for the array gain-phase error, and the Lagrange multiplier method was used to find the optimal solution. The influence of gain-phase error was eliminated by utilizing the optimal solution and the direction matrix.
Finally, the position information of the car was obtained by least squares (LS) fitting. Havyarimana et al. [25] proposed a fusion framework based on sparse Gaussian Wigner prediction (SG-WP). This method assumes that the measurement noise follows a non-Gaussian distribution and uses a generalized error distribution as an approximation of the non-Gaussian density. It combines the advantages of random matrix theory and sparse characteristics to provide enhanced vehicle positioning capabilities. Jo et al. [26] addressed a significant limitation in existing models, which often assume that vehicles travel on a flat plane without considering the impact of road slopes. Furthermore, they highlighted the high cost associated with three-dimensional vehicle positioning equipment. They analyzed how road slope affects location estimation and proposed an estimation algorithm that accounts for this influence. By compensating for errors caused by roadside slopes, this algorithm enhances the precision and reliability of location estimation. However, a common drawback of the methods mentioned above is their reliance on GPS devices or base stations. This reliance prevents scalable application: the methods are confined to vehicles equipped with GPS or require large-scale base station deployments. Consequently, the scope of application is significantly restricted, and the associated costs are substantially increased.
With the continuous advancement of communication technology, Vehicular Ad-Hoc Networks (VANETs) have found widespread applications in intelligent transportation systems [27], [28], [29], [30], [31]. Tsai et al. [32] introduced a collaborative positioning algorithm (CPDR) aimed at improving GPS location accuracy within VANETs by incorporating Dead Reckoning (DR) algorithms. In this work, the DR algorithm helps filter out some unreasonable GPS positions by referencing travel history records. However, VANETs pose several challenges, including highly heterogeneous vehicle network design, security, privacy concerns, and the dynamic nature of vehicular mobility. These factors create additional challenges for protocol designers. Particularly, the constantly changing scenarios due to vehicle mobility result in short lifetimes for multi-hop paths. In such situations, protocols that rely on knowledge of the system’s state can be inefficient due to frequent network changes. Moreover, VANET applications may require a different protocol stack [33]. Additionally, VANETs must grapple with the trade-off between the deployment rate of vehicular hardware devices and the associated cost.
In recent years, neural network models have gained widespread adoption in intelligent transportation systems due to their outstanding performance. Wan et al. [34] introduced a novel system architecture incorporating Massive Multiple-Input Multiple-Output (MIMO) or Reconfigurable Intelligent Surfaces (RIS) along with multiple autonomous vehicles for vehicle positioning. By leveraging geometric algebra, they reformulated the Direction of Arrival (DOA) and polarization estimation problem as a new block sparse recovery problem. They achieved DOA and polarization parameter estimation for autonomous vehicles with relatively low computational complexity using the deep network architecture SBLNet. Meanwhile, Yuexia and Chong [35] proposed a high-precision vehicle localization method based on neural networks and Road Side Units (RSUs) fingerprints. They divided the localization area into uniform grid regions, collected Received Signal Strength Indicator (RSSI) data from different RSUs in each grid region, and constructed an RSU fingerprint database. During the localization phase, they utilized Backpropagation Neural Networks (BPNN) to estimate the approximate coordinates of the target vehicle. Using these estimated coordinates as the center and the maximum prediction error of the BPNN as the radius, they constructed a fingerprint-matching region. This approach allowed them to calculate the precise coordinates of the target vehicle through local fingerprint localization. Alzyout and Alsmirat [36] introduced a short-term vehicle location prediction framework that enhances prediction accuracy and framework execution time by dynamically adjusting parameters and employing both multi-selective and single-selective ARIMA models. Anitha and Duraiswamy [37] addressed the limitations of current methods in vehicle location prediction, which often lack analysis of both current and future vehicle positions and are affected by errors in GPS location data. 
They proposed a heuristic mobile vehicle location prediction algorithm, demonstrating that this heuristic algorithm can accurately predict the future positions of vehicles. In response to the shortcomings of existing methods, Fan et al. [38] presented a deep learning-based approach for next-step location prediction (DLNLP), incorporating contextual information into urban-level vehicle motion prediction. Experimental results indicate that DLNLP outperforms other methods such as MM, WMM, CNN, and LSTM in terms of accuracy, recall, and balanced F-score. Long et al. [39] addressed the limitation of existing location prediction models, which focus solely on location and time, oversimplifying the regularities and preferences in human mobility. Furthermore, existing state-of-the-art RNN-based models may fail to capture long-term patterns in sparse scenarios due to the lack of sequential dependencies. They proposed an individual vehicle location prediction method that utilizes travel patterns and preferences. Experiments conducted on three real vehicle trajectory datasets, each containing over 10,000 individual vehicles, demonstrated that the proposed model outperforms state-of-the-art models by 7%-10% in terms of prediction accuracy. Xiao and Nian [40] tackled the issue of low prediction accuracy in existing methods. They introduced a vehicle location prediction algorithm based on spatiotemporal feature transformation and a hybrid LSTM neural network. This approach effectively reduces information loss in vehicle trajectories and enhances the accuracy of vehicle location prediction. However, it should be noted that this model does not fully consider the impact of road conditions.
After analyzing the literature review mentioned above, we can identify the following shortcomings in the existing methods:
Most of the methods mentioned above have largely overlooked the influence of road structure characteristics on vehicle positioning, especially in provinces like Fujian, which are primarily mountainous, and where highways traverse through mountain ranges. Therefore, methods that do not consider road features may not necessarily be suitable in predominantly mountainous regions.
The low adoption rate of in-vehicle positioning devices hinders their widespread application, thus limiting the provision of comprehensive and accurate positioning information for intelligent transportation systems.
The addition of extra in-vehicle positioning equipment results in an increase in overall vehicle costs. From the perspective of vehicle owners, unless there are significant benefits, they may not be willing to bear the associated expenses.
Our Contribution
In response to the challenges faced by existing methods and the existing infrastructure, this paper proposes a highway in-transit vehicle position estimation method that considers road features and short-term driving style by integrating electronic toll collection (ETC) data (Edata) and in-vehicle GPS positioning data. As of the end of 2022, Fujian Province’s highways generated over 5 million ETC transaction records daily, providing basic driving characteristics for estimating the position of vehicles in transit on highways. However, since Edata only records the status of a vehicle when it exchanges information with a gantry, the vehicle status within the highway segment cannot be obtained from it. GPS positioning data records the driving status of vehicles at certain time intervals, providing a modeling basis for estimating the position of vehicles within a highway segment. The main contributions of this article are:
This method integrates Edata and GPS positioning data (Gdata) for vehicle position estimation, mines vehicle driving patterns in Edata to construct basic driving features, and uses the corresponding vehicle Gdata to generate road segment features and target state variables. Because ETC devices are deployed on highways at large scale, this method is more widely applicable.
In the vehicle segment speed prediction model, based on the spatiotemporal dependence of the vehicle segment speed, the SC-Kmeans-Bilstm model based on PCA optimization is proposed, which fully takes into account the influence of the spatiotemporal dependence of the vehicle segment speed and the vehicle’s short-term driving style to improve the model prediction accuracy.
In vehicle location estimation, to address the instability of vehicle location data in the spatial and temporal dimensions, linear interpolation and first-order backward difference are used to improve the smoothness of the data while retaining as much of its information entropy as possible. In addition, road features are incorporated into the model to minimize the impact of changes in the spatial and temporal features of the road on the performance of the position estimation model.
The DLCNN-LSTM-ATTENTION fusion module based on L1 regularization is designed to achieve multimodal feature fusion and abstraction at a deeper level to better understand the multidimensional relationships of the data. Also the inclusion of L1 regularization avoids the overfitting phenomenon. In addition, the two-layer CNN can interact and fuse different features in the second convolutional layer to capture higher-level data patterns. The LSTM layer has memory units and gating mechanisms to better capture and utilize long-term dependencies in sequential data. Finally, by applying an attention mechanism on the output of the LSTM layer, we can adaptively learn the important weights for each time step.
Although vehicle position estimation has always been a research hotspot in the academic community, to our knowledge, we are the first team to propose using Edata and Gdata to jointly achieve vehicle position estimation within highway segments.
Model Construction
The highway in-transit vehicle position estimation model, which considers road characteristics and the short-term driving style of vehicles, first analyzes and processes the data problems in Edata and Gdata to improve data matching efficiency and reduce the noise that affects the fit of the model. This step mainly consists of noise reduction for duplicate, missing, and incorrect data. Based on the denoised data, bidirectional matching between Edata and Gdata is performed, and basic vehicle driving features are constructed from Edata and Gdata respectively. Secondly, the short-term driving style of the vehicle is integrated into the vehicle speed prediction model to obtain accurate vehicle segment passing speeds. Next, since the segment speed of the vehicle cannot fully reflect the driving pattern of the vehicle within the segment, a road model within the segment is constructed using moving average wavelet transform, and it is combined with the vehicle segment speed to jointly map the changes in vehicle position within the segment. Finally, based on the basic driving characteristics of the vehicle, the segment speed of the vehicle, and the road characteristics, a DLCNN-LSTM-Attention method for estimating the position of vehicles in transit is proposed to achieve accurate prediction of the position of vehicles in transit. The overall framework of the model is shown in Fig 1. Table 1 summarizes the parameters and characteristic variables involved in this article.
A. Data Preprocessing and Analysis
When building a model, duplicate data makes the model overly dependent on these data, leading to overfitting and a decrease in generalization ability to new data. Missing data may result in the model not being able to fully utilize information, thereby reducing the accuracy of the model. Data with incorrect information may cause the model to be affected by outliers or unreasonable values, leading to overfitting or errors in these data. Because the experimental data comes from Edata and Gdata in real driving environments, there are inevitably duplicates, omissions, and incorrect information in the data. Therefore, it is necessary to conduct deep cleaning of the data to ensure data quality and accuracy and construct features related to vehicle position changes based on the cleaned data as input to the model, to maximize model fit.
1) Data Noise Reduction
To reduce data noise, the main objects for cleaning Gdata include repeated positioning data of the same vehicle, long positioning time intervals, continuously changing speed but unchanged vehicle position, data with a continuous speed of 0 km/h but unchanged vehicle position, and data points with trajectory drift outside the highway. For trajectory drift points, this article uses a drift point detection method based on Gaode map path planning.
As shown in Fig 2, Gdata is affected by factors such as satellite geometry, receiver error, and noise, so individual positioning points drift outside the highway network. Such drift distorts the regularity of vehicle position changes and degrades the fit of the model. Therefore, this article proposes a drift point cleaning method based on the Gaode API: the distance between each pair of adjacent points of the vehicle trajectory is calculated through the Gaode API, with the route calculation rule set to prioritize highways. Whenever a positioning point drifts outside the highway, its distance to adjacent positioning points must be much greater than the distance between normal adjacent positioning points. This article takes the position change per second at the maximum speed of each segment as the threshold and records the points where the vehicle’s position change per second exceeds this threshold as drift points.
An analysis of the traffic speeds of typical highway segments in Fujian Province gives the following results. For segment 1, the average speed is 71 km/h, the maximum speed is 111 km/h, and the 85th percentile speed is 85 km/h. For segment 2, the average speed is 80 km/h, the maximum speed is 121 km/h, and the 85th percentile speed is 92 km/h. For segment 3, the average speed is 83 km/h, the maximum speed is 111 km/h, and the 85th percentile speed is 94 km/h. For segment 4, the average speed is 79 km/h, the maximum speed is 110 km/h, and the 85th percentile speed is 92 km/h. The 85th percentile speed is a reasonable choice for representing the speed of most vehicles: in normal traffic flow some vehicles travel slowly (e.g., because of congestion or poor road conditions) and some travel fast (e.g., speeding), but most vehicles travel at intermediate speeds. However, to preserve as many data samples as possible, this article takes the distance of position change per second at the maximum speed of each segment as the threshold, records points where the per-second change in distance between adjacent vehicle positions exceeds this threshold as drift points, and removes them.
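The thresholding rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors’ implementation: the great-circle (haversine) distance stands in for the route distance that the article obtains from the Gaode API, the first point of the track is assumed to be valid, and each point is compared with the last accepted point so that one drift point does not cause the following normal point to be flagged. All function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius used by the haversine formula


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (assumed stand-in for the Gaode route distance)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def drift_threshold_m_per_s(max_speed_kmh):
    """Position change per second at the segment's maximum speed."""
    return max_speed_kmh * 1000.0 / 3600.0


def flag_drift_points(track, max_speed_kmh, distance_fn=haversine_m):
    """Return indices of points whose per-second displacement exceeds the threshold.

    track: list of (timestamp_s, lat, lon) tuples sorted by time.
    """
    threshold = drift_threshold_m_per_s(max_speed_kmh)
    drift = []
    last = track[0] if track else None  # assumption: the first point is valid
    for i in range(1, len(track)):
        t, lat, lon = track[i]
        dt = t - last[0]
        if dt <= 0:
            continue  # duplicate timestamps are left to the other cleaning steps
        rate = distance_fn(last[1], last[2], lat, lon) / dt
        if rate > threshold:
            drift.append(i)   # farther than the fastest vehicle could travel
        else:
            last = track[i]   # accept the point as the new reference
    return drift
```

For segment 1, whose maximum speed is 111 km/h, the threshold is 111000 / 3600, or roughly 30.8 m of position change per second. Passing a route-distance function as `distance_fn` would bring the sketch closer to the procedure in the text.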
In addition, Edata contains problems such as garbled license plates, missing transactions (the vehicle passes a gantry without exchanging information with it), and duplicate transactions at a gantry. When a transaction is missing, the times at which the vehicle enters and leaves the segment cannot be obtained, so the subsequent data matching cannot be performed. Therefore, this article directly removes trajectories with missing transactions. Misrecorded license plates also lead to data mismatch, so they are directly removed as well. For duplicate transaction data, this article uses “flagid” and “entity” as the de-duplication subset to eliminate duplicate records.
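The de-duplication step can be illustrated with a short sketch. The field names “flagid” and “entity” come from the text, but their exact meaning is not specified there; the record layout and the choice to keep the first occurrence of each key are assumptions made only for illustration.

```python
def drop_duplicate_transactions(records, keys=("flagid", "entity")):
    """Keep the first record for each distinct combination of the key fields.

    records: list of dicts, one per ETC transaction (hypothetical layout).
    """
    seen = set()
    unique = []
    for rec in records:
        key = tuple(rec[k] for k in keys)
        if key in seen:
            continue  # same key fields seen before: treat as a duplicate transaction
        seen.add(key)
        unique.append(rec)
    return unique
```

Records that differ in either key field are all retained, so only transactions that repeat the same combination are removed.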
2) Data Matching Based on Spatiotemporal Location
The Edata comes from the vehicle information captured by the ETC system when a vehicle passes through an ETC gantry and interacts with it. It includes information reflecting the vehicle’s attributes, such as the transaction time between the vehicle and the gantry, the vehicle type, and the vehicle entry time, but it does not include the driving characteristics of the vehicle within the segment, such as changes in position, real-time speed, and other vehicle status information within the segment. If the distance between the two gantries is large and the traffic situation inside the segment changes, an estimate of the traffic status inside the segment based only on Edata can easily deviate from the actual driving status. The Gdata records the vehicle’s position, speed, heading angle, and other time-varying state information inside the segment at certain time intervals, which reflects the regularity of changes in the vehicle’s traveling state inside the segment. Therefore, to obtain more accurate vehicle location information within the segment, it is necessary to use Gdata in conjunction with Edata to fill the blind spot inside the segment. Here, a segment is a short stretch of road between two adjacent gantry nodes of the highway; the gantry that the vehicle passes first is called the front gantry of the segment, and the gantry that it passes later is called the back gantry of the segment, as shown in Fig 3.
This section proposes two-way data matching based on spatiotemporal location: the times of the vehicle’s transactions with the front and back gantries of the segment in Edata determine the time range of the corresponding vehicle’s positioning records in Gdata, which in turn identifies the target vehicle located in the target segment in Gdata. However, the characters describing the license plate color are at the beginning of the license plate string in the Edata, while in the Gdata they are at the end. To match the Edata with the Gdata, a left rotation is applied to the license plate string in the Edata, which moves the characters indicating the license plate color to the tail of the string.
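The matching described above can be sketched as follows. The plate encoding used here (a single leading color character), the dictionary field names, and the function names are all hypothetical; the sketch only shows the left rotation and the time-window filter, not the authors’ code.

```python
from collections import defaultdict


def rotate_left(s, k=1):
    """Move the first k characters of s to its tail."""
    if not s:
        return s
    k %= len(s)
    return s[k:] + s[:k]


def match_gps_to_segment(etc_passes, gps_points, color_len=1):
    """Pair each ETC pass over a segment with the GPS points recorded during it.

    etc_passes: dicts with 'plate' (color marker first), 't_front', 't_back'.
    gps_points: dicts with 'plate' (color marker last) and 't'.
    Returns a list of (gdata_style_plate, time_sorted_points) pairs.
    """
    by_plate = defaultdict(list)
    for point in gps_points:
        by_plate[point["plate"]].append(point)

    matched = []
    for etc in etc_passes:
        plate = rotate_left(etc["plate"], color_len)  # Edata form -> Gdata form
        inside = [p for p in by_plate.get(plate, [])
                  if etc["t_front"] <= p["t"] <= etc["t_back"]]
        if inside:
            matched.append((plate, sorted(inside, key=lambda p: p["t"])))
    return matched
```

With a hypothetical Edata plate string `"0A12345"`, where `"0"` is the color marker, `rotate_left` yields `"A123450"`, the form assumed for Gdata. Only GPS points of that plate whose timestamps fall between the front-gantry and back-gantry transaction times are kept.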
Firstly, obtain the Edata of the target segment and extract the license plate data
3) Construction of Basic Driving Characteristics of Vehicles
This section is based on the target vehicles within the target segment matched in the previous step, and mines the hidden information in Edata to construct the basic features for vehicle position estimation. Basic features mainly refer to features that can be extracted from data without the need for deep processing by machine learning models. It mainly includes \begin{equation*}\mathrm{Traj} = \langle \mathrm{Node}_{1}, \ldots, \mathrm{Node}_{n} \rangle. \tag{1}\end{equation*}
Firstly, traverse the driving trajectory of each vehicle to obtain the times at which it passes adjacent gantries in the trajectory, and calculate the time it takes for the vehicle to pass through the segment formed by adjacent gantries. Secondly, obtain the coordinates of adjacent gantries and use the Gaode API to calculate the distance between them. Then, calculate the vehicle’s travel speed on historical segments based on kinematic formulas. Due to the influence of driver status and traffic conditions on vehicle speed, this article only takes the speed
To keep the traffic flow as realistic as possible, and to avoid a large time difference causing the calculated traffic flow to deviate significantly from the traffic flow the vehicle actually experiences, this article takes as the corresponding traffic flow only the total number of vehicles passing through the front and back gantries of the segment in the ten minutes before the target vehicle enters the segment, that is, up to the time of the target vehicle’s transaction with the front gantry of the segment.
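The two basic features described above, historical segment speed and traffic flow, can be sketched as follows. The data layouts and function names are assumptions for illustration; in particular, the gantry-to-gantry distances are taken as given inputs (the article obtains them from the Gaode API), and the flow is returned as separate counts for the two gantries, which can be summed if a single value is wanted.

```python
def segment_speed_kmh(distance_m, t_front_s, t_back_s):
    """Average speed over a segment from distance and the two gantry times."""
    travel_s = t_back_s - t_front_s
    if travel_s <= 0:
        raise ValueError("back-gantry time must be later than front-gantry time")
    return distance_m / travel_s * 3.6  # m/s -> km/h


def historical_segment_speeds(trajectory, distances_m):
    """Speeds on every segment of one vehicle's trajectory.

    trajectory: list of (gantry_id, timestamp_s) in passing order.
    distances_m: dict mapping (front_gantry, back_gantry) to metres.
    """
    speeds = []
    for (g1, t1), (g2, t2) in zip(trajectory, trajectory[1:]):
        speeds.append(((g1, g2), segment_speed_kmh(distances_m[(g1, g2)], t1, t2)))
    return speeds


def traffic_flow(front_times, back_times, t_enter_s, window_s=600):
    """Count transactions at each gantry in the window_s seconds before entry.

    t_enter_s is the target vehicle's transaction time at the front gantry.
    Returns (front_count, back_count).
    """
    start = t_enter_s - window_s
    front = sum(1 for t in front_times if start <= t <= t_enter_s)
    back = sum(1 for t in back_times if start <= t <= t_enter_s)
    return front, back
```

For example, a vehicle that covers a 10 km segment in 360 s has a segment speed of about 100 km/h. The default `window_s=600` corresponds to the ten-minute window in the text.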
B. Prediction Model of Vehicle Segment Speed Considering Short-Term Driving Style
On highways without significant bends, vehicle speed is one of the important characteristics that reflect changes in vehicle position: the faster the vehicle, the greater the change in position per unit time. In actual driving environments, speed is limited by factors such as traffic flow, vehicle mechanical performance, and road structural characteristics. This section analyzes the factors that affect speed changes and, based on the analysis results, determines the input variables required to construct the vehicle speed prediction model.
1) Analysis of Correlation Characteristics of Vehicle Speed Changes
2) Analysis of Speed Characteristics in the Dimension of Vehicle Flow
Traffic flow affects the driving environment of vehicles. When the traffic flow reaches a certain level, the speed of vehicles is constrained to varying degrees. Fig 4 shows the difference in vehicle speeds between two segments with different traffic flows. As shown in Fig 4, in the segment with high traffic flow there are significantly more vehicles with speeds below 50 km/h than in the segment with low traffic flow, and as the flow increases, more and more low-speed vehicles appear in both segments. Therefore, once the traffic flow reaches a certain level, it becomes an important feature that affects speed and thereby affects changes in vehicle position.
3) Analysis of Vehicle Type Characteristics
Speed is constrained not only by the driving environment of the vehicle but also by the type of vehicle. Under the same road conditions, the speed of large trucks may be lower than that of small cars because of the limits of vehicle mechanical performance. In addition, large trucks need a long braking distance, so for safety their drivers also keep the speed down to be able to handle emergencies. As shown in Fig 5, an analysis of the speeds of various types of vehicles in different segments of Fujian Province’s highways shows that, in every segment, the average speed, 85th percentile speed, and maximum speed of passenger cars are significantly greater than those of trucks and special operation vehicles. For the 15th percentile speed, the only exception is segment 2, where the value for passenger cars is lower than that of trucks and special operation vehicles. Different types of vehicles therefore show significant speed differences due to differences in mechanical performance, and the speed of passenger cars is much higher than that of other types of vehicles. Vehicle type should thus also be included in the feature library of factors that affect speed changes.
Statistical chart of segment speed characteristics for different types of vehicles.
4) Analysis of Driving Style Characteristics
Driving style refers to the behavior and habits of the driver on the road, which directly determine the speed of the vehicle [41]. Radical (aggressive) drivers tend to exceed speed limits, accelerate rapidly, and brake sharply, often ignoring traffic rules and speed limit signs. This way of driving may cause the vehicle’s speed to soar within a short period. By contrast, prudent and steady drivers tend to follow traffic rules, maintain a moderate speed, and drive smoothly. They pay attention to road safety and avoid sharp turns and hard braking, which results in relatively low speeds. An economical, energy-saving driving style can also affect vehicle speed. Such drivers focus on efficient driving, accelerate and decelerate smoothly, and plan their routes reasonably, which effectively reduces fuel consumption.
Fig 5 shows that, for passenger cars and trucks, the maximum speed differs markedly from the 15th percentile speed, whereas for special operation vehicles the difference is small. A possible reason is that most special operation vehicles travel on assigned tasks, and their drivers are recruited under strict requirements, have rich driving experience, and drive to a high overall standard, so their speed is relatively stable. Passenger cars and trucks, by contrast, are mostly privately owned or belong to small businesses, and their drivers can drive according to their own habits, so their speeds fluctuate greatly. In summary, different driving styles have a significant impact on both the level and the fluctuation of vehicle speed, so driving style should be one of the important factors in perceiving vehicle speed.
5) Short-Term Driving Style Construction Method Based on SC-KMEANS Clustering
Driving style is affected by many factors, and driving habit is an important determinant. From the perspective of the overall journey of car owners, each car owner has a fixed driving style, including radical type, general type, and cautious type, which are determined by daily driving habits. In the actual driving process, driving style is affected by driving tasks, driving environment, driver subjective factors, and so on, and there are often changes between different styles, especially the driving style changes in specific scenes that have strong randomness. Therefore, this paper proposes a short-term driving style considering the spatiotemporal characteristics, that is, considering the driving state with the strongest correlation with the current vehicle state in time and space to build a short-term driving style, to more accurately capture the driving characteristics of the vehicle’s current journey.
This section is based on the basic driving characteristics of the vehicle, including
a: Selection of Driving Style Classification Number
In theory, determining the number of driving styles is a subjective problem: there is no fixed rule or standard for how many styles drivers should be divided into. Classifying driving styles nevertheless requires comprehensive consideration of many factors, including road environment, drivers’ preferences, and driving habits. Because driving styles are complex and diverse, it is difficult to obtain an accurate number of categories by subjective judgment alone. The silhouette coefficient (contour coefficient) method provides an objective, quantitative way to select the most appropriate number of classes when no fixed rule exists, which makes the analysis results more reliable.
The silhouette coefficient (also translated as contour coefficient) [42], i.e. the SC index, measures how compact each cluster is and how well separated the clusters are after clustering. The smaller the distance between samples in the same class and the larger the distance between samples in different classes, the greater the value of SC(i) and the better the clustering effect. It is therefore often used as a performance index for evaluating clustering results. It is computed as shown in equation (2):\begin{equation*}\text {SC}(i)=\frac {b(i)-a(i)}{\text {max}\left \{{ a(i),b(i) }\right \}} \tag{2}\end{equation*}
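In equation (2), a(i) is the average distance between sample i and the other samples in its own cluster, and b(i) is the smallest average distance between sample i and the samples of any other cluster.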
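As a minimal sketch of how equation (2) could be evaluated, the function below computes the mean SC over all samples for a given labelling, taking a(i) as the mean distance to the sample's own cluster and b(i) as the smallest mean distance to any other cluster. The function name, the Euclidean distance, and the convention of scoring single-member clusters as 0 are assumptions made for illustration, not details taken from the paper.

```python
import math

def silhouette(points, labels):
    """Mean silhouette coefficient over all samples (needs >= 2 clusters)."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        # Mean distance from sample i to every cluster, excluding itself.
        mean_dist = {}
        for c in clusters:
            d = [math.dist(p, q) for j, q in enumerate(points)
                 if labels[j] == c and j != i]
            if d:
                mean_dist[c] = sum(d) / len(d)
        if labels[i] not in mean_dist:      # single-member cluster (assumed 0)
            scores.append(0.0)
            continue
        a = mean_dist[labels[i]]            # a(i): own-cluster distance
        b = min(v for c, v in mean_dist.items() if c != labels[i])  # b(i)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette(points, [0, 0, 1, 1])   # compact, well-separated clusters
bad = silhouette(points, [0, 1, 0, 1])    # clusters mixed across the gap
```

In practice the score would be computed for each candidate number of classes, and the number with the largest mean SC would be kept.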
b: Recognition of Short-Term Driving Style
After determining the number of driving style classifications, it is necessary to classify them according to the driving characteristics of vehicles. Raw data contains high-dimensional features, while high-dimensional data sets increase model complexity and computational cost. Dimensionality reduction can map data to low dimensional space, reduce the requirements of calculation and storage, and reduce the complexity of model training and inference.
The main goal of PCA is to map the high-dimensional data into the low-dimensional space through some kind of spatial linear projection, and at the same time try to ensure the maximum variance of the data in the low-dimensional space of the target, to prevent the loss of more information of the original data [43]. Set the sample set \begin{align*} C&=\begin{bmatrix} \text {cov}(x_{1},x_{1}) &\quad \text {cov}(x_{1},x_{2}) \\ \text {cov}(x_{2},x_{1})&\quad \text {cov}(x_{2},x_{2}) \end{bmatrix} \tag{3}\\ \text {cov}(x_{1}, x_{1})&=\frac {\sum _{i=1}^{M}(x_{1}^{i}-\bar {x}_{1})(x_{1}^{i}-\bar {x}_{1}) }{M-1} \tag{4}\end{align*}
The eigenvalues \begin{align*}\begin{bmatrix}y_{i}^{1} \\ y _{i}^{2} \\ \ldots \\ y _{i}^{k} \end{bmatrix}=\begin{bmatrix}u_{1}^{T}\cdot (X_{1}^{i},X_{2}^{i},\ldots,X_{n}^{i})^{T} \\[7pt] u_{2}^{T}\cdot (X_{1}^{i},X_{2}^{i},\ldots,X_{n}^{i})^{T} \\ \ldots \\[7pt] u_{k}^{T}\cdot (X_{1}^{i},X_{2}^{i},\ldots,X_{n}^{i})^{T} \end{bmatrix} \tag{5}\end{align*}
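The two-feature case of equations (3)-(5) can be sketched in a few lines. The example below builds the covariance entries with the M - 1 denominator of equation (4), takes the largest eigenvalue of the 2x2 covariance matrix in closed form, and projects each sample onto the matching eigenvector. Centering the samples before projection and keeping a single component are assumptions for illustration; the function name is hypothetical.

```python
import math

def pca_2d(samples):
    """Project 2-feature samples onto their first principal component."""
    m = len(samples)
    mean = [sum(s[k] for s in samples) / m for k in range(2)]
    centered = [(s[0] - mean[0], s[1] - mean[1]) for s in samples]

    def cov(a, b):  # equation (4), with the M - 1 denominator
        return sum(c[a] * c[b] for c in centered) / (m - 1)

    c11, c12, c22 = cov(0, 0), cov(0, 1), cov(1, 1)
    # Largest eigenvalue of the symmetric 2x2 covariance matrix (closed form).
    lam = (c11 + c22) / 2 + math.hypot((c11 - c22) / 2, c12)
    # Matching eigenvector u1; fall back to an axis when features are uncorrelated.
    if c12 != 0:
        u = (c12, lam - c11)
    else:
        u = (1.0, 0.0) if c11 >= c22 else (0.0, 1.0)
    norm = math.hypot(u[0], u[1])
    u = (u[0] / norm, u[1] / norm)
    # Equation (5): y = u1^T x for every (centered) sample.
    return lam, u, [u[0] * c[0] + u[1] * c[1] for c in centered]

lam, u, y = pca_2d([(1, 1), (2, 2), (3, 3), (4, 4)])
```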
Based on the new features obtained after dimensionality reduction, the K-means algorithm measures the similarity of data objects using a suitable distance formula. Distance and similarity are inversely related: the smaller the similarity, the larger the distance.
The K-means algorithm first sets the initial clustering centers C according to the number of driving style classes and calculates the distance from each clustering center to the remaining data objects. This paper uses the Euclidean distance [44]; the Euclidean distance from a clustering center to another data object in the space is given by equation (6):\begin{equation*}d(x,C_{i})=\sqrt {\sum _{j=1}^{m}(x_{j}-C_{ij})^{2}} \tag{6}\end{equation*}
In equation (6),
Similarity is measured by the Euclidean distance, and each data object is assigned to the cluster whose center it is most similar to. After assignment, the data objects in each of the k clusters are averaged to form the next round of clustering centers, which reduces the sum of squared errors (SSE) of the dataset, calculated as in equation (7):\begin{equation*}\text {SSE}=\sum _{i=1}^{k}\sum _{x\in C_{i}}^{} \left |{ d(x,C_{i}) }\right | ^{2} \tag{7}\end{equation*}
The magnitude of the value of SSE is used as a measure of how good the clustering results are, and when it no longer changes or converges, the iteration is stopped and the final result is obtained.
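The assignment, update, and stopping steps described above can be sketched as follows. This is a plain Lloyd-style K-means with the Euclidean distance of equation (6) and the SSE of equation (7); the function name, the tolerance, and the handling of empty clusters are illustrative assumptions rather than the paper's settings.

```python
import math

def kmeans(points, init_centers, max_iter=100, tol=1e-9):
    """K-means with Euclidean distance; returns (centers, labels, sse)."""
    centers = list(init_centers)
    prev_sse = float("inf")
    for _ in range(max_iter):
        # Assignment step: nearest center by equation (6).
        labels = [min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
                  for p in points]
        # SSE of the current assignment, equation (7).
        sse = sum(math.dist(p, centers[l]) ** 2 for p, l in zip(points, labels))
        if abs(prev_sse - sse) <= tol:      # SSE no longer changes: stop
            break
        prev_sse = sse
        # Update step: each center becomes the mean of its members.
        for i in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == i]
            if members:                     # keep the old center if a cluster is empty
                centers[i] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, labels, sse

centers, labels, sse = kmeans([(0, 0), (0, 2), (10, 0), (10, 2)],
                              [(0, 0), (10, 0)])
```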
3) Prediction of Vehicle Segment Speed Based on Bi-LSTM
A bidirectional LSTM uses one LSTM unit to process the input in the forward direction and another to process it in the reverse direction [45]. It retains both past and future information, which helps the network understand the context before and after a given time step. The principle of bidirectional LSTM is shown in Fig 7.
If used \begin{align*}h_{t}^{\alpha }& =\text {sigmoid}(u^{\alpha } x_{t}+\omega _{\alpha } h_{t-1}^{\alpha }+b_{\alpha }) \tag{8}\\ h _{t}^{\beta } &=\text {sigmoid}(u^{\beta } x_{t}+\omega _{\beta } h_{t+1}^{\beta }+b_{\beta }) \tag{9}\\ y _{t} &=\text {sigmoid}(h_{t}^{\alpha } \upsilon _{\alpha }+\upsilon _{\beta } h_{t}^{\beta } +b_{f }) \tag{10}\end{align*}
The weight vectors corresponding to the forward and backward propagation of the expression are
The bidirectional LSTM used in this paper contains two bidirectional LSTM layers and a fully connected layer. The first bidirectional LSTM layer contains 64 LSTM units, uses ReLU as its activation function, and returns the complete sequence. This layer extracts temporal dependencies from the input data and lets information pass forward and backward simultaneously to fully capture the characteristics of the time series. The second bidirectional LSTM layer also contains 64 LSTM units with ReLU activation. This layer deepens the representation of the time series data and captures more complex temporal patterns. The output layer contains 1 neuron and generates the final prediction. Finally, the model is compiled with the Adam optimizer and the mean square error as the loss function.
C. Road Model Based on Moving Average Wavelet Smoothing for Box Graph Anomaly Detection
A basic vehicle motion model widely used for 2D heading projection is shown in equation (11)–(13), which adopts a 2D planar road assumption with state vectors consisting of heading \begin{align*}{\varPsi} _{k} &={\varPsi} _{k-1}+\triangle T\cdot \omega _{XY} \tag{11}\\ X_{k} &=X_{k-1}+\triangle T\cdot V_{XY} \cos ({\varPsi} _{k-1} +\triangle T\cdot \omega _{XY}) \tag{12}\\ Y_{k} &=Y_{k-1}+\triangle T\cdot V_{XY} \sin ({\varPsi} _{k-1} +\triangle T\cdot \omega _{XY}) \tag{13}\end{align*}
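A single update of equations (11)-(13) can be written directly as code. The sketch below is only an illustration of the planar model; the function and argument names are hypothetical, with the heading in radians, the planar speed in m/s, and the yaw rate in rad/s assumed as units.

```python
import math

def motion_step(psi, x, y, v_xy, omega_xy, dt):
    """One update of the planar motion model in equations (11)-(13)."""
    psi_next = psi + dt * omega_xy                 # equation (11)
    x_next = x + dt * v_xy * math.cos(psi_next)    # equation (12)
    y_next = y + dt * v_xy * math.sin(psi_next)    # equation (13)
    return psi_next, x_next, y_next

# Straight-line driving: heading 0, 10 m/s, no yaw rate, 1 s step.
straight = motion_step(0.0, 0.0, 0.0, 10.0, 0.0, 1.0)
# Heading pi/2: all displacement goes into Y.
north = motion_step(math.pi / 2, 0.0, 0.0, 10.0, 0.0, 1.0)
```

The model has no elevation term, which is why the text that follows turns to road gradient.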
However, as shown in Fig 8, when the road has a gradient, part of the gravity acting on the total weight of the vehicle and its passengers is decomposed into a resistance that opposes the vehicle’s forward motion. As shown in equation (14), the output power P is the product of driving force and speed, so when P is held constant and the resistance increases, the driving force must increase to overcome it. According to Newton’s second law, the driving force is proportional to the acceleration (a) and mass (m), as shown in equation (15). Increasing the driving force at constant power therefore comes at the cost of a lower acceleration. The speed of the vehicle is proportional to the product of acceleration and time, as shown in equation (16), so as the acceleration decreases, the vehicle speed decreases. An uphill position therefore affects vehicle speed, especially for large loaded vehicles.\begin{align*}P&=F_{drive}\ast v \tag{14}\\ F_{drive}&=m\ast a \tag{15}\\ v&=a\ast t \tag{16}\end{align*}
Edata can only yield the segment speed of a vehicle, computed with the vehicle kinematics formula from the segment distance and the times at which the vehicle enters and leaves the segment. The segment speed, however, cannot reflect the real-time speed of the vehicle. For example, if the first half of a segment is uphill and the second half is downhill, the vehicle is likely to be slower on the uphill portion than on the downhill portion. Because the uphill and downhill effects offset each other, the speed over the whole segment may be similar to that on a flat segment. The segment speed can therefore accurately reflect only the time at which the vehicle arrives at the rear gantry of the segment, not how the vehicle’s position changes within the segment. This paper therefore combines road structure characteristics with the vehicle segment speed to jointly map how the vehicle position changes within the segment. The overall construction process of the road model is shown in Fig 9.
1) Detection of Abnormal Road Elevation Data Based on Box Graph
There are many tunnels on the highways in Fujian Province, and GPS signals are susceptible to interference, so the road elevation values in the raw data are inconsistent. A common error is that several different elevation values are recorded at the same location. To reduce the data anomalies that distort the road model, the box plot method is used to screen the elevation data recorded at the same location for outliers.
A box plot describes data using five statistics: the upper bound (maximum value in the non-anomalous range), upper quartile, median, lower quartile, and lower bound (minimum value in the non-anomalous range). In this paper, it serves as an adaptive threshold for identifying outliers. The basic principle is to sort the data in ascending order, compute the quartiles, and flag as outliers the values K that satisfy equations (17)–(18), where L_{1} and L_{3} are the lower and upper quartiles:\begin{align*}K&\le L_{1} -1.5(L_{3}-L_{1}) \tag{17}\\ K&\ge L_{3} +1.5(L_{3}-L_{1}) \tag{18}\end{align*}
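The outlier rule in equations (17)-(18) can be sketched as a small filter applied to the elevation values recorded at one location. The function name and the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) are assumptions for illustration; the paper does not state which quartile convention it uses.

```python
import statistics

def boxplot_filter(values):
    """Keep values strictly inside (L1 - 1.5*IQR, L3 + 1.5*IQR)."""
    l1, _, l3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = l3 - l1
    low, high = l1 - 1.5 * iqr, l3 + 1.5 * iqr
    # Values with K <= low or K >= high match equations (17)-(18) and are dropped.
    return [v for v in values if low < v < high]

# Elevations (m) logged at one location; 250 is a drifted GPS reading.
cleaned = boxplot_filter([100, 101, 102, 103, 104, 250])
```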
2) Moving Average Wavelet Smoothing Road Model
Based on the data after outlier cleaning, the moving average method is used to slice the road according to 100m as the minimum unit, that is, taking 100m as an interval, and taking the average value of the elevation information of vehicles in this interval as the elevation data of the starting point of this interval. For example, the average value of the [100,200] interval is used as the elevation information at 100m, and so on. Then, the linear interpolation method is used to interpolate the missing elevation information in the middle of the data after taking the average value of the moving interval. However, there are local uneven points in the interpolated data, which is inconsistent with the actual road elevation of the highway. Therefore, this paper uses wavelet transform to smooth the above road data.
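The binning and interpolation steps described above might look like the following sketch: elevations are averaged per 100 m interval, the average is assigned to the start of the interval, and empty intervals between two known ones are filled linearly. The function name and the `(distance, elevation)` input format are hypothetical, and leaving leading or trailing empty intervals as `None` is an assumption.

```python
def bin_and_interpolate(samples, length_m, step=100):
    """Average elevation per `step`-metre bin, then fill inner gaps linearly.

    samples: iterable of (distance_m, elevation_m) pairs.
    Returns a list whose k-th entry is the elevation assigned to k * step.
    """
    n_bins = int(length_m // step)
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for dist, elev in samples:
        k = int(dist // step)
        if 0 <= k < n_bins:
            sums[k] += elev
            counts[k] += 1
    profile = [s / c if c else None for s, c in zip(sums, counts)]
    known = [k for k, v in enumerate(profile) if v is not None]
    for left, right in zip(known, known[1:]):
        for k in range(left + 1, right):        # linear interpolation in gaps
            w = (k - left) / (right - left)
            profile[k] = profile[left] + w * (profile[right] - profile[left])
    return profile

profile = bin_and_interpolate(
    [(10, 50.0), (60, 52.0), (320, 60.0), (350, 62.0)], length_m=400)
```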
The wavelet transform [46] is a commonly used signal processing method that can be applied to smooth data. Wavelet functions are localized in both the time and frequency domains and can provide information at different time scales and frequency ranges. The multi-scale analysis property of the wavelet transform is used here to reduce noise in the data. By selecting an appropriate wavelet function and number of decomposition levels, high-frequency noise components can be filtered out while the smooth low-frequency trend is retained, thereby smoothing the data.
Let \begin{equation*}\text {WT}_{x} (a,b)=\int x (t) \psi _{a,b}(t)dt \tag{19}\end{equation*}
In the figure,\begin{align*}c_{j+1}(z)&=\sum _{m\in \mathbb {Z} }^{} c_{j}(m) F_{0}(m-2z) \tag{20}\\ d _{j+1}(z)&=\sum _{m\in \mathbb {Z} }^{} c_{j}(m) F_{1}(m-2z) \tag{21}\end{align*}
The elevation information is reconstructed as follows. The approximation signal is stretched by inserting a zero between every two samples (upsampling) and then passed through a low-pass filter to obtain the large-scale, low-resolution approximation, i.e. the low-pass output. The detail signal is likewise upsampled and passed through a high-pass filter to obtain the high-pass output, and adding the two outputs gives the reconstructed signal x(z). The signal reconstruction formula is shown in equation (22):\begin{equation*}c_{j}(z)=\sum _{m\in \mathbb {Z} }^{} c_{j+1}(m) F_{0}(z-2m)+ d_{j+1}(m)F_{1}(z-2m) \tag{22}\end{equation*}
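To make the decomposition and reconstruction steps concrete, the sketch below performs one level of the transform with the Haar wavelet and smooths the signal by zeroing small detail coefficients before reconstruction. The choice of the Haar filters, the single decomposition level, and the hard threshold are assumptions for illustration; the paper does not specify them here.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_decompose(signal):
    """One decomposition level with Haar filters; len(signal) must be even."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / SQRT2 for a, b in pairs]   # low-pass, then downsample
    detail = [(a - b) / SQRT2 for a, b in pairs]   # high-pass, then downsample
    return approx, detail

def haar_reconstruct(approx, detail):
    """Inverse step: upsample both branches, filter, and add them."""
    out = []
    for c, d in zip(approx, detail):
        out += [(c + d) / SQRT2, (c - d) / SQRT2]
    return out

def haar_smooth(signal, threshold):
    """Zero detail coefficients at or below the threshold, then reconstruct."""
    approx, detail = haar_decompose(signal)
    detail = [d if abs(d) > threshold else 0.0 for d in detail]
    return haar_reconstruct(approx, detail)

elevation = [10.0, 10.4, 12.0, 11.6]
roundtrip = haar_reconstruct(*haar_decompose(elevation))
smoothed = haar_smooth(elevation, threshold=1.0)
```

With no thresholding the round trip returns the input, which is a quick check that the analysis and synthesis steps are consistent.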
D. Vehicle Position Estimation Model Based on Spatio-Temporal Smoothing DLCNN-LSTM-Attention With L1 Regularization
Based on multi-dimensional features such as vehicle segment speed, road characteristics, short-term driving style, and the time the vehicle has spent in the segment, this section proposes the DLCNN-LSTM-Attention method based on L1 regularization for predicting the position of vehicles on the road [47], [48]. However, the distance between the vehicle and the front gantry of the current segment is positively correlated with time and is therefore non-stationary. Non-stationary data seriously degrades the fitting performance of the model. Therefore, before estimating the vehicle position, this article stabilizes the vehicle position data in both the temporal and spatial dimensions.
1) Smoothing Module for the Time Dimension of Vehicle Positioning Data
The original trajectory of a vehicle is usually composed of a series of trajectory points, which can be represented as
Fig 11 shows that the most frequent positioning time interval is 15s, followed by 20s and then 30s, and that most neighboring points are separated by short intervals. Because highway road conditions are good and traffic density is low, the vehicle speed does not change abruptly during driving except in unexpected situations. Based on this characteristic, this paper cuts the highway segment into multiple subintervals with GPS positioning points as the start and end points, and treats the vehicle’s speed within each subinterval as approximately uniform.
As shown in Fig 12, the relationship between the distance of the vehicle from the start of the subinterval and time can be viewed as linear within each subinterval. The positional relationship between \begin{equation*} p_{i}=p_{i-1}+(t_{i}-t_{i-1})v_{i-1} \tag{23}\end{equation*}
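Equation (23) amounts to accumulating distance at the speed held over each subinterval. A minimal sketch, with a hypothetical function name and with times in seconds and speeds in m/s assumed as units:

```python
def propagate_positions(p0, times, speeds):
    """Positions from p_i = p_{i-1} + (t_i - t_{i-1}) * v_{i-1}, equation (23)."""
    positions = [p0]
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        positions.append(positions[-1] + dt * speeds[i - 1])
    return positions

# GPS fixes every 15 s; the speed at each fix is held until the next fix.
positions = propagate_positions(0.0, [0, 15, 30], [20.0, 25.0, 25.0])
```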
2) Smoothing Module for Spatial Dimension of Vehicle Positioning Data
Because the vehicle is traveling through the segment, its distance from the front gantry of the segment is necessarily positively correlated with its travel time in the segment, which makes the vehicle position data non-stationary in the spatial dimension. The usual treatment is to difference the data so that it becomes stationary. As for the Edata, the difference operation will result in the loss of the initial position
After first-order inverse differential smoothing, the distance between the vehicle and the gantry is mapped as the amount of change in vehicle position at adjacent moments
3) DLCNN-LSTM-Attention Fusion Module Based on L1 Regularization
Based on the spatiotemporal stabilized data, we defined a vehicle position estimation model using DLCNN-LSTM-ATTENTION fusion module based on L1 regularization to achieve vehicle position estimation. The overall structure of the model is shown in Fig 14.
Schematic structure of DLCNN-LSTM-ATTENTION model based on L1 regularization optimization.
More input features do not always make a better model; their number must be balanced against the complexity of the problem, the characteristics of the data, and the performance of the model. As the feature dimensionality grows, samples become sparse in the high-dimensional space, and more data may be required to keep the model reliable. High-dimensional data also tends to increase computational complexity. With too many features, the model may become overly complex and fit the training data closely while generalizing poorly to test data. Such a model performs well on training data but may not perform well in real-world applications. Conversely, choosing an appropriate number and type of features leads to a simpler model with better computational efficiency and generalization performance. For example, when highway traffic flow is low, it has little effect on vehicle speed and position changes, and including traffic flow features may then only increase model complexity and the risk of overfitting. Therefore, before regression prediction, this paper uses L1 regularization to select among the constructed features. By adding an L1-norm penalty term, L1 regularization selects features and makes the model sparse, which makes the model more concise and interpretable and helps to prevent overfitting.
Lasso is a linear regression method that incorporates L1 regularization. It obtains the parameters of a classification or regression model by minimizing the empirical error plus an L1 penalty, as shown in equation (24):\begin{equation*} \bar {\beta } =\text {argmin}_{\beta }\left \|{ Y-X\beta }\right \| ^{2} +\lambda \sum _{i=1}^{n}\left |{ \beta _{i} }\right | \tag{24}\end{equation*}
The regularization term is the L1 norm, and the regularization parameter (λ in equation (24), referred to below as alpha) controls the strength of this term. When alpha is small, the model tends to keep the coefficients of all features and has a stronger fitting ability, but overfitting may occur, especially when there are many features. When alpha is large, the model drives the coefficients of some features to zero more strongly, which realizes feature selection and makes the model sparser. This reduces model complexity and improves generalization, but may also cost some predictive performance. The regularization parameter is therefore crucial for feature selection, and choosing it by experience alone is not well founded. To achieve reasonable feature selection, this paper uses cross-validation to tune the regularization parameter of the Lasso model.
First, we define a two-layer CNN containing convolution and pooling operations to extract local features from the input data. We use multiple convolutional kernels and the ReLU activation function, shown in equation (25), to capture nonlinear relationships in the input data. Max pooling then reduces the dimensionality of the features, which limits the impact of non-essential features on the model fit.\begin{align*} \text {ReLU}(x)=\begin{cases}\displaystyle x, & x> 0 \\ \displaystyle 0,& x \le 0 \end{cases} \tag{25}\end{align*}
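The effect of alpha on sparsity can be illustrated with a small coordinate-descent Lasso. The sketch assumes the common scaling in which the squared error is divided by 2n before the penalty alpha times the L1 norm is added, so its alpha is not numerically identical to the parameter in equation (24); the function names and the toy data are hypothetical, and cross-validation over a grid of alpha values is left out for brevity.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, alpha, n_iter=200):
    """Minimize (1/2n)*||y - X b||^2 + alpha*||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, alpha) / z
    return beta

# Toy data: the target depends only on the first feature (y = 2 * x1).
X = [[1, 1], [2, -1], [3, 1], [4, -1]]
y = [2, 4, 6, 8]
weak = lasso_cd(X, y, alpha=0.001)    # almost unregularized
medium = lasso_cd(X, y, alpha=2.0)    # irrelevant feature removed
strong = lasso_cd(X, y, alpha=20.0)   # every coefficient driven to zero
```

Features whose coefficient is exactly zero at the chosen alpha would be dropped before the regression model is trained.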
The double-layer CNN can process multiple input features simultaneously, such as vehicle speed, traffic flow, and vehicle type. The first convolution layer detects the local pattern of each feature and captures the relevant information of different features. The second convolution layer fuses and abstracts these multimodal features to better represent the multidimensional relationships in the data. The double-layer CNN can also capture spatio-temporal relationships: the first convolution layer extracts local features of the data in time and space, while the second further abstracts these spatio-temporal patterns. Vehicle speed, traffic flow, and vehicle type may be related in complex ways. For example, the vehicle speed at a certain time may depend on both traffic flow and vehicle type. The second convolution layer lets these features interact and fuse, so that higher-level data patterns can be captured.
Next, we introduce an LSTM layer for processing sequence data with temporal dependencies [49]. The LSTM layer has memory cells and gating mechanisms to better capture and utilize long-term dependencies in sequence data. We use 64 hidden cells and set them to return the complete output sequence to preserve the information of the sequence. The details of each computation of LSTM are as follows:
The forget gate can be described as shown in equation (26).\begin{equation*} f_{t}= \text {sigmoid}(w_{f} [h_{t-1},x_{t}]+b_{f}) \tag{26}\end{equation*}
The input gate can be described as equations (27)–(28).\begin{align*}i_{t}&= \text {sigmoid}(w_{i} [h_{t-1},x_{t}]+b_{i}) \tag{27}\\ \tilde {C} _{t}&= \text {tanh}(w_{c} [h_{t-1},x_{t}]+b_{c}) \tag{28}\end{align*}
The cell state (information transmission) is updated as in equation (29).\begin{equation*} {C}_{t}={f}_{t}{C}_{t-1}+{i}_{t}\tilde {C}_{t} \tag{29}\end{equation*}
The output gate and hidden state are given by equations (30)–(31).\begin{align*}{O}_{t}&=\text {sigmoid}({w}_{o} \left [{{h}_{t-1},{x}_{t} }\right] +{b}_{o}) \tag{30}\\ {h}_{t}&={O}_{t}\ast \text {tanh}\left ({C_{t} }\right) \tag{31}\end{align*}
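One time step of the gate computations in equations (26)-(31) can be traced with a scalar sketch, in which the cell state mixes the previous state with the candidate state from equation (28). Scalar inputs and states, the dictionary layout of the weights, and the function names are simplifying assumptions for illustration; the actual layer uses 64 hidden units and learned weight matrices.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, w, b):
    """One scalar LSTM step.

    w[g] = (weight on h_prev, weight on x_t) and b[g] is the bias
    for each gate g in {"f", "i", "c", "o"}.
    """
    def pre(g):
        return w[g][0] * h_prev + w[g][1] * x_t + b[g]

    f_t = sigmoid(pre("f"))                # forget gate, equation (26)
    i_t = sigmoid(pre("i"))                # input gate, equation (27)
    c_tilde = math.tanh(pre("c"))          # candidate state, equation (28)
    c_t = f_t * c_prev + i_t * c_tilde     # cell state, equation (29)
    o_t = sigmoid(pre("o"))                # output gate, equation (30)
    h_t = o_t * math.tanh(c_t)             # hidden state, equation (31)
    return h_t, c_t

# With all weights and biases at zero, every gate is 0.5 and the candidate is 0.
zero_w = {g: (0.0, 0.0) for g in "fico"}
zero_b = {g: 0.0 for g in "fico"}
h_t, c_t = lstm_step(x_t=1.0, h_prev=0.0, c_prev=2.0, w=zero_w, b=zero_b)
```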
Compared with the RNN model, LSTM adds several gates, in particular the forget gate, which filters the information carried over from the previous time step so that key information is retained and unimportant information is forgotten. This is what allows LSTM to overcome the vanishing gradient problem. However, LSTM can still lose important information when the input sequence is too long, so a CNN is used to process the original data and filter out part of the unimportant information in order to improve prediction accuracy.
To further improve the model performance, we introduce an attention mechanism to enhance the attention to the input sequence. By applying the attention mechanism on the output of the LSTM layer, we can adaptively learn the importance weights for each time step. By generating query vectors and value vectors and computing the attention scores between them, we can obtain the attention weights. Then, by weighting and summing the attention weights with the value vectors, we obtain a context vector that weights and averages the value vectors of different time steps to capture the information of the time steps with higher attention.
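The weighting and summing described above can be sketched with dot-product attention over the LSTM outputs. The scoring function (a plain dot product between a query vector and each time step's output) and the function name are assumptions for illustration; the paper does not state which scoring function it uses.

```python
import math

def attention_context(hidden_states, query):
    """Dot-product attention over time steps; returns (weights, context)."""
    scores = [sum(q * h for q, h in zip(query, hs)) for hs in hidden_states]
    m = max(scores)                               # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]           # softmax over time steps
    dim = len(hidden_states[0])
    context = [sum(w * hs[d] for w, hs in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context

states = [[1.0, 0.0], [0.0, 1.0]]                 # two time steps, two units
uniform_w, uniform_ctx = attention_context(states, [0.0, 0.0])
focused_w, focused_ctx = attention_context(states, [10.0, 0.0])
```

A zero query gives equal weights, while a query aligned with the first time step concentrates almost all weight on it.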
Finally, to combine the output of the LSTM layer with the context vectors, we concatenate them. The Concatenate layer yields a combined feature vector that contains the raw output of the LSTM layer as well as the key information from the attention mechanism. This vector is flattened into one dimension and passed through a fully connected layer for the final regression prediction. The model is trained with the Adam optimizer, using the mean square error as the loss function.
The attention mechanism will calculate the attention weight of each input part according to the change rule of vehicle position combined with the current input and task requirements, i.e., to determine the importance of each part for the current task, and to weigh different parts, so that the model can pay more attention to the important information. Thus, the information related to the current task is strengthened. This can help the model to pay better attention to important features and improve the performance of the model when dealing with complex tasks. In addition, this dynamic weight allocation mechanism enables the model to capture the features related to the task more accurately, thus improving the model’s performance and generalization ability. Thus, by combining these three models, the DLCNN-LSTM-ATTENTION model can better understand the data and improve prediction accuracy.
Experimental Analysis
A. Description of Experimental Material
The experimental data comprise the ETC transaction data of the Fujian Provincial Expressway from September 3 to September 5, 2020, and the Gdata of “two passenger and one hazardous” vehicles. Edata contains information such as vehicle number and transaction time, as shown in Table 2. Gdata contains information such as the real-time speed, longitude, latitude, and direction angle of the vehicle, as shown in Table 3.
To verify the generalization ability of the model under different road features, the experimental road segments were selected as the G3 Ningde Gutian to Fuzhou Minhou segment, the G15 Fuqing Jiangyin Port to Putian segment, and the G76 Fujian Zhangzhou to Longyan segment. Among them, the segment from Jiangyin Port to Putian is a coastal segment, while the segments from Ningde Gutian to Fuzhou Minhou and from Zhangzhou to Longyan are mountainous segments. The three segments are all of the same type with large traffic flow and can reflect the full-sample traffic conditions of the highway. The experimental segments are shown in Fig 15.
B. Setting of Network Parameters and Selection of Evaluation Indicators
1) The Setting of Network Parameters
In the comparison experiments of different prediction models, the choice of network parameters largely determines the performance of the algorithm. In our study, the parameters involved in each experiment are chosen according to the actual situation of this study by referring to a large number of literature or using empirical and trial-and-error methods. The network parameter settings for each model in this method are shown in Table 4.
2) Selection of Evaluation Indicators
Appropriate evaluation indexes can accurately and intuitively reflect the prediction effect of the model and compare the predicted value of the model with the real value of the dataset to quantify the performance of the model. The evaluation metrics are selected as mean absolute error (MAE), root mean square error (RMSE), Mean Absolute Percentage Error (MAPE), and (coefficient of determination)
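The four indicators named above follow their standard definitions, which can be sketched as below. The function name and the toy values are hypothetical; MAPE is reported in percent and requires non-zero true values.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (in percent), and R^2 for paired values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}

metrics = regression_metrics([100.0, 80.0, 60.0], [90.0, 80.0, 66.0])
```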
C. Construction of Short-Term Driving Style Features
To capture the change rule of vehicle position more accurately, this paper first classifies the short-term driving style of vehicles according to their driving characteristics in the historical segments of the current trip. The construction results are shown in Fig 16(a) to Fig 16(d). In each segment, the model subdivides the driving style of the vehicles into categories according to traffic flow, speed, and vehicle type. In segment 1, the model classifies the target vehicles into 4 categories based on the characteristic variables observed while the vehicles are traveling; in segment 2, into 2 categories; in segment 3, into 2 categories; and in segment 4, into 10 categories.
It can be seen that there are some differences in the driving characteristics of vehicles in each category in each segment. For example, as shown in Fig 16(b), category 1 vehicles indicate the passing speed characteristics of vehicles when the traffic flow is large; category 0 can be seen as the driving characteristics of vehicles when the traffic flow is small. Therefore, the short-term driving style reflects, to a certain extent, the driving pattern of vehicles in different driving environments, which provides an important estimation basis for the subsequent vehicle segment speed prediction and vehicle position estimation.
D. Experimental Analysis of Vehicle Segment Speed Characterization Construction
The inputs to the model contain
As shown in Fig 17(a) to Fig 17(d), the predicted speed trends in each segment are generally consistent with the actual speed trends. In segment 1, the mean absolute error, root mean square error, coefficient of determination, and mean absolute percentage error of our prediction results are 2.83, 12.93, 0.89, and 12.77%, respectively. In segment 2, they are 2.30, 9.48, 0.85, and 11.88%, respectively. In segment 3, they are 2.98, 13.30, 0.84, and 12.43%, respectively. In segment 4, they are 3.08, 17.52, 0.78, and 11.30%, respectively. These indicators suggest that when road traffic conditions are good, drivers rely more on their own driving habits; because our model takes the short-term driving style into account, it can better capture the dynamic changes in the speed pattern.
Secondly, the proposed model is subjected to ablation experiments on the short-term driving style, and the results are shown in Table 6. Without the short-term driving style, the mean absolute error, root mean square error, coefficient of determination, and mean absolute percentage error of our model are 3.19, 15.27, 0.85, and 15.72%, respectively, in segment 1; 2.05, 8.67, 0.86, and 11.62% in segment 2; 3.23, 13.54, 0.84, and 13.72% in segment 3; and 3.31, 18.80, 0.77, and 11.40% in segment 4.
In segment 1, segment 3, and segment 4, the model that ignores the short-term driving style performs somewhat worse than the model that considers it. In segment 2, the model that ignores the short-term driving style performs slightly better, but the margin is small. Taken over all four segments, the comparison shows that including short-term driving style helps the model explain the variation in the data and predict the target value more accurately.
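The four indicators reported above can be computed from paired actual and predicted speeds. The following is a minimal sketch, not the authors' evaluation code: the function name `evaluate` is hypothetical, and it assumes the correlation coefficient is the Pearson coefficient (the paper may instead use the coefficient of determination) and that MAPE is expressed in percent.

```python
import math

def evaluate(actual, predicted):
    """Return MAE, RMSE, Pearson correlation (R), and MAPE in percent."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE assumes no actual value is zero.
    mape = 100.0 * sum(abs(e) / abs(a) for a, e in zip(actual, errors)) / n
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_a * var_p)
    return {"MAE": mae, "RMSE": rmse, "R": r, "MAPE": mape}

# Illustrative (made-up) segment speeds, not data from the paper.
metrics = evaluate([100, 110, 120, 130], [102, 108, 123, 129])
```

Note that MAE and RMSE carry the unit of the predicted quantity, whereas the correlation coefficient is unitless and MAPE is a percentage.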
E. Construction of Road Slope Characteristics
Fig 18 shows the road structure constructed by the road model, which combines box-plot anomaly detection with moving average-wavelet smoothing. Fig 18(a), 18(c), 18(e), and 18(g) show the original road elevation information in Gdata, and Fig 18(b), 18(d), 18(f), and 18(h) show the road structure after moving average-wavelet smoothing. As can be seen from the figures, the original road elevation information is disorganized, and the heights recorded at the same geographic location are inconsistent, which makes it impossible to obtain an effective road structure directly. The contour of the road model constructed by moving average-wavelet smoothing is similar to that of the road structure formed by stacking the elevation information in the original data, which indicates that the constructed slope is similar to the actual slope. In this paper, the main road feature of interest is the effect of slope. Therefore, only the slope of the constructed road model is considered.
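To make the first two stages of this pipeline concrete, the following is a rough sketch of box-plot outlier removal followed by a moving average. It is an illustration under stated assumptions rather than the authors' implementation: the 1.5 × IQR fence and the window size are conventional defaults that the paper does not specify, the function names are hypothetical, and the wavelet smoothing stage is omitted.

```python
import statistics

def remove_boxplot_outliers(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR] (the usual box-plot rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

def moving_average(values, window=3):
    """Centered moving average; the window shrinks at both ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Made-up elevations (m) along a segment; 300 plays the role of a drift point.
elevations = [50, 51, 52, 53, 54, 55, 56, 300]
cleaned = remove_boxplot_outliers(elevations)
profile = moving_average(cleaned, window=3)
```

In this toy series the drift point is dropped before smoothing, so it cannot distort the averaged elevation profile.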
Because the vehicle elevation information is cluttered and contains GPS trajectory drift points, drift points that are concentrated in a small interval of a segment are eliminated, which leaves the elevation information sparse in some intervals. This may lead to a large gap between the slope derived from the model and the actual slope. To limit the influence of the few slope distortions caused by data sparsity, this paper applies min-max normalization to the slope values: uphill segments are mapped to [0, 1] and downhill segments to [−1, 0]. This suppresses the influence of abnormal slopes as far as possible and improves the noise resistance of the model. The effectiveness of the slope features constructed in this way is further verified in the subsequent vehicle position estimation ablation experiments.
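The signed normalization described above can be sketched as follows. This is one plausible reading, not the paper's stated formula: it assumes that uphill and downhill slopes are scaled separately, each with its minimum magnitude fixed at zero, so that a flat road stays at 0. The function name is hypothetical.

```python
def normalize_slopes(slopes):
    """Scale uphill slopes into [0, 1] and downhill slopes into [-1, 0]."""
    max_up = max((s for s in slopes if s > 0), default=0.0)
    max_down = max((-s for s in slopes if s < 0), default=0.0)
    normalized = []
    for s in slopes:
        if s > 0:
            normalized.append(s / max_up)      # steepest uphill maps to 1
        elif s < 0:
            normalized.append(s / max_down)    # steepest downhill maps to -1
        else:
            normalized.append(0.0)             # flat road stays at 0
    return normalized

# Made-up slope values in percent.
scaled = normalize_slopes([2.0, 4.0, 0.0, -1.0, -5.0])
```

Because every value is bounded by the steepest slope of the same sign, a single distorted slope can shift the scale but cannot push any feature value outside [−1, 1].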
F. Experimental Analysis of Feature Selection
To improve the efficiency of the model and reduce problems such as overfitting, feature selection is performed using L1 regularization on the predicted vehicle segment speeds, roadway characteristics, and vehicle-based driving characteristics. The results of the L1 regularization feature selection for each segment are shown in Table 7. In segment 1, five features are finally selected as inputs to the model, which are
Lasso tends to produce a sparse model, i.e., only a small number of
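As an illustration of how L1 regularization yields a sparse set of features, the following sketch fits a Lasso model by cyclic coordinate descent and treats features with non-zero weights as selected. It is a toy implementation under assumptions: the regularization strength `alpha`, the iteration count, and the data are hypothetical, and the columns of `X` and the target `y` are assumed to be centered and non-constant.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iter):
        for j in range(d):
            # Correlation of feature j with the residual that excludes feature j.
            rho = 0.0
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            rho /= n
            w[j] = soft_threshold(rho, alpha) / col_sq[j]
    return w

# Toy data: y depends only on the first feature (y = 3 * x0).
X = [[-2, 1], [-1, -2], [0, 0], [1, 2], [2, -1]]
y = [-6, -3, 0, 3, 6]
weights = lasso_coordinate_descent(X, y, alpha=0.5)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
```

In this toy case the weight of the irrelevant second feature is driven exactly to zero, so only the first feature is selected, while the weight of the first feature is shrunk slightly below its unregularized value of 3.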
G. Experimental Analysis of Spatio-Temporal Data Stabilization
To verify whether the data after spatiotemporal stabilization meet the stationarity requirements, the ADF test is performed on the data. The results are shown in Fig 19, where each subplot has four parts corresponding to the indicator values of the four segments. The left side of each part shows the indicator values obtained from the original data, and the right side shows those obtained from the stabilized data. As can be seen from Fig 19(a), after stabilization the ADF statistics of segment 1, segment 2, and segment 4 are smaller than the critical values, and the ADF statistic of segment 3 is also substantially lower than that of the original data. As can be seen from Fig 19(b), the p-values are significantly reduced after stabilization, and the p-values of segment 2 and segment 4 fall below the significance level, which indicates that the data of segment 2 and segment 4 are stationary after processing. Although the p-values of segment 1 and segment 3 remain above the significance level, the number of their stationary trajectories increases substantially after stabilization, as shown in Fig 19(c) and Fig 19(d). To retain the information entropy of the data to the greatest extent, the data are not processed further in this paper.
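The decision rule behind this test can be illustrated with a simplified Dickey-Fuller statistic. The sketch below is not the full ADF test used in the paper: it omits the lagged difference terms, includes only a constant, and compares the statistic against the asymptotic 5% critical value of about −2.86. In practice a library routine such as `adfuller` from `statsmodels.tsa.stattools` would normally be used. The function names and the synthetic series are hypothetical.

```python
import math
import random

def dickey_fuller_stat(series):
    """t-statistic of rho in  dy_t = a + rho * y_{t-1} + e_t  (no lagged differences)."""
    x = series[:-1]
    z = [b - a for a, b in zip(series[:-1], series[1:])]
    m = len(x)
    mean_x, mean_z = sum(x) / m, sum(z) / m
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    rho = sxz / sxx
    intercept = mean_z - rho * mean_x
    ssr = sum((zi - intercept - rho * xi) ** 2 for xi, zi in zip(x, z))
    std_err = math.sqrt(ssr / (m - 2) / sxx)
    return rho / std_err

CRITICAL_5PCT = -2.86  # asymptotic 5% critical value, constant-only case

def is_stationary(series, critical=CRITICAL_5PCT):
    """Reject the unit-root hypothesis when the statistic is below the critical value."""
    return dickey_fuller_stat(series) < critical

# Synthetic series: white noise (stationary) and its running sum (a random walk).
rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(300)]
walk = []
total = 0.0
for step in noise:
    total += step
    walk.append(total)
```

A statistic below the critical value, or equivalently a p-value below the significance level, rejects the unit-root hypothesis, which is the sense in which a segment's data are judged stationary above.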
H. Model Prediction Performance Comparison and Spatial Dimension Generalization Ability Assessment
This experiment aims to validate the prediction performance of the model. CNN, BiLSTM, DNN, CNN-Attention, LSTM-Attention, and TGCN are compared with the proposed model, DLCNN-LSTM-Attention. The recursive prediction distance is uniformly set to 2 km. The gantry at which the vehicle enters the target roadway is taken as the initial node, and the position of the vehicle at the next moment is predicted recursively. Fig 20(a) to Fig 20(d) compare the predictions of the proposed model in each segment with those of the other prediction models.
From Table 8, it can be seen that the values of MAE, RMSE, MAPE, and correlation coefficient
From the correlation coefficient
In the spatial dimension, the four segments are located in different parts of Fujian Province, from north to south and from east to west, and their altitudes range from tens of meters to hundreds of meters. Under 2 km recursive prediction in these four segments, the fit between the proposed model and the pattern of vehicle position change is above 0.98, the mean absolute error is less than 50, and the MAPE is 10% or less. This indicates that the model achieves a high degree of fit to vehicle position across the full-sample spatial environment of the highway and has good spatial generalization ability.
I. Ability to Generalize the Time Dimension
The traffic flow of the highway changes regularly with time, and the difference between the peak period and the idle period is large. When the traffic flow reaches a certain level, it limits vehicle speed. To verify the performance of the model in different driving environments, this paper regards 8:00 a.m. to 10:00 a.m. and 4:00 p.m. to 6:00 p.m. as the peak period of highway travel, and 5:00 a.m. to 7:00 a.m. and 7:00 p.m. to 9:00 p.m. as the idle period, and examines the model’s performance in each period.
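The period definitions above translate directly into a labeling rule for trip records. The following sketch assumes half-open windows (the start time is included and the end time is excluded) and labels everything else as "other"; both choices, and the function name, are assumptions rather than details given in the paper.

```python
from datetime import datetime

PEAK_WINDOWS = [(8, 10), (16, 18)]   # 8:00-10:00 a.m. and 4:00-6:00 p.m.
IDLE_WINDOWS = [(5, 7), (19, 21)]    # 5:00-7:00 a.m. and 7:00-9:00 p.m.

def classify_period(timestamp):
    """Label a timestamp as 'peak', 'idle', or 'other' by time of day."""
    hour = timestamp.hour + timestamp.minute / 60
    if any(start <= hour < end for start, end in PEAK_WINDOWS):
        return "peak"
    if any(start <= hour < end for start, end in IDLE_WINDOWS):
        return "idle"
    return "other"

# Hypothetical gantry passing time.
label = classify_period(datetime(2023, 5, 1, 8, 30))
```

Grouping the test trajectories by this label allows the error indicators to be computed separately for the peak and idle periods.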
As can be seen from Fig 21(a) to Fig 21(d), the MAE, RMSE, and MAPE of the model are all larger during the peak period than during the idle period. This is because more vehicles travel during the peak period, and the resulting high traffic flow inevitably restricts vehicle speed. When the traffic flow is saturated and a normally traveling vehicle encounters a low-speed vehicle while a vehicle in the adjacent lane is following closely behind, it may be unable to change lanes in time and has to slow down. As a result, speed fluctuates strongly under saturated traffic flow and vehicle position changes are not smooth, which reduces the predictive performance of the model.
The
J. Multi-Step Estimation Capacity Assessment
This experiment aims to test the multi-step estimation capability of the proposed model. The evaluation results are shown in Table 9. Among the estimation distances from 2 km to 5 km, the MAE at 2 km is the smallest and the MAE at 5 km is the largest. Overall, the MAE increases with the estimation distance, and the RMSE follows the same pattern. This indicates that the prediction accuracy of the model gradually decreases as the distance increases. The reason is that, due to the limitation of Edata, this paper predicts vehicle position recursively, taking the output of the model as the input of the next estimation step. The error of each step therefore propagates into the next, so the estimation error accumulates as the number of steps increases. Over a long estimation horizon the error may keep amplifying, which can affect the accuracy and stability of the estimation.
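The accumulation of error under recursive estimation can be illustrated with a toy rollout. The sketch below is not the proposed model: the one-step functions are hypothetical stand-ins in which the true vehicle advances 100 m per step and the estimator overshoots by 2 m per step, and the real model also takes road and driving-style features as input.

```python
def recursive_estimate(step_model, start_position, n_steps):
    """Roll a one-step model forward, feeding each output back in as the next input."""
    positions = [start_position]
    for _ in range(n_steps):
        positions.append(step_model(positions[-1]))
    return positions

def true_step(position):
    return position + 100.0   # hypothetical: vehicle advances 100 m per step

def biased_step(position):
    return position + 102.0   # hypothetical estimator: overshoots by 2 m per step

truth = recursive_estimate(true_step, 0.0, 50)
estimate = recursive_estimate(biased_step, 0.0, 50)
abs_error = [abs(e - t) for e, t in zip(estimate, truth)]
```

In this toy case the absolute error grows linearly with the number of steps because each step inherits the error of the previous one. A learned model may accumulate error faster or slower than this, depending on how its one-step errors interact.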
Although the gap between the estimated and true vehicle positions grows with the number of estimation steps, the MAPE, which measures the mean absolute percentage error between the estimated and true values, decreases as the estimation distance grows. This indicates that the ratio of estimation error to estimation distance decreases to a certain extent as the estimation distance grows, and that the model performs well in capturing the long-term regularity of vehicle position changes on the highway. In addition, from the value of the correlation coefficient
K. Model Ablation Experiments
To test the effect of each module on location estimation in the proposed highway in-transit vehicle location estimation method considering road characteristics and short-term driving style, ablation experiments are conducted on the method in segment 3. The role of each module in the estimation performance is revealed by removing different modules step by step, using the control variable method. The experimental results are shown in Table 10.
It can be seen that removing either the slope or the vehicle segment speed degrades the performance of the model to some extent. However, the performance improvement due to slope is smaller than that due to segment speed. The main reason is that the dominant factor affecting the change of vehicle position is the traveling speed: the faster the vehicle, the greater its position change per unit time. In addition, the segment where the ablation experiments were conducted is located on the southeast coast and has no substantial grades. On gentle uphill and downhill slopes, the vehicle owner adjusts the output power of the vehicle to offset the influence of the slope, which weakens the influence of slope in the model.
In addition, when the modules of the DLCNN-LSTM-Attention fusion model are used individually for vehicle position estimation, the MAE of DLCNN is 50.28, the MAE of LSTM is 46.42, and the MAE of Attention is 82.38. This indicates that DLCNN and LSTM outperform Attention in terms of the mean absolute error between predicted and true values. The RMSE of DLCNN is 152.57, the RMSE of LSTM is 68.26, and the RMSE of Attention is 116.01. The relatively low RMSE of LSTM indicates that the LSTM model performs better in terms of overall prediction error. Finally, the
For the pairwise combinations of the modules, the MAE of the DLCNN-LSTM model is 82.27, and the RMSE is 116.16; the MAE of the DLCNN-Attention model is 53.61, and the RMSE is 79.6; and the MAE of the LSTM-Attention model is 81.44, and the RMSE is 114.43. The
Conclusion and Outlook
This paper builds on the ETC equipment that has been deployed at scale on highways. Because ETC gantries are widely spaced and cannot accurately perceive the real driving state of vehicles inside a segment, Gdata is used to construct the pattern of vehicle position change inside the segment and the road structural features, and Edata is used to construct the historical driving characteristics and short-term driving style of vehicles and then to predict the speed at which vehicles pass through the segment. On this basis, a highway in-transit vehicle position estimation method considering road characteristics and short-term driving style is proposed. The model performance comparison experiments show that the proposed model better extracts the pattern of vehicle position change from multi-feature highway data and more accurately estimates the vehicle position in the highway segment.
In terms of estimation error, one of the practical application scenarios of this method is to assist intelligent driving on highways. The detection range of the vehicle lidar used for intelligent driving can currently exceed 100 m, so errors within 100 m can be corrected by the vehicle lidar. Therefore, the minimum detection range of the vehicle lidar is regarded as the maximum acceptable error in this paper. The experimental results in the spatial and temporal dimensions show that this method can accurately estimate vehicle position within an acceptable error range. The multi-step estimation capability evaluation experiments show that the model can realize long-range vehicle position estimation. Although the estimation error accumulates as the estimation distance increases, the error grows much more slowly than the estimation distance, which lays a solid foundation for the over-the-horizon estimation capability of the model. Finally, the ablation experiments also support the rationality of the model structure design.
In addition, because the research object of this paper is the estimation of vehicle position within a highway segment, interference from intersections was excluded from the experiments. In actual driving environments, intersections are common. When a vehicle encounters an intersection, the diversity of possible path choices means that its position cannot be estimated accurately by this method. Finally, on actual highways, the deployment intervals of ETC gantries vary from a few kilometers to more than ten kilometers, which also leads to the accumulation of vehicle position estimation errors in long-distance estimation with this model. Future research will consider the impact of vehicle owners’ path selection on vehicle position estimation and how to reduce the error accumulation of the recursive estimation method.