Introduction
Cooperative intelligent transportation technologies, including connected and automated vehicles (CAVs), offer the potential to revolutionize transportation systems by improving efficiency and safety [1]. Since driver error is estimated to be a contributing factor in 93% of crashes [2], CAVs may reduce crashes by mitigating the effects of operator error. CAVs utilize vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication to share position, speed, and other state information with surrounding vehicles and roadside units (RSUs). This communication is vital for the wider deployment of CAVs.
The operation of multiple CAVs on the road network will require resilient communication between vehicles and infrastructure. However, this dependence on communication also provides access points for attackers. While vulnerability to cyberattacks also exists in traditional transportation systems [3], the additional reliance on communication and automation may exacerbate the situation for CAVs and result in severe crashes and traffic instability. This makes it imperative to analyze the cyber risks of the CAV environment before mass deployment so that these applications can be designed to perform gracefully under cyberattacks. Most studies have analyzed cyber risks qualitatively [4], [5], [6], while some have analyzed CAVs quantitatively under cyberattack [3], [7], [8], [9], [10]. However, the literature has mainly analyzed a single platoon of CAVs on a single lane relying on V2V communication [7], [10]. The authors [11] analyzed the impact of cyberattacks on multiple CAV platoons under compromised merge advisories, and the lead author [12] assessed the energy impacts of cyberattacks and studied resilience under specific compromises. Furthermore, it is imperative to devise strategies for improving resilience to disruptions in both intra-vehicle and inter-vehicle communication. A system designed for resilience against cyberattacks should normalize their impact by detecting attacks on the compromised system and allowing it to fall back to a resilient state in which tasks continue to be performed even under attack. Recently, [13] investigated an adversarial noise removal technique based on a supervised deep learning approach. Another study [14] used pseudonym certificates to detect misbehavior and classify malicious activity. A third study [15] filtered extreme values received from neighboring vehicles to prevent CAVs from being misguided by malicious vehicles.
However, there is a lack of guidance on how to design systems that are resilient to cyberattacks, and the cyber risks associated with V2I-based applications in representative traffic that interacts with traffic on adjacent lanes have not been explored as extensively. This study enhances the authors' past work [11] and seeks to overcome these challenges by accomplishing the following objectives:
Develop a CAV-monitoring system architecture that monitors the state of the system to detect anomalous behavior and degrade the CAV system to a safe state of resilience.
Utilize a novel redundancy-based concept and machine learning based algorithms to detect anomalous behavior within the CAV environment and allow the system to operate resiliently using safe control policies upon detection of cyberattacks.
Analyze the security vulnerabilities of CAVs using an integrated platform consisting of V2I and V2V based advisories.
Identify how cyberattacks on a single CAV and multiple CAVs affect the CAV performance in a representative traffic environment and how these systems can perform resiliently even under cyberattacks.
The above objectives were achieved using a cyber risk assessment platform in simulation, since at present no real CAV data exist for cyberattacks. While the authors' past study [11] assessed the influence of cyberattacks on CAVs in a simulation environment, that simulation assumed perfect communication and did not validate the controllers. The major contribution of this paper lies in developing a real-time monitoring system architecture based on data science techniques to detect anomalous behavior and degrade the CAV environment to a safe state for resilient operation of CAVs. Further, this study contributes by developing a modified CAV platform that models communication and has been validated to generate data for realistic traffic conditions (e.g., CAVs operating on a multilane freeway with several platoons and interacting with surrounding traffic). This platform can operate under both normal and compromised conditions. A few past studies have modeled cyberattacks using a single platoon [7], [16] or developed anomaly detection algorithms for CAVs using speed and acceleration data from human-driven vehicles and a car-following model with random attack data generation [17]. However, there is a knowledge gap in assessing performance under realistic CAV operation involving multiple platoons and coordination with adjacent lanes, and in developing anomalous behavior detection algorithms with countermeasures for resilient operation. Thus, the current study fills this knowledge gap by developing a cyber risk monitoring system architecture for the resilient operation of CAVs involving multiple platoons. A series of case studies with varying parameters was then conducted to emulate cyberattacks under realistic traffic conditions, as opposed to generating random attacks. The CAV monitoring system was then tested under cyberattacks for resilient operation of CAVs.
This assessment is important for realizing resilient CAV operation and helpful for the future mass deployment of CAVs.
Study Scope and Contributions
This study seeks to improve the resilience of connected and automated vehicles under cyberattacks. While cybersecurity is a broad topic, there is limited literature on the cybersecurity and resilience of surface transportation systems. Past studies have mostly focused on the impact of cyberattacks on a single platoon of CAVs or proposed algorithms to detect anomalous behavior. This study contributes to the cybersecurity literature on surface transportation systems by developing a real-time monitoring system architecture based on data science techniques to detect anomalous behavior and degrade the CAV environment to a safe state to ensure resilient operation of CAVs. This research utilizes a novel redundancy-based concept, whereby redundant channels provide additional sources of data for real-time detection of cyberattacks. The study considers a vehicle-to-infrastructure-based communication application of CAVs, as illustrated in Figure 1, which is subjected to a broad array of cyberattacks at the network and application levels of communication. Furthermore, the monitoring system architecture allows the system to perform resiliently by using safe control policies instead of compromised advisories after real-time detection of cyberattacks. The concept is explained in detail with Figure 2, where redundant channels allow the detection of cyberattacks and resilient operation of CAVs with safe control policies.
Review of Past Studies
Studies on quantitative assessment of cyber risks in CAVs are limited. One study [4] focused on security issues in connected cars and identified vulnerabilities in Electronic Control Units (ECUs). Bertini et al. [5] observed that around 40% of DOT officials surveyed were concerned about the security risks of CAVs. Hasan et al. [18] surveyed the vehicle-to-everything (V2X) ecosystem for security risks and identified existing security gaps. Khan et al. [19] provided a synthesis of existing security issues and mitigation techniques in CAVs. Another study [20] analyzed the vulnerability of an urban road network under cascading failure using real-time traffic situations and observed node-based attacks to be the most dangerous.
Some quantitative studies on cyberattacks exist. Amoozadeh et al. [7] analyzed how ten CACC vehicles on a single lane are impacted by jamming and message falsification attacks; they observed that instability was magnified throughout the stream by an adversary providing falsified messages. Islam et al. [8] performed cyberattack detection and observed that conflicts were reduced with their approach. Another study [16] observed oscillations in speeds and crashes under cyberattacks on a platoon of ten vehicles. Khattak et al. [3] developed a monitoring architecture to revert a compromised active traffic management system to a safe state under cyberattacks. Li et al. [10] assessed how a cyberattack on a single vehicle, active for a short duration, impacted CAV safety; the attack factors were based on vehicle speeds, and the deceleration profiles of nine vehicles were impacted even in the slight cyberattack case. Wardzinski [21] used current and foreseen risk situations to develop vehicle control that keeps the safety risk to a minimum and observed that cooperative communication resulted in better performance. Wang et al. [22] used the concept of maliciously spreading information to model a cyberattack on a single platoon and observed significant flow disruption. Khattak et al. [11] performed a cyber risk analysis of CAVs with compromised lane merge advisories; however, the study assumed perfect communication and did not validate the simulation platform.
Van Wyk et al. [17] used the USDOT Research Data Exchange (RDE) database for anomaly detection of randomly generated attacks on speeds and accelerations. Javed et al. [23] used sensor data from the RDE to detect anomalies from random attacks and observed high performance for their algorithm. Kamel et al. [24] used SUMO simulation and machine learning algorithms to develop an anomaly detection model for six types of modeled attacks. Dong et al. [25] used simulation to model attacks on CACC platoons and assess their impact on traffic flow and safety, observing disrupted traffic flow and an increased risk of collisions. Yen et al. [26] analyzed the impact of cyberattacks on different traffic control algorithms and observed that spoofing attacks increase the vulnerability of delay-based algorithms. Singh et al. [27] observed the stability of ten CACC vehicles to be severely impacted by cyberattacks. Nguyen et al. [28] used signals from the host vehicle and RSU verifications in a V2X environment to develop an anomaly detection algorithm. So et al. [29] used simulated data from VEINS to classify misbehavior using SVM and k-nearest neighbors and observed a 20% improvement in plausibility checks. Another study [30] proposed VetaDetect for vehicle tampering detection; the method used reported beliefs on tampering to adjust each detector's belief and achieved an accuracy of 98% for real-world tampering. Another study [31] used time delay attacks to test a CACC platoon of six vehicles and observed their CACC algorithm to be stable against the attacks. Tanksale [32] used OBD data from vehicles driven on the road to develop an LSTM model for anomaly detection, with an average accuracy of 96% in predicting anomalies. Haidar et al. [14] proposed a misbehavior detection strategy based on pseudonym certificates and tested their approach on sybil and replay attacks.
Recently, Haydari et al. [33] used reinforcement learning to model attacks on traffic control and study their impacts; using SUMO simulations, they observed anomalous behavior to be detected well. Kloukiniotis et al. [13] provided a synthesis of adversarial attacks on automated vehicles and some defense strategies. Tanksale [34] used real autonomous vehicle sensor data to design LSTM-based anomaly detection. Haidar et al. [14] used neighbors to classify misbehavior under sybil and replay attacks. Zheng et al. [35] focused on the safe performance of autonomous vehicles by proposing a learning-based algorithm that enhanced the efficiency and safety of control policies. Kamaruzzaman et al. [36] studied spoofing attacks and methods to mitigate their impact on the departure of taxiing flights; they proposed a modulated synchronous taxiing approach under simulated spoofing attack scenarios and observed that their algorithm got more aircraft onto the runway.
The literature reveals that past studies have mostly used qualitative methods to assess security risks or utilized a single platoon of a few CACC vehicles to assess cyber risks and develop anomaly detection algorithms. The authors' past study [11] only assessed the impact of cyberattacks on the CAV environment. There remains a knowledge gap regarding the security risks of V2I-based CAV applications in a representative traffic environment and their resilient operation. Architectures for cyberattack detection and the resilient operation of CAVs need to be developed. This paper therefore seeks to fill these gaps by developing a threat monitoring system to detect anomalous behavior and revert the system to a safe state of operation in the event of cyberattacks [37]. Further, the CAV risk and the monitoring system performance in a V2I-based application under cyberattacks and varying traffic dynamics are also quantified. The assessment is conducted in a representative traffic environment consisting of multiple platoons and interactions with adjacent lanes requiring lane changes.
Cyberattacks in CAV Environment
The CAV environment is vulnerable to three categories of attacks, as shown in Figure 1: at the application level on the V2I communication medium, at the network level on the V2V communication medium, and at the system level on the vehicle. Figure 1 depicts a traffic scene with platoons of CAVs traveling on a roadway with V2V communication among vehicles and V2I communication with the infrastructure. An incident on the roadway is detected by the TMC, and lane control information is communicated to CAVs through RSUs. The adversary can gain access to the V2V communication (shown by point A) and the V2I communication at the RSU level (point B) and at the TMC level (point C). Likewise, the adversary can gain access to the CAV itself (point D, the enlarged vehicle in the traffic scene).
In typical realistic CAV operation, the traffic management center (TMC) detects incidents and sends advisories for lane changes using V2I communication through roadside units (RSUs) to CAVs. The CAVs also operate using CACC, which relies on V2V communication and allows the followers to create the gaps required for merging traffic. Thus, the system may be compromised by an adversary at four possible attack points in Figure 1:
Point A involving vehicular communication (V2V)
Point B involving communication with the infrastructure (V2I) when advisories for lane control are sent to the CAVs
Point C, where V2I communication at the TMC generates lane control advisories from the incident data
Point D involving the CAV itself.
For instance, during a merge from the adjacent lane, the adversary can access V2V communication and manipulate alerts for speed increases or decreases. Similarly, an insider can manipulate message content by listening to the V2I advisories for lane control. Further, the adversary can affect CAV operations by manipulating the final lane control advisory delivered to CAVs after gaining access to communication at the TMC level. Attacks at the aforementioned entry points have been categorized into three groups according to the research needs, and each group is described below.
Infrastructure elements can be compromised by attacks at the application level (V2I communication).
Network level (V2V) communication may compromise vehicular communication.
Attacks that require physical access to the system (CAV itself in this case).
Although multiple attacks could be modeled, a detailed investigation was conducted with a likely set of attacks. Some attacks require a high level of expertise and incur a higher cost. Attacks requiring access to the CAV itself have a relatively low success rate because they require physical access to the OBD units of specific vehicles; thus, attacks from the system (vehicle) access perspective were not considered in this study. Three attack types requiring access to communication were selected: fake BSM, message falsification (a type of application-level attack), and denial-of-service attacks. These case studies were used to demonstrate the performance of the monitoring system under cyberattacks. The selection was based on the probability of success of these attacks, the level of compromise to the safety and operation of CAVs, and the expertise and cost required for the attack. The CAV application integrated with advisories from the infrastructure can be compromised by application-level attacks. The denial-of-service attack leaves advisories uncommunicated and is similar to the jamming attacks in [16]. The message falsification attack tests the interaction of platoons on multiple lanes of a freeway from a V2I communication perspective. Likewise, network-level attacks compromise vehicular communication: fake BSM was selected as a network-level attack, where fake BSM advisories with speeds compromised by 40%, 60%, and 80% were sent to the platoon leader.
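For concreteness, the fake-BSM speed falsification can be sketched as below. The message fields and the direction of the compromise (inflation vs. deflation) are illustrative assumptions, since the text specifies only the 40%, 60%, and 80% magnitudes.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class BSM:
    """Basic safety message fields used in this sketch (illustrative subset)."""
    vehicle_id: int
    speed_mps: float
    position_m: float

def falsify_speed(msg: BSM, compromise_pct: float) -> BSM:
    """Emulate the fake-BSM attack: scale the reported speed by the given
    percentage (40, 60, or 80 in the case studies) before broadcasting.
    A negative percentage models a speed decrease."""
    return replace(msg, speed_mps=msg.speed_mps * (1.0 + compromise_pct / 100.0))

# An 80% speed compromise applied to a genuine 30 m/s BSM sent to the leader.
genuine = BSM(vehicle_id=1, speed_mps=30.0, position_m=1200.0)
fake = falsify_speed(genuine, 80.0)
```

The genuine message is left untouched (the dataclass is frozen), which mirrors an attacker injecting a forged copy rather than altering the sender's state.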
CAV Monitoring System Architecture
The goal of the CAV monitoring system is to detect cyberattacks and add redundancy so that CAVs can continue to operate in a safe state under cyberattacks. It is important to clarify that monitoring in transportation engineering traditionally refers to the measurement of vehicle speeds and volumes using detectors; here, however, it refers to analyzing the state of the system. The CAV monitoring system monitors the functionality of the CAVs to ensure they are operating in the expected state. Figure 2 shows the monitoring architecture.
The monitoring system uses two channels, acting as paths of communication to process the data independently. The first channel utilizes real-time CAV data, while the second channel utilizes independent data received from redundant sources. The two data sources are matched for detection of anomalies using LSTM and support vector machine (SVM) based naïve Bayes (NB) algorithms. If any deviation is detected during the monitoring, it is reported as an anomaly or a possible cyberattack. When this occurs, the controls generated by CAV data are ceded by the monitoring system, and the monitoring channel is used to control CAV operation over the network. The redundant channels could utilize an array of sources such as historical data from normal CAV driving profiles, plausibility checks using additional sensors from CAVs, additional redundant sensors, and collaborative matching between vehicle groups to collaboratively accept or reject messages as anomalous behavior. In this paper, the functionality of the monitoring system will focus on the use of historical CAV profiles generated from normal operation as the redundant source.
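To illustrate the matching step, the following minimal Python sketch compares the real-time channel against a redundant historical profile and falls back to a safe policy on detection. The statistic (mean headway), the 3-sigma threshold, and the advisory format are illustrative assumptions, not the paper's calibrated design.

```python
def detect_anomaly(live_headway_s, hist_mean_s, hist_std_s, k=3.0):
    """Flag a possible cyberattack when the real-time channel deviates more
    than k standard deviations from the redundant historical profile."""
    return abs(live_headway_s - hist_mean_s) > k * hist_std_s

def select_control(live_advisory, safe_policy, anomalous):
    """On detection, cede the control generated from compromised CAV data and
    fall back to the safe control policy from the monitoring channel."""
    return safe_policy if anomalous else live_advisory

# A live 0.3 s headway against a 1.2 s (std 0.1 s) historical profile is
# flagged, and the safe policy replaces the compromised advisory.
attack = detect_anomaly(0.3, 1.2, 0.1)
control = select_control({"headway_s": 0.3}, {"headway_s": 1.2}, attack)
```

In the full architecture the deviation test is replaced by the learned detectors (LSTM and SVM-based NB) described next; the fallback logic is unchanged.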
A. Long-Short Term Memory (LSTM) Neural Network
The complex correlations between input features of the time series were accounted for using LSTM neural networks (Figure 3). Multiple LSTM neurons were fed with inputs of headway, acceleration, deceleration, and an indication of lane change. The LSTM also addresses the vanishing gradient problem [38]. Previous timesteps are tracked through the memory of the LSTM cell state, which allows existing patterns to be captured within the variation of the time series. This layer is followed by a fully connected (FC) layer with a rectified linear unit (ReLU) activation, which ensures that the neurons do not generate negative outputs. Link weights are randomly set to zero by connecting the FC layer to a dropout layer with a dropout probability, which also reduces overfitting.
Finally, a single neuron was used in the output layer to distinguish between an attack and no attack. The binary output was classified using a sigmoid function. The binary cross-entropy loss, minimized using the Adam optimizer, is estimated as in equation (1).\begin{equation*} E = - \frac {1}{J}\ \sum _{i = 1}^{J}\left ({y_{i}\log (P( y_{i} )) + (1 - y_{i}) \log (1 - P(y_{i}))}\right ) \tag {1}\end{equation*}
The value of E indicates the loss function, $y_{i}$ is the binary label for sample $i$, $P(y_{i})$ is the predicted probability of an attack, and $J$ is the number of samples.
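As a concrete check on equation (1), a framework-independent implementation of this loss can be written in a few lines (the `eps` guard against log(0) is an implementation detail, not part of the equation):

```python
import math

def binary_cross_entropy(y_true, y_pred):
    """Binary cross-entropy of equation (1): the mean over J samples of
    -(y*log(p) + (1-y)*log(1-p)), where p is the predicted attack probability."""
    J = len(y_true)
    eps = 1e-12  # numerical guard so log(0) never occurs
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for y, p in zip(y_true, y_pred)) / J
```

An uninformative prediction of 0.5 for every sample yields a loss of ln 2 ≈ 0.693, and perfect predictions drive the loss to zero, which is the behavior the optimizer exploits.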
Once an anomaly is detected, the monitoring channel is utilized to provide controls for CAV operation over the network while the controls based on the compromised data are ceased. The monitoring system utilizes historical data from normal profiles of CAV driving as a redundant channel. CAVs were simulated without cyberattacks during normal operation to generate historical data using diverse conditions of traffic and settings for parameters discussed later in the experimental design. The sets of CAVs driving profiles were generated using multiple replications of simulations. These driving profiles included the behavior under varying speed and acceleration conditions, lane change and safe headways. Several sets of policies for safe operation of CAVs were defined, including safe acceleration-decelerations, safe headways, and safe lane change controls for normal operation of CAVs without cyberattacks using historical data. These control inputs were sent to CAVs on the network, and the system continued to operate gracefully in the event of attack detection.
B. Other Tested Algorithms
Several other machine learning algorithms were also tested and compared to the proposed model.
1) 1D Convolutional Neural Network (CNN)
1D CNN architectures were also tested and compared to the proposed LSTM model. CNNs differ from feed-forward neural nets in their patterns of connectivity within layers: neurons connect only to a local receptive field, which reduces dimensionality. A CNN is built from multiple layers, each performing a set of operations to transform input neurons to output neurons, and accepts an input volume with a width and depth. Here, the independent samples comprised four channels: acceleration, deceleration, headway, and lane control, with the number of points forming each segment given by the channel shape M. The initial convolutional layer consists of learnable filters whose neurons are connected to receptive fields; each neuron's output is a function of the filter parameters and its receptive field. The same filter is convolved over the input volume to generate a feature (activation) map, and several activation maps are stacked along the depth to form the output volume. All convolutional layers use filters of size (3xC), where C is the number of channels. Equation (2) shows the 1D forward propagation executed within each convolution layer, where $N_{k-1}$ is the number of neurons in layer $k-1$.\begin{equation*} x_{j}^{k} = b_{j}^{k} + \sum _{i = 1}^{N_{k - 1}}{{conv}_{1D}(w_{ij}^{k - 1},\ s_{i}^{k - 1})}\ \tag {2}\end{equation*}
Within the activation layer, non-linearity is achieved by activating each convolutional layer output with Leaky ReLU [39], which improves the learning rate. Next, in the pooling layer, the input volume is partitioned into non-overlapping sub-vectors along the depth and the maximum value of each sub-vector is retained; a 1x2 pooling filter was used to reduce overfitting and dimensionality [40]. The CNN architecture ends with multiple fully connected (FC) layers that extract features, except for the last layer; these features are fed into the last layer to perform classification, where a softmax function generates a probability distribution over the cyberattack outputs.
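The per-channel convolution and activation of equation (2) can be sketched in plain Python; the filter values, bias, and leaky-ReLU slope below are illustrative, not the trained parameters.

```python
def conv1d_valid(signal, kernel):
    """'Valid' 1-D cross-correlation: slide the filter over one input channel."""
    m = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(m))
            for i in range(len(signal) - m + 1)]

def conv_layer_output(channels, filters, bias, slope=0.01):
    """One output feature map per equation (2): sum the per-channel
    convolutions (acceleration, deceleration, headway, and lane control in
    the text), add the bias, then apply the leaky-ReLU activation described
    for the activation layer (slope value assumed)."""
    maps = [conv1d_valid(s, w) for s, w in zip(channels, filters)]
    summed = [bias + sum(col) for col in zip(*maps)]
    return [z if z > 0 else slope * z for z in summed]
```

A (3xC) filter in the text corresponds to a length-3 kernel per channel here, with the C per-channel results summed into a single activation map.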
C. Recurrent Neural Network
The recurrent neural network (RNN) [41] is a neural network designed for analyzing sequential input. The RNN connects different time steps and circulates weights across them; it uses past input data to generate the output, as opposed to traditional neural networks, which treat input features and outputs as independent. Several neurons were used in the hidden layer and one neuron in the output layer; the hidden layer was activated by a rectified linear unit (ReLU), and a sigmoid function was used for activation of the output layer. The best model structure was identified by varying the number of hidden layers and the neurons within them.
D. Multilayer Perceptron (MLP-ANN)
The multilayer perceptron (MLP) extends the predictive power of artificial neural networks (ANNs) through multiple hidden layers of neurons [42]. The MLP is trained using backpropagation in a feedforward network with the objective of minimizing the loss function. This process allowed the selection of the best hyperparameters [42], [43].
E. Naive Bayes
The naïve Bayes (NB) algorithm was used due to its ease of implementation with multiple features, the small amount of training data required for parameter estimation, and its computational efficiency [34]. The model converges quickly compared to discriminative models, which significantly reduces the training time. Initially, a set of data is used to train the algorithm to create profiles of normal and anomalous behavior. Once the distribution profile is created from the training data, the algorithm compares incoming test data (including the driving profiles and lane control advisories discussed later) against the training profile to classify it either as a normal data point corresponding to usual operation or as an anomaly. The naïve Bayes algorithm relies on a conditional independence assumption: given the class, all attributes are assumed independent of each other. This means that each attribute affects the classification independently, and the joint probability distribution equals the product of each term's probability.
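The rule just described, a class prior multiplied by independent per-feature likelihoods (summed in log space), can be sketched as a minimal Gaussian naïve Bayes. The Gaussian likelihood is an assumed instantiation, since the text does not specify the distribution family.

```python
import math

def fit_gaussian_nb(X, y):
    """Estimate per-class, per-feature means and variances plus class priors,
    relying on the conditional-independence assumption described above."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        vars_ = [max(sum((v - m) ** 2 for v in col) / len(rows), 1e-9)
                 for col, m in zip(zip(*rows), means)]
        model[c] = (means, vars_, len(rows) / len(X))
    return model

def predict_nb(model, x):
    """Classify by maximizing log prior plus the sum of per-feature Gaussian
    log-likelihoods (the product of independent terms, in log space)."""
    def log_post(c):
        means, vars_, prior = model[c]
        ll = sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                 for xi, m, v in zip(x, means, vars_))
        return math.log(prior) + ll
    return max(model, key=log_post)
```

The variance floor of 1e-9 is a numerical guard; in the study, the two classes would be normal operation and anomaly.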
F. Support Vector Machines
The support vector machine (SVM) was utilized due to its ability to separate dependent features in classification. The principle behind SVM is to construct a hyperplane such that the margin between the two sets of structures is maximized. This results in a quadratic optimization problem, since SVM also tries to minimize the chance of misclassification. Given a separable training set, SVM constructs a hyperplane that divides the samples into a region above the plane and a region below it.
G. SVM-Based NB Algorithm
The features of both SVM and NB were combined to construct an SVM-based NB algorithm. This addresses the limitation of NB regarding the independence of features, which affects its accuracy and recall rate. To address this issue, we use a trimming technique to eliminate the samples wrongly classified by NB. This reduces the dependency among feature vectors and improves the independence among samples. The fast classification rate of NB makes it appropriate for anomaly detection. With SVM-based NB, the training data is first classified using NB, generating a category of normal or anomaly based on the input features.
Further, SVM is used to reduce the dependence among features. The nearest neighbor of each feature vector is estimated; if the nearest neighbor and the feature vector belong to the same category, the vector is kept for further processing. Otherwise, the feature vector and its corresponding information are trimmed from the training set, since the vector was classified into the wrong category due to dependence with the neighboring category. Dependent samples are thus removed from the training set, resulting in improved accuracy. Finally, the trimmed sample is used by the NB algorithm to build an anomaly detection model.
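The nearest-neighbor trimming rule described above can be sketched as follows (Euclidean distance is assumed, and the data in the usage note are illustrative):

```python
import math

def trim_training_set(X, y):
    """Trimming step of the SVM-based NB algorithm: drop any sample whose
    nearest neighbor (Euclidean distance) carries a different label, reducing
    dependence between feature vectors before retraining NB."""
    kept_X, kept_y = [], []
    for i, (xi, yi) in enumerate(zip(X, y)):
        j = min((k for k in range(len(X)) if k != i),
                key=lambda k: math.dist(xi, X[k]))
        if y[j] == yi:           # nearest neighbor agrees: keep the sample
            kept_X.append(xi)
            kept_y.append(yi)
    return kept_X, kept_y
```

For example, a point labeled "normal" that sits inside the anomaly cluster is removed, while points whose nearest neighbor shares their label survive into the final NB training set.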
Connected and Automated Vehicle Application
The CAV platform developed for data generation and testing under cyberattacks is described in this section. The vehicles are assumed to be equipped with On-Board Units (OBUs) and Dedicated Short Range Communication (DSRC) and broadcast speed, location, and acceleration as basic safety messages (BSMs) at 10 Hz. An analytical model for packet-level communication derived from the ns-2 simulator was used to model communication within a range of 300 m in simulation. The platform leverages the connectivity feature of CAVs and includes cooperative adaptive cruise control (CACC) and lane controls. These features are extendable to further cooperative driving applications and to testing the risks of cyberattacks on a wide array of CAV applications. The application is based on the concept that a TMC detects incidents and lane closures and utilizes roadside units to send V2I advisory alerts to CAVs. The advisory alerts initiate cooperative and automated merging. The vehicles in the adjacent lanes are advised at the same time about the merge action and communicate using V2V to reduce their speeds in order to generate sufficient gaps while maintaining tight headways.
The integration of CACC platoons and lane controls forms the CAV platform discussed below. Platoons were formed using the CACC gap-following mode [44], as given by equation (3), while ACC mode is active when no lead vehicle is present.\begin{equation*} \ v_{ccs}(t) = v_{ccs}\left ({{ t - {\Delta }t }}\right ) + g_{p}e_{p}(t) + g_{d}{e^{\prime }}_{p}(t) \tag {3}\end{equation*}
\begin{align*} e_{p}(t)=& s\left ({{ t - {\Delta }t }}\right ) - t_{1}v_{ccs}\left ({{ t - {\Delta }t }}\right ) - d \tag {4}\\ {e^{\prime }}_{p}(t)=& v_{1}\left ({{ t - {\Delta }t }}\right ) - v_{ccs}\left ({{ t - {\Delta }t }}\right ) - t_{1}a_{ccs}\left ({{ t - {\Delta }t }}\right ) \tag {5}\end{align*}
The vehicle transitions to ACC mode in equation (6) [44], [45] when a large gap exists or when a leader is absent.\begin{equation*} a_{ccs} = g_{1}(v_{f} - v_{ccs}) \tag {6}\end{equation*}
Further, the selection of ACC control and speed regulation depends on the clearance distance between the preceding and host vehicles. For a clearance distance greater than the 120 m threshold, the speed regulation [44] estimated by equation (7) is used by the subject vehicle.\begin{equation*} a_{ccs} = g_{1}\left ({{ s - t_{h}v_{ccs} - d }}\right ) + g_{2}(v_{i} - v_{ccs}) \tag {7}\end{equation*}
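Under the stated definitions, equations (3)-(7) can be sketched as two short functions. The gain, time-gap, and standstill-distance values below are placeholders, not the paper's calibration.

```python
def cacc_speed_update(v_prev, s_prev, v_lead_prev, a_prev,
                      gp=0.45, gd=0.25, t1=0.6, d=2.0):
    """CACC gap-following update of equations (3)-(5): correct the commanded
    speed using the spacing error e_p and its derivative e'_p."""
    e_p = s_prev - t1 * v_prev - d                  # equation (4)
    e_p_dot = v_lead_prev - v_prev - t1 * a_prev    # equation (5)
    return v_prev + gp * e_p + gd * e_p_dot         # equation (3)

def acc_acceleration(v_free, v_self, s=None, v_lead=None,
                     g1=0.4, g2=0.5, th=1.5, d=2.0):
    """ACC fallback: the speed regulation of equation (6) when no leader is
    tracked, otherwise the gap-and-speed regulation of equation (7)."""
    if s is None or v_lead is None:
        return g1 * (v_free - v_self)                           # equation (6)
    return g1 * (s - th * v_self - d) + g2 * (v_lead - v_self)  # equation (7)
```

At equilibrium (spacing exactly t1*v + d and matched speeds), equation (3) leaves the commanded speed unchanged, which is the intended steady-state behavior.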
Likewise, the lane control component [44] was utilized to activate active lane changes. \begin{equation*} L_{active} = \{\gamma ,\ T_{lane},\ \theta _{des}\}\ \tag {8}\end{equation*}
\begin{equation*} T_{lane} = c_{lane} \pm 1 \tag {9}\end{equation*}
\begin{equation*} \theta _{des} = \ \theta *L_{active} \tag {10}\end{equation*}
The existing traffic conditions on the network are used to calculate the current lane angle for each vehicle. The advisories are sent to both the subject and target vehicles when the merge advisory is activated, and cooperative merging-based speed reduction is then utilized to generate gaps. The collision avoidance criterion [44], known as Forward Collision Warning (FCW) [46], [47], is used as shown in equations (11) and (12). The deceleration function is generated based on the traffic movement involved. An incentive of speed increase is used to decide whether the subject CACC vehicle may join an adjacent platoon when the target lane has multiple vehicles.\begin{align*} {de}_{re}=& - 0.165 + 0.685{de}_{1} + 0.080q - 0.00889\left ({{ v_{ccs} - v_{1} }}\right ) \tag {11}\\ q=& \begin{cases} 1& v_{1} \gt 0 \\ 0& otherwise \end{cases} \tag {12}\end{align*}
The deceleration rate from equations (11) and (12) is used to avoid a rear-end collision between the preceding and subject vehicles. No braking is required when the available gap exceeds the required gap ${gap}_{re}$ given by equation (13).\begin{equation*} {gap}_{re} = \max \left ({{0,{}\frac {v_{ccs}^{2}}{- 2{de}_{re}} - {}\frac {v_{1}^{2}}{- 2{de}_{1}}}}\right ) \tag {13}\end{equation*}
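The FCW logic of equations (11)-(13) reduces to two short functions. The coefficients are taken directly from equation (11); the inputs in the tests are illustrative.

```python
def required_deceleration(de_lead, v_self, v_lead):
    """Required deceleration of equations (11)-(12), with the indicator
    q = 1 when the lead vehicle is still moving."""
    q = 1.0 if v_lead > 0 else 0.0                          # equation (12)
    return -0.165 + 0.685 * de_lead + 0.080 * q - 0.00889 * (v_self - v_lead)

def required_gap(de_self, de_lead, v_self, v_lead):
    """Required rear-end safety gap of equation (13): the difference between
    the two stopping distances, floored at zero (decelerations are negative)."""
    return max(0.0, v_self ** 2 / (-2.0 * de_self)
                  - v_lead ** 2 / (-2.0 * de_lead))
```

With equal deceleration capability, the required gap is simply the excess stopping distance of the faster follower, and it collapses to zero when the leader would stop later than the follower.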
A. Communication Model
An analytical model for packet-level communication derived from ns-2 [48] was used to model communications. Conducting a packet-level communication assessment with thousands of vehicles directly in the ns-2 simulator, as considered in this study, is not feasible due to the computational burden and low scalability. The analytical model, on the other hand, is computationally feasible, with a speed improvement by a factor of 500 compared to ns-2. The analytical model was derived by [48], [49] and is based on the communication density, which represents the load in vehicular communication as transmissions per unit time [48], [49]. The model provides the reception probability of a one-hop broadcast for BSMs under the IEEE 802.11p standard. The communication density and transmission power together determine the data reception rate. The single-sender model is derived from the Nakagami distribution [48] and a path loss model, while the Levenberg-Marquardt method [49] is used to derive a statistical model for multiple senders. Readers interested in detailed derivations are encouraged to refer to [49]. The probability for a vehicle to successfully receive a packet at a distance (x) from the sender is given by equations (14) and (15).\begin{align*} P\left ({{ x,\varphi ,\delta ,f }}\right )=& e^{- 3\left ({{ \frac {x}{\varphi } }}\right )}\left ({{1 + \sum _{i = 1}^{4}h_{i}(\xi ,\varphi )\left ({{ \frac {x}{\varphi } }}\right )^{i}}}\right ) \tag {14}\\ \xi =& \varphi \delta f \tag {15}\end{align*}
Figure 4 shows the probability of reception for DSRC communication as a function of sender-receiver distance. For 500-byte messages broadcast each second, the probability of reception decreases as traffic density increases.
Simulation Framework
The External Driver Behavior Model (EDM) of the VISSIM simulation package was used to code lane controls and platoons in C++, replacing VISSIM’s built-in driving behavior model. The user-defined control behavior coded in C++ was compiled into a new Dynamic Link Library (DLL). During each time step, VISSIM iteratively collects the information of all vehicles in the network and forwards it to the DLL, which determines platoon and lane-change behavior according to the authors’ algorithm.
The updated behavior from the DLL is continuously sent back to VISSIM for the upcoming simulation period. C# was used to invoke VISSIM and model cyberattacks. The CAV simulation platform was utilized for modeling cyberattacks and testing the monitoring system using a case study site from I-66 in Northern Virginia. The site consists of four lanes with CAV platoons, with incident data gathered from the real-world operation of a traditional active traffic management (ATM) system on I-66.
A. Attack Model
A typical cyber-physical environment may be compromised by internal attackers holding valid credentials and by external attackers without them. Access to valid credentials affords internal attackers the opportunity to launch a wide array of attacks, while limited network access restricts external attackers to a small set of attacks. Digital signatures through the Security Credential Management System (SCMS) are used to validate transmitted messages, guarding against external attackers [50]. SCMS authenticates messages using the elliptic curve digital signature algorithm (ECDSA), which is based on asymmetric cryptography. Each vehicle generates pseudonyms, which are security credentials updated regularly to reduce trackability. The presence of SCMS limits the ability of external attackers to launch attacks. Thus, the study focuses on internal attackers with access to communications, which are managed by the anomalous behavior detection algorithms developed in this study. Consequently, successfully initiating any of the mentioned attacks may require multiple coordinated attacks.
The attack model involves an insider with access to the communication medium. The insider is assumed to have the required credentials of a legitimate user, participates actively, and sends falsified data. Having access to BSM and CAN data, the attacker is assumed to be able to alter any of their data fields. The attacker is further assumed to have a modified on-board unit (OBU) in their car, granting the ability to alter the OBU’s transmission rate and to use desired pseudonyms. The study used a game-theoretic approach for modeling attacks, where an agent (the attacker) interacts with the environment and learns an optimal action from the action space using the value update given by equation (16).\begin{equation*} Q(s_{t},a_{t}) \leftarrow r_{t} + \gamma \max _{a_{t + 1}}{Q(s_{t + 1},a_{t + 1})} \tag {16}\end{equation*}
The attacker’s goal is to maximize congestion or cause a crash; upon achieving this goal, the attacker receives the reward given by equation (17).\begin{equation*} r_{t} = \left ({{ gain*{queue}_{t} }}\right ) - (c*A_{t}) \tag {17}\end{equation*}
The reward is a function of the queue caused on the road, amplified by a gain factor and penalized by a constant c (set to 0.2) multiplied by the current action. Actions are selected with the $\varepsilon$-greedy policy in equation (18).\begin{align*} a_{t} = \begin{cases} \arg \max _{a\in A}{Q(s_{t},a)},& \text {with probability } 1 - \varepsilon \\ a\sim A,& \text {otherwise}\end{cases} \tag {18}\end{align*}
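The attacker's learning loop in equations (16)-(18) can be sketched with a tabular Q-learning update; the dictionary-based state representation, the discount factor value, and the function names are illustrative assumptions, not the study's exact formulation.

```python
import random


def attacker_reward(queue_len, action, gain=1.0, c=0.2):
    """Reward (eq. 17): queue amplified by a gain, minus c times the action."""
    return gain * queue_len - c * action


def q_update(Q, s, a, r, s_next, actions, gamma=0.95):
    """Value update (eq. 16) on a dict Q keyed by (state, action)."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    Q[(s, a)] = r + gamma * best_next


def epsilon_greedy(Q, s, actions, eps=0.1):
    """Action selection (eq. 18): explore with probability eps, else exploit."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q.get((s, a), 0.0))
```

In use, the attacker would observe the queue after each action, compute the reward, and update the Q-table before selecting the next action.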
The study uses three diverse sets of attacks to analyze their effect on CAVs and test the resilience of the monitoring system based on threat model assumptions. The three types of attacks are selected as case study examples due to their success likelihood, ability to compromise CAV operation and safety, and ease of implementation due to the requirement of a reasonable level of expertise and cost. The three types of cyberattacks and their assumptions are discussed in the following sections.
1) Fake BSM
Fake BSMs were used to demonstrate the traffic impact of a cyberattack on a single CAV and the ability of the monitoring system to revert the system to a safe state. To initiate fake BSMs, the attacker injects fraudulent information into vehicles on the network by impersonating one of the vehicles in the traffic stream. The attacker can send fake BSMs by spoofing GPS, since spoofing allows the attacker to appear anywhere on the network. Such fake BSMs are transmitted at 10 Hz and may be accepted by the subject vehicle. The attack was modeled at the application level: through V2V communication, the adversary sends fake BSMs to a CAV in lane 2 of a 4-lane freeway to reduce its speed by 40%, 60%, and 80% of the original speed. The attack lasts for a duration of 4 minutes. For example, the CAV’s speed was reduced from 65 mph to 39 mph, 26 mph, and 13 mph, respectively. The vehicles in the blocked lane are not aware of the gradual speed reduction due to the fake BSM attack. Thus, the flow is disrupted since there is no cooperative merging.
2) Message Falsification
During message falsification, an insider listens to the messages being communicated and rebroadcasts manipulated beacons to the network. The malicious insider broadcasts manipulated lane controls from RSUs, so CAVs receive falsified advisory messages for incident data or lane controls. Falsified lane-control messages were emulated through the lane-merging code. For example, the closed-lane value was changed in C# to lane 3 instead of lane 1 while lane 3 is in reality open on the network. This causes disruption due to reduced capacity as vehicles from lane 3 merge into lane 2, while vehicles in the actually blocked lane are unaware of the incident and attempt to merge only as they reach the incident location.
3) Denial of Service
Denial of service shuts down the lane controls. Communication is assumed to occur over seven DSRC channels ranging from 5.90 to 5.97 GHz, comprising control and service channels operating on different channel numbers. A single operating frequency is selected by the OBU and RSU after authentication. The adversary initiates the attack by flooding the communication channel with excess data packets, causing the lane-control advisories sent to CAVs to cease. Thus, CAVs are not aware of any incidents, and merge advisories are not provided since the CAV application is overwhelmed with excessive data packets. The attack was emulated in C# by sending data packets exceeding the channel capacity, thus blocking the lane controls from generating advisories.
B. Cyberattack Data Generation
The three types of attacks were selected based on their likelihood of success, ease of implementation, and cost. While the monitoring system and cyberattack detection algorithm can also identify other similar attacks, the model is not universal, and certain novel attacks may remain undetected. Machine learning (ML) algorithms are also vulnerable to adversarial attacks [51], in which attackers deceive ML algorithms to evade detection. These can be performed under two cases: a) the adversary has full access to the attacked classifier architecture, including its inputs and outputs (white-box scenario), or b) the adversary has no access to the attacked classifier (black-box scenario). The black-box scenario is more challenging for the attacker than the white-box scenario. Anomaly detection can be strengthened by constantly updating the input parameters and best-fit boundaries, along with adversarial training. To scale our models to detect other cyberattacks, it is necessary to determine appropriate sets of model parameters.
When the monitoring architecture is introduced to a new set of cyberattacks, it is important to update the model parameters, and the monitoring architecture would have the ability to observe and learn from new behaviors. The safety strategies can also be updated to improve resilience. Furthermore, attacks that can evade detection with unseen behavior require the model parameters to be re-trained.
Mixed traffic was not considered since the study focused on CAVs, and cyberattacks in a CAV environment with no human control would be more critical. Since the simulation model has been calibrated and validated [44], the data is representative of existing traffic on I-66. The CACC model was validated against an available speed profile from a field test, and the CACC kinematic trace provides a good fit with that field test, as shown in Figure 5. Likewise, the lane controls were modeled based on data available from the operation of an ATM system on I-66 (https://smarterroads.org). The TMC logs provided the lane controls for each lane along with the traffic conditions, including incidents and work zone presence. The collected instances of incidents and corresponding traffic conditions at the site, along with merge advisories, were used to calibrate the lane control system. The data output from the CAV model was collected at 0.1 s intervals for each CAV within the network and can be traced with unique vehicle IDs and timestamps. Time series data were generated under both normal CAV operation and cyberattacks and were used to train the monitoring system; the historical data were likewise used in reverting system operation to a normal state. The data were separated into 80% training and 20% test sets spanning normal and anomalous records. The historical data were generated with multiple iterations of volume inputs (2500-4000 vph, in increments of 500) and random seeds to capture a broad spectrum of traffic conditions, including the conditions observed during the testing phase. This covered both normal and compromised operation of CAVs.
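The time series preparation described above can be sketched as follows; the window length and the chronological split are illustrative assumptions about how the 0.1 s trajectory records might be framed for the detection model.

```python
def make_windows(series, labels, window=50):
    """Slice one vehicle's time series (0.1 s steps) into fixed-length
    windows, labeling each window by its final step (normal/anomalous)."""
    X, y = [], []
    for i in range(len(series) - window + 1):
        X.append(series[i:i + window])
        y.append(labels[i + window - 1])
    return X, y


def split_80_20(X, y, train_frac=0.8):
    """Chronological 80%/20% train/test split, as described in the text."""
    k = int(len(X) * train_frac)
    return (X[:k], y[:k]), (X[k:], y[k:])
```

A chronological split (rather than a random shuffle) keeps attack episodes intact and avoids leaking near-duplicate overlapping windows between the training and test sets.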
C. Experimental Design
The performance of the monitoring system was analyzed using a case study simulation environment in VISSIM, calibrated and validated using real-world traffic data from I-66. The case study site consisted of a 1.5-mile, four-lane network, with an incident modeled at a downstream location based on observed incidents and traffic conditions on I-66 to provide advisories to CAVs. The lane controls were modeled to reflect realistic traffic conditions involving lane change events. All four lanes were simulated with platoons of CAVs under varying levels of traffic demand. The parameters and levels investigated included:
attacks with and without the monitoring system, and no attack
headways of 1 s, 1.5 s, and 2 s
CACC vehicle speeds of 55 mph, 60 mph, and 65 mph
maximum platoon sizes of 5, 7, and 10 vehicles
desired decelerations of −3.5 ft/s², −4 ft/s², and −4.5 ft/s²
volume levels of 2500, 3000, 3500, 4000, and 4500 vph.
These volume levels were used because they reflect traffic flows observed on the study network. The combinations of simulation parameters were run through a separate analysis. A total of 3*3*3*3*
1) Volatility
The effects of cyberattacks on CAVs and the performance of the monitoring system were assessed using driving volatility measures [54]. Volatility measures express acceleration and deceleration variations [55] and represent erratic vehicular movements in three dimensions. Volatility is a surrogate for flow and safety and can be calculated before the occurrence of safety-critical events. The coefficient of variation and mean absolute deviation were used as volatility measures due to their better representation of CAV volatility [54]. The coefficient of variation provides dispersion in relative terms, as shown in equation (19).\begin{equation*} C_{v} = \frac {\mathrm {SD}}{\overline {x}}\times 100\% \tag {19}\end{equation*}
The mean absolute deviation expresses the average difference of each acceleration-deceleration value from the mean.\begin{equation*} D_{mean} = \frac {1}{n}\sum _{i = 1}^{n}\left |{ x_{i} - \overline {x} }\right | \tag {20}\end{equation*}
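The two volatility measures in equations (19)-(20) amount to the following; the input would be an acceleration-deceleration series from the 0.1 s trajectory output.

```python
from statistics import mean, pstdev


def coefficient_of_variation(xs):
    """C_v (eq. 19): standard deviation relative to the mean, in percent."""
    return pstdev(xs) / mean(xs) * 100.0


def mean_absolute_deviation(xs):
    """D_mean (eq. 20): average absolute deviation from the mean."""
    m = mean(xs)
    return sum(abs(x - m) for x in xs) / len(xs)
```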
2) Conflicts
A conflict arises when two or more vehicles approach each other in space and time such that a collision is imminent if their movements remain unchanged [56]. The surrogate safety model was used to generate conflicts from trajectories. An event is classified as a rear-end conflict if the time-to-collision (TTC) falls below the threshold and the conflict angle is below 30°; a lane-change conflict corresponds to a conflict angle of 30°–85°. The research used a TTC threshold of 1 s-2 s based on guidance from past literature [57].
The TTC gives the time until collision for two vehicles moving at constant speeds when the following vehicle is faster than the leading vehicle, as given in equation (21).\begin{align*} {TTC}_{n}(t) = \begin{cases} \frac {x_{n - 1}(t) - x_{n}(t) - K_{n - 1}}{v_{n}(t) - v_{n - 1}(t)}, & if\ v_{n}(t) \gt v_{n - 1}(t) \\ \infty , & v_{n}(t) \leq v_{n - 1}(t)\end{cases} \tag {21}\end{align*}
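Equation (21) and the angle-based conflict classification can be sketched as follows; `conflict_type` is an illustrative helper for the 30°/85° thresholds, and the function names are assumptions, not part of the surrogate safety model's API.

```python
import math


def time_to_collision(x_lead, x_follow, v_lead, v_follow, lead_length):
    """TTC (eq. 21): finite only when the follower is faster than the leader.

    lead_length corresponds to K_{n-1}, the leading vehicle's length.
    """
    if v_follow <= v_lead:
        return math.inf
    return (x_lead - x_follow - lead_length) / (v_follow - v_lead)


def conflict_type(angle_deg):
    """Classify a conflict event by its conflict angle (degrees)."""
    if angle_deg < 30:
        return "rear-end"
    if angle_deg <= 85:
        return "lane-change"
    return "other"
```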
Results and Discussion
The detection accuracy of the monitoring system and its impact in reverting CAVs to a normal state of operation are discussed in this section.
A. Anomaly Detection Performance
The prediction performance of the proposed model in comparison to the baseline models was analyzed under the three modeled attacks.
Table 1 shows the performance data for all three cyberattacks used to train the model. The LSTM neural network performs well on the combined data of all three attacks, providing an average accuracy of around 98%, with sensitivity averaging 96% across the three attacks. Multiple simulation runs were used to generate cyberattack scenarios, and the results were statistically significant.
There is a tradeoff between accuracy and computation, which makes computation time an important metric for anomaly detection; it indicates the time required to detect an anomaly. An Intel Core i7 workstation with 20 GB RAM was used to train and test the algorithms. Table 2 provides computation times for epochs ranging from 50 to 4000. The computational burden increases with sample size. The complexity increases slightly for LSTM; its higher accuracy trades off against the increased computational burden. Denial of service shows the highest computation time due to the higher number of data packets per second, followed by message falsification due to its message complexity. Furthermore, Figure 6 provides an example AUC curve showing detection performance in terms of true positive and false positive rates.
B. Comparison to Existing Models
The performance of the best-fit LSTM model was analyzed and compared against several machine learning (ML) and deep learning (DL) algorithms, which served as baselines. The widely used anomaly detection models included for comparison were CNN [23], naïve Bayes [58], and support vector machines [23], as shown in Table 3. Furthermore, a multilayer perceptron with artificial neural nets (MLP-ANN), an SVM-based NB, and recurrent neural networks (RNN) were also compared, representing fully connected ensembles. These algorithms were compared to the LSTM model using the same parameters, including speed, acceleration, deceleration, headway, and lane control, to achieve a fair comparison. The features fed into the algorithms were manually crafted. K-fold cross-validation and grid search were used to optimize the hyperparameters for the best-fit model, and the evaluation was performed on the same test data after searching the parameter space exhaustively. The searched hyperparameter ranges included regularization parameters of 0.5–40 for SVM, 1–25 hidden layers for MLP-ANN, and 1–13 hidden layers for RNN. The metrics for assessing the performance of LSTM against the classical ML and DL models in Table 3 include precision, accuracy, and sensitivity; these measures are averaged over cyberattack and no-attack events. The results show that the proposed LSTM model outperforms the widely used models in the literature, with accuracy higher by an average of 3% across all categories of tested cyberattacks. CNN and RNN performance falls close to LSTM. The precision values for the widely used models are also lower than the proposed model’s, indicating higher chances of misclassification with those models. Our CAV data is uniquely sequenced time series data, and it is worth mentioning that some models, such as k-nearest neighbors and isolation forest, only support univariate data; thus, our model with multivariate time series data cannot be compared to those models.
C. Impact of Monitoring System on Mobility and Safety
The performance of the monitoring system was analyzed from the standpoint of traffic stability, volatility, and conflicts.
This analysis was conducted to test how well the monitoring system reverted performance to normal operation under cyberattacks using the specified countermeasures. Three models were compared: a) a baseline CAV model without cyberattack, b) a CAV model with cyberattack, and c) a CAV model with the monitoring system and cyberattacks. The monitoring performance is presented only for LSTM since it outperformed the other models tested. Multiple simulation runs were conducted to capture stochasticity in the results. Table 4 presents the combinations of parameters used to generate a random design space through Latin hypercube sampling for the multiple simulation runs. The volatility and safety results are reported across these runs to show the influence of changing the combinations of these parameters.
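The Latin hypercube design over parameter ranges like those in Table 4 can be sketched with a minimal stdlib-only sampler; the bounds below (headway, speed, volume) and the sampler itself are illustrative, not the exact design used in the study.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Stratified Latin hypercube sampling: for each parameter, draw one
    sample per equal-width interval, shuffling interval order per dimension."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        cells = list(range(n_samples))
        rng.shuffle(cells)
        width = (high - low) / n_samples
        columns.append([low + (c + rng.random()) * width for c in cells])
    return list(zip(*columns))


# Illustrative ranges: headway (s), speed (mph), volume (vph)
design = latin_hypercube(10, [(1.0, 2.0), (55.0, 65.0), (2500.0, 4500.0)])
```

Each parameter's range is covered exactly once per stratum, so a small number of runs spans the design space more evenly than independent random draws.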
1) Traffic Stability
The stability of the traffic stream under cyberattacks was analyzed due to its influence on safety and operations. For brevity, stability results are presented only for the fake BSM attack and its three sub-cases, which test the limits of the monitoring system most directly. Figure 7 reveals the abrupt deceleration of CAVs when the fake BSM attack is initiated at 830 s. A single CAV is attacked for 4 minutes in this case. A platoon leader is spoofed and advised to reduce speed in one lane, causing abrupt deceleration while the adjacent lane is already closed due to an incident.
Profiles of acceleration with and without the fake BSM attack (averaged across the tested scenarios) for (a) cyberattack without the monitoring system, (b) normal operation without cyberattack, and (c) cyberattack with the monitoring system.
The deceleration peak of −20 ft/s² reveals a severe impact even though the attack lasts only 4 minutes. The chance of rear-end collisions increases with higher deceleration variation. The monitoring system, however, is able to revert the system to a safe state of operation with performance similar to the no-attack case, providing a stable acceleration-deceleration profile (Figure 7c).
To further test the limits of the monitoring system, five vehicle profiles were analyzed under the fake BSM attack. The results in Figure 8 (a), (b), and (c) show how the speed profiles of the five vehicles change when a fake BSM is spoofed to the leader advising it to reduce speed by 40%, 60%, and 80%. In all three cases, the followers respond with increasing instability as soon as the attack begins, and the instability grows as the attack level increases from a sudden 40% speed reduction to 60% and 80%. The variations are greatest in the severe attack (80% compromise), as the followers must respond to abrupt speed reductions reaching 11 mph. Figure 8(d) shows the monitoring system’s average response to the three fake BSM cases, revealing its ability to revert CAV operation to a normal state and negate the negative effects of the cyberattacks.
Speed Profiles for first five vehicles in Platoon; a) Fake BSM (40%), b) Fake BSM (60%), c) Fake BSM (80%), d) Fake BSM (average of three attacks) with monitoring system.
2) Volatility
Traffic volatility indicates abrupt variations in traffic behavior over time. The results in Figure 9 show traffic volatility, with the x-axis representing different combinations of parameters (platoon sizes, speed profiles, volume, and headways) in increasing order across multiple simulation runs, as provided in Table 4. These combinations were chosen using Latin hypercube sampling as discussed in Section VII. Each bar on the x-axis aggregates multiple simulation runs for one parameter combination, showing its effect on traffic volatility; this assessment helps capture the stochasticity of the system under different factors. The impact of cyberattacks on volatility with and without the monitoring system was examined. Under a cyberattack without the monitoring system, volatility increased relative to normal operation. The monitoring system was able to counter the negative impact of the fake BSM attack and revert the system to a safe state, as shown in Figure 9. The results for the fake BSM attack are the average of its three sub-cases.
Heatmaps of volatility with and without cyberattacks for (a) the fake BSM attack, (b) denial of service, and (c) message falsification; the x-axis indicates different parameter combinations of speed, volume, headway, and platoon size.
The volatilities were analyzed for significance and confounding factors, with and without the monitoring system, using ANOVA (Analysis of Variance). Without the monitoring system, the fake BSM attack increased acceleration and deceleration volatility by 44% and 37.2%, respectively, over the base case (without a cyberattack); these changes were statistically significant. The risk of rear-end collision increases with higher deceleration variation. Introducing the monitoring system into the CAV environment reduced the negative impacts of cyberattacks on volatile behavior variation by an average of 39.2% and 36.5%, a performance similar to the no-attack case. The volatile behavior showed a mixed effect of increased and decreased variation due to the confounding factor of volume, which may be attributed to the combination of platoon size with the cyberattack and no-attack cases. The impact of platoon size was also significant, with a statistically significant 2% increase observed in acceleration-regime volatility. Similar performance was observed for the message falsification and denial of service attacks, as shown in Figure 10 (b and c).
3) Safety Assessment
The quantitative safety assessment of CAVs under cyberattacks was conducted using conflicts. The impact of a cyberattack on conflicts with and without the operation of the monitoring system was analyzed, as shown in the boxplots in Figure 10. The boxplots indicate different combinations of parameters (platoon sizes, speed profiles, volume, and headways) in increasing order across multiple simulation runs, as provided in Table 4. The combinations were chosen using Latin hypercube sampling as discussed in Section VI, and the boxplots represent the influence of changing these parameter combinations on conflicts, capturing the stochasticity of the system’s behavior. Conflicts were observed to increase drastically under the fake BSM attack (Figure 10a) without the operation of the monitoring system.
Conflicts with and without cyberattacks for (a) rear-end conflicts under message falsification, (b) rear-end conflicts under the fake BSM attack, and (c) rear-end conflicts under denial of service; boxplots indicate different combinations of speed, headway, volume, and platoon size from multiple simulation runs.
This shows that the single spoofed vehicle creates a disturbance in the traffic stream as vehicles from the blocked lane try to merge into the lane with the spoofed vehicle. Rear-end conflicts reached up to 1000, 1900, and 1500 for the message falsification, fake BSM, and denial of service scenarios, respectively. This impact is negated by the monitoring system, which reduces rear-end conflicts during the cyberattack to an average of 50–150.
This graceful degradation is similar to normal operation and shows that the monitoring system is able to revert the system to a safe state. The results were again significant, with a p-value of 0.01. Analyzing the performance under the message falsification and denial of service attacks (Figure 10 c to f) revealed similar insights, with the monitoring system degrading CAVs gracefully to a safe operational state.
Conclusion
The research identified critical access points that adversaries can compromise and developed a prototype monitoring system to detect attacks in real time for a V2I-based CAV environment operating with platoons on multiple lanes. It provides an assessment of cyber risks in a V2I-based application in a representative traffic environment, including interactions with adjacent vehicles during lane changes. While studies in recent years have focused on analyzing cyber risk in a V2V environment using a single CAV platoon, this research contributes an assessment in a representative traffic environment that continuously monitors the CAV environment for anomalies.
The LSTM-based monitoring system developed in this study detects cyberattacks with high accuracy. The analysis reveals that attacks cause severe disruption to CAV operations and safety, as shown by the stability profiles and the increases in volatility and conflicts. However, the monitoring system reverts the system to a safe state of operation with performance similar to the no-attack case. The results show that even a fake basic safety message (BSM) attack on a single CAV causes significant instability in the traffic stream, with volatility, represented by acceleration-deceleration variation, increasing by 37.2% to 44%. The monitoring system, however, degrades the system gracefully to a safe state of operation, improving stability by 38% to a level similar to normal operation.
These findings show how cyberattacks may impact the CAV environment. The security design schemes provided for CAVs over instantaneous driving periods will enable the secure operation of CAVs in the future. While this study did not examine mixed autonomy operation of CAVs with human drivers, future research could analyze anomaly detection performance for cyberattacks in that scenario. The simulated conditions may differ somewhat from the real world; however, this study modeled cyberattacks on CAVs within a well-calibrated and validated simulation environment, since cyberattack data for real CAVs does not currently exist. Further, the limited availability of CAVs and infrastructure makes it impractical to generate CAV data in an actual traffic environment for cyberattack generation. In addition, testing cyberattacks on a limited CAV fleet in the real world could create dangerous situations, such as a sudden stop in a high-speed traffic stream in response to a false BSM. This makes simulation the best available method for analyzing cyber risks in the CAV environment.