Introduction
Automated vehicles (AVs) have recently drawn widespread public attention because they are expected to have a transformative impact on road transportation, for instance by addressing critical traffic issues such as energy consumption and capacity shortage. AVs are vehicles equipped with on-board units (OBUs) such as radar or LIDAR that acquire real-time driving information about the leading vehicle through a detection process [1], [2]. Among vehicle automation functions, longitudinal control such as adaptive cruise control (ACC) plays an essential role: it automatically adjusts the vehicle's speed to maintain a desired distance from the preceding vehicle and avoid rear-end collisions, thereby assisting drivers in the task of longitudinal control during motorway driving [3], [4]. With fast-developing communication technologies, cooperative ACC has attracted considerable interest recently [5], [6] because it enhances vehicle performance and situational awareness. However, cooperative ACC can be unreliable due to the immaturity of vehicle-to-vehicle communication [7]. Hence, ACC is still worth investigating.
A number of ACC car-following optimal control algorithms and strategies have been proposed during the past decades. For example, two well-known models, the optimal velocity model (OVM) [8] and the intelligent driver model (IDM) [9], [10], initially designed to mimic the car-following behavior of human-driven vehicles (HVs), have also been applied to describe ACC vehicle behavior. The OVM regulates the vehicle towards an optimal speed defined as a function of the time headway; however, it does not have a collision-free property. The IDM addresses the safety problem by introducing extra parameters, such as a braking term that constrains acceleration. Nevertheless, these models do not consider human driver characteristics and cannot capture the essential difference between HV and AV car-following behavior.
Beyond directly applying or modifying existing HV car-following models, a large number of control-theory-based ACC algorithms have been developed, which can generally be divided into three main categories: (i) optimal control with explicit objectives and hard constraints [11], [12], (ii) linear controllers [5], [13], and (iii) nonlinear controllers [14], [15]. The first type is usually implemented in a model predictive control (MPC) fashion [14], [16]–[20], which represents a family of control algorithms that use explicit process models to predict future responses to given inputs. The MPC approach is attractive because of its flexibility in modelling objectives and constraints. However, constrained MPC in particular does not guarantee feasible solutions and requires an efficient algorithm to solve the constrained optimization problem. Compared with MPC, unconstrained nonlinear and linear controllers have more fixed objectives and lack constraints. Linear ACC controllers usually design the acceleration to be proportional to the spacing deviation and the relative speed with respect to the predecessor; they are fast to compute and easier to apply in practice. Nonlinear controllers can depict some scenarios more precisely, such as a mixed traffic environment consisting of AVs, HVs, and semi-autonomous vehicles, at the cost of increased computational complexity [15]. Although these three categories of methods each have strengths and weaknesses, personalized preferences are barely considered in previous controller designs, and AVs are usually treated as simply homogeneous.
Even though AVs can satisfy driving automation requirements, personalized automation, such as certain driving styles, driver-based preferences, and driving patterns, has rarely been considered in ACC design. As far as the authors know, current ACC control algorithms cannot really incorporate users' driving preferences [21]. Besides, as AV market penetration grows, the number of user-preferred settings will naturally diversify [22], [23]. Among all personal settings, risk-taking preference, also known as risk sensitivity, is an extremely important characteristic because it is closely related to the safety of car-following behavior and is heterogeneous across drivers, depending on their states and the current driving situation. Hence, it is necessary to fill the gap of interpreting risk-sensitive personalization in AV control to better satisfy drivers' expectations.
Additionally, comfort in the AV control process is also critical. To avoid sudden starts and stops, an expensive control strategy [23], which places an extra penalty on large control inputs (accelerations), can help smooth car-following behavior according to different drivers' preferences. Hence, to increase the diversity of ACC controller behaviors, we present a personalized stochastic optimal control algorithm for AVs. The developed ACC algorithm sheds light on diversifying drivers' choices to satisfy personal requirements, with two detailed contributions. Firstly, to realize personalization of risk-sensitive preference, the control framework applies the linear exponential-of-quadratic Gaussian (LEQG) (or risk-sensitive linear quadratic) mechanism, which extends LQG by designing the cost function in an exponential form and introducing a risk-sensitive parameter that interprets different degrees of willingness to bear additional risk under mixed uncertainties [25]–[28]. As a result, the controller generates various driving trajectories and behaviors depending on the setting of the risk-sensitive parameter. Secondly, a comfort requirement is incorporated into the control process via the relative magnitudes of the weight matrices, i.e., by switching between an expensive control mode and a non-expensive control mode. These contributions to ACC design are innovative and important steps towards practical functions in future ACC industrialization. For verification, the presented controller is evaluated through several simulation experiments. The results show that the proposed controller provides effective and convergent control and generates different trajectories for AVs with different risk preferences. The control framework has overall satisfying performance.
The remainder of the paper is organized as follows: Section II presents the continuous and discretized system state-space formulations and introduces the control design. Section III describes the validation of the proposed control strategy. Finally, Section IV summarizes the key findings and provides future research directions.
State-Space Formulation and Control Design
This section first presents the system state-space formulation in both continuous and discretized form and then introduces the proposed LEQG stochastic optimal control problem. For simplicity, some preliminaries are stated first. We postulate that every ACC vehicle can receive real-time information (e.g., velocity, acceleration) about its direct predecessor detected by the on-board sensors. We also assume all discretized uncertainties are zero-mean additive white Gaussian noises. Besides, communication delay and vehicular actuation delay are not considered here since they are small compared to the control sampling period [5].
A. Continuous State-Space Formulation
Accordingly, this contribution adopts the state-space formulation for the ACC system proposed by Zhou et al. [18]. The well-known constant time headway (CTH) policy is used in this paper; thus, the target equilibrium spacing at time $t$ is \begin{equation*} s^{\ast }(t)=v(t)\times t^{\ast }+s_{0}\tag{1}\end{equation*} where $v(t)$ is the subject vehicle's speed, $t^{\ast }$ is the desired time headway, and $s_{0}$ is the standstill spacing.
The speed difference is defined as \begin{equation*} \Delta v(t)=v^{\ast }(t)-v(t)\tag{2}\end{equation*}
Using the above-mentioned variables, the system state is defined as \begin{equation*} x(t)=(s(t)-s^{\ast }(t),\Delta v(t))^{T}\tag{3}\end{equation*}
The continuous-time car-following dynamics can then be written in state-space form as \begin{align*} \dot {x}(t)=&Ax(t)+Bu(t)+V(t) \tag{4}\\ A=&\left({\begin{array}{cc} 0 &\quad 1 \\ 0 &\quad 0 \end{array}}\right),\quad B=\left({\begin{array}{c} -t^{\ast } \\ -1 \end{array}}\right)\tag{5}\end{align*} where $u(t)$ is the control input (the acceleration of the subject vehicle) and $V(t)$ is the system disturbance.
B. State-Space Formulation Discretization
Though a continuous system is usually more desirable, in practice systems operate at a certain control frequency. Therefore, to be more realistic, we discretize the continuous ACC system in Eq. (4) by assuming the control input is held constant over each sampling interval of length $t_{s}$ (zero-order hold), which yields \begin{equation*} x_{t+1} =A_{d} x_{t} +B_{d} u_{t} +V_{t}\tag{6}\end{equation*}
where \begin{align*} A_{d}=&e^{At_{s}} \tag{7}\\ B_{d}=&\left({\int _{0}^{t_{s}} {e^{A\tau }} d\tau }\right)B \tag{8}\\ x_{0}=&\mu _{0}\tag{9}\end{align*}
Note that zero-order hold means that within each sampling interval, the control input is time-invariant [29].
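For concreteness, the zero-order-hold discretization of Eqs. (6)-(8) can be computed numerically as in the following Python sketch. The numerical values of the time headway $t^{\ast }$ and the sampling period $t_{s}$ are illustrative assumptions, not the settings used in the later experiments.

```python
import numpy as np
from scipy.linalg import expm

def discretize(A, B, ts):
    """Zero-order-hold discretization of dx/dt = A x + B u (Eqs. 6-8).

    Uses the augmented-matrix-exponential identity, so that
    A_d = exp(A*ts) and B_d = (integral_0^ts exp(A*tau) dtau) B.
    """
    n, m = B.shape
    aug = np.zeros((n + m, n + m))
    aug[:n, :n] = A
    aug[:n, n:] = B
    aug_d = expm(aug * ts)
    return aug_d[:n, :n], aug_d[:n, n:]

# Illustrative values (assumptions, not the paper's parameter table).
t_star, ts = 1.5, 0.1                       # desired time headway [s], sampling period [s]
A = np.array([[0.0, 1.0], [0.0, 0.0]])      # Eq. (5)
B = np.array([[-t_star], [-1.0]])           # Eq. (5)
A_d, B_d = discretize(A, B, ts)             # Eqs. (7)-(8)
```

For the matrices in Eq. (5) this reproduces the closed-form result $A_{d}=\bigl(\begin{smallmatrix}1 & t_{s}\\ 0 & 1\end{smallmatrix}\bigr)$ and $B_{d}=(-t^{\ast }t_{s}-t_{s}^{2}/2,\,-t_{s})^{T}$.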
C. State Feedback Control Strategy
Our design of the stochastic optimal longitudinal control for the ACC system applies the linear exponential-of-quadratic Gaussian control framework. The LEQG control problem, which draws on risk-sensitive control and differential game theory, is an important generalization of the linear quadratic Gaussian (LQG) control problem. LEQG replaces the quadratic cost function of LQG with an exponential function of a quadratic form of the state and the input, hence the name "LEQG". Furthermore, a risk-sensitive parameter is introduced in the LEQG cost function. Beyond being part of the control framework, in this paper the parameter has the realistic meaning of depicting different levels of risk preference for AVs when facing oscillations.
Firstly, the state feedback control strategy is discussed, which is a widely applied technique for adaptive cruise controllers. In this strategy, the observation of the system is assumed to be perfect, and the control input is a linear function of the state, i.e., the state multiplied by a feedback gain. The running cost at each time point $t$ is \begin{equation*} \Upsilon _{t} ={x}'_{t} M_{t} x_{t} +{u}'_{t} N_{t} u_{t}\tag{10}\end{equation*}
where \begin{align*} M_{t} =\left({\begin{array}{cc} \beta _{1,t} &\quad 0 \\ 0 &\quad \beta _{2,t} \end{array}}\right),\quad N_{t} =\omega _{t}\tag{11}\end{align*} with $\beta _{1,t}$, $\beta _{2,t}$, and $\omega _{t}$ being positive weights on the spacing deviation, the speed difference, and the control input, respectively.
The risk-sensitive cost function is then defined as \begin{equation*} J_{\theta } (x_{t},u_{t})=2\theta ^{-1}\log E\left({e^{\frac {1}{2}\theta G}}\right)\tag{12}\end{equation*}
with the cumulative running cost \begin{equation*} G=\sum \nolimits _{t=1}^{T} {\Upsilon _{t}}\tag{13}\end{equation*}
Note that $\theta \ne 0$ is the risk-sensitive parameter and $E(\cdot)$ denotes the expectation over the mixed uncertainties.
The objective of the proposed ACC controller is to minimize the cost function defined by Eq. (12) and Eq. (13), which amounts to regulating the vehicle's velocity and spacing towards their equilibrium values in terms of the running cost, i.e., to keep the system as close as possible to the equilibrium state: \begin{equation*} u^{\ast }=\arg \min \left \{{{J_{\theta } (x_{t},u_{t})} }\right \}\tag{14}\end{equation*}
The value of $\theta$ determines the controller's attitude towards the randomness of the cumulative cost $G$: a second-order expansion of Eq. (12) gives $J_{\theta }\approx E(G)+\frac {\theta }{4}\mathrm {Var}(G)$, so the criterion penalizes or rewards cost variability depending on the sign of $\theta$. In the case $\theta >0$, high-cost realizations are weighted more heavily and the controller behaves in a risk-averse (pessimistic) manner. Whereas in the case $\theta <0$, high-cost realizations are discounted and the controller behaves in a risk-preferring (optimistic) manner.
In particular, the developed control problem reduces to a conventional LQG problem as $\theta \to 0$, in which case the cost function becomes the expected cumulative cost \begin{equation*} J\left ({{x_{t},u_{t}} }\right)=E\left(\sum \nolimits _{t=1}^{T} x'_{t} M_{t} x_{t} +{u}'_{t} N_{t} u_{t}\right)=E(G)\tag{15}\end{equation*}
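To make the role of $\theta$ concrete, the short Python sketch below estimates the criterion of Eq. (12) by Monte Carlo from a set of cumulative-cost samples $G$; the sample distribution is purely hypothetical and only serves to illustrate that $\theta >0$ up-weights high-cost realizations, $\theta <0$ down-weights them, and $\theta \to 0$ recovers the expected cost of Eq. (15).

```python
import numpy as np
from scipy.special import logsumexp

def leqg_cost(G_samples, theta):
    """Monte Carlo estimate of Eq. (12): J_theta = (2/theta) log E[exp(0.5 theta G)]."""
    G = np.asarray(G_samples, dtype=float)
    if abs(theta) < 1e-9:                       # risk-neutral limit, Eq. (15)
        return G.mean()
    # log-mean-exp for numerical stability
    return (2.0 / theta) * (logsumexp(0.5 * theta * G) - np.log(G.size))

# Hypothetical samples of the cumulative cost G (Eq. 13), e.g. from repeated runs.
rng = np.random.default_rng(0)
G = rng.gamma(shape=5.0, scale=2.0, size=100_000)
print(leqg_cost(G, -0.05), leqg_cost(G, 0.0), leqg_cost(G, 0.05))
# Approximately E[G] - Var[G]/80, E[G], E[G] + Var[G]/80, consistent with
# J_theta ~ E[G] + (theta/4) Var[G] for small theta.
```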
Remark 1:
According to [26], the sign of the risk-sensitive parameter $\theta$ determines which of the following two deterministic problems the stochastic LEQG problem is equivalent to.
Scenario 1 ($\theta <0$), in which the disturbance acts cooperatively: \begin{equation*} \mathop {\textrm {minimize}}\limits _{\left \{{{u_{t}} }\right \},\left \{{{V_{t}} }\right \}} \sum \nolimits _{t=1}^{T} \left({x}'_{t} M_{t} x_{t} +{u}'_{t} N_{t} u_{t} + {V}'_{t} Q_{t} V_{t}\right)\tag{16}\end{equation*}
Scenario 2 ($\theta >0$), in which the disturbance acts adversarially: \begin{equation*} \mathop {\min }\limits _{\left \{{{u_{t}} }\right \}} \mathop {\max }\limits _{\left \{{{V_{t}} }\right \}} \sum \nolimits _{t=1}^{T} \left({x}'_{t} M_{t} x_{t} +{u}'_{t} N_{t} u_{t} + {V}'_{t} Q_{t} V_{t}\right)\tag{17}\end{equation*}
Based on Remark 1, we can have a more comprehensive understanding of the function of the risk-sensitive parameter in LEQG.
Correspondingly, when $\theta <0$, the LEQG controller behaves as in Scenario 1: it implicitly assumes that the disturbance helps to reduce the cost, reflecting an optimistic, risk-preferring attitude towards uncertainty. When $\theta >0$, the controller behaves as in Scenario 2: it plans against a disturbance that maximizes the cost, reflecting a pessimistic, risk-averse attitude.
For the LEQG state feedback control problem defined by Eq. (6), Eq. (12) and Eq. (13), the optimal control takes the form [25], [27]:\begin{equation*} u_{\theta,t}^{\ast } =K_{\theta,t} x_{t}\tag{18}\end{equation*}
where the feedback gain is \begin{equation*} K_{\theta,t} =-N_{t}^{-1} {B}'_{d} \times (B_{d} N_{t}^{-1} {B}'_{d} +\tilde {P}_{\theta,t+1}^{-1})^{-1}\times A_{d}\tag{19}\end{equation*}
with the backward recursion \begin{align*} P_{\theta,t}=&M_{t} +A_{d}^{\prime }\times (B_{d} N_{t}^{-1} {B}'_{d} +\tilde {P}_{\theta,t+1}^{-1})^{-1}\times A_{d} \tag{20}\\ \tilde {P}_{\theta,t+1}=&(P_{\theta,t+1}^{-1} -\theta \xi)^{-1},\quad t=0,1,\ldots,T-1\tag{21}\end{align*} where $\xi$ denotes the covariance of the system disturbance.
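A minimal Python sketch of the backward recursion in Eqs. (19)-(21) is given below. The terminal condition $P_{\theta,T}=M_{T}$ and all numerical values are assumptions for illustration; they are not the settings of the experiments in Section III.

```python
import numpy as np

def leqg_state_feedback_gains(A_d, B_d, M, N, Xi, theta, T):
    """Backward recursion for the LEQG state-feedback gains (Eqs. 18-21).

    M, N are the state and input weights (time-invariant here), Xi is the
    covariance of the system disturbance. Terminal condition P_T = M is assumed.
    """
    P = M.copy()
    gains = [None] * T
    N_inv = np.linalg.inv(N)
    for t in range(T - 1, -1, -1):
        P_tilde = np.linalg.inv(np.linalg.inv(P) - theta * Xi)              # Eq. (21)
        core = np.linalg.inv(B_d @ N_inv @ B_d.T + np.linalg.inv(P_tilde))
        gains[t] = -N_inv @ B_d.T @ core @ A_d                              # Eq. (19)
        P = M + A_d.T @ core @ A_d                                          # Eq. (20)
    return gains

# Illustrative usage with the discretized model (closed form for A in Eq. (5)).
ts, t_star = 0.1, 1.5
A_d = np.array([[1.0, ts], [0.0, 1.0]])
B_d = np.array([[-t_star * ts - 0.5 * ts**2], [-ts]])
M = np.diag([1.0, 1.0])            # beta_{1,t}, beta_{2,t} in Eq. (11), assumed
N = np.array([[1.0]])              # omega_t in Eq. (11), assumed
Xi = 0.01 * np.eye(2)              # system-disturbance covariance, assumed
K = leqg_state_feedback_gains(A_d, B_d, M, N, Xi, theta=-0.1, T=300)
u0 = K[0] @ np.array([5.0, -1.0])  # u*_{theta,0} = K_{theta,0} x_0 (Eq. 18)
```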
D. Output Feedback Control Strategy
The precondition for applying the aforementioned state feedback control strategy is that the system has perfect measurements. This is, however, unrealistic in practice due to inevitable measurement uncertainties. Here we introduce the measurement equation considering sensor disturbance as below: \begin{equation*} y(t)=Cx(t)+W(t)\tag{22}\end{equation*}
with \begin{align*} C=\left({\begin{array}{cc} 1 &\quad 0 \\ 0 &\quad 1 \end{array}}\right)\tag{23}\end{align*}
Eq. (22) describes the measurement (observation) equation, in which $y(t)$ is the measurement vector, $C$ is the measurement matrix, and $W(t)$ is the measurement disturbance.
The joint covariance matrix of the two disturbances $V(t)$ and $W(t)$ is \begin{align*} \Delta =\left({\begin{array}{cc} \xi &\quad \gamma \\ {\gamma '} &\quad \Gamma \end{array}}\right)\tag{24}\end{align*}
where \begin{equation*} \xi =H(t),\quad \Gamma =S(t),~\gamma =\text {cov}(V(t),W(t))\tag{25}\end{equation*}
Similarly, we discretize Eq. (22) using the same method as before and obtain: \begin{equation*} y_{t} =C_{d} x_{t} +W_{t}\tag{26}\end{equation*}
with the discretized disturbance covariances \begin{align*} H_{t}=&\int _{0}^{t_{s}} {e^{A\tau }} M(t)d\tau \tag{27}\\ S_{t}=&\frac {t_{s}}{N(t)}\tag{28}\end{align*}
For the LEQG problem defined by Eq. (6), Eq. (12), Eq. (13) and Eq. (26), the optimal output feedback control can be obtained using the separation principle [30], which states that control and estimation can be designed as two independent processes. Therefore, based on this principle, the output feedback control retains the original control framework while replacing the unmeasurable state in the objective function with its estimate. The detailed optimal solution follows: \begin{equation*} u_{\theta,t}^{\ast } =K_{\theta,t} (I-\theta R_{\theta,t} P_{\theta,t})^{-1}\mu _{\theta,t}\tag{29}\end{equation*} where $\mu _{\theta,t}$ denotes the estimated state.
Further, the matrix $R_{\theta,t}$ is computed recursively by \begin{align*} R_{\theta,t+1} =&\,\xi +A_{d} \tilde {R}_{\theta,t+1} A'_{d} -(\gamma +A_{d} \tilde {R}_{\theta,t+1} {C}'_{d}) \\ &\times (\Gamma +C_{d} \tilde {R}_{\theta,t+1} {C}'_{d})^{-1}\times (\gamma +A_{d} \tilde {R}_{\theta,t+1} {C}'_{d})'\tag{30}\end{align*}
where \begin{equation*} \tilde {R}_{\theta,t} =(R_{\theta,t}^{-1} -\theta M_{t})^{-1},\quad t=0,1,\ldots,T-1,\quad R_{\theta,0} =R_{0}\tag{31}\end{equation*}
For state estimation, the Kalman filter, which consists of a prediction stage and an update stage, is applied [31]. We use the superscripts − and + to denote predicted and updated estimates, respectively. Firstly, the predicted state estimate is propagated as \begin{align*} \tilde {\mu }_{\theta,t+1}^{-}=&A_{d} \tilde {\mu }_{\theta,t}^{+} +B_{d} u_{t} \tag{32}\\ \tilde {\mu }_{\theta,t}^{+}=&\tilde {R}_{\theta,t} \times R_{\theta,t}^{-1} \tilde {\mu }_{\theta,t}^{-}\tag{33}\end{align*}
Note that we use a different symbol, $\tilde {\mu }$, for the estimated state to distinguish it from the true state $x_{t}$.
In the update stage, the measurement residual is computed as \begin{equation*} \tilde {z}_{t+1} =y_{t+1} -C_{d} \tilde {\mu }_{\theta,t+1}^{-}\tag{34}\end{equation*}
That is, the filter predicts the measurement as the product of the discretized measurement matrix and the predicted state estimate. The residual is then weighted by the Kalman gain and added to the predicted state estimate to obtain the updated estimate: \begin{equation*} \tilde {\mu }_{\theta,t+1}^{+} =\tilde {\mu }_{\theta,t+1}^{-} +L_{\theta,t+1} \tilde {z}_{t+1}\tag{35}\end{equation*}
where the Kalman gain is \begin{equation*} L_{\theta,t+1} =({\gamma }'+C_{d} \tilde {R}_{\theta,t} {A}'_{d})'\times (\Gamma +C_{d} \tilde {R}_{\theta,t} {C}'_{d})^{-1}\tag{36}\end{equation*}
Note that the output feedback strategy follows the same logic as the state feedback strategy in interpreting the functionality of the risk-sensitive parameter $\theta$.
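The sketch below assembles Eqs. (29)-(36) into one prediction/update cycle. The interleaving of the risk-sensitive adjustment of Eq. (33) with the measurement update, as well as the time indexing of $\tilde {R}_{\theta,t}$, follow one plausible reading of the recursions and should be treated as an assumption rather than the paper's exact implementation.

```python
import numpy as np

def output_feedback_step(mu_plus, R, y_next, u, K, P, A_d, B_d, C_d,
                         M, Xi, Gamma, gamma, theta):
    """One cycle of the output-feedback LEQG controller (Eqs. 29-36), sketched.

    mu_plus : current updated state estimate (column vector)
    R       : current R_{theta,t} matrix; M is the state weight of Eq. (11)
    Xi, Gamma, gamma : disturbance covariances from Eqs. (24)-(25)
    """
    R_tilde = np.linalg.inv(np.linalg.inv(R) - theta * M)                 # Eq. (31)
    mu_tilt = R_tilde @ np.linalg.solve(R, mu_plus)                       # Eq. (33)
    mu_minus = A_d @ mu_tilt + B_d @ u                                    # Eq. (32)
    S = Gamma + C_d @ R_tilde @ C_d.T
    L = (gamma.T + C_d @ R_tilde @ A_d.T).T @ np.linalg.inv(S)            # Eq. (36)
    mu_plus_next = mu_minus + L @ (y_next - C_d @ mu_minus)               # Eqs. (34)-(35)
    G = gamma + A_d @ R_tilde @ C_d.T
    R_next = Xi + A_d @ R_tilde @ A_d.T - G @ np.linalg.inv(S) @ G.T      # Eq. (30)
    n = R.shape[0]
    u_next = K @ np.linalg.solve(np.eye(n) - theta * R_next @ P, mu_plus_next)  # Eq. (29)
    return mu_plus_next, R_next, u_next
```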
E. Expensive and Non-Expensive Control Mode
To better accommodate personalized comfort requirements, two control modes are proposed below.
1) Expensive Control Mode
Controlling a time-invariant dynamic system under a heavy penalty on the input amplitude is known as expensive control, since the control cost of the input dominates the objective. In our setting, this corresponds to choosing the input weight $N_{t}$ to be large relative to the state weight $M_{t}$, which penalizes large accelerations and thus avoids sudden starts and stops.
2) Non-Expensive Control Mode
When the input weight $N_{t}$ is instead set relatively small compared with the state weight $M_{t}$, large accelerations are only mildly penalized and the controller regulates the spacing and speed deviations more aggressively; we refer to this as the non-expensive control mode.
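A minimal configuration of the two modes could look as follows; the numerical weights are illustrative assumptions, and only their relative magnitudes matter for switching between comfort-oriented (expensive) and regulation-oriented (non-expensive) behavior.

```python
import numpy as np

# State weight M_t (Eq. 11); values assumed for illustration.
M = np.diag([1.0, 1.0])

# Input weight N_t (Eq. 11): large -> expensive mode, small -> non-expensive mode.
N_expensive = np.array([[10.0]])       # heavy penalty on acceleration, smoother ride
N_non_expensive = np.array([[0.1]])    # mild penalty, faster regulation to equilibrium
```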
Based on the methodology given above, and apart from the distinction between the state feedback and output feedback cases, the proposed ACC framework can represent six different AV driving behaviors, as shown in Fig. 1. In Fig. 1, the first-level branch corresponds to the control mode and the second-level branch distinguishes the categories of risk sensitivity. For example, risk-preferring driving behavior under the expensive control mode implies that the ACC controller takes an optimistic attitude towards uncertainties while the magnitude of the acceleration actions is limited during the control period.
Experiments and Results Analysis
To validate the performance of the proposed stochastic linear optimal control method, numerical simulation experiments have been conducted, since field tests are expensive and beyond the scope of this paper. This section includes four parts. We first describe the experiment set-up, initializing the conditions and the default parameter values, in Section III.A. Based on that, we systematically design multiple scenarios with different control parameters to show that the proposed framework can produce heterogeneous driving behaviors. Specifically, we conduct a sensitivity analysis to compare the performance of the state feedback case and the output feedback case under different risk-sensitivity parameters in Section III.B. We then analyze the heterogeneous driving behaviors caused by the joint impact of risk sensitivity and disturbance magnitude in Section III.C. Finally, the two control modes, adopting the output feedback strategy with varied values of risk sensitivity, are discussed in Section III.D.
A. Experiment Set-Up
As mentioned before, we use simulation experiments to validate the proposed algorithm. Table 1 gives the default values of the corresponding parameters, including the simulation study period.
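For reference, a closed-loop simulation of the discretized system in Eq. (6) under the state-feedback law of Eq. (18) can be organized as in the sketch below; the initial condition and noise level in the example call are illustrative assumptions and do not reproduce Table 1.

```python
import numpy as np

def simulate(A_d, B_d, gains, x0, Xi, rng):
    """Closed-loop simulation of Eq. (6) under u_t = K_{theta,t} x_t (Eq. 18)."""
    x = np.array(x0, dtype=float)
    xs, us = [x.copy()], []
    for K_t in gains:
        u = float(K_t @ x)                                        # Eq. (18)
        v = rng.multivariate_normal(np.zeros(x.size), Xi)         # zero-mean Gaussian V_t
        x = A_d @ x + B_d.flatten() * u + v                       # Eq. (6)
        xs.append(x.copy())
        us.append(u)
    return np.array(xs), np.array(us)

# Example call, reusing A_d, B_d, Xi and the gain sequence K from Section II:
# rng = np.random.default_rng(1)
# traj, inputs = simulate(A_d, B_d, K, x0=[5.0, -1.0], Xi=Xi, rng=rng)
```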
B. State Feedback Strategy and Output Feedback Strategy Performance Comparison Under Same Intensities of Mixed Uncertainties
We first conducted a sensitivity analysis comparing the control performance of the output feedback case and the state feedback case under varied variances of both the system disturbance and the measurement disturbance, to investigate the joint impact of risk preference and disturbance intensity on control performance. The test range of the measurement disturbance is set according to an authoritative report by the National Highway Traffic Safety Administration [32], which documents the accuracy ranges of radar measurements of distance and velocity.
The relative change percentage of the total cost is chosen as an indicator to compare the performance with and without considering measurement noise, calculated as follows: \begin{equation*} \text {Relative change percentage} =\frac {J_{\theta,\text {State}} -J_{\theta,\text {Output}}}{J_{\theta,\text {Output}}} \times 100\%\tag{37}\end{equation*}
Figure: Sensitivity analysis of disturbances.
Figure: Performance comparison for the state feedback case and the output feedback case.
C. Heterogeneous Driving Behavior Caused by Joint Impact of Risk Sensitivity and Magnitudes of Disturbances
Since vehicles with different risk sensitivities react heterogeneously to different magnitudes of disturbances, here we investigate the heterogeneous driving behavior caused by the joint impact of risk sensitivity and disturbance magnitude. Since the experiments in the previous part verified that the output feedback case outperforms the state feedback case, we no longer discuss the latter in the following experiments. Further, we extend the behavior analysis using the default values given in Table 1 but varying the intensity of the system disturbance. The cumulative state deviation and the cumulative control input over the study period are used as performance indicators: \begin{align*} C_{\theta,\textrm {State deviation}}=&\sum \nolimits _{t=1}^{T} {x_{t,\theta }} {x}'_{t,\theta } \tag{38}\\ C_{\theta,\textrm {Control input}}=&\sum \nolimits _{t=1}^{T} {u_{t,\theta }} {u}'_{t,\theta }\tag{39}\end{align*}
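The two indicators of Eqs. (38)-(39) can be computed directly from a simulated trajectory, for example as in the following sketch (a scalar control input is assumed):

```python
import numpy as np

def cumulative_indicators(xs, us):
    """Cumulative state deviation and control input, Eqs. (38)-(39)."""
    C_state = sum(np.outer(x, x) for x in xs)   # Eq. (38): sum of x_t x_t'
    C_input = float(sum(u * u for u in us))     # Eq. (39): scalar-input case
    return C_state, C_input
```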
Figure: Performance analysis for the joint impact of risk sensitivity and magnitudes of system disturbances.
Fig. 4 shows that the system states converge from the initial condition to the equilibrium state.
Therefore, for a more in-depth analysis, we study the asymmetric controller response to the risk-sensitive parameter $\theta$.
Figure: Asymmetric controller response to the risk-sensitive parameter.
D. Heterogeneous Driving Behavior Caused by Joint Impact of Risk Sensitivity and Expensive/Non-Expensive Control Mode
The expensive and non-expensive control modes are realized by changing the values of the predetermined weight matrices $M_{t}$ and $N_{t}$ in the running cost of Eq. (10).
Figure: Performance comparison for the non-expensive and expensive control modes with different risk-sensitive parameters.
Comparing horizontally, the spacing converges to the equilibrium value faster in the non-expensive control mode than in the expensive mode. Meanwhile, the relative speed error fluctuates within a smaller range in the non-expensive mode than in the expensive mode.
This is mainly because the selectable range of acceleration in the expensive control mode shrinks due to the larger input weight and the objective of minimizing the cost function. Hence, the system approaches the equilibrium state more slowly, and the smaller rate of speed change makes speed oscillations more difficult to recover from. Studying the figure vertically gives more insight into the impact of the risk parameter on control smoothness after convergence. The best smoothness performance occurs in the risk-preferring case.
Conclusion
Diverse driving preferences regarding risk call for the design of personalized ACC controllers that account for the various risk sensitivities of human drivers. In this paper, an optimal control algorithm for ACC that depicts the driver's risk-resistance characteristics is presented based on the LEQG control framework. The proposed algorithm can qualify and quantify an AV's risk-sensitivity preference under mixed disturbances as well as incorporate driving comfort. With different settings of the risk-sensitive parameter and control mode, six categories of heterogeneous AV driving behavior when facing disturbances can be interpreted. To validate this contribution, a sensitivity analysis and several tests are conducted. Overall, the control performance of the proposed algorithm is satisfying. According to the simulation results, risk sensitivity, disturbance magnitude, and control mode are all effective factors in trajectory generation. Future work will extend the current framework to the CACC case by allowing cooperation among multiple vehicles (such as [33]–[35]).