Complex and high-dimensional dynamical systems like legged robots interacting with unknown environments grapple with various uncertainties in the form of unplanned contacts and unknown terrains. Developing robust control solutions that accommodate uncertainties explicitly initiates the first step toward safe and reliable real-world deployment of legged robots. This has a couple of benefits. For instance, it may be impractical to know all the properties of the terrain a legged robot is navigating ab initio to inform the control algorithms, even in the presence of visual sensors. In these scenarios, robust control algorithms designed to accommodate unplanned environmental interactions can extend the variety of terrains a legged robot can negotiate. Furthermore, robust control algorithms can expedite the transfer of algorithms from simulation to reality, potentially circumventing the assumptions made by typical physics-based simulation environments [1].
Toward this objective, this paper presents a robust control approach that can explicitly accommodate disturbances. We propose an algorithm that builds on the conventional optimization-based control stack, especially convex quadratic programming (QP)-based model predictive controllers (MPCs). Under mild conditions, we study the continuous differentiability properties of MPC laws and closed-loop nonlinear discrete-time systems on an open neighborhood of gaits. We then linearize the resulting closed-loop systems around the gait to obtain a linear time-varying (LTV) system. Using suitable coordinate transformation via discrete-time Floquet theory [2], [3], the resulting periodic LTV system is transformed into a linear system with a constant state-transition matrix. This system can then be analyzed using the $\mathcal {H}_{2}$ and $\mathcal {H}_\infty$ control methods subject to parametric uncertainty, and robust state feedback laws can be obtained as solutions to linear matrix inequalities (LMIs) [4]. The theoretical contributions are extended to the single rigid body (SRB) template dynamics of quadrupedal robots and verified numerically. The proposed controllers are then extensively validated for robust locomotion of the A1 quadrupedal robot both through simulations and experiments.
A. Background and Related Work
Real-time and robust gait planning for full-order dynamical models of locomotion is a challenge arising from nonlinearities, the hybrid nature of models, underactuation, and high dimensionality. Models of legged locomotion are hybrid with continuous-time domains representing the Lagrangian dynamics and discrete-time transitions representing the change of contact points with the environment [5], [6], [7], [8], [9], [10]. Although the direct-collocation-based trajectory optimization techniques generate trajectories for full-order and nonlinear locomotion models in a fast manner [11], [12], [13], [14], [15], they cannot be solved in real-time. Reduced-order models (i.e., templates) [16] present low-dimensional representations for complex locomotion models. Existing control algorithms for the real-time planning and control of legged robots predominantly use optimization-based techniques, particularly MPCs. These MPCs typically use a template model such as linear inverted pendulum (LIP) [17], [18], [19], [20], [21], SRB dynamics [22], [23], [24], [25], [26], and centroidal dynamics [27], [28], [29].
Designing robust optimal MPC solutions for periodic gaits of these template systems is challenging, particularly because state-of-the-art $\mathcal {H}_{2}$ and $\mathcal {H}_\infty$ approaches pertain to the robust stabilization of equilibrium points for ordinary differential equations (see, e.g., [30], [31]) but not periodic solutions (i.e., gaits) of template models with MPCs. The situation is exacerbated with template models such as SRB, where the control inputs are switched depending on the robot's gait. Other approaches considering uncertainty directly at the MPC level can be sectioned into two categories: robust MPC (RMPC) and stochastic MPC (SMPC). Popular RMPC approaches include closed-loop (feedback) min-max MPC [32], [33], open-loop min-max MPC [34], [35], and tube MPC [36], [37]. While open-loop MPC often produces conservative control actions, feedback MPC mitigates this issue by optimizing over a set of control policies rather than control actions. However, determining a suitable control policy beforehand is prohibitively expensive, especially in the context of real-time systems [36]. On the other hand, tube-based MPC and SMPC rely on the offline computation of robust invariant (RI) tubes and pre-stabilizing feedback control laws for chance constraints, respectively. Recently, SMPC and tube-based MPC have been utilized to address uncertainty in legged locomotion using LIP [38], [39] and centroidal models [40]. Alternatively, our previous work [26] considered integrating model-free reinforcement learning (RL) techniques with an RMPC algorithm to improve the rough terrain locomotion of quadrupedal robots.
Within legged locomotion and nonlinear control theory, several robust control solutions are proposed. For instance, to account for uncertainties in the ground height variations, feedback control solutions have been developed using either a finite state machine [41] or through robust Lyapunov functions [42] by considering ground height variations as disturbances in the context of input-to-state stability (ISS). Control Lyapunov functions (CLFs) have also been used to generate input-to-state stable control laws for the stabilization of bipedal [43] and quadrupedal robots [44]. Linear and bilinear matrix inequalities have also been used to generate robust centralized [45] and decentralized controllers [46] for legged robots. The current work differs from our previous work [45], [46] in that the proposed controller operates in a real-time MPC fashion instead of slow event-based controllers.
B. Goal, Objectives, and Contributions
The overarching goal of this paper is to present a systematic approach to developing robust $\mathcal {H}_{2}$- and $\mathcal {H}_\infty$-optimal predictive controllers for periodic legged locomotion. In this regard, the objectives and key contributions of the paper are as follows. The paper develops a robust state feedback control law that builds on the existing MPC architectures, thereby acting as a complementary control solution. The paper presents sufficient conditions for continuous differentiability of closed-loop nonlinear discrete-time template models subject to MPCs on an open neighborhood of periodic gaits. We then analyze the conditions under which the discrete-time closed-loop template dynamics with MPC can be transformed into a linear system with a constant state-transition matrix using Floquet theory. The system is further analyzed to accommodate parametric uncertainties and to synthesize robust optimal $\mathcal {H}_{2}$- and $\mathcal {H}_\infty$ feedback controllers via LMIs. The paper extends the theoretical results to the SRB dynamics of quadrupedal locomotion with closed-form Jacobian matrices and numerically verifies them. The proposed robust MPC laws are then used in a layered control structure, where the optimal reduced-order trajectories prescribed by the robust predictive controllers are provided to a full-order and nonlinear whole-body controller (WBC) for tracking at the low level (see Fig. 1). The developed layered control structures are rigorously validated using extensive numerical and experimental studies for the robust locomotion of the A1 quadrupedal robot at various speeds subject to disturbances and rough terrains. Our numerical results indicate an improvement of around 50% for the $\mathcal {H}_\infty$-optimal MPC compared to the normal MPC for locomotion subject to randomly generated external forces and another 15% improvement with the $\mathcal {H}_{2}$-optimal MPC over the $\mathcal {H}_\infty$-optimal MPC. The paper additionally conducts a numerical comparison to evaluate the efficacy of the proposed robust-optimal MPC laws against the RL-based RMPC approach outlined in [26]. Our numerical findings reveal a notable enhancement of 20% compared to the RL-based RMPC in the presence of unknown and randomly generated rough terrains.
C. Organization
The paper is organized as follows. Section II presents nonlinear template models, including SRB, MPC law, fundamental assumptions, and the problem statement for the robust stabilization of gaits. Section III formally addresses the properties of the closed-loop system, including continuous differentiability, the Floquet transform, and the synthesis of LMIs. Section IV analyzes the SRB template model in more detail and applies the theoretical results. Section V presents the layered control structure with the high-level, robust trajectory planner and the low-level nonlinear WBC. Section VI presents the extensive numerical and experimental validation of the proposed layered control structure on the A1 quadrupedal robot. It also discusses the limitations of the approach. Finally, Section VII presents concluding remarks and future research directions.
SECTION II.
Template Dynamics and MPC Law
This section aims to present the template dynamics, MPC law for periodic gaits of templates, assumptions, and the problem statement. Without loss of generality, we consider the template-based SRB dynamics as follows:
\begin{align*}
\Sigma :\left\lbrace \begin{array}{ll} \ddot{r}=\frac{f^{\text {net}}}{m}-g_{0}\\
\dot{R}=R\,\hat{\omega }\\
I\,\dot{\omega }+\hat{\omega }\,I\,\omega =R^\top \tau ^{\text {net}}, \end{array} \right. \tag{1}
\end{align*}
View Source
\begin{align*}
\Sigma :\left\lbrace \begin{array}{ll} \ddot{r}=\frac{f^{\text {net}}}{m}-g_{0}\\
\dot{R}=R\,\hat{\omega }\\
I\,\dot{\omega }+\hat{\omega }\,I\,\omega =R^\top \tau ^{\text {net}}, \end{array} \right. \tag{1}
\end{align*}
where $r \in \mathbb {R}^{3}$ denotes the position of the center of mass (COM) of the robot in an inertial world frame, $R\in \text {SO}(3)$ represents the rotation matrix of the body frame with respect to the world frame, $\omega \in \mathbb {R}^{3}$ denotes the angular velocity in the body frame, $m$ is the total mass, $g_{0}$ is the acceleration due to gravity, and $I\in \mathbb {R}^{3\times 3}$ is the positive definite inertia of the robot in the body frame (see Fig. 2). Furthermore, $\hat{(\cdot)}:\mathbb {R}^{3}\rightarrow \mathfrak {so}(3)$ is the skew-symmetric operator with the property $\hat{a}\,b=a\times b$ for all $a,b\in \mathbb {R}^{3}$. In our notation, $f^{\text {net}}$ and $\tau ^{\text {net}}$ denote the total net force and torque generated by the legs, respectively, and are expressed in the world frame. In other words, the net wrench is given by
\begin{align*}
\begin{bmatrix}f^{\text {net}}\\
\tau ^{\text {net}} \end{bmatrix}:=\sum _{j\in \mathcal {C}} \begin{bmatrix}f_{j}\\
\hat{r}_{j}\,f_{j} \end{bmatrix}, \tag{2}
\end{align*}
View Source
\begin{align*}
\begin{bmatrix}f^{\text {net}}\\
\tau ^{\text {net}} \end{bmatrix}:=\sum _{j\in \mathcal {C}} \begin{bmatrix}f_{j}\\
\hat{r}_{j}\,f_{j} \end{bmatrix}, \tag{2}
\end{align*}
where $j\in \mathcal {C}$ represents the stance foot index and $\mathcal {C}\subseteq \lbrace 1,\ldots,n_{\text {legs}}\rbrace$ is the set of contacting feet with the ground with $n_{\text {legs}}$ representing the total number of legs for the robot (e.g., $n_{\text {legs}}=4$ for quadrupedal robots). In addition, $f_{j}\in \mathbb {R}^{3}$ represents the ground reaction force (GRF) at the stance foot $j\in \mathcal {C}$, and $r_{j}\in \mathbb {R}^{3}$ denotes the distance between the stance foot $j$ and the COM. The state variables for the SRB dynamics can be taken as $x:=\text {col}(r,\dot{r},\alpha,\omega)$, where $\alpha \in \mathbb {R}^{3}$ represents the body's Euler angles (e.g., roll, pitch, and yaw) parameterizing the rotation matrix and “$\text {col}$” denotes the column operator. The control inputs are further chosen as the GRFs at the stance feet, i.e., $u:=\text {col}\lbrace f_{j}:\,j\in \mathcal {C}\rbrace$.
Most template models, including SRB and centroidal dynamics, can be discretized and written in a general nonlinear form as
\begin{align*}
x_{k+1} = g^{\text {ol}}(x_{k},u_{k}),\quad k=0,1,{\ldots }, \tag{3}
\end{align*}
View Source
\begin{align*}
x_{k+1} = g^{\text {ol}}(x_{k},u_{k}),\quad k=0,1,{\ldots }, \tag{3}
\end{align*}
where $x_{k} \in \mathcal {X} \subset \mathbb {R}^{n_{x}}$ and $u_{k} \in \mathcal {U} \subset \mathbb {R}^{n_{u}}$ denote the state variables and control inputs, respectively, for some positive integers $n_{x}$ and $n_{u}$, and $k\in \mathbb{Z}_{\geq 0}:=\lbrace 0,1,\cdots \rbrace$ is the time sample. Furthermore, the superscript “$\text {ol}$” stands for the open-loop system, and $\mathcal {X}$ and $\mathcal {U}$ are polyhedra representing the feasible states and admissible control inputs. More specifically, they can be expressed as an intersection of a finite set of closed halfspaces in $\mathbb {R}^{n_{x}}$ and $\mathbb {R}^{n_{u}}$ as follows:
\begin{align*}
\mathcal {X}&:=\left\lbrace x\in \mathbb {R}^{n_{x}}: V^{x}\,x\leq W^{x}\right\rbrace \\
\mathcal {U}&:=\left\lbrace u\in \mathbb {R}^{n_{u}}: V^{u}\,u\leq W^{u}\right\rbrace, \tag{4}
\end{align*}
View Source
\begin{align*}
\mathcal {X}&:=\left\lbrace x\in \mathbb {R}^{n_{x}}: V^{x}\,x\leq W^{x}\right\rbrace \\
\mathcal {U}&:=\left\lbrace u\in \mathbb {R}^{n_{u}}: V^{u}\,u\leq W^{u}\right\rbrace, \tag{4}
\end{align*}
where $V^{x}$ and $V^{u}$ represent some appropriate matrices and $W^{x}$ and $W^{u}$ denote some appropriate vectors. For the SRB dynamics (1), $\mathcal {U}$ can be taken as the linearized friction cone $\mathcal {FC}$, where $\mathcal {FC}:=\lbrace f=\text {col}(f_{x},f_{y},f_{z})\in \mathbb {R}^{3}:\,|f_{x}|\leq \frac{\mu _{\text {friction}}}{\sqrt{2}}\, f_{z},\, |f_{y}|\leq \frac{\mu _{\text {friction}}}{\sqrt{2}}\, f_{z},\, f_{z}\geq 0\rbrace$ with $\mu _{\text {friction}}$ being the friction coefficient.
The physics-based template dynamics (i.e., $g^{\text {ol}}$) are usually smooth (i.e., $\mathcal {C}^{\infty }$) with respect to $(x,u)$. We are interested in MPC-based feedback control solutions subject to linearized template models that typically take the form of a finite-dimensional and strictly convex QP as follows:
\begin{align*}
& \min _{(x(\cdot),u(\cdot))}\quad p\left(x_{k+N|k}\right)+\sum _{i=0}^{N-1}\mathcal {L}\left(x_{k+i|k},u_{k+i|k}\right)\\
& \quad \text {s.t.}\quad \quad x_{k+i+1|k}=A_{\text {op}}\,x_{k+i|k} + B_{\text {op}}\,u_{k+i|k}+d_{\text {op}}\\
& \qquad \qquad x_{k+i|k}\in \mathcal {X}, \quad \quad i=1,\ldots,N\\
& \qquad \qquad u_{k+i|k}\in \mathcal {U}, \quad \quad \, i=0,\ldots,N-1, \tag{5}
\end{align*}
View Source
\begin{align*}
& \min _{(x(\cdot),u(\cdot))}\quad p\left(x_{k+N|k}\right)+\sum _{i=0}^{N-1}\mathcal {L}\left(x_{k+i|k},u_{k+i|k}\right)\\
& \quad \text {s.t.}\quad \quad x_{k+i+1|k}=A_{\text {op}}\,x_{k+i|k} + B_{\text {op}}\,u_{k+i|k}+d_{\text {op}}\\
& \qquad \qquad x_{k+i|k}\in \mathcal {X}, \quad \quad i=1,\ldots,N\\
& \qquad \qquad u_{k+i|k}\in \mathcal {U}, \quad \quad \, i=0,\ldots,N-1, \tag{5}
\end{align*}
where $N$ is the finite control horizon, $(A_{\text {op}},B_{\text {op}})$ represents a Jacobian linearization of the nonlinear template model (3) around a current operating point, $d_{\text {op}}$ denotes an offset in the linearization, and $p(x)$ and $\mathcal {L}(x,u)$ represent the terminal and stage costs, respectively, defined as $p(x_{k+N|k}):=\Vert x_{k+N|k}-x^{\text {des}}_{k+N|k}\Vert _{P}^{2}$ and $\mathcal {L}(x,u):=\Vert x_{k+i|k}-x^{\text {des}}_{k+i|k}\Vert _{Q}^{2} + \Vert u_{k+i|k}-u^{\text {des}}_{k+i|k}\Vert _{R}^{2}$ for some positive definite matrices $P$, $Q$, and $R$ and some desired state and control trajectories $x^\text {des}(\cdot)$ and $u^\text {des}(\cdot)$. In our notation, $\Vert z\Vert _{Q}^{2}:=z^\top Q\,z$, and $x_{k+i|k}$ and $u_{k+i|k}$ denote the state and control at time $k+i$ computed at time $k$ with the initial condition of $x_{k|k}=x_{k}$. In addition, $x(\cdot)$ and $u(\cdot)$ represent the predicted state and control trajectories over the control horizon, i.e., $x(\cdot):=\text {col}\lbrace x_{k+i|k}\,|\,i=1,\ldots,N\rbrace$ and $u(\cdot):=\text {col}\lbrace u_{k+i|k}\,|\,i=0,\ldots,N-1\rbrace$. The nominal MPC feedback solution is taken as the first optimal input denoted by $u_{k}=\pi (k,x_{k})$. By applying this time-varying control law, the evolution of the template model is described by the following nonlinear time-varying closed-loop dynamics
\begin{align*}
x_{k+1}=g^{\text {cl}}(k,x_{k}):=g^{\text {ol}}(x_{k},\pi (k,x_{k})). \tag{6}
\end{align*}
View Source
\begin{align*}
x_{k+1}=g^{\text {cl}}(k,x_{k}):=g^{\text {ol}}(x_{k},\pi (k,x_{k})). \tag{6}
\end{align*}
We now make the following assumptions.
Assumption 1 (Periodic Orbit):
There exists a periodic orbit $\mathcal {O}$, referred to as the gait, for the nonlinear time-varying closed-loop system $x_{k+1}=g^{\text {cl}}(k,x_{k})$ with the fundamental period of $M$. For future purposes, the evolution of the state variables and the MPC feedback law on the orbit $\mathcal {O}$ is given by the periodic functions $x^{\star }_{k}$ and $u^{\star }_{k}$, respectively, with the property $x_{k+M}^{\star }=x_{k}^{\star }$ and $u_{k+M}^{\star }=u_{k}^{\star }$ for all $k\in \mathbb{Z}_{\geq 0}$.
Assumption 2 (Feasibility of the Orbit):
We further suppose that for every initial condition $x_{k}^{\star }$ on the orbit $\mathcal {O}$, the corresponding predicted state and control trajectories $(x(\cdot),u(\cdot))$ in the MPC formulation (5) strictly belong to the interior of the feasibility sets $\mathcal {X}$ and $\mathcal {U}$, i.e., $x_{k+i|k}\in \text {int}(\mathcal {X})$ for $i=1,\ldots,N$ and $u_{k+i|k}\in \text {int}(\mathcal {U})$ for $i=0,\ldots,N-1$, where “$\text {int}$” denotes the interior of a set.
Assumption 2 will allow us to study some fundamental properties of the nominal MPC law and the closed-loop dynamics in Section III. In particular, it will let us locally linearize the closed-loop dynamics around the gait and study the robust stabilization problem. With these assumptions in hand, we are now interested in robust stabilization of the periodic orbit $\mathcal {O}$ in the presence of some unknown external disturbances, denoted by $w_{k}\in \mathbb {R}^{n_{w}}$ for some positive integer $n_{w}$. In particular, we assume that the original template dynamics are subject to some external force or wrench disturbances and rewrite them as $x_{k+1}=g^{\text {ol}}(x_{k},u_{k},w_{k})$. We then alter the control input to $u_{k}=\pi (k,x_{k})+v_{k}$ and study the following closed-loop system subject to disturbances
\begin{align*}
\Sigma ^{\text {cl}}:\left\lbrace \begin{array}{ll} x_{k+1} = g^{\text {cl}}\left(k,x_{k},v_{k},w_{k}\right),\quad k=0,1,\cdots \\
z_{k} = c(x_{k}), \end{array} \right. \tag{7}
\end{align*}
View Source
\begin{align*}
\Sigma ^{\text {cl}}:\left\lbrace \begin{array}{ll} x_{k+1} = g^{\text {cl}}\left(k,x_{k},v_{k},w_{k}\right),\quad k=0,1,\cdots \\
z_{k} = c(x_{k}), \end{array} \right. \tag{7}
\end{align*}
where $v_{k}$ is the auxiliary input to be designed, $z_{k}:=c(x_{k})\in \mathbb {R}^{n_{z}}$ represents a smooth, measured performance in terms of state variables for a positive integer $n_{z}$, and $g^{\text {cl}}(k,x_{k},v_{k},w_{k}):=g^{\text {ol}}(x_{k},\pi (k,x_{k})+v_{k},w_{k})$.
Problem 1 (Robust Stabilization):
The robust stabilization problem of the periodic orbit $\mathcal {O}$ consists of designing the auxiliary control input $v_{k}$ such that the effect of the unknown disturbance input $w_{k}$ on the measured performance $z_{k}=c(x_{k})$ is attenuated.
SECTION III.
Main Results and Robust Optimal MPC
This section presents the main results of the paper. In particular, it first analyzes some properties of the MPC laws and closed-loop dynamics. It then presents a systematic approach to synthesize $\mathcal {H}_{2}$- and $\mathcal {H}_{\infty }$-optimal MPC laws that address the robust stabilization of the periodic orbits in Problem 1. To study the properties of the MPC law, we present the following theorem that addresses the continuous differentiability of the closed-loop dynamics on an open neighborhood of the periodic orbit $\mathcal {O}$.
Theorem 1 (Properties of the Closed-Loop System):
Under Assumptions 1 and 2, the following statements hold.
The orbit $\mathcal {O}$ is a periodic orbit for the unperturbed closed-loop system $x_{k+1}=g^{\text {cl}}(k,x_{k},0,0)$ with $v_{k}\equiv 0$ and $w_{k}\equiv 0$.
There is an open neighborhood of the periodic orbit $\mathcal {O}$, denoted by $\mathcal {N}(\mathcal {O})\subset \mathbb {R}^{n_{x}}$ such that there exists a unique solution to the QP (5) for every initial condition $x_{k|k}\in \mathcal {N}(\mathcal {O})$ and for every $k\in \mathbb{Z}_{\geq 0}$. Furthermore, the MPC law and the unperturbed closed-loop system $g^{\text {cl}}(k,x,0,0)$ are continuously differentiable (i.e., $\mathcal {C}^{1}$) with respect to the state $x$ on $\mathcal {N}(\mathcal {O})$ for all $k\in \mathbb{Z}_{\geq 0}$.
Proof:
Part 1 is an immediate result of Assumption 1. For Part 2, the QP in (5) can be expressed as an optimization problem whose cost and constraints are parameterized by the initial condition of $x_{k|k}$. We want to show that the solution of this QP is unique and $\mathcal {C}^{1}$ with respect to $x_{k|k}$ on an open neighborhood of the orbit. On the orbit $\mathcal {O}$, the inequality constraints become inactive from Assumption 2. Hence, the active constraints are only reduced to equality constraints. These equality constraints, on the other hand, represent the prediction subject to the linearized dynamics of the template models. From [47], it can be shown that the gradients matrix of the equality constraints with respect to the decision variables of the QP, i.e., $(x(\cdot),u(\cdot))$, has the following sparsity structure
\begin{align*}
A_{\text {eq}}:=\begin{bmatrix}\mathbb {I}& 0 & \cdots & 0 & 0 & -B_{\text {op}} & \cdots & 0\\
-A_{\text {op}} & \mathbb {I}& \cdots & 0 & 0 & 0 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \ldots & -A_{\text {op}} & \mathbb {I}& 0 & \ldots & -B_{\text {op}} \end{bmatrix} \tag{8}
\end{align*}
View Source
\begin{align*}
A_{\text {eq}}:=\begin{bmatrix}\mathbb {I}& 0 & \cdots & 0 & 0 & -B_{\text {op}} & \cdots & 0\\
-A_{\text {op}} & \mathbb {I}& \cdots & 0 & 0 & 0 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \ldots & -A_{\text {op}} & \mathbb {I}& 0 & \ldots & -B_{\text {op}} \end{bmatrix} \tag{8}
\end{align*}
which has the full row rank. Here, $\mathbb {I}$ represents the identity matrix. Hence, the gradients of the active constraints are linearly independent. In addition, the cost function in (5) is smooth (i.e., $\mathcal {C}^{\infty }$), and strictly convex, and the second-order sufficient conditions for optimality for every initial condition on the orbit are met. More specifically, the Hessian matrix of the Lagrangian for the QP on the orbit is positive definite. Consequently, the sufficient conditions of Fiacco's Theorem [48, Theorem 2.1] are satisfied. This guarantees the existence, uniqueness, and $\mathcal {C}^{1}$ continuity of the solutions of the QP (i.e., MPC solution) with respect to the parameters vector $x_{k|k}$ on an open neighborhood of the orbit. The continuous differentiability of the closed-loop dynamics is immediate. This completes the proof of Part 2.$\hfill \blacksquare$
From Theorem 1, one can differentiate the closed-loop dynamics (7) around the augmented orbit
\begin{align*}
\mathcal {O}_{a}:=\left\lbrace (x_{k},v_{k},w_{k}) : x_{k}=x_{k}^{\star },\, v_{k}\equiv 0,\,w_{k}\equiv 0 \right\rbrace
\end{align*}
View Source
\begin{align*}
\mathcal {O}_{a}:=\left\lbrace (x_{k},v_{k},w_{k}) : x_{k}=x_{k}^{\star },\, v_{k}\equiv 0,\,w_{k}\equiv 0 \right\rbrace
\end{align*}
to obtain a periodic LTV system as follows:
\begin{align*}
\delta x_{k+1} &= A_{k}^{{\text {cl}}}\, \delta x_{k} + B_{k}\,v_{k}+E_{k}\,w_{k}, \quad k=0,1,\cdots \\
\delta z_{k} &= C_{k} \, \delta x_{k} \tag{9}
\end{align*}
View Source
\begin{align*}
\delta x_{k+1} &= A_{k}^{{\text {cl}}}\, \delta x_{k} + B_{k}\,v_{k}+E_{k}\,w_{k}, \quad k=0,1,\cdots \\
\delta z_{k} &= C_{k} \, \delta x_{k} \tag{9}
\end{align*}
with the property $A_{k+M}^{\text {cl}}=A_{k}^{\text {cl}}$, $B_{k+M}=B_{k}$, $E_{k+M}=E_{k}$, and $C_{k+M}=C_{k}$, where
\begin{align*}
A^{\text {cl}}_{k} := \frac{\partial g^{\text {cl}}}{\partial x}(k,x_{k}^{\star },0,0)=\left(\frac{\partial g^{\text {ol}}}{\partial x} + \frac{\partial g^{\text {ol}}}{ \partial u}\, \frac{\partial \pi }{\partial x} \right)\Big {|}_{x=x_{k}^{\star },u=u_{k}^{\star }}
\end{align*}
View Source
\begin{align*}
A^{\text {cl}}_{k} := \frac{\partial g^{\text {cl}}}{\partial x}(k,x_{k}^{\star },0,0)=\left(\frac{\partial g^{\text {ol}}}{\partial x} + \frac{\partial g^{\text {ol}}}{ \partial u}\, \frac{\partial \pi }{\partial x} \right)\Big {|}_{x=x_{k}^{\star },u=u_{k}^{\star }}
\end{align*}
and $B_{k}:=\frac{\partial g^{\text {cl}}}{\partial v}(k,x_{k}^{\star },0,0)=\frac{\partial g^{\text {ol}}}{\partial u}(x_{k}^{\star },u_{k}^{\star },0)$, $E_{k}:=\frac{\partial g^{\text {cl}}}{\partial w}(k,x_{k}^{\star },0,0)=\frac{\partial g^{\text {ol}}}{\partial w}(x_{k}^{\star },u_{k}^{\star },0)$ and $C_{k} := \frac{\partial c}{\partial x}(x_{k}^{\star })$ for all $0\leq k < M$. In addition, $\delta x_{k}$ represents the change in the state variables with respect to the periodic orbit, i.e., $\delta x_{k}:= x_{k} - x_{k}^{\star }$. Next, we make the following assumption.
Assumption 3 (Rank Conditions):
We suppose that the state-transition matrices $A^{\text {cl}}_{k}$ satisfy the following rank conditions
\begin{align*}
\text {rank}\left(A^{\text {cl}}_{j+i-1} \, \cdots \, A^{\text {cl}}_{j+1} \, A^{\text {cl}}_{j}\right) = \rho _{i}, \tag{10}
\end{align*}
View Source
\begin{align*}
\text {rank}\left(A^{\text {cl}}_{j+i-1} \, \cdots \, A^{\text {cl}}_{j+1} \, A^{\text {cl}}_{j}\right) = \rho _{i}, \tag{10}
\end{align*}
where $\rho _{i}$ is independent of the running index $j$ for all $1 \leq i \leq n_{x}$ and $1 \leq j \leq M$.
This assumption is not restrictive and will be utilized in the following theorem to generate a suitable change of coordinates that converts the periodic LTV system (9) into a linear one with a time-invariant state-transition matrix.
Theorem 2 (Floquet Transform):
Under Assumptions 1–3, there exits a time-varying, periodic, and invertible matrix $\Lambda _{k}$, referred to as the discrete-time Floquet transform, such that the change of coordinates $\overline{\delta x}_{k}=\Lambda _{k}^{-1}\,\delta x_{k}$ transforms the closed-loop system (9) into a linear one with a constant state-transition matrix, that is,
\begin{align*}
\overline{\delta x}_{k+1} &= \overline{A}^{\text {cl}}\, \overline{\delta x}_{k} + \Lambda _{k+1}^{-1} B_{k}\,v_{k} + \Lambda _{k+1}^{-1} E_{k}\,w_{k}, \,\, k=0,1,\cdots \\
\delta z_{k} &= C_{k} \, \Lambda _{k} \, \overline{\delta x}_{k}, \tag{11}
\end{align*}
View Source
\begin{align*}
\overline{\delta x}_{k+1} &= \overline{A}^{\text {cl}}\, \overline{\delta x}_{k} + \Lambda _{k+1}^{-1} B_{k}\,v_{k} + \Lambda _{k+1}^{-1} E_{k}\,w_{k}, \,\, k=0,1,\cdots \\
\delta z_{k} &= C_{k} \, \Lambda _{k} \, \overline{\delta x}_{k}, \tag{11}
\end{align*}
where $\overline{A}^{\text {cl}} := \Lambda _{k+1}^{-1} \, A_{k}^{\text {cl}} \,\Lambda _{k}$ is independent of time $k$.
Proof:
We are looking for a time-varying, periodic, and invertible change of coordinates matrix $\Lambda _{k}$, which renders the matrix $\overline{A}^{\text {cl}}$ constant. In other words, we need to satisfy the following set of matrix equations
\begin{align*}
& \Lambda _{2}\, \overline{A}^{\text {cl}} = A^{\text {cl}}_{1} \, \Lambda _{1}\\
& \Lambda _{3}\, \overline{A}^{\text {cl}} = A^{\text {cl}}_{2} \, \Lambda _{2}\\
& \vdots \\
& \Lambda _{M+1}\, \overline{A}^{\text {cl}} = A^{\text {cl}}_{M} \, \Lambda _{M}. \tag{12}
\end{align*}
View Source
\begin{align*}
& \Lambda _{2}\, \overline{A}^{\text {cl}} = A^{\text {cl}}_{1} \, \Lambda _{1}\\
& \Lambda _{3}\, \overline{A}^{\text {cl}} = A^{\text {cl}}_{2} \, \Lambda _{2}\\
& \vdots \\
& \Lambda _{M+1}\, \overline{A}^{\text {cl}} = A^{\text {cl}}_{M} \, \Lambda _{M}. \tag{12}
\end{align*}
Without loss of generality, we assume $\Lambda _{1} = \Lambda _{M+1} = \mathbb {I}$ and expand the recursive equation to get
\begin{align*}
\left(\overline{A}^{\text {cl}} \right)^{M} = A^{\text {cl}}_{M} \, \cdots A^{\text {cl}}_{2} \, A^{\text {cl}}_{1}. \tag{13}
\end{align*}
View Source
\begin{align*}
\left(\overline{A}^{\text {cl}} \right)^{M} = A^{\text {cl}}_{M} \, \cdots A^{\text {cl}}_{2} \, A^{\text {cl}}_{1}. \tag{13}
\end{align*}
With Assumption 3, conditions of [2, Theorem 2] are met, which in turn guarantees the existence of the sequence $\lbrace \Lambda _{k}\rbrace _{k=2}^{M}$ to satisfy (12). In addition, the time-invariant state-transition matrix $\overline{A}^{\text {cl}}$ can be obtained as the $M^{\text {th}}$ root of the matrix product in (13) using the techniques of [49]. Once $\overline{A}^{\text {cl}}$ is computed, one can compute the sequence $\lbrace \Lambda _{k}\rbrace _{k=2}^{M}$ using the recursive equation of $\Lambda _{k}\,\overline{A}^{\text {cl}}=A_{k-1}^{\text {cl}}\,\Lambda _{k-1}$ for $2\leq k \leq M$ with $\Lambda _{1}=\mathbb {I}$.$\hfill \blacksquare$
To address the synthesis of the auxiliary control input $v_{k}$ for robust optimal stability, we next assume that the leftover time-varying input and disturbance distribution matrices, together with the output matrix in (11), take values in a convex and bounded set. More specifically, we make the following assumption.
Assumption 4 (Paramteric Uncertainty):
The time-varying and periodic input and disturbance distribution matrices $\overline{B}_{k}:=\Lambda _{k+1}^{-1}\,B_{k}$ and $\overline{E}_{k}:=\Lambda _{k+1}^{-1}\,E_{k}$ together with the output matrix $\overline{C}_{k}:=C_{k}\,\Lambda _{k}$ take values in a convex and bounded set. In particular, the matrix
\begin{align*}
H_{k}:=\begin{bmatrix}0_{n_{x}\times n_{x}} & \overline{B}_{k} & \overline{E}_{k}\\
\overline{C}_{k} & 0_{n_{z}\times n_{u}} & 0_{n_{z}\times n_{w}} \end{bmatrix} \tag{14}
\end{align*}
View Source
\begin{align*}
H_{k}:=\begin{bmatrix}0_{n_{x}\times n_{x}} & \overline{B}_{k} & \overline{E}_{k}\\
\overline{C}_{k} & 0_{n_{z}\times n_{u}} & 0_{n_{z}\times n_{w}} \end{bmatrix} \tag{14}
\end{align*}
takes values in a convex and bounded polyhedron (i.e., polytope) $\mathcal {H}$ for all time samples $0\leq k < M$ as follows:
\begin{align*}
\mathcal {H}:=\left\lbrace H(\theta) = \sum _{\ell =1}^{n_{\mathcal {H}}} H^{\ell }\,\theta ^{\ell }\,|\,\theta ^{\ell }\geq 0,\,\sum _{\ell =1}^{n_{\mathcal {H}}}\theta ^{\ell }=1\right\rbrace, \tag{15}
\end{align*}
View Source
\begin{align*}
\mathcal {H}:=\left\lbrace H(\theta) = \sum _{\ell =1}^{n_{\mathcal {H}}} H^{\ell }\,\theta ^{\ell }\,|\,\theta ^{\ell }\geq 0,\,\sum _{\ell =1}^{n_{\mathcal {H}}}\theta ^{\ell }=1\right\rbrace, \tag{15}
\end{align*}
where each matrix in this domain can be written as an unknown convex combination of $n_{\mathcal {H}}$ given and known vertex matrices $H^{\ell }$ for $1\leq \ell \leq n_{\mathcal {H}}$.
This assumption is not restrictive, and our numerical simulation results in Section VI for the linearized SRB dynamics (1) show that the time-varying matrices $(\overline{B}_{k},\overline{E}_{k},\overline{C}_{k})$ are indeed quite constant for $0\leq k < M$, which is a practical consideration. This will be clarified more in Sections IV and VI. Hence, one can utilize Assumption 4 together with Theorem 2 to study a suboptimal robustness problem that translates the synthesis of $\mathcal {H}_{2}$- and $\mathcal {H}_{\infty }$-optimal MPC for the periodic orbit into LMIs for an LTI system via parametric uncertainties. More specifically, the vertices of the convex polytope $\mathcal {H}$ are given by
\begin{align*}
H^{\ell }=\begin{bmatrix}0_{n_{x}\times n_{x}} & \overline{B}^{\ell } & \overline{E}^{\ell }\\
\overline{C}^{\ell } & 0_{n_{z}\times n_{u}} & 0_{n_{z}\times n_{w}} \end{bmatrix}, \quad \ell =1,\ldots,n_{\mathcal {H}}. \tag{16}
\end{align*}
View Source
\begin{align*}
H^{\ell }=\begin{bmatrix}0_{n_{x}\times n_{x}} & \overline{B}^{\ell } & \overline{E}^{\ell }\\
\overline{C}^{\ell } & 0_{n_{z}\times n_{u}} & 0_{n_{z}\times n_{w}} \end{bmatrix}, \quad \ell =1,\ldots,n_{\mathcal {H}}. \tag{16}
\end{align*}
which results in the following uncertain LTI system
\begin{align*}
\overline{\delta x}_{k+1} &= \overline{A}^{\text {cl}}\, \overline{\delta x}_{k} + \left(\sum _{\ell =1}^{n_{\mathcal {H}}}\overline{B}^{\ell }\,\theta ^{\ell }\right)v_{k} + \left(\sum _{\ell =1}^{n_{\mathcal {H}}}\overline{E}^{\ell }\,\theta ^{\ell }\right)w_{k}\\
\delta z_{k} &= \left(\sum _{\ell =1}^{n_{\mathcal {H}}}\overline{C}^{\ell }\,\theta ^{\ell }\right)\overline{\delta x}_{k}. \tag{17}
\end{align*}
View Source
\begin{align*}
\overline{\delta x}_{k+1} &= \overline{A}^{\text {cl}}\, \overline{\delta x}_{k} + \left(\sum _{\ell =1}^{n_{\mathcal {H}}}\overline{B}^{\ell }\,\theta ^{\ell }\right)v_{k} + \left(\sum _{\ell =1}^{n_{\mathcal {H}}}\overline{E}^{\ell }\,\theta ^{\ell }\right)w_{k}\\
\delta z_{k} &= \left(\sum _{\ell =1}^{n_{\mathcal {H}}}\overline{C}^{\ell }\,\theta ^{\ell }\right)\overline{\delta x}_{k}. \tag{17}
\end{align*}
Here, the parameters vector $\theta :=\text {col}\lbrace \theta ^{\ell }\,|\,\ell =1,\ldots,n_{\mathcal {H}}\rbrace$ is assumed to be unknown with the properties $\theta ^{\ell }\geq 0$ and $\sum _{\ell =1}^{n_{\mathcal {H}}}{\theta ^{\ell }}=1$. Next, we are interested in designing state feedback laws for the suboptimal problem that minimizes the $\mathcal {H}_{2}$ and $\mathcal {H}_{\infty }$ norms of the transfer matrix $T_{wz}$ relating $w_{k}$ to $\delta z_{k}$ in the uncertain LTI system (17), in which
\begin{align*}
\Vert T_{wz}\Vert _{\mathcal {H}_{2}}^{2} &:= \frac{1}{2\pi } \int _{-\pi }^{\pi } \text {trace}\left(T_{wz}^{\mathsf {H}}\left(e^{j\Omega }\right) T_{wz}\left(e^{j\Omega }\right)\right) \text {d}\Omega \\
\Vert T_{wz}\Vert _{\mathcal {H}_{\infty }} &:= \max _{\Omega \in [-\pi,\pi ]}\sigma _{\max }\left(T_{wz}\left(e^{j\Omega }\right)\right), \tag{18}
\end{align*}
View Source
\begin{align*}
\Vert T_{wz}\Vert _{\mathcal {H}_{2}}^{2} &:= \frac{1}{2\pi } \int _{-\pi }^{\pi } \text {trace}\left(T_{wz}^{\mathsf {H}}\left(e^{j\Omega }\right) T_{wz}\left(e^{j\Omega }\right)\right) \text {d}\Omega \\
\Vert T_{wz}\Vert _{\mathcal {H}_{\infty }} &:= \max _{\Omega \in [-\pi,\pi ]}\sigma _{\max }\left(T_{wz}\left(e^{j\Omega }\right)\right), \tag{18}
\end{align*}
and “trace”, “$\mathsf {H}$”, and “$\sigma _{\max }$” denote the trace operator, conjugate transpose, and largest singular value of a matrix, respectively. We are now in a position to present the following theorem to address the suboptimal problem.
Theorem 3 (Robust Stabilizing Controllers):
Under Assumptions 1–4, the following statements are correct.
($\mathcal {H}_{2}$-Optimal Controller): There exists a linear state feedback controller of the form $v_{k} = K\, \overline{\delta x}_{k}$ such that the inequality $\Vert T_{wz} \Vert _{\mathcal {H}_{2}}^{2} < \mu$ holds if, and only if, the following set of LMIs hold
\begin{align*}
\text {trace}(W^\ell) < \mu,\,\,\, &\begin{bmatrix}W^\ell & \overline{C}^{\ell } X \\
(\cdot)^\top & X + X^\top - P^\ell \!\!\end{bmatrix} > 0, \\
& \begin{bmatrix}P^\ell & \overline{A}^{\text {cl}} X + \overline{B}^{\ell }\,L & \overline{E}^{\ell } \\
(\cdot)^\top & X + X^\top - P^{\ell } & 0 \\
(\cdot)^\top & (\cdot)^\top & \mathbb {I}\end{bmatrix}>0 \tag{19}
\end{align*}
View Source
\begin{align*}
\text {trace}(W^\ell) < \mu,\,\,\, &\begin{bmatrix}W^\ell & \overline{C}^{\ell } X \\
(\cdot)^\top & X + X^\top - P^\ell \!\!\end{bmatrix} > 0, \\
& \begin{bmatrix}P^\ell & \overline{A}^{\text {cl}} X + \overline{B}^{\ell }\,L & \overline{E}^{\ell } \\
(\cdot)^\top & X + X^\top - P^{\ell } & 0 \\
(\cdot)^\top & (\cdot)^\top & \mathbb {I}\end{bmatrix}>0 \tag{19}
\end{align*}
for all $\ell =1,\ldots,n_{\mathcal {H}}$, where $X$ and $L$, and the sequence of matrices $\lbrace P^{\ell }\rbrace _{\ell =1}^{n_{\mathcal {H}}}$ and $\lbrace W^{\ell }\rbrace _{\ell =1}^{n_{\mathcal {H}}}$ with the property $P^{\ell }=(P^{\ell })^\top$ and $W^{\ell }=(W^{\ell })^\top$ are decision variables. The gain matrix is also calculated as $K=L\,X^{-1}$.
($\mathcal {H}_\infty$-Optimal Controller): There exists a linear state feedback controller of the form $v_{k} = K\, \overline{\delta x}_{k}$ such that the inequality $\Vert T_{wz} \Vert _{\mathcal {H}_{\infty }}^{2} < \mu$ holds if, and only if, the following set of LMI holds
\begin{align*}
\begin{bmatrix}P^{\ell } & \overline{A}^{\text {cl}} X + \overline{B}^{\ell } L & \overline{E}^{\ell } & 0 \\
(\cdot)^\top & X + X^\top - P^{\ell } & 0 & X^\top \overline{C}^{\ell \top } \\
(\cdot)^\top & (\cdot)^\top & \mathbb {I}& 0 \\
(\cdot)^\top & (\cdot)^\top & (\cdot)^\top & \mu \, \mathbb {I}\!\end{bmatrix} > 0 \tag{20}
\end{align*}
View Source
\begin{align*}
\begin{bmatrix}P^{\ell } & \overline{A}^{\text {cl}} X + \overline{B}^{\ell } L & \overline{E}^{\ell } & 0 \\
(\cdot)^\top & X + X^\top - P^{\ell } & 0 & X^\top \overline{C}^{\ell \top } \\
(\cdot)^\top & (\cdot)^\top & \mathbb {I}& 0 \\
(\cdot)^\top & (\cdot)^\top & (\cdot)^\top & \mu \, \mathbb {I}\!\end{bmatrix} > 0 \tag{20}
\end{align*}
for all $\ell =1,\ldots,n_{\mathcal {H}}$, where $X$, $L$, and the sequence of matrices $\lbrace P^{\ell }\rbrace _{\ell =1}^{n_{\mathcal {H}}}$ with the property $P^{\ell }=(P^{\ell })^\top$ are decision variables. In addition, the gain matrix can be calculated as $K=L\,X^{-1}$.
Proof:
The proof of Part 1 is based on an extension of [31, Theorem 3], in which the Lyapunov matrix can be written as the same convex combination of the sequence $\lbrace P^{\ell }\rbrace _{\ell =1}^{n_{\mathcal {H}}}$, that is,
\begin{align*}
P=\sum _{\ell =1}^{n_{\mathcal {H}}}\theta ^{\ell }\,P^{\ell }. \tag{21}
\end{align*}
View Source
\begin{align*}
P=\sum _{\ell =1}^{n_{\mathcal {H}}}\theta ^{\ell }\,P^{\ell }. \tag{21}
\end{align*}
Finally, the nonlinear transformation of $L=K\,X$ enables us to extend [31, Theorem 3] for the controller synthesis problem while having an LMI structure. In an analogous manner, the proof of Part 2 is based on an extension of [31, Theorem 4].$\hfill \blacksquare$
Finally, the robust stabilizing feedback controller in the original coordinates can take a time-varying form as follows:
\begin{align*}
u_{k} = \pi (k,x_{k}) + K\, \Lambda _{k}^{-1} (x_{k} - x_{k}^{\star }), \tag{22}
\end{align*}
View Source
\begin{align*}
u_{k} = \pi (k,x_{k}) + K\, \Lambda _{k}^{-1} (x_{k} - x_{k}^{\star }), \tag{22}
\end{align*}
where the time-varying feedback term $K\,\Lambda _{k}^{-1}(x_{k} - x_{k}^{\star })$ is added to the original MPC law $\pi (k,x_{k})$ for robust optimality.
SECTION IV.
Application to the SRB Dynamics
This section aims to expand upon the theoretical findings concerning the SRB dynamics with the goal of deriving closed-form expressions for the Jacobian matrices. These expressions will be subsequently subjected to numerical verification within the assumptions and synthesis framework of the proposed computational approach. For this purpose, we consider the SRB dynamics (1) subject to an unknown external disturbance signal $w\in \mathbb {R}^{3}$ acting on the robot's COM. The measured performance is also taken as the COM position, that is, $z=c(x):=r$ (see Fig. 2). In particular, we aim to minimize the effect of external disturbance $w$ on the robot's COM positions $r$, as highlighted in Problem 1.
We assume that the robot's orientation can be expressed as a Z-Y-X vector, i.e., $\alpha :=\text {col}(\phi,\zeta,\psi)$, where $\psi$, $\zeta$, and $\phi$ represent the yaw, pitch, and roll angles, respectively. More specifically, the rotation matrix in (1) can be expressed as
\begin{align*}
R=R_{z,\psi }\,R_{y,\zeta }\,R_{x,\phi }, \tag{23}
\end{align*}
View Source
\begin{align*}
R=R_{z,\psi }\,R_{y,\zeta }\,R_{x,\phi }, \tag{23}
\end{align*}
where $R_{z,\psi }$, $R_{y,\zeta }$, and $R_{x,\phi }$ denote the basic rotation matrices around the z-, y-, and x-axes. The angular velocity in the body frame $\omega$ can then be expressed as
\begin{align*}
\omega = \begin{bmatrix}1 & 0 & -\sin (\zeta)\\
0 & \cos (\phi) & \cos (\zeta)\,\sin (\phi)\\
0 & -\sin (\phi) & \cos (\phi)\,\cos (\zeta) \end{bmatrix}\dot{\alpha }. \tag{24}
\end{align*}
View Source
\begin{align*}
\omega = \begin{bmatrix}1 & 0 & -\sin (\zeta)\\
0 & \cos (\phi) & \cos (\zeta)\,\sin (\phi)\\
0 & -\sin (\phi) & \cos (\phi)\,\cos (\zeta) \end{bmatrix}\dot{\alpha }. \tag{24}
\end{align*}
We next extend the small angle linearization analysis approach of [24], originally expressed in the world frame, to the body frame. In particular, for small values of roll and pitch $(\phi,\zeta)$, we can show that $\omega \approx \dot{\alpha }$ and $R\approx R_{z,\psi }$. Ignoring the term $\hat{\omega }\,I\,\omega$, the open-loop SRB dynamics (1), expressed in the body frame, subject to the disturbance input $w$ can be then linearized around an operating point as follows:
\begin{align*}
\dot{x} =& \frac{\text {d}}{\text {d}t}\begin{bmatrix}r\\
\dot{r}\\
\alpha \\
\omega \end{bmatrix} = \begin{bmatrix}0_{3} & \mathbb {I}_{3} & 0_{3} & 0_{3}\\
0_{3} & 0_{3} & 0_{3} & 0_{3}\\
0_{3} & 0_{3} & 0_{3} & \mathbb {I}_{3}\\
0_{3} & 0_{3} & 0_{3} & 0_{3} \end{bmatrix}x \\
& +\begin{bmatrix}0_{3} & \cdots & 0_{3}\\
\frac{1}{m} \mathbb {I}_{3} & \cdots & \frac{1}{m} \mathbb {I}_{3}\\
0_{3} & \cdots & 0_{3}\\
I^{-1}\,R_{z,\psi }^\top \hat{r}_{1} & \cdots &I^{-1}\,R_{z,\psi }^\top \hat{r}_{4} \end{bmatrix}u \\
& + \begin{bmatrix}0_{3}\\
\frac{1}{m}\mathbb {I}_{3}\\
0_{3}\\
0_{3} \end{bmatrix}w + \begin{bmatrix}0\\
-g_{0}\\
0\\
0 \end{bmatrix} \tag{25}
\end{align*}
View Source
\begin{align*}
\dot{x} =& \frac{\text {d}}{\text {d}t}\begin{bmatrix}r\\
\dot{r}\\
\alpha \\
\omega \end{bmatrix} = \begin{bmatrix}0_{3} & \mathbb {I}_{3} & 0_{3} & 0_{3}\\
0_{3} & 0_{3} & 0_{3} & 0_{3}\\
0_{3} & 0_{3} & 0_{3} & \mathbb {I}_{3}\\
0_{3} & 0_{3} & 0_{3} & 0_{3} \end{bmatrix}x \\
& +\begin{bmatrix}0_{3} & \cdots & 0_{3}\\
\frac{1}{m} \mathbb {I}_{3} & \cdots & \frac{1}{m} \mathbb {I}_{3}\\
0_{3} & \cdots & 0_{3}\\
I^{-1}\,R_{z,\psi }^\top \hat{r}_{1} & \cdots &I^{-1}\,R_{z,\psi }^\top \hat{r}_{4} \end{bmatrix}u \\
& + \begin{bmatrix}0_{3}\\
\frac{1}{m}\mathbb {I}_{3}\\
0_{3}\\
0_{3} \end{bmatrix}w + \begin{bmatrix}0\\
-g_{0}\\
0\\
0 \end{bmatrix} \tag{25}
\end{align*}
where $0_{3}$ and $\mathbb {I}_{3}$ are $3\times 3$ zero and identity matrices, respectively. We remark that (25) can be discretized using the Euler approach, which yields the following time-varying state equation
\begin{align*}
x_{k+1} = A_{\text {op}}\,x_{k} + B_{\text {op}}(r_{1},\ldots,r_{4},\psi)\,u_{k}+E_{\text {op}}\,w_{k}+d_{\text {op}}, \tag{26}
\end{align*}
View Source
\begin{align*}
x_{k+1} = A_{\text {op}}\,x_{k} + B_{\text {op}}(r_{1},\ldots,r_{4},\psi)\,u_{k}+E_{\text {op}}\,w_{k}+d_{\text {op}}, \tag{26}
\end{align*}
in which $A_{\text {op}}$, $E_{\text {op}}$, and $d_{\text {op}}$ are constant matrices/vectors whose values do not depend on the operating point, whereas $B_{\text {op}}$ is a time-varying input distribution matrix whose value at every time $k$ depends on the yaw angle of the operating point as well as the distance from the stance feet to the COM, denoted by $(r_{1},\ldots,r_{4})$ (see Fig. 2). The matrix $A_{\text {op}}$ and the vector $d_{\text {op}}$ together with the updated input matrix $B_{\text {op}}$ at every time sample are used in the nominal MPC formulation of (5). We remark that the nominal MPC is not aware of the disturbance input, and the proposed auxiliary control input will address the robustness with respect to the disturbance. For the remainder of this analysis, we assume the robot walks along a straight line on the orbit, and the yaw angle remains very small. Hence, we can ignore the yaw dependence in the $B_{\text {op}}$ matrix. But it is still time-varying as it depends on $(r_{1},\ldots,r_{4})$ at every time sample.
We now compute a closed-form expression for the closed-loop Jacobian matrices in (9). If Assumptions 1 and 2 hold for the unperturbed SRB dynamics, the nominal MPC problem on the orbit $\mathcal {O}$ is reduced to the following QP
\begin{align*}
&\min _{\xi } \qquad \frac{1}{2} \xi ^\top \bar{Q}\, \xi + \beta ^\top \xi \\
&\quad \text {s.t.} \qquad A_{\text {eq}}\,\xi =b_{\text {eq}}, \tag{27}
\end{align*}
View Source
\begin{align*}
&\min _{\xi } \qquad \frac{1}{2} \xi ^\top \bar{Q}\, \xi + \beta ^\top \xi \\
&\quad \text {s.t.} \qquad A_{\text {eq}}\,\xi =b_{\text {eq}}, \tag{27}
\end{align*}
for which the inequality constraints are inactive. In our formulation, $\xi :=\text {col}(x(\cdot),u(\cdot))$ represents the decision variables, consisting of the predicted state and control trajectories. In addition, $\bar{Q}:=\text {block diag}\lbrace Q,\ldots,Q,P,R,\ldots,R\rbrace$ and $\beta :=-\bar{Q}\,\text {col}(x^{\text {des}}(\cdot),u^{\text {des}}(\cdot))$. Furthermore, $A_{\text {eq}}$ is already defined in (8) with $b_{\text {eq}}:=\text {col}(A_{\text {op}}\,x_{k|k}+d_{\text {op}},d_{\text {op}},\ldots,d_{\text {op}})$. Using the Lagrange multipliers, the optimal solution for the QP (27) can be computed as follows:
\begin{align*}
\xi ^{\star } =& \left(\bar{Q}^{-1}A_{\text {eq}}^{\top }\left(A_{\text {eq}}\, \bar{Q}^{-1} A_{\text {eq}}^\top \right)^{-1}A_{\text {eq}} - \mathbb {I}\right)\bar{Q}^{-1} \beta \\
& +\bar{Q}^{-1}A_{\text {eq}}^{\top } \left(A_{\text {eq}}\, \bar{Q}^{-1} A_{\text {eq}}^\top \right)^{-1} b_{\text {eq}}. \tag{28}
\end{align*}
View Source
\begin{align*}
\xi ^{\star } =& \left(\bar{Q}^{-1}A_{\text {eq}}^{\top }\left(A_{\text {eq}}\, \bar{Q}^{-1} A_{\text {eq}}^\top \right)^{-1}A_{\text {eq}} - \mathbb {I}\right)\bar{Q}^{-1} \beta \\
& +\bar{Q}^{-1}A_{\text {eq}}^{\top } \left(A_{\text {eq}}\, \bar{Q}^{-1} A_{\text {eq}}^\top \right)^{-1} b_{\text {eq}}. \tag{28}
\end{align*}
From this latter equation, we can now compute a closed-form expression for the MPC law on the orbit. In particular, there is a constant projection matrix to project down $\xi ^{\star }$ into $u_{k|k}^{\star }$ as $u_{k|k}^{\star }=\Pi _{\text {proj}}\,\zeta ^{\star }$. In addition, $b_{\text {eq}}$ and thereby $\xi ^{\star }$ in (28) are indeed affine functions with respect to $x_{k|k}$. Consequently, the MPC law on the orbit can be reduced to an affine function with respect to the state vector $x_{k|k}$, and one can compute its derivative with respect to $x_{k|k}$ as follows:
\begin{align*}
\frac{\partial \pi }{\partial x}=\frac{\partial u_{k|k}^{\star }}{x_{k|k}}=\Pi _{\text {proj}}\,\bar{Q}^{-1}A_{\text {eq}}^{\top } \left(A_{\text {eq}}\, \bar{Q}^{-1} A_{\text {eq}}^\top \right)^{-1} \Pi _{\text {lift}}, \tag{29}
\end{align*}
View Source
\begin{align*}
\frac{\partial \pi }{\partial x}=\frac{\partial u_{k|k}^{\star }}{x_{k|k}}=\Pi _{\text {proj}}\,\bar{Q}^{-1}A_{\text {eq}}^{\top } \left(A_{\text {eq}}\, \bar{Q}^{-1} A_{\text {eq}}^\top \right)^{-1} \Pi _{\text {lift}}, \tag{29}
\end{align*}
where $\Pi _{\text {lift}}:=\frac{\partial b_{\text {eq}}}{\partial x_{k|k}}$ is a constant matrix. We remark that all other terms except $\Pi _{\text {proj}}$ and $\Pi _{\text {lift}}$ are time-varying in (29) as $A_{\text {eq}}$ depends on the time-varying $B_{\text {op}}$ matrix, as defined in (8). We also note that (29) is the Jacobian of the MPC law with respect to the states on the orbit, not other places, as the inequality constraints may be active off the orbit.
With the above-mentioned closed-form expressions for the open-loop Jacobian matrices as well as the Jacobian of the MPC law with respect to the state vector on the orbit, we can compute the Jacobian matrices for the closed-loop system in (9) as follows:
\begin{align*}
A_{k}^{\text {cl}}&=\frac{\partial g^{\text {cl}}}{\partial x}(k,x_{k}^{\star },0,0)=\left(\frac{\partial g^{\text {ol}}}{\partial x} + \frac{\partial g^{\text {ol}}}{ \partial u}\, \frac{\partial \pi }{\partial x} \right)\Big {|}_{x=x_{k}^{\star },u=u_{k}^{\star }} \\
& =A_{\text {op}} + B_{\text {op}}\,\Pi _{\text {proj}}\,\bar{Q}^{-1}A_{\text {eq}}^{\top }\left(A_{\text {eq}}\, \bar{Q}^{-1} A_{\text {eq}}^\top \right)^{-1} \Pi _{\text {lift}} \\
B_{k}&=\frac{\partial g^{\text {cl}}}{\partial v}(k,x_{k}^{\star },0,0)=\frac{\partial g^{\text {ol}}}{\partial u}(x_{k}^{\star },u_{k}^{\star },0)=B_{\text {op}} \\
E_{k}&=\frac{\partial g^{\text {cl}}}{\partial w}(k,x_{k}^{\star },0,0)=\frac{\partial g^{\text {ol}}}{\partial w}(x_{k}^{\star },u_{k}^{\star },0)=E_{\text {op}} \\
C_{k}&=\frac{\partial c}{\partial x}(x_{k}^{\star })=\begin{bmatrix}\mathbb {I}_{3} & 0_{3} & 0_{3} & 0_{3} \end{bmatrix}. \tag{30}
\end{align*}
View Source
\begin{align*}
A_{k}^{\text {cl}}&=\frac{\partial g^{\text {cl}}}{\partial x}(k,x_{k}^{\star },0,0)=\left(\frac{\partial g^{\text {ol}}}{\partial x} + \frac{\partial g^{\text {ol}}}{ \partial u}\, \frac{\partial \pi }{\partial x} \right)\Big {|}_{x=x_{k}^{\star },u=u_{k}^{\star }} \\
& =A_{\text {op}} + B_{\text {op}}\,\Pi _{\text {proj}}\,\bar{Q}^{-1}A_{\text {eq}}^{\top }\left(A_{\text {eq}}\, \bar{Q}^{-1} A_{\text {eq}}^\top \right)^{-1} \Pi _{\text {lift}} \\
B_{k}&=\frac{\partial g^{\text {cl}}}{\partial v}(k,x_{k}^{\star },0,0)=\frac{\partial g^{\text {ol}}}{\partial u}(x_{k}^{\star },u_{k}^{\star },0)=B_{\text {op}} \\
E_{k}&=\frac{\partial g^{\text {cl}}}{\partial w}(k,x_{k}^{\star },0,0)=\frac{\partial g^{\text {ol}}}{\partial w}(x_{k}^{\star },u_{k}^{\star },0)=E_{\text {op}} \\
C_{k}&=\frac{\partial c}{\partial x}(x_{k}^{\star })=\begin{bmatrix}\mathbb {I}_{3} & 0_{3} & 0_{3} & 0_{3} \end{bmatrix}. \tag{30}
\end{align*}
Using the closed-form expressions in (30) for the SRB dynamics, one can simply evaluate the rank conditions in Assumption 3 and the convex parametric uncertainty in Assumption 4. Our numerical results in Section VI show that for a typical trotting gait of the SRB dynamics with an MPC, the rank conditions for the above-mentioned closed-form Jacobian matrices $A_{k}^{\text {cl}}$ hold and hence, one can compute the change of coordinates Floquet transform matrix $\Lambda _{k}$. Furthermore, our numerical results indicate that the matrix $H_{k}$ computed in the new coordinates using (14) do not vary that much. Hence, we can choose a finite number of vertices to form a convex and bounded polytope $\mathcal {H}$, as in (15), such that $H_{k}$ takes values in $\mathcal {H}$ over a period. In the numerical results of Section VI, the vertices for the set $\mathcal {H}$ are chosen among finite samples of $H_{k}$. Theorem 3 finally synthesizes the robust stabilizing $\mathcal {H}_{2}$ and $\mathcal {H}_{\infty }$ controllers using the vertices of this convex polytope $\mathcal {H}$.
SECTION V.
Layered Control Structure for Application to Quadrupedal Robots
This section aims to present a layered control structure to apply the proposed robust optimal predictive controllers for quadrupedal locomotion. The layered controller consists of two layers (see Fig. 1). At the high level, the $\mathcal {H}_{2}$- or $\mathcal {H}_{\infty }$- optimal MPC laws generate the optimal GRFs for real-time planning of the SRB template dynamics. At the low level, similar to [44], a QP-based nonlinear controller is employed for WBC that imposes the full-order locomotion model to track the optimal reduced-order trajectories subject to the feasibility conditions. In what follows, we briefly describe the low-level nonlinear WBC.
The full-order and floating-base model of locomotion can be expressed by the Euler-Lagrange equations and principle of virtual work as follows:
\begin{align*}
D(q)\,\ddot{q}+F(q,\dot{q}) = \Upsilon \,\tau + J^{\top }(q)\,f, \tag{31}
\end{align*}
View Source
\begin{align*}
D(q)\,\ddot{q}+F(q,\dot{q}) = \Upsilon \,\tau + J^{\top }(q)\,f, \tag{31}
\end{align*}
where $q\in \mathcal {Q}\subset \mathbb {R}^{n_{q}}$ represents the generalized coordinates, $\mathcal {Q}$ denotes the configuration space, $n_{q}$ represents the number of degrees of freedom (DOFs), including actuated and unactuated DOFs, $\tau \in \mathcal {T}\subset \mathbb {R}^{n_{\tau }}$ is the vector of joint-level torques, and $\mathcal {T}$ denotes the admissible set of torques taken as a bounded convex set with $n_{\tau }$ being the number of actuated joints. In addition, $D(q)\in \mathbb {R}^{n_{q}\times n_{q}}$ denotes the positive definite mass-inertia matrix, $F(q,\dot{q})\in \mathbb {R}^{n_{q}}$ represents the Coriolis, centrifugal, and gravitational terms, $\Upsilon \in \mathbb {R}^{n_{q}\times n_{\tau }}$ represents the input distribution matrix, $J(q)$ denotes the contact Jacobian matrix, and $f$ is the vector of GRFs. We further remark that under no slippage condition, the full-order nonlinear model of (31) is subject to the algebraic constraints of $\ddot{r}_{j}=0$ for every $j\in \mathcal {C}$ stating that the acceleration of each stance foot is zero.
We are now in a position to present the low-level nonlinear WBC. The controller is developed based on the idea of virtual constraints [6] and QP. In particular, we consider output functions to be regulated, referred to as virtual constraints, as follows:
\begin{align*}
h(t,q):=h_{a}(q) - h_{\text {des}}(t), \tag{32}
\end{align*}
View Source
\begin{align*}
h(t,q):=h_{a}(q) - h_{\text {des}}(t), \tag{32}
\end{align*}
where $h_{a}(q)$ represents a set of holonomic controlled variables and $h_{\text {des}}(t)$ denotes the desired evolution of the controlled variables on the gait in terms of time. Here, the controlled variables $h_{a}(q)$ consist of the COM position and orientation, i.e., $(r,\alpha)$, together with the Cartesian positions of the swing feet. The desired evolution for the COM position and orientation can be computed heuristically based on the gait's velocity. The desired profile for the swing feet is then chosen as Bézier polynomials connecting the previous footholds to the upcoming ones. In particular, we employ Raibert's heuristic [50, Eq. (2.4), pp. 46] for the footplacment strategy. To satisfy the virtual constraints, no slippage conditions, and feasibility, we then solve the following real-time and strictly convex QP at 1 kHz [44]
\begin{align*}
\min _{(\tau,f,\delta)} \,\,\,& \frac{\gamma _{1}}{2}\Vert \tau \Vert ^{2} + \frac{\gamma _{2}}{2}\Vert f - f_{\text {des}}\Vert ^{2} + \frac{\gamma _{3}}{2} \Vert \delta \Vert ^{2} \\
\text {s.t.}\quad & \ddot{h}(\tau,f) = -K_{P}\,h - K_{D}\,\dot{h} + \delta \,\text {(Output Dynamics)} \\
&\ddot{r}_{j}(\tau,f) = 0,\quad \quad \quad \,\forall j\in \mathcal {C} \quad \text {(No slippage)} \\
& \tau \in \mathcal {T},\quad f_{j}\in \mathcal {FC},\,\,\,\forall j\in \mathcal {C} \quad \text {(Feasibility)}, \tag{33}
\end{align*}
View Source
\begin{align*}
\min _{(\tau,f,\delta)} \,\,\,& \frac{\gamma _{1}}{2}\Vert \tau \Vert ^{2} + \frac{\gamma _{2}}{2}\Vert f - f_{\text {des}}\Vert ^{2} + \frac{\gamma _{3}}{2} \Vert \delta \Vert ^{2} \\
\text {s.t.}\quad & \ddot{h}(\tau,f) = -K_{P}\,h - K_{D}\,\dot{h} + \delta \,\text {(Output Dynamics)} \\
&\ddot{r}_{j}(\tau,f) = 0,\quad \quad \quad \,\forall j\in \mathcal {C} \quad \text {(No slippage)} \\
& \tau \in \mathcal {T},\quad f_{j}\in \mathcal {FC},\,\,\,\forall j\in \mathcal {C} \quad \text {(Feasibility)}, \tag{33}
\end{align*}
in which $\gamma _{1}$, $\gamma _{2}$, and $\gamma _{3}$ are positive weighting factors and $f_{\text {des}}$ denotes the desired GRF profiles generated by the high-level $\mathcal {H}_{2}$- or $\mathcal {H}_{\infty }$-optimal MPC. The first equality constraint in (33) represents the desired output dynamics with some positive definite gain matrices $K_{P}$ and $K_{D}$. Here, $\delta$ is a defect variable to ensure the feasibility of the QP. The second equality constrict represents the condition of no slippage at the stance feet. The inequality constraints guarantee the joint-level torques' admissibility and the GRFs' feasibility. We remark that the second-order derivatives $\ddot{h}$ and $\ddot{r}_{j}$ are indeed affine functions with respect to $(\tau,f)$ that can be computed via Lie derivatives and input-output linearization [51]. The cost function of the QP in (33) tries to minimize the 2-norm of the joint-level torques $\tau$ while 1) imposing the full-order model to track the desired GRF profile $f_{\text {des}}$ prescribed by the proposed high-level, robust MPCs and 2) minimizing the defect variable $\delta$ via a high-value weighting factor $\gamma _{3}$.
SECTION VI.
Application to Quadrupedal Locomotion
This section aims to validate the proposed $\mathcal {H}_{2}$- and $\mathcal {H}_\infty$-optimal MPCs with the layered control structure using numerical case studies and experiments for the robust locomotion of the A1 quadrupedal robot subject to various external disturbances and uneven terrains.
A. Control Synthesis
A1 is a fully torque-controlled robot with $n_{q}=18$ DOFs, 6 representing the unactuated position of the COM and the orientation of the body, while the remaining $n_{\tau }=12$ DOFs correspond to the actuated joints of each leg (see Fig. 2). Specifically, each leg has a 2 DOF hip joint in addition to a 1 DOF knee joint. The robot weighs 12.45 (kg), with motors at each actuated joint. We consider the full-order nonlinear model of the A1 quadrupedal robot with the proposed layered control structure for the numerical and experimental studies. In what follows, we will synthesize the high-level $\mathcal {H}_{2}$- and $\mathcal {H}_\infty$-optimal MPCs at various speeds. Both the QP-based high-level MPC and the low-level nonlinear WBC are solved online using qpSWIFT [52] at 160 Hz and 1 kHz, respectively.
To synthesize the high-level, robust planners, we linearize the SRB dynamics (1) with the state variables taken as $x=\text {col}(r,\dot{r},\alpha,\omega)$, as in (25). Next, we consider the MPC solution for trotting at a speed of 0.5 (m/s) with a control horizon of $N=7$. The MPC parameters in (5) are chosen as $Q=\text {diag}\lbrace Q_{r},Q_{\dot{r}},Q_{\alpha },Q_{\omega }\rbrace$ with $Q_{r}=\text {diag}\lbrace 3e5,3e5,3e6\rbrace$, $Q_{\dot{r}}=\text {diag}\lbrace 1e4,1e4,1e4\rbrace$, $Q_{\alpha }=\text {diag}\lbrace 1e5,1e5,1e5\rbrace$, and $Q_{\omega }=\text {diag}\lbrace 5e3,5e3,5e3\rbrace$, $P=10\,Q$, and $R$ as the identity matrix. We get a two-step periodic gait with the fundamental period of $M=60$ samples (or 0.375 (s)). The following performance and disturbances are then chosen for the SRB dynamics. As mentioned in Section IV, we assume that the external force disturbances are applied along the x-, y-, and z-axes to the robot's COM, and the performance index is chosen as the position of the COM. With the closed-form expressions for the Jacobian matrices in (30), we then evaluate the assumptions. In particular, Assumptions 1–3 are met for the gait with the normal MPC. We next aim to parameterize the time-varying matrix (14) according to Assumption 4. We observe that this matrix is almost quite constant for the SRB dynamics, as shown in Fig. 3. In particular, Fig. 3 illustrates that the variation of the 2-norm of the $H_{k}$ matrix in (14) does not change much over a period. In this study, we choose $n_{\mathcal {H}}=4$ vertices to parameterize the time-varying matrix in Assumption 4. These four vertices $H^{\ell },\, \ell =1,\ldots,4$ in (16) are chosen based on the time-varying values of the matrix $H_{k}$ at time samples of $k=0,15,45,59$. The LMIs in Theorem 3 are synthesized using the MATLAB Robust Control Toolbox with the upper bound of the $\mathcal {H}_{2}$ and $\mathcal {H}_{\infty }$ norms taken as $\mu =2.89$ and $\mu =1.96$, respectively. For the low-level nonlinear WBC (33), the parameters are chosen as $\gamma _{1}=1$, $\gamma _{2}=1e4$, and $\gamma _{3}=1e6$ with the PD control gains of $K_{P}=400$ and $K_{D}=40$.
B. Numerical Simulations
We initiate our analysis with reduced-order simulations, employing the proposed $\mathcal {H}_{2}$- and $\mathcal {H}_{\infty }$-optimal controllers along with the normal MPC on the nonlinear SRB dynamics (1) for the trotting gait at the speed of 0.5 (m/s). To evaluate the performance of the closed-loop template dynamics, a half sinusoidal pulse force disturbance is applied along the y-axis to the robot's COM, with a magnitude of 100 (N) and a duration of 2 seconds, starting from 5 (s) and ending at 7 (s). Fig. 4 depicts the closed-loop system's performance, specifically the COM position along the y- and z-axes, comparing the results for robust-optimal and normal MPCs. The figure illustrates that, under the influence of the external disturbance, the lateral displacement of the COM (in the y-direction) with the normal MPC significantly deviates from the origin, rendering the system unstable. Following this instability, the normal MPC becomes infeasible, and state trajectories are not plotted beyond the gray area. In contrast, the $\mathcal {H}_{2}$- and $\mathcal {H}_{\infty }$-optimal controllers effectively address the external disturbance, restoring the system to the robustly stable gait. Fig. 5 depicts the vertical component of the GRFs, prescribed by the $\mathcal {H}_{2}$- and $\mathcal {H}_{\infty }$-optimal controllers along with the normal MPC, for the front right leg during these simulations. The chattering phenomena demonstrate the infeasibility of the normal MPC both within and beyond the shaded region.
To quantitatively study the efficacy of the proposed $\mathcal {H}_{2}$- and $\mathcal {H}_\infty$-optimal MPCs with the full-order model, we conduct a numerical case study in the physics-based simulation environment RaiSim [53]. We compare the performance of the $\mathcal {H}_{2}$-optimal MPC, $\mathcal {H}_\infty$-optimal MPC, and normal MPC for the trotting gait at the speed of 0.5 (m/s) subject to randomly generated external forces in the x- and y- directions. In particular, the full-order dynamical model of the A1 robot with the proposed layered control structure (i.e., the high-level planner and the low-level nonlinear WBC) is studied subject to 300 different randomly generated external forces with a maximum absolute magnitude of 70 (N) for a period of 30 (s). Fig. 6 illustrates the percentage success rate of the $\mathcal {H}_{2}$-optimal MPC, $\mathcal {H}_\infty$-optimal MPC, and normal MPC on 300 different trails. Here, the failure or the termination criterion is determined when the rigid body links connecting the knee with the toe hit the ground before 30 (s). The figure indicates that the robust optimal MPCs outperform the normal MPC. In particular, the overall success rate of the $\mathcal {H}_{2}$-optimal MPC, $\mathcal {H}_\infty$-optimal MPC, and normal MPC is 38.6%, 33.3%, and 22.2%, respectively. More specifically, a roughly 50% enhancement in performance with the $\mathcal {H}_\infty$-optimal MPC is observed over the normal MPC for locomotion under randomly generated external forces. Additionally, there is a further 15% advancement with the $\mathcal {H}_{2}$-optimal MPC compared to the $\mathcal {H}_\infty$-optimal MPC.
C. Experimental Evaluations
For the experimental validations, we synthesize both $\mathcal {H}_{2}$- and $\mathcal {H}_\infty$-optimal MPCs at 0.1, 0.3, and 0.5 (m/s). We consider various indoor experimental investigations, particularly a) forward trot gait with the $\mathcal {H}_{2}$-optimal MPC at 0.1, 0.3, and 0.5 (m/s), b) forward trot gait with the $\mathcal {H}_\infty$-optimal MPC at 0.1, 0.3, and 0.5 (m/s), c) $\mathcal {H}_{2}$-optimal MPC trotting on uneven terrains at 0.5 (m/s), and d) push disturbances to the $\mathcal {H}_{2}$-optimal MPC. Fig. 9(a) illustrates the snapshots of robustly stable locomotion of the A1 quadrupedal robot in the presence of external disturbances, and Fig. 9(b) shows locomotion on uneven terrain with wooden blocks. Both experiments are performed using the $\mathcal {H}_{2}$-optimal MPC with a forward speed of 0.5 (m/s). In addition, Fig. 7 shows the vertical component of the GRF, prescribed by the high-level $\mathcal {H}_{2}$-optimal MPC in (22), for the front right leg when the robot is subject to external disturbances, as indicated in Fig. 9(a). The plot of the normal GRF prescribed by the high-level $\mathcal {H}_{2}$-optimal MPC for the same leg of the robot during the rough terrain experiments (see Fig. 9(b)) is illustrated in Fig. 8. In all of the experiments, the layered control structure with the proposed $\mathcal {H}_{2}$- and $\mathcal {H}_\infty$-optimal MPCs successfully stabilized the locomotion of the quadrupedal robot. Simulation and experimental videos are available online [54].
D. Discussion and Limitations
Comparison with Alternative Robust Control Approaches: To assess the efficacy of the proposed $\mathcal {H}_{2}$- and $\mathcal {H}_{\infty }$-optimal MPC laws in comparison to alternative robust control methodologies, we conduct an extensive numerical simulation aimed at verifying the robustness of the closed-loop system for locomotion over rough terrains. In this study, we compare the proposed robust-optimal MPC approach against the RL-based RMPC approach outlined in our previous work [26]. Specifically, our previous work introduced a QP-based RMPC approach to steer template dynamics subject to a convex set of uncertainties arising from abstraction and unmodeled dynamics. In particular, the RMPC considers all possible realizations of the temple model subject to different vertices of the uncertainty set acting as a disturbance on the model. This RMPC framework was further enhanced through the integration of RL techniques, wherein a multi-layer perceptron (MLP) was trained to computationally determine the vertices of the uncertainty set based on the system's states (for comprehensive details on the synthesis of RMPC and MLP, please refer to Section V of [26]).
In our current investigation, we adopted the RL-based RMPC approach of [26], employing a modeling uncertainty set comprising two or three vertices computed by an MLP architecture featuring two hidden layers of 128 neurons with a rectified linear unit (ReLU) as the activation function. The control horizon for the RL-based RMPC was set to $N=7$, aligning it with the $\mathcal {H}_{2}$- and $\mathcal {H}_{\infty }$-optimal MPC strategies. Solving times for the RL-based RMPC with two and three vertices were recorded as 4.2 (ms) and 10.9 (ms), respectively, on an off-board laptop equipped with an i7-1185G7 processor running at 3.00 GHz. Fig. 10 illustrates the percentage success rate of various MPC methods, including the $\mathcal {H}_{2}$-optimal MPC, $\mathcal {H}_{\infty }$-optimal MPC, normal MPC, and RL-based RMPCs with two and three vertices across 550 randomly generated heightmaps, each with a different distribution of blocks. Here, the blocks, each 5 (cm) in height, are distributed randomly over a terrain of 30 (m) in length (approximately 85 times the robot's body length). From Fig. 10, the entire success rates of the $\mathcal {H}_{2}$-optimal MPC, RL-based RMPC with three vertices, $\mathcal {H}_{\infty }$-optimal MPC, RL-based RMPC with two vertices, and normal MPC are 45.56%, 43.58%, 39.42%, 37.79%, and 23.51%, respectively. Analysis of Fig. 10 reveals that the $\mathcal {H}_{2}$-optimal MPC achieved the highest overall success rate, outperforming other MPC strategies. Both the $\mathcal {H}_{2}$- and $\mathcal {H}_{\infty }$-optimal MPCs exhibited superior success rates compared to the RL-based RMPC approach with two vertices. Notably, the computational time for the RL-based RMPC technique with three vertices was considerably higher than that of other MPC methods, underscoring the efficacy of this study's proposed robust optimal MPCs. More specifically, a roughly 20% enhancement in performance with the $\mathcal {H}_{2}$-optimal MPC is observed over the RL-based MPC with two vertices.
Other Disturbed Environments: To assess the efficacy of the proposed robust-optimal planning approach in other disturbed environments, we consider quadrupedal locomotion subject to unknown payloads and external disturbances. Here, we investigate the robustness of the proposed approach with a payload of 4.54 (kg) (36% uncertainty in the robot's mass) while the robot is subject to a lateral external force disturbance acting on the robot's COM with a magnitude of 50 (N) for a duration of 2 seconds, starting from 5 (s) and ending at 7 (s). Fig. 11 illustrates the evolution of reduced-order states by employing the $\mathcal {H}_{2}$-optimal MPC controller. It can be seen that the proposed high-level planner can robustly address the uncertainties arising from the unknown payload and external disturbance. More general disturbed environments will be considered in future work.
Generalizability and Scalability: For reduced-order models of legged robots with heavy limbs and high dimensionality, such as humanoid robots, the SRB and centroidal dynamics may have some limitations and cannot accurately represent the full-order dynamics. More specifically, these template models typically ignore the limbs' dynamics, and hence, there can be a gap between reduced- and full-order models of heavy-limb-legged robots. Consequently, one would need to find better reduced-order models for other robots, such as humanoid robots or other tasks. This can be a limitation of the proposed approach. We will need further investigations to address this problem and to generalize the design of $\mathcal {H}_{2}$- and $\mathcal {H}_{\infty }$-optimal trajectory planners for heavy-limb-legged robots with high degrees of freedom.
Assumptions and Extension to Non-Periodic Gaits: This work operates under the assumption of periodic locomotion, specifically tailored to periodic gaits of template models as outlined in Assumptions 1–4. Extending the proposed robust trajectory optimization algorithm to accommodate non-periodic gaits presents a notable challenge. One potential avenue for such an extension involves composing multiple trajectories, encompassing both transient and steady-state periodic gaits, to construct an augmented trajectory with an extended period. In addition to periodicity, Assumption 4 presupposes a finite convex parameterization of the polytope $\mathcal {H}$ in terms of its vertices. As the number of vertices increases, so does the complexity of LMIs for synthesizing robust-optimal controllers in Theorem 3. This increase in the number of LMIs can pose a further limitation to the proposed approach and needs investigation in future work.
SECTION VII.
Conclusion and Future Work
This article presented a formal approach for synthesizing $\mathcal {H}_{2}$- and $\mathcal {H}_\infty$-optimal MPCs to robustly stabilize the periodic locomotion of legged robots. The proposed algorithm builds on the existing optimization-based control stack. The paper first studied template models with MPC solutions and shows that under some mild conditions, discrete-time, closed-loop template dynamics are differentiable around periodic gaits. The paper then presented the conditions under which the periodic, discrete-time, closed-loop templates can be transformed into a linear system with a constant state-transition matrix using Floquet's theory. The paper presented an approach to systematically generate $\mathcal {H}_{2}$ and $\mathcal {H}_\infty$ robust controllers subject to parametric uncertainty using LMIs. The paper then extended the theoretical results to the SRB template dynamics and numerically verified them. This work employed a layered control algorithm for robust planning and control of legged robots, where the optimal reduced-order trajectories prescribed by the high-level, robust MPCs are provided to a low-level nonlinear WBC for tracking by the full-order locomotion models. The proposed $\mathcal {H}_{2}$- and $\mathcal {H}_\infty$-optimal MPCs with the layered control structure were extensively validated both numerically and experimentally for the robust locomotion of the A1 quadrupedal robot subject to various external disturbances and uneven terrains. Our numerical analysis suggests a significant improvement in the performance of robust locomotion compared to the normal MPC.
Future work will investigate the incorporation of different kinds of uncertainties with robust adaptive predictive controllers. In addition, we will investigate the extension of this approach for a broader range of gaits, including non-periodic gaits.