Precision Motion Control of Robotized Industrial Hydraulic Excavators via Data-Driven Model Inversion

Abstract:

This work proposes a novel precision motion control framework of robotized industrial hydraulic excavators via data-driven model inversion. Rather than employing a single neural network to approximate the whole excavator dynamics, including input delays and dead-zones, we construct a physics-inspired data-driven model with a modular structure. The data-driven model is then inverted in a modular fashion which benefits the training speed. The data-driven model and its inversion are trained offline in a supervised manner using the real operational data since online learning methods can damage the machine and surroundings. The entire motion control framework consists of the data-driven model inversion that compensates for the excavator dynamics and the proportional control that determines the input of the model inversion to enhance the robustness. The framework is experimentally validated with a commercial 38-ton class hydraulic excavator for digging and grading tasks, achieving a precise control performance (i.e., root-mean-square of the path following error under $2 \;[\rm cm]$) even under severe soil interactions.

Published in: IEEE Robotics and Automation Letters ( Volume: 7, Issue: 2, April 2022)

Page(s): 1912 - 1919

Date of Publication: 13 January 2022

DOI: 10.1109/LRA.2022.3142389

SECTION I.

Introduction

Hydraulic actuator systems are widely employed in various engineering domains due to their high power-to-weight ratio, reliability, and affordability. In particular, hydraulic excavators are essential in construction, demolition, mining, and forestry where large operating forces are required. However, multiple joints of the excavator must be manipulated simultaneously while maximizing the operation efficiency, making the manipulation a demanding task that must be performed by skilled operators with many years of experience. In this regard, the automation of excavators has drawn great interest [1] as a way to reduce human-associated costs (e.g., fatigue, safety, etc.). Central to this automation is precise motion control, which remains a challenging problem in practical settings [2], particularly with complex soil interactions.

In this paper, we propose a novel precision motion control framework of robotized industrial hydraulic excavators based on a data-driven model inversion. The data-driven inversion is challenging for typical data-driven approaches (e.g., methods using a recurrent neural network (RNN), a multilayer perceptron (MLP), etc.), as it is expensive to represent the inverse of time-related behaviors and to train discontinuous relations. The hydraulic excavators, however, are under the effect of input delays and dead-zones, which intensify in the presence of complex hydraulic circuits (e.g., a main control valve (MCV) [3]) commonly found in industrial hydraulic settings. To address these distinct features, we introduce a physics-inspired data-driven model with a modular structure composed of the following neural network modules: 1) an infinite impulse response (IIR) unit, which accommodates the input delays; 2) a piecewise linear (PL) map, which deals with the state-dependent dead-zones; and 3) MLP networks, which capture the remaining nonlinear and coupled dynamics. The environmental impacts (e.g., soil interaction forces) and the hydraulic states (e.g., hydraulic pressures) are taken into account by including the measurements in the network input.

Learning the data-driven model and its inversion online can endanger the excavator and the environments, thus the learning is done offline in a supervised manner using the operational data of the real machine. We then design our control to consist of the following two layers: 1) the data-driven model inversion control, constructed as an inversion of the data-driven model in a modular fashion that significantly enhances the training speed; and 2) the proportional (P) control, implemented on top of the data-driven model inversion control to enhance the robustness. The stability and robustness of the control framework are theoretically established. Even in the presence of intense soil interactions, the proposed control framework accomplishes a remarkable performance (i.e., the path following root-mean-square error (RMSE) less than $2 \;[\rm cm]$ ) for digging and grading operations of a commercial 38-ton class hydraulic excavator Doosan DX380LC.

Model-based methods have been proposed [3]–[6] for the control of hydraulic excavators, but they adopted simplifications in modeling which necessarily compromise the control performance. To avoid the difficulties of deriving accurate mathematical models, data-driven methods were introduced for the hydraulic excavator control [7]–[10]. Reinforcement learning (RL) approaches were presented in [7], [8], where the dynamics were approximated by a single large MLP. However, the large number of trainable parameters, which arises from importing the data history as an input of the MLP to handle the delays, substantially slows down the learning (e.g., 57476 parameters leading to 10 hours of training with 0.72 million data for [7] as compared to 9160 parameters and 2 hours with 2.6 million data for our plant model). Further, they were demonstrated only at operation speeds too slow for commercialization (e.g., average speed of $10 \;[\rm cm/s]$ to $20 \;[\rm cm/s]$ for a 12-ton class excavator), limiting the practical usefulness of the control. On the other hand, an RNN was employed to learn the controller of hydraulic excavators online in [9], [10]. Their performances, however, exhibited rather large tracking errors (e.g., RMSE greater than $1 \;[\rm m]$) in digging operations. The works on data-driven methods did not consider soil interactions [7], [8], [10], and the dead-zone compensation was simply defined by constant control input offsets [8], [10], further limiting them from precision motion control.

For a single hydraulic actuator, a data-driven force control was proposed in [11], where the controller was configured as an MLP. The force controller network was fed with a large dimensional history of the actuator position and force, which again put a strain on the training, and the dead-zone compensation was not considered in the control learning. The dead-zone compensation was studied in [12], [13] with a trainable tailored map, but they could not represent the state-dependent nature of the dead-zones. In [14], a soil interaction model of the excavator was suggested without an examination of the hydraulics and various soil properties for the industrial applications. In contrast to these previous results, our proposed framework can address the complex hydraulic excavator dynamics including the input delays and the state-dependent dead-zones without especially increasing the network size due to the modular structure, while fully considering the interaction with the soil. We also believe that our proposed framework would be advantageous for other hydraulically-actuated robotic systems with multiple actuators and complex environmental interactions.

The rest of the paper is organized as follows. Section II describes the autonomous hydraulic excavator adopted for the experimental validation. Section III introduces the modular design for the proposed data-driven model inversion. The entire offline process that derives data-driven model inversion is depicted in Section IV. Experimental results are presented in Section V, and then Section VI concludes the paper.

SECTION II.

System Description

This work employs Doosan DX380LC, an industrial hydraulic excavator, to validate our data-driven control strategy. The excavator is customized using sensors to measure states and environmental impacts, as shown in Fig. 1. Inertial measurement unit (IMU) sensors are attached to the boom, arm, and bucket links to estimate the joint configuration. Swing angle is also measurable, but we only consider the motion within the sagittal plane as visualized in Fig. 2 because the swing action is not involved in the excavation. Apart from the joint configuration, hydraulic pressure sensors are located in pumps and cylinders to consider the hydraulic behavior. The pumps and the cylinders are connected through the MCV as detailed in [3], which consists of spool valves that distribute the pump flow rate to generate the cylinder velocity (i.e., the joint angular rate). The spool positions are controlled by electronic proportional pressure reducing (EPPR) valves commanded by the joystick signal. The joystick signal also affects the pump control provided by the manufacturer, which implies that the joystick signal must be regarded as the control input of our industrial excavator. Meanwhile, the soil interactions are evaluated using the momentum-based wrench estimator [15] since we cannot attach a force/torque sensor to the bucket joint due to reliability and cost concerns. A LiDAR sensor scans the point cloud data (PCD) of the terrain for the reference trajectory planning. The communication is via controller area network (CAN), where the sensing and control frequency is set to $100 \;[\rm Hz]$ .

Fig. 1.

Doosan DX380LC, a commercial 38-ton class industrial hydraulic excavator. The excavator is customized with IMUs, pressure sensors, and a Velodyne Puck VLP-16 LiDAR. The IMUs are attached to each link, and the pressure sensors are located in all pumps and cylinders.

Fig. 2.

Joint configuration of the excavator. The bucket tip position is observed in the frame $\lbrace \mathcal B\rbrace$ attached to the base of the excavator.

SECTION III.

Designing Data-Driven Model Inversion

This section introduces the concept of the data-driven model inversion illustrated in Fig. 3. First, we propose a data-driven, physics-inspired, and easy-to-control model with a modular structure that provides an approximate of the excavator dynamics. The data-driven model can cope with the distinct features of the excavator dynamics, including the input delays, the state-dependent dead-zones, and the soil interactions. Then, the inversion control of the data-driven model is configured to compensate for the excavator dynamics.

Fig. 3.

Framework of the data-driven model inversion and its offline learning schema. First, the excavator plant model (right) is proposed to approximate the excavator dynamics. Then, the inversion control (left) of the excavator plant model, constructed in a modular manner, compensates for the excavator dynamics including the input delays, the state-dependent dead-zones, and the soil interactions.

A. Excavator Plant Model

Assuming that the time-related behavior (i.e., the spool dynamics and the hydraulic delays) can be approximated by a linear time-invariant (LTI) system, the resulting network, namely the excavator plant model shown in the right-hand side of Fig. 3, predicts the joint angular rate by

$\begin{align*} \eta _{f,t} &= f_{\Gamma _t} (u_t) \tag{1} \\ \mathcal Z \lbrace \eta _{h,t}\rbrace &= P (z) \mathcal Z \lbrace \eta _{f,t}\rbrace \tag{2} \\ \hat{\omega }_t &= h_{\Gamma _t} (\eta _{h,t}) \tag{3} \end{align*}$

where $t \in \mathbb{Z}$ is the time step identified by the subscript of a time signal $\star _t := \star (t)$, $\mathcal Z \lbrace \star _t\rbrace := \sum _{t = 0}^\infty \star _t / z^t$ is the $z$-transform, and $P (z)$ is the delaying system, a stable $z$-domain $n_h \times n_f$ LTI transfer function matrix which captures the multiple and different delays of the hydraulic excavator. The nonlinear nature of the hydraulic circuit is accommodated in pre-delay map $f_{\Gamma _t} : [-1, 1]^3 \to \mathbb {R}^{n_f}$ and post-delay map $h_{\Gamma _t} : \mathbb {R}^{n_h} \to \mathbb {R}^3$ with a simplified expression of a $\Gamma _t$-dependent map $\star _{\Gamma _t} (\cdot) := \star (\Gamma _t,\cdot)$. There are two intermediate variables, the pre-delay state $\eta _{f,t} \in \mathbb {R}^{n_f}$ and the post-delay state $\eta _{h,t} \in \mathbb {R}^{n_h}$, to integrate the LTI system and the nonlinear maps. The control input (i.e., joystick signal) is denoted by $u_t \in [-1, 1]^3$, the joint angular rate and its prediction are denoted by $\omega _t, \hat{\omega }_t \in \mathbb {R}^3$, and the excavator state is denoted by

$\begin{equation*} \Gamma _t := (\theta _t, P_t^{\rm cyl}, P_t^{\rm pump}, F_t^{\rm ext}) \in \mathbb {R}^{13} \end{equation*}$

where $\theta _t := (\theta _t^{\rm boom}, \theta _t^{\rm arm}, \theta _t^{\rm bucket}) \in \mathbb {R}^3$ is the joint angle, $P_t^{\rm cyl} \in \mathbb {R}^6$ is the pressure of head- and rod-side chambers of the cylinders, $P_t^{\rm pump} \in \mathbb {R}^2$ is the pressure of two pumps that supply the hydraulic fluid, and $F_t^{\rm ext} \in \mathbb {R}^2$ is the horizontal and vertical external force acting on the bucket tip which captures the soil interactions. Refer to Section IV-A for neural network module architectures and offline learning methods for the proposed excavator plant model. We would like to comment that a state-dependent delaying system is available for (2), but the LTI transfer function matrix $P(z)$ works satisfactorily in our application with and without large soil interactions.
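As a rough illustration of how (1), (2), and (3) chain together, a single-joint sketch is given here. Every specific in it is an assumption made for illustration and not the trained model: the pre-delay map is a fixed dead-zone of half-width 0.2 (the hypothetical `pre_delay_map`), the delaying system is a first-order lag with unit DC gain and pole 0.9, and the post-delay map is a static gain of 0.5 (the hypothetical `post_delay_map`). The dependence on $\Gamma _t$ is omitted entirely.

```python
def pre_delay_map(u, dead_zone=0.2):
    """Hypothetical stand-in for f: a dead-zone of half-width `dead_zone` on [-1, 1]."""
    if abs(u) <= dead_zone:
        return 0.0
    sign = 1.0 if u > 0 else -1.0
    return sign * (abs(u) - dead_zone) / (1.0 - dead_zone)


def post_delay_map(eta_h, gain=0.5):
    """Hypothetical stand-in for h: a static gain from post-delay state to joint rate."""
    return gain * eta_h


def simulate_plant(u_seq, pole=0.9):
    """Chain (1)-(3) for one joint: f, then a unit-DC-gain first-order lag, then h."""
    eta_h = 0.0
    omega_hat = []
    for u in u_seq:
        eta_f = pre_delay_map(u)                      # (1) static nonlinearity
        eta_h = pole * eta_h + (1.0 - pole) * eta_f   # (2) assumed lag (1 - pole) / (1 - pole z^-1)
        omega_hat.append(post_delay_map(eta_h))       # (3) static nonlinearity
    return omega_hat


inside = simulate_plant([0.1] * 200)    # joystick held inside the assumed dead-zone
outside = simulate_plant([0.6] * 200)   # joystick held outside the assumed dead-zone
```

In this toy setting a joystick signal inside the dead-zone produces no predicted motion, while a signal outside it produces a lagged rise toward a steady rate, which is the qualitative behavior the modular structure is meant to separate into distinct modules.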

B. Excavator Plant Model Inversion Control

From the command joint angular rate $\omega _t^{\rm cmd} \in \mathbb {R}^3$ , the excavator plant model inversion control shown in the left-hand side of Fig. 3 computes the joystick signal as

$\begin{align*} \zeta _{h,t} &= g_{h,\Gamma _t} (\omega _t^{\rm cmd}) \tag{4} \\ \mathcal Z \lbrace \zeta _{f,t}\rbrace &= C_P (z) \mathcal Z \lbrace \zeta _{h,t}\rbrace \tag{5} \\ u_t &= g_{f,\Gamma _t} (\zeta _{f,t}) \tag{6} \end{align*}$

where $C_P (z)$ is the delay-tracking system, a stable $z$-domain $n_f \times n_h$ LTI transfer function, $g_{h,\Gamma _t} : \mathbb {R}^3 \to \mathbb {R}^{n_h}$ is the pre-control map, and $g_{f,\Gamma _t} : \mathbb {R}^{n_f} \to [-1, 1]^3$ is the post-control map. The pre-control state $\zeta _{h,t} \in \mathbb {R}^{n_h}$ and the post-control state $\zeta _{f,t} \in \mathbb {R}^{n_f}$ are intermediate variables. The reference (e.g., $n_r = 0$ for the step-reference and $n_r = 1$ for the ramp-reference) tracking condition of the delay-tracking system $C_P (z)$ is written as

$\begin{equation*} \lim _{z \to 1} (z - 1) \left(P (z) C_P (z) - I_{n_h}\right) \mathcal Z \lbrace t^{n_r}\rbrace = 0_{n_h \times n_h} \tag{7} \end{equation*}$

from the final value theorem, where $I_a \in \mathbb {R}^{a \times a}$ is an identity matrix and $0_{a \times b} \in \mathbb {R}^{a \times b}$ is a zero matrix. Two nonlinear maps $g_h, g_f$ satisfy the pseudo-inverse relation s.t. $\star _{\Gamma _t} \circ g_{\star,\Gamma _t}$ is an identity function on the domain of $g_{\star,\Gamma _t}$ given $\Gamma _t$. Note that the exact inverse of the pre- and post-delay maps (i.e., $g_{\star,\Gamma _t} \circ \star _{\Gamma _t}$ is also an identity function) may not exist because of many-to-one relations such as the dead-zones. The inversion method for each module is illustrated in Section IV-B. The following Proposition 1 provides the properties of our data-driven inversion control (4), (5), and (6).

Proposition 1:

Consider the excavator plant model (1), (2), and (3) under the data-driven inversion control (4), (5), and (6). Assume that a) the error of the model prediction $\delta _{\omega,t} := \hat{\omega }_t - \omega _t \in \mathbb {R}^3$ and the errors of the pseudo-inverse relations $\delta _{h,t} := (h_{\Gamma _t} \circ g_{h,\Gamma _t}) (\omega _t^{\rm cmd}) - \omega _t^{\rm cmd} \in \mathbb {R}^3$, $\delta _{f,t} := (f_{\Gamma _t} \circ g_{f,\Gamma _t}) (\zeta _{f,t}) - \zeta _{f,t}\in \mathbb {R}^{n_f}$ are bounded; b) the post-delay map $h_{\Gamma _t}$ is a Lipschitz continuous function; and c) the pre-control map $g_{h,\Gamma _t}$ is a bounded function. Then if the joint angular rate $\omega _{t_0}$ and its command $\omega _{t_0}^{\rm cmd}$ at the initial time step $t_0 \in \mathbb{Z}$ are bounded, the difference between the joint angular rate and its command $\nu _{\omega,t} := \omega _t - \omega _t^{\rm cmd} \in \mathbb {R}^3$ is bounded $\forall t \geq t_0$.

Proof:

The triangle inequality provides two inequalities s.t.

$\begin{align*} \Vert \nu _{\omega,t}\Vert &\leq \Vert \delta _{\omega,t}\Vert + \Vert \hat{\omega }_t - \omega _t^{\rm cmd}\Vert \\ &\leq \Vert \delta _{\omega,t}\Vert + \Vert \delta _{h,t}\Vert + \Vert h_{\Gamma _t} (\eta _{h,t}) - h_{\Gamma _t} (\zeta _{h,t})\Vert \end{align*}$

where $\Vert \delta _{\omega,t}\Vert$ and $\Vert \delta _{h,t}\Vert$ are bounded from the first assumption. For the second inequality, see the definition of the post-delay map (3) and the pseudo-inverse error $\delta _{h,t} = h_{\Gamma _t} (\zeta _{h,t}) - \omega _t^{\rm cmd}$. From the Lipschitz continuity of $h_{\Gamma _t}$, there $\exists L \in \mathbb {R}_\geq$ s.t.

$\begin{equation*} \Vert h_{\Gamma _t} (\eta _{h,t}) - h_{\Gamma _t} (\zeta _{h,t})\Vert \leq L \Vert \eta _{h,t} - \zeta _{h,t}\Vert \end{equation*}$

where $L$ is referred to as a Lipschitz constant. The delaying system $P (z)$ and its tracking control $C_P (z)$ are rearranged as

$\begin{align*} &\mathcal Z \lbrace \eta _{h,t} -\zeta _{h,t}\rbrace \\ &\quad= \left(P (z) C_P (z) - I_{n_h} \right) \mathcal Z \lbrace \zeta _{h,t}\rbrace + P (z) \mathcal Z \lbrace \delta _{f,t}\rbrace \end{align*}$

where $P (z) C_P (z) - I_{n_h}$ is a stable linear system satisfying the reference tracking condition (7). From the bounded-input bounded-output (BIBO) property, the error converges as

$\begin{multline*} \Vert \eta _{h,t} - \zeta _{h,t}\Vert \leq \beta \left(\Vert \eta _{h,t_0} - \zeta _{h,t_0}\Vert, t - t_0 \right) \\ + \gamma _1 \left({\textstyle \sup _{t_0 \leq \tau \leq t}} \Vert \zeta _{h,\tau }\Vert \right) + \gamma _2 \left({\textstyle \sup _{t_0 \leq \tau \leq t}} \Vert \delta _{f,\tau }\Vert \right) \end{multline*}$

where $\gamma _\star : [0, a) \to [0, \infty)$ is a class $\mathcal K$ function (i.e., $\gamma _\star$ is strictly increasing with $\gamma _\star (0) = 0$), and $\beta : [0, a) \times [0, \infty) \to [0, \infty)$ is a class $\mathcal {K L}$ function (i.e., $\beta (r,s)$ for each fixed $s$ belongs to class $\mathcal K$ and $\beta (r,s)$ for each fixed $r$ is decreasing with $\lim _{s \to \infty } \beta (r,s) = 0$). The pre-control state $\zeta _{h,\tau }$ and the pseudo-inverse error $\delta _{f,\tau }$ are bounded due to the last and the first assumptions, respectively. The boundedness of the initial conditions implies that $\Vert \eta _{h,t_0} - \zeta _{h,t_0}\Vert$ is bounded, and hence $\nu _{\omega,t}$ is also bounded.

$\blacksquare$

On top of the excavator plant model inversion control, the joint angle P control added to the feedforward reference angular rate enhances the robustness of the entire framework with the command joint angular rate:

$\begin{equation*} \omega _t^{\rm cmd} := \omega _t^{\rm ref} - K e_{\theta,t} \in \mathbb {R}^3 \tag{8} \end{equation*}$

where $\omega _t^{\rm ref} \in \mathbb {R}^3$ is the reference joint angular rate, $\theta _t^{\rm ref} \in \mathbb {R}^3$ is the reference joint angle, $e_{\theta,t} := \theta _t - \theta _t^{\rm ref} \in \mathbb {R}^3$ is the joint angle error, and $K \in \mathbb {R}^{3 \times 3}$ is the P gain which is a positive-definite matrix. Theorem 1 then concludes the entire control framework. Note that the command joint angular rate (8) can be determined independently of the data-driven inversion control. For instance, a velocity field control or a proportional-integral (PI) control can replace the P control in (8).

Theorem 1:

Consider the excavator plant model (1), (2), and (3) under the data-driven inversion control (4), (5), and (6) with the P control (8). Following the assumptions of Proposition 1, the joint angle error $e_{\theta,t}$ is ultimately bounded.

Proof:

Let us first consider the following Lyapunov function:

$\begin{equation*} V := \frac{1}{2} e_{\theta,t}^T e_{\theta,t} \end{equation*}$

for the error convergence in the continuous-time domain. The time derivative of the Lyapunov function yields

$\begin{equation*} \dot{V} = e_{\theta,t}^T \dot{e}_{\theta,t} = - e_{\theta,t}^T K e_{\theta,t} + e_{\theta,t}^T \nu _{\omega,t} \end{equation*}$

with $\dot{e}_{\theta,t} + K e_{\theta,t} = \nu _{\omega,t}$, where $\dot{e}_{\theta,t} = e_{\omega,t} := \omega _t - \omega _t^{\rm ref} \in \mathbb {R}^3$ is the joint angular rate error. From the inequality $\dot{V} \leq - \lambda _{\min } (K) \Vert e_{\theta,t}\Vert ^2 + \Vert e_{\theta,t}\Vert \Vert \nu _{\omega,t}\Vert$ with the minimum eigenvalue operator $\lambda _{\min } (\cdot)$, we have $\dot{V} < 0$ whenever $\Vert e_{\theta,t}\Vert > \Vert \nu _{\omega,t}\Vert / \lambda _{\min } (K)$. The joint angle error is therefore ultimately bounded by a closed ball of radius $\Vert \nu _{\omega,t}\Vert / \lambda _{\min } (K)$, where $\Vert \nu _{\omega,t}\Vert$ is bounded by Proposition 1.

$\blacksquare$

In Proposition 1, the first assumption stems from reliable learning performances, and the second assumption is based on the continuous and bounded dynamic behavior of the excavator. The last assumption can be enforced by choosing a bounded output activation, such as a hyperbolic tangent, an arc-tangent, or a logistic function, for the pre-control map $g_h$ . Theorem 1 theoretically establishes the robustness of the entire control system, which is not provided in other data-driven controls of hydraulic excavators (e.g., [7]–[10]).
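The ultimate bound of Theorem 1 can be checked numerically on a scalar toy case. The sketch below integrates $\dot{e} = -k e + \nu$ with forward Euler at the $100 \;[\rm Hz]$ rate used in this work. The gain `k`, the bound `nu_max`, the initial error, and the sinusoidal shape of the mismatch are all hypothetical choices for illustration; only the error dynamics and the ball radius $\nu_{\max} / k$ come from the proof.

```python
import math


def simulate_angle_error(e0, k, nu_max, dt=0.01, steps=2000):
    """Euler-integrate de/dt = -k e + nu(t) with |nu(t)| <= nu_max (scalar case)."""
    e = e0
    history = [e]
    for n in range(steps):
        nu = nu_max * math.sin(3.0 * n * dt)   # hypothetical bounded rate mismatch
        e += dt * (-k * e + nu)
        history.append(e)
    return history


k, nu_max = 2.0, 0.1                            # assumed P gain and mismatch bound
errors = simulate_angle_error(e0=1.0, k=k, nu_max=nu_max)
radius = nu_max / k                             # ball radius from Theorem 1
tail_peak = max(abs(e) for e in errors[-500:])  # largest error over the last 5 s
```

The error starts well outside the ball and, after the transient decays, stays inside it. Raising `k` shrinks the ball, which matches the role of the P gain in the bound.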

SECTION IV.

Learning Data-Driven Model Inversion

As schematized in Fig. 3, constructing the data-driven model inversion consists of two steps. The first step is to learn the excavator plant model, made up of the delaying system $P (z)$ and the pre- and post-delay maps $f, h$ ; and the second step is to obtain the inversion of each component to constitute the excavator plant model inversion control. The learning steps are detailed in the following Section IV-A and IV-B.

For the offline learning process, we assemble the measurements of Doosan DX380LC to capture complex nonlinear dynamics and soil interactions. The data is collected from autonomous digging/grading operations (with various depths and bucket speeds) and sinusoidal joystick signals (at frequencies $0.25 \;[\rm Hz]$ to $0.5 \;[\rm Hz]$ and amplitudes 0.3 to 0.5) near the initial and final configurations of the operations. The reference path of the autonomous operation is obtained by length-scaling the nominal bucket configuration extracted from the pattern of human experts [15]. The reference joint angle is computed by inverse kinematics, and the trajectory is time-scaled by the bang-bang approach on joint angular rate considering the hardware limits (e.g., the workspace of the excavator, a rough range of the joint angular rate, and the maximum excavation volume). For the control during the data collection, we employ the manufacturer-provided control and the proposed control trained with a small amount of data. In this work, the main focus is digging/grading tasks, but we believe that the proposed framework can be easily extended to the entire workspace by collecting sufficient data.

We use the data of 2.6 million time steps at a frequency of $100 \;[\rm Hz]$ (i.e., 7.2 hours of data) to train the controller. The data is randomly split into training, validation, and test sets at a ratio of 80:15:5. Using the data sets, the offline learning is performed on a computer with an AMD Ryzen 5 3600X $3.8 \;[\rm GHz]$ CPU, a $16 \;[\rm GB]$ RAM, and an NVIDIA GeForce GTX 1660 Ti GPU. Note that the proposed controller requires at least 1.2 million time steps (i.e., 3.3 hours) of data to obtain good enough performance under the nominal operating condition. However, we include data as extensively as possible on various operating conditions (e.g., soil properties and weather conditions) to address the diverse circumstances of the machines commercialized by the manufacturer.

A. Learning Excavator Plant Model

The first step, a supervised learning of the excavator plant model, exploits the following loss function:

$\begin{equation*} L^{\rm plant} := \Vert \hat{\omega }_t - \omega _t\Vert ^2 \end{equation*}$

where $P (z)$, $f$, and $h$ are all trainable. The preexisting neural network architectures, however, cannot effectively address the unique properties of the excavator dynamics. For this reason, we propose new neural network modules: an IIR unit for the delaying system $P (z)$ and a monotonically non-decreasing PL map for the pre-delay map $f$.

Infinite Impulse Response Unit

The delaying system $P (z)$ is a multi-input multi-output (MIMO) transfer function configured as a matrix of single-input single-output (SISO) transfer functions. To construct a neural network for the transfer function learning, let us first consider a $z$ -domain SISO LTI transfer function written as

$\begin{equation*} H (z) := \frac{b_0 + b_1 z^{-1} + \cdots + b_{n_b} z^{-n_b}}{a_0 + a_1 z^{-1} + \cdots + a_{n_a} z^{-n_a}} \end{equation*}$

where $b_0, b_1,\ldots, b_{n_b} \in \mathbb {R}$ and $a_0, a_1,\ldots, a_{n_a} \in \mathbb {R}$ are constant coefficients with $a_0 \ne 0$. The transfer function is equivalent to a recursive filter in $t$-domain, described in terms of the difference equation $y_t = (\sum _{i = 0}^{n_b} b_i x_{t - i} - \sum _{i = 1}^{n_a} a_i y_{t - i}) / a_0$ where $x_t \in \mathbb {R}$ is the input signal and $y_t \in \mathbb {R}$ is the output signal. The difference equation then can be rearranged to

$\begin{equation*} y_t = \sum _{i = 1}^{n_b} \bar{b}_i (x_{t - i} - x_t) - \sum _{i = 1}^{n_a} \bar{a}_i (y_{t - i} - DC_H x_t) + DC_H x_t \end{equation*}$

where $\bar{b}_i := b_i / a_0 \in \mathbb {R} \ \forall i \in \lbrace 1, 2,\ldots, n_b\rbrace$ and $\bar{a}_i := a_i / a_0 \in \mathbb {R} \ \forall i \in \lbrace 1, 2,\ldots, n_a\rbrace$ are normalized coefficients and $DC_H := H(1) \in \mathbb {R}$ is the low-frequency (DC) gain. Now the IIR unit can be written as an $n_h \times n_f$ matrix s.t.

$\begin{equation*} P (z) := \begin{bmatrix}P_{j k} (z) \end{bmatrix}_{j \in \lbrace 1, 2,\ldots, n_h\rbrace \text{ and } k \in \lbrace 1, 2,\ldots, n_f\rbrace } \end{equation*}$

where $P_{j k} (z) \ \forall j, k$ is a SISO transfer function whose orders of the numerator and the denominator are denoted by $n_b^{j k}, n_a^{j k} \in \mathbb{Z}_\geq$ and normalized coefficients are denoted by $\bar{b}_i^{j k} \in \mathbb {R} \ \forall i \in \lbrace 1, 2,\ldots, n_b^{j k}\rbrace$ and $\bar{a}_i^{j k} \in \mathbb {R} \ \forall i \in \lbrace 1, 2,\ldots, n_a^{j k}\rbrace$. The IIR unit belongs to the recurrent neural network family with the given network size $n_h, n_f, n_b^{j k}, n_a^{j k}$ and trainable variables $\bar{b}_i^{j k}, \bar{a}_i^{j k} \ \forall i, j, k$. The DC gain can also be trainable, but here, we choose $DC_P := P (1)$ as an $n_h \times n_f$ matrix with ones on the main diagonal and zeros on the off-diagonal so that $\operatorname{rank}P (1) = \min (n_h, n_f)$. Any $n_h \times n_f$ transfer function matrix whose DC gain rank is $\min (n_h, n_f)$ can be transformed into the IIR unit with row and column matrix operations.
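To make the rearranged difference equation concrete, a minimal SISO sketch follows. It evaluates the recursion with the DC gain fixed to one, as chosen above for the diagonal entries of $P (z)$. The function name `iir_unit`, the zero initial history, and the example coefficients are assumptions for illustration; they are not the trained values.

```python
def iir_unit(x_seq, b_bar, a_bar, dc_gain=1.0):
    """Evaluate the rearranged difference equation with a fixed DC gain.

    b_bar[i-1] and a_bar[i-1] hold the normalized coefficients for lag i.
    Samples before t = 0 are taken as zero (an assumption of this sketch).
    """
    y_seq = []
    for t, x_t in enumerate(x_seq):
        y_t = dc_gain * x_t
        for i, b in enumerate(b_bar, start=1):
            x_lag = x_seq[t - i] if t - i >= 0 else 0.0
            y_t += b * (x_lag - x_t)
        for i, a in enumerate(a_bar, start=1):
            y_lag = y_seq[t - i] if t - i >= 0 else 0.0
            y_t -= a * (y_lag - dc_gain * x_t)
        y_seq.append(y_t)
    return y_seq


# Hypothetical stable first-order lag: denominator 1 - 0.9 z^-1, no lagged numerator terms.
step_response = iir_unit([1.0] * 200, b_bar=[], a_bar=[-0.9])
```

Because the DC gain appears explicitly in the recursion, the steady-state response to a constant input equals `dc_gain` times that input for any stable choice of the remaining coefficients, so only the transient shape is left to learn.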

Piecewise Linear Map

The post-control map $g_f$ must deal with jump discontinuities or large slopes to compensate the dead-zones. However, the compensation map is not well trainable with a vanilla MLP because $(\omega _t, u_t)$ pairs have one-to-many relations in the dead-zone intervals. For this reason, the pre-delay map $f$ learning is conducted with a monotonically non-decreasing $n$ -segment PL map $\operatorname{PL}_{(X, Y)} : [X_0, X_n] \to [Y_0, Y_n]$ s.t.

$\begin{equation*} \operatorname{PL}_{(X, Y)} (x) := \begin{cases} \dfrac{(Y_i - Y_{i-1}) x + X_i Y_{i-1} - X_{i-1} Y_i}{X_i - X_{i-1}} & \text{if } x \in [X_{i-1}, X_i) \ \forall i \in \lbrace 1, 2,\ldots, n\rbrace \\ Y_n & \text{if } x = X_n \end{cases} \end{equation*}$

where $X_i, Y_i \in \mathbb {R} \ \forall i \in \lbrace 0, 1,\ldots, n\rbrace$ are breakpoints of the map and $X := \lbrace X_i\rbrace _{i = 0}^n, Y := \lbrace Y_i\rbrace _{i = 0}^n$ are the non-decreasing sequences of the breakpoints. The PL map is continuous at $x = X_i$ with $\operatorname{PL}_{(X, Y)} (X_i) = Y_i$ if $X_{i-1} < X_i$, while the map can represent the jump discontinuity at $x = X_i$ if $X_{i-1} = X_i$. To apply distinct PL maps to the boom, arm, and bucket joystick signals, a tuple of multiple PL maps is expressed as $(y_1, y_2,\ldots, y_m) := \operatorname{PL}_{(\mathbf X, \mathbf Y)} (x_1, x_2,\ldots, x_m)$ where $X_i^k, Y_i^k \in \mathbb {R} \ \forall i \in \lbrace 0, 1,\ldots, n^k\rbrace$ are the breakpoints of the $n^k$-segment $k$-th PL map $\forall k \in \lbrace 1, 2,\ldots, m\rbrace$. Lists of the breakpoints are denoted by $\mathbf X := \lbrace X^k := \lbrace X_i^k\rbrace _{i = 0}^{n^k}\rbrace _{k = 1}^m$ and $\mathbf Y := \lbrace Y^k := \lbrace Y_i^k\rbrace _{i = 0}^{n^k}\rbrace _{k = 1}^m$. Here, we choose the boundary breakpoints as $X_0^k = Y_0^k = -1$ and $X_{n^k}^k = Y_{n^k}^k = 1 \ \forall k$, so that $\operatorname{PL}_{(\mathbf X, \mathbf Y)} : [-1, 1]^m \to [-1, 1]^m$. Note that the pseudo-inverse of the PL map is defined as $\operatorname{PL}_{(\mathbf X, \mathbf Y)}^+ = \operatorname{PL}_{(\mathbf Y, \mathbf X)}$ owing to its monotonicity. The non-decreasing sequences of an interval $[a, b]$ can be trained with the custom activation function $\sigma _{[a, b]} : \mathbb {R}^n \to [a, b]^{n+1}$ written as $\sigma _{[a, b]} (c) = d$ s.t.

$\begin{equation*} d_i := a + (b - a) \frac{\sum _{j = 1}^i \exp (c_j)}{\sum _{j = 1}^n \exp (c_j)} \ \forall i \in \lbrace 1, 2,\ldots, n\rbrace \end{equation*}$

where $c := (c_1, c_2,\ldots, c_n) \in \mathbb {R}^n$ is the activation input and $d := (a, d_1,\ldots, d_n) \in [a, b]^{n+1}$ is the partition of the interval with $a \leq d_1 \leq d_2 \leq \cdots \leq d_n = b$. The activation function can also be extended to a two-dimensional input (e.g., $\sigma _{[a, b]} : \mathbb {R}^{n \times 2m} \to [a, b]^{(n+1) \times 2m}$ for $m$ distinct $n$-segment PL maps) by applying the function to every column of the input.
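The PL map, its pseudo-inverse, and the breakpoint activation can be sketched in a few lines. The helper names `sorted_breakpoints` and `pl_map` and the three-segment dead-zone breakpoints are hypothetical and chosen only to make the pseudo-inverse relation visible; the trained maps use state-dependent breakpoints produced by a network.

```python
import math


def sorted_breakpoints(c, a=-1.0, b=1.0):
    """Custom activation sigma_[a,b]: map n free values to n+1 non-decreasing breakpoints."""
    shift = max(c)                                  # shift for numerical stability
    weights = [math.exp(v - shift) for v in c]
    total = sum(weights)
    d, running = [a], 0.0
    for w in weights:
        running += w
        d.append(a + (b - a) * running / total)
    d[-1] = b                                       # guard against round-off at the end point
    return d


def pl_map(x, X, Y):
    """Evaluate the n-segment PL map with breakpoints (X, Y) at x in [X[0], X[-1]]."""
    if x >= X[-1]:
        return Y[-1]
    for i in range(1, len(X)):
        if X[i - 1] <= x < X[i]:                    # empty when X[i-1] == X[i] (a jump)
            slope = (Y[i] - Y[i - 1]) / (X[i] - X[i - 1])
            return Y[i - 1] + slope * (x - X[i - 1])
    raise ValueError("x lies outside the breakpoint range")


# Assumed dead-zone: joystick values in [-0.2, 0.2] produce no output.
X = [-1.0, -0.2, 0.2, 1.0]
Y = [-1.0, 0.0, 0.0, 1.0]

forward = lambda u: pl_map(u, X, Y)                 # plays the role of f
pseudo_inverse = lambda w: pl_map(w, Y, X)          # PL_(Y, X), with a jump at w = 0
```

With these breakpoints, `forward(pseudo_inverse(w))` returns `w` up to round-off for the sampled values, whereas `pseudo_inverse(forward(0.1))` returns the dead-zone edge 0.2 rather than 0.1. This is the many-to-one behavior that rules out an exact inverse.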

Employing the proposed neural network modules, the delaying system $P (z)$ is configured as a $3 \times 3$ IIR unit whose order is chosen by $n_b^{j k} = n_a^{j k} = 3$ if $j = k$ and $n_b^{j k} = n_a^{j k} = 0$ if $j \ne k \ \forall j, k$ (i.e., third-order transfer functions on the diagonal and zeros on the off-diagonal). This is based on the assumption that the pre- and post-delay maps $f, h$ can adjust the input coupling. Fig. 4 shows the characteristics of the delaying system trained with the pre- and post-delay maps. Here, we would like to mention that a controller trained without the IIR unit causes fatal oscillations in the real machine.

Fig. 4.

Visualization of the pre-delay map $f_{\Gamma _t}$ (left) and the delaying system $P (z)$ (right). The PL map landscape visualizes the state-dependent pre-delay map for the bucket joystick signal during a trial of grading operation. The IIR step response shows the input delays of the boom, arm, and bucket, whose 5% settling times are $(0.41, 0.33, 0.26) \;[\rm s]$, respectively.

The pre-delay map $f$ is replaced with PL maps as $f_{\Gamma _t} := \operatorname{PL}_{\Phi _{f,t}} : [-1, 1]^3 \to [-1, 1]^3$. Breakpoints of the PL map are assumed to depend on $\Gamma _t$ through an auxiliary network $\Phi _{f,t} := \phi _f (\Gamma _t) \in [-1, 1]^{9 \times 6}$, which consists of a hidden layer of 64 ReLU nodes and an output layer of $8 \times 6$ nodes with the custom output activation function $\sigma _{[-1, 1]} : \mathbb {R}^{8 \times 6} \to [-1, 1]^{9 \times 6}$, for the three distinct 8-segment PL maps. The trained PL map is visualized in Fig. 4. Notice that the PL map enables the dead-zone compensation, which appears in Fig. 5 as jumps in the joystick signals. The pre-delay map can be further customized with constraints or other designs (e.g., applying a constraint to pass through the origin, training boundary breakpoints, or combining the PL map with an additional MLP), but this is not necessary in our case.

Fig. 5.

Prediction of the joint angular rate (top) and reconstruction of the joystick signal (bottom) for the digging (left) and grading (right) data within the test set. The predicted joint angular rate $\hat{\omega }_t$ is computed using the excavator plant model given the same joystick signal $u_t$. The reconstructed joystick signal $\check{u}_t$ is the output of the model inversion taking the measured joint angular rate $\omega _t$ as the inversion input. Regarding the discrepancies between the recorded and reconstructed joystick signals, note that: 1) there is no exact inverse of the dead-zones, that is, different joystick signals in the dead-zones can lead to the same excavator response; and 2) perfect tracking control of the non-minimum phase delays does not exist.

To address the remaining nonlinear properties and couplings, the post-delay map $h$ is expressed as an MLP with a single hidden layer of 256 ReLU nodes and a linear output layer. Training the excavator plant model (1), (2), and (3) takes only around 2 hours to converge using the Adam optimizer. Fig. 5 visualizes two examples of the prediction. The prediction RMSE on the test data set is $(0.51, 0.66, 1.16) \;[\rm deg/s]$ for the boom, arm, and bucket joint angular rates, respectively, which is small enough to justify the presented neural network architecture.

B. Learning Excavator Plant Model Inversion Control

The second step configures the modular inversion of the data-driven excavator plant model. The delay-tracking system $C_P (z)$, the inversion of the delaying system $P (z)$, is a $3 \times 3$ MIMO LTI transfer function. Although we can train $C_P (z)$ with another IIR unit, we analytically compose the delay-tracking system since obtaining the tracking control for the diagonal transfer function matrix is relatively simple. A stable and exact inverse of the delaying system (i.e., $P^{-1} (z)$) does not exist because the trained delaying system has unstable zeros characterized by inverse responses as shown in Fig. 4. Thus, the delay-tracking system is constructed to meet the reference tracking condition (7) $\forall n_r \in \lbrace 0, 1\rbrace$ where the poles are empirically optimized to 0.82 with multiplicity 2. We choose the delay-tracking system with the minimum numerator order, which is a proper transfer function matrix. The post-control map $g_f$, the pseudo-inverse of the pre-delay map $f$, does not require any offline learning process as the post-control map can be easily computed as $g_{f,\Gamma _t} := \operatorname{PL}_{\Phi _{f,t}}^+$.
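The pseudo-inverse of a monotone PL map can be obtained by swapping the roles of the input and output breakpoints, which is why no training is needed for $g_f$. The sketch below illustrates this on a single scalar map with a dead-zone. It assumes plain linear interpolation between breakpoints; the function names and the breakpoint values `XS` and `YS` are hypothetical, not the trained ones.

```python
def pl_map(xs, ys, x):
    """Evaluate the piecewise linear map through breakpoints (xs[i], ys[i]).

    xs and ys must be non-decreasing. Zero-width segments (jumps) are skipped,
    and inputs outside [xs[0], xs[-1]] are clamped to the end values.
    """
    if x <= xs[0]:
        return ys[0]
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        if x0 <= x <= x1 and x1 > x0:
            w = (x - x0) / (x1 - x0)
            return ys[i] + w * (ys[i + 1] - ys[i])
    return ys[-1]

def pl_pseudo_inverse(xs, ys, y):
    """Pseudo-inverse of pl_map(xs, ys, .): the same map with breakpoints swapped."""
    return pl_map(ys, xs, y)

# Hypothetical map with a dead-zone: inputs in [-0.2, 0.2] all produce 0.
XS = [-1.0, -0.2, 0.2, 1.0]
YS = [-1.0, 0.0, 0.0, 1.0]

u_for_half = pl_pseudo_inverse(XS, YS, 0.5)   # joystick value that yields 0.5
```

The flat segment of the forward map becomes a jump in the swapped map, so the inverse steps across the dead-zone instead of dwelling inside it. Inside the dead-zone the inverse is not unique, and this sketch simply returns one admissible value.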

On the other hand, the pre-control map $g_h$ (i.e., the pseudo-inverse of the post-delay map $h$ ) cannot be analytically obtained since the post-delay map is an MLP. For the offline learning of the pre-control map, a distal learning approach [16] is introduced. The loss function is defined as

$\begin{equation*} L_h^{\rm inv} := \Vert \check{\omega }_t - \omega _t \Vert ^2 \end{equation*}$

where $\check{\omega }_t := (h_{\Gamma _t} \circ g_{h,\Gamma _t}) (\omega _t) \in \mathbb {R}^3$ to realize the pseudo-inverse relation. The MLP network of the pre-control map $g_h$ has a hidden layer of 256 nodes with a ReLU activation and an output layer with a hyperbolic tangent activation. The pre-control map can also be customized with additional PL maps to tolerate large slopes or jump discontinuities. After some trials, however, we found that an MLP is enough for the pre-control map $g_h$, implying that the dead-zones are all captured in the pre-delay map $f$ while training the plant model. The pre-control map training takes less than 5 minutes, owing to the modular inversion method. Reconstruction of the joystick signal is compared to the recorded signal in Fig. 5.

SECTION V.

Experiment

The proposed control framework is verified with the commercial 38-ton class excavator Doosan DX380LC in digging and grading tasks. The reference trajectories for both operations are generated using the same planning algorithm as described in Section IV. The control frequency is $100 \;[\rm Hz]$, though the inversion control can easily be implemented at higher frequencies. The P gain that determines the model inversion input is chosen as $K = 1.5 I_3$. We also implemented PI feedback, yet found that P control alone suffices, as the control error is already fairly small without the I feedback.

The bucket tip position $p_t := (p_{x,t}, p_{z,t}) \in \mathbb {R}^2$ is calculated to evaluate the control performance, where $p_{x,t}, p_{z,t} \in \mathbb {R}$ are the horizontal and vertical tip positions as shown in Fig. 2. The path following error is denoted by $e_{p,t}^{\rm path} := \min _{t_0 \leq \tau \leq t_f} \Vert p_t - p_\tau ^{\rm ref}\Vert \in \mathbb {R}$, which indicates the error of the excavated ground geometry. The trajectory error, or the bucket tip position error, is written as $e_{p,t}^{\rm traj} := \Vert p_t - p_t^{\rm ref}\Vert \in \mathbb {R}$. Here, we calculate the RMSE from one second after the initial time to assess the precision of the proposed control in the steady state, as is typically done in the control literature. Note from Figs. 6, 7, and the supplementary video that the error is non-zero at the initial time because the (less accurate) manufacturer-provided PI control is used until then. Note also that the proposed control behaves well (e.g., with no sudden dipping) during this transition. This initial performance can be further improved by turning on the proposed data-driven controller before the initial operation or by re-planning the path using the measured configuration as the initial condition (i.e., ensuring that the controller is in its steady state). The control performance is compared to the manufacturer-provided PI control, which determines the joystick signal by a $(\omega _t, u_t)$ pair look-up table with a joint angle PI feedback and an angular rate feedforward. The manufacturer manually fine-tuned the control gain and the look-up table using the air digging data (i.e., data without soil interactions).
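The two error measures can be computed from sampled tip positions as sketched below. This is an illustrative sketch, not the authors' evaluation code: the function names are hypothetical, and the minimum over the continuous reference path is approximated by a minimum over the reference samples.

```python
import math

def path_following_error(p, ref_path):
    """Distance from tip position p to the closest sample of the reference path."""
    return min(math.dist(p, r) for r in ref_path)

def trajectory_error(p, p_ref):
    """Distance from tip position p to the reference position at the same time."""
    return math.dist(p, p_ref)

def rmse(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def evaluate(tips, refs, dt, skip_seconds=1.0):
    """Return (path RMSE, trajectory RMSE), skipping the first skip_seconds."""
    start = int(round(skip_seconds / dt))
    path = [path_following_error(p, refs) for p in tips[start:]]
    traj = [trajectory_error(p, r) for p, r in zip(tips[start:], refs[start:])]
    return rmse(path), rmse(traj)

# Toy data: the tip stays on the reference line but lags one sample behind it.
refs_demo = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0)]
tips_demo = [(0.0, 0.0), (0.0, 0.0), (0.1, 0.0)]
path_rmse, traj_rmse = evaluate(tips_demo, refs_demo, dt=1.0)
```

In the toy data the tip lags the reference in time while staying on the path, so the trajectory RMSE is non-zero and the path following RMSE is zero. This is why the path following error, rather than the trajectory error, reflects the excavated ground geometry.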

Fig. 6.

Bucket tip position (top-left), external force (bottom-left), and performance evaluation (right) during the digging operations. Repeated experiments are visualized as thin lines where the reference trajectories are planned for every repetition considering the current/target terrain and the hardware limit. The bold lines are the averages of the trials, and the shaded bucket plots are the average bucket configuration for every $2 \;[\rm s]$ . Histograms of the path following and trajectory RMSEs are also provided for the performance comparison.

Fig. 7.

Experimental results of the repeated grading operations, which level the ground surface after the digging operations shown in Fig. 6.

Digging: The digging operation is the removal of soil from the current terrain to achieve the target ground shape. Due to the excavation capacity limit, multiple digging operations may be required to reach the final target ground geometry. Fig. 6 visualizes bucket trajectories, soil interactions, and error distributions of repeated experiments on various excavation depths and volumes. The manufacturer-provided PI control yields a path following RMSE of $5.79 \;[\rm cm]$ and a trajectory RMSE of $25.0 \;[\rm cm]$, with an RMS reference bucket tip velocity of $66.2 \;[\rm cm/s]$ and an RMS external force of $4.31 \times 10^4 \;[\rm N]$. The proposed control framework outperforms the manufacturer-provided PI control, achieving a path following RMSE of $1.99 \;[\rm cm]$ and a trajectory RMSE of $5.21 \;[\rm cm]$ with an RMS reference bucket tip velocity of $66.3 \;[\rm cm/s]$ and an RMS external force of $6.41 \times 10^4 \;[\rm N]$. The operation speed and the external force are large enough for industrial applications.

Grading: The grading operation levels the ground surface after the digging operations; its experimental results are shown in Fig. 7. The manufacturer-provided PI control has a path following RMSE of $5.28 \;[\rm cm]$, a trajectory RMSE of $15.1 \;[\rm cm]$, an RMS reference bucket tip velocity of $87.2 \;[\rm cm/s]$, and an RMS external force of $2.69 \times 10^4 \;[\rm N]$. The excavator plant model inversion control with the P control attains a path following RMSE of $1.83 \;[\rm cm]$ and a trajectory RMSE of $3.17 \;[\rm cm]$ with an RMS reference bucket tip velocity of $87.3 \;[\rm cm/s]$ and an RMS external force of $2.23 \times 10^4 \;[\rm N]$. The errors are consistently small both with and without intense soil interactions since the inversion captures and compensates for the effect of the external force.
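To put the reported numbers side by side, the short sketch below computes the relative RMSE reduction of the proposed control over the manufacturer-provided PI control. The RMSE values are copied from the text above; the helper name `reduction_percent` and the dictionary layout are hypothetical, and the percentages are derived here rather than reported in the paper.

```python
def reduction_percent(baseline, proposed):
    """Relative reduction of an error metric, in percent of the baseline."""
    return 100.0 * (1.0 - proposed / baseline)

# (PI control RMSE, proposed control RMSE) in cm, as reported in the text.
reported_rmse_cm = {
    ("digging", "path"): (5.79, 1.99),
    ("digging", "trajectory"): (25.0, 5.21),
    ("grading", "path"): (5.28, 1.83),
    ("grading", "trajectory"): (15.1, 3.17),
}

reductions = {
    key: reduction_percent(pi, proposed)
    for key, (pi, proposed) in reported_rmse_cm.items()
}

for (task, metric), value in reductions.items():
    print(f"{task:8s} {metric:10s} RMSE reduction: {value:.1f}%")
```

For both tasks this gives a reduction of roughly 65% in the path following RMSE and roughly 79% in the trajectory RMSE.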

SECTION VI.

Conclusion

This work presents a precision motion control framework for robotized industrial hydraulic excavators via data-driven model inversion. Considering the distinct features that hinder learning-based control methods (i.e., input delays and dead-zones), we propose a data-driven model with a physics-inspired modular structure to approximate the excavator dynamics. We then derive the inversion of the plant model in a modular manner, which considerably speeds up the learning. To prevent damage to the machine and its surroundings, the model and its inversion are trained offline in a supervised fashion using the measurements of the Doosan DX380LC, a 38-ton class industrial hydraulic excavator. Our proposed control framework is composed of the data-driven model inversion control, which compensates for the excavator dynamics, and a P control that computes the model inversion input and enhances the robustness. The stability and robustness of the control framework are theoretically proven, and experimental results are presented in comparison with the manufacturer-provided PI control. The proposed control framework significantly outperforms the PI control and shows a precise control performance (i.e., path following RMSE under $2 \;[\rm cm]$) even in the presence of intense soil interactions.

Possible future research directions include: 1) generalization of the excavator plant model using a state-dependent delaying system; 2) incorporation of expert-emulating planning [15]; 3) implementation of over-the-air programming to effectively collect the measurements and update the control; 4) rigorous comparison with other control strategies (e.g., [17]); and 5) application of our framework to other systems with delays and dead-zones.
