I. Introduction
Dynamic programming (DP) dates back to the mid 20th century. In [4], Bellman described optimality principles and the structure of DP with particular application to stochastic decision processes. The core of DP is the Hamilton-Jacobi-Bellman (HJB) equation, a partial differential equation describing the behavior of the optimal cost function, sometimes called the cost-to-go due to the infinite horizon over which an optimal control policy is sought. In general, when applying DP, one has to face the problem of discretizing the state space and computing the cost function and optimal policies at the discrete points. Only in some very special cases, such as linear dynamics, can analytic solutions be found. In the linear case with a quadratic running cost, the corresponding optimal controller is the linear quadratic regulator (LQR), which can be found by solving the corresponding algebraic Riccati equation [16].
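As a brief illustration of this special case, using standard notation not introduced above ($A$, $B$ denote the system matrices and $Q \succeq 0$, $R \succ 0$ the cost weights), consider the linear dynamics $\dot{x} = Ax + Bu$ with running cost $x^{\top} Q x + u^{\top} R u$. The stabilizing solution $P$ of the algebraic Riccati equation then yields the LQR feedback law directly:
\begin{align}
0 &= A^{\top} P + P A - P B R^{-1} B^{\top} P + Q, \\
u^{*}(x) &= -Kx, \qquad K = R^{-1} B^{\top} P,
\end{align}
with the corresponding optimal cost-to-go given by the quadratic form $V(x) = x^{\top} P x$, which is precisely the analytic solution of the HJB equation in this setting.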