
Finite-Approximation-Error-Based Discrete-Time Iterative Adaptive Dynamic Programming



Abstract:

In this paper, a new iterative adaptive dynamic programming (ADP) algorithm is developed to solve optimal control problems for infinite-horizon discrete-time nonlinear systems with finite approximation errors. First, a new generalized value iteration algorithm of ADP is developed to make the iterative performance index function converge to the solution of the Hamilton–Jacobi–Bellman (HJB) equation. The generalized value iteration algorithm can be initialized by an arbitrary positive semi-definite function, which overcomes a disadvantage of traditional value iteration algorithms. When the iterative control law and the iterative performance index function cannot be obtained accurately in each iteration, a new "design method of the convergence criteria" for the finite-approximation-error-based generalized value iteration algorithm is established for the first time. A suitable approximation error can be designed adaptively to make the iterative performance index function converge to a finite neighborhood of the optimal performance index function. Neural networks are used to implement the iterative ADP algorithm. Finally, two simulation examples are given to illustrate the performance of the developed method.
Published in: IEEE Transactions on Cybernetics ( Volume: 44, Issue: 12, December 2014)
Page(s): 2820 - 2833
Date of Publication: 26 September 2014

PubMed ID: 25265640

I. Introduction

Optimal control of nonlinear systems has been a focus of the control field for many decades [1]–[11]. Dynamic programming has long been a useful technique for handling optimal control problems, although it is often computationally intractable to apply directly to obtain the optimal solutions [12]. Characterized by strong self-learning and adaptive abilities, adaptive dynamic programming (ADP), proposed by Werbos [13], [14], has demonstrated the capability to find the optimal control policy and solve the Hamilton–Jacobi–Bellman (HJB) equation in a practical way [15]–[20]. In [21]–[26], data-driven ADP algorithms were developed to design the optimal control, where mathematical models of the control systems were unnecessary. In [27]–[29], hierarchical ADP with multiple-goal representation networks was investigated and the implementation efficiency of ADP was improved. Iterative methods are primary tools in ADP for obtaining the solution of the HJB equation indirectly and have received increasing attention [30]–[37].
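The generalized value iteration idea summarized in the abstract can be sketched numerically. The fragment below is only an illustrative toy, not the paper's algorithm: it uses a hypothetical one-dimensional system x_{k+1} = 0.8 sin(x_k) + u_k with quadratic utility U(x, u) = x² + u², runs the Bellman recursion V_{i+1}(x) = min_u { U(x, u) + V_i(F(x, u)) } on state and control grids, and, as the abstract describes, starts from an arbitrary positive semi-definite function Ψ(x) = 5x² rather than from zero. The grid interpolation of V_i stands in (loosely) for the finite approximation error discussed in the paper.

```python
import numpy as np

# Hypothetical 1-D discrete-time nonlinear system and quadratic utility;
# the dynamics, bounds, and grids are illustrative assumptions, not from the paper.
F = lambda x, u: 0.8 * np.sin(x) + u      # x_{k+1} = F(x_k, u_k)
U = lambda x, u: x**2 + u**2              # utility U(x, u)

xs = np.linspace(-2.0, 2.0, 81)           # state grid
us = np.linspace(-1.0, 1.0, 41)           # control grid

# Arbitrary positive semi-definite initialization Psi(x) = 5 x^2
# (a traditional value iteration would instead require V_0 = 0).
V = 5.0 * xs**2

for i in range(200):
    # Bellman backup V_{i+1}(x) = min_u { U(x,u) + V_i(F(x,u)) }.
    # V_i at off-grid successor states is linearly interpolated; this
    # interpolation error loosely mimics the finite approximation error.
    Q = U(xs[:, None], us[None, :]) + np.interp(F(xs[:, None], us[None, :]), xs, V)
    V_new = Q.min(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:  # successive iterates have converged
        V = V_new
        break
    V = V_new

print(float(V[xs.size // 2]))             # converged value at x = 0
```

Because the utility is positive definite and the origin is a zero-cost equilibrium, the iterates settle to a fixed point of the discretized Bellman operator; the value at x = 0 remains zero throughout, matching the intuition that staying at the equilibrium costs nothing.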
