An Adaptive Dynamic Programming Algorithm Based on ITF-OELM for Discrete-Time Systems | IEEE Conference Publication | IEEE Xplore

Abstract:

Adaptive dynamic programming (ADP) is an intelligent control method that requires no system model and directly approximates the optimal control policy via online learning. The weights of the action and critic networks are usually updated by a gradient algorithm; however, gradient descent-based learning can be very slow when the learning step is poorly chosen, and it may converge to a local minimum. To overcome these disadvantages, this paper introduces a novel ADP algorithm based on the initial-training-free online extreme learning machine (ITF-OELM), in which the critic network weights linking the hidden nodes to the output nodes are obtained by least squares instead of a gradient algorithm. The ADP algorithm based on ITF-OELM is tested on a discrete-time torsional pendulum system, and simulation results indicate that it makes the system converge in a shorter time than ADP based on the gradient algorithm.
Date of Conference: 22-24 May 2021
Date Added to IEEE Xplore: 30 November 2021
Conference Location: Kunming, China

1 Introduction

Model-based control techniques were developed to solve control problems under the assumption that a model of the controlled system is known. However, as production equipment becomes increasingly complicated, modeling a system is not easy and is sometimes impossible. It is therefore meaningful to study non-model-based control methods for unknown discrete-time control systems. Adaptive dynamic programming (ADP) [1]–[5] is an intelligent control method that can directly approximate the optimal control policy via online learning. Heuristic dynamic programming (HDP), dual heuristic programming (DHP), action-dependent heuristic dynamic programming (ADHDP), and action-dependent dual heuristic programming (ADDHP) are the four basic ADP structures [6]. HDP is a typical ADP structure; it was proposed in the 1970s, and the idea was firmed up in the early 1990s under the name of adaptive critic designs.
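The abstract contrasts gradient-based weight updates with output weights obtained by least squares. As a rough illustration of that idea only, the sketch below fixes a random hidden layer and updates the hidden-to-output weights with a generic recursive least-squares (RLS) rule started from a regularized matrix, so no initial training batch is needed. It is not the paper's ITF-OELM equations or its critic network: the tanh activation, the regularization value `reg`, the quadratic target standing in for a cost signal, and all function and class names are assumptions made for this example.

```python
import math
import random


def make_hidden_layer(n_inputs, n_hidden, seed=0):
    """Draw random input weights and biases once; they are never trained."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_hidden)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    return weights, biases


def hidden_output(x, weights, biases):
    """Hidden-layer activations h(x) for one input vector (tanh is an assumption)."""
    return [math.tanh(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, biases)]


class RLSOutputLayer:
    """Hidden-to-output weights updated sample by sample with recursive least squares."""

    def __init__(self, n_hidden, reg=1e-3):
        self.beta = [0.0] * n_hidden
        # P starts as I / reg, so the first sample can be processed directly.
        self.P = [[(1.0 / reg if i == j else 0.0) for j in range(n_hidden)]
                  for i in range(n_hidden)]

    def predict(self, h):
        return sum(b * h_i for b, h_i in zip(self.beta, h))

    def update(self, h, target):
        n = len(h)
        Ph = [sum(self.P[i][j] * h[j] for j in range(n)) for i in range(n)]
        denom = 1.0 + sum(h[i] * Ph[i] for i in range(n))
        gain = [v / denom for v in Ph]
        err = target - self.predict(h)
        self.beta = [b + g * err for b, g in zip(self.beta, gain)]
        # P <- P - gain * (h^T P); P is symmetric, so h^T P equals Ph transposed.
        self.P = [[self.P[i][j] - gain[i] * Ph[j] for j in range(n)]
                  for i in range(n)]
        return err


def run_demo(n_samples=200, n_hidden=20, seed=1):
    """Fit a toy quadratic target and return the error before and after training."""
    rng = random.Random(seed)
    weights, biases = make_hidden_layer(2, n_hidden)
    layer = RLSOutputLayer(n_hidden)
    samples = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
               for _ in range(n_samples)]

    def target(x):
        # Hypothetical stand-in for a cost value; not taken from the paper.
        return x[0] ** 2 + 0.5 * x[1] ** 2

    mse_before = sum(target(x) ** 2 for x in samples) / n_samples
    for x in samples:
        layer.update(hidden_output(x, weights, biases), target(x))
    mse_after = sum(
        (target(x) - layer.predict(hidden_output(x, weights, biases))) ** 2
        for x in samples) / n_samples
    return mse_before, mse_after


if __name__ == "__main__":
    before, after = run_demo()
    print("MSE with zero weights:", before)
    print("MSE after one RLS pass:", after)
```

Each call to `update` needs no learning rate, which is the practical difference from a gradient step. How this carries over to a critic network inside an ADP loop depends on details the text above does not give.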

1. R. Beard, G. Saridis and J. Wen, "Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation", Automatica, vol. 33, no. 12, pp. 2159-2177, Dec. 1997.
2. J. Murray, C. Cox, G. Lendaris and R. Saeks, "Adaptive dynamic programming", IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 32, no. 2, pp. 140-153, May 2002.
3. T. Bian, Y. Jiang and Z.-P. Jiang, "Adaptive dynamic programming and optimal control of nonlinear nonaffine systems", Automatica, vol. 50, no. 10, pp. 2624-2632, Oct. 2014.
4. D. Vrabie, O. Pastravanu, M. Abu-Khalaf and F. L. Lewis, "Adaptive optimal control for continuous-time linear systems based on policy iteration", Automatica, vol. 45, no. 2, pp. 477-484, Feb. 2009.
5. Y. Jiang and Z.-P. Jiang, "Robust adaptive dynamic programming and feedback stabilization of nonlinear systems", IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 5, pp. 882-893, May 2014.
6. P. Werbos, "Advanced forecasting methods for global crisis warning and models of intelligence", General Systems Yearbook, vol. 22, Jan. 1977.
7. J. Lu, Q. Wei and F. Wang, "Parallel control for optimal tracking via adaptive dynamic programming", IEEE/CAA Journal of Automatica Sinica, vol. 7, no. 6, pp. 1662-1674, Nov. 2020.
8. Y. Zhu, D. Zhao and H. He, "Invariant adaptive dynamic programming for discrete-time optimal control", IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 50, no. 11, pp. 3959-3971, Nov. 2020.
9. J. Zhao, J. Na and G. Gao, "Adaptive dynamic programming based robust control of nonlinear systems with unmatched uncertainties", Neurocomputing, vol. 395, pp. 56-65, Jun. 2020.
10. G. Huang, Q. Y. Zhu and C. K. Siew, "Extreme learning machine: a new learning scheme of feedforward neural networks", Neural Networks, vol. 2, no. 3, pp. 985-990, 2004.
11. G. Huang, Q. Y. Zhu and C. K. Siew, "Extreme learning machine: theory and applications", Neurocomputing, vol. 70, no. 1–3, pp. 489-501, 2006.
12. G. Huang, L. Chen and C. K. Siew, "Universal approximation using incremental constructive feedforward networks with random hidden nodes", IEEE Transactions on Neural Networks, vol. 17, no. 4, pp. 879-892, Jul. 2006.
13. G. Huang, Q. Y. Zhu and C. K. Siew, "Extreme learning machine: theory and applications", Neurocomputing, vol. 70, no. 1–3, pp. 489-501, Dec. 2006.
14. N. Liang, G. Huang, P. Saratchandran and N. Sundararajan, "A fast and accurate online sequential learning algorithm for feedforward networks", IEEE Transactions on Neural Networks, vol. 17, no. 6, pp. 1411-1423, 2006.
15. H. T. Huynh and Y. Won, "Regularized online sequential learning algorithm for single-hidden layer feed forward neural networks", Pattern Recognition Letters, vol. 32, no. 14, pp. 1930-1935, 2011.
16. X. Gao, K. Wong, P. Wong and C. M. Vong, "Adaptive control of rapidly time-varying discrete-time system using initial-training-free online extreme learning machine", Neurocomputing, vol. 194, pp. 117-125, 2016.
17. P. Bartlett, "The sample complexity of pattern classification with neural networks: The size of the weights is more important than the size of the network", IEEE Transactions on Information Theory, vol. 44, no. 2, pp. 525-536, Mar. 1998.
18. J. Si and Y. T. Wang, "Online learning control by association and reinforcement", IEEE Transactions on Neural Networks, vol. 12, no. 2, pp. 264-276.
19. J. Si, A. Barto, W. Powell and W. Donald, Handbook of learning and approximate dynamic programming: scaling up to the real world, New York: IEEE Press John Wiley Sons, 2004.
20. D. Liu and Q. Wei, "Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems", IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 3, pp. 621-634, 2014.
21. S. C. Chapra, Numerical Methods for Engineerings, New Delhi, India: New Age Int. Company, 2005.