Abstract:
Living organisms learn by acting on their environment, observing the resulting reward stimulus, and adjusting their actions accordingly to improve the reward. This action-based or reinforcement learning can capture notions of optimal behavior occurring in natural systems. We describe mathematical formulations for reinforcement learning and a practical implementation method known as adaptive dynamic programming. These give us insight into the design of controllers for man-made engineered systems that both learn and exhibit optimal behavior.
Published in: IEEE Circuits and Systems Magazine (Volume: 9, Issue: 3, Third Quarter 2009)
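As a rough illustration of the adaptive dynamic programming idea the abstract refers to, the sketch below runs policy iteration on a discrete-time LQR problem: policy evaluation solves a Lyapunov equation for the current feedback gain, and policy improvement computes the greedy gain from the evaluated value function, converging to the Riccati solution. This is a minimal sketch under assumed plant matrices and cost weights, using SciPy's standard solvers; it is not the article's implementation.

import numpy as np
from scipy.linalg import solve_discrete_lyapunov, solve_discrete_are

# Illustrative open-loop-stable plant x_{k+1} = A x_k + B u_k with quadratic cost
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)          # state weighting
R = np.array([[1.0]])  # control weighting

K = np.zeros((1, 2))   # initial stabilizing policy u_k = -K x_k (A itself is stable here)
for _ in range(50):
    # Policy evaluation: P solves P = (A - B K)^T P (A - B K) + Q + K^T R K
    A_cl = A - B @ K
    P = solve_discrete_lyapunov(A_cl.T, Q + K.T @ R @ K)
    # Policy improvement: greedy gain for the evaluated value function
    K_new = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    if np.linalg.norm(K_new - K) < 1e-10:
        K = K_new
        break
    K = K_new

# The iterates should converge to the gain given by the discrete algebraic Riccati equation
P_are = solve_discrete_are(A, B, Q, R)
K_are = np.linalg.solve(R + B.T @ P_are @ B, B.T @ P_are @ A)
print("policy-iteration gain:", K)
print("Riccati gain:         ", K_are)

Starting from any stabilizing gain, each evaluation/improvement pass is a model-based analogue of the "act, observe the cost, improve the policy" loop described in the abstract; reinforcement learning versions replace the Lyapunov solve with estimates built from measured data.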