I. Introduction
Over the past several decades, optimal control problems, especially those for nonlinear systems, have been a central focus in the control field [6]. As is well known, dynamic programming is a useful tool for solving optimal control problems. Nevertheless, owing to the “curse of dimensionality”, performing dynamic programming to obtain the optimal solution is often computationally untenable. Correspondingly, the adaptive dynamic programming (ADP) algorithm was proposed in [1], [2] to solve optimal control problems in a forward-in-time manner. Policy iteration and value iteration are the two primary classes of iterative ADP algorithms [3]. In [4], policy iteration algorithms were first used for the optimal control of continuous-time (CT) systems with continuous state and action spaces. In [5], the optimal control law for multiple actor-critic structures was effectively obtained using a shunting inhibitory artificial neural network (SIANN). Policy iteration for zero-sum and non-zero-sum games was discussed in [7]–[9]. In [10], the multi-agent optimal control law was obtained using fuzzy approximation structures. In [11], a policy iteration algorithm was developed for discrete-time (DT) nonlinear systems. Thereafter, a value iteration algorithm was presented for the optimal control of DT nonlinear systems in [12]. The value iteration algorithm for deterministic DT affine nonlinear systems was studied in [13], where it was proven that the iterative value function is nondecreasing and bounded, and hence converges to the optimum as the iteration index increases to infinity. In [6], [14], [15], value iteration algorithms with approximation errors were analyzed. Building on the framework of policy and value iteration, further investigations of iterative ADP algorithms have been developed [16]–[32].
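For concreteness, a minimal sketch of the value iteration recursion referred to above is given next; the dynamics $F$, utility $U$, and zero initial value function are generic placeholders under standard assumptions, not the specific formulation of [12], [13]. Starting from
\[
V_0(x_k) \equiv 0, \qquad
V_{i+1}(x_k) = \min_{u_k}\bigl\{ U(x_k, u_k) + V_i\bigl(F(x_k, u_k)\bigr) \bigr\}, \quad i = 0, 1, 2, \ldots,
\]
with the corresponding iterative control law
\[
u_i(x_k) = \arg\min_{u_k}\bigl\{ U(x_k, u_k) + V_i\bigl(F(x_k, u_k)\bigr) \bigr\},
\]
where $x_{k+1} = F(x_k, u_k)$ denotes the DT system dynamics and $U(\cdot,\cdot)$ is a positive definite utility function, the sequence $\{V_i\}$ is nondecreasing and bounded above under such conditions, and hence converges to the optimal value function as $i \to \infty$.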