I. Introduction
The present work has its roots both in the approximate dynamic programming/adaptive critic concept [2], [30], [20], [32], [16], in which soft computing techniques are used to approximate the solution of a dynamic programming algorithm without the explicit imposition of a stability or convergence constraint, and in the authors' stability criteria for these algorithms [6], [24]. Alternatively, a number of authors have combined hard and soft computing techniques to develop tracking controllers. These include Lyapunov synthesis techniques using both neural [25], [28], [18], [5], [21] and fuzzy learning laws [28], [29], [17], sliding mode techniques [31], and input–output techniques [9]. The objective of the present paper is to describe an adaptive dynamic programming algorithm (ADPA) that uses soft computing techniques to learn the optimal cost (or return) functional for a stabilizable nonlinear system with unknown dynamics, and hard computing techniques to verify the stability and convergence of the algorithm.