I. Introduction
Optimal control is generally an offline design technique that requires full knowledge of the system dynamics [1]; in the linear case, for example, one must solve the Riccati equation. In the nonlinear case, approximation methods are used to solve Hamilton–Jacobi–Bellman (HJB) equations [2]–[4]. Adaptive/approximate dynamic programming (ADP) is one of the most effective intelligent approximate control methods for solving nonlinear HJB equations [5]–[10]. For example, in [11], a novel numerically adaptive learning control scheme based on ADP was developed to solve the HJB equation numerically; this was the first result applying numerical ADP to optimal control problems for nonlinear systems. In [12], a finite-horizon iterative ADP algorithm was developed to obtain the optimal solution of the HJB equation for a class of discrete-time nonlinear systems with an unfixed initial state; this was the first result on obtaining an ε-optimal control law in finite time for an arbitrary initial state in the initial state set. However, owing to their large scale and complex manufacturing processes, the dynamics of many industrial systems are difficult to estimate and cannot be obtained accurately [13]–[16]. Optimal adaptive controllers have therefore been designed using indirect techniques, whereby the unknown plant is first identified and the HJB equation is then solved [17]–[19].
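As a concrete instance of the linear case mentioned above, the infinite-horizon optimal (LQR) gain follows from solving the continuous-time algebraic Riccati equation. A minimal sketch using SciPy's solver; the double-integrator plant and the weights Q, R below are illustrative choices, not taken from this paper:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative double-integrator plant (not from the paper).
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)          # state cost weight
R = np.array([[1.0]])  # control cost weight

# Solve the algebraic Riccati equation
#   A'P + PA - P B R^{-1} B' P + Q = 0
# for P, then form the optimal state-feedback gain K = R^{-1} B' P.
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

# The optimal control is u = -Kx; the closed loop A - BK is Hurwitz.
print(np.all(np.linalg.eigvals(A - B @ K).real < 0))  # True
```

For nonlinear dynamics no such closed-form route exists, which is what motivates the approximate HJB solutions discussed next.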