1 INTRODUCTION
A large class of real systems is controlled by more than one controller or decision maker, each using an individual strategy. These controllers often operate as a group under a general quadratic performance index function, a setting studied in game theory, which has been widely applied in management, military battles, power networks, and various types of contests [1]–[6]. The two-player zero-sum game with a general quadratic performance index function is an important part of game theory: the two players act on the performance index function jointly, one minimizing it while the other maximizes it, so the solution is a minimax problem. Over the past decades, the optimal strategies of the linear zero-sum game and the affine nonlinear zero-sum game have received a great deal of attention in the literature [6], [8]–[12], [15]. These systems have the form $$\dot{x}(t)=f(x)+g(x)u+k(x)d \eqno{\hbox{(1)}}$$ with the performance index function $$V(x)={1\over 2}\int_{t_{0}}^{\infty}(x^{T}x+u^{T}u-\gamma^{2}d^{T}d)dt \eqno{\hbox{(2)}}$$ where $x$ is the state, and $u$ and $d$ are the inputs: $u$ seeks to minimize the performance index function while $d$ seeks to maximize it. In [8], Al-Tamimi et al. applied the heuristic dynamic programming and dual heuristic dynamic programming structures to solve a discrete-time linear quadratic zero-sum game problem in which the state and action spaces are continuous. They then designed the optimal strategies of the discrete-time linear quadratic zero-sum game without knowing the system dynamics matrices, using a model-free Q-learning approach [9]. A class of continuous-time affine nonlinear quadratic zero-sum game problems was studied by Wei et al. in [13]. Abu-Khalaf et al. studied the affine nonlinear zero-sum game problem in [10] and used neural networks to solve it in [11]. It is worth mentioning that most of the above discussions focus on linear or affine nonlinear zero-sum game problems.
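For concreteness, the saddle-point solution of (1) and (2) admits a standard characterization via the Hamilton–Jacobi–Isaacs (HJI) equation; the following is a textbook-style sketch, not a result quoted from [8]–[13], and it assumes a smooth value function exists, with $\nabla V^{\ast}$ denoting $\partial V^{\ast}/\partial x$: $$V^{\ast}(x)=\min_{u}\max_{d}{1\over 2}\int_{t_{0}}^{\infty}(x^{T}x+u^{T}u-\gamma^{2}d^{T}d)dt$$ Setting the derivatives of the Hamiltonian with respect to $u$ and $d$ to zero yields the strategies $u^{\ast}=-g^{T}(x)\nabla V^{\ast}$ and $d^{\ast}={1\over\gamma^{2}}k^{T}(x)\nabla V^{\ast}$, and substituting them back gives the HJI equation $$0={1\over 2}x^{T}x+(\nabla V^{\ast})^{T}f(x)-{1\over 2}(\nabla V^{\ast})^{T}g(x)g^{T}(x)\nabla V^{\ast}+{1\over 2\gamma^{2}}(\nabla V^{\ast})^{T}k(x)k^{T}(x)\nabla V^{\ast}$$ which the works cited above solve, exactly or approximately, for the linear and affine nonlinear cases.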