I. Introduction
Fuzzy neural networks (FNNs) have long been considered successful learning machines for function approximation, pattern recognition, classification, and image processing. They can be applied to problems for which no known mathematical model is available. The universal approximation property is important for the success of an FNN in a variety of applications [1], [4], as it provides more flexibility in designing an appropriate learning machine for nonlinear problems. A further advantage of using a fuzzy neural network for machine learning is that its parameters usually have clear physical meanings, so intuitive methods exist for choosing good initial values for them.

Many algorithms are available for training an FNN, such as the back-propagation (BP) algorithm, the genetic algorithm (GA) [5], the particle swarm optimization (PSO) algorithm [6], and differential evolution (DE) [7]. The simplest FNN training scheme uses the incremental gradient descent approach of the BP algorithm [8]. Unfortunately, a solution obtained with the BP algorithm may easily become trapped in a local minimum of the cost surface, especially for nonlinearly separable pattern classification problems or complex function approximation problems, and may never reach a near-optimal solution [9]. Moreover, the BP algorithm is quite sensitive to the initial settings of the connection weights and the learning rate.
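The incremental gradient descent update underlying BP training, and its sensitivity to the learning rate, can be illustrated with a minimal sketch. The single linear neuron, the training data, and the learning rate below are illustrative assumptions, not taken from any cited work:

```python
# Minimal sketch of incremental (per-sample) gradient descent, the update
# rule used in BP training: w <- w - eta * dE/dw, with E = 0.5 * error^2.
# A single linear neuron is fit to the illustrative target y = 2x.
samples = [(x / 10.0, 2.0 * (x / 10.0)) for x in range(1, 11)]

w = 0.0    # connection weight; BP is sensitive to this initial setting
eta = 0.1  # learning rate; BP is also sensitive to this choice

for epoch in range(100):
    for x, y in samples:       # incremental: update after every sample
        error = w * x - y      # prediction error on this sample
        w -= eta * error * x   # gradient of 0.5 * error^2 w.r.t. w

print(round(w, 3))  # -> 2.0, the true weight
```

For this convex one-parameter example the iteration converges; the local-minima problem discussed above arises once the cost surface of a multi-layer FNN is non-convex.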