Introduction
Approaches based on recurrent neural networks for solving optimization problems, which use analog computation implemented on electronic devices to replace numerical computation realized by mathematical algorithms, have attracted considerable attention (see [102], [105], [131], [220], [233], [248], [251], [252], [254], [255], and the references therein). However, due to the existence of many equilibrium points of recurrent neural networks, spurious suboptimal responses are likely to be present [69], [205], [254], [258], which limit the applications of recurrent neural networks. Thus, the global asymptotic/exponential stability of a unique equilibrium point for the concerned recurrent neural networks is of great importance from a theoretical and application point of view [23], [55], [69], [103], [204], [205], [281], [282].
Early research on the stability of recurrent neural networks focused on symmetric recurrent neural networks with or without delays. References [23], [105], [106], [145], [200], [254], and [287] looked into the dynamic stability of symmetrically connected networks and showed their practical applications to optimization problems. Cohen and Grossberg [55] presented analytical results on the global stability of symmetric recurrent neural networks. A brief review of the dynamics and stability of symmetrically connected networks is presented in [199], in which the effects of time delay, the eigenvalues of the interconnection matrix, and the gain of the activation function on the local dynamics or stability of symmetric Hopfield neural networks were discussed in detail.
Both in practice and theory, the symmetric restriction on the connection matrix of recurrent neural networks is too strong, while asymmetric connection structures are more general [8], [206], [281]. For instance, a nonsymmetric interconnection matrix may originate from slight perturbations in the electronic implementation of a symmetric matrix. Asymmetries of the interconnection matrix may also be deliberately introduced to accomplish special tasks [267] or may be related to the attempt to consider a more realistic model of some classes of neural circuits composed of the interconnection of two different sets of amplifiers (e.g., neural networks for nonlinear programming [131]). Therefore, the local and global stability of asymmetrically connected neural networks has been widely studied [30], [239], [274]. As pointed out in [69], the topic of global stability of neural networks is more significant than that of local stability in applications such as signal processing and optimization problems. These important applications motivated researchers to investigate the dynamical behaviors of neural networks and global stability conditions of neural networks [23], [103], [281], [282]. Reference [130] applied the contraction mapping theory to obtain some sufficient conditions for global stability. Reference [201] generalized some results in [103] and [130] using a new Lyapunov function. Reference [126] proved that diagonal stability of the interconnection matrix implied the existence and uniqueness of an equilibrium point and the global stability at the equilibrium point. References [65], [69], and [73] pointed out that the negative semidefiniteness of the interconnection matrix guaranteed the global stability of the Hopfield networks, which generalized the results in [103], [126], [130], and [201]. Reference [63] applied the matrix measure theory to get some sufficient conditions for global and local stability. References [122] and [123] discussed the stability of a delayed neural network using a Lyapunov function and established a Lyapunov diagonal stability (LDS) condition on the interconnection matrix. References [30]–[32], [35], and [172] introduced a direct approach to address the stability of delayed neural networks, in which the existence of an equilibrium point and its stability were proved simultaneously without using complicated theory, such as degree theory and homeomorphism theory. Note that the above references provided global stability criteria of recurrent neural networks using different algebraic methods, which reflect the different measure scales on the stability property due to the different sufficient conditions. These expressions of global stability criteria can generally be divided into two categories: 1) the LDS condition and 2) the matrix measure stability condition, which have been developed extensively in parallel. The former condition considers the effects of the positive and negative signs, that is, the excitatory (positive) and inhibitory (negative) connections in the interconnection matrix, while the latter does not distinguish between them.
It is well known that symmetrically connected analog neural networks without delays operating in continuous time will not oscillate [103]–[106], in which it is assumed that neurons communicate and respond instantaneously. In electronic neural networks, time delays will occur due to the finite switching speed of amplifiers [9], [54], [200], [236], [237]. Designing a network to operate more quickly will increase the relative size of the intrinsic delay and can eventually lead to oscillation [200]. In biological neural networks, it is well known that time delay can cause a stable system to oscillate [79]. Time delay has become one of the main sources of instability. Therefore, the study of the effects of time delay on the stability and convergence of neural networks has attracted considerable attention in the neural network community [9], [21], [25], [54], [113], [236], [237]. Under certain symmetric connectivity assumptions, neural networks with time delay will be stable when the magnitude of the time delay does not exceed certain bounds [13], [200], [290]. For asymmetric neural networks with delays, sufficient stability conditions independent of or depending on the magnitude of delays were also established [50], [81], [256], [300]. These results are mostly based on linearization analysis and energy and/or Lyapunov function methods. Recently, most of the stability results are for recurrent neural networks with delays, such as discrete delays, distributed delays, neutral-type delays, and other types of delays, and many different analysis methods have been proposed. Since 2002 [162], [163], the linear matrix inequality (LMI) method has been used in the stability analysis of recurrent neural networks, and many different LMI-based stability results have since been developed. To date, LMI-based stability analysis of recurrent neural networks is still one of the most commonly used methods in the neural network community.
More recently, much effort has been devoted to various stability analyses of recurrent neural networks [5], [135], [245], [263], [278], [304], [313], [325]. A detailed survey and summary of stability results are necessary for understanding the development of the stability theory of recurrent neural networks. Although there are some literature surveys available on the stability of recurrent neural networks [83], [103], [199], exhaustive/cohesive reviews on the stability of recurrent neural networks are still lacking, which motivates us to present a comprehensive review on this specific topic. Although there are many different types of recurrent neural networks, including complex-valued neural networks [110], [285] and fractional-order neural networks [125], this paper is mainly concerned with real-valued continuous-time recurrent neural networks described by ordinary differential equations in the time domain.
This paper is organized as follows. In Section II, the research categories of the stability of recurrent neural networks are presented, which include the evolution of recurrent neural network models, activation functions, connection weight matrices, main types of Lyapunov functions, and different kinds of expression forms of stability results. In Section III, a brief review of the early methods for the stability analysis of recurrent neural networks is presented. In Section IV, the LMI-based approach is discussed in detail and some related proof methods for LMI-based stability results are also analyzed. In Section V, two classes of Cohen-Grossberg neural networks are discussed, and some related LMI-based stability criteria are introduced and compared. In Section VI, the stability problem of recurrent neural networks with discontinuous activation functions is presented. The emphasis is placed on recurrent neural networks without delays. In Section VII, some necessary and sufficient conditions for the dynamics of recurrent neural networks without delays are developed. In Section VIII, stability problems of recurrent neural networks with multiple equilibrium points are discussed, which is a useful complement to the study of neural networks with a unique equilibrium point. The conclusion and some future directions are finally given in Section IX, which point out some potential and promising directions for the stability analysis of recurrent neural networks.
Scope of Recurrent Neural Network Research
A recurrent neural network model is mainly composed of such components as self-feedback connection weights, activation functions, interconnection weights, amplification functions, and delays. To establish efficient stability criteria, two approaches are usually adopted. One is to efficiently use the information of recurrent neural networks under different assumptions. The other is to relax the assumptions on the neural networks using novel mathematical techniques. Along these lines, we will give a detailed review of the stability research on recurrent neural networks in this section.
A. Evolution of Recurrent Neural Network Models
Since Hopfield [105] and Cohen and Grossberg [55] proposed two types of recurrent neural network models in the 1980s, modified models have been frequently proposed by incorporating different internal and external factors. In particular, since time delays were incorporated into the network models, stability research on delayed neural networks has made significant progress. A short review of the evolution of neural network models with delays is presented in [305]. However, with the development of the theory of neural networks, some new variations have taken place in neural network models. Therefore, in this section, we will briefly review some basic models of recurrent neural networks and their recent variants.
Cohen and Grossberg [55] proposed a neural network model described by \begin{equation} \dot {u}_i (t) = d_i (u_i (t))\bigg [ a_i (u_i (t)) - \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))\bigg ]\end{equation} where \(u_i(t)\) is the state of the \(i\)th neuron, \(d_i(\cdot )\) is the amplification function, \(a_i(\cdot )\) is a function satisfying the conditions discussed in Section V, \(w_{ij}\) are the interconnection weights, and \(g_j(\cdot )\) are the activation functions.
Hopfield [105] proposed the following continuous-time Hopfield neural network model\begin{equation} \dot {u}_i (t) = - \gamma _i u_i (t) + \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))+U_i\end{equation} where \(\gamma _i>0\) is the self-feedback connection weight, \(w_{ij}\) are the interconnection weights, \(g_j(\cdot )\) are the activation functions, and \(U_i\) is the external input.
In (1) and (2), it was assumed that neurons communicated and responded instantaneously. However, in electronic neural networks, time delay will be present due to the finite switching speed of amplifiers, which will be a source of instability. Moreover, many motion-related phenomena can be represented and/or modeled by delay-type neural networks, which make the general neural networks with feedback and delay-type synapses become more important as well [236]. Time delay was first considered in Hopfield model in [200], which was described by the following delayed networks\begin{equation} \dot {u}_i (t) = - u_i (t) + \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t-\tau )).\end{equation}
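To make the effect of the transmission delay concrete, the following is a minimal simulation sketch of the purely delayed model (3) using a forward Euler scheme with a history buffer; the tanh activation, the weight matrix, the delay, and the step size are hypothetical choices for illustration only, not taken from the cited references.

```python
import numpy as np

# Forward Euler simulation of the purely delayed Hopfield-type model (3):
#   du_i/dt = -u_i(t) + sum_j w_ij * g_j(u_j(t - tau))
# All parameters below are hypothetical and chosen only for illustration.
n, tau, dt, T = 3, 1.0, 0.01, 20.0
W = np.array([[ 0.0, -1.2,  0.5],
              [ 0.8,  0.0, -0.7],
              [-0.5,  0.9,  0.0]])      # delayed connection weights w_ij
g = np.tanh                              # bounded, monotonically increasing activation

steps = int(T / dt)
delay_steps = int(tau / dt)
u = np.zeros((steps + 1, n))
u[:delay_steps + 1] = 0.1 * np.random.randn(n)   # constant initial history on [-tau, 0]

for k in range(delay_steps, steps):
    u_delayed = u[k - delay_steps]               # u(t - tau)
    u[k + 1] = u[k] + dt * (-u[k] + W @ g(u_delayed))

print("state at t = T:", u[-1])
```

Increasing `tau` (or scaling up `W`) in this sketch is a quick way to observe numerically the delay-induced oscillations discussed above.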
Ye et al. [290] introduced the constant discrete delays into (1), which is in the following form\begin{equation} \dot {u}_i (t) \!=\! - d_i (u_i (t))\bigg [a_i (u_i (t)) \!-\! \sum \limits _{k = 0}^N {\sum \limits _{j = 1}^n w_{ij}^k } g_j (u_j (t \!-\! \tau _k ))\bigg ]\qquad\end{equation}
Note that neural networks similar to (2) are only for the case of instantaneous transmission of signals. Due to the effect of signal delay, the following model has been widely considered as an extension of (2): \begin{align} \dot {u}_i (t)\!&=\!-\gamma _iu_i (t)\!+\!\!\sum _{j=1}^nw_{ij}g_j(u_j(t))\!+\!U_i\!+\!\!\sum _{j=1}^nw^1_{ij} g_j (u_j (t\!-\!\tau ))\notag\\\text{}\end{align}
In many real applications, signals that are transmitted from one point to another may experience a few network segments, which can possibly induce successive delays with different properties due to various network transmission conditions. Therefore, it is reasonable to combine them together, which leads to the following model\begin{align} \dot {u}_i (t) &= -\gamma _iu_i (t)+\sum \limits _{j=1}^nw_{ij}g_j(u_j(t))+U_i\notag\\ &\quad +\sum \limits _{j=1}^nw^1_{ij} g_j \left (u_j \left (t-\sum \limits _{k=1}^m\tau _k \right )\right ).\end{align}
The use of discrete time delay in the models of delayed feedback systems serves as a good approximation in simple circuits containing a small number of neurons. However, neural networks usually have a spatial extent due to the presence of a multitude of parallel pathways with a variety of axon sizes and lengths. There will be a distribution of propagation delays. In this case, the signal propagation is no longer instantaneous and cannot be modeled with discrete delays. It is desired to model them by introducing continuously distributed delays [24], [28], [42], [82], [114], [115], [142], [179], [193], [209], [255], [276], [305]. The extent to which the values of the state variable in the past affect their present dynamics is determined by a delay kernel. The case of constant discrete delay corresponds to a choice of the delay kernel being a Dirac delta function [29]. Nowadays, there are generally two types of continuously distributed delays in the neural network models. One is the finite distributed delay \begin{equation} \dot {u}_i (t) = -a_i (u_i (t)) + \sum \limits _{j = 1}^n {w_{ij} } \int _{t-\tau (t)}^t g_j (u_j (s))\mathrm {d}s\end{equation}
The other is the infinite distributed delay\begin{equation} \dot {u}_i (t) = -a_i (u_i (t))\!+\!\sum \limits _{j = 1}^n {w_{ij} } \int _{-\infty }^t K_{ij}(t-s) g_j (u_j (s))\mathrm {d}s\qquad\end{equation}
or, with the distributed delay acting inside the activation function\begin{equation} \dot {u}_i (t) = -a_i (u_i (t))\!+\!\sum \limits _{j = 1}^n {w_{ij} } g_j \bigg (\int _{-\infty }^t K_{ij}(t-s) u_j (s)\mathrm {d}s\bigg )\quad\end{equation}
When the delay kernel \(K_{ij}(\cdot )\) is chosen as the Dirac delta function \(\delta (\cdot -\tau _{ij})\), the above model reduces to one with constant discrete delays\begin{equation} \dot {u}_i (t) = -a_i (u_i (t)) + \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t-\tau _{ij})).\end{equation}
The following recurrent neural networks with a general continuously distributed delays were proposed and studied in [27], [36], [34], [29], [165], [178], [268], and [271]\begin{align} \dot u_i(t) &=-\gamma _i u_i(t)+\sum _{j=1}^n\int _0^{\infty } g_j(u_j(t-s))\mathrm {d}J_{ij}(s)\notag\\ &\quad +\sum _{j=1}^n\int _0^{\infty } g_j(u_j(t-\tau _{ij}(t)-s))\mathrm {d}K_{ij}(s)+U_i\qquad \quad\end{align}
By choosing either neuron states (the external states of neurons) or local field states (the internal states of neurons) as basic variables, a dynamic recurrent neural network is usually cast either as a static neural network model or as a local field neural network model [227], [284]. The recurrent backpropagating neural networks [102], the brain-state-in-a-box/domain type neural networks [257], and the optimization-type neural networks [73], [272] are modeled as a static neural network model described in the following matrix-vector form\begin{equation} \dot u(t)=-Au(t)+g(Wu(t)+U)\end{equation}
while the local field neural network model is described by\begin{equation} \dot u(t)=-Au(t)+Wg(u(t))+U.\end{equation}
A more general model, which contains both the static and the local field descriptions as special cases and includes a time-varying delay term, is\begin{equation} \dot u(t)=-Au(t)+W_0g(W_2u(t))+W_1g(W_2u(t-\tau (t)))\qquad\end{equation}
and, when the delay consists of two additive time-varying components, it becomes\begin{align} \dot u(t) \!= \!-Au(t)\!+\!W_0g(W_2u(t))\!+\!W_1g(W_2u(t\!-\!\tau _1(t)\!-\!\tau _2(t)))\notag\\\text{}\end{align}
There are many different factors considered in the neural network models, such as stochastic actions [44], reaction-diffusion actions [15], [164], [217], [230], [232], [238], [240], [269], [318], [326], high-order interactions [57], [262], impulse and switching effects [229], and so on. These effects are all superimposed on the elementary Hopfield neural networks or Cohen-Grossberg neural networks, which lead to many complex neural network models in different applications. There are many internal or external effects considered in practical neural networks besides many different types of delays.
B. Evolution of the Activation Functions
Many early results on the existence, uniqueness, and global asymptotic/exponential stability of the equilibrium point concern the case that activation functions are continuous, bounded, and strictly monotonically increasing. However, when recurrent neural networks are designed for solving optimization problems in the presence of constraints (linear, quadratic, or more general programming problems), unbounded activation functions modeled by diode-like exponential-type functions are needed to impose the constraints. Because of the differences between bounded and unbounded activation functions, extensions of the results with bounded activation functions to unbounded cases are not straightforward. Therefore, many different classes of activation functions have been proposed in the literature. Note that a suitable and more generalized activation function can greatly improve the performance of neural networks. For example, the property of the activation function is important to the capacity of neural networks. References [212] and [213] showed that the absolute capacity of an associative memory model can be remarkably improved by replacing the usual sigmoid activation function with a nonmonotonic activation function. Therefore, it is very significant to design a new neural network with a more generalized activation function. In recent years, many researchers have devoted their attention to attaining this goal by proposing new classes of generalized activation functions. Next, we will describe various types of activation functions used in the literature.
In the early research of neural networks, different types of activation functions are used, for example, threshold function [104], piecewise linear function [1], [151], [292], signum function [93], hyperbolic tangent function [2], hard-limiter nonlinearity [197], and so on. In the following, we are mainly concerned with Lipschitz-continuous activation functions and their variants.
The following sigmoidal activation functions have been used in [105], [106], [254], [261], and [290]:
\begin{align} g_{i}^{\prime }(\zeta )&=\mathrm {d}g_{i}(\zeta )/\mathrm {d}\zeta >0, \lim _{\zeta \rightarrow +\infty }g_{i}(\zeta )=1, \notag\\ \lim _{\zeta \rightarrow -\infty }g_{i}(\zeta )&=-1, \lim _{|\zeta |\rightarrow \infty }g_{i}^{\prime }(\zeta )=0\end{align}
where \(g_{i}(\cdot )\) is the activation function of the \(i\)th neuron, \(i=1,\dotsc , n\), and \(n\ge 1\) is the number of neurons. Obviously, it is differentiable, monotonic, and bounded.
The following activation functions have been used in [137], [265], [272], [301], and [315]:
\begin{equation} |g_i(\zeta )-g_i(\xi )|\le \delta _i|\zeta -\xi |\end{equation}
no matter whether the activation function is bounded or not. As pointed out in [265], this type of activation function in (17) is not necessarily monotonic and smooth.
The following activation functions have been employed in [16], [35], and [166]:
\begin{equation} 0<\frac {g_i(\zeta )-g_i(\xi )}{\zeta -\xi }\le \delta _i.\end{equation}
The following activation functions have been employed in [19], [52], [137], and [315]:
\begin{equation} 0\le \frac {g_i(\zeta )-g_i(\xi )}{\zeta -\xi }\le \delta _i.\end{equation}
The following activation functions are developed in [43], [135], [138], [139], [176], [177], [242], [294], and [304]:
\begin{equation} \delta _i^-\le \frac {g_i(\zeta )-g_i(\xi )}{\zeta -\xi }\le \delta _i^+.\end{equation}
As pointed out in [138], [139], and [177], the constants \(\delta _i^-\) and \(\delta _i^+\) here are allowed to be positive, negative, or zero, so this class of activation functions is more general than the preceding ones and need not be monotonic, bounded, or differentiable.
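As a quick numerical illustration of this last, most general sector condition, the following sketch estimates \(\delta _i^-\) and \(\delta _i^+\) for a sample activation function on a grid of points; the particular nonmonotonic function chosen here is hypothetical and is not taken from the cited references.

```python
import numpy as np

# Estimate sector bounds  delta^- <= (g(x) - g(y)) / (x - y) <= delta^+  on a grid.
def sector_bounds(g, lo=-5.0, hi=5.0, m=400):
    x = np.linspace(lo, hi, m)
    X, Y = np.meshgrid(x, x)
    mask = np.abs(X - Y) > 1e-8                    # exclude the diagonal x == y
    slopes = (g(X[mask]) - g(Y[mask])) / (X[mask] - Y[mask])
    return slopes.min(), slopes.max()

# A hypothetical nonmonotonic activation: its difference quotient changes sign,
# so delta^- comes out negative, which the earlier conditions would not allow.
g = lambda s: np.tanh(s) - 0.3 * np.sin(s)

d_minus, d_plus = sector_bounds(g)
print(f"estimated sector bounds: delta^- ~ {d_minus:.3f}, delta^+ ~ {d_plus:.3f}")
```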
One of the important things associated with activation functions is the existence and uniqueness of the equilibrium point of recurrent neural networks. Now, we will give a brief comment on this problem.
In general, for the case of bounded activation functions satisfying Lipschitz continuous conditions, the existence of the solution can be guaranteed by the existence theorem of ordinary differential equations [65], [73], [85], [182], [201], [207], [243], [261], [307].
For the unbounded activation function in the general form, the existence of the equilibrium point is established mainly on the basis of homeomorphism mapping [28], [73], [180], [307], [321], Leray-Schauder principle [191], and so on.
Another important issue associated with activation functions is whether the existence, uniqueness, and global asymptotic/exponential stability must be dealt with simultaneously in the stability analysis of recurrent neural networks. This question is often encountered, and there was no consistent viewpoint on it in the early days of the stability theory of neural networks. It leads to two classes of routines in the stability analysis of recurrent neural networks: 1) to directly present the global asymptotic/exponential stability results without the proof of the existence and uniqueness of the equilibrium point and 2) to give a complete proof of the existence, uniqueness, and global asymptotic/exponential stability. Clearly, this question must be clarified before the stability analysis of recurrent neural networks proceeds.
From a mathematical point of view, it is necessary to establish the existence (and, if applicable, uniqueness) of equilibrium point(s) to prove stability. However, according to different requirements on the activation function, one can adopt slightly different routines in the stability proof of the equilibrium point.
For the general case of the bounded activation functions, we can directly present the proof of the global asymptotic/exponential stability as it is well known that the bounded activation function always guarantees the existence of the equilibrium point [65], [73], [85], [261]. For the quasi-Lipschitz case, the existence of equilibrium point is also guaranteed as in the case of bounded activation functions. Therefore, it suffices to present the proof of the global asymptotic/exponential stability of the equilibrium point for recurrent neural networks with bounded activation functions, and the uniqueness of the equilibrium point follows directly from the global asymptotic/exponential stability [181].
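One standard way to see why bounded activations guarantee the existence of an equilibrium (a sketch of the usual fixed-point argument, not necessarily the specific proof given in the cited references) is as follows. An equilibrium of (5) solves \(\Gamma u=(W+W^1)g(u)+U\), i.e., it is a fixed point of the continuous map
\begin{equation*} H(u)=\Gamma ^{-1}\big [(W+W^1)g(u)+U\big ].\end{equation*}
If \(|g_i(\cdot )|\le G\) for all \(i\), then \(\|H(u)\|\le \|\Gamma ^{-1}\|\big (\|W+W^1\|G\sqrt {n}+\|U\|\big )=:r\) for all \(u\), so \(H\) maps the closed ball of radius \(r\) into itself, and Brouwer's fixed-point theorem yields at least one equilibrium; uniqueness then follows from the global asymptotic/exponential stability, as noted above.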
For the case of unbounded activation functions, on the contrary, one must provide the proof of the existence, uniqueness, and global asymptotic/exponential stability of the equilibrium point for the concerned neural networks simultaneously.
The activation functions listed above belong to the class of continuous functions. For more details on the relationships among globally Lipschitz continuous, partially Lipschitz continuous, and locally Lipschitz continuous activation functions, readers can refer to [22] and [280]. Some discontinuous activation functions also exist in practical applications. For example, in the classical Hopfield neural networks with graded response neurons [105], the standard assumption is that the activations are employed in the high-gain limit, where they closely approximate a discontinuous hard comparator function. Another important example concerns the class of neural networks introduced in [131] to solve linear and nonlinear programming problems, in which the constraint neurons have diode-like input-output activations. To guarantee satisfaction of the constraints, the diodes are required to possess a very high slope in the conducting region, i.e., they should approximate the discontinuous characteristic of an ideal diode. Therefore, the following activation functions are for the discontinuous case.
Discontinuous activation functions [68], [71], [72], [116], [173], [175], [183], [184], [216], [263]: Let \(g_i(\cdot )\) be a piecewise continuous nondecreasing function, and in every compact set of real space \(\mathcal {R}\), each \(g_i(\cdot )\) has only finite discontinuity points. Therefore, in any compact set in \(\mathcal {R}\), except some finite points \(\rho _k\), there exist finite right and left limits \(g_i(\rho ^+)\) and \(g_i(\rho _-)\) with \(g_i(\rho ^+) > g_i(\rho _-)\). In general, one assumes \(g_i(\cdot )\) to be bounded, i.e., there exists a positive number \(G>0\), such that \(|g_i(\cdot )|\le G\). Stability analysis of neural networks with discontinuous activation functions has drawn many researchers' attention, and many related results have been published in the literature since the independent pioneering works of Forti and Nistri [71] and Lu and Chen [183]. Hopfield neural networks with bounded discontinuous activations were first proposed in [71], in which the existence of the equilibrium point and stability were discussed, but the uniqueness of the equilibrium point and its global stability were not given. Instead, in [183], the Cohen-Grossberg neural networks with unbounded discontinuous activations were proposed, where the global exponential stability and the existence and uniqueness of the equilibrium point were given. Delayed neural networks with discontinuous activations were first proposed in [184]. Similar models were also proposed in [72]. It can be concluded that [72, Th. 1] is a special case of [184, Th. 1]. The almost periodic dynamics of networks with discontinuous activations was first investigated in [186], where the integro-differential systems were discussed. It includes discrete delays and distributed delays as special cases.
Therefore, activation functions have evolved from bounded to unbounded cases, from the continuous to the discontinuous case, and from the strictly monotonic to the nonmonotonic case. All these show the depth of the research on the stability theory of recurrent neural networks.
C. Evolution of Uncertainties in Connection Weight Matrix
For the deterministic and accurate connection weight matrix, a lot of stability results have been published since the 1980s. However, in the electronic implementation of recurrent neural networks, the connection weight matrix can be disturbed or perturbed by the external environment. Therefore, the robustness of neural networks against such perturbation should be considered [5], [18].
At present, there are several forms of uncertainties considered in the literature.
Uncertainties with the matched condition: Assume that the connection matrix is \(A\). Then, uncertainty \(\Delta A\) is described by\begin{equation} \Delta A=MF(t)N ~\mbox {with}~ F^T(t)F(t)\le I\end{equation}
or\begin{align}\Delta A=MF_0(t)N ~&\mbox {with}~ F_0(t)=(I-F(t)J)^{-1}F(t) \notag\\ & \mbox {and} ~F^T(t)F(t)\le I\end{align}
where \(M\), \(N\), and \(J\) are all constant matrices, \(J^TJ\le I\), and \(I\) is an identity matrix with compatible dimension. This kind of uncertainty is very convenient in the stability analysis based on the LMI method. Robust stability for neural networks with matched uncertainty (21) has been widely studied in [121], [246], and [310]. For the physical meaning of linear-fractional representation of uncertainty (22), readers can refer to [14], [59], [61], [322], and [324] for more details.
Interval uncertainty: In this case, the connection matrix \(A\) satisfies [168], [198]\begin{equation} A\in A_I=[\underline A, \overline A]=\{[a_{ij}]\colon \underline a_{ij}\le a_{ij}\le \overline a_{ij}\}.\end{equation}
If we let \(A_0=(\overline A+\underline A)\hbox {/}2\) and \(\Delta A=(\overline A-\underline A)\hbox {/}2\), then uncertainty (23) can be expressed as follows [137], [144], [187]:\begin{equation} \hspace {-15pt}A_J=\big \{A=A_0+\Delta A=A_0 +M_AF_AN_A~|~F_A^TF_A\le I\big \}\qquad\end{equation}
where \(M_A\), \(N_A\), and \(F_A\) are well defined according to some arrangement of elements in \(\underline A\) and \(\overline A\). Obviously, interval uncertainty (23) has been changed into the form of uncertainty with matched condition (21).
Absolute value uncertainties or unmatched uncertainties, where\begin{equation} \Delta A=(\delta a_{ij})\in \{ |\delta a_{ij}|\le \overline a_{ij}\}.\end{equation}
This kind of uncertainty has been studied in [290], while LMI-based results have been established in [286] and [288]. Note that for nonlinear neural systems with uncertainties (23) or (25), most robust stability results have been proposed based on algebraic inequalities, \(M\)-matrix, matrix measure, and so on in the early days of the theory of recurrent neural networks. Since 2007, LMI-based robust stability results for nonlinear neural systems with uncertainties (23) or (25) have appeared. This tendency implies that many different classes of robust stability results for uncertain neural systems will be proposed.
Polytopic type uncertainties\begin{equation} A\in \Omega ,~ \Omega =\Bigg \{A(\xi )=\sum _{k=1}^p\xi _kA_k,\sum _{k=1}^p\xi _k=1,~ \xi _k\ge 0\Bigg \}\qquad\end{equation}
where \(A_k\) is a constant matrix with compatible dimension and \(\xi _k\) is a time-invariant uncertainty. Robust stability for systems with this type of uncertainty has been studied in [77], [99], and [97].
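As a concrete illustration of how the interval description (23) is rewritten in the matched form \(A_0+M_AF_AN_A\), the following sketch builds one standard choice of \(M_A\), \(N_A\), and \(F_A\) from the entrywise half-widths and verifies the factorization numerically; the specific \(2\times 2\) interval matrix is hypothetical.

```python
import numpy as np

# Interval uncertainty A in [A_lower, A_upper]  ->  matched form A = A0 + M_A F_A N_A,
# with F_A diagonal and F_A^T F_A <= I.  One standard construction uses one column of
# M_A and one row of N_A per entry (i, j), each scaled by sqrt(h_ij),
# where h_ij = (a_upper_ij - a_lower_ij) / 2.
A_lower = np.array([[0.5, -1.2],
                    [0.3,  0.8]])
A_upper = np.array([[0.9, -0.8],
                    [0.7,  1.2]])

A0 = (A_upper + A_lower) / 2             # nominal matrix
H  = (A_upper - A_lower) / 2             # entrywise half-widths h_ij >= 0
n  = A0.shape[0]

M_A = np.zeros((n, n * n))
N_A = np.zeros((n * n, n))
for i in range(n):
    for j in range(n):
        k = i * n + j
        M_A[i, k] = np.sqrt(H[i, j])
        N_A[k, j] = np.sqrt(H[i, j])

# Any admissible perturbation |delta_a_ij| <= h_ij corresponds to F_A = diag(f_ij), |f_ij| <= 1.
F_A = np.diag(np.random.uniform(-1.0, 1.0, size=n * n))
Delta_A = M_A @ F_A @ N_A

assert np.allclose(Delta_A, np.diag(F_A).reshape(n, n) * H)   # entry (i, j) equals f_ij * h_ij
print("A0 + Delta_A lies in the interval:",
      bool(np.all(A0 + Delta_A >= A_lower - 1e-12) and np.all(A0 + Delta_A <= A_upper + 1e-12)))
```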
Note that the above uncertainties represent the parameter uncertainties, which are the reflection of the bounded changes of system parameters. Different types of uncertainties are equivalent in the sense of bounded perturbation. Meanwhile, different robust stability results generally require different mathematical analysis methods due to the different uncertainty descriptions in neural systems.
D. Evolution of Time Delays
Due to the different transmission channels and media, time delays are unavoidable in real systems [56], [110], [176], [228]. Time delays can be described in several different ways, depending on the approximation capability and the description complexity. For example, the simplest way is to assume that the delays are the same in all the transmission channels. A further relaxation is to assume that the delay within each channel is constant but may differ from one channel to another.
Discrete delays reflect the centralized effects of delays on the system, while distributed delays have effects on the neural networks at some duration or period with respect to the discrete point of delays. As for different classes of time delays, one can refer to Table II.
For the time-varying delay case, the derivative of the time delay was usually required to be less than one in the early days (i.e., slowly time-varying delay). Now, with the application of some novel mathematical methods, e.g., the free weight matrix method or the Finsler formula, the derivative of the time-varying delay can be allowed to be greater than one for some of the time (i.e., fast time-varying delay; it cannot always be greater than one, since otherwise the delayed argument \(t-\tau (t)\) would be nonincreasing and the delay would grow without bound).
In the previous methods, the time-varying delay
It can be seen from the existing references that only the deterministic time-delay case was considered, and the stability criteria were derived based only on the information of the variation range of the time delay. Actually, the time delay in some neural networks often exists in a stochastic fashion [294], [295], [317]. It often occurs in real systems that some values of the delays are very large, but the probabilities of the delays taking such large values are very small. In this case, if only the variation range of the time delay is employed to derive the stability criteria, the results may be conservative. Therefore, the challenging issue is how to derive some criteria for uncertain stochastic delayed neural networks that can exploit the available probability distribution of the delay and obtain a larger allowable variation range of the delay.
Recently, a class of neural networks with leakage delays was studied in [80], [140], and [147]. The leakage delay can be explained as follows. In general, a nonlinear system can be stated as follows:\begin{equation} \dot x(t)=-Ax(t)+f(t,x(t),x(t-\tau ))\end{equation}
When a delay \(\sigma \) appears in the negative feedback (leakage) term, the system becomes\begin{equation} \dot x(t)=-Ax(t-\sigma )+f(t,x(t),x(t-\tau ))\end{equation}
E. Evolution and Main Types of Lyapunov Approaches
As most of the stability criteria of recurrent neural networks are derived via the Lyapunov theory, they all have a certain degree of conservatism. Reducing the conservatism has been the topic of much research. Within the Lyapunov stability theory, the reduction can be achieved mainly in two phases: 1) choosing a suitable Lyapunov functional and 2) estimating its derivative. The choice of the Lyapunov functional is crucial for deriving less conservative criteria. Various types of Lyapunov functionals and estimation methods for the derivative of Lyapunov functionals have been constructed to study the stability of recurrent neural networks. In this section, we will mainly discuss the evolution and the main types of Lyapunov approaches and Lyapunov functions used in the analysis of global stability. For the estimation methods for the derivative of the Lyapunov functional, a brief review can be found in [135] and [304].
In [105], for the Hopfield neural network (2) under symmetry assumption on the interconnections, the following continuously differentiable function is used\begin{align} V_H(u(t))\!&=\!-\frac {1}{2}\sum _{i=1}^n\!\sum _{j=1}^ny_iw_{ij}y_j\!-\!\!\sum _{i=1}^nU_iy_i \!+\!\sum _{i=1}^n{\gamma _i}\!\!\int _{0}^{y_i}\!\!g_i^{-1}(s)\mathrm {d}s\notag\\\text{}\end{align}
The derivative of (29) along the trajectories of (2) is \begin{equation} \frac {\mathrm {d}V_H(u(t))}{\mathrm {d}t}=-\sum _{i=1}^n\left (\frac {\mathrm {d}}{\mathrm {d}y_i}g_i^{-1}(y_i)\right )\left (\frac {\mathrm {d}y_i}{\mathrm {d}t}\right )^2.\end{equation}
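For completeness, here is a short sketch of how (30) follows from (29) under the symmetry assumption \(w_{ij}=w_{ji}\). Writing \(y_i=g_i(u_i)\), so that \(u_i=g_i^{-1}(y_i)\), one has
\begin{align*} \frac {\partial V_H}{\partial y_i}&=-\sum _{j=1}^nw_{ij}y_j-U_i+\gamma _ig_i^{-1}(y_i)=-\dot u_i(t)\\ \frac {\mathrm {d}V_H(u(t))}{\mathrm {d}t}&=\sum _{i=1}^n\frac {\partial V_H}{\partial y_i}\frac {\mathrm {d}y_i}{\mathrm {d}t}=-\sum _{i=1}^n\dot u_i(t)\dot y_i(t)=-\sum _{i=1}^n\left (\frac {\mathrm {d}}{\mathrm {d}y_i}g_i^{-1}(y_i)\right )\left (\frac {\mathrm {d}y_i}{\mathrm {d}t}\right )^2\end{align*}
where the second equality in the first line uses (2), and the last step uses \(\dot u_i(t)=\big (\mathrm {d}g_i^{-1}(y_i)/\mathrm {d}y_i\big )\dot y_i(t)\).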
For the original Cohen-Grossberg network model (1) in [55], the following continuously differentiable function is used\begin{align} V_{CG}(u(t))&=\;\frac {1}{2}\sum _{i=1}^n\sum _{j=1}^ng_i(u_i(t))w_{ij}g_j(u_j(t))\notag\\ &\quad \; -\sum _{i=1}^n\int _{0}^{u_i(t)}a_i(s)\left (\frac {\mathrm {d}}{\mathrm {d}s}g_i(s)\right )\mathrm {d}s\end{align}
The derivative of (31) along the trajectories of (1) is as follows\begin{align} \frac {\mathrm {d}V_{CG}(u(t))}{\mathrm {d}t}&=-\sum _{i=1}^nd_i(u_i(t))\left (\frac {\mathrm {d}}{\mathrm {d}u_i(t)}g_i(u_i(t))\right )\notag\\ &\quad \times \left (a_i(u_i(t))-\sum _{j=1}^nw_{ij}g_j(u_j(t))\right )^2\!.\qquad\end{align}
From the stability proof of the above two classes of neural networks, we can find the following facts: 1) the above proof procedure shows the reason why the activation function is usually required to be a monotonically increasing function and 2) both functions (29) and (31) are not required to be positive definite, so they only guarantee convergence of the trajectories to the set of equilibrium points rather than global stability of a unique equilibrium point.
In the pioneering work of Cohen and Grossberg and Hopfield, the global limit property of (1) and (2) was established, which means that given any initial conditions, the solution of (1) [or (2)] will converge to some equilibrium points of the system. However, the global limit property does not give a description or even an estimate of the region of attraction for each equilibrium. In other words, given a set of initial conditions, one knows that the solution will converge to some equilibrium points, but does not know exactly to which one it will converge. In terms of associative memories, one does not know what initial conditions are needed to retrieve a particular pattern stored in the networks. On the other hand, in applications of neural networks to parallel computation, signal processing, and other problems involving the solutions of optimization problems, it is required that there is a well-defined computable solution for all possible initial states. That is, it is required that the networks have a unique equilibrium point, which is globally attractive. Earlier applications of neural networks to optimization problems have suffered from the existence of a complicated set of equilibrium points [254]. Thus, the global attractivity of a unique equilibrium point for the system is of great importance for both theoretical and practical purposes, and has been the major concern of [65], [69], [103], [201], [274], and [275].
In [69], using continuous energy functions such as those used for (1) and (2), some sufficient conditions were proved guaranteeing that a class of neural circuits were globally convergent toward a unique stable equilibrium at the expense that the neuron connection matrix must be symmetric and negative semidefinite. In practice, the condition of symmetry and negative semidefiniteness of the interconnection matrix is rather restrictive. The research on the global attractivity/stability of neural networks is mainly concentrated on the construction of Lyapunov functions on the basis of the Lyapunov stability theory. In [23], the following Lyapunov function was first constructed for the purely delayed system (5), i.e., (5) with \(w_{ij}=0\): \begin{equation} V(u(t))=\sum \limits _{i=1}^nu_i^2(t)+ \sum \limits _{i=1}^n\int _{t-\tau }^{t}u_i^2(s)\mathrm {d}s\end{equation}
In [275], a Lyapunov function is constructed for (4) with discrete delays \begin{align} V(u(t))=\sum _{i=1}^n\left (\frac {1}{\bar d_i}|u_i(t)|\!+\!\sum _{k=0}^N\sum _{j=1}^n|w_{ij}^k|\delta _j\int _{t-\tau _k}^t\!\!\!|u_j(s)|\mathrm {d}s\right )\notag\\\text{}\end{align}
and, for the case of continuously distributed delays, the following Lyapunov function has been used\begin{align} V(u(t))&=\sum _{i=1}^n\Bigg (q_i|u_i(t)|+\bar d_iq_i\sum _{j=1}^n|w_{ij}|\delta _j\notag\\ &\qquad \qquad \times \int _0^{+\infty }\!K_{ij}(\theta )\!\int _{t-s}^t\!|u_j(s)|\mathrm {d}s\mathrm {d}\theta \Bigg )\qquad\end{align}
Based on the above Lyapunov functions, some global stability results have been derived in the form of different algebraic inequalities, in which the absolute value operations are conducted on the interconnection weight coefficients. To derive LMI-based stability results, Lyapunov functions in quadratic form are generally adopted. In [185], the following Lyapunov function is constructed for (4) with a constant discrete delay \(\tau _1\): \begin{align} V(u(t))&=\;u^T(t)Pu(t)+\sum _{i=1}^nq_i\int _{0}^{u_i(t)}\frac {g_i(s)}{d_i(s)}\mathrm {d}s\notag\\ &\quad + \int _{t-\tau _1}^{t}g^T(u(s))Qg(u(s))\mathrm {d}s\end{align}
Another Lyapunov function of this type, which also exploits the amplification functions \(d_i(\cdot )\), is\begin{align} V(u(t))=\sum _{i=1}^nq_i\int _{0}^{u_i(t)}\frac {s}{d_i(s)}\mathrm {d}s+ \sum _{i=1}^np_i\int _{0}^{u_i(t)}\frac {g_i(s)}{d_i(s)}\mathrm {d}s\notag\\\text{}\end{align}
For delayed Hopfield-type networks of the form (5), commonly used quadratic Lyapunov functions include\begin{align} V(u(t))&=\;u^T(t)Pu(t)+2\sum _{j=1}^nq_j\int _{0}^{u_j(t)}g_j(s)\mathrm {d}s\notag\\ &\quad + \beta \sum _{j=1}^np_j\int _{t-\tau }^{t}g_j^2(u_j(s))\mathrm {d}s\end{align}
and\begin{align} V(u(t))&=\;u^T(t)Pu(t)+2\sum _{j=1}^nq_j\int _{0}^{u_j(t)}g_j(s)\mathrm {d}s\notag\\ &\quad + \int _{t-\tau }^{t}\big [u^T(s)Ru(s)\!+\!g^T(u(s))Qg(u(s))\big ]\mathrm {d}s~\qquad\end{align}
For (5) with a time-varying delay \(\tau (t)\) satisfying \(\tau _1\le \tau (t)\le \tau _2\), the following Lyapunov-Krasovskii functional has been constructed \begin{align} V(u(t))&=\;u^T(t)Pu(t)+2\sum _{j=1}^nq_j\int _{0}^{u_j(t)}g_j(s)\mathrm {d}s\notag\\ &\quad + \int _{t-\tau (t)}^{t}\big [u^T(s)Ru(s)+g^T(u(s))Qg(u(s))\big ]\mathrm {d}s\notag\\ &\quad +\sum _{i=1}^2\int _{t-\tau _i}^tu^T(s)\bar R_i u(s)\mathrm {d}s\notag\\ &\quad +\int _{-\tau _2}^0\int _{t+\theta }^t\dot u^T(s)Z_1 \dot u(s)\mathrm {d}s\mathrm {d}\theta \notag\\ &\quad +\int _{-\tau _2}^{-\tau _1}\int _{t+\theta }^t\dot u^T(s)Z_2\dot u(s)\mathrm {d}s\mathrm {d}\theta\end{align}
With the delay interval divided into \(m\) segments \([\tau _{j-1},\tau _j]\), \(j=1,\dotsc ,m\), the delay-partitioned Lyapunov-Krasovskii functional takes the form\begin{align} &\hspace {-17pt}V(u(t))=u^T(t)Pu(t)+2\sum _{j=1}^nq_j\int _{0}^{u_j(t)}g_j(s)\mathrm {d}s\notag\\ &\qquad + \sum _{j=1}^m\int _{-\tau _j}^{-\tau _{j-1}}\left [\begin{array}{c} u(t+s)\\ g(u(t+s))\end{array}\right ]^T\Upsilon _j\left [\begin{array}{c} u(t+s)\\ g(u(t+s))\end{array}\right ]\mathrm {d}s\notag\\ &\qquad +\sum _{j=1}^m(\tau _j-\tau _{j-1})\int _{-\tau _j}^{-\tau _{j-1}}\int _{t+s}^t\dot u^T(\theta )R_j\dot u(\theta )\mathrm {d}\theta \,\mathrm {d}s\notag\\\text{}\end{align}
In general, the equilibrium points reached by (1) and (2) are only locally stable if the continuous functions (29) and (31) are selected, respectively. To establish global stability, the Lyapunov function is required to be positive definite according to the Lyapunov stability theory. Therefore, positive definite continuous functions or energy functions are adopted in the recent literature; see (33)–(41) and their variations.
For recurrent neural networks with different kinds of actions, such as stochastic perturbations, neutral-type terms, distributed delays, reaction-diffusion terms, and so on, the construction of the Lyapunov-Krasovskii function is similar to the above ones, except that some special information is incorporated into the functions. It is the different ways of incorporating such information that make the construction of Lyapunov-Krasovskii functions more flexible and diverse than the classical functions (29) and (31).
F. Comparisons of Delay-Independent Stability Criteria and Delay-Dependent Stability Criteria
Generally speaking, there are two concepts concerning the stability of systems with time delays. The first one is called the delay-independent stability criteria that do not include any information about the size of the time delays and the change rate of time-varying delays [20], [185], [191], [307], [311]. For the systems with unknown delays, delay-independent stability criteria will play an important role in solving the stability problems. The second one is called the delay-dependent stability criteria, in which the size of the time delays and/or the change rate of time-varying delays are involved in the stability criteria [94], [309], [312].
Note that the delay-dependent stability conditions in the literature are mainly referred to as systems with discrete delays or finite distributed delays, in which the specific size of time delays and the change rate of time-varying delays can be measured or estimated. For the cases, such as infinite distributed delays and stochastic delays, in which there are no specific descriptions on the size of time delays and the change rate of time-varying delays, the concept of delay-independent/dependent stability criteria will still hold. If the relevant delay information (such as Kernel function information or the expectation value information of stochastic delays) is involved in the stability criteria, such results are also called delay dependent. Otherwise, they are delay independent.
Since the information on the size of delay and the change rate of time-varying delay is used, i.e., holographic delays, delay-dependent criteria may be less conservative than delay-independent ones, especially when the size of time delay is very small. When the size of time delay is large or unknown, delay-dependent criteria will be unusable, while delay-independent stability criteria may be useful.
G. Stability Results and Evaluations
At present, there are many different analysis methods to show the stability property of recurrent neural networks, such as Lyapunov stability theory [37], [182], [189], [203], [235], Razumikhin-type theorems [196], [264], nonsmooth analysis [224], [225], [293], ordinary differential equation theory [38], [149], [174], [326], LaSalle invariant set theory [55], [239], nonlinear measure method [226], [260], gradient-like system theory [66], [74], [258], comparison principle of delay differential systems [50], [153], and so on.
The expressions of the stability criteria are different due to different analysis and proof methods, such as
Stability criteria in the form of
Algebraic inequality results and LMI results have many different expressions and may involve many free parameters to be tuned, which often makes the stability results rather complex. For example, exceedingly complex LMI-based stability results are only useful for numerical purposes, and their theoretical meaning is lost. How to find simple and effective stability criteria is still a challenging research direction.
Furthermore, we can find that with the increase of additive terms in recurrent neural networks (e.g., discrete time delay terms, distributed delay terms, reaction-diffusion terms, and stochastic delay terms), the stability criteria will become more and more conservative. This phenomenon can be attributed to the additive complexity of the system structure. The conservativeness of the criteria will be further increased with the multiplicative complexity of the system structure, and a few related results have been published.
Brief Review of the Analysis Methods For Early Stability Results
Before presenting the main content of this paper, we will first review some algebraic methods for the stability analysis of neural networks, such as the methods based on the concept of LDS matrices, the \(M\)-matrix, the matrix measure, and algebraic inequalities.
In [11], the following condition was derived to ensure the global exponential stability of (2) in the case that \begin{equation} \eta =\lambda _{\min }(\Gamma )-\sigma _{\max }(W)>0\end{equation}
\begin{equation} \mu (LW-\Delta ^{-1}\Gamma L)<0\end{equation}
\begin{equation} L[(\Gamma -\alpha I)\Delta ^{-1}-W]+[(\Gamma -\alpha I)\Delta ^{-1}-W]^TL>0\qquad\end{equation}
\begin{equation} \delta _i^{-1}\gamma _iL_i-L_iw_{ii}>\sum _{j\ne i}^nL_j|w_{ij}|\end{equation}
In [35], the following global exponential stability/convergence criterion was also proposed for (2) [where \(\alpha >0\) denotes the exponential convergence rate]: \begin{align} \gamma _j\xi _j-\delta _j\left (\xi _jw_{jj}+\sum \limits _{i=1,i\ne j}^n\xi _i|w_{ij}|\right )&>\alpha \xi _j\notag\\ \xi _i(\gamma _i-\delta _iw_{ii})-\sum _{j=1,j\ne i}^n\xi _j\delta _j|w_{ij}|&>\alpha \xi _j\notag\\ \xi _i(\gamma _i\!-\!\delta _iw_{ii})\!-\!\frac {1}{2}\sum _{j=1,j\ne i}^n(\xi _i\delta _i|w_{ij}|+\xi _j\delta _j|w_{ji}|)&>\alpha \xi _i\qquad\end{align}
In [23], the purely delayed Hopfield networks (3) were studied, and the following criterion was established\begin{equation} \delta ||W||_2<1\end{equation}
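To indicate how a criterion of this type can arise from the Lyapunov function (33), here is a sketch (under the additional assumptions that the equilibrium has been shifted to the origin and that \(|g_j(\zeta )|\le \delta _j|\zeta |\) with \(\delta =\max _j\delta _j\); this is an illustrative derivation, not necessarily the exact argument of [23]). Along the trajectories of (3),
\begin{align*} \dot V(u(t))&=2u^T(t)\big [-u(t)+Wg(u(t-\tau ))\big ]+\|u(t)\|^2-\|u(t-\tau )\|^2\\ &\le -\|u(t)\|^2+\delta \|W\|_2\big (\|u(t)\|^2+\|u(t-\tau )\|^2\big )-\|u(t-\tau )\|^2\\ &=\big (\delta \|W\|_2-1\big )\big (\|u(t)\|^2+\|u(t-\tau )\|^2\big )\end{align*}
where the bound \(2u^T(t)Wg(u(t-\tau ))\le 2\delta \|W\|_2\|u(t)\|\,\|u(t-\tau )\|\le \delta \|W\|_2(\|u(t)\|^2+\|u(t-\tau )\|^2)\) has been used; hence \(\dot V(u(t))<0\) for nonzero states whenever \(\delta \|W\|_2<1\).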
A related criterion in terms of the spectral radius is\begin{equation} \rho (\Gamma ^{-1}|W|\Delta )<1\end{equation}
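Algebraic criteria of this kind are straightforward to check numerically. The following is a minimal sketch (with hypothetical \(\Gamma \), \(\Delta \), and \(W\), only for illustration) that evaluates the three conditions \(\lambda _{\min }(\Gamma )-\sigma _{\max }(W)>0\), \(\delta \|W\|_2<1\), and \(\rho (\Gamma ^{-1}|W|\Delta )<1\) discussed above.

```python
import numpy as np

# Hypothetical network data: decay rates gamma_i, activation slopes delta_i, weights W.
Gamma = np.diag([2.0, 1.8, 2.2])
Delta = np.diag([1.0, 0.9, 0.8])
W = np.array([[ 0.2, -0.5,  0.3],
              [ 0.4,  0.1, -0.6],
              [-0.3,  0.5,  0.2]])

# lambda_min(Gamma) - sigma_max(W) > 0
cond_a = np.min(np.diag(Gamma)) - np.linalg.norm(W, 2) > 0

# delta * ||W||_2 < 1, with delta taken as the largest activation slope
cond_b = np.max(np.diag(Delta)) * np.linalg.norm(W, 2) < 1

# spectral radius of Gamma^{-1} |W| Delta less than 1
rho = np.max(np.abs(np.linalg.eigvals(np.linalg.inv(Gamma) @ np.abs(W) @ Delta)))
cond_c = rho < 1

print("lambda_min - sigma_max > 0: ", cond_a)
print("delta * ||W||_2 < 1:        ", cond_b)
print("rho(Gamma^-1 |W| Delta) < 1:", cond_c)
```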
In [70], for (2) with a symmetric connection matrix \(W\), the following condition was established\begin{equation} \max _{1\le i\le n}\lambda _i(W)\le 0\end{equation}
\begin{equation} -W\in P_0 ~\mbox {matrix}\end{equation}
\begin{equation} \max _{1\le i\le n}\mbox {Re}\{\lambda _i(W)\}\le 0\end{equation}
In [158], it is shown that quasi-diagonally row-sum and column-sum dominance of \(-W\) implies the existence of a positive diagonal matrix \(P\) such that\begin{equation} PW+W^TP<0\end{equation}
\begin{equation} PW+W^TP\le 0\end{equation}
\begin{equation} D_2(W-D_1)+(W-D_1)^TD_2<0.\end{equation}
The above results mainly focused on (2) on the basis of
Now, we summarize the relationship among the LDS concept, the LMI, and the \(M\)-matrix condition. The LDS condition requires the existence of a positive diagonal matrix \(P\) such that\begin{equation} P(\Delta ^{-1}\Gamma -W)+(\Delta ^{-1}\Gamma -W)^TP>0\end{equation}
while the \(M\)-matrix condition on \(\Delta ^{-1}\Gamma -|W|\) is equivalent to the existence of a positive diagonal matrix \(P\) such that\begin{equation} P(\Delta ^{-1}\Gamma -|W|)+(\Delta ^{-1}\Gamma -|W|)^TP>0\end{equation}
It should be noted that in the early days of stability research of Hopfield neural networks, there is a direct approach to prove the existence of equilibrium and its exponential stability simultaneously, in which an energy function or Lyapunov function is not required [30], [74]. To the best of the authors' knowledge, it is [30] that first adopted such kind of unified method to prove the stability of Hopfield neural networks. Now, we give a short review for this direct method.
Forti and Tesi [74] proposed the so-called finite length of the trajectory by proving the following result. If the activation function is analytic, bounded, and strictly monotonically increasing, then any trajectory of (2) has finite length on \([0, +\infty )\).
On the other hand, in [30], the following lemma was given, which was used to derive the corresponding stability criterion: for some norm
Development of Proof Methods and Proof Skills in LMI-Based Stability Results
In this section, we will first state the superiority of the LMI method in the analysis and synthesis of dynamical systems. Then, we will show the main technical skills used in the derivation of LMI-based stability result.
Before we begin this section, we will present a simple introduction to the early analysis methods (e.g., the LDS and \(M\)-matrix methods).
A. Superiorities and Shortcomings of the LMI-Based Method
In the early days of the stability theory of neural networks, almost all the stability studies stem from the viewpoint of building a direct relationship among the physical parameters of neural networks. Therefore, stability criteria based on the matrix measure, the matrix norm, and the \(M\)-matrix were the first to be developed, as they relate the stability conditions directly to the network parameters.
The physical parameters of neural networks have some nonlinear redundancies, which can be expressed through constrained relationships with free variables. Stability criteria based on algebraic inequalities, e.g., the Young inequality, the Hölder inequality, the Poincaré inequality, and the Hardy inequality [92], have received a lot of attention in recent years and have improved the stability criteria significantly.
Although stability criteria based on the algebraic inequality method can be less conservative in theory, they are generally difficult to check due to the adjustable parameters involved, while one has no prior information on how to tune these variables. Since the LMI is regarded as a powerful tool to deal with matrix operations, LMI-based stability criteria have received attention from researchers. A good survey of LMI techniques in the stability analysis of delayed systems was presented in [283], and LMI methods in control applications were reviewed in [48] and [60]. LMI-based stability results are expressed in matrix form, relating the physical parameters of neural networks in a compact structure with elegant expressions.
The popularity of the LMI method is mainly due to the following reasons.
The LMI technique can be applied to a convex optimization problem that can be handled efficiently by resorting to the existing numerical algorithms for solving LMIs [12]. Meanwhile, LMI methods can easily solve the corresponding synthesis problems in control system design once the LMI-based stability (or other performance) conditions have been established, especially when state feedback is employed [283].
For neural networks without delay, the LDS method bridges the \(M\)-matrix method and the LMI method, and it is also a special case of the LMI form. For delayed neural networks, the core condition is either the LDS or the \(M\)-matrix condition. However, in the delayed case, both the LDS condition and the \(M\)-matrix condition lack suitable freedoms to be tuned and lead to much conservativeness of the stability criteria. In contrast, the LMI method can easily incorporate free variables into stability criteria and decrease the conservativeness. Correspondingly, many different kinds of stability results based on matrix inequalities have been proposed. In the sense of total performance evaluation of the desired results, LMI-based results are the most effective at present.
On the one hand, the LMI-based method is most suitable for models or systems described by state-space equations. On the other hand, many matrix theory-related methods can be incorporated into the LMI-based methods. Therefore, like algebraic inequality methods (which mainly deal with the scalar space or dot measure, and almost all scalar inequalities can be used in the algebraic inequality methods), many matrix inequalities can be used in the LMI-based method, e.g., the Finsler formula [148], the Jensen inequality [293], [309], [312], the Park inequality [218], and the Moon inequality [211]. Especially, the LMI-based method directly deals with the 2-D vector space, which extends the application space of algebraic inequality methods. Therefore, more inhibitory information on the system can be contained in LMI-based results than in the algebraic inequality methods.
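As an illustration of the kind of matrix inequality mentioned above, the Jensen inequality that is typically used to bound the double-integral terms in functionals such as (40) and (41) reads, for any matrix \(Z=Z^T>0\) and constant \(\tau >0\),
\begin{equation*} -\tau \int _{t-\tau }^{t}\dot u^T(s)Z\dot u(s)\mathrm {d}s\le -\left (\int _{t-\tau }^{t}\dot u(s)\mathrm {d}s\right )^TZ\left (\int _{t-\tau }^{t}\dot u(s)\mathrm {d}s\right )=-\big [u(t)-u(t-\tau )\big ]^TZ\big [u(t)-u(t-\tau )\big ]\end{equation*}
which replaces an integral term that cannot be handled directly within an LMI by a quadratic term in \(u(t)\) and \(u(t-\tau )\).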
Every method has its own shortcomings, and the LMI-based method is no exception. With the application of many different mathematical tools and techniques to the stability analysis of neural networks, the shortcomings of the LMI-based method have become apparent. Now, we list several main disadvantages as follows.
The exceedingly complex expression of the stability conditions is the main inefficiency of this method. The more complex the stability condition is, the less physical and theoretical meaning the stability condition has. In this case, the LMI-based method loses its original superiority over the classical algebraic inequality methods and becomes useful only for numerical purposes.
It will become more difficult to compare the different conditions among the LMI-based stability results. Therefore, the efficiency of the proposed conditions can only be compared by means of specific examples, and not in an analytical way.
The increase of slack variables can significantly increase the complexity of the computation, and it is necessary to make some efforts to reduce the redundancy of some of the slack variables. Therefore, how to develop new methods to further reduce the conservatism in the existing stability results while keeping a reasonably low computational complexity is an important issue to investigate in the future.
The exceedingly complex expression of the stability conditions also makes it difficult to study synthesis problems in neural control systems due to the cross terms of many slack variables. How to find simple and more effective LMI-based stability criteria is still a challenging topic.
B. Technical Skills Used in LMI-Based Stability Results for Delayed Neural Networks
Note that LMI-based approaches for the stability analysis of recurrent neural networks with time delay are based on the Lyapunov-Krasovskii function method. By incorporating different information of the concerned system into the construction of Lyapunov-Krasovskii function and using some technical skills in the proof procedure, several novel LMI-based stability results have been proposed to reduce the conservativeness of the stability results (e.g., for the case of fast time-varying delay, achieving the maximum upper bound of time delay given the network parameters, etc.). Now, we shall summarize some technical skills used in the stability analysis of delayed recurrent neural networks.
Free Weight Matrix Method: This method was first proposed in [99] and [277], and it was used to improve the delay-dependent stability of systems with a time-varying delay. One feature of the method is that it employs neither a model transformation nor bounding techniques for cross terms. Especially, it is a very powerful method to deal with the case of fast time-varying delay, i.e., the case in which the derivative of the delay may exceed one for some of the time.
The essence of the free weight matrix method is to add some free variables/matrices to an identity, which will improve the effectiveness of the stability results by involving some adjustable variables. For example, the following identity holds according to Newton-Leibniz formula\begin{equation} u(t)-u(t-\tau (t))-\int _{t-\tau (t)}^t\dot u(s)\mathrm {d}s=0\end{equation}
and, along the trajectories of the delayed network, the system equation itself gives the zero identity\begin{equation} \dot u(t)+\Gamma u(t)-Wg(u(t))-W_1g(u(t-\tau (t)))=0\end{equation}
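In a typical application (a generic sketch, with \(N_1\) and \(N_2\) denoting the free matrices to be determined; this is not a formulation quoted verbatim from [99] or [277]), these identities are multiplied by free matrices and added to the derivative of the Lyapunov-Krasovskii functional, e.g.,
\begin{equation*} 2\big [u^T(t)N_1+u^T(t-\tau (t))N_2\big ]\left [u(t)-u(t-\tau (t))-\int _{t-\tau (t)}^{t}\dot u(s)\mathrm {d}s\right ]=0\end{equation*}
so that \(N_1\) and \(N_2\) appear as additional decision variables in the resulting LMI without changing the value of \(\dot V\).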
In short, the contribution of the free weight matrix method is that, by involving more freedoms (or, equivalently, by using more relations of the system), the conservativeness of stability criteria is decreased significantly in the sense of total performance evaluation. Certainly, it is also a decrease of conservativeness in the sense that the restriction on the change rate of the time-varying delay is relaxed from being less than one to being allowed to exceed one for some of the time.
Matrix Decomposition Method: This method is mainly used to deal with the stability problem for recurrent neural networks without delay. For example, for the case of Hopfield and Cohen-Grossberg neural networks without delay, these kinds of matrix decomposition methods have been used in [107], [108], and [239]. In [107], the connection matrix
Delay-Matrix Decomposition Method: Since LMI is a very powerful method for analyzing the stability of many classes of neural networks with different delays, it is natural to build some LMI-based stability criteria for neural networks with different multiple delays
It is worth pointing out that the delay-matrix decomposition method proposed in [308]–[312] mainly focused on systems with multiple discrete delays
Delay Decomposition/Partition Approach: Time delay is one of the most important parameters in delayed neural networks. Since interconnection weight matrices have been sufficiently explored in the development of the neural network stability theory, especially with the occurrence of the free weight matrix method, it seems that the stability criteria have reached the point where little space is left in the connection weights that can be used to further decrease the conservativeness of stability results. In this case, delay-dependent stability criteria may have more space for improvement than delay-independent stability criteria, because the information of the time delay is not sufficiently explored yet. In the previous stability analysis methods, the time delay interval is treated as a whole, whereas the delay decomposition approach divides it into several subintervals and exploits the information of each subinterval.
The essence of the delay decomposition approach is to enlarge/augment the state space and introduce many adjustable variables, which yields a larger augmented state space (i.e., more system dimensions) than the original system. A challenging topic in the delay decomposition approach is how to determine the number of subintervals and the subinterval sizes so as to achieve the optimal upper bound on the time delay. At present, by combining the delay decomposition approach with augmented Lyapunov-Krasovskii functions [96], [134], [135], [297], some new stability results for delayed neural networks have been published, and the conservativeness of the stability results is decreased at the expense of more unknown parameters or matrices.
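As a minimal illustration of the idea (the functional and the matrices \(Q_k\) below are chosen only for exposition), dividing the constant delay interval \([0,\tau ]\) into \(m\) equal subintervals leads to Lyapunov-Krasovskii terms of the form\begin{equation*} V_{\mathrm {dd}}(t)=\sum _{k=1}^{m}\int _{t-\frac {k}{m}\tau }^{t-\frac {k-1}{m}\tau }u^T(s)Q_ku(s)\mathrm {d}s,\qquad Q_k>0\end{equation*}
where each subinterval is weighted by its own positive definite matrix \(Q_k\); a finer partition introduces more decision variables and typically yields a larger admissible delay bound at the cost of a larger LMI.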
Descriptor System Method: This is a universal transformation method that transforms a normal differential system into a descriptor-like system and then uses the analysis tools of descriptor systems to study the normal differential system. Therefore, the dimensions of the original differential system are enlarged from
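As a brief illustration (a sketch of the standard transformation, with the auxiliary variable \(y\) introduced only for exposition), the delayed network can be rewritten by treating \(\dot u\) as an extra state:\begin{align*} \dot u(t)&=y(t)\\ 0&=-y(t)-\Gamma u(t)+Wg(u(t))+W_1g(u(t-\tau (t)))\end{align*}
so that the augmented vector \(\mathrm {col}(u(t),y(t))\) obeys a descriptor system \(E\dot x(t)=\cdots\) with the singular matrix \(E=\mathrm {diag}(I,0)\), and Lyapunov-Krasovskii functions can then be constructed for the augmented system.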
Splitting Interval Matrix Method: This method is devoted to the robust stability analysis of neural networks with interval uncertainty, i.e., the uncertain connection matrix
Stability Problems for Two Classes of Cohen-Grossberg Neural Networks
For models similar to Cohen-Grossberg neural networks (4), we will discuss the stability problems based on the following different assumptions.
Assumption 5.1[189], [290], [309], [311]:
The amplification function \(d_i(\cdot)\) satisfies \begin{equation} 0<\underline d_i\le d_i (\zeta )\le \overline d_i\notag\end{equation}
Assumption 5.2 [6], [20], [28], [38]:
The function \(a_i(\cdot)\) satisfies \begin{equation} \frac {a_i(\zeta )-a_i(\xi )}{\zeta -\xi }\ge \gamma _i\notag\end{equation}
Assumption 5.3 [307], [315], [319], [326]:
The activation function \(g_i(\cdot)\) satisfies the Lipschitz condition \begin{equation} |g_i(\zeta )-g_i(\xi )|\le \delta _i |\zeta -\xi |\notag\end{equation}
Assumption 5.4 [6], [181], [189], [269], [311]:
The activation function \(g_i(\cdot)\) satisfies, for any \(\zeta\ne\xi\), \begin{equation} 0\le \frac {g_i(\zeta )-g_i(\xi )}{\zeta -\xi }\le \delta _i\notag\end{equation}
Assumption 5.5 [169], [182], [185], [191], [269]:
The amplification function \(d_i(\cdot)\) is nonnegative and satisfies, for any \(\epsilon>0\), \begin{equation} \int _0^{\epsilon }\frac {1}{d_i(s)}\mathrm {d}s=+\infty \notag\end{equation}
Note that the differences between Assumptions 5.3 and 5.4 can be found in [270]. The difference between Assumptions 5.1 and 5.5 lies in the fact that the amplification function in Assumption 5.1 is strictly positive (bounded away from zero), while the function in Assumption 5.5 is only required to be nonnegative; for instance, \(d_i(s)=s\) satisfies Assumption 5.5 but not Assumption 5.1. Moreover, if the amplification function
Based on the above assumptions, we now show the relationship between the original Cohen-Grossberg neural networks (1) and the delayed Cohen-Grossberg neural networks (4).
The differences between (1) and (4) are as follows.
The amplification functions are different. Assumption 5.5 is required in (1), while Assumption 5.1 is required in (4).
Due to the different assumptions on amplification functions in (1) and (4), the Hopfield model (2) is only a special case of (4) with constant amplification functions, while (1) does not include Hopfield model (2).
The state curves of Cohen-Grossberg [55] neural networks with Assumption 5.5 are all nonnegative under positive initial conditions, while the state curves of Cohen-Grossberg neural networks with Assumption 5.1 may be positive, negative, or their mixture under any forms of initial conditions [290].
The requirements for the function \(a_i(u_i(t))\) in (1) and (4) are different: it is monotonically increasing and required to be radially unbounded in (4), while in (1) it may vary according to the different choice of positivity conditions.
The connection coefficients in (1) are all positive, while the connection coefficients in (4) can be of any sign.
Model (1) often represents biological systems, reflecting the survival and extinction of species. In contrast, (4) stems from engineering applications and, in a similar manner to the Hopfield neural network model, can be used in fields such as optimization, decision making, and learning [91], [208], [252], [253], as well as signal processing [327].
The similarities between (1) and (4) are as follows: 1) the model structure in mathematical description is the same and 2) the symmetry requirements of the interconnection matrices are the same in the early days of neural network stability theory. However, the symmetry of interconnection matrices is not required in this research.
Due to the huge amount of related literature, it is not easy to list all the references. To outline clearly the research progress of the stability theory of neural networks, we mainly discuss two classes of neural network models, i.e., the original Cohen-Grossberg neural network model (1) and the Cohen-Grossberg-type neural network model (4). Based on these two primitive models, we will pinpoint the main achievements obtained in a few relevant papers, whereas several other results are presented as corollaries or minor improvements.
The next subsections are organized as follows. Section V-A will focus on the stability of the original Cohen-Grossberg neural network model (1), and some improvements surrounding this model will be discussed appropriately. Sections V-B–V-D will concentrate on the Cohen-Grossberg type neural network model (4) and review the progress in different aspects.
A. Stability of Cohen-Grossberg Neural Networks With Nonnegative Equilibrium Points
In this section, we will focus on five papers to describe the progress on the stability of the original Cohen-Grossberg neural networks (1). Some related references are used to complement the progress of stability analysis at different levels.
The main contribution of [55] is to reveal the essential role of symmetry in the dynamics of complex systems and to establish the stability criterion for (1). Since then, many different stability results have been proposed for (1) under Assumption 5.5 and its variants.
For Cohen-Grossberg neural networks (1), Ruan [239] proposed the following sufficient global stability condition based on LaSalle's invariance principle: if the connection matrix \begin{equation} W=DS\end{equation}
The following Lotka-Volterra model of competing species:\begin{equation} \dot u_i(t)=G_iu_i\left (1-\sum _{k=1}^nH_{ik}u_k(t)\right )\end{equation}
With respect to time delays and symmetric connection weights, [169], [183], [185], [191], [311], and [312] improved the conditions in [55], and the stability of nonnegative/positive equilibrium points for the corresponding Cohen-Grossberg neural networks with delays has been studied.
For the reaction-diffusion Cohen-Grossberg neural networks described by \begin{align} \frac { \partial {u}_i (t)}{\partial t} &= \sum _{k=1}^m\frac {\partial }{\partial x_k}\left (D_{ik}\frac {\partial u_i(t,x)}{\partial x_k}\right )- d_i (u_i (t,x)) \notag\\ &\quad \times \bigg [ a_i (u_i (t,x)) - \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t,x)) \bigg ]\end{align}
\begin{equation} -PW-(PW)^T\ge 0~(\mbox {or}>0)\end{equation}
Note that, despite the symmetry restriction on the matrix
In the case that the activation function satisfies a quasi-Lipschitz condition \begin{align} \dot {u}_i (t) &= - d_i (u_i (t))\bigg [ a_i (u_i (t)) - \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))\notag\\ &\qquad ~-\sum \limits _{j = 1}^n {w_{ij}^1 } g_j (u_j (t-\tau _{ij}(t)))\bigg ],~~i=1,\dotsc ,n.\end{align}
\begin{equation} \Gamma -(|W|+|W_1|)\Delta\end{equation}
For (63), [311] required \begin{align} 2L_i\gamma _i&-\sum _{j=1}^n(L_iw_{ij}\delta _j+L_jw_{ji}\delta _i)\notag\\ & -\sum _{j=1}^n\left (L_iw_{ij}^1\delta _j+L_jw_{ji}^1\delta _i\right )>0\end{align}
\begin{equation} 2\Gamma -(|W|+|W_1|)\Delta -\Delta (|W|+|W_1|)^T\end{equation}
System (63) with \begin{equation} P(\Gamma \Delta ^{-1} -W-W_1)+(\Gamma \Delta ^{-1} -W-W_1)^TP>0\end{equation}
\begin{equation} \left [ {{\begin{array}{cc} 2P\Gamma \Delta ^{-1}\!-PW-(PW)^T\!-Q &~~ -PW_1 \\ -(PW_1)^T &~~ Q \end{array} }} \right ] >0\end{equation}
Using the Schur complement lemma [12], (68) is equivalent to the following form\begin{align} 2P\Gamma \Delta ^{-1}-PW-(PW)^T-Q-PW_1Q^{-1}(PW_1)^T>0.\end{align}
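As a quick numerical illustration of this equivalence (a sketch only; the matrices below are arbitrary two-dimensional examples and are not taken from any cited reference), one can verify that the block matrix in (68) and the Schur complement in (69) are positive definite, or not, simultaneously:
\begin{verbatim}
import numpy as np

# Illustrative 2-neuron data (assumed for this sketch only).
n = 2
Gamma = np.diag([2.0, 3.0])            # positive diagonal rate matrix
Delta = np.diag([0.5, 0.8])            # Lipschitz constants of the activations
W  = np.array([[ 0.3, -0.2], [ 0.1,  0.4]])
W1 = np.array([[ 0.2,  0.1], [-0.1,  0.3]])
P  = np.diag([1.0, 1.5])               # positive definite P
Q  = np.eye(n)                         # positive definite Q

A = 2 * P @ Gamma @ np.linalg.inv(Delta) - P @ W - (P @ W).T - Q
B = -P @ W1
block = np.block([[A, B], [B.T, Q]])                  # block matrix of (68)
schur = A - (P @ W1) @ np.linalg.inv(Q) @ (P @ W1).T  # left-hand side of (69)

def is_pd(M):
    # eigenvalues of the symmetric part, to be robust to rounding
    return bool(np.all(np.linalg.eigvalsh((M + M.T) / 2) > 0))

# With Q > 0, the two tests always agree (Schur complement lemma).
print(is_pd(block), is_pd(schur))
\end{verbatim}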
For the following Cohen-Grossberg neural networks with finite distributed delays:\begin{align} &\hspace {-29pt}\dot {u}_i (t) = - d_i (u_i (t))\bigg [ a_i (u_i (t)) - \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))\notag\\ &\qquad \qquad \quad ~~-\sum \limits _{k = 1}^N\sum \limits _{j = 1}^n {w_{ij}^k } g_j (u_j (t-\tau _{kj}(t)))\notag\\ &\qquad \qquad \quad ~~-\sum \limits _{l = 1}^r\sum \limits _{j = 1}^n b_{ij}^l \int _{t-d_l}^tg_j (u_j(s))\mathrm {d}s\bigg ].\end{align}
\begin{align} -2P\Gamma \Delta ^{-1}&+PW+(PW)^T+\sum _{i=1}^N(PW_iQ_i^{-1}W_i^TP+Q_i)\notag\\ &+\sum _{l=1}^r(d_lY_l+d_lPB_lY_l^{-1}B_l^TP)<0\end{align}
From the above results, we can see that the core condition is (62) for neural networks without delay. With the addition of delayed terms, the core condition is expanded from (67) to (71). Therefore, different results are derived for different network models, and they become more complex while retaining a similar LMI form. It is in this LMI form that (71) unifies many LMI-based stability results in the literature.
In the following three subsections, we will discuss the Cohen-Grossberg neural networks with mixed equilibrium point, i.e., the amplification function
B. Stability of Cohen-Grossberg Neural Networks via \(M\)-Matrix Methods or Algebraic Inequality Methods
In this section, we will focus on 10 papers to describe the progress on stability analysis of the Cohen-Grossberg neural networks (4). Some related references are used to complement the progress of stability analysis at different levels.
Assume that matrix \begin{equation*}W^e=\left (\sum _{k=1}^N w_{ij}^k\right )\end{equation*}
\begin{equation} \sum _{k=1}^N(\tau _k\beta \|W^k\|)<1\end{equation}
Under Assumptions 5.2 and 5.3, Lu and Chen [182] studied the global stability of (4) with \begin{equation} \Gamma \Delta ^{-1}-W\end{equation}
\begin{equation} P(\Gamma \Delta ^{-1}-W)+(\Gamma \Delta ^{-1}-W)^TP>0\end{equation}
Under Assumptions 5.2 and 5.3 and the positive lower boundedness of the amplification function, the following system\begin{align} \dot {u}_i (t) = - d_i (u_i (t))\bigg [ a_i (u_i (t)) - \sum \limits _{j = 1}^n {w_{ij}^1 } g_j (u_j (t-\tau _{ij}(t)))\bigg ]\end{align}
\begin{equation} \mbox {det}(\Gamma -W_1K)\ne 0\end{equation}
\begin{equation} \Gamma \Delta ^{-1}-|W_1|\end{equation}
Under Assumptions 5.1–5.3, the result in [28] requires \begin{equation} M_0=\underline D\Gamma -\sum _{k=0}^N|W_k|\Delta \overline D\end{equation}
Note that the analysis method in [28] can also be applied to the following networks\begin{align} \dot {u}_i (t) &= -d_i(u_i(t))\bigg [a_i (u_i (t)) - \sum _{k=0}^N\sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t - \tau _{ij}^k))\bigg ]\\ \dot {u}_i (t) &= -d_i(u_i(t))\bigg [a_i (u_i (t))- \sum \limits _{j = 1}^n {w_{ij} }\notag\\ &\qquad\qquad \qquad \times \int _{-\infty }^t K_{ij}(t-s) g_j (u_j (s))\mathrm {d}s\bigg ]\\ \dot {u}_i (t) &= -d_i(u_i(t))\bigg [a_i (u_i (t))- \sum \limits _{j = 1}^n {w_{ij} } g_j\notag\\ &\qquad\qquad \quad ~~ \times \left (\int _{-\infty }^t K_{ij}(t-s) u_j (s)\mathrm {d}s\right )\bigg ]\end{align}
\begin{equation} M_{0}^\prime =\underline D\Gamma -|W|\Delta \overline D\end{equation}
\begin{equation} M_{0}^{\prime \prime }=\Gamma -|W|\Delta\end{equation}
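M-matrix conditions of this type can be checked numerically using the standard characterization that a matrix with nonpositive off-diagonal entries is a nonsingular M-matrix if and only if all of its eigenvalues have positive real parts. The following is a minimal sketch, with all data chosen arbitrarily for illustration (none of it comes from the cited references):
\begin{verbatim}
import numpy as np

def is_nonsingular_M_matrix(M, tol=1e-9):
    """Nonpositive off-diagonal entries and all eigenvalues with
    positive real part (equivalently, all leading principal
    minors positive)."""
    off = M - np.diag(np.diag(M))
    if np.any(off > tol):
        return False
    return bool(np.all(np.linalg.eigvals(M).real > tol))

# Illustrative 2-neuron data (assumed for this sketch only).
Gamma = np.diag([2.0, 3.0])                 # rates gamma_i
Delta = np.diag([0.5, 0.8])                 # Lipschitz constants delta_i
W = np.array([[0.4, -0.6], [0.3, 0.5]])     # connection matrix

M_test = Gamma - np.abs(W) @ Delta          # a condition of the form Gamma - |W|Delta
print(is_nonsingular_M_matrix(M_test))      # True for this data
\end{verbatim}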
Under Assumptions 5.1–5.3 and the boundedness of the activation function, for the following Cohen-Grossberg neural networks:\begin{align} \dot {u}_i (t) &= - d_i (u_i (t))\bigg [ a_i (u_i (t)) - \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))\notag\\ &\quad\qquad \qquad \qquad ~-\sum \limits _{j = 1}^n {w_{ij}^1 } g_j (u_j (t-\tau _{ij}))\bigg ]\end{align}
\begin{equation} M_1=\Gamma -W^*\Delta -|W_1|\Delta\end{equation}
\begin{align} M_1^{\prime }&=\zeta _i\gamma _i-\zeta _iw_{ii}\delta _i-\sum _{j=1,j\ne i}^n\zeta _j|w_{ji}|\delta _i-\sum _{j=1}^n\zeta _j|w_{ji}^1|\delta _i>0\\ M_1^{\prime \prime }&=\zeta _i\gamma _i-\zeta _iw_{ii}\delta _i-\sum _{j=1,j\ne i}^n\zeta _j|w_{ij}|\delta _j-\sum _{j=1}^n\zeta _j|w_{ij}^1|\delta _j>0\\ M_1^{\prime \prime \prime }&=\zeta _i\gamma _i-\zeta _iw_{ii}\delta _i-\frac {\sum _{j=1,j\ne i}^n(\zeta _j|w_{ji}|\delta _i+\zeta _i|w_{ij}|\delta _j)}{2}\notag\\ &\quad -\frac {\sum _{j=1}^n(\zeta _j|w_{ji}^1|\delta _i+\zeta _i|w_{ij}^1|\delta _j)}{2}>0\end{align}
We should note that (86)–(88) are equivalent to
For the following Cohen-Grossberg neural networks with reaction-diffusion term:\begin{align} &\hspace {-45pt}\frac { \partial {u}_i (t,x)}{\partial t} = \sum _{k=1}^m\frac {\partial }{\partial x_k}\Big (D_{ik}\frac {\partial u_i(t,x)}{\partial x_k}\Big )- d_i (u_i (t,x))\notag\\ &\quad \times \bigg [ a_i (u_i (t,x)) - \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t,x))\notag\\ &\qquad ~~-\sum \limits _{j = 1}^n {w_{ij}^1 } f_j (u_j (t-\tau _{ij}(t),x))\bigg ]\end{align}
\begin{equation} M_3=\underline d_i\gamma _i-\sum _{j=1}^n\overline d_j|w_{ji}|\delta _i-\sum _{j=1}^n\overline d_j|w_{ji}^1|\delta _i^0>0.\end{equation}
\begin{equation} M_3^{\prime }=\underline D \Gamma -|W|\overline D\Delta - |W_1|\overline D\Delta ^0\end{equation}
For stochastic Hopfield neural networks (89) with constant delays \begin{align} \mathrm {d} u_i(t,x)&= \sum _{k=1}^m\frac {\partial }{\partial x_k}\Big (D_{ik}\frac {\partial u_i(t,x)}{\partial x_k}\Big )\notag\\ &\quad - \bigg [ a_i (u_i (t,x)) - \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t,x))\notag\\ &\qquad -\sum \limits _{j = 1}^n {w_{ij}^1 } f_j (u_j (t-\tau _{ij},x))\bigg ]\mathrm {d}t\notag\\ &\quad +\sum \limits _{j = 1}^n {\sigma _{ij}(u_j(t,x)) }\mathrm {d}\omega _{j}(t)\end{align}
For the deterministic case of (92), (91) with \begin{align} M_4&=\Gamma -|W|\Delta - W_1\Delta ^0-\overline C\\ M_4^{\prime }&=\Gamma -|W|\Delta - W_1\Delta ^0-\tilde C\end{align}
\begin{align*} \overline C&=\mbox {diag}(\overline c_1, \dotsc ,\overline c_n)\\ \overline c_i&=-\gamma _i+\sum \limits _{j = 1}^n {w_{ij}\delta _j}+\sum \limits _{j = 1}^n {w^1_{ij}\delta ^0_j }+\sum \limits _{j = 1}^n {L^2_{ij} }\ge 0\\ \tilde C&=\mbox {diag}(\tilde c_1, \dotsc ,\tilde c_n)\\ \tilde c_i&=0.5\sum \limits _{j = 1}^n {L^2_{ij} }+K_1\left (\sum \limits _{j = 1}^n {L^2_{ij} }\right )^{1/2}\ge 0\end{align*}
Obviously,
For the reaction-diffusion Hopfield neural networks (89) with continuously distributed delays \begin{align} \frac { \partial {u}_i (t)}{\partial t} &= \sum _{k=1}^m\frac {\partial }{\partial x_k}\left (D_{ik}\frac {\partial u_i(t,x)}{\partial x_k}\right )\notag\\ &\quad ~- \bigg [ a_i (u_i (t,x)) -\sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t,x))\notag\\ &\qquad \quad -\sum \limits _{j = 1}^n {w_{ij}^1 } \int _{-\infty }^tK_{ij}(t-s)g_j (u_j (s,x))\mathrm {d}s\bigg ]\qquad\end{align}
\begin{align} M_0^{\prime \prime \prime }=\Gamma -|W|\Delta -|W_1|\Delta\\ M_0^{\prime \prime \prime \prime }=\Gamma -W^+\Delta -|W_1|\Delta\end{align}
For the following systems with distributed delays:\begin{align} \dot u_i(t)&= - \bigg [ a_i (u_i (t)) -\sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))\notag\\ &\qquad ~~-\sum \limits _{j = 1}^n {w_{ij}^1 } f_j (u_j (t-\tau _{ij}(t)))\notag\\ &\qquad ~~ -\sum \limits _{j = 1}^n {w_{ij}^2 } \int _0^{\infty }K_{ij}(s)h_j(u_j (t-s))\mathrm {d}s\bigg ]\end{align}
\begin{equation} \int _0^{\infty }e^{\lambda s}K_{ij}(s)\mathrm {d}s=k_{ij}(\lambda )>0\end{equation}
\begin{align} \left [\lambda I-\Gamma +|W|\Delta +e^{\lambda \tau }|W_1|\Delta ^0+(\rho (\lambda )\otimes |W_2|\Delta ^1)\right ]\zeta <0\end{align}
\begin{equation} \Gamma -|W|\Delta -|W_1|\Delta ^0-|W_2|\Delta ^1\end{equation}
For the following neural networks with finite distributed delays:\begin{align} \frac { \partial {u}_i (t)}{\partial t} &= \sum _{k=1}^m\frac {\partial }{\partial x_k}\left (D_{ik}\frac {\partial u_i(t,x)}{\partial x_k}\right )- \bigg [ a_i (u_i (t,x)) \notag\\ &\quad -\sum \limits _{j = 1}^n {w_{ij}^1 } f_j \left (\int _0^{{T}}K_{ij}(s)u_j (t-s,x)\mathrm {d}s\right )\bigg ]\qquad\end{align}
For the neutral-type Cohen-Grossberg neural networks with constant delays \begin{align} &\hspace {-40pt}\dot {u}_i (t) +\sum \limits _{j = 1}^n {e_{ij} } \dot u_j (t-\tau _j)\notag\\ &= - d_i (u_i (t))\bigg [ a_i (u_i (t))- \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t)) \notag\\ &\qquad\qquad \qquad -\sum \limits _{j = 1}^n {w_{ij}^1 } g_j (u_j (t-\tau _{j}))\bigg ]\end{align}
\begin{gather} 0\le \|E\|<1\notag \\ \delta _M p_w(1+\|E\|) +\delta _M r_w(1+\|E\|)+q_w<\min\nolimits _{1\le i\le n}\{\underline d_i \gamma _i\}\end{gather}
\begin{equation} \delta _M(\|W\|+\|W_1\|)\max _i \{\overline d_i \} \le \min _i\{\underline d_i \gamma _i\}.\end{equation}
\begin{equation} \delta _M(\|W\|+\|W_1\|)\le \min _i\{\gamma _i\}.\end{equation}
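Norm-type conditions such as the two above are the easiest to verify numerically, since they only involve spectral norms and the extreme values of the parameters. Below is a minimal sketch with hypothetical data (not taken from any cited reference):
\begin{verbatim}
import numpy as np

# Hypothetical network data, for illustration only.
W  = np.array([[0.4, -0.6], [0.3, 0.5]])
W1 = np.array([[0.2,  0.1], [-0.1, 0.3]])
gamma = np.array([2.0, 3.0])     # self-feedback rates gamma_i
delta_M = 0.8                    # largest Lipschitz constant of the activations

lhs = delta_M * (np.linalg.norm(W, 2) + np.linalg.norm(W1, 2))
print(lhs <= gamma.min())        # norm-type stability test
\end{verbatim}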
From the above results, we can see that the core condition is (73) for neural networks without delay, or similarly (77) for purely delayed neural networks. With the increasing complexity of networks, the core condition is expanded from (73) or (77) to (101). Note that the
C. Stability of Cohen-Grossberg Neural Networks via Matrix Inequality Methods or Mixed Methods
In this section, we will focus on four papers to describe the stability analysis of Cohen-Grossberg neural networks (4). Some related references are used to complement the progress at different levels.
In this section, the activation function is assumed to satisfy Assumption 5.4 if there is no other declaration.
For the following Cohen-Grossberg neural networks:\begin{align} \dot {u}_i (t) &= - d_i (u_i (t))\bigg [ a_i (u_i (t)) - \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))\notag\\ &\qquad \qquad \qquad -\sum \limits _{j = 1}^n {w_{ij}^1 } f_j (u_j (t-\tau ))\bigg ]\end{align}
\begin{equation} 2P\Gamma \Delta ^{-1}-PW-(PW)^T-Q-PW_1Q^{-1}W_1^TP>0\end{equation}
\begin{equation} \delta _M(\|W\|+\|W_1\|)<\gamma _m\end{equation}
\begin{equation} 2\Gamma \Delta ^{-1}-W-W^T-{\|W_1\|}I-\frac {1}{\|W_1\|}W_1W_1^T>0.\end{equation}
\begin{equation} x^T\left (2\Gamma \Delta ^{-1}-W-W^T-{\|W_1\|}I-\frac {1}{\|W_1\|}W_1W_1^T\right )x(t)>0.\end{equation}
\begin{equation} x^T\left (2\gamma _m\delta _M^{-1}-2\|W\|-2\|W_1\|\right )x(t)>0.\end{equation}
For the following Cohen-Grossberg neural networks with continuously distributed delays:\begin{align} &\hspace {-25pt}\dot u_i(t)= -d_i(u_i(t)) \bigg [ a_i (u_i (t)) -\sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t)) \notag\\ &\qquad\qquad \qquad ~~-\sum \limits _{j = 1}^n {w_{ij}^1 } g_j (u_j (t-\tau (t)))-\sum \limits _{j = 1}^n {w_{ij}^2 } \notag\\ &\qquad\qquad \qquad ~~\times \int _{-\infty }^tK_{j}(t-s)g_j (u_j (s))\mathrm {d}s\bigg ]\end{align}
\begin{align} \int _0^{\infty }K_{j}(s)\mathrm {d}s=1,\quad \int _0^{\infty }sK_{j}(s)e^{2\lambda s}\mathrm {d}s=\pi _j(\lambda )<\infty ,\quad \lambda >0.\end{align}
\begin{align} \dot u(t)&= -D(u(t)) \Big [ A (u (t)) -W g (u(t)) -W_1g (u (t-\tau (t)))\notag\\ &\qquad\qquad \qquad -W_2 \int _{-\infty }^tK(t-s)g (u (s))\mathrm {d}s\Big ].\end{align}
For the following Cohen-Grossberg neural networks with continuously distributed delays:\begin{align} &\hspace {-10pt}\dot u_i(t)= - d_i(u_i(t))\bigg [ a_i (u_i (t)) -\sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))-\sum \limits _{j = 1}^n {w_{ij}^1 } \notag\\ &\qquad\qquad \qquad \quad ~~ \times \int _{-\infty }^tK_{ij}(t-s)g_j (u_j (s))\mathrm {d}s\bigg ]\end{align}
\begin{equation} 2P\Gamma \Delta ^{-1}-PW-W^TP-(PQ^{-1}W_1)_{\infty }-(PQW_1)_1>0\end{equation}
\begin{equation} \int _0^{\infty }K_{ij}(s)e^{\delta _0s}\mathrm {d}s<\infty\end{equation}
For (116), it can be transformed into the following vector-matrix form [269]\begin{align} \dot u(t) &= - D(u(t))\bigg [A(u(t)) - Wg(u(t)) \notag\\ &\qquad\qquad \quad ~- \sum \limits _{i = 1}^n {E_i\int _{- \infty }^t \bar K_i(t-s){g(u(s))\mathrm {d}s}} \bigg ]\qquad\end{align}
Note that, with the use of the Moon inequality [211], the Finsler inequality, the well-known Newton-Leibniz formula, and the free weight matrix method, a large number of different classes of LMI-based stability results have been established in [94] and [320].
For the Cohen-Grossberg neural networks (63) and the Cohen-Grossberg neural networks with finite distributed delays (70), LMI-based stability results have been established in [311] and [312], respectively, in which a similar delay-matrix-decomposition method is proposed to derive the main results.
From the above results, we can see that the core condition of the stability criteria for neural networks with delay is (67) in Section V-A, from which one can derive (108) and (117), respectively. Under different assumptions on the network models, one may obtain the same stability results in mathematical form; however, the physical meanings they reflect are essentially different.
D. Topics on Robust Stability of Recurrent Neural Networks
In the design and hardware implementation of neural networks, a common problem is that accurate parameters are difficult to guarantee. To design neural networks, vital data, such as the neuron firing rates, the synaptic interconnection weights, and the signal transmission delays, usually need to be measured, acquired, and processed by means of statistical estimation, which inevitably leads to estimation errors. Moreover, parameter fluctuation in neural network implementations on very large-scale integration chips is also unavoidable. In practice, it is possible to estimate the range of the above-mentioned vital data, as well as the bounds of circuit parameters, from engineering experience even with incomplete information. This fact implies that good neural networks should have a certain degree of robustness, which paves the way for introducing the theory of interval matrices and interval dynamics to investigate the global stability of interval neural networks. As pointed out in [19], robust stability is very important in the consideration of dynamics of neural networks with or without delays. There are many related results on robust stability [170], [246], [291]. In [170], global robust stability of delayed interval Hopfield neural networks was investigated with respect to bounded and strictly increasing activation functions. Several
For LMI-based robust stability results of recurrent neural networks, the difficulty is how to tackle different classes of uncertainties. For the cases of matched uncertainties and interval uncertainties, many LMI-based robust stability results have been published [18], [19], [38], [44], [121], [137], [144], [187], [246], [273], [310], [311]. However, for recurrent neural networks with other forms of uncertainties, LMI-based robust stability results are few [286], [288]. It is important to establish LMI-based robust stability results for recurrent neural networks with different classes of uncertainties, because one can then exploit the advantages of the LMI technique to establish a new stability theory for recurrent neural networks with uncertainties, in parallel to the scalar methods, such as
Since the proof method for the robust stability of systems with interval uncertainties and matched uncertainties is similar to that for systems without uncertainties, a detailed review of the robust stability results for recurrent neural networks is omitted.
Stability Analysis of Neural Networks With Discontinuous Activation Functions
Although this paper mainly focuses on the stability of continuous-time recurrent neural networks, we also briefly discuss discontinuous recurrent neural networks, which have been intensively studied in the literature.
When dealing with dynamical systems possessing high-slope nonlinear elements, it is often advantageous to model them with a system of differential equations with a discontinuous right-hand side, rather than studying the case where the slope is high but finite [71], [72]. The main advantage of analyzing the ideal discontinuous case is that such analysis is usually able to give a clear picture of the salient features of the motion, such as the presence of sliding modes, i.e., the possibility that trajectories may be confined for some time intervals to discontinuity surfaces.
The existing literature reports a few other investigations on discontinuous neural networks, which pertain to a different application context, or to different neural architectures. A significant case is that of Hopfield neural networks where neurons are modeled by a hard discontinuous comparator function [146]. Different from the discontinuous activation function addressed in [71] (see discontinuous activation function in Section II-B of this paper), the analysis in [146] was valid for symmetric neural networks, which possessed multiple equilibrium points located in saturation regions, i.e., networks useful to implement content addressable memories. References [49] and [78] introduced a special neural-like architecture for solving linear programming problems, in which the architecture is substantially different from the additive neural networks. Moreover, the networks in [49] are designed as gradient systems of a suitable energy function, while it is known that additive neural networks of the Hopfield type are gradient systems only under the restrictive assumption of symmetric interconnection matrix [73], [103]. To study the class of discontinuous neural networks, the concepts from the theory of differential equations with discontinuous right-hand side as introduced by Filippov are usually used [68], [86], [111], [184].
In [71], discontinuous Hopfield networks (2) were studied. The established conditions on global convergence could be applicable to general nonsymmetric interconnection matrices, and they generalized the previous results to the discontinuous case for neural networks possessing smooth neuron activations. Specifically, if the following simple condition established in [71] is satisfied\begin{equation} -W ~\mbox {is a}~P\mbox{-matrix}\end{equation}
More importantly, the concept of global convergence of the output equilibrium point was proposed in [71]. Usually, in the standard case considered in the literature where the neuron activation functions are continuous and monotone, it is easy to see that global attractivity of an equilibrium point also implies global attractivity of the output equilibrium point. Unfortunately, this property is no longer valid for the class of discontinuous activation functions, since for discontinuous neuron activations, convergence of the state does not imply convergence of the output. Therefore, for discontinuous activation functions, it is necessary to address separately the global convergence of both the state variables and the output variables. In [71], the following condition was derived [71, Th. 2] \begin{equation} {-W} ~\mbox {is Lyapunov diagonally stable}\end{equation}
Under (121), [71, Th. 2] holds for all neural network inputs \begin{equation} \mathcal{M}(W)-|W_1|~ \mbox {is an} ~ M\mbox{-matrix}\end{equation}
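Conditions such as (120) can be tested directly from the definition of a P-matrix, i.e., all principal minors are positive (the brute-force check below is exponential in the dimension and is shown only as a sketch on hypothetical data):
\begin{verbatim}
import numpy as np
from itertools import combinations

def is_P_matrix(M, tol=1e-12):
    """P-matrix test: every principal minor is positive."""
    n = M.shape[0]
    return all(np.linalg.det(M[np.ix_(idx, idx)]) > tol
               for k in range(1, n + 1)
               for idx in combinations(range(n), k))

# Hypothetical interconnection matrix, for illustration only.
W = np.array([[-1.0, 0.4], [-0.3, -0.8]])
print(is_P_matrix(-W))    # checks a condition of the type "-W is a P-matrix"
\end{verbatim}
Verifying the stronger Lyapunov diagonal stability condition (121) instead amounts to finding a diagonal \(P>0\) with \(PW+W^TP<0\), which is a small semidefinite feasibility problem rather than a closed-form test.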
Now, let us compare (121) and (122). When the delay is sufficiently small, the interconnection matrix in discontinuous system (5) is given by
After the pioneering work in [71], [72], and [183], the topic of discontinuous neural networks has received much attention, and many related results have been established [64], [117], [175], [216]. Among these works, the study of discontinuous neural networks has mainly been pursued by four research teams [68], [95], [173], [183], [184], [186], [111], [171], [86], [116], [141]. Readers can refer to the references cited therein, and the details are omitted here.
From the above results, we can see that there are mainly two kinds of core conditions: 1) the LDS form (121) and 2) the
Some Necessary and Sufficient Conditions for Recurrent Neural Networks
Nowadays, almost all the stability results for recurrent neural networks are sufficient conditions. However, there also exist some necessary and sufficient conditions for some special classes of recurrent neural networks with/without delays.
Note that the sufficient asymptotic/exponential stability criteria in the existing literature are all established on the basis of strict inequalities (i.e., \(>0\) or \(<0\)). It is natural to ask: what will happen if the strict inequalities are replaced by nonstrict inequalities (i.e., \(\ge 0\) or \(\le 0\))? For the following Hopfield neural networks with delays\begin{align} \dot {u}_i (t) = - \gamma _iu_i (t) +\sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t)) +\sum \limits _{j = 1}^n {w_{ij}^1 } f_j (u_j (t-\tau _{ij})).\end{align}
\begin{align} M_j=-\gamma _j+\alpha _j\bigg (w_{jj}+\sum _{i=1,i\ne j}^n|w_{ij}|\bigg )^+ +\beta _j\sum _{i=1}^n|w_{ij}^1|\le 0\end{align}
For the following purely delayed Hopfield neural networks:\begin{equation} \dot {u} (t) = - \Gamma u (t) +W_1 f(u (t-\tau ))\end{equation}
\begin{equation} \left (\sigma I-\left [\begin{array}{cc}\Gamma &0\\ 0&\Gamma \end{array}\right ]+e^{\sigma \tau }\left [\begin{array}{cc}W_1^+ &W_1^-\\ W _1^-&W_1^+\end{array}\right ]\left [\begin{array}{cc}\Delta &0\\ 0&\Delta \end{array}\right ]\right )\eta \le 0\qquad\end{equation}
For the following Hopfield neural networks, the ABST was studied in [51], [53], [70], and [159]\begin{equation} \dot {u}_i (t) = - \gamma _i u_i (t) + \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))+U_i\end{equation}
\begin{equation} -W\in \mathcal {P}_0\end{equation}
In [127], a conjecture was raised: the necessary and sufficient condition for ABST of the neural networks (127) is that its connection matrix
In [53], for (127) with \begin{equation} \max _i \mbox {Re}~ \lambda _i(W)\le 0\end{equation}
\begin{equation} \max _i \lambda _i\left (\frac {W+W^T}{2}\right )\le 0\end{equation}
In [51], (127) was further discussed. By removing the assumption of normal matrix on \begin{equation} \max _i \mbox {Re}~ \lambda _i(W)\le 0\end{equation}
For (127) with Assumption 5.4, the following necessary and sufficient condition was derived in [107] and [124]\begin{equation} -\Gamma +W\Delta ~\mbox {is nonsingular or }\det (-\Gamma +W\Delta )\ne 0\qquad\end{equation}
For the Hopfield neural networks with delays \begin{equation} \dot {u}_i (t) = - \gamma _i u_i (t) + \sum \limits _{j = 1}^n {w^1_{ij} } g_j (u_j (t-\tau _{ij}))\end{equation}
\begin{equation} \det (-\Gamma +W_1)\ne 0,~\mbox {and}~ \Gamma -|W_1|~\mbox {is a \(\mathcal {P}_0\)-matrix}\end{equation}
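This necessary and sufficient condition can likewise be verified directly; the sketch below checks the nonsingularity requirement and the \(\mathcal {P}_0\)-matrix requirement (all principal minors nonnegative) on hypothetical data chosen only for illustration:
\begin{verbatim}
import numpy as np
from itertools import combinations

def is_P0_matrix(M, tol=1e-12):
    """P0-matrix test: every principal minor is nonnegative."""
    n = M.shape[0]
    return all(np.linalg.det(M[np.ix_(idx, idx)]) >= -tol
               for k in range(1, n + 1)
               for idx in combinations(range(n), k))

# Hypothetical data, for illustration only.
Gamma = np.diag([1.0, 1.5])
W1 = np.array([[0.5, -0.4], [0.2, 0.6]])

cond1 = abs(np.linalg.det(-Gamma + W1)) > 1e-12   # det(-Gamma + W1) != 0
cond2 = is_P0_matrix(Gamma - np.abs(W1))          # Gamma - |W1| is a P0-matrix
print(cond1 and cond2)
\end{verbatim}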
From the above results, we can see that
Multistability of Recurrent Neural Networks and Its Comparisons With Global Stability
Preceding sections are about the global stability of the unique equilibrium point of continuous-time recurrent neural networks. Multistability problems also require further investigation. For example, when recurrent neural networks are applied to pattern recognition, image processing, associative memories, and pattern formation, it is desirable that the network has several equilibria, each representing an individual pattern [55], [62], [129]. In addition, in some neuromorphic analog circuits, multistable dynamics even play an essential role, as revealed in [58] and [87]. Therefore, the study of the coexistence and stability of multiple equilibrium points, in particular of their basins of attraction, is of great interest in both theory and applications [17], [31], [46], [47], [190], [195], [266], [298], [314]. A tutorial on the applications of neural networks to associative memories and pattern formation can be found in [118], [120], [234], and [296]. Theoretical research on convergence and multistability of recurrent neural networks can be found in [83] and [103]. In this section, we will mainly focus on recent theoretical results on multiple equilibrium points of recurrent neural networks.
Chen and Amari [31] pointed out that the one-neuron neural network model has three equilibrium points; two of them are locally stable, and one is unstable. For the
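As a simple numerical illustration of this three-equilibria picture (a minimal sketch; the weight \(w=2\) and the \(\tanh\) activation are chosen here only for exposition), consider the scalar model \(\dot u=-u+w\tanh (u)\), which for \(w>1\) has exactly three equilibria:
\begin{verbatim}
import numpy as np
from scipy.optimize import brentq

w = 2.0
f  = lambda u: -u + w * np.tanh(u)               # right-hand side
df = lambda u: -1.0 + w * (1.0 - np.tanh(u)**2)  # its derivative

u_pos = brentq(f, 0.5, 5.0)                      # positive equilibrium
for u_star in (-u_pos, 0.0, u_pos):
    # an equilibrium is locally stable when f'(u*) < 0
    print(f"u* = {u_star:+.4f}, f'(u*) = {df(u_star):+.4f}")
\end{verbatim}
The two outer equilibria (approximately \(\pm 1.915\)) are locally stable, while the origin is unstable, in agreement with the discussion above.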
All the above methods for the stability analysis of multiple equilibria are based on a decomposition of the phase space. Then, in each invariant attractive subset containing a stable equilibrium point, the neural network reduces to the linear case, in which the stability property of the equilibrium point is established. The main difficulty lies in how to efficiently decompose the phase space
The differences of stability analysis between recurrent neural networks with unique equilibrium point and recurrent neural networks with multiple equilibrium points can be summarized as follows.
The region of initial states of the unique solution of recurrent neural networks is the whole state space, while the initial regions of the multiple solutions of recurrent neural networks belong to different subspaces. This is the main difference that leads to global stability and local stability, respectively.
The types of activation functions play different roles in analyzing the stability of a unique equilibrium point and of multiple equilibrium points. For a large class of activation functions, one can prove the existence, uniqueness, and global stability of the equilibrium point. In contrast, if the specific form of the activation function is not given in the analysis of recurrent neural networks with multiple equilibrium points, the subspace decomposition cannot proceed. Thus, the local stability analysis of the multiple equilibrium points cannot be conducted by the subspace decomposition method.
There are many methods to analyze the global stability of recurrent neural networks with a unique equilibrium point, for example, the contraction method, the Lyapunov method, the differential equation method, the comparison principle method, and so on. However, for recurrent neural networks with multiple equilibrium points, one of the most frequently used methods in the literature is linearization at each local equilibrium point, which, consequently, is only concerned with local stability. This is also the main reason why there are far fewer stability results for recurrent neural networks with multiple equilibrium points than for recurrent neural networks with a unique equilibrium point. However, there are more results on the estimation of the domains of attraction of multiple equilibria than corresponding local stability results.
In applications, recurrent neural networks with unique equilibrium point are mainly used to solve optimization problems. In contrast, recurrent neural networks with multiple equilibrium points can be applied to many different fields, such as associative memories, pattern recognition, pattern formation, signal processing, and so on [202].
Some Future Directions and Conclusion
In this paper, some topics on the stability of recurrent neural networks have been discussed in detail. The coverage includes most aspects of stability research on recurrent neural networks. The fruitful results in the fields of stability of recurrent neural networks have greatly promoted the development of the neural network theory.
For future directions of the stability study on recurrent neural networks, we now give some prospective suggestions.
Continue to apply and develop useful mathematical methods to further reduce the conservativeness of the existing stability results while keeping a reasonably low computational complexity. This topic is sometimes related to the development of other disciplines, such as applied mathematics, computational mathematics, and mechanics.
How to establish necessary and sufficient stability conditions for delayed recurrent neural networks with more neurons is still an open problem. For the case of constant time delay, a necessary and sufficient stability result has been proposed only for recurrent neural networks with two neurons. Moreover, how to obtain approximate necessary and sufficient stability conditions is also meaningful for the development of neural network theory.
In addition to the global stability property, how to establish stability criteria for multiple equilibrium points of recurrent neural networks still needs more effort. In general, the global stability property is related to optimization problems, while multistability is related to associative memories. In applications such as image recognition, data classification, and information processing, multistability may play an important role. Details of interest include the size of the domains of attraction and their precise boundaries.
How to balance the computational complexity and the efficiency of stability results needs to be investigated. At present, the conservativeness of stability results is reduced at the expense of more complex expressions, which involve too many parameters to be determined. How to reduce the redundancy of some of the slack variables in LMI-based stability conditions needs to be further investigated.
For the original Cohen-Grossberg neural networks, in which the equilibrium points are all positive or nonnegative, only a few stability results have been established. These classes of neural networks play an important role in biological systems or competition-cooperation systems. Compared with the stability study of Hopfield neural networks, the work on the original Cohen-Grossberg neural networks is not sufficient, in either breadth or depth. For Cohen-Grossberg neural networks with nonnegative equilibrium points, how to study the stability properties in the presence of reaction-diffusion terms, stochastic environments, impulsive actions, and other effects remains to be investigated in depth.
Considering the complexity of the internal and external factors of neural networks, some new features must be incorporated into the existing network models, for example, internal elastic connections and spike effects, external stochastic fields, switching, impulses, and so on. These factors may have direct effects on neural networks and are especially challenging for the study of stability problems.
The stability properties of recurrent neural networks considered in this paper concern isolated Cohen-Grossberg-type recurrent neural networks with regular topology structures, for example, Hopfield neural networks and cellular neural networks. For other types of recurrent neural networks with different topology structures, for example, symmetric/asymmetric ring networks and random symmetric/asymmetric networks, the stability results are few. Especially, when the same or different classes of networks are composed into large-scale complex neural networks, the stability problems of synchronization and consensus should be deeply investigated in different cases, such as link failure, pinning control, clustering, and so on. Moreover, complex-valued and fractional-order neural networks, which are regarded as extensions of real-valued and integer-order neural networks, have also been investigated in recent years. In these directions, there are many challenging topics to be further studied.
Conclusion
In summary, stability studies for recurrent neural networks with or without time delays have achieved a great deal in the last three decades. However, there are still many new problems to be solved. All these future developments will accompany the development of mathematical theory, especially applied mathematics and computational mathematics. It should be kept in mind that different forms of stability criteria have their own feasible ranges, and one cannot expect that only a few stability results can tackle all the stability problems existing in recurrent neural networks. Every class of stability results, for example, in the forms of algebraic inequality, LDS,