Introduction
Approaches based on recurrent neural networks for solving optimization problems, which use analog computation implemented on electronic devices to replace numerical computation realized by mathematical algorithms, have attracted considerable attention (see [102], [105], [131], [220], [233], [248], [251], [252], [254], [255], and the references therein). However, due to the existence of many equilibrium points of recurrent neural networks, spurious suboptimal responses are likely to be present [69], [205], [254], [258], which limit the applications of recurrent neural networks. Thus, the global asymptotic/exponential stability of a unique equilibrium point for the concerned recurrent neural networks is of great importance from a theoretical and application point of view [23], [55], [69], [103], [204], [205], [281], [282].
Early research on the stability of recurrent neural networks focused on symmetric recurrent neural networks with or without delays. References [23], [105], [106], [145], [200], [254], and [287] looked into the dynamic stability of symmetrically connected networks and showed their practical applications to optimization problems. Cohen and Grossberg [55] presented analytical results on the global stability of symmetric recurrent neural networks. A brief review of the dynamics and stability of symmetrically connected networks is presented in [199], in which the effects of time delay, the eigenvalues of the interconnection matrix, and the gain of the activation function on the local dynamics or stability of symmetric Hopfield neural networks were discussed in detail.
Both in practice and theory, the symmetric restriction on the connection matrix of recurrent neural networks is too strong, while asymmetric connection structures are more general [8], [206], [281]. For instance, a nonsymmetric interconnection matrix may originate from slight perturbations in the electronic implementation of a symmetric matrix. Asymmetries of the interconnection matrix may also be deliberately introduced to accomplish special tasks [267] or may be related to the attempt to consider a more realistic model of some classes of neural circuits composed of the interconnection of two different sets of amplifiers (e.g., neural networks for nonlinear programming [131]). Therefore, the local and global stability of asymmetrically connected neural networks has been widely studied [30], [239], [274]. As pointed out in [69], the topic of global stability of neural networks is more significant than that of local stability in applications such as signal processing and optimization problems. These important applications motivated researchers to investigate the dynamical behaviors of neural networks and global stability conditions of neural networks [23], [103], [281], [282]. Reference [130] applied the contraction mapping theory to obtain some sufficient conditions for global stability. Reference [201] generalized some results in [103] and [130] using a new Lyapunov function. Reference [126] proved that diagonal stability of the interconnection matrix implied the existence and uniqueness of an equilibrium point and the global stability at the equilibrium point. References [65], [69], and [73] pointed out that the negative semidefiniteness of the interconnection matrix guaranteed the global stability of the Hopfield networks, which generalized the results in [103], [126], [130], and [201]. Reference [63] applied the matrix measure theory to get some sufficient conditions for global and local stability. References [122] and [123] discussed the stability of a delayed neural network using a Lyapunov function and established a Lyapunov diagonal stability (LDS) condition on the interconnection matrix. References [30]–[32], [35], and [172] introduced a direct approach to address the stability of delayed neural networks, in which the existence of an equilibrium point and its stability were proved simultaneously without using complicated theory, such as degree theory and homeomorphism theory. Note that the above references provided global stability criteria of recurrent neural networks using different algebraic methods, which reflect the different measure scales on the stability property due to the different sufficient conditions. These expressions of global stability criteria can generally be divided into two categories: 1) the LDS condition and 2) the matrix measure stability condition, which have been developed extensively in parallel. The former condition considers the effects of the positive and negative signs, that is, the excitatory (positive) and inhibitory (negative) connections in the interconnection matrix, while the latter does not distinguish between them.
It is well known that symmetrically connected analog neural networks without delays operating in continuous time will not oscillate [103]–[106], in which it is assumed that neurons communicate and respond instantaneously. In electronic neural networks, time delays will occur due to the finite switching speed of amplifiers [9], [54], [200], [236], [237]. Designing a network to operate more quickly will increase the relative size of the intrinsic delay and can eventually lead to oscillation [200]. In biological neural networks, it is well known that time delay can cause a stable system to oscillate [79]. Time delay has become one of the main sources of instability. Therefore, the study of the effects of time delay on the stability and convergence of neural networks has attracted considerable attention in the neural network community [9], [21], [25], [54], [113], [236], [237]. Under certain symmetric connectivity assumptions, neural networks with time delay will be stable when the magnitude of the time delay does not exceed certain bounds [13], [200], [290]. For asymmetric neural networks with delays, sufficient stability conditions independent of or depending on the magnitude of delays were also established [50], [81], [256], [300]. These results are mostly based on linearization analysis and energy and/or Lyapunov function methods. Recently, most of the stability results are for recurrent neural networks with delays, such as discrete delays, distributed delays, neutral-type delays, and other types of delays, and many different analysis methods have been proposed. Since 2002 [162], [163], the linear matrix inequality (LMI) method has been used in the stability analysis of recurrent neural networks, and many different LMI-based stability results have since been developed. To date, LMI-based stability analysis of recurrent neural networks is still one of the most commonly used methods in the neural network community.
More recently, much effort has been devoted to various stability analyses of recurrent neural networks [5], [135], [245], [263], [278], [304], [313], [325]. A detailed survey and summary of stability results are necessary for understanding the development of the stability theory of recurrent neural networks. Although there are some literature surveys available on the stability of recurrent neural networks [83], [103], [199], exhaustive/cohesive reviews on the stability of recurrent neural networks are still lacking, which motivates us to present a comprehensive review on this specific topic. Although there are many different types of recurrent neural networks, including complex-valued neural networks [110], [285] and fractional-order neural networks [125], this paper is mainly concerned with real-valued continuous-time recurrent neural networks described by ordinary differential equations in the time domain.
This paper is organized as follows. In Section II, the research categories of the stability of recurrent neural networks are presented, which include the evolution of recurrent neural network models, activation functions, connection weight matrices, main types of Lyapunov functions, and different kinds of expression forms of stability results. In Section III, a brief review of the early methods for the stability analysis of recurrent neural networks is presented. In Section IV, the LMI-based approach is discussed in detail and some related proof methods for LMI-based stability results are also analyzed. In Section V, two classes of Cohen-Grossberg neural networks are discussed, and some related LMI-based stability criteria are introduced and compared. In Section VI, the stability problem of recurrent neural networks with discontinuous activation functions is presented. The emphasis is placed on recurrent neural networks without delays. In Section VII, some necessary and sufficient conditions for the dynamics of recurrent neural networks without delays are developed. In Section VIII, stability problems of recurrent neural networks with multiple equilibrium points are discussed, which is a useful complement to the study of neural networks with a unique equilibrium point. The conclusion and some future directions are finally given in Section IX, which point out some potential and promising directions for the stability analysis of recurrent neural networks.
Scope of Recurrent Neural Network Research
A recurrent neural network model is mainly composed of such components as self-feedback connection weights, activation functions, interconnection weights, amplification functions, and delays. To establish efficient stability criteria, two approaches are usually adopted. One is to efficiently use the information of recurrent neural networks under different assumptions. The other is to relax the assumptions on the neural networks using novel mathematical techniques. Along these lines, we will give a detailed review of the stability research on recurrent neural networks in this section.
A. Evolution of Recurrent Neural Network Models
Since Hopfield [105] and Cohen and Grossberg [55] proposed two types of recurrent neural network models in the 1980s, modified models have been frequently proposed by incorporating different internal and external factors. In particular, since time delays were incorporated into the network models, stability research on delayed neural networks has made significant progress. A short review of the evolution of neural network models with delays is presented in [305]. However, with the development of the theory of neural networks, some new variations have taken place in neural network models. Therefore, in this section, we will briefly review some basic models of recurrent neural networks and their recent variants.
Cohen and Grossberg [55] proposed a neural network model described by \begin{equation} \dot {u}_i (t) = d_i (u_i (t))\bigg [ a_i (u_i (t)) - \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))\bigg ]\end{equation} where \(u_i(t)\) is the state of the \(i\)th neuron, \(d_i(\cdot )\) is the amplification function, \(a_i(\cdot )\) is a function satisfying the conditions discussed in Section V, \(w_{ij}\) are the interconnection weights, and \(g_j(\cdot )\) are the activation functions.
Hopfield [105] proposed the following continuous-time Hopfield neural network model\begin{equation} \dot {u}_i (t) = - \gamma _i u_i (t) + \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))+U_i\end{equation} where \(\gamma _i>0\) is the self-feedback connection weight, \(w_{ij}\) are the interconnection weights, \(g_j(\cdot )\) are the activation functions, and \(U_i\) is the external input.
In (1) and (2), it was assumed that neurons communicated and responded instantaneously. However, in electronic neural networks, time delay will be present due to the finite switching speed of amplifiers, which will be a source of instability. Moreover, many motion-related phenomena can be represented and/or modeled by delay-type neural networks, which make the general neural networks with feedback and delay-type synapses become more important as well [236]. Time delay was first considered in Hopfield model in [200], which was described by the following delayed networks\begin{equation} \dot {u}_i (t) = - u_i (t) + \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t-\tau )).\end{equation}
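To make the effect of the transmission delay concrete, the following is a minimal simulation sketch of the purely delayed model (3) using a forward Euler scheme with a history buffer; the tanh activation, the weight matrix, the delay, and the step size are hypothetical choices for illustration only, not taken from the cited references.

```python
import numpy as np

# Forward Euler simulation of the purely delayed Hopfield-type model (3):
#   du_i/dt = -u_i(t) + sum_j w_ij * g_j(u_j(t - tau))
# All parameters below are hypothetical and chosen only for illustration.
n, tau, dt, T = 3, 1.0, 0.01, 20.0
W = np.array([[ 0.0, -1.2,  0.5],
              [ 0.8,  0.0, -0.7],
              [-0.5,  0.9,  0.0]])      # delayed connection weights w_ij
g = np.tanh                              # bounded, monotonically increasing activation

steps = int(T / dt)
delay_steps = int(tau / dt)
u = np.zeros((steps + 1, n))
u[:delay_steps + 1] = 0.1 * np.random.randn(n)   # constant initial history on [-tau, 0]

for k in range(delay_steps, steps):
    u_delayed = u[k - delay_steps]               # u(t - tau)
    u[k + 1] = u[k] + dt * (-u[k] + W @ g(u_delayed))

print("state at t = T:", u[-1])
```

Increasing `tau` (or scaling up `W`) in this sketch is a quick way to observe numerically the delay-induced oscillations discussed above.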
Ye et al. [290] introduced the constant discrete delays into (1), which is in the following form\begin{equation} \dot {u}_i (t) \!=\! - d_i (u_i (t))\bigg [a_i (u_i (t)) \!-\! \sum \limits _{k = 0}^N {\sum \limits _{j = 1}^n w_{ij}^k } g_j (u_j (t \!-\! \tau _k ))\bigg ]\qquad\end{equation}
Note that neural networks similar to (2) are only for the case of instantaneous transmission of signals. Due to the effect of signal delay, the following model has been widely considered as an extension of (2): \begin{align} \dot {u}_i (t)\!&=\!-\gamma _iu_i (t)\!+\!\!\sum _{j=1}^nw_{ij}g_j(u_j(t))\!+\!U_i\!+\!\!\sum _{j=1}^nw^1_{ij} g_j (u_j (t\!-\!\tau ))\notag\\\text{}\end{align}
In many real applications, signals that are transmitted from one point to another may experience a few network segments, which can possibly induce successive delays with different properties due to various network transmission conditions. Therefore, it is reasonable to combine them together, which leads to the following model\begin{align} \dot {u}_i (t) &= -\gamma _iu_i (t)+\sum \limits _{j=1}^nw_{ij}g_j(u_j(t))+U_i\notag\\ &\quad +\sum \limits _{j=1}^nw^1_{ij} g_j \left (u_j \left (t-\sum \limits _{k=1}^m\tau _k \right )\right ).\end{align}
The use of discrete time delay in the models of delayed feedback systems serves as a good approximation in simple circuits containing a small number of neurons. However, neural networks usually have a spatial extent due to the presence of a multitude of parallel pathways with a variety of axon sizes and lengths. There will be a distribution of propagation delays. In this case, the signal propagation is no longer instantaneous and cannot be modeled with discrete delays. It is desired to model them by introducing continuously distributed delays [24], [28], [42], [82], [114], [115], [142], [179], [193], [209], [255], [276], [305]. The extent to which the values of the state variable in the past affect their present dynamics is determined by a delay kernel. The case of constant discrete delay corresponds to a choice of the delay kernel being a Dirac delta function [29]. Nowadays, there are generally two types of continuously distributed delays in the neural network models. One is the finite distributed delay \begin{equation} \dot {u}_i (t) = -a_i (u_i (t)) + \sum \limits _{j = 1}^n {w_{ij} } \int _{t-\tau (t)}^t g_j (u_j (s))\mathrm {d}s\end{equation}
The other is the infinite distributed delay\begin{equation} \dot {u}_i (t) = -a_i (u_i (t))\!+\!\sum \limits _{j = 1}^n {w_{ij} } \int _{-\infty }^t K_{ij}(t-s) g_j (u_j (s))\mathrm {d}s\qquad\end{equation}
or, with the distributed delay acting inside the activation function\begin{equation} \dot {u}_i (t) = -a_i (u_i (t))\!+\!\sum \limits _{j = 1}^n {w_{ij} } g_j \bigg (\int _{-\infty }^t K_{ij}(t-s) u_j (s)\mathrm {d}s\bigg )\quad\end{equation}
When the delay kernel \(K_{ij}(\cdot )\) is chosen as the Dirac delta function \(\delta (\cdot -\tau _{ij})\), the above model reduces to one with constant discrete delays\begin{equation} \dot {u}_i (t) = -a_i (u_i (t)) + \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t-\tau _{ij})).\end{equation}
The following recurrent neural networks with a general continuously distributed delays were proposed and studied in [27], [36], [34], [29], [165], [178], [268], and [271]\begin{align} \dot u_i(t) &=-\gamma _i u_i(t)+\sum _{j=1}^n\int _0^{\infty } g_j(u_j(t-s))\mathrm {d}J_{ij}(s)\notag\\ &\quad +\sum _{j=1}^n\int _0^{\infty } g_j(u_j(t-\tau _{ij}(t)-s))\mathrm {d}K_{ij}(s)+U_i\qquad \quad\end{align}
By choosing either neuron states (the external states of neurons) or local field states (the internal states of neurons) as basic variables, a dynamic recurrent neural network is usually cast either as a static neural network model or as a local field neural network model [227], [284]. The recurrent backpropagating neural networks [102], the brain-state-in-a-box/domain type neural networks [257], and the optimization-type neural networks [73], [272] are modeled as a static neural network model described in the following matrix-vector form\begin{equation} \dot u(t)=-Au(t)+g(Wu(t)+U)\end{equation}
while the local field neural network model is described by\begin{equation} \dot u(t)=-Au(t)+Wg(u(t))+U.\end{equation}
A more general model, which contains both the static and the local field descriptions as special cases and includes a time-varying delay term, is\begin{equation} \dot u(t)=-Au(t)+W_0g(W_2u(t))+W_1g(W_2u(t-\tau (t)))\qquad\end{equation}
and, when the delay consists of two additive time-varying components, it becomes\begin{align} \dot u(t) \!= \!-Au(t)\!+\!W_0g(W_2u(t))\!+\!W_1g(W_2u(t\!-\!\tau _1(t)\!-\!\tau _2(t)))\notag\\\text{}\end{align}
There are many different factors considered in the neural network models, such as stochastic actions [44], reaction-diffusion actions [15], [164], [217], [230], [232], [238], [240], [269], [318], [326], high-order interactions [57], [262], impulse and switching effects [229], and so on. These effects are all superimposed on the elementary Hopfield neural networks or Cohen-Grossberg neural networks, which lead to many complex neural network models in different applications. There are many internal or external effects considered in practical neural networks besides many different types of delays.
B. Evolution of the Activation Functions
Many early results on the existence, uniqueness, and global asymptotic/exponential stability of the equilibrium point concern the case that activation functions are continuous, bounded, and strictly monotonically increasing. However, when recurrent neural networks are designed for solving optimization problems in the presence of constraints (linear, quadratic, or more general programming problems), unbounded activation functions modeled by diode-like exponential-type functions are needed to impose the constraints. Because of the differences between bounded and unbounded activation functions, extensions of the results with bounded activation functions to unbounded cases are not straightforward. Therefore, many different classes of activation functions have been proposed in the literature. Note that a suitable and more generalized activation function can greatly improve the performance of neural networks. For example, the property of the activation function is important to the capacity of neural networks. References [212] and [213] showed that the absolute capacity of an associative memory model can be remarkably improved by replacing the usual sigmoid activation function with a nonmonotonic activation function. Therefore, it is very significant to design a new neural network with a more generalized activation function. In recent years, many researchers have devoted their attention to attaining this goal by proposing new classes of generalized activation functions. Next, we will describe various types of activation functions used in the literature.
In the early research of neural networks, different types of activation functions are used, for example, threshold function [104], piecewise linear function [1], [151], [292], signum function [93], hyperbolic tangent function [2], hard-limiter nonlinearity [197], and so on. In the following, we are mainly concerned with Lipschitz-continuous activation functions and their variants.
The following sigmoidal activation functions have been used in [105], [106], [254], [261], and [290]:
\begin{align} g_{i}^{\prime }(\zeta )&=\mathrm {d}g_{i}(\zeta )/\mathrm {d}\zeta >0, \lim _{\zeta \rightarrow +\infty }g_{i}(\zeta )=1, \notag\\ \lim _{\zeta \rightarrow -\infty }g_{i}(\zeta )&=-1, \lim _{|\zeta |\rightarrow \infty }g_{i}^{\prime }(\zeta )=0\end{align}
where \(g_{i}(\cdot )\) is the activation function of the \(i\)th neuron, \(i=1,\dotsc , n\), and \(n\ge 1\) is the number of neurons. Obviously, it is differentiable, monotonic, and bounded.
The following activation functions have been used in [137], [265], [272], [301], and [315]:
\begin{equation} |g_i(\zeta )-g_i(\xi )|\le \delta _i|\zeta -\xi |\end{equation}
no matter whether the activation function is bounded or not. As pointed out in [265], this type of activation function in (17) is not necessarily monotonic and smooth.
The following activation functions have been employed in [16], [35], and [166]:
\begin{equation} 0<\frac {g_i(\zeta )-g_i(\xi )}{\zeta -\xi }\le \delta _i.\end{equation}
The following activation functions have been employed in [19], [52], [137], and [315]:
\begin{equation} 0\le \frac {g_i(\zeta )-g_i(\xi )}{\zeta -\xi }\le \delta _i.\end{equation}
The following activation functions are developed in [43], [135], [138], [139], [176], [177], [242], [294], and [304]:
\begin{equation} \delta _i^-\le \frac {g_i(\zeta )-g_i(\xi )}{\zeta -\xi }\le \delta _i^+.\end{equation}
As pointed out in [138], [139], and [177], the constants \(\delta _i^-\) and \(\delta _i^+\) here are allowed to be positive, negative, or zero, so this class of activation functions is more general than the preceding ones and need not be monotonic, bounded, or differentiable.
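As a quick numerical illustration of this last, most general sector condition, the following sketch estimates \(\delta _i^-\) and \(\delta _i^+\) for a sample activation function on a grid of points; the particular nonmonotonic function chosen here is hypothetical and is not taken from the cited references.

```python
import numpy as np

# Estimate sector bounds  delta^- <= (g(x) - g(y)) / (x - y) <= delta^+  on a grid.
def sector_bounds(g, lo=-5.0, hi=5.0, m=400):
    x = np.linspace(lo, hi, m)
    X, Y = np.meshgrid(x, x)
    mask = np.abs(X - Y) > 1e-8                    # exclude the diagonal x == y
    slopes = (g(X[mask]) - g(Y[mask])) / (X[mask] - Y[mask])
    return slopes.min(), slopes.max()

# A hypothetical nonmonotonic activation: its difference quotient changes sign,
# so delta^- comes out negative, which the earlier conditions would not allow.
g = lambda s: np.tanh(s) - 0.3 * np.sin(s)

d_minus, d_plus = sector_bounds(g)
print(f"estimated sector bounds: delta^- ~ {d_minus:.3f}, delta^+ ~ {d_plus:.3f}")
```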
One of the important things associated with activation functions is the existence and uniqueness of the equilibrium point of recurrent neural networks. Now, we will give a brief comment on this problem.
In general, for the case of bounded activation functions satisfying Lipschitz continuous conditions, the existence of the solution can be guaranteed by the existence theorem of ordinary differential equations [65], [73], [85], [182], [201], [207], [243], [261], [307].
For the unbounded activation function in the general form, the existence of the equilibrium point is established mainly on the basis of homeomorphism mapping [28], [73], [180], [307], [321], Leray-Schauder principle [191], and so on.
Another important issue associated with activation functions is whether the existence, uniqueness, and global asymptotic/exponential stability must be dealt with simultaneously in the stability analysis of recurrent neural networks. This question is often encountered, and there was no consistent viewpoint on it in the early days of the stability theory of neural networks. It leads to two classes of routines in the stability analysis of recurrent neural networks: 1) to directly present the global asymptotic/exponential stability results without the proof of the existence and uniqueness of the equilibrium point and 2) to give a complete proof of the existence, uniqueness, and global asymptotic/exponential stability. Clearly, this question must be clarified before the stability analysis of recurrent neural networks proceeds.
From a mathematical point of view, it is necessary to establish the existence (and, if applicable, uniqueness) of equilibrium point(s) to prove stability. However, according to different requirements on the activation function, one can adopt slightly different routines in the stability proof of the equilibrium point.
For the general case of the bounded activation functions, we can directly present the proof of the global asymptotic/exponential stability as it is well known that the bounded activation function always guarantees the existence of the equilibrium point [65], [73], [85], [261]. For the quasi-Lipschitz case, the existence of equilibrium point is also guaranteed as in the case of bounded activation functions. Therefore, it suffices to present the proof of the global asymptotic/exponential stability of the equilibrium point for recurrent neural networks with bounded activation functions, and the uniqueness of the equilibrium point follows directly from the global asymptotic/exponential stability [181].
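One standard way to see why bounded activations guarantee the existence of an equilibrium (a sketch of the usual fixed-point argument, not necessarily the specific proof given in the cited references) is as follows. An equilibrium of (5) solves \(\Gamma u=(W+W^1)g(u)+U\), i.e., it is a fixed point of the continuous map
\begin{equation*} H(u)=\Gamma ^{-1}\big [(W+W^1)g(u)+U\big ].\end{equation*}
If \(|g_i(\cdot )|\le G\) for all \(i\), then \(\|H(u)\|\le \|\Gamma ^{-1}\|\big (\|W+W^1\|G\sqrt {n}+\|U\|\big )=:r\) for all \(u\), so \(H\) maps the closed ball of radius \(r\) into itself, and Brouwer's fixed-point theorem yields at least one equilibrium; uniqueness then follows from the global asymptotic/exponential stability, as noted above.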
For the case of unbounded activation functions, on the contrary, one must provide the proof of the existence, uniqueness, and global asymptotic/exponential stability of the equilibrium point for the concerned neural networks simultaneously.
The activation functions listed above belong to the class of continuous functions. For more details on the relationships among globally Lipschitz continuous, partially Lipschitz continuous, and locally Lipschitz continuous activation functions, readers can refer to [22] and [280]. Some discontinuous activation functions also exist in practical applications. For example, in the classical Hopfield neural networks with graded response neurons [105], the standard assumption is that the activations are employed in the high-gain limit, where they closely approximate a discontinuous hard comparator function. Another important example concerns the class of neural networks introduced in [131] to solve linear and nonlinear programming problems, in which the constraint neurons have diode-like input-output activations. To guarantee satisfaction of the constraints, the diodes are required to possess a very high slope in the conducting region, i.e., they should approximate the discontinuous characteristic of an ideal diode. Therefore, the following activation functions are for the discontinuous case.
Discontinuous activation functions [68], [71], [72], [116], [173], [175], [183], [184], [216], [263]: Let \(g_i(\cdot )\) be a piecewise continuous nondecreasing function, and in every compact set of real space \(\mathcal {R}\), each \(g_i(\cdot )\) has only finite discontinuity points. Therefore, in any compact set in \(\mathcal {R}\), except some finite points \(\rho _k\), there exist finite right and left limits \(g_i(\rho ^+)\) and \(g_i(\rho _-)\) with \(g_i(\rho ^+) > g_i(\rho _-)\). In general, one assumes \(g_i(\cdot )\) to be bounded, i.e., there exists a positive number \(G>0\), such that \(|g_i(\cdot )|\le G\). Stability analysis of neural networks with discontinuous activation functions has drawn many researchers' attention, and many related results have been published in the literature since the independent pioneering works of Forti and Nistri [71] and Lu and Chen [183]. Hopfield neural networks with bounded discontinuous activations were first proposed in [71], in which the existence of the equilibrium point and stability were discussed, but the uniqueness of the equilibrium point and its global stability were not given. Instead, in [183], the Cohen-Grossberg neural networks with unbounded discontinuous activations were proposed, where the global exponential stability and the existence and uniqueness of the equilibrium point were given. Delayed neural networks with discontinuous activations were first proposed in [184]. Similar models were also proposed in [72]. It can be concluded that [72, Th. 1] is a special case of [184, Th. 1]. The almost periodic dynamics of networks with discontinuous activations was first investigated in [186], where the integro-differential systems were discussed. It includes discrete delays and distributed delays as special cases.
Therefore, activation functions have evolved from bounded to unbounded cases, from the continuous to the discontinuous case, and from the strictly monotonic to the nonmonotonic case. All these show the depth of the research on the stability theory of recurrent neural networks.
C. Evolution of Uncertainties in Connection Weight Matrix
For the deterministic and accurate connection weight matrix, a lot of stability results have been published since the 1980s. However, in the electronic implementation of recurrent neural networks, the connection weight matrix can be disturbed or perturbed by the external environment. Therefore, the robustness of neural networks against such perturbation should be considered [5], [18].
At present, there are several forms of uncertainties considered in the literature.
Uncertainties with the matched condition: Assume that the connection matrix is \(A\). Then, uncertainty \(\Delta A\) is described by\begin{equation} \Delta A=MF(t)N ~\mbox {with}~ F^T(t)F(t)\le I\end{equation}
or\begin{align}\Delta A=MF_0(t)N ~&\mbox {with}~ F_0(t)=(I-F(t)J)^{-1}F(t) \notag\\ & \mbox {and} ~F^T(t)F(t)\le I\end{align}
where \(M\), \(N\), and \(J\) are all constant matrices, \(J^TJ\le I\), and \(I\) is an identity matrix with compatible dimension. This kind of uncertainty is very convenient in the stability analysis based on the LMI method. Robust stability for neural networks with matched uncertainty (21) has been widely studied in [121], [246], and [310]. For the physical meaning of linear-fractional representation of uncertainty (22), readers can refer to [14], [59], [61], [322], and [324] for more details.
Interval uncertainty: In this case, the connection matrix \(A\) satisfies [168], [198]\begin{equation} A\in A_I=[\underline A, \overline A]=\{[a_{ij}]\colon \underline a_{ij}\le a_{ij}\le \overline a_{ij}\}.\end{equation}
If we let \(A_0=(\overline A+\underline A)\hbox {/}2\) and \(\Delta A=(\overline A-\underline A)\hbox {/}2\), then uncertainty (23) can be expressed as follows [137], [144], [187]:\begin{equation} \hspace {-15pt}A_J=\big \{A=A_0+\Delta A=A_0 +M_AF_AN_A~|~F_A^TF_A\le I\big \}\qquad\end{equation}
where \(M_A\), \(N_A\), and \(F_A\) are well defined according to some arrangement of elements in \(\underline A\) and \(\overline A\). Obviously, interval uncertainty (23) has been changed into the form of uncertainty with matched condition (21).
Absolute value uncertainties or unmatched uncertainties, where\begin{equation} \Delta A=(\delta a_{ij})\in \{ |\delta a_{ij}|\le \overline a_{ij}\}.\end{equation}
This kind of uncertainty has been studied in [290], while LMI-based results have been established in [286] and [288]. Note that for nonlinear neural systems with uncertainties (23) or (25), most robust stability results have been proposed based on algebraic inequalities, \(M\)-matrix, matrix measure, and so on in the early days of the theory of recurrent neural networks. Since 2007, LMI-based robust stability results for nonlinear neural systems with uncertainties (23) or (25) have appeared. This tendency implies that many different classes of robust stability results for uncertain neural systems will be proposed.
Polytopic type uncertainties\begin{equation} A\in \Omega ,~ \Omega =\Bigg \{A(\xi )=\sum _{k=1}^p\xi _kA_k,\sum _{k=1}^p\xi _k=1,~ \xi _k\ge 0\Bigg \}\qquad\end{equation}
where \(A_k\) is a constant matrix with compatible dimension and \(\xi _k\) is a time-invariant uncertainty. Robust stability for systems with this type of uncertainty has been studied in [77], [99], and [97].
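As a concrete illustration of how the interval description (23) is rewritten in the matched form \(A_0+M_AF_AN_A\), the following sketch builds one standard choice of \(M_A\), \(N_A\), and \(F_A\) from the entrywise half-widths and verifies the factorization numerically; the specific \(2\times 2\) interval matrix is hypothetical.

```python
import numpy as np

# Interval uncertainty A in [A_lower, A_upper]  ->  matched form A = A0 + M_A F_A N_A,
# with F_A diagonal and F_A^T F_A <= I.  One standard construction uses one column of
# M_A and one row of N_A per entry (i, j), each scaled by sqrt(h_ij),
# where h_ij = (a_upper_ij - a_lower_ij) / 2.
A_lower = np.array([[0.5, -1.2],
                    [0.3,  0.8]])
A_upper = np.array([[0.9, -0.8],
                    [0.7,  1.2]])

A0 = (A_upper + A_lower) / 2             # nominal matrix
H  = (A_upper - A_lower) / 2             # entrywise half-widths h_ij >= 0
n  = A0.shape[0]

M_A = np.zeros((n, n * n))
N_A = np.zeros((n * n, n))
for i in range(n):
    for j in range(n):
        k = i * n + j
        M_A[i, k] = np.sqrt(H[i, j])
        N_A[k, j] = np.sqrt(H[i, j])

# Any admissible perturbation |delta_a_ij| <= h_ij corresponds to F_A = diag(f_ij), |f_ij| <= 1.
F_A = np.diag(np.random.uniform(-1.0, 1.0, size=n * n))
Delta_A = M_A @ F_A @ N_A

assert np.allclose(Delta_A, np.diag(F_A).reshape(n, n) * H)   # entry (i, j) equals f_ij * h_ij
print("A0 + Delta_A lies in the interval:",
      bool(np.all(A0 + Delta_A >= A_lower - 1e-12) and np.all(A0 + Delta_A <= A_upper + 1e-12)))
```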
Note that the above uncertainties represent the parameter uncertainties, which are the reflection of the bounded changes of system parameters. Different types of uncertainties are equivalent in the sense of bounded perturbation. Meanwhile, different robust stability results generally require different mathematical analysis methods due to the different uncertainty descriptions in neural systems.
D. Evolution of Time Delays
Due to the different transmission channels and media, time delays are unavoidable in real systems [56], [110], [176], [228]. Time delays can be described in several different ways, depending on the approximation capability and the description complexity. For example, the simplest way is to assume that the delays are the same in all the transmission channels. A further relaxation is to assume that the delay within each channel is constant but may differ from one channel to another.
Discrete delays reflect the centralized effects of delays on the system, while distributed delays have effects on the neural networks at some duration or period with respect to the discrete point of delays. As for different classes of time delays, one can refer to Table II.
For the time-varying delay case, the derivative of the time delay was usually required to be less than one in the early days (i.e., slowly time-varying delay). Now, with the application of some novel mathematical methods, e.g., the free weight matrix method or the Finsler formula, the derivative of the time-varying delay can be allowed to be greater than one for some of the time (i.e., fast time-varying delay; it cannot always be greater than one, since otherwise the delayed argument \(t-\tau (t)\) would be nonincreasing and the delay would grow without bound).
In the previous methods, the time-varying delay
It can be seen from the existing references that only the deterministic time-delay case was considered, and the stability criteria were derived based only on the information of the variation range of the time delay. Actually, the time delay in some neural networks often exists in a stochastic fashion [294], [295], [317]. It often occurs in real systems that some values of the delays are very large, but the probabilities of the delays taking such large values are very small. In this case, if only the variation range of the time delay is employed to derive the stability criteria, the results may be conservative. Therefore, the challenging issue is how to derive some criteria for uncertain stochastic delayed neural networks that can exploit the available probability distribution of the delay and obtain a larger allowable variation range of the delay.
Recently, a class of neural networks with leakage delays was studied in [80], [140], and [147]. The leakage delay can be explained as follows. In general, a nonlinear system can be stated as follows:\begin{equation} \dot x(t)=-Ax(t)+f(t,x(t),x(t-\tau ))\end{equation}
When a delay \(\sigma \) appears in the negative feedback (leakage) term, the system becomes\begin{equation} \dot x(t)=-Ax(t-\sigma )+f(t,x(t),x(t-\tau ))\end{equation}
E. Evolution and Main Types of Lyapunov Approaches
As most of the stability criteria of recurrent neural networks are derived via the Lyapunov theory, they all have a certain degree of conservatism. Reducing the conservatism has been the topic of much research. Within the Lyapunov stability theory, the reduction can be achieved mainly in two phases: 1) choosing a suitable Lyapunov functional and 2) estimating its derivative. The choice of the Lyapunov functional is crucial for deriving less conservative criteria. Various types of Lyapunov functionals and estimation methods for the derivative of Lyapunov functionals have been constructed to study the stability of recurrent neural networks. In this section, we will mainly discuss the evolution and the main types of Lyapunov approaches and Lyapunov functions used in the analysis of global stability. For the estimation methods for the derivative of the Lyapunov functional, a brief review can be found in [135] and [304].
In [105], for the Hopfield neural network (2) under symmetry assumption on the interconnections, the following continuously differentiable function is used\begin{align} V_H(u(t))\!&=\!-\frac {1}{2}\sum _{i=1}^n\!\sum _{j=1}^ny_iw_{ij}y_j\!-\!\!\sum _{i=1}^nU_iy_i \!+\!\sum _{i=1}^n{\gamma _i}\!\!\int _{0}^{y_i}\!\!g_i^{-1}(s)\mathrm {d}s\notag\\\text{}\end{align}
The derivative of (29) along the trajectories of (2) is \begin{equation} \frac {\mathrm {d}V_H(u(t))}{\mathrm {d}t}=-\sum _{i=1}^n\left (\frac {\mathrm {d}}{\mathrm {d}y_i}g_i^{-1}(y_i)\right )\left (\frac {\mathrm {d}y_i}{\mathrm {d}t}\right )^2.\end{equation}
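For completeness, here is a short sketch of how (30) follows from (29) under the symmetry assumption \(w_{ij}=w_{ji}\). Writing \(y_i=g_i(u_i)\), so that \(u_i=g_i^{-1}(y_i)\), one has
\begin{align*} \frac {\partial V_H}{\partial y_i}&=-\sum _{j=1}^nw_{ij}y_j-U_i+\gamma _ig_i^{-1}(y_i)=-\dot u_i(t)\\ \frac {\mathrm {d}V_H(u(t))}{\mathrm {d}t}&=\sum _{i=1}^n\frac {\partial V_H}{\partial y_i}\frac {\mathrm {d}y_i}{\mathrm {d}t}=-\sum _{i=1}^n\dot u_i(t)\dot y_i(t)=-\sum _{i=1}^n\left (\frac {\mathrm {d}}{\mathrm {d}y_i}g_i^{-1}(y_i)\right )\left (\frac {\mathrm {d}y_i}{\mathrm {d}t}\right )^2\end{align*}
where the second equality in the first line uses (2), and the last step uses \(\dot u_i(t)=\big (\mathrm {d}g_i^{-1}(y_i)/\mathrm {d}y_i\big )\dot y_i(t)\).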
For the original Cohen-Grossberg network model (1) in [55], the following continuously differentiable function is used\begin{align} V_{CG}(u(t))&=\;\frac {1}{2}\sum _{i=1}^n\sum _{j=1}^ng_i(u_i(t))w_{ij}g_j(u_j(t))\notag\\ &\quad \; -\sum _{i=1}^n\int _{0}^{u_i(t)}a_i(s)\left (\frac {\mathrm {d}}{\mathrm {d}s}g_i(s)\right )\mathrm {d}s\end{align}
The derivative of (31) along the trajectories of (1) is as follows\begin{align} \frac {\mathrm {d}V_{CG}(u(t))}{\mathrm {d}t}&=-\sum _{i=1}^nd_i(u_i(t))\left (\frac {\mathrm {d}}{\mathrm {d}u_i(t)}g_i(u_i(t))\right )\notag\\ &\quad \times \left (a_i(u_i(t))-\sum _{j=1}^nw_{ij}g_j(u_j(t))\right )^2\!.\qquad\end{align}
From the stability proof of the above two classes of neural networks, we can find the following facts: 1) the above proof procedure shows the reason why the activation function is usually required to be a monotonically increasing function and 2) both functions (29) and (31) are not required to be positive definite, so they only guarantee convergence of the trajectories to the set of equilibrium points rather than global stability of a unique equilibrium point.
In the pioneering work of Cohen and Grossberg and Hopfield, the global limit property of (1) and (2) was established, which means that given any initial conditions, the solution of (1) [or (2)] will converge to some equilibrium points of the system. However, the global limit property does not give a description or even an estimate of the region of attraction for each equilibrium. In other words, given a set of initial conditions, one knows that the solution will converge to some equilibrium points, but does not know exactly to which one it will converge. In terms of associative memories, one does not know what initial conditions are needed to retrieve a particular pattern stored in the networks. On the other hand, in applications of neural networks to parallel computation, signal processing, and other problems involving the solutions of optimization problems, it is required that there is a well-defined computable solution for all possible initial states. That is, it is required that the networks have a unique equilibrium point, which is globally attractive. Earlier applications of neural networks to optimization problems have suffered from the existence of a complicated set of equilibrium points [254]. Thus, the global attractivity of a unique equilibrium point for the system is of great importance for both theoretical and practical purposes, and has been the major concern of [65], [69], [103], [201], [274], and [275].
In [69], using continuous energy functions such as those used for (1) and (2), some sufficient conditions were proved guaranteeing that a class of neural circuits were globally convergent toward a unique stable equilibrium at the expense that the neuron connection matrix must be symmetric and negative semidefinite. In practice, the condition of symmetry and negative semidefiniteness of the interconnection matrix is rather restrictive. The research on the global attractivity/stability of neural networks is mainly concentrated on the construction of Lyapunov functions on the basis of the Lyapunov stability theory. In [23], the following Lyapunov function was first constructed for the purely delayed system (5), i.e., (5) with \(w_{ij}=0\): \begin{equation} V(u(t))=\sum \limits _{i=1}^nu_i^2(t)+ \sum \limits _{i=1}^n\int _{t-\tau }^{t}u_i^2(s)\mathrm {d}s\end{equation}
In [275], a Lyapunov function is constructed for (4) with discrete delays \begin{align} V(u(t))=\sum _{i=1}^n\left (\frac {1}{\bar d_i}|u_i(t)|\!+\!\sum _{k=0}^N\sum _{j=1}^n|w_{ij}^k|\delta _j\int _{t-\tau _k}^t\!\!\!|u_j(s)|\mathrm {d}s\right )\notag\\\text{}\end{align}
and, for the case of continuously distributed delays, the following Lyapunov function has been used\begin{align} V(u(t))&=\sum _{i=1}^n\Bigg (q_i|u_i(t)|+\bar d_iq_i\sum _{j=1}^n|w_{ij}|\delta _j\notag\\ &\qquad \qquad \times \int _0^{+\infty }\!K_{ij}(\theta )\!\int _{t-s}^t\!|u_j(s)|\mathrm {d}s\mathrm {d}\theta \Bigg )\qquad\end{align}
Based on the above Lyapunov functions, some global stability results have been derived in the form of different algebraic inequalities, in which the absolute value operations are conducted on the interconnection weight coefficients. To derive LMI-based stability results, Lyapunov functions in quadratic form are generally adopted. In [185], the following Lyapunov function is constructed for (4) with a constant discrete delay \(\tau _1\): \begin{align} V(u(t))&=\;u^T(t)Pu(t)+\sum _{i=1}^nq_i\int _{0}^{u_i(t)}\frac {g_i(s)}{d_i(s)}\mathrm {d}s\notag\\ &\quad + \int _{t-\tau _1}^{t}g^T(u(s))Qg(u(s))\mathrm {d}s\end{align}
Another Lyapunov function of this type, which also exploits the amplification functions \(d_i(\cdot )\), is\begin{align} V(u(t))=\sum _{i=1}^nq_i\int _{0}^{u_i(t)}\frac {s}{d_i(s)}\mathrm {d}s+ \sum _{i=1}^np_i\int _{0}^{u_i(t)}\frac {g_i(s)}{d_i(s)}\mathrm {d}s\notag\\\text{}\end{align}
For delayed Hopfield-type networks of the form (5), commonly used quadratic Lyapunov functions include\begin{align} V(u(t))&=\;u^T(t)Pu(t)+2\sum _{j=1}^nq_j\int _{0}^{u_j(t)}g_j(s)\mathrm {d}s\notag\\ &\quad + \beta \sum _{j=1}^np_j\int _{t-\tau }^{t}g_j^2(u_j(s))\mathrm {d}s\end{align}
and\begin{align} V(u(t))&=\;u^T(t)Pu(t)+2\sum _{j=1}^nq_j\int _{0}^{u_j(t)}g_j(s)\mathrm {d}s\notag\\ &\quad + \int _{t-\tau }^{t}\big [u^T(s)Ru(s)\!+\!g^T(u(s))Qg(u(s))\big ]\mathrm {d}s~\qquad\end{align}
For (5) with a time-varying delay \(\tau (t)\) satisfying \(\tau _1\le \tau (t)\le \tau _2\), the following Lyapunov-Krasovskii functional has been constructed \begin{align} V(u(t))&=\;u^T(t)Pu(t)+2\sum _{j=1}^nq_j\int _{0}^{u_j(t)}g_j(s)\mathrm {d}s\notag\\ &\quad + \int _{t-\tau (t)}^{t}\big [u^T(s)Ru(s)+g^T(u(s))Qg(u(s))\big ]\mathrm {d}s\notag\\ &\quad +\sum _{i=1}^2\int _{t-\tau _i}^tu^T(s)\bar R_i u(s)\mathrm {d}s\notag\\ &\quad +\int _{-\tau _2}^0\int _{t+\theta }^t\dot u^T(s)Z_1 \dot u(s)\mathrm {d}s\mathrm {d}\theta \notag\\ &\quad +\int _{-\tau _2}^{-\tau _1}\int _{t+\theta }^t\dot u^T(s)Z_2\dot u(s)\mathrm {d}s\mathrm {d}\theta\end{align}
With the delay interval divided into \(m\) segments \([\tau _{j-1},\tau _j]\), \(j=1,\dotsc ,m\), the delay-partitioned Lyapunov-Krasovskii functional takes the form\begin{align} &\hspace {-17pt}V(u(t))=u^T(t)Pu(t)+2\sum _{j=1}^nq_j\int _{0}^{u_j(t)}g_j(s)\mathrm {d}s\notag\\ &\qquad + \sum _{j=1}^m\int _{-\tau _j}^{-\tau _{j-1}}\left [\begin{array}{c} u(t+s)\\ g(u(t+s))\end{array}\right ]^T\Upsilon _j\left [\begin{array}{c} u(t+s)\\ g(u(t+s))\end{array}\right ]\mathrm {d}s\notag\\ &\qquad +\sum _{j=1}^m(\tau _j-\tau _{j-1})\int _{-\tau _j}^{-\tau _{j-1}}\int _{t+s}^t\dot u^T(\theta )R_j\dot u(\theta )\mathrm {d}\theta \,\mathrm {d}s\notag\\\text{}\end{align}
In general, the equilibrium points reached by (1) and (2) are only locally stable if the continuous functions (29) and (31) are selected, respectively. To establish global stability, the Lyapunov function is required to be positive definite according to the Lyapunov stability theory. Therefore, positive definite continuous functions or energy functions are adopted in the recent literature; see (33)–(41) and their variations.
For recurrent neural networks with different kinds of actions, such as stochastic perturbations, neutral-type terms, distributed delays, reaction-diffusion terms, and so on, the construction of the Lyapunov-Krasovskii function is similar to the above ones, except that some special information is incorporated into the functions. It is the different ways of incorporating such information that make the construction of Lyapunov-Krasovskii functions more flexible and diverse than the classical functions (29) and (31).
F. Comparisons of Delay-Independent Stability Criteria and Delay-Dependent Stability Criteria
Generally speaking, there are two concepts concerning the stability of systems with time delays. The first one is called the delay-independent stability criteria that do not include any information about the size of the time delays and the change rate of time-varying delays [20], [185], [191], [307], [311]. For the systems with unknown delays, delay-independent stability criteria will play an important role in solving the stability problems. The second one is called the delay-dependent stability criteria, in which the size of the time delays and/or the change rate of time-varying delays are involved in the stability criteria [94], [309], [312].
Note that the delay-dependent stability conditions in the literature are mainly referred to as systems with discrete delays or finite distributed delays, in which the specific size of time delays and the change rate of time-varying delays can be measured or estimated. For the cases, such as infinite distributed delays and stochastic delays, in which there are no specific descriptions on the size of time delays and the change rate of time-varying delays, the concept of delay-independent/dependent stability criteria will still hold. If the relevant delay information (such as Kernel function information or the expectation value information of stochastic delays) is involved in the stability criteria, such results are also called delay dependent. Otherwise, they are delay independent.
Since the information on the size of delay and the change rate of time-varying delay is used, i.e., holographic delays, delay-dependent criteria may be less conservative than delay-independent ones, especially when the size of time delay is very small. When the size of time delay is large or unknown, delay-dependent criteria will be unusable, while delay-independent stability criteria may be useful.
G. Stability Results and Evaluations
At present, there are many different analysis methods to show the stability property of recurrent neural networks, such as Lyapunov stability theory [37], [182], [189], [203], [235], Razumikhin-type theorems [196], [264], nonsmooth analysis [224], [225], [293], ordinary differential equation theory [38], [149], [174], [326], LaSalle invariant set theory [55], [239], nonlinear measure method [226], [260], gradient-like system theory [66], [74], [258], comparison principle of delay differential systems [50], [153], and so on.
The expressions of the stability criteria are different due to different analysis and proof methods, such as
Stability criteria in the form of
Algebraic inequality results and LMI results have many different expressions and may involve many free parameters to be tuned, which often makes the stability results rather complex. For example, exceedingly complex LMI-based stability results are only useful for numerical purposes, and their theoretical meaning is lost. How to find simple and effective stability criteria is still a challenging research direction.
Furthermore, we can find that with the increase of additive terms in recurrent neural networks (e.g., discrete time delay terms, distributed delay terms, reaction-diffusion terms, and stochastic delay terms), the stability criteria will become more and more conservative. This phenomenon can be attributed to the additive complexity of the system structure. The conservativeness of the criteria will be further increased with the multiplicative complexity of the system structure, and a few related results have been published.
Brief Review of the Analysis Methods For Early Stability Results
Before presenting the main content of this paper, we will first review some algebraic methods for the stability analysis of neural networks, such as the methods based on the concept of LDS matrices, the \(M\)-matrix, the matrix measure, and algebraic inequalities.
In [11], the following condition was derived to ensure the global exponential stability of (2) in the case that \begin{equation} \eta =\lambda _{\min }(\Gamma )-\sigma _{\max }(W)>0\end{equation}
\begin{equation} \mu (LW-\Delta ^{-1}\Gamma L)<0\end{equation}
\begin{equation} L[(\Gamma -\alpha I)\Delta ^{-1}-W]+[(\Gamma -\alpha I)\Delta ^{-1}-W]^TL>0\qquad\end{equation}
\begin{equation} \delta _i^{-1}\gamma _iL_i-L_iw_{ii}>\sum _{j\ne i}^nL_j|w_{ij}|\end{equation}
In [35], the following global exponential stability/convergence criterion was also proposed for (2) [where \(\alpha >0\) denotes the exponential convergence rate]: \begin{align} \gamma _j\xi _j-\delta _j\left (\xi _jw_{jj}+\sum \limits _{i=1,i\ne j}^n\xi _i|w_{ij}|\right )&>\alpha \xi _j\notag\\ \xi _i(\gamma _i-\delta _iw_{ii})-\sum _{j=1,j\ne i}^n\xi _j\delta _j|w_{ij}|&>\alpha \xi _j\notag\\ \xi _i(\gamma _i\!-\!\delta _iw_{ii})\!-\!\frac {1}{2}\sum _{j=1,j\ne i}^n(\xi _i\delta _i|w_{ij}|+\xi _j\delta _j|w_{ji}|)&>\alpha \xi _i\qquad\end{align}
In [23], the purely delayed Hopfield networks (3) were studied, and the following criterion was established\begin{equation} \delta ||W||_2<1\end{equation}
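To indicate how a criterion of this type can arise from the Lyapunov function (33), here is a sketch (under the additional assumptions that the equilibrium has been shifted to the origin and that \(|g_j(\zeta )|\le \delta _j|\zeta |\) with \(\delta =\max _j\delta _j\); this is an illustrative derivation, not necessarily the exact argument of [23]). Along the trajectories of (3),
\begin{align*} \dot V(u(t))&=2u^T(t)\big [-u(t)+Wg(u(t-\tau ))\big ]+\|u(t)\|^2-\|u(t-\tau )\|^2\\ &\le -\|u(t)\|^2+\delta \|W\|_2\big (\|u(t)\|^2+\|u(t-\tau )\|^2\big )-\|u(t-\tau )\|^2\\ &=\big (\delta \|W\|_2-1\big )\big (\|u(t)\|^2+\|u(t-\tau )\|^2\big )\end{align*}
where the bound \(2u^T(t)Wg(u(t-\tau ))\le 2\delta \|W\|_2\|u(t)\|\,\|u(t-\tau )\|\le \delta \|W\|_2(\|u(t)\|^2+\|u(t-\tau )\|^2)\) has been used; hence \(\dot V(u(t))<0\) for nonzero states whenever \(\delta \|W\|_2<1\).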
A related criterion in terms of the spectral radius is\begin{equation} \rho (\Gamma ^{-1}|W|\Delta )<1\end{equation}
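Algebraic criteria of this kind are straightforward to check numerically. The following is a minimal sketch (with hypothetical \(\Gamma \), \(\Delta \), and \(W\), only for illustration) that evaluates the three conditions \(\lambda _{\min }(\Gamma )-\sigma _{\max }(W)>0\), \(\delta \|W\|_2<1\), and \(\rho (\Gamma ^{-1}|W|\Delta )<1\) discussed above.

```python
import numpy as np

# Hypothetical network data: decay rates gamma_i, activation slopes delta_i, weights W.
Gamma = np.diag([2.0, 1.8, 2.2])
Delta = np.diag([1.0, 0.9, 0.8])
W = np.array([[ 0.2, -0.5,  0.3],
              [ 0.4,  0.1, -0.6],
              [-0.3,  0.5,  0.2]])

# lambda_min(Gamma) - sigma_max(W) > 0
cond_a = np.min(np.diag(Gamma)) - np.linalg.norm(W, 2) > 0

# delta * ||W||_2 < 1, with delta taken as the largest activation slope
cond_b = np.max(np.diag(Delta)) * np.linalg.norm(W, 2) < 1

# spectral radius of Gamma^{-1} |W| Delta less than 1
rho = np.max(np.abs(np.linalg.eigvals(np.linalg.inv(Gamma) @ np.abs(W) @ Delta)))
cond_c = rho < 1

print("lambda_min - sigma_max > 0: ", cond_a)
print("delta * ||W||_2 < 1:        ", cond_b)
print("rho(Gamma^-1 |W| Delta) < 1:", cond_c)
```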
In [70], for (2) with a symmetric connection matrix \(W\), the following condition was established\begin{equation} \max _{1\le i\le n}\lambda _i(W)\le 0\end{equation}
\begin{equation} -W\in P_0 ~\mbox {matrix}\end{equation}
\begin{equation} \max _{1\le i\le n}\mbox {Re}\{\lambda _i(W)\}\le 0\end{equation}
In [158], it is shown that quasi-diagonally row-sum and column-sum dominance of \(-W\) implies the existence of a positive diagonal matrix \(P\) such that\begin{equation} PW+W^TP<0\end{equation}
\begin{equation} PW+W^TP\le 0\end{equation}
\begin{equation} D_2(W-D_1)+(W-D_1)^TD_2<0.\end{equation}
The above results mainly focused on (2) on the basis of
Now, we summarize the relationship among the LDS concept, the LMI, and the \(M\)-matrix condition. The LDS condition requires the existence of a positive diagonal matrix \(P\) such that\begin{equation} P(\Delta ^{-1}\Gamma -W)+(\Delta ^{-1}\Gamma -W)^TP>0\end{equation}
while the \(M\)-matrix condition on \(\Delta ^{-1}\Gamma -|W|\) is equivalent to the existence of a positive diagonal matrix \(P\) such that\begin{equation} P(\Delta ^{-1}\Gamma -|W|)+(\Delta ^{-1}\Gamma -|W|)^TP>0\end{equation}
It should be noted that in the early days of stability research of Hopfield neural networks, there is a direct approach to prove the existence of equilibrium and its exponential stability simultaneously, in which an energy function or Lyapunov function is not required [30], [74]. To the best of the authors' knowledge, it is [30] that first adopted such kind of unified method to prove the stability of Hopfield neural networks. Now, we give a short review for this direct method.
Forti and Tesi [74] proposed the so-called finite length of the trajectory by proving the following result. If the activation function is analytic, bounded, and strictly monotonically increasing, then any trajectory of (2) has finite length on \([0, +\infty )\).
On the other hand, in [30], the following lemma was given, which was used to derive the corresponding stability criterion: for some norm
Development of Proof Methods and Proof Skills in LMI-Based Stability Results
In this section, we will first state the superiority of the LMI method in the analysis and synthesis of dynamical systems. Then, we will show the main technical skills used in the derivation of LMI-based stability result.
Before we begin this section, we will present a simple introduction to the early analysis methods (e.g., the LDS and \(M\)-matrix methods).
A. Superiorities and Shortcomings of the LMI-Based Method
In the early days of the stability theory of neural networks, almost all the stability studies stem from the viewpoint of building a direct relationship among the physical parameters of neural networks. Therefore, stability criteria based on the matrix measure, the matrix norm, and the \(M\)-matrix were the first to be developed, as they relate the stability conditions directly to the network parameters.
The physical parameters of neural networks have some nonlinear redundancies, which can be expressed through constrained relationships with free variables. Stability criteria based on algebraic inequalities, e.g., the Young inequality, the Hölder inequality, the Poincaré inequality, and the Hardy inequality [92], have received a lot of attention in recent years and have improved the stability criteria significantly.
Although stability criteria based on the algebraic inequality method can be less conservative in theory, they are generally difficult to check due to the adjustable parameters involved, while one has no prior information on how to tune these variables. Since the LMI is regarded as a powerful tool to deal with matrix operations, LMI-based stability criteria have received attention from researchers. A good survey of LMI techniques in the stability analysis of delayed systems was presented in [283], and LMI methods in control applications were reviewed in [48] and [60]. LMI-based stability results are expressed in matrix form, relating the physical parameters of neural networks in a compact structure with elegant expressions.
The popularity of the LMI method is mainly due to the following reasons.
The LMI technique can be applied to a convex optimization problem that can be handled efficiently by resorting to the existing numerical algorithms for solving LMIs [12]. Meanwhile, LMI methods can easily solve the corresponding synthesis problems in control system design once the LMI-based stability (or other performance) conditions have been established, especially when state feedback is employed [283].
For neural networks without delay, the LDS method bridges the \(M\)-matrix method and the LMI method, and it is also a special case of the LMI form. For delayed neural networks, the core condition is either the LDS or the \(M\)-matrix condition. However, in the delayed case, both the LDS condition and the \(M\)-matrix condition lack suitable freedoms to be tuned and lead to much conservativeness of the stability criteria. In contrast, the LMI method can easily incorporate free variables into stability criteria and decrease the conservativeness. Correspondingly, many different kinds of stability results based on matrix inequalities have been proposed. In the sense of total performance evaluation of the desired results, LMI-based results are the most effective at present.
On the one hand, the LMI-based method is most suitable for models or systems described by state-space equations. On the other hand, many matrix theory-related methods can be incorporated into the LMI-based methods. Therefore, like algebraic inequality methods (which mainly deal with the scalar space or dot measure, and almost all scalar inequalities can be used in the algebraic inequality methods), many matrix inequalities can be used in the LMI-based method, e.g., the Finsler formula [148], the Jensen inequality [293], [309], [312], the Park inequality [218], and the Moon inequality [211]. Especially, the LMI-based method directly deals with the 2-D vector space, which extends the application space of algebraic inequality methods. Therefore, more inhibitory information on the system can be contained in LMI-based results than in the algebraic inequality methods.
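As an illustration of the kind of matrix inequality mentioned above, the Jensen inequality that is typically used to bound the double-integral terms in functionals such as (40) and (41) reads, for any matrix \(Z=Z^T>0\) and constant \(\tau >0\),
\begin{equation*} -\tau \int _{t-\tau }^{t}\dot u^T(s)Z\dot u(s)\mathrm {d}s\le -\left (\int _{t-\tau }^{t}\dot u(s)\mathrm {d}s\right )^TZ\left (\int _{t-\tau }^{t}\dot u(s)\mathrm {d}s\right )=-\big [u(t)-u(t-\tau )\big ]^TZ\big [u(t)-u(t-\tau )\big ]\end{equation*}
which replaces an integral term that cannot be handled directly within an LMI by a quadratic term in \(u(t)\) and \(u(t-\tau )\).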
Every method has its own shortcomings, and the LMI-based method is no exception. With the application of many different mathematical tools and techniques to the stability analysis of neural networks, the shortcomings of the LMI-based method have become apparent. Now, we list several main disadvantages as follows.
The exceedingly complex expression of the stability conditions is the main inefficiency of this method. The more complex the stability condition is, the less physical and theoretical meaning the stability condition has. In this case, the LMI-based method loses its original superiority over the classical algebraic inequality methods and becomes useful only for numerical purposes.
It will become more difficult to compare the different conditions among the LMI-based stability results. Therefore, the efficiency of the proposed conditions can only be compared by means of specific examples, and not in an analytical way.
The increase of slack variables can significantly increase the complexity of the computation, and it is necessary to make some efforts to reduce the redundancy of some of the slack variables. Therefore, how to develop new methods to further reduce the conservatism in the existing stability results while keeping a reasonably low computational complexity is an important issue to investigate in the future.
The exceedingly complex expression of the stability conditions also makes it difficult to study synthesis problems in neural control systems due to the cross terms of many slack variables. How to find simple and more effective LMI-based stability criteria is still a challenging topic.
B. Technical Skills Used in LMI-Based Stability Results for Delayed Neural Networks
Note that LMI-based approaches for the stability analysis of recurrent neural networks with time delay are based on the Lyapunov-Krasovskii function method. By incorporating different information of the concerned system into the construction of Lyapunov-Krasovskii function and using some technical skills in the proof procedure, several novel LMI-based stability results have been proposed to reduce the conservativeness of the stability results (e.g., for the case of fast time-varying delay, achieving the maximum upper bound of time delay given the network parameters, etc.). Now, we shall summarize some technical skills used in the stability analysis of delayed recurrent neural networks.
Free Weight Matrix Method: This method was first proposed in [99] and [277], and it was used to improve the delay-dependent stability of systems with a time-varying delay. One feature of the method is that it employs neither a model transformation nor bounding techniques for cross terms. Especially, it is a very powerful method to deal with the case of fast time-varying delay, i.e., the case in which the derivative of the delay may exceed one for some of the time.
The essence of the free weight matrix method is to add some free variables/matrices to an identity, which will improve the effectiveness of the stability results by involving some adjustable variables. For example, the following identity holds according to Newton-Leibniz formula\begin{equation} u(t)-u(t-\tau (t))-\int _{t-\tau (t)}^t\dot u(s)\mathrm {d}s=0\end{equation}
and, along the trajectories of the delayed network, the system equation itself gives the zero identity\begin{equation} \dot u(t)+\Gamma u(t)-Wg(u(t))-W_1g(u(t-\tau (t)))=0\end{equation}
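In a typical application (a generic sketch, with \(N_1\) and \(N_2\) denoting the free matrices to be determined; this is not a formulation quoted verbatim from [99] or [277]), these identities are multiplied by free matrices and added to the derivative of the Lyapunov-Krasovskii functional, e.g.,
\begin{equation*} 2\big [u^T(t)N_1+u^T(t-\tau (t))N_2\big ]\left [u(t)-u(t-\tau (t))-\int _{t-\tau (t)}^{t}\dot u(s)\mathrm {d}s\right ]=0\end{equation*}
so that \(N_1\) and \(N_2\) appear as additional decision variables in the resulting LMI without changing the value of \(\dot V\).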
In short, the contribution of the free weight matrix method is that, by involving more freedoms (or, equivalently, by using more relations of the system), the conservativeness of stability criteria is decreased significantly in the sense of total performance evaluation. Certainly, it is also a decrease of conservativeness in the sense that the restriction on the change rate of the time-varying delay is relaxed from being less than one to being allowed to exceed one for some of the time.
Matrix Decomposition Method: This method is mainly used to deal with the stability problem for recurrent neural networks without delay. For example, for the case of Hopfield and Cohen-Grossberg neural networks without delay, these kinds of matrix decomposition methods have been used in [107], [108], and [239]. In [107], the connection matrix
Delay-Matrix Decomposition Method: Since LMI is a very powerful method for analyzing the stability of many classes of neural networks with different delays, it is natural to build some LMI-based stability criteria for neural networks with different multiple delays
It is worth pointing out that the delay-matrix decomposition method proposed in [308]–[312] mainly focused on systems with multiple discrete delays
Delay Decomposition/Partition Approach: Time delay is one of the most important parameters in delayed neural networks. Since interconnection weight matrices have been sufficiently explored in the development of the neural network stability theory, especially with the occurrence of the free weight matrix method, it seems that the stability criteria have reached the point where little space is left in the connection weights that can be used to further decrease the conservativeness of stability results. In this case, delay-dependent stability criteria may have more space for improvement than delay-independent stability criteria, because the information of the time delay is not sufficiently explored yet. In the previous stability analysis methods, the time delay interval is treated as a whole, whereas the delay decomposition approach divides it into several subintervals and exploits the information of each subinterval.
The essence of the delay decomposition approach is to enlarge/augment the state space and introduce many adjustable variables, which yields a larger augmented state space (i.e., more system dimensions) than the original system. A challenging topic in the delay decomposition approach is how to determine the number of subintervals and the subinterval sizes so as to achieve the optimal upper bound on the time delay. At present, by combining the delay decomposition approach with augmented Lyapunov-Krasovskii functions [96], [134], [135], [297], some new stability results for delayed neural networks have been published, and the conservativeness of the stability results is decreased at the expense of more unknown parameters or matrices.
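As a minimal illustration of the idea (the functional and the matrices \(Q_k\) below are chosen only for exposition), dividing the constant delay interval \([0,\tau ]\) into \(m\) equal subintervals leads to Lyapunov-Krasovskii terms of the form\begin{equation*} V_{\mathrm {dd}}(t)=\sum _{k=1}^{m}\int _{t-\frac {k}{m}\tau }^{t-\frac {k-1}{m}\tau }u^T(s)Q_ku(s)\mathrm {d}s,\qquad Q_k>0\end{equation*}
where each subinterval is weighted by its own positive definite matrix \(Q_k\); a finer partition introduces more decision variables and typically yields a larger admissible delay bound at the cost of a larger LMI.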
Descriptor System Method: This is a universal transformation method that transforms a normal differential system into a descriptor-like system and then uses the analysis tools of descriptor systems to study the normal differential system. Therefore, the dimensions of the original differential system are enlarged from
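As a brief illustration (a sketch of the standard transformation, with the auxiliary variable \(y\) introduced only for exposition), the delayed network can be rewritten by treating \(\dot u\) as an extra state:\begin{align*} \dot u(t)&=y(t)\\ 0&=-y(t)-\Gamma u(t)+Wg(u(t))+W_1g(u(t-\tau (t)))\end{align*}
so that the augmented vector \(\mathrm {col}(u(t),y(t))\) obeys a descriptor system \(E\dot x(t)=\cdots\) with the singular matrix \(E=\mathrm {diag}(I,0)\), and Lyapunov-Krasovskii functions can then be constructed for the augmented system.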
Splitting Interval Matrix Method: This method is devoted to the robust stability analysis of neural networks with interval uncertainty, i.e., the uncertain connection matrix
Stability Problems for Two Classes of Cohen-Grossberg Neural Networks
For models similar to Cohen-Grossberg neural networks (4), we will discuss the stability problems based on the following different assumptions.
Assumption 5.1[189], [290], [309], [311]:
The amplification function \(d_i(\cdot)\) satisfies \begin{equation} 0<\underline d_i\le d_i (\zeta )\le \overline d_i\notag\end{equation}
Assumption 5.2 [6], [20], [28], [38]:
The function \(a_i(\cdot)\) satisfies \begin{equation} \frac {a_i(\zeta )-a_i(\xi )}{\zeta -\xi }\ge \gamma _i\notag\end{equation}
Assumption 5.3 [307], [315], [319], [326]:
The activation function \(g_i(\cdot)\) satisfies the Lipschitz condition \begin{equation} |g_i(\zeta )-g_i(\xi )|\le \delta _i |\zeta -\xi |\notag\end{equation}
Assumption 5.4 [6], [181], [189], [269], [311]:
The activation function \(g_i(\cdot)\) satisfies, for any \(\zeta\ne\xi\), \begin{equation} 0\le \frac {g_i(\zeta )-g_i(\xi )}{\zeta -\xi }\le \delta _i\notag\end{equation}
Assumption 5.5 [169], [182], [185], [191], [269]:
The amplification function \(d_i(\cdot)\) is nonnegative and satisfies, for any \(\epsilon>0\), \begin{equation} \int _0^{\epsilon }\frac {1}{d_i(s)}\mathrm {d}s=+\infty \notag\end{equation}
Note that the differences between Assumptions 5.3 and 5.4 can be found in [270]. The difference between Assumptions 5.1 and 5.5 lies in the fact that the amplification function in Assumption 5.1 is strictly positive (bounded away from zero), while the function in Assumption 5.5 is only required to be nonnegative; for instance, \(d_i(s)=s\) satisfies Assumption 5.5 but not Assumption 5.1. Moreover, if the amplification function
Based on the above assumptions, we now show the relationship between the original Cohen-Grossberg neural networks (1) and the delayed Cohen-Grossberg neural networks (4).
The differences between (1) and (4) are as follows.
The amplification functions are different. Assumption 5.5 is required in (1), while Assumption 5.1 is required in (4).
Due to the different assumptions on amplification functions in (1) and (4), the Hopfield model (2) is only a special case of (4) with constant amplification functions, while (1) does not include Hopfield model (2).
The state curves of Cohen-Grossberg [55] neural networks with Assumption 5.5 are all nonnegative under positive initial conditions, while the state curves of Cohen-Grossberg neural networks with Assumption 5.1 may be positive, negative, or their mixture under any forms of initial conditions [290].
The requirements for the function \(a_i(u_i(t))\) in (1) and (4) are different: it is monotonically increasing and required to be radially unbounded in (4), while in (1) it may vary according to the different choice of positivity conditions.
The connection coefficients in (1) are all positive, while the connection coefficients in (4) can be of any sign.
Model (1) often represents biological systems, reflecting the survival and extinction of species. In contrast, (4) stems from engineering applications and, in a similar manner to the Hopfield neural network model, can be used in fields such as optimization, decision making, and learning [91], [208], [252], [253], as well as signal processing [327].
The similarities between (1) and (4) are as follows: 1) the model structure in mathematical description is the same and 2) the symmetry requirements of the interconnection matrices are the same in the early days of neural network stability theory. However, the symmetry of interconnection matrices is not required in this research.
Due to the huge amount of related literature, it is not easy to list all the references. To outline clearly the research progress of the stability theory of neural networks, we mainly discuss two classes of neural network models, i.e., the original Cohen-Grossberg neural network model (1) and the Cohen-Grossberg-type neural network model (4). Based on these two primitive models, we will pinpoint the main achievements obtained in a few relevant papers, whereas several other results are presented as corollaries or minor improvements.
The next subsections are organized as follows. Section V-A will focus on the stability of the original Cohen-Grossberg neural network model (1), and some improvements surrounding this model will be discussed appropriately. Sections V-B–V-D will concentrate on the Cohen-Grossberg type neural network model (4) and review the progress in different aspects.
A. Stability of Cohen-Grossberg Neural Networks With Nonnegative Equilibrium Points
In this section, we will focus on five papers to describe the progress on the stability of the original Cohen-Grossberg neural networks (1). Some related references are used to complement the progress of stability analysis at different levels.
The main contribution of [55] is to reveal the essential role of symmetry in the dynamics of complex systems and to establish the stability criterion for (1). Since then, many different stability results have been proposed for (1) under Assumption 5.5 and its variants.
For Cohen-Grossberg neural networks (1), Ruan [239] proposed the following sufficient global stability condition based on LaSalle's invariance principle: if the connection matrix \begin{equation} W=DS\end{equation}
The following Lotka-Volterra model of competing species:\begin{equation} \dot u_i(t)=G_iu_i\left (1-\sum _{k=1}^nH_{ik}u_k(t)\right )\end{equation}
With respect to time delays and symmetric connection weights, [169], [183], [185], [191], [311], and [312] improved the conditions in [55], and the stability of nonnegative/positive equilibrium points for the corresponding Cohen-Grossberg neural networks with delays has been studied.
For the reaction-diffusion Cohen-Grossberg neural networks described by \begin{align} \frac { \partial {u}_i (t)}{\partial t} &= \sum _{k=1}^m\frac {\partial }{\partial x_k}\left (D_{ik}\frac {\partial u_i(t,x)}{\partial x_k}\right )- d_i (u_i (t,x)) \notag\\ &\quad \times \bigg [ a_i (u_i (t,x)) - \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t,x)) \bigg ]\end{align}
\begin{equation} -PW-(PW)^T\ge 0~(\mbox {or}>0)\end{equation}
Note that, despite the symmetry restriction on the matrix
In the case that the activation function satisfies a quasi-Lipschitz condition \begin{align} \dot {u}_i (t) &= - d_i (u_i (t))\bigg [ a_i (u_i (t)) - \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))\notag\\ &\qquad ~-\sum \limits _{j = 1}^n {w_{ij}^1 } g_j (u_j (t-\tau _{ij}(t)))\bigg ],~~i=1,\dotsc ,n.\end{align}
\begin{equation} \Gamma -(|W|+|W_1|)\Delta\end{equation}
For (63), [311] required \begin{align} 2L_i\gamma _i&-\sum _{j=1}^n(L_iw_{ij}\delta _j+L_jw_{ji}\delta _i)\notag\\ & -\sum _{j=1}^n\left (L_iw_{ij}^1\delta _j+L_jw_{ji}^1\delta _i\right )>0\end{align}
\begin{equation} 2\Gamma -(|W|+|W_1|)\Delta -\Delta (|W|+|W_1|)^T\end{equation}
System (63) with \begin{equation} P(\Gamma \Delta ^{-1} -W-W_1)+(\Gamma \Delta ^{-1} -W-W_1)^TP>0\end{equation}
\begin{equation} \left [ {{\begin{array}{cc} 2P\Gamma \Delta ^{-1}\!-PW-(PW)^T\!-Q &~~ -PW_1 \\ -(PW_1)^T &~~ Q \end{array} }} \right ] >0\end{equation}
Using the Schur complement lemma [12], (68) is equivalent to the following form\begin{align} 2P\Gamma \Delta ^{-1}-PW-(PW)^T-Q-PW_1Q^{-1}(PW_1)^T>0.\end{align}
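As a quick numerical illustration of this equivalence (a sketch only; the matrices below are arbitrary two-dimensional examples and are not taken from any cited reference), one can verify that the block matrix in (68) and the Schur complement in (69) are positive definite, or not, simultaneously:
\begin{verbatim}
import numpy as np

# Illustrative 2-neuron data (assumed for this sketch only).
n = 2
Gamma = np.diag([2.0, 3.0])            # positive diagonal rate matrix
Delta = np.diag([0.5, 0.8])            # Lipschitz constants of the activations
W  = np.array([[ 0.3, -0.2], [ 0.1,  0.4]])
W1 = np.array([[ 0.2,  0.1], [-0.1,  0.3]])
P  = np.diag([1.0, 1.5])               # positive definite P
Q  = np.eye(n)                         # positive definite Q

A = 2 * P @ Gamma @ np.linalg.inv(Delta) - P @ W - (P @ W).T - Q
B = -P @ W1
block = np.block([[A, B], [B.T, Q]])                  # block matrix of (68)
schur = A - (P @ W1) @ np.linalg.inv(Q) @ (P @ W1).T  # left-hand side of (69)

def is_pd(M):
    # eigenvalues of the symmetric part, to be robust to rounding
    return bool(np.all(np.linalg.eigvalsh((M + M.T) / 2) > 0))

# With Q > 0, the two tests always agree (Schur complement lemma).
print(is_pd(block), is_pd(schur))
\end{verbatim}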
For the following Cohen-Grossberg neural networks with finite distributed delays:\begin{align} &\hspace {-29pt}\dot {u}_i (t) = - d_i (u_i (t))\bigg [ a_i (u_i (t)) - \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))\notag\\ &\qquad \qquad \quad ~~-\sum \limits _{k = 1}^N\sum \limits _{j = 1}^n {w_{ij}^k } g_j (u_j (t-\tau _{kj}(t)))\notag\\ &\qquad \qquad \quad ~~-\sum \limits _{l = 1}^r\sum \limits _{j = 1}^n b_{ij}^l \int _{t-d_l}^tg_j (u_j(s))\mathrm {d}s\bigg ].\end{align}
\begin{align} -2P\Gamma \Delta ^{-1}&+PW+(PW)^T+\sum _{i=1}^N(PW_iQ_i^{-1}W_i^TP+Q_i)\notag\\ &+\sum _{l=1}^r(d_lY_l+d_lPB_lY_l^{-1}B_l^TP)<0\end{align}
From the above results, we can see that the core condition is (62) for neural networks without delay. With the addition of delayed terms, the core condition is expanded from (67) to (71). Therefore, different results are derived for different network models, and they become more complex while retaining a similar LMI form. It is in this LMI form that (71) unifies many LMI-based stability results in the literature.
In the following three subsections, we will discuss the Cohen-Grossberg neural networks with mixed equilibrium point, i.e., the amplification function
B. Stability of Cohen-Grossberg Neural Networks via \(M\)-Matrix Methods or Algebraic Inequality Methods
In this section, we will focus on 10 papers to describe the progress on stability analysis of the Cohen-Grossberg neural networks (4). Some related references are used to complement the progress of stability analysis at different levels.
Assume that matrix \begin{equation*}W^e=\left (\sum _{k=1}^N w_{ij}^k\right )\end{equation*}
\begin{equation} \sum _{k=1}^N(\tau _k\beta \|W^k\|)<1\end{equation}
Under Assumptions 5.2 and 5.3, Lu and Chen [182] studied the global stability of (4) with \begin{equation} \Gamma \Delta ^{-1}-W\end{equation}
\begin{equation} P(\Gamma \Delta ^{-1}-W)+(\Gamma \Delta ^{-1}-W)^TP>0\end{equation}
Under Assumptions 5.2 and 5.3 and the positive lower boundedness of the amplification function, the following system\begin{align} \dot {u}_i (t) = - d_i (u_i (t))\bigg [ a_i (u_i (t)) - \sum \limits _{j = 1}^n {w_{ij}^1 } g_j (u_j (t-\tau _{ij}(t)))\bigg ]\end{align}
\begin{equation} \mbox {det}(\Gamma -W_1K)\ne 0\end{equation}
\begin{equation} \Gamma \Delta ^{-1}-|W_1|\end{equation}
Under Assumptions 5.1–5.3, the result in [28] requires \begin{equation} M_0=\underline D\Gamma -\sum _{k=0}^N|W_k|\Delta \overline D\end{equation}
Note that the analysis method in [28] can also be applied to the following networks\begin{align} \dot {u}_i (t) &= -d_i(u_i(t))\bigg [a_i (u_i (t)) - \sum _{k=0}^N\sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t - \tau _{ij}^k))\bigg ]\\ \dot {u}_i (t) &= -d_i(u_i(t))\bigg [a_i (u_i (t))- \sum \limits _{j = 1}^n {w_{ij} }\notag\\ &\qquad\qquad \qquad \times \int _{-\infty }^t K_{ij}(t-s) g_j (u_j (s))\mathrm {d}s\bigg ]\\ \dot {u}_i (t) &= -d_i(u_i(t))\bigg [a_i (u_i (t))- \sum \limits _{j = 1}^n {w_{ij} } g_j\notag\\ &\qquad\qquad \quad ~~ \times \left (\int _{-\infty }^t K_{ij}(t-s) u_j (s)\mathrm {d}s\right )\bigg ]\end{align}
\begin{equation} M_{0}^\prime =\underline D\Gamma -|W|\Delta \overline D\end{equation}
\begin{equation} M_{0}^{\prime \prime }=\Gamma -|W|\Delta\end{equation}
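M-matrix conditions of this type can be checked numerically using the standard characterization that a matrix with nonpositive off-diagonal entries is a nonsingular M-matrix if and only if all of its eigenvalues have positive real parts. The following is a minimal sketch, with all data chosen arbitrarily for illustration (none of it comes from the cited references):
\begin{verbatim}
import numpy as np

def is_nonsingular_M_matrix(M, tol=1e-9):
    """Nonpositive off-diagonal entries and all eigenvalues with
    positive real part (equivalently, all leading principal
    minors positive)."""
    off = M - np.diag(np.diag(M))
    if np.any(off > tol):
        return False
    return bool(np.all(np.linalg.eigvals(M).real > tol))

# Illustrative 2-neuron data (assumed for this sketch only).
Gamma = np.diag([2.0, 3.0])                 # rates gamma_i
Delta = np.diag([0.5, 0.8])                 # Lipschitz constants delta_i
W = np.array([[0.4, -0.6], [0.3, 0.5]])     # connection matrix

M_test = Gamma - np.abs(W) @ Delta          # a condition of the form Gamma - |W|Delta
print(is_nonsingular_M_matrix(M_test))      # True for this data
\end{verbatim}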
Under Assumptions 5.1–5.3 and the boundedness of the activation function, for the following Cohen-Grossberg neural networks:\begin{align} \dot {u}_i (t) &= - d_i (u_i (t))\bigg [ a_i (u_i (t)) - \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))\notag\\ &\quad\qquad \qquad \qquad ~-\sum \limits _{j = 1}^n {w_{ij}^1 } g_j (u_j (t-\tau _{ij}))\bigg ]\end{align}
\begin{equation} M_1=\Gamma -W^*\Delta -|W_1|\Delta\end{equation}
\begin{align} M_1^{\prime }&=\zeta _i\gamma _i-\zeta _iw_{ii}\delta _i-\sum _{j=1,j\ne i}^n\zeta _j|w_{ji}|\delta _i-\sum _{j=1}^n\zeta _j|w_{ji}^1|\delta _i>0\\ M_1^{\prime \prime }&=\zeta _i\gamma _i-\zeta _iw_{ii}\delta _i-\sum _{j=1,j\ne i}^n\zeta _j|w_{ij}|\delta _j-\sum _{j=1}^n\zeta _j|w_{ij}^1|\delta _j>0\\ M_1^{\prime \prime \prime }&=\zeta _i\gamma _i-\zeta _iw_{ii}\delta _i-\frac {\sum _{j=1,j\ne i}^n(\zeta _j|w_{ji}|\delta _i+\zeta _i|w_{ij}|\delta _j)}{2}\notag\\ &\quad -\frac {\sum _{j=1}^n(\zeta _j|w_{ji}^1|\delta _i+\zeta _i|w_{ij}^1|\delta _j)}{2}>0\end{align}
We should note that (86)–(88) are equivalent to
For the following Cohen-Grossberg neural networks with reaction-diffusion term:\begin{align} &\hspace {-45pt}\frac { \partial {u}_i (t,x)}{\partial t} = \sum _{k=1}^m\frac {\partial }{\partial x_k}\Big (D_{ik}\frac {\partial u_i(t,x)}{\partial x_k}\Big )- d_i (u_i (t,x))\notag\\ &\quad \times \bigg [ a_i (u_i (t,x)) - \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t,x))\notag\\ &\qquad ~~-\sum \limits _{j = 1}^n {w_{ij}^1 } f_j (u_j (t-\tau _{ij}(t),x))\bigg ]\end{align}
\begin{equation} M_3=\underline d_i\gamma _i-\sum _{j=1}^n\overline d_j|w_{ji}|\delta _i-\sum _{j=1}^n\overline d_j|w_{ji}^1|\delta _i^0>0.\end{equation}
\begin{equation} M_3^{\prime }=\underline D \Gamma -|W|\overline D\Delta - |W_1|\overline D\Delta ^0\end{equation}
For stochastic Hopfield neural networks (89) with constant delays \begin{align} \mathrm {d} u_i(t,x)&= \sum _{k=1}^m\frac {\partial }{\partial x_k}\Big (D_{ik}\frac {\partial u_i(t,x)}{\partial x_k}\Big )\notag\\ &\quad - \bigg [ a_i (u_i (t,x)) - \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t,x))\notag\\ &\qquad -\sum \limits _{j = 1}^n {w_{ij}^1 } f_j (u_j (t-\tau _{ij},x))\bigg ]\mathrm {d}t\notag\\ &\quad +\sum \limits _{j = 1}^n {\sigma _{ij}(u_j(t,x)) }\mathrm {d}\omega _{j}(t)\end{align}
For the deterministic case of (92), (91) with \begin{align} M_4&=\Gamma -|W|\Delta - W_1\Delta ^0-\overline C\\ M_4^{\prime }&=\Gamma -|W|\Delta - W_1\Delta ^0-\tilde C\end{align}
\begin{align*} \overline C&=\mbox {diag}(\overline c_1, \dotsc ,\overline c_n)\\ \overline c_i&=-\gamma _i+\sum \limits _{j = 1}^n {w_{ij}\delta _j}+\sum \limits _{j = 1}^n {w^1_{ij}\delta ^0_j }+\sum \limits _{j = 1}^n {L^2_{ij} }\ge 0\\ \tilde C&=\mbox {diag}(\tilde c_1, \dotsc ,\tilde c_n)\\ \tilde c_i&=0.5\sum \limits _{j = 1}^n {L^2_{ij} }+K_1\left (\sum \limits _{j = 1}^n {L^2_{ij} }\right )^{1/2}\ge 0\end{align*}
Obviously,
For the reaction-diffusion Hopfield neural networks (89) with continuously distributed delays \begin{align} \frac { \partial {u}_i (t)}{\partial t} &= \sum _{k=1}^m\frac {\partial }{\partial x_k}\left (D_{ik}\frac {\partial u_i(t,x)}{\partial x_k}\right )\notag\\ &\quad ~- \bigg [ a_i (u_i (t,x)) -\sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t,x))\notag\\ &\qquad \quad -\sum \limits _{j = 1}^n {w_{ij}^1 } \int _{-\infty }^tK_{ij}(t-s)g_j (u_j (s,x))\mathrm {d}s\bigg ]\qquad\end{align}
\begin{align} M_0^{\prime \prime \prime }=\Gamma -|W|\Delta -|W_1|\Delta\\ M_0^{\prime \prime \prime \prime }=\Gamma -W^+\Delta -|W_1|\Delta\end{align}
For the following systems with distributed delays:\begin{align} \dot u_i(t)&= - \bigg [ a_i (u_i (t)) -\sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))\notag\\ &\qquad ~~-\sum \limits _{j = 1}^n {w_{ij}^1 } f_j (u_j (t-\tau _{ij}(t)))\notag\\ &\qquad ~~ -\sum \limits _{j = 1}^n {w_{ij}^2 } \int _0^{\infty }K_{ij}(s)h_j(u_j (t-s))\mathrm {d}s\bigg ]\end{align}
\begin{equation} \int _0^{\infty }e^{\lambda s}K_{ij}(s)\mathrm {d}s=k_{ij}(\lambda )>0\end{equation}
\begin{align} \left [\lambda I-\Gamma +|W|\Delta +e^{\lambda \tau }|W_1|\Delta ^0+(\rho (\lambda )\otimes |W_2|\Delta ^1)\right ]\zeta <0\end{align}
\begin{equation} \Gamma -|W|\Delta -|W_1|\Delta ^0-|W_2|\Delta ^1\end{equation}
For the following neural networks with finite distributed delays:\begin{align} \frac { \partial {u}_i (t)}{\partial t} &= \sum _{k=1}^m\frac {\partial }{\partial x_k}\left (D_{ik}\frac {\partial u_i(t,x)}{\partial x_k}\right )- \bigg [ a_i (u_i (t,x)) \notag\\ &\quad -\sum \limits _{j = 1}^n {w_{ij}^1 } f_j \left (\int _0^{{T}}K_{ij}(s)u_j (t-s,x)\mathrm {d}s\right )\bigg ]\qquad\end{align}
For the neutral-type Cohen-Grossberg neural networks with constant delays \begin{align} &\hspace {-40pt}\dot {u}_i (t) +\sum \limits _{j = 1}^n {e_{ij} } \dot u_j (t-\tau _j)\notag\\ &= - d_i (u_i (t))\bigg [ a_i (u_i (t))- \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t)) \notag\\ &\qquad\qquad \qquad -\sum \limits _{j = 1}^n {w_{ij}^1 } g_j (u_j (t-\tau _{j}))\bigg ]\end{align}
\begin{gather} 0\le \|E\|<1\notag \\ \delta _M p_w(1+\|E\|) +\delta _M r_w(1+\|E\|)+q_w<\min\nolimits _{1\le i\le n}\{\underline d_i \gamma _i\}\end{gather}
\begin{equation} \delta _M(\|W\|+\|W_1\|)\max _i \{\overline d_i \} \le \min _i\{\underline d_i \gamma _i\}.\end{equation}
\begin{equation} \delta _M(\|W\|+\|W_1\|)\le \min _i\{\gamma _i\}.\end{equation}
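Norm-type conditions such as the two above are the easiest to verify numerically, since they only involve spectral norms and the extreme values of the parameters. Below is a minimal sketch with hypothetical data (not taken from any cited reference):
\begin{verbatim}
import numpy as np

# Hypothetical network data, for illustration only.
W  = np.array([[0.4, -0.6], [0.3, 0.5]])
W1 = np.array([[0.2,  0.1], [-0.1, 0.3]])
gamma = np.array([2.0, 3.0])     # self-feedback rates gamma_i
delta_M = 0.8                    # largest Lipschitz constant of the activations

lhs = delta_M * (np.linalg.norm(W, 2) + np.linalg.norm(W1, 2))
print(lhs <= gamma.min())        # norm-type stability test
\end{verbatim}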
From the above results, we can see that the core condition is (73) for neural networks without delay, or similarly (77) for purely delayed neural networks. With the increasing complexity of networks, the core condition is expanded from (73) or (77) to (101). Note that the
C. Stability of Cohen-Grossberg Neural Networks via Matrix Inequality Methods or Mixed Methods
In this section, we will focus on four papers to describe the stability analysis of Cohen-Grossberg neural networks (4). Some related references are used to complement the progress at different levels.
In this section, the activation function is assumed to satisfy Assumption 5.4 if there is no other declaration.
For the following Cohen-Grossberg neural networks:\begin{align} \dot {u}_i (t) &= - d_i (u_i (t))\bigg [ a_i (u_i (t)) - \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))\notag\\ &\qquad \qquad \qquad -\sum \limits _{j = 1}^n {w_{ij}^1 } f_j (u_j (t-\tau ))\bigg ]\end{align}
\begin{equation} 2P\Gamma \Delta ^{-1}-PW-(PW)^T-Q-PW_1Q^{-1}W_1^TP>0\end{equation}
\begin{equation} \delta _M(\|W\|+\|W_1\|)<\gamma _m\end{equation}
\begin{equation} 2\Gamma \Delta ^{-1}-W-W^T-{\|W_1\|}I-\frac {1}{\|W_1\|}W_1W_1^T>0.\end{equation}
\begin{equation} x^T\left (2\Gamma \Delta ^{-1}-W-W^T-{\|W_1\|}I-\frac {1}{\|W_1\|}W_1W_1^T\right )x(t)>0.\end{equation}
\begin{equation} x^T\left (2\gamma _m\delta _M^{-1}-2\|W\|-2\|W_1\|\right )x(t)>0.\end{equation}
For the following Cohen-Grossberg neural networks with continuously distributed delays:\begin{align} &\hspace {-25pt}\dot u_i(t)= -d_i(u_i(t)) \bigg [ a_i (u_i (t)) -\sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t)) \notag\\ &\qquad\qquad \qquad ~~-\sum \limits _{j = 1}^n {w_{ij}^1 } g_j (u_j (t-\tau (t)))-\sum \limits _{j = 1}^n {w_{ij}^2 } \notag\\ &\qquad\qquad \qquad ~~\times \int _{-\infty }^tK_{j}(t-s)g_j (u_j (s))\mathrm {d}s\bigg ]\end{align}
\begin{align} \int _0^{\infty }K_{j}(s)\mathrm {d}s=1,\quad \int _0^{\infty }sK_{j}(s)e^{2\lambda s}\mathrm {d}s=\pi _j(\lambda )<\infty ,\quad \lambda >0.\end{align}
\begin{align} \dot u(t)&= -D(u(t)) \Big [ A (u (t)) -W g (u(t)) -W_1g (u (t-\tau (t)))\notag\\ &\qquad\qquad \qquad -W_2 \int _{-\infty }^tK(t-s)g (u (s))\mathrm {d}s\Big ].\end{align}
For the following Cohen-Grossberg neural networks with continuously distributed delays:\begin{align} &\hspace {-10pt}\dot u_i(t)= - d_i(u_i(t))\bigg [ a_i (u_i (t)) -\sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))-\sum \limits _{j = 1}^n {w_{ij}^1 } \notag\\ &\qquad\qquad \qquad \quad ~~ \times \int _{-\infty }^tK_{ij}(t-s)g_j (u_j (s))\mathrm {d}s\bigg ]\end{align}
\begin{equation} 2P\Gamma \Delta ^{-1}-PW-W^TP-(PQ^{-1}W_1)_{\infty }-(PQW_1)_1>0\end{equation}
\begin{equation} \int _0^{\infty }K_{ij}(s)e^{\delta _0s}\mathrm {d}s<\infty\end{equation}
For (116), it can be transformed into the following vector-matrix form [269]\begin{align} \dot u(t) &= - D(u(t))\bigg [A(u(t)) - Wg(u(t)) \notag\\ &\qquad\qquad \quad ~- \sum \limits _{i = 1}^n {E_i\int _{- \infty }^t \bar K_i(t-s){g(u(s))\mathrm {d}s}} \bigg ]\qquad\end{align}
Note that, with the use of the Moon inequality [211], the Finsler inequality, the well-known Newton-Leibniz formula, and the free weight matrix method, a large number of different classes of LMI-based stability results have been established in [94] and [320].
For the Cohen-Grossberg neural networks (63) and the Cohen-Grossberg neural networks with finite distributed delays (70), LMI-based stability results have been established in [311] and [312], respectively, in which a similar delay-matrix-decomposition method is proposed to derive the main results.
From the above results, we can see that the core condition of the stability criteria for neural networks with delay is (67) in Section V-A, from which one can derive (108) and (117), respectively. Under different assumptions on the network models, one may obtain the same stability results in mathematical form; however, the physical meanings they reflect are essentially different.
D. Topics on Robust Stability of Recurrent Neural Networks
In the design and hardware implementation of neural networks, a common problem is that accurate parameters are difficult to guarantee. To design neural networks, vital data, such as the neuron firing rates, the synaptic interconnection weights, and the signal transmission delays, usually need to be measured, acquired, and processed by means of statistical estimation, which inevitably leads to estimation errors. Moreover, parameter fluctuation in neural network implementations on very large-scale integration chips is also unavoidable. In practice, it is possible to estimate the range of the above-mentioned vital data, as well as the bounds of circuit parameters, from engineering experience even with incomplete information. This fact implies that good neural networks should have a certain degree of robustness, which paves the way for introducing the theory of interval matrices and interval dynamics to investigate the global stability of interval neural networks. As pointed out in [19], robust stability is very important in the consideration of dynamics of neural networks with or without delays. There are many related results on robust stability [170], [246], [291]. In [170], global robust stability of delayed interval Hopfield neural networks was investigated with respect to bounded and strictly increasing activation functions. Several
For LMI-based robust stability results of recurrent neural networks, the difficulty is how to tackle different classes of uncertainties. For the cases of matched uncertainties and interval uncertainties, many LMI-based robust stability results have been published [18], [19], [38], [44], [121], [137], [144], [187], [246], [273], [310], [311]. However, for recurrent neural networks with other forms of uncertainties, LMI-based robust stability results are few [286], [288]. It is important to establish LMI-based robust stability results for recurrent neural networks with different classes of uncertainties, because one can then exploit the advantages of the LMI technique to establish a new stability theory for recurrent neural networks with uncertainties, in parallel to the scalar methods, such as
Since the proof method for the robust stability of systems with interval uncertainties and matched uncertainties is similar to that for systems without uncertainties, a detailed review of the robust stability results for recurrent neural networks is omitted.
Stability Analysis of Neural Networks With Discontinuous Activation Functions
Although this paper mainly focuses on the stability of continuous-time recurrent neural networks, we also briefly discuss discontinuous recurrent neural networks, which have been intensively studied in the literature.
When dealing with dynamical systems possessing high-slope nonlinear elements, it is often advantageous to model them with a system of differential equations with a discontinuous right-hand side, rather than studying the case where the slope is high but finite [71], [72]. The main advantage of analyzing the ideal discontinuous case is that such analysis is usually able to give a clear picture of the salient features of the motion, such as the presence of sliding modes, i.e., the possibility that trajectories may be confined for some time intervals to discontinuity surfaces.
The existing literature reports a few other investigations on discontinuous neural networks, which pertain to a different application context, or to different neural architectures. A significant case is that of Hopfield neural networks where neurons are modeled by a hard discontinuous comparator function [146]. Different from the discontinuous activation function addressed in [71] (see discontinuous activation function in Section II-B of this paper), the analysis in [146] was valid for symmetric neural networks, which possessed multiple equilibrium points located in saturation regions, i.e., networks useful to implement content addressable memories. References [49] and [78] introduced a special neural-like architecture for solving linear programming problems, in which the architecture is substantially different from the additive neural networks. Moreover, the networks in [49] are designed as gradient systems of a suitable energy function, while it is known that additive neural networks of the Hopfield type are gradient systems only under the restrictive assumption of symmetric interconnection matrix [73], [103]. To study the class of discontinuous neural networks, the concepts from the theory of differential equations with discontinuous right-hand side as introduced by Filippov are usually used [68], [86], [111], [184].
In [71], discontinuous Hopfield networks (2) were studied. The established conditions on global convergence could be applicable to general nonsymmetric interconnection matrices, and they generalized the previous results to the discontinuous case for neural networks possessing smooth neuron activations. Specifically, if the following simple condition established in [71] is satisfied\begin{equation} -W ~\mbox {is a}~P\mbox{-matrix}\end{equation}
More importantly, the concept of global convergence of the output equilibrium point was proposed in [71]. Usually, in the standard case considered in the literature where the neuron activation functions are continuous and monotone, it is easy to see that global attractivity of an equilibrium point also implies global attractivity of the output equilibrium point. Unfortunately, this property is no longer valid for the class of discontinuous activation functions, since for discontinuous neuron activations, convergence of the state does not imply convergence of the output. Therefore, for discontinuous activation functions, it is necessary to address separately the global convergence of both the state variables and the output variables. In [71], the following condition was derived [71, Th. 2] \begin{equation} {-W} ~\mbox {is Lyapunov diagonally stable}\end{equation}
Under (121), [71, Th. 2] holds for all neural network inputs \begin{equation} \mathcal{M}(W)-|W_1|~ \mbox {is an} ~ M\mbox{-matrix}\end{equation}
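Conditions such as (120) can be tested directly from the definition of a P-matrix, i.e., all principal minors are positive (the brute-force check below is exponential in the dimension and is shown only as a sketch on hypothetical data):
\begin{verbatim}
import numpy as np
from itertools import combinations

def is_P_matrix(M, tol=1e-12):
    """P-matrix test: every principal minor is positive."""
    n = M.shape[0]
    return all(np.linalg.det(M[np.ix_(idx, idx)]) > tol
               for k in range(1, n + 1)
               for idx in combinations(range(n), k))

# Hypothetical interconnection matrix, for illustration only.
W = np.array([[-1.0, 0.4], [-0.3, -0.8]])
print(is_P_matrix(-W))    # checks a condition of the type "-W is a P-matrix"
\end{verbatim}
Verifying the stronger Lyapunov diagonal stability condition (121) instead amounts to finding a diagonal \(P>0\) with \(PW+W^TP<0\), which is a small semidefinite feasibility problem rather than a closed-form test.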
Now, let us compare (121) and (122). When the delay is sufficiently small, the interconnection matrix in discontinuous system (5) is given by
After the pioneering work in [71], [72], and [183], the topic of discontinuous neural networks has received much attention, and many related results have been established [64], [117], [175], [216]. Among these works, the study of discontinuous neural networks has mainly been pursued by four research teams [68], [95], [173], [183], [184], [186], [111], [171], [86], [116], [141]. Readers can refer to the references cited therein, and the details are omitted here.
From the above results, we can see that there are mainly two kinds of core conditions: 1) the LDS form (121) and 2) the
Some Necessary and Sufficient Conditions for Recurrent Neural Networks
Nowadays, almost all the stability results for recurrent neural networks are sufficient conditions. However, there also exist some necessary and sufficient conditions for some special classes of recurrent neural networks with/without delays.
Note that the sufficient asymptotic/exponential stability criteria in the existing literature are all established on the basis of strict inequalities (i.e., \(>0\) or \(<0\)). It is natural to ask: what will happen if the strict inequalities are replaced by nonstrict inequalities (i.e., \(\ge 0\) or \(\le 0\))? For the following Hopfield neural networks with delays\begin{align} \dot {u}_i (t) = - \gamma _iu_i (t) +\sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t)) +\sum \limits _{j = 1}^n {w_{ij}^1 } f_j (u_j (t-\tau _{ij})).\end{align}
\begin{align} M_j=-\gamma _j+\alpha _j\bigg (w_{jj}+\sum _{i=1,i\ne j}^n|w_{ij}|\bigg )^+ +\beta _j\sum _{i=1}^n|w_{ij}^1|\le 0\end{align}
For the following purely delayed Hopfield neural networks:\begin{equation} \dot {u} (t) = - \Gamma u (t) +W_1 f(u (t-\tau ))\end{equation}
\begin{equation} \left (\sigma I-\left [\begin{array}{cc}\Gamma &0\\ 0&\Gamma \end{array}\right ]+e^{\sigma \tau }\left [\begin{array}{cc}W_1^+ &W_1^-\\ W _1^-&W_1^+\end{array}\right ]\left [\begin{array}{cc}\Delta &0\\ 0&\Delta \end{array}\right ]\right )\eta \le 0\qquad\end{equation}
For the following Hopfield neural networks, the ABST was studied in [51], [53], [70], and [159]\begin{equation} \dot {u}_i (t) = - \gamma _i u_i (t) + \sum \limits _{j = 1}^n {w_{ij} } g_j (u_j (t))+U_i\end{equation}
\begin{equation} -W\in \mathcal {P}_0\end{equation}
In [127], a conjecture was raised: the necessary and sufficient condition for ABST of the neural networks (127) is that its connection matrix
In [53], for (127) with \begin{equation} \max _i \mbox {Re}~ \lambda _i(W)\le 0\end{equation}
\begin{equation} \max _i \lambda _i\left (\frac {W+W^T}{2}\right )\le 0\end{equation}
In [51], (127) was further discussed. By removing the assumption of normal matrix on \begin{equation} \max _i \mbox {Re}~ \lambda _i(W)\le 0\end{equation}
For (127) with Assumption 5.4, the following necessary and sufficient condition was derived in [107] and [124]\begin{equation} -\Gamma +W\Delta ~\mbox {is nonsingular or }\det (-\Gamma +W\Delta )\ne 0\qquad\end{equation}
For the Hopfield neural networks with delays \begin{equation} \dot {u}_i (t) = - \gamma _i u_i (t) + \sum \limits _{j = 1}^n {w^1_{ij} } g_j (u_j (t-\tau _{ij}))\end{equation}
\begin{equation} \det (-\Gamma +W_1)\ne 0,~\mbox {and}~ \Gamma -|W_1|~\mbox {is a \(\mathcal {P}_0\)-matrix}\end{equation}
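This necessary and sufficient condition can likewise be verified directly; the sketch below checks the nonsingularity requirement and the \(\mathcal {P}_0\)-matrix requirement (all principal minors nonnegative) on hypothetical data chosen only for illustration:
\begin{verbatim}
import numpy as np
from itertools import combinations

def is_P0_matrix(M, tol=1e-12):
    """P0-matrix test: every principal minor is nonnegative."""
    n = M.shape[0]
    return all(np.linalg.det(M[np.ix_(idx, idx)]) >= -tol
               for k in range(1, n + 1)
               for idx in combinations(range(n), k))

# Hypothetical data, for illustration only.
Gamma = np.diag([1.0, 1.5])
W1 = np.array([[0.5, -0.4], [0.2, 0.6]])

cond1 = abs(np.linalg.det(-Gamma + W1)) > 1e-12   # det(-Gamma + W1) != 0
cond2 = is_P0_matrix(Gamma - np.abs(W1))          # Gamma - |W1| is a P0-matrix
print(cond1 and cond2)
\end{verbatim}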
From the above results, we can see that
Multistability of Recurrent Neural Networks and Its Comparisons With Global Stability
Preceding sections are about the global stability of the unique equilibrium point of continuous-time recurrent neural networks. Multistability problems also require further investigation. For example, when recurrent neural networks are applied to pattern recognition, image processing, associative memories, and pattern formation, it is desirable that the network has several equilibria, each representing an individual pattern [55], [62], [129]. In addition, in some neuromorphic analog circuits, multistable dynamics even play an essential role, as revealed in [58] and [87]. Therefore, the study of the coexistence and stability of multiple equilibrium points, in particular of their basins of attraction, is of great interest in both theory and applications [17], [31], [46], [47], [190], [195], [266], [298], [314]. A tutorial on the applications of neural networks to associative memories and pattern formation can be found in [118], [120], [234], and [296]. Theoretical research on convergence and multistability of recurrent neural networks can be found in [83] and [103]. In this section, we will mainly focus on recent theoretical results on multiple equilibrium points of recurrent neural networks.
Chen and Amari [31] pointed out that the one-neuron neural network model has three equilibrium points; two of them are locally stable, and one is unstable. For the
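As a simple numerical illustration of this three-equilibria picture (a minimal sketch; the weight \(w=2\) and the \(\tanh\) activation are chosen here only for exposition), consider the scalar model \(\dot u=-u+w\tanh (u)\), which for \(w>1\) has exactly three equilibria:
\begin{verbatim}
import numpy as np
from scipy.optimize import brentq

w = 2.0
f  = lambda u: -u + w * np.tanh(u)               # right-hand side
df = lambda u: -1.0 + w * (1.0 - np.tanh(u)**2)  # its derivative

u_pos = brentq(f, 0.5, 5.0)                      # positive equilibrium
for u_star in (-u_pos, 0.0, u_pos):
    # an equilibrium is locally stable when f'(u*) < 0
    print(f"u* = {u_star:+.4f}, f'(u*) = {df(u_star):+.4f}")
\end{verbatim}
The two outer equilibria (approximately \(\pm 1.915\)) are locally stable, while the origin is unstable, in agreement with the discussion above.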
All the above methods for the stability analysis of multiple equilibria are based on a decomposition of the phase space. Then, in each invariant attractive subset containing a stable equilibrium point, the neural network reduces to the linear case, in which the stability property of the equilibrium point is established. The main difficulty lies in how to efficiently decompose the phase space
The differences of stability analysis between recurrent neural networks with unique equilibrium point and recurrent neural networks with multiple equilibrium points can be summarized as follows.
The region of initial states of the unique solution of recurrent neural networks is the whole state space, while the initial regions of the multiple solutions of recurrent neural networks belong to different subspaces. This is the main difference that leads to global stability and local stability, respectively.
The types of activation functions play different roles in analyzing the stability of a unique equilibrium point and of multiple equilibrium points. For a large class of activation functions, one can prove the existence, uniqueness, and global stability of the equilibrium point. In contrast, if the specific form of the activation function is not given in the analysis of recurrent neural networks with multiple equilibrium points, the subspace decomposition cannot proceed. Thus, the local stability analysis of the multiple equilibrium points cannot be conducted by the subspace decomposition method.
There are many methods to analyze the global stability of recurrent neural networks with a unique equilibrium point, for example, the contraction method, the Lyapunov method, the differential equation method, the comparison principle method, and so on. However, for recurrent neural networks with multiple equilibrium points, one of the most frequently used methods in the literature is linearization at each local equilibrium point, which, consequently, is only concerned with local stability. This is also the main reason why there are far fewer stability results for recurrent neural networks with multiple equilibrium points than for recurrent neural networks with a unique equilibrium point. However, there are more results on the estimation of the domains of attraction of multiple equilibria than corresponding local stability results.
In applications, recurrent neural networks with unique equilibrium point are mainly used to solve optimization problems. In contrast, recurrent neural networks with multiple equilibrium points can be applied to many different fields, such as associative memories, pattern recognition, pattern formation, signal processing, and so on [202].
Some Future Directions and Conclusion
In this paper, some topics on the stability of recurrent neural networks have been discussed in detail. The coverage includes most aspects of stability research on recurrent neural networks. The fruitful results in the fields of stability of recurrent neural networks have greatly promoted the development of the neural network theory.
For future directions of the stability study on recurrent neural networks, we now give some prospective suggestions.
Continue to apply and develop useful mathematical methods to further reduce the conservativeness of the existing stability results while keeping a reasonably low computational complexity. This topic is sometimes related to the development of other disciplines, such as applied mathematics, computational mathematics, and mechanics.
How to establish necessary and sufficient stability conditions for delayed recurrent neural networks with more neurons is still an open problem. For the case of constant time delay, a necessary and sufficient stability result has been proposed only for recurrent neural networks with two neurons. Moreover, how to obtain approximate necessary and sufficient stability conditions is also meaningful for the development of neural network theory.
In addition to the global stability property, how to establish stability criteria for multiple equilibrium points of recurrent neural networks still needs more effort. In general, the global stability property is related to optimization problems, while multistability is related to associative memories. In applications such as image recognition, data classification, and information processing, multistability may play an important role. Details of interest include the size of the domains of attraction and their precise boundaries.
How to balance the computational complexity and the efficiency of stability results needs to be investigated. At present, the conservativeness of stability results is reduced at the expense of more complex expressions, which involve too many parameters to be determined. How to reduce the redundancy of some of the slack variables in LMI-based stability conditions needs to be further investigated.
For the original Cohen-Grossberg neural networks, in which the equilibrium points are all positive or nonnegative, only a few stability results have been established. These classes of neural networks play an important role in biological systems or competition-cooperation systems. Compared with the stability study of Hopfield neural networks, the work on the original Cohen-Grossberg neural networks is not sufficient, in either breadth or depth. For Cohen-Grossberg neural networks with nonnegative equilibrium points, how to study the stability properties in the presence of reaction-diffusion terms, stochastic environments, impulsive actions, and other effects remains to be investigated in depth.
Considering the complexity of the internal and external factors of neural networks, some new features must be incorporated into the existing network models, for example, internal elastic connections and spike effects, external stochastic fields, switching, impulses, and so on. These factors may have direct effects on neural networks and are especially challenging for the study of stability problems.
The stability properties of recurrent neural networks considered in this paper concern isolated Cohen-Grossberg-type recurrent neural networks with regular topology structures, for example, Hopfield neural networks and cellular neural networks. For other types of recurrent neural networks with different topology structures, for example, symmetric/asymmetric ring networks and random symmetric/asymmetric networks, the stability results are few. Especially, when the same or different classes of networks are composed into large-scale complex neural networks, the stability problems of synchronization and consensus should be deeply investigated in different cases, such as link failure, pinning control, clustering, and so on. Moreover, complex-valued and fractional-order neural networks, which are regarded as extensions of real-valued and integer-order neural networks, have also been investigated in recent years. In these directions, there are many challenging topics to be further studied.
Conclusion
In summary, stability studies for recurrent neural networks with or without time delays have achieved a great deal in the last three decades. However, there are still many new problems to be solved. All these future developments will accompany the development of mathematical theory, especially applied mathematics and computational mathematics. It should be kept in mind that different forms of stability criteria have their own feasible ranges, and one cannot expect that only a few stability results can tackle all the stability problems existing in recurrent neural networks. Every class of stability results, for example, in the forms of algebraic inequality, LDS,