Introduction
Differential equations in which the derivative of the unknown function at time $t$ depends on the values of the function, and possibly its derivatives, at earlier times are called delay differential equations. When a delayed highest-order derivative appears, the equation is a neutral delay differential equation (NDDE), written in general form as \begin{equation*} y^{(k)}(t)=f(t,y(t),\ldots,y^{(k-1)}(t),y(d_{0}),\ldots,y^{(k)}(d_{k});\mathbf {p}), \tag{1}\end{equation*} where $\mathbf {p}$ is a vector of model parameters and $d_{0},\ldots,d_{k}$ are the delayed arguments.
Several studies have been done on the numerical approximations and asymptotic properties of solutions to NDDEs [1]–[7]. NDDEs have been used to develop models in engineering [8]–[12], biosciences [1], [13]–[17], and economics [18], [19]. A review of the applications of NDDEs in various fields is discussed in [20].
Here, we consider a single constant delay $\tau >0$, that is, \begin{equation*} d_{j}(t,y(t))=t-\tau,\quad \forall j.\end{equation*}
Note that the solution of (1) depends on both the model parameters $\mathbf {p}$ and the delay $\tau $. We collect these unknowns in the vector \begin{equation*} \theta:=[\mathbf {p};\tau],\end{equation*} and denote the corresponding solution by $y_\theta $.
The values of the parameters in a model are often unknown. While some parameters are found in the literature, others need to be estimated. Ideally, we want the experimental data $D_{i}$ recorded at times $t_{i}$ to be well approximated by the model solution, that is, \begin{equation*} D_{i} \approx y_\theta (t_{i}),\quad i=1,2,\ldots,N. \tag{2}\end{equation*}
This leads to the least-squares minimization problem \begin{equation*} \min _{\theta } \dfrac {1}{N}\sum _{i=1}^{N} \left ({D_{i} - y_\theta (t_{i})}\right)^{2}, \tag{3}\end{equation*} with cost function \begin{equation*} J(\theta):=\dfrac {1}{N}\sum _{i=1}^{N} \left ({D_{i} - y_\theta (t_{i})}\right)^{2}.\tag{4}\end{equation*}
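To make the least-squares formulation concrete, the cost (4) can be sketched as follows (Python is used for illustration only; `solve_ndde` is a hypothetical stand-in for a numerical NDDE solver that returns $y_\theta(t_i)$):

```python
import numpy as np

def cost(theta, t, D, solve_ndde):
    """Least-squares cost J(theta) from Eq. (4).

    theta      : parameter vector [p; tau]
    t, D       : measurement times and data values, length N
    solve_ndde : hypothetical solver returning y_theta at the times t
    """
    y = solve_ndde(theta, t)   # model prediction y_theta(t_i)
    r = D - y                  # residuals D_i - y_theta(t_i)
    return np.mean(r ** 2)     # (1/N) * sum of squared residuals

# toy check: a "solver" that ignores theta and returns zeros
t = np.linspace(0.0, 1.0, 5)
D = np.ones_like(t)
print(cost(None, t, D, lambda th, tt: np.zeros_like(tt)))  # 1.0
```

Any of the optimizers discussed below only needs this function-value interface, which is what makes derivative-free heuristics applicable.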
There are many numerical optimization algorithms in the literature that can be used to solve (3). Classical techniques use the gradient of the cost function to search for a minimizer; however, since the solution $y_\theta $ of an NDDE is generally not available in closed form, the gradient of $J$ with respect to $\theta $ is difficult to obtain. Numerical algorithms that only use function values of $J$, such as heuristic algorithms, are therefore attractive alternatives.
In [27], a modified particle swarm optimization (PSO) was used to estimate the parameters of dynamical systems involving ordinary differential equations. It was shown in [28] how a hybrid adaptive cuckoo search with simulated annealing could be effective in identifying parameters in chaotic systems. A hybrid Taguchi-differential evolution algorithm was proposed in [29] to estimate the parameters of an HIV model. Heuristic algorithms have also been shown to be effective in identifying parameters of S-system models [30]–[32]. Other recent works on the application of various heuristic optimization methods to parameter estimation in engineering systems are found in [33]–[41].
Heuristic algorithms have been applied to parameter estimation in retarded delay differential equation (RDDE) models, including the genetic algorithm (GA) [42], a hybrid of an evolutionary search strategy with a local multiple-shooting approach [43], and differential evolution [44]. In [45], ant colony optimization was used to determine parameter values in RDDE models on human health. To determine the unknown parameters in chaotic systems with time-delays, a hybrid biogeography-based optimization was used in [46], PSO was used in [47], and a hybrid cuckoo search algorithm was used in [48].
To the best of our knowledge, heuristic algorithms have never been employed in estimating parameters of models involving NDDEs. In this paper, we solve the least-squares formulation in (3) using Genetic Algorithm with Multi-Parent Crossover (GA-MPC), and compare the results to those obtained using standard global optimization algorithms: GA, pattern search algorithm (PS), simulated annealing (SA), and PSO. GA-MPC is a modified GA that proposes a new crossover method [49]. It has been used to effectively solve an economic dispatch problem [50] and an optimal power flow problem [51]. It has also been applied to solve an inverse problem in RDDEs [52].
In the next section, we discuss GA-MPC and a parameter bootstrapping method. In Section III, parameter estimation is done on three NDDE models arising from applications using the heuristic algorithms mentioned above. The first model describes the population growth of E. coli using a first-order NDDE. The second one is an age-structured model involving a system of first-order NDDEs. The third model is a second-order NDDE describing a pendulum-mass-spring-damper (PMSD) system. Parameter bootstrapping is done to assess uncertainty in the estimates. Furthermore, since the only stopping criterion is the maximum number of function evaluations, we investigate the effect of varying this quantity on the cost function value. In the last section, we state our conclusions and recommendations.
Methods
A. Genetic Algorithm With Multi-Parent Crossover
Genetic Algorithm was formulated by John Holland in 1975 based on Darwin’s theory of evolution [53]. It has gained popularity for its ease of use and applicability in various fields of study. Some applications of GA can be found in [25], [26], [42], [54]–[59].
The general process of GA starts with a pool of randomly generated individuals that serve as members of the first generation. The members of the population act as estimates of the minimizer. The best members of the population are selected based on their survival fitness. In the case of minimization, the lower the function value of an individual, the higher its chance of survival. The members of the population that do not survive are replaced by the offspring of the surviving population. The offspring are created using a crossover method. To reduce the risk of converging to a local minimum, a mutation step is done at the end of each generation, in which a portion of the population is modified randomly. The process is repeated until a stopping criterion is satisfied. A detailed discussion of GA, as well as MATLAB codes, is found in [26].
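A minimal sketch of this generic GA loop (initialization, selection, crossover, mutation) in Python; the operator choices below are common illustrative defaults, not the specific operators of [26] or [49]:

```python
import numpy as np

rng = np.random.default_rng(0)

def ga_minimize(f, lb, ub, pop_size=40, generations=100, mut_prob=0.1):
    """Generic GA for minimizing f over the box [lb, ub]."""
    dim = len(lb)
    pop = rng.uniform(lb, ub, size=(pop_size, dim))   # first generation
    for _ in range(generations):
        fitness = np.array([f(x) for x in pop])
        order = np.argsort(fitness)                   # lower cost = fitter
        survivors = pop[order[: pop_size // 2]]       # keep the best half
        # offspring: uniform crossover between random pairs of survivors
        idx = rng.integers(0, len(survivors), size=(pop_size - len(survivors), 2))
        mask = rng.random((len(idx), dim)) < 0.5
        children = np.where(mask, survivors[idx[:, 0]], survivors[idx[:, 1]])
        # mutation: re-randomize a few coordinates to maintain diversity
        mmask = rng.random(children.shape) < mut_prob
        children[mmask] = rng.uniform(lb, ub, size=children.shape)[mmask]
        pop = np.vstack([survivors, children])
    fitness = np.array([f(x) for x in pop])
    return pop[np.argmin(fitness)]

# toy usage: minimize the 2-D sphere function
best = ga_minimize(lambda x: np.sum(x ** 2),
                   np.array([-5.0, -5.0]), np.array([5.0, 5.0]))
print(float(np.sum(best ** 2)))
```

Because the best half of the population is carried over unchanged (elitism), the best cost found never increases from one generation to the next.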
GA-MPC proposes a new crossover method that uses three parents from the selection pool to generate three offspring (refer to Algorithm 1). Two of the offspring are generated from the better-performing parents, which biases the search toward promising regions, and each offspring may subsequently undergo mutation with a small probability.
Algorithm 1 Genetic Algorithm With Multi-Parent Crossover (GA-MPC) [49]
Require: Cost function $J$, population size, archive pool size, tournament size, crossover probability $cr$, mutation probability $p$, maximum number of function evaluations
Ensure: Best individual found
Randomly generate the initial population.
while the maximum number of function evaluations is not reached do
Sort the population by their cost function value. Form the archive pool from the best individuals.
Fill the selection pool using tournament selection.
for each three consecutive individuals in the selection pool do
if one of the selected individuals is the same as another individual then
Replace one by a randomly selected individual from the selection pool.
end if
if a randomly generated number in $[0,1]$ is less than the crossover probability $cr$ then
Sort the three individuals $x_{1},x_{2},x_{3}$ so that $J(x_{1})\leq J(x_{2})\leq J(x_{3})$.
Compute $\beta $ from a normal distribution as specified in [49].
The three offspring are generated from the three parents using: $o_{1}=x_{1}+\beta (x_{2}-x_{3})$, $o_{2}=x_{2}+\beta (x_{3}-x_{1})$, $o_{3}=x_{3}+\beta (x_{1}-x_{2})$.
end if
for each offspring do
Randomly generate $u\in [0,1]$.
if $u < p$ then
The offspring undergoes mutation: a randomly chosen component is replaced using a randomly selected individual from the archive pool.
end if
end for
end for
if a duplicate individual is found in the population then
Change the duplicate by randomly perturbing one of its components.
end if
end while
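The crossover at the core of Algorithm 1 can be sketched as follows, assuming the offspring formulas $o_{1}=x_{1}+\beta (x_{2}-x_{3})$, $o_{2}=x_{2}+\beta (x_{3}-x_{1})$, $o_{3}=x_{3}+\beta (x_{1}-x_{2})$ and a Gaussian $\beta $ as reported in [49]; Python is used for illustration, while the paper's experiments are in MATLAB:

```python
import numpy as np

rng = np.random.default_rng(1)

def mpc_offspring(x1, x2, x3, f, beta_mean=0.7, beta_std=0.1):
    """Multi-parent crossover: three parents -> three offspring.

    Parents are sorted so that f(x1) <= f(x2) <= f(x3); each offspring moves
    one parent along the difference of the other two, scaled by a random beta.
    (beta_mean and beta_std are assumed defaults, following [49].)
    """
    x1, x2, x3 = sorted((x1, x2, x3), key=f)   # best parent first
    beta = rng.normal(beta_mean, beta_std)
    o1 = x1 + beta * (x2 - x3)
    o2 = x2 + beta * (x3 - x1)
    o3 = x3 + beta * (x1 - x2)
    return o1, o2, o3

parents = [np.array([1.0, 1.0]), np.array([0.5, 0.0]), np.array([2.0, 2.0])]
offspring = mpc_offspring(*parents, f=lambda x: np.sum(x ** 2))
print(len(offspring))  # 3
```

Note that the three difference terms cancel in the sum, so the offspring have the same centroid as the parents regardless of $\beta$; the crossover redistributes the parents around their mean rather than drifting the population.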
In the following simulations, the parameter values of GA-MPC are set to their default values as stated in [49]; in particular, the population size, archive pool size, and the crossover and mutation probabilities are kept at the recommended settings.
The results obtained using GA-MPC are compared to those obtained using standard heuristic algorithms: GA, PS, SA, and PSO, which are implemented using the built-in MATLAB functions \texttt{ga}, \texttt{patternsearch}, \texttt{simulannealbnd}, and \texttt{particleswarm}, respectively.
B. Parameter Uncertainty Analysis
In [61], a parameter bootstrapping approach is presented to quantify the standard error in the parameter estimates in models involving RDDEs. We adopt this strategy to analyze the uncertainty in parameter estimates in the NDDE models. The method is summarized in the following steps:
1) Solve the parameter estimation problem in (3) and denote the estimated parameter by $\hat {\theta }$.
2) Let $y_{\hat {\theta }}(t)$ be the solution of the NDDE model. Set the maximum number $L$ of realizations to 100. Initialize the number of realizations $r$ to 1.
3) Using $y_{\hat {\theta }}(t_{i})$, $i=1,\ldots,N$, generate noisy data $\{\hat {D}_{i}\}_{i=1}^{N}$ based on an assumed probability distribution. Here, the noisy data are generated using the normal distribution $\mathrm {Normal}(y_{\hat {\theta }}(t_{i}),\sigma ^{2})$, $i=1,\ldots,N$, where the noise $\sigma $ is set to 10% of the standard deviation of $\{y_{\hat {\theta }}(t_{i})\}_{i=1}^{N}$ (compare with [61]).
4) Get a new set of parameter estimates using the simulated noisy data $\{\hat {D}_{i}\}_{i=1}^{N}$. This is done by replacing $\{D_{i}\}_{i=1}^{N}$ by $\{\hat {D}_{i}\}_{i=1}^{N}$ in (4) and minimizing the modified cost function. Store this parameter as $\bar {\theta }_{r}$.
5) If $r < L$, set $r\leftarrow r+1$ and go back to step 3. Otherwise, go to the next step.
6) The standard error of the parameter $\hat {\theta }$, denoted by $Err(\hat {\theta })$, is equal to the standard deviation of $\{\bar {\theta }_{r}\}_{r=1}^{100}$. The 95% confidence interval for the parameter $\hat {\theta }$ is $[\hat {\theta }-1.96\,Err(\hat {\theta }),\hat {\theta }+1.96\,Err(\hat {\theta })]$. The standard error and 95% confidence interval of the solution to the NDDE are computed similarly.
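The bootstrapping steps above can be sketched as follows; Python is used for illustration, and `estimate` is a hypothetical stand-in for one optimizer run on the perturbed data:

```python
import numpy as np

rng = np.random.default_rng(2)

def bootstrap_errors(y_hat, estimate, L=100, noise_frac=0.10):
    """Parameter bootstrapping as summarized in Section II-B.

    y_hat    : model solution y_theta_hat(t_i) at the measurement times
    estimate : hypothetical function mapping noisy data -> parameter estimate
    L        : number of bootstrap realizations (100 in the paper)
    Returns the bootstrap estimates and their standard deviation Err(theta_hat).
    """
    sigma = noise_frac * np.std(y_hat)        # 10% of data std, cf. [61]
    thetas = []
    for _ in range(L):
        D_noisy = rng.normal(y_hat, sigma)    # Normal(y_hat(t_i), sigma^2)
        thetas.append(estimate(D_noisy))      # re-fit on the noisy data
    thetas = np.asarray(thetas)
    err = thetas.std(axis=0)                  # standard error of theta_hat
    return thetas, err

# toy usage: an "estimator" that just returns the mean of the data
y_hat = np.linspace(0.0, 10.0, 50)
thetas, err = bootstrap_errors(y_hat, estimate=lambda D: np.array([D.mean()]))
ci = (thetas.mean() - 1.96 * err[0], thetas.mean() + 1.96 * err[0])  # 95% CI
```

In practice the `estimate` call is the expensive part, since each realization requires a full heuristic optimization run.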
Results and Discussion
A. E. Coli Population Growth Model
In [16], an NDDE model of E. coli population growth is proposed. The model assumes that all cells have the same division time and that they divide simultaneously. Furthermore, it is assumed that there is a prolonged initial step-like growth. The model is given by \begin{equation*} y'(t) =\rho _{0}y(t)+\rho _{1}y(t - \tau)+\rho _{2}y'(t - \tau), \tag{5}\end{equation*}
with history function \begin{equation*} \phi (t)=y_{0}+\beta \Psi (t),\quad t\in [-\tau,0],\end{equation*} where \begin{equation*} \Psi (t)=\dfrac {2y_{0}\beta \gamma }{\rho _{1}\tau }E\left ({\dfrac {2t}{\tau }+1}\right),\end{equation*} and \begin{align*} E(t)=\begin{cases} \exp [-1/(1-t^{2})],& \text {for }|t| < 1,\\ 0, & \text {otherwise.} \end{cases}\end{align*}
The function $E$ is a smooth bump function that vanishes outside $(-1,1)$, producing the prolonged step-like initial growth assumed in the model.
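A direct transcription of the bump function $E$ and the history $\phi $ defined above (Python for illustration; the parameter values in the check are arbitrary):

```python
import numpy as np

def E(t):
    """Smooth bump: exp[-1/(1-t^2)] for |t| < 1, zero otherwise."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    out = np.zeros_like(t)
    inside = np.abs(t) < 1
    out[inside] = np.exp(-1.0 / (1.0 - t[inside] ** 2))
    return out

def phi(t, y0, beta, gamma, rho1, tau):
    """History function phi(t) = y0 + beta*Psi(t) on [-tau, 0]."""
    psi = (2 * y0 * beta * gamma / (rho1 * tau)) * E(2 * t / tau + 1)
    return y0 + beta * psi

# the bump vanishes at the ends of the history interval, so phi(-tau) = y0
t = np.linspace(-1.0, 0.0, 5)                  # tau = 1 for illustration
vals = phi(t, y0=0.1, beta=0.5, gamma=1.0, rho1=1.0, tau=1.0)
print(np.all(np.isfinite(vals)))
```

Because $E(2t/\tau+1)$ is zero at $t=-\tau$ and $t=0$, the history equals $y_{0}$ at both endpoints and rises smoothly in between.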
We wish to estimate the parameters \begin{equation*} \theta:=[\rho _{0},\rho _{1},\rho _{2},\tau,\beta].\end{equation*}
Thus, the minimization problem to estimate the parameters of the NDDE model of E. coli population growth is given by \begin{equation*} \min _{\theta \in \mathbb {R}^{5}} \dfrac {\sum \limits _{i=1}^{28} \left ({y^\star _{i} - y_\theta (t_{i})}\right)^{2}}{\sum \limits _{i=1}^{28} \left ({y^\star _{i} }\right)^{2}}. \end{equation*}
Here, $y^\star _{i}$, $i=1,\ldots,28$, are the experimental measurements of the E. coli population recorded at times $t_{i}$; the cost is normalized by the squared magnitude of the data.
Figure 1 shows the box plots of the parameter estimates of the E. coli population growth model obtained using GA-MPC, GA, PS, SA, and PSO. Compared to the other algorithms, most of the estimates obtained using GA-MPC fall within the first and third quartiles of the 100 estimates. Furthermore, the narrow interquartile ranges suggest small variations among estimates. The cyan curves in Figure 2 are the plots of the solutions corresponding to the 100 sets of estimates. The data points are represented by the asterisks. Among the five algorithms, plots of the solution using GA-MPC produce the best fit to the experimental data.
Box plots of the parameter estimates for the E. coli population growth model using GA-MPC, GA, PS, SA, and PSO. The outliers, i.e., the estimates that fall outside the lower and upper quartiles, are plotted in red (+). The median and mean are represented by a red line across the box plots and black dot, respectively.
Plots of the solution of the E. coli population growth model corresponding to the 100 sets of parameter estimates (cyan curves) obtained using each algorithm, together with the data points (asterisks).
Next, we apply the bootstrapping method presented in Subsection II-B to assess the uncertainty in the parameter estimates obtained using GA-MPC. Table 2 shows the standard errors and mean of the estimated parameters, which all fall within the 95% confidence interval. In Figure 3, we see that the plot of the solution using the mean of the estimated parameters (blue curve) fits the experimental data (asterisks) well. The bounds of the 95% confidence interval of the solution are also shown in the figure.
The numerical solution of the E. coli population growth model using the mean of the estimated parameters (blue curve), its 95% confidence interval, and the experimental data (asterisks).
B. Age-Structured Model
Next, we consider a population divided into two age groups – juveniles and adults. This can be described by the following age-structured population growth model from [14]:\begin{align*} \dot {U}(t)=&\begin{cases} \displaystyle -\mu _{0}U(t)+b_{1}V(t)\\ \displaystyle +\,(b_{2}-1)n_{0}(\tau -t)e^{-\mu _{0}t},& 0\leq t\leq \tau,\\ \displaystyle c_{0}U(t)+c_{1}V(t)\\ \displaystyle +\,c_{3}V(t-\tau)+c_{4}\dot {V}(t-\tau),& t>\tau,\\ \displaystyle \end{cases}\\ \dot {V}(t)=&\begin{cases} \displaystyle n_{0}(\tau -t)e^{-\mu _{0}t}-\mu _{1}V(t),& 0\leq t\leq \tau,\\ \displaystyle a_{1}V(t-\tau)+a_{2}\dot {V}(t-\tau)-a_{3}V(t), & t>\tau, \end{cases}\end{align*}
\begin{align*} c_{0}=&b_{0}-\mu _{0}, \\ c_{1}=&b_{1}, \\ c_{2}=&(b_{2}-1)b_{0}e^{-\mu _{0}\tau }, \\ c_{3}=&(b_{2}-1)(b_{1}e^{-\mu _{2}}+b_{2}\mu _{1})e^{\mu _{2}-\mu _{0}\tau }, \\ c_{4}=&(b_{2}-1)b_{2}e^{\mu _{2}-\mu _{0}\tau }, \\ a_{0}=&b_{0}e^{\mu _{2}-\mu _{0}\tau }, \\ a_{1}=&(b_{1}e^{-\mu _{2}}+b_{2}\mu _{1})e^{-\mu _{0}\tau }, \\ a _{2}=&b_{2}e^{-\mu _{0}\tau }, \\ a_{3}=&\mu _{1}, \tag{6}\end{align*}
The model parameters are assumed to satisfy the constraint \begin{equation*} a_{1}\geq a_{2}a_{3}.\end{equation*}
We set the initial values for $U$ and $V$ to \begin{align*}\begin{bmatrix} U(0)\\ V (0) \end{bmatrix}=\begin{bmatrix} \int _{0}^{36}n_{0}(a)da\\[0.5pc] \int _{36}^{50}n_{0}(a)da\end{bmatrix}=\begin{bmatrix} \int _{0}^{36}1da\\[0.5pc] \int _{36}^{50}1da\end{bmatrix}=\begin{bmatrix} 36\\ 14 \end{bmatrix},\end{align*}
where the initial age distribution is \begin{align*} n_{0}(a)=\begin{cases} 1,& 0\leq a\leq 50,\\ 0,& \text {otherwise}. \end{cases}\end{align*}
In this example, we use generated population data for $U$ and $V$.
The minimization problem for the parameter estimation of this model is given by \begin{equation*} \min \limits _{\theta \in \mathbb {R}^{7}}\left ({\sum _{i=1}^{n} \dfrac {|U^{*}(t_{i})-U_\theta (t_{i})|^{2}}{|U^{*}(t_{i})|^{2}}+ \dfrac {|V^{*}(t_{i})-V_\theta (t_{i})|^{2}}{|V^{*}(t_{i})|^{2}}}\right)+\zeta g,\end{equation*}
where the parameter vector is \begin{equation*} \theta =[\tau,b_{0},b_{1},b_{2},\mu _{0},\mu _{1},\mu _{2}],\end{equation*}
and \begin{align*} g=&-\min [a_{1}-a_{2}a_{3},0]\\=&-\min [(b_{1}e^{-\mu _{2}}+b_{2}\mu _{1})e^{-\mu _{0}\tau }-\mu _{1}b_{2}e^{-\mu _{0}\tau },0]\end{align*} is a penalty term with weight $\zeta >0$ that penalizes violations of the constraint $a_{1}\geq a_{2}a_{3}$.
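The penalty term can be evaluated directly from the derived coefficients in (6); a short Python sketch (illustrative parameter values only):

```python
import numpy as np

def penalty(theta):
    """Penalty g = -min(a1 - a2*a3, 0) for the constraint a1 >= a2*a3.

    theta = [tau, b0, b1, b2, mu0, mu1, mu2], as in the age-structured model.
    """
    tau, b0, b1, b2, mu0, mu1, mu2 = theta
    a1 = (b1 * np.exp(-mu2) + b2 * mu1) * np.exp(-mu0 * tau)   # from (6)
    a2 = b2 * np.exp(-mu0 * tau)
    a3 = mu1
    # equivalently -min(a1 - a2*a3, 0): zero when the constraint holds
    return max(0.0, -(a1 - a2 * a3))

# constraint satisfied here, so no penalty is added to the cost
print(penalty([1.0, 0.5, 0.5, 0.5, 0.1, 0.1, 0.1]))  # 0.0
```

Algebraically, $a_{1}-a_{2}a_{3}=b_{1}e^{-\mu _{2}-\mu _{0}\tau }$, so the constraint holds automatically whenever $b_{1}\geq 0$, and the penalty only activates for parameter candidates with $b_{1}<0$.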
We see in Figures 4 to 5 that in most cases, GA-MPC and PSO result in parameter estimates which are close to the true parameter values (black dotted line) and vary within small interquartile ranges. However, GA-MPC gives better estimates than PSO for the parameters
Box plots of the estimates for
Box plots of the estimates for
Figure 6 shows the plots of the solutions $U_\theta $ and $V_\theta $ corresponding to the estimated parameters, together with the generated data.
Plots of the solutions
The numerical solutions
Box plots of the parameter estimates in the PMSD NDDE model using GA-MPC, GA, PS, SA, and PSO. The outliers, i.e., the estimates that fall outside the lower and upper quartiles, are plotted in red (+). The median and mean are represented by a red line across the box plots and black dot, respectively. The true values are shown as a black dotted line. The estimates are computed from a generated data set containing 20, 50, and 100 sample points.
Plots of the solutions
The numerical solution
The mean cost function value in models 1 to 3 using GA-MPC, GA, PS, SA, and PSO for different maximum numbers of function evaluations. The maximum number of function evaluations is set to 500, 1000, 1500, and 2000 times the number of parameters to be estimated.
Plots of the solutions
Plots of the solutions
Plots of the solutions
To analyze the uncertainty in the estimates, we apply the bootstrapping method to the generated data.
C. Coupled Pendulum-Mass-Spring-Damper System
Consider a mechanical system consisting of a mass $M$ attached to a spring with stiffness $K$ and a damper with damping coefficient $C$, coupled to a pendulum; the coupling introduces a delayed acceleration term, giving the second-order NDDE \begin{equation*} M\ddot {x}(t) + C\dot {x}(t)+Kx(t)+m\ddot {x}(t-\tau)=0, \tag{7}\end{equation*}
which, in dimensionless form, becomes \begin{equation*} \ddot {y}+2\zeta \dot {y}+y+p\ddot {y}(t-\tau)=0,\tag{8}\end{equation*} where $\zeta $ is the damping ratio and $p$ is the coefficient of the delayed acceleration.
We estimate the parameters of (8) from noisy data generated in the same way as in the age-structured NDDE model. First, clean data is obtained by numerically solving (8) using the MATLAB function \texttt{ddensd}, and noise is then added to the samples. The minimization problem is given by \begin{equation*} \min _{\theta \in \mathbb {R}^{3}} \dfrac {\sum \limits _{i=1}^{n} \left ({y^\star _{i} - y_\theta (t_{i})}\right)^{2}}{\sum \limits _{i=1}^{n} \left ({y^\star _{i} }\right)^{2}}, \end{equation*}
where the parameter vector is \begin{equation*} \theta:=[\tau,\zeta,p].\end{equation*}
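Since Python has no standard NDDE solver (the paper uses a MATLAB solver), here is a rough method-of-steps sketch of how clean data for (8) could be generated; the constant history and the simple stepping scheme are illustrative assumptions, not the paper's setup:

```python
import numpy as np

def simulate_pmsd(zeta, p, tau, y0=1.0, T=20.0, m=200):
    """Fixed-step method-of-steps simulation of Eq. (8):
        y'' + 2*zeta*y' + y + p*y''(t - tau) = 0.

    dt = tau/m so that t - tau falls exactly on a stored grid point; the
    constant history y = y0, y' = y'' = 0 for t <= 0 is an assumption made
    here for illustration.
    """
    dt = tau / m
    n = int(round(T / dt))
    y = np.empty(n + 1)
    v = np.empty(n + 1)
    y[0], v[0] = y0, 0.0
    acc = np.zeros(m + n + 1)                  # y'' on [-tau, T]; zero history
    for i in range(n + 1):
        a_delayed = acc[i]                     # stored y''(t_i - tau)
        acc[m + i] = -2.0 * zeta * v[i] - y[i] - p * a_delayed
        if i < n:                              # semi-implicit Euler step
            v[i + 1] = v[i] + dt * acc[m + i]
            y[i + 1] = y[i] + dt * v[i + 1]
    return np.linspace(0.0, n * dt, n + 1), y

# damped response: the oscillation amplitude decays from its initial value
t, y = simulate_pmsd(zeta=0.1, p=0.2, tau=1.0)
print(bool(np.all(np.isfinite(y)) and abs(y[-1]) < abs(y[0])))
```

Sampling this trajectory at the measurement times and adding Gaussian noise reproduces the kind of synthetic data used in the fits, although a production run should use an adaptive NDDE solver such as MATLAB's \texttt{ddensd}.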
Figure 8 shows the zoomed-in box plots of the parameter estimates of the PMSD NDDE model. Aside from PS, the heuristic algorithms performed well in the parameter estimation. This might be attributed to having only three parameters to estimate. We note that there are many outliers in the estimates obtained using PS.
Plots of the solutions
Plots of the solutions
Plots of the solutions
Results of the parameter bootstrapping method show small standard errors of the parameter estimates in the PMSD NDDE model with mean estimates lying within the 95% confidence intervals (Table 6). Figure 10 shows a good fit to the data of most of the solutions using the re-estimated parameters.
D. Varying Maximum Number of Function Evaluations
The only stopping criterion used in the previous simulations is the maximum number of function evaluations, and this is set to 2000 times the number of parameters to be estimated. Results show that GA-MPC consistently obtained good parameter estimates in the three models. Now, we look at how varying the maximum number of function evaluations affects the performance of GA-MPC, GA, PS, SA, and PSO in parameter estimation.
We consider setting the stopping criterion to 500, 1000, 1500, or 2000 times the number of parameters to be estimated.
Conclusion
In this work, we examine the use of a modified genetic algorithm called GA-MPC in solving parameter estimation problems involving NDDEs. We test our method on three different NDDE models – a first-order NDDE, a system of first-order NDDEs, and a second-order NDDE. We look at the performance of GA-MPC in comparison to other heuristic algorithms and incorporate parameter uncertainty analysis to examine the extent of uncertainties in the estimates. For consistency of comparison, the default values of the parameters in the algorithms used are unchanged and the only stopping criterion is the maximum number of function evaluations. The simulations show that heuristic algorithms, particularly GA-MPC, can provide good estimates for the parameter values and a good fit to experimental data. The high accuracy and good consistency of the parameter estimates using GA-MPC are observed in all three different types of NDDE models we presented.
An advantage of using GA-MPC in solving the least-squares formulation arising from the parameter estimation problem is that it does not depend on the explicit formulation or the derivatives of the solution to the NDDE. Although heuristic algorithms are computationally expensive, they are robust and are easy to implement.
Many factors contribute to the performance of heuristic algorithms in solving optimization problems. For future research, one can explore modifying the default parameter values or the stopping criteria in the algorithms. One can also consider other recent heuristic algorithms and hybrids of heuristic algorithms in solving inverse problems in NDDEs. In this study, we have varied the number of data points and maximum number of function evaluations. For future work, one can use other metrics for comparison such as varying the noise level in the simulated data and adding different types of noise.
Appendix A: Additional Figures
Figures 12 to 14 show the plots of the solutions to the age-structured NDDE model using parameter estimates obtained from the data sets containing 20, 50, and 100 points using GA, PS, SA, and PSO. The plots of the solutions $U_\theta $ and $V_\theta $ complement the GA-MPC results presented in Section III.