Introduction
As a high-level technical computing language and development environment, MATLAB® has over 4,000,000 registered users and over 525,000 contributors to its official repository [1]. It provides five built-in ‘derivative-free optimizers’ (DFOs) in the form of heuristic and direct-search algorithms. These DFOs are powerful tools suited to a broad range of real-world optimization applications and have hence been widely used by optimization practitioners [2]–[4].
However, with the development of optimizers beyond conventional means, it has become challenging for a practicing engineer or scientist to select the most suitable one for the application at hand, especially without a common ground or a full set of assessment criteria. At present, there exists no widely applicable theoretical or analytical means to compare the performance of numerical optimizers. Further, benchmarks for comparing optimizers through tests are incomplete, relying mainly on fitness and convergence [5], even though a large number of benchmarking tests have been reported [6]. It is thus still difficult to assert quickly whether one particular algorithm is indeed ‘better’ or ‘more suitable’ than another for a practitioner’s application.
This paper aims to help practitioners select rapidly a numerical optimizer, and in particular a DFO, suitable for their application at hand without first needing to become an expert in these algorithms. Therefore, we establish a full set of benchmarks for comparing DFOs, extending beyond those currently available in the literature. Further, a scoring system for benchmarking DFOs and a decision tree are developed to facilitate their selection and use without deep knowledge of the DFOs on the user’s part. We carry out complete benchmark tests against four Congress on Evolutionary Computation (CEC) benchmarking problems [7]–[11], and then verify the conclusions against a practical application - the particle filtering (PF) problem [12].
Included in the benchmarking tests are five MATLAB built-in DFO functions [2]–[4], i.e. simulated annealing (SA), particle swarm optimization (PSO), the genetic algorithm (with elitism, GAe), simplex search (SS), and pattern search (PS), plus one third-party implementation of the widely used Powell’s conjugate (PC) method (as an open-source m-file available from MATLAB’s official user repository [13]) recommended by MathWorks® [7]. Among these six DFO algorithms, the SS, PS and PC are direct search algorithms and the other three are heuristics, where the GAe uses encoding.
The following section analyzes the four CEC benchmark problems to be adopted in benchmarking. Then, Section III develops the full set of benchmarks for comparing numerical optimization algorithms. Section IV undertakes benchmarking tests, and then discusses the results with recommendations. Given the recommendations, Section V tests the DFOs further, in a case study of a nonlinear particle filter application. Section VI summarizes the paper and draws conclusions.
Benchmarking Analysis for MATLAB DFOs
A. Optimization Objectives and Solutions
For evaluating the performance of optimization, search or machine learning algorithms, consider a benchmarking optimization problem. Suppose that the objective function or ‘performance index’ of the optimization, subject to direct and/or indirect constraints, is represented by:\begin{equation*} \boldsymbol {f}\left ({\boldsymbol {x} }\right):\boldsymbol {X} \to \boldsymbol {F}\tag{1}\end{equation*} where x denotes the vector of decision variables, X the decision (search) space and F the objective space.
As MATLAB DFOs usually address a single objective, which can nonetheless include all elements of f through preference weighting, we consider the case \begin{equation*} f_{0}=\min _{\boldsymbol x \in \boldsymbol X}{f\left ({\boldsymbol {x} }\right)}\tag{2}\end{equation*}
Note that a non-numerical decision variable, including a ‘logic’ variable (such as True/On or False/Off) and a ‘category’ variable (such as R, L or C in an electronic circuit), may only take a discrete value in the search space X. Let x_0 denote the theoretical global optimum, i.e. the solution satisfying \begin{equation*} f\left ({\boldsymbol {x}_{0} }\right)=f_{0}\tag{3}\end{equation*}
Let the best solution actually found by an optimizer under test be denoted by x̂_0, with the corresponding objective value \begin{equation*} \hat {f}_{0}=f\left ({\hat {\boldsymbol x}_{0} }\right)\tag{4}\end{equation*}
According to evolutionary algorithm competitions in IEEE CEC’14 and CEC’15, many real-world problems are multimodal and can be represented by a set of benchmark problems or, analytically, test functions [8], [9]. Local optimization is concerned with unimodality, and global optimization with multimodality. Given possible modal differences in different dimensions, four benchmark problems of unimodal, multimodal and hybrid types are used in this paper to illustrate and analyze benchmarking with thorough comparison. Further, all the functions are tested in 10 and 30 dimensions, with 30 repeated runs each [8], [9].
B. Unimodal Problem
One of the basic test functions is a unimodal function, such as the Quartic test function \begin{equation*} f_{1}\left ({\boldsymbol {x} }\right)=\sum \nolimits _{i=1}^{D} {ix_{i}^{4}+random[0,1)}\tag{5}\end{equation*}
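As an illustration, a minimal MATLAB sketch of (5) is given below for a row-vector input; following the common definition of the noisy quartic function, the uniform noise term is assumed to be added once per evaluation.

% Quartic test function (5) with uniform noise, for a row vector x
f1 = @(x) sum((1:numel(x)) .* x.^4) + rand;
f1(rand(1, 10))      % example evaluation in 10 dimensions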
C. Multimodal Problems
In addition to multidimensionality, multimodality is also a common feature seen in practical applications. Two example benchmark problems are given below.
1) Varying Landscape Multimodal Problem
Consider the following varying-landscape multimodal test function:\begin{equation*} f_{2}\left ({\boldsymbol {x} }\right)=\sum \nolimits _{i=1}^{D} {f_{i}\left ({x_{i} }\right)} =\sum \nolimits _{i=1}^{D} {sin\left ({x_{i} }\right)\,{sin}^{2m}\left ({\frac {ix_{i}^{2}}{\pi } }\right)}\tag{6}\end{equation*}
The objective function f_2 is to be maximized and its landscape varies from one dimension to another. The larger the product mD is, the sharper the landscape becomes, and hence the harder the problem is for the optimizer. There are D! local maxima within the search space [0, π]^D (2.6525×10^{32} for D = 30). The theoretical benchmark solution to this D-dimensional optimization problem may be obtained by maximizing the D independent uni-dimensional functions f_i, for all i in {1, ..., D}, a fact that is however unknown to the optimizer under test. For m = 100, the results for D = 10 and D = 30 are f_2(x) ∈ [0, 9.56] and f_2(x) ∈ [0, 29.63], respectively, with the corresponding theoretical solutions shown in Tables 1 and 2. The ease of obtaining theoretical benchmarks regardless of D makes this problem ideal for studying the nondeterministic polynomial (NP) characteristics of the algorithms being tested.
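Because the terms of (6) are separable, the theoretical benchmark can be computed one dimension at a time. The MATLAB sketch below assumes a row-vector input and uses a dense 1-D grid search per dimension; variable names are illustrative.

% f2 in (6), to be maximized over [0, pi]^D, with m = 100 as in the text
m  = 100;
f2 = @(x) sum(sin(x) .* sin(((1:numel(x)) .* x.^2) / pi).^(2*m));

% Theoretical benchmark via an independent dense grid search per dimension
D  = 10;
xg = linspace(0, pi, 1e6);
xstar = zeros(1, D);
for i = 1:D
    [~, k] = max(sin(xg) .* sin((i * xg.^2) / pi).^(2*m));
    xstar(i) = xg(k);
end
f2max = f2(xstar);    % theoretical maximum for the chosen D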
2) Expanded Schaffer’s Multimodal Problem
Consider a third test function \begin{align*} f_{3}\left ({\boldsymbol {x} }\right)=&g\left ({x_{1},x_{2} }\right)+g\left ({x_{2},x_{3} }\right)+\cdots +g\left ({x_{D-1},x_{D} }\right) \\&+\,g\left ({x_{D},x_{1} }\right) \\ g\left ({x,y }\right)=&0.5+\frac {\left ({{sin}^{2}\left ({\sqrt {x^{2}+y^{2}} }\right)-0.5 }\right)}{\left ({1+0.001\left ({x^{2}+y^{2} }\right) }\right)^{2}}\tag{7}\end{align*}
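For reference, the expanded Schaffer F6 function (7) can be written compactly in MATLAB as below; circshift pairs each x_i with x_{i+1} and wraps the last pair back to x_1.

% Expanded Schaffer F6 function (7), for a row vector x
g  = @(x, y) 0.5 + (sin(sqrt(x.^2 + y.^2)).^2 - 0.5) ./ (1 + 0.001*(x.^2 + y.^2)).^2;
f3 = @(x) sum(g(x, circshift(x, -1)));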
D. Hybrid Problem
In real-world optimization, different subsets of the decision variables may have diverse properties, as in a hybrid test function where the variables are grouped into subsets. Such a function is used to test diverse properties of the problem that may be encountered in practice. Hence, the fourth test function is given by \begin{align*} f_{4}\left ({\boldsymbol {x} }\right)=&0.3~f_{41}\left ({\boldsymbol {x} }\right)+0.3~f_{42}\left ({\boldsymbol {x} }\right)+0.4~f_{43}\left ({\boldsymbol {x} }\right) \\[3pt] f_{41}\left ({\boldsymbol {x} }\right)=&418.9829 ~D-\sum \nolimits _{i=1}^{D} {g\left ({z_{i} }\right)},\tag{8}\\[3pt] z_{i}=&x_{i}+4.209687462275036\times {10}^{2}\tag{9}\\ g\left ({z_{i} }\right)=&\begin{cases} z_{i}sin\left ({\left |{ z_{i} }\right |^{1/2} }\right) \qquad if \left |{ z_{i} }\right |\le 500\\[3pt] \left ({500-mod\left ({z_{i},500 }\right) }\right) \\[3pt] \times sin\left ({\sqrt {\left |{ 500-mod\left ({z_{i},500 }\right) }\right |} }\right)\\[3pt] \quad -\dfrac {\left ({z_{i}-500 }\right)^{2}}{10000D} \quad if~z_{i}>500\\[3pt] \left ({mod\left ({\left |{ z_{i} }\right |,500 }\right)-500 }\right) \\[3pt] \times sin\left ({\sqrt {\left |{ mod\left ({\left |{ z_{i} }\right |,500 }\right)-500 }\right | }}\right)\\ ~\qquad -\dfrac {\left ({z_{i}+500 }\right)^{2}}{10000D} \quad if z_{i}< -500 \end{cases}\tag{10}\\ f_{42}\left ({\boldsymbol {x} }\right)=&\sum \nolimits _{i=1}^{D} \left ({x_{i}^{2}-10cos\left ({2\pi x_{i} }\right)+10 }\right)\tag{11}\\ f_{43}\left ({\boldsymbol {x} }\right)=&\sum \nolimits _{i=1}^{D} \left ({{10}^{6} }\right)^{\frac {i-1}{D-1}} x_{i}^{2}\tag{12}\end{align*}
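A direct MATLAB transcription of (8)–(12) is sketched below; it assumes a row-vector input and D > 1 (so that the exponent in (12) is well defined).

function f = f4_hybrid(x)
% Hybrid test function (8)-(12): weighted modified Schwefel, Rastrigin
% and high-conditioned elliptic components.
D = numel(x);  i = 1:D;
z = x + 4.209687462275036e2;                                    % (9)
g = zeros(1, D);
for k = 1:D
    zk = z(k);
    if abs(zk) <= 500
        g(k) = zk * sin(sqrt(abs(zk)));
    elseif zk > 500
        g(k) = (500 - mod(zk, 500)) * sin(sqrt(abs(500 - mod(zk, 500)))) ...
               - (zk - 500)^2 / (10000 * D);
    else                                                        % zk < -500
        g(k) = (mod(abs(zk), 500) - 500) * sin(sqrt(abs(mod(abs(zk), 500) - 500))) ...
               - (zk + 500)^2 / (10000 * D);
    end
end
f41 = 418.9829 * D - sum(g);                                    % (8), (10)
f42 = sum(x.^2 - 10 * cos(2 * pi * x) + 10);                    % (11)
f43 = sum((1e6).^((i - 1) / (D - 1)) .* x.^2);                  % (12)
f   = 0.3 * f41 + 0.3 * f42 + 0.4 * f43;
end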
In summary, the four test functions f_1 to f_4 together cover unimodal, multimodal and hybrid characteristics, providing a representative basis for the benchmarking tests that follow.
Full Set of Benchmarks for Evaluating Numerical Optimizers
A. Optimality Benchmark
Optimality represents the relative closeness of a found optimum to the theoretical optimum, normalized by the range of the objective values over the search space:\begin{equation*} Optimality=1-\frac {\left \|{ f_{0}-\hat {f}_{0} }\right \|}{\left \|{ \overline f -\underline {f} }\right \|}\in \left [{ 0,1 }\right]\tag{13}\end{equation*}
For a single-objective problem (i.e. with a scalar objective f), the optimality for minimization simplifies to:\begin{equation*} {Optimality}_{min}=1-\frac {\left |{ f_{min}-\hat {f}_{0} }\right |}{\left |{ f_{max}-f_{min} }\right |}=\frac {f_{max}- \hat {f}_{0}}{f_{max}-f_{min}}\tag{14}\end{equation*}
Similarly, the optimality for the corresponding maximization problem can be simplified to:\begin{equation*} {Optimality}_{max}=1-\frac {\left |{ f_{max}-\hat {f}_{0} }\right |}{\left |{ f_{max}-f_{min} }\right |}=\frac {\hat {f}_{0}-f_{min}}{f_{max}-f_{min}}\tag{15}\end{equation*}
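For a scalar objective whose extreme values over the search space are known, (14) and (15) each reduce to one line of MATLAB; fhat0 denotes the best objective value found, and fmin and fmax the extremes.

optimality_min = (fmax - fhat0) / (fmax - fmin);   % minimization, (14)
optimality_max = (fhat0 - fmin) / (fmax - fmin);   % maximization, (15)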
B. Accuracy Benchmark
Is it necessary to introduce an accuracy benchmark in the decision (control) space in addition to the Optimality benchmark, which already reflects a ‘performance index’ in the objective space? The necessity lies in the fact that we are assessing global optimization, which involves multiple optima and is often approached stochastically by a nondeterministic algorithm. As a solution of near-highest optimality may still correspond to a local optimum, the quality of a found solution cannot be judged in the objective space alone; its closeness to the theoretical optimum in the decision space must also be assessed.
Here, the accuracy benchmark is defined as the relative closeness of the found solution to the theoretical optimal solution, normalized by the size of the search space:\begin{equation*} Accuracy=1-\frac {\left \|{\boldsymbol x_{0}-\widehat {\boldsymbol x}_{0}}\right \|}{\|\overline {\boldsymbol x}-\underline {\boldsymbol x}\|} \in [{0,1}]\tag{16}\end{equation*}
By (16), the accuracy with respect to a single-parameter minimization problem within the search range [x_min, x_max] simplifies to:\begin{equation*} {Accuracy}_{D=1}=1-\frac {\left |{ \hat {x}_{0}-x_{0} }\right |}{x_{max}-x_{min}}\tag{17}\end{equation*}
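Correspondingly, (16) is a one-liner given the theoretical optimum x0, the found solution xhat0 and the search bounds xlb and xub (all assumed row vectors):

accuracy = 1 - norm(x0 - xhat0) / norm(xub - xlb);   % (16)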
C. Convergence Benchmarks
To date, convergence of population-based algorithms is benchmarked only qualitatively, by graphically tracing fitness or cost against the number of iterations, generations or function evaluations (FEs). With this paper, convergence can be measured both qualitatively and quantitatively.
Qualitatively, we adopt the method of graphical tracing, but on both the optimality and accuracy benchmarks instead of on fitness alone, i.e.:
The highest ‘optimality’ or fitness in every generation;
The highest ‘accuracy’, i.e. the accuracy of the parameter values of the individual solution that has the highest optimality (fitness) in every generation.
1) Reach Time
Quantitatively, we define a ‘reach time’ as:\begin{equation*} Reach~time\vert _{b}=C^{b}\tag{18}\end{equation*} where C^b denotes the number of function evaluations taken for the optimality first to reach the benchmark level b.
2) NP-Time
The power of a heuristic DFO is that it reduces the otherwise exponential time of an exhaustive search to a nondeterministic polynomial time. To estimate the order of the polynomial, the reach time for 99.9% optimality is recorded as a function of the problem dimension D:\begin{equation*} NP-time\left ({D }\right)=C^{0.999}\left ({D }\right)\tag{19}\end{equation*}
3) Total Number of Function Evaluations
An optimality of 99.9% may not be reached by certain algorithms under test. The total number of function evaluations is the number of evaluations, search trials or simulations performed until the optimization process is terminated. For benchmarking, the total number of evaluations is kept the same for all algorithms, such as \begin{equation*} N=min\left \{{C^{0.999},400mD^{2} }\right \}\tag{20}\end{equation*}
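In MATLAB, the function-evaluation count and the reach time can be recorded by wrapping the objective. The sketch below uses nested functions that share one workspace; the 99.9% threshold follows (18)–(20), fmin and fmax are assumed known for computing optimality, and all names are illustrative.

function [fwrapped, getStats] = wrap_objective(f, fmin, fmax)
% Wrap an objective so that every call is counted and the FE count at
% which optimality first reaches 99.9% (the reach time C^0.999) is stored.
nFE = 0;  reachFE = NaN;
fwrapped = @count_and_eval;
getStats = @get_stats;
    function y = count_and_eval(x)
        y   = f(x);
        nFE = nFE + 1;
        opt = (fmax - y) / (fmax - fmin);      % optimality (14)
        if isnan(reachFE) && opt >= 0.999
            reachFE = nFE;                     % reach time (18)
        end
    end
    function s = get_stats()
        s = struct('nFE', nFE, 'reachFE', reachFE);
    end
end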
D. Optimizer Overheads Benchmark
The ‘total CPU time’, T, of an optimization run comprises the time T_FE spent on function evaluations and the time consumed by the optimizer itself. The optimizer overheads are hence defined as:\begin{equation*} Optimizer~overheads=\frac {T-T_{FE}}{T_{FE}}\tag{21}\end{equation*}
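A rough way to estimate (21) in MATLAB is to time a complete run and then re-time the same number of standalone objective evaluations; fobj, lb, ub and D are assumed to be defined, and particleswarm is used purely as an example.

tic; [xhat, fhat, ~, out] = particleswarm(fobj, D, lb, ub); T = toc;
N = out.funccount;                           % FEs used by the run
tic;
for k = 1:N
    fobj(lb + rand(1, D) .* (ub - lb));      % re-time N standalone evaluations
end
T_FE = toc;
overheads = (T - T_FE) / T_FE;               % (21)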
E. Sensitivity Benchmark
When the values of the optimal parameters found are perturbed, the optimality may well change, which affects the reliability or robustness of the solution found and can hence impact the quality of real-world applications. This is critical, for example, in industrial design, where manufacturing tolerances may cause the quality of the end product to deteriorate suddenly as a result of sensitive optimality or polarized optimization.
In this paper, “sensitivity” is hence used to indicate the robustness of the optimizer by measuring how much relative change in the optimality will result from a ‘small’ relative change in the corresponding solution found. Quantitatively, the sensitivity is hence defined as:\begin{equation*} Sensitivity=\frac {d(Optimality)}{d(Accuracy)}=\left \|{ \nabla \hat {\boldsymbol f} }\right \|\frac {\left \|{ \overline {\boldsymbol {x}} -\underline {\boldsymbol {x}} }\right \|}{\left \|{ \overline {\boldsymbol {f}} -\underline {\boldsymbol {f}} }\right \|}\tag{22}\end{equation*}
In a derivative-free setting, where the gradient is not available, the sensitivity is estimated by finite differences as:\begin{equation*} Sensitivity\approx \frac {\left \|{ \Delta \hat {\boldsymbol f} }\right \|}{\left \|{ \Delta \hat {\boldsymbol x} }\right \|}\frac {\left \|{ \overline {\boldsymbol {x}} -\underline {\boldsymbol {x}} }\right \|}{\left \|{ \overline {\boldsymbol {f}} -\underline {\boldsymbol {f}} }\right \|}\tag{23}\end{equation*}
It is convenient to set the perturbation of the found solution to a small fraction of the search range when evaluating (23) numerically.
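As an illustration, (23) can be estimated in MATLAB by a finite-difference perturbation of the found solution; the 0.1% step below is an assumed, illustrative choice, and xhat0, xlb, xub, fmin, fmax and fobj are as defined above.

dx   = 1e-3 * (xub - xlb);                   % small perturbation of the solution
df   = fobj(xhat0 + dx) - fobj(xhat0);
sens = (abs(df) / norm(dx)) * (norm(xub - xlb) / abs(fmax - fmin));   % (23)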
In addition to sensitivity, the two-tailed “t-test” can be used to assess, to a certain degree, the reliability of a comparison between two algorithms [14], [18]. This statistical hypothesis test is usually used to determine whether two sets of data are significantly different from each other [19], and it is used here as a complementary approach for further validation of a comparison when necessary.
Benchmarking MATLAB DFOs
A. Test Conditions
Using the defined benchmarks and the chosen benchmark problems, we investigate the performance of the MATLAB (R2016b) heuristics SA, PSO and GAe, and the direct-search optimizers, SS, PS, and PC, through extensive experimental tests. Since the main purpose of this work is to help MATLAB DFO users select an algorithm without the need for in-depth knowledge of the algorithms, MATLAB’s default settings for these algorithms are used in the tests (unless otherwise stated) and all the DFOs are tested on the same random set of starting points.
Therefore, for SS, the reflection coefficient is 1, the expansion coefficient 2, the contraction coefficient 0.5, and the shrink coefficient 0.5.
For SA, the initial temperature is 100, and the temperature is reduced to 95% of its previous value at each iteration.
For PSO, the swarm size and all other coefficients take MATLAB’s default values.
MATLAB’s genetic algorithm uses elitism, hence named GAe here, which guarantees that the best 5% of a generation survives to the next generation. For the GAe, the population size and the remaining parameters likewise take MATLAB’s default values.
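For reference, the five built-in DFOs can be invoked with their default settings as sketched below (the Global Optimization Toolbox is required for all but fminsearch); the third-party PC implementation [13] is called analogously from its File Exchange m-file and is omitted here. The objective fobj, bounds lb and ub, dimension D and shared starting point x0 are assumed to be defined.

[xSS,  fSS ] = fminsearch(fobj, x0);                               % simplex search (SS)
[xPS,  fPS ] = patternsearch(fobj, x0, [], [], [], [], lb, ub);    % pattern search (PS)
[xSA,  fSA ] = simulannealbnd(fobj, x0, lb, ub);                   % simulated annealing (SA)
[xPSO, fPSO] = particleswarm(fobj, D, lb, ub);                     % particle swarm (PSO)
[xGA,  fGA ] = ga(fobj, D, [], [], [], [], lb, ub);                % genetic algorithm (GAe)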
Each experiment is repeated 30 times to obtain a mean value of each benchmark. As SA, SS, PS and PC are unary (as opposed to population-based) optimizers, one starting point is generated randomly for each run and used identically by all of these optimizers for fair comparison. Note, however, that when displaying convergence traces for these unary optimizers, one point is plotted for every fixed number of function evaluations, so that their traces are comparable with those of the population-based optimizers.
In order to compare the optimizer overheads later, the CPU time, albeit less accurate in MATLAB than in a real-time program, is recorded both for the complete optimization run (T) and for the function evaluations alone (T_FE).
B. Benchmarking DFOs on Four Test Problems
The performance of the six individual algorithms on each of the five benchmarks was first ranked and then assigned scores from 6 to 1, with 6 representing the best. For example, if the mean optimality of PSO is the highest among the six algorithms, then it scores 6 on Optimality. The scores of each algorithm on all five benchmarks are summed to give the algorithm a total score. Results from all the experiments are summarized in Tables 4–7, where bold indicates the best on each benchmark and red the best overall score and the corresponding algorithm.
1) Quartic Unimodal Problem f_{1}
It can be observed that, for 10-D, all the algorithms except SS and SA offer optimality over 99.9995%. PC offers the best overall performance, although PC, PS and PSO each win on three benchmarks. From both the 10-D and 30-D tests, it can be seen that PC offers the best overall performance and should hence be recommended for unimodal problems. Further, PC also offers the most ‘linear’ polynomial time (although NP), shown in the last column, whilst SS and SA are the worst. As expected, PSO and GAe are also relatively ‘polynomial’. In contrast, SS and SA offer the lowest performance with the highest overheads, and hence should be avoided for this type of problem. Fig. 5 depicts typical optimality convergence of the best point in a ‘generation’.
Typical optimality of each DFO in optimizing the unimodal quartic problem
2) Varying Landscape n-D Multimodal Problem f_{2}
This function has multiple extrema in every dimension of the search space. Table 5 summarizes the test results on this problem, and Fig. 6 depicts typical convergence traces.
Typical optimality of each DFO in optimizing the varying landscape problem
3) Expanded Schaffer’s F6 Multimodal Problem f_{3}
Table 6 summarizes the test results on this problem. Fig. 7 shows the convergence traces of the algorithms. It can be seen that PSO delivered consistently good performance and the best overall score, whilst the GAe again consistently demonstrated the highest optimality and the highest accuracy in both the 10-D and 30-D tests. Convergence behavior similar to that on f_2 can also be observed.
Typical optimality of each DFO in optimizing Schaffer’s F6 problem
4) Hybrid Problem f_{4}
Table 7 summarizes the test results on this function. More optimizer overheads were expected due to the complexity of this function; except for SA and SS, the overheads of the algorithms are less than 2000. The results depicted in Fig. 8 show that PC has a convergence tendency similar to that on the unimodal problem f_1.
Typical optimality of each DFO in optimizing the hybrid problem
C. Comparison on Individual Benchmarks
1) Optimality
Table 8 presents the scores and ranking of the six algorithms on optimality in solving problems f_1 to f_4.
2) Accuracy
Table 9 ranks the six algorithms on their accuracy. It can be seen that PSO offers the highest accuracy on the multimodal problems, while PS leads on the unimodal problem.
3) Reach Time
Table 10 ranks the algorithms on reach time, where a shorter reach time indicates faster convergence to the 99.9% optimality level.
4) Optimizer Overheads
Table 11 ranks on optimizer overheads. Note that the ranking of each algorithm remains unchanged across the problems and dimensions, suggesting that optimizer overheads of the algorithms are not problem-dependent. It is therefore reasonable to conclude that, where reducing computational overheads is a priority, PC would be the most suitable, followed by PSO and GAe.
5) Sensitivity
Table 12 ranks on sensitivity. It can be seen that PC and PSO would be the choice, if low sensitivity is a major concern in practice.
D. Quick Guide to Selecting a DFO
To best suit the application at hand, the user can select a DFO using either its ranking on an individual benchmark or the overall ranking on amalgamated scores in Table 13, which is summarized from Tables 8–12. For example, if the objective space or optimality is of paramount importance for the application, then the GAe would be the No. 1 tool, as it would also be for a potentially multimodal or relatively complex problem.
If the decision space or accuracy is the goal, then PSO and PS would be the choice, with the former more suitable for a multimodal problem and the latter for a unimodal one. For unknown modality or a relatively simple problem, PC provides a rapid assessment of the possible effects of applying optimization to the problem at hand.
If the application is rather unknown, however, it is recommended that it could first be treated as a multimodal problem for global optimization using the GAe, which will take a longer time, and then as a unimodal problem for local optimization using PC, which will be a lot quicker. From the above analysis, a visual guide is provided as a decision tree in Fig. 9.
A simplified decision tree for selecting a DFO. Note that if an application requires optimizing its structure, the GAe is recommended, since its encoding capability permits optimization of structures and categories, in addition to numerical parameters of a structure. Relevantly, the GAe is recommended for global search offline together with PC for local tuning or rapid learning and adaptation online.
Case Study - Particle Filtering
A. The Stochastic Nonlinear Filter
The particle filter (PF) uses a set of particles (i.e., samples) to represent the posterior distribution of a stochastic process given partial observations. It is a Monte Carlo algorithm for solving Bayesian estimation and signal processing problems. Second only to the median filter, the PF is so far the most widely used nonlinear filter [20], following the sequential importance resampling (SIR) algorithm developed by Gordon et al. [21].
A PF can be formalized in the generic form of a nonlinear system:\begin{align*} \boldsymbol {x}_{k}=&f\left ({\boldsymbol {x}_{k-1},\boldsymbol {u}_{k-1} }\right)\tag{24}\\ \mathbf {z}_{k}=&h\left ({\boldsymbol {x}_{k},\boldsymbol {v}_{k-1} }\right)\tag{25}\end{align*} where x_k is the state vector, z_k the observation vector, and u_{k-1} and v_{k-1} the process and measurement noise, respectively.
The objective of the PF is to estimate recursively the posterior density \begin{equation*} p\left ({\boldsymbol {x}_{0:k}\vert \boldsymbol {z}_{1:k}}\right) =p\left ({\boldsymbol {x}_{0:k-1}\vert \boldsymbol {z}_{1:k-1}}\right) \frac {p\left ({\boldsymbol {z}_{k}\vert \boldsymbol {x}_{k}}\right) p\left ({\boldsymbol {x}_{k}\vert \boldsymbol {x}_{k-1}}\right) }{p\left ({\boldsymbol {z}_{k}\vert \boldsymbol {z}_{1:k-1}}\right) }\tag{26}\end{equation*}
or, equivalently, up to a normalizing constant:\begin{equation*} p\left ({\boldsymbol {x}_{0:k}\vert \boldsymbol {z}_{1:k}}\right) \propto p\left ({\boldsymbol {x}_{0:k-1}\vert \boldsymbol {z}_{1:k-1}}\right) p\left ({\boldsymbol {z}_{k}\vert \boldsymbol {x}_{k}}\right) p\left ({\boldsymbol {x}_{k}\vert \boldsymbol {x}_{k-1}}\right)\tag{27}\end{equation*}
The above formulas involve integrals that are analytically intractable for a general nonlinear, non-Gaussian system; the PF therefore approximates the posterior numerically.
To estimate the posterior distribution of the transition, the PF uses an ‘importance sampling’ technique to sample particles first [22]:\begin{equation*} q\left ({\mathbf {x}_{0:k}\vert \mathbf {z}_{1:k}}\right) =q\left ({\mathbf {x}_{0:k-1}\vert \mathbf {z}_{1:k-1}}\right) q\left ({\mathbf {x}_{k}\vert {\mathbf {x}_{0:k-1},\mathbf {z}_{1:k}}}\right)\tag{28}\end{equation*}
The unnormalized importance weight of particle i is then given by:\begin{equation*} w_{k}^{i}=\frac {p\left ({\boldsymbol {x}_{0:k}\vert \boldsymbol {z}_{1:k}}\right)}{q\left ({\boldsymbol {x}_{0:k}\vert \boldsymbol {z}_{1:k}}\right) }\propto w_{k-1}^{i}\frac {p\left ({\boldsymbol {z}_{k}\vert \boldsymbol {x}_{k}^{i}}\right) p\left ({\boldsymbol {x}_{k}^{i}\vert \boldsymbol {x}_{k-1}^{i}}\right) }{q\left ({\boldsymbol {x}_{k}^{i}\vert {\boldsymbol {x}_{0:k-1}^{i}\boldsymbol {,z}}_{1:k}}\right)}\tag{29}\end{equation*}
With the common choice of an importance density that depends only on the previous state and the latest observation,\begin{equation*} q\left ({\boldsymbol {x}_{k}\vert {\boldsymbol {x}_{0:k-1}\boldsymbol {,z}}_{1:k}}\right) =q\left ({\boldsymbol {x}_{k}\vert {\boldsymbol {x}_{k-1},\boldsymbol {z}_{k}}}\right)\tag{30}\end{equation*}
the weight update simplifies to:\begin{equation*} \tilde {w}_{k}^{i}\propto \tilde {w}_{k-1}^{i}\frac {p\left ({\boldsymbol {z}_{k}\vert \boldsymbol {x}_{k}^{i}}\right) p\left ({\boldsymbol {x}_{k}^{i}\vert \boldsymbol {x}_{k-1}^{i}}\right) }{q\left ({\boldsymbol {x}_{k}^{i}\vert {\boldsymbol {x}_{k-1}^{i}\boldsymbol {,z}}_{k}}\right)}\tag{31}\end{equation*}
After normalization of the weights, the posterior is approximated by the weighted particle set:\begin{equation*} p\left ({\boldsymbol {x}_{k}\vert \boldsymbol {z}_{1:k}}\right) \approx \sum \nolimits _{i=1}^{N} \tilde {w}_{k}^{i} \delta \left ({\boldsymbol {x}_{k}-\boldsymbol {x}_{k}^{i} }\right)\tag{32}\end{equation*}
Although the PF is the second most widely used nonlinear filter, serious challenges remain. One is the particle impoverishment problem: most particles come to share a few distinct values, causing a loss of diversity and degrading the estimation.
To mitigate this, a large number of optimization algorithms have been applied to particle filtering.
B. Tests of MATLAB DFOs and Verification
In this case study, tests of all six DFOs are conducted to select the most suitable DFO for the PF and to verify the benchmark guide. In the following, the SS, PS, PC, SA, PSO and GAe versions of the PF are termed SSPF, PSPF, PCPF, SAPF, PSOPF and GAePF, respectively.
A PF application is likely a multimodal or complex problem, the nature of which could be unknown to the user. The tests will be under the representative conditions described in [18], which are widely adopted owing to their strong nonlinearity [22]–[24].
The test formulation is given by:\begin{align*} x_{k}=&1+sin\left ({w\pi k }\right)+\phi _{1}x_{k-1}+v_{k}\tag{33}\\ z_{k}=&\begin{cases} \phi _{2}x_{k}^{2}+n_{k} & k\le 30\\ \phi _{3}x_{k}-2+n_{k} & k>30\\ \end{cases}\tag{34}\end{align*}
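For orientation, a minimal plain SIR particle filter for the model (33)-(34) is sketched below in MATLAB; it is not one of the DFO-enhanced filters tested here, and the parameter values and noise statistics are illustrative assumptions rather than the test settings of [18].

rng(0);
T = 60;  N = 200;
w = 0.04;  phi1 = 0.5;  phi2 = 0.2;  phi3 = 0.5;   % assumed model parameters
sig_v = 1;  sig_n = 0.1;                           % assumed noise standard deviations

% Simulate the true state (33) and the observations (34)
x = zeros(1, T);  z = zeros(1, T);  x(1) = 1;
for k = 2:T
    x(k) = 1 + sin(w*pi*k) + phi1*x(k-1) + sig_v*randn;
end
for k = 1:T
    if k <= 30, z(k) = phi2*x(k)^2 + sig_n*randn;
    else,       z(k) = phi3*x(k) - 2 + sig_n*randn;
    end
end

% Plain SIR particle filter
xp   = x(1) + sig_v*randn(N, 1);                   % initial particle set
xhat = zeros(1, T);  xhat(1) = mean(xp);
for k = 2:T
    xp = 1 + sin(w*pi*k) + phi1*xp + sig_v*randn(N, 1);        % propagate particles
    if k <= 30, zp = phi2*xp.^2; else, zp = phi3*xp - 2; end   % predicted observations
    wk = exp(-0.5*((z(k) - zp)/sig_n).^2) + 1e-300;            % Gaussian likelihood weights
    wk = wk / sum(wk);
    c  = cumsum(wk);  idx = zeros(N, 1);
    for j = 1:N, idx(j) = find(c >= rand, 1, 'first'); end     % multinomial resampling
    xp = xp(idx);
    xhat(k) = mean(xp);                                        % state estimate
end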
1) Accuracy Analysis on State Estimations
The following three metrics were used to evaluate the performance of the six PF algorithms:\begin{align*} rmse=&\sqrt {\frac {1}{T}\sum \nolimits _{k=1}^{T} \left ({x_{k}-\hat {x}_{k} }\right)^{2}}\tag{35}\\ \overline {rmse}=&\sqrt {\frac {1}{M}\sum \nolimits _{m=1}^{M} \left [{ \frac {1}{T}\sum \nolimits _{k=1}^{T} \left ({x_{k}^{m}-\hat {x}_{k}^{m} }\right)^{2} }\right]}\tag{36}\\ {rmse}^{'}=&\sqrt {\frac {1}{M}\sum \nolimits _{m=1}^{M} \left ({x_{k}^{m}-\hat {x}_{k}^{m} }\right)^{2}}\tag{37}\end{align*}
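Assuming the true and estimated states of M independent runs are stored as M-by-T matrices Xtrue and Xest, the three metrics can be computed as follows.

sq       = (Xtrue - Xest).^2;
rmse_run = sqrt(mean(sq, 2));          % (35): one rmse per run
rmse_bar = sqrt(mean(mean(sq, 2)));    % (36): mean rmse over the M runs
rmse_k   = sqrt(mean(sq, 1));          % (37): rmse at each time step k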
The mean rmse values at each observation time over 100 simulations of the six algorithms are shown in Fig. 10. As expected from the DFO guide of Fig. 9, the mean rmse values of the GAePF are the lowest. Unexpectedly, however, the PSOPF shows poor performance. Investigation into this has uncovered that MATLAB’s PSO does not make the final population available for access after optimization, whereas the PF algorithm needs a whole set of particles.
Mean RMSE errors with observation times for 100 simulations, comparing MATLAB DFOs.
To further verify the results, the mean estimation rmse values of the six algorithms over different numbers of simulations are shown in Fig. 11, where the GAePF again yields the lowest errors.
Mean RMSE errors vs. the number of simulations for 100 simulations, comparing MATLAB DFOs.
2) Utilization of Particles and Runtime
The mean utilization of particles and the mean runtimes of the six PF algorithms are compared in this subsection.
Conclusions
In this paper, a set of five criteria for benchmarking numerical optimization algorithms has first been presented. These benchmarks reflect the performance of DFOs in various aspects. Not only have the existing optimality, accuracy and convergence benchmarks been expanded, but optimizer overheads and sensitivity benchmarks have also been developed to enrich the existing evaluation criteria for practical numerical optimization applications.
Using four relatively representative and commonly adopted benchmarking problems, the six DFO algorithms available in MATLAB have been compared and ranked on the five benchmarks. The experiments have shown that PC is the overall best for unimodal and simple hybrid problems, whilst the GAe is the overall best for multimodal and relatively complex problems.
To provide a guide for practical engineers to decide rapidly which of the DFOs would be the most suitable for their application at hand without the need for in-depth knowledge of the DFO algorithms themselves, a scoring system and a decision tree are provided in this paper. If optimality (the objective space) is of the paramount importance to the application, such as in a competition, the GAe would be the choice. If accuracy (the decision space) or sensitivity is the main concern, such as for tolerating manufacturing inaccuracies of a design-optimized product, then PSO or PS would be the choice. If the application is rather unknown, it is recommended that it could first be treated as a multimodal problem for global optimization using the GAe, which will take a longer time, and then, if desired, as a unimodal problem for local optimization using PC, which will be a lot quicker.
With the guide of the ranks and the decision tree, all six optimizers have been applied to the nonlinear problem of particle filtering. The GAe is confirmed to be better equipped to deal with such a relatively challenging optimization problem.
For future work, benchmarking and selection criteria will be extended to multi-target optimization problems.