
Benchmarks for Evaluating Optimization Algorithms and Benchmarking MATLAB Derivative-Free Optimizers for Practitioners’ Rapid Access



Abstract:

MATLAB® has built in five derivative-free optimizers (DFOs), including two direct search algorithms (simplex search, pattern search) and three heuristic algorithms (simulated annealing, particle swarm optimization, and genetic algorithm), plus a few in the official user repository, such as Powell’s conjugate (PC) direct search recommended by MathWorks®. To help a practicing engineer or scientist to choose a MATLAB DFO most suitable for their application at hand, this paper presents a set of five benchmarking criteria for optimization algorithms and then uses four widely adopted benchmark problems to evaluate the DFOs systematically. Comprehensive tests recommend that the PC be most suitable for a unimodal or relatively simple problem, whilst the genetic algorithm (with elitism in MATLAB, GAe) for a relatively complex, multimodal or unknown problem. This paper also provides an amalgamated scoring system and a decision tree for specific objectives, in addition to recommending the GAe for optimizing structures and categories as well as for offline global search together with PC for local parameter tuning or online adaptation. To verify these recommendations, all the six DFOs are further tested in a case study optimizing a popular nonlinear filter. The results corroborate the benchmarking results. It is expected that the benchmarking system would help select optimizers for practical applications.
Published in: IEEE Access ( Volume: 7)
Page(s): 79657 - 79670
Date of Publication: 14 June 2019
Electronic ISSN: 2169-3536


SECTION I.

Introduction

As a high-level technical computing language and development environment, MATLAB® has over 4,000,000 registered users and over 525,000 contributors to its official repository [1]. It builds in five ‘derivative-free optimizers’ (DFOs) in the form of heuristic and direct search algorithms. These DFOs have provided powerful tools suitable for a broad range of real-world optimization applications and have hence been widely used by optimization practitioners [2]–​[4].

However, with the development of optimizers beyond conventional means, it has become challenging for a practicing engineer or scientist to select the most suitable one for their application at hand, especially without a common ground or a full set of criteria available for assessment. At present, there exists no widely applicable theoretical or analytical means to compare the performance of numerical optimizers. Further, benchmarks for comparing optimizers through tests are incomplete, relying mainly on fitness and convergence [5], even though a large number of benchmarking tests have been reported [6]. It is thus still difficult to assert quickly whether one particular algorithm is indeed ‘better’ or ‘more suitable’ than the others for a practitioner’s application.

This paper aims to help practitioners rapidly select a numerical optimizer, and in particular a DFO, suitable for their application at hand without needing to become an expert in these algorithms first. Therefore, we establish a full set of benchmarks for comparing DFOs, extending beyond those currently available in the literature. Further, a scoring system for benchmarking DFOs and a decision tree are developed to facilitate their selection and use without the need for deep knowledge of the DFOs on the user’s part. We carry out complete benchmark tests against four Congress on Evolutionary Computation (CEC) benchmarking problems [7]–[11], and then verify the conclusions against a practical application - the particle filtering (PF) problem [12].

Included in the benchmarking tests are five MATLAB built-in DFO functions [2]–​[4], i.e. simulated annealing (SA), particle swarm optimization (PSO), the genetic algorithm (with elitism, GAe), simplex search (SS), and pattern search (PS), plus one third-party implementation of the widely used Powell’s conjugate (PC) method (as an open-source m-file available from MATLAB’s official user repository [13]) recommended by MathWorks® [7]. Among these six DFO algorithms, the SS, PS and PC are direct search algorithms and the other three are heuristics, where the GAe uses encoding.

The following section analyzes the four CEC benchmark problems to be adopted in benchmarking. Then, Section III develops the full set of benchmarks for comparing numerical optimization algorithms. Section IV undertakes benchmarking tests, and then discusses the results with recommendations. Given the recommendations, Section V tests the DFOs further, in a case study of a nonlinear particle filter application. Section VI summarizes the paper and draws conclusions.

SECTION II.

Benchmarking Analysis for Matlab DFOs

A. Optimization Objectives and Solutions

For evaluating the performance of optimization, search or machine learning algorithms, consider a benchmarking optimization problem. Suppose that the objective function or ‘performance index’ of the optimization, subject to direct and/or indirect constraints, is represented by:
\begin{equation*} \boldsymbol{f}\left({\boldsymbol{x}}\right):\boldsymbol{X} \to \boldsymbol{F}\tag{1}\end{equation*}
which is also called a fitness function in the context of maximization, or a cost function in that of minimization. Here, $\mathbf{f}\in \mathbf{F}\subseteq \mathbf{R}^{m}$ represents $m$ objective elements, and $\mathbf{x}\in \mathbf{X}\subseteq \mathbf{R}^{D}$ represents $D$ decision variables or parameters to be optimized. The benchmark function $\boldsymbol{f}$ may only be evaluated numerically through computer simulations within $\boldsymbol{X}$.

As MATLAB DFOs are usually for a single objective, which would nonetheless include all elements of $\boldsymbol{f}$ through preference weighting, we consider the case $m=1$ here for simplicity. For benchmark testing, a benchmark problem or test function usually has a known theoretical optimum:
\begin{equation*} f_{0}=\min_{\boldsymbol{x} \in \boldsymbol{X}}{f\left({\boldsymbol{x}}\right)}\tag{2}\end{equation*}
Without loss of generality, minimization is typically considered, as maximization is simply an inverse case of minimization.

Note that a non-numerical decision variable, including a ‘logic’ variable (such as True/On or False/Off) and a ‘category’ variable (such as R, L or C in an electronic circuit), may only take a discrete value in $\mathbf{R}^{1}$, in the form of an encoded $\boldsymbol{x}$ as accommodated in the genetic algorithm or genetic programming. An $\mathbf{x}_{0}\in \mathbf{X}$ that satisfies:
\begin{equation*} f\left({\boldsymbol{x}_{0}}\right)=f_{0}\tag{3}\end{equation*}
is said to be a theoretical solution of the optimization problem corresponding to $f_{0}$.

Let $\hat{\boldsymbol{x}}_{0}\in \mathbf{X}$ represent a solution found by an optimization, search or machine learning algorithm, such as a DFO. Then, the corresponding found optimum is defined as
\begin{equation*} \hat{f}_{0}=f\left({\hat{\boldsymbol{x}}_{0}}\right)\tag{4}\end{equation*}
which is usually an approximate optimum or a sub-optimum in the case of nondeterministic algorithms. Since DFOs are often nondeterministic, benchmarking should use statistical measures (such as the mean) of results over multiple test runs.

According to the evolutionary algorithm competitions at IEEE CEC’14 and CEC’15, many real-world problems are multimodal and can be represented by a set of benchmark problems or, analytically, test functions [8], [9]. Local optimization is concerned with unimodality, and global optimization with multimodality. Given possible modal differences in different dimensions, four benchmark problems of unimodal, multimodal and hybrid types are used in this paper to illustrate and analyze benchmarking with thorough comparison. Further, all the functions are tested in 10 and 30 dimensions, 30 times each [8], [9].

B. Unimodal Problem

One of the basic test functions is a unimodal function, such as the Quartic test function $f_{1}$ [14], [15]:
\begin{equation*} f_{1}\left({\boldsymbol{x}}\right)=\sum\nolimits_{i=1}^{D} {ix_{i}^{4}+random[0,1)}\tag{5}\end{equation*}
where $\mathbf{x}\in \left[{-1.28, 1.28}\right]^{D}$, $random[0,1)$ is a random noise term that helps assess the robustness of the tests, and $f_{1}\left({\mathbf{x}}\right)\in \left[{0, 147.64}\right]$ and $f_{1}\left({\mathbf{x}}\right)\in \left[{0, 1248.2}\right]$ for $D=10$ and $D=30$, respectively. A 3-D map of a 2-D case is shown in Fig. 1.
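For reference, a minimal MATLAB sketch of $f_{1}$ is given below; the function name is an illustrative assumption, and Eq. (5) is read literally with one uniform noise draw per term.

function f = quartic_f1(x)
% Quartic benchmark f1, Eq. (5); one uniform [0,1) noise draw per term,
% reading the equation literally.
D = numel(x);
f = sum((1:D) .* x(:)'.^4 + rand(1, D));
end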

FIGURE 1. Contour plot to visualize a 2-D case of $f_{1}$.

C. Multimodal Problems

In addition to multidimensionality, multimodality is also a common feature seen in practical applications. Two example benchmark problems are given below.

1) Varying Landscape Multimodal Problem

Consider the $D$-dimensional maximization problem introduced by Michalewicz [16] and further studied by Renders and Bersini [15]:
\begin{equation*} f_{2}\left({\boldsymbol{x}}\right)=\sum\nolimits_{i=1}^{D} {f_{i}\left({x_{i}}\right)}=\sum\nolimits_{i=1}^{D} {\sin\left({x_{i}}\right)\sin^{2m}\left({\frac{ix_{i}^{2}}{\pi}}\right)}\tag{6}\end{equation*}
where $\mathbf{x}\in \left[{0,\pi}\right]^{D}$. It is composed of a family of amplitude-modulated sinewaves whose frequencies are linearly modulated. A 3-D map of a 2-D case is shown in Fig. 2.
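A minimal MATLAB sketch of $f_{2}$, assuming the illustrative function name below and passing the sharpness parameter $m$ ($m=100$ in the tests reported here):

function f = michalewicz_f2(x, m)
% Varying landscape multimodal benchmark f2, Eq. (6).
D  = numel(x);
i  = 1:D;
xr = x(:)';
f  = sum(sin(xr) .* sin(i .* xr.^2 / pi).^(2*m));
end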

FIGURE 2. Visualizing a 2-D case of $f_{2}$.

The objective function $f_{2}\left ({\mathbf {x} }\right)$ is, in effect, de-coupled in every dimension represented by $f_{i}\left ({x_{\mathbf {i}} }\right)$ , for the ease of verifying optimality. This characteristic yields the following properties:

  1. The larger the product $mD$, the sharper the landscape becomes and hence the harder the problem is for the optimizer.

  2. For $D=30$, there are $D!\approx 2.6525\times{10}^{32}$ local maxima within the search space $\left[{0,\pi}\right]^{30}$.

  3. The theoretical benchmark solution to this $D$-dimensional optimization problem may be obtained by maximizing $D$ independent uni-dimensional functions, $f_{i}$, $\forall i\in \left\{{1,\cdots,D}\right\}$, a fact that is however unknown to the optimizer under test (a simple sketch of this is given after this list). The results for $D\in \left\{{10, 30}\right\}$ and for $m=100$ are $f_{2}\left({\mathbf{x}}\right)\in \left[{0, 9.56}\right]$ and $f_{2}\left({\mathbf{x}}\right)\in \left[{0, 29.63}\right]$, respectively. The corresponding theoretical solutions are shown in Tables 1 and 2.

  4. The ease of obtaining theoretical benchmarks regardless of $D$ makes it ideal for studying nondeterministic polynomial (NP) characteristics of the algorithms being tested.
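As a sketch of property 3, the theoretical solution of $f_{2}$ can be approximated dimension by dimension with a fine one-dimensional grid search; the grid resolution below is an arbitrary choice.

% Approximate the theoretical solution of f2 dimension by dimension.
D = 30;  m = 100;
xg = linspace(0, pi, 2e5);                        % fine grid over [0, pi]
x0 = zeros(1, D);  f0 = 0;
for i = 1:D
    fi = sin(xg) .* sin(i * xg.^2 / pi).^(2*m);   % i-th 1-D component
    [fimax, idx] = max(fi);
    x0(i) = xg(idx);                              % theoretical solution, dimension i
    f0 = f0 + fimax;                              % theoretical optimum accumulates
end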

TABLE 1. Theoretical Solutions of Benchmark Function 2 for $D=10$.
TABLE 2. Theoretical Solutions of Benchmark Function 2 for $D=30$.

2) Expanded Schaffer’s Multimodal Problem

Consider a third test function $f_{3}$, the expanded Schaffer’s F6 function [8], [9], which includes many peaks and is difficult for an algorithm to converge on:
\begin{align*} f_{3}\left({\boldsymbol{x}}\right)=&g\left({x_{1},x_{2}}\right)+g\left({x_{2},x_{3}}\right)+\cdots+g\left({x_{D-1},x_{D}}\right)+g\left({x_{D},x_{1}}\right) \\ g\left({x,y}\right)=&0.5+\frac{\sin^{2}\left({\sqrt{x^{2}+y^{2}}}\right)-0.5}{\left({1+0.001\left({x^{2}+y^{2}}\right)}\right)^{2}}\tag{7}\end{align*}
where $\boldsymbol{x}\in \left[{-100, 100}\right]^{D}$, $f_{3}\left({\mathbf{x}}\right)\in \left[{0, 9.98}\right]$ and $f_{3}\left({\mathbf{x}}\right)\in \left[{0, 29.93}\right]$ for $D=10$ and $D=30$, respectively. A 3-D map of a 2-D case is shown in Fig. 3.
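A minimal MATLAB sketch of $f_{3}$ (the function names are illustrative assumptions):

function f = schaffer_f3(x)
% Expanded Schaffer F6 benchmark f3, Eq. (7), with cyclic pairing.
x = x(:)';
y = [x(2:end), x(1)];          % pair each x_i with its cyclic neighbour
f = sum(schaffer_g(x, y));
end

function g = schaffer_g(x, y)
r2 = x.^2 + y.^2;
g  = 0.5 + (sin(sqrt(r2)).^2 - 0.5) ./ (1 + 0.001*r2).^2;
end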

FIGURE 3. Visualizing a 2-D case of $f_{3}$.

D. Hybrid Problem

In real-world optimization, different subsets of the decision variables may have diverse properties, which a hybrid test function imitates by treating the variables in subsets. Such a function is used to test the diverse properties that may be encountered in practice. Hence, the fourth test function $f_{4}$ is a hybrid of Schwefel’s function $f_{41}$, Rastrigin’s function $f_{42}$ and the High Conditioned Elliptic function $f_{43}$ [8], [9]:
\begin{align*} f_{4}\left({\boldsymbol{x}}\right)=&0.3\,f_{41}\left({\boldsymbol{x}}\right)+0.3\,f_{42}\left({\boldsymbol{x}}\right)+0.4\,f_{43}\left({\boldsymbol{x}}\right) \\ f_{41}\left({\boldsymbol{x}}\right)=&418.9829\,D-\sum\nolimits_{i=1}^{D} {g\left({z_{i}}\right)},\tag{8}\\ z_{i}=&x_{i}+4.209687462275036\times{10}^{2}\tag{9}\\ g\left({z_{i}}\right)=&\begin{cases} z_{i}\sin\left({\left|{z_{i}}\right|^{1/2}}\right) & if~\left|{z_{i}}\right|\le 500\\ \left({500-mod\left({z_{i},500}\right)}\right)\sin\left({\sqrt{\left|{500-mod\left({z_{i},500}\right)}\right|}}\right)-\dfrac{\left({z_{i}-500}\right)^{2}}{10000D} & if~z_{i}>500\\ \left({mod\left({\left|{z_{i}}\right|,500}\right)-500}\right)\sin\left({\sqrt{\left|{mod\left({\left|{z_{i}}\right|,500}\right)-500}\right|}}\right)-\dfrac{\left({z_{i}+500}\right)^{2}}{10000D} & if~z_{i}<-500 \end{cases}\tag{10}\\ f_{42}\left({\boldsymbol{x}}\right)=&\sum\nolimits_{i=1}^{D} \left({x_{i}^{2}-10\cos\left({2\pi x_{i}}\right)+10}\right)\tag{11}\\ f_{43}\left({\boldsymbol{x}}\right)=&\sum\nolimits_{i=1}^{D} \left({{10}^{6}}\right)^{\frac{i-1}{D-1}} x_{i}^{2}\tag{12}\end{align*}
where $\mathbf{x}\in \left[{-100,100}\right]^{D}$ and the hybrid rates used are 0.3, 0.3 and 0.4 for $f_{41}\left({\mathbf{x}}\right)$, $f_{42}\left({\mathbf{x}}\right)$ and $f_{43}\left({\mathbf{x}}\right)$, respectively. For $D=10$ and $D=30$, $f_{4}\left({\mathbf{x}}\right)\in \left[{0, 5.54\times{10}^{8}}\right]$ and $f_{4}\left({\mathbf{x}}\right)\in \left[{0, 1.22\times{10}^{9}}\right]$, respectively. The objective of this benchmark problem is to minimize $f_{4}\left({\mathbf{x}}\right)$, which, as depicted in Fig. 4 for $D=2$, resembles a unimodal problem and is relatively simple in terms of global optimization.
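A minimal MATLAB sketch of the hybrid $f_{4}$ and its three components, Eqs. (8)–(12) (function names are illustrative assumptions):

function f = hybrid_f4(x)
% Hybrid benchmark f4, Eq. (8), with hybrid rates 0.3, 0.3 and 0.4.
f = 0.3*schwefel_f41(x) + 0.3*rastrigin_f42(x) + 0.4*elliptic_f43(x);
end

function f = schwefel_f41(x)
% Modified Schwefel function, Eqs. (8)-(10).
D = numel(x);
z = x(:)' + 4.209687462275036e2;
g = zeros(1, D);
for i = 1:D
    zi = z(i);
    if abs(zi) <= 500
        g(i) = zi * sin(sqrt(abs(zi)));
    elseif zi > 500
        g(i) = (500 - mod(zi, 500)) * sin(sqrt(abs(500 - mod(zi, 500)))) ...
               - (zi - 500)^2 / (10000*D);
    else   % zi < -500
        g(i) = (mod(abs(zi), 500) - 500) * sin(sqrt(abs(mod(abs(zi), 500) - 500))) ...
               - (zi + 500)^2 / (10000*D);
    end
end
f = 418.9829*D - sum(g);
end

function f = rastrigin_f42(x)
% Rastrigin's function, Eq. (11).
x = x(:)';
f = sum(x.^2 - 10*cos(2*pi*x) + 10);
end

function f = elliptic_f43(x)
% High Conditioned Elliptic function, Eq. (12).
x = x(:)';
D = numel(x);
f = sum((1e6).^((0:D-1)/(D-1)) .* x.^2);
end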

FIGURE 4. Visualizing a 2-D case of $f_{4}$.

In summary, $f_{1}$ is unimodal for testing local DFOs, $f_{4}$ is relatively simple in terms of modality, and $f_{2}$ and $f_{3}$ are multimodal for testing global DFOs.

SECTION III.

Full Set of Benchmarks for Evaluating Numerical Optimizers

A. Optimality Benchmark

Optimality represents the relative closeness of a found optimum, $\hat{f}_{0}$, to the theoretical optimum, $f_{0}$, which is defined here as:
\begin{equation*} Optimality=1-\frac{\left\|{f_{0}-\hat{f}_{0}}\right\|}{\left\|{\overline{f}-\underline{f}}\right\|}\in \left[{0,1}\right]\tag{13}\end{equation*}
where $\underline{f}$ and $\overline{f}$ are the lower and upper bounds of $f$, respectively, which are known from proper benchmark problems or test functions.

For a single-objective problem (i.e. $m=1$), consider the minimization problem with an objective bound of $\left[{f_{\mathrm{min}},f_{\mathrm{max}}}\right]$ as an example. Then, by (13), the optimality benchmark can be simplified to:
\begin{equation*} {Optimality}_{min}=1-\frac{\left|{f_{min}-\hat{f}_{0}}\right|}{\left|{f_{max}-f_{min}}\right|}=\frac{f_{max}-\hat{f}_{0}}{f_{max}-f_{min}}\tag{14}\end{equation*}

Similarly, the optimality for the corresponding maximization problem can be simplified to:
\begin{equation*} {Optimality}_{max}=1-\frac{\left|{f_{max}-\hat{f}_{0}}\right|}{\left|{f_{max}-f_{min}}\right|}=\frac{\hat{f}_{0}-f_{min}}{f_{max}-f_{min}}\tag{15}\end{equation*}
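A minimal MATLAB sketch of the optimality score, assuming the objective bounds $[f_{min}, f_{max}]$ of the benchmark problem are known (the function name and the 'sense' flag are illustrative assumptions):

function opt = optimality(f_hat, fmin, fmax, sense)
% Optimality benchmark, Eqs. (14)-(15), for a scalar objective.
if strcmp(sense, 'min')
    opt = (fmax - f_hat) / (fmax - fmin);   % minimization, Eq. (14)
else
    opt = (f_hat - fmin) / (fmax - fmin);   % maximization, Eq. (15)
end
end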

B. Accuracy Benchmark

Is it necessary to introduce an accuracy benchmark in the control space in addition to the Optimality benchmark that already reflects a ‘performance index’ in the objective space? The necessity lies in that we are assessing global optimization, which involves multiple optima and is often approached stochastically by a nondeterministic algorithm. As the highest optimality found may correspond to a local optimum, the quality of a found solution $\hat{\mathbf{x}}_{0}$ obtained by an algorithm cannot be assessed by the optimality benchmark alone if $\hat{\mathbf{x}}_{0}$ does not lie in the neighborhood of the global optimum.

Here, the accuracy benchmark is defined as the relative closeness of the found solution, $\hat{\boldsymbol{x}}_{0}$, to the theoretical solution, $\boldsymbol{x}_{0}$:
\begin{equation*} Accuracy=1-\frac{\left\|{\boldsymbol{x}_{0}-\hat{\boldsymbol{x}}_{0}}\right\|}{\left\|{\overline{\boldsymbol{x}}-\underline{\boldsymbol{x}}}\right\|}\in \left[{0,1}\right]\tag{16}\end{equation*}
where $\underline{\boldsymbol{x}}$ and $\overline{\boldsymbol{x}}$ are the lower and upper bounds of $\boldsymbol{x}$, respectively, and $\left[{\underline{\boldsymbol{x}},\overline{\boldsymbol{x}}}\right]$ represents the search range. This benchmark may be particularly useful if the solution space is noisy, there exist multiple optima, or ‘niching’ is used.

By (16), the accuracy with respect to a single-parameter minimization problem within $\left[{x_{\mathrm{min}},x_{\mathrm{max}}}\right]$ is measured by:
\begin{equation*} {Accuracy}_{D=1}=1-\frac{\left|{\hat{x}_{0}-x_{0}}\right|}{x_{max}-x_{min}}\tag{17}\end{equation*}
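A corresponding sketch of the accuracy score, Eq. (16), assuming the theoretical solution and the search range are known (variable names are illustrative):

function acc = accuracy(x_hat, x0, lb, ub)
% Accuracy benchmark, Eq. (16): closeness of the found solution x_hat
% to the theoretical solution x0, relative to the search range [lb, ub].
acc = 1 - norm(x0(:) - x_hat(:)) / norm(ub(:) - lb(:));
end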

C. Convergence Benchmarks

To date, convergence of population-based algorithms is benchmarked only qualitatively using fitness or cost traces graphically against the number of iterations, generations or function evaluations (FEs). With this paper, convergence can now be measured both qualitatively and quantitatively.

Qualitatively, we adopt the method of graphical tracing, but on both benchmarks of optimality and accuracy, instead of on fitness alone, i.e.:

  • The highest ‘optimality’ or fitness in every generation;

  • The highest ‘accuracy’, or the parameter values of the individual solution that has the highest optimality or fitness, in every generation.

1) Reach Time

Quantitatively, we define a ‘reach time’ as:
\begin{equation*} Reach~time\vert_{b}=C^{b}\tag{18}\end{equation*}
where $C$ is the count of function evaluations performed when the optimality of the best individual first reaches a predefined value $b$, where $b\in \left[{0,1}\right]$, i.e., when the distance to the theoretical objective first drops to $\left({1-b}\right)$. For instance, $C^{0.999}$ indicates the number of function evaluations needed for an optimizer to reach optimality with an error no greater than 0.1%, and $C^{0.632}$ indicates a convergence ‘time constant’ by which optimality of 63.2% is first reached (in a manner similar to a first-order dynamic response).
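A sketch of computing the reach time from a recorded trace, assuming opt_trace holds the best-so-far optimality after each function evaluation (the variable names are assumptions):

function C = reach_time(opt_trace, b)
% Reach time, Eq. (18): the FE count at which the best-so-far optimality
% first reaches the level b; returns empty if b is never reached.
C = find(opt_trace >= b, 1, 'first');
end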

2) NP-Time

The power of a heuristic DFO is that it reduces the otherwise exponential time of an exhaustive search to a nondeterministic polynomial (NP) time. To estimate the order of the polynomial, $C^{0.999}$ may be plotted against the number of parameters being optimized, $D$. We hence also propose a revised reach time, as a function of $D$:
\begin{equation*} NP\text{-}time\left({D}\right)=C^{0.999}\left({D}\right)\tag{19}\end{equation*}

3) Total Number of Function Evaluations

The optimality of 99.9% may not be reached by certain algorithms under test. The total number of function evaluations is the number of evaluations, search trials or simulations performed until the optimization process is terminated. For benchmarking, the total number of evaluations is kept the same for all algorithms, such as $400mD^{2}$, which is more informatively defined as:
\begin{equation*} N=\min\left\{{C^{0.999},400mD^{2}}\right\}\tag{20}\end{equation*}
Namely, with $C^{0.999}$, the benchmark test terminates either when the goal of reaching optimality with an error no greater than 0.1% is achieved or when $20D$ generations with a population size of $\left({20D\times m}\right)$ have been iterated.

D. Optimizer Overheads Benchmark

The ‘total CPU time’, $T$ in seconds, of the entire optimization process can also be used in a benchmark test when assessing how long an optimization process would take in the real world, to single out the amount of program overheads due to optimization maneuvers. Quantitatively, the optimizer overheads benchmark is defined as:
\begin{equation*} Optimizer~overheads=\frac{T-T_{FE}}{T_{FE}}\tag{21}\end{equation*}
where $T_{FE}$ is the CPU time taken (in seconds) to complete $N$ function evaluations.
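A rough MATLAB sketch of estimating the overheads with cputime; ga is used merely as a placeholder DFO call, and fun, D, N and the matrix X of N representative evaluation points (one per row) are assumed to be available:

% Optimizer overheads, Eq. (21), estimated with MATLAB's cputime.
t0 = cputime;
[x_best, f_best] = ga(fun, D);       % whole optimization run -> T
T = cputime - t0;

t0 = cputime;
for k = 1:N
    fun(X(k, :));                    % N bare function evaluations -> T_FE
end
T_FE = cputime - t0;
overheads = (T - T_FE) / T_FE;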

E. Sensitivity Benchmark

When the values of the optimal parameters found are perturbed, the optimality may well change, which affects the reliability or robustness of the solution found and can hence impact the quality of real-world applications. This is critical, for example, in industrial design, where manufacturing tolerances may cause sudden deterioration in the quality of the end product as a result of a sensitive optimum or polarized optimization.

In this paper, “sensitivity” is hence used to indicate the robustness of the optimizer by measuring how much relative change in the optimality results from a ‘small’ relative change in the corresponding solution found. Quantitatively, the sensitivity is defined as:
\begin{equation*} Sensitivity=\frac{d(Optimality)}{d(Accuracy)}=\left\|{\nabla \hat{\boldsymbol{f}}}\right\|\frac{\left\|{\overline{\boldsymbol{x}}-\underline{\boldsymbol{x}}}\right\|}{\left\|{\overline{\boldsymbol{f}}-\underline{\boldsymbol{f}}}\right\|}\tag{22}\end{equation*}
In other words, sensitivity is a scaled ‘relative gradient’, which can be estimated as:
\begin{equation*} Sensitivity\approx \frac{\left\|{\Delta \hat{\boldsymbol{f}}}\right\|}{\left\|{\Delta \hat{\boldsymbol{x}}}\right\|}\frac{\left\|{\overline{\boldsymbol{x}}-\underline{\boldsymbol{x}}}\right\|}{\left\|{\overline{\boldsymbol{f}}-\underline{\boldsymbol{f}}}\right\|}\tag{23}\end{equation*}
where $\Delta \hat{\boldsymbol{x}}$ is a change over a neighborhood of the found solution and $\Delta \hat{\boldsymbol{f}}$ is the corresponding change of $\hat{f}_{0}$.

It is convenient to set $\Delta\hat{\boldsymbol{x}}=1\%$, so that the prevailing sensitivity measures by how many percent the optimality will degrade if the accuracy is 1% off. Note that the trend of sensitivity depends mainly on the nature of the problem (e.g., the test function), rather than on the optimizer.
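A sketch of estimating Eq. (23) numerically: the found solution is perturbed by 1% of the search range along a random direction (the direction choice is an illustrative assumption), and the change in the objective is scaled by the known ranges.

% Sensitivity estimate, Eq. (23); x_hat is the found solution, fun the
% objective, [lb, ub] the search range and [fmin, fmax] the objective bounds.
range_x = ub(:) - lb(:);
d  = randn(numel(x_hat), 1);  d = d / norm(d);   % random unit direction
dx = 0.01 * norm(range_x) * d;                   % 1% perturbation of the range
df = fun(x_hat(:)' + dx') - fun(x_hat(:)');
sensitivity = (abs(df) / norm(dx)) * norm(range_x) / abs(fmax - fmin);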

In addition to sensitivity, the two-tailed “t-test” could be used to assess reliability when comparing two algorithms to a certain degree [14], [18]. This statistical hypothesis test is usually used to determine whether two sets of data are significantly different from each other [19], and it is used here as a complementary approach to further validate a comparison when necessary.

SECTION IV.

Benchmarking MATLAB DFOs

A. Test Conditions

Using the defined benchmarks and the chosen benchmark problems, we investigate the performance of the MATLAB (R2016b) heuristics SA, PSO and GAe, and the direct-search optimizers, SS, PS, and PC, through extensive experimental tests. Since the main purpose of this work is to help MATLAB DFO users select an algorithm without the need for in-depth knowledge of the algorithms, MATLAB’s default settings for these algorithms are used in the tests (unless otherwise stated) and all the DFOs are tested on the same random set of starting points.

Therefore, for SS, the reflection coefficient is 1, the expansion coefficient 2, the contraction coefficient 0.5, and the shrink coefficient 0.5.

For SA, the initial temperature is 100, and the temperature is reduced by a factor of 0.95 at each iteration.

For PSO, the swarm size is $20D$ , where $D$ is the dimension number of the test function. The maximum iteration is $20D$ ; so the maximum number of function evaluations is $400D^{2}$ .

MATLAB’s genetic algorithm uses elitism, hence named GAe here, which guarantees that the best 5% of a generation survive to the next generation. For the GAe, the population size is $20D$. The maximum number of generations is also $20D$; so the maximum number of function evaluations is the same as for PSO. The crossover rate is 0.8 and the mutation rate is 0.2.

Each experiment is repeated 30 times to obtain a mean value of each benchmark. As SA, SS, PS and PC are unary (as opposed to population-based) optimizers, one starting point is generated randomly for each run and used identically by all of these optimizers for fair comparison. Note, however, that when displaying convergence traces for these unary optimizers, one point is plotted for every $20D$ FEs, so as to compare fairly with one ‘generation’ of the population-based optimizers. For each run of the latter, the initial generation of $20D$ individuals is generated randomly and used identically by both PSO and the GAe.
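For orientation, a minimal sketch of invoking the five built-in DFOs with the settings above is given below; the option names follow recent Global Optimization Toolbox releases and should be checked against the installed version, and the PC m-file from the user repository has its own calling interface and is therefore not shown.

% Calling the five built-in MATLAB DFOs on the quartic f1 (10-D) with the
% settings described above (a sketch; option names may vary by release).
D   = 10;  lb = -1.28*ones(1, D);  ub = 1.28*ones(1, D);
fun = @(x) sum((1:D).*x.^4 + rand(1, D));      % quartic f1, Eq. (5)
x0  = lb + rand(1, D).*(ub - lb);              % common random starting point

xss  = fminsearch(fun, x0);                                           % SS
xps  = patternsearch(fun, x0, [], [], [], [], lb, ub);                % PS
xsa  = simulannealbnd(fun, x0, lb, ub, ...
         optimoptions('simulannealbnd', 'InitialTemperature', 100));  % SA
xpso = particleswarm(fun, D, lb, ub, ...
         optimoptions('particleswarm', 'SwarmSize', 20*D, ...
                      'MaxIterations', 20*D));                        % PSO
xga  = ga(fun, D, [], [], [], [], lb, ub, [], ...
         optimoptions('ga', 'PopulationSize', 20*D, ...
                      'MaxGenerations', 20*D, ...
                      'CrossoverFraction', 0.8));                     % GAe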

In order to compare the optimizer overheads later, the CPU time, albeit less accurate in MATLAB than in a real-time program, is recorded for $N$ function evaluations. The average $T_{FE}$ over 30 runs for every benchmark problem is shown in Table 3. It also reveals that, when $D$ increases threefold from 10 to 30, $T_{FE}$ increases by over 20 times, indicating a highly nonlinear challenge posed by the dimensionality. These FEs were performed on a PC with an Intel Core i7-4790K 4.00 GHz CPU and 8 GB RAM running a 64-bit Windows operating system.

TABLE 3. Time Taken by $N$ FEs and Dimensionality Challenge.

B. Benchmarking DFOs on Four Test Problems

The performance of the six individual algorithms on each of the five benchmarks was first ranked and then assigned scores from 6 to 1, with 6 representing the best. For example, if the mean optimality of PSO is the highest among all six algorithms, it scores 6 on Optimality. The scores of each algorithm on all five benchmarks are summed up to give the algorithm a total score. Results from all the experiments are summarized in Tables 4–7, where bold indicates the best on each benchmark and red the best overall score and the corresponding algorithm.

TABLE 4. Benchmarking Results in Solving the Quartic Unimodal Problem (10-D and 30-D).
TABLE 5. Benchmarking Results in Solving the Varying Landscape Multimodal Problem (10-D and 30-D).
TABLE 6. Benchmarking Results in Solving Schaffer’s F6 Multimodal Problem (10-D and 30-D).
TABLE 7. Benchmarking Results in Solving the Hybrid Problem (10-D and 30-D).

1) Quartic Unimodal Problem $f_{\mathbf{1}}$

It can be observed that, for 10-D, all the algorithms except SS and SA offer optimality over 99.9995%. PC offers the best performance overall, although PC, PS and PSO each win on three benchmarks. From both the 10-D and 30-D tests, it can be seen that PC offers the best overall performance, and it is hence recommended for unimodal problems. Further, PC also offers the most ‘linear’ polynomial time (although NP), shown in the last column, whilst SS and SA are the worst. As expected, PSO and GAe are also relatively ‘polynomial’. In contrast, SS and SA offer the lowest performance with the highest overheads, and should hence be avoided for this type of problem. Fig. 5 depicts typical optimality convergence of the best point in a ‘generation’.

FIGURE 5. Typical optimality of each DFO in optimizing the unimodal quartic problem $f_{1}$. (a) $D = 10$. (b) $D = 30$.

2) Varying Landscape n-D Multimodal Problem $f_{2}$

This function has multiple extrema in $\left[{0,\pi}\right]^{D}$ and poses a more challenging problem than the unimodal one. Table 5 summarizes the test results and Fig. 6 shows the convergence traces of the optimizers. It is seen that none of the algorithms was able to reach an optimality of 99.9% within $N$ function evaluations. SS and SA were observed to stall at a local maximum and hence delivered poor performance. The GAe, in contrast, showed steady performance and produced the best results in both the 10-D and 30-D tests. PSO, PC and PS were able to approach the global optimum with relatively good results.

FIGURE 6. Typical optimality of each DFO in optimizing the varying landscape problem $f_{2}$. (a) $D = 10$. (b) $D = 30$.

3) Expanded Schaffer’s F6 Multimodal Problem $f_{3}$

Table 6 summarizes the test results on this problem, and Fig. 7 shows the convergence traces of the algorithms. It can be seen that PSO delivered consistently good performance and the best overall score, while the GAe again consistently demonstrated the highest optimality and accuracy in both the 10-D and 30-D tests. As on $f_{2}$, SS and SA showed poor optimality and accuracy, indicating low capability on multimodal problems. It is also seen that the optimality of PC was relatively inconsistent with that on the previous multimodal problem, indicating that it is likely to be trapped in a local optimum when the complexity of the problem increases.

FIGURE 7. Typical optimality of each DFO in optimizing Schaffer’s F6 problem $f_{3}$. (a) $D = 10$. (b) $D = 30$.

4) Hybrid Problem $f_{4}$

Table 7 summarizes the test results on this function. More optimizer overheads were expected due to the complexity of this function. Except for SA and SS, the overheads of the algorithms are less than 2000. The results depicted in Fig. 8 show that PC has a similar convergence tendency to that on $f_{3}$. Combining Figs. 7 and 8, we can conclude that PC is suitable for simple or unimodal functions, but not for multimodal functions. As $f_{4}$ has relatively more straightforward troughs, PS and PC were able to exploit the search well.

FIGURE 8. Typical optimality of each DFO in optimizing the hybrid problem $f_{4}$. (a) $D = 10$. (b) $D = 30$.

C. Comparison on Individual Benchmarks

1) Optimality

Table 8 presents the scores and ranking of the six algorithms on optimality in solving problems $f_{1}$ to $f_{4}$ in 10-D and 30-D, where a higher score is assigned for better optimality. The algorithms are then ranked from 1 to 6 based on their total scores in solving all the problems, with 1 being the top rank. Overall, when a good optimal solution is needed, the GAe is seen to be the best method to use.

TABLE 8. Scores and Ranking on Optimality.

2) Accuracy

Table 9 ranks the six algorithms on their accuracy. It can be seen that, when applied to $f_{2}$, $f_{3}$ and $f_{4}$, the algorithms did equally well in both 10-D and 30-D. Although their rankings differ between 10-D and 30-D on $f_{1}$, SS and SA did worse than the other four algorithms. For accuracy, the GAe is seen to be most suitable for multimodal problems such as $f_{2}$ and $f_{3}$, whilst PS and PSO are most suitable for unimodal and simple problems such as $f_{1}$ and $f_{4}$.

TABLE 9. Scores and Ranking on Accuracy.

3) Reach Time

Table 10 ranks the algorithms on reach time, where $f_{2}$ and $f_{3}$ are missing, as none of the algorithms could reach 99.9% optimality by the end of the predefined maximum number of FEs. When the purpose of optimization is to find a relatively fast-converging solution, PC is seen to be most suitable for unimodal and simple problems.

TABLE 10. Scores and Ranking on Reach Time.

4) Optimizer Overheads

Table 11 ranks on optimizer overheads. Note that the ranking of each algorithm remains unchanged across the problems and dimensions, suggesting that optimizer overheads of the algorithms are not problem-dependent. It is therefore reasonable to conclude that, where reducing computational overheads is a priority, PC would be the most suitable, followed by PSO and GAe.

TABLE 11. Scores and Ranking on Optimizer Overheads.

5) Sensitivity

Table 12 ranks on sensitivity. It can be seen that PC and PSO would be the choice, if low sensitivity is a major concern in practice.

TABLE 12. Scores and Ranking on Sensitivity.

D. Quick Guide to Selecting a DFO

To best suit the application at hand, the user could select a DFO using its individual benchmark ranking or the overall ranking on amalgamated scores in Table 13, which is summarized from Tables 8–12. For example, if the objective space or optimality is of paramount importance for the application, then the GAe would be the No. 1 tool for it, as well as for a potentially multimodal or relatively complex problem.

TABLE 13. Individual and Overall Performance Rankings of Heuristic and Direct Search DFOs for Structural and Numerical Optimization.

If the decision space or accuracy is the goal, then PSO and PS would be the choice, with the former being more suitable for a multimodal problem and the latter unimodal. For unknown modality or a relatively simple problem, PC would provide a rapid assessment on possible effects of applying optimization to the problem at hand.

If the application is rather unknown, however, it is recommended that it first be treated as a multimodal problem for global optimization using the GAe, which will take a longer time, and then as a unimodal problem for local optimization using PC, which will be a lot quicker. From the above analysis, a visual guide is provided as a decision tree in Fig. 9.
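A minimal sketch of this two-stage recommendation, reusing fun, D, lb and ub from the earlier sketch in Section IV-A; fminsearch is used here as a stand-in for the local stage, since the PC m-file from the user repository has its own calling interface.

% Two-stage search for an unknown problem: GAe globally, then local refinement.
opts = optimoptions('ga', 'PopulationSize', 20*D, 'MaxGenerations', 20*D);
xga  = ga(fun, D, [], [], [], [], lb, ub, [], opts);   % global stage (slower)
xloc = fminsearch(fun, xga);                           % local stage (quicker)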

FIGURE 9. A simplified decision tree for selecting a DFO. Note that if an application requires optimizing its structure, the GAe is recommended, since its encoding capability permits optimization of structures and categories, in addition to numerical parameters of a structure. Relevantly, the GAe is recommended for global search offline together with PC for local tuning or rapid learning and adaptation online.

SECTION V.

Case Study - Particle Filtering

A. The Stochastic Nonlinear Filter

The particle filter (PF) uses a set of particles (i.e., samples) to represent the posterior distribution of a stochastic process given partial observations. It is a Monte Carlo algorithm for solving Bayesian estimation and signal processing problems. Second only to the median filter, the PF is so far the most widely used nonlinear filter [20], following the Sequential Importance Resampling algorithm developed by Gordon [21].

A PF can be formalized in the generic form of a nonlinear system:
\begin{align*} \boldsymbol{x}_{k}=&f\left({\boldsymbol{x}_{k-1},\boldsymbol{u}_{k-1}}\right)\tag{24}\\ \boldsymbol{z}_{k}=&h\left({\boldsymbol{x}_{k},\boldsymbol{v}_{k-1}}\right)\tag{25}\end{align*}
where $\mathbf{x}_{k}$ and $\mathbf{z}_{k}$ are the state variables and observations at time $k$, respectively, $\mathbf{x}_{k}\in \mathbf{R}^{n_{\mathbf{x}}}$, $\mathbf{z}_{k}\in \mathbf{R}^{n_{\mathbf{z}}}$, $n_{\mathbf{x}}$ and $n_{\mathbf{z}}$ are the dimensions of $\mathbf{x}_{k}$ and $\mathbf{z}_{k}$, respectively, $f\left({\cdot}\right)$ and $h\left({\cdot}\right)$ represent the system and observation functions, respectively, $\mathbf{u}_{k-1}\in \mathbf{R}^{n_{\mathbf{x}}}$ is the system noise and $\mathbf{v}_{k}\in \mathbf{R}^{n_{\mathbf{z}}}$ is the observation noise, both of given distributions, and $\mathbf{u}_{k-1}$ and $\mathbf{v}_{k-1}$ are independent of the past and current states. In addition, $\mathbf{v}_{k}$ may be independent of $\mathbf{u}_{k-1}$. Denote the state and observation sequences by $\mathbf{x}_{0:k}=\left\{{\mathbf{x}_{0},\cdots,\mathbf{x}_{k}}\right\}$ and $\mathbf{z}_{1:k}=\left\{{\mathbf{z}_{1},\cdots,\mathbf{z}_{k}}\right\}$, respectively. The initial distribution $p\left({\mathbf{x}_{0}}\right)$ is known a priori.

The objective of the PF is to recursively estimate the posterior density $p\left({\mathbf{x}_{0:k}\vert \mathbf{z}_{1:k}}\right)$ of the state $\mathbf{x}_{0:k}$ based on all available measurements $\mathbf{z}_{1:k}$. A recursive derivation of the posterior density is given by Bayes’ theorem when new observations arrive:
\begin{equation*} p\left({\boldsymbol{x}_{0:k}\vert \boldsymbol{z}_{1:k}}\right)=p\left({\boldsymbol{x}_{0:k-1}\vert \boldsymbol{z}_{1:k-1}}\right)\frac{p\left({\boldsymbol{z}_{k}\vert \boldsymbol{x}_{k}}\right)p\left({\boldsymbol{x}_{k}\vert \boldsymbol{x}_{k-1}}\right)}{p\left({\boldsymbol{z}_{k}\vert \boldsymbol{z}_{1:k-1}}\right)}\tag{26}\end{equation*}
where the conditional density $p\left({\mathbf{z}_{k}\vert \mathbf{z}_{1:k-1}}\right)$ is a normalizing constant. Eq. (26) can be simplified to:
\begin{equation*} p\left({\boldsymbol{x}_{0:k}\vert \boldsymbol{z}_{1:k}}\right)\propto p\left({\boldsymbol{x}_{0:k-1}\vert \boldsymbol{z}_{1:k-1}}\right)p\left({\boldsymbol{z}_{k}\vert \boldsymbol{x}_{k}}\right)p\left({\boldsymbol{x}_{k}\vert \boldsymbol{x}_{k-1}}\right)\tag{27}\end{equation*}
where $\propto$ means ‘proportional to’ [19].

The above recursion involves integrals that are intractable to an analytic solution for $p\left({\mathbf{x}_{0:k}\vert \mathbf{z}_{1:k}}\right)$. The PF hence uses Monte Carlo methods to translate the integral problem into a cumulative particle probability transition. It approximates $p\left({\mathbf{x}_{0:k}\vert \mathbf{z}_{1:k}}\right)$ with a mass of particles $\mathbf{x}_{0:k}^{i}\left({i=1,\cdots,N}\right)$, in which $N$ is the number of particles and $\mathbf{x}_{0}^{i}$ is a set of initial particles drawn from $p\left({\mathbf{x}_{0}}\right)$.

To estimate the posterior distribution of the transition, the PF first samples particles using an ‘importance sampling’ technique [22]:
\begin{equation*} q\left({\mathbf{x}_{0:k}\vert \mathbf{z}_{1:k}}\right)=q\left({\mathbf{x}_{0:k-1}\vert \mathbf{z}_{1:k-1}}\right)q\left({\mathbf{x}_{k}\vert{\mathbf{x}_{0:k-1},\mathbf{z}_{1:k}}}\right)\tag{28}\end{equation*}
Accordingly, the weights of the particles are termed importance weights. The un-normalized weights can then be written as
\begin{equation*} w_{k}^{i}=\frac{p\left({\boldsymbol{x}_{0:k}\vert \boldsymbol{z}_{1:k}}\right)}{q\left({\boldsymbol{x}_{0:k}\vert \boldsymbol{z}_{1:k}}\right)}\propto w_{k-1}^{i}\frac{p\left({\boldsymbol{z}_{k}\vert \boldsymbol{x}_{k}^{i}}\right)p\left({\boldsymbol{x}_{k}^{i}\vert \boldsymbol{x}_{k-1}^{i}}\right)}{q\left({\boldsymbol{x}_{k}^{i}\vert{\boldsymbol{x}_{0:k-1}^{i},\boldsymbol{z}_{1:k}}}\right)}\tag{29}\end{equation*}
where $w_{k}^{i}$ is the importance weight. Denote the normalized $w_{k}^{i}$ as $\tilde{w}_{k}^{i}$. If the importance distribution satisfies
\begin{equation*} q\left({\boldsymbol{x}_{k}\vert{\boldsymbol{x}_{0:k-1},\boldsymbol{z}_{1:k}}}\right)=q\left({\boldsymbol{x}_{k}\vert{\boldsymbol{x}_{k-1},\boldsymbol{z}_{k}}}\right)\tag{30}\end{equation*}
then
\begin{equation*} \tilde{w}_{k}^{i}\propto \tilde{w}_{k-1}^{i}\frac{p\left({\boldsymbol{z}_{k}\vert \boldsymbol{x}_{k}^{i}}\right)p\left({\boldsymbol{x}_{k}^{i}\vert \boldsymbol{x}_{k-1}^{i}}\right)}{q\left({\boldsymbol{x}_{k}^{i}\vert{\boldsymbol{x}_{k-1}^{i},\boldsymbol{z}_{k}}}\right)}\tag{31}\end{equation*}
Consequently, the posterior density $p\left({\mathbf{x}_{k}\vert \mathbf{z}_{1:k}}\right)$ can be formulated as:
\begin{equation*} p\left({\boldsymbol{x}_{k}\vert \boldsymbol{z}_{1:k}}\right)\approx \sum\nolimits_{i=1}^{N} \tilde{w}_{k}^{i}\,\delta\left({\boldsymbol{x}_{k}-\boldsymbol{x}_{k}^{i}}\right)\tag{32}\end{equation*}
where $\delta\left({\cdot}\right)$ is the Dirac delta function [22].

Although the PF is the second most successful nonlinear filter, serious challenges exist. One is the particle impoverishment problem: most particles end up sharing a few distinct values, causing a loss of diversity and degrading the estimation.

To mitigate this, a number of optimization algorithms have been applied to particle filtering.

B. Tests of MATLAB DFOs and Verification

In this case study, tests of all six DFOs are conducted to select the most suitable DFO for the PF and to verify the benchmark guide. In the following, the SS, PS, PC, SA, PSO and GAe versions of the PF are termed SSPF, PSPF, PCPF, SAPF, PSOPF and GAePF, respectively.

A PF application is likely a multimodal or complex problem, the nature of which could be unknown to the user. The tests will be under the representative conditions described in [18], which are widely adopted owing to their strong nonlinearity [22]–​[24].

The test formulation is given by:
\begin{align*} x_{k}=&1+\sin\left({w\pi k}\right)+\phi_{1}x_{k-1}+v_{k}\tag{33}\\ z_{k}=&\begin{cases} \phi_{2}x_{k}^{2}+n_{k} & k\le 30\\ \phi_{3}x_{k}-2+n_{k} & k>30 \end{cases}\tag{34}\end{align*}
where $v_{k}$ is a Gamma(3,2) random variable modeling the process noise, and $w=4\times{10}^{-2}$, $\phi_{1}=\phi_{3}=0.5$, and $\phi_{2}=0.2$ are scalar parameters. The observation noise $n_{k}$ is drawn from a Gaussian distribution $N\left({0, 0.00001}\right)$. The different filters were used to estimate the state sequence $x_{k}$ for $k=1,2,\cdots,T$, with the total observation time set as $T=70$. The maximum number of generations is set as $G=20$. In the GAePF, the crossover probability and the mutation probability are the usual 0.8 and 0.2, respectively. The parameters of the other DFOs are the same as above. All particle filters used $N=100$ particles. The experiment was repeated for $M=100$ Monte Carlo simulations to evaluate the performance of the six candidate DFO PF algorithms.
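For reference, the following MATLAB sketch simulates the model (33)–(34) and runs a plain bootstrap (SIR) particle filter as a baseline, without any DFO refinement. Gamma(3,2) is read as shape 3 and scale 2 and is sampled as a sum of exponentials to avoid a toolbox dependence; this reading, the initial particle prior and the resampling scheme are assumptions.

% Model (33)-(34) and a plain bootstrap particle filter (baseline sketch).
w = 0.04;  phi1 = 0.5;  phi2 = 0.2;  phi3 = 0.5;
T = 70;  N = 100;  R = 1e-5;                    % observation noise variance
h = @(x, k) (k <= 30).*(phi2*x.^2) + (k > 30).*(phi3*x - 2);
gam32 = @(n) -2*sum(log(rand(3, n)), 1);        % Gamma(3,2) samples (shape 3, scale 2)

% Simulate one trajectory of the true state and the observations.
x = zeros(1, T);  z = zeros(1, T);  xk = 0;
for k = 1:T
    xk   = 1 + sin(w*pi*k) + phi1*xk + gam32(1);
    x(k) = xk;
    z(k) = h(xk, k) + sqrt(R)*randn;
end

% Bootstrap PF: propagate, weight by the Gaussian likelihood, resample.
p = gam32(N);  xhat = zeros(1, T);              % initial particles (assumed prior)
for k = 1:T
    p  = 1 + sin(w*pi*k) + phi1*p + gam32(N);   % propagate through (33)
    wt = exp(-(z(k) - h(p, k)).^2 / (2*R)) + 1e-300;   % guard against all-zero weights
    wt = wt / sum(wt);
    xhat(k) = sum(wt .* p);                     % weighted state estimate
    c = cumsum(wt);  c(end) = 1;                % multinomial resampling
    p = p(arrayfun(@(u) find(c >= u, 1, 'first'), rand(1, N)));
end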

1) Accuracy Analysis on State Estimations

The following three metrics were used to evaluate the performance of the six PF algorithms:
\begin{align*} rmse=&\sqrt{\frac{1}{T}\sum\nolimits_{k=1}^{T} \left({x_{k}-\hat{x}_{k}}\right)^{2}}\tag{35}\\ \overline{rmse}=&\sqrt{\frac{1}{M}\sum\nolimits_{m=1}^{M} \left[{\frac{1}{T}\sum\nolimits_{k=1}^{T} \left({x_{k}^{m}-\hat{x}_{k}^{m}}\right)^{2}}\right]}\tag{36}\\ {rmse}^{'}=&\sqrt{\frac{1}{M}\sum\nolimits_{m=1}^{M} \left({x_{k}^{m}-\hat{x}_{k}^{m}}\right)^{2}}\tag{37}\end{align*}
where $rmse$ is the root mean squared error, $\overline{rmse}$ is its mean over $M$ simulations, ${rmse}^{'}$ is the $rmse$ over $M$ simulations at a given observation time, $x_{k}^{m}$ is the real state at time $k$ for the $m$-th simulation, and $\hat{x}_{k}^{m}$ is the estimated state.
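A short sketch of the three metrics, assuming X and Xhat are M-by-T arrays holding the true and estimated states over the M Monte Carlo runs (the array names are assumptions):

rmse_run = sqrt(mean((X - Xhat).^2, 2));            % Eq. (35), one value per run
rmse_bar = sqrt(mean(mean((X - Xhat).^2, 2)));      % Eq. (36), mean over M runs
rmse_k   = sqrt(mean((X - Xhat).^2, 1));            % Eq. (37), one value per time k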

Given the observation time, the mean rmse values at every observation over 100 simulations of the six algorithms are shown in Fig. 10. As expected from the DFO guide of Fig. 9, the mean rmse values of the GAePF are the lowest. Unexpectedly, however, the PSOPF showed poor performance. Investigation into this uncovered that the PSO in MATLAB does not make the best population available for access after optimization, whereas the PF algorithm needs a set of particles.

FIGURE 10. Mean RMSE errors with observation times over 100 simulations, comparing MATLAB DFOs.

To further verify the results, the mean estimation rmse of the six algorithms over different numbers of simulations is shown in Fig. 11, where $N=100$ and $M=100$. This corroborates the observations from Fig. 10 above, confirming that the GAe is indeed the best DFO for such an unknown application, as recommended in the guide.

FIGURE 11. Mean RMSE errors vs. the number of simulations for 100 simulations, comparing MATLAB DFOs.

2) Utilization of Particles and Runtime

The mean $rmse$ and average runtime over the total observation time $T$ of the algorithms across 100 simulations are compared in Table 14 for both $N=100$ and $N=300$ at $G=20$. It can be seen that the GAePF achieved a lower $rmse$ with just 100 particles than all the other algorithms did even with 300 particles. As a result, the GAePF also demanded a shorter runtime. In other words, the GAe needed fewer particles to achieve more accurate estimations for the PF application. These tests also confirm that the PF problem, while looking relatively complex, is indeed a relatively multimodal problem.

TABLE 14. Mean RMSE and Runtime.

SECTION VI.

Conclusions

In this paper, a set of five criteria for benchmarking numerical optimization algorithms has first been presented. These benchmarks reflect the performance of DFOs in various aspects. Not only have the existing optimality, accuracy and convergence benchmarks been expanded in this paper, but optimizer overheads and sensitivity benchmarks have also been developed to enrich the existing evaluation criteria for practical numerical optimization applications.

Using four relatively representative and commonly adopted benchmarking problems, the six DFO algorithms available in MATLAB have been compared and ranked on the five benchmarks. The experiments have shown that PC is the overall best for unimodal and simple hybrid problems, whilst the GAe is the overall best for multimodal and relatively complex problems.

To provide a guide for practicing engineers to decide rapidly which of the DFOs would be the most suitable for their application at hand, without the need for in-depth knowledge of the DFO algorithms themselves, a scoring system and a decision tree are provided in this paper. If optimality (the objective space) is of paramount importance to the application, such as in a competition, the GAe would be the choice. If accuracy (the decision space) or sensitivity is the main concern, such as for tolerating manufacturing inaccuracies of a design-optimized product, then PSO or PS would be the choice. If the application is rather unknown, it is recommended that it first be treated as a multimodal problem for global optimization using the GAe, which will take a longer time, and then, if desired, as a unimodal problem for local optimization using PC, which will be a lot quicker.

Guided by the ranks and the decision tree, all six optimizers have been applied to the nonlinear problem of particle filtering. The GAe is confirmed to be better equipped to deal with such a relatively challenging optimization problem.

For future work, benchmarking and selection criteria will be extended to multi-target optimization problems.

