Introduction
With the rapid development of computer technology, modeling and simulation (M&S) has become a valuable tool for solving practical engineering problems. For engineering projects with high-standard requirements, it is desirable to know how reliable the simulation model is. Therefore, model verification and validation (V&V) has been proposed to assess the accuracy and reliability of the model [1]. Model validation is the process of determining the degree to which a model is an accurate representation of the real world from the perspective of the intended uses of the model [2]. Decision making is often challenging because of uncertain inputs and the limited samples imposed by experimental costs. Model validation improves prediction accuracy while providing decision makers with greater confidence. The goal of model validation is to ensure that the simulation model is sufficiently accurate, and to increase the prediction credibility of simulation models, by comparing simulated and experimental performance measures.
Model validation is a fundamental concept proposed by the U.S. Department of Energy; it was initially applied to reliability assessment and decision making in the storage management of nuclear weapons. Developing quantitative methods for model validation under uncertainty has attracted considerable interest. Professional societies and standards committees have played an important role in guiding development in model validation; see, for example, the guides published by ASME [2], AIAA [3], NASA [4] and IEEE [5]. Oreskes [6] and Kleindorfer [7] examined the meaning of model validation from philosophical and methodological perspectives. Oberkampf et al. [8]–[10] summarized the model validation concept in computational fluid dynamics and presented a monograph and a comprehensive framework for model validation in scientific computing. Xiong et al. [11], Youn et al. [12], Oh et al. [13], Hu et al. [14] and Wu et al. [15] made important contributions to model calibration considering model bias. In electromagnetic and power electronic engineering, Sevgi [16] and Mehrabadi [17], [18] made notable contributions to model validation.
Model validation requires measuring the degree of agreement between the model output and the experimental observations, and can be performed quantitatively (validation metric) or qualitatively (graphical comparison). There are four main types of validation metrics [19]–[21]: classical hypothesis testing, the Bayes factor, the frequentist's metric and the area metric. Classical hypothesis testing and the Bayes factor only provide a binary (accept or reject) outcome, which gives no quantitative assessment of the model's accuracy. The frequentist's metric gives a quantitative assessment of the mismatch between the predictions and the experimental observations, but only the means of the two populations are considered. The area metric [22] quantifies the difference between the predictions and the experimental observations by calculating the area between the cumulative distribution function (CDF) of the model output and the empirical distribution function of the experimental observations. The frequentist's metric and the area metric thus provide quantitative assessments of the model. Since the area metric accounts for the full distributions of the predictions and the observations, it can be regarded as an extension of the frequentist's metric. Various validation metrics have been developed from these four. To handle correlated multiple responses, Li et al. [23] generalized the area metric on the basis of the multivariate probability integral transformation (PIT) and proposed the PIT area metric and the t-pooling transformation. Wu et al. [24] established a validation area metric based on the component functions of the model output, using a calibration of conditional expectations. Zhao et al. [25] developed the Mahalanobis-distance area metric to validate simulation models with multiple correlated responses. Bi et al.
[26] investigated uncertainty quantification metrics with varying statistical information based on Euclidean-distance, Mahalanobis-distance and Bhattacharyya-distance criteria. Mullins et al. [27] investigated model validation under multiple uncertainty sources, applying a point-by-point approach to separate aleatory and epistemic uncertainty.
The validation metric depends directly on the number and locations of the validation experiments. Hence, designing a validation experiment suited to the M&S tools is a key step in V&V, and much research has been reported in this area [28]. Design of experiments (DoE) falls into two general categories: classical DoE (e.g., full and fractional factorial designs, central composite designs, Box-Behnken designs, optimal designs, and orthogonal arrays) and modern DoE (e.g., random designs, quasi-random designs, projection-based designs, uniform designs, miscellaneous designs, and hybrid designs) [29]. In contrast to traditional experiments, the validation experiment is a new type of experiment whose main purpose is to determine the prediction accuracy and reliability of the simulation model used to describe the physical system. To apply validation experimental results to the prediction of the target model, Hamilton [30], [31] proposed a potential experimental method that evaluates the correlation with the target by using a series of simple experiments and calculation models, and verified the method on a one-dimensional nonlinear transient heat conduction simulation. Further research is needed on the specific arrangement of the validation experiment and its relationship to the prediction target. On the one hand, the validation experiment should coordinate with the simulation results; on the other hand, it should be independent of the modeling process. Oberkampf and Smith [32] developed a framework, consisting of the predictive capability maturity method and strong-sense model validation experiments, for assessing model validation experiments in computational fluid dynamics. Oomen and Bosgra [33] investigated the ill-posedness of the deterministic model validation problem formulation.
Jiang and Mahadevan [34] presented an integrated Bayesian cross-entropy methodology for validation experiment design of computational models. Ao et al. [35] proposed a validation experiment design optimization method for life prediction models to obtain the optimal testing stress levels and the number of tests at each stress level. Various methods for validation experiments have thus been developed. However, previous works have largely ignored the credibility of the validation experiment, which is a very important evaluation index. How to evaluate the credibility of the validation experiment, and how to conduct the validation experiment under given credibility conditions, remain open problems. In this paper, these issues are studied, and methodologies for credibility evaluation of validation experiments and for validation experiment design, based on the area metric factor and a fuzzy expert system, are developed. In these methods, Latin hypercube sampling is used to obtain the test scheme. By optimizing the sample size, the test cost is reduced while the credibility of the validation experiment is satisfied. A numerical example and the Sandia thermal challenge problem are used as simulation examples to demonstrate the proposed methods.
The rest of this paper is structured as follows. The area metric factor is developed in Section II. In Section III, the fuzzy expert system for model validation is presented. A mathematical formulation of experimental design for model validation and a methodology of experimental design for model validation are presented in Section IV. Two simulation examples are analyzed in Section V. Finally, the conclusions are provided in Section VI.
Area Metric Factor
The area metric proposed by Ferson et al. [22] is a validation metric based on probability theory. It uses the area between the cumulative distribution function of the model response and the empirical CDF of the observed data to measure the disagreement between the entire distributions of the predictions and the experimental observations.
Mathematically, the area metric is defined as \begin{equation*} d(F^{m},S^{n})=\int _{-\infty }^{+\infty } {\left |{ {F^{m}(y)-S^{n}(y)} }\right |dy}\tag{1}\end{equation*} where $F^{m}(y)$ is the CDF of the model output and $S^{n}(y)$ is the empirical distribution function of the $n$ experimental observations.
Since the area metric is a dimensional validation metric, its unit varies from system to system, which prevents a unified evaluation standard. To obtain a unified evaluation standard, the concept of the area metric factor, a dimensionless validation metric, is proposed in this work, as shown in Figure 1; it is defined by Eq. (2).\begin{equation*} \rho \left ({{F^{m},S^{n}} }\right)=\frac {d(F^{m},S^{n})}{d(F^{m},F^{0})}\tag{2}\end{equation*}
\begin{equation*} d(F^{m},F^{0})=\int _{-\infty }^{+\infty } {\left |{ {F^{m}(y)-F^{0}(y)} }\right |dy}\tag{3}\end{equation*}
\begin{equation*} F^{0}(y)=\begin{cases} {1,} & {y\ge \mu -3\sigma } \\ 0 & {y < \mu -3\sigma } \\ \end{cases}\tag{4}\end{equation*}
A small area metric factor indicates good agreement between the model predictions and the experimental observations at the validation site; it approaches zero when the predictive model perfectly matches the physical model. The area metric factor is also influenced by the sample size of the experiment: a large area metric factor may simply be due to insufficient experimental data.
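As an illustration of Eqs. (1)–(4), the following sketch estimates the area metric and the area metric factor from samples. It is a minimal sketch under the assumption that both the model output and the experiment are available only as sample sets, so the CDFs are approximated empirically on a merged grid.

```python
import numpy as np

def area_metric(model_samples, exp_samples):
    """Area between the model CDF and the empirical CDF of the
    observations, Eq. (1), integrated over a merged grid."""
    grid = np.sort(np.concatenate([model_samples, exp_samples]))
    # Empirical CDFs: fraction of samples <= y at each grid point
    Fm = np.searchsorted(np.sort(model_samples), grid, side="right") / len(model_samples)
    Sn = np.searchsorted(np.sort(exp_samples), grid, side="right") / len(exp_samples)
    # Integrate |Fm - Sn| with the rectangle rule (both CDFs are step functions)
    return np.sum(np.abs(Fm - Sn)[:-1] * np.diff(grid))

def area_metric_factor(model_samples, exp_samples):
    """Dimensionless area metric factor rho, Eq. (2): the area metric
    normalised by the area between the model CDF and the reference step
    CDF F0 that jumps from 0 to 1 at mu - 3*sigma, Eqs. (3)-(4)."""
    mu, sigma = np.mean(model_samples), np.std(model_samples)
    y0 = mu - 3.0 * sigma
    grid = np.sort(np.append(model_samples, y0))
    Fm = np.searchsorted(np.sort(model_samples), grid, side="right") / len(model_samples)
    F0 = (grid >= y0).astype(float)
    d_ref = np.sum(np.abs(Fm - F0)[:-1] * np.diff(grid))
    return area_metric(model_samples, exp_samples) / d_ref
```

A matched experiment yields a factor near zero, while a shifted experiment yields a visibly larger one, consistent with the interpretation above.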
Fuzzy Expert System for Model Validation
The area metric factor is a dimensionless validation metric and can be used for quantitative validation of a predictive model. In engineering, however, decision makers usually want a qualitative assessment (Good, Moderate, Worst, etc.) of the predictive model in order to decide whether to accept it. Such evaluations are based on expert experience, and the qualitative assessments of different experts are biased relative to one another. Hence, in this work a fuzzy expert system is presented to evaluate the predictive model qualitatively, in which the inconsistency among different expert groups is considered. Based on fuzzy theory, the model evaluation results are divided into four groups: Excellent, Good, Moderate and Worst (Figure 2). The membership functions are given by,
Excellent \begin{equation*} \mu _{\tilde {A}} \left ({\rho }\right)=\begin{cases} 1 & {\rho \le 0.1} \\ {1-(\rho -0.1)/0.05} &{0.1 < \rho < 0.15}\\ 0 &{\rho \ge 0.15} \end{cases}\tag{5}\end{equation*}
Good \begin{equation*} \mu _{\tilde {A}} \left ({\rho }\right)=\begin{cases} 0 & {\rho \le 0.1} \\ {(\rho -0.1)/0.05} & {0.1 < \rho < 0.15} \\ 1 & {0.15\le \rho < 0.2} \\ {1-(\rho -0.2)/0.1} & {0.2\le \rho < 0.3} \\ 0 & {\rho \ge 0.3} \end{cases}\tag{6}\end{equation*}
Moderate \begin{equation*} \mu _{\tilde {A}} \left ({\rho }\right)=\begin{cases} 0 & {\rho \le 0.2} \\ {(\rho -0.2)/0.1} & {0.2 < \rho < 0.3} \\ 1 & {0.3\le \rho < 0.4} \\ {1-(\rho -0.4)/0.1}&{0.4\le \rho < 0.5} \\ 0 &{\rho \ge 0.5} \end{cases}\tag{7}\end{equation*}
Worst \begin{equation*} \mu _{\tilde {A}} \left ({\rho }\right)=\begin{cases} 0 & {\rho \le 0.4} \\ {(\rho -0.4)/0.1} & {0.4 < \rho < 0.5} \\ 1 & {\rho \ge 0.5} \\ \end{cases}\tag{8}\end{equation*}
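The membership functions of Eqs. (5)–(8) can be coded directly. In the sketch below, the `assess` helper, which labels a model with the group of maximum membership, is an illustrative convention added here; the text itself does not prescribe how the four membership degrees are combined into a single label.

```python
def mu_excellent(rho):
    """Membership in 'Excellent', Eq. (5)."""
    if rho <= 0.1:
        return 1.0
    return 1.0 - (rho - 0.1) / 0.05 if rho < 0.15 else 0.0

def mu_good(rho):
    """Membership in 'Good', Eq. (6)."""
    if rho <= 0.1 or rho >= 0.3:
        return 0.0
    if rho < 0.15:
        return (rho - 0.1) / 0.05
    return 1.0 if rho < 0.2 else 1.0 - (rho - 0.2) / 0.1

def mu_moderate(rho):
    """Membership in 'Moderate', Eq. (7)."""
    if rho <= 0.2 or rho >= 0.5:
        return 0.0
    if rho < 0.3:
        return (rho - 0.2) / 0.1
    return 1.0 if rho < 0.4 else 1.0 - (rho - 0.4) / 0.1

def mu_worst(rho):
    """Membership in 'Worst', Eq. (8)."""
    if rho <= 0.4:
        return 0.0
    return (rho - 0.4) / 0.1 if rho < 0.5 else 1.0

def assess(rho):
    """Illustrative labelling rule: pick the group of maximum membership."""
    grades = {"Excellent": mu_excellent(rho), "Good": mu_good(rho),
              "Moderate": mu_moderate(rho), "Worst": mu_worst(rho)}
    return max(grades, key=grades.get)
```

Note that adjacent memberships sum to one on the transition intervals (e.g., between Excellent and Good for 0.1 < ρ < 0.15), so every ρ is fully assigned across the four groups.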
The fuzzy expert system provides a unified standard for simulation model evaluation and selection.
Experimental Design for Model Validation Under Aleatory Uncertainty
Current validation experiment design is usually based on the experience of engineers, which significantly affects the credibility of the validation experiment. Improving this credibility requires a large test size, which is time-consuming and expensive. To obtain an experiment scheme for model validation with low cost and satisfactory credibility, a methodology for validation experiment design under aleatory uncertainty is developed in this work, based on the area metric factor and the fuzzy expert system. (Aleatory uncertainty [36] derives from the inherent random nature of physical quantities or the environment.)
A. Mathematical Formulation of Experimental Design for Model Validation
In this method, the number of experimental observations is taken as the design variable. By optimizing the sample size, the test cost is reduced while the credibility of the model validation experiment is satisfied. The mathematical formulation of the design approach can be expressed as, \begin{align*}&\min T_{C} ~\left ({m }\right) \\&s.t.~P\left ({{\mu _{\tilde {A}} \left ({{\rho \left ({{F^{m},S^{n}} }\right)} }\right)\left |{ {\textrm {Excellent}} }\right.\ge \mu _{0}} }\right)\ge P_{r} \\&\hphantom {s.t.~} m_{d} \le m\le m_{u}\tag{9}\end{align*} where $m$ is the number of experimental observations, $m_{d}$ and $m_{u}$ are its lower and upper bounds, $\mu _{0}$ is the required membership degree in the Excellent group, $P_{r}$ is the required probability level, and $T_{C}\left({m}\right)$ is the test cost given by Eq. (10).
\begin{equation*} T_{C} \left ({m }\right)=C_{s} \cdot m+C_{t} \cdot m\tag{10}\end{equation*}
B. A Methodology of Experimental Design for Model Validation
For this optimization model, the design variable, the number of experimental observations, is a positive integer with bound constraints. Gradient-type and intelligent optimization methods are inefficient for this kind of problem. Hence, an optimization approach for the experimental design for model validation is presented. The main procedure is shown in Figure 3, with the following steps.
Step 1:
Initialize the sample size of the experiment at $m_{d}$, and set $n_{1}$ to 0.

Step 2:
Choose the number of sample sets, $n_{2}$, which determines how many replicated sample sets of the model validation experiment are generated. $n_{2}$ needs to be set to a high value to reduce its influence on the probability distribution of the area metric factor.

Step 3:
Draw $n_{2}$ random sample sets, containing $n_{2}\times m$ samples, from the distribution of each aleatory uncertainty by using Latin hypercube sampling.

Step 4:
Choose a random sample set ${\mathbf{A}}_{i}$ from the $n_{2}$ sets.

Step 5:
Use the complete array of sampled values ${\mathbf{A}}_{i}$ to obtain test data, and calculate the CDF from the experiment, $F^{m}\left({i}\right)$.

Step 6:
Calculate the area metric factor, $\rho_{i}$, by Eq. (2).

Step 7:
If $\mu _{\tilde {A}} \left ({{\rho \left ({{F^{m},S^{n}} }\right)} }\right)\left |{ {\textrm {Excellent}} }\right.\ge \mu _{0}$, set $n_{1}=n_{1}+1$.

Step 8:
Test whether all $n_{2}$ sample sets have been used. If no, return to Step 3. If yes, go to Step 9.

Step 9:
Calculate $P_{m}=n_{1}/n_{2}$ and determine whether $P_{m}$ is greater than or equal to $P_{r}$. If no, set $n_{1}=0$ and $m=m+1$, and return to Step 2. If yes, go to Step 10.

Step 10:
Calculate the test cost by Eq. (10), and output the sample size of the experiment, $m$; the probability value of $\mu _{\tilde {A}} \left ({{\rho \left ({{F^{m},S^{n}} }\right)} }\right)\left |{ {\textrm {Excellent}} }\right.\ge \mu _{0}$, $P_{m}$; and the test cost, $T_{C}\left({m}\right)$.
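The steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the physical experiment is replaced by a hypothetical sampler `sample_exp`, plain Monte Carlo sampling stands in for Latin hypercube sampling, and simplified versions of the area metric factor (Eq. (2)) and the Excellent membership function (Eq. (5)) are inlined so the block is self-contained.

```python
import numpy as np

def ecdf(samples, grid):
    """Empirical CDF of `samples` evaluated at each point of `grid`."""
    return np.searchsorted(np.sort(samples), grid, side="right") / len(samples)

def area_metric_factor(model, exp):
    """Dimensionless area metric factor, Eq. (2)."""
    g = np.sort(np.concatenate([model, exp]))
    d = np.sum(np.abs(ecdf(model, g) - ecdf(exp, g))[:-1] * np.diff(g))
    y0 = model.mean() - 3.0 * model.std()            # step location of F0, Eq. (4)
    g0 = np.sort(np.append(model, y0))
    d0 = np.sum(np.abs(ecdf(model, g0) - (g0 >= y0))[:-1] * np.diff(g0))
    return d / d0

def mu_excellent(rho):
    """Membership in the 'Excellent' group, Eq. (5)."""
    return 1.0 if rho <= 0.1 else (1.0 - (rho - 0.1) / 0.05 if rho < 0.15 else 0.0)

def design_experiment(sample_exp, model_samples, m_d=5, m_u=200,
                      n2=100, mu0=0.5, p_r=0.9, seed=0):
    """Steps 1-10: grow the sample size m until the fraction of n2
    replicated experiments judged 'Excellent' reaches P_r.
    `sample_exp(m, rng)` is a hypothetical stand-in for running m tests."""
    rng = np.random.default_rng(seed)
    for m in range(m_d, m_u + 1):                    # Steps 1 and 9: m = m + 1
        n1 = 0                                       # counter of 'Excellent' sets
        for _ in range(n2):                          # Steps 2-8 over the n2 sets
            r = area_metric_factor(model_samples, sample_exp(m, rng))
            if mu_excellent(r) >= mu0:               # Step 7
                n1 += 1
        if n1 / n2 >= p_r:                           # Step 9: P_m >= P_r ?
            return m, n1 / n2                        # Step 10: output m and P_m
    return None, None                                # no feasible m within bounds
```

When the experiment sampler matches the model distribution, the loop terminates at a moderate m, because the area metric factor of a matched experiment shrinks as the sample size grows.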
Results and Discussion
In this section, two simulation examples are employed to demonstrate the proposed validation metric, the fuzzy expert system for model validation and the experimental design method for model validation: a numerical example and the Sandia thermal challenge problem.
A. Numerical Example
The experimental observations in this section are generated using the following model, \begin{equation*} y^{e}=\theta \cos (2\pi x_{1})+\sin x_{2}\tag{11}\end{equation*}
Three test scenarios, as shown in Table 1, are created by using different predictive models, in which model 1 is a correct predictive model.
(Figure: Area metric factor of the three models with different numbers of experimental observations.)
Table 2 shows the area metric factors and qualitative assessments of the three predictive models when the number of experimental observations is 10, 20, 50, 500 and 5000. The area metric factors for model 1 with 10, 20, 50, 500 and 5000 observations are 0.1679, 0.1433, 0.0498, 0.0136 and 0.0065, respectively. The corresponding qualitative assessments are obtained from the fuzzy expert system.
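The qualitative trend of Table 2, namely that the area metric factor of the correct model shrinks as the number of observations grows, can be reproduced with a self-contained sketch. The settings below are assumptions not given in the excerpt: θ = 1 for the correct model, x1 and x2 drawn from standard normal distributions, and the model CDF approximated from 50 000 predictions.

```python
import numpy as np

def rho(model, exp):
    """Area metric factor, Eq. (2), inlined so the snippet is self-contained."""
    g = np.sort(np.concatenate([model, exp]))
    Fm = np.searchsorted(np.sort(model), g, side="right") / len(model)
    Sn = np.searchsorted(np.sort(exp), g, side="right") / len(exp)
    d = np.sum(np.abs(Fm - Sn)[:-1] * np.diff(g))              # Eq. (1)
    y0 = model.mean() - 3.0 * model.std()                      # F0 step, Eq. (4)
    g0 = np.sort(np.append(model, y0))
    F0m = np.searchsorted(np.sort(model), g0, side="right") / len(model)
    d0 = np.sum(np.abs(F0m - (g0 >= y0))[:-1] * np.diff(g0))   # Eq. (3)
    return d / d0

def response(theta, n, rng):
    """Eq. (11): y = theta*cos(2*pi*x1) + sin(x2); x1, x2 ~ N(0, 1) assumed."""
    x1, x2 = rng.normal(0.0, 1.0, n), rng.normal(0.0, 1.0, n)
    return theta * np.cos(2.0 * np.pi * x1) + np.sin(x2)

rng = np.random.default_rng(0)
model = response(1.0, 50_000, rng)     # predictions of the 'correct' model
for n in (10, 50, 5000):               # rho tends to shrink as n grows
    print(n, rho(model, response(1.0, n, rng)))
```

A biased model (e.g., a perturbed θ) yields a factor that stays away from zero even for large n, which is what separates model 1 from the incorrect models in Table 2.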
Considering that the test scheme has a significant influence on the model evaluation in the case of small sample sizes, the proposed experimental design method for model validation is used to obtain the minimum sample size of the experiment that satisfies the credibility requirement, thereby avoiding the influence of the test scheme on the model evaluation.
B. Sandia Thermal Challenge Problem
The Sandia thermal challenge problem is shown in Figure 6. The mathematical model for the transient temperature response of the one-dimensional transient heat conduction problem can be expressed as [37], \begin{align*}&\hspace{-1.5pc}T\left ({{x,t} }\right) \\=&T_{i} +\frac {qL}{k}\left [{ {\frac {\left ({{k/\rho C} }\right)t}{L^{2}}+\frac {1}{3}-\frac {x}{L}+} }\right.\frac {1}{2}\left ({{\frac {x}{L}} }\right)^{2} \\&\qquad \qquad -\left.{ {\frac {2}{\pi ^{2}}\sum \limits _{n=1}^{6} {\frac {1}{n^{2}}\textrm {e}^{-n^{2}\pi ^{2}\frac {\left ({{k/\rho C} }\right)t}{L^{2}}}\cos \left ({{n\pi \frac {x}{L}} }\right)}} }\right]\tag{12}\end{align*}
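To make Eq. (12) concrete, the following sketch evaluates the series solution with the sum truncated at six terms, as in the equation. The default values of q, L and T_i are illustrative placeholders, not the actual challenge-problem data.

```python
import numpy as np

def temperature(x, t, k, rho_c, q=3500.0, L=0.019, T_i=25.0, n_terms=6):
    """Transient 1-D slab temperature T(x, t), Eq. (12).
    k: thermal conductivity; rho_c: volumetric heat capacity (rho * C).
    The default q, L and T_i are illustrative placeholders."""
    alpha = k / rho_c                      # thermal diffusivity k / (rho * C)
    fo = alpha * t / L**2                  # dimensionless time (k/rho C) t / L^2
    n = np.arange(1, n_terms + 1)
    series = np.sum(np.exp(-(n**2) * np.pi**2 * fo)
                    * np.cos(n * np.pi * x / L) / n**2)
    return T_i + (q * L / k) * (fo + 1.0 / 3.0 - x / L + 0.5 * (x / L)**2
                                - (2.0 / np.pi**2) * series)
```

With heating applied at the x = 0 face, the temperature there rises monotonically in time and exceeds the temperature at the far face x = L, which provides a quick sanity check of the implementation.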
To investigate the influence of the number of experimental observations and the distribution of uncertain parameters on the model prediction results, four test scenarios are created, as shown in Table 3. Model 1 is consistent with the physical model. Compared with the physical model, there is a 5% deviation in the mean and standard deviation of the random variables for models 2, 3 and 4. The area metric factors of the four predictive models for the surface temperature at time t = 1000 s, for different numbers of experimental observations, are shown in Figure 7. The area metric factors of the four models converge as the number of experimental observations increases, and that of model 1 approaches zero. The area metric factor of model 2 is smaller than that of model 3, which means that the influence of the deviation in the mean on the predicted results is more noticeable. The area metric factor of model 4 is smaller than that of model 3, indicating that model 4 predicts the physical observations better than model 3, even though model 4 adds a deviation in the standard deviation of the random variables.
The area metric factors and qualitative assessments of the four predictive models with 10, 20, 50, 500 and 5000 observations are shown in Table 4. The area metric factors for model 1 with 10, 20, 50, 500 and 5000 observations are 0.0972, 0.1185, 0.0706, 0.0261 and 0.0055, respectively. The corresponding qualitative assessments are obtained by using the fuzzy expert system.
Since the test scheme can significantly affect the model evaluation accuracy in the case of small sample sizes, the experimental design method for model validation is applied to optimize the model validation experiment.
Conclusions
Considering that credibility is a very important evaluation index for the validation experiment, this work investigates a methodology for validation experiment design with low cost and satisfactory credibility. First, a dimensionless validation metric, the area metric factor, is presented, and a fuzzy expert system for model validation is developed. Then, an optimization model for validation experiment design based on the area metric factor and the fuzzy expert system is constructed, and a design methodology is presented. The simulation results for a numerical example and the Sandia thermal challenge problem lead to the following observations. The number of experimental observations and the sampling scheme can significantly affect the area metric factor when the number of observations is small, so a false model evaluation may result from the randomness of the sampling scheme. The area metric factor converges as the number of experimental observations increases. By using the proposed experimental design method, a number of experimental observations is obtained for which the credibility of the validation experiment is satisfied.