Nomenclature
Abbreviation | Expansion |
Prediction accuracy enhancement | |
Dimensionality reduction | |
Fitness value improvement | |
Computation time reduction | |
ε-SVR | Linear-epsilon-insensitive SVR |
ξ, ξ* | SVR model auxiliary parameters |
γ | SVR model normalization parameter |
Φ | SVR model mapping function |
ε | SVR model loss parameter |
α, α* | Lagrange multipliers |
Percentage forecast error | |
σ | Gauss parameter (width of the RBF kernel) |
λ | NCA model regularization parameter |
ρ | NCA model probability parameter |
accwith_PSS | Prediction accuracy with PSS |
accwithout_PSS | Prediction accuracy without PSS |
AI | Artificial intelligence |
ACO | Ant colony optimization |
ANN | Artificial neural network |
BGA | Binary genetic algorithm |
b | SVR model bias parameter |
C | Correlation |
Corr | Relationship/correlation strength indicator |
CPU | Central processing unit |
d | Difference between the predictor and target values |
DL | Deep learning |
ERM | Empirical risk minimization |
FFANN | Feedforward artificial neural network |
fi | Predictor i |
fr | Number of predictors in the reduced predictor subset |
fitwith_PSS | Fitness value of the selected predictors with PSS |
fitwithout_PSS | Fitness value of the predictors without PSS |
GA | Genetic algorithm |
GB | Gigabyte |
GHz | Gigahertz |
h | Hour |
i | Predictor/feature index |
IoT | Internet of things |
k | Elitism of size k |
K | SVR model kernel function |
KKT | Karush-Kuhn-Tucker optimality condition |
kW | Kilowatt |
kWh | Kilowatt-hour |
l | NCA model loss function |
LBFGS | Limited memory Broyden-Fletcher-Goldfarb-Shanno |
MAE | Mean absolute error |
MAPE | Mean absolute percentage error |
MAPEwith_PSS | MAPE of prediction with PSS |
MAPEwithout_PSS | MAPE of prediction without PSS |
ML | Machine learning |
n | Number of predictors in the original predictor space |
N | SVR model training sample size |
NCA | Neighborhood Component Analysis |
NH | Forecasting horizon |
O1 | Number of elite offspring |
O2 | Number of crossover offspring |
O3 | Number of mutation offspring |
p | Number of chromosomes |
P_h^a | Actual/measured PV power at hour h |
P_h^f | Forecasted PV power at hour h |
PSO | Particle swarm optimization |
PSS | Predictor subset selection |
PV | Photovoltaic |
q | Chromosome length (Genomelength) |
RAM | Random access memory |
RBF | Radial basis function |
Original predictor space without PSS with m samples and n predictors | |
Reduced predictor space with PSS with m samples and nr predictors | |
rP | Pearson correlation coefficient |
rS | Spearman correlation coefficient |
R&D | Research and development |
SRM | Structural risk minimization |
SVR | Support vector regression |
twith_PSS | Total computation time with PSS |
twithout_PSS | Total computation time without PSS |
UN | United Nations |
w | SVR model weight/coefficient parameter |
x | SVR model training input (predictor) value |
y | SVR model training target value |
Introduction
Installation of renewable energy resources, in particular solar energy, has received much attention globally due to several environmental protocols agreed by almost all countries as primary directives of the United Nations (UN). This is because electricity generation from solar energy is clean, accessible nearly everywhere, has a simple structure, and does not require a prime mover. Moreover, the advent of power electronics and its associated control technology has further accelerated the deployment of solar generation systems worldwide. Although solar power generation has significant environmental advantages and is a promising source of energy for the future, its uncertainty, caused by the intermittency of weather variables, makes it more challenging to utilize than conventional generation sources, because generation uncertainty creates severe problems for power grid stability and control.
However, this problem is not insurmountable. To harness the benefits and increase the competitiveness of solar energy, accurate forecasting of solar generation is essential. Accurate solar power forecasts enhance the control, stability, reliability and flexibility of power grids with a large penetration of PV power, and assist the various stakeholders in the power industry in making better decisions on power system investment, planning, operation, management, economics, markets, strategy, and risk analysis. Thus, accurate prediction of PV power plays a key role in power grids with a high penetration of PV solar power.
Selecting suitable input variables or a predictor subset is currently a very important research and development (R&D) topic in the field of PV power forecasting. Choosing the best predictor subset from a large number of predictors to constitute the input dataset for PV power forecasting enhances the prediction performance.
This calls for R&D in effective and applicable predictor subset selection (PSS) strategies and enabling tools for enhancing the existing accuracy levels of PV power forecasting.
Predictor subset selection is a process of picking a subset of most important predictors (features, attributes, or variables) for use in forecasting model development.
Different research groups have performed various PSS methods for various applications and scenarios. However, very few of them have coupled and investigated PSS tools and forecasting models. Moreover, there is no standard and universally agreed PSS method so far. The R&D for finding the most effective PSS tools is still ongoing by various independent research groups and institutions.
Predictor subset selection strategies are important for forecasting problems and big data analysis because they:
Decrease computation time, storage requirement and overfitting
Simplify models and avoid the curse of dimensionality
Enhance data understandability, interpretability and generalization
The core argument for applying a PSS method is that the original dataset holds some variables that are either redundant or unimportant, and can therefore be eliminated without significant loss of information. Several research groups have shown that redundant and irrelevant features reduce the accuracy and generalization capability of forecasting models. That is why, nowadays, PSS studies have become very popular in AI, machine learning (ML), deep learning (DL), and statistics.
The techniques being used for forecasting the future power production in distributed PV systems have a great impact on achieving the best economic benefit and energy flexibility of PV systems. However, PSS strategies and enabling tools for the PV power forecasting models have not yet been investigated deeply and the results so far in this regard are not adequate.
Most prior works on PV power forecasting used a predetermined, user-defined set of variables as inputs for the forecast models. They did not utilize PSS techniques to choose the forecast model input variables or predictors, which would significantly improve the obtained forecasting accuracy.
Therefore, the goal of this paper is to propose and implement a predictor subset selection approach for modeling and forecasting the uncertain generation power in distributed PVs in general, and building rooftop PVs in particular. The results will assist the various PV stakeholders in having accurate PV power forecast models that will aid the efficient use of limited energy resources and the regulation of dispatchable generation and flexible demand levels.
Prediction accuracy is the indispensable target in forecasting studies. It is soundly revealed in [1] and [2] that the accuracy of prediction models not only relies on the models’ configurations and associated learning methods but also on the predictor domain, which is established via the initial predictor space and PSS techniques. PSS is mostly applied in ML implementations as one of the preprocessing steps, where a predictor subset (independent attributes) is found by removing predictors with lower or irrelevant information and highly redundant information [3]. However, very few forecasting techniques perform PSS before training the prediction models.
Meta-heuristic optimization algorithms have become very popular and significantly effective for various problems in the power/energy sector, especially in the field of renewable energy generation [4], [5]. They have been effectively implemented as searching techniques for PSS problems; examples include Particle Swarm Optimization (PSO) [6], Ant Colony Optimization (ACO) [7] and the Genetic Algorithm (GA) [8]. The GA has gained extensive attention due to its operability and robust searching ability. It is one of the artificial intelligence (AI) probabilistic searching algorithms, and has been broadly applied to several optimization problems [9]. The Binary Genetic Algorithm (BGA) is a special version of the GA which operates by first representing the given predictor space (chromosomes or candidate solutions) as binary bit-strings. This makes the BGA better suited to PSS problems than the conventional GA.
PSS methods are classified as filter, wrapper and embedded techniques [1].
Filter techniques do not depend on any prediction model and they sort features depending on statistical characteristics. They utilize a correlation score to grade a feature subset. Filter technique based PSS methods are generally fast. The Filter PSS approach includes correlation-based [10], mutual information-based [11], and principal component analysis-based methods [12]. Filters generally require less computation time than other PSS techniques, but they generate a predictor set which is not fitted to a particular forecast model. Wrapper techniques evaluate predictor subsets based on their worth to a specific forecaster or classifier. Wrapper techniques assume the PSS to be a searching problem that prepares various mixes of predictors, which are assessed and contrasted with other mixes. The common heuristic AI-based optimization methods mentioned above are used to monitor the searching procedure. Compared to filter techniques, wrapper techniques reveal improved performance, since various predictor sets are assessed by a predictive model or fitting method in every iteration [13]. Embedded techniques merge the predictor selection process into the training task of prediction models. For instance, the regularization approaches in [1] are one example of an embedded type PSS method. Table 1 presents recent works on PSS strategies for forecasting problems.
The works proposed in [22]–[30] have implemented the GA-based PSS in different application domains and scenarios.
Following a comprehensive assessment of the above-mentioned genetic algorithm based PSS techniques, we find that most research has used a conventional genetic algorithm with the usual framework (conventional GA configuration). For instance, the initial population (initial chromosome set) is created arbitrarily, so population diversity cannot be guaranteed and the occurrence of duplicated predictors may degrade the quality of the search procedure. Moreover, the conventional GA works with the continuous features themselves to minimize the desired fitness function (PSS evaluation measure), which reduces the efficiency of the algorithm and increases computational complexity and total computation time.
Assuming adaptive heuristic algorithms should be among the best options to determine the search target; a research problem exists and can be addressed by replacing the conventional GA with the BGA and hybridizing it with robust fitness evaluation measures. The BGA first represents the predictors as encoded binary strings, and works with the binary strings to minimize the SVR-based evaluation measure to obtain a relevant and nonredundant predictor subset at the end. BGA is more efficient and stable than the conventional GA. It also reduces computational complexity and execution time compared to the conventional GA.
Therefore, this paper proposes an adaptive hybrid predictor subset selection strategy to obtain the most relevant and nonredundant predictors for enhanced short-term forecasting of the power output of distributed PVs. In the proposed strategy, the Binary Genetic Algorithm (BGA) is applied for the predictor selection process and Support Vector Regression (SVR) is used for measuring the fitness score of the features.
To the best of our knowledge, there exist very few research works that have performed PSS before fitting or training forecasting models. Moreover, as far as we have investigated, the BGA-SVR based hybrid machine learning approach has never been applied to the PSS problem in the domain of renewable energy generation forecasting. Generally, the paper's contribution can be considered as (1) modeling, parameterization and implementation of the BGA and SVR algorithms to suit the predictor selection problem in question, and (2) establishment of a seamless combination of the two algorithms working in unison to solve the predictor selection problem. Therefore, from an application and hybridization point of view, this is the first work to hybridize the BGA and SVR algorithms for the PSS problem in the domain of electric power system research.
Specifically, the paper contributions can be summarized as follows:
Analyze and demonstrate the relevance of an effective PSS strategy and enabling tools for enhanced and accurate PV power forecasting;
Present an effective and efficient machine learning-based adaptive PSS strategy for PV power forecasting;
Enhance PV power forecasting accuracy through the application of PSS before training forecasting models.
The rest of the paper is organized as follows. Section II describes the dataset and states the PSS problem. Section III presents the brief working principle of the BGA. Similarly, the theory and mathematical modeling of the SVR model used for the fitness measure in the BGA is described in Section IV. Section V presents the proposed BGA-SVR based PSS strategy. The achieved experimental results and validations are presented in Section VI. The paper is concluded in Section VII.
Dataset and Predictor Selection Problem
The original predictor set is constructed through basic assessment of the characteristics of the power production of distributed PV systems and its association with external agents. The external agents are seasonality (minute/hour, month and season) and weather factors. The availability of the data sources for these external agents affecting the PV power production is also another major factor to construct the original predictor space.
The candidate original predictor set for the PV power forecasting in this PSS work consists of seasonal (or calendar) parameters and weather parameters. The variables constituting this original predictor set are listed in Table 2.
Therefore, the predictor space of the PSS is a 192-by-20 matrix of hourly samples, collected on the following eight days representing the four seasons:
Fall: October 4, 2017 and October 12, 2017,
Winter: January 7, 2018 and January 11, 2018,
Spring: April 21, 2018 and April 26, 2018, and
Summer: July 10, 2018 and July 11, 2018.
In this paper, the following optimization problem is solved to find the best (relevant and nonredundant) predictor subset from the original dataset given in Table 2.
PSS Problem:
Given that:\begin{equation*} f_{r} \in Z^{+},\ 1 \leq f_{r} \leq 20, \text { and } \beta \in \mathrm {R}^{+},\ 0 \leq \beta \leq 100\tag{1}\end{equation*}
Binary Genetic Algorithm (BGA)
The GA is a population-based heuristic optimization method inspired by the survival-of-the-fittest principle of Charles Darwin's theory of evolution and genetics [31]. The GA operating mechanism involves iterative steps that process a set of chromosomes (candidate solutions) to generate a new population (offspring) via the genetic operators of selection, crossover and mutation. The fitnesses of the candidate solutions (chromosomes) are calculated employing an objective or fitness function, meaning that the objective function provides scores (numeric values) which are used to grade the existing solutions in the population. The BGA is an extended version of the standard GA: it first represents the candidate solutions as encoded binary strings (a binary search space) and works with the binary strings to minimize or maximize the fitness function. The BGA is more efficient and stable than the conventional GA, and it reduces computational complexity and execution time. Figure 1 shows the flowchart of the BGA.
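The BGA loop described above (evaluate, select, apply genetic operators, repeat) can be sketched as follows. This is a minimal illustrative implementation, not the paper's code; the function name, population size, crossover fraction and mutation rate are placeholder assumptions.

```python
import random

def binary_ga(fitness, genome_len, pop_size=20, generations=50,
              elite_count=2, crossover_frac=0.8, seed=0):
    """Minimal binary GA: evolves bit-string chromosomes to MINIMIZE `fitness`."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)                      # best (lowest) fitness first
        elites = pop[:elite_count]                 # elitism: keep best chromosomes
        n_cross = round((pop_size - elite_count) * crossover_frac)
        children = []
        for _ in range(n_cross):                   # XOR-style arithmetic crossover
            p1, p2 = rng.sample(pop[:pop_size // 2], 2)
            children.append([a ^ b for a, b in zip(p1, p2)])
        while len(children) < pop_size - elite_count:  # uniform mutation offspring
            parent = rng.choice(pop)
            children.append([g ^ (rng.random() < 0.1) for g in parent])
        pop = elites + children
    return min(pop, key=fitness)
```

For example, minimizing the number of '1' genes drives the population toward the all-zero chromosome.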
Support Vector Regression (SVR)
SVR is a non-parametric method that essentially depends on a kernel function. Vapnik [32] established the essentials of SVRs in 1995. SVRs are receiving significant attention due to a number of notable characteristics and promising practical performance. SVR has been effectively implemented for prediction tasks and pattern classification, notably the separation of two distinct pattern categories. Its formulation builds on the structural-risk-minimization (SRM) theory, which has been shown to be superior to the standard empirical-risk-minimization (ERM) theory utilized by conventional ANNs [33].
A linear regression in the higher-dimensional feature space corresponds to a nonlinear regression in the lower-dimensional input space, and is expressed as follows [34].\begin{equation*} y\left ({x }\right)=w.\Phi \left ({x }\right)+b; \Phi: R^{n}\to R^{N}\tag{2}\end{equation*}
Figure 2 illustrates the configuration of an SVR, where the input x is mapped into a higher-dimensional feature space via the mapping function Φ.
A special SVR known as linear-epsilon-insensitive SVR (ε-SVR) is employed in this paper. Training the ε-SVR amounts to solving the following constrained optimization problem:\begin{align*}&min~ \left \{{\frac {1}{2}w^{T}w+\gamma \sum \nolimits _{i=1}^{N} \left ({\xi _{i}+\xi _{i}^{\ast } }\right) }\right \} \\&subject~ to:~ y_{i}-w.\Phi \left ({x_{i} }\right)-b\le \varepsilon +\xi _{i} \\&\hphantom {subject~ to:~}w.\Phi \left ({x_{i} }\right)+b-y_{i}\le \varepsilon +\xi _{i}^{\ast } \\&\hphantom {subject~ to:~}\xi _{i}, \xi _{i}^{\ast }\ge 0\tag{3}\end{align*}
The optimization problem expressed by equation (3) is of the quadratic programming type, and is generally solved via its equivalent dual problem defined below.\begin{align*}&min\left \{{\begin{array}{c}{\frac {1}{2} \sum _{i=1}^{N} \sum _{j=1}^{N}\left ({\alpha _{i}-\alpha _{i}^{*}}\right) K\left ({x_{i}, x_{j}}\right) \left ({\alpha _{j}-\alpha _{j}^{*}}\right)+} \\ {\sum _{i=1}^{N}\left ({\alpha _{i}+\alpha _{i}^{*}}\right) \cdot \varepsilon -\sum _{i=1}^{N}\left ({\alpha _{i}-\alpha _{i}^{*}}\right) \cdot y_{i}}\end{array}}\right \} \\&subject~ to: \sum \nolimits _{i=1}^{N} \left ({\alpha _{i}-\alpha _{i}^{\ast } }\right) =0;~ 0\le \alpha _{i}, \alpha _{i}^{\ast }\le \gamma \tag{4}\end{align*}
The resulting SVR prediction function is then expressed in terms of the Lagrange multipliers and the kernel function K as:\begin{equation*} \hat {y}\left ({x }\right)=\sum \nolimits _{i=1}^{N} \left ({\alpha _{i}-\alpha _{i}^{\ast } }\right).K\left ({x, x_{i} }\right)+b\tag{5}\end{equation*}
From the Karush-Kuhn-Tucker (KKT) optimality condition [34] for quadratic-programming type objective functions, all the terms $(\alpha _{i}-\alpha _{i}^{\ast })$ vanish except those associated with the support vectors. The radial basis function (RBF) kernel used in this paper is defined as:\begin{equation*} K\left ({x_{i}, x_{j} }\right)=exp\left ({-\frac {\left \|{ x_{i}-x_{j} }\right \|^{2}}{\sigma ^{2}} }\right)\tag{6}\end{equation*}
The SVR model parameters are obtained by solving the optimization problem formulated in (3).
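As an illustration of the kind of ε-SVR model described above, the following sketch fits an RBF-kernel SVR on synthetic data, using scikit-learn as a stand-in for the paper's MATLAB implementation. The hyperparameter values are illustrative assumptions; note that scikit-learn's `C` plays the role of the normalization parameter γ in (3), while its `gamma` corresponds to 1/σ² in (6).

```python
# Illustrative epsilon-insensitive SVR with an RBF kernel (scikit-learn
# stand-in for the SVR model of Section IV; data are synthetic).
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(0, 6, size=(200, 1))                      # training inputs x
y = np.sin(X).ravel() + 0.05 * rng.standard_normal(200)   # noisy target y(x)

model = SVR(kernel="rbf", C=10.0, epsilon=0.05, gamma=0.5)
model.fit(X, y)
print(model.predict([[1.5]]))  # close to sin(1.5)
```

Only the training points whose residuals exceed the ε-tube become support vectors, which is what keeps the fitted model sparse.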
Proposed BGA-SVR Subset Selection Strategy
As shown by the flowchart in Figure 3, there are five key sub-operations in the BGA: chromosome encoding, objective value calculation, selection methods, genetic operators and the stopping condition. The BGA works in a binary search domain (chromosome bit-strings) and operates on the finite binary chromosome set based on the survival-of-the-fittest principle. A starting population is generated and assessed using an objective function. For the binary chromosomes employed in this paper, a gene value of '1' indicates that the feature at that gene's position is selected; otherwise (if '0'), the feature is not selected for the fitness evaluation.
Using the position indices of the variables flagged by the '1' genes, the corresponding predictors are extracted for each individual, and the individuals are then ranked by fitness.
A. Initial Population
The BGA starting solution space used in this work is a matrix of size p-by-q, i.e., 20 binary chromosomes, each of length q = 20.
B. Fitness Evaluation
For the BGA to choose the predictor subset, an objective function (BGA driver) should be specified to calculate the discriminative power of each predictor subset.
The fitness of each chromosome in the population is evaluated employing an SVR-based fitness function. In this paper, the fitness of the various subsets of predictors is evaluated using the MSE (mean squared error) of the SVR model residuals. The SVR model output y(x), computed from the predictor subset encoded by the chromosome, is compared with the training target to form these residuals.
Hence, the MSE between the training target and the SVR model estimate, evaluated for each predictor subset in the predictor search space defined in Table 2, is used as the fitness evaluation measure, defined as follows.\begin{equation*} fit=\frac {1}{N}\sum \nolimits _{i=1}^{N} \left ({T_{i}-y_{i} }\right)^{2}\tag{7}\end{equation*}
The aim of the BGA is to minimize the fitness function (MSE) defined in (7) by choosing, over subsequent iterations, a subset of input predictors with the best fitness. In each chromosome, a gene value of '1' indicates that the predictor at that position is selected; if it is '0', the predictor is excluded from the assessment of the chromosome in question. The chromosomes representing the predictors are encoded as bit-strings.
While the BGA runs, the individual chromosomes (feature subsets) in the present population are assessed, and their fitnesses are graded based on the SVR model residual or error. Chromosomes with smaller fitness (smaller residual or error) have a greater probability of persisting in the next population or mating-pool.
Each BGA iteration decreases the error level and designates the chromosome with the lowest (best) objective function value as the elite. Since an error level is computed for every individual, the BGA ultimately attains the least error level, and the chromosome corresponding to that least error level contains the most relevant desired predictors.
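A minimal sketch of the chromosome fitness evaluation described above, assuming a scikit-learn SVR as a stand-in for the paper's SVR model: the '1' genes select predictor columns, the SVR is fitted, and the MSE of its residuals, as in eq. (7), is returned. The function name and SVR defaults are illustrative.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

def chromosome_fitness(bits, X, T):
    """MSE fitness of one binary chromosome per eq. (7): genes set to '1'
    select the corresponding predictor columns for the SVR fit."""
    cols = [i for i, b in enumerate(bits) if b == 1]
    if not cols:                           # empty subset: worst possible fitness
        return float("inf")
    model = SVR(kernel="rbf").fit(X[:, cols], T)
    y = model.predict(X[:, cols])          # SVR model estimate on training data
    return mean_squared_error(T, y)
```

A chromosome selecting a predictor that actually drives the target should score a lower (better) MSE than one selecting only irrelevant columns.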
C. Reproduction
Table 3 presents the parameters of the BGA used in this paper. As shown in Table 3, the chromosome length equals 20, since a total of 20 predictors are nominated for the PSS work in this paper. Following the fitness evaluation, a new population is produced for the next generation through elitism, crossover and mutation.
In the BGA, three kinds of offspring are formed sequentially to create the new population [35]. They are:
Elite offspring: A tournament selection mechanism (with size 2) is used in this study because of its ease of use, speed and efficiency [29], [36]. Hence, the top two offspring with the best fitness scores are carried directly into the next generation, so the number of elite offspring (elite count) is O1 = 2. The remaining 18 (i.e., 20 − O1) chromosomes of the population are then generated as crossover and mutation offspring.
Crossover offspring: The crossover function used in this paper is of the arithmetic type, which applies a logical XOR operation to the chromosomes of the two parents, as they are represented in binary form. The portion of the next generation (excluding the elite offspring) created by the crossover operator is known as crossover offspring. The crossover fraction, which is the ratio of crossover offspring to the non-elite population, is taken as 0.8; hence the number of crossover offspring is O2 = round(18 × 0.8) = 14.
Mutation offspring: The BGA implemented in this study uses uniform mutation, whereby the BGA creates a set of uniformly distributed random numbers whose size equals the length of the chromosome. The number of mutation offspring is O3 = 20 − O1 − O2 = 20 − 2 − 14 = 4, which is verified by O1 + O2 + O3 = 20.
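The offspring bookkeeping above can be checked with a few lines; the XOR crossover and uniform mutation shown here follow the operators described in this subsection, with an assumed mutation rate of 0.1.

```python
import random

pop_size, elite_count, crossover_frac = 20, 2, 0.8    # Table 3 values
O1 = elite_count
O2 = round((pop_size - O1) * crossover_frac)          # round(18 * 0.8) = 14
O3 = pop_size - O1 - O2                               # 20 - 2 - 14 = 4
assert O1 + O2 + O3 == pop_size

# XOR ("arithmetic") crossover of two binary parents:
p1, p2 = [1, 0, 1, 1, 0], [1, 1, 0, 1, 0]
child = [a ^ b for a, b in zip(p1, p2)]               # -> [0, 1, 1, 0, 0]

# Uniform mutation: flip each gene with a fixed small probability.
rng = random.Random(1)
mutant = [g ^ (rng.random() < 0.1) for g in p1]
```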
D. Convergence Condition
The BGA terminates when it converges to the desired optimal solution. The optimal solution corresponds to the desired predictor subset for the PSS problem in question. The termination condition where the BGA ends running is known as the convergence or stopping condition. The two convergence conditions used in this paper are the following:
Maximum number of generations or iterations
Stalled generation limit
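A hedged sketch of how these two convergence conditions might be checked, assuming the best fitness per generation is recorded; the function name, limits and tolerance are illustrative, not the paper's settings.

```python
def should_stop(history, max_generations=100, stall_limit=20, tol=1e-9):
    """history: best (elite) fitness per completed generation, lower is better.
    Stops on either the generation cap or a stall (no meaningful improvement
    over the last `stall_limit` generations)."""
    if len(history) >= max_generations:      # condition 1: generation cap
        return True
    if len(history) > stall_limit:           # condition 2: stalled generations
        recent_best = min(history[-stall_limit:])
        earlier_best = min(history[:-stall_limit])
        return earlier_best - recent_best < tol
    return False
```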
Experimental Results and Validation
In this section, the case study for the proposed PSS work and the results obtained are discussed. Comparative validation, configuration of an adaptive PV power forecasting model based on the PSS results and quantitative relevance analysis of the PSS results are also presented in this section.
A. Case Study
In this paper, the hybrid BGA-SVR based PSS strategy is developed and implemented for a pilot distributed PV system installed on a building rooftop located in the Otaniemi area of Espoo, Finland. The PV system has a peak generation capacity of 4.3 kW.
The original predictor space for the PSS work is described in Table 2. The amount of the PV power production is the desired target variable in the proposed PSS strategy.
Hourly samples from eight days (192 values) of both the predictor set and the target variable are used in the PSS.
B. PSS Results
The empirical results achieved by the proposed PSS method are presented in Table 4.
As is clearly observed from the PSS result in Table 4, the number of predictors chosen by the proposed PSS strategy is considerably lower than the size of the predictor space (the number of predictors in the original dataset is given in Table 2). This can be due to irrelevant and redundant information in most of the variables in the original predictor space. The BGA-SVR finally selects the predictor subset which contains the most relevant and nonredundant variables. A predictor subset consisting of predictors 1, 2, 3, 4, 8, 14, 17, and 20, which represent hour of the day, month of the year, season of the year, ambient air temperature, snow depth, cloud cover, global solar radiation, and sunshine duration, respectively, is selected by the devised BGA-SVR based PSS method. This selected predictor subset can therefore establish an appropriate input dataset for improved PV power forecasting.
Figure 5 shows the BGA objective function value (SVR model based MSE function formulated in (7)) over generations.
Moreover, the average computation time of the devised integrated BGA-SVR based PSS algorithm with the eight-day-long hourly sample of 20 initial predictors is about five minutes in the MATLAB simulation environment on a research workstation with an Intel Core i7-6820HQ processor (2.70 GHz CPU) and 16 GB RAM.
C. Comparison With Other PSS Methods
To validate the BGA-SVR PSS work in this paper, the predictor subset obtained by the proposed BGA-SVR PSS is compared with the predictor subsets obtained using two other common PSS techniques, namely Correlation-based predictor subset selection (C PSS) and Neighborhood Component Analysis regression-based predictor subset selection (NCA PSS).
The Correlation-based PSS first calculates the Pearson and Spearman correlations of each predictor with the target, and then takes the maximum of the two correlation coefficients.
The Pearson correlation ($r_{P}$) is defined as:\begin{equation*} r_{P}=\frac {n\left ({\sum {xy} }\right)-\left ({\sum x }\right)\left ({\sum y }\right)}{\sqrt {\left [{ n\sum x^{2} -\left ({\sum x }\right)^{2} }\right]\left [{ n\sum y^{2} -\left ({\sum y }\right)^{2} }\right]}}\tag{8}\end{equation*}
where $n$ is the sample size, $x$ is the value of the predictor and $y$ is the value of the target variable.
The Spearman correlation ($r_{S}$) is defined as:\begin{equation*} r_{S}=1-\frac {6\sum \limits _{i=1}^{n} d_{i}^{2}}{n\left ({n^{2}-1 }\right)}\tag{9}\end{equation*}
where $n$ is the number of samples and $d_{i}$ is the difference between the ranks of the predictor value $x$ and the target variable value $y$.
The values of these two correlation coefficients are the same if and only if there exists a linear relationship between the variables. The values of both coefficients lie in the interval [−1, 1].
Either of the correlation values can be higher based on the nature of the relationship of the variables. In this paper, the maximum of the two correlation coefficients is used to measure the strength of the relationship between the predictors and target variable as defined below:\begin{equation*} Corr=max \left \{{\left |{ r_{P} }\right |,\left |{ r_{S} }\right | }\right \}\tag{10}\end{equation*}
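The Corr measure of eq. (10) can be computed directly with SciPy's correlation routines. The example below uses a synthetic monotonic (but nonlinear) predictor, for which the Spearman coefficient dominates; data and names are illustrative.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def corr_strength(x, y):
    """Eq. (10): Corr = max(|r_P|, |r_S|) for one predictor x vs target y."""
    r_p, _ = pearsonr(x, y)
    r_s, _ = spearmanr(x, y)
    return max(abs(r_p), abs(r_s))

x = np.arange(50, dtype=float)
y = x ** 3                        # monotonic but nonlinear relationship
print(corr_strength(x, y))        # Spearman gives 1.0 here, so Corr = 1.0
```

Taking the maximum of the two coefficients lets the measure capture both linear (Pearson) and monotonic nonlinear (Spearman) dependence.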
A predictor with correlation value greater than a given threshold value can be selected as a relevant predictor and included in the final predictor subset. The SVR model based fitness evaluation measure (MSE) formulated by equation (7) can be calculated in order to determine the threshold correlation value to select the relevant predictors affecting the PV power production. Table 7 provides the values of the fitness measure for the various predictor subsets for the different correlation coefficients given in Table 6.
As clearly observed in Table 7, the predictor subset consisting of predictors with correlation values greater than or equal to 0.20 achieves the best fitness value, i.e., the lowest MSE.
The NCA PSS is based on the neighborhood component analysis (NCA) regression model fitted over the predictor subsets versus target dataset. The NCA PSS obtains the predictor weights (for reduced predictor subsets) using a diagonal adaptation of the NCA regression model. The NCA model realizes PSS by regularizing the predictor weights. The predictor weight indicates the strength of the relationship of the predictor with the target variable. The predictor weights are obtained by solving the following NCA regression model based on the unconstrained stochastic minimization problem:\begin{align*} \min:{f\left ({w }\right)}=\frac {1}{n}\sum \nolimits _{i=1}^{n} \sum \nolimits _{j=1, j\ne i}^{n} {\rho _{ij}l\left ({y_{i},y_{j} }\right)} \!+\!\lambda \sum \nolimits _{r=1}^{p} w_{r}^{2} \\\tag{11}\end{align*}
where the loss function $l$ is the absolute deviation between target values:\begin{equation*} l\left ({y_{i},y_{j} }\right)=\left |{ y_{i}-y_{j} }\right |\tag{12}\end{equation*}
The predictors and their associated weight values are plotted in Figure 8.
As shown in Figure 8, the irrelevant predictors that are not selected by this method are indicated with zero weight values. Predictors whose weight value is not indicated by zero in Figure 8 are chosen. Hence, according to the NCA regression model based PSS, predictors 12, 17 and 19 are selected to constitute the input variables for the PV power forecasting.
Table 8 provides the performance comparison of the PSS result by the proposed method and the other two methods. For the purpose of suitability of comparison, the same fitness function (MSE) modeled as the residual of the SVR model is used. That means that each selected predictor subset by the respective method is evaluated for fitness using the SVR model residual.
As shown in Table 8, the proposed BGA-SVR based PSS achieved the predictor subset with the best fitness value (lowest MSE). Hence, the predictor subset selected by the proposed PSS strategy contains more relevant and nonredundant features than the other PSS methods. That means, a PV power forecasting model whose input dataset constitutes the predictor subset found by the proposed BGA-SVR PSS strategy can achieve accurate prediction results.
D. PSS Results for Enhanced-Accuracy and Adaptive PV Power Forecasting
For further validation of the effectiveness of the obtained PSS results, a Feedforward Artificial Neural Network (FFANN) based 24h-ahead PV power forecast model was developed for the case-study PV system. The eight predictors selected by the devised BGA-SVR PSS, presented in Section VI-B, form the training input dataset for the FFANN forecast model. The training target variable is the output power of the PV plant. Eight months of hourly time-series data of the selected predictors and the target variable were used to train the FFANN model. The FFANN model parameters were found experimentally; a hidden layer of 10 neurons was used. Moreover, the conventional GA was used to find the optimal weight parameters of the FFANN model.
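As an illustrative stand-in for the FFANN forecast model described above (8 selected predictors in, one hidden layer of 10 neurons, PV power out), the sketch below uses scikit-learn's MLPRegressor rather than the paper's GA-trained network; the data and target relationship are fabricated for illustration only.

```python
# Toy stand-in for the FFANN forecast model: 8 inputs, 10 hidden neurons.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 8))           # 8 selected predictors (synthetic)
y = X[:, 6] * (1.0 - 0.5 * X[:, 5])      # toy "radiation damped by cloud" target

model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
model.fit(X, y)
```

In practice the network would be retrained periodically on new data, matching the adaptive scheme described in this section.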
The proposed model is adaptive: it can continuously learn the changes in the values of the predictor and target variables. It can be retrained periodically as new input datasets become available. In this way it acquires continuous knowledge of the predictor-versus-PV-power-production characteristics, and hence its prediction performance improves over time.
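One way to realize this periodic retraining is a rolling-window scheme, sketched below. The window length, the daily trigger, and the use of `LinearRegression` as a stand-in for the FFANN are all illustrative assumptions, not details from the paper.

```python
import numpy as np
from collections import deque
from sklearn.linear_model import LinearRegression

WINDOW = 24 * 30 * 8   # roughly eight months of hourly records (illustrative)
buf_X, buf_y = deque(maxlen=WINDOW), deque(maxlen=WINDOW)

def on_new_day(model, day_X, day_y):
    """Append the newest 24 hourly samples and refit on the rolling window."""
    buf_X.extend(day_X)
    buf_y.extend(day_y)
    model.fit(np.array(buf_X), np.array(buf_y))
    return model

# Usage: feed one synthetic day of data (8 predictors per hourly sample).
rng = np.random.default_rng(2)
m = on_new_day(LinearRegression(), rng.normal(size=(24, 8)), rng.normal(size=24))
print(len(buf_X))  # → 24
```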
Figure 9 shows the configuration of the PV power forecasting model that uses the selected predictors by the predictor selection strategy proposed and implemented in this paper.
Figure 9. Configuration of an adaptive PV power prediction model employing the selected predictors.
The prediction performance of the developed FFANN forecast model was verified with an out-of-sample hourly testing data of four randomly chosen days representing the four seasons of a year. The model testing (forecasting) results are presented with one-hour time resolution, and they are depicted in Figures 10 to 13, for the winter, spring, summer and fall testing days, respectively.
As shown in Figures 10–13, the forecasts follow the actual PV power production trends with smaller gaps (errors) in between. This further verifies the effectiveness of the proposed PSS approach in selecting the best predictor subset that enables the forecast model to achieve improved forecasts that are more accurate.
Furthermore, the following criteria were employed to evaluate the accuracy of the obtained forecasts:
Error:
\begin{equation*} \mathrm {Error}=P_{h}^{a}-P_{h}^{f}\tag{13}\end{equation*}
where $P_{h}^{a}$ and $P_{h}^{f}$ are the actual and forecasted values of the PV power production at hour $h$, respectively.
Mean absolute error (MAE):
\begin{equation*} MAE=\frac {1}{NH}\sum \limits _{h=1}^{NH} \left |{ P_{h}^{a}-P_{h}^{f} }\right |\tag{14}\end{equation*}
where $NH$ is the forecasting horizon; its value is 24 for a 24h-ahead forecast.
Mean absolute percentage error (MAPE)
\begin{equation*} MAPE=\frac {100}{NH}\sum \limits _{h=1}^{NH} \left |{ \frac {P_{h}^{a}-P_{h}^{f}}{P_{h}^{a}} }\right |\tag{15}\end{equation*}
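The MAE and MAPE criteria above translate directly into code; the arrays below are illustrative actual and forecasted hourly power values, not the paper's test data.

```python
import numpy as np

def mae(actual, forecast):
    """Mean absolute error over the forecasting horizon, as in Eq. (14)."""
    return np.mean(np.abs(actual - forecast))

def mape(actual, forecast):
    """Mean absolute percentage error, Eq. (15); actual values must be nonzero."""
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))

actual = np.array([100.0, 200.0, 400.0])    # illustrative PV power values (kW)
forecast = np.array([110.0, 190.0, 390.0])

print(mae(actual, forecast))   # → 10.0
print(mape(actual, forecast))  # ≈ 5.83
```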
An average MAE of 16.13 kWh, a MAPE of 4.64%, and a daily-peak MAPE of 4.72% were obtained for the forecasts of the four testing days, using the proposed BGA-SVR PSS results as the input dataset for the FFANN-based forecast model of the case study local PV system. Hence, the obtained results validate the quality of the predictions and the effectiveness of the implemented PSS method, compared to the existing accuracy levels for day-ahead prediction of solar power generation. The numerical analysis of the prediction accuracy improvement is presented next.
E. Quantitative Relevance Analysis of PSS Results
In order to quantify the benefits and relevance of the proposed hybrid BGA-SVR based PSS method and the selected predictors, the following metrics are used:
Computation time reduction:
\begin{equation*} {\Delta t}_{comp}=\frac {t_{without\_{}PSS} -{t}_{with\_{}PSS}}{t_{without\_{}PSS}}\tag{16}\end{equation*}
where $t_{without\_{}PSS}$ is the total computation time (data preprocessing, forecasting-model training, validation, and prediction) using the original predictor space without PSS, $t_{with\_{}PSS}$ is the total computation time with the use of the obtained PSS results, and $\Delta t_{comp}$ is the change in total computation time due to PSS. A positive value of $\Delta t_{comp}$ indicates a reduction in the computation time requirement of the PV power forecasting model due to making use of the PSS results.
Dimensionality reduction:
\begin{equation*} \Delta D=\frac {R_{without\_{}PSS}^{m\times n} -{R}_{with\_{}PSS}^{m\times n_{r}}}{R_{without\_{}PSS}^{m\times n}}\tag{17}\end{equation*}
where $R_{without\_{}PSS}^{m\times n}$ is the predictor-space matrix without PSS, with $m$ samples and $n$ predictors, $R_{with\_{}PSS}^{m\times n_{r}}$ is the reduced predictor-space matrix with PSS, with $m$ samples and $n_{r}$ predictors, and $\Delta D$ is the change in data dimension $(n-n_{r})$ due to PSS. A positive value of $\Delta D$ indicates a reduction in the input-data dimension for the PV power forecasting model.
PSS fitness value enhancement:
\begin{equation*} \Delta fit=\frac {fit_{without\_{}PSS} -{fit}_{with\_{}PSS}}{fit_{without\_{}PSS}}\tag{18}\end{equation*}
where $fit_{without\_{}PSS}$ is the fitness value of the predictors without PSS with respect to the predefined fitness function (the MSE of the SVR output versus the actual target, formulated in equation (7)), $fit_{with\_{}PSS}$ is the fitness value of the selected predictors with PSS, and $\Delta fit$ is the change in fitness value due to PSS. A positive value of $\Delta fit$ indicates an improvement in fitness value (a reduction in MSE) due to PSS.
Prediction accuracy enhancement:
\begin{equation*} \Delta acc=\frac {acc_{with\_{}PSS} -{acc}_{without\_{}PSS}}{acc_{without\_{}PSS}}\tag{19}\end{equation*}
where $acc_{without\_{}PSS}$ is the accuracy of the predictions without making use of PSS results (using the original predictor space as training input) and $acc_{with\_{}PSS}$ is the accuracy of the predictions with PSS results (using the reduced predictor space as training input). These are defined as:
\begin{align*} {acc}_{without\_{}PSS}=&100 -{MAPE}_{without\_{}PSS}\tag{20}\\ {acc}_{with\_{}PSS}=&100 -{MAPE}_{with\_{}PSS}\tag{21}\end{align*}
where $MAPE_{without\_{}PSS}$ and $MAPE_{with\_{}PSS}$ are the mean absolute percentage errors of the predictions without and with PSS, respectively. $\Delta acc$ is the change in prediction accuracy due to PSS; a positive value indicates an improvement in prediction accuracy due to making use of the PSS results in the forecasting process.
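The four relevance metrics defined in (16)-(21) share a simple relative-change form and can be computed as below. The numeric inputs are illustrative placeholders, not the paper's measured values.

```python
def reduction(before, after):
    """Relative reduction (Eqs. 16-18): positive means an improvement
    (less time, fewer dimensions, or lower MSE) due to PSS."""
    return (before - after) / before

def acc_improvement(mape_without, mape_with):
    """Prediction-accuracy enhancement, Eqs. (19)-(21)."""
    acc_without = 100.0 - mape_without   # Eq. (20)
    acc_with = 100.0 - mape_with         # Eq. (21)
    return (acc_with - acc_without) / acc_without  # Eq. (19)

# Illustrative values only (not the paper's measured figures).
print(round(reduction(120.0, 56.4), 3))        # → 0.53  (e.g. computation time)
print(round(acc_improvement(11.0, 4.64), 4))   # → 0.0715
```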
Table 9 presents the values of the metrics defined in (16) to (19) to determine the benefits achieved due to the implementation of the devised PSS method for short-term PV power forecasting. It also shows the performance comparison of the PSS results by the proposed method and other conventional counterparts.
As shown in Table 9, the implementation of the PSS and its integration into the forecasting model has resulted in substantial improvements over the forecasting performance obtained using the original dataset without PSS. For example, the enhancement in fitness value (MSE) when the BGA-selected predictors are used to fit the PV power with the SVR model is 64.5% over the original predictor set (without PSS). Similarly, the reductions in computation time and data dimensionality over the original predictor space are 53% and 60%, respectively.
Enhancing the prediction accuracy is the primary objective of this paper. The enhancement in prediction accuracy when the BGA-SVR PSS-selected predictors constitute the forecasting model's training inputs is 58.4%, compared with the prediction accuracy obtained using the original predictor space. Moreover, the proposed PSS yields greater performance improvement than the conventional counterparts with respect to the prediction-accuracy and fitness-value metrics.
Therefore, the above quantifications and experimental results further validate the relevance and effectiveness of the PSS work for the enhancement of the PV power forecasting.
Conclusions
This study devised and developed a BGA-based predictor subset selection strategy for enhanced short-term PV power forecasting. The strategy uses an SVR-based fitness function to choose a combination of predictors from a given original predictor space. Real local PV output-power measurement data were used for the PSS work. The devised BGA-SVR PSS yielded a predictor subset with better fitness (lower MSE) than the original predictor space with all the initial predictors. It achieved the best predictor subset, which can constitute the input variables for accurate forecasting of distributed PV systems. For comparison and validation purposes, predictors selected by two other PSS methods were investigated; the BGA-SVR-selected predictors outperformed them with respect to the MSE fitness function defined within the SVR framework. In addition, an FFANN-based 24h-ahead PV power forecast model was developed to evaluate the effectiveness of the PSS results. The PV power forecasting model developed using the obtained PSS results achieved a prediction accuracy improvement of 58.4% compared with forecasting based on the original predictor space without PSS. The devised PSS also achieved 53%, 60% and 64.5% improvements in computation time, data dimensionality and MSE fitness value, respectively, compared with the original predictor space without PSS. These findings confirm that combining an effective PSS method with a forecasting model yields robust forecasting performance, compared with forecasting from arbitrary predictors without predictor selection. This work is both new and effective from the viewpoints of application in the renewable energy sector and of hybridizing algorithms for performance improvement. It contributes a novel and robust predictor selection tool, combining BGA and SVR, for enhanced and more accurate short-term solar power forecasting.