Introduction
A. Background
Sustainable energy sources have attracted keen interest around the world for several important reasons, such as the depletion of fossil fuels, the rise in prices, industrial pollution, the energy crisis, and the surge of ecological concerns [1]–[3]. Renewable Energy (RE) generation has been strongly encouraged and supported by government policies and technology advancements [3]. In 2018, the share of RE (181 gigawatts) in cumulative production capacity worldwide had increased rapidly, contributing to more than 50% of the annual average power production potential added in the same year [4]. Besides, the energy produced by solar irradiation is considered the most promising and safe energy, supplied by Photovoltaic (PV) systems [5]. PV energy sources are among the most extensively accessible and highly attractive RE sources owing to their significant potential for energy production [6]–[9].
The main difficulty in the PV system is the complexity, parasitic capacitance, harmonic distortion, and sophistication of the equations of the current-voltage and power-voltage characteristics [10]. The relationship between PV current and voltage is both implicit and complex, depending on certain variables, among them the ambient temperature, solar irradiation, wind speed, and dust accumulation [11], [12]. On hot days, the cell module temperature can quickly reach 70°C, at which the power output can drop significantly below nominal values [13]. The production of PV systems mainly depends on the amount of global solar irradiation received by the modules; any change in the power implies that the solar irradiation changed during the day or that the modules are affected by shading. In addition, wind speed can be a significant factor in the accumulation of dust, dirt, and soiling on the PV system [14]. Such a phenomenon prevents the effective absorption of solar irradiance by the PV cells and significantly reduces the overall PV power generation. This reduction in power can reach 50% in arid and semiarid regions, where the solar irradiation is usually high [15]. Thus, the production from such energy sources depends on intermittent (stochastic) weather variables that call for prediction (forecasting) models and tools capable of accurately estimating the PV power productions by accommodating such inherent stochasticity in the weather variables [16]–[18].
In this context, the prediction of PV power generation would make a significant contribution to the management and maintenance of modern energy systems, such as the connection to microgrids [19]. Prediction plays a critical role in managing the efficiency of the power system [7], [8], [20].
B. Literature Review and Motivation
Numerous methods for predicting PV production have been published in the literature. Nevertheless, an effective method is still needed to enhance the performance of PV prediction and decrease the adverse effects of system instability. In general, the prediction methods are classified into model-based and data-driven [16], [21]–[23].
Model-based methods are based on analytical equations that describe the PV power production process. The equations typically use weather conditions to predict the power output [23]. Usually, such methods do not require historical data, but they strongly depend on comprehensive station location details and reliable meteorological data. They can be simple, focusing on solar irradiation only, or more complicated if additional weather variables, such as ambient temperature, wind speed, and dust, are used. Thus, the effectiveness of their forecasts is heavily dependent on the precision of the Numerical Weather Predictions (NWPs) details. Although such methods can provide accurate predictions, the uncertainty resulting from the approximations and/or assumptions in the adopted models could limit their realistic implementation [18].
Contrarily, data-driven methods (developed using various Machine Learning (ML) techniques) depend uniquely on the availability of the historical pairs of weather variables and the associated solar power productions. They aim to build models (called black-box models) to capture the hidden mathematical relationship between the weather variables and the associated PV power productions [4], [16]–[18].
For example, Ding et al. [24] proposed an improved version of the Back-Propagation (BP) learning algorithm-based Artificial Neural Network (ANN) to predict the power output of a PV system under different environmental conditions. The improved BP algorithm was shown to be superior to the traditional BP algorithm in enhancing the accuracy of the power output prediction.
Zeng and Qiao [25] designed the Radial Basis Function-based Neural Network (RBF-NN) for short-term solar PV power prediction using past values of meteorological data (e.g., sky cover, transmissivity). Results showed that the RBF-NN outperforms the linear autoregressive (AR) and the Local Linear Regression (LLR) models. The authors concluded that the use of transmissivity and other additional meteorological data, particularly the sky cover, could considerably improve the accuracy of the power prediction.
Li et al. [26] predicted the PV output power using the Auto Regressive Moving Average with exogenous inputs (ARMAX) and Auto-Regressive Integrated Moving Average (ARIMA). The two models used as exogenous inputs the ambient temperature, insolation duration, precipitation amount, and relative humidity to predict the power output of a 2.1 kW grid-connected PV system. Results revealed that the ARMAX model significantly enhances the predictability of the power output over the ARIMA.
De Leone et al. [27] used Support Vector Regression (SVR) to predict the energy production of a PV plant located in Italy. The method used the past meteorological data (e.g., solar radiation, ambient temperature) and power outputs to predict future power outputs. The obtained results revealed that the quality of the expected power output depends heavily on the accuracy of the meteorological data.
Yang et al. [28] predicted the PV power in the short term using the Auto-Regressive with exogenous input based Spatio-Temporal (ARX-ST) model. The results were evaluated against the conventional Persistence model. The authors noted that the existing ARX-ST can be extended with more meteorological data to help boost the prediction precision.
Khademi et al. [29] proposed a Multi-Layer Perceptron equipped with an Artificial Bee Colony (MLP-ABC) algorithm to predict the power output of a 3.2kW PV plant. The collected data were separated into sunny and cloudy days and used to develop the MLP-ABC prediction model. The findings were compared to the MLP-ABC model when both sunny and cloudy days were used to establish the prediction model. It was concluded that the separation of different weather conditions enhanced the accuracy of the PV power output predictions.
Li et al. [30] used the Multivariate Adaptive Regression Splines (MARS) model for daily power output prediction of a grid-connected 2.1 kW PV system. This model maintains the flexibility of the traditional Multi-Linear Regression (MLR) paradigm, thus having the ability to handle non-linearity. The obtained results using the MARS model were compared with linear models, such as MLR, ARIMA, and ARMAX, as well as with some non-linear models, such as SVR.
Muhammad Ehsan et al. [31] implemented an MLP-based ANN model for 1-day-ahead power output prediction of a 20 kWp grid-connected solar plant situated in India. The authors examined different combinations of hidden layers, hidden neuron activation functions, and learning algorithms for reliable 1-day-ahead power predictions. They concluded that the ANN characterized by a single hidden layer, the Linear Sigmoid Axon neuron activation function, and the Conjugate Gradient learning algorithm was able to deliver reliable power output predictions.
Theocharides et al. [32] examined the performance of three different ML methods, namely ANNs, SVR, and Regression Trees (RTs), with different hyper-parameters and sets of features, in predicting the power production of PV systems. Their performance was compared to the Persistence model through the computation of the Mean Absolute Percentage Error (MAPE) and the normalized Root Mean Square Error (nRMSE). The obtained enhancements were then evaluated using the Skill Score (SS). It was found that the ANNs outperform the other prediction models from the literature.
Alomari et al. [33] proposed an ANN model for PV power production prediction. The proposed model investigated the strengths of two different learning algorithms (i.e., Levenberg-Marquardt (LM) and Bayesian Regularizations (BR)), by utilizing different variations of ANN model’s inputs. The conclusions drawn revealed that an ANN-based BR provides more accurate predictions than those obtained by ANN-based LM (i.e., RMSE = 0.0706 and 0.0753, respectively).
Al-Dahidi et al. [16] investigated the capability of the Extreme Learning Machine (ELM) in predicting the PV power output. The obtained results revealed that the ELM provides better generalization capability with negligible computational times compared to the traditional BP-ANN.
Later, Al-Dahidi et al. [18] suggested a comprehensive ANN-based ensemble solution for enhancing the 24h-ahead solar PV power output predictions. The authors also used the bootstrap technique to quantify the sources of uncertainty that influence the model predictions, in the form of Prediction Intervals (PIs). The efficacy of the recommended ensemble solution was illustrated on a real case study of a solar PV system (264 kWp capacity) located in Amman, Jordan. The suggested method was shown to be advantageous over various benchmarks in providing more accurate power predictions and in accurately quantifying multiple sources of uncertainty.
Behera et al. [34] proposed a prediction technique based on a combination of ELM, Incremental Conductance (IC), and Maximum Power Point Tracking (MPPT) techniques. The obtained results revealed that the ELM provides better performance compared to the standard BP-ANN and that the performance can be further enhanced using the Particle Swarm Optimization (PSO) technique.
Huang and Kuo [35] proposed a high-precision PVPNet model-based Deep-Learning Neural Networks (DLNNs) for 1-day ahead power output prediction. The prediction results obtained by the proposed PVPNet model were evaluated (in terms of RMSE and MAE) and compared to other ML techniques of literature. Authors concluded that the proposed PVPNet model has an excellent generalization capability and can boost the prediction performance, while reducing monitoring expenses, initial costs of hardware components, and long-term maintenance costs of future PV plants.
Catalina et al. [36] proposed two linear ML models (i.e., Least Absolute Shrinkage and Selection Operator (LASSO) and linear SVR), and two non-linear ML models (i.e., MLPs and Gaussian SVRs) with satellite-measured radiances and clear sky irradiance as inputs to nowcast the PV energy outputs over peninsular Spain. Results revealed that the two non-linear ML models were better than the two linear ML models.
From the above research works, it is apparent that the efforts were mainly dedicated to enhancing the employed data-driven prediction model or investigating other advanced models from the literature. Differently, this work aims to propose a local training strategy applicable to any data-driven prediction model for ultimately boosting the prediction accuracy of the solar PV power outputs, while reducing the computational times. Specifically, the hour-by-hour variability (i.e., the 24-hour seasonality pattern of each day) that arises in the solar data, of both the weather variables and the corresponding power productions, has never been explored while developing the prediction models. The consideration of such seasonality while developing the prediction models is expected to be beneficial in enhancing the prediction accuracy while reducing the computational times.
C. Contributions
The proposed training strategy requires splitting the available inputs-output patterns collected from the actual operation of a PV system based on an hour interval of 1 hour, establishing H = 24 hourly datasets, each used to develop a dedicated ANN prediction model.
Each built-ANN is initially optimized on a validation dataset in terms of the number of hidden neurons to enhance the prediction accuracy further and, then, utilized online to estimate the corresponding hourly production of a day on a test “unseen” dataset.
The effectiveness of the proposed training strategy-based ANN is examined on a grid-connected solar PV system (264kWp capacity) located in the Applied Science Private University (ASU), Amman, Jordan [4], [16], [18], [39]. Specifically, the accuracy of the power production predictions and the computational times required to develop and evaluate the built-ANNs are verified by resorting to three performance metrics from the literature [4], i.e., the RMSE, the MAE, and the Weighted MAE (WMAE), and to the computational time in minutes, respectively.
For comparison and validation, a single prediction model (i.e., for a fair comparison, an ANN model is considered) developed and optimized, in terms of the number of hidden neurons, globally on the entire dataset is used as a benchmark to verify the effectiveness of the proposed strategy on the ASU solar PV system. Moreover, the ELMs are used instead of the ANNs, and the Persistence prediction model is adopted further to verify the superiority of the proposed training strategy-based ANN.
Therefore, the significant contributions of the present work are two-fold:
The development of a local training strategy-based ANN for an accurate estimation of the solar PV power productions with short computational times;
The comparison of the obtained results to the global and local training strategies-based ANN and ELM, respectively, as well as to the Persistence prediction model of literature, to further explore the effectiveness of the proposed local training strategy.
The remainder of this article is organized as follows. In Section II, the work objectives are illustrated, and the problem of predicting the solar PV power output is stated. In Section III, the ASU solar PV system case study is described, and the proposed local training strategy-based ANN is illustrated, also providing an essential background of ANN. In Section IV, the application of the proposed training strategy-based ANN to the ASU case study is shown, and the obtained results are discussed and compared with those obtained by the global and local training strategies-based ANN and ELM, respectively, as well as to the Persistence model of literature. Section V investigates the influence of using different hour intervals on the prediction performance. Lastly, conclusions are drawn, and future works are recommended in Section VI.
Work Objectives
This work aims to develop a data-driven model for an accurate estimation of the power productions of a solar Photovoltaic (PV) system with convenient computational times. We consider the availability of historical weather data collected at each hour h:

- wind speed (S_h) (m/s);
- relative humidity (RH_h) (%);
- ambient temperature (T_amb_h) (°C); and
- global solar irradiation (I_rr_h) (W/m²).

Together with the timestamp and the corresponding power data, the overall dataset can be written as: \begin{equation*} \boldsymbol {X}=\left [{ \overrightarrow {hr}\ \overrightarrow {d}\ \overrightarrow {S}\ \overrightarrow {RH}\ \overrightarrow {T}_{amb}\ \overrightarrow {I}_{rr}\Big | \overrightarrow {P} }\right]\tag{1}\end{equation*}

The timestamp is here represented by the chronological hour (hr) and day (d) from the beginning of each year.
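For concreteness, the assembly of the dataset matrix X of Eq. (1) from the hourly series can be sketched in Python (the authors' implementation is an in-house MATLAB code; the function and array names below are illustrative only):

```python
import numpy as np

def assemble_dataset(hr, d, S, RH, T_amb, I_rr, P):
    """Stack the timestamp, weather, and power columns into the matrix X
    of Eq. (1); each row is one hourly inputs-output pattern.
    All arguments are 1-D arrays of equal length (hypothetical names)."""
    return np.column_stack([hr, d, S, RH, T_amb, I_rr, P])

# toy usage: three hourly patterns of one day
hr = np.array([1, 2, 3]); d = np.array([1, 1, 1])
S = np.array([2.1, 2.4, 1.9]); RH = np.array([40.0, 42.0, 45.0])
T_amb = np.array([15.0, 16.0, 18.0]); I_rr = np.array([0.0, 120.0, 350.0])
P = np.array([0.0, 20.0, 75.0])
X = assemble_dataset(hr, d, S, RH, T_amb, I_rr, P)
print(X.shape)  # (3, 7): six inputs plus the power output
```

Each row thus pairs the six inputs of a given hour with the corresponding power production, as required by the prediction models described next.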
We aim to effectively exploit the pre-processed dataset during the development/training stage of a data-driven prediction model to provide accurate predictions of the PV power production, with convenient computational times. To this aim, the available dataset is divided into H = 24 hourly datasets, and a dedicated prediction model is developed for each hour interval.
The proposed local training strategy is expected to provide more accurate power production predictions with short computational efforts, compared to the traditional global training strategy, in which the available dataset is used, entirely, to develop and optimize a single (H = 1) prediction model.
Material and Methodology
In this Section, the real data of a solar PV system used in this work, together with the proposed (local) and benchmark (global) training strategies-based ANN, are presented in Section III.A and Section III.B, respectively.
A. Material
This Section presents the real case study of a solar PV system with a capacity of 264 kWp mounted on the rooftop of the Faculty of Engineering (Al-Khawarizmi Building) at the Applied Science Private University (ASU) in Shafa Badran, Amman, Jordan (Latitude = 32.042044 and Longitude = 35.900232) (Fig. 2).
The dataset utilized in this work comprises real weather data collected by the ASU weather station, together with the corresponding power productions of the ASU PV system.
1) ASU Solar PV System
The ASU PV system comprises 14 SMA Sunny Tripower inverters (13 inverters with a power of 17 kW and 1 inverter with a power of 10 kW) connected to Yingli Solar panels (of type YL 245P-29b-PC) tilted by 11° and oriented 36° (from S to E). This orientation is chosen to collect as much solar radiation as possible during the day (as depicted in Fig. 3) [39].
ASU 264 kWp PV panels installed at the rooftop of the Faculty of Engineering (left) and the connected inverters (right).
The design characteristics of the ASU PV system are reported in Table 1.
2) ASU Weather Station
The ASU weather station (depicted in Fig. 4) is 36 m high and equipped with the latest instruments used to measure 45 different weather variables, such as global solar irradiation, relative humidity, precipitation amounts, wind speeds and directions, barometric pressure, and ambient temperatures collected at various levels from the ground.
ASU weather station (top) and the installed instruments (wind speed (bottom left), hygro-thermo (bottom middle), and pyranometer (bottom right)).
Among the 45 weather variables, engineering and professional judgment suggested using, as inputs to the prediction model, the weather variables most highly correlated with the PV power productions. These variables, together with the instruments installed for their measurement and their detailed characteristics, are [42]:
- the wind speed at 10 m (S) (m/s). The wind speed transmitter is used for measuring the horizontal component of the wind speed with high accuracy. The transmitter is equipped with electronically regulated heating to ensure smooth running of the ball bearings during winter operation and to prevent the shaft and slot from icing up. The technical specifications of this instrument are reported in Table 2;
- the relative humidity at 1 m (RH) (%). The hygro-thermo transmitter with a capacitive sensing element is used for measuring the relative humidity with high accuracy. The transmitter is equipped with a weather and thermal radiation shield to protect the humidity sensor against radiation, precipitation, and mechanical damage. The technical specifications of this instrument are reported in Table 2;
- the ambient temperature at 1 m (T_amb) (in °C). The hygro-thermo transmitter with an RTD is used for measuring the ambient air temperature with high accuracy. The transmitter is equipped with a weather and thermal radiation shield to protect the temperature sensor against radiation, precipitation, and mechanical damage. The technical specifications of this instrument are reported in Table 2; and
- the global solar irradiation (I_rr) (in W/m²). The pyranometers are used for measuring the global (total) irradiation on a plane surface with high accuracy. The technical specifications of this instrument are reported in Table 2.
Additionally, the corresponding timestamps (i.e., the number of hours and days from the beginning of each year of data) are also considered as inputs [4], [16], [18], [33], [43]. The remaining variables have been excluded from the analysis.
3) Data Pre-Processing
For proper utilization of the dataset, it has been pre-processed following the guidelines reported in [4], [16], [18], [33], [43], which used the same case study. Specifically:
- missing values have been excluded from the analysis;
- negative I_rr values and the corresponding missing P values (recognized in the early morning (i.e., 12 a.m.–6 a.m.) and late evening (i.e., 6 p.m.–11 p.m.)) have been set to zero; and
- the overall data have been normalized between 0 and 1.
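These pre-processing steps can be sketched as follows (a minimal Python illustration; the in-house code is in MATLAB, and the column layout below, with I_rr and P as the last two columns, is an assumption):

```python
import numpy as np

def preprocess(X):
    """Pre-processing sketch: zero out negative irradiance, zero the
    missing night-time power values, drop remaining missing rows, and
    min-max normalize every column to [0, 1]."""
    X = X.copy()
    X[X[:, -2] < 0, -2] = 0.0                       # negative I_rr -> 0
    night = np.isnan(X[:, -1]) & (X[:, -2] == 0.0)  # missing P at zero irradiance
    X[night, -1] = 0.0                              # missing night-time P -> 0
    X = X[~np.isnan(X).any(axis=1)]                 # exclude remaining missing values
    mins, maxs = X.min(axis=0), X.max(axis=0)
    return (X - mins) / (maxs - mins)               # normalize between 0 and 1

# toy usage; columns: T_amb, I_rr, P (hypothetical values)
raw = np.array([
    [1.0, -3.0, np.nan],
    [2.0, 0.0, 0.0],
    [3.0, 350.0, 80.0],
    [4.0, 610.0, 150.0],
])
clean = preprocess(raw)
```

After this step every retained pattern is complete and every variable lies in [0, 1], as assumed by the prediction models developed below.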
Fig. 5 shows the pre-processed hourly weather variables and the corresponding PV power productions of a few days selected throughout the 2017-year data. One can notice:

a) the different variability of the reported parameters' values collected at each h-th hour interval of 1 hour, h ∈ [1, 24], of the days. This justifies the motivation of using the collected h-th parameters' values solely in developing H = 24 different ANN prediction models, each dedicated to estimating the corresponding h-th power production value;

b) the extensive and random variability in the reported parameters' values collected over the entire hours of the days. It is essential to mention here that, typically, an ANN prediction model built on the whole dataset is expected to estimate the PV power productions at each h-th hour interval, h ∈ [1, 24], less accurately, compared to a).
Hourly weather variables (top and middle) and the corresponding PV solar power productions (bottom) of few days selected throughout the 2017-year data.
The pre-processed dataset is hereafter exploited to develop and verify the prediction models.
B. Methodology
In this Section, the proposed local and the benchmark global training strategies-based ANN, implemented with an in-house MATLAB code, are presented in Section B.1 and Section B.2, respectively.
1) The Proposed Local Training Strategy-Based ANN
The proposed training strategy-based ANN is sketched in Fig. 6, and it goes along the following four steps:
Step 1 (Establishing H = 24 Different Datasets): This step entails partitioning the overall available pre-processed dataset (X) into H = 24 different datasets (X_h) of equal size. Each dataset comprises the timestamp (hr_h, d_h), the weather variables (S_h, RH_h, T_amb_h, I_rr_h), and the corresponding power productions (P_h) collected at each h-th hour interval (i.e., h ∈ [1, 24/Δh], using an hour interval Δh = 1 hour) during the period of Y = 3.625 years. Thus, the H = 24 different datasets can be written as follows: \begin{equation*} \boldsymbol {X}_{ \boldsymbol {h}}=\left [{ \overrightarrow {hr}_{h}\ \overrightarrow {d}_{h}\ \overrightarrow {S}_{h}\ \overrightarrow {RH}_{h}\ \overrightarrow {T}_{amb_{h}}\ \overrightarrow {I}_{rr_{h}}\left |{ \overrightarrow {P}_{h} }\right. }\right],\quad h\in [{1,24}].\tag{2}\end{equation*}
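Step 1 amounts to a group-by-hour operation over the pattern matrix; a minimal Python sketch (assuming an hour-of-day label in 1..24 per pattern, derived from the timestamp; the paper's implementation is in MATLAB):

```python
import numpy as np

def split_by_hour(X, hour):
    """Partition the pattern matrix X (one row per hourly pattern) into
    H = 24 datasets X_h, keyed by the hour-of-day label of each row."""
    return {h: X[hour == h] for h in range(1, 25)}

# toy usage: two full days of hourly patterns, 7 columns as in Eq. (2)
hour = np.tile(np.arange(1, 25), 2)
X = np.arange(48 * 7, dtype=float).reshape(48, 7)
datasets = split_by_hour(X, hour)
print(len(datasets), datasets[1].shape)  # 24 datasets, each with 2 patterns
```

Each X_h then contains one pattern per day, all collected at the same hour interval.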
This partitioning is motivated by the fact that the inputs-output patterns collected at different hours and different days are highly variable. Thus, utilizing solely the data collected at the h-th hour, for the estimation of the corresponding h-th power production, in the development of the ANN prediction models, is expected to produce more accurate power production predictions. Additionally, this partitioning will, indeed, reduce the computational efforts required by the ANN prediction models in capturing the hidden "unknown" relationship between the inputs and the output power.

Step 2 (Train the H = 24 ANN Prediction Models With Different Configurations): Once the H = 24 different datasets are established, training (X_h^train), validation (X_h^valid), and test (X_h^test) datasets are extracted randomly from the established datasets X_h, h ∈ [1, 24], with arbitrary fractions of γ = 50%, β = 20%, and α = 30%, respectively. Such datasets are used to train, validate (Step 3), and test (Step 4) the H = 24 different feedforward ANNs, respectively, whose detailed characteristics are hereafter described. Specifically:

- the training datasets (X_h^train, h ∈ [1, 24]), each formed by N^train = 663 patterns (i.e., 663 days), are used for building/training the H = 24 ANN prediction models;
- the validation datasets (X_h^valid, h ∈ [1, 24]), each formed by N^valid = 265 patterns (i.e., 265 days), are used for optimizing the configurations of the H = 24 ANN prediction models;
- the test datasets (X_h^test, h ∈ [1, 24]), each formed by N^test = 398 patterns (i.e., 398 days), are used to evaluate the performance of the optimum ANN prediction models. It is, indeed, important to mention that the N^test patterns have never been used during the building/training and the optimization of the ANNs.

In other words, the datasets X_h, h ∈ [1, 24], obtained in Step 1 are divided into three disjoint subsets, namely, training (X_h^train), validation (X_h^valid), and test (X_h^test), by sampling their patterns randomly with the arbitrary fractions of γ = 50%, β = 20%, and α = 30%, respectively. The motivation for using such fractions is to assure that the annual seasonality appearing in the datasets is sufficiently captured while developing the ANN prediction models. To further assure that while sampling randomly, a Cross-Validation (CV) procedure is employed, as we shall see in Step 3. Thus, other arbitrary fractions could be considered, and the obtained conclusions would be, indeed, the same.
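The random 50%/20%/30% split of each hourly dataset can be sketched as follows; with the 1326 daily patterns per hour implied by the text, these fractions reproduce the reported sizes of 663, 265, and 398 patterns (Python illustration; the seed is for reproducibility of the sketch only):

```python
import numpy as np

def train_valid_test_split(X_h, gamma=0.50, beta=0.20, seed=0):
    """Draw disjoint training (gamma = 50%), validation (beta = 20%),
    and test (remaining alpha = 30%) patterns at random from X_h."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X_h))           # random, disjoint pattern indices
    n_tr, n_va = int(gamma * len(X_h)), int(beta * len(X_h))
    return X_h[idx[:n_tr]], X_h[idx[n_tr:n_tr + n_va]], X_h[idx[n_tr + n_va:]]

# one hourly dataset with 1326 daily patterns, 7 columns as in Eq. (2)
X_h = np.arange(1326 * 7, dtype=float).reshape(1326, 7)
tr, va, te = train_valid_test_split(X_h)
print(len(tr), len(va), len(te))  # 663 265 398
```

Repeating this sampling, as done in the CV procedure of Step 3, guards against an unlucky single split.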
Each h-th ANN prediction model (ANN_h, h ∈ [1, 24]) comprises three main layers, as shown in Fig. 7:

- Input layer. It receives the j-th pattern, x_h^j, that comprises the h-th six inputs (i.e., [hr_h^j, d_h^j, S_h^j, RH_h^j, T_amb_h^j, I_rr_h^j]), j = 1, …, N^train;
- Hidden layer. It comprises N_h hidden neurons used to process the received inputs via a hidden neuron activation function, f_1(), and to send the processed information to the output layer. In practice, the hidden neuron activation function is a continuous non-polynomial function (e.g., "Log-Sigmoid", "Linear", "Radial Basis", etc.) established to capture the hidden non-linear "unknown" relationship between the inputs and the outputs;
- Output layer. It provides an estimation of the corresponding h-th power production, \hat{P}_h^j, via an output neuron activation function, f_2(), which is typically a linear transfer function ("Purelin") [4], [44]. The estimated h-th power production, \hat{P}_h^j, of the j-th input pattern, x_h^j, can be written as follows: \begin{align*}&\hspace {-0.5pc}\hat {P}_{h}^{j}=f_{2}\left ({\sum \nolimits _{n=1}^{N_{h}} {\overrightarrow {\beta }_{n}f_{1}\left ({\overrightarrow {w}_{n}x_{h}^{j}+b_{n} }\right)+b_{o}} }\right), \\&\qquad \qquad \qquad \quad h\in [{1,24}],\quad j=1,\ldots,N^{train}\tag{3}\end{align*}

where n and j are the indexes of the hidden neurons (n = 1, …, N_h) and of the available training patterns (j = 1, …, N^train), respectively; b_n and b_o are the biases of the connections established between the bias neurons and each n-th hidden neuron and the output neuron, respectively; w_n and β_n are the weight vectors of the connections established between the input neurons and each n-th hidden neuron, and between each n-th hidden neuron and the output neuron, respectively.
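The mapping of Eq. (3) can be illustrated with a short Python sketch (the hidden activation is taken here as the log-sigmoid, one of the options named above; all parameter values in the usage example are hypothetical):

```python
import numpy as np

def ann_forward(x, W, b, beta, b_o):
    """Eq. (3) sketch: single-hidden-layer feedforward pass with
    f1 = log-sigmoid hidden activation and f2 = linear ("purelin") output.
    x: (6,) inputs; W: (N_h, 6) input weights; b, beta: (N_h,); b_o: scalar."""
    f1 = lambda z: 1.0 / (1.0 + np.exp(-z))   # hidden activation f1
    hidden = f1(W @ x + b)                    # N_h hidden-neuron outputs
    return float(beta @ hidden + b_o)         # linear output neuron f2

# toy usage with N_h = 5 randomly initialized internal parameters
rng = np.random.default_rng(1)
x = rng.random(6)
W, b = rng.standard_normal((5, 6)), rng.standard_normal(5)
beta, b_o = rng.standard_normal(5), 0.1
p_hat = ann_forward(x, W, b, beta, b_o)
```

Training then amounts to adjusting W, b, beta, and b_o so that p_hat matches the recorded h-th power production, which the paper does with the Levenberg-Marquardt BP algorithm.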
To adequately define the ANN configurations, different candidate numbers of hidden neurons (n_candidate) are explored. The n_candidate is considered to span the interval [5, 30] with a step size of 5. For each possible configuration, the h-th power production (whose actual value is P_h^j) is estimated (\hat{P}_h^j), and the Levenberg-Marquardt (LM) error BP learning algorithm is adopted to minimize the mismatch between the actual and estimated power productions (typically by calculating the Mean Square Error (MSE)), exploring different random initializations of the ANN internal parameters (i.e., β_n, w_n, b_n, b_o). The built ANNs (ANN'_h, h ∈ [1, 24]) are those whose internal parameters are optimally selected to minimize the MSE on the N^train training inputs-output patterns.

Step 3 (Validate the Built-ANNs With Different Configurations):
Once the H = 24 different ANNs are built with the different possible configurations, the best model for each h-th hour (ANN_h^opt, h ∈ [1, 24]) is selected by evaluating the prediction performance on the validation dataset (X_h^valid). To this aim, three standard performance metrics from the literature are computed [4], [9], [16], [18]:

- Root Mean Square Error (RMSE) [kW] (Eq. (4)). This metric describes the difference between the actual (true) and estimated power productions produced by the built ANN models. A small RMSE indicates that the predictions are accurate, and vice versa; \begin{equation*} RMSE_{h}=\sqrt {\frac {\sum _{j=1}^{N^{valid}} \left ({P_{h}^{j}-\hat {P}_{h}^{j} }\right)^{2}}{N^{valid}}}\tag{4}\end{equation*}
- Mean Absolute Error (MAE) [kW] (Eq. (5)). This metric calculates the average absolute error between the actual (true) and estimated power productions produced by the built ANN models. Similar to the RMSE metric, small MAE values indicate that the predictions are accurate, and vice versa; \begin{equation*} MAE_{h}=\frac {\sum _{j=1}^{N^{valid}} \left |{ P_{h}^{j}-\hat {P}_{h}^{j} }\right |}{N^{valid}}\tag{5}\end{equation*}
- Weighted Mean Absolute Error (WMAE) (Eq. (6)). This metric computes the average relative error between the actual (true) and estimated power productions produced by the built ANN models. Similar to the previous two metrics, small WMAE values indicate that the predictions are accurate, and vice versa. In practice, this metric is of interest to compare the prediction accuracy when the production capacities are changing. \begin{equation*} WMAE_{h}=\frac {\sum _{j=1}^{N^{valid}} \left |{ P_{h}^{j}-\hat {P}_{h}^{j} }\right |}{\sum _{j=1}^{N^{valid}} P_{h}^{j}}\tag{6}\end{equation*}
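The three validation metrics of Eqs. (4)-(6) can be computed as follows (a Python sketch; the actual and estimated productions in the usage example are hypothetical values in kW):

```python
import numpy as np

def rmse(p, p_hat):   # Eq. (4): root mean square error
    return float(np.sqrt(np.mean((p - p_hat) ** 2)))

def mae(p, p_hat):    # Eq. (5): mean absolute error
    return float(np.mean(np.abs(p - p_hat)))

def wmae(p, p_hat):   # Eq. (6): absolute error relative to total production
    return float(np.sum(np.abs(p - p_hat)) / np.sum(p))

# toy actual vs. estimated hourly productions (kW)
p = np.array([100.0, 200.0, 300.0])
p_hat = np.array([110.0, 190.0, 300.0])
```

Note that WMAE is dimensionless, which is what makes it suitable for comparing plants of different production capacities.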
where RMSE_h, MAE_h, and WMAE_h are the performance metrics computed for each h-th built ANN prediction model, and \bar{P}_h is the average actual production collected at the h-th hour.

A 100-fold CV procedure is employed to robustly evaluate the different ANNs with the various possible configurations. It entails establishing the datasets 100 different times, each with the same fractions of γ = 50%, β = 20%, and α = 30% for the training, validation, and test datasets, respectively. The simulations are then repeated 100 times, and the performance metrics are evaluated 100 times. The ultimate performance metrics are then calculated by computing the average and standard deviation values, and the optimum number of hidden neurons (N_h^opt) is reported for each h-th ANN prediction model. For each h-th hour, the best ANN is selected as the one minimizing the product of the three metrics, i.e., RMSE_h × MAE_h × WMAE_h.

Step 4 (Test the Optimum ANNs):
The optimum ANNs, whose configurations are the best selected among all the possible combinations as reported in Step 3 and obtained on the validation dataset, are evaluated on the unseen test datasets (\mathbf {X}_{ \boldsymbol {h}}^{ \boldsymbol {test}}). The predictability of the optimum ANNs is evaluated using the above-mentioned performance metrics, and their average and standard deviation results calculated over the 100 CV trials are reported.
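The three validation metrics of Eqs. (4)–(6) and the multiplied selection score used in Step 3 can be sketched in NumPy as follows (an illustrative sketch; the function names are ours, not the authors'):

```python
import numpy as np

def rmse(p_true, p_pred):
    """Root Mean Square Error (Eq. (4)) [kW]."""
    return np.sqrt(np.mean((p_true - p_pred) ** 2))

def mae(p_true, p_pred):
    """Mean Absolute Error (Eq. (5)) [kW]."""
    return np.mean(np.abs(p_true - p_pred))

def wmae(p_true, p_pred):
    """Weighted Mean Absolute Error (Eq. (6)): absolute error
    relative to the total actual production."""
    return np.sum(np.abs(p_true - p_pred)) / np.sum(p_true)

def selection_score(p_true, p_pred):
    """Multiplied metric RMSE*MAE*WMAE minimized when selecting
    the best ANN for each h-th hour."""
    return rmse(p_true, p_pred) * mae(p_true, p_pred) * wmae(p_true, p_pred)
```

In a CV trial, the score would be computed on the validation patterns of each candidate configuration, and the configuration with the smallest score retained.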
Sketch of the proposed local training strategy-based ANN for the prediction of solar PV power productions.
2) The Benchmark Global Training Strategy-Based ANN
The benchmark global training strategy-based ANN entails building/training (using a training dataset), optimizing (using a validation dataset), and evaluating (using a test dataset) a single (H=1) ANN prediction model. Specifically:

the training dataset (\mathbf {X}^{ \boldsymbol {train}}). It is formed by N^{train}=15912 patterns (collected from 663 days, each comprising 24 inputs-output patterns, thus establishing 24\times 663=15912 patterns). The \mathbf {X}^{ \boldsymbol {train}} is used for building/training the single (H=1) ANN model;

the validation dataset (\mathbf {X}^{ \boldsymbol {valid}}). It is formed by N^{valid}=6360 patterns (collected from 265 days, each comprising 24 inputs-output patterns, thus establishing 24\times 265=6360 patterns). The \mathbf {X}^{ \boldsymbol {valid}} is used for optimizing the configuration of the single (H=1) ANN model in terms of the number of hidden neurons;

the test dataset (\mathbf {X}^{ \boldsymbol {test}}). It is formed by the N^{test}=9552 remaining patterns (collected from 398 days, each comprising 24 inputs-output patterns, thus establishing 24\times 398=9552 patterns). The \mathbf {X}^{ \boldsymbol {test}} is used to evaluate the performance of the optimum single (H=1) ANN model.
In other words, the benchmark global training strategy aims at exploiting the complete inputs-output patterns collected at the different hour intervals of different days for the development of the single ANN prediction model (thus called “global” training strategy).
For a fair comparison with the proposed approach, the 663, 265, and 398 days considered here in each simulation (CV) trial are the same as those obtained (using the arbitrary fractions \gamma =50%, \beta =20%, and \alpha =30%) in the proposed local training strategy.
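A sketch of one such CV-trial split at the day level (a hypothetical helper, assuming whole days are drawn at random with the fractions \gamma =50% and \beta =20%, the remaining \alpha =30% forming the test set):

```python
import numpy as np

def split_days(n_days=1326, gamma=0.50, beta=0.20, seed=0):
    """Randomly assign whole days to training/validation/test sets.

    gamma and beta are the training and validation fractions; the
    remainder (alpha = 1 - gamma - beta) forms the test set, as in
    each of the 100 CV trials.
    """
    rng = np.random.default_rng(seed)
    days = rng.permutation(n_days)
    n_train = round(n_days * gamma)          # 663 days
    n_valid = round(n_days * beta)           # 265 days
    return (days[:n_train],                  # training days
            days[n_train:n_train + n_valid], # validation days
            days[n_train + n_valid:])        # test days (398 days)
```

Each trial would use a different seed, so that the 100 splits differ while keeping the same fractions.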
Application Results
The application results of the proposed local training strategy-based ANN (Section III.B.1) on the ASU real case study (Section III.A) are here described and compared with the results obtained by the benchmark (Section III.B.2). Further, its effectiveness is verified and compared when Extreme Learning Machines (ELMs) are adopted instead of the ANNs, and when the Persistence model is used.
Because there is no solar radiation during the early morning and late evening hours, there is no need to estimate the power productions at those hours; the reported results are, therefore, shown solely for h\in [{7,21}].
A. Application Results of the Proposed Local Training Strategy-Based ANN
The optimum configurations obtained for each h-th ANN prediction model are reported in Table 3.
It is worth mentioning that one could follow an exhaustive search procedure to select the best-hidden/output neuron activation functions (i.e.,
Notice that:
The optimum number of hidden neurons obtained for each h-th ANN model is, in general, shown to be proportional to the level of variability exhibited by the data collected at the corresponding hour interval, h\in [{7,21}]. For example, at h=7 and h=12 (small and large data variability, respectively, as depicted in Fig. 5), small and large numbers of hidden neurons (N_{h}^{opt}=5 and N_{h}^{opt}=20, respectively) are selected instead of the other candidate numbers, n_{candidate}, to accurately represent the inputs-output relationship;

the prediction performance obtained for each h-th ANN model is, in general, shown to be proportional to the level of variability exhibited by the data collected at the corresponding hour interval, h\in [{7,21}]. For example, at h=7 and h=12 (small and large data variability, respectively), small and large performance metrics are obtained, respectively;
B. Application Results of the Benchmark Global Training Strategy-Based ANN
Similarly, the optimum configuration of the single (H=1) ANN model of the benchmark is selected using the validation dataset.
For clarification purposes, Fig. 8 shows the evolution of the overall multiplied performance metrics (RMSE\ast MAE\ast WMAE) with respect to the candidate numbers of hidden neurons.
The evolution of the overall multiplied performance metrics concerning the candidate numbers of hidden neurons.
C. Comparisons and Discussions
Table 4 reports the average performance metrics obtained by the optimum ANN configurations of the proposed model (as reported in Table 3) accompanied with those produced by the optimum ANN configuration of the benchmark model (as shown in Fig. 8) on the entire day hours (h\in [{7,21}]).
The performances of the two models are also verified using the test “unseen” datasets. In this regard, Table 5 reports the average performance metrics obtained by the proposed model accompanied with those produced by the benchmark model on the entire day hours (h\in [{7,21}]).
To effectively compare the performance metrics from the two prediction models on the test datasets, the Performance Gain (PG) metric is computed (Eq. (7)): \begin{equation*} PG_{Metric}=\frac {Metric^{Benchmark}-Metric^{Proposed}}{Metric^{Benchmark}}\times 100\%\tag{7}\end{equation*}
This gain describes the improvement achieved by the proposed model relative to the performance of the benchmark for each of the computed performance metrics [4], [17]. Typically, positive values of the PG indicate that the proposed model outperforms the benchmark, and vice versa.
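Eq. (7) translates directly into code; for instance (an illustrative helper, with naming of our own choosing):

```python
def performance_gain(metric_benchmark, metric_proposed):
    """Performance Gain (Eq. (7)) in percent.

    Positive values mean the proposed model improves on the
    benchmark for this metric; negative values mean the opposite.
    """
    return (metric_benchmark - metric_proposed) / metric_benchmark * 100.0
```

For example, a benchmark RMSE of 100 kW reduced to 75 kW by the proposed model yields a gain of 25%.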
Also, the computational efforts required by the prediction models for their developments, validations, and evaluations are reported as well in Table 5.
Looking at Table 5, one can easily recognize that the proposed prediction model outperforms the benchmark significantly. Specifically:
The proposed approach enhances the prediction performance by ~25% (for the RMSE), ~30% (for the MAE), and ~22% (for the WMAE);

Additionally, the computational efforts required by the proposed approach are significantly reduced by ~40%, as expected;
Thus, the proposed training strategy boosts the prediction performance compared to that of the benchmark while requiring less computational effort.
Specifically, Fig. 9 shows the evolutions of the average performance metrics (Fig. 9 (left)) and the corresponding standard deviations (Fig. 9 (right)) at each hour h.
The overall hourly performance metrics and the corresponding standard deviations obtained by the proposed (circles) and the benchmark (squares) models over the 100-CV on the test datasets.
Looking at Fig. 9 (left), one can recognize that:
The predictions provided by the two models are comparable. Specifically, the benefit in accuracy of the proposed model with respect to the benchmark (in particular at the early and late days’ hours) for the three performance metrics is justified by the use of solely the hourly data in building/developing the ANNs for the prediction of the corresponding hourly power productions. In contrast, the entire dataset of the whole days’ hours is used to build/develop the single ANN model. For example, at h=12, the performance metrics obtained by the proposed training strategy (i.e., RMSE=19.05 kW, MAE=12.52 kW, and WMAE=0.095) are smaller and, thus, superior to those obtained by the benchmark (i.e., RMSE=20.9 kW, MAE=14.63 kW, and WMAE=0.111).
Whereas looking at Fig. 9(right), one can recognize that:
the variability (standard deviation) in the three performance metrics obtained by the proposed model is smaller than that obtained by the benchmark. Again, this is because the proposed model exploits solely the data collected at each hour h to predict the corresponding power productions, whereas the benchmark utilizes the entire data collected at the whole hours to predict the power productions at each hour h;

the variability gap between the two models is reduced at the middle days’ hours, as expected, due to the large variability in the collected data at these particular hours utilized in the proposed model compared to the early and late days’ hours;
Further insights on the superiority of the proposed model can be seen by looking at Fig. 10 that shows the computed average performance metrics over the 100-fold CV at each season (i.e., different weather conditions) on the test datasets (Fig. 10 (left)), together with the obtained performance gains (Fig. 10 (right)). It can be seen that:
the proposed model (a dark shade of color) provides more satisfactory performance in terms of prediction accuracy of the power productions, i.e., lower metrics, for all seasons, compared to the benchmark (a light shade of color) (Fig. 10(left));
the highest performance gains achieved by the proposed model with respect to the benchmark using the three metrics are obtained in the Summer season, whereas the lowest performance gains are obtained in the Winter season, with almost equal intermediate performance gains obtained in both the Autumn and Spring seasons (Fig. 10 (right)). This indicates the superiority of the proposed model in achieving more accurate predictions for the season with significant power production (Summer) compared to the season with small power production (Winter).
The overall average performance metrics obtained by using proposed (dark shade of color) and the benchmark (light shade of color) training strategies at each season on the test datasets (left), together with the performance gains (right).
For clarification purposes, Fig. 11 shows four examples of the best (Fig. 11 (top)) and worst (Fig. 11 (bottom)) power production predictions obtained by the proposed model (circles) for four different days (one day for each season), compared with the corresponding predictions obtained by the benchmark model (squares), together with the actual productions (solid lines). The predictions provided by the two models are comparable: the benefit in prediction accuracy of the proposed model over the benchmark is justified by the use of solely the hourly data for training the proposed model to predict the corresponding hourly power productions, whereas the complete hourly data are used to train the benchmark model.
Comparison of the power production predictions obtained by using the proposed (circles) and the benchmark (squares) training strategies for some days in the four seasons.
For completeness, Table 6 reports the average performance metrics and the corresponding performance gains obtained by the proposed model to the benchmark of these particular days in the four seasons for one CV trial, i.e., CV = 6. One can recognize the superiority of the proposed model compared to the benchmark for all of the selected days across the four seasons. For example, the most significant enhancement obtained by using the proposed training strategy reaches up to ~58% (
D. Comparisons With Other Prediction Techniques
In this Section, the effectiveness of the proposed local training strategy with respect to the global training strategy when other ML techniques are adopted is investigated (refer to Fig. 6). Mainly, the Extreme Learning Machines (ELMs) are employed as prediction models instead of the ANNs. In addition, the consideration of using the ANNs is justified by comparing the prediction performances obtained by using the ANNs to those obtained by using the ELMs. Further, the prediction performances obtained by using the ANNs and the ELMs are compared to the well-known Persistence prediction model of literature, for completeness.
The ELM, initially developed by [46], is a learning algorithm for single-hidden-layer neural networks. Similar to the ANN model architecture, the ELM comprises an input layer, a hidden layer that consists of a number of hidden neurons, and an output layer. Differently from the ANN, the input-to-hidden weights and biases are assigned at random and never trained, and the hidden-to-output weights are computed analytically through a least-squares (pseudo-inverse) solution.
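A minimal generic sketch of the standard ELM training algorithm (random hidden parameters, pseudo-inverse output solve); the class and parameter names are ours, and the exact configuration used in this work may differ:

```python
import numpy as np

class ELM:
    """Minimal Extreme Learning Machine for regression."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        n_features = X.shape[1]
        # Input-to-hidden weights/biases: drawn randomly, never trained
        self.W = self.rng.standard_normal((n_features, self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        H = np.tanh(X @ self.W + self.b)       # hidden-layer activations
        # Output weights: Moore-Penrose pseudo-inverse (least squares)
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        return np.tanh(X @ self.W + self.b) @ self.beta
```

Because only the output weights are solved for, training is a single linear-algebra step, which is why the ELM trains much faster than a back-propagation ANN.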
The Persistence model [47] for PV power production prediction is an intuitive and straightforward approach commonly used as a benchmark for evaluating the effectiveness of any proposed prediction technique. Basically, it assumes that the PV power production at time h of the next day will be the same as the power production collected at the same time h of the present day.
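Under this assumption, the Persistence forecast reduces to a one-line operation; a sketch (hypothetical helper name, assuming the productions are arranged as one 24-hour profile per day):

```python
import numpy as np

def persistence_forecast(p_hourly):
    """Persistence prediction: tomorrow's production at hour h equals
    today's production at the same hour h.

    p_hourly: array of shape (n_days, 24) of hourly PV power productions.
    Returns the forecasts for days 1..n_days-1, shape (n_days - 1, 24).
    """
    return p_hourly[:-1]   # day d's profile is the forecast for day d+1
```

No parameters are fitted, which is what makes it a natural baseline for the ANN and ELM models.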
Both the proposed (local) and the benchmark (global) training strategies are employed for developing the ELM prediction models.
Table 7 reports the average performance metrics obtained by using the proposed (local) training strategy-based ELM together with those obtained by using the benchmark (global) training strategy-based ELM on the entire day hours (h\in [{7,21}]).
Looking at Table 7, one can easily recognize that:
the utilization of the local training strategy for the development of the ELMs largely enhances the solar PV power production prediction compared to the use of the global training strategy. Specifically, enhancements reach up to ~34%, ~36%, and ~30% for the RMSE, MAE, and WMAE, respectively.
the predictability obtained by using the ELMs is slightly lower than that obtained by the adoption of the ANNs. Future works can be devoted to enhancing the adopted prediction model embedded with the proposed local training strategy of this work.
In addition, Fig. 12 shows the evolutions of the average performance metrics (Fig. 12 (left)) and the corresponding standard deviations (Fig. 12 (right)) at each hour h.
The predictions provided by the three models are comparable. Specifically, the Persistence model seems to slightly/largely outperform the proposed (local) training strategy-based ANN/ELM, respectively, at the early morning hours (i.e., h=7,8,9,10) and late evening hours (i.e., h=18,19,20,21). This can be justified by the fact that at those hours the variability of the weather conditions is small, which makes the intuitive assumption of the Persistence model valid (i.e., the PV power production at time h, h\in [{1,24}], of the next day will be the same as the present PV power production collected at the same time h). However, the performance of the Persistence model starts to notably decrease (the RMSE, MAE, and WMAE start to increase) at the middle days’ hours (i.e., h\in [{11,17}]) with respect to the performances obtained by using both the ANN and the ELM. This is due to the large variability of the weather conditions experienced by the ASU PV plant at those hours;

The proposed (local) training strategy-based ANN allows obtaining more accurate power predictions throughout the whole time hours than the proposed (local) training strategy-based ELM;
The overall hourly performance metrics and the corresponding standard deviations obtained by the proposed training strategy-based ANN (circles), the proposed training strategy-based ELM (squares), and the Persistence (diamonds) models over the 100-CV on the test datasets.
Whereas looking at Fig. 12 (right), one can recognize that:
the variability (standard deviation) in the three performance metrics obtained by the Persistence model is the smallest among the three models. This is due to the intuitive operation of the Persistence model, which considers solely the data collected at each hour h to predict the corresponding power productions;

the variability obtained by the utilization of the ANN is lower than that obtained by the ELM, in particular at the middle days’ hours. This assures the effectiveness of the proposed (local) training strategy-based ANN in providing accurate solar PV power production predictions with small variability (i.e., tight confidence bounds).
Influence of Using Different Hour Intervals for Dataset Partitioning on the Prediction Performance
In this Section, the influence of using different hour intervals, \Delta h, for partitioning the dataset on the prediction performance is investigated.
Specifically, three different hour intervals are considered in this work; they are:
\Delta h=2 hours. This interval entails partitioning the dataset (\mathbf {X}) into H=\frac {24}{\Delta h}=12 different datasets, and thus, H=12 ANNs will be built, optimized, and evaluated. Each dataset represents the timestamps, weather variables, and the corresponding power productions collected at each hour interval h during the Y=3.625 years, h\in [{1,12}];

\Delta h=3 hours. This interval entails partitioning the dataset (\mathbf {X}) into H=\frac {24}{\Delta h}=8 different datasets, and thus, H=8 ANNs will be built, optimized, and evaluated. Each dataset represents the timestamps, weather variables, and the corresponding power productions collected at each hour interval h during the Y=3.625 years, h\in [{1,8}];

\Delta h=4 hours. This interval entails partitioning the dataset (\mathbf {X}) into H=\frac {24}{\Delta h}=6 different datasets, and thus, H=6 ANNs will be built, optimized, and evaluated. Each dataset represents the timestamps, weather variables, and the corresponding power productions collected at each hour interval h during the Y=3.625 years, h\in [{1,6}].
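The partitioning step common to the three intervals can be sketched as follows (a hypothetical helper; hours are 0-based here, whereas the text indexes the intervals from 1):

```python
def partition_hours(delta_h):
    """Group the 24 daily hours into H = 24 / delta_h consecutive intervals.

    Each interval gets its own local ANN; delta_h = 1 recovers the proposed
    hour-by-hour strategy, while delta_h = 24 recovers the global benchmark
    (a single model, H = 1).
    """
    if 24 % delta_h:
        raise ValueError("delta_h must divide 24")
    return [list(range(start, start + delta_h))
            for start in range(0, 24, delta_h)]
```

The dataset rows would then be routed to the H sub-datasets according to which interval their timestamp falls in.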
Once the overall dataset (\mathbf {X}) is partitioned into the H datasets, the same local training, optimization, and evaluation procedure is applied to each of them.
Fig. 13 shows the evolutions of the average performance metrics at each hour h.
The overall hourly performance metrics obtained by using the proposed (circles (
For more clarifications, Fig. 14 shows the average performance metrics obtained by using the proposed training strategy-based ANN with different hour intervals (light shade of color) accompanied with those produced by using the benchmark training strategy-based ANN (dark shade of color) on the entire day hours (h\in [{7,21}]).
The overall performance metrics obtained by using the proposed training strategy-based ANN with different hour intervals (light shade of color) and the benchmark training strategy-based ANN (dark shade of color) over the 100-CV on the test datasets.
The analysis of Fig. 13 and Fig. 14 shows that:
The predictions provided by the proposed (local) and the benchmark (global) training strategies-based ANN are comparable. Specifically, the benefit in accuracy obtained by the proposed training strategy using the different hour intervals to the benchmark (in particular at the early and late days’ hours) for the three performance metrics is still justified by the use of the local hourly data for building/developing the ANNs for the prediction of the corresponding local hourly power productions;
The prediction performance seems to shift towards that of the global training strategy (i.e., the benchmark with H=1) when large hour intervals are used (e.g., \Delta h=4 hours) and, thus, fewer ANN models are built.
For completeness, Fig. 15 shows the computational efforts in minutes required by using the proposed and the benchmark strategies during the training, optimization, and evaluation phases. One can notice that the computational efforts needed by the proposed approach using the three different hour intervals (i.e., \Delta h=2, 3, and 4 hours) remain smaller than those required by the benchmark.
The computational efforts in minutes required by the proposed (using different hour intervals) and the benchmark training strategies-based ANN on the test dataset.
To conclude, the consideration of the hour-by-hour variation (i.e., the 24-hour seasonality patterns of each day) while building/training a data-driven prediction model is shown to be beneficial in enhancing the predictability of the solar PV power productions while reducing the computational efforts required by the adopted model compared to the traditional global training strategy. In practice, these enhancements are valuable in balancing power supplies and demands across centralized grid networks through economic dispatch decisions between the energy sources.
Conclusion and Future Works
In this work, a local training strategy-based Artificial Neural Network (ANN) is proposed to enhance the prediction of solar PV power productions with reduced computational effort. Specifically, the proposed training strategy is local in the sense that solely the timestamps, weather variables, and the corresponding power productions collected at each h-th hour are used to build, optimize, and test the corresponding h-th ANN prediction model.
Future works can be devoted to the application of deep learning techniques, e.g., Long Short Term Memory (LSTM) and/or Echo State Networks (ESNs), instead of the employed ANN to further enhance the prediction performance.
Nomenclature
A. | Abbreviations |
RE | Renewable Energy |
PV | Photovoltaic |
ASU | Applied Science Private University |
NWP | Numerical Weather Prediction |
ML | Machine Learning |
ANNs | Artificial Neural Networks |
BP | Back-Propagation |
RBF-NN | Radial Basis Function based Neural Network |
AR | Auto-Regressive |
LLR | Local Linear Regression |
ARMAX | AR Moving Average with exogenous inputs |
ARIMA | AR Integrated Moving Average |
SVR | Support Vector Regression |
ARX-ST | AR with exogenous input based Spatio-Temporal |
MLP-ABC | Multi-Layer Perceptron-Artificial Bee Colony |
MARS | Multivariate Adaptive Regression Splines |
MLR | Multi-Linear Regression |
CART | Classification and Regression Trees |
RTs | Regression Trees |
ELMs | Extreme Learning Machines |
PSO | Particle Swarm Optimization |
IC | Incremental Conductance |
MPPT | Maximum Power Point Tracking |
DLNNs | Deep-Learning Neural Networks |
LASSO | Least Absolute Shrinkage and Selection Operator |
MLPs | Multilayer Perceptrons |
LM | Levenberg-Marquardt |
BR | Bayesian Regularizations |
PIs | Prediction Intervals |
RMSE | Root Mean Square Error |
nRMSE | Normalized RMSE |
MAPE | Mean Absolute Percentage Error |
MAE | Mean Absolute Error |
WMAE | Weighted MAE |
MSE | Mean Square Error |
SS | Skill Score |
CV | Cross-Validation |
LSTM | Long Short Term Memory |
ESN | Echo State Network |
B. | Notations |
Weather data | |
Number of available years data | |
Number of available days data | |
Number of available inputs-output patterns | |
Global solar radiations | |
Wind speed at 10m | |
Relative humidity at 1m | |
Ambient temperature at 1m | |
Hour number from the beginning of each year data, | |
Day number from the beginning of each year data, | |
Number of hours in a day, | |
Hour instant, | |
Overall inputs-output dataset | |
Overall inputs-output dataset available at | |
Inputs-output training dataset | |
Training dataset available at | |
Inputs-output validation dataset | |
Validation dataset available at | |
Inputs-output test dataset | |
Test dataset available at | |
Number of training inputs-output patterns | |
Number of validation inputs-output patterns | |
Number of test inputs-output patterns | |
Arbitrary fractions used for extracting training, validation, and test datasets, respectively | |
Generic | |
The | |
The | |
Hidden neurons activation function | |
Output neuron activation function | |
Number of hidden neurons | |
Index of hidden neuron, | |
Possible (candidate) number of hidden neurons | |
Hidden and output bias neurons, respectively | |
Hidden and output connection weights, respectively | |
Optimum ANN model obtained at hour | |
Hour interval | |
Performance metric average value calculated over the CV trials at hour | |
Prediction performance gain calculated for a performance metric | |
Prediction performance metric obtained by the Benchmark global training strategy | |
Prediction performance metric obtained by the Proposed local training strategy | |
Prediction performance gain obtained for the performance metric |
ACKNOWLEDGMENT
The authors would like to thank the Renewable Energy Center at the Applied Science Private University for sharing the solar PV data. The authors would also like to thank the reviewers for their valuable comments, which improved the quality of this article.