
A Local Training Strategy-Based Artificial Neural Network for Predicting the Power Production of Solar Photovoltaic Systems




Abstract:

Power production prediction from Renewable Energy (RE) sources has been widely studied in the last decade. This is extremely important for utilities to match electricity supply with consumer demand across centralized grid networks. In this context, we propose a local training strategy-based Artificial Neural Network (ANN) for predicting the power productions of solar Photovoltaic (PV) systems. Specifically, the timestamp, weather variables, and corresponding power productions collected locally at each hour interval h, h = [1,24] (i.e., an interval of Δh = 1 hour), are exploited to build, optimize, and evaluate H = 24 different ANNs for the 24 hourly solar PV production predictions. The proposed local training strategy-based ANN is expected to provide more accurate predictions, with shorter computational times, than those obtained by a single (i.e., H = 1) ANN model (hereafter called benchmark) built, optimized, and evaluated globally on the entire available dataset. The proposed strategy is applied to a case study regarding a 264 kWp solar PV system located in Amman, Jordan, and its effectiveness compared to the benchmark is verified by resorting to different performance metrics from the literature. Further, its effectiveness is verified and compared when Extreme Learning Machines (ELMs) are adopted instead of the ANNs, and when the Persistence model is used. The prediction performance of the two training strategies-based ANN is also investigated and compared in terms of i) different weather conditions (i.e., seasons) experienced by the solar PV system under study and ii) different hour intervals (i.e., Δh = 2, 3, and 4 hours) used for partitioning the overall dataset and, thus, establishing the different ANNs (i.e., H = 12, 8, and 6 models, respectively).
Sketch of the proposed local training strategy-based ANN for the prediction of solar PV power productions.
Published in: IEEE Access ( Volume: 8)
Page(s): 150262 - 150281
Date of Publication: 12 August 2020
Electronic ISSN: 2169-3536

SECTION I.

Introduction

A. Background

Sustainable energy sources have attracted keen interest around the world for several important reasons, such as the depletion of fossil fuels, rising fuel prices, industrial pollution, the energy crisis, and growing ecological concerns [1]–[3]. Renewable Energy (RE) generation has been strongly encouraged and supported by government policies and technological advancements [3]. In 2018, the share of RE (181 gigawatts) in cumulative production capacity worldwide increased rapidly, contributing more than 50% of the annual average power production capacity added in that year [4]. Besides, the energy produced from solar irradiation by Photovoltaic (PV) systems is considered among the most promising and safest energy supplies [5]. PV energy sources are among the most extensively accessible and highly attractive RE sources owing to their significant potential for energy production [6]–[9].

The main difficulties in PV systems are the complexity, parasitic capacitance, harmonic distortion, and sophistication of the current-voltage and power-voltage characteristic equations [10]. The relationship between PV current and voltage is both implicit and complex, depending on several variables, among them the ambient temperature, solar irradiation, wind speed, and dust accumulation [11], [12]. On hot days, the cell module temperature can quickly reach 70°C, at which the power output can drop significantly below nominal values [13]. The production of PV systems mainly depends on the amount of global solar irradiation received by the modules; any change in the power implies that the solar irradiation changed during the day or was affected by shading. In addition, wind speed can be a significant factor in dust and dirt accumulation and soiling of the PV system [14]. Such phenomena prevent the effective absorption of solar irradiance by the PV cells and significantly reduce the overall PV power generation. This reduction in power can reach 50% in arid and semiarid regions, where the solar irradiation is usually high [15]. Thus, the production from such energy sources depends on intermittent (stochastic) weather variables, which calls for prediction (forecasting) models and tools capable of accurately estimating the PV power productions by accommodating the inherent stochasticity of the weather variables [16]–[18].

In this context, the prediction of PV power generation would make a significant contribution to the management and maintenance of modern energy systems, such as the connection to microgrids [19]. Prediction plays a critical role in managing the efficiency of the power system [7], [8], [20].

B. Literature Review and Motivation

Numerous methods for predicting PV production have been published in the literature. Nevertheless, an effective method is still needed to enhance PV prediction performance and decrease the adverse effects of system instability. Prediction methods are generally classified into model-based and data-driven [16], [21]–[23].

Model-based methods rely on analytical equations that describe the PV power production process. The equations typically use weather conditions to predict power output [23]. Usually, such methods do not require historical data, but they strongly depend on comprehensive station location details and reliable meteorological data. They can be simple formulations focused on solar irradiation, or more complicated ones if additional weather variables, like ambient temperature, wind speed, and dust, are used. Thus, the effectiveness of their forecasts heavily depends on the precision of the Numerical Weather Prediction (NWP) data. Although such methods can increase prediction accuracy, the uncertainty resulting from the approximations and/or assumptions in the adopted models could limit their realistic implementation [18].

Contrarily, data-driven methods (developed using various Machine Learning (ML) techniques) depend solely on the availability of historical pairs of weather variables and the associated solar power productions. They aim to build so-called black-box models that capture the hidden mathematical relationship between the weather variables and the associated PV power productions [4], [16]–[18].

For example, Ding et al. [24] proposed an improved version of the Back-Propagation (BP) learning algorithm-based Artificial Neural Network (ANN) to predict the power output of a PV system under different environmental conditions. The improved BP algorithm was shown to be superior to the traditional BP algorithm in enhancing the accuracy of the power output prediction.

Zeng and Qiao [25] designed a Radial Basis Function-based Neural Network (RBF-NN) for short-term solar PV power prediction using past values of meteorological data (e.g., sky cover, transmissivity). Results showed that the RBF-NN outperforms the linear autoregressive (AR) and the Local Linear Regression (LLR) models. The authors concluded that the use of transmissivity and other extra meteorological data, particularly the sky cover, could markedly improve the accuracy of the power prediction.

Li et al. [26] predicted the PV output power using the Auto-Regressive Moving Average with exogenous inputs (ARMAX) and the Auto-Regressive Integrated Moving Average (ARIMA). The two models used as exogenous inputs the ambient temperature, insolation duration, precipitation amount, and relative humidity to predict the power output of a 2.1 kW grid-connected PV system. Results revealed that the ARMAX model significantly enhances the predictability of the power output over the ARIMA model.

De Leone et al. [27] used Support Vector Regression (SVR) to predict the energy production of a PV plant located in Italy. The method used past meteorological data (e.g., solar radiation, ambient temperature) and power outputs to predict future power outputs. The obtained results revealed that the quality of the predicted power output depends heavily on the accuracy of the meteorological data.

Yang et al. [28] predicted the PV power in the short term using an Auto-Regressive with exogenous input based Spatio-Temporal (ARX-ST) model. The results were evaluated against the conventional Persistence model. The authors noted that the ARX-ST model can be extended with more meteorological data to help boost the prediction precision.

Khademi et al. [29] proposed a Multi-Layer Perceptron equipped with an Artificial Bee Colony (MLP-ABC) algorithm to predict the power output of a 3.2 kW PV plant. The collected data were separated into sunny and cloudy days and used to develop the MLP-ABC prediction model. The findings were compared to those of an MLP-ABC model in which both sunny and cloudy days were used together to establish the prediction model. It was concluded that the separation of different weather conditions enhanced the accuracy of the PV power output predictions.

Li et al. [30] used the Multivariate Adaptive Regression Splines (MARS) model for daily power output prediction of a grid-connected 2.1 kW PV system. This model maintains the flexibility of the traditional Multi-Linear Regression (MLR) paradigm and is thus able to handle non-linearity. The results obtained using the MARS model were compared with linear models, such as MLR, ARIMA, and ARMAX, as well as some non-linear models, such as SVR, K-Nearest Neighbors (K-NN), and Classification and Regression Trees (CART). Results showed that, on average, non-linear models tend to provide higher performance than linear models. The authors concluded that no model could do consistently better than the others at both the training and prediction levels.

Muhammad Ehsan et al. [31] implemented an MLP-based ANN model for 1-day ahead power output prediction of a 20 kWp grid-connected solar plant situated in India. The authors examined different combinations of hidden layers, hidden neuron activation functions, and learning algorithms for reliable 1-day ahead power predictions. They concluded that an ANN characterized by a single hidden layer, a Linear Sigmoid Axon neuron activation function, and the Conjugate Gradient learning algorithm was able to deliver reliable power output predictions.

Theocharides et al. [32] examined the performance of three different ML methods, namely ANNs, SVR, and Regression Trees (RTs), with different hyper-parameters and sets of features, in predicting the power production of PV systems. Their performance was compared to that of the Persistence model through the computation of the Mean Absolute Percentage Error (MAPE) and the normalized Root Mean Square Error (nRMSE). The obtained enhancements were then evaluated using the Skill Score (SS). It was found that the ANNs outperform the other prediction models from the literature.

Alomari et al. [33] proposed an ANN model for PV power production prediction. The proposed model investigated the strengths of two different learning algorithms (i.e., Levenberg-Marquardt (LM) and Bayesian Regularization (BR)) by utilizing different variations of the ANN model’s inputs. The conclusions drawn revealed that the BR-based ANN provides more accurate predictions than those obtained by the LM-based ANN (i.e., RMSE = 0.0706 and 0.0753, respectively).

Al-Dahidi et al. [16] investigated the capability of the Extreme Learning Machine (ELM) in predicting the PV power output. The obtained results revealed that the ELM provides better generalization capability with negligible computational times compared to the traditional BP-ANN.

Later, Al-Dahidi et al. [18] suggested a comprehensive ANN-based ensemble solution for enhancing the 24h-ahead solar PV power output predictions. The authors also used the bootstrap technique to quantify the sources of uncertainty that influence the model predictions, in the form of Prediction Intervals (PIs). The efficacy of the recommended ensemble solution was illustrated on a real case study of a solar PV system (264 kWp capacity) located in Amman, Jordan. The suggested method was shown to be advantageous over various benchmarks in providing more accurate power predictions and in accurately quantifying multiple sources of uncertainty.

Behera et al. [34] proposed a prediction technique based on a combination of the ELM, Incremental Conductance (IC), and Maximum Power Point Tracking (MPPT) techniques. The obtained results revealed that the ELM provides better performance compared to the standard BP-ANN and that its performance can be further enhanced using the Particle Swarm Optimization (PSO) technique.

Huang and Kuo [35] proposed a high-precision PVPNet model based on Deep-Learning Neural Networks (DLNNs) for 1-day ahead power output prediction. The prediction results obtained by the proposed PVPNet model were evaluated (in terms of RMSE and MAE) and compared to other ML techniques from the literature. The authors concluded that the proposed PVPNet model has an excellent generalization capability and can boost the prediction performance, while reducing monitoring expenses, initial costs of hardware components, and long-term maintenance costs of future PV plants.

Catalina et al. [36] proposed two linear ML models (i.e., Least Absolute Shrinkage and Selection Operator (LASSO) and linear SVR), and two non-linear ML models (i.e., MLPs and Gaussian SVRs) with satellite-measured radiances and clear sky irradiance as inputs to nowcast the PV energy outputs over peninsular Spain. Results revealed that the two non-linear ML models were better than the two linear ML models.

From the above research works, it is apparent that the efforts were mainly dedicated to enhancing the employed data-driven prediction model or investigating other advanced models from the literature. Differently, this work aims to propose a local training strategy applicable to any data-driven prediction model for ultimately boosting the prediction accuracy of the solar PV power outputs, while reducing the computational times. Specifically, the hour-by-hour variability (i.e., the 24-hour seasonality patterns of each day) arising in the solar data, both the weather variables and the corresponding power productions, has never been explored while developing the prediction models. The consideration of such seasonality while developing the prediction models is expected to be beneficial in enhancing the prediction accuracy while reducing the computational times.

C. Contributions

The proposed training strategy requires splitting the available inputs-output patterns collected from the actual operation of a PV system, based on an hour interval of Δh = 1 hour, into H = 24 datasets, where each dataset represents the data collected at hour interval h, h ∈ [1,24]. The established datasets are then used to build H = 24 feedforward ANN models. The selection of ANNs is driven by the fact that they are simple, easy to understand and implement, and capable of solving non-linear interpolation problems [37], [38].

Each built ANN is initially optimized on a validation dataset in terms of the number of hidden neurons, to further enhance the prediction accuracy, and then utilized online to estimate the corresponding hourly production of a day on a test “unseen” dataset.

The effectiveness of the proposed training strategy-based ANN is examined on a grid-connected solar PV system (264 kWp capacity) located at the Applied Science Private University (ASU), Amman, Jordan [4], [16], [18], [39]. Specifically, the accuracy of the power production predictions is verified by resorting to three performance metrics from the literature [4], i.e., the RMSE, the MAE, and the Weighted MAE (WMAE), whereas the computational effort required to develop and evaluate the built ANNs is verified by the computational time in minutes.

For comparison and validation, a single prediction model (for a fair comparison, an ANN model is considered) developed and optimized, in terms of the number of hidden neurons, globally on the entire dataset is used as a benchmark to verify the effectiveness of the proposed strategy on the ASU solar PV system. Moreover, the ELMs are used instead of the ANNs, and the Persistence prediction model is further adopted to verify the superiority of the proposed training strategy-based ANN.

Therefore, the significant contributions of the present work are two-fold:

  • The development of a local training strategy-based ANN for an accurate estimation of the solar PV power productions with short computational times;

  • The comparison of the obtained results to those of the global and local training strategies-based ANN and ELM, respectively, as well as to the Persistence prediction model from the literature, to further explore the effectiveness of the proposed local training strategy.

The remainder of this article is organized as follows. In Section II, the work objectives are illustrated, and the problem of predicting the solar PV power output is stated. In Section III, the ASU solar PV system case study is described, and the proposed local training strategy-based ANN is illustrated, also providing the essential background of ANNs. In Section IV, the application of the proposed training strategy-based ANN to the ASU case study is shown, and the obtained results are discussed and compared with those obtained by the global and local training strategies-based ANN and ELM, respectively, as well as with the Persistence model from the literature. Section V investigates the influence of using different hour intervals on the prediction performance. Lastly, conclusions are drawn, and future works are recommended in Section VI.

SECTION II.

Work Objectives

This work aims to develop a data-driven model for an accurate estimation of the power productions of a solar Photovoltaic (PV) system with convenient computational times. We consider the availability of historical weather data (W) and corresponding power production data (P) of the PV system collected during Y years (or D days) (see Fig. 1). The weather data (W) consist of four weather variables collected at hour h, h ∈ [1,24], of each day during the period Y; they are:

  • wind speed (S_h);

  • relative humidity (RH_h);

  • ambient temperature (T_amb_h); and

  • global solar irradiation (I_rr_h).

FIGURE 1. Modelling architecture.

Together with the timestamp and the corresponding power data (P_h), one can establish an overall inputs-output dataset from the previous data vectors:
$$\mathbf{X}=\left[\,\overrightarrow{hr}\;\overrightarrow{d}\;\overrightarrow{S}\;\overrightarrow{RH}\;\overrightarrow{T}_{amb}\;\overrightarrow{I}_{rr}\;\middle|\;\overrightarrow{P}\,\right]\tag{1}$$

The timestamp is here represented by the chronological hour (hr) and day number (d) from the beginning of each year's data during the period Y: hr = 1,…,8760 and d = 1,…,365.
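As an illustration, a minimal Python sketch of how the overall matrix X of Eq. (1) could be assembled; the CSV file and column names are hypothetical placeholders for the actual data source:

```python
import pandas as pd

# Hypothetical hourly log carrying the timestamp, the four weather variables,
# and the measured power production P.
df = pd.read_csv("asu_pv_hourly.csv", parse_dates=["timestamp"])

# Timestamp features: chronological hour (1..8760) and day (1..365) of the year.
df["hr"] = (df["timestamp"].dt.dayofyear - 1) * 24 + df["timestamp"].dt.hour + 1
df["d"] = df["timestamp"].dt.dayofyear

# Overall inputs-output matrix X = [hr d S RH T_amb I_rr | P], as in Eq. (1).
X = df[["hr", "d", "S", "RH", "T_amb", "I_rr", "P"]]
```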

We aim to effectively exploit the pre-processed dataset during the development/training stage of a data-driven prediction model for providing accurate predictions of the PV power production, with convenient computational times. To this aim, the available dataset is divided into H = 24 datasets. Each dataset is composed of the timestamp and the corresponding weather variables and power productions collected locally at each h-th hour of a day, h ∈ [1,24]. The datasets are used to develop and optimize H = 24 data-driven prediction models. Feedforward Artificial Neural Networks (ANNs) are employed as prediction models due to their simplicity and the convenient computational efforts required [40]. Still, the proposed training strategy is general and could be applied to any data-driven ML technique from the literature (e.g., ELMs, SVMs, etc.). Finally, the built prediction models are individually used to predict the h-th power production of the solar PV system. Thus, our contribution entails proposing an intuitive way of handling the available dataset to accurately build/develop a data-driven prediction model, such as the ANN in this work.

The proposed local training strategy is expected to provide more accurate power production predictions with shorter computational efforts, compared to the traditional global training strategy, in which the available dataset is used, entirely, to develop and optimize a single (H = 1) ANN prediction model. Additionally, further comparisons and analyses are carried out to explore the effectiveness of the proposed training strategy-based ANN.

SECTION III.

Material and Methodology

In this Section, the real data of a solar PV system used in this work, together with the proposed (local) and benchmark (global) training strategies-based ANN, are presented in Section III.A and Section III.B, respectively.

A. Material

This Section presents the real case study of a solar PV system with a capacity of 264 kWp mounted on the rooftop of the Faculty of Engineering (Al-Khawarizmi Building) at the Applied Science Private University (ASU) in Shafa Badran, Amman, Jordan (Latitude = 32.042044 and Longitude = 35.900232) (Fig. 2).

FIGURE 2. ASU PV system map (retrieved and adapted from Google Maps [41]).

The dataset utilized in this work comprises real weather data, W (i.e., the inputs), measured by a weather station located around 172 m away from the Al-Khawarizmi Building (Fig. 2), and the corresponding PV power productions, P (in kW) (i.e., the output), measured by the inverters of the PV system [39]. This dataset has been collected over Y = 3.625 years (from 16th May, 2015 to 31st December, 2018) with a time step Δt = 1 hour, from 12 a.m. to 11 p.m. daily, i.e., D = 1326 days with N = 31824 inputs-output patterns.

1) ASU Solar PV System

The ASU PV system comprises 14 SMA Sunny Tripower inverters (13 inverters with a power of 17 kW each and 1 inverter with a power of 10 kW) attached to Yingli Solar panels (of type YL 245P-29b-PC) tilted by 11° and oriented 36° (from S to E). This orientation is chosen to collect as much solar radiation as possible during the day (as depicted in Fig. 3) [39].

FIGURE 3. ASU 264 kWp PV panels installed at the rooftop of the Faculty of Engineering (left) and the connected inverters (right).

The design characteristics of the ASU PV system are reported in Table 1.

TABLE 1. The Design Characteristics of the ASU PV System

2) ASU Weather Station

The ASU weather station (depicted in Fig. 4) is 36 m high and equipped with the latest instruments used to measure 45 different weather variables, such as global solar irradiation, relative humidity, precipitation amount, wind speed and direction, barometric pressure, and ambient temperature, collected at various levels above the ground.

FIGURE 4. ASU weather station (top) and the installed instruments (wind speed (bottom left), hygro-thermo (bottom middle), and pyranometer (bottom right)).

Among the 45 weather variables, engineering and professional opinion suggested using those most highly correlated with the PV power productions as inputs to the prediction model. Those variables, together with the instruments installed for their measurement and their detailed characteristics, are [42]:

  • the wind speed at 10 m (S) (in m/s). The wind speed transmitter is used for measuring the horizontal component of the wind speed with high accuracy. The transmitter is equipped with electronically regulated heating to ensure smooth running of the ball bearings during winter operation and to prevent the shaft and slot from icing up. The technical specifications of this instrument are reported in Table 2;

  • the relative humidity at 1 m (RH) (in %). The hygro-thermo transmitter with a capacitive sensing element is used for measuring the relative humidity with high accuracy. The transmitter is equipped with a weather and thermal radiation shield to protect the humidity sensor against radiation, precipitation, and mechanical damage. The technical specifications of this instrument are reported in Table 2;

  • the ambient temperature at 1 m (T_amb) (in °C). The hygro-thermo transmitter with a Resistance Temperature Detector (RTD) is used for measuring the temperature of the ambient air with high accuracy. The transmitter is equipped with a weather and thermal radiation shield to protect the temperature sensor against radiation, precipitation, and mechanical damage. The technical specifications of this instrument are reported in Table 2; and

  • the global solar irradiation (I_rr) (in W/m²). The pyranometers are used for measuring the global (total) irradiation on a plane surface with high accuracy. The technical specifications of this instrument are reported in Table 2.

TABLE 2. The Technical Specifications of the Instruments Used to Measure the Considered Weather Variables

Additionally, the corresponding timestamps (i.e., the number of hours and days from the beginning of each year's data) are also considered as inputs [4], [16], [18], [33], [43]. The remaining variables have been excluded from the analysis.

3) Data Pre-Processing

For proper utilization of the dataset, it has been pre-processed following the guidelines reported in [4], [16], [18], [33], [43] on the same case study. In particular (a code sketch of these steps follows the list):

  • missing values have been excluded from the analysis;

  • negative Irr values and corresponding missing P values (recognized in the early morning (i.e., 12 a.m.–6 a.m.) and late evening (i.e., 6 p.m.–11 p.m.)) have been set to zero; and

  • the overall data have been normalized between 0 and 1.
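For illustration, a minimal Python sketch of these pre-processing steps, assuming a numeric dataframe holding the columns of the matrix X (hypothetical names as before):

```python
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    # Negative I_rr values (early morning / late evening) and the
    # corresponding missing P values are set to zero.
    night = df["I_rr"] < 0
    df.loc[night, "I_rr"] = 0.0
    df.loc[night & df["P"].isna(), "P"] = 0.0
    # Remaining missing values are excluded from the analysis.
    df = df.dropna()
    # Min-max normalization of every column to [0, 1].
    return (df - df.min()) / (df.max() - df.min())
```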

Fig. 5 shows the pre-processed hourly Irr values (left-top), T_amb values at 1 m (right-top), RH values (left-middle), S values at 10 m (right-middle), and the corresponding solar P values (bottom) for a few days selected throughout the 2017-year data. From Fig. 5, one can notice:

  1. the different variability of the reported parameters' values collected at each h-th hour interval of 1 hour, h ∈ [1,24], of the days. This observation justifies the motivation for using solely the values collected at the h-th hour in developing H = 24 different ANN prediction models, each dedicated to estimating the corresponding h-th power production value;

  2. the extensive and random variability of the reported parameters' values collected over all hours of the days. It is essential to mention here that an ANN prediction model built on the whole dataset is typically expected to estimate the PV power productions at each h-th hour interval, h ∈ [1,24], less accurately compared to 1).

FIGURE 5. Hourly weather variables (top and middle) and the corresponding PV solar power productions (bottom) of a few days selected throughout the 2017-year data.

The pre-processed Y = 3.625 years' inputs-output patterns are appended in an overall matrix X. This matrix will be used later on for the development of the ANN and ELM prediction models using both the proposed (local) and the benchmark (global) training strategies, as well as for the implementation of the Persistence prediction model.

B. Methodology

In this Section, the proposed local and the benchmark global training strategies-based ANN, implemented with an in-house MATLAB code, are presented in Section III.B.1 and Section III.B.2, respectively.

1) The Proposed Local Training Strategy-Based ANN

The proposed training strategy-based ANN is sketched in Fig. 6, and it goes along the following four steps:

  • Step 1 (Establishing H = 24 Different Datasets):

    This step entails partitioning the overall available pre-processed dataset (X) into H = 24 different datasets (X_h) of equal size. Each dataset represents the timestamp (hr_h, d_h), weather variables (S_h, RH_h, T_amb_h, I_rr_h), and the corresponding power productions (P_h) collected at each hour interval h (i.e., h ∈ [1,24/Δh], using an hour interval Δh = 1 hour) during the period Y = 3.625 years.

    Thus, the H = 24 different datasets can be written as follows:
    $$\mathbf{X}_{h}=\left[\,\overrightarrow{hr}_{h}\;\overrightarrow{d}_{h}\;\overrightarrow{S}_{h}\;\overrightarrow{RH}_{h}\;\overrightarrow{T}_{amb_{h}}\;\overrightarrow{I}_{rr_{h}}\;\middle|\;\overrightarrow{P}_{h}\,\right],\quad h\in[1,24].\tag{2}$$

    This partitioning is motivated by the fact that the inputs-output patterns collected at different hours and on different days vary considerably. Thus, utilizing solely the data collected at the h-th hour, for the estimation of the corresponding h-th power production, in the development of the ANN prediction models is expected to produce more accurate power production predictions. Additionally, this partitioning will, indeed, reduce the computational effort required by the ANN prediction models to capture the hidden "unknown" relationship between the inputs and the output power.
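A possible sketch of this hourly partitioning (Step 1), reusing the hypothetical pre-processed dataframe df from the earlier sketches; an "hour" column with values 1-24 is assumed to be available:

```python
# Step 1: partition the pre-processed dataset into H = 24 hourly datasets X_h.
H = 24
X_hourly = {h: df[df["hour"] == h].drop(columns="hour") for h in range(1, H + 1)}
```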

  • Step 2 (Train the H = 24 ANN Prediction Models With Different Configurations):

    Once the H = 24 different datasets are established, training (X_h^train), validation (X_h^valid), and test (X_h^test) datasets are extracted randomly from the established datasets X_h, h ∈ [1,24], with arbitrary fractions of γ = 50%, β = 20%, and α = 30%, respectively. Such datasets are used to train, validate (Step 3), and test (Step 4) the H = 24 different feedforward ANNs, respectively, whose detailed characteristics are hereafter described. Specifically:

    • the training datasets (X_h^train, h ∈ [1,24]), each formed by N^train = 663 patterns (i.e., 663 days), are used for building/training the H = 24 ANN prediction models;

    • the validation datasets (X_h^valid, h ∈ [1,24]), each formed by N^valid = 265 patterns (i.e., 265 days), are used for optimizing the configurations of the H = 24 ANN prediction models;

    • the test datasets (X_h^test, h ∈ [1,24]), each formed by N^test = 398 patterns (i.e., 398 days), are used to evaluate the performance of the optimum ANN prediction models. It is, indeed, important to mention that the N^test test patterns have never been used during the building/training and the optimization of the ANNs.

    In other words, the datasets X_h, h ∈ [1,24], obtained in Step 1 are divided into three disjoint datasets, namely, training (X_h^train), validation (X_h^valid), and test (X_h^test), by sampling their patterns randomly with arbitrary fractions of γ = 50%, β = 20%, and α = 30%, respectively. The motivation for using such fractions is to assure that the annual seasonality appearing in the datasets is sufficiently captured while developing the ANN prediction models.

    To further assure this while sampling randomly, a Cross-Validation (CV) procedure is employed, as we shall see in Step 3. Thus, other arbitrary fractions could be considered, and the conclusions obtained would be, indeed, the same.
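A sketch of this random 50/20/30 extraction, under the same assumptions (one call corresponds to one CV trial; the seed varies across trials):

```python
import numpy as np

def split_dataset(X_h, gamma=0.50, beta=0.20, seed=0):
    """Randomly split one hourly dataset into train/validation/test parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X_h))
    n_train = int(gamma * len(X_h))
    n_valid = int(beta * len(X_h))
    train = X_h.iloc[idx[:n_train]]
    valid = X_h.iloc[idx[n_train:n_train + n_valid]]
    test = X_h.iloc[idx[n_train + n_valid:]]  # remaining alpha = 30%
    return train, valid, test
```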

    Each h-th ANN prediction model (ANN_h, h ∈ [1,24]) comprises three main layers, as shown in Fig. 7:

    • Input layer. It receives the j-th pattern, x_h^j, which comprises the six h-th inputs (i.e., [hr_h^j, d_h^j, S_h^j, RH_h^j, T_amb_h^j, I_rr_h^j]), j = 1,…,N^train;

    • Hidden layer. It comprises N_h hidden neurons that process the received inputs via a hidden neuron activation function, f_1(), and send the processed information to the output layer. In practice, the hidden neuron activation function is a continuous non-polynomial function (e.g., "Log-Sigmoid", "Linear", "Radial Basis", etc.) established to capture the hidden non-linear "unknown" mathematical relationship between the inputs and the outputs;

    • Output layer. It provides an estimate of the corresponding h-th power production (P̂_h^j) via an output neuron activation function, f_2(), which is typically a linear transfer function ("Purelin") [4], [44]. The estimated h-th power production (P̂_h^j) of the j-th input pattern, x_h^j, can be written as follows:
    $$\hat{P}_{h}^{j}=f_{2}\left(\sum_{n=1}^{N_{h}}\overrightarrow{\beta}_{n}\,f_{1}\left(\overrightarrow{w}_{n}x_{h}^{j}+b_{n}\right)+b_{o}\right),\quad h\in[1,24],\quad j=1,\ldots,N^{train}\tag{3}$$

    where n and j are the indexes of the hidden neurons (n = 1,…,N_h) and of the available training patterns (j = 1,…,N^train), respectively; b_n and b_o are the weights (biases) of the connections established between the bias neurons and each n-th hidden neuron and the output neuron, respectively; w_n and β_n are the weight vectors of the connections established between the input neurons and each n-th hidden neuron, and between each n-th hidden neuron and the output neuron, respectively.

    To adequately define the ANN configurations, different candidate numbers of hidden neurons (n_candidate) are explored; n_candidate spans the interval [5, 30] with a step size of 5. For each possible configuration, the h-th power production (whose actual value is P_h^j) is estimated (P̂_h^j), and the Levenberg-Marquardt (LM) error BP learning algorithm is adopted to minimize the mismatch (typically computed as the Mean Square Error (MSE)) between the actual and estimated power productions, exploring different random initializations of the ANN internal parameters (i.e., β_n, w_n, b_n, b_o). The built ANNs (ANN′_h, h ∈ [1,24]) are those whose internal parameters are optimally selected to minimize the MSE on the N^train training inputs-output patterns.
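As an illustrative sketch of this training step in Python with scikit-learn (note that the library offers neither the Levenberg-Marquardt optimizer nor the radial-basis hidden activation used in this work, so "tanh" and the default solver stand in; all names are hypothetical):

```python
from sklearn.neural_network import MLPRegressor

def train_hourly_anns(hourly_train, n_candidates=(5, 10, 15, 20, 25, 30)):
    """Train one ANN per hour h and per candidate hidden-layer size.

    hourly_train: dict {h: (X, y)} with the training inputs/targets of hour h.
    """
    models = {}
    for h, (X, y) in hourly_train.items():
        models[h] = {
            n: MLPRegressor(hidden_layer_sizes=(n,), activation="tanh",
                            max_iter=2000, random_state=0).fit(X, y)
            for n in n_candidates
        }
    return models
```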

  • Step 3 (Validate the Built ANNs With Different Configurations):

    Once the H = 24 different ANNs are built with the different possible configurations, the best model for each h-th hour (ANN_h^opt, h ∈ [1,24]) is selected by evaluating the prediction performances on the validation dataset (X_h^valid). To this aim, three standard performance metrics from the literature are computed [4], [9], [16], [18]:

    • Root Mean Square Error (RMSE) [kW] (Eq. (4)). This metric describes the difference between the actual (true) and estimated power productions produced by the built ANN models. A small RMSE indicates that the predictions are accurate, and vice versa;
    $$RMSE_{h}=\sqrt{\frac{\sum_{j=1}^{N^{valid}}\left(P_{h}^{j}-\hat{P}_{h}^{j}\right)^{2}}{N^{valid}}}\tag{4}$$

    • Mean Absolute Error (MAE) [kW] (Eq. (5)). This metric calculates the average error between the actual (true) and estimated power productions produced by the built ANN models. Similar to the RMSE metric, small MAE values indicate that the predictions are accurate, and vice versa;
    $$MAE_{h}=\frac{\sum_{j=1}^{N^{valid}}\left|P_{h}^{j}-\hat{P}_{h}^{j}\right|}{N^{valid}}\tag{5}$$

    • Weighted Mean Absolute Error (WMAE) (Eq. (6)). This metric computes the average relative error between the actual (true) and estimated power productions produced by the built ANN models. Similar to the previous two metrics, small WMAE values indicate that the predictions are accurate, and vice versa. In practice, this metric is of interest for comparing the prediction accuracy when the production capacities change;
    $$WMAE_{h}=\frac{\sum_{j=1}^{N^{valid}}\left|P_{h}^{j}-\hat{P}_{h}^{j}\right|}{\sum_{j=1}^{N^{valid}}P_{h}^{j}}\tag{6}$$

    where RMSE_h, MAE_h, and WMAE_h are the performance metrics computed for each h-th built ANN prediction model (the denominator of Eq. (6) equals N^valid times the average actual production, P̄_h, collected at the h-th hour).

    A 100-fold CV procedure is employed to robustly evaluate the different ANNs with the various possible configurations. It entails establishing the datasets 100 different times, each with the same fractions of γ = 50%, β = 20%, and α = 30% for the training, validation, and test datasets, respectively. The simulations are then repeated 100 times, and the performance metrics are evaluated 100 times. The ultimate performance metrics are then calculated by computing the average and standard deviation values, and the optimum number of hidden neurons (N_h^opt) is reported for each h-th ANN prediction model. For each h-th hour, the best ANN is selected as the one minimizing the product of the three metrics, i.e., RMSE_h × MAE_h × WMAE_h.
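For illustration, a sketch of the three validation metrics (Eqs. (4)-(6)) and of the product criterion used to pick N_h^opt; function and variable names are illustrative:

```python
import numpy as np

def rmse(p, p_hat):
    return np.sqrt(np.mean((p - p_hat) ** 2))      # Eq. (4)

def mae(p, p_hat):
    return np.mean(np.abs(p - p_hat))              # Eq. (5)

def wmae(p, p_hat):
    return np.sum(np.abs(p - p_hat)) / np.sum(p)   # Eq. (6)

def select_best(models_h, X_valid, y_valid):
    """Pick the configuration minimizing RMSE * MAE * WMAE on validation data."""
    scores = {}
    for n, model in models_h.items():
        y_hat = model.predict(X_valid)
        scores[n] = rmse(y_valid, y_hat) * mae(y_valid, y_hat) * wmae(y_valid, y_hat)
    n_opt = min(scores, key=scores.get)
    return n_opt, models_h[n_opt]
```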

  • Step 4 (Test the Optimum ANNs):

    The optimum ANNs, whose configurations are the best selected among all the possible combinations on the validation datasets, as reported in Step 3, are evaluated on the unseen test datasets (X_h^test). The predictability of the optimum ANNs is evaluated using the above-mentioned performance metrics, and their average and standard deviation results calculated over the 100 CV trials are reported.

FIGURE 6. Sketch of the proposed local training strategy-based ANN for the prediction of solar PV power productions.

FIGURE 7. The basic architecture of the ANN model.

2) The Benchmark Global Training Strategy-Based ANN

The benchmark global training strategy-based ANN entails building/training (using a training dataset), optimizing (using a validation dataset), and evaluating (using a test dataset) a single (H = 1) ANN prediction model. To this aim, the overall available pre-processed dataset X is divided into the following datasets:

  • the training dataset (X^train). It is formed by N^train = 15912 patterns (collected from 663 days, each comprising 24 inputs-output patterns, thus establishing 24 × 663 = 15912 patterns). X^train is used for building/training the single (H = 1) ANN model;

  • the validation dataset (X^valid). It is formed by N^valid = 6360 patterns (collected from 265 days, each comprising 24 inputs-output patterns, thus establishing 24 × 265 = 6360 patterns). X^valid is used for optimizing the configuration of the single (H = 1) ANN model in terms of the number of hidden neurons;

  • the test dataset (X^test). It is formed by the N^test = 9552 remaining patterns (collected from 398 days, each comprising 24 inputs-output patterns, thus establishing 24 × 398 = 9552 patterns). X^test is used to evaluate the performance of the optimum single (H = 1) ANN model.

In other words, the benchmark global training strategy aims at exploiting the complete inputs-output patterns collected at the different hour intervals of different days for the development of the single ANN prediction model (thus called “global” training strategy).

For a fair comparison with the proposed approach, the 663, 265, and 398 days considered here in each simulation (CV) trial are the same as those that have been obtained (using the arbitrary fractions γ = 50%, β = 20%, and α = 30%) and used in the proposed training strategy for establishing the training, validation, and test datasets, respectively. The simulations are then repeated 100 times, and the ultimate average performance metrics and their standard deviations are reported and compared with those obtained by the proposed training strategy (Section IV).

SECTION IV.

Application Results

The application results of the proposed local training strategy-based ANN (Section III.B.1) on the ASU real case study (Section III.A) are here described and compared with the results obtained by the benchmark (Section III.B.2). Further, its effectiveness is verified and compared when Extreme Learning Machines (ELMs) are adopted instead of the ANNs, and when the Persistence model is used.

Because there is no need to estimate the power productions in the early morning and late evening hours, since there is no solar radiation, the reported results are shown solely for h ∈ [7,21].

A. Application Results of the Proposed Local Training Strategy-Based ANN

The optimum configurations obtained for each h-th optimum ANN (ANN_h^opt, h ∈ [7,21]) on the validation datasets (X_h^valid) in terms of the number of hidden neurons (N_h) are reported in Table 3, together with the best obtained average performance metrics. The results are obtained using the "Radial Basis" and "Purelin" hidden/output neuron activation functions (f_1() and f_2(), respectively) and the 100-fold CV procedure. The "Radial Basis" function takes input values in (−∞, ∞) and produces output values in (0, 1) [44], [45], whereas "Purelin" transfers the inputs to the outputs without any change [44], [45].

TABLE 3. The Optimum Configurations for Each h-th Optimum ANN Model, Together With the Corresponding Performance Metrics Obtained on the Validation Datasets

It is worth mentioning that one could follow an exhaustive search procedure to select the best hidden/output neuron activation functions (i.e., f_1() and f_2(), respectively) among the other activation functions available in the literature [44]. However, to reduce the complexity of the optimization step, the "Radial Basis" and "Purelin" functions are selected in this work for the different ANN models, following the recommendations and guidelines reported in [16] on the same dataset. The number of hidden neurons for each h-th optimum ANN (N_h^opt) is selected as the one at which the overall multiplied performance metric is minimized (i.e., min(RMSE_h × MAE_h × WMAE_h)).

Notice that:

  • The optimum number of hidden neurons obtained for each h-th ANN model is, in general, proportional to the level of variability exhibited by the data collected at the corresponding hour interval, h ∈ [7,21]. For example, at h = 7 and h = 12 (small and large data variability, respectively, as depicted in Fig. 5), small and large numbers of hidden neurons (N_h^opt = 5 and N_h^opt = 20, respectively) are selected among the candidate numbers, n_candidate, to accurately represent the inputs-output relationship;

  • the prediction performance obtained for each h-th ANN model is, in general, proportional to the level of variability exhibited by the data collected at the corresponding hour interval, h ∈ [7,21]. For example, at h = 7 and h = 12 (small and large data variability, respectively), small and large performance metric values are obtained, respectively.

B. Application Results of the Benchmark Global Training Strategy-Based ANN

Similarly, the optimum configuration of the single (H = 1) ANN obtained on the validation dataset (X^valid) in terms of the number of hidden neurons (N_h) is found at N_h^opt = 5, with optimum average performance metric values equal to 14.404 kW (RMSE), 10.203 kW (MAE), and 1.802 (WMAE). For a fair comparison, the "Radial Basis" and "Purelin" hidden/output neuron activation functions (f_1() and f_2(), respectively) and the 100-fold CV procedure are used.

For clarification purposes, Fig. 8 shows the evolution of the overall multiplied performance metric (RMSE × MAE × WMAE) obtained by the benchmark over the 100-fold CV procedure with respect to the candidate numbers of hidden neurons, n_candidate, spanning the interval [5, 30] with a step size of 5. One can recognize that the optimum number of hidden neurons is obtained at N_h^opt = 5 (star) and that, as the number of hidden neurons increases, the overall multiplied performance metric increases (i.e., the prediction performance degrades).

FIGURE 8. The evolution of the overall multiplied performance metrics with respect to the candidate numbers of hidden neurons.

C. Comparisons and Discussions

Table 4 reports the average performance metrics obtained by the optimum ANN configurations of the proposed model (as reported in Table 3), together with those produced by the optimum ANN configuration of the benchmark model (as shown in Fig. 8), over the entire day hours (h ∈ [7,21]) of the validation datasets (X_h^valid and X^valid, respectively) and training datasets (X_h^train and X^train, respectively) over the 100-fold CV. Results show that the proposed training strategy-based ANN (Section IV.A) outperforms the benchmark training strategy-based ANN (Section IV.B).

TABLE 4. The Overall Performance Metrics Obtained by Using the Proposed and the Benchmark Training Strategies-Based ANN on the Training and Validation Datasets

The performances of the two models are also verified using the test "unseen" datasets. In this regard, Table 5 reports the average performance metrics obtained by the proposed model, together with those produced by the benchmark model, over the entire day hours (h ∈ [7,21]) of the test datasets (X_h^test and X^test, respectively) over the 100-fold CV.

TABLE 5. The Overall Performance Metrics, Computational Time, and the Performance Gains Obtained by Using the Proposed and the Benchmark Training Strategies-Based ANN on the Test Dataset

To effectively compare the performance metrics of the two prediction models on the test datasets, the Performance Gain (PG_Metric) of each performance metric (Metric) is calculated as per Eq. (7) [4], [17]:
$$PG_{Metric}=\frac{Metric^{Benchmark}-Metric^{Proposed}}{Metric^{Benchmark}}\times 100\%\tag{7}$$

This quantity describes the gain achieved by the proposed model with respect to the performance of the benchmark for each of the computed performance metrics [4], [17]. Positive values of the PG calculated for the RMSE, MAE, and WMAE indicate that the proposed approach outperforms the benchmark model.
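Eq. (7) amounts to a one-liner; the example below reuses the h = 12 RMSE values reported later in this Section:

```python
def performance_gain(metric_benchmark: float, metric_proposed: float) -> float:
    """Performance gain of Eq. (7), in percent; positive favors the proposed model."""
    return (metric_benchmark - metric_proposed) / metric_benchmark * 100.0

# RMSE at h = 12: benchmark = 20.9 kW, proposed = 19.05 kW (see Fig. 9).
print(performance_gain(20.9, 19.05))  # ~8.9% gain
```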

Also, the computational efforts required by the prediction models for their development, validation, and evaluation are reported in Table 5.

Looking at Table 5, one can easily recognize that the proposed prediction model significantly outperforms the benchmark. Specifically:

  • The proposed approach enhances the prediction performance by ~25% (for the RMSE), ~30% (for the MAE), and ~22% (for the WMAE);

  • Additionally, the computational effort required by the proposed approach is significantly reduced, by ~40%, as expected;

  • Thus, the proposed training strategy boosts the prediction performance relative to the benchmark while requiring less computational effort.

Further, Fig. 9 shows the evolutions of the average performance metrics (Fig. 9 (left)) and the corresponding standard deviations (Fig. 9 (right)) at each hour h, h ∈ [7,21], using the proposed (circles) and the benchmark (squares) models, computed over the 100-fold CV on the test datasets.

FIGURE 9. The overall hourly performance metrics and the corresponding standard deviations obtained by the proposed (circles) and the benchmark (squares) models over the 100-fold CV on the test datasets.

Looking at Fig. 9 (left), one can recognize that:

  • The predictions provided by the two models are comparable. Specifically, the accuracy benefit of the proposed model over the benchmark (in particular at the early and late hours of the day) for the three performance metrics is explained by the use of solely the hourly data in building/developing the ANNs for the prediction of the corresponding hourly power productions. In contrast, the entire dataset over all hours of the day is used to build/develop the single ANN model. For example, at h = 12, the performance metrics obtained by the proposed training strategy (i.e., RMSE = 19.05 kW, MAE = 12.52 kW, and WMAE = 0.095) are smaller and, thus, superior to those obtained by the benchmark (i.e., RMSE = 20.9 kW, MAE = 14.63 kW, and WMAE = 0.111).

Looking at Fig. 9 (right), instead, one can recognize that:

  • the variability (standard deviation) of the three performance metrics obtained by the proposed model is smaller than that obtained by the benchmark. Again, this is because the proposed model exploits solely the data collected at each hour h to predict the corresponding power production, whereas the benchmark utilizes the entire data collected over all hours to predict the power production at each hour h;

  • the difference in variability between the two models is reduced at the middle hours of the day, as expected, due to the large variability of the data collected at these particular hours and utilized by the proposed model, compared to the early and late hours of the day.

Further insights on the superiority of the proposed model can be gained by looking at Fig. 10, which shows the average performance metrics computed over the 100-fold CV for each season (i.e., different weather conditions) on the test datasets (Fig. 10 (left)), together with the obtained performance gains (Fig. 10 (right)). It can be seen that:

  • the proposed model (dark shade) provides more satisfactory performance in terms of prediction accuracy of the power productions, i.e., lower metric values, for all seasons, compared to the benchmark (light shade) (Fig. 10 (left));

  • the highest performance gains achieved by the proposed model over the benchmark for the three metrics are obtained in the Summer season, whereas the lowest performance gains are obtained in the Winter season, with almost equal intermediate performance gains obtained in both the Autumn and Spring seasons (Fig. 10 (right)). This indicates the superiority of the proposed model in achieving more accurate predictions in the high-production season (Summer) compared to the low-production season (Winter).

FIGURE 10. The overall average performance metrics obtained by using the proposed (dark shade) and the benchmark (light shade) training strategies at each season on the test datasets (left), together with the performance gains (right).

For clarification purposes, Fig. 11 shows four examples of the best (Fig. 11 (top)) and worst (Fig. 11 (bottom)) power production predictions obtained by the proposed model (circles) for four different days (one day per season), compared with the corresponding predictions obtained by the benchmark model (squares), together with the actual productions (solid lines). The predictions provided by the two prediction models are comparable: the benefit in prediction accuracy of the proposed model over the benchmark is explained by the use of solely the hourly data for training the proposed model to predict the corresponding hourly power productions, whereas the complete hourly data are used to train the benchmark model.

FIGURE 11. Comparison of the power production predictions obtained by using the proposed (circles) and the benchmark (squares) training strategies for some days in the four seasons.

For completeness, Table 6 reports the average performance metrics and the corresponding performance gains obtained by the proposed model with respect to the benchmark for these particular days in the four seasons for one CV trial, i.e., CV = 6. One can recognize the superiority of the proposed model over the benchmark for all of the selected days across the four seasons. For example, the most significant enhancement obtained by using the proposed training strategy reaches up to ~58% (RMSE), ~60% (MAE), and ~60% (WMAE) for Day 4 (6th June, 2015, Summer), whereas the lowest enhancement reaches only ~2% (for the three performance metrics) for Day 1 (25th February, 2017, Winter). This indicates the capability of the proposed training strategy to enhance the prediction performance across the four seasons, even for its worst predictions.

TABLE 6. The Overall Best and Worst Performance Metrics Obtained by the Proposed Prediction Model Compared to Those Obtained by the Benchmark for Four Different Days of Each Season.

D. Comparisons With Other Prediction Techniques

In this Section, the effectiveness of the proposed local training strategy with respect to the global training strategy is investigated when other ML techniques are adopted (refer to Fig. 6). Specifically, Extreme Learning Machines (ELMs) are employed as prediction models instead of the ANNs. In addition, the choice of the ANNs is justified by comparing their prediction performance to that of the ELMs. Further, for completeness, the prediction performances obtained by the ANNs and the ELMs are compared to the well-known Persistence prediction model from the literature.

The ELM, introduced in [46], is a learning algorithm for single-hidden-layer feed-forward neural networks. Similar to the ANN architecture, the ELM comprises an input layer, a hidden layer of N_h hidden neurons, and an output layer. The idea underpinning the ELM is two-fold: i) the input parameters (i.e., weights and biases) of the hidden neurons are chosen randomly, instead of being tuned by the traditional iterative Back-Propagation learning algorithm, and ii) the output weights are then determined analytically. Applications of the ELM in different industrial fields show that it has good generalization capability and requires little computational effort [46].
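To make the two-step procedure concrete, the following minimal Python sketch fits a radial-basis ELM by drawing random centres and widths and solving the output weights by least squares. This is an illustration only, not the authors' implementation: the function names, the centre-sampling scheme, and the width range are assumptions.

    import numpy as np

    def train_elm(X, y, n_hidden, seed=None):
        # Step i): hidden-layer centres and widths are drawn at random
        # (no iterative Back-Propagation training of the hidden layer).
        rng = np.random.default_rng(seed)
        centres = X[rng.choice(len(X), size=n_hidden, replace=True)]
        widths = rng.uniform(0.1, 1.0, size=n_hidden)  # assumed sampling range
        # Radial-basis responses of the hidden layer for all patterns.
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        H = np.exp(-d2 / widths ** 2)
        # Step ii): output weights solved analytically via the pseudo-inverse.
        beta = np.linalg.pinv(H) @ y
        return centres, widths, beta

    def predict_elm(X, centres, widths, beta):
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / widths ** 2) @ beta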

The Persistence model [47] for PV power production prediction is an intuitive and straightforward approach commonly used as a benchmark for evaluating the effectiveness of any proposed prediction technique. Basically, it assumes that the PV power production at hour h, h \in [1, 24], of the next day will be the same as the PV power production collected at the same hour h of the present day.
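Under this assumption, the Persistence forecaster reduces to a one-line rule; a sketch, with a hypothetical daily-array data layout, is:

    import numpy as np

    def persistence_forecast(daily_power, h):
        # Tomorrow's predicted PV power at hour h equals today's measured
        # power at the same hour h. `daily_power` is assumed to be an
        # (n_days, 24) array of hourly productions, last row = today.
        return np.asarray(daily_power)[-1, h - 1]  # h in [1, 24]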

Both the proposed (local) and the benchmark (global) training strategies are employed for developing H = 24 and H = 1 ELMs, respectively, following the steps reported in Section III.B. For the two training strategies, the ELMs are built using the training datasets (\mathbf{X}_h^{train} and \mathbf{X}^{train}, respectively), optimized in terms of the number of hidden neurons using the validation datasets (\mathbf{X}_h^{valid} and \mathbf{X}^{valid}, respectively), and evaluated and compared using the test datasets (\mathbf{X}_h^{test} and \mathbf{X}^{test}, respectively). It is worth mentioning that different numbers of hidden neurons are examined, namely n_{candidate} = [25, 50, 100, 500, 900, 1300, 1700, 1900]. In addition, the Radial Basis function is used as the hidden neuron activation function [16].
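The validation-based choice of the hidden-layer size can be sketched as a simple grid search over the candidate values quoted above, reusing the illustrative train_elm/predict_elm helpers from the previous sketch (a minimal sketch, not the authors' selection routine):

    import numpy as np

    n_candidate = [25, 50, 100, 500, 900, 1300, 1700, 1900]

    def select_n_hidden(X_tr, y_tr, X_va, y_va):
        # Retain the hidden-layer size that minimises the validation RMSE.
        best_n, best_rmse = None, np.inf
        for n in n_candidate:
            centres, widths, beta = train_elm(X_tr, y_tr, n)
            y_hat = predict_elm(X_va, centres, widths, beta)
            rmse = np.sqrt(np.mean((y_hat - y_va) ** 2))
            if rmse < best_rmse:
                best_n, best_rmse = n, rmse
        return best_n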

Table 7 reports the average performance metrics obtained by using the proposed (local) training strategy-based ELM, together with those obtained by using the benchmark (global) training strategy-based ELM, over the daytime hours (h \in [7, 21]) of the test datasets (\mathbf{X}_h^{test} and \mathbf{X}^{test}, respectively) over the 100-fold CV.

TABLE 7. The Overall Performance Metrics and Performance Gains Obtained by Using the Proposed and the Benchmark Training Strategies-Based ELM on the Test Dataset.

Looking at Table 7, one can easily recognize that:

  • the utilization of the local training strategy for developing the ELMs largely enhances the solar PV power production predictions compared to the global training strategy: enhancements reach ~34%, ~36%, and ~30% for the RMSE, MAE, and WMAE, respectively;

  • the prediction accuracy obtained by the ELMs is slightly lower than that obtained by the ANNs. Future work can be devoted to enhancing the prediction model embedded in the proposed local training strategy.

In addition, Fig. 12 shows the evolutions of the average performance metrics (Fig. 12 (left)) and the corresponding standard deviations (Fig. 12 (right)) at each hour h, h \in [7, 21], obtained using the proposed (local) training strategy-based ANN (circles), the proposed (local) training strategy-based ELM (squares), and the Persistence model (diamonds), computed over the 100-fold CV on the test datasets. Looking at Fig. 12 (left), one can recognize that:

  • the predictions provided by the three models are comparable. Specifically, the Persistence model slightly outperforms the proposed (local) training strategy-based ANN, and largely outperforms the corresponding ELM, at the early morning (h = 7, 8, 9, 10) and late evening (h = 18, 19, 20, 21) hours. This can be justified by the fact that at those hours the variability of the weather conditions is small, which makes the intuitive assumption of the Persistence model valid (i.e., the PV power production at hour h of the next day will be the same as the production collected at the same hour h of the present day). However, the performance of the Persistence model degrades notably (the RMSE, MAE, and WMAE increase) during the midday hours (h \in [11, 17]) with respect to both the ANN and the ELM, due to the large variability of the weather conditions experienced by the ASU PV plant at those hours;

  • the proposed (local) training strategy-based ANN provides more accurate power predictions across all daytime hours than the proposed (local) training strategy-based ELM.

FIGURE 12. The overall hourly performance metrics and the corresponding standard deviations obtained by the proposed training strategy-based ANN (circles), the proposed training strategy-based ELM (squares), and the Persistence (diamonds) models over the 100-fold CV on the test datasets.

Looking at Fig. 12 (right), one can instead recognize that:

  • the variability (standard deviation) of the three performance metrics obtained by the Persistence model is the smallest among the three models. This is again due to the intuitive operation of the Persistence model, which considers solely the data collected at each hour h to produce the corresponding power production predictions;

  • the variability obtained by the ANN is lower than that obtained by the ELM, in particular during the midday hours. This confirms the effectiveness of the proposed (local) training strategy-based ANN in providing accurate solar PV power production predictions with small variability (i.e., tight confidence bounds).
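The per-hour curves and spreads in Fig. 12 correspond to simple fold-wise statistics; a sketch of how such hourly means and standard deviations could be computed (assuming a hypothetical folds-by-hours metric array) is:

    import numpy as np

    def hourly_stats(metric_per_fold_hour):
        # metric_per_fold_hour: assumed (n_folds, n_hours) array holding one
        # metric (e.g., RMSE) per CV fold and per hour h in [7, 21].
        # Returns the hourly mean (Fig. 12, left) and standard deviation
        # (Fig. 12, right) across the 100 folds.
        m = np.asarray(metric_per_fold_hour)
        return m.mean(axis=0), m.std(axis=0)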

SECTION V.

Influence of Using Different Hour Intervals for Dataset Partitioning on the Prediction Performance

In this Section, the influence of using different hour intervals \Delta h for partitioning the overall available pre-processed dataset \mathbf{X} into H different datasets on the predictability of the ASU power production is investigated.

Specifically, three different hour intervals are considered in this work; they are:

  • \Delta h = 2 hours. This interval entails partitioning the dataset \mathbf{X} into H = 24/\Delta h = 12 different datasets and, thus, building, optimizing, and evaluating H = 12 ANNs. Each dataset collects the timestamps, weather variables, and corresponding power productions recorded in the h-th interval during the Y = 3.625 years, h \in [1, 12];

  • \Delta h = 3 hours. This interval entails partitioning the dataset \mathbf{X} into H = 24/\Delta h = 8 different datasets and, thus, building, optimizing, and evaluating H = 8 ANNs. Each dataset collects the timestamps, weather variables, and corresponding power productions recorded in the h-th interval during the Y = 3.625 years, h \in [1, 8];

  • \Delta h = 4 hours. This interval entails partitioning the dataset \mathbf{X} into H = 24/\Delta h = 6 different datasets and, thus, building, optimizing, and evaluating H = 6 ANNs. Each dataset collects the timestamps, weather variables, and corresponding power productions recorded in the h-th interval during the Y = 3.625 years, h \in [1, 6].

Once the overall dataset \mathbf{X} is partitioned using the three considered hour intervals (i.e., \Delta h = 2, 3, and 4 hours) into H = 12, 8, and 6 different datasets \mathbf{X}_h, respectively, the proposed training strategy is applied following the steps illustrated in Section III.B (Fig. 6). Specifically, training (\mathbf{X}_h^{train}), validation (\mathbf{X}_h^{valid}), and test (\mathbf{X}_h^{test}) datasets are extracted randomly with arbitrary fractions of \gamma = 50\%, \beta = 20\%, and \alpha = 30\%, and are used to train, validate/optimize, and evaluate/test the H = 12, 8, and 6 different ANNs, respectively.
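A minimal sketch of this hour-interval partitioning and random splitting follows; it is illustrative only (the zero-based `hours` layout, the function name, and the seeding are assumptions, and the test set is taken as the remaining ~\alpha = 30\% of rows):

    import numpy as np

    def partition_by_interval(hours, delta_h, gamma=0.5, beta=0.2, seed=None):
        # Split the row indices of the pre-processed dataset X into
        # H = 24 / delta_h hourly subsets X_h, then draw train/validation/
        # test index sets from each subset with fractions gamma/beta/alpha.
        # `hours` is assumed to be the zero-based hour of day (0..23) of
        # each row of X.
        rng = np.random.default_rng(seed)
        H = 24 // delta_h
        splits = {}
        for h in range(H):
            idx = rng.permutation(np.flatnonzero(hours // delta_h == h))
            n_tr = int(gamma * len(idx))
            n_va = int(beta * len(idx))
            splits[h] = {"train": idx[:n_tr],
                         "valid": idx[n_tr:n_tr + n_va],
                         "test":  idx[n_tr + n_va:]}  # remaining ~alpha rows
        return splits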

Fig. 13 shows the evolutions of the average performance metrics at each hour h, h \in [7, 21], obtained using the proposed (circles) and the benchmark (squares) prediction models (as depicted in Fig. 9), together with the proposed training strategy-based ANN using the three different hour intervals \Delta h = 2, 3, and 4 hours (diamonds, triangles, and stars, respectively), computed over the 100-fold CV on the test datasets.

FIGURE 13. The overall hourly performance metrics obtained by using the proposed (circles (H = 24), diamonds (H = 12), triangles (H = 8), stars (H = 6)) and the benchmark (squares (H = 1)) training strategies-based ANN over the 100-fold CV on the test datasets.

For further clarification, Fig. 14 shows the average performance metrics obtained by using the proposed training strategy-based ANN with the different hour intervals (light shade of color), together with those obtained by using the benchmark training strategy-based ANN (dark shade of color), over the daytime hours (h \in [7, 21]) of the test datasets (\mathbf{X}_h^{test} and \mathbf{X}^{test}, respectively) over the 100-fold CV.

FIGURE 14. The overall performance metrics obtained by using the proposed training strategy-based ANN with different hour intervals (light shade of color) and the benchmark training strategy-based ANN (dark shade of color) over the 100-fold CV on the test datasets.

The analysis of Fig. 13 and Fig. 14 leads to the following conclusions:

  • The predictions provided by the proposed (local) and the benchmark (global) training strategies-based ANN are comparable. Specifically, the accuracy benefit obtained by the proposed training strategy over the benchmark with the different hour intervals (in particular at the early and late daytime hours) for the three performance metrics is still explained by the use of the local hourly data for building the ANNs that predict the corresponding local hourly power productions;

  • The prediction performance shifts towards that of the global training strategy (i.e., the benchmark with H = 1) as larger hour intervals are used (e.g., \Delta h = 4 hours) and, thus, fewer ANN models are built.

For completeness, Fig. 15 shows the computational efforts in minutes required by the proposed and the benchmark strategies during the training, optimization, and evaluation phases. One can notice that the computational efforts needed by the proposed approach, using either the three alternative hour intervals (i.e., \Delta h = 2, 3, and 4 hours) or the suggested hour interval (i.e., \Delta h = 1 hour), are generally lower than those required by the benchmark model. The variation in computational effort under the local training strategy is due to differences in the number of ANN models (H) established and in the amount of data used to train and optimize each of them.

FIGURE 15. The computational efforts in minutes required by the proposed (using different hour intervals) and the benchmark training strategies-based ANN on the test dataset.

To conclude, considering the hour-by-hour variation (i.e., the 24-hour seasonality pattern of each day) while building/training a data-driven prediction model is shown to be beneficial in enhancing the predictability of the solar PV power productions, while reducing the computational efforts required by the adopted model compared to the traditional global training strategy. In practice, these enhancements are valuable for balancing power supplies and demands across centralized grid networks through economic dispatch decisions between the energy sources.

SECTION VI.

Conclusion and Future Works

In this work, a local training strategy-based Artificial Neural Network (ANN) is proposed to enhance the prediction of solar PV power productions with short computational times. Specifically, the proposed training strategy is local in the sense that solely the timestamp, weather variables, and corresponding power productions collected at each hour interval h of size \Delta h = 1 hour, h \in [1, 24], are used to build, optimize, and evaluate H = 24 ANN prediction models, each used for estimating the h-th hour power production. The proposed strategy is validated on a solar PV system of the Applied Science Private University (ASU) located in Amman, Jordan. Its effectiveness is evaluated against a benchmark ANN model built and optimized using the entire available dataset. Three performance metrics are used for the comparisons, namely the Root Mean Square Error (RMSE), the Mean Absolute Error (MAE), and the Weighted MAE (WMAE), in addition to the computational times (Time) measured in minutes. Results show that the proposed training strategy-based ANN outperforms the benchmark, with performance gains reaching up to 25% (RMSE), 30% (MAE), 22% (WMAE), and 40% (computational training and test times). Further, the effectiveness of the proposed training strategy is verified and compared when Extreme Learning Machines (ELMs) are adopted instead of the ANNs and when the Persistence prediction model is used. Lastly, the proposed training strategy-based ANN is evaluated in terms of i) different weather conditions (i.e., seasons), confirming its superiority over the benchmark, and ii) different hour intervals (i.e., \Delta h = 2, 3, and 4 hours) used for partitioning the overall dataset and, thus, establishing the different ANNs (i.e., H = 12, 8, and 6 models, respectively).
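For reference, the three metrics take their standard literature forms; written with the nomenclature symbols, and assuming the common PV-forecasting convention of normalizing the WMAE by the total actual production, they read:

RMSE = \sqrt{\frac{1}{N^{test}} \sum_{j=1}^{N^{test}} (\hat{P}_h^j - P_h^j)^2}, \quad MAE = \frac{1}{N^{test}} \sum_{j=1}^{N^{test}} |\hat{P}_h^j - P_h^j|, \quad WMAE = \frac{\sum_{j=1}^{N^{test}} |\hat{P}_h^j - P_h^j|}{\sum_{j=1}^{N^{test}} P_h^j}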

Future works can be devoted to the application of deep learning techniques, e.g., Long Short-Term Memory (LSTM) networks and/or Echo State Networks (ESNs), in place of the employed ANNs, to further enhance the prediction performance.

Nomenclature

A. Abbreviations

RE: Renewable Energy
PV: Photovoltaic
ASU: Applied Science Private University
NWP: Numerical Weather Prediction
ML: Machine Learning
ANNs: Artificial Neural Networks
BP: Back-Propagation
RBF-NN: Radial Basis Function based Neural Network
AR: Auto-Regressive
LLR: Local Linear Regression
ARMAX: AR Moving Average with exogenous inputs
ARIMA: AR Integrated Moving Average
SVR: Support Vector Regression
ARX-ST: AR with exogenous input based Spatio-Temporal
MLP-ABC: Multi-Layer Perceptron-Artificial Bee Colony
MARS: Multivariate Adaptive Regression Splines
MLR: Multi-Linear Regression
CART: Classification and Regression Trees
RTs: Regression Trees
ELMs: Extreme Learning Machines
PSO: Particle Swarm Optimization
IC: Incremental Conductance
MPPT: Maximum Power Point Tracking
DLNNs: Deep-Learning Neural Networks
LASSO: Least Absolute Shrinkage and Selection Operator
MLPs: Multilayer Perceptrons
LM: Levenberg-Marquardt
BR: Bayesian Regularization
PIs: Prediction Intervals
RMSE: Root Mean Square Error
nRMSE: Normalized RMSE
MAPE: Mean Absolute Percentage Error
MAE: Mean Absolute Error
WMAE: Weighted MAE
MSE: Mean Square Error
SS: Skill Score
CV: Cross-Validation
LSTM: Long Short Term Memory
ESN: Echo State Network

B. Notations

\mathbf{W}: Weather data
Y: Number of available years of data
D: Number of available days of data
N: Number of available input-output patterns
j: j-th input-output pattern, j = 1, ..., N
I_{rr}: Global solar radiation
S: Wind speed at 10 m
RH: Relative humidity at 1 m
T_{amb}: Ambient temperature at 1 m
hr: Hour number from the beginning of each year of data, hr = 1, ..., 8760
d: Day number from the beginning of each year of data, d = 1, ..., 365
H: Number of hours in a day, H = 24
h: Hour instant, h \in [1, 24]
I_{rr_h}: I_{rr} value collected at hour h of each day during the period Y
\vec{I}_{rr_h}: I_{rr_h} data vector collected at hour h
\vec{I}_{rr}: I_{rr} data vector collected at all H = 24 hours
S_h: S value collected at hour h of each day during the period Y
\vec{S}_h: S_h data vector collected at hour h
\vec{S}: S data vector collected at all H = 24 hours
RH_h: RH value collected at hour h of each day during the period Y
\vec{RH}_h: RH_h data vector collected at hour h
\vec{RH}: RH data vector collected at all H = 24 hours
T_{amb_h}: T_{amb} value collected at hour h of each day during the period Y
\vec{T}_{amb_h}: T_{amb_h} data vector collected at hour h
\vec{T}_{amb}: T_{amb} data vector collected at all H = 24 hours
P_h: P value collected at hour h of each day during the period Y
\vec{P}_h: P_h data vector collected at hour h
\vec{P}: P data vector collected at all H = 24 hours
\mathbf{X}: Overall input-output dataset
\mathbf{X}_h: Overall input-output dataset available at the h-th hour
\mathbf{X}^{train}: Input-output training dataset
\mathbf{X}_h^{train}: Training dataset available at the h-th hour
\mathbf{X}^{valid}: Input-output validation dataset
\mathbf{X}_h^{valid}: Validation dataset available at the h-th hour
\mathbf{X}^{test}: Input-output test dataset
\mathbf{X}_h^{test}: Test dataset available at the h-th hour
N^{train}: Number of training input-output patterns
N^{valid}: Number of validation input-output patterns
N^{test}: Number of test input-output patterns
\gamma, \beta, \alpha: Arbitrary fractions used for extracting the training, validation, and test datasets, respectively
x_h^j: Generic j-th input pattern collected at hour h
\hat{P}_h^j: j-th power prediction obtained at hour h
P_h^j: j-th actual power collected at hour h
f_1(): Hidden neuron activation function
f_2(): Output neuron activation function
N_h: Number of hidden neurons
n: Index of hidden neuron, n = 1, ..., N_h
n_{candidate}: Possible (candidate) numbers of hidden neurons
b_n, b_o: Hidden and output bias neurons, respectively
\vec{w}_n, \vec{\beta}_n: Hidden and output connection weights, respectively
ANN_h^{opt}: Optimum ANN model obtained at hour h
\Delta h: Hour interval
Metric_h: Average value of a performance metric over the CV trials at hour h
Metric^{Benchmark}: Performance metric obtained by the benchmark (global) training strategy
Metric^{Proposed}: Performance metric obtained by the proposed (local) training strategy
PG_{Metric}: Prediction performance gain obtained for the performance metric Metric

ACKNOWLEDGMENT

The authors would like to thank the Renewable Energy Center at the Applied Science Private University for sharing the solar PV data, and all the reviewers for their valuable comments, which improved the quality of this article.
