INTRODUCTION
Over the past several decades, the challenges posed by global warming and the energy crisis have spurred the advancement and adoption of alternative, sustainable, and eco-friendly energy sources [1]. Solar energy, an inexhaustible resource, is widely regarded as one of the most promising renewable sources for power generation [2]. Photovoltaic (PV) cells represent the principal technology for the conversion of solar energy into electrical power [3]. The implementation of PV power generation has fostered considerable economic and environmental benefits, including the reduction of CO2 emissions and the generation of employment opportunities, thereby heightening public awareness and social engagement on these matters [4].
A. The PV System Challenges
PV cells harness energy from sunlight by utilizing photons to displace electrons within silicon semiconductors, thereby generating an electrical current [5]. Consequently, the power output of PV cells is intrinsically linked to solar irradiance. Moreover, various environmental factors, such as temperature, cloud cover, particulate matter, relative humidity, and others, impact the efficiency of energy generation in PV cells [6], [7]. The meteorological variables that influence the performance of PV cells are inherently unpredictable and subject to constant fluctuations [8]. Additionally, certain PV systems exhibit non-stationary characteristics, wherein PV panels undergo continual movement [9]. This dynamic behavior results in fluctuations and unpredictability in the irradiance levels captured by the PV panels. The intricate and dynamic nature of weather systems, coupled with the uncertainties surrounding conditions, renders the precise control and maintenance of PV modules a complex endeavor [10].
One potential solution to effectively manage these intricate and volatile PV systems is the implementation of Digital Twin (DT) technology [11]. The concept of DT technology involves the creation of an advanced data-driven virtual model that mirrors a physical entity, and its purpose is to refine, optimize, and sustain real-world operation [12]. Fig. 1 shows an overview of the DT system of a large-scale floating PV installation and displays relevant technologies that could be implemented on the real-world and virtual sides.
A DT system commonly incorporates an interface that facilitates various functionalities. This interface serves as a means to visually represent the DT model, providing users with the ability to observe its constituent elements and manipulate relevant parameters [13]. Moreover, it encompasses real-time data monitoring capabilities, analytics tools, and simulation functionalities, thereby empowering users to analyze and optimize the behavior of the DT model [14]. In addition, the interface offers alerts and notifications based on predefined conditions, supports collaborative efforts among users, and integrates with other software systems. Through the DT interface, users can monitor, control, and make well-informed decisions about the physical object or system that the DT model represents [15]. To illustrate, Fig. 2 shows an example of a DT interface for a PV system, which was developed using the solar PV design software tool Aurora [16].
The application of DT technology can address the challenges posed by PV systems, offering several advantages over conventional simulation methods:
A DT model is inherently dynamic, capable of adapting to evolving environmental conditions in real-time [17].
Access to extensive data sets enables DT models to leverage machine learning algorithms for improved performance and results [18].
By integrating data from multiple sources, a DT model generates a more comprehensive and robust representation [19].
By incorporating these advantages, DT technology presents a promising avenue for overcoming the complexities associated with PV system management.
B. Brief History of DT Technology
Although the term “Digital Twin” was coined only in 2011, the underlying concept can be traced back to the 1960s [20]. One notable example is NASA's ground-based engineering team, which successfully resolved Apollo 13’s oxygen tank explosion problem in 1970 by testing potential solutions in a virtual simulation [21]. In 2003, Michael Grieves was credited with first publicly introducing the concept and model of the DT during a specialized meeting on product life-cycle management at the University of Michigan's Lurie Engineering Center [22]. In a subsequent publication, Grieves delineated the three primary components of a DT:
A virtual twin that emulates the behavior of the corresponding physical counterpart, generating identical outputs in response to the same input values.
The physical twin, which represents the real-world entity that the virtual twin seeks to emulate, may encompass a product, system, model, or other physical entity.
A data flow cycle that facilitates the exchange of information between the physical and virtual twins, enabling each to inform and influence the other.
NASA employed the term “Digital Twin” in 2011 to describe the digital replication of an aircraft's structural behavior [23]. Initially, DT models were utilized as maintenance tools for continuous structural monitoring. Subsequently, they evolved into comprehensive replicas that could simulate the entire life cycle of an aircraft and predict its performance [24]. Over time, DT technology has played an important part in Industry 4.0 and gained widespread adoption across various sectors, including construction, education, business, transportation, power and electronics, human and healthcare, sports, and networking and communications [25]. The technology has garnered increased popularity among industries striving to make their processes more intelligent, adaptable, and optimally responsive to operational conditions. DT systems are highly sought after for their capacity to identify product defects [26], reduce production costs [27], enable real-time monitoring [28], and extend product lifespans through the prediction of product failure [29]. Fig. 3 shows a timeline that catalogs notable developments in DT technology.
DT technology also revolutionized the power electronics field by enabling the creation of highly accurate virtual models of physical systems and components. For example, ABB Ltd. used DT models to monitor and optimize the performance of their electromagnetic flowmeters, leading to enhanced efficiency and reduced energy losses [30]. Engineers at General Electric leveraged digital twins to predict and mitigate potential issues in their power converters before they occurred, optimizing system designs and conducting virtual testing under various operating conditions [31]. This technology also facilitated predictive maintenance, as seen in Siemens' implementation, which reduced downtime and extended the lifespan of critical components in their power management systems [32].
C. Research and Review Gap
In recent years, academic literature concerning DT technology has seen a marked increase, though research activities in various fields are not uniformly distributed. The most prevalent areas of focus include manufacturing DT, with over 1,000 papers published since 2010, and architectural DT, with over 400 papers published since 2010. In contrast, there is a relative scarcity of published research on solar energy DT. Consequently, the development of solar energy DT systems represents an emerging domain in the realm of digitization that warrants increased attention, given the pressing need for clean, renewable energy solutions in response to global environmental challenges. Considering the emerging status of solar DT systems, it is essential for researchers to coordinate their efforts to produce meaningful research outputs that directly impact the solar industry and contribute to the urgent demand for clean energy. Therefore, a systematic review of solar DT technology is both timely and necessary to guide future research and development in this critical field.
Numerous review papers have explored the concept of DT modeling. Liu et al. provided a comprehensive analysis of industrial applications, comparing DT models to digital models and digital shadows, and emphasizing the significance of data fusion in DT modeling [33]. However, their focus was on industrial rather than energy applications. The criteria for effective DT modeling in industrial manufacturing differ significantly from those in the energy sector, where predictive analysis is crucial due to the variability in energy availability and consumption [34]. In their review of energy DT models, Amaral et al. categorized these twins by their applications in energy generation, storage, transmission, and consumption, but made minimal reference to solar energy DT models, which are distinguished by their unique and complex energy generation methods. Researchers aiming to design solar energy DT systems may find this review less useful due to the lack of specific insights into solar applications. Ghenai et al.’s review discusses some modeling methods for solar energy DT models but offers a limited analysis, primarily highlighting the benefits without addressing the limitations or comparing different methods to understand their underlying mechanisms [35]. Lastly, Kavousi-Fard et al.’s review analyzes DT technology for solar energy, highlighting its applications in optimization, maintenance, security, and resiliency, but does not compare different DT models within each category [36]. DT models that utilize high volumes of diverse data can differ in structure and performance from those that do not, which highlights the need for comparative analysis within application categories.
D. Review Aims and Contributions
This review aims to categorize and explore the design and application of DT systems to enhance the efficiency of PV solar energy generation. To accomplish this objective, the paper presents an exhaustive literature review centered around addressing the following research questions:
How do academic publications define DT technology and classify its various types?
How do DT systems leverage data fusion techniques to benefit solar energy generation?
What are the advantages and limitations of various PV models?
How is data fusion implemented in solar DT systems?
What are the current limitations in the utilization of DT technology for PV management?
The subsequent sections of this paper follow a structured layout. In Section II, a novel classification framework for DT systems is introduced, utilizing data connectivity and data integration concepts drawn from existing literature. Section III conducts an in-depth analysis of prevalent modeling methods applied in PV systems across various DT categories, carefully evaluating their respective advantages and limitations. Section IV highlights critical challenges encountered by DT systems within the context of the PV industry, while also proposing prospective research directions aimed at surmounting these challenges. Finally, Section V provides a conclusive summary to the paper.
DT Definition and Classification
Numerous academic and industrial publications have sought to define “Digital Twin”, resulting in a multitude of ambiguous definitions that provide limited clarity without further elaboration. Examples of these definitions include terms such as “digital replica [37],” “digital counterpart [38],” “virtual counterpart [39],” and “virtual representation [40].” Without specifying the meaning of replica, counterpart, or representation within these definitions, they offer minimal explanatory value. This section defines and classifies DT systems by examining their data connectivity and data integration attributes. Fig. 4 shows the classification framework that categorizes different types of DT systems as Digital Model (DM), Digital Shadow (DS), and Digital Twin [41].
A. Data Connectivity Attribute
PV models exhibit diverse levels of data connection between their physical and virtual parts. Simple digital simulations are created without linkage to a physical entity, whereas more complex ones possess total fusion with instantaneous data transfer [42]. The types of data connections present within a DT model can be classified into three distinct categories: non-automated, direct, and recursive data flow.
Non-automated data flow excludes any sort of automated data interchange between physical and virtual components. Such digital representations may encompass comprehensive descriptions of the physical object, including simulation models of proposed manufacturing facilities, mathematical models of innovative products, or other representations of physical objects that do not rely on automatic data integration [41]. For example, software programs are utilized to create 3D models of rooftop PV installations, incorporating the panel layout, tilt angle, and other physical attributes [43]. An example of a rooftop PV installation model is shown in Fig. 5. However, this model does not reflect real-time changes in the environment, such as the rising and setting of the sun or shading by clouds. Though digital information from real-world systems might be utilized in developing these models, all data transfer is conducted manually. Consequently, changes in the condition of the physical object do not immediately affect its virtual model. Non-automated data flow is used in DMs for conducting steady-state analysis of PV systems and is not responsive to real-time changes in environmental data.
Direct data flow is distinguished by the existence of a one-way data flow between an extant physical object and its digital counterpart [45]. Alterations in the state of the physical entity lead to corresponding changes in the digital entity; however, the inverse does not transpire. To illustrate, consider a PV unit within a solar farm. Sensors are strategically placed throughout the unit to collect data such as panel temperature, incident irradiance, or produced power. These data are utilized to create a DS model, which provides a real-time representation of the current status of the PV unit [46]. A DS model therefore reflects environmental fluctuations in real time, enabling dynamic analysis of PV systems [47]. However, it cannot act on the physical system in response to those changes. While DS models that incorporate direct data flow offer considerable value for dynamic simulation and data acquisition, they do not possess the ability to deliver decision-level control over PV systems.
Recursive data flow is characterized by the complete bidirectional integration of data exchange between an existing physical object and its digital counterpart [48]. In this arrangement, the digital entity serves as a controlling instance for the physical entity. A change in the condition of the physical object immediately leads to a change in the condition of the virtual model, and vice versa. Suppose an electrical company designs a DT model for a grid-connected PV system. This model collects real-time data from sensors embedded in the solar panel, such as voltage, current, and temperature levels [49]. If the DT model predicts a short-circuit failure based on these data, it can suggest preventative maintenance. Engineers can also simulate different scenarios in the DT model to see how the physical PV panel might react, helping them to improve the panel's design or operational protocols. The DT model with recursive data flow is proficient at adapting to environmental conditions by adjusting specific physical attributes of the PV system [11], such as the orientation of sensors, the tilt angle of solar panels, and the input voltage of solar cells. By recursively searching for the settings that maximize power output, a DT model can provide decision-level control for the PV system.
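The three data flow types described above can be summarized as a simple decision rule. The following Python sketch is illustrative only: the function name and the two Boolean flags are hypothetical and are not taken from the cited literature.

```python
def classify_by_data_flow(physical_to_virtual_automated: bool,
                          virtual_to_physical_automated: bool) -> str:
    """Map the automation of each data-flow direction to DM, DS, or DT."""
    if physical_to_virtual_automated and virtual_to_physical_automated:
        return "DT"  # recursive (bidirectional) data flow
    if physical_to_virtual_automated:
        return "DS"  # direct (one-way) data flow
    if not virtual_to_physical_automated:
        return "DM"  # non-automated data flow
    # Assumption: an automated virtual-to-physical link without the reverse
    # link is not covered by the three categories, so it is rejected here.
    raise ValueError("data-flow combination not covered by the framework")


label_manual = classify_by_data_flow(False, False)   # 3D rooftop model
label_sensors = classify_by_data_flow(True, False)   # sensor-fed monitoring
label_control = classify_by_data_flow(True, True)    # closed-loop control
```

Under this rule, a manually updated rooftop model is a DM, a sensor-fed monitoring model is a DS, and a model that also sends control actions back to the plant is a DT.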
B. Data Integration Attribute
The pivotal technology driving a DT system is data fusion, enabling the seamless transition of information from raw numeric input to a comprehensive understanding and actionable information [50]. Data fusion in DT systems can be implemented at three different levels: raw-data level, feature level, and decision level fusion [51].
Raw-data level data fusion is the process of combining unsorted data from multiple sources, typically sensors, to create a more comprehensive and accurate representation of a given environment or situation [52]. For example, consider a photovoltaic simulation system that synthesizes historical power data, meteorological information, and statistical analyses into a unified model [53]. This integration effectively addresses the challenges associated with the limited availability and incompleteness of historical photovoltaic output power and meteorological data. This fusion technique proves particularly advantageous in scenarios where visibility is compromised under low light conditions or bad weather conditions [54]. The primary objective of this fusion process is to enhance the overall quality and dependability of the data gathered by reducing ambiguity, redundancy, and interference [55]. Through the amalgamation of information at the raw-data level, the ensuing combined data offers a fuller and more precise portrayal of the physical object. Consequently, the fusion process enables enhanced situational awareness, improved decision-making, and a deeper understanding of the observed phenomena. By employing raw-data level data fusion, DT systems can harness the collective power of multiple sources to generate more reliable and informative representations.
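As a minimal illustration of raw-data level fusion, the sketch below combines redundant sensor readings with inverse-variance weighting and skips missing values. This weighting scheme is one common choice and is an assumption here, not the method of the cited studies; the sensor values and variances are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def fuse_raw_readings(readings: Sequence[Optional[float]],
                      variances: Sequence[float]) -> Tuple[float, float]:
    """Inverse-variance weighted average of redundant sensor readings.

    Missing readings (None) are skipped. Returns (fused value, fused variance).
    """
    pairs = [(r, v) for r, v in zip(readings, variances) if r is not None]
    if not pairs:
        raise ValueError("no valid readings to fuse")
    weights = [1.0 / v for _, v in pairs]
    total = sum(weights)
    fused = sum(w * r for w, (r, _) in zip(weights, pairs)) / total
    return fused, 1.0 / total


# Hypothetical irradiance readings in W/m^2; the second sensor is offline.
fused_value, fused_variance = fuse_raw_readings(
    [800.0, None, 820.0], [100.0, 400.0, 100.0])
```

The fused variance is smaller than that of either contributing sensor, which reflects the reduction in ambiguity that raw-data level fusion aims for.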
In feature-level data fusion, the data from different sources are first preprocessed and transformed into a set of features, which are characteristic attributes or patterns that can be used for further analysis [56]. This fusion technique plays a crucial role in generating a unified feature set that encompasses the relevant information from diverse sources. For instance, consider a PV simulation that incorporates both light model and thermal model [57]. Each of these models extracts distinct features, such as irradiance from the light model and temperature from the thermal model. By combining the extracted features, a more comprehensive and representative simulation of PV working conditions can be created, exemplifying feature-level fusion [58]. The integration of these features has the potential to enhance the accuracy, reliability, and robustness of the PV simulation. The amalgamation of features enables a more holistic representation of the underlying data, allowing for a deeper understanding and analysis of the observed phenomena. The integration of an extensive set of features derived from various sources not only contributes to creating a simulation that mirrors a broader range of aspects found in reality, but also facilitates more accurate modeling and simulation outcomes.
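The light and thermal example can be sketched as follows. The NOCT-based cell temperature relation and the linear temperature-corrected power relation are widely used textbook approximations, adopted here as assumptions; all function names and parameter values are hypothetical and do not reproduce the models in [57], [58].

```python
def light_model_features(irradiance_w_m2: float) -> dict:
    """Feature extracted by a (stand-in) light model."""
    return {"irradiance_w_m2": irradiance_w_m2}


def thermal_model_features(ambient_c: float, irradiance_w_m2: float,
                           noct_c: float = 45.0) -> dict:
    """Feature extracted by a (stand-in) thermal model, NOCT approximation."""
    cell_temp = ambient_c + (noct_c - 20.0) / 800.0 * irradiance_w_m2
    return {"cell_temp_c": cell_temp}


def fuse_features(*feature_sets: dict) -> dict:
    """Feature-level fusion: merge per-model features into one feature set."""
    fused = {}
    for features in feature_sets:
        fused.update(features)
    return fused


def estimate_power(features: dict, p_stc_w: float = 300.0,
                   gamma_per_c: float = -0.004) -> float:
    """Downstream model that consumes the unified feature set."""
    scale = features["irradiance_w_m2"] / 1000.0
    derate = 1.0 + gamma_per_c * (features["cell_temp_c"] - 25.0)
    return p_stc_w * scale * derate


features = fuse_features(light_model_features(800.0),
                         thermal_model_features(25.0, 800.0))
power_w = estimate_power(features)
```

Neither model alone yields the power estimate; it only becomes available once the irradiance and temperature features are placed in a single feature set.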
Decision-level data fusion combines the output from multiple sources, models, or classifiers to arrive at a final decision or result [59]. In decision-level data fusion, each source, model, or classifier first processes the input data independently and produces its own decision or output. Then, these individual decisions are combined or fused to reach a final, more reliable decision [60]. Decision-level fusion is exemplified by an automated system designed for PV fault diagnosis, which utilizes results from multiple distinct classifiers [61]. Each of these classifiers processes the input separately and produces its own preliminary diagnosis. By combining these preliminary diagnoses through decision-level fusion, a final diagnosis can be determined. This fusion technique boosts the precision and dependability of the diagnostic procedure by utilizing the advantages of various information sources [62]. The main goal of decision-level data fusion is to improve the accuracy, reliability, and resilience of the decision-making mechanism by combining the strengths of different models. By integrating outputs from various sources, decision-level fusion reduces the impact of uncertainties and errors associated with individual classifiers [63]. By employing decision-level data fusion, DT systems can harness the collective output of multiple models, leading to improved decision-making processes.
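One simple way to combine preliminary diagnoses is weighted majority voting, sketched below. Voting is used here as an assumed, generic fusion rule; the cited fault diagnosis system [61] may combine its classifiers differently, and the fault labels and weights are hypothetical.

```python
from collections import defaultdict
from typing import Optional, Sequence


def fuse_decisions(decisions: Sequence[str],
                   weights: Optional[Sequence[float]] = None) -> str:
    """Weighted majority vote over the labels returned by several classifiers."""
    if not decisions:
        raise ValueError("at least one decision is required")
    if weights is None:
        weights = [1.0] * len(decisions)  # plain majority vote
    scores = defaultdict(float)
    for label, weight in zip(decisions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


# Three hypothetical classifiers diagnose the same PV string.
majority = fuse_decisions(["short_circuit", "partial_shading", "short_circuit"])

# With reliability weights, one trusted classifier can outvote two weak ones.
weighted = fuse_decisions(["normal", "open_circuit", "open_circuit"],
                          weights=[0.9, 0.3, 0.4])
```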
Comparison of Models, Shadows, and Twins
This section provides an in-depth analysis of contemporary modeling methodologies relevant to the application and implementation of PV system models. PV models are divided into three categories based on data connectivity: DM, DS, and DT. These categories are arranged hierarchically, with the DS comprising multiple instances of DM, and the DT incorporating multiple instances of DS.
This paper concentrates on the most advanced level of modeling, the DT, exploring its role in enhancing our understanding of PV systems. An examination of DT models in existing literature highlights data fusion as a key element in their development. The integration of DT technology and data fusion techniques significantly enhances the monitoring, analysis, and optimization of the performance of physical assets, systems, or processes. Fig. 6 shows a framework of how different types of data fusion can be implemented in the DT structure.
Differentiation among DT models is based on the type of data fusion employed, categorized into three subdivisions: data acquisition DT, data-driven simulation DT, and data-driven control DT. These correspond to raw-data level data fusion, feature level data fusion, and decision level data fusion, respectively. Furthermore, the full data cycle DT integrates these various levels of data fusion, encapsulating the entire process from raw sensor inputs to system control decisions. This hierarchical and integrated classification approach based on data connectivity and integration attributes enables a more structured and comprehensive analysis and control of solar energy systems. This paper seeks to provide insights into the current applications of DT and data fusion technologies in PV systems, identifying both challenges and opportunities associated with these advancements.
A. Digital Model
The DM encapsulates a discrete component of a PV system, isolating it from temporal fluctuations. These components may include the physical architecture, interactions with light, and the thermal and electrical properties of the PV system [64]. The primary attribute of the DM is its static nature, characterized by a set of equations that represent the physical properties of the object it models. The primary aim of DM models is to perform simple PV performance forecasts, which represent 83% of the reviewed DM models. These models focus on how a single environmental change over time affects system performance. Table 1 presents an assortment of DMs used for PV simulation which have been documented extensively in academic research.
DMs have proven beneficial for estimating the energy gain of FPV (Floating Photovoltaic) systems. In one study, the cooling effect of water on FPV systems was used to predict power output and to contrast ground-based and FPV systems [65]. The parameters of the model include irradiance-weighted average temperatures, heat loss coefficients, and wind-dependent heat loss coefficients. The weighted temperature gives more influence to hours with higher irradiance, because PV panels produce significantly higher energy yields under such conditions. Therefore, temperature data under high irradiance settings are of greater significance. The yearly outputs of the PV systems were inferred from the irradiance-weighted temperature difference and a PVsyst model. The model estimates that the cooling effect improves the energy yield of FPV systems by up to 6% compared to the reference PV systems.
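The irradiance-weighted average temperature can be written as T_w = sum(G_i * T_i) / sum(G_i), where G_i and T_i are the irradiance and module temperature in hour i. The sketch below computes this quantity and converts a temperature difference into a relative yield gain using a linear temperature coefficient. The coefficient, the linear gain estimate, and all sample values are assumptions for illustration and do not reproduce the PVsyst-based procedure of [65].

```python
from typing import Sequence


def irradiance_weighted_temperature(temps_c: Sequence[float],
                                    irradiances_w_m2: Sequence[float]) -> float:
    """T_w = sum(G_i * T_i) / sum(G_i); high-irradiance hours dominate."""
    total_irradiance = sum(irradiances_w_m2)
    if total_irradiance <= 0:
        raise ValueError("total irradiance must be positive")
    weighted_sum = sum(t * g for t, g in zip(temps_c, irradiances_w_m2))
    return weighted_sum / total_irradiance


def relative_yield_gain(t_ref_c: float, t_fpv_c: float,
                        gamma_per_c: float = -0.004) -> float:
    """Assumed linear estimate: a cooler module gains |gamma| per degree C."""
    return gamma_per_c * (t_fpv_c - t_ref_c)


# Hypothetical hourly samples (W/m^2 and degrees C) for both systems.
irradiance = [100.0, 500.0, 400.0]
t_ground = irradiance_weighted_temperature([20.0, 40.0, 50.0], irradiance)
t_floating = irradiance_weighted_temperature([18.0, 35.0, 44.0], irradiance)
gain = relative_yield_gain(t_ground, t_floating)
```

In this example the weighted ground-based temperature is 42.0 °C, well above the unweighted mean of about 36.7 °C, because the warm hours coincide with high irradiance.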
DMs help in simulating various configurations and environmental conditions to determine the most efficient setup for solar panels. This can significantly enhance energy output and reduce wastage by predicting how different angles, spacing, and types of solar cells will perform under varying conditions [43]. However, the limitation of DMs lies in their static nature; they are not typically updated with real-time data. As a result, they might not account for ongoing changes in environmental conditions or degradation of solar panel efficiency over time, which can lead to discrepancies between predicted and actual system performance [66].
B. Digital Shadow
The DS demonstrates direct data connectivity and provides a real-time representation of the system through the aggregation of multiple DMs. The defining feature of the DS is its ability to reflect the real-time status of a physical object by automatically gathering and analyzing data. DS models employ sensors to establish a direct link between the physical and digital entities, facilitating the transmission of real-time parameters. DS models primarily aim for real-time monitoring, representing 54% of the reviewed DS models. They leverage their direct data connection to facilitate autonomous monitoring and integration with other IoT technologies. Table 1 exhibits a variety of DS models used for PV simulation and prediction, widely cited in academic literature.
The DS framework facilitates real-time PV monitoring through dynamic alternating current equivalent electric circuit (AC-EEC) modeling of PV modules using impedance spectroscopy [72]. The parameters of the model include series resistance, junction resistance, capacitance, and minority carrier lifetime. This technique allows for detailed characterization of the cells under various conditions such as illumination, shading, and faults, by observing changes in impedance. A key feature is the ability to extract and monitor various parameters such as resistance and capacitance changes that reflect different operational and fault states. This method provides significant insights into internal processes of the photovoltaic cells, which are crucial for improving design, diagnostic processes, and ensuring efficient performance. Moreover, it can be adapted for real-time condition monitoring, offering a non-destructive, insightful tool for ongoing assessment of PV systems.
For PV systems, a DS model offers the advantage of real-time monitoring and performance analysis. By maintaining an ongoing data stream from the physical PV system, a DS model can track energy production, identify panels that are underperforming, and monitor the health of the system [46]. This enables operators to make informed maintenance decisions and quickly address issues like shading or dirt accumulation that can reduce efficiency [73]. The limitation, however, is that a DS model typically does not interact with the system to initiate corrective measures. The DS serves as a passive observer, limiting its utility to monitoring and diagnostic functions without direct intervention capabilities.
C. Digital Twin
The DT model encompasses multiple instances of DS, each representing a potential future process outcome. The DT model evaluates each DS prediction to identify the most advantageous outcome, thereby optimizing the system towards an optimal state. The fundamental attribute of the DT is its capability for continuous optimization, either by manipulating real-world objects or by refining its modeling strategies. Within a DT framework, the virtual entity implements modifications that influence the state of the physical entity. Subsequently, changes in the physical entity are reflected by the virtual counterpart, creating a recursive data connection loop. The DT represents the pinnacle of sophisticated simulation, harnessing data from diverse sources to lead the trend towards more intelligent simulations. DT models are mainly used for fault diagnosis, representing 36% of the reviewed DT models, and PV system control, representing 32% of the reviewed DT models. These models require optimization through various potential actions. While DT models also predict PV performance, representing 29% of the reviewed DT models, they differ significantly from DM models by forecasting performance in entirely new environments rather than just temporal variations in the same environment. These new environments introduce a set of compounding environmental factors, making accurate PV performance predictions more complex. Table 2 introduces an array of DT models in the field of PV control, extensively cited in scholarly literature. To enhance comprehension of DT modeling for PV systems, this paper will categorize DT models based on their approach to data fusion, identified as the data integration attribute of the DT.
1) Data Acquisition Digital Twin
Data acquisition DT models employ raw-data level fusion, integrating data from multiple sources such as sensors, external databases, historical records, and other online resources. In the context of PV systems, complex conditions such as fluctuating irradiance, partial shading, and low lighting can render some data inaccurate or unavailable. Additionally, sensors are susceptible to magnetic interference and degradation associated with aging, factors which may compromise the precision and reliability of data acquisition [85]. The data acquisition DT's key characteristic is its ability to process raw data from various sources, which may be unreliable, incomplete, or contradictory, and to optimize a set of features for dependable modeling.
Pirayawaraporn and colleagues proposed a two-axis solar tracking system without sensors that embodies recursive data connectivity at the raw-data level by merging model predictions with actual measurements [78]. The parameters of the model include daily angle, elevation angle, number of particles, mean values of daily and elevation angles, variance values of daily and elevation angles, weights of particles, and normalized weights of particles. The model addresses the challenge posed by the absence of sensor information through the implementation of a robust sampling-based tracking algorithm, known as the particle filter, which is utilized to develop a novel solar tracking strategy. Initially, particles corresponding to various orientation angles of the proposed tracking system are generated as inputs. Subsequently, PV power is captured for each particle, and a corresponding weight is calculated to denote each particle's relative importance. Each particle's alignment is successively calculated and revised, incorporating its measurement to determine the likely location of the sun. The tracking approach concludes once every particle aligns to a uniform direction, indicative of the attainment of the optimal PV angle. This method requires the physical PV module to recursively explore different potential solutions to arrive at an optimal tracking angle, and it improved PV energy generation by 20.1% compared to a fixed flat-plate system.
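The steps above can be illustrated with a simplified sampling-importance-resampling loop in the spirit of a particle filter. This is a sketch under stated assumptions, not the algorithm of [78]: the cosine power model, the exponential weighting with a "temperature" parameter, the decaying jitter, and all numeric settings are hypothetical, and the true sun position is simulated rather than measured.

```python
import math
import random
import statistics

# Hypothetical true sun position in degrees; unknown to the tracker.
SUN_AZ, SUN_EL = 20.0, 50.0


def measure_power(az: float, el: float) -> float:
    """Toy stand-in for the PV power (W) measured at a given orientation."""
    d_az = math.radians(az - SUN_AZ)
    d_el = math.radians(el - SUN_EL)
    return 1000.0 * max(0.0, math.cos(d_az)) * max(0.0, math.cos(d_el))


def track_sun(n_particles=300, max_iters=60, jitter=5.0, decay=0.8,
              temperature=5.0, tol=1.0, seed=0):
    rng = random.Random(seed)
    # 1) Generate particles, each a candidate (azimuth, elevation) pair.
    particles = [(rng.uniform(-90.0, 90.0), rng.uniform(0.0, 90.0))
                 for _ in range(n_particles)]
    for _ in range(max_iters):
        # 2) Capture PV power for each particle and convert it to a weight.
        powers = [measure_power(az, el) for az, el in particles]
        best = max(powers)
        weights = [math.exp((p - best) / temperature) for p in powers]
        total = sum(weights)
        weights = [w / total for w in weights]  # normalized weights
        # 3) Resample in proportion to the weights, then perturb slightly.
        chosen = rng.choices(particles, weights=weights, k=n_particles)
        particles = [(az + rng.gauss(0.0, jitter), el + rng.gauss(0.0, jitter))
                     for az, el in chosen]
        jitter *= decay
        # 4) Stop once all particles point in nearly the same direction.
        spread_az = statistics.pstdev([az for az, _ in particles])
        spread_el = statistics.pstdev([el for _, el in particles])
        if spread_az < tol and spread_el < tol:
            break
    est_az = statistics.mean([az for az, _ in particles])
    est_el = statistics.mean([el for _, el in particles])
    return est_az, est_el


est_az, est_el = track_sun()
```

Only power measurements enter the loop, which mirrors the sensorless character of the approach: no irradiance or position sensor is consulted at any step.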
2) Data-Driven Simulation Digital Twin
Data-driven simulation DT models utilize feature-level data fusion, where information from various sources is processed and transformed into a set of distinct features displaying unique attributes or patterns. These features are then merged to form a unified set, which is analyzed to understand relationships and correlations among different attributes such as physical structure, lighting, heat, and electrical characteristics of the PV system. The fundamental trait of the data-driven simulation DT is its use of preprocessed data to refine a simulation model or multiple component models, continuously aligning model outputs with real-world data and revising feature connections to better represent the real-world object. An example of a data-driven simulation DT is shown in Fig. 7.
Qadir et al. applied a recursive feature selection technique to remove the least effective features and choose the best features to estimate the power yield of a combined PV-wind power system [79]. The parameters of the model include solar irradiation, wind speed, ambient temperature, humidity, precipitation, atmospheric pressure, and wind direction. At first, meteorological information is gathered from instruments and refined to eliminate any inaccurate figures that might weaken the framework. Subsequently, feature selection is executed through iterative feature removal employing cross-validation techniques. The dataset undergoes training with artificial neural network predictors, and interconnections among various attributes in the collection are identified. In each iteration, the estimator uses all the features in the data to generate a set of scores at the iteration's conclusion. Each score is associated with a particular feature. For instance, consider a scenario where the goal is to identify the top five features which contribute most significantly to model accuracy from a total of 20 features. In this case, the algorithm begins by recursively eliminating features at each iteration, provided their corresponding scores are below the algorithm's threshold. The primary objective is to identify meaningful patterns among features to enhance performance having MSE (Mean Squared Error) of 0.000000104, MAE (Mean Absolute Error) of 0.00083, R
3) Data-Driven Control Digital Twin
Data-driven control DT models utilize decision-level data fusion, amalgamating outputs from diverse sources, models, or classifiers to formulate a comprehensive control scheme. Each source or model independently processes input data, producing individual decisions that are then synthesized into a final, robust control strategy. The primary characteristic of a data-driven control DT is its ability to leverage outputs and predictions from multiple models to formulate a control strategy that aims to achieve the best possible results in real-world applications. Within the context of PV systems, this DT model considers numerous control variables that affect different aspects of system performance, including adjustments to panel tilt angles, voltage control, and power stability. The data-driven control DT focuses on identifying the control decision that maximizes system performance.
Gugulothu et al. adopted Bayesian fusion for Maximum Power Point Tracking (MPPT) in PV systems [82]. The parameters of the model include solar irradiance, temperature, voltage, and current of the PV modules. Their approach combined the conventional incremental conductance algorithm with the Jaya optimization algorithm, each producing an individual MPP estimate. These estimates were assigned prior probabilities based on factors such as historical performance and reliability. The likelihoods of observing the actual system output given the MPP estimates were calculated, and Bayes' theorem was employed to compute the posterior probabilities. The final MPP estimate was obtained as a weighted average of the individual MPP estimates, with weights given by their corresponding posterior probabilities. This methodology enhances the accuracy and robustness of the MPP estimation, with a tracking time of less than 0.1 s and a tracking efficiency of 99.8%.
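The fusion step described above can be sketched in a few lines. This is a minimal illustration, not the implementation from [82]: the Gaussian noise model, the parameter `sigma`, and all function and argument names are assumptions introduced here for clarity.

```python
import math


def gaussian_likelihood(observed, predicted, sigma):
    """Likelihood of the observed output given a predicted output.

    Assumes Gaussian measurement noise (an illustrative choice, not from [82]).
    """
    z = (observed - predicted) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def bayesian_fuse(estimates, priors, predicted_outputs, observed_output, sigma):
    """Fuse MPP estimates by posterior-weighted averaging.

    estimates          -- MPP voltage estimate from each tracker (V)
    priors             -- prior probability assigned to each tracker
    predicted_outputs  -- power each tracker expects at its estimate (W)
    observed_output    -- measured PV power (W)
    sigma              -- assumed noise standard deviation (W)
    """
    likelihoods = [
        gaussian_likelihood(observed_output, p, sigma) for p in predicted_outputs
    ]
    # Bayes' theorem: posterior is proportional to prior times likelihood.
    unnormalized = [pr * lk for pr, lk in zip(priors, likelihoods)]
    evidence = sum(unnormalized)
    if evidence == 0.0:
        raise ValueError("all likelihoods underflowed; increase sigma")
    posteriors = [u / evidence for u in unnormalized]
    # Final estimate: average of the individual estimates, weighted by posterior.
    fused = sum(w * e for w, e in zip(posteriors, estimates))
    return fused, posteriors


# Hypothetical values: two trackers with equal priors.
fused, posteriors = bayesian_fuse(
    estimates=[30.0, 31.0],
    priors=[0.5, 0.5],
    predicted_outputs=[198.0, 200.0],
    observed_output=200.0,
    sigma=2.0,
)
```

With these hypothetical inputs, the second tracker's prediction matches the measurement exactly, so it receives the larger posterior weight and the fused estimate (about 30.62 V) lies closer to its value.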
4) Full Data Cycle Digital Twin
Full data cycle DT models integrate multiple levels of data fusion, managing data from raw sensor signals to the final control decision. Within this comprehensive framework, several simpler models are interconnected to create a data pipeline that streamlines and automates a series of computational processes. These processes typically include data preprocessing, feature extraction, and model inference. As an end-to-end model, the full data cycle DT model encompasses all core functionalities of the DT, such as simulation, monitoring, optimization, and decision-making. This holistic approach renders it highly effective for PV system modeling.
Radhakrishnan et al. proposed a full data cycle DT design for categorizing Power Quality Disturbances (PQDs) in PV-enhanced electrical grids [61]. The parameters of the model include the confidence factor assigned for pruning in the decision tree and the configuration of the ensemble classification model using 10-fold cross-validation. This framework was implemented in MATLAB-Simulink, and diverse types of PQDs were examined. In the initial processing phase, the discrete wavelet transform was applied to extract features from the various PQDs. These extracted features were then used to train the base classifiers, namely logistic regression, naïve Bayes, and the J48 decision tree, at the base tier. The outputs of these base classifiers were then used to train the meta-classifier at the next stage, which produces the final prediction. In noisy environments, the proposed model achieves up to 27.33% higher classification accuracy than the base classifiers. The use of different levels of data fusion in this meta-classifier model is illustrated in Fig. 8.
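To make the feature-extraction stage concrete, the following sketch computes wavelet energy features from a sampled waveform. It covers only the discrete wavelet transform step, not the stacked classifiers. The Haar wavelet, the energy-per-level features, and all function names are assumptions made here for illustration; the source does not specify the wavelet family or the exact features used in [61].

```python
import math


def haar_dwt(signal):
    """Single-level Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_energy_features(signal, levels):
    """Energy of the detail coefficients per level, then the final approximation energy.

    The signal length must be divisible by 2 ** levels.
    """
    features = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in current))
    return features


# A flat waveform has no detail energy; a short transient shows up at level 1.
flat = wavelet_energy_features([1.0] * 8, levels=2)
spike = wavelet_energy_features([0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 0.0, 0.0], levels=2)
```

Because the Haar transform is orthonormal, the features sum to the energy of the input signal, so each feature can be read as the share of signal energy in one frequency band. A feature vector of this kind is what the base classifiers would receive as input.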
DT models can not only monitor and analyze but also predict and optimize system performance in real time. With a digital twin, operators can simulate the impact of different operational strategies, predict the effects of upcoming weather conditions on energy production, and automatically adjust system parameters to maximize output [79]. The digital twin can also preemptively suggest maintenance or troubleshooting steps before issues become significant [81]. The following points illustrate how DT systems and data fusion work together to achieve better results:
Data Collection: DT systems depend on data procured from an array of sources, including sensors, databases, and other information systems. Data fusion is instrumental in amalgamating this heterogeneous data to generate a comprehensive and precise representation of the asset or system within the DT model [87].
Enhanced Precision: Data fusion methodologies aid in mitigating noise, errors, and discrepancies in the data gathered from disparate sources. This improvement in data quality bolsters the accuracy of DT models, resulting in superior predictions, decision-making, and optimization [88].
Real-time Analysis: DT and data fusion technology both facilitate real-time analysis and monitoring of assets or systems. Data fusion techniques guarantee the incorporation of pertinent and current information into the DT model, thereby enabling real-time modifications, predictions, and decision-making [89].
Augmented Decision-making: DT systems leverage data fusion to assimilate information from various sources, simplifying the process for decision-makers to access and decipher complex data. This consolidated information empowers them to make well-informed decisions concerning the administration, maintenance, and optimization of the physical asset or system [90].
However, the complexity and cost of implementing and maintaining a digital twin can be substantial, requiring advanced data analytics capabilities and continuous data flow, which might be resource-intensive for smaller operations or less critical applications [82].
Direction for Future Research
A comprehensive review of the literature indicates that DT technology holds significant potential for fostering improved integration and optimization within PV systems. This, in turn, could facilitate increased energy efficiency and greater adoption of renewable energy sources. As a result, it is strongly recommended that solar farms consider integrating DT into their control processes. Nonetheless, the current developmental stage of DT technology presents certain limitations. This section aims to outline the primary constraints identified during the review process and proposes directions for future research to address these challenges. Ultimately, the goal is to further advance the integration of DT technology with PV systems, thereby maximizing its potential benefits.
A. AI-Driven Decision Making
Of the DT systems evaluated, approximately two-thirds fall into the DM (38%) and DS (24%) categories. While these categories facilitate efficient monitoring and forecasting of PV performance, they inherently lack the capability for autonomous decision-making. Consequently, they cannot serve as the primary control mechanism for their corresponding physical entities.
To foster the evolution of fully autonomous PV systems, more advanced artificial intelligence methodologies must be integrated into the DT system to augment its decision-making capabilities. Xia et al. posited that “Reinforcement Learning (RL) has been deployed in the domain of process system engineering to effectively resolve some formidable optimal control challenges” [91]. The DT system creates a virtual representation of a tangible physical system, thereby providing an arena where the RL agent can hone its actions. The agent engages with the DT model by initiating actions, receiving feedback in the form of rewards or penalties, and updating its policy to improve future actions based on this feedback [92]. For example, in a PV-powered system, the RL agent could govern operational parameters to minimize energy consumption while optimizing output quality [63]. Matulis et al. suggested that “the fusion of digital twins and reinforcement learning provides numerous benefits, such as reduced training time and costs in physical space, or reduced risk of damage to an expensive physical test-bed” [93]. This combination of DT technology and RL forms a powerful instrument for optimizing and understanding intricate physical systems, with the added advantage of allowing real-time adaptation to changes in system dynamics.
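The agent-twin interaction loop described above can be illustrated with a toy example. The sketch below uses tabular Q-learning, which is one possible RL algorithm and is not named in the cited works. The twin model `twin_power`, the discretized sun elevations, the tilt settings, and all hyperparameters are hypothetical and chosen only to keep the example small.

```python
import math
import random

SUN_ELEVATIONS = [20.0, 45.0, 70.0]  # hypothetical sun elevation states (degrees)
TILT_ACTIONS = [20.0, 45.0, 70.0]    # hypothetical tilt settings (degrees from horizontal)


def twin_power(sun_elevation, tilt):
    """Toy digital-twin model: normalized output from the angle of incidence."""
    # The panel normal points at elevation (90 - tilt); output falls with the mismatch.
    mismatch = math.radians(sun_elevation - (90.0 - tilt))
    return max(0.0, math.cos(mismatch))


def train_agent(steps=30000, alpha=0.2, gamma=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning against the twin instead of the physical plant."""
    rng = random.Random(seed)
    n_states, n_actions = len(SUN_ELEVATIONS), len(TILT_ACTIONS)
    q = [[0.0] * n_actions for _ in range(n_states)]
    state = 0
    for _ in range(steps):
        # Epsilon-greedy action selection: explore occasionally, otherwise exploit.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: q[state][a])
        # The twin returns the reward; no physical panel is moved during training.
        reward = twin_power(SUN_ELEVATIONS[state], TILT_ACTIONS[action])
        next_state = (state + 1) % n_states  # the sun position simply cycles
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Best tilt setting for each sun elevation state according to the Q-table."""
    return [TILT_ACTIONS[max(range(len(row)), key=lambda a: row[a])] for row in q]


policy = greedy_policy(train_agent())
```

In this toy setting the learned policy pairs each sun elevation with the tilt that points the panel normal at the sun. All trial and error happens against `twin_power`, which reflects the benefit quoted above: exploration costs nothing on the physical side.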
B. Computational Requirement
The drawbacks of several data fusion methodologies stem predominantly from their heavy demand for computational resources when dealing with substantial data volumes. This need for considerable computational power not only requires significant investment but also complicates real-time operation within DT systems [40]. Any computational latency may introduce inaccuracies, as the model would then be operating on outdated sensor data. Accumulating over time, these inaccuracies could cause the DT model to lose synchrony with the physical counterpart it is designed to mimic.
This circumstance is especially critical for PV systems, which are highly sensitive to environmental variations; computational latency must therefore be kept to a minimum. Zhang et al. proposed edge computing, capable of facilitating shared computing resources at the network edge, as a dominant paradigm to address these computational demands [94]. Edge devices in the DT system can be controlled by an on-site microcontroller to manage panel activities and sensor data collection. However, Chang et al. pointed out that PV “models that can deliver high prediction accuracy typically require extensive computational and storage capacities at run-time, and generally underperform in edge computing systems with limited resources” [95]. Edge computing units, capable of executing signal filtering and lightweight algorithms, enable prompt responses to potential hazards while transmitting only relevant data to the cloud for further analysis [96]. This method effectively reduces the volume of data that must be transferred. The cloud environment, offering a larger pool of computational resources, can support decision-level data fusion, thus facilitating the implementation of advanced machine learning algorithms and enhancing the overall efficacy and precision of the system.
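As an illustration of the division of labor between edge and cloud, the sketch below shows a lightweight edge-side filter that smooths sensor readings and uploads a value only when it has changed meaningfully. The class name, the moving-average window, and the dead-band threshold are assumptions made for this example and are not taken from the cited works.

```python
from collections import deque


class EdgeFilter:
    """Smooths readings and forwards a value only when it changes meaningfully."""

    def __init__(self, window=5, deadband=10.0):
        self.samples = deque(maxlen=window)  # most recent raw readings
        self.deadband = deadband             # minimum change worth uploading
        self.last_sent = None

    def process(self, reading):
        """Return the smoothed value to upload, or None if nothing should be sent."""
        self.samples.append(reading)
        smoothed = sum(self.samples) / len(self.samples)
        if self.last_sent is None or abs(smoothed - self.last_sent) >= self.deadband:
            self.last_sent = smoothed
            return smoothed
        return None


# Hypothetical irradiance readings (W/m^2): steady sun, then a passing cloud.
edge = EdgeFilter(window=3, deadband=10.0)
readings = [800, 802, 801, 799, 600, 600, 600]
uploads = [v for v in (edge.process(r) for r in readings) if v is not None]
```

Here only four of the seven readings produce an upload: the first sample, and the three smoothed values that track the drop caused by the cloud. Small fluctuations around 800 W/m^2 stay on the edge device.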
C. System Security
The DT model of a PV system, being an essential component of the energy system, constitutes a vulnerable target for malicious cyber-attacks [97]. Any disruption or damage inflicted on the power supply could have extensive implications for individuals dependent on its steady operation. However, as Alcaraz et al. indicated, DT's “cybersecurity issues have not been sufficiently explored yet” [98], a statement underscored by the scant number of studies that integrate any security feature into their DT framework. Thus, maintaining security within the PV system's DT model emerges as a crucial challenge that demands prompt attention. To address this security issue, Yaqoob et al. identified that “blockchain possesses the potential to become the most relevant and capable technology to assure transparency, trust, and security in DTs” [99]. Integrating a DT system with blockchain technology requires a comprehensive understanding of the DT's structure, the identification of critical data points suitable for blockchain integration, and the selection of an appropriate type of blockchain. For a PV system, a private blockchain appears to be the most suitable choice due to its inherent advantages in security and reduced computational overhead [100]. The integration of smart contracts can enable the automation of environmental data processing based on predetermined conditions [101]. By adopting these measures, the overall aim is to enhance the security, traceability, and data integrity of the DT models.
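The data-integrity property that motivates blockchain integration can be shown with a minimal hash chain. This sketch shows only tamper evidence. It does not model consensus, access control, or smart contracts, all of which a real private blockchain would provide. The record fields and function names are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder hash preceding the first block


def block_hash(index, payload, previous_hash):
    """SHA-256 digest over the block contents and the previous block's hash."""
    record = json.dumps(
        {"index": index, "payload": payload, "prev": previous_hash}, sort_keys=True
    )
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a sensor record, linking it to the hash of the previous block."""
    previous_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append(
        {
            "index": index,
            "payload": payload,
            "prev": previous_hash,
            "hash": block_hash(index, payload, previous_hash),
        }
    )


def verify_chain(chain):
    """Recompute every hash; any edited record breaks the chain from that point."""
    previous_hash = GENESIS_HASH
    for index, block in enumerate(chain):
        if block["index"] != index or block["prev"] != previous_hash:
            return False
        if block["hash"] != block_hash(index, block["payload"], previous_hash):
            return False
        previous_hash = block["hash"]
    return True


# Hypothetical sensor records written by the DT data pipeline.
ledger = []
append_block(ledger, {"irradiance": 812.5, "module_temp": 41.2})
append_block(ledger, {"irradiance": 790.1, "module_temp": 41.0})
intact = verify_chain(ledger)
```

Altering any stored reading after the fact changes its recomputed hash, so `verify_chain` returns `False` and the manipulation becomes detectable.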
Conclusion
This critical and systematic review of DT technology and research, with a focus on the PV industry, has provided a comprehensive understanding of DT technologies. The review has illustrated the distinct advantages and limitations of different digital twin models by classifying them based on data connectivity and integration. It has also summarized how DT models are applied in various PV systems and introduced a framework for integrating data fusion into digital twin systems. The critical review of published works reveals that DM models are primarily used to optimize initial designs and predict performance under various conditions but are limited by their static nature and lack of real-time data adaptation. DS models offer real-time monitoring and performance analysis, aiding in maintenance and diagnostics, yet function only as passive observers without direct interaction with the system. In contrast, DT models deliver comprehensive benefits by monitoring, analyzing, predicting, and actively optimizing system performance in real time, although their implementation and maintenance costs are high because they require sophisticated data analytics and continuous data integration.
The review has also identified key challenges that must be addressed to promote the widespread adoption of DT technology in the PV industry. Future research should focus on overcoming decision-making, computation, and security challenges. By addressing these obstacles and refining the proposed framework, the PV industry can capitalize on the benefits offered by DT and data fusion technologies, ultimately paving the way for a more sustainable, efficient, and reliable energy landscape.