Introduction
In the era of Industry 4.0, the cyber-physical system (CPS) has become a central object of investigation in both industrial and academic domains [1]. From the viewpoint of system architecture, a CPS builds a bridge between the virtual and the physical dimensions. As a transdisciplinary topic, CPS is receiving extensive research interest across a wide spectrum of subjects, including industrial design, industrial technologies, computer science, electrical engineering and so forth. While cyber-physical systems exist widely in both the industrial domain and human daily life, this review studies the advances in industrial CPS (ICPS), with a special focus on novel approaches to system monitoring, fault diagnosis and control.
Typical examples of industrial CPS include the smart grid, the intelligent factory and intelligent traffic systems, where the physical entities and the virtual cyber world benefit from their mutual association [2]–[4]. A major new challenge is that the traditionally separate problems regarding hardware, software and networked systems become correlated [5], [6]. As a result, feasibility analysis, robustness analysis, performance evaluation and performance optimization of ICPS monitoring and control strategies must be investigated at an overall system level.
In addition to interpreting the abstract concept of ICPS, this review discusses recent advances in the light of the practical challenges in the design of ICPS monitoring and safety control systems, as well as the future horizons.
Figure 1 shows the trend of research focus in recent years. Indexed by the Web of Science core collection, the statistics of the past five years cover the high-quality research literature under the themes of cyber-physical systems (CPS), industrial CPS (ICPS), Industry 4.0 and the Internet of Things (IoT), as well as the application-oriented themes, including the smart grid, the intelligent factory (intelligent manufacturing) and autonomous vehicles. It can be seen that the research output in most topics has increased continuously, with the number of journal publications on IoT surging past 8,000 in 2017. As a subtopic of CPS, ICPS has emerged since 2013 and has received increasing attention ever since. Among the application topics, the smart grid received the most investigation first, with over 3,000 SCI core collection publications per year since 2014. Research on autonomous vehicles has shown steady growth since 2014, with over 2,600 papers in 2017. By contrast, it is surprising that the studies related to the intelligent factory and intelligent manufacturing have produced the fewest publications among the investigated research areas during the past five years.
Figure 2 depicts the indexing results of IEEE publications. The journal and magazine papers reflect the recent advances in different subjects, the conference publications indicate the popularity of the topics in the IEEE community, and the number of published books reveals how systematically they have been studied. The plots are essentially in agreement with the above analysis based on the Web of Science, except that the IEEE publications on the intelligent factory notably outweigh those on ICPS. The smart grid and IoT are the most systematically studied, whereas few books on Industry 4.0 and ICPS can be found. It is also interesting to note that autonomous vehicle related research is the hottest topic in terms of the volume of conference papers.
This work is dedicated to producing a systematic and thorough review, and its contributions are as follows. Firstly, the current status of the theoretical research and the applications of ICPS monitoring and control are investigated. Secondly, the novel sensing techniques that provide the primary information used in the data-driven approaches are categorized and compared. Thirdly, recent advances in data-driven monitoring, fault diagnosis and control approaches, as well as their applications to the smart grid and autonomous vehicles, are reviewed in detail. Fourthly, key challenges and future research directions are pointed out.
The remainder of the paper is organized as follows. The next section summarizes popular sensing devices and novel techniques. In Section III, the recent advances in the data-driven ICPS monitoring and fault diagnosis are reviewed, with a special focus on smart grid applications. In Section IV, recent progress in data-driven control strategies is reviewed. Based on the analysis of the requirements in ICPS applications given in Section V, Section VI proposes the key challenges and significant topics of future research. The last section concludes the paper.
Intelligent Sensing Techniques
A. Smart Sensors
In most cyber-physical systems, sensing or perception acts as an important process of interacting with the external environment, through which the current status of the system is represented with quantified features. From the viewpoint of multi-agent systems, each intelligent device can be regarded as an agent that possesses some computing power and autonomous capabilities, and that shares data and information that other agents cannot observe.
Smart sensors are sensing devices with digitalization ability and digital information processing functionalities [7]–[9]. They generally consist of a microprocessor, electronic circuits and I/O interfaces. Compared with traditional sensors, smart sensors have greater capabilities for data processing, storage and information transformation. Useful functions such as automatic calibration, zero correction and scaling of the measured signals are achieved by the microprocessor, and thus do not require as much meticulous electronic design, debugging and testing as traditional analog sensors do [10]–[12]. In addition, efficient communication and flexible networking through standardized I/O interfaces are also essential features of smart sensors [13].
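As a minimal illustration of these on-chip functions, the following Python sketch shows how automatic zero correction and scaling of a raw reading could be implemented in a smart sensor's processing logic; the sensor class, gain value and readings are purely hypothetical and not tied to any particular device.

import statistics

class SmartSensorChannel:
    """Hypothetical smart-sensor channel with zero correction and scaling."""

    def __init__(self, gain, engineering_offset=0.0):
        self.gain = gain                      # scaling factor: engineering units per raw count (assumed)
        self.engineering_offset = engineering_offset
        self.zero_offset = 0.0                # learned during automatic calibration

    def calibrate_zero(self, raw_samples):
        # Automatic zero correction: average several raw readings taken
        # while the physical input is known to be zero.
        self.zero_offset = statistics.mean(raw_samples)

    def read(self, raw_value):
        # Scale the zero-corrected raw count into engineering units.
        return (raw_value - self.zero_offset) * self.gain + self.engineering_offset

# Usage: calibrate with idle readings, then convert raw counts to engineering units.
channel = SmartSensorChannel(gain=0.01)       # e.g., 0.01 bar per count (assumed)
channel.calibrate_zero([12, 11, 13, 12])
pressure = channel.read(512)                  # (512 - 12) * 0.01 = 5.0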
B. Frequency Disturbance Recorder (FDR)
Frequency disturbance recorders have been developed as plug-and-play (PnP) devices that access the power grid simply by plugging into 110 V or 220 V power outlets. They collect frequency data in real time and send them to the information management system through long-distance communication (Ethernet) for dynamic monitoring. The first generation of frequency disturbance recorders is based on the GPS timing mechanism, while the second generation is based on Internet timing synchronization.
C. SCADA
SCADA is the abbreviation of “supervisory control and data acquisition”. A SCADA system is a computer-based production process control and dispatching automation system that monitors and controls field devices. Applications of SCADA systems include real-time process monitoring, such as oil refining and water treatment processes, as well as critical urban infrastructure such as the power grid and transportation systems [14]. Third-generation SCADA systems can be deployed over long physical distances and are designed for distributed control tasks and large-scale networked monitoring systems. In fourth-generation SCADA systems, open network protocols play a central role, enabling cloud computing and Internet-of-Things (IoT) technologies to be employed to increase the robustness and flexibility of the target systems [15]. However, these open configurations increase security vulnerability and expose the monitoring systems to various potential cyber-attacks.
D. Soft Sensors
Soft sensors are software libraries or algorithms that achieve state estimation and key-performance-indicator prediction. In the data-driven process monitoring and fault diagnosis framework, soft sensing approaches have been developed based on data-driven observers, multivariate analysis (MVA) and machine learning techniques.
Over the past decade, the relationship between parity vectors and Luenberger type observers has been extensively studied [16]. It has been revealed that there is a one-to-one mapping from an arbitrary parity vector to a normalized diagnostic observer once the observer dimension is fixed. Using historical data rather than demanding knowledge of the system's mechanistic model, the “parity space design–observer realization” scheme greatly reduces the design effort of the diagnostic observer, which is the core of constructing a data-driven state observer. It can also be guaranteed that the observer achieves unbiased state tracking and unbiased output estimation for LTI systems.
For nonlinear systems and large-scale complex systems, the observer based approaches lose their advantages. In these circumstances, learning based schemes show better performance and robustness against noise. Just-in-time learning (JITL) and deep neural networks are examples of the current research focus of soft sensing for nonlinear systems, with robustness designed against missing data, infrequently measured key performance indicators (KPIs) and stochastic noise [17]–[20].
In contrast to real-time sensing, which corresponds to soft measurement, predictive soft sensing can “measure” variables that traditional sensors cannot. This is especially meaningful for KPI-oriented process monitoring and high-level production scheduling. For instance, in the steel production industry, the quality of a steel plate rolled out of the rollers cannot be measured instantly due to the extremely high temperature; it usually takes a period of time before the product is finally evaluated. Predictive soft sensing provides an approximation of the significant evaluation indices before the real experimental tests are completed. To this end, various KPI prognosis approaches have been proposed and applied to wine production, chemical reaction processes and battery health monitoring [21]–[26]. Among these approaches, the modified partial least squares (PLS) based ones are effective in terms of robust KPI prognosis and KPI-oriented fault detection, since they properly address the issues of high dimensionality and collinearity. These algorithms have been realized in a recently published open-source Matlab toolbox called the data-based KPI-oriented fault detection toolbox (DB-KIT).
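As a minimal sketch of the basic idea behind PLS-based KPI soft sensing (using scikit-learn's PLSRegression on synthetic data, not the modified algorithms implemented in DB-KIT), the following example trains a PLS model offline on collinear process variables and then predicts the KPI before a slow quality evaluation would be available; all dimensions and variable names are illustrative.

import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)

# Synthetic training data: 200 samples of 10 collinear process variables X
# and one KPI y driven by a low-dimensional latent structure.
T = rng.normal(size=(200, 3))                        # latent variables
X = T @ rng.normal(size=(3, 10)) + 0.05 * rng.normal(size=(200, 10))
y = T @ np.array([1.0, -2.0, 0.5]) + 0.05 * rng.normal(size=200)

# Offline training: a few latent components capture the input-output
# correlation while coping with collinearity among the process variables.
pls = PLSRegression(n_components=3)
pls.fit(X, y)

# Online use: predict the KPI for new process measurements before the
# (slow) laboratory evaluation of the product becomes available.
y_hat = pls.predict(X[:5]).ravel()
print("predicted KPI for first 5 samples:", y_hat)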
Advances on Data-Driven Monitoring and Fault Diagnosis
A. Threats in Industrial CPS
Industrial CPSs are exposed to more severe threats from multiple sources than traditional industrial systems. This part briefly summarizes the different categories of threats with supporting real-world cases.
Both external and internal threats endanger industrial CPSs. Typical external threats include hacking activities, advanced persistent threat (APT) attacks and hardware-targeted attacks, while internal threats can be triggered by an employee or an external contractor, whether intentionally or not.
While external attacks mostly exploit the systems' vulnerabilities, internal attacks access the central control units much more easily and can cause major breakdowns of operations. A well-known case involves a disaffected employee attacking a waste water treatment control system [27], [28]. The employee had worked for a company that provided SCADA system installation services for the Queensland Maroochy Shire Council. By hacking into the control system and issuing unauthorized instructions, he caused 800,000 litres of sewage to flow into local parks, rivers and even hotel grounds, resulting in serious damage to the environment.
Human error is also one of the biggest threats to industrial control systems, despite being unintentional in most cases. For example, investigations showed that the Three Mile Island nuclear accident started from negligence during maintenance of cleaning equipment [29]; the reactor core was severely damaged, and the critical sequence of events took only 120 seconds. Reports graded this event as a Level 5 nuclear accident, caused by a series of erroneous human operations and the failure to deploy an effective fault diagnosis system.
The unsafe operation of complex industrial systems has caused huge losses of life and property. In 1993, an explosion at the Beilun power plant in Zhejiang, China led to 22 deaths and 8 serious injuries. In 2005, an explosion at the PetroChina Jilin Petrochemical Corporation resulted in serious environmental pollution, and the water supply to 9 million people in the downstream city was cut off for five days. The large scale and networked structure of modern industry also magnify local, minor failures through the control loops [30]. Based on the above facts, reliable process monitoring and fault diagnosis design schemes that can be applied to large-scale processes are urgently needed.
B. Data-Driven State Estimation
Traditionally, state estimation approaches rely on a system model with known structure and parameters, and are well developed within the framework of state observers [31]. For instance, the Kalman filter is recognized as a powerful estimator for linear time-varying systems with stochastic noise. However, the a priori knowledge these approaches require is often unavailable, in which case data-driven state estimation methods become necessary.
Multivariate statistical analysis (MVA) based and subspace-aided approaches are two typical alternatives. For smart grid state estimation, a notable property is the frequent and significant changes in power generation and power consumption [32]. Compared with the traditional power grid, power generation and consumption in the smart grid are uncertain in nature because (i) new energy sources, such as wind and solar energy, are hooked up to the grid and are strongly influenced by natural conditions; (ii) newly introduced scheduling of power in the large-scale grid redistributes the power generation requirement; and (iii) plug-and-play devices such as hybrid electric vehicles bring variability to the energy consumption end. Considering these facts, the least squares based state estimation approach suffers from the “local optimum problem”, where the current state is not a suitable initial guess for the next state [33]. Likewise, gradient descent based approaches become invalid in these cases. To deal with this problem, recursive tracking techniques of the statistical properties based on first-order perturbation are used for continuously varying processes [34]. For processes with abrupt changes and strong nonlinearity, locally weighted projection regression (LWPR) based approaches show better performance and robustness.
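For reference, the classical weighted least squares estimator that the above discussion takes as its baseline can be sketched as follows on a linearized measurement model z = Hx + e; the measurement matrix, noise levels and dimensions are purely illustrative.

import numpy as np

rng = np.random.default_rng(1)

# Linearized (DC) measurement model: z = H x + e, with x the bus state
# vector and H the measurement Jacobian (both purely illustrative here).
n_states, n_meas = 4, 10
H = rng.normal(size=(n_meas, n_states))
x_true = rng.normal(size=n_states)
sigma = 0.02 * np.ones(n_meas)                  # meter noise standard deviations
z = H @ x_true + sigma * rng.normal(size=n_meas)

# Weighted least squares estimate: x_hat = (H^T W H)^{-1} H^T W z,
# with W the inverse noise covariance.
W = np.diag(1.0 / sigma**2)
x_hat = np.linalg.solve(H.T @ W @ H, H.T @ W @ z)

# Residual used by bad data detection (BDD): r = z - H x_hat.
r = z - H @ x_hat
print("estimation error:", np.linalg.norm(x_hat - x_true))
print("residual norm   :", np.linalg.norm(r))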
C. Unobservable Attacks
While some researchers are dedicated to designing reliable monitoring systems against disturbances and uncertainties, others try to reveal the defects of existing fault diagnosis systems by proposing potential attacking schemes, among which the low-sparsity unobservable attack is one of the most challenging types in anti-hacker-attack practice. A low-sparsity unobservable attack refers to an adversarial false data injection method that tampers with only a few measurement variables. A successful attack does not trigger alarms when traditional monitoring and fault detection approaches, such as bad data detection (BDD), are used [35]. Specifically, the monitoring and control of the smart grid require a timely update of the system status, which is determined from meter data collected in real time. There is a potential security hazard for the smart grid when polluted measurements from unprotected and easily attacked meters are used to monitor and manage the grid, because the false data can mislead the SCADA system into making wrong decisions. It has been revealed that leaked data from only a small subset of meters (low sparsity) suffice to launch an unobservable attack on the smart grid, and it is of vital importance to prevent this from happening.
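The principle behind such attacks can be illustrated with a short sketch in the same linearized setting as the state estimation example above: an injected vector of the form a = Hc shifts the estimated state by c but leaves the least squares residual, and hence the BDD alarm statistic, unchanged. This is only an illustration of the unobservability argument; the attack vector here is dense, whereas the low-sparsity attacks discussed above additionally restrict the tampering to a few meters.

import numpy as np

rng = np.random.default_rng(2)

# Same linearized model as in the state estimation sketch: z = H x + e.
n_states, n_meas = 4, 10
H = rng.normal(size=(n_meas, n_states))
x_true = rng.normal(size=n_states)
z = H @ x_true + 0.02 * rng.normal(size=n_meas)

def residual_norm(z, H):
    # Unweighted least squares here for brevity; the argument is identical
    # for the weighted estimator.
    x_hat, *_ = np.linalg.lstsq(H, z, rcond=None)
    return np.linalg.norm(z - H @ x_hat)

# False data injection of the form a = H c: the attacker shifts the
# estimated state by c, but the residual (and hence the BDD alarm
# statistic) is exactly the same as in the attack-free case.
c = np.array([0.5, -0.3, 0.0, 0.2])             # illustrative shift of the state estimate
a = H @ c
print("residual without attack:", residual_norm(z, H))
print("residual with attack   :", residual_norm(z + a, H))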
Several works have reported potential unobservable attack schemes, which provide references for system safety design before an actual attack exploits these defects of the monitoring and fault diagnosis systems. Reference [36] proposed a subspace-aided unobservable attack approach using partial measurement data. It was also shown that although such an attack triggers an alarm with the existing fault diagnosis systems, it “launders” the injected data and incriminates the normal data.
D. Data-Driven Fault Diagnosis
Traditionally, industrial systems operating in steady state are monitored with univariate statistical analysis methods or signal processing methods. However, these methods fail to explore the patterns and correlations among the variables, knowledge that is essential for revealing malfunctions in the systems. This limitation is even more noticeable for modern large-scale complex systems with multi-loop, multi-level coupling and correlated features [34]. If these methods are applied in such systems, considerable effort has to be spent watching multiple monitoring screens, which introduces potential threats to the safety and reliability of the system.
In recent years, multivariate statistical analysis (MVA) based process monitoring and fault diagnosis approaches have been extensively investigated. The corresponding system design and realization follow an “offline training—online implementation” procedure [37]. Principal component analysis (PCA) and PLS are two representative bases of MVA monitoring and diagnosis schemes [38]. For PCA, it is assumed that there exists a lower-dimensional principal component space that represents the significant variance information of the whole dataset. This assumption enables the monitoring of a lower-dimensional principal component space rather than the high-dimensional, mutually coupled variables. In PLS based schemes, an optimal decomposition of the measurement space is achieved in the sense of maximizing the correlation between the input and output variables [39].
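A minimal sketch of the “offline training—online implementation” procedure for PCA-based monitoring is given below, using the Hotelling T² and SPE (Q) statistics; for brevity the control limits are set from empirical quantiles of the training data rather than the distribution-based limits normally used, and all data are synthetic.

import numpy as np

rng = np.random.default_rng(3)

# Offline training: normal operating data (500 samples, 6 correlated variables).
latent = rng.normal(size=(500, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(500, 6))
mean, std = X.mean(axis=0), X.std(axis=0)
Xs = (X - mean) / std

# PCA via SVD of the scaled data; keep k principal components.
U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
k = 2
P = Vt[:k].T                                   # loading matrix
lam = (S[:k] ** 2) / (Xs.shape[0] - 1)         # retained eigenvalues

def t2_spe(x):
    xs = (x - mean) / std
    t = P.T @ xs                               # score vector
    t2 = t @ np.diag(1.0 / lam) @ t            # Hotelling T^2
    res = xs - P @ t                           # residual part
    return t2, res @ res                       # (T^2, SPE/Q statistic)

# Control limits from training-data quantiles (illustrative only).
stats = np.array([t2_spe(x) for x in X])
t2_lim, spe_lim = np.quantile(stats, 0.99, axis=0)

# Online monitoring of a new (here: faulty) sample.
x_new = X[0] + np.array([0, 0, 3.0, 0, 0, 0])  # additive sensor fault on variable 3
t2, spe = t2_spe(x_new)
print("alarm:", t2 > t2_lim or spe > spe_lim)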
In practice, for systems where the process dynamics cannot be omitted, the performance of the basic MVA based process monitoring and fault diagnosis schemes is seriously affected [40], [41]. Specifically, the normalization procedures are no longer suitable for such processes due to the time-varying nature of the variables' statistics, such as the mean and variance. As a result, the estimates are biased and the interpretability of these statistics is weakened. To handle this problem, a state space representation of the dynamic process is used to characterize the necessary dynamics. In this context, data-driven analytical redundancy is used to describe the process and to generate virtual estimation signals, which serve the subsequent fault diagnosis procedures as well as performance analysis and evaluation [42]. This can also be regarded as a reformulation of the fault diagnosis design objective: variation detection by comparing and evaluating the difference between the real plant and its analytical redundancy. Furthermore, by means of the so-called stable kernel representation (SKR), it is possible to bridge the observer based methods with their data-driven realizations [43].
Approaches based on the fault detection filter (FDF), the diagnostic observer (DO) and the Kalman filter have been designed for systems corrupted with deterministic disturbances and process faults [44], [45]. The observer based residual generation schemes are in a closed-loop configuration with a feedback gain to be designed. Through a proper choice of this gain, the residual dynamics can be given an arbitrary convergence rate for any initial value while the effects of model uncertainties are suppressed. Robust approaches against unknown input signals such as disturbances have been proposed based on the Kalman filter. It should be noted that in data-driven process monitoring and fault diagnosis, the parameters of the mechanistic models are not known a priori, which means that the system matrices in the Luenberger equations are unavailable for the design of the feedback gain.
From the sampled-data point of view, the observed variables represented in time series form can be transformed into a compact form in which only one sample of the historical state variables is of concern [47]. By stacking the state variables into this compact data form, the so-called parity relation is established. In order to achieve precise estimation of the outputs and force the residual signals to converge to zero under fault-free conditions, the term containing the state variables is eliminated by a parity vector orthogonal to its coefficient matrix. All the qualified parity vectors constitute the parity space, which is the null space of the coefficient matrix of the state variables. Therefore, the selection of the parity vector used for residual generation becomes an important research focus: the goal is to choose parity vectors that lead to fault detectors sensitive to the faults while remaining robust to uncertainties and disturbances. Other research efforts in this area are dedicated to reducing the computational load by dimensionality reduction [48] and to dealing with nonuniform sampling issues [49].
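The parity relation can be illustrated with the following sketch for a known LTI model (in the data-driven setting the same null space is identified from input–output data instead of being computed from the model); the system matrices, window length and fault scenario are illustrative.

import numpy as np
from scipy.linalg import null_space

# Illustrative LTI model x(k+1) = A x(k) + B u(k), y(k) = C x(k) (SISO, D = 0).
A = np.array([[0.8, 0.1], [0.0, 0.9]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
s = 3  # parity window length

# Extended observability matrix Gamma_s and the Toeplitz matrix H_s of
# Markov parameters appearing in the stacked relation y_s = Gamma_s x + H_s u_s.
Gamma = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(s + 1)])
Hs = np.zeros((s + 1, s + 1))
for i in range(1, s + 1):
    for j in range(i):
        Hs[i, j] = (C @ np.linalg.matrix_power(A, i - j - 1) @ B).item()

# Parity space: left null space of Gamma_s; every row of V is a parity vector.
V = null_space(Gamma.T).T

# Simulate the plant and evaluate the residual r(k) = V (y_s(k) - H_s u_s(k)).
rng = np.random.default_rng(4)
x = np.zeros(2)
u_hist, y_hist, res = [], [], []
for k in range(60):
    u = rng.normal()
    y = (C @ x).item() + (0.5 if k >= 40 else 0.0)   # additive sensor fault from k = 40
    x = A @ x + B.flatten() * u
    u_hist.append(u)
    y_hist.append(y)
    if k >= s:
        y_s = np.array(y_hist[k - s:k + 1])
        u_s = np.array(u_hist[k - s:k + 1])
        res.append(np.linalg.norm(V @ (y_s - Hs @ u_s)))

print("max residual before fault:", max(res[:30]))
print("max residual after  fault:", max(res[-10:]))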
Advances in Data-Driven Controller Design
Industrial cyber-physical systems with a hierarchical control architecture generally consist of four levels: the component level, the control loop level, the functional subsystem level, and the plant-wide decision-making level. The component and control loop levels have been the subjects of traditional control theory and system identification [30], [50]. The upper decision-making levels have access to cyber resources and are therefore more “intelligent”; by contrast, the lower levels have to complete their tasks under the specifications issued by the upper levels. This part reviews two categories of fault-safe approaches applicable to the lower levels and to the higher levels, respectively.
A. Plug-And-Play Control
For existing control systems, especially those with encapsulated modules in complex control loops, equipping them with fault-tolerant capability requires embedding additional monitors and controllers without modifying the existing ones. This requires the monitoring systems to have strong scalability and modularity. Plug-and-play (PnP) control aims at designing control strategies that add new controller modules while retaining the existing ones in use (such as the widely deployed PID controllers) [51], [52]. In practice, continuously operating processes and live systems are frequently maintained, and redesigning the whole system in response to small changes is usually not feasible due to cost and commissioning concerns. In this sense, PnP strategies are among the most practically applicable alternatives for improving the systems' long-term performance and optimizing the set-point configuration according to the instructions from the upper decision-making levels.
In [53], an advanced PnP process monitoring and control architecture (PnP-PMCA) was proposed, which is an integrated implementation of process monitoring and control with a scalable structure and modularized components. By introducing the Youla parameterization, all stabilizing controllers are represented in a uniform formula [54], which enables a smooth transition from the original system to the modified system with guaranteed internal stability. Based on these studies, fundamental schemes have been proposed for observer-based residual generator design and online configuration. In [53], a plug-in process monitoring module is developed. In order to achieve self-configuration, adaptive and iterative online configuration approaches are proposed in [55]. The adaptive approach has a high convergence speed but carries a large online computational load at each sampling instant. Compared with the adaptive approach, the iterative scheme avoids the numerical sensitivity problem and significantly reduces the online computational load, at the price of a lower convergence speed. Considering the fact that industrial processes are generally corrupted by unknown disturbances, a reliable process monitoring scheme is developed in [56] for stationary processes to ensure the monitoring performance.
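To make the idea of a uniform formula concrete, consider the textbook special case of the Youla parameterization for a stable plant $P$ (the general case relies on coprime factorizations of the plant; this is an illustration of the principle rather than the specific construction in [54]): all stabilizing controllers can be written as

$$K(Q) = Q\,(I - P\,Q)^{-1}, \qquad Q \text{ stable},$$

so that the closed-loop transfer function from reference to output equals $P\,Q$, which is affine in the free parameter $Q$. Plugging in or reconfiguring a module then amounts to modifying $Q$, and internal stability is preserved as long as $Q$ remains stable.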
Another challenging problem is the online configuration of the controllers to accommodate faults while retaining system stability and performance. For data-driven fault-tolerant control, some passive control schemes have been developed with the aid of predesigned fault scenarios [57]. However, passive control improves the system robustness at the cost of additional conservativeness, which limits the online performance. In recent years, active fault-tolerant control schemes have attracted extensive attention. Badihi et al. [58] extended the attention to adaptive control strategies for actuator faults. The work in [59] is one of the first to propose an active fault-tolerant scheme for uncertain strict-feedback nonlinear systems. Kamal et al. [60] proposed a multi-observer switching control strategy for robust active fault-tolerant fuzzy control of variable-speed wind energy conversion systems.
B. Reinforcement Learning Aided Control Strategies
Reinforcement learning (RL) is an experience based approach for obtaining a (near) optimal control policy using only data obtained from actual observations [61]. Without an explicit model of the system dynamics, RL keeps updating the value function to achieve a better approximation, based on which actions are selected that pursue the long-term goal formulated by the accumulated future rewards.
For vehicle control problems, one of the major difficulties of applying RL to learn a controller from scratch lies in the oversized state space. Traditional “tabular approaches” estimate the value function separately for every discrete state (or state–action pair), which becomes intractable when the state space is continuous or very large.
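As a minimal illustration of such value-function updates, the following generic tabular Q-learning sketch (not any of the cited schemes) stores one value per visited state–action pair in a dictionary and updates it from each observed transition; the ε-greedy rule shows the exploration–exploitation trade-off mentioned below.

import random

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.95):
    # One Q-learning step: move Q(s, a) toward the bootstrapped target
    # r + gamma * max_a' Q(s', a') built from the observed transition.
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)

def epsilon_greedy(Q, s, actions, eps=0.1):
    # Balance exploration (random action) and exploitation (greedy action).
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q.get((s, a), 0.0))

It is exactly this one-entry-per-state table that becomes intractable for the continuous, high-dimensional state spaces of vehicle control, which motivates the function approximation based directions reviewed next.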
Research towards solving the above-mentioned challenges falls into two directions. The first direction focuses on Q-learning and actor-critic learning, where the convergence conditions must be carefully analysed and the balance between exploration and exploitation must be addressed.
In [61], a Takagi-Sugeno fuzzy inference system was constructed to generate control signals, and a
The second direction is more straightforward, since it treats RL as an optimization tool for tuning the undetermined parameters of existing stabilizing controllers. In this scenario, the parameterization of the stabilizing controllers and the definition of the optimization goals play the central role. The advantage is that the system's stability can be guaranteed, so the only remaining concern is the controller's performance. Among the possible parameterizations, the Youla parameterization has gained extensive research attention and is expected to be integrated with fault-tolerant controllers.
Practical Requirements and Current Status of ICPS Applications
A. Smart Grid
The power grid is the electric network that delivers electrical energy to users. In the broad sense, the power grid consists of all the connections between the power generation end and the customer end. The major functions of the traditional power grid include power transmission, power transformation and power distribution. Real-time monitoring of such large-scale complex systems requires robust and efficient approaches, as well as reliable measurement devices and communication networks.
To manage the huge power grid system, measurement devices are installed at a great number of terminals to measure and inspect the real-time voltage, current and frequency. This open-loop configuration is incapable of achieving grid-wide fault diagnosis and energy scheduling. It should be noted that, unlike in water supply systems, electrical energy is difficult and expensive to store: large-scale electricity storage is not an economical option for shifting surplus generation from off-peak hours to cover the demand at peak hours, and it relies heavily on the power infrastructure. How to balance the power generation capacity and guarantee the supply for heavy loads at peak hours is therefore essential from both environmental and economic perspectives.
Smart grid development not only aims at including more supervisory equipment to enhance the observability and controllability of the network, but also at integrating various forms of energy sources [63]. Compared with traditional power grids, smart grids have the following features:
Network scales are enlarged with better expandability. In the CPS framework, local power plants are treated as input terminals that can serve globally; in other words, their energy can be delivered to remote customers with fewer border limitations. In addition, a variety of new energy sources and types, such as wind power and solar energy, are hooked up to the grid.
Interconnection is more complex. To achieve efficient transmission, more city-wide connections will be established, and as a consequence the topological structure will change greatly. The number and locations of agents will change frequently.
Grid-wide scheduling with closed-loop control is integrated. Based on the real-time information provided by the smart monitoring devices, it is easier to redistribute the available power resources.
More potential safety hazards exist. The smart sensors are physically decentralized and therefore more vulnerable to hostile attacks, electrical substations at key intersections are responsible for multiple domains, and the closed-loop configuration introduces stability issues. New monitoring and fault diagnosis schemes are therefore required against hardware failures and cyber malfunctioning (such as false data injection).
B. Constrained Control
Another challenge in the controller design task lies in the constrained control problem under system dynamics variations caused by external environment changes [64]. Regarding autonomous driving, the newly introduced difficulty of this problem mainly results from the contradiction between the demanding high-level commands and the limited ability of low-level execution. In particular, the control objectives are set in real time by the top decision-making levels of the autonomous system, which handle the transition from the perception of external environment conditions to internal responses that adapt to various working conditions. By contrast, the control loops and actuation components at the bottom levels must guarantee the system's stability while trying to achieve the tracking performance required by the upper levels, which may be unachievable.
A reasonable balance and coordination across the hierarchical cyber-physical system should therefore be achieved. To this end, research on output-constrained control was carried out to deal with the path following problem in [65], and trade-off strategies for over-actuated vehicles using control allocation laws were proposed to improve the operational efficiency and motion control performance of the vehicle in [66] and [67].
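A minimal sketch of the control allocation step for an over-actuated vehicle is given below: a desired virtual command is distributed among redundant actuators subject to their saturation limits by solving a bounded least squares problem with scipy.optimize.lsq_linear; the effectiveness matrix, command and bounds are invented for illustration and do not come from [66] or [67].

import numpy as np
from scipy.optimize import lsq_linear

# Virtual command v (e.g., desired total force and yaw-related effort) and
# actuator effectiveness matrix B for an over-actuated vehicle (illustrative).
B = np.array([[1.0, 1.0, 1.0, 1.0],
              [0.8, -0.8, 0.8, -0.8]])
v = np.array([2.5, 0.4])

# Control allocation: find actuator commands u within their saturation
# limits such that B u approximates v in the least squares sense.
u_min, u_max = -1.0, 1.0
sol = lsq_linear(B, v, bounds=(u_min, u_max))
u = sol.x
print("allocated actuator commands:", u)
print("allocation error:", np.linalg.norm(B @ u - v))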
C. Intelligent Factory
Conventionally, the production lines in factories are yield oriented: given the desired specifications of the final product quality, production quantity and efficiency are the top priorities. By contrast, the new generation of factories based on ICPS aims to achieve plant-wide intelligent manufacturing and mass customization, with the ultimate goal oriented to the market. This requires increased flexibility among the departments and units so that data can be shared across the levels of the multi-hierarchical architecture.
Qingdao Haier Mould Corporation is the largest mould and fixture manufacturer in China. It is equipped with the world's leading mould design, analysis and processing software, as well as various kinds of high-speed machining centers, spark machines, wire cutting machines and many other types of professional equipment. To achieve a deep fusion of informatization and industrialization, Haier Mould Corporation began building a smart factory based on a CPS management system in 2013.
The core of the intelligent factory is the integrated ICPS collaboration and management system, as shown in Figure 5. At the first stage, the production plans are input to the collaborative platform based on higher-level decisions. When the processing tasks are updated, program preparation, tool preparation and electrode preparation are carried out in parallel. In the meantime, status information of the different subsystems is transmitted to the corresponding sectors in charge. Then, automatic checking and automatic calibration are carried out before the raw material is pushed to the machine tools. Throughout the process, the status of the digital machine tools is fed back to the control and management center. The core features of the intelligent mould production factory are listed below.
Deep fusion of information software and manufacturing equipment.
Multi-sectoral collaboration and management.
Decision support based on big data analysis.
Improved OEE (Overall Equipment Effectiveness).
Figure: Schematic of the idea and realization of PnP control. (a) Existing control system. (b) Controller cannot be modified due to encapsulation. (c) PnP architecture. (d) PnP realization.
D. Intelligent Traffic
The traditional research topics of traffic engineering are extremely complicated due to simultaneous requirements from various subjects, including systems science, engineering, laws and regulations, etc. There are too many constraints for the system to fulfil them all, so in the general case some factors have to be compromised [68]. Intelligent traffic is the central development direction of future traffic engineering. Oriented to the goals of high transportation capacity, a low traffic accident rate, low energy consumption and more economical transport, intelligent traffic systems should make full use of the newly built information infrastructure and vehicles (the hardware aspect), as well as advanced prediction approaches and scheduling strategies (the software aspect) [68].
Over the past decade, the market has promoted the application of a series of smart terminals and devices serving information digitalization, data acquisition, open-loop monitoring, etc. One of the most successful applications is the vehicle license plate recognition system, which uses reliable pattern recognition based image processing techniques [69]. With the recognized plate numbers, information on vehicles entering a certain area can be collected automatically. Such systems are also used in automatic enforcement cameras that record offending vehicles, whose data are sent to the public security website for self-service fine payment.
For logistics companies, freight volume prediction and route planning algorithms have also been developed to maximize the efficiency-cost ratio [70], [71]. They predict how busy the highways and lanes will be, so that most traffic congestion can be avoided and optimal scheduling can be achieved. However, the effectiveness of these schemes is subject to ideal assumptions, and their performance is greatly degraded if randomly occurring natural and social factors violate the applicable conditions.
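As a minimal sketch of the route planning step once congestion has been predicted, the following example runs a plain shortest path search (Dijkstra's algorithm) over predicted travel times; the road graph and the times are invented for illustration, and the cited algorithms are considerably more sophisticated.

import heapq

# Road network as an adjacency list; edge weights are predicted travel times
# (minutes) that already account for forecast congestion (values invented).
graph = {
    "depot": [("A", 12), ("B", 7)],
    "A": [("C", 10), ("customer", 15)],
    "B": [("A", 4), ("C", 18)],
    "C": [("customer", 6)],
    "customer": [],
}

def shortest_route(graph, src, dst):
    # Dijkstra's algorithm over predicted travel times.
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, w in graph[node]:
            if nxt not in seen:
                heapq.heappush(queue, (cost + w, nxt, path + [nxt]))
    return float("inf"), []

print(shortest_route(graph, "depot", "customer"))
# -> (26, ['depot', 'B', 'A', 'customer']) under the predicted travel times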
Key Challenges and Future Research Directions
A. Key Challenges to ICPS Monitoring and Safety Control
In this paper, we have looked into ICPSs including the smart grid, autonomous vehicles, the intelligent factory and intelligent traffic systems. It can be seen that the transition from theoretical breakthroughs to applicable monitoring and control techniques is urgently needed.
The existing monitoring and control schemes mostly focus on low-level performance indices. For instance, at the control loop level, the closed-loop stability, the response time and the tracking error are generally of concern. Such control loop level performance indices are, however, insufficient in the context of Industry 4.0. The conventional system monitoring and control techniques are applied mostly to a few control loops rather than to the whole process, whereas for large-scale complex systems it is more important to achieve global stabilization and performance optimization at the plant-wide decision-making level [72]. From the point of view of overall system stability and performance guarantees, plant-wide oriented decisions, such as a more reasonable resource allocation strategy, may outweigh the performance improvement of a less urgent local maintenance action. Moreover, online learning and optimization policies are beneficial for transparent global system management. The macroscopic guidelines can avoid catastrophic breakdowns to the largest extent and reduce the chance of a massive shutdown. In addition, the global performance indicators are more intuitive and goal-oriented, so it is more feasible to alter the overall economic strategy based on instructions given by the plant-wide, performance-supervised monitoring and control architecture. Such features are of great value for increasing competitiveness in the current rapidly changing economic environment and for boosting economic benefits [73].
Industrial standards and theoretical frameworks are being established by leading research institutes and scholars around the world. Since the topic of ICPS monitoring and control spans many domains, we only summarize the key challenges to the existing problems from a macroscopic perspective. Table 1 compares the traditional tasks and the corresponding new challenges in the different stages of system monitoring and control.
B. Future Research Directions
To achieve the target of plant-wide monitoring and safety control of the large-scale complex ICPSs in a data-driven manner, and in the face of the aforementioned challenges, the following research topics are proposed.
Design of data-driven fault-tolerant controllers against unknown faults, especially for nonlinear systems, time delay systems, and distributed systems.
Online fault localization and isolation based on the fault propagation analysis, with little or no a priori knowledge of the system structure.
Plant-wide key performance indicator supervised system monitoring that reduces unnecessary shutdown maintenance.
Deep learning based data analysis against false data injection and low sparsity unobservable attack.
Reinforcement learning aided online optimization and adaptive control of highly dynamic systems.
Plug-and-play controller design (PnP-PMCA) that reduces subsystem maintenance time.
OEE oriented life cycle management based on key performance indicator prediction and fault prognosis.
Sensor data fusion based knowledge exploration, such as feature selection, pattern recognition and statistical modelling.
Prototype system development based on microprocessors, FPGA-based designs for multitasking, as well as GPU acceleration for deep learning based solutions.
Conclusions
Oriented to ICPS monitoring, fault diagnosis and control tasks, this paper investigates the current status of research and reviews the advances in data-driven approaches over the past two decades. Data sources, preliminary processing and transmission are summarized in relation to the intelligent sensing techniques. The new challenges in smart grid systems that conventional monitoring approaches cannot handle are pointed out. Furthermore, the data-driven state estimation problems, unobservable attacks and recent advances in data-driven fault diagnosis schemes are discussed with a focus on their limitations in real systems. Advances in control strategies, including plug-and-play control, reinforcement learning aided control and constrained control, are reviewed. Finally, new challenges and future research topics are proposed based on the practical requirements of ICPS applications.
ACKNOWLEDGEMENT
The authors would like to sincerely thank the anonymous reviewers and the editors, whose suggestions and detailed corrections have made the paper much more readable.