Nomenclature
Monitoring statistic built by applying
Detection threshold with confidence level
Monitoring statistic built by applying
Detection threshold with confidence level
Number of principal components.
Covariance matrix of normalized variables.
Vector of contributions of variables to
Vector of contributions of variables to
Ratio percentage of sum of
Square of Euclidean distance between two windows.
Derivative operator.
Vector of residual variables obtained by PCA.
Number for a data window formulated offline.
Vector of principal components obtained by PCA.
Identity matrix with dimension as
Parameter for
Length of data window.
Temporary variable for counting from 1 to
Number of measured variables.
Size of modelling dataset.
Sampling time point for offline data.
Sampling time point for online data.
Squared Prediction Error (SPE) statistic calculated based on residual variables.
Number for a data window formulated offline, and
Specific values of
Value of
Hotelling's statistic calculated based on principal components.
Continuous time.
Matrix with columns as
Matrix with columns as
Vector of measured variables.
Value of
Vector of normalized variables.
Embedding matrix of
Confidence level for detection thresholds.
Diagonal matrix with diagonal elements as
Diagonal matrix with diagonal elements as
Introduction
Wide-area monitoring of power systems plays a crucial role in understanding system behavior and improving the operating stability margin of the system. It usually places much emphasis on the detection and localization of disturbances, because disturbances pose an increasingly severe threat to system security and stability [1].
Generally, disturbances degrade system health by making a power system deviate from its normal operating status. With more and more advanced measuring devices such as Phasor Measurement Units (PMUs) spreading across power systems, abundant measurements containing information on the system operating status are available for analysis. How to extract such information from the measured data for disturbance detection and localization is an important issue for power system researchers [2]. The existing data-driven methods can be divided into three categories according to their applications: (1) protection of power system equipment, e.g., the wavelet coefficient energy based method [3] and the hidden Markov model based method [4]; (2) analysis of power quality, especially the waveform of the alternating voltage, e.g., the Hilbert-Huang transform based method [5] and the power quality state estimation based method [6]; (3) assessment of system security and stability, typically by multivariate statistical analysis based methods [7]–[10].
Usually, the first two categories of methods take a univariate approach and analyze electrical variables separately. In contrast, the third category uses a multivariate approach that handles variables together, which is particularly suitable for wide-area monitoring of power systems, where many variables need to be analyzed simultaneously. This work focuses on the third category.
Principal Component Analysis (PCA), one of the classical multivariate statistical analysis techniques, is well-known
for its capability of compressing high-dimensional and correlated data without significant loss of information. It
obtains Principal Components (PCs) that are uncorrelated and Residual Variables (RVs) by projecting physical variables
onto a low-dimensional subspace that retains most of the variances of the projected variables
[11]. To measure the variation of PCs within the PCA model and the variation
of RVs not accounted for by the PCA model, two popular monitoring statistics were used respectively, that is, the
Hotelling's
In 2013, Barocio et al. [7] introduced the PCA-based
statistical monitoring method for the detection and visualization of power system disturbances and discussed its
potential for wide-area monitoring of power systems. Subsequently, Liu et al.
[8] focused on the geometric interpretation of
Specifically, the above works require the amplitude of electrical measurements recorded before and after
disturbances to be markedly different so that the amplitude of the
Against this background, the motivation of this work is to integrate
The paper is organized as follows. Section II gives a brief description
of wide-area monitoring based on PCA. Section III presents the wide-area
monitoring method based on PCA and
The following notational conventions are used throughout this contribution. Boldface capital and lower-case letters
stand for matrices and column vectors respectively, while
Wide-Area Monitoring Based on PCA
In this section, wide-area monitoring based on PCA [7]–[9], referred to as WAM-PCA here, is briefly introduced.
The symbol
Firstly, the variables in the vector
\begin{equation}
\boldsymbol{C} = \frac{1}{N-1} \sum\limits_{n=1}^{N} \tilde{\boldsymbol{x}}_n \tilde{\boldsymbol{x}}_n^{\rm{T}} = \boldsymbol{U} \boldsymbol{\Lambda} \boldsymbol{U}^{\rm{T}} = \sum\limits_{i=1}^{m} \lambda_i \boldsymbol{u}_i \boldsymbol{u}_i^{\rm{T}}
\end{equation}
Then, a vector of PCs can be obtained by:
\begin{equation}
{\boldsymbol{h}^{\rm{T}}} = \left[ {{h_1} \ {h_2} \ \cdots \ {h_a}} \right] \ =
{\left({{\boldsymbol{U}_{1:a}}^{\rm{T}}\tilde{\boldsymbol{x}}} \right)^{\rm{T}}}
\end{equation}
Concurrently, a vector of RVs can be obtained by:
\begin{equation}
{\boldsymbol{e}^{\rm{T}}} = \left[ {{e_1} \ {e_2} \ \cdots \ {e_m}} \right] \ = {\left({\tilde{\boldsymbol{x}} -
{\boldsymbol{U}_{1:a}}{\boldsymbol{U}_{1:a}}^{\rm{T}}\tilde{\boldsymbol{x}}} \right)^{\rm{T}}}
\end{equation}
The variation of PCs within the PCA model can be measured by the Hotelling's $T^2$ statistic:
\begin{equation}
T^2 = \boldsymbol{h}^{\rm{T}} \boldsymbol{\Omega} \boldsymbol{h} = \sum\limits_{i=1}^{a} \left( h_i / \sqrt{\lambda_i} \right)^2
\end{equation}
Moreover, the variation of RVs not accounted for by the PCA model can be measured by the Squared Prediction Error (SPE) statistic $Q$:
\begin{equation}
Q = \boldsymbol{e}^{\rm{T}} \boldsymbol{e} = \sum\limits_{i=1}^{m} e_i^2
\end{equation}
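The computations in (1)–(5) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `fit_pca` and `t2_and_q` are hypothetical, normalization is assumed to mean zero mean and unit sample variance, and the weighting matrix in (4) is taken to be the inverse of the diagonal matrix of the first $a$ eigenvalues, as the right-hand side of (4) implies.

```python
import numpy as np

def fit_pca(X, a):
    """Fit a PCA model on modelling data X (N samples x m variables).

    Assumption: normalization is zero mean, unit sample variance.
    Returns the normalization parameters, the first a eigenvectors
    and the first a eigenvalues of the covariance matrix in (1).
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Xn = (X - mean) / std                      # normalized variables
    C = Xn.T @ Xn / (X.shape[0] - 1)           # covariance matrix, eq. (1)
    eigvals, eigvecs = np.linalg.eigh(C)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # re-sort to descending
    return mean, std, eigvecs[:, order][:, :a], eigvals[order][:a]

def t2_and_q(x, mean, std, U_a, lam_a):
    """Return (T2, Q) for one sample x of the m measured variables."""
    xn = (x - mean) / std
    h = U_a.T @ xn                   # principal components, eq. (2)
    e = xn - U_a @ h                 # residual variables, eq. (3)
    T2 = float(np.sum(h**2 / lam_a)) # Hotelling's T2, eq. (4)
    Q = float(e @ e)                 # SPE statistic, eq. (5)
    return T2, Q
```

A useful sanity check is that, over the modelling data themselves, the $T^2$ values sum to $a(N-1)$, and that $Q$ vanishes when all $m$ components are retained.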
Wide-Area Monitoring Based on PCA and kNN
Both
A. Disturbance Detection of WAM-PCAkNN
\begin{equation}
D\left( \boldsymbol{\varphi}, \boldsymbol{\phi} \right) \buildrel \Delta \over = \sqrt{\sum\nolimits_{j=1}^{L} \left( \varphi_j - \phi_j \right)^2} \geq 0
\end{equation}
This paper also uses ED to assess the similarity of two data windows. ED is used here instead of other distance measures such as the Mahalanobis Distance (MD) because it is much simpler to calculate, which facilitates the recursive calculation needed for the online detection.
If the
1) The Offline Modelling
The offline modelling calculates a sequence of the AI values by using
Specifically, based on the modelling data
\begin{equation}
\boldsymbol{Z} = \left[ \begin{array}{c} \boldsymbol{z}_1^{\rm{T}} \\ \boldsymbol{z}_2^{\rm{T}} \\ \vdots \\ \boldsymbol{z}_{N-L+1}^{\rm{T}} \end{array} \right] = \left[ \begin{array}{cccc} Q_1 & Q_2 & \cdots & Q_L \\ Q_2 & Q_3 & \cdots & Q_{L+1} \\ \vdots & \vdots & \ddots & \vdots \\ Q_{N-L+1} & Q_{N-L+2} & \cdots & Q_N \end{array} \right]
\end{equation}
\begin{equation}
D^2\left( \boldsymbol{z}_g, \boldsymbol{z}_r \right) = \sum\limits_{l=1}^{L} \left( Q_{g-l+L} - Q_{r-l+L} \right)^2
\end{equation}
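As a small illustration of (7) and (8), the sketch below builds the embedding matrix from a sequence of $Q$ values and evaluates the squared Euclidean distance between two of its rows. The helper names `embed` and `sed` are hypothetical, and the code relies on `numpy.lib.stride_tricks.sliding_window_view` (NumPy 1.20 or later).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def embed(q, L):
    """Embedding matrix of eq. (7): row r holds [Q_r, ..., Q_{r+L-1}].

    For N values the result has N - L + 1 rows and L columns.
    """
    return sliding_window_view(np.asarray(q, dtype=float), L)

def sed(z_g, z_r):
    """Squared Euclidean distance between two windows, eq. (8)."""
    d = np.asarray(z_g, dtype=float) - np.asarray(z_r, dtype=float)
    return float(d @ d)

# Example with N = 5 values and window length L = 3
Z = embed([1, 2, 4, 7, 11], 3)   # rows: [1 2 4], [2 4 7], [4 7 11]
d01 = sed(Z[0], Z[1])            # 1 + 4 + 9 = 14
```

Working with the squared distance avoids a square root per window pair and, more importantly, keeps the distance a plain sum of per-sample terms.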
SED is used instead of ED for the sake of calculation efficiency, as can be seen later in Section III-A2. For the
When all rows of
Similar to
2) The Online Detection
Next is the online detection, for which real-time calculation of AI is required. To meet this requirement,
strategies for recursively calculating SED and for fast selecting the
The symbol
\begin{equation}
D^2\left( \boldsymbol{z}_p^{\circ}, \boldsymbol{z}_r \right) = \sum\limits_{l=1}^{L} \left( Q_{p-l+1}^{\circ} - Q_{r-l+L} \right)^2
\end{equation}
The calculation of (9) needs
Strategy $\Gamma$ for recursively calculating SED
For the window
\begin{equation}
D^2\left( \boldsymbol{z}_{p-1}^{\circ}, \boldsymbol{z}_{r-1} \right) = \sum\limits_{l=1}^{L} \left( Q_{p-l}^{\circ} - Q_{r-l+L-1} \right)^2
\end{equation}
Using (9) and (10), a recursive equation can be obtained as:
\begin{equation}
D^2\left( \boldsymbol{z}_p^{\circ}, \boldsymbol{z}_r \right) = \left\{ \begin{array}{ll}
D^2\left( \boldsymbol{z}_{p-1}^{\circ}, \boldsymbol{z}_{r-1} \right) + \left( Q_p^{\circ} - Q_{r-1+L} \right)^2 - \left( Q_{p-L}^{\circ} - Q_{r-1} \right)^2, & r \geq 2 \\
\sum\nolimits_{l=1}^{L} \left( Q_{p-l+1}^{\circ} - Q_{r-l+L} \right)^2, & r = 1
\end{array} \right.
\end{equation}
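The recursion in (11) can be checked numerically against the direct sum in (9). The following sketch is an illustration under stated assumptions rather than the authors' code: arrays are 0-based (so `q[i]` stands for $Q_{i+1}$), and the names `sed_direct`, `sed_recursive` and `max_recursion_error` are hypothetical. For every offline window with $r \geq 2$ the update needs only two squared differences instead of $L$.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def sed_direct(window, q, L):
    """Eq. (9): SED between the online window and every offline window."""
    d = sliding_window_view(q, L) - window
    return np.einsum("ij,ij->i", d, d)

def sed_recursive(prev, q, L, q_new, q_old, window):
    """Eq. (11): update all SEDs from those of the previous online window.

    prev[r-1] holds D^2(z_{p-1}, z_r); q_new is the newest online value
    (Q_p) and q_old is the value leaving the window (Q_{p-L}).
    """
    out = np.empty_like(prev)
    # r >= 2: previous distance to window r-1, one term added, one removed
    out[1:] = prev[:-1] + (q_new - q[L:]) ** 2 - (q_old - q[:-L]) ** 2
    # r = 1: no predecessor window exists, so compute the sum directly
    out[0] = np.sum((window - q[:L]) ** 2)
    return out

def max_recursion_error(seed=1, N=50, n_online=30, L=5):
    """Largest gap between recursive and direct SEDs on random data."""
    rng = np.random.default_rng(seed)
    q, qo = rng.random(N), rng.random(n_online)
    prev = sed_direct(qo[:L], q, L)            # first full online window
    worst = 0.0
    for p in range(L, n_online):               # p: index of newest sample
        window = qo[p - L + 1:p + 1]
        cur = sed_recursive(prev, q, L, qo[p], qo[p - L], window)
        worst = max(worst, float(np.max(np.abs(cur - sed_direct(window, q, L)))))
        prev = cur
    return worst
```

Calling `max_recursion_error()` should return a value at the level of floating-point round-off, confirming that the two forms agree.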
In comparison to (9), the calculation of
Using (11), the sequence of the SED values
Strategy $\Gamma\Gamma$ for fast selection of the $k$th smallest SED
If
For the best case,
In addition, the binary search is also used to sort the first
Similarly, the AI value
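One simple way to pick the $k$th smallest value from a stream of SEDs without sorting the whole sequence is to keep a sorted buffer of at most $k$ values and place each candidate by binary search. The sketch below is an assumption-laden illustration of that general idea using Python's standard `bisect` module; it is not claimed to reproduce the selection strategy of this paper, and the function name `kth_smallest` is hypothetical.

```python
import bisect

def kth_smallest(values, k):
    """Return the k-th smallest value (k is 1-based).

    A sorted buffer of at most k values is maintained; each new value
    is placed by binary search and the largest buffered value is
    dropped whenever the buffer would exceed k entries.
    """
    best = []                          # ascending, at most k values
    for v in values:
        if len(best) < k:
            bisect.insort(best, v)     # binary search for insertion point
        elif v < best[-1]:
            bisect.insort(best, v)
            best.pop()                 # discard the (k+1)-th smallest
    if len(best) < k:
        raise ValueError("fewer than k values supplied")
    return best[-1]
```

Values that are not smaller than the current $k$th smallest are rejected with a single comparison, so most candidates cost constant time once the buffer is full.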
B. Disturbance Localization of WAM-PCAkNN
Once a disturbance is detected, it needs to be located. Since the variables nearest to a local disturbance are
usually affected most, identifying the variables affected most by the detected disturbance provides a meaningful
reference for disturbance localization. In existing studies, contribution plots have proved effective in
identifying such variables [20]. Usually, the variables with the largest
contributions to the monitoring statistics are the ones affected most by disturbances. In the following, a contribution
plot strategy that can quantify the Contributions of Variables (CVs) to
When the online detection is implemented, the AI value
\begin{equation}
AI_{Q,p}^{\circ} = D^2\left( \boldsymbol{z}_p^{\circ}, \boldsymbol{z}_{{\rm{r}}_1} \right) = \sum\limits_{l=1}^{L} \left( Q_{p-l+1}^{\circ} - Q_{{\rm{r}}_1-l+L} \right)^2
\end{equation}
Similar to
\begin{equation}
AI_{{T^2},p}^{\circ} = \sum\limits_{l=1}^{L} \left( {T^2}_{p-l+1}^{\circ} - {T^2}_{{\rm{r}}_2-l+L} \right)^2
\end{equation}
Then, the CVs to
\begin{align}
\mathbf{con}_{AI_Q,p}^{\circ} &= \sum\limits_{l=1}^{L} \left| \frac{d \left( Q_{p-l+1}^{\circ} - Q_{{\rm{r}}_1-l+L} \right)^2}{d \tilde{\boldsymbol{x}}_{p-l+1}^{\circ}} \right| \nonumber\\
&= \sum\limits_{l=1}^{L} \left| 2 \left( Q_{p-l+1}^{\circ} - Q_{{\rm{r}}_1-l+L} \right) \frac{d Q_{p-l+1}^{\circ}}{d \tilde{\boldsymbol{x}}_{p-l+1}^{\circ}} \right| \nonumber\\
&= \sum\limits_{l=1}^{L} \left| 4 \left( Q_{p-l+1}^{\circ} - Q_{{\rm{r}}_1-l+L} \right) \left( \boldsymbol{I}_m - \boldsymbol{U}_{1:a} \boldsymbol{U}_{1:a}^{\rm{T}} \right) \tilde{\boldsymbol{x}}_{p-l+1}^{\circ} \right|
\end{align}
Meanwhile, the CVs to
\begin{align}
\mathbf{con}_{AI_{T^2},p}^{\circ} &= \sum\limits_{l=1}^{L} \left| \frac{d \left( {T^2}_{p-l+1}^{\circ} - {T^2}_{{\rm{r}}_2-l+L} \right)^2}{d \tilde{\boldsymbol{x}}_{p-l+1}^{\circ}} \right| \nonumber\\
&= \sum\limits_{l=1}^{L} \left| 2 \left( {T^2}_{p-l+1}^{\circ} - {T^2}_{{\rm{r}}_2-l+L} \right) \frac{d {T^2}_{p-l+1}^{\circ}}{d \tilde{\boldsymbol{x}}_{p-l+1}^{\circ}} \right| \nonumber\\
&= \sum\limits_{l=1}^{L} \left| 4 \left( {T^2}_{p-l+1}^{\circ} - {T^2}_{{\rm{r}}_2-l+L} \right) \boldsymbol{U}_{1:a} \boldsymbol{\Omega} \boldsymbol{U}_{1:a}^{\rm{T}} \tilde{\boldsymbol{x}}_{p-l+1}^{\circ} \right|
\end{align}
Thus, a contribution plot strategy has been developed which quantifies the CVs to
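The last lines of (14) and (15) translate directly into array operations. The sketch below is a minimal illustration with hypothetical function and argument names; it assumes the online window and the matched offline window are supplied in the same chronological order, and that the weighting matrix in (15) is the inverse of the diagonal matrix of the first $a$ eigenvalues, consistent with (4).

```python
import numpy as np

def contributions_ai_q(Xn_online, Q_online, Q_ref, U_a):
    """Contribution vector of eq. (14).

    Xn_online: (L, m) normalized online samples in the current window.
    Q_online:  (L,) Q values of those samples.
    Q_ref:     (L,) Q values of the matched offline window.
    U_a:       (m, a) retained eigenvectors.
    """
    m = U_a.shape[0]
    P_res = np.eye(m) - U_a @ U_a.T            # residual projector (symmetric)
    grad = 4.0 * (Q_online - Q_ref)[:, None] * (Xn_online @ P_res)
    return np.abs(grad).sum(axis=0)            # one contribution per variable

def contributions_ai_t2(Xn_online, T2_online, T2_ref, U_a, lam_a):
    """Contribution vector of eq. (15), with Omega = diag(1 / lam_a)."""
    M = U_a @ np.diag(1.0 / lam_a) @ U_a.T     # symmetric m x m matrix
    grad = 4.0 * (T2_online - T2_ref)[:, None] * (Xn_online @ M)
    return np.abs(grad).sum(axis=0)
```

The variables with the largest entries in the returned vectors are the candidates to inspect first when locating a detected disturbance.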
C. Parameter Settings for WAM-PCAkNN
1) Parameter ${k}$ and Window Length ${L}$
For the parameter
2) Number of PCs
To determine the number
\begin{equation}
{\rm{CPV}}\left( a \right) = \frac{\sum\nolimits_{i=1}^{a} \lambda_i}{\sum\nolimits_{i=1}^{m} \lambda_i} \times 100\%
\end{equation}
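Equation (16) lends itself to a short numerical sketch. The code below computes the cumulative percentage for every candidate number of PCs and returns the smallest number that reaches a target percentage. The function names and the `target` argument are hypothetical, and the default of 90% is only a placeholder, not a value taken from this paper.

```python
import numpy as np

def cpv(eigvals):
    """CPV(a) of eq. (16) for a = 1, ..., m, in percent."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]   # descending
    return 100.0 * np.cumsum(lam) / lam.sum()

def choose_num_pcs(eigvals, target=90.0):
    """Smallest a with CPV(a) >= target (target is an assumed setting)."""
    curve = cpv(eigvals)
    a = int(np.searchsorted(curve, target)) + 1   # curve is non-decreasing
    return min(a, curve.size)                     # guard against round-off

# Example: eigenvalues 2.0, 1.0, 0.6, 0.4 give CPV = 50, 75, 90, 100 percent
num_pcs = choose_num_pcs([2.0, 1.0, 0.6, 0.4], target=85.0)   # 3
```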
Case Studies
In this section, WAM-PCAkNN is evaluated and compared with WAM-PCA in two case studies, involving data from a four-variable numerical model and the New England power system model.
A. Four-Variable Numerical Model
A four-variable numerical model which was also studied in [13] is given by:
\begin{align}
{x_{1,t}} &= 0.5{s_{1,t}} + 0.3{s_{2,t}} + 0.2{s_{3,t}}\\
{x_{2,t}} &= 0.7{s_{1,t}} + 0.2{s_{2,t}} + 0.1{s_{3,t}}\\
{x_{3,t}} &= 0.4{s_{1,t}} + 0.3{s_{2,t}} + 0.3{s_{3,t}}\\
{x_{4,t}} &= 0.2{s_{1,t}} + 0.4{s_{2,t}} + 0.4{s_{3,t}}
\end{align}
Suppose a disturbance occurs near
\begin{align}
{x_{1,t}} &= 0.5{s_{1,t}} + 0.3{s_{2,t}} + 0.2{s_{3,t}} + 0.6{s_{4,t}}\\
{x_{2,t}} &= 0.7{s_{1,t}} + 0.2{s_{2,t}} + 0.1{s_{3,t}} + 0.02{s_{4,t}}\\
{x_{3,t}} &= 0.4{s_{1,t}} + 0.3{s_{2,t}} + 0.3{s_{3,t}} + 0.01{s_{4,t}}\\
{x_{4,t}} &= 0.2{s_{1,t}} + 0.4{s_{2,t}} + 0.4{s_{3,t}} + 0.015{s_{4,t}}
\end{align}
The total simulation time is 300 seconds and the data are sampled at a frequency of 10 Hz, giving 3000 data points. Thus, the first
2000 data points are from (17)–(20), representing the measurements under the ambient condition,
and the last 1000 data points are from (21)–(24), representing the measurements under the disturbance
condition. The signal-to-noise ratios (SNRs) in the measurements of
\begin{equation}
{\rm{SNR}}_i = 10 \log_{10} \left( \sum\limits_{n=1}^{3000} x_{i,n}^2 \Bigg/ \sum\limits_{n=1}^{3000} \omega_{i,n}^2 \right)
\end{equation}
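The ratio in (25) can be evaluated as follows. This is a small sketch with a hypothetical function name, and it assumes that $\omega_{i,n}$ denotes the noise added to the $i$th measurement at the $n$th sampling point.

```python
import numpy as np

def snr_db(x, noise):
    """SNR of eq. (25) in decibels: signal energy over noise energy."""
    x = np.asarray(x, dtype=float)
    noise = np.asarray(noise, dtype=float)
    return float(10.0 * np.log10(np.sum(x**2) / np.sum(noise**2)))

# A signal with 100 times the energy of the noise has an SNR of 20 dB
example_snr = snr_db([10.0, 10.0], [1.0, 1.0])
```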
The window length
The detection charts of WAM-PCA and WAM-PCAkNN are shown in
Figs. 3 and 4 respectively. To facilitate the observation, the
monitoring statistic values (solid lines) are normalized by their thresholds so that the thresholds (dashed lines) are
equal to one. In Fig. 3, after the disturbance occurs at the 2000th
sampling time point, most of the
The detection chart of WAM-PCA in the first case study, with values of
The detection chart of WAM-PCAkNN in the first case study, with values of
The probability density values of
To evaluate the efficiency of the online detection, the time in calculating
To identify the variables affected most by the detected disturbance, the contributions of variables to
The contributions of variables to
The contributions of variables to
B. New England Power System Model
The New England power system model was described in [24] based on a single line diagram of the test system. The system is a 16-machine 68-bus system with 16 generators serving five geographical areas and eight tie lines connecting the areas to one another. The data used here were provided by the authors of [24] and consist of 20 Hz samples of active power (MW) and reactive power (MVAR) measured at the 16 generators (G1 to G16) and at the sending terminals of the 8 tie lines (L01, L16, L61, L62, L74, L76, L77, L86).
Two data sets were provided, one for the ambient condition (Data I) and another for the disturbance condition (Data II). According to the supplier of the data, Data I was generated by running the New England power system model normally with no disturbance, while Data II was generated by running the model with a simulated local disturbance. The disturbance is a step change in the voltage reference input of the automatic voltage regulator of the excitation system in G3. Four inter-area oscillations are present in both Data I and Data II, reflecting the property of the whole system. In addition, one local oscillation caused by the disturbance is present in Data II. The local oscillation is observable in the power measurements of G3 and of G2, the generator nearest to G3.
Table I lists the active and reactive power measurements used for the wide-area monitoring. The first 1000 samples of Data I are taken as modelling data, and the remaining 29000 samples of Data I are taken as testing data from the ambient condition. The 30000 samples of Data II are taken as testing data from the disturbance condition. To provide a compact demonstration, the normalized trends of the first 1200 samples of Data I (a one-minute simulation episode) and those of Data II are shown in Figs. 8 and 9 respectively. For the same reason, only the reactive power measurements from the representative generators G3, G2 (nearest to G3) and G14 (farthest from G3), rather than all power measurements, are shown in Figs. 8 and 9. A comparison of Figs. 8 and 9 shows that the disturbance strongly affects G3 and G2 but has little effect on G14.
The normalized trends of the first 1200 samples of Data I (reactive power measurements from G2, G3 and G14) in the second case study.
The normalized trends of the first 1200 samples of Data II (reactive power measurements from G2, G3 and G14) in the second case study.
The simulated disturbance is difficult to detect and locate because the inter-area oscillatory trends in the data mask the local oscillation and thus the disturbance. Detecting and locating this disturbance not only gives the system operators increased situational awareness of the generators but also indicates the time span and the variables suitable for estimating the frequencies and damping ratios of different oscillations. Here, the expected detection result is that alarms should rarely be triggered for Data I whereas alarms should be constantly triggered for Data II. In addition, the expected localization result is that the active power and reactive power of G2 and G3 should be identified as the variables affected most by the detected disturbance.
Using the oscillation analysis method presented in [24] on the modelling
data, it is found that the maximum oscillation period contains about 36 samples. Accordingly, the window length
The detection charts of WAM-PCA and WAM-PCAkNN on Data I are shown in
Figs. 10 and 11, respectively.
Again, to facilitate the observation, the monitoring statistic values (solid lines) are normalized by their
corresponding detection thresholds so that the thresholds (dashed lines) are equal to one. It can be observed from
Figs. 10 and 11 that most of the
values of
The detection chart of WAM-PCA on Data I in the second case study, with values of
The detection chart of WAM-PCAkNN on Data I in the second case study, with values of
The detection charts of WAM-PCA and WAM-PCAkNN on Data II are shown in
Figs. 12 and 13, respectively.
Observing from Figs. 12 and 13, a
large number of the values of
The detection chart of WAM-PCA on Data II in the second case study, with values of
The detection chart of WAM-PCAkNN on Data II in the second case study, with values of
To quantify the results observed in Figs. 10 and 11 as well as those observed in
Figs. 12 and 13, Table II
lists the alarm rates of WAM-PCA and WAM-PCAkNN on Data I and Data II. An alarm rate is
calculated as the number of triggered alarms divided by the dataset size, expressed as a percentage. Taking the value
"69.47%" in Table II as an example,
it is obtained by dividing the number of the triggered alarms (that is, the number of the
The detection results in Figs. 10–13 and Table II demonstrate
that WAM-PCAkNN can significantly enhance the sensitivity of WAM-PCA in detecting disturbances by
reducing the masking effect of the oscillatory trends on disturbances, while behaving reliably in the ambient condition
by keeping the number of false alarms at a level acceptable for the given confidence level. Moreover,
WAM-PCAkNN is suitable for the online detection in real time, since the maximum time on the
calculation of
After the disturbance is detected by
The contributions of variables to
The contributions of variables to
Note that the ordinates in Figs. 8–15 and the abscissas in Figs. 8–13 have no unit: Figs. 8–15 show results obtained from the normalized power measurements, and the abscissas in Figs. 8–13 give the sample sequence number, with a sampling interval of 0.05 seconds.
Discussions
One issue that can affect the performance of the proposed method in detecting and locating disturbances is limited sensor coverage, which has already been pointed out in [25] and [26]. With limited sensor coverage, the measured data may contain little disturbance information, making it difficult to detect and locate disturbances. Fortunately, this issue has been largely alleviated by the widespread deployment of PMUs. In addition, a feasible solution to this issue, as suggested in [25], is optimal sensor placement.
Another issue is the automatic update of a previously built detection model for new ambient conditions. A solution worth considering is to identify automatically the time at which a power system enters a new ambient condition, rather than relying on the experience of the system operators.
These two issues are outside the scope of the present work, but will make interesting topics for future study.
Conclusion
A wide-area monitoring method (WAM-PCAkNN) has been proposed by combining Principal Component
Analysis (PCA) with k-Nearest Neighbor (kNN) analysis to detect and locate power
system disturbances in real time. The contribution is three-fold. Firstly, kNN has been combined with
multivariate analysis PCA to build new system-wide monitoring statistics
The analysis on the data from a four-variable numerical model and the New England power system model has illustrated
that WAM-PCAkNN significantly improves the performance of the traditional wide-area monitoring method
based on PCA (WAM-PCA) in detecting disturbances, e.g.,
ACKNOWLEDGMENT
The authors gratefully acknowledge Dr. Karine Hay and Dr. Alex Golder for providing the simulated data of the New England power system model to support this paper.