Introduction
Because of the increasing deployment of phasor measurement units (PMUs) and phasor data concentrators (PDCs), vast volumes of PMU data are being collected at a fairly high reporting rate [1]. Because communication and storage capacities are limited, these volumes may cause severe communication congestion and discontinuous storage. This greatly restricts systematic applications of enhanced situational awareness techniques (e.g., power system event detection [2], identification [3], and location [4]). In addition, most wide-area measurement system (WAMS) applications require a latency in the range of 100 ms to 5 s [5]. To address these problems, compressing PMU data in WAMS substations is an effective approach. Once the compressed PMU data are transmitted through the communication network, they can be reconstructed at WAMS master stations to restore their original resolution for high-accuracy applications.
Compression techniques for PMU data fall into lossless and lossy categories. Lossless compression techniques reduce the volume of PMU data without any loss of information, i.e., the compressed data can be completely reconstructed into the original PMU data. In [6], a preprocessing method based on frequency-compensated difference encoding is presented to reduce the complexity of PMU data, and a Golomb-Rice code-based entropy encoder is then exploited to compress the preprocessed data. In [7], a slack-referenced encoding (SRE)-based technique is used for the compression of PMU data from different sources. Nevertheless, these lossless compression techniques are adapted from other fields for general applications, and they prioritize exact reconstruction at the cost of low compression ratios. Hence, as mentioned in [8], lossless compression methods cannot achieve the high compression ratio required for big data applications, such as wide-area synchrophasor networks where the data of thousands of PMUs are shared among multiple entities. In pursuit of a balanced approach, lossy compression, which achieves higher compression ratios at the cost of a degree of information loss, is adopted for the compression of PMU data in this study.
Lossy compression techniques can significantly reduce the volume of PMU data, though at the cost of introducing limited errors between the reconstructed data and the originals. Signal feature analysis is a well-studied class of lossy compression techniques for PMU data, among which the most widely used are the principal component analysis (PCA)-based [9] and wavelet transform (WT)-based [10] methods. However, PCA-based and WT-based methods require data buffering, which makes them too slow for real-time applications. Another type of lossy compression technique is based on segment approximation algorithms, which approximate the PMU signals as segments and select far fewer data points to represent their trends. For real-time event detection, the ordinary least squares (OLS) [11] and swinging door trending (SDT)-based compression algorithms [12], [13] have been developed to generate multiple compression segments of the PMU data. However, parameter selection is a serious challenge for OLS and SDT-based methods. More specifically, the tunable parameters of these algorithms are set empirically and remain constant, yet the compression performance depends primarily on them. While one set of parameters may achieve relatively high compression ratios with minimal distortion in the steady state of the power system, it may not be suitable for compressing rapidly changing PMU data under severe disturbance, which can lead to significant fluctuations in compression ratios. Similarly, parameters tuned for disturbed PMU data are unsuitable for PMU data in the steady state. This phenomenon, in which the compression ratios undergo significant and unpredictable changes because the algorithm parameters do not match the varying power system dynamics, is referred to as compression performance degradation.
The Douglas-Peucker (DP) algorithm is also a well-studied lossy compression method for vector data based on segment approximation [14], [15]. The fundamental concept behind the DP algorithm is to progressively extract feature points of the data curve, transitioning from coarse to fine levels of detail, so that the compressed data retain the overall trend of the signals with minimal distortion. However, a notable limitation of the DP algorithm is its inability to adequately capture the local characteristics of the PMU signals. This prevents it from mitigating compression performance degradation, because the challenge of selecting algorithm parameters for different system dynamics persists.
Fundamentally, there is no absolute “steady state” for power systems [16]. Power systems typically operate in quasi-steady state, where the system operating conditions change primarily because of stochastic variations in demand/generation, while they can also experience transient states resulting from severe disturbances [17], [18]. Hence, it is of great significance for the compression method to tune the algorithm parameters for varying system dynamics, so as to maintain the desired compression ratio and reconstruction accuracy as much as possible, irrespective of the power system dynamics.
To address the compression performance degradation problem, an effective CQDP-based lossy compression method is proposed. Compared to previously used compression methods, the advantage of the proposed approach lies in its use of the curvature integrated distance (CID) and a parameter adaptation scheme. This strategy mitigates the degradation of compression performance caused by inappropriate parameters when power system dynamics shift significantly after severe disturbances. As a result, it consistently delivers superior compression performance. First, CID is presented to measure the local flection and fluctuation trends within the PMU signals. Then, a DP-based algorithm is combined with a quantile-based parameter adaptation scheme to extract feature points that profile the contour of the PMU signals. Finally, the compressed data are reconstructed by a linear interpolation model to restore the original resolution for systematic applications in power systems. The contributions of this work are:
A CQDP-based PMU data compression method is proposed for power system situational awareness. By using the CID that measures local flection and fluctuation of PMU signals, and incorporating the DP algorithm that extracts feature points, the scale of PMU data can be compressed for easy communication and storage.
A quantile-based parameter adaptation scheme is embedded in the CQDP to effectively alleviate compression performance degradation for varying power system dynamics. Compared with previous lossy compression methods, the proposed method achieves a lower (i.e., better) compression distortion composite index when the operating status of the power system changes, thus verifying its higher compression performance.
CQDP-Based Data Compression Method for PMU Measurement
A noteworthy characteristic of PMU data is that they remain almost constant in the steady state of the power system, but fluctuate significantly under disturbances. Thus, as mentioned in Section I, robust compression methods for PMU data are required for varying power system dynamics. In the proposed method, the DP algorithm is improved by combining it with the CID and a quantile-based parameter adaptation scheme, so as to alleviate the compression performance degradation caused by inappropriate algorithm parameters, particularly during significant changes in power system dynamics.
A. CQDP-Based PMU Data Compression Method
Assume that the $i$-th PMU data series, consisting of $N$ samples, is denoted as: \begin{equation*} V_{i}=[V_{i,1},V_{i,2},\cdots,V_{i,N}] \tag{1} \end{equation*}
CID is used to quantify the local flection and fluctuation of PMU signals, and is defined as the product of curvature and Euclidean distance:
\begin{equation*} \mu_{i,j}^{\text{CID}}=\delta_{i,j}d_{i,j}=\delta_{i,j}\frac{\left\vert kt_{j}+b-V_{i,j}\right\vert}{\sqrt{1+k^{2}}},1\leqslant i\leqslant M, 1 < j < N\tag{2} \end{equation*}
\begin{align*} \delta_{i,j}=&\frac{\left\vert \arctan \left(\frac{V_{i,j+1}- V_{i,j}}{t_{j+1}- t_{j}}\right)-\arctan \left(\frac{V_{i,j}- V_{i,j-1}}{t_{j}- t_{j-1}}\right)\right\vert}{\sqrt{(t_{j+1}-t_{j-1})^{2}+(V_{i,j+1}- V_{i,j-1})^{2}}}= \tag{3}\\ &\frac{\vert \arctan (f_{r}(V_{i,j+1}- V_{i,j}))-\arctan (f_{r}(V_{i,j}- V_{i,j-1}))\vert}{\sqrt{4/f_{r}^{2}+(V_{i,j+1}- V_{i,j-1})^{2}}} \end{align*}
A higher CID indicates more pronounced changes in the PMU signals, and vice versa. Therefore, the feature point of the PMU signals is defined as the data point with the largest CID in each iteration of the DP algorithm. Note that for PMU data with small fluctuation, when there is no large disturbance in an actual power system, the CIDs of the interior data points are all very low (close to 0). This indicates that these data points are approximately linear and can be highly compressed. In contrast, a much higher CID is obtained if a severe disturbance occurs at the corresponding data point. Hence, CID can be used for the compression of PMU data in both the steady state and the disturbed state of the power system.
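As a quick check of (2) and (3), consider a hypothetical three-sample window with unit sample spacing (the values are illustrative and are not taken from the case studies). The example assumes, as in the standard DP algorithm, that $kt+b$ is the straight line through the two feature points bounding the segment, here the first and last samples:

```latex
\begin{align*}
&(t_{1},V_{i,1})=(0,0),\quad (t_{2},V_{i,2})=(1,1),\quad (t_{3},V_{i,3})=(2,0)\\
&k=0,\ b=0\ \Rightarrow\ d_{i,2}=\frac{\vert 0\cdot 1+0-1\vert}{\sqrt{1+0^{2}}}=1\\
&\delta_{i,2}=\frac{\vert \arctan(-1)-\arctan(1)\vert}{\sqrt{(2-0)^{2}+(0-0)^{2}}}=\frac{\pi/2}{2}=\frac{\pi}{4}\\
&\mu_{i,2}^{\text{CID}}=\delta_{i,2}d_{i,2}=\frac{\pi}{4}\approx 0.785
\end{align*}
```

If the middle sample instead lies on the line ($V_{i,2}=0$), both $d_{i,2}$ and $\delta_{i,2}$ are zero and the CID vanishes, which matches the near-zero CIDs expected for steady-state data.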
With the local flection and fluctuation measured by CID, the DP algorithm is adopted to extract feature points. The diagram of the CQDP-based compression method is presented in Fig. 1, where the red points denote extracted feature points, while the black and red lines denote raw PMU data and compressed PMU data, respectively. The CID threshold
B. Quantile-Based Parameter Adaptation Scheme for Varying Power System Dynamics
For the CQDP-based data compression algorithm, the CID threshold
With all CIDs calculated by (2) and (3), the CID vector is formed as:
\begin{equation*} \boldsymbol{\zeta}_{i}=[\mu_{i,1}^{\ast},\mu_{i,2}^{\ast},\cdots,\mu_{i,N-2}^{\ast}]\tag{4} \end{equation*}
\begin{gather*} \varepsilon_{\text{CID}}= \mu_{i,\lfloor p^{\ast}\rfloor}^{\ast}+(\mu_{i,\lfloor p^{\ast}\rfloor+1}^{\ast}- \mu_{i,\lfloor p^{\ast}\rfloor}^{\ast})(p^{\ast}-\lfloor p^{\ast}\rfloor) \tag{5}\\ \quad p^{\ast}=1+(N-3)\times p \tag{6} \end{gather*}
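The quantile rule in (5) and (6) can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes that the entries of the CID vector in (4) are sorted in ascending order and that $p$ is a quantile level in $[0, 1]$. The function name `cid_threshold` and the sample values are hypothetical.

```python
import math

def cid_threshold(cids, p):
    """Quantile-based CID threshold following (4)-(6).

    cids: CIDs of the interior points (N - 2 values for an N-sample window).
    p:    quantile level in [0, 1] (assumed meaning of p in (6)).
    """
    ordered = sorted(cids)        # assumption: the vector in (4) is sorted ascending
    n = len(ordered)              # n = N - 2
    if n == 0:
        raise ValueError("need at least one interior CID")
    pos = 1 + (n - 1) * p         # p* in (6), 1-based; n - 1 equals N - 3
    lo = math.floor(pos)
    frac = pos - lo
    if lo >= n:                   # p == 1: no upper neighbour to interpolate with
        return ordered[-1]
    # (5): linear interpolation between the two neighbouring order statistics
    return ordered[lo - 1] + (ordered[lo] - ordered[lo - 1]) * frac


cids = [0.02, 0.01, 0.90, 0.03, 0.04]   # one large CID, e.g. at a disturbance
print(cid_threshold(cids, 0.5))
print(cid_threshold(cids, 0.9))
```

Because the threshold is taken from the distribution of CIDs in the current window, it rises automatically when a disturbance produces large CIDs and falls when the data are nearly flat.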
Since the
Algorithm 1 CQDP-based data compression with parameter adaptation
Input: PMU data series
Set the number of reserved points
Initialize feature points set as
While the length of
For adjacent feature points
If
Continue
Else:
Calculate CIDs from interior points
End if
End for
Adjust the CID threshold
For adjacent feature points
If max
Remove the interior points
End if
End for
Append
Output: feature point set
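The sketch below is one possible Python reading of Algorithm 1, not the authors' implementation. It assumes that the feature set starts with the first and last samples, that the threshold is re-computed in every pass as the $p$-quantile of all CIDs in the still-open segments, that a segment is closed (its interior points dropped) when its largest CID falls below the threshold, and that the loop stops once a cap on reserved points is reached. The names `cqdp_compress`, `max_points`, and the default `p=0.9` are hypothetical.

```python
import math

def cid(t, v, a, j, b):
    """CID of interior sample j on the segment bounded by feature points a and b."""
    k = (v[b] - v[a]) / (t[b] - t[a])            # slope of the chord a -> b
    c = v[a] - k * t[a]                          # intercept of the chord
    dist = abs(k * t[j] + c - v[j]) / math.sqrt(1 + k * k)
    turn = abs(math.atan((v[j + 1] - v[j]) / (t[j + 1] - t[j]))
               - math.atan((v[j] - v[j - 1]) / (t[j] - t[j - 1])))
    span = math.hypot(t[j + 1] - t[j - 1], v[j + 1] - v[j - 1])
    return turn / span * dist                    # curvature times distance

def quantile(values, p):
    """Linearly interpolated quantile (0-based form of the rule in (5)-(6))."""
    s = sorted(values)
    pos = (len(s) - 1) * p
    lo = math.floor(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def cqdp_compress(t, v, p=0.9, max_points=None):
    """Return sorted indices of retained feature points (needs len(v) >= 2)."""
    n = len(v)
    limit = n if max_points is None else max_points   # assumed cap on reserved points
    features = [0, n - 1]
    open_segments = [(0, n - 1)]
    while open_segments and len(features) < limit:
        scored = []
        for a, b in open_segments:
            if b - a < 2:                        # no interior points left
                continue
            scored.append((a, b, {j: cid(t, v, a, j, b) for j in range(a + 1, b)}))
        if not scored:
            break
        # Adapt the threshold to the CIDs currently present in the window.
        threshold = quantile([s for _, _, sc in scored for s in sc.values()], p)
        open_segments = []
        for a, b, scores in scored:
            j = max(scores, key=scores.get)
            best = scores[j]
            # isclose guards against ties being broken by rounding noise
            split = best > 0.0 and (best >= threshold or math.isclose(best, threshold))
            if not split:
                continue                         # close segment: drop its interior points
            features.append(j)
            open_segments += [(a, j), (j, b)]
            if len(features) >= limit:
                break
    return sorted(features)


t = [i * 0.02 for i in range(11)]                # 50 frames per second
v = [1.0] * 11
v[5] = 0.8                                       # a single dip in an otherwise flat signal
print(cqdp_compress(t, v))
```

A perfectly flat window keeps only its two end points, while the dip at index 5 is always retained. How aggressively the remaining points are pruned depends on `p` and `max_points`, which is exactly the trade-off the parameter adaptation scheme is meant to manage.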
Note that each PMU data point is a phasor with amplitude and phase angle components, which are treated as two independent data series and compressed separately in this study. Although they are compressed separately, the amplitude and phase angle data are both reconstructed at the WAMS master station to restore the original resolution. This ensures the simultaneous and accurate representation of the phasor data. The complexity of the DP algorithm (i.e., the feature point extraction process), which primarily determines the execution time of the proposed method, is proved to be
C. Reconstruction of the Compressed PMU Data
The compressed PMU data consist of feature points that are sparsely extracted from the raw data, so reconstruction is required before further analysis and application. To achieve high fidelity to the raw PMU data, a linear interpolation model is used for reconstruction, and the reconstructed PMU data can be represented as:
\begin{equation*} \tilde{V}_{i,j}=\begin{cases} V_{i,j} & \text{if}\ V_{i,j}\neq \text{NULL}\\ V_{i,j-\Delta j_{1}}+\frac{V_{i,j+\Delta j_{2}}-V_{i,j-\Delta j_{1}}}{\Delta j_{1}+\Delta j_{2}}\Delta j_{1} & \text{if}\ V_{i,j}= \text{NULL} \end{cases}\tag{7}\end{equation*}
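A minimal Python sketch of the interpolation in (7) follows. It assumes uniformly spaced samples (as the index-based form of (7) implies) and that the first and last samples of the window are always feature points. The function name `reconstruct` and the dictionary layout are hypothetical.

```python
def reconstruct(n, kept):
    """Rebuild an n-sample series from retained feature points, following (7).

    kept: dict mapping sample index -> value; indices 0 and n - 1 must be present.
    """
    idx = sorted(kept)
    if idx[0] != 0 or idx[-1] != n - 1:
        raise ValueError("first and last samples must be feature points")
    out = [0.0] * n
    for left, right in zip(idx, idx[1:]):
        gap = right - left                       # Delta j1 + Delta j2 in (7)
        for j in range(left, right):
            dj1 = j - left                       # Delta j1: offset from the previous kept point
            out[j] = kept[left] + (kept[right] - kept[left]) / gap * dj1
    out[n - 1] = kept[n - 1]
    return out


print(reconstruct(5, {0: 1.0, 2: 2.0, 4: 1.0}))  # [1.0, 1.5, 2.0, 1.5, 1.0]
```

Retained samples are returned unchanged (the first case in (7)), and every removed sample is placed on the straight line between its two nearest retained neighbours (the second case).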
D. Performance Evaluation of the Compression Method
The performance of PMU data compression methods can be evaluated from two aspects: the compression scale and the accuracy of the reconstructed data.
Compression ratio (CR) is the most widely used indicator to measure the compression scale, shown as:
\begin{equation*} \eta_{\text{CR}}=(N_{i}^{\text{raw}}-N_{i}^{\text{res}})/N_{i}^{\text{raw}} \tag{8} \end{equation*}
Normalized mean square error (NMSE) is adopted to evaluate the accuracy of reconstructed data, denoted as:
\begin{equation*} \eta_{\text{NMSE}}=\frac{\Vert \boldsymbol{V}_{i}-\tilde{\boldsymbol{V}}_{i}\Vert_{2}}{\Vert \boldsymbol{V}_{i}\Vert_{2}}=\sqrt{\left.\sum\limits_{j=1}^{N}(V_{i,j}-\tilde{V}_{i,j})^{2}\right/\sum\limits_{j=1}^{N}V_{i,j}^{2}}\tag{9} \end{equation*}
It should be noted that CR is usually associated with NMSE for the same compression method, and a higher CR will lead to a higher NMSE. As a compromise between CR and NMSE, the compression distortion composite index (CDCI) can be used to represent the comprehensive performance of the compression method, calculated as:
\begin{equation*} \eta_{\text{CDCI}}=a\eta_{\text{CR}}^{\ast} +b\eta_{\text{NMSE}}^{\ast}=a\frac{1}{\eta_{\text{CR}}}+b\frac{\eta_{\text{NMSE}}}{2\times 10^{-3}} \tag{10} \end{equation*}
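The three indicators in (8) to (10) can be computed as in the Python sketch below. The weights $a$ and $b$ are passed explicitly, and the values used in the example (0.5 each) are placeholders for illustration, not the settings used in the case studies. The function names are hypothetical.

```python
import math

def compression_ratio(n_raw, n_res):
    """CR in (8): fraction of raw samples that are removed."""
    return (n_raw - n_res) / n_raw

def nmse(raw, rec):
    """NMSE in (9): 2-norm of the error divided by the 2-norm of the raw data."""
    num = math.sqrt(sum((x - y) ** 2 for x, y in zip(raw, rec)))
    den = math.sqrt(sum(x * x for x in raw))
    return num / den

def cdci(cr, err, a, b):
    """CDCI in (10): a / CR + b * NMSE / (2e-3)."""
    return a / cr + b * err / 2e-3


cr = compression_ratio(100, 10)                  # 100 raw samples, 10 retained
err = nmse([1.0, 1.0, 1.0, 1.0], [1.0, 1.001, 1.0, 0.999])
print(cr, err, cdci(cr, err, a=0.5, b=0.5))      # a, b: illustrative weights only
```

Since the first term in (10) falls as CR rises and the second term grows with NMSE, a lower CDCI corresponds to better overall compression performance.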
Numerical Results and Comparisons
A. Compression Results With Simulated PMU Data
The proposed method is evaluated with simulated PMU data on the Western Electricity Coordinating Council (WECC) 179-bus system, whose diagram can be seen in [23]. A three-phase short circuit fault is applied at Bus 8 and is cleared in 0.16 s. The simulated data span 5 s, comprising 1 s of pre-event data and 4 s of post-event data.
The CQDP-based method is applied to the compression of PMU data, with a window length of 1 s in this case. Note that for the same compression method, the compression scale conflicts with the reconstruction fidelity (i.e., the higher the CR, the worse the fidelity, and vice versa). To reach a trade-off,
It can be seen from Table I that, for the pre-event stage (0–1 s), the voltage amplitude of Bus 1 is almost constant, and thus only the start point and the end point are reserved, i.e.,
B. Compression Results With Actual Recorded PMU Data of the Guangdong Power System
The proposed method is further verified with actual PMU data of the Guangdong power system (GDPS) in this subsection. GDPS is a large interconnected power grid in southern China, with PMUs installed at all power plants and critical substations. Time-stamped streaming PMU data including voltage, frequency, and real/reactive power are recorded at 50 frames per second (FPS). Note that the actual recorded PMU data, which come from our industrial partners in GDPS, were filtered before archiving to reduce the impact of noise.
On March 2, 2017, an actual generator ramping event was recorded by the PMUs. However, because of the limited storage and transmission capabilities, only part of the data recorded by 31 PMUs was actually transmitted and saved. The recorded PMU data are shown in Fig. 3, where the PMU record is 56 s long, from 11:50:00 to 11:50:56, and covers the whole duration of the event.
The compression performances for the CQDP-based method are presented in Fig. 4. It can be seen that:
With the proposed CQDP-based method, most of the PMU data are highly compressed, yet the reconstruction error remains relatively low for all data types in the steady state of the GDPS, i.e., the minimum CR is 75.4% for real power data, and the maximum NMSE is $0.153\times 10^{-3}$ for reactive power.
There are only slight changes in the CDCI during the generator ramping event, which means that the compression performance of the proposed method is largely unaffected by this disturbance.
The compression performance differs across data types. Specifically, the CDCIs of the real and reactive power data are higher than those of the frequency and voltage data, but are still lower than 0.943 even in the worst-case scenario.
It can be concluded that the proposed method is effective in compressing PMU data while ensuring reconstruction accuracy for different data types and power system dynamics.
C. Sensitivity Analysis of the Window Length on Compression Performance
Because of the limited communication bandwidth, the PMU data are compressed and transmitted only after they accumulate to a certain size, so that a high CR can be achieved. Therefore, the window length (i.e., the data buffering) has a direct impact on the time delay. In this subsection, the compression performance with different window lengths is discussed. It should be noted that the average communication delay of PMU data is around 100 ms, and the data buffering for the WAMS master station is generally more than 1 s according to [25]. Thus, window lengths of 1 s, 2.5 s, and 5 s are discussed here.
The test cases are performed on the actual recorded PMU data of the GDPS, and the PCA-based method [9] is used for comparison. Taking the voltage data of substation DG as an example, the reconstruction results with a 1 s window length are presented in Fig. 5, where the CRs for these two methods are both 90%. It can be seen that there are obvious distortions in the reconstructed data for the PCA-based method since crucial information during the severe disturbance is lost, while the reconstructed data for the proposed method fit the original PMU data well, achieving much higher fidelity.
The overall compression performance with different window lengths is shown in Table II.
It can be seen that the CDCI of PCA changes from 0.905 to 1.106 as the window length decreases from 5 s to 1 s, which means that its compression performance is sensitive to the window length in case of disturbance, and a larger window length is preferred. Hence, the PCA-based method is more suitable for situations with low timeliness requirements, such as data archival in PDCs integrated with multiple PMUs. For the CQDP-based method, the CDCI is largely unaffected by the window length, i.e., less data buffering is required. In practice, the selection of window length depends on the actual requirements of communication time delays (e.g., if a 2 s time delay is required, the window length can be set to 1.5 s, and the remaining 0.5 s is used for transmission and reconstruction). In addition, considering that a single PMU also has a certain data storage and processing capability, the CQDP-based compression method can be executed on a single PMU instead of the PDC, which further improves the near real-time processing capability.
D. Computational Complexity Analysis
To illustrate the practical computational complexity of the CQDP algorithm, case studies are conducted to calculate the time delay on a typical personal computer. The total time delay can be denoted as [26]:
\begin{equation*} \tau_{\text{tot}}=\tau_{\text{comp}}+\tau_{\text{trans}}+\tau_{\text{prop}}+\tau_{\text{queu}}+\tau_{\text{PMU}} \tag{11} \end{equation*}
The average execution times over multiple tests of the proposed CQDP method are shown in Table III, where each time window contains 100 samples.
It can be seen that the maximum average execution time of the proposed CQDP method is 17.3 ms. As a result, in a real power grid, the proposed method could deliver compression results to PDC in less than 0.1 s after the PMU data is collected. Hence, the total time delay of the proposed method can satisfy the time-latency requirement (i.e., from 100 ms to 5 s) in [5], and can be applied for real-time applications in WAMS.
E. Comparisons With Other PMU Data Compression Methods
To demonstrate the superiority of the proposed method, two other segment approximation-based methods, SDT [12] and DP [14], are employed for comparison. The comparison is first performed on the WECC 179-bus system with the simulated voltage data, and an ablation test is designed as follows. The PMU data in a steady state of the power system are compressed with the SDT, DP, and CQDP methods, and the algorithm parameters of the compression methods are fine-tuned to achieve the same CDCI of 0.787 (i.e., the CRs are all 90%, and NMSEs are all around
It can be seen from Table IV that CDCIs of SDT and DP reach 1.667 and 2.424 respectively after the three-phase short circuit fault. The main reason for the increases in CDCI is the rapid decrease in CR, while the compensated reconstruction accuracy does not increase significantly. This indicates that the compressed data obtained by SDT and DP still retain high compressibility potential. Thus, inappropriate algorithm parameters may lead to the degradation of compression performance when there are large changes in power system dynamics, even though these parameters have been proved to be effective for the steady state of the power system. The benefits of the CID and parameter adaptation scheme in the CQDP method are fully demonstrated when compared with the basic DP algorithm. The CDCI of the proposed method only increases slightly from 0.787 to 0.839 after the severe disturbance. Thus, the results of the ablation test show that the merit of the proposed method is to alleviate the compression performance degradation when the power system dynamics change.
Then, the SDT, DP and CQDP compression methods are performed on the actual recorded PMU data of GDPS for further comparisons, and the results are shown in Table V.
As can be seen there, for the SDT and DP-based methods, there is significant degradation in the compression performance for the reactive power data, i.e., only 32.6% of the reactive power data is compressed for SDT, and 42.9% for DP. The reason is that the parameters of SDT and DP are set empirically based on the base value and error tolerance, whereas the optimal parameters differ across data types and power system dynamics. As a result, the CRs of SDT and DP drop sharply because the reactive power fluctuates more drastically. The proposed method can adjust the algorithm parameter adaptively based on the trends within the PMU data, and thus outperforms SDT and DP in terms of CR for reactive power data (the CR is 78.2% as shown in Fig. 4). Also, the average CDCI for CQDP is 0.871, which is much better than the 1.222 for SDT and 1.093 for DP.
Conclusions
To address the challenges of insufficient communication and storage capabilities caused by large volumes of PMU data, an effective CQDP-based PMU data compression method is proposed for power system situational awareness. Case studies on simulated and actual PMU data demonstrate that the proposed method can highly compress PMU data in both the steady state and the transient states of the power system, and that it achieves higher compression ratios with less data buffering for near real-time situational awareness.
Availability of Data and Materials
Not applicable.
ACKNOWLEDGEMENT
Not applicable.