
A Fast and Intelligent Open-Circuit Fault Diagnosis Method for a Five-Level NNPP Converter Based on an Improved Feature Extraction and Selection Model



Abstract:

The open-circuit faults of power semiconductor devices in multilevel converters are generally diagnosed by analyzing circuit signals. For converters with five or more levels, the difficulty of fault detection increases with increasing topological complexity, the number of switching devices and the number of candidate signal parameters. In this paper, a complete solution for open-circuit fault detection for a five-level nested neutral-point piloted (NNPP) converter is proposed based on improved unsupervised feature learning algorithms. Feature engineering and machine learning algorithms are applied for feature extraction and selection and the construction of classification models. Circuit signals are monitored, and their time-domain characteristics are extracted for fault recognition. An unsupervised feature learning selector, which combines a dependence-guided unsupervised feature selection (DGUFS) filter and a random forest feature selection (RFFS) wrapper to automatically select parameters and generate the optimal feature subset for fault detection, is proposed. The random forest (RF) algorithm is used to build a classifier. The experimental results show that the solution framework presented has the advantages of high efficiency, high flexibility, a superior fault recognition rate and good generalization ability.
Published in: IEEE Access ( Volume: 8)
Page(s): 52852 - 52862
Date of Publication: 16 March 2020
Electronic ISSN: 2169-3536

CCBY - IEEE is not the copyright holder of this material. Please follow the instructions via https://creativecommons.org/licenses/by/4.0/ to obtain full-text articles and stipulations in the API documentation.
SECTION I.

Introduction

As the key equipment of power conversion systems, power electronic converters are widely used in motor drives and power systems. Power switching devices, with limited overload capacities, are inevitably exposed to high voltages and currents, which leads to a high risk of damage and affects the safe and stable operation of the system. According to statistics, 38% of failures in AC motor control systems for industrial applications derive from damage to power electronic devices [1]. The number of semiconductor devices increases with the voltage level of the converter, which raises the risk of switch breakdown and reduces the reliability of the system [2]. The need for fast and accurate fault diagnosis technology therefore continues to grow.

Open-circuit failure is one of the most common issues for switches in converters. By monitoring the signals of the operating system, malfunctions should be identified with classification and position information that is as accurate as possible [3]. Traditional methods of fault detection analysis depend more on cognitive-based comparisons of fault and normal signals. The open-circuit fault detection method for three-level T-type converters proposed in [4] uses the grid current amplitude and phase angle as the recognition parameters. An alternative approach presented in [5] uses irregular variations of the neutral-point current, switching states and phase current for fault detection. In [6], the authors proposed an open-circuit fault diagnosis method for a four-wire T-type converter considering both unbalanced load and unbalanced input voltage conditions. The method decomposes the voltage into positive-, negative- and zero-sequence components, and the faults are located by the law of positive- and negative-sequence voltage error offsets and zero-sequence voltage error offsets. In [7], an open-circuit fault detection algorithm for single-phase 3L-NPC converters was presented. The grid voltage, the DC-side voltages and the switching states were used to build a mixed logical dynamic model to estimate the grid current. By subtracting the estimated value from the measured current, the residual was calculated and used for fault diagnosis. Most analytical diagnosis procedures are intuitive, but the design can be overly complex [8]–[11]. Moreover, in converters with five or more levels, the traditional methods appear incapable of fast and accurate fault detection since the complexity considerably increases with the number of switches. In [12], an open-switch fault-tolerant operation was presented for a multichannel voltage-source five-level power converter by monitoring the voltage across the flying capacitors. The variations in the rotor currents and DC link voltage are also used for fault diagnosis in grid/machine-connected mode. The speed of fault detection depends on the frequency of the grid/machine-side rotor current, which tends to be very low. In recent years, machine learning algorithms have become effective tools for fault diagnosis in complex environments [13]. Since the performance of an algorithm is greatly influenced by the available data and features, feature engineering is widely used to extract characteristics from raw data and improve the corresponding fault recognition results. The authors of [14] proposed a concept that exploits the features extracted from circuit responses instead of component parameters for circuit health estimation using a kernel learning technique. In [15], a fault diagnosis method for a wind turbine planetary gearbox was developed. A stacked denoising autoencoder technique was adopted to learn robust and distinguishable features from measured signals, and a least squares support vector machine was employed for fault identification.

For multilevel converters, signals such as the phase or amplitude of currents and voltages can be used for fault detection. However, the direct use of these signals lacks identifiability and increases the data processing burden. Feature extraction is used to reduce the dimensions of large-scale data and summarize new characteristics with accurate classification. Time-domain features are effective indicators that reflect the state of operation and can be used individually or in combination for fault detection [16]. Designing classifiers with many unrelated features will result in considerable computational complexity and poor classification performance. Therefore, filtering redundant parameters through feature selection to further reduce dimensionality has very important practical significance [17]–​[19].

Traditional feature selection methods are usually based on expert experience or enumeration. If the number of candidate parameters is large, such approaches may be too time consuming to find the optimal feature subset. In recent years, fault classification based on pattern recognition has been the main method for intelligent fault diagnosis. The validity of the pattern features directly affects the design and performance of the classifier. The amount of raw data obtained from the detected signals is considerable. To effectively perform classification and recognition tasks, the original data must be selected or transformed to obtain the essential characteristics that best reflect the differences among modes. Optimization of the raw data is mainly achieved by dimensionality reduction, that is, conversion of a high-dimensional data space into a low-dimensional one. The filter, wrapper and hybrid modes were introduced to reduce dimensionality and perform effective feature selection [20]. A filter uses a measure of feature importance for the target attributes to weight and rank all features in the pool; this method is efficient but limited in accuracy [21]. A wrapper conducts feature selection based on specific evaluation criteria (mostly accuracy) and determines the optimal feature subset accordingly [22], [23]; this approach is high in accuracy but low in computational efficiency. Although the hybrid method inherits the advantages of both the filter and the wrapper, it inevitably falls into suboptimal solutions under certain conditions [24].

In this paper, a complete fault diagnosis solution for electronic power converters is proposed. A five-level nested neutral-point piloted (NNPP) converter is used to build the system for fault detection. A feature selection algorithm combining a dependence-guided unsupervised feature selection (DGUFS) filter and a random forest feature selection (RFFS) wrapper is also proposed. The proposed DGUFS-RFFS is a hybrid filter-wrapper method that not only avoids suboptimal solutions but also accelerates the feature selection process and generally achieves better performance than traditional methods. DGUFS, as presented in [25], is used to select parameters and evaluate feature combinations of a given dimension. The RFFS wrapper is used to determine the optimal feature subset from different dimensions. The currents of the flying capacitors I_{Cfa1} and I_{Cfa2} , the phase current I_{a} and the phase voltage V_{ao} of the five-level NNPP converter are selected as the original signals. The time-domain features are introduced as the extracted features, and a multidimensional feature pool is constructed. The optimal feature combination is selected by the DGUFS-RFFS model and input to the classifier for fault recognition.

SECTION II.

Brief Introduction to the Open-Circuit Faults of an NNPP Converter

Multilevel converters are considered the most attractive solutions in high-voltage and high-power applications due to their advantages, such as a low common-mode output voltage, low harmonics, and the use of low-voltage semiconductor devices. The five-level NNPP converter, which was proposed by GE in [26], was developed by nesting two or more medium-voltage three-level neutral-point piloted (NPP) cells. This topology, shown in Fig. 1, is easily scalable over a reasonable range of voltages and output voltage levels, and it provides a small filter, high power density and high efficiency.

FIGURE 1. Topology of the five-level NNPP converter.

A. Modulation Strategy of the Five-Level NNPP Converter

The five-level NNPP converter generates five-level phase voltages and nine-level phase-phase voltages. The output phase voltages are 2E , E , 0, -E and -2E . The nested structure requires the modulation strategy to effectively balance the DC-side capacitor voltages and the flying capacitor voltages. The authors in [27] proposed an optimized space vector pulse width modulation (SVPWM) algorithm based on gh coordinates for this topology.

Taking phase A as an example, as shown in Fig. 1, each group of (S_{a11}, S_{a11}) , (S_{a14}, S_{a14}) , (S_{a21}, S_{a21}) and (S_{a24}, S_{a24}) should work under the same operating status. Each group of (S_{a11}, S_{a14}) , (S_{a21}, S_{a24}) , (S_{a11}, S_{a12}, S_{a13}) , (S_{a21}, S_{a22}, S_{a23}) , (S_{a12}, S_{a13}, S_{a14}) and (S_{a22}, S_{a23}, S_{a24}) cannot work in conduction mode simultaneously, thus preventing the formation of a short circuit. Each group of (S_{a11}, S_{a13}) , (S_{a21}, S_{a23}) , (S_{a12}, S_{a14}) and (S_{a22}, S_{a24}) should work in a complementary state to reduce the switching frequency and to ensure that the switching sequence is unaffected by dead zones or the direction of the load current. Therefore, S_{a11} , S_{a12} , S_{a21} and S_{a22} are the four switches that need to be controlled independently. S_{a11} and S_{a13} , S_{a12} and S_{a14} , S_{a21} and S_{a23} , and S_{a22} and S_{a24} should work in a complementary manner. The output voltage and the corresponding control strategy are shown in Table 1.

TABLE 1. The Control Strategy of the 5-L NNPP Converter

B. Open-Circuit Fault Analysis of the Five-Level NNPP Converter

Fault diagnosis studies of the five-level NNPP converter are limited. In [28], a fault-tolerant solution for the five-level NNPP topology was proposed, and it identified both the short- and open-circuit faults of a single IGBT. In this paper, the open-circuit failures of a single IGBT and dual IGBTs are analyzed. The probability of simultaneous failures of more than two IGBTs is relatively low; therefore, this topic is not discussed in this paper. As discussed in [28], when a single IGBT fails to open, taking S_{a11} as an example, the output voltages marked in red are distorted, as shown in Table 2.

TABLE 2. The Distorted Output Voltages When S_{a11} Fails to Open

For the simultaneous failure of two IGBTs, taking S_{a11} and S_{a12} as examples, the output voltages marked in red in Table 3 are distorted when both switches fail to open. Assuming that the breakdown of the IGBTs does not affect the reverse diode, Fig. 2 shows the change in the current path when the phase current I_{a} is positive or negative and when 2E is the output phase voltage.

TABLE 3. The Distorted Output Voltages When S_{a11} and S_{a12} Fail to Open

FIGURE 2. Change in the current path. The blue solid line represents the current path of the fault-free circuit, and the red dotted line represents the current path when S_{a11} and S_{a12} fail to open simultaneously: (a) I_{a} > 0. (b) I_{a} < 0.

A total of 36 faulty states and one fault-free state are included for analysis and identification. The failures with classification labels are summarized in Table 4.

TABLE 4. Classification of the Open-Circuit Faults

SECTION III.

Feature Extraction and Selection Based on the Hybrid DGUFS-RFFS Method

In this paper, the time-domain signal parameters that are more representative of the fault characteristics are selected for feature extraction. Once the feature parameters are extracted and a feature space is constructed, feature selection is implemented to further achieve dimensionality reduction. The importance of the features is determined, and the redundant information is deleted. The proposed hybrid DGUFS-RFFS method is used to optimize the feature set, ensuring that the remaining features are reliable, relatively uncorrelated, retain as much information as possible, and reduce the amount of data as much as possible.

A. Time-Domain Feature Extraction Algorithms

The kurtosis (Ku ), skewness (Sk ), root mean square metric (RMS ), crest factor (Cf ) and form factor (Ff ) focus more on the extremes of a data set than on the average and are indicators of faulty signals. These individual feature parameters and combinations of parameters have different recognition effects for fault diagnosis. The formulas of the above feature parameters are listed in Table 5.

TABLE 5. Formulas of the Time-Domain Feature Extraction Parameters

We added all these features into a feature pool as candidate parameters and then used certain rules to select the feature subsets that yielded the best performance (e.g., the highest fault recognition rate or the fastest detection speed). These feature subsets contain different features derived from different signals and of different dimensions; thus, different options will affect the accuracy and efficiency of the fault diagnosis process. Selecting the optimal feature subset has always been a concern of scholars.
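Since the exact formulas of Table 5 are not reproduced in this text, the following sketch assumes the standard definitions of the five time-domain parameters; the function name is illustrative, not the authors' implementation.

```python
import numpy as np

def time_domain_features(x):
    """Compute the five candidate time-domain features for one signal window.

    Standard definitions are assumed here (Table 5 of the paper lists the
    exact formulas): kurtosis, skewness, RMS, crest factor, form factor.
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    std = x.std()                               # population standard deviation
    rms = np.sqrt(np.mean(x ** 2))              # root mean square
    ku = np.mean((x - mean) ** 4) / std ** 4    # kurtosis (non-excess)
    sk = np.mean((x - mean) ** 3) / std ** 3    # skewness
    cf = np.max(np.abs(x)) / rms                # crest factor: peak / RMS
    ff = rms / np.mean(np.abs(x))               # form factor: RMS / mean |x|
    return {"Ku": ku, "Sk": sk, "RMS": rms, "Cf": cf, "Ff": ff}
```

For a pure sine wave, this yields the textbook values (crest factor √2 ≈ 1.414, form factor ≈ 1.111), which is a quick sanity check for the implementation.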

B. DGUFS-RFFS Feature Selection Method

1) Hybrid Filter-Wrapper Methods

Feature selection is crucial for high-dimensional data classification problems. Dash and Liu proposed the basic framework of feature selection in 1997, which consisted of the following four parts: the generation of feature subsets, the evaluation of feature subsets, the stopping criteria and the verification of the results, as shown in Fig. 3 [29].

FIGURE 3. The basic framework of feature selection.

The commonly used feature subset generation methods include filters and wrappers. A filter is independent of the subsequent learning algorithms, uses evaluation criteria or evaluation functions to enhance the correlations among certain features and categories and reduces the correlations among other features. This approach has been widely used due to its fast processing speed and high computing efficiency. Unlike filters, wrappers use the accuracy of the subsequent classifiers as an evaluation index and are included in the learning algorithm. The feature subset selected by a wrapper is relatively small in size and high in prediction accuracy, although the algorithm has high complexity and low execution efficiency.

Combining the efficiency of a filter with the high accuracy of a wrapper can yield complementary advantages. Key feature recognition effectively reduces the dimensionality of the feature space while maintaining the accuracy of subsequent classification algorithms. The flowchart is shown in Fig. 4.

FIGURE 4. The hybrid filter-wrapper method.

2) DGUFS Filter

Suppose that the feature pool consists of d candidate features. The commonly used hybrid filter-wrapper methods mostly evaluate the importance of all d features based on certain criteria and then sort and select m prominent features from the d candidates to obtain the best classification results; this approach inevitably tends to fall into a suboptimal solution.

To solve this problem, a DGUFS method is used as a filter to select features. The aim of the algorithm is to directly select the most discriminative feature subset from d features for any given m instead of evaluating the importance of all d features. The selection based on this algorithm is targeted, which not only reduces the required computations but also enhances the recognition ability. The objective function is designed by combining two weighted dependence-guided terms to achieve the maximum similarity among the original data \boldsymbol {X} , the cluster label matrix \boldsymbol {V} and the selected feature \boldsymbol {Y} .

The overall DGUFS model can be expressed as:\begin{align*}&\mathop {\min }\limits _{\boldsymbol {Y},\,\boldsymbol {L}} - \beta Tr\left ({\boldsymbol {S}^{T}\boldsymbol {L}}\right) - \left ({1 - \beta }\right)Tr\left ({\boldsymbol {Y}^{T}\boldsymbol {Y}\boldsymbol {H}\boldsymbol {V}^{T}\boldsymbol {V}\boldsymbol {H}}\right) \\&s.t.~{\left \|{\boldsymbol {X} - \boldsymbol {Y}}\right \|_{2,0}} = d - m,\quad {\left \|{\boldsymbol {Y}}\right \|_{2,0}} = m, \\&\hphantom {s.t.~} \boldsymbol {V} \in \boldsymbol {\Omega },\quad \boldsymbol {L} = \boldsymbol {V}^{T}\boldsymbol {V},\quad rank(\boldsymbol {L}) = c, \\&\hphantom {s.t.~} \boldsymbol {L} \succeq 0,\quad \boldsymbol {L} \in \{0,1\}^{n \times n},\quad diag(\boldsymbol {L}) = \boldsymbol {I}.\tag{1}\end{align*}
where \boldsymbol {X} is a d \times n original data matrix with n samples, \boldsymbol {V} is a c \times n cluster label matrix of \boldsymbol {X} , c is the number of clusters, \boldsymbol {Y} is the selected feature matrix, \boldsymbol {\Omega } is the candidate set of cluster label matrices that classify the data into c groups, \boldsymbol {L} is the linear kernel matrix of \boldsymbol {V} , \boldsymbol {S} is the similarity matrix, \boldsymbol {H} = \boldsymbol {I} - \frac {1}{n}\boldsymbol {e}\boldsymbol {e}^{T} is the centering matrix, and \boldsymbol {e} is an n-dimensional column vector of ones. Tr denotes the trace, and \beta \in \left ({0,1}\right) is a regularization parameter. Additionally, \boldsymbol {L}, \boldsymbol {I}, \boldsymbol {H} \in \boldsymbol {R}^{n \times n} .

In the objective function, the first dependence-guided term -Tr\left ({\boldsymbol {S}^{T}\boldsymbol {L}}\right) is used to increase the dependence of the desired label matrix \boldsymbol {L} on the original data \boldsymbol {X} and is designed based on the geometrical structure of the data and the associated discriminative information; the second dependence-guided term -Tr\left ({\boldsymbol {Y}^{T}\boldsymbol {Y}\boldsymbol {H}\boldsymbol {V}^{T}\boldsymbol {V}\boldsymbol {H}}\right) is designed based on the matrix form of the Hilbert-Schmidt independence criterion (HSIC) to enhance the dependence of the selected features \boldsymbol {Y} on the desired label matrix \boldsymbol {V} . Both terms are built on the trace operator Tr , and the method imposes l_{2,0} -norm equality constraints, thereby effectively avoiding overfitting problems.

The DGUFS method increases the interdependence among selected features, raw data and cluster labels. m is considered in the process of obtaining the optimal subset. The features are no longer evaluated separately, and the subsets are treated as a whole considering the correlations and redundancy among features. The algorithm outperforms many of the leading methods of sparse learning-based unsupervised feature selection.
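For illustration only, the two dependence-guided terms of (1) can be evaluated for a fixed candidate selection as below. The helper `dgufs_objective` is a hypothetical sketch that merely scores a given subset; it does not reproduce the paper's iterative solver for the full l2,0-constrained optimization.

```python
import numpy as np

def dgufs_objective(X, sel, V, S, beta=0.5):
    """Score one candidate feature selection under the objective of Eq. (1).

    X    : d x n data matrix (d candidate features, n samples)
    sel  : boolean mask of length d marking the m selected features
    V    : c x n cluster label matrix
    S    : n x n similarity matrix
    beta : regularization parameter in (0, 1)
    """
    n = X.shape[1]
    Y = np.where(sel[:, None], X, 0.0)        # keep selected rows, zero the rest
    L = V.T @ V                               # linear kernel matrix of the labels
    H = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    term1 = -beta * np.trace(S.T @ L)                          # label/data dependence
    term2 = -(1 - beta) * np.trace(Y.T @ Y @ H @ V.T @ V @ H)  # HSIC-style term
    return term1 + term2
```

On a toy two-cluster data set, a feature that separates the clusters produces a lower (better) objective value than an uninformative one, which is exactly the behavior the two trace terms are designed to reward.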

3) RFFS Wrapper

The heuristic search strategy is one of the main wrapper strategies; such methods include the individual optimal feature search strategy, the sequence forward selection method, the sequence backward selection method, etc. [30], [31]. Individual optimal feature search strategies have the advantages of low time complexity and high operational efficiency, and they are widely used for high-dimensional data sets. The sequence forward selection method is efficient, but the correlations among the features to be added and the selected feature set are not considered. Once the features are added, they will not be deleted, which will result in redundancy in the feature subset. The sequence backward selection method involves an elimination algorithm based on a complete feature set, which requires many computations. However, this approach has displayed good performance in practical applications because it considers the redundancy among features.

In this paper, an individual optimal feature selection wrapper based on the random forest (RF) approach is presented in combination with DGUFS to evaluate and compare the optimal feature subsets and acquire the final feature subset with the best recognition performance.

The RF, which was proposed by Leo Breiman in 2001, is an ensemble classifier composed of a set of decision tree classifiers h(\boldsymbol {X},\,\,{\theta _{k}}),\,\,k = 1,\,\,2,\,\,\ldots,\,\,K , where the \theta _{k} are independent identically distributed random vectors and K represents the number of decision trees in the RF [32]. With a given independent variable \boldsymbol {X} , the RF determines the final classification result by the votes of the decision tree classifiers.

Given a set of classifiers {h_{1}}(\boldsymbol {x}),{h_{2}}(\boldsymbol {x}), \ldots,{h_{K}}(\boldsymbol {x}) , where the training set of each classifier is randomly sampled from the distribution of the random vectors Y , \boldsymbol {X} , the margin function is defined as:\begin{equation*} mg({\boldsymbol {X}},~Y) = a{v_{k}}I({h_{k}}({\boldsymbol {X}}) = Y) - \max \limits _{j \ne Y} a{v_{k}}I({h_{k}}({\boldsymbol {X}}) = j)\tag{2}\end{equation*}
where I(\cdot) is an indicator function and av_{k} denotes the average over the K classifiers.

The margin function is used to measure the degree to which the average number of correct classifications exceeds the average number of incorrect classifications. The larger the margin is, the more reliable the classification prediction.
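A minimal sketch of the margin of Eq. (2) for a single sample, given the votes of the K trees; the function name is illustrative:

```python
import numpy as np

def margin(votes, y):
    """Margin mg(X, Y) of Eq. (2) for one sample.

    votes : class predicted by each of the K decision trees for this sample
    y     : the true class label
    Returns the average vote for the correct class minus the largest
    average vote for any incorrect class.
    """
    votes = np.asarray(votes)
    classes = np.unique(np.append(votes, y))
    av = {c: np.mean(votes == c) for c in classes}   # av_k I(h_k = c)
    correct = av.get(y, 0.0)
    wrong = max((v for c, v in av.items() if c != y), default=0.0)
    return correct - wrong
```

For example, with tree votes [1, 1, 1, 2, 3] and true label 1, the margin is 0.6 - 0.2 = 0.4: the prediction is correct and moderately reliable.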

The generalization error is defined as:\begin{equation*} P{E^{*}} = {P_{\boldsymbol {X},~Y}}(mg({\boldsymbol {X}},~Y) < 0)\tag{3}\end{equation*}
where the probability {P_{\boldsymbol {X},Y}} is taken over the \boldsymbol {X} , Y space.

In an RF, when there are sufficient decision tree classifiers, {h_{k}}(\boldsymbol {X}) = h(\boldsymbol {X},{\theta _{k}}) obeys the strong law of large numbers.

As the number of decision trees in the RF increases, for almost all sequences {\theta _{1}},{\theta _{2}}, \ldots , P{E^{*}} converges to:\begin{equation*} {P_{{\boldsymbol {X}},Y}}({P_\theta }(h({\boldsymbol {X}},\theta) = Y) - \max \limits _{j \ne Y} {P_\theta }(h({\boldsymbol {X}},\theta) = j) < 0)\tag{4}\end{equation*}

Formula (4) shows that RFs do not suffer from overfitting as the number of decision trees increases; instead, the generalization error converges to a limiting value.

The base classifier in the RF method proposed in this paper is the classification and regression tree (CART) algorithm. Assuming that the selected feature dimension is m , the complete feature set dimension is d , the number of training samples is n , and the time complexity is represented by O(\cdot) , the time complexity of the RF algorithm can be approximated as O\left ({mn\log ^{2}n}\right) . In our experiment, the RFFS needs to be executed d times, and the total time complexity of the algorithm can be approximated as O\left ({dmn\log ^{2}n}\right) . The time complexity of the RFFS algorithm is thus approximately linear in m and d and proportional to n\log ^{2}n .

4) The DGUFS-RFFS Method

In this paper, the DGUFS method is used as a filter to determine the optimal subset in each feature dimension, and the RFFS wrapper is then used to determine the dimension of the feature subset that yields the optimal solution. By combining the wrapper with the DGUFS algorithm, the RFFS wrapper only needs to implement an individual optimal selection strategy for the feature subset of each dimension m instead of traversing all C_{d}^{m} combinations within each dimension, thus considerably reducing the computational overhead. The complexity of the algorithm is effectively reduced in this approach. The feature subset obtained by the proposed DGUFS-RFFS method can achieve the optimal prediction effect within a given range of feature dimensions. The accuracy, detection efficiency and computational cost can be well balanced and flexibly adjusted according to demand.
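The hybrid loop can be sketched as follows, assuming `filter_select` stands in for the DGUFS filter (returning the best m-feature subset) and `evaluate` for the wrapper's RF accuracy estimate; both callbacks are hypothetical placeholders, not the paper's code:

```python
def dgufs_rffs(filter_select, evaluate, d):
    """Hybrid filter-wrapper selection sketch.

    filter_select(m) -> the m-feature subset chosen by the DGUFS filter
    evaluate(subset) -> validation accuracy of an RF trained on that subset
    d                -> total number of candidate features

    Only d candidate subsets (one per dimension m) are scored by the
    wrapper, instead of the C(d, m) exhaustive search per dimension.
    """
    best_subset, best_acc = None, float("-inf")
    for m in range(1, d + 1):
        subset = filter_select(m)      # filter stage: one subset per dimension
        acc = evaluate(subset)         # wrapper stage: score it with the RF
        if acc > best_acc:
            best_subset, best_acc = subset, acc
    return best_subset, best_acc
```

In practice `evaluate` would train, e.g., a scikit-learn `RandomForestClassifier` on the selected columns and return its validation accuracy; the loop itself is agnostic to that choice.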

SECTION IV.

Fault Detection With the Five-Level NNPP Converter

A. The Framework of Open-Circuit Fault Diagnosis Using DGUFS-RFFS

A general framework for fault identification is proposed, and it includes five modules, namely, original signal acquisition, signal preprocessing, feature extraction, feature selection and fault classification. Signal preprocessing includes sampling and abnormal sample cleaning.

The current of the flying capacitors I_{Cfa1} and I_{Cfa2} , the phase current I_{a} and the phase voltage V_{ao} are monitored and used for fault detection. The information contained in the signals above can instantly reflect changes in the modulation strategy and operating state; thus, these parameters are effective indicators for fault diagnosis. Five feature parameters, including Ku , Sk , RMS , Cf and Ff , are extracted to form a feature pool with 20 features.
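The 4 signals × 5 features layout of the pool can be sketched as below; the standard time-domain definitions are assumed (the paper's exact formulas are in Table 5), and the key names are illustrative.

```python
import numpy as np

SIGNALS = ["I_Cfa1", "I_Cfa2", "I_a", "V_ao"]   # the four monitored signals
FEATURES = ["Ku", "Sk", "RMS", "Cf", "Ff"]       # the five extracted parameters

def feature_pool(window):
    """Build the 20-dimensional feature vector for one sampling window.

    `window` maps each monitored signal name to its sampled values;
    standard definitions of the five features are assumed.
    """
    row = {}
    for sig in SIGNALS:
        x = np.asarray(window[sig], dtype=float)
        mu, std = x.mean(), x.std()
        rms = np.sqrt(np.mean(x ** 2))
        row[f"{sig}_Ku"] = np.mean((x - mu) ** 4) / std ** 4
        row[f"{sig}_Sk"] = np.mean((x - mu) ** 3) / std ** 3
        row[f"{sig}_RMS"] = rms
        row[f"{sig}_Cf"] = np.max(np.abs(x)) / rms
        row[f"{sig}_Ff"] = rms / np.mean(np.abs(x))
    return row   # 4 signals x 5 features = 20 candidate features
```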

The DGUFS-RFFS method is used for feature selection, and two other feature selection methods are explored for comparison. MCFS (multicluster feature selection) is a popular unsupervised learning algorithm with excellent classification ability based on manifold and l_{1} -norm regularization models. The method uses the l_{1} -norm to measure the ability of each feature to distinguish different categories, and the selection can maintain the various clustering structures of the data. The manual selection method is based on enumeration. Referring to the experience of experts and analyses of actual situations, the feature subset with the best possible classification effect is selected through hundreds of experiments. Although this method is time consuming and labor intensive, it is widely used in practice and can achieve good fault recognition results; thus, it is superior to many other prevailing algorithms.

The features selected by the above methods are input into the RF classifier for fault recognition. The DGUFS, MCFS and manual selection methods are compared in terms of the fault recognition rate, training time and testing time and are further analyzed based on evaluation indexes.

The flowchart of the complete solution is shown in Fig. 5.

FIGURE 5. Flowchart of the complete solution for fault diagnosis.

B. Implementation and Simulation

1) Simulation of an Open-Circuit Fault for a Five-Level NNPP Converter

The simulation is conducted using MATLAB 2018a, and four failure modes are selected as representatives to simulate the current and voltage waveform changes in four failure states. The four modes are as follows: (a) S_{a11} fails to open, (b) S_{a12} fails to open, (c) S_{a11} and S_{a12} fail to open, and (d) S_{a13} and S_{a14} fail to open. Fig. 6 displays the simulation results for the waveforms of I_{Cfa1} , I_{Cfa2} , I_{a} and V_{ao} under the four faulty circumstances above. Failures occur at T = 0.2 s.

FIGURE 6. Waveforms of I_{Cfa1} , I_{Cfa2} , I_{a} and V_{ao} under open-circuit faults in the five-level NNPP converter: (a) S_{a11} fails to open. (b) S_{a12} fails to open. (c) S_{a11} and S_{a12} fail to open. (d) S_{a13} and S_{a14} fail to open.

2) Analysis of Feature Extraction and Selection

After the faulty signals are obtained, time-domain feature extraction and dimensionality reduction are performed. Taking Sk , Ku and Cf as examples, Fig. 7 shows the spatial distributions of the three feature parameters extracted from the four original signals I_{Cfa1} , I_{Cfa2} , I_{a} and V_{ao} , covering nine operating states: one fault-free state (represented as OK ) and the eight faulty states in which a single IGBT (from S_{a11} to S_{a24} , represented as B1 to B8 ) fails to open.

FIGURE 7. Spatial distributions of Sk , Ku and Cf extracted from the original signals: (a) Features of I_{Cfa1} . (b) Features of I_{Cfa2} . (c) Features of I_{a} . (d) Features of V_{ao} .

The above spatial distributions are only a few representatives of many combinations of feature parameters. The spatial distribution and concentration of different features of the same signal exhibit differences. Different feature combinations will have different effects on fault identification.

SECTION V.

Experimental Results and Discussion

A. Experimental Setup

An experimental prototype of the five-level NNPP converter was built, as shown in Fig. 8. The semiconductor power switches were Infineon FF100R12RT4 modules. The system was controlled by a TI TMS320F28335 digital signal processor (DSP) and an ACTEL A3P250 field-programmable gate array (FPGA). The device parameters of the converter are listed in Table 6.

TABLE 6 Device Parameters of the Converter
FIGURE 8. Experimental prototype of the five-level NNPP converter.

Fig. 9 shows the output waveforms of phase A when the modulation ratio M = 0.9: the line voltage V_{ab} , the phase voltage V_{ao} , the DC link voltages U_{C1} and U_{C2} , the flying capacitor voltages U_{Cfa1} and U_{Cfa2} , and the phase current I_{a} . The DC link voltages and the flying capacitor voltages are balanced by the control strategies proposed in [25], [26].

FIGURE 9. The output waveforms of the five-level NNPP converter under normal conditions when M = 0.9: (a) V_{ab} , U_{C1} , U_{C2} and I_{a} . (b) V_{ao} , U_{Cfa1} , U_{Cfa2} and I_{a} .

B. Experimental Design and Implementation

The signals of the open-circuit faults of the NNPP converter are acquired, covering 1 fault-free state, 8 single-IGBT failures and 28 double-IGBT failures. The frequency of the output voltage is f = 50 Hz. The raw sampled signals comprise 480,000 values used for training and 80,000 values used for testing. Each feature extraction is based on 4000 sampled values, which yields a training feature set with a maximum dimension of 120 \times m and a test feature set with a dimension of 20 \times m , where m is the number of selected features (m = 1, 2, \ldots, 20) .

Table 7 shows the experimental results when the training feature set dimension is 40 \times m , including the fault recognition accuracy, training time and testing time. With the DGUFS-RFFS method, the accuracy increases with the number of features and peaks when the feature number reaches 7. The fault recognition rate then remains relatively stable as the number of features continues to increase and finally decreases once the number exceeds 10, mainly because redundant information is introduced and the model overfits. The arithmetic means and root mean square errors (RMSE) of the results were calculated over 20 repeated experiments. The MCFS-based and manual feature selection methods are used for comparison with the DGUFS. The experimental results show that the hybrid DGUFS-RFFS method is generally superior to the other two methods, with higher fault recognition rates and, in most cases, shorter training and testing times. For the three methods, the numbers of features required to reach the highest accuracy are 7, 11 and 4, with accuracies of 96.02 ± 0.98%, 86.02 ± 1.60%, and 88.39 ± 0.47%, respectively.
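The excerpt does not spell out the RFFS wrapper line by line; the sketch below is one plausible scikit-learn realization, in which a random forest ranks the candidate features handed over by the filter, the top m are kept, and the subset is scored by cross-validation. The function name, the synthetic data and the ranking-by-importance strategy are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def rffs_wrapper(X, y, candidate_idx, m, seed=0):
    """Rank candidate features by random forest importance, keep the top m,
    and report the cross-validated accuracy of an RF on that subset."""
    rf = RandomForestClassifier(n_estimators=100, random_state=seed)
    rf.fit(X[:, candidate_idx], y)
    order = np.argsort(rf.feature_importances_)[::-1]      # best first
    selected = [candidate_idx[i] for i in order[:m]]
    score = cross_val_score(
        RandomForestClassifier(n_estimators=100, random_state=seed),
        X[:, selected], y, cv=5).mean()
    return selected, score

# Synthetic stand-in data: 20 candidate features, only the first two carry
# class information, mimicking a pool with many redundant features.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
selected, score = rffs_wrapper(X, y, list(range(20)), m=7)
```

Keeping m small, as the paper's peak at 7 features suggests, limits the redundant columns the classifier must cope with while retaining the informative ones.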

TABLE 7 Comparison of Experimental Results for Three Different Feature Selection Methods

Fig. 10 and Fig. 11 show the relationships among the number of selected features, the training data set size and the fault detection accuracy for the three feature selection methods. Notably, for the same m, the fault detection accuracy of the DGUFS-RFFS method is significantly higher than that of the MCFS-based and manual methods. Figs. 12 and 13 illustrate the relationships among the number of selected features, the training data set size and the training time. The training speed is not sensitive to the number of features but is greatly influenced by the training data set size. For the same m, the training time of the DGUFS-RFFS method is slightly shorter than that of the other two methods.

FIGURE 10. Contour map of the relationship among the fault detection accuracy, the number of selected features and the size of the training data set: (a) DGUFS-RFFS method. (b) MCFS-RFFS method. (c) Manual-RFFS method.

FIGURE 11. 3D mesh grid of the relationship among the fault detection accuracy, the number of selected features and the size of the training data set: (a) DGUFS-RFFS method. (b) MCFS-RFFS method. (c) Manual-RFFS method.

FIGURE 12. Contour map of the relationship among the training time, the number of selected features and the size of the training data set: (a) DGUFS-RFFS method. (b) MCFS-RFFS method. (c) Manual-RFFS method.

FIGURE 13. 3D mesh grid of the relationship among the training time, the number of selected features and the size of the training data set: (a) DGUFS-RFFS method. (b) MCFS-RFFS method. (c) Manual-RFFS method.

Table 8 lists the optimal feature subsets selected by the three methods when the number of features is 7 and 8. The raw signals and feature parameters can be chosen according to different situations; in this paper, a feature pool with 20 candidate features is constructed from 4 circuit signals and 5 feature parameters. The DGUFS-RFFS method greatly shortens the computation and makes numerous attempts based on multiple features practical. Its advantage becomes clearer as the numbers of candidate signals and features grow: the approach not only greatly reduces the computational cost but also provides an improved and stable fault recognition rate compared with other state-of-the-art algorithms. The proposed solution also offers good extensibility and generalization ability. Although the DGUFS-RFFS method achieves superior performance, this does not mean that no better choice exists. By selecting additional circuit signals or extracting other features, candidate feature parameters can be added to the feature pool to provide more alternatives for fault recognition; this will be explored in future work.
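A minimal sketch of such a 4 × 5 = 20-column feature pool is shown below. Sk, Ku and Cf are taken from the text (assumed to mean skewness, kurtosis and crest factor); the remaining two parameters are not named in this excerpt, so RMS and peak-to-peak amplitude are used purely as stand-ins.

```python
import numpy as np
from scipy.stats import skew, kurtosis

SIGNALS = ["I_Cfa1", "I_Cfa2", "I_a", "V_ao"]    # the 4 monitored signals
FEATURES = {                                      # 5 feature parameters
    "Sk": lambda x: skew(x),
    "Ku": lambda x: kurtosis(x, fisher=False),
    "Cf": lambda x: np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2)),
    "RMS": lambda x: np.sqrt(np.mean(x ** 2)),    # stand-in parameter
    "PP": lambda x: np.ptp(x),                    # stand-in parameter
}

def feature_pool_row(windows):
    """Map one window (dict: signal name -> 4000 samples) to a 20-element
    candidate feature vector; columns are ordered signal-major."""
    return np.array([f(windows[s]) for s in SIGNALS
                     for f in FEATURES.values()])

# One synthetic window per signal, standing in for real measurements.
rng = np.random.default_rng(1)
row = feature_pool_row({s: rng.normal(size=4000) for s in SIGNALS})
```

Adding a signal or a feature parameter only appends columns to this pool, which is why enlarging the candidate set does not restructure the rest of the pipeline.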

TABLE 8 Optimum Feature Subsets Based on Different Feature Selection Methods

Table 9 lists two important evaluation indexes for clustering: the normalized mutual information (NMI) and the adjusted Rand index (ARI). Fig. 14 compares the NMI and ARI for the three feature selectors. When the number of features is small, the indexes based on the MCFS-RFFS and Manual-RFFS methods are generally smaller than those based on DGUFS-RFFS, and as the number of features increases, their convergence towards 1 is also slower. The NMI and ARI indicators based on the DGUFS-RFFS method are generally stable and close to 1, indicating a satisfactory clustering effect.
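Both indexes are available off the shelf in scikit-learn. The toy labels below (not from the paper) illustrate how a single misassigned window pulls both indexes slightly below 1, while perfect agreement yields exactly 1.

```python
from sklearn.metrics import normalized_mutual_info_score, adjusted_rand_score

# Hypothetical ground-truth vs. predicted fault labels for ten windows.
y_true = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
y_pred = [0, 0, 1, 1, 2, 2, 3, 0, 4, 4]   # one window misassigned

nmi = normalized_mutual_info_score(y_true, y_pred)
ari = adjusted_rand_score(y_true, y_pred)

# Perfect agreement drives both indexes to 1.
nmi_perfect = normalized_mutual_info_score(y_true, y_true)
```

Unlike raw accuracy, both measures are invariant to label permutations, which makes them suitable for judging how cleanly a selected feature subset separates the fault classes.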

TABLE 9 Comparison of Evaluation Indicators for Three Different Feature Selection Methods
FIGURE 14. Comparison of the NMI and ARI for three feature selection methods: (a) NMI relative to the selected feature number. (b) ARI relative to the selected feature number. (c) NMI relative to the training data set size. (d) ARI relative to the training data set size.

SECTION VI.

Conclusion

In this paper, a complete fault detection solution for a five-level NNPP converter is proposed. Through signal acquisition, feature extraction, feature selection and classification, 36 open-circuit faults are effectively diagnosed. The method does not rely on past experience and does not require massive amounts of training data: failures can be detected from limited data sets through signal processing and feature selection, which greatly shortens the time for fault diagnosis. The solution is flexible and can be readily applied to other types of converters; it is not limited by the topology, the number of levels or the device parameters of a given system. Different circuit signals and feature parameters can be selected according to the specific circumstances. A remarkable advantage of the solution is that increasing the number of candidate features does not incur considerable computational complexity but provides more choices for optimal feature subset selection. The optimal feature subset of a given dimension can be directly determined by the DGUFS filter, reducing the computational overhead of the subsequent RFFS wrapper. Compared with other leading feature selection algorithms, the proposed DGUFS-RFFS feature selector exhibits better recognition performance and efficiency. The effectiveness and practicality of the solution are verified by simulations and experiments.
