Journals & Magazines >IEEE Access >Volume: 11

A New Method for Fault Diagnosis of Hydraulic System Based on Improved Empirical Wavelet Transform and Kernel Extreme Learning Machine

This graphical abstract is a flowchart of a new method proposed in this study for fault detection in hydraulic systems. Firstly, the acquired raw signal is subjected to a...

Abstract:

To address the problem that multivariable faults in hydraulic systems are difficult to diagnose accurately in a short time, a new signal processing method combining an im...Show More

Metadata

Abstract:

To address the problem that multivariable faults in hydraulic systems are difficult to diagnose accurately in a short time, a new signal processing method combining an improved empirical wavelet transform with a kernel extreme learning machine optimized by the Pelican optimization algorithm is proposed. First, the problem of unreasonable spectral partitioning of the empirical wavelet transform is solved using an improved k-means clustering method, which upgrades the empirical wavelet transform in the maximum scale space and then adaptively decomposes the acquired pressure signal to obtain a series of sub-signal components, thus reducing the dimensionality of multivariate faults and saving computation time. Second, 17 features of each sub-signal component are calculated and input to the kernel extreme learning machine for training, and the sequence-forward selection strategy is used to select the optimal features from the kernel extreme learning machine to ensure the basic accuracy and efficiency of prediction. Finally, a Pelican optimization algorithm improved kernel extreme learning machine algorithm is proposed for fast classification of faults. Experiments show that the method can diagnose a variety of faults in hydraulic systems quickly and accurately, with a diagnostic accuracy of 97.24 percent, which is better than other methods.

This graphical abstract is a flowchart of a new method proposed in this study for fault detection in hydraulic systems. Firstly, the acquired raw signal is subjected to a...

Published in: IEEE Access ( Volume: 11)

Page(s): 92135 - 92149

Date of Publication: 26 June 2023

Electronic ISSN: 2169-3536

DOI: 10.1109/ACCESS.2023.3289471

Funding Agency:

Contents

SECTION I.

Introduction

Hydraulic transmission systems are widely used in various fields because of their ability to produce large forces or torques and their ease of control. Some scholars have conducted statistical analyzes on the location of failures of machinery and equipment containing hydraulic systems and found that most failures of machinery and equipment are caused by hydraulic systems [1], [2], [3]. The operation of the hydraulic system is so frequent that it is easy to break down, but if the fault location is not found in time and corresponding measures are not taken, it will cause serious economic losses at least and cause personal safety problems at worst [4]. Therefore, it is necessary to perform fault diagnosis for hydraulic systems so that the fault location can be quickly located and the type of fault can be determined to avoid unforeseen consequences.

In general, fault diagnosis is divided into analytical model-based methods and data-driven methods [5]. However, many components in hydraulic systems are complex rotating machines, and it is difficult to build fault models. Thanks to the improvement of industrial sensor accuracy and the speed of computer computation, data-driven hydraulic system fault diagnosis has become mainstream [6].

Popular data-driven fault diagnosis includes fault diagnosis methods based on artificial intelligence [7], [8], [9], [10], [11], [12] and signal processing [13], [14]. Both methods can also be combined for fault diagnosis [15], [16]. Artificial intelligence fault diagnosis methods are preferred by researchers because of their high computational accuracy and the fact that they rarely require expertise in fault objects. However, artificial intelligence diagnosis has a complex structure and is inefficient when diagnosing multiple faults. To solve this problem, different signal preprocessing methods combined with convolutional neural networks were utilized in the literature [17] and [18], respectively, for fault diagnosis of rotating machinery and to enhance the efficiency of fault diagnosis. Although this approach may not be directly applicable to non-rotating machinery, it demonstrates that signal preprocessing can improve the efficiency of fault diagnosis. Therefore, it is especially important to choose the appropriate signal processing algorithm.

In 1807, the Fourier transform was introduced, which greatly accelerated the development of the field of digital signal processing. The wavelet transform is a multi-resolution signal analysis method based on the Fourier transform, which can achieve the decomposition and reconstruction of a discrete-time signal through the fusion of approximate and detailed components [19]. Unfortunately, the choice of the wavelet transform basis function is determined by subjective experience, which is very unfriendly to inexperienced stakeholders. In addition, the wavelet transform produces great frequency leakage when dealing with complex signals, which makes it impossible to use the wavelet transform for hydraulic system fault diagnosis [20].

Empirical Modal Decomposition (EMD) is another classical signal processing method. EMD can adaptively decompose the vibration signal into many kinds of Intrinsic Mode Functions (IMFs), which solves the problem of empirically selecting appropriate wavelet bases for the wavelet transform. After the IMFs are merged, the noise interference existing in the vibration signal can be eliminated [21]. Yang et al. [22] achieved their purpose of reducing the noise within the signal through multipoint data fusion. EMD can decompose the signal according to the temporal characteristics of the processed object, so there is no need to set the basis function, and there is no problem similar to the wavelet transform. However, modal blending and breakpoint effects are very common in EMD, and it has no theoretical basis to support them.

The Empirical Wavelet Transform (EWT), which is similar in composition to the wavelet transform, was mentioned by Gilles in 2013 and is an emerging method for time-frequency analysis with the advantage that EMD can automatically adapt to the signal and has a rigorous theoretical basis [23]. EWT is widely used in the field of signal processing [24], [25], [26]. For the process of EWT analysis of vibration signals, the selection of meaningful boundaries is the key to the success of signal decomposition [27]. However, most studies on EWT use local maxima or local minima in the selection of boundaries, which has a large error in dealing with complex nonlinear nonsmooth signals because, with noisy signals, the amplitude of many noises has reached the same height as normal signals, so some of the boundaries determined by the minima are actually due to noise [28].

To overcome these problems of conventional EWT, Pan et al. [29] used the Gaussian kernel inner product to remove the noise from the target signal and thus decompose the normal signal in the target signal. Kong et al. [30] decomposed the fault signal under strong noise using the phenomenon of meshing resonance. Kim et al. [31] used the inverse-spectrum-assisted EWT to make the spectrum of the target signal more smooth, together with the Hilbert transform to average the envelope spectrum to obtain the results of gear faults. The order spectrum coherence is combined with historical data obtained from healthy machines to obtain the anomaly envelope spectrum, which is further processed for fault diagnosis. This anomaly envelope spectrum is further processed by smoothing operations to perform not only automatic fault detection but also to identify damaged components [32]. Kedadouche et al. [33] used operational modal analysis to determine the stable frequency of the signal, calculate the support boundary, apply the scaling function and wavelet function corresponding to each detection segment, and filter the signal with the constructed filter bank to obtain the IMF. The approach is then roughly the same as for EMD. Like Gilles to calculate the scale space [27], Zhao et al. [34] used several maximum lengths of the scale space as the boundaries for dividing the spectrum, obtained a series of IMF components after reconstructing the signal, and performed power spectrum analysis on these components to derive the fault frequency. Zhang et al. [35] used the Power Spectral Density (PSD) instead of the Fourier spectrum, and since the extreme value of the PSD points is smaller, there is no need to calculate the complex scale space; instead, the minimal value of PSD is used as the boundary to divide the spectrum to obtain the components of the original signal. Zhang et al. [36] proposed a variable spectrum splitting EWT to estimate the modes using multi-taper power spectral density, and finally a set of boundaries associated with the spectrum fluctuations to obtain the components of the original signal. Ding [37] obtained the upper and lower boundaries of the sidebands by searching left and right bidirectionally at the center of the spectrum and merged the upper and lower boundaries of the sidebands as the boundaries of the spectrum segmentation to achieve the fault detection of bearings. Zheng et al. [38] replaced the Fourier amplitude spectrum with the power spectrum to verify the effectiveness of the method by decomposing the fault signal of the hydraulic pump slide valve loosening. However, these studies seem to avoid the spectrum division of EWT, and do not fundamentally solve the problem of EWT, i.e., the unreasonable spectrum division. To solve this problem, Yu et al. [39] used the DBSCAN clustering method to cluster the largest scale space to obtain the boundary of the spectrum, and the decomposition of the vibration signals of the hydraulic pump in three directions proved the effectiveness of the method in the detection of weak faults in the pump. The DBSCAN method does have a more significant improvement for EWT of vibration signals. However, we found through experiments that this method seems to fail for components in hydraulic systems where it is difficult to detect vibration signals.

In a hydraulic circuit, the pressure signal and the flow signal are easily detected for any component through which hydraulic fluid passes. The pressure and flow signals of hydraulic components contain a great deal of information, and the fault information is only the tip of the iceberg. The feature extractor can reflect the features of the fault information from many features [40], [41]. Li et al. [42] first conducted an empirical determination of the number of signal EWT decompositions. Subsequently, they selected the component with the highest energy intensity and utilized the inverse dispersion entropy features of this component as the features of the original signal for efficient feature extraction purposes. Lu et al. [43] took the first n-order components with high correlation coefficients between the decomposed components and the original signal to construct fused feature vectors, which were used to build a two-level diagnostic model based on the salp swarm algorithm optimized Kernel Extreme Learning Machine (KELM) to identify normal and abnormal states and the categories under the abnormal states. Ding et al. [44] used EWT to decompose the hydraulic pump vibration signal, used principal component analysis to reduce the dimensionality of the feature extraction of the decomposed signal, and finally input the feature vector containing the fault features into the Extreme Learning Machine (ELM) to obtain the fault classification results. Liu et al. [45] developed a novel personalized diagnosis method for gear fault detection using numerical simulation and ELM algorithm, which can diagnose the health condition of gears by extracting features and training ELM model based on large amount of vibration signals.

Many studies have been conducted at home and abroad for the fault diagnosis of vibration signals of hydraulic components, and vibration analysis is the most effective condition-monitoring technique for rotating systems [46]. It is very effective for fault detection of bearings, shafts, hydraulic pumps, hydraulic motors, etc. [47], [48]. However, many hydraulic components are non-rotating components and we cannot or have difficulty in measuring their vibration signals. In contrast, the pressure signals of the hydraulic system are easier to measure. According to our experiments, it is verified that the pressure signals can be applied to the fault diagnosis of the hydraulic system.

This paper proposes to combine the improved EWT and the Pelican Optimization Algorithm (POA) optimized KELM to preprocess the pressure signals to diagnose multivariable faults in hydraulic systems. The implementation of the proposed method in this study is shown in Fig. 1. The article makes significant contributions in the following ways:

By using the light-k-means clustering algorithm, the problem of unreasonable spectral division of pressure signals when performing EWT is solved.
Seventeen parameters are proposed to construct the feature pool, and a Sequence Forward Selection (SFS) strategy is used to select the features with the optimal ability to distinguish faults.
Optimizing the kernel parameters and regularization coefficients of KELM using POA has better fault classification results.

Fig. 1.

The implementation flow of the proposed method.

Show All

In the second part, the pertinent theory is introduced. The third part delineates the experimental setting and conducts an experimental analysis to validate and evaluate the proposed approach. The fourth part lays out the conclusions and outlines the future work.

SECTION II.

Methodology

The following sections will introduce in detail the basic principles of EWT, the realization of the largest scale space, the steps to realize light-k-means, and the process of the KELM improved by the POA.

A. The Fundamentals of EWT

Inspired by the wavelet transform, Gills proposed the EWT in 2013, which has a similar definition to the transform.

From the perspective of Fourier, this method can adaptively select a band-pass filter bank according to the spectrum of the signal to be processed, divide the signal spectrum into different frequency bands, and extract a series of different Sub-Signal Components (SSCs).

We define $\omega _{\textrm {n}}$ as the middle point of the segment with width $2\tau _{\textrm {n}}$ and index n (where $1\le \textrm {n}\le \textrm {N}$ ). According to the Shannon criterion, the values of $\omega _{0}$ and $\omega _{\textrm {n}}$ are 0 and $\pi$ , respectively. The partition of the EWT boundary is shown in Fig. 2. Divide a signal into N continuous frequency bands, marked as bands $\Lambda _{\textrm {n}} ={[}\omega _{\textrm {n-1}},\omega _{\textrm {n}}]$ , which limits the signal to $\textrm {U}_{\textrm {n=1}}^{\textrm {N}} \Lambda _{\textrm {n}} =\textrm {[0,}\pi]$ in the Fourier spectrum.

Fig. 2.

Band segmentation of EWT.

Show All

Inspired by the idea of Littlewood-Paley and Meyer wavelet construction, Gills used (1) and (2) formulas to construct empirical scaling functions ${\varphi }_{\textrm {n}} (\omega)$ and empirical wavelet functions ${\psi }_{\textrm {n}} (\omega)$ when the frequencies are at different positions [23].

$\begin{align*} \varphi _{\textrm {n}} (\omega)&=\begin{cases} \displaystyle 1~ if ~\left |{ \omega }\right |\le (1-\gamma)\omega _{n} \\ \displaystyle \cos \left [{ {\frac {\pi }{2}\beta \left ({{\frac {1}{2\gamma \omega _{n} }\left ({{\left |{ \omega }\right |-\left ({{1-\gamma } }\right)\omega _{n}} }\right)} }\right)} }\right] \\ \displaystyle~ if ~(1-\gamma)\omega _{n} \le \left |{ \omega }\right |\le (1+\gamma)\omega _{n} \\ \displaystyle 0 otherwise \end{cases} \tag{1}\\ \psi _{\textrm {n}} (\omega)&=\begin{cases} \displaystyle 1~ if ~(1+\gamma)\omega _{n} \le \left |{ \omega }\right |\le (1-\gamma)\omega _{n+1} \\ \displaystyle \cos \left [{ {\frac {\pi }{2}\beta \left ({{\frac {1}{2\gamma \omega _{n+1} }\left ({{\left |{ \omega }\right |-\left ({{1-\gamma } }\right)\omega _{n+1}} }\right)} }\right)} }\right] \\ \displaystyle~ if ~(1-\gamma)\omega _{n+1} \le \left |{ \omega }\right |\le (1+\gamma)\omega _{n+1} \\ \displaystyle \sin \left [{ {\frac {\pi }{2}\beta \left ({{\frac {1}{2\gamma \omega _{n} }\left ({{\left |{ \omega }\right |-\left ({{1-\gamma } }\right)\omega _{n}} }\right)} }\right)} }\right] \\ \displaystyle~ if ~(1-\gamma)\omega _{n+1} \le \left |{ \omega }\right |\le (1+\gamma)\omega _{n} \\ \displaystyle 0 otherwise \end{cases} \\{}\tag{2}\end{align*}$ View Source

where

$\gamma =\tau _{\textrm {n}} /\omega _{\textrm {n}}$

, and

$\beta (x)$

is a function about

$\forall C^{k}\in [{0,1}]$

that is calculated by (3).

$\begin{align*} \beta (x)=\begin{cases} \displaystyle 0~ if ~x\le 0 \\ \displaystyle and~ \beta (x)+\beta (1-x)=1 \forall x\in [{0,1}] \\ \displaystyle 1~ if ~x>1 \end{cases} \tag{3}\end{align*}$

View Source

Many functions satisfy this equation. According to document [49], (4) is the most used one.

$\begin{equation*} \beta (x)=35x^{4}-84x^{5}+70x^{6}-20x^{7} \tag{4}\end{equation*}$ View Source

Let $f(\omega)$ and $F^{-1}$ represent the Fourier transform and the inverse transform operator of the signal $f(t)$ to be decomposed, and then the detail coefficients $W_{f}^{e} (n,t)$ of the EWT are obtained by taking the inner product of the signal $f(t)$ and the empirical wavelets $\psi _{n} (t)$ :

$\begin{align*} W_{f}^{e} (n,t)&=\left \langle{ {f(t),\psi _{n} (t)} }\right \rangle \\ &=\int {f(\tau)} \overline {\psi _{n} (\tau -t)} d\tau =F^{-1}[f(\omega)\overline {\psi _{n} (\omega)}] \tag{5}\end{align*}$ View Source

And the approximation coefficients

$W_{f}^{e} (0,t)$

are calculated by the inner product of the signal

$f(t)$

and the scaling function

$\varphi _{1} (t)$

$\begin{align*} W_{f}^{e} (0,t)&=\left \langle{ {f(t),\varphi _{1} (t)} }\right \rangle \\ &=\int {f(\tau)} \overline {\varphi _{1} (\tau -t)} d\tau =F^{-1}[f(\omega)\overline {\varphi _{1} (\omega)}] \tag{6}\end{align*}$

View Source

where

$\overline {\psi _{n}} (\tau)$

and

$\overline {\varphi _{1}} (\tau)$

represent the complex conjugate functions of

$\psi _{n} (\tau)$

and

$\varphi _{1} (\tau)$

, respectively, while

$\varphi _{1} (\omega)$

and

$\psi _{n} (\omega)$

are given by (1) and (2).

The signal $f(t)$ is broken down into $N+1$ SSCs $f_{0} (t),f_{1} (t),\ldots,f_{N} (t)$ , and these SSCs are mathematically represented by (7).

$\begin{align*} \begin{cases} \displaystyle f_{0} (t)=W_{f}^{e} (0,t)\ast \varphi _{1} (t) \\ \displaystyle f_{k} (t)=W_{f}^{e} (k,t)\ast \psi _{k} (t) \end{cases} \tag{7}\end{align*}$ View Source

where k is an index variable such that

$1\le k\le N$

, and the reconstructed signal is calculated by (8).

$\begin{align*} f(t)&=W_{f}^{e} (0,t)\ast \varphi _{1} (t)+\sum \limits _{k=1}^{N} {W_{f}^{e} (k,t)} \ast \psi _{k} (t) \\ &=F^{-1}\left[{W_{f}^{e} (0,\omega)\varphi _{1} (\omega)+\sum \limits _{k=1}^{N} {W_{f}^{e} (k,\omega)} \psi _{k} (\omega)}\right] \tag{8}\end{align*}$

View Source

where

$W_{f}^{e} (0,\omega)$

$\varphi _{1} (\omega)$

$W_{f}^{e} (k,\omega)$

, and

$\psi _{k} (\omega)$

are respectively the Fourier transforms of

$W_{f}^{e} (0,t)$

$\varphi _{1} (t)$

$W_{f}^{e} (k,t)$

, and

$\psi _{k} (t)$

B. Improvement of the Largest Scale Space

In the process of signal decomposition and reconstruction, segmenting the Fourier spectrum effectively and reasonably is key to the success of signal decomposition. Currently, the popular method is to use the maximum value of the spectrum interval as the boundary of the divided spectrum. However, when this method decomposes the signal containing noise interference, the local maximum value caused by the noise will be used as the boundary of the divided spectrum, which will lead to the spectrum. Over-segmentation means that the single signal is wrongly decomposed into two or more SSCs, which has a great negative impact on the final fault classification. Therefore, it is necessary to improve the EWT to achieve accurate and effective spectrum segmentation.

Gilles and Heal [27] proposed a parameter-free scale space method in 2014, projecting the spectrum onto the scale space and using the clustering method to divide the scale space into two categories (one is a meaningful spectrum segmentation boundary, and the other is a meaningless boundary). The classification criterion serves as a threshold for meaningful spectral segmentation.

Assuming that $f(x)$ is a continuous function defined on $[0,x_{\max }]$ , the scale space $L(x,t)$ of the function represented by (9).

$\begin{equation*} L(x,t)=g(x;t)^{\ast} f(x) \tag{9}\end{equation*}$ View Source

where the Gaussian kernel function

$g(x;t)=\textrm {exp}(-x^{2}/(2t))/ \sqrt {2\pi t}$

$\sqrt {t}$

represents the scale parameter,

$^{\ast}$

represents the convolution product, and

$x$

represents the time series.

For a discrete function such as a signal, the scale space must also be discrete, so the scale space of a discrete signal is expressed as (10):

$\begin{equation*} L(x,t)=\sum \limits _{n=-\infty }^{+\infty } {f(x-n)g(n;t)} \tag{10}\end{equation*}$ View Source

For real signals, a truncation threshold $M=C\sqrt {t} +1$ is used to obtain a discrete scale space:

$\begin{equation*} L(x,t)=\sum \limits _{n=-M}^{+M} {f(x-n)g(n;t)} \tag{11}\end{equation*}$ View Source

To ensure that the calculated scale space approximation error is less than 10⁻⁹, $C=6$ is usually taken. $\sqrt {t} =s\sqrt {t_{0}}$ , $s\in [1,S_{\max }]$ and s is an integer. The initial value $\sqrt {t_{0}} =0.5$ is usually chosen because it corresponds to the midpoint value of two signal samples. $\sqrt {t_{\max }} =s_{\max } \sqrt {t_{0}}$ , and $s_{\max } =2x_{\max }$ . Generally speaking, the scale curve is generated by the frequency corresponding to the minimum point in the frequency domain, and the length of the scale curve corresponding to other frequencies is 0.

Gilles and Heal [27] did not extend $\sqrt {t_{\max }} =x_{\max }$ , which raises a problem: Since the scale parameter is too small, when the noise signal is large enough, its projection into the scale space fills the entire scale space with the same length as the normal signal. This will make the noise boundary and the meaningful boundary classified into one category, resulting in mistakenly using the noise boundary as the boundary of the divided spectrum. In order not to make such mistakes, the best way is to expand the scale space and display all the scale space of the signal. Therefore, under the premise of the full-time series, expand the scale parameter to the maximum value that can be calculated by the scaling function, and usually choose to add a constant $\varepsilon$ to the maximum time series to test whether the scale space is fully displayed. The final modified scale parameter is $\sqrt {t_{\max }} =x_{\max } +\varepsilon$ [39]. The constant $\varepsilon$ can be selected according to the following principles:

$\varepsilon$ takes a constant much larger than the maximum time series, i.e., $\varepsilon \gg x_{\max }$ .
Calculate the largest scale parameter $P_{\max }$ that can be displayed, and then update $\varepsilon$ , i.e., $\varepsilon =P_{\max } -x_{\max }$ .

C. Light-K-Means

Based on the selection of the above-mentioned largest scale space, we can know that the scale parameter is limited to the largest scale parameter that can be displayed, i.e., $\sqrt {t} \in [0,P_{\max }]$ . According to a large number of experiments, the extreme value of a signal is projected into the scale space, and when its resulting length value is greater than a given standard threshold, its corresponding frequency can be used as the boundary value of a meaningful segmented spectrum. The boundary of splitting the spectrum is nothing more than dividing into meaningful boundaries and meaningless boundaries, so boundary detection becomes a binary classification problem.

The frequencies in the largest scale space are divided into two categories according to the scale length, greater than or equal to the threshold T represents a meaningful spectrum division boundary, and others are represented as meaningless boundaries.

For clustering problems, the simplest and most effective method is the k-means algorithm [50]. For the k-means algorithm, the difficulty of clustering lies in the determination of the data set category (that is, the size of the k value) and the selection of the initial cluster center (once the initial value is not well selected, effective clustering results may not be obtained even give wrong results). For our binary classification problem, it is obvious that k=2. Therefore, the biggest difficulty with the algorithm becomes the problem of initial cluster center selection.

Li et al. [51] proposed light-k-means under the premise that the number of categories k is known. The specific steps are as follows:

Randomly select $p{'}$ data points from S, which are represented as H in the set, and the complement of H is $H_{c}$ .
Apply the traditional k-means algorithm in the set H, and divide the set H into subsets.
Find the center point of the set.
Assign the set $H_{c}$ to the subset closest to them.

However, there is a fatal problem when randomly selecting data points, if all the data points of the same category are selected, this method will consume more time than traditional k-means, and even get wrong results.

Inspired by this, we improved the light-k-means algorithm, the specific steps are as follows:

Constrain all data points to a rectangular window with sides a and b.
Compare the lengths of the side lengths a and b of the rectangular window, select the larger one, and record it as c.
Divide side c into k small windows on average.
Randomly select data points in each small window, and then perform the steps of the original light-k-means algorithm.

Fig. 3 shows the comparison of k-means and light-k-means clustering. It can be seen from the figure that light-k-means reduces the number of initial iterations by randomly selecting sample points, so that two calculations can greatly reduce the time cost, thereby greatly improving the calculation speed. After all, we know that when the number of samples reaches a certain level, the number of iterations increases exponentially.

Fig. 3.

Comparison of the clustering process of k-means and light-k-means (a) The process of k-means clustering (b) The process of light-k-means clustering.

Show All

D. Extraction of Features

The selection of the signal feature vector depends on the specific problem. Gamboa-Medina et al. [52] verified for the water network leakage problem that the feature vector based on the pressure signal depends on three features: energy (ENE), entropy (ENT), and the number of excess zeros (ZCC).

The dimensionless parameters are not affected by mechanical conditions and are widely used for the diagnosis of mechanical faults [53]. If the feature vector consisting of dimensionless parameters has poor ability to distinguish between different faults, the final achieved classification performance may be less than satisfactory, no matter how good the adopted learning algorithm is [54]. This means that we have to choose the appropriate dimensionless indicators in order to seek an accurate determination of the type of failure. Since the object of our study is a hydraulic system, its failure is not only the leakage problem, but also other failures such as slide valve failure. Inspired by the processing of raw data by Xiong et al. [55] using dimensionless indicators, we add 14 time-domain features of the mechanical principle to enhance the accuracy of feature extraction, including Mean Value, Standard Deviation, Variance, Peak-to-Peak Value, Square Root Amplitude, Average Amplitude, Mean Square Amplitude, Peak Value, Waveform Index, Peak Index, Impulsion Index, Clearance Factor, Degree of Skewness, and Kurtosis Value.

The above 17 features are combined, and a feature pool $S=\{F1,F2,\ldots,17\}$ is created, whose expressions are shown in Table 1, where discrete data points $X=\{x_{1},x_{2},\ldots,x_{n} \}$ . The subset that with the highest precision is found iteratively through the SFS [56]. The accuracy $A$ can be expressed as:

$\begin{equation*} A=\frac {\sum \nolimits _{i=1}^{N} {P(c_{i} =y_{i})}}{N}\ast 100\% \tag{12}\end{equation*}$ View Source

where

$\begin{aligned} P(s)={\begin{cases} 1~ if~ s=true \\ 0~ if~ s=false \\ \end{cases}} \end{aligned}$

$c_{i}$

is the classification prediction label,

$y_{i}$

is the actual fault label, and N is the total number of labels.

TABLE 1 Expressions for 17 Characteristic Indicators

E. KELM

The output weight of the ELM [57] is calculated according to the input weight, and the weight is randomly generated, which makes the result unstable. Drawing on the experience of the successful application of kernel function in SVM, KELM [43] replaces the output matrix between the hidden layer and output layer with a kernel function. This not only avoids the uncertainty of the learning model but also retains the advantages of ELM. Therefore, the objective function $F$ of KELM training is:

$\begin{equation*} F=[K(x,x_{1})\ldots K(x,x_{N})]\left({\frac {I}{C}+\Omega }\right)^{-1}L \tag{13}\end{equation*}$ View Source

where

$K(x,x_{\textrm {i}})$

is the kernel function,

$\Omega$

is a matrix calculated by selecting different kernel functions,

$C$

is the regularization coefficient,

$I$

is the identity matrix, and

$L$

is the ideal output matrix.

F. POA

The POA is an intelligent optimization algorithm proposed by Trojovský and Dehghani [58] in 2022, which simulates the attack and hunting behavior of the pelican to build a model to solve the optimization problem. In POA, pelican hunting is divided into two processes, the approaching prey phase, and the surface flight phase. The position of the pelican changes with the position of the prey, and the steps to solve the optimization problem are as follows:

Determine the size of the pelican group and calculate the objective function value.
Approaching the prey phase: Calculate the position status of each pelican and update the group size.
Surface flight stage: Calculate whether the position state in 2 is to achieve a better objective function value, if not, keep the previous position state, and if so, update the position information and update the group size.
Keep the best candidate plan for each pelican, complete, and output the result.

G. POA-KELM

The regularization coefficients C and kernel parameters $\gamma$ have a great influence on the prediction accuracy and test accuracy of KELM, and once their selection is not appropriate, it will greatly attenuate the performance of KELM. Therefore, we use POA to optimize the regularization coefficients and kernel parameters of KELM, to make the accuracy of KELM higher. The workflow of POA-KELM is shown in Fig. 4. The specific optimization process is as follows:

Given an upper and lower bound, make a POA search in this interval.
Given the population range, calculate the objective function value and initialize KELM parameters.
Optimize C and $\gamma$ of KLEM using POA.
Train KELM for each C and $\gamma$ individually to derive the training accuracy.
Keep the C and $\gamma$ that make the highest accuracy.

Fig. 4.

The workflow of POA-KELM.

Show All

H. Framework for Fault Classification

The framework for fault classification using pressure signals is shown in Fig. 1, and some key steps are explained as follows:

Use the pressure signal that reflects the state of the hydraulic components as the initial signal.
Transferring the signal to the frequency domain by Fourier transform, and the spectrum is projected under the maximum scale space, and the maximum scale space is split into two parts using light-k-means, leaving the part larger than the threshold T as the boundary of the split Fourier spectrum. The improved EWT decomposes the initial signal to obtain a series of SSCs.
Calculate 17 feature indicators for each component, input each feature indicator into the original KELM and calculate its test accuracy, select the highest accuracy feature indicator in each round until the test accuracy no longer increases (i.e., SFS strategy), select the optimal feature for each SSC, and compose the feature vector as the input of POA-KELM.
Input the feature vector into POA-KELM, and use POA to update C and $\gamma$ of KELM to derive the highest accuracy solution. Finally, classify the faults and get the results.

SECTION III.

Experiment and Analysis

A. Experimental Model

To verify the validity of the improved EWT and POA-KELM, we use the pressure data sets collected by Helwig et al. [59] for multivariable faults of various components of the hydraulic system, which was experimented on a hydraulic test stand and obtained through sensors. The test stand consists of a primary working circuit and a secondary cooling-filtration circuit, connected through a tank, where the working hydraulic system principle is shown in Fig. 5. Because of the difference in the location of the sensors, they serve different objects. We know that sensor PS1 is very sensitive to the performance of the hydraulic pump and the hydraulic valve by the calculation of the characteristic values of Liu et al. [12] for the same component of the 3 pressure sensors. So we choose three kinds of failure samples of the hydraulic pump without leakage, weak leakage, and serious leakage, and four kinds of failure samples of the hydraulic valve without jamming, slight jamming, serious jamming, and near failure. The hydraulic pump does not leak and the hydraulic valve does not jam as a set of a normal state of the hydraulic system, then they have 12 combinations, the combination number 1-1, 1-4,…, 3-6, each combination has 10 groups of the original signal, a total of 120 groups; fault type has a total of 6, the fault label is 1, 2,…, 6. The pressure signals of groups 1–1 are not decomposed, groups 1-4, 1-5, 1-6, 2-1, and 3–1 are decomposed into 2 sub-signals, and the rest are decomposed into 3 sub-signals, so there are 290 groups of signals in total. The 17 eigenvalues of each of the 290 groups of signals are calculated separately to calculate their eigenvectors by the SFS strategy. The resulting eigenvectors are trained and validated in a 4:1 manner, and finally, the sample data are expanded using a five-fold cross-validation method to make the results more accurate. For the pressure data acquisition experiments, the system cycle was repeated with a constant load cycle (duration of 60 s) and a sampling frequency of 100 Hz. Table 2 describes the different fault types.

TABLE 2 Description of the Fault

Fig. 5.

Working hydraulics for data acquisition with switchable orifice V9.

Show All

Fig. 6 indicates the original pressure signal for each fault type measured by sensor PS1. We can find that among the signals in groups 3-6, both the hydraulic pump and the hydraulic valve are the most serious failures, still working properly in the first 20s, but as time goes by, both the valve and the pump fail to the point of not following the laws of the hydraulic system. The remaining 11 groups of signals are not so different that the naked eye can not distinguish their categories, so it is necessary to improve the EWT decomposition of each pressure signal.

Fig. 6.

12 types of raw pressure signals.

Show All

B. Improved EWT Decomposition Results

According to A, based on each sensor data, we can get 120 signals containing 12 types of faults. Next, we will introduce the decomposition process of the improved EWT and the decomposition results in detail, using groups 2–4 as examples. The sampling frequency is chosen to be 100 Hz and the running time is 60 seconds. The working algorithm environment is MATLAB 2016b, the hardware central processing unit (CPU) is Intel ®Core™ i5-6500 CPU @ 3.20GHz, and the random access memory (RAM) of the computer is 4.00 GB. Assuming the initial constant $\varepsilon =5x_{\max } =30000$ , calculate the maximum scale parameter $P_{\max } =25001$ , update the constant $\varepsilon =19001$ , and the signals are expanded under the largest scale space (as shown in Fig. 7). The scale parameters of each signal extremum are ordered in ascending order, from the lowest to the highest. The number of categories is known to be 2, so we constrain all data points to a $957\times25000$ rectangular space and divide it equally into 2 windows A and B. Randomly select data points in these two windows. Here we select 1 data point in window A and 10 data points in window B. The center coordinates (957, 25001) and (690.1, 209.3) of the data points in the two windows are obtained using the k-means algorithm, respectively. Using these two center coordinates as initial coordinates, the k-means algorithm is applied to all data points. The final updated clustering centers are (956.5, 20975.5) and (478, 82.8). Thus, under the original scale space, the two data points (27.094, 25001) and (13.814, 16950) are divided into one class, and the rest of the data points are divided into another class. The final clustering results are shown in Fig. 8. Therefore, we set the boundaries for dividing the Fourier spectrum as 27.094 and 13.814, as shown in Fig. 9. Since the amplitude of the Fourier spectrum at low frequencies is so large that the information at low amplitudes cannot be seen, we zoom in on the Fourier frequency domain to study the information at low amplitudes as shown in Fig. 10. Finally, 2 meaningful points are obtained to split the Fourier spectrum, and 3 SSCs are obtained (as shown in Fig. 11). It is obvious from Fig. 11 that the SSC (1) is very similar to the normal signal (groups 1-1), which shows that we successfully decompose the fault signal into a normal signal, a hydraulic pump leakage fault signal, and a hydraulic valve jamming fault signal. This also precisely illustrates the effectiveness of the improved EWT proposed in this study and also shows that the pressure signal characteristics can effectively detect faults in hydraulic systems. Through the processing of the signal by EWT, we transformed the multivariate fault signal into a univariate fault classification problem, which helped greatly in the subsequent work.

Fig. 7.

Largest-scale space of pressure signals.

Show All

Fig. 8.

Selection of meaningful points based on light-k-means.

Show All

Fig. 9.

Fourier spectrum division of pressure signals.

Show All

Fig. 10.

Local amplification of Fourier spectrum division of pressure signals.

Show All

Fig. 11.

EWT based on light-k-means.

Show All

We compared light-k-means with empirical law and other clustering methods as a way to illustrate the superiority of the method proposed in this study. The comparison of the decomposed signals from Fig. 11 and Fig. 12 shows that the number of SCCs obtained by the improved EWT decomposition based on light-k-means increases from 3 to 6. This is because the EWT based on the empirical law clustering method divides the three scale parameters caused by hydraulic pump leakage failure or hydraulic valve jamming failure into a group of meaningful boundaries together. This leads to the over-decomposition of one of the fault signals into 4 signals when the signal is decomposed, so we are not clear about what these 4 signals represent, and therefore incorrectly classify them into 6 categories when the fault is classified. According to our attempts at other clustering methods, we found that the segmentation boundaries calculated by all the methods except k-means and light-k-means are greater than 2. Compared with traditional k-means, light-k-means reduces the iteration time by orders of magnitude after randomly selecting samples, so it can do its job simply and efficiently in a very short time.

Fig. 12.

EWT based on empirical law.

Show All

C. Improving the Superiority of EWT

To further illustrate the superiority of the improved EWT, we compared the improved EWT with other clustering methods in terms of the number of segmentation boundaries, error statistics, precision, and time consumption. Among them, the error statistics include mean absolute error (MAE), mean square error (MSE), mean absolute percentage error (MAPE), weighted mean absolute percentage error (WMAPE), and Fréchet distance (FD) [60].

FD defines a method to calculate the similarity of two curves considering the relationship between the positions of two points, which is commonly used for time series similarity measure and trajectory series. Eiter and Heikki [61] computed recursively the Fréchet distance of two discrete series. The expressions for the error statistics and FD can be found in Table 3, where $y_{i}$ refers to the original signal sequence and $y_{i}^{\prime }$ refers to the signal reconstruction sequence. $c(i-1,j)$ denotes the distance from a point $(i-1,j)$ to point $(i,j)$ , and $d(ui,vj)$ denotes the Euclidean distance between points i and j.

TABLE 3 The Expressions for the Error Statistics

The accuracy was calculated by the impulse factor proposed by Yu et al. [62]. The impulse factor metric is the ratio of the fault shock frequency amplitude $E_{f}$ to the intrinsic pulse frequency amplitude $E_{0}$ :

$\begin{equation*} \alpha =E_{f} /E_{0} \tag{14}\end{equation*}$ View Source

The boundary detection methods used in the traditional EWT largest scale space include Empirical law, Otsu, Half-Normal law, and k-means, and to enhance the convincing power, we also added the DBSCAN method [39] for comparison. The reconstructed signal for each component is calculated according to (8), followed by the calculation of the comparison of each parameter mentioned above. The comparison results are shown in Table 4.

TABLE 4 Comparison of Several Detection Methods

As can be seen from Table 4, the difference between light-k-means and k-means data is not large, but the difference in time spent between the two is quite large. This is because light-k-means randomly select samples to reduce the iteration time and the time reduces by orders of magnitude, thus greatly improving the computational speed. DBSCAN has obvious advantages in vibration signal-based fusion, but for pressure signals, it has obvious limitations. This is because the choice of the density radius r and the minimum number of Minpts is highly artificial, and the scale space under pressure signals is more dense than that under vibration signals. Therefore, if r or Minpts is too large or too small, it will lead to problems in scale space clustering. light-k-means is the most suitable spectral boundary selection method for EWT to decompose pressure signals, both in terms of accuracy and error statistics and in terms of time consumption, thus demonstrating the effectiveness and rapidity of improved EWT.

D. Classification of Faults

Because the radial basis function (RBF) can map samples to a higher-dimensional space to solve many nonlinear problems, the kernel function we choose is the Gaussian kernel function, where the Gaussian kernel function is expressed as:

$\begin{equation*} K(x,x_{i})=\exp \left({-\frac {(x-x_{i})^{2}}{\gamma ^{2}}}\right) \tag{15}\end{equation*}$ View Source

where

$\gamma$

is the parameter of the Gaussian kernel function.

The 290 groups of SSCs are divided into three groups training, testing, and verification according to 3:1:1 to obtain the feature vectors. The 17 feature values of the 290 groups of SSCs obtained by EWT decomposition are calculated separately, and each feature value is concatenated into the training group to train the original KELM (the initial Gaussian kernel parameter is set to 4, and the regularization coefficient is set to 2) to get the testing accuracy, and the SFS-based feature selection process is shown in Table 5. First, the feature pool is emptied, and in the first round, we can see that the highest test accuracy is obtained by training KELM with feature 11, so the pool is updated to {F11}, and then KELM is trained with the remaining 16 features except feature 11 and the above operation is repeated until the accuracy no longer increases. According to our experiments, starting from the fifth round, the accuracy of each feature input is lower than 93.79%, so we stop the loop after the fourth round and combine {F11, F4, F2, F5} to form the feature vectors are used as input to POA-KELM.

TABLE 5 SFS-Based Feature Selection

To classify the results more accurately, we use a five-fold cross-validation to expand the samples. The above operation is cycled through five different compositions of test groups, and finally, 1450 data samples are obtained. It is found that no matter how we replace the compositions between the test and training groups, the final selected feature pool is still {F11, F4, F2, F5}, which illustrates the effectiveness of this feature extraction method from the side. Finally, the data are divided according to the ratio of training and testing of 4:1. The final confusion matrix is obtained as shown in Fig. 13. Thus the computational accuracy is 97.24%.

Fig. 13.

Confusion matrix of POA-KELM test set.

Show All

E. Comparison of Classifier Prediction Accuracy

In order to demonstrate the supremacy of POA-KELM in multivariate fault diagnosis within hydraulic systems, identical signal sources were utilized and subjected to multivariable fault classifiers, including the Seagull Optimization Algorithm enhanced KELM (SOA-KELM) and the original KELM. The outputs were then analysed for discrepancies in both efficiency and precision, thus indicating the disparities between the aforementioned classifiers. Their confusion matrix is shown in the Fig. 13, Fig. 14 and Fig. 15, where the abscissa represents the predicted fault type, the ordinate represents the actual fault type, the value on the main diagonal represents the number of correct predictions, and the numbers in other positions represent the number of incorrect predictions the sum of numbers represents the number of samples in the test set. As can be seen in Fig. 13, Fig. 14 and Fig. 15, the second and third types of failures do not match the reality when predicted. We find that the signals of these two categories differ only in their peak values when performing the calculation of the eigenvectors. However, the change in the external temperature of the hydraulic system at certain moments makes the peak values of the second and third category of faults very close to each other. Therefore, the POA-KELM classifier incorrectly classifies the two classes of faults into the same category. With regards to the remaining categories of faults, there exists a considerable discrepancy among the four characteristics associated with these faults as well as an elevated resistance to interference, resulting in an exact correspondence between the predicted and actual fault types.

Fig. 14.

Confusion matrix of SOA-KELM test set.

Show All

Fig. 15.

Confusion matrix of the KELM test set.

Show All

For comparison, the kernel functions we chose were all RBF. The number of initial populations and the number of iterations of POA have a great influence on the accuracy of the optimization, and they must be chosen appropriately, otherwise, they will cause a decrease in accuracy. Fig. 16 and Fig. 17 show the variation in the accuracy of POA-KELM and SOA-KELM with the number of initial populations and the number of iterations. We can find that POA-KELM can achieve the highest accuracy of 97.24% when the initial population is 1, the initial population and the number of iterations are updated to 1 and 1, respectively, and the regularization coefficient and kernel parameters are 858.0409 and 23.7134, respectively, and the time required is 0.381905 seconds. In contrast, SOA-KELM achieves a maximum test accuracy of 95.17% when the initial number of populations and the number of iterations are 2 and 4. After updating the initial number of populations and the number of iterations, the obtained regularization coefficients and kernel parameters are 44.0626 and 0.1, respectively, and the time required is 0.565492 seconds. We give KELM initial regularization coefficients and kernel parameters of 2 and 4, respectively, and KELM The test accuracy obtained is 93.79% and the time required is 0.379686 seconds. The results show that POA-KELM is undoubtedly the most suitable method for multivariate fault diagnosis of hydraulic systems in terms of balance time and accuracy.

Fig. 16.

Variation of POA-KELM test accuracy with population size and iteration times.

Show All

Fig. 17.

Variation of SOA-KELM test accuracy with population size and iteration times.

Show All

SECTION IV.

Conclusion

In this paper, the improved EWT and POA optimized KELM are combined to solve the problem of low efficiency of multivariable fault diagnosis in hydraulic systems. The improved EWT is used to reduce the dimension of multivariable problems and greatly reduce the calculation time. Then POA-KELM is used to classify the decomposed signals, which improves the accuracy of fault classification. This article solves the problem of inaccurate spectral segmentation in EWT by expanding the scale space. Using the clustering method of light-k-means, the iteration time of data points is greatly shortened, and the computational efficiency is improved. Compared with other boundary detection methods, the improved EWT has high accuracy and high computational speed in signal reconstruction. The SFS method was used to remove the useless features in the SSCs decomposed by EWT, and the feature vectors of the signals were selected, which not only improved the computation speed, but also retained the features that could best reflect the signal trend. Input the feature vector into the POA-KELM classifier to predict the fault type.

Due to the fast nature of the POA optimization method, the KELM classifier achieves the highest accuracy in very few iteration cycles, thus reducing the overall fault diagnosis time and benefiting even more from the accuracy of the decomposition of the improved EWT method, which makes the distinction between the characteristics of each type of fault more obvious. In conclusion, the combination of improved EWT and POA-KELM improves the diagnostic accuracy of multivariable faults and also reduces the diagnostic time, which is undoubtedly beneficial for scientific progress.

Despite the achievements obtained by this study, there are still some limitations. Firstly, the 17 features used in this study have a weak capability to identify the second and third types of faults. To address this issue, future studies should focus on improving the feature selection process to enhance the ability to resist interference. Secondly, this study did not consider the presence of sensor faults in hydraulic systems, which is a common problem. Thus, future studies should incorporate multi-sensor information fusion techniques to detect sensor faults and eliminate the effects of single-sensor errors.

Based on these limitations, future research can further explore how to improve the accuracy and efficiency of fault diagnosis. For instance, other machine learning algorithms can be applied to fault diagnosis, or more efficient feature extraction methods can be developed. In summary, the limitations of this study provide a vast space for future research to explore.

References is not available for this document.

A New Method for Fault Diagnosis of Hydraulic System Based on Improved Empirical Wavelet Transform and Kernel Extreme Learning Machine

Abstract:

Metadata

Abstract:

Funding Agency:

Introduction

Methodology

A. The Fundamentals of EWT

B. Improvement of the Largest Scale Space

C. Light-K-Means

D. Extraction of Features

E. KELM

F. POA

G. POA-KELM

H. Framework for Fault Classification

Experiment and Analysis

A. Experimental Model

B. Improved EWT Decomposition Results

C. Improving the Superiority of EWT

D. Classification of Faults

E. Comparison of Classifier Prediction Accuracy

Conclusion

References

IEEE Account

Purchase Details

Profile Information

Need Help?

A New Method for Fault Diagnosis of Hydraulic System Based on Improved Empirical Wavelet Transform and Kernel Extreme Learning Machine

Alerts

Abstract:

Metadata

Abstract:

Funding Agency:

Introduction

Methodology

A. The Fundamentals of EWT

B. Improvement of the Largest Scale Space

C. Light-K-Means

D. Extraction of Features

E. KELM

F. POA

G. POA-KELM

H. Framework for Fault Classification

Experiment and Analysis

A. Experimental Model

B. Improved EWT Decomposition Results

C. Improving the Superiority of EWT

D. Classification of Faults

E. Comparison of Classifier Prediction Accuracy

Conclusion

Authors

Figures

References

Citations

Keywords

Metrics

References