Introduction
Hydraulic transmission systems are widely used in various fields because of their ability to produce large forces or torques and their ease of control. Scholars have conducted statistical analyses of the failure locations of machinery containing hydraulic systems and found that most failures of such machinery are caused by the hydraulic system [1], [2], [3]. Hydraulic systems operate so frequently that they break down easily, and if the fault location is not found in time and corresponding measures are not taken, the result is at best serious economic losses and at worst a threat to personal safety [4]. Therefore, it is necessary to perform fault diagnosis for hydraulic systems so that the fault can be quickly located and its type determined, avoiding unforeseen consequences.
In general, fault diagnosis is divided into analytical model-based methods and data-driven methods [5]. However, many components in hydraulic systems are complex rotating machines, and it is difficult to build fault models. Thanks to the improvement of industrial sensor accuracy and the speed of computer computation, data-driven hydraulic system fault diagnosis has become mainstream [6].
Popular data-driven fault diagnosis includes fault diagnosis methods based on artificial intelligence [7], [8], [9], [10], [11], [12] and signal processing [13], [14]. Both methods can also be combined for fault diagnosis [15], [16]. Artificial intelligence fault diagnosis methods are preferred by researchers because of their high computational accuracy and the fact that they rarely require expertise in fault objects. However, artificial intelligence diagnosis has a complex structure and is inefficient when diagnosing multiple faults. To solve this problem, different signal preprocessing methods combined with convolutional neural networks were utilized in the literature [17] and [18], respectively, for fault diagnosis of rotating machinery and to enhance the efficiency of fault diagnosis. Although this approach may not be directly applicable to non-rotating machinery, it demonstrates that signal preprocessing can improve the efficiency of fault diagnosis. Therefore, it is especially important to choose the appropriate signal processing algorithm.
The Fourier transform, introduced in 1807, greatly accelerated the development of digital signal processing. The wavelet transform is a multi-resolution signal analysis method based on the Fourier transform that can decompose and reconstruct a discrete-time signal by fusing approximation and detail components [19]. Unfortunately, the choice of wavelet basis function is determined by subjective experience, which is very unfriendly to inexperienced practitioners. In addition, the wavelet transform produces severe frequency leakage when dealing with complex signals, which makes it unsuitable for hydraulic system fault diagnosis [20].
Empirical Mode Decomposition (EMD) is another classical signal processing method. EMD adaptively decomposes a vibration signal into a set of Intrinsic Mode Functions (IMFs), which avoids the wavelet transform's problem of empirically selecting an appropriate wavelet basis. After the IMFs are merged, the noise interference in the vibration signal can be eliminated [21]. Yang et al. [22] reduced the noise within the signal through multipoint data fusion. Because EMD decomposes the signal according to the temporal characteristics of the processed object, no basis function needs to be set, and the wavelet transform's problem does not arise. However, mode mixing and end effects are very common in EMD, and the method lacks a rigorous theoretical foundation.
The Empirical Wavelet Transform (EWT), similar in construction to the wavelet transform, was proposed by Gilles in 2013. It is an emerging time-frequency analysis method that combines EMD's ability to adapt automatically to the signal with a rigorous theoretical basis [23]. EWT is widely used in the field of signal processing [24], [25], [26]. When EWT analyzes vibration signals, the selection of meaningful boundaries is the key to successful signal decomposition [27]. However, most studies on EWT select boundaries from local maxima or local minima, which introduces large errors for complex nonlinear, nonstationary signals: in noisy signals, many noise peaks reach the same amplitude as the normal signal, so some of the boundaries determined by the extrema are actually due to noise [28].
To overcome these problems of conventional EWT, Pan et al. [29] used the Gaussian-kernel inner product to remove noise from the target signal and thus decompose its normal part. Kong et al. [30] decomposed fault signals under strong noise by exploiting the phenomenon of meshing resonance. Kim et al. [31] used a cepstrum-assisted EWT to smooth the spectrum of the target signal, together with the Hilbert transform, to average the envelope spectrum and obtain the gear fault results. In [32], the order spectrum coherence is combined with historical data obtained from healthy machines to obtain an anomaly envelope spectrum, which is further processed by smoothing operations to perform not only automatic fault detection but also identification of the damaged components. Kedadouche et al. [33] used operational modal analysis to determine the stable frequencies of the signal, calculated the support boundaries, applied the scaling and wavelet functions corresponding to each detected segment, and filtered the signal with the constructed filter bank to obtain the IMFs; the remaining steps are roughly the same as for EMD. Following Gilles's scale-space calculation [27], Zhao et al. [34] used several of the longest scale-space lengths as the boundaries for dividing the spectrum, obtained a series of IMF components after reconstructing the signal, and performed power spectrum analysis on these components to derive the fault frequency. Zhang et al. [35] used the Power Spectral Density (PSD) instead of the Fourier spectrum; since the PSD has fewer extreme points, there is no need to compute the complex scale space, and the local minima of the PSD are used as the boundaries dividing the spectrum into the components of the original signal. Zhang et al. 
[36] proposed a variable spectrum-splitting EWT that estimates the modes using the multi-taper power spectral density and finally derives a set of boundaries associated with the spectrum fluctuations to obtain the components of the original signal. Ding [37] obtained the upper and lower boundaries of the sidebands by searching bidirectionally from the center of the spectrum and merged them as the spectrum-segmentation boundaries to achieve bearing fault detection. Zheng et al. [38] replaced the Fourier amplitude spectrum with the power spectrum and verified the effectiveness of the method by decomposing the fault signal of a loosened hydraulic pump slide valve. However, these studies sidestep the spectrum division of EWT and do not fundamentally solve its core problem, i.e., unreasonable spectrum division. To solve this problem, Yu et al. [39] used the DBSCAN clustering method to cluster the largest scale space to obtain the spectrum boundaries, and the decomposition of hydraulic pump vibration signals in three directions proved the effectiveness of the method for detecting weak pump faults. The DBSCAN method does bring a significant improvement to the EWT of vibration signals. However, our experiments show that this method seems to fail for hydraulic system components whose vibration signals are difficult to detect.
In a hydraulic circuit, the pressure and flow signals are easily detected for any component through which hydraulic fluid passes. These signals contain a great deal of information, of which the fault information is only the tip of the iceberg. A feature extractor can recover the fault information from many features [40], [41]. Li et al. [42] first determined the number of EWT decompositions empirically, then selected the component with the highest energy intensity and used its reverse dispersion entropy features as the features of the original signal for efficient feature extraction. Lu et al. [43] took the first n components with high correlation coefficients between the decomposed components and the original signal to construct fused feature vectors, which were used to build a two-level diagnostic model based on a salp swarm algorithm-optimized Kernel Extreme Learning Machine (KELM) to identify normal and abnormal states and the fault categories under the abnormal states. Ding et al. [44] used EWT to decompose the hydraulic pump vibration signal, used principal component analysis to reduce the dimensionality of the extracted features, and finally input the feature vector containing the fault features into an Extreme Learning Machine (ELM) to obtain the fault classification results. Liu et al. [45] developed a novel personalized diagnosis method for gear fault detection using numerical simulation and the ELM algorithm, which diagnoses the health condition of gears by extracting features and training an ELM model on a large number of vibration signals.
Many studies have addressed fault diagnosis based on the vibration signals of hydraulic components, and vibration analysis is the most effective condition-monitoring technique for rotating systems [46]. It is very effective for fault detection in bearings, shafts, hydraulic pumps, hydraulic motors, etc. [47], [48]. However, many hydraulic components are non-rotating, and their vibration signals cannot be measured or are difficult to measure. In contrast, the pressure signals of the hydraulic system are easier to measure, and our experiments verify that pressure signals can be applied to hydraulic system fault diagnosis.
This paper proposes to combine the improved EWT and the Pelican Optimization Algorithm (POA) optimized KELM to preprocess the pressure signals to diagnose multivariable faults in hydraulic systems. The implementation of the proposed method in this study is shown in Fig. 1. The main contributions of this article are as follows:
By using the light-k-means clustering algorithm, the problem of unreasonable spectral division of pressure signals when performing EWT is solved.
Seventeen parameters are proposed to construct the feature pool, and a Sequence Forward Selection (SFS) strategy is used to select the features with the optimal ability to distinguish faults.
Optimizing the kernel parameter and regularization coefficient of KELM using POA yields better fault classification results.
In the second part, the pertinent theory is introduced. The third part delineates the experimental setting and conducts an experimental analysis to validate and evaluate the proposed approach. The fourth part lays out the conclusions and outlines the future work.
Methodology
The following sections will introduce in detail the basic principles of EWT, the realization of the largest scale space, the steps to realize light-k-means, and the process of the KELM improved by the POA.
A. The Fundamentals of EWT
Inspired by the wavelet transform, Gilles proposed the EWT in 2013; its definition is similar to that of the wavelet transform.
From the perspective of Fourier, this method can adaptively select a band-pass filter bank according to the spectrum of the signal to be processed, divide the signal spectrum into different frequency bands, and extract a series of different Sub-Signal Components (SSCs).
We define \omega _{n} (n = 1, \ldots, N-1) as the boundaries dividing the normalized Fourier spectrum [0, \pi] into N contiguous segments, with \omega _{0} = 0 and \omega _{N} = \pi. Inspired by the Littlewood-Paley and Meyer wavelet constructions, Gilles used (1) and (2) to build the empirical scaling function and the empirical wavelet \begin{align*} \varphi _{n} (\omega)&=\begin{cases} \displaystyle 1 & \text{if } \left |{ \omega }\right |\le (1-\gamma)\omega _{n} \\ \displaystyle \cos \left [{ {\frac {\pi }{2}\beta \left ({{\frac {1}{2\gamma \omega _{n} }\left ({{\left |{ \omega }\right |-\left ({{1-\gamma } }\right)\omega _{n}} }\right)} }\right)} }\right] & \text{if } (1-\gamma)\omega _{n} \le \left |{ \omega }\right |\le (1+\gamma)\omega _{n} \\ \displaystyle 0 & \text{otherwise} \end{cases} \tag{1}\\ \psi _{n} (\omega)&=\begin{cases} \displaystyle 1 & \text{if } (1+\gamma)\omega _{n} \le \left |{ \omega }\right |\le (1-\gamma)\omega _{n+1} \\ \displaystyle \cos \left [{ {\frac {\pi }{2}\beta \left ({{\frac {1}{2\gamma \omega _{n+1} }\left ({{\left |{ \omega }\right |-\left ({{1-\gamma } }\right)\omega _{n+1}} }\right)} }\right)} }\right] & \text{if } (1-\gamma)\omega _{n+1} \le \left |{ \omega }\right |\le (1+\gamma)\omega _{n+1} \\ \displaystyle \sin \left [{ {\frac {\pi }{2}\beta \left ({{\frac {1}{2\gamma \omega _{n} }\left ({{\left |{ \omega }\right |-\left ({{1-\gamma } }\right)\omega _{n}} }\right)} }\right)} }\right] & \text{if } (1-\gamma)\omega _{n} \le \left |{ \omega }\right |\le (1+\gamma)\omega _{n} \\ \displaystyle 0 & \text{otherwise} \end{cases} \tag{2}\end{align*}
where \beta (x) is an arbitrary C^{k} function satisfying (3):\begin{align*} \beta (x)=\begin{cases} \displaystyle 0 & \text{if } x\le 0 \\ \displaystyle 1 & \text{if } x\ge 1 \end{cases} \quad \text{and} \quad \beta (x)+\beta (1-x)=1 \;\; \forall x\in [{0,1}] \tag{3}\end{align*}
Many functions satisfy these conditions; according to [49], (4) is the most commonly used one.\begin{equation*} \beta (x)=35x^{4}-84x^{5}+70x^{6}-20x^{7} \tag{4}\end{equation*}
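As an illustration, the transition polynomial (4) and the properties required by (3) can be checked numerically (a minimal sketch; the function name is ours):

```python
import numpy as np

def beta(x):
    """Transition polynomial beta(x) = 35x^4 - 84x^5 + 70x^6 - 20x^7 from (4).

    Satisfies beta(x) = 0 for x <= 0, beta(x) = 1 for x >= 1,
    and beta(x) + beta(1 - x) = 1 on [0, 1], as required by (3).
    """
    x = np.clip(x, 0.0, 1.0)  # clamping enforces the x <= 0 and x >= 1 cases
    return 35 * x**4 - 84 * x**5 + 70 * x**6 - 20 * x**7
```

Any polynomial of this family works because its first three derivatives vanish at 0 and 1, which keeps the filter transitions smooth.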
The detail coefficients are obtained from the inner product of the signal f(t) with the empirical wavelets, as in (5):\begin{align*} W_{f}^{e} (n,t)&=\left \langle{ {f(t),\psi _{n} (t)} }\right \rangle \\ &=\int {f(\tau)} \overline {\psi _{n} (\tau -t)} d\tau =F^{-1}[f(\omega)\overline {\psi _{n} (\omega)}] \tag{5}\end{align*}
and the approximation coefficients from the inner product with the scaling function, as in (6):\begin{align*} W_{f}^{e} (0,t)&=\left \langle{ {f(t),\varphi _{1} (t)} }\right \rangle \\ &=\int {f(\tau)} \overline {\varphi _{1} (\tau -t)} d\tau =F^{-1}[f(\omega)\overline {\varphi _{1} (\omega)}] \tag{6}\end{align*}
The empirical modes are then reconstructed as (7):\begin{align*} \begin{cases} \displaystyle f_{0} (t)=W_{f}^{e} (0,t)\ast \varphi _{1} (t) \\ \displaystyle f_{k} (t)=W_{f}^{e} (k,t)\ast \psi _{k} (t) \end{cases} \tag{7}\end{align*}
and the original signal is recovered by (8):\begin{align*} f(t)&=W_{f}^{e} (0,t)\ast \varphi _{1} (t)+\sum \limits _{k=1}^{N} {W_{f}^{e} (k,t)} \ast \psi _{k} (t) \\ &=F^{-1}\left[{W_{f}^{e} (0,\omega)\varphi _{1} (\omega)+\sum \limits _{k=1}^{N} {W_{f}^{e} (k,\omega)} \psi _{k} (\omega)}\right] \tag{8}\end{align*}
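To make the decomposition concrete, here is a deliberately simplified EWT sketch: given a set of boundaries on [0, π], it applies ideal (rectangular) band-pass filters in the FFT domain. The smooth Meyer-type transition zones of (1)-(2) are omitted, and the function name is ours; treat this as an illustration of the filter-bank idea, not the full method:

```python
import numpy as np

def ewt_decompose(f, boundaries):
    """Split signal f into SSCs using ideal band-pass filters on the
    frequency segments defined by `boundaries` (normalized to [0, pi]).
    The transition zones of (1)-(2) are omitted for brevity."""
    N = len(f)
    F = np.fft.rfft(f)
    w = np.linspace(0.0, np.pi, len(F))       # normalized frequency axis
    edges = [0.0] + sorted(boundaries) + [np.pi]
    comps = []
    for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        # half-open segments; the last one includes the Nyquist bin
        mask = (w >= lo) & ((w <= hi) if i == len(edges) - 2 else (w < hi))
        comps.append(np.fft.irfft(F * mask, n=N))
    return comps
```

Because the masks partition the spectrum, the components sum back to the original signal, mirroring (8).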
B. Improvement of the Largest Scale Space
In the process of signal decomposition and reconstruction, segmenting the Fourier spectrum effectively and reasonably is the key to successful signal decomposition. Currently, the popular method is to use the maxima of the spectrum intervals as the boundaries of the divided spectrum. However, when this method decomposes a signal containing noise interference, local maxima caused by the noise are used as boundaries, which leads to over-segmentation of the spectrum. Over-segmentation means that a single signal is wrongly decomposed into two or more SSCs, which has a great negative impact on the final fault classification. Therefore, it is necessary to improve the EWT to achieve accurate and effective spectrum segmentation.
Gilles and Heal [27] proposed a parameter-free scale space method in 2014, projecting the spectrum onto the scale space and using the clustering method to divide the scale space into two categories (one is a meaningful spectrum segmentation boundary, and the other is a meaningless boundary). The classification criterion serves as a threshold for meaningful spectral segmentation.
Assuming that f(x) is the Fourier spectrum of the signal and g(x;t) is a Gaussian kernel with scale parameter t, the scale-space representation of f is defined as (9), where \ast denotes convolution:\begin{equation*} L(x,t)=g(x;t)\ast f(x) \tag{9}\end{equation*}
For a discrete function such as a signal, the scale space must also be discrete, so the scale space of a discrete signal is expressed as (10):\begin{equation*} L(x,t)=\sum \limits _{n=-\infty }^{+\infty } {f(x-n)g(n;t)} \tag{10}\end{equation*}
For real signals, a truncation threshold M is introduced so that the sum in (10) is computed over a finite window, as in (11):\begin{equation*} L(x,t)=\sum \limits _{n=-M}^{+M} {f(x-n)g(n;t)} \tag{11}\end{equation*}
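The truncated sum (11) can be sketched as follows, assuming a sampled Gaussian kernel and zero-padding at the signal edges (the paper does not state its boundary handling, so both are assumptions on our part):

```python
import numpy as np

def scale_space(f, t, M):
    """Truncated discrete scale space L(x, t) per (11).

    f : 1-D array (e.g. the Fourier spectrum), t : scale parameter,
    M : truncation threshold. Edges are zero-padded (an assumption).
    """
    n = np.arange(-M, M + 1)
    g = np.exp(-n**2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)  # sampled Gaussian
    fp = np.pad(f, M)                                          # zero-padding
    # L(x, t) = sum_{n=-M}^{M} f(x - n) g(n; t); g is symmetric, so a
    # sliding dot product implements the convolution directly.
    return np.array([np.dot(fp[x:x + 2 * M + 1], g) for x in range(len(f))])
```

As t grows, minima of the spectrum merge and disappear, which is exactly what the scale-space boundary detection exploits.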
To ensure that the approximation error of the computed scale space is less than 10−9, the truncation threshold M is chosen following [27]. Gilles and Heal [27] did not extend the scale parameter to the largest value that can be displayed. In this paper, ε takes a constant much larger than the maximum of the time series, i.e., ε ≫ x_max. The largest scale parameter P_max that can be displayed is then calculated, and ε is updated as ε = P_max − x_max.
C. Light-K-Means
Based on the selection of the above-mentioned largest scale space, the scale parameter is limited to the largest scale parameter that can be displayed, i.e., t ≤ P_max.
The frequencies in the largest scale space are divided into two categories according to the scale length: those greater than or equal to the threshold T represent meaningful spectrum-division boundaries, and the others are meaningless boundaries.
For clustering problems, the simplest and most effective method is the k-means algorithm [50]. For k-means, the difficulty lies in determining the number of categories (that is, the value of k) and selecting the initial cluster centers (if the initial centers are poorly chosen, effective clustering results may not be obtained, and wrong results may even be produced). For our binary classification problem, obviously k = 2. Therefore, the main remaining difficulty is the selection of the initial cluster centers.
Li et al. [51] proposed light-k-means under the premise that the number of categories k is known. The specific steps are as follows:
Randomly select p′ data points from S; these form the set H, and the complement of H is H_c.
Apply the traditional k-means algorithm to the set H, dividing H into k subsets.
Find the center point of each subset.
Assign the points of H_c to the subset whose center is closest to them.
However, there is a fatal problem with random selection of data points: if all the selected points belong to the same category, this method consumes more time than traditional k-means and may even give wrong results.
Inspired by this, we improved the light-k-means algorithm, the specific steps are as follows:
Constrain all data points to a rectangular window with sides a and b.
Compare the side lengths a and b of the rectangular window, select the larger one, and record it as c.
Divide side c evenly into k small windows.
Randomly select data points in each small window, and then perform the steps of the original light-k-means algorithm.
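The four steps above, followed by the original light-k-means procedure, might look like this (a sketch for 2-D points; the subset fraction, iteration count, and the empty-window fallback are our own assumptions):

```python
import numpy as np

def light_kmeans(points, k, subset_frac=0.2, n_iter=20, rng=None):
    """Sketch of the improved light-k-means: window-based seeding, k-means
    on a random subset H, then assignment of the complement H_c."""
    rng = np.random.default_rng(rng)
    pts = np.asarray(points, float)

    # 1) bounding window with sides a (x-extent) and b (y-extent)
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    axis = int(np.argmax(hi - lo))            # 2) pick the longer side c

    # 3) split side c into k equal sub-windows, 4) one random seed per window
    edges = np.linspace(lo[axis], hi[axis], k + 1)
    centers = []
    for i in range(k):
        in_win = (pts[:, axis] >= edges[i]) & (pts[:, axis] <= edges[i + 1])
        idx = np.flatnonzero(in_win)
        if idx.size == 0:                     # empty window: fall back (assumption)
            idx = np.arange(len(pts))
        centers.append(pts[rng.choice(idx)])
    centers = np.array(centers)

    # light-k-means proper: run k-means on a random subset H ...
    m = max(k, int(subset_frac * len(pts)))
    H = pts[rng.choice(len(pts), size=m, replace=False)]
    for _ in range(n_iter):
        lab = np.linalg.norm(H[:, None] - centers[None], axis=2).argmin(axis=1)
        for j in range(k):
            if np.any(lab == j):
                centers[j] = H[lab == j].mean(axis=0)

    # ... then assign every point (including the complement H_c) to the
    # nearest center
    return np.linalg.norm(pts[:, None] - centers[None], axis=2).argmin(axis=1)
```

Seeding one center per sub-window is what prevents all initial centers from landing in the same category, the failure mode of the original random seeding.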
Fig. 3 shows the comparison of k-means and light-k-means clustering. It can be seen from the figure that light-k-means reduces the number of initial iterations by randomly selecting sample points, so the two-stage computation greatly reduces the time cost and substantially improves the calculation speed. After all, once the number of samples reaches a certain level, the number of iterations grows rapidly.
Comparison of the clustering process of k-means and light-k-means (a) The process of k-means clustering (b) The process of light-k-means clustering.
D. Extraction of Features
The selection of the signal feature vector depends on the specific problem. Gamboa-Medina et al. [52] verified for the water network leakage problem that a feature vector based on the pressure signal can rely on three features: energy (ENE), entropy (ENT), and the zero-crossing count (ZCC).
Dimensionless parameters are not affected by mechanical conditions and are widely used for the diagnosis of mechanical faults [53]. If the feature vector composed of dimensionless parameters has poor ability to distinguish between different faults, the final classification performance may be unsatisfactory no matter how good the adopted learning algorithm is [54]. This means we must choose appropriate dimensionless indicators to determine the fault type accurately. Since the object of our study is a hydraulic system, its failures are not limited to leakage but also include other faults such as slide valve failure. Inspired by the processing of raw data with dimensionless indicators by Xiong et al. [55], we add 14 commonly used time-domain features to enhance the accuracy of feature extraction: Mean Value, Standard Deviation, Variance, Peak-to-Peak Value, Square Root Amplitude, Average Amplitude, Mean Square Amplitude, Peak Value, Waveform Index, Peak Index, Impulsion Index, Clearance Factor, Skewness, and Kurtosis.
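For illustration, several of these time-domain features can be computed as follows (a sketch using common textbook definitions; the paper's exact formulas, e.g. for the square root amplitude or the kurtosis normalization, may differ):

```python
import numpy as np

def time_domain_features(x):
    """A subset of the time-domain features listed above (common definitions)."""
    x = np.asarray(x, float)
    abs_x = np.abs(x)
    rms = np.sqrt(np.mean(x**2))           # mean square (RMS) amplitude
    peak = abs_x.max()
    avg_amp = abs_x.mean()
    sra = np.mean(np.sqrt(abs_x))**2       # square root amplitude
    std = x.std()
    return {
        "mean": x.mean(),
        "std": std,
        "variance": x.var(),
        "peak_to_peak": x.max() - x.min(),
        "square_root_amplitude": sra,
        "average_amplitude": avg_amp,
        "rms": rms,
        "peak": peak,
        "waveform_index": rms / avg_amp,   # shape factor (dimensionless)
        "peak_index": peak / rms,          # crest factor (dimensionless)
        "impulsion_index": peak / avg_amp, # impulse factor (dimensionless)
        "clearance_factor": peak / sra,    # dimensionless
        "skewness": np.mean(((x - x.mean()) / std)**3),
        "kurtosis": np.mean(((x - x.mean()) / std)**4),
    }
```

The last six entries are the dimensionless indicators: ratios and normalized moments that do not change when the signal is rescaled.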
The above 17 features are combined into a feature pool. The classification accuracy A used to evaluate the features is defined as (12), where c_i is the predicted label of sample i, y_i is its true label, and N is the number of samples:\begin{equation*} A=\frac {\sum \nolimits _{i=1}^{N} {P(c_{i} =y_{i})}}{N}\ast 100\% \tag{12}\end{equation*}
E. KELM
The output weights of the ELM [57] are calculated from the randomly generated input weights, which makes the results unstable. Drawing on the successful application of kernel functions in SVM, KELM [43] replaces the output matrix between the hidden layer and the output layer with a kernel function. This not only avoids the uncertainty of the learning model but also retains the advantages of ELM. The output function of KELM is given by (13), where \Omega is the kernel matrix, I the identity matrix, C the regularization coefficient, and L the label matrix:\begin{equation*} F=[K(x,x_{1})\ldots K(x,x_{N})]\left({\frac {I}{C}+\Omega }\right)^{-1}L \tag{13}\end{equation*}
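A minimal KELM sketch following (13), assuming a one-hot label matrix L and the Gaussian kernel of (15); the function names and the defaults (C = 2, γ = 4, matching the initial values used later in the experiments) are ours, and the paper's exact implementation may differ:

```python
import numpy as np

def rbf_kernel(A, B, gamma=4.0):
    """Gaussian kernel K(x, x') = exp(-||x - x'||^2 / gamma^2), as in (15)."""
    d2 = ((A[:, None] - B[None]) ** 2).sum(-1)
    return np.exp(-d2 / gamma**2)

def kelm_fit(X, y, C=2.0, gamma=4.0):
    """Solve (I/C + Omega)^-1 L from (13); y holds integer class labels."""
    L = np.eye(int(y.max()) + 1)[y]          # one-hot label matrix (assumption)
    omega = rbf_kernel(X, X, gamma)          # kernel matrix Omega
    beta = np.linalg.solve(np.eye(len(X)) / C + omega, L)
    return X, beta, gamma

def kelm_predict(model, Xnew):
    """F = [K(x, x_1) ... K(x, x_N)] beta; class = argmax over outputs."""
    X, beta, gamma = model
    return (rbf_kernel(Xnew, X, gamma) @ beta).argmax(axis=1)
```

Unlike ELM, nothing here is randomly initialized: given C and γ, the solution is deterministic, which is the stability advantage the text describes.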
F. POA
The POA is an intelligent optimization algorithm proposed by Trojovský and Dehghani [58] in 2022, which simulates the attack and hunting behavior of the pelican to build a model to solve the optimization problem. In POA, pelican hunting is divided into two processes, the approaching prey phase, and the surface flight phase. The position of the pelican changes with the position of the prey, and the steps to solve the optimization problem are as follows:
Determine the size of the pelican group and calculate the objective function value.
Approaching the prey phase: Calculate the position status of each pelican and update the group size.
Surface flight stage: Check whether the position obtained in step 2 achieves a better objective function value; if not, keep the previous position, and if so, update the position information and the group size.
Keep the best candidate solution of each pelican, complete the iterations, and output the result.
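The steps above can be sketched as a simplified POA (our simplifications: the prey is drawn uniformly within the bounds each iteration, and R = 0.2 as in [58]; treat this as an illustration, not the reference implementation):

```python
import numpy as np

def poa(objective, bounds, pop_size=20, iters=100, seed=0):
    """Simplified Pelican Optimization Algorithm sketch.

    bounds : (lo, hi) arrays of length d. Returns (best_x, best_f)
    for minimization of `objective`."""
    rng = np.random.default_rng(seed)
    lo, hi = (np.asarray(b, float) for b in bounds)
    X = rng.uniform(lo, hi, size=(pop_size, lo.size))
    F = np.array([objective(x) for x in X])
    for t in range(1, iters + 1):
        prey = rng.uniform(lo, hi)           # prey location (our simplification)
        f_prey = objective(prey)
        for i in range(pop_size):
            # Phase 1: approaching the prey (exploration)
            I = rng.integers(1, 3)           # intensity factor in {1, 2}
            if f_prey < F[i]:
                cand = X[i] + rng.random(lo.size) * (prey - I * X[i])
            else:
                cand = X[i] + rng.random(lo.size) * (X[i] - prey)
            cand = np.clip(cand, lo, hi)
            fc = objective(cand)
            if fc < F[i]:                    # greedy update
                X[i], F[i] = cand, fc
            # Phase 2: surface flight (exploitation), radius shrinks with t
            cand = X[i] + 0.2 * (1 - t / iters) * (2 * rng.random(lo.size) - 1) * X[i]
            cand = np.clip(cand, lo, hi)
            fc = objective(cand)
            if fc < F[i]:
                X[i], F[i] = cand, fc
    best = int(F.argmin())
    return X[best], F[best]
```

The greedy updates in both phases implement step 3: a new position is only kept if it improves the objective value.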
G. POA-KELM
The regularization coefficient C and kernel parameter γ of KELM strongly affect the classification results, so POA is used to optimize them. The specific steps are as follows:
Given upper and lower bounds, let POA search within this interval.
Given the population range, calculate the objective function value and initialize the KELM parameters.
Optimize C and γ of KELM using POA.
Train KELM for each candidate C and γ individually to derive the training accuracy.
Keep the C and γ that yield the highest accuracy.
H. Framework for Fault Classification
The framework for fault classification using pressure signals is shown in Fig. 1, and some key steps are explained as follows:
Use the pressure signal that reflects the state of the hydraulic components as the initial signal.
Transfer the signal to the frequency domain by the Fourier transform, project the spectrum into the largest scale space, split the largest scale space into two parts using light-k-means, and keep the part larger than the threshold T as the boundaries for splitting the Fourier spectrum. The improved EWT then decomposes the initial signal to obtain a series of SSCs.
Calculate the 17 feature indicators for each component, input each feature indicator into the original KELM and calculate its test accuracy, select the highest-accuracy feature indicator in each round until the test accuracy no longer increases (i.e., the SFS strategy), select the optimal features for each SSC, and compose the feature vector as the input of POA-KELM.
Input the feature vector into POA-KELM, and use POA to update C and γ of KELM to derive the highest-accuracy solution. Finally, classify the faults and obtain the results.
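The SFS loop in step 3 can be sketched generically (our own formulation; `score` stands in for the KELM test accuracy of a candidate feature subset):

```python
import numpy as np

def sfs(features, score):
    """Sequential Forward Selection: greedily add the feature that most
    improves score(subset) until the score no longer increases.

    features : iterable of feature indices.
    score    : callable taking a tuple of indices, returning a number
               (here, it would be the KELM test accuracy).
    """
    selected, best = [], -np.inf
    remaining = list(features)
    while remaining:
        # score every one-feature extension of the current subset
        cand = [(score(tuple(selected + [f])), f) for f in remaining]
        s, f = max(cand)
        if s <= best:
            break                 # accuracy no longer increases: stop
        selected.append(f)
        remaining.remove(f)
        best = s
    return selected, best
```

With 17 features this costs at most 17 + 16 + ... classifier trainings, far fewer than the 2^17 subsets of exhaustive search.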
Experiment and Analysis
A. Experimental Model
To verify the validity of the improved EWT and POA-KELM, we use the pressure data sets collected by Helwig et al. [59] for multivariable faults of various components of the hydraulic system; the data were collected through sensors on a hydraulic test stand. The test stand consists of a primary working circuit and a secondary cooling-filtration circuit connected through a tank; the working hydraulic circuit is shown in Fig. 5. Because the sensors are installed at different locations, they monitor different objects. From the characteristic-value calculations of Liu et al. [12] for the three pressure sensors on the same component, we know that sensor PS1 is very sensitive to the performance of the hydraulic pump and the hydraulic valve. We therefore choose three hydraulic pump conditions (no leakage, weak leakage, and severe leakage) and four hydraulic valve conditions (no jamming, slight jamming, severe jamming, and near failure). Taking the non-leaking pump together with the non-jamming valve as the normal state of the hydraulic system, there are 12 combinations, numbered 1-1, 1-4, …, 3-6, each with 10 groups of original signals, for a total of 120 groups; there are 6 fault types in total, with fault labels 1, 2, …, 6. The pressure signals of group 1-1 are not decomposed, groups 1-4, 1-5, 1-6, 2-1, and 3-1 are decomposed into 2 sub-signals each, and the rest are decomposed into 3 sub-signals each, so there are 290 groups of signals in total. The 17 eigenvalues of each of the 290 groups are calculated separately, and their feature vectors are obtained by the SFS strategy. The resulting feature vectors are split 4:1 for training and validation, and finally the sample data are expanded using five-fold cross-validation to make the results more accurate.
For the pressure data acquisition experiments, the system cycle was repeated with a constant load cycle (duration of 60 s) and a sampling frequency of 100 Hz. Table 2 describes the different fault types.
Fig. 6 shows the original pressure signal for each fault type measured by sensor PS1. In the signals of group 3-6, in which both the hydraulic pump and the hydraulic valve have the most serious failures, the system still works properly in the first 20 s, but as time goes by both the valve and the pump fail to the point of no longer following the laws of the hydraulic system. The remaining 11 groups of signals differ so little that the naked eye cannot distinguish their categories, so it is necessary to decompose each pressure signal with the improved EWT.
B. Improved EWT Decomposition Results
According to subsection A above, based on each sensor's data, we can get 120 signals containing 12 types of faults. Next, we introduce the decomposition process and results of the improved EWT in detail, using group 2-4 as an example. The sampling frequency is 100 Hz and the running time is 60 seconds. The algorithm environment is MATLAB 2016b, the central processing unit (CPU) is an Intel® Core™ i5-6500 CPU @ 3.20 GHz, and the random access memory (RAM) of the computer is 4.00 GB. The initial constant ε is set to a value much larger than the maximum of the time series, as described above.
We compared light-k-means with the empirical law and other clustering methods to illustrate the superiority of the method proposed in this study. The comparison of the decomposed signals in Fig. 11 and Fig. 12 shows that the number of SSCs obtained by the improved EWT based on light-k-means increases from 3 to 6. This is because the EWT based on empirical-law clustering groups the three scale parameters caused by hydraulic pump leakage or hydraulic valve jamming into one set of meaningful boundaries. This leads to the over-decomposition of one of the fault signals into 4 signals, so it is unclear what these 4 signals represent, and they are incorrectly classified into 6 categories during fault classification. In our attempts with other clustering methods, we found that the number of segmentation boundaries calculated by all methods except k-means and light-k-means is greater than 2. Compared with traditional k-means, light-k-means reduces the iteration time by orders of magnitude by randomly selecting samples, so it can do its job simply and efficiently in a very short time.
C. Superiority of the Improved EWT
To further illustrate the superiority of the improved EWT, we compared the improved EWT with other clustering methods in terms of the number of segmentation boundaries, error statistics, precision, and time consumption. Among them, the error statistics include mean absolute error (MAE), mean square error (MSE), mean absolute percentage error (MAPE), weighted mean absolute percentage error (WMAPE), and Fréchet distance (FD) [60].
FD measures the similarity of two curves while accounting for the ordering of the points along them; it is commonly used for time-series similarity measurement and trajectory comparison. Eiter and Mannila [61] computed the Fréchet distance of two discrete series recursively. The expressions for the error statistics and FD can be found in Table 3.
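The discrete Fréchet distance recursion of [61] can be sketched as follows (a memoized version; the function name is ours):

```python
import numpy as np

def frechet_distance(P, Q):
    """Discrete Fréchet distance between point sequences P and Q [61]."""
    P, Q = np.asarray(P, float), np.asarray(Q, float)
    n, m = len(P), len(Q)
    ca = np.full((n, m), -1.0)          # memo table; -1 marks "not computed"

    def c(i, j):
        if ca[i, j] >= 0:
            return ca[i, j]
        d = np.linalg.norm(P[i] - Q[j])
        if i == 0 and j == 0:
            ca[i, j] = d
        elif i == 0:
            ca[i, j] = max(c(0, j - 1), d)
        elif j == 0:
            ca[i, j] = max(c(i - 1, 0), d)
        else:
            # advance along P, along Q, or along both; take the best option
            ca[i, j] = max(min(c(i - 1, j), c(i - 1, j - 1), c(i, j - 1)), d)
        return ca[i, j]

    return c(n - 1, m - 1)
```

Unlike pointwise errors such as MAE, this metric tolerates local misalignment between the two curves while still penalizing shape differences.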
The accuracy was calculated using the impulse factor proposed by Yu et al. [62]. The impulse factor α is the ratio of the fault shock frequency amplitude E_f to the reference amplitude E_0, as in (14):\begin{equation*} \alpha =E_{f} /E_{0} \tag{14}\end{equation*}
The boundary detection methods used in the traditional EWT largest scale space include the empirical law, Otsu, the half-normal law, and k-means; to make the comparison more convincing, we also added the DBSCAN method [39]. The reconstructed signal for each component is calculated according to (8), followed by the calculation of each parameter mentioned above. The comparison results are shown in Table 4.
As can be seen from Table 4, the difference between the light-k-means and k-means results is small, but the difference in time consumed is quite large. This is because light-k-means randomly selects samples to reduce the iteration time by orders of magnitude, thus greatly improving the computational speed. DBSCAN has obvious advantages for vibration signals but obvious limitations for pressure signals. This is because the choice of the density radius r and the minimum number of points MinPts is highly subjective, and the scale space under pressure signals is denser than that under vibration signals; if r or MinPts is too large or too small, the scale-space clustering fails. Light-k-means is therefore the most suitable spectrum-boundary selection method for EWT decomposition of pressure signals, in terms of accuracy, error statistics, and time consumption, demonstrating the effectiveness and speed of the improved EWT.
D. Classification of Faults
Because the radial basis function (RBF) can map samples to a higher-dimensional space to solve many nonlinear problems, we choose the Gaussian kernel function, expressed as (15):\begin{equation*} K(x,x_{i})=\exp \left({-\frac {(x-x_{i})^{2}}{\gamma ^{2}}}\right) \tag{15}\end{equation*}
The 290 groups of SSCs are divided into training, testing, and verification groups in a 3:1:1 ratio to obtain the feature vectors. The 17 feature values of the 290 groups of SSCs obtained by EWT decomposition are calculated separately, and each feature value is used in turn to train the original KELM on the training group (the initial Gaussian kernel parameter is set to 4 and the regularization coefficient to 2) to obtain the testing accuracy; the SFS-based feature selection process is shown in Table 5. First, the feature pool is emptied. In the first round, the highest test accuracy is obtained by training KELM with feature 11, so the pool is updated to {F11}; KELM is then trained with each of the remaining 16 features combined with the pool, and this operation is repeated until the accuracy no longer increases. In our experiments, from the fifth round onward the accuracy of every candidate feature is below 93.79%, so we stop after the fourth round, and the combination {F11, F4, F2, F5} forms the feature vector used as the input of POA-KELM.
To classify the results more accurately, we use five-fold cross-validation to expand the samples. The above operation is cycled through five different compositions of the test group, finally yielding 1450 data samples. We find that no matter how the compositions of the test and training groups are exchanged, the final selected feature pool is still {F11, F4, F2, F5}, which further illustrates the effectiveness of this feature extraction method. Finally, the data are divided according to a training-to-testing ratio of 4:1. The resulting confusion matrix is shown in Fig. 13; the computational accuracy is 97.24%.
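The five-fold splitting used to cycle the test-group compositions can be sketched as follows (the shuffling seed is an assumption of this sketch):

```python
import numpy as np

def five_fold_indices(n, seed=0):
    """Split n sample indices into 5 disjoint folds; each fold serves
    once as the test group while the other four form the training group."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    folds = np.array_split(idx, 5)
    for i in range(5):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(5) if j != i])
        yield train, test
```

Running the SFS procedure once per split gives the five test-group compositions mentioned above, and with 290 groups of SSCs each sample appears in a test group exactly once, yielding 5 x 290 = 1450 evaluated samples in total.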
E. Comparison of Classifier Prediction Accuracy
To demonstrate the superiority of POA-KELM in multivariate fault diagnosis for hydraulic systems, identical signal sources were fed to other multivariable fault classifiers, namely the seagull optimization algorithm enhanced KELM (SOA-KELM) and the original KELM, and the outputs were analysed for differences in both efficiency and accuracy. Their confusion matrices are shown in Fig. 13, Fig. 14 and Fig. 15, where the abscissa represents the predicted fault type, the ordinate represents the actual fault type, the values on the main diagonal represent the numbers of correct predictions, the numbers in other positions represent the numbers of incorrect predictions, and the sum of all entries equals the number of samples in the test set. As can be seen in Fig. 13, Fig. 14 and Fig. 15, the second and third types of faults are sometimes mispredicted. We find that the signals of these two categories differ only in their peak values when the eigenvectors are computed. However, changes in the external temperature of the hydraulic system at certain moments make the peak values of the second and third fault categories very close to each other, so the POA-KELM classifier occasionally assigns these two classes of faults to the same category. For the remaining categories of faults, the four selected features differ considerably across classes and are more resistant to interference, so the predicted fault types correspond exactly to the actual ones.
For comparison, the kernel functions we chose were all RBF. The initial population size and the number of iterations of POA have a great influence on the accuracy of the optimization and must be chosen appropriately; otherwise, they will cause a decrease in accuracy. Fig. 16 and Fig. 17 show how the accuracy of POA-KELM and SOA-KELM varies with the initial population size and the number of iterations. POA-KELM achieves its highest accuracy of 97.24% when the initial population size and the number of iterations are both 1, with a resulting regularization coefficient of 858.0409 and a kernel parameter of 23.7134; the time required is 0.381905 seconds. In contrast, SOA-KELM achieves a maximum test accuracy of 95.17% when the initial population size and the number of iterations are 2 and 4, respectively; the resulting regularization coefficient and kernel parameter are 44.0626 and 0.1, and the time required is 0.565492 seconds. For the original KELM, we set the initial regularization coefficient and kernel parameter to 2 and 4, respectively, obtaining a test accuracy of 93.79% in 0.379686 seconds. The results show that POA-KELM is the most suitable method for multivariate fault diagnosis of hydraulic systems when balancing time and accuracy.
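The search over the regularization coefficient and kernel parameter can be illustrated with a simple population-style random search; this is a stand-in sketch, not the actual POA or SOA update rules, and the population size, iteration count, and starting point are assumptions of the sketch:

```python
import numpy as np

def tune_kelm(objective, n_pop=5, n_iter=4, seed=0):
    """Population-style random search over (C, gamma) as a simple
    stand-in for a metaheuristic optimizer such as POA: sample a
    population of candidates, keep the best one found so far, and
    resample around it in the next iteration."""
    rng = np.random.default_rng(seed)
    best, best_acc = None, -1.0
    center = np.array([2.0, 4.0])  # assumed initial (C, gamma)
    for _ in range(n_iter):
        # draw a population of positive candidates around the center
        cands = np.abs(center + rng.normal(0.0, center, (n_pop, 2)))
        for C, g in cands:
            acc = objective(C, g)
            if acc > best_acc:
                best, best_acc = (C, g), acc
        center = np.array(best)  # contract the search toward the best
    return best, best_acc
```

In practice `objective(C, gamma)` would train KELM with those hyperparameters and return the test accuracy; the total cost is n_pop x n_iter classifier trainings, which is why small population and iteration counts keep the optimization fast.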
Conclusion
In this paper, the improved EWT and the POA-optimized KELM are combined to solve the problem of low efficiency in multivariable fault diagnosis of hydraulic systems. The improved EWT is used to reduce the dimension of the multivariable problem and greatly shorten the calculation time; POA-KELM is then used to classify the decomposed signals, improving the accuracy of fault classification. This article solves the problem of inaccurate spectral segmentation in EWT by expanding the scale space. The light-k-means clustering method greatly shortens the iteration time over the data points and improves computational efficiency. Compared with other boundary detection methods, the improved EWT offers high accuracy and high computational speed in signal reconstruction. The SFS method is used to remove useless features from the SSCs decomposed by EWT and to select the feature vectors of the signals, which not only improves the computation speed but also retains the features that best reflect the signal trend. Finally, the feature vectors are input into the POA-KELM classifier to predict the fault type.
Owing to the speed of the POA optimization method, the KELM classifier achieves its highest accuracy within very few iteration cycles, reducing the overall fault diagnosis time; it further benefits from the decomposition accuracy of the improved EWT, which makes the distinction between the characteristics of each fault type more obvious. In conclusion, the combination of the improved EWT and POA-KELM improves the diagnostic accuracy of multivariable faults while also reducing the diagnostic time, which is beneficial for engineering practice.
Despite the achievements obtained by this study, there are still some limitations. Firstly, the 17 features used in this study have a weak capability to identify the second and third types of faults. To address this issue, future studies should focus on improving the feature selection process to enhance the ability to resist interference. Secondly, this study did not consider the presence of sensor faults in hydraulic systems, which is a common problem. Thus, future studies should incorporate multi-sensor information fusion techniques to detect sensor faults and eliminate the effects of single-sensor errors.
Based on these limitations, future research can further explore how to improve the accuracy and efficiency of fault diagnosis; for instance, other machine learning algorithms can be applied to fault diagnosis, or more efficient feature extraction methods can be developed. In summary, the limitations of this study leave considerable room for future research.