Introduction
An artificial neural network is a bio-inspired computing model that can handle both linear and nonlinear problems. Among such models, the single hidden layer feedforward neural network has been widely used in many fields because of its good learning ability [1], [2]. However, because traditional feedforward neural networks mostly use gradient descent to adjust the parameters of the hidden layer nodes, training is iterative and computationally demanding, the model is slow to converge, and the result is sensitive to the choice of hyperparameters [3], [4]. In recent years, a new type of feedforward neural network, the Extreme Learning Machine (ELM), was proposed by Huang et al. Because the ELM method randomly generates the connection weights between the input layer and the hidden layer, as well as the hidden layer neuron thresholds, before training and then keeps them fixed, it can overcome some shortcomings of traditional feedforward neural networks [5]. The ELM method has attracted the attention of many researchers because of its fast learning speed and good generalization performance. The ELM algorithm has a wide range of applicability [6]–[9]: it is suitable not only for regression and fitting problems, but also for classification and pattern recognition. However, because the connection weights and thresholds of ELM are randomly generated before training and remain unchanged during training, some hidden layer nodes contribute very little. If the data set is biased, the outputs of most nodes may even be close to zero, so the ELM is generally optimized and improved [10].
Classification is one of the most important research topics in pattern recognition, data mining and machine learning, and it has a wide range of real-world applications. In recent years, swarm intelligence optimization algorithms, as an effective evolutionary computing technology, have been valued by many scholars. A swarm intelligence optimization algorithm is a type of algorithm inspired by behavior observed in nature. Its core idea is to balance random behavior and local search during the search process [11]–[13]. At present, swarm intelligence optimization algorithms show excellent performance in solving most nonlinear and multi-modal real-world optimization problems [14], [15]. All swarm intelligence optimization algorithms trade off some degree of randomization against local search. Compared with traditional search algorithms, they can find better solutions for complex and difficult optimization problems. Examples include the genetic algorithm, which evolved from the evolutionary laws of the biological world; particle swarm optimization, which is based on information transmission in foraging bird flocks; and the cuckoo search algorithm, which simulates the brood parasitism of certain species of cuckoos [16]. Swarm intelligence optimization algorithms are simple and easy to implement, have few parameters, and have short running times. Therefore, they show excellent operability and optimization capability on many nonlinear and multi-modal real-world optimization problems [17].
However, traditional network models based on gradient descent and support vector networks have some shortcomings, such as long training time, slow convergence and a tendency to overfit. For complex real-world classification problems, it is difficult to obtain ideal results by applying traditional classification learning methods directly [18]. How to design an efficient classification model with good generalization ability is still a problem that has not been well solved. In recent years, the research and application of methods based on extreme learning machines have become more and more extensive. The extreme learning machine overcomes the above-mentioned shortcomings of traditional BP algorithms. In the extreme learning machine method, the input weights and the hidden layer biases of the network are randomly generated, and the output weights are calculated with the Moore-Penrose generalized inverse matrix. The activation function only needs to be bounded and continuous, and the method has extremely fast training speed and good generalization ability. However, in some specific applications, the extreme learning machine method may require more hidden layer neurons than traditional optimization methods based on gradient descent. Too many or too few hidden layer neurons will cause over-fitting or under-fitting, respectively. During the training process, non-optimal or unnecessary weights and thresholds may be generated, which reduces the performance of the algorithm and causes unstable results. In addition, this random assignment reduces the response speed of the extreme learning machine to unknown test data. The larger the number of neurons in the hidden layer, the more complex the network structure. An extremely complex network structure not only responds slowly, but also increases computational complexity and memory consumption.
Therefore, the research on the improvement of the extreme learning machine classification method is of great significance.
In this work, a novel data classification method of extreme learning machine based on the crow search algorithm optimized by particle swarm optimization (PSO-CSA-ELM) is proposed. In comparison with the current general selection approaches, the main contributions of this paper can be summarized as follows:
Characterize the swarm intelligence optimization algorithm–crow search algorithm, and classify the current data classification methods.
Propose a novel data classification method of extreme learning machine based on the crow search algorithm optimized by particle swarm optimization (PSO-CSA-ELM).
Provide extensive simulation results to demonstrate the use and efficiency of the proposed data classification method.
Evaluate the performance of the proposed algorithm by comparing it with the data classification methods of the ELM, DE-ELM, PSO-KELM and CSA-ELM algorithms.
The remainder of this paper is organized as follows: Section 2 discusses the related work. Section 3 describes the basic principles of the crow search algorithm optimized by particle swarm optimization. Section 4 describes the design of the extreme learning machine method based on the crow search algorithm optimized by particle swarm optimization. Section 5 provides the parameters and simulation results that validate the performance of the proposed algorithm. Section 6 concludes the paper.
Related Work
The improvement of the extreme learning machine method has attracted a large number of researchers. In view of the above-mentioned shortcomings of ELM, researchers have tried to use the good global search ability of swarm intelligence optimization algorithms to address them. Xu et al. [19] proposed a method combining the particle swarm optimization (PSO) algorithm and ELM. The PSO algorithm is used to optimize the input weights and hidden layer thresholds in ELM, and relatively ideal results were obtained on several prediction problems. Li et al. [20] proposed the evolutionary extreme learning machine (E-ELM). The differential evolution (DE) algorithm is used, instead of the traditional BP parameter optimization algorithm, to optimize the important parameters of ELM (input weights and hidden layer thresholds), which yields a more compact network structure and improves the accuracy of data classification. Alharbi et al. [21] combined an integer-coded genetic algorithm (GA) with the ELM classifier to study gene selection and cancer classification. The GA is used for feature selection: redundant features are removed, and the most important features are selected as the input of the ELM classifier. The results verify that the algorithm has good classification performance and can handle sparse and imbalanced data. Silitonga et al. [22] studied the effect of the random weights connecting the input layer and the hidden layer on ELM performance, and the results showed that the randomly set input weights do have a considerable impact on model training. In some practical classification and regression problems, they degrade algorithm performance. Tian et al. [23] proposed using an improved PSO method to optimize the parameters of the neural network model, and obtained better experimental results than the BP algorithm.
Because the connection weights and thresholds of ELM are randomly generated before training and remain unchanged during the training process, some hidden layer nodes contribute very little. If the data set is biased, the outputs of most nodes may even be close to zero. Therefore, the author of [24] pointed out that a large number of hidden layer nodes need to be set up in order to achieve the desired accuracy. To address this shortcoming, some researchers have combined optimization algorithms with the ELM method. The author of [25] proposed using an artificial bee colony algorithm to optimize the hidden layer node parameters of the ELM, thereby improving its performance. Inspired by evolutionary computing, the author of [26] obtained an adaptive evolutionary extreme learning machine by merging the ELM with the relevant operators of evolutionary computing. With fewer parameters to set, the hidden layer nodes are optimized, which improves the accuracy and stability of ELM in regression and classification problems. The author of [27] drew on the memetic evolution mechanism of the leapfrog algorithm and proposed a hybrid intelligent optimization algorithm (SFLA-ELM) for parameter optimization. The ELM algorithm is used to obtain the output weights of the ELM, and good results are obtained. The author of [28] used the whale optimization algorithm to search the randomly initialized hyperparameters of the extreme learning machine and finally obtained an optimal network, the extreme learning machine based on the whale optimization algorithm (WOA-ELM).
The above research shows that, during training, the input weights, hidden layer thresholds, output weights and number of hidden layer neurons of the ELM algorithm are the key factors that affect its generalization performance. There has been some research on optimization algorithms for the ELM model, but relatively little. Although the DE algorithm and the PSO algorithm can guide the parameter optimization of ELM to a certain extent, they both have certain shortcomings because of their different search mechanisms. For example, although the DE algorithm has better global search capability, its convergence speed is very slow. Although the PSO algorithm has a faster convergence speed and a strong local search ability, it easily falls into a local minimum and fails to find the global optimal solution. The combination of swarm intelligence optimization algorithms and artificial neural networks can further improve the performance of the ELM algorithm, but because of the complexity of the optimization model itself and the lack of a theoretical foundation, research on this issue has not yet reached a satisfactory level, so further research is necessary.
Based on the above ideas, and in order to overcome the shortcomings of the ELM method and further improve its generalization ability, this paper proposes to optimize the ELM algorithm with the crow search algorithm improved by particle swarm optimization, and to verify whether the proposed method can achieve performance no lower than, or even higher than, that of existing ELM optimization algorithms. The aim is to provide a new alternative algorithm and a new approach to ELM optimization. We use the improved crow search algorithm to optimize the ELM model for the following reason. The PSO, DE, and GA algorithms are all stochastic intelligent optimization algorithms; they have a longer history than the improved crow search algorithm, and their research and application are more mature. There is relatively little research on crow search methods, but the algorithm is simple and easy to implement, has low computational overhead, has good global search capability, and has few parameters to adjust. It can deal not only with continuous optimization problems, but also with combinatorial optimization problems. On some complex optimization problems, its performance is better than that of these three algorithms.
Crow Search Algorithm Optimized Particle Swarm Optimization
In 2016, Askarzadeh proposed a new optimization algorithm based on the foraging behavior of crow flocks in nature. The crow search algorithm (CSA) is a nature-inspired metaheuristic algorithm. Crows are highly intelligent birds that live in groups [29]. After they find food, they usually hide the excess food; the hiding position is called the memory. They retrieve the food when needed, and they also steal food from other crows by following them. A crow that is being followed can protect its food with a certain awareness probability (AP) to prevent theft. The crow search algorithm has produced research results in fields such as network optimization and distribution and medical testing [30]. It has only two parameters (flight length and awareness probability), is easy to implement, and converges quickly. Therefore, the crow search algorithm has application and research value in different fields and is competitive with other intelligent optimization algorithms [31].
Set a reasonable number of iterations iter,
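If the crow individual $j$ does not know that crow $i$ is following it, crow $i$ approaches the memory position of crow $j$, and its position is updated as follows: \begin{equation*} X^{i,iter+1}=X^{i,iter}+r_{i} \times fl^{i,iter}\times (m^{j,iter}-X^{i,iter})\tag{1}\end{equation*}
The parameter $r_{i}$ is a random number in [0, 1], $fl^{i,iter}$ is the flight length of crow $i$ at iteration $iter$, and $m^{j,iter}$ is the memory position of crow $j$ at iteration $iter$.
The other state is the one in which crow $j$ knows that it is being followed and leads crow $i$ to a random position. The two states can be combined as: \begin{equation*} X^{i,iter+1}=\begin{cases} X^{i,iter}+r_{i} \times fl^{i,iter}\times (m^{j,iter}-X^{i,iter}), & r_{i} \ge AP^{j,iter} \\ \text{a random position}, & \text{otherwise} \end{cases}\tag{2}\end{equation*}
The parameter $AP^{j,iter}$ is the awareness probability of crow $j$ at iteration $iter$.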
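The update in Eq. (2) can be illustrated with a minimal NumPy sketch. The function name `csa_step`, the use of a single scalar flight length and awareness probability for all crows, and the uniform sampling of the random position inside the bounds `lower` and `upper` are assumptions made for illustration, not details given in the text.

```python
import numpy as np

def csa_step(positions, memory, fl, ap, lower, upper, rng):
    """Apply one crow-search position update (Eq. (2)) to every crow.

    positions, memory: arrays of shape (n_crows, dim).
    fl, ap: scalar flight length and awareness probability (assumed constant).
    """
    n_crows, dim = positions.shape
    new_positions = np.empty_like(positions)
    for i in range(n_crows):
        j = rng.integers(n_crows)      # crow to follow, chosen at random
        r = rng.random()               # Eq. (2) uses the same r_i in both places
        if r >= ap:
            # Crow j is unaware: crow i moves toward the memory of crow j.
            new_positions[i] = positions[i] + r * fl * (memory[j] - positions[i])
        else:
            # Crow j is aware: crow i ends up at a random position.
            new_positions[i] = rng.uniform(lower, upper, size=dim)
    return new_positions
```

With `ap=0` every crow follows another crow's memory, and with `ap=1` every crow is relocated at random, which makes the role of the awareness probability easy to see.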
Since the position generated by the global search of the crow algorithm is completely random, in the crow algorithm, the crow
In the population initialization stage, the Tent chaos based on reverse learning is used for initialization.
In terms of flight length, according to the increase in the number of iterations, the size of the flight length is adaptively reduced.
In the global search, the particle search strategy of the particle swarm optimization algorithm is used to search to expand the search range of the crow search algorithm.
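The first two improvements can be sketched as follows. This is only one possible reading: it assumes that "reverse learning" refers to opposition-based learning (the opposite of a point x in [lower, upper] is lower + upper - x), it uses a skew tent map with breakpoint `a = 0.7` because the symmetric tent map degenerates in floating-point arithmetic, and it assumes a linear decay of the flight length from `fl_max` to `fl_min`. All function names and default values are hypothetical.

```python
import numpy as np

def tent_map_sequence(length, x0=0.37, a=0.7):
    """Chaotic sequence in [0, 1] from a skew tent map (assumed variant)."""
    seq = np.empty(length)
    x = x0
    for k in range(length):
        x = x / a if x < a else (1.0 - x) / (1.0 - a)
        seq[k] = x
    return seq

def chaotic_opposition_init(n_crows, dim, lower, upper, fitness, x0=0.37):
    """Build a chaotic population, add its opposite points, keep the best n_crows."""
    chaos = tent_map_sequence(n_crows * dim, x0).reshape(n_crows, dim)
    pop = lower + chaos * (upper - lower)
    opposite = lower + upper - pop              # opposition-based candidates
    candidates = np.vstack([pop, opposite])
    scores = np.array([fitness(c) for c in candidates])
    keep = np.argsort(scores)[:n_crows]         # minimization: smallest scores win
    return candidates[keep]

def flight_length(iteration, max_iter, fl_max=2.0, fl_min=0.5):
    """Flight length that decreases linearly with the iteration count (assumed schedule)."""
    return fl_max - (fl_max - fl_min) * iteration / max_iter
```

A large flight length early on favors global exploration, while the smaller values near the end favor local refinement, which matches the stated intent of reducing the flight length adaptively.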
Extreme Learning Machine Method Based on Crow Search Algorithm Optimized By Particle Swarm Optimization
The original crow search algorithm uses random initialization when initializing the population. The randomness of the initial solutions can lead to a poor distribution and affect the convergence speed. The crow search algorithm tries to maintain a balance between diversity and convergence through the flight length (fl) and the awareness probability (AP). The smaller the value of the flight length fl, the more likely the search is to be local; conversely, the larger the value of fl, the more likely the search is to be global. These parameters are selected by the user before the algorithm is executed, and the state of the solutions obtained during execution is not considered when selecting them, which may cause the final solution to fall into a local optimum.
The PSO-CSA-ELM model proposed in this paper combines the improved crow algorithm with the ELM model. The input weights and hidden layer thresholds in the ELM are optimized by the CSA algorithm, and the output weights are calculated with the Moore-Penrose generalized inverse matrix. The optimized ELM model is then trained on the classification data set, and finally the test set is used to obtain the classification results of the model. The specific process of the PSO-CSA-ELM algorithm is as follows:
Step 1:
Define the relevant parameters of the PSO-CSA algorithm and the extreme learning machine algorithm, randomly generate a d-dimensional crow population, and then initialize the crow population by mapping based on the chaos method. The 5-fold CV method is used to divide the training data set into five subsets, four of which are used as the training set and the remaining one as the test set.
Step 2:
Initialize the population, randomly generate initial solutions and encode them. The dimension of each solution is $L\times (n+1)$, and the first $L\times n$ dimensions represent the input weight values. The remaining $L$ dimensions represent hidden layer thresholds, and they are all continuous real numbers.
Step 3:
Initialize the parameters of the algorithm, including the population size $N$, the upper and lower bounds of the population, and the maximum number of iterations Genmax.
Step 4:
Decode the solution obtained in step 2 to obtain the input weight values and the hidden layer thresholds, and train the ELM model on the training data set. Note that each solution is stored as a vector, whereas the training process works with an $L\times (n+1)$-dimensional matrix, so the conversion between the two forms must be carried out in advance.
Step 5:
Calculate the fitness value corresponding to each solution.
Step 6:
Increment the iteration counter: $\textit{iter} = \textit{iter} + 1$.
Step 7:
In the employment bee phase, update each solution.
Step 8:
Use the input weight value and hidden layer threshold value obtained in step 7 to calculate the fitness value corresponding to each solution.
Step 9:
A crow $j$ is randomly selected. If the random number $r_{j}$ is greater than or equal to the awareness probability AP, then crow $i$ follows crow $j$ and flies to the memory position of crow $j$. If the random number $r_{j}$ is less than the awareness probability AP, then, in order to deceive crow $i$, crow $j$ will fly to another location according to the particle search strategy in the particle swarm algorithm.
Step 10:
The current crow position is used as the ELM model parameter and the data is predicted, and the prediction result is converted into a fitness function value and compared with the fitness function value of the memory position of the crow. If it is better than the memory position, the memory position is updated to the current position.
Step 11:
Perform the above steps for all crows, iterate the number of times specified in the above steps, and return the global optimal position as the initial input weight and threshold of the ELM prediction model.
Step 12:
Use the new solution obtained in step 11 to train the ELM model and calculate its fitness value.
Step 13:
Determine whether the algorithm has reached the maximum number of iterations. If so, go to step 14; otherwise, return to step 6 to continue running the algorithm.
Step 14:
Decode the returned optimal solution to obtain the optimal input weight values and hidden layer thresholds.
Step 15:
Use the trained ELM model to perform classification tests and record the final classification results.
The flow chart of the PSO-CSA-ELM algorithm is shown in Figure 2.
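The steps above can be condensed into a short NumPy sketch. It is an illustration under several assumptions rather than the authors' implementation: a sigmoid activation, the training RMSE as the fitness value, search bounds of [-1, 1], plain uniform initialization instead of the chaotic one, and a standard PSO velocity update (inertia `w`, learning factors `c1` and `c2`, values chosen arbitrarily) in place of the random relocation of Eq. (2). The names `decode`, `fitness` and `pso_csa_elm` are hypothetical.

```python
import numpy as np

def decode(solution, n_hidden, n_features):
    """Split a flat vector of length L*(n+1) into weights (L x n) and thresholds (L,)."""
    split = n_hidden * n_features
    return solution[:split].reshape(n_hidden, n_features), solution[split:]

def fitness(solution, X, T, n_hidden):
    """Training RMSE of the ELM defined by `solution` (lower is better)."""
    W, b = decode(solution, n_hidden, X.shape[1])
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))   # sigmoid hidden layer output
    beta = np.linalg.pinv(H) @ T               # Moore-Penrose output weights
    return float(np.sqrt(np.mean((H @ beta - T) ** 2)))

def pso_csa_elm(X, T, n_hidden=10, n_crows=30, max_iter=50,
                fl=2.0, ap=0.1, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    dim = n_hidden * (X.shape[1] + 1)
    pos = rng.uniform(-1.0, 1.0, (n_crows, dim))
    vel = np.zeros_like(pos)
    mem = pos.copy()                           # each crow's best known position
    mem_fit = np.array([fitness(p, X, T, n_hidden) for p in pos])
    for _ in range(max_iter):
        gbest = mem[np.argmin(mem_fit)].copy()
        for i in range(n_crows):
            j = rng.integers(n_crows)
            if rng.random() >= ap:
                # Follow crow j toward its memory position (Eq. (1)).
                pos[i] += rng.random() * fl * (mem[j] - pos[i])
            else:
                # Assumed PSO-style move instead of a purely random position.
                vel[i] = (w * vel[i]
                          + c1 * rng.random(dim) * (mem[i] - pos[i])
                          + c2 * rng.random(dim) * (gbest - pos[i]))
                pos[i] += vel[i]
            pos[i] = np.clip(pos[i], -1.0, 1.0)
            f = fitness(pos[i], X, T, n_hidden)
            if f < mem_fit[i]:                 # update memory only on improvement
                mem[i], mem_fit[i] = pos[i], f
    best = int(np.argmin(mem_fit))
    return mem[best].copy(), float(mem_fit[best])
```

Because the memory of each crow is replaced only when the fitness improves, the best fitness returned can never be worse than that of the initial population, whatever the number of iterations.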
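The time complexity indirectly reflects how long the algorithm takes to execute. In the CSA-ELM algorithm, assume that the execution time required to initialize the parameters (for population size $N$ and dimension $n$) is $x_{1}$, the time required to generate each dimension of an individual is $x_{2}$, and the time required to evaluate the fitness function is $f(n)$. The time complexity of the initialization phase is then: \begin{equation*} O(x_{1} +N(nx_{2} +f(n)))=O(n+f(n))\tag{3}\end{equation*}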
Assuming that the execution time required for the iterative update of each dimension of the individual is the same, which is x3, the time for comparing the advantages and disadvantages and selecting the best after iteration is x4. The calculation time of the flight length of the crow is x5, and the time consumption of the Awareness Probability is x6, then the time complexity of the algorithm at this stage is:\begin{equation*} O(N(nx_{3} +f(n))+x_{4} +x_{5} +x_{6})=O(n+f(n))\tag{4}\end{equation*}
Therefore, the total time complexity of the CSA-ELM algorithm to solve each generation’s optimal is:\begin{equation*} T(n)=O(n+f(n))+O(n+f(n))=O(n+f(n))\tag{5}\end{equation*}
In the improved PSO-CSA-ELM algorithm, the time required for the initialization phase is basically the same as in the CSA-ELM algorithm. Therefore, the time complexity of the initialization phase of the improved algorithm is the same as equation (3). In the algorithm loop, suppose the calculation time of the weighted center is z1, the calculation time of the individual learning position is z2, and the calculation time of the comparison and selection process between the learning individual and the initial individual is z3. Then the time complexity of the loop part is: \begin{equation*} O(N(nx_{3}+f(n))+x_{4}+x_{5}+x_{6}+N(z_{2}+z_{3})+z_{1})=O(n+f(n))\tag{6}\end{equation*}
Therefore, the total time complexity of the improved PSO-CSA-ELM algorithm to solve the optimal of each generation is:\begin{equation*} T(n)=O(n+f(n))+O(n+f(n))=O(n+f(n))\tag{7}\end{equation*}
In summary, the improvement strategy of the PSO-CSA-ELM algorithm does not increase the time complexity compared with the original CSA-ELM algorithm.
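As a brief check of the reductions used in this analysis, the following sketch assumes that the population size $N$ and the unit costs $x_{k}$ and $z_{k}$ are constants that do not grow with the dimension $n$ (an assumption that is not stated explicitly in the text):

\begin{align*} x_{1}+N(nx_{2}+f(n)) &= (Nx_{2})\,n+N\,f(n)+x_{1}=O(n+f(n)) \\ N(nx_{3}+f(n))+x_{4}+x_{5}+x_{6}+N(z_{2}+z_{3})+z_{1} &= (Nx_{3})\,n+N\,f(n)+\text{const}=O(n+f(n)) \end{align*}

Under this assumption, the extra terms $N(z_{2}+z_{3})+z_{1}$ introduced by the PSO-based strategy contribute only a constant, which is why the order of growth is unchanged.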
Algorithm Simulation Comparison and Analysis
A. Simulation Environment Settings
To verify the performance of the proposed algorithm, this article conducts experiments on eight data sets: Computer Hardware, QSAR Aquatic Toxicity, Real Estate Valuation, Servo, Bupa Liver, Cleveland Heart, Breast Cancer and Iris. The first four are used for regression and the last four are used for classification. These experimental data sets all come from a well-known open database, the machine learning repository provided by the University of California, Irvine. Before the experiment, the data is preprocessed. Because the Australian, Breast Cancer and Cleveland Heart data sets have missing feature values, the missing entries in these records are filled in by averaging in order to preserve the integrity of the sample data. At the same time, in order to reduce the difference between feature values and prevent larger feature values from dominating smaller ones, we normalize each feature to the interval [−1,1].
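The preprocessing described above might look like the following sketch. It assumes that the averaging refers to replacing each missing value with the mean of the observed values of the same feature, that missing entries are encoded as NaN, and that the normalization is a per-feature min-max scaling; the function names are hypothetical.

```python
import numpy as np

def impute_mean(X):
    """Replace NaN entries with the mean of the observed values in the same column."""
    X = np.asarray(X, dtype=float).copy()
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X

def scale_to_range(X, low=-1.0, high=1.0):
    """Min-max scale each column to [low, high]; constant columns map to `low`."""
    X = np.asarray(X, dtype=float)
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    span = np.where(x_max > x_min, x_max - x_min, 1.0)   # avoid division by zero
    return low + (X - x_min) * (high - low) / span

raw = np.array([[1.0, np.nan], [3.0, 4.0], [5.0, 8.0]])
clean = scale_to_range(impute_mean(raw))
```

In practice the minimum and maximum would usually be computed on the training portion only and then reused for the test portion, so that no information from the test set leaks into the scaling.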
In order to verify that the PSO-CSA-ELM algorithm has better performance in terms of convergence and optimization speed, it is compared with the CSA-ELM, PSO-ELM and DE-ELM algorithms on function fitting, regression and classification data sets. The maximum number of generations in all experiments is set to 50, and the population size is 30. All experiments were run 50 times, and the root mean square error, or the average and standard deviation of the classification accuracy, was taken as the experimental result. In the crow search algorithm (CSA), the flight length fl and the awareness probability AP are 2 and 0.1, respectively. The parameter settings of the PSO algorithm: learning factor
B. Test Objective Function Optimization
1) Sinc Function Simulation Experiment Comparison
The four algorithms are compared by fitting the Sinc function. The expression of the Sinc function is as follows: \begin{align*} f(x)=\begin{cases} \displaystyle \frac {\sin (x)}{x}, & x\ne 0 \\ \displaystyle 1, & x=0 \end{cases}\tag{8}\end{align*}
We set to generate 1000 [−10,10] uniformly distributed data sets x, and calculate 1000 data sets
The Root Mean Square Error (RMSE) and the Standard Deviation (Std. Dev) are used as evaluation indicators for error analysis. They are calculated as follows, where $y(i)$ is the target value, $y'(i)$ is the predicted value and $\overline{y'}$ is the mean of the predicted values: \begin{align*} RMSE &= \sqrt {\frac {1}{N}\sum \limits _{i=1}^{N} {(y(i)-y'(i))^{2}}}\tag{9}\\ Std.Dev &= \sqrt {\frac {1}{N-1}\sum \limits _{i=1}^{N} {(y'(i)-\overline {y'})^{2}}}\tag{10}\end{align*}
The smaller the RMSE and Std. Dev values, the lower the prediction error. The Sinc function fitting results are shown in Table 1.
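The two indicators can be computed with a few lines of NumPy. The sketch below reads the overbar term in Eq. (10) as the mean of the predicted values, which is an interpretation, and the function names are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between targets and predictions, Eq. (9)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def std_dev(y_pred):
    """Sample standard deviation of the predictions (divisor N - 1), Eq. (10)."""
    return float(np.std(np.asarray(y_pred, dtype=float), ddof=1))

# Toy values chosen only to exercise the formulas.
error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
spread = std_dev([1.0, 2.0, 5.0])
```

The `ddof=1` argument selects the N - 1 divisor that appears in Eq. (10); the NumPy default would divide by N instead.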
It can be seen from Table 1 that the RMSE and Std. Dev values of the basic ELM method are the largest, so its performance is the worst. The RMSE and Std. Dev values of the PSO-ELM and DE-ELM algorithms are also large, and their test results are poor. The values of the CSA-ELM algorithm are smaller, and its test results are better. The values of the PSO-CSA-ELM algorithm are the smallest, and its test results are the best. This shows that the error of the PSO-CSA-ELM model is smaller and its prediction accuracy is better than that of the ELM, PSO-ELM, DE-ELM and CSA-ELM algorithms. Table 1 also shows that, as the number of hidden layer nodes increases, the average test error and standard deviation gradually decrease, although over-fitting occurs when there are too many hidden layer nodes. Because the CSA-ELM algorithm easily falls into a local optimum, among other shortcomings, its results are still poor when the number of nodes is high. In most cases, for the same number of hidden layer nodes, the PSO-CSA-ELM algorithm has a smaller average test error and standard deviation.
2) Comparison of Classification Data Set
In this paper, we compare the performance of the four algorithms using four real classification data sets from the machine learning repository of the University of California, Irvine: Breast Cancer, Bupa Liver, Cleveland Heart and Iris. In the experiment, each data set is randomly divided into a training set (70%) and a test set (the remaining 30%). In order to reduce the influence of large differences between variables, we normalize the data before the algorithm runs: the input variables are normalized to [−1,1], and the output variables are normalized to [0, 1]. In all experiments, the algorithm iterates 50 times, and the average of 50 experimental results is calculated.
In order to evaluate the effectiveness of the PSO-CSA-ELM algorithm, we conduct a series of experiments on four classification data sets. The number of hidden layer neurons ranges over [5, 25] with a step size of 5. This range is chosen because, for ELM, too many hidden layer neurons cause over-fitting as the number of nodes increases. In addition, the ABC method can find better parameters, so that the algorithm needs fewer hidden layer neurons and achieves more stable results. Tables 2, 3, 4 and 5 show the comparison of the results of the 50% CV on the Breast Cancer, Bupa Liver, Cleveland Heart, and Iris data sets, respectively. Performance evaluation criteria include training classification accuracy, test classification accuracy, standard deviation, the number of hidden layer neurons required, output weight norm, and training time. In addition, we add a large data set for simulation test comparison, the electrical grid stability simulated data set. It consists of 10000 samples, each of which has 14 feature values. The comparison of results on the electrical grid stability simulated data set is shown in Table 6.
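The experimental protocol (random 70/30 split, hidden layer sizes from 5 to 25 in steps of 5, repeated runs summarized by mean and standard deviation) can be sketched for the basic ELM baseline as follows. The sigmoid activation, the one-hot target encoding, the weight range [-1, 1] and all function names are assumptions made for illustration; labels are assumed to be integers starting at 0.

```python
import numpy as np

def elm_fit(X, T, n_hidden, rng):
    """Train a basic ELM: random hidden layer, pseudoinverse output weights."""
    W = rng.uniform(-1.0, 1.0, (n_hidden, X.shape[1]))
    b = rng.uniform(-1.0, 1.0, n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))
    return W, b, np.linalg.pinv(H) @ T

def elm_predict(X, W, b, beta):
    """Return the predicted class index for each row of X."""
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))
    return np.argmax(H @ beta, axis=1)

def sweep_hidden_sizes(X, y, sizes=range(5, 30, 5), runs=50, seed=0):
    """Mean and standard deviation of test accuracy for each hidden layer size."""
    rng = np.random.default_rng(seed)
    T = np.eye(int(y.max()) + 1)[y]            # one-hot targets
    n_train = int(0.7 * len(X))
    results = {}
    for n_hidden in sizes:
        acc = []
        for _ in range(runs):
            idx = rng.permutation(len(X))      # fresh random 70/30 split per run
            tr, te = idx[:n_train], idx[n_train:]
            W, b, beta = elm_fit(X[tr], T[tr], n_hidden, rng)
            acc.append(np.mean(elm_predict(X[te], W, b, beta) == y[te]))
        results[n_hidden] = (float(np.mean(acc)), float(np.std(acc, ddof=1)))
    return results
```

An optimized variant would replace the random `W` and `b` in `elm_fit` with the values returned by the optimizer, while the rest of the protocol stays the same.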
As can be seen from Tables 2, 3, 4, and 5, PSO-CSA-ELM achieved the best results on these four classification data sets, and it also used the fewest neurons to achieve them. This shows that the proposed PSO-CSA algorithm can effectively optimize the parameters and obtain a more compact network structure, and that optimizing the ELM model with the PSO-CSA algorithm achieves better classification performance and generalization ability. The standard deviation obtained by the PSO-CSA-ELM algorithm is also the smallest, which shows that the algorithm has good stability. The comparison of training time in Tables 2, 3, 4 and 5 shows that, because the randomly generated parameters of ELM do not need to be adjusted, its training is very fast, but its classification results are poor. Compared with the ELM, DE-ELM, PSO-ELM and CSA-ELM algorithms, the PSO-CSA-ELM algorithm differs little in training time and shows no efficiency advantage. However, because the ELM model constructed with the PSO-CSA optimization algorithm achieves better classification accuracy, the computational cost is acceptable. The results on the large electrical grid stability simulated data set in Table 6 show that the proposed algorithm also has better classification accuracy and a smaller error than the other four algorithms.
In this experiment, we analyzed the performance of the algorithm with the increasing number of hidden layer neurons, as shown in Figure 3.
Figure 3. Classification accuracy of the five algorithms on different data sets versus the number of hidden layer nodes.
Figure 3 compares how the classification accuracy changes on the different classification data sets. Figure 3(a) shows the results of the five algorithms on Breast Cancer. The curve of PSO-CSA-ELM changes relatively smoothly, and the highest classification accuracy is obtained when the number of neurons is 20. Even when the number of neurons is small, the accuracy is higher than 70%. In contrast, the original ELM and DE-ELM have poor results when the number of neurons is less than 15, which may indicate underfitting. Figure 3(b) shows the results on the Bupa Liver data set, where PSO-CSA-ELM has the highest classification accuracy. Although PSO-CSA-ELM reached its highest value only when the number of neurons was 30, it achieved a classification accuracy of 95.37%, higher than that of the other algorithms with the same number of neurons. For the Cleveland Heart data set, Figure 3(c) shows that our proposed method is relatively close to the other improved ELM methods, but it requires only 15 neurons and obtains the smallest variance. For the Iris data set, Figure 3(d) shows that our proposed method is superior to the other methods. In addition, as the number of neurons increases, the classification accuracy also increases. However, when the number of neurons is greater than 30, the results of ELM and PSO-ELM drop significantly, which may indicate overfitting. In contrast, the PSO-CSA-ELM algorithm proposed in this paper fluctuates little, which shows that it has good stability. On the large data set in Figure 3(e), the algorithm proposed in this paper has the highest classification accuracy.
3) Regression Test Data Set Simulation Experiment Comparison
Similarly, we tested the four regression data sets of Computer Hardware, QSAR Aquatic Toxicity, Real Estate Valuation and Servo. The test results are shown in Tables 7, 8, 9 and 10. The classification accuracy of the five algorithms varies with the number of hidden layer nodes as shown in Figure 4.
Figure 4. The classification accuracy of the five algorithms versus the number of hidden nodes.
It can be seen from Tables 6, 7, 8 and 9 that the ELM method has the largest root mean square error, the largest variation range, and poor performance. The root mean square errors of the DE-ELM and PSO-ELM methods are also relatively large, while that of the CSA-ELM method is smaller. The PSO-CSA-ELM algorithm proposed in this paper has the smallest root mean square error and the best result, which also shows that the algorithm has good stability. From the comparison of training time in Tables 6, 7, 8 and 9, it can be seen that since ELM randomly generates its parameters without adjustment, its training speed is very fast, but its results are not good. Compared with the ELM, DE-ELM, PSO-ELM and CSA-ELM algorithms, the PSO-CSA-ELM algorithm shows little difference in training time and therefore offers no efficiency advantage. From the results on the four regression data sets shown in Figure 4 (Computer Hardware, QSAR Aquatic Toxicity, Real Estate Valuation and Servo), the ELM method has the worst performance on every data set, while the PSO-CSA-ELM algorithm proposed in this paper has the highest accuracy and the best performance.
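The two quantities compared above, the root mean square error of a single run and its spread over repeated runs, can be illustrated with a minimal Python sketch. The function names `rmse` and `summarize_runs` and the toy values are our own illustrative assumptions; they are not taken from the experiments reported in the tables.

```python
import math
import statistics

def rmse(y_true, y_pred):
    """Root mean square error between targets and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def summarize_runs(rmse_values):
    """Mean and population standard deviation of RMSE over repeated runs.

    A smaller standard deviation is read here as a more stable algorithm.
    """
    return statistics.mean(rmse_values), statistics.pstdev(rmse_values)

# Toy values for illustration only (hypothetical, not experimental data).
targets = [1.0, 2.0, 3.0, 4.0]
exact_predictions = [1.0, 2.0, 3.0, 4.0]
shifted_predictions = [2.0, 3.0, 4.0, 5.0]

error_exact = rmse(targets, exact_predictions)
error_shifted = rmse(targets, shifted_predictions)
mean_error, spread = summarize_runs([1.0, 3.0])
```

Reporting both the mean and the spread over several runs separates an algorithm that is accurate on average from one that is also consistent from run to run.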
4) Simulation Experiment Comparison of Speech Signal Classification Data Set
Speech feature signal recognition and classification is an important aspect of speech recognition research, and it is generally solved by the principle of pattern matching. This paper selects four different types of music, namely famous songs, guzheng, rock and pop, and uses the proposed PSO-CSA-ELM classification method to classify them. For each piece of music, the cepstrum coefficient method is used to extract 500 groups of 24-dimensional speech feature signals. Modeling the speech feature signal classification algorithm with the PSO-CSA-ELM method includes three steps: PSO-CSA-ELM neural network construction, PSO-CSA-ELM neural network training and PSO-CSA-ELM neural network classification. The classification results of the five algorithms on the speech feature signals are compared in Figure 5. Figures 5(a), 5(b), 5(c), 5(d) and 5(e) show the prediction results of the ELM, DE-ELM, PSO-ELM, CSA-ELM and PSO-CSA-ELM algorithms, respectively. Figure 6 shows how the prediction accuracy of the five algorithms changes with the number of hidden layer nodes. Figure 7 compares the prediction accuracy of the five algorithms.
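The three steps named above (construction, training and classification) can be sketched for a plain ELM as follows. This is a minimal illustration under stated assumptions, not the PSO-CSA-ELM implementation: the input weights are left random rather than optimized, the data are synthetic Gaussian clusters standing in for the cepstrum features (assumed here to be 4 classes with 500 samples of 24 dimensions each), and the names `train_elm` and `predict_elm`, the sigmoid activation and the 1500/500 split are our own choices.

```python
import numpy as np

def train_elm(X, y, n_hidden, n_classes, rng):
    """Construct and train a basic ELM; returns (W, b, beta)."""
    # Construction: random input weights and hidden biases, fixed afterwards.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    # Training: hidden-layer output H, then output weights by least squares
    # using the Moore-Penrose pseudo-inverse of H.
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    T = np.eye(n_classes)[y]  # one-hot target matrix
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Classification: pick the class with the largest output value."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.argmax(H @ beta, axis=1)

# Synthetic stand-in data (assumption): 4 classes x 500 samples x 24 features.
rng = np.random.default_rng(0)
n_per_class, n_features, n_classes = 500, 24, 4
centers = rng.normal(0.0, 3.0, size=(n_classes, n_features))
X = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

# Standardize features so the sigmoid units are not driven into saturation.
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Shuffle, then hold out the last 500 samples for testing.
order = rng.permutation(len(y))
X, y = X[order], y[order]
X_train, y_train = X[:1500], y[:1500]
X_test, y_test = X[1500:], y[1500:]

W, b, beta = train_elm(X_train, y_train, n_hidden=30, n_classes=n_classes, rng=rng)
accuracy = float(np.mean(predict_elm(X_test, W, b, beta) == y_test))
```

In an optimized variant, `W` and `b` would be supplied by the search algorithm instead of being drawn once at random, while the least-squares step for `beta` would stay the same.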
Figure 6. The prediction accuracy of the five algorithms versus the number of hidden nodes.
It can be seen from Figure 5 that the ELM algorithm in Figure 5(a) has the worst prediction result, with an accuracy of 78.5%. The prediction result of the DE-ELM algorithm in Figure 5(b) is poor, with an accuracy of 87%. The prediction result of the PSO-ELM algorithm in Figure 5(c) is average, with an accuracy of 89.5%. The prediction result of the CSA-ELM algorithm in Figure 5(d) is better, with an accuracy of 92.5%. The PSO-CSA-ELM algorithm proposed in this paper, shown in Figure 5(e), has the best performance, with an accuracy of 96.5%. The proposed algorithm therefore achieves the best classification accuracy on the speech feature signals. Figure 6 compares the classification accuracy of the five algorithms for different numbers of hidden layer neurons. As the number of hidden layer neurons increases, the classification accuracy of all five algorithms gradually increases. However, ELM remains the worst, while the performance of the other four algorithms increases significantly, and the classification accuracy of the proposed PSO-CSA-ELM algorithm shows the largest increase. Figure 7 compares the classification accuracy of the five algorithms across different experimental runs, and again the algorithm proposed in this paper has the best classification accuracy on the speech feature signals.
Conclusion
In this paper, we analyze the performance and related defects of the extreme learning machine, and use an improved crow search algorithm to address them. By introducing the particle swarm optimization behavior of following the current optimal solution, we address the shortcomings of the crow search algorithm, namely its slow convergence and its tendency to fall into local optimal solutions. In this way, the weights and thresholds of the ELM can be calculated quickly, and the ELM is optimized to have better calculation speed and accuracy. A particle swarm search strategy is adopted to enhance the global search ability, and a Gaussian function is added in the later stage of the iteration. The penalty coefficient of this function is used to perform a local disturbance that gradually reduces the amplitude of the search trajectory, and the parameters are then adjusted adaptively to avoid being attracted by local extrema and to further improve generalization ability. Finally, the improved crow search algorithm is used to optimize the hidden layer neurons and connection weights of the extreme learning machine neural network, so as to obtain accurate prediction results. The proposed PSO-CSA-ELM algorithm has been verified on classification data sets, regression data sets and speech signal recognition, and the results show that the improved algorithm achieves higher accuracy.
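The idea of letting each crow also follow the current optimal solution can be sketched as follows. This is a minimal illustration and not the exact update rule of this paper: the extra attraction term with coefficient `c`, the clipping of positions to the search bounds, the parameter defaults and the sphere test function are all our own assumptions, and the later-stage Gaussian disturbance is omitted.

```python
import random

def sphere(x):
    """Simple test objective (assumption): sum of squares, minimum at 0."""
    return sum(v * v for v in x)

def pso_guided_csa(fitness, dim, n_crows=20, n_iter=100, fl=2.0, ap=0.1,
                   c=1.5, lo=-5.0, hi=5.0, seed=0):
    """Crow search with an added pull toward the best memory found so far."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_crows)]
    mem = [p[:] for p in pos]                 # each crow's best-known position
    mem_fit = [fitness(p) for p in pos]
    history = [min(mem_fit)]
    for _ in range(n_iter):
        best = mem[mem_fit.index(min(mem_fit))]   # current optimal solution
        for i in range(n_crows):
            j = rng.randrange(n_crows)            # crow to follow
            if rng.random() >= ap:
                # Standard CSA step toward crow j's memory, plus a
                # PSO-style step toward the current best (assumed form).
                r1, r2 = rng.random(), rng.random()
                new = [x + r1 * fl * (m - x) + r2 * c * (g - x)
                       for x, m, g in zip(pos[i], mem[j], best)]
            else:
                # Crow j notices it is being followed: move randomly.
                new = [rng.uniform(lo, hi) for _ in range(dim)]
            new = [min(max(v, lo), hi) for v in new]  # keep within bounds
            pos[i] = new
            f = fitness(new)
            if f < mem_fit[i]:                    # memory only ever improves
                mem[i], mem_fit[i] = new[:], f
        history.append(min(mem_fit))
    k = mem_fit.index(min(mem_fit))
    return mem[k], mem_fit[k], history

best_x, best_f, history = pso_guided_csa(sphere, dim=5)
```

When such a search is applied to an ELM, each position vector would encode the input weights and hidden thresholds, and `fitness` would be an error measure of the resulting network; that coupling is not shown here.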
The algorithm proposed in this paper loses a certain amount of group diversity, because the group tends to move in the direction of the local optimal solution at every update, and the stability of the algorithm is not improved accordingly. In future work, we will try to preserve the diversity of the group during the iterative process and make this the focus of our research.