I. Introduction
With the rapid development of Internet technology, big data technology has gradually become a potential resource for information transmission and resource sharing in different industries in the world. Data transmission in various fields is realized through Internet technology, and the data information on the Web is also growing explosively. There is a phenomenon that “information technology is difficult to find”. How to solve the problem from these scattered, heterogeneous, and not unified management vast numbers, it is difficult to obtain information quickly and accurately in data mining. Due to the diversified development of various big data content, for example, data information contains a large number of text, audio, video, graphics, images and other elements, and the display forms is also different. In the existing technology, when users use these data, the retrieval and classification methods are usually difficult to meet the needs of users, and it is obvious to obtain the information of users' needs in these data information deli does not follow the heart [1]–[2]. Therefore, how to find the potential rules and useful knowledge from the massive data, so that users can properly classify the data, obtain useful data, and improve the efficiency of data processing and utilization, the application of data mining technology is particularly necessary. Data mining is an interdisciplinary subject, involving many fields, including statistics, database, machine learning and artificial intelligence. Data mining, also known as knowledge discovery in database, is a non-trivial process of obtaining novel, useful, effective and understandable patterns from a large amount of “ocean like” data, that is, extracting knowledge from a large amount of data. Classification is a very important research topic in data mining technology. By using classification technology, we can extract models or functions that describe the same data categories from the data set, and can smoothly classify each unknown category of data in the data set into a known category. At present, the commonly used data mining classification algorithms are: statistical classification, decision tree, artificial neural network methods. Different algorithms will produce different classifiers, and different classifiers will affect the accuracy and efficiency of data mining. Therefore, it is very necessary to choose an appropriate classification algorithm when faced with the huge amount of data.