I. INTRODUCTION
The increase in the amount of data collected from multiple sources calls for different strategies for data analysis. Such strategies fall under the areas of data mining and machine learning. One of these strategies is classification, which aims to divide a dataset into groups according to selected features. Several methods are used to improve classification accuracy, and they can be grouped into two categories. The first category aims to improve the classification by using meta-heuristic (MH) approaches. For example, the works in [1], [2] and [3] used artificial bee colony and particle swarm optimization to improve the performance of support vector machines. The second category, on the other hand, prepares the dataset before it is passed to any classifier by removing irrelevant features that may degrade the classifier's performance. Selecting the relevant features is therefore required before subsequent classification processes, and it leads to improved classification accuracy and reduced classification time. Feature selection (FS) is a method used to extract the most representative features from a large set of data. FS is an important step for reducing the dimensionality of the dataset [4], [5]. Its application is reflected in the speed of the entire processing method and in the performance of the learning model applied in later postprocessing steps [6]. FS can be used to tackle real-world applications, for example in signal processing, computer graphics, data mining and biology [7].