I. INTRODUCTION
In rough set theory (RST) [1], the set of all objects sharing the same properties is called a concept. Knowledge of the concepts in a decision system is in general incomplete, so each concept is approximated by its lower and upper approximations; in this context, the concepts are termed rough sets. Objects belonging to different concepts can be discerned by their attribute values. Subsets of attributes that discern the same number of objects as the full attribute set are termed reducts [1]. Rough set feature selection (RSFS) aims to find minimal-size reducts for classifier training. RSFS operates on discrete attributes only, because the indiscernibility relation is defined over discrete values; for datasets with real-valued attributes, discretization must therefore be performed before RSFS to transform real attribute values into discrete intervals.

This work is motivated by the results of our previous work [9], which evaluated the effect of RSFS on the performance of decision trees. There, the RMEP discretization method was integrated with a genetic-algorithm-based RSFS approach and a decision tree classifier, and nine datasets from the UCI repository [7] were used to evaluate the approach. For the high-dimensional datasets, RMEP generated discrete datasets with empty cores; for the low-dimensional datasets, it generated discrete datasets with nonempty cores. The results suggested that discretization affects the performance of RSFS as follows: the core size of a discretized dataset is determined by the discretization process; when the ratio of core size to data dimensionality is close to 0, RSFS performs erratically and does not consistently improve the performance of decision trees, whereas when this ratio exceeds 0.1, RSFS tends to improve the performance of decision trees [9].

Current discretization methods, however, perform discretization without considering the core size of the discretized dataset to be produced. This paper therefore proposes core-generating approximate minimum entropy discretization (C-GAME), which selects cuts of minimum entropy value that are capable of generating discrete datasets with nonempty cores, and proposes a modelling approach for C-GAME based on constraint satisfaction [10].

The paper is organized as follows: Section II presents the basic concepts of rough set theory, discretization problems and constraint satisfaction optimization problems (CSOPs); Section III defines C-GAME and models it as a CSOP; Section IV investigates the performance of C-GAME on two datasets by integrating it with RSFS and decision trees; Section V gives conclusions and further work.
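To make the core-size criterion above concrete, the short Python sketch below computes the core of a small, hypothetical discretized decision table (four objects, condition attributes a1-a3, a binary decision) and compares the ratio of core size to dimensionality against the 0.1 threshold reported in [9]. It uses a simple pairwise-discernibility formulation of the core (an attribute is in the core if removing it reduces the number of discernible object pairs with different decisions); it is only a minimal illustration of the definitions, not the genetic-algorithm-based RSFS procedure used in the experiments.

from itertools import combinations

def discerned_pairs(table, decisions, attrs):
    # Count object pairs with different decision values that are told
    # apart by at least one attribute in `attrs`.
    count = 0
    for i, j in combinations(range(len(table)), 2):
        if decisions[i] != decisions[j] and any(
                table[i][a] != table[j][a] for a in attrs):
            count += 1
    return count

def core(table, decisions, attrs):
    # Core = indispensable attributes: removing any of them lowers the
    # discernibility count achieved by the full attribute set.
    full = discerned_pairs(table, decisions, attrs)
    return {a for a in attrs
            if discerned_pairs(table, decisions, attrs - {a}) < full}

# Hypothetical discretized decision table.
table = [{'a1': 0, 'a2': 1, 'a3': 0},
         {'a1': 0, 'a2': 0, 'a3': 0},
         {'a1': 1, 'a2': 1, 'a3': 0},
         {'a1': 1, 'a2': 0, 'a3': 1}]
decisions = [0, 1, 0, 1]
attrs = {'a1', 'a2', 'a3'}

c = core(table, decisions, attrs)
ratio = len(c) / len(attrs)  # ratio of core size to data dimensionality
print(c, ratio)              # {'a2'} 0.33... -> above the 0.1 threshold of [9]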