Solving the Discretization-based Feature Construction Problem using Bi-level Evolutionary Optimization


Abstract:

Feature construction is a crucial data preprocessing technique in machine learning applications because it creates new informative features from the original ones, which improves classification performance and reduces problem dimensionality. Since many feature construction methods require discrete data, discretization is needed to transform constructed features with continuous values into their discrete counterparts. To address this, this paper performs feature construction and feature discretization jointly and synchronously in order to benefit from the advantages of each process. We model the discretization-based feature construction task as a bi-level optimization problem in which the constructed features are evaluated based on their optimized sequence of cut-points. The resulting algorithm is termed Discretization-Based Feature Construction (Bi-DFC); the proposed model is solved using an improved version of an existing co-evolutionary algorithm, named I-CEMBA, that ensures the variation of concatenation trees. Bi-DFC selects original attributes at the upper level, where it also creates and evaluates constructed features based on their corresponding optimal sequence of cut-points. Experimental results on ten high-dimensional datasets show that Bi-DFC outperforms relevant state-of-the-art approaches in terms of classification results.
Date of Conference: 01-05 July 2023
Date Added to IEEE Xplore: 25 September 2023
Conference Location: Chicago, IL, USA

I. Introduction

In machine learning applications, high-dimensional data with tens of thousands of dimensions is now available, which makes classification challenging because of both the high dimensionality and the quality of the feature set. Data preprocessing techniques should therefore be used to enhance the quality of the feature space. One such technique is feature construction, which combines original features to create new high-level ones. In this context, it is important to select only informative features from the original dataset and then combine them to build new high-level features, termed constructed features [1]. Feature construction is an important and challenging task because the search space of attribute combinations is large. For this reason, the construction process should be driven by an optimization process that searches for the best combinations of original features.
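
To make the idea of searching over attribute combinations concrete, the following is a minimal Python sketch of a two-level search. It is not the Bi-DFC algorithm: the paper uses an evolutionary search (I-CEMBA) and sequences of cut-points, whereas this sketch assumes exhaustive enumeration of feature subsets, a simple sum as the construction operator, a single cut-point per constructed feature, and information gain as the score. All function names (entropy, information_gain, best_cut_point, construct_and_evaluate) are hypothetical and chosen for illustration only.

```python
import itertools
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, cut):
    """Gain obtained by splitting `values` into <= cut and > cut."""
    left = [y for v, y in zip(values, labels) if v <= cut]
    right = [y for v, y in zip(values, labels) if v > cut]
    if not left or not right:
        return 0.0
    n = len(labels)
    return (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))


def best_cut_point(values, labels):
    """Lower level (assumed): choose one cut-point maximizing information gain."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    if not candidates:
        return None, 0.0
    scored = ((c, information_gain(values, labels, c)) for c in candidates)
    return max(scored, key=lambda pair: pair[1])


def construct_and_evaluate(X, y, n_selected=2):
    """Upper level (assumed): enumerate attribute subsets, build a sum feature,
    and score it with the cut-point returned by the lower level."""
    best = None
    for subset in itertools.combinations(range(len(X[0])), n_selected):
        constructed = [sum(row[j] for j in subset) for row in X]
        cut, gain = best_cut_point(constructed, y)
        if best is None or gain > best[2]:
            best = (subset, cut, gain)
    return best


# Toy data: four samples, three original attributes, two classes.
X = [[1, 5, 0], [2, 3, 1], [8, 4, 0], [9, 6, 1]]
y = [0, 0, 1, 1]
subset, cut, gain = construct_and_evaluate(X, y)
print(subset, cut, gain)
```

On this toy dataset the sketch keeps the first attribute pair whose summed feature separates the two classes with a single cut-point. Exhaustive enumeration is only feasible for a handful of attributes, which is why an optimization process is needed for the large search spaces discussed above.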
