I. Introduction
Stacking models in machine learning is an attractive approach because it leverages the learning capabilities of several base models and enables a higher-level model to predict with greater accuracy [1]. Model stacking is widely adopted by practitioners and competition winners [2]. Decision tree-based algorithms perform well on structured data. XGBoost, a popular ensemble technique based on gradient boosting, is preferred over other ensemble techniques in many predictive-analysis scenarios because of its high prediction accuracy and built-in regularization [3]. XGBoost also handles unbalanced datasets well and requires less hyperparameter tuning than Random Forest [23]. In recent years, LightGBM [4] has emerged as a promising competitor to XGBoost because of its speed and efficiency: LightGBM grows trees leaf-wise, splitting the leaf that most reduces the loss, and thus often achieves higher accuracy. Lasso Regression [14] is well suited to small datasets and provides powerful implicit feature selection through its L1 penalty. At Level-0 of the stack, we therefore employ XGBoost, LightGBM, and Lasso Regression as base learners that capture the most informative features.

Deep learning algorithms are among the best performers in applications that demand high accuracy, although they typically require large amounts of training data. A key strength of deep learning is its ability to learn feature representations directly from the data. In our architecture, a deep multilayer perceptron (DeepMLP) serves as the Level-1 model and performs the regression task of predicting diabetes progression [5].
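A minimal sketch of this two-level design is given below, using scikit-learn's StackingRegressor together with the xgboost and lightgbm packages; the hyperparameters and the use of scikit-learn's built-in diabetes-progression dataset are illustrative assumptions, not the exact experimental configuration of this paper.

    # Sketch of the two-level stacking architecture described above.
    # Hyperparameters and the load_diabetes dataset are illustrative assumptions.
    from sklearn.datasets import load_diabetes
    from sklearn.ensemble import StackingRegressor
    from sklearn.linear_model import Lasso
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor
    from xgboost import XGBRegressor
    from lightgbm import LGBMRegressor

    X, y = load_diabetes(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # Level-0 base learners: two gradient-boosted tree ensembles plus Lasso,
    # whose L1 penalty performs implicit feature selection.
    base_learners = [
        ("xgb", XGBRegressor(n_estimators=200, learning_rate=0.05)),
        ("lgbm", LGBMRegressor(n_estimators=200, learning_rate=0.05)),
        ("lasso", Lasso(alpha=0.1)),
    ]

    # Level-1 meta-learner: a multilayer perceptron trained on the
    # out-of-fold predictions produced by the base models.
    meta_learner = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000)

    stack = StackingRegressor(
        estimators=base_learners, final_estimator=meta_learner, cv=5
    )
    stack.fit(X_train, y_train)
    print("Held-out R^2:", stack.score(X_test, y_test))

The cv=5 setting ensures that the Level-1 model is fitted on cross-validated predictions of the base learners rather than on their in-sample outputs, which guards against overfitting the meta-learner. Based on these considerations, the research questions investigated in this paper are as follows: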