GPU-Accelerated Feature Selection for Outlier Detection Using the Local Kernel Density Ratio

Abstract:

Effective outlier detection requires the data to be described by a set of features that captures the behavior of normal data while emphasizing those characteristics of outliers that make them different from normal data. In this work, we present a novel non-parametric evaluation criterion for filter-based feature selection that caters to outlier detection problems. The proposed method seeks the subset of features that represents the inherent characteristics of the normal dataset while forcing outliers to stand out, making them more easily distinguished by outlier detection algorithms. Experimental results on real datasets show the advantage of our feature selection algorithm compared to popular and state-of-the-art methods. We also show that the proposed algorithm is able to overcome the small sample space problem and perform well on highly imbalanced datasets. Furthermore, because the feature selection is highly parallelizable, we implement the algorithm on a graphics processing unit (GPU) to gain significant speedup over the serial version. The benefits of the GPU implementation are twofold: its performance scales very well with both the number of features and the number of data points.

Published in: 2012 IEEE 12th International Conference on Data Mining

Date of Conference: 10-13 December 2012

Date Added to IEEE Xplore: 17 January 2013

DOI: 10.1109/ICDM.2012.51

Conference Location: Brussels, Belgium

I. Introduction

An integral part of any data mining task is a good set of features that can be used to accurately model the inherent characteristics of the data. In practice, the best set of features is not known in advance. Therefore, a pool of candidate features is collected and then processed to remove irrelevant and redundant features. This can reduce both the memory and computational cost of the data mining algorithm and improve the accuracy of the learner. The space of possible features is reduced in two ways: feature transformation and feature (subset) selection. In the former, the original space of features is transformed into a new feature space, as in Principal Components Analysis (PCA).
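
The difference between the two approaches can be illustrated with a minimal sketch. It is not the paper's method: the toy data, the function names (`select_features`, `transform_features`), and the use of power iteration to find the first principal direction are all assumptions chosen to keep the example dependency-free. Selection keeps a subset of the original columns unchanged, while a PCA-style transformation builds a new feature as a linear combination of all columns.

```python
import math
import random


def select_features(rows, keep):
    """Feature selection: keep a subset of the original columns unchanged."""
    return [[row[j] for j in keep] for row in rows]


def covariance_matrix(rows):
    """Return the sample covariance matrix and the mean-centered rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    return cov, centered


def first_principal_direction(cov, iters=200):
    """Power iteration: approximates the dominant eigenvector of cov."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def transform_features(rows):
    """Feature transformation: project onto the first principal direction."""
    cov, centered = covariance_matrix(rows)
    direction = first_principal_direction(cov)
    projected = [[sum(c[j] * direction[j] for j in range(len(direction)))]
                 for c in centered]
    return projected, direction


# Hypothetical toy data: feature 1 is roughly twice feature 0 (redundant),
# and feature 2 is low-variance noise (irrelevant).
random.seed(0)
rows = []
for _ in range(200):
    t = random.uniform(-1.0, 1.0)
    rows.append([t, 2.0 * t + random.gauss(0.0, 0.01), random.gauss(0.0, 0.01)])

selected = select_features(rows, keep=[0, 2])      # original columns 0 and 2
projected, direction = transform_features(rows)    # one new combined feature
```

In this sketch, `selected` still contains values that can be read as the original measurements, whereas `projected` holds a single derived feature whose weights (`direction`) mix the first two columns. That loss of a direct mapping to the original features is the usual trade-off of transformation methods.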
