A review in feature extraction approach in question classification using Support Vector Machine | IEEE Conference Publication | IEEE Xplore

Abstract:

Text classification, an integral part of text mining, has recently attracted much attention in various industries and fields. It assigns text documents to one or more pre-defined categories based on content similarity. While most applications of text classification work at the document level, question classification works at a much finer level of granularity, such as the sentence or phrase. There have been numerous studies on classifying assessment questions according to Bloom's taxonomy in order to measure the cognitive level of learners in higher learning institutions. However, these studies have not yet effectively resolved the overlap problem, in which a Bloom's taxonomy verb keyword is assigned to more than one category of the taxonomy. This overlap makes it difficult to classify a particular question into the right category. Feature extraction plays an important role in improving the accuracy of a classifier such as the Support Vector Machine in question classification. Much of the past related research focuses on feature extraction methods such as bag-of-words (BOW) and syntactic analysis to classify questions; to address the overlap problem, an improvement in feature extraction is needed. In view of this, this study proposes an integrated approach to feature extraction that incorporates the semantic aspect when classifying questions according to Bloom's taxonomy. A Support Vector Machine classifier is used because it is well known for its high accuracy in text classification. With this approach in place, improved accuracy in classifying questions according to Bloom's taxonomy can be expected.
Date of Conference: 28-30 November 2014
Date Added to IEEE Xplore: 02 April 2015
Conference Location: Penang, Malaysia

I. Introduction

Text mining is the process of extracting previously unknown, understandable, and ultimately usable knowledge from large-scale text data. It is a branch of data mining whose object of study consists entirely of text. Text mining is therefore also known as text data mining or text knowledge discovery, and its main purpose is to extract interesting and important patterns and knowledge from unstructured text documents. Text mining can be seen as an extension of database-based data mining or knowledge discovery [1]. Because text mining has to deal with ambiguous and unstructured text data, it is closely connected with other fields, such as information retrieval, information filtering, automatic summarization, text clustering, text classification, natural language processing, artificial intelligence, machine learning, pattern recognition, statistics, and visualization.

One area of text mining that is gaining popularity is text classification, also called text categorization. In general, text classification is the process of assigning text documents to one or more pre-defined categories based on content similarity [2], [3]. The documents in a collection (or corpus) are usually preprocessed so that they can be represented by numerical measures. Supervised learning techniques are then applied to create models, which are subsequently used to assign predefined category labels to unlabeled documents based on the inferred likelihood [4], [5], [6]. Decent accuracy in classifying documents has been reported, and some work has been extended to classifying documents in a web environment. However, achieving the same accuracy may be an arduous task in question classification.
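The general pipeline described above (represent documents numerically, fit a supervised model, then label unseen documents by inferred likelihood) might be sketched as follows. This is only a minimal illustration under stated assumptions: it uses bag-of-words counts with a multinomial Naive Bayes model written in plain Python, chosen here for brevity, and it is not the Support Vector Machine approach the paper proposes. The function names, the toy training documents, and the category labels are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(labeled_docs):
    """Build bag-of-words counts per category from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    doc_counts = Counter()              # label -> number of documents
    vocabulary = set()
    for text, label in labeled_docs:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        doc_counts[label] += 1
        vocabulary.update(tokens)
    return word_counts, doc_counts, vocabulary


def classify(text, model):
    """Return the category with the highest log-likelihood for the text."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        # Start from the log prior of the category.
        score = math.log(doc_counts[label] / total_docs)
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # ignore words never seen during training
            # Laplace (add-one) smoothing avoids zero probabilities.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy corpus, for illustration only.
TRAINING_DOCS = [
    ("the team won the football match", "sports"),
    ("the striker scored a late goal", "sports"),
    ("the bank raised interest rates", "finance"),
    ("the stock market fell sharply", "finance"),
]

model = train(TRAINING_DOCS)
print(classify("the goal decided the match", model))
print(classify("interest rates and the stock market", model))
```

With whole documents, each category accumulates many distinguishing words. A single question supplies only a handful of tokens, which is one reason the same approach may be harder to apply at the question level.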
