I. Introduction
The goal of word sense disambiguation (WSD) is to assign the appropriate sense to an ambiguous word in a given context. A variety of supervised WSD techniques have demonstrated reasonable performance, including exemplar-based learning [1], decision lists [2], maximum entropy models [3], and Naive Bayes models [4] [5]. In these supervised approaches, the sense ambiguity of a word is resolved with the help of the contexts in which it occurs. Two types of features, local collocation features (LCF) and topical contextual features (TCF), are commonly used in WSD studies to represent these contexts [4] [6]; examples include local words or part-of-speech (POS) tags with position information, bi-gram templates, collocations, and syntactic features. LCF and TCF generally take morphological or syntactic information into account.
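
As a rough illustration of the two feature types described above, the following minimal Python sketch extracts position-tagged local words and bi-gram templates around a target word, together with a bag of surrounding words as topical context. It is not the feature set of any cited system. The function names, the window size, the padding symbol, and the reading of TCF as an unordered set of context words are assumptions made here for illustration only.

```python
from typing import Dict, FrozenSet, List, Set


def local_collocation_features(tokens: List[str], target: int,
                               window: int = 2) -> Dict[str, str]:
    """Position-tagged neighbouring words plus two bi-gram templates.

    Hypothetical helper: the feature names and the "<pad>" symbol are
    illustrative choices, not taken from the cited work.
    """
    def word_at(pos: int) -> str:
        # Positions outside the sentence are filled with a padding symbol.
        return tokens[pos].lower() if 0 <= pos < len(tokens) else "<pad>"

    feats: Dict[str, str] = {}
    for offset in range(-window, window + 1):
        if offset != 0:
            feats[f"w[{offset:+d}]"] = word_at(target + offset)
    # Bi-gram templates that pair the target word with each direct neighbour.
    feats["bigram[-1,0]"] = f"{word_at(target - 1)}_{word_at(target)}"
    feats["bigram[0,+1]"] = f"{word_at(target)}_{word_at(target + 1)}"
    return feats


def topical_contextual_features(tokens: List[str], target: int,
                                stopwords: FrozenSet[str] = frozenset()
                                ) -> Set[str]:
    """Unordered set of alphabetic context words, excluding the target.

    Assumption: topical context is modelled as a bag of words with no
    position information.
    """
    return {
        tok.lower()
        for i, tok in enumerate(tokens)
        if i != target and tok.isalpha() and tok.lower() not in stopwords
    }


if __name__ == "__main__":
    sentence = "He sat on the bank of the river".split()
    target_index = 4  # position of the ambiguous word "bank"
    print(local_collocation_features(sentence, target_index))
    print(sorted(topical_contextual_features(
        sentence, target_index, frozenset({"he", "on", "the", "of"}))))
```

In this sketch the local features keep word order relative to the target, whereas the topical features discard it. POS tags and syntactic features are left out because they would require an external tagger or parser.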