I. Introduction
In many fields, decisions must be made on the basis of a set of feature-vector data. These data are typically accompanied by a training label for each feature vector, so that every feature vector is paired with its label. This is a classification task, and it is typically tackled by training a classifier that can accurately predict the class label of a new sample whose label is not known. More concretely, the labelled data are used to learn a prediction function that maps a feature vector to its predicted label.
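The setup described above can be made concrete with a toy example. The sketch below is only an illustration under stated assumptions: the names `X_train`, `y_train`, `fit_nearest_centroid`, and `f` are hypothetical, the data are invented, and the nearest-centroid rule is chosen for brevity rather than taken from this paper.

```python
# Toy illustration: learn a prediction function f from labelled feature vectors.
# Assumption: a nearest-centroid rule stands in for "some prediction function".

def fit_nearest_centroid(X, y):
    """Return a prediction function f learned from feature vectors X and labels y."""
    sums, counts = {}, {}
    for x_i, y_i in zip(X, y):
        # Accumulate a per-class running sum of the feature vectors.
        acc = sums.setdefault(y_i, [0.0] * len(x_i))
        for j, v in enumerate(x_i):
            acc[j] += v
        counts[y_i] = counts.get(y_i, 0) + 1
    # Each class is summarised by the mean of its training vectors.
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def f(x):
        # Predict the class whose centroid is closest in squared Euclidean distance.
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return f


# Hypothetical training data: four 2-D feature vectors with binary labels.
X_train = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
y_train = [0, 0, 1, 1]

f = fit_nearest_centroid(X_train, y_train)
prediction = f([5.5, 4.0])  # label predicted for a new, unlabelled sample
```

Here `f` plays the role of the learned prediction function: it is built only from the labelled pairs and is then applied to a sample whose label is not known.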