I. Introduction
Classification problems generally seek a function, , that can transform or map an observation, , to a prediction or decision, . Machine learning is often utilized to determine a suitable function from a set of training data , where (a vector of labels) and (a set of feature vectors). Learning the prediction function typically involves solving an optimization problem that minimizes the error between the true labels of the training data and the labels predicted by the function .