1. Introduction
Accumulated knowledge of tumor-related molecular biomedicine indicates that a tumor is a complex systems-biology disease, since its genesis and development involve the complicated spatiotemporal organization of signaling pathways [1]. Fortunately, the microarray techniques that have emerged in recent years can simultaneously monitor the expression of a large number of genes, and they have thus become powerful and highly cost-effective tools for gaining insight into biological processes and tumor etiology. Consequently, such high-throughput techniques offer good prospects for more accurate diagnosis of tumor subtypes and for personalized treatment of tumor patients [2]. To date, many tumor classification methods based on Gene Expression Profiles (GEPs) have been proposed [3], [4], [5], [6]. According to whether and how label information is used in tumor classification, these methods can be roughly divided into three categories: supervised methods [7], semi-supervised methods [8], and unsupervised methods [9], [10]. Supervised methods usually predict the label of a test sample with a classification model constructed from a labeled training set, so they are also called model-based methods. Unsupervised methods cluster samples using only the intrinsic information in the data set, without any label information. Semi-supervised methods exploit the information in both labeled and unlabeled samples to predict the labels of the unlabeled samples. Although research on GEP-based tumor classification has made great progress in the past decade, some challenging problems have not yet been well solved, including the “large p, small n” problem [11], in which the number of genes (p) far exceeds the number of samples (n) in a tumor data set.
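To illustrate why a data set with far more genes than samples is hard to handle, the following minimal sketch (not a method from the cited works) trains a plain perceptron on purely random "expression" values with random class labels. The sizes (10 samples, 200 genes), the function names `fit_perceptron` and `accuracy`, and the Gaussian noise model are assumptions made only for illustration. Because there are many more features than samples, the classifier can typically separate the training set perfectly even though the data carry no real signal, which is the overfitting risk behind this problem.

```python
import random


def fit_perceptron(X, y, max_epochs=1000):
    """Train a bias-free perceptron; labels in y must be -1 or +1."""
    w = [0.0] * len(X[0])
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi))
            if yi * score <= 0:  # misclassified (or on the boundary)
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                mistakes += 1
        if mistakes == 0:  # every training sample is classified correctly
            break
    return w


def accuracy(w, X, y):
    """Fraction of samples whose predicted sign matches the label."""
    correct = 0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi))
        if yi * score > 0:
            correct += 1
    return correct / len(X)


# Hypothetical sizes: far more genes (features) than samples.
n_samples, n_genes = 10, 200
rng = random.Random(0)  # fixed seed so the run is reproducible

# Pure noise: random expression values and random class labels.
X = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_samples)]
y = [rng.choice([-1, 1]) for _ in range(n_samples)]

w = fit_perceptron(X, y)
train_acc = accuracy(w, X, y)
print(f"training accuracy on noise: {train_acc:.2f}")
```

A perfect training accuracy here says nothing about performance on new samples, since the labels are random. This is why gene selection or dimensionality reduction is commonly applied before classifying such data.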