1. Introduction
Clustering aims to partition data into several classes without label information [15]. It is one of the fundamental tasks of unsupervised learning, and a large number of methods have been proposed [38,19,8]. Based on how they model the structure of the data space, most clustering methods fall into two categories: model-based methods and similarity-based methods.

Model-based methods, such as the Gaussian mixture model [4] and subspace clustering [1,36], focus on the global structure of the data space. They impose assumptions on the whole data space and fit the data with specific models. An advantage of model-based methods is their good generalization ability: once trained, new samples can readily be clustered using the learnt model parameters. However, it is challenging for these methods to handle data with complex distributions.

In contrast, similarity-based methods emphasize the local structure of the data, which they formulate through pairwise similarities or distances between samples. Spectral clustering [33,26], a popular similarity-based method, constructs a graph from the sample similarities and treats the smoothest signals on the graph as features of the data. Under mild assumptions, similarity-based methods achieve remarkable success [25]. Many of them, however, suffer from high computational complexity. Spectral clustering, for instance, requires a singular value decomposition to compute the features, which is prohibitive for large datasets. To address this issue, considerable effort has been devoted to designing more scalable methods [5,10,22,39].
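To make the spectral-clustering pipeline above concrete, the following is a minimal sketch, not the method proposed in this paper: it builds a Gaussian-kernel similarity graph, forms the normalized graph Laplacian, and takes the eigenvectors with the smallest eigenvalues as the "smoothest" graph signals. The function name, kernel bandwidth `sigma`, and the toy two-blob data are all illustrative assumptions; the dense eigendecomposition is exactly the cubic-cost step that motivates scalable variants.

```python
# Illustrative sketch of spectral feature extraction (not the paper's method).
import numpy as np

def spectral_features(X, sigma=1.0, k=2):
    """Return the k smoothest graph signals as features.

    Builds a fully connected similarity graph with a Gaussian kernel,
    then takes the eigenvectors of the normalized Laplacian with the
    smallest eigenvalues, i.e. the smoothest signals on the graph.
    """
    # Pairwise squared Euclidean distances.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2 * sigma ** 2))          # similarity (affinity) matrix
    np.fill_diagonal(W, 0.0)                    # no self-loops
    d = W.sum(axis=1)                           # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(X)) - D_inv_sqrt @ W @ D_inv_sqrt  # normalized Laplacian
    # Dense eigendecomposition: the O(n^3) step that limits scalability.
    vals, vecs = np.linalg.eigh(L)
    return vecs[:, :k]                          # k smoothest eigenvectors

# Toy data: two well-separated blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)),
               rng.normal(3.0, 0.1, (20, 2))])
F = spectral_features(X)
# The second eigenvector is nearly piecewise constant with opposite
# signs on the two blobs, so its sign recovers the clustering.
labels = (F[:, 1] > 0).astype(int)
```

In practice the features `F` would be fed to a simple clusterer such as k-means; the point here is only that the expensive step is the eigendecomposition of an n-by-n matrix.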