I. Introduction
Clustering, the art of discovering meaningful patterns in the unlabeled data sets, is one of the main tasks in machine learning. Semisupervised clustering is a branch of clustering methods that uses prior supervision information, such as labeled data, known data associations, or pairwise constraints, to aid the clustering process. This paper focuses on pairwise constraints, i.e., pairs of instances known as belonging to the same cluster (must-link constraints) or different clusters (cannot-link constraints). Pairwise constraints arise naturally in many real tasks and have been widely used in semisupervised clustering. There is a wide range of issues in the clustering methods. For instance, individual clustering algorithms provide different accuracies in a complex data set because they generate the clustering results by optimizing a local or global function instead of natural relations between data points [1]–[4]. As another example, pairwise constraints often result in highly unstable clustering performance, whereas they have the potential to improve clustering accuracy in practice [5], [6].