I. Introduction
The last decades have witnessed profound changes in human lifestyle driven by the rapid development of modern digital technologies. Huge amounts of multitype relational data emerge every moment in a broad range of real-world applications [12], [29], e.g., numerous documents in online offices, various images and videos in social networks, and gene expression data for medical diagnosis. Clustering has established itself as a useful tool for handling such vast amounts of data, with successful applications in knowledge management, information retrieval, bioinformatics, etc. As an unsupervised learning mechanism, clustering seeks a partitioning of the data such that the data points within the same cluster are more closely related to one another than to those in different clusters. In general, traditional clustering is a form of unilateral learning, i.e., it clusters along either the sample dimension or the feature dimension only. Recent works have shown that clustering samples and features simultaneously, i.e., coclustering, can further improve the clustering performance, because coclustering exploits the dual interdependence between samples and features to discover hidden clustering structures in the data [17], [18], [34].
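To make the distinction between unilateral clustering and coclustering concrete, the following toy sketch groups the rows (samples) and the columns (features) of a small matrix at the same time. It is only an illustration under strong assumptions, not a method from the cited works: the matrix `X` is made-up data with an exact block structure, and the helper names `group_by_pattern` and `toy_cocluster` are hypothetical. Rows are grouped by which features they use, and columns by which samples use them, so each grouping is defined through the other dimension.

```python
from collections import defaultdict


def group_by_pattern(vectors):
    """Group indices of vectors that share the same nonzero pattern."""
    groups = defaultdict(list)
    for idx, vec in enumerate(vectors):
        # The pattern records which entries are nonzero, ignoring magnitudes.
        pattern = tuple(v != 0 for v in vec)
        groups[pattern].append(idx)
    return sorted(groups.values())


def toy_cocluster(matrix):
    """Return (row_clusters, col_clusters) for an exactly block-structured matrix."""
    rows = [list(r) for r in matrix]
    cols = [list(c) for c in zip(*matrix)]  # transpose to iterate over features
    return group_by_pattern(rows), group_by_pattern(cols)


# Hypothetical sample-by-feature matrix (e.g., documents by terms).
X = [
    [2, 0, 3, 0],
    [0, 1, 0, 4],
    [1, 0, 2, 0],
    [0, 5, 0, 1],
]

row_clusters, col_clusters = toy_cocluster(X)
print("sample clusters:", row_clusters)
print("feature clusters:", col_clusters)
```

Real data are noisy and rarely exhibit such exact blocks, so this grouping rule would not work in practice; the sketch only shows what it means for sample clusters and feature clusters to be determined jointly.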