I. Introduction
Data in real-world applications naturally contain complex interactions and are often described by multiple modalities or types of features, which can be regarded as multiple views. Most multi-view methods aim to explore the common hidden structure of cross-view data in order to better integrate the multiple features. Over the past decades, a great number of methods have been proposed and have achieved promising results. Some researchers focus on consistent graph learning [1], [2], [3], [4], [5], [6], [7], [8], [9], or on the consistency and specificity of the data distribution [10], [11], [12], [13], for multi-view clustering or classification. Other methods project the different views into one common space and learn a latent representation [14], [15], [16]; these are the multi-view representation learning (MvRL) methods. MvRL methods have higher generalization capability, since the learnt representation can be used for downstream tasks, including clustering and classification.