Joint Deep Multi-View Learning for Image Clustering


Abstract:

In this paper, a novel Deep Multi-view Joint Clustering (DMJC) framework is proposed, in which multiple deep embedded features, the multi-view fusion mechanism, and the clustering assignments are learned simultaneously. Through this joint learning strategy, clustering-friendly multi-view features and complementary multi-view information can be exploited effectively to improve clustering performance. Under the proposed framework, we design two variants of deep multi-view joint clustering models, whose multi-view fusion is implemented by two simple yet effective schemes. The first model, called DMJC-S, performs multi-view fusion implicitly via a novel multi-view soft assignment distribution. The second model, termed DMJC-T, defines a novel multi-view auxiliary target distribution to conduct the multi-view fusion explicitly. Both DMJC-S and DMJC-T are optimized under a KL divergence objective. Experiments on eight challenging image datasets demonstrate the superiority of both DMJC-S and DMJC-T over single-view and multi-view baselines and state-of-the-art multi-view clustering methods, which supports the effectiveness of the proposed DMJC framework. To the best of our knowledge, this is the first work to model multi-view clustering in a deep joint framework, which offers a useful perspective for unsupervised multi-view learning.
Published in: IEEE Transactions on Knowledge and Data Engineering (Volume: 33, Issue: 11, 01 November 2021)
Page(s): 3594 - 3606
Date of Publication: 14 February 2020

1 Introduction

Recently, multi-view clustering has become a research hotspot in unsupervised learning. It is a machine learning paradigm that groups similar subjects together and separates dissimilar ones by utilizing the available multi-view features, such that the complementary information and the consistency among different views can be captured. These multi-view features are usually generated by various handcrafted feature extractors; for example, there are many heterogeneous handcrafted visual features, including SIFT [1], LBP [2], and HOG [3]. Owing to the success of deep learning, various kinds of deep neural networks, such as the stacked autoencoder (SAE) [4], variational autoencoder (VAE) [5], and convolutional autoencoder (CAE) [6], have also been proposed for unsupervised feature learning. The availability of these multi-view features has raised interest in multi-view clustering and, in particular, in deep multi-view clustering, which is the main focus of this paper.
