MBSI-Net: Multimodal Balanced Self-Learning Interaction Network for Image Classification

Abstract:

With the expanding availability and resolution of satellite remote sensing data, a growing number of earth observation satellites can simultaneously gather multimodal images of the same area. This paper proposes a novel multimodal balanced self-learning interaction network (MBSI-Net) for the classification task. It uses a dual-branch teacher-student network that enables knowledge interaction and transfer between the modalities. First, to introduce statistical information in addition to local and global structural information, a texture feature equalization module (TFE-Module) is proposed. It enhances the texture information of features through histogram equalization and further improves their representation ability. Second, to enable the student network to provide timely feedback questions, the paper proposes a feature fusion module (F2-Module) that models and enhances teacher features through the student network. This helps raise classification accuracy by incorporating information from multimodal images. Finally, the paper proposes a loss function based on structural similarity analysis to ensure balanced self-learning between the student and teacher networks. Taking multispectral (MS) and panchromatic (PAN) images of the same scene as examples, experiments show that the proposed method achieves good results on multiple datasets compared with other methods. It therefore offers an effective method for classifying and fusing multimodal data.
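
The abstract names histogram equalization as the operation behind the TFE-Module but does not say how it is applied to feature maps. As a point of reference only, the sketch below shows classic global histogram equalization on a flat list of integer gray levels. It is not the paper's module; the function name `equalize_histogram` and the `levels` parameter are hypothetical choices made for this illustration.

```python
def equalize_histogram(pixels, levels=256):
    """Classic global histogram equalization (illustrative, not the TFE-Module).

    `pixels` is a flat list of integer gray levels in [0, levels - 1].
    """
    if not pixels:
        return []

    # Count how often each gray level occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Unnormalized cumulative distribution function over gray levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    # Smallest non-zero CDF value, so the darkest occupied level maps to 0.
    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Every pixel has the same level, so there is nothing to spread out.
        return list(pixels)

    # Look-up table that stretches the CDF over the full output range.
    scale = (levels - 1) / (total - cdf_min)
    lut = [round((c - cdf_min) * scale) if c > 0 else 0 for c in cdf]
    return [lut[p] for p in pixels]


print(equalize_histogram([100, 101, 102, 103]))  # → [0, 85, 170, 255]
```

Four nearly identical gray levels are spread across the whole output range, which is the contrast-stretching effect that makes texture differences easier to separate. How the paper adapts this idea to learned features is described in the full text, not in the abstract.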

Published in: IEEE Transactions on Circuits and Systems for Video Technology ( Volume: 34, Issue: 5, May 2024)

Page(s): 3819 - 3833

Date of Publication: 06 October 2023

DOI: 10.1109/TCSVT.2023.3322470

I. Introduction

Thanks to the rapid development of satellite sensors, multimodal satellite images are frequently employed in military systems, environmental monitoring, surveying, and mapping services [1], [2], [3], [4]. At the same time, fine-resolution earth surface coverage or utilization, change detection, and multimodal classification have received increasing attention [5], [6], [7], [8]. Due to the constraints of satellite imaging systems and other factors, most remote sensing satellites provide complementary image pairs: multispectral (MS) images with high spectral resolution but low spatial resolution (LR), and panchromatic (PAN) images with high spatial resolution (HR) but low spectral resolution [9], [10]. The PAN image can therefore precisely characterize the geometric aspects of ground objects, which is very helpful for remote sensing interpretation, while the MS image is typically used to identify various categories of ground objects [11], [12], [13].
