Hierarchical Alternate Interaction Network for RGB-D Salient Object Detection


Abstract:

Existing RGB-D Salient Object Detection (SOD) methods take advantage of depth cues to improve detection accuracy, but pay insufficient attention to the quality of depth information. In practice, depth maps are often of uneven quality and sometimes suffer from distractors, owing to various factors in the acquisition procedure. In this article, to mitigate distractors in depth maps and highlight salient objects in RGB images, we propose a Hierarchical Alternate Interaction Network (HAINet) for RGB-D SOD. Specifically, HAINet consists of three key stages: feature encoding, cross-modal alternate interaction, and saliency reasoning. The main innovation in HAINet is the Hierarchical Alternate Interaction Module (HAIM), which performs cross-modal feature interaction in the second stage. HAIM first uses RGB features to filter distractors in depth features, and the purified depth features are then exploited to enhance the RGB features in turn. This alternate RGB-depth-RGB interaction proceeds in a hierarchical manner, progressively integrating local and global contexts within a single feature scale. In addition, we adopt a hybrid loss function to facilitate the training of HAINet. Extensive experiments on seven datasets demonstrate that HAINet not only achieves competitive performance compared with 19 relevant state-of-the-art methods, but also reaches a real-time processing speed of 43 fps on a single NVIDIA Titan X GPU. The code and results of our method are available at https://github.com/MathLee/HAINet.
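
The alternate RGB-depth-RGB interaction described in the abstract can be illustrated with a toy sketch. It is not the authors' HAIM: the sigmoid gating, the residual enhancement, the single-channel feature maps, and the function names `sigmoid` and `alternate_interaction` are all assumptions made for illustration, and the hierarchical integration of local and global contexts is omitted. The actual implementation is in the repository linked above.

```python
import math


def sigmoid(x):
    """Logistic function, used here as an assumed attention gate."""
    return 1.0 / (1.0 + math.exp(-x))


def alternate_interaction(rgb, depth):
    """One illustrative RGB -> depth -> RGB round.

    rgb and depth are single-channel feature maps given as lists of rows
    with identical shapes. This is a hypothetical simplification, not HAIM.
    """
    # Step 1 (RGB -> depth): an attention map derived from the RGB features
    # suppresses depth responses where the RGB evidence is weak.
    rgb_gate = [[sigmoid(v) for v in row] for row in rgb]
    purified_depth = [
        [d * g for d, g in zip(d_row, g_row)]
        for d_row, g_row in zip(depth, rgb_gate)
    ]

    # Step 2 (depth -> RGB): an attention map derived from the purified depth
    # features re-weights the RGB features; the residual term keeps the
    # original RGB response (an assumed design choice).
    depth_gate = [[sigmoid(v) for v in row] for row in purified_depth]
    enhanced_rgb = [
        [r + r * g for r, g in zip(r_row, g_row)]
        for r_row, g_row in zip(rgb, depth_gate)
    ]
    return purified_depth, enhanced_rgb


if __name__ == "__main__":
    # Toy inputs: strong RGB response on the left, weak on the right,
    # with an equally strong depth response at both positions.
    rgb_features = [[4.0, -4.0]]
    depth_features = [[1.0, 1.0]]
    purified, enhanced = alternate_interaction(rgb_features, depth_features)
    print(purified)
    print(enhanced)
```

In this toy input, the depth response at the position with weak RGB evidence is strongly attenuated in step 1, which is the sense in which RGB features "filter" depth distractors here; the purified depth map then modulates the RGB features in step 2.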
Published in: IEEE Transactions on Image Processing (Volume: 30)
Page(s): 3528 - 3542
Date of Publication: 05 March 2021

PubMed ID: 33667161

I. Introduction

Salient object detection (SOD) is a fundamental task in computer vision. The goal of SOD is to detect and highlight the most salient objects in visual input, such as color images, RGB-D images and videos. It has been applied to many other computer vision tasks, including visual tracking [1], image captioning [2], weakly supervised learning [3], and object segmentation [4], [5]. Several surveys on color image SOD [6]–[8], RGB-D SOD [9], [10] and video SOD [11], [12] summarize recent developments of SOD in detail. Since the distance-to-camera cues of depth maps naturally supplement the appearance information of RGB images for SOD, RGB-D SOD has recently attracted an increasing amount of research attention, especially given the popularity of affordable RGB-D sensors. Numerous RGB-D SOD methods [13]–[41] have been proposed for this purpose, and substantial advances have been achieved.
