
Calibrated RGB-D Salient Object Detection


Abstract:

Complex backgrounds and similar appearances between objects and their surroundings are generally recognized as challenging scenarios in Salient Object Detection (SOD). This naturally leads to the incorporation of depth information in addition to the conventional RGB image as input, known as RGB-D SOD or depth-aware SOD. Meanwhile, this emerging line of research has been considerably hindered by the noise and ambiguity that prevail in raw depth images. To address the aforementioned issues, we propose a Depth Calibration and Fusion (DCF) framework that contains two novel components: 1) a learning strategy to calibrate the latent bias in the original depth maps toward boosting the SOD performance; 2) a simple yet effective cross reference module to fuse features from both the RGB and depth modalities. Extensive experiments demonstrate that the proposed approach achieves superior performance over 27 state-of-the-art methods. Moreover, our depth calibration strategy alone can serve as a preprocessing step; empirically, it yields noticeable improvements when applied to existing cutting-edge RGB-D SOD models. Source code is available at https://github.com/jiwei0921/DCF.
Date of Conference: 20-25 June 2021
Date Added to IEEE Xplore: 02 November 2021
Conference Location: Nashville, TN, USA


1. Introduction

Salient Object Detection (SOD) is an important computer vision problem that aims to identify and segment the most prominent object in a scene. It has found successful applications in a variety of tasks such as object recognition [59], image retrieval [38], [61], SLAM [37] and video analysis [25], [19], [14]. To tackle difficult scenes with low texture contrast or cluttered backgrounds, depth information has been incorporated as a complementary input source. The growing interest in the development of RGB-D SOD methods [12], [42], [48] has been especially boosted by the rapid progress and proliferation of 3D imaging sensors [29], ranging from traditional stereo imaging that produces disparity maps, to the more recent structured lighting [76], [30], time-of-flight, light field [63], [71], [72] and LIDAR cameras that directly generate depth images. As showcased by recent cross-modality fusion schemes [7], [10], [44], adding a depth map on top of the RGB image as an extra input leads to superior performance in localizing salient objects in challenging scenes.
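The cross-modality fusion idea mentioned above can be illustrated with a minimal sketch. Note that this is a hypothetical example of one common fusion pattern (channel attention computed from one modality gating the features of the other), not the authors' actual cross reference module; the function name, weight shapes, and gating scheme are all assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cross_reference_fuse(f_rgb, f_depth, w_rgb, w_depth):
    """Hypothetical cross-modal fusion sketch: each modality's features are
    reweighted by a channel-attention gate derived from the *other* modality,
    then the two gated streams are summed.

    f_rgb, f_depth : (C, H, W) feature maps from the two encoder branches.
    w_rgb, w_depth : (C, C) learnable weights for the 1x1-conv-style gates.
    """
    # Global average pooling over spatial dims -> per-channel descriptors.
    g_rgb = f_rgb.mean(axis=(1, 2))      # (C,)
    g_depth = f_depth.mean(axis=(1, 2))  # (C,)
    # Linear map + sigmoid produces a per-channel gate in (0, 1).
    gate_from_depth = sigmoid(w_depth @ g_depth)[:, None, None]  # (C, 1, 1)
    gate_from_rgb = sigmoid(w_rgb @ g_rgb)[:, None, None]        # (C, 1, 1)
    # Cross reference: RGB features gated by depth attention, and vice versa.
    return f_rgb * gate_from_depth + f_depth * gate_from_rgb
```

The cross-gating (rather than each modality gating itself) is what lets unreliable depth channels be suppressed by the RGB stream, and vice versa, before the two are merged.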

