
Layered Depth Refinement with Mask Guidance

Abstract:

Depth maps are used in a wide range of applications from 3D rendering to 2D image effects such as Bokeh. However, those predicted by single image depth estimation (SIDE) models often fail to capture isolated holes in objects and/or have inaccurate boundary regions. Meanwhile, high-quality masks are much easier to obtain, using commercial auto-masking tools or off-the-shelf methods of segmentation and matting or even by manual editing. Hence, in this paper, we formulate a novel problem of mask-guided depth refinement that utilizes a generic mask to refine the depth prediction of SIDE models. Our framework performs layered refinement and inpainting/outpainting, decomposing the depth map into two separate layers signified by the mask and the inverse mask. As datasets with both depth and mask annotations are scarce, we propose a self-supervised learning scheme that uses arbitrary masks and RGB-D datasets. We empirically show that our method is robust to different types of masks and initial depth predictions, accurately refining depth values in inner and outer mask boundary regions. We further analyze our model with an ablation study and demonstrate results on real applications. More information can be found on our project page: https://sooyekim.github.io/MaskDepth/
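The layered decomposition described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the two layers are recombined, so the mask-weighted compositing rule below, the function name `merge_layers`, and the toy values are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def merge_layers(depth_in_mask, depth_out_mask, mask):
    """Composite two depth layers with a mask (hypothetical helper).

    Assumed rule: merged = mask * depth_in_mask + (1 - mask) * depth_out_mask,
    where `mask` selects one layer and its inverse selects the other.
    """
    mask = np.clip(np.asarray(mask, dtype=np.float32), 0.0, 1.0)
    return mask * depth_in_mask + (1.0 - mask) * depth_out_mask


# Toy 4x4 example: a square object at depth 2 in front of a background at depth 10.
mask = np.zeros((4, 4), dtype=np.float32)
mask[1:3, 1:3] = 1.0
layer_inside = np.full((4, 4), 2.0, dtype=np.float32)    # layer for the mask region
layer_outside = np.full((4, 4), 10.0, dtype=np.float32)  # layer for the inverse mask
merged = merge_layers(layer_inside, layer_outside, mask)
```

Because each layer is defined over the full image, values inside the mask come only from one layer and values outside only from the other, which is one way to keep a sharp depth boundary at the mask edge.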

Published in: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Date of Conference: 18-24 June 2022

Date Added to IEEE Xplore: 27 September 2022

DOI: 10.1109/CVPR52688.2022.00383

Conference Location: New Orleans, LA, USA

1. Introduction

Recent progress in deep learning has enabled the prediction of fairly reliable depth maps from single RGB images [20], [31], [32], [47]. However, despite the specialized network architectures [11], [29], [31] and training strategies [32], [46] in single image depth estimation (SIDE) models, the estimated depth maps are still inadequate in the following aspects: (i) depth boundaries tend to be blurry and inaccurate; (ii) thin structures such as poles and wires are often missing; and (iii) depth values in narrow or isolated background regions (e.g., between body parts in humans) are often imprecise, as shown in the initial depth estimation in Figure 1. Addressing these issues within a single SIDE model can be very challenging due to limited model capacity and the lack of high-quality RGB-D datasets.

Figure 1. Our layered depth refinement result on an initial prediction by DPT [31]. Aided by a high-quality mask generated with an auto-masking tool [33], our method is able to accurately refine mask boundaries and correct depth values in isolated hole regions between body parts. Regions in the mask and in the inverse mask are refined and inpainted/outpainted separately with our layered approach.

Cites in Papers - IEEE (3)

Phan Xuan Tan, Dinh-Cuong Hoang, Anh-Nhat Nguyen, Van-Thiep Nguyen, Van-Duc Vu, Thu-Uyen Nguyen, Ngoc-Anh Hoang, Khanh-Toan Phan, Duc-Thanh Tran, Duy-Quang Vu, Phuc-Quan Ngo, Quang-Tri Duong, Ngoc-Trung Ho, Cong-Trinh Tran, Van-Hiep Duong, Anh-Truong Mai, "Attention-Based Grasp Detection With Monocular Depth Estimation", IEEE Access, vol.12, pp.65041-65057, 2024.

Yuan Liang, Bailin Deng, Wenxi Liu, Jing Qin, Shengfeng He, "Monocular Depth Estimation for Glass Walls With Context: A New Dataset and Method", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.45, no.12, pp.15081-15097, 2023.

Anmei Zhang, Yunchao Ma, Jiangyu Liu, Jian Sun, "Promoting Monocular Depth Estimation by Multi-Scale Residual Laplacian Pyramid Fusion", IEEE Signal Processing Letters, vol.30, pp.205-209, 2023.

Cites in Papers - Other Publishers (2)

Youwei Pang, Xiaoqi Zhao, Jiaming Zuo, Lihe Zhang, Huchuan Lu, "Open-Vocabulary Camouflaged Object Segmentation", Computer Vision – ECCV 2024, vol.15105, pp.476, 2025.

Akshay Paruchuri, Samuel Ehrenstein, Shuxian Wang, Inbar Fried, Stephen M. Pizer, Marc Niethammer, Roni Sengupta, "Leveraging Near-Field Lighting for Monocular Depth Estimation from Endoscopy Videos", Computer Vision – ECCV 2024, vol.15090, pp.473, 2025.
