Real-Time Self-Adaptive Deep Stereo


Abstract:

Deep convolutional neural networks trained end-to-end are the state-of-the-art methods for regressing dense disparity maps from stereo pairs. These models, however, suffer a notable drop in accuracy when exposed to scenarios significantly different from the training set (e.g., real vs. synthetic images). We argue that it is extremely unlikely that enough samples can be gathered to achieve effective training/tuning in any target domain, which makes this setup impractical for many applications. Instead, we propose to perform unsupervised and continuous online adaptation of a deep stereo network, which allows for preserving its accuracy in any environment. However, this strategy is extremely computationally demanding and thus prevents real-time inference. We address this issue by introducing a new lightweight, yet effective, deep stereo architecture, Modularly ADaptive Network (MADNet), and by developing a Modular ADaptation (MAD) algorithm, which independently trains sub-portions of the network. By deploying MADNet together with MAD, we introduce the first real-time self-adaptive deep stereo system enabling competitive performance on heterogeneous datasets. Our code is publicly available at https://github.com/CVLAB-Unibo/Real-time-self-adaptive-deep-stereo.
Date of Conference: 15-20 June 2019
Date Added to IEEE Xplore: 09 January 2020
Conference Location: Long Beach, CA, USA

1. Introduction

Many key tasks in computer vision rely on the availability of dense and reliable 3D reconstructions of the sensed environment. Owing to its high precision, low latency and affordable cost, passive stereo has proven particularly amenable to depth estimation in both indoor and outdoor set-ups. Following the groundbreaking work by Mayer et al. [21], current state-of-the-art stereo methods rely on deep convolutional neural networks (CNNs) that take as input a pair of left-right frames and directly regress a dense disparity map. In challenging real-world scenarios, like the popular KITTI benchmarks [8], [23], these networks turn out to be more effective, and sometimes faster, than traditional algorithms.
