Real-Time Self-Adaptive Deep Stereo


Abstract:

Deep convolutional neural networks trained end-to-end are the state-of-the-art methods to regress dense disparity maps from stereo pairs. These models, however, suffer from a notable decrease in accuracy when exposed to scenarios significantly different from the training set (e.g., real vs. synthetic images). We argue that it is extremely unlikely to gather enough samples to achieve effective training/tuning in any target domain, thus making this setup impractical for many applications. Instead, we propose to perform unsupervised and continuous online adaptation of a deep stereo network, which allows for preserving its accuracy in any environment. However, this strategy is extremely computationally demanding and thus prevents real-time inference. We address this issue by introducing a new lightweight, yet effective, deep stereo architecture, Modularly ADaptive Network (MADNet), and developing a Modular ADaptation (MAD) algorithm, which independently trains sub-portions of the network. By deploying MADNet together with MAD we introduce the first real-time self-adaptive deep stereo system enabling competitive performance on heterogeneous datasets. Our code is publicly available at https://github.com/CVLAB-Unibo/Real-time-self-adaptive-deep-stereo.
Date of Conference: 15-20 June 2019
Date Added to IEEE Xplore: 09 January 2020
Conference Location: Long Beach, CA, USA

1. Introduction

Many key tasks in computer vision rely on the availability of dense and reliable 3D reconstructions of the sensed environment. Owing to its high precision, low latency and affordable cost, passive stereo has proven particularly amenable to depth estimation in both indoor and outdoor set-ups. Following the groundbreaking work by Mayer et al. [21], current state-of-the-art stereo methods rely on deep convolutional neural networks (CNNs) that take as input a pair of left-right frames and directly regress a dense disparity map. In challenging real-world scenarios, like the popular KITTI benchmarks [8], [23], these networks turn out to be more effective, and sometimes faster, than traditional algorithms.
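
The link between a regressed disparity map and the 3D reconstruction mentioned above can be illustrated with the standard rectified-stereo relation Z = f * B / d (depth equals focal length times baseline over disparity). The following is a minimal sketch and not code from the paper: the function name `disparity_to_depth`, its parameters, and the numeric values are assumptions chosen purely for illustration.

```python
from typing import List, Optional


def disparity_to_depth(
    disparity_px: List[List[float]],
    focal_length_px: float,
    baseline_m: float,
) -> List[List[Optional[float]]]:
    """Convert a dense disparity map (pixels) to a depth map (metres).

    Applies the rectified-stereo relation Z = f * B / d per pixel.
    Non-positive disparities have no valid depth and are mapped to None.
    """
    depth = []
    for row in disparity_px:
        depth.append([
            focal_length_px * baseline_m / d if d > 0 else None
            for d in row
        ])
    return depth


# Hypothetical values: not taken from the paper or from any KITTI calibration.
example = disparity_to_depth(
    [[50.0, 25.0], [0.0, 100.0]],
    focal_length_px=700.0,
    baseline_m=0.5,
)
```

Because depth is inversely proportional to disparity, a fixed disparity error costs more depth accuracy for distant points (small disparities) than for nearby ones.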

References
1.
Hatem Alismail, Brett Browning and M Bernardine Dias, "Evaluating pose estimation methods for stereo visual odometry on robots", the 11th International Conference on Intelligent Autonomous Systems (IAS-11), 2011.
2.
Konstantinos Batsos, Changjiang Cai and Philippos Mordohai, "CBMV: A coalesced bidirectional matching volume for disparity estimation", The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
3.
D. J. Butler, J. Wulff, G. B. Stanley and M. J. Black, "A naturalistic open source movie for optical flow evaluation", European Conf. on Computer Vision (ECCV) Part IV LNCS 7577, pp. 611-625, Oct. 2012.
4.
Jia-Ren Chang and Yong-Sheng Chen, "Pyramid stereo matching network", The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
5.
Zhuoyuan Chen, Xun Sun, Liang Wang, Yinan Yu and Chang Huang, "A deep visual correspondence embedding model for stereo matching costs", The IEEE International Conference on Computer Vision (ICCV), December 2015.
6.
Ravi Garg, Vijay Kumar BG, Gustavo Carneiro and Ian Reid, "Unsupervised CNN for single view depth estimation: Geometry to the rescue", European Conference on Computer Vision, pp. 740-756, 2016.
7.
Andreas Geiger, Philip Lenz, Christoph Stiller and Raquel Urtasun, "Vision meets robotics: The KITTI dataset", International Journal of Robotics Research (IJRR), 2013.
8.
Andreas Geiger, Philip Lenz and Raquel Urtasun, "Are we ready for autonomous driving? The KITTI vision benchmark suite", Computer Vision and Pattern Recognition (CVPR) 2012 IEEE Conference on, pp. 3354-3361, 2012.
9.
Spyros Gidaris and Nikos Komodakis, "Detect, replace, refine: Deep structured prediction for pixel wise labeling", The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
10.
Clément Godard, Oisin Mac Aodha and Gabriel J. Brostow, "Unsupervised monocular depth estimation with left-right consistency", CVPR, vol. 2, pp. 7, 2017.
11.
Xiaoyang Guo, Kai Yang, Wukui Yang, Xiaoyang Wang and Hongsheng Li, "Group-wise correlation stereo network", CVPR, 2019.
12.
R. Haeusler, R. Nair and D. Kondermann, "Ensemble learning for confidence measures in stereo vision", CVPR. Proceedings, pp. 305-312, 2013.
13.
Heiko Hirschmüller, "Accurate and efficient stereo processing by semi-global matching and mutual information", Computer Vision and Pattern Recognition 2005. CVPR 2005. IEEE Computer Society Conference on, vol. 2, pp. 807-814, 2005.
14.
H. Hirschmüller and D. Scharstein, "Evaluation of stereo matching costs on images with radiometric differences", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, pp. 1582-1599, 2008.
15.
Xiaoyan Hu and Philippos Mordohai, "A quantitative evaluation of confidence measures for stereo vision", IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), pp. 2121-2133, 2012.
16.
Zequn Jie, Pengfei Wang, Yonggen Ling, Bo Zhao, Yunchao Wei, Jiashi Feng, et al., "Left-right comparative recurrent model for stereo matching", The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
17.
Alex Kendall, Hayk Martirosyan, Saumitro Dasgupta, Peter Henry, Ryan Kennedy, Abraham Bachrach, et al., "End-to-end learning of geometry and context for deep stereo regression", The IEEE International Conference on Computer Vision (ICCV), Oct 2017.
18.
Sameh Khamis, Sean Fanello, Christoph Rhemann, Adarsh Kowdle, Julien Valentin and Shahram Izadi, "Stereonet: Guided hierarchical refinement for real-time edge-aware depth prediction", 15th European Conference on Computer Vision (ECCV 2018), 2018.
19.
Zhengfa Liang, Yiliu Feng, Yulan Guo, Hengzhu Liu, Wei Chen, Linbo Qiao, Li Zhou and Jianfeng Zhang, "Learning for disparity estimation through feature constancy", The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
20.
Wenjie Luo, Alexander G Schwing and Raquel Urtasun, "Efficient deep learning for stereo matching", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5695-5703, 2016.
21.
Nikolaus Mayer, Eddy Ilg, Philip Häusser, Philipp Fischer, Daniel Cremers, Alexey Dosovitskiy, et al., "A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation", The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016.
22.
Simon Meister, Junhwa Hur and Stefan Roth, "UnFlow: Unsupervised learning of optical flow with a bidirectional census loss", AAAI, Feb. 2018.
23.
Moritz Menze and Andreas Geiger, "Object scene flow for autonomous vehicles", Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
24.
Jiahao Pang, Wenxiu Sun, Jimmy SJ. Ren, Chengxi Yang and Qiong Yan, "Cascade residual learning: A two-stage convolutional neural network for stereo matching", The IEEE International Conference on Computer Vision (ICCV), Oct 2017.
25.
Jiahao Pang, Wenxiu Sun, Chengxi Yang, Jimmy Ren, Ruichao Xiao, Jin Zeng, et al., "Zoom and learn: Generalizing deep stereo matching to novel domains", The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
26.
Min Gyu Park and Kuk Jin Yoon, "Leveraging stereo matching with learning-based confidence measures", The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
27.
Matteo Poggi, Filippo Aleotti, Fabio Tosi and Stefano Mattoccia, "Towards real-time unsupervised monocular depth estimation on CPU", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018.
28.
Matteo Poggi and Stefano Mattoccia, "Learning a general-purpose confidence measure based on O(1) features and a smarter aggregation strategy for semi global matching", Proceedings of the 4th International Conference on 3D Vision (3DV), 2016.
29.
Matteo Poggi and Stefano Mattoccia, "Learning from scratch a confidence measure", Proceedings of the 27th British Machine Vision Conference (BMVC), 2016.
30.
Matteo Poggi, Davide Pallotti, Fabio Tosi and Stefano Mattoccia, "Guided stereo matching", The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
