I. Introduction
Depth reconstruction (estimation) is one of the key problems in mobile robotics, augmented reality, computer-aided design, etc. Sensors that explicitly provide range measurements, such as LIDARs and RGB-D cameras, are typically i) expensive, ii) large and heavy, and iii) power-demanding, which prevents their widespread usage, especially on compact mobile robots (like small drones). Thus there is strong interest in depth estimation using a single camera, as almost every mobile robot is equipped with this sensor. Moreover, there exist data-driven, learning-based approaches capable of solving monocular vision-based depth reconstruction tasks with accuracy suitable for typical mobile robotics applications - see [1]–[3]. Commonly, the main focus of such papers is increasing accuracy, while performance issues are left out of scope. As a result, the majority of state-of-the-art methods for depth reconstruction are very resource-demanding and need high-performance graphics processing units (GPUs) to work in real time. Thus, they are not suitable for creating a fully autonomous robotic system equipped with a typical embedded computer, even one that is particularly well suited for image processing with neural networks, such as the NVidia Jetson TX2. On the other hand, there are plenty of reports of this embedded computer being successfully used for autonomous navigation, SLAM, etc., yet only a limited number of papers, e.g. [4], report successful usage of single-camera, deep-learning-driven depth estimation that works in real time on the NVidia Jetson TX2. Furthermore, to the best of our knowledge, there are no reproducible results (in terms of open-source code) of FCNNs for real-time embedded vSLAM usage. The foregoing defines the scope of this work.
We present a CNN-based depth reconstruction method that i) is accurate enough to be used within a monocular vSLAM pipeline and matches the state of the art accuracy-wise, ii) is fast enough to work in real time on the NVidia Jetson TX2, and iii) is open to the community, i.e. comes with the source code of the ROS node.
Monocular vSLAM based on FCNN depth reconstruction and running in real time on NVidia Jetson TX2. This is a screenshot of the video available at: https://youtu.be/ayjvfzm-c7s