1. Introduction
The widespread use of depth-guided applications such as augmented reality, gesture recognition, object segmentation, autonomous driving, and bokeh effect rendering has created a strong demand for fast and efficient single-image depth estimation approaches that can run on portable low-power hardware. While many accurate deep learning-based solutions have been proposed for this problem in the past [46], [16], [14], [47], [48], [42], [15], [10], they were optimized for high-fidelity results only, without taking into account the computational efficiency and mobile-related constraints that are essential for image processing tasks [23], [24], [37] on mobile devices. The resulting solutions require powerful high-end GPUs and consume gigabytes of RAM even when processing low-resolution input data, making them incompatible with resource-constrained mobile hardware. In this challenge, we change the current depth estimation benchmarking paradigm by using a new depth estimation dataset collected in the wild and by imposing additional efficiency-related constraints on the designed solutions.