I. Introduction
Binocular stereo matching is a fundamental and demanding task in computer vision that aims to estimate a disparity map from a pair of binocular stereo images [15], as depicted in Figure 1. The task is particularly critical for autonomous driving, where it gives the vehicle precise depth information about surrounding objects and scenes for decision-making. Such depth information is also crucial for obstacle detection, localization, mapping, and path planning, and thus helps ensure the safe and efficient operation of autonomous vehicles [8], [24], [32], [33]. However, the task remains highly challenging: the computational cost of binocular stereo matching grows with image size and can become prohibitive for large images, making the process time-consuming.
Figure 1: An illustrative example of the binocular stereo matching task. Given the left and right views, this task aims to obtain a disparity map with reference to the left view.
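To make the task definition and its cost concrete, the following is a minimal sketch of brute-force stereo matching with reference to the left view. It is an illustration only, not the method proposed in this work. It assumes rectified grayscale inputs, and the sum-of-absolute-differences cost, the window size, and the names `sad_disparity`, `disparity_to_depth`, `focal_px`, and `baseline_m` are all assumptions introduced for this example. The depth conversion uses the standard pinhole relation Z = fB/d.

```python
def sad_disparity(left, right, max_disp, half_win=1):
    """Brute-force disparity map with reference to the left view.

    left, right: rectified grayscale images given as equally sized
    lists of rows (an assumption of this sketch).
    """
    height, width = len(left), len(left[0])
    disparity = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            best_cost, best_d = None, 0
            # A left pixel at column x is searched for at column x - d
            # in the right view, for every candidate disparity d.
            for d in range(min(max_disp, x) + 1):
                cost = 0
                for dy in range(-half_win, half_win + 1):
                    for dx in range(-half_win, half_win + 1):
                        # Clamp coordinates at the image border.
                        yy = min(max(y + dy, 0), height - 1)
                        xl = min(max(x + dx, 0), width - 1)
                        xr = min(max(x + dx - d, 0), width - 1)
                        cost += abs(left[yy][xl] - right[yy][xr])
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disparity[y][x] = best_d
    return disparity


def disparity_to_depth(d, focal_px, baseline_m):
    """Pinhole relation Z = f * B / d; depth is unbounded when d == 0."""
    return focal_px * baseline_m / d if d > 0 else float("inf")


# Synthetic pair: the left view is the right view shifted by 2 pixels,
# so the true disparity is 2 wherever the shifted content is visible.
WIDTH, HEIGHT, SHIFT = 12, 5, 2
right_img = [[x * x for x in range(WIDTH)] for _ in range(HEIGHT)]
left_img = [
    [row[x - SHIFT] if x >= SHIFT else 500 + x for x in range(WIDTH)]
    for row in right_img
]
disp_map = sad_disparity(left_img, right_img, max_disp=4)
```

Even this naive version performs on the order of H × W × D × window-size operations for an H × W image and D candidate disparities, which illustrates why the cost becomes significant for large images.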