1. Introduction
Imaging and scene understanding in the presence of scattering media, such as fog, smog, light rain, and snow, remain an open challenge for computer vision and photography. As rare, out-of-distribution events whose occurrence depends on geography and region [8], these weather phenomena can drastically degrade the quality of captured intensity images, reducing local contrast, color reproduction, and effective image resolution [8].

A large body of existing work has investigated methods for dehazing [57], [5], [49], [29], [73], [77], with the most successful approaches employing learned feed-forward models [57], [5], [49], [29], [73]. Some methods [49], [5], [35] use synthetic data and full supervision but struggle to overcome the domain gap between simulation and the real world. As acquiring paired data under real-world conditions is challenging, existing methods either learn natural image priors from large unpaired datasets [74], [73] or rely on cross-modal semi-supervision to learn to separate atmospheric effects from clear RGB intensity [57]. Unfortunately, because semi-supervised training cues are weak compared to paired supervision, these methods often fail to fully separate atmospheric scatter from clear image content, especially at long distances.

Predicting clear images in the presence of haze thus remains unsolved. Notably, harsh weather also severely impairs human vision, a major driver of fatal automotive accidents [4].
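For reference, the dehazing literature cited above typically builds on the standard atmospheric scattering (Koschmieder) image formation model; the formulation below is the conventional one from that literature, not a definition introduced in this section, and the symbol names follow common usage:

I(x) = J(x) t(x) + A (1 - t(x)),   with   t(x) = exp(-β d(x)),

where I(x) is the observed hazy image at pixel x, J(x) is the clear scene radiance to be recovered, A is the global atmospheric light, and the transmission t(x) decays exponentially with scene depth d(x) and scattering coefficient β. Separating J(x) from the airlight term A (1 - t(x)) is the core task of the cited methods; since t(x) → 0 at long range, distant observations are dominated by A, which is one reason clear content is hardest to recover at long distances.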