1. Introduction
Data-driven methods have made strong progress in object detection [10], [14], [20], [40], tracking [7], [58], [59], [62], segmentation [4], [22], [30], [39], [50], and reconstruction [25], [27], [29], [53] when occlusions are limited, but most of them underperform under severe occlusion. Severe occlusions are common at busy intersections and in crowded places. Even in less dense scenes, pedestrians and vehicles often pass one another or pass behind other objects. As a result, occluded objects are either not detected at all, or their 2D bounding boxes and segments are truncated, which introduces errors in downstream processes such as 3D reconstruction [5], [6], [25], [41], [42], [45].