1. Introduction
With recent success on many vision subtasks-object detection [21], [18], [3], multi-class image segmentation [17], [7], [13], and 3D reconstruction [10], [16]—holistic scene understanding has emerged as one of the next great challenges for computer vision [11], [9], [19]. Here the aim is to reason jointly about objects, regions and geometry of a scene with the hope of avoiding the many errors induced by modeling these tasks in isolation.