1. Introduction
Image understanding has been a long standing problem in both human perception [1] and computer vision [25]. The image parsing framework [35] is concerned with the task of decomposing and segmenting an input image into constituents such as objects (text and faces) and generic regions through the integration of image segmentation, object detection, and object recognition. Scene parsing is similar in spirit and consists of both non-parametric [33] and parametric [40] approaches.