I. Introduction
Multiview video is a collection of videos obtained by simultaneously capturing a scene with multiple cameras from different viewpoints. This new type of medium can provide users with a vivid perception of the scene far beyond what traditional media offer. Target applications include free-viewpoint television (FTV), three-dimensional television (3-DTV), immersive teleconferencing, and surveillance [1]–[3]. Because the data volume grows rapidly with the number of views, multiview video coding (MVC) has recently become an active research area, focusing on compression for efficient storage and transmission of multiview video data. The Joint Video Team (JVT) of ITU-T VCEG and ISO/IEC MPEG is currently working on the standardization of MVC as a new extension of the H.264/AVC standard. Various techniques addressing different aspects of MVC have been proposed [4]. For example, the spatial-temporal prediction structure of hierarchical B pictures is designed to exploit both the temporal and inter-view correlation of multiview video data [5]. Furthermore, compression algorithms such as disparity vector prediction [6], [7], illumination compensation [8], adaptive filtering [9], and view synthesis prediction [10] have been put forward to enhance both objective and subjective quality.