1 Introduction
Person re-identification (re-id) is the task of matching pedestrian images across non-overlapping camera views [1]. It plays an important role in a number of video surveillance applications, including multi-camera tracking [2], [3], crowd counting [4], [5], and multi-camera activity analysis [6], [7]. Person re-id is extremely challenging and remains unsolved for several reasons. First, a person's appearance often changes dramatically between camera views due to variations in body pose, camera viewpoint, occlusion, and illumination. Second, in a public space, many people wear very similar clothing (e.g., dark coats in winter), so the differences that can be used to distinguish between people are subtle. These subtle discrepancies appear at different locations in the image and at different spatial scales. For instance, they could be global, e.g., one person is bulkier than another, or local, e.g., the two people are wearing different shoes.