1. Introduction
Real-world monitoring and surveillance applications (e.g., of individuals in airports and vehicles in traffic) rely on challenging tasks such as object detection [55], [51], tracking [29], and re-identification (ReID) [24], [50]. The aim of person ReID is to recognize individuals across a set of distributed, non-overlapping cameras. State-of-the-art systems for person ReID (e.g., deep Siamese networks) typically learn an embedding through various metric learning losses, which pull similar image pairs (with the same identity) closer together and push dissimilar image pairs (with different identities) farther apart. Despite recent advances with deep learning (DL) models, person ReID remains a challenging task due to the non-rigid structure of the human body, the different viewpoints and poses from which a person can be observed, image corruption, and the variability of capture conditions (e.g., illumination, scale, contrast) [3], [31].
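To make the pairwise metric learning objective mentioned above concrete, the following minimal sketch shows one common choice, the margin-based contrastive loss, applied to a single pair of embeddings. It is an illustration only: the text does not specify which loss is used, and the function names, the Euclidean distance, and the margin value are assumptions made for this example.

```python
import math


def euclidean_distance(emb_a, emb_b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(emb_a, emb_b)))


def contrastive_loss(emb_a, emb_b, same_identity, margin=1.0):
    """Margin-based contrastive loss for one image pair (illustrative).

    same_identity: True if both embeddings come from the same person.
    margin: hypothetical hyperparameter; pairs with different identities
            are penalized only while they are closer than this distance.
    """
    d = euclidean_distance(emb_a, emb_b)
    if same_identity:
        # Similar pair: any remaining distance is penalized.
        return d ** 2
    # Dissimilar pair: penalized only if it falls inside the margin.
    return max(0.0, margin - d) ** 2


# Toy 2-D embeddings (made-up values, not real ReID features).
anchor = [0.0, 0.0]
other = [3.0, 4.0]  # at distance 5 from the anchor

loss_same = contrastive_loss(anchor, other, same_identity=True)
loss_diff = contrastive_loss(anchor, other, same_identity=False)
```

In this toy case the pair is penalized when labeled as the same identity, because the embeddings are far apart, and incurs no loss when labeled as different identities, because it already lies outside the margin.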