1. Introduction
Distance metric learning, which is widely used in visual analysis applications, aims to learn an embedding space in which similar samples lie close together and dissimilar samples lie far apart. Conventional metric learning approaches learn this space through a linear Mahalanobis distance metric [13], [25], [57]. Because linear metrics cannot capture nonlinear correlations among samples, deep metric learning methods have been proposed to learn discriminative nonlinear embeddings with deep neural networks [36], [44], [55].
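
To make the "linear" qualifier concrete: a Mahalanobis metric with a positive semidefinite matrix M = L^T L gives the same distance as the Euclidean distance measured after the linear map x -> Lx. The following is a minimal sketch of that equivalence, not a method from the cited works. The matrix L, the sample vectors, and all function names are hypothetical and chosen only for illustration.

```python
import math


def matvec(A, v):
    """Multiply matrix A (a list of rows) by vector v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]


def gram(L):
    """Return M = L^T L, which is positive semidefinite by construction."""
    rows, cols = len(L), len(L[0])
    return [[sum(L[k][i] * L[k][j] for k in range(rows)) for j in range(cols)]
            for i in range(cols)]


def mahalanobis(x, y, M):
    """Compute d_M(x, y) = sqrt((x - y)^T M (x - y))."""
    diff = [a - b for a, b in zip(x, y)]
    M_diff = matvec(M, diff)
    return math.sqrt(sum(a * b for a, b in zip(diff, M_diff)))


def euclidean(x, y):
    """Compute the ordinary Euclidean distance between x and y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical 2x3 linear projection (assumed values, not learned).
L = [[1.0, 0.5, 0.0],
     [0.0, 2.0, -1.0]]
M = gram(L)

# Two hypothetical 3-dimensional samples.
x = [1.0, 2.0, 3.0]
y = [0.0, 1.0, 5.0]

d_metric = mahalanobis(x, y, M)                    # distance under the metric M
d_embed = euclidean(matvec(L, x), matvec(L, y))    # distance after projecting with L

# The two values agree up to floating-point error.
print(math.isclose(d_metric, d_embed))
```

Because the learned map is a single matrix, it can only rotate, scale, and project the input space. Replacing x -> Lx with a deep network is what allows the embedding to be nonlinear.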