Abstract:
Despite significant recent advances in the field of face recognition [10, 14, 15, 17], implementing face verification and recognition efficiently at scale presents serious challenges to current approaches. In this paper we present a system, called FaceNet, that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity. Once this space has been produced, tasks such as face recognition, verification and clustering can be easily implemented using standard techniques with FaceNet embeddings as feature vectors. Our method uses a deep convolutional network trained to directly optimize the embedding itself, rather than an intermediate bottleneck layer as in previous deep learning approaches. To train, we use triplets of roughly aligned matching / non-matching face patches generated using a novel online triplet mining method. The benefit of our approach is much greater representational efficiency: we achieve state-of-the-art face recognition performance using only 128 bytes per face. On the widely used Labeled Faces in the Wild (LFW) dataset, our system achieves a new record accuracy of 99.63%. On YouTube Faces DB it achieves 95.12%. Our system cuts the error rate in comparison to the best published result [15] by 30% on both datasets.
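As a rough illustration of the idea in the abstract, the sketch below shows how a hinge-style triplet loss and distance-based verification could look once embeddings exist. It is a minimal sketch, not the authors' implementation: the function names, the margin of 0.2, the verification threshold of 1.0, and the 3-dimensional toy vectors are all assumptions chosen for readability, and the embedding network itself is not modeled.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss (margin value is an assumption).

    The loss is zero once the non-matching face (negative) is farther
    from the anchor than the matching face (positive) by at least
    `margin`, measured in squared Euclidean distance.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(gap + margin, 0.0)


def same_identity(emb_a, emb_b, threshold=1.0):
    """Verification by thresholding embedding distance (threshold is an assumption)."""
    return squared_distance(emb_a, emb_b) < threshold


# Toy 3-D stand-ins for real embeddings (hypothetical values).
anchor = [1.0, 0.0, 0.0]
positive = [0.9, 0.1, 0.0]   # same person: close to the anchor
negative = [0.0, 1.0, 0.0]   # different person: far from the anchor

easy = triplet_loss(anchor, positive, negative)   # already well separated
hard = triplet_loss(anchor, negative, positive)   # roles swapped: violates the margin
```

With the toy vectors above, the well-separated triplet contributes no loss while the swapped triplet is penalized, and the same distance function drives the verification decision. Recognition and clustering could reuse `squared_distance` with a nearest-neighbor search or a standard clustering routine.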
Date of Conference: 07-12 June 2015
Date Added to IEEE Xplore: 15 October 2015
References:
1.
Y. Bengio, J. Louradour, R. Collobert and J. Weston, "Curriculum learning", Proc. of ICML, 2009.
2.
D. Chen, X. Cao, L. Wang, F. Wen and J. Sun, "Bayesian face revisited: A joint formulation", Proc. ECCV, 2012.
3.
D. Chen, S. Ren, Y. Wei, X. Cao and J. Sun, "Joint cascade face detection and alignment", Proc. ECCV, 2014.
4.
"Large scale distributed deep networks", NIPS, pp. 1232-1240, 2012.
5.
J. Duchi, E. Hazan and Y. Singer, "Adaptive subgradient methods for online learning and stochastic optimization", J. Mach. Learn. Res., vol. 12, pp. 2121-2159, July 2011.
6.
I. J. Goodfellow, D. Warde-Farley, M. Mirza, A. Courville and Y. Bengio, "Maxout networks", ICML, 2013.
7.
G. B. Huang, M. Ramesh, T. Berg and E. Learned-Miller, "Labeled faces in the wild: A database for studying face recognition in unconstrained environments", Technical Report 07-49, University of Massachusetts, Amherst, October 2007.
8.
Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, et al., "Backpropagation applied to handwritten zip code recognition", Neural Computation, vol. 1, no. 4, pp. 541-551, Dec. 1989.
9.
M. Lin, Q. Chen and S. Yan, "Network in network", CoRR, abs/1312.4400, 2013.
10.
C. Lu and X. Tang, "Surpassing human-level face verification performance on LFW with GaussianFace", CoRR, 2014.
11.
D. E. Rumelhart, G. E. Hinton and R. J. Williams, "Learning representations by back-propagating errors", Nature, 1986.
12.
M. Schultz and T. Joachims, "Learning a distance metric from relative comparisons", MIT Press, pp. 41-48, 2004.
13.
T. Sim, S. Baker and M. Bsat, "The CMU pose illumination and expression (PIE) database", Proc. FG, 2002.
14.
Y. Sun, X. Wang and X. Tang, "Deep learning face representation by joint identification-verification", CoRR, 2014.
15.
Y. Sun, X. Wang and X. Tang, "Deeply learned face representations are sparse, selective, and robust", CoRR, 2014.
16.
C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, et al., "Going deeper with convolutions", CoRR, 2014.
17.
Y. Taigman, M. Yang, M. Ranzato and L. Wolf, "DeepFace: Closing the gap to human-level performance in face verification", IEEE Conf. on CVPR, 2014.
18.
J. Wang, Y. Song, T. Leung, C. Rosenberg, J. Wang, J. Philbin, et al., "Learning fine-grained image similarity with deep ranking", CoRR, 2014.
19.
K. Q. Weinberger, J. Blitzer and L. K. Saul, "Distance metric learning for large margin nearest neighbor classification", MIT Press, 2006.
20.
D. R. Wilson and T. R. Martinez, "The general inefficiency of batch training for gradient descent learning", Neural Networks, vol. 16, no. 10, pp. 1429-1451, 2003.
21.
L. Wolf, T. Hassner and I. Maoz, "Face recognition in unconstrained videos with matched background similarity", IEEE Conf. on CVPR, 2011.
22.
M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks", CoRR, 2013.
23.
Z. Zhu, P. Luo, X. Wang and X. Tang, "Recover canonical-view faces in the wild with deep neural networks", CoRR, 2014.