NetVLAD: CNN Architecture for Weakly Supervised Place Recognition


Abstract:

We tackle the problem of large-scale visual place recognition, where the task is to quickly and accurately recognize the location of a given query photograph. We present the following three principal contributions. First, we develop a convolutional neural network (CNN) architecture that is trainable in an end-to-end manner directly for the place recognition task. The main component of this architecture, NetVLAD, is a new generalized VLAD layer, inspired by the "Vector of Locally Aggregated Descriptors" image representation commonly used in image retrieval. The layer is readily pluggable into any CNN architecture and amenable to training via backpropagation. Second, we develop a training procedure, based on a new weakly supervised ranking loss, to learn parameters of the architecture in an end-to-end manner from images depicting the same places over time downloaded from Google Street View Time Machine. Finally, we show that the proposed architecture significantly outperforms non-learnt image representations and off-the-shelf CNN descriptors on two challenging place recognition benchmarks, and improves over current state-of-the-art compact image representations on standard image retrieval benchmarks.
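
The two components named in the abstract, a VLAD-style pooling layer and a weakly supervised ranking loss, can be illustrated with a rough NumPy sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the layer is written as a softmax soft-assignment of local descriptors to cluster centres followed by residual aggregation and normalization, and the loss as a hinge that compares the best-matching potential positive against each negative. The function names, argument names, and the margin value are hypothetical, and only the forward computation is shown (no backpropagation).

```python
import numpy as np


def netvlad_aggregate(x, centers, w, b, eps=1e-12):
    """Pool N local descriptors into one fixed-length vector (forward pass only).

    x:       (N, D) local descriptors, e.g. a flattened conv feature map
    centers: (K, D) cluster centres (assumed trainable)
    w, b:    (K, D) and (K,) soft-assignment parameters (assumed trainable)
    Returns a unit-norm vector of length K * D.
    """
    logits = x @ w.T + b                          # (N, K) assignment scores
    logits -= logits.max(axis=1, keepdims=True)   # for numerical stability
    a = np.exp(logits)
    a /= a.sum(axis=1, keepdims=True)             # softmax over the K clusters

    # Soft-assignment-weighted sum of residuals to each cluster centre.
    residuals = x[:, None, :] - centers[None, :, :]   # (N, K, D)
    v = (a[:, :, None] * residuals).sum(axis=0)       # (K, D)

    # Normalize each cluster's residual sum, then the whole flattened vector.
    v /= np.linalg.norm(v, axis=1, keepdims=True) + eps
    v = v.ravel()
    return v / (np.linalg.norm(v) + eps)


def weak_ranking_loss(q, potential_pos, negatives, margin=0.1):
    """Hinge ranking loss with weak supervision (assumed form).

    q:             (D,) query descriptor
    potential_pos: (P, D) descriptors of images taken near the query location;
                   which of them actually show the same scene is unknown
    negatives:     (M, D) descriptors of images taken far away
    margin:        hypothetical margin value, chosen only for illustration
    """
    # Use the closest potential positive as the stand-in for the true match.
    d_pos = ((potential_pos - q) ** 2).sum(axis=1).min()
    d_neg = ((negatives - q) ** 2).sum(axis=1)
    # Penalize every negative that is not at least `margin` farther than d_pos.
    return np.maximum(0.0, d_pos + margin - d_neg).sum()
```

Taking the minimum over potential positives is what makes the supervision weak in this sketch: location tags only indicate which images were taken nearby, not which of them depict the same scene as the query.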
Date of Conference: 27-30 June 2016
Date Added to IEEE Xplore: 12 December 2016
ISBN Information:
Electronic ISSN: 1063-6919
Conference Location: Las Vegas, NV, USA

1. Introduction

Visual place recognition has received a significant amount of attention in recent years in both the computer vision [5], [10], [11], [24], [35], [62]–[65], [79], [80] and robotics [16], [17], [44], [46], [74] communities, motivated by applications such as autonomous driving [46], augmented reality [47], and geo-localizing archival imagery [6].
