BoQ: A Place is Worth a Bag of Learnable Queries


Abstract:

In visual place recognition, accurately identifying and matching images of locations under varying environmental conditions and viewpoints remains a significant challenge. In this paper, we introduce a new technique, called Bag-of-Queries (BoQ), which learns a set of global queries designed to capture universal place-specific attributes. Unlike existing techniques that employ self-attention and generate the queries directly from the input, BoQ employs distinct learnable global queries, which probe the input features via cross-attention, ensuring consistent information aggregation. In addition, this technique provides an interpretable attention mechanism and integrates with both CNN and Vision Transformer backbones. The performance of BoQ is demonstrated through extensive experiments on 14 large-scale benchmarks. It consistently outperforms current state-of-the-art techniques including NetVLAD, MixVPR and EigenPlaces. Moreover, despite being a global retrieval technique (one-stage), BoQ surpasses two-stage retrieval methods, such as Patch-NetVLAD, TransVPR and R2Former, all while being orders of magnitude faster and more efficient. The code and model weights are publicly available at https://github.com/amaralibey/Bag-of-Queries.
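
The core idea in the abstract, learned queries that probe the input features via cross-attention, can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository above for that): it assumes single-head scaled dot-product attention, uses the local features directly as keys and values, and replaces trained parameters with fixed random values. All names (`cross_attend`, `learned_queries`, `features`) and sizes are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, features):
    """Each query attends over all local features (scaled dot-product).

    Simplification (assumption): keys and values are the raw features,
    with no learned projections and a single attention head.
    """
    dim = len(features[0])
    outputs, weights = [], []
    for q in queries:
        scores = [
            sum(qi * fi for qi, fi in zip(q, f)) / math.sqrt(dim)
            for f in features
        ]
        w = softmax(scores)  # one weight per local feature
        out = [sum(wj * f[d] for wj, f in zip(w, features)) for d in range(dim)]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


random.seed(0)
DIM, NUM_QUERIES, NUM_FEATURES = 4, 3, 6  # hypothetical toy sizes

# Stand-in for learned parameters: the same queries are used for every image,
# rather than being generated from the input.
learned_queries = [
    [random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_QUERIES)
]

# Stand-in for local features produced by a CNN or ViT backbone for one image.
features = [
    [random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_FEATURES)
]

outputs, weights = cross_attend(learned_queries, features)

# Concatenate the per-query outputs into a single global descriptor.
descriptor = [x for out in outputs for x in out]
```

Each row of `weights` shows how strongly one query attends to each local feature, which is the kind of quantity an interpretable attention map would visualize.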
Date of Conference: 16-22 June 2024
Date Added to IEEE Xplore: 16 September 2024
Conference Location: Seattle, WA, USA

1. Introduction

Visual Place Recognition (VPR) consists of determining the geographical location of a place depicted in a given image by comparing its visual features to a database of previously visited places. The dynamic and ever-changing nature of real-world environments poses significant challenges for VPR [33], [57]. Factors such as varying lighting conditions, seasonal changes and the presence of dynamic elements such as vehicles and pedestrians introduce considerable variability into the appearance of a place. Additionally, changes in viewpoint and image scale can expose previously obscured areas, further complicating the recognition process. These challenges are exacerbated by the operational constraints of VPR systems, which often need to operate in real time and with limited memory. Consequently, there is a compelling need for efficient algorithms capable of generating compact yet robust representations.

Recall@1 performance comparison between our proposed technique, Bag-of-Queries (BoQ), and current state-of-the-art methods: Conv-AP [3], CosPlace [11], MixVPR [4] and EigenPlaces [12]. ResNet-50 is used as the backbone for all techniques. BoQ consistently achieves better performance under various environmental conditions such as viewpoint changes (Pitts-250k [44], MapillarySLS [50]), seasonal changes (Nordland [53]), historical locations (AmsterTime [51]) and extreme lighting and weather conditions (SVOX [10]).
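
For readers unfamiliar with the metric in the caption above, the following sketch shows how Recall@K is commonly computed for one-stage (global descriptor) retrieval: a query counts as correct when at least one of its K nearest database descriptors is a true match. The descriptors, the ground-truth sets and the use of cosine similarity are illustrative assumptions, not values or settings taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two descriptors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_desc, database_descs, k):
    """Return the indices of the k database descriptors most similar to the query."""
    ranked = sorted(
        range(len(database_descs)),
        key=lambda i: cosine_similarity(query_desc, database_descs[i]),
        reverse=True,
    )
    return ranked[:k]


def recall_at_k(query_descs, database_descs, ground_truth, k=1):
    """Fraction of queries with at least one true match among the top-k results.

    ground_truth[i] is the set of database indices that depict the same
    place as query i.
    """
    hits = 0
    for q_idx, q in enumerate(query_descs):
        top = retrieve_top_k(q, database_descs, k)
        if any(i in ground_truth[q_idx] for i in top):
            hits += 1
    return hits / len(query_descs)


# Toy 2-D descriptors (hypothetical data, for illustration only).
database = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
queries = [[0.9, 0.1], [0.1, 0.9], [1.0, 0.2]]
ground_truth = [{0}, {1}, {2}]

r1 = recall_at_k(queries, database, ground_truth, k=1)
r2 = recall_at_k(queries, database, ground_truth, k=2)
```

In this toy data the third query ranks its true match second, so it is missed at K=1 and recovered at K=2. Real benchmarks usually define a true match by geographic distance to the query, and use an approximate or exact nearest-neighbor index instead of the exhaustive sort shown here.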
