Robust Visual Place Recognition for Severe Appearance Changes


Abstract:

Severe appearance changes are a pervasive and difficult challenge in Visual Place Recognition (VPR), and the current best solution adopts a composite strategy of global retrieval followed by reranking. However, these reranking techniques require elaborate designs to extract and match local features, which notably increases computational cost and inference time. To this end, we propose a novel framework that unifies global and local features within a single pipeline network, a simple solution that operates across diverse scenarios without additional complex structures. Specifically, we train discriminative global features via image classification techniques while extracting effective local features directly from the intermediate layers without extra operations. To enhance the expressiveness of the features, we introduce multi-layer Convolutional Neural Network (CNN) feature maps to fuse diverse semantic information. In addition, a Transformer with relative position encoding is employed to capture cross-layer long-range and positional correlations. Combined with the associated attention values, low-resolution feature maps reduce the number of features involved in matching, which lowers computational overhead and markedly accelerates reranking. Extensive experiments show that our model achieves State-Of-The-Art (SOTA) performance on datasets with severe appearance changes, with the fastest inference time and the lowest memory usage.
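
As a rough illustration of how attention values can reduce the number of local features that enter matching, the sketch below keeps only the highest-attention descriptors and scores an image pair by counting mutual nearest neighbours. This is not the paper's implementation: the function names, the top-k selection rule, and the use of mutual nearest-neighbour counting as the reranking score are all assumptions made for the example.

```python
import math

def top_k_by_attention(features, attention, k):
    """Keep the k local descriptors with the highest attention values.
    (Hypothetical selection rule; the paper's exact criterion is not given here.)"""
    order = sorted(range(len(features)), key=lambda i: attention[i], reverse=True)
    return [features[i] for i in order[:k]]

def cosine(a, b):
    """Cosine similarity between two descriptors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def mutual_match_count(query_feats, ref_feats):
    """Count pairs (q, r) where r is q's nearest reference descriptor
    and q is r's nearest query descriptor."""
    if not query_feats or not ref_feats:
        return 0
    sims = [[cosine(q, r) for r in ref_feats] for q in query_feats]
    # Nearest reference descriptor for each query descriptor.
    best_ref = [max(range(len(ref_feats)), key=lambda j: row[j]) for row in sims]
    # Nearest query descriptor for each reference descriptor.
    best_query = [
        max(range(len(query_feats)), key=lambda i: sims[i][j])
        for j in range(len(ref_feats))
    ]
    return sum(1 for i, j in enumerate(best_ref) if best_query[j] == i)
```

Because the similarity matrix grows with the product of the two descriptor counts, halving the number of descriptors kept per image cuts the matching work to roughly a quarter, which is the kind of saving the abstract attributes to low-resolution feature maps and attention-based filtering.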
Published in: IEEE Robotics and Automation Letters (Volume: 9, Issue: 5, May 2024)
Page(s): 4289 - 4296
Date of Publication: 13 March 2024


I. Introduction

VPR is a critical component of mobile robots; its main objective is to identify previously visited locations within a visual navigation system. Although VPR has been extensively investigated in computer vision, severe appearance changes remain a substantial challenge when moving to robust real-world deployment. Currently, there are two categories of solutions: (i) Global Retrieval (e.g., GeM [1], NetVLAD [2], CosPlace [3]) is the predominant solution; it is efficient but falls short in accuracy under this challenge. (ii) Global Retrieval + Reranking (e.g., DELG [4], Patch-NetVLAD [5], TransVPR [6]) is the best-performing solution for severe appearance changes; it achieves high accuracy but suffers from high training costs and inefficiency.
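
The two categories above can be sketched as a generic two-stage pipeline: pool a feature map into a global descriptor (here with generalized-mean pooling, as in GeM), retrieve the top candidates by cosine similarity, then reorder them with a local-feature score. This is a minimal sketch under stated assumptions, not the method of any cited paper: the function names, the flat-list data layout, and the `local_score` callback are hypothetical, and `p=3.0` is only a commonly used starting value.

```python
import math

def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized-mean pooling: one value per channel.
    feature_map is a list of channels, each a flat list of activations.
    p=1 gives average pooling; large p approaches max pooling."""
    pooled = []
    for channel in feature_map:
        # Clamp to eps so the power is well defined for non-positive activations.
        mean_pow = sum(max(v, eps) ** p for v in channel) / len(channel)
        pooled.append(mean_pow ** (1.0 / p))
    return pooled

def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def retrieve(query_desc, db_descs, top_n):
    """Stage 1: rank database images by cosine similarity of global descriptors."""
    q = l2_normalize(query_desc)
    scores = [sum(a * b for a, b in zip(q, l2_normalize(d))) for d in db_descs]
    ranked = sorted(range(len(db_descs)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_n]

def rerank(candidates, local_score):
    """Stage 2: reorder the shortlisted indices by a local-feature score.
    local_score is a hypothetical callback: database index -> match score."""
    return sorted(candidates, key=local_score, reverse=True)
```

Stage 1 costs one dot product per database image, which is why global retrieval alone is efficient. Stage 2 runs local matching only on the `top_n` shortlist, so its cost depends on how expensive each local comparison is, and that per-pair cost is where reranking methods tend to become slow.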
