You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval

Abstract:

Two primary input modalities prevail in image retrieval: sketch and text. While text is widely used for inter-category retrieval tasks, sketches have been established as the sole preferred modality for fine-grained image retrieval due to their ability to capture intricate visual details. In this paper, we question the reliance on sketches alone for fine-grained image retrieval by simultaneously exploring the fine-grained representation capabilities of both sketch and text, orchestrating a duet between the two. The end result enables precise retrievals previously unattainable, allowing users to pose ever-finer queries and incorporate attributes like colour and contextual cues from text. For this purpose, we introduce a novel compositionality framework, effectively combining sketches and text using pre-trained CLIP models, while eliminating the need for extensive fine-grained textual descriptions. Last but not least, our system extends to novel applications in composed image retrieval, domain attribute transfer, and fine-grained generation, providing solutions for various real-world scenarios.

Published in: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Date of Conference: 16-22 June 2024

Date Added to IEEE Xplore: 16 September 2024

DOI: 10.1109/CVPR52733.2024.01562

Conference Location: Seattle, WA, USA

1. Introduction

Sketch and text are the two most common input modalities in image retrieval [11], [59]. The choice between them depends on the nature of the retrieval problem, especially when fine-grained distinctions are required [18], [59], [60], [69]. In inter-category retrieval, text dominates as the primary modality, as exemplified by widely used platforms like Google Images. When the task shifts to fine-grained image retrieval, however, sketches take the spotlight [11], [59], [60]. Sketches promise to capture fine-grained visual cues that can be cumbersome or even impossible for text to express [11]. Research in this domain predominantly revolves around harnessing the unique qualities of sketches, exploring aspects such as style [54], abstraction [32], and more [4], [18], [59].
