Your “Flamingo” is My “Bird”: Fine-Grained, or Not


Abstract:

Whether what you see in Figure 1 is a "flamingo" or a "bird" is the question we ask in this paper. While fine-grained visual classification (FGVC) strives to arrive at the former, for the majority of us non-experts just "bird" would probably suffice. The real question is therefore: how can we cater to different definitions of fine-grained under divergent levels of expertise? For that, we re-envisage the traditional setting of FGVC, from single-label classification to top-down traversal of a pre-defined coarse-to-fine label hierarchy, so that our answer becomes "bird" ⇒ "Phoenicopteriformes" ⇒ "Phoenicopteridae" ⇒ "flamingo". To approach this new problem, we first conduct a comprehensive human study where we confirm that most participants prefer multi-granularity labels, regardless of whether they consider themselves experts. We then discover the key intuition that coarse-level label prediction hinders fine-grained feature learning, yet fine-level features improve the learning of the coarse-level classifier. This discovery enables us to design a very simple yet surprisingly effective solution to our new problem, where we (i) leverage level-specific classification heads to disentangle coarse-level features from fine-grained ones, and (ii) allow finer-grained features to participate in coarser-grained label predictions, which in turn helps with better disentanglement. Experiments show that our method achieves superior performance in the new FGVC setting, and performs better than the state of the art on the traditional single-label FGVC problem as well. Thanks to its simplicity, our method can be easily implemented on top of any existing FGVC framework and is parameter-free.
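
The two ideas in the abstract, a coarse-to-fine label path and level-specific heads in which finer features also feed coarser predictions, can be illustrated with a small sketch. This is not the authors' implementation. The feature slicing, the concatenation of slices, the random weights, and all names (`PARENT`, `label_path`, `make_head`, `multi_level_logits`) are assumptions made for illustration; the abstract does not specify how features are split or combined, nor how gradients are handled.

```python
import random

# Illustrative child -> parent map, copied from the abstract's example chain.
PARENT = {
    "flamingo": "Phoenicopteridae",
    "Phoenicopteridae": "Phoenicopteriformes",
    "Phoenicopteriformes": "bird",
}


def label_path(fine_label, parent=PARENT):
    """Return the coarse-to-fine label path that ends at fine_label."""
    path = [fine_label]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path[::-1]


def make_head(n_classes, n_inputs, rng):
    """Random weight matrix standing in for a trained linear head (assumption)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
            for _ in range(n_classes)]


def multi_level_logits(feature, level_dims, heads):
    """Compute one logit vector per hierarchy level, coarsest level first.

    The flat feature is cut into one slice per level (assumed layout).
    The head for a level reads its own slice plus every finer slice, so
    finer features take part in coarser predictions but not the reverse.
    """
    slices, start = [], 0
    for dim in level_dims:
        slices.append(feature[start:start + dim])
        start += dim
    logits = []
    for level, head in enumerate(heads):
        inputs = [v for part in slices[level:] for v in part]
        logits.append([sum(w * v for w, v in zip(row, inputs)) for row in head])
    return logits


rng = random.Random(0)
level_dims = [2, 3]   # hypothetical sizes: coarse slice, then fine slice
n_classes = [2, 4]    # hypothetical class counts per level
heads = [make_head(n_classes[k], sum(level_dims[k:]), rng)
         for k in range(len(level_dims))]
feature = [rng.random() for _ in range(sum(level_dims))]
coarse_logits, fine_logits = multi_level_logits(feature, level_dims, heads)

print(label_path("flamingo"))
# ['bird', 'Phoenicopteriformes', 'Phoenicopteridae', 'flamingo']
```

In this sketch the fine head never reads the coarse slice, so changing the coarse slice leaves the fine logits untouched, while the coarse head does depend on the fine slice.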
Date of Conference: 20-25 June 2021
Date Added to IEEE Xplore: 02 November 2021
Conference Location: Nashville, TN, USA


1. Introduction

Fine-grained visual classification (FGVC) was first introduced to the vision community almost two decades ago with the landmark paper of [2]. It raised a critical question that was largely overlooked back then: can machines match humans at recognising objects at a fine-grained level (e.g., a "flamingo" rather than a "bird")? Great strides have been made over the years, from the conventional part-based models [51], [14], [1], [3] to the recent surge of deep models that either explicitly or implicitly tackle part learning, with or without strong supervision [26], [34], [52], [55], [57], [48]. Without exception, the focus has been on mining fine-grained discriminative features to improve classification performance.
