On-the-Fly Category Discovery

Abstract:

Although machines have surpassed humans on visual recognition problems, they are still limited to providing closed-set answers. Unlike machines, humans can cognize novel categories at the first observation. Novel category discovery (NCD) techniques, transferring knowledge from seen categories to distinguish unseen categories, aim to bridge the gap. However, current NCD methods assume a transductive learning and offline inference paradigm, which restricts them to a predefined query set and renders them unable to deliver instant feedback. In this paper, we study on-the-fly category discovery (OCD) aimed at making the model instantaneously aware of novel category samples (i.e., enabling inductive learning and streaming inference). We first design a hash coding-based expandable recognition model as a practical baseline. Afterwards, noticing the sensitivity of hash codes to intra-category variance, we further propose a novel Sign-Magnitude dIsentangLEment (SMILE) architecture to alleviate the disturbance it brings. Our experimental results demonstrate the superiority of SMILE against our baseline model and prior art. Our code is available at https://github.com/PRIS-CV/On-the-fly-Category-Discovery.
Date of Conference: 17-24 June 2023
Date Added to IEEE Xplore: 22 August 2023
Conference Location: Vancouver, BC, Canada

1. Introduction

Deep models are well known for beating humans in visual recognition [13]. However, this is just a victory of specialist models over generalist humans - existing vision recognition models are mostly closed-set experts. Given a defined category set, huge datasets are gathered and annotated, and deep models trained on the annotated data can then easily handle such in-category recognition thanks to their great fitting ability. However, these models are arguably only learning to memorize, in that they are restricted to the defined category set and are incapable of modeling novel categories. Although paradigms like open set recognition [9] aim to filter out the out-of-category samples, simply rejecting them is not satisfactory. For humans, visual recognition is far beyond a closed-set problem - instead of learning to memorize, we learn to cognize. In particular, given samples containing novel categories, we can tell not only which are novel but also which share the same novel category. For example, even if you have never seen “hedgehogs”, you can easily realize that they differ from other creatures you have seen before and that multiple hedgehog images belong to the same category, even if you do not know its name.

Comparison of the conventional NCD setting and the proposed OCD setting. (a) NCD adopts transductive learning and offline inference. (b) OCD removes the predefined query set assumption and conducts inductive learning and instant inference.
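To make the instant-inference idea concrete, the following is a minimal sketch of how a hash coding-based expandable recognizer, as mentioned in the abstract, might label a stream of samples. It is an illustration under stated assumptions and not the authors' implementation: the names `hash_code` and `OnTheFlyLabeler` are hypothetical, the feature vectors are made up, and the code is assumed to be the sign pattern of the features. Each new sign pattern is treated as a newly discovered category, so no predefined query set or offline clustering is needed.

```python
from typing import Dict, List, Sequence, Tuple


def hash_code(features: Sequence[float]) -> Tuple[int, ...]:
    """Binarize a feature vector by sign: 1 for positive entries, 0 otherwise."""
    return tuple(1 if x > 0 else 0 for x in features)


class OnTheFlyLabeler:
    """Hypothetical helper: assigns a category id to each sample as it arrives."""

    def __init__(self) -> None:
        # Maps every hash code seen so far to an integer category id.
        self.code_to_id: Dict[Tuple[int, ...], int] = {}

    def predict(self, features: Sequence[float]) -> int:
        code = hash_code(features)
        # An unseen code is registered immediately as a new category.
        if code not in self.code_to_id:
            self.code_to_id[code] = len(self.code_to_id)
        return self.code_to_id[code]


# Made-up feature vectors standing in for the output of a trained backbone.
stream: List[List[float]] = [
    [0.9, -0.2, 0.4],
    [0.7, -0.5, 0.1],     # same sign pattern as the first sample
    [-0.3, 0.8, 0.6],     # new sign pattern
    [0.2, -0.1, -0.05],   # last entry is small but negative
]

labeler = OnTheFlyLabeler()
labels = [labeler.predict(x) for x in stream]
print(labels)  # [0, 0, 1, 2]
```

The fourth sample differs from the first only by a small value crossing zero in its last dimension, yet it receives a different category id. This toy case illustrates the sensitivity of hash codes to intra-category variance that the abstract gives as the motivation for SMILE.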

References
1. Abhijit Bendale and Terrance Boult, "Towards open world recognition", CVPR, 2015.
2. Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, et al., "Emerging properties in self-supervised vision transformers", ICCV, 2021.
3. Hakan Cevikalp and Bill Triggs, "Polyhedral conic classifiers for visual object detection and classification", CVPR, 2017.
4. Dongliang Chang, Kaiyue Pang, Yixiao Zheng, Zhanyu Ma, Yi-Zhe Song and Jun Guo, "Your 'flamingo' is my 'bird': Fine-grained or not", CVPR, 2021.
5. Haoang Chi, Feng Liu, Wenjing Yang, Long Lan, Tongliang Liu, Bo Han, et al., "Meta discovery: Learning to discover novel classes given very limited data", ICLR, 2021.
6. Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al., "An image is worth 16×16 words: Transformers for image recognition at scale", ICLR, 2021.
7. Enrico Fini, Enver Sangineto, Stéphane Lathuilière, Zhun Zhong, Moin Nabi and Elisa Ricci, "A unified objective for novel class discovery", ICCV, 2021.
8. ZongYuan Ge, Sergey Demyanov, Zetao Chen and Rahil Garnavi, "Generative OpenMax for multi-class open set classification", arXiv preprint, 2017.
9. Chuanxing Geng, Sheng-jun Huang and Songcan Chen, "Recent advances in open set recognition: A survey", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 10, pp. 3614-3631, 2020.
10. Kai Han, Sylvestre-Alvise Rebuffi, Sebastien Ehrhardt, Andrea Vedaldi and Andrew Zisserman, "AutoNovel: Automatically discovering and learning novel visual categories", IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.
11. Kai Han, Andrea Vedaldi and Andrew Zisserman, "Learning to discover novel visual categories via deep transfer clustering", ICCV, 2019.
12. John A Hartigan, Clustering algorithms, John Wiley & Sons, Inc., 1975.
13. Kaiming He, Xiangyu Zhang, Shaoqing Ren and Jian Sun, "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification", ICCV, 2015.
14. Dat Huynh and Ehsan Elhamifar, "Fine-grained generalized zero-shot learning via dense attribute-based attention", CVPR, 2020.
15. Xuhui Jia, Kai Han, Yukun Zhu and Bradley Green, "Joint representation learning and novel category discovery on single- and multi-modal data", ICCV, 2021.
16. Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, et al., "Supervised contrastive learning", NeurIPS, 2020.
17. Shu Kong and Deva Ramanan, "OpenGAN: Open-set recognition via open data generation", ICCV, 2021.
18. Jonathan Krause, Michael Stark, Jia Deng and Li Fei-Fei, "3D object representations for fine-grained categorization", ICCV Workshops, 2013.
19. Alex Krizhevsky, Geoffrey Hinton, et al., Learning multiple layers of features from tiny images, 2009.
20. Da Li, Yongxin Yang, Yi-Zhe Song and Timothy Hospedales, "Learning to generalize: Meta-learning for domain generalization", AAAI, 2018.
21. Xiangyu Li, Xu Yang, Kun Wei, Cheng Deng and Muli Yang, "Siamese contrastive embedding network for compositional zero-shot learning", CVPR, 2022.
22. Ilya Loshchilov and Frank Hutter, "SGDR: Stochastic gradient descent with warm restarts", arXiv preprint, 2016.
23. Lawrence Neal, Matthew Olson, Xiaoli Fern, Weng-Keen Wong and Fuxin Li, "Open set learning with counterfactual images", ECCV, 2018.
24. Farhad Pourpanah, Moloud Abdar, Yuxuan Luo, Xinlei Zhou, Ran Wang, Chee Peng Lim, et al., "A review of generalized zero-shot learning methods", IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022.
25. Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, et al., "ImageNet large scale visual recognition challenge", International Journal of Computer Vision, vol. 115, no. 3, pp. 211-252, 2015.
26. Walter J Scheirer, Anderson de Rezende Rocha, Archana Sapkota and Terrance E Boult, "Toward open set recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 7, pp. 1757-1772, 2012.
27. Walter J Scheirer, Lalit P Jain and Terrance E Boult, "Probability models for open set recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014.
28. Richard Socher, Milind Ganjoo, Christopher D Manning and Andrew Ng, "Zero-shot learning through cross-modal transfer", NeurIPS, 2013.
29. Kiat Chuan Tan, Yulong Liu, Barbara Ambrose, Melissa Tulig and Serge Belongie, "The herbarium challenge 2019 dataset", arXiv preprint, 2019.
30. Sagar Vaze, Kai Han, Andrea Vedaldi and Andrew Zisserman, "Generalized category discovery", CVPR, 2022.