1. Introduction
Humans inherently possess the ability to compose their knowledge of known entities and generalize to novel concepts. Given a phrase such as green horse, people can immediately combine the known state green with the known object horse, even though they have never seen such a thing. To equip AI systems with a similar ability, Compositional Zero-Shot Learning (CZSL) [20] was proposed, which aims to recognize unseen compositions formed from a set of seen states and objects. In the CZSL setting, each composition comprises two components, a state and an object, and the compositions in the training and test sets are disjoint.
The overall concept of our method. We aim to extract discriminative state and object prototypes separately by building state-specific and object-specific databases, so that the prototypes generalize to represent the corresponding properties.