I. Introduction
Recently, the boundaries of standard classification tasks have been pushed further, partly due to the availability of deep learning techniques. Unfortunately, most classification models require a large number of labeled training examples, and annotating every concept with high quality is very difficult because most object categories follow a long-tail distribution. To address this problem, zero-shot learning (ZSL) [22] was proposed to recognize test samples from classes that are not available during training. In the context of ZSL, seen and unseen classes are associated with semantic features, which can be attributes [23] or word vectors [46]. ZSL transfers recognition ability by building a relationship from seen classes to unseen classes with the help of this shared semantic information. In real-world applications, the objects to be recognized may come from both seen and unseen classes rather than only from unseen classes; this setting is called the generalized ZSL (GZSL) task.