1 Introduction
Generative Adversarial Networks (GANs) have shown promising results in generating data that are indistinguishable from real data [1], [2], [3]. Recently, a trend has emerged of synthesizing Convolutional Neural Network (CNN) features with GAN architectures, which mitigates the lack of samples for unseen classes in zero-shot learning (ZSL) [4], [5], [6]. Of these methods, f-CLSWGAN [4] is one of the first attempts to leverage GANs to improve ZSL performance. Subsequent approaches that may offer better performance, such as LisGAN [7] and AFC-GAN [8], have since been proposed. However, despite their empirical success, these approaches all rely heavily on hand-crafted GAN architectures designed by human experts, which requires laborious trial-and-error testing (Fig. 1a). The instability of GAN training makes architecture design significantly more difficult. Once obtained, these manually designed architectures stay fixed across diverse data samples and application scenarios, which can easily lead to sub-optimal results. It is therefore highly valuable to automatically determine a GAN architecture customized for each specific ZSL task, rather than simply adopting a hand-crafted one.
Fig. 1. Comparison of the architecture design and training method of (a) existing GANs for ZSL, (b) AutoGAN, and (c) ZeroNAS.