I. Introduction
Adversarial examples [1], [2] cause serious safety concerns in deploying deep learning models. In order to defend against adversarial attacks, many approaches have been proposed [3], [4], [5], [6], [7], [8], [9]. Among them, adversarial training and its variants [7], [8], [10] have been recognized as the most effective defense mechanism. Adversarial training (AT) is generally formulated as a minimax problem \begin{equation*} \min _{ \boldsymbol {\theta }}\max _{ {\mathbf {x}}_{i}^{\ast} \in {\mathcal {B}} _{p}({\mathbf {x}}_{i}, \varepsilon)} \frac {1}{n} \sum _{i=1}^{n} \ell ({\mathbf {x}}_{i}^{\ast}, y_{i}; { \boldsymbol {\theta }})\;, \tag{1}\end{equation*} where $\{({\mathbf {x}}_{i}, y_{i})\}_{i=1}^{n}$ is the training set and $\ell$ is the loss function parametrized by ${\boldsymbol {\theta }}$. ${\mathcal {B}}_{p}({\mathbf {x}}_{i}, \varepsilon)$ represents an $\ell_{p}$ norm ball centered at ${\mathbf {x}}_{i}$ with radius $\varepsilon$. AT in Equation (1) boosts adversarial robustness by training on the adversarial examples generated in the inner maximization. Despite the effectiveness of AT, solving the inner maximization requires multiple steps of projected gradient descent (PGD) [7], [11]. Therefore, AT is much slower than vanilla training (e.g., 10 times longer training time for AT in [11]), making it challenging to scale AT to large datasets such as ImageNet.
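To make the minimax formulation in Equation (1) concrete, the sketch below shows one adversarial-training step in which the inner maximization is approximated by $\ell_\infty$-bounded PGD and the outer minimization updates ${\boldsymbol {\theta }}$ on the resulting adversarial examples. This is a minimal illustration, not the paper's implementation; PyTorch is assumed, and the names `model`, `optimizer`, `epsilon`, `alpha`, and `num_steps` as well as the default hyperparameter values are illustrative choices.

```python
# Illustrative sketch of Eq. (1): PGD inner maximization + one outer update.
# Assumes a PyTorch classifier and inputs normalized to [0, 1].
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon, alpha, num_steps):
    """Approximate the inner maximization of Eq. (1) with L_inf-bounded PGD."""
    # Random start inside the epsilon-ball, clipped to the valid pixel range.
    x_adv = (x + torch.empty_like(x).uniform_(-epsilon, epsilon)).clamp(0.0, 1.0).detach()
    for _ in range(num_steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Gradient-ascent step on the loss, then project back onto the epsilon-ball.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon).clamp(0.0, 1.0)
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y,
                              epsilon=8 / 255, alpha=2 / 255, num_steps=10):
    """Outer minimization of Eq. (1): update theta on the crafted adversarial examples."""
    model.eval()  # freeze batch-norm/dropout statistics while crafting x*
    x_adv = pgd_attack(model, x, y, epsilon, alpha, num_steps)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The sketch also makes the cost of AT visible: each training step performs `num_steps` additional forward-backward passes for the inner maximization, which is why multi-step PGD training is roughly an order of magnitude slower than vanilla training.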