I. Introduction
A spoofing attack refers to an attempt to mimic a target speaker in order to fool a speaker verification (SV) system. Various spoofing techniques exist, such as voice mimicry, playback, voice conversion, and speech synthesis [1]. With recent advances in voice conversion and speech synthesis technologies, open-source toolkits that facilitate voice spoofing have become more prevalent [2], posing serious threats to automatic SV systems. Furthermore, state-of-the-art HMM-based speech synthesizers now require only a few minutes of a speaker's data to perform model adaptation [3], making spoofing techniques easily accessible. As reported in [1], [2], [4]–[6], synthetic speech greatly compromises the accuracy of SV systems. The false acceptance rate can be as high as 85.5% when a GMM-UBM based SV system is tested on synthetic speech obtained from an HMM-based speech synthesizer trained on the Wall Street Journal corpus [4], and as high as 98.08% [2] for a corpus synthesized using the unit-selection-based MARY Text-to-Speech Synthesis (MaryTTS) system [7].