I. Introduction
Automatic voice recognition has advanced steadily as an essential technology for human-machine interaction [1], [2]. Recently, a number of consumer electronic devices, including smartphones and smart TVs, have become powerful enough to be equipped with voice recognition capability [3]–[5]. In smart TVs in particular, which offer many novel functions such as internet access and content search, a voice-driven interface provides considerable convenience by letting users deliver commands and keywords by voice.