I. Introduction
Two major classes of single-channel speech enhancement techniques are the statistical model-based and the template-based approaches [1]–[14]. In methods of the former category, speech and noise are assumed to follow separate parametric distributions whose parameters are estimated from the input signal [1]–[4]. In most cases, these approaches perform voice activity detection (VAD), implicitly or explicitly, and compute the gains from the assumed statistical models and the estimated parameters. A significant advantage of the statistical model-based techniques is that the models need not be trained a priori. However, because the statistical models are built on a stationarity assumption, performance deteriorates when the background noise is highly non-stationary.
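To make the statistical model-based pipeline concrete, the following is a minimal sketch of one common instance of it, not the method of any particular cited work. It assumes a Wiener gain, a decision-directed a priori SNR estimate, and a simple energy-based VAD that updates the noise estimate. The function name `wiener_gains` and all parameter values (`n_noise_frames`, `alpha`, `beta`, `vad_threshold_db`, `floor`) are hypothetical choices made for illustration.

```python
import numpy as np


def wiener_gains(noisy_power, n_noise_frames=5, alpha=0.98, beta=0.9,
                 vad_threshold_db=3.0, floor=1e-3):
    """Per-bin Wiener gains from a noisy power spectrogram.

    noisy_power: array of shape (frames, bins) holding |Y|^2.
    Assumption: the first n_noise_frames frames contain noise only.
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    n_frames, n_bins = noisy_power.shape
    eps = 1e-12

    # Initial noise PSD estimate from the leading (assumed speech-free) frames.
    noise_psd = noisy_power[:n_noise_frames].mean(axis=0)
    prev_clean_power = np.zeros(n_bins)
    gains = np.empty_like(noisy_power)

    for t in range(n_frames):
        post_snr = noisy_power[t] / np.maximum(noise_psd, eps)

        # Explicit energy-based VAD: a frame whose average a posteriori SNR
        # is below the threshold is treated as noise-only, and the noise PSD
        # is updated by recursive averaging.
        frame_snr_db = 10.0 * np.log10(np.mean(post_snr) + eps)
        if frame_snr_db < vad_threshold_db:
            noise_psd = beta * noise_psd + (1.0 - beta) * noisy_power[t]
            post_snr = noisy_power[t] / np.maximum(noise_psd, eps)

        # Decision-directed a priori SNR estimate.
        prio_snr = (alpha * prev_clean_power / np.maximum(noise_psd, eps)
                    + (1.0 - alpha) * np.maximum(post_snr - 1.0, 0.0))

        # Wiener gain with a lower floor to limit musical noise.
        gain = np.maximum(prio_snr / (1.0 + prio_snr), floor)
        gains[t] = gain
        prev_clean_power = gain ** 2 * noisy_power[t]

    return gains
```

The gains would be applied to the noisy short-time spectrum before resynthesis. Note that the noise estimate is refreshed only in frames the VAD marks as speech-free, so it cannot follow noise that changes while speech is present, which is the stationarity limitation described above.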