A robust word boundary detection algorithm for variable noise-level environment in cars


Abstract:

This paper addresses automatic word boundary detection in the presence of variable-level background noise in cars. Commonly used robust word boundary detection algorithms assume that the background noise level is fixed and set fixed thresholds to find the boundaries of the word signal. In fact, the background noise level in cars varies during recording because of speed changes and the moving environment, so some thresholds should be tuned according to the variation in background noise level. This is the major reason that most robust word boundary detection algorithms do not work well when the background noise level varies. To solve this problem, we propose a minimum mel-scale frequency band (MiMSB) parameter, which estimates the varying background noise level in cars by adaptively choosing the band with minimum energy from the mel-scale frequency bank. With the MiMSB parameter, some preset thresholds used to find the boundaries of the word signal are no longer fixed over all recording intervals; they are tuned according to the MiMSB parameter. We also propose an enhanced time-frequency (ETF) parameter by extending the time-frequency (TF) parameter proposed by Junqua et al. from single-band to multiband spectrum analysis, where the frequency bands help to distinguish the speech signal from noise. The ETF parameter extracts useful frequency information by choosing some bands of the mel-scale frequency bank. Based on the MiMSB and ETF parameters, we finally propose a new robust algorithm for word boundary detection in variable noise-level environments. The new algorithm has been tested over a variety of noise conditions in cars and has been found to perform well not only under variable background noise levels, but also under fixed background noise levels.
The new robust algorithm using the MiMSB and ETF parameters achieved a higher recognition rate than the TF-based robust algorithm, which h...
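
The idea of tracking the noise level through the minimum-energy mel band can be illustrated with a short sketch. The abstract does not give the paper's exact definition of the MiMSB parameter, so everything below is an assumption made for illustration: the input is taken to be a precomputed matrix of mel-band energies, the band is chosen by its mean energy over a trailing window of frames, and the threshold is a simple linear function of the estimate. The names `mimsb_noise_estimate`, `adaptive_threshold`, `window`, `offset`, and `scale` are hypothetical and do not come from the paper.

```python
import numpy as np


def mimsb_noise_estimate(band_energy, window=20):
    """Estimate a time-varying noise level from mel-band energies.

    band_energy: array of shape (n_frames, n_bands), one row per frame.
    window: number of trailing frames used to choose the band (assumed).

    For each frame, the band with the lowest mean energy over the trailing
    window is selected, and that mean is returned as the noise estimate.
    This is an illustrative reading of "choosing one band with minimum
    energy", not the paper's confirmed formula.
    """
    band_energy = np.asarray(band_energy, dtype=float)
    if band_energy.ndim != 2:
        raise ValueError("band_energy must have shape (n_frames, n_bands)")
    if window < 1:
        raise ValueError("window must be at least 1")

    n_frames = band_energy.shape[0]
    noise = np.empty(n_frames)
    chosen_band = np.empty(n_frames, dtype=int)
    for t in range(n_frames):
        start = max(0, t - window + 1)
        # Mean energy of every band over the trailing window.
        mean_per_band = band_energy[start:t + 1].mean(axis=0)
        chosen_band[t] = int(np.argmin(mean_per_band))
        noise[t] = mean_per_band[chosen_band[t]]
    return noise, chosen_band


def adaptive_threshold(noise, offset=1.0, scale=1.5):
    """Turn a noise estimate into a per-frame detection threshold.

    The linear form offset + scale * noise is an assumption; the point is
    only that the threshold follows the estimated noise level instead of
    staying fixed for the whole recording.
    """
    return offset + scale * np.asarray(noise, dtype=float)


if __name__ == "__main__":
    # Synthetic example: three bands whose energies all rise halfway
    # through, as might happen when the car speeds up.
    energies = np.array([[5.0, 1.0, 3.0]] * 4 + [[6.0, 2.0, 4.0]] * 4)
    noise, band = mimsb_noise_estimate(energies, window=2)
    print(adaptive_threshold(noise))
```

In this sketch the threshold rises as the quietest band gets louder, which is the behavior a fixed threshold cannot provide when the noise level changes during recording.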
Published in: IEEE Transactions on Intelligent Transportation Systems ( Volume: 3, Issue: 1, March 2002)
Page(s): 89 - 101
Date of Publication: 31 March 2002

I. Introduction

The widespread use of mobile telephones has motivated the development of robust speech recognition systems in cars [1]. A major source of errors in automatic speech recognition systems is the inaccurate detection of the beginning and ending boundaries of words. In cars, the problem is further complicated by nonstationary backgrounds, where concurrent noises may arise from movement, the running engine, speed changes, braking, slams, etc. These background noises can be broadly classified into three classes: impulse noise, fixed-level noise, and variable-level noise. Decreasing the distance between the mouth and the microphone is one way of minimizing the effects of such transient background noise, but this method is not user-friendly. To solve this problem, many researchers have proposed robust word boundary detection algorithms for noisy conditions. However, they focused only on impulse noise and fixed-level background noise. The main aim of this paper is to develop a new robust word boundary detection algorithm that addresses the problem of variable-level background noise in cars.
