Abstract:
The intensive computation of Automatic Speech Recognition (ASR) models prevents them from being deployed on mobile devices. In this work, we present a novel quantized Winograd optimization framework, combining quantization and fast convolution to achieve efficient inference acceleration for ASR models on mobile devices. To avoid the information loss caused by combining quantization with Winograd convolution, a Range-Scaled Quantization (RSQ) training method is proposed, integrating integer-range scaling and quantization noise minimization. Moreover, a Conv1D-equipped DFSMN (ConvDFSMN) model is designed for mobile applications and experimental verification. We conduct extensive experiments on the ConvDFSMN and Wav2letter models, demonstrating that both can be effectively optimized with the proposed framework. In particular, the optimized Wav2letter model achieves a 1.48× speedup for end-to-end inference and a 1.92× speedup for model-backbone inference on ARMv7-based mobile devices, with only an approximate 0.07% increase in WER on AIShell-1.
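The abstract does not give implementation details, but the Winograd convolution it builds on is the standard F(2,3) minimal-filtering algorithm: a length-4 input tile and a length-3 filter are each mapped into a transform domain, multiplied elementwise (4 multiplications instead of 6), and mapped back to 2 outputs. The sketch below is a generic illustration of that transform, not the paper's code; the transform matrices are the standard ones, and the key observation for quantization is that the transformed tensors `U` and `V` have a wider dynamic range than the original weights and activations, which is why naive post-transform quantization loses information:

```python
import numpy as np

# Standard Winograd F(2,3) transform matrices for 1D convolution:
# Y = A^T [ (G g) * (B^T d) ], elementwise product in the transform domain.
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=np.float32)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]], dtype=np.float32)
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=np.float32)

def winograd_f23_1d(d, g):
    """Compute 2 outputs of a 1D correlation (filter size 3) from a 4-sample tile."""
    U = G @ g            # transformed filter (length 4)
    V = BT @ d           # transformed input tile (length 4)
    return AT @ (U * V)  # 4 elementwise multiplies, then inverse transform

# Sanity check against direct sliding-window correlation.
d = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32)
g = np.array([0.5, 1.0, -0.5], dtype=np.float32)
direct = np.array([np.dot(d[0:3], g), np.dot(d[1:4], g)])
assert np.allclose(winograd_f23_1d(d, g), direct)
```

A quantized-Winograd pipeline quantizes `U` and `V` to integers before the elementwise product; because the transforms expand the value range, some range-rescaling step (the role RSQ plays in the paper) is needed to keep the integer grid from clipping or wasting precision.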
Published in: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 23-27 May 2022
Date Added to IEEE Xplore: 27 April 2022