1. Introduction
Code-switching speech is defined as speech that contains more than one language ('code'); the switch between languages may occur between utterances or within a single utterance. It is a common phenomenon in many multilingual communities where people of different cultures and language backgrounds communicate with each other [1]. For the automated processing of spoken communication in these scenarios, a speech recognition system must be able to handle code switches. Since speech recognition systems are usually monolingual, this is a difficult task, and it is further complicated by the scarcity of bilingual training data. While there have been promising research results in acoustic modeling, only a few approaches so far address code-switching in the language model.

Recently, recurrent neural network language models (RNNLMs) have been shown to improve perplexity and error rates of speech recognition systems compared to traditional n-gram approaches [2], [3], [4]. One reason is their ability to handle longer contexts. Furthermore, their structure makes the integration of additional features into the input rather straightforward.

In this paper, we propose a recurrent neural network language model for code-switching. We extend the traditional RNNLM structure by integrating features into the input layer and by factorizing the output layer using language information. Our experimental results demonstrate that this approach leads to significant improvements in perplexity, which translate into decent error rate reductions. Figure 1 illustrates our code-switching system.

Figure 1: Overview of our code-switching system.
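The two extensions above can be sketched as a single forward step of an Elman-style RNNLM: the input layer receives the word one-hot vector concatenated with additional features, and the output is factorized as P(w) = P(lang | h) · P(w | lang, h). This is a minimal illustrative sketch; all sizes, random weights, and the word-to-language mapping (`lang_of`) are assumptions for demonstration, not the configuration used in this work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes -- illustrative only, not the paper's setup.
vocab, feat_dim, hidden = 10, 3, 8
n_langs = 2                          # factorize the output by language
lang_of = np.array([0] * 5 + [1] * 5)  # assumed language tag per vocabulary word

# Parameters of a minimal Elman-style RNNLM (randomly initialized).
W_in   = rng.normal(0, 0.1, (hidden, vocab + feat_dim))  # input: one-hot word + features
W_rec  = rng.normal(0, 0.1, (hidden, hidden))            # recurrent weights
W_lang = rng.normal(0, 0.1, (n_langs, hidden))           # language (class) layer
W_out  = rng.normal(0, 0.1, (vocab, hidden))             # word output layer

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def step(word_id, features, h_prev):
    """One forward step: P(w) = P(lang | h) * P(w | lang, h)."""
    x = np.zeros(vocab + feat_dim)
    x[word_id] = 1.0
    x[vocab:] = features                      # extra features in the input layer
    h = np.tanh(W_in @ x + W_rec @ h_prev)
    p_lang = softmax(W_lang @ h)              # first predict the language class
    p_word = np.zeros(vocab)
    for l in range(n_langs):                  # then words within each language
        idx = np.where(lang_of == l)[0]
        p_word[idx] = p_lang[l] * softmax(W_out[idx] @ h)
    return p_word, h

p, h = step(2, np.array([1.0, 0.0, 0.5]), np.zeros(hidden))
assert abs(p.sum() - 1.0) < 1e-9              # a valid distribution over the vocabulary
```

Because the per-language softmaxes each sum to one, the language probabilities weight them into a proper distribution over the full vocabulary, which is what makes this factorization a drop-in replacement for a flat output softmax.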