Acoustic VR in the mouth: A real-time speech-driven visual tongue system


Abstract:

We propose an acoustic-VR system that converts acoustic signals of human speech (Chinese) into realistic 3D tongue animation sequences in real time. Directly capturing the 3D geometry of the tongue at a frame rate that matches its swift movement during speech production is known to be challenging. We handle this difficulty by using electromagnetic articulography (EMA) sensors as the intermediate medium linking the acoustic data to the simulated virtual reality. We leverage deep neural networks to train a model that maps the input acoustic signals to the positional information of pre-defined EMA sensors, based on 1,108 utterances. We then develop a novel reduced physics-based dynamics model for simulating the tongue's motion. Unlike existing methods, our deformable model is nonlinear, volume-preserving, and accommodates collisions between the tongue and the oral cavity (mostly with the jaw). The tongue's deformation can be highly localized, which imposes extra difficulties for existing spectral model-reduction methods; we instead adopt a spatial reduction method that allows an expressive subspace representation of the tongue's deformation. We systematically evaluate the simulated tongue shapes against real-world shapes acquired by MRI/CT. Our experiments demonstrate that the proposed system delivers a realistic visual tongue animation corresponding to a user's speech signal.
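The first stage of the pipeline, mapping acoustic features of each speech frame to EMA sensor positions with a deep neural network, can be sketched as a plain feed-forward regression. The sketch below is illustrative only: the feature dimension (39, e.g. MFCCs with deltas), the number of sensors (6, giving 18 output coordinates), and the layer sizes are assumptions, not the authors' configuration, and a real system would train the weights on the utterance corpus rather than use random ones.

```python
import numpy as np

def init_mlp(sizes, rng):
    """He-initialized weights for a fully connected network (untrained)."""
    return [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Map a batch of acoustic feature frames to EMA sensor coordinates."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)  # ReLU on hidden layers; linear output
    return x

rng = np.random.default_rng(0)
# 39-dim acoustic features per frame -> 18 values
# (x, y, z positions of 6 hypothetical EMA sensors)
params = init_mlp([39, 256, 256, 18], rng)
frames = rng.standard_normal((5, 39))  # 5 speech frames
ema = forward(params, frames)
print(ema.shape)  # (5, 18)
```

The predicted sensor trajectories would then drive the reduced physics-based tongue model as boundary conditions, one frame at a time, to produce the animation.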
Date of Conference: 18-22 March 2017
Date Added to IEEE Xplore: 06 April 2017
Electronic ISSN: 2375-5334
Conference Location: Los Angeles, CA, USA

1 Introduction

The human tongue is a muscular organ that plays an essential role in speech production. A high-quality visual representation of the tongue for specific speech sounds is important in speech research and has numerous potential applications. For example, in the rehabilitation of speech disorders [16], a realistic visualization of 3D tongue motion could provide a visible paradigm that helps an individual achieve correct tongue articulation when producing various speech sounds.

References
1.
Y. S. Akgul, C. Kambhamettu and M. Stone, "Automatic extraction and tracking of the tongue contours", IEEE Transactions on Medical Imaging, vol. 18, no. 10, pp. 1035-1045, 1999.
2.
S. S. An, T. Kim and D. L. James, "Optimizing cubature for efficient integration of subspace deformations", ACM Trans. Graph., vol. 27, no. 5, pp. 165:1-165:10, Dec 2008.
3.
B. S. Atal, J. J. Chang, M. V. Mathews and J. W. Tukey, "Inversion of articulatory-to-acoustic transformation in the vocal tract by a computer-sorting technique", The Journal of the Acoustical Society of America, vol. 63, no. 5, pp. 1535-1555, 1978.
4.
P. Badin, G. Bailly, L. Reveret, M. Baciu, C. Segebarth and C. Savariaux, "Three-dimensional linear articulatory modeling of tongue, lips and face based on MRI and video images", Journal of Phonetics, vol. 30, no. 3, pp. 533-553, 2002.
5.
T. Baer, J. Gore, S. Boyce and P. Nye, "Application of MRI to the analysis of speech production", Magnetic Resonance Imaging, vol. 5, no. 1, pp. 1-7, 1987.
6.
J. Barbič and D. L. James, "Real-time subspace integration for St. Venant-Kirchhoff deformable models" in ACM Transactions on Graphics (TOG), ACM, vol. 24, pp. 982-990, 2005.
7.
R. W. Bisseling and A. L. Hof, "Handling of impact forces in inverse dynamics", Journal of biomechanics, vol. 39, no. 13, pp. 2438-2444, 2006.
8.
S. Buchaillard, P. Perrier and Y. Payan, "A biomechanical model of cardinal vowel production: Muscle activations and the impact of gravity on tongue positioning", The Journal of the Acoustical Society of America, vol. 126, no. 4, pp. 2033-2051, 2009.
9.
M. G. Choi and H.-S. Ko, "Modal warping: Real-time simulation of large rotational deformation and manipulation", IEEE Transactions on Visualization and Computer Graphics, vol. 11, no. 1, pp. 91-101, Jan. 2005.
10.
P. Cignoni, C. Rocchini and R. Scopigno, "Metro: measuring error on simplified surfaces" in Computer Graphics Forum, Wiley Online Library, vol. 17, pp. 167-174, 1998.
11.
M. Fu, M. S. Barlaz, J. L. Holtrop, J. L. Perry, D. P. Kuehn, R. K. Shosted, et al., "High-frame-rate full-vocal-tract 3D dynamic speech imaging", Magnetic Resonance in Medicine, 2016.
12.
J. M. Gérard, J. Ohayon, V. Luboz, P. Perrier and Y. Payan, "Indentation for estimating the human tongue soft tissues constitutive law: application to a 3D biomechanical model" in Medical Simulation, Springer, pp. 77-83, 2004.
13.
J. M. Gérard, P. Perrier and Y. Payan, "3D biomechanical tongue modeling to study speech production", Speech Production: Models, Phonetic Processes and Techniques, pp. 85-102, 2006.
14.
K. O. Grenab, Atlas of Topographical and Applied Human Anatomy: Head and Neck, 1965.
15.
B. P. Halpern, "Functional anatomy of the tongue and mouth of mammals" in Drinking behavior, Springer, pp. 1-92, 1977.
16.
M. N. Hegde, Introduction to communicative disorders. Pro Ed, 1995.
17.
M. Hirayama, E. Vatikiotis-Bateson and M. Kawato, "Inverse dynamics of speech motor control", Advances in neural information processing systems, pp. 1043-1050, 1994.
18.
S. Hiroya and M. Honda, "Estimation of articulatory movements from speech acoustics using an HMM-based speech production model", IEEE Transactions on Speech and Audio Processing, vol. 12, no. 2, pp. 175-185, 2004.
19.
T. Hueber, G. Aversano, G. Chollet, B. Denby, G. Dreyfus, Y. Oussar, et al., "Eigentongue feature extraction for an ultrasound-based silent speech interface", 2007 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '07), vol. 1, pp. 1-1245, 2007.
20.
P. W. Iltis, J. Frahm, D. Voit, A. A. Joseph, E. Schoonderwaldt and E. Altenmüller, "High-speed real-time magnetic resonance imaging of fast tongue movements in elite horn players", Quantitative imaging in medicine and surgery, vol. 5, no. 3, pp. 374-381, 2015.
21.
S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift", arXiv preprint arXiv:1502.03167, 2015.
22.
G. Irving, C. Schroeder and R. Fedkiw, "Volume conserving finite element simulations of deformable models", ACM Trans. Graph., vol. 26, no. 3, July 2007.
23.
R. D. Kent, "Research on speech motor control and its disorders: A review and prospective", Journal of Communication Disorders, vol. 33, no. 5, pp. 391-428, 2000.
24.
S. A. King and R. E. Parent, "A 3D parametric tongue model for animated speech", The Journal of Visualization and Computer Animation, vol. 12, no. 3, pp. 107-115, 2001.
25.
M. Li, C. Kambhamettu and M. Stone, "Automatic contour tracking in ultrasound images", Clinical Linguistics & Phonetics, vol. 19, no. 6-7, pp. 545-554, 2005.
26.
S. G. Lingala, B. P. Sutton, M. E. Miquel and K. S. Nayak, "Recommendations for real-time speech MRI", Journal of Magnetic Resonance Imaging, vol. 43, no. 1, pp. 28-44, 2016.
27.
S. G. Lingala, Y. Zhu, Y.-C. Kim, A. Toutios, S. Narayanan and K. S. Nayak, "A fast and flexible MRI system for the study of dynamic vocal tract shaping", Magnetic Resonance in Medicine, 2016.
28.
A. J. Lundberg and M. Stone, "Three-dimensional tongue surface reconstruction: Practical considerations for ultrasound data", The Journal of the Acoustical Society of America, vol. 106, no. 5, pp. 2858-2867, 1999.
29.
J. Luo, K. Ying and J. Bai, "Savitzky-Golay smoothing and differentiation filter for even number data", Signal Processing, vol. 85, no. 7, pp. 1429-1434, 2005.
30.
S. McLeod, "Speech-language pathologists' knowledge of tongue/palate contact for consonants", Clinical Linguistics & Phonetics, vol. 25, no. 11-12, pp. 1004-1013, 2011.