
Towards Speech Emotion Recognition Applied to Social Robots


Abstract:

Advances in technology now allow social robots to be used for various daily tasks, such as therapy, teaching assistance, and restaurant service. Human-Robot Interaction (HRI) is under constant study because of the new capabilities that robots acquire through improved hardware (e.g., more joints). Robots receive information through sensors such as cameras and microphones, and can thus modify their behavior and adapt to different situations. However, exhaustive real-time analysis of data on board the robot requires considerable computing power and energy, both of which are limited in social robots. In this context, we propose a lightweight Machine Learning model that balances accuracy and audio processing time to recognize four emotional states (happiness, sadness, anger, and neutral) in real time, aiming to improve HRI. Additionally, we present an empirical analysis to identify the most relevant audio features for emotion recognition, with the objective of producing a lighter model better suited to the robot's hardware. Results show improved accuracy on the RAVDESS, IEMOCAP, and combined RAVDESS+IEMOCAP datasets, with a recognition time of around 1 second.

Published in: 2024 L Latin American Computer Conference (CLEI)

Date of Conference: 12-16 August 2024

Date Added to IEEE Xplore: 08 October 2024

DOI: 10.1109/CLEI64178.2024.10700306

Conference Location: Buenos Aires, Argentina

I. Introduction

The last decade has seen significant technological development, bringing with it several products that are now part of our daily lives, such as robotic agents and voice assistants. Companies such as Google, Apple, and Amazon have created and marketed virtual assistants: Google Assistant¹, Siri², and Alexa³, respectively. These technologies rely on voice recognition to understand words, phrases, and, in general, the expressions contained in a voice signal. Voice recognition requires audio processing, for which several techniques are applicable. The most common is to convert the audio to text and analyze it with natural language processing (NLP), allowing the machine to receive and analyze the user's message [1].

¹ https://blog.google/products/assistant/assistant-io-2022/

² https://machinelearning.apple.com/research/hey-siri

³ https://www.amazon.science/code-and-datasets/alexa-voice-service-avs
