I. Introduction
The last decade has seen significant technological development, bringing with it products that are now part of daily life, such as robotic agents and voice assistants. Companies such as Google, Apple, and Amazon have created and marketed virtual assistants: Google Assistant (https://blog.google/products/assistant/assistant-io-2022/), Siri (https://machinelearning.apple.com/research/hey-siri), and Alexa (https://www.amazon.science/code-and-datasets/alexa-voice-service-avs), respectively. These technologies rely on voice recognition to understand the words, phrases, and, more generally, the expressions contained in a voice signal. Voice recognition requires audio processing, for which several techniques are applicable. The most common one converts the audio to text, which is then analyzed with natural language processing (NLP), allowing the machine to receive and analyze the user's message [1].