
Practical Guidelines for Intent Recognition: BERT with Minimal Training Data Evaluated in Real-World HRI Application



Abstract:

Intent recognition models, which match a written or spoken input's class in order to guide an interaction, are an essential part of modern voice user interfaces, chatbots, and social robots. However, getting enough data to train these models can be very expensive and challenging, especially when designing novel applications such as real-world human-robot interactions. In this work, we first investigate how much training data is needed for high performance in an intent classification task. We train and evaluate BiLSTM and BERT models on various subsets of the ATIS and Snips datasets. We find that only 25 training examples per intent are required for our BERT model to achieve 94% intent accuracy compared to 98% with the entire datasets, challenging the belief that large amounts of labeled data are required for high performance in intent recognition. We apply this knowledge to train models for a real-world HRI application, character strength recognition during a positive psychology interaction with a social robot, and evaluate against the Character Strength dataset collected in our previous HRI study. Our real-world HRI application results also confirm that our model can produce 76% intent accuracy with 25 examples per intent compared to 80% with 100 examples. In a real-world scenario, the difference is only one additional error per 25 classifications. Finally, we investigate the limitations of our minimal data models and offer suggestions on developing high quality datasets. We conclude with practical guidelines for training BERT intent recognition models with minimal training data and make our code and evaluation framework available for others to replicate our results and easily develop models for their own applications.

CCS CONCEPTS: • Computing methodologies → Natural language processing; • Human-centered computing → Systems and tools for interaction design.
Date of Conference: 09-11 March 2021
Date Added to IEEE Xplore: 22 February 2023
Conference Location: Boulder, CO, USA

1 Introduction

Voice user interfaces (VUIs), e.g. Amazon Alexa, chatbots, and social robots, are becoming an essential part of everyday life [5, 15]. For these systems to carry out effective dialogue, they must be able to determine the intent behind a user's spoken utterance. For the purpose of this paper, intent recognition is defined as it is commonly understood in the NLP community: the task of taking a written or spoken input and determining which of several classes it matches in order to best respond to or guide the interaction. This should not be confused with the broader meaning of the term in the HRI context, i.e. inferring a user's goals from their observed actions via sensors or visual cues. This type of intent recognition is essential to building complex conversational experiences in HRI, which is a key challenge. While rule-based parsing is a common approach for some interactions, it is not effective for more advanced and novel dialogue contexts [27]. To improve user experience while interacting with such systems, state-of-the-art models are trained using large labeled datasets for intent recognition customized to specific applications [8, 9, 26].
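To make the task concrete, the classification problem described above can be sketched in miniature. The snippet below is a toy bag-of-words nearest-centroid classifier over hypothetical utterances, not the BERT model trained in this paper and not the ATIS/Snips data; it only illustrates the input-to-intent-class mapping that intent recognition performs.

```python
from collections import Counter
import math

# Hypothetical toy training set: a few labeled utterances per intent,
# mimicking the minimal-training-data setting (not the paper's datasets).
TRAIN = {
    "book_flight": [
        "book a flight to boston",
        "i need a flight from denver to atlanta",
    ],
    "get_weather": [
        "what is the weather in boulder",
        "will it rain tomorrow in denver",
    ],
}

def vectorize(text):
    """Represent an utterance as a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# One summed bag-of-words centroid per intent class.
centroids = {
    intent: sum((vectorize(u) for u in utts), Counter())
    for intent, utts in TRAIN.items()
}

def classify(utterance):
    """Return the intent whose centroid is most similar to the utterance."""
    vec = vectorize(utterance)
    return max(centroids, key=lambda i: cosine(vec, centroids[i]))

print(classify("please book me a flight to atlanta"))  # book_flight
print(classify("how is the weather today"))            # get_weather
```

A model like BERT replaces the bag-of-words vectors with learned contextual embeddings, which is what allows it to generalize from so few examples per intent.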
