Database-Aware ASR Error Correction for Speech-to-SQL Parsing


Abstract:

We study the task of spoken natural language to SQL parsing (speech-to-SQL), where the goal is to map a spoken utterance to the corresponding SQL. A simple way to develop a speech-to-SQL parser is to pass the speech to an automatic speech recognition (ASR) system, and pass the transcription to a text-to-SQL parser. However, ASR is still error-prone. We propose an ASR correction method, DBATI (DataBase-Aware TaggerILM). The method first detects erroneous spans in the input and then rewrites each span. Our method leverages a novel joint representation of text and the database (DB). Our experiments show that our method yields better performance on both text quality and downstream SQL accuracy, compared to existing ASR error correction methods.
Date of Conference: 04-10 June 2023
Date Added to IEEE Xplore: 05 May 2023
Conference Location: Rhodes Island, Greece

1. INTRODUCTION

Interfaces that support human language as a medium of communication between humans and computers, known as Natural Language Interfaces (NLIs), have been of interest for decades [1], [2]. Early systems saw limited success because of the difficulty of endowing computers with the ability to understand natural language. Progress in language understanding has led to renewed interest in NLIs [3]. In particular, several studies have focused on NLIs to databases (NLIDBs) [4], [5], [6]. NLIDBs, when fully realized, stand to support users who are not proficient in query languages. The primary focus of NLIDBs has been on parsing natural language text utterances into executable SQL queries (text-to-SQL parsing). Motivated by the rise of speech-driven digital assistants on smartphones, tablets, and other small handheld devices, we study the task of parsing spoken natural language to executable SQL queries (speech-to-SQL parsing).
