1. INTRODUCTION
Interfaces that support human language as a medium of communication between humans and computers have been of interest for decades [1], [2]. Early such systems, known as Natural Language Interfaces (NLIs), saw limited success owing to the difficulty of endowing computers with the ability to understand natural language. Progress in language understanding has since led to renewed interest in NLIs [3]. In particular, several studies have focused on NLIs to databases (NLIDBs) [4], [5], [6]. NLIDBs, when fully realized, stand to support users who are not proficient in query languages. The primary focus of NLIDB research has been on parsing natural language text utterances into executable SQL queries (text-to-SQL parsing). Motivated by the rise of speech-driven digital assistants on smartphones, tablets, and other small handheld devices, we study the task of parsing spoken natural language into executable SQL queries (speech-to-SQL parsing).
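To make the task concrete, the minimal sketch below (an illustration of ours, with a hypothetical students table rather than an example drawn from any benchmark) pairs a natural-language question with the executable SQL query a text-to-SQL parser would be expected to produce; in the speech-to-SQL setting, the input is the spoken form of the same question rather than its transcript.

```python
import sqlite3

# Hypothetical toy database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, major TEXT)")
conn.executemany("INSERT INTO students (major) VALUES (?)",
                 [("CS",), ("Math",), ("CS",)])

# Input utterance (text here; audio of the same question in speech-to-SQL)
question = "How many students major in CS?"
# Target output: an executable SQL query answering the question.
sql = "SELECT COUNT(*) FROM students WHERE major = 'CS'"

print(question, "->", conn.execute(sql).fetchone()[0])  # -> 2
```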