QALD-9-plus: A Multilingual Dataset for Question Answering over DBpedia and Wikidata Translated by Native Speakers


Abstract:

The ability to have the same experience for different user groups (i.e., accessibility) is one of the most important characteristics of Web-based systems. The same is true for Knowledge Graph Question Answering (KGQA) systems, which provide access to Semantic Web data via a natural language interface. While following our research agenda on the multilingual aspect of accessibility of KGQA systems, we identified several ongoing challenges. One of them is the lack of multilingual KGQA benchmarks. In this work, we extend one of the most popular KGQA benchmarks, QALD-9, by introducing high-quality translations of its questions into 8 languages, provided by native speakers, and by transferring the SPARQL queries of QALD-9 from DBpedia to Wikidata, such that the usability and relevance of the dataset are strongly increased. Five of the languages - Armenian, Ukrainian, Lithuanian, Bashkir and Belarusian - have, to the best of our knowledge, never been considered in the KGQA research community before. The latter two are considered “endangered” by UNESCO. We call the extended dataset QALD-9-plus and have made it available online (Figshare: https://doi.org/10.6084/m9.figshare.16864273; GitHub: https://github.com/Perevalov/qald_9_plus).
Date of Conference: 26-28 January 2022
Date Added to IEEE Xplore: 23 March 2022
ISBN Information:
Print on Demand(PoD) ISSN: 2325-6516
Conference Location: Laguna Hills, CA, USA

I. Introduction

The core task of a Knowledge Graph Question Answering (KGQA) system is to represent a natural language question in the form of a structured query (e.g., SPARQL) to a knowledge graph (KG). In other words, KGQA systems provide access to the data in KGs via a natural-language user interface, such that end users are not required to learn a particular query language to fetch data manually. The relevance (or accuracy) of the answers given by such a system should approach human performance while sparing users the labor cost of learning a query language; otherwise, the system is useless. Many researchers aim at measuring and increasing the Question Answering (QA) quality or the quality of particular KGQA sub-tasks, such as named entity linking (e.g., [1]) or expected answer type prediction (e.g., [2]). However, the accessibility characteristic of KGQA systems (accessibility for the Web is defined by W3C: https://www.w3.org/standards/webdesign/accessibility) is often overlooked. In this context, perfect accessibility denotes an equivalent experience for all user groups of a particular KGQA system. Hence, research questions such as “How many people can really take advantage of a high-quality KGQA system?”, “Who are these people?”, and “How diverse are they?” often go unaddressed.
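
To make the core task stated at the start of this section concrete, the following minimal sketch pairs one question, given in two languages, with an equivalent SPARQL query for each of the two knowledge graphs. Everything in it is an assumption made for illustration: the class name `KGQAExample`, its fields, the sample question, and both queries are hypothetical and are not taken from QALD-9-plus or its file format.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class KGQAExample:
    """Hypothetical container: one question in several languages, one query per KG."""
    translations: Dict[str, str]  # language code -> question text
    queries: Dict[str, str]       # knowledge graph name -> SPARQL query

    def question(self, lang: str, fallback: str = "en") -> str:
        # Return the question in the requested language, or the fallback
        # language when no translation exists for that code.
        return self.translations.get(lang, self.translations[fallback])


# Illustrative entry only (not a record from the dataset).
example = KGQAExample(
    translations={
        "en": "What is the capital of Germany?",
        "de": "Was ist die Hauptstadt von Deutschland?",
    },
    queries={
        "dbpedia": (
            "SELECT ?capital WHERE { "
            "<http://dbpedia.org/resource/Germany> "
            "<http://dbpedia.org/ontology/capital> ?capital }"
        ),
        "wikidata": (
            "SELECT ?capital WHERE { "
            "<http://www.wikidata.org/entity/Q183> "
            "<http://www.wikidata.org/prop/direct/P36> ?capital }"
        ),
    },
)

print(example.question("de"))
print(example.queries["wikidata"])
```

The sketch only shows the shape of the problem: the same information need has several surface forms (one per language) and several target queries (one per knowledge graph), and a KGQA system has to map the former to the latter.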
