A Construction method of Multilingual Comparable Corpus in the background of Artificial Intelligence and Internet of Things | IEEE Conference Publication | IEEE Xplore

Abstract:

Comparable corpora are a critical component in applications of the Artificial Intelligence and Internet of Things (AIoT). AIoT provides a more extensive data source for corpora, which also presents new requirements and challenges for constructing comparable corpora suited to multilingual application scenarios. Comparable corpora play an essential role in research on language information processing and in multilingual applications. However, multilingual comparable corpora are rare, so there is an urgent need to construct multilingual corpus resources. This paper proposes a method for constructing a multilingual comparable corpus, taking a Chinese-Uighur-Tibetan news corpus as an example, and maps the corpora of the different languages into a unified language vector space. It then calculates the similarity between news texts in different languages and uses it as a comparability index to construct comparable relations. Through a decision-making mechanism of minimizing the impossibility, the method selects candidate chapter-sized comparable pairs of multilingual news, thereby constructing a Chinese-Uighur-Tibetan news comparable corpus (CUTCC). Evaluation shows that our method achieves a higher accuracy rate and F value than the existing method. Finally, the multilingual comparable corpus constructed in this study provides valuable data resources and language services for multilingual situations and AIoT application scenarios.
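
The similarity step described in the abstract can be illustrated with a minimal sketch. It assumes each news text has already been mapped to a vector in a shared space (the abstract does not say how), uses cosine similarity as the comparability index, and substitutes a simple best-match-above-threshold rule for the paper's decision mechanism, whose details are not given here. The function names, the toy vectors, and the 0.8 threshold are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def comparable_pairs(src_vecs, tgt_vecs, threshold=0.8):
    """Pair each source document with its most similar target document.

    A pair is kept only if its similarity reaches the threshold.
    This selection rule is an illustrative stand-in, not the paper's
    decision mechanism.
    """
    pairs = []
    if not tgt_vecs:
        return pairs
    for i, u in enumerate(src_vecs):
        scores = [cosine(u, v) for v in tgt_vecs]
        j = max(range(len(scores)), key=scores.__getitem__)
        if scores[j] >= threshold:
            pairs.append((i, j, scores[j]))
    return pairs


# Toy document vectors, assumed to live in one shared vector space.
zh_docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
ug_docs = [[0.0, 0.9, 0.1], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]

pairs = comparable_pairs(zh_docs, ug_docs)
for src, tgt, score in pairs:
    print(f"source {src} <-> target {tgt}: similarity {score:.3f}")
```

For three languages, the same function could be applied to each language pair in turn; whether the paper does this or aligns all three jointly is not stated in the abstract.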
Date of Conference: 17-21 December 2023
Date Added to IEEE Xplore: 01 May 2024
Conference Location: Danzhou, China
