Adaptive Target-Consistency Entity Matching Algorithm Based on Semi-Supervised Learning | IEEE Conference Publication | IEEE Xplore

Abstract:

Entity matching, which determines whether two entities from different sources refer to the same real-world object, is a key task in natural language processing (NLP). Although fine-tuning pre-trained language models has significantly advanced entity matching by leveraging vast amounts of domain-specific unlabeled data, achieving satisfactory accuracy still typically requires thousands of labeled instances, which are often difficult to obtain in real-world applications. To address this challenge, we introduce a novel framework called ASTCEM, which uses an Adaptive Target Consistency (ATC) strategy. Building on the mean-teacher paradigm, this approach incorporates target perturbation consistency learning into entity matching. By encouraging agreement between the outputs of the student and teacher models, our method exploits unlabeled data, reducing the need for extensive labeled datasets while maintaining comparable performance and improving both test robustness and generalization. We conducted extensive comparative and ablation studies on 13 datasets from the classical entity matching domain. The experimental results show that, compared to current state-of-the-art (SOTA) models, the ASTCEM framework improves F1 scores by 12.26%, 6.94%, and 4.23% on various datasets. ASTCEM also achieves higher precision and accuracy than existing SOTA methods, especially when data are limited and highly imbalanced.
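
As a rough illustration of the mean-teacher paradigm the abstract builds on, the sketch below shows the two generic ingredients of that paradigm: a teacher whose weights are an exponential moving average (EMA) of the student's weights, and a consistency term that penalizes disagreement between student and teacher predictions on the same input. This is not the ASTCEM implementation, and the adaptive target consistency strategy itself is not reproduced here. The function names, the `decay` value, the mean-squared-error form of the consistency term, and the `weight` parameter are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ema_update(teacher, student, decay=0.99):
    """Move each teacher weight toward the matching student weight.

    `decay` is an assumed hyperparameter; values close to 1 make the
    teacher change slowly.
    """
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


def consistency_loss(student_logits, teacher_logits):
    """Mean squared error between student and teacher class probabilities.

    MSE is one common choice of consistency cost; it is an assumption here.
    """
    p_student = softmax(student_logits)
    p_teacher = softmax(teacher_logits)
    squared = [(a - b) ** 2 for a, b in zip(p_student, p_teacher)]
    return sum(squared) / len(squared)


def total_loss(supervised, consistency, weight):
    """Combine the labeled-data loss with the weighted consistency term."""
    return supervised + weight * consistency
```

In a training loop of this kind, the supervised term is computed on labeled pairs only, while the consistency term can be computed on unlabeled pairs as well, which is how such a scheme can benefit from unlabeled data.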
Date of Conference: 25-28 October 2024
Date Added to IEEE Xplore: 27 December 2024
Conference Location: Chiang Mai, Thailand
College of Systems Engineering, National University of Defense Technology, Changsha, Hunan, China

Introduction

Entity matching seeks to determine whether two data entities correspond to the same real-world object. It plays a crucial role in tasks such as social network analysis, big data integration, and semantic web data management. Across data sources, a real-world entity such as a person, book, or organization may have multiple representations. Conversely, the same representation may correspond to different entities. The task of entity matching is therefore to identify cases where different representations refer to the same entity despite their surface variations.
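
To make the task definition concrete, here is a minimal sketch of entity matching as a pairwise decision over records from two sources. It uses a plain token-overlap (Jaccard) score with a fixed threshold, which is a deliberately simple stand-in and not the language-model approach the paper describes. The records, attribute names, and the `threshold` value are hypothetical.

```python
def tokens(record):
    """Lowercased word tokens drawn from all attribute values of a record."""
    text = " ".join(str(value) for value in record.values())
    return set(text.lower().replace(",", " ").split())


def jaccard(a, b):
    """Token-set overlap between two records, between 0.0 and 1.0."""
    tokens_a, tokens_b = tokens(a), tokens(b)
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_match(a, b, threshold=0.5):
    """Declare a match when the overlap reaches the assumed threshold."""
    return jaccard(a, b) >= threshold


# Hypothetical records: a and b describe the same product with different
# schemas and formatting; c is a different product from the same brand.
a = {"title": "Apple iPhone 13 128GB", "brand": "Apple"}
b = {"name": "iPhone 13, 128GB, Apple"}
c = {"title": "Apple Watch Series 7", "brand": "Apple"}

same = is_match(a, b)       # different representations, same entity
different = is_match(a, c)  # shared brand token only, different entity
```

A surface-overlap rule like this breaks down when matching records share few tokens or non-matching records share many, which is part of why learned matchers are used in practice.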
