Analogy-based Assessment of Domain-specific Word Embeddings


Abstract:

The ability of word embeddings to identify shared semantic regularities between word-pair categories such as capital-country has led to the use of analogies as a method of validating word embedding models. Further research has shown that, across the full breadth of possible analogy categories, current analogy equations are accurate for only a limited set of categories when executed against word embeddings trained on general, non-domain-specific text corpora. Because most, if not all, domain-specific scientific analogy pairs belong to problematic analogy categories (i.e., the lexicographical and the encyclopedic), we examine the degree to which a domain-specific text corpus and vocabulary improve analogy predictions from word embeddings. Our findings demonstrate that, compared with analogy-based tests performed against general word embeddings, predictions by domain-specific word embeddings perform better in exactly those analogy categories that are both highly problematic and the location of domain knowledge.
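To make the idea of an analogy-based test concrete, the following is a minimal sketch of one widely used analogy equation, vector offset with cosine similarity (often called 3CosAdd). The abstract does not name the equations the authors evaluated, so the choice of 3CosAdd is an assumption, and the function names (`cosine`, `solve_analogy`, `analogy_accuracy`) and the hand-made toy vectors are hypothetical illustrations rather than anything taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(embeddings, a, b, c):
    """Predict d in 'a : b :: c : d' using the offset b - a + c (3CosAdd).

    The three query words are excluded from the candidates, as is
    conventional in analogy test sets.
    """
    target = [vb - va + vc
              for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = (w for w in embeddings if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(embeddings[w], target))


def analogy_accuracy(embeddings, questions):
    """Fraction of (a, b, c, expected_d) questions answered correctly."""
    correct = sum(solve_analogy(embeddings, a, b, c) == d
                  for a, b, c, d in questions)
    return correct / len(questions)


# Toy 4-dimensional vectors, invented for illustration only.
# Dimensions: France, Italy, Germany, "is a capital".
toy = {
    "paris":   [1.0, 0.0, 0.0, 1.0],
    "france":  [1.0, 0.0, 0.0, 0.0],
    "rome":    [0.0, 1.0, 0.0, 1.0],
    "italy":   [0.0, 1.0, 0.0, 0.0],
    "berlin":  [0.0, 0.0, 1.0, 1.0],
    "germany": [0.0, 0.0, 1.0, 0.0],
}

# Capital-country questions in the form (a, b, c, expected_d).
questions = [
    ("paris", "france", "rome", "italy"),
    ("paris", "france", "berlin", "germany"),
]

print(solve_analogy(toy, "paris", "france", "rome"))
print(analogy_accuracy(toy, questions))
```

Under this kind of setup, a comparison between general and domain-specific embeddings could amount to calling `analogy_accuracy` once per embedding model, grouped by analogy category; the abstract does not describe the actual evaluation procedure, so this is only one plausible reading.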
Published in: 2020 SoutheastCon
Date of Conference: 28-29 March 2020
Date Added to IEEE Xplore: 13 November 2020
Conference Location: Raleigh, NC, USA