OnPerDis: Ontology-Based Personal Name Disambiguation on the Web


Abstract:

With the growth of web documents, personal name ambiguity has become more common and degrades web search performance. Identifying the correct person entity from part of a document or from the whole document remains a very challenging problem, especially for Chinese websites. In this paper, we propose a novel ontology-based approach for personal name disambiguation, named "OnPerDis". The approach has two main steps. First, we construct a person ontology (PO) with rich conceptual modeling and a large set of supporting instances. Second, for a given personal name on the web, we create a temporary instance, extract its features from the web documents, and calculate the similarity between this temporary instance and the instances in the PO. The instance with the highest similarity score is chosen as the referent of the personal name. Our extensive evaluations on two rich real-life datasets (CIPS-SIGHAN 2012 NERD and Chinese web documents) show the efficacy of OnPerDis for personal name disambiguation on the Web.
Date of Conference: 17-20 November 2013
Date Added to IEEE Xplore: 23 December 2013
Conference Location: Atlanta, GA, USA
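
The second step described in the abstract (build a temporary instance, score it against every instance in the PO, keep the best match) can be illustrated with a minimal sketch. The abstract does not specify the feature set or the similarity measure, so everything below is an assumption made for illustration: the `PersonInstance` schema, the attribute names, the use of Jaccard overlap averaged over attributes, and the sample data are all hypothetical and are not the authors' method.

```python
from dataclasses import dataclass, field


@dataclass
class PersonInstance:
    """A person described as attribute name -> set of values (hypothetical schema)."""
    identifier: str
    attributes: dict[str, set[str]] = field(default_factory=dict)


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap of two value sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(temp: PersonInstance, candidate: PersonInstance) -> float:
    """Mean Jaccard overlap over the attributes of the temporary instance.

    This stands in for the paper's similarity computation, which the
    abstract does not define.
    """
    if not temp.attributes:
        return 0.0
    scores = [
        jaccard(values, candidate.attributes.get(name, set()))
        for name, values in temp.attributes.items()
    ]
    return sum(scores) / len(scores)


def disambiguate(temp: PersonInstance, ontology: list[PersonInstance]):
    """Return (best-matching instance, score); (None, 0.0) if the ontology is empty."""
    best, best_score = None, 0.0
    for candidate in ontology:
        score = similarity(temp, candidate)
        if best is None or score > best_score:
            best, best_score = candidate, score
    return best, best_score


# Illustrative data only: two ontology instances sharing one personal name.
person_ontology = [
    PersonInstance(
        "yao_ming_athlete",
        {"occupation": {"basketball player"},
         "organization": {"Houston Rockets", "Shanghai Sharks"}},
    ),
    PersonInstance(
        "yao_ming_other",
        {"occupation": {"professor"},
         "organization": {"example university"}},
    ),
]

# Temporary instance built from features extracted from a web document.
temporary = PersonInstance(
    "temp",
    {"occupation": {"basketball player"},
     "organization": {"Houston Rockets"}},
)

best, score = disambiguate(temporary, person_ontology)
print(best.identifier, score)
```

In this toy example the temporary instance matches the first ontology instance on both attributes, so that instance receives the highest score. A real system would need feature extraction from text and probably a score threshold for names that have no matching instance in the PO; both are outside this sketch.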

I. Introduction

Web search engines have become vital in people's daily lives and are widely used to retrieve information about real-world entities, including people. In such cases, users enter the name of the target entity into a search engine to obtain a set of Web pages that contain the name. However, name ambiguity (many entities share the same name, or one entity has several names) typically produces search results that mix Web pages about several different entities. Such ambiguity is especially common for Chinese names. For example, when searching for “Yao Ming”, the results are dominated by the well-known basketball player, and users have to manually filter out these Web pages to find the less famous people who share the same name. This is the personal name ambiguity problem.

