DSpace
 

Tai Nguyen So - Vietnam National University, Ha Noi - VNU >
TRƯỜNG ĐẠI HỌC CÔNG NGHỆ >
PTN Micro Nano >
Articles of Universities of Vietnam from Scopus >

Search

Please use this identifier to cite or link to this item: http://tainguyenso.vnu.edu.vn/jspui/handle/123456789/12417

Title: Named entity disambiguation: A hybrid statistical and rule-based incremental approach
Authors: Nguyen H.T.
Cao T.H.
Keywords: 
Issue Date: 2008
Publisher: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Citation: Volume 5367 LNCS, Issue , Page 420-433
Abstract: The rapidly increasing use of large-scale data on the Web makes named entity disambiguation become one of the main challenges to research in Information Extraction and development of Semantic Web. This paper presents a novel method for detecting proper names in a text and linking them to the right entities in Wikipedia. The method is hybrid, containing two phases of which the first one utilizes some heuristics and patterns to narrow down the candidates, and the second one employs the vector space model to rank the ambiguous cases to choose the right candidate. The novelty is that the disambiguation process is incremental and includes several rounds that filter the candidates, by exploiting previously identified entities and extending the text by those entity attributes every time they are successfully resolved in a round. We test the performance of the proposed method in disambiguation of names of people, locations and organizations in texts of the news domain. The experiment results show that our approach achieves high accuracy and can be used to construct a robust named entity disambiguation system. © 2008 Springer Berlin Heidelberg.
URI: http://tainguyenso.vnu.edu.vn/jspui/handle/123456789/12417
ISSN: 3029743
Appears in Collections:Articles of Universities of Vietnam from Scopus

Files in This Item:

File SizeFormat
HCM_U205.pdf49.18 kBAdobe PDFView/Open

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

 

Valid XHTML 1.0! DSpace Software Copyright © 2002-2010  Duraspace - Feedback