Please use this identifier to cite or link to this item:
http://tainguyenso.vnu.edu.vn/jspui/handle/123456789/12575
|
Title: | A fuzzy synset-based Hidden Markov Model for automatic text segmentation |
Authors: | Ha-Thuc, V.; Nguyen-Van, Q.-A.; Cao, T.H.; Lawry, J. |
Keywords: | |
Issue Date: | 2006 |
Publisher: | Advances in Soft Computing |
Citation: | Volume 37, Pages 365-372 |
Abstract: | Automatic segmentation of text strings, in particular entity names, into structured records is often needed for efficient information retrieval, analysis, mining, and integration. The Hidden Markov Model (HMM) has been shown to be the state of the art for this task. However, previous work did not take into account the synonymy of words and their abbreviations, or the possibility of their misspelling. In this paper, we propose a fuzzy synset-based HMM for text segmentation, based on a semantic relation and an edit distance between words. The model is also designed to deal with texts written in a language like Vietnamese, where a meaningful word can be composed of more than one syllable. Experiments on Vietnamese company names are presented to demonstrate the performance of the model. © 2006 Springer. |
URI: | http://tainguyenso.vnu.edu.vn/jspui/handle/123456789/12575 |
ISSN: | 1615-3871 |
Appears in Collections: | Articles of Universities of Vietnam from Scopus |