

Please use this identifier to cite or link to this item: http://tainguyenso.vnu.edu.vn/jspui/handle/123456789/7285

Title: A hybrid approach to Vietnamese word segmentation using part of speech tags
Authors: Pham, D.D.
Tran, G.B.
Pham, S.B.
Keywords: F-measure
Hybrid approach
Boundary determination
Maximum matchings
Issue Date: 2009
Publisher: KSE 2009 - The 1st International Conference on Knowledge and Systems Engineering
Citation: Pages 154-161
Abstract: Word segmentation is one of the most important tasks in NLP. For Vietnamese, the language's particular features make the task challenging, especially in determining word boundaries. To tackle Vietnamese word segmentation, this paper proposes the WS4VN system, which uses a new approach based on the maximum matching algorithm combined with stochastic models that use part-of-speech information. The approach can resolve word ambiguity and choose the best segmentation for each input sentence. The system gives a promising result with an F-measure of 97%, higher than the results of existing publicly available Vietnamese word segmentation systems. © 2009 IEEE.
URI: http://tainguyenso.vnu.edu.vn/jspui/handle/123456789/7285
ISBN: 9.78E+12
Appears in Collections: 2009-2010 VNU-DOI-Publications
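
Note on the method named in the abstract: maximum matching is a dictionary-based technique that, at each position, takes the longest run of syllables found in a lexicon. The sketch below is only a rough illustration of the greedy forward variant, not the WS4VN implementation. The function name `max_match`, the `max_len` limit, and the toy lexicon are all assumptions made for this example; the record gives no such details.

```python
def max_match(syllables, lexicon, max_len=4):
    """Greedy forward maximum matching over a list of syllables.

    Illustrative sketch only; names and parameters are hypothetical.
    """
    words = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first and shrink until the lexicon matches.
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = " ".join(syllables[i:i + n])
            # A single syllable is always accepted as a fallback.
            if n == 1 or candidate in lexicon:
                words.append(candidate)
                i += n
                break
    return words


# Toy lexicon (an assumption for illustration, not the paper's dictionary).
LEXICON = {"học sinh", "sinh học", "học"}

segmented = max_match("học sinh học sinh học".split(), LEXICON)
print(segmented)
```

On this input the greedy pass picks "học sinh" twice and leaves a final "học", whereas the reading "học sinh | học | sinh học" ("students study biology") is the sensible one. Ambiguities of this kind are what the abstract says the stochastic, part-of-speech-based component is meant to resolve.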

Files in This Item:

File: 232.pdf
Size: 47.66 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
