
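Research on Yi Automatic Word Segmentation Technology Based on an Established Word List
Cite this article: Wang Chengping. Research on Yi Automatic Word Segmentation Technology Based on an Established Word List[J]. Science Technology and Engineering, 2012, 12(10): 2328-2332.
Author: Wang Chengping
Affiliation: Experimental Center for Ethnic Language and Script Information Processing, Southwest University for Nationalities, Chengdu 610041
Funding: State Ethnic Affairs Commission research project "Design and Implementation of a Standard Yi Word Segmentation System for Information Processing" (09XN07); 2010 National Foreign Experts Project "Design and Implementation of a Standard Yi Automatic Word Segmentation System for Information Processing" (Y-2010-26)
Abstract: Automatic word segmentation is an indispensable foundation of Yi information processing: any task that involves retrieval, translation, or proofreading must take the word as its basic unit. Based on the characteristics of the Yi script, this paper introduces the Yi word segmentation standard and the design of the segmentation word list. It then presents the algorithm selection, system structure, and implementation workflow for Yi automatic word segmentation based on an established word list, and reports a sampling test in which both segmentation accuracy and speed were fairly satisfactory. Finally, it analyzes the difficulties of Yi automatic word segmentation in light of the characteristics of the Yi script.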

Keywords: Yi script; automatic word segmentation; algorithm; testing and evaluation; difficulty analysis
Received: 1/15/2012 8:39:58 PM
Revised: 2/1/2012 8:48:11 AM

Based on the established vocabulary of Yi Automatic Segmentation System Design and Implementation
Wang Chengping. Based on the established vocabulary of Yi Automatic Segmentation System Design and Implementation[J]. Science Technology and Engineering, 2012, 12(10): 2328-2332.
Authors: Wang Chengping
Institution: Southwest University for Nationalities, Chengdu 610041, P.R. China
Abstract: Automatic word segmentation is an indispensable foundation of Yi language information processing. Whenever Yi language information processing involves retrieval, translation, or proofreading, it must use the word as its basic unit. Based on the characteristics of the Yi language, the word segmentation standard and the design of the segmentation vocabulary are described. A technique for automatic word segmentation based on an established Yi vocabulary is proposed, covering algorithm selection, system architecture, and the implementation process. A sample test is reported, and the accuracy and speed of word segmentation are quite satisfactory. Finally, the difficulties of achieving automatic word segmentation are analyzed in light of the characteristics of the Yi language.
Keywords: Yi language; automatic segmentation; algorithm; testing and evaluation; difficulty analysis
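
The abstract does not say which matching algorithm the paper selects. As a rough illustration of how segmentation against a fixed word list can work, the following is a minimal sketch of forward maximum matching, one common dictionary-based method, and it should not be read as the paper's implementation. The function name `forward_max_match`, the toy lexicon, and the Latin letters standing in for Yi syllable characters are all hypothetical.

```python
def forward_max_match(text, lexicon, max_len=None):
    """Segment `text` by greedily taking the longest lexicon entry at each position.

    Assumption: each character of `text` is one written syllable, and any
    character that starts no lexicon entry is emitted as a one-character token.
    """
    if max_len is None:
        # Longest entry in the word list bounds the candidate window (at least 1).
        max_len = max(1, max((len(w) for w in lexicon), default=1))
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy word list; Latin letters stand in for Yi syllables.
toy_lexicon = {"ab", "abc", "cd", "d"}
print(forward_max_match("abcd", toy_lexicon))
```

Because the longest match wins at each position, the sketch prefers "abc" + "d" over "ab" + "cd" for this input, which illustrates the kind of ambiguity a word-list method has to resolve.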
This article is indexed in CNKI, Wanfang Data, and other databases.