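A Method for Recognizing Unknown Words by Building a Single-Character Word List
Citation: Yu Tong, Liu Shufen. A method for recognizing unknown words by building a single-character word list[J]. Journal of Jilin University (Science Edition), 2015, 53(2): 307-310.
Authors: Yu Tong, Liu Shufen
Affiliation: College of Computer Science and Technology, Jilin University, Changchun 130012
Funding: National Natural Science Foundation of China (Grant No. 60973041); Science and Technology Development Plan Project of Jilin Province (Grant No. 20112112)
Abstract: Current Chinese word segmentation techniques rely mainly on a dictionary of common words, and such a dictionary has a low recognition rate for unknown words. To address this, a dual-dictionary method for recognizing unknown words is proposed: a common-word dictionary and a single-character word dictionary are built and used together for segmentation, which effectively resolves the low efficiency of unknown-word recognition. Experiments show that building a single-character word list yields a recognition accuracy for unknown words above 90%.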

Keywords: single-character word list; unknown words; Chinese word segmentation; dual-dictionary method
Received: 2014-07-11

Method of Recognizing Unknown Words by Building Single Word Dictionary
YU Tong, LIU Shufen. Method of Recognizing Unknown Words by Building Single Word Dictionary[J]. Journal of Jilin University: Sci Ed, 2015, 53(2): 307-310.
Authors: YU Tong, LIU Shufen
Institution: College of Computer Science and Technology, Jilin University, Changchun 130012, China
Abstract: Chinese word segmentation is an important task in information processing. Current Chinese word segmentation techniques rely mainly on a common-word dictionary, which recognizes unknown words poorly. The authors propose a double-dictionary method for recognizing unknown words: build a common-word dictionary and a single-word dictionary, then combine the two for segmentation, which addresses the low recognition rate for unknown words. Experiments show that the recognition accuracy for unknown words can exceed 90%.
Keywords: single word dictionary; unknown words; Chinese word segmentation; double dictionary
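
The abstract does not spell out how the two dictionaries interact, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes that text is first split by forward maximum matching against the common-word dictionary, and that any run of adjacent leftover characters that do not appear in the single-character word dictionary is then merged into one candidate unknown word. The function names (`forward_max_match`, `merge_unknown`, `segment`), the `max_len` parameter, and the toy dictionaries are all hypothetical.

```python
def forward_max_match(text, common_words, max_len=4):
    """Greedy forward maximum matching (an assumed first pass).

    Characters that start no dictionary word come out as single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest window first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in common_words:
                tokens.append(piece)
                i += size
                break
    return tokens


def merge_unknown(tokens, common_words, single_char_words):
    """Merge runs of adjacent characters that are not valid single-character words."""
    result = []
    run = ""
    for tok in tokens:
        is_fragment = (
            len(tok) == 1
            and tok not in single_char_words
            and tok not in common_words
        )
        if is_fragment:
            run += tok  # extend the current candidate unknown word
        else:
            if run:
                result.append(run)
                run = ""
            result.append(tok)
    if run:
        result.append(run)
    return result


def segment(text, common_words, single_char_words, max_len=4):
    """Segment text using both dictionaries (hypothetical combination)."""
    tokens = forward_max_match(text, common_words, max_len)
    return merge_unknown(tokens, common_words, single_char_words)


# Toy dictionaries, invented for illustration only.
common = {"长春", "工作"}
singles = {"在"}
print(segment("刘淑芬在长春工作", common, singles))
# ['刘淑芬', '在', '长春', '工作']
```

In this toy run the personal name is in neither dictionary, so its three characters are grouped into a single candidate, while "在" stays separate because it is listed as a valid single-character word. How well such a scheme works would depend heavily on how complete the single-character word list is.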
This article is indexed in CNKI, Wanfang Data, and other databases.