
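Chinese Person Name Recognition Based on Naive Bayes
Cite this article: ZENG Hui, WANG Jun, XIONG Li-yan. Chinese person name recognition based on Naive Bayes[J]. Science Technology and Engineering, 2015, 15(6): 83-86,98
Authors: ZENG Hui  WANG Jun  XIONG Li-yan
Affiliation: School of Information Engineering, East China Jiaotong University, Nanchang 330013
Funding: National Natural Science Foundation of China (61363072); Humanities and Social Sciences Foundation of the Ministry of Education (11YJC740157, 09YJC740027); Natural Science Foundation of Jiangxi Province (20114BAB201027)
Abstract: Building on the traditional Naive Bayes classification algorithm, which counts only the characters used in person names, this work incorporates the contextual boundaries of person names. Frequencies of name characters and boundary templates, gathered from a large-scale corpus, are used to delimit names, and a diffusion step then recalls names that were missed. The method is simple to implement and achieves good results. Experiments show that its F-value reaches 93.28%.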

Keywords: Naive Bayes classification algorithm   boundary templates   person name recognition
Received: 2014-10-15
Revised: 2014-10-17

Chinese person name recognition based on Naive Bayes
ZENG Hui, WANG Jun, XIONG Li-yan. Chinese person name recognition based on Naive Bayes[J]. Science Technology and Engineering, 2015, 15(6): 83-86,98
Authors: ZENG Hui    WANG Jun    XIONG Li-yan
Abstract: Building on the traditional Naive Bayes classification algorithm, which considers only the characters of Chinese person names, we incorporate the words at the left and right boundaries of a name. To overcome the difficulty of delimiting names, we count the frequencies of name characters and boundary templates in a tagged corpus. The recognized person names are then matched against the text to recover missed occurrences. The method is simple and performs well. Experimental results show that the F-value for Chinese person name recognition reaches 93.28%.
Keywords: Naive Bayesian classification algorithm   boundary templates   person name recognition
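
The abstract names three ingredients: name-character frequencies, boundary-template frequencies, and a step that matches recognized names against the text to recover missed occurrences. The sketch below shows one way these could fit together in a Naive Bayes log-odds score. It is an illustration under stated assumptions, not the authors' implementation: the counts are toy values, and the function names (`log_prob`, `name_log_odds`, `diffuse`), the add-one smoothing, the prior, and the choice of a single left and right boundary word are all hypothetical.

```python
import math
from collections import Counter

# Toy counts standing in for statistics from a tagged corpus (all hypothetical).
NAME_CHAR = Counter({"王": 50, "俊": 20, "李": 40, "艳": 15})
OTHER_CHAR = Counter({"王": 5, "俊": 2, "李": 4, "艳": 3, "说": 60, "的": 200})
# Boundary features: ("L", word) precedes the candidate, ("R", word) follows it.
NAME_BOUNDARY = Counter({("L", "记者"): 30, ("R", "说"): 40})
OTHER_BOUNDARY = Counter({("L", "记者"): 3, ("R", "说"): 10, ("R", "的"): 80})


def log_prob(counter, key, vocab_size):
    """Add-one smoothed log probability of `key` under `counter`."""
    total = sum(counter.values())
    return math.log((counter[key] + 1) / (total + vocab_size))


def name_log_odds(chars, left, right, prior_name=0.1):
    """Naive Bayes log-odds that `chars` is a person name, given its context.

    Characters and boundary words are treated as conditionally independent
    features; a positive score favors the "name" class.
    """
    char_vocab = len(set(NAME_CHAR) | set(OTHER_CHAR))
    bnd_vocab = len(set(NAME_BOUNDARY) | set(OTHER_BOUNDARY))
    score = math.log(prior_name) - math.log(1 - prior_name)
    for ch in chars:
        score += log_prob(NAME_CHAR, ch, char_vocab)
        score -= log_prob(OTHER_CHAR, ch, char_vocab)
    for feat in (("L", left), ("R", right)):
        score += log_prob(NAME_BOUNDARY, feat, bnd_vocab)
        score -= log_prob(OTHER_BOUNDARY, feat, bnd_vocab)
    return score


def diffuse(text, names):
    """Count every occurrence of already recognized names in `text`,
    so that mentions missed by the classifier can be recalled."""
    return {name: text.count(name) for name in names}
```

With these toy counts, the candidate "王俊" scores higher between "记者" and "说" than it does in an uninformative context, which is the effect the boundary features are meant to capture. Real boundary templates and thresholds would have to come from corpus statistics.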
This article is indexed in CNKI, Wanfang Data, and other databases.