
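Chinese Micro-blog named entity linking based on multisource knowledge
Cite this article:ZAN Hong-ying, WU Yong-gang, JIA Yu-xiang, NIU Gui-ling. Chinese Micro-blog named entity linking based on multisource knowledge[J]. Journal of Shandong University (Natural Science), 2015, 50(7): 9-16.
Authors:ZAN Hong-ying  WU Yong-gang  JIA Yu-xiang  NIU Gui-ling
Affiliations:1. School of Information Engineering, Zhengzhou University, Zhengzhou 450001, Henan, China;
2. School of Foreign Language, Zhengzhou University, Zhengzhou 450001, Henan, China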
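Funding:National Natural Science Foundation of China (61402419, 60970083, 61272221); National Social Science Foundation of China (14BYY096); National High Technology Research and Development Program of China (863 Program) (2012AA011101); Science and Technology Research Program of the Henan Provincial Department of Science and Technology (132102210407); Basic Research Program of the Henan Provincial Department of Science and Technology (142300410231, 142300410308); Key Science and Technology Research Project of the Henan Provincial Department of Education (12B520055, 13B520381); Open Project of the Key Laboratory of Computational Linguistics (Peking University), Ministry of Education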
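Abstract:Named entities are important information-bearing units in text. Micro-blog is a social network platform for sharing short, real-time messages; its texts are short and non-standard, and new words appear frequently, so its named entities must be understood accurately in order to analyze the text correctly. This paper proposes Chinese micro-blog named entity linking based on multi-source knowledge, which combines knowledge such as a synonym dictionary and encyclopedia resources with the bag-of-words model to link named entities. Experiments on the NLP&CC2013 Chinese micro-blog entity linking evaluation data set yield a micro-averaged accuracy of 92.97%, two percentage points higher than the best result in the NLP&CC2013 Chinese entity linking evaluation.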
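To make the reported metric concrete, the following is a minimal Python sketch of micro-averaged accuracy, assuming the usual definition: correct links are pooled over all mentions in all posts before dividing. The function names and the per-post `(correct, total)` input format are hypothetical and are not taken from the paper or from the NLP&CC2013 evaluation tooling. A macro-averaged variant is included only for contrast.

```python
def micro_accuracy(per_post_results):
    """Pool all mentions across posts, then divide correct links by total mentions."""
    correct = sum(c for c, _ in per_post_results)
    total = sum(t for _, t in per_post_results)
    return correct / total if total else 0.0


def macro_accuracy(per_post_results):
    """Average the per-post accuracies, so every post counts equally."""
    ratios = [c / t for c, t in per_post_results if t]
    return sum(ratios) / len(ratios) if ratios else 0.0


# Each pair is (correctly linked mentions, total mentions) for one post.
results = [(1, 1), (1, 3)]
print(micro_accuracy(results))  # 2 correct out of 4 mentions
print(macro_accuracy(results))  # mean of 1/1 and 1/3
```

Under micro-averaging a post with many mentions weighs more than a post with a single mention, which is why the two figures can differ on the same data.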

Keywords:named entity  Chinese micro-blog entity linking  synonym dictionary  encyclopedia resources  bag-of-words model
Received:2015-03-03

Chinese Micro-blog named entity linking based on multisource knowledge
ZAN Hong-ying, WU Yong-gang, JIA Yu-xiang, NIU Gui-ling. Chinese Micro-blog named entity linking based on multisource knowledge[J]. Journal of Shandong University (Natural Science), 2015, 50(7): 9-16.
Authors:ZAN Hong-ying  WU Yong-gang  JIA Yu-xiang  NIU Gui-ling
Institution:1. School of Information Engineering, Zhengzhou University, Zhengzhou 450001, Henan, China;
2. School of Foreign Language, Zhengzhou University, Zhengzhou 450001, Henan, China
Abstract:Named entities are important information-bearing units in text. Micro-blog is a social network platform for sharing brief real-time information; its texts are short and non-standard, and neologisms appear frequently, so the named entities must be understood accurately for the text to be analyzed correctly. A Chinese Micro-blog named entity linking strategy based on multi-source knowledge is proposed, which combines a dictionary of synonyms and encyclopedia resources with the bag-of-words model. In this strategy, named entities to be linked in a Micro-blog are mapped to the corresponding candidate entities in the knowledge base. In experiments on the data set of the NLP&CC2013 Chinese micro-blog entity linking track, the method obtains a micro-averaged accuracy of 92.97%, two percentage points higher than the best result reported in that evaluation, which demonstrates the effectiveness of the method.
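The abstract gives no implementation detail, so the following Python sketch only illustrates the general idea of dictionary-based candidate generation followed by bag-of-words disambiguation. Everything in it is an assumption made for illustration: the toy `KNOWLEDGE_BASE`, the `ALIASES` table (standing in for the synonym dictionary and encyclopedia resources), the cosine similarity measure, and the rule of returning "NIL" when no candidate shares a word with the context. It is not the authors' method.

```python
import math
from collections import Counter

# Hypothetical toy knowledge base: entity id -> tokenized description.
KNOWLEDGE_BASE = {
    "Apple_Inc": ["technology", "company", "iphone", "mac", "cupertino"],
    "Apple_fruit": ["fruit", "tree", "sweet", "orchard", "juice"],
}

# Hypothetical alias table standing in for synonym-dictionary and
# encyclopedia knowledge: surface form -> candidate entity ids.
ALIASES = {
    "apple": ["Apple_Inc", "Apple_fruit"],
    "apple inc": ["Apple_Inc"],
}


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def link(mention: str, context_tokens: list) -> str:
    """Return the best-matching entity id, or "NIL" if nothing matches."""
    # Step 1: candidate generation through the alias table.
    candidates = ALIASES.get(mention.lower(), [])
    if not candidates:
        return "NIL"
    # Step 2: rank candidates by bag-of-words similarity to the post context.
    context = Counter(context_tokens)
    scored = [(cosine(context, Counter(KNOWLEDGE_BASE[c])), c) for c in candidates]
    best_score, best_entity = max(scored)
    # Assumption of this sketch: zero overlap with every candidate means NIL.
    return best_entity if best_score > 0 else "NIL"


print(link("apple", ["new", "iphone", "company"]))
print(link("apple", ["sweet", "juice"]))
```

In this toy setting the first call favors the company sense and the second the fruit sense, because only the matching description shares words with the context. A real system would need Chinese word segmentation and a much larger knowledge base before the bag-of-words step.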
Keywords:named entity  Chinese Micro-blog entity linking  dictionary of synonyms  encyclopedia resources  bag-of-words model