
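Algorithm of knowledge base cumulative citation recommendation based on semantic features expansion
Cite this article:XU Ye, XU Wei-ran. Algorithm of knowledge base cumulative citation recommendation based on semantic features expansion[J]. Journal of Shandong University (Natural Science), 2016, 51(11): 26-32.
Authors:XU Ye  XU Wei-ran
Affiliation:School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876
Abstract:The knowledge base cumulative citation recommendation (CCR) task is decomposed into three basic key problems: query expansion for an entity name in the knowledge base; feature extraction for documents and entities; and a classification model that combines linear and nonlinear classifiers. Query expansion is performed by combining a semantic dictionary (DBpedia) with word vectors (word embedding), document features are extracted with two algorithms, LDA and ESA, and the CCR algorithm is finally realized with a classifier that fuses linear logistic regression with a nonlinear random forest. Compared with the baseline system, the method improves the average F1 on the TREC KBA2014 evaluation data by 14.7%, which shows that the proposed method handles the citation recommendation problem well.

Keywords:query expansion  classification  knowledge base  feature extraction
Received:2015-09-18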
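
The query expansion step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the alias table `ALIASES` stands in for DBpedia lookups, the three-dimensional `VECTORS` stand in for trained word embeddings, and the function names and the cosine nearest-neighbour rule are all assumptions made for illustration.

```python
import math

# Hypothetical alias table standing in for DBpedia lookups (assumption).
ALIASES = {
    "Barack Obama": ["Obama", "Barack Hussein Obama"],
}

# Toy 3-dimensional word vectors (assumption); real embeddings would be
# trained on a large corpus and have far more dimensions.
VECTORS = {
    "obama": [0.9, 0.1, 0.0],
    "president": [0.8, 0.2, 0.1],
    "senator": [0.7, 0.3, 0.0],
    "banana": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_words(word, k=2):
    """Return the k vocabulary words most similar to `word`."""
    if word not in VECTORS:
        return []
    target = VECTORS[word]
    scored = [
        (cosine(target, vec), other)
        for other, vec in VECTORS.items()
        if other != word
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]


def expand_query(entity, k=2):
    """Combine dictionary aliases with embedding neighbours of each token."""
    terms = [entity] + ALIASES.get(entity, [])
    seen = {t.lower() for t in terms}
    for token in entity.lower().split():
        for neighbour in nearest_words(token, k):
            if neighbour not in seen:
                seen.add(neighbour)
                terms.append(neighbour)
    return terms


expanded = expand_query("Barack Obama")
print(expanded)
```

With these toy vectors the unrelated word `banana` is never added, because only the two nearest neighbours of each token are kept.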

Algorithm of knowledge base cumulative citation recommendation based on semantic features expansion
XU Ye, XU Wei-ran. Algorithm of knowledge base cumulative citation recommendation based on semantic features expansion[J]. Journal of Shandong University (Natural Science), 2016, 51(11): 26-32.
Authors:XU Ye  XU Wei-ran
Institution:School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
Abstract:The knowledge base cumulative citation recommendation (CCR) task is decomposed into three key problems: query expansion for an entity name in the knowledge base, feature extraction for documents and entities, and a classification model that combines linear and nonlinear classifiers. We propose a query expansion method that combines a semantic dictionary (DBpedia) with word vectors (word embedding), and we extract document features with two algorithms, LDA and ESA. Documents are then classified by fusing linear logistic regression with a nonlinear random forest. On the TREC KBA2014 evaluation data, the method improves the average F1 by 14.7% over the baseline system, which indicates that it handles the citation recommendation problem well.
Keywords:query expansion  feature extraction  knowledge base  classification  
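
The final classification step can be sketched in the same spirit. The abstract does not say how the linear and nonlinear models are fused, so the weighted average of class probabilities below is an assumption, as are the function names, the 0.5 decision threshold, and the made-up probability scores. The sketch also shows how F1, the metric reported in the abstract, is computed from predictions.

```python
def fuse_scores(p_linear, p_nonlinear, weight=0.5):
    """Weighted average of two models' positive-class probabilities.

    Averaging is an assumed fusion rule, not one stated in the abstract.
    """
    return [
        weight * a + (1.0 - weight) * b
        for a, b in zip(p_linear, p_nonlinear)
    ]


def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up scores for four documents (illustration only).
p_lr = [0.9, 0.4, 0.2, 0.7]   # e.g. from a logistic regression model
p_rf = [0.8, 0.7, 0.1, 0.2]   # e.g. from a random forest model
y_true = [1, 1, 0, 1]

fused = fuse_scores(p_lr, p_rf)
y_pred = [1 if p >= 0.5 else 0 for p in fused]
f1 = f1_score(y_true, y_pred)
print(y_pred, round(f1, 3))
```

In this toy data the second document is rejected by the linear score alone (0.4) but accepted after fusion, which is the kind of complementary behaviour a combined model is meant to exploit.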
This article is indexed in CNKI and other databases.