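Name entity disambiguation method based on meta-path heterogeneous network embedding
Cite this article: WANG Jianxia, ZHANG Yuxuan, XU Yunfeng. Name entity disambiguation method based on meta-path heterogeneous network embedding[J]. Journal of Hebei University of Science and Technology, 2020, 41(3): 233-241.
Authors: WANG Jianxia  ZHANG Yuxuan  XU Yunfeng
Affiliation: School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018 (all three authors)
Funding: China Scholarship Council local cooperation project (201808130283); Ministry of Education of China artificial intelligence collaborative education project (201801003011); Hebei University of Science and Technology university-level project (82/1182108); Hebei University of Science and Technology haze and air pollution prevention and control research project (82/1182169); Hebei Province science and technology support program (17210104D, 18210109D); Hebei Province higher education science and technology research project (ZD2015099); Hebei Province high-level talent funding project (A2016002015)
Abstract: To resolve the ambiguity of authors who share the same name in large academic databases, a name entity disambiguation model based on meta-path heterogeneous network embedding is proposed. Using a public dataset from DBLP, a large online academic search system, feature attributes such as author information, title, and conference or journal name are first extracted from academic publications. Word embeddings of these attributes, generated with the word2vec tool, are then fed into a GRU network for training, and a PHNet matrix network is constructed on which random walks are performed to capture the relationships between different types of nodes. Finally, similar nodes are grouped to complete the name disambiguation. Experimental results show that the new method achieves a precision of 0.865, a recall of 0.792, and an F1 score of 0.815. The meta-path-based heterogeneous network embedding model outperforms the comparison models on precision, recall, and other metrics. The proposed model therefore shows good application prospects for improving disambiguation precision in large academic databases.

Keywords: natural language processing  computer neural network  entity disambiguation  network embedding  heterogeneous network
Received: 2020/3/25
Revised: 2020/5/25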
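
The random-walk step named in the abstract can be illustrated with a minimal sketch. It is not the authors' PHNet implementation: the toy papers, the node types (P for paper, A for coauthor, V for venue), the function names, and the choice to leave the ambiguous name itself out of the graph are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy records (paper id -> authors, venue); not taken from DBLP.
PAPERS = {
    "p1": (["Wei Wang", "Li Chen"], "KDD"),
    "p2": (["Wei Wang", "Li Chen"], "ICDM"),
    "p3": (["Wei Wang", "Ana Silva"], "KDD"),
}


def build_graph(papers, target_name):
    """Build typed adjacency lists keyed by (node, neighbor_type).

    The ambiguous name `target_name` is skipped (an assumption): it appears
    on every paper, so it would connect all papers and carry no signal.
    """
    adj = defaultdict(list)
    for pid, (authors, venue) in papers.items():
        paper = ("P", pid)
        for name in authors:
            if name == target_name:
                continue
            author = ("A", name)
            adj[(paper, "A")].append(author)
            adj[(author, "P")].append(paper)
        venue_node = ("V", venue)
        adj[(paper, "V")].append(venue_node)
        adj[(venue_node, "P")].append(paper)
    return adj


def metapath_walk(adj, start, metapath, length, rng):
    """Random walk from `start` whose node types follow `metapath` cyclically.

    `metapath` is a string such as "PAP" or "PVP"; its first and last types
    must match so the pattern can repeat.
    """
    if metapath[0] != metapath[-1] or start[0] != metapath[0]:
        raise ValueError("metapath must be cyclic and begin at the start type")
    steps = metapath[1:]  # node types to visit after the start, repeated
    walk = [start]
    i = 0
    while len(walk) < length:
        candidates = adj.get((walk[-1], steps[i % len(steps)]), [])
        if not candidates:
            break  # dead end: no neighbor of the required type
        walk.append(rng.choice(candidates))
        i += 1
    return walk


adj = build_graph(PAPERS, target_name="Wei Wang")
rng = random.Random(0)
for path in ("PAP", "PVP"):
    print(path, metapath_walk(adj, ("P", "p1"), path, 5, rng))
```

In pipelines of this kind, the walks are usually passed to a skip-gram model to learn node embeddings, and the paper embeddings are then clustered so that each cluster corresponds to one real author. Those two steps are omitted here.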

Disambiguation method of name entities embedded in meta-path heterogeneous networks
WANG Jianxia, ZHANG Yuxuan, XU Yunfeng. Disambiguation method of name entities embedded in meta-path heterogeneous networks[J]. Journal of Hebei University of Science and Technology, 2020, 41(3): 233-241.
Authors: WANG Jianxia  ZHANG Yuxuan  XU Yunfeng
Abstract: To disambiguate authors who share the same name in large academic databases, a name entity disambiguation model based on meta-path heterogeneous network embedding was proposed. Based on public data from the large online academic search system DBLP, the author information, title, conference or journal name, and other feature attributes of academic publications were extracted first. Word embeddings of these attributes, generated with the word2vec tool, were then fed into a GRU network for training, and a PHNet matrix network was constructed for random walks that capture the relationships between different types of nodes. Finally, similar nodes were grouped to complete the name disambiguation. The experimental results show that the precision of the method is 0.865, the recall is 0.792, and the F1 score is 0.815. The meta-path-based heterogeneous network embedding model is superior to the comparison models in terms of precision, recall, and other metrics. Therefore, the proposed model has good application prospects for improving disambiguation precision in large academic databases.
Keywords: natural language processing  computer neural network  entity disambiguation  network embedding  heterogeneous network
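
The abstract does not say how the three scores were aggregated. As a quick arithmetic check, the sketch below computes the F1 implied by the reported precision and recall if both were single pooled values. The result is about 0.827, not the reported 0.815. Such a gap would be expected if the scores were averaged over individual author names, since the mean of per-name F1 values need not equal the F1 of the mean precision and recall. That reading is an assumption, not something the abstract states.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract.
reported_precision, reported_recall, reported_f1 = 0.865, 0.792, 0.815

pooled_f1 = f1_score(reported_precision, reported_recall)
print(f"F1 from reported precision and recall: {pooled_f1:.3f}")
print(f"F1 reported in the abstract:           {reported_f1:.3f}")
```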
This article is indexed in CNKI, Wanfang Data, and other databases.