Research on author name disambiguation method based on machine learning
Original title: 基于机器学习的论文作者名消歧方法研究
Cite this article: Deng Kejun, Hua Kai, Deng Changming, Jiang Ning, Yuan Ling, Peng Yiming, Zhang Zhikun. Research on author name disambiguation method based on machine learning[J]. Journal of Sichuan University (Natural Science Edition), 2019, 56(2): 241-245
Authors: Deng Kejun, Hua Kai, Deng Changming, Jiang Ning, Yuan Ling, Peng Yiming, Zhang Zhikun
Affiliation: Computer Center, Peking University (all authors)
Abstract: This paper proposes an automatic method for disambiguating article author names based on rule matching and machine learning. For each article, candidate authors are first determined using manually constructed name-matching rules. When multiple candidates remain, features are extracted from the article's attribute information (such as co-authors, title, abstract, keywords, and publication name), and a suitable machine learning algorithm is then applied to disambiguate. The experimental results show that K-nearest neighbor and Softmax classifiers are better suited to the author name disambiguation task than the other models tested. In addition, extracting features from the author information separately from the article's other information effectively improves the accuracy of author name disambiguation.

Keywords: author name disambiguation; machine learning; text feature extraction
Received: 2018-10-17
Revised: 2018-12-10
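
The two-stage pipeline described in the abstract (rule-based candidate matching, then a classifier over article features) can be illustrated with a minimal sketch. Everything concrete here is an assumption, since the abstract does not give the paper's actual rules, features, or data: the name rule (case-insensitive, order-insensitive token match), the bag-of-words cosine similarity, the equal weights, the author IDs, and the toy records are all hypothetical. Only K-nearest neighbor is shown, not Softmax, and the text features use title and publication name only for brevity. Co-author features and text features are built separately, in the spirit of the abstract's last finding.

```python
import math
from collections import Counter


def name_key(name):
    """Hypothetical matching rule: ignore case, commas, and token order."""
    return tuple(sorted(name.replace(",", " ").lower().split()))


def candidates(byline_name, registry):
    """Stage 1: return IDs of known authors whose name matches the byline."""
    key = name_key(byline_name)
    return [aid for aid, known in registry.items() if name_key(known) == key]


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def features(paper):
    """Build co-author features and text features as two separate vectors."""
    coauthors = Counter(name_key(n) for n in paper["coauthors"])
    text = Counter(" ".join([paper["title"], paper["venue"]]).lower().split())
    return coauthors, text


def similarity(p, q, w_coauthor=0.5, w_text=0.5):
    """Weighted sum of the two per-group similarities (weights are assumed)."""
    ca_p, tx_p = features(p)
    ca_q, tx_q = features(q)
    return w_coauthor * cosine(ca_p, ca_q) + w_text * cosine(tx_p, tx_q)


def knn_disambiguate(paper, byline_name, registry, labeled, k=3):
    """Stage 2: if several candidates remain, vote among the k nearest labeled papers."""
    cands = candidates(byline_name, registry)
    if len(cands) <= 1:
        return cands[0] if cands else None
    neighbours = sorted(
        ((similarity(paper, q), aid) for aid, q in labeled if aid in cands),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    votes = Counter(aid for _, aid in neighbours)
    return votes.most_common(1)[0][0] if votes else None


# Toy, made-up data: two distinct people share the same name tokens.
registry = {"u1": "Wei Zhang", "u2": "Zhang Wei", "u3": "Li Na"}
labeled = [
    ("u1", {"coauthors": ["Li Na"],
            "title": "graph neural networks for text classification",
            "venue": "journal of machine learning"}),
    ("u1", {"coauthors": ["Li Na", "Chen Bo"],
            "title": "text feature extraction with neural networks",
            "venue": "journal of machine learning"}),
    ("u2", {"coauthors": ["Wang Fang"],
            "title": "seismic wave propagation in layered rock",
            "venue": "geophysics letters"}),
]
new_paper = {"coauthors": ["Li Na"],
             "title": "neural text classification",
             "venue": "journal of machine learning"}

result = knn_disambiguate(new_paper, "Zhang Wei", registry, labeled, k=3)
print(result)
```

Keeping the co-author vector apart from the text vector means a shared collaborator is not diluted by a long title or abstract; the weights `w_coauthor` and `w_text` would need tuning on labeled data.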
This article is indexed in CNKI and other databases.