
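Distant supervision for entity relation extraction based on an attention mechanism
Cite this article: XING Yixue, ZHU Yonghua, GAO Haiyan, ZHOU Jin, ZHANG Ke. Distant supervision for entity relation extraction based on an attention mechanism[J]. Journal of Shanghai University (Natural Science), 2021, 27(5): 983-992. DOI: 10.12066/j.issn.1007-2861.2197
Authors: XING Yixue  ZHU Yonghua  GAO Haiyan  ZHOU Jin  ZHANG Ke
Affiliation: Shanghai Film Academy, Shanghai University, Shanghai 200072; School of Life Sciences, Shanghai University, Shanghai 200444
Funding: National Key R&D Program of China, 13th Five-Year Plan (2017YFD0400101)
Abstract: Relation extraction is a key step in many information extraction systems and aims to mine structured facts from text. Two problems arise when traditional distant supervision methods are applied to entity relation extraction: (1) distant supervision heuristically aligns text in the corpus with a knowledge base in which entities and the relations between them are already annotated, and treats the alignment results as labeled data for the text, which inevitably leads to wrong labels; (2) current statistical methods rely too heavily on natural language processing tools to extract ...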

Keywords: entity relation extraction  attention mechanism  deep learning  distant supervision
Received: 2019-08-20
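
The wrong-label problem named in the abstract can be illustrated with a minimal sketch of heuristic alignment. The toy knowledge base, the sentences, and the function name `distant_label` are hypothetical and are not taken from the paper; the sketch assumes the common rule that any sentence mentioning both entities of a known pair inherits that pair's relation.

```python
# Toy knowledge base: (head entity, tail entity) -> relation.
# Hypothetical data for illustration only.
KB = {("Barack Obama", "Hawaii"): "born_in"}


def distant_label(sentences, kb):
    """Label each sentence with the relation of every KB pair it mentions."""
    labeled = []
    for sent in sentences:
        for (head, tail), relation in kb.items():
            # Heuristic alignment: co-occurrence of both entities is
            # treated as evidence that the sentence expresses the relation.
            if head in sent and tail in sent:
                labeled.append((sent, head, tail, relation))
    return labeled


sentences = [
    "Barack Obama was born in Hawaii.",
    "Barack Obama visited Hawaii last week.",  # does not express born_in
    "Hawaii is a popular destination.",        # only one entity, skipped
]

labeled = distant_label(sentences, KB)
for row in labeled:
    print(row)
```

The second sentence receives the `born_in` label even though it does not express that relation, which is the kind of noisy annotation the abstract refers to.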

Distant supervision for relation extraction via attention CNNs
XING Yixue, ZHU Yonghua, GAO Haiyan, ZHOU Jin, ZHANG Ke. Distant supervision for relation extraction via attention CNNs[J]. Journal of Shanghai University (Natural Science), 2021, 27(5): 983-992. DOI: 10.12066/j.issn.1007-2861.2197
Authors: XING Yixue  ZHU Yonghua  GAO Haiyan  ZHOU Jin  ZHANG Ke
Affiliation: 1. Shanghai Film Academy, Shanghai University, Shanghai 200072, China; 2. School of Life Sciences, Shanghai University, Shanghai 200444, China
Abstract: Relation extraction is a key step in many information extraction systems and aims to mine structured facts from text. Two problems arise when traditional distant supervision methods are applied to entity relation extraction. First, distant supervision heuristically aligns text in the corpus with an existing knowledge base annotated with entities and relations, and treats the alignment results as labeled data; this inevitably introduces labeling errors. Second, current statistical methods rely heavily on natural language processing tools to extract features, and the noise that accumulates along the pipeline significantly degrades the extraction results. This study proposes an end-to-end convolutional neural network (CNN) based on an attention mechanism. First, an attention mechanism is added to the input layer to automatically detect more subtle clues and learn which parts of a sentence are relevant to relation extraction. Second, the sentence is encoded using position features and word features, and a piecewise CNN (PCNN) extracts sentence features and classifies relations. Finally, a more efficient max-margin loss function is applied to the network. On the New York Times dataset, the accuracy of this method is 2.0% higher than that of the classical PCNN+MIL model and 1.0% higher than that of the classical APCNN+D model. The experimental results show that the proposed model compares favorably in accuracy with the other baseline models.
Keywords: entity relation extraction  attention mechanism  deep learning  distant supervision
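
Two of the components named in the abstract, piecewise pooling and a max-margin loss, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the segment boundaries (each entity token closes its segment), the zero fill for empty segments, and the generic hinge-style ranking loss are all assumptions, since the abstract does not give the exact formulation.

```python
def piecewise_max_pool(feature_map, head_pos, tail_pos):
    """Max-pool one convolution filter's outputs over three segments.

    feature_map: list of floats, one value per token position.
    head_pos, tail_pos: token indices of the two entities (head_pos < tail_pos).
    Assumed segments: start..head, head+1..tail, tail+1..end.
    """
    segments = [
        feature_map[: head_pos + 1],
        feature_map[head_pos + 1 : tail_pos + 1],
        feature_map[tail_pos + 1 :],
    ]
    # Assumption: an empty segment (entity at the sentence edge) yields 0.0.
    return [max(seg) if seg else 0.0 for seg in segments]


def max_margin_loss(scores, gold, margin=1.0):
    """Generic hinge-style ranking loss over relation scores.

    Penalizes the model when the gold relation's score does not exceed
    the best competing relation's score by at least `margin`.
    """
    best_wrong = max(s for i, s in enumerate(scores) if i != gold)
    return max(0.0, margin - scores[gold] + best_wrong)


# Hypothetical filter outputs for a six-token sentence.
pooled = piecewise_max_pool([0.1, 0.7, 0.3, 0.9, 0.2, 0.4], head_pos=1, tail_pos=3)
loss = max_margin_loss([2.0, 0.5, 1.5], gold=0)
print(pooled, loss)
```

Pooling each segment separately keeps a coarse record of where a feature fired relative to the two entities, which a single max over the whole sentence would discard.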