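Distantly Supervised Entity Relation Extraction Based on an Attention Mechanism
Cite this article: XING Yixue, ZHU Yonghua, GAO Haiyan, ZHOU Jin, ZHANG Ke. Distantly supervised entity relation extraction based on an attention mechanism[J]. Journal of Shanghai University (Natural Science), 2021, 27(5): 983-992.
Authors: XING Yixue  ZHU Yonghua  GAO Haiyan  ZHOU Jin  ZHANG Ke
Affiliations: 1. Shanghai Film Academy, Shanghai University, Shanghai 200072; 2. School of Life Sciences, Shanghai University, Shanghai 200444
Funding: National Key Research and Development Program of China, 13th Five-Year Plan (2017YFD0400101)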
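Abstract: Relation extraction is a key step in many information extraction systems and aims to mine structured facts from text. Applying traditional distant supervision to entity relation extraction raises two problems: (1) distant supervision heuristically aligns the text in a corpus with a knowledge base in which entities and the relations between them are already annotated, and treats the alignment results as labeled data for the text, which inevitably produces wrong labels; (2) current statistics-based methods depend too heavily on natural language processing tools, and the noise accumulated during feature extraction seriously degrades the extraction results. To address these drawbacks of distant supervision, an end-to-end, attention-based piecewise recurrent convolutional neural network (CNN) model is proposed. To detect subtler features, an attention mechanism is added at the input layer of the network to automatically learn the parts of a sentence that are relevant to relation extraction; sentences are encoded using position features and word-embedding features, a piecewise CNN (PCNN) extracts sentence features for classification, and a relatively efficient max-margin loss function is used in the network to measure model performance. On the New York Times (NYT) dataset, the accuracy of the method is 2.0% higher than that of the classic PCNN+MIL model and 1.0% higher than that of the classic APCNN+D model; compared with several other baseline models, the model achieves excellent accuracy.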

Keywords: entity relation extraction  attention mechanism  deep learning  distant supervision
Received: 2019-08-20

Distant supervision for relation extraction via attention CNNs
XING Yixue, ZHU Yonghua, GAO Haiyan, ZHOU Jin, ZHANG Ke. Distant supervision for relation extraction via attention CNNs[J]. Journal of Shanghai University (Natural Science), 2021, 27(5): 983-992.
Authors:XING Yixue  ZHU Yonghua  GAO Haiyan  ZHOU Jin  ZHANG Ke
Institution:1. Shanghai Film Academy, Shanghai University, Shanghai 200072, China;2. School of Life Sciences, Shanghai University, Shanghai 200444, China
Abstract: Relation extraction is a key step in many information extraction systems and aims to mine structured facts from text. However, two problems arise when traditional distant supervision methods are applied to entity relation extraction. First, distant supervision heuristically aligns the text in a corpus with an existing knowledge base annotated with entities and relations, and the alignment results are then treated as annotation data; this inevitably introduces labeling errors. Second, current statistical methods rely heavily on natural language processing tools to extract features, and the noise that accumulates during this process significantly degrades the extraction results. In this study, an end-to-end, attention-based convolutional neural network (CNN) is proposed. First, an attention mechanism is added to the input layer to detect subtler clues and to learn automatically which parts of a sentence are relevant to relation extraction. Second, each sentence is encoded using position features and word features, and a piecewise CNN (PCNN) extracts sentence features and classifies relations. Finally, a relatively efficient max-margin loss function is used in the network. On the New York Times dataset, the accuracy of this method is 2.0% higher than that of the classical PCNN+MIL model and 1.0% higher than that of the classical APCNN+D model. The experimental results therefore show that the proposed model achieves excellent accuracy compared with the other baseline models.
Keywords:entity relation extraction  attention mechanism  deep learning  distant supervision  
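
The two components named in the abstract, piecewise pooling in a PCNN and a max-margin loss, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact formulation, so the function names `piecewise_max_pool` and `max_margin_loss`, the zero vector used for an empty segment, and the hinge form of the loss are all assumptions made for illustration. The sketch assumes a convolution output of shape (sequence length, number of filters) and splits it into three segments at the two entity positions.

```python
import numpy as np


def piecewise_max_pool(feature_map, head_pos, tail_pos):
    """Max-pool a (seq_len, n_filters) feature map in three segments.

    The segments are: tokens up to and including the first entity,
    tokens after it up to and including the second entity, and the rest.
    Hypothetical helper; segment boundaries are an assumption.
    """
    seq_len, n_filters = feature_map.shape
    first, second = sorted((head_pos, tail_pos))
    bounds = [(0, first + 1), (first + 1, second + 1), (second + 1, seq_len)]
    pooled = []
    for start, end in bounds:
        if end > start:
            pooled.append(feature_map[start:end].max(axis=0))
        else:
            # Assumption: an empty segment contributes a zero vector.
            pooled.append(np.zeros(n_filters))
    return np.concatenate(pooled)


def max_margin_loss(scores, gold, margin=1.0):
    """Hinge-style margin loss over relation scores (assumed form).

    Penalizes the model unless the gold relation's score exceeds the
    best-scoring wrong relation by at least `margin`.
    """
    negatives = np.delete(scores, gold)
    return float(max(0.0, margin - scores[gold] + negatives.max()))


if __name__ == "__main__":
    fmap = np.arange(12, dtype=float).reshape(6, 2)  # 6 tokens, 2 filters
    print(piecewise_max_pool(fmap, head_pos=1, tail_pos=3))
    print(max_margin_loss(np.array([0.2, 0.9, 0.5]), gold=1))
```

The pooled vector has three times as many entries as there are filters, which is what lets the classifier see separate evidence from the text before, between, and after the two entities. The attention layer at the input and the multi-instance handling of noisy labels are omitted here.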