
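A Neural Paraphrase Identification Model Based on Syntactic Structure (基于句法结构的神经网络复述识别模型)
Cite this article: 刘明童, 张玉洁, 徐金安, 陈钰枫. 基于句法结构的神经网络复述识别模型[J]. 北京大学学报(自然科学版), 2020, 56(1): 45-52.
Authors: 刘明童 (LIU Mingtong)  张玉洁 (ZHANG Yujie)  徐金安 (XU Jin’an)  陈钰枫 (CHEN Yufeng)
Affiliation: School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044
Funding: Supported by the National Natural Science Foundation of China (61876198, 61976015, 61370130, 61473294), the Fundamental Research Funds for the Central Universities (2018YJS025), the Beijing Natural Science Foundation (4172047), and the International Science and Technology Cooperation Program of the Ministry of Science and Technology (K11F100010)
Abstract (translated from the Chinese): Existing methods for computing paraphrase semantics do not take syntactic structure into account. To address this, a neural paraphrase identification model based on syntactic structure is proposed. A tree-structured neural network performs semantic composition, extending semantic representations from the word level to the phrase level. Building on these phrase-level representations, a syntactic tree alignment mechanism is proposed that extracts features with cross-sentence attention. Finally, a self-attention mechanism is designed to enhance the semantic representations and capture global context. Evaluated on Quora, a public English paraphrase identification dataset, the model improves paraphrase identification performance and reaches 89.3% accuracy. This demonstrates that the proposed syntax-based semantic composition method, together with cross-sentence attention and self-attention over phrase-level semantic representations, is effective in improving paraphrase identification.

Keywords: paraphrase identification  syntactic structure  tree-structured neural network  attention mechanism
Received: 2019-05-22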

A Neural Paraphrase Identification Model Based on Syntactic Structure
LIU Mingtong, ZHANG Yujie, XU Jin’an, CHEN Yufeng. A Neural Paraphrase Identification Model Based on Syntactic Structure[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2020, 56(1): 45-52.
Authors: LIU Mingtong  ZHANG Yujie  XU Jin’an  CHEN Yufeng
Affiliation: School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044
Abstract: Paraphrase identification requires natural language semantic understanding. Most previous methods treat sentences as sequential structures and use sequential neural networks for semantic composition, so they do not consider the influence of syntactic structure on semantic computation. In this paper, we propose a neural paraphrase identification model based on syntactic structure and design a tree-structured neural network for semantic composition, which extends the semantic representation from the word level to the phrase level. We further propose a syntactic tree alignment mechanism based on phrase-level semantic representations, which extracts features using cross-sentence attention. Finally, a self-attention mechanism enhances the semantic representation and models context information on top of the syntactic structure. Experiments on the Quora paraphrase dataset show that the model improves paraphrase identification performance, reaching 89.3% accuracy. The results further show that the proposed syntax-based semantic composition method, phrase-level cross-sentence attention, and self-attention are effective in improving paraphrase identification.
Keywords: paraphrase identification  syntactic structure  tree-structured neural network  attention mechanism
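
Illustrative sketch (not from the paper): the abstract gives no equations, so the following is only a rough picture of how phrase-level representations and cross-sentence attention might fit together. Everything in it is an assumption made for illustration: the nested-tuple tree encoding, the toy two-dimensional embeddings, the parameter-free mean-plus-tanh composition (standing in for a learned tree-structured network), the dot-product scoring, and the function names `compose`, `softmax`, and `cross_attend`. The self-attention step is omitted.

```python
import math


def compose(tree, embed):
    """Return (root_vector, node_vectors) for a nested-tuple parse tree.

    A leaf is a word string; an internal node is a tuple of subtrees.
    Assumption: a parent vector is tanh of the element-wise mean of its
    children, a stand-in for a learned tree-structured composition.
    """
    if isinstance(tree, str):
        vec = embed[tree]
        return vec, [vec]
    child_vecs, nodes = [], []
    for child in tree:
        vec, sub_nodes = compose(child, embed)
        child_vecs.append(vec)
        nodes.extend(sub_nodes)
    dim = len(child_vecs[0])
    parent = [
        math.tanh(sum(v[i] for v in child_vecs) / len(child_vecs))
        for i in range(dim)
    ]
    nodes.append(parent)  # phrase-level vector for this subtree
    return parent, nodes


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(nodes_a, nodes_b):
    """For each node of tree A, build a soft-aligned summary of tree B.

    Scores are plain dot products (an assumption); each summary is the
    attention-weighted average of B's node vectors.
    """
    aligned = []
    for a in nodes_a:
        weights = softmax([sum(x * y for x, y in zip(a, b)) for b in nodes_b])
        aligned.append(
            [sum(w * b[i] for w, b in zip(weights, nodes_b)) for i in range(len(a))]
        )
    return aligned


# Toy embeddings and two hand-written binary parse trees (hypothetical data).
embed = {
    "how": [0.1, 0.3],
    "learn": [0.9, 0.2],
    "study": [0.8, 0.3],
    "python": [0.4, 0.8],
}
_, nodes_a = compose(("how", ("learn", "python")), embed)
_, nodes_b = compose(("how", ("study", "python")), embed)
aligned = cross_attend(nodes_a, nodes_b)
```

Each tree here yields five node vectors (three words, one phrase, one root), so `aligned` pairs every word and phrase of the first sentence with a weighted summary of the second. A classifier would then compare each node with its aligned summary; that comparison step is not shown.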
This article is indexed in CNKI and other databases.