A lightweight text entailment model
Citation: WANG Wei, SUN Cheng-Sheng, WU Shao-Mei, ZHANG Rui, KANG Rui, LI Xiao-Jun. A lightweight text entailment model[J]. Journal of Sichuan University (Natural Science Edition), 2021, 58(5): 052001.
Authors: WANG Wei  SUN Cheng-Sheng  WU Shao-Mei  ZHANG Rui  KANG Rui  LI Xiao-Jun
Institutions: China Electronic Technology Cyber Security Co., Ltd. (WANG Wei, SUN Cheng-Sheng); College of Computer Science, Sichuan University (WU Shao-Mei, ZHANG Rui, KANG Rui); Westone Information Industry Inc. (LI Xiao-Jun)
Funding: Major Project of New-Generation Artificial Intelligence of Sichuan Province (2018GZDZX0039); Key R&D Project of Sichuan Province (2019YFG0521); JG2020125
Abstract: Most existing mainstream textual entailment models encode text with recurrent neural networks and rely on various attention-based inference mechanisms, sometimes supplemented with hand-crafted features, to improve the accuracy of entailment recognition; their complex network structure and the sequential nature of RNNs make both training and inference slow. This paper proposes a lightweight text entailment model in which a self-attention encoder encodes the text vectors, dot-product attention lets the two texts interact, and a convolutional neural network reasons over the interaction features; the number of stacked blocks can be adjusted to the reasoning difficulty of the data. Experiments on several textual entailment datasets show that, while maintaining high recognition accuracy, the model needs only a single block of 665K parameters, and its inference is at least twice as fast as that of other mainstream textual entailment models.

Keywords: attention mechanism  convolutional neural network  lightweight  textual entailment
Received: 2021-06-28
Revised: 2021-07-15
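
The abstract outlines the structure of each block: a self-attention encoder for the two texts, dot-product attention for their interaction, and a convolutional network that reasons over the interaction features, with blocks stacked according to the reasoning difficulty of the data. The PyTorch sketch below illustrates one plausible reading of that pipeline; the hidden size, the residual connection, and the ESIM-style fusion of original and aligned vectors are our assumptions rather than details from the paper, and the embedding layer, pooling, and final classifier are omitted.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class EntailmentBlock(nn.Module):
        """One stackable block: self-attention encoding, dot-product
        interaction between the two texts, then CNN-based inference.
        Hidden size and fusion scheme are illustrative assumptions."""

        def __init__(self, dim: int = 128, heads: int = 4):
            super().__init__()
            # Self-attention encoder, shared by premise and hypothesis.
            self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            # 1-D convolution that reasons over the fused interaction features.
            self.conv = nn.Conv1d(4 * dim, dim, kernel_size=3, padding=1)

        def encode(self, x):
            # Self-attention encoding with a residual connection (our assumption).
            attn_out, _ = self.self_attn(x, x, x)
            return x + attn_out

        def forward(self, p, h):
            p, h = self.encode(p), self.encode(h)
            # Dot-product attention between the two texts.
            scores = torch.bmm(p, h.transpose(1, 2))             # (B, Lp, Lh)
            p_aligned = torch.bmm(F.softmax(scores, dim=-1), h)  # (B, Lp, D)
            h_aligned = torch.bmm(F.softmax(scores.transpose(1, 2), dim=-1), p)
            # ESIM-style fusion of original and aligned vectors (assumption).
            p_feat = torch.cat([p, p_aligned, p - p_aligned, p * p_aligned], dim=-1)
            h_feat = torch.cat([h, h_aligned, h - h_aligned, h * h_aligned], dim=-1)
            # CNN inference; output dim equals input dim, so blocks can stack.
            p_out = F.relu(self.conv(p_feat.transpose(1, 2))).transpose(1, 2)
            h_out = F.relu(self.conv(h_feat.transpose(1, 2))).transpose(1, 2)
            return p_out, h_out

    # Blocks map (B, L, D) back to (B, L, D), so they can be stacked to
    # match the reasoning difficulty of a dataset, as the abstract describes.
    blocks = nn.ModuleList([EntailmentBlock() for _ in range(2)])
    p = torch.randn(2, 20, 128)   # premise token embeddings
    h = torch.randn(2, 15, 128)   # hypothesis token embeddings
    for block in blocks:
        p, h = block(p, h)
    print(p.shape, h.shape)       # torch.Size([2, 20, 128]) torch.Size([2, 15, 128])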
