Text classification model based on essential $n$-grams and gated recurrent neural network
Cite this article: ZHAO Qian, WU Yue, LIU Zongtian. Text classification model based on essential $n$-grams and gated recurrent neural network[J]. Journal of Shanghai University (Natural Science), 2021, 27(3): 544-552.
Authors: ZHAO Qian  WU Yue  LIU Zongtian
Institution: School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
Abstract: This paper proposes a text classification model based on essential $n$-grams and a gated recurrent neural network. First, a simpler and more efficient pooling layer replaces the traditional convolutional layer to extract the essential $n$-grams as important semantic features. Second, a bidirectional gated recurrent unit (GRU) is constructed to capture the global dependency features of the input text. Finally, the two kinds of features are fused, and the fused model is applied to the text classification task. The model is evaluated on multiple public datasets covering sentiment classification and topic classification. Compared with traditional models, the proposed model improves text classification performance: accuracy increases by about 1.95% on the 20newsgroup corpus and by about 1.55% on the Rotten Tomatoes corpus.
Keywords: text classification  gated recurrent unit (GRU)  $n$-grams  natural language processing
Received: 2019-03-27
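
The architecture outlined in the abstract (pooled $n$-gram features fused with bidirectional GRU features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact pooling scheme, so the sketch assumes that each $n$-gram vector is the mean of its word embeddings, followed by max-pooling over positions. All function names, dimensions, the random untrained weights, and the bias-free GRU are hypothetical choices made for illustration only.

```python
import numpy as np

def ngram_max_pool(emb, n):
    """Assumed pooling: average each window of n word vectors, then max over positions."""
    n = min(n, len(emb))  # clamp so short (non-empty) texts still yield one window
    windows = np.stack([emb[i:i + n].mean(axis=0) for i in range(len(emb) - n + 1)])
    return windows.max(axis=0)  # shape (d,)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_last_state(emb, params):
    """Run a standard GRU (biases omitted for brevity) and return the final hidden state."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    h = np.zeros(Uz.shape[0])
    for x in emb:
        z = sigmoid(Wz @ x + Uz @ h)             # update gate
        r = sigmoid(Wr @ x + Ur @ h)             # reset gate
        h_new = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
        h = (1.0 - z) * h + z * h_new
    return h

def init_gru(rng, d, hidden):
    # Three (input, recurrent) weight pairs: update gate, reset gate, candidate.
    shapes = [(hidden, d), (hidden, hidden)] * 3
    return [rng.normal(scale=0.1, size=s) for s in shapes]

def build_model(rng, vocab=50, d=8, hidden=6, ns=(1, 2, 3), n_classes=2):
    fused_dim = d * len(ns) + 2 * hidden
    return {
        "E": rng.normal(scale=0.1, size=(vocab, d)),  # word embedding table
        "ns": ns,
        "gru_fwd": init_gru(rng, d, hidden),
        "gru_bwd": init_gru(rng, d, hidden),
        "W_out": rng.normal(scale=0.1, size=(n_classes, fused_dim)),
        "b_out": np.zeros(n_classes),
    }

def classify(token_ids, model):
    """Fuse pooled n-gram features with bidirectional GRU features, then apply softmax."""
    emb = model["E"][token_ids]                               # (T, d)
    ngram_feats = [ngram_max_pool(emb, n) for n in model["ns"]]
    h_fwd = gru_last_state(emb, model["gru_fwd"])             # left-to-right pass
    h_bwd = gru_last_state(emb[::-1], model["gru_bwd"])       # right-to-left pass
    fused = np.concatenate(ngram_feats + [h_fwd, h_bwd])
    logits = model["W_out"] @ fused + model["b_out"]
    exp = np.exp(logits - logits.max())                       # numerically stable softmax
    return exp / exp.sum()
```

Calling `classify(np.array([1, 5, 2, 7, 3]), build_model(np.random.default_rng(0)))` returns a probability vector over the classes. Because the weights are random and untrained, the values only demonstrate the data flow, not the reported accuracy.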
This article is indexed in CNKI and other databases.