Fund projects: National Natural Science Foundation of China (61702066, 61903056); Key Science and Technology Research Project of Chongqing Municipal Education Commission (KJZD-M201900601)

Received: 2021-06-30
Revised: 2022-10-17

Multimodal sentiment analysis of hierarchical interactive fusion based on attention mechanism
LI Wenxue, GAN Chenquan. Multimodal sentiment analysis of hierarchical interactive fusion based on attention mechanism[J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2023, 35(1): 176-184
Authors:LI Wenxue  GAN Chenquan
Affiliation:School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
Abstract: In video-based multimodal sentiment analysis, the same attention mechanism is usually applied at a given semantic level to capture features, without considering how interactive fusion between modalities differs across levels for sentiment classification, which leads to insufficient extraction of inter-modality fusion features. To address this problem, this paper proposes a hierarchical interactive fusion network based on attention mechanism (HFN-AM). First, a bidirectional gated recurrent unit captures the temporal information within each modality. Then, an interactive fusion strategy combining a gating-based attention mechanism and an improved self-attention mechanism extracts features at the sentence and document levels, respectively. Furthermore, an adaptive weight allocation module determines the sentiment contribution of each modality. Finally, the classification result is obtained through a fully connected layer and a Softmax layer. Experimental results on the public CMU-MOSI and CMU-MOSEI datasets show that the presented model improves the accuracy and F1 score of sentiment classification on both datasets.
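The pipeline described in the abstract (per-modality BiGRU encoding, gating-based attention, adaptive modality weighting, then a fully connected layer with Softmax) can be sketched as follows. This is a minimal illustrative sketch in PyTorch, not the authors' exact HFN-AM implementation: the layer sizes, the single-gate attention form, and the class name `HFNAMSketch` are all assumptions for illustration.

```python
import torch
import torch.nn as nn


class HFNAMSketch(nn.Module):
    """Illustrative sketch: BiGRU encoders + gated attention + adaptive
    modality weighting + FC/Softmax classifier (assumed dimensions)."""

    def __init__(self, dims=(300, 74, 35), hidden=64, num_classes=2):
        super().__init__()
        # One bidirectional GRU per modality (e.g. text, audio, visual).
        self.encoders = nn.ModuleList(
            nn.GRU(d, hidden, batch_first=True, bidirectional=True) for d in dims
        )
        # Gating-based attention: a learned score over each time step.
        self.gates = nn.ModuleList(nn.Linear(2 * hidden, 1) for _ in dims)
        # Adaptive weight allocation over the modality summaries.
        self.mod_weights = nn.Linear(2 * hidden, 1)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, seqs):
        # seqs: list of (batch, time, dim) tensors, one per modality.
        summaries = []
        for seq, enc, gate in zip(seqs, self.encoders, self.gates):
            h, _ = enc(seq)                       # (batch, time, 2*hidden)
            a = torch.softmax(gate(h), dim=1)     # attention over time steps
            summaries.append((a * h).sum(dim=1))  # (batch, 2*hidden)
        stacked = torch.stack(summaries, dim=1)   # (batch, 3, 2*hidden)
        # Softmax over modalities gives each one an adaptive contribution.
        w = torch.softmax(self.mod_weights(stacked), dim=1)
        fused = (w * stacked).sum(dim=1)          # weighted modality fusion
        return torch.softmax(self.classifier(fused), dim=-1)
```

In use, the three inputs would be aligned feature sequences such as the text, audio, and visual streams provided with CMU-MOSI/CMU-MOSEI; the feature dimensions above are placeholders.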
Keywords:multimodal sentiment analysis  attention mechanism  hierarchical interactive fusion
