Emotion recognition neural network based on auxiliary modal supervised training
Cite this article: ZOU Jiyun, XU Yunfeng. Emotion recognition neural network based on auxiliary modal supervised training[J]. Journal of Hebei University of Science and Technology, 2020, 41(5): 424-432.
Authors: ZOU Jiyun, XU Yunfeng
Affiliation: School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, China
Funding: Artificial Intelligence Collaborative Education Project of the Ministry of Education of China (201801003011); Hebei University of Science and Technology university-level project (82/1182108)
Received: 2020-09-05
Revised: 2020-09-25

Abstract: To address the imbalance of data samples across modalities in multimodal data, knowledge from the resource-rich text modality is used to help model the resource-poor acoustic modality, and an emotion recognition neural network is constructed whose training is supervised by the similarity between the auxiliary modalities. First, a neural network built around a bidirectional gated recurrent unit (bi-GRU) learns the initial feature vectors of the text and acoustic modalities separately. Second, a SoftMax function produces the emotion prediction, while a fully connected layer generates the target feature vector for each of the two modalities. Finally, the similarity between the two target feature vectors is computed and used as an auxiliary supervision signal during training to improve emotion recognition performance. The results show that the network performs four-class emotion classification on the IEMOCAP dataset with a weighted accuracy of 82.6% and an unweighted accuracy of 81.3%. The results provide a reference for emotion recognition and auxiliary modeling in the multimodal field of artificial intelligence.
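
The training objective outlined in the abstract (a SoftMax classification loss per modality plus a similarity term between the two target feature vectors) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state which similarity measure or loss weighting is used, so cosine similarity, the `sim_weight` parameter, and all function names here are assumptions made for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(softmax(logits)[label])


def cosine_similarity(u, v):
    """Cosine similarity (an assumed choice of similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def joint_loss(text_logits, audio_logits, text_target, audio_target,
               label, sim_weight=0.5):
    """Classification loss for both modalities plus a similarity penalty.

    text_target / audio_target stand in for the target feature vectors
    produced by the fully connected layer; sim_weight is hypothetical.
    """
    ce = cross_entropy(text_logits, label) + cross_entropy(audio_logits, label)
    # The penalty is 0 when the two target vectors point the same way.
    sim_penalty = 1.0 - cosine_similarity(text_target, audio_target)
    return ce + sim_weight * sim_penalty
```

Under these assumptions, the penalty term pulls the acoustic target vector toward the text target vector during training, which is one way the text modality could act as an auxiliary supervisor for the acoustic one.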
Keywords: computer neural network; emotion recognition; supervised training; deep learning; multimodal
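
The abstract reports both a weighted and an unweighted accuracy without defining them. In speech emotion recognition work these terms conventionally mean overall accuracy over all samples (weighted) and the mean of per-class recalls (unweighted); the sketch below assumes that convention, which the abstract itself does not confirm. The function names and the toy labels are illustrative only.

```python
from collections import defaultdict


def weighted_accuracy(y_true, y_pred):
    """Overall accuracy: correct predictions divided by all samples."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy, imbalanced example: 6 "neutral" samples and 2 "angry" samples.
y_true = ["neutral"] * 6 + ["angry"] * 2
y_pred = ["neutral"] * 6 + ["angry", "neutral"]

wa = weighted_accuracy(y_true, y_pred)    # 7 of 8 samples correct
ua = unweighted_accuracy(y_true, y_pred)  # mean of recalls 1.0 and 0.5
```

On imbalanced data the two numbers diverge, which is why reporting both is informative for a dataset whose emotion classes are unevenly represented.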
This article is indexed in Wanfang Data and other databases.