
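Action recognition under a 3D multiple attention mechanism
Cite this article: Wu Lijun (吴丽君). Action recognition under a 3D multiple attention mechanism[J]. Journal of Fuzhou University (Natural Science Edition), 2022, 50(1): 47-53.
Author: Wu Lijun (吴丽君)
Affiliation: College of Physics and Information Engineering, Fuzhou University
Funding: Guiding Project of the Fujian Provincial Department of Science and Technology (Grant No. 2019H0006); National Natural Science Foundation of China (Grant No. 51508105)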
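Abstract: To address the difficulty of extracting spatiotemporal information with conventional 3D convolution, a multiple attention mechanism module for 3D convolutional networks is proposed. The module is a multi-dimensional feature adjustment module composed of a channel-combined-with-time submodule and a spatial submodule. In the channel-combined-with-time submodule, the order of the pooling and convolutional layers is adjusted to retain more effective channel and temporal information; in the spatial submodule, redundant temporal information is compressed to reduce computation. The module has a low overall computational cost and can be embedded in various 3D convolutional networks. To verify its performance, the module was designed and deployed on the 3D ResNet network and trained separately on two action recognition datasets, UCF-101 and HMDB-51. The results show that the improved 3D ResNet raises accuracy by 1.50% on UCF-101 and by 1.24% on HMDB-51, while the parameter count increases by only 0.24%.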

Keywords: 3D convolutional network; attention mechanism; action recognition; 3D ResNet
Received: 2020/12/8
Revised: 2021/1/18

Action recognition under 3D multiple attention mechanism
Institution: College of Physics and Information Engineering, Fuzhou University
Abstract: To address the difficulty that traditional 3D convolutional networks have in extracting temporal and spatial information, this paper proposes a multiple attention mechanism module for 3D convolutional networks. Specifically, the module is a multi-dimensional feature adjustment module made up of a channel-combined-time submodule and a spatial-position submodule. In the channel-combined-time attention submodule, the order of the pooling layer and the convolutional layer is adjusted so that more effective channel and temporal information is retained; in the spatial-position attention submodule, redundant temporal information is compressed to reduce the amount of computation. Because the module's overall computational cost is small, it can be embedded in various 3D convolutional networks to improve their performance. To verify the effectiveness and simplicity of the module, this paper builds it into the 3D ResNet network and trains on two action recognition datasets, UCF-101 and HMDB-51. Experimental results show that, compared with the original 3D ResNet, the improved 3D ResNet raises accuracy by 1.50% on UCF-101 and by 1.24% on HMDB-51, while the number of parameters increases by only 0.24%, which supports the effectiveness and simplicity of the proposed multiple attention mechanism.
Keywords: 3D convolutional network; attention mechanism; action recognition; 3D ResNet
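
The two-submodule idea in the abstract can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the abstract does not give the layer layout, so the learned convolutions are replaced here by a plain sigmoid gate on pooled averages, the submodules are assumed to run channel-time first and spatial second, and all function names (`channel_time_attention`, `spatial_attention`, `multiple_attention`) are hypothetical. The input is a single clip stored as nested lists indexed `x[c][t][h][w]`.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_time_attention(x):
    """Scale each (channel, frame) slice by one weight.

    Assumption: the weight is a sigmoid of the spatial average; the paper
    uses learned pooling/convolution layers whose details are not given.
    """
    out = []
    for channel in x:
        new_channel = []
        for frame in channel:
            values = [v for row in frame for v in row]
            # Pool over H x W only, so the channel and time axes are kept.
            weight = sigmoid(sum(values) / len(values))
            new_channel.append([[v * weight for v in row] for row in frame])
        out.append(new_channel)
    return out

def spatial_attention(x):
    """Scale each spatial position by one weight shared by all channels and frames.

    Assumption: the time axis is compressed by simple averaging.
    """
    C, T = len(x), len(x[0])
    H, W = len(x[0][0]), len(x[0][0][0])
    # Collapse the channel and time axes into a single H x W map.
    weights = [
        [sigmoid(sum(x[c][t][h][w] for c in range(C) for t in range(T)) / (C * T))
         for w in range(W)]
        for h in range(H)
    ]
    return [[[[x[c][t][h][w] * weights[h][w] for w in range(W)]
              for h in range(H)]
             for t in range(T)]
            for c in range(C)]

def multiple_attention(x):
    # Assumed order: channel-time submodule first, then the spatial submodule.
    return spatial_attention(channel_time_attention(x))

# Toy clip with C=2 channels, T=3 frames, and 4 x 4 spatial resolution.
clip = [[[[1.0] * 4 for _ in range(4)] for _ in range(3)] for _ in range(2)]
refined = multiple_attention(clip)
```

The output has the same shape as the input, which is what lets a module of this kind be dropped between existing layers of a 3D network. The channel-time step produces one weight per channel and frame, while the spatial step produces one weight per position after the time axis has been averaged away, so its attention map stays small regardless of clip length.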