3D Skeleton-based Human Action Recognition Based on Multi-stream Fusion Network



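Cite this article:	Chen Minrong (陈泯融), Peng Junjie (彭俊杰), Zeng Guoqiang (曾国强). 3D Skeleton-based Human Action Recognition Based on Multi-stream Fusion Network[J]. Journal of South China Normal University (Natural Science Edition), 2023, 55(1): 94-101.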

Authors:	Chen Minrong (陈泯融), Peng Junjie (彭俊杰), Zeng Guoqiang (曾国强)

Affiliation:	1. School of Computer, South China Normal University, Guangzhou 510631

Funding:	National Natural Science Foundation of China (61872153); National Natural Science Foundation of China (61972288)

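Abstract:	Most current 3D skeleton-based human action recognition models built on convolutional neural networks do not fully exploit the geometric features contained in skeleton sequences. To address this shortcoming, this article improves on the AIF-CNN model and proposes a multi-stream fusion network model (MS-CNN). The model adds a new geometric feature (the kernel feature) as an input, which enriches the original features, and adds multi-motion features, which let the model learn more robust global motion information. Finally, ablation experiments are conducted on the NTU RGB+D 60 dataset, and MS-CNN is compared with 19 action recognition models on the NTU RGB+D 60 dataset and with 8 on the NTU RGB+D 120 dataset. The ablation results show that fusing joint features with kernel features gives MS-CNN higher recognition accuracy than fusing them with core features, and that the recognition accuracy of MS-CNN improves as the number of multi-motion features increases. The comparison results show that, under both evaluation strategies, the recognition accuracy of MS-CNN exceeds that of most of the compared models (including the baseline AIF-CNN model).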
Keywords:	human action recognition; 3D skeleton; multi-stream fusion network; convolutional neural network
Received:	2021-10-15
3D Skeleton-based Human Action Recognition Based on Multi-stream Fusion Network

Institution:	1. School of Computer, South China Normal University, Guangzhou 510631, China; 2. College of Cyber Security, Jinan University, Guangzhou 510631, China

Abstract:	Most current 3D skeleton-based human action recognition models built on convolutional neural networks do not fully exploit the geometric features embedded in skeleton sequences. To address this deficiency, a multi-stream fusion network model (MS-CNN) is proposed on the basis of the AIF-CNN model. A geometric feature (the kernel feature) is introduced as an input to MS-CNN to enrich the original features. In addition, multi-motion features are introduced, which allow the model to learn more robust global motion information. Finally, ablation experiments are conducted on the NTU RGB+D 60 dataset, and the MS-CNN model is compared with 19 and 8 action recognition models on the NTU RGB+D 60 and NTU RGB+D 120 datasets, respectively. The ablation results show that the MS-CNN model achieves higher recognition accuracy when joint features are fused with kernel features than when they are fused with core features, and that its recognition accuracy improves as the number of multi-motion features increases. The comparison results show that the MS-CNN model outperforms most of the compared models (including the baseline AIF-CNN model) in recognition accuracy under both evaluation strategies.

Keywords:	human action recognition; 3D skeleton; multi-stream fusion network; convolutional neural network
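
As a rough illustration of what motion features over a skeleton sequence could look like, the sketch below computes frame-to-frame differences of joint coordinates at several temporal strides and concatenates them with the raw joint stream. The abstract does not define the kernel feature or the multi-motion features, so the function names (`motion_features`, `fuse_streams`), the stride values, and the concatenation-based fusion are all assumptions made for illustration, not the MS-CNN design.

```python
import numpy as np


def motion_features(joints, strides=(1, 2, 4)):
    """Temporal differences of joint coordinates at several strides.

    joints:  array of shape (T, J, 3): T frames, J joints, xyz coordinates.
    strides: positive frame offsets (hypothetical values, not from the paper).
    Returns a dict mapping each stride s to an array of shape (T, J, 3) whose
    frame t holds joints[t] - joints[t - s]; the first s frames are zero.
    """
    joints = np.asarray(joints, dtype=np.float64)
    feats = {}
    for s in strides:
        if s <= 0:
            raise ValueError("strides must be positive")
        diff = np.zeros_like(joints)
        # Frames before index s have no predecessor at this stride.
        diff[s:] = joints[s:] - joints[:-s]
        feats[s] = diff
    return feats


def fuse_streams(joints, strides=(1, 2, 4)):
    """Assumed early fusion: stack raw joints and motion streams on the last axis."""
    joints = np.asarray(joints, dtype=np.float64)
    feats = motion_features(joints, strides)
    return np.concatenate([joints] + [feats[s] for s in strides], axis=-1)


# Toy sequence: 5 frames, 2 joints, 3 coordinates.
sequence = np.arange(5 * 2 * 3).reshape(5, 2, 3)
fused = fuse_streams(sequence, strides=(1, 2))
print(fused.shape)
```

With two strides, each joint carries 3 raw coordinates plus 3 per motion stream, so the fused array has 9 channels per joint. A multi-stream network would more likely process each stream in its own branch before merging; concatenation is used here only to keep the sketch short.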
