
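Floor Exercise Video Automatic Description Method Using Long Short-Term Memory Network
Cite this article: HE Feng1,2,3, ZHANG Hongbo1,2,3, DU Jixiang1,2,3, WANG Guanhong1,2,3. Floor Exercise Video Automatic Description Method Using Long Short-Term Memory Network[J]. Journal of Huaqiao University (Natural Science), 2020, 0(6): 808-815. DOI: 10.11830/ISSN.1000-5013.201911047
Authors: HE Feng1,2,3  ZHANG Hongbo1,2,3  DU Jixiang1,2,3  WANG Guanhong1,2,3
Affiliations: 1. College of Computer Science and Technology, Huaqiao University, Xiamen 361021, Fujian, China; 2. Fujian Key Laboratory of Big Data Intelligence and Security, Huaqiao University, Xiamen 361021, Fujian, China; 3. Xiamen Key Laboratory of Computer Vision and Pattern Recognition, Huaqiao University, Xiamen 361021, Fujian, China
Abstract: A method for automatically describing floor exercise videos with a long short-term memory (LSTM) network is proposed. In the video description model S2VT, an LSTM network learns the mapping between word sequences and video frame sequences. An attention mechanism is introduced to improve the S2VT model by increasing the weight of key frames that carry the flip direction, the degrees of rotation, and the body posture, which improves the accuracy of the automatic description of floor exercise videos. A dataset of decomposed floor exercise actions is built, three models are compared on the MSVD dataset and the self-built dataset, and scheduled sampling is used to eliminate the discrepancy between the training decoder and the prediction decoder. The experimental results show that the proposed method improves the accuracy of automatic floor exercise video description.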

Keywords: long short-term memory network  attention mechanism  floor exercise  automatic description
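
The key-frame weighting mentioned in the abstract can be illustrated with a generic soft-attention step. This is only a minimal sketch, not the authors' model: the function names `attention_weights` and `attend`, the plain-list feature vectors, and the idea that each frame already has a scalar relevance score are all assumptions made for illustration. In a real S2VT-style system the scores would come from a learned function of the decoder state and the frame features.

```python
import math

def attention_weights(scores):
    """Softmax over per-frame relevance scores (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(frame_features, scores):
    """Return the attention-weighted sum of frame feature vectors and the weights.

    frame_features: list of equal-length feature vectors, one per frame (hypothetical input)
    scores: one relevance score per frame (hypothetical; learned in a real model)
    """
    weights = attention_weights(scores)
    dim = len(frame_features[0])
    context = [0.0] * dim
    for w, feat in zip(weights, frame_features):
        for i in range(dim):
            context[i] += w * feat[i]
    return context, weights

# Three toy frames; the second one is scored as the "key" frame.
context, weights = attend([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.1, 2.0, 0.1])
```

Frames with higher scores contribute more to the context vector, which is the sense in which key frames receive a larger weight.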

Floor Exercise Video Automatic Description Method Using Long Short-Term Memory Network
HE Feng, ZHANG Hongbo, DU Jixiang, WANG Guanhong. Floor Exercise Video Automatic Description Method Using Long Short-Term Memory Network[J]. Journal of Huaqiao University (Natural Science), 2020, 0(6): 808-815. DOI: 10.11830/ISSN.1000-5013.201911047
Authors: HE Feng  ZHANG Hongbo  DU Jixiang  WANG Guanhong
Affiliation: 1. College of Computer Science and Technology, Huaqiao University, Xiamen 361021, China; 2. Fujian Key Laboratory of Big Data Intelligence and Security, Huaqiao University, Xiamen 361021, China; 3. Xiamen Key Laboratory of Computer Vision and Pattern Recognition, Huaqiao University, Xiamen 361021, China
Abstract: An automatic description method for floor exercise videos based on a long short-term memory (LSTM) network is proposed. In the video description model S2VT, the mapping between word sequences and video frame sequences is learned by an LSTM network. An attention mechanism is introduced to improve the S2VT model: it increases the weight of key frames that contain the turning direction, rotation degree, and body posture, and thereby improves the accuracy of automatic floor exercise video description. A dataset of decomposed floor exercise actions is established, and three models are compared on the MSVD dataset and the self-built dataset. The discrepancy between the training decoder and the prediction decoder is eliminated by scheduled sampling. The experimental results show that the proposed method improves the accuracy of automatic floor exercise video description.
Keywords: long short-term memory network  attention mechanism  floor exercise  automatic description
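
Scheduled sampling, which the abstract uses to close the gap between the training decoder and the prediction decoder, can be sketched roughly as follows. During training the decoder is fed the ground-truth previous word with some probability and its own previous prediction otherwise, and that probability is lowered over time. The helper names `choose_decoder_input` and `linear_decay`, and the linear schedule itself, are assumptions for illustration; the abstract does not state which decay schedule the authors used.

```python
import random

def choose_decoder_input(ground_truth_token, predicted_token, teacher_prob, rng=random):
    """Pick the next decoder input during training.

    With probability teacher_prob the ground-truth token is fed in;
    otherwise the model's own previous prediction is used, as at inference time.
    """
    if rng.random() < teacher_prob:
        return ground_truth_token
    return predicted_token

def linear_decay(epoch, total_epochs, floor=0.0):
    """Assumed schedule: decay the teacher probability linearly from 1.0 down to floor."""
    frac = min(max(epoch / total_epochs, 0.0), 1.0)
    return max(floor, 1.0 - frac)

# Example: halfway through training, ground truth and predictions are mixed evenly.
p = linear_decay(epoch=5, total_epochs=10)
next_input = choose_decoder_input("somersault", "twist", p)
```

Early in training the decoder mostly sees ground-truth words, and later it mostly sees its own outputs, so the training conditions gradually approach those of prediction.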