
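Long-Span Video Compressive Sensing Reconstruction Based on Transformer Architecture
Cite this article: XIA Kaiguo, TIAN Chang, MAO Pengqiang. Long-Span Video Compressive Sensing Reconstruction Based on Transformer Architecture[J]. Journal of PLA University of Science and Technology, 2023(3): 67-72.
Authors: XIA Kaiguo, TIAN Chang, MAO Pengqiang
Affiliations: 1. College of Communications Engineering, Army Engineering University of PLA, Nanjing 210007, Jiangsu; 2. College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, Jiangsu
Funding: National Natural Science Foundation of China (62076251)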
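Abstract: The rapid development of deep learning offers new approaches to video compressive sensing reconstruction. Limited by their network models, existing deep-learning-based compressive sensing reconstruction methods cannot fully exploit the spatial-temporal features of video, and their reconstruction quality is unsatisfactory for video segments longer than 16 frames. Here, a compressive sensing reconstruction network is built with a Transformer network. The Transformer's strength in sequence signal processing is used to construct a spatial-temporal attention extraction module that learns spatial-temporal attention features across video frames and models consecutive frames more effectively, thereby addressing compressive sensing reconstruction of long-span video segments. Experimental results show that the proposed method reaches a reconstruction accuracy above 30 dB on video segments of 32 frames and still performs well, above 27 dB, on segments of 96 frames.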

Keywords: video compressive sensing; deep learning; Transformer; long span
Received: 2023-03-07
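
The abstract describes a spatial-temporal attention extraction module but gives no architectural details. As a rough illustration of the general idea only, the sketch below applies plain scaled dot-product self-attention to patch tokens gathered from all frames, so that each patch can attend to patches in every other frame. The function names, the identity query/key/value projections, and the toy input are assumptions made for illustration; this is not the authors' network.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def spatiotemporal_self_attention(tokens):
    """Scaled dot-product self-attention over tokens from all frames.

    tokens: list of equal-length feature vectors, one per (frame, patch)
    pair, flattened in frame-major order. Queries, keys and values are the
    tokens themselves (identity projections), an illustrative simplification;
    a trained Transformer would use learned projections instead.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    attended = []
    for query in tokens:
        # Similarity of this token to every token in every frame.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in tokens]
        weights = softmax(scores)
        # Weighted average of all tokens, mixing information across space and time.
        mixed = [sum(w * value[d] for w, value in zip(weights, tokens))
                 for d in range(dim)]
        attended.append(mixed)
    return attended


# Hypothetical toy input: 2 frames x 2 patches, 3 features per patch.
frames = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[1.0, 0.1, 0.0], [0.0, 0.0, 1.0]],
]
tokens = [patch for frame in frames for patch in frame]
output = spatiotemporal_self_attention(tokens)
```

Because every token attends to every other token, the cost of this naive form grows quadratically with the number of frames, which is one reason long video segments are hard to model.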

Long-Span Video Compressive Sensing Reconstruction Based on Transformer Architecture
XIA Kaiguo, TIAN Chang, MAO Pengqiang. Long-Span Video Compressive Sensing Reconstruction Based on Transformer Architecture[J]. Journal of PLA University of Science and Technology (Natural Science Edition), 2023(3): 67-72.
Authors: XIA Kaiguo, TIAN Chang, MAO Pengqiang
Abstract: The rapid development of deep learning has opened new approaches to video compressive sensing reconstruction. Constrained by their network models, existing deep-learning-based reconstruction methods cannot fully exploit the spatial-temporal features of video, and their reconstruction quality is poor for video segments longer than 16 frames. To address this, a compressive sensing reconstruction network is built on the Transformer architecture. Exploiting the Transformer's strength in sequence signal processing, a spatial-temporal attention extraction module learns spatial-temporal attention features across video frames. This improves the modeling of consecutive video frames and makes compressive sensing reconstruction of long-span video segments feasible. Experimental results show that the proposed method achieves a reconstruction accuracy above 30 dB on 32-frame video segments and still exceeds 27 dB on 96-frame segments.
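
The abstract reports reconstruction accuracy in dB without naming the metric. Assuming it is peak signal-to-noise ratio (PSNR), the usual measure in this field, and assuming 8-bit frames with a peak value of 255, a minimal sketch of the computation looks like this. The function name and the flat-list pixel representation are illustrative choices, not taken from the paper.

```python
import math


def psnr(reference, reconstruction, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    peak is the maximum possible pixel value (255 for 8-bit frames, an
    assumption here). Returns infinity when the two inputs are identical.
    """
    if not reference or len(reference) != len(reconstruction):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical pixel values: every pixel is off by one gray level, so MSE = 1.
value_db = psnr([10, 20, 30, 40], [11, 19, 31, 39])
```

Under these assumptions, 30 dB corresponds to a root-mean-square error of roughly 8.1 gray levels per pixel, and 27 dB to roughly 11.4.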