Stereo matching algorithm based on joint attention-separable convolution


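Cite this article:	Zhang Wei, Huang Juan, Gu Jinan, Huang Zedong, Li Xingjia, Liu Xing. Stereo matching algorithm based on joint attention-separable convolution[J]. Science Technology and Engineering, 2023, 23(31): 13457-13463.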

Authors:	Zhang Wei, Huang Juan, Gu Jinan, Huang Zedong, Li Xingjia, Liu Xing

Funding:	Key Research and Development Program of Jiangsu Province, Key Project (BE2021016)

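Abstract:	To address the difficulty of making a stereo matching network both lightweight and highly accurate, this paper proposes a new stereo matching algorithm, CSA-Net. In the feature extraction stage, a ResNet-like network extracts features, an atrous spatial pyramid pooling (ASPP) module is trained to enlarge the receptive field and extract multi-scale context information, and a joint attention mechanism (CSM) is added to improve representational ability in the spatial and channel dimensions, attending to important features and suppressing unnecessary ones. In the feature fusion stage, 2D depthwise separable convolution is extended to 3D to replace the standard 3D convolution of the original network, performing convolution separately over the spatial and channel dimensions so as to reduce the parameter count and running time of the feature fusion network. Validation on the KITTI 2012 and KITTI 2015 datasets gives three-pixel matching error rates of 1.44% and 2.24%, respectively, with model running time reduced by nearly 1/3. The proposed model therefore achieves higher matching accuracy and faster running speed than other methods.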
Keywords:	convolutional neural network; stereo matching; depthwise separable convolution; joint attention
Received:	2022/11/25
Revised:	2023/7/28
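
The parameter saving that the abstract attributes to replacing standard 3D convolution with a 3D depthwise separable convolution can be illustrated with a simple weight count. The sketch below uses the general formulas for the two layer types (bias terms ignored). The function names and the layer sizes (32 input channels, 32 output channels, a 3 x 3 x 3 kernel) are hypothetical values chosen for illustration and are not taken from CSA-Net.

```python
def conv3d_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard 3D convolution with a k x k x k kernel (bias ignored)."""
    return k ** 3 * c_in * c_out


def separable_conv3d_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k x k convolution followed by a 1 x 1 x 1 pointwise one."""
    depthwise = k ** 3 * c_in   # one k^3 filter per input channel (spatial step)
    pointwise = c_in * c_out    # 1 x 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 32, 32, 3
standard = conv3d_params(c_in, c_out, k)
separable = separable_conv3d_params(c_in, c_out, k)
ratio = separable / standard

print(standard, separable, round(ratio, 3))  # prints: 27648 1888 0.068
```

For these sizes the ratio equals 1/c_out + 1/k^3, which is the usual way to express the saving of a depthwise separable layer. It is a weight-count ratio for a single layer and should not be read as the running-time reduction reported in the abstract.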
Stereo matching algorithm based on joint attention-separable convolution

Zhang Wei, Huang Juan, Gu Jinan, Huang Zedong, Li Xingjia, Liu Xing. Stereo matching algorithm based on joint attention-separable convolution[J]. Science Technology and Engineering, 2023, 23(31): 13457-13463.

Authors:	Zhang Wei, Huang Juan, Gu Jinan, Huang Zedong, Li Xingjia, Liu Xing

Institution:	Jiangsu University

Abstract:	This paper proposes a new stereo matching algorithm, CSA-Net, to address the difficulty of making a stereo matching network both lightweight and highly accurate. In the feature extraction stage, a ResNet-like network extracts features, and an atrous spatial pyramid pooling (ASPP) module is trained to expand the receptive field and extract multi-scale context information. A joint attention mechanism (CSM) is added to improve representational ability in the spatial and channel dimensions by focusing on important features and suppressing unnecessary ones. In the feature fusion stage, 2D depthwise separable convolution is extended to 3D to replace the standard 3D convolution of the original network, so that convolution is performed separately over the spatial and channel dimensions, which reduces the parameter count and running time of the feature fusion network. Validated on the KITTI 2012 and KITTI 2015 datasets, the proposed model achieves three-pixel matching error rates of 1.44% and 2.24%, respectively, and reduces model running time by nearly 1/3. It thus attains higher matching accuracy and faster running speed than other methods.

Keywords:	convolutional neural network; stereo matching; depthwise separable convolution; joint attention
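
The three-pixel matching error rate quoted in the abstract is, in its basic form, the fraction of pixels whose predicted disparity differs from the ground truth by more than three pixels. The sketch below is a minimal illustration of that idea, not the paper's evaluation code. The function name, the flat-list input format, the convention that a ground-truth value of 0 marks a pixel without ground truth, and the sample disparities are all assumptions of this sketch, and the official benchmark evaluation may apply further criteria that are omitted here.

```python
def three_pixel_error(pred, gt, threshold=3.0):
    """Fraction of valid pixels whose absolute disparity error exceeds threshold.

    pred and gt are equal-length flat sequences of disparities. A ground-truth
    value of 0 is treated as "no ground truth" (an assumption of this sketch).
    """
    valid = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not valid:
        raise ValueError("no valid ground-truth pixels")
    bad = sum(1 for p, g in valid if abs(p - g) > threshold)
    return bad / len(valid)


# Made-up disparities for illustration; the fourth pixel has no ground truth.
pred = [10.0, 20.5, 33.0, 41.0, 7.0]
gt = [10.5, 25.0, 30.5, 0.0, 7.2]
rate = three_pixel_error(pred, gt)

print(f"{rate:.2%}")  # prints: 25.00%
```

Of the four valid pixels, only the second (error 4.5) exceeds the threshold, so the rate is 1/4.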
