Monocular image depth estimation based on multi-scale feature extraction

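Cite this article:	YANG QiaoNing, JIANG Si, JI XiaoDong, YANG XiuHui. Monocular image depth estimation based on multi-scale feature extraction[J]. Journal of Beijing University of Chemical Technology (Natural Science Edition), 2023, 50(1): 97-106.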

Authors:	YANG QiaoNing, JIANG Si, JI XiaoDong, YANG XiuHui

Institution:	College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China

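Abstract:	In current deep-learning-based monocular image depth estimation methods, insufficient feature extraction by the network and the loss of edge information lead to depth maps with inadequate overall accuracy. A monocular image depth estimation method based on multi-scale feature extraction is therefore proposed. First, Res2Net101 is used as the encoder: channels are grouped within a single residual block, and a stepped convolution scheme extracts finer-grained multi-scale features, strengthening feature extraction. Second, a high-pass filter extracts object edges in the image so that edge information is preserved. Finally, a structural similarity loss function is introduced so that the network pays more attention to local image regions during training, improving its feature extraction ability. The method is validated on the NYU Depth V2 indoor scene depth dataset; the experimental results show that it is effective and improves the overall accuracy of the depth map, with a root mean square error (RMSE) of 0.508 and an accuracy of 0.875 at a threshold of 1.25.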
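
The abstract does not say which high-pass filter is used for edge extraction. As a minimal sketch, assuming a 3x3 Laplacian kernel (a common high-pass choice, not confirmed by the source), the idea can be illustrated on a toy image in plain Python. The names `LAPLACIAN` and `high_pass` are hypothetical, and the border handling (borders left at zero) is a simplification.

```python
# Assumed kernel: 4-neighbour Laplacian. The paper's actual filter is not
# given in the abstract, so this is only an illustrative stand-in.
LAPLACIAN = [
    [0, -1, 0],
    [-1, 4, -1],
    [0, -1, 0],
]


def high_pass(image, kernel=LAPLACIAN):
    """Apply a 3x3 kernel to a 2D list of floats; border pixels stay 0.0."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for i in range(1, height - 1):
        for j in range(1, width - 1):
            out[i][j] = sum(
                kernel[di + 1][dj + 1] * image[i + di][j + dj]
                for di in (-1, 0, 1)
                for dj in (-1, 0, 1)
            )
    return out


# Toy 5x5 image (made-up values): dark left part, bright right part,
# i.e. one vertical edge between columns 2 and 3.
image = [[0.0, 0.0, 0.0, 1.0, 1.0] for _ in range(5)]
edges = high_pass(image)
print(edges[2])
```

Flat regions produce a zero response, while the two columns on either side of the step produce responses of opposite sign, which is the behaviour that makes a high-pass filter usable as an edge extractor.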
Keywords:	monocular image depth estimation; multi-scale feature; structural similarity loss function
Received:	2021-12-27

YANG QiaoNing, JIANG Si, JI XiaoDong, YANG XiuHui. Monocular image depth estimation based on multi-scale feature extraction[J]. Journal of Beijing University of Chemical Technology, 2023, 50(1): 97-106.

Authors:	YANG QiaoNing, JIANG Si, JI XiaoDong, YANG XiuHui

Institution:	College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China

Abstract:	The overall accuracy of depth maps produced by current deep-learning-based monocular image depth estimation methods is poor because the network extracts features insufficiently and edge information is lost. In this paper, a monocular image depth estimation method based on multi-scale feature extraction is proposed. First, Res2Net101 is used as the encoder: the channels within a single residual block are split into groups, and a stepped convolution scheme extracts finer-grained multi-scale features, strengthening feature extraction. Second, a high-pass filter extracts object edges in the image to preserve edge information. Finally, a structural similarity loss function is introduced so that the network pays more attention to the depth correlation between adjacent pixels in local areas of the image. The method is verified on the NYU Depth V2 indoor scene depth dataset. The experimental results show that the proposed method is effective and improves the overall accuracy of the depth map: the root mean square error (RMSE) is 0.508, and the accuracy at a threshold of 1.25 is 0.875.
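
The two figures quoted in the abstract are easier to interpret with the metric definitions in hand. The sketch below assumes the definitions commonly used in NYU Depth V2 benchmarks, which the abstract itself does not spell out: RMSE over valid pixels, where lower is better, and threshold accuracy as the fraction of pixels with max(pred/gt, gt/pred) < 1.25, where higher is better. The function names and the toy depth values are illustrative only.

```python
import math


def rmse(pred, gt):
    """Root mean square error between predicted and true depths."""
    if len(pred) != len(gt) or not gt:
        raise ValueError("pred and gt must be non-empty and equally long")
    squared = sum((p - g) ** 2 for p, g in zip(pred, gt))
    return math.sqrt(squared / len(gt))


def threshold_accuracy(pred, gt, threshold=1.25):
    """Fraction of pixels whose depth ratio max(p/g, g/p) is below threshold.

    Assumes all depths are strictly positive (invalid pixels already masked).
    """
    if len(pred) != len(gt) or not gt:
        raise ValueError("pred and gt must be non-empty and equally long")
    hits = sum(1 for p, g in zip(pred, gt) if max(p / g, g / p) < threshold)
    return hits / len(gt)


# Made-up depths in metres for four valid pixels (not data from the paper).
pred = [1.0, 2.0, 3.0, 4.0]
gt = [1.1, 2.0, 4.0, 4.2]
print(rmse(pred, gt), threshold_accuracy(pred, gt))
```

In this toy case three of the four pixels fall within the 1.25 ratio, so the threshold accuracy is 0.75; the third pixel (3.0 predicted against 4.0 true) both fails the threshold and dominates the RMSE.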

Keywords:	monocular image depth estimation; multi-scale feature; structural similarity loss function
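
The structural similarity loss named in the keywords is not specified in detail on this page. The following is a minimal sketch, assuming the standard SSIM index computed over a single local window and a loss of the form 1 - SSIM. The constants `k1 = 0.01` and `k2 = 0.03` are the customary defaults, and `data_range=10.0` (roughly the indoor depth range in metres) is an assumption; the paper's actual window size, weighting, and loss scaling may differ.

```python
def ssim(x, y, data_range=10.0, k1=0.01, k2=0.03):
    """Standard SSIM index between two equally sized windows (flat lists)."""
    if len(x) != len(y) or not x:
        raise ValueError("windows must be non-empty and equally long")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Small constants that stabilise the ratio when means/variances are tiny.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_loss(pred_window, gt_window, **kwargs):
    """Assumed loss form: 0 for identical windows, larger as structure differs."""
    return 1.0 - ssim(pred_window, gt_window, **kwargs)


# Made-up 2x2 depth windows, flattened (not data from the paper).
gt_window = [1.0, 1.2, 1.4, 1.6]
pred_window = [1.0, 1.5, 1.1, 2.0]
print(ssim_loss(pred_window, gt_window))
```

Because the index depends on the local means, variances, and covariance of the two windows, the loss penalises disagreement in how depth varies between neighbouring pixels rather than only per-pixel error.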
