
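Monocular image depth estimation based on multi-scale feature extraction
Cite this article: YANG QiaoNing, JIANG Si, JI XiaoDong, YANG XiuHui. Monocular image depth estimation based on multi-scale feature extraction[J]. Journal of Beijing University of Chemical Technology (Natural Science Edition), 2023, 50(1): 97-106.
Authors: YANG QiaoNing  JIANG Si  JI XiaoDong  YANG XiuHui
Affiliation: College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029
Abstract: In current deep-learning-based methods for monocular image depth estimation, insufficient feature extraction by the network and the loss of edge information leave the depth map with inadequate overall accuracy. This paper therefore proposes a monocular image depth estimation method based on multi-scale feature extraction. First, Res2Net101 is used as the encoder: channels are grouped within a single residual block and convolved in a stepped manner to extract finer-grained multi-scale features, which strengthens feature extraction. Second, a high-pass filter extracts the object edges in the image to preserve edge information. Finally, a structural similarity loss function is introduced so that the network pays more attention to local regions of the image during training, improving its feature extraction ability. The method is validated on the NYU Depth V2 indoor scene depth dataset. The experimental results show that the proposed method is effective and improves the overall accuracy of the depth map: the root mean square error (RMSE) reaches 0.508, and the accuracy at a threshold of 1.25 reaches 0.875.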
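The abstract does not say which high-pass filter is used. As a minimal sketch of the general idea, assuming a 3x3 Laplacian kernel (a common high-pass filter, chosen here for illustration only) and a hypothetical helper named `high_pass`, the following pure-Python example shows how such a filter suppresses flat regions and responds at an intensity step:

```python
# Illustrative only: the 3x3 Laplacian kernel is an assumption,
# not the filter stated in the paper.
LAPLACIAN = [
    [0, -1, 0],
    [-1, 4, -1],
    [0, -1, 0],
]


def high_pass(image, kernel=LAPLACIAN):
    """Slide the kernel over the image and return the 'valid' response.

    Border pixels where the kernel does not fully fit are skipped, so the
    output is smaller than the input by (kernel size - 1) in each dimension.
    """
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - k + 1):
        out_row = []
        for c in range(cols - k + 1):
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += kernel[i][j] * image[r + i][c + j]
            out_row.append(acc)
        out.append(out_row)
    return out


# Hypothetical 5x5 image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edges = high_pass(image)
```

The response is zero where the image is flat and non-zero only next to the step, which is the property that lets a high-pass filter isolate object edges.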

Keywords: monocular image  depth estimation  multi-scale feature  structural similarity loss function
Received: 2021-12-27

Monocular image depth estimation based on multi-scale feature extraction
YANG QiaoNing,JIANG Si,JI XiaoDong,YANG XiuHui.Monocular image depth estimation based on multi-scale feature extraction[J].Journal of Beijing University of Chemical Technology,2023,50(1):97-106.
Authors:YANG QiaoNing  JIANG Si  JI XiaoDong  YANG XiuHui
Institution:College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
Abstract: In current monocular image depth estimation methods based on deep learning, the overall accuracy of the depth map is poor because the network extracts features insufficiently and edge information is lost. In this paper, a monocular image depth estimation method based on multi-scale feature extraction is proposed. Firstly, Res2Net101 is used as the encoder: the channels are grouped within a single residual block, and a stepped convolution scheme is used to extract more fine-grained multi-scale features, strengthening feature extraction. Secondly, a high-pass filter is used to extract the edges of objects in the image to preserve edge information. Finally, a structural similarity loss function is introduced to make the network pay more attention to the depth correlation between adjacent pixels in local areas of the image. The method is verified on the NYU Depth V2 indoor scene depth dataset. The experimental results show that the proposed method is effective and improves the overall accuracy of the depth map: the root mean square error (RMSE) reaches 0.508, and the accuracy at a threshold of 1.25 reaches 0.875.
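The two reported figures are easier to interpret with their definitions in hand. The sketch below assumes the definitions commonly used for depth benchmarks: RMSE is an error, so lower is better, and threshold accuracy is the fraction of pixels whose ratio max(pred/gt, gt/pred) falls below 1.25, so higher is better. The function names and sample depths are hypothetical.

```python
import math


def rmse(pred, target):
    """Root mean square error over paired depth values (lower is better)."""
    if not pred or len(pred) != len(target):
        raise ValueError("pred and target must be non-empty and equal in length")
    squared = sum((p - t) ** 2 for p, t in zip(pred, target))
    return math.sqrt(squared / len(pred))


def threshold_accuracy(pred, target, threshold=1.25):
    """Fraction of pixels with max(pred/target, target/pred) < threshold.

    Assumes all depths are strictly positive. Higher is better.
    """
    if not pred or len(pred) != len(target):
        raise ValueError("pred and target must be non-empty and equal in length")
    hits = sum(1 for p, t in zip(pred, target) if max(p / t, t / p) < threshold)
    return hits / len(pred)


# Hypothetical depths in metres, flattened to 1-D lists.
predicted = [1.0, 2.0, 3.0, 4.0]
ground_truth = [1.0, 2.0, 3.0, 5.5]

print(rmse(predicted, ground_truth))                # 0.75
print(threshold_accuracy(predicted, ground_truth))  # 0.75
```

In this toy case only the last pixel is wrong: its ratio 5.5 / 4.0 = 1.375 exceeds 1.25, so three of four pixels count as accurate.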
Keywords:monocular image  depth estimation  multi-scale feature  structural similarity loss function  
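The structural similarity loss named in the keywords can be illustrated with a small sketch. It assumes the standard SSIM index with the conventional constants K1 = 0.01 and K2 = 0.03, computed over a single flattened patch, and a loss of the form 1 - SSIM. The abstract does not give the paper's exact formulation, window size, or weighting, so those choices and the function names are assumptions.

```python
def ssim(x, y, dynamic_range=1.0):
    """SSIM index of two equally sized patches given as flat lists.

    Simplification: statistics are taken over the whole patch rather than
    over sliding windows, which practical implementations typically use.
    """
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("patches must be non-empty and equal in length")
    c1 = (0.01 * dynamic_range) ** 2  # stabilises the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilises the contrast/structure term
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) * (a - mu_x) for a in x) / n
    var_y = sum((b - mu_y) * (b - mu_y) for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_loss(pred, target, dynamic_range=1.0):
    """Assumed loss form: 0 for identical patches, larger as structure diverges."""
    return 1.0 - ssim(pred, target, dynamic_range)


# Hypothetical normalised depth patches.
target_patch = [0.1, 0.2, 0.3, 0.4]
same_patch = [0.1, 0.2, 0.3, 0.4]
reversed_patch = [0.4, 0.3, 0.2, 0.1]
```

Because the index depends on local means, variances, and covariance rather than on per-pixel differences alone, a patch with the same values in a different spatial arrangement (such as `reversed_patch`) is penalised even though its mean matches the target.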