
Unsupervised depth and optical flow estimation method based on multiple mask techniques
Cite this article:ZHANG Xudong,ZHAO Baigan,WU Guoqing,YAO Jiannan.Unsupervised depth and optical flow estimation method based on multiple mask techniques[J].Journal of University of Shanghai For Science and Technology,2024,46(2):129-137.
Authors:ZHANG Xudong  ZHAO Baigan  WU Guoqing  YAO Jiannan
Institution:School of Information Science and Technology, Nantong University, Nantong 226019, China;School of Mechanical Engineering, Nantong University, Nantong 226019, China
Funding:National Natural Science Foundation of China (51805273);Qing Lan Project of Jiangsu Province and Key Discipline Construction Program of Jiangsu Higher Education Institutions
Abstract:Existing methods in autonomous driving estimate depth and motion inaccurately in complex real-world scenes that contain dynamic objects and occlusions. To address this problem, an unsupervised depth and optical flow estimation method based on multiple mask techniques was proposed, which extracts scene depth, camera pose, and optical flow from monocular video sequences through unsupervised learning. First, several dedicated masks were designed for different outlier types; they suppressed the interference of outliers on the photometric consistency loss function and also served to reject outliers in the pose and optical flow estimation tasks. Second, a pre-trained optical flow estimation network was introduced to help the depth and pose estimation networks make better use of the geometric constraints of the 3D scene, thereby improving joint training performance. Finally, the optical flow estimation network was further trained using the depth and pose information obtained from training together with the computed masks. Experimental results on the KITTI dataset showed that this strategy significantly improved model performance and outperformed other methods of the same type.
Keywords:unsupervised learning  depth estimation  pose estimation  3D reconstruction
Received:2023-09-06
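
The abstract states that masks suppress the influence of outliers on the photometric consistency loss, but it does not specify how the masks are built or what form the loss takes. The following is a minimal sketch of the general idea only, under stated assumptions: images are small grayscale grids, the loss is a plain mean absolute error, and the masks (here called occlusion and dynamic-object masks) are hypothetical binary inputs combined by element-wise AND. The function names `combine_masks` and `masked_photometric_loss` are illustrative and are not taken from the paper.

```python
def combine_masks(*masks):
    """Element-wise AND of binary masks (assumed combination rule).

    A pixel is kept only if every mask marks it as an inlier (1).
    """
    return [[int(all(vals)) for vals in zip(*rows)] for rows in zip(*masks)]


def masked_photometric_loss(target, warped, mask):
    """Mean absolute intensity error over pixels where mask is 1.

    target, warped: 2-D lists of floats (grayscale images of equal size).
    mask: 2-D list of 0/1 values; 0 marks an outlier pixel to ignore.
    """
    total = 0.0
    count = 0
    for t_row, w_row, m_row in zip(target, warped, mask):
        for t, w, m in zip(t_row, w_row, m_row):
            if m:
                total += abs(t - w)
                count += 1
    # Guard against an all-zero mask to avoid division by zero.
    return total / count if count else 0.0
```

With this formulation, a pixel whose warped intensity disagrees with the target because of occlusion or independent motion contributes nothing to the loss once any mask flags it. Practical photometric losses often add a structural similarity term; that is omitted here for brevity.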