Unsupervised depth and optical flow estimation method based on multiple mask techniques


Cite this article:	ZHANG Xudong, ZHAO Baigan, WU Guoqing, YAO Jiannan. Unsupervised depth and optical flow estimation method based on multiple mask techniques[J]. Journal of University of Shanghai for Science and Technology, 2024, 46(2): 129-137.

Authors:	ZHANG Xudong, ZHAO Baigan, WU Guoqing, YAO Jiannan

Institution:	School of Information Science and Technology, Nantong University, Nantong 226019, China; School of Mechanical Engineering, Nantong University, Nantong 226019, China

Funding:	National Natural Science Foundation of China (51805273); Qinglan Project of Jiangsu Province and the Key Discipline Construction Program of Jiangsu Higher Education Institutions

Abstract:	To address the inaccurate estimates that existing autonomous-driving methods produce in complex real-world scenes with dynamic objects and occlusions, an unsupervised depth and optical flow estimation method based on multiple mask techniques is proposed. The method extracts target depth, camera pose, and optical flow from monocular video sequences through unsupervised learning. First, several masks are designed, each targeting a different type of outlier; they suppress the interference of outliers with the photometric consistency loss and remove outliers in both the pose and optical flow estimation tasks. Second, a pre-trained optical flow estimation network is introduced to help the depth and pose estimation networks make better use of the geometric constraints of the 3D scene, which improves joint training. Finally, the optical flow estimation network is further trained using the depth and pose obtained from training together with the computed masks. Experimental results on the KITTI dataset show that this strategy significantly improves model performance and outperforms other methods of the same type.

Keywords:	unsupervised learning; depth estimation; pose estimation; 3D reconstruction

Received:	2023-09-06
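
The abstract describes masks that keep outlier pixels from contaminating the photometric consistency loss, but it does not give the mask definitions or the exact loss. As a rough illustration only, the sketch below assumes binary per-pixel masks that are combined by element-wise product and then used to average an L1 intensity difference over the remaining valid pixels. The names `combine_masks`, `masked_photometric_loss`, `occlusion_mask`, and `dynamic_mask` are hypothetical and are not taken from the paper.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of intensities in [0, 1]


def combine_masks(*masks: Image) -> Image:
    """Element-wise product of binary masks: a pixel stays valid only if every mask keeps it."""
    combined = [row[:] for row in masks[0]]
    for mask in masks[1:]:
        for r, row in enumerate(mask):
            for c, value in enumerate(row):
                combined[r][c] *= value
    return combined


def masked_photometric_loss(target: Image, warped: Image, mask: Image) -> float:
    """Mean absolute intensity difference over the pixels where the mask is 1."""
    total = 0.0
    valid = 0.0
    for t_row, w_row, m_row in zip(target, warped, mask):
        for t, w, m in zip(t_row, w_row, m_row):
            total += m * abs(t - w)
            valid += m
    # Return 0 when no pixel is valid, to avoid dividing by zero.
    return total / valid if valid > 0 else 0.0


# Toy 2x3 frames: the warped source matches the target except at two outlier pixels.
target = [[0.2, 0.4, 0.6], [0.1, 0.3, 0.5]]
warped = [[0.2, 0.4, 0.0], [0.9, 0.3, 0.5]]

# Hypothetical masks, one per assumed outlier type.
occlusion_mask = [[1.0, 1.0, 0.0], [1.0, 1.0, 1.0]]  # drops pixel (0, 2)
dynamic_mask = [[1.0, 1.0, 1.0], [0.0, 1.0, 1.0]]    # drops pixel (1, 0)
all_ones = [[1.0] * 3 for _ in range(2)]

mask = combine_masks(occlusion_mask, dynamic_mask)
unmasked_loss = masked_photometric_loss(target, warped, all_ones)
masked_loss = masked_photometric_loss(target, warped, mask)
print(f"unmasked: {unmasked_loss:.4f}, masked: {masked_loss:.4f}")
```

On this toy input the two mismatching pixels dominate the unmasked loss, while the masked loss is zero because both are excluded. In a real pipeline the warped image would come from view synthesis with the predicted depth and pose, and the masks would be computed rather than hand-written.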
