
基于深度学习的图像抠图技术
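Cite this article:王榕榕,徐树公,黄剑波.基于深度学习的图像抠图技术[J].上海大学学报(自然科学版),2021,28(2):261-269.
Authors:王榕榕  徐树公  黄剑波
Affiliations:1.上海大学 上海电影学院, 上海 200072;2.上海大学 上海先进通信与数据科学研究院, 上海 200444;3.上海大学 上海电影特效工程技术研究中心, 上海 200072
Funding:Shanghai University Film Studies Peak Discipline programme and the research project of the Shanghai Engineering Research Center of Motion Picture Special Effects (16dz2251300)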
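Abstract:Image matting is a foundation of image editing technology and is widely used in film and television post-production and in everyday life. A deep-learning-based image matting network estimates the $\alpha$ value of each pixel from the input original image and its trimap. Building on the original down- and up-sampling matting approach, this work makes two changes. First, because the images in matting datasets differ widely, which tends to slow network convergence, a batch normalisation (BN) layer is added after each convolution layer to normalise its input data; this speeds up model convergence and makes the parameter-update direction better reflect the overall characteristics of the dataset. Second, because the matting task must pay particular attention to object edges, ordinary convolution layers are replaced with deformable convolution layers. A deformable convolution layer adaptively learns the shape of its convolution kernel from the input data, which effectively enlarges the receptive field and gives better predictions in detailed regions.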

Keywords:deep learning  image matting  semantic segmentation  prediction
Received:2020-03-13
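
The abstract refers to a per-pixel $\alpha$ value and to a trimap without defining them. A minimal sketch of the conventional matting formulation follows, assuming the standard compositing model I = αF + (1 − α)B and a three-level trimap; the abstract itself does not state either, and the names `composite`, `apply_trimap` and the label values 0/128/255 are illustrative assumptions rather than the authors' code.

```python
# Assumed trimap labels (a common convention, not taken from the paper).
BACKGROUND, UNKNOWN, FOREGROUND = 0, 128, 255


def composite(alpha, fg, bg):
    """Blend one pixel per colour channel: I = alpha * F + (1 - alpha) * B."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


def apply_trimap(pred_alpha, trimap):
    """Keep the predicted alpha only in UNKNOWN pixels.

    Pixels the trimap marks as certain are forced to 1.0 (foreground)
    or 0.0 (background); predictions elsewhere are clamped to [0, 1].
    """
    result = []
    for alpha_row, trimap_row in zip(pred_alpha, trimap):
        row = []
        for a, t in zip(alpha_row, trimap_row):
            if t == FOREGROUND:
                row.append(1.0)
            elif t == BACKGROUND:
                row.append(0.0)
            else:
                row.append(min(1.0, max(0.0, a)))
        result.append(row)
    return result
```

Under this model, a pixel with α = 0.25 takes a quarter of its colour from the foreground and the rest from the background, which is why accurate α values along hair and other fine edges matter for compositing.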

Image matting based on deep learning
WANG Rongrong,XU Shugong,HUANG Jianbo.Image matting based on deep learning[J].Journal of Shanghai University(Natural Science),2021,28(2):261-269.
Authors:WANG Rongrong  XU Shugong  HUANG Jianbo
Institution:1. Shanghai Film Academy, Shanghai University, Shanghai 200072, China;2. Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai 200444, China;3. Shanghai Engineering Research Center of Motion Picture Special Effects, Shanghai University, Shanghai 200072, China
Abstract:Image matting is the basis of image editing technology and is widely used in film and television post-production and in daily life. In this study, an image matting network based on deep learning is proposed; it estimates the $\alpha$ value of each pixel from the input original image and trimap. The network builds on the original down- and up-sampling matting network. To address the slow convergence caused by the large differences between images in matting datasets, a batch normalisation (BN) layer is added after each convolution layer. The BN layer normalises its input data, which speeds up the convergence of the model and makes the update direction of the parameters more consistent with the overall characteristics of the dataset. Because the matting task must pay particular attention to object edges, deformable convolution layers are used in place of ordinary convolution layers. A deformable convolution layer adaptively learns the shape of its convolution kernel according to the input data, effectively expands the receptive field, and improves prediction in detailed parts of the image.
Keywords:deep learning  image matting  semantic segmentation  prediction  
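
The two changes described in the abstract can be illustrated with a small sketch. It is a simplified, one-dimensional, pure-Python illustration under stated assumptions, not the authors' network: `batch_norm` normalises a single channel over a mini-batch in training mode, and `deformable_conv1d` takes its sampling offsets as an argument, whereas in a real deformable convolution layer the offsets are predicted from the input by a learned layer. All function names here are hypothetical.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise one channel's activations across a mini-batch."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n  # biased variance
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]


def sample_linear(signal, pos):
    """Read `signal` at a fractional position; values outside count as 0."""
    left = math.floor(pos)
    frac = pos - left

    def at(i):
        return signal[i] if 0 <= i < len(signal) else 0.0

    return (1.0 - frac) * at(left) + frac * at(left + 1)


def deformable_conv1d(signal, weights, offsets):
    """1-D deformable convolution with zero padding ('same' output length).

    offsets[i][k] shifts where kernel tap k samples for output position i.
    With all offsets at 0 this reduces to an ordinary convolution
    (cross-correlation, as in deep learning frameworks).
    """
    half = len(weights) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(weights):
            pos = i + (k - half) + offsets[i][k]
            acc += w * sample_linear(signal, pos)
        out.append(acc)
    return out
```

Because each tap can move to a fractional position, the kernel can stretch or bend towards an object edge instead of sampling a fixed grid, which is the sense in which the receptive field is enlarged.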