End-to-end image compression method based on attention modules and discretized Gaussian mixture model



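Cite this article:	ZHU Jun, GAO Chenqiang, CHEN Zhiqian, CHEN Fang. End-to-end image compression method based on attention modules and discretized Gaussian mixture model[J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2020, 32(5): 769-778.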

Authors:	ZHU Jun, GAO Chenqiang, CHEN Zhiqian, CHEN Fang

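Affiliations:	School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065; Chongqing Key Laboratory of Signal and Information Processing, Chongqing 400065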

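Funding:	National Natural Science Foundation of China (61571071, 61906025); Natural Science Foundation of the Chongqing Science and Technology Commission (cstc2018jcyjAX0227)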

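Abstract (translated from the Chinese):	Image compression is one of the fundamental supporting technologies of image processing. In recent years, deep learning has been applied to the image compression task. Redundancy in the latent representation and inaccurate probability estimation often limit further gains in compression performance. To mitigate these problems, an end-to-end image compression method based on an attention mechanism and a discretized Gaussian mixture model is proposed. A global context attention module is embedded in the encoder in order to construct a compact latent representation. In addition, the latent representation is modeled as a parameterized discretized Gaussian mixture model to improve the accuracy of rate estimation. Experimental results show that the proposed algorithm scores higher than traditional methods on both peak signal-to-noise ratio (PSNR) and multi-scale structural similarity (MS-SSIM). In terms of visual perception, the proposed algorithm produces more satisfying compressed images.
Keywords (translated from the Chinese):	image compression; autoencoder; convolutional neural network; deep learning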
Received:	2020-06-30
Revised:	2020-09-14
End-to-end image compression method based on attention modules and discretized Gaussian mixture model

ZHU Jun, GAO Chenqiang, CHEN Zhiqian, CHEN Fang. End-to-end image compression method based on attention modules and discretized Gaussian mixture model[J]. Journal of Chongqing University of Posts and Telecommunications, 2020, 32(5): 769-778.

Authors:	ZHU Jun, GAO Chenqiang, CHEN Zhiqian, CHEN Fang

Abstract:	Image compression is one of the fundamental technologies in the image processing field. In recent years, deep learning has been applied to the image compression task. However, redundancy in the latent representation and inaccurate probability estimation usually limit compression performance. To address these problems, this paper proposes an end-to-end image compression method based on an attention mechanism and a discretized Gaussian mixture model. First, a global context attention module is embedded in the encoder to construct compact latent features. Second, the latent features are modeled as a parameterized discretized Gaussian mixture model to improve the accuracy of rate estimation. Experimental results demonstrate that the proposed method outperforms traditional methods in terms of peak signal-to-noise ratio (PSNR) and multi-scale structural similarity (MS-SSIM). In terms of visual perception, the proposed method produces more satisfying compressed images.

Keywords:	image compression; autoencoder; convolutional neural network; deep learning
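
The abstract states that the latent features are modeled as a parameterized discretized Gaussian mixture model to improve rate estimation, but gives no formulas. The following is a minimal sketch of how such a rate estimate is commonly computed, not the authors' implementation. It assumes that each quantized latent value is an integer, that its probability is the mixture mass on the unit-width bin around it, and that the code length is the negative base-2 logarithm of that mass. The function names, the `eps` floor, and the use of one shared set of mixture parameters for all latents are illustrative assumptions.

```python
import math


def gaussian_cdf(x, mean, scale):
    """CDF of a normal distribution N(mean, scale^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def discretized_gmm_likelihood(y_hat, weights, means, scales, eps=1e-9):
    """Mixture probability mass on the bin [y_hat - 0.5, y_hat + 0.5].

    weights, means and scales are hypothetical per-component parameters;
    weights are assumed to be non-negative and to sum to 1.
    """
    mass = 0.0
    for w, mu, sigma in zip(weights, means, scales):
        upper = gaussian_cdf(y_hat + 0.5, mu, sigma)
        lower = gaussian_cdf(y_hat - 0.5, mu, sigma)
        mass += w * (upper - lower)
    # Floor the mass so that the logarithm below stays finite (assumption).
    return max(mass, eps)


def estimated_bits(latents, weights, means, scales):
    """Estimated code length in bits: sum of -log2 p(y_hat) over the latents."""
    return sum(
        -math.log2(discretized_gmm_likelihood(y, weights, means, scales))
        for y in latents
    )


if __name__ == "__main__":
    # Hypothetical two-component mixture shared by all latent values.
    weights, means, scales = [0.6, 0.4], [0.0, 2.5], [1.0, 0.5]
    latents = [0, 0, 1, 3, -1]
    print(estimated_bits(latents, weights, means, scales))
```

In a learned codec the mixture parameters would typically be predicted per latent element by the network rather than shared, and the estimated bit count would serve as the rate term of the training loss. Whether the paper follows that arrangement cannot be determined from the abstract alone.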

