
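Parallel Pipeline Hardware Design of Intra Rate-Distortion Optimization Prediction Mode in HEVC
Cite this article: LIN Zhijian, DING Yongqiang, YANG Xiuzhi, et al. Parallel Pipeline Hardware Design of Intra Rate-Distortion Optimization Prediction Mode in HEVC[J]. Journal of South China University of Technology (Natural Science Edition), 2023, 51(5): 95-103.
Authors: LIN Zhijian  DING Yongqiang  YANG Xiuzhi
Affiliation: College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, Fujian, China
Funding: General Program of the National Natural Science Foundation of China (61871132); Science and Technology Innovation Team Project of Fujian Provincial Universities (500190)
Abstract: Demand for video data has grown steadily in recent years, and video resolution and frame rate have risen with it. The speed of real-time video compression depends on both: the higher the resolution and frame rate, the longer encoding takes. To enable real-time compression of video sequences at higher resolutions and frame rates, this paper designs a new parallel pipeline hardware architecture for intra rate-distortion optimization prediction modes that supports intra prediction coding of coding tree units up to 64×64. First, a 9-way prediction mode parallel scheme is designed. Second, a pipeline hardware architecture is implemented that takes the 4×4 block as its basic processing unit and follows the Z-shaped scanning order; the prediction data of 32×32 prediction units are reused in place of the prediction data of 64×64 prediction units to reduce the amount of computation. Finally, a new Hadamard transform circuit is proposed on top of this pipeline architecture for efficient pipelined processing. Experimental results show that, on an Altera Arria 10 series field programmable gate array, the 9-way mode parallel architecture occupies only 75 kb of look-up table and 55 kb of register resources, reaches a clock frequency of 207 MHz, and needs only 4096 clock cycles to complete the prediction of one 64×64 coding tree unit, which supports real-time all-I-frame encoding at 1080p resolution and up to 99 f/s. Compared with existing designs, the proposed scheme achieves 1080p real-time video encoding at a higher frame rate with a smaller circuit area.

Keywords: intra prediction  field programmable gate array  mode parallelism  high efficiency video coding
Received: 2022-09-20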
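
The abstract states that the pipeline walks a 64×64 coding tree unit in Z-shaped scanning order with the 4×4 block as its basic unit. The following is a minimal software sketch of that traversal order only, not of the authors' hardware: it assumes the usual Z-order (Morton) convention in which the scan index interleaves the bits of the block's x and y coordinates. The names `z_order_to_xy`, `CTU_SIZE`, and `BLOCK` are illustrative and do not come from the paper.

```python
CTU_SIZE = 64   # coding tree unit edge, in pixels (from the abstract)
BLOCK = 4       # basic processing unit edge, in pixels (from the abstract)


def z_order_to_xy(index, bits=4):
    """Map a Z-scan index to (x, y) block coordinates.

    Assumes Morton ordering: even bits of the index form x, odd bits form y.
    """
    x = y = 0
    for b in range(bits):
        x |= ((index >> (2 * b)) & 1) << b
        y |= ((index >> (2 * b + 1)) & 1) << b
    return x, y


blocks_per_side = CTU_SIZE // BLOCK  # 16 blocks per row and per column
scan = [z_order_to_xy(i) for i in range(blocks_per_side ** 2)]

# The first four entries visit top-left, top-right, bottom-left, bottom-right.
print(scan[:4])
```

One property of this order is that every aligned run of 4, 16, or 64 consecutive indices covers exactly one 8×8, 16×16, or 32×32 region, so a stream of 4×4 blocks naturally completes the larger prediction units in turn.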

Parallel Pipeline Hardware Design of Intra Rate-Distortion Optimization Prediction Mode in HEVC
LIN Zhijian,DING Yongqiang,YANG Xiuzhi,et al.Parallel Pipeline Hardware Design of Intra Rate-Distortion Optimization Prediction Mode in HEVC[J].Journal of South China University of Technology(Natural Science Edition),2023,51(5):95-103.
Authors:LIN Zhijian  DING Yongqiang  YANG Xiuzhi  
Institution:College of Physics and Information Engineering,Fuzhou University,Fuzhou 350108,Fujian,China
Abstract:In recent years, video resolution and frame rate have risen steadily to meet the growing demand for video data. However, the speed of real-time video compression is constrained by both: the higher the frame rate and resolution, the longer encoding takes. To achieve real-time compression encoding of video sequences with higher resolution and frame rate, this paper designs a new parallel pipeline hardware architecture for intra rate-distortion optimization prediction modes, which supports intra prediction coding of coding tree units up to 64×64. Firstly, a 9-way prediction mode parallel scheme is designed. Secondly, a pipeline hardware architecture is implemented with the 4×4 block as the basic processing unit in Z-shaped scanning order, and the prediction data of 32×32 prediction units are reused in place of the prediction data of 64×64 prediction units to reduce the amount of computation. Lastly, a new Hadamard transform circuit is proposed on top of this pipeline architecture for efficient pipelined processing. The experimental results show that, on an Altera Arria 10 series field programmable gate array, the 9-way mode parallel architecture occupies only 75 kb of look-up table and 55 kb of register resources, reaches a clock frequency of 207 MHz, and takes only 4096 clock cycles to complete the prediction of one 64×64 coding tree unit, which supports real-time all-I-frame encoding at 1080p resolution and up to 99 f/s. Compared with existing designs, the proposed scheme achieves 1080p real-time video encoding at a higher frame rate with a smaller circuit area.
Keywords:intra prediction  field programmable gate array  mode in parallel  high efficiency video coding  
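
The 99 f/s figure can be checked against the other numbers quoted in the abstract (207 MHz, 4096 cycles per 64×64 coding tree unit). The sketch below is a back-of-the-envelope calculation, not the authors' evaluation method. It assumes that a 1920×1080 frame is padded to a whole number of coding tree units and that units are processed back to back with no stall cycles; the function name `max_intra_fps` is illustrative.

```python
import math


def max_intra_fps(width, height, ctu_size=64, cycles_per_ctu=4096, clock_hz=207e6):
    """Upper bound on the all-intra frame rate.

    Assumptions (not stated in the abstract): the frame is padded up to
    whole CTUs, and there are no idle cycles between consecutive CTUs.
    """
    ctus_per_frame = math.ceil(width / ctu_size) * math.ceil(height / ctu_size)
    cycles_per_frame = ctus_per_frame * cycles_per_ctu
    return clock_hz / cycles_per_frame


fps = max_intra_fps(1920, 1080)
print(f"{fps:.1f} f/s")  # 30 x 17 = 510 CTUs per frame
```

Under these assumptions a frame needs 510 × 4096 = 2 088 960 cycles, which at 207 MHz gives just over 99 frames per second, consistent with the reported maximum.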