
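Convolutional neural network acceleration system based on FPGA
Cite this article: LI Xiaoyan, ZHANG Xin, YAN Xiaobing, REN Deliang, LI Yanqing, FU Changjuan. Convolutional neural network acceleration system based on FPGA[J]. Journal of Hebei University (Natural Science Edition), 2019, 39(1): 99.
Authors: LI Xiaoyan  ZHANG Xin  YAN Xiaobing  REN Deliang  LI Yanqing  FU Changjuan
Affiliations: College of Telecommunications and Information Engineering, Hebei University, Baoding, Hebei 071002; Baoding Yonghong Foundry Machinery Factory, Baoding, Hebei 072150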
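Abstract (translated from the Chinese): Against the background of deploying convolutional neural networks (CNNs) on field programmable gate arrays (FPGAs), this paper proposes a scheme for parallel hardware acceleration of CNNs. Based on an analysis of the structural characteristics of CNNs, data storage, reading, and movement are carried out in a pipelined manner, and the convolution units within each layer are unrolled to accelerate the multiply-accumulate operations. The parallel structure and pipelined processing characteristic of FPGAs improve computational efficiency considerably. Object classification results on the CIFAR-10 dataset show that, with no loss of accuracy and with the clock running at 800 MHz, the design achieves a speedup of about 4 times over a mid-range Intel processor. Loop unrolling for parallel processing, combined with multi-stage pipelining, accelerates the forward propagation of CNNs and suits the needs of practical engineering tasks.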

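Keywords (translated from the Chinese): field programmable gate array (FPGA)  convolutional neural network  parallelization  pipeline  classification  acceleration
Received: 2018-09-02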

Convolutional neural network acceleration system based on FPGA
LI Xiaoyan,ZHANG Xin,YAN Xiaobing,REN Deliang,LI Yanqing,FU Changjuan.Convolutional neural network acceleration system based on FPGA[J].Journal of Hebei University (Natural Science Edition),2019,39(1):99.
Authors:LI Xiaoyan  ZHANG Xin  YAN Xiaobing  REN Deliang  LI Yanqing  FU Changjuan
Institution:1. College of Telecommunications and Information Engineering, Hebei University, Baoding 071002, China; 2. Baoding Yonghong Foundry Machinery Factory, Baoding 072150, China
Abstract: This paper addresses the deployment of convolutional neural networks (CNNs) on field programmable gate arrays (FPGAs) and proposes a scheme for accelerating CNNs in hardware through parallelism. After analyzing the structural characteristics of CNNs, the design stores, reads, and moves data in a pipelined manner. The convolution units in each layer are then unrolled to speed up the multiply-accumulate operations. The parallel structure of the FPGA, combined with pipelined processing, effectively improves computational efficiency. Object classification results on the CIFAR-10 dataset show that, at an operating frequency of 800 MHz and without loss of accuracy, the FPGA achieves a speedup of about 4 times over a general-purpose processor. Parallel processing through loop unrolling, together with multi-stage pipelining, accelerates the forward propagation of CNNs, which makes the approach suitable for practical engineering tasks.
Keywords: field programmable gate array (FPGA)  convolutional neural network  parallelization  pipeline  classification  acceleration
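
The abstract mentions unrolling the convolution units to speed up the multiply-accumulate operations. As a rough software illustration of that idea (not the authors' hardware design), the Python sketch below compares a plain nested-loop convolution with a version in which the k x k window loop is replaced by k*k independent products reduced through a pairwise adder tree, which is how an unrolled unit is commonly modelled. The function names `conv2d_reference`, `adder_tree`, and `conv2d_unrolled` are hypothetical, and the sketch assumes a single channel, stride 1, no padding, and integer data. As is usual for CNNs, the "convolution" here is computed without flipping the kernel.

```python
def conv2d_reference(image, kernel):
    """Plain nested-loop 2D convolution (no padding, stride 1)."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    out = [[0] * out_w for _ in range(out_h)]
    for r in range(out_h):
        for c in range(out_w):
            acc = 0
            # Sequential multiply-accumulate: one product per step.
            for i in range(k):
                for j in range(k):
                    acc += image[r + i][c + j] * kernel[i][j]
            out[r][c] = acc
    return out


def adder_tree(values):
    """Sum values pairwise, level by level, like a hardware adder tree."""
    level = list(values)
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element passes through to the next level
        level = nxt
    return level[0]


def conv2d_unrolled(image, kernel):
    """Same result, with the window loop unrolled into k*k independent products.

    In hardware, each product would map to its own multiplier and all of them
    would fire in the same cycle; this model only mirrors the data flow.
    """
    k = len(kernel)
    weights = [kernel[i][j] for i in range(k) for j in range(k)]
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            window = [image[r + i][c + j] for i in range(k) for j in range(k)]
            products = [x * w for x, w in zip(window, weights)]
            row.append(adder_tree(products))
        out.append(row)
    return out
```

Both functions return the same output for integer inputs; the difference is only in how the work is organized. The unrolled form exposes the k*k multiplications as independent operations, which is what lets hardware execute them in parallel and pipeline successive windows.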
This article is indexed in Wanfang Data and other databases.