DESIGN AND IMPLEMENTATION OF A CONVOLUTION ACCELERATION STRUCTURE BASED ON SYSTEM GENERATOR

Abstract: Convolution operations in convolutional neural networks are time-consuming and computationally complex. To address this, this paper exploits the parallelism of convolution to propose a block-based pipeline acceleration method, and designs the corresponding circuit in System Generator. Experimental verification on a field-programmable gate array (FPGA) shows that the design outputs correct convolution results. With the same network structure and input data, the design computes up to 258 times faster than an ordinary CPU and nearly 40 times faster than a server-class CPU, which is a substantial acceleration.
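
To make the idea of block-based convolution concrete, the following is a minimal software sketch, not the circuit described in this paper. It assumes a single-channel "valid" convolution in the cross-correlation form common in CNNs; the function names `conv2d_valid` and `conv2d_blocked` and the `block` parameter are hypothetical and chosen only for illustration. The point it shows is that each output block depends only on its own input window, so blocks can be computed independently, which is the kind of parallelism a pipelined hardware design can exploit.

```python
def conv2d_valid(image, kernel):
    """Direct 'valid' 2D convolution (cross-correlation form)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(ow)]
            for r in range(oh)]


def conv2d_blocked(image, kernel, block=2):
    """Same result as conv2d_valid, computed one output block at a time.

    Illustrative assumption: square output blocks of side `block`.
    Each block reads only its own input window, so the blocks are
    independent and could be assigned to parallel or pipelined units.
    """
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for r0 in range(0, oh, block):
        for c0 in range(0, ow, block):
            # Edge blocks may be smaller than `block`.
            bh, bw = min(block, oh - r0), min(block, ow - c0)
            # Input window for this block, including the kernel overlap.
            window = [row[c0:c0 + bw + kw - 1]
                      for row in image[r0:r0 + bh + kh - 1]]
            tile = conv2d_valid(window, kernel)
            for i in range(bh):
                out[r0 + i][c0:c0 + bw] = tile[i]
    return out


image = [[r * 5 + c for c in range(5)] for r in range(5)]
kernel = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]
result = conv2d_blocked(image, kernel, block=2)
print(result == conv2d_valid(image, kernel))  # True
```

In this sketch the blocks are processed sequentially; in hardware, the same decomposition would let several blocks, or several stages of one block, be in flight at once.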