378 / 2019-01-23 17:25:49
A Software-Hardware collaboration system for CNN algorithms based on FPGA
CNN algorithms, convolution accelerator, SoC, data reuse
Final version
Shuo Zhao / Tsinghua University
Kunning Zhang / Tsinghua University
Jun Fan / Tsinghua University
Hu He / Tsinghua University
In this paper, an SoC system with an ARM processor and a convolution accelerator is designed for CNN algorithms on the ZC706 evaluation board. Using tiling and loop reorganization, the system achieves a high data reuse rate, which greatly reduces the data bandwidth required between the on-chip buffer and DDR memory. The convolution accelerator supports kernel sizes from 1x1 to 11x11, and the supported activation functions are ReLU and Leaky ReLU. The processor of the SoC is mainly responsible for control and for the other computations of the CNN, such as LRN and pooling, which makes the system more versatile and flexible. At a working frequency of 100 MHz, the peak performance reaches 45.16 GFLOPS, which is 142.8x faster than a Cortex-A9, and the energy efficiency is 219.5x better than that of an i7-4790K.
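The data-reuse idea behind tiling can be illustrated with a small software model. The sketch below is not the authors' hardware design; it is a minimal pure-Python illustration that assumes a stride-1 convolution without padding. The function names (`conv_direct`, `conv_tiled`) and tile parameters (`tr`, `tc`) are hypothetical. Each function returns the output together with a count of input values fetched from "external memory", so the effect of loading an input tile once and reusing it across all output channels is visible.

```python
def conv_direct(inp, weights, k):
    """Untiled reference: every multiply fetches its input from external memory.

    inp: [C][H][W], weights: [M][C][k][k]; stride 1, no padding (assumption).
    Returns (output [M][H-k+1][W-k+1], number of external input reads).
    """
    C, H, W, M = len(inp), len(inp[0]), len(inp[0][0]), len(weights)
    R, Q = H - k + 1, W - k + 1
    out = [[[0] * Q for _ in range(R)] for _ in range(M)]
    reads = 0
    for m in range(M):
        for r in range(R):
            for c in range(Q):
                acc = 0
                for ch in range(C):
                    for i in range(k):
                        for j in range(k):
                            acc += inp[ch][r + i][c + j] * weights[m][ch][i][j]
                            reads += 1
                out[m][r][c] = acc
    return out, reads


def conv_tiled(inp, weights, k, tr=2, tc=2):
    """Tiled version: the spatial tile loops are outermost, so each input tile
    is copied into a local buffer once and reused by every output channel."""
    C, H, W, M = len(inp), len(inp[0]), len(inp[0][0]), len(weights)
    R, Q = H - k + 1, W - k + 1
    out = [[[0] * Q for _ in range(R)] for _ in range(M)]
    reads = 0
    for r0 in range(0, R, tr):
        for c0 in range(0, Q, tc):
            r1, c1 = min(r0 + tr, R), min(c0 + tc, Q)
            # Model of the on-chip buffer: one external fetch per tile element,
            # including the (k - 1)-wide halo needed by the kernel window.
            tile = [[row[c0:c1 + k - 1] for row in plane[r0:r1 + k - 1]]
                    for plane in inp]
            reads += C * (r1 - r0 + k - 1) * (c1 - c0 + k - 1)
            for m in range(M):  # all output channels reuse the same tile
                for r in range(r0, r1):
                    for c in range(c0, c1):
                        acc = 0
                        for ch in range(C):
                            for i in range(k):
                                for j in range(k):
                                    acc += (tile[ch][r - r0 + i][c - c0 + j]
                                            * weights[m][ch][i][j])
                        out[m][r][c] = acc
    return out, reads
```

Both functions compute the same result; only the number of modeled external reads differs. In this model the tiled read count no longer depends on the number of output channels, which is the kind of reuse the abstract attributes to tiling and loop reorganization.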
Important dates
  • Conference dates: June 12, 2019 to June 14, 2019
  • June 12, 2019: initial draft submission deadline
  • June 14, 2019: registration deadline
Organizer
Xi'an University of Technology