526 / 2019-03-07 10:55:24
A FPGA-based Accelerator of Convolutional Neural Network for Face Feature Extraction
CNN, FPGA-based accelerator, parallelism, data quantization, RTL
Final version
Ru Ding / Tsinghua University
Xingjun Wu / Tsinghua University
Guangda Su / Tsinghua University
Guoqiang Bai / Tsinghua University
Wei Xu / Tsinghua University
Nan Su / Tsinghua University
Convolutional neural networks (CNNs), a typical deep learning model, have been widely used to solve many complex problems. However, the computation-intensive convolutional layers and memory-intensive fully connected layers limit the deployment of CNNs on embedded platforms. In this paper, we propose an FPGA-based accelerator for face feature extraction that supports acceleration of the entire CNN. In our design, each CNN layer is optimized and deployed independently using hand-coded Verilog templates rather than a high-level synthesis (HLS) tool. The RTL-designed layers use a highly optimized parallelism strategy for the convolutional layers and a pipelined structure for the convolutional and pooling layers to achieve high resource utilization. For the fully connected layers, a batch-based method is applied to reduce the number of data accesses. Moreover, a dynamic fixed-point quantization strategy is adopted to reduce resource consumption. As a result, an "FPGA+ARM" system completes the hardware acceleration of the CNN, with a precision error of less than 1% compared with the software implementation.
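The abstract does not give the bit width or the exact quantization scheme, so the following is only a minimal sketch of dynamic fixed-point quantization in the common sense of the term: each layer (or tensor) gets its own fractional length, chosen so that its largest magnitude still fits in the word. The 8-bit word size and the function names `choose_frac_bits`, `quantize`, and `dequantize` are assumptions made for illustration, not details taken from the paper.

```python
import math


def choose_frac_bits(values, total_bits=8):
    """Pick a fractional length so the largest magnitude fits in the word.

    total_bits=8 is an assumed word size; one bit is reserved for the sign.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return total_bits - 1
    # Integer bits needed to represent the largest magnitude.
    int_bits = math.floor(math.log2(max_abs)) + 1
    return total_bits - 1 - int_bits


def quantize(values, total_bits=8, frac_bits=None):
    """Convert floats to signed fixed-point codes with saturation."""
    if frac_bits is None:
        frac_bits = choose_frac_bits(values, total_bits)
    scale = 2.0 ** frac_bits
    qmin = -(1 << (total_bits - 1))
    qmax = (1 << (total_bits - 1)) - 1
    codes = [min(qmax, max(qmin, round(v * scale))) for v in values]
    return codes, frac_bits


def dequantize(codes, frac_bits):
    """Map fixed-point codes back to floats for error measurement."""
    scale = 2.0 ** frac_bits
    return [c / scale for c in codes]


# Two hypothetical layers with different dynamic ranges get different
# fractional lengths, which is the "dynamic" part of the scheme.
small_codes, small_fl = quantize([0.5, -0.25, 0.125])
large_codes, large_fl = quantize([3.2, -1.5])
```

Choosing the fractional length per layer keeps small-valued layers at fine resolution while still covering the range of large-valued ones, which is why such a scheme can use narrow words with limited loss of precision.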
Important dates
  • Conference dates: June 12 to June 14, 2019
  • Initial draft submission deadline: June 12, 2019
  • Registration deadline: June 14, 2019
Organizer
Xi'an University of Technology