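Research on the Design and Optimization Method of a CNN Accelerator Based on HLS Tools
Application of Electronic Technique, 2021, Issue 3
Cheng Jiafeng, Wang Hongliang
National Key Laboratory for Electronic Measurement Technology, North University of China, Taiyuan 030051, Shanxi, China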
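Abstract: Following a software/hardware co-design approach, a convolutional neural network accelerator was designed and implemented with HLS tools on the PYNQ-Z2 platform. The convolution operation is optimized by matrix cutting, which balances resource consumption against computing resources so that the accelerator reaches its best performance. The accelerator IP core was evaluated on the MNIST dataset. The experimental results show a speedup of 5.785 over the ARM platform for a single image and a speedup of 9.72 for 1 000 images; the accelerator's advantage is expected to grow further as the number of test images increases.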
CLC number: TN108.1
Document code: A
DOI:10.16157/j.issn.0258-7998.200841
Citation (Chinese): 程佳風,王紅亮. 基于HLS工具的CNN加速器的設計與優化方法研究[J]. 電子技術應用, 2021, 47(3): 18-21, 26.
Citation (English): Cheng Jiafeng, Wang Hongliang. Research on the design and optimization method of CNN accelerator based on HLS tools[J]. Application of Electronic Technique, 2021, 47(3): 18-21, 26.
Research on the design and optimization method of CNN accelerator based on HLS tools
Cheng Jiafeng, Wang Hongliang
National Key Laboratory for Electronic Measurement Technology, North University of China, Taiyuan 030051, China
Abstract: Based on the idea of software and hardware co-design, this article uses HLS tools to design and implement a convolutional neural network accelerator on the PYNQ-Z2 platform. Convolution operations are optimized with a matrix cutting method that balances resource consumption against computing resources, so that the performance of the accelerator is optimized. The MNIST data set is used to test the performance of the accelerator IP core. The experimental results show that, compared with the ARM platform, the accelerator achieves a speedup of 5.785 for a single image and a speedup of 9.72 for 1 000 images. As the number of test images continues to increase, the speedup of the accelerator is expected to improve further.
Key words: convolutional neural network (CNN); PYNQ-Z2; HLS tool; accelerator
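0 Introduction
In recent years, convolutional neural networks (CNNs) have been applied ever more widely and in increasingly complex scenarios. Their compute-intensive and memory-intensive nature has become more pronounced and now limits fast, efficient implementation. Accelerator platforms based on GPUs[1], ASICs[2], and FPGAs[3] have therefore been proposed in succession to improve CNN performance. GPUs consume a great deal of power and have a fixed hardware structure, which limits the use of CNNs in embedded devices; ASICs are extremely costly to develop and inflexible, which makes them ill suited to complex and rapidly changing CNNs; FPGAs offer low power consumption, high performance, and good flexibility, and are therefore better suited to developing CNN hardware acceleration. However, Verilog HDL has a high barrier to entry and a relatively long development cycle, which has hindered wider adoption of FPGAs for CNN applications[4-5].
Following the software/hardware co-design approach, this paper uses HLS tools to implement a CNN accelerator on the PYNQ-Z2 and applies a matrix-cutting design method to optimize the convolution kernel computation.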
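The partitioning scheme itself is described only in the full text, so the following is a minimal sketch of the general idea behind cutting a convolution into blocks, not the authors' design. It assumes a stride-1 convolution without padding, and the function names, the tile size, and the 5×5 kernel in the usage note are all hypothetical choices made for illustration. Each output tile is computed from a small input patch, which mimics copying one block at a time into limited on-chip buffers.

```python
# Illustrative sketch only: tile size, loop order, and function names are
# assumptions, not the partitioning scheme used in the paper.

def conv2d_direct(image, kernel):
    """Stride-1, no-padding 2-D convolution in the CNN sense (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for r in range(oh):
        for c in range(ow):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            out[r][c] = acc
    return out


def conv2d_tiled(image, kernel, tile=4):
    """Same result as conv2d_direct, computed one output tile at a time.

    An output tile of th x tw values needs only an input patch of
    (th + kh - 1) x (tw + kw - 1) values, so the working set stays small.
    """
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for r0 in range(0, oh, tile):
        for c0 in range(0, ow, tile):
            th = min(tile, oh - r0)  # edge tiles may be smaller
            tw = min(tile, ow - c0)
            # Local buffer: the input patch that covers this output tile.
            patch = [row[c0:c0 + tw + kw - 1]
                     for row in image[r0:r0 + th + kh - 1]]
            block = conv2d_direct(patch, kernel)
            # Write the finished tile back into the full output.
            for r in range(th):
                out[r0 + r][c0:c0 + tw] = block[r]
    return out
```

For a 28×28 input (the MNIST image size) and a hypothetical 5×5 kernel, both functions return the same 24×24 output. In an HLS design the tile size would be the knob that trades buffer and multiplier usage against the number of passes over the data.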
For the full text of this article, please download: http://forexkbc.com/resource/share/2000003402