Meng et al., 2021 - Google Patents

Dynamap: Dynamic algorithm mapping framework for low latency CNN inference

Document ID: 3061403050136301784
Author: Meng Y; Kuppannagari S; Kannan R; Prasanna V
Publication year: 2021
Publication venue: The 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

Snippet

Most existing work on FPGA acceleration of Convolutional Neural Networks (CNNs) employs a single strategy (algorithm, dataflow, etc.) across all layers. Such an approach does not achieve optimal latency on complex and deep CNNs. Emerging …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/80—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5045—Circuit design
- G06F17/505—Logic synthesis, e.g. technology mapping, optimisation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
- G06F17/12—Simultaneous equations, e.g. systems of linear equations
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5068—Physical circuit design, e.g. layout for integrated circuits or printed circuit boards
- G06F17/5072—Floorplanning, e.g. partitioning, placement
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3893—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator
- G06F9/3895—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros
- G06F9/3897—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros with adaptable data path
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2217/00—Indexing scheme relating to computer aided design [CAD]
- G06F2217/78—Power analysis and optimization
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models

Similar Documents

Publication	Publication Date	Title
Meng et al.	2021	Dynamap: Dynamic algorithm mapping framework for low latency CNN inference
Zhang et al.	2020	DNNExplorer: a framework for modeling and exploring a novel paradigm of FPGA-based DNN accelerator
US12468924B2 (en)	2025-11-11	Parallel computing scheme generation for neural networks
US11347480B2 (en)	2022-05-31	Transpose operations using processing element array
Shen et al.	2019	Toward an efficient deep pipelined template-based architecture for accelerating the entire 2-D and 3-D CNNs on FPGA
Liu et al.	2021	WinoCNN: Kernel sharing Winograd systolic array for efficient convolutional neural network acceleration on FPGAs
Kästner et al.	2018	Hardware/software codesign for convolutional neural networks exploiting dynamic partial reconfiguration on PYNQ
US20220253683A1 (en)	2022-08-11	Implementing Fully-Connected Neural-Network Layers in Hardware
Lee et al.	2021	NP-CGRA: Extending CGRAs for efficient processing of light-weight deep neural networks
Wang et al.	2019	Systolic cube: A spatial 3D CNN accelerator architecture for low power video analysis
Yang et al.	2023	AIM: Accelerating arbitrary-precision integer multiplication on heterogeneous reconfigurable computing platform Versal ACAP
US11488066B2 (en)	2022-11-01	Efficient convolution of multi-channel input samples with multiple kernels
Chen et al.	2023	Exploiting on-chip heterogeneity of Versal architecture for GNN inference acceleration
Lee et al.	2021	Specializing CGRAs for light-weight convolutional neural networks
Zhang et al.	2022	Low-latency mini-batch GNN inference on CPU-FPGA heterogeneous platform
Liang et al.	2021	FCNNLib: A flexible convolution algorithm library for deep learning on FPGAs
Roorda et al.	2022	FPGA architecture exploration for DNN acceleration
Arredondo-Velazquez et al.	2020	A streaming architecture for Convolutional Neural Networks based on layer operations chaining
Meng et al.	2021	PPOAccel: A high-throughput acceleration framework for proximal policy optimization
Hamdi et al.	2025	MATCH: Model-Aware TVM-based Compilation for Heterogeneous Edge Devices
Meng et al.	2021	How to avoid zero-spacing in fractionally-strided convolution? A hardware-algorithm co-design methodology
Meng et al.	2022	Accelerator design and exploration for deformable convolution networks
Zhang et al.	2024	VisionAGILE: A Versatile Domain-Specific Accelerator for Computer Vision Tasks
Chen et al.	2024	High throughput and low bandwidth demand: Accelerating CNN inference block-by-block on FPGAs
Chowdhury et al.	2025	Demystifying the 7-D Convolution Loop Nest for Data and Instruction Streaming in Reconfigurable AI Accelerators