Stojanovic et al., 2007 - Google Patents

Matrix-vector multiplication on a fixed size unidirectional systolic array

Document ID: 4851815153704624108
Author: Stojanovic N; Milovanovic I; Stojcev M; Milovanovic E
Publication year: 2007
Publication venue: 2007 8th International Conference on Telecommunications in Modern Satellite, Cable and Broadcasting Services

Snippet

In this paper, the problem of multiplying a matrix A = (a_ik) of size n×n by a vector b̄ = (b_k) of size n×1 on a unidirectional linear systolic array (ULSA) comprising p ≤ [n/2] processing elements is considered. To match the dimension of matrix A to the ULSA size, the partitioning of the …
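
The snippet is truncated, so the paper's actual partitioning and data schedule are not available here. As a rough functional illustration only, the general idea of fitting an n×n matrix-vector product onto a fixed number p of processing elements can be sketched as below. The function name matvec_partitioned, the row-band partitioning, and the one-accumulator-per-element model are all assumptions made for this example; the sketch is not cycle-accurate and is not the authors' method.

```python
import math


def matvec_partitioned(a, b, p):
    """Compute c = A*b by processing A in bands of at most p rows.

    Hypothetical model: each band stands in for one pass through an
    array of p processing elements (PEs), and PE j accumulates one
    element of the result. This is an assumed scheme, not the paper's.
    """
    n = len(a)
    if p < 1:
        raise ValueError("p must be at least 1")
    c = [0] * n
    passes = math.ceil(n / p)
    for band in range(passes):
        rows = range(band * p, min((band + 1) * p, n))
        # One accumulator per PE used in this pass.
        acc = [0] * len(rows)
        # The vector elements b_k are fed in one at a time, as a stream.
        for k in range(n):
            for j, i in enumerate(rows):
                acc[j] += a[i][k] * b[k]
        # Drain the accumulators into the result vector.
        for j, i in enumerate(rows):
            c[i] = acc[j]
    return c, passes
```

In this model, calling matvec_partitioned(a, b, p) with a 4×4 matrix and p = 2 takes two passes, one per band of two rows. A real systolic design would also have to schedule how elements of A and b̄ move between neighbouring processing elements, which this sketch ignores.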

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
- G06F17/142—Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/147—Discrete orthonormal transforms, e.g. discrete cosine transform, discrete sine transform, and variations therefrom, e.g. modified discrete cosine transform, integer transforms approximating the discrete cosine transform
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
- G06F17/12—Simultaneous equations, e.g. systems of linear equations
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5045—Circuit design
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/60—Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers
- G06F7/72—Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers using residue arithmetic
- G06F7/724—Finite field arithmetic
- G06F7/726—Inversion; Reciprocal calculation; Division of elements of a finite field
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
- G06F1/16—Constructional details or arrangements
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements

Similar Documents

Publication	Publication Date	Title
Ebeling et al.	1997	Mapping applications to the RaPiD configurable architecture
Duong-Ngoc et al.	2022	Area-efficient number theoretic transform architecture for homomorphic encryption
Mera et al.	2020	Compact domain-specific co-processor for accelerating module lattice-based KEM
US20220188072A1 (en)	2022-06-16	Systems and methods for calculating large polynomial multiplications
Lin et al.	2012	Scalable montgomery modular multiplication architecture with low-latency and low-memory bandwidth requirement
US20250097009A1 (en)	2025-03-20	Semi-custom accelerator device for bootstrappable fully homomorphic encryption
Chang et al.	2011	Efficient hardware accelerators for the computation of Tchebichef moments
US6658441B1 (en)	2003-12-02	Apparatus and method for recursive parallel and pipelined fast fourier transform
US8307021B1 (en)	2012-11-06	Hardware architecture and scheduling for high performance solution to cholesky decomposition
Cardarilli et al.	2017	RNS applications in digital signal processing
Chen et al.	2025	A High-performance NTT/MSM Accelerator for Zero-knowledge Proof Using Load-balanced Fully-pipelined Montgomery Multiplier
Campbell et al.	2006	Resource and delay efficient matrix multiplication using newer FPGA devices
Zicari et al.	2008	A matrix product accelerator for field programmable systems on chip
Stojanovic et al.	2007	Matrix-vector multiplication on a fixed size unidirectional systolic array
US7847349B2 (en)	2010-12-07	Single-cycle FFT butterfly calculator
Castillo-Atoche et al.	2010	Towards real time implementation of reconstructive signal processing algorithms using systolic arrays coprocessors
More et al.	2013	FPGA implementation of FFT processor using vedic algorithm
Tan et al.	2019	Loop optimizations of mgs-qrd algorithm for fpga high-level synthesis
Meher	2005	Design of a fully-pipelined systolic array for flexible transposition-free VLSI of 2-D DFT
Buček et al.	2012	Dedicated hardware implementation of a linear congruence solver in FPGA
Koo et al.	2007	Evaluation of a high-level-language methodology for high-performance reconfigurable computers
Kittitornkun et al.	2003	Mapping deep nested do-loop DSP algorithms to large scale FPGA array structures
Liu et al.	2009	Testable design and BIST techniques for systolic motion estimators in the transform domain
Bucek et al.	2013	Comparison of FPGA and ASIC implementation of a linear congruence solver
US20240396706A1 (en)	2024-11-28	Fully homomorphic encrypted processing acceleration