Matrix-vector multiplication on a fixed size unidirectional systolic array
Stojanovic et al., 2007
- Document ID
- 4851815153704624108
- Author
- Stojanovic N
- Milovanovic I
- Stojcev M
- Milovanovic E
- Publication year
- 2007
- Publication venue
- 2007 8th International Conference on Telecommunications in Modern Satellite, Cable and Broadcasting Services
Snippet
In this paper, the problem of multiplication of matrix $A = (a_{ik})_{n \times n}$ by vector $\bar{b} = (b_k)_{n \times 1}$ on a unidirectional linear systolic array (ULSA) comprised of $p \le [n/2]$ processing elements is considered. To match the dimension of matrix A to the ULSA size, the partitioning of the …
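The snippet is cut off before it describes the partitioning, so the following is only a minimal sketch of the general idea of fitting an n x n matrix-vector product onto a fixed number p of processing elements. It assumes a plain row-strip partition, which is not necessarily the scheme used in the paper, and the function name `matvec_partitioned` and its parameters are hypothetical. The processing elements are modelled sequentially in software; no systolic timing or data flow is simulated.

```python
def matvec_partitioned(A, b, p):
    """Compute c = A*b by processing A in strips of at most p rows.

    Assumption: each strip stands in for one pass over an array of p
    processing elements (PEs), with PE j accumulating the inner product
    for row (start + j). This is a generic row-strip partition, not the
    partitioning from the paper.
    """
    n = len(A)
    if p < 1:
        raise ValueError("p must be at least 1")
    c = [0] * n
    for start in range(0, n, p):              # one pass per strip of rows
        rows = range(start, min(start + p, n))
        for k in range(n):                    # element b[k] is used by every PE in the strip
            for i in rows:                    # one multiply-accumulate per PE (parallel in hardware)
                c[i] += A[i][k] * b[k]
    return c


A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
b = [1, 0, -1, 2]
result = matvec_partitioned(A, b, p=2)        # n = 4, so p = 2 satisfies p <= n/2
```

With p = 2 the 4 x 4 matrix is handled in two passes of two rows each; the result does not depend on p, only the number of passes does.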
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
- G06F17/142—Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/147—Discrete orthonormal transforms, e.g. discrete cosine transform, discrete sine transform, and variations therefrom, e.g. modified discrete cosine transform, integer transforms approximating the discrete cosine transform
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
- G06F17/12—Simultaneous equations, e.g. systems of linear equations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5045—Circuit design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/60—Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers
- G06F7/72—Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers using residue arithmetic
- G06F7/724—Finite field arithmetic
- G06F7/726—Inversion; Reciprocal calculation; Division of elements of a finite field
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
- G06F1/16—Constructional details or arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Ebeling et al. | | Mapping applications to the RaPiD configurable architecture |
| Duong-Ngoc et al. | | Area-efficient number theoretic transform architecture for homomorphic encryption |
| Mera et al. | | Compact domain-specific co-processor for accelerating module lattice-based KEM |
| US20220188072A1 (en) | | Systems and methods for calculating large polynomial multiplications |
| Lin et al. | | Scalable Montgomery modular multiplication architecture with low-latency and low-memory bandwidth requirement |
| US20250097009A1 (en) | | Semi-custom accelerator device for bootstrappable fully homomorphic encryption |
| Chang et al. | | Efficient hardware accelerators for the computation of Tchebichef moments |
| US6658441B1 (en) | | Apparatus and method for recursive parallel and pipelined fast Fourier transform |
| US8307021B1 (en) | | Hardware architecture and scheduling for high performance solution to Cholesky decomposition |
| Cardarilli et al. | | RNS applications in digital signal processing |
| Chen et al. | | A High-performance NTT/MSM Accelerator for Zero-knowledge Proof Using Load-balanced Fully-pipelined Montgomery Multiplier |
| Campbell et al. | | Resource and delay efficient matrix multiplication using newer FPGA devices |
| Zicari et al. | | A matrix product accelerator for field programmable systems on chip |
| Stojanovic et al. | | Matrix-vector multiplication on a fixed size unidirectional systolic array |
| US7847349B2 (en) | | Single-cycle FFT butterfly calculator |
| Castillo-Atoche et al. | | Towards real time implementation of reconstructive signal processing algorithms using systolic arrays coprocessors |
| More et al. | | FPGA implementation of FFT processor using Vedic algorithm |
| Tan et al. | | Loop optimizations of MGS-QRD algorithm for FPGA high-level synthesis |
| Meher | | Design of a fully-pipelined systolic array for flexible transposition-free VLSI of 2-D DFT |
| Buček et al. | | Dedicated hardware implementation of a linear congruence solver in FPGA |
| Koo et al. | | Evaluation of a high-level-language methodology for high-performance reconfigurable computers |
| Kittitornkun et al. | | Mapping deep nested do-loop DSP algorithms to large scale FPGA array structures |
| Liu et al. | | Testable design and BIST techniques for systolic motion estimators in the transform domain |
| Bucek et al. | | Comparison of FPGA and ASIC implementation of a linear congruence solver |
| US20240396706A1 (en) | | Fully homomorphic encrypted processing acceleration |