DK3764314T3 - Machine learning sparse computation mechanism - Google Patents
Machine learning sparse computation mechanism
- Publication number
- DK3764314T3 (application DK20192178.0T)
- Authority
- DK
- Denmark
- Prior art keywords
- machine learning
- computation mechanism
- sparse computation
- sparse
- learning
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0207—Addressing or allocation; Relocation with multidimensional access, e.g. row/column, matrix
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0811—Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
- G06F12/0831—Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0888—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches using selective caching, e.g. bypass
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2136—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on sparsity criteria, e.g. with an overcomplete basis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/3001—Arithmetic instructions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0495—Quantised networks; Sparse networks; Compressed networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/60—Memory management
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/005—General purpose rendering architectures
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1016—Performance improvement
- G06F2212/1024—Latency reduction
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/30—Providing cache or TLB in specific location of a processing system
- G06F2212/302—In image processor or graphics adapter
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/40—Specific encoding of data in memory or cache
- G06F2212/401—Compressed data
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/62—Details of cache specific to multiprocessor cache arrangements
- G06F2212/621—Coherency control relating to peripheral accessing, e.g. from DMA or I/O device
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/28—Indexing scheme for image data processing or generation, in general involving image processing hardware
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Pure & Applied Mathematics (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Medical Informatics (AREA)
- Computer Graphics (AREA)
- Algebra (AREA)
- Databases & Information Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Neurology (AREA)
- Image Processing (AREA)
- Image Generation (AREA)
- Complex Calculations (AREA)
- Image Analysis (AREA)
- Multi Processors (AREA)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US15/482,791 US10346944B2 (en) | 2017-04-09 | 2017-04-09 | Machine learning sparse computation mechanism |
| EP18160825.8A EP3385901B1 (en) | 2017-04-09 | 2018-03-08 | Machine learning sparse computation mechanism |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| DK3764314T3 (da) | 2024-02-12 |
Family
ID=61569137
Family Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| DK21180506.4T DK3937119T3 (da) | 2017-04-09 | 2018-03-08 | Machine learning sparse computation mechanism |
| DK20192178.0T DK3764314T3 (da) | 2017-04-09 | 2018-03-08 | Machine learning sparse computation mechanism |
Family Applications Before (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| DK21180506.4T DK3937119T3 (da) | 2017-04-09 | 2018-03-08 | Machine learning sparse computation mechanism |
Country Status (7)
| Country | Link |
|---|---|
| US (8) | US10346944B2 (da) |
| EP (5) | EP3385901B1 (da) |
| CN (4) | CN112116098B (da) |
| DK (2) | DK3937119T3 (da) |
| ES (3) | ES2972297T3 (da) |
| FI (2) | FI3937119T3 (da) |
| PL (4) | PL3385901T3 (da) |
Families Citing this family (88)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9942074B1 (en) | 2016-11-30 | 2018-04-10 | Micron Technology, Inc. | Wireless devices and systems including examples of mixing coefficient data specific to a processing mode selection |
| US10346944B2 (en) | 2017-04-09 | 2019-07-09 | Intel Corporation | Machine learning sparse computation mechanism |
| US11164071B2 (en) | 2017-04-18 | 2021-11-02 | Samsung Electronics Co., Ltd. | Method and apparatus for reducing computational complexity of convolutional neural networks |
| US11429861B1 (en) | 2017-05-01 | 2022-08-30 | Perceive Corporation | Device storing multiple sets of parameters for machine-trained network |
| US10572773B2 (en) * | 2017-05-05 | 2020-02-25 | Intel Corporation | On the fly deep learning in machine learning for autonomous machines |
| US11567816B2 (en) | 2017-09-13 | 2023-01-31 | Hrl Laboratories, Llc | Transitive tensor analysis for detection of network activities |
| US10755141B2 (en) * | 2017-09-13 | 2020-08-25 | Hrl Laboratories, Llc | Streaming data tensor analysis using blind source separation |
| US11619927B2 (en) | 2017-11-03 | 2023-04-04 | Drishti Technologies, Inc. | Automatic analysis of real time conditions in an activity space |
| GB2568085B (en) * | 2017-11-03 | 2020-01-01 | Imagination Tech Ltd | Hardware unit for performing matrix multiplication with clock gating |
| US10372787B2 (en) * | 2017-12-12 | 2019-08-06 | Facebook, Inc. | Hardware accelerator pre-configured with coefficients for matrix-transform operations |
| AU2017279610A1 (en) * | 2017-12-19 | 2019-07-04 | Canon Kabushiki Kaisha | Memory access optimisation using per-layer computational mapping and memory allocation for CNN application |
| US11514306B1 (en) * | 2018-03-14 | 2022-11-29 | Meta Platforms, Inc. | Static memory allocation in neural networks |
| US11210586B1 (en) | 2018-04-20 | 2021-12-28 | Perceive Corporation | Weight value decoder of neural network inference circuit |
| US11222257B1 (en) | 2018-04-20 | 2022-01-11 | Perceive Corporation | Non-dot product computations on neural network inference circuit |
| US11783167B1 (en) | 2018-04-20 | 2023-10-10 | Perceive Corporation | Data transfer for non-dot product computations on neural network inference circuit |
| US12093696B1 (en) | 2018-04-20 | 2024-09-17 | Perceive Corporation | Bus for transporting output values of a neural network layer to cores specified by configuration data |
| US11250326B1 (en) | 2018-04-20 | 2022-02-15 | Perceive Corporation | Splitting neural network filters for implementation by neural network inference circuit |
| US11049013B1 (en) | 2018-04-20 | 2021-06-29 | Perceive Corporation | Encoding of weight values stored on neural network inference circuit |
| US10740434B1 (en) * | 2018-04-20 | 2020-08-11 | Perceive Corporation | Reduced dot product computation circuit |
| US11568227B1 (en) | 2018-04-20 | 2023-01-31 | Perceive Corporation | Neural network inference circuit read controller with multiple operational modes |
| US11586910B1 (en) | 2018-04-20 | 2023-02-21 | Perceive Corporation | Write cache for neural network inference circuit |
| US11481612B1 (en) | 2018-04-20 | 2022-10-25 | Perceive Corporation | Storage of input values across multiple cores of neural network inference circuit |
| US11216732B2 (en) * | 2018-05-31 | 2022-01-04 | Neuralmagic Inc. | Systems and methods for generation of sparse code for convolutional neural networks |
| CN112437930A (zh) * | 2018-07-12 | 2021-03-02 | 华为技术有限公司 | Generating a compressed representation of a neural network with proficient inference speed and power consumption |
| US10769310B2 (en) * | 2018-07-20 | 2020-09-08 | Nxp B.V. | Method for making a machine learning model more difficult to copy |
| US10380997B1 (en) | 2018-07-27 | 2019-08-13 | Deepgram, Inc. | Deep learning internal state index-based search and classification |
| US10719323B2 (en) | 2018-09-27 | 2020-07-21 | Intel Corporation | Systems and methods for performing matrix compress and decompress instructions |
| US11468291B2 (en) | 2018-09-28 | 2022-10-11 | Nxp B.V. | Method for protecting a machine learning ensemble from copying |
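| CN111047020B (zh) * | 2018-10-12 | 2022-11-29 | 上海寒武纪信息科技有限公司 | Neural network operation device and method supporting compression and decompression |
| KR102848548B1 (ko) * | 2018-11-06 | 2025-08-25 | 한국전자통신연구원 | Method and apparatus for compressing and decompressing a deep learning model |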
| US11663001B2 (en) * | 2018-11-19 | 2023-05-30 | Advanced Micro Devices, Inc. | Family of lossy sparse load SIMD instructions |
| US10846363B2 (en) * | 2018-11-19 | 2020-11-24 | Microsoft Technology Licensing, Llc | Compression-encoding scheduled inputs for matrix computations |
| US11995533B1 (en) | 2018-12-05 | 2024-05-28 | Perceive Corporation | Executing replicated neural network layers on inference circuit |
| CN111310535B (zh) * | 2018-12-11 | 2023-07-14 | 财团法人工业技术研究院 | Object detection method and object detection apparatus using a convolutional neural network model |
| US20200210517A1 (en) | 2018-12-27 | 2020-07-02 | Intel Corporation | Systems and methods to accelerate multiplication of sparse matrices |
| US11520331B2 (en) * | 2018-12-28 | 2022-12-06 | Intel Corporation | Methods and apparatus to update autonomous vehicle perspectives |
| US11353870B2 (en) * | 2018-12-31 | 2022-06-07 | Baidu Usa Llc | Autonomous driving computing and storage expansion device with flexible host and client configuration |
| US11347297B1 (en) | 2019-01-23 | 2022-05-31 | Perceive Corporation | Neural network inference circuit employing dynamic memory sleep |
| US11275968B2 (en) * | 2019-02-13 | 2022-03-15 | Western Digital Technologies, Inc. | Super-sparse image compression using cross-bar non-volatile memory device |
| CN111612153B (zh) * | 2019-02-22 | 2024-06-14 | 华为技术有限公司 | Method and apparatus for training a model |
| WO2020183396A1 (en) | 2019-03-11 | 2020-09-17 | Untether Ai Corporation | Computational memory |
| US12124530B2 (en) | 2019-03-11 | 2024-10-22 | Untether Ai Corporation | Computational memory |
| US11580371B2 (en) * | 2019-03-13 | 2023-02-14 | Roviero, Inc. | Method and apparatus to efficiently process and execute Artificial Intelligence operations |
| US11493985B2 (en) | 2019-03-15 | 2022-11-08 | Microsoft Technology Licensing, Llc | Selectively controlling memory power for scheduled computations |
| JP7494197B2 (ja) * | 2019-03-15 | 2024-06-03 | インテル コーポレイション | Systolic disaggregation within a matrix accelerator architecture |
| US11127167B2 (en) * | 2019-04-29 | 2021-09-21 | Nvidia Corporation | Efficient matrix format suitable for neural networks |
| US11625585B1 (en) | 2019-05-21 | 2023-04-11 | Perceive Corporation | Compiler for optimizing filter sparsity for neural network implementation configuration |
| US11537949B2 (en) * | 2019-05-23 | 2022-12-27 | Google Llc | Systems and methods for reducing idleness in a machine-learning training system using data echoing |
| US10936311B1 (en) * | 2019-07-09 | 2021-03-02 | Xilinx, Inc. | Sparse matrix processing circuitry |
| US11342944B2 (en) | 2019-09-23 | 2022-05-24 | Untether Ai Corporation | Computational memory with zero disable and error detection |
| CN113272854B (zh) * | 2019-10-12 | 2024-12-31 | 昆仑芯(北京)科技有限公司 | Method and system for accelerating AI training using advanced interconnect technology |
| US11694076B2 (en) * | 2019-10-14 | 2023-07-04 | Micron Technology, Inc. | Memory sub-system with internal logic to perform a machine learning operation |
| US11676010B2 (en) * | 2019-10-14 | 2023-06-13 | Micron Technology, Inc. | Memory sub-system with a bus to transmit data for a machine learning operation and another bus to transmit host data |
| US11769076B2 (en) | 2019-10-14 | 2023-09-26 | Micron Technology, Inc. | Memory sub-system with a virtualized bus and internal logic to perform a machine learning operation |
| US11681909B2 (en) * | 2019-10-14 | 2023-06-20 | Micron Technology, Inc. | Memory component with a bus to transmit data for a machine learning operation and another bus to transmit host data |
| US12165055B1 (en) | 2019-11-11 | 2024-12-10 | Amazon Technologies, Inc. | Storing of intermediate computed values for subsequent use in a machine trained network |
| US10924152B1 (en) | 2019-11-13 | 2021-02-16 | Micron Technology, Inc. | Mixing coefficient data for processing mode selection |
| US11314515B2 (en) | 2019-12-23 | 2022-04-26 | Intel Corporation | Instructions and logic for vector multiply add with zero skipping |
| US11442631B2 (en) * | 2019-12-26 | 2022-09-13 | Micron Technology, Inc. | Memory operations with consideration for wear leveling |
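| CN111176582A (zh) * | 2019-12-31 | 2020-05-19 | 北京百度网讯科技有限公司 | Matrix storage method, matrix access method, apparatus, and electronic device |
| CN111191778B (zh) * | 2019-12-31 | 2021-11-30 | 深圳云天励飞技术股份有限公司 | Deep learning network processing method, apparatus, and compiler |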
| US11416959B1 (en) * | 2020-02-10 | 2022-08-16 | Zoox, Inc. | Vision architecture |
| KR102878366B1 (ko) | 2020-02-20 | 2025-10-30 | 삼성전자주식회사 | Electronic device and control method thereof |
| US11468002B2 (en) | 2020-02-28 | 2022-10-11 | Untether Ai Corporation | Computational memory with cooperation among rows of processing elements and memory thereof |
| US11500644B2 (en) | 2020-05-15 | 2022-11-15 | Alibaba Group Holding Limited | Custom instruction implemented finite state machine engines for extensible processors |
| WO2021248433A1 (en) * | 2020-06-12 | 2021-12-16 | Moffett Technologies Co., Limited | Method and system for dual-sparse convolution processing and parallelization |
| KR102724444B1 (ko) * | 2020-07-10 | 2024-11-01 | 삼성전자주식회사 | Electronic device and control method thereof |
| US11481214B2 (en) | 2020-07-14 | 2022-10-25 | Alibaba Group Holding Limited | Sparse matrix calculations utilizing tightly coupled memory and gather/scatter engine |
| US12124939B1 (en) | 2020-11-24 | 2024-10-22 | Perceive Corporation | Generation of machine-trained network instructions |
| US11782757B2 (en) * | 2021-05-07 | 2023-10-10 | SiMa Technologies, Inc. | Scheduling off-chip memory access for programs with predictable execution |
| US11853717B2 (en) | 2021-01-14 | 2023-12-26 | Microsoft Technology Licensing, Llc | Accelerating processing based on sparsity for neural network hardware processors |
| CN112748998B (zh) * | 2021-01-21 | 2023-10-03 | 中南大学 | Convolutional neural network task scheduling method and system for mobile terminals |
| KR20220140694A (ko) | 2021-04-09 | 2022-10-18 | 엔비디아 코포레이션 | Increasing sparsity of data sets |
| US12217160B1 (en) | 2021-04-23 | 2025-02-04 | Amazon Technologies, Inc. | Allocating blocks of unified memory for integrated circuit executing neural network |
| US20220366008A1 (en) * | 2021-05-13 | 2022-11-17 | Nvidia Corporation | Application programming interface to decompress data |
| CN113591654B (zh) * | 2021-07-22 | 2023-09-01 | 中南大学 | Zinc flotation working condition recognition method based on long-term deep features |
| US12288142B2 (en) * | 2021-08-09 | 2025-04-29 | Qualcomm Incorporated | Sparsity-aware compute-in-memory |
| CN113923723B (zh) * | 2021-10-15 | 2023-05-09 | 中国联合网络通信集团有限公司 | Traffic reconstruction method, apparatus, device, and storage medium |
| WO2023068959A1 (ru) * | 2021-10-20 | 2023-04-27 | Акционерное общество "ДжиЭс-Нанотех" | Modular system for collecting and analyzing information in an industrial environment |
| US11816488B2 (en) * | 2021-11-10 | 2023-11-14 | Huawei Technologies Co., Ltd. | Method and apparatus for dynamically simplifying processor instructions |
| CN114281554B (zh) * | 2022-03-08 | 2022-06-17 | 之江实验室 | 3D-CNN acceleration method and apparatus for 3D image processing, and electronic device |
| US12293090B2 (en) | 2022-07-14 | 2025-05-06 | Samsung Electronics Co., Ltd. | Systems and methods for managing bias mode switching |
| CN115239546B (zh) * | 2022-07-18 | 2025-09-16 | 华中科技大学 | Forward gradient regression accelerator based on non-volatile memory and operating method thereof |
| US12224774B2 (en) | 2022-11-16 | 2025-02-11 | Samsung Electronics Co., Ltd. | Runtime reconfigurable compression format conversion |
| US12231152B2 (en) | 2022-11-16 | 2025-02-18 | Samsung Electronics Co., Ltd. | Runtime reconfigurable compression format conversion with bit-plane granularity |
| WO2024110768A1 (en) * | 2022-11-24 | 2024-05-30 | Think Silicon Research and Technology Single Member S.A. | Techniques for thread reduction in processing tensors utilizing sparsity detection |
| CN116578425B (zh) * | 2023-07-11 | 2023-09-22 | 沐曦集成电路(上海)有限公司 | Rasterization-based load balancing method and system |
| WO2025226815A1 (en) * | 2024-04-23 | 2025-10-30 | Applied Physics, Inc. | Sparse processing unit |
Family Cites Families (53)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5262973A (en) * | 1992-03-13 | 1993-11-16 | Sun Microsystems, Inc. | Method and apparatus for optimizing complex arithmetic units for trivial operands |
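| HK1039992A1 (zh) * | 1999-03-05 | 2002-05-17 | Clarity Llc | Two architectures for integrated realization of sensing and processing in a single device |
| JP2002108837A (ja) * | 2000-09-29 | 2002-04-12 | Nec Corp | Computer system and calculation control method thereof |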
| US7046848B1 (en) * | 2001-08-22 | 2006-05-16 | Olcott Peter L | Method and system for recognizing machine generated character glyphs and icons in graphic images |
| US20050171918A1 (en) * | 2002-03-14 | 2005-08-04 | Ronald Eden | Method and system of cost variance analysis |
| US7085420B2 (en) * | 2002-06-28 | 2006-08-01 | Microsoft Corporation | Text detection in continuous tone image segments |
| US7539714B2 (en) * | 2003-06-30 | 2009-05-26 | Intel Corporation | Method, apparatus, and instruction for performing a sign operation that multiplies |
| US7873812B1 (en) | 2004-04-05 | 2011-01-18 | Tibet MIMAR | Method and system for efficient matrix multiplication in a SIMD processor architecture |
| EP1889178A2 (en) * | 2005-05-13 | 2008-02-20 | Provost, Fellows and Scholars of the College of the Holy and Undivided Trinity of Queen Elizabeth near Dublin | A data processing system and method |
| US8775495B2 (en) * | 2006-02-13 | 2014-07-08 | Indiana University Research And Technology | Compression system and method for accelerating sparse matrix computations |
| US7792895B1 (en) * | 2006-06-16 | 2010-09-07 | Nvidia Corporation | Efficient matrix multiplication on a parallel processing device |
| EP2130381A2 (en) * | 2007-01-23 | 2009-12-09 | Euclid Discoveries, LLC | Computer method and apparatus for processing image data |
| CN100562895C (zh) * | 2008-01-14 | 2009-11-25 | 浙江大学 | Method for producing three-dimensional facial animation based on region segmentation and segmented learning |
| US7945765B2 (en) * | 2008-01-31 | 2011-05-17 | International Business Machines Corporation | Method and structure for asynchronous skip-ahead in synchronous pipelines |
| US8364739B2 (en) * | 2009-09-30 | 2013-01-29 | International Business Machines Corporation | Sparse matrix-vector multiplication on graphics processor units |
| US8676874B2 (en) * | 2010-12-06 | 2014-03-18 | International Business Machines Corporation | Data structure for tiling and packetizing a sparse matrix |
| US8862653B2 (en) * | 2011-04-26 | 2014-10-14 | University Of South Carolina | System and method for sparse matrix vector multiplication processing |
| CN102436438B (zh) * | 2011-12-13 | 2015-03-04 | 华中科技大学 | GPU-based sparse matrix data storage method |
| US9153230B2 (en) * | 2012-10-23 | 2015-10-06 | Google Inc. | Mobile speech recognition hardware accelerator |
| CN103399841A (zh) * | 2013-07-31 | 2013-11-20 | 清华大学 | GPU-based sparse matrix LU decomposition method |
| US9367519B2 (en) * | 2013-08-30 | 2016-06-14 | Microsoft Technology Licensing, Llc | Sparse matrix data structure |
| US9754561B2 (en) * | 2013-10-04 | 2017-09-05 | Nvidia Corporation | Managing memory regions to support sparse mappings |
| US20150160371A1 (en) * | 2013-12-06 | 2015-06-11 | Schlumberger Technology Corporation | Gpu accelerated deflation in geomechanics simulator |
| US9978014B2 (en) * | 2013-12-18 | 2018-05-22 | Intel Corporation | Reconfigurable processing unit |
| US10275479B2 (en) * | 2014-02-27 | 2019-04-30 | Sas Institute Inc. | Sparse matrix storage in a database |
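| CN103984527B (zh) * | 2014-04-01 | 2017-12-15 | 杭州电子科技大学 | Method for improving incompressible pipe flow simulation efficiency by optimizing sparse matrix-vector multiplication |
| CN104077233B (zh) * | 2014-06-18 | 2017-04-05 | 百度在线网络技术(北京)有限公司 | Multi-channel convolutional layer processing method and apparatus |
| CN104036451B (zh) * | 2014-06-20 | 2018-12-11 | 深圳市腾讯计算机系统有限公司 | Model parallel processing method and apparatus based on multiple graphics processors |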
| US10223333B2 (en) | 2014-08-29 | 2019-03-05 | Nvidia Corporation | Performing multi-convolution operations in a parallel processing system |
| US9697176B2 (en) | 2014-11-14 | 2017-07-04 | Advanced Micro Devices, Inc. | Efficient sparse matrix-vector multiplication on parallel processors |
| US10255547B2 (en) | 2014-12-04 | 2019-04-09 | Nvidia Corporation | Indirectly accessing sample data to perform multi-convolution operations in a parallel processing system |
| US9760538B2 (en) | 2014-12-22 | 2017-09-12 | Palo Alto Research Center Incorporated | Computer-implemented system and method for efficient sparse matrix representation and processing |
| US20160179540A1 (en) | 2014-12-23 | 2016-06-23 | Mikhail Smelyanskiy | Instruction and logic for hardware support for execution of calculations |
| US9606934B2 (en) | 2015-02-02 | 2017-03-28 | International Business Machines Corporation | Matrix ordering for cache efficiency in performing large sparse matrix operations |
| US20160259826A1 (en) * | 2015-03-02 | 2016-09-08 | International Business Machines Corporation | Parallelized Hybrid Sparse Matrix Representations for Performing Personalized Content Ranking |
| US10262259B2 (en) * | 2015-05-08 | 2019-04-16 | Qualcomm Incorporated | Bit width selection for fixed point neural networks |
| US10380479B2 (en) | 2015-10-08 | 2019-08-13 | International Business Machines Corporation | Acceleration of convolutional neural network training using stochastic perforation |
| CN105427360B (zh) * | 2015-11-11 | 2019-01-18 | 华南理工大学 | Error-controllable cage sequence representation algorithm for dynamic meshes |
| US10409560B1 (en) | 2015-11-18 | 2019-09-10 | Amazon Technologies, Inc. | Acceleration techniques for graph analysis programs |
| US9558156B1 (en) * | 2015-11-24 | 2017-01-31 | International Business Machines Corporation | Sparse matrix multiplication using a single field programmable gate array module |
| US10061748B2 (en) * | 2015-12-11 | 2018-08-28 | Sap Se | Adaptive tile matrix representation and multiplication |
| US20170193361A1 (en) * | 2015-12-31 | 2017-07-06 | Microsoft Technology Licensing, Llc | Neural network training performance optimization framework |
| CN107563497B (zh) * | 2016-01-20 | 2021-03-19 | 中科寒武纪科技股份有限公司 | Computing device and operation method for sparse artificial neural networks |
| US9715508B1 (en) * | 2016-03-28 | 2017-07-25 | Cogniac, Corp. | Dynamic adaptation of feature identification and annotation |
| CN106126481B (zh) * | 2016-06-29 | 2019-04-12 | 华为技术有限公司 | Computing system and electronic device |
| US10997496B2 (en) * | 2016-08-11 | 2021-05-04 | Nvidia Corporation | Sparse convolutional neural network accelerator |
| US10891538B2 (en) | 2016-08-11 | 2021-01-12 | Nvidia Corporation | Sparse convolutional neural network accelerator |
| CN106407158B (zh) * | 2016-09-12 | 2019-01-29 | 东南大学 | GPU-accelerated method for batch processing of isomorphic sparse matrices multiplied by full vectors |
| US10733505B2 (en) * | 2016-11-10 | 2020-08-04 | Google Llc | Performing kernel striding in hardware |
| US10395424B2 (en) | 2016-12-22 | 2019-08-27 | Advanced Micro Devices, Inc. | Method and apparatus of copying data to remote memory |
| US10409887B1 (en) * | 2017-02-28 | 2019-09-10 | Ambarella, Inc. | Generalized dot product for computer vision applications |
| CN108628807B (zh) | 2017-03-20 | 2022-11-25 | 北京百度网讯科技有限公司 | Floating-point matrix processing method, apparatus, device, and computer-readable storage medium |
| US10346944B2 (en) | 2017-04-09 | 2019-07-09 | Intel Corporation | Machine learning sparse computation mechanism |
-
2017
- 2017-04-09 US US15/482,791 patent/US10346944B2/en active Active
-
2018
- 2018-03-08 EP EP18160825.8A patent/EP3385901B1/en active Active
- 2018-03-08 PL PL18160825.8T patent/PL3385901T3/pl unknown
- 2018-03-08 DK DK21180506.4T patent/DK3937119T3/da active
- 2018-03-08 EP EP20192178.0A patent/EP3764314B1/en active Active
- 2018-03-08 PL PL21180506.4T patent/PL3937119T3/pl unknown
- 2018-03-08 EP EP21180506.4A patent/EP3937119B1/en active Active
- 2018-03-08 FI FIEP21180506.4T patent/FI3937119T3/fi active
- 2018-03-08 PL PL20192178.0T patent/PL3764314T3/pl unknown
- 2018-03-08 FI FIEP20192178.0T patent/FI3764314T3/fi active
- 2018-03-08 EP EP20192185.5A patent/EP3764315B1/en active Active
- 2018-03-08 DK DK20192178.0T patent/DK3764314T3/da active
- 2018-03-08 PL PL20192185.5T patent/PL3764315T3/pl unknown
- 2018-03-08 ES ES20192178T patent/ES2972297T3/es active Active
- 2018-03-08 ES ES18160825T patent/ES3011834T3/es active Active
- 2018-03-08 ES ES21180506T patent/ES2962813T3/es active Active
- 2018-03-08 EP EP24171188.6A patent/EP4379648A3/en active Pending
- 2018-04-09 CN CN202010842577.2A patent/CN112116098B/zh active Active
- 2018-04-09 CN CN201810310968.2A patent/CN108694692B/zh active Active
- 2018-04-09 CN CN202110382312.3A patent/CN113191501B/zh active Active
- 2018-04-09 CN CN202010843382.XA patent/CN112116099A/zh active Pending
-
2019
- 2019-05-20 US US16/417,132 patent/US10706498B2/en active Active
-
2020
- 2020-05-21 US US16/880,338 patent/US11164281B2/en active Active
- 2020-07-16 US US16/930,841 patent/US10943325B2/en active Active
-
2021
- 2021-03-05 US US17/193,658 patent/US11430083B2/en active Active
-
2022
- 2022-08-05 US US17/881,720 patent/US11803935B2/en active Active
-
2023
- 2023-09-14 US US18/466,991 patent/US12141891B2/en active Active
-
2024
- 2024-10-04 US US18/906,790 patent/US20250117873A1/en active Pending
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| DK3764314T3 (da) | Machine learning sparse computation mechanism | |
| EP3579103C0 (en) | CALCULATION OPTIMIZATION MECHANISM | |
| EP4220380C0 (en) | HARDWARE ACCELERATED MACHINE LEARNING | |
| EP3643932A4 (en) | LINKAGE MECHANISM | |
| PL3037400T3 (pl) | Chromium-free hydrogenation of hydroformylation mixtures | |
| ITUB20153920A1 (it) | Optofluidic device | |
| EP3513980A4 (en) | PRINTER | |
| EP3452996A4 (en) | SALES MECHANISM | |
| EP3513978A4 (en) | PRINTER | |
| FR3040399B1 (fr) | Device forming a belt-type spreader-crosslapper | |
| EP3401110A4 (en) | Printer | |
| EP3332966A4 (en) | Printing apparatus | |
| EP3491255A4 (en) | IMPROVED TELESCOPIC MECHANISM | |
| EP3619920A4 (en) | AUDIO OBJECT INTERACTIONS WITHOUT METADATA | |
| DK3442345T3 (da) | Displacement weighing apparatus | |
| EP3424733A4 (en) | PRINTER | |
| SE1651579A1 (sv) | A screw-retaining device | |
| FR3046841B1 (fr) | Torsion torque meter | |
| DK3414147T3 (da) | Saddle for a vehicle | |
| IT201600117182A1 (it) | Rewinding machine | |
| EP3502128C0 (en) | EPITOPE | |
| EP3649515A4 (en) | DEVELOPMENT APPARATUS | |
| DK3220057T3 (da) | Flame simulation unit | |
| EP3450173A4 (en) | PRINTING APPARATUS | |
| EP3403836A4 (en) | PRINTER |