Ausavarungnirun, 2017 - Google Patents

Techniques for shared resource management in systems with throughput processors

Document ID: 15956486713799836682
Author: Ausavarungnirun R
Publication year: 2017

Snippet

The continued growth of the computational capability of throughput processors has made them the platform of choice for a wide variety of high-performance computing applications. Graphics Processing Units (GPUs) are a prime example of …

Continue reading at arxiv.org (PDF) (other versions)

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling, out of order instruction execution
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/3004—Arrangements for executing specific machine instructions to perform operations on memory
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0223—User address space allocation, e.g. contiguous or non contiguous base addressing
- G06F12/023—Free address space management
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/16—Handling requests for interconnection or transfer for access to memory bus
- G06F13/1605—Handling requests for interconnection or transfer for access to memory bus based on arbitration
- G06F13/1642—Handling requests for interconnection or transfer for access to memory bus based on arbitration with request queuing
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/16—Handling requests for interconnection or transfer for access to memory bus
- G06F13/1605—Handling requests for interconnection or transfer for access to memory bus based on arbitration
- G06F13/161—Handling requests for interconnection or transfer for access to memory bus based on arbitration with latency improvement
- G06F13/1626—Handling requests for interconnection or transfer for access to memory bus based on arbitration with latency improvement by reordering requests
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring

Similar Documents

Publication	Publication Date	Title
Ausavarungnirun et al.	2015	Exploiting inter-warp heterogeneity to improve GPGPU performance
Ausavarungnirun et al.	2012	Staged memory scheduling: Achieving high performance and scalability in heterogeneous systems
Muralidhara et al.	2011	Reducing memory interference in multicore systems via application-aware memory channel partitioning
Subramanian et al.	2016	BLISS: Balancing performance, fairness and complexity in memory access scheduling
Hassan et al.	2018	Bounding DRAM interference in COTS heterogeneous MPSoCs for mixed criticality systems
Usui et al.	2016	DASH: Deadline-aware high-performance memory scheduler for heterogeneous systems with hardware accelerators
Bojnordi et al.	2012	PARDIS: A programmable memory controller for the DDRx interfacing standards
US8656401B2 (en)	2014-02-18	Method and apparatus for prioritizing processor scheduler queue operations
US20110055838A1 (en)	2011-03-03	Optimized thread scheduling via hardware performance monitoring
WO2019005105A1 (en)	2019-01-03	Speculative memory activation
US10866834B2 (en)	2020-12-15	Apparatus, method, and system for ensuring quality of service for multi-threading processor cores
Wang et al.	2016	CAF: Core to core communication acceleration framework
Abeydeera et al.	2017	SAM: Optimizing multithreaded cores for speculative parallelism
US20190243769A1 (en)	2019-08-08	System, Apparatus And Method For Dynamic Profiling In A Processor
Ausavarungnirun	2017	Techniques for shared resource management in systems with throughput processors
Kopser et al.	2011	Overview of the next generation Cray XMT
Zhu et al.	2002	Fine-grain priority scheduling on multi-channel memory systems
Asiatici et al.	2019	DynaBurst: Dynamically assemblying DRAM bursts over a multitude of random accesses
Lin et al.	2018	GPU performance vs. thread-level parallelism: Scalability analysis and a novel way to improve TLP
Ausavarungnirun et al.	2018	Holistic management of the GPGPU memory hierarchy to manage warp-level latency tolerance
US20180165200A1 (en)	2018-06-14	System, apparatus and method for dynamic profiling in a processor
Prieto et al.	2013	CMP off-chip bandwidth scheduling guided by instruction criticality
Bojnordi et al.	2013	A programmable memory controller for the DDRx interfacing standards
Elhelw et al.	2014	Time-based least memory intensive scheduling
Wang et al.	2017	Incorporating selective victim cache into GPGPU for high‐performance computing