US20080320274A1 - Age matrix for queue dispatch order - Google Patents
Age matrix for queue dispatch order
- Publication number
- US20080320274A1 (application US11/820,350)
- Authority
- US
- United States
- Prior art keywords
- queue
- dispatch
- entries
- entry
- data structure
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3802—Instruction prefetching
- G06F9/3814—Implementation provisions of instruction buffers, e.g. prefetch buffer; banks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3838—Dependency mechanisms, e.g. register scoreboarding
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Advance Control (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
Description
- An instruction scheduling queue is used to store instructions prior to execution. There are many different ways to manage the dispatch order, or age, of instructions in an instruction scheduling queue. A common queue implementation uses a first-in-first-out (FIFO) data structure. In this implementation, instruction dispatches arrive at the tail, or end, of the FIFO data structure. A look-up mechanism finds the first instruction ready for issue from the head, or start, of the FIFO data structure.
- In conventional out-of-order implementations, instructions are selected from anywhere in the FIFO data structure. This creates “holes” in the FIFO data structure at the locations of the selected instructions. To maintain absolute ordering of instruction dispatches in the FIFO data structure (e.g., for fairness), all of the remaining instructions after the selected instructions are shifted forward in the FIFO, and the data structure is collapsed to form a contiguous chain of instructions. Shifting and collapsing the remaining queue entries in this manner allows new entries to be added to the tail, or end, of the FIFO data structure. However, with a robust out-of-order issue rate, several instructions are shifted and collapsed every cycle. Hence, maintaining a contiguous sequence of queue entries without “holes” consumes a significant amount of power and processing resources.
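To make that cost concrete, the following Python sketch (purely illustrative; it does not appear in the patent) models an out-of-order issue from a collapsing FIFO: removing an instruction from the middle leaves a hole, and every younger entry is shifted forward so new dispatches can still be appended at the tail. In hardware, each shifted entry corresponds to data movement every cycle, which is the overhead the separate age matrix described below is meant to avoid.

```python
# Hypothetical model of a collapsing FIFO issue queue (not from the patent).
def issue_and_collapse(fifo, index):
    """Issue the entry at `index`, then shift every younger entry forward."""
    issued = fifo[index]
    for i in range(index, len(fifo) - 1):
        fifo[i] = fifo[i + 1]   # each move models register traffic in hardware
    fifo.pop()                  # the freed tail slot can take a new dispatch
    return issued

queue = ["i0", "i1", "i2", "i3"]
issue_and_collapse(queue, 1)    # issues "i1"; "i2" and "i3" shift toward the head
print(queue)                    # ['i0', 'i2', 'i3']
```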
- Embodiments of an apparatus are described. In one embodiment, the apparatus is an apparatus for queue allocation. An embodiment of the apparatus includes a dispatch order data structure, a bit vector, and a queue controller. The dispatch order data structure corresponds to a queue. The dispatch order data structure stores a plurality of dispatch indicators associated with a plurality of pairs of entries of the queue to indicate a write order of the entries in the queue. The bit vector stores a plurality of mask values corresponding to the dispatch indicators of the dispatch order data structure. The queue controller interfaces with the queue and the dispatch order data structure. The queue controller excludes at least some of the entries from a queue operation based on the mask values of the bit vector. Other embodiments of the apparatus are also described.
- Embodiments of a method are also described. In one embodiment, the method is a method for managing a dispatch order of queue entries in a queue. An embodiment of the method includes storing a plurality of dispatch indicators corresponding to pairs of entries in a queue. Each dispatch indicator is indicative of the dispatch order of the corresponding pair of entries. The method also includes storing a bit vector comprising a plurality of mask values corresponding to the dispatch indicators of the dispatch order data structure. The method also includes performing a queue operation on a subset of the entries in the queue. The subset excludes at least some of the entries of the queue based on the mask values of the bit vector. Other embodiments of the method are also described.
- Embodiments of a computer readable storage medium are also described. In one embodiment, the computer readable storage medium embodies a program of machine-readable instructions, executable by a digital processor, to perform operations to facilitate queue allocation. The operations include operations to store a plurality of dispatch indicators corresponding to pairs of entries in a queue. Each dispatch indicator is indicative of the dispatch order of the corresponding pair of entries. The operations also include operations to store a bit vector comprising a plurality of mask values corresponding to the dispatch indicators of the dispatch order data structure, and to perform a queue operation on a subset of the entries in the queue. The subset excludes at least some of the entries of the queue based on the mask values of the bit vector. Other embodiments of the computer readable storage medium are also described.
- Other aspects and advantages of embodiments of the present invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, which illustrate, by way of example, the principles of the invention.
- FIG. 1 depicts a schematic block diagram of one embodiment of a plurality of instruction scheduling queues with corresponding dispatch order data structures.
- FIG. 2 depicts a schematic diagram of one embodiment of a dispatch order data structure in a matrix configuration.
- FIG. 3 depicts a schematic diagram of one embodiment of a sequence of data structure states of the dispatch order data structure shown in FIG. 2.
- FIG. 4 depicts a schematic diagram of another embodiment of a dispatch order data structure with masked duplicate entries.
- FIG. 5 depicts a schematic diagram of one embodiment of a sequence of data structure states of the dispatch order data structure shown in FIG. 4.
- FIG. 6 depicts a schematic diagram of another embodiment of a dispatch order data structure in a partial matrix configuration.
- FIG. 7 depicts a schematic diagram of one embodiment of a sequence of data structure states of the dispatch order data structure shown in FIG. 6.
- FIG. 8 depicts a schematic block diagram of one embodiment of an instruction queue scheduler which uses a dispatch order data structure.
- FIG. 9 depicts a schematic flow chart diagram of one embodiment of a queue operation method for use with the instruction queue scheduler of FIG. 8.
- Throughout the description, similar reference numbers may be used to identify similar elements.
- FIG. 1 depicts a schematic block diagram of one embodiment of a plurality of instruction scheduling queues 102 with corresponding dispatch order data structures 104. In general, the instruction scheduling queues 102 store instructions, or some representative indicators of the instructions, prior to execution. The instruction scheduling queues 102 are also referred to as issue queues. The stored instructions are referred to as entries. It should be noted that although the following description references a specific type of queue (i.e., an instruction scheduling queue), embodiments may be implemented for other types of queues.
- Instead of implementing shifting and collapsing operations to continually adjust the positions of the entries in each queue 102, the dispatch order data structure 104 is kept separately from the queue. In one embodiment, each issue queue 102 is a fully-associative structure in a random access memory (RAM) device. The dispatch order data structures 104 are separate control structures to maintain the relative dispatch order, or age, of the entries in the corresponding issue queues 102. An associated instruction scheduler may be implemented as a RAM structure or, alternatively, as another type of structure.
- In one embodiment, the dispatch order data structures 104 correspond to the queues 102. Each dispatch order data structure 104 stores a plurality of dispatch indicators associated with a plurality of pairs of entries of the corresponding queue 102. Each dispatch indicator indicates a dispatch order of the entries in each pair.
- In one embodiment, the dispatch order data structure 104 stores a representation of at least a partial matrix with intersecting rows and columns. Each row corresponds to one of the entries of the queue, and each column corresponds to one of the entries of the queue. Hence, the intersections of the rows and columns correspond to the pairs of entries in the queue. Since the dispatch order data structure 104 stores dispatch, or age, information, and may be configured as a matrix, the dispatch order data structure 104 is also referred to as an age matrix.
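As a rough software model of this idea, the sketch below keeps a full N×N matrix of pairwise age bits and flips every indicator involving an entry when that entry is rewritten. The class name, method names, and the use of a full boolean matrix are illustrative assumptions, not anything mandated by the patent.

```python
# Illustrative full-matrix age tracker (a software stand-in for structure 104).
# age[a][b] == True means the queue entry in slot a is older than the entry in
# slot b; the diagonal is meaningless and is simply left False.

class AgeMatrix:
    def __init__(self, num_entries):
        self.n = num_entries
        self.age = [[False] * num_entries for _ in range(num_entries)]

    def on_write(self, entry):
        """Record that `entry` was just (re)written, making it the newest."""
        for other in range(self.n):
            if other != entry:
                self.age[other][entry] = True    # every other entry is now older
                self.age[entry][other] = False   # the rewritten entry is newest
```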
- FIG. 2 depicts a schematic diagram of one embodiment of a dispatch order data structure 110 in a matrix configuration. The dispatch order data structure 110 is associated with a specific issue queue 102. The dispatch order of the entries in the queue 102 depends on the relative age of each entry, or when the entry is written into the queue, compared to the other entries in the queue 102. The dispatch order data structure 110 provides a representation of the dispatch order for the corresponding issue queue 102.
- The illustrated dispatch order data structure 110 has four rows, designated as rows 0-3, corresponding to entries of the issue queue 102. Similarly, the dispatch order data structure has four columns, designated as columns 0-3, corresponding to the same entries of the issue queue 102. Other embodiments of the dispatch order data structure 110 may include fewer or more rows and columns, depending on the number of entries in the corresponding issue queue 102.
- The intersections between the rows and columns correspond to different pairs, or combinations, of entries in the issue queue 102. As described above, each entry of the dispatch order data structure 110 indicates a relative dispatch order, or age, of the corresponding pair of entries in the queue 102. Since there is not a relative age difference between an entry in the queue 102 and itself (i.e., where the row and column correspond to the same entry in the queue 102), the diagonal of the dispatch order data structure 110 is not used, or is masked. Masked dispatch indicators are designated by an "X."
- For the remaining entries, arrows are shown to indicate the relative dispatch order for the corresponding pairs of entries in the queue 102. As a matter of convention in FIG. 2, the arrow points toward the older entry, and away from the newer entry, in the corresponding pair of entries. Hence, a left arrow indicates that the issue queue entry corresponding to the row is older than the issue queue entry corresponding to the column. In contrast, an upward arrow indicates that the issue queue entry corresponding to the column is older than the issue queue entry corresponding to the row.
- For example, Entry_0 of the queue 102 is older than all of the other entries, as shown in the bottom row and the rightmost column of the dispatch order data structure 110 (i.e., all of the arrows point toward the older entry, Entry_0). In contrast, Entry_3 of the queue 102 is newer than all of the other entries, as shown in the top row and the leftmost column of the dispatch order data structure 110 (all of the arrows point away from the newer entry, Entry_3). By looking at all of the dispatch indicators of the dispatch order data structure 110, it can be seen that the dispatch order, from oldest to newest, of the corresponding issue queue 102 is: Entry_0, Entry_1, Entry_2, Entry_3.
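Reading FIG. 2 amounts to finding the entry that every other entry regards as older. Using the hypothetical AgeMatrix model above, that lookup could be written as follows (again, only a sketch of the idea, not the patent's logic):

```python
def oldest_entry(matrix):
    """Return the index of the least recently written entry, or None."""
    for entry in range(matrix.n):
        others = (o for o in range(matrix.n) if o != entry)
        if all(matrix.age[o][entry] for o in others):
            return entry    # every other entry records this one as older
    return None             # only happens if the matrix is inconsistent
```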
- FIG. 3 depicts a schematic diagram of one embodiment of a sequence 112 of data structure states of the dispatch order data structure 110 shown in FIG. 2. At time T0, the dispatch order data structure 110 has the same dispatch order as shown in FIG. 2 and described above. At time T1, a new entry is written in Entry_0 of the issue queue 102. As a result, the dispatch indicators of the dispatch order data structure 110 are updated to show that Entry_0 is the newest entry in the issue queue 102. Since Entry_0 was previously the oldest entry in the issue queue 102, all of the dispatch indicators for Entry_0 are updated.
- At time T2, a new entry is written in Entry_2. As a result, the dispatch indicators of the dispatch order data structure 110 are updated to show that Entry_2 is the newest entry in the issue queue 102. Since Entry_2 was previously older than Entry_3 and Entry_0 at time T1, the corresponding dispatch indicators for the pairs Entry_2/Entry_3 and Entry_2/Entry_0 are updated, or flipped. Since Entry_2 was already marked as newer than Entry_1 at time T1, the corresponding dispatch indicator for the pair Entry_2/Entry_1 is not changed.
- At time T3, a new entry is written in Entry_1. As a result, the dispatch indicators of the dispatch order data structure 110 are updated to show that Entry_1 is the newest entry in the issue queue 102. Since Entry_1 was previously the oldest entry in the issue queue 102 at time T2, all of the corresponding dispatch indicators for Entry_1 are updated, or flipped.
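The state sequence of FIG. 3 can be reproduced with the hypothetical helpers above; each write simply flips every indicator that involves the rewritten slot:

```python
m = AgeMatrix(4)
for e in (0, 1, 2, 3):      # initial fill: Entry_0 oldest ... Entry_3 newest (time T0)
    m.on_write(e)
m.on_write(0)               # time T1: Entry_0 rewritten, becomes the newest
m.on_write(2)               # time T2: Entry_2 rewritten, becomes the newest
m.on_write(1)               # time T3: Entry_1 rewritten, becomes the newest
print(oldest_entry(m))      # 3 -- Entry_3 is now the least recently written entry
```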
- FIG. 4 depicts a schematic diagram of another embodiment of a dispatch order data structure 120 with masked duplicate entries. Since the dispatch indicators above and below the masked diagonal entries are duplicates, either the top or bottom half of the dispatch order data structure 120 may be masked. In the embodiment of FIG. 4, the top portion is masked. However, other embodiments may use the top portion and mask the bottom portion.
- FIG. 5 depicts a schematic diagram of one embodiment of a sequence 122 of data structure states of the dispatch order data structure 120 shown in FIG. 4. In particular, the sequence 122 shows how the dispatch indicators in the lower portion of the dispatch order data structure 120 are changed each time an entry in the corresponding queue 102 is changed. At time T1, a new entry is written in Entry_2, and the dispatch indicator for the pair Entry_2/Entry_3 is updated. At time T2, a new entry is written in Entry_0, and the dispatch indicators for all the pairs associated with Entry_0 are updated. At time T3, a new entry is written in Entry_3, and the dispatch indicators for the pairs Entry_3/Entry_0 and Entry_3/Entry_2 are updated. At time T4, a new entry is written in Entry_1, and the dispatch indicators for all of the entries associated with Entry_1 are updated.
- FIG. 6 depicts a schematic diagram of another embodiment of a dispatch order data structure 130 in a partial matrix configuration. Instead of masking the duplicate and unused dispatch indicators, the dispatch order data structure 130 only stores one dispatch indicator for each pair of entries in the queue.
- In this embodiment, the partial matrix configuration has fewer entries, and may be stored in less memory space, than the previously described embodiments of the dispatch order data structures 110 and 120. In particular, for an issue queue 102 with a number of entries, N, the dispatch order data structure 130 may store the same number of dispatch indicators, n, as there are pairs of entries, according to the following:
- n = N(N − 1)/2
- where n designates the number of pairs of entries of the queue 102, and N designates a total number of entries in the queue 102. For example, if the queue 102 has 4 entries, then the number of pairs of entries is 6. Hence, the dispatch order data structure 130 stores six dispatch indicators, instead of 16 (i.e., a 4×4 matrix) dispatch indicators. As another example, an issue queue 102 with 16 entries has 120 unique pairs, and the corresponding dispatch order data structure 130 stores 120 dispatch indicators.
- FIG. 7 depicts a schematic diagram of one embodiment of a sequence 132 of data structure states of the dispatch order data structure 130 shown in FIG. 6. However, instead of showing the dispatch indicators as arrows, the illustrated dispatch order data structures 130 of FIG. 7 are shown as binary values. As a matter of convention, a binary "1" corresponds to a left arrow, and a binary "0" corresponds to an upward arrow. However, other embodiments may be implemented using a different convention. Other than using binary values for a limited number of dispatch indicators, the sequence 132 of queue operations for times T0-T4 is the same as described above for FIG. 5.
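One way to model the partial-matrix storage of FIGS. 6 and 7 in software is a flat list with one bit per unordered pair, with a set bit playing the role of the left arrow (row older than column). The indexing scheme and method names below are assumptions made purely for illustration:

```python
# Hypothetical flat storage for the partial (triangular) age matrix 130.
# bits[pair_index(r, c)] == 1 means entry r is older than entry c, for r > c.

def pair_index(r, c):
    """Map an unordered pair (r, c) with r > c onto 0 .. N*(N-1)/2 - 1."""
    return r * (r - 1) // 2 + c

class PartialAgeMatrix:
    def __init__(self, num_entries):
        self.n = num_entries
        self.bits = [0] * (num_entries * (num_entries - 1) // 2)

    def is_older(self, a, b):
        """True if entry a was written earlier than entry b (a != b)."""
        if a > b:
            return self.bits[pair_index(a, b)] == 1
        return self.bits[pair_index(b, a)] == 0

    def on_write(self, entry):
        """Rewriting `entry` makes it the newest: flip each pair it belongs to."""
        for other in range(self.n):
            if other == entry:
                continue
            if other > entry:
                self.bits[pair_index(other, entry)] = 1   # other is now the older one
            else:
                self.bits[pair_index(entry, other)] = 0   # other is now the older one
```

With this encoding, six bits cover a four-entry queue, matching the pair count given above.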
- FIG. 8 depicts a schematic block diagram of one embodiment of an instruction queue scheduler 140 which uses a dispatch order data structure 104 such as one of the dispatch order data structures 110, 120, or 130. In one embodiment, the scheduler 140 is implemented in a processor (not shown). The processor may be implemented in a reduced instruction set computer (RISC) design. Additionally, the processor may implement a design based on the MIPS instruction set architecture (ISA). However, alternative embodiments of the processor may implement other instruction set architectures. It should also be noted that other embodiments of the scheduler 140 may include fewer or more components than are shown in FIG. 8.
- In conjunction with the scheduler 140, the processor also may include execution units (not shown) such as an arithmetic logic unit (ALU), a floating point unit (FPU), a load/store unit (LSU), and a memory management unit (MMU). In one embodiment, each of these execution units is coupled to the scheduler 140, which schedules instructions for execution by one of the execution units. Once an instruction is scheduled for execution, the instruction may be sent to the corresponding execution unit where it is stored in an instruction queue 102.
- The illustrated scheduler 140 includes a queue 102, a mapper 142, and a queue controller 144. The mapper 142 is configured to issue one or more queue operations to insert new entries in the queue 102. In one embodiment, the mapper 142 dispatches up to two instructions per cycle to each issue queue 102. The queue controller 144 also interfaces with the queue 102 to update a dispatch order data structure 104 in response to a queue operation to insert a new entry in the queue 102.
- In order to receive two instructions per cycle, each issue queue 102 has two write ports, which are designated as Port_0 and Port_1. Alternatively, the mapper 142 may dispatch a single instruction on one of the write ports. In other embodiments, the issue queue 102 may have more than two write ports. If multiple instructions are dispatched at the same time to multiple write ports, then the write ports may have a designated order to indicate the relative dispatch order of the instructions which are issued together. For example, an instruction issued on Port_0 may be designated as older than an instruction issued in the same cycle on Port_1. In one embodiment, write addresses are generated internally in each issue queue 102.
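Under either age-tracking sketch above, the Port_0-before-Port_1 convention can be modeled simply by applying the Port_0 update first, so the Port_1 instruction ends up recorded as the newer of the two. The port names come from the description; the function itself is only an illustrative assumption:

```python
def dual_dispatch(age_matrix, port0_entry, port1_entry=None):
    """Hypothetical same-cycle dispatch on two write ports.

    The Port_0 instruction is treated as the older of the pair, so its age
    update is applied before the Port_1 update.
    """
    age_matrix.on_write(port0_entry)
    if port1_entry is not None:
        age_matrix.on_write(port1_entry)   # recorded as newer than the Port_0 entry
```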
- The queue controller 144 keeps track of the dispatch order of the entries in the issue queue 102 to determine which entries can be overwritten (or evicted). In order to track the dispatch order of the entries in the queue 102, the queue controller 144 includes dispatch logic 146 with least recently used (LRU) logic 148. The queue controller 144 also includes a bit mask vector 150 and an age matrix flop bank 152. In one embodiment, the flop bank 152 includes a plurality of flip-flops. Each flip-flop stores a bit value indicative of the dispatch order of the entries of a corresponding pair of entries. In other words, each flip-flop corresponds to a dispatch indicator, and the flop bank 152 implements the dispatch order data structure 104. The bit value of each flip-flop is a binary bit value. In one embodiment, a logical high value of the binary bit value indicates one dispatch order of the pair of entries (e.g., the corresponding row is older than the corresponding column), and a logical low value of the binary bit value indicates a reverse dispatch order of the pair of entries (e.g., the corresponding column is older than the corresponding row). When a dispatch indicator is updated in response to a new instruction written to the queue 102, the dispatch logic 146 is configured to potentially flip the binary bit value for the corresponding dispatch indicators. As described above, the number of flip-flops in the flop bank 152 may be determined by the number of pairs (e.g., combinations) of entries in the queue 102.
- In order to determine which entries may be overwritten in the queue 102, the dispatch logic 146 includes least recently used (LRU) logic 148 to implement an LRU replacement strategy. In one embodiment, the LRU replacement strategy is based, at least in part, on the dispatch indicators of the corresponding dispatch order data structure 104 implemented by the flop bank 152. As examples, the LRU logic 148 may implement a true LRU replacement strategy or a pseudo-LRU replacement strategy. In a true LRU replacement strategy, the LRU entries in the queue 102 are replaced. The LRU entries are designated by LRU replacement addresses. However, generating the LRU replacement addresses, which is a serial operation, can be logically complex. A pseudo-LRU replacement strategy approximates the true LRU replacement strategy using a less complicated implementation.
- When the mapper dispatches a new entry to the queue 102 as a part of a queue operation, the queue 102 interfaces with the queue controller 144 to determine which existing entry to discard to make room for the newly dispatched entry. In some embodiments, the dispatch logic 146 uses the age matrix flop bank 152 to determine which entry to replace based on the absolute dispatch order of the entries in the queue 102. However, in other embodiments, it may be useful to identify an entry to discard from among a subset of the entries in the queue 102.
- As one example, some entries in the queue 102 may be associated with a replay operation, so it may be useful to maintain the corresponding entries in the queue 102, regardless of the dispatch order of the entries. Thus, the entry to be discarded may be selected from a subset that excludes the entries associated with the replay operation.
- As another example, it may be useful to maintain certain entries in the queue 102 in order to prevent a hazard event such as a structural, data, or control hazard. Thus, the entry to be discarded may be selected from a subset that excludes the entries that, if discarded, would potentially create a hazard event.
- As another example, it may be useful to preserve entries of the queue 102 that are related to a particular thread of a multi-threaded processing system. Thus, the entry to be discarded may be selected from a subset that excludes entries related to the identified thread. In this way, the preserved entries corresponding to the identified thread are given priority, because the entries associated with the thread are not discarded.
- In order to identify a subset of the entries in the queue 102, the queue controller 144 may use one or more bit mask vectors 150. In one embodiment, each bit mask vector 150 is used to mask out one or more dispatch indicators of a dispatch order data structure 104 such as the age matrix flop bank 152. In other words, each bit mask vector 150 (or bit vector) is configured to store a plurality of mask values corresponding to the dispatch indicators of the dispatch order data structure 104. Thus, the queue controller 144 can exclude at least some of the entries of the queue 102 from a queue operation based on the mask values of the bit vector 150. For example, instead of selecting the absolute oldest entry of the queue 102 to be discarded, the dispatch logic 146 may select the oldest entry of the subset of entries that are not masked by the bit mask vector 150. In an alternative embodiment, the bit mask vector 150 is used to identify entries that may be discarded in a dispatch operation, rather than entries to be maintained in the queue 102 (i.e., excluded from potential discarding) in a dispatch operation.
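A hedged sketch of that masked selection follows, reusing the PartialAgeMatrix model from above. For simplicity, the mask is expressed here as a per-entry eligibility flag derived from the bit mask vector, rather than as per-indicator masking; entries tied to a replay, a potential hazard, or a protected thread would simply be marked ineligible:

```python
def select_victim(age_matrix, eligible):
    """Pick the oldest entry among those the mask allows to be evicted.

    `eligible[i]` is True when entry i may be discarded; a result of None
    means every entry was masked off for this operation.
    """
    victim = None
    for entry in range(age_matrix.n):
        if not eligible[entry]:
            continue
        if victim is None or age_matrix.is_older(entry, victim):
            victim = entry
    return victim
```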
- FIG. 9 depicts a schematic flow chart diagram of one embodiment of a queue operation method 160 for use with the instruction queue scheduler 140 of FIG. 8. Although the queue operation method 160 is described with reference to the instruction queue scheduler 140 of FIG. 8, other embodiments may be implemented in conjunction with other schedulers.
- In the illustrated queue operation method 160, the queue controller 144 initializes 162 the dispatch order data structure 104. As described above, the queue controller 144 may initialize the dispatch order data structure 104 with a plurality of dispatch indicators based on the dispatch order of the entries in the queue 102. In this way, the dispatch order data structure 104 maintains an absolute dispatch order for the queue 102 to indicate the order in which the entries are written into the queue 102. Although some embodiments are described as using a particular type of dispatch order data structure 104 such as the age matrix, other embodiments may use other implementations of the dispatch order data structure.
- The illustrated queue operation method 160 continues as the queue 102 receives 164 a command for a queue operation such as a dispatch operation. As explained above, the queue controller 144 selects an existing entry of the queue 102 to be discarded from all of the entries in the queue 102 or from a subset of the entries in the queue 102. In order to identify a subset of the entries in the queue 102, the queue controller 144 determines 166 whether there is a bit mask vector 150 to use with the received queue operation. If there is a bit mask vector 150, then the dispatch logic 146 applies 168 the bit mask vector 150 to the dispatch order data structure 104 before executing 170 the queue operation. In this situation, the set of candidate entries which may be discarded from the queue 102 is limited to some subset of the entries in the queue 102. Otherwise, if there is not an applicable bit mask vector 150, then the dispatch logic 146 may directly execute 170 the queue operation. In this situation, the set of candidate entries which may be discarded from the queue 102 is not limited to a subset of the entries in the queue 102. After executing 170 the queue operation, the dispatch logic 146 updates 172 the dispatch order data structure 104, and the depicted queue operation method 160 ends.
- It should be noted that embodiments of the methods, operations, functions, and/or logic may be implemented in software, firmware, hardware, or some combination thereof. Additionally, some embodiments of the methods, operations, functions, and/or logic may be implemented using a hardware or software representation of one or more algorithms related to the operations described above. To the degree that an embodiment may be implemented in software, the methods, operations, functions, and/or logic are stored on a computer-readable medium and accessible by a computer processor.
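In that spirit, a rough, purely illustrative software approximation of the method 160 flow, built from the earlier sketches (all function and variable names here are invented), might look like this:

```python
def handle_dispatch(queue, age_matrix, new_instruction, mask=None):
    """Approximate flow of method 160: apply any mask, pick a victim by age,
    overwrite the chosen slot, then update the age matrix for that slot."""
    eligible = mask if mask is not None else [True] * age_matrix.n
    victim = select_victim(age_matrix, eligible)    # steps 166/168: masked selection
    if victim is None:
        return None                                 # every entry is protected
    queue[victim] = new_instruction                 # step 170: execute the operation
    age_matrix.on_write(victim)                     # step 172: update the age matrix
    return victim
```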
- As one example, an embodiment may be implemented as a computer readable storage medium embodying a program of machine-readable instructions, executable by a digital processor, to perform operations to facilitate queue allocation. The operations may include operations to store a plurality of dispatch indicators corresponding to pairs of entries in a queue. Each dispatch indicator is indicative of the dispatch order of the corresponding pair of entries. The operations also include operations to store a bit vector comprising a plurality of mask values corresponding to the dispatch indicators of the dispatch order data structure, and to perform a queue operation on a subset of the entries in the queue. The subset excludes at least some of the entries of the queue based on the mask values of the bit vector. Other embodiments of the computer readable storage medium may facilitate fewer or more operations.
- Embodiments of the invention also may involve a number of functions to be performed by a computer processor such as a central processing unit (CPU), a graphics processing unit (GPU), or a microprocessor. The microprocessor may be a specialized or dedicated microprocessor that is configured to perform particular tasks by executing machine-readable software code that defines the particular tasks. The microprocessor also may be configured to operate and communicate with other devices such as direct memory access modules, memory storage devices, Internet related hardware, and other devices that relate to the transmission of data. The software code may be configured using software formats such as Java, C++, XML (Extensible Mark-up Language) and other languages that may be used to define functions that relate to operations of devices required to carry out the functional operations described herein. The code may be written in different forms and styles, many of which are known to those skilled in the art. Different code formats, code configurations, styles and forms of software programs and other means of configuring code to define the operations of a microprocessor may be implemented.
- Within the different types of computers, such as computer servers, that utilize the invention, there exist different types of memory devices for storing and retrieving information while performing some or all of the functions described herein. In some embodiments, the memory/storage device where data is stored may be a separate device that is external to the processor, or may be configured in a monolithic device, where the memory or storage device is located on the same integrated circuit, such as components connected on a single substrate. Cache memory devices are often included in computers for use by the CPU or GPU as a convenient storage location for information that is frequently stored and retrieved. Similarly, a persistent memory is also frequently used with such computers for maintaining information that is frequently retrieved by a central processing unit, but that is not often altered within the persistent memory, unlike the cache memory. Main memory is also usually included for storing and retrieving larger amounts of information such as data and software applications configured to perform certain functions when executed by the central processing unit. These memory devices may be configured as random access memory (RAM), static random access memory (SRAM), dynamic random access memory (DRAM), flash memory, and other memory storage devices that may be accessed by a central processing unit to store and retrieve information. Embodiments may be implemented with various memory and storage devices, as well as any commonly used protocol for storing and retrieving information to and from these memory devices respectively.
- Although the operations of the method(s) herein are shown and described in a particular order, the order of the operations of each method may be altered so that certain operations may be performed in an inverse order or so that certain operations may be performed, at least in part, concurrently with other operations. In another embodiment, instructions or sub-operations of distinct operations may be implemented in an intermittent and/or alternating manner.
- Although specific embodiments of the invention have been described and illustrated, the invention is not to be limited to the specific forms or arrangements of parts so described and illustrated. The scope of the invention is to be defined by the claims appended hereto and their equivalents.
Claims (30)
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US11/820,350 US20080320274A1 (en) | 2007-06-19 | 2007-06-19 | Age matrix for queue dispatch order |
| US11/830,727 US8285974B2 (en) | 2007-06-19 | 2007-07-30 | Age matrix for queue entries dispatch order |
| US11/847,170 US20080320016A1 (en) | 2007-06-19 | 2007-08-29 | Age matrix for queue dispatch order |
| PCT/US2008/007723 WO2009088396A2 (en) | 2007-06-19 | 2008-06-19 | Age matrix for queue dispatch order |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US11/820,350 US20080320274A1 (en) | 2007-06-19 | 2007-06-19 | Age matrix for queue dispatch order |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US11/830,727 Continuation US8285974B2 (en) | 2007-06-19 | 2007-07-30 | Age matrix for queue entries dispatch order |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| US20080320274A1 (en) | 2008-12-25 |
Family ID: 40137740
Family Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US11/820,350 Abandoned US20080320274A1 (en) | 2007-06-19 | 2007-06-19 | Age matrix for queue dispatch order |
| US11/830,727 Expired - Fee Related US8285974B2 (en) | 2007-06-19 | 2007-07-30 | Age matrix for queue entries dispatch order |
Family Applications After (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US11/830,727 Expired - Fee Related US8285974B2 (en) | 2007-06-19 | 2007-07-30 | Age matrix for queue entries dispatch order |
Country Status (1)
| Country | Link |
|---|---|
| US (2) | US20080320274A1 (en) |
Families Citing this family (21)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR101658035B1 (en) * | 2010-03-12 | 2016-10-04 | Samsung Electronics Co., Ltd. | Virtual machine monitor and scheduling method of virtual machine monitor |
| US8656401B2 (en) * | 2011-05-13 | 2014-02-18 | Advanced Micro Devices, Inc. | Method and apparatus for prioritizing processor scheduler queue operations |
| US9569222B2 (en) | 2014-06-17 | 2017-02-14 | International Business Machines Corporation | Implementing out of order processor instruction issue queue |
| US10102002B2 (en) * | 2014-09-30 | 2018-10-16 | International Business Machines Corporation | Dynamic issue masks for processor hang prevention |
| US10409609B2 (en) | 2015-12-14 | 2019-09-10 | International Business Machines Corporation | Age management logic |
| US10031757B2 (en) * | 2016-02-12 | 2018-07-24 | International Business Machines Corporation | Operation of a multi-slice processor implementing a mechanism to overcome a system hang |
| US9983879B2 (en) | 2016-03-03 | 2018-05-29 | International Business Machines Corporation | Operation of a multi-slice processor implementing dynamic switching of instruction issuance order |
| US11086628B2 (en) * | 2016-08-15 | 2021-08-10 | Advanced Micro Devices, Inc. | System and method for load and store queue allocations at address generation time |
| US10387147B2 (en) | 2017-08-02 | 2019-08-20 | International Business Machines Corporation | Managing an issue queue for fused instructions and paired instructions in a microprocessor |
| US10802829B2 (en) | 2017-11-30 | 2020-10-13 | International Business Machines Corporation | Scalable dependency matrix with wake-up columns for long latency instructions in an out-of-order processor |
| US10901744B2 (en) | 2017-11-30 | 2021-01-26 | International Business Machines Corporation | Buffered instruction dispatching to an issue queue |
| US10929140B2 (en) | 2017-11-30 | 2021-02-23 | International Business Machines Corporation | Scalable dependency matrix with a single summary bit in an out-of-order processor |
| US10884753B2 (en) | 2017-11-30 | 2021-01-05 | International Business Machines Corporation | Issue queue with dynamic shifting between ports |
| US10572264B2 (en) | 2017-11-30 | 2020-02-25 | International Business Machines Corporation | Completing coalesced global completion table entries in an out-of-order processor |
| US10564979B2 (en) | 2017-11-30 | 2020-02-18 | International Business Machines Corporation | Coalescing global completion table entries in an out-of-order processor |
| US10564976B2 (en) | 2017-11-30 | 2020-02-18 | International Business Machines Corporation | Scalable dependency matrix with multiple summary bits in an out-of-order processor |
| US10789013B2 (en) * | 2018-03-01 | 2020-09-29 | Seagate Technology Llc | Command scheduling for target latency distribution |
| US11106469B2 (en) | 2019-08-14 | 2021-08-31 | International Business Machines Corporation | Instruction selection mechanism with class-dependent age-array |
| US10963402B1 (en) * | 2019-12-28 | 2021-03-30 | Advanced Micro Devices, Inc. | Using age matrices for managing entries in sub-queues of a queue |
| US11182164B1 (en) | 2020-07-23 | 2021-11-23 | International Business Machines Corporation | Pairing issue queues for complex instructions and instruction fusion |
| US12287964B2 (en) | 2022-06-09 | 2025-04-29 | Samsung Electronics Co., Ltd. | System and method for managing queues in systems with high parallelism |
Patent Citations (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5933627A (en) * | 1996-07-01 | 1999-08-03 | Sun Microsystems | Thread switch on blocked load or store using instruction thread field |
| US6065105A (en) * | 1997-01-08 | 2000-05-16 | Intel Corporation | Dependency matrix |
| US6237079B1 (en) * | 1997-03-30 | 2001-05-22 | Canon Kabushiki Kaisha | Coprocessor interface having pending instructions queue and clean-up queue and dynamically allocating memory |
| US6324640B1 (en) * | 1998-06-30 | 2001-11-27 | International Business Machines Corporation | System and method for dispatching groups of instructions using pipelined register renaming |
| US6334182B2 (en) * | 1998-08-18 | 2001-12-25 | Intel Corp | Scheduling operations using a dependency matrix |
| US6785802B1 (en) * | 2000-06-01 | 2004-08-31 | Stmicroelectronics, Inc. | Method and apparatus for priority tracking in an out-of-order instruction shelf of a high performance superscalar microprocessor |
| US6721874B1 (en) * | 2000-10-12 | 2004-04-13 | International Business Machines Corporation | Method and system for dynamically shared completion table supporting multiple threads in a processing system |
| US20030093509A1 (en) * | 2001-10-05 | 2003-05-15 | Li Raymond M. | Storage area network methods and apparatus with coordinated updating of topology representation |
| US6732242B2 (en) * | 2002-03-28 | 2004-05-04 | Intel Corporation | External bus transaction scheduling system |
| US7266675B2 (en) * | 2003-04-21 | 2007-09-04 | International Business Machines Corporation | Processor including a register file and method for computing flush masks in a multi-threaded processing system |
| US20040243743A1 (en) * | 2003-05-30 | 2004-12-02 | Brian Smith | History FIFO with bypass |
| US20060184738A1 (en) * | 2005-02-17 | 2006-08-17 | Bridges Jeffrey T | Unaligned memory access prediction |
| US20080320478A1 (en) * | 2007-06-19 | 2008-12-25 | Raza Microelectronics, Inc. | Age matrix for queue dispatch order |
| US20080320016A1 (en) * | 2007-06-19 | 2008-12-25 | Raza Microelectronics, Inc. | Age matrix for queue dispatch order |
Cited By (25)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8285974B2 (en) | 2007-06-19 | 2012-10-09 | Netlogic Microsystems, Inc. | Age matrix for queue entries dispatch order |
| US20080320478A1 (en) * | 2007-06-19 | 2008-12-25 | Raza Microelectronics, Inc. | Age matrix for queue dispatch order |
| US20080320016A1 (en) * | 2007-06-19 | 2008-12-25 | Raza Microelectronics, Inc. | Age matrix for queue dispatch order |
| US8489863B2 (en) * | 2009-04-03 | 2013-07-16 | International Business Machines Corporation | Processor including age tracking of issue queue instructions |
| US20120260069A1 (en) * | 2009-04-03 | 2012-10-11 | Ibm Corporation | Processor Including Age Tracking of Issue Queue Instructions |
| US8380964B2 (en) | 2009-04-03 | 2013-02-19 | International Business Machines Corporation | Processor including age tracking of issue queue instructions |
| US20110185159A1 (en) * | 2009-04-03 | 2011-07-28 | International Business Machines Corporation | Processor including age tracking of issue queue instructions |
| US11269692B2 (en) * | 2011-12-29 | 2022-03-08 | Oracle International Corporation | Efficient sequencer for multiple concurrently-executing threads of execution |
| US20140129806A1 (en) * | 2012-11-08 | 2014-05-08 | Advanced Micro Devices, Inc. | Load/store picker |
| US9870231B2 (en) | 2015-07-27 | 2018-01-16 | International Business Machines Corporation | Age based fast instruction issue |
| US9880850B2 (en) | 2015-07-27 | 2018-01-30 | International Business Machines Corporation | Age based fast instruction issue |
| US9965286B2 (en) | 2015-07-27 | 2018-05-08 | International Business Machines Corporation | Age based fast instruction issue |
| US11106467B2 (en) | 2016-04-28 | 2021-08-31 | Microsoft Technology Licensing, Llc | Incremental scheduler for out-of-order block ISA processors |
| US11687345B2 (en) | 2016-04-28 | 2023-06-27 | Microsoft Technology Licensing, Llc | Out-of-order block-based processors and instruction schedulers using ready state data indexed by instruction position identifiers |
| US11449342B2 (en) | 2016-04-28 | 2022-09-20 | Microsoft Technology Licensing, Llc | Hybrid block-based processor and custom function blocks |
| US10601723B2 (en) * | 2018-04-12 | 2020-03-24 | Advanced Micro Devices, Inc. | Bandwidth matched scheduler |
| WO2019199817A1 (en) * | 2018-04-12 | 2019-10-17 | Advanced Micro Devices, Inc. | Queue in a network switch |
| US11294678B2 (en) | 2018-05-29 | 2022-04-05 | Advanced Micro Devices, Inc. | Scheduler queue assignment |
| US11016829B2 (en) * | 2019-07-26 | 2021-05-25 | Toshiba Memory Corporation | Two-layered deterministic interprocess communication scheduler for input output determinism in solid state drives |
| CN114521253A (en) * | 2019-07-26 | 2022-05-20 | Kioxia Corporation | Dual-layer deterministic inter-process communication scheduler for input-output determinism in solid state drives |
| US20210026713A1 (en) * | 2019-07-26 | 2021-01-28 | Toshiba Memory Corporation | Two-layered deterministic interprocess communication scheduler for input output determinism in solid state drives |
| US11334384B2 (en) * | 2019-12-10 | 2022-05-17 | Advanced Micro Devices, Inc. | Scheduler queue assignment burst mode |
| US11948000B2 (en) | 2020-10-27 | 2024-04-02 | Advanced Micro Devices, Inc. | Gang scheduling for low-latency task synchronization |
| US20250085974A1 (en) * | 2023-01-06 | 2025-03-13 | Beijing Vcore Technology Co., Ltd. | Method and device for selecting entry of queue in out-of-order processor |
| US12333308B2 (en) * | 2023-01-06 | 2025-06-17 | Beijing Vcore Technology Co., Ltd. | Method and device for selecting entry of queue in out-of-order processor |
Also Published As
| Publication number | Publication date |
|---|---|
| US8285974B2 (en) | 2012-10-09 |
| US20080320478A1 (en) | 2008-12-25 |
Similar Documents
| Publication | Title |
|---|---|
| US8285974B2 (en) | Age matrix for queue entries dispatch order |
| US7711935B2 (en) | Universal branch identifier for invalidation of speculative instructions |
| CN113703834B (en) | Block-based processor core compound register |
| KR102659813B1 (en) | Handling move instructions using register renaming |
| US6931639B1 (en) | Method for implementing a variable-partitioned queue for simultaneous multithreaded processors |
| CN102934076B (en) | Instruction issue and control device and method |
| US20040210743A1 (en) | Dynamically shared group completion table between multiple threads |
| WO2009088396A2 (en) | Age matrix for queue dispatch order |
| US20110087865A1 (en) | Intermediate Register Mapper |
| US20150154022A1 (en) | Soft-Partitioning of a Register File Cache |
| US11132202B2 (en) | Cache control circuitry and methods |
| US20040215936A1 (en) | Method and circuit for using a single rename array in a simultaneous multithread system |
| US10437599B2 (en) | System and method of reducing processor pipeline stall caused by full load queue |
| JP2022500749A (en) | Controlling access to the branch prediction unit for a sequence of fetch groups |
| US20170139751A1 (en) | Scheduling method and processing device using the same |
| US20090138680A1 (en) | Vector atomic memory operations |
| KR20130127437A (en) | Method and apparatus for floating point register caching |
| US10430342B2 (en) | Optimizing thread selection at fetch, select, and commit stages of processor core pipeline |
| CN115867888A (en) | Method and system for utilizing master-shadow physical register files |
| CN117501254A (en) | Providing atomicity for complex operations using near-memory computation |
| US7130990B2 (en) | Efficient instruction scheduling with lossy tracking of scheduling information |
| US11977782B2 (en) | Approach for enabling concurrent execution of host memory commands and near-memory processing commands |
| US6212619B1 (en) | System and method for high-speed register renaming by counting |
| US6959377B2 (en) | Method and system for managing registers |
| US20080010440A1 (en) | Means for supporting and tracking a large number of in-flight stores in an out-of-order processor |
Legal Events
| Code | Title | Description |
|---|---|---|
| AS | Assignment | Owner name: RAZA MICROELECTRONICS, INC, CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: SINGH, GUARAV; SRINIVASAN, SRIVATSAN; WONG, LINTSUNG; SIGNING DATES FROM 20070619 TO 20070719; REEL/FRAME: 020013/0777 |
| AS | Assignment | Owner name: RMI CORPORATION, CALIFORNIA. Free format text: CHANGE OF NAME; ASSIGNOR: RAZA MICROELECTRONICS, INC.; REEL/FRAME: 020886/0454. Effective date: 20071217 |
| AS | Assignment | Owner name: NETLOGIC MICROSYSTEMS, INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: RMI CORPORATION; REEL/FRAME: 023926/0338. Effective date: 20091229 |
| STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
| AS | Assignment | Owner name: BROADCOM CORPORATION, CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: NETLOGIC I LLC; REEL/FRAME: 035443/0763. Effective date: 20150327. Owner name: NETLOGIC I LLC, DELAWARE. Free format text: CHANGE OF NAME; ASSIGNOR: NETLOGIC MICROSYSTEMS, INC.; REEL/FRAME: 035443/0824. Effective date: 20130123 |
| AS | Assignment | Owner name: BANK OF AMERICA, N.A., AS COLLATERAL AGENT, NORTH CAROLINA. Free format text: PATENT SECURITY AGREEMENT; ASSIGNOR: BROADCOM CORPORATION; REEL/FRAME: 037806/0001. Effective date: 20160201 |
| AS | Assignment | Owner name: AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD., SINGAPORE. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: BROADCOM CORPORATION; REEL/FRAME: 041706/0001. Effective date: 20170120 |
| AS | Assignment | Owner name: BROADCOM CORPORATION, CALIFORNIA. Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS; ASSIGNOR: BANK OF AMERICA, N.A., AS COLLATERAL AGENT; REEL/FRAME: 041712/0001. Effective date: 20170119 |