WO2025128115A1 - Address space support for point-to-point cache coherency - Google Patents
- Publication number
- WO2025128115A1 (PCT/US2023/084209)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cores
- core
- address
- cache coherency
- memory
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
Definitions
- This specification relates to systems having integrated circuit devices.
- a cache is a device that stores data retrieved from memory or data to be written to memory for one or more different hardware devices in a system.
- the hardware devices can be different components integrated into a system on a chip (SOC) or a system that includes cores on several different chips.
- the devices that provide read requests and write requests through caches will be referred to as client devices.
- a multiprocessor system can have multiple processing devices that each have a separate cache.
- a multiprocessor system can have multiple copies of any shared data, i.e., in a shared memory as well as in a memory for each cache. In order to maintain cache coherency, when one copy of data is changed, the other copies must also be changed or invalidated in the other caches that share the same data.
- a cache coherency system can keep data consistent among caches by communicating with various caches in the multiprocessor system that may have access to a particular piece of data when the data is updated.
- performing cache coherency processes can be expensive in multiple ways. For example, performing a cache coherency process can introduce more complex hardware, high latency, and wasted operations, e.g., when a cache does not have access to the particular piece of data that is the subject of the coherency process.
- This specification describes techniques for modifying addresses in a way that triggers a cache coherency system to perform a reduced cache coherency process. For example, when data that needs to be updated is shared among only two cores in a plurality of cores, the cache coherency system can communicate with only the caches associated with those two cores. Additionally, when the data that needs to be updated is accessed by only a single core, the cache coherency system can communicate with only the cache associated with that core.
- the reduced cache coherency process checks cache coherency on fewer cores than the full cache coherency process.
- one or more of the cores are configured to execute instructions to implement an operating system, and the operating system is configured to perform operations comprising: maintaining a mapping between programs, cores that the programs execute on, and respective regions of memory that the programs share; and populating the reserved field of addresses with core identifiers whenever a physical address must be calculated for use by a program identified in the mapping as running on one or two cores.
- the operations further comprise: receiving a request to move a program from a first core and a second core to a third core and a fourth core; and recalculating the reserved field addresses with core identifiers associated with the third and fourth core.
- the address is a physical address
- the reserved field of the address occupies fewer than all bits of the physical address
- each program shares one or more regions of memory with one or more of each of the other programs.
- the cores in the plurality of cores are divided into pairs of cores, where each core shares one or more regions of memory with its paired core.
- the operations further comprise assigning memory sharing programs to each of the pairs of cores.
- the operations further comprise: receiving a second address from one of the plurality of cores; determining that a reserved field of the address specifies a pair of clusters of cores of the plurality of cores that share a region of memory; and in response, performing a reduced cache coherency process between the cores in the pair of clusters of cores for the plurality of cores.
- the operations further comprise: receiving a second address from one of the plurality of cores; determining that the reserved field of the address specifies a single core; and in response, bypassing a cache coherency process for the plurality of cores.
- Cache coherency can be a complex process that requires a large amount of circuitry and delay time.
- the cache coherency system can reduce coherency traffic for multiprocessor systems. When data is not shared among all cores in a multiprocessor system with multiple cores, the system can use an address that has a reserved field as an indication that the cache coherency system should perform only a reduced cache coherency process.
- the reduced cache coherency process checks cache coherency on fewer cores than a full cache coherency process does and is thus faster to perform, requires less complex operations, and generates less inter-core communication traffic.
- FIG. 1 is a diagram of an example system.
- FIG. 2 is a diagram illustrating an example address that has a reserved field that specifies a pair of cores.
- FIG. 3 is a flow chart illustrating an example process for determining a type of cache coherency process to perform.
- FIG. 4 is a flow chart illustrating an example process for populating an address based on a memory request.
- FIG. 1 is a diagram of an example system 100.
- the system 100 includes a system on a chip (SOC) 102 communicatively coupled to a memory device 118.
- the SOC 102 has multiple client devices 110a-n that each have an associated local cache 112a-n, and a cache coherency subsystem 114.
- Each local cache 112a-n services memory requests from its associated client device 110a-n.
- the cache coherency subsystem 114 is a communication subsystem that performs a cache coherency process whenever data in any local cache 112a-n is updated or data is requested from a cache.
- the techniques described in this specification can also be used for systems having additional layers of caches.
- the system 100 can modify some addresses in a way that triggers the cache coherency subsystem 114 to perform a reduced cache coherency process.
- the cache coherency subsystem 114 can use a reserved space in an address to identify the client devices for which a reduced cache coherency process should be performed. For example, when data that needs to be updated is shared among only two client devices out of the multiple client devices 110a-n, the system can indicate this by populating the reserved field of addresses used by the client devices, which signals the cache coherency system to communicate with only the caches associated with those two client devices when performing cache coherency maintenance.
- the SOC 102 is an example of a device that can be installed on or integrated into any appropriate computing device, which may be referred to as a host device.
- the SOC 102 can be installed on a mobile host device, e.g., a smart phone, a smart watch or another wearable computing device, a tablet computer, or a laptop computer, to name just a few examples.
- the SOC 102 has multiple client devices 110a-n.
- Each of the client devices 110a-n can have one or more cores that implement any appropriate module, device, or functional component that is configured to read and store data in the memory device.
- a client device can be a CPU, a DMA controller, or lower-level components of the SOC itself.
- the system 100 can include an operating system that can be configured to modify addresses in a way that indicates that one or more client devices share a region of memory.
- the client devices 110a-n can be configured to execute instructions to implement an operating system.
- the operating system can be configured to maintain mappings between programs or client devices 110a-n that share respective regions of memory, e.g., one mapping between regions accessible to one or more client devices for each region of memory.
- the shared regions of memory can, for example, be distinguished from one another using a memory region identifier.
- a memory region identifier can be an abstraction for contiguous data that can be mapped into a region of an address space.
- a memory region identifier can represent data in a memory storage device, such as the memory storage device 118.
- the operating system can be configured to populate the page table with addresses having reserved fields set with core identifiers.
- a core identifier can be a value to identify a set of one or more devices that use a particular cache. For example, each client device 110a-n can have multiple computing cores. But if a particular client device uses only one local cache, a single core identifier can be used for that client device. Alternatively or in addition, each client device can have a single computing core, in which case each client device can have a separate and distinct core identifier. In the case where a client device has multiple cores that use multiple caches respectively, a single client device can be associated with multiple core identifiers to distinguish between the cores that use those caches.
- a reserved field of addresses can be allocated within an otherwise unused bit field within the address.
- the cache coherency subsystem 114 is a communications subsystem of the SOC 102 that ensures that all data stored in the local caches 112a-n remains coherent. In other words, the cache coherency subsystem 114 strives to give each cache a same view of values in memory 118. The values in the caches can be different from corresponding values in memory, e.g., in the case of a cached write not yet being flushed, but the caches themselves should have the same view of such unflushed values. Whenever data is updated in one local cache 112a-n, the cache coherency subsystem can perform a cache coherency process to ensure that any other cache with access to that data does not contain a differing version of the data.
- the cache coherency subsystem 114 includes communications pathways that allow the client devices 110a-n to communicate with one another as well as to make requests to read and write data using the memory device 140.
- the cache coherency subsystem 114 can include any appropriate combination of communications hardware, e.g., buses or dedicated interconnect circuitry.
- the cache coherency subsystem 114 can be configured to receive an address from a first client device and determine if a reserved field of the address specifies a pair, or a subset, of client devices that share a region of memory not shared with any other client devices. If the reserved field of the address specifies a pair of client devices, the cache coherency subsystem 114 can perform a reduced cache coherency process between the pair of client devices without needing to communicate with any other client devices in the SOC 102. The reduced cache coherency process checks cache coherency on fewer client devices than a full cache coherency process that would check cache coherency in all local caches 112a-n.
- the cache coherency subsystem 114 can perform a reduced cache coherency process that checks cache coherency for only the pair of client devices.
- the reserved field can also specify one client device. When the reserved field specifies one client device, the cache coherency subsystem bypasses performing a cache coherency process.
- the cache coherency subsystem 114 can perform a full cache coherency process that checks cache coherency for all client devices.
- the addresses can be physical addresses, and the reserved field of the address can occupy fewer than all bits of the physical address. For example, if the reserved field is unpopulated, the address can read 0x0000_ppppp. Using the same format, the reserved field can identify a client device labelled j and a client device labelled k by populating the bits of the address as 0x0jk0_ppppp. When the reserved field identifies a single client device, for example a client device j, the address reads 0x0jj0_ppppp.
- the reserved field can also identify clusters of client devices.
- the reserved field of the address specifies a pair of clusters of client devices of the plurality of client devices that share a region of memory.
- the cache coherency subsystem 114 can perform a reduced cache coherency process between the client devices in the pair of clusters, without checking cache coherency for client devices that are not a part of either cluster.
- the caches 112a-n are positioned in the data pathway between the client devices 110a-n and the memory controller 130.
- the memory controller 130 can handle requests to and from the memory device 140.
- FIG. 2 is a diagram illustrating an example physical address 200 that has a reserved field that specifies a pair of cores.
- the address is drawn from an address space 202 for a 64-bit system.
- the address space 202 includes a reserved field for a first core identifier 204 and a second core identifier 206 that share a region of memory as well as a field for a physical page 208 and offset bits 210.
- the field for the physical page 208 identifies a page that is shared among only two cores and the reserved field for the first core identifier 204 can identify one of the cores while the second core identifier 206 can identify the second core.
- the reserved field of the address can occupy fewer than all bits of the physical address. For example, if the reserved field is unpopulated, the address can read 0x0000_ppppp. Using the same format, the reserved field can identify a core labeled j and a core labeled k by populating the bits of the address as 0x0jk0_ppppp, where the first core identifier 204 is populated by j and the second core identifier 206 is populated by k. When a page is only shared with one core, for example a core j, the address reads 0x0jj0_ppppp and both core identifiers are populated by j.
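The packing described above can be illustrated with a short Python sketch. The bit positions (4-bit identifiers at hypothetical shifts 28 and 24 over a 20-bit page-plus-offset field, mirroring the 0x0jk0_ppppp example) are assumptions for illustration; the patent does not fix an exact layout.

```python
# Illustrative layout matching the 0x0jk0_ppppp example: two 4-bit core
# identifiers in high nibbles, 20 low bits of page number plus offset.
# The exact bit positions are an assumption, not specified by the source.
J_SHIFT, K_SHIFT, ID_MASK = 28, 24, 0xF
LOW_MASK = (1 << 20) - 1

def encode(phys: int, core_j: int, core_k: int) -> int:
    """Pack two core identifiers into otherwise-unused address bits."""
    assert core_j <= ID_MASK and core_k <= ID_MASK
    return (phys & LOW_MASK) | (core_j << J_SHIFT) | (core_k << K_SHIFT)

def decode(addr: int) -> tuple:
    """Recover (core_j, core_k, physical bits) from a tagged address."""
    return ((addr >> J_SHIFT) & ID_MASK,
            (addr >> K_SHIFT) & ID_MASK,
            addr & LOW_MASK)
```

For instance, encode(0x12345, 3, 5) yields 0x35012345, i.e., the 0x0jk0_ppppp pattern with j = 3 and k = 5.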
- FIG. 3 is a flow chart illustrating an example process for determining a type of cache coherency process to perform.
- the example process 300 can be performed by the components of the SOC 102, specifically the cache coherency subsystem 114.
- the cache coherency subsystem 114 receives an address from a first core of a plurality of cores (Step 310).
- the cores can be cores of a multiprocessor system that each have an associated local cache.
- the cache coherency subsystem 114 determines if a reserved field of the address specifies a pair of the cores of the plurality of cores that share a region of memory (Step 320).
- the address can be a physical address where the reserved field of the address occupies fewer than all bits of the physical address e.g., the example address 200 of FIG. 2.
- the cache coherency subsystem 114 performs a reduced cache coherency process between the pair of cores (Step 330).
- the reduced cache coherency process checks cache coherency on fewer cores than the full cache coherency process, e.g., only the two cores that are in the pair of cores.
- the reserved field of the address can also specify a single core.
- the cache coherency subsystem can determine that the reserved field of the address specifies a single core and bypass performing a cache coherency process as the data does not need to be updated for the other cores.
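The decision made in steps 320-330, together with the single-core bypass, can be sketched as follows. This assumes core identifiers are nonzero so that an all-zero reserved field means "unpopulated", and reuses the illustrative bit positions from the 0x0jk0_ppppp example; both are assumptions, not details fixed by the source.

```python
J_SHIFT, K_SHIFT, ID_MASK = 28, 24, 0xF  # illustrative field positions

def coherency_action(addr: int) -> str:
    """Choose the coherency process implied by the reserved field.

    Assumes core identifiers are nonzero, so 0 means 'unpopulated'.
    """
    j = (addr >> J_SHIFT) & ID_MASK
    k = (addr >> K_SHIFT) & ID_MASK
    if j == 0 and k == 0:
        return "full"      # no hint: check every core's cache
    if j == k:
        return "bypass"    # single core: no other cache holds the data
    return "reduced"       # check only the caches of cores j and k
```

An untagged address triggers the full process, an address with j == k bypasses coherency entirely, and a distinct pair restricts the process to those two cores.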
- the operating system can delegate which cores share regions of memory.
- Each program or core can share one or more regions of memory with one or more other programs or cores.
- each core can share a memory region identifier with a subset of the other cores in the system.
- An operating system can choose which cores to assign shared memory depending on which tasks are assigned to specific cores, e.g., based on the needs of a task.
- the operating system can divide the cores in the plurality of cores into pairs of cores.
- each core only shares one or more regions of memory with its paired core.
- the reserved field of the address can also specify clusters of cores.
- the reserved field of the address can specify a pair of clusters of cores of the plurality of cores that share a region of memory.
- a cluster M can represent cores j and k, while a cluster N can represent cores h and i.
- the cache coherency subsystem 114 can perform a reduced cache coherency process between the cores in the pair of clusters of cores, e.g., the cache coherency subsystem can check cache coherency for cores j, k, h, and i.
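The cluster variant amounts to expanding each cluster identifier into its member cores before running the reduced process. The cluster names and membership below are hypothetical; the patent does not define a cluster table.

```python
# Hypothetical cluster membership table (the source defines no encoding).
CLUSTERS = {"M": {3, 5}, "N": {7, 9}}

def cores_to_check(cluster_a: str, cluster_b: str) -> set:
    """Union of cores covered by a reduced process over a cluster pair."""
    return CLUSTERS[cluster_a] | CLUSTERS[cluster_b]
```

With the table above, a reserved field naming clusters M and N restricts the coherency check to cores {3, 5, 7, 9} rather than all cores.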
- FIG. 4 is a flow chart illustrating an example process for populating the bit fields of an address based on a configuration of shared memory between programs running on cores.
- the example process 400 can be performed by an operating system.
- the system maintains a mapping between programs that share regions of memory (step 410).
- Programs can share regions of memory for a variety of reasons. For example, a multiprocess application can have multiple processes that read from and write to the same region of memory. As another example, producer and consumer processes can use the shared region of memory as a buffer between the two processes.
- the shared region of memory can, for example, be identified by a memory region identifier.
- a memory region identifier is an abstraction for one or more regions of memory that are mapped into a region of an address space.
- a memory region identifier can represent portions of a memory storage device, such as the memory storage device 118.
- the system receives a request to update the page table for a program (Step 420). Updating the page table can be due to a variety of reasons, e.g., due to allocating memory to a program for the first time or due to encountering a page fault.
- updating the page table requires physical addresses to be calculated, implicitly or explicitly, from virtual addresses belonging to the program.
- the mapping between physical and virtual addresses or pages can then be added to a page table.
- the operating system uses a page table to store mappings of virtual addresses to physical addresses, where each mapping is a page table entry.
- the system determines whether the program is identified as running on one or two cores in the mapping of programs that share regions of memory (step 430). If so, the system can modify the physical addresses in the page table by writing the core identifiers into the reserved spaces of the physical addresses (branch to step 440). This modification will then cause the cache coherency system to perform a reduced cache coherency process whenever those physical addresses are encountered during execution.
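Steps 430 and 440 can be sketched as below. The dictionary-based page table, the sharing map, and the bit positions are all illustrative assumptions; identifiers are assumed nonzero so an untagged address reads as "no hint".

```python
def install_mapping(page_table, sharing_map, program, vpn, pfn):
    """Sketch of steps 430/440: tag the frame address only when the
    program is mapped to at most two cores (identifiers assumed nonzero).
    """
    cores = sharing_map.get(program, ())
    addr = pfn
    if 1 <= len(cores) <= 2:
        j, k = cores[0], cores[-1]     # a single core fills both slots (j == k)
        addr |= (j << 28) | (k << 24)  # illustrative reserved-field positions
    page_table[(program, vpn)] = addr
    return addr
```

A program on three or more cores keeps its unmodified address, so the coherency hardware falls back to the full process for it.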
- the system can make similar modifications to other architectural structures that maintain address translations, e.g., in one or more translation lookaside buffers.
- the operating system may choose to maintain a mapping of pairs of cores rather than a mapping of programs.
- Each core can share one or more regions of memory with its paired core.
- the system can assign memory sharing programs to each of the pairs of cores.
- the operating system can choose to track and map shared memory regions individually.
- Each core in the plurality of cores can share one or more regions of memory with each of the other cores in the plurality of cores.
- a program can use one or more memory region identifiers that are shared between more than two cores, e.g., between five cores or seven cores.
- when a program uses a memory region identifier that is shared with more than a designated number (e.g., two) of cores, the system does not modify the physical address and the cache coherency subsystem performs a full cache coherency process.
- a program can also use one or more memory region identifiers that are shared between only one core or a pair of cores.
- in that case, the system does modify the physical address and the cache coherency subsystem performs a reduced cache coherency process.
- when no reduced process applies, the system can leave the reserved field of the addresses unaltered (branch to step 440).
- the system can determine whether the virtual address is part of a memory region identifier that is known to be accessible to only one core or only two cores (or groups of cores). If the virtual address is part of a memory region identifier that is known to be accessible to only one core or only two cores, the operating system can write core identifiers in a reserved field of the physical address in the page table. For example, the reserved field can identify a core labeled j and a core labeled k by populating the bits of the physical address as 0x0jk0_ppppp in the page table.
- Other bit encodings may be possible, some of which may reduce the number of bits required for the encoding, perhaps with cooperation from hardware. For example, as little as 1 bit may be used if the hardware applies the reduced coherency protocol only between pairs of cores with numbers differing only in the least significant bit, and the operating system only sets that bit for programs using memory only on those pairs of cores.
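The 1-bit encoding described above can be sketched as a check that all sharing cores fall within one pair {2n, 2n+1}, i.e., their numbers differ only in the least significant bit. The hint-bit position is a hypothetical choice for illustration.

```python
PAIR_HINT = 1 << 23  # hypothetical position of the single hint bit

def tag_if_lsb_pair(addr: int, cores) -> int:
    """Set the hint bit only if every core lies in one pair {2n, 2n+1},
    i.e., the core numbers differ only in the least significant bit."""
    if len({c >> 1 for c in cores}) == 1:
        return addr | PAIR_HINT
    return addr
```

Cores 4 and 5 share the prefix 2 (4 >> 1 == 5 >> 1 == 2) and get the hint; cores 4 and 6 do not, so their address stays untagged and the full process applies.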
- the system can receive a request for a core to access a specific address of memory. This can trigger a cache coherency process.
- the cache coherency hardware will use a reduced cache coherency process.
- the operating system can move a program from one core to another core.
- the operating system can recalculate the relevant physical addresses in the page table.
- the operating system can move a program that runs on a first and second core to a second and third core instead.
- the system can populate the reserved field of the physical address with respective identifiers of the second core and the third core.
- the third core can be a core labeled h.
- the first and second cores can be labeled j and k respectively.
- the system can change the address from 0x0jk0_ppppp to 0x0hk0_ppppp.
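The recalculation on migration can be sketched as rewriting every reserved-field slot that holds the departing core's identifier. The bit positions mirror the earlier illustrative layout and are assumptions, not part of the source.

```python
ID_MASK, SHIFTS = 0xF, (28, 24)  # illustrative reserved-field layout

def migrate_address(addr: int, old_core: int, new_core: int) -> int:
    """Rewrite every reserved-field slot holding old_core to new_core.

    A single-core program (both slots equal) is rewritten in both slots.
    """
    for shift in SHIFTS:
        if (addr >> shift) & ID_MASK == old_core:
            addr = (addr & ~(ID_MASK << shift)) | (new_core << shift)
    return addr
```

Moving the program off core j (3) onto core h (7) turns 0x0jk0_ppppp into 0x0hk0_ppppp while the identifier for core k is left in place.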
- the system can use an N-to-N crossbar (e.g., butterfly circuit) to transfer cache data between arbitrary pairs of cores.
- some regions of memory can be shared between cores 4 and 5, and other regions can be shared between 4 and 17.
- Each page in a memory region will have its own set of j and k bits, so an N-to-N crossbar would allow process A to have one memory region identifier shared with process B, and another memory region identifier shared with process C, and the crossbar would support an A-B link and a B-C link as needed.
- the system 100 can use simpler point-to-point connections between cores 0 and 1, 2 and 3, etc., and the operating system can assign memory-sharing programs to those appropriately-paired cores.
- Embodiments of the subject matter and the functional operations described in this specification can be implemented in digital electronic circuitry, in tangibly-embodied computer software or firmware, in computer hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them.
- Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible non-transitory storage medium for execution by, or to control the operation of, data processing apparatus.
- the computer storage medium can be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
- the program instructions can be encoded on an artificially-generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus for execution by a data processing apparatus.
- data processing apparatus refers to data processing hardware and encompasses all kinds of apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers.
- the apparatus can also be, or further include, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
- the apparatus can optionally include, in addition to hardware, code that creates an execution environment for computer programs, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
- a computer program which may also be referred to or described as a program, software, a software application, an app, a module, a software module, a script, or code, can be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages; and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
- a program may, but need not, correspond to a file in a file system.
- a program can be stored in a portion of a file that holds other programs or data, e.g., one or more scripts stored in a markup language document, in a single file dedicated to the program in question, or in multiple coordinated files, e.g., files that store one or more modules, sub-programs, or portions of code.
- a computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a data communication network.
- the processes and logic flows described in this specification can be performed by one or more programmable computers executing one or more computer programs to perform functions by operating on input data and generating output.
- the processes and logic flows can also be performed by special purpose logic circuitry, e.g., an FPGA or an ASIC, or by a combination of special purpose logic circuitry and one or more programmed computers.
- Computers suitable for the execution of a computer program can be based on general or special purpose microprocessors or both, or any other kind of central processing unit.
- a central processing unit will receive instructions and data from a read-only memory or a random access memory or both.
- the essential elements of a computer are a central processing unit for performing or executing instructions and one or more memory devices for storing instructions and data.
- the central processing unit and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
- a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks.
- a computer need not have such devices.
- a computer can be embedded in another device, e.g., a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device, e.g., a universal serial bus (USB) flash drive, to name just a few.
- Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.
- Embodiment 1 is a system comprising: a plurality of cores, wherein each core is associated with a cache; a cache coherency subsystem comprising data processing apparatus configured to perform operations comprising: receiving an address from a first core of the plurality of cores; determining that a reserved field of the address specifies a pair of the cores of the plurality of cores that share a region of memory; and in response, performing a reduced cache coherency process between the pair of cores of the plurality of cores.
- Embodiment 2 is the system of embodiment 1, wherein the operations further comprise: receiving a second address from one of the plurality of cores; determining that the reserved field of the address does not specify a pair of cores; and in response, performing a full cache coherency process among the plurality of cores.
- Embodiment 3 is the system of embodiment 2, wherein the reduced cache coherency process checks cache coherency on fewer cores than the full cache coherency process.
- Embodiment 4 is the system of any one of embodiments 1-3, wherein one or more of the cores are configured to execute instructions to implement an operating system, and wherein the operating system is configured to perform operations comprising: maintaining a mapping between programs, cores that the programs execute on, and respective regions of memory that the programs share; and populating the reserved field of addresses with core identifiers whenever a physical address must be calculated for use by a program identified in the mapping as running on one or two cores.
- Embodiment 5 is the system of embodiment 4, wherein the operations further comprise: receiving a request to move a program from a first core and a second core to a third core and a fourth core; and recalculating the reserved field addresses with core identifiers associated with the third and fourth core.
- Embodiment 6 is the system of any one of embodiments 1-5, wherein the address is a physical address, and the reserved field of the address occupies fewer than all bits of the physical address.
- Embodiment 7 is the system of embodiment 4, wherein each program shares one or more regions of memory with one or more of each of the other programs.
- Embodiment 8 is the system of embodiment 4, wherein the cores in the plurality of cores are divided into pairs of cores, where each core shares one or more regions of memory with its paired core.
- Embodiment 9 is the system of embodiment 8, wherein the operations further comprise assigning memory sharing programs to each of the pairs of cores.
- Embodiment 10 is the system of any one of embodiments 1-9, wherein the operations further comprise: receiving a second address from one of the plurality of cores; determining that a reserved field of the address specifies a pair of clusters of cores of the plurality of cores that share a region of memory; and in response, performing a reduced cache coherency process between the cores in the pair of clusters of cores for the plurality of cores.
- Embodiment 11 is the system of any one of embodiments 1-10, wherein the operations further comprise: receiving a second address from one of the plurality of cores; determining that the reserved field of the address specifies a single core; and in response, bypassing a cache coherency process for the plurality of cores.
- Embodiment 12 is a method comprising performing the operations of any one of embodiments 1-11.
- Embodiment 13 is a computer storage medium encoded with instructions that are operable, when executed by data processing apparatus, to cause the data processing apparatus to perform the operations of any one of embodiments 1-11.
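The embodiments above dispatch on a reserved field of the address: an empty field implies the full cache coherency process, a single core identifier lets the coherency subsystem bypass coherency (embodiment 11), and a pair of core identifiers triggers a reduced pairwise process (embodiment 1). The following is a minimal sketch of one possible encoding in C; the bit positions, field width, and all function and enum names are illustrative assumptions, since the document does not fix a concrete layout:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: a physical address whose bits [47:40] form the
 * reserved field. This sketches the idea in the embodiments, not the
 * patent's actual encoding, which the text does not specify. */
#define RESERVED_SHIFT 40
#define RESERVED_MASK  0xFFull

enum coherency_scope {
    SCOPE_FULL,    /* reserved field empty: run the full coherency process */
    SCOPE_SINGLE,  /* one core encoded: bypass cache coherency (embodiment 11) */
    SCOPE_PAIR,    /* two cores encoded: reduced pairwise coherency (embodiment 1) */
};

/* Pack up to two 4-bit core identifiers into the reserved field.
 * Identifier 0 is reserved here to mean "no core encoded". */
static uint64_t encode_reserved(uint64_t addr, unsigned core_a, unsigned core_b)
{
    uint64_t field = ((uint64_t)(core_a & 0xF) << 4) | (core_b & 0xF);
    return (addr & ~(RESERVED_MASK << RESERVED_SHIFT)) | (field << RESERVED_SHIFT);
}

/* Inspect the reserved field of an incoming address and choose the
 * coherency scope, as the cache coherency subsystem would. */
static enum coherency_scope classify(uint64_t addr)
{
    uint64_t field = (addr >> RESERVED_SHIFT) & RESERVED_MASK;
    unsigned core_a = (field >> 4) & 0xF;
    unsigned core_b = field & 0xF;
    if (core_a == 0 && core_b == 0)
        return SCOPE_FULL;
    if (core_b == 0 || core_a == core_b)
        return SCOPE_SINGLE;
    return SCOPE_PAIR;
}
```

Under this sketch, an address tagged with cores 3 and 5 would take the reduced pairwise path, while one tagged with a single core would skip coherency entirely.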
Abstract
A system comprising: a plurality of cores, each core being associated with a cache; and a cache coherency subsystem comprising data processing apparatus configured to perform operations comprising: receiving an address from a first core of the plurality of cores; determining that a reserved field of the address specifies a pair of the cores of the plurality of cores that share a region of memory; and in response, performing a reduced cache coherency process between the cores of the pair of cores of the plurality of cores.
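The operating-system side of embodiments 4 and 5 — maintaining a mapping from programs to the cores they run on, stamping the reserved field whenever a physical address is calculated for a mapped program, and recalculating the field when a program is moved to new cores — could be sketched as follows. All structure, field, and function names are hypothetical; the patent does not specify an implementation:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical OS bookkeeping: one table entry per program, recording the
 * (at most two) cores the program executes on. The reserved field sits in
 * bits [47:40] of the physical address, as in the earlier sketch. */
#define RESERVED_SHIFT 40
#define RESERVED_MASK  0xFFull
#define MAX_PROGRAMS   16

struct program_entry {
    int core_a, core_b;   /* cores the program currently executes on */
};

static struct program_entry table[MAX_PROGRAMS];

/* Derive the reserved-field value from a program's current core pair. */
static uint64_t reserved_field(int prog)
{
    return ((uint64_t)(table[prog].core_a & 0xF) << 4) |
           (uint64_t)(table[prog].core_b & 0xF);
}

/* Populate the reserved field whenever a physical address is calculated
 * for a program in the mapping (embodiment 4). */
static uint64_t translate(int prog, uint64_t phys)
{
    return (phys & ~(RESERVED_MASK << RESERVED_SHIFT)) |
           (reserved_field(prog) << RESERVED_SHIFT);
}

/* Move a program to a new core pair; subsequent address calculations pick
 * up the new identifiers, i.e. the recalculation step of embodiment 5. */
static void migrate(int prog, int new_a, int new_b)
{
    table[prog].core_a = new_a;
    table[prog].core_b = new_b;
}
```

A migration thus requires no rewrite of page tables in this sketch: only the table entry changes, and every later translation emits addresses carrying the new core identifiers.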
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2023/084209 WO2025128115A1 (fr) | 2023-12-15 | 2023-12-15 | Address space support for point-to-point cache coherency |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025128115A1 (fr) | 2025-06-19 |
Family
ID=89768487
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2023/084209 Pending WO2025128115A1 (fr) | Address space support for point-to-point cache coherency | 2023-12-15 | 2023-12-15 |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025128115A1 (fr) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20140006714A1 (en) * | 2012-06-29 | 2014-01-02 | Naveen Cherukuri | Scalable coherence for multi-core processors |
| US20150120998A1 (en) * | 2013-10-31 | 2015-04-30 | Kebing Wang | Method, apparatus and system for dynamically controlling an addressing mode for a cache memory |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP6953488B2 (ja) | Hybrid memory cube system interconnect directory-based cache coherence method | |
| KR101887797B1 (ko) | Lightweight coherency in memory | |
| US6826653B2 (en) | Block data mover adapted to contain faults in a partitioned multiprocessor system | |
| US6910108B2 (en) | Hardware support for partitioning a multiprocessor system to allow distinct operating systems | |
| US20100325374A1 (en) | Dynamically configuring memory interleaving for locality and performance isolation | |
| US10042762B2 (en) | Light-weight cache coherence for data processors with limited data sharing | |
| US20020087614A1 (en) | Programmable tuning for flow control and support for CPU hot plug | |
| US20150106560A1 (en) | Methods and systems for mapping a peripheral function onto a legacy memory interface | |
| EP3298497B1 (fr) | Tampon de consultation de traduction en mémoire | |
| US11681553B2 (en) | Storage devices including heterogeneous processors which share memory and methods of operating the same | |
| CN110275840B (zh) | Distributed process execution and file systems on a memory interface | |
| US11157405B2 (en) | Programmable cache coherent node controller | |
| US6546465B1 (en) | Chaining directory reads and writes to reduce DRAM bandwidth in a directory based CC-NUMA protocol | |
| US10019258B2 (en) | Hardware assisted software versioning of clustered applications | |
| US9003160B2 (en) | Active buffered memory | |
| CN113849262A (zh) | Techniques for moving data between virtual machines without copying | |
| GB2493340A (en) | Address mapping of boot transactions between dies in a system in package | |
| US20240338315A1 (en) | Systems, methods, and apparatus for computational device communication using a coherent interface | |
| WO2025128115A1 (fr) | Address space support for point-to-point cache coherency | |
| US10394707B2 (en) | Memory controller with memory resource memory management | |
| US12340195B2 (en) | Handling interrupts from a virtual function in a system with a reconfigurable processor | |
| KR20250129037A (ko) | Systems and methods for hosting an interleave across asymmetrically populated memory channels across two or more different memory types | |
| TW202340931A (zh) | Direct-swap caching with noisy neighbor mitigation and dynamic address range assignment | |
| US20200301832A1 (en) | Page-based memory operation with hardware initiated secure storage key update |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23847937; Country of ref document: EP; Kind code of ref document: A1 |