GB2639994A - Access control information - Google Patents

Access control information

Info

Publication number
GB2639994A
Authority
GB
United Kingdom
Prior art keywords
access control
lookup
physical address
control information
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
GB2404684.9A
Other versions
GB202404684D0 (en
Inventor
Elad Yuval
Donald Charles Chadwick Alexander
Garcia-Tobin Carlos
Parker Jason
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ARM Ltd
Original Assignee
ARM Ltd
Advanced Risc Machines Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ARM Ltd, Advanced Risc Machines Ltd
Priority to GB2404684.9A (GB2639994A)
Publication of GB202404684D0
Priority to PCT/GB2025/050550 (WO2025210336A1)
Publication of GB2639994A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023 Free address space management
    • G06F12/0238 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
    • G06F12/0246 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory in block erasable memory, e.g. flash memory
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/0292 User address space allocation, e.g. contiguous or non contiguous base addressing using tables or multilevel address translation means
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G06F12/1009 Address translation using page tables, e.g. page table structures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10 Providing a specific technical effect
    • G06F2212/1016 Performance improvement
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10 Providing a specific technical effect
    • G06F2212/1041 Resource optimization
    • G06F2212/1044 Space efficiency improvement
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/72 Details relating to flash memory management
    • G06F2212/7201 Logical to physical mapping or translation of blocks or pages

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Dram (AREA)
  • Memory System (AREA)

Abstract

Memory access control circuitry controls access to a memory system in response to a given memory access operation, based on access control information associated with a target physical address specified for the given memory access operation. Access control information obtaining circuitry is configured to perform a lookup of an access control table structure based on the target physical address of the given memory access operation, for obtaining the access control information. For at least one setting of table lookup control information indicating that a system physical address space has been configured to comprise a plurality of non-contiguous table-lookup-skipping address windows, the access control information obtaining circuitry determines whether the target physical address corresponds to one of the table-lookup-skipping address windows and, if so, identifies, as the access control information corresponding to the target physical address, a default value selected independent of the lookup to the access control table structure.

Description

ACCESS CONTROL INFORMATION
The present technique relates to the field of data processing.
Accesses to a memory system may be subject to access control based on access control information associated with a target physical address of the memory access operation. The access control information may be obtained from a table structure stored in the memory system, which is looked up based on the target physical address.
At least some examples of the present technique provide an apparatus comprising: memory access control circuitry to control access to a memory system in response to a given memory access operation, based on access control information associated with a target physical address specified for the given memory access operation; and access control information obtaining circuitry to obtain the access control information corresponding to the target physical address; in which: the access control information obtaining circuitry is configured to perform a lookup of an access control table structure based on the target physical address of the given memory access operation, for obtaining the access control information; and for at least one setting of table lookup control information indicating that a system physical address space has been configured to comprise a plurality of table-lookup-skipping address windows at non-contiguous positions within the system physical address space, the access control information obtaining circuitry is configured to: determine whether the target physical address corresponds to one of the plurality of table-lookup-skipping address windows; and in response to determining that the target physical address corresponds to one of the plurality of table-lookup-skipping windows, identify, as the access control information corresponding to the target physical address, a default value selected independent of the lookup to the access control table structure.
At least some examples of the present technique provide computer-readable code for fabrication of an apparatus as described above. The computer-readable code may be stored on a storage medium. The storage medium may be a non-transitory storage medium.
At least some examples of the present technique provide a method comprising: in response to a given memory access operation specifying a target physical address: obtaining access control information corresponding to the target physical address using access control information obtaining circuitry configured to perform a lookup of an access control table structure based on the target physical address of the given memory access operation; and controlling access to a memory system based on the access control information; in which: in response to detecting that table lookup control information is set to at least one setting indicating that a system physical address space has been configured to comprise a plurality of table-lookup-skipping address windows at non-contiguous positions within the system physical address space: the obtaining comprises determining whether the target physical address corresponds to one of the plurality of table-lookup-skipping address windows; and in response to determining that the target physical address corresponds to one of the plurality of table-lookup-skipping windows of physical address space, a default value selected independent of the lookup to the access control table structure is identified as the access control information corresponding to the target physical address.
At least some examples provide a computer program for controlling a host data processing apparatus to provide an instruction execution environment for execution of target program code, the computer program comprising: access control program logic to control access to a simulated memory system in response to a given memory access operation, based on access control information associated with a target simulated physical address specified for the given memory access operation; and access control information obtaining program logic to obtain the access control information corresponding to the target simulated physical address; in which: the access control information obtaining program logic is configured to perform a lookup of an access control table structure based on the target simulated physical address of the given memory access operation, for obtaining the access control information; and for at least one setting of table lookup control information indicating that a simulated system physical address space has been configured to comprise a plurality of table-lookup-skipping address windows at non-contiguous positions within the simulated system physical address space, the access control information obtaining program logic is configured to: determine whether the target simulated physical address corresponds to one of the plurality of table-lookup-skipping address windows; and in response to determining that the target simulated physical address corresponds to one of the plurality of table-lookup-skipping windows, identify, as the access control information corresponding to the target simulated physical address, a default value selected independent of the lookup to the access control table structure.
The computer program may be stored on a storage medium. The storage medium may be a non-transitory storage medium.
Further aspects, features and advantages of the present technique will be apparent from the following description of examples, which is to be read in conjunction with the accompanying drawings, in which:
Figure 1 schematically illustrates an example of a data processing system;
Figure 2 illustrates an example of an apparatus comprising memory access control circuitry and access control information obtaining circuitry;
Figure 3 illustrates an example of an access control scheme based on access control information specifying which architectural physical address space permits access to a given physical address in a system physical address space;
Figure 4 illustrates a point of physical aliasing;
Figure 5 illustrates an example of an access control table structure;
Figure 6 illustrates examples of configuring table lookup control information to define table-lookup-skipping address windows within the system physical address space;
Figure 7 illustrates steps for obtaining access control information for a given memory access request;
Figure 8 illustrates an example of table lookup control information;
Figure 9 illustrates a first example of determining whether a physical address is within a table-lookup-skipping address window;
Figure 10 illustrates an example of obtaining a table index based on concatenation of portions of bits from the target physical address;
Figure 11 illustrates steps for obtaining access control information in an implementation supporting the first example;
Figures 12 and 13 illustrate a second example of determining whether a physical address is within a table-lookup-skipping address window;
Figure 14 illustrates steps for obtaining access control information in an implementation supporting the second example; and
Figure 15 illustrates a simulation example.
An apparatus comprises memory access control circuitry to control access to a memory system in response to a given memory access operation, based on access control information associated with a target physical address specified for the given memory access operation.
Access control information obtaining circuitry is provided to obtain the access control information corresponding to the target physical address. The access control information obtaining circuitry supports performing a lookup of an access control table structure based on the target physical address of the given memory access operation, for obtaining the access control information. Such table lookups may be an effective way of describing properties of physical address regions of the physical address space, supporting the ability to define access control information in a relatively fine-grained manner. However, increasingly such table structures are becoming hard to scale to the complexity associated with modern memory system topologies, while meeting competing demands of limiting the memory footprint of the table structure, being scalable to different physical memory system topologies, and complying with any security constraints which may limit which storage locations can be used to store certain portions of the table structure.
In the examples discussed below, the access control information obtaining circuitry supports the ability to set table lookup control information to a setting indicating that a system physical address space has been configured to comprise two or more table-lookup-skipping address windows at non-contiguous positions within the system physical address space. When obtaining access control information for a given memory access operation, the access control information obtaining circuitry determines whether the target physical address of the given memory access operation corresponds to one of the plurality of table-lookup-skipping address windows. In response to determining that the target physical address corresponds to one of the plurality of table-lookup-skipping windows, the access control information obtaining circuitry identifies, as the access control information corresponding to the target physical address, a default value selected independent of the lookup to the access control table structure.
Hence, it is possible to define multiple windows within the system physical address space for which there is no need to look up the access control table structure, as a default setting for the access control information can be assumed when a target physical address of a memory access falls within one of the table-lookup-skipping address windows. Depending on the size of the table-lookup-skipping address window, this can provide a number of benefits. For relatively large table-lookup-skipping address windows, this can eliminate the need to define any table entries for the addresses in such windows at all, reducing the memory footprint of the table structure. On the other hand, if a relatively small table-lookup-skipping address window is defined, this can support the ability to define different access control information for different portions of an address region corresponding to a single table entry in a given level of access control table, reducing the need to incur the cost of setting up a subsequent level of table in a multi-level table structure to distinguish the properties of the different sub-regions corresponding to that single table entry in the given level. This can be beneficial for security in some use cases which might restrict the given level of table to be stored in on-chip storage rather than being permitted to be exported to external off-chip storage accessed via a device (input/output) port. The ability to define more than one table-lookup-skipping address window at non-contiguous positions in the address space can be particularly beneficial to help support scaling of the table architecture to systems involving varying numbers of chiplets or sockets, where the memory system of a given processing system is spread over two or more distinct integrated circuits.
Hence, by providing table lookup control information which can configure multiple non-contiguous table-lookup-skipping address windows for which default properties can be assumed, the table lookup process can scale better to the needs of modern processing systems.
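The overall decision described above can be sketched in software as follows. This is a minimal illustrative model only, not the circuitry itself: the `Window` class, the function `obtain_access_control_info`, the example window positions and the string values are all hypothetical names and numbers chosen for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Hypothetical description of one table-lookup-skipping address window."""
    base: int  # first physical address covered by the window
    size: int  # window length in bytes

    def contains(self, pa: int) -> bool:
        return self.base <= pa < self.base + self.size


def obtain_access_control_info(pa, windows, default_value, table_lookup):
    """Return the default value for addresses inside any skipping window,
    otherwise fall back to the (possibly expensive) table lookup."""
    if any(w.contains(pa) for w in windows):
        return default_value  # selected independent of the table lookup
    return table_lookup(pa)


# Example set-up (assumed values): two non-contiguous windows and a stand-in
# table lookup that records which addresses actually reached the table.
walked = []


def example_table_lookup(pa):
    walked.append(pa)
    return "value-from-table"


WINDOWS = [Window(0x1000_0000, 0x1000), Window(0x9000_0000, 0x1000)]
```

In this sketch an address inside either window never reaches `example_table_lookup`, which is the property that allows table entries for those addresses to be omitted.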
It will be appreciated that while the access control information obtaining circuitry supports setting the table lookup control information to define the multiple non-contiguous table-lookup-skipping address windows, the table lookup control information can also support other settings which do not define such multiple non-contiguous table-lookup-skipping address windows (e.g. either defining only a single table-lookup-skipping address window, not defining any such windows at all, or defining multiple windows of partially overlapping range). The particular setting of the table lookup control information used at a given time is dependent on the specific choices of a particular user providing software to execute on the apparatus, so is not a feature of the apparatus platform itself.
In some examples, for at least one setting of the table lookup control information, the table-lookup-skipping address windows may comprise a first type of table-lookup-skipping address window detected based on a comparison of a first subset of bits of the target physical address; and a second type of table-lookup-skipping address window detected based on a comparison of a second subset of bits of the target physical address different from the first subset. The second subset of bits could be either non-overlapping with, or partially overlapping with, the first subset of bits. By supporting both the first type and the second type of table-lookup-skipping address window detected based on comparisons of different subsets of bits of the target physical address, this helps support the ability to define both larger and smaller windows in the physical address space for which default access control properties are assumed. This can be helpful as larger table-lookup-skipping address windows are helpful for reducing memory footprint of the table structure while smaller table-lookup-skipping address windows can help with reducing the need to create further levels of table in a multi-level table structure for describing properties of different sub-regions within a region corresponding to a single higher-level table entry. Hence, providing architectural support for both types of region can be helpful in increasing the flexibility for managing the cost of table footprint.
Nevertheless, other examples could support only one of the first and second types of table-lookup-skipping address window.
In some examples, for at least one setting of the table lookup control information, the plurality of table-lookup-skipping address windows comprise at least two table-lookup-skipping address windows disposed in the system physical address space at intervals of a constant stride.
This can provide a relatively efficient way of defining multiple table-lookup-skipping address windows while requiring relatively little control state information to define the positions of the windows. While at first glance one might expect that providing architectural state information which restricts the multiple windows to be positioned at constant stride intervals in the address space might be too restrictive to be useful (rather than permitting more arbitrary definition of the positions of the windows), in practice, the inventors recognised that in modern multi-chiplet or multi-socket systems it can be common for a number of integrated circuits each having the same local memory system layout to be integrated into a wider processing system with the physical address space of the overall system spanning the local memory systems on each chiplet or socket. In this case, it can be very common for the memory system components which might benefit from being assigned default access control properties to be at the same relative addresses in the local memory space of each chiplet or socket, so that the windows of overall system physical address space which correspond to those memory system components can conveniently and efficiently be defined by a stride.
Hence, in some examples, the table lookup control information comprises a configurable stride parameter and the constant stride depends on the configurable stride parameter. The table lookup control information could also include one or more parameters defining a start address and/or size of each table-lookup-skipping address window.
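As an illustration of constant-stride windows, the following sketch checks whether an address falls in any of several equally sized windows spaced at a fixed stride. The parameter names (`start`, `size`, `stride`, `count`) and the example layout of four chiplets, each with a 4 GiB local address space, are assumptions made for the sketch; the text above does not specify an encoding.

```python
def in_strided_window(pa, start, size, stride, count):
    """Return True if pa lies in window i, for some 0 <= i < count, where
    window i covers [start + i * stride, start + i * stride + size)."""
    if pa < start:
        return False
    index, offset = divmod(pa - start, stride)
    return index < count and offset < size


# Assumed layout: each chiplet has a 1 MiB region at local offset 0x4000_0000,
# and chiplet-local address spaces repeat every 4 GiB.
EXAMPLE = dict(start=0x4000_0000, size=0x10_0000, stride=0x1_0000_0000, count=4)
```

Only four values are needed here to describe all four windows, which reflects the point that a stride keeps the control state small when every chiplet or socket shares the same local layout.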
In some examples, the access control table structure comprises at least one level of table, including an initial-level table that is first to be looked up in the lookup of the access control table structure, the initial-level table comprising table entries each corresponding to a granule of physical addresses of an initial-level granule size. For at least one setting of the table lookup control information, the table lookup control information may support at least one table-lookup-skipping address window being defined to have a size less than the initial-level granule size. In some use cases, there may be a security requirement to store the initial-level table entirely on-chip to reduce risk of tampering with the initial-level table, but if a number of sub-regions of relatively small size require different access control properties (e.g. as the sub-regions describe a mixture of on-chip and off-chip resources), then without the ability to define table-lookup-skipping address windows smaller than the initial-level granule size, this would normally require either the initial-level granule size to be reduced to the size of the sub-regions or less (which can greatly increase the memory system footprint if the initial-level table is still stored on-chip, as the number of options for the initial-level granule size may be limited and so it may be required to choose a granule size much smaller than the sub-region size), or require the sub-regions' properties to be described with a further level of table which might be infeasible to store on-chip.
If the further level of table has to be stored in off-chip storage accessed via an input/output port but describes properties of sensitive on-chip resources, this can increase security vulnerability as the off-chip storage of the table may make the table more vulnerable to tampering. These problems can be mitigated by providing architectural support for settings which allow users, if they wish, to define the table-lookup-skipping address window as a window of system physical address space smaller than the initial-level granule size. This supported option for configuration of the table structure can be extremely useful in modern systems because it means that even if a sub-region within the initial-level granule described by a given initial-level table entry requires a different property to other sub-regions of that granule and stores sensitive data for which it is not acceptable for the associated control information to be stored in external storage, the specific setting for that sub-region can be described using the default access control information associated with a table-lookup-skipping address window and the setting for other parts of the granule can be defined using the initial-level table entry, without needing to increase on-chip memory footprint or breach security requirements.
In some examples, the access control information obtaining circuitry may determine whether the target physical address corresponds to a given type of table-lookup-skipping address window based on a comparison between a selected portion of bits of the target physical address and a reference value.
In some examples, the access control information obtaining circuitry may apply a mask to the target physical address, and compare the masked target physical address with the reference value to determine whether the target physical address is within one of the table-lookup-skipping address windows. In some examples, such masking may also be applied to the reference value. Such masking can be helpful to support variable window sizes, so that the size and/or relative location of the compared portion of bits within the target physical address can be adjusted based on the table lookup control information. The mask can be selected based on the table lookup control information.
In some examples, for at least one type of table-lookup-skipping address window, the reference value comprises a programmable window base address specified in the table lookup control information. This can be helpful for allowing users flexibility to define the location of the table-lookup-skipping address window within the physical address space. This may be particularly helpful for the relatively small window of size smaller than the initial-level granule size as mentioned earlier, as the sub-region within an initial-level address granule to be assigned the default property can be at any location within that initial-level address granule depending on the address map used for a given system, so providing architectural support for software to define the start of the window using a programmable reference value can make the architecture more scalable to different system implementations.
In some examples, for at least one type of table-lookup-skipping address window, the access control information obtaining circuitry is configured to determine that the target physical address corresponds to one of the plurality of table-lookup-skipping address windows in response to determining that the selected portion of bits of the target physical address matches the reference value. This approach can be particularly helpful for supporting definition of relatively small windows of address space of the type discussed earlier.
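The masked comparison against a programmable base address might look like the following sketch. It assumes a 48-bit physical address and power-of-two window sizes, neither of which is stated in the text; `window_mask` and `in_window` are hypothetical helper names.

```python
PA_BITS = 48  # assumed physical address width


def window_mask(window_size, pa_bits=PA_BITS):
    """Build a mask that keeps only the address bits above the window offset.
    Assumes window_size is a power of two."""
    assert window_size > 0 and (window_size & (window_size - 1)) == 0
    return ((1 << pa_bits) - 1) & ~(window_size - 1)


def in_window(pa, base, mask):
    """Match when the masked address equals the masked reference value.
    Masking the reference as well tolerates a base that is not size-aligned."""
    return (pa & mask) == (base & mask)


# Assumed example: a 64 KiB window whose programmable base is 0x8_0001_0000.
MASK_64K = window_mask(0x1_0000)
```

Changing `window_size` changes which bits take part in the comparison, which is how a single comparator can serve variable window sizes.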
In some examples, for at least one type of table-lookup-skipping address window, the reference value comprises a predetermined non-programmable value. For example, the predetermined non-programmable value may be zero. While one might think that restricting the reference value to a fixed non-programmable value might not offer enough flexibility, in practice this type of table-lookup-skipping address window can be useful in a multi-chiplet or multi-socket system where there may be relatively large regions of address space that do not need to be described by the table structure, e.g. because they correspond to unused portions of system physical address space at the end of each local physical address space for an individual chiplet or socket.
For at least one type of table-lookup-skipping address window, the access control information obtaining circuitry may determine that the target physical address corresponds to one of the plurality of table-lookup-skipping address windows in response to determining that the selected portion of bits of the target physical address does not match the reference value. Again, this approach can be helpful for handling unused portions of physical address space at the end of local physical address spaces associated with each chiplet or socket in a multi-chiplet or multi-socket system.
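A mismatch-based window with a fixed reference value of zero can be sketched as below. The bit positions are assumptions for illustration: each chiplet is taken to own a 2^40-byte slice of the address space but to populate only the lowest 2^36 bytes, so any address with a non-zero value in bits [39:36] lies in the unused tail of some chiplet's slice.

```python
def select_bits(pa, hi, lo):
    """Extract bits [hi:lo] (inclusive) of pa."""
    return (pa >> lo) & ((1 << (hi - lo + 1)) - 1)


def in_unused_tail(pa, hi=39, lo=36, reference=0):
    """The address is in a skipping window when the selected bits do NOT
    match the fixed reference value (zero in this sketch)."""
    return select_bits(pa, hi, lo) != reference


CHIPLET_2_USED = (2 << 40) | 0x1234      # bits [39:36] are zero
CHIPLET_2_UNUSED = (2 << 40) | (1 << 36)  # bits [39:36] are non-zero
```

Because the check ignores the bits above the selected portion, the same comparison identifies the unused tail of every chiplet's slice, giving multiple non-contiguous windows without per-window state.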
It will be appreciated that some implementations may support more than one type of table-lookup-skipping address window, e.g. supporting both the ability to define windows using a programmable reference value and the ability to define windows using a fixed non-programmable reference value. Other examples could support only one of these types of table-lookup-skipping address window.
In some examples, in response to determining that the target physical address does not correspond to one of the table-lookup-skipping windows of the system physical address space, the access control information obtaining circuitry may identify, as the access control information corresponding to the target physical address, a value derived from the lookup to the access control table structure.
It is not necessary for such a lookup to the access control table structure to take place on every memory access, as some implementations may support caching of information derived from the access control table structure, to allow faster access when the same information is needed again for a subsequent access. Hence, in some cases, if the target physical address does not correspond to one of the table-lookup-skipping windows, a cache structure may be looked up, and if there is a hit for the target physical address in the cache structure, a value derived from a previous lookup to the access control table structure may be obtained from the cache structure, while if there is a miss for the target physical address in the cache structure, the lookup to the access control table structure may be performed to obtain the required access control information (and information obtained in the lookup may be allocated to the cache structure to help speed up finding the access control information on subsequent accesses that require the same information again).
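The ordering of window check, cache lookup and table lookup described above can be modelled as follows. This is a sketch under assumptions: the cache is a plain dictionary keyed by a 4 KiB granule number, and `obtain_with_cache`, `example_walk` and `example_window` are hypothetical names; a real implementation would choose its own cache organisation.

```python
def obtain_with_cache(pa, in_skip_window, default_value, cache, table_lookup,
                      granule_shift=12):
    """Window check first, then cache, then the table lookup on a miss."""
    if in_skip_window(pa):
        return default_value          # no cache or table access needed
    key = pa >> granule_shift         # assumed cache tag: granule number
    if key in cache:
        return cache[key]             # hit: reuse an earlier lookup result
    info = table_lookup(pa)           # miss: perform the table lookup
    cache[key] = info                 # allocate for later accesses
    return info


walks = []


def example_walk(pa):
    """Stand-in table lookup that records each invocation."""
    walks.append(pa)
    return "info-for-granule-%#x" % (pa >> 12)


def example_window(pa):
    """Assumed single 64 KiB skipping window, for demonstration only."""
    return 0x8000_0000 <= pa < 0x8001_0000
```

Two accesses to the same granule trigger only one call to `example_walk`, and accesses inside the window trigger none.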
In some examples, in the lookup to the access control table structure, the access control information obtaining circuitry may determine, based on the target physical address, a table index for selecting an initial-level table entry from an initial-level table of the access control table structure. For at least one setting of the table lookup control information in which the system physical address space has been configured to comprise a given type of table-lookup-skipping address window for which detection of whether the target physical address is within the given type of table-lookup-skipping address window depends on a selected portion of bits of the target physical address, the access control information obtaining circuitry may obtain the table index based on a concatenated value comprising a concatenation of a first portion of bits of the given target physical address more significant than the selected portion and a second portion of bits of the given target physical address less significant than the selected portion. This can be helpful for reducing the memory system footprint of the initial-level table, because it allows the initial-level table to occupy a contiguous region in memory even if the physical address space the initial-level table describes actually includes a (potentially large) intervening table-lookup-skipping address window for which no initial-level table entries are defined.
Hence, this concatenation of address bits allows the portions of address space either side of the table-lookup-skipping address window to be logically "stitched together" for the purpose of index generation when looking up the table. For other purposes such as controlling the actual memory access, the selected portion of bits would still be included in the version of the address used to control memory operations for that target physical address, so the omission of the selected portion of bits from the version of the address used for table index generation may be specific to index generation and is not required for other functionality that uses the address. Also, in implementations supporting more than one type of table-lookup-skipping address window, in some cases, the omission from index generation of the selected portion of bits used to detect membership of a table-lookup-skipping address window may apply only for the portion of bits used to detect one or more of such types of table-lookup-skipping address window, but there could also be another type of table-lookup-skipping address window for which the portion of bits used to detect whether an address falls in that other type of window would still contribute to index generation. For example, the omission of the selected portion of bits from the version of the address used for table index generation may be helpful for an example where detection of whether the target physical address falls in the table-lookup-skipping window depends on detection that the selected portion of bits does not match a predetermined fixed reference value (e.g. 0), but may not be required for an example where the detection of whether the target physical address falls in the table-lookup-skipping window depends on the selected portion of bits matching a programmable reference value.
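The index generation by concatenation can be illustrated with the following sketch. The bit positions are assumed for the example only: the selected portion is bits [39:36] and the initial-level granule is 1 GiB (a shift of 30), so the index is formed from the bits above bit 39 joined to bits [35:30].

```python
def table_index(pa, sel_hi, sel_lo, granule_shift):
    """Form the initial-level table index by dropping bits [sel_hi:sel_lo]
    and concatenating the bits above them with the bits below them
    (down to the granule offset)."""
    upper = pa >> (sel_hi + 1)                             # more significant part
    lower = (pa & ((1 << sel_lo) - 1)) >> granule_shift    # less significant part
    lower_width = sel_lo - granule_shift
    return (upper << lower_width) | lower


# Assumed parameters for the example.
SEL_HI, SEL_LO, GRANULE_SHIFT = 39, 36, 30

LAST_USED_IN_SLICE_0 = 63 << 30   # last 1 GiB granule below bit 36 in slice 0
FIRST_USED_IN_SLICE_1 = 1 << 40   # first granule of the next 2^40-byte slice
```

With these assumed parameters, the last populated granule of one slice and the first granule of the next map to adjacent indices (63 and 64), so the initial-level table stays contiguous even though the addresses it describes are separated by a large skipped window.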
There could be a number of ways of defining the default value which is assumed for the access control information associated with a physical address in the table-lookup-skipping address window.
In some examples, for at least one type of table-lookup-skipping address window, the default value for the access control information indicates a setting for the access control information which indicates that a least secure class of memory access operation is permitted to access a memory system location associated with the target physical address. This can be helpful for reducing memory footprint of the table structure, as often the amount of address space for which this setting is required can be much larger than the amount of address space for which the least secure class of memory access operation would not be permitted to access the corresponding addresses (typically, there are fewer secure resources than non-secure resources). Hence, by providing support for a table-lookup-skipping address window for which the default value assumes that the least secure class of memory access operation can access the location, it is more likely that the window can support a larger reduction in overall memory footprint for the table structure, as this would mean that a window of address space not comprising any secure or sensitive resources may be described using the table-lookup-skipping address window and so does not require memory footprint to be occupied by explicit table entries describing those regions. It will be appreciated that even if the access control information indicates that the least secure class of memory access operation can be permitted to access the memory system location, this does not necessarily guarantee that memory access operations attempting to access that memory system location would definitely be allowed to access the memory system location, as there could also be other types of access control scheme in use (e.g. based on page table permissions or further access control checks implemented at a downstream location from the memory access control circuitry, such as completer-side checks performed local to the completer circuitry which acts upon the memory access request once passed to the memory system). Such additional checks might still reject the memory access operation if other criteria are not satisfied, even if the access was permitted based on the access control information.
In some examples, for at least one type of table-lookup-skipping address window, the default value for the access control information indicates a most permissive setting for the access control information. For example, this may be a setting of the access control information which indicates that any class of memory access operation is permitted to access the memory system location (again, this does not necessarily mean the current memory access operation would definitely succeed, as there can be other types of checks independent of the access control information being performed too). This approach can be helpful for certain use cases where certain memory system components such as peripheral (input/output) devices or hardware accelerators may be assigned the most permissive setting, either because they are for a non-sensitive use case, or because the memory system component implements its own form of security access control (e.g. completer-side checks) and so the access control circuitry can pass through the memory access request unrestricted to allow the memory system component to make its own decision.
In some examples, the default value may be fixed for a certain type of table-lookup-skipping address window, without any ability for the user to configure which access control setting is the default setting associated with addresses in the table-lookup-skipping address window. More than one type of table-lookup-skipping address window could be supported, each with a different setting for the default value, but nevertheless the default setting for a given type of table-lookup-skipping address window could be fixed and non-programmable.
However, in other examples, for at least one type of table-lookup-skipping address window, the default value for the access control information may be specified by a programmable configuration parameter of the table lookup control information. This can provide increased flexibility in the use cases for which it may be helpful to allow certain regions to be assigned access control information without needing the access control information to be defined using the table structure.
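A minimal sketch of how a lookup might fall back to a default value is given below. It is not a definitive implementation: describing a window by `base` and `limit`, the string values used for the access control settings, the flat dictionary standing in for the table structure, and the "no_access" result for a missing entry are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SkipWindow:
    base: int      # first address in the window (hypothetical representation)
    limit: int     # first address beyond the window
    default: str   # default access control value, fixed or programmed


def obtain_access_control_info(pa: int,
                               windows: List[SkipWindow],
                               table: Dict[int, str],
                               granule_shift: int = 12) -> str:
    """Return access control information for `pa`.

    Addresses inside a table-lookup-skipping window take the window's
    default value and never touch the table; other addresses are looked up.
    """
    for w in windows:
        if w.base <= pa < w.limit:
            return w.default                      # table lookup skipped
    # Assumption: a granule with no table entry is treated as inaccessible.
    return table.get(pa >> granule_shift, "no_access")
```

In this sketch a fixed default corresponds to a `SkipWindow` whose `default` field is hard-wired, while a programmable default corresponds to software writing that field.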
The particular purpose of the access control information may vary depending on use case. In general, the access control information may comprise any information defined for a particular physical address or region of physical addresses, which can be used for controlling handling of memory access operations specifying that physical address or region of physical addresses. In general, for a use case where a table structure looked up by physical address is provided to describe certain memory access control properties of regions of physical memory, such a table structure may benefit from use of the table-lookup-skipping address windows to help keep the memory footprint cost of the table within acceptable bounds.
However, the techniques discussed above can be particularly useful in examples where the access control information is information used to enforce access restrictions on regions of physical memory for security purposes.
For example, the apparatus may comprise address translation circuitry to translate a virtual address to a physical address in a selected architectural physical address space selected from among a plurality of architectural physical address spaces, for which the access control information specifies which of the architectural physical address spaces permits access to the target physical address in the system physical address space. This ability to isolate portions of system physical address space for access in different architectural physical address spaces can be helpful for security, but there can be a concern that the access control table structure used to define which architectural physical address spaces can access each region of physical address space may require a relatively high memory footprint when, for security reasons, it is desirable for at least the initial-level table of the access control table structure to be stored on-chip. The use of table-lookup-skipping address windows can be particularly helpful in this scenario because they can help limit the memory footprint cost of the table structure used to provide access control information without needing to use other less secure techniques such as relying on external storage to store parts of the initial table looked up in the table lookup.
In some examples, the apparatus may comprise at least one memory system component configured to treat aliasing physical addresses from different architectural physical address spaces as if they relate to different memory system resources even though the aliasing physical addresses of the different architectural physical address spaces actually correspond to the same memory system resource in the system physical address space. This can be helpful for reducing risk of information about address access patterns associated with a sensitive software workload which uses one physical address space being leaked to an attacker process using a different physical address space based on cache timing side channels or other measurements of the performance effects of the conflicting demands placed on memory system components by memory access requests from different requesters.
The techniques discussed above may be implemented within an apparatus which has hardware circuitry provided for implementing the instruction decoding circuitry and processing circuitry discussed above. However, the same technique can also be implemented within a computer program which executes on a host data processing apparatus to provide an instruction execution environment for execution of target code. Such a computer program may control the host data processing apparatus to simulate the architectural environment which would be provided on a hardware apparatus which actually supports target code according to a certain instruction set architecture, even if the host data processing apparatus itself does not support that architecture. The computer program may have memory access control program logic and access control information obtaining program logic which emulate functions of the memory access control circuitry and access control information obtaining circuitry discussed above.
Such a simulation program can be useful, for example, when legacy code written for one instruction set architecture is being executed on a host processor which supports a different instruction set architecture. Also, the simulation can allow software development for a newer version of the instruction set architecture to start before processing hardware supporting that new architecture version is ready, as the execution of the software on the simulated execution environment can enable testing of the software in parallel with ongoing development of the hardware devices supporting the new architecture.
References to a "register" (when made in the context of a hardware supported embodiment) may, in the context of a software-based simulation, be understood as referring to a simulated register of the target instruction set architecture which is mapped by the simulator program onto host storage resources (e.g. registers and/or memory) provided by the host apparatus. Similarly, references to a "physical address" or a "physical address space" made in the context of the hardware supported embodiment may be understood for the simulated embodiment as referring to a simulated physical address or simulated physical address space respectively, which may similarly be mapped by the simulator program onto host storage resources (registers and/or memory) of the host apparatus.
The simulation program may be stored on a storage medium, which may be a non-transitory storage medium.
Figure 1 schematically illustrates an example of a data processing system 2. The data processing system 2 comprises a host system 4, which could be implemented as a system-on-chip or as a set of multiple interconnected chiplets. The host system 4 comprises one or more processing elements (processors) 6. Each processor 6 could, for example, be a central processing unit (CPU, e.g. a general purpose CPU), graphics processing unit (GPU), neural processing unit (NPU), or any other processor capable of instruction execution.
Each processing element 6 may include at least one private cache 8 for caching data obtained from memory storage 12 via an interconnect 10. Each memory storage unit 12 has an associated memory controller 14 for mapping requests made according to the bus protocol used by the interconnect 10 onto the specific protocols for addressing the particular kind of memory storage implemented in the corresponding storage unit 12 (e.g. the storage units 12 could include volatile or non-volatile storage, according to various kinds of memory storage technology). The interconnect 10 may include a system cache 34 which acts as a shared cache accessible to each processing element 6.
Hence, a number of processing elements 6 may each have access to shared memory 12 within the host system 4. However, in addition to the processing elements 6 themselves, another source of memory access requests to shared memory 12 can be from devices 20, 22 coupled to the host system via corresponding root ports 26. Each root port 26 acts as a gateway to the host system for a corresponding device or group of devices. Although Figure 1 shows each device 20, 22 having a separate root port 26, it is also possible for a group of devices to share a single root port. The devices 20, 22 may for example include any one or more of: an I/O (input/output) device for controlling interaction between the host system and the user or the outside world (e.g. a network controller, display controller, user input device, etc.); a hardware accelerator for performing certain bespoke processing functions (e.g. neural network processing, cryptographic functions, etc.) in a more efficient manner than could be performed in software using a general purpose processor 6; and/or external memory storage to provide additional storage capacity beyond the capacity provided in the memory storage 12 of the host system 4. The devices 20, 22 access shared memory 12 via a system memory management unit (SMMU, also known as Input/Output memory management unit or IOMMU) 30, which performs address translation and access permission checks for requests made by the devices 20, 22 in a similar manner to a memory management unit (MMU) 106 within a processing element 6 (the MMU is shown in Figure 2), such translation and permissions checks being based on address mappings and access permissions defined in translation table structures (page table structures) stored in the memory system 12 and configured by software executing on the processing elements 6.
The interconnect 10 may be associated with home node circuitry 32 which is responsible for maintaining coherency between cached data held at private caches 8, 24 of a number of caching agents of the data processing system 2. The caching agents can include the processing elements 6 as well as any coherent devices 22 which have their own coherent private cache 24 (other devices 20 may be non-caching devices (or "I/O coherent" devices) which do not have a private cache that needs to remain coherent with the host device). For example, a coherent device 22 could be a device for which the interface between the device 22 and host system 4 is compatible with a given device interface protocol, such as the CXL (Compute Express Link) standard.
The home node circuitry 32 implements a given coherency protocol, which defines a set of request types and response protocols associated with those request types. Each address may, with respect to a particular caching agent, be considered to be held in that caching agent's private cache in a particular coherency state. For example, the coherency state may specify, with respect to a given address and a given caching agent 6, 22, whether valid data for that address is held at the given caching agent's private cache 8, 24, and if valid data is held, whether that data is clean or dirty, and/or is held in a unique or shared state (unique data being held exclusively in that caching agent's cache, and not in other caching agents' caches, and shared data being capable of also being held in other caching agents' caches). The coherency protocol may require that certain request types or responses to such requests are associated with certain transitions of coherency state for cached items of data associated with the target address of the request. When a read/write request is received from one of the caching agents 6, 22 or an I/O coherent device requesting a read/write operation to a given physical address, the home node circuitry 32 issues snoop requests to one or more other caching agents that could potentially hold valid cached data for that physical address. A snoop request may query the current coherency state of the cached data for a specified address at a corresponding caching agent, and/or trigger changes in coherency state at the caching agent (e.g. invalidating cached data if the requester of the original read/write request requires the data to be cached in the unique state in its cache, and/or causing return of dirty data held in a snooped caching agent's cache 8, 24 so that the dirty data can be made accessible to the requester which sent the read/write request).
As shown in Figure 1, the home node circuitry 32 may be associated with a snoop filter 36 for tracking (at least partially) which data addresses are cached at certain caching agents 6, 22. The snoop filter 36 can be used to reduce snoop traffic by allowing the coherent interconnect 10 to determine when data is not cached at a particular requester. In the absence of snoop filtering, when one requester 6, 20, 22 issues a read or write transaction to data which could be shared with other caching agents 6, 22, the coherent interconnect 10 may trigger snoop requests to be issued to each other caching agent which could have a cached copy of the data from the same address. However, if there are a lot of caching agents, then this approach of broadcasting snoops to all cached requesters can be complex and result in a large volume of coherency traffic being exchanged within the system 2. By providing a snoop filter 36 which can at least partially track which addresses are cached at the respective caching agents 6, 22, this can help to reduce the volume of snoop traffic, enabling more efficient use of available request bandwidth and improving system performance. With the number of caching agents present in a modern system, it can be infeasible to implement a precise snoop filter scheme exactly tracking the addresses stored at each caching agent 6, 22, as such precision may be unacceptably expensive in terms of the storage and bandwidth cost. Therefore, the snoop filter 36 may track the content of the caches imprecisely. 
Provided there are no false snoop suppression instances where data actually held at a given private cache 8, 24 is mistakenly identified as not present so that snoops to that given private cache 8, 24 are incorrectly suppressed, it can be permitted to use a less precise tracking scheme which permits cases where the snoop request is issued to a given caching agent but (due to lack of precise information) that caching agent actually does not hold a valid copy of the data for the address specified in the snoop request. In some examples the system cache 34 and snoop filter 36 may be combined, with a single structure looked up based on an address providing both cached data and snoop filter information associated with that address. In some instances, further snoop filtering circuitry can be provided at the root port 26 (device interface) associated with at least one coherent device 22, and/or within the device itself to provide for further filtering of snoop requests targeting that device.
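The property described above (over-approximation is allowed, false suppression is not) can be illustrated with a simple sketch. The coarse per-region tracking used here is only one possible imprecise scheme and is an assumption of the sketch, as are the class name, method names and agent identifiers.

```python
from typing import Dict, List, Set


class ImpreciseSnoopFilter:
    """Tracks cached addresses per agent at coarse region granularity.

    Evictions are not tracked in this sketch, so the recorded state can only
    over-approximate what each agent holds: extra snoops may be sent, but a
    snoop to an agent that holds the data is never suppressed.
    """

    def __init__(self, region_shift: int = 12) -> None:
        self.region_shift = region_shift
        self.present: Dict[str, Set[int]] = {}

    def record_fill(self, agent: str, pa: int) -> None:
        """Note that `agent` may now hold data for the region containing `pa`."""
        self.present.setdefault(agent, set()).add(pa >> self.region_shift)

    def agents_to_snoop(self, requester: str, pa: int) -> List[str]:
        """Agents, other than the requester, that might hold data for `pa`."""
        region = pa >> self.region_shift
        return sorted(agent for agent, regions in self.present.items()
                      if agent != requester and region in regions)
```

A request to address 0x1040 after a fill at 0x1000 still snoops the filling agent, even though it may hold a different line in the same region; that is the permitted kind of imprecision.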
The host system 4 may also include memory encryption/decryption circuitry 50 for encrypting/decrypting data written to the memory storage 12 or read from the memory storage 12. Providing an on-board encryption engine can be useful for improving security in confidential computing scenarios. The encryption/decryption applied by the memory encryption/decryption circuitry 50 may depend on key information 52 accessible to the memory encryption/decryption circuitry 50. The key information 52 to use for encrypting/decrypting data for a given memory request may be selected based on a physical address space identifier (PASID) and/or memory encryption context identifier (MECID) associated with the memory request, which will be described further below. The MECID distinguishes between two or more different memory encryption contexts associated with the same physical address space, so that respective portions of mutually distrusting software having access to portions of the same physical address space can have their data be subject to different encryption regimes (e.g. different encryption keys) to help preserve each other's confidentiality.
Figure 2 illustrates a more detailed example of circuitry provided at a processing element 6, which can be regarded as an example of an apparatus comprising memory access control circuitry 108 and access control information obtaining circuitry 109. The processing element 6 includes instruction decoding circuitry 104 for decoding program instructions defined according to a given instruction set architecture (ISA), to generate micro-operations (decoded instructions) which are passed to processing circuitry 105 to control the processing circuitry 105 to perform processing operations corresponding to the decoded instructions. A set of control registers 62 is provided to store control state information which defines behaviour of the processing circuitry 105 in response to the instructions.
The processing element 6 may also include a memory management unit (MMU) 106, comprising a translation lookaside buffer (TLB), for controlling access to memory by the processing element 6 based on address translation mappings and access permissions information defined in translation table structures stored in the memory 12. The TLB caches translation table information derived from the translation table structures (also known as page table structures). The control registers 62 in this example include an indication of a current security state 64 and a current exception level 66, and a set of MECID registers 68 used for assignment of MECIDs to requests issued in a current operating state. The processing element 6 also includes other registers not shown in Figure 2, e.g. general purpose registers for storing operands for instructions and results of instructions, and other control registers providing further control state information.
The processing element 6 also includes requester-side access control circuitry 108 for performing access control functions based on access control information associated with a target physical address of a memory access operation. The requester-side access control circuitry 108 comprises access control information obtaining circuitry 109 for obtaining the access control information corresponding to a particular address. The access control information obtaining circuitry 109 supports performing a lookup of an access control table structure stored in the memory system, looked up based on the target physical address of the memory access operation.
For example, the requester-side isolated address region assignment information may comprise a granule protection table which associates each physical address described by the table with an indication of which of a set of multiple architectural physical address spaces supported in the ISA is allowed to provide access to the corresponding physical address.
As shown in Figure 3, in this example a processing element 6 supports multiple distinct architectural physical address spaces 84 which can be used to address the memory system. Data processing systems may support use of virtual memory, where address translation circuitry (e.g. the MMU 106) is provided to translate a virtual address specified by a memory access request from a virtual address space 80 into a physical address associated with a location in a memory system to be accessed. The mappings between virtual addresses and physical addresses may be defined in one or more page table structures. The page table entries within the page table structures could also define some access permission information which may control whether a given software process executing on the processing circuitry is allowed to access a particular virtual address. For some translation regimes, the translation may involve two stages of address translation. If a two-stage translation is used then mapping from a virtual address space 80 to a physical address space 84 is via an intermediate address space 82 based on two separate sets of translation table structures, one for stage 1 (virtual-to-intermediate address translation) and one for stage 2 (intermediate-to-physical address translation).
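The two-stage translation described above can be sketched as the composition of two lookups. This is a simplified illustration: real translation table structures are multi-level and carry permissions, whereas the flat dictionaries, the 4 KB page size and the function names here are assumptions of the sketch.

```python
from typing import Dict

PAGE_SHIFT = 12                       # assumed 4 KB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def translate(addr: int, table: Dict[int, int]) -> int:
    """Translate one address using a flat page-number-to-page-number map."""
    page = table.get(addr >> PAGE_SHIFT)
    if page is None:
        raise LookupError(f"no mapping for address {addr:#x}")
    return (page << PAGE_SHIFT) | (addr & PAGE_MASK)


def two_stage_translate(va: int,
                        stage1: Dict[int, int],
                        stage2: Dict[int, int]) -> int:
    """Stage 1 maps virtual to intermediate; stage 2 maps intermediate to physical."""
    ipa = translate(va, stage1)
    return translate(ipa, stage2)
```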
In some processing systems, all virtual addresses may be mapped by the address translation circuitry onto a single physical address space which is used by the memory system to identify locations in memory to be accessed. In such a system, control over whether a particular software process can access a particular address is provided solely based on the page table structures used to provide the virtual-to-physical address translation mappings. However, such page table structures may typically be defined by an operating system and/or a hypervisor. If the operating system or the hypervisor is compromised then this may cause a security leak where sensitive information may become accessible to an attacker.
Therefore, for some systems where there is a need for certain processes to execute securely in isolation from other processes, the system may support operation in a number of security states (domains) and a number of distinct architectural physical address spaces 84 may be supported. The architectural physical address space 84 selected for a given memory access generated in a given security state may depend on which security state is the given security state, and in some cases also depend on information defined in page table entries corresponding to the address accessed by the given memory access. For at least some components of the memory system (e.g. caches 8, interconnects 10, devices 20, 22 and/or certain structures within the SMMU 30), memory access requests whose virtual addresses are translated into physical addresses in different architectural physical address spaces 84 are treated as if they were accessing completely separate addresses in memory, even if the physical addresses are aliasing physical addresses in the respective physical address spaces which actually correspond to the same location in memory. By isolating accesses from different domains of operation of the processing circuitry into respective distinct physical address spaces as viewed for some memory system components, this can provide a stronger security guarantee which does not rely on the page table permission information set by an operating system or hypervisor.
In this example, the processing circuitry 105 can execute instructions in one of four security states: a non-secure security state, a secure security state, a realm security state and a root security state. The current security state indication 64 in the control registers 62 designates which security state is currently being used. Each of the four security states is associated with a corresponding architectural physical address space (PAS) 84. Hence, there are four architectural PASs: a non-secure PAS, secure PAS, realm PAS and root PAS.
The root state is the most privileged state, and is used for executing software which controls transitions to/from the other security states. The root state is able to have its virtual addresses translated from the virtual address space 80 to any of the four architectural physical address spaces 84. Information (NSE, NS) specified in the page table structures used to control the virtual address (VA) to physical address (PA) mapping is used to control which architectural PAS is selected for a given memory access request issued in the root security state (while Figure 3 for conciseness shows examples of the root state software selecting one of the Realm PAS or Root PAS, the root state software can also access the Secure PAS or Non-Secure PAS if desired).
The non-secure state is the least privileged security state, and its memory accesses are translated by default into physical addresses in the non-secure PAS (potentially via two stages of address translation from virtual address space 80 to intermediate address space 82 and from intermediate address space to the non-secure physical address space 84). Instructions executed in the non-secure state are not allowed to cause access requests to be generated specifying physical addresses in the secure PAS, realm PAS and root PAS.
The secure state and realm state are orthogonal security states which are not able to access each other's PASs or the root PAS, but which are able to select whether they access their own respective PAS (realm PAS for the realm state and secure PAS for the secure state) or whether they should access the non-secure PAS. Hence, information (NS) specified in the page table structures used to control address translation can be used to select whether a given memory access request has its address translated into the non-secure or secure PAS (when the request is issued from the secure security state) or into the non-secure or realm PAS (when the request is issued from the realm security state).
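The selection rules for the four security states can be summarised in a short sketch. The rules about which PASs each state may select follow the description above, but the bit values are assumptions: the text does not state the polarity of NS or the (NSE, NS) encoding used in the root state, so the mapping below is purely illustrative.

```python
from enum import Enum


class Domain(Enum):
    """Used both for security states and for their corresponding PASs."""
    NON_SECURE = "non-secure"
    SECURE = "secure"
    REALM = "realm"
    ROOT = "root"


# Assumed (NSE, NS) encoding for the root state; not given in the text.
_ROOT_ENCODING = {
    (0, 0): Domain.SECURE,
    (0, 1): Domain.NON_SECURE,
    (1, 0): Domain.ROOT,
    (1, 1): Domain.REALM,
}


def select_pas(state: Domain, ns: int, nse: int = 0) -> Domain:
    """Select the architectural PAS for a request issued in `state`.

    Assumes NS=1 selects the non-secure PAS in the secure and realm states.
    """
    if state is Domain.NON_SECURE:
        return Domain.NON_SECURE                  # no other PAS is reachable
    if state is Domain.SECURE:
        return Domain.NON_SECURE if ns else Domain.SECURE
    if state is Domain.REALM:
        return Domain.NON_SECURE if ns else Domain.REALM
    return _ROOT_ENCODING[(nse, ns)]              # root may select any PAS
```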
As shown in Figure 4, the memory system may include a point of physical aliasing (PoPA) 94, which is a point within the memory system at which aliasing physical addresses from different architectural physical address spaces 84 which correspond to the same memory system resource are mapped to a single physical address uniquely identifying that memory system resource in a hardware physical address space 86 used by downstream memory system components (e.g. memory controllers 14 and memory storage 12). Although more complicated mappings between aliasing addresses are possible (e.g. based on a mapping table tracking which values in the PASs 84 map to the same hardware physical address), it can be simplest if the addresses of the respective architectural PASs 84 which are considered to alias to the same location in the hardware physical address space 86 are those addresses which have the same physical address value (e.g. PA=X in the secure PAS and PA=X in the non-secure PAS are considered aliasing addresses). The location of the PoPA 94 could vary, e.g. it could be either upstream or downstream of the interconnect 10. In the example of Figure 1, the PoPA 94 is downstream of the interconnect 10 (and upstream of the memory controllers 14) so that the system cache 34 is prior to the PoPA, but other examples could provide the PoPA 94 upstream of the interconnect 10. Also, while Figure 4 shows an example with at least one cache 98 downstream of the PoPA 94, this is not essential and other examples may not have any further caches downstream of the PoPA 94 (e.g. Figure 1 does not show any post-PoPA cache).
Pre-PoPA memory system components, such as caches 8, 34 in the example of Figures 1 and 4, may treat aliasing physical addresses of different PASs 84 as if they correspond to different memory system resources, even if they ultimately map to the same memory system location in the system physical address space (system PAS) 86 used by underlying memory 12 beyond the PoPA 94. For example, the pre-PoPA caches 8, 34 may cache data or program code for the aliasing physical addresses in separate entries, so that if the same memory system resource is requested to be accessed from different physical address spaces, then the accesses will cause separate cache or TLB entries to be allocated. Also, the pre-PoPA memory system component could include the home node circuitry 32 and snoop filter 36, which may separately track coherency states of data held at respective caching agents 6, 22 for the aliasing addresses in different architectural physical address spaces 84. Hence, the aliasing physical addresses are treated as separate addresses for the purpose of maintaining coherency even if they do actually correspond to the same underlying memory system resource. As shown in Figure 4, one way to ensure that aliasing addresses from different architectural PASs 84 are treated as separate addresses can be to include a PAS identifier (PASID), which distinguishes which architectural PAS 84 is associated with a given memory access request, as additional address bits in the representation of physical addresses used to look up these pre-PoPA memory system components or control coherency operations. 
Also, a PAS tag value 96 identifying the architectural PAS associated with a given cache entry may be specified in the given cache entry or snoop filter, and lookups of pre-PoPA caches/snoop filters may depend on a comparison of a PAS tag associated with a memory access request causing the cache lookup (derived from the page table information and/or current security state 64 that is used to select which PAS to specify for a given request) with a PAS tag of a cache entry (derived from the requester-side isolated address region assignment information associated with the corresponding physical address).
Regardless of the form of the pre-PoPA memory system component, it can be useful for such a pre-PoPA memory system component to treat the aliasing physical addresses of the respective architectural address spaces 84 as if they correspond to different memory system resources, as this provides hardware-enforced isolation between the accesses issued to different physical address spaces so that information associated with one domain cannot be leaked to another domain by features such as cache timing side channels or side channels involving changes of coherency triggered by the coherency control circuitry.
In contrast, once requests pass beyond the PoPA 94, the aliasing addresses from the respective architectural PASs 84 are mapped to a single unique physical address in the hardware PAS 86. For example, if the aliasing physical addresses in the architectural PASs 84 are simply those having the same physical address value, the mapping to the hardware PAS (system PAS) 86 can be carried out simply by stopping using the PASID as additional address bits when looking up storage structures. As shown in Figure 4, for example, a post-PoPA cache 98 may, for its lookups, no longer use the PASID as part of the address lookup information and no longer tag its cache entries with a PASID 96, unlike the pre-PoPA caches 8, 34 which do use the PASID 96 for address lookups and cache tagging.
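The difference in tagging on either side of the PoPA can be sketched as follows. This is a minimal illustrative model rather than a description of any particular cache implementation: the class names, the dictionary-based storage and the numeric PASID values are all assumptions made for the example.

```python
# Illustrative model only. PASID values and class names are hypothetical.
PASID_SECURE, PASID_NON_SECURE, PASID_ROOT, PASID_REALM = range(4)


class PrePoPACache:
    """Entries are tagged with (PASID, physical address), so aliasing
    addresses in different architectural PASs occupy separate entries."""

    def __init__(self):
        self.entries = {}

    def fill(self, pasid, pa, data):
        self.entries[(pasid, pa)] = data

    def lookup(self, pasid, pa):
        return self.entries.get((pasid, pa))  # None models a miss


class PostPoPACache:
    """Beyond the PoPA the PASID is no longer part of the tag, so all
    aliases of a physical address share one entry."""

    def __init__(self):
        self.entries = {}

    def fill(self, pasid, pa, data):
        self.entries[pa] = data  # PASID deliberately ignored

    def lookup(self, pasid, pa):
        return self.entries.get(pa)


pre = PrePoPACache()
pre.fill(PASID_REALM, 0x8000, "realm data")

post = PostPoPACache()
post.fill(PASID_REALM, 0x8000, "realm data")
```

In this sketch a Non-secure lookup of address 0x8000 misses in the pre-PoPA model even though a Realm entry for the same address value is present, whereas the post-PoPA model returns the single shared entry.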
As shown in Figure 3, the hardware physical address space 86 may be partitioned to enable access to certain physical memory system locations only within certain architectural PASs.
For some regions of the system PAS, this could be based on a static mapping (e.g. memory regions assigned to certain devices 20, 22 could be statically reserved only to be accessible to a certain architectural PAS). However, for other regions a dynamic mapping can be defined in an access control table structure looked up by the requester-side access control circuitry 108 to determine whether, for a memory access request which has caused translation of the virtual address specified by the request into a particular target physical address in a target architectural PAS 84, that target physical address is allowed to be accessed from within that target architectural PAS 84. This control structure (also known as a granule protection table or GPT, for supporting a granule protection check, or GPC) is an example of requester-side isolated address region assignment information used by a requester to control memory access as described earlier. Each entry of the GPT may specify, for a given region of physical addresses, which of the architectural PASs is assigned to that physical address as being allowed to give access to that physical address (some examples may also support GPT settings which permit more than one, or even all, of the architectural PASs to enable access to the corresponding physical address in the system PAS).
As shown in Figure 2, the requester-side access control circuitry 108 may have one or more GPT caches for caching portions of the GPT (or information derived from the GPT). While Figure 2 shows such GPT caches as separate from the corresponding TLBs in the MMU 106, in other examples a combined caching structure could cache information from both the page tables (translation table structures) used for address translation and the GPT used for the granule protection check. Similarly, while the MMU 106 and requester-side access control circuitry 108 are shown as separate in Figure 2, in other examples a single circuit unit could perform both functions. Hence, when looking up access control information for a given target physical address, the access control information could be obtained either from the GPT cache or from a lookup of the table structure in the memory system.
The use of multiple architectural PASs 84 for addressing some pre-PoPA memory system components such as caches 8, 34 can be useful for improving security for some use cases, to enable software operating in the Realm or Secure state to be isolated (by a hardware-enforced mechanism) from untrusted software running in the Non-Secure state. However, a given architectural PAS (e.g. the Realm PAS) may support a number of pieces of software which are mutually distrusting and which may not trust the operating system or hypervisor setting the translation table structures to set access permissions to prevent inappropriate access to their data by other software sharing the same architectural PAS 84. Therefore, as shown in Figure 2, it can be useful, for at least some of the architectural PASs 84 (e.g. for the Realm PAS), to provide support for MECIDs (memory encryption context identifiers) which distinguish different encryption contexts within the same architectural PAS 84.
As shown in Figure 2, the processing element 6 may comprise MECID registers 68 which can be used by software to configure which MECID values should be assigned to the memory requests issued by the processing circuitry 60 at a given time. For example, some approaches could simply provide a single MECID register 68 which indicates the MECID currently in use (which can be updated by privileged software on a context switch between one encryption context and another). It is also possible to provide multiple items of MECID-identifying state (e.g. in different fields within a single MECID register 68 or in different MECID registers 68), each item of MECID-identifying state being associated with a respective operating state (e.g. exception level or privilege level) of the processing element 6, so that, based on the current exception level indication 66 or another indication of the current operating state, the appropriate MECID for that operating state can be selected. By supporting MECIDs being defined simultaneously for multiple operating states (e.g. exception levels), this can reduce the amount of reprogramming of MECID registers needed when transitioning back and forth between different operating states (e.g. such frequent transitions can be common when handling exceptions). Also, it may be possible to define multiple items of MECID-defining state which may apply to different classes of memory access operations. For example, which item of MECID-defining state is used to provide the MECID to be used for a given memory access request may depend on the type of load/store instruction executed to cause that memory access request to be issued. Hence, it will be appreciated that there are a wide variety of architectural mechanisms by which software-configured state in control registers 62 can be specified to influence assignment of MECIDs to particular memory access requests.
Hence, as shown in Figure 2, when a given memory access request is issued by the processing element 6 to the memory system 8, 10, 12, the request may specify, in addition to the physical address translated from a virtual address by the MMU 70 and the PASID identifying a target architectural PAS 84, a MECID which distinguishes a selected memory encryption context associated with the request from other memory encryption contexts of the target architectural PAS. The memory encryption/decryption circuitry 50 can select between different items of key information depending on the combination of PASID and MECID, so that different pieces of software within the same architectural PAS (as well as different pieces of software within different architectural PASs) can use different encryption keys to ensure mutual privacy of each other's data. While Figure 2 shows an example of registers 68 defined at the processing element 6 and information derived from translation table structures to control assignment of PASID and MECID to memory access requests, requests sent to the memory system by devices 20, 22 may similarly be tagged with a PASID and MECID. Software executing on the processing elements 6 may configure state information associated with a given device 20, 22 which specifies which PASID and MECID to assign to device-originating memory system requests from that device, or some devices may have a fixed assignment of PASID (e.g. the Non-secure PASID only if a device is never used in a more secure context).
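The idea of selecting key information from the combination of PASID and MECID can be illustrated with a short sketch. The function name, the identifier widths and the use of SHA-256 are assumptions for illustration only; the source does not say how the memory encryption/decryption circuitry 50 stores or derives its keys.

```python
import hashlib

# Hypothetical widths: the source does not specify how identifiers are encoded.
PASID_BYTES = 1
MECID_BYTES = 2


def select_key(root_secret: bytes, pasid: int, mecid: int = 0) -> bytes:
    """Return key information that is specific to a (PASID, MECID) pair.

    SHA-256 is only a stand-in for whatever key selection or derivation
    the encryption circuitry really performs.
    """
    material = (root_secret
                + pasid.to_bytes(PASID_BYTES, "little")
                + mecid.to_bytes(MECID_BYTES, "little"))
    return hashlib.sha256(material).digest()


secret = b"example-root-secret"          # placeholder value
key_a = select_key(secret, pasid=3, mecid=1)
key_b = select_key(secret, pasid=3, mecid=2)   # same PAS, other context
key_c = select_key(secret, pasid=1, mecid=1)   # other PAS, same MECID value
```

Two contexts in the same architectural PAS (key_a and key_b) obtain different keys, as do contexts with the same MECID value in different PASs (key_a and key_c).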
It will be appreciated that support for MECIDs is not essential, and other examples may select key information for encryption/decryption of data stored in the memory system based solely on the PASID specified by a given memory access request.
The GPT described above is an example of requester-side access control information, used to enforce requester-side access control checks at the point when a requester entity requesting access to a memory system issues the corresponding memory access requests. If the requester-side access control checks fail, a fault may be signalled and the memory access request may be prevented from successfully accessing the associated memory system location.
In some cases, completer entities such as devices 20, 22 or memory controllers 14 may also implement a form of access control based on completer-side access control information. For example, a device compliant with the CXL standard may support a protocol for supporting trusted execution environments (TEEs) which enables certain physical addresses to be designated as "TEE-Exclusive" so that they cannot be accessed by access requests not associated with the TEE-exclusive state. To support such completer-side checks, the access control information used by the requester-side access control circuitry 108 may support an encoding indicating that the physical address can be accessed in any of the architectural PASs 84, so that requests targeting those address regions may effectively be passed through to the completer side to implement further checks there. Alternatively, the completer-side checks may be in addition to checks at the requester-side, so that despite the requester-side checks restricting access to the physical address to particular architectural PASs 84, further checks are nevertheless performed on the completer-side.
Figure 5 illustrates an example of an access control table structure that can be used to provide the requester-side access control information. In this example, the access control table structure is the GPT, and the requester-side access control information is information indicating which of the architectural PASs 84 are allowed to provide access to the corresponding physical address within the system PAS 86. However, it will be appreciated that the techniques discussed below for defining table-lookup-skipping windows could also be applied to other kinds of access control table structure which associate access control information with physical addresses in the system physical address space 86.
A table base register 150 (GPTBR_EL3) identifies a base address of a level-0 (L0) table 152 (initial-level table) stored in the memory system. The L0 table 152 comprises a number of L0 table entries 153. Each L0 table entry 153 corresponds to a block of physical address space of a size corresponding to an L0 granule size (also referred to as initial-level granule size above). For example, the L0 granule size could be 1GB, 16GB, 64GB or 512GB, and may be selected based on a control parameter stored in a control register 162, as described below with respect to Figure 8. The L0 granule size influences selection of the portion of bits of the target physical address that is used as an index portion to select which entry of the L0 table corresponds to that target physical address. The index portion of the target physical address may be used to derive an offset value which is applied relative to the L0 base address defined in the base address register 150 to determine the address in memory at which the table entry 153 corresponding to the target physical address is stored. Each L0 table entry 153 is either a table descriptor providing a pointer to a subsequent level-1 (L1) table 154, or a block descriptor which provides the access control information for the corresponding block (L0 granule) 155 of physical address space. A control parameter specified by the L0 table entry 153 distinguishes whether the entry is a table descriptor or a block descriptor.
The L1 table 154 referenced via a table descriptor in the L0 table 152 contains a number of L1 table entries 156. In this example, the GPT structure has a maximum of 2 levels of table, the L0 table 152 and L1 table 154, and so each L1 entry 156 is a granule descriptor providing the access control information for one or more corresponding granules (L1 granule) 157 of physical address space of size corresponding to an L1 granule size. The L1 granule size can be configurable based on a control parameter (e.g. PGS in Figure 8) stored in a control register 62, in a similar way to the configurable L0 granule size. For example, the L1 granule size may be 4 KB, 16 KB or 64 KB. The L1 granule size influences which portion of the target physical address is used to select a particular L1 entry from the L1 table 154, again allowing determination of the address of a given L1 entry 156 by applying an offset derived from that portion of the target physical address to an L1 pointer specified in the corresponding table descriptor 153 of the L0 table 152, to generate the physical address at which the required L1 entry 156 is stored.
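The derivation of entry addresses from the target physical address can be sketched as below. The 8-byte entry size and the figure of sixteen granules sharing one L1 entry are assumptions chosen for the example, as are the function names; the source only states that an offset derived from a portion of the address is applied to the relevant base address.

```python
ENTRY_BYTES = 8           # assumption: 64-bit table entries
GRANULES_PER_L1_ENTRY = 16  # assumption: several granules share one L1 entry


def l0_entry_address(l0_base, pa, l0_granule_bytes):
    """Address of the L0 entry describing `pa`."""
    index = pa // l0_granule_bytes            # upper address bits
    return l0_base + index * ENTRY_BYTES


def l1_entry_address(l1_base, pa, l0_granule_bytes, l1_granule_bytes):
    """Address of the L1 entry describing `pa`, given the L1 table base
    taken from the table descriptor in the L0 entry."""
    offset_in_block = pa % l0_granule_bytes   # bits below the L0 index
    granule_index = offset_in_block // l1_granule_bytes
    return l1_base + (granule_index // GRANULES_PER_L1_ENTRY) * ENTRY_BYTES


GB = 1 << 30
KB = 1 << 10
l0_addr = l0_entry_address(0x1000, 3 * GB + 5, GB)
l1_addr = l1_entry_address(0x20000, 3 * GB + 0x11000, GB, 4 * KB)
```

With a 1GB L0 granule, an address in the fourth gigabyte selects L0 index 3. With 4 KB L1 granules, offset 0x11000 within that block is granule 17, which falls in the second L1 entry under the assumed packing.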
While Figure 5 shows an example with a maximum of 2 levels of table, it will be appreciated that other examples could provide more than 2 levels, in which case the L1 entries 156 could also act as table descriptors providing a pointer to a subsequent level of table which provides further entries for address regions at finer granularity than provided in the L1 table 154.
As the access control information for one granule of physical address space may not necessarily need all the bits of one memory location, it is possible for multiple granules to share one granule descriptor or block descriptor, so a further portion of bits of the target physical address may be used to select which of multiple access control information fields (referred to as granule protection information fields or GPI fields in Figure 5) within the granule descriptor or block descriptor provides the access control information related to the target physical address. Hence, when looking up the GPT for a target physical address of a given memory access operation, in a case where no cached information is available in the GPT cache of the requester-side access control circuitry 108 for that target physical address, the access control information obtaining circuitry 109 generates a series of one or more load memory access requests to obtain the relevant L0 (and if required L1) table entries 153, 156 for the target physical address, until either a block or granule descriptor is found, and uses a portion of the target physical address to select the relevant GPI field from that block or granule descriptor that corresponds to the target physical address. The access control circuitry 108 uses the access control information specified in the selected GPI field to determine whether the given memory access operation specifying a given architectural PAS 84 is allowed to access the corresponding region of the system PAS 86.
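The walk described above can be modelled in a few lines. This is a sketch under stated assumptions: memory is a dictionary from entry address to a (kind, value) pair, entries are 64 bits wide, GPI fields are 4 bits wide, and a block descriptor is simplified to carry a single GPI value for its whole block. None of these details, nor the function name gpt_walk, are taken from the source.

```python
ENTRY_BYTES = 8                   # assumption: 64-bit entries
GPI_BITS = 4                      # assumption: 4-bit GPI fields
GPIS_PER_ENTRY = 64 // GPI_BITS   # 16 granules share one granule descriptor


def gpt_walk(memory, l0_base, pa, l0_granule, l1_granule):
    """Return the GPI field that applies to physical address `pa`.

    `memory` maps an entry address to (kind, value), where kind is
    "table" (value = L1 table base address), "block" (value = one GPI
    for the whole L0 granule, a simplification) or "granules"
    (value = packed GPI fields).
    """
    kind, value = memory[l0_base + (pa // l0_granule) * ENTRY_BYTES]
    if kind == "block":
        return value
    # Table descriptor: follow the pointer to the L1 table.
    granule = (pa % l0_granule) // l1_granule
    l1_addr = value + (granule // GPIS_PER_ENTRY) * ENTRY_BYTES
    kind, packed = memory[l1_addr]
    if kind != "granules":
        raise ValueError("expected a granule descriptor at level 1")
    # A further portion of the address selects the GPI field.
    field = granule % GPIS_PER_ENTRY
    return (packed >> (field * GPI_BITS)) & ((1 << GPI_BITS) - 1)


GB, KB = 1 << 30, 1 << 10
memory = {
    0x1000: ("block", 0b1001),                      # L0 entry 0
    0x1008: ("table", 0x8000),                      # L0 entry 1
    0x8000: ("granules", 0b1011 << (2 * GPI_BITS)), # L1 entry 0, field 2 set
}
```

For example, gpt_walk(memory, 0x1000, GB + 2 * 4 * KB + 12, GB, 4 * KB) follows the table descriptor in L0 entry 1 and returns field 2 of the first granule descriptor.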
The access control information (GPI field) for a given granule of physical addresses could, for example, support at least the following encodings:
GPI field encoding    Meaning
0b0000                No accesses permitted in any of the architectural PASs
0b1000                Accesses permitted to Secure PAS only
0b1001                Accesses permitted to Non-secure PAS only
0b1010                Accesses permitted to Root PAS only
0b1011                Accesses permitted to Realm PAS only
0b1111                All accesses permitted from any PAS
Other                 Reserved or used for other purposes
It will be appreciated that this is just one possible encoding scheme, and other examples may represent the same settings using different patterns of bits of a GPI field. Also, the list of supported encodings is not exhaustive. Other encodings could be supported to provide further control settings for determining whether a given memory access request is allowed to access a particular physical address region.
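A granule protection check against the encodings listed above could be expressed as follows. The string names for the PASs and the decision to treat reserved encodings as a failed check are assumptions made for this sketch.

```python
ALL_PAS = frozenset({"secure", "non-secure", "root", "realm"})

# Encodings as listed in the text; PAS names are illustrative labels.
GPI_ALLOWED = {
    0b0000: frozenset(),
    0b1000: frozenset({"secure"}),
    0b1001: frozenset({"non-secure"}),
    0b1010: frozenset({"root"}),
    0b1011: frozenset({"realm"}),
    0b1111: ALL_PAS,
}


def access_permitted(gpi: int, pas: str) -> bool:
    """True if a request targeting architectural PAS `pas` passes the check.

    Reserved encodings are treated as a failed check (an assumption).
    """
    return pas in GPI_ALLOWED.get(gpi, frozenset())
```

A request that fails this check would correspond to the fault case described for the requester-side access control checks.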
A challenge when managing such access control tables which are looked up based on a target physical address is that, given the large memory capacity supported in the system PAS implemented in modern processing systems, the access control table may incur considerable cost in terms of memory footprint. This may be a particular problem in scenarios where there is a security risk associated with pushing certain parts of the access control table structure out to external storage accessed via a device port 26, such that to provide security and reliability guarantees it may be desirable that at least the L0 table 152 is stored in on-chip storage. However, unlike address translation tables which may often be defined for only a portion of addressable virtual address space, for security-sensitive access control information provided in a table structure looked up based on physical address it may typically be necessary to associate each physical address in the addressable range of the system PAS with a setting for the access control information, in case software generates an access request whose virtual address translates to that physical address based on address translation mappings in the page tables. Hence, the size of the L0 table 152 may scale with the overall size of the addressable range of the system PAS 86. For example, if each level 0 table entry is a 64-bit entry to enable table pointers to be specified in the entry, a given level 0 table entry spans 1GB of memory address space and the addressable range of the physical address space corresponds to 48 bits of address space (i.e. covering 2^48 addressable bytes of memory), the L0 table 152 may require a 2MB contiguous block of physical address space to be reserved for it, which may be prohibitive as it may be difficult to reserve such a large block of physical address space in on-chip memory while preserving the security/reliability guarantees.
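The footprint figure quoted above follows from simple arithmetic, reproduced here as a short sketch (the function name is illustrative). A 48-bit address space divided into 1GB blocks gives 2^18 entries, and at 8 bytes each that is 2^21 bytes, i.e. 2MB.

```python
def l0_table_bytes(pa_bits, l0_granule_bytes, entry_bytes=8):
    """Size of a linear L0 table covering the whole addressable range."""
    entries = (1 << pa_bits) // l0_granule_bytes
    return entries * entry_bytes


GB = 1 << 30
MB = 1 << 20
size_1gb_granule = l0_table_bytes(48, 1 * GB)    # 2 MB
size_64gb_granule = l0_table_bytes(48, 64 * GB)  # 32 KB
```

The same calculation shows why a larger L0 granule size is attractive from a footprint point of view: with 64GB blocks the table shrinks by a factor of 64.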
Another challenge with managing such access control tables is that modern processing systems may provide a shared physical address space which covers physical storage units in more than one chip or integrated circuit. For example, a multi-socket processing system may be implemented on a circuit board having multiple "sockets" for receiving integrated circuit chips. The components of the host system 4 described with reference to Figure 1 may be distributed across the integrated circuits corresponding to each socket. Often, but not always, the memory layout of the chips in each socket may be nominally identical (each chip having one or more PEs 6 and some memory storage 12, as well as the system interconnect 10 functionality and root ports 26 for coupling to devices). In a multi-socket system, there may be interfaces at the boundary of each socket to handle inter-socket communications, as well as each socket comprising root ports 26 for communication with external devices 20, 22 other than the integrated circuits in the sockets themselves. Another way of providing a multi-chip system can be to distribute components of the host system 4 across multiple chiplets which are disposed on an interposer or stacked three-dimensionally so as to provide for inter-chiplet communication (the chiplets being more tightly integrated than the chips in the sockets described earlier). In a multi-socket or multi-chiplet system, it may be that the portion of the system PAS which corresponds to the hardware storage provided on one socket or chiplet does not necessarily correspond to an exact power-of-2 number of locations. Nevertheless, for more efficient addressing of memory and ease of manufacture it is likely that the physical address range addressable on a given chiplet or socket would start at an address aligned to a power-of-2 size boundary.
Hence, it may be likely that the addressable portions of the system PAS which correspond to actual physical memory storage on the chiplets/sockets may not form a single contiguous range of addresses, but may include some "gaps" at the end of the range allocated to each chiplet or socket. This means that a single contiguous linear table as shown in Figure 5 may be wasteful of memory footprint as it could contain many L0 table entries 153 which do not correspond to any real hardware storage on any of the chiplets/sockets, but which nevertheless occupy space in the memory footprint of the table due to the linear indexing scheme for selecting the address of an entry 153 of the L0 table by applying an offset relative to a base address 150 where the offset is selected from a portion of bits of the target address.
In the context of a multi-chiplet or multi-socket system, references to "on-chip" memory can be understood as storage circuitry on any of the integrated circuits corresponding to the chiplets/sockets of the system, even if the memory accessed as on-chip memory by a processing element 6 on one chiplet/socket is actually located on a separate integrated circuit corresponding to a different chiplet/socket. From a security point of view the memory on another chiplet/socket of the multi-chiplet/multi-socket system can be regarded as sufficiently secure compared to the on-chip memory on the same chiplet/socket. Hence, in a multi-chiplet/socket system "on-chip" memory is distinguished from external storage accessed via an I/O device port 26, but may not necessarily be on the same chiplet or socket as the processing element 6 accessing that "on-chip" memory.
Figure 6 shows four examples A to D of access control table configurations that could be used to help manage the overall memory footprint of the access control table. The access control circuitry 108 may have access to table lookup control information stored in the control registers 62, specifying one or more parameters which define how the access control table structure is configured, which influences table lookup actions performed by the access control information obtaining circuitry 109. Based on the current setting of the table lookup control information at a given time, any of the examples A to D could be selected (and other examples could also be selected). For some of these examples (C and D), a setting of the table lookup control information has been selected which causes the system PAS to be treated as having a number of table-lookup-skipping address windows 160, 170 at non-contiguous positions in the physical address space. In example A there are no such table-lookup-skipping address windows and in example B there is one such table-lookup-skipping address window 160. Hence, it will be appreciated that the access control circuitry 108 supports the ability to define multiple such table-lookup-skipping address windows for at least one setting of the table lookup control information, but it is not essential for that setting to always be selected, as control software provided by a given user may configure the table lookup control information depending on needs of a given processing system hardware implementation or the needs of a particular piece of software.
Examples A to D show four possible ways of configuring the table lookup behaviour for particular regions of the system PAS 86, to try to address the memory footprint of the table structure. Each of examples A to D shows a portion of the system PAS 86 supported by the host system 4, and denotes the portions of the system PAS 86 that correspond to particular L0 entries (within the columns headed L0GPT) and L1 entries (within the columns headed L1GPT). It will be appreciated that the portions of the system PAS 86 shown under columns L0GPT and L1GPT refer to the regions of physical address space whose access control behaviour is described by the corresponding entries 153, 156 of the L0 and L1 tables 152, 154, rather than showing the locations storing those entries themselves. Also, although Figure 6 for conciseness shows the system PAS 86 being describable by ten L0 entries 153 and shows an address map for the system PAS 86 corresponding to six L1 entries 156 per L0 entry 153, this is merely a schematic diagram, and it will be appreciated that the number of L0 entries covering the system PAS 86 and number of L1 entries per given L0 entry 153 can be a different number (and often both will correspond to a power-of-2 number of entries).
One approach for reducing memory footprint can be to increase the L0 granule size so that each L0 entry 153 describes a larger block of physical addresses of the system PAS 86. For example, a 64GB L0 granule size could be used as in Example A of Figure 6. However, while this can reduce the footprint of the L0 table required to cover a system PAS of a given size, compared to implementations with smaller L0 granule size, it reduces the granularity with which access control properties can be specified, as it means it is not possible to use an L0 entry 153 as a block descriptor if one sub-region of the 64GB L0 granule corresponding to that entry is to be provided with a different access control behaviour to another sub-region of the same 64GB L0 granule. In practice, it can be very common for 64GB granules to map to a mixture of on-chip resources within the host system 4 and off-chip resources accessed via device ports 26, so that if the 64GB granule size is to be supported, the L0 entry 153 for that 64GB granule would need to be a table entry allowing respective L1 entries 156 in a subsequent level of table to distinguish the access control properties for the different sub-regions.
However, once L1 entries are needed and there is a mix of on-chip and off-chip resources described in the corresponding L0 granule, then either the L1 table has to be stored on-chip (further exacerbating the problem of limited on-chip storage for the table structure, as some on-chip storage is wasted in storing L1 entries that relate to off-chip resources for which the access control information is not subject to the same security/reliability guarantees requiring storage in on-chip memory), or the L1 table has to be stored off-chip in external storage, in which case the security/reliability guarantees for the L1 entries associated with on-chip resources cannot be respected (pushing those entries off-chip could risk the entries being tampered with so that the access control information cannot be trusted).
Another approach can be that, as shown in example B of Figure 6, the access control table structure describes access control information only for a "protected PA space range" (PPS) 159 within the system physical address space. This leaves a non-PPS window 160 at the end of the system PAS, which is not described by the access control table structure. The physical addresses in the non-PPS window 160 can be assumed to have a default value of the access control information (e.g. the default value can be the encoding mentioned above that indicates that accesses to these addresses are permitted only to the non-secure PAS, this being an example of a setting which permits a least secure class of memory access operation to access a corresponding physical address in the non-PPS window 160). Hence, the non-PPS window 160 is an example of a table-lookup-skipping address window as discussed above.
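The behaviour of a single non-PPS window at the end of the address space can be sketched as a simple bounds check performed before any table access. The function names and the callback used to stand in for the table walk are hypothetical; the default value uses the Non-secure-only encoding mentioned in the text.

```python
GPI_NON_SECURE_ONLY = 0b1001  # default setting assumed for the non-PPS window


def lookup_gpi(pa, pps_size, table_lookup):
    """Return access control information for `pa`.

    Addresses at or above `pps_size` lie in the non-PPS window and take
    the default value without any table lookup.
    """
    if pa >= pps_size:
        return GPI_NON_SECURE_ONLY
    return table_lookup(pa)


walks = []  # records which addresses actually triggered a table lookup


def fake_table_lookup(pa):
    walks.append(pa)
    return 0b1011  # pretend the table says "Realm PAS only"


inside = lookup_gpi(0x1000, 1 << 36, fake_table_lookup)
outside = lookup_gpi(1 << 40, 1 << 36, fake_table_lookup)
```

Only the first request reaches the table; the second is resolved from the default value alone, which is what allows the table to omit entries for the window.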
Figure 6 is not to scale, as often the size of the non-PPS window 160 may be much greater than the size of the PPS window 159. The size of the non-PPS window 160 can be selected based on a software-programmable control parameter stored in a control register 62. For example, the non-PPS window can be used when either there is no memory storage provided in hardware beyond a certain point of the addressable PAS, or when there are no hardware storage resources beyond a certain point of the system PAS 86 that require any setting other than the default setting associated with the non-PPS window 160.
While use of a single non-PPS window 160 as shown in example B can help to reduce memory footprint of the table structure by reducing the region to be covered by the L0 table 152, it does not scale well to multi-socket systems because the single contiguous PPS window 159 still needs a single contiguous L0 table to be defined sufficient to cover all physical addresses addressable across all sockets other than the non-PPS window 160 at the end of the final socket's local address range. This can leave vast regions of actually unused physical address space associated with each socket (other than the final socket having the non-PPS window 160 in its portion of address space) which nevertheless require memory footprint to be allocated for redundant L0 entries 153 describing those unused regions. A similar problem arises with multi-chiplet systems.
Example C in Figure 6 shows a first approach for addressing these problems. In this example, the table lookup control information supports a setting which allows multiple non-PPS windows 160 to be defined at non-contiguous locations within the system PAS 86. Hence, the access control table only needs to explicitly describe properties for the remaining PPS windows 159, and explicit L0 table entries can be omitted for the addresses in the non-PPS windows 160, as the access control information is assumed to be a default value (e.g. access permitted to the non-secure PAS only). Hence, the non-PPS window 160 is an example of a table-lookup-skipping address window.
In some examples, the architectural control parameters defining the positions of the non-PPS windows 160 could be flexible enough to permit the non-PPS windows 160 to be defined at arbitrary locations (not necessarily at constant stride intervals), e.g. using a set of registers independently defining the base address and size of each non-PPS window.
However, in practice, an expected use case for such non-PPS windows may be in multi-socket or multi-chiplet systems which often have symmetric layouts of their local portions of system PAS 86. Therefore, to reduce the amount of control state information needed to define the positions of the non-PPS windows 160 and reduce the software overhead in setting that control state information, a convenient way of defining the positions of the non-PPS windows 160 can be using a stride parameter which specifies a constant stride offset between the positions of the non-PPS windows. For example, example C shows an example with 2 non-PPS windows 160 disposed at the end of each half of the system PAS 86 corresponding to sockets 1 and 2 (of course other examples could have more than 2 sockets and so could have more non-PPS windows 160 at the end of each equally divided portion of system PAS 86).
As the non-PPS windows 160 can be relatively large, it can be helpful to adjust the function used to determine the L0 table indexing offset from the target physical address, so that portions of bits which distinguish addresses within each of the PPS windows 159 are concatenated ("stitched together"), and then the concatenated value is used as the offset relative to the L0 table base address 150. This means that a single contiguous linear L0 table of size corresponding to the overall size of the PPS windows 159 can be used, even though those PPS windows 159 are actually distributed across a wider range of the system PAS 86.
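One way to picture the stitched index is shown below, assuming power-of-2 sizes for the stride, the per-stride PPS window and the L0 granule. The parameter names (stride_bits, pps_bits, l0_granule_bits) and the exact bit selection are assumptions for illustration; the source states only that the bits distinguishing addresses within the PPS windows are concatenated.

```python
def stitched_l0_index(pa, stride_bits, pps_bits, l0_granule_bits):
    """Return the L0 table index for `pa`, or None if `pa` falls in a
    non-PPS window (in which case no table lookup is needed).

    Assumes l0_granule_bits <= pps_bits <= stride_bits and that each
    PPS window starts at a multiple of the stride.
    """
    offset_in_stride = pa & ((1 << stride_bits) - 1)
    if offset_in_stride >> pps_bits:
        return None                       # beyond the PPS part of this stride
    region = pa >> stride_bits            # which socket/chiplet interval
    # Concatenate the region number with the offset inside the PPS window,
    # dropping the address bits that only ever select non-PPS space.
    stitched = (region << pps_bits) | offset_in_stride
    return stitched >> l0_granule_bits


GB = 1 << 30
# Hypothetical layout: 1TB stride, 256GB PPS window per stride, 1GB L0 granule.
idx_socket0 = stitched_l0_index(5 * GB, 40, 38, 30)
idx_socket1 = stitched_l0_index((1 << 40) + 5 * GB, 40, 38, 30)
idx_gap = stitched_l0_index(300 * GB, 40, 38, 30)
```

In this hypothetical layout the second socket's entries start at index 256, immediately after the first socket's 256 entries, so a two-socket system needs 512 L0 entries rather than the 2048 that a plain linear index over 2TB would require.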
Example D of Figure 6 shows another approach for addressing these problems. Again, multiple non-PPS windows 160 are defined, but in addition, a further type of table-lookup-skipping window of address space (referred to as a GPC-bypass window 170) is defined using the table lookup control information. Similar to the non-PPS window 160, the GPC-bypass window 170 can be regarded as a table-lookup-skipping window of address space, as again if a target physical address falls within one of the GPC-bypass windows 170, then a default value for the access control information can be assumed without requiring any explicit lookup of table entries 153, 156.
Again, the GPC-bypass windows 170 can be disposed at intervals of constant stride to handle multi-chiplet or multi-socket systems. Alternatively, the table lookup control information may also support settings where only a single instance of a GPC-bypass window 170 is configured (suitable for systems which only have one chiplet/socket). Another option could be to provide more complex table lookup control information which allows more arbitrary definition of locations of each GPC-bypass window 170.
However, the GPC-bypass window 170 differs from the non-PPS window 160 in a number of respects. Firstly, the GPC-bypass windows 170 may be determined based on comparison of portions of bits of the target address less significant than the portion of bits of the target address that is used for identifying whether the address corresponds to one of the non-PPS windows 160.
Therefore, typically the size of the GPC-bypass windows 170 may be much smaller than the size of the non-PPS windows 160. It will be appreciated that an architecture may support a variety of possible sizes for the windows 160, 170, and so while in use it may be expected to be common that the GPC-bypass window 170 is smaller than the non-PPS window 160, this is not essential and the architecture could, at least from a software point of view, theoretically support combinations of settings for the sizes of the windows 160, 170 where the GPC-bypass window 170 could be the same size or larger than the non-PPS window 160. Nevertheless, the ISA may support at least one setting where the size of the GPC-bypass window 170 is smaller than the minimum size supported for the non-PPS window 160.
Also, irrespective of non-PPS window size (or whether non-PPS windows 160 are supported at all) the ISA may support at least one setting of the table lookup control information for which the GPC-bypass window 170 is a window of smaller size than the size of one L0 granule, e.g. in the example D the GPC-bypass window is smaller than the L0 granule size of 64 GB. This can be helpful for cases where it is desirable to have a larger L0 granule size such as 64 GB to reduce the memory footprint of the on-chip L0 table 152, but a L0 granule of that size would describe a mixture of on-chip and off-chip resources. By defining the GPC-bypass window 170 to have a size smaller than the L0 granule size, this allows the sub-region of the L0 granule that describes off-chip resources to be described using the default properties defined by the GPC-bypass window while the other sub-region(s) of the L0 granule can be explicitly described by a L0 entry stored on-chip to preserve security/reliability guarantees. Hence, the competing demands of low memory footprint but sufficient security/reliability guarantees can be preserved even if the granularity of distribution of on-chip and off-chip resources is too fine to be describable in a single L0 table entry 153 of larger L0 granule size, and the next smallest L0 granule size supported would cause too great an increase in memory footprint of the L0 table.
Also, the default value assumed when a physical address is determined to be within one of the GPC-bypass windows 170 may differ from the default value assumed if the physical address is in one of the non-PPS windows 160. For example, as a common use for the GPC-bypass windows 170 is for regions of address space corresponding to off-chip devices 20, 22 which may be implementing their own form of access control checks (an example of completer-side checks as mentioned earlier), then it may be sufficient for the default value of the GPC-bypass windows 170 to be a most permissive control setting (e.g. the setting which permits access from any PAS (0b1111) mentioned earlier), so that the decision on whether the access should be allowed could be delegated to a completer-side access control unit associated with the corresponding device 20, 22. This contrasts with the more restrictive default value for the access control information used for the non-PPS windows 160, where it may be preferred (given the wide range of address space covered by the non-PPS windows 160) to restrict access to the non-secure PAS only, for example.
Also, as the size of the GPC-bypass window 170 can be smaller than one L0 granule, the table lookup indexing function is not adjusted to skip bits of the target physical address associated with identification of whether the address is in the GPC-bypass window 170 in the generation of the index used as an offset to find the corresponding L0 entry 153. This contrasts with the approach taken for the non-PPS windows 160 where the corresponding bits of the target address used to identify occupancy of the non-PPS window 160 are skipped in the index generation function.
Hence, Figure 6 shows two examples 160, 170 which can be provided as table-lookup-skipping address windows of system PAS 86 for which, if a target physical address of a given memory access falls within one of those windows, a given default value of the access control information is assumed and there is no need to look up the access control table structure to obtain the access control information for that target physical address. As shown in examples C and D, for scaling to multi-socket/multi-chiplet systems and/or for supporting memory system topologies involving fine-grained distributions of on-chip and off-chip resources while supporting certain security guarantees, it can be helpful to provide settings of the table lookup control information which support the ability to define non-contiguous table-lookup-skipping address windows 160, 170. However, examples C and D are not the only ways in which multiple non-contiguous table-lookup-skipping address windows could be provided. For example, while not shown in Figure 6, another example could comprise only the GPC-bypass windows 170, without having any non-PPS windows 160 (either because such non-PPS windows 160 are not supported at all, or because they are supported but the software has set the table lookup control information to a setting that disables the non-PPS windows 160 from being used). Also, while examples C and D show settings where both the non-PPS window 160 and GPC-bypass window 170 are repeated at intervals of constant stride to handle multi-chiplet and multi-socket systems, it is also possible to provide a setting of the table lookup control information which configures the system PAS 86 to comprise a single instance of the non-PPS window 160 and a single instance of the GPC-bypass window 170, which again would provide for non-contiguous table-lookup-skipping address windows.
It will also be appreciated that while examples C and D providing non-contiguous table-lookup-skipping address windows are contrasted with examples A and B comprising zero or one such window, nevertheless a given ISA (and a given host system 4 supporting that ISA) may still support settings of the table lookup control information corresponding to the examples A or B.
Figure 7 illustrates steps for enforcing access control restrictions on memory access requests based on access control information. At step 200, the access control circuitry 108 receives a given memory access request specifying a given target physical address (PA). The target PA may have been generated based on address translation by the MMU 106, or for some operating states of the processing element 6 may be mapped directly from the virtual address specified by a corresponding instruction. At step 202, the access control information obtaining circuitry 109 determines whether the target PA corresponds to a table-lookup-skipping address window. For example, this can be determined based on comparing a selected portion of bits of the target PA with a reference value, which could be a fixed predetermined value (e.g. 0) or a programmable value identified using a control parameter within a control register 62.
If the target PA corresponds to a table-lookup-skipping address window, then at step 204, a default value is returned for the access control information, the default value being selected independent of any lookup of the access control table structure. In many cases, this means that there is no need to perform any lookup of the access control table structure. However, in the case of an address hitting in the GPC-bypass window 170, some micro-architectural implementations might choose to initiate a lookup of the access control table structure anyway (e.g. in case other accesses associated with other sub-regions of a L0 granule comprising a GPC-bypass window 170 may follow later, in which case information from the corresponding L0 entry 153 could be cached ready for those accesses, to try to improve performance). Nevertheless, if a lookup of the access control table structure is initiated in cases where the PA corresponds to the table-lookup-skipping address window, then if that lookup finds there is no valid L0 entry defined for the target PA, no fault would be generated, as any speculatively performed lookup would not be an architecturally required lookup and software may, anticipating that the address would correspond to a GPC-bypass window 170, not have defined any valid L0 table entry for the target PA. In cases of an address hitting in the non-PPS window, it is best not to trigger any lookup of access control information for the target PA (when returning the default), because the concatenation of portions of bits for table indexing means generating the L0 table index based on the target PA may risk hitting against a L0 entry for a different L0 granule in the PPS window 159 (as the concatenation function means the index generation function treats all addresses as if they are in the PPS window 159).
Regardless of whether a table lookup is actually performed, at step 208 the access to the memory system in response to the given memory access request is controlled by the access control circuitry 108 based on the default value of the access control information.
If the target PA does not correspond to any table-lookup-skipping address window, then at step 206 the access control information obtaining circuitry 109 generates a table index based on the target PA, and the access control table structure is looked up based on that table index (which may be applied as an offset relative to a base address 150 of the access control table structure). Access control information is returned based on the lookup to the indexed entry of the access control table structure (either directly if the indexed entry is a block descriptor, or via a further access to a subsequent level of table if the indexed entry is a table descriptor). If an invalid table entry is encountered at any point of the lookup process for the target PA, a fault can be triggered to cause an exception handler to deal with the cause of the fault (e.g. to define a valid table entry corresponding to the target PA). Assuming valid access control information is returned, then again at step 208 the access control circuitry 108 controls access to the memory system for the given memory access request, based on the access control information returned from the access control table structure. It will be appreciated that in systems having a caching structure for caching information derived from the access control table structure, it may not be essential to look up the access control table structure at every instance of step 206, as on some instances the target PA may correspond to valid cached information in the caching structure, in which case that information can be used to provide the access control information for the target PA (or at least to eliminate some steps of the lookup process for looking up the access control data structure in memory).
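The flow of Figure 7 can be summarised in a short Python sketch. This is an illustrative software model only, not the circuitry itself: the names `obtain_access_control_info`, `in_skip_window`, `table_lookup` and the `GPCFault` exception are hypothetical, and the window test and table lookup are passed in as callables because their details depend on the configured control parameters. The toy table contents and window position are arbitrary assumed values.

```python
from typing import Callable, Optional


class GPCFault(Exception):
    """Hypothetical fault raised when a required lookup finds no valid entry."""


def obtain_access_control_info(
    pa: int,
    in_skip_window: Callable[[int], bool],
    table_lookup: Callable[[int], Optional[int]],
    default_info: int,
) -> int:
    """Return the access control information for target physical address pa."""
    # Step 202: does the PA fall in a table-lookup-skipping address window?
    if in_skip_window(pa):
        # Step 204: the default is chosen independently of any table lookup.
        return default_info
    # Step 206: an architecturally required lookup; an invalid entry faults.
    info = table_lookup(pa)
    if info is None:
        raise GPCFault(f"no valid table entry for PA {pa:#x}")
    return info


# Toy configuration (assumed values): one 1GB skipping window starting at 2GB,
# and a dict standing in for the table, keyed by 1GB-granule index.
table = {0: 0b1001}
skip = lambda pa: (pa >> 30) == 2
print(obtain_access_control_info(0x1000, skip, lambda pa: table.get(pa >> 30), 0b1111))
```

Step 208 (using the returned value to permit or reject the access) is deliberately left out, since it depends on the encoding of the access control information.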
Figure 8 illustrates an example of table lookup control information which can be programmed by software to configure the size, layout and position of the access control table structure, and hence adjust corresponding properties of the table lookup process by which the access control information obtaining circuitry 109 obtains access control information for a given physical address. Figure 8 illustrates a subset of the control registers 62 of the processing element 6. It will be appreciated that Figure 8 does not necessarily show all control registers 62, but shows a subset relevant to lookups of the access control table.
In this example, the table lookup control information comprises a table base address register 150 (GPTBR_EL3) which as mentioned earlier stores a parameter identifying a base address of the L0 table 152. In this example, the table lookup control information also comprises a table lookup control register (GPCCR_EL3) 180 which stores control information defining the format of the access control table structure including definition of any non-PPS windows 160, and a GPC-bypass window control register (GPCBW_EL3) 190 which stores information defining the position and size of any GPC-bypass windows 170. It will be appreciated that the same control information could be represented in a different format, e.g. using a different arrangement of parameters within a set of registers, so the particular allocation of control values to particular registers 180, 190 is not essential and could be varied (e.g. the same information could be spread across three or more registers instead of just the two registers 180, 190 as in this example). In this example, the control registers 150, 180, 190 are (as denoted by the register name suffixes _EL3) restricted to being updatable only by program instructions executing in exception level EL3, which is an exception level associated with a level of privilege more privileged than a hypervisor-level privilege. In some examples, exception level EL3 may be the most privileged operating state supported by the processing element 6 (where exception levels refer to operating states with different levels of privilege that are orthogonal to the security states (Non-Secure, Secure, Root, Realm) mentioned earlier). Hence, the table lookup control information 150, 180, 190 can be reserved to be programmable only by secure software which is more privileged than a hypervisor which manages virtualisation of guest operating systems on the host system 4.
In this example, the table lookup control register GPCCR_EL3 180 comprises a number of items of control state:
Access control check enable parameter (GPC) 182: Specifies whether or not the access control checks performed by the access control circuitry 108 based on the access control table structure are enabled or disabled. If disabled, memory accesses are not prevented based on the access control information defined in the table structure or based on default settings of the access control information implicitly defined for table-lookup-skipping address windows 160, 170.
L0 granule size parameter (L0GPTZ) 183: Defines the size of one L0 granule, which is a block of physical addresses that shares a single L0 entry 153 of the L0 table 152. For example, the L0 granule size could be selected from two or more settings providing different options for L0 granule size.
In one example encoding, L0GPTZ may define the L0 granule size as follows:
L0GPTZ    Meaning
0b0000    30 bits. Each L0 entry covers 1GB of address space.
0b0100    34 bits. Each L0 entry covers 16GB of address space.
0b0110    36 bits. Each L0 entry covers 64GB of address space.
0b1001    39 bits. Each L0 entry covers 512GB of address space.
Of course, this set of options is just one possible example, and other L0 granule sizes could also be supported. When an L0 entry is described as corresponding to X bits, this means that the lower X bits of the target PA can be ignored for the purpose of generating indexes into the L0 table, with the L0 table index being derived from a more significant portion of the target PA.
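As a minimal sketch of how the example encoding above might be decoded in software, the following Python maps the granule size field to the bit position it denotes and checks whether two physical addresses agree in all bits above the ignored low bits. The names `L0_GRANULE_BITS`, `l0_granule_bits` and `same_l0_granule` are hypothetical, and rejecting unlisted encodings is an assumption made for the illustration, not something stated in the text.

```python
# Example encoding from the text: L0 granule size field -> number of low PA
# bits ignored when indexing the L0 table.
L0_GRANULE_BITS = {0b0000: 30, 0b0100: 34, 0b0110: 36, 0b1001: 39}


def l0_granule_bits(field: int) -> int:
    """Decode the granule size field; unlisted encodings are rejected here."""
    if field not in L0_GRANULE_BITS:
        raise ValueError(f"unsupported granule size encoding {field:#06b}")
    return L0_GRANULE_BITS[field]


def same_l0_granule(pa_a: int, pa_b: int, bits: int) -> bool:
    """True when two PAs differ only in the ignored low `bits` bits."""
    return (pa_a >> bits) == (pa_b >> bits)


bits = l0_granule_bits(0b0110)
print(bits, (1 << bits) // (1 << 30), "GB per L0 entry")
```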
Physical granule size PGS 184: Defines the size of one physical address granule, which is the block of physical addresses that shares a single GPI field within an L1 table entry 157 (the GPI field being the item of access control information corresponding to that physical address granule). In an example with 4-bit GPI fields and 64-bit L1 table entries 157, GPI fields for 16 granules could be combined into the same L1 entry 157.
In one example encoding, PGS 184 may define the physical granule size as follows:
PGS     Meaning
0b00    4KB
0b01    64KB
0b10    16KB
Again, it will be appreciated that other sizes or encodings could also be supported. The PGS 184 influences the choice of which bits of the target PA are used to index into the L1 table and/or to select the GPI field within a block descriptor or granule descriptor that provides the access control information for that target PA.
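To illustrate how the physical granule size could select one GPI field, the sketch below assumes the 4-bit GPI fields and 64-bit L1 entries mentioned above, giving 16 fields per entry. The field ordering (lowest-addressed granule in the least significant field) is an assumption made for this illustration, and the names `PGS_BITS`, `gpi_slot` and `select_gpi` are hypothetical.

```python
# Example PGS encoding from the text, as log2 of the granule size in bytes.
PGS_BITS = {0b00: 12, 0b01: 16, 0b10: 14}  # 4KB, 64KB, 16KB

GPI_BITS = 4                        # width of one GPI field
GPIS_PER_ENTRY = 64 // GPI_BITS     # 16 granules described per L1 entry


def gpi_slot(pa: int, pgs: int) -> int:
    """Which of the 16 GPI fields in an L1 entry describes this PA."""
    return (pa >> PGS_BITS[pgs]) % GPIS_PER_ENTRY


def select_gpi(l1_entry: int, pa: int, pgs: int) -> int:
    """Extract the GPI field for pa (assumed least-significant-first order)."""
    shift = GPI_BITS * gpi_slot(pa, pgs)
    return (l1_entry >> shift) & ((1 << GPI_BITS) - 1)


# Arbitrary entry whose n-th field holds the value n, for easy inspection.
entry = 0xFEDCBA9876543210
print(select_gpi(entry, 0x3000, 0b00))
```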
Protected physical address region size (PPS) 185: Defines the bit width of the portion of the target PA used to address a single PPS memory region 159, and hence also the size of that PPS memory region 159. Implicitly, this parameter therefore also specifies the size and position of the corresponding non-PPS memory region 160. For example, one possible encoding can be as follows:
PPS      Meaning    PPS region size
0b000    32 bits    4GB
0b001    36 bits    64GB
0b010    40 bits    1TB
0b011    42 bits    4TB
0b100    44 bits    16TB
0b101    48 bits    256TB
0b110    52 bits    4PB
It will be appreciated that other implementations may have a different range of options supported and could use a different encoding scheme (e.g. a 4-bit encoding could support additional options for the PPS region size). Some examples could support an encoding of PPS (or another parameter) which disables support for the non-PPS region entirely so that in that case the entire system PAS is regarded as a PPS region 159.
PPS stride size 186: If supported, a parameter defining the number of partitions into which the system PAS 86 is logically divided, with each partition comprising an instance of the PPS region 159 mentioned above of size specified by the PPS parameter 185, and the remaining part of that partition being considered a non-PPS region 160. As typically the stride size may generally be desired to be the same for both non-PPS windows 160 and GPC-bypass windows 170 (e.g. corresponding to the size of the address space provided per socket), then while the PPS stride size 186 is shown in Figure 8 as separate from the bypass window stride parameter 192 mentioned below, these parameters 186, 192 could also be combined into a single item of control information.
For example, as described further with respect to Figure 9, the PPS stride size 186 may specify a bit position c representing a bit beyond which it is not necessary to compare bits of the target PA with a reference value (e.g. 0) for detecting non-PPS windows, which may effectively cause the arrangement of PPS/non-PPS regions 159, 160 to be replicated multiple times across the system PAS 86. If provided, the PPS stride size 186 may be encoded in a similar way to the encoding shown for the bypass window stride parameter 192 mentioned below, so this encoding is not mentioned again here.
GPC-bypass window enable parameter (GPCBW ENABLE) 187: Selects whether or not GPC-bypass windows 170 are enabled in the current configuration. If disabled, any access to a PA in the PPS region 159 will have its access control checks controlled based on access control information derived from a corresponding block descriptor or granule descriptor of the access control table structure. If support for GPC-bypass windows 170 is enabled, the control state stored in the GPC-bypass window control register 190 defines the size, position, and number of GPC-bypass windows 170 currently configured, and physical addresses which correspond to a GPC-bypass window may have access control checks performed based on a default value of the access control information rather than a value obtained from a lookup of the access control table structure (GPT).
The GPC-bypass window control register 190 in this example specifies the following control parameters:
GPC-bypass window size (BWSIZE) 191: Defines the size of one instance of the GPC-bypass window 170. For example, one possible encoding of the parameter is as follows:
BWSIZE    Meaning (bit position L)    GPC-bypass window size
0b000     30 bits                     1GB
0b001     31 bits                     2GB
0b010     32 bits                     4GB
0b100     34 bits                     16GB
0b110     36 bits                     64GB
As can be seen by comparing the BWSIZE encodings with the encodings for the PPS region size 185 described above (which may support regions of the order of TB or PB), the GPC-bypass window 170 may often be relatively small. Also, as seen from comparing the encodings of the BWSIZE 191 and L0GPTZ 183 parameters, settings of the table lookup control information are supported which allow the GPC-bypass window 170 to be smaller than the L0 granule size defined by L0GPTZ 183. The bit position L shown in the table refers to the bit position at the lower end of the portion of PA bits that is compared with corresponding bits of the BWADDR parameter to determine whether a PA falls within the GPC-bypass window.
GPC-bypass window stride (BWSTRIDE) 192: Defines the stride between start addresses of successive GPC-bypass windows, so that multiple GPC-bypass memory regions can be created in the memory map across a specific stride. This can be useful for multi-chiplet or multi-socket systems. One possible encoding can be as follows (again, it will be appreciated this is just one example and others are also possible):
BWSTRIDE    GPC-bypass window stride                              Bit position c
0b0000      1TB                                                   40
0b0001      4TB                                                   42
0b0010      16TB                                                  44
0b0011      64TB                                                  46
0b0100      128TB                                                 47
0b0101      256TB                                                 48
0b0110      512TB                                                 49
0b0111      1PB                                                   50
0b1111      No stride (only one GPC-bypass window supported)      56
Again, in some examples, a separate GPC-bypass window stride parameter 192 may not be necessary if the stride can be deduced from a PPS stride parameter 186. The bit position c shown in the table above is a position starting from which PA bits are masked for the purpose of address comparisons for detecting whether the PA falls within a GPC-bypass window, which has the effect of causing the GPC-bypass window to be replicated in the system PAS at intervals of the corresponding stride.
GPC-bypass window base address BWADDR 193: Defines the position of the GPC-bypass window 170 relative to a given partition of the system PAS (a partition being one unit of repetition of a size defined by the BWSTRIDE parameter 192). The BWADDR 193 does not need to be a full address having the maximum number of bits used for addressing the memory system, as the GPC-bypass window may be assumed to be aligned to a natural address boundary corresponding to the size of the GPC-bypass window 170 as defined by the BWSIZE parameter 191, so that the lower X bits (where X is the number of bits indicated above for a particular BWSIZE encoding) of the GPC-bypass window address do not need to be specified explicitly in the BWADDR parameter 193 in order to identify the base address of the GPC-bypass window 170.
For example, BWADDR could be encoded using a 26-bit field which represents bits [55:30] of the GPC-bypass window base address (recognising that with the minimum supported BWSIZE value representing 30 bits / 1GB as shown above, with a size-aligned bypass window there is never a need to indicate any bits of the GPC-bypass window base address that correspond to bits [29:0] of a physical address, as these bits [29:0] would merely distinguish different physical addresses within the GPC-bypass window 170). However, if other BWSIZE encodings are supported, the number of BWADDR bits that can be omitted from the stored representation of the base address could vary. In general, the BWADDR parameter identifies a certain number of upper bits of the base address of the GPC-bypass window.
Also, it will be appreciated, based on the more specific examples below, that for some settings of BWSTRIDE, some bits of BWADDR may be masked out of address comparisons to determine whether a given PA falls in the GPC-bypass window 170.
Figure 9 illustrates an approach for determining whether a given physical address corresponds to any non-PPS window 160 of physical address space as defined by the current settings of the table lookup control information. As indicated in Figure 9, a given 64-bit physical address (PA) may be regarded as having certain bit positions s, t, c, y which can be related to parameters of the access control table structure and system PAS as follows:
* bit s corresponds to the L0 granule size, i.e. the granule size is 2^s bytes, such that bits [s-1:0] are the bits of the PA that distinguish respective addresses corresponding to the same L0 entry 153 of the L0 table 152. The position of bit s can be derived from the L0GPTZ 183 parameter (e.g. s could be equal to 30, 34, 36 or 39 if the encoding of L0GPTZ 183 set out above is used).
* bit t corresponds to the PPS size, i.e. the PPS is 2^t bytes. Hence, bits [t-1:0] are the bits of the PA that are used to identify addresses covering the whole PPS window 159 within a single one of the replicated partitions of the system PAS 86 defined for multiple chiplets or sockets. The position of bit t can be derived from the PPS parameter 185, e.g. as one of 32, 36, 40, 42, 44, 48 or 52 bits if the encoding scheme above is used.
* bit c corresponds to the stride size by which PPS/non-PPS regions are repeated across the system PAS 86 (i.e. the stride is 2^c bytes). c can be selected based on the PPS stride parameter 186, or if the BWSTRIDE parameter 192 is supported, based on the BWSTRIDE parameter 192 such that the same parameter 192 defines strides for both non-PPS region repetition and GPC-bypass window repetition. Regardless of which parameter 186, 192 is used, the encoding shown above for the BWSTRIDE parameter 192 can be used to identify the position of bit c. Note that while c is shown at a less significant bit position than bit y in the example of Figure 9, if the stride parameter 186, 192 has been configured to specify that there should only be one instance of the PPS/non-PPS region in the system PAS (with no strided repetition), then bit c may be at the same position as bit y mentioned below.
* bit y corresponds to the overall size of the system PAS 86, i.e. the system PAS comprises 2^y bytes. In some examples, the system PAS may have a fixed size (e.g. with y = 55). In other examples, a control parameter (not shown in Figure 8) could define the size of the system PAS 86 and hence the position of bit y.
As shown in the lower part of Figure 9, to determine whether a given target PA of a memory access operation falls within the non-PPS region, the access control information obtaining circuitry 109 can compare bits [c-1:t] of the target PA with a fixed reference value of 0. For example, to support variable definition of c and t, it can be efficient to apply a mask to the 64-bit PA to mask out bits [63:c] and [t-1:0] to clear those bits to 0, before comparing the masked PA with a 64-bit value of 0. If the result of the comparison indicates that bits [c-1:t] of the target PA are equal to 0, then the target PA is determined to be in a PPS region 159 of system PAS 86, and so the access control information for that target PA is described in the access control table structure, so a lookup to that structure may be required if cached access control information is not already available in a GPT cache, TLB or other caching structure. If bits [c-1:t] are not equal to the fixed reference value of 0, then the target PA is determined to be in a non-PPS region 160 and so a default setting for the access control information is assumed, in this case the setting that permits access to the least secure class of memory access operations (e.g. the setting which permits access to the target PA via the non-secure PAS only).
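The mask-and-compare test described for Figure 9 can be sketched in a few lines of Python. This is an illustrative model under the assumption that t and c have already been decoded from the control parameters; the function name `in_non_pps_window` and the example values t = 40 and c = 42 (a 1TB PPS region repeated every 4TB) are chosen here for illustration.

```python
def in_non_pps_window(pa: int, t: int, c: int) -> bool:
    """True when bits [c-1:t] of pa are non-zero, i.e. pa is in a non-PPS window."""
    # Keep only bits [c-1:t]: bits at position c and above, and bits below t,
    # are masked out before the comparison with 0.
    mask = ((1 << c) - 1) & ~((1 << t) - 1)
    return (pa & mask) != 0


t, c = 40, 42
TB = 1 << 40
for pa in (0, TB - 1, TB, 4 * TB, 5 * TB):
    print(f"{pa:#016x}", in_non_pps_window(pa, t, c))
```

Because bits at position c and above are masked, an address at the start of the second 4TB partition is treated the same way as address 0, which is what replicates the PPS/non-PPS layout at each stride.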
Figure 10 illustrates generation of addresses of L0 table entries in cases where a non-PPS region 160 has been defined and the stride parameter 186, 192 is such that c < y so that there are at least two non-contiguous non-PPS regions 160 in the system PAS 86. In this case, the bits [c-1:t] of the target PA that are used to detect whether the target PA falls within the non-PPS region 160 are excluded from determination of the table offset used to index into the L0 table 152. Instead, the table index, applied as an offset 230 relative to the table base address 150, is formed by concatenating bit portions [y-1:c] and [t-1:s] of the target PA, so that the L0 table can be implemented as a contiguous block of entries in memory, indexed linearly by the table index offset 230, even though the L0 table describes a number of non-contiguous PPS regions 159 distributed at intervals through the system PAS 86. This enables the relatively simple linear indexing scheme of the L0 table to continue to be used even with the support for multiple non-contiguous PPS/non-PPS regions, while preserving memory capacity by reducing the memory footprint for the L0 table 152.
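The concatenation of Figure 10 can be expressed as a shift-and-mask computation. The sketch below is an illustration only: the function names are hypothetical, the 8-byte entry size is an assumption (the entry size of the L0 table is not stated here), and the example values s = 30, t = 40, c = 42, y = 44 are chosen to give four 4TB partitions each holding a 1TB PPS region.

```python
def l0_table_index(pa: int, s: int, t: int, c: int, y: int) -> int:
    """Concatenate PA[y-1:c] (upper part) with PA[t-1:s] (lower part)."""
    low_width = t - s
    low = (pa >> s) & ((1 << low_width) - 1)    # PA[t-1:s]
    high = (pa >> c) & ((1 << (y - c)) - 1)     # PA[y-1:c]
    return (high << low_width) | low            # bits [c-1:t] are skipped


def l0_entry_address(base: int, pa: int, s: int, t: int, c: int, y: int,
                     entry_bytes: int = 8) -> int:
    """Index applied as an offset from the table base (entry size assumed)."""
    return base + l0_table_index(pa, s, t, c, y) * entry_bytes


s, t, c, y = 30, 40, 42, 44
# Last granule of partition 0 and first granule of partition 1 get adjacent
# indexes, although their addresses are almost 3TB apart.
print(l0_table_index((1 << 40) - 1, s, t, c, y),
      l0_table_index(1 << 42, s, t, c, y))
```

With these values each partition contributes 2^(t-s) = 1024 entries, so the whole table holds 4096 entries rather than the 16384 that a table indexed directly by PA[y-1:s] would need.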
Figure 11 illustrates steps performed by the access control information obtaining circuitry 109 for determining whether a given PA corresponds to a non-PPS window 160 of address space (an example of one of the table-lookup-skipping address windows mentioned in Figure 7), when at least one non-PPS window 160 has been enabled using the table lookup control information. If the table lookup control information has disabled use of any non-PPS windows 160, then the entire system PAS 86 corresponds to a PPS window 159 and so, unless the target PA falls within a GPC-bypass window, the memory access would be controlled based on access control information derived from a table lookup to the access control structure.
However, if at least one non-PPS window 160 is currently configured to be enabled, then at step 250 the access control information obtaining circuitry 109 determines whether bits [c-1:t] of the target PA are equal to a fixed reference value (e.g. 0), where c and t are derived from the stride (186 or 192) and PPS size (185) parameters respectively as mentioned above.
If PA[c-1:t] is not equal to 0, then at step 254 the access control information obtaining circuitry 109 determines that the target PA falls within a non-PPS window 160, and so a default setting for the access control information is returned. For example, the default setting could be the setting permitting a least secure class of memory access operations (e.g. accesses specifying the non-secure architectural PAS) to access the memory location. At step 258, the access to the memory system is controlled by the access control circuitry 108 based on the default value of the access control information.
If PA[c-1:t] is equal to 0, then the target PA falls in the PPS window 159. Note that for the majority of settings where c-t > 1 there will be a greater number of encodings of PA[c-1:t] not equal to 0 than equal to 0, so often the PPS window 159 is many times smaller than the non-PPS window.
If the current setting of control parameters in control registers 62 is such that c=y (the stride parameter 186 or 192 has been set to specify no strided repetition of the non-PPS window through the system PAS 86), then there is only a single non-PPS window 160 and so there is no need for the concatenation operation shown in Figure 10 to stitch together a number of non-contiguous portions of the target PA to form a table index. Instead, at step 257 the access control information obtaining circuitry 109 forms a table index based on bits PA[t-1:s] of the target PA, and at step 259 looks up the access control table structure based on that table index, to obtain (via a single access if a block descriptor is found in the L0 table, or multiple accesses if a table descriptor providing a pointer to the L1 table is found in the L0 table) access control information obtained from the access control table structure corresponding to the target PA.
On the other hand, if c < y (the stride parameter 186 or 192 has been set to specify a strided repetition of the non-PPS window through the system PAS 86), then at step 256 the table index for L0 table lookup is derived from concatenating bits [y-1:c] and [t-1:s] of the target PA, and again at step 259 the access control table structure is looked up based on that table index, to obtain corresponding access control information returned from memory.
Regardless of whether the table index was generated at step 257 or 256, the returned access control information from the access control table structure can be used to control access to the memory system for the given memory access request at step 258. For example, if the memory access request specifies an architectural PAS 84 which is indicated by the default/obtained value of the GPI as not being allowed to access the corresponding target PA, then a fault can be signalled and the memory access rejected.
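The index formation of steps 254 to 257 can be sketched in software as follows. This is a minimal illustrative model only, not the circuitry itself: the helper names `bits` and `l0_table_index` are hypothetical, returning `None` to signal "non-PPS window, use the default" is a convention chosen for the sketch, and placing bits [y-1:c] above bits [t-1:s] in the concatenated index is an assumption about the concatenation order.

```python
def bits(value, hi, lo):
    """Return value[hi:lo] (inclusive bit positions) as an integer; 0 if the field is empty."""
    if hi < lo:
        return 0
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def l0_table_index(pa, y, c, t, s):
    """Return the L0 table index for a target PA, or None if the PA is in a non-PPS window.

    y, c, t, s are the bit positions taken from the control parameters
    (system PAS size, stride, PPS size and L0 granule size respectively).
    """
    # Step 254: a non-zero PA[c-1:t] means the PA lies in a non-PPS window,
    # so a default setting is used and no table lookup is needed.
    if bits(pa, c - 1, t) != 0:
        return None
    low = bits(pa, t - 1, s)
    if c == y:
        # Step 257: no strided repetition, the index is simply PA[t-1:s].
        return low
    # Step 256: strided repetition, concatenate PA[y-1:c] with PA[t-1:s]
    # (assumed here to place PA[y-1:c] in the more significant position).
    high = bits(pa, y - 1, c)
    return (high << (t - s)) | low
```

For instance, with the assumed positions y=48, c=42, t=40 and s=30, an address with PA[47:42]=3, PA[41:40]=0 and PA[39:30]=5 yields the index (3 << 10) | 5, whereas any address with a non-zero PA[41:40] skips the lookup.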
Figure 12 illustrates an example of using the parameters of the GPCBW control register to define the number, position and size of the GPC-bypass windows 170 within the system PAS 86. Figure 12 assumes the encodings of BWSIZE and BWSTRIDE suggested above, and an encoding for the BWADDR parameter where the position of the GPC-bypass window 170 is implicitly aligned in memory to the size of the window as specified by BWSIZE and so BWADDR represents bits [c-1:L] of the base address of the window. Hence, in the example of Figure 12, BWSIZE = 0, BWSTRIDE = 1, and BWADDR = 2 (i.e. bits [c-1:L] encoded as 0b0...010, any intervening bits denoted by "..." also having bit values of 0). Hence, this defines a number of GPC-bypass windows 170 defined at constant stride intervals of 4TB in the system PAS 86, each window starting at an address 2GB beyond the start address 0 of the corresponding 4TB partition of the system PAS 86. For example, a first instance of a GPC-bypass window 170 starts at an address corresponding to 2GB beyond the start of the address space at address 0, a second window 170 starts at an address corresponding to 4TB + 2GB beyond the start of the address space at address 0, and so on. Each window 170 is of size 1GB as selected by the BWSIZE parameter.
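The Figure 12 layout can be reproduced with a short calculation. The sketch below is illustrative only: the function name `bypass_window_ranges` is hypothetical, the bit positions L=30 (1GB window) and c=42 (4TB stride) are taken directly rather than decoded from BWSIZE and BWSTRIDE, the system PAS size y=44 is an assumed value chosen to keep the output short, and the stored BWADDR value is assumed to omit Z=30 implicit low-order zero bits.

```python
GB = 1 << 30
TB = 1 << 40


def bypass_window_ranges(bwaddr, L, c, y, z=30):
    """List (start, end) address ranges of every GPC-bypass window in the system PAS.

    bwaddr holds the base address with its z low-order zero bits omitted;
    L, c and y are the window-size, stride and PAS-size bit positions.
    """
    mask = ((1 << c) - 1) & ~((1 << L) - 1)  # selects bits [c-1:L]
    base = (bwaddr << z) & mask              # window offset within one stride
    size = 1 << L
    # One window per stride-sized partition of the system PAS.
    return [((k << c) | base, ((k << c) | base) + size) for k in range(1 << (y - c))]


ranges = bypass_window_ranges(bwaddr=2, L=30, c=42, y=44)
```

Under these assumptions `ranges` holds four windows of 1GB each, the first starting at 2GB and the second at 4TB + 2GB, matching the positions described for Figure 12.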
Figure 13 illustrates in more detail determination of whether a given target physical address (PA) falls within one of the GPC-bypass windows 170. Again, certain bit positions y, c, s, L of the PA are defined by the control parameters in control registers 62. Bit positions y, c, s correspond to the system PAS size, stride size and L0 granule size respectively, and are defined in the same way as explained above for Figure 9. Bit position L corresponds to the bypass window size as determined based on the BWSIZE parameter as shown in the example encoding above, and represents the lower bit of the portion of bits which are compared to detect whether an address falls in the GPC-bypass window 170. If the BWSTRIDE parameter 192 defines that there is to be no strided repetition of the GPC-bypass window 170 (e.g. because the software developer chose this setting for software intended for a system only comprising one socket), then c = y. As shown in the lower part of Figure 13, to determine whether a PA falls within a GPC-bypass window 170, the access control information obtaining circuitry 109 compares bits [c-1:L] of the target PA against corresponding bits [c-1:L] of the base address defined using the BWADDR parameter. Note that, as explained above, a certain number of lower bits (Z lower bits in total) of the base address do not need to be explicitly defined in the BWADDR parameter because the base address is size-aligned to the size of the GPC-bypass window 170, i.e. these Z lower bits are implicitly 0 (for example, with the example encodings of control parameters shown above, Z can be 30 to correspond with the minimum possible setting for L).
Hence, in practice bits [c-1:L] of the base address are indicated by bits [c-1-Z:L-Z] of a stored BWADDR parameter. This means that actually the portion of the stored BWADDR parameter compared with bits [c-1:L] of the target PA may be BWADDR[c-1-Z:L-Z]. In any case, it will be appreciated that there could be other ways of defining the base address parameter using other encodings.
Regardless of the particular encoding used for the base address, to implement the option of repeating the GPC-bypass window 170 at stride intervals through the address space, bits of the target PA outside positions [c-1:L] are masked out, and similarly bits of the BWADDR parameter that do not represent bits [c-1:L] of a corresponding base address are masked out, to ensure that these bits do not affect the comparison result (e.g. all masked bits can be cleared to 0). Based on the comparison, the access control information obtaining circuitry 109 determines whether the masked portions of the target PA and GPC-bypass window base address are equal, and if so the target PA is determined to be in the GPC-bypass window 170. If the target PA is in the GPC-bypass window 170, a default value is selected for the access control information, e.g. corresponding to the most permissive access control setting (e.g. GPI = 0b1111 to indicate access is permitted in all architectural PASs 84). If the selected portion [c-1:L] of the target PA does not equal the corresponding portion [c-1:L] of the base address, then the target PA is not in the GPC-bypass window 170 and so the access control information for this target PA would be derived from a lookup of the access control table instead.
Note that in Figure 13, in comparison to Figure 9, there are several differences in the comparison operation. Firstly, the comparison condition required to be satisfied to detect that a PA is in the GPC-bypass window 170 is an equals condition, rather than a not-equals condition as in Figure 9. Also, the reference value used for the comparison is a programmable value that can be varied by software by programming the value of the BWADDR parameter, rather than a fixed value of 0 as in Figure 9. These features can be more appropriate for the GPC-bypass window 170 so that a relatively small window of arbitrary position within the L0 granule covered by one L0 entry 153 can be assigned default properties distinct from the properties indicated via that L0 entry 153 for other sub-portions of the same L0 granule.
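The masked comparison described above can be modelled in a few lines. This is a sketch under stated assumptions rather than a definitive implementation: the function name `in_bypass_window` is hypothetical, and the stored BWADDR value is assumed to omit Z=30 implicit low-order zero bits, so that shifting it left by Z recovers the base address.

```python
def in_bypass_window(pa, bwaddr, c, L, z=30):
    """Return True if the target PA falls in a GPC-bypass window.

    Only bits [c-1:L] take part in the comparison; all other bits of both
    the target PA and the base address are masked out (cleared to 0).
    """
    mask = ((1 << c) - 1) & ~((1 << L) - 1)  # ones in positions [c-1:L] only
    base = bwaddr << z                       # restore the implicit zero bits
    return (pa & mask) == (base & mask)
```

Because bits at position c and above are masked out, an address in any stride-sized partition matches as long as its offset within the partition lies in the window; bits below L are masked out so that every address inside the window matches.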
Figure 14 illustrates steps performed for controlling access to memory in an example supporting the GPC-bypass window 170. At step 280, in response to a given memory access specifying a given target PA, the access control information obtaining circuitry 109 determines whether the bypass window feature is enabled (e.g. based on the GPCBW ENABLE parameter 187). Also, the access control information obtaining circuitry 109 determines whether a selected bit portion of the target PA (e.g. bits [c-1:L] as mentioned above, where c and L are derived from the programmable BWSTRIDE and BWSIZE parameters 192, 191 respectively) is equal to a programmable reference value, corresponding to non-masked bits of BWADDR 193 that represent bits [c-1:L] of a corresponding GPC-bypass window base address. If use of the GPC-bypass window 170 is enabled and bits PA[c-1:L] are equal to the programmable reference value, then at step 282 the target PA is determined to be in a GPC-bypass window 170 and so a default setting is returned for the access control information. In particular, the default setting may be the most permissive setting (e.g. a setting which permits access to the target PA in any of the architectural PASs 84).
If either use of the GPC-bypass window is disabled, or the comparison of PA[c-1:L] detects that these bits of the target PA do not match the programmable reference value, then at step 284 a table index is generated based on the target PA (e.g. using the approach shown in steps 256 or 257 of Figure 11 if at least one non-PPS window has been defined, or using bits [y-1:s] if no non-PPS window 160 has been defined). At step 286, the access control information obtaining circuitry 109 looks up the access control table structure based on the generated table index, to return access control information obtained from the access control table structure stored in the memory system. As at step 259 of Figure 11, in some cases a lookup process as at steps 284, 286 could be omitted if the required information is instead already available in a GPT cache, TLB or other caching structure.
At step 288, the returned access control information (either the default value from step 282, or a value derived from the access control table structure at step 286) is used to control access to the memory system for the given memory access request. For example, the access control circuitry 108 determines based on the access control information whether the architectural PAS 84 specified for the current memory access request is allowed to access the target PA.
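The overall flow of Figure 14 (steps 280 to 288) might be modelled along the following lines. Everything named here is a hypothetical stand-in for illustration: the `BypassConfig` fields, the PAS names in `ALL_PAS`, the representation of access control information as a set of permitted PASs, and the `table_lookup` callable that abstracts steps 284 and 286.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet

# Illustrative PAS names; the most permissive default permits all of them.
ALL_PAS: FrozenSet[str] = frozenset({"non-secure", "secure", "realm", "root"})


@dataclass(frozen=True)
class BypassConfig:
    enabled: bool  # bypass window feature enabled
    bwaddr: int    # stored base address, with z low-order zero bits omitted
    c: int         # stride bit position
    L: int         # window-size bit position
    z: int = 30    # assumed number of implicit zero bits


def obtain_access_info(pa: int, cfg: BypassConfig,
                       table_lookup: Callable[[int], FrozenSet[str]]) -> FrozenSet[str]:
    """Return the set of PASs permitted to access the target PA."""
    if cfg.enabled:  # step 280
        mask = ((1 << cfg.c) - 1) & ~((1 << cfg.L) - 1)
        if (pa & mask) == ((cfg.bwaddr << cfg.z) & mask):
            return ALL_PAS  # step 282: default setting, no table lookup
    return table_lookup(pa)  # steps 284 and 286


def access_permitted(pa: int, pas: str, cfg: BypassConfig,
                     table_lookup: Callable[[int], FrozenSet[str]]) -> bool:
    """Step 288: decide whether the requesting architectural PAS may access the PA."""
    return pas in obtain_access_info(pa, cfg, table_lookup)
```

Passing the table lookup in as a callable keeps the sketch independent of how the table structure is walked or cached, which the flow of Figure 14 leaves open.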
The examples above describe architectural control parameters 187, 190 enabling a single type of GPC-bypass window 170 to be configured (either a single such window 170 or multiple windows 170 at intervals of constant stride). It will be appreciated that other examples may provide multiple sets of such control parameters 187, 190 so that multiple programmable window groups can be defined by software (with each group defined by its own base address, size and/or stride parameters 193, 191, 192). In this case, detection of whether an address is in a window of a given window group may be controlled in a similar way to the GPC-bypass window 170 described, although different window groups may be associated with different base address/size/stride values and/or different default values for the access control information.
Although often such programmable window groups may be configured by software to be non-overlapping in address range, the architecture could also support cases where the software programs the control parameters 187, 190 for the window groups so that the window groups partially overlap in address range. In that case, if an address in the overlapping range is accessed, a priority scheme could determine which window group takes priority. Hence, if two or more windows are determined to correspond to the same target address, the access to that target physical address is controlled based on default setting of the access control information associated with the highest priority window of those two or more windows. Support for multiple groups of table-lookup-bypassing windows could be helpful to handle different classes of devices, or to handle a system comprising both chiplets and sockets (e.g. with one group of windows at intervals of one stride corresponding to the chiplet address range and a second group of windows at intervals of a second stride corresponding to the socket address range).
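One possible way to resolve overlapping window groups is sketched below. The source only says that a priority scheme could be used; the `WindowGroup` fields, the convention that a larger `priority` number wins, and the string default values are all assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class WindowGroup:
    priority: int      # assumed convention: larger value takes priority
    base: int          # window base address within one stride
    c: int             # stride bit position
    L: int             # window-size bit position
    default_info: str  # default access control information for this group

    def contains(self, pa: int) -> bool:
        """Masked comparison on bits [c-1:L], as for a single bypass window."""
        mask = ((1 << self.c) - 1) & ~((1 << self.L) - 1)
        return (pa & mask) == (self.base & mask)


def default_for(pa: int, groups: Sequence[WindowGroup]) -> Optional[str]:
    """Return the default of the highest-priority matching group, or None if no group matches."""
    matches = [g for g in groups if g.contains(pa)]
    if not matches:
        return None  # fall back to the access control table lookup
    return max(matches, key=lambda g: g.priority).default_info
```

As a usage example, a group repeating at a 64GB stride and a group repeating at a 4TB stride could stand for the chiplet and socket address ranges respectively, with the higher-priority group deciding the default wherever their windows overlap.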
Concepts described herein may be embodied in computer-readable code for fabrication of an apparatus that embodies the described concepts. For example, the computer-readable code can be used at one or more stages of a semiconductor design and fabrication process, including an electronic design automation (EDA) stage, to fabricate an integrated circuit comprising the apparatus embodying the concepts. The above computer-readable code may additionally or alternatively enable the definition, modelling, simulation, verification and/or testing of an apparatus embodying the concepts described herein.
For example, the computer-readable code for fabrication of an apparatus embodying the concepts described herein can be embodied in code defining a hardware description language (HDL) representation of the concepts. For example, the code may define a register-transfer-level (RTL) abstraction of one or more logic circuits for defining an apparatus embodying the concepts.
The code may define a HDL representation of the one or more logic circuits embodying the apparatus in Verilog, SystemVerilog, Chisel, or VHDL (Very High-Speed Integrated Circuit Hardware Description Language) as well as intermediate representations such as FIRRTL. Computer-readable code may provide definitions embodying the concept using system-level modelling languages such as SystemC and SystemVerilog or other behavioural representations of the concepts that can be interpreted by a computer to enable simulation, functional and/or formal verification, and testing of the concepts.
Additionally or alternatively, the computer-readable code may define a low-level description of integrated circuit components that embody concepts described herein, such as one or more netlists or integrated circuit layout definitions, including representations such as GDSII.
The one or more netlists or other computer-readable representation of integrated circuit components may be generated by applying one or more logic synthesis processes to an RTL representation to generate definitions for use in fabrication of an apparatus embodying the invention. Alternatively or additionally, the one or more logic synthesis processes can generate from the computer-readable code a bitstream to be loaded into a field programmable gate array (FPGA) to configure the FPGA to embody the described concepts. The FPGA may be deployed for the purposes of verification and test of the concepts prior to fabrication in an integrated circuit or the FPGA may be deployed in a product directly.
The computer-readable code may comprise a mix of code representations for fabrication of an apparatus, for example including a mix of one or more of an RTL representation, a netlist representation, or another computer-readable definition to be used in a semiconductor design and fabrication process to fabricate an apparatus embodying the invention. Alternatively or additionally, the concept may be defined in a combination of a computer-readable definition to be used in a semiconductor design and fabrication process to fabricate an apparatus and computer-readable code defining instructions which are to be executed by the defined apparatus once fabricated.
Such computer-readable code can be disposed in any known transitory computer-readable medium (such as wired or wireless transmission of code over a network) or non-transitory computer-readable medium such as semiconductor, magnetic disk, or optical disc. An integrated circuit fabricated using the computer-readable code may comprise components such as one or more of a central processing unit, graphics processing unit, neural processing unit, digital signal processor or other components that individually or collectively embody the concept.
Figure 15 illustrates a simulator implementation that may be used. Whilst the earlier described embodiments implement the present invention in terms of apparatus and methods for operating specific processing hardware supporting the techniques concerned, it is also possible to provide an instruction execution environment in accordance with the embodiments described herein which is implemented through the use of a computer program. Such computer programs are often referred to as simulators, insofar as they provide a software based implementation of a hardware architecture. Varieties of simulator computer programs include emulators, virtual machines, models, and binary translators, including dynamic binary translators. Typically, a simulator implementation may run on a host processor 730, optionally running a host operating system 720, supporting the simulator program 710. In some arrangements, there may be multiple layers of simulation between the hardware and the provided instruction execution environment, and/or multiple distinct instruction execution environments provided on the same host processor. Historically, powerful processors have been required to provide simulator implementations which execute at a reasonable speed, but such an approach may be justified in certain circumstances, such as when there is a desire to run code native to another processor for compatibility or re-use reasons.
For example, the simulator implementation may provide an instruction execution environment with additional functionality which is not supported by the host processor hardware, or provide an instruction execution environment typically associated with a different hardware architecture. An overview of simulation is given in "Some Efficient Architecture Simulation Techniques", Robert Bedichek, Winter 1990 USENIX Conference, pages 53-63.
To the extent that embodiments have previously been described with reference to particular hardware constructs or features, in a simulated embodiment, equivalent functionality may be provided by suitable software constructs or features. For example, particular circuitry may be implemented in a simulated embodiment as computer program logic. Similarly, memory hardware, such as a register or cache, may be implemented in a simulated embodiment as a software data structure. In arrangements where one or more of the hardware elements referenced in the previously described embodiments are present on the host hardware (for example, host processor 730), some simulated embodiments may make use of the host hardware, where suitable.
The simulator program 710 may be stored on a computer-readable storage medium (which may be a non-transitory medium), and provides a program interface (instruction execution environment) to the target code 700 (which may include applications, operating systems and a hypervisor) which is the same as the interface of the hardware architecture being modelled by the simulator program 710. Thus, the program instructions of the target code 700 described above, may be executed from within the instruction execution environment using the simulator program 710, so that a host computer 730 which does not actually have the hardware features of the apparatus 2 discussed above can emulate these features.
For example, the simulator code 710 may comprise instruction decoding program logic 712 which simulates decoding of instructions in an equivalent manner to the functionality offered by the instruction decoding circuitry 104 described above. The instruction decoding program logic 712 comprises conditional logic (e.g. if/then clauses) for decoding instructions of the target code 700 and selecting corresponding sequences of instructions (defined in the native instruction set supported by the host apparatus 730) that are selected for execution by the host apparatus 730 to implement the functions represented by the decoded instructions. Access control program logic 714 and access control information obtaining program logic 716 emulate functions of the access control circuitry 108 and access control information obtaining circuitry 109 described earlier. The access control program logic 714 controls access to simulated physical memory based on access control information corresponding to the simulated physical address being accessed. The access control information obtaining program logic 716 obtains the access control information based on table structures stored in simulated physical memory, and supports definition of table-lookup-skipping windows as discussed earlier to allow default access control information to be assigned to some simulated physical addresses without requiring a table lookup for those simulated physical addresses. Host storage mapping program logic 718 maps accesses to simulated registers or simulated memory (requested by the target code 700 according to the target ISA supported by the target code 700) onto host storage resources (e.g. registers and/or memory) provided in hardware in the host apparatus 730. 
For example, where an operation requested by the target code 700 requires access to a given register, the register access may be mapped onto a register simulating data structure maintained in host memory by the host storage mapping program logic 718 of the simulation program 710, while when an operation requested by the target code 700 requires an access to an address in simulated physical memory, this is mapped onto host virtual addresses in the host virtual address space by the host storage mapping program logic 718. These host virtual addresses may themselves be translated into host physical addresses using the address translation mechanisms supported by the host (the translation of host virtual addresses to host physical addresses being outside the scope of what is controlled by the simulator program 710).
In the present application, the words "configured to..." are used to mean that an element of an apparatus has a configuration able to carry out the defined operation. In this context, a "configuration" means an arrangement or manner of interconnection of hardware or software. For example, the apparatus may have dedicated hardware which provides the defined operation, or a processor or other processing device may be programmed to perform the function. "Configured to" does not imply that the apparatus element needs to be changed in any way in order to provide the defined operation.
In the present application, lists of features preceded with the phrase "at least one of" mean that any one or more of those features can be provided either individually or in combination. For example, "at least one of: [A], [B] and [C]" encompasses any of the following options: A alone (without B or C), B alone (without A or C), C alone (without A or B), A and B in combination (without C), A and C in combination (without B), B and C in combination (without A), or A, B and C in combination.
Although illustrative embodiments of the invention have been described in detail herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various changes and modifications can be effected therein by one skilled in the art without departing from the scope of the invention as defined by the appended claims.

Claims (21)

  1. An apparatus comprising: memory access control circuitry to control access to a memory system in response to a given memory access operation, based on access control information associated with a target physical address specified for the given memory access operation; and access control information obtaining circuitry to obtain the access control information corresponding to the target physical address; in which: the access control information obtaining circuitry is configured to perform a lookup of an access control table structure based on the target physical address of the given memory access operation, for obtaining the access control information; and for at least one setting of table lookup control information indicating that a system physical address space has been configured to comprise a plurality of table-lookup-skipping address windows at non-contiguous positions within the system physical address space, the access control information obtaining circuitry is configured to: determine whether the target physical address corresponds to one of the plurality of table-lookup-skipping address windows; and in response to determining that the target physical address corresponds to one of the plurality of table-lookup-skipping windows, identify, as the access control information corresponding to the target physical address, a default value selected independent of the lookup to the access control table structure.
  2. The apparatus according to claim 1, in which for at least one setting of the table lookup control information, the plurality of table-lookup-skipping address windows comprise: a first type of table-lookup-skipping address window detected based on a comparison of a first subset of bits of the target physical address; and a second type of table-lookup-skipping address window detected based on a comparison of a second subset of bits of the target physical address different from the first subset.
  3. The apparatus according to any of claims 1 and 2, in which for at least one setting of the table lookup control information, the plurality of table-lookup-skipping address windows comprise at least two table-lookup-skipping address windows disposed in the system physical address space at intervals of a constant stride.
  4. The apparatus according to claim 3, in which the table lookup control information comprises a configurable stride parameter and the constant stride depends on the configurable stride parameter.
  5. The apparatus according to any preceding claim, in which the access control table structure comprises at least one level of table, including an initial-level table that is first to be looked up in the lookup of the access control table structure, the initial-level table comprising table entries each corresponding to a granule of physical addresses of an initial-level granule size; and for at least one setting of the table lookup control information, the table lookup control information is configured to support at least one table-lookup-skipping address window being defined to have a size less than the initial-level granule size.
  6. The apparatus according to any preceding claim, in which, the access control information obtaining circuitry is configured to determine whether the target physical address corresponds to a given type of table-lookup-skipping address window based on a comparison between a selected portion of bits of the target physical address and a reference value.
  7. The apparatus according to claim 6, in which, for at least one type of table-lookup-skipping address window, the reference value comprises a programmable window base address specified in the table lookup control information.
  8. The apparatus according to any of claims 6 and 7, in which, for at least one type of table-lookup-skipping address window, the access control information obtaining circuitry is configured to determine that the target physical address corresponds to one of the plurality of table-lookup-skipping address windows in response to determining that the selected portion of bits of the target physical address matches the reference value.
  9. The apparatus according to any of claims 6 to 8, in which, for at least one type of table-lookup-skipping address window, the reference value comprises a predetermined non-programmable value.
  10. The apparatus according to any of claims 6 to 9, in which, for at least one type of table-lookup-skipping address window, the access control information obtaining circuitry is configured to determine that the target physical address corresponds to one of the plurality of table-lookup-skipping address windows in response to determining that the selected portion of bits of the target physical address does not match the reference value.
  11. The apparatus according to any preceding claim, in which in response to determining that the target physical address does not correspond to one of the plurality of table-lookup-skipping windows of the system physical address space, the access control information obtaining circuitry is configured to identify, as the access control information corresponding to the target physical address, a value derived from the lookup to the access control table structure.
  12. The apparatus according to any preceding claim, in which, in the lookup to the access control table structure, the access control information obtaining circuitry is configured to determine, based on the target physical address, a table index for selecting an initial-level table entry from an initial-level table of the access control table structure; and for at least one setting of the table lookup control information in which the system physical address space has been configured to comprise a given type of table-lookup-skipping address window for which detection of whether the target physical address is within the given type of table-lookup-skipping address window depends on a selected portion of bits of the target physical address, the access control information obtaining circuitry is configured to obtain the table index based on a concatenated value comprising a concatenation of a first portion of bits of the given target physical address more significant than the selected portion and a second portion of bits of the given target physical address less significant than the selected portion.
  13. The apparatus according to any preceding claim, in which for at least one type of table-lookup-skipping address window, the default value for the access control information indicates a setting for the access control information which indicates that a least secure class of memory access operation is permitted to access a memory system location associated with the target physical address.
  14. The apparatus according to any preceding claim, in which for at least one type of table-lookup-skipping address window, the default value for the access control information indicates a most permissive setting for the access control information.
  15. The apparatus according to any preceding claim, in which for at least one type of table-lookup-skipping address window, the default value for the access control information is specified by a programmable configuration parameter of the table lookup control information.
  16. The apparatus according to any preceding claim, comprising address translation circuitry to translate a virtual address to a physical address in a selected architectural physical address space selected from among a plurality of architectural physical address spaces; and the access control information specifies which of the architectural physical address spaces permits access to the target physical address in the system physical address space.
  17. The apparatus according to claim 16, comprising at least one memory system component configured to treat aliasing physical addresses from different architectural physical address spaces as if they relate to different memory system resources even though the aliasing physical addresses of the different architectural physical address spaces actually correspond to the same memory system resource in the system physical address space.
  18. Computer-readable code for fabrication of the apparatus according to any preceding claim.
  19. 19. A method comprising: in response to a given memory access operation specifying a target physical address: obtaining access control information corresponding to the target physical address using access control information obtaining circuitry configured to perform a lookup of an access control table structure based on the target physical address of the given memory access operation; and controlling access to a memory system based on the access control information; in which: in response to detecting that table lookup control information is set to at least one setting indicating that a system physical address space has been configured to comprise a plurality of table-lookup-skipping address windows at non-contiguous positions within the system physical address space: the obtaining comprises determining whether the target physical address corresponds to one of the plurality of table-lookup-skipping address windows; and in response to determining that the target physical address corresponds to one of the plurality of table-lookup-skipping windows of physical address space, a default value selected independent of the lookup to the access control table structure is identified as the access control information corresponding to the target physical address.
  20. 20. A computer program for controlling a host data processing apparatus to provide an instruction execution environment for execution of target program code, the computer program comprising: access control program logic to control access to a simulated memory system in response to a given memory access operation, based on access control information associated with a target simulated physical address specified for the given memory access operation; and access control information obtaining program logic to obtain the access control information corresponding to the target simulated physical address; in which: the access control information obtaining program logic is configured to perform a lookup of an access control table structure based on the target simulated physical address of the given memory access operation, for obtaining the access control information; and for at least one setting of table lookup control information indicating that a simulated system physical address space has been configured to comprise a plurality of table-lookup-skipping address windows at non-contiguous positions within the simulated system physical address space, the access control information obtaining program logic is configured to: determine whether the target simulated physical address corresponds to one of the plurality of table-lookup-skipping address windows; and in response to determining that the target simulated physical address corresponds to one of the plurality of table-lookup-skipping windows, identify, as the access control information corresponding to the target simulated physical address, a default value selected independent of the lookup to the access control table structure.
  21. A storage medium storing the computer-readable code of claim 18 or the computer program of claim 20.
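
The obtaining step recited in claims 19 and 20 can be illustrated by the following minimal sketch of simulated access control logic. It is an illustration under stated assumptions, not the claimed implementation: the names `Window`, `AccessControl`, `obtain` and `access_allowed`, the 4096-byte granule size, the dictionary used as the access control table structure, and the string values used as access control information are all hypothetical and do not appear in the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Hypothetical table-lookup-skipping address window covering [base, base + size)."""
    base: int
    size: int

    def contains(self, pa: int) -> bool:
        return self.base <= pa < self.base + self.size


class AccessControl:
    GRANULE = 4096  # assumed granule size; the claims do not specify one

    def __init__(self, table, windows, skipping_enabled, default_info):
        self.table = table                        # granule index -> access control information
        self.windows = windows                    # windows may sit at non-contiguous positions
        self.skipping_enabled = skipping_enabled  # stands in for the table lookup control information
        self.default_info = default_info          # default value chosen independent of the table
        self.table_lookups = 0                    # counts lookups, to show which ones are skipped

    def obtain(self, pa: int) -> str:
        """Obtain access control information for a target physical address."""
        if self.skipping_enabled and any(w.contains(pa) for w in self.windows):
            # Address falls in a table-lookup-skipping window: use the default value.
            return self.default_info
        # Otherwise perform the lookup of the access control table structure.
        self.table_lookups += 1
        return self.table.get(pa // self.GRANULE, "no-access")

    def access_allowed(self, pa: int, required: str) -> bool:
        """Control access based on the obtained access control information."""
        return self.obtain(pa) == required


ac = AccessControl(
    table={0x8000_0000 // 4096: "secure-only"},
    windows=[Window(0x0000_0000, 0x1000_0000), Window(0x4000_0000, 0x1000_0000)],
    skipping_enabled=True,
    default_info="all-access",
)
print(ac.obtain(0x4000_1234))  # inside the second window: default value, table not consulted
print(ac.obtain(0x8000_0010))  # outside every window: value comes from the table lookup
```

In this sketch, setting `skipping_enabled` to `False` makes every address go through the table lookup, which corresponds to a setting of the table lookup control information for which no table-lookup-skipping address windows are configured.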
Priority Applications (2)

Application Number Priority Date Filing Date Title
GB2404684.9A GB2639994A (en) 2024-04-02 2024-04-02 Access control information
PCT/GB2025/050550 WO2025210336A1 (en) 2024-04-02 2025-03-18 Access control information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB2404684.9A GB2639994A (en) 2024-04-02 2024-04-02 Access control information

Publications (2)

Publication Number Publication Date
GB202404684D0 GB202404684D0 (en) 2024-05-15
GB2639994A true GB2639994A (en) 2025-10-08

Family

ID=91023410

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2404684.9A Pending GB2639994A (en) 2024-04-02 2024-04-02 Access control information

Country Status (2)

Country Link
GB (1) GB2639994A (en)
WO (1) WO2025210336A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150356029A1 (en) * 2013-02-05 2015-12-10 Arm Limited Handling memory access operations in a data processing apparatus
US20210311997A1 (en) * 2018-07-27 2021-10-07 Arm Limited Binary search procedure for control table stored in memory system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ROBERT BEDICHEK: "Some Efficient Architecture Simulation Techniques", USENIX CONFERENCE, 1990, pages 53 - 63

Also Published As

Publication number Publication date
GB202404684D0 (en) 2024-05-15
WO2025210336A1 (en) 2025-10-09

Similar Documents

Publication Publication Date Title
EP4127948B1 (en) Apparatus and method using plurality of physical address spaces
EP4127945B1 (en) Apparatus and method using plurality of physical address spaces
EP4139806B1 (en) Variable nesting control parameter for table structure providing access control information for controlling access to a memory system
WO2025163283A1 (en) Attribute information
CN120712559A (en) Predetermined lower security memory properties
GB2639994A (en) Access control information
TW202540858A (en) Access control information
WO2025210329A1 (en) Isolated address region assignment updating instruction
TW202540862A (en) Selection of tag translation mode
TW202538529A (en) Data-access-to-tag check
CN120752619A (en) Determines whether to deny a memory access request issued by the requestor device
TW202538530A (en) Tag-locating-address translation operation
WO2025163282A1 (en) Memory access request filtering based on requester group identifier
TW202538531A (en) Tag-locating address determination
GB2637714A (en) Attribute information