WO2018165957A1 - Log-append-structured storage management with byte-level accessibility - Google Patents
Log-append-structured storage management with byte-level accessibility
- Publication number
- WO2018165957A1 (PCT/CN2017/076999)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- address
- data block
- data
- storage
- data blocks
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0223—User address space allocation, e.g. contiguous or non contiguous base addressing
- G06F12/023—Free address space management
- G06F12/0238—Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
- G06F12/0246—Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory in block erasable memory, e.g. flash memory
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/72—Details relating to flash memory management
- G06F2212/7201—Logical to physical mapping or translation of blocks or pages
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/72—Details relating to flash memory management
- G06F2212/7205—Cleaning, compaction, garbage collection, erase control
Definitions
- NVM: non-volatile memory
- main memory: traditionally, such persistent data can only be stored on disks rather than in main memory
- storage of data in NVM brings several critical problems to memory space management, one of which is memory fragmentation. That is, unused spaces cannot be reallocated to other data, which causes significant wastage under certain workloads. Since data will be held long-term in NVM, the fragmentation may become especially severe.
- Implementations of the subject matter described herein provide a “log-structured” NVM and associated storage management solution with byte addressability.
- the whole NVM is organized in the form of logs, which act as a virtual home space at the same time.
- Such a log-appended-structured storage system eliminates the dichotomy of memory data between home space and log space and unifies those two spaces, so that applications view the NVM in the same way as the traditional home space.
- the plurality of data blocks are sequentially appended and have non-fixed block sizes.
- the storage system further includes a mapping structure including a plurality of address mappings, and each of the address mappings associates a home address (accessible to an application) with a log address (a physical address of the data block) for a data block. Furthermore, the storage system includes a storage manager, which is configured to control various operations of the storage system.
- FIG. 1 illustrates a schematic of the storage system according to an implementation of the subject matter described herein;
- FIG. 2 illustrates a typical tree structure that represents the mapping table as shown in FIG. 1;
- FIG. 3 illustrates a flow chart of a process performed by the storage manager according to an implementation of the subject matter described herein;
- FIG. 4 illustrates a schematic two-layer mapping structure according to an implementation of the subject matter described herein;
- FIG. 5 shows a flow chart illustrating an example method for accessing a node in the tree when a cache is equipped with the storage system, according to an implementation of the subject matter described herein;
- FIG. 6 shows a flow chart of an example method for writing data into NVM of the storage system according to an implementation of the subject matter described herein;
- FIG. 7 illustrates a block diagram of an example implementation of the storage manager in which one or more implementations of the subject matter described herein may be implemented.
- the term “includes” and its variants are to be read as open terms that mean “includes, but is not limited to.”
- the term “based on” is to be read as “based at least in part on.”
- the terms “one implementation” and “an implementation” are to be read as “at least one implementation.”
- the term “another implementation” is to be read as “at least one other implementation.”
- the terms “first,” “second,” and the like may refer to different or same objects. Other definitions, explicit and implicit, may be included below. A definition of a term is consistent throughout the description unless the context clearly indicates otherwise.
- DRAM: dynamic random access memory
- in an NVM-oriented storage system, data can be stored persistently in NVM.
- the data stored in NVM does not go away when the NVM is simply powered off.
- how to deal with the issue of external fragmentation existing in NVM becomes challenging.
- fragmentation: there are two sources of fragmentation.
- One is called “internal” fragmentation.
- 64B: 64 Byte
- with a 64B allocation granularity, when 65B of NVM is requested, 128B is effectively allocated;
- an “internal” fragmentation of 63B is created as a consequence.
- the other one is called “external” fragmentation.
- supposing that a 64B data block is freed or discarded but the resulting freed space is still surrounded by data blocks in use. In this case, the freed space cannot serve any data allocation request beyond 64B.
- such external fragmentation becomes even more severe if allocated block sizes vary.
- Fragmentation can be reduced by garbage collection, which is capable of changing allocated addresses by moving allocated data.
- Such a process involves time-consuming object reference analysis and process pauses.
- this technique is typically used in a managed language runtime, so no byte-addressability is provided to programs, which means a program cannot access an arbitrary address.
- a novel log-appended-structured NVM-based storage system, together with its storage management mechanism, is proposed herein.
- the issue of fragmentation can be effectively avoided or significantly reduced.
- byte-level look-up granularity can be achieved, which renders byte-level addressability.
- efficient memory management could be achieved, which further avoids redundant bus traffic and write wear.
- FIG. 1 illustrates a schematic of the storage system 1000 according to an implementation of the subject matter described herein.
- the storage system 1000 includes a non-volatile storage (NVM) 100.
- the NVM 100 is configured to store a plurality of data blocks (D1, D2, ..., Dn, collectively referred to as “D”) that are sequentially appended (or logged) and have non-fixed block sizes. That is, a newly allocated data block will always be “compactly” appended to the end of the log, and each appended data block may have an arbitrary block size, even tens of bytes, per request.
- while the NVM 100 as shown in FIG. 1 includes an array of NVM chunks (represented by several stripes), in other examples, the storage 100 may include only a single NVM chunk.
- a small gap may exist between any two neighboring appended data blocks.
- such a small gap (for example, 1B) could be, for example, the header of the data block, and is thus usually much shorter than the data block itself.
- those sequentially appended data blocks as shown in FIG. 1 form a data structure that can be called a “log-appended structure.”
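- For illustration only, a minimal sketch of such append-only, byte-granular allocation might look as follows (the class name, the 1B header, and the toy header encoding are assumptions, not the patent's implementation):

```python
class LogAppendedNVM:
    """Sketch of an append-only, byte-granular log space (illustrative)."""

    HEADER_SIZE = 1  # the small per-block gap, e.g. a 1B header

    def __init__(self, capacity: int):
        self.space = bytearray(capacity)
        self.tail = 0  # end of the log: the next append position

    def append(self, payload: bytes) -> int:
        """Compactly append a block of arbitrary size; return its log address."""
        need = self.HEADER_SIZE + len(payload)
        if self.tail + need > len(self.space):
            raise MemoryError("log space exhausted")
        self.space[self.tail] = len(payload) % 256  # toy 1B header
        log_addr = self.tail + self.HEADER_SIZE
        self.space[log_addr:log_addr + len(payload)] = payload
        self.tail += need
        return log_addr
```

- every allocation, whatever its size, lands at the current tail of the log, so no hole is created at allocation time; holes appear only later, when blocks are freed.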
- PTM: persistent transactional memory
- in PTM, NVM needs to hold not only the home space of data that is “viewable” by applications, but also a separate log space, hidden from the application, as the physical space of data.
- the sequentially organized data structure in NVM 100 as shown in FIG. 1 eliminates the “dichotomy” of NVM data introduced by the traditional memory management and PTM.
- the log-appended-structured NVM-based storage 100 according to the subject matter described herein unifies the home space and the log space.
- the whole NVM itself is physically organized in a form of logs, and at the same time the NVM also acts as the home space exposed to the application.
- a defragmentation procedure becomes necessary for an NVM-based system.
- moving allocated data and merging (or coalescing) free spaces can “absorb” external fragmentation without stopping the process.
- some data blocks in use might be moved to some other space (or addresses) .
- the address mapping is necessary for locating the physical address where the data blocks are currently stored, which makes the performance and efficiency of the address mapping crucial, especially when the defragmentation is performed.
- the NVM-based storage system can effectively solve the dominating latency issue (such as the latency from SSDs) present in existing log-appended-structured systems.
- optimization to the mapping structure now can be expected to result in a further improved performance of the address mapping.
- the storage system 1000 further includes a mapping structure 200.
- the mapping structure 200 as shown includes a plurality of address mappings, and each of the address mappings associates a home address 202 with a log address 204 for one data block D.
- the home address indicates an address that is accessible to an application
- the log address indicates a physical address of the data block stored in the NVM 100.
- home address “0x789” is mapped to the log address “0x1aa” for the data block D3 stored in the NVM 100, where “0x” indicates hexadecimal.
- the correspondence between data block D3 and the home address “0x789” (and its log address “0x1aa”) is indicated by the arrow 150.
- the home addresses or log addresses all indicate the starting addresses unless the context clearly indicates otherwise.
- the third column of the mapping table 200 indicates the length 206 of each allocated data block. The starting address and the length together define an area (or address scope) in the home space and in the log space.
- while the mapping structure 200 as shown in FIG. 1 is in the form of a mapping table (for example, a hash table), this is merely for ease of discussion without suggesting any limitations as to the scope of the subject matter described herein.
- the mapping structure 200 may take other forms, such as a fully balanced binary tree structure or a skip-list data structure, which will be further described later.
- the log-appended-structured data blocks stored in NVM 100 allow for arbitrary block sizes.
- the log-appended-structured NVM 100 containing byte-level data blocks “intrinsically” enables the mapping structure 200 as shown in FIG. 1 to support byte-level accessibility/addressability, or in other words, to have byte-level look-up granularity.
- a look-up in the log-appended-structured system targets an arbitrary starting address with any length.
- the log-appended-structured data structure with byte-level addressability may face a unique challenge of address mapping overhead.
- other log-appended-structured systems, having basic data elements with fixed (or predefined) block sizes, can simply use, for example, a hash table or a bloom filter for the mapping, and thus may not face the challenge described above.
- this address mapping overhead may become even more severe when data blocks are moved frequently in order to perform the defragmentation (that is, the consolidation of fragments). This is because, during the procedure of defragmentation, one or more data blocks proximate to or immediately surrounding a fragment will be moved, which in turn leads to a change of the mapping associating the home address with the log address, that is, mapping the same home address to an updated/new log address.
- the mapping structure 200 can be organized as a tree structure consisting of a plurality of chained nodes covering the whole address space of the NVM 100.
- each node includes the home address, the length of a data block, and the log address that the home address is mapped to.
- one node in the tree holds a pair ⟨home address, length⟩ denoting an area in the home space, and the log address that the area is mapped to.
- FIG. 2 illustrates a typical tree structure 200 that represents the mapping table 200 as shown in FIG. 1.
- node 201 with the home address “0x456” is selected as the root node of the tree structure, with node 202 with the home address “0x123” and node 203 with the home address “0x789” immediately downstream of the root node as intermediate nodes.
- the root node 201 includes the home address “0x456,” the length “140B” for the data block D2, and the log address “0x11e;” node 202 includes the home address “0x123,” the length “30B” for data block D1, and the log address “0x100;” and node 203 includes the home address “0x789,” the length “70B” for data block D3, and the log address “0x1aa.”
- the log addresses for those data blocks are contiguous.
- the (starting) log address and the length of a data block together determine the (starting) log address of the next sequentially appended data block.
- the tree structure 200 is based on a fully balanced tree.
- the fully balanced tree could be a binary tree where each node has at most two children nodes, which sometimes are referred to as the left child node and the right child node.
- node 201 has only two child nodes, with node 202 as the left child node and node 203 as the right child node.
- the time complexity of an operation on the tree 200 is O(log n).
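- For illustration, the mappings of FIG. 2 can be resolved with a range-based look-up such as the following sketch, which uses a sorted list with `bisect` as a stand-in for the fully balanced tree (a simplification for brevity; the search logic is the same: find the node whose area covers the requested byte and add the byte offset):

```python
import bisect
from dataclasses import dataclass

@dataclass
class Node:
    home: int    # starting home address, e.g. 0x123
    length: int  # block length in bytes, e.g. 30
    log: int     # starting log (physical) address, e.g. 0x100

# the three mappings of FIG. 2, kept sorted by home address
nodes = sorted([Node(0x123, 30, 0x100),
                Node(0x456, 140, 0x11e),
                Node(0x789, 70, 0x1aa)], key=lambda n: n.home)

def lookup(addr: int) -> int:
    """Translate an arbitrary home-space byte address to a log address."""
    i = bisect.bisect_right([n.home for n in nodes], addr) - 1
    if i >= 0 and addr < nodes[i].home + nodes[i].length:
        return nodes[i].log + (addr - nodes[i].home)  # byte-level offset
    raise KeyError(f"address {addr:#x} is not mapped")

assert lookup(0x789 + 8) == 0x1aa + 8  # any byte inside D3 resolves
```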
- the storage system 1000 further includes a storage manager 700 which is configured to control various operations of the storage system 1000.
- the storage manager 700 can be configured to control the data blocks to be allocated in the form of log-appended-structured data blocks, or in other words, to control the plurality of data blocks having non-fixed sizes (that is, arbitrary sizes) to be sequentially appended to the end of the logs.
- the storage manager 700 may further be configured to manage the change of the address mapping caused by the defragmentation operation.
- FIG. 3 illustrates a flow chart of a process 300 performed by the storage manager 700 according to an implementation of the subject matter described herein.
- the process 300 generally illustrates how the storage manager 700 responds to the idle fragments.
- an idle fragment in the NVM 100 as shown in FIG. 1 is detected.
- a plurality of data blocks that are sequentially appended and have non-fixed sizes are stored in NVM 100.
- such an idle fragment is caused by removal of one or more data blocks from the NVM 100.
- fragments (or freed space) represented by the shaded areas 102 and 104 are caused by the removal or discard of data blocks D2 and D4, respectively.
- the fragment 102 is immediately surrounded by two data blocks D1 and D3 in use, and similarly, the fragment 104 is surrounded by two data blocks D3 and D5 in use.
- one or more data blocks proximate to or immediately next to the fragments might need to be moved. Therefore, at 340, when the idle fragment is detected, one or more second data blocks proximate to the idle fragment are caused to be moved from a first physical address to a second physical address in the NVM 100. For example, when the idle fragments 102 and 104 are detected, the data block D3 surrounded by fragments 102 and 104 might need to be moved in order to form a larger freed space for a further data allocation.
- the address mapping for the second data block is updated to associate the second physical address with the logical address (that is, home address) of the second data block accessible to an application. For example, if the data block D3 is moved from its original physical address “0x1aa” to a new physical address “0x2000,” the address mapping for D3 associating the logical address with the physical address will be changed/updated accordingly.
- after the change/update of the address mapping for the data block D3, in some implementations, when an access (read/write) request to the data block D3 is received, the data block D3 will be accessed at its new log address “0x2000” via the updated address mapping.
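- A highly simplified sketch of this move-and-remap step follows (illustrative only; `move_block`, `nvm`, and `mapping` are assumed names, not the patent's interfaces):

```python
def move_block(nvm: bytearray, mapping: dict, block_home: int,
               new_log: int) -> None:
    """Move one in-use block to a new physical (log) address and update
    its address mapping; the home address itself never changes."""
    old_log, length = mapping[block_home]        # current placement
    nvm[new_log:new_log + length] = nvm[old_log:old_log + length]
    mapping[block_home] = (new_log, length)      # home -> new log address

# FIG. 1's example: D3 (home 0x789) moves from log 0x1aa to 0x2000.
nvm = bytearray(0x4000)
mapping = {0x789: (0x1aa, 70)}
move_block(nvm, mapping, block_home=0x789, new_log=0x2000)
assert mapping[0x789] == (0x2000, 70)  # applications still use 0x789
```

- because the home address never changes, the move is invisible to the application.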
- the average cost of an operation on a tree structure is proportional to the tree height.
- the height of the tree can be huge, especially when the block size is small (for example, tens of bytes) while the NVM is large. Therefore, in order to further improve the mapping efficiency and reduce the cost, four tree structure optimization mechanisms are further proposed as below.
- the single tree structure 200 as shown in FIG. 2 may be replaced by a skip-list.
- a skip-list is a probabilistically balanced data structure. Contrary to the conventional balanced tree (as described with respect to FIG. 2), such a skip-list has “express lanes” within an ordered sequence of elements and does not need a complex rebalancing operation, which may be costly and largely impair parallelism or concurrent operations.
- a typical balanced tree requires a readers-writer lock to protect concurrent operations. That is, a readers-writer lock allows concurrent access for read-only operations, while write operations require exclusive access. For example, a look-up that typically serves a memory load has to obtain a read lock in a shared manner, while an update that typically serves a memory store has to hold an exclusive write lock. Such lock contention, however, can deteriorate the throughput of multi-threaded applications.
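- For illustration, a minimal skip-list sketch follows (the node layout, `MAX_LEVEL`, and the `floor` search API are assumptions, not the patent's code); the search rides the “express lanes” from the top level down, and an insert never rebalances anything:

```python
import random

class SkipListMap:
    """Probabilistically balanced ordered map of home address -> value;
    an illustrative stand-in for the skip-list discussed above."""

    MAX_LEVEL = 16
    P = 0.5  # probability of promoting a node one level up

    class _Node:
        __slots__ = ("home", "value", "next")

        def __init__(self, home, value, level):
            self.home = home
            self.value = value          # e.g. (log_address, length)
            self.next = [None] * level  # forward pointers, one per level

    def __init__(self):
        self.head = self._Node(-1, None, self.MAX_LEVEL)
        self.level = 1

    def _random_level(self) -> int:
        lvl = 1
        while random.random() < self.P and lvl < self.MAX_LEVEL:
            lvl += 1
        return lvl

    def insert(self, home: int, value) -> None:
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.next[i] and node.next[i].home < home:
                node = node.next[i]
            update[i] = node
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = self._Node(home, value, lvl)
        for i in range(lvl):            # splice in; no rebalancing needed
            new.next[i] = update[i].next[i]
            update[i].next[i] = new

    def floor(self, addr: int):
        """Greatest entry with home <= addr: the candidate covering node."""
        node = self.head
        for i in range(self.level - 1, -1, -1):  # express lanes downwards
            while node.next[i] and node.next[i].home <= addr:
                node = node.next[i]
        return None if node is self.head else (node.home, node.value)

m = SkipListMap()
m.insert(0x123, (0x100, 30))
m.insert(0x789, (0x1aa, 70))
home, (log, length) = m.floor(0x790)   # -> 0x789, (0x1aa, 70)
assert home <= 0x790 < home + length   # 0x790 lies inside D3's 70B area
```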
- a partitioned tree approach can be used, so that the number of objects in each partition is limited to an acceptable value.
- the whole home space can be divided into a plurality of sub-home spaces (hereafter also referred to as partitions or chunks) , and each partition holds a respective tree structure for further address look-ups.
- Such mapping structure is also referred to as two-layer mapping structure.
- FIG. 4 illustrates a schematic two-layer mapping structure 400.
- the whole address space is first divided into many static fixed-length partitions (or chunks) (collectively referred to as 410) with starting addresses of 0x0, 0x8000, 0x10000, ..., 0x28000, respectively, so that data can first be routed to the specific partition where the address is located in O(1) time, which may cost as little as one CPU cycle.
- the first layer can simply take the form of a hash table.
- the address can finally be accessed via the corresponding “local” tree structure held in each partition (collectively referred to as “420”). Because the whole address space has been divided into many smaller spaces, or in other words, a huge tree is now split into numerous small local trees, the average number of nodes in each local tree can be much smaller than that of a huge tree covering the whole home space.
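- A sketch of this two-layer routing follows, assuming the fixed 0x8000-byte partitions of FIG. 4 (the dict standing in for each local tree is a simplification; a real partition would hold a range-capable local tree such as the earlier look-up sketch):

```python
PARTITION_SIZE = 0x8000  # the fixed partition length shown in FIG. 4

class TwoLayerMapping:
    """Layer 1: O(1) routing to a partition (one integer division).
    Layer 2: a small per-partition structure standing in for the
    local tree held by each partition."""

    def __init__(self, num_partitions: int):
        self.partitions = [dict() for _ in range(num_partitions)]

    def _route(self, home: int) -> dict:
        return self.partitions[home // PARTITION_SIZE]  # layer 1: O(1)

    def insert(self, home: int, log: int, length: int) -> None:
        self._route(home)[home] = (log, length)

    def lookup(self, home: int):
        # layer 2: search only the one small local structure
        return self._route(home).get(home)

m = TwoLayerMapping(num_partitions=6)   # 0x0 ... 0x28000, as in FIG. 4
m.insert(0x8123, 0x100, 30)
assert m.lookup(0x8123) == (0x100, 30)  # routed to partition 1, then found
```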
- the storage system 1000 may further include a cache (also called the node cache) configured for storing a node that is recently or frequently accessed. In this case, when an address is to be accessed, the node cache will be searched first. In some implementations, the node cache stores recently accessed home-space addresses and pointers to their nodes in the trees.
- FIG. 5 shows a flow chart illustrating the process 500 of accessing a node in the tree when there is a cache equipped with the storage system 1000.
- when it is determined that the first node is cached, the first node will be directly accessed via the cache. Specifically, if the first node is cached, the corresponding pointer will directly point to the first node in the tree that contains the requested address mapping. Otherwise, at 540, when it is determined that the first node is not cached, the node will still need to be accessed via the mapping structure 200. For example, if the first node is not cached, a full tree look-up is still necessary, and the resulting node is then added to the node cache. By caching recently visited nodes, many full tree look-up paths starting from the root node can be avoided.
- once a node is cached, all addresses within its mapped area tend to be a cache hit.
- a plain hash table does not support such a feature, as cached addresses are randomly distributed. For example, conventionally, supposing the node for a 70B area starting at 0x1000 is cached, an access to, for example, the address 0x1008 is usually not a hit, because it falls into a different bucket and loses the chance to be checked against this node. On the contrary, in our case, all addresses within the 70B scope map to the same bucket and become a cache hit, so that nearby addresses can be checked against chained tree nodes that may cover them.
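- A sketch of such a range-aware node cache follows, reusing the `Node` record from the earlier look-up sketch (the bucket size and the multi-bucket indexing are assumptions for illustration): a hit is declared whenever a cached node's area covers the requested byte, so any address inside a cached 70B area can hit:

```python
BUCKET = 64  # coarse bucketing granularity (an assumed value)

class NodeCache:
    """Cache of recently visited tree nodes, hit-testable by byte address."""

    def __init__(self):
        self.buckets = {}  # bucket index -> cached tree node

    def put(self, node) -> None:
        # index the node under every bucket its area touches, so nearby
        # addresses within the same area can find it
        first = node.home // BUCKET
        last = (node.home + node.length - 1) // BUCKET
        for b in range(first, last + 1):
            self.buckets[b] = node

    def get(self, addr: int):
        node = self.buckets.get(addr // BUCKET)
        if node and node.home <= addr < node.home + node.length:
            return node  # hit anywhere inside the cached area
        return None      # miss: do a full tree look-up, then put()

cache = NodeCache()
cache.put(Node(home=0x1000, length=70, log=0x1aa))
assert cache.get(0x1008) is not None  # a plain per-address hash would miss
```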
- Merging tree nodes is another way to further reduce the number of nodes and thereby the height of the tree.
- the storage manager 700 is further configured to first buffer a plurality of data blocks. If it is detected that two or more data blocks have contiguous target home addresses as discussed above, the two or more data blocks are grouped. The grouped data blocks are then appended into the storage space at one time.
- FIG. 6 is a flowchart of an example method 600 for writing data into NVM 100 of the storage system in accordance with an example implementation of the subject matter described herein.
- a plurality of data blocks that are to be written into the non-volatile storage 100 is buffered.
- the at least two data blocks are grouped.
- the grouped at least two data blocks are appended into the non-volatile storage 100.
- the tree structure is updated.
- all writes to NVM 100 can be buffered, and those with contiguous home addresses can be combined into a group on transaction commit. The group of combined writes is then appended to the end of the logs.
- the mapping structure (for example, the tree 200) is then updated accordingly, as sketched below.
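- For illustration, a sketch of this commit-time grouping follows (the buffering policy and the names are assumptions; `log` is an appender like the first sketch): buffered writes are sorted, runs with contiguous home addresses are concatenated, and each run is appended once:

```python
def commit(buffered, log, mapping) -> None:
    """Group buffered writes with contiguous home addresses and append
    each group to the end of the log in a single operation."""
    buffered.sort(key=lambda w: w[0])          # (home_address, payload) pairs
    i = 0
    while i < len(buffered):
        home, data = buffered[i]
        j = i + 1
        # extend the group while the next write starts where this one ends
        while j < len(buffered) and buffered[j][0] == home + len(data):
            data += buffered[j][1]
            j += 1
        log_addr = log.append(data)            # one append per group
        mapping[home] = (log_addr, len(data))  # one merged mapping entry
        i = j
```

- because a merged run maps through a single ⟨home address, length⟩ entry, this grouping also realizes the node-merging optimization described above.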
- the storage manager 700 is in a form of a general-purpose computing device.
- Components of the storage manager 700 may include, but are not limited to, one or more processors or processing units 710, a memory 720, one or more input devices 730, one or more output devices 740, storage 750, and one or more communication units 760.
- the processing unit 710 may be a real or a virtual processor and is capable of performing various processes in accordance with a program stored in the memory 720. In a multi-processing system, multiple processing units execute computer-executable instructions to increase processing power.
- the storage manager 700 typically includes a variety of machine readable media. Such media may be any available media that are accessible by the computing system/server 1000, including volatile and non-volatile media, removable and non-removable media.
- the memory 720 may be volatile memory (e.g., registers, cache, a random-access memory (RAM)), non-volatile memory (e.g., a read only memory (ROM), an electrically erasable programmable read only memory (EEPROM), a flash memory), or some combination thereof.
- the storage 750 may be removable or non-removable, and may include machine readable medium such as flash drives, magnetic disks or any other medium which can be used to store information and which can be accessed within the storage manager 700.
- the storage manager 700 may further include other removable/non-removable, volatile/non-volatile computing system storage medium.
- a disk drive for reading from or writing to a removable, non-volatile disk (e.g., a “floppy disk”) can be provided.
- an optical disk drive for reading from or writing to a removable, non-volatile optical disk can be provided.
- the memory 720 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of various implementations of the subject matter described herein.
- a program/utility tool 722 having a set (at least one) of the program modules 724 may be stored in, for example, the memory 720.
- Such program modules 724 include, but are not limited to, an operating system, one or more applications, other program modules, and program data. Each or a certain combination of these examples may include an implementation of a networking environment.
- the program modules 724 generally carry out the functions and/or methodologies of implementations of the subject matter described herein, for example, the method 300 and method 500.
- the functionality described herein can be performed, at least in part, by one or more hardware logic components.
- illustrative types of hardware logic components include Field-Programmable Gate Arrays (FPGAs), Application-specific Integrated Circuits (ASICs), Application-specific Standard Products (ASSPs), System-on-a-chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
- Program code for carrying out methods of the subject matter described herein may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowcharts and/or block diagrams to be implemented.
- the program code may execute entirely on a machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
- a machine readable medium may be any tangible medium that may contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
- the machine readable medium may be a machine readable signal medium or a machine readable storage medium.
- a machine readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing.
- more specific examples of the machine readable storage medium would include an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
- a storage system comprises: a non-volatile storage configured to store a plurality of data blocks in a storage space; a storage manager configured to cause the plurality of data blocks to be sequentially appended and have non-fixed block sizes; and a mapping structure including a plurality of address mappings, each of the address mappings associating a logical address with a physical address for one of the data blocks, the logical address indicating an address accessible to an application, the storage manager being further configured to defragment the plurality of data blocks by moving at least part of the plurality of data blocks to merge idle fragments.
- the defragmenting the plurality of data blocks comprises: detecting an idle fragment in the non-volatile storage, the idle fragment being caused by removal of a first data block from the non-volatile storage; in response to detecting the idle fragment, causing a second data block proximate to the idle fragment to be moved from a first physical address to a second physical address; and in response to the second data block being moved, updating the address mapping for the second data block.
- the mapping structure comprises a tree structure including a plurality of nodes, and each of the plurality of nodes includes a logical address, a length for a data block, and a physical address that the logical address is mapped to.
- the tree structure includes a fully balanced tree.
- the mapping structure includes a skip-list that is based on a probabilistically balanced data structure.
- the storage space includes a plurality of subspaces, and each of the subspaces is associated with a group of logical addresses, and wherein each of the subspaces is provided with a tree structure to associate a logical address being within a respective group of logical addresses with a physical address.
- the storage system further comprises: a cache configured to store a node that is recently accessed.
- a storage manager comprising: a processing unit; and a memory coupled to the processing unit and storing instructions for execution by the processing unit, the instructions, when executed by the processing unit, causing the device to perform acts comprising: detecting an idle fragment in a non-volatile storage that stores a plurality of data blocks that are sequentially appended and have non-fixed sizes in a storage space, the idle fragment being caused by removal of a first data block from the non-volatile storage; in response to detecting the idle fragment, causing a second data block proximate to the idle fragment to be moved from a first physical address to a second physical address in the non-volatile storage; and in response to the second data block being moved, updating an address mapping for the second data block to associate the second physical address with a logical address of the second data block accessible to an application.
- the acts further comprise: sequentially appending the plurality of data blocks having non-fixed sizes into the storage space.
- the mapping structure comprises a tree structure including a plurality of nodes, and each of the plurality of nodes includes a logical address, a length for a data block, and a physical address that the logical address is mapped to.
- the acts further comprise: in response to receiving an access request to the second data block, causing the second data block to be accessed via the tree structure.
- the tree structure includes a fully balanced tree.
- the mapping structure includes a skip-list that is based on a probabilistically balanced data structure.
- the storage space includes a plurality of subspaces, each of the subspaces being associated with a group of logical addresses, and wherein each of the subspaces is provided with a tree structure to associate a logical address being within a respective group of logical addresses with a physical address.
- the causing the second data block to be accessed via the tree structure comprises: determining a subspace that the second data block is associated with; and causing the second data block to be accessed via the tree structure associated with the subspace.
- the causing the second data block to be accessed via the tree structure comprises: in response to determining that a first node for the second data block is stored in a cache, causing the second data block to be accessed via the cache; and in response to determining that the first node is not stored in the cache, causing the second data block to be accessed via the tree structure.
- the acts further comprise: buffering a plurality of data blocks that are to be written into the non-volatile storage; in response to detecting at least two data blocks of the plurality of data blocks having contiguous logical addresses, grouping the at least two data blocks; appending the grouped at least two data blocks into the non-volatile storage; and updating the tree structure.
- the updating the tree structure further comprises: merging at least two nodes having the contiguous logical addresses into one node.
- a computer-implemented method comprises: detecting an idle fragment in a non-volatile storage that stores a plurality of data blocks that are sequentially appended and have non-fixed sizes in a storage space, the idle fragment being caused by removal of a first data block from the storage; in response to detecting the idle fragment, causing a second data block proximate to the idle fragment to be moved from a first physical address to a second physical address in the non-volatile storage; and in response to the second data block being moved, updating an address mapping for the second data block to associate the second physical address with a logical address of the second data block accessible to an application.
- the method further comprises: sequentially appending the plurality of data blocks having non-fixed sizes into the storage space.
- the mapping structure comprises a tree structure including a plurality of nodes, and each of the plurality of nodes includes a logical address, a length for a data block, and a physical address that the logical address is mapped to.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Implementations of the subject matter described herein relate to a non-volatile memory (NVM) based storage management solution. In this solution, a log-append-structured NVM-based storage system is provided. In such a log-append-structured storage system, the plurality of data blocks are sequentially appended and have non-fixed block sizes. The storage system further includes a mapping structure including a plurality of address mappings, each of the address mappings associating a home address (accessible to an application) with a log address (a physical address of the data block) for a data block. Furthermore, the storage system includes a storage manager, which is configured to control various operations of the storage system. By using such a log-append-structured NVM with its associated address mapping mechanism, the issue of fragmentation can be significantly reduced with high efficiency and low cost. Moreover, with the aid of the mapping structure, byte-level look-up granularity can be achieved, which renders byte-level addressability.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2017/076999 WO2018165957A1 (fr) | 2017-03-16 | 2017-03-16 | Log-append-structured storage management with byte-level accessibility |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2017/076999 WO2018165957A1 (fr) | 2017-03-16 | 2017-03-16 | Log-append-structured storage management with byte-level accessibility |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2018165957A1 (fr) | 2018-09-20 |
Family
ID=63522756
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2017/076999 (WO2018165957A1, Ceased) | Log-append-structured storage management with byte-level accessibility | 2017-03-16 | 2017-03-16 |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2018165957A1 (fr) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112817526A (zh) * | 2021-01-19 | 2021-05-18 | Hangzhou Hollysys Automation Co., Ltd. | Data storage method, apparatus, and medium |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1243317A (zh) * | 1998-07-28 | 2000-02-02 | Sony Corporation | Non-volatile memory, recording apparatus, and recording method |
| US6282605B1 (en) * | 1999-04-26 | 2001-08-28 | Moore Computer Consultants, Inc. | File system for non-volatile computer memory |
| JP2004094429A (ja) * | 2002-08-30 | 2004-03-25 | Toshiba Corp | Disk array device and RAID level changing method in the same device |
- 2017-03-16: WO PCT/CN2017/076999, patent WO2018165957A1, not active (Ceased)
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17901072; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 17901072; Country of ref document: EP; Kind code of ref document: A1 |