WO2022061727A1 - Method and apparatus for cache management - Google Patents
Method and apparatus for cache management
- Publication number
- WO2022061727A1 (PCT/CN2020/117791)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cache
- predicted lifetime
- item
- queue
- response
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/16—Handling requests for interconnection or transfer for access to memory bus
- G06F13/1668—Details of memory controller
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0862—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/12—Replacement control
- G06F12/121—Replacement control using replacement algorithms
- G06F12/123—Replacement control using replacement algorithms with age lists, e.g. queue, most recently used [MRU] list or least recently used [LRU] list
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1016—Performance improvement
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1041—Resource optimization
- G06F2212/1044—Space efficiency improvement
Definitions
- Controller 120 is configured to communicate with cache 140 and an auxiliary storage device 102 to access data items associated with requests 110, and is configured to perform various functions to manage cache 140.
- Storage device 102 storing the data may be any volatile or non-volatile storage device, such as a tape drive, flash memory, or an optical disk drive.
- Controller 120 can add new items to cache 140 or remove existing items from cache 140, in order to optimize the data stored in cache 140 and improve the performance of cache 140.
- Cache 140 can include one or more volatile memory devices and be configured to store data items 142 that are frequently requested by the requesting device.
- Cache 140 may also include volatile or non-volatile storage devices, such as flash memory.
- Agent 160 is configured to interact with controller 120 and cache 140, and to calculate predicted eviction time offset values respectively for the data items stored in cache 140. Controller 120 can perform operations, such as adding new items or removing existing items, to manage the data items stored in cache 140 according to the calculation results from agent 160. Operations by agent 160 will be discussed in detail in the following paragraphs.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
A method for cache management. The method includes predicting, based on a prediction model, eviction time offset values respectively for items stored in a cache; storing, in a queue, predicted lifetime values respectively for the items in the cache according to the eviction time offset values; and in response to an eviction operation, selecting a target item to be removed from the cache based on the predicted lifetime values in the queue.
Description
The present disclosure generally relates to cache management, and more particularly, to methods and systems for determining eviction policies for caches.
Cache technology is widely used in various computer systems. Due to the limited cache size, caches need to be managed with an eviction policy to optimize their performance. Existing cache replacement policies, such as the Least Recently Used (“LRU”) or Least Frequently Used (“LFU”) eviction policies, are based on fixed rules and historical data access information, and thus are not flexible for different cache scenarios.
With the increasing volume of data requests and data exchange, and the diverse cache access patterns of server workloads, existing cache management methods cannot solve the complicated cache replacement optimization problem and meet the needs of today's computer systems.
SUMMARY
Embodiments of the present disclosure provide methods and apparatus for cache management.
In some exemplary embodiments, a non-transitory computer-readable storage medium is provided. The non-transitory computer-readable storage medium stores a set of instructions that are executable by one or more processors of a device to cause the device to perform a method for cache management comprising: predicting, based on a prediction model, eviction time offset values respectively for items stored in a cache; storing, in a queue, predicted lifetime values respectively for the items in the cache according to the eviction time offset values; and in response to an eviction operation, selecting a target item to be removed from the cache based on the predicted lifetime values in the queue.
In some exemplary embodiments, an apparatus is provided. The apparatus includes a memory configured to store instructions and one or more processors communicatively coupled to the memory and configured to execute the instructions to cause the apparatus to: predict, based on a prediction model, eviction time offset values respectively for items stored in a cache; store, in a queue, predicted lifetime values respectively for the items in the cache according to the eviction time offset values; and in response to an eviction operation, select a target item to be removed from the cache based on the predicted lifetime values in the queue.
In some exemplary embodiments, a method for cache management is provided. The method includes predicting, based on a prediction model, eviction time offset values respectively for items stored in a cache; storing, in a queue, predicted lifetime values respectively for the items in the cache according to the eviction time offset values; and in response to an eviction operation, selecting a target item to be removed from the cache based on the predicted lifetime values in the queue.
Embodiments and various aspects of the present disclosure are illustrated in the following detailed description and the accompanying figures. Various features shown in the figures are not drawn to scale.
FIG. 1A illustrates an exemplary system for performing a cache management method, consistent with some embodiments of the disclosure.
FIG. 1B illustrates exemplary iterative operations performed by an agent, consistent with some embodiments of the disclosure.
FIG. 2 illustrates an exemplary apparatus for performing a cache management method, consistent with some embodiments of the disclosure.
FIG. 3 illustrates an exemplary controller, consistent with some embodiments of the disclosure.
FIG. 4 illustrates an exemplary lifetime queue, consistent with some embodiments of the disclosure.
FIG. 5 illustrates an exemplary Time-to-Item hash table stored in the controller, consistent with some embodiments of the disclosure.
FIG. 6 illustrates an exemplary Item-to-Time hash table stored in the controller, consistent with some embodiments of the disclosure.
FIG. 7 illustrates a flowchart of an exemplary cache management method, consistent with some embodiments of the disclosure.
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. The following description refers to the accompanying drawings in which the same numbers in different drawings represent the same or similar elements unless otherwise represented. The implementations set forth in the following description of exemplary embodiments do not represent all implementations consistent with the disclosure. Instead, they are merely examples of apparatuses and methods consistent with aspects related to the disclosure as recited in the appended claims. Particular aspects of the present disclosure are described in greater detail below. The terms and definitions provided herein control, if in conflict with terms or definitions incorporated by reference.
In computing systems, a cache is a software or hardware component that stores frequently used data items to improve data access performance. Caches can be used in various applications, including web services, Content Delivery Networks (CDNs), file IO buffers, etc. In these storage systems, hierarchical memory architectures result in tradeoffs between storage capacity and latency. Accessing data from a second-level auxiliary storage device usually takes a longer time to process due to the multiple read/write operations and data transmissions required per request. By storing frequently used data items in a high-speed cache, computing systems can service requests from the cache directly and return data to the requesting device faster. Accordingly, utilizing the data locality in the program can speed up the computing systems.
Due to a limited cache size, an accurate prediction of future data requests is required for cache management to improve the caching performance. In this disclosure, unsupervised Reinforcement Learning can be used for optimizing cache eviction policies to manage the cache content. Items stored in the cache can be ranked by their predicted lifetimes, and the ranking can be used to determine which item should be replaced in the cache eviction process, thereby maximizing a cache hit-rate and minimizing a cache miss-rate. In addition, the cache management utilizing Reinforcement Learning in various embodiments of the present disclosure can provide a low-overhead, self-adaptive, and online optimization for diverse locality scenarios and cache access patterns.
FIG. 1A illustrates an exemplary system 100 for performing a cache management method, consistent with some embodiments of the disclosure. System 100 can communicate with one or more requesting devices 104 to receive requests 110, and provide a requested data item 190, which may be stored in a cache (e.g., cache 140) or in a second-level auxiliary storage device 102, in response to data requests 110. As shown in FIG. 1A, system 100 includes a controller 120, a cache 140, and an Artificial Intelligence (AI) agent 160. An “agent” in this context refers to an autonomous software program that is able to learn and adapt to its environment in order to perform certain tasks.
FIG. 1B illustrates how agent 160 interacts with the cache environment and exemplary iterative operations performed by agent 160, consistent with some embodiments of the disclosure. In some embodiments, agent 160 can be built based on various Machine Learning (ML) algorithms. For example, Reinforcement Learning (RL) is a type of ML algorithm for decision-making problems that optimizes policies through trial-and-error interactions with the environment. An RL model observes the state of the environment and takes actions to maximize “rewards.” Particularly, a reward r can be defined as an evaluation result of taking an action a at a state s, and the objective of the RL model is to maximize the total rewards of a policy.
For example, agent 160 can include a Deep Deterministic Policy Gradient (DDPG) based RL agent. DDPG is an RL algorithm that learns deterministic continuous actions by training actor and critic neural networks with feedback from the environment. In some embodiments, “actions” may be the predicted future eviction time offset for the current cache access, and “states” observed in the environment may include the cache request itself, one or more characteristics of the requested data items, or one or more characteristics of cache 140 (e.g., the cache size, the data stored in the cache, etc.). For example, the “states” may include locality observations, such as an access frequency of the current item in a sliding window, the number of evictions for the current item in the sliding window, the number of hits for the current item in the sliding window, and the memory address offset delta of the current access relative to the former access. The sliding window is a fixed-size queue used to keep track of information regarding the latest events in the cache.
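As an illustration of how such locality observations could be collected, the following Python sketch maintains a fixed-size sliding window of recent cache events and derives per-item counts from it. The class and method names (SlidingWindow, record, features) are hypothetical and not part of the disclosure; the memory address offset delta would be derived similarly by remembering the address of the previous access.

    from collections import deque

    class SlidingWindow:
        """Fixed-size record of recent cache events, used to derive per-item locality features."""

        def __init__(self, size=1000):
            self.events = deque(maxlen=size)  # each event is (item_id, kind), kind in {"hit", "miss", "evict"}

        def record(self, item_id, kind):
            self.events.append((item_id, kind))

        def features(self, item_id):
            """Locality observations for item_id over the window: access frequency, evictions, hits."""
            accesses = sum(1 for i, k in self.events if i == item_id and k in ("hit", "miss"))
            evictions = sum(1 for i, k in self.events if i == item_id and k == "evict")
            hits = sum(1 for i, k in self.events if i == item_id and k == "hit")
            return accesses, evictions, hits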
The “states” may also include reuse distance histories maintained with O(1) time complexity, such as the last reuse distance of the current item in a hash table, the second-last reuse distance of the current item in the hash table, etc., but the present disclosure is not limited thereto. For ease of computation, in some embodiments, the “states” may be assigned normalized values. For example, the assigned values can lie within the closed interval [0, 1], that is, the set of numbers between 0 and 1, including 0 and 1. In some other embodiments, the values can also be scaled or normalized to other suitable intervals.
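One way to keep reuse-distance history at O(1) cost per access is to store, per item, the logical time of its previous access in a hash table, as sketched below. The helper names and the max-based normalization are illustrative assumptions rather than the patented design.

    class ReuseDistanceTracker:
        """Tracks the last and second-last reuse distances per item in O(1) time per access."""

        def __init__(self):
            self.clock = 0         # logical time: one tick per cache request
            self.last_access = {}  # item_id -> time of the previous access
            self.history = {}      # item_id -> [last reuse distance, second-last reuse distance]

        def access(self, item_id):
            self.clock += 1
            if item_id in self.last_access:
                distance = self.clock - self.last_access[item_id]
                prev = self.history.get(item_id, [0, 0])
                self.history[item_id] = [distance, prev[0]]
            self.last_access[item_id] = self.clock

    def normalize(value, max_value):
        """Scale a raw state feature into the closed interval [0, 1]."""
        return min(max(value / max_value, 0.0), 1.0) if max_value > 0 else 0.0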
“Rewards” may be defined by the cache access result, which may be a hit or a miss in cache 140. A “hit” in cache 140 indicates that the data item associated with the cache request can be found in cache 140, so that system 100 can directly serve the requested data (e.g., data item 190 in FIG. 1A) to the requesting device (e.g., device 104 in FIG. 1A), without accessing data in storage device 102. On the other hand, a “miss” in cache 140 indicates that the data item associated with the cache request is not in cache 140, and thus system 100 needs to access the requested data item from storage device 102, which takes a longer time to process.
For example, if the cache access result is a hit, the corresponding reward can be set to 1, which is a positive value because it is a desired result. If the cache access result is a miss and cache 140 is not full yet, the corresponding reward can be set to 0, because the new item can still be added into cache 140. If the cache access result is a miss and cache 140 is already full, the reward can be set to -1, which is a negative value because it is less desirable compared to the previous scenario: since cache 140 has no available space for additional data items, an item needs to be evicted from cache 140 to add the new item into cache 140. The rewards defined herein are merely examples and not meant to limit the present disclosure. In other embodiments, rewards can be defined differently based on practical needs. Other factors, such as the type of requested items, may also be considered when assigning the rewards.
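The example reward scheme in the preceding paragraph can be restated as a small function; the values below come directly from that example, and the function name is illustrative only.

    def reward(hit: bool, cache_full: bool) -> int:
        """Example reward for one cache access: hit -> 1, miss with free space -> 0, miss on a full cache -> -1."""
        if hit:
            return 1
        return 0 if not cache_full else -1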
Accordingly, the goal of maximizing the total rewards represents improving the caching performance with a greater hit rate and a lower miss rate of cache 140. At every cache access, agent 160 can correct its prediction based on the rewards returned from cache 140 and thus learns to predict the relative eviction time offset O_e for the currently accessed data item.
In some embodiments, the eviction time offset O_e represents the significance of the currently accessed data item, that is, how long (e.g., after how many cache requests) the currently accessed data item should last in cache 140. In some embodiments, the predicted eviction time offset O_e can be set within an upper bound O_u and a lower bound O_l. For example, the predicted eviction time offset O_e may be within the range of 0 to 1,000. The predicted eviction time offset O_e can then be sent to controller 120, so that controller 120 can manage cache 140 accordingly. By the above operations, agent 160 can observe complicated system states and correct the cache eviction policies based on the rewards given by the system.
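Keeping the predicted offset within its bounds can be done with a simple clamp before the value is handed to controller 120. The sketch below is illustrative only; the function name and the clamping step are assumptions, since the disclosure only states that O_e is set within O_l and O_u.

    def bound_offset(o_e: float, o_l: float = 0.0, o_u: float = 1000.0) -> float:
        """Clamp the agent's predicted eviction time offset O_e into [O_l, O_u] (e.g., 0 to 1,000)."""
        return max(o_l, min(o_e, o_u))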
FIG. 2 illustrates an exemplary apparatus 200 for performing a cache management method, consistent with some embodiments of the disclosure. In some embodiments, the same or similar architecture of apparatus 200 may be used to implement system 100 shown in FIG. 1A. According to FIG. 2, apparatus 200 includes one or more memory devices 210, one or more hardware processors 220, a bus 230, and a communication interface 240. Memory devices 210 may include random access memory (RAM), read only memory (ROM), and data storage systems comprised of partitions. Memory devices 210 can be communicatively coupled with processor(s) 220 via bus 230. Memory devices 210 may include a main memory, which can be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor(s) 220. Such instructions, after being stored in non-transitory storage media accessible to processors 220, render apparatus 200 into a special-purpose machine that is customized to perform operations specified in the instructions. Hardware processor(s) 220 are communicatively coupled with bus 230 for processing information. Hardware processor(s) 220 can be, for example, one or more central processors or microprocessors. In some embodiments, bus 230 can also be replaced by another communication mechanism for communicating information.
Apparatus 200 can be communicatively coupled via bus 230 to one or more devices 300, which can be storage devices (e.g., storage device 102 in FIG. 1A) or requesting devices (e.g., device 104 in FIG. 1A) , or other peripheral devices, such as displays (e.g., cathode ray tube (CRT) , liquid crystal display (LCD) , touch screen, etc. ) or input devices (e.g., keyboard, mouse, soft keypad, etc. ) . In addition, apparatus 200 can transmit data to or communicate with servers S1-Sn or other terminal devices through network 400 via communication interface 240. Network 400 can be a local network, an internet service provider, internet, or any combination thereof. Apparatus 200 can be implemented using customized hard-wired logic, one or more ASICs or FPGAs, firmware, or program logic that in combination with the server causes apparatus 200 to be a special-purpose machine.
The term “non-transitory media” as used herein refers to any non-transitory media storing data or instructions that cause a machine to operate in a specific fashion. Such non-transitory media can comprise non-volatile media or volatile media. Non-transitory media include, for example, optical or magnetic disks, dynamic memory, a floppy disk, a flexible disk, a hard disk, a solid-state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, flash memory, a register, a cache, any other memory chip or cartridge, and networked versions of the same.
Various forms of media can be involved in carrying one or more sequences of one or more instructions to processor(s) 220 for execution. For example, the instructions can initially be carried on a magnetic disk or solid-state drive of a remote computer. Bus 230 carries the data to the main memory within memory devices 210, from which processor(s) 220 retrieves and executes the instructions. Alternatively stated, memory devices 210 can store a set of instructions, and processor(s) 220 can be configured to execute the set of instructions to cause the apparatus 200 to perform cache management operations described in various embodiments of the present disclosure.
Referring again to FIG. 1A, as shown in system 100, based on the cache access result, available free space in cache 140, and the predicted eviction time offset O_e from agent 160, system 100 can perform corresponding operations. Particularly, if a “hit” can be found in cache 140, system 100 can directly serve a data item 182 associated with the cache request from cache 140, without accessing data in storage device 102.
On the other hand, if the cache access result is a “miss,” indicating that the requested data is not stored in cache 140, controller 120 accesses data in storage device 102. If cache 140 has available space, controller 120 can perform an “add” operation and store the item associated with the cache request (e.g., data item 184) in cache 140. If cache 140 is full, controller 120 can further determine whether the requested item should be stored in cache 140 based on the predicted eviction time offset O_e.
In response to a determination that the requested item should be stored in cache 140, controller 120 can perform an evict operation, removing one existing item in cache 140 and storing the item associated with the cache request (e.g., data item 186) in cache 140. In response to a determination that the requested item should not be stored in cache 140, controller 120 can perform a bypass operation, reading out the item associated with the cache request (e.g., data item 188) without storing the item in cache 140.
FIG. 3 illustrates an exemplary controller 120, consistent with some embodiments of the disclosure. As shown in FIG. 3, in some embodiments, controller 120 includes a conversion module 122, a lifetime queue 124, and hash tables 126, 128. Conversion module 122 is configured to convert the relative predicted eviction time offset O_e into an absolute predicted lifetime value T_e for eviction. In some embodiments, the predicted lifetime value T_e can be defined and computed as follows:
where T_e denotes the predicted lifetime value, T_c denotes a current timestamp of system 100, O_e denotes the predicted eviction time offset received from agent 160, and O_u and O_l respectively denote the upper bound and the lower bound of the eviction time offset.
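The equation referred to above is not reproduced in this text (it appears as a formula image in the published application). For illustration only, one plausible form consistent with the stated definitions adds the bounded offset to the current timestamp; this is a guess, not the formula claimed in the application.

    def predicted_lifetime(t_c: float, o_e: float, o_l: float = 0.0, o_u: float = 1000.0) -> float:
        """Hypothetical conversion of a relative eviction time offset into an absolute lifetime:
        T_e = T_c + O_e, with O_e first clamped into [O_l, O_u].
        The actual formula of the application is not reproduced here."""
        return t_c + max(o_l, min(o_e, o_u))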
FIG. 4 illustrates an exemplary lifetime queue 124, consistent with some embodiments of the disclosure. All converted predicted lifetime values T_e are stored in lifetime queue 124. Accordingly, lifetime queue 124 includes multiple entries, where each entry includes the predicted lifetime value T_e for a corresponding stored element in cache 140. In some embodiments, lifetime queue 124 is a data structure where each element has a priority associated with its value, so that the element popped from lifetime queue 124 is the element with the highest priority.
For example, lifetime queue 124 can be a minimum priority queue, which is a collection of items that come out in an increasing order of priorities. Accordingly, lifetime queue 124 keeps data items stored in a cache (e.g., cache 140 of FIG. 1A) sorted with respect to their predicted absolute lifetime values T_e. Thus, lifetime queue 124 supports efficient query and deletion of its minimum element. When an eviction is necessary, the data item with the earliest lifetime value T_e is selected and evicted.
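In Python, such a minimum priority queue can be sketched with the standard heapq module, ordering entries by the predicted lifetime T_e; the class and method names are illustrative and not taken from the disclosure.

    import heapq

    class LifetimeQueue:
        """Min-heap of (predicted lifetime T_e, item_id) entries; the minimum is the next eviction candidate."""

        def __init__(self):
            self._heap = []

        def push(self, t_e, item_id):
            heapq.heappush(self._heap, (t_e, item_id))

        def peek_min(self):
            return self._heap[0] if self._heap else None

        def pop_min(self):
            return heapq.heappop(self._heap)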
FIG. 5 and FIG. 6 respectively illustrate time-to-item hash table 126 and item-to-time hash table 128 stored in controller 120, consistent with some embodiments of the disclosure. The data structure of hash tables 126, 128 is configured to facilitate the process of maintaining and updating lifetime queue 124. As shown in FIG. 5, hash table 126 implements a structure that can map keys 510 (e.g., predicted lifetime values T_e) to corresponding values (e.g., stored data items in cache 140) in a list of key-value pairs 530. A hash function can be used to compute indices 520, also called hash codes, into the linked list of key-value pairs, from which the desired value (e.g., the data item) can be found. During lookup, the key is hashed, and the resulting hash indicates where the corresponding value is stored. Similarly, as shown in FIG. 6, hash table 128 implements a structure that can map keys 610 (e.g., stored data items in cache 140) to corresponding values (e.g., predicted lifetime values T_e) in a list of key-value pairs 630 via corresponding indices 620 using a hash function.
By utilizing hash tables 126 and 128, controller 120 can update entries in lifetime queue 124 accordingly. That is, when a new data item is added to cache 140, a new entry is added to lifetime queue 124 to store the corresponding predicted lifetime value T_e. When a hit is found in cache 140, the corresponding predicted lifetime value T_e is generated based on the latest prediction, and the corresponding entry in lifetime queue 124 is updated accordingly to store the new predicted lifetime value T_e. When a cache replacement is performed, a corresponding entry associated with the evicted data item is removed from lifetime queue 124, and a new entry associated with the data item newly stored in cache 140 is added to lifetime queue 124 to store the corresponding predicted lifetime value T_e.
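A minimal way to combine the two hash tables with the lifetime queue is a pair of dictionaries kept in sync with a heap, skipping stale heap entries lazily when the minimum is queried. The bookkeeping strategy, the names, and the assumption that T_e values are unique are all illustrative choices, not a description of the patented implementation.

    import heapq

    class CacheBookkeeping:
        """Lifetime queue plus time-to-item / item-to-time maps (cf. FIGs. 4-6), kept in sync on add, hit, and eviction."""

        def __init__(self):
            self.heap = []          # (T_e, item_id) entries; may contain stale entries after updates
            self.time_to_item = {}  # T_e -> item_id (cf. hash table 126); assumes unique T_e values
            self.item_to_time = {}  # item_id -> T_e (cf. hash table 128)

        def add_or_update(self, item_id, t_e):
            old = self.item_to_time.get(item_id)
            if old is not None:
                self.time_to_item.pop(old, None)  # the old heap entry becomes stale
            self.item_to_time[item_id] = t_e
            self.time_to_item[t_e] = item_id
            heapq.heappush(self.heap, (t_e, item_id))

        def peek_min_lifetime(self):
            while self.heap and self.item_to_time.get(self.heap[0][1]) != self.heap[0][0]:
                heapq.heappop(self.heap)  # drop stale entries
            return self.heap[0][0] if self.heap else None

        def pop_eviction_candidate(self):
            while self.heap:
                t_e, item_id = heapq.heappop(self.heap)
                if self.item_to_time.get(item_id) == t_e:  # skip stale entries
                    del self.item_to_time[item_id]
                    self.time_to_item.pop(t_e, None)
                    return item_id
            return None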
FIG. 7 illustrates a flowchart of an exemplary method 700 for cache management, consistent with some embodiments of the disclosure. In some embodiments, method 700 can be performed by one or more processors in a computer system (e.g., system 100 in FIG. 1A) to optimize the cache performance and improve the cache hit rate in various computer science applications, including web services, content delivery networks (CDNs) , file IO buffer, etc.
Referring to method 700, at step 710, the computer system (e.g., system 100 in FIG. 1A) receives a request (e.g., request 110 in FIG. 1A) from a requesting device (e.g., device 104 in FIG. 1A) , such as a client end. The data transmission between the computer system and the requesting device may be achieved by wired or wireless communications.
At step 720, the computer system determines whether there is a hit to the cache request. That is, the computer system determines whether the requested item is stored in a cache of the computer system (e.g., cache 140 in FIG. 1A) .
If there is a miss to the cache request (step 720 -no) , the computer system performs steps 732 and 734 to obtain a predicted lifetime value for the requested item. Particularly, at step 732, the computer system predicts, based on a prediction model (e.g., agent 160 in FIG. 1A) , a corresponding eviction time offset value for the requested item. In some embodiments, the prediction model includes an RL model, such as a DDPG-based RL model. At step 734, the computer system calculates a predicted lifetime value according to the eviction time offset value and a current timestamp.
In steps 742 and 744, the computer system determines whether to store the requested item into the cache (e.g., cache 140 in FIG. 1A) . At step 742, the computer system determines whether the cache has available space. If the cache has available space (step 742 -yes) , the computer system performs steps 762 and 764.
At step 762, the computer system stores the item associated with the cache request (e.g., data item 184 in FIG. 1A) in the cache. At step 764, the computer system stores a corresponding predicted lifetime value associated with the stored item in a queue (e.g., lifetime queue 124 in FIG. 3) in a controller (e.g., controller 120 in FIG. 1A) of the computer system.
If the cache is full (step 742 - no), at step 744, the computer system further compares the predicted lifetime value associated with the requested item with the smallest predicted lifetime value stored in the queue and determines which one is greater. In response to the corresponding predicted lifetime value associated with the requested item being greater (step 744 - yes), the computer system performs the eviction operation at step 750 to replace a target item with the requested item in the cache. In response to the eviction operation, the computer system selects the target item to be removed from the cache based on the predicted lifetime values in the queue. Particularly, the target item can be the item associated with the smallest predicted lifetime value.
After the target item is removed from the cache, the computer system continues to perform steps 762 and 764, storing the item associated with the cache request (e.g., data item 186 in FIG. 1A) in the cache and the corresponding predicted lifetime value associated with the stored item in the queue to complete the cache replacement.
On the other hand, in response to the smallest predicted lifetime value stored in the queue being greater (step 744 -no) , the computer system bypasses steps 750, 762, and 764, reading out the requested item (e.g., data item 188 in FIG. 1A) without storing the requested item in the cache.
If there is a hit to the cache request (step 720 -yes) , the computer system performs steps 772 and 774. In step 772, the computer system reads out the item associated with the cache request (e.g., data item 182 in FIG. 1A) in the cache. In step 774, the computer system updates, based on the prediction model (e.g., agent 160 in FIG. 1A) , the corresponding predicted lifetime value for the item associated with the cache request. Similar to steps 732 and 734, the computer system can obtain the updated predicted lifetime value by a calculation according to the corresponding eviction time offset value predicted by the prediction model and the current timestamp.
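Steps 710 through 780 can be summarized with the following sketch, written against the hypothetical helpers introduced above (an agent with a predict_offset method, a CacheBookkeeping instance, and a storage object with a read method); it is a reading of the flowchart, not the claimed implementation.

    def handle_request(item_id, cache, capacity, agent, bk, storage, now):
        """One pass of method 700: a hit refreshes the predicted lifetime; a miss leads to add, evict-then-add, or bypass."""
        if item_id in cache:                                                # step 720 - hit
            bk.add_or_update(item_id, now + agent.predict_offset(item_id)) # steps 772/774
            return cache[item_id]                                           # step 780

        data = storage.read(item_id)                                        # miss: read from the auxiliary storage device
        t_e = now + agent.predict_offset(item_id)                           # steps 732/734
        if len(cache) < capacity:                                           # step 742 - the cache has space
            cache[item_id] = data                                           # steps 762/764
            bk.add_or_update(item_id, t_e)
        else:
            min_t_e = bk.peek_min_lifetime()                                # step 744: smallest predicted lifetime in the queue
            if min_t_e is None or t_e > min_t_e:                            # the requested item is expected to live longer
                victim = bk.pop_eviction_candidate()                        # step 750: evict the earliest-lifetime item
                cache.pop(victim, None)
                cache[item_id] = data                                       # steps 762/764
                bk.add_or_update(item_id, t_e)
            # else: bypass - serve the data without caching it
        return data                                                         # step 780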
By steps 720-774, the computer system stores and updates, in the queue, predicted lifetime values respectively for the items in the cache according to the eviction time offset values predicted by the prediction model. At step 780, the computer system provides the item associated with the cache request (e.g., data item 190 in FIG. 1A) , which may be read out from the cache or from a second-level auxiliary storage device (e.g., storage device 102 in FIG. 1A) , to the requesting device.
At step 790, in response to this cache request, the computer system updates the prediction model based on a cache access result. That is, the prediction model can refine its policy based on the reward obtained from this cache request, so that it can provide a more accurate prediction for the next request and thereby maximize the cache hit rate.
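The feedback step 790 could be wired up roughly as follows; the +1/0 hit-based reward and the generic agent.update interface are assumptions made for illustration, since the disclosure names a DDPG-based agent but does not fix a reward function or training API.

```python
class Agent:
    """Hypothetical stand-in for the RL agent (e.g., agent 160)."""

    def update(self, state, action, reward, next_state):
        # A real DDPG-style agent would store this transition in a replay
        # buffer and run actor/critic gradient steps; omitted in this sketch.
        pass

def update_model(agent, state, action, hit, next_state):
    # Step 790: refine the prediction model based on the cache access result,
    # rewarding hits so that later offset predictions can raise the hit rate.
    reward = 1.0 if hit else 0.0
    agent.update(state, action, reward, next_state)
```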
In view of the above, as proposed in various embodiments of the present disclosure, by predicting eviction time offset values respectively for items stored in the cache, storing predicted lifetime values in a queue respectively for the items according to the eviction time offset values, and selecting a target item to be removed from the cache based on the predicted lifetime values in response to an eviction operation, cache replacement optimization can be achieved through an RL agent.
Compared to existing cache management methods, embodiments of the present disclosure provide a low-overhead, self-adaptive cache replacement optimization with online adaptivity, which accommodates diverse cache workloads and can be applied to diverse locality scenarios and cache access patterns. While the computer system is running and performing its tasks, the RL agent keeps updating the lifetime queue based on observations of and feedback on access traces and cache characteristics, to optimize cache performance in real time.
Various exemplary embodiments described herein are presented in the general context of method steps or processes, which may be implemented in one aspect by a computer program product, embodied in a computer-readable medium, including computer-executable instructions, such as program code, executed by computers in networked environments. Generally, program modules may include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Computer-executable instructions, associated data structures, and program modules represent examples of program code for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps or processes.
In some embodiments, the computer-readable medium may include a non-transitory computer-readable storage medium, and the computer-executable instructions may be executed by a device (e.g., the disclosed computer system) for performing the above-described methods. Common forms of non-transitory media include, for example, a floppy disk, a flexible disk, a hard disk, a solid state drive, magnetic tape, or any other magnetic data storage medium, a compact disc (CD), a digital versatile disc (DVD), or any other optical data storage medium, any physical medium with patterns of holes, a Read Only Memory (ROM), a Random Access Memory (RAM), a PROM, an EPROM, a FLASH-EPROM or any other flash memory, NVRAM, a cache, a register, any other memory chip or cartridge, and networked versions of the same. The device may include one or more processors (CPUs), an input/output interface, a network interface, or a memory.
It should be noted that the relational terms herein such as “first” and “second” are used only to differentiate an entity or operation from another entity or operation, and do not require or imply any actual relationship or sequence between these entities or operations. Moreover, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items.
As used herein, unless specifically stated otherwise, the term “or” encompasses all possible combinations, except where infeasible. For example, if it is stated that a database may include A or B, then, unless specifically stated otherwise or infeasible, the database may include A, or B, or A and B. As a second example, if it is stated that a database may include A, B, or C, then, unless specifically stated otherwise or infeasible, the database may include A, or B, or C, or A and B, or A and C, or B and C, or A and B and C.
It is appreciated that the above described embodiments can be implemented by hardware, or software (program codes) , or a combination of hardware and software. If implemented by software, it may be stored in the above-described computer-readable media. The software, when executed by the processor can perform the disclosed methods. The computing units and other functional units described in this disclosure can be implemented by hardware, or software, or a combination of hardware and software. One of ordinary skill in the art will also understand that multiple ones of the above described modules/units may be combined as one module/unit, and each of the above described modules/units may be further divided into a plurality of sub-modules/sub-units.
The embodiments may further be described using the following clauses:
1. A non-transitory computer-readable storage medium storing a set of instructions that are executable by one or more processors of a device to cause the device to perform a method for cache management, comprising:
predicting, based on a prediction model, eviction time offset values respectively for a plurality of items stored in a cache;
storing, in a queue, predicted lifetime values respectively for the plurality of items in the cache according to the eviction time offset values; and
in response to an eviction operation, selecting a target item to be removed from the cache based on the predicted lifetime values in the queue.
2. The non-transitory computer-readable storage medium of clause 1, wherein the set of instructions that are executable by the one or more processors cause the device to further perform:
in response to a cache request, updating the prediction model based on a cache access result.
3. The non-transitory computer-readable storage medium of clauses 1 or 2, wherein the set of instructions that are executable by the one or more processors cause the device to further perform:
in response to a hit to a cache request, updating, based on the prediction model, a corresponding predicted lifetime value for the item associated with the cache request.
4. The non-transitory computer-readable storage medium of any of clauses 1-3, wherein the set of instructions that are executable by the one or more processors cause the device to further perform:
in response to a miss to a cache request, if the cache has available space:
storing an item associated with the cache request in the cache; and
storing a corresponding predicted lifetime value associated with the stored item in the queue.
5. The non-transitory computer-readable storage medium of any of clauses 1-4, wherein the set of instructions that are executable by the one or more processors cause the device to further perform:
in response to a miss to a cache request, if the cache is full:
comparing a corresponding predicted lifetime value associated with the requested item and a smallest predicted lifetime value stored in the queue; and
in response to the corresponding predicted lifetime value associated with the requested item being greater than the smallest predicted lifetime value stored in the queue, performing the eviction operation to replace the target item with the requested item in the cache.
6. The non-transitory computer-readable storage medium of clause 5, wherein the set of instructions that are executable by the one or more processors cause the device to further perform:
in response to the corresponding predicted lifetime value associated with the requested item being smaller than the smallest predicted lifetime value stored in the queue, reading out the requested item without storing the requested item in the cache.
7. The non-transitory computer-readable storage medium of any of clauses 1-6, wherein the set of instructions that are executable by the one or more processors cause the device to further perform:
determining each of the predicted lifetime values according to a corresponding eviction time offset value and a current timestamp.
8. The non-transitory computer-readable storage medium of any of clauses 1-7, wherein the prediction model comprises a reinforcement learning model.
9. The non-transitory computer-readable storage medium of any of clauses 1-8, wherein the queue is a minimum priority queue.
10. An apparatus, comprising:
a memory configured to store instructions; and
one or more processors communicatively coupled to the memory and configured to execute the instructions to cause the apparatus to:
predict, based on a prediction model, eviction time offset values respectively for a plurality of items stored in a cache;
store, in a queue, predicted lifetime values respectively for the plurality of items in the cache according to the eviction time offset values; and
in response to an eviction operation, select a target item to be removed from the cache based on the predicted lifetime values in the queue.
11. The apparatus of clause 10, wherein the one or more processors are further configured to execute the instructions to cause the apparatus to:
in response to a cache request, update the prediction model based on a cache access result.
12. The apparatus of clauses 10 or 11, wherein the one or more processors are further configured to execute the instructions to cause the apparatus to:
in response to a hit to a cache request, update, based on the prediction model, a corresponding predicted lifetime value for the item associated with the cache request.
13. The apparatus of any of clauses 10-12, wherein the one or more processors are further configured to execute the instructions to cause the apparatus to:
in response to a miss to a cache request, if the cache has available space:
store an item associated with the cache request in the cache; and
store a corresponding predicted lifetime value associated with the stored item in the queue.
14. The apparatus of any of clauses 10-13, wherein the one or more processors are further configured to execute the instructions to cause the apparatus to:
in response to a miss to a cache request, if the cache is full:
compare a corresponding predicted lifetime value associated with the requested item and a smallest predicted lifetime value stored in the queue; and
in response to the corresponding predicted lifetime value associated with the requested item being greater than the smallest predicted lifetime value stored in the queue, perform the eviction operation to replace the target item with the requested item in the cache.
15. The apparatus of clause 14, wherein the one or more processors are further configured to execute the instructions to cause the apparatus to:
in response to the corresponding predicted lifetime value associated with the requested item being smaller than the smallest predicted lifetime value stored in the queue, read out the requested item without storing the requested item in the cache.
16. The apparatus of any of clauses 10-15, wherein the one or more processors are further configured to execute the instructions to cause the apparatus to:
determine each of the predicted lifetime values according to a corresponding eviction time offset value and a current timestamp.
17. The apparatus of any of clauses 10-16, wherein the prediction model comprises a reinforcement learning model.
18. The apparatus of any of clauses 10-17, wherein the queue is a minimum priority queue.
19. A method for cache management, comprising:
predicting, based on a prediction model, eviction time offset values respectively for a plurality of items stored in a cache;
storing, in a queue, predicted lifetime values respectively for the plurality of items in the cache according to the eviction time offset values; and
in response to an eviction operation, selecting a target item to be removed from the cache based on the predicted lifetime values in the queue.
20. The method of clause 19, further comprising:
in response to a cache request, updating the prediction model based on a cache access result.
21. The method of clauses 19 or 20, further comprising:
in response to a hit to a cache request, updating, based on the prediction model, a corresponding predicted lifetime value for the item associated with the cache request.
22. The method of any of clauses 19-21, further comprising:
in response to a miss to a cache request, if the cache has available space:
storing an item associated with the cache request in the cache; and
storing a corresponding predicted lifetime value associated with the stored item in the queue.
23. The method of any of clauses 19-21, further comprising:
in response to a miss to a cache request, if the cache is full:
comparing a corresponding predicted lifetime value associated with the requested item and a smallest predicted lifetime value stored in the queue; and
in response to the corresponding predicted lifetime value associated with the requested item being greater than the smallest predicted lifetime value stored in the queue, performing the eviction operation to replace the target item with the requested item in the cache.
24. The method of any of clauses 19-21, further comprising:
in response to a miss to a cache request, if the cache is full:
comparing a corresponding predicted lifetime value associated with the requested item and a smallest predicted lifetime value stored in the queue; and
in response to the corresponding predicted lifetime value associated with the requested item being smaller than the smallest predicted lifetime value stored in the queue, reading out the requested item without storing the requested item in the cache.
25. The method of any of clauses 19-24, further comprising:
determining each of the predicted lifetime values according to a corresponding eviction time offset value and a current timestamp.
26. The method of any of clauses 19-25, wherein the prediction model comprises a reinforcement learning model.
27. The method of any of clauses 19-26, wherein the queue is a minimum priority queue.
In the foregoing specification, embodiments have been described with reference to numerous specific details that can vary from implementation to implementation. Certain adaptations and modifications of the described embodiments can be made. Other embodiments can be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims. It is also intended that the sequence of steps shown in the figures is only for illustrative purposes and is not intended to be limited to any particular sequence of steps. As such, those skilled in the art can appreciate that these steps can be performed in a different order while implementing the same method.
In the drawings and specification, there have been disclosed exemplary embodiments. However, many variations and modifications can be made to these embodiments. Accordingly, although specific terms are employed, they are used in a generic and descriptive sense only and not for purposes of limitation, the scope of the embodiments being defined by the following claims.
Claims (22)
- A non-transitory computer-readable storage medium storing a set of instructions that are executable by one or more processors of a device to cause the device to perform a method for cache management, comprising:
predicting, based on a prediction model, eviction time offset values respectively for a plurality of items stored in a cache;
storing, in a queue, predicted lifetime values respectively for the plurality of items in the cache according to the eviction time offset values; and
in response to an eviction operation, selecting a target item to be removed from the cache based on the predicted lifetime values in the queue.
- The non-transitory computer-readable storage medium of claim 1, wherein the set of instructions that are executable by the one or more processors cause the device to further perform:
in response to a cache request, updating the prediction model based on a cache access result.
- The non-transitory computer-readable storage medium of claim 1, wherein the set of instructions that are executable by the one or more processors cause the device to further perform:
in response to a hit to a cache request, updating, based on the prediction model, a corresponding predicted lifetime value for the item associated with the cache request.
- The non-transitory computer-readable storage medium of claim 1, wherein the set of instructions that are executable by the one or more processors cause the device to further perform:
in response to a miss to a cache request, if the cache has available space:
storing an item associated with the cache request in the cache; and
storing a corresponding predicted lifetime value associated with the stored item in the queue.
- The non-transitory computer-readable storage medium of claim 1, wherein the set of instructions that are executable by the one or more processors cause the device to further perform:
in response to a miss to a cache request, if the cache is full:
comparing a corresponding predicted lifetime value associated with the requested item and a smallest predicted lifetime value stored in the queue; and
in response to the corresponding predicted lifetime value associated with the requested item being greater than the smallest predicted lifetime value stored in the queue, performing the eviction operation to replace the target item with the requested item in the cache.
- The non-transitory computer-readable storage medium of claim 5, wherein the set of instructions that are executable by the one or more processors cause the device to further perform:
in response to the corresponding predicted lifetime value associated with the requested item being smaller than the smallest predicted lifetime value stored in the queue, reading out the requested item without storing the requested item in the cache.
- The non-transitory computer-readable storage medium of claim 1, wherein the set of instructions that are executable by the one or more processors cause the device to further perform:
determining each of the predicted lifetime values according to a corresponding eviction time offset value and a current timestamp.
- The non-transitory computer-readable storage medium of claim 1, wherein the prediction model comprises a reinforcement learning model.
- An apparatus, comprising:
a memory configured to store instructions; and
one or more processors communicatively coupled to the memory and configured to execute the instructions to cause the apparatus to:
predict, based on a prediction model, eviction time offset values respectively for a plurality of items stored in a cache;
store, in a queue, predicted lifetime values respectively for the plurality of items in the cache according to the eviction time offset values; and
in response to an eviction operation, select a target item to be removed from the cache based on the predicted lifetime values in the queue.
- The apparatus of claim 9, wherein the one or more processors are further configured to execute the instructions to cause the apparatus to:
in response to a cache request, update the prediction model based on a cache access result.
- The apparatus of claim 9, wherein the one or more processors are further configured to execute the instructions to cause the apparatus to:
in response to a hit to a cache request, update, based on the prediction model, a corresponding predicted lifetime value for the item associated with the cache request.
- The apparatus of claim 9, wherein the one or more processors are further configured to execute the instructions to cause the apparatus to:
in response to a miss to a cache request, if the cache has available space:
store an item associated with the cache request in the cache; and
store a corresponding predicted lifetime value associated with the stored item in the queue.
- The apparatus of claim 9, wherein the one or more processors are further configured to execute the instructions to cause the apparatus to:
in response to a miss to a cache request, if the cache is full:
compare a corresponding predicted lifetime value associated with the requested item and a smallest predicted lifetime value stored in the queue; and
in response to the corresponding predicted lifetime value associated with the requested item being greater than the smallest predicted lifetime value stored in the queue, perform the eviction operation to replace the target item with the requested item in the cache.
- The apparatus of claim 9, wherein the one or more processors are further configured to execute the instructions to cause the apparatus to:
determine each of the predicted lifetime values according to a corresponding eviction time offset value and a current timestamp.
- A method for cache management, comprising:
predicting, based on a prediction model, eviction time offset values respectively for a plurality of items stored in a cache;
storing, in a queue, predicted lifetime values respectively for the plurality of items in the cache according to the eviction time offset values; and
in response to an eviction operation, selecting a target item to be removed from the cache based on the predicted lifetime values in the queue.
- The method of claim 15, further comprising:
in response to a cache request, updating the prediction model based on a cache access result.
- The method of claim 15, further comprising:
in response to a hit to a cache request, updating, based on the prediction model, a corresponding predicted lifetime value for the item associated with the cache request.
- The method of claim 15, further comprising:
in response to a miss to a cache request, if the cache has available space:
storing an item associated with the cache request in the cache; and
storing a corresponding predicted lifetime value associated with the stored item in the queue.
- The method of claim 15, further comprising:
in response to a miss to a cache request, if the cache is full:
comparing a corresponding predicted lifetime value associated with the requested item and a smallest predicted lifetime value stored in the queue; and
in response to the corresponding predicted lifetime value associated with the requested item being greater than the smallest predicted lifetime value stored in the queue, performing the eviction operation to replace the target item with the requested item in the cache.
- The method of claim 15, further comprising:
in response to a miss to a cache request, if the cache is full:
comparing a corresponding predicted lifetime value associated with the requested item and a smallest predicted lifetime value stored in the queue; and
in response to the corresponding predicted lifetime value associated with the requested item being smaller than the smallest predicted lifetime value stored in the queue, reading out the requested item without storing the requested item in the cache.
- The method of claim 15, further comprising:
determining each of the predicted lifetime values according to a corresponding eviction time offset value and a current timestamp.
- The method of claim 15, wherein the prediction model comprises a reinforcement learning model.