WO2023165308A1 - Memory reclaim method and apparatus, and control device - Google Patents
Memory reclaim method and apparatus, and control device
- Publication number
- WO2023165308A1 (PCT/CN2023/075201)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- memory
- virtual machine
- page
- struct
- free
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
- G06F9/5016—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
- G06F9/5022—Mechanisms to release resources
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/45583—Memory management, e.g. access or allocation
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- the present application relates to the technical field of memory allocation, and in particular to a memory reclamation method, apparatus, and control device.
- Reclaiming the free memory of the virtual machine and improving the memory elasticity of the virtual machine can effectively improve the resource utilization efficiency of the host machine, help increase the density of virtual machines that the host machine can deploy, and reduce the cost of using virtual machines.
- device-side DMA requires that the memory mapping of the virtual machine remain unchanged while the virtual machine is running, so the memory of the virtual machine cannot be reclaimed dynamically.
- the problem addressed by this application is that existing memory reclamation methods cannot change the memory mapping used by DMA, so the memory of the virtual machine cannot be reclaimed.
- this application first provides a memory reclamation method, including:
- the method further includes:
- the memory parsing function is BPF bytecode compiled by the virtual machine.
- obtaining the memory management metadata information of the virtual machine and the memory analysis function for parsing the memory management metadata information includes:
- receiving the memory management metadata information injected by the virtual machine and the compiled BPF bytecode of the memory analysis function.
- reclaiming the free memory of the virtual machine based on the memory analysis function and the memory management metadata information includes: after the free memory of the virtual machine is scanned out through the memory analysis function and the memory management metadata information, reclaiming the free memory of the virtual machine.
- the memory management metadata information includes struct page physical page information and corresponding struct page structure information; reclaiming the free memory of the virtual machine after it has been scanned out includes:
- the struct page structure physical page is a struct page physical page that stores the struct page structure information corresponding to the free struct page physical pages.
- the allocation request of the virtual machine for the reclaimed free memory is triggered when the virtual machine accesses the mapping of a struct page structure physical page whose permission bit is read-only in the EPT mapping table or the IOMMU mapping table.
- allocating new physical memory to the virtual machine means re-establishing the mapping of the free struct page physical pages in the EPT mapping table or the IOMMU mapping table.
- second, the present application provides a control device, including a memory and a processor;
- the memory is configured to store a program;
- the processor, coupled to the memory, is configured to execute the program to:
- the processor is specifically configured to:
- the memory parsing function is BPF bytecode compiled by the virtual machine.
- the processor is specifically configured to:
- the memory management metadata information injected by the virtual machine and the compiled BPF bytecode of the memory analysis function are received.
- the processor is specifically configured to reclaim the free memory of the virtual machine after scanning out the free memory of the virtual machine through the memory analysis function and the memory management metadata information.
- the memory management metadata information includes struct page physical page information and corresponding struct page structure information; based on this, the processor is specifically configured to:
- the permission bit is set to read-only, and the struct page structure physical page is a struct page physical page that stores the struct page structure information corresponding to the free struct page physical pages.
- the processor is specifically configured to:
- the allocation request of the reclaimed free memory by the virtual machine is triggered when the virtual machine accesses the mapping of the physical page of the struct page structure whose permission bit is read-only in the EPT mapping table or the IOMMU mapping table.
- the processor is specifically configured to:
- the allocating new physical memory to the virtual machine is to re-establish the mapping of free struct page physical pages in the EPT mapping table or the IOMMU mapping table.
- the present application further provides a memory reclamation apparatus, which includes:
- An information obtaining module which is used to obtain memory management metadata information of the virtual machine and a memory analysis function for parsing the memory management metadata information;
- a memory recycling module configured to reclaim free memory of the virtual machine based on the memory analysis function and the memory management metadata information.
- the device also includes:
- a memory allocation module configured to allocate new physical memory to the virtual machine in response to the allocation request of the virtual machine for the reclaimed free memory.
- This application reclaims the free memory of the virtual machine, thereby avoiding the problem that the device side DMA cannot change the memory mapping, and does not require additional hardware or paravirtualization drivers.
- the virtual machine requires no changes other than injecting code and data into the host when it starts, so the intrusion into the virtual machine is minimal and software compatibility is easy to achieve; the host scans the memory status of the virtual machine actively, without waiting for a response from the virtual machine, so it achieves better real-time performance than the traditional PV solution; this solution is a pure software solution based on the existing architecture and does not rely on specific hardware that supports device-side Page Fault, so it is also hardware compatible.
- the memory management metadata information and the function for parsing it are injected into the host machine by the virtual machine, so that the host machine can actively scan and reclaim the memory of the virtual machine in real time, improving the response speed of memory reclamation; the reclamation and reallocation of free memory involve no disk I/O, so processing efficiency is also improved; and because reclamation also takes into account the modification of the corresponding struct page mapping, this solution can be used in device pass-through scenarios.
- FIG. 1 is a flowchart of a memory reclamation method according to an embodiment of the present application.
- FIG. 2 is a flowchart of a memory reclamation method according to another embodiment of the present application.
- FIG. 3 is a schematic diagram of a memory reclamation process according to the present application.
- FIG. 4 is a flowchart of a memory reclamation method according to another embodiment of the present application.
- FIG. 5 is a schematic diagram of the EPT mapping table and the IOMMU mapping table before reclamation according to the present application.
- FIG. 6 is a schematic diagram of the EPT mapping table and the IOMMU mapping table after reclamation according to the present application.
- FIG. 7 is a structural block diagram of a memory reclamation apparatus according to an embodiment of the present application.
- FIG. 8 is a structural block diagram of a control device according to an embodiment of the present application.
- the existing memory reclamation method dynamically reclaims the memory of the virtual machine by changing the memory mapping, but the device-side DMA requires that the memory mapping of the virtual machine cannot be changed at all during the running process of the virtual machine.
- the present application provides a solution to reclaim the free memory of the virtual machine by determining the free memory area of the virtual machine, so that there is no need to change the memory mapping being used during the running of the virtual machine.
- Memory reclamation: after the host allocates memory to the virtual machine, it reclaims the excess memory that the virtual machine does not fully use
- Memory elasticity: the host can allocate the memory used by the virtual machine on demand
- DMA: Direct Memory Access, which means that I/O devices can access main memory directly without CPU intervention
- Device-side page fault interrupt: in a virtualization scenario, when a pass-through device accesses the memory of the virtual machine during DMA and the host has not established a mapping from that virtual machine memory to host memory, the DMA fails. With hardware and software support, a device-side page fault interrupt is triggered, and the host replays the DMA after the memory mapping is established.
- Memory management metadata: metadata that records the status of each memory page frame in the operating system. In the Linux scenario, it is the memory area where the struct page structures are located and the content of those struct page structures
- GVA: the virtual address space seen by programs inside the virtual machine
- GPA: the physical memory space seen by the virtual machine
- HPA: the host physical memory space
- EPT: Extended Page Table, a page table that maps the virtual machine GPA to the host HPA in a virtualization scenario
- IOMMU: in a virtualization scenario, it is used to establish a page table that maps IOVA (generally the virtual machine GPA) to HPA
- Host: refers to the host machine
- the embodiment of the present application provides a memory reclamation method, which can be executed by a memory reclamation apparatus, and the memory reclamation apparatus can be integrated in electronic devices such as computers, servers, server clusters, and data centers.
- as shown in FIG. 1, which is a flowchart of a memory reclamation method according to an embodiment of the present application, the memory reclamation method includes:
- S100 Obtain memory management metadata information of the virtual machine and a memory analysis function for parsing the memory management metadata information.
- S200 Reclaim free memory of the virtual machine based on the memory analysis function and the memory management metadata information.
- the virtual machine runs in the host machine, and one or more virtual machines can be set in one host machine, and different virtual machines are isolated from each other.
- isolation between virtual machines prevents the resources (CPU, I/O devices) allocated to other virtual machines from accessing the physical addresses of this virtual machine.
- Each virtual machine will have its own independent physical address space, that is, GPA (Guest Physical Address) space, which is different from the host physical address space, that is, HPA (Host Physical Address) space.
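- As a rough illustration of separate GPA spaces, the following sketch models a per-virtual-machine table that translates a guest physical address into a host physical address. The `Ept` class, the 4 KB page size, and the frame numbers are assumptions made for this example only; they are not taken from the application.

```python
PAGE_SHIFT = 12  # assumed 4 KB pages


class Ept:
    """Toy per-VM table mapping guest frame numbers to host frame numbers."""

    def __init__(self):
        self._table = {}

    def map(self, gfn, hfn):
        self._table[gfn] = hfn

    def translate(self, gpa):
        """Translate a guest physical address into a host physical address."""
        gfn = gpa >> PAGE_SHIFT
        offset = gpa & ((1 << PAGE_SHIFT) - 1)
        if gfn not in self._table:
            raise KeyError(f"no mapping for guest frame {gfn:#x}")
        return (self._table[gfn] << PAGE_SHIFT) | offset


# The same GPA in two virtual machines resolves to different host addresses.
vm_a, vm_b = Ept(), Ept()
vm_a.map(0x10, 0x8000)
vm_b.map(0x10, 0x9000)
print(hex(vm_a.translate(0x10123)), hex(vm_b.translate(0x10123)))
```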
- free memory of the virtual machine is reclaimed based on the memory analysis function corresponding to the virtual machine and the memory management metadata information.
- the Linux kernel regards the physical page as the basic unit of memory management and uses the struct page structure to represent each physical page in the system; memory management is usually performed in units of pages.
- the physical memory can be divided into several physical pages. For example, on a host that supports a 4KB physical page size and has 1GB of physical memory, the physical memory will be divided into 262144 physical pages.
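- The page count in this example can be checked with a short calculation. The helper name `page_count` is hypothetical and is used only for illustration.

```python
PAGE_SIZE = 4 * 1024         # 4 KB physical page, as in the example above
MEMORY_SIZE = 1 * 1024 ** 3  # 1 GB of physical memory


def page_count(memory_bytes, page_bytes=PAGE_SIZE):
    """Return how many whole physical pages fit in memory_bytes."""
    return memory_bytes // page_bytes


print(page_count(MEMORY_SIZE))  # 262144
```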
- the embodiment of the present application provides another memory reclamation method, which is similar to the aforementioned memory reclamation method, except that, as shown in FIG. 2, after S200 (reclaiming the free memory of the virtual machine based on the memory analysis function and the memory management metadata information), the method further includes:
- the free memory corresponding to the allocation request of the virtual machine may be only a part of all the free memory reclaimed from the virtual machine; in this case, when new physical memory is allocated to the virtual machine, only the portion of memory requested by the virtual machine is allocated.
- the memory parsing function is BPF bytecode compiled by the virtual machine. In this way, the compilation of the memory analysis function based on the eBPF program can be completed directly on the existing virtual machine, without the need for the virtual machine to provide additional hardware or additional paravirtualization drivers.
- FIG. 3 is a schematic diagram of a memory reclamation process; the following content is described in detail with reference to this figure.
- the embodiment of the present application provides another memory reclamation method, which is similar to the above-mentioned memory reclamation method, the difference being that S100, obtaining the memory management metadata information of the virtual machine and the memory parsing function for parsing the memory management metadata information, includes:
- the memory management metadata information injected by the virtual machine and the compiled BPF bytecode of the memory analysis function are received.
- the eBPF source program of the memory analysis function is written in advance and placed in the file system of the virtual machine; after the virtual machine starts, the eBPF source program in the file system is compiled into BPF bytecode, and the compiled BPF bytecode of the memory parsing function is injected into the host; in this way, the host machine obtains the BPF bytecode of the memory parsing function of the virtual machine.
- the memory management metadata information of the virtual machine includes struct page physical page information and corresponding struct page structure information; the struct page structure is created by the kernel for each struct page physical page, and the struct page structure includes the field flags, which describes the status and other information of the physical page.
- the field flags is mainly divided into four parts: the flag bits (the status identifiers of the physical page) grow toward the high-order bits, while the other bit fields, namely section (mainly used by the sparse memory model SPARSEMEM), node (the NUMA node number, that is, which node the physical page belongs to), and zone (the memory zone mark, that is, which zone the physical page belongs to), grow toward the low-order bits, leaving unused bits in the middle.
- the memory management metadata information and the memory analysis function are injected into the host machine only after the virtual machine has been started, and the host machine can scan only the struct page physical pages of a virtual machine that has injected its memory management metadata information and memory analysis function. That is to say, the host machine can reclaim free memory only from virtual machines that have been started.
- the host machine can only scan the struct page physical pages of the virtual machines that have been started.
- the virtual machine can only be started if the host has allocated memory.
- the physical memory of the host machine is 100G
- the memory to be allocated to the virtual machines is 10G.
- the embodiment of the present application provides another memory recovery method, which is similar to the aforementioned memory recovery method, except that, as shown in Figure 4,
- S200, reclaiming the free memory of the virtual machine based on the memory analysis function and the memory management metadata information, includes: after the free memory of the virtual machine is scanned out through the memory analysis function and the memory management metadata information, reclaiming the free memory of the virtual machine.
- the memory management metadata information includes struct page physical page information and corresponding struct page structure information
- reclaiming the free memory of the virtual machine includes:
- the struct page structure physical page is a struct page physical page that stores the struct page structure information corresponding to the free struct page physical page.
- the virtual machine transmits its memory management metadata information and memory analysis function to the host only at the initial stage of startup; after that, the host can use this information to actively scan the free memory area in the virtual machine and reclaim the free memory of the virtual machine; and while reclaiming the free memory of the virtual machine and clearing the memory mapping of the free pages, the memory mapping of the memory management metadata is also cleared, so as to ensure that the memory reclaimed is indeed free memory of the virtual machine.
- a struct page physical page whose flag bit flag carries the preset identifier is determined to be a free struct page physical page.
- different virtual machines may label the flag bit flag in the struct page structure differently, so the preset identifier that expresses "idle" may also differ; for example, in one virtual machine, the flag bit flag in the struct page structure is PG-free, indicating that the corresponding struct page physical page is free; in another virtual machine, the flag bit flag in the struct page structure is PG-clean, indicating that the corresponding struct page physical page is free.
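- A minimal sketch of this per-guest check follows. The guest names, the bit positions assigned to PG-free and PG-clean, and the function `free_page_frames` are assumptions for illustration; in the application this logic would live in the memory analysis function injected by each virtual machine.

```python
# Hypothetical bit positions for each guest's "idle" identifier.
FREE_FLAG_BY_GUEST = {
    "guest_a": 1 << 3,  # stands in for PG-free
    "guest_b": 1 << 7,  # stands in for PG-clean
}


def free_page_frames(guest_kind, flags_by_pfn):
    """Return the page frame numbers whose flags carry the guest's preset identifier."""
    preset = FREE_FLAG_BY_GUEST[guest_kind]
    return sorted(pfn for pfn, flags in flags_by_pfn.items() if flags & preset)


sample_flags = {0: 1 << 3, 1: 0, 2: (1 << 3) | (1 << 7), 3: 1 << 7}
print(free_page_frames("guest_a", sample_flags))
print(free_page_frames("guest_b", sample_flags))
```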
- the struct page structure physical page is a struct page physical page storing the struct page structure information corresponding to the idle struct page physical page.
- as shown in FIG. 5 and FIG. 6, the EPT mapping table and the IOMMU mapping table are relatively similar, so in this embodiment the memory reclamation process is illustrated using the EPT mapping table.
- FIG. 5 and FIG. 6 are schematic diagrams of the EPT mapping table and the IOMMU mapping table before and after reclamation, respectively. The figures convey only the general process and do not strictly correspond to the specific contents of the EPT mapping table and the IOMMU mapping table; therefore, details that are not drawn in the figures should not be taken as errors in the drawings or the description.
- the mapping relationship between the struct page physical pages of GPA, HPA, and IOVA is recorded in the EPT mapping table and the IOMMU mapping table; the struct page structure used to describe a struct page physical page also needs to be stored, and it is stored in another struct page physical page; the struct page physical page that stores struct page structures is the above-mentioned struct page structure physical page.
- a struct page physical page whose mapping relationship is shown by a connecting line is regarded as a free struct page physical page.
- the permission bit is modified with the struct page physical page as the basic unit; that is, once the permission bit of the mapping of a struct page physical page is changed to "read-only", all struct page structures stored in that struct page physical page are restricted to "read-only".
- the struct page physical page has a certain capacity and can store multiple struct page structures.
- for example, a 4KB struct page physical page can store 64 struct page structures with a size of 64B each; these 64 struct page structures correspond to 64 struct page physical pages.
- modifying the permission bit of a struct page structure physical page in the EPT mapping table and the IOMMU mapping table means that the struct page structures it stores become "read-only" at the same time, which in turn concerns the mappings of the corresponding 64 struct page physical pages in the EPT mapping table and the IOMMU mapping table.
- the modification of a struct page structure physical page is linked to the struct page physical pages that it describes, and the two must be kept in sync; therefore, if even one of the 64 struct page structures in a struct page structure physical page corresponds to a struct page physical page that is not free (even if the remaining 63 are free), neither the struct page structure physical page nor its 64 corresponding struct page physical pages can be reclaimed; only when all 64 corresponding struct page physical pages are free can they all be reclaimed.
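- The all-or-nothing rule above can be sketched as follows, using the 4KB page and 64B structure sizes from the example. The function name `reclaimable_groups` and the list-of-booleans input are hypothetical simplifications.

```python
PAGE_SIZE = 4096
STRUCT_PAGE_SIZE = 64
STRUCTS_PER_PAGE = PAGE_SIZE // STRUCT_PAGE_SIZE  # 64 structures per metadata page


def reclaimable_groups(is_free):
    """Return indices of metadata pages whose 64 described pages are all free.

    is_free[i] says whether physical page i is free; each consecutive run of
    64 entries is assumed to be described by one struct page structure page.
    """
    groups = []
    for start in range(0, len(is_free), STRUCTS_PER_PAGE):
        chunk = is_free[start:start + STRUCTS_PER_PAGE]
        if len(chunk) == STRUCTS_PER_PAGE and all(chunk):
            groups.append(start // STRUCTS_PER_PAGE)
    return groups


pages = [True] * 128
pages[70] = False  # one busy page in the second group blocks the whole group
print(reclaimable_groups(pages))  # [0]
```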
- the allocation request of the virtual machine for the reclaimed free memory is triggered when the virtual machine accesses the mapping of a struct page structure physical page whose permission bit is read-only in the EPT mapping table or the IOMMU mapping table.
- if the virtual machine needs to use new memory, it accesses the corresponding struct page structure in the "read-only" struct page structure physical page; this access exceeds the "read-only" permission, so a page-fault signal is returned.
- when the host receives the page-fault signal, it can be considered that the virtual machine has issued an allocation request for the reclaimed free memory.
- the allocating new physical memory to the virtual machine is re-establishing the mapping of free struct page physical pages in the EPT mapping table or the IOMMU mapping table.
- the modification of the permission bit of the mapping of the physical page of the struct page structure is a solution determined after considering the execution path of the kernel memory allocation of the virtual machine.
- when the virtual machine allocates memory, its access to the struct page is necessarily intercepted by the host (through the returned page-fault signal), so that the EPT and IOMMU mappings of the page can be established before the page becomes available to the virtual machine, preventing DMA failure.
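- The interception path can be modeled with a small simulation. The `HostModel` class and its method names are hypothetical, a single dictionary stands in for both the EPT and IOMMU mapping tables, and the sketch remaps the whole reclaimed group on a fault as a simplification.

```python
class HostModel:
    """Toy host-side view of one guest (all names here are hypothetical)."""

    def __init__(self):
        self.ept = {}             # guest frame number -> host frame number
        self.read_only_meta = {}  # metadata page index -> reclaimed guest frames
        self._next_hfn = 0x1000   # next host frame to hand out

    def map_page(self, gfn):
        """Establish a mapping for one guest frame."""
        self.ept[gfn] = self._next_hfn
        self._next_hfn += 1

    def reclaim(self, meta_index, gfns):
        """Drop the mappings of free pages and mark their metadata page read-only."""
        for gfn in gfns:
            self.ept.pop(gfn, None)
        self.read_only_meta[meta_index] = list(gfns)

    def guest_writes_struct_page(self, meta_index):
        """Model a guest write to a struct page; return True if it faulted."""
        gfns = self.read_only_meta.pop(meta_index, None)
        if gfns is None:
            return False          # metadata page is writable, no fault
        for gfn in gfns:          # fault: remap before the guest uses the pages
            self.map_page(gfn)
        return True


host = HostModel()
frames = list(range(64))
for gfn in frames:
    host.map_page(gfn)
host.reclaim(meta_index=0, gfns=frames)
faulted = host.guest_writes_struct_page(0)
print(faulted, len(host.ept))
```

- Because the mappings are restored inside the fault handler, the pages are mapped again before the guest can hand them to a device for DMA.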
- the virtual machine requires no changes other than injecting code and data into the host when it starts, so the intrusion into the virtual machine is minimal and software compatibility is easy to achieve; the host scans the memory status of the virtual machine actively, without waiting for a response from the virtual machine, so it achieves better real-time performance than the traditional PV solution; this solution is a pure software solution based on the existing architecture and does not rely on specific hardware that supports device-side Page Fault, so it is also hardware compatible.
- the memory management metadata information and the function for parsing it are injected into the host machine by the virtual machine, so that the host machine can actively scan and reclaim the memory of the virtual machine in real time, improving the response speed of memory reclamation; moreover, the reclamation and reallocation of free memory involve no disk I/O, so processing efficiency is also improved; and because reclamation also takes into account the modification of the corresponding struct page mapping, this solution can be used in device pass-through scenarios.
- An embodiment of the present application provides a memory reclamation device configured to execute the memory reclamation method described above in the present application.
- the memory reclamation device will be described in detail below.
- the memory recycling device includes:
- An information acquisition module 101 which is used to acquire memory management metadata information of a virtual machine and a memory analysis function for parsing the memory management metadata information;
- a memory recycling module 102 configured to reclaim free memory of the virtual machine based on the memory analysis function and the memory management metadata information.
- the device also includes:
- a memory allocation module 103 configured to allocate new physical memory to the virtual machine in response to the virtual machine's allocation request for the reclaimed free memory.
- the memory analysis function is the compiled BPF bytecode of the virtual machine.
- the information acquisition module 101 is also used for:
- the memory management metadata information injected by the virtual machine and the compiled BPF bytecode of the memory analysis function are received.
- the memory reclaiming module 102 is further configured to reclaim the free memory of the virtual machine after scanning out the free memory of the virtual machine through the memory analysis function and the memory management metadata information.
- the memory management metadata information includes struct page physical page information and corresponding struct page structure information; the memory recycling module 102 is also used for:
- the struct page structure physical page is a struct page physical page that stores the struct page structure information corresponding to the free struct page physical pages.
- the allocation request of the virtual machine for the reclaimed free memory is triggered when the virtual machine accesses the mapping of a struct page structure physical page whose permission bit is read-only in the EPT mapping table or the IOMMU mapping table.
- the allocation of new physical memory to the virtual machine is to re-establish the mapping of free struct page physical pages in the EPT mapping table or the IOMMU mapping table.
- the memory reclamation device provided by the above embodiments of the present application is based on the same inventive concept as the memory reclamation method provided by the embodiments of the present application, and has the same beneficial effect as the method adopted, run or implemented by the stored application program.
- the memory recovery device can be implemented as a control device, including: a memory 301 and a processor 303 .
- the memory 301 may be configured to store programs.
- the memory 301 may also be configured to store other various data to support operations on the control device. Examples of such data include instructions for any application or method operating on the controlling device, contact data, phonebook data, messages, pictures, videos, etc.
- the memory 301 can be implemented by any type of volatile or non-volatile storage device or their combination, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable Programmable Read Only Memory (EPROM), Programmable Read Only Memory (PROM), Read Only Memory (ROM), Magnetic Memory, Flash Memory, Magnetic or Optical Disk.
- the processor 303 coupled to the memory 301, is used to execute the program in the memory 301 for:
- the processor 303 is specifically configured to:
- the memory parsing function is BPF bytecode compiled by the virtual machine.
- the processor 303 is specifically configured to:
- the memory management metadata information injected by the virtual machine and the compiled BPF bytecode of the memory analysis function are received.
- the processor 303 is specifically configured to: reclaim the free memory of the virtual machine after scanning out the free memory of the virtual machine through the memory analysis function and the memory management metadata information.
- the memory management metadata information includes struct page physical page information and corresponding struct page structure information; based on this, the processor 303 is specifically used to:
- the struct page structure physical page is a struct page physical page that stores the struct page structure information corresponding to the free struct page physical pages.
- the processor 303 is specifically configured to:
- the allocation request of the reclaimed free memory by the virtual machine is triggered when the virtual machine accesses the mapping of the physical page of the struct page structure whose permission bit is read-only in the EPT mapping table or the IOMMU mapping table.
- the processor 303 is specifically configured to:
- the allocating new physical memory to the virtual machine is to re-establish the mapping of free struct page physical pages in the EPT mapping table or the IOMMU mapping table.
- only some components are schematically shown in FIG. 8, which does not mean that the control device includes only the components shown in FIG. 8.
- the control device provided in this embodiment is based on the same inventive concept as the memory reclamation method provided in the embodiments of this application, and has the same beneficial effects as the method adopted, run, or implemented by its stored application program.
- the embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
- These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
- a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
- Memory may include non-permanent storage in computer readable media, in the form of random access memory (RAM) and/or nonvolatile memory such as read-only memory (ROM) or flash RAM. Memory is an example of computer readable media.
- Computer-readable media, including both permanent and non-permanent, removable and non-removable media, can store information by any method or technology.
- Information may be computer readable instructions, data structures, modules of a program, or other data.
- Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cartridges, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device.
- computer-readable media excludes transitory computer-readable media, such as modulated data signals and carrier waves.
Abstract
Description
This application claims priority to Chinese patent application No. 202210197905.7, entitled "Memory reclaim method and apparatus, and control device", filed with the China Patent Office on March 1, 2022, the entire contents of which are incorporated herein by reference.
The present application relates to the technical field of memory allocation, and in particular to a memory reclamation method, apparatus, and control device.
Reclaiming the free memory of a virtual machine improves the virtual machine's memory elasticity, which effectively improves the host's resource utilization, helps increase the density of virtual machines the host can deploy, and reduces the cost of using virtual machines. However, now that device passthrough for virtual machines is widely deployed, device-side DMA requires that the memory mapping of a virtual machine not change at all while it is running, so the virtual machine's memory cannot be reclaimed dynamically.
One way to solve this problem is to provide device-side page fault handling, but this approach depends on special hardware and cannot be used on existing hardware. Some paravirtualization solutions can also solve this kind of problem, but they require a dedicated paravirtualization driver to run in the virtual machine.
Summary of the invention
The problem addressed by the present application is that, in existing memory reclamation methods, the memory mapping used for DMA cannot be changed, so the memory of the virtual machine cannot be reclaimed.
To solve this problem, the present application first provides a memory reclamation method, including:
obtaining memory management metadata information of a virtual machine and a memory parsing function for parsing the memory management metadata information;
reclaiming the free memory of the virtual machine based on the memory parsing function and the memory management metadata information.
In one embodiment, after reclaiming the free memory of the virtual machine based on the memory parsing function and the memory management metadata information, the method further includes:
allocating new physical memory to the virtual machine in response to the virtual machine's allocation request for the reclaimed free memory.
In one embodiment, the memory parsing function is BPF bytecode compiled by the virtual machine.
In one embodiment, obtaining the memory management metadata information of the virtual machine and the memory parsing function for parsing the memory management metadata information includes:
when the virtual machine, after starting, compiles the pre-written eBPF source program of the memory parsing function, receiving the memory management metadata information injected by the virtual machine and the compiled BPF bytecode of the memory parsing function.
In one embodiment, in reclaiming the free memory of the virtual machine based on the memory parsing function and the memory management metadata information, the free memory of the virtual machine is reclaimed after it has been identified by scanning with the memory parsing function and the memory management metadata information.
In one embodiment, the memory management metadata information includes struct page physical page information and corresponding struct page structure information; reclaiming the free memory of the virtual machine after it has been identified by scanning with the memory parsing function and the memory management metadata information includes:
scanning the struct page physical page information of the virtual machine, and calling the memory parsing function to parse the corresponding struct page structure information, to determine the free struct page physical pages;
deleting the mappings of the free struct page physical pages from the EPT mapping table and the IOMMU mapping table, and setting the permission bits of the mappings of the struct page structure physical pages in the EPT mapping table and the IOMMU mapping table to read-only, where a struct page structure physical page is a struct page physical page that stores the struct page structure information corresponding to a free struct page physical page.
In one embodiment, the virtual machine's allocation request for the reclaimed free memory is triggered when the virtual machine accesses a mapping, in the EPT mapping table or the IOMMU mapping table, of a struct page structure physical page whose permission bits are read-only.
In one embodiment, allocating new physical memory to the virtual machine means re-establishing the mappings of the free struct page physical pages in the EPT mapping table or the IOMMU mapping table.
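The reclaim, fault, and remap steps described in the embodiments above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the patented implementation: `MappingTable`, `reclaim`, `on_write_fault`, the page numbers, and the `metadata_page_of` and `allocate_hpa` callbacks are all hypothetical names, and a real hypervisor would manipulate hardware EPT and IOMMU page tables rather than Python dictionaries.

```python
from dataclasses import dataclass, field

READ_ONLY = "ro"
READ_WRITE = "rw"


@dataclass
class MappingTable:
    """Toy stand-in for an EPT or IOMMU table: GPA page -> (HPA page, permission)."""
    entries: dict = field(default_factory=dict)


def reclaim(tables, free_pages, metadata_page_of):
    """Unmap free guest pages and write-protect the pages holding their struct page entries.

    metadata_page_of is a hypothetical callback returning the GPA page that
    stores the struct page structure for a given guest page.
    """
    released = {}
    for gpa in free_pages:
        # Delete the mapping of the free page from every table (EPT and IOMMU).
        for table in tables:
            if gpa in table.entries:
                released[gpa] = table.entries.pop(gpa)[0]
        # Mark the metadata page read-only so a later guest write is trapped.
        meta = metadata_page_of(gpa)
        for table in tables:
            if meta in table.entries:
                hpa, _ = table.entries[meta]
                table.entries[meta] = (hpa, READ_ONLY)
    return released


def on_write_fault(tables, meta_page, reclaimed_gpas, allocate_hpa):
    """Handle a guest write to a write-protected metadata page.

    The reclaimed pages are remapped first; only then is the metadata page
    made writable again, so the guest never sees an unmapped page it owns.
    """
    for gpa in reclaimed_gpas:
        hpa = allocate_hpa()  # hypothetical host allocator
        for table in tables:
            table.entries[gpa] = (hpa, READ_WRITE)
    for table in tables:
        hpa, _ = table.entries[meta_page]
        table.entries[meta_page] = (hpa, READ_WRITE)
```

In this model the write fault on the read-only metadata page plays the role of the virtual machine's allocation request: the guest touches the struct page entry before it hands the page to a program or a device, which gives the host a chance to restore the mapping before any DMA reaches it.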
Second, the present application provides a control device, which includes a memory and a processor;
the memory is configured to store a program;
the processor, coupled to the memory, is configured to execute the program in order to:
obtain memory management metadata information of a virtual machine and a memory parsing function for parsing the memory management metadata information;
reclaim the free memory of the virtual machine based on the memory parsing function and the memory management metadata information.
In one embodiment, the processor is specifically configured to:
allocate new physical memory to the virtual machine in response to the virtual machine's allocation request for the reclaimed free memory.
In one embodiment, the memory parsing function is BPF bytecode compiled by the virtual machine.
In one embodiment, the processor is specifically configured to:
when the virtual machine, after starting, compiles the pre-written eBPF source program of the memory parsing function, receive the memory management metadata information injected by the virtual machine and the compiled BPF bytecode of the memory parsing function.
In one embodiment, the processor is specifically configured to: reclaim the free memory of the virtual machine after identifying it by scanning with the memory parsing function and the memory management metadata information.
In one embodiment, the memory management metadata information includes struct page physical page information and corresponding struct page structure information; on this basis, the processor is specifically configured to:
scan the struct page physical page information of the virtual machine, and call the memory parsing function to parse the corresponding struct page structure information, to determine the free struct page physical pages;
delete the mappings of the free struct page physical pages from the EPT mapping table and the IOMMU mapping table, and set the permission bits of the mappings of the struct page structure physical pages in the EPT mapping table and the IOMMU mapping table to read-only, where a struct page structure physical page is a struct page physical page that stores the struct page structure information corresponding to a free struct page physical page.
In one embodiment, the processor is specifically configured such that:
the virtual machine's allocation request for the reclaimed free memory is triggered when the virtual machine accesses a mapping, in the EPT mapping table or the IOMMU mapping table, of a struct page structure physical page whose permission bits are read-only.
In one embodiment, the processor is specifically configured such that:
allocating new physical memory to the virtual machine means re-establishing the mappings of the free struct page physical pages in the EPT mapping table or the IOMMU mapping table.
Third, the present application provides a memory reclamation apparatus, which includes:
an information obtaining module, configured to obtain memory management metadata information of a virtual machine and a memory parsing function for parsing the memory management metadata information;
a memory reclamation module, configured to reclaim the free memory of the virtual machine based on the memory parsing function and the memory management metadata information.
In one embodiment, the apparatus further includes:
a memory allocation module, configured to allocate new physical memory to the virtual machine in response to the virtual machine's allocation request for the reclaimed free memory.
By reclaiming the free memory of the virtual machine, the present application sidesteps the problem that memory mappings used by device-side DMA cannot be changed, and requires no additional hardware or paravirtualization drivers.
In the present application, the virtual machine is not modified in any way other than injecting code and data into the host at startup, so the intrusion into the virtual machine is minimal and software compatibility is easy to achieve; the host scans the memory state of the virtual machine on its own initiative, without waiting for a response from the virtual machine, so it can achieve better real-time behavior than traditional PV (paravirtualization) solutions; and the solution is a pure software solution on existing architectures that does not depend on specific hardware supporting device-side page faults, so it is hardware compatible.
In the present application, the virtual machine injects the memory management metadata information and the memory management metadata parsing function into the host, so the host can scan and reclaim virtual machine memory actively and in real time, which improves the responsiveness of memory reclamation; reclaiming and reallocating free memory involves no disk I/O, which also improves processing efficiency; and because reclamation also takes into account the modification of the corresponding struct page mappings, the solution can be used in device passthrough scenarios.
FIG. 1 is a flowchart of a memory reclamation method according to one embodiment of the present application;
FIG. 2 is a flowchart of a memory reclamation method according to another embodiment of the present application;
FIG. 3 is a schematic diagram of the memory reclamation process according to the present application;
FIG. 4 is a flowchart of a memory reclamation method according to yet another embodiment of the present application;
FIG. 5 is a schematic diagram of the EPT mapping table and the IOMMU mapping table before reclamation according to the present application;
FIG. 6 is a schematic diagram of the EPT mapping table and the IOMMU mapping table after reclamation according to the present application;
FIG. 7 is a structural block diagram of a memory reclamation apparatus according to one embodiment of the present application;
FIG. 8 is a structural block diagram of a control device according to an embodiment of the present application.
To make the above objects, features, and advantages of the present application easier to understand, specific embodiments of the present application are described in detail below with reference to the accompanying drawings.
Existing memory reclamation methods reclaim virtual machine memory dynamically by changing memory mappings, but device-side DMA requires that the memory mapping of a virtual machine not change at all while it is running.
To address this problem, the present application provides a solution that reclaims the free memory of the virtual machine by identifying its free memory regions, so the memory mappings in use while the virtual machine is running need not be changed.
For ease of understanding, terms that may be used below are explained here:
Memory reclamation: after the host has allocated memory to a virtual machine, the host takes back the surplus memory when the virtual machine cannot make full use of it.
Memory elasticity: the host's ability to allocate memory to virtual machines on demand.
DMA: Direct Memory Access, meaning that an I/O device can access main memory directly, without CPU intervention.
Device passthrough: in a virtualization scenario, a technology that allows a VM to use host hardware directly.
Device-side page fault interrupt: in a virtualization scenario, when a passthrough device accesses virtual machine memory during DMA and the host has not established a mapping from that virtual machine memory to host memory, the DMA fails. With hardware and software support, a device-side page fault interrupt is triggered, and the host replays the DMA after establishing the memory mapping.
Memory management metadata: metadata that records the state of each memory page frame in the operating system. On Linux, this is the memory region where the struct page entries reside, together with the contents of each struct page.
GVA: the virtual address space seen by programs inside the virtual machine.
GPA: the physical memory space seen by the virtual machine.
HPA: the physical memory space of the host.
EPT: Extended Page Table, the page table used in a virtualization scenario to map virtual machine GPAs to host HPAs.
IOMMU: in a virtualization scenario, used to establish the page table that maps IOVAs (generally virtual machine GPAs) to HPAs.
Guest: the virtual machine.
Host: the host machine.
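The relationship between the GVA, GPA, and HPA address spaces defined above can be sketched as two chained lookups. This is an illustrative model only: the 4 KB page size, the single-level dictionaries, and the specific page numbers are assumptions made for the example, whereas real guest page tables, EPTs, and IOMMU tables are multi-level hardware structures.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def translate(addr, table):
    """Translate an address through a single-level page-number table.

    Returns None when the page is not mapped.
    """
    page, offset = divmod(addr, PAGE_SIZE)
    frame = table.get(page)
    return None if frame is None else frame * PAGE_SIZE + offset


# Hypothetical example mappings (page number -> page number).
guest_page_table = {0x10: 0x2}  # GVA page -> GPA page, maintained by the guest OS
ept = {0x2: 0x7F}               # GPA page -> HPA page, used for guest CPU accesses
iommu = {0x2: 0x7F}             # IOVA (here equal to GPA) page -> HPA page, used for device DMA


def gva_to_hpa(gva):
    """Chain the guest page table and the EPT, as a guest CPU access would."""
    gpa = translate(gva, guest_page_table)
    return None if gpa is None else translate(gpa, ept)
```

A passthrough device bypasses the guest page table and goes through the IOMMU table only, which is why the EPT and the IOMMU table must agree on every page the device may touch.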
It should be noted that the representation of memory and the corresponding structures differ between operating environments; the present application is described using an implementation in a Linux environment as an example.
An embodiment of the present application provides a memory reclamation method. The method can be executed by a memory reclamation apparatus, which can be integrated into electronic devices such as computers, servers, server clusters, and data centers. FIG. 1 is a flowchart of the memory reclamation method according to one embodiment of the present application; the memory reclamation method includes:
S100: obtain memory management metadata information of a virtual machine and a memory parsing function for parsing the memory management metadata information;
S200: reclaim the free memory of the virtual machine based on the memory parsing function and the memory management metadata information.
In this embodiment, virtual machines run on a host. One host can have one or more virtual machines, and different virtual machines are isolated from each other. Isolation between virtual machines is achieved by preventing resources (CPUs, I/O devices) allocated to other virtual machines from accessing the physical addresses of this virtual machine. Each virtual machine has its own independent physical address space, the GPA (Guest Physical Address) space, which is distinct from the host physical address space, the HPA (Host Physical Address) space.
In one embodiment, when there are multiple virtual machines, the free memory of each virtual machine is reclaimed based on the memory parsing function and the memory management metadata information corresponding to that virtual machine.
Although the smallest addressable unit of a processor is usually a word (or even a byte), the Linux kernel treats the physical page as the basic unit of memory management and uses the struct page structure to represent each physical page in the system; memory management usually operates on pages.
It should be noted that different architectures support different physical page sizes, and some architectures support several different physical page sizes. Most 32-bit architectures support 4 KB physical pages, while some 64-bit architectures support 8 KB physical pages.
Based on the physical page size and the size of the host's physical memory, physical memory can be divided into a number of physical pages. For example, on a host that supports a 4 KB physical page size and has 1 GB of physical memory, physical memory is divided into 262,144 physical pages.
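The page count in the example above follows from a single division. The helper below is a small worked check; the function name `page_count` is introduced here for illustration only.

```python
KIB = 1024
GIB = 1024 ** 3


def page_count(memory_bytes, page_bytes):
    """Return how many whole pages of page_bytes fit in memory_bytes."""
    if page_bytes <= 0 or memory_bytes % page_bytes:
        raise ValueError("memory size must be a positive multiple of the page size")
    return memory_bytes // page_bytes


# 1 GB of physical memory with 4 KB pages, as in the example above.
pages_4k = page_count(1 * GIB, 4 * KIB)
```

Here `pages_4k` evaluates to 262144, matching the figure in the text; with 8 KB pages the same memory would hold half as many pages.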
In existing memory usage, physical pages with the same characteristics are generally grouped together within the GPA space of the virtual machine; for example, free physical pages are placed together. Because the way memory organizes physical pages also records the number of pages in a group, when parsing for free physical pages it is only necessary to determine whether the head page (the first physical page) is free; combined with the number of physical pages, this determines all the free physical pages of the virtual machine.
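The head-page shortcut described above can be sketched as follows. This is a simplified illustration, assuming each group of contiguous pages is described by its first page frame number, its page count, and whether the head page is free; the `PageGroup` type and its field names are hypothetical and do not correspond to actual kernel structures.

```python
from typing import NamedTuple


class PageGroup(NamedTuple):
    """Hypothetical description of a run of contiguous physical pages."""
    head_pfn: int       # page frame number of the head (first) page
    count: int          # number of pages in the group
    head_is_free: bool  # state parsed from the head page's metadata


def free_pfns(groups):
    """List every free page frame number by inspecting only the head pages."""
    free = []
    for group in groups:
        if group.head_is_free:
            # The whole group shares the head page's state, so no
            # per-page parsing is needed for the remaining pages.
            free.extend(range(group.head_pfn, group.head_pfn + group.count))
    return free
```

For example, `free_pfns([PageGroup(0, 4, False), PageGroup(4, 2, True)])` yields pages 4 and 5 after parsing only two metadata entries instead of six.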
It should be noted that, in the GPA space of a virtual machine, the free physical pages are generally DMA buffers.
In this way, reclaiming the free memory of the virtual machine sidesteps the problem that memory mappings used by device-side DMA cannot be changed, and requires no additional hardware or paravirtualization drivers.
An embodiment of the present application provides another memory reclamation method, which is similar to the memory reclamation method described above, except that, as shown in FIG. 2, after S200, reclaiming the free memory of the virtual machine based on the memory parsing function and the memory management metadata information, the method further includes:
S300: allocate new physical memory to the virtual machine in response to the virtual machine's allocation request for the reclaimed free memory.
In this way, when reclaimed free memory is later allocated by the virtual machine for program execution or as a DMA buffer, the allocation request causes that part of the mapping to be rebuilt, which effectively avoids the device-side page fault that an unmapped DMA buffer would cause.
In one embodiment, the free memory corresponding to the virtual machine's allocation request is only part of all the free memory reclaimed from that virtual machine; in this case, when new physical memory is allocated to the virtual machine, only the part requested by the virtual machine is allocated.
In one embodiment, when the corresponding physical memory is not occupied, the new physical memory allocated to the virtual machine is the physical memory that was reclaimed from that virtual machine. This restores, to the greatest extent, the original parameters of that free portion of the virtual machine, thereby reducing the possible impact of memory reclamation on DMA and the like.
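The two allocation embodiments above (serve only the requested part, and prefer the originally reclaimed host page when it is still unoccupied) can be combined in a short sketch. The function name, argument shapes, and the fallback policy of taking the lowest-numbered free host page are assumptions made for illustration, not details from the application.

```python
def reallocate(requested, reclaimed, host_free):
    """Back the requested guest pages with host pages again.

    requested: guest page numbers the virtual machine is asking for
    reclaimed: dict of guest page -> host page it used before reclamation
    host_free: set of currently unoccupied host pages
    Returns a dict of guest page -> host page; reclaimed and host_free are
    updated in place. Raises KeyError for a page that was never reclaimed
    and ValueError if the host has no free page left.
    """
    mapping = {}
    for gpa in requested:
        original = reclaimed.pop(gpa)
        # Prefer the page the guest had before, if nobody else took it.
        hpa = original if original in host_free else min(host_free)
        host_free.remove(hpa)
        mapping[gpa] = hpa
    return mapping
```

Pages that were reclaimed but not requested stay in `reclaimed`, so the host keeps the benefit of the reclamation until the virtual machine actually needs them.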
In one embodiment, the memory parsing function is BPF bytecode compiled by the virtual machine. Because the memory parsing function is compiled from an eBPF program, compilation can be done directly on an existing virtual machine, without the virtual machine providing additional hardware or an additional paravirtualization driver.
FIG. 3 is a schematic diagram of the memory reclamation process; the following is described in detail with reference to it.
An embodiment of the present application provides another memory reclamation method, which is similar to the memory reclamation method described above, except that S100, obtaining the memory management metadata information of the virtual machine and the memory parsing function for parsing the memory management metadata information, includes:
when the virtual machine, after starting, compiles the pre-written eBPF source program of the memory parsing function, receiving the memory management metadata information injected by the virtual machine and the compiled BPF bytecode of the memory parsing function.
In this embodiment, the eBPF source program of the memory parsing function is written in advance and placed in the file system of the virtual machine; after the virtual machine starts, it compiles the eBPF source program in its file system into BPF bytecode and injects the compiled BPF bytecode of the memory parsing function into the host; in this way, the host obtains the BPF bytecode of the virtual machine's memory parsing function.
In one embodiment, the memory management metadata information of the virtual machine includes struct page physical page information and corresponding struct page structure information. The struct page structure is created by the kernel for each struct page physical page and includes a flags field that describes the state of the physical page and other information. The flags field is divided into four main parts: the flag bits (state flags of the physical page) grow toward the high-order bits, while the remaining bit fields, section (mainly used by the sparse memory model SPARSEMEM), node (the NUMA node number, identifying which node the physical page belongs to), and zone (the memory zone flag, identifying which zone the physical page belongs to), grow toward the low-order bits, with free bits in between.
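The layout of the flags field described above can be illustrated by packing and unpacking a 64-bit value. This is a sketch under explicit assumptions: the field widths, the omission of the section field, and the "free" state bit at position 0 are hypothetical choices for the example; actual widths and flag positions depend on the kernel version and configuration, which is one reason the parsing logic is supplied by the virtual machine itself.

```python
FLAGS_BITS = 64
NODE_WIDTH = 2      # hypothetical width of the node field
ZONE_WIDTH = 2      # hypothetical width of the zone field
STATE_FREE_BIT = 0  # hypothetical state flag meaning "page is free"

# node and zone are packed downward from the most significant bit;
# the section field is omitted here for brevity.
NODE_SHIFT = FLAGS_BITS - NODE_WIDTH
ZONE_SHIFT = NODE_SHIFT - ZONE_WIDTH


def encode_flags(node, zone, free):
    """Pack node, zone, and the free state bit into one flags word."""
    return (node << NODE_SHIFT) | (zone << ZONE_SHIFT) | (int(free) << STATE_FREE_BIT)


def decode_flags(flags):
    """Unpack the fields that encode_flags stored."""
    return {
        "node": (flags >> NODE_SHIFT) & ((1 << NODE_WIDTH) - 1),
        "zone": (flags >> ZONE_SHIFT) & ((1 << ZONE_WIDTH) - 1),
        "free": bool(flags & (1 << STATE_FREE_BIT)),
    }
```

State flags occupy the low-order end and are added upward, while node and zone occupy the high-order end, leaving the unused bits in the middle, as the paragraph describes.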
In this embodiment, the virtual machine injects the memory management metadata information and the memory parsing function into the host only after it has been started once, so the host can scan only the struct page physical pages of virtual machines that have already injected them. In other words, the host can reclaim free memory only from virtual machines that have been started.
It should be noted that one host may correspond to multiple virtual machines, and that the host can scan only the struct page physical pages of virtual machines that have been started. In addition, a virtual machine can be started only after the host has allocated memory to it.
For example, suppose the host is configured with 11 virtual machines, the host has 100 GB of physical memory, and each virtual machine needs 10 GB. In this case, 10 virtual machines can first be selected and each allocated 10 GB. Those 10 virtual machines are then started one after another while their free memory is reclaimed, until 10 GB has been reclaimed. The reclaimed 10 GB is then allocated to the 11th virtual machine, which completes memory allocation for all virtual machines.
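The arithmetic of this example can be checked with a short sketch. The function name `plan_startup` and its parameters are hypothetical and serve only to restate the numbers given above.

```python
def plan_startup(host_gb, per_vm_gb, vm_count):
    """Return (VMs started up front, VMs deferred, GB that must be reclaimed)."""
    first_batch = min(vm_count, host_gb // per_vm_gb)
    deferred = vm_count - first_batch
    # Memory that has to come back from running VMs before the rest can start.
    reclaim_needed_gb = max(0, vm_count * per_vm_gb - host_gb)
    return first_batch, deferred, reclaim_needed_gb
```

With the figures from the example, `plan_startup(100, 10, 11)` yields 10 virtual machines started first, 1 deferred, and 10 GB to be reclaimed.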
An embodiment of the present application provides another memory reclaim method, which is similar to the memory reclaim method described above and differs as follows, as shown in FIG. 4.
In S200 (reclaiming the free memory of the virtual machine based on the memory parsing function and the memory management metadata information), the free memory of the virtual machine is first identified by scanning with the memory parsing function and the memory management metadata information, and is then reclaimed.
The memory management metadata information includes struct page physical page information and the corresponding struct page structure information.
Reclaiming the free memory of the virtual machine after it has been identified by scanning with the memory parsing function and the memory management metadata information includes:
S201: scan the struct page physical page information of the virtual machine, call the memory parsing function to parse the corresponding struct page structure information, and determine the free struct page physical pages;
S202: delete the mappings of the free struct page physical pages from the EPT mapping table and the IOMMU mapping table, and set the permission bits of the mappings of the struct page structure physical pages in the EPT mapping table and the IOMMU mapping table to read-only, where a struct page structure physical page is a struct page physical page that stores the struct page structure information corresponding to the free struct page physical pages.
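Steps S201 and S202 can be sketched as follows. This is a simplified model, not the implementation of the application: the EPT and IOMMU mapping tables are represented as plain dictionaries from guest frame number to a (host frame number, permission) pair, the memory parsing function is represented by an ordinary callable, and all names are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Toy mapping table: guest frame number -> (host frame number, permission)
Mapping = Dict[int, Tuple[int, str]]


def find_free_pages(page_flags: Dict[int, int],
                    is_free: Callable[[int], bool]) -> List[int]:
    """S201: return the guest frames whose flags word is reported free."""
    return sorted(gfn for gfn, flags in page_flags.items() if is_free(flags))


def reclaim_free_pages(free_gfns: Iterable[int],
                       metadata_gfns: Iterable[int],
                       ept: Mapping,
                       iommu: Mapping) -> List[int]:
    """S202: unmap free pages and mark the metadata pages read-only."""
    free_gfns = list(free_gfns)
    metadata_gfns = list(metadata_gfns)
    released = set()
    for table in (ept, iommu):
        for gfn in free_gfns:
            entry = table.pop(gfn, None)      # delete the mapping
            if entry is not None:
                released.add(entry[0])        # host frame is now reusable
        for gfn in metadata_gfns:
            if gfn in table:
                table[gfn] = (table[gfn][0], "ro")
    return sorted(released)
```

The sketch treats both tables identically, which matches the statement that the same deletion and permission change are applied to the EPT mapping table and the IOMMU mapping table.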
In this way, the virtual machine passes its memory management metadata information and memory parsing function to the host only at the early stage of startup. Afterwards, the host can use this information to actively scan for free memory regions inside the virtual machine and reclaim them. Moreover, when the host reclaims free memory and clears the memory mappings of the free pages, it also updates the memory mappings of the memory management metadata (set to read-only in S202), which ensures that the reclaimed memory is always free memory of the virtual machine.
In one implementation, calling the memory parsing function to parse the corresponding struct page structure information and determine the free struct page physical pages means calling the memory parsing function to read the flag bits in each struct page structure and treating the struct page physical pages whose flag bits carry a preset identifier as free struct page physical pages.
In this embodiment, different virtual machines may label the flag bits in the struct page structure differently, so the preset identifier that denotes "free" may also differ. For example, in one virtual machine a flag of PG-free indicates that the corresponding struct page physical page is free, whereas in another virtual machine a flag of PG-clean indicates the same.
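Because the preset identifier may differ per virtual machine, the host can keep one parsing function per virtual machine. The sketch below models each injected parsing function as a Python callable; in the application it is BPF bytecode. The bit positions and the names `make_parser`, `vm-a` and `vm-b` are assumptions for illustration.

```python
def make_parser(free_mask):
    """Build a per-VM predicate that reports whether a flags word means 'free'."""
    def is_free(flags):
        return bool(flags & free_mask)
    return is_free


# Hypothetical bit positions standing in for PG-free and PG-clean.
PARSERS = {
    "vm-a": make_parser(free_mask=1 << 0),  # this guest marks free pages with PG-free
    "vm-b": make_parser(free_mask=1 << 3),  # this guest marks free pages with PG-clean
}
```

The host then selects the parser by virtual machine before scanning, so the same flags word may be classified differently for different guests.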
In this embodiment, the mappings of the free struct page physical pages are deleted from the EPT mapping table and the IOMMU mapping table, and the permission bits of the mappings of the struct page structure physical pages in the EPT mapping table and the IOMMU mapping table are set to read-only, where a struct page structure physical page is a struct page physical page that stores the struct page structure information corresponding to the free struct page physical pages.
With reference to FIG. 5 and FIG. 6: because the EPT mapping table and the IOMMU mapping table are quite similar, this embodiment uses the EPT mapping table to illustrate the memory reclaim process. It should be emphasized that FIG. 5 and FIG. 6 are schematic diagrams of the EPT mapping table and the IOMMU mapping table before and after reclaim, respectively. They only sketch the general process and do not correspond strictly to the actual contents of the two tables, so the absence of a detail from the figures should not be taken as an error in the figures or the description.
As shown in FIG. 5, the mapping relationships among the struct page physical pages of the GPA (guest physical address), HPA (host physical address), and IOVA (I/O virtual address) spaces are recorded in the EPT mapping table and the IOMMU mapping table. The struct page structures that describe the struct page physical pages must themselves be stored; they are stored in other struct page physical pages, and a struct page physical page that stores struct page structures is the struct page structure physical page mentioned above. The struct page physical pages whose mappings are drawn as connecting lines are regarded as free struct page physical pages.
When free struct page physical pages are reclaimed, their mappings are deleted from the EPT mapping table and the IOMMU mapping table; these are the struct page physical pages connected by the dashed lines labeled Remove EPT Mapping and Remove IOMMU Mapping in FIG. 6. Setting the permission bits of the mappings of the struct page structure physical pages to read-only means changing to "read-only" the permission bits of the mappings of the struct page physical pages that store the struct page structures corresponding to the unmapped struct page physical pages.
In this embodiment, permission bits are modified at the granularity of a struct page physical page. That is, once the permission bit of the mapping of a struct page physical page is changed to "read-only", all struct page structures stored in that struct page physical page become read-only.
In this embodiment, a struct page physical page has a fixed capacity and can store multiple struct page structures. For example, a 4 KB struct page physical page can store 64 struct page structures of 64 B each, and these 64 struct page structures correspond to 64 struct page physical pages. Consequently, changing the permission bits of a struct page structure physical page's mappings in the EPT mapping table and the IOMMU mapping table makes all the struct page structures in it read-only at the same time, so the mappings of the corresponding 64 struct page physical pages must be deleted from the EPT mapping table and the IOMMU mapping table at the same time.
It follows that a modification to a struct page structure physical page is coupled to the struct page physical pages that correspond to it, and the two must be kept in sync. Therefore, if even one of the 64 struct page structures in a struct page structure physical page corresponds to a struct page physical page that is not free (even if the other 63 are free), neither that struct page structure physical page nor its 64 corresponding struct page physical pages can be reclaimed. Only when all 64 struct page structures in the struct page structure physical page correspond to free struct page physical pages can they all be reclaimed.
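The all-or-nothing rule can be expressed as a small sketch. It assumes, for illustration only, that the struct page structure of page i is stored at index i of a contiguous array, so that structures 0 to 63 share the first struct page structure physical page, 64 to 127 the second, and so on. The function name is hypothetical.

```python
from typing import List

PAGE_SIZE = 4096                              # 4 KB physical page
STRUCT_SIZE = 64                              # 64 B struct page structure
STRUCTS_PER_PAGE = PAGE_SIZE // STRUCT_SIZE   # 64 structures per metadata page


def reclaimable_groups(free_flags: List[bool]) -> List[int]:
    """Return the indices of metadata pages whose 64 described pages are all free.

    free_flags[i] is True when physical page i was found free by the scan.
    """
    groups = []
    for start in range(0, len(free_flags), STRUCTS_PER_PAGE):
        chunk = free_flags[start:start + STRUCTS_PER_PAGE]
        # A single in-use page blocks reclaim of the whole group of 64.
        if len(chunk) == STRUCTS_PER_PAGE and all(chunk):
            groups.append(start // STRUCTS_PER_PAGE)
    return groups
```

For 128 pages in which only page 70 is in use, the first group of 64 is reclaimable and the second is not, even though 63 of its pages are free.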
Similarly, when new physical memory has to be allocated to a virtual machine whose memory has been reclaimed, 64 struct page physical pages should be allocated together even if the virtual machine needs only one.
In one implementation, in S300, the virtual machine's allocation request for the reclaimed free memory is triggered when the virtual machine accesses a mapping, in the EPT mapping table or the IOMMU mapping table, of a struct page structure physical page whose permission bit is read-only.
Specifically, when the virtual machine needs to use new memory, it accesses the corresponding struct page structure in a read-only struct page structure physical page. This access exceeds the read-only permission and therefore raises a page-fault signal. On receiving the page-fault signal, the host can conclude that the virtual machine has issued an allocation request for the reclaimed free memory.
In one implementation, allocating new physical memory to the virtual machine means re-establishing the mappings of the free struct page physical pages in the EPT mapping table or the IOMMU mapping table.
The modification of the permission bits of the mappings of struct page structure physical pages was chosen after considering the execution path of memory allocation in the virtual machine's kernel. With this modification, when the virtual machine allocates memory, its access to the struct page is always intercepted by the host (through the returned page-fault signal), so that the EPT and IOMMU mappings are established before the page becomes available to the virtual machine, which prevents DMA failures.
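The fault-driven reallocation described above can be sketched as a host-side handler. As before, the mapping tables are modeled as dictionaries and all names are hypothetical. Restoring write permission on the struct page structure physical page after remapping is an assumption of this sketch; the description does not spell out that step.

```python
def handle_write_fault(fault_gfn, metadata_to_data, ept, iommu, allocate_host_frame):
    """Handle a guest write to a read-only struct page structure physical page.

    metadata_to_data maps the guest frame of a metadata page to the guest
    frames of the pages it describes (64 of them in the 4 KB / 64 B example).
    """
    if fault_gfn not in metadata_to_data:
        return False  # not a fault caused by reclaim; handled elsewhere

    # Re-establish the mappings of the whole group before the guest can use
    # any of the pages, in both tables, so that later DMA finds a mapping.
    for gfn in metadata_to_data[fault_gfn]:
        hfn = allocate_host_frame()
        ept[gfn] = (hfn, "rw")
        iommu[gfn] = (hfn, "rw")

    # Assumed step: make the metadata page writable again so the guest's
    # interrupted write can be retried.
    for table in (ept, iommu):
        table[fault_gfn] = (table[fault_gfn][0], "rw")
    return True
```

Allocating the whole group at once reflects the statement that 64 struct page physical pages are allocated together even when the virtual machine needs only one.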
In the present application, the virtual machine is unchanged apart from injecting code and data into the host at startup, so the approach is minimally intrusive to the virtual machine and software compatibility is easy to achieve. The host scans the virtual machine's memory state on its own initiative without waiting for a response from the virtual machine, so it achieves better real-time behavior than traditional PV schemes. The scheme is a pure software solution on the existing architecture and does not depend on specific hardware that supports device-side page faults, so it is hardware compatible.
In the present application, the virtual machine injects memory management metadata information and a function for parsing that metadata into the host, which allows the host to scan and reclaim the virtual machine's memory actively and in real time and improves the response speed of memory reclaim. Reclaiming and reallocating free memory involves no disk I/O, which also improves processing efficiency. The modification of the corresponding struct page mappings is taken into account during reclaim, so the scheme can be used in device passthrough scenarios.
An embodiment of the present application provides a memory reclaim apparatus for performing the memory reclaim method described above. The apparatus is described in detail below.
As shown in FIG. 7, the memory reclaim apparatus includes:
an information acquisition module 101, configured to acquire memory management metadata information of a virtual machine and a memory parsing function for parsing that information; and
a memory reclaim module 102, configured to reclaim the free memory of the virtual machine based on the memory parsing function and the memory management metadata information.
In one implementation, the apparatus further includes:
a memory allocation module 103, configured to allocate new physical memory to the virtual machine in response to the virtual machine's allocation request for the reclaimed free memory.
During operation of the information acquisition module 101, the memory parsing function is BPF bytecode compiled by the virtual machine.
The information acquisition module 101 is further configured to:
receive the memory management metadata information injected by the virtual machine and the compiled BPF bytecode of the memory parsing function when, after startup, the virtual machine compiles the pre-written eBPF source program of the memory parsing function.
The memory reclaim module 102 is further configured to reclaim the free memory of the virtual machine after identifying it by scanning with the memory parsing function and the memory management metadata information.
In one implementation, the memory management metadata information includes struct page physical page information and the corresponding struct page structure information, and the memory reclaim module 102 is further configured to:
scan the struct page physical page information of the virtual machine, call the memory parsing function to parse the corresponding struct page structure information, and determine the free struct page physical pages; and
delete the mappings of the free struct page physical pages from the EPT mapping table and the IOMMU mapping table, and set the permission bits of the mappings of the struct page structure physical pages in the EPT mapping table and the IOMMU mapping table to read-only, where a struct page structure physical page is a struct page physical page that stores the struct page structure information corresponding to the free struct page physical pages.
During operation of the memory allocation module 103, the virtual machine's allocation request for the reclaimed free memory is triggered when the virtual machine accesses a mapping, in the EPT mapping table or the IOMMU mapping table, of a struct page structure physical page whose permission bit is read-only.
During operation of the memory allocation module 103, allocating new physical memory to the virtual machine means re-establishing the mappings of the free struct page physical pages in the EPT mapping table or the IOMMU mapping table.
The memory reclaim apparatus provided by the above embodiments of the present application is based on the same inventive concept as the memory reclaim method provided by the embodiments of the present application, and has the same beneficial effects as the method adopted, run, or implemented by the application program it stores.
The internal functions and structure of the memory reclaim apparatus are described above. As shown in FIG. 8, in practice the memory reclaim apparatus may be implemented as a control device that includes a memory 301 and a processor 303.
The memory 301 may be configured to store programs.
In addition, the memory 301 may be configured to store various other data to support operations on the control device. Examples of such data include instructions for any application or method operating on the control device, contact data, phonebook data, messages, pictures, videos, and so on.
The memory 301 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
The processor 303 is coupled to the memory 301 and executes the programs in the memory 301 in order to:
acquire memory management metadata information of a virtual machine and a memory parsing function for parsing that information; and
reclaim the free memory of the virtual machine based on the memory parsing function and the memory management metadata information.
In one implementation, the processor 303 is specifically configured to:
allocate new physical memory to the virtual machine in response to the virtual machine's allocation request for the reclaimed free memory.
In one implementation, the memory parsing function is BPF bytecode compiled by the virtual machine.
In one implementation, the processor 303 is specifically configured to:
receive the memory management metadata information injected by the virtual machine and the compiled BPF bytecode of the memory parsing function when, after startup, the virtual machine compiles the pre-written eBPF source program of the memory parsing function.
In one implementation, the processor 303 is specifically configured to reclaim the free memory of the virtual machine after identifying it by scanning with the memory parsing function and the memory management metadata information.
In one implementation, the memory management metadata information includes struct page physical page information and the corresponding struct page structure information. On this basis, the processor 303 is specifically configured to:
scan the struct page physical page information of the virtual machine, call the memory parsing function to parse the corresponding struct page structure information, and determine the free struct page physical pages; and
delete the mappings of the free struct page physical pages from the EPT mapping table and the IOMMU mapping table, and set the permission bits of the mappings of the struct page structure physical pages in the EPT mapping table and the IOMMU mapping table to read-only, where a struct page structure physical page is a struct page physical page that stores the struct page structure information corresponding to the free struct page physical pages.
In one implementation, in the operation of the processor 303:
the virtual machine's allocation request for the reclaimed free memory is triggered when the virtual machine accesses a mapping, in the EPT mapping table or the IOMMU mapping table, of a struct page structure physical page whose permission bit is read-only.
In one implementation, in the operation of the processor 303:
allocating new physical memory to the virtual machine means re-establishing the mappings of the free struct page physical pages in the EPT mapping table or the IOMMU mapping table.
In the present application, FIG. 8 shows only some components schematically, which does not mean that the server-side device includes only the components shown in FIG. 8.
The control device provided by this embodiment is based on the same inventive concept as the memory reclaim method provided by the embodiments of the present application, and has the same beneficial effects as the method adopted, run, or implemented by the application program it stores.
Those skilled in the art will understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture that includes instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include non-permanent memory in computer-readable media, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The above are merely embodiments of the present application and are not intended to limit it. Those skilled in the art may make various modifications and changes to the present application. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the scope of the claims of the present application.
Claims (11)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210197905.7A CN114840330A (en) | 2022-03-01 | 2022-03-01 | A memory recovery method, device and control device |
| CN202210197905.7 | 2022-03-01 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023165308A1 (en) | 2023-09-07 |
Family
ID=82561832
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2023/075201 Ceased WO2023165308A1 (en) | 2022-03-01 | 2023-02-09 | Memory reclaim method and apparatus, and control device |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN114840330A (en) |
| WO (1) | WO2023165308A1 (en) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114840330A (en) * | 2022-03-01 | 2022-08-02 | 阿里巴巴(中国)有限公司 | A memory recovery method, device and control device |
| CN120653411A (en) * | 2024-03-13 | 2025-09-16 | 杭州阿里云飞天信息技术有限公司 | Memory resource allocation method, network cloud platform, computing device, computer readable storage medium and computer program product |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20130339568A1 (en) * | 2012-06-14 | 2013-12-19 | Vmware, Inc. | Proactive memory reclamation for java virtual machines |
| CN105159742A (en) * | 2015-07-06 | 2015-12-16 | 北京星网锐捷网络技术有限公司 | Unvarnished transmission method and system for PCI device of virtual machine |
| CN106843756A (en) * | 2017-01-13 | 2017-06-13 | 中国科学院信息工程研究所 | Memory pages recovery method and system based on page classifications |
| US20180011797A1 (en) * | 2016-07-06 | 2018-01-11 | Massclouds Innovation Research Institute (Beijing) Of Information Technology | Memory sharing method of virtual machines based on combination of ksm and pass-through |
| CN107885666A (en) * | 2016-09-28 | 2018-04-06 | 华为技术有限公司 | A kind of EMS memory management process and device |
| US10346313B1 (en) * | 2017-01-21 | 2019-07-09 | Virtuozzo International Gmbh | Memory reclaim management for virtual machines |
| CN114840330A (en) * | 2022-03-01 | 2022-08-02 | 阿里巴巴(中国)有限公司 | A memory recovery method, device and control device |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102306126B (en) * | 2011-08-24 | 2014-06-04 | 华为技术有限公司 | Memory management method, device and system |
| CN108021442A (en) * | 2016-11-03 | 2018-05-11 | 阿里巴巴集团控股有限公司 | The system of release physical memory, device and method |
| CN109739613B (en) * | 2018-11-22 | 2021-08-13 | 海光信息技术股份有限公司 | Nested page table maintenance method, access control method and related device |
| CN112256395B (en) * | 2020-10-23 | 2023-01-31 | 海光信息技术股份有限公司 | Safe memory allocation, virtual CPU scheduling method and related device |
- 2022-03-01: CN, CN202210197905.7A, patent CN114840330A (en), active, Pending
- 2023-02-09: WO, PCT/CN2023/075201, patent WO2023165308A1 (en), not active, Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| CN114840330A (en) | 2022-08-02 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US8261267B2 (en) | Virtual machine monitor having mapping data generator for mapping virtual page of the virtual memory to a physical memory | |
| US9940228B2 (en) | Proactive memory reclamation for java virtual machines | |
| US9183015B2 (en) | Hibernate mechanism for virtualized java virtual machines | |
| US8176294B2 (en) | Reducing storage expansion of a virtual machine operating system | |
| US9146847B2 (en) | Optimizing for page sharing in virtualized java virtual machines | |
| US8135899B1 (en) | Expansion of virtualized physical memory of virtual machine | |
| EP2581828B1 (en) | Method for creating virtual machine, virtual machine monitor and virtual machine system | |
| US11860792B2 (en) | Memory access handling for peripheral component interconnect devices | |
| WO2023165308A1 (en) | Memory reclaim method and apparatus, and control device | |
| US11656982B2 (en) | Just-in-time virtual per-VM swap space | |
| US20230195533A1 (en) | Prepopulating page tables for memory of workloads during live migrations | |
| US12001869B2 (en) | Memory over-commit support for live migration of virtual machines | |
| CN101957801A (en) | Messaging device and information processing method | |
| US20230205560A1 (en) | Selective memory deduplication for virtualized computer systems | |
| US10394596B2 (en) | Tracking of memory pages by a hypervisor | |
| CN114860439A (en) | Memory allocation method, host machine, distributed system and program product | |
| US20140208034A1 (en) | System And Method for Efficient Paravirtualized OS Process Switching | |
| US11762573B2 (en) | Preserving large pages of memory across live migrations of workloads | |
| US12367059B2 (en) | Efficient memory swap for virtual machines | |
| US12013799B2 (en) | Non-interrupting portable page request interface | |
| WO2024067479A1 (en) | Container escape detection method, electronic device, and system | |
| US11586371B2 (en) | Prepopulating page tables for memory of workloads during live migrations | |
| US20230027307A1 (en) | Hypervisor-assisted transient cache for virtual machines | |
| US11314522B2 (en) | Fast boot resource allocation for virtual machines | |
| US20230266992A1 (en) | Processor for managing resources using dual queues, and operating method thereof |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23762709; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 23762709; Country of ref document: EP; Kind code of ref document: A1 |