WO2025177500A1 - System, method and program for managing virtualization infrastructure - Google Patents

System, method and program for managing virtualization infrastructure

Info

Publication number
WO2025177500A1
Authority
WO
WIPO (PCT)
Prior art keywords
infrastructure
physical resources
priority
workloads
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/JP2024/006379
Other languages
French (fr)
Japanese (ja)
Inventor
権次郎 森
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SoftBank Corp
Original Assignee
SoftBank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SoftBank Corp
Priority to PCT/JP2024/006379
Publication of WO2025177500A1
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0896 Bandwidth or capacity management, i.e. automatically increasing or decreasing capacities
    • H04L41/0897 Bandwidth or capacity management, i.e. automatically increasing or decreasing capacities by horizontal or vertical scaling of resources, or by migrating entities, e.g. virtual resources or entities

Definitions

  • the present invention relates to the management of a virtualization platform on which virtual machines can be built.
  • if a failure occurs in one of the racks, which are the multiple infrastructure-component devices that make up the virtualization infrastructure (for example, a failure in the power supply installed in each rack), all physical servers (physical resources) in the failed rack stop operating, and processing (for example, service provision) by virtual machines built on the virtualization infrastructure may become impossible.
  • a virtualization infrastructure is configured with multiple container-type data centers (infrastructure-component devices).
  • a system according to one aspect of the present invention is a system for managing a virtualization infrastructure comprising multiple infrastructure component devices, each having multiple physical resources.
  • This system comprises an information storage unit that stores information on the overall physical resources of a cluster on which multiple workloads constructed on the virtualization infrastructure can run, and information on the physical resources of each of the multiple infrastructure component devices; and an information processing unit that classifies multiple services provided by the cluster on which multiple workloads constructed on the virtualization infrastructure can run into multiple service groups with different priorities, and, for each of the multiple service groups, allocates the multiple workloads executed on the cluster and manages the physical resources required by the service group based on the information on the overall physical resources of the cluster and the information on the physical resources of each of the multiple infrastructure component devices.
  • the system may also include a control unit that, when a failure occurs in one of the multiple infrastructure component devices, checks the priority of the service groups to which each of the multiple workloads running on the failed infrastructure component device belongs, and if it is confirmed that a workload of a service group with a priority equal to or higher than a predetermined priority was running on the failed infrastructure component device, stops the workload of the low-priority service group running on the non-failed infrastructure component device, and controls the system to restart the workload of the high-priority service group running on the failed infrastructure component device using the physical resources freed by the stopping of the workload on the non-failed infrastructure component device.
  • the information processing unit may calculate, for each resource component, the physical resources that can be used in the event of a failure in an infrastructure component device with the largest physical resources among the plurality of infrastructure component devices, based on information about the physical resources of each of the plurality of infrastructure component devices, and determine the predetermined priority based on the calculation results for each resource component of the available physical resources, information about each resource component of the physical resources corresponding to each of the plurality of priorities, and the competitive allocation setting conditions for each resource component.
  • the information processing unit may provide the user with impact information in the event that a failure occurs in the infrastructure component device with the greatest physical resources among the plurality of infrastructure component devices.
  • a method according to another aspect of the present invention is a method for managing a virtualization infrastructure comprising a plurality of infrastructure component devices each having a plurality of physical resources.
  • This method includes storing information about the overall physical resources of a cluster on which a plurality of workloads constructed on the virtualization infrastructure can run, and information about the physical resources of each of the plurality of infrastructure component devices; classifying a plurality of services provided by the cluster on which a plurality of workloads constructed on the virtualization infrastructure can run into a plurality of service groups with different priorities; and, for each of the plurality of service groups, allocating a plurality of workloads to be executed on the cluster and managing the physical resources required by the service group based on the information about the overall physical resources of the cluster and the information about the physical resources of each of the plurality of infrastructure component devices.
  • the method may include, when a failure occurs in one of the multiple infrastructure component devices, checking the priority of the service groups to which each of the multiple workloads running on the failed infrastructure component device belongs; and, if it is confirmed that a workload of a service group with a priority equal to or higher than a predetermined priority was running on the failed infrastructure component device, stopping the workload of the low-priority service group running on the non-failed infrastructure component device; and restarting the workload of the high-priority service group running on the failed infrastructure component device using the physical resources freed by stopping the workload on the non-failed infrastructure component device.
  • a program according to yet another aspect of the present invention is a program executed by a computer or processor provided in a system that manages a virtualization infrastructure that includes multiple infrastructure-component devices, each having multiple physical resources.
  • This program includes program code for storing information about the overall physical resources of a cluster on which multiple workloads constructed on the virtualization infrastructure can run, and information about the physical resources of each of the multiple infrastructure-component devices; program code for classifying multiple services provided by the cluster on which multiple workloads constructed on the virtualization infrastructure can run, into multiple service groups with different priorities; and program code for allocating multiple workloads to be executed on the cluster and managing the physical resources required by the service group for each of the multiple service groups, based on the information about the overall physical resources of the cluster and the information about the physical resources of each of the multiple infrastructure-component devices.
  • the program may include program code for, when a failure occurs in one of the multiple infrastructure component devices, checking the priority of the service groups to which each of the multiple workloads running on the failed infrastructure component device belongs; program code for stopping workloads of low-priority service groups running on the non-failed infrastructure component device when it is confirmed that workloads of high-priority service groups with a priority equal to or higher than a predetermined priority were running on the failed infrastructure component device; and program code for restarting workloads of high-priority service groups running on the failed infrastructure component device using physical resources freed by the stopping of the workloads on the non-failed infrastructure component device.
  • the program may include program code for calculating, for each resource component, the physical resources available for use in the event of a failure in an infrastructure component device with the greatest number of physical resources among the plurality of infrastructure component devices, based on information about the physical resources of each of the plurality of infrastructure component devices; and program code for determining the predetermined priority based on the calculation results for each resource component of the available physical resources, information about each resource component of the physical resources corresponding to each of the plurality of priorities, and competitive allocation setting conditions for each resource component.
  • the program may include program code for providing users with impact information in the event that a failure occurs in the infrastructure component device with the greatest physical resources among the plurality of infrastructure component devices.
  • the physical resource may be a physical server.
  • the infrastructure component device may be a rack having multiple physical servers and a power supply.
  • the physical resource may be a rack having multiple physical servers.
  • the infrastructure component device may be a container-type data center having multiple racks and a power source.
  • the program may include a machine-learned model.
  • the virtualization platform may be used, for example, in a data center, or in a RIC (RAN Intelligent Controller) used in the RAN (Radio Access Network) of a mobile communications system.
  • FIG. 1 is a schematic diagram showing an example of the overall configuration of a virtualization platform to which a system according to an embodiment can be applied.
  • FIG. 2 is an explanatory diagram showing an example of a failure occurring in a single physical server in a virtualization infrastructure according to a reference example.
  • FIG. 3 is an explanatory diagram illustrating an example of a rack-scale failure occurring in the virtualization platform according to the embodiment.
  • FIG. 4 is an explanatory diagram illustrating an example of the allocation of workloads with multiple priorities in a virtualization platform according to the embodiment.
  • FIG. 5 is an explanatory diagram illustrating an example of the placement of workloads with multiple priorities when a rack-scale failure occurs in the virtualization platform according to the embodiment.
  • FIG. 6 is an explanatory diagram illustrating an example of the configuration of a management system that manages a virtualization infrastructure according to the embodiment.
  • FIG. 7 is a flowchart showing an example of generating a resource-related table of a virtual environment in the virtualization infrastructure according to the embodiment.
  • FIG. 8 is a flowchart showing an example of generating a physical resource-related table in the virtualization platform according to the embodiment.
  • FIG. 9 is a flowchart illustrating an example of calculation of the priority of a service group that restarts a workload when a rack-scale failure occurs in the virtualization platform according to the embodiment.
  • An example of a system according to an embodiment described herein defines service groups, and a priority for each service group, for the resources of workloads running in a virtual environment on a virtualization platform. When a failure of a platform component device (rack) or a similar event leaves too few resources to start all workloads (service groups), the system stops workloads of low-priority service groups so that workloads of high-priority service groups can be started.
  • control involves defining and managing resource components (CPU (Central Processing Unit), memory, GPU (Graphics Processing Unit)) of physical resources consumed by each service group and competitive allocation settings for physical resources (e.g., allocating four virtual CPUs to one physical CPU actually installed in a physical server).
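  • As an illustration of a competitive allocation setting, the following minimal Python sketch converts physical capacity into allocatable virtual capacity. Only the 4:1 CPU ratio comes from the text; the function name, the dictionary layout, the server sizes, and the 1:1 ratios for memory and GPU are assumptions made for illustration.

```python
# Hypothetical sketch: apply competitive allocation (overcommit) ratios
# to the physical capacity of one server.

def allocatable_capacity(physical, ratios):
    """Return the virtual capacity per resource component.

    physical: physical amount per resource component (e.g. CPU cores).
    ratios:   virtual units that may be allocated per physical unit.
    """
    return {name: amount * ratios.get(name, 1) for name, amount in physical.items()}


# Assumed example values; only the 4:1 CPU ratio comes from the text.
physical_server = {"cpu": 32, "memory_gb": 256, "gpu": 2}
overcommit = {"cpu": 4, "memory_gb": 1, "gpu": 1}

virtual = allocatable_capacity(physical_server, overcommit)
print(virtual)  # {'cpu': 128, 'memory_gb': 256, 'gpu': 2}
```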
  • the virtualization infrastructure managed by the system of this embodiment can be used in the RIC (RAN Intelligent Controller) used in the RAN (Radio Access Network) of a mobile communications system.
  • FIG. 1 is a schematic diagram showing an example of the overall configuration of a virtualization platform to which a system according to an embodiment can be applied.
  • the virtualization platform 10 comprises racks 110, 120, and 130 as multiple platform-constituting devices.
  • the racks 110, 120, and 130 each have physical servers (also referred to as "physical hosts") 111, 121, and 131 as multiple physical resources.
  • the multiple physical servers 111, 121, and 131 (24 in the illustrated example) contained in the racks 110, 120, and 130 form a cluster, which is a unit of a group in which workloads such as virtual machines (VMs) or containers can run.
  • the virtualization platform 10 is composed of three racks 110, 120, and 130, but the virtualization platform 10 may also be composed of two racks or four or more racks.
  • in the illustrated example, the racks 110, 120, and 130 each have eight physical servers 111, 121, and 131, respectively, but each rack may instead have one to seven, or nine or more, physical servers.
  • the number of physical servers may differ from rack to rack.
  • a cluster capable of running multiple workloads on the virtualization platform 10 is constructed using multiple racks 110, 120, and 130, with workloads such as virtual machines (VMs) or containers running on multiple physical servers 111, 121, and 131.
  • Each of the multiple racks 110, 120, and 130 also has other devices such as a switch 112.
  • Each of the multiple racks 110, 120, and 130 may have a power supply that supplies power to the multiple physical servers 111, 121, and 131.
  • to tolerate the failure of a single physical server, the entire cluster only needs to have surplus resources equivalent to one physical server.
  • rack-scale failures can occur in a virtualization platform 10 configured as described above. For example, as shown in FIG. 3, if a failure occurs in the power supply system of one rack 110, all physical servers 111 within the rack 110 will go down. When a rack-scale failure occurs, there is nowhere to move the large number of workloads that were running on all physical servers 111' within the rack 110, and these workloads cannot be started, affecting the services provided. While it is possible to start all workloads that were running on all physical servers 111' in the rack 110 on physical servers in other racks, this would require a huge amount of surplus resources and would be unrealistic. For example, the entire cluster would require surplus resources equivalent to one rack. For this reason, rack failures are generally not included in failure cases.
  • the multiple services provided by a cluster built on the virtualization platform 10, which can run multiple workloads, are classified into multiple service groups with different priorities. Based on information about the physical servers (physical resources) 111, 121, and 131 of the entire cluster and information about the physical servers (physical resources) 111, 121, and 131 of each of the multiple racks (platform configuration devices) 110, 120, and 130, the placement of the multiple workloads executed in the cluster and the management of the physical servers (physical resources) 111, 121, and 131 required by the service group are performed for each of the multiple service groups.
  • This workload placement and management of the physical servers (physical resources) for each of the multiple service groups makes it possible to continue providing high-priority services on the virtualization platform 10, even if a failure occurs in one of the multiple racks 110, 120, and 130 that make up the virtualization platform 10, causing all physical servers in the failed rack to stop operating.
  • FIG. 4 is an explanatory diagram showing an example of the arrangement of workloads with multiple priorities in the virtualization platform 10 according to the embodiment.
  • the workloads running on the physical servers 111(1) to 111(3), 121(1) to 121(3), and 131(1) to 131(3) of the racks 110, 120, and 130 are indicated by square blocks 201A to 201C, 202A to 202C, and 203A to 203C.
  • the symbols A, B, and C in each workload block indicate the priority of each service group. Priority A is the highest priority, Priority B is the second highest priority, and Priority C is the lowest priority.
  • four workloads can be started per physical server 111(1) to 111(3), 121(1) to 121(3), and 131(1) to 131(3).
  • the upper limit for the number of workloads in the cluster is 36, and 30 workloads are currently running. Even if a single physical server fails, there are enough surplus resources to restart the workloads of that one physical server on other physical servers. However, if rack 110 experiences a rack-wide failure, the remaining free resources in racks 120 and 130 do not have the capacity to start all 10 workloads 201A-201C that were running on physical servers 111(1)-111(3) of rack 110.
  • FIG. 5 is an explanatory diagram showing an example of the allocation of workloads with multiple priorities when a rack-wide failure occurs in the virtualization platform 10 according to the embodiment.
  • in FIG. 5, when a failure occurs in rack 110, the four workloads 201A in the service group with the highest priority A can be restarted using the free resources of other physical servers 121 and 131. However, for the three workloads 201B in the service group with the second-highest priority B that were running in rack 110, there are insufficient free resources to restart them on the other physical servers 121 and 131.
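  • The arithmetic behind the FIG. 4 and FIG. 5 example can be reproduced with a short Python sketch. The slot counts, the 30 running workloads, and the four priority-A and three priority-B workloads on rack 110 come from the text; the count of three priority-C workloads on rack 110 is inferred (10 - 4 - 3), and all variable names are illustrative.

```python
# Worked arithmetic for the FIG. 4 / FIG. 5 example.
SERVERS_PER_RACK = 3      # physical servers shown per rack
SLOTS_PER_SERVER = 4      # workloads that can start on one server
RACKS = 3
RUNNING_TOTAL = 30        # workloads running in the cluster

# Workloads on the failed rack 110 by priority. The counts for A and B
# are from the text; the count for C is inferred as 10 - 4 - 3.
failed_rack = {"A": 4, "B": 3, "C": 3}

cluster_limit = RACKS * SERVERS_PER_RACK * SLOTS_PER_SERVER
surviving_slots = (RACKS - 1) * SERVERS_PER_RACK * SLOTS_PER_SERVER
surviving_running = RUNNING_TOTAL - sum(failed_rack.values())
free_slots = surviving_slots - surviving_running

# Restart in descending priority order using only the free slots.
restarted = {}
remaining = free_slots
for priority in ("A", "B", "C"):
    restarted[priority] = min(failed_rack[priority], remaining)
    remaining -= restarted[priority]

print(cluster_limit)   # 36
print(free_slots)      # 4
print(restarted)       # {'A': 4, 'B': 0, 'C': 0}
```

  • With only four free slots on racks 120 and 130, the four priority-A workloads fit and none of the priority-B workloads can be restarted without stopping something else.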
  • the system checks the priority of the service groups to which each of the multiple workloads running on the rack (infrastructure configuration device) 110 where the failure occurred belongs. If it is confirmed that a workload from a service group with a priority equal to or higher than a predetermined priority (e.g., "medium" priority) was running on the rack (infrastructure configuration device) 110 where the failure occurred, the system stops the workloads from service groups with a lower priority (e.g., "low" priority) running on the other racks (infrastructure configuration devices) 120, 130 where the failure did not occur.
  • the system restarts the workloads from the high-priority service groups A and B that were running on the rack (infrastructure configuration device) 110 where the failure occurred, using the physical resources freed up by stopping the workloads on the other racks (infrastructure configuration devices) 120, 130 where the failure did not occur.
  • all of the workloads of the service group with priority C and some or all of the workloads of the service group with priority B that were running on the physical servers 121 and 131 in racks 120 and 130 may be stopped, and the resources freed up by stopping the workloads may be used to restart workload 201A of the service group with priority A, workload 201B of the service group with priority B, or both of these workloads that were running on rack 110.
  • surplus resources in the virtualization platform 10 can be reduced by accepting the abandonment of services from low-priority service groups.
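  • The stop-and-restart control described above can be sketched in Python as follows. This is a simplified illustration, not the claimed implementation: it assumes every workload occupies one equally sized slot, stops only as many lower-priority workloads as are needed, and uses a hypothetical priority split for the 20 workloads on the surviving racks. The function name `plan_failover` and its parameters are invented for this sketch.

```python
# Hypothetical sketch of the stop-and-restart control after a rack failure.
PRIORITIES = ["A", "B", "C"]          # highest to lowest


def plan_failover(failed, survivors, free_slots, threshold):
    """Plan which workloads to stop and which to restart.

    failed:     workload counts per priority on the failed rack.
    survivors:  workload counts per priority on the surviving racks.
    free_slots: unused workload slots on the surviving racks.
    threshold:  lowest priority that is still restarted.
    Returns (restarted, stopped) as dicts of counts per priority.
    """
    cutoff = PRIORITIES.index(threshold)
    protected = PRIORITIES[: cutoff + 1]   # priorities at or above the threshold
    victims = PRIORITIES[cutoff + 1:]      # priorities that may be stopped
    survivors = dict(survivors)            # do not mutate the caller's data
    restarted = {p: 0 for p in PRIORITIES}
    stopped = {p: 0 for p in PRIORITIES}

    for priority in protected:
        needed = failed.get(priority, 0)
        # Use the free slots first.
        use = min(needed, free_slots)
        free_slots -= use
        restarted[priority] += use
        needed -= use
        # Then stop lower-priority workloads, lowest priority first.
        for victim in reversed(victims):
            take = min(needed, survivors.get(victim, 0))
            survivors[victim] = survivors.get(victim, 0) - take
            stopped[victim] += take
            restarted[priority] += take
            needed -= take
    return restarted, stopped


failed = {"A": 4, "B": 3, "C": 3}
survivors = {"A": 8, "B": 6, "C": 6}   # hypothetical split of the other 20 workloads
restarted, stopped = plan_failover(failed, survivors, free_slots=4, threshold="B")
print(restarted)  # {'A': 4, 'B': 3, 'C': 0}
print(stopped)    # {'A': 0, 'B': 0, 'C': 3}
```

  • In this run the four free slots absorb the priority-A workloads, and three priority-C workloads on the surviving racks are stopped to make room for the three priority-B workloads.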
  • FIG. 6 is an explanatory diagram showing an example of the configuration of a management system 30 that manages the virtualization platform 10 according to an embodiment.
  • the management system 30 includes an information storage unit (DB) 310, an information processing unit 320, and a control unit 330.
  • the management system 30 may further include a communication unit 340.
  • the management system 30 may be installed on a server or cloud computer system separate from the virtualization platform 10, or may be installed as a management server on a physical server in the virtualization platform 10.
  • the information storage unit (DB) 310 stores information on all physical servers (physical resources) 111, 121, 131 of a cluster constructed on the virtualization platform 10 that can run multiple workloads, as well as information on each physical server (physical resource) 111, 121, 131 of multiple racks (platform configuration devices) 110, 120, 130.
  • the information storage unit (DB) 310 also stores information on various tables in the configuration of the virtualization platform 10 described above, such as the service group table in Table 1, the priority-based resource table in Table 2, the rack resource table in Table 3, the cluster resource table in Table 4, and the condition table in Table 5.
  • control unit 330 controls the restart of the workloads from the high-priority service groups A and B that were running on the rack (infrastructure configuration device) 110 where the failure occurred, using the physical resources freed up by stopping the workloads on the other racks (infrastructure configuration devices) 120, 130 where the failure did not occur.
  • FIG. 7 is a flowchart showing an example of generating a resource-related table for a virtual environment in the virtualization platform 10 according to an embodiment.
  • the information processing unit 320 generates a workload table for a workload when deploying (e.g., building or creating) the workload in the virtualization platform 10 (S101).
  • the information processing unit 320 aggregates resource components (CPU, memory, GPU) for each service group with different priorities and creates the service group table shown in Table 1 (S102).
  • the information processing unit 320 aggregates resource components (CPU, memory, GPU) for each priority for the service group table shown in Table 1 and creates the priority-based resource table shown in Table 2 (S103).
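  • Steps S101 to S103 amount to two successive group-by aggregations, which can be sketched in Python as follows. The row layout, the field names, the service group names, and all numeric values are hypothetical; the source does not reproduce the contents of the tables.

```python
from collections import defaultdict

# Hypothetical workload table (S101): one row per deployed workload.
workloads = [
    {"name": "wl-1", "group": "sg-voice", "priority": "high", "cpu": 4, "memory_gb": 8, "gpu": 0},
    {"name": "wl-2", "group": "sg-voice", "priority": "high", "cpu": 2, "memory_gb": 4, "gpu": 0},
    {"name": "wl-3", "group": "sg-analytics", "priority": "medium", "cpu": 8, "memory_gb": 32, "gpu": 1},
    {"name": "wl-4", "group": "sg-batch", "priority": "low", "cpu": 4, "memory_gb": 16, "gpu": 0},
]
COMPONENTS = ("cpu", "memory_gb", "gpu")


def aggregate(rows, key):
    """Sum each resource component over rows that share the same key."""
    totals = defaultdict(lambda: dict.fromkeys(COMPONENTS, 0))
    for row in rows:
        for component in COMPONENTS:
            totals[row[key]][component] += row[component]
    return dict(totals)


# S102: service group table (the group's priority is carried along).
group_totals = aggregate(workloads, "group")
group_priority = {row["group"]: row["priority"] for row in workloads}
service_group_table = [
    {"group": g, "priority": group_priority[g], **t} for g, t in group_totals.items()
]

# S103: priority-based resource table, aggregated from the service group table.
priority_table = aggregate(service_group_table, "priority")
print(priority_table["high"])  # {'cpu': 6, 'memory_gb': 12, 'gpu': 0}
```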
  • FIG. 8 is a flowchart showing an example of generating a physical resource-related table in a virtualization platform according to an embodiment.
  • the information processing unit 320 generates a physical server resource table from information about the physical servers in each rack that constitutes a cluster in the virtualization platform 10 (S201).
  • the information processing unit 320 aggregates the resource components (CPU, memory, GPU) in the physical server resource table to create the cluster resource table shown in Table 4 (S202).
  • the information processing unit 320 creates the rack resource table shown in Table 3 from the physical server resource table (S203).
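  • Steps S201 to S203 can be sketched in the same way: the cluster resource table is the sum over all physical servers, and the rack resource table is the sum per rack. The server rows, field names, and capacities below are hypothetical values chosen for illustration.

```python
from collections import defaultdict

# Hypothetical physical server resource table (S201).
servers = [
    {"rack": "110", "server": "111(1)", "cpu": 32, "memory_gb": 256, "gpu": 2},
    {"rack": "110", "server": "111(2)", "cpu": 32, "memory_gb": 256, "gpu": 2},
    {"rack": "120", "server": "121(1)", "cpu": 32, "memory_gb": 256, "gpu": 0},
    {"rack": "130", "server": "131(1)", "cpu": 16, "memory_gb": 128, "gpu": 0},
]
COMPONENTS = ("cpu", "memory_gb", "gpu")

# S202: cluster resource table, the sum over every server in the cluster.
cluster_table = {c: sum(s[c] for s in servers) for c in COMPONENTS}

# S203: rack resource table, the sum over the servers of each rack.
rack_table = defaultdict(lambda: dict.fromkeys(COMPONENTS, 0))
for s in servers:
    for c in COMPONENTS:
        rack_table[s["rack"]][c] += s[c]
rack_table = dict(rack_table)

print(cluster_table)       # {'cpu': 112, 'memory_gb': 896, 'gpu': 4}
print(rack_table["110"])   # {'cpu': 64, 'memory_gb': 512, 'gpu': 4}
```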
  • the information processing unit 320 determines the predetermined priority, i.e., how many priorities can be covered in descending order of priority (S302). The determination is based on: the calculation results for each resource component (CPU, memory, GPU) of the physical resources of the available racks 120 and 130 (rack resource table in Table 7); the information for each resource component of the physical resources corresponding to each of the multiple priorities (high, medium, low) (resource table by priority in Table 6); and the competitive allocation setting conditions for each resource component in Table 5.
  • the information processing unit 320 transmits impact information regarding the impact when the rack 110 with the largest number of resource components (CPU, memory, GPU) goes down (fails), i.e., when a failure occurs in the rack 110, to the terminal device of the user (or operator, etc.) 40 via the communication unit 340 and provides it (S303).
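  • The determination of the predetermined priority in S302 can be sketched in Python as follows. The sketch removes the largest rack, scales the remaining physical resources by the competitive allocation ratios, and accumulates demand in descending priority order until some resource component no longer fits. All table contents are hypothetical, and the rule for choosing the "largest" rack (comparing CPU, then memory, then GPU) is an assumption, since the text does not define it.

```python
COMPONENTS = ("cpu", "memory_gb", "gpu")
PRIORITIES = ("high", "medium", "low")   # descending order

# Hypothetical rack resource table (physical units per rack).
rack_table = {
    "110": {"cpu": 64, "memory_gb": 512, "gpu": 4},
    "120": {"cpu": 64, "memory_gb": 512, "gpu": 4},
    "130": {"cpu": 32, "memory_gb": 256, "gpu": 2},
}
# Hypothetical priority-based resource table (virtual units requested).
priority_table = {
    "high":   {"cpu": 160, "memory_gb": 300, "gpu": 2},
    "medium": {"cpu": 120, "memory_gb": 300, "gpu": 2},
    "low":    {"cpu": 200, "memory_gb": 400, "gpu": 4},
}
# Competitive allocation conditions: virtual units per physical unit.
ratios = {"cpu": 4, "memory_gb": 1, "gpu": 1}


def coverable_priority(racks, demand, ratios):
    """Return (largest rack, lowest priority still covered if it fails)."""
    # Assumption: "largest" is decided by comparing components in order.
    largest = max(racks, key=lambda r: tuple(racks[r][c] for c in COMPONENTS))
    available = {
        c: sum(res[c] for name, res in racks.items() if name != largest) * ratios[c]
        for c in COMPONENTS
    }
    used = dict.fromkeys(COMPONENTS, 0)
    covered = None
    for priority in PRIORITIES:
        needed = {c: used[c] + demand[priority][c] for c in COMPONENTS}
        if any(needed[c] > available[c] for c in COMPONENTS):
            break
        used, covered = needed, priority
    return largest, covered


result = coverable_priority(rack_table, priority_table, ratios)
print(result)  # ('110', 'medium')
```

  • With these assumed values, losing rack 110 leaves 384 virtual CPUs, 768 GB of memory, and 6 GPUs, which covers the high and medium priorities but not the low priority, so "medium" would be the predetermined priority.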
  • all or part of the generation of the virtual environment resource-related table in FIG. 7, the generation of the physical resource-related table in FIG. 8, and the calculation of service group priorities in FIG. 9 may be pre-calculated before the actual operation of the virtualization infrastructure 10 begins, or may be recalculated when the number or configuration of racks in the virtualization infrastructure 10 changes, when workloads such as virtual machines (VMs) or containers running on the virtualization infrastructure 10 change, when services provided by the virtualization infrastructure 10 change, etc.
  • the multiple infrastructure-constituting devices that make up the virtualization infrastructure 10 are mainly racks 110, 120, and 130 each having multiple physical servers and power supplies, and the physical resources are physical servers 111, 121, and 131.
  • the types of multiple infrastructure-constituting devices and physical resources that make up the virtualization infrastructure 10 are not limited to these.
  • the multiple infrastructure-constituting devices that make up the virtualization infrastructure 10 may each be a container-type data center having multiple racks and power supplies, and the physical resources may be racks having multiple physical servers.
  • the present invention reduces excess resources in the virtualization infrastructure 10, while enabling high-priority services to continue to be provided on the virtualization infrastructure 10 even if a failure occurs in one of the multiple racks 110, 120, and 130 that make up the virtualization infrastructure 10. This can contribute to achieving Goal 9 of the Sustainable Development Goals (SDGs), which is to "build inclusive and sustainable industrial infrastructure, promote inclusive and sustainable industrialization, and build resilient infrastructure."
  • the processing steps and components of the management system (storage unit, information processing unit, control unit, communication unit, etc.) may be implemented by hardware, firmware, software, or a combination thereof.
  • the processing units and other means used to realize the above steps and components in an entity may be implemented in one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, electronic devices, other electronic units designed to perform the functions described herein, computers, or combinations thereof.
  • the means such as a processing unit, used to realize the components may be implemented with a program (e.g., code, such as procedures, functions, modules, instructions, etc.) that performs the functions described herein.
  • any computer/processor-readable medium tangibly embodying firmware and/or software code may be used to implement the means, such as a processing unit, used to realize the steps and components described herein.
  • the firmware and/or software code may be stored in memory and executed by a computer or processor, for example in a control device.
  • the memory may be implemented within the computer or processor, or external to the computer or processor.
  • the firmware and/or software code may also be stored on a computer- or processor-readable medium, such as, for example, random access memory (RAM), read-only memory (ROM), non-volatile random access memory (NVRAM), programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), flash memory, floppy disk, compact disk (CD), digital versatile disk (DVD), magnetic or optical data storage device, etc.
  • the code may be executed by one or more computers or processors and may cause the computers or processors to perform certain aspects of the functionality described herein.
  • the medium may be a non-transitory recording medium.
  • the program code may be in any format as long as it can be read and executed by a computer, processor, or other device or machine, and its format is not limited to a specific format.
  • the program code may be source code, object code, or binary code, or may be a mixture of two or more of these codes.
  • 10: Virtualization platform; 30: Management system; 110: Rack; 111: Physical server; 111(1) to 111(3): Physical servers; 111': Physical server; 112: Switch; 120: Rack; 121: Physical server; 121(1) to 121(3): Physical servers; 130: Rack; 131: Physical server; 131(1) to 131(3): Physical servers; 201A to 201C: Workloads; 202A to 202C: Workloads; 203A to 203C: Workloads; 320: Information processing unit; 330: Control unit; 340: Communication unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Hardware Redundancy (AREA)

Abstract

The present invention continuously provides a high-priority service on a virtualization infrastructure even when a failure occurs in any of a plurality of infrastructure constituent devices constituting the virtualization infrastructure, and all physical resources in the infrastructure constituent device in which the failure has occurred stop operation. This system comprises: an information storage unit that stores information relating to physical resources of the entire cluster in which a plurality of workloads constructed on a virtualization infrastructure can operate, and information relating to the respective physical resources of a plurality of infrastructure constituent devices; and an information processing unit that classifies a plurality of services provided by the cluster into a plurality of service groups having different priorities, and for each of the plurality of respective service groups, arranges the plurality of workloads to be executed in the cluster and manages the physical resources required for the respective service groups on the basis of the information relating to the physical resources of the entire cluster and the information relating to the respective physical resources of the plurality of infrastructure constituent devices.

Description

System, method and program for managing virtualization infrastructure

The present invention relates to the management of a virtualization platform on which virtual machines can be built.

Conventionally, a virtualization platform for building virtual machines is known that consists of multiple racks, each containing multiple physical servers (physical resources) (see, for example, Patent Document 1).

Patent Document 1: International Publication No. 2011/043317

If a failure occurs in any one of the racks, which are the infrastructure component devices that make up the virtualization infrastructure (for example, a failure of the power supply provided for each rack), all physical servers (physical resources) in the failed rack stop operating, and processing by the virtual machines built on the virtualization infrastructure (for example, service provision) may become impossible. The same problem can arise when a virtualization infrastructure is made up of multiple container-type data centers (infrastructure component devices).

A system according to one aspect of the present invention manages a virtualization infrastructure comprising multiple infrastructure component devices, each having multiple physical resources. The system comprises an information storage unit and an information processing unit. The information storage unit stores information on the physical resources of an entire cluster in which multiple workloads constructed on the virtualization infrastructure can run, and information on the physical resources of each of the multiple infrastructure component devices. The information processing unit classifies the multiple services provided by the cluster into multiple service groups with mutually different priorities and, for each service group, places the workloads executed in the cluster and manages the physical resources required by the service group, based on the information on the physical resources of the entire cluster and the information on the physical resources of each infrastructure component device.

The system may also include a control unit. When a failure occurs in any of the multiple infrastructure component devices, the control unit checks the priority of the service group to which each workload that was running on the failed infrastructure component device belongs. If it confirms that a workload of a high-priority service group, that is, one whose priority is equal to or higher than a predetermined priority, was running on the failed device, it stops workloads of low-priority service groups running on the non-failed infrastructure component devices and restarts the high-priority workload that had been running on the failed device, using the physical resources freed by stopping those workloads.

In the system, the information processing unit may calculate, for each resource component, the physical resources that remain usable if a failure occurs in the infrastructure component device with the largest physical resources among the multiple infrastructure component devices, based on the information on the physical resources of each infrastructure component device. It may then determine the predetermined priority based on that per-component calculation result, the per-component information on the physical resources corresponding to each of the multiple priorities, and the competitive allocation setting conditions for each resource component.
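The determination of the predetermined priority described above could be sketched as follows. This is only an illustrative sketch in Python under stated assumptions, not the procedure of the embodiment: the function names, the dictionary layout, the example figures, the numbering of priorities (1 is the highest), the reading of the competitive allocation setting as a virtual-to-physical ratio, and the rule "accumulate demand from the highest priority downward until some resource component no longer fits" are all hypothetical. The "largest" device is also taken separately for each resource component, which is a conservative simplification.

```python
def surviving_capacity(racks):
    """Per-component physical capacity left if the largest rack is lost.

    Assumption: "largest" is evaluated independently for each component.
    """
    components = next(iter(racks.values())).keys()
    return {
        c: sum(r[c] for r in racks.values()) - max(r[c] for r in racks.values())
        for c in components
    }


def threshold_priority(racks, demand_by_priority, ratio):
    """Return the lowest priority level whose cumulative demand still fits.

    demand_by_priority maps a priority (1 = highest, an assumption) to the
    virtual resources requested by that priority's service groups.
    ratio maps a component to the assumed virtual-per-physical setting.
    Returns None when even the highest priority does not fit.
    """
    capacity = surviving_capacity(racks)
    used = {c: 0.0 for c in capacity}
    threshold = None
    for priority in sorted(demand_by_priority):
        demand = demand_by_priority[priority]
        trial = {c: used[c] + demand[c] / ratio[c] for c in capacity}
        if any(trial[c] > capacity[c] for c in capacity):
            break  # this priority level no longer fits on surviving racks
        used, threshold = trial, priority
    return threshold


# Hypothetical figures: three identical racks, three priority levels.
racks = {
    "rack110": {"cpu": 64, "memory": 512},
    "rack120": {"cpu": 64, "memory": 512},
    "rack130": {"cpu": 64, "memory": 512},
}
demand_by_priority = {
    1: {"cpu": 256, "memory": 400},
    2: {"cpu": 200, "memory": 500},
    3: {"cpu": 200, "memory": 300},
}
ratio = {"cpu": 4.0, "memory": 1.0}

print(threshold_priority(racks, demand_by_priority, ratio))
```

With these hypothetical figures, priorities 1 and 2 fit on the two surviving racks but priority 3 exceeds the remaining CPU capacity, so the sketch selects priority 2 as the threshold.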

In the system, the information processing unit may provide the user with impact information for the case in which a failure occurs in the infrastructure component device with the largest physical resources among the multiple infrastructure component devices.

A method according to another aspect of the present invention manages a virtualization infrastructure comprising multiple infrastructure component devices, each having multiple physical resources. The method includes: storing information on the physical resources of an entire cluster in which multiple workloads constructed on the virtualization infrastructure can run, and information on the physical resources of each of the multiple infrastructure component devices; classifying the multiple services provided by the cluster into multiple service groups with mutually different priorities; and, for each service group, placing the workloads executed in the cluster and managing the physical resources required by the service group, based on the information on the physical resources of the entire cluster and the information on the physical resources of each infrastructure component device.

The method may include: when a failure occurs in any of the multiple infrastructure component devices, checking the priority of the service group to which each workload that was running on the failed infrastructure component device belongs; if it is confirmed that a workload of a high-priority service group, that is, one whose priority is equal to or higher than a predetermined priority, was running on the failed device, stopping workloads of low-priority service groups running on the non-failed infrastructure component devices; and restarting the high-priority workload that had been running on the failed device, using the physical resources freed by stopping those workloads.

The method may include: calculating, for each resource component, the physical resources that remain usable if a failure occurs in the infrastructure component device with the largest physical resources among the multiple infrastructure component devices, based on the information on the physical resources of each infrastructure component device; and determining the predetermined priority based on that per-component calculation result, the per-component information on the physical resources corresponding to each of the multiple priorities, and the competitive allocation setting conditions for each resource component.

The method may also include providing the user with impact information for the case in which a failure occurs in the infrastructure component device with the largest physical resources among the multiple infrastructure component devices.

A program according to yet another aspect of the present invention is executed by a computer or processor provided in a system that manages a virtualization infrastructure comprising multiple infrastructure component devices, each having multiple physical resources. The program includes: program code for storing information on the physical resources of an entire cluster in which multiple workloads constructed on the virtualization infrastructure can run, and information on the physical resources of each of the multiple infrastructure component devices; program code for classifying the multiple services provided by the cluster into multiple service groups with mutually different priorities; and program code for, for each service group, placing the workloads executed in the cluster and managing the physical resources required by the service group, based on the information on the physical resources of the entire cluster and the information on the physical resources of each infrastructure component device.

The program may include: program code for checking, when a failure occurs in any of the multiple infrastructure component devices, the priority of the service group to which each workload that was running on the failed infrastructure component device belongs; program code for stopping workloads of low-priority service groups running on the non-failed infrastructure component devices when it is confirmed that a workload of a high-priority service group, that is, one whose priority is equal to or higher than a predetermined priority, was running on the failed device; and program code for restarting the high-priority workload that had been running on the failed device, using the physical resources freed by stopping those workloads.

The program may include: program code for calculating, for each resource component, the physical resources that remain usable if a failure occurs in the infrastructure component device with the largest physical resources among the multiple infrastructure component devices, based on the information on the physical resources of each infrastructure component device; and program code for determining the predetermined priority based on that per-component calculation result, the per-component information on the physical resources corresponding to each of the multiple priorities, and the competitive allocation setting conditions for each resource component.

The program may include program code for providing the user with impact information for the case in which a failure occurs in the infrastructure component device with the largest physical resources among the multiple infrastructure component devices.

In the system, the method, and the program, the physical resource may be a physical server, and the infrastructure component device may be a rack having multiple physical servers and a power supply.

In the system, the method, and the program, the physical resource may be a rack having multiple physical servers, and the infrastructure component device may be a container-type data center having multiple racks and a power supply.

The program may include a machine-learned model.

The virtualization platform may be used, for example, in a data center, or in a RIC (RAN Intelligent Controller) used in the RAN (Radio Access Network) of a mobile communications system.

According to the present invention, even if a failure occurs in any of the multiple infrastructure component devices that make up the virtualization infrastructure and all physical resources within the failed infrastructure component device stop operating, high-priority services can continue to be provided on the virtualization infrastructure.

FIG. 1 is a schematic diagram showing an example of the overall configuration of a virtualization platform to which a system according to an embodiment can be applied.

FIG. 2 is an explanatory diagram showing an example of a failure occurring in a single physical server in a virtualization platform according to a reference example.

FIG. 3 is an explanatory diagram showing an example of a rack-scale failure occurring in the virtualization platform according to the embodiment.

FIG. 4 is an explanatory diagram showing an example of the placement of workloads with multiple priorities in the virtualization platform according to the embodiment.

FIG. 5 is an explanatory diagram showing an example of the placement of workloads with multiple priorities when a rack-scale failure occurs in the virtualization platform according to the embodiment.

FIG. 6 is an explanatory diagram showing an example of the configuration of a management system that manages the virtualization platform according to the embodiment.

FIG. 7 is a flowchart showing an example of generating a resource-related table of a virtual environment in the virtualization platform according to the embodiment.

FIG. 8 is a flowchart showing an example of generating a physical resource-related table in the virtualization platform according to the embodiment.

FIG. 9 is a flowchart showing an example of calculating the priority of the service groups whose workloads are restarted when a rack-scale failure occurs in the virtualization platform according to the embodiment.

Hereinafter, an embodiment of the present invention will be described with reference to the drawings.
An example of a system according to the embodiment described herein defines service groups, and a priority for each service group, for the resources of workloads running in a virtual environment on a virtualization platform. When resources are so constrained that it is difficult to start all workloads (service groups), for example because of a failure affecting an entire platform component device (rack), the system stops workloads of low-priority service groups so that workloads of high-priority service groups can be started. For this control, the system also defines and manages the resource components of the physical resources consumed by each service group (CPU (Central Processing Unit), memory, and GPU (Graphics Processing Unit)) and the competitive allocation settings for physical resources (for example, allocating four virtual CPUs to one physical CPU actually installed in a physical server).
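The definitions described above (service groups, their priorities, the resource components they consume, and the competitive allocation settings) could be represented, for illustration only, as in the following Python sketch. It is not the data model of the embodiment. The class name `ServiceGroup`, its field names, the `ALLOCATION_RATIO` table, the convention that a smaller number means a higher priority, and the 1:1 ratios for memory and GPU are all assumptions; only the ratio of four virtual CPUs per physical CPU is taken from the example in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceGroup:
    """Hypothetical record for one service group."""
    name: str
    priority: int      # assumption: 1 is the highest priority
    vcpus: int         # virtual CPUs requested by the group's workloads
    memory_gib: int
    gpus: int


# Virtual units that may be allocated per physical unit.
# CPU 4:1 follows the example in the text; the other values are assumed.
ALLOCATION_RATIO = {"cpu": 4.0, "memory": 1.0, "gpu": 1.0}


def physical_demand(group):
    """Physical resources a group consumes once the ratios are applied."""
    return {
        "cpu": group.vcpus / ALLOCATION_RATIO["cpu"],
        "memory": group.memory_gib / ALLOCATION_RATIO["memory"],
        "gpu": group.gpus / ALLOCATION_RATIO["gpu"],
    }


# Hypothetical example group: 16 virtual CPUs map to 4 physical CPUs.
example = ServiceGroup(name="group-high", priority=1,
                       vcpus=16, memory_gib=64, gpus=0)
print(physical_demand(example))
```

Keeping the ratios in one table, separate from the service groups, means a change to a competitive allocation setting alters the computed physical demand of every group without touching the group definitions.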

In particular, the virtualization platform managed by the system of this embodiment can be used in the RIC (RAN Intelligent Controller) used in the RAN (Radio Access Network) of a mobile communications system.

FIG. 1 is a schematic diagram showing an example of the overall configuration of a virtualization platform to which a system according to an embodiment can be applied. In FIG. 1, the virtualization platform 10 comprises racks 110, 120, and 130 as multiple platform component devices. The racks 110, 120, and 130 have physical servers (also referred to as "physical hosts") 111, 121, and 131, respectively, as multiple physical resources. In the example of FIG. 1, the physical servers 111, 121, and 131 contained in the racks 110, 120, and 130 (24 servers in the illustrated example) form a cluster, which is one unit of a group in which workloads such as virtual machines (VMs) or containers can run.

In the example of FIG. 1, the virtualization platform 10 is composed of three racks 110, 120, and 130, but it may instead be composed of two racks or of four or more racks. Likewise, although the racks 110, 120, and 130 each have eight physical servers 111, 121, and 131, each rack may have one to seven physical servers, or nine or more. The number of physical servers may also differ from rack to rack.

The racks 110, 120, and 130 form a cluster in which multiple workloads can run on the virtualization platform 10, and workloads such as virtual machines (VMs) or containers run on the physical servers 111, 121, and 131. Each of the racks 110, 120, and 130 also has other devices such as a switch 112. Each rack may have its own power supply that supplies power to the physical servers 111, 121, or 131 in that rack.

In the virtualization platform 10 configured as described above, suppose that a failure occurs in a single physical server 111' in one rack 110, as shown in FIG. 2 (in that example, the physical server at the bottom of rack 110). The workloads that were running on the failed physical server 111' can then be restarted on other physical servers in the cluster. The cluster as a whole therefore needs surplus resources equivalent to only one physical server.

However, a failure can also occur at rack scale in the virtualization platform 10 configured as described above. For example, as shown in FIG. 3, if a failure occurs in the power supply system of one rack 110, all physical servers 111' in the rack 110 go down. When such a rack-scale failure occurs, the large number of workloads that were running on all the physical servers 111' in the rack 110 have nowhere to move and cannot be started, which affects the services provided. Starting all of those workloads on physical servers in other racks is conceivable, but it would require a very large amount of surplus resources (for example, surplus resources equivalent to one rack across the cluster) and is not realistic. For this reason, rack failures are generally not included among the failure cases considered.

The most likely cause of a rack failure is a power system failure in which the power supply stops. In a typical data center, the power system is very stable thanks to multiple incoming power lines, UPS (uninterruptible power supply) installations, in-house power generation, and the like, so power system failures are often treated as negligible. In recent years, however, the emergence of container-type data centers and similar facilities has made compact data centers practical, and power system failures can no longer be ignored. Even when a power system failure occurs, it is desirable to keep providing services as far as possible (that is, to restart the necessary workloads on the virtualization platform). One way to prepare for such cases is to make workloads redundant, for example with a load balancer, but a single, non-redundant configuration has advantages: it can provide services with minimal resources, and data sharing is simpler.

In this embodiment, the multiple services provided by the cluster in which multiple workloads constructed on the virtualization platform 10 can run are classified into multiple service groups with mutually different priorities. Based on information about the physical servers (physical resources) 111, 121, and 131 of the entire cluster and information about the physical servers (physical resources) 111, 121, and 131 of each of the racks (platform component devices) 110, 120, and 130, the workloads executed in the cluster are placed, and the physical servers (physical resources) 111, 121, and 131 required by each service group are managed, for each of the service groups. This per-service-group workload placement and physical server management makes it possible to continue providing high-priority services on the virtualization platform 10 even if a failure occurs in any of the racks 110, 120, and 130 that make up the virtualization platform 10 and all physical servers in the failed rack stop operating.

FIG. 4 is an explanatory diagram showing an example of the placement of workloads with multiple priorities in the virtualization platform 10 according to the embodiment. In FIG. 4, the workloads running on the physical servers 111(1) to 111(3), 121(1) to 121(3), and 131(1) to 131(3) of the racks 110, 120, and 130 are shown as square blocks 201A to 201C, 202A to 202C, and 203A to 203C. The symbols A, B, and C in the workload blocks indicate the priority of the service group to which each workload belongs. Priority A is the highest priority, priority B is the second highest, and priority C is the lowest. In the example of FIG. 4, each of the physical servers 111(1) to 111(3), 121(1) to 121(3), and 131(1) to 131(3) can run up to four workloads.

In the virtualization platform 10 of FIG. 4, the cluster can host at most 36 workloads, and 30 workloads are currently running. Whichever single physical server fails, the cluster has enough surplus resources to restart that one server's workloads on other physical servers. However, if one rack 110 suffers a rack-scale failure, the free resources of the other racks 120 and 130 do not have the capacity to start all 10 workloads 201A to 201C that were running on the physical servers 111(1) to 111(3) of rack 110.
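The capacity figures of FIG. 4 can be checked with a short calculation. The following Python sketch is only an illustration; the function and variable names are hypothetical, and it assumes, as in FIG. 4, that every workload occupies exactly one of the four slots on a physical server.

```python
SLOTS_PER_SERVER = 4   # workloads one physical server can host (FIG. 4)
SERVERS_PER_RACK = 3   # physical servers shown per rack in FIG. 4
RACK_COUNT = 3


def free_slots_after_failure(total_slots, running, lost_slots, lost_running):
    """Free slots left on the surviving hardware after a failure."""
    return (total_slots - lost_slots) - (running - lost_running)


total_slots = SLOTS_PER_SERVER * SERVERS_PER_RACK * RACK_COUNT  # 36
running = 30

# Worst case for a single server: it was full, so 4 workloads must move.
free_server = free_slots_after_failure(total_slots, running, 4, 4)
server_ok = free_server >= SLOTS_PER_SERVER

# Rack 110 fails: its 12 slots are lost and its 10 workloads must move.
rack_slots = SLOTS_PER_SERVER * SERVERS_PER_RACK
free_rack = free_slots_after_failure(total_slots, running, rack_slots, 10)
shortfall = 10 - free_rack

print(free_server, server_ok, free_rack, shortfall)
```

Under these figures a single-server failure is covered, whereas after the failure of rack 110 the surviving racks 120 and 130 have only 4 free slots for 10 displaced workloads, leaving 6 workloads that cannot be restarted without stopping others.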

FIG. 5 is an explanatory diagram showing an example of the placement of workloads with multiple priorities when a rack-scale failure occurs in the virtualization platform 10 according to the embodiment. In FIG. 5, when a failure occurs in rack 110, the four workloads 201A of the service group with the high priority A can be restarted using the free resources of the other physical servers 121 and 131. However, for the three workloads 201B of the service group with the second-highest (medium) priority B that were running in rack 110, the other physical servers 121 and 131 do not have enough free resources left to restart them.

In this embodiment, therefore, when a failure occurs in any of the racks (platform component devices), the system checks the priority of the service group to which each workload that was running on the failed rack (platform component device) 110 belongs. If it confirms that a workload of a high-priority service group, that is, one whose priority is equal to or higher than a predetermined priority (for example, the "medium" priority), was running on the failed rack 110, the system stops workloads of service groups with a low priority (for example, the "low" priority) running on the other, non-failed racks (platform component devices) 120 and 130. The system then uses the physical resources freed by stopping those workloads on the non-failed racks 120 and 130 to restart the workloads of the high-priority service groups A and B that had been running on the failed rack 110.

In the example of FIG. 5, some of the workloads 202C and 203C of the service group with low priority C that were running on physical servers 121 and 131 in racks 120 and 130 are stopped, and the resources freed by stopping them are used to restart the three workloads 201B of the service group with priority B that were running on rack 110.

Note that, in the example of FIG. 5, depending on the resource status of each rack, all of the workloads of the service group with priority C that were running on physical servers 121 and 131 in racks 120 and 130 may be stopped, and the resources freed by stopping them may be used to restart the workloads of the service group with priority B that were running on rack 110. Also, depending on the resource status of each rack, all of the workloads of the service group with priority C and some or all of the workloads of the service group with priority B that were running on physical servers 121 and 131 in racks 120 and 130 may be stopped, and the resources freed by stopping them may be used to restart the workloads 201A of the service group with priority A, the workloads 201B of the service group with priority B, or both, that were running on rack 110.
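One way to decide how many workloads to stop is a greedy pass from the lowest priority upward, stopping only until the shortfall is covered. The sketch below is an assumption for illustration: the text does not specify a selection algorithm, and the function name `select_victims`, the workload names, and the resource figures are hypothetical.

```python
def select_victims(candidates, shortfall):
    """Pick workloads to stop until the freed resources cover `shortfall`.

    candidates: dicts with "name", "priority" (larger = lower priority)
                and per-component usage such as "cpu" and "memory_gb".
    shortfall:  resources still needed, per component.
    Returns (names of workloads to stop, whether the shortfall is covered).
    """
    remaining = dict(shortfall)
    victims = []
    # Lowest-priority workloads are considered first (sorted() is stable).
    for w in sorted(candidates, key=lambda w: -w["priority"]):
        if all(v <= 0 for v in remaining.values()):
            break
        victims.append(w["name"])
        for component in remaining:
            remaining[component] -= w[component]
    return victims, all(v <= 0 for v in remaining.values())

candidates = [
    {"name": "202B-1", "priority": 1, "cpu": 4, "memory_gb": 8},
    {"name": "202C-1", "priority": 2, "cpu": 4, "memory_gb": 8},
    {"name": "202C-2", "priority": 2, "cpu": 4, "memory_gb": 8},
    {"name": "203C-1", "priority": 2, "cpu": 4, "memory_gb": 8},
]
victims, covered = select_victims(candidates, {"cpu": 6, "memory_gb": 12})
print(victims, covered)
```

Here only part of the priority C workloads is stopped, which corresponds to the partial-stop case of FIG. 5; a larger shortfall would extend the selection to all priority C workloads and then to priority B.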

As described above, in this embodiment, accepting that services of low-priority service groups may be given up as a countermeasure against rack failures makes it possible to reduce the surplus resources in the virtualization infrastructure 10.

FIG. 6 is an explanatory diagram showing an example of the configuration of the management system 30 that manages the virtualization infrastructure 10 according to the embodiment. In FIG. 6, the management system 30 includes an information storage unit (DB) 310, an information processing unit 320, and a control unit 330. The management system 30 may further include a communication unit 340. The management system 30 may be installed on a server or cloud computer system separate from the virtualization infrastructure 10, or may be installed as a management server on a physical server in the virtualization infrastructure 10.

The information storage unit (DB) 310 stores information on the physical servers (physical resources) 111, 121, and 131 of the entire cluster, constructed on the virtualization infrastructure 10, in which multiple workloads can operate, and information on the physical servers (physical resources) 111, 121, and 131 of each of the racks (infrastructure component devices) 110, 120, and 130. The information storage unit (DB) 310 also stores the various tables for the virtualization infrastructure 10 configured as above, such as the service group table of Table 1, the priority-based resource table of Table 2, the rack resource table of Table 3, the cluster resource table of Table 4, and the condition table of Table 5.

The example condition table of Table 5 expresses the following conditions: up to four virtual CPUs may be allocated to each physical CPU installed in a physical server; memory may be used up to the limit of the physical memory installed in the physical server; and up to two virtual GPUs may be allocated to each physical GPU installed in the physical server.
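The effect of these conditions on usable capacity can be shown with a short calculation. The ratios below come from the condition table example; the dictionary keys, the function name `allocatable`, and the physical figures are illustrative assumptions.

```python
# Ratios from the condition table example: virtual units allowed per physical unit.
OVERCOMMIT = {"cpu": 4, "memory_gb": 1, "gpu": 2}

def allocatable(physical):
    """Scale each physical resource component by its allocation ratio."""
    return {component: physical[component] * ratio
            for component, ratio in OVERCOMMIT.items()}

# Illustrative physical capacity: 16 CPUs, 256 GB of memory, 8 GPUs.
physical = {"cpu": 16, "memory_gb": 256, "gpu": 8}
print(allocatable(physical))  # {'cpu': 64, 'memory_gb': 256, 'gpu': 16}
```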

The information processing unit 320 classifies the multiple services provided by the cluster, constructed on the virtualization infrastructure 10, in which multiple workloads can operate, into multiple service groups with mutually different priorities. In addition, based on the information on the physical servers (physical resources) 111, 121, and 131 of the entire cluster and the information on the physical servers (physical resources) 111, 121, and 131 of each of the racks (infrastructure component devices) 110, 120, and 130, the information processing unit 320 manages, for each of the service groups, the placement of the workloads executed in the cluster and the physical servers (physical resources) required by the service group.

As described above, when a failure occurs in any of the racks (infrastructure component devices) 110, 120, and 130, the control unit 330 checks the priority of the service group to which each of the workloads that were running on the failed rack (infrastructure component device) 110 belongs. If the control unit 330 confirms that a workload of a service group with a priority equal to or higher than a predetermined priority (for example, "medium") was running on the failed rack 110, it stops the workloads of service groups with a low priority (for example, "low") running on the other, non-failed racks (infrastructure component devices) 120 and 130. The control unit 330 then performs control so that the physical resources freed on racks 120 and 130 by stopping those workloads are used to restart the workloads of the high-priority service groups A and B that were running on the failed rack 110.

The communication unit 340 communicates with the terminal device of a user (or operator, etc.) 40 via a communication line or the like, and provides the user with impact information for the case where a failure occurs in rack (infrastructure component device) 110, which has the largest physical resources among the racks (infrastructure component devices) 110, 120, and 130.

FIG. 7 is a flowchart showing an example of generating the resource-related tables of the virtual environment in the virtualization infrastructure 10 according to the embodiment. In FIG. 7, the information processing unit 320 generates a workload table describing each workload when the workload is deployed (that is, built or created) in the virtualization infrastructure 10 (S101). Next, the information processing unit 320 aggregates the resource components (CPU, memory, GPU) for each service group, the service groups having different priorities, and creates the service group table of Table 1 (S102). Next, the information processing unit 320 aggregates the resource components (CPU, memory, GPU) of the service group table of Table 1 for each priority and creates the priority-based resource table of Table 2 (S103).
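Steps S101 to S103 amount to two successive group-by aggregations. The following is a minimal sketch, assuming each table is a list of dictionaries; the column names, the helper `aggregate`, and the sample rows are hypothetical and are not taken from the tables of the specification.

```python
from collections import defaultdict

COMPONENTS = ("cpu", "memory_gb", "gpu")

def aggregate(rows, key):
    """Sum every resource component over `rows`, grouped by the field `key`."""
    totals = defaultdict(lambda: dict.fromkeys(COMPONENTS, 0))
    for row in rows:
        for c in COMPONENTS:
            totals[row[key]][c] += row[c]
    return {k: dict(v) for k, v in totals.items()}

def build_service_group_table(workload_table):
    """S102: per-service-group totals, keeping each group's priority."""
    totals = aggregate(workload_table, "service_group")
    priority_of = {w["service_group"]: w["priority"] for w in workload_table}
    return [{"service_group": sg, "priority": priority_of[sg], **t}
            for sg, t in totals.items()]

def build_priority_table(service_group_table):
    """S103: per-priority totals computed from the service group table."""
    return aggregate(service_group_table, "priority")

# S101: workload table recorded at deploy time (illustrative rows).
workload_table = [
    {"workload": "w1", "service_group": "sg1", "priority": "high", "cpu": 4, "memory_gb": 8, "gpu": 0},
    {"workload": "w2", "service_group": "sg1", "priority": "high", "cpu": 2, "memory_gb": 4, "gpu": 1},
    {"workload": "w3", "service_group": "sg2", "priority": "high", "cpu": 8, "memory_gb": 16, "gpu": 0},
    {"workload": "w4", "service_group": "sg3", "priority": "low", "cpu": 1, "memory_gb": 2, "gpu": 0},
]
service_group_table = build_service_group_table(workload_table)
priority_table = build_priority_table(service_group_table)
print(priority_table)
```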

FIG. 8 is a flowchart showing an example of generating the physical resource-related tables in the virtualization infrastructure according to the embodiment. In FIG. 8, the information processing unit 320 generates a physical server resource table from information on the physical servers of each rack constituting the cluster in the virtualization infrastructure 10 (S201). Next, the information processing unit 320 aggregates the resource components (CPU, memory, GPU) in the physical server resource table and creates the cluster resource table of Table 4 (S202). Next, the information processing unit 320 creates the rack resource table of Table 3 from the physical server resource table (S203).
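The physical-side tables of S201 to S203 can be derived in the same way. This sketch assumes the physical server resource table is a list of dictionaries with a `rack` field; the field names, function names, and capacity figures are illustrative assumptions.

```python
COMPONENTS = ("cpu", "memory_gb", "gpu")

def total(rows):
    """Sum each resource component over the given server rows."""
    return {c: sum(r[c] for r in rows) for c in COMPONENTS}

def build_cluster_table(server_table):
    """S202: totals over every physical server in the cluster."""
    return total(server_table)

def build_rack_table(server_table):
    """S203: totals per rack (infrastructure component device)."""
    racks = sorted({r["rack"] for r in server_table})
    return {rack: total([r for r in server_table if r["rack"] == rack])
            for rack in racks}

# S201: physical server resource table (illustrative rows).
server_table = [
    {"server": "111(1)", "rack": 110, "cpu": 8, "memory_gb": 128, "gpu": 4},
    {"server": "111(2)", "rack": 110, "cpu": 8, "memory_gb": 128, "gpu": 4},
    {"server": "121(1)", "rack": 120, "cpu": 8, "memory_gb": 128, "gpu": 4},
    {"server": "131(1)", "rack": 130, "cpu": 8, "memory_gb": 128, "gpu": 4},
]
cluster_table = build_cluster_table(server_table)
rack_table = build_rack_table(server_table)
print(cluster_table)
print(rack_table)
```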

FIG. 9 is a flowchart showing an example of calculating the priority of the service groups whose workloads are restarted when a rack-scale failure occurs in the virtualization infrastructure 10 according to the embodiment. In FIG. 9, based on the information on the physical servers 111, 121, and 131 of each of the racks 110, 120, and 130, the information processing unit 320 calculates, for each resource component (CPU, memory, GPU), the physical resources that remain usable if rack 110, which has the largest resource components (CPU, memory, GPU), goes down, that is, if a failure occurs in rack 110 (S301). Next, the information processing unit 320 calculates and determines how far down the priorities can be covered, in descending order of priority, that is, the predetermined priority (S302). This determination is based on the calculated usable physical resources of racks 120 and 130 for each resource component (the rack resource table of Table 7), the per-component physical resource information for each of the priorities (high, medium, low) (the priority-based resource table of Table 6), and the contention (overcommit) allocation setting conditions for each resource component in Table 5.
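Steps S301 and S302 can be sketched as below. The way the "largest" rack is chosen (comparing CPU, then memory, then GPU), the function names, and the per-priority demand figures are assumptions made for illustration; demand is expressed in virtual (post-ratio) units.

```python
COMPONENTS = ("cpu", "memory_gb", "gpu")
PRIORITIES = ("high", "medium", "low")  # descending order of priority

def surviving_capacity(rack_table, overcommit):
    """S301: capacity left if the largest rack fails, after applying ratios."""
    # Assumption: racks are ranked by their (cpu, memory_gb, gpu) tuple.
    largest = max(rack_table,
                  key=lambda r: tuple(rack_table[r][c] for c in COMPONENTS))
    capacity = {
        c: overcommit[c] * sum(t[c] for r, t in rack_table.items() if r != largest)
        for c in COMPONENTS
    }
    return largest, capacity

def coverable_priorities(capacity, demand_by_priority):
    """S302: walk from high to low while the cumulative demand still fits."""
    used = dict.fromkeys(COMPONENTS, 0)
    covered = []
    for p in PRIORITIES:
        need = demand_by_priority.get(p, dict.fromkeys(COMPONENTS, 0))
        if any(used[c] + need[c] > capacity[c] for c in COMPONENTS):
            break
        for c in COMPONENTS:
            used[c] += need[c]
        covered.append(p)
    return covered

rack_table = {
    110: {"cpu": 16, "memory_gb": 256, "gpu": 8},
    120: {"cpu": 8, "memory_gb": 128, "gpu": 4},
    130: {"cpu": 8, "memory_gb": 128, "gpu": 4},
}
overcommit = {"cpu": 4, "memory_gb": 1, "gpu": 2}
demand = {  # illustrative per-priority demand
    "high": {"cpu": 40, "memory_gb": 128, "gpu": 8},
    "medium": {"cpu": 20, "memory_gb": 64, "gpu": 4},
    "low": {"cpu": 16, "memory_gb": 64, "gpu": 4},
}
largest, capacity = surviving_capacity(rack_table, overcommit)
covered = coverable_priorities(capacity, demand)
print(largest, capacity, covered)
```

With these figures the last covered priority is "medium", which plays the role of the predetermined priority used at failover time.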

Referring to Table 7, if rack 110 (rack ID = 1) goes down, the resources usable across the entire virtualization infrastructure 10 become 16 CPUs (64 cores), 256 GB of memory, and 8 GPUs (16 cores). Furthermore, referring to the row in Table 7 that factors in the conditions, calculated taking into account the contention allocation setting conditions for each resource component (CPU, memory, GPU) in Table 5,

Next, the information processing unit 320 transmits impact information, describing the impact of rack 110 with the largest resource components (CPU, memory, GPU) going down, that is, of a failure occurring in rack 110, to the terminal device of the user (or operator, etc.) 40 via the communication unit 340 (S303).

In the virtualization infrastructure 10 of this embodiment, under normal conditions, when racks 110, 120, and 130 are all operating normally, the workloads of the service groups of all priorities (high, medium, low) can run.

In this embodiment, the pre-calculation performed in preparation for a rack failure in which one of the racks goes down determines that a failure of rack 110 would have the greatest impact, because rack 110 has the largest resource components (CPU, memory, GPU) of the three racks 110, 120, and 130. If rack 110 goes down, the high- and medium-priority service groups account for 60 CPU cores, and the workloads of the low-priority service group cannot be covered.

Because the pre-calculation above shows that the workloads of the low-priority service group cannot be covered, when a rack failure occurs in which rack 110 goes down, the workloads of the low-priority service group are stopped on physical servers 121 and 131 of the other racks 120 and 130, and the resources freed on physical servers 121 and 131 are used to restart the workloads of the high- and medium-priority service groups that were running on rack 110 before the failure.

Note that, in this embodiment, all or part of the generation of the virtual environment resource-related tables in FIG. 7, the generation of the physical resource-related tables in FIG. 8, and the calculation of the service group priority in FIG. 9 may be performed in advance, before the virtualization infrastructure 10 goes into actual operation. They may also be recalculated when the number or configuration of racks in the virtualization infrastructure 10 changes, when the workloads run on the virtualization infrastructure 10, such as virtual machines (VMs) or containers, change, or when the services provided by the virtualization infrastructure 10 change.

As described above, according to this embodiment, even if a failure occurs in one of the racks 110, 120, and 130 constituting the virtualization infrastructure 10 and all of the physical servers 111 in the failed rack stop operating, high-priority services can continue to be provided on the virtualization infrastructure 10.

Furthermore, according to this embodiment, surplus resources of the physical servers in the racks of the virtualization infrastructure 10 can be kept down.

Note that this embodiment has mainly described the case where the infrastructure component devices constituting the virtualization infrastructure 10 are racks 110, 120, and 130, each having multiple physical servers and a power supply, and the physical resources are physical servers 111, 121, and 131. However, the types of infrastructure component devices and physical resources constituting the virtualization infrastructure 10 are not limited to these. For example, each of the infrastructure component devices constituting the virtualization infrastructure 10 may be a container-type data center having multiple physical servers and a power supply, and the physical resources may be racks each having multiple physical servers. In this case, even if a failure occurs in one of the container-type data centers constituting the virtualization infrastructure 10 and all of the physical servers in the failed container-type data center stop operating, high-priority services can continue to be provided on the virtualization infrastructure 10.

Furthermore, according to this embodiment, surplus resources within the container-type data centers of the virtualization infrastructure 10 can be kept down.

The present invention keeps down surplus resources in the virtualization infrastructure 10 while allowing high-priority services to continue to be provided on the virtualization infrastructure 10 even if a failure occurs in one of the racks 110, 120, and 130 constituting it. It can therefore contribute to achieving Goal 9 of the Sustainable Development Goals (SDGs), "Industry, Innovation and Infrastructure."

Note that the processing steps described in this specification and the components of the management system (storage unit, information processing unit, control unit, communication unit, etc.) can be implemented by various means. For example, these steps and components may be implemented in hardware, firmware, software, or a combination thereof.

For a hardware implementation, the means, such as processing units, used to realize the above steps and components in an entity (for example, a physical server, rack, container-type data center, hard disk drive device, or optical disk drive device) may be implemented in one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, electronic devices, other electronic units designed to perform the functions described in this specification, computers, or combinations thereof.

For a firmware and/or software implementation, the means, such as processing units, used to realize the above components may be implemented with programs (for example, code such as procedures, functions, modules, and instructions) that perform the functions described in this specification. In general, any computer- or processor-readable medium that tangibly embodies firmware and/or software code may be used to implement the means, such as processing units, used to realize the steps and components described in this specification. For example, the firmware and/or software code may be stored in a memory, for example in a control device, and executed by a computer or processor. The memory may be implemented inside the computer or processor, or outside the processor. The firmware and/or software code may also be stored on a computer- or processor-readable medium such as random access memory (RAM), read-only memory (ROM), non-volatile random access memory (NVRAM), programmable read-only memory (PROM), electrically erasable PROM (EEPROM), flash memory, a floppy (registered trademark) disk, a compact disc (CD), a digital versatile disc (DVD), or a magnetic or optical data storage device. The code may be executed by one or more computers or processors, and may cause the computers or processors to perform certain aspects of the functionality described in this specification.

The medium may be a non-transitory recording medium. The program code need only be readable and executable by a computer, processor, or other device or machine, and is not limited to any specific format. For example, the program code may be source code, object code, or binary code, or a mixture of two or more of these.

The description of the embodiments disclosed in this specification is provided to enable a person skilled in the art to make or use the present disclosure. Various modifications to the present disclosure will be readily apparent to those skilled in the art, and the general principles defined in this specification may be applied to other variations without departing from the spirit or scope of the present disclosure. The present disclosure is therefore not limited to the examples and designs described in this specification, and is to be accorded the widest scope consistent with the principles and novel features disclosed in this specification.

10: Virtualization infrastructure
30: Management system
110: Rack
111: Physical server
111(1) to 111(3): Physical servers
111': Physical server
112: Switch
120: Rack
121: Physical server
121(1) to 121(3): Physical servers
130: Rack
131: Physical server
131(1) to 131(3): Physical servers
201A to 201C: Workloads
202A to 202C: Workloads
203A to 203C: Workloads
320: Information processing unit
330: Control unit
340: Communication unit

Claims (10)

A system for managing a virtualization infrastructure comprising a plurality of infrastructure component devices each having a plurality of physical resources, the system comprising:
an information storage unit that stores information on the physical resources of an entire cluster, constructed on the virtualization infrastructure, in which a plurality of workloads can operate, and information on the physical resources of each of the plurality of infrastructure component devices; and
an information processing unit that classifies a plurality of services provided by the cluster into a plurality of service groups having mutually different priorities and, based on the information on the physical resources of the entire cluster and the information on the physical resources of each of the plurality of infrastructure component devices, manages, for each of the plurality of service groups, the placement of the plurality of workloads executed in the cluster and the physical resources required by the service group.
The system of claim 1, further comprising a control unit that, when a failure occurs in any of the plurality of infrastructure component devices, checks the priority of the service group to which each of the plurality of workloads that were running on the failed infrastructure component device belongs and, upon confirming that a workload of a high-priority service group having a priority equal to or higher than a predetermined priority was running on the failed infrastructure component device, performs control to stop workloads of low-priority service groups running on the infrastructure component devices in which no failure has occurred, and to restart the workload of the high-priority service group that was running on the failed infrastructure component device using the physical resources freed by stopping those workloads.
The system of claim 2, wherein the information processing unit:
calculates, for each resource component and based on the information on the physical resources of each of the plurality of infrastructure component devices, the physical resources that remain usable if a failure occurs in the infrastructure component device having the largest physical resources among the plurality of infrastructure component devices; and
determines the predetermined priority based on the calculated usable physical resources for each resource component, per-component information on the physical resources corresponding to each of the plurality of priorities, and a contention allocation setting condition for each resource component.
The system of claim 3, wherein the information processing unit provides a user with impact information for the case where a failure occurs in the infrastructure component device having the largest physical resources among the plurality of infrastructure component devices.
The system of any one of claims 1 to 4, wherein:
the physical resource is a physical server; and
the infrastructure component device is a rack having a plurality of physical servers and a power supply.
The system of any one of claims 1 to 4, wherein:
the physical resource is a rack having a plurality of physical servers; and
the infrastructure component device is a container-type data center having a plurality of racks and a power supply.
A method for managing a virtualization infrastructure comprising a plurality of infrastructure component devices each having a plurality of physical resources, the method comprising:
storing information on the physical resources of an entire cluster, constructed on the virtualization infrastructure, in which a plurality of workloads can operate, and information on the physical resources of each of the plurality of infrastructure component devices;
classifying a plurality of services provided by the cluster into a plurality of service groups having mutually different priorities; and
managing, for each of the plurality of service groups and based on the information on the physical resources of the entire cluster and the information on the physical resources of each of the plurality of infrastructure component devices, the placement of the plurality of workloads executed in the cluster and the physical resources required by the service group.
The method of claim 7, further comprising:
when a failure occurs in any of the plurality of infrastructure component devices, checking the priority of the service group to which each of the plurality of workloads that were running on the failed infrastructure component device belongs;
upon confirming that a workload of a high-priority service group having a priority equal to or higher than a predetermined priority was running on the failed infrastructure component device, stopping workloads of low-priority service groups running on the infrastructure component devices in which no failure has occurred; and
restarting the workload of the high-priority service group that was running on the failed infrastructure component device, using the physical resources freed by stopping the workloads on the infrastructure component devices in which no failure has occurred.
 複数の物理リソースをそれぞれ有する複数の基盤構成装置を備える仮想化基盤を管理するシステムに設けられたコンピュータ又はプロセッサで実行されるプログラムであって、
 前記仮想化基盤上に構築された複数のワークロードが動作可能なクラスタの全体の物理リソースの情報と、前記複数の基盤構成装置のそれぞれの物理リソースの情報と、を記憶するためのプログラムコードと、
 前記仮想化基盤上に構築された複数のワークロードが動作可能なクラスタで提供する複数のサービスを、優先度が互いに異なる複数のサービスグループに分類するためのプログラムコードと、
 前記クラスタの全体の物理リソースの情報と前記複数の基盤構成装置のそれぞれの物理リソースの情報とに基づいて、前記複数のサービスグループのそれぞれについて、前記クラスタで実行される複数のワークロードの配置と前記サービスグループで必要な物理リソースの管理を行うためのプログラムコードと、
を含む、ことを特徴とするプログラム。
9. A program executed by a computer or processor provided in a system that manages a virtualization infrastructure including a plurality of infrastructure component devices each having a plurality of physical resources, the program comprising:
program code for storing information on the overall physical resources of a cluster that is constructed on the virtualization infrastructure and on which a plurality of workloads can operate, and information on the physical resources of each of the plurality of infrastructure component devices;
program code for classifying a plurality of services provided by the cluster, which is constructed on the virtualization infrastructure and on which a plurality of workloads can operate, into a plurality of service groups having mutually different priorities; and
program code for managing, for each of the plurality of service groups, the placement of the plurality of workloads executed in the cluster and the physical resources required by the service group, based on the information on the overall physical resources of the cluster and the information on the physical resources of each of the plurality of infrastructure component devices.
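The three pieces of program code in the program claim above (storing resource information, classifying services into prioritized service groups, and managing placement per service group) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the application: the `Device` and `Workload` classes, the `GROUP_PRIORITY` table, the use of CPU cores as the only physical resource, and the "most free capacity first" placement rule.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """One infrastructure component device (hypothetical model, CPU cores only)."""
    name: str
    cpu_total: int
    workloads: list = field(default_factory=list)

    @property
    def cpu_free(self) -> int:
        # Free capacity is derived from the stored per-device resource information.
        return self.cpu_total - sum(w.cpu for w in self.workloads)


@dataclass
class Workload:
    name: str
    group: str  # service group the workload belongs to
    cpu: int    # physical resources the workload needs


# Assumed classification: a smaller number means a higher priority.
GROUP_PRIORITY = {"core": 0, "standard": 1, "batch": 2}


def place(workloads, devices):
    """Place workloads service group by service group, highest priority first.

    Returns the workloads that could not be placed anywhere.
    """
    unplaced = []
    for w in sorted(workloads, key=lambda w: GROUP_PRIORITY[w.group]):
        # Only devices with enough free CPU are candidates.
        candidates = [d for d in devices if d.cpu_free >= w.cpu]
        if not candidates:
            unplaced.append(w)
            continue
        # Assumed rule: choose the candidate with the most free CPU.
        max(candidates, key=lambda d: d.cpu_free).workloads.append(w)
    return unplaced


# Usage: a two-device cluster with 16 cores in total.
devices = [Device("node-a", 8), Device("node-b", 8)]
workloads = [
    Workload("report", "batch", 6),
    Workload("auth", "core", 6),
    Workload("web", "standard", 6),
]
unplaced = place(workloads, devices)
```

In this example the cluster cannot host all three workloads, so the workload of the lowest-priority group is the one left unplaced. A real implementation would track more resource types than CPU and would likely use a different scoring rule.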
 請求項9のプログラムにおいて、
 前記複数の基盤構成装置のいずれかで障害が発生したとき、前記障害が発生した基盤構成装置で起動していた複数のワークロードのそれぞれが属するサービスグループの優先度を確認するためのプログラムコードと、
 前記障害が発生した基盤構成装置で所定優先度以上の優先度が高いサービスグループのワークロードが起動していたことを確認した場合、前記障害が発生していない基盤構成装置で起動している低い優先度のサービスグループのワークロードを停止するためのプログラムコードと、
 前記障害が発生していない基盤構成装置における前記ワークロードの停止で空いた物理リソースで、前記障害が発生した基盤構成装置で起動していた前記優先度が高いサービスグループのワークロードを起動し直すためのプログラムコードと、
を含む、ことを特徴とするプログラム。
10. The program of claim 9, further comprising:
program code for checking, when a failure occurs in any of the plurality of infrastructure component devices, the priority of the service group to which each of the plurality of workloads that were running on the failed infrastructure component device belongs;
program code for stopping a workload of a lower-priority service group running on an infrastructure component device in which no failure has occurred, when it is confirmed that a workload of a high-priority service group, whose priority is equal to or higher than a predetermined priority, was running on the failed infrastructure component device; and
program code for restarting the workload of the high-priority service group that was running on the failed infrastructure component device, using the physical resources freed by stopping the workload on the infrastructure component device in which no failure has occurred.
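The failure-handling steps claimed above (check priorities on the failed device, stop lower-priority workloads on a healthy device, restart the high-priority workloads on the freed resources) can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the `Device` and `Workload` classes, the `THRESHOLD` value standing in for the "predetermined priority", the CPU-only resource model, and the choice to stop the lowest-priority workloads first are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Workload:
    name: str
    priority: int  # priority of the workload's service group; smaller = higher (assumed)
    cpu: int


@dataclass
class Device:
    name: str
    cpu_total: int
    workloads: list = field(default_factory=list)

    @property
    def cpu_free(self) -> int:
        return self.cpu_total - sum(w.cpu for w in self.workloads)


THRESHOLD = 0  # hypothetical "predetermined priority"; values <= THRESHOLD count as high priority


def fail_over(failed, healthy):
    """Restart high-priority workloads from `failed` on one of the `healthy` devices.

    Returns (restarted, stopped) as lists of workload names.
    """
    restarted, stopped = [], []
    # Step 1: check the priority of every workload that was running on the failed device.
    protected = [w for w in failed.workloads if w.priority <= THRESHOLD]
    for w in protected:
        for d in healthy:
            # Workloads that may be stopped, lowest priority first (assumed order).
            victims = sorted((v for v in d.workloads if v.priority > THRESHOLD),
                             key=lambda v: v.priority, reverse=True)
            if d.cpu_free + sum(v.cpu for v in victims) < w.cpu:
                continue  # this device cannot host `w` even after stopping victims
            # Step 2: stop lower-priority workloads until enough resources are free.
            while d.cpu_free < w.cpu:
                v = victims.pop(0)
                d.workloads.remove(v)
                stopped.append(v.name)
            # Step 3: restart the high-priority workload on the freed resources.
            d.workloads.append(w)
            restarted.append(w.name)
            break
    failed.workloads.clear()  # nothing runs on the failed device any more
    return restarted, stopped


# Usage: node-a fails while hosting one high-priority and one low-priority workload.
failed = Device("node-a", 8, [Workload("auth", 0, 4), Workload("report", 2, 2)])
healthy = [Device("node-b", 8, [Workload("web", 1, 3), Workload("batch", 2, 4)])]
restarted, stopped = fail_over(failed, healthy)
```

Here only as many lower-priority workloads are stopped as are needed to fit the high-priority workload, which keeps the disruption on the healthy device small. The claims themselves do not specify which lower-priority workload is stopped first.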
PCT/JP2024/006379 2024-02-21 2024-02-21 System, method and program for managing virtualization infrastructure Pending WO2025177500A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2024/006379 WO2025177500A1 (en) 2024-02-21 2024-02-21 System, method and program for managing virtualization infrastructure

Publications (1)

Publication Number Publication Date
WO2025177500A1 (en) 2025-08-28

Family

ID=96846614

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2024/006379 Pending WO2025177500A1 (en) 2024-02-21 2024-02-21 System, method and program for managing virtualization infrastructure

Country Status (1)

Country Link
WO (1) WO2025177500A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10121026B1 (en) * 2015-12-31 2018-11-06 Amazon Technologies, Inc. Secure enclosure systems in a provider network
JP2019087033A (en) * 2017-11-07 2019-06-06 富士通株式会社 Information processing device, information processing system, and program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
データセンター選択のチェックポイント データセンターサービスガイド, データセンター完全ガイド 2011年春号, 01 April 2011, pp. 030-031, non-official translation (Checkpoints for selecting a data center -Data center service guide, Complete guide to data centers – Spring 2011 edition) *

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 24925833

Country of ref document: EP

Kind code of ref document: A1