
WO2016075771A1 - Computer system and autoscaling method for computer system - Google Patents

Computer system and autoscaling method for computer system

Info

Publication number
WO2016075771A1
Authority
WO
WIPO (PCT)
Prior art keywords
virtual environment
computer system
computer
virtual
addition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2014/079936
Other languages
French (fr)
Japanese (ja)
Inventor
健 寺村
謙太 山崎
陽子 平島
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Ltd
Original Assignee
Hitachi Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Ltd filed Critical Hitachi Ltd
Priority to PCT/JP2014/079936 priority Critical patent/WO2016075771A1/en
Publication of WO2016075771A1 publication Critical patent/WO2016075771A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • G — PHYSICS
    • G06 — COMPUTING OR CALCULATING; COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 — Arrangements for program control, e.g. control units
    • G06F9/06 — Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 — Multiprogramming arrangements
    • G06F9/50 — Allocation of resources, e.g. of the central processing unit [CPU]

Definitions

  • the present invention relates to a computer system that changes allocation of computer resources according to a load, and an autoscaling method in the computer system.
  • the effective resource addition method may differ depending on the state of the application in the computer system. For example, in a system consisting of one Web server and one DB server connected to each other, if the allowable number of simultaneous connections at the DB server is 100 and the number of simultaneous connections to the DB server configured at the Web server is 100, then even if one Web server is added, the total number of simultaneous connections to the DB server exceeds the allowable number at the DB server, so the service performance may deteriorate.
  • a first object of the present invention is to provide a technology relating to autoscaling for appropriately performing dynamic allocation of computer resources and maintaining the service level of business even when load fluctuations are difficult to predict.
  • a second object of the present invention is to provide a technology relating to autoscaling for appropriately allocating computer resources according to the state of an application in the computer system and maintaining the service level of business.
  • a computer system according to the present invention includes at least a physical server having computer resources including a processor, a storage device, and a communication interface; a virtualization device that virtualizes the computer resources and allocates them to a plurality of virtual environments; and a management device that manages the plurality of virtual environments. The management device has a monitoring unit that monitors the loads of the plurality of virtual environments, a control unit that determines whether a virtual environment needs to be added according to the monitored loads, and a configuration change unit that, in accordance with the determination of the control unit, instructs the virtualization device to add a virtual environment, releases the use restrictions on the computer resources constituting the existing virtual environments, and re-sets the released use restrictions on the existing virtual environments after the addition of the virtual environment is complete.
  • FIG. 1 is a block diagram showing the configuration of a computer system according to Example 1 of the present invention. FIG. 2 is a diagram showing the logical connection relationship and roles of virtual servers. FIG. 3 is a diagram showing system construction request information. FIG. 4 is a diagram showing system configuration information. FIG. 5 is a diagram showing physical server performance information. FIG. 6 is a diagram showing service performance information. FIG. 7 is a diagram showing system performance model information. FIG. 8 is a diagram showing scaling rule information. FIG. 9 is a diagram showing a monitoring flowchart according to Example 1 of the present invention. FIG. 10 is a diagram showing a scale-out flowchart according to Example 1 of the present invention. FIG. 11 is a diagram showing connection information. FIG. 12 is a diagram showing a parameter determination flowchart according to Example 2 of the present invention.
  • Example 1 and Example 2 will be described with reference to the drawings.
  • FIG. 1 is a block diagram illustrating a configuration of a computer system according to the first embodiment of the present invention.
  • the computer system includes a management unit 11, a business unit 13, and a network 16.
  • the management unit 11 includes a control unit 18 and a management information group 19.
  • the control unit 18 includes a system monitoring unit 181, a system configuration calculation unit 182, a system configuration change unit 183, and a management software control unit 184.
  • the management information group 19 includes system construction request information 191, system configuration information 192, physical server performance information 194, service performance information 195, system performance model information 196, and scaling rule information 197.
  • the management unit 11 is executed by the physical server 12.
  • the physical server 12 includes a CPU 121, a storage device 122, and a communication IF 123. Each function is realized by the CPU 121 executing the control unit 18 stored in the storage device 122.
  • the storage device 122 also stores the management information group 19.
  • the physical server 12 is connected to the network 16 via the communication IF 123.
  • one or more business units 13 exist in the computer system, and each business unit includes one or more virtual servers (hereinafter sometimes referred to as “VM”).
  • FIG. 1 illustrates an example in which three virtual servers of the virtual server 131, the virtual server 132, and the virtual server 133 operate as one business unit.
  • the virtual server plays various roles depending on the work executed in the computer system.
  • for example, the virtual server 131 operates as an LB (load balancer), the virtual server 132 operates as a Web server, and the virtual server 133 operates as a DB server.
  • the hypervisor 15 virtualizes the computer resources of the physical server 14 and plays a role of providing a virtual server in the business unit 13.
  • the physical server 14 includes a CPU 141, a storage device 142, and a communication IF 143.
  • the business unit 13 is stored in the storage device 142 and executed by the CPU 141.
  • the business unit 13 is connected to the network 16 via the communication IF 143.
  • FIG. 2 is an explanatory diagram illustrating an example of a logical connection relationship and roles of virtual servers in the business unit 13.
  • the LB 21 operates as a load balancer and distributes the received request message to any one of the Web 22, the Web 23, and the Web 24.
  • Web22, Web23, and Web24 are all Web servers.
  • the DB 25 is a DB server and accepts access from the Web 22, Web 23, and Web 24.
  • the Web 22 to Web 24 are deployed as the initial number of Web servers (initial number) that receive the request messages distributed from the LB 21, and are increased or decreased as necessary (the dotted-line portion in FIG. 2 indicates an added server).
  • FIG. 3 is an explanatory diagram showing an example of the system construction request information 191.
  • the system construction request information 191 includes a role 302 indicating the role of each virtual server (VM) deployed in the business unit 13, a CPU core allocation number 303 indicating the number of CPU cores allocated for each role, a one-core frequency 304 indicating the CPU operating frequency per core, and an initial number 305 indicating the number of units initially deployed for each role. These values are set by the administrator.
  • the system configuration changing unit 183 deploys the VM in the business unit 13 in accordance with the system construction request information 191 before step S1001 (FIG. 9) of the monitoring flowchart described later starts.
  • if the role of the VM is a capping target (capping target 902 in FIG. 8), the CPU core allocation number and the capped frequency are calculated and set according to the following formulas.
  • capping means limiting the usage rate (usage amount) of computer resources by setting an upper limit.
  • CPU core allocation number to be set = CPU core allocation number 303 / (1 − capping ratio 903)
  • Capped frequency to be set = CPU core allocation number to be set × one-core frequency 304 × capping ratio 903
  • if the role of the VM is not a capping target, the CPU core allocation number and the capped frequency are set as follows.
  • CPU core allocation number to be set = CPU core allocation number 303
  • Capped frequency to be set = 0
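As an illustrative sketch (not part of the patent text; the function and variable names are hypothetical, with numeric suffixes mirroring the reference numerals 303, 304, and 903), the deployment-time calculation above can be expressed as follows:

```python
def deployment_params(core_alloc_303, one_core_freq_304, capping_ratio_903):
    """Compute the CPU core allocation and capped frequency set at deployment.

    For a capping-target role, extra cores are allocated so that the
    uncapped portion still provides the requested capacity:
        set_cores   = core_alloc / (1 - capping_ratio)
        capped_freq = set_cores * one_core_freq * capping_ratio
    A non-target role simply gets core_alloc cores and a capped
    frequency of 0 (capping_ratio_903 = 0 reproduces that case).
    """
    set_cores = core_alloc_303 / (1 - capping_ratio_903)
    capped_freq = set_cores * one_core_freq_304 * capping_ratio_903
    return set_cores, capped_freq

# 4 requested cores at 3000 MHz per core with a 50% capping ratio:
# 8 cores are allocated and 12000 MHz of them is capped, leaving
# 4 cores' worth (12000 MHz) usable until the capping is released.
print(deployment_params(4, 3000, 0.5))  # (8.0, 12000.0)
```

Releasing the 12000 MHz cap later (step S1102) instantly increases the usable capacity without waiting for a new VM to boot, which is the point of the over-allocation.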
  • FIG. 4 is an explanatory diagram showing an example of the system configuration information 192.
  • the system configuration information 192 includes a server name 401 that is an identifier for identifying each VM, a role 402 indicating the role of the VM, a CPU core allocation number 403 indicating the number of CPU cores allocated to the VM, a one-core frequency 404 indicating the CPU operating frequency per core, a capped frequency 405 indicating the capped CPU operating frequency, and an operation destination server 407 that is an identifier of the physical server on which the VM operates.
  • the system configuration information 192 is set by the control unit 18 in accordance with the contents of the system configuration change.
  • FIG. 5 is an explanatory diagram showing an example of physical server performance information.
  • the physical server performance information 194 includes a date and time 601 indicating the date and time when performance was monitored, a server name 602 that is an identifier of the monitored physical server, a CPU usage rate 603 indicating the CPU usage rate of the monitored physical server, a memory usage rate 604 indicating the memory usage rate, a network flow rate 605 indicating the communication flow rate of the network, and a disk flow rate 606 indicating the input/output amount of the disk.
  • the system monitoring unit 181 monitors the performance of the physical server 14 and records the monitored performance information in the physical server performance information 194.
  • FIG. 6 is an explanatory diagram showing an example of service performance information.
  • the service performance information 195 includes a date and time 701 indicating the date and time when performance was monitored, a request number 702 indicating the number of requests received by the business unit 13 per unit time, and a response time 703 indicating the response time to a request.
  • the system monitoring unit 181 monitors the performance of the business unit 13 and records the monitored performance information in the service performance information 195.
  • FIG. 7 is an explanatory diagram showing an example of system performance model information.
  • the system performance model information 196 includes a total resource amount 801 indicating the total number of CPU cores of the system and a performance 802 indicating the number of requests that the system can process with that total resource amount. Before step S1001 (FIG. 9) of the monitoring flowchart described later starts, the administrator determines the values shown in the system performance model information 196 by a method such as a verification experiment or literature reference, and sets those values in the system performance model information 196.
  • FIG. 8 is an explanatory diagram showing an example of scaling rule information.
  • the scaling rule information 197 includes a capping target 902 indicating the role of the VMs to be CPU-capped, a capping ratio 903 indicating the ratio of capping set on those VMs, a load increase monitoring period 904 indicating the monitoring period used to decide whether to start allocating requests to a newly added VM, a trigger coefficient 906 used in the trigger condition that determines whether to start VM addition processing, a normal-time monitoring interval 907, and a high-load monitoring interval 908 that is the monitoring interval at the time of high load. These values are set by the administrator.
  • FIG. 9 is a flowchart (monitoring flowchart) showing a monitoring process for the business unit 13 executed by the control unit 18.
  • in step S1001, the system monitoring unit 181 monitors the business unit 13 at the interval set as the normal-time monitoring interval 907 (FIG. 8) and records the monitored performance information in the service performance information 195 (FIG. 6): the monitored date and time 701, the number of requests 702 received by the business unit 13 per unit time, and the response time 703 for the requests.
  • similarly, the system monitoring unit 181 monitors the physical server 14 and records the monitored performance information in the physical server performance information 194 (FIG. 5): the monitored date and time 601, server name 602, CPU usage rate 603, memory usage rate 604, network flow rate 605, and disk flow rate 606.
  • in step S1002, the management software control unit 184 determines whether the scale-out trigger condition is satisfied according to the following inequality.
  • Number of requests > System capacity × Trigger coefficient
  • the number of requests is the latest number of requests 702 recorded in the service performance information 195 (FIG. 6).
  • the trigger coefficient is the trigger coefficient 906 of the scaling rule information 197 (FIG. 8).
  • the system capacity is obtained by searching the system performance model information 196 (FIG. 7) using the total CPU core allocation number of the system as a key and taking the value of the corresponding performance 802 (the number of requests that the system can process).
  • the total CPU core allocation number of the system is the sum of the CPU core allocation numbers 403 of the records whose role 402 is “Web” in the system configuration information 192 (FIG. 4).
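Under the reading that the trigger compares the observed request count against the modeled capacity scaled by the trigger coefficient 906, step S1002 can be sketched as follows (illustrative names only; the dictionary lookup stands in for searching the system performance model information of FIG. 7):

```python
def scale_out_triggered(num_requests, total_web_cores, perf_model_196, trigger_coeff_906):
    """Return True when the scale-out trigger condition is satisfied.

    perf_model_196 maps a total resource amount 801 (CPU core count)
    to the performance 802 (number of requests the system can process).
    """
    system_capacity = perf_model_196[total_web_cores]
    return num_requests > system_capacity * trigger_coeff_906

# Hypothetical model: 4 Web cores handle 1000 req/s, 8 cores 1900 req/s.
model = {4: 1000, 8: 1900}
print(scale_out_triggered(900, 4, model, 0.8))  # True  (900 > 800)
print(scale_out_triggered(700, 4, model, 0.8))  # False (700 <= 800)
```

A trigger coefficient below 1 starts the scale-out before the modeled capacity is actually reached, buying time for the (slow) VM addition.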
  • when it is determined that the condition is satisfied, the management software control unit 184 executes the processing from step S1003 onward as the scale-out processing. When it is determined that the condition is not satisfied, the scale-out processing is not started and the monitoring of step S1001 continues.
  • FIG. 10 is a flowchart (scale-out flowchart) showing processing executed by the control unit 18 as scale-out processing after step S1003 in FIG.
  • in step S1101, the system configuration changing unit 183 starts the process of deploying (making available) a virtual server (VM) on the hypervisor 15 for the business unit 13, in accordance with the information recorded in the system construction request information 191 (FIG. 3), using the values of the CPU core allocation number 303 and the one-core frequency 304 of the record whose role 302 is “Web”.
  • the CPU core allocation number set for the virtual server (VM) to be added is the CPU core allocation number 303 divided by (1 − capping ratio 903) (scaling rule information 197 in FIG. 8).
  • the capping ratio set for the virtual server (VM) to be added is the capping ratio 903 of the scaling rule information 197 (FIG. 8) corresponding to the role of the VM to be added (capping target 902 in FIG. 8); for example, if the role is “Web”, the capping ratio is 50%.
  • in step S1102, the system configuration change unit 183 executes the process of canceling the CPU capping of the operating virtual servers.
  • the operating virtual servers are found by searching for records whose role 402 is “Web” in the system configuration information 192 (FIG. 4).
  • the capping amount to be released is the value obtained by multiplying the CPU core allocation number 303 of the newly added virtual server by the one-core frequency 304. For example, when the capping amount to be released is 12000 MHz, the Web server recorded in the system configuration information 192 (FIG. 4) is VM1, and the capped frequency 405 of VM1 is 12000 MHz, the 12000 MHz capping set on VM1 is released.
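The release step can be sketched as follows (the record fields mirror FIG. 4, but the dict-based structure and names are illustrative): the released amounts are remembered so they can be re-set in step S1106.

```python
def release_capping(system_config_192, role, release_mhz):
    """Release up to release_mhz of CPU capping from operating VMs of a role.

    system_config_192 is a list of records such as
    {"server": "VM1", "role": "Web", "capped_freq": 12000}.
    Returns the per-VM amounts released, for restoration in step S1106.
    """
    released = {}
    remaining = release_mhz
    for rec in system_config_192:
        if rec["role"] != role or remaining <= 0:
            continue
        amount = min(rec["capped_freq"], remaining)
        if amount > 0:
            rec["capped_freq"] -= amount
            remaining -= amount
            released[rec["server"]] = amount
    return released

# The 12000 MHz example: VM1's cap drops to 0 the moment this runs.
config = [{"server": "VM1", "role": "Web", "capped_freq": 12000}]
print(release_capping(config, "Web", 12000))  # {'VM1': 12000}
print(config[0]["capped_freq"])               # 0
```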
  • in step S1103, the system monitoring unit 181 changes the monitoring interval from the normal-time monitoring interval 907 to the high-load monitoring interval 908 and monitors the business unit 13 at the set interval, recording the same information as in step S1001 (FIG. 9) in the service performance information 195 (FIG. 6). Similarly, the physical server 14 is monitored at the interval set as the high-load monitoring interval 908, and the monitored information is recorded in the physical server performance information 194 (FIG. 5).
  • in step S1104, the management software control unit 184 determines whether the high load has continued for a predetermined period.
  • the value of the load increase monitoring period 904 of the scaling rule information 197 (FIG. 8) is used as the predetermined period, and whether the high load continues is determined by whether the trigger condition shown in step S1002 (FIG. 9) is still satisfied.
  • in step S1105, the management software control unit 184 changes the setting of the LB (load balancer) and starts allocating requests to the newly added VM. However, if the VM addition started in step S1101 has not yet completed, the management software control unit 184 changes the LB setting after waiting for the addition to complete.
  • in step S1109, the system configuration change unit 183 deletes the VM added in step S1101. However, if the VM addition started in step S1101 has not yet completed, the system configuration changing unit 183 cancels the addition.
  • the system configuration change unit 183 first starts the virtual server (VM) addition process (step S1101), and subsequently releases the CPU capping (step S1102).
  • releasing CPU capping takes effect faster than adding or deleting VMs and is therefore suitable as a response to a sudden load increase or decrease. By canceling CPU capping in parallel with the VM addition processing, a state close to that after the VM addition is reproduced in advance to cope with the load increase (fluctuation).
  • the system monitoring unit 181 monitors the load at the high-load monitoring interval (step S1103). If the high-load state does not continue (step S1104: NO), the process is canceled without adding the VM (step S1109). If the high-load state continues (step S1104: YES), the management software control unit 184 responds by assigning load to the added VM (step S1105).
  • in step S1106, the system configuration changing unit 183 re-sets (returns) the capping on the VMs whose capping was canceled in step S1102.
  • the target VMs for capping are the same as in step S1102, and the amount of capping is the same as the amount released in step S1102.
  • in step S1107, the system monitoring unit 181 returns the monitoring interval to the normal-time monitoring interval 907 and monitors the business unit 13 at the set interval, recording the same information as in step S1001 (FIG. 9) in the service performance information 195 (FIG. 6).
  • in step S1108, the system configuration calculation unit 182 adds the information of the VM added in step S1101 to the system configuration information 192 (FIG. 4). However, if step S1109 was executed, the information is not added and the process ends.
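Putting the steps of FIG. 10 together, the flow can be sketched with a stand-in controller (all method names here are hypothetical, chosen only to label the steps; the patent does not define such an interface):

```python
class DemoController:
    """Minimal stand-in so the scale-out flow can be exercised."""
    def __init__(self, load_stays_high):
        self.load_stays_high = load_stays_high
        self.log = []
    def start_vm_add(self):        self.log.append("S1101"); return "VM2"
    def release_capping(self):     self.log.append("S1102"); return {"VM1": 12000}
    def high_load_continues(self): self.log.append("S1104"); return self.load_stays_high
    def assign_load(self, vm):     self.log.append("S1105")
    def restore_capping(self, r):  self.log.append("S1106")
    def cancel_vm_add(self, vm):   self.log.append("S1109")

def scale_out(ctrl):
    vm = ctrl.start_vm_add()           # S1101: start VM addition (slow)
    released = ctrl.release_capping()  # S1102: instant headroom while the VM boots
    if ctrl.high_load_continues():     # S1103-S1104: watch at the high-load interval
        ctrl.assign_load(vm)           # S1105: LB starts routing to the new VM
        ctrl.restore_capping(released) # S1106: re-set the released capping
        return True                    # S1107-S1108: normal monitoring, record the VM
    ctrl.cancel_vm_add(vm)             # S1109: load subsided; cancel the addition
    return False

print(scale_out(DemoController(load_stays_high=True)))   # True
print(scale_out(DemoController(load_stays_high=False)))  # False
```

The key design point is that S1101 and S1102 run back to back: the fast capping release covers the load while the slow VM addition is still in flight.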
  • the computer resource to be capped has been described as a CPU. However, other computer resources (memory, disk, network, etc.) may be targeted.
  • a second embodiment of the present invention will be described with reference to FIGS.
  • in Example 1, it is determined whether the scale-out trigger condition is satisfied (step S1002 in FIG. 9), and if the condition is satisfied (YES), the VM addition processing is started (step S1101 in FIG. 10).
  • however, the addition of a VM may not be effective. For example, in a system consisting of one Web server and one DB server connected to each other, if the allowable number of simultaneous connections at the DB server is 100 and the number of simultaneous connections to the DB server configured at the Web server is 100, then even if one Web server is added, the service performance is not improved because the total number of simultaneous connections to the DB server exceeds the allowable number at the DB server.
  • a resource addition method at the time of load increase is selected using information (connection information) indicating the state of the application.
  • connection information shown in FIG. 11 is used in addition to the system configuration shown in FIG.
  • the flowchart shown in FIG. 12 is used.
  • FIG. 11 is an explanatory diagram showing an example of connection information.
  • the connection information 198 includes a role 1201 indicating the role of the VM and a connection number 1202 indicating the number of communication connections set in the VM to which the role is assigned. This connection information 198 is set by the administrator.
  • the connection information 198 is stored in the management information group 19 (indicated by a dotted frame in FIG. 1).
  • FIG. 12 is a flowchart (parameter determination flowchart) showing the parameter determination process executed by the control unit 18. Steps S1001 and S1002 are the same as in Example 1. However, if it is determined in step S1002 that the scale-out trigger condition is satisfied (YES), the management software control unit 184 executes step S1303 instead of step S1003. In step S1303, the management software control unit 184 determines whether the number of connections from the Web server group still falls within the number of connections accepted by the DB server group even when one more Web server (a virtual server) is added, using the following inequality.
  • Number of connections set in Web server × (Number of operating Web servers + 1) ≦ Number of connections set in DB server × Number of operating DB servers
  • the number of connections set in the Web server is the number of connections 1202 of the record whose role 1201 is “Web” in the connection information 198.
  • the number of connections set in the DB server is the number of connections 1202 of the record whose role 1201 is “DB” in the connection information 198.
  • the number of operated Web servers is the number of records whose role 402 is “Web” in the system configuration information 192 (FIG. 4).
  • the number of operated DB servers is the number of records whose role 402 is “DB” in the system configuration information 192 (FIG. 4).
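The check in step S1303 can be sketched as follows (illustrative names; the per-role connection counts mirror the connection number 1202 of FIG. 11):

```python
def web_addition_effective(conn_1202, n_web, n_db):
    """Step S1303: does adding one Web server stay within DB capacity?

    conn_1202 maps a role to its configured connection count, e.g.
    {"Web": 100, "DB": 100}; n_web and n_db are the operating counts.
    """
    return conn_1202["Web"] * (n_web + 1) <= conn_1202["DB"] * n_db

conn = {"Web": 100, "DB": 100}
print(web_addition_effective(conn, 1, 1))  # False: 100*2 > 100*1, so only release capping
print(web_addition_effective(conn, 1, 3))  # True: 200 <= 300, so scale out
```

The first case reproduces the one-Web/one-DB example above, where adding a Web server cannot improve service performance.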
  • when the determination in step S1303 is satisfied (YES), the scale-out processing from step S1003 (FIG. 10) onward is executed. When it is not satisfied (NO), in step S1102 the system configuration change unit 183 executes only the capping release processing of the existing virtual servers, and thereafter ends the processing.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Computer And Data Communications (AREA)

Abstract

A computer system having a varying load, wherein computer resources are appropriately allocated according to the load even when load variations are difficult to predict and even when different computer resource allocation methods are effective for different application conditions. In order to accomplish this feature, the computer system is at least provided with: a physical server which has computer resources including a processor, a storage device, and a communication interface; a virtualization device which virtualizes computer resources and allocates virtual computer resources to a plurality of virtual environments; and a management device which manages the plurality of virtual environments. The management device comprises: a monitoring unit which monitors loads on the plurality of virtual environments; a control unit which determines, based on the monitored loads, whether it is necessary to add another virtual environment; and a configuration changing unit which requests addition of a virtual environment in accordance with the determination made by the control unit, removes usage restrictions on computer resources that constitute existing virtual environments, and, upon completion of the addition of a virtual environment, restores the usage restrictions on the existing virtual environments.

Description

Computer system and autoscaling method for computer system

 The present invention relates to a computer system that changes the allocation of computer resources according to a load, and to an autoscaling method in the computer system.

 In a computer system used by an enterprise, the load varies with the time of day and the season. Techniques have been proposed for dynamically controlling the allocation of resources (computer resources) in order to maintain service levels such as system response time even when the load fluctuates.
 For example, Patent Document 1 discloses a technique that, in a virtualized environment, predicts future load fluctuations for a business system composed of a plurality of business clusters and, based on the predicted result, allocates the resources needed to follow the load fluctuations by combining means for changing a plurality of hardware resources.
 Autoscaling is a function that automatically increases or decreases the number of virtual machines under a load balancer and the amount of resources allocated to the virtual machines according to the load on the virtual machines, such as CPU usage rate and network traffic.

Japanese Patent No. 5332065

 However, it is sometimes difficult to predict load fluctuations in a computer system. For example, in an online event-ticket sales system, a huge volume of accesses exceeding any prior prediction may occur and concentrate. In the prior art, a resource addition method such as adding VMs is scheduled based on a prediction of load increase, and when the load exceeds the prediction, the VM addition may not be completed in time, degrading service performance.

 Also, the effective resource addition method may differ depending on the state of the application in the computer system. For example, in a system consisting of one Web server and one DB server connected to each other, if the allowable number of simultaneous connections at the DB server is 100 and the number of simultaneous connections to the DB server configured at the Web server is 100, then even if one Web server is added, the total number of simultaneous connections to the DB server exceeds the allowable number at the DB server, and the service performance may deteriorate.

 To solve these problems, a first object of the present invention is to provide a technology relating to autoscaling for appropriately performing dynamic allocation of computer resources and maintaining the service level of business even when load fluctuations are difficult to predict.
 A second object of the present invention is to provide a technology relating to autoscaling for appropriately allocating computer resources according to the state of an application in the computer system and maintaining the service level of business.

 A computer system according to the present invention includes at least a physical server having computer resources including a processor, a storage device, and a communication interface; a virtualization device that virtualizes the computer resources and allocates them to a plurality of virtual environments; and a management device that manages the plurality of virtual environments. The management device has a monitoring unit that monitors the loads of the plurality of virtual environments, a control unit that determines whether a virtual environment needs to be added according to the monitored loads, and a configuration change unit that, in accordance with the determination of the control unit, instructs the virtualization device to add a virtual environment, releases the use restrictions on the computer resources constituting the existing virtual environments, and re-sets the released use restrictions on the existing virtual environments after the addition of the virtual environment is complete.

 According to the present invention, even when no prediction information on load fluctuations is available, and likewise according to the state of the application in the computer system, dynamic allocation of computer resources can be performed effectively, realizing autoscaling that maintains the service level of business.

FIG. 1 is a block diagram showing the configuration of a computer system according to Example 1 of the present invention. FIG. 2 is a diagram showing the logical connection relationship and roles of virtual servers. FIG. 3 is a diagram showing system construction request information. FIG. 4 is a diagram showing system configuration information. FIG. 5 is a diagram showing physical server performance information. FIG. 6 is a diagram showing service performance information. FIG. 7 is a diagram showing system performance model information. FIG. 8 is a diagram showing scaling rule information. FIG. 9 is a diagram showing a monitoring flowchart according to Example 1 of the present invention. FIG. 10 is a diagram showing a scale-out flowchart according to Example 1 of the present invention. FIG. 11 is a diagram showing connection information. FIG. 12 is a diagram showing a parameter determination flowchart according to Example 2 of the present invention.

 Hereinafter, Embodiment 1 and Embodiment 2 will be described as embodiments of the present invention with reference to the drawings.

 FIG. 1 is a block diagram showing the configuration of a computer system according to Embodiment 1 of the present invention.
 The computer system comprises a management unit 11, a business unit 13, and a network 16.

 The management unit 11 comprises a control unit 18 and a management information group 19. The control unit 18 comprises a system monitoring unit 181, a system configuration calculation unit 182, a system configuration change unit 183, and a management software control unit 184. The management information group 19 comprises system construction request information 191, system configuration information 192, physical server performance information 194, service performance information 195, system performance model information 196, and scaling rule information 197.

 The management unit 11 is executed by a physical server 12. The physical server 12 includes a CPU 121, a storage device 122, and a communication IF 123. Each function is realized by the CPU 121 executing the control unit 18 stored in the storage device 122. The storage device 122 also stores the management information group 19. The physical server 12 is connected to the network 16 via the communication IF 123.

 One or more business units 13 exist in the computer system, each composed of one or more virtual servers (hereinafter sometimes referred to as "VMs"). FIG. 1 shows an example in which three virtual servers (virtual server 131, virtual server 132, and virtual server 133) operate as one business unit. A virtual server plays various roles depending on the business executed on the computer system. For example, the virtual server 131 operates as an LB (load balancer), the virtual server 132 as a Web server, and the virtual server 133 as a DB server.

 The hypervisor 15 virtualizes the computer resources of a physical server 14 and provides the virtual servers of the business unit 13. The physical server 14 includes a CPU 141, a storage device 142, and a communication IF 143. The business unit 13 is stored in the storage device 142 and executed by the CPU 141. The business unit 13 is connected to the network 16 via the communication IF 143.

 FIG. 2 is an explanatory diagram showing an example of the logical connection relationships and roles of the virtual servers in the business unit 13. The LB 21 operates as a load balancer and distributes received request messages to one of Web 22, Web 23, and Web 24, all of which are Web servers. The DB 25 is a DB server that accepts accesses from Web 22, Web 23, and Web 24.
 Web 22 to Web 24 are set as the initial number of deployed Web servers (initial count) that receive the request messages distributed by the LB 21, and this number is increased or decreased as necessary (the dotted-line portion in FIG. 2 indicates an increase).

 FIG. 3 is an explanatory diagram showing an example of the system construction request information 191. The system construction request information 191 includes a role 302 indicating the role of each virtual server (VM) deployed in the business unit 13, a CPU core allocation count 303 indicating the number of CPU cores allocated per role, a per-core frequency 304 indicating the CPU operating frequency per core, and an initial count 305 indicating the number of VMs initially deployed per role. These values are set by the administrator.

 The system configuration change unit 183 deploys VMs in the business unit 13 in accordance with the system construction request information 191 before step S1001 of the monitoring flowchart (FIG. 9) described later is started.
 However, for a VM whose role is listed in the capping target 902 of the scaling rule information (FIG. 8) described later, the CPU core allocation count and the capped frequency are calculated and set according to the following formulas. Here, capping means limiting (placing an upper bound on) the usage rate (usage amount) of a computer resource.
  CPU core allocation count to set = CPU core allocation count 303 / (1 - capping ratio 903)
  Capped frequency to set = CPU core allocation count to set × per-core frequency 304 × capping ratio 903

 On the other hand, for a VM whose role is not listed in the capping target 902, the CPU core allocation count and the capped frequency are calculated and set according to the following formulas.
  CPU core allocation count to set = CPU core allocation count 303
  Capped frequency to set = 0
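As an illustrative sketch of the two cases above (not part of the patent; the function and parameter names are assumptions), the initial CPU core allocation count and capped frequency could be computed as follows:

```python
def initial_capping(core_count, per_core_mhz, capping_ratio, is_capping_target):
    # Inputs correspond to CPU core allocation count 303, per-core frequency 304,
    # and capping ratio 903; the names themselves are illustrative.
    if is_capping_target:
        # Over-provision cores so the capped share equals the requested allocation.
        cores_to_set = core_count / (1 - capping_ratio)
        capped_mhz = cores_to_set * per_core_mhz * capping_ratio
    else:
        cores_to_set = core_count
        capped_mhz = 0  # no cap applied
    return cores_to_set, capped_mhz

# 4 requested cores at 3000 MHz with a 50% capping ratio:
print(initial_capping(4, 3000, 0.5, True))   # → (8.0, 12000.0)
print(initial_capping(4, 3000, 0.5, False))  # → (4, 0)
```

With a 50% capping ratio, the VM is given 8 cores but capped at 12000 MHz, the capacity of the originally requested 4 cores; releasing the cap later instantly doubles the usable capacity without redeploying the VM.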

 FIG. 4 is an explanatory diagram showing an example of the system configuration information 192. The system configuration information 192 includes a server name 401, which is an identifier for identifying each VM, a role 402 indicating the role of the VM, a CPU core allocation count 403 indicating the number of CPU cores allocated to the VM, a per-core frequency 404 indicating the CPU operating frequency per core, a capped frequency 405 indicating the capped CPU operating frequency, and an operation destination server 407, which is the identifier of the physical server on which the VM operates. The system configuration information 192 is set by the control unit 18 in accordance with the contents of system configuration changes.

 FIG. 5 is an explanatory diagram showing an example of the physical server performance information. The physical server performance information 194 includes a date and time 601 indicating when the performance was monitored, a server name 602, which is the identifier of the monitored physical server, a CPU usage rate 603 and a memory usage rate 604 of the monitored physical server, a network flow rate 605 indicating the network traffic volume, and a disk flow rate 606 indicating the disk I/O volume. The system monitoring unit 181 monitors the performance of the physical server 14 and records the monitored performance information in the physical server performance information 194.

 FIG. 6 is an explanatory diagram showing an example of the service performance information. The service performance information 195 includes a date and time 701 indicating when the performance was monitored, a request count 702 indicating the number of requests received by the business unit 13 per unit time, and a response time 703 for those requests. The system monitoring unit 181 monitors the performance of the business unit 13 and records the monitored performance information in the service performance information 195.

 FIG. 7 is an explanatory diagram showing an example of the system performance model information. The system performance model information 196 includes a total resource amount 801 indicating the total number of CPU cores of the system and a performance 802 indicating the number of requests the system can process at that total resource amount 801. These values are set by the administrator. Before step S1001 of the monitoring flowchart (FIG. 9) described later is started, the administrator determines the values shown in the system performance model information 196 by a method such as a verification experiment or literature reference, and sets those values in the system performance model information 196.

 FIG. 8 is an explanatory diagram showing an example of the scaling rule information. The scaling rule information 197 includes a capping target 902 indicating the roles of the VMs subject to CPU capping, a capping ratio 903 indicating the capping ratio set on those VMs, a load increase monitoring period 904 indicating the monitoring period used to decide whether to start allocating requests to a newly added VM, a trigger coefficient 906 related to the trigger condition used to decide whether to start VM addition processing, a normal-time monitoring interval 907, and a high-load monitoring interval 908. These values are set by the administrator.

 FIG. 9 is a flowchart (monitoring flowchart) showing the monitoring processing that the control unit 18 executes on the business unit 13.
 In step S1001, the system monitoring unit 181 monitors the business unit 13 at the interval set as the normal-time monitoring interval 907 (FIG. 8) and records the monitored performance results in the service performance information 195 (FIG. 6). Specifically, it records the monitored date and time 701, the number of requests 702 received by the business unit 13 per unit time, and the response time 703 for those requests. The system monitoring unit 181 also monitors the physical server 14 and records the monitored performance results in the physical server performance information 194 (FIG. 5). Specifically, it records the monitored date and time 601, server name 602, CPU usage rate 603, memory usage rate 604, network flow rate 605, and disk flow rate 606.

 Next, in step S1002, the management software control unit 184 determines whether the scale-out trigger condition is satisfied according to the following expression:
  number of requests > trigger coefficient × system capacity
 Here, the number of requests is the latest request count 702 recorded in the service performance information 195 (FIG. 6). The trigger coefficient is the trigger coefficient 906 of the scaling rule information 197 (FIG. 8). For the system capacity, the system performance model information 196 (FIG. 7) is searched using the system's total CPU core allocation count as a key, and the value of the corresponding performance 802 (the number of requests the system can process) is used. The system's total CPU core allocation count is the sum of the CPU core allocation counts 403 of the records in the system configuration information 192 (FIG. 4) whose role 402 is "Web".
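A minimal sketch of the trigger check in step S1002; the names (`perf_model`, `scale_out_triggered`) are illustrative assumptions, not identifiers from the patent.

```python
def scale_out_triggered(request_count, trigger_coeff, total_web_cores, perf_model):
    # perf_model maps a total CPU core count to the number of requests the
    # system can process (system performance model information 196).
    system_capacity = perf_model[total_web_cores]
    return request_count > trigger_coeff * system_capacity

# Illustrative model: 8 Web cores can process 1000 requests per unit time.
perf_model = {8: 1000, 16: 2000}
print(scale_out_triggered(900, 0.8, 8, perf_model))  # → True  (900 > 800)
print(scale_out_triggered(700, 0.8, 8, perf_model))  # → False (700 <= 800)
```

A trigger coefficient below 1 fires the scale-out before the measured load reaches the modeled capacity, giving the VM addition time to complete.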

 If the management software control unit 184 determines in step S1002 that the trigger condition is satisfied (YES), it executes the processing from step S1003 onward as scale-out processing. If it determines that the condition is not satisfied (NO), the processing from step S1001 is repeated.

 FIG. 10 is a flowchart (scale-out flowchart) showing the processing that the control unit 18 executes as the scale-out processing from step S1003 onward in FIG. 9.
 In step S1101, the system configuration change unit 183 starts the processing of adding a virtual server (VM) to the business unit 13 in accordance with the information recorded in the system construction request information 191 (FIG. 3). Specifically, using the values of the CPU core allocation count 303 and the per-core frequency 304 of the record in the system construction request information 191 (FIG. 3) whose role 302 is "Web", it starts deploying (making available) a virtual server (VM) on the hypervisor 15. However, in order to leave headroom for further load increases, the CPU core allocation count set on the VM is the CPU core allocation count 303 divided by one minus the capping ratio 903 (scaling rule information 197, FIG. 8). For example, if the CPU core allocation count 303 is 4 and the capping ratio 903 is 50%, the CPU core allocation count set on the VM is 8 (= 4 / (1 - 0.5) = 4 / 0.5). The capping ratio set on the added VM is the capping ratio 903 in the scaling rule information 197 (FIG. 8) corresponding to the role of the VM being added (capping target 902 in FIG. 8); for example, a capping ratio of 50% when the role is "Web".

 In step S1102, the system configuration change unit 183 executes processing that releases the CPU capping of the virtual servers already in operation. The virtual servers already in operation are found by searching the system configuration information 192 (FIG. 4) for records whose role 402 is "Web". The capping amount to release is the CPU core allocation count 303 of the newly added virtual server multiplied by the per-core frequency 304. For example, if the capping amount to release is 12000 MHz, the Web server recorded in the system configuration information 192 (FIG. 4) is VM1, and the capped frequency 405 of VM1 is 12000 MHz, the processing is as follows: the value (0) obtained by subtracting the capping amount to release (12000 MHz = 3000 MHz × 4) from the capped frequency 405 of VM1 (12000 MHz) is set on VM1 as its new capped frequency, and that value is recorded in the capped frequency 405 of VM1.
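The cap-release calculation of step S1102 can be sketched as follows, under the assumption that a capped frequency of 0 means no cap; the function and parameter names are illustrative.

```python
def release_capping(existing_capped_mhz, new_vm_cores, per_core_mhz):
    # Release amount = cores of the new VM x per-core frequency
    # (fields 303 and 304 of the system construction request information).
    release_amount = new_vm_cores * per_core_mhz
    # A capped frequency of 0 is taken to mean "no cap"; never go negative.
    return max(existing_capped_mhz - release_amount, 0)

# Example from the text: VM1 capped at 12000 MHz, new VM has 4 cores of 3000 MHz.
print(release_capping(12000, 4, 3000))  # → 0 (cap fully released)
```

Releasing exactly the capacity of the VM being deployed means the existing servers temporarily stand in for the new VM while it is still starting up.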

 In step S1103, the system monitoring unit 181 changes the monitoring interval from the normal-time monitoring interval 907 to the high-load monitoring interval 908 and monitors the business unit 13 at that interval, recording the same information as in step S1001 (FIG. 9) in the service performance information 195 (FIG. 6). Similarly, it monitors the physical server 14 at the interval set as the high-load monitoring interval 908 and records the monitored information in the physical server performance information 194 (FIG. 5).

 In step S1104, the management software control unit 184 determines whether the high load has continued for a predetermined period. Here, the value of the load increase monitoring period 904 in the scaling rule information 197 (FIG. 8) is used as the predetermined period. Whether the high load continues is determined by whether the trigger condition shown in step S1002 (FIG. 9) is still satisfied.

 If the management software control unit 184 determines that the high load is continuing (YES), in step S1105 it changes the setting of the LB (load balancer) and starts allocating requests to the newly added VM. However, if the VM addition started in step S1101 has not yet been completed, the management software control unit 184 waits for the addition to complete before changing the setting of the LB.

 If the management software control unit 184 determines that the high load is not continuing (NO), in step S1109 the system configuration change unit 183 deletes the VM added in step S1101. However, if the VM addition started in step S1101 has not yet been completed, the system configuration change unit 183 cancels the addition.

 The mechanism of the processing flow of steps S1101 to S1105 and step S1109 is supplemented as follows. As the scale-out processing, the system configuration change unit 183 first starts the virtual server (VM) addition processing (step S1101) and then releases the CPU capping (step S1102). CPU capping responds faster than adding or deleting a VM and is well suited to coping with sudden load fluctuations; by releasing the CPU capping in parallel with the VM addition processing, a state close to that after the VM addition is reproduced in advance, coping with the load increase (fluctuation). The system monitoring unit 181 monitors this at the high-load monitoring interval (step S1103). If the load increase (fluctuation) is temporary, or if its effect can be absorbed by releasing the CPU capping, the high-load state does not continue (step S1104), so the system configuration change unit 183 cancels the VM addition without completing it (step S1109). If, however, the high-load state continues even after the CPU capping has been released (step S1104), the management software control unit 184 responds by allocating load to the added VM (step S1105).

 In step S1106, the system configuration change unit 183 resets (restores) the capping on the VMs whose capping was released in step S1102. The VMs to be capped are the same as those targeted in step S1102, and the capping amount is the same as the amount released in step S1102.

 In step S1107, the system monitoring unit 181 returns the monitoring interval to the normal-time monitoring interval 907 and monitors the business unit 13 at that interval, recording the same information as in step S1001 (FIG. 9) in the service performance information 195 (FIG. 6).

 In step S1108, the system configuration calculation unit 182 adds the information of the VM whose addition was started in step S1101 to the system configuration information 192 (FIG. 4). However, if step S1109 was executed, the information is not added and the processing ends as is.

 In Embodiment 1, the computer resource subject to capping has been described as the CPU, but other computer resources (memory, disk, network, and the like) may also be targeted.

 As described above, according to Embodiment 1, even when no load-fluctuation prediction information is available, computer resources can be dynamically allocated in an effective manner, realizing autoscaling that maintains the service level of the business.

 Next, Embodiment 2 of the present invention will be described with reference to FIGS. 11 and 12.
 In Embodiment 1, it is determined whether the scale-out trigger condition is satisfied (step S1002 in FIG. 9), and if so (YES), VM addition processing is started (step S1101 in FIG. 10). However, depending on the state of the application, adding a VM may not be effective. For example, in a system consisting of one interconnected Web server and one DB server, if the allowable number of simultaneous connections on the DB server is 100 and the number of simultaneous connections to the DB server from the Web server is 100, adding one Web server does not improve service performance, because the total number of simultaneous connections to the DB server would exceed the allowable number of simultaneous connections on the DB server. In view of such cases, Embodiment 2 uses information representing the state of the application (connection information) to select the resource addition method when the load increases.

 Embodiment 2 uses the connection information shown in FIG. 11 in addition to the system configuration shown in FIG. 1, and uses the flowchart shown in FIG. 12 in place of the flowcharts of Embodiment 1 shown in FIGS. 9 and 10.

 FIG. 11 is an explanatory diagram showing an example of the connection information. The connection information 198 includes a role 1201 indicating the role of a VM and a connection count 1202 indicating the number of communication connections set on a VM assigned that role. The connection information 198 is set by the administrator and is stored in the management information group 19 (indicated by the dotted frame in FIG. 1).

 FIG. 12 is a flowchart (parameter determination flowchart) showing the parameter determination processing executed by the control unit 18.
 Steps S1001 and S1002 are the same as in Embodiment 1. However, if it is determined in step S1002 that the scale-out trigger condition is satisfied (YES), the management software control unit 184 executes step S1303 instead of step S1003. In step S1303, the management software control unit 184 determines whether, even after one Web server (a virtual server) is added, the number of connections on the Web server group still fits within the number of connections on the DB server group.

 Specifically, the determination is made based on the following expression:
  connections set per Web server × (number of operating Web servers + 1) < connections set per DB server × number of operating DB servers
 Here, the number of connections set per Web server is the connection count 1202 of the record in the connection information 198 whose role 1201 is "Web". Similarly, the number of connections set per DB server is the connection count 1202 of the record in the connection information 198 whose role 1201 is "DB". The number of operating Web servers is the number of records in the system configuration information 192 (FIG. 4) whose role 402 is "Web". Similarly, the number of operating DB servers is the number of records whose role 402 is "DB".

 If it is determined that the number of connections is not sufficient for the addition of a virtual server (Web server), that is, the above expression is not satisfied (NO), adding a Web server may not improve the system performance. In that case, therefore, in step S1102 the system configuration change unit 183 executes the capping release processing for the existing virtual servers and then ends the processing.

 If it is determined that the number of connections is sufficient for the addition of a virtual server (Web server), that is, the above expression is satisfied (YES), step S1003 (the scale-out processing) shown in FIG. 10 is executed as in Embodiment 1.

 As described above, according to Embodiment 2, computer resources can be dynamically allocated in an effective manner according to the state of the applications in the computer system, realizing autoscaling that maintains the service level of the business.

11 management unit, 12 physical server, 13 business unit, 14 physical server, 15 hypervisor, 16 network, 18 control unit, 19 management information group, 121, 141 CPU, 122, 142 storage device, 123, 143 communication IF, 131 to 133 VM, 181 system monitoring unit, 182 system configuration calculation unit, 183 system configuration change unit, 184 management software control unit, 191 system construction request information, 192 system configuration information, 194 physical server performance information, 195 service performance information, 196 system performance model information, 197 scaling rule information, 198 connection information

Claims (12)

 1. A computer system comprising at least:
 a physical server having computer resources including a processor, a storage device, and a communication interface;
 a virtualization device that virtualizes the computer resources and allocates them to a plurality of virtual environments; and
 a management device that manages the plurality of virtual environments,
 wherein the management device has:
 a monitoring unit that monitors loads of the plurality of virtual environments;
 a control unit that determines, according to the monitored loads, whether a virtual environment needs to be added; and
 a configuration change unit that, according to the determination of the control unit, instructs the virtualization device to add a virtual environment, releases the use restrictions on the computer resources constituting the existing virtual environments, and, after the addition of the virtual environment is complete, resets the released use restrictions on the existing virtual environments.
 2. The computer system according to claim 1, wherein the computer resource whose use restriction is released is a processor.
3. The computer system according to claim 1 or 2, wherein the management apparatus includes a storage unit that holds a use restriction ratio for the computer resources, and the configuration change unit sets the use restriction amount for the computer resources of the virtual environment to be added on the basis of the use restriction ratio.
4. The computer system according to any one of claims 1 to 3, wherein, after the instruction to add the virtual environment, the monitoring unit shortens the monitoring time interval, and the configuration change unit completes the addition of the virtual environment if the high-load state continues throughout the shortened interval, and otherwise cancels the addition and deletes the virtual environment already added.
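The confirmation logic of claim 4 — sample at a shortened interval and keep the new virtual environment only if the high load persists — can be illustrated with a minimal sketch. The threshold, interval, and sample values are assumptions chosen for the example.

```python
# Sketch of the claim-4 confirmation window: after a virtual environment
# is added provisionally, monitoring switches to a shorter interval; the
# addition is kept only if every sample in the window stays above the
# high-load threshold, otherwise it is rolled back (the new VM deleted).

HIGH_LOAD_THRESHOLD = 0.8  # assumed fraction of capacity

def confirm_or_rollback(load_samples, threshold=HIGH_LOAD_THRESHOLD):
    """Return 'commit' if the load stayed high for the whole shortened
    monitoring window, else 'rollback'."""
    if all(sample >= threshold for sample in load_samples):
        return "commit"
    return "rollback"

# Load stayed high through the shortened window -> keep the new VM.
print(confirm_or_rollback([0.85, 0.90, 0.82]))  # commit
# Load dropped mid-window -> cancel the addition, delete the new VM.
print(confirm_or_rollback([0.85, 0.40, 0.82]))  # rollback
```

The rollback branch prevents a transient spike from permanently consuming capacity on the physical server.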
5. The computer system according to any one of claims 1 to 4, wherein, in addition to determining whether a virtual environment needs to be added according to the monitored loads, the control unit determines whether the virtual environment needs to be added according to the number of communication connections of an application processed by the computer system.
6. The computer system according to claim 5, wherein, when the control unit determines that the number of communication connections is insufficient for the addition of the virtual environment, the configuration change unit only releases the use restriction on the computer resources constituting the existing virtual environments and does not instruct the addition of the virtual environment.
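The two-factor decision of claims 5 and 6 — load alone triggers uncapping, but a new virtual environment is added only when the connection count would justify it — can be sketched as a single decision function. The per-VM connection figure is an assumed illustrative parameter, not a value from the publication.

```python
# Sketch of the claim-5/6 decision: scale-out happens only when both the
# load and the application's communication-connection count justify
# another virtual environment; with too few connections to spread over
# an extra VM, only the CPU caps on the existing VMs are released.

MIN_CONNECTIONS_PER_VM = 100  # assumed useful minimum per VM

def decide_action(load, high_load_threshold, connections, current_vms):
    """Return the autoscaling action for one evaluation cycle."""
    if load < high_load_threshold:
        return "none"
    # Would an extra VM receive enough connections to be worthwhile?
    if connections < MIN_CONNECTIONS_PER_VM * (current_vms + 1):
        return "release_caps_only"          # claim 6: uncap, no new VM
    return "add_vm_and_release_caps"        # claim 1: add VM, uncap meanwhile

print(decide_action(0.9, 0.8, 150, 2))  # release_caps_only
print(decide_action(0.9, 0.8, 400, 2))  # add_vm_and_release_caps
print(decide_action(0.5, 0.8, 400, 2))  # none
```

This captures the case where load is high but concentrated on few connections, so adding a VM would not relieve it: releasing the caps is the only useful action.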
7. An autoscaling method for a computer system comprising at least:
 a physical server having computer resources including a processor, a storage device, and a communication interface;
 a virtualization apparatus that virtualizes the computer resources and allocates them to a plurality of virtual environments; and
 a management apparatus that manages the plurality of virtual environments,
 wherein the management apparatus executes:
 a first step of monitoring loads on the plurality of virtual environments;
 a second step of determining, according to the monitored loads, whether a virtual environment needs to be added;
 a third step of instructing the virtualization apparatus to add a virtual environment according to the determination;
 a fourth step of releasing a use restriction on the computer resources constituting the existing virtual environments; and
 a fifth step of re-applying the released use restriction to the existing virtual environments after the addition of the virtual environment is completed.
8. The autoscaling method according to claim 7, wherein the computer resource in the fourth step is a processor.
9. The autoscaling method according to claim 7 or 8, wherein the management apparatus holds a use restriction ratio for the computer resources, and the third step further includes a step of setting the use restriction amount for the computer resources of the virtual environment to be added on the basis of the use restriction ratio.
10. The autoscaling method according to any one of claims 7 to 9, wherein the fifth step further includes: a step of shortening the monitoring interval; a step of completing the addition of the virtual environment if the high-load state continues throughout the shortened monitoring interval; and a step of canceling the addition and deleting the virtual environment already added if the high-load state does not continue.
11. The autoscaling method according to any one of claims 7 to 10, wherein the management apparatus executes, between the second step and the third step, a step of determining whether the virtual environment needs to be added according to the number of communication connections of an application processed by the computer system.
12. The autoscaling method according to claim 11, wherein, when the number of communication connections is insufficient for the addition of the virtual environment, the management apparatus only releases the use restriction on the computer resources constituting the existing virtual environments and does not instruct the addition of the virtual environment.
PCT/JP2014/079936 2014-11-12 2014-11-12 Computer system and autoscaling method for computer system Ceased WO2016075771A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2014/079936 WO2016075771A1 (en) 2014-11-12 2014-11-12 Computer system and autoscaling method for computer system


Publications (1)

Publication Number Publication Date
WO2016075771A1 true WO2016075771A1 (en) 2016-05-19

Family

ID=55953884

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2014/079936 Ceased WO2016075771A1 (en) 2014-11-12 2014-11-12 Computer system and autoscaling method for computer system

Country Status (1)

Country Link
WO (1) WO2016075771A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017203556A1 (en) * 2016-05-23 2017-11-30 株式会社日立製作所 Management computer and optimal value calculation method of system parameter
US10445198B2 (en) 2016-12-27 2019-10-15 Fujitsu Limited Information processing device that monitors a plurality of servers and failover time measurement method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002182934A (en) * 2000-10-02 2002-06-28 Internatl Business Mach Corp <Ibm> Method and device to intensify capacity restriction of logical partition formed in system
JP2008186210A (en) * 2007-01-30 2008-08-14 Hitachi Ltd Processor capping method for virtual machine system
JP2011258119A (en) * 2010-06-11 2011-12-22 Hitachi Ltd Cluster configuration management method, management device and program
JP2012164260A (en) * 2011-02-09 2012-08-30 Nec Corp Computer operation management system, computer operation management method, and computer operation management program



Similar Documents

Publication Publication Date Title
US9588789B2 (en) Management apparatus and workload distribution management method
US9069465B2 (en) Computer system, management method of computer resource and program
US9929931B2 (en) Efficient provisioning and deployment of virtual machines
JP5417287B2 (en) Computer system and computer system control method
RU2697700C2 (en) Equitable division of system resources in execution of working process
JP6190969B2 (en) Multi-tenant resource arbitration method
US20170017511A1 (en) Method for memory management in virtual machines, and corresponding system and computer program product
US20180253247A1 (en) Method and system for memory allocation in a disaggregated memory architecture
JP2016103179A (en) Allocation method for computer resource and computer system
CN110058966A (en) Method, equipment and computer program product for data backup
US20140215080A1 (en) Provisioning of resources
JP2014520346A5 (en)
CN113886089A (en) Task processing method, device, system, equipment and medium
KR101585160B1 (en) Distributed Computing System providing stand-alone environment and controll method therefor
CN111274033B (en) Resource deployment method, device, server and storage medium
WO2013082742A1 (en) Resource scheduling method, device and system
JP2020024646A (en) Resource reservation management device, resource reservation management method, and resource reservation management program
CN114546587A (en) A method for expanding and shrinking capacity of online image recognition service and related device
WO2018235739A1 (en) Information processing system and resource allocation method
US20210006472A1 (en) Method For Managing Resources On One Or More Cloud Platforms
CN109446062B (en) Method and device for software debugging in cloud computing service
US20160216986A1 (en) Infrastructure performance enhancement with adaptive resource preservation
CN117472570A (en) Methods, apparatus, electronic devices and media for scheduling accelerator resources
KR102014246B1 (en) Mesos process apparatus for unified management of resource and method for the same
WO2016075771A1 (en) Computer system and autoscaling method for computer system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14905693

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14905693

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP