
WO2016075771A1 - Computer system and autoscaling method for computer system - Google Patents

Computer system and autoscaling method for computer system

Info

Publication number
WO2016075771A1
Authority
WO
WIPO (PCT)
Prior art keywords
virtual environment
computer system
computer
virtual
addition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2014/079936
Other languages
French (fr)
Japanese (ja)
Inventor
健 寺村
謙太 山崎
陽子 平島
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Ltd
Original Assignee
Hitachi Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Ltd filed Critical Hitachi Ltd
Priority to PCT/JP2014/079936 priority Critical patent/WO2016075771A1/en
Publication of WO2016075771A1 publication Critical patent/WO2016075771A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • G — PHYSICS
    • G06 — COMPUTING OR CALCULATING; COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 — Arrangements for program control, e.g. control units
    • G06F9/06 — Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 — Multiprogramming arrangements
    • G06F9/50 — Allocation of resources, e.g. of the central processing unit [CPU]

Definitions

  • the present invention relates to a computer system that changes allocation of computer resources according to a load, and an autoscaling method in the computer system.
  • the effective resource addition method may differ depending on the state of the application in the computer system. For example, in a system consisting of one Web server and one DB server connected to each other, if the allowable number of simultaneous connections at the DB server is 100 and the number of simultaneous connections to the DB server configured at the Web server is 100, then even if one Web server is added, the total number of simultaneous connections to the DB server exceeds the allowable number at the DB server, so the service performance may deteriorate.
  • a first object of the present invention is to provide a technology relating to autoscaling for appropriately performing dynamic allocation of computer resources and maintaining the service level of business even when load fluctuations are difficult to predict.
  • a second object of the present invention is to provide a technology relating to autoscaling for appropriately allocating computer resources according to the state of an application in the computer system and maintaining the service level of business.
  • a computer system according to the present invention includes at least a physical server having computer resources including a processor, a storage device, and a communication interface; a virtualization device that virtualizes the computer resources and allocates them to a plurality of virtual environments; and a management device that manages the plurality of virtual environments. The management device has a monitoring unit that monitors the loads of the plurality of virtual environments, a control unit that determines whether a virtual environment needs to be added according to the monitored loads, and a configuration change unit that, in accordance with the determination of the control unit, instructs the virtualization device to add a virtual environment, releases the use restrictions on the computer resources constituting the existing virtual environments, and re-sets the released use restrictions on the existing virtual environments after the addition of the virtual environment is complete.
  • FIG. 1 is a block diagram showing the configuration of a computer system according to Example 1 of the present invention. FIG. 2 is a diagram showing the logical connection relationship and roles of virtual servers. FIG. 3 is a diagram showing system construction request information. FIG. 4 is a diagram showing system configuration information. FIG. 5 is a diagram showing physical server performance information. FIG. 6 is a diagram showing service performance information. FIG. 7 is a diagram showing system performance model information. FIG. 8 is a diagram showing scaling rule information. FIG. 9 is a diagram showing a monitoring flowchart according to Example 1 of the present invention. FIG. 10 is a diagram showing a scale-out flowchart according to Example 1 of the present invention. FIG. 11 is a diagram showing connection information. FIG. 12 is a diagram showing a parameter determination flowchart according to Example 2 of the present invention.
  • Example 1 and Example 2 will be described with reference to the drawings.
  • FIG. 1 is a block diagram illustrating a configuration of a computer system according to the first embodiment of the present invention.
  • the computer system includes a management unit 11, a business unit 13, and a network 16.
  • the management unit 11 includes a control unit 18 and a management information group 19.
  • the control unit 18 includes a system monitoring unit 181, a system configuration calculation unit 182, a system configuration change unit 183, and a management software control unit 184.
  • the management information group 19 includes system construction request information 191, system configuration information 192, physical server performance information 194, service performance information 195, system performance model information 196, and scaling rule information 197.
  • the management unit 11 is executed by the physical server 12.
  • the physical server 12 includes a CPU 121, a storage device 122, and a communication IF 123. Each function is realized by the CPU 121 executing the control unit 18 stored in the storage device 122.
  • the storage device 122 also stores the management information group 19.
  • the physical server 12 is connected to the network 16 via the communication IF 123.
  • one or more business units 13 exist in the computer system, and each business unit includes one or more virtual servers (hereinafter sometimes referred to as “VM”).
  • FIG. 1 illustrates an example in which three virtual servers of the virtual server 131, the virtual server 132, and the virtual server 133 operate as one business unit.
  • the virtual server plays various roles depending on the work executed in the computer system.
  • for example, the virtual server 131 operates as an LB (load balancer), the virtual server 132 operates as a Web server, and the virtual server 133 operates as a DB server.
  • the hypervisor 15 virtualizes the computer resources of the physical server 14 and plays a role of providing a virtual server in the business unit 13.
  • the physical server 14 includes a CPU 141, a storage device 142, and a communication IF 143.
  • the business unit 13 is stored in the storage device 142 and executed by the CPU 141.
  • the business unit 13 is connected to the network 16 via the communication IF 143.
  • FIG. 2 is an explanatory diagram illustrating an example of a logical connection relationship and roles of virtual servers in the business unit 13.
  • the LB 21 operates as a load balancer and distributes the received request message to any one of the Web 22, the Web 23, and the Web 24.
  • Web22, Web23, and Web24 are all Web servers.
  • the DB 25 is a DB server and accepts access from the Web 22, Web 23, and Web 24.
  • the Web 22 to Web 24 are deployed as the initial number of Web servers (initial number) that receive the request messages distributed from the LB 21, and are increased or decreased as necessary (the dotted-line portion in FIG. 2 indicates an added server).
  • FIG. 3 is an explanatory diagram showing an example of the system construction request information 191.
  • the system construction request information 191 includes a role 302 indicating the role of each virtual server (VM) deployed in the business unit 13, a CPU core allocation number 303 indicating the number of CPU cores allocated for each role, a one-core frequency 304 indicating the CPU operating frequency per core, and an initial number 305 indicating the number of units initially deployed for each role. These values are set by the administrator.
  • the system configuration changing unit 183 deploys the VM in the business unit 13 in accordance with the system construction request information 191 before step S1001 (FIG. 9) of the monitoring flowchart described later starts.
  • if the role of the VM is a capping target (capping target 902 in FIG. 8), the CPU core allocation number and the capped frequency are calculated and set according to the following formulas.
  • capping means limiting the usage rate (usage amount) of computer resources by setting an upper limit.
  • CPU core allocation number to be set = CPU core allocation number 303 / (1 − capping ratio 903)
  • Capped frequency to be set = CPU core allocation number to be set × one-core frequency 304 × capping ratio 903
  • if the role of the VM is not a capping target, the CPU core allocation number and the capped frequency are set as follows.
  • CPU core allocation number to be set = CPU core allocation number 303
  • Capped frequency to be set = 0
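As an illustrative sketch (not part of the patent text; the function and variable names are hypothetical, with numeric suffixes mirroring the reference numerals 303, 304, and 903), the deployment-time calculation above can be expressed as follows:

```python
def deployment_params(core_alloc_303, one_core_freq_304, capping_ratio_903):
    """Compute the CPU core allocation and capped frequency set at deployment.

    For a capping-target role, extra cores are allocated so that the
    uncapped portion still provides the requested capacity:
        set_cores   = core_alloc / (1 - capping_ratio)
        capped_freq = set_cores * one_core_freq * capping_ratio
    A non-target role simply gets core_alloc cores and a capped
    frequency of 0 (capping_ratio_903 = 0 reproduces that case).
    """
    set_cores = core_alloc_303 / (1 - capping_ratio_903)
    capped_freq = set_cores * one_core_freq_304 * capping_ratio_903
    return set_cores, capped_freq

# 4 requested cores at 3000 MHz per core with a 50% capping ratio:
# 8 cores are allocated and 12000 MHz of them is capped, leaving
# 4 cores' worth (12000 MHz) usable until the capping is released.
print(deployment_params(4, 3000, 0.5))  # (8.0, 12000.0)
```

Releasing the 12000 MHz cap later (step S1102) instantly increases the usable capacity without waiting for a new VM to boot, which is the point of the over-allocation.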
  • FIG. 4 is an explanatory diagram showing an example of the system configuration information 192.
  • the system configuration information 192 includes a server name 401 that is an identifier for identifying each VM, a role 402 indicating the role of the VM, a CPU core allocation number 403 indicating the number of CPU cores allocated to the VM, a one-core frequency 404 indicating the CPU operating frequency per core, a capped frequency 405 indicating the capped CPU operating frequency, and an operation destination server 407 that is an identifier of the physical server on which the VM operates.
  • the system configuration information 192 is set by the control unit 18 in accordance with the contents of the system configuration change.
  • FIG. 5 is an explanatory diagram showing an example of physical server performance information.
  • the physical server performance information 194 includes a date and time 601 indicating the date and time when performance was monitored, a server name 602 that is an identifier of the monitored physical server, a CPU usage rate 603 indicating the CPU usage rate of the monitored physical server, a memory usage rate 604 indicating the memory usage rate, a network flow rate 605 indicating the communication flow rate of the network, and a disk flow rate 606 indicating the input/output amount of the disk.
  • the system monitoring unit 181 monitors the performance of the physical server 14 and records the monitored performance information in the physical server performance information 194.
  • FIG. 6 is an explanatory diagram showing an example of service performance information.
  • the service performance information 195 includes a date and time 701 indicating the date and time when performance was monitored, a request number 702 indicating the number of requests received by the business unit 13 per unit time, and a response time 703 indicating the response time to a request.
  • the system monitoring unit 181 monitors the performance of the business unit 13 and records the monitored performance information in the service performance information 195.
  • FIG. 7 is an explanatory diagram showing an example of system performance model information.
  • the system performance model information 196 includes a total resource amount 801 indicating the total number of CPU cores of the system and a performance 802 indicating the number of requests that the system can process with that total resource amount. Before step S1001 (FIG. 9) of the monitoring flowchart described later starts, the administrator determines the values shown in the system performance model information 196 by a method such as a verification experiment or literature reference, and sets those values in the system performance model information 196.
  • FIG. 8 is an explanatory diagram showing an example of scaling rule information.
  • the scaling rule information 197 includes a capping target 902 indicating the role of the VMs to be CPU-capped, a capping ratio 903 indicating the ratio of capping set on those VMs, a load increase monitoring period 904 indicating the monitoring period used to decide whether to start allocating requests to a newly added VM, a trigger coefficient 906 used in the trigger condition that determines whether to start VM addition processing, a normal-time monitoring interval 907, and a high-load monitoring interval 908 that is the monitoring interval at the time of high load. These values are set by the administrator.
  • FIG. 9 is a flowchart (monitoring flowchart) showing a monitoring process for the business unit 13 executed by the control unit 18.
  • in step S1001, the system monitoring unit 181 monitors the business unit 13 at the interval set as the normal-time monitoring interval 907 (FIG. 8) and records the monitored performance information in the service performance information 195 (FIG. 6): the monitored date and time 701, the number of requests 702 received by the business unit 13 per unit time, and the response time 703 for the requests.
  • similarly, the system monitoring unit 181 monitors the physical server 14 and records the monitored performance information in the physical server performance information 194 (FIG. 5): the monitored date and time 601, server name 602, CPU usage rate 603, memory usage rate 604, network flow rate 605, and disk flow rate 606.
  • in step S1002, the management software control unit 184 determines whether the scale-out trigger condition is satisfied according to the following inequality.
  • Number of requests > System capacity × Trigger coefficient
  • the number of requests is the latest number of requests 702 recorded in the service performance information 195 (FIG. 6).
  • the trigger coefficient is the trigger coefficient 906 of the scaling rule information 197 (FIG. 8).
  • the system capacity is obtained by searching the system performance model information 196 (FIG. 7) using the total CPU core allocation number of the system as a key and taking the value of the corresponding performance 802 (the number of requests that the system can process).
  • the total CPU core allocation number of the system is the sum of the CPU core allocation numbers 403 of the records whose role 402 is “Web” in the system configuration information 192 (FIG. 4).
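Under the reading that the trigger compares the observed request count against the modeled capacity scaled by the trigger coefficient 906, step S1002 can be sketched as follows (illustrative names only; the dictionary lookup stands in for searching the system performance model information of FIG. 7):

```python
def scale_out_triggered(num_requests, total_web_cores, perf_model_196, trigger_coeff_906):
    """Return True when the scale-out trigger condition is satisfied.

    perf_model_196 maps a total resource amount 801 (CPU core count)
    to the performance 802 (number of requests the system can process).
    """
    system_capacity = perf_model_196[total_web_cores]
    return num_requests > system_capacity * trigger_coeff_906

# Hypothetical model: 4 Web cores handle 1000 req/s, 8 cores 1900 req/s.
model = {4: 1000, 8: 1900}
print(scale_out_triggered(900, 4, model, 0.8))  # True  (900 > 800)
print(scale_out_triggered(700, 4, model, 0.8))  # False (700 <= 800)
```

A trigger coefficient below 1 starts the scale-out before the modeled capacity is actually reached, buying time for the (slow) VM addition.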
  • when it is determined that the condition is satisfied, the management software control unit 184 executes the processing from step S1003 onward as the scale-out processing. When it is determined that the condition is not satisfied, the scale-out processing is not started and the monitoring of step S1001 continues.
  • FIG. 10 is a flowchart (scale-out flowchart) showing processing executed by the control unit 18 as scale-out processing after step S1003 in FIG.
  • in step S1101, the system configuration changing unit 183 starts the process of deploying (making available) a virtual server (VM) on the hypervisor 15 for the business unit 13, in accordance with the information recorded in the system construction request information 191 (FIG. 3), using the values of the CPU core allocation number 303 and the one-core frequency 304 of the record whose role 302 is “Web”.
  • the CPU core allocation number set for the virtual server (VM) to be added is the CPU core allocation number 303 divided by (1 − capping ratio 903) (scaling rule information 197 in FIG. 8).
  • the capping ratio set for the virtual server (VM) to be added is the capping ratio 903 of the scaling rule information 197 (FIG. 8) corresponding to the role of the VM to be added (capping target 902 in FIG. 8); for example, if the role is “Web”, the capping ratio is 50%.
  • in step S1102, the system configuration change unit 183 executes the process of canceling the CPU capping of the operating virtual servers.
  • the operating virtual servers are found by searching for records whose role 402 is “Web” in the system configuration information 192 (FIG. 4).
  • the capping amount to be released is the value obtained by multiplying the CPU core allocation number 303 of the newly added virtual server by the one-core frequency 304. For example, when the capping amount to be released is 12000 MHz, the Web server recorded in the system configuration information 192 (FIG. 4) is VM1, and the capped frequency 405 of VM1 is 12000 MHz, the 12000 MHz capping set on VM1 is released.
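The release step can be sketched as follows (the record fields mirror FIG. 4, but the dict-based structure and names are illustrative): the released amounts are remembered so they can be re-set in step S1106.

```python
def release_capping(system_config_192, role, release_mhz):
    """Release up to release_mhz of CPU capping from operating VMs of a role.

    system_config_192 is a list of records such as
    {"server": "VM1", "role": "Web", "capped_freq": 12000}.
    Returns the per-VM amounts released, for restoration in step S1106.
    """
    released = {}
    remaining = release_mhz
    for rec in system_config_192:
        if rec["role"] != role or remaining <= 0:
            continue
        amount = min(rec["capped_freq"], remaining)
        if amount > 0:
            rec["capped_freq"] -= amount
            remaining -= amount
            released[rec["server"]] = amount
    return released

# The 12000 MHz example: VM1's cap drops to 0 the moment this runs.
config = [{"server": "VM1", "role": "Web", "capped_freq": 12000}]
print(release_capping(config, "Web", 12000))  # {'VM1': 12000}
print(config[0]["capped_freq"])               # 0
```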
  • in step S1103, the system monitoring unit 181 changes the monitoring interval from the normal-time monitoring interval 907 to the high-load monitoring interval 908 and monitors the business unit 13 at the set interval, recording the same information as in step S1001 (FIG. 9) in the service performance information 195 (FIG. 6). Similarly, the physical server 14 is monitored at the interval set as the high-load monitoring interval 908, and the monitored information is recorded in the physical server performance information 194 (FIG. 5).
  • in step S1104, the management software control unit 184 determines whether the high load has continued for a predetermined period.
  • the value of the load increase monitoring period 904 of the scaling rule information 197 (FIG. 8) is used as the predetermined period, and whether the high load continues is determined by whether the trigger condition shown in step S1002 (FIG. 9) is still satisfied.
  • in step S1105, the management software control unit 184 changes the setting of the LB (load balancer) and starts allocating requests to the newly added VM. However, if the VM addition started in step S1101 has not yet completed, the management software control unit 184 changes the LB setting after waiting for the addition to complete.
  • in step S1109, the system configuration change unit 183 deletes the VM added in step S1101. However, if the VM addition started in step S1101 has not yet completed, the system configuration changing unit 183 cancels the addition.
  • the system configuration change unit 183 first starts the virtual server (VM) addition process (step S1101), and subsequently releases the CPU capping (step S1102).
  • releasing CPU capping takes effect faster than adding or deleting VMs and is therefore suitable as a response to a sudden load increase or decrease. By canceling CPU capping in parallel with the VM addition processing, a state close to that after the VM addition is reproduced in advance to cope with the load increase (fluctuation).
  • the system monitoring unit 181 monitors the load at the high-load monitoring interval (step S1103). If the high-load state does not continue (step S1104: NO), the process is canceled without adding the VM (step S1109). If the high-load state continues (step S1104: YES), the management software control unit 184 responds by assigning load to the added VM (step S1105).
  • in step S1106, the system configuration changing unit 183 re-sets (returns) the capping on the VMs whose capping was canceled in step S1102.
  • the target VMs for capping are the same as in step S1102, and the amount of capping is the same as the amount released in step S1102.
  • in step S1107, the system monitoring unit 181 returns the monitoring interval to the normal-time monitoring interval 907 and monitors the business unit 13 at the set interval, recording the same information as in step S1001 (FIG. 9) in the service performance information 195 (FIG. 6).
  • in step S1108, the system configuration calculation unit 182 adds the information of the VM added in step S1101 to the system configuration information 192 (FIG. 4). However, if step S1109 was executed, the information is not added and the process ends.
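Putting the steps of FIG. 10 together, the flow can be sketched with a stand-in controller (all method names here are hypothetical, chosen only to label the steps; the patent does not define such an interface):

```python
class DemoController:
    """Minimal stand-in so the scale-out flow can be exercised."""
    def __init__(self, load_stays_high):
        self.load_stays_high = load_stays_high
        self.log = []
    def start_vm_add(self):        self.log.append("S1101"); return "VM2"
    def release_capping(self):     self.log.append("S1102"); return {"VM1": 12000}
    def high_load_continues(self): self.log.append("S1104"); return self.load_stays_high
    def assign_load(self, vm):     self.log.append("S1105")
    def restore_capping(self, r):  self.log.append("S1106")
    def cancel_vm_add(self, vm):   self.log.append("S1109")

def scale_out(ctrl):
    vm = ctrl.start_vm_add()           # S1101: start VM addition (slow)
    released = ctrl.release_capping()  # S1102: instant headroom while the VM boots
    if ctrl.high_load_continues():     # S1103-S1104: watch at the high-load interval
        ctrl.assign_load(vm)           # S1105: LB starts routing to the new VM
        ctrl.restore_capping(released) # S1106: re-set the released capping
        return True                    # S1107-S1108: normal monitoring, record the VM
    ctrl.cancel_vm_add(vm)             # S1109: load subsided; cancel the addition
    return False

print(scale_out(DemoController(load_stays_high=True)))   # True
print(scale_out(DemoController(load_stays_high=False)))  # False
```

The key design point is that S1101 and S1102 run back to back: the fast capping release covers the load while the slow VM addition is still in flight.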
  • the computer resource to be capped has been described as a CPU. However, other computer resources (memory, disk, network, etc.) may be targeted.
  • a second embodiment of the present invention will be described with reference to FIGS.
  • in Example 1, it is determined whether the scale-out trigger condition is satisfied (step S1002 in FIG. 9), and if the condition is satisfied (YES), the VM addition processing is started (step S1101 in FIG. 10).
  • however, the addition of a VM may not be effective. For example, in a system consisting of one Web server and one DB server connected to each other, if the allowable number of simultaneous connections at the DB server is 100 and the number of simultaneous connections to the DB server configured at the Web server is 100, then even if one Web server is added, the service performance is not improved because the total number of simultaneous connections to the DB server exceeds the allowable number at the DB server.
  • a resource addition method at the time of load increase is selected using information (connection information) indicating the state of the application.
  • connection information shown in FIG. 11 is used in addition to the system configuration shown in FIG.
  • the flowchart shown in FIG. 12 is used.
  • FIG. 11 is an explanatory diagram showing an example of connection information.
  • the connection information 198 includes a role 1201 indicating the role of the VM and a connection number 1202 indicating the number of communication connections set in the VM to which the role is assigned. This connection information 198 is set by the administrator.
  • the connection information 198 is stored in the management information group 19 (indicated by a dotted frame in FIG. 1).
  • FIG. 12 is a flowchart (parameter determination flowchart) showing the parameter determination process executed by the control unit 18. Steps S1001 and S1002 are the same as in Example 1. However, if it is determined in step S1002 that the scale-out trigger condition is satisfied (YES), the management software control unit 184 executes step S1303 instead of step S1003. In step S1303, the management software control unit 184 determines whether the number of connections from the Web server group still falls within the number of connections accepted by the DB server group even when one more Web server (a virtual server) is added, using the following inequality.
  • Number of connections set in Web server × (Number of operating Web servers + 1) ≦ Number of connections set in DB server × Number of operating DB servers
  • the number of connections set in the Web server is the number of connections 1202 of the record whose role 1201 is “Web” in the connection information 198.
  • the number of connections set in the DB server is the number of connections 1202 of the record whose role 1201 is “DB” in the connection information 198.
  • the number of operated Web servers is the number of records whose role 402 is “Web” in the system configuration information 192 (FIG. 4).
  • the number of operated DB servers is the number of records whose role 402 is “DB” in the system configuration information 192 (FIG. 4).
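The check in step S1303 can be sketched as follows (illustrative names; the per-role connection counts mirror the connection number 1202 of FIG. 11):

```python
def web_addition_effective(conn_1202, n_web, n_db):
    """Step S1303: does adding one Web server stay within DB capacity?

    conn_1202 maps a role to its configured connection count, e.g.
    {"Web": 100, "DB": 100}; n_web and n_db are the operating counts.
    """
    return conn_1202["Web"] * (n_web + 1) <= conn_1202["DB"] * n_db

conn = {"Web": 100, "DB": 100}
print(web_addition_effective(conn, 1, 1))  # False: 100*2 > 100*1, so only release capping
print(web_addition_effective(conn, 1, 3))  # True: 200 <= 300, so scale out
```

The first case reproduces the one-Web/one-DB example above, where adding a Web server cannot improve service performance.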
  • when the determination in step S1303 is satisfied (YES), the scale-out processing from step S1003 (FIG. 10) onward is executed. When it is not satisfied (NO), in step S1102 the system configuration change unit 183 executes only the capping release processing of the existing virtual servers, and thereafter ends the processing.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Computer And Data Communications (AREA)

Abstract

A computer system having a varying load, wherein computer resources are appropriately allocated according to the load even when load variations are difficult to predict and even when different computer resource allocation methods are effective for different application conditions. In order to accomplish this feature, the computer system is at least provided with: a physical server which has computer resources including a processor, a storage device, and a communication interface; a virtualization device which virtualizes computer resources and allocates virtual computer resources to a plurality of virtual environments; and a management device which manages the plurality of virtual environments. The management device comprises: a monitoring unit which monitors loads on the plurality of virtual environments; a control unit which determines, based on the monitored loads, whether it is necessary to add another virtual environment; and a configuration changing unit which requests addition of a virtual environment in accordance with the determination made by the control unit, removes usage restrictions on computer resources that constitute existing virtual environments, and, upon completion of the addition of a virtual environment, restores the usage restrictions on the existing virtual environments.

Description

Computer system and autoscaling method for computer system

 The present invention relates to a computer system that changes the allocation of computer resources according to a load, and to an autoscaling method in the computer system.

 In a computer system used by an enterprise, the load varies with the time of day and the season. Techniques have been proposed for dynamically controlling the allocation of resources (computer resources) in order to maintain service levels such as system response time even when the load fluctuates.
 For example, Patent Document 1 discloses a technique that, in a virtualized environment, predicts future load fluctuations for a business system composed of a plurality of business clusters and, based on the predicted result, allocates the resources needed to follow the load fluctuations by combining means for changing a plurality of hardware resources.
 Autoscaling is a function that automatically increases or decreases the number of virtual machines under a load balancer and the amount of resources allocated to the virtual machines according to the load on the virtual machines, such as CPU usage rate and network traffic.

Japanese Patent No. 5332065

 However, it is sometimes difficult to predict load fluctuations in a computer system. For example, in an online event-ticket sales system, a huge volume of accesses exceeding any prior prediction may occur and concentrate. In the prior art, a resource addition method such as adding VMs is scheduled based on a prediction of load increase, and when the load exceeds the prediction, the VM addition may not be completed in time, degrading service performance.

 Also, the effective resource addition method may differ depending on the state of the application in the computer system. For example, in a system consisting of one Web server and one DB server connected to each other, if the allowable number of simultaneous connections at the DB server is 100 and the number of simultaneous connections to the DB server configured at the Web server is 100, then even if one Web server is added, the total number of simultaneous connections to the DB server exceeds the allowable number at the DB server, and the service performance may deteriorate.

 To solve these problems, a first object of the present invention is to provide a technology relating to autoscaling for appropriately performing dynamic allocation of computer resources and maintaining the service level of business even when load fluctuations are difficult to predict.
 A second object of the present invention is to provide a technology relating to autoscaling for appropriately allocating computer resources according to the state of an application in the computer system and maintaining the service level of business.

 A computer system according to the present invention includes at least a physical server having computer resources including a processor, a storage device, and a communication interface; a virtualization device that virtualizes the computer resources and allocates them to a plurality of virtual environments; and a management device that manages the plurality of virtual environments. The management device has a monitoring unit that monitors the loads of the plurality of virtual environments, a control unit that determines whether a virtual environment needs to be added according to the monitored loads, and a configuration change unit that, in accordance with the determination of the control unit, instructs the virtualization device to add a virtual environment, releases the use restrictions on the computer resources constituting the existing virtual environments, and re-sets the released use restrictions on the existing virtual environments after the addition of the virtual environment is complete.

 According to the present invention, even when no prediction information on load fluctuations is available, and likewise according to the state of the application in the computer system, dynamic allocation of computer resources can be performed effectively, realizing autoscaling that maintains the service level of business.

FIG. 1 is a block diagram showing the configuration of a computer system according to Example 1 of the present invention. FIG. 2 is a diagram showing the logical connection relationship and roles of virtual servers. FIG. 3 is a diagram showing system construction request information. FIG. 4 is a diagram showing system configuration information. FIG. 5 is a diagram showing physical server performance information. FIG. 6 is a diagram showing service performance information. FIG. 7 is a diagram showing system performance model information. FIG. 8 is a diagram showing scaling rule information. FIG. 9 is a diagram showing a monitoring flowchart according to Example 1 of the present invention. FIG. 10 is a diagram showing a scale-out flowchart according to Example 1 of the present invention. FIG. 11 is a diagram showing connection information. FIG. 12 is a diagram showing a parameter determination flowchart according to Example 2 of the present invention.

 Hereinafter, Embodiment 1 and Embodiment 2 will be described as embodiments of the present invention with reference to the drawings.

 FIG. 1 is a block diagram showing the configuration of a computer system according to Embodiment 1 of the present invention.
 The computer system comprises a management unit 11, a business unit 13, and a network 16.

 The management unit 11 comprises a control unit 18 and a management information group 19. The control unit 18 comprises a system monitoring unit 181, a system configuration calculation unit 182, a system configuration change unit 183, and a management software control unit 184. The management information group 19 comprises system construction request information 191, system configuration information 192, physical server performance information 194, service performance information 195, system performance model information 196, and scaling rule information 197.

 The management unit 11 is executed by a physical server 12. The physical server 12 includes a CPU 121, a storage device 122, and a communication IF 123. Each function is realized by the CPU 121 executing the control unit 18 stored in the storage device 122. The storage device 122 also stores the management information group 19. The physical server 12 is connected to the network 16 via the communication IF 123.

 One or more business units 13 exist in the computer system, each composed of one or more virtual servers (hereinafter sometimes referred to as "VMs"). FIG. 1 shows an example in which three virtual servers (virtual server 131, virtual server 132, and virtual server 133) operate as one business unit. A virtual server plays various roles depending on the business executed on the computer system. For example, the virtual server 131 operates as an LB (load balancer), the virtual server 132 as a Web server, and the virtual server 133 as a DB server.

 The hypervisor 15 virtualizes the computer resources of a physical server 14 and provides the virtual servers of the business unit 13. The physical server 14 includes a CPU 141, a storage device 142, and a communication IF 143. The business unit 13 is stored in the storage device 142 and executed by the CPU 141. The business unit 13 is connected to the network 16 via the communication IF 143.

 FIG. 2 is an explanatory diagram showing an example of the logical connection relationships and roles of the virtual servers in the business unit 13. The LB 21 operates as a load balancer and distributes received request messages to one of Web 22, Web 23, and Web 24, all of which are Web servers. The DB 25 is a DB server that accepts accesses from Web 22, Web 23, and Web 24.
 Web 22 to Web 24 are set as the initial number of deployed Web servers (initial count) that receive the request messages distributed by the LB 21, and this number is increased or decreased as necessary (the dotted-line portion in FIG. 2 indicates an increase).

 FIG. 3 is an explanatory diagram showing an example of the system construction request information 191. The system construction request information 191 includes a role 302 indicating the role of each virtual server (VM) deployed in the business unit 13, a CPU core allocation count 303 indicating the number of CPU cores allocated per role, a per-core frequency 304 indicating the CPU operating frequency per core, and an initial count 305 indicating the number of VMs initially deployed per role. These values are set by the administrator.

 The system configuration change unit 183 deploys VMs in the business unit 13 in accordance with the system construction request information 191 before step S1001 of the monitoring flowchart (FIG. 9) described later is started.
 However, for a VM whose role is listed in the capping target 902 of the scaling rule information (FIG. 8) described later, the CPU core allocation count and the capped frequency are calculated and set according to the following formulas. Here, capping means limiting (placing an upper bound on) the usage rate (usage amount) of a computer resource.
  CPU core allocation count to set = CPU core allocation count 303 / (1 - capping ratio 903)
  Capped frequency to set = CPU core allocation count to set × per-core frequency 304 × capping ratio 903

 On the other hand, for a VM whose role is not listed in the capping target 902, the CPU core allocation count and the capped frequency are calculated and set according to the following formulas.
  CPU core allocation count to set = CPU core allocation count 303
  Capped frequency to set = 0
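As an illustrative sketch of the two cases above (not part of the patent; the function and parameter names are assumptions), the initial CPU core allocation count and capped frequency could be computed as follows:

```python
def initial_capping(core_count, per_core_mhz, capping_ratio, is_capping_target):
    # Inputs correspond to CPU core allocation count 303, per-core frequency 304,
    # and capping ratio 903; the names themselves are illustrative.
    if is_capping_target:
        # Over-provision cores so the capped share equals the requested allocation.
        cores_to_set = core_count / (1 - capping_ratio)
        capped_mhz = cores_to_set * per_core_mhz * capping_ratio
    else:
        cores_to_set = core_count
        capped_mhz = 0  # no cap applied
    return cores_to_set, capped_mhz

# 4 requested cores at 3000 MHz with a 50% capping ratio:
print(initial_capping(4, 3000, 0.5, True))   # → (8.0, 12000.0)
print(initial_capping(4, 3000, 0.5, False))  # → (4, 0)
```

With a 50% capping ratio, the VM is given 8 cores but capped at 12000 MHz, the capacity of the originally requested 4 cores; releasing the cap later instantly doubles the usable capacity without redeploying the VM.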

 FIG. 4 is an explanatory diagram showing an example of the system configuration information 192. The system configuration information 192 includes a server name 401, which is an identifier for identifying each VM, a role 402 indicating the role of the VM, a CPU core allocation count 403 indicating the number of CPU cores allocated to the VM, a per-core frequency 404 indicating the CPU operating frequency per core, a capped frequency 405 indicating the capped CPU operating frequency, and an operation destination server 407, which is the identifier of the physical server on which the VM operates. The system configuration information 192 is set by the control unit 18 in accordance with the contents of system configuration changes.

 FIG. 5 is an explanatory diagram showing an example of the physical server performance information. The physical server performance information 194 includes a date and time 601 indicating when the performance was monitored, a server name 602, which is the identifier of the monitored physical server, a CPU usage rate 603 and a memory usage rate 604 of the monitored physical server, a network flow rate 605 indicating the network traffic volume, and a disk flow rate 606 indicating the disk I/O volume. The system monitoring unit 181 monitors the performance of the physical server 14 and records the monitored performance information in the physical server performance information 194.

 FIG. 6 is an explanatory diagram showing an example of the service performance information. The service performance information 195 includes a date and time 701 indicating when the performance was monitored, a request count 702 indicating the number of requests received by the business unit 13 per unit time, and a response time 703 for those requests. The system monitoring unit 181 monitors the performance of the business unit 13 and records the monitored performance information in the service performance information 195.

 FIG. 7 is an explanatory diagram showing an example of the system performance model information. The system performance model information 196 includes a total resource amount 801 indicating the total number of CPU cores of the system and a performance 802 indicating the number of requests the system can process at that total resource amount 801. These values are set by the administrator. Before step S1001 of the monitoring flowchart (FIG. 9) described later is started, the administrator determines the values shown in the system performance model information 196 by a method such as a verification experiment or literature reference, and sets those values in the system performance model information 196.

 FIG. 8 is an explanatory diagram showing an example of the scaling rule information. The scaling rule information 197 includes a capping target 902 indicating the roles of the VMs subject to CPU capping, a capping ratio 903 indicating the capping ratio set on those VMs, a load increase monitoring period 904 indicating the monitoring period used to decide whether to start allocating requests to a newly added VM, a trigger coefficient 906 related to the trigger condition used to decide whether to start VM addition processing, a normal-time monitoring interval 907, and a high-load monitoring interval 908. These values are set by the administrator.

 FIG. 9 is a flowchart (monitoring flowchart) showing the monitoring processing that the control unit 18 executes on the business unit 13.
 In step S1001, the system monitoring unit 181 monitors the business unit 13 at the interval set as the normal-time monitoring interval 907 (FIG. 8) and records the monitored performance results in the service performance information 195 (FIG. 6). Specifically, it records the monitored date and time 701, the number of requests 702 received by the business unit 13 per unit time, and the response time 703 for those requests. The system monitoring unit 181 also monitors the physical server 14 and records the monitored performance results in the physical server performance information 194 (FIG. 5). Specifically, it records the monitored date and time 601, server name 602, CPU usage rate 603, memory usage rate 604, network flow rate 605, and disk flow rate 606.

 Next, in step S1002, the management software control unit 184 determines whether the scale-out trigger condition is satisfied according to the following expression:
  number of requests > trigger coefficient × system capacity
 Here, the number of requests is the latest request count 702 recorded in the service performance information 195 (FIG. 6). The trigger coefficient is the trigger coefficient 906 of the scaling rule information 197 (FIG. 8). For the system capacity, the system performance model information 196 (FIG. 7) is searched using the system's total CPU core allocation count as a key, and the value of the corresponding performance 802 (the number of requests the system can process) is used. The system's total CPU core allocation count is the sum of the CPU core allocation counts 403 of the records in the system configuration information 192 (FIG. 4) whose role 402 is "Web".
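A minimal sketch of the trigger check in step S1002; the names (`perf_model`, `scale_out_triggered`) are illustrative assumptions, not identifiers from the patent.

```python
def scale_out_triggered(request_count, trigger_coeff, total_web_cores, perf_model):
    # perf_model maps a total CPU core count to the number of requests the
    # system can process (system performance model information 196).
    system_capacity = perf_model[total_web_cores]
    return request_count > trigger_coeff * system_capacity

# Illustrative model: 8 Web cores can process 1000 requests per unit time.
perf_model = {8: 1000, 16: 2000}
print(scale_out_triggered(900, 0.8, 8, perf_model))  # → True  (900 > 800)
print(scale_out_triggered(700, 0.8, 8, perf_model))  # → False (700 <= 800)
```

A trigger coefficient below 1 fires the scale-out before the measured load reaches the modeled capacity, giving the VM addition time to complete.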

 If the management software control unit 184 determines in step S1002 that the trigger condition is satisfied (YES), it executes the processing from step S1003 onward as scale-out processing. If it determines that the condition is not satisfied (NO), the processing from step S1001 is repeated.

 FIG. 10 is a flowchart (scale-out flowchart) showing the processing that the control unit 18 executes as the scale-out processing from step S1003 onward in FIG. 9.
 In step S1101, the system configuration change unit 183 starts the processing of adding a virtual server (VM) to the business unit 13 in accordance with the information recorded in the system construction request information 191 (FIG. 3). Specifically, using the values of the CPU core allocation count 303 and the per-core frequency 304 of the record in the system construction request information 191 (FIG. 3) whose role 302 is "Web", it starts deploying (making available) a virtual server (VM) on the hypervisor 15. However, in order to leave headroom for further load increases, the CPU core allocation count set on the VM is the CPU core allocation count 303 divided by one minus the capping ratio 903 (scaling rule information 197, FIG. 8). For example, if the CPU core allocation count 303 is 4 and the capping ratio 903 is 50%, the CPU core allocation count set on the VM is 8 (= 4 / (1 - 0.5) = 4 / 0.5). The capping ratio set on the added VM is the capping ratio 903 in the scaling rule information 197 (FIG. 8) corresponding to the role of the VM being added (capping target 902 in FIG. 8); for example, a capping ratio of 50% when the role is "Web".

 In step S1102, the system configuration change unit 183 executes processing that releases the CPU capping of the virtual servers already in operation. The virtual servers already in operation are found by searching the system configuration information 192 (FIG. 4) for records whose role 402 is "Web". The capping amount to release is the CPU core allocation count 303 of the newly added virtual server multiplied by the per-core frequency 304. For example, if the capping amount to release is 12000 MHz, the Web server recorded in the system configuration information 192 (FIG. 4) is VM1, and the capped frequency 405 of VM1 is 12000 MHz, the processing is as follows: the value (0) obtained by subtracting the capping amount to release (12000 MHz = 3000 MHz × 4) from the capped frequency 405 of VM1 (12000 MHz) is set on VM1 as its new capped frequency, and that value is recorded in the capped frequency 405 of VM1.
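The cap-release calculation of step S1102 can be sketched as follows, under the assumption that a capped frequency of 0 means no cap; the function and parameter names are illustrative.

```python
def release_capping(existing_capped_mhz, new_vm_cores, per_core_mhz):
    # Release amount = cores of the new VM x per-core frequency
    # (fields 303 and 304 of the system construction request information).
    release_amount = new_vm_cores * per_core_mhz
    # A capped frequency of 0 is taken to mean "no cap"; never go negative.
    return max(existing_capped_mhz - release_amount, 0)

# Example from the text: VM1 capped at 12000 MHz, new VM has 4 cores of 3000 MHz.
print(release_capping(12000, 4, 3000))  # → 0 (cap fully released)
```

Releasing exactly the capacity of the VM being deployed means the existing servers temporarily stand in for the new VM while it is still starting up.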

 In step S1103, the system monitoring unit 181 changes the monitoring interval from the normal-time monitoring interval 907 to the high-load monitoring interval 908 and monitors the business unit 13 at that interval, recording the same information as in step S1001 (FIG. 9) in the service performance information 195 (FIG. 6). Similarly, it monitors the physical server 14 at the interval set as the high-load monitoring interval 908 and records the monitored information in the physical server performance information 194 (FIG. 5).

 In step S1104, the management software control unit 184 determines whether the high load has continued for a predetermined period. Here, the value of the load increase monitoring period 904 in the scaling rule information 197 (FIG. 8) is used as the predetermined period. Whether the high load continues is determined by whether the trigger condition shown in step S1002 (FIG. 9) is still satisfied.

 If the management software control unit 184 determines that the high load is continuing (YES), in step S1105 it changes the setting of the LB (load balancer) and starts allocating requests to the newly added VM. However, if the VM addition started in step S1101 has not yet been completed, the management software control unit 184 waits for the addition to complete before changing the setting of the LB.

 If the management software control unit 184 determines that the high load is not continuing (NO), in step S1109 the system configuration change unit 183 deletes the VM added in step S1101. However, if the VM addition started in step S1101 has not yet been completed, the system configuration change unit 183 cancels the addition.

 The mechanism of the processing flow of steps S1101 to S1105 and step S1109 is supplemented as follows. As the scale-out processing, the system configuration change unit 183 first starts the virtual server (VM) addition processing (step S1101) and then releases the CPU capping (step S1102). CPU capping responds faster than adding or deleting a VM and is well suited to coping with sudden load fluctuations; by releasing the CPU capping in parallel with the VM addition processing, a state close to that after the VM addition is reproduced in advance, coping with the load increase (fluctuation). The system monitoring unit 181 monitors this at the high-load monitoring interval (step S1103). If the load increase (fluctuation) is temporary, or if its effect can be absorbed by releasing the CPU capping, the high-load state does not continue (step S1104), so the system configuration change unit 183 cancels the VM addition without completing it (step S1109). If, however, the high-load state continues even after the CPU capping has been released (step S1104), the management software control unit 184 responds by allocating load to the added VM (step S1105).

 In step S1106, the system configuration change unit 183 resets (restores) the capping on the VMs whose capping was released in step S1102. The VMs to be capped are the same as those targeted in step S1102, and the capping amount is the same as the amount released in step S1102.

 In step S1107, the system monitoring unit 181 returns the monitoring interval to the normal-time monitoring interval 907 and monitors the business unit 13 at that interval, recording the same information as in step S1001 (FIG. 9) in the service performance information 195 (FIG. 6).

 In step S1108, the system configuration calculation unit 182 adds the information of the VM whose addition was started in step S1101 to the system configuration information 192 (FIG. 4). However, if step S1109 was executed, the information is not added and the processing ends as is.

 In Embodiment 1, the computer resource subject to capping has been described as the CPU, but other computer resources (memory, disk, network, and the like) may also be targeted.

 As described above, according to Embodiment 1, even when no load-fluctuation prediction information is available, computer resources can be dynamically allocated in an effective manner, realizing autoscaling that maintains the service level of the business.

 Next, Embodiment 2 of the present invention will be described with reference to FIGS. 11 and 12.
 In Embodiment 1, it is determined whether the scale-out trigger condition is satisfied (step S1002 in FIG. 9), and if so (YES), VM addition processing is started (step S1101 in FIG. 10). However, depending on the state of the application, adding a VM may not be effective. For example, in a system consisting of one interconnected Web server and one DB server, if the allowable number of simultaneous connections on the DB server is 100 and the number of simultaneous connections to the DB server from the Web server is 100, adding one Web server does not improve service performance, because the total number of simultaneous connections to the DB server would exceed the allowable number of simultaneous connections on the DB server. In view of such cases, Embodiment 2 uses information representing the state of the application (connection information) to select the resource addition method when the load increases.

 Embodiment 2 uses the connection information shown in FIG. 11 in addition to the system configuration shown in FIG. 1, and uses the flowchart shown in FIG. 12 in place of the flowcharts of Embodiment 1 shown in FIGS. 9 and 10.

 FIG. 11 is an explanatory diagram showing an example of the connection information. The connection information 198 includes a role 1201 indicating the role of a VM and a connection count 1202 indicating the number of communication connections set on a VM assigned that role. The connection information 198 is set by the administrator and is stored in the management information group 19 (indicated by the dotted frame in FIG. 1).

 FIG. 12 is a flowchart (parameter determination flowchart) showing the parameter determination processing executed by the control unit 18.
 Steps S1001 and S1002 are the same as in Embodiment 1. However, if it is determined in step S1002 that the scale-out trigger condition is satisfied (YES), the management software control unit 184 executes step S1303 instead of step S1003. In step S1303, the management software control unit 184 determines whether, even after one Web server (a virtual server) is added, the number of connections on the Web server group still fits within the number of connections on the DB server group.

 Specifically, the determination is made based on the following expression:
  connections set per Web server × (number of operating Web servers + 1) < connections set per DB server × number of operating DB servers
 Here, the number of connections set per Web server is the connection count 1202 of the record in the connection information 198 whose role 1201 is "Web". Similarly, the number of connections set per DB server is the connection count 1202 of the record in the connection information 198 whose role 1201 is "DB". The number of operating Web servers is the number of records in the system configuration information 192 (FIG. 4) whose role 402 is "Web". Similarly, the number of operating DB servers is the number of records whose role 402 is "DB".

 If it is determined that the number of connections is not sufficient for the addition of a virtual server (Web server), that is, the above expression is not satisfied (NO), adding a Web server may not improve the system performance. In that case, therefore, in step S1102 the system configuration change unit 183 executes the capping release processing for the existing virtual servers and then ends the processing.

 If it is determined that the number of connections is sufficient for the addition of a virtual server (Web server), that is, the above expression is satisfied (YES), step S1003 (the scale-out processing) shown in FIG. 10 is executed as in Embodiment 1.

 As described above, according to Embodiment 2, computer resources can be dynamically allocated in an effective manner according to the state of the applications in the computer system, realizing autoscaling that maintains the service level of the business.

11 management unit, 12 physical server, 13 business unit, 14 physical server, 15 hypervisor, 16 network, 18 control unit, 19 management information group, 121, 141 CPU, 122, 142 storage device, 123, 143 communication IF, 131 to 133 VM, 181 system monitoring unit, 182 system configuration calculation unit, 183 system configuration change unit, 184 management software control unit, 191 system construction request information, 192 system configuration information, 194 physical server performance information, 195 service performance information, 196 system performance model information, 197 scaling rule information, 198 connection information

Claims (12)

 1. A computer system comprising at least:
 a physical server having computer resources including a processor, a storage device, and a communication interface;
 a virtualization device that virtualizes the computer resources and allocates them to a plurality of virtual environments; and
 a management device that manages the plurality of virtual environments,
 wherein the management device has:
 a monitoring unit that monitors loads of the plurality of virtual environments;
 a control unit that determines, according to the monitored loads, whether a virtual environment needs to be added; and
 a configuration change unit that, according to the determination of the control unit, instructs the virtualization device to add a virtual environment, releases the use restrictions on the computer resources constituting the existing virtual environments, and, after the addition of the virtual environment is complete, resets the released use restrictions on the existing virtual environments.
 2. The computer system according to claim 1, wherein the computer resource whose use restriction is released is a processor.
3. The computer system according to claim 1 or 2, wherein the management apparatus includes a storage unit that holds a use restriction ratio for the computer resources, and the configuration change unit sets the use restriction amount for the computer resources of the virtual environment to be added on the basis of the use restriction ratio.
4. The computer system according to any one of claims 1 to 3, wherein, after the instruction to add the virtual environment, the monitoring unit shortens the monitoring time interval, and the configuration change unit completes the addition of the virtual environment if the high-load state continues throughout the shortened interval, and otherwise cancels the addition and deletes the virtual environment already added.
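The confirmation logic of claim 4 — sample at a shortened interval and keep the new virtual environment only if the high load persists — can be illustrated with a minimal sketch. The threshold, interval, and sample values are assumptions chosen for the example.

```python
# Sketch of the claim-4 confirmation window: after a virtual environment
# is added provisionally, monitoring switches to a shorter interval; the
# addition is kept only if every sample in the window stays above the
# high-load threshold, otherwise it is rolled back (the new VM deleted).

HIGH_LOAD_THRESHOLD = 0.8  # assumed fraction of capacity

def confirm_or_rollback(load_samples, threshold=HIGH_LOAD_THRESHOLD):
    """Return 'commit' if the load stayed high for the whole shortened
    monitoring window, else 'rollback'."""
    if all(sample >= threshold for sample in load_samples):
        return "commit"
    return "rollback"

# Load stayed high through the shortened window -> keep the new VM.
print(confirm_or_rollback([0.85, 0.90, 0.82]))  # commit
# Load dropped mid-window -> cancel the addition, delete the new VM.
print(confirm_or_rollback([0.85, 0.40, 0.82]))  # rollback
```

The rollback branch prevents a transient spike from permanently consuming capacity on the physical server.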
5. The computer system according to any one of claims 1 to 4, wherein, in addition to determining whether a virtual environment needs to be added according to the monitored loads, the control unit determines whether the virtual environment needs to be added according to the number of communication connections of an application processed by the computer system.
6. The computer system according to claim 5, wherein, when the control unit determines that the number of communication connections is insufficient for the addition of the virtual environment, the configuration change unit only releases the use restriction on the computer resources constituting the existing virtual environments and does not instruct the addition of the virtual environment.
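The two-factor decision of claims 5 and 6 — load alone triggers uncapping, but a new virtual environment is added only when the connection count would justify it — can be sketched as a single decision function. The per-VM connection figure is an assumed illustrative parameter, not a value from the publication.

```python
# Sketch of the claim-5/6 decision: scale-out happens only when both the
# load and the application's communication-connection count justify
# another virtual environment; with too few connections to spread over
# an extra VM, only the CPU caps on the existing VMs are released.

MIN_CONNECTIONS_PER_VM = 100  # assumed useful minimum per VM

def decide_action(load, high_load_threshold, connections, current_vms):
    """Return the autoscaling action for one evaluation cycle."""
    if load < high_load_threshold:
        return "none"
    # Would an extra VM receive enough connections to be worthwhile?
    if connections < MIN_CONNECTIONS_PER_VM * (current_vms + 1):
        return "release_caps_only"          # claim 6: uncap, no new VM
    return "add_vm_and_release_caps"        # claim 1: add VM, uncap meanwhile

print(decide_action(0.9, 0.8, 150, 2))  # release_caps_only
print(decide_action(0.9, 0.8, 400, 2))  # add_vm_and_release_caps
print(decide_action(0.5, 0.8, 400, 2))  # none
```

This captures the case where load is high but concentrated on few connections, so adding a VM would not relieve it: releasing the caps is the only useful action.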
7. An autoscaling method for a computer system comprising at least:
 a physical server having computer resources including a processor, a storage device, and a communication interface;
 a virtualization apparatus that virtualizes the computer resources and allocates them to a plurality of virtual environments; and
 a management apparatus that manages the plurality of virtual environments,
 wherein the management apparatus executes:
 a first step of monitoring loads on the plurality of virtual environments;
 a second step of determining, according to the monitored loads, whether a virtual environment needs to be added;
 a third step of instructing the virtualization apparatus to add a virtual environment according to the determination;
 a fourth step of releasing a use restriction on the computer resources constituting the existing virtual environments; and
 a fifth step of re-applying the released use restriction to the existing virtual environments after the addition of the virtual environment is completed.
8. The autoscaling method according to claim 7, wherein the computer resource in the fourth step is a processor.
9. The autoscaling method according to claim 7 or 8, wherein the management apparatus holds a use restriction ratio for the computer resources, and the third step further includes a step of setting the use restriction amount for the computer resources of the virtual environment to be added on the basis of the use restriction ratio.
10. The autoscaling method according to any one of claims 7 to 9, wherein the fifth step further includes: a step of shortening the monitoring interval; a step of completing the addition of the virtual environment if the high-load state continues throughout the shortened monitoring interval; and a step of canceling the addition and deleting the virtual environment already added if the high-load state does not continue.
11. The autoscaling method according to any one of claims 7 to 10, wherein the management apparatus executes, between the second step and the third step, a step of determining whether the virtual environment needs to be added according to the number of communication connections of an application processed by the computer system.
12. The autoscaling method according to claim 11, wherein, when the number of communication connections is insufficient for the addition of the virtual environment, the management apparatus only releases the use restriction on the computer resources constituting the existing virtual environments and does not instruct the addition of the virtual environment.
PCT/JP2014/079936 2014-11-12 2014-11-12 Computer system and autoscaling method for computer system Ceased WO2016075771A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2014/079936 WO2016075771A1 (en) 2014-11-12 2014-11-12 Computer system and autoscaling method for computer system


Publications (1)

Publication Number Publication Date
WO2016075771A1 true WO2016075771A1 (en) 2016-05-19

Family

ID=55953884

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2014/079936 Ceased WO2016075771A1 (en) 2014-11-12 2014-11-12 Computer system and autoscaling method for computer system

Country Status (1)

Country Link
WO (1) WO2016075771A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017203556A1 (en) * 2016-05-23 2017-11-30 株式会社日立製作所 Management computer and optimal value calculation method of system parameter
US10445198B2 (en) 2016-12-27 2019-10-15 Fujitsu Limited Information processing device that monitors a plurality of servers and failover time measurement method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002182934A (en) * 2000-10-02 2002-06-28 Internatl Business Mach Corp <Ibm> Method and device to intensify capacity restriction of logical partition formed in system
JP2008186210A (en) * 2007-01-30 2008-08-14 Hitachi Ltd Processor capping method for virtual machine system
JP2011258119A (en) * 2010-06-11 2011-12-22 Hitachi Ltd Cluster configuration management method, management device and program
JP2012164260A (en) * 2011-02-09 2012-08-30 Nec Corp Computer operation management system, computer operation management method, and computer operation management program



Similar Documents

Publication Publication Date Title
US9588789B2 (en) Management apparatus and workload distribution management method
US9069465B2 (en) Computer system, management method of computer resource and program
US9929931B2 (en) Efficient provisioning and deployment of virtual machines
JP5417287B2 (en) Computer system and computer system control method
RU2697700C2 (en) Equitable division of system resources in execution of working process
JP6190969B2 (en) Multi-tenant resource arbitration method
US20170017511A1 (en) Method for memory management in virtual machines, and corresponding system and computer program product
US20180253247A1 (en) Method and system for memory allocation in a disaggregated memory architecture
JP2016103179A (en) Allocation method for computer resource and computer system
CN110058966A (en) Method, equipment and computer program product for data backup
US20140215080A1 (en) Provisioning of resources
JP2014520346A5 (en)
CN113886089A (en) Task processing method, device, system, equipment and medium
KR101585160B1 (en) Distributed Computing System providing stand-alone environment and controll method therefor
CN111274033B (en) Resource deployment method, device, server and storage medium
WO2013082742A1 (en) Resource scheduling method, device and system
JP2020024646A (en) Resource reservation management device, resource reservation management method, and resource reservation management program
CN114546587A (en) A method for expanding and shrinking capacity of online image recognition service and related device
WO2018235739A1 (en) Information processing system and resource allocation method
US20210006472A1 (en) Method For Managing Resources On One Or More Cloud Platforms
CN109446062B (en) Method and device for software debugging in cloud computing service
US20160216986A1 (en) Infrastructure performance enhancement with adaptive resource preservation
CN117472570A (en) Methods, apparatus, electronic devices and media for scheduling accelerator resources
KR102014246B1 (en) Mesos process apparatus for unified management of resource and method for the same
WO2016075771A1 (en) Computer system and autoscaling method for computer system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14905693

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14905693

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP