JP2002123406A

JP2002123406A - High reliability system

Info

Publication number: JP2002123406A
Application number: JP2000316213A
Authority: JP
Inventors: Keisuke Kawai; 桂介河合; Kazuhiro Shimada; 一洋島田; Hidekazu Imai; 秀和今井
Original assignee: PFU Ltd
Current assignee: PFU Ltd
Priority date: 2000-10-17
Filing date: 2000-10-17
Publication date: 2002-04-26

Abstract

(57) [Abstract]

[Problem] To make it possible to deal appropriately with failures in a mirroring configuration in which the disk of an active node is the original and the disk of a standby node connected via a network is the shadow.

[Solution] When a failure occurs in the disk of the active node, the node operating as the standby system is put into the operating state and the node that had been operating as the active system is stopped, so that operation as a redundant system stops. When a failure occurs in the disk of the standby node, the node operating as the standby system is stopped, so that operation as a redundant system stops. Likewise, when a failure occurs in the network provided for accessing the shadow disk, the node operating as the standby system is stopped, so that operation as a redundant system stops.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a high-reliability system that has a mirror set function in which the local disk of an active node serves as the original disk and the local disk of a standby node connected via a network serves as the shadow disk, and that carries on business operations by using the remaining healthy disk when either disk fails. In particular, it relates to a high-reliability system that can deal appropriately with such failures.

[0002]

[Description of the Related Art] High-reliability systems consisting of a server that performs business operations and a server that stands by in case that server fails are widely used. The former is called the active node and the latter the standby node.

[0003] If the active node goes down because of some failure, the standby node takes over the business operations. For this takeover, the business data must be handed over from the active node to the standby node.

[0004] Until now, this handover of business data has been achieved by providing a shared disk device, connected through a SCSI interface or the like, between the active node and the standby node.

[0005] Shared disk devices, however, are extremely expensive. For this reason, high-reliability systems are coming into use that have a mirror set function (a RAID 1 function) in which the local disk of the active node serves as the original disk and the local disk of the standby node connected via a network serves as the shadow disk, and that carry on business operations by using the remaining healthy disk when either disk fails.

[0006] In such a high-reliability system, as shown in FIG. 5, whenever the active node writes business data to the original disk, it also writes that data to the shadow disk through a net disk driver, thereby performing mirroring.

[0007] While the original disk is healthy, the active node performs business operations by accessing the original disk; when the original disk fails, it performs them by accessing the shadow disk through the net disk driver.
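The conventional mirroring behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any actual product: `Disk` and `PriorArtMirror` are hypothetical names, the disks are in-memory dictionaries, and the network hop made by the net disk driver is not modeled.

```python
class Disk:
    """In-memory stand-in for a hard disk (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.failed = False

    def write(self, block, data):
        if self.failed:
            raise OSError(f"{self.name} has failed")
        self.blocks[block] = data

    def read(self, block):
        if self.failed:
            raise OSError(f"{self.name} has failed")
        return self.blocks[block]


class PriorArtMirror:
    """Conventional scheme: write both copies, read the original first."""

    def __init__(self, original, shadow):
        self.original = original
        self.shadow = shadow

    def write(self, block, data):
        succeeded = 0
        for disk in (self.original, self.shadow):
            try:
                disk.write(block, data)
                succeeded += 1
            except OSError:
                pass  # the prior art simply carries on with the surviving disk
        if succeeded == 0:
            raise OSError("both disks have failed")

    def read(self, block):
        try:
            return self.original.read(block)
        except OSError:
            # After an original-disk failure every read goes to the remote
            # shadow disk, which is the performance problem raised later.
            return self.shadow.read(block)
```

Note that nothing in this scheme stops either node when a disk fails, which is the behavior the invention changes.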

[0008]

[Problems to Be Solved by the Invention] This prior art, however, has the following problems.

[0009] First, when the original disk fails, the active node performs all input and output of business data against the shadow disk. Data input performance therefore drops, which degrades the performance of the system as a whole.

[0010] Second, the prior art does not stop the standby node when the shadow disk fails.

[0011] Consequently, if the active node goes down while it is carrying on business operations with the original disk after a shadow disk failure, the takeover by the standby node that the user expects does not take place, and the user comes to doubt the reliability of the system.

[0012] Moreover, even if the shadow disk can be made usable at the time of this active-node failure by working around the failed part, the business data on that shadow disk has not been kept up to date, so the new active node (the node that owns the shadow disk) cannot take over the latest business data.

[0013] Likewise, the prior art does not stop the standby node when a failure occurs in the network provided for the active node to access the shadow disk (the dedicated network in FIG. 5).

[0014] This leads to the same problems as in the case of a shadow disk failure.

[0015] The present invention has been made in view of these circumstances. Its object is to provide a new high-reliability system that can deal appropriately with failures in a configuration that has a mirror set function in which the local disk of an active node serves as the original disk and the local disk of a standby node connected via a network serves as the shadow disk, and that carries on business operations by using the remaining healthy disk when either disk fails.

[0016]

[Means for Solving the Problems] To achieve this object, the high-reliability system of the present invention, in a configuration that has a mirror set function in which the local disk of an active node serves as the original disk and the local disk of a standby node connected via a network serves as the shadow disk, and that carries on business operations by using the remaining healthy disk when either disk fails, comprises: detecting means for detecting whether a failure has occurred in the original disk; switching means for switching the standby node to the operating state when the detecting means detects a failure of the original disk; and stopping means for stopping the redundant system by stopping, in conjunction with the node switch performed by the switching means, the node that had been operating as the active node.

[0017] In the high-reliability system of the present invention configured in this way, when a failure occurs in the hard disk of the node operating as the active system, the node operating as the standby system is put into the operating state and the node that had been operating as the active system is stopped, so that operation as a redundant system stops.

[0018] As a result, the node that newly executes business processing can carry out business operations using its own hard disk, so data input performance does not deteriorate and a drop in overall system performance is prevented. And because operation as a redundant system is stopped, no undesirable operation takes place afterwards.

[0019] To achieve the same object, the high-reliability system of the present invention, in a configuration that has a mirror set function in which the local disk of an active node serves as the original disk and the local disk of a standby node connected via a network serves as the shadow disk, and that carries on business operations by using the remaining healthy disk when either disk fails, comprises: detecting means for detecting whether a failure has occurred in the shadow disk; and stopping means for stopping the redundant system by stopping the standby node when the detecting means detects a failure of the shadow disk.

[0020] In the high-reliability system of the present invention configured in this way, when a failure occurs in the hard disk of the node operating as the standby system, the node operating as the standby system is stopped, so that operation as a redundant system stops.

[0021] As a result, the node executing business processing can let the user know that no standby node is available. The user therefore no longer has misgivings about the reliability of the system, and the problem of business being taken over with stale business data does not arise.

[0022] To achieve the same object, the high-reliability system of the present invention, in a configuration that has a mirror set function in which the local disk of an active node serves as the original disk and the local disk of a standby node connected via a network serves as the shadow disk, and that carries on business operations by using the remaining healthy disk when either disk fails, comprises: detecting means for detecting whether a failure has occurred in the shadow network provided for the active node to access the shadow disk; and stopping means for stopping the redundant system by stopping the standby node when the detecting means detects a failure of the shadow network.

[0023] In the high-reliability system of the present invention configured in this way, when a failure occurs in the shadow network, the node operating as the standby system is stopped, so that operation as a redundant system stops.

[0024] As a result, the node executing business processing can let the user know that no standby node is available. The user therefore no longer has misgivings about the reliability of the system, and the problem of business being taken over with stale business data does not arise.

[0025] In this way, the present invention makes it possible to deal appropriately with failures in a configuration that has a mirror set function in which the local disk of an active node serves as the original disk and the local disk of a standby node connected via a network serves as the shadow disk, and that carries on business operations by using the remaining healthy disk when either disk fails.
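The three failure cases and the response to each can be summarized in a small dispatch sketch. The names `Failure` and `resulting_states` and the dictionary layout are assumptions made for illustration; only the outcomes (which node ends up operating and which ends up stopped) come from the description.

```python
from enum import Enum, auto


class Failure(Enum):
    ORIGINAL_DISK = auto()    # disk of the active node
    SHADOW_DISK = auto()      # disk of the standby node
    SHADOW_NETWORK = auto()   # network used to reach the shadow disk


def resulting_states(failure):
    """Return the state of each node after the response to `failure`."""
    if failure is Failure.ORIGINAL_DISK:
        # The standby node is promoted and the former active node is stopped.
        return {"active": "stopped", "standby": "operating"}
    if failure in (Failure.SHADOW_DISK, Failure.SHADOW_NETWORK):
        # The active node keeps running on its own disk; the standby is stopped.
        return {"active": "operating", "standby": "stopped"}
    raise ValueError(f"unknown failure: {failure!r}")
```

In every case exactly one node remains operating, so the pair no longer runs as a redundant system after the first failure.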

[0026]

[Embodiments of the Invention] The present invention is described in detail below by way of embodiments.

[0027] FIG. 1 shows the system configuration of a high-reliability system to which the present invention is applied.

[0028] As shown in this figure, the high-reliability system to which the present invention is applied consists of: an active node 1 that executes business processing; a standby node 2 provided as the standby for active node 1; a dedicated network 3 connecting active node 1 and standby node 2; a business terminal 4 operated by a user; and a business network 5 that connects active node 1 and standby node 2 to each other and also connects both nodes to business terminal 4.

[0029] Dedicated network 3 is used as the communication line over which active node 1 accesses the hard disk of standby node 2, and also as the communication line for the alive notifications exchanged between active node 1 and standby node 2. It is preferably duplexed.

[0030] Active node 1 comprises: a hard disk 10 that functions as the original disk for mirroring; an application program 11 that executes business processing using hard disk 10; a net disk driver 12 for accessing hard disk 20 of standby node 2; a mirror driver 13 that performs mirroring across hard disks 10 and 20; and a cluster control unit 14 that performs overall control.

[0031] Standby node 2, in turn, comprises: a hard disk 20 that functions as the shadow disk for mirroring; an application program 21 provided as the counterpart of application program 11 of active node 1; a net disk driver 22 for accessing hard disk 10 of active node 1; a mirror driver 23 that performs mirroring across hard disks 20 and 10; and a cluster control unit 24 that performs overall control.

[0032] FIG. 2 shows an embodiment of the present invention that operates when a failure occurs in hard disk 10 of active node 1.

[0033] In the figure, active node 1 is labeled "#Node 1", standby node 2 is labeled "#Node 2", hard disk 10 of active node 1 is labeled "HD1", and hard disk 20 of standby node 2 is labeled "HD2".

[0034] The processing performed by the present invention when a failure occurs in hard disk 10 of active node 1 is described next with reference to this figure.

[0035] (1) Suppose that some failure has occurred in hard disk 10 of active node 1.

[0036] (2) While this failure exists, application program 11 of active node 1 issues a write request for business data in order to carry out business processing. (3) To perform mirroring, mirror driver 13 of active node 1 then writes the requested business data to hard disk 10 of its own node and, using net disk driver 12, also writes it to hard disk 20 of standby node 2. Because the write to hard disk 10 cannot be completed, mirror driver 13 of active node 1 detects that a failure has occurred in hard disk 10 of its own node.

[0037] (4) On detecting that a failure has occurred in hard disk 10 of its own node, mirror driver 13 of active node 1 notifies cluster control unit 14 of its own node to that effect.

[0038] (5) On receiving this notification, cluster control unit 14 of active node 1 (6) stops application program 11 on its own node and then moves its own node from the operating state to the stopped state, and (7) next stops mirror driver 13 on its own node, thereby stopping operation as a redundant system.

[0039] (8) Cluster control unit 14 of active node 1, which has now stopped operating as the active system, then uses business network 5 to notify cluster control unit 24 of standby node 2 to switch to the operating state.

[0040] (9) On receiving this notification, cluster control unit 24 of standby node 2 moves its own node from the stopped state to the operating state, (10) next starts mirror driver 23 on its own node, and (11) starts application program 21 on its own node, thereby resuming business operations.

[0041] When mirror driver 23 is started, the mirroring configuration information is changed so that only hard disk 20 of the node itself is accessible and the failed hard disk 10 is not accessed.
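Steps (4) through (11) above can be traced with a short sketch. It assumes each node is represented by a plain dictionary; the function name, the dictionary keys, and the log messages are hypothetical, and the initial "standby" label for node 2 is a modeling choice rather than something the text prescribes.

```python
def fail_over_on_original_disk_failure(node1, node2):
    """Apply steps (4)-(11) to two node dictionaries and return an event log."""
    log = []
    # (4)-(5) Mirror driver 13 reports the HD1 failure to cluster control unit 14.
    log.append("node1: mirror driver 13 -> cluster control 14: HD1 failed")
    # (6) Stop application program 11, then leave the operating state.
    node1["app_running"] = False
    node1["state"] = "stopped"
    # (7) Stop mirror driver 13; the pair no longer runs as a redundant system.
    node1["mirror_running"] = False
    # (8) Ask cluster control unit 24 to switch over, via business network 5.
    log.append("node1 -> node2 over business network 5: switch to operating")
    # (9) Node 2 enters the operating state.
    node2["state"] = "operating"
    # (10) Start mirror driver 23 with only HD2 in the mirror configuration.
    node2["mirror_members"] = ["HD2"]
    node2["mirror_running"] = True
    # (11) Start application program 21 to resume business operations.
    node2["app_running"] = True
    return log


node1 = {"state": "operating", "app_running": True, "mirror_running": True,
         "mirror_members": ["HD1", "HD2"]}
node2 = {"state": "standby", "app_running": False, "mirror_running": False,
         "mirror_members": ["HD1", "HD2"]}
events = fail_over_on_original_disk_failure(node1, node2)
```

After the call, node 2 runs the application against its local HD2 only, which is why the remote-access performance penalty of the prior art does not occur.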

[0042] In this way, in the present invention, when a failure occurs in the hard disk of the node operating as the active system, the node operating as the standby system is put into the operating state and the node that had been operating as the active system is stopped, so that operation as a redundant system stops.

[0043] As a result, the node that newly executes business processing can carry out business operations using its own hard disk, so data input performance does not deteriorate and a drop in overall system performance is prevented. And because operation as a redundant system is stopped, no undesirable operation takes place afterwards.

[0044] FIG. 3 shows an embodiment of the present invention that operates when a failure occurs in hard disk 20 of standby node 2.

[0045] In the figure, active node 1 is labeled "#Node 1", standby node 2 is labeled "#Node 2", hard disk 10 of active node 1 is labeled "HD1", and hard disk 20 of standby node 2 is labeled "HD2".

[0046] The processing performed by the present invention when a failure occurs in hard disk 20 of standby node 2 is described next with reference to this figure.

[0047] (1) Suppose that some failure has occurred in hard disk 20 of standby node 2.

[0048] (2) While this failure exists, application program 11 of active node 1 issues a write request for business data in order to carry out business processing. (3) To perform mirroring, mirror driver 13 of active node 1 then writes the requested business data to hard disk 10 of its own node and, using net disk driver 12, also writes it to hard disk 20 of standby node 2. Because the write to hard disk 20 of standby node 2 cannot be completed, mirror driver 13 of active node 1 detects that a failure has occurred in hard disk 20 of standby node 2.

[0049] (4) On detecting that a failure has occurred in hard disk 20 of standby node 2, mirror driver 13 of active node 1 notifies cluster control unit 14 of its own node to that effect.

[0050] (5) On receiving this notification, cluster control unit 14 of active node 1 uses business network 5 to notify cluster control unit 24 of standby node 2 of cut-off processing, and (6) in response, cluster control unit 24 of standby node 2 moves its node from the standby state to the stopped state, thereby stopping operation as a redundant system.
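The detection in step (3) and the cut-off in steps (5) and (6) can be sketched as follows. The sketch assumes that a failed write surfaces as an exception; `WriteError`, `mirrored_write`, and `on_shadow_disk_failure` are hypothetical names, and the two disks are passed in as plain callables rather than real drivers.

```python
class WriteError(Exception):
    """Raised by a disk stand-in when a write cannot be completed."""


def mirrored_write(write_original, write_shadow, data):
    """Write to both disks and report which side failed, if any.

    Returns None on success, or "original" / "shadow" for a one-sided failure.
    """
    failed = None
    try:
        write_original(data)
    except WriteError:
        failed = "original"
    try:
        write_shadow(data)
    except WriteError:
        if failed is not None:
            raise  # both copies failed; outside the scope of this sketch
        failed = "shadow"
    return failed


def on_shadow_disk_failure(standby):
    """Steps (5)-(6): cut off the standby node so it can no longer take over."""
    standby["state"] = "stopped"
    return standby


stored = []  # stands in for HD1


def write_hd1(data):
    stored.append(data)


def write_hd2(data):
    raise WriteError("HD2 unavailable")


standby = {"state": "standby"}
if mirrored_write(write_hd1, write_hd2, b"record") == "shadow":
    on_shadow_disk_failure(standby)
```

The active node keeps its own copy of the write, while the standby node is explicitly taken out of service instead of lingering with stale data.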

[0051] In this way, in the present invention, when a failure occurs in the hard disk of the node operating as the standby system, the node operating as the standby system is stopped, so that operation as a redundant system stops.

[0052] As a result, the node executing business processing can let the user know that no standby node is available. The user therefore no longer has misgivings about the reliability of the system, and the problem of business being taken over with stale business data does not arise.

[0053] FIG. 4 shows an embodiment of the present invention that operates when a failure occurs in dedicated network 3.

[0054] In the figure, active node 1 is labeled "#Node 1", standby node 2 is labeled "#Node 2", hard disk 10 of active node 1 is labeled "HD1", and hard disk 20 of standby node 2 is labeled "HD2".

[0055] The processing performed by the present invention when a failure occurs in dedicated network 3 is described next with reference to this figure.

[0056] (1) Suppose that some failure has occurred in dedicated network 3.

[0057] (2) While this failure exists, application program 11 of active node 1 issues a write request for business data in order to carry out business processing. (3) To perform mirroring, mirror driver 13 of active node 1 then writes the requested business data to hard disk 10 of its own node and, using net disk driver 12, also writes it to hard disk 20 of standby node 2. Because the write to hard disk 20 of standby node 2 cannot be completed, mirror driver 13 of active node 1 detects that a failure has occurred in dedicated network 3.

[0058] (4) On detecting that a failure has occurred in dedicated network 3, mirror driver 13 of active node 1 notifies cluster control unit 14 of its own node to that effect.

[0059] (5) On receiving this notification, cluster control unit 14 of active node 1 uses business network 5 to notify cluster control unit 24 of standby node 2 of cut-off processing. (6) In response, cluster control unit 24 of standby node 2 moves its node from the standby state to the stopped state and (7) instructs mirror driver 23 on its own node to detach hard disk 20 of its own node from the mirror set, thereby stopping operation as a redundant system.

[0060] (8) On receiving this instruction, mirror driver 23 updates the mirror set configuration information stored on hard disk 20 of its own node so that hard disk 20 can no longer be used, thereby detaching hard disk 20 of its own node from the mirror set.
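The configuration update in step (8) can be illustrated with a small sketch. The text does not specify how the mirror set configuration information is laid out, so the per-disk `usable` flag, the dictionary layout, and both function names are assumptions made purely for illustration.

```python
def detach_from_mirror_set(config, disk):
    """Return a copy of the mirror set configuration with `disk` marked unusable."""
    updated = {name: dict(entry) for name, entry in config.items()}
    updated[disk]["usable"] = False
    return updated


def can_take_over(config, disk):
    """A node may resume business only from a disk still marked usable."""
    return config[disk]["usable"]


# Configuration as it might be stored on HD2 (the layout is hypothetical).
config_on_hd2 = {"HD1": {"usable": True}, "HD2": {"usable": True}}
config_on_hd2 = detach_from_mirror_set(config_on_hd2, "HD2")
```

Recording the detachment on HD2 itself means that a later restart of node 2 finds its own disk marked unusable, so it cannot resume business from data that stopped being updated when the network failed.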

[0061] In this way, in the present invention, when a failure occurs in dedicated network 3, the node operating as the standby system is stopped, so that operation as a redundant system stops.

[0062] As a result, the node executing business processing can let the user know that no standby node is available. The user therefore no longer has misgivings about the reliability of the system, and the problem of business being taken over with stale business data does not arise.

[0063]

[Effects of the Invention] As described above, in the high-reliability system of the present invention, when a failure occurs in the hard disk of the node operating as the active system, the node operating as the standby system is put into the operating state and the node that had been operating as the active system is stopped, so that operation as a redundant system stops.

[0064] As a result, the node that newly executes business processing can carry out business operations using its own hard disk, so data input performance does not deteriorate and a drop in overall system performance is prevented. And because operation as a redundant system is stopped, no undesirable operation takes place afterwards.

[0065] Further, in the high-reliability system of the present invention, when a failure occurs in the hard disk of the node operating as the standby system, the node operating as the standby system is stopped, so that operation as a redundant system stops.

[0066] As a result, the node executing business processing can let the user know that no standby node is available. The user therefore no longer has misgivings about the reliability of the system, and the problem of business being taken over with stale business data does not arise.

[0067] Further, in the high-reliability system of the present invention, when a failure occurs in the shadow network provided for accessing the shadow disk, the node operating as the standby system is stopped, so that operation as a redundant system stops.

[0068] As a result, the node executing business processing can let the user know that no standby node is available. The user therefore no longer has misgivings about the reliability of the system, and the problem of business being taken over with stale business data does not arise.

[0069] In this way, the present invention makes it possible to deal appropriately with failures in a configuration that has a mirror set function in which the local disk of an active node serves as the original disk and the local disk of a standby node connected via a network serves as the shadow disk, and that carries on business operations by using the remaining healthy disk when either disk fails.

[Brief description of the drawings]

FIG. 1 is a system configuration diagram of a high-reliability system to which the present invention is applied.

FIG. 2 shows an embodiment of the present invention.

FIG. 3 shows an embodiment of the present invention.

FIG. 4 shows an embodiment of the present invention.

FIG. 5 is an explanatory diagram of a high-reliability system to which the present invention is applied.

[Explanation of symbols]

1: Active node
2: Standby node
3: Dedicated network
4: Business terminal
5: Business network
10, 20: Hard disk
11, 21: Application program
12, 22: Net disk driver
13, 23: Mirror driver
14, 24: Cluster control unit

Continued from the front page

(72) Inventor: Hidekazu Imai, c/o PFU Ltd, 98-2 Unoke Nu, Unoke-machi, Kahoku-gun, Ishikawa Prefecture

F-terms (reference): 5B018 GA06 HA04 HA05 KA14 MA14 QA20 5B034 BB02 CC02 5B065 BA01 EA12

Claims

[Claims]

1. A high-reliability system that has a mirror set function in which the local disk of an active node serves as the original disk and the local disk of a standby node connected via a network serves as the shadow disk, and that carries on business operations by using the remaining healthy disk when either disk fails, the system comprising: means for detecting whether a failure has occurred in the original disk; means for switching the standby node to the operating state when a failure of the original disk is detected; and means for stopping the redundant system by stopping, in conjunction with the node switch, the node that had been operating as the active node.

2. A high-reliability system that has a mirror set function in which the local disk of an active node serves as the original disk and the local disk of a standby node connected via a network serves as the shadow disk, and that carries on business operations by using the remaining healthy disk when either disk fails, the system comprising: means for detecting whether a failure has occurred in the shadow disk; and means for stopping the redundant system by stopping the standby node when a failure of the shadow disk is detected.

3. A high-reliability system that has a mirror set function in which the local disk of an active node serves as the original disk and the local disk of a standby node connected via a network serves as the shadow disk, and that carries on business operations by using the remaining healthy disk when either disk fails, the system comprising: means for detecting whether a failure has occurred in the shadow network provided for the active node to access the shadow disk; and means for stopping the redundant system by stopping the standby node when a failure of the shadow network is detected.