JP2005505053A

JP2005505053A - Aggregation of hardware events in a multi-node system (aggregation of hardware events)

Info

Publication number: JP2005505053A
Application number: JP2003533136A
Authority: JP
Inventors: ラリー、リチャード、エイ; バックス、ダニエル、エイチ
Original assignee: International Business Machines Corp
Current assignee: International Business Machines Corp
Priority date: 2001-10-01
Filing date: 2002-09-26
Publication date: 2005-02-17
Anticipated expiration: 2022-09-26
Also published as: EP1449097A4; TWI235920B; DE60224438D1; DE60224438T2; CN1561493A; EP1449097B1; ATE382898T1; CN1303545C; EP1449097A1; US6988155B2; US20030065853A1; WO2003029999A1; JP3940397B2

Abstract

【課題】マルチノード・システムにおけるハードウェア・イベントの集約化を開示すること。
【解決手段】遠隔ノードで発生したイベントが、遠隔ノードのファームウェアが一次ノードの第１のレジスタに書き込みを行うことによって、一次ノードに転送される（１１４）。イベントは、一次ノードの第１のレジスタから一次ノードの第２のレジスタに伝達される（１１６）。それに自動的に応答して、一次ノードで割り込みが生成される（１１８）。割り込みの発生に応答して、一次ノードの割り込みハンドラが、遠隔ノードで生成したイベントを処理するために、一次ノードでコードを呼び出す（１２０）。
【選択図】図１
Problem: To disclose the aggregation of hardware events in a multi-node system.
Solution: An event that occurs at a remote node is forwarded to a primary node by the remote node's firmware writing to a first register of the primary node (114). The event is communicated from the first register of the primary node to a second register of the primary node (116). Automatically in response, an interrupt is generated at the primary node (118). In response to the interrupt, an interrupt handler at the primary node calls code at the primary node to process the event generated at the remote node (120).
[Selected drawing] Figure 1

Description

【技術分野】
【０００１】
本発明は、一般に、プロセッサやメモリなどを備える２以上のノードが存在するコンピュータ・システムであるマルチノード・コンピュータ・システムに関し、より詳細には、そのようなシステムのノードによって生成されるハードウェア・イベントの管理に関する。
【背景技術】
【０００２】
コンピュータ・システムでは、様々なハードウェアが、処理する必要のあるイベントを生成する。例えば、ＡＣＰＩ（拡張構成および電源インターフェース（advanced configuration and power interface））仕様は、電源管理／構成機構を提供する。ＡＣＰＩ互換ハードウェアを含むＡＣＰＩ互換コンピュータ・システムでは、内部および外部イベントに応答して、システム自身が電源をオン、オフすることができ、特定のハードウェア要素を電力監視点（powervantage point）から管理することもできる。低電力モード（low-power mode）にあるネットワーク・カードは、例えば、接続されているネットワークからデータ・パケットを受信したときに、イベントを生成することができ、低電力モードから抜け出すことができる。このイベントは、ネットワーク・カードがその一部を成すコンピュータ・システムによって受信され、その結果、例えば、コンピュータ・システムは、それまで入っていた低電力モードから抜け出すことができる。別のタイプのイベントに、ホットプラグ（hot-plug）・イベントがあり、このイベントは、システム稼動中にハードウェア・カードがコンピュータ・システムに挿入されたり、コンピュータ・システムから取り外されたりした場合に発生する。
【発明の開示】
【発明が解決しようとする課題】
【０００３】
ＡＣＰＩイベント、ならびにその他のタイプのハードウェア・イベントの不都合な点は、それらのイベントが、イベントを発生させたハードウェアをその一部として含むコンピュータ・システムによってイベント処理が行われると想定していることである。すなわち、ハードウェア・イベントはしばしば、マルチノード・システム非認識（multi-node system unaware）のアーキテクチャに基づいて定められ、したがって、当該アーキテクチャをシングルノード・システムであると想定する。マルチノード・コンピュータ・システムには、自身のプロセッサやメモリなどを備えた複数のノードが存在しており、それらに処理が分散される。さらに、ＡＣＰＩイベント・ハードウェアを実装するチップセット自体が、一般にはマルチノード非認識である。ＡＣＰＩ対応のオペレーティング・システムは一般に、システム内にはＡＣＰＩイベント・ハードウェアのインスタンスが１つだけ存在するものと想定し、また大抵は複製ＡＣＰＩイベント・ハードウェアを認識しない。
【０００４】
したがって、現在のハードウェア・イベント処理アーキテクチャはしばしば、マルチノード・コンピュータ・システムのある遠隔ノード内で生成されたイベントはそのノードによって処理されると想定している。例えば、システムの一次すなわちブート・ノードがイベントの通知を受信する機構も存在せず、このノードがイベントを操作し、遠隔ノードにイベントをどのように処理すべきか指示する機構も存在しない。オペレーティング・システムにどのノードでイベントが発生したかを通知するために、標準的なＡＣＰＩイベント・ハードウェアによって提供される機構も存在しない。これは問題と言えるが、このようになったのは、ハードウェアを操作するためのオペレーティング・システムの方針が、システム全体でＡＣＰＩイベント・ハードウェアのインスタンスは１つであると想定しているためである。しかし、この想定は、標準的なＡＣＰＩイベント・ハードウェアに基づいて設計されたマルチノード・システムについては当てはまらない。ここで説明した理由、およびその他の理由のため、本発明が必要とされている。
【課題を解決するための手段】
【０００５】
本発明は、マルチノード・システムにおけるハードウェア・イベントの集約化に関する。本発明の方法では、遠隔ノードで発生したイベントが、遠隔ノードのファームウェアが一次ノードの第１のレジスタに書き込みを行うことによって、一次ノードに転送される。イベントは、一次ノードの第１のレジスタから一次ノードの第２のレジスタに伝達される。それに自動的に応答して、一次ノードで割り込みが生成される。割り込みの生成に応答して、一次ノードの割り込みハンドラが、遠隔ノードで発生したイベントを処理するために、一次ノードでコードを呼び出す。
【０００６】
本発明のマルチノード・システムは、一次ノードと１つまたは複数の遠隔ノードを含む。一次ノードは、互いに通信可能に結合された第１のレジスタと第２のレジスタを備える。第２のレジスタは、一次ノードのイベント用に通常は予約されている。一次ノードは、イベントを処理するためのマルチノード非認識コードも含み、このコードは、第１のレジスタから第２のレジスタへのイベントの転送に応答して生成される割り込みに応答して呼び出される。イベントは遠隔ノードで発生して、一次ノードの第１のレジスタに転送され、イベントの最終的な処理は一次ノードによって行われる。イベントは、一次ノードの第１のレジスタから第２のレジスタに自動的に伝達される。
【０００７】
本発明の製造物品は、コンピュータ可読媒体および媒体中の手段を含む。この手段は、一次ノードの第１のレジスタに書き込まれたイベントを一次ノードの第２のレジスタに自動的に伝達して、遠隔ノードで発生したイベントを第１のレジスタから第２のレジスタに転送するためのものである。第２のレジスタへの書き込みに自動的に応答して、一次ノードで割り込みが生成される。前記手段は、イベントを処理するために、割り込みの生成に応答して、一次ノードでコードを呼び出すためのものでもある。本発明のその他の特徴および利点は、本発明の現在のところ好ましい実施形態の以下の詳細な説明を、添付の図面と併せ読むことにより明らかとなるであろう。
【発明を実施するための最良の形態】
【０００８】
概要
図１には、本発明の好ましい実施形態による方法１００が示されている。方法１００の機能を、製造物品のコンピュータ可読媒体中の手段として実施することができる。例えば、コンピュータ可読媒体を、記録可能データ記憶媒体または変調搬送波信号（modulated carrier signal）とすることができる。方法１００の各部分は、破線１０６で分割された列１０２および列１０４によってそれぞれ示される、マルチノード・システムの遠隔ノードおよび一次すなわちブート・ノードで実行される。
【０００９】
ハードウェア・イベントは最初、遠隔ノードで発生する（１０８）。ハードウェア・イベントは、ホットプラグ・イベントなどのＡＣＰＩ（拡張構成および電源インターフェース）イベント、あるいは別のタイプのハードウェア・イベントまたは他のイベントとすることができる。イベントは、遠隔ノードでＰＭＩ（プラットフォーム管理割り込み（platform management interrupt））などの割り込みを生成する（１１０）。具体的には、割り込みルータ（interruptrouter）などの遠隔ノードのコンポーネントが、割り込みを生成する。割り込みの生成に応答して、ファームウェアの割り込みハンドラ（firmwareinterrupt handler）がイベントを検出し、遠隔ノード上で呼び出される（１１２）。ファームウェアの割り込みハンドラは、イベントを一次ノードの第１のレジスタに書き込むことによって、イベントを遠隔ノードから一次ノードに転送する（１１４）。例えば、遠隔ノードのファームウェアのＰＭＩハンドラは、イベントを一次ノードのＧＰＩＯ（汎用入出力）レジスタ（generalpurpose input/output register）に転送することができる。
【００１０】
一次ノードでは、第１のレジスタへのイベントの書き込みに自動的に応答して、一次ノードの第１のレジスタの出力と第２のレジスタの入力とを直接接続したコネクションを介して、イベントが伝達される（１１６）。第２のレジスタは好ましくは、遠隔ノードで発生したイベント用にではなく、一次ノードで発生したイベント用に通常は予約されている。このようにして、方法１００は、第２のレジスタなど、一次ノードで発生したイベント専用に通常は使用される一次ノードのレジスタを利用する。ＡＣＰＩイベントの場合、第２のレジスタは、こうしたイベント用に予約されているＧＰＥ（汎用イベント）レジスタ（general purpose event register）とすることができる。第２のレジスタへのイベントの書き込みによって、一次ノードでＳＣＩ（システム構成割り込み（systemconfiguration interrupt））割り込みなどの割り込みが生成される（１１８）。
【００１１】
一次ノードでの割り込みの生成によって、コードが呼び出される（１２０）。このコードは、前記割り込みを最終的に発生させる元となったイベントを処理するためのものである。例えば、ＡＣＰＩイベントの場合、当該コードをオペレーティング・システム（ＯＳ）のＡＣＰＩドライバとすることができる。イベントが一次ノードで生成されたものではないことをコードが認識しない点で、このコードはマルチノード非認識である。このようにして、方法１００は、標準ＡＣＰＩドライバなどの標準的なドライバを使用することができ、したがって、一次ノード以外のノードで発生したイベントに対応できるように、これらのドライバを書き換える必要はない。
【００１２】
前記コードは、イベントを処理するために特定のプロセスを呼び出す（１２２）。このプロセスはこれまでとは異なり、マルチノード認識（multi-node aware）であり、具体的には一次ノード以外のノードで発生したイベントに対応できるように設計される。例えば、このプロセスをＡＭＬ（ＡＣＰＩマシン語）メソッド（ACPIMachine Language method）とすることができる。ＡＭＬはコンパクトな、トークン化された（tokenize）、抽象マシン語である。このプロセスは、遠隔ノードにイベントを処理するための適切な指示を出し（１２４）、イベントはこの指示に従って遠隔ノードで処理される（１２６）。例えば、プロセスは、コントローラやその他のハードウェアなど、イベントが発生したハードウェアを遠隔操作する。
【００１３】
背景技術
図２には、イベント・アーキテクチャ２００の一例が示されており、それに基づき本発明の実施形態を実施することができる。アーキテクチャ２００は、具体的にはＡＣＰＩイベント用とし、シングルノード・システムに関して説明されるが、本発明の実施形態に従ってマルチノード・システムに拡張することができる。プラットフォーム・ハードウェア２０２は、ＡＣＰＩイベントを生成するコントロール・カードおよびその他のタイプのハードウェアを含む。ハードウェア２０２は、その設定を、例えば、一種のファームウェアであるＢＩＯＳ（基本入出力システム）２０４から受信することができる。ＯＳ非依存のコンポーネント２０６は、ＡＣＰＩレジスタ２０８、ＡＣＰＩＢＩＯＳ２１０、およびＡＣＰＩテーブル２１２を含む。ＡＣＰＩレジスタ２０８は、詳細な説明の前のセクションで説明した第２のレジスタを含む。ＡＣＰＩＢＩＯＳ２１０は、一種のファームウェアであり、ハードウェア２０２のＡＣＰＩ設定を記録することができる。ＡＣＰＩテーブル２１２は、ＡＣＰＩドライバ／ＡＭＬインタプリタ（ACPI driver and AML interpreter）２１６によって使用される、ハードウェア２０２へのインターフェースを記述する。
【００１４】
ＡＣＰＩドライバ／ＡＭＬインタプリタ２１６は好ましくは、ＡＣＰＩドライバによって呼び出されて遠隔ノード・イベントを処理する、詳細な説明の前のセクションで説明したプロセスを含むか、またはそのようなプロセスにアクセスすることができる。ＡＣＰＩドライバは一般に、所与のＯＳにとって標準的なものであり、ＡＭＬで記述されたプロセスの解釈および構文解析を行うためのＡＭＬインタプリタを含む。すなわち、プロセスは非標準的であり、遠隔ノード・イベントのような所与の状況に特化したものとすることができるが、ＡＣＰＩドライバ自体は一般に標準的なものである。ＡＣＰＩドライバ／ＡＭＬインタプリタ２１６は、プラットフォーム・ハードウェア２０２のために、ＯＳのデバイス・ドライバ２１４と対話を行う。デバイス・ドライバは、ハードウェアをＯＳにつなげるソフトウェア・ルーチンである。
【００１５】
さらに、デバイス・ドライバ２１４およびＡＣＰＩドライバ／ＡＭＬインタプリタ２１６は、ＯＳカーネル２１８と対話を行う。カーネルは、ふつうメモリ内に常駐し、基本サービスを提供するＯＳの基本部分である。カーネルは、ハードウェアに最も近いＯＳの部分であり、一般にデバイス・ドライバ２１４とインターフェースをとることによってハードウェアを活動化する。カーネル２１８の上に、コンピュータ・システム上で動作するアプリケーション・プログラム２２２が存在する。カーネル２１８は、電源管理サービスを提供し管理するために設計されたＯＳの一部である、ＯＳＰＭ（ＯＳ電源管理）システム・コード（OS power management system code）２２０とも対話を行う。
【００１６】
図３には、マルチノード・システム３００の一例が示されており、それに関して本発明の実施形態を実施することができる。マルチノード・システム３００は、相互接続３０６を介して相互接続された複数のノード３０２ａ、３０２ｂ、．．．、３０２ｎを含む。ノードの１つは一次すなわちブート・ノードであり、その他のノードはこの一次ノードにとって遠隔ノードである。ノードの各々は、１つまたは複数のＣＰＵ（中央処理装置）、または揮発性もしくは不揮発性メモリ、あるいはその両方、およびハードディスク・ドライブやフロッピー（Ｒ）ディスク・ドライブなどの１つまたは複数の記憶装置のうちのどれかまたはすべてを任意選択で含むことができる。例えば、ノード３０２ａは、ＣＰＵ３０８、メモリ３１２、サービス・プロセッサ３１０、および記憶装置３１４を有する。同様に、ノード３０２ｂは、ＣＰＵ３１６、サービス・プロセッサ３１８、メモリ３２０、および記憶装置３２２を有する。最後に、ノード３０２ｎは、ＣＰＵ３２４、サービス・プロセッサ３２６、メモリ３２８、および記憶装置３３０を有する。
【００１７】
遠隔ノードのハードウェア・イベントの一次すなわちブート・ノードへの集約化
図４は、本発明の一実施形態によってハードウェア・イベントがどのように遠隔ノードから一次すなわちブート・ノードに集約化されるかを示したフローチャート４００である。チャート４００は、遠隔ノードＮとも呼ばれる第１の遠隔ノード４０２によって実行される機能と、一次すなわちブート・ノード４０４によって実行される機能とに分かれている。第１の遠隔ノード４０２によって実行される機能は、一次ノード４０４によって実行される機能から、破線４０６によって分けられている。遠隔ノード４０２以外の他の遠隔ノードを含むこともできる。
【００１８】
チャート４００は、第１の遠隔ノード４０２で発生したイベントの一次ノード４０４への集約化または合体化（coalescing）に関して描かれている。あるイベントが、線４１４で示すようにノード４０２のホットプラグ・コントローラ４１２で発生する。このイベントをホットプラグＡＣＰＩイベントとする。コントローラ４１２はハードウェア・カードのノード４０２への挿入およびノード４０２からの取り外しを検出する。前記イベントがノード４０２の割り込みルータ４１６によって検出され、それに応答して、割り込みルータが、線４１８で示すようにＰＭＩ割り込みを生成する。この割り込みの生成に応答して、ノード４０２のＰＭＩ割り込みハンドラ４２０は最初に、線４２２で示すようにＰＭＩ割り込みを無効化する。割り込みハンドラ４２０は次に、一次ノード４０４のＧＰＩＯレジスタ４２６にイベントを書き込むことによって、線４２４で示すようにイベントを一次ノードに通知する。
【００１９】
一次ノード４０４のＧＰＩＯレジスタ４２６にイベントを書き込むと、線４２８に示すように、そのイベントは一次ノード４０４のＧＰＥレジスタ４３０に自動的に転送される。これは、ＧＰＩＯレジスタ４２６がＧＰＥレジスタ４３０に少なくとも通信可能に結合され、好ましくは直接接続されるためである。ＧＰＥレジスタ４３０へのイベントの書き込みによって、線４３２に示すようにＳＣＩ割り込みが生成される。ＯＳの割り込みハンドラ４３４がＳＣＩ割り込みを処理し、それに応答して、ノード４０４のＡＣＰＩドライバ４３８が、線４３６に示すように呼び出される。ドライバ４３８は、それに応答して、線４４２に示すようにＡＭＬメソッド４４４を呼び出す。ＡＭＬメソッド４４４は、遠隔ハードウェア・イベントを処理するように特殊化されており、一方、ドライバ４３８は好ましくは、遠隔ハードウェア・イベントを処理するように修正されていない標準的なＡＣＰＩドライバとする。同様に、ＧＰＥレジスタ４３０は、ローカル・ハードウェア・イベントを処理するのに通常使用されるＡＣＰＩレジスタとする。
【００２０】
ＡＭＬメソッド４４４は最初、線４４６に示すようにコントローラ４１２を操作して、コントローラ４１２で発生した遠隔イベントを処理する。この処理が行われると、ＡＭＬメソッド４４４は、線４４８に示すようにホットプラグ・イベントを消去し、線４５０に示すようにＰＭＩ割り込みを再有効化する。ＡＭＬメソッド４４４は最後に、線４５２に示すようにイベントの処理が完了したことをドライバ４３８に通知し、線４４０に示すようにＧＰＥレジスタ４３０からイベントを消去する。このようにして、第１の遠隔ノード４０２のホットプラグ・コントローラ４１２で生成されたホットプラグ・イベントが、レジスタ４３０と、これ以外の通常の場合は一次ノードのイベント用に使用されるマルチノード非認識のドライバ４３８とを利用して当該イベントを処理するために、ブート・ノード４０４に集約化される。これは、一次ノード４０４以外のノードのイベント用に予約されたレジスタとしてレジスタ４２６を使用することによって、またマルチノード認識のメソッド４４４を備えることによって達成される。言い換えれば、ドライバ４３８はマルチノード非認識コードであるが、メソッド４４４はマルチノード認識プロセスである。遠隔ノード４０２以外の他の遠隔ノードも同様の方式で処理される。
【００２１】
図５には、本発明の一実施形態によるマルチノード・システム５００が示されており、このシステム内で図４のフローチャート４００を実施することができる。システム５００は一次ノード５０２を含み、一次ノードは、図５には示されていない相互接続を介すなどして、複数の遠隔ノード５１０ａ、５１０ｂ、．．．、５１０ｎに通信可能に結合される。例えば、一次ノード５０２を図４のブート・ノード４０４とすることができ、一方、図４の遠隔ノード４０２を遠隔ノード５１０ａ、５１０ｂ、．．．、５１０ｎの１つとすることができる。
【００２２】
一次ノード５０２は、好ましくはマルチノード非認識のオペレーティング・システムのイベント・ドライバ５０４と、好ましくはマルチノード認識のメソッド５１２とを含む。ドライバ５０４を図４のＡＣＰＩドライバ４３８とすることができ、一方、メソッド５１２を図４のメソッド４４４とすることができる。ドライバ５０４は、一次ノード５０２で生成されたイベントに対して標準化されており、遠隔ノード５１０ａ、５１０ｂ、．．．、５１０ｎで生成されたイベントに対してはそうでないという点で、マルチノード非認識である。一次ノード５０２はレジスタ５０６およびレジスタ５０８を含み、それぞれ図４のＧＰＥレジスタ４３０およびＧＰＩＯレジスタ４２６を含むことができる。レジスタ５０６は好ましくは、一次ノード５０２で発生したハードウェア・イベント用に通常は予約されており、一方、レジスタ５０８は好ましくは、遠隔ノード５１０ａ、５１０ｂ、．．．、５１０ｎで発生したイベント用に予約されている。
【００２３】
したがって、システム５００内では、遠隔ノード５１０ａ、５１０ｂ、．．．、５１０ｎで生成されたイベントは、転送先として予約されているレジスタ５０８に転送される。それに自動的に応答して、次にこのイベントは、レジスタ５０８からレジスタ５０６に転送されて、一次ノード５０２でのハードウェア・イベント用に通常は予約されているレジスタ５０６が使用される。レジスタ５０８は好ましくは、レジスタ５０６に直接接続されるが、レジスタ５０８は少なくとも、レジスタ５０６に通信可能に結合される。マルチノード非認識であるためイベントが一次ノード５０２で生成されたと認識しているドライバ５０４は次に、メソッド５１２を呼び出す。メソッド５１２はマルチノード認識であるので、イベントを適切に操作し処理することができる。このようにして、マルチノード非認識のドライバ５０４が、マルチノード・システム５００内でも利用できるようになる。
【００２４】
従来技術を上回る利点
本発明の実施形態は、従来技術を上回る利点を提供する。本発明は、本発明を用いなければマルチノード認識でもマルチノード操作可能（multi-node operable）でもないアーキテクチャ内で、遠隔ノードのハードウェア・イベントを一次すなわちブート・ノードに合体化または集約化することを可能にする。こうした集約化は、当該アーキテクチャの一次ノードのドライバを必ずしも書き換える必要なく達成され、これらのドライバが通常は一次ノードのイベント用に参照するレジスタを使用して達成される。したがって、ＡＣＰＩイベントなどのハードウェア・イベントの集約化は、ＡＣＰＩ仕様から逸脱することなく、またＡＣＰＩ準拠のドライバを変更することなく、達成することができる。
【００２５】
代替実施形態
本明細書では説明のために本発明の具体的な実施形態を説明してきたが、本発明の主旨および範囲から逸脱することなく様々な修正を施し得ることは理解されよう。例えば、本発明は実質的に、ホットプラグ・イベントなどのＡＣＰＩハードウェア・イベントに関して説明された。しかし、本発明自体は、そのようなイベントに制限されるものではない。例えば、本発明は、その他のタイプのイベント、およびその他のタイプのハードウェア・イベントにも適合可能である。
【００２６】
別の例として、本発明は実質的に、他の通常の場合は一次ノードで生成されたイベント用に予約されたマルチノード非認識のドライバとある種のレジスタ群とに関して説明された。しかし、本発明自体は、そのようなものに制限されるものではない。例えば、本発明は、特別に記述されたマルチノード認識のドライバにも適合可能であり、そうしたドライバは、マルチノード・システムのすべての遠隔ノードが使用するためにパーティション分割されたある種のレジスタ群を精査することができる。すなわち、ドライバによって直接読み取られるレジスタ群は、マルチノード・システムで使用するためにパーティションに分割することができ、本発明が実質的に説明されたように、一次ノードで使用するために通常は予約されることがない。したがって、本発明の保護の範囲は、添付の特許請求の範囲およびその均等物によってのみ制限される。
【図面の簡単な説明】
【００２７】
【図１】登録特許の第１ページに記載するよう推奨された、本発明の好ましい実施形態による方法のフローチャートである。
【図２】本発明の実施形態を実施することができる、例示的なハードウェア・イベント・アーキテクチャの図である。
【図３】本発明の実施形態を実施することができる、マルチノード・システムの一例の図である。
【図４】本発明の一実施形態による、図３のマルチノード・システムなどのマルチノード・システム内で行われる、図２のイベント・アーキテクチャなどに関するイベントの転送および処理の全体フローを詳細に示した図である。
【図５】本発明の一実施形態によるマルチノード・システム、特にその一次ノードのレジスタの図である。
[Technical field]
[0001]
The present invention relates generally to multi-node computer systems, that is, computer systems having two or more nodes each comprising processors, memory, and so on, and more particularly to the management of hardware events generated by the nodes of such systems.
[Background]
[0002]
In a computer system, various pieces of hardware generate events that need to be processed. For example, the ACPI (Advanced Configuration and Power Interface) specification provides a power management and configuration mechanism. In an ACPI-compatible computer system that includes ACPI-compatible hardware, the system itself can power on and off in response to internal and external events, and specific hardware elements can also be managed from a power vantage point. A network card in a low-power mode can, for example, generate an event when it receives a data packet from the network to which it is connected, and can then leave the low-power mode. The event is received by the computer system of which the network card is a part, so that, for example, the computer system can also leave a low-power mode it had previously entered. Another type of event is the hot-plug event, which occurs when a hardware card is inserted into or removed from the computer system while the system is running.
[Disclosure of the invention]
[Problems to be solved by the invention]
[0003]
A disadvantage of ACPI events, as well as of other types of hardware events, is that they assume event handling is performed by the computer system of which the event-generating hardware is a part. That is, hardware events are often defined on the basis of a multi-node-unaware architecture, which therefore assumes a single-node system. In a multi-node computer system there are multiple nodes, each with its own processors, memory, and so on, over which processing is distributed. Furthermore, the chipsets that implement ACPI event hardware are themselves generally multi-node unaware. ACPI-capable operating systems generally assume that there is only one instance of ACPI event hardware in the system, and usually do not recognize duplicate ACPI event hardware.
[0004]
Current hardware event handling architectures therefore often assume that an event generated within a remote node of a multi-node computer system is handled by that node. For example, there is no mechanism by which the primary or boot node of the system receives notification of the event, nor one by which that node handles the event and instructs the remote node how to process it. Nor does standard ACPI event hardware provide a mechanism for informing the operating system on which node an event occurred. This is a problem, and it arises because the operating system's policy for operating the hardware assumes a single instance of ACPI event hardware in the entire system. That assumption does not hold for multi-node systems designed around standard ACPI event hardware. For these reasons, and for other reasons, there is a need for the present invention.
[Means for Solving the Problems]
[0005]
The present invention relates to the aggregation of hardware events in a multi-node system. In the method of the invention, an event occurring at a remote node is forwarded to the primary node by the remote node's firmware writing to a first register of the primary node. The event is communicated from the first register of the primary node to a second register of the primary node. Automatically in response, an interrupt is generated at the primary node. In response to the generation of the interrupt, an interrupt handler at the primary node calls code at the primary node to process the event that occurred at the remote node.
[0006]
The multi-node system of the present invention includes a primary node and one or more remote nodes. The primary node has a first register and a second register communicatively coupled to each other. The second register is normally reserved for events of the primary node. The primary node also includes multi-node-unaware code for processing an event; this code is invoked in response to an interrupt that is generated in response to the transfer of the event from the first register to the second register. The event occurs at a remote node and is forwarded to the first register of the primary node, and the final processing of the event is performed by the primary node. The event is communicated automatically from the first register of the primary node to the second register.
[0007]
The article of manufacture of the present invention includes a computer-readable medium and means in the medium. The means is for automatically communicating an event written to a first register of a primary node to a second register of the primary node, so that an event that occurred at a remote node is transferred from the first register to the second register. Automatically in response to the write to the second register, an interrupt is generated at the primary node. The means is also for invoking code at the primary node, in response to the generation of the interrupt, to process the event. Other features and advantages of the present invention will become apparent from the following detailed description of the presently preferred embodiments of the invention, taken in conjunction with the accompanying drawings.
[Best mode for carrying out the invention]
[0008]
Overview
FIG. 1 illustrates a method 100 according to a preferred embodiment of the present invention. The functionality of method 100 can be implemented as means in a computer-readable medium of an article of manufacture. For example, the computer-readable medium can be a recordable data storage medium or a modulated carrier signal. The parts of method 100 are performed at a remote node and at a primary or boot node of a multi-node system, indicated by columns 102 and 104, respectively, which are separated by dashed line 106.
[0009]
A hardware event initially occurs at the remote node (108). The hardware event may be an ACPI (Advanced Configuration and Power Interface) event, such as a hot-plug event, or another type of hardware event or other event. The event generates an interrupt, such as a PMI (platform management interrupt), at the remote node (110). Specifically, a component of the remote node, such as an interrupt router, generates the interrupt. In response to the generation of the interrupt, a firmware interrupt handler is invoked on the remote node and detects the event (112). The firmware interrupt handler forwards the event from the remote node to the primary node by writing the event to a first register of the primary node (114). For example, the PMI handler of the remote node's firmware can forward the event to a GPIO (general purpose input/output) register of the primary node.
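Steps 108 to 114 on the remote node can be sketched as a small Python model. This is only an illustration under stated assumptions: the class names, the `HOT_PLUG_BIT` value, and the representation of registers as integers are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass

HOT_PLUG_BIT = 0x01  # hypothetical bit assignment for a hot-plug event


@dataclass
class PrimaryNode:
    gpio: int = 0  # models the first register (GPIO) of the primary node


@dataclass
class RemoteNode:
    primary: PrimaryNode
    pending_events: int = 0

    def raise_hardware_event(self, event_bit: int) -> None:
        """Steps 108-110: the event occurs and the interrupt router raises a PMI."""
        self.pending_events |= event_bit
        self.pmi_handler()

    def pmi_handler(self) -> None:
        """Steps 112-114: the firmware handler detects the event and forwards it
        by writing to the primary node's first register."""
        self.primary.gpio |= self.pending_events
```

In this model the remote node never touches the primary node's event register directly; it only writes the register set aside for forwarded events.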
[0010]
At the primary node, automatically in response to the event being written to the first register, the event is communicated over a connection that directly links the output of the first register to the input of the second register of the primary node (116). The second register is preferably normally reserved for events that occur at the primary node, not for events that occur at a remote node. In this way, method 100 makes use of a register of the primary node, such as the second register, that is normally used exclusively for events that occur at the primary node. For ACPI events, the second register can be a GPE (general purpose event) register reserved for such events. Writing the event to the second register generates an interrupt, such as an SCI (system configuration interrupt), at the primary node (118).
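The automatic propagation in steps 116 and 118 can be modelled in software with write listeners standing in for the hardware wiring. The following is a minimal sketch, assuming a hypothetical `Register` class and a list that records raised interrupts; real hardware would do this without any code running.

```python
class Register:
    """Toy register whose writes are propagated to attached listeners."""

    def __init__(self, name):
        self.name = name
        self.value = 0
        self.listeners = []  # callables that model hardware wired to this register

    def write(self, bits):
        self.value |= bits
        for listener in self.listeners:
            listener(bits)


def build_primary_registers():
    """Wire GPIO -> GPE (step 116) and GPE -> SCI (step 118)."""
    interrupts = []
    gpio = Register("GPIO")  # first register
    gpe = Register("GPE")    # second register
    gpio.listeners.append(gpe.write)
    gpe.listeners.append(lambda bits: interrupts.append(("SCI", bits)))
    return gpio, gpe, interrupts
```

A single call such as `gpio.write(0x01)` then sets the same bit in `gpe` and records one SCI, which mirrors the chain described in the paragraph above.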
[0011]
The generation of the interrupt at the primary node causes code to be invoked (120). This code is for processing the event that ultimately caused the interrupt. For example, in the case of an ACPI event, the code can be the ACPI driver of the operating system (OS). The code is multi-node unaware in that it does not know that the event was not generated at the primary node. In this way, method 100 can use standard drivers, such as a standard ACPI driver, and these drivers therefore do not need to be rewritten to accommodate events that occur at nodes other than the primary node.
[0012]
The code invokes a specific process to handle the event (122). Unlike the code, this process is multi-node aware; specifically, it is designed to handle events that occur at nodes other than the primary node. For example, the process can be an AML (ACPI Machine Language) method. AML is a compact, tokenized, abstract machine language. The process issues appropriate instructions to the remote node for processing the event (124), and the event is processed at the remote node in accordance with those instructions (126). For example, the process remotely operates the hardware at which the event occurred, such as a controller or other hardware.
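The split between multi-node-unaware code and a multi-node-aware process (steps 120 to 126) can be sketched as a dispatch table. The names `make_driver`, `RemoteController`, and the instruction string are hypothetical; an actual system would use an ACPI driver and an AML method rather than Python callables.

```python
def make_driver(methods):
    """Multi-node-unaware driver: maps a set status bit to a handler method
    without knowing where the event originated."""
    def driver(status):
        return [method() for bit, method in methods.items() if status & bit]
    return driver


class RemoteController:
    """Stands in for hardware on the remote node."""

    def __init__(self):
        self.log = []

    def service(self, instruction):
        self.log.append(instruction)


controller = RemoteController()


def remote_hot_plug_method():
    # Multi-node aware: this method knows the hardware lives on a remote node
    # and issues instructions to it (steps 124-126).
    controller.service("handle hot-plug")
    return "done"


driver = make_driver({0x01: remote_hot_plug_method})
```

The driver itself contains no node-specific logic; all knowledge of the remote node is confined to the method it happens to invoke.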
[0013]
Technical Background
FIG. 2 shows an example of an event architecture 200 on the basis of which embodiments of the present invention can be implemented. Architecture 200 is specifically for ACPI events and is described with respect to a single-node system, but it can be extended to multi-node systems in accordance with embodiments of the invention. Platform hardware 202 includes control cards and other types of hardware that generate ACPI events. Hardware 202 can receive its settings from, for example, a BIOS (basic input/output system) 204, which is a kind of firmware. The OS-independent components 206 include ACPI registers 208, an ACPI BIOS 210, and ACPI tables 212. ACPI registers 208 include the second register described in the previous section of the detailed description. ACPI BIOS 210 is a kind of firmware and can record the ACPI settings of hardware 202. ACPI tables 212 describe the interface to hardware 202 that is used by the ACPI driver and AML interpreter 216.
[0014]
The ACPI driver and AML interpreter 216 preferably includes, or has access to, the process described in the previous section of the detailed description, which is invoked by the ACPI driver to handle remote node events. The ACPI driver is generally standard for a given OS and includes an AML interpreter for interpreting and parsing processes written in AML. That is, the process can be non-standard and specific to a given situation, such as remote node events, while the ACPI driver itself is generally standard. The ACPI driver and AML interpreter 216 interacts with the device drivers 214 of the OS on behalf of platform hardware 202. A device driver is a software routine that links hardware to the OS.
[0015]
In addition, the device drivers 214 and the ACPI driver and AML interpreter 216 interact with the OS kernel 218. The kernel is the fundamental part of the OS that usually resides in memory and provides basic services. It is the part of the OS closest to the hardware and generally activates the hardware by interfacing with the device drivers 214. Above kernel 218 are the application programs 222 that run on the computer system. Kernel 218 also interacts with the OSPM (OS power management) system code 220, which is the part of the OS designed to provide and manage power management services.
[0016]
FIG. 3 shows an example of a multi-node system 300 in conjunction with which embodiments of the present invention can be implemented. Multi-node system 300 includes a plurality of nodes 302a, 302b, ..., 302n interconnected via an interconnect 306. One of the nodes is the primary or boot node, and the other nodes are remote nodes with respect to this primary node. Each node can optionally include any or all of: one or more CPUs (central processing units); volatile memory, non-volatile memory, or both; and one or more storage devices, such as hard disk drives or floppy disk drives. For example, node 302a has a CPU 308, a memory 312, a service processor 310, and a storage device 314. Similarly, node 302b has a CPU 316, a service processor 318, a memory 320, and a storage device 322. Finally, node 302n has a CPU 324, a service processor 326, a memory 328, and a storage device 330.
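The node layout of FIG. 3 can be represented by a small data structure. This is a hedged sketch: the field names, the default values, and the choice of node 0 as the boot node are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    cpus: int = 1                      # one or more CPUs
    has_service_processor: bool = True
    storage_devices: int = 1           # e.g. hard disk or floppy disk drives
    is_primary: bool = False           # True for the primary (boot) node


def build_system(node_count: int) -> List[Node]:
    """Create node_count nodes; by assumption, node 0 is the boot node."""
    return [Node(name=f"node{i}", is_primary=(i == 0)) for i in range(node_count)]


def remote_nodes(nodes: List[Node]) -> List[Node]:
    """Every node other than the primary is remote relative to it."""
    return [node for node in nodes if not node.is_primary]
```

The interconnect is left out of the model because the description only requires that the nodes can reach each other.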
[0017]
Aggregation of Remote Node Hardware Events to the Primary or Boot Node
FIG. 4 is a flowchart 400 showing how hardware events are aggregated from a remote node to the primary or boot node according to one embodiment of the present invention. Chart 400 is divided into functions performed by a first remote node 402, also called remote node N, and functions performed by the primary or boot node 404. The functions performed by the first remote node 402 are separated from those performed by primary node 404 by dashed line 406. Remote nodes other than remote node 402 can also be included.
[0018]
Chart 400 is drawn with respect to the aggregation, or coalescing, of an event occurring at the first remote node 402 to primary node 404. An event occurs at hot-plug controller 412 of node 402, as indicated by line 414. This event is a hot-plug ACPI event. Controller 412 detects the insertion of hardware cards into node 402 and their removal from node 402. The event is detected by interrupt router 416 of node 402, and in response the interrupt router generates a PMI interrupt, as indicated by line 418. In response to the generation of this interrupt, PMI interrupt handler 420 of node 402 first disables the PMI interrupt, as indicated by line 422. Interrupt handler 420 then notifies the primary node of the event, as indicated by line 424, by writing the event to GPIO register 426 of primary node 404.
[0019]
Writing the event to GPIO register 426 of primary node 404 automatically transfers the event to GPE register 430 of primary node 404, as indicated by line 428. This is because GPIO register 426 is at least communicatively coupled, and preferably directly connected, to GPE register 430. Writing the event to GPE register 430 generates an SCI interrupt, as indicated by line 432. The OS interrupt handler 434 handles the SCI interrupt, and in response ACPI driver 438 of node 404 is invoked, as indicated by line 436. In response, driver 438 invokes AML method 444, as indicated by line 442. AML method 444 is specialized to handle remote hardware events, whereas driver 438 is preferably a standard ACPI driver that has not been modified to handle remote hardware events. Similarly, GPE register 430 is an ACPI register that is normally used to handle local hardware events.
[0020]
AML method 444 first operates controller 412, as indicated by line 446, to process the remote event that occurred at controller 412. Once this processing is done, AML method 444 clears the hot-plug event, as indicated by line 448, and re-enables the PMI interrupt, as indicated by line 450. Finally, AML method 444 notifies driver 438 that processing of the event is complete, as indicated by line 452, and clears the event from GPE register 430, as indicated by line 440. In this way, the hot-plug event generated at hot-plug controller 412 of the first remote node 402 is aggregated to boot node 404 so that it can be processed using register 430 and the multi-node-unaware driver 438, both of which are otherwise normally used for events of the primary node. This is achieved by using register 426 as a register reserved for events of nodes other than primary node 404, and by providing the multi-node-aware method 444. In other words, driver 438 is multi-node-unaware code, whereas method 444 is a multi-node-aware process. Remote nodes other than remote node 402 are handled in the same manner.
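The end-to-end sequence of FIG. 4 can be traced with a toy simulation in which each step records the reference numeral of the corresponding line in the figure. This is an illustrative sketch only: the class `Fig4Sim`, the `EVENT_BIT` value, and the use of direct method calls in place of interrupts are all assumptions.

```python
EVENT_BIT = 0x01  # hypothetical bit for the hot-plug event


class Fig4Sim:
    def __init__(self):
        self.pmi_enabled = True
        self.hot_plug_pending = False
        self.gpio = 0  # register 426
        self.gpe = 0   # register 430
        self.trace = []

    # --- remote node 402 ---
    def hot_plug_event(self):
        self.hot_plug_pending = True
        self.trace.append("414 event at controller")
        if self.pmi_enabled:
            self.trace.append("418 PMI raised")
            self.pmi_handler()

    def pmi_handler(self):
        self.pmi_enabled = False
        self.trace.append("422 PMI disabled")
        self.trace.append("424 GPIO written")
        self.write_gpio(EVENT_BIT)

    # --- primary node 404 ---
    def write_gpio(self, bits):
        # The text does not say when the GPIO register is cleared,
        # so this model simply leaves it latched.
        self.gpio |= bits
        self.gpe |= bits
        self.trace.append("428 GPIO to GPE")
        self.trace.append("432 SCI raised")
        self.acpi_driver()

    def acpi_driver(self):
        self.trace.append("436 driver invoked")
        self.trace.append("442 AML method invoked")
        self.aml_method()

    def aml_method(self):
        self.trace.append("446 controller serviced")
        self.hot_plug_pending = False
        self.trace.append("448 hot-plug event cleared")
        self.pmi_enabled = True
        self.trace.append("450 PMI re-enabled")
        self.trace.append("452 driver notified")
        self.gpe = 0
        self.trace.append("440 GPE cleared")
```

Calling `Fig4Sim().hot_plug_event()` yields a trace in the order 414, 418, 422, 424, 428, 432, 436, 442, 446, 448, 450, 452, 440, and leaves the PMI enabled and the GPE register clear, ready for the next event.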
[0021]
FIG. 5 shows a multi-node system 500 according to one embodiment of the invention, within which flowchart 400 of FIG. 4 can be implemented. System 500 includes a primary node 502, which is communicatively coupled to a plurality of remote nodes 510a, 510b, ..., 510n, for example via an interconnect not shown in FIG. 5. For example, primary node 502 can be boot node 404 of FIG. 4, while remote node 402 of FIG. 4 can be one of remote nodes 510a, 510b, ..., 510n.
[0022]
Primary node 502 includes an operating system event driver 504, which is preferably multi-node unaware, and a method 512, which is preferably multi-node aware. Driver 504 can be ACPI driver 438 of FIG. 4, while method 512 can be method 444 of FIG. 4. Driver 504 is multi-node unaware in that it is standardized for events generated at primary node 502 and not for events generated at remote nodes 510a, 510b, ..., 510n. Primary node 502 includes register 506 and register 508, which can include GPE register 430 and GPIO register 426 of FIG. 4, respectively. Register 506 is preferably normally reserved for hardware events that occur at primary node 502, whereas register 508 is preferably reserved for events that occur at remote nodes 510a, 510b, ..., 510n.
[0023]
Within system 500, therefore, events generated at remote nodes 510a, 510b, ..., 510n are forwarded to register 508, which is reserved as their destination. Automatically in response, the event is then transferred from register 508 to register 506, so that register 506, which is normally reserved for hardware events at primary node 502, is used. Register 508 is preferably directly connected to register 506, but is at least communicatively coupled to it. Driver 504, which, being multi-node unaware, believes the event was generated at primary node 502, then invokes method 512. Because method 512 is multi-node aware, it can handle and process the event appropriately. In this way, the multi-node-unaware driver 504 becomes usable within multi-node system 500.
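The division of labour in system 500 can be sketched with several remote nodes sharing one forwarding register. The description does not say how method 512 determines which remote node raised the event; polling a per-node `pending` flag is purely an assumption made for this sketch, as are all class and attribute names.

```python
class RemoteNode:
    def __init__(self, name):
        self.name = name
        self.pending = False  # hypothetical per-node status flag


class PrimaryNode:
    def __init__(self, remotes):
        self.remotes = remotes
        self.reg508 = 0   # reserved for forwarded remote events
        self.reg506 = 0   # normally reserved for primary-node events
        self.handled = []

    def write_508(self, bits):
        self.reg508 |= bits
        self.reg506 |= bits  # automatic propagation from 508 to 506
        self.driver_504()

    def driver_504(self):
        # Multi-node unaware: it only sees that a bit is set in register 506.
        if self.reg506:
            self.method_512()
            self.reg506 = 0

    def method_512(self):
        # Multi-node aware: by assumption, it polls each remote node.
        for remote in self.remotes:
            if remote.pending:
                self.handled.append(remote.name)
                remote.pending = False
```

From the driver's point of view the event looks identical whichever remote node wrote register 508; only the method distinguishes the source.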
[0024]
Advantages over the Prior Art
Embodiments of the present invention provide advantages over the prior art. The invention makes it possible to coalesce, or aggregate, remote node hardware events to the primary or boot node within an architecture that would otherwise be neither multi-node aware nor multi-node operable. This aggregation is achieved without necessarily having to rewrite the primary node drivers of that architecture, and it is achieved using the registers that those drivers normally consult for primary node events. The aggregation of hardware events such as ACPI events can therefore be achieved without departing from the ACPI specification and without changing ACPI-compliant drivers.
[0025]
Alternative Embodiments
Although specific embodiments of the invention have been described herein for purposes of illustration, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. For example, the invention has been substantially described with respect to ACPI hardware events, such as hot-plug events. The invention itself, however, is not limited to such events. For example, it is also applicable to other types of events and to other types of hardware events.
[0026]
As another example, the invention has been substantially described in relation to a multi-node-unaware driver and certain registers that are otherwise normally reserved for events generated at the primary node. The invention itself, however, is not so limited. For example, the invention is also applicable to a specially written multi-node-aware driver, which can examine certain registers that are partitioned for use by all remote nodes of the multi-node system. That is, the registers read directly by the driver can be partitioned for use across the multi-node system, rather than being normally reserved for use by the primary node as in the embodiments substantially described. Accordingly, the scope of protection of the invention is limited only by the appended claims and their equivalents.
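One way to picture a partitioned register examined by a multi-node-aware driver is a single word divided into fixed-width per-node fields. The partition width, the encoding, and the function names below are hypothetical; the description does not specify how the partitioning is done.

```python
BITS_PER_NODE = 4  # hypothetical width of each node's partition
NODE_MASK = (1 << BITS_PER_NODE) - 1


def encode(node_index: int, event_bits: int) -> int:
    """Place a node's event bits into that node's partition of the register."""
    return (event_bits & NODE_MASK) << (node_index * BITS_PER_NODE)


def scan(register_value: int, node_count: int) -> dict:
    """Multi-node-aware scan: return {node_index: event_bits} for every
    node whose partition has at least one bit set."""
    found = {}
    for index in range(node_count):
        bits = (register_value >> (index * BITS_PER_NODE)) & NODE_MASK
        if bits:
            found[index] = bits
    return found
```

With such a layout the driver itself learns the originating node from the register contents, so no separate multi-node-aware method is needed to recover it.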
[Brief description of the drawings]
[0027]
FIG. 1 is a flowchart of a method according to a preferred embodiment of the present invention, and is recommended for printing on the first page of the issued patent.
FIG. 2 is an exemplary hardware event architecture diagram in which embodiments of the invention may be implemented.
FIG. 3 is a diagram of an example multi-node system in which embodiments of the invention may be implemented.
FIG. 4 is a diagram showing in detail the overall flow of event forwarding and processing, for example with respect to the event architecture of FIG. 2, as performed within a multi-node system such as that of FIG. 3, according to one embodiment of the invention.
FIG. 5 is a diagram of a multi-node system, particularly its primary node registers, according to one embodiment of the invention.

Claims

A method comprising:
transferring (114) an event occurring at a remote node from the remote node to a primary node, by firmware of the remote node writing to a first register (508) of the primary node;
communicating the event from the first register of the primary node to a second register (506) of the primary node;
automatically generating an interrupt at the primary node in response to the write to the second register of the primary node; and
calling code at the primary node, by an interrupt handler at the primary node in response to the interrupt, to process the event that occurred at the remote node.

The method of claim 1, wherein the second register of the primary node is normally reserved for events of the primary node, and the code invoked on the primary node is multi-node unaware.

A method according to any one of the above, wherein the second register of the primary node is reserved for events of all nodes, including the primary node and the remote node, and the code invoked on the primary node is multi-node aware.

The method according to any one of claims 1 to 3, further comprising invoking a multi-node-aware process (512) at the primary node, by the code of the primary node in response to the invocation of the code, to process the event that occurred at the remote node.

The method according to claim 1, further comprising generating a remote interrupt at the remote node automatically in response to the event occurring at the remote node, wherein the event is transferred to the primary node by writing to the first register of the primary node.

The method of claim 5, further comprising initially generating the event in hardware of the remote node, wherein, automatically in response to the event occurring at the remote node, an interrupt router of the remote node generates the remote interrupt at the remote node.

The method according to any of claims 1 to 6, wherein the event that occurred at the remote node and at least some events of the primary node comprise asynchronous hardware events.

A multi-node computer system for carrying out the method according to claim 1.

An article comprising:
a computer-readable medium; and
means in the medium for performing the method of any of claims 1 to 7.