JP2002509311A

JP2002509311A - Scalable single-system image operating software for multi-processing computer systems

Info

Publication number: JP2002509311A
Application number: JP2000540495A
Authority: JP
Inventors: Paul A. Leskar; Jonathan L. Bertoni
Original assignee: SRC Computers, Inc.
Priority date: 1998-01-20
Filing date: 1998-12-03
Publication date: 2002-03-26
Also published as: EP1064597A1; WO1999036851A1; CA2317132A1

Abstract

(57) Abstract: A scalable single system image ("S31") operating system architecture for a multi-processing computer system having separate service (16) and compute (18) processors. The two kinds of processor differ in distinctive ways, but both have shared, common access to all of the computer system memory. The compute processors have no input/output ("I/O") devices mapped directly to them, while the service processors have full I/O capability. A single operating system software image presents a single system application programming interface to application programs across all of the processors, and a communication mechanism between the compute processors and the service processors allows multiple requests to the service processors and a fast, asynchronous interrupt response to each request. A compute scheduler runs on each compute processor and provides the interface to the service processors, on which the operating software runs.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

The present invention relates generally to the field of multi-processing computer systems. More particularly, it relates to scalable single system image ("S31") operating software for a multi-processing computer system.

[0002] The connection of two or more homogeneous processors to a single, monolithic central memory is called multi-processing. Until recently, hardware and software limitations kept the total number of physical processors that could efficiently access a single memory very small. As the number of processors in a computer system grows, these limitations erode computational efficiency and reduce the overall scalability of the system. With the advent of faster and ever cheaper microprocessors, systems with large processor counts are becoming a hardware reality.

[0003] Hardware advances have allowed interconnection networks to multiplex hundreds of processors onto a single memory with minimal loss of efficiency. Because of performance problems with software locking primitives in large configurations, however, system designers have yet to achieve the operating software scalability needed to accommodate large numbers of processors efficiently.

[0004]

SUMMARY OF THE INVENTION

In response to these computer system architecture efficiency problems, SRC Computers, Inc., of Colorado Springs, Colorado, has developed an affordable, high-performance computer that replaces the traditional high-performance supercomputer by giving the system a large shared memory and using fast commodity processors to obtain high-bandwidth input/output ("I/O") functionality. This was accomplished by balancing processor speed, memory size and I/O bandwidth to achieve a high degree of efficiency between the system's hardware and software, which in turn yields a greater degree of parallelism.

[0005] Disclosed herein is a computer system that uses a scalable single system image ("S31") operating software architecture. The architecture allows operating software to scale efficiently from a few processors to hundreds of processors in a multi-processor environment while efficiently presenting a single system image. The S31 architecture of the present invention substantially eliminates the need for massively parallel processing ("MPP") architectures, simply because distributed memory and message passing synchronization primitives are no longer required. A flat memory model that is simple and easy to program replaces message passing. A common application programming interface/application binary interface ("API/ABI") is presented to all applications. Because the message passing paradigm is eliminated, performance improves greatly both computationally and in terms of input/output ("I/O").

[0006] Disclosed herein is a scalable single system image ("S31") operating system architecture for a multi-processing computer system having separate service processors and compute processors. The two kinds of processor differ in distinctive ways, but both have shared, common access to all of the computer system memory. The compute processors have no input/output ("I/O") devices mapped directly to them, whereas the service processors can control attached devices. A single operating system software image presents a single system application programming interface to application programs across all of the processors, and a communication mechanism between the compute processors and the service processors allows multiple requests to the service processors and a fast, asynchronous interrupt response to each request. A compute scheduler runs on each compute processor and provides the interface to the service processors, on which the operating system software runs.

[0007] The S31 architecture also improves cache performance by reducing cache conflicts. The conflicts are reduced because the operating software no longer forces application data out of the cache while it services application requests. As the latest generation of multi-processors shows, this improvement becomes more important as processor speed increases relative to memory latency.

[0008] Specifically disclosed herein is a multi-processor computer system that includes operating software. The computer system includes a first plurality of service processors that function in conjunction with the operating software and handle all input/output functions for the computer system. A second plurality of compute processors functions in conjunction with a compute scheduler. The operating software and the compute scheduler provide a communication medium between the service processors and the compute processors.

[0009] Further disclosed herein is a multi-processor computer system that includes operating software. The computer system includes a first plurality of service processors and a second plurality of compute processors. Each of the service processors functions in conjunction with the operating software and handles all input/output functions for the computer system. Each of the compute processors functions in conjunction with a compute scheduler, and the operating software and the compute scheduler provide a communication medium between the service processors and the compute processors.

[0010] The foregoing and other features and objects of the present invention, and the manner of attaining them, will become more apparent, and the invention itself will be best understood, by reference to the following description of a preferred embodiment taken in conjunction with the accompanying drawings.

[0011]

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Referring to FIGS. 1A and 1B, a multi-processing computer system 10 in accordance with the present invention is shown. The exemplary computer system 10 comprises, in pertinent part, any number of interconnected segments 12₀ through 12₁₅, although the principles of the invention are equally applicable to large numbers of processors in any scalable system. As described in more detail below, the various segments 12₀ through 12₁₅ are coupled through a number of trunk lines 14₀ through 14₁₅.

[0012] Each of the segments 12 includes a number of functionally differentiated processing elements in the form of service processors 16₀ through 16₃ (service processor 16₀ additionally serves as the master boot device) and compute processors 18₀ through 18₁₅. The service processors 16 are coupled to a number of peripheral component interconnect ("PCI") interface modules 20. In the embodiment shown, each service processor is coupled to two such modules 20, which enables the service processors 16 to carry out all of the I/O functions of the segment 12.

[0013] The computer system 10 further includes a serial-to-PCI interface 22 that couples a system console 24 to at least one segment 12 of the computer system 10. The system console 24 allows a user of the computer system 10 to download boot information to the computer system 10, configure devices, monitor status and perform diagnostic functions. Only one system console 24 is required, regardless of how many segments 12 are configured in the computer system 10.

[0014] A boot device 26 (for example, a JAZ® removable-disk computer mass storage device available from Iomega Corporation, Roy, UT) is also coupled to the master boot service processor 16₀ through one of the PCI modules 20. The PCI modules 20 coupled to service processors 16₁ through 16₃ are used to couple the segment 12 to all other peripheral devices, such as disk arrays 28₀ through 28₅, any one or more of which may be replaced by, for example, an Ethernet connection.

[0015] The computer system 10 comprises sophisticated hardware and building blocks that are commodity based, with some enhancements to accommodate the particular requirements of high performance computing ("HPC"). On the hardware side, the basic unit of the computer system 10 is the segment 12. Each segment 12 contains compute and service processor 18, 16 elements, memory, power supplies and a crossbar switch assembly. The computer system 10 is "scalable" in that an end user can configure a system consisting of from one to sixteen interconnected segments 12. Each segment 12 contains twenty processors in total: sixteen compute processors 18 and four service processors 16. In a preferred embodiment, the compute processors 18 may reside on individual assemblies, each containing four processors (for example, Deschutes™ microprocessors available from Intel Corporation, Santa Clara, CA) and eight interface chips (that is, two per compute processor 18). Each compute processor 18 has an internal processor clock rate greater than 300 MHz and a system clock speed greater than 100 MHz, and the interface chips provide the connection between the compute processors 18 and the memory switches that connect to memory, as described and shown in more detail below.

[0016] The service processors 16 may be contained on a sub-processor assembly, which is responsible for all input and output for the computer system 10. Each of the service processor assemblies contains a processor (of the same type as the compute processors 18), two interface chips, two 1 Mbyte I/O buffers and two bidirectional PCI buses. Each PCI bus has a single connector. All I/O ports have DMA capability with equal priority to the processors. The PCI modules 20 serve one of two purposes, depending on which service processor 16 uses them. The PCI connectors on the master boot service processor 16₀ are used to connect to the boot device 26 and the system console 24. The PCI modules 20 on the regular service processors 16₁ through 16₃ are used for all other peripheral devices. Supported PCI-based interconnects include the small computer system interface ("SCSI"), fiber distributed data interface ("FDDI"), high performance parallel interface ("HIPPI") and others. Each PCI bus has a corresponding commodity-based host adapter.

[0017] Separating the service functions from the compute functions allows numerical processing to proceed concurrently with operating system duties and the servicing of external peripheral devices.

[0018] With additional reference to FIG. 2, the interconnect strategy for the computer system 10 of FIGS. 1A and 1B is shown in more detail for an implementation employing sixteen segments 12₀ through 12₁₅ interconnected by sixteen trunk lines 14₀ through 14₁₅. As shown, a number of memory banks 50₀ through 50₁₅, each allocated to a respective one of the compute processors 18₀ through 18₁₅ (sixteen memory banks 50 per segment 12, for a total of 256 memory banks 50 in a computer system 10 with sixteen segments 12), form part of the computer system 10 and are respectively coupled to the trunk lines 14₀ through 14₁₅ through a like number of memory switches 52₀ through 52₁₅. The memory used in the memory banks 50₀ through 50₁₅ may be synchronous static random access memory ("SSRAM") or another suitable high-speed memory device. Also as shown, each of the segments 12₀ through 12₁₅ includes, for example, twenty processors (four service processors 16₀ through 16₃ and sixteen compute processors 18₀ through 18₁₅) coupled to the trunk lines 14₀ through 14₁₅ through a corresponding one of a like number of processor switches 54₀ through 54₁₅.

[0019] Each segment 12 interconnects with all other segments 12 through the crossbar switch. The crossbar switch technology of the computer system 10 gives the segments 12 uniform memory access times across segment boundaries as well as within an individual segment 12. It also allows the computer system 10 to employ a single memory access protocol for all of the memory in the system. The crossbar switches use high-speed field programmable gate arrays ("FPGAs") and can provide interconnect paths between memory and processors regardless of where the processors and memory are physically located. The crossbar switch interconnects every segment 12 and allows processors and memory located in different segments 12 to communicate with uniform latency. In a preferred embodiment, each crossbar switch has a latency of one clock per layer, which includes reconfiguration time. A computer system 10 with sixteen segments 12 and 320 processors 16, 18 requires only two crossbar layers.

[0020] As mentioned previously, the computer system 10 may preferably use SSRAM for the memory banks 50 because it offers a component cycle time of 6 nanoseconds. Each memory bank 50 supports from 64 to 256 Mbytes of memory. Each compute processor 18 supports one memory bank 50, and each memory bank 50 is 256 bits wide plus 32 parity bits, for a total width of 288 bits. In addition, sizing the memory bank 50 to match the cache line size allows an entire cache line to be accessed in a single bank access. Read and write memory error correction may be provided by completing parity checks on the address and data packets.
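The memory figures in paragraph [0020] can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the constant and function names are hypothetical, "Mbyte" is assumed to mean 2^20 bytes, and the count of 256 banks is taken from the sixteen-segment configuration described in paragraph [0018].

```python
# Hypothetical names; the figures themselves come from paragraphs [0018] and [0020].
DATA_BITS_PER_BANK = 256
PARITY_BITS_PER_BANK = 32
BANKS_IN_FULL_SYSTEM = 256          # 16 segments x 16 memory banks per segment


def bank_width_bits():
    """Total width of one memory bank, data plus parity."""
    return DATA_BITS_PER_BANK + PARITY_BITS_PER_BANK


def total_memory_gib(mib_per_bank, banks=BANKS_IN_FULL_SYSTEM):
    """System memory in GiB, assuming 1 Mbyte = 2**20 bytes."""
    return mib_per_bank * banks / 1024


width = bank_width_bits()                  # 288 bits
line_bytes = DATA_BITS_PER_BANK // 8       # data moved by one bank access
smallest = total_memory_gib(64)            # every bank populated with 64 Mbytes
largest = total_memory_gib(256)            # every bank populated with 256 Mbytes
print(width, line_bytes, smallest, largest)
```

Under these assumptions a fully configured system holds between 16 and 64 GiB, and one bank access moves 32 bytes of data, which is why matching the bank width to the cache line lets a whole line move in a single access.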

[0021] The parity check on address packets may be the same for read and write functions. The new and old parity bits are compared to determine whether the memory read or write should continue or abort. When a memory "write" occurs, a parity check may be performed on each data packet arriving at memory. Each data packet has an 8-bit parity code appended to it. When a data packet arrives at memory, a new 8-bit parity code is generated for the packet and compared with the old one. The comparison yields one of two kinds of code: a single-bit error ("SBE") or a double-bit or multi-bit error ("DBE"). A single-bit error may be corrected in the data packet before it is stored in memory. In the case of a double-bit or multi-bit error, the data packet is not written to memory. The error is reported to the processor, which retries the data packet reference. When a memory "read" occurs, an 8-bit parity code is generated for each data packet read from memory. This parity code is sent to the processor along with the data. The processor performs single error correction and double error detection ("SECDED") on each data packet.
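The SBE/DBE behavior described in paragraph [0021] can be illustrated with a textbook extended Hamming code. This is a sketch under stated assumptions, not the circuit used in the described system: the specification does not name the code, and the 64-bit data packet size is an assumption (it happens to give exactly 8 check bits, matching the 8-bit parity code). The function names `encode` and `decode` are hypothetical.

```python
def encode(data_bits):
    """Build an extended Hamming codeword; index 0 holds the overall parity bit."""
    m = len(data_bits)
    r = 0
    while (1 << r) < m + r + 1:      # smallest r with 2**r >= m + r + 1
        r += 1
    n = m + r
    code = [0] * (n + 1)
    data = iter(data_bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1):          # not a power of two: data position
            code[pos] = next(data)
    for i in range(r):
        p = 1 << i                   # check bit p covers every position with bit p set
        parity = 0
        for pos in range(1, n + 1):
            if pos & p:
                parity ^= code[pos]
        code[p] = parity
    code[0] = sum(code[1:]) % 2      # overall parity enables double-error detection
    return code


def decode(code):
    """Return (status, data): status is "ok", "SBE" (corrected) or "DBE" (data is None)."""
    n = len(code) - 1
    syndrome = 0
    for pos in range(1, n + 1):
        if code[pos]:
            syndrome ^= pos
    overall_odd = sum(code) % 2 == 1
    fixed = list(code)
    if syndrome == 0 and not overall_odd:
        status = "ok"
    elif overall_odd and syndrome <= n:
        fixed[syndrome] ^= 1         # syndrome 0 means the overall parity bit itself flipped
        status = "SBE"
    else:
        return "DBE", None           # uncorrectable: the caller should retry the reference
    data = [fixed[pos] for pos in range(1, n + 1) if pos & (pos - 1)]
    return status, data


word = [i % 2 for i in range(64)]    # hypothetical 64-bit data packet
codeword = encode(word)
check_bits = len(codeword) - len(word)
```

On the write path described above, an "SBE" result would let the corrected packet be stored, while a "DBE" result would block the write and trigger a retry by the processor.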

[0022] With additional reference to FIG. 3, a simplified view of a computer system 10 with sixteen segments 12 is shown, comprising 64 service processors 16 and 256 compute processors 18 for a total of 320 processors. The service processors 16 and compute processors 18 interface to computer program application software through a scalable single system image ("S31") layer 60, as described in more detail below. The service processors 16 handle all I/O operations, as previously described, as well as the execution of the computer system 10 operating system.
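The processor counts quoted for FIG. 3 follow directly from the per-segment composition given earlier (four service processors and sixteen compute processors per segment, one to sixteen segments). A minimal sketch, with hypothetical class and function names:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentConfig:
    service_processors: int = 4      # service processors 16 per segment
    compute_processors: int = 16     # compute processors 18 per segment


def system_totals(segments, seg=SegmentConfig()):
    """Processor totals for a system built from 1 to 16 identical segments."""
    if not 1 <= segments <= 16:
        raise ValueError("the described system scales from 1 to 16 segments")
    return {
        "service": segments * seg.service_processors,
        "compute": segments * seg.compute_processors,
        "total": segments * (seg.service_processors + seg.compute_processors),
    }


totals = system_totals(16)
print(totals)
```

For sixteen segments this reproduces the 64 service processors, 256 compute processors and 320 processors in total stated in the text.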

[0023] With additional reference to FIG. 4, the S31 layer 60 is shown in more detail in association with the computer program code application 62 software and the service and compute processors 16, 18. The S31 layer 60 lies on top of the application 62 and includes a common API/ABI layer 64 as well as a compute scheduler 66 layer and an operating software 68 layer. The compute scheduler 66 interfaces to the compute processors 18₀ through 18₁₅, while the operating software 68 interfaces to the service processors 16₀ through 16₃. As shown, the operating software 68 supplies interrupt response signals 70 to the compute scheduler to control the operation of the compute processors 18. A number of memory communication buffers 72 receive data from the various compute processors 18₀ through 18₁₅ and provide it to the service processors 16₀ through 16₃.

[0024] As previously described and shown, a preferred implementation of the scalable single system interconnect architecture of the present invention resides on a multi-processor computer system 10 with uniform memory access across a common shared memory comprising a plurality of memory banks 50₀ through 50_N. As noted above, the processor subsystem may be divided into two groups: those with I/O connectivity, namely the service processors 16₀ through 16_N, and those without I/O connectivity, namely the compute processors 18₀ through 18_N.

[0025] S31 uses a software environment made up of the service and compute processors 16, 18. A single copy of the operating system software 68 spans all of the processors 16 in the service partition. A separate compute scheduler 66 resides in each compute processor 18. This software model, together with a strong "single system image", ensures a global resource sharing paradigm. To use this architecture efficiently, highly scalable threads of execution, together with a high degree of software "multi-threading", must be present in both the operating system software 68 and the user application 62 software design models.

[0026] Because of the powerful shared memory hardware architecture of the computer system 10, this choice of operating software 68 functionality eliminates the need for a message passing paradigm for communication between processors or across hardware boundaries. Some simple communication is nevertheless required between the compute processors 18 and the service processors 16. No physical hardware boundaries are visible to the application 62 programs. All of the physical processors 16, 18 present the same application programming interface (API) to the end user.

[0027] To complete the scalability model, a user application 62 can start and terminate multiple threads of execution within the application's user space, which further eliminates operating system software 68 overhead and increases scalability as the number of physical processors grows.

[0028] Under the S31 architecture, a user application 62 makes requests of the system through the normal system software mechanisms. The application 62 is not aware of whether it is executing on a service processor 16 or on a compute processor 18. If the request is executed on a service processor 16, it passes over the normal operating system bus directly into the operating system software 68, where it is processed.

[0029] If the request is executed on a compute processor 18, it is handled by the compute scheduler 66. The requesting thread is placed on the run queue of the service processors 16, and the compute processor 18 issues a request to a service processor 16 to examine the queue. The operating software 68 executing in the service processor 16 examines the request queue and processes the request as if it had originated on the service processor.

[0030] The requesting compute processor 18 either is suspended until the interrupt response arrives or is placed in a general scheduling table maintained by the operating software 68 so that further work can be dispatched to it. The service processor 16 responds to the original request by queuing the application thread to run on a compute processor 18, where the original application context is restored and the application 62 resumes execution.
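The request path described in paragraphs [0028] through [0030] can be sketched as a small simulation. This is an illustrative model only, using Python threads in place of processors: the queue stands in for the service processors' run queue in shared memory, a `threading.Event` stands in for the asynchronous interrupt response, and every name (`Request`, `service_processor`, `compute_scheduler_syscall`, `run_demo`) is hypothetical. It models the "suspend until the interrupt response" option, not the alternative of dispatching other work.

```python
import queue
import threading

request_queue = queue.Queue()        # stands in for the service processors' run queue


class Request:
    def __init__(self, compute_id, call, args):
        self.compute_id = compute_id
        self.call = call
        self.args = args
        self.result = None
        self.done = threading.Event()    # stands in for the interrupt response


def service_processor():
    """Operating software: examine the queue and handle each request as if it were local."""
    while True:
        req = request_queue.get()
        if req is None:                  # shutdown sentinel, used only by this demo
            break
        req.result = req.call(*req.args)
        req.done.set()                   # "interrupt" the requesting compute processor


def compute_scheduler_syscall(compute_id, call, *args):
    """Compute scheduler: forward a system request, then suspend until the response."""
    req = Request(compute_id, call, args)
    request_queue.put(req)
    req.done.wait()
    return req.result                    # application context resumes here


def run_demo(compute_processors=4):
    svc = threading.Thread(target=service_processor)
    svc.start()
    results = {}

    def application(cid):
        # The application issues an ordinary call and never sees which side serves it.
        results[cid] = compute_scheduler_syscall(cid, lambda x: x * 2, cid)

    workers = [threading.Thread(target=application, args=(cid,))
               for cid in range(compute_processors)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    request_queue.put(None)
    svc.join()
    return results


demo_results = run_demo()
```

Because requester and server share memory, only a reference to the request moves through the queue. That is the sense in which the text says no message passing paradigm is needed even though some simple communication remains.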

[0031] It should be noted that, apart from the addition of a very small compute scheduler 66 and some small additional components to the baseline operating software 68, no major changes to the operating software 68 are required. Any of the physical processors 16 in the service partition can execute within the operating system 68 simultaneously. Critical data regions can be locked using the standard locking primitives currently found in the underlying hardware.
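The locking of critical data regions mentioned in paragraph [0031] can be illustrated as follows. This is a sketch under assumptions: Python's `threading.Lock` stands in for whatever hardware locking primitive the platform provides, threads stand in for service processors executing in the operating system at the same time, and the `SchedulingTable` class and its counter are hypothetical.

```python
import threading


class SchedulingTable:
    """Hypothetical shared operating system structure guarded by a single lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.dispatched = 0

    def dispatch(self):
        with self._lock:             # critical region: read-modify-write of shared state
            self.dispatched += 1


def run_service_partition(processors=4, dispatches_each=10_000):
    """Let several 'service processors' update the shared table concurrently."""
    table = SchedulingTable()

    def service_processor():
        for _ in range(dispatches_each):
            table.dispatch()

    threads = [threading.Thread(target=service_processor) for _ in range(processors)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table.dispatched


dispatched_total = run_service_partition()
```

With the lock in place every update is counted, so the total equals the number of processors times the number of dispatches per processor.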

[0032] The basic component of the software environment is the operating system software 68. In a preferred embodiment, the computer system 10 may use an enhanced version of the SunSoft® Solaris® 2.6 operating system, available from Sun Microsystems, Inc., Palo Alto, CA, modified to achieve better performance across the multiple compute and service processors 18, 16 by restricting the operating system to execute only in the service processors 16. As noted above, this technique is further accomplished by using the compute scheduler 66 to communicate operating system requests and scheduling information between the service processors 16 and the compute processors 18.

[0033] In other words, a single copy of the operating system software 68 executes across all of the service processors 16, while a separate compute scheduler 66 resides in each compute processor 18. This software model, together with a strong scalable single system image, provides for global resource sharing. The computer system 10 of the present invention, together with a high degree of software "multi-threading", provides for highly scalable threads of execution in both the operating system software 68 and the user application software 62.

【００３４】[0034] Although the principles of the present invention have been described above in connection with a specific computer architecture, any number of service and/or compute processors may be utilized, and it is to be clearly understood that the foregoing description is given only by way of example and not as a limitation on the scope of the invention. In particular, it is recognized that the teachings of the foregoing disclosure will suggest other modifications to those skilled in the art. Such modifications may involve other features that are known per se and that may be used instead of or in addition to the features described herein. Although claims have been formulated in this application for particular combinations of features, the scope of the disclosure herein also includes any novel feature, or any novel combination of features disclosed either explicitly or implicitly, or any generalization or modification thereof that would be apparent to persons skilled in the art, whether or not it relates to the same invention as claimed in any claim herein and whether or not it mitigates any or all of the same technical problems addressed by the present invention. The applicant hereby reserves the right to formulate new claims to such features and/or combinations of such features during the prosecution of the present application or of any further application derived therefrom.

[Brief description of the drawings]

【図１Ａ】FIG. 1A is a functional block system overview illustrating a computer system according to an embodiment of the present invention, comprising from one to sixteen segments coupled together by a like number of trunk lines, each segment including a number of compute and service processors together with memory and a crossbar switch assembly.

【図１Ｂ】FIG. 1B is a functional block system overview illustrating a computer system according to an embodiment of the present invention, comprising from one to sixteen segments coupled together by a like number of trunk lines, each segment including a number of compute and service processors together with memory and a crossbar switch assembly.

【図２】FIG. 2 is a simplified functional block diagram of the interconnect strategy for the computer system of FIGS. 1A and 1B.

【図３】FIG. 3 is a simplified functional block diagram of the computer system of FIGS. 1A, 1B and 2, illustrating a sixteen-segment system that includes 256 compute processors and 64 service processors and that interfaces to computer program application software via a scalable single system image ("S31") in accordance with the present invention.

【図４】FIG. 4 is a more detailed block diagram of a single-segment computer system corresponding to a portion of the system of FIG. 3, illustrating the interface to computer program application software, via a common API/ABI, to the compute processors through the compute scheduler and to the service processors through the operating software, which provides interrupt responses to the compute scheduler.
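
The processor counts given for FIG. 3 can be related to the sixteen-segment layout with a short calculation. The Python sketch below assumes the processors are spread evenly across segments, which the caption does not state explicitly. The function name `processors_per_segment` and the constants are hypothetical.

```python
def processors_per_segment(total, segments):
    """Split a processor total evenly across segments (assumes an even split)."""
    per_segment, remainder = divmod(total, segments)
    if remainder:
        raise ValueError("processors do not divide evenly across segments")
    return per_segment


SEGMENTS = 16        # sixteen-segment system of FIG. 3
COMPUTE_TOTAL = 256  # compute processors in the whole system
SERVICE_TOTAL = 64   # service processors in the whole system

compute_per_segment = processors_per_segment(COMPUTE_TOTAL, SEGMENTS)
service_per_segment = processors_per_segment(SERVICE_TOTAL, SEGMENTS)
```

Under the even-split assumption, each segment holds 16 compute processors and 4 service processors.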

【手続補正書】[Procedure amendment]

【提出日】平成１２年７月２４日（２０００．７．２４）[Submission date] July 24, 2000 (2000.7.24)

【手続補正１】[Procedure amendment 1]

【補正対象書類名】明細書[Document name to be amended] Specification

【補正対象項目名】特許請求の範囲[Correction target item name] Claims

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【特許請求の範囲】[Claims]

Continuation of front page: (72) Inventor: Bertoni, Jonathan L., 2630 Paragon Drive, Apartment E, Colorado Springs, Colorado 80918, United States. F-terms (reference): 5B045 AA01 BB12 BB28 BB29 DD12 EE02 GG07 GG08 GG12; 5B098 AA10 GA01 GA02 GC16

Claims

1. A multiprocessor computer system including operating software, the system comprising: a first plurality of service processors operative in conjunction with the operating software, the service processors handling all input/output functions in the computer system; and a second plurality of computation processors operative in conjunction with a computation scheduler, wherein the operating software and the computation scheduler provide a communication medium between the service processors and the computation processors.

2. The multiprocessor computer system according to claim 1, wherein said communication medium operates to enable a plurality of input / output requests to said service processor.

3. The multiprocessor computer system according to claim 2, wherein said communication medium is operative to enable an asynchronous response to said input / output request.

4. The multiprocessor computer system of claim 1, wherein said service processors and said computation processors have shared access to a plurality of associated memory banks.

5. The multiprocessor computer system of claim 1, further comprising a common application programming interface operatively coupled to the computation scheduler and the operating software, wherein the computation scheduler and the operating software include a single system image application programming interface.

6. The multiprocessor computer system of claim 5, wherein said single system image application programming interface is scalable across said first plurality of service processors and said second plurality of computation processors.

7. The multiprocessor computer system of claim 1, further comprising a system console coupled to at least one of said first plurality of service processors to enable a user of said computer system to interact therewith.

8. The multiprocessor computer system of claim 7, further comprising a boot device coupled to said at least one of said first plurality of service processors for booting the computer system.

9. The multiprocessor computer system according to claim 1, further comprising at least one computer mass storage device coupled to at least one of said first plurality of service processors.

10. The multiprocessor computer system of claim 1, wherein the first plurality of service processors and the second plurality of computation processors constitute a first computer system segment, the computer system further comprising at least one additional computer system segment comprising a third plurality of service processors operative in conjunction with the operating software and a fourth plurality of computation processors operative in conjunction with the computation scheduler.

11. A multiprocessor computer system including operating software, the system including a first plurality of service processors and a second plurality of computing processors, each of the service processors functioning in conjunction with the operating software to handle all input/output functions in the computer system, and each of the computing processors functioning in conjunction with a computation scheduler, wherein the operating software and the computation scheduler provide a communication medium between the service processors and the computing processors.

12. The multiprocessor computer system according to claim 11, wherein said communication medium operates to enable a plurality of input / output requests to said service processor.

13. The multiprocessor computer system according to claim 12, wherein said communication medium operates to enable asynchronous response to said input / output request.

14. The multiprocessor computer system according to claim 11, wherein said service processors and said computing processors have shared access to a plurality of associated memory banks.

15. The multiprocessor computer system of claim 11, further comprising a common application programming interface operatively coupled to the computation scheduler and the operating software, wherein the application programming interface, the computation scheduler and the operating software include a single system image application programming interface.

16. The multiprocessor computer system of claim 15, wherein said single system image application programming interface is scalable over said N computational segments.

17. The multiprocessor computer system of claim 11, further comprising a system console coupled to at least one of said first plurality of service processors of one of said N computational segments to allow a user of said computer system to interact therewith.

18. The multiprocessor computer system of claim 17, further comprising a boot device coupled to the at least one of the first plurality of service processors in one of the N computational segments to boot the computer system.

19. The multiprocessor computer system of claim 11, further comprising at least one computer mass storage device coupled to at least one of the first plurality of service processors in one of the N computational segments.