WO1993000639A1 - Multiprocessor array - Google Patents

Info

Publication number
WO1993000639A1
WO1993000639A1 PCT/US1992/005079 US9205079W WO9300639A1 WO 1993000639 A1 WO1993000639 A1 WO 1993000639A1 US 9205079 W US9205079 W US 9205079W WO 9300639 A1 WO9300639 A1 WO 9300639A1
Authority
WO
WIPO (PCT)
Prior art keywords
bus
coupled
processor
asynchronously
cache memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US1992/005079
Other languages
French (fr)
Inventor
Kin M. Ho
Dietmar M. Kurpanek
Adam W. K. Li
Jonathan W. Liu
Brian J. Sassone
Tahir Q. Sheikh
Sam Tam
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unisys Corp
Original Assignee
Unisys Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Unisys Corp
Priority to EP92914447A (patent EP0591405A1)
Priority to JP5501543A (patent JPH06508707A)
Publication of WO1993000639A1
Legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/10 Program control for peripheral devices
    • G06F13/12 Program control for peripheral devices using hardware independent of the central processor, e.g. channel or peripheral processor
    • G06F13/122 Program control for peripheral devices using hardware independent of the central processor, e.g. channel or peripheral processor where hardware performs an I/O function other than control of data transfer
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0875 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Bus Control (AREA)

Abstract

A multiprocessor computer system wherein a base processor is coupled in asynchronous O-ring fashion to an associated input/output adapter, with the processor and I/O adapter each including associated private cache memory through which they are both connected with a common shared MP bus via a single connector channel.

Description

MULTI-PROCESSOR ARRAY
Field of Invention:
This invention relates to computer systems, and more particularly to techniques for arranging central processor means thereof including techniques for arranging a single input/output channel for several processors in a single system.
Background, Features:
In data processing systems utilizing a number of processors it is typically advantageous to intercouple these via a single shared common system bus. In addition, data processing systems utilizing peripheral bus means for a number of processors typically use a single common input/output channel and intercouple it with all processors via a single shared common system bus. Yet this is not easy, and presents grave problems of overloading the shared bus and slowing the system. The instant approach addresses this problem and offers some solutions, such as by providing the I/O channel and each processor with its own dedicated cache memory, coupling the main (base) processor to the input/output (I/O) channel in O-Ring fashion, and coupling the main processor and I/O channel to the common system bus via a single channel.
An object hereof is to address at least some of the foregoing problems and to provide at least some of the mentioned, and other, advantages.
Brief Description of the Drawings:
These and other features and advantages of the present invention will be appreciated by workers as they become better understood by reference to the following detailed description of the present preferred embodiments which should be considered in conjunction with the accompanying drawings, wherein like reference symbols denote like elements:
Fig. 1 is a generalized simplified block diagram of a preferred embodiment;
Fig. 2 is a more detailed version of this embodiment, while Figs. 2A, 2B, 2C are enlargements of portions of Fig. 2.
Generalized Embodiment, Fig. 1:
Generally, the present invention comprises a multiprocessor system with processors coupled, by a common bus arrangement, to common memory, as is an input/output adapter (hereinafter "I/O"). A "main memory" is coupled, by the common bus arrangement, to the several central processors (hereinafter "CPUs"), the common bus thus being shared between such CPUs. One of the processors is a base processor coupled to the I/O adapter (by the common bus arrangement). The I/O adapter is the only one for the system, and is coupled, by output bus means, to I/O devices, and by a ring coupling to the base processor. The I/O adapter, and each CPU, has its own dedicated cache memory.
Each cache memory is preferably operated as a "write-back" cache which updates "main memory" only upon a CPU-initiated "read miss" or "write miss" (on a dirty cache line).
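The write-back behavior just described can be illustrated with a minimal software sketch. It is not taken from the patent: the direct-mapped organization, the write-allocate policy, the class name `WriteBackCache` and the `writebacks` counter are all assumptions made for illustration. The point it shows is that main memory is updated only when a miss displaces a dirty line.

```python
class WriteBackCache:
    """Hypothetical direct-mapped write-back cache over a dict-backed main memory."""

    def __init__(self, memory, num_lines=4):
        self.memory = memory        # address -> value; stands in for main memory
        self.num_lines = num_lines
        self.lines = {}             # index -> [address, value, dirty]
        self.writebacks = 0         # count of main-memory updates

    def _fill(self, address):
        index = address % self.num_lines
        line = self.lines.get(index)
        if line is not None and line[0] == address:
            return line             # hit: main memory is not touched
        # Miss: a dirty victim is written back before the line is replaced.
        if line is not None and line[2]:
            self.memory[line[0]] = line[1]
            self.writebacks += 1
        line = [address, self.memory.get(address, 0), False]
        self.lines[index] = line
        return line

    def read(self, address):
        return self._fill(address)[1]

    def write(self, address, value):
        line = self._fill(address)  # assumed write-allocate policy
        line[1] = value
        line[2] = True              # dirty: main memory is now stale


memory = {0: 10, 4: 40}
cache = WriteBackCache(memory, num_lines=4)
cache.write(0, 11)                  # cached only; main memory keeps the old value
stale = memory[0]
value = cache.read(4)               # address 4 shares index 0: read miss on a dirty line
```

In this sketch the write to address 0 leaves main memory unchanged, and the later read of address 4, which maps to the same line, is what triggers the single write-back.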
Fig. 1 particularly represents a generalized, simplified block diagram of a multiple microprocessor computer system along the foregoing lines, being characterized by a base CPU 1 (central processing unit, e.g., preferably using Intel 80486 microprocessor chip), preferably coupled to its own associated private cache memory unit 1-C and, thence, via a connect channel 8, to a single, shared MP (multiprocessor) bus 21. Workers will understand that CPU 1 may preferably comprise a single processor card (printed circuit board, including cache 1-C) and that the entire computer system preferably involves a number of other similar (application) processor units (up to five, here), such as like CPU/cache units 23, 25, all similar to CPU 1/cache 1-C and linked by the shared MP bus 21. Preferably, one or several "main memory" units 9 are also coupled to this MP bus 21. Each CPU may be understood as connected to its cache memory to accommodate direct transmission therebetween (cf. requests, data). CPU 1 is also coupled, in "O-Ring fashion" (see further below), to the I/O adapter (channel) 5 (input-output control unit; the only one in the system, preferably on a single board) which is, in turn, coupled to I/O bus means (preferably to a SCSI bus 30 and an EISA bus 40, as illustrated, there being an associated EISA chip set on I/O card 5, along with an associated private cache memory 5-C). Notably, I/O unit 5 (with its cache 5-C) is coupled to MP bus 21 via the common single channel (connector) 8 that also couples CPU 1 (and its cache 1-C) to MP bus 21. This O-Ring coupling logically ties CPU 1 and I/O 5 together (exclusively) in a ring-like configuration, for bilateral, asynchronous intercommunication. As a salient feature, CPU 1 and I/O-5 are so connected in "O-Ring" fashion ("O-Ring architecture") to "talk bilaterally and asynchronously" over two intercouplings (channels).
Preferably, I/O registers R-0 are provided on the base and application processor units to facilitate communication with, and between, the processors. This O-Ring logically ties CPU 1 to I/O-5, these being shown in more detail in Figs. 2A, 2B (see inside respective dotted lines). Note (Fig. 2A) that the line (connector channel 8) connecting CPU 1 and I/O 5 with MP bus 21 is preferably coupled thereto via a SAD (system address) bus. Preferably, channel 8 comprises a MAD bus and a bidirectional buffer-register (BCT-652-M) A, linking MP bus 21 with SAD bus and thus linking I/O-5 with all processors. Workers will recognize the advantageous use of such O-Ring architecture with such a common access-line to MP bus 21.
Buses MAD, SAD are depicted in Fig. 2A, along with two private buses VD, VA from base processor 1 to I/O board 5.
Some novel, advantageous functions of such an O-Ring coupling are:
1 - It reduces traffic load on the common system bus (MP bus); and so
2 - opens up access for the bus user; and
3 - base processor 1 can "talk" asynchronously with I/O board 5 (not synchronously, as is conventional).
4 - The base processor cache can operate asynchronously with its processor, and also with MP bus, thus reducing access-time to the cache, and improving performance of the base processor; and
5 - MP bus is free to monitor cache-access while the base processor and I/O channel are otherwise occupied (e.g., accessing one another).
SAD bus also links MP bus with a Cache Address bus (CA bus), with a Cache Data bus (CD bus), with a Cache Tag data bus (TD bus) and with a DMA ASIC portion of I/O-5 ("Direct Memory Access" chip to move data between disks and CP-memory or cache memory 1-C?). Register BCT-652-M is a bidirectional buffer/register coupled between SAD bus and MP bus to transfer data to the CPU 1, I/O-5 ring (e.g., from main memory, from other CPUs).
A composite "bus tag" arrangement is provided with two tags for each CPU (board); as part of this, we provide a "cache tag" aa (Fig. 2A) and "bus tag" bb (Fig. 2A); their contents (they are static RAMs) should always be the same. Also provided is a math co-processor dd (e.g., preferably a "4167" chip by Weitek Corp. or an Intel 80387).
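The requirement that the "cache tag" and "bus tag" contents always match can be pictured with a small sketch. This is an illustrative model only: the class `DualTagStore`, its method names and the index/tag split are assumptions, and the rationale suggested here (letting the bus side check tags without disturbing processor-side lookups) is one plausible reading rather than something the text states.

```python
class DualTagStore:
    """Hypothetical pair of identical tag arrays, one per side of the cache."""

    def __init__(self, num_lines=8):
        self.num_lines = num_lines
        self.cache_tag = [None] * num_lines   # consulted on the processor side
        self.bus_tag = [None] * num_lines     # consulted on the bus side

    def _split(self, address):
        # Low-order part selects the line, the remainder is stored as the tag.
        return address % self.num_lines, address // self.num_lines

    def update(self, address):
        index, tag = self._split(address)
        # Both copies are written together so they never diverge.
        self.cache_tag[index] = tag
        self.bus_tag[index] = tag

    def invalidate(self, address):
        index, tag = self._split(address)
        if self.bus_tag[index] == tag:
            self.cache_tag[index] = None
            self.bus_tag[index] = None

    def cpu_hit(self, address):
        index, tag = self._split(address)
        return self.cache_tag[index] == tag

    def bus_hit(self, address):
        index, tag = self._split(address)
        return self.bus_tag[index] == tag

    def consistent(self):
        return self.cache_tag == self.bus_tag


tags = DualTagStore()
tags.update(0x123)
hit_before = tags.cpu_hit(0x123) and tags.bus_hit(0x123)
tags.invalidate(0x123)
hit_after = tags.cpu_hit(0x123) or tags.bus_hit(0x123)
```

Every mutation touches both arrays, so `consistent()` stays true after any sequence of updates and invalidations.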
There is also a Processor Address bus (PA bus; preferably for Intel 80486 microprocessor) and a Processor Data bus (PD bus, or 80486 data bus). All the above five buses are indicated in Figs. 2A, 2B. An address buffer 543 is provided to isolate CA bus and PA bus. And a latch 573 is provided to store the SAD address, while a D-flip flop 574 is provided to "evict" addresses to the SAD bus. A bidirectional buffer 245 provides the path for updating tag data and for "snooping" the CPU (i.e., to invalidate data elsewhere when it is updated in a given memory site). A comparator unit 521 is provided to check for cache "hit"/"miss".
Features:
The subject embodiment has several noteworthy features. For instance, as noted in Fig. 1, the base processor CPU 1 is coupled for bilateral asynchronous communication with the associated I/O unit 5 (cf. asynchronous I/O-CPU interface 2). It is conventional for such an interface to be synchronous (e.g., clock-synchronized; e.g., see US 4,669,043 to Kaplinsky, or US 4,591,977 to Nissen et al. where I/O is directly coupled to a CPU; or see US 4,931,984 to Ny or US 4,161,024 to Joyce et al.).
This "O-Ring" architecture may be contrasted with U.S. Patent 4,351,025 where a control CPU is coupled by a separate bus to a master control I/O unit; or with more conventional arrangements coupling I/O and CPU via a single shared bus, and doing so in a synchronous, tightly-coupled interface. This O-Ring coupling, with its two channels, enhances system performance, for instance by allowing MP bus to access CP-cache 1-C (or I/O cache 5-C) while CPU 1 simultaneously accesses EISA bus, and with no problematic contention. And the asynchronous nature of this CPU-I/O interface allows I/O to operate at any frequency, thus enhancing system modularity and design flexibility. Thus O-Ring couple 2 will be assumed to preferably comprise a pair of bidirectional connection channels between CPU 1 and I/O adapter 5 (e.g., vs. U.S. 4,459,655 using a master bus 17 along with a conversation bus 16 which allows two-way communication between slave modules 12, 13, 14).
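As a loose software analogy for the bilateral, asynchronous exchange described above, the sketch below has two endpoints trade messages with no shared clock. Everything in it is an assumption for illustration: the hardware uses private buses rather than queues, and the endpoint names, message strings and timing periods are invented. Each endpoint sends at its own pace and accepts traffic whenever it arrives.

```python
import asyncio


async def endpoint(name, outbox, inbox, period, to_send, to_receive, log):
    """Send at this endpoint's own pace while independently receiving."""

    async def sender():
        for msg in to_send:
            await asyncio.sleep(period)   # local timing, not shared with the peer
            await outbox.put(msg)

    async def receiver():
        for _ in range(to_receive):
            msg = await inbox.get()       # waits only until the peer delivers
            log.append((name, msg))

    await asyncio.gather(sender(), receiver())


async def run_ring():
    cpu_to_io = asyncio.Queue()           # hypothetical channel, CPU toward I/O
    io_to_cpu = asyncio.Queue()           # hypothetical channel, I/O toward CPU
    log = []
    await asyncio.gather(
        endpoint("CPU", cpu_to_io, io_to_cpu, 0.001,
                 ["read blk 7", "read blk 8"], 1, log),
        endpoint("I/O", io_to_cpu, cpu_to_io, 0.003,
                 ["blk 7 ready"], 2, log),
    )
    return log


log = asyncio.run(run_ring())
```

Neither side waits on the other's timing period, which is the property the analogy is meant to convey; the order of entries in `log` is not fixed.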
Similarly, each CPU in the system is provided with private cache memory means (e.g., 1-C for CPU 1) to which it is asynchronously coupled; likewise our I/O unit 5 is coupled, asynchronously, with an associated cache means 5-C. And, each cache is also coupled asynchronously with MP bus 21. By contrast the cited Joyce patent teaches no such private caching and no such bilateral asynchronous cache interfacing either.
Also, our multiple-processor system preferably uses a base processor (e.g., CPU 1) along with application processor CPUs (each preferably with associated cache memory) to share a single MP bus (21 in Fig. 1).
More conventional systems use a number of CPU buses in such a situation (e.g., as in U.S. 4,459,655 to Willemin which uses two buses, or U.S. 4,351,025; or U.S. 4,161,024; or see U.S. 4,692,862 to Cousin et al. which teaches an interconnect-network for communication between processors).
Also noteworthy is our coupling each CPU with the common shared bus via a single channel, and also via the respective associated private cache memory (e.g., vs. the multiprocessor system in U.S. 4,591,977 to Nissen which teaches other local memory units for each CPU, these unconnected to a common bus and also using a common memory that must be time-shared and is not asynchronous with any CPU or any common bus).
Our single-channel coupling of CPU 1 and I/O 5 to MP bus 21 reduces the capacitive load on MP bus, and so accelerates its operation. The asynchronous coupling of each cache to its CPU, and to MP bus 21 (vs. a synchronous interface), provides great versatility: e.g., the cache can run at CPU frequency when the CPU "owns" it, and at MP bus frequency when MP bus "owns" it, thus eliminating "excess synchronization time" (to access cache from either side), and so increasing throughput of the overall system. And providing a private, dedicated write-back cache for each processor significantly reduces traffic on MP bus, enabling MP bus to support more processors. [Note coupling of I/O-5 and control CPU 1, via respective caches, to common MP bus 21; i.e., CPU 1 to 1-C to channel 8, and I/O-5 to 5-C to channel 8, and via channel 8 to MP bus 21 in Fig. 1.]
Our system is also constructed with a "separate supervisor mode" whereby each processor in the system (e.g., 23, 25) can schedule and spin-off operations by itself, unlike more conventional "master-slave" multiprocessor systems (e.g., in cited Willemin).
Detailed Preferred Embodiment (Figs. 2):
Fig. 2 expands on the described simplified arrangement of Fig. 1 in a multiprocessor system adapted for file server and OLTP (on-line transaction processing) applications. In this preferred detailed embodiment, using an arrangement generally like that in Fig. 1, note the comparable elements: base processor CPU 1' (including processor chip CP-11' and associated private cache memory 1-C) and I/O adapter card 5' (including its own private cache memory). Processor CPU 1' and I/O adapter 5' will be understood as configured in O-Ring fashion, being intercoupled along VA bus and VD bus, via interface control IFC, and both being coupled to MP bus 21' along a single common channel 8' (including MAD bus and buffer/register BCT-652), with I/O adapter 5' coupled to access SCSI bus and EISA bus.
As best seen in Fig. 2A, a replica of Fig. 2, a system address bus (SAD bus) couples channel 8' (register BCT-652 thereof) to a pair of tag-check units H-A, H-B, via a latch 573, as well as to base processor array CP-1' (via register/buffers BCT-652: -C, -B1, -B2, and via evict unit 574-F); and also couples channel 8' to I/O adapter (via control IFC, and DMA-ASIC-DD, and BSAD bus and a tag-snoop unit 245-E).
Registers BCT-652-B1 and -B2 are coupled by a cache address bus (CA bus) to data cache 3' and to MOESI cache 3", as well as to a pair of isolation buffers 543-G', 543-G", and also to a comparator stage 521-h (Fig. 2A).
A cache data bus (CD bus) couples data cache 3' with address buffer 543-G and register BCT-652-C.
A cache tag data bus (TD bus) couples evict unit 574-F with tag cache TC, with comparator 521-h and with buffer 543-G" (Fig. 2B).
Thus, in effect, SAD bus provides linkage between MP bus on the one hand, and the following: BSAD bus, CA bus, CD bus, TD bus, and DMA ASIC DD (140, via tag snoop unit 245-E).
In Fig. 2B, also note that an 80486 data bus (PD bus) links buffer 543-G and tag-snoop unit 245-E' with base processor chip CP-11' and with math coprocessor chip dd. Also, note processor address bus (PA bus, cf. address bus for Intel 80486 microprocessor) which links processors CP-11' and dd with evict unit 574-F' and with buffer 543-G' and tag cache TC.
I/O System (I/O-5', Fig. 2C):
Fig. 2C is an enlargement of the I/O-5' portion of Fig. 2, which will be better understood from the following description of its salient component parts (Intel chip designations):
82358 Bus Controller (5-B):
The 82358 EBC is the central component of the EISA system. The EBC performs the translations between host CPU cycles, AT cycles, and EISA cycles. Masters on any of the three buses communicate with the other buses through the EBC. It takes care of all necessary timing alignments and translations for the different buses to communicate.
The EBC sits between the fast host (CPU) bus and the 8 Mhz EISA/AT buses. It watches cycles initiated on all buses. When a host bus master initiates a cycle and no host slave responds, the EBC forwards the cycle to the EISA and AT buses. All cycles initiated by EISA bus masters are forwarded to the host and AT buses. It also provides the control for the address and data buffers between the buses and takes care of inserting delays between back to back I/O cycles coming from the host bus to the EISA bus. 82357 Integrated System Peripheral (5-A):
The 82357 ISP is a multi-function support peripheral that is designed to work in conjunction with the 82358 EISA bus controller to provide most of the system functions necessary in EISA-specific applications. The 82357 ISP comprises several computer system functions that are typically found in separate LSI and VLSI components. They include: a high-performance 7-channel programmable DMA controller; an arbitration scheme that allows efficient bus sharing among multiple EISA masters and DMA devices; a 15-level programmable interrupt controller which provides level- or edge-triggered interrupt capability on a channel-by-channel basis; non-maskable interrupt logic for multiple NMI control and generation; refresh address generation and control; and 5 counter/timers which provide a system timer interrupt for time of day, diskette time-out, DRAM refresh requests, and other system timing operations.
82352 EISA Bus Buffer (5-C):
The 82352 EBB is used to integrate the data swap logic and the address buffers. This integrates approximately 17 components and lowers the system board chip count. Additionally, the EBB is designed to meet some of the timing requirements of EISA that would be difficult to meet with discrete components and to eliminate excess EMI for FCC testing requirements.
82355 Bus Master Interface Controller (5-E):
For add-in board support, the 82355 EISA Bus Master Interface Controller (BMIC) provides a simple, yet powerful and flexible interface between the local functions on the bus master board and the EISA bus master protocol. With the help of external buffer devices, the BMIC provides all of the control signals, address lines, and data lines necessary for an EISA bus master to interface to the EISA bus.
The 82355 BMIC greatly simplifies the design of 32-bit EISA bus masters. With the BMIC, an expansion board can be implemented with simple logic similar to that used in traditional AT DMA designs; however, the BMIC also allows the designer to take full advantage of the advanced features of EISA bus masters. Features available when using the BMIC are the burst mode for data transfer rates up to 33 megabytes/sec, EISA automatic configuration, and a 32-bit address bus which covers the entire 4-gigabyte EISA address space.
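The two figures quoted for the BMIC (about 33 megabytes/sec in burst mode and a 4-gigabyte address space) can be checked with simple arithmetic. The sketch below assumes, as is not stated in the text, that a burst moves one 32-bit word per bus clock and that the nominal EISA bus clock is 8.33 MHz; the function names are hypothetical.

```python
def burst_rate_mb_per_s(width_bits=32, clock_mhz=8.33):
    """Peak burst rate in megabytes/sec, assuming one transfer per bus clock."""
    return (width_bits / 8) * clock_mhz


def address_space_gib(address_bits=32):
    """Size of the address space in units of 2**30 bytes."""
    return 2 ** address_bits / 2 ** 30


print(f"burst rate: {burst_rate_mb_per_s():.1f} MB/s")
print(f"address space: {address_space_gib():.0f} GB")
```

Under these assumptions the burst rate comes out at roughly 33 megabytes/sec, and 32 address bits give exactly 2^32 bytes, which is the 4-gigabyte space mentioned above.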
In conclusion, it will be understood that the preferred embodiments described herein are only exemplary, and that the invention is capable of many modifications and variations in construction, arrangement and use without departing from the spirit of the claims. For example, the means and methods disclosed herein are also applicable to other related computer systems. Also, the present invention is applicable for enhancing other multiprocessor arrangements. The above examples of possible variations of the present invention are merely illustrative. Accordingly, the present invention is to be considered as including all possible modifications and variations coming within the scope of the invention as defined by the appended claims.

Claims

What is claimed is:
1. A multiprocessor computer system including multiprocessor bus means, base processor means and input/output adapter means for accessing output buses and related equipment, wherein said processor means and said adapter means are inter-coupled in asynchronous O-Ring fashion by a pair of connector means and are both coupled to said multiprocessor bus means via single channel connector means.
2. The invention of claim 1, wherein said base processor means and said input/output adapter means each include an associated private cache memory coupled directly and asynchronously thereto.
3. The invention of claim 2, wherein each said cache memory is also coupled asynchronously and directly to said multiprocessor bus means.
4. The invention of claim 1, wherein said processor means and said adapter means each comprises a separate circuit board.
5. The invention of claim 4, wherein said pair of connections between said processor and adapter means are asynchronously mediated via interface control means.
6. The invention of claim 3, wherein said multiprocessor bus means is so coupled to said processor means and adapter means via intermediate buffer means and associated system address bus means.
7. The invention of claim 6, wherein said multiprocessor bus means is coupled with said system address bus means via bidirectional buffer/register means.
8. The invention of claim 1, wherein said system also includes application processor means and wherein all said processor means are intercoupled by said multiprocessor bus means.
9. The invention of claim 8, wherein said application processor means each includes a private, associated cache memory coupled directly, and asynchronously, thereto.
10. A multi-processor data processing system comprising: system bus means; main memory means coupled to said bus means; base processor means coupled asynchronously to said bus via prescribed single channel means; other processor means coupled asynchronously to said bus;
I/O adapter means coupled asynchronously, directly to said base processor means via a pair of asynchronous, bi-lateral connect channels adapted for asynchronous bilateral intercommunication; and also coupled asynchronously to said bus means via said single channel means.
11. The system of claim 10, wherein said base processor means includes associated private cache memory means coupled directly thereto, and to said system bus, for asynchronous communication, also being coupled to said bus means via said single channel means, shared in common with said I/O adapter means.
12. The system of claim 11, wherein said system also includes system-bus-monitoring means for producing a predetermined output to said main processor means when said system bus is transmitting data corresponding to that in said cache memory means associated with, and coupled directly to, said main processor means.
13. The system of claim 12, wherein said cache memory also includes replacement and update means responsive to said predetermined output for replacing data in a specific address in said cache memory corresponding to said specific address in main memory with the data on said system bus.
14. Multiprocessor apparatus comprising in combination: a plurality of microprocessor processing units adapted to receive and process data signals, said units arranged in parallel with one another and connected to a common system bus in asynchronous fashion; a plurality of local private cache memory means, one being directly, asynchronously connected to each said processing unit and being adapted for storing data, instruction and control signals; common memory means coupled directly to said bus means for providing data, instruction and control signals to each processing unit; input/output control means coupled to a main, controlling one of said processing units for communicating directly, asynchronously therewith via a pair of channels, while being also coupled asynchronously with said bus means in common with said main processing unit via a single shared channel means.
15. A method of arranging the input/output adapter means in a computer system, this system also including base processor means, and associated multiprocessor bus means for communication with, and between, memory and other processor means, this method including: intercoupling said base processor means with said input/output adapter means in bilateral asynchronous O-Ring fashion with at least two interconnect means; and interconnecting this O-Ring array with all other processor means and shared memory means using said multiprocessor bus means.
16. The method of claim 15, wherein said O-Ring array is coupled to said multiprocessor bus means via single channel connect means.
17. The method of claim 16, wherein private cache memory means is directly coupled asynchronously to each processor means, and to said adapter means, and to said multiprocessor bus means as well.
18. The method of claim 15, where said base processor means and said adapter means each comprises a separate circuit board.
19. The method of claim 18, wherein said two interconnect means between said base processor and adapter means are asynchronously mediated via interface control means.
20. The method of claim 17, wherein said multiprocessor bus means is so coupled to said base processor means and adapter means via intermediate buffer means and via associated system address bus means.
21. The method of claim 20, wherein said multiprocessor bus means is coupled with said system address bus means via bidirectional buffer/register means.
22. The method of claim 15, wherein said other processor means comprise one or more independent application processor means and wherein all said processor means are intercoupled by said multiprocessor bus means.
23. The method of claim 22, wherein said application processor means also each include private, associated cache memory means coupled directly, and asynchronously, thereto.
24. A method of arranging a data processing system which includes input/output channel means, system bus means, main memory means coupled to said bus means, base processor means coupled asynchronously to said bus means via prescribed single channel means, and other processor means coupled asynchronously to said bus means; said method comprising: coupling said input/output channel means directly to said base processor means via a pair of connect channels adapted for asynchronous bilateral intercommunication and thus creating an O-Ring array; and coupling this O-Ring array asynchronously to said system bus means via said single channel means.
25. The system of claim 24, wherein said input/output channel means and base processor means each include associated private cache memory means coupled directly thereto, and to said single channel means for asynchronous communication, said cache memory means being coupled to said bus means via said single channel means.
26. The system of claim 25, wherein said system also includes system-bus-monitoring means for producing a predetermined output to said base processor means when said system bus means is transmitting data corresponding to that in said cache memory means associated with, and coupled directly to, said base processor means.
27. The system of claim 26, wherein said cache memory also includes replacement and update means responsive to said predetermined output for replacing data in a specific address in said cache memory corresponding to said specific address in main memory with the data on said system bus means.
28. A data processing arrangement comprising, in combination: input/output channel means; a plurality of independent data processing units adapted to receive and process data signals, said units and said input/output channel means being inter-connected via a common system bus in asynchronous fashion; a plurality of local private cache memory means, one being directly, asynchronously connected to said I/O channel means and each said processing unit, and being adapted for storing data, instruction and control signals; common memory means coupled directly to said bus for providing data, instruction and control signals to each processing unit; said input/output channel means being coupled to a base one of said processing units for communicating directly, asynchronously therewith via a pair of channels to form an O-Ring array, this array being also coupled asynchronously with said bus via a single shared bus-channel means.
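By way of illustration only, the bus-monitoring and replacement behavior recited in claims 12-13 and 26-27 can be sketched as a private cache that watches system-bus transfers and, on an address match, replaces its stored data with the data on the bus. This is a minimal software model under stated assumptions, not the claimed hardware: the class `SnoopingCache`, its dictionary storage, and the boolean return value standing in for the "predetermined output" are all hypothetical.

```python
class SnoopingCache:
    """Toy model of a private cache that monitors system-bus traffic."""

    def __init__(self):
        # address -> data currently held in the private cache (assumed layout)
        self.lines = {}

    def load(self, address, data):
        """Place data for an address into the cache."""
        self.lines[address] = data

    def snoop(self, bus_address, bus_data):
        """Observe one bus transfer; return True when the address is cached.

        The True result stands in for the 'predetermined output'; on a
        match the cached data is replaced with the data seen on the bus.
        """
        hit = bus_address in self.lines
        if hit:
            self.lines[bus_address] = bus_data
        return hit


cache = SnoopingCache()
cache.load(0x100, "old")
print(cache.snoop(0x100, "new"), cache.lines[0x100])
print(cache.snoop(0x200, "other"), 0x200 in cache.lines)
```

A transfer to an address the cache does not hold leaves the cache unchanged, so only data already cached is ever updated from the bus.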
PCT/US1992/005079 1991-06-20 1992-06-19 Multiprocessor array Ceased WO1993000639A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP92914447A EP0591405A1 (en) 1991-06-20 1992-06-19 Multiprocessor array
JP5501543A JPH06508707A (en) 1991-06-20 1992-06-19 multiprocessor array

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US71889991A 1991-06-20 1991-06-20
US71890091A 1991-06-20 1991-06-20
US718,899 1991-06-20
US718,900 1991-06-20

Publications (1)

Publication Number Publication Date
WO1993000639A1 (en) 1993-01-07

Family

ID=27109996

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1992/005079 Ceased WO1993000639A1 (en) 1991-06-20 1992-06-19 Multiprocessor array

Country Status (3)

Country Link
EP (1) EP0591405A1 (en)
JP (1) JPH06508707A (en)
WO (1) WO1993000639A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0820023A1 (en) * 1996-07-18 1998-01-21 International Business Machines Corporation Use of processor bus for the transmission of i/o traffic

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0032863A1 (en) * 1980-01-22 1981-07-29 COMPAGNIE INTERNATIONALE POUR L'INFORMATIQUE CII - HONEYWELL BULL (dite CII-HB) Method and device to control the conflicts posed by multiple accesses to a same cache-memory of a digital data processing system comprising at least two processors each possessing a cache
EP0166341A2 (en) * 1984-06-27 1986-01-02 International Business Machines Corporation Multiprocessor system with fast path means for storage accesses
WO1990005953A1 (en) * 1988-11-14 1990-05-31 Unisys Corporation Hardware implemented cache coherency protocole with duplicated distributed directories for high-performance multiprocessors

Also Published As

Publication number Publication date
JPH06508707A (en) 1994-09-29
EP0591405A1 (en) 1994-04-13

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IT LU MC NL SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 1992914447

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1992914447

Country of ref document: EP

WWR Wipo information: refused in national office

Ref document number: 1992914447

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1992914447

Country of ref document: EP