WO2025087005A1 - Interconnect system, device and network - Google Patents

Info

Publication number
WO2025087005A1
Authority
WO
WIPO (PCT)
Prior art keywords
pcie
interconnection
interface
host
pcie interface
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/CN2024/122115
Other languages
French (fr)
Chinese (zh)
Inventor
王江为
张静东
郝锐
王彦伟
肖麟阁
杨乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ieit Systems Beijing Co Ltd
Original Assignee
Ieit Systems Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ieit Systems Beijing Co Ltd
Publication of WO2025087005A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/40 Bus structure
    • G06F13/4063 Device-to-bus coupling
    • G06F13/4068 Electrical coupling
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/20 Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/28 Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/42 Bus transfer protocol, e.g. handshake; Synchronisation
    • G06F13/4282 Bus transfer protocol, e.g. handshake; Synchronisation on a serial bus, e.g. I2C bus, SPI bus

Definitions

  • The embodiments of the present application relate to the field of computer technology, and more specifically, to an interconnection system, device, and network.
  • the purpose of this application is to provide an interconnection system, device and network to achieve direct connection between different host servers using a hardware device.
  • the solution is as follows:
  • the present application provides an interconnection system, comprising: an interconnection device and multiple hosts;
  • the interconnection device includes: a network interface, a logic interconnection module and multiple PCIE (Peripheral Component Interconnect Express, a high-speed serial computer expansion bus standard) interfaces;
  • the network interface is configured to connect to the master control device;
  • any PCIE interface is configured in EP (End Point) mode and connected to a host to form a host cluster with a mesh topology or a host cluster with a crossbar topology;
  • the logic interconnection module is configured to: forward data sent by a host connected to any PCIE interface to hosts connected to other PCIE interfaces; forward data sent by a master control device to a host connected to any PCIE interface; and forward data sent by a host connected to any PCIE interface to the master control device.
  • the interconnection device further includes: a configuration device;
  • the configuration device is configured to: configure the host connected to the current PCIE interface and the slave device core of the current PCIE interface according to the configuration message of any PCIE interface.
  • the configuration device is configured to: obtain configuration data of the slave device core based on the configuration message, and configure the configuration data in the slave device core.
  • the network interface is built with a first register configured to record a network protocol supported by the master control device;
  • the logic interconnection module is also configured to: encapsulate data sent by a host connected to any PCIE interface using a network protocol supported by the main control device, and forward the encapsulated data to the main control device.
  • the network interface is further configured to: update the network protocol recorded in the first register according to the protocol update information sent by the main control device.
  • any PCIE interface is configured in RP mode and connected to an accelerator card
  • the logical interconnection module is configured to: forward the acceleration card data sent by the PCIE interface connected to the acceleration card to the host connected to other PCIE interfaces; forward the data sent by the main control device to the acceleration card connected to any PCIE interface; and forward the acceleration card data sent by the PCIE interface connected to the acceleration card to the main control device.
  • the interconnection device further includes: an enabling device;
  • the enabling device is configured to: disable the other PCIE interfaces when enabling the network interface and any one PCIE interface.
  • the enabling device is further configured to: disable the network interface and other PCIE interfaces when enabling any two PCIE interfaces.
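  • The two enabling rules above can be pictured as a small state model. The following is a minimal sketch only: the class name `EnablingDevice`, its methods and the port numbering are hypothetical and do not come from the application.

```python
class EnablingDevice:
    """Toy model of the enable/disable rules (all names are hypothetical)."""

    def __init__(self, num_pcie: int):
        self.num_pcie = num_pcie
        self.network_enabled = False
        self.pcie_enabled = [False] * num_pcie

    def enable_network_with(self, port: int) -> None:
        # Network interface plus one PCIE interface: every other PCIE
        # interface is disabled.
        if not 0 <= port < self.num_pcie:
            raise ValueError("unknown PCIE interface")
        self.network_enabled = True
        self.pcie_enabled = [i == port for i in range(self.num_pcie)]

    def enable_pair(self, a: int, b: int) -> None:
        # Any two PCIE interfaces: the network interface and the remaining
        # PCIE interfaces are disabled.
        if a == b or not all(0 <= p < self.num_pcie for p in (a, b)):
            raise ValueError("need two distinct, existing PCIE interfaces")
        self.network_enabled = False
        self.pcie_enabled = [i in (a, b) for i in range(self.num_pcie)]
```

  • In this model, calling `enable_network_with(2)` on a four-port device leaves only the network interface and port 2 active, while `enable_pair(0, 3)` leaves only ports 0 and 3 active.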
  • the logical interconnection module is further configured to: encapsulate the network protocol message sent by the master control device into a PCIE message, and forward the PCIE message to any PCIE interface.
  • the logical interconnect module is further configured to discard data with missing forwarding addresses.
  • the interconnection device further includes: a DMA (Direct Memory Access) management device and a DMA controller;
  • the DMA management device is configured to: query the free-memory information of the host connected to any PCIE interface; and write the data to be transmitted into the memory of that host in a DMA manner by using the DMA controller and the free-memory information.
  • the DMA management device is further configured to: broadcast the memory information of the interconnection device to the host connected to any PCIE interface.
  • the DMA management device is further configured to: store host memory information broadcast by the host connected to any PCIE interface.
  • the interconnection device is an FPGA (Field Programmable Gate Array) accelerator card, an ASIC (Application Specific Integrated Circuit) accelerator card or a multi-core processor.
  • the present application provides an interconnection device, comprising: a network interface, a logic interconnection module and a plurality of PCIE interfaces;
  • the network interface is set to connect to the master device
  • Any PCIE interface is configured in EP mode and connected to a host to form a host cluster of mesh topology or a host cluster of crossbar topology;
  • the logic interconnection module is configured to: forward data sent by a host connected to any PCIE interface to hosts connected to other PCIE interfaces;
  • the interconnection device further includes: a configuration device;
  • the configuration device is configured to: configure the host connected to the current PCIE interface and the slave device core of the current PCIE interface according to the configuration message of any PCIE interface.
  • the network interface is built with a first register configured to record a network protocol supported by the master control device;
  • the logic interconnection module is also configured to: encapsulate data sent by a host connected to any PCIE interface using a network protocol supported by the main control device, and forward the encapsulated data to the main control device.
  • any PCIE interface is configured as RP (Root Port) mode and connected to an accelerator card;
  • the logical interconnection module is configured to: forward the acceleration card data sent by the PCIE interface connected to the acceleration card to the host connected to other PCIE interfaces; forward the data sent by the main control device to the acceleration card connected to any PCIE interface; and forward the acceleration card data sent by the PCIE interface connected to the acceleration card to the main control device.
  • the present application provides an interconnection network, comprising: a plurality of any of the aforementioned interconnection devices.
  • network interfaces of different interconnection devices are connected to the same main control device.
  • the present application provides an interconnection system, including: an interconnection device and multiple hosts; the interconnection device includes: a network interface, a logical interconnection module and multiple PCIE interfaces; the network interface is configured to connect to a master control device; any PCIE interface is configured to EP mode and connected to a host to form a host cluster of a mesh topology or a host cluster of a crossbar topology; the logical interconnection module is configured to: forward data sent by a host connected to any PCIE interface to hosts connected to other PCIE interfaces; forward data sent by the master control device to a host connected to any PCIE interface; forward data sent by a host connected to any PCIE interface to the master control device.
  • the interconnected device can directly connect to multiple hosts through its own PCIE interface to form a host cluster of mesh topology or a host cluster of crossbar topology. Since the PCIE interface of the interconnected device is configured as EP mode, the interconnected device can be used to realize the following without increasing the hardware cost: the data path from any host connected to the interconnected device to the network interface of the interconnected device, the data path from the network interface of the interconnected device to the host connected to any PCIE interface of the interconnected device, and the data path between hosts connected to different PCIE interfaces of the interconnected device, without using a network card and a network to connect different hosts.
  • FIG. 1 is a schematic diagram of an interconnection system disclosed in the present application.
  • FIG. 2 is a schematic diagram of an interconnection network disclosed in the present application.
  • FIG. 3 is a schematic diagram of another interconnection network disclosed in the present application.
  • FIG. 4 is a schematic diagram of a Switch function of an interconnection device implemented based on FPGA disclosed in the present application.
  • FIG. 5 is a structural diagram of an interconnection device 1 provided by the present application.
  • FIG. 6 is a structural diagram of an interconnection device 2 provided by the present application.
  • FIG. 7 is a schematic diagram of another interconnection device provided in the present application.
  • the interconnection between different host servers is mostly realized through network cards and network cables, and different host servers cannot be directly connected through a hardware network card.
  • the present application provides an interconnection solution that can directly connect multiple hosts using an interconnection device, so that multiple hosts form a host cluster of mesh topology or a host cluster of crossbar topology, and also use this interconnection device to realize: a data path from any host connected to the interconnection device to the network interface of the interconnection device, a data path from the network interface of the interconnection device to any host connected to the interconnection device, and a data path between hosts connected to different PCIE interfaces of the interconnection device, without using network cards and networks to connect different hosts.
  • an interconnection system disclosed in an embodiment of the present application includes: an interconnection device and multiple hosts; the interconnection device includes: a network interface, a logical interconnection module and multiple PCIE interfaces; the network interface is configured to connect to a master control device; any PCIE (Peripheral Component Interconnect Express, a high-speed serial computer expansion bus standard) interface is configured to EP mode and connected to a host to form a host cluster of a mesh topology or a host cluster of a crossbar topology. It can be seen that any device having a network interface, a logical interconnection module, and a PCIE interface configured to EP mode can be used as an interconnection device provided in this embodiment.
  • the logical interconnection module can be a hardware module with a processor function.
  • the logical interconnection module is configured to: forward data sent by a host connected to any PCIE interface to hosts connected to other PCIE interfaces; forward data sent by a master control device to a host connected to any PCIE interface; and forward data sent by a host connected to any PCIE interface to the master control device.
  • This embodiment enables data sent by any host to the interconnected device through the PCIE interface to reach the master device connected to the network interface of the interconnected device.
  • This embodiment enables data sent by any host to the interconnected device through the PCIE interface to reach hosts connected to other PCIE interfaces, thereby realizing data paths between hosts connected to different PCIE interfaces of the interconnected device.
  • This embodiment enables data sent by the master device to the interconnected device to reach hosts connected to any PCIE interface of the interconnected device, thereby realizing data paths from the network interface of the interconnected device to hosts connected to any PCIE interface of the interconnected device.
  • the main control device may be a remote host or a remote acceleration device.
  • the interconnection device further comprises: a configuration device; correspondingly, the configuration device is configured to: configure the host connected to the current PCIE interface and the slave device core of the current PCIE interface according to the configuration message of any PCIE interface, and configure the device number and bus address of the host according to the configuration message.
  • the configuration device is configured to: obtain configuration data of the slave device core based on the configuration message, and configure the configuration data in the slave device core.
  • the interconnected device also includes: a bus register and a memory window register; accordingly, the configuration device is configured to: configure the bus register and the memory window register according to the configuration message, so as to configure the root device core of the PCIE interface; or convert the configuration message into a data format that matches the EP mode, and configure the bus address obtained by converting the host's memory address in the slave device core according to the converted configuration message.
  • the data format that matches the EP mode is the type0 format, and the data format that matches the RP mode is the type1 format.
  • An interconnected device has multiple PCIE interfaces, and the configuration of any PCIE interface can be achieved with the help of a configuration device.
  • the interconnected device includes N PCIE interfaces, thereby achieving: a data path from any PCIE interface of the interconnected device to a network interface, a data path from the network interface to any PCIE interface, and a data path between different PCIE interfaces without the help of an expansion controller. Therefore, the interconnected device can support mesh topology and crossbar topology, and can enable a master control device and multiple hosts to construct mesh topology diagrams and crossbar topology diagrams.
  • configure the command registers, such as the memory space enable register, the IO space register and the bus master register; configure the write maximum load length register, the maximum read request register and the strong order mode register.
  • configure the bus registers, such as the root bus register, the sub-bus register and the maximum bus register; configure the memory window register to set the prefetchable memory range of the interface, the window base address of the non-prefetchable memory, the window size and other information; configure the host device identification information (BDF value).
  • the PCIE interface can implement the Switch function based on the built-in root device core.
  • the RP's capability configuration space register is in the root interface (Root Port), while the Switch's capability configuration register is in the Switch.
  • the BDF (Bus Device Function) value refers to: bus number, device number, and function number.
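  • As an illustration of the BDF value, the sketch below packs the three numbers into the conventional 16-bit PCI layout (8-bit bus, 5-bit device, 3-bit function). The application itself does not state the bit widths, so the layout and the helper names `pack_bdf`, `unpack_bdf` and `format_bdf` are assumptions made for this example.

```python
def pack_bdf(bus: int, device: int, function: int) -> int:
    """Pack bus/device/function into a 16-bit identifier (assumed 8/5/3 layout)."""
    if not (0 <= bus <= 0xFF and 0 <= device <= 0x1F and 0 <= function <= 0x7):
        raise ValueError("bus, device or function number out of range")
    return (bus << 8) | (device << 3) | function


def unpack_bdf(bdf: int) -> tuple:
    """Split a 16-bit identifier back into (bus, device, function)."""
    return (bdf >> 8) & 0xFF, (bdf >> 3) & 0x1F, bdf & 0x7


def format_bdf(bdf: int) -> str:
    """Render the identifier in the common bus:device.function notation."""
    bus, device, function = unpack_bdf(bdf)
    return f"{bus:02x}:{device:02x}.{function:x}"
```

  • For example, bus 3, device 0, function 1 packs to `0x0301` and is rendered as `03:00.1`.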
  • When configuring the built-in slave core of the PCIE interface, first convert the TLP (Transaction Layer Packet, a transaction layer data packet for data transmission) configuration message in type0 format to type1 format, and then configure the BDF register to configure the identification information for the host device of the current interface; then configure the BAR (Base Address Register) address (converted from the memory address of the host device) so that the current interface can read the memory address of the host device; then configure the command registers such as the enable memory space register, IO space register, and bus master register; configure the write maximum load length register, the maximum read request register, and the strong order mode register.
  • the PCIE interface can implement the Switch function based on the built-in slave core.
  • the network interface is built with a first register configured to record the network protocol supported by the master device; accordingly, the logic interconnection module is also configured to: encapsulate data sent by any host connected to the PCIE interface using the network protocol supported by the master device, and forward the encapsulated data to the master device.
  • the user can use the master device to change the transmission protocol supported by the master device, for example TCP (Transmission Control Protocol, a transport layer communication protocol).
  • any PCIE interface is configured as RP mode and connected to an accelerator card; the logical interconnection module is configured to: forward the accelerator card data sent by the PCIE interface connected to the accelerator card to the host connected to other PCIE interfaces; forward the data sent by the main control device to the accelerator card connected to any one PCIE interface; forward the accelerator card data sent by the PCIE interface connected to the accelerator card to the main control device.
  • This forwarding process also depends on the destination address of the forwarded data.
  • the logical interconnection module can determine from this destination address whether the data is to be forwarded to the network interface or to a PCIE interface and, in the latter case, to which one.
  • the logical interconnection module is further configured to: encapsulate the network protocol message sent by the master device into a PCIE message, and forward the PCIE message to any PCIE interface, so that the data sent by the master device connected to the network interface to the interconnection device through the network interface can reach the host connected to at least one PCIE interface.
  • the logical interconnect module is further configured to discard data with missing forwarding addresses. The forwarded data carries a forwarding address, that is, the destination address of the data, and the logical interconnect module uses this destination address to determine whether the data is forwarded to the network interface or to a PCIE interface.
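  • The forwarding rule described here can be sketched as a single decision function. This is a simplified model, not the application's implementation: the `Packet` type, the `NETWORK` marker and the choice to also drop unknown destinations and loop-backs to the source interface are assumptions made for the example.

```python
from typing import NamedTuple, Optional, Set, Union

NETWORK = "net"  # hypothetical identifier for the network interface

Port = Union[int, str]  # PCIE interfaces are numbered; the network is NETWORK


class Packet(NamedTuple):
    source: Port
    dest: Optional[Port]  # None models a missing forwarding address
    payload: bytes


def forward(packet: Packet, pcie_ports: Set[int]) -> Optional[Port]:
    """Return the egress interface, or None when the data is discarded."""
    if packet.dest is None:
        return None  # missing forwarding address: discard
    if packet.dest == NETWORK:
        return NETWORK  # host -> master control device
    if packet.dest in pcie_ports and packet.dest != packet.source:
        return packet.dest  # host -> other host, or master -> host
    return None  # assumption: unknown destination or loop-back is dropped
```

  • With PCIE interfaces `{0, 1}`, data from host 0 addressed to 1 leaves on interface 1, data addressed to `NETWORK` leaves on the network interface, and data without a destination is dropped.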
  • the interconnected device also includes: a DMA (Direct Memory Access) management device and a DMA controller; accordingly, the DMA management device is configured to: query the free-memory information of the host connected to any PCIE interface; and use the DMA controller and the free-memory information to write the data to be transmitted into the memory of that host in DMA mode, thereby realizing direct DMA write operations between the interconnected device and the host.
  • the DMA management device is further configured to: broadcast the memory information of the interconnected device to the host connected to any PCIE interface, and store the host memory information broadcast by the host connected to any PCIE interface.
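  • A minimal software model of the query-then-write sequence is sketched below. Host memory is simulated with a `bytearray`, and the classes `Host`, `DmaController` and `DmaManager`, together with the simple bump-pointer notion of "free memory", are hypothetical simplifications rather than the hardware design of the application.

```python
class Host:
    """Simulated host whose free memory starts at free_offset."""

    def __init__(self, size: int):
        self.memory = bytearray(size)
        self.free_offset = 0

    def free_info(self) -> tuple:
        # (start of the free region, number of free bytes)
        return self.free_offset, len(self.memory) - self.free_offset


class DmaController:
    """Stands in for the hardware engine that copies bytes into host memory."""

    def write(self, host: Host, offset: int, data: bytes) -> None:
        host.memory[offset:offset + len(data)] = data


class DmaManager:
    def __init__(self, controller: DmaController, hosts: dict):
        self.controller = controller
        self.hosts = hosts  # PCIE interface index -> Host

    def transfer(self, port: int, data: bytes) -> int:
        host = self.hosts[port]
        offset, free = host.free_info()  # step 1: query free memory
        if len(data) > free:
            raise MemoryError("not enough free host memory")
        self.controller.write(host, offset, data)  # step 2: DMA write
        host.free_offset = offset + len(data)  # keep the model consistent
        return offset
```

  • Each call to `transfer` returns the offset at which the data was placed, so two consecutive writes of 3 and 2 bytes land at offsets 0 and 3.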
  • the interconnection device is an FPGA (Field Programmable Gate Array) accelerator card, an ASIC (Application Specific Integrated Circuit) accelerator card, or a multi-core processor.
  • the interconnection device can directly connect multiple hosts through its own PCIE interface to form a host cluster of mesh topology or a host cluster of crossbar topology. Since the PCIE interface of the interconnection device is configured as EP mode, the interconnection device can be used to realize the following without increasing the hardware cost: the data path from any host connected to the interconnection device to the network interface of the interconnection device, the data path from the network interface of the interconnection device to the host connected to any PCIE interface of the interconnection device, and the data path between hosts connected to different PCIE interfaces of the interconnection device, without the need to use network cards and networks to connect different hosts. Of course, the interconnection device also supports other topologies.
  • An embodiment of the present application provides an interconnection network, including: N interconnection devices provided in the aforementioned embodiments, and different interconnection devices are connected to the same main control device.
  • each interconnection device has multiple PCIE interfaces, and each PCIE interface is connected to a host.
  • the interconnection device can be a GPU (Graphics Processing Unit), an FPGA, an NPU (Neural Processing Unit) or a TPU (Tensor Processing Unit).
  • a GPU usually performs pure computing acceleration, whereas an FPGA can handle application acceleration such as computing acceleration, network acceleration and storage acceleration.
  • interconnected device 1, interconnected device 2 and interconnected device 3 are all connected to the same master device.
  • the three interconnected devices periodically broadcast their free memory information to other interconnected devices through the PCIE interface.
  • the three interconnected devices can all be accelerator cards, and can realize the interconnection between different hosts without the help of network cards and networks, forming a host cluster with mesh topology or a host cluster with crossbar topology to achieve higher performance computing.
  • the PCIE interface of the interconnected device is an RP mode interface or an EP mode interface, and the bus address range of the PCIE interface is not less than the bus address range of the device connected to it.
  • When configured in EP mode, the PCIE interface can be connected to a host server to integrate multiple host servers; when configured in RP mode, it can be connected to an acceleration card to form an AI (Artificial Intelligence) acceleration card system.
  • the PCIE interface of the interconnected device has RP (Root Port) mode and EP (End Point) mode.
  • the current RP mode and the current EP mode only support one-to-one communication between the interconnected device and the device connected to its PCIE interface, and do not support communication between the main control device connected to the network interface of the interconnected device and the device connected to the PCIE interface of the interconnected device, nor do they support communication between devices connected to different PCIE interfaces of the interconnected device.
  • When the PCIE interface of this embodiment is configured as an RP mode interface or an EP mode interface, communication between the device connected to the network interface of the interconnected device and the device connected to the PCIE interface of the interconnected device, as well as communication between devices connected to different PCIE interfaces of the interconnected device, is realized accordingly.
  • the network interface of the interconnected device provided in the embodiment of the present application can be directly connected to the server. When the interconnected device is an acceleration card, its direct connection to the server can form a "host directly connected to the acceleration card" system with the server.
  • data follows the PCIE protocol.
  • the following data paths can be implemented in this interconnected device: bottom-up data paths, top-down data paths, and data paths between host devices connected to different PCIE interfaces of the interconnected device.
  • the type of transmission protocol used by the main control device can be flexibly determined according to the scene requirements. It can be seen that the interconnected device supports full interconnection of mesh topology, crossbar topology, etc., that is, any interface of the interconnected device can communicate with other interfaces of the interconnected device.
  • the following takes the interconnected devices implemented by the FPGA accelerator card as an example to introduce the communication process between the main control device connected to the FPGA's network interface and the device connected to the FPGA's PCIE interface, and the communication process between devices connected to different PCIE interfaces of the FPGA when the FPGA's PCIE interface is in RP mode.
  • this embodiment enables the PCIE interface of the FPGA to implement the PCIE switch function based on the PCIE RP hard core inside the FPGA, and can also improve the utilization rate of FPGA logic resources.
  • When the PCIE interface is configured in RP mode, only the configuration space content needs to be changed to enable the PCIE interface to implement the PCIE switch function.
  • for the network module and each of the two downstream PCIE interfaces, the logic processing module provides a parsing module, a multiplexing module and an arbitration module.
  • the FPGA in Figure 4 has two downstream PCIE interfaces, and each downstream PCIE interface has a PCIE RP hard core.
  • the FPGA can have more or fewer downstream PCIE interfaces, and one downstream PCIE interface is connected to a host or an accelerator card device.
  • the PCIE RP hard core provides three groups of user interfaces, namely the user receive bus, the user send bus and the configuration space configuration bus.
  • the user receive bus and the send bus are user interfaces for transmitting TLP messages.
  • the configuration space configuration bus can be used to complete the configuration of the PCIE RP configuration space, and read the default register value of the configuration space and output it to the parsing module of the logic processing module to assist in parsing and processing.
  • the device configuration module completes the initialization configuration of the configuration space of the two processors through the initiated configuration TLP message, and the BAR base addresses of the two processors are configured as BAR1 and BAR2 respectively.
  • the interface configuration module completes the initialization configuration of the configuration space of the two PCIE interfaces of the FPGA through the cfg_mgnt bus (configuration management, control management bus), and the BAR base addresses of the two PCIE interfaces are configured as BAR3 and BAR4 respectively, wherein the address range of BAR3 is greater than or equal to the address range of BAR1, and the address range of BAR4 is greater than or equal to the address range of BAR2.
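  • The constraint between the interface windows (BAR3, BAR4) and the processor windows (BAR1, BAR2) can be checked mechanically. The sketch below reads "address range is greater than or equal to" as "the interface window fully contains the device window", which is one possible interpretation; the `BarWindow` type and all base addresses and sizes are made-up example values.

```python
from typing import NamedTuple


class BarWindow(NamedTuple):
    base: int  # first address of the window
    size: int  # window length in bytes


def covers(interface_bar: BarWindow, device_bar: BarWindow) -> bool:
    """True when the interface window contains the whole device window."""
    interface_end = interface_bar.base + interface_bar.size
    device_end = device_bar.base + device_bar.size
    return interface_bar.base <= device_bar.base and interface_end >= device_end


# Hypothetical values chosen only for illustration.
BAR1 = BarWindow(base=0x1000_0000, size=0x0100_0000)  # AI processor-1
BAR2 = BarWindow(base=0x2000_0000, size=0x0100_0000)  # AI processor-2
BAR3 = BarWindow(base=0x1000_0000, size=0x0200_0000)  # interface 1: large enough
BAR4 = BarWindow(base=0x2000_0000, size=0x0080_0000)  # interface 2: too small
```

  • Here `covers(BAR3, BAR1)` holds, while `covers(BAR4, BAR2)` does not, so the second interface would have to be configured with a larger window.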
  • the parsing module parses the TLP message output by the PCIE RP's transmit bus, outputs an arbitration request based on the routing information, and handles abnormal TLP messages; the arbitration module receives the arbitration request and outputs an arbitration signal to select the corresponding transmission path for the message, and the multiplexing module receives the data output by each path parsing module, and selects the appropriate path according to the arbitration signal and sends it to the PCIE RP's receive bus.
  • the network module is configured to process operations related to the network interface.
  • the network module is connected to the network interface of the FPGA.
  • the network module implements multiple network transmission protocols, such as RDMA (Remote Direct Memory Access), TCP/IP or a custom low-latency network protocol, etc., and selects one of the network protocols according to the mode configuration input by the user and outputs it to the main control device connected to the network interface through the network interface.
  • the network module not only supports the PCIE protocol inside the FPGA, but also supports external RDMA, TCP/IP and other protocols, and can realize the conversion between internal and external protocols. Please refer to Figure 4. When selecting an externally supported protocol, it can be dynamically configured through the mode selection switch.
  • the user sends, through the network interface, a register message for configuring the mode selection switch.
  • the network interface parses the message, extracts the mode selection switch value and writes it to the mode selection register module; the network protocol processing module then processes the network protocol that corresponds to the value of the mode selection register.
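  • The mode selection flow can be modelled in a few lines. Everything concrete in this sketch is an assumption: the two-byte message layout (opcode `0x01` followed by the mode value), the numeric codes assigned to each protocol and the placeholder `encapsulate` framing are invented for illustration and are not defined by the application.

```python
from enum import Enum


class Protocol(Enum):
    # Hypothetical register encodings for the protocols named in the text.
    RDMA = 0
    TCP_IP = 1
    CUSTOM_LOW_LATENCY = 2


SET_MODE_OPCODE = 0x01  # hypothetical opcode of the register message


class NetworkModule:
    def __init__(self):
        self.mode_register = Protocol.TCP_IP.value  # assumed default

    def handle_config_message(self, message: bytes) -> None:
        """Parse a register message and update the mode selection register."""
        if len(message) != 2 or message[0] != SET_MODE_OPCODE:
            raise ValueError("not a mode selection message")
        Protocol(message[1])  # raises ValueError for an unsupported value
        self.mode_register = message[1]

    def encapsulate(self, payload: bytes) -> bytes:
        """Tag the payload with the selected protocol (placeholder framing)."""
        protocol = Protocol(self.mode_register)
        return protocol.name.encode() + b"|" + payload
```

  • After `handle_config_message(bytes([0x01, 0]))`, outgoing data is framed by the RDMA branch of the model; an unsupported mode value leaves the register unchanged.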
  • the PCIE Switch includes three channels, two of which are connected to the AI processors and the other is connected to the network interface.
  • AI processor-1 initiates a memory access operation to AI processor-2, and the target address is BAR2.
  • the DMA controller of AI processor-1 initiates the operation.
  • This operation message is first sent through the PCIE interface to PCIE RP IP-1, the corresponding downstream interface of the FPGA PCIE Switch. The output of PCIE RP IP-1 is then sent to parsing module 1.
  • parsing module 1 parses the TLP message and extracts the destination address and other information.
  • an arbitration request based on the destination address is output and sent to the arbitration module of each interface.
  • the destination address is the BAR2 address.
  • the arbitration module 2 outputs a valid arbitration enable signal, and other arbitration modules output invalid enable signals.
  • the TLP message is sent to the receiving bus of PCIE RP IP-2, and then sent to AI processor-2 through the PCIE interface.
  • the DMA controller of AI processor-2 extracts the data of the TLP message and stores it in the corresponding storage medium. If the destination address of the TLP message is not within the routing window range (that is, not within the routing window of PCIE IP-2, nor within the routing window of the network path) or the TLP message is abnormal, the message is discarded.
  • each PCIE interface of the FPGA can be configured so that the PCIE interface becomes a PCIE interface with a PCIE Switch function.
  • the network interface of the FPGA can also be configured as a Switch network interface accordingly.
  • this embodiment realizes the PCIE Switch function of the FPGA's PCIE interface on the basis of the PCIE RP IP hard core built into the FPGA, without an additional PCIE Switch controller, which gives low cost, simple implementation, and low utilization of FPGA logic resources.
  • this method can be applied to almost all FPGAs and has strong portability.
  • This method can also be extended to other PCIE processors: if a processor's PCIE interface comes with a PCIE RP IP but does not support the PCIE Switch mode, the PCIE RP IP can be used to give that interface the Switch forwarding function.
  • the PCIE interface that realizes the PCIE Switch can also be configured with the help of the PCIE EP IP hard core that comes with the interconnected device without adding a Switch controller; but it should be noted that in the EP mode, the configuration TLP message format for configuring the PCIE Switch needs to be converted from the type0 format to the type1 format, and other configuration contents can refer to the configuration of the RP mode provided in this application.
  • the IP (Intellectual Property) hard core is a controller in the form of a hardware module.
  • the embodiment of the present application discloses an interconnection device, including: a network interface, a PCIE interface and a logical interconnection module; the network interface has a built-in first register and is connected to a main control device; the PCIE interface is configured as an RP mode or an EP mode and is connected to a host device or an accelerator card.
  • the PCIE interface is configured to receive target data sent by a device connected thereto.
  • the logical interconnection module is configured to, if it determines that the destination address of the target data is within the routing address recorded in the first register (network register), convert the target data into a first message that complies with the first transmission protocol recorded in the first register, and send the first message to the main control device through the network interface.
  • the interconnect device is an FPGA acceleration card, an ASIC acceleration card, or a multi-core processor.
  • the embodiment of the present application also provides an interconnection device.
  • the interconnection device can be either the interconnection device 1 shown in FIG5 or the interconnection device 2 shown in FIG6.
  • FIG5 and FIG6 are both interconnection device structure diagrams according to an exemplary embodiment, and the contents in the diagrams cannot be regarded as any limitation on the scope of use of the present application.
  • FIG5 is a schematic diagram of the structure of an interconnection device 1 provided in an embodiment of the present application.
  • the interconnection device 1 may include: at least one processor, at least one memory, a power supply, a communication interface, an input/output interface, and a communication bus.
  • the memory is configured to store a computer program, which is loaded and executed by the processor and can implement the relevant steps implemented by the logical interconnection module disclosed in any of the aforementioned embodiments.
  • the power supply is configured to provide working voltage for each hardware device on the interconnected device 1;
  • the communication interface can create a PCIE-based data transmission channel between the interconnected device 1 and the external device, and the communication protocol it follows is any communication protocol that can be applied to the technical solution of the present application, which is not limited here;
  • the input and output interface is configured to obtain external input data or output data to the outside world, and its interface type can be selected according to actual application needs and is not limited here.
  • the memory as a carrier of resource storage can be a read-only memory, random access memory, disk or CD, etc.
  • the resources include operating systems, computer programs, and data, and the storage method can be temporary or permanent.
  • the operating system is configured to manage and control the hardware devices and computer programs on the interconnected device 1 to realize the operation and processing of the data in the memory by the processor, and it can be Windows Server, Netware, Unix, Linux, etc.
  • computer programs can also include computer programs that can be used to complete other specific tasks.
  • data can also include data such as application developer information.
  • FIG6 is a schematic diagram of the structure of an interconnected device 2 provided in an embodiment of the present application.
  • the interconnected device 2 may include but is not limited to a smart phone, a tablet computer, a laptop computer, or a desktop computer.
  • the interconnection device 2 in this embodiment includes: a processor and a memory.
  • the processor may include one or more processing cores, such as a 4-core processor, an 8-core processor, etc.
  • the processor can be implemented in at least one hardware form of DSP (Digital Signal Processor), FPGA, and PLA (Programmable Logic Array).
  • the processor may also include a main processor and a coprocessor.
  • the main processor is a processor for processing data in the awake state, also known as CPU (Central Processing Unit);
  • the coprocessor is a low-power processor for processing data in the standby state.
  • the processor may be integrated with a GPU, which is responsible for rendering and drawing the content that needs to be displayed on the display screen.
  • the processor may also include an AI (Artificial Intelligence) processor, which is used to process computing operations related to machine learning.
  • the memory may include one or more non-volatile computer-readable storage media, which may be non-transitory.
  • the memory may also include high-speed random access memory and non-volatile memory, such as one or more disk storage devices or flash memory storage devices.
  • the memory is used at least to store a computer program which, after being loaded and executed by the processor, can implement the relevant steps of the logical interconnection module executed on the interconnection device 2 side, as disclosed in any of the aforementioned embodiments.
  • the resources stored in the memory may also include operating systems and data, etc., and the storage method may be temporary storage or permanent storage.
  • the operating system may include Windows, Unix, Linux, etc.
  • the data may include, but is not limited to, update information of the application.
  • the interconnection device 2 may also include a display screen, an input and output interface, a communication interface, a sensor, a power supply, and a communication bus.
  • the structure shown in FIG. 6 does not constitute a limitation on the interconnection device 2, which may include more or fewer components than those shown in the figure.
  • the present application provides an interconnection device, including: a network interface, a logical interconnection module and multiple PCIE interfaces; the network interface is configured to connect to a master control device; any PCIE interface is configured to EP mode and connected to a host to form a host cluster of a mesh topology or a host cluster of a crossbar topology; the logical interconnection module is configured to: forward data sent by a host connected to any PCIE interface to hosts connected to other PCIE interfaces; forward data sent by the master control device to a host connected to any PCIE interface; forward data sent by a host connected to any PCIE interface to the master control device.
  • the interconnection device also includes: a configuration device; accordingly, the configuration device is configured to: configure the host connected to the current PCIE interface and the slave device core of the current PCIE interface according to the configuration message of any PCIE interface.
  • the configuration device is configured to: obtain the configuration data of the slave device core based on the configuration message, and configure the configuration data in the slave device core.
  • the network interface is built with a first register configured to record the network protocol supported by the master device; accordingly, the logical interconnection module is also configured to: encapsulate the data sent by the host connected to any PCIE interface using the network protocol supported by the master device, and forward the encapsulated data to the master device.
  • the network interface is also configured to: update the network protocol recorded in the first register according to the protocol update information sent by the master device.
  • Any PCIE interface is configured as RP mode and connected to an accelerator card; the logic interconnection module is set to: forward the accelerator card data sent by the PCIE interface connected to the accelerator card to the host connected to other PCIE interfaces; forward the data sent by the main control device to the accelerator card connected to any PCIE interface; forward the accelerator card data sent by the PCIE interface connected to the accelerator card to the main control device.
  • the interconnection device also includes an enabling device; accordingly, the enabling device is configured to disable the other PCIE interfaces when enabling the network interface and any one PCIE interface.
  • the enabling device is also configured to disable the network interface and the other PCIE interfaces when enabling any two PCIE interfaces.
  • the logical interconnection module is further configured to: encapsulate the network protocol message sent by the master control device into a PCIE message, and forward the PCIE message to any PCIE interface.
  • the logical interconnection module is further configured to: discard data with missing forwarding addresses.
  • the interconnection device also includes: a DMA management device and a DMA controller; accordingly, the DMA management device is configured to: query the free-memory information of the host connected to any PCIE interface; and, using the DMA controller and the free-memory information, write the data to be transmitted into the memory of the current host by DMA.
  • the DMA management device is also configured to: broadcast the memory information of the interconnection device to the host connected to any PCIE interface.
  • the DMA management device is also configured to: store the host memory information broadcast by the host connected to any PCIE interface.
  • the steps of the method or algorithm described in conjunction with the embodiments disclosed herein may be implemented directly using hardware, a software module executed by a processor, or a combination of the two.
  • the software module may be placed in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of non-volatile readable storage medium known in the art.
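
As a non-limiting illustration of the TLP forwarding walk-through above (AI processor-1 addressing BAR2 of AI processor-2), the following Python sketch models the per-interface arbitration as a routing-window lookup. The class name `RoutingWindow`, the functions `arbitrate` and `route`, and all addresses are hypothetical and chosen only for illustration; the application does not specify window values or how overlapping windows are resolved, so the sketch assumes a message is forwarded only when exactly one arbitration module asserts its enable signal, and is otherwise discarded.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RoutingWindow:
    """Address window owned by one egress path (hypothetical model)."""
    name: str
    base: int
    size: int

    def contains(self, address: int) -> bool:
        return self.base <= address < self.base + self.size


def arbitrate(destination: int, windows: List[RoutingWindow]) -> List[bool]:
    """One enable signal per arbitration module: valid if the address hits its window."""
    return [window.contains(destination) for window in windows]


def route(destination: int, windows: List[RoutingWindow]) -> Optional[str]:
    """Return the egress name, or None when the message is discarded."""
    enables = arbitrate(destination, windows)
    # Assumption: forward only when exactly one module reports a valid enable.
    if enables.count(True) != 1:
        return None
    return windows[enables.index(True)].name


# Hypothetical windows: BAR2 of AI processor-2 and the network path.
WINDOWS = [
    RoutingWindow("PCIE RP IP-2", base=0x2000_0000, size=0x1000_0000),
    RoutingWindow("network", base=0x8000_0000, size=0x1000_0000),
]
```

With these assumed windows, `route(0x2000_1000, WINDOWS)` selects the path toward AI processor-2, while an address outside both windows yields `None`, corresponding to the discard case described above.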

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Bus Control (AREA)

Abstract

The present application relates to the technical field of computers, and discloses an interconnect system, device and network. In the present application, the interconnect device is connected to a plurality of hosts by means of its own PCIE interfaces so as to form a host cluster of a mesh topology or a host cluster of a crossbar topology. The PCIE interfaces of the interconnect device are configured to be in an EP mode, and therefore, a data path from any host connected to the interconnect device to a network interface of the interconnect device, a data path from the network interface of the interconnect device to a host connected to any PCIE interface of the interconnect device, and a data path between hosts connected to different PCIE interfaces of the interconnect device can be realized by means of the interconnect device, without increasing hardware costs, and different hosts do not need to be connected by means of a network interface card and a network.

Description

An interconnection system, device and network

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to the Chinese patent application filed with the China Patent Office on October 27, 2023, with application number 2023114130298 and entitled "An interconnection system, device and network", the entire contents of which are incorporated herein by reference.

Technical Field

The embodiments of the present application relate to the field of computer technology, and more specifically, to an interconnection system, device and network.

Background Art

Currently, the interconnection between different host servers is mostly achieved through network cards and network cables; different host servers cannot be connected directly through a single hardware network card.

Therefore, how to use a single hardware device to achieve direct connection between different host servers is a problem that those skilled in the art need to solve.

Summary of the invention

In view of this, the purpose of the present application is to provide an interconnection system, device and network, so as to achieve direct connection between different host servers using a single hardware device. The solution is as follows:

The present application provides an interconnection system, comprising: an interconnection device and multiple hosts;

The interconnection device includes: a network interface, a logical interconnection module and multiple PCIE (Peripheral Component Interconnect Express, a high-speed serial computer expansion bus standard) interfaces;

The network interface is configured to connect to a master control device;

Any PCIE interface is configured in EP (End Point) mode and connected to a host, so as to form a host cluster with a mesh topology or a host cluster with a crossbar topology;

The logical interconnection module is configured to: forward data sent by a host connected to any PCIE interface to hosts connected to other PCIE interfaces; forward data sent by the master control device to a host connected to any PCIE interface; and forward data sent by a host connected to any PCIE interface to the master control device.

Optionally, the interconnection device further includes: a configuration device;

Accordingly, the configuration device is configured to: configure the host connected to the current PCIE interface and the slave device core of the current PCIE interface according to the configuration message of any PCIE interface.

Optionally, the configuration device is configured to: obtain configuration data of the slave device core based on the configuration message, and configure the configuration data in the slave device core.

Optionally, the network interface has a built-in first register configured to record the network protocol supported by the master control device;

Accordingly, the logical interconnection module is further configured to: encapsulate data sent by a host connected to any PCIE interface using the network protocol supported by the master control device, and forward the encapsulated data to the master control device.

Optionally, the network interface is further configured to: update the network protocol recorded in the first register according to protocol update information sent by the master control device.
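
As a non-limiting illustration, the role of the first register can be sketched in Python as follows. The names `Protocol`, `NetworkInterface`, `apply_protocol_update` and `encapsulate`, as well as the numeric register encodings, are assumptions made for this sketch; the application does not define a register layout. The "encapsulation" here only prefixes a protocol tag, whereas a real device would build the actual RDMA or TCP/IP headers.

```python
from dataclasses import dataclass
from enum import Enum


class Protocol(Enum):
    """Hypothetical register encodings for the supported network protocols."""
    RDMA = 0
    TCP_IP = 1
    CUSTOM_LOW_LATENCY = 2


@dataclass
class NetworkInterface:
    # The "first register": records the protocol supported by the master control device.
    first_register: Protocol = Protocol.TCP_IP

    def apply_protocol_update(self, update: int) -> None:
        """Update the first register from protocol update information.

        Raises ValueError for an encoding that names no supported protocol.
        """
        self.first_register = Protocol(update)

    def encapsulate(self, payload: bytes) -> bytes:
        """Stand-in for protocol encapsulation: prefix a tag naming the protocol."""
        tag = self.first_register.name.encode("ascii")
        return tag + b"|" + payload
```

In this sketch, data from a host is always wrapped according to whatever protocol the register currently records, so a protocol update from the master control device changes the encapsulation of all subsequent traffic without any change on the host side.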

Optionally, any PCIE interface is configured in RP mode and connected to an accelerator card;

The logical interconnection module is configured to: forward accelerator card data sent by the PCIE interface connected to the accelerator card to hosts connected to other PCIE interfaces; forward data sent by the master control device to the accelerator card connected to any PCIE interface; and forward accelerator card data sent by the PCIE interface connected to the accelerator card to the master control device.

Optionally, the interconnection device further includes: an enabling device;

Accordingly, the enabling device is configured to: disable the other PCIE interfaces when enabling the network interface and any one PCIE interface.

Optionally, the enabling device is further configured to: disable the network interface and the other PCIE interfaces when enabling any two PCIE interfaces.
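
As a non-limiting illustration, the two enabling rules can be sketched in Python as follows. The names `EnableState`, `enable_network_path` and `enable_pcie_pair` are hypothetical, and the zero-based port indexing is an assumption of this sketch rather than something stated in the application.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EnableState:
    """Enable flags for the network interface and each PCIE interface."""
    network: bool
    pcie: List[bool]


def _check_port(num_pcie: int, port: int) -> None:
    if not 0 <= port < num_pcie:
        raise ValueError(f"PCIE port {port} out of range")


def enable_network_path(num_pcie: int, port: int) -> EnableState:
    """Enable the network interface and one PCIE interface; disable the others."""
    _check_port(num_pcie, port)
    return EnableState(network=True, pcie=[i == port for i in range(num_pcie)])


def enable_pcie_pair(num_pcie: int, port_a: int, port_b: int) -> EnableState:
    """Enable two PCIE interfaces; disable the network interface and the others."""
    _check_port(num_pcie, port_a)
    _check_port(num_pcie, port_b)
    if port_a == port_b:
        raise ValueError("two distinct PCIE ports are required")
    return EnableState(
        network=False,
        pcie=[i in (port_a, port_b) for i in range(num_pcie)],
    )
```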

Optionally, the logical interconnection module is further configured to: encapsulate a network protocol message sent by the master control device into a PCIE message, and forward the PCIE message to any PCIE interface.

Optionally, the logical interconnection module is further configured to: discard data whose forwarding address is missing.

Optionally, the interconnection device further includes: a DMA (Direct Memory Access) management device and a DMA controller;

Accordingly, the DMA management device is configured to: query the free-memory information of the host connected to any PCIE interface; and, using the DMA controller and the free-memory information, write the data to be transmitted into the memory of the current host by DMA.

Optionally, the DMA management device is further configured to: broadcast the memory information of the interconnection device to the host connected to any PCIE interface.

Optionally, the DMA management device is further configured to: store the host memory information broadcast by the host connected to any PCIE interface.
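
As a non-limiting illustration, the behaviour of the DMA management device can be modelled in Python as follows. The class `DmaManager`, its methods, and the representation of free-memory information as a list of `(offset, length)` regions are assumptions of this sketch; the application does not specify the format of the memory information. Host memory is simulated with a `bytearray`, and the first free region that is large enough is used.

```python
from typing import Dict, Iterable, List, Tuple

Region = Tuple[int, int]  # (offset, length) of a free region; hypothetical format


class DmaManager:
    """Toy model of the DMA management device."""

    def __init__(self) -> None:
        # Free-memory information per PCIE port, as broadcast by each host.
        self.free_info: Dict[int, List[Region]] = {}

    def store_host_broadcast(self, port: int, regions: Iterable[Region]) -> None:
        """Store the host memory information broadcast by the host on `port`."""
        self.free_info[port] = list(regions)

    def dma_write(self, port: int, host_memory: bytearray, data: bytes) -> int:
        """Write `data` into the first free region that fits; return its offset."""
        regions = self.free_info.get(port, [])
        for index, (offset, length) in enumerate(regions):
            if length >= len(data):
                host_memory[offset:offset + len(data)] = data
                # Shrink the region so later writes do not overwrite this data.
                regions[index] = (offset + len(data), length - len(data))
                return offset
        raise MemoryError("no free region large enough for the transfer")
```

Keeping the free-memory bookkeeping on the interconnection device side means the write can proceed without a per-transfer query to the host, which is one plausible reason for storing the broadcast information.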

Optionally, the interconnection device is an FPGA (Field Programmable Gate Array) accelerator card, an ASIC (Application Specific Integrated Circuit) accelerator card or a multi-core processor.

The present application provides an interconnection device, comprising: a network interface, a logical interconnection module and multiple PCIE interfaces;

The network interface is configured to connect to a master control device;

Any PCIE interface is configured in EP mode and connected to a host, so as to form a host cluster with a mesh topology or a host cluster with a crossbar topology;

The logical interconnection module is configured to: forward data sent by a host connected to any PCIE interface to hosts connected to other PCIE interfaces;

forward data sent by the master control device to a host connected to any PCIE interface; and forward data sent by a host connected to any PCIE interface to the master control device.

Optionally, the interconnection device further includes: a configuration device;

Accordingly, the configuration device is configured to: configure the host connected to the current PCIE interface and the slave device core of the current PCIE interface according to the configuration message of any PCIE interface.

Optionally, the network interface has a built-in first register configured to record the network protocol supported by the master control device;

Accordingly, the logical interconnection module is further configured to: encapsulate data sent by a host connected to any PCIE interface using the network protocol supported by the master control device, and forward the encapsulated data to the master control device.

Optionally, any PCIE interface is configured in RP (Root Port) mode and connected to an accelerator card;

The logical interconnection module is configured to: forward accelerator card data sent by the PCIE interface connected to the accelerator card to hosts connected to other PCIE interfaces; forward data sent by the master control device to the accelerator card connected to any PCIE interface; and forward accelerator card data sent by the PCIE interface connected to the accelerator card to the master control device.

The present application provides an interconnection network, comprising: a plurality of the interconnection devices according to any one of the foregoing.

Optionally, the network interfaces of different interconnection devices are connected to the same master control device.

It can be seen from the above solution that the present application provides an interconnection system, comprising: an interconnection device and multiple hosts; the interconnection device includes: a network interface, a logical interconnection module and multiple PCIE interfaces; the network interface is configured to connect to a master control device; any PCIE interface is configured in EP mode and connected to a host, so as to form a host cluster with a mesh topology or a host cluster with a crossbar topology; the logical interconnection module is configured to: forward data sent by a host connected to any PCIE interface to hosts connected to other PCIE interfaces; forward data sent by the master control device to a host connected to any PCIE interface; and forward data sent by a host connected to any PCIE interface to the master control device.

It can be seen that the beneficial effects of the present application are as follows: the interconnection device can connect directly to multiple hosts through its own PCIE interfaces to form a host cluster with a mesh topology or a host cluster with a crossbar topology. Since the PCIE interfaces of the interconnection device are configured in EP mode, the following can be realized by means of the interconnection device without increasing hardware cost: a data path from any host connected to the interconnection device to the network interface of the interconnection device, a data path from the network interface of the interconnection device to the host connected to any PCIE interface of the interconnection device, and a data path between hosts connected to different PCIE interfaces of the interconnection device, without using network cards and a network to connect the different hosts.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present application; those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.

FIG. 1 is a schematic diagram of an interconnection system disclosed in the present application;

FIG. 2 is a schematic diagram of an interconnection network disclosed in the present application;

FIG. 3 is a schematic diagram of another interconnection network disclosed in the present application;

FIG. 4 is a schematic diagram of the Switch function of an FPGA-based interconnection device disclosed in the present application;

FIG. 5 is a structural diagram of an interconnection device 1 provided by the present application;

FIG. 6 is a structural diagram of an interconnection device 2 provided by the present application;

FIG. 7 is a schematic diagram of another interconnection device provided by the present application.

DETAILED DESCRIPTION

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments in the present application without creative effort fall within the scope of protection of the present application.

Currently, the interconnection between different host servers is mostly achieved through network cards and network cables; different host servers cannot be connected directly through a single hardware network card. To this end, the present application provides an interconnection solution in which an interconnection device connects directly to multiple hosts, so that the hosts form a host cluster with a mesh topology or a host cluster with a crossbar topology. The interconnection device also provides: a data path from any host connected to the interconnection device to the network interface of the interconnection device, a data path from the network interface of the interconnection device to any host connected to the interconnection device, and a data path between hosts connected to different PCIE interfaces of the interconnection device, without using network cards and a network to connect the different hosts.

As shown in FIG. 1, an interconnection system disclosed in an embodiment of the present application includes: an interconnection device and multiple hosts; the interconnection device includes: a network interface, a logical interconnection module and multiple PCIE interfaces; the network interface is configured to connect to a master control device; any PCIE (Peripheral Component Interconnect Express, a high-speed serial computer expansion bus standard) interface is configured in EP mode and connected to a host, so as to form a host cluster with a mesh topology or a host cluster with a crossbar topology. It can be seen that any device having a network interface, a logical interconnection module, and PCIE interfaces configured in EP mode can serve as the interconnection device provided in this embodiment. The logical interconnection module may be a hardware module with a processor function.

The logical interconnection module is configured to: forward data sent by a host connected to any PCIE interface to hosts connected to other PCIE interfaces; forward data sent by the master control device to a host connected to any PCIE interface; and forward data sent by a host connected to any PCIE interface to the master control device.
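
As a non-limiting illustration, the three forwarding rules of the logical interconnection module can be sketched in Python as an egress selection function. The names `NETWORK` and `select_egress` are hypothetical. Two behaviours are assumptions of this sketch rather than statements of the application: data addressed back to its own source is dropped (the text speaks only of forwarding to *other* PCIE interfaces), and a destination outside the known ports is treated like a missing forwarding address.

```python
from typing import Optional, Union

NETWORK = "net"              # stands for the network interface / master control device
Endpoint = Union[int, str]   # a PCIE port index, or NETWORK


def _is_valid(endpoint: object, num_pcie: int) -> bool:
    if endpoint == NETWORK:
        return True
    return isinstance(endpoint, int) and 0 <= endpoint < num_pcie


def select_egress(
    source: Endpoint, destination: Optional[Endpoint], num_pcie: int
) -> Optional[Endpoint]:
    """Return the egress endpoint, or None when the data is discarded.

    Covers host -> other host, master control device -> host,
    and host -> master control device.
    """
    if destination is None:          # forwarding address missing: discard
        return None
    if not _is_valid(source, num_pcie) or not _is_valid(destination, num_pcie):
        return None
    if source == destination:        # assumption: never loop data back to its source
        return None
    return destination
```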

本实施例可使任意主机通过PCIE接口发给互联设备的数据到达互联设备的网络接口所连主控设备。本实施例可使任意主机通过PCIE接口发给互联设备的数据到达其他PCIE接口所连的主机,实现互联设备的不同PCIE接口所连主机之间的数据通路。本实施例可使主控设备发给互联设备的数据到达互联设备的任意PCIE接口所连的主机,实现互联设备的网络接口至互联设备任意PCIE接口所连主机的数据通路。This embodiment enables data sent by any host to the interconnected device through the PCIE interface to reach the master device connected to the network interface of the interconnected device. This embodiment enables data sent by any host to the interconnected device through the PCIE interface to reach hosts connected to other PCIE interfaces, thereby realizing data paths between hosts connected to different PCIE interfaces of the interconnected device. This embodiment enables data sent by the master device to the interconnected device to reach hosts connected to any PCIE interface of the interconnected device, thereby realizing data paths from the network interface of the interconnected device to hosts connected to any PCIE interface of the interconnected device.

在本实施例中,主控设备可以是远端主机,也可以是远端加速设备。In this embodiment, the main control device may be a remote host or a remote acceleration device.

在一种实施方式中,互联设备还包括:配置器件;相应地,配置器件被设置为:根据任意PCIE接口的配置报文配置当前PCIE接口所连主机以及当前PCIE接口的从设备核。根据配置报文对主机进行设备号配置和总线地址配置。In one embodiment, the interconnection device further comprises: a configuration device; correspondingly, the configuration device is configured to: configure the host connected to the current PCIE interface and the slave device core of the current PCIE interface according to the configuration message of any PCIE interface, and configure the device number and bus address of the host according to the configuration message.

In one implementation, the configuration device is configured to obtain the configuration data of the slave device core from the configuration message and to write that configuration data to the slave device core. The interconnection device further includes bus registers and memory window registers; correspondingly, the configuration device is configured to set the bus registers and the memory window registers according to the configuration message, so as to configure the root device core of the PCIE interface. Alternatively, the configuration device converts the configuration message into the data format that matches the EP mode and, according to the converted configuration message, writes the bus address obtained by translating the host's memory address to the slave device core. The data format that matches the EP mode is the type0 format, and the data format that matches the RP mode is the type1 format.

An interconnection device has multiple PCIE interfaces (N in total), and any of them can be configured by means of the configuration device. This yields a data path from any PCIE interface of the interconnection device to the network interface, a data path from the network interface to any PCIE interface, and data paths between different PCIE interfaces, all without an expansion controller. The interconnection device can therefore support mesh and crossbar topologies, so that the master device and multiple hosts can be assembled into a mesh topology or a crossbar topology.

When configuring the IP (Internet Protocol Address) address for the PCIE interface, the command registers are configured, including the memory space enable register, the IO space register, and the bus master register; the maximum write payload length register, the maximum read request register, and the strong-ordering mode register are configured as well. When the PCIE interface is initialized, the bus registers are configured, including the root bus register, the sub-bus register, and the maximum bus register; the memory window registers are configured to set the prefetchable memory range of the interface and the window base address and window size of the non-prefetchable memory; and the identification information (BDF value) of the host device is configured. At this point the PCIE interface can implement the Switch function based on its built-in root device core. In general, the capability configuration space registers of an RP reside in the Root Port, whereas those of a Switch reside in the Switch. The BDF (Bus Device Function) value consists of the bus number, the device number, and the function number.
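As an illustration of the BDF value mentioned above, the following minimal Python sketch packs and unpacks the three fields using the conventional PCIe layout (8-bit bus number, 5-bit device number, 3-bit function number). The function names `pack_bdf` and `unpack_bdf` are illustrative and do not come from this application.

```python
def pack_bdf(bus: int, device: int, function: int) -> int:
    """Pack bus (8 bits), device (5 bits) and function (3 bits) into 16 bits."""
    if not (0 <= bus <= 0xFF and 0 <= device <= 0x1F and 0 <= function <= 0x7):
        raise ValueError("bus/device/function out of range")
    return (bus << 8) | (device << 3) | function


def unpack_bdf(bdf: int) -> tuple:
    """Split a 16-bit BDF value back into (bus, device, function)."""
    return (bdf >> 8) & 0xFF, (bdf >> 3) & 0x1F, bdf & 0x7
```

For example, bus 1, device 0, function 0 packs to `0x0100`, and unpacking a packed value returns the original triple.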

When configuring the slave device core built into the PCIE interface, the configuration TLP (Transaction Layer Packet, the transaction-layer packet used for data transfer) in type0 format is first converted to type1 format. The BDF register is then configured to assign identification information to the host device of the current interface. Next, the BAR (Base Address Register) address, obtained by translating the memory address of the host device, is configured so that the current interface can read the memory addresses of the host device. After that, the command registers are configured, including the memory space enable register, the IO space register, and the bus master register, followed by the maximum write payload length register, the maximum read request register, and the strong-ordering mode register. At this point the PCIE interface can implement the Switch function based on its built-in slave device core.

In one implementation, the network interface has a built-in first register configured to record the network protocol supported by the master device. Correspondingly, the logical interconnection module is further configured to encapsulate the data sent by the host connected to any one PCIE interface using the network protocol supported by the master device, and to forward the encapsulated data to the master device. In one example, the user changes the transport protocol supported by the master device through the master device itself. One such transport protocol is TCP (Transmission Control Protocol, a transport-layer protocol).

In one implementation, the network interface is further configured to update the network protocol recorded in the first register according to protocol update information sent by the master device.

It should be noted that when some PCIE interfaces of the interconnection device are connected to hosts and others are connected to accelerator cards, the host connected to one PCIE interface can communicate with the accelerator card connected to another PCIE interface. In one implementation, any PCIE interface is configured in RP mode and connected to an accelerator card, and the logical interconnection module is configured to: forward accelerator card data sent from the PCIE interface connected to the accelerator card to the hosts connected to the other PCIE interfaces; forward data sent by the master device to the accelerator card connected to any one PCIE interface; and forward accelerator card data sent from the PCIE interface connected to the accelerator card to the master device. This forwarding also relies on the destination address of the forwarded data: from that address, the logical interconnection module determines whether the data goes to the network interface or to a particular PCIE interface.

In this embodiment, the network interface and each PCIE interface can be flexibly enabled and disabled. In one implementation, the interconnection device further includes an enabling device, which is configured to disable the other PCIE interfaces when it enables the network interface and any one PCIE interface. In one implementation, the enabling device is further configured to disable the network interface and the other PCIE interfaces when it enables any two PCIE interfaces. Accordingly, before the logical interconnection module determines the transport protocol from the current protocol information in the first register, the enabling device may enable the network interface and disable the other PCIE interfaces so that the data is sent out through the network interface. Likewise, before the logical interconnection module forwards data through a PCIE interface to the host connected to that interface, the enabling device may enable that PCIE interface and disable the network interface and the other PCIE interfaces so that the data is sent out through the enabled PCIE interface.
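The enable and disable behaviour described here can be summarised as "enable exactly the two endpoints of the active data path and disable everything else". The Python sketch below shows that rule under this reading; the interface names and the function `enable_path` are hypothetical and are not taken from this application.

```python
def enable_path(interfaces, endpoint_a, endpoint_b):
    """Return an enable map: True for the two path endpoints, False otherwise.

    `interfaces` lists every interface of the interconnection device,
    for example the network interface plus N PCIE interfaces.
    """
    if endpoint_a == endpoint_b:
        raise ValueError("a data path needs two distinct interfaces")
    for endpoint in (endpoint_a, endpoint_b):
        if endpoint not in interfaces:
            raise ValueError(f"unknown interface: {endpoint}")
    return {name: name in (endpoint_a, endpoint_b) for name in interfaces}


# Hypothetical device with one network interface and three PCIE interfaces.
INTERFACES = ["net", "pcie0", "pcie1", "pcie2"]
host_to_master = enable_path(INTERFACES, "net", "pcie1")
host_to_host = enable_path(INTERFACES, "pcie0", "pcie2")
```

With a network-to-PCIE path, the other PCIE interfaces are disabled; with a PCIE-to-PCIE path, the network interface and the remaining PCIE interfaces are disabled.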

In one implementation, the logical interconnection module is further configured to encapsulate a network protocol message sent by the master device into a PCIE message and to forward the PCIE message to any PCIE interface. In this way, data that the master device sends to the interconnection device through the network interface can reach the host connected to at least one PCIE interface.

In one implementation, the logical interconnection module is further configured to discard data whose forwarding address is missing. In other words, forwarded data carries a forwarding address, namely the destination address of the data, and from that address the logical interconnection module determines whether the data goes to the network interface or to a particular PCIE interface.
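The forwarding decision can be sketched in a few lines of Python. This is a simplified illustration, not the implementation of this application: destinations are modelled as interface names rather than bus addresses, and dropping data whose destination is unknown or equal to its source is an added assumption beyond the "missing address" case stated above.

```python
from typing import Optional

NETWORK = "net"  # hypothetical name for the network interface


def select_output(dest: Optional[str], source: str, pcie_ports: set) -> Optional[str]:
    """Return the interface to forward to, or None if the data is discarded."""
    if dest is None:
        return None  # forwarding address missing: discard
    if dest == NETWORK and source != NETWORK:
        return NETWORK  # host or accelerator card to master device
    if dest in pcie_ports and dest != source:
        return dest  # master to host, or host to host
    return None  # assumption: unknown or looped-back destinations are dropped
```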

In one example, the interconnection device further includes a DMA (Direct Memory Access) management device and a DMA controller. Correspondingly, the DMA management device is configured to query the free-memory information of the host connected to any PCIE interface and, using the DMA controller and that free-memory information, to write the data to be transferred into the memory of that host by DMA. This realizes direct DMA writes between the interconnection device and the host.

The DMA management device is further configured to broadcast the memory information of the interconnection device to the host connected to any PCIE interface, and to store the host memory information broadcast by the host connected to any PCIE interface.
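The bookkeeping side of the DMA management device can be sketched as follows. This Python model is only an assumption-laden illustration: it represents each host's free memory as a single contiguous region `(base, size)`, which the application does not specify, and the names `DmaManager`, `on_broadcast`, and `plan_write` are hypothetical. The actual transfer would be carried out by the DMA controller.

```python
class DmaManager:
    """Tracks free host memory and picks target addresses for DMA writes."""

    def __init__(self):
        self.free_regions = {}  # host name -> (base address, free size in bytes)

    def on_broadcast(self, host, base, size):
        """Store the free-memory information broadcast by a host."""
        self.free_regions[host] = (base, size)

    def plan_write(self, host, length):
        """Return the target address for a DMA write, or None if it cannot fit."""
        region = self.free_regions.get(host)
        if region is None or length > region[1]:
            return None
        base, size = region
        # Consume the front of the free region so later writes do not overlap.
        self.free_regions[host] = (base + length, size - length)
        return base
```

A caller would record each host broadcast with `on_broadcast`, then ask `plan_write` for an address before handing the transfer to the DMA controller.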

In one implementation, the interconnection device is an FPGA (Field Programmable Gate Array) accelerator card, an ASIC (Application Specific Integrated Circuit) accelerator card, or a multi-core processor.

It can be seen that, in this embodiment, the interconnection device connects multiple hosts directly through its own PCIE interfaces to form a host cluster with a mesh topology or a crossbar topology. Because the PCIE interfaces of the interconnection device are configured in EP mode, the following can be achieved with this device without additional hardware cost: a data path from any host connected to the interconnection device to its network interface, a data path from its network interface to the host connected to any of its PCIE interfaces, and data paths between hosts connected to its different PCIE interfaces. No network cards or network are needed to connect the different hosts. The interconnection device also supports other topologies.

Referring to Figure 2, an embodiment of the present application provides an interconnection network that includes N interconnection devices as provided in the foregoing embodiments, with the different interconnection devices connected to the same master device. Each interconnection device has multiple PCIE interfaces, and each PCIE interface is connected to one host. In this embodiment, the interconnection device may be a GPU (Graphics Processing Unit), an FPGA, an NPU (Neural Processing Unit), or a TPU (Tensor Processing Unit). A GPU typically performs pure compute acceleration, whereas an FPGA can handle application acceleration such as compute acceleration, network acceleration, and storage acceleration.

As shown in Figure 3, assume that a task needs accelerated computation and that the acceleration is carried out by interconnection device 1, interconnection device 2, and interconnection device 3, all of which are connected to the same master device. The system is initialized as follows: the master device initializes the PCIE interfaces of the three interconnection devices in turn, completing the configuration of the configuration space and the allocation of bus base addresses. Each of the three interconnection devices periodically broadcasts its free-memory information to the other interconnection devices through its PCIE interfaces. All three interconnection devices may be accelerator cards. They can interconnect different hosts without network cards or a network, forming a host cluster with a mesh topology or a crossbar topology for higher-performance computing.

With reference to the above embodiments, each PCIE interface of the interconnection device is an RP mode interface or an EP mode interface, and the bus address range of the PCIE interface is no smaller than the bus address range of the device connected to it. An interface configured in EP mode can be connected to a host server, which allows multiple host servers to be integrated; an interface configured in RP mode can be connected to an accelerator card, which forms an AI (Artificial Intelligence) accelerator card system. PCIE interfaces have an RP (Root Port) mode and an EP (End Point) mode. The existing RP and EP modes support only one-to-one communication between the interconnection device and the device connected to its PCIE interface. They support neither communication between the master device connected to the network interface and a device connected to a PCIE interface, nor communication between devices connected to different PCIE interfaces. In this embodiment, by contrast, once the PCIE interfaces are set up as RP mode or EP mode interfaces, the device connected to the network interface can communicate with the devices connected to the PCIE interfaces, and devices connected to different PCIE interfaces can communicate with one another. The network interface of the interconnection device provided in the embodiments of the present application can be connected directly to a server; when the interconnection device is an accelerator card, it and the directly connected server form a "host directly connected to accelerator card" system.

Inside the interconnection device, data follows the PCIE protocol. According to the present application, the interconnection device provides a bottom-up data path, a top-down data path, and data paths between host devices connected to its different PCIE interfaces. The type of transport protocol used toward the master device can be chosen flexibly according to the needs of the scenario. The interconnection device thus supports full interconnection in mesh, crossbar, and similar topologies; that is, any interface of the interconnection device can communicate with any of its other interfaces.

The following takes an interconnection device implemented as an FPGA accelerator card as an example. With the PCIE interfaces of the FPGA in RP mode, it describes how the master device connected to the FPGA's network interface communicates with the devices connected to the FPGA's PCIE interfaces, and how devices connected to different PCIE interfaces of the FPGA communicate with one another.

Optionally, this embodiment uses the PCIE RP hard core inside the FPGA to give the FPGA's PCIE interface the PCIE Switch function, which also improves the utilization of FPGA logic resources. When a PCIE interface is configured in RP mode, only the contents of the configuration space need to be changed for the interface to implement the PCIE Switch function.

Referring to Figure 4, an interface configuration module, a device configuration module, a logic processing module, and a network module are implemented on top of the PCIE RP hard cores inside the FPGA. For the network module and for each of the two downstream PCIE interfaces, the logic processing module provides a parsing module, a multiplexing module, and an arbitration module. The FPGA in Figure 4 has two downstream PCIE interfaces, each containing a PCIE RP hard core. An FPGA may of course have more or fewer downstream PCIE interfaces, and each downstream PCIE interface is connected to one host or one accelerator card device.

In general, a PCIE RP hard core provides three groups of user interfaces: a user receive bus, a user transmit bus, and a configuration-space configuration bus. The user receive bus and user transmit bus carry TLP messages. The configuration-space configuration bus is used to configure the configuration space of the PCIE RP and to read the default register values of the configuration space, which are passed to the parsing module of the logic processing module to assist parsing. The device configuration module initializes the configuration spaces of the two processors by issuing configuration TLP messages; the BAR base addresses of the two processors are configured as BAR1 and BAR2. The interface configuration module initializes the configuration spaces of the FPGA's two PCIE interfaces through the cfg_mgnt bus (configuration management bus); the BAR base addresses of the two PCIE interfaces are configured as BAR3 and BAR4, where the BAR3 address range is greater than or equal to the BAR1 address range and the BAR4 address range is greater than or equal to the BAR2 address range.
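The range constraint between the interface BARs and the device BARs can be checked mechanically. The Python sketch below interprets "address range greater than or equal to" as a comparison of window sizes, which is one possible reading and an assumption here; the sizes used in the example are made up for illustration.

```python
def undersized_interfaces(interface_sizes, device_sizes, pairing):
    """Return the interface BARs whose window is smaller than their device's.

    `pairing` maps an interface BAR name to the device BAR it must cover,
    for example BAR3 -> BAR1 and BAR4 -> BAR2.
    """
    return [
        iface
        for iface, dev in pairing.items()
        if interface_sizes[iface] < device_sizes[dev]
    ]


# Hypothetical window sizes in bytes.
DEVICE_SIZES = {"BAR1": 0x1000_0000, "BAR2": 0x2000_0000}
INTERFACE_SIZES = {"BAR3": 0x1000_0000, "BAR4": 0x1000_0000}
PAIRING = {"BAR3": "BAR1", "BAR4": "BAR2"}
violations = undersized_interfaces(INTERFACE_SIZES, DEVICE_SIZES, PAIRING)
```

In this made-up configuration BAR4 is reported, because its window is smaller than the BAR2 window it is meant to cover.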

Thus the device configuration module is configured to initialize the downstream EP devices (hosts or accelerator cards), and the interface configuration module is configured to initialize the PCIE interfaces. The logic processing module is configured to forward and control TLP messages, with a parsing module, a multiplexing module, and an arbitration module provided for each interface. The parsing module parses the TLP messages output on the transmit bus of the PCIE RP, issues arbitration requests according to the routing information, and handles abnormal TLP messages. The arbitration module receives the arbitration requests and outputs arbitration signals that select the transmit path for each message. The multiplexing module receives the data output by the parsing module of each path and, according to the arbitration signal, selects the appropriate path and sends the data to the receive bus of the PCIE RP.

The network module is configured to handle operations related to the network interface. In this embodiment, the network module is connected to the network interface of the FPGA. It implements several network transport protocols internally, such as RDMA (Remote Direct Memory Access), TCP/IP, or a custom low-latency network protocol, and selects one of them according to the mode configuration entered by the user for output through the network interface to the master device connected to it. As noted above, the network module supports not only the PCIE protocol used inside the FPGA but also external protocols such as RDMA and TCP/IP, and it can convert between the internal and external protocols. Referring to Figure 4, the external protocol is selected dynamically through a mode selection switch: the user sends a switch-register message carrying the mode selection through the network interface; the network interface parses the received message, extracts the mode selection switch value, and writes it to the mode selection register module; and the network protocol processing module then handles the network protocol that corresponds to the value of the mode selection register.
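The mode selection flow (message received, value extracted, register written, protocol chosen) can be sketched as below. The message layout used here, a one-byte opcode `0xA5` followed by a one-byte mode value, and the numeric protocol codes are purely hypothetical; the application does not define the format of the switch-register message.

```python
import struct

# Hypothetical mapping from mode-selection values to protocols.
PROTOCOLS = {0: "RDMA", 1: "TCP/IP", 2: "custom low-latency"}
MODE_SELECT_OPCODE = 0xA5  # hypothetical opcode marking a mode-select message


class ModeSelectRegister:
    """Holds the currently selected protocol code (default: TCP/IP)."""

    def __init__(self):
        self.value = 1


def handle_mode_message(message: bytes, register: ModeSelectRegister) -> None:
    """Parse a mode-select message and write its value to the register."""
    if len(message) != 2:
        raise ValueError("mode-select message must be exactly 2 bytes")
    opcode, mode = struct.unpack("BB", message)
    if opcode != MODE_SELECT_OPCODE or mode not in PROTOCOLS:
        raise ValueError("not a valid mode-select message")
    register.value = mode


def active_protocol(register: ModeSelectRegister) -> str:
    """Return the protocol the network protocol processing module should use."""
    return PROTOCOLS[register.value]
```

An invalid message leaves the register unchanged, so the previously selected protocol stays active.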

In this example, the PCIE Switch has three paths in total: two connected to AI processors and one connected to the network interface. The communication between the two processors connected to the FPGA's two PCIE interfaces proceeds as follows. AI processor-1 initiates a memory access to AI processor-2 with target address BAR2. The DMA controller of AI processor-1 issues the operation, and the message is first sent over the PCIE interface to PCIE RP IP-1, the corresponding downstream interface of the FPGA PCIE Switch. The output of PCIE RP IP-1 goes to parsing module 1, which parses the TLP message and extracts the destination address and related information. If the destination address is within the routing window of PCIE IP-2, a destination-address arbitration request is output and sent to the arbitration module of each interface. In this example the destination address is the BAR2 address, so arbitration module 2 outputs a valid arbitration enable signal and the other arbitration modules output invalid enable signals. According to the valid arbitration signal, the TLP message is sent to the receive bus of PCIE RP IP-2 and then over the PCIE interface to AI processor-2, whose DMA controller extracts the data from the TLP message and stores it in the corresponding storage medium. If the destination address of the TLP message is outside all routing windows (that is, within neither the routing window of PCIE IP-2 nor the routing window of the network path), or if the TLP message is abnormal, the message is discarded.
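The parse, arbitrate, and select sequence in this example can be modelled in software as a lookup of the destination address against each path's routing window. The Python sketch below is a behavioural illustration only: the window bases and sizes are invented, and a real design would implement this in FPGA logic rather than Python.

```python
def arbitrate(dest_addr, source, windows):
    """Return one enable flag per path; all False means the TLP is dropped.

    `windows` maps a path name to its routing window as (base, size).
    """
    enables = {name: False for name in windows}
    for name, (base, size) in windows.items():
        if name != source and base <= dest_addr < base + size:
            enables[name] = True  # only this path's arbitration signal is valid
            break
    return enables


# Hypothetical routing windows for the three paths of the example.
WINDOWS = {
    "pcie_ip_1": (0x1000_0000, 0x1000),   # stands in for the BAR1 range
    "pcie_ip_2": (0x2000_0000, 0x1000),   # stands in for the BAR2 range
    "network": (0x8000_0000, 0x1_0000),
}
to_processor_2 = arbitrate(0x2000_0010, "pcie_ip_1", WINDOWS)
out_of_window = arbitrate(0x3000_0000, "pcie_ip_1", WINDOWS)
```

Here a TLP from AI processor-1 addressed inside the second window enables only the path to PCIE IP-2, while an address outside every window enables no path and the message is discarded.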

It should be noted that, according to this embodiment, each PCIE interface of the FPGA can be configured so that it becomes a PCIE interface with the PCIE Switch function. The network interface of the FPGA can likewise be configured as a network interface of the Switch. Configuring the FPGA in this way enables communication between AI processors or between hosts, as well as communication between an AI processor and the FPGA's upstream network interface, without an expansion controller or expansion card.

It can be seen that this embodiment uses the PCIE RP IP hard core that comes with the FPGA to give the FPGA's PCIE interface the PCIE Switch function. No additional PCIE Switch controller is needed, the cost is low, the implementation is simple, and few FPGA logic resources are consumed. The method applies to almost all FPGAs and is therefore highly portable. It can also be extended to other PCIE processors: if the PCIE interface of such a processor has its own PCIE RP IP but does not support the PCIE Switch mode, the PCIE RP IP can be used to turn it into a PCIE Switch interface that provides the Switch forwarding function. Following the same approach, the PCIE EP IP hard core that comes with the interconnection device can be used to configure a PCIE Switch interface without adding a Switch controller. Note, however, that in EP mode the configuration TLP messages used to configure the PCIE Switch must be converted from type0 format to type1 format; the remaining configuration can follow the RP mode configuration provided in this application. An IP (Intellectual Property) hard core is a controller in the form of a hardware module.

An interconnection device provided in an embodiment of the present application is described below. The interconnection device described below and the other embodiments described herein may be read with reference to one another.

An embodiment of the present application discloses an interconnection device that includes a network interface, a PCIE interface, and a logical interconnection module. The network interface has a built-in first register and is connected to a master device. The PCIE interface is configured in RP mode or EP mode and is connected to a host device or an accelerator card. The PCIE interface is configured to receive target data sent by the device connected to it. The logical interconnection module is configured to: if it determines that the destination address of the target data falls within the routing address recorded in the first register (the network register), convert the target data into a first message conforming to the first transport protocol recorded in the first register, and send the first message to the master device through the network interface.

In one implementation, the interconnection device is an FPGA accelerator card, an ASIC accelerator card, or a multi-core processor.

An embodiment of the present application further provides an interconnection device, which may be either the interconnection device 1 shown in Figure 5 or the interconnection device 2 shown in Figure 6. Figures 5 and 6 are structural diagrams of interconnection devices according to an exemplary embodiment, and their contents shall not be regarded as limiting the scope of use of the present application in any way.

Figure 5 is a schematic structural diagram of an interconnection device 1 provided in an embodiment of the present application. The interconnection device 1 may include at least one processor, at least one memory, a power supply, a communication interface, an input/output interface, and a communication bus. The memory is configured to store a computer program that, when loaded and executed by the processor, carries out the relevant steps performed by the logical interconnection module disclosed in any of the foregoing embodiments.

In this embodiment, the power supply is configured to provide the operating voltage for each hardware component of the interconnection device 1. The communication interface creates a PCIE-based data transmission channel between the interconnection device 1 and external devices; the communication protocol it follows may be any protocol applicable to the technical solution of the present application and is not limited here. The input/output interface is configured to obtain input data from outside or to output data to the outside; its type may be chosen according to the needs of the application and is not limited here.

In addition, the memory, as the carrier for resource storage, may be a read-only memory, a random access memory, a magnetic disk, an optical disc, or the like. The resources stored on it include an operating system, computer programs, and data, and the storage may be temporary or permanent.

The operating system is configured to manage and control the hardware devices and computer programs on the interconnection device 1, so that the processor can operate on and process the data in the memory; it may be Windows Server, Netware, Unix, Linux, or the like. Besides the computer program that carries out the relevant steps implemented by the aforementioned logical interconnection module, the computer programs may also include programs that perform other specific tasks. Besides application update information, the data may also include information such as application developer details.

FIG. 6 is a schematic structural diagram of an interconnection device 2 provided in an embodiment of the present application. The interconnection device 2 may include, but is not limited to, a smartphone, a tablet computer, a laptop computer, or a desktop computer.

Generally, the interconnection device 2 in this embodiment includes a processor and a memory.

The processor may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor may be implemented in at least one of the following hardware forms: DSP (Digital Signal Processing), FPGA, and PLA (Programmable Logic Array). The processor may also include a main processor and a coprocessor. The main processor, also called the CPU (Central Processing Unit), processes data in the awake state; the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, a GPU may be integrated in the processor, the GPU being responsible for rendering and drawing the content to be shown on the display screen. In some embodiments, the processor may also include an AI (Artificial Intelligence) processor, which handles computing operations related to machine learning.

The memory may include one or more non-volatile computer-readable storage media, which may be non-transitory. The memory may also include high-speed random access memory as well as non-volatile memory, such as one or more magnetic disk storage devices or flash storage devices. In this embodiment, the memory is used at least to store a computer program which, after being loaded and executed by the processor, implements the relevant steps carried out by the logical interconnection module on the interconnection device 2 side as disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory may include an operating system and data, and the storage may be temporary or permanent. The operating system may be Windows, Unix, Linux, or the like. The data may include, but is not limited to, application update information.

In some embodiments, the interconnection device 2 may further include a display screen, an input/output interface, a communication interface, a sensor, a power supply, and a communication bus.

Those skilled in the art will understand that the structure shown in FIG. 6 does not limit the interconnection device 2, which may include more or fewer components than those illustrated.

Referring to FIG. 7, the present application provides an interconnection device, including: a network interface, a logical interconnection module, and a plurality of PCIE interfaces. The network interface is configured to connect to a master control device. Any PCIE interface is configured in EP mode and connected to one host, so as to form a host cluster with a mesh topology or a host cluster with a crossbar topology. The logical interconnection module is configured to: forward data sent by the host connected to any one PCIE interface to the hosts connected to the other PCIE interfaces; forward data sent by the master control device to the host connected to any one PCIE interface; and forward data sent by the host connected to any one PCIE interface to the master control device.
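The three forwarding paths can be illustrated with a small software model. This is only a sketch under stated assumptions, not the device's actual logic: the class name `LogicalInterconnect`, the `MASTER` identifier, the integer port numbers, and the in-memory `inbox` are all hypothetical stand-ins for hardware interfaces.

```python
from collections import defaultdict

MASTER = "master"  # hypothetical identifier for the master control device


class LogicalInterconnect:
    """Toy model of the three forwarding paths described above."""

    def __init__(self, pcie_ports):
        # One host per EP-mode PCIE interface; ports are identified by number.
        self.pcie_ports = set(pcie_ports)
        # Delivered data, keyed by destination (port number or MASTER).
        self.inbox = defaultdict(list)

    def host_to_hosts(self, src_port, data):
        """Forward data from one host to the hosts on the other PCIE interfaces."""
        for port in sorted(self.pcie_ports - {src_port}):
            self.inbox[port].append((src_port, data))

    def master_to_host(self, dst_port, data):
        """Forward data from the master control device to one host."""
        if dst_port not in self.pcie_ports:
            raise ValueError(f"unknown PCIE interface: {dst_port}")
        self.inbox[dst_port].append((MASTER, data))

    def host_to_master(self, src_port, data):
        """Forward data from one host to the master control device."""
        self.inbox[MASTER].append((src_port, data))


fabric = LogicalInterconnect(pcie_ports=[0, 1, 2])
fabric.host_to_hosts(0, b"hello")
fabric.master_to_host(2, b"config")
fabric.host_to_master(1, b"status")
```

In this model the sending host never receives its own data, which mirrors the wording "the hosts connected to the other PCIE interfaces".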

The interconnection device further includes a configuration device. Accordingly, the configuration device is configured to: configure, according to a configuration message of any PCIE interface, the host connected to the current PCIE interface and the slave device core of the current PCIE interface. The configuration device is configured to: obtain configuration data of the slave device core based on the configuration message, and apply the configuration data to the slave device core. The network interface has a built-in first register configured to record the network protocol supported by the master control device. Accordingly, the logical interconnection module is further configured to: encapsulate data sent by the host connected to any one PCIE interface using the network protocol supported by the master control device, and forward the encapsulated data to the master control device. The network interface is further configured to: update the network protocol recorded in the first register according to protocol update information sent by the master control device.
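The relationship between the first register and the encapsulation step might be sketched as follows. Everything here is an assumption made for illustration: the protocol names "UDP" and "TCP" are placeholders, and the JSON header and the `{"protocol": ...}` update format are hypothetical and do not come from the application.

```python
import json


class NetworkInterface:
    """Holds the 'first register' that records the master's network protocol."""

    def __init__(self, protocol):
        self.first_register = protocol

    def apply_protocol_update(self, update):
        # `update` uses a hypothetical message format: {"protocol": "<name>"}.
        self.first_register = update["protocol"]


def encapsulate_for_master(nic, src_port, payload):
    """Wrap host data in a frame tagged with the protocol read from the register."""
    header = json.dumps({"proto": nic.first_register, "src_port": src_port})
    return header.encode() + b"\n" + payload


nic = NetworkInterface(protocol="UDP")
frame_before = encapsulate_for_master(nic, src_port=3, payload=b"data")
nic.apply_protocol_update({"protocol": "TCP"})
frame_after = encapsulate_for_master(nic, src_port=3, payload=b"data")
```

Because the encapsulation reads the register on every call, a protocol update from the master control device takes effect for the next frame without any other change.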

Any PCIE interface may be configured in RP mode and connected to one accelerator card. The logical interconnection module is configured to: forward accelerator card data sent from the PCIE interface connected to the accelerator card to the hosts connected to the other PCIE interfaces; forward data sent by the master control device to the accelerator card connected to any one PCIE interface; and forward accelerator card data sent from the PCIE interface connected to the accelerator card to the master control device.

The interconnection device further includes an enabling device. Accordingly, the enabling device is configured to: disable the other PCIE interfaces when the network interface and any one PCIE interface are enabled. The enabling device is further configured to: disable the network interface and the other PCIE interfaces when any two PCIE interfaces are enabled.
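The two enabling modes are mutually exclusive, which a minimal state model makes explicit. This is a sketch only; the class `EnableDevice` and its method names are hypothetical, and real hardware would drive enable signals rather than update Python sets.

```python
class EnableDevice:
    """Tracks which interfaces are enabled in the two modes described above."""

    def __init__(self, pcie_ports):
        self.all_ports = set(pcie_ports)
        self.enabled_ports = set()
        self.network_enabled = False

    def enable_network_and_port(self, port):
        """Network interface plus one PCIE interface; all other PCIE interfaces off."""
        self._check(port)
        self.network_enabled = True
        self.enabled_ports = {port}

    def enable_port_pair(self, port_a, port_b):
        """Two PCIE interfaces; network interface and all other PCIE interfaces off."""
        self._check(port_a)
        self._check(port_b)
        if port_a == port_b:
            raise ValueError("a pair needs two distinct PCIE interfaces")
        self.network_enabled = False
        self.enabled_ports = {port_a, port_b}

    def _check(self, port):
        if port not in self.all_ports:
            raise ValueError(f"unknown PCIE interface: {port}")


enabler = EnableDevice(pcie_ports=range(4))
enabler.enable_network_and_port(2)
state_master_link = (enabler.network_enabled, sorted(enabler.enabled_ports))
enabler.enable_port_pair(0, 3)
state_host_link = (enabler.network_enabled, sorted(enabler.enabled_ports))
```

Each call replaces the whole enabled set, so switching modes never leaves a stale interface enabled.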

The logical interconnection module is further configured to: encapsulate a network protocol message sent by the master control device into a PCIE message, and forward the PCIE message to any PCIE interface. The logical interconnection module is further configured to: discard data whose forwarding address is missing.
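A minimal sketch of the re-encapsulation and discard rule, under assumptions: the `NetworkMessage` and `PcieMessage` types and the `to_pcie` helper are hypothetical, and treating an unknown destination port the same as a missing one is a choice made here, not something the application states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NetworkMessage:
    dst_port: Optional[int]  # forwarding address; None models a missing address
    payload: bytes


@dataclass
class PcieMessage:
    dst_port: int
    payload: bytes


def to_pcie(msg, known_ports):
    """Return a PcieMessage, or None when the message must be discarded."""
    if msg.dst_port is None or msg.dst_port not in known_ports:
        return None  # missing (or, by assumption, unknown) forwarding address
    return PcieMessage(dst_port=msg.dst_port, payload=msg.payload)


ports = {0, 1}
delivered = to_pcie(NetworkMessage(dst_port=1, payload=b"job"), ports)
dropped = to_pcie(NetworkMessage(dst_port=None, payload=b"job"), ports)
```

Returning `None` instead of raising keeps the discard silent, which matches the idea of simply dropping undeliverable data.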

The interconnection device further includes a DMA management device and a DMA controller. Accordingly, the DMA management device is configured to: query the free-memory information of the host connected to any PCIE interface; and, using the DMA controller and the free-memory information, write the data to be transmitted into the memory of the current host by DMA. The DMA management device is further configured to: broadcast the memory information of the interconnection device to the hosts connected to any PCIE interface. The DMA management device is further configured to: store the host memory information broadcast by the host connected to any PCIE interface.
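The cooperation between the DMA management device and the DMA controller can be modeled roughly as below. This is a sketch under assumptions: host memory is a `bytearray`, free-memory information is a list of `(offset, length)` regions, and the first-fit placement policy is chosen here for simplicity. None of these details are specified by the application.

```python
class Host:
    """Host with a flat memory and a list of free (offset, length) regions."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.free_regions = [(0, size)]


class DmaController:
    def write(self, host, offset, data):
        # Stands in for a hardware DMA write into host memory.
        host.memory[offset:offset + len(data)] = data


class DmaManager:
    def __init__(self, controller):
        self.controller = controller
        self.host_memory_info = {}  # port -> free regions broadcast by that host

    def store_broadcast(self, port, host):
        """Store the memory information a host has broadcast."""
        self.host_memory_info[port] = list(host.free_regions)

    def dma_write(self, port, host, data):
        """Find a free region that fits `data` and write it there by DMA."""
        regions = self.host_memory_info.get(port, [])
        for i, (offset, length) in enumerate(regions):
            if length >= len(data):
                self.controller.write(host, offset, data)
                # Shrink the region that was just used.
                regions[i] = (offset + len(data), length - len(data))
                return offset
        return None  # no free region is large enough


host = Host(size=16)
manager = DmaManager(DmaController())
manager.store_broadcast(port=0, host=host)
first = manager.dma_write(0, host, b"abcd")
second = manager.dma_write(0, host, b"efgh")
too_big = manager.dma_write(0, host, b"x" * 9)
```

The manager consults only its stored copy of the free-memory information, so a write never needs the host CPU to pick the destination address.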

The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be understood by reference to one another.

The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of non-volatile readable storage medium known in the art.

Specific examples are used herein to explain the principles and implementations of the present application. The description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. Meanwhile, a person of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (20)

1. An interconnection system, comprising: an interconnection device and a plurality of hosts;
the interconnection device comprises: a network interface, a logical interconnection module, and a plurality of high-speed serial computer expansion bus standard (PCIE) interfaces;
the network interface is configured to connect to a master control device;
any PCIE interface is configured in endpoint (EP) mode and connected to one host, so as to form a host cluster with a mesh topology or a host cluster with a crossbar topology;
the logical interconnection module is configured to: forward data sent by the host connected to any one PCIE interface to the hosts connected to the other PCIE interfaces; forward data sent by the master control device to the host connected to any one PCIE interface; and forward data sent by the host connected to any one PCIE interface to the master control device.

2. The interconnection system according to claim 1, characterized in that the interconnection device further comprises: a configuration device;
accordingly, the configuration device is configured to: configure, according to a configuration message of any PCIE interface, the host connected to the current PCIE interface and the slave device core of the current PCIE interface.

3. The interconnection system according to claim 2, characterized in that the configuration device is configured to: obtain configuration data of the slave device core based on the configuration message, and apply the configuration data to the slave device core.

4. The interconnection system according to claim 1, characterized in that the network interface has a built-in first register configured to record the network protocol supported by the master control device;
accordingly, the logical interconnection module is further configured to: encapsulate data sent by the host connected to any one PCIE interface using the network protocol supported by the master control device, and forward the encapsulated data to the master control device.

5. The interconnection system according to claim 4, characterized in that the network interface is further configured to: update the network protocol recorded in the first register according to protocol update information sent by the master control device.

6. The interconnection system according to claim 1, characterized in that any PCIE interface is configured in root port (RP) mode and connected to one accelerator card;
the logical interconnection module is configured to: forward accelerator card data sent from the PCIE interface connected to the accelerator card to the hosts connected to the other PCIE interfaces; forward data sent by the master control device to the accelerator card connected to any one PCIE interface; and forward accelerator card data sent from the PCIE interface connected to the accelerator card to the master control device.

7. The interconnection system according to claim 1, characterized in that the interconnection device further comprises: an enabling device;
accordingly, the enabling device is configured to: disable the other PCIE interfaces when the network interface and any one PCIE interface are enabled.

8. The interconnection system according to claim 7, characterized in that the enabling device is further configured to: disable the network interface and the other PCIE interfaces when any two PCIE interfaces are enabled.

9. The interconnection system according to claim 4, characterized in that the logical interconnection module is further configured to: encapsulate a network protocol message sent by the master control device into a PCIE message, and forward the PCIE message to any PCIE interface.

10. The interconnection system according to any one of claims 1 to 9, characterized in that the logical interconnection module is further configured to: discard data whose forwarding address is missing.

11. The interconnection system according to any one of claims 1 to 9, characterized in that the interconnection device further comprises: a direct memory access (DMA) management device and a DMA controller;
accordingly, the DMA management device is configured to: query the free-memory information of the host connected to any PCIE interface; and, using the DMA controller and the free-memory information, write the data to be transmitted into the memory of the current host by DMA.

12. The interconnection system according to claim 11, characterized in that the DMA management device is further configured to: broadcast the memory information of the interconnection device to the hosts connected to any PCIE interface.

13. The interconnection system according to claim 11, characterized in that the DMA management device is further configured to: store the host memory information broadcast by the host connected to any PCIE interface.

14. The interconnection system according to any one of claims 1 to 9, characterized in that the interconnection device is a field programmable gate array (FPGA) accelerator card, an application-specific integrated circuit (ASIC) accelerator card, or a multi-core processor.

15. An interconnection device, characterized in that it comprises: a network interface, a logical interconnection module, and a plurality of PCIE interfaces;
the network interface is configured to connect to a master control device;
any PCIE interface is configured in EP mode and connected to one host, so as to form a host cluster with a mesh topology or a host cluster with a crossbar topology;
the logical interconnection module is configured to: forward data sent by the host connected to any one PCIE interface to the hosts connected to the other PCIE interfaces; forward data sent by the master control device to the host connected to any one PCIE interface; and forward data sent by the host connected to any one PCIE interface to the master control device.

16. The interconnection device according to claim 15, characterized in that the interconnection device further comprises: a configuration device;
accordingly, the configuration device is configured to: configure, according to a configuration message of any PCIE interface, the host connected to the current PCIE interface and the slave device core of the current PCIE interface.

17. The interconnection device according to claim 15, characterized in that the network interface has a built-in first register configured to record the network protocol supported by the master control device;
accordingly, the logical interconnection module is further configured to: encapsulate data sent by the host connected to any one PCIE interface using the network protocol supported by the master control device, and forward the encapsulated data to the master control device.

18. The interconnection device according to claim 15, characterized in that any PCIE interface is configured in RP mode and connected to one accelerator card;
the logical interconnection module is configured to: forward accelerator card data sent from the PCIE interface connected to the accelerator card to the hosts connected to the other PCIE interfaces; forward data sent by the master control device to the accelerator card connected to any one PCIE interface; and forward accelerator card data sent from the PCIE interface connected to the accelerator card to the master control device.

19. An interconnection network, characterized in that it comprises: a plurality of interconnection devices according to any one of claims 15 to 18.

20. The interconnection network according to claim 19, characterized in that the network interfaces of different interconnection devices are connected to the same master control device.
PCT/CN2024/122115 2023-10-27 2024-09-29 Interconnect system, device and network Pending WO2025087005A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202311413029.8A CN117421268A (en) 2023-10-27 2023-10-27 Interconnection system, equipment and network
CN202311413029.8 2023-10-27

Publications (1)

Publication Number Publication Date
WO2025087005A1 true WO2025087005A1 (en) 2025-05-01

Family

ID=89529864

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2024/122115 Pending WO2025087005A1 (en) 2023-10-27 2024-09-29 Interconnect system, device and network

Country Status (2)

Country Link
CN (1) CN117421268A (en)
WO (1) WO2025087005A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117421268A (en) * 2023-10-27 2024-01-19 浪潮(北京)电子信息产业有限公司 Interconnection system, equipment and network
CN117978759B (en) * 2024-03-29 2024-05-28 苏州元脑智能科技有限公司 Interconnection device, high-performance exchange device and large-model all-in-one machine
CN119940249A (en) * 2024-11-26 2025-05-06 北京智芯微电子科技有限公司 Reconfigurable interconnect circuit, implementation method and chip for multi-core processor
CN120238521B (en) * 2025-06-03 2025-11-11 山东云海国创云计算装备产业创新中心有限公司 Methods, devices, systems, equipment, and media for assigning bus numbers to switching chips

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070130407A1 (en) * 2005-11-22 2007-06-07 Olson David M Bus system with multiple modes of operation
CN108897703A (en) * 2018-05-30 2018-11-27 郑州云海信息技术有限公司 A kind of high speed data transmission system and method based on PCIE
CN110941576A (en) * 2018-09-21 2020-03-31 苏州库瀚信息科技有限公司 System, method and apparatus for multimode PCIE capable storage controller
CN114827151A (en) * 2022-05-20 2022-07-29 合肥边缘智芯科技有限公司 Heterogeneous server cluster and data forwarding method, device and equipment
CN114968895A (en) * 2022-05-30 2022-08-30 浪潮电子信息产业股份有限公司 Heterogeneous interconnection system and cluster
CN117421268A (en) * 2023-10-27 2024-01-19 浪潮(北京)电子信息产业有限公司 Interconnection system, equipment and network

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8189603B2 (en) * 2005-10-04 2012-05-29 Mammen Thomas PCI express to PCI express based low latency interconnect scheme for clustering systems
US8972611B2 (en) * 2011-08-11 2015-03-03 Cisco Technology, Inc. Multi-server consolidated input/output (IO) device
EP3625939A1 (en) * 2017-07-10 2020-03-25 Fungible, Inc. Access node for data centers
CN107360088A (en) * 2017-08-28 2017-11-17 郑州云海信息技术有限公司 A kind of gateway architecture and collocation method of UNICOM's xenogenesis interconnection media
US11714775B2 (en) * 2021-05-10 2023-08-01 Zenlayer Innovation LLC Peripheral component interconnect (PCI) hosting device
CN113489607B (en) * 2021-06-29 2023-04-07 杭州海康威视数字技术股份有限公司 Service processing system, acquisition equipment and convergence equipment

Also Published As

Publication number Publication date
CN117421268A (en) 2024-01-19

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24881371

Country of ref document: EP

Kind code of ref document: A1