
CN112088521A - Memory device for high-bandwidth and high-capacity switch - Google Patents


Info

Publication number
CN112088521A
CN112088521A (application CN201880093306.2A)
Authority
CN
China
Prior art keywords
memory
pipe
memory device
memory blocks
write
Prior art date
Legal status
Pending
Application number
CN201880093306.2A
Other languages
Chinese (zh)
Inventor
拉米·扎查里亚
伊扎克·巴拉克
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN112088521A


Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 49/00 - Packet switching elements
    • H04L 49/30 - Peripheral units, e.g. input or output ports
    • H04L 49/3036 - Shared queuing

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention provides a shared memory device for a high-bandwidth, high-capacity switch. The memory device includes a plurality of memory blocks. The memory device also includes a plurality of ingress pipes, where each ingress pipe is configured to request that a data packet be written to the memory blocks, and a plurality of egress pipes, where each egress pipe is configured to request that a data packet be read from the memory blocks. Each egress pipe is associated with (soft-allocated to) a set of the memory blocks.

Description

A memory device for high-bandwidth, high-capacity switches

Technical Field

The present invention relates to the use of high-capacity, high-bandwidth switches in network systems. In particular, the invention relates to memory devices for such switches, and to switches including such a memory device. The memory device implements a new memory-block allocation scheme for the ingress pipes and egress pipes, respectively. The new allocation scheme is particularly suitable for shared memory architectures with multiple memory blocks. The invention also relates to a corresponding control method for the memory device.

Background

Traditional switches typically contain multiple bidirectional ports. Traffic arriving at an input port (usually Ethernet packets) is directed to an output port based on decisions made within the switch.

A port is usually characterized by its rate, which is typically the same for input and output. For example, a 100 Gbps port (100 gigabits per second) can both receive and send traffic at 100 Gbps.

The traditional switch also contains memory for temporarily holding incoming traffic before it is sent to an output port. There are many reasons to hold traffic inside the switch, for example:

1. Multiple input ports can receive traffic directed to a single output port (many-to-one). If the output port cannot transmit all of the received traffic, some of it must be held temporarily.

2. Back pressure applied to an output port from outside the switch can block further output of traffic to that port. All received traffic directed to that port must therefore be held temporarily.

3. The scheduling rate of an output port is an internal switch parameter that can be used to limit the output rate of a given port. Received traffic directed to that port must therefore be held temporarily.

The memory in high-capacity, high-bandwidth switches is typically built as shared memory, i.e., memory shared by all output ports. The main advantage of sharing is that less memory is needed than if each output port had its own dedicated memory.

A simplified architecture of a traditional switch is shown in Figure 7. Traffic arriving from different input ports passes through a classification engine, which selects the output port and the corresponding queue in which the received traffic is stored. The classification engine also determines any editing the arriving traffic may require. After classification, the traffic is stored in the memory, which temporarily buffers the received traffic. The memory is virtually arranged into queues and managed as queues. Control logic manages the queues, the buffer management, and the scheduling to the output ports. The queues can be of any type, such as input queues, output queues, virtual output queues (VOQs), etc.

In traditional high-capacity, high-bandwidth switches, the memory architecture is typically shared by all output ports. This means that received traffic from any input port, directed to any output port, can be written to this shared memory. Algorithms exist for managing the memory per output port. The switch is usually built from a single piece of silicon, i.e., it is a single device, so that all high-speed accesses to the shared memory are confined within the device, with no external interfaces. External interfaces could make it impractical to build the switch as a single device.

Disadvantageously, the shared memory architecture of the traditional switch has limitations when applied to high-capacity, high-bandwidth switches.

First, in the worst case, i.e., when the switch is fully utilized (100% load), a single switch with N ports, where each port supports bandwidth B (e.g., 10 Gbps, 100 Gbps), must support a write bandwidth of N*B and a read bandwidth of N*B to and from the shared memory. For example, in the worst case, a 64-port switch with 100 Gbps ports must support a read bandwidth of 6.4 Tbps and a write bandwidth of 6.4 Tbps.

Second, network traffic consists of packets of variable size (for example, Ethernet packets can be as small as 64 bytes and as large as 9 KB), while the shared memory in the switch has a fixed width (each memory location holds C bytes). Each arriving packet must therefore be split into chunks of C bytes before being written to the shared memory; the last chunk can be anywhere from 1 byte up to C bytes. In the worst case, if the arriving traffic consists of a stream of packets of size (C+1) bytes, each packet must be written to two locations in the shared memory, and read from two locations when it is to be sent to an output port. In this scenario, the bandwidth requirement on the memory doubles, to a maximum of 2*N*B for reading and 2*N*B for writing.
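The worst-case figures above can be checked with a few lines. The following sketch (hypothetical helper names, not part of the patent) computes the aggregate port bandwidth and the number of fixed-width memory locations a packet occupies:

```python
import math

def segments_per_packet(packet_bytes: int, c: int) -> int:
    """Number of C-byte memory locations an arriving packet occupies."""
    return math.ceil(packet_bytes / c)

# 64-port switch with 100 Gbps ports: aggregate bandwidth in each direction.
n_ports, port_gbps = 64, 100
aggregate_tbps = n_ports * port_gbps / 1000
print(aggregate_tbps)                 # 6.4 (Tbps, for reads and for writes)

# Worst case: a stream of (C+1)-byte packets needs two memory locations
# per packet, doubling the per-packet memory accesses.
C = 64
print(segments_per_packet(C + 1, C))  # 2
print(segments_per_packet(C, C))      # 1
```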

Third, in such high-bandwidth, high-capacity switches, the buffering shared memory is typically implemented inside the switch. The shared memory is usually built from single-port memory blocks to save silicon area, rather than from dual-port memory blocks, which are twice the size. A single-port memory block can perform only one read or one write per clock, but not both. This means that a shared memory built from single-port memory blocks must once again double its bandwidth capacity in order to support N*B bandwidth for simultaneous reads and writes.

Fourth, due to physical limits on operating frequency, it is impossible to support such high bandwidth requirements from a single block of single-port memory. For example, a 64-port switch with 100 Gbps ports must support a read bandwidth of 6.4 Tbps and a write bandwidth of 6.4 Tbps to the shared memory, without even considering the speed-up required for splitting packets into the fixed size C. For a split size, i.e., memory width, of 64 bytes, a 12.8 Tbps single-port memory would have to run at 25 GHz to sustain the required bandwidth. When the required bandwidth doubles in order to support streams of (C+1)-byte packets, the frequency is even higher. Increasing C to obtain a wider memory has an impact on the memory speed and on the silicon area of the memory and of the logic that supports C.

To address the above bandwidth problems of the shared memory architecture, multiple memory blocks of single-port memory are usually used instead of a single block. Specifically, using the multiple memory blocks involves two steps:

1. A group of ports is bundled (into a pipe) such that, at the operating frequency and under the worst-case traffic scenario (100% load and segmentation), the selected segment size C is sufficient to deliver C bytes per clock cycle without creating any bottleneck.

2. M memory blocks are connected in parallel, so that each pipe can access any memory block for reading or writing.

Figure 8 shows this architecture. In each clock cycle, each input pipe (also called an ingress pipe) can request a write to the shared memory, and each output pipe (also called an egress pipe) can request a read from the shared memory. For proper operation, for a given number P of pipes, the number M of memory blocks should be at least 2P, so that P reads and P writes can be performed in the same clock cycle (simultaneously).

The "queue engine and control" block accepts write requests from all ingress pipes and read requests from all egress pipes in every clock cycle. The block then decides which egress pipe reads from which memory block, and which ingress pipe writes to which memory block.

The order of reads from an egress pipe depends on the scheduling algorithm; it is independent of the location of the packets in the memory blocks and of the scheduling decisions of the other egress pipes.

If two or more egress pipes request a read from the same memory block, only one egress pipe is granted the read, while the others must wait and do not perform a read in that clock cycle. This scenario is called a collision. Because the segments of a packet must be transmitted in the order of the original packet, reads must be performed in order; in a typical implementation, read operations therefore cannot be performed out of order.

The combined read and write procedure for the architecture shown in Figure 8 is as follows. In each clock cycle, the control logic accepts W write requests from the ingress pipes and R read requests from the egress pipes (not all pipes request a read or a write in every clock cycle). The control logic then selects memory blocks for reading and writing based on the requests:

1. Select memory blocks for reading.

a. Perform a maximal matching between the egress pipes requesting reads (up to P) and the corresponding memory blocks (M).

b. Note that, due to possible collisions, not all read requests can be served in the same clock cycle.

c. Set the matched pairs {egress pipe, memory block}.

2. Select memory blocks for writing.

a. Build the list of memory blocks (M') available for writing:

i. Any memory block not selected for reading and not full.

b. From the list of M' memory blocks, select W memory blocks and attach each to an ingress pipe with a valid write request.

ii. Select the memory blocks using one of the following mechanisms:

1. A round-robin order among all available memory blocks.

2. Build a list of the memory blocks ordered by occupancy, from lowest to highest, and select the W least-occupied memory blocks.

c. Set the matched pairs {ingress pipe, memory block}.
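The selection procedure above can be sketched as follows. This is a simplified illustrative model, not the patent's implementation: the function names and data shapes are assumptions, reads are granted by a greedy one-read-per-block pass, and writes use mechanism 2 (least-occupied blocks first).

```python
def select_reads(read_requests):
    """read_requests maps egress pipe -> requested memory block.
    Grant at most one read per block; colliding pipes wait this cycle."""
    granted, used = {}, set()
    for pipe, block in read_requests.items():
        if block not in used:          # first requester wins, rest collide
            granted[pipe] = block
            used.add(block)
    return granted, used

def select_writes(write_pipes, occupancy, capacity, read_blocks):
    """Attach one block to each ingress pipe with a valid write request,
    choosing only blocks not selected for reading and not full, and
    preferring the least-occupied blocks (mechanism 2 above)."""
    available = sorted(
        (b for b, occ in occupancy.items()
         if b not in read_blocks and occ < capacity),
        key=lambda b: occupancy[b])
    return dict(zip(write_pipes, available))

# Pipes e0 and e1 collide on block 3: only e0 is granted this cycle.
granted, used = select_reads({"e0": 3, "e1": 3, "e2": 5})
print(granted)                         # {'e0': 3, 'e2': 5}
writes = select_writes(["i0", "i1"],
                       {0: 10, 1: 2, 2: 7, 3: 4, 5: 1}, 100, used)
print(writes)                          # {'i0': 1, 'i1': 2}
```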

In a traditional shared memory architecture using multiple memory blocks, the read bandwidth is disadvantageously reduced, because read requests from different egress pipes for the same memory block may collide.

Note that the read requests of each egress pipe are computed at that egress pipe, without regard to the requests of the other egress pipes. Two or more egress pipes may therefore request a read from the same memory block at the same time. In fact, for a given number of memory blocks, the probability of any collision increases with the number of pipes, while for a given number of pipes it decreases with the number of memory blocks.

The following equation gives the probability of no collision at all:

P(no collision) = [M * (M - 1) * ... * (M - P + 1)] / M^P = M! / ((M - P)! * M^P)

where M is the number of memory blocks and P is the number of egress pipes requesting a read.

Therefore, the probability of any collision is:

P(collision) = 1 - P(no collision) = 1 - M! / ((M - P)! * M^P)

For example:

• For 32 memory blocks and 4 egress pipes: P(collision) = 0.177

• For 32 memory blocks and 8 egress pipes: P(collision) = 0.614

• For 32 memory blocks and 12 egress pipes: P(collision) = 0.906

• For 32 memory blocks and 16 egress pipes: P(collision) = 0.990
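These figures follow directly from the formula above and can be reproduced with a short script (the function name is illustrative):

```python
from math import prod

def p_collision(m: int, p: int) -> float:
    """Probability that at least two of p egress pipes, each independently
    requesting one of m memory blocks, pick the same block."""
    p_no_collision = prod((m - i) / m for i in range(p))
    return 1 - p_no_collision

for pipes in (4, 8, 12, 16):
    print(f"M=32, P={pipes}: P(collision) = {p_collision(32, pipes):.3f}")
```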

Accordingly, in the traditional shared memory architecture there is a high probability of collision. Therefore, at the maximum possible capacity, under the worst-case traffic scenario and the worst-case collision pattern (e.g., all reads in the same clock cycle are requests to one particular memory block, so the last pipe performs its read only after waiting P - 1 clock cycles), the shared memory bandwidth is too low to sustain the outgoing traffic.

Various mechanisms have been proposed to reduce the added latency and the bandwidth loss caused by read collisions, for example:

1. Input queues (hard allocation of memory blocks to input pipes): this mechanism has good memory utilization and simpler logic. However, it produces a high collision rate on the read side and can cause head-of-line blocking.

2. Output queues (hard allocation of memory blocks to output pipes): this mechanism has simplified logic and eliminates read collisions, but memory utilization is low and the memory cannot be shared.

3. Increasing the number of memory blocks relative to the number of pipes: this mechanism increases the silicon area of the memory blocks and of the logic (multiplexers and demultiplexers).

4. Reading segments out of order and reordering them at each egress pipe: with this mechanism the latency problem is not solved, because the segments must still be received in order before transmission can start.

5. Using dual-port memory blocks instead of single-port memory blocks: with this mechanism, the silicon area of the memory blocks doubles for the same total shared memory size.

Summary of the Invention

In view of the above disadvantages, the present invention aims to improve on the traditional shared memory architecture and the mechanisms proposed so far. In particular, the invention aims to provide a shared memory architecture with improved latency and throughput. An objective of the invention is therefore to significantly reduce the collision probability, so as to avoid an increase of the shared memory bandwidth. Ideally, the possibility of read collisions is even eliminated entirely.

The objective of the invention is achieved by the solutions provided in the enclosed independent claims. Advantageous implementations of the invention are further defined in the dependent claims.

A first aspect of the invention provides a memory device for a switch. The memory device comprises a plurality of memory blocks, a plurality of ingress pipes, and a plurality of egress pipes, wherein each ingress pipe is configured to request that a data packet be written to the memory blocks, each egress pipe is configured to request that a data packet be read from the memory blocks, and each egress pipe is associated with a set of the memory blocks.

A "set" is a group of memory blocks associated with a single egress pipe. "Associated" means that these memory blocks are the preferred memory blocks for writing data packets destined for the associated egress pipe, and the preferred memory blocks from which the associated egress pipe reads. However, if more simultaneous writes are required, the remaining writes can also be allocated to other memory blocks, and the egress pipe can then also read from those other memory blocks.

By associating each egress pipe with a set of memory blocks, the placement of arriving data packets is controlled by the memory device, so that the probability of read collisions is at least significantly reduced. Since high latency and low throughput are often caused by colliding reads of the same memory block by different egress pipes, their probability is reduced as well. The memory device of the first aspect therefore enables an improved switch with higher bandwidth and capacity.

In an implementation form of the first aspect, the sets are disjoint.

This minimizes the possibility of collisions, possibly even eliminating them altogether, thereby optimally reducing latency and increasing throughput.

In a further implementation form of the first aspect, each set includes the same number of memory blocks.

This allows the memory device to be implemented efficiently as the shared memory architecture of the switch.
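As a minimal sketch of such an association (the block numbering and function name are assumptions, not from the patent), M blocks can be partitioned into disjoint, equally sized sets, one per egress pipe:

```python
def associate_sets(num_blocks: int, num_egress_pipes: int):
    """Partition block indices into disjoint sets of equal size, one set
    per egress pipe. The block numbering is purely illustrative."""
    per_pipe = num_blocks // num_egress_pipes
    return {pipe: set(range(pipe * per_pipe, (pipe + 1) * per_pipe))
            for pipe in range(num_egress_pipes)}

sets = associate_sets(32, 4)          # 8 blocks per egress pipe
print(sorted(sets[0]))                # [0, 1, 2, 3, 4, 5, 6, 7]
```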

In a further implementation form of the first aspect, the memory device includes a controller configured to select, for an egress pipe requesting to read a data packet, a memory block from the set associated with that egress pipe, for reading the data packet.

In a further implementation form of the first aspect, the controller is further configured to select, for an ingress pipe requesting to write a data packet destined for a determined egress pipe, a memory block from the set associated with the determined egress pipe, for writing the data packet.

The controller may be a processor. In particular, the controller implements the "soft allocation" (i.e., the association) of memory blocks to egress pipes, which reduces the probability of read collisions.

In a further implementation form of the first aspect, the controller is further configured to exclude one or more full memory blocks when selecting the memory block for writing the data packet from the set associated with the determined egress pipe.

This further improves the efficiency of the memory block allocation.

In a further implementation form of the first aspect, if the total number of write requests for data packets destined for the determined egress pipe is smaller than the number of memory blocks available for writing in the set associated with the determined egress pipe, the controller is further configured to select memory blocks from the writable memory blocks of that set, either based on minimum occupancy or randomly.

This reduces the number of selections the controller must make, which improves the allocation efficiency and lowers the probability of collisions.

In a further implementation form of the first aspect, if the total number of write requests for data packets destined for the determined egress pipe is larger than the number of memory blocks available for writing in the set associated with the determined egress pipe, the controller is further configured to create a list of all ingress pipes with outstanding write requests.

This allows the controller to keep track of all outstanding write requests, in order to process all requests more efficiently.

In a further implementation form of the first aspect, if the total number of write requests for data packets destined for the determined egress pipe is larger than the number of memory blocks available for writing in the set associated with the determined egress pipe, the controller is further configured to select memory blocks from another set, not associated with the determined egress pipe, for writing the data packets; in particular, to allocate a remaining available memory block to each outstanding write request.

This ensures that the probability of collisions is kept as low as possible, while all requests are eventually served.
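The write-side behaviour of these implementation forms can be sketched as follows. This is an illustrative model under stated assumptions (minimum-occupancy selection for both the associated set and the spill-over; all names are hypothetical):

```python
def select_write_blocks(dest_pipe, num_requests, sets, occupancy, capacity):
    """Choose one block per write request destined for dest_pipe.
    Prefer non-full blocks of the associated set (least occupied first);
    if requests remain outstanding, spill over to non-full blocks of the
    other sets."""
    def writable(blocks):
        return sorted((b for b in blocks if occupancy[b] < capacity),
                      key=lambda b: occupancy[b])

    chosen = writable(sets[dest_pipe])[:num_requests]
    outstanding = num_requests - len(chosen)
    if outstanding > 0:
        others = [b for p, blocks in sets.items() if p != dest_pipe
                  for b in blocks]
        chosen += writable(others)[:outstanding]
    return chosen

sets = {0: [0, 1], 1: [2, 3]}
occupancy = {0: 5, 1: 0, 2: 1, 3: 9}
print(select_write_blocks(0, 1, sets, occupancy, capacity=10))  # [1]
print(select_write_blocks(0, 3, sets, occupancy, capacity=10))  # [1, 0, 2]
```

In the second call, the two blocks of egress pipe 0's own set are used first, and the third request spills over to the least-occupied non-full block of the other set.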

In a further implementation form of the first aspect, the controller is configured to select, for the determined egress pipe requesting to read a data packet, a memory block from another set, not associated with the determined egress pipe, for reading the data packet.

This ensures that every data packet reaches its correct destination.

In a further implementation form of the first aspect, the controller is further configured to select the remaining available memory blocks for the outstanding write requests either based on minimum occupancy or randomly.

This implementation form can further improve the efficiency of the memory block allocation.

In a further implementation form of the first aspect, the controller is further configured to associate the egress pipes with the sets of memory blocks.

The controller thus has full control of the memory device. If necessary, the controller can also change the association of the sets of memory blocks with the egress pipes.

A second aspect of the invention provides a switch for packet switching. The switch includes a memory device according to the first aspect or any of its implementation forms.

In an implementation form of the second aspect, the switch includes a plurality of input ports and a plurality of output ports, wherein each ingress pipe is associated with a group of the input ports, and each egress pipe is associated with a group of the output ports.

A third aspect of the invention provides a method for controlling a memory device, the memory device including a plurality of memory blocks, ingress pipes, and egress pipes. The method comprises: selecting, for an egress pipe requesting to read a data packet, a memory block from the set of memory blocks associated with that egress pipe, for reading the data packet; and/or selecting, for any ingress pipe requesting to write a data packet destined for a determined egress pipe, a memory block from the set of memory blocks associated with the determined egress pipe, for writing the data packet.

In an implementation form of the third aspect, the sets are disjoint.

In a further implementation form of the third aspect, each set includes the same number of memory blocks.

In a further implementation form of the third aspect, the method comprises: selecting, for an egress pipe requesting to read a data packet, a memory block from the set associated with that egress pipe, for reading the data packet.

In a further implementation form of the third aspect, the method further comprises: selecting, for any ingress pipe requesting to write a data packet destined for a determined egress pipe, a memory block from the set associated with the determined egress pipe, for writing the data packet.

In a further implementation form of the third aspect, the method further comprises: excluding one or more full memory blocks when selecting the memory block for writing the data packet from the set associated with the determined egress pipe.

In a further implementation form of the third aspect, if the total number of write requests for data packets destined for the determined egress pipe is smaller than the number of memory blocks available for writing in the set associated with the determined egress pipe, the method comprises: selecting memory blocks from the writable memory blocks of that set, either based on minimum occupancy or randomly.

In a further implementation form of the third aspect, if the total number of write requests for data packets destined for the determined egress pipe is larger than the number of memory blocks available for writing in the set associated with the determined egress pipe, the method comprises: creating a list of all ingress pipes with outstanding write requests.

In a further implementation form of the third aspect, if the total number of write requests for data packets destined for the determined egress pipe is larger than the number of memory blocks available for writing in the set associated with the determined egress pipe, the method comprises: selecting memory blocks from another set, not associated with the determined egress pipe, for writing the data packets; in particular, allocating a remaining available memory block to each outstanding write request.

In a further implementation form of the third aspect, the method comprises: selecting, for the determined egress pipe requesting to read a data packet, a memory block from another set, not associated with the determined egress pipe, for reading the data packet.

In a further implementation form of the third aspect, the method comprises: selecting the remaining available memory blocks for the outstanding write requests either based on minimum occupancy or randomly.

In a further implementation form of the third aspect, the method comprises: associating the egress pipes with the sets of memory blocks.

所述第三方面的所述方法实现了所述第一方面的内存设备的所有上述优势和效果。The method of the third aspect achieves all of the above advantages and effects of the memory device of the first aspect.

本发明的第四方面提供了一种计算机程序产品,其存储程序代码,所述程序代码用于控制根据第一方面或其任一实现方式的内存设备和/或根据第二方面或其任一实现方式的交换机,或者当在计算机上实施时,用于执行根据所述第三方面或其任一实现方式的方法。A fourth aspect of the present invention provides a computer program product storing program code for controlling a memory device according to the first aspect or any of its implementations and/or according to the second aspect or any of its implementations A switch of an implementation, or when implemented on a computer, for performing a method according to the third aspect or any implementation thereof.

Accordingly, with the computer program product of the fourth aspect, the advantages of the first, second and third aspects, respectively, can be achieved.

It should be noted that all devices, elements, units and means described in this application may be implemented in software or hardware elements, or any kind of combination thereof. All steps performed by the various entities described in this application, as well as the functions described as performed by the various entities, are intended to mean that the respective entity is adapted to or configured to perform the respective steps and functions. Even if, in the following description of specific embodiments, a specific function or step performed by an external entity is not reflected in the description of a specific element of that entity which performs the specific step or function, it should be clear to the skilled person that these methods and functions can be implemented in respective hardware or software elements, or any combination thereof.

Description of the Drawings

The above-described aspects of the present invention and their implementations will be explained in the following description of specific embodiments, in conjunction with the accompanying drawings, in which:

FIG. 1 shows a memory device according to an embodiment of the present invention;

FIG. 2 shows a switch according to an embodiment of the present invention;

FIG. 3 shows a method according to an embodiment of the present invention;

FIG. 4 shows a simulation of uniformly distributed random traffic through a memory device according to an embodiment of the present invention;

FIG. 5 shows a simulation of uniformly distributed random traffic through a memory device according to an embodiment of the present invention;

FIG. 6 shows a simulation of uniformly distributed random burst traffic through a memory device according to an embodiment of the present invention;

FIG. 7 shows a conventional switch shared-memory architecture; and

FIG. 8 shows a conventional switch shared-memory architecture with multiple memory blocks and multiple egress and ingress pipes.

Detailed Description of Embodiments

FIG. 1 shows a memory device 100 according to an embodiment of the present invention. The memory device 100 is particularly suited for implementation in a switch 200 (shown in FIG. 2). In particular, the memory device 100 may serve as the shared-memory architecture of the switch 200.

The memory device 100 comprises a plurality of memory blocks 101, a plurality of ingress pipes 102 and a plurality of egress pipes 103, wherein each memory block 101 may be a conventional memory block. Furthermore, the ingress pipes 102 and egress pipes 103 may themselves be implemented like conventional pipes. Each ingress pipe 102 is configured to request that data packets be written to the memory blocks 101. For example, the data packets may be Ethernet packets. Further, each egress pipe 103 is configured to request that data packets be read from the memory blocks 101.

According to the present invention, each egress pipe 103 is associated with a set 104 of the memory blocks 101 (indicated by the dashed lines between memory blocks 101 and egress pipes 103; this does not mean that an egress pipe 103 can read only from these memory blocks 101). The sets 104 of memory blocks 101 may be disjoint sets, i.e., no two sets 104 share any memory block 101. Further, each set 104 may include the same number of memory blocks 101; however, two sets 104 may also include different numbers of memory blocks 101.

Since the read requests of the different egress pipes 103 from the memory blocks 101, and their order, are fixed and cannot be controlled, the memory device 100 of FIG. 1 instead controls how arriving data packets are written into the memory blocks 101, which significantly reduces the probability of read collisions. Accordingly, the probability of high latency and low throughput caused by read collisions of different egress pipes 103 on the same memory block 101 is also reduced.

An example of the memory device 100 of FIG. 1 is described below. In the memory device 100, the M memory blocks 101 are divided among the P egress pipes 103, such that each egress pipe 103 is associated with M/P memory blocks 101.

For example, for M=64 and P=16, each egress pipe 103 may be associated with 4 memory blocks 101, as follows:

– Egress pipe 0: memory blocks 0–3

– Egress pipe 1: memory blocks 4–7

– …

– Egress pipe 15: memory blocks 60–63
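The mapping above can be sketched as follows. This is an illustrative sketch only: the function name and the contiguous-range layout of M/P blocks per egress pipe are assumptions matching the example, not details fixed by the patent.

```python
# Sketch: soft-allocation map of memory blocks to egress pipes.
# Assumes contiguous ranges of M/P blocks per pipe, as in the example above.

def associate_blocks(num_blocks: int, num_pipes: int) -> dict[int, list[int]]:
    """Return a map: egress pipe -> list of its M/P associated memory blocks."""
    per_pipe = num_blocks // num_pipes  # M/P blocks per egress pipe
    return {
        pipe: list(range(pipe * per_pipe, (pipe + 1) * per_pipe))
        for pipe in range(num_pipes)
    }

sets = associate_blocks(64, 16)
print(sets[0])   # → [0, 1, 2, 3]     (blocks for egress pipe 0)
print(sets[15])  # → [60, 61, 62, 63] (blocks for egress pipe 15)
```

With M=64 and P=16 this reproduces the association listed above: four blocks per egress pipe, and the sets are disjoint by construction.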

In general, if the destination of a data packet to be written is a given egress pipe 103, a memory block 101 associated with that egress pipe 103 is selected for the writing, for any ingress pipe 102. Note that memory blocks 101 that are full may not be selected for writing.

Since a given egress pipe 103 is associated with M/P memory blocks 101, and assuming one of these memory blocks is in use for reading, there are M/P–1 memory blocks 101 dedicated to that egress pipe 103 and available for writing, provided that none of these M/P–1 memory blocks 101 is full.

Given this basic rule for selecting memory blocks 101 for writing, if each egress pipe 103 receives M/P–1 write requests or fewer in the same clock cycle, the probability of read collisions drops to zero. The algorithm for selecting memory blocks for write requests, as described above for the conventional shared-memory architecture, is modified accordingly to significantly reduce read collisions.

Specifically, the write process of the present invention can be divided into two steps. In the first step (step 1), memory blocks 101 are allocated to the requesting ingress pipes 102 for writing towards each egress pipe 103 only from the associated range, i.e., from the M/P memory blocks 101 of that egress pipe 103. In the second step (step 2), memory blocks 101 are allocated to the remaining write requests (i.e., requests left over when the number of writes towards an egress pipe 103 exceeds M/P–1 and a memory block has been allocated for reading for that egress pipe 103, or exceeds M/P and no memory block has been allocated for reading). These two steps are described in more detail below:

Step 1:

1. From the M/P memory blocks 101, select a memory block 101 individually for each ingress pipe 102 for writing.

a. For example, for M=64 and P=16, memory blocks 0–3 may be selected for writes destined for egress pipe 0 (unless one of these memory blocks has already been selected for reading).

b. If a given egress pipe 103 requires fewer than 4 writes, or fewer than 3 writes where a memory block 101 has already been selected for reading, fewer memory blocks 101 may be selected from the M/P (or M/P–1) memory blocks. Note that the present invention is not limited to a particular memory-block selection process, which may be:

i. selecting the memory block 101 with minimum occupancy;

ii. selecting a memory block 101 at random;

iii. etc.

2. Create a list of all ingress pipes 102 with outstanding requests.

a. This can happen if one of the following conditions holds:

i. there are more than M/P–1 write requests for a single egress pipe 103 and a read has already been allocated for that egress pipe 103; or

ii. there are more than M/P write requests for a single egress pipe 103 and no read has been allocated for that egress pipe 103.

Step 2:

1. The remaining write requests from all ingress pipes 102 are allocated to the remaining available memory blocks 101.

a. An available memory block 101 is one that is not allocated for reading or writing and is not full (i.e., it can be written).

b. The present invention is not limited to a particular memory-block selection process, which may be:

i. selecting the memory block 101 with minimum occupancy;

ii. selecting a memory block 101 at random;

iii. etc.
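The two steps above can be sketched as a minimal single-cycle allocator. This is a sketch under assumptions the patent leaves open: minimum-occupancy selection is used in both steps, each selected memory block accepts one write per cycle, and blocks reserved for reading are excluded. All names and data structures are illustrative.

```python
# Sketch of the two-step write allocation: step 1 serves each write request
# from the blocks soft-allocated to its destination egress pipe; step 2
# places the outstanding requests on any remaining available block.

def allocate_writes(requests, sets, reading, occupancy, full):
    """requests: list of destination egress pipes, one per write request.
    sets: egress pipe -> list of its associated memory blocks.
    reading: set of blocks allocated for reading this cycle.
    occupancy: block -> fill level.  full: set of full blocks.
    Returns a list of (destination pipe, chosen block) grants."""
    used = set(reading)              # blocks already claimed this cycle
    granted, outstanding = [], []

    # Step 1: select only from the destination pipe's own M/P blocks.
    for dest in requests:
        candidates = [b for b in sets[dest] if b not in used and b not in full]
        if candidates:
            block = min(candidates, key=lambda b: occupancy[b])  # min occupancy
            used.add(block)
            granted.append((dest, block))
        else:
            outstanding.append(dest)   # list of pipes with outstanding requests

    # Step 2: remaining requests take any remaining available block.
    for dest in outstanding:
        candidates = [b for blocks in sets.values() for b in blocks
                      if b not in used and b not in full]
        if candidates:
            block = min(candidates, key=lambda b: occupancy[b])
            used.add(block)
            granted.append((dest, block))
    return granted

demo = allocate_writes(
    requests=[0, 0],                       # two writes destined for pipe 0
    sets={0: [0, 1], 1: [2, 3]},
    reading={0},                           # block 0 is being read this cycle
    occupancy={0: 0, 1: 5, 2: 1, 3: 2},
    full=set(),
)
print(demo)  # → [(0, 1), (0, 2)]
```

In the demo, the first write is served from pipe 0's own set (block 1, since block 0 is reserved for reading); the second write is outstanding and spills into the other set, where block 2 wins on minimum occupancy.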

As defined above, the memory blocks 101 are "soft-allocated" to (associated with) the egress pipes 103. This means that the M/P memory blocks of an egress pipe 103 are the blocks preferred for writing; but if more writes are required at the same time, the remaining writes are allocated to the remaining memory blocks 101, i.e., to blocks whose associated egress pipe 103 is a different one.

FIG. 2 shows a switch 200 according to an embodiment of the present invention. The switch 200 is particularly suited for packet switching in a network system, and may be a high-bandwidth, high-capacity switch. The switch 200 comprises at least one memory device 100 according to an embodiment of the present invention, in particular as shown in FIG. 1.

As shown in FIG. 2, the switch 200 may further comprise a plurality of input ports 201 and a plurality of output ports 202. Each ingress pipe 102 of the memory device 100 is associated with a group 203 of the input ports 201, and each egress pipe 103 of the memory device 100 is associated with a group 204 of the output ports 202.

FIG. 3 shows a method 300 according to an embodiment of the present invention. The method 300 is particularly suited for controlling a memory device 100 according to an embodiment of the present invention, in particular the memory device shown in FIG. 1. The method 300 may be performed by a controller of the memory device 100, or by a controller of a switch 200 (as shown in FIG. 2) that includes the memory device 100.

The method comprises a step 301 of selecting, for an egress pipe 103 of the memory device 100 requesting to read a data packet, a memory block 101 of the memory device 100 from the set of memory blocks 101 associated with that egress pipe 103, for reading the data packet. Additionally or alternatively, the method 300 comprises a step 302 of selecting, for any ingress pipe 102 of the memory device 100 requesting to write a data packet destined for a determined egress pipe 103, a memory block 101 of the memory device 100 from the set 104 of memory blocks 101 associated with the determined egress pipe 103, for writing the data packet.

The following analyzes the performance improvement of the memory device 100, the switch 200 and the method 300 according to embodiments of the present invention.

As an example, a memory device 100 with P=16 egress pipes 103 and M=32–256 memory blocks 101 was simulated for two memory-block allocation schemes:

1. Randomly allocating memory blocks 101 to write requests (conventional).

2. Soft-allocating memory blocks 101 to write requests (egress pipes 103) according to the scheme of the present invention.

FIG. 4 shows a first test, simulating uniformly distributed random traffic. A worst-case traffic pattern (small data packets) from all input ports was applied, in which a uniformly distributed random destination was selected for each data packet.

FIG. 5 shows a second test, simulating uniformly distributed random traffic. A worst-case traffic pattern (small data packets) from all input ports was applied, in which a uniformly distributed random destination was selected for each data packet during 9 clock cycles, while in the 10th clock cycle all arriving data packets were destined for a single destination selected at random with uniform distribution.

FIG. 6 shows a third test, simulating uniformly distributed random burst traffic. A worst-case traffic pattern (small data packets) from all input ports was applied, in which, in each clock cycle, all arriving data packets were destined for a single destination selected at random with uniform distribution.

In summary of FIG. 4, FIG. 5 and FIG. 6, all simulation results show that the soft-allocation scheme of the present invention (compared with conventional random memory-block selection) achieves a higher transmission bandwidth with a smaller number of memory blocks 101. Only when the number of memory blocks 101 is very high do the two schemes perform equally.
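The qualitative gap can be reproduced with a deliberately simplified single-cycle Monte Carlo model. This is not the patent's simulator: the blocking model (each egress pipe must read the block holding its head-of-line packet, and only one pipe may read a given block per cycle) and all parameters are assumptions made for illustration.

```python
# Sketch: single-cycle Monte Carlo of read collisions. With soft allocation,
# each pipe's head-of-line packet sits in the pipe's own disjoint block set;
# with random allocation, it sits in a uniformly random block.
import random

def blocked_fraction(num_blocks, num_pipes, soft, trials=20000, seed=1):
    rng = random.Random(seed)
    per_pipe = num_blocks // num_pipes   # M/P blocks per egress pipe
    blocked = 0
    for _ in range(trials):
        taken = set()                    # blocks already being read this cycle
        for pipe in range(num_pipes):
            if soft:   # packet was written into the pipe's own disjoint set
                block = rng.randrange(pipe * per_pipe, (pipe + 1) * per_pipe)
            else:      # packet was written into any block, uniformly
                block = rng.randrange(num_blocks)
            if block in taken:
                blocked += 1             # read collision: pipe waits this cycle
            else:
                taken.add(block)
    return blocked / (trials * num_pipes)

print(blocked_fraction(64, 16, soft=True))   # → 0.0: disjoint sets never collide
print(blocked_fraction(64, 16, soft=False))  # nonzero, around 0.1 in this toy model
```

Even this toy model shows the trend of the figures: with disjoint soft-allocated sets no pipe is ever blocked, while random placement leaves a noticeable fraction of reads colliding at M=64, P=16.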

It is worth noting that a large number of memory blocks 101 increases the overall silicon area and power consumption. A further advantage of the allocation scheme of the present invention is reduced latency: since the overall shared-memory structure of the memory device 100 provides more bandwidth, the latency decreases. Latency is the time a data packet stays in the memory device 100 or switch 200 before being transmitted, counted from the arrival of the first byte of the data packet until the departure of its first byte.

The present invention has been described in conjunction with various embodiments and implementations as examples. However, other variations can be understood and effected by those skilled in the art in practicing the claimed invention, from a study of the drawings, this disclosure and the independent claims. In the claims as well as in the description, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. A single element or other unit may fulfil the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

Claims (16)

1. A memory device (100) for a switch (200), characterized in that the memory device (100) comprises:
a plurality of memory blocks (101);
a plurality of ingress pipes (102), wherein each ingress pipe (102) is configured to request that data packets be written to the memory blocks (101); and
a plurality of egress pipes (103), wherein each egress pipe (103) is configured to request that data packets be read from the memory blocks (101),
wherein each egress pipe (103) is associated with a set (104) of the memory blocks (101).

2. The memory device (100) according to claim 1, characterized in that the sets (104) are disjoint sets.

3. The memory device (100) according to claim 1 or 2, characterized in that each set (104) includes the same number of memory blocks (101).

4. The memory device (100) according to any one of claims 1 to 3, characterized by comprising:
a controller configured to select, for an egress pipe (103) requesting to read a data packet, a memory block (101) from the set (104) associated with that egress pipe (103), for reading the data packet.

5. The memory device (100) according to claim 4, characterized in that the controller is further configured to select, for any ingress pipe (102) requesting to write a data packet destined for a determined egress pipe (103), a memory block (101) from the set (104) associated with the determined egress pipe (103), for writing the data packet.

6. The memory device (100) according to claim 5, characterized in that the controller is further configured to exclude one or more full memory blocks (101) when selecting the memory block (101) from the set (104) associated with the determined egress pipe (103) for writing the data packet.

7. The memory device (100) according to claim 5 or 6, characterized in that, if the total number of requests for writing data packets destined for the determined egress pipe (103) is smaller than the number of write-enabled memory blocks (101) in the set (104) associated with the determined egress pipe (103), the controller is further configured to:
select a memory block (101) from the write-enabled memory blocks (101) in the set (104) associated with the determined egress pipe (103) based on minimum occupancy or at random.

8. The memory device (100) according to any one of claims 5 to 7, characterized in that, if the total number of requests for writing data packets destined for the determined egress pipe (103) is greater than the number of write-enabled memory blocks (101) in the set (104) associated with the determined egress pipe (103), the controller is further configured to:
create a list of all ingress pipes (102) with outstanding write requests.

9. The memory device (100) according to any one of claims 5 to 8, characterized in that, if the total number of requests for writing data packets destined for the determined egress pipe (103) is greater than the number of write-enabled memory blocks (101) in the set (104) associated with the determined egress pipe (103), the controller is further configured to:
select the memory block (101) from another set (104) not associated with the determined egress pipe (103), for writing the data packet, in particular, allocate a remaining available memory block (101) to each outstanding write request.

10. The memory device (100) according to claim 9, characterized in that the controller is configured to select, for the determined egress pipe (103) requesting to read a data packet, a memory block (101) from another set (104) not associated with the determined egress pipe (103), for reading the data packet.

11. The memory device (100) according to claim 9, characterized in that the controller is further configured to select the remaining available memory blocks (101) for the outstanding write requests based on minimum occupancy or at random.

12. The memory device (100) according to any one of claims 4 to 11, characterized in that the controller is further configured to associate the egress pipe (103) with the set (104) of the memory blocks (101).

13. A switch (200) for packet switching, characterized in that the switch (200) comprises:
the memory device (100) according to any one of claims 1 to 12.

14. The switch (200) according to claim 13, characterized by comprising:
a plurality of input ports (201) and a plurality of output ports (202),
wherein each ingress pipe (102) is associated with a group (203) of the input ports (201), and each egress pipe (103) is associated with a group (204) of the output ports (202).

15. A method (300) for controlling a memory device (100), characterized in that the memory device (100) comprises a plurality of memory blocks (101), ingress pipes (102) and egress pipes (103), and the method comprises:
(301): selecting, for an egress pipe (103) requesting to read a data packet, a memory block (101) from the set of memory blocks (101) associated with that egress pipe (103), for reading the data packet; and/or
(302): selecting, for any ingress pipe (102) requesting to write a data packet destined for a determined egress pipe (103), a memory block (101) from the set (104) of memory blocks (101) associated with the determined egress pipe (103), for writing the data packet.

16. A computer program product, characterized in that it stores program code for controlling the memory device (100) according to any one of claims 1 to 12 and/or the switch according to claim 13 or 14, or, when executed on a computer, for performing the method (300) according to claim 15.
CN201880093306.2A 2018-05-07 2018-05-07 Memory device for high-bandwidth and high-capacity switch Pending CN112088521A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2018/061671 WO2019214801A1 (en) 2018-05-07 2018-05-07 Memory device for a high bandwidth high capacity switch

Publications (1)

Publication Number Publication Date
CN112088521A true CN112088521A (en) 2020-12-15

Family

ID=62152541

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201880093306.2A Pending CN112088521A (en) 2018-05-07 2018-05-07 Memory device for high-bandwidth and high-capacity switch

Country Status (2)

Country Link
CN (1) CN112088521A (en)
WO (1) WO2019214801A1 (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5790545A (en) * 1996-03-14 1998-08-04 Motorola Inc. Efficient output-request packet switch and method
EP1616415B1 (en) * 2003-04-22 2009-06-03 Agere Systems Inc. Method and apparatus for shared multi-bank memory
US20100054268A1 (en) * 2006-03-28 2010-03-04 Integrated Device Technology, Inc. Method of Tracking Arrival Order of Packets into Plural Queues
US7933283B1 (en) * 2008-03-04 2011-04-26 Cortina Systems, Inc. Shared memory management
CN103036805A (en) * 2011-09-30 2013-04-10 美国博通公司 System and method for improving multicast performance in banked shared memory architectures
US20150026361A1 (en) * 2013-07-19 2015-01-22 Broadcom Corporation Ingress Based Headroom Buffering For Switch Architectures
CN107005487A (en) * 2014-12-29 2017-08-01 甲骨文国际公司 System and method for supporting efficient VOQ (VOQ) utilization of resources in networked devices
CN107005489A (en) * 2014-12-29 2017-08-01 甲骨文国际公司 Systems and methods for supporting efficient virtual output queue (VOQ) packet flushing schemes in networked devices
CN107592966A (en) * 2015-05-13 2018-01-16 思科技术公司 Dynamic protection of shared memory used by output queues in network devices

Also Published As

Publication number Publication date
WO2019214801A1 (en) 2019-11-14

Similar Documents

Publication Publication Date Title
CN110417670B (en) Network switch
CN104641608B (en) Ultra-low latency network buffer storage
US8977774B2 (en) Method for reducing buffer capacity in a pipeline processor
US6922749B1 (en) Apparatus and methodology for an input port of a switch that supports cut-through operation within the switch
EP1839166B1 (en) Shared-memory switch fabric architecture
JP2016195375A (en) Method and apparatus for using multiple linked memory lists
US20080123525A1 (en) System and Method for Filtering Packets in a Switching Environment
CN105577576A (en) Distributed Switch Architecture
US10951549B2 (en) Reusing switch ports for external buffer network
US11947483B2 (en) Data flow management
US7856026B1 (en) Configurable central memory buffered packet switch module for use in a PLD
CN113110943B (en) Software defined switching structure and data switching method based on same
US20030174708A1 (en) High-speed memory having a modular structure
KR100321784B1 (en) Distributed type input buffer switch system having arbitration latency tolerance and method for processing input data using the same
JP4446757B2 (en) System, method and logic for managing memory resources shared in a fast switching environment
CN106897235A (en) Packet buffer, corresponding memory system and multi-port memory controller
JP4516395B2 (en) Memory management system with link list processor
CN107483405B (en) A scheduling method and scheduling system supporting variable-length cells
JP4408376B2 (en) System, method and logic for queuing packets to be written to memory for exchange
KR20090127490A (en) High speed packet routing system apparatus and method
JP2008017387A (en) Load balanced switching apparatus and method
KR100468946B1 (en) Input Buffered Switches and Its Contention Method Using Pipelined Simple Matching
CN112088521A (en) Memory device for high-bandwidth and high-capacity switch
JP4852138B2 (en) System, method and logic for multicasting in fast exchange environment
US7269697B1 (en) Apparatus and methodology for an input port scheduler

Legal Events

- PB01: Publication
- SE01: Entry into force of request for substantive examination
- RJ01: Rejection of invention patent application after publication (application publication date: 20201215)