WO2025222944A1 - Message transmission method and apparatus - Google Patents
Message transmission method and apparatus
- Publication number
- WO2025222944A1 (PCT/CN2024/144645)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- message
- sending
- packet
- target
- network
- Prior art date
- Legal status: Pending
Classifications
- H04L47/00 — Traffic control in data switching networks (parent class of all entries below)
- H04L47/22 — Traffic shaping
- H04L47/26 — Flow control; Congestion control using explicit feedback to the source, e.g. choke packets
- H04L47/32 — Flow control; Congestion control by discarding or delaying data units, e.g. packets or frames
- H04L47/52 — Queue scheduling by attributing bandwidth to queues
- H04L47/6275 — Queue scheduling characterised by scheduling criteria for service slots or service orders based on priority
- H04L47/629 — Ensuring fair share of resources, e.g. weighted fair queuing [WFQ]
Description
- This application relates to the field of cloud computing technology, and in particular to a message transmission method and apparatus.
- Cloud data centers are undergoing tremendous changes in both service models and hardware/software organizational structures, evolving into large-scale interconnected cloud service clusters. Due to the continuous increase in cloud data center transmission bandwidth and the large influx of tenants, traffic congestion problems are inevitable.
- Compared with implicit congestion control mechanisms, such as those traditionally used by the Transmission Control Protocol (TCP), which infer congestion from signals such as round-trip time (RTT) inflation or missing acknowledgment (ACK) messages, explicit congestion control mechanisms can detect congestion more accurately and proactively, adjusting packet transmission rates accordingly.
- However, current explicit congestion control mechanisms are port-level, meaning their granularity is coarse. Coarse-grained congestion control mechanisms cannot solve the network performance bottlenecks in large-scale cloud service cluster interconnection scenarios. Therefore, a congestion control mechanism that can improve network performance is urgently needed.
- This application provides a message transmission method and apparatus, which can achieve congestion control at the flow level.
- In a first aspect, this application provides a message transmission method.
- This method is applied to a source end sending multiple message streams, where a target message stream is any one of the multiple message streams.
- the target message stream includes a first message that has already been sent and a second message to be sent.
- the method includes: acquiring the endpoint delay and network delay of the first message; determining a first packet transmission rate based on the endpoint delay and network delay of the first message; and transmitting the second message at the first packet transmission rate.
- the endpoint delay of the first message includes the delay of the source end processing the first message, the delay of the destination end of the target message stream processing the first message, and the delay of the source end processing a first acknowledgment character (ACK) message.
- the first ACK message originates from the destination end of the first message and is an ACK message of the first message.
- the network delay of the first message is the delay of the communication link between the source and destination ends of the first message for transmitting the first message and the first ACK message.
- the method described in this application enables the regulation of the packet transmission rate of a target data stream.
- this method can predict and regulate the subsequent packet transmission rate of each packet stream based on the end-side latency and network latency of the packets already sent, thereby achieving flow-level congestion control.
- the flow-level congestion control mechanism implemented through the scheme provided in this application solves the problem of lagging congestion control in current industry congestion control mechanisms. For example, current industry congestion control mechanisms perform flow control by feeding back signals such as packet loss and RTT latency after severe congestion, or by feeding back ECN markers after mild congestion for the next round of rate regulation.
- When adjusting the packet sending rate of a message flow, this application refers not only to the network latency on the communication link between the source and destination to predict the source's subsequent packet sending rate, but also to the latency caused by the source's own limitations in resources, compute power, and the like (i.e., the end-side latency).
- This avoids a mismatch between the controlled packet sending rate and the actual packet sending rate of the data flow during congestion control, so the flow control scope of this application is larger than that of related technologies. On this basis, the flow-level congestion control mechanism implemented by the solution provided in this application solves the mismatch between the controlled and actual packet sending rates caused by the current industry's focus on network-side congestion while ignoring end-side congestion.
- determining the first packet transmission rate based on the endpoint delay and network delay of the first message includes: determining the endpoint packet transmission rate based on the endpoint delay of the first message and the size of the historical endpoint congestion window (CWND); determining the network packet transmission rate based on the network delay of the first message and the size of the historical network-side CWND; and determining the smaller of the endpoint packet transmission rate and the network-side packet transmission rate as the first packet transmission rate.
- the historical endpoint CWND is the CWND when the source end sent packets in the target message stream before the current time
- the historical network-side CWND is the CWND when the communication link transmitted packets in the target message stream before the current time.
- determining the end-side packet transmission rate based on the end-side delay of the first packet and the historical end-side CWND size includes: calculating a first factor and a second factor based on the end-side delay of the first packet and the historical end-side CWND size; and calculating the end-side packet transmission rate based on the first factor, the second factor, and the historical end-side CWND size.
- the first factor represents the congestion state of the source end
- the second factor represents the state attribute of the target packet flow at the source end.
- determining the network-side packet transmission rate based on the network latency of the first packet and the historical network-side CWND size includes: calculating a third factor and a fourth factor based on the network latency of the first packet and the historical network-side CWND size; and calculating the network-side packet transmission rate based on the third factor, the fourth factor, and the historical network-side CWND size.
- the third factor represents the congestion state of the communication link
- the fourth factor represents the state attributes of the target packet flow on the communication link.
- When this method is executed for each flow sent from the source end, the packet sending rate of each flow can be predicted and regulated based on that flow's historical state data (including the latency corresponding to historically regulated packet sending rates, e.g., the end-side latency and network latency of the first message, as well as the historical end-side CWND and the historical network-side CWND), thereby achieving flow-level congestion control. On this basis, the method can resolve the performance bottlenecks of the implicit congestion control and port-level explicit congestion control mechanisms commonly used in the industry.
- the flow-level congestion control mechanism implemented by the scheme provided in this application can achieve differentiated congestion control (or flow control) for each data flow based on its attributes and network state (characterized by parameters such as the first factor, second factor, third factor, and fourth factor), and this fine-grained flow-level congestion control can simultaneously meet the diverse needs of numerous applications.
- determining the first packet transmission rate based on the terminal delay and network delay of the first message includes: in response to receiving the first ACK message, determining the first packet transmission rate based on the terminal delay and network delay of the first message.
- the method further includes: obtaining the endpoint latency and network latency of the third packet already sent in the target packet stream; in response to receiving the second ACK packet, determining a second packet transmission rate based on the endpoint latency and network latency of the third packet; and sending the packets to be sent in the target packet stream at the second packet transmission rate.
- the third packet is the packet sent by the source end after sending the first packet
- the second ACK packet is the ACK packet for the third packet.
- the packet sending rate of the target packet flow can be predicted and controlled at the granularity of ACK packets in the target packet flow.
- the first packet transmission rate is represented based on a first CWND.
- Sending the second message at the first packet transmission rate includes: determining the target sending queue corresponding to the target packet flow in the source network interface card (NIC); scheduling, from the packets to be sent in the target packet flow (which include the second message), a number of data packets satisfying the first CWND size to the target sending queue; and sending the data packets of the target sending queue according to that queue's priority.
- the source NIC includes multiple sending queues, one of which is the target sending queue. These sending queues have different priorities, and a queue's priority indicates the order in which its packets are scheduled for transmission.
- determining the target sending queue corresponding to the target packet flow in the source network interface card includes: determining the size of the packets already sent in the target packet flow; determining the sending priority of the target packet flow based on the size of the packets already sent in the target packet flow; and determining the sending queue corresponding to the sending priority of the target packet flow among the multiple sending queues included in the source network interface card as the target sending queue.
- the aforementioned sending of the data packets of the target sending queue according to its priority includes: when the target sending queue belongs to a first sending queue set, scheduling the data packets of the sending queues in the first sending queue set according to the priority queue (PQ) strategy.
- the priorities of all transmission queues in the first transmission queue set are higher than or equal to a preset priority, or, in the descending order of priorities of multiple transmission queues included in the source network interface card, the priorities of all transmission queues in the first transmission queue set are ranked in the top m positions, where m is an integer greater than or equal to 1.
- the PQ strategy means that data packets in low-priority sending queues are transmitted only after the data packets in high-priority sending queues have been transmitted.
- If the sending priority of the target message flow determined at this time is a high priority, indicating that the target message flow should be sent first, then this scheme achieves preferential transmission of the target message flow. In this way, low latency can be guaranteed for latency-sensitive traffic.
- the aforementioned sending of the data packets of the target sending queue according to its priority includes: when the target sending queue belongs to a second sending queue set, scheduling the data packets of the sending queues in the second sending queue set according to a weighted fair queuing (WFQ) mechanism.
- the sending queues in the second sending queue set are each preset with different weights, and the priorities of all sending queues in the second sending queue set are lower than or equal to the preset priorities.
- the priorities of all sending queues in the second sending queue set are ranked m positions after the preset priority, where m is an integer greater than or equal to 1.
- the lower the priority of a sending queue in the second sending queue set, the more data packets that queue is scheduled to send at one time.
- If the sending priority of the target packet flow determined at this time is a low priority, indicating that the target packet flow is sent later, this scheme can still allocate more transmission bandwidth to the target packet flow, because the weight of a low-priority sending queue corresponds to a larger number of data packets sent at one time. In this way, bandwidth-sensitive traffic can obtain more transmission bandwidth.
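- As an illustration of the two scheduling regimes just described, the sketch below (Python; the names TxQueue and schedule_round, the preset-priority split, and the convention that a smaller integer means a higher priority are our own assumptions, not the patent's implementation) serves the first queue set strictly by priority (PQ) and the second set by weight (WFQ), with lower-priority queues configured with larger weights:

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class TxQueue:
    priority: int                 # smaller value = higher priority (our convention)
    weight: int = 1               # WFQ weight; larger = more packets per round
    packets: deque = field(default_factory=deque)

def schedule_round(queues, preset_priority):
    """One scheduling round over a NIC's sending queues (illustrative only)."""
    sent = []
    # First set (PQ): queues at or above the preset priority, drained strictly,
    # so a lower-priority queue's packets go out only after every
    # higher-priority queue is empty.
    for q in sorted((q for q in queues if q.priority <= preset_priority),
                    key=lambda q: q.priority):
        while q.packets:
            sent.append(q.packets.popleft())
    # Second set (WFQ): queues below the preset priority; each sends up to
    # `weight` packets per round, and lower priority is configured with a
    # larger weight so bandwidth-hungry flows ship more packets per round.
    for q in (q for q in queues if q.priority > preset_priority):
        for _ in range(min(q.weight, len(q.packets))):
            sent.append(q.packets.popleft())
    return sent
```

- For example, a latency-sensitive flow mapped to a priority-0 queue is always drained before a priority-5, weight-8 queue, while the latter still ships up to eight packets every round.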
- the target message flow is a message flow that accesses a target service, which is deployed on at least one server of a data center network (DCN).
- the target message stream is the message stream used for communication between different nodes in a distributed storage system.
- the delay of the source end processing the first message is represented by the difference between a first time and a second time.
- the first time is the time when the source end generates the first data packet of the first message
- the second time is the time when the source end sends the last data packet of the first message.
- the delay of the destination end processing the first message is represented by the difference between a third time and a fourth time.
- the third time is the time when the destination end receives the first data packet of the first message
- the fourth time is the time when the destination end sends the first ACK message.
- the delay of the source end processing the first ACK message is represented by the difference between a fifth time and a sixth time.
- the fifth time is the time when the source end receives the first ACK message
- the sixth time is the time when the source end completes processing the first ACK message.
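- Combining the three end-side components defined above with the two link directions made explicit in formulas (1) and (2) of the description below, the two delays can be written compactly (t1 through t6 denote the first through sixth times):

```latex
E_{\mathrm{delay}} = \underbrace{(t_2 - t_1)}_{\text{source processes message}}
                   + \underbrace{(t_4 - t_3)}_{\text{destination processes message}}
                   + \underbrace{(t_6 - t_5)}_{\text{source processes ACK}},
\qquad
F_{\mathrm{delay}} = \underbrace{(t_3 - t_2)}_{\text{message on link}}
                   + \underbrace{(t_5 - t_4)}_{\text{ACK on link}}
```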
- In a second aspect, this application provides a message transmission apparatus.
- This apparatus is applied to a source end sending multiple message streams, where a target message stream is any one of the multiple message streams.
- the target message stream includes a first message that has already been sent and a second message to be sent.
- the apparatus includes: an acquisition unit for acquiring the end-side delay and network delay of the first message; a determination unit for determining a first packet transmission rate based on the end-side delay and network delay of the first message; and a transmission unit for transmitting the second message at the first packet transmission rate.
- the end-side delay of the first message includes the delay of the source end processing the first message, the delay of the destination end of the target message stream processing the first message, and the delay of the source end processing a first ACK message.
- the first ACK message originates from the destination end of the first message and is an ACK message of the first message.
- the network delay of the first message is the delay of the communication link between the source and destination ends of the first message for transmitting the first message and the first ACK message.
- the determining unit is specifically configured to: determine the end-side packet transmission rate based on the end-side delay of the first message and the size of the historical end-side CWND; determine the network-side packet transmission rate based on the network delay of the first message and the size of the historical network-side CWND; and determine the smaller of the end-side packet transmission rate and the network-side packet transmission rate as the first packet transmission rate.
- the historical end-side CWND is the CWND when the source end transmitted packets in the target message stream before the current time
- the historical network-side CWND is the CWND when the communication link transmitted packets in the target message stream before the current time.
- the determining unit is further specifically used to calculate a first factor and a second factor based on the end-side delay of the first message and the size of the historical end-side CWND; and to calculate the end-side packet transmission rate based on the first factor, the second factor, and the size of the historical end-side CWND.
- the first factor represents the congestion state of the source end
- the second factor represents the state attribute of the target message flow at the source end.
- the determining unit is further specifically used to calculate the third factor and the fourth factor based on the network latency of the first message and the size of the historical network-side CWND; and to calculate the network-side packet transmission rate based on the third factor, the fourth factor, and the size of the historical network-side CWND.
- the third factor represents the congestion state of the communication link
- the fourth factor represents the state attributes of the target message flow on the communication link.
- the determining unit is also specifically used to determine the first packet transmission rate in response to receiving the first ACK message, based on the end-side delay and network delay of the first message.
- the acquisition unit is further configured to acquire the end-side latency and network latency of the third packet already sent in the target packet stream.
- the determination unit is further configured to determine a second packet transmission rate based on the end-side latency and network latency of the third packet in response to receiving the second ACK packet.
- the sending unit is further configured to send the packets to be sent in the target packet stream at the second packet transmission rate.
- the third packet is a packet sent by the source end after sending the first packet
- the second ACK packet is an ACK packet for the third packet.
- the first packet transmission rate is represented based on the first CWND.
- the determining unit is also used to determine the target transmission queue in the source network interface card (NIC) corresponding to the target packet flow.
- the sending unit is specifically used to schedule, from the packets to be sent in the target packet flow (which include the second message), a number of data packets that satisfies the first CWND size to the target sending queue; and to send the data packets of the target sending queue according to the priority of the target sending queue.
- the source NIC includes multiple sending queues, one of which is the target sending queue, and the sending queues have different priorities, where a queue's priority indicates the order in which its packets are scheduled for transmission.
- the determining unit is also specifically used to determine the size of the sent packets in the target packet stream; determine the sending priority of the target packet stream based on the size of the sent packets in the target packet stream; and determine the sending queue corresponding to the sending priority of the target packet stream among the multiple sending queues included in the source network interface card as the target sending queue.
- when packets in a high-priority sending queue are scheduled for transmission first, the sending unit is specifically used to, when the target sending queue belongs to the first sending queue set, schedule the data packets of the sending queues in the first sending queue set according to the PQ strategy.
- the priorities of all transmission queues in the first transmission queue set are higher than or equal to a preset priority; or, in the order of priorities of multiple transmission queues included in the source network interface card from highest to lowest, the priorities of all transmission queues in the first transmission queue set are ranked in the top m positions, where m is an integer greater than or equal to 1.
- when packets in a high-priority sending queue are scheduled for transmission first, the sending unit is also specifically used to schedule the data packets of the sending queues in the second sending queue set according to the WFQ mechanism when the target sending queue belongs to the second sending queue set.
- the transmission queues in the second transmission queue set are each preset with different weights, and the priorities of all transmission queues in the second transmission queue set are lower than or equal to the preset priorities.
- the priorities of all transmission queues in the second transmission queue set are ranked m positions after the preset priority, where m is an integer greater than or equal to 1.
- the lower the priority of a sending queue in the second sending queue set, the more data packets that queue is scheduled to send at one time.
- the target message flow is the message flow that accesses the target service, which is deployed on at least one server of the DCN.
- the target message stream is the message stream used for communication between different nodes in a distributed storage system.
- the delay of the source end processing the first message is represented by the difference between a first time and a second time.
- the first time is the time when the source end generates the first data packet of the first message
- the second time is the time when the source end sends the last data packet of the first message.
- the delay of the destination end processing the first message is represented by the difference between a third time and a fourth time.
- the third time is the time when the destination end receives the first data packet of the first message
- the fourth time is the time when the destination end sends the first ACK message.
- the delay of the source end processing the first ACK message is represented by the difference between a fifth time and a sixth time.
- the fifth time is the time when the source end receives the first ACK message
- the sixth time is the time when the source end completes processing the first ACK message.
- this application provides a computing device comprising: a memory, a communication interface, and one or more processors, the one or more processors receiving or transmitting data through the communication interface, and the one or more processors being configured to read program instructions stored in the memory to perform the methods provided by the first aspect and any possible design of the first aspect.
- this application provides a computing device cluster including at least one computing device, each computing device including a processor and a memory.
- the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device, causing the computing device cluster to perform the methods provided by the first aspect and any possible design of the first aspect.
- this application provides a chip that includes a processor.
- the chip, or a device including the chip, performs the methods provided by the first aspect and any possible design of the first aspect.
- the chip further includes an input interface, an output interface, and a memory.
- the chip's input interface, output interface, processor, and memory are connected via internal interconnection paths.
- the memory in the chip stores program instructions or code executed by the processor, and the input and output interfaces are used for communication between the chip and other chips or devices.
- this application provides a computer-readable storage medium that is a non-volatile computer-readable storage medium, the computer-readable storage medium including computer program instructions that, when executed by a computing device or processor, perform the methods provided by the first aspect and any possible design of the first aspect.
- this application provides a computer program product comprising instructions that, when executed by a processor, cause a computing device or processor to perform the methods provided by the first aspect and any possible design of the first aspect.
- any of the message transmission devices, computing devices, computing device clusters, computer-readable storage media, computer program products or chips provided above can be applied to the corresponding methods provided above. Therefore, the beneficial effects they can achieve can be referred to the beneficial effects in the corresponding methods, and will not be repeated here.
- Figure 1 is a schematic diagram of an explicit congestion control mechanism according to an embodiment of this application.
- Figure 2 is a schematic diagram of another explicit congestion control mechanism described in an embodiment of this application.
- Figure 3 is a schematic diagram of an implementation environment for the method provided in the embodiments of this application.
- Figure 4 is a schematic diagram of a distributed storage system according to an embodiment of this application.
- Figure 5 is a flowchart illustrating a message transmission method provided in an embodiment of this application.
- Figure 6 is a schematic diagram of a source end obtaining the end-side delay and network delay of a first message according to an embodiment of this application.
- Figure 7 is a schematic diagram illustrating the relationship between the first through sixth time points provided in the embodiments of this application.
- Figure 8 is a comparison diagram of the flow control range in the method provided in the embodiments of this application and related technologies.
- Figure 9 is a flowchart illustrating another message transmission method provided in an embodiment of this application.
- Figure 10 is a schematic diagram of the priority threshold values provided in an embodiment of this application.
- Figure 11 is a schematic diagram of a process for scheduling data packets of each sending queue in a second sending queue set using a WFQ mechanism, according to an embodiment of this application.
- Figure 12 is a flowchart illustrating another message transmission method provided in an embodiment of this application.
- Figure 13 is a schematic diagram illustrating the relationship between packet transmission rate and latency in real-time message flow control according to an embodiment of this application.
- Figure 14 is a schematic diagram of a message transmission device provided in an embodiment of this application.
- Figure 15 is a schematic diagram of another message transmission device provided in an embodiment of this application.
- Figure 16 is a schematic diagram of the structure of a computing device provided in an embodiment of this application.
- Figure 17 is a schematic diagram of the structure of a computing device cluster provided in an embodiment of this application.
- Figure 18 is a schematic diagram of one or more computing devices in a computing device cluster provided in an embodiment of this application being connected via a network.
- cloud data centers are undergoing tremendous changes in both service models and hardware and software organizational structures. Due to the continuous increase in cloud data center transmission bandwidth and the large-scale migration of tenants, traditional coarse-grained congestion control and traffic scheduling schemes are finding it difficult to provide satisfactory network transmission performance.
- cloud storage architectures are evolving towards a compute-storage separation architecture, where compute clusters and storage clusters are deployed separately within the storage system.
- compute-storage separation architectures increase network traffic and network system architecture complexity, leading to a surge in data traffic in data center networks (DCNs) where storage systems are deployed.
- compute-storage separation architectures also drive the evolution of distributed systems towards ultra-large-scale clusters. More applications with diverse needs require sharing limited network system resources, making it difficult for existing coarse-grained congestion control mechanisms to simultaneously meet numerous differentiated demands.
- the congestion control mechanisms used in current network protocols are generally divided into two categories.
- The first category is implicit congestion control, such as the Bottleneck Bandwidth and Round-Trip Time (BBR) and CUBIC congestion control mechanisms used with the Transmission Control Protocol (TCP), which infer congestion from implicit signals such as round-trip time (RTT). These mechanisms usually only receive the corresponding congestion signal when severe congestion occurs, such as packet timeouts or packet loss, and then trigger congestion control based on this signal to control the packet transmission rate. Therefore, these congestion control mechanisms are widely used in wide area networks (WANs) or in DCN scenarios where latency requirements are not high.
- The second category is explicit, port-level congestion control, such as the Data Center Quantized Congestion Notification (DCQCN) and Data Center TCP (DCTCP) congestion control mechanisms used with the Remote Direct Memory Access (RDMA) protocol.
- Network transport protocols employing port-level congestion control mechanisms include, but are not limited to, TCP, User Datagram Protocol (UDP), and RDMA.
- Referring to Figure 1, a schematic diagram of an explicit congestion control mechanism based on explicit congestion notification (ECN) is shown.
- When the switch determines that the length of the sending queue containing a data packet exceeds the preset ECN threshold, it sets the ECN flag bit in the data packet so that the packet carries the ECN flag. When the receiver receives a data packet carrying the ECN flag, it sets the echo of congestion encountered (ECE) flag bit in the ACK message for that data packet. The ACK message carrying the ECE flag is then returned from the receiver to the sender via the switch.
- When the sender determines that a received ACK message carries the ECE flag, it can conclude that the corresponding data packet carried the ECN flag. At this point, the sender performs traffic-reduction control on the port that received the ACK to reduce network congestion on the communication link between the sender and receiver.
- When the sender determines that a received ACK message does not carry the ECE flag, i.e., the data packet did not carry the ECN flag, this indicates that there is no congestion on the communication link between the sender and receiver. The sender therefore performs traffic-acceleration control on the port that received the ACK to improve the utilization of the communication link.
- the traffic-acceleration control performed by the sender on the port can be fast acceleration, accelerated acceleration, or slow acceleration, which will not be detailed further.
- the congestion control mechanism shown in Figure 1 can instruct the sender to reduce the packet sending rate of the sending port when network congestion occurs, and instruct the sender to increase the packet sending rate of the sending port when network congestion does not occur, thereby achieving network congestion control.
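- A minimal sketch of the sender-side half of this decision (Python; the field and attribute names and the concrete speed-up/slow-down factors are placeholders of ours, not values from the patent):

```python
def on_ack_received(port, ack):
    """Port-level ECN reaction as in Figure 1 (illustrative only)."""
    if ack.ece_flag:
        # ECE echoed by the receiver means the data packet was ECN-marked by
        # a switch whose queue exceeded the ECN threshold: slow the port down.
        port.send_rate *= 0.5
    else:
        # No ECE means no congestion was signalled on the path: speed the
        # port back up, capped at its physical line rate.
        port.send_rate = min(port.send_rate * 1.25, port.line_rate)
```

- Note that the control acts on the whole port, which is exactly the coarseness criticized below: every flow sharing the port is slowed down together.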
- Referring to Figure 2, a schematic diagram of another explicit congestion control mechanism, based on priority flow control (PFC), is shown.
- When switch 210 determines that the length of the data packets to be sent in outgoing port 1's sending queue is less than the PFC threshold, switch 210 sends those data packets normally without performing any additional operations.
- When switch 210 determines that the length of the data packets to be sent in outgoing port 1's sending queue exceeds the PFC threshold, then in addition to sending those data packets normally, switch 210 also sends a pause frame to the upstream port of incoming port 1 to instruct that upstream port to stop sending data packets. In response, the upstream port stops sending data packets after receiving the pause frame. Incoming port 1 of switch 210 thus no longer receives data packets from the upstream port, and no further data packets are scheduled into outgoing port 1's sending queue.
- Afterwards, when switch 210 determines that the length of the data packets to be sent in outgoing port 1's sending queue has fallen below the PFC threshold, then in addition to sending those data packets normally, switch 210 can also send a resume frame to the upstream port of incoming port 1 to instruct it to resume sending data packets. In response, the upstream port resumes sending data packets after receiving the resume frame. Incoming port 1 of switch 210 then receives data packets from the upstream port again, and data packets are again scheduled into outgoing port 1's sending queue.
- Optionally, the upstream port that received the pause frame sends probe data packets to switch 210 at regular intervals to detect whether it can resume sending, and resumes sending data packets once it detects that it can.
- In other words, the PFC congestion control mechanism achieves congestion control by back-pressuring the upstream port into stopping its packet transmission.
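- The switch-side decision can be pictured as follows (Python; method and attribute names such as send_pause_frame are our own shorthand for the pause-frame mechanics, not an API from the patent):

```python
def on_egress_queue_update(switch, out_port, in_port, pfc_threshold):
    """PFC back-pressure decision as in Figure 2 (illustrative only)."""
    qlen = switch.queue_len(out_port)           # egress backlog of out_port
    if qlen > pfc_threshold and not switch.paused.get(in_port):
        switch.send_pause_frame(in_port)        # tell upstream to stop sending
        switch.paused[in_port] = True
    elif qlen < pfc_threshold and switch.paused.get(in_port):
        switch.send_resume_frame(in_port)       # tell upstream to resume
        switch.paused[in_port] = False
```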
- both of the above explicit congestion control mechanisms can regulate flow rates (referred to as flow control) when network congestion occurs, thereby alleviating network congestion.
- Compared with implicit congestion control mechanisms, explicit congestion control mechanisms can detect congestion more accurately and in advance, and adjust accordingly. However, both of the above mechanisms perform congestion control at port level.
- Port-level congestion control mechanisms cannot provide differentiated flow control services to different data flows and can lead to data flow collateral damage: that is, if a data flow (referred to as flow 1) becomes congested in the network, the sender's port traffic is slowed down (as shown in Figure 1), or the upstream port of the switch stops sending traffic (as shown in Figure 2). In this case, other data flows (referred to as flow 2) sent through that port will also be forced to slow down or stop sending, even if there is no congestion in the network. Therefore, port-level congestion control mechanisms have a coarse-grained congestion control granularity, making it difficult to provide satisfactory network transmission performance.
- Such coarse-grained congestion control mechanisms cannot solve the network performance bottleneck problem in large-scale cloud service cluster interconnection scenarios.
- These large-scale cloud service cluster interconnection scenarios include, but are not limited to, scenarios involving large-scale model training for artificial intelligence (AI) and ultra-large-scale cloud services.
- embodiments of this application provide a message transmission method that can achieve flow-level congestion control; that is, the method provides a flow-level congestion control mechanism.
- the flow-level congestion control mechanism refers to a congestion control mechanism with data flow/message flow as the granularity.
- Figure 3 shows a schematic diagram of an implementation environment for the method provided in the embodiments of this application.
- the implementation environment includes a source end for sending message streams, a message forwarding device for forwarding message streams, and a destination end for receiving message streams.
- This application does not limit the network transmission protocol used when the source sends a message stream to the destination via a message forwarding device.
- network transmission protocols include, but are not limited to, TCP, UDP, and RDMA protocols.
- The embodiments of this application do not specifically limit the concrete implementations of the message flow, source end, message forwarding device, and destination end in the implementation environment shown in Figure 3.
- As an example, as shown in Figure 4, a storage system 400 using a compute-storage separation architecture includes a compute cluster and a storage cluster.
- the storage cluster includes an index layer cluster and a persistent layer cluster.
- the compute cluster communicates with the index layer cluster (denoted as front-end communication) and provides service interfaces to upper-layer applications.
- the index layer cluster communicates with both the compute cluster and the persistent layer cluster, providing metadata storage and management services.
- metadata refers to data describing the data stored in the persistent layer cluster.
- the persistent layer cluster communicates with the index layer cluster (denoted as back-end communication) and provides persistent storage services for the data.
- the storage system 400 also includes a control cluster for controlling and managing the compute cluster and the storage cluster. It should be understood that communication between different clusters in the storage system 400 is achieved through communication between the network interface cards (NICs) of different devices within the cluster. It should also be understood that any cluster in Figure 4 includes at least one compute device, which can be implemented as a physical device or a virtual machine (VM), etc., without limitation.
- storage system 400 is a distributed storage system
- the compute clusters, index layer clusters, persistent layer clusters, and control clusters within storage system 400 are deployed separately and interconnected via a network (such as the Internet).
- Storage system 400 is usually required to provide low-latency, high-concurrency, and high-bandwidth services to upper-layer applications. Therefore, communication between different clusters within storage system 400 (such as front-end communication and back-end communication) is implemented using high-performance network protocols.
- These high-performance network protocols include, but are not limited to, TCP, RDMA, and the Unit Bus Over Ethernet (UBoE) protocol.
- When the congestion control mechanism provided by the embodiments of this application is applied to the high-performance network protocols used for communication between different clusters in the storage system 400, flow-level congestion control of that communication can be realized. For example, it can realize flow-level congestion control of the communication between the compute cluster (as the source end) and the index layer cluster (as the destination end), or between the index layer cluster (as the source end) and the persistence layer cluster (as the destination end). This avoids congestion during communication and further improves the communication performance between different clusters in the storage system, for example by realizing low latency, high concurrency, and high bandwidth; reducing the packet loss rate; smoothing network jitter; and providing the processing capabilities of many-to-one, many-to-many, and one-to-many input/output (IO) models.
- the embodiments of this application do not specifically limit the storage services provided by the aforementioned storage system.
- the storage services provided by the storage system include, but are not limited to: Elastic Volume Service (EVS) (or cloud disk service), Object Storage Service (OBS), Scalable File Service (SFS), and Key-Value Store (KVStore) service, etc.
- the implementation environment shown in Figure 3 is applied to a scenario of accessing a DCN.
- the destination shown in Figure 3 is a device (such as a server) deployed in the DCN
- the source shown in Figure 3 is a node/device/equipment that needs to access the destination in the DCN via a network (such as the Internet).
- the message stream is the message stream from the source to the destination in the DCN.
- the source uses the Hypertext Transfer Protocol (HTTP) to send the message stream when accessing the DCN.
- currently, the HTTP protocol has been updated to HTTP 3.0. HTTP 3.0 uses the Quick UDP Internet Connections (QUIC) protocol, which is essentially HTTP 2.0 + Transport Layer Security (TLS) + UDP + Internet Protocol (IP).
- the implementation environment shown in Figure 3 is applied to a scenario of large-scale model training (such as AI large model training).
- the source and destination ends shown in Figure 3 are two processes within a computing device used to train the model, and these two processes communicate across a network. That is, in this example, the source and destination ends are actually located in the same device/node/organization.
- This application also provides a message transmission system, which includes a source end for sending message streams and a destination end for receiving message streams.
- This application also provides a message transmission apparatus, which is applied to the source end of sending a message stream and is used to execute the message transmission method described in this application to realize flow-level congestion control for communication between the source end and the destination end of the message stream.
- the message transmission device is a computing device that implements the aforementioned source end, or a functional module within that computing device; there is no limitation on this.
- the computing device includes, but is not limited to, general-purpose computers, laptops, mobile phones, tablets, etc., or, the computing device may be implemented as a server or server cluster, etc., and is not limited to these.
- Figure 5 shows a schematic flowchart of a message transmission method provided in an embodiment of this application.
- the method is applied to the implementation environment shown in Figure 3 or Figure 4.
- the method includes the following steps.
- Step 101: The source end obtains the end-side latency of the first message already sent in the target message stream, and obtains the network latency of the first message.
- the source end can send multiple message streams, and the target message stream is any one of these multiple message streams. That is, the source address of the messages in the target message stream is the source end's IP address, and the destination address is the destination end's IP address.
- For example, the target message stream is a service message stream in which a client (as the source end) accesses a server (as the destination end) in a DCN;
- or a control stream sent by a device (as the source end) in a distributed storage system's control cluster to a compute cluster or storage cluster (as the destination end);
- or an I/O data stream in which a device (as the source end) in a distributed storage system's control cluster writes data to a storage cluster; and so on.
- When the source end generates and sends the target message stream to the destination end, then for any sent message in the target message stream, such as the first message, the source end can obtain the end-side latency and network latency of that message.
- the endpoint latency of the first message includes the latency of the source processing the first message, the latency of the destination processing the first message, and the latency of the source processing the first ACK message.
- the first ACK message is the ACK message returned by the destination to the source after receiving the first message, to notify the source that the destination has received the first message. That is, the first ACK message comes from the destination of the first message, and the first ACK message is the ACK message of the first message.
- the endpoint latency of the first message is the time delay (i.e., latency) incurred by the source and destination of the target message flow when processing the first message and its ACK message.
- the endpoint latency of the first message is also referred to as the host latency of the first message.
- the network latency of the first message is the latency of transmitting the first message and the first ACK message through the communication link between the source and destination.
- the network latency of the first message is the RTT latency of the first message.
- Figure 6 shows a schematic diagram of a process for the source end to obtain the end-side latency and network latency of the first packet according to an embodiment of this application.
- this process is applied to the implementation environment shown in Figure 3 or Figure 4.
- the process includes the following steps.
- Step 1011: The source end extracts the first time and the second time.
- the first time is the time when the source end generates the first data packet of the first message
- the second time is the time when the source end sends the last data packet of the first message.
- a message includes the complete data to be sent.
- messages are typically encapsulated into at least one data packet (or group) for transmission.
- Each data packet is configured with a message identifier (ID) to indicate that the data packet belongs to the message identified by that ID.
- ID can be any character and/or number that can uniquely identify a message; there are no restrictions on this.
- the source can encapsulate the first message into at least one data packet; that is, the first message includes at least one data packet.
- this data packet includes the message ID of the first message.
- This data packet also includes the total number of data packets included in the first message, and the packet's number (or packet number) within the total number of data packets.
- this first data packet includes "1/5", where "5" indicates that the first message includes a total of 5 data packets, and "1" indicates that the current data packet is the first of the 5 data packets.
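- The encapsulation and numbering just described can be sketched as follows (Python; the DataPacket fields and the 1460-byte MTU default are illustrative assumptions of ours):

```python
from dataclasses import dataclass

@dataclass
class DataPacket:
    msg_id: str     # message ID: identifies the message the packet belongs to
    seq: int        # this packet's number within the message, starting at 1
    total: int      # total number of data packets the message was split into
    payload: bytes

def encapsulate(msg_id: str, data: bytes, mtu: int = 1460) -> list[DataPacket]:
    """Split one message into numbered data packets ("1/5", "2/5", ... "5/5")."""
    chunks = [data[i:i + mtu] for i in range(0, len(data), mtu)] or [b""]
    return [DataPacket(msg_id, k, len(chunks), chunk)
            for k, chunk in enumerate(chunks, start=1)]
```

- For example, encapsulate("msg-7", b"x" * 7000) yields five packets whose (seq, total) pairs read 1/5 through 5/5, letting the receiver recognize both the first and the last data packet of the message.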
- the source end records the time of generating the first data packet as the first time and stores it.
- the source end also records and stores the message ID of the first message so as to distinguish which message the first time corresponds to in subsequent events.
- the source end schedules the at least one data packet to its own network interface card (NIC), which then sends the at least one data packet to the destination end.
- When the source end's NIC sends the last of these data packets, the source end records the time of sending that last data packet as the second time and stores the second time.
- the source end records the second time, the already recorded and stored first time, and the message ID of the first message as a single piece of information (such as a log entry).
- the source end can determine whether a data packet is the last data packet of the first message by the packet number of the data packets included in the first message; this will not be elaborated further.
- Step 1012: The destination end extracts the third time and the fourth time and sends them to the source end.
- the third time is the time when the destination receives the first data packet of the first message
- the fourth time is the time when the destination sends the first ACK message.
- When the destination end determines (based on the packet number) that a received data packet is the first data packet of the first message, it records the time of receiving that data packet as the third time. Subsequently, when the destination end determines (based on the packet number) that a received data packet is the last data packet of the first message, it determines that it has received all data packets of the first message, that is, that reception of the first message is complete. At this point, the destination end generates the first ACK message and returns it to the source end to notify the source end that the first message has been received in full. The destination end also records the time when it sends the first ACK message through its network card as the fourth time.
- the destination end sends the third and fourth times to the source end.
- In one implementation, the first ACK message carries the third and fourth times. For example, after recording the third time, the destination end adds it to the first ACK message when generating the message, and adds the sending time, that is, the fourth time, to the first ACK message when sending it. In this implementation, the destination end sends the third and fourth times to the source end via the first ACK message.
- In another implementation, the destination end generates a notification message including the third time and the fourth time, and sends the notification message to the source end.
- the notification message also includes the message ID of the first message, so that after receiving the notification message, the source end can determine, based on that message ID, that the time information carried in the notification message is the third and fourth times corresponding to the first message.
- This application does not specifically limit the format and type of the notification message.
- Step 1013: The source end receives the third time and the fourth time.
- When the third and fourth times are carried in the first ACK message, the source end extracts them from the first ACK message after receiving it from the destination end. The source end then records the third and fourth times together with the message ID of the first message carried in the first ACK message as a single piece of information; for example, based on that message ID, it records the third and fourth times together with the previously recorded first and second times and the message ID of the first message as a single piece of information.
- When the third and fourth times are carried in the notification message, the source end extracts them from the notification message after receiving it from the destination end. The source end then records the third and fourth times together with the message ID of the first message carried in the notification message as a single piece of information; for example, based on that message ID, it records the third and fourth times together with the previously recorded first and second times and the message ID of the first message as a single piece of information.
- Step 1014: The source end extracts the fifth time and the sixth time.
- the fifth time is the time when the source end receives the first ACK packet
- the sixth time is the time when the source end finishes processing the first ACK packet.
- After receiving the first ACK message that the destination end sends upon completing reception of the first message, the source end records the time of receiving the first ACK message as the fifth time and stores the fifth time. For example, the source end records the fifth time, the previously recorded and stored first through fourth times, and the message ID of the first message as a single piece of information.
- the source end then processes the first ACK message, for example, deleting the data sent via the first message from local storage in response to the first ACK message.
- the source end records the time at which it completes processing the first ACK message as the sixth time and stores the sixth time. For example, the source end records the sixth time, along with the previously recorded and stored first through fifth times and the message ID of the first message, as a single piece of information.
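- The "single piece of information" accumulated across steps 1011 to 1014 can be pictured as a per-message record keyed by message ID; a hypothetical sketch (Python, field names ours):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MsgRecord:
    """Per-message log entry built up at the source end (illustrative)."""
    msg_id: str
    t1: Optional[float] = None  # source generates the first data packet
    t2: Optional[float] = None  # source NIC sends the last data packet
    t3: Optional[float] = None  # destination receives the first data packet
    t4: Optional[float] = None  # destination sends the first ACK message
    t5: Optional[float] = None  # source receives the first ACK message
    t6: Optional[float] = None  # source finishes processing the ACK message

    def complete(self) -> bool:
        """True once all six time points have been recorded."""
        return None not in (self.t1, self.t2, self.t3, self.t4, self.t5, self.t6)
```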
- Figure 7 illustrates the relationship between the first through sixth time points (denoted t1 to t6 below) provided in this embodiment of the application:
- t1: the source end generates the first data packet of the first message.
- t2: the network card at the source end sends the data packets of the first message.
- t3: the destination end receives the data packets of the first message.
- t4: the destination end returns the first ACK message to the source end.
- t5: the source end receives the first ACK message.
- t6: the source end completes processing the first ACK message.
- the time when the network card sends the first or last data packet in a message can be approximated as the same time, i.e., considered as the sending time of the message.
- the time when the network card receives the first or last data packet in a message can be approximated as the same time, i.e., considered as the receiving time of the message.
- the source end has obtained several time nodes from the generation of the first message to its transmission, to the receipt of the corresponding ACK, and the completion of the processing of the ACK. Based on these time nodes, the end-side latency and network latency of the first message can be calculated.
- Step 1015: The source end determines the end-side delay and network delay of the first message based on the first time, second time, third time, fourth time, fifth time, and sixth time.
- the source end calculates the end-side delay E_delay of the first message according to the following formula (1):
- E_delay = E_delay_send + E_delay_receive    (1)
- where E_delay_send represents the delay (t2 - t1) of the source end processing the first message plus the delay (t6 - t5) of the source end processing the first ACK message, and E_delay_receive represents the delay (t4 - t3) of the destination end processing the first message. That is, the delay of the source end processing the first message is represented by the difference between the first and second times, the delay of the source end processing the first ACK message is represented by the difference between the fifth and sixth times, and the delay of the destination end processing the first message is represented by the difference between the third and fourth times.
- the source calculates the network delay F_delay of the first message according to the following formula (2):
- F_delay = F_delay_packet + F_delay_ack = (t3 - t2) + (t5 - t4) (2)
- F_delay_packet represents the delay (t3 - t2) in the communication link between the source and destination for transmitting the first packet, and F_delay_ack represents the delay (t5 - t4) in the communication link between the source and destination for returning the first ACK packet.
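- a minimal sketch of the two delay calculations of formulas (1) and (2), assuming formula (1) sums the components defined above and reusing the hypothetical MessageTimes record from the earlier sketch:

```python
def end_side_delay(t: MessageTimes) -> float:
    # Formula (1): E_delay = E_delay_send + E_delay_receive
    #            = (t2 - t1) + (t6 - t5) + (t4 - t3)
    return (t.t2 - t.t1) + (t.t6 - t.t5) + (t.t4 - t.t3)

def network_delay(t: MessageTimes) -> float:
    # Formula (2): F_delay = F_delay_packet + F_delay_ack
    #            = (t3 - t2) + (t5 - t4)
    return (t.t3 - t.t2) + (t.t5 - t.t4)
```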
- Step 102 The source determines the first packet sending rate based on the end-side delay of the first packet and the network delay of the first packet.
- the first packet sending rate is the packet sending rate required for controlling the sending rate of the target message stream.
- the process by which the source determines the first packet transmission rate based on the endpoint latency and network latency of the first sent packet can be understood as the source predicting the packet transmission rate (i.e., the first packet transmission rate) needed to adjust the current packet transmission rate of the target packet flow based on the historical state data of the target packet flow.
- This process can also be understood as a prediction process based on machine learning.
- the packet sending rate can be represented by the congestion window (CWND).
- CWND is an important concept in TCP, specifically referring to the number of data packets sent to the network in a single round, primarily used to control the amount of data sent to the network by the packet sender. Therefore, the first packet sending rate predicted by the source end based on the historical state data of the target packet flow can be understood as the number of data packets in the target packet flow that the source end will send to the network in the next round.
- upon receiving the first ACK packet, the source determines the first packet transmission rate based on the end-side latency and network latency of the first packet.
- the source determines the first packet transmission rate by: determining the end-side packet transmission rate based on the end-side latency of the first packet and the historical end-side CWND size; and determining the network-side packet transmission rate based on the network latency of the first packet and the historical network-side CWND size. Then, the source determines the smaller of the end-side packet transmission rate and the network-side packet transmission rate as the first packet transmission rate.
- the historical end-side CWND is the CWND of the source end when it sent packets in the target packet stream before the current time.
- the size of the historical end-side CWND can be understood as the number of packets in the target packet stream sent by the source end to the network in one or more rounds before the current time.
- the historical network-side CWND is the CWND of the communication link between the source and destination ends when it transmitted packets in the target packet stream before the current time.
- the size of the historical network-side CWND can be understood as the number of packets in the target packet stream transmitted by the communication link between the source and destination ends in one or more rounds before the current time. It can be seen that both the historical end-side CWND and the historical network-side CWND are historical state data of the target packet stream.
- the endpoint packet transmission rate refers to the number of data packets the source end needs to send in the target packet flow in the next round, as predicted by the source end based on historical state data of the target packet flow (such as the endpoint delay of the first packet and historical endpoint CWND). It should be understood that the endpoint (i.e., the host side) also faces congestion risks when processing packets due to resource and computing power limitations, and the endpoint packet rate predicted by the source end can prevent congestion at the source end.
- Network-side packet transmission rate refers to the number of data packets in the target message stream that the source end predicts, based on historical state data of the target message stream (such as the network latency of the first message and historical network-side CWND), will need to transmit in the next round of communication between the source and destination. Since communication links in the network are subject to congestion risk, the network-side packet transmission rate predicted by the source end can help avoid congestion in the communication link between the source and destination.
- the smaller of the end-side packet transmission rate and the network-side packet transmission rate is chosen as the first packet transmission rate. This prevents network congestion while avoiding a mismatch between the predicted and actual packet transmission rates.
- in related technologies, the packet transmission rate is determined by considering only the impact of the communication links in the network when preventing network congestion. However, the end side also incurs significant latency when processing packets, owing to resource and computing power limitations. Therefore, even if the sender is configured to transmit packets at the determined rate, in reality it cannot achieve that rate due to its own latency.
- the source determines the end-side packet transmission rate based on the end-side latency of the first packet and the historical end-side CWND size, including: the source calculates a first factor and a second factor based on the end-side latency of the first packet and the historical end-side CWND size, and calculates the end-side packet transmission rate based on the first factor, the second factor, and the historical end-side CWND size.
- the first factor represents the congestion state of the source
- the second factor represents the state attribute of the target packet flow at the source.
- the source determines the network-side packet transmission rate based on the network latency of the first packet and the historical network-side CWND size, including: the source calculates a third factor and a fourth factor based on the network latency of the first packet and the historical network-side CWND size, and determines the network-side packet transmission rate based on the third factor, the fourth factor, and the historical network-side CWND size.
- the third factor represents the congestion state of the communication link between the source and destination
- the fourth factor represents the state attribute of the target packet flow on that communication link.
- the state attributes of the target message flow at the source end can be understood as the sending status of the target message flow at the source end (e.g., the number of messages sent and processing delay).
- the state attributes of the target message flow on the communication link can be understood as the transmission status of the target message flow on the communication link (e.g., the number of messages transmitted and transmission delay).
- the end-side congestion state is inversely proportional to the square of the number of data streams sent through the ports of the end-side devices, and also inversely proportional to the square of the number of data packets in a single data stream (such as the target message stream) sent from the end to the network.
- the end-side congestion state can be represented by the size of the end-side congestion window.
- the network congestion state is inversely proportional to the square of the number of data streams sent through the ports of message forwarding devices (such as switches) in the network, and also inversely proportional to the square of the number of data packets in a single data stream (such as the target message stream) transmitted over the network.
- the network congestion state can be represented by the size of the network-side congestion window.
- R is a preset coefficient.
- delay_i represents the delay
- CWND_i represents the congestion window
- α represents the congestion state, and β represents the state attribute of the target packet flow.
- k represents the number of rounds of adjusting the packet rate before the current time, and i represents the i-th transmission rate adjustment in k rounds of adjusting the packet rate. For example, k takes the value 2, which means that the packet rate has been adjusted in two rounds before the current time. At this time, i takes the values in [1,2] in sequence.
- the relevant data, i.e., delay_i and CWND_i, corresponding to each of the k historical adjustments are used to calculate the coefficients A, B, C, and D needed to determine the first packet rate when adjusting the packet rate of the target packet flow in this round.
- when i = 1:
- delay_1 is the end-side delay or network delay obtained after the packet sending rate was adjusted in the previous round, that is, the end-side delay and network delay of the first packet obtained for predicting the first packet sending rate.
- CWND_1 is the packet sending rate adjusted in the previous round, that is, the number of data packets of the target packet stream sent by the source end from the previous packet sending rate adjustment until the current time.
- when i = 2:
- delay_2 is the end-side delay or network delay obtained after the packet sending rate was adjusted in the round before the previous round, that is, the end-side delay or network delay of the packets in the target packet stream used when predicting the packet sending rate in the previous round.
- CWND_2 is the packet sending rate controlled in the round before the previous round, that is, the number of data packets in the target message stream sent by the source end from the rate adjustment in the round before the previous round until the rate adjustment in the previous round.
- CWND i is the historical end-side CWND or historical network-side CWND mentioned above.
- the values of k and i in the above formulas (3) to (7) are pre-configured. Based on this, during the transmission of the target packet stream, the source end counts and stores in real time the end-side latency, network latency, historical end-side CWND and historical network-side CWND corresponding to the k times the packet transmission rate of the target packet stream is adjusted, so as to be used when determining the first packet transmission rate.
- the source end inputs the previously collected historical end-side CWND size and the coefficients A1, B1, C1, and D1 into formula (8) to obtain α1, which represents the source end's congestion state, and β1, which represents the state attribute of the target packet flow at the source end.
- α1 is the first factor
- β1 is the second factor.
- the source end inputs the previously collected historical network-side CWND size and the coefficients A2, B2, C2, and D2 into formula (8) to obtain α2, which represents the congestion state of the communication link between the source and destination ends, and β2, which represents the state attribute of the target message flow on the communication link.
- α2 is the third factor
- β2 is the fourth factor.
- the source end can calculate the predicted packet sending rate based on the calculated first factor, second factor, third factor, and fourth factor, according to the following formula (9).
- CWND represents the end-side packet sending rate or network-side packet sending rate calculated when the packet sending rate of the target message flow was adjusted in the previous round.
- the end-side packet sending rate calculated (or understood as "predicted") in this round of adjustment of the target message flow can be denoted as CWND_endhost, as shown in formula (10) below.
- the network-side packet sending rate calculated (or understood as "predicted") in this round of adjustment can be denoted as CWND_fabric, as shown in the following formula (11).
- the source determines the smaller of the calculated end-side packet sending rate and the network-side packet sending rate as the first packet sending rate. For example, the source takes the smaller of the calculated end-side packet sending rate and the network-side packet sending rate as the first packet sending rate based on the following formula (12), denoted as CWND_target.
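- while the exact forms of formulas (8) to (11) are not reproduced above, the selection in formula (12) is a simple minimum; a minimal Python sketch (the function name is hypothetical):

```python
def first_packet_rate(cwnd_endhost: int, cwnd_fabric: int) -> int:
    # Formula (12): CWND_target = min(CWND_endhost, CWND_fabric).
    # The tighter of the end-side and network-side windows is kept, so that
    # neither the host nor the communication link is driven into congestion.
    return min(cwnd_endhost, cwnd_fabric)
```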
- Step 103 The source end sends the second message to be sent in the target message stream at the first packet sending rate.
- after predicting the first packet sending rate based on the historical state data of the target packet flow (including the latency, congestion window, and other data corresponding to historical adjustments of the flow's packet sending rate), the source end sends the data packets of the packets in the target packet flow that have not yet been sent at the first packet sending rate.
- the first packet transmission rate can be represented as the first CWND.
- the source end schedules a number of packets that meet the size of the first CWND from the unsent packets in the target packet stream to the network card's transmission queue, and the network card sends the packets scheduled to the transmission queue.
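- a sketch of this scheduling step, under the assumption of simple FIFO queues (all names hypothetical):

```python
from collections import deque

def schedule_to_target_queue(pending: deque, tx_queue: deque, first_cwnd: int) -> None:
    # Move at most first_cwnd unsent data packets of the target packet stream
    # into the NIC's transmit queue; the NIC then schedules and sends them.
    for _ in range(min(first_cwnd, len(pending))):
        tx_queue.append(pending.popleft())
```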
- the process of the source scheduling data packets of the target message stream to be sent to the network card's sending queue according to the first packet sending rate, and the network card scheduling and sending data packets in the sending queue can be referred to the process described in Figure 9 below, and will not be repeated here.
- the congestion control mechanism implemented based on the methods described in steps 101 to 103 achieves the regulation of the packet transmission rate of the target data stream.
- the packet transmission rate of each stream can be regulated according to the historical state data of each stream (including the latency, congestion window, and other data corresponding to the historical regulation of the packet transmission rate of the packet stream), that is, flow-level granular congestion control is achieved. Based on this, the performance bottleneck problem of the implicit congestion control and port-level explicit congestion control mechanisms commonly used in the industry can be solved.
- the flow-level congestion control mechanism implemented by the scheme provided in this application embodiment can achieve differentiated congestion control (or flow control) for each data stream based on the attributes and network state (characterized by historical state data), and the fine-grained flow-level congestion control can simultaneously meet the differentiated needs of many applications.
- the flow-level congestion control mechanism implemented by the scheme provided in the embodiments of this application solves the problem of congestion control lag in current industry congestion control mechanisms. For example, current industry mechanisms perform flow control only by feeding back signals such as packet loss and RTT delay after severe congestion has occurred, or by feeding back ECN marks after mild congestion for the next round of rate adjustment.
- this application embodiment in addition to referencing the latency and congestion window of the communication link between the source and destination during historical data transmission to predict the subsequent packet sending rate of the source, also refers to the latency and congestion window of the source itself due to resource and computing power limitations to predict the subsequent packet sending rate of the source. This avoids the problem of a mismatch between the regulated packet sending rate and the actual data flow packet sending rate during congestion control. It can be seen that the flow control range of this application embodiment is larger than that in related technologies.
- the flow-level congestion control mechanism implemented by the solution provided in this application embodiment solves the problem of a mismatch between the packet sending rate after congestion control and the actual data flow packet sending rate, which is caused by the current industry focusing only on network-side congestion while ignoring end-side congestion.
- Figure 8 shows a comparison of the flow control range of the method provided in this application embodiment and related technologies.
- t1 to t6 are the first time to the sixth time mentioned above, respectively.
- in related technologies, the flow control range of the congestion control mechanism covers only the source network interface card (NIC) - transmission network - destination NIC.
- the flow control range of the congestion control mechanism implemented by the method provided in this application embodiment extends from the source host - source NIC - transmission network - destination NIC - destination host.
- the flow control in this application embodiment considers the latency caused by the source host generating data packets and processing ACKs, and also considers the latency caused by the destination host generating and sending ACKs in response to received data packets. Therefore, since the solution provided in this application embodiment uses the end-side latency and congestion state, as well as the network latency and congestion state of the transmission network, as inputs to flow control, the flow control implemented in this way can avoid end-side congestion and network-side congestion. That is, the congestion control mechanism implemented by the solution provided in this application embodiment has a larger flow control range and more accurate flow control precision.
- the following describes the process of the source scheduling data packets of the target message stream to the network card's sending queue according to the first packet sending rate, and the network card scheduling and sending data packets in the sending queue.
- Figure 9 shows a schematic flowchart of another message transmission method provided by an embodiment of this application.
- this method is applied to the source end of the implementation environment shown in Figure 3 or Figure 4.
- the method includes the following steps 201 to 203.
- steps 201 to 203 can be executed after step 102 to implement the above-described step 103.
- the method shown in Figure 9 can also be used alone, that is, the source end can use the method shown in Figure 9 to schedule data packets of the target message stream to be sent to the network card's sending queue according to any packet sending rate, and to implement the network card's scheduling of data packets in the sending queue.
- Step 201 The source determines the target sending queue in its own network card that corresponds to the target packet flow.
- the network interface card can include multiple transmit queues.
- the source network interface card includes multiple transmission queues, including a target transmission queue corresponding to the target packet flow, and each transmission queue has a different priority.
- the priority of a transmission queue indicates the priority of scheduling packets within that queue for transmission.
- This embodiment does not specifically limit the correspondence between priority levels and the order in which different transmission queues are scheduled for transmission. For example, packets from high-priority transmission queues are scheduled for transmission first, and packets from low-priority transmission queues are scheduled for transmission later. Or, for another example, packets from low-priority transmission queues are scheduled for transmission first, and packets from high-priority transmission queues are scheduled for transmission later. This is not a limitation.
- the source end pre-defines the correspondence between different types of services and transmission queues with different priorities in the network interface card (NIC). For example, type 1 services correspond to transmission queue 1 with first priority, type 2 services correspond to transmission queue 2 with second priority, and so on.
- the source end determines the transmission queue corresponding to the service type to which the target packet flow belongs as the target transmission queue.
- the source end pre-defines the correspondence between different Quality of Service (QoS) levels and transmission queues with different priorities in the network interface card (NIC). For example, the service requiring the highest QoS level corresponds to transmission queue 1 with first priority, the service requiring the second highest QoS level corresponds to transmission queue 2 with second priority, and so on.
- after determining the first packet transmission rate (i.e., the first CWND), the source end, based on the QoS level required by the service in the target packet flow, determines the transmission queue corresponding to that QoS level as the target transmission queue for the target packet flow.
- the source end determines the target transmission queue corresponding to the target packet flow in its own network interface card (NIC). This includes: the source end first determining the size of the packets already transmitted in the target packet flow; then, the source end determining the transmission priority of the target packet flow based on the size of the packets already transmitted; and finally, the source end determining the transmission queue corresponding to the transmission priority of the target packet flow from among the multiple transmission queues included in the NIC as the target transmission queue.
- the higher the sending priority of a packet flow, the earlier it is scheduled for sending; the lower the sending priority, the later it is scheduled for sending.
- correspondingly, the higher the sending priority of the packet flow, the higher the priority of the corresponding sending queue; and the lower the sending priority of the packet flow, the lower the priority of the corresponding sending queue.
- alternatively, the higher the sending priority of the message stream, the lower the priority of the corresponding sending queue, and vice versa; this is not limited here.
- latency-sensitive traffic generally requires a relatively small amount of data to be transmitted, thus requiring low latency.
- Bandwidth-sensitive traffic requires a large amount of data to be transmitted, thus necessitating high bandwidth. Therefore, to ensure the low latency required by latency-sensitive traffic, it needs to be set to a high transmission priority.
- This requires scheduling latency-sensitive traffic into a transmission queue configured for priority transmission, ensuring its low latency.
- after determining the first packet transmission rate (i.e., the first CWND), the source end counts the size of the packets already transmitted in the target packet stream and determines the transmission priority of the target packet stream based on that size. Then, the source end determines the transmission queue corresponding to the transmission priority of the target packet stream, from among the multiple transmission queues included in its network interface card, as the target transmission queue.
- the source end is configured with at least one priority threshold value indicating the data size.
- This priority threshold value can be divided into multiple threshold value intervals, each corresponding to a transmission priority.
- four priority threshold values can be divided into five threshold value intervals, each corresponding to a transmission priority, for a total of five transmission priorities.
- w is a preset value
- KB represents the data length unit kilobytes
- N represents the number of transmission queues included in the network interface card, or the number of priorities
- k is an integer between [0, N-2]
- P_k is the k-th priority threshold value. It can be seen that the value of P_k increases as k increases; that is, the larger the value of k, the larger the k-th priority threshold value P_k.
- the source network card includes 5 transmission queues, namely transmission queue 0 to transmission queue 4.
- the threshold value interval 0 where the data size is less than or equal to P0 corresponds to a transmission priority, denoted as transmission priority 0.
- a threshold range 1 for data sizes between P0 and P1 corresponds to a transmission priority, denoted as transmission priority 1.
- a threshold range 2 for data sizes between P1 and P2 corresponds to a transmission priority, denoted as transmission priority 2.
- a threshold range 3 for data sizes between P2 and P3 corresponds to a transmission priority, denoted as transmission priority 3.
- a threshold range 4 for data sizes greater than or equal to P3 corresponds to a transmission priority, denoted as transmission priority 4.
- transmission priorities 0 through 4 correspond one-to-one with transmission queues 0 through 4.
- the sending priority corresponding to the threshold range greater than or equal to P_(N-2) needs to indicate that packets in the sending queue corresponding to that sending priority are scheduled for sending last.
- when k takes the minimum value 0 and the size of the packets already sent by the packet stream is less than or equal to P0, it indicates that the packet stream has sent only a small amount of data. Therefore, it can be determined that the packet stream is a latency-sensitive packet stream.
- the sending priority corresponding to the threshold range less than or equal to P0 needs to indicate that packets in the sending queue corresponding to that sending priority are scheduled for sending first.
- the following description assumes that the sending priority corresponding to the first scheduled sending queue is the highest priority, and the sending priority corresponding to the last scheduled sending queue is the lowest priority. That is, the higher the sending priority corresponding to the sending queue, the more preferentially packets in that sending queue are scheduled for sending. Conversely, the lower the sending priority corresponding to a sending queue, the later the message in that queue is scheduled for sending. For example, referring to Figure 10, sending priority 0 is the highest priority, and the message in sending queue 0 corresponding to sending priority 0 is the first to be scheduled for sending. Sending priority 4 is the lowest priority, and the message in sending queue 4 corresponding to sending priority 4 is the last to be scheduled for sending.
- the sending queue corresponding to the sending priority of the message stream is simply the sending queue with that sending priority.
- once the source end has calculated the size of the packets already sent in the target packet flow, it determines the sending queue corresponding to the sending priority of the threshold value range to which that size belongs as the target sending queue. For example, if the packets already sent in the target packet flow belong to threshold value range 0 in the example shown in Figure 10, the source end determines sending queue 0 in its own network card corresponding to the highest priority (i.e., sending priority 0) as the target sending queue.
- conversely, if the packets already sent in the target packet flow belong to threshold value range 4, the source end determines sending queue 4 in its own network card corresponding to the lowest priority (i.e., sending priority 4) as the target sending queue.
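- as a non-limiting sketch of this threshold-based mapping, here in Python (the concrete values of P0 to P3, normally derived from the preset value w and the queue count N, are not reproduced in the text, so placeholders are used):

```python
import bisect

# Hypothetical threshold values P0..P3 in KB (placeholders, not the values
# produced by the formula described above).
P = [16, 64, 256, 1024]

def sending_priority(sent_kb: float) -> int:
    # Priority 0 (highest) for flows that have sent at most P0, up to
    # priority 4 (lowest) for flows that have sent at least P3; behavior
    # exactly at the threshold values is illustrative only.
    return bisect.bisect_left(P, sent_kb)
```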
- the source end also configures a weight for each transmission queue of the network interface card.
- note that, rather than assigning a greater weight to the sending queues that are scheduled first (i.e., high priority) and a smaller weight to those scheduled later (i.e., low priority), here high-priority sending queues are configured with lower weights and low-priority sending queues are configured with higher weights.
- the flow to which the data packets scheduled to the high-priority sending queue belong is latency-sensitive traffic
- the flow to which the data packets scheduled to the low-priority sending queue belong is bandwidth-sensitive traffic. Therefore, if a higher weight is assigned to the low-priority sending queue, then when scheduling data packets from different sending queues using WFQ, the low-priority sending queue can be scheduled with more data packets in each round of scheduling; that is, the bandwidth-sensitive traffic to which the data packets scheduled to the low-priority sending queue belong can obtain more transmission bandwidth.
- Step 202 The source end schedules data packets from the unsent packets in the target message stream that meet the first CWND to the target sending queue.
- the first CWND is the first packet transmission rate.
- after determining the target sending queue corresponding to the target packet stream, the source end schedules a number of data packets that meet the first CWND size from the packets to be sent in the target packet stream (such as packets including the second packet mentioned above) to the target sending queue, where they wait for the network card to schedule and send them.
- the "data packets in the target message stream that meet the first CWND size" are denoted as the first group of data packets.
- the source will schedule the data packets in the first group that exceed the length of the target sending queue (denoted as redundant data packets) to a sending queue with a priority one level lower than that of the target sending queue.
- Step 203 The source end sends the data packets of the target sending queue according to the priority of the target sending queue.
- the source network interface card is configured to schedule the transmission of data packets from multiple transmission queues according to the PQ (Priority Queuing) policy.
- after the source has scheduled the transmission of data packets (or messages) from transmission queues with higher priority than the target transmission queue, it schedules the transmission of data packets from the target transmission queue until all data packets in the target transmission queue have been transmitted.
- the source network interface card schedules the transmission of data packets from each sending queue in the first sending queue set according to the PQ (Priority Queuing) strategy.
- the first sending queue set is a set of at least one sending queue, where the priority of each of these at least one sending queue is higher than or equal to a preset priority, or, in the descending priority order of the multiple sending queues included in the source NIC, the priority of each of these at least one sending queue is among the top m, where m is an integer greater than or equal to 1. It can be seen that the sending queues in the first sending queue set are all sending queues with higher priority than the target sending queue in the source NIC.
- after the source has scheduled the transmission of data packets from the sending queues in the first sending queue set whose priority is higher than that of the target sending queue, it schedules the transmission of data packets from the target sending queue until all data packets in the target sending queue have been transmitted.
- data packets scheduled to higher-priority sending queues belong to message flows that have already sent relatively little data. Therefore, these message flows are latency-sensitive traffic, meaning they have a higher sending priority, and consequently, their corresponding sending queues have a higher priority.
- the second possible implementation by prioritizing the sending of data packets in higher-priority sending queues, ensures that latency-sensitive traffic is prioritized, thereby guaranteeing low latency for latency-sensitive traffic.
- the source end schedules the sending of data packets from each sending queue in the second sending queue set according to the WFQ mechanism. Specifically, the source end schedules the sending of data packets from each sending queue in descending order of priority, according to the weight of each sending queue in the second sending queue set, and so on in a cyclical manner.
- the second sending queue set is a set of at least two sending queues, each with a preset weight. The description of the preset weights can be found in the relevant description of step 202, and will not be repeated here.
- the priority of each of the at least two sending queues is lower than or equal to the preset priority, or, in the descending order of priority of the multiple sending queues included in the source end network interface card, the priority of each of the at least two sending queues is after the m-th position. It can be seen that the sending queues in the second sending queue set belong to the lower priority sending queues among the multiple sending queues of the source end network interface card.
- Figure 11 illustrates a process of scheduling data packets in each of the second sending queues in the WFQ mechanism according to an embodiment of this application.
- take the second sending queue set including sending queues 100 to 102 as an example, where the priorities of sending queues 100 to 102 decrease in order. Suppose weight 0 of sending queue 100 indicates that 1 data packet is scheduled at a time, weight 1 of sending queue 101 indicates that 3 data packets are scheduled at a time, and weight 2 of sending queue 102 indicates that 5 data packets are scheduled at a time. In this case, the source network card first schedules 1 data packet from sending queue 100, then schedules 3 data packets from sending queue 101, and then schedules 5 data packets from sending queue 102.
- the source then again schedules 1 data packet from sending queue 100, then 3 data packets from sending queue 101, and then 5 data packets from sending queue 102, and so on, until all data packets in sending queues 100 to 102 have been sent.
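- a minimal sketch of this cyclic weighted scheduling (names hypothetical; packets are represented as queue items):

```python
from collections import deque

def wfq_schedule(queues):
    # queues: list of (send_queue, weight) pairs in descending priority
    # order, e.g. [(q100, 1), (q101, 3), (q102, 5)] for the example above.
    while any(q for q, _ in queues):
        for q, weight in queues:
            # Schedule up to `weight` data packets from this queue per round.
            for _ in range(min(weight, len(q))):
                yield q.popleft()

q100, q101, q102 = deque("a"), deque("bbb"), deque("ccccc")
order = list(wfq_schedule([(q100, 1), (q101, 3), (q102, 5)]))
# order: ['a', 'b', 'b', 'b', 'c', 'c', 'c', 'c', 'c']
```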
- the actual number of data packets scheduled for a certain sending queue by the source may differ in different rounds of scheduling.
- suppose the pre-configured weight P1 of the first sending queue indicates that the number of data packets to be sent in one round is C
- the remaining number of data packets is denoted as C(k), i.e., C minus the number of data packets included in the first sending queue before the k-th round of scheduling
- when C(k) is greater than or equal to 0, it indicates that, before the k-th round of scheduling of the first sending queue, the number of data packets included in the first sending queue is less than the number of data packets C indicated by weight P1.
- the source network card schedules and sends all the data packets of the first sending queue in the k-th round of scheduling, and in the (k+1)-th round of scheduling, the source network card still schedules and sends the data packets of the first sending queue according to the number of data packets C indicated by P1.
- when C(k) is less than 0, it indicates that, before the k-th round of scheduling for the first sending queue, the number of data packets in the first sending queue is greater than the number of data packets C indicated by weight P1.
- the source network card schedules all data packets in the first sending queue for transmission in the k-th round of scheduling, and in the (k+1)-th round, it schedules [C + C(k)] data packets from the first sending queue (since C(k) is negative, this compensates for the overshoot in the k-th round).
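- read literally, the compensation rule above can be sketched as follows (a non-limiting Python illustration; the function name is hypothetical):

```python
def next_round_quota(c: int, queue_len_before_round: int) -> int:
    # C(k) = C minus the number of data packets present before round k.
    ck = c - queue_len_before_round
    # The NIC drains the whole queue in round k in both cases described
    # above; if it overshot the quota (C(k) < 0), the next round is reduced
    # to C + C(k) to compensate, otherwise the next round again uses C.
    return c if ck >= 0 else c + ck
```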
- a lower priority transmission queue means a lower transmission priority for the message stream to which the data packets scheduled to that queue belong. Consequently, the message stream to which the data packets scheduled to that queue belong has already transmitted more data, making it less sensitive to latency and more sensitive to bandwidth—in other words, it is bandwidth-sensitive traffic.
- therefore, a larger weight is set for such a queue to indicate how many data packets are sent in a single scheduling round, ensuring that bandwidth-sensitive traffic receives more transmission bandwidth.
- the source network interface card first uses a Priority Queuing (PQ) mechanism to schedule transmission for the first set of transmission queues, which includes the higher-priority transmission queues, to ensure that latency-sensitive packet flows are sent first. Then, after all data packets in the first set of transmission queues have been sent, the source NIC uses a Weighted Fair Queuing (WFQ) mechanism to schedule transmission for the second set of transmission queues, which includes the remaining lower-priority transmission queues, to ensure that bandwidth-sensitive packet flows receive more transmission bandwidth.
- steps 201 to 203 implement scheduling and sending based on the size of the packets already sent in a packet stream, selecting a suitable scheduling strategy for the stream, thereby enabling priority transmission of latency-sensitive traffic and providing more transmission bandwidth for bandwidth-sensitive traffic.
- FIG12 shows a schematic flowchart of another message transmission method provided by an embodiment of the present application.
- the method is applied to the implementation environment shown in FIG3 or FIG4.
- the method further includes the following steps 104 to 106.
- Step 104 The source end obtains the end-side latency and network latency of the third packet that has been sent in the target packet stream.
- for the process by which the source obtains the end-side latency and network latency of the third packet already sent in the target packet stream, please refer to the description of "the source obtains the end-side latency and network latency of the first packet already sent in the target packet stream" in step 101, which will not be repeated here.
- the third message is a message sent by the source after sending the first message.
- the third message is the second message mentioned above, or a message sent after the second message. There is no limitation on this.
- Step 105 The source determines the second packet transmission rate based on the end-side delay and network delay of the third message.
- in response to receiving the second ACK packet, the source determines the second packet transmission rate based on the end-side latency and network latency of the third packet.
- the second ACK packet is the ACK packet for the third packet.
- for this process, please refer to the description of "the source determines the first packet transmission rate based on the end-side delay and network delay of the first message" in step 102, which will not be repeated here.
- Step 106 The source end sends the packets to be sent in the target packet stream at the second packet sending rate.
- for the process by which the source sends the packets to be sent in the target packet stream at the second packet sending rate, please refer to the description of "the source sends the second message to be sent in the target message stream at the first packet sending rate" in step 103, which will not be repeated here.
- when the second ACK packet is the next ACK packet received by the source end after the first ACK packet, this indicates that the source end can respond to each ACK packet of the target packet stream and adjust the stream's packet transmission rate in real time, thereby achieving real-time control of the packet transmission rate of the target packet stream and thus real-time congestion control at the flow level.
- the scheme described in steps 101 to 106 provided in this application embodiment can avoid packet stream congestion, thereby improving the transmission performance of the packet stream and meeting the low-latency transmission performance requirements of the packet stream.
- for each packet of each stream, the source end executes step 101, so that the source end can obtain the end-side latency and network latency of each packet.
- in response to the first ACK packet received for flow 1, the source end executes step 102 based on the corresponding end-side latency and network latency to calculate packet transmission rate 1, and then sends the packets of flow 1 at packet transmission rate 1 (refer to step 103).
- in response to the next ACK packet received for flow 1, the source end executes step 102 again based on the corresponding end-side latency and network latency to calculate packet transmission rate 2, and then sends the packets of flow 1 at packet transmission rate 2 (refer to step 103).
- in response to a further ACK packet received for flow 1, the source end executes step 102 again based on the corresponding end-side latency and network latency to calculate packet transmission rate 3.
- the source end then transmits flow 1 at packet transmission rate 3 (refer to step 103). This process is repeated to achieve real-time control of the packet transmission rate of flow 1, which avoids congestion in flow 1, thereby improving its transmission performance and meeting its low-latency transmission performance requirements.
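- a minimal sketch of this per-ACK control loop; the flow object and all of its methods are hypothetical stand-ins for steps 101 to 103:

```python
def on_ack(flow, ack):
    # Step 101: end-side and network delays of the acknowledged packet
    # (see the end_side_delay / network_delay sketch above).
    e_delay, f_delay = flow.delays_for(ack.message_id)
    # Step 102: predict the two windows from the delays and the historical
    # CWNDs, then keep the smaller one (formula (12)).
    cwnd = min(flow.predict_endhost_cwnd(e_delay),
               flow.predict_fabric_cwnd(f_delay))
    # Step 103: send the flow's pending packets at the new rate.
    flow.send_pending(cwnd)
```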
- Figure 13 illustrates a schematic diagram of the relationship between packet transmission rate and latency in real-time packet flow control according to an embodiment of this application.
- the points (latency i , congestion window i ) in Figure 13 represent the relationship between the end-side packet transmission rate calculated according to steps 101 to 102 and the end-side latency used to calculate the packet transmission rate, or the relationship between the network-side packet transmission rate calculated according to steps 101 to 102 and the network latency used to calculate the packet transmission rate.
- when the source executes step 102 for the first packet of the target packet stream, it calculates the first factor and the second factor based on the end-side latency and the historical end-side CWND size, and the third factor and the fourth factor based on the network latency and the historical network-side CWND size. The source can then calculate the end-side packet sending rate CWND_endhost and the network-side packet sending rate CWND_fabric shown in Figure 13, choose the smaller of CWND_endhost and CWND_fabric as the first packet sending rate, and send the target packet stream at that rate.
- the source continues to respond to the received ACK packets (such as the second ACK packet) of the packets in the target packet stream, calculates (or predicts) a new packet sending rate, and sends the target packet stream using the new rate.
- This achieves the goal of real-time control of the target packet stream's packet sending rate.
- the curve shown in Figure 13 illustrates the relationship between the packet sending rate (i.e., congestion window) and latency for real-time control of the target packet flow.
- as the packet sending rate of the target packet flow increases, the latency decreases and tends to stabilize. Therefore, the proposed solution can improve network transmission performance (e.g., low latency) and keep network jitter stable.
- when the second ACK packet is another ACK packet of a packet in the target packet stream received by the source end after a preset time has elapsed since receiving the first ACK packet, or when a preset number of ACK packets lie between the first ACK packet and the second ACK packet, this indicates that the source end can respond to ACK packets spaced several ACK packets apart to determine the packet rate required to adjust the packet rate of the target packet stream. Compared with real-time adjustment, this achieves a slightly coarser granularity of packet rate adjustment for the target packet stream.
- this method can save the source end's computing power to a certain extent while still minimizing congestion and meeting the transmission performance requirements of packet streams that are not sensitive to latency.
- FIG14 shows a schematic diagram of a message transmission device provided in an embodiment of this application.
- the message transmission device 1400 is applied to a source end that sends multiple message streams.
- the target message stream is any one of the multiple message streams, and the target message stream includes a first message that has been sent and a second message to be sent.
- the message transmission device 1400 is specifically used to execute the message transmission method described above, for example, to execute the steps performed by the source end in the methods shown in FIG5, FIG6, FIG9 or FIG12.
- the message transmission device 1400 may include an acquisition unit 1401, a determination unit 1402 and a sending unit 1403.
- Acquisition unit 1401 is used to acquire the endpoint delay and network delay of the first message.
- Determination unit 1402 is used to determine a first packet transmission rate based on the endpoint delay and network delay of the first message.
- Sending unit 1403 is used to send a second message at the first packet transmission rate.
- the endpoint delay of the first message includes the delay of the source end processing the first message, the delay of the destination end of the target message stream processing the first message, and the delay of the source end processing the first ACK message.
- the first ACK message originates from the destination end of the first message and is an ACK message of the first message.
- the network delay of the first message is the delay of the communication link between the source and destination ends of the first message in transmitting the first message and the first ACK message.
- the acquisition unit 1401 can be used to perform step 101
- the determination unit 1402 can be used to perform step 102
- the sending unit 1403 can be used to perform step 103.
- the determining unit 1402 is specifically configured to: determine the end-side packet transmission rate based on the end-side delay of the first message and the size of the historical end-side CWND; determine the network-side packet transmission rate based on the network delay of the first message and the size of the historical network-side CWND; and determine the smaller of the end-side packet transmission rate and the network-side packet transmission rate as the first packet transmission rate.
- the historical end-side CWND is the CWND when the source end transmitted the message in the target message stream before the current time
- the historical network-side CWND is the CWND when the communication link transmitted the message in the target message stream before the current time.
- the determining unit 1402 can be used to perform step 102.
- the determining unit 1402 is further configured to calculate a first factor and a second factor based on the end-side delay of the first message and the size of the historical end-side CWND; and to calculate the end-side packet transmission rate based on the first factor, the second factor, and the size of the historical end-side CWND.
- the first factor represents the congestion state of the source end
- the second factor represents the state attribute of the target message flow at the source end.
- the determining unit 1402 is further configured to calculate a third factor and a fourth factor based on the network latency of the first message and the size of the historical network-side CWND; and to calculate the network-side packet transmission rate based on the third factor, the fourth factor, and the size of the historical network-side CWND.
- the third factor represents the congestion state of the communication link
- the fourth factor represents the state attribute of the target message flow on the communication link.
- the determining unit 1402 is further specifically configured to, in response to receiving the first ACK message, determine the first packet transmission rate based on the end-side delay of the first message and the network delay of the first message.
- the acquisition unit 1401 is further configured to acquire the end-side delay and network delay of the third packet already sent in the target packet stream.
- the determination unit 1402 is further configured to determine a second packet transmission rate based on the end-side delay and network delay of the third packet in response to receiving the second ACK packet.
- the sending unit 1403 is further configured to send the packets to be sent in the target packet stream at the second packet transmission rate.
- the third packet is a packet sent by the source end after sending the first packet, and the second ACK packet is an ACK packet for the third packet.
- the acquisition unit 1401 can be used to perform step 104
- the determination unit 1402 can be used to perform step 105
- the sending unit 1403 can be used to perform step 106.
- the first packet transmission rate is represented by the first CWND.
- the determining unit 1402 is further configured to determine the target transmission queue in the source network interface card (NIC) corresponding to the target packet flow.
- the sending unit 1403 is specifically configured to schedule data packets satisfying the first CWND size from the packets to be sent in the target packet flow, including the second packet, to the target transmission queue; and to send the data packets of the target transmission queue according to the priority of the target transmission queue.
- the source NIC includes multiple transmission queues, including the target transmission queue. Each transmission queue has a different priority, and the priority of a transmission queue indicates the order in which packets in that queue are scheduled for transmission.
- the determining unit 1402 can be used to perform step 201, and the sending unit 1403 can be used to perform steps 202 to 203.
- the determining unit 1402 is further specifically used to determine the size of the sent packets in the target packet stream; determine the sending priority of the target packet stream based on the size of the sent packets in the target packet stream; and determine the sending queue corresponding to the sending priority of the target packet stream among the multiple sending queues included in the source network interface card as the target sending queue.
- the transmission unit 1403 is specifically used to schedule the transmission data packets of the transmission queues in the first transmission queue set according to the PQ strategy when the target transmission queue belongs to the first transmission queue set.
- the priorities of all transmission queues in the first transmission queue set are higher than or equal to a preset priority, or, in the order of priorities of multiple transmission queues included in the source network interface card from high to low, the priorities of all transmission queues in the first transmission queue set are ranked in the top m positions, where m is an integer greater than or equal to 1.
- the sending unit 1403 is further specifically used to schedule the sending data packets of the sending queues in the second sending queue set according to the WFQ mechanism when the target sending queue belongs to the second sending queue set.
- the sending queues in the second sending queue set are each preset with a different weight, and the priorities of all sending queues in the second sending queue set are lower than or equal to a preset priority
- or, in the descending order of priority of the multiple sending queues included in the source end network interface card, the priorities of all sending queues in the second sending queue set are ranked after the m-th position, where m is an integer greater than or equal to 1.
- the lower the priority of a sending queue in the second sending queue set, the more data packets that queue is scheduled to send in one go.
- the target message flow is a message flow that accesses the target service, which is deployed on at least one server of the DCN.
- the target message stream is a message stream used for communication between different nodes in a distributed storage system.
- the delay of the source end processing the first message is represented by the difference between a first time and a second time, where the first time is the time when the source end generates the first data packet of the first message, and the second time is the time when the source end sends the last data packet of the first message.
- the delay of the destination end processing the first message is represented by the difference between a third time and a fourth time, where the third time is the time when the destination end receives the first data packet of the first message, and the fourth time is the time when the destination end sends the first ACK message.
- the delay of the source end processing the first ACK message is represented by the difference between a fifth time and a sixth time, where the fifth time is the time when the source end receives the first ACK message, and the sixth time is the time when the source end completes processing the first ACK message.
- the functions implemented by the acquisition unit 1401 and the determination unit 1402 in the message transmission device 1400 can be implemented by the processor 1604 in FIG16 executing the program code in the memory 1606 in FIG16.
- the functions implemented by the sending unit 1403 in the message transmission device 1400 can be implemented by the communication interface 1608 shown in FIG16.
- Figure 15 shows a schematic diagram of another message transmission device provided in an embodiment of this application.
- the message transmission device 1500 is applied to a source end that sends multiple message streams, and is specifically used to execute the message transmission method described above, for example, to execute the steps performed by the source end in the methods shown in Figures 5, 6, 9, or 12.
- the message transmission device 1500 may include an end-to-end decoupling module 1501, a flow-level congestion prediction module 1502, a flow-level congestion control module 1503, and a transmission priority determination module 1504.
- the end-to-end decoupling module 1501 is used to obtain the end-side delay and network delay of the sent packets in any packet stream (such as stream 1) (refer to the process described in step 101).
- Flow-level congestion prediction module 1502 and flow-level congestion control module 1503 are used to obtain the target packet transmission rate of flow 1 (refer to the first packet transmission rate obtained in step 102).
- flow-level congestion prediction module 1502 can be used to predict the end-side packet transmission rate of flow 1 based on the end-side delay and historical end-side CWND determined by end-network decoupling module 1501, and can also be used to predict the network-side packet transmission rate of flow 1 based on the network delay and historical network-side CWND determined by end-network decoupling module 1501.
- flow-level congestion control module 1503 selects the smaller of the end-side packet transmission rate and the network-side packet transmission rate predicted by flow-level congestion prediction module 1502 as the subsequent packet transmission rate for sending packets to be sent in flow 1.
- the sending priority determination module 1504 is used to determine the sending priority of stream 1 based on the size of the sent messages in stream 1, and to determine the sending queue corresponding to the sending priority as the target sending queue.
- the network card of the source device where the message transmission device 1500 is located can schedule the data packets of the message to be sent in stream 1 to the target sending queue and send them according to the target packet sending rate.
- the functions implemented by the end-to-end decoupling module 1501, the flow-level congestion prediction module 1502, the flow-level congestion control module 1503, and the transmission priority determination module 1504 in the message transmission device 1500 can be implemented by the processor 1604 in Figure 16 executing the program code in the memory 1606 in Figure 16.
- module/unit division in Figure 14 or Figure 15 is illustrative and represents only one logical functional division. In actual implementation, other division methods are possible. For example, two or more functions can be integrated into a single processing module.
- the integrated module described above can be implemented in hardware or as a software functional module.
- the implementation of the determination unit 1402 of the message transmission device 1400 shown in FIG. 14 will be described below.
- the implementation of the acquisition unit 1401 and the sending unit 1403 shown in FIG. 14 can refer to the implementation of the determination unit 1402.
- the determination unit 1402 may include code running on a computing instance.
- the computing instance may include at least one of a physical host (computing device), a virtual machine, and a container. Further, there may be one or more such computing instances.
- the determination unit 1402 may include code running on multiple hosts/virtual machines/containers. It should be noted that the multiple hosts/virtual machines/containers used to run the code may be distributed in the same region or in different regions. Further, the multiple hosts/virtual machines/containers used to run the code may be distributed in the same availability zone (AZ) or in different AZs, each AZ including one or more geographically proximate data centers. Typically, a region may include multiple AZs.
- multiple hosts/virtual machines/containers used to run this code can be distributed within the same Virtual Private Cloud (VPC) or across multiple VPCs.
- a VPC is set up within a region. Communication between two VPCs within the same region, as well as between VPCs in different regions, requires a communication gateway to be set up within each VPC to enable interconnection between VPCs.
- the determination unit 1402 may include at least one computing device, such as a server.
- the determination unit 1402 may also be a device implemented using an application-specific integrated circuit (ASIC) or a programmable logic device (PLD).
- the PLD may be implemented using a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof.
- the multiple computing devices included in the determining unit 1402 can be distributed in the same region or in different regions. Similarly, the multiple computing devices included in the determining unit 1402 can be distributed in the same Availability Zone (AZ) or in different AZs. Likewise, the multiple computing devices included in the determining unit 1402 can be distributed in the same Virtual Private Cloud (VPC) or in multiple VPCs. These multiple computing devices can be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.
- the determining unit 1402 can be used to execute any step related to determining the packet transmission rate in the message transmission method described in the embodiments of this application
- the acquiring unit 1401 can be used to execute any step related to the acquiring operation in the message transmission method described in the embodiments of this application
- the sending unit 1403 can be used to execute any step related to the sending operation in the message transmission method described in the embodiments of this application.
- the steps implemented by the acquiring unit 1401, the determining unit 1402, and the sending unit 1403 can be specified as needed.
- the message transmission device realizes its full functionality by having the acquiring unit 1401, the determining unit 1402, and the sending unit 1403 each implement their respective steps of the message transmission method described in the embodiments of this application.
- This application also provides a message transmission system, which includes a source end for sending multiple message streams and a destination end for receiving message streams.
- the source end is used to execute the portion of the message transmission method described above that is performed by the source end.
- the destination end is used to execute the portion of the message transmission method described above that is performed by the destination end.
- Both the source and destination ends can be implemented in software or hardware.
- the implementation of the source end will be described below.
- the implementation of the destination end can be referenced from the implementation of the source end.
- the source end can include code running on a computing instance.
- this computing instance can be at least one of a physical host (computing device), a virtual machine, a container, or another computing device.
- there can be one or more such computing instances.
- the source end can include code running on multiple hosts/virtual machines/containers.
- the multiple hosts/virtual machines/containers used to run the code can be distributed within the same region or in different regions.
- the multiple hosts/virtual machines/containers used to run the code can be distributed within the same Availability Zone (AZ) or in different AZs, each AZ comprising one or more geographically proximate data centers.
- a region can include multiple AZs.
- multiple hosts/virtual machines/containers used to run this code can be distributed within the same VPC or across multiple VPCs.
- a VPC is set up within a single region. Communication between two VPCs within the same region, and between VPCs in different regions, requires a communication gateway to be set up within each VPC to enable interconnection between VPCs.
- the source end can include at least one computing device, such as a server.
- the source end can also be a device implemented using an ASIC or a PLD.
- the aforementioned PLD can be implemented using a CPLD, FPGA, GAL, or any combination thereof.
- the multiple computing devices included in the source end can be distributed in the same region or in different regions. Similarly, the multiple computing devices included in the source end can be distributed in the same Availability Zone (AZ) or in different AZs. Likewise, the multiple computing devices included in the source end can be distributed in the same Virtual Private Cloud (VPC) or in multiple VPCs. These multiple computing devices can be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.
- This application provides a computing device.
- This computing device can be implemented as the source end or the destination end described above.
- the computing device 1600 includes a bus 1602, a processor 1604, a memory 1606, and a communication interface 1608.
- the processor 1604, the memory 1606, and the communication interface 1608 are interconnected via the bus 1602.
- Bus 1602 can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. Buses can be categorized as address buses, data buses, control buses, etc. For ease of illustration, only one line is used in Figure 16, but this does not imply that there is only one bus or one type of bus. Bus 1602 can include pathways for transmitting information between various components of computing device 1600 (e.g., memory 1606, processor 1604, communication interface 1608).
- Processor 1604 may include a general-purpose processor and/or a dedicated hardware chip.
- a general-purpose processor may include a central processing unit (CPU), a microprocessor (MP), or a graphics processing unit (GPU).
- CPU may be, for example, a single-core processor or a multi-core processor.
- a dedicated hardware chip is a high-performance processing hardware module.
- a dedicated hardware chip includes at least one of the following: a digital signal processor (DSP), a data processing unit (DPU), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, a neural processing unit (NPU), a tensor processing unit (TPU), an artificial intelligence chip, or a network processor (NP).
- the processor 1604 can also be an integrated circuit chip with signal processing capabilities. In implementation, some or all of the functions of the method provided in this application embodiment can be accomplished through the integrated logic circuitry in the hardware of the processor 1604 or through software instructions.
- Memory 1606 may include volatile memory, such as random access memory (RAM). Memory 1606 may also include non-volatile memory, such as read-only memory (ROM), flash memory, hard disk drive (HDD), or solid state drive (SSD).
- the memory 1606 stores executable program code.
- the processor 1604 executes the executable program code to implement the functions of the acquisition unit 1401, the determination unit 1402, and the sending unit 1403 shown in FIG. 14, thereby implementing the method portion executed by the source end in the message transmission method described in this application embodiment. That is, the memory 1606 stores instructions for executing the methods implemented by the acquisition unit 1401, the determination unit 1402, and the sending unit 1403 in the message transmission method described in this application embodiment.
- the processor 1604 executes the executable program code to implement the functions of the end-network decoupling module 1501, the flow-level congestion prediction module 1502, the flow-level congestion control module 1503, and the sending priority determination module 1504 shown in FIG. 15.
- the memory 1606 stores instructions for executing the functions implemented by the end-network decoupling module 1501, the flow-level congestion prediction module 1502, the flow-level congestion control module 1503, and the sending priority determination module 1504 in the message transmission method described in the embodiments of this application.
- the processor 1604 can implement the method portion of the message transmission method described in the embodiments of this application executed by the destination end by executing this executable program code.
- Communication interface 1608 uses transceiver modules, such as, but not limited to, transceivers, to enable communication with other devices or communication networks.
- communication interface 1608 can be any one or any combination of the following devices: network interfaces (such as Ethernet interfaces), wireless network cards, and other devices with network access capabilities.
- Communication interface 1608 includes a receiving unit for receiving data/messages and a sending unit for sending data/messages.
- the aforementioned devices can be disposed on separate chips, or at least partially or entirely on the same chip. Whether to arrange the devices independently on different chips or integrate them on one or more chips often depends on the needs of the product design. This application does not limit the specific implementation of the aforementioned devices. Furthermore, the descriptions of the processes corresponding to the various figures above each have their own emphasis; for parts of a process not described in detail in one figure, refer to the relevant descriptions of the other processes.
- the implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof.
- When implemented using software, the implementation can take the form, in whole or in part, of a computer program product.
- the computer program product includes one or more computer instructions. When these computer instructions are loaded and executed on the computing device 1600, they implement some or all of the functions of the message transmission method provided in the embodiments of this application.
- computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another.
- computer instructions can be transmitted from one website, computer, server, or data center to another via wired (e.g., coaxial cable, fiber optic, digital subscriber line) or wireless (e.g., infrared, wireless, microwave, etc.) means.
- the computer-readable storage medium stores the aforementioned computer program instructions.
- This application also provides a computing device cluster, which includes at least one computing device.
- the computing device can be a server, such as a central server, an edge server, or a local server in a local data center.
- the computing device can also be a terminal device such as a desktop computer, a laptop computer, or a smartphone.
- as shown in Figure 17, the computing device cluster includes at least one computing device 1600.
- the memory 1606 of one or more computing devices 1600 in the computing device cluster may store the same instructions for executing the message transmission method described above.
- the memory 1606 of one or more computing devices 1600 in the computing device cluster may also each store partial instructions for executing the message transmission method described above.
- a combination of one or more computing devices 1600 can jointly execute instructions for executing the message transmission method described above.
- the memory 1606 in different computing devices 1600 within the computing device cluster can store different instructions, which are used to execute some functions of the message transmission device described in Figure 14 above. That is, the instructions stored in the memory 1606 of different computing devices 1600 can implement the functions of one or more unit modules in the acquisition unit 1401, determination unit 1402, and sending unit 1403 shown in Figure 14.
- the memory 1606 in different computing devices 1600 within the computing device cluster can store different instructions, each used to execute a portion of the functions of the message transmission device described in Figure 15 above. That is, the instructions stored in the memory 1606 of different computing devices 1600 can implement the functions of one or more of the end-network decoupling module 1501, the flow-level congestion prediction module 1502, the flow-level congestion control module 1503, and the sending priority determination module 1504 shown in Figure 15.
- one or more computing devices in a computing device cluster can be connected via a network.
- This network can be a wide area network (WAN) or a local area network (LAN), etc.
- Figure 18 illustrates one possible implementation.
- two computing devices 1600A and 1600B are connected via a network. Specifically, they are connected to the network through communication interfaces in each computing device.
- the memory 1606 in computing device 1600A stores instructions that implement the functions of the acquisition unit 1401 and the determination unit 1402 shown in Figure 14.
- the memory 1606 in computing device 1600B stores instructions that implement the functions of the sending unit 1403 shown in Figure 14.
- the connection method of the computing device cluster shown in Figure 18 reflects the fact that, in the message transmission method provided in the embodiments of this application, the method steps executed by the source end need to obtain the end-side latency and network latency of the sent messages in the message stream and need to determine the target packet sending rate (such as the first packet sending rate or the second packet sending rate described in the method embodiments). It is therefore considered that the functions implemented by the acquisition unit 1401 and the determination unit 1402 are handed over to computing device 1600A for execution, while other operations (such as sending operations) are handed over to computing device 1600B for execution.
- the functions of computing device 1600A shown in Figure 18 can also be performed by multiple computing devices 1600.
- similarly, the functions of computing device 1600B can also be performed by multiple computing devices 1600, without limitation.
- This application also provides another computing device cluster.
- the connection relationship between the computing devices in this computing device cluster can be similarly referred to the connection method of the computing device cluster described in Figures 17 and 18.
- the difference is that the memory 1606 in one or more computing devices 1600 in this computing device cluster can store the same instructions for executing the message transmission method described in this application embodiment.
- the memory 1606 of one or more computing devices 1600 in the computing device cluster may also store partial instructions for executing the message transmission method described in the embodiments of this application.
- a combination of one or more computing devices 1600 can jointly execute instructions for executing the message transmission method described in the embodiments of this application.
- the memory 1606 in different computing devices 1600 within the computing device cluster can store different instructions for executing some functions of the message transmission system described in this application embodiment. That is, the instructions stored in the memory 1606 of different computing devices 1600 can implement the functions of one or more device modules in the source and destination ends described above.
- the computer program product may be a software or program product containing instructions, capable of running on a computing device or stored on any usable medium.
- when the computer program product is run on at least one computing device or processor, it causes the at least one computing device or processor to execute the message transmission method described in this application.
- the computer-readable storage medium can be any available medium accessible to a computing device, or a data storage device, such as a data center, containing one or more available media.
- the available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state drive).
- the computer-readable storage medium includes instructions that instruct the computing device to execute the message transmission method provided in this application.
- This application also provides a chip that includes a processor.
- when the chip executes program instructions or code, the chip including the processor, or the device including the chip, performs the message transmission method described above.
- the chip further includes an input interface, an output interface, and a memory.
- the chip's input interface, output interface, processor, and memory are connected via internal interconnection paths.
- the memory in the chip stores program instructions or code executed by the processor, and the input and output interfaces are used for communication between the chip and other chips or devices.
- the terms “first,” “second,” and “third” are used for descriptive purposes only and should not be construed as indicating or implying relative importance.
- the term “at least one” refers to one or more, and the term “multiple” refers to multiples, unless otherwise expressly defined.
- the term “A and/or B” can represent three cases: A exists alone, A and B exist simultaneously, or B exists alone.
- the character "/" in this document generally indicates that the preceding and following related objects have an "or" relationship.
- determining B based on A does not mean determining B solely based on A; B can also be determined based on A and/or other information.
- the sequence number of each process does not imply its order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.
- the information (including but not limited to user device information and user personal information), data (including but not limited to data used for analysis, stored data, and displayed data), and signals involved in this application are all authorized by the user or fully authorized by all parties, and the collection, use, and processing of the related data must comply with the relevant laws, regulations, and standards of the relevant countries and regions.
Abstract
This application relates to the technical field of cloud computing. Disclosed are a message transmission method and apparatus. The method is applied to a source end that sends a plurality of message streams and comprises: acquiring the end-side latency and the network latency of a first message that has been sent in a target message stream; determining a first packet sending rate on the basis of the end-side latency of the first message and the network latency of the first message; and sending, at the first packet sending rate, a second message to be sent in the target message stream, the target message stream being any one of the plurality of message streams sent by the source end. The end-side latency of the first message comprises the latency of the source end processing the first message, the latency of the destination end of the target message stream processing the first message, and the latency of the source end processing an ACK message of the first message; the network latency of the first message is the latency of transmitting the first message and the ACK message of the first message over a communication link between the source end and the destination end. On the basis of this method, congestion control at flow-level granularity can be implemented.
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410486856.8 | 2024-04-22 | | |
| CN202410486856 | 2024-04-22 | | |
| CN202410780555.6 | 2024-06-17 | | |
| CN202410780555.6A CN120835040A (zh) | 2024-04-22 | 2024-06-17 | 报文传输方法及装置 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025222944A1 (fr) | 2025-10-30 |
Family
ID=97394374
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2024/144645 Pending WO2025222944A1 (fr) | 2024-04-22 | 2024-12-31 | Procédé et appareil de transmission de message |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN120835040A (fr) |
| WO (1) | WO2025222944A1 (fr) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113824607A (zh) * | 2020-06-20 | 2021-12-21 | 华为技术服务有限公司 | 时延测量方法及装置 |
| US20230059755A1 (en) * | 2021-08-11 | 2023-02-23 | Enfabrica Corporation | System and method for congestion control using a flow level transmit mechanism |
| CN116723550A (zh) * | 2023-06-15 | 2023-09-08 | 南京大学 | 一种基于用户态的RDMA网络QoS协调方法 |
| CN117376212A (zh) * | 2023-10-12 | 2024-01-09 | 中国电信股份有限公司技术创新中心 | 网络速率调整方法及装置、存储介质及电子设备 |
- 2024-06-17 CN CN202410780555.6A patent/CN120835040A/zh active Pending
- 2024-12-31 WO PCT/CN2024/144645 patent/WO2025222944A1/fr active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| CN120835040A (zh) | 2025-10-24 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24936836; Country of ref document: EP; Kind code of ref document: A1 |