WO2024078018A1 - Data transmission method, apparatus and system
- Publication number
- WO2024078018A1 (PCT/CN2023/103315)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- node
- message
- data
- data block
- deduplication
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/12—Avoiding congestion; Recovering from congestion
- H04L47/125—Avoiding congestion; Recovering from congestion by balancing the load, e.g. traffic engineering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/31—Flow control; Congestion control by tagging of packets, e.g. using discard eligibility [DE] bits
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/43—Assembling or disassembling of packets, e.g. segmentation and reassembly [SAR]
Definitions
- the present application relates to the field of network technology, and in particular to a data transmission method, device and system.
- the audio and video data of the same user usually needs to be distributed to multiple recipients, which results in a large amount of data transmission and thus a large network overhead.
- the present application provides a data transmission method, device and system, which can reduce network bandwidth overhead by reducing the amount of data transmitted by messages.
- a data transmission method includes: a first node obtains a first message whose sender is a selective forwarding unit (SFU) server, and the first node is the first node that supports data deduplication on the transmission path of the first message. If a target message exists in the historical message sent by the first node to the second node, and the payload part of the target message has repeated content with the payload part of the first message, the first node performs deduplication processing on the payload part of the first message to obtain a second message.
- the second message does not include repeated content, and the second message carries a deduplication mark and indication information of the repeated content.
- the deduplication mark is used to indicate that the second message is a deduplication message.
- the second node is the next hop of the first message on the first node. The first node sends the second message to the second node.
- after receiving a message whose sender is an SFU server, the first node can determine whether it has already sent, to the next hop of the message, a historical message whose payload part has duplicate content with the payload part of the message. If it has, the first node can deduplicate the data of the message and then send the deduplicated message to the subordinate node. Since the data volume of the deduplicated message is smaller than that of the non-deduplicated message, the amount of data transmitted is reduced, thereby reducing the network bandwidth overhead.
- the repeated content includes one or more repeated data blocks
- the indication information of the repeated content includes one or more indications.
- the one or more indications correspond one-to-one to the one or more repeated data blocks in the repeated content.
- Each indication is used to indicate the hash value of the corresponding repeated data block.
- a data set is stored in a first node, and the data set includes the payload part of a historical message sent by the first node to a second node.
- the first node matches the content of the payload part of the first message with the payload part in the data set. If there is a target payload part in the data set that has a duplicate data block with the payload part of the first message, the first node determines that the target message exists in the historical message. Accordingly, the implementation process of the first node performing deduplication processing on the payload part of the first message includes: for each duplicate data block between the payload part of the first message and the target payload part, the first node calculates the hash value of the duplicate data block.
- the first node removes the duplicate data block in the payload part of the first message, and adds an indication corresponding to the duplicate data block to the payload part of the first message, where the indication is used to indicate the hash value of the duplicate data block and the position of the duplicate data block in the payload part of the first message.
- the first node can perform content matching on the payload part of the acquired message with the payload part of the stored historical message. If the payload part of the message and the payload part of the historical message have duplicate data blocks, the first node calculates the hash value of the duplicate data block, removes the duplicate data block in the message to obtain a deduplicated message, and further carries the hash value of the duplicate data block and an indication of the location of the duplicate data block in the deduplicated message to achieve data deduplication of the message.
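The content-matching variant described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the fixed block size `BLOCK`, the use of SHA-256, the aligned block comparison, and the names `block_hash` and `dedup_by_content` are all assumptions made for the example.

```python
import hashlib

BLOCK = 4  # hypothetical fixed block size in bytes, kept tiny for illustration


def block_hash(block: bytes) -> str:
    """Hash of one data block (SHA-256 is an assumed choice)."""
    return hashlib.sha256(block).hexdigest()


def dedup_by_content(payload: bytes, history: list):
    """Remove blocks that also occur, at the same offset, in a stored payload.

    Returns (remaining_bytes, indications), where each indication is
    (position_in_original_payload, hash_of_removed_block). Returns None and
    stores the payload when no historical payload shares a block with it.
    """
    kept = bytearray()
    indications = []
    for pos in range(0, len(payload), BLOCK):
        block = payload[pos:pos + BLOCK]
        if any(old[pos:pos + BLOCK] == block for old in history):
            indications.append((pos, block_hash(block)))  # block is removed
        else:
            kept += block                                 # block is kept
    if not indications:
        history.append(payload)  # no target message: update the data set
        return None
    return bytes(kept), indications
```

With a stored payload `b"AAAABBBBCCCC"`, the payload `b"AAAAXXXXCCCC"` keeps only `b"XXXX"` and carries two indications, for positions 0 and 8.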
- the first node determines that the target message does not exist in the historical message.
- the first node adds the payload portion of the first message to the data set.
- the updated data set can be used by the first node to perform deduplication processing on messages obtained subsequently whose sender is the SFU server.
- the payload portion includes a protocol portion and a data portion, and the one or more repeated data blocks are located in the data portion of the first message.
- a sampling tag set is stored in the first node, and the sampling tag set includes a hash value of a historical data block, and the historical data block is a data block obtained by sampling a preset position of a data part of a historical message sent by the first node to the second node.
- the first node samples a preset position of the data part of the first message to obtain a sampled data block.
- the first node calculates the hash value of the sampled data block. If the sampling tag set includes the hash value of the sampled data block, the first node determines that the target message exists in the historical message.
- the implementation process of the first node performing deduplication processing on the payload part of the first message includes: the first node uses the sampled data block whose hash value belongs to the sampling tag set as a duplicate data block, removes the duplicate data block from the data part of the first message, and adds an indication corresponding to the duplicate data block to the payload part of the first message, and the indication is used to indicate the hash value of the duplicate data block.
- the first node can calculate the hash value of the sampled data block at the preset position of the data part of the acquired message, and compare it with the hash value of the stored historical data block. If the hash value of a sampled data block in the message is the same as the hash value stored by the first node, the first node removes the sampled data block in the message to obtain a deduplicated message, and further carries the hash value of the sampled data block in the deduplicated message to achieve data deduplication of the message.
- the first node obtains multiple sampled data blocks by sampling the preset positions of the data part of the first message.
- the indication corresponding to the repeated data block is also used to indicate the position of the repeated data block in the data part of the first message.
- when the data part has multiple preset sampling positions, the first node samples the data part of the message and obtains multiple sampled data blocks. In this case, the deduplicated message must indicate the position that each removed duplicate data block occupied in the original message, so that subsequent nodes can restore the data of the deduplicated message.
- the first node determines that the target message does not exist in the historical message.
- the first node adds the hash value of the sampled data block to the sampling tag set.
- the updated sampling tag set can be used by the first node to perform deduplication processing on the message whose sender is the SFU server that is subsequently obtained.
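The sampling variant can be sketched in a similar way. The sampling offsets, the block length, SHA-256, and the name `dedup_by_sampling` are assumptions for illustration; the text above only requires that blocks be sampled at preset positions and compared by hash against the sampling tag set.

```python
import hashlib

SAMPLE_OFFSETS = (0, 8)  # hypothetical preset positions, sorted, non-overlapping
SAMPLE_LEN = 4           # hypothetical sampled block length in bytes


def dedup_by_sampling(data: bytes, tag_set: set):
    """Remove sampled blocks whose hash is already in the sampling tag set.

    Returns (remaining_data, indications); each indication is
    (position_in_original_data, hash). Hashes that were not found are added
    to tag_set so that later messages can be deduplicated against them.
    """
    kept = bytearray()
    indications = []
    new_tags = []
    cursor = 0
    for off in SAMPLE_OFFSETS:
        block = data[off:off + SAMPLE_LEN]
        if len(block) < SAMPLE_LEN:
            continue  # data part too short to sample at this position
        digest = hashlib.sha256(block).hexdigest()
        if digest in tag_set:
            kept += data[cursor:off]   # keep everything before the block
            cursor = off + SAMPLE_LEN  # skip the duplicate block itself
            indications.append((off, digest))
        else:
            new_tags.append(digest)
    kept += data[cursor:]
    tag_set.update(new_tags)
    return bytes(kept), indications
```

The first message through an empty tag set is forwarded unchanged and seeds the set; a later message with the same sampled blocks shrinks by `SAMPLE_LEN` bytes per hit.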
- the sampling tag set also includes a historical data block indicated by a hash value. If the sampling tag set includes the hash value of the sampling data block, the first node determines that the target message exists in the historical message, including: if the sampling tag set includes the hash value of the sampling data block, the first node performs content matching on the sampling data block and the historical data block indicated by the hash value of the sampling data block. When the content of the sampling data block is the same as the content of the historical data block indicated by the hash value of the sampling data block, the first node determines that the target message exists in the historical message.
- the first node can further perform content matching on the sampled data block and the historical data block after determining that the hash value of the sampled data block of the message is the same as the hash value of a historical data block, so as to achieve an exact match, thereby improving the accuracy of deduplication of the message.
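Why a content comparison after the hash hit improves accuracy can be shown with a deliberately weak hash. The function `weak_hash` below is a hypothetical stand-in chosen only because it collides easily; a real deployment would presumably use a much stronger hash, where collisions are rare but still possible.

```python
def weak_hash(block: bytes) -> int:
    """Deliberately collision-prone hash, used only to illustrate the point."""
    return sum(block) % 256


def is_true_duplicate(block: bytes, tag_store: dict) -> bool:
    """Hash lookup first, then byte-for-byte comparison with the stored block.

    tag_store maps a hash value to the historical data block it indicates.
    """
    stored = tag_store.get(weak_hash(block))
    return stored is not None and stored == block
```

Here `b"ab"` and `b"ba"` share the same `weak_hash` value, so a hash-only check would wrongly treat `b"ba"` as a duplicate of a stored `b"ab"`; the content comparison rejects it.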
- the first node also stores the protocol part of the historical message sent by the first node to the second node.
- the above-mentioned repeated content also includes the protocol information located in the protocol part of the first message, and the indication information of the repeated content also includes a difference indication, which is used to indicate the difference between the protocol part of the first message and the protocol part of the target message.
- in addition to deduplicating data blocks at preset positions in the data part of the message, the first node can also deduplicate the protocol part of the message.
- the amount of data transmitted in the message can be further reduced, thereby reducing the network bandwidth overhead.
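One possible encoding of the difference indication is a list of byte positions and values at which the protocol part of the current message differs from that of the target message. The text does not specify the encoding; the equal-length assumption and the name `protocol_diff` are hypothetical.

```python
def protocol_diff(current: bytes, target: bytes) -> list:
    """Return [(offset, byte_value_in_current), ...] for every differing byte.

    Assumes both protocol parts have the same length, as fixed-size headers
    would; this is an assumption of the sketch.
    """
    if len(current) != len(target):
        raise ValueError("sketch assumes protocol parts of equal length")
    return [(i, c) for i, (c, t) in enumerate(zip(current, target)) if c != t]
```

If two protocol parts differ only in one counter byte, the difference indication holds a single pair instead of the whole protocol part.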
- one or more flow grouping sets corresponding to the second node are stored in the first node.
- Each flow grouping set includes flow identifiers of multiple flows flowing through the second node.
- the first node can determine whether it is necessary to perform deduplication processing on the message sent to the subordinate node based on the flow grouping set corresponding to the subordinate node. If the flow identifier of the flow to which the message belongs is not in the flow grouping set corresponding to the subordinate node, then the first node directly forwards the message to the subordinate node without executing the message deduplication process, which can reduce the processing overhead of the first node.
- when there are multiple flow grouping sets corresponding to the subordinate node, the first node only needs to check for duplicate content between the payload part of the message and the payload parts of the historical messages belonging to the flows indicated by one flow grouping set. This reduces the number of historical messages that the first node needs to check, which reduces the processing overhead of the first node and improves its message processing efficiency, and therefore the message transmission efficiency.
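The flow-grouping check on the sending side can be sketched as a lookup that either returns the restricted set of historical payloads to compare against, or signals that the message should be forwarded without deduplication. The data layout (flow identifiers as strings, history keyed by flow identifier) and the function names are assumptions.

```python
def select_flow_group(flow_id: str, groups_for_next_hop: list):
    """Return the flow grouping set that contains flow_id, or None."""
    for group in groups_for_next_hop:
        if flow_id in group:
            return group
    return None


def candidate_history(flow_id: str, groups_for_next_hop: list, history: dict):
    """Historical payloads worth comparing against, or None to skip dedup.

    history maps a flow identifier to the payloads already sent on that flow.
    Only flows in the same grouping set as flow_id are considered.
    """
    group = select_flow_group(flow_id, groups_for_next_hop)
    if group is None:
        return None  # flow is in no grouping set: forward directly
    return [p for fid in sorted(group) for p in history.get(fid, [])]
```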
- the first node receives grouping information sent by the second node, where the grouping information includes a correspondence between a node identifier of the second node and one or more flow grouping sets corresponding to the second node.
- the first node is not an SFU server.
- the first node sends a first node discovery message to a third node, where the third node is the next hop of the third message on the first node, the destination of the third message is the SFU server, the first node discovery message carries the identifier of the SFU server, and the first node discovery message indicates that the first node is a subordinate node of the third node on the transmission path starting from the SFU server.
- the first node determines that the first node is the first node that supports data deduplication on the transmission path starting from the SFU server.
- the first node generates a first node discovery message according to the third message, a message header of the first node discovery message is the same as a message header of the third message, and a payload portion of the first node discovery message carries an indication of a message type of the first node discovery message.
- the first node receives a second node discovery message sent by the fourth node, the second node discovery message carries an identifier of the SFU server, and the second node discovery message indicates that the fourth node is a subordinate node of the first node on a transmission path starting from the SFU server.
- the first node determines that the fourth node supports data deduplication based on the second node discovery message, and sends a second node discovery response message corresponding to the second node discovery message to the fourth node, and the second node discovery response message indicates that the first node supports data deduplication.
- the node discovery process is triggered by a message whose destination is the SFU server
- a node receives a node discovery message sent by a subordinate node and does not receive a node discovery response message sent by an upper node
- the node can determine itself as the first node on the transmission path starting from the SFU server.
- after the first node receives the fourth message whose source port number is the SFU service port number, the first node sends a third node discovery message to the fifth node, the fifth node is the next hop of the fourth message on the first node, the sender of the fourth message is the SFU server, the third node discovery message carries the identifier of the SFU server, and the third node discovery message indicates that the first node is the superior node of the fifth node on the transmission path starting from the SFU server.
- the first node determines that the fifth node supports data deduplication.
- the node discovery process is triggered by a message whose sender is an SFU server
- a node receives a node discovery response message sent by a subordinate node and does not receive a node discovery message sent by an upper node
- the node can determine that it is the first node on the transmission path starting from the SFU server.
- the first node sends the first message to the second node.
- a data transmission method includes: a first node receives a first message sent by a second node.
- the sender of the first message is an SFU server.
- the first message carries a deduplication mark and indication information of repeated content.
- the deduplication mark is used to indicate that the first message is a deduplication message.
- the first node is the last node that supports data deduplication on the transmission path of the first message.
- the first node determines that the first message is a deduplication message based on the deduplication mark in the first message.
- the first node obtains the repeated content from a data set according to the indication information, and the data set includes at least part of the payload part of the historical message received by the first node from the second node.
- the first node performs deduplication recovery processing on the payload part of the first message according to the repeated content to obtain a second message.
- the payload part of the second message includes the repeated content.
- the first node sends a second message to a third node, and the third node is the next hop of the first message on the first node.
- the first node can perform data recovery on deduplicated messages whose sender is the SFU server, so that the user can receive the original message carrying complete data content, thereby ensuring the user's business.
- the repeated content includes one or more repeated data blocks.
- the indication information of the repeated content includes one or more indications.
- the one or more indications correspond one-to-one to the one or more repeated data blocks in the repeated content.
- Each indication is used to indicate the hash value of the corresponding repeated data block.
- each indication is also used to indicate the position of the corresponding repeated data block in the payload part of the original message corresponding to the first message
- the data set includes the payload part of the historical message received by the first node from the second node.
- the implementation process of the first node obtaining repeated content from the data set according to the indication information includes: for each indication in the indication information, the first node obtains, from the payload parts in the data set, the data block to be matched that is located at the position indicated by the indication.
- the first node calculates the hash value of the data block to be matched.
- the first node determines the data block to be matched whose hash value is consistent with the hash value indicated by the indication as the repeated data block corresponding to the indication.
- based on the position of the duplicate data block in the payload part of the original message, as indicated by the indication carried in the deduplication message, the first node can calculate the hash values of the data blocks at that position in each of the stored payload parts, so as to obtain a stored data block whose hash value is consistent with the hash value indicated by the indication, and then add the stored data block at the position indicated by the indication to achieve data recovery of the deduplication message.
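Recovery by position and hash can be sketched as follows. The fixed block size, SHA-256, and the name `restore_by_position` are assumptions; the sketch also assumes indications are applied in ascending position order, so that each block is reinserted at its offset in the original payload.

```python
import hashlib

BLOCK = 4  # hypothetical fixed block size, matching the deduplicating node


def restore_by_position(remaining: bytes, indications: list,
                        stored_payloads: list) -> bytes:
    """Rebuild the original payload from a deduplicated one.

    For each (position, hash) indication, look at that position in every
    stored payload, hash the block found there, and reinsert the first block
    whose hash matches.
    """
    restored = bytearray(remaining)
    for pos, digest in sorted(indications):
        for old in stored_payloads:
            candidate = old[pos:pos + BLOCK]
            if candidate and hashlib.sha256(candidate).hexdigest() == digest:
                restored[pos:pos] = candidate  # insert at original offset
                break
        else:
            raise LookupError(f"no stored block matches position {pos}")
    return bytes(restored)
```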
- the payload portion includes a protocol portion and a data portion, and one or more repeated data blocks are located in the data portion.
- the data set includes a corresponding relationship between a hash value of a historical data block and the historical data block, where the historical data block is a data block of a historical message received by the first node.
- the first node obtains the repeated content from the data set according to the indication information, comprising: the first node determines the historical data block in the data set corresponding to the hash value indicated by the indication in the indication information as the repeated data block corresponding to the indication.
- the indication in the indication information of the repeated content is also used to indicate the position of the corresponding repeated data block in the data part of the original message corresponding to the first message.
- the implementation method of the first node performing deduplication recovery processing on the payload part of the first message according to the repeated content includes: for each indication in the indication information, the first node adds the repeated data block corresponding to the indication at the position indicated by the indication in the data part of the first message.
- the first node can search for the hash value carried in the deduplication message among the hash values of the stored historical data blocks, and add the historical data block corresponding to the hit hash value to the data part of the deduplication message to achieve recovery of the data part of the deduplication message.
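When the data set stores a hash-to-block correspondence, recovery reduces to a dictionary lookup per indication. The dictionary layout and the name `restore_by_hash` are assumptions; a missing hash raises `KeyError` in this sketch.

```python
import hashlib


def restore_by_hash(remaining: bytes, indications: list,
                    blocks_by_hash: dict) -> bytes:
    """Reinsert each removed block, looked up by the hash in its indication.

    indications holds (position_in_original_data, hash) pairs and is applied
    in ascending position order.
    """
    restored = bytearray(remaining)
    for pos, digest in sorted(indications):
        restored[pos:pos] = blocks_by_hash[digest]
    return bytes(restored)
```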
- the repeated content also includes protocol information located in the protocol part
- the indication information also includes a difference indication
- the difference indication is used to indicate the difference between the protocol part of the original message corresponding to the first message and the protocol part of the target message
- the target message is a historical message received by the first node from the second node, in which the data part and the data part of the original message have one or more repeated data blocks.
- the data set also includes the protocol part of the message to which the historical data block belongs.
- the implementation process of the first node obtaining the repeated content from the data set according to the indication information also includes: the first node obtains the protocol part of the target message to which one or more repeated data blocks belong from the data set.
- the implementation process of the first node performing deduplication and recovery processing on the payload part of the first message according to the repeated content also includes: the first node modifies the protocol part of the target message according to the difference indication, and uses the modified protocol part of the target message as the protocol part of the second message.
- the first node can obtain the protocol part of the historical message to which the historical data block corresponding to the hit hash value belongs, and restore the protocol part of the original message corresponding to the deduplicated message in combination with the difference indication for the protocol part in the deduplicated message, so as to realize the recovery of the protocol part of the deduplicated message.
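Restoring the protocol part from a stored protocol part plus a difference indication can be sketched as below. The encoding of the difference indication as `(offset, byte_value)` pairs is an assumption, since the text does not fix a format, and `restore_protocol` is a hypothetical name.

```python
def restore_protocol(target_protocol: bytes, diff: list) -> bytes:
    """Apply a difference indication to the protocol part of the target message.

    diff is assumed to be a list of (offset, byte_value) pairs giving the
    bytes of the original protocol part that differ from the target's.
    """
    restored = bytearray(target_protocol)
    for offset, value in diff:
        restored[offset] = value
    return bytes(restored)
```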
- the deduplication mark is located in the payload portion of the first message.
- One or more flow grouping sets are stored in the first node, and each flow grouping set includes flow identifiers of multiple flows flowing through the first node.
- the first node determines that there is a target flow grouping set including the flow identifier of the flow to which the first message belongs in the one or more flow grouping sets.
- the first node parses the payload portion of the first message to obtain the deduplication mark.
- the implementation method of the first node obtaining repeated content from the data set according to the indication information includes: the first node obtains repeated content from the payload content of the target historical message in the data set according to the indication information, and the flow identifier of the flow to which the target historical message belongs belongs to the target flow grouping set.
- when there are multiple flow grouping sets corresponding to the first node, the first node only needs to search the payload parts of the historical messages belonging to the flows indicated by one flow grouping set to obtain the duplicate content. This reduces the number of historical messages that the first node needs to retrieve, which reduces the processing overhead of the first node and improves its message processing efficiency, and therefore the message transmission efficiency.
- the first node receives a third message, and the flow identifier of the flow to which the third message belongs does not belong to any flow grouping set.
- the first node forwards the third message.
- the first node can first determine whether the flow identifier of the flow to which the message belongs belongs to a certain flow grouping set corresponding to itself after receiving the message. If the flow identifier of the flow to which the message belongs belongs to a certain flow grouping set corresponding to itself, it means that the message may be a deduplication message that has been deduplication processed by the upper-level node, and the first node needs to further parse the payload part of the message to determine whether the message is a deduplication message.
- if the flow identifier of the flow to which the message belongs does not belong to any flow grouping set corresponding to the first node, it means that the upper-level node will not perform data deduplication on the message, that is, the message cannot be a deduplication message, so the first node can directly forward the message without parsing the payload part of the message to determine whether the message is a deduplication message. This can reduce the processing overhead of the last node.
- the first node receives a fourth message, and the flow identifier of the flow to which the fourth message belongs belongs to the flow grouping set.
- the first node parses the payload part of the fourth message and determines that the payload part of the fourth message does not carry a deduplication mark.
- the first node adds at least part of the content of the payload part of the fourth message to the data set and forwards the fourth message.
- the fourth message here can be regarded as the first packet in a group of deduplicated messages received by the first node.
- the first node stores at least part of the content of the payload part of the fourth message in the data set so as to perform data recovery on the deduplicated messages received subsequently.
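The three receive-side cases described above (flow in no grouping set, flow in a grouping set without a deduplication mark, flow in a grouping set with a mark) can be sketched as a small classifier. The one-byte mark `DEDUP_MARK` at the start of the payload is purely hypothetical; the text only says that the mark is located in the payload part.

```python
DEDUP_MARK = b"\xDD"  # hypothetical one-byte deduplication mark


def classify(flow_id: str, payload: bytes, flow_groups: list) -> str:
    """Decide how the last deduplication-capable node handles a message."""
    if not any(flow_id in group for group in flow_groups):
        return "forward"            # cannot be a deduplication message
    if payload.startswith(DEDUP_MARK):
        return "recover"            # restore the payload, then forward
    return "store_and_forward"      # keep content for later recovery
```

Checking the flow identifier first means the payload is parsed only for flows that an upper-level node might actually deduplicate.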
- the first node adds the flow identifiers of the flows to which the messages with duplicate contents in the payload part belong among the multiple messages received belonging to different flows to the same flow grouping set, and the senders of the different flows are all SFU servers.
- the first node sends grouping information to the second node, where the grouping information includes a correspondence between a node identifier of the first node and one or more flow grouping sets.
- after the first node receives the fifth message whose destination port number is the SFU service port number, the first node sends a first node discovery message to the fourth node, the fourth node is the next hop of the fifth message on the first node, the destination of the fifth message is the SFU server, the first node discovery message carries the identifier of the SFU server, and the first node discovery message indicates that the first node is the subordinate node of the fourth node on the transmission path starting from the SFU server.
- the first node determines that the fourth node supports data deduplication.
- the first node generates a first node discovery message according to the fifth message, a message header of the first node discovery message is the same as a message header of the fifth message, and a payload portion of the first node discovery message carries an indication of a message type of the first node discovery message.
- the node discovery process is triggered by a message whose destination is the SFU server
- a node receives a node discovery response message sent by an upper-level node and does not receive a node discovery message sent by a lower-level node
- the node can determine that it is the last node on the transmission path starting from the SFU server.
- the first node receives a second node discovery message sent by the fifth node, the second node discovery message carries an identifier of the SFU server, and the second node discovery message indicates that the fifth node is the superior node of the first node on the transmission path starting from the SFU server.
- the first node determines that the fifth node supports data deduplication based on the second node discovery message, and sends a second node discovery response message corresponding to the second node discovery message to the fifth node, and the second node discovery response message indicates that the first node supports data deduplication.
- the first node sends a third node discovery message to the sixth node, the sixth node is the next hop of the sixth message on the first node, the sender of the sixth message is the SFU server, the third node discovery message carries the identifier of the SFU server, and the third node discovery message indicates that the first node is the superior node of the sixth node on the transmission path starting from the SFU server.
- the first node determines that the first node is the last node that supports data deduplication on the transmission path starting from the SFU server.
- the node discovery process is triggered by a message whose sender is an SFU server
- a node receives a node discovery message sent by an upper-level node and does not receive a node discovery response message sent by a lower-level node
- the node can determine that it is the last node on the transmission path starting from the SFU server.
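Both discovery variants (triggered by a message destined for the SFU server, or by a message sent by it) lead to the same outcome: a node learns whether a deduplication-capable peer exists upstream and whether one exists downstream. A minimal sketch of the resulting role decision follows; the function name and the `"standalone"` label for a node with no peers are assumptions not taken from the text.

```python
def dedup_role(upstream_peer_found: bool, downstream_peer_found: bool) -> str:
    """Role of a node on the transmission path starting from the SFU server.

    upstream_peer_found: a discovery message or response was received from
    an upper-level node. downstream_peer_found: one was received from a
    lower-level node.
    """
    if upstream_peer_found and downstream_peer_found:
        return "intermediate"
    if downstream_peer_found:
        return "first"       # no capable node closer to the SFU server
    if upstream_peer_found:
        return "last"        # no capable node further from the SFU server
    return "standalone"      # hypothetical label: no capable peer found
```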
- a data transmission method includes: a first node receives a first message sent by a second node.
- the sender of the first message is an SFU server.
- the first node is an intermediate node that supports data deduplication on the transmission path of the first message.
- the first node performs deduplication processing on the payload part of the first message to obtain a second message, the second message does not include the first repeated content, and the second message carries a deduplication mark and a first indication information of the first repeated content, the deduplication mark is used to indicate that the second message is a deduplication message, and the third node is the next hop of the first message on the first node.
- the first node sends a second message to the third node.
- the first node can perform data deduplication on the non-deduplicated message whose sender is the SFU server, so as to send the deduplicated message to the subordinate node. Since the data volume of the deduplicated message is smaller than that of the non-deduplicated message, the data volume of the message transmission can be reduced, thereby reducing the network bandwidth overhead.
- when the first node sends a deduplicated message to the subordinate node, it only needs to ensure that the payload part of the historical message carrying the content removed from the deduplicated message relative to the original message has been sent to the subordinate node. This ensures that a node on the subsequent transmission path can perform data recovery on the deduplicated message, so that the user finally receives the original message carrying the complete data content, thereby ensuring user services.
- the first node sends the first message to the third node.
- the first repetitive content includes one or more repetitive data blocks
- the first indication information includes one or more indications.
- the one or more indications in the first indication information correspond one-to-one to the one or more repetitive data blocks in the first repetitive content, and each indication is used to indicate a hash value of a corresponding repetitive data block.
- a first data set is stored in a first node, and the first data set includes a payload portion of a historical message sent by the first node to a third node.
- the first node performs content matching on the payload portion of the first message and the payload portion in the first data set. If there is a target payload portion in the first data set that has a duplicate data block with the payload portion of the first message, the first node determines that the first original message exists in the historical message.
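- if no payload portion in the first data set has a duplicate data block with the payload portion of the first message, the first node determines that the first original message does not exist in the historical message, and the first node adds the payload portion of the first message to the first data set.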
- the first data set includes a target payload part
- the first node performs deduplication processing on the payload part of the first message, including: for each duplicate data block between the payload part of the first message and the target payload part, the first node calculates a hash value of the duplicate data block; the first node removes the duplicate data block from the payload part of the first message, and adds an indication corresponding to the duplicate data block to the payload part of the first message, the indication being used to indicate the hash value of the duplicate data block and the position of the duplicate data block in the payload part of the first message.
- the payload portion includes a protocol portion and a data portion.
- One or more repeated data blocks in the first repeated content are located in the data portion of the first message.
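The content-matching deduplication described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes fixed-size blocks compared at the same offset in each stored payload, and uses SHA-256 as the hash. `BLOCK_SIZE`, `block_hash`, and `deduplicate` are hypothetical names introduced only for this example.

```python
import hashlib
from typing import List, Tuple

BLOCK_SIZE = 4  # hypothetical block size, kept tiny for illustration


def block_hash(block: bytes) -> str:
    """Hash value carried in the indication (SHA-256 is an assumption)."""
    return hashlib.sha256(block).hexdigest()


def deduplicate(payload: bytes, history: List[bytes]) -> Tuple[bytes, List[Tuple[int, str]]]:
    """Remove blocks that also appear, at the same offset, in a stored payload.

    Returns the remaining bytes and one (position, hash) indication per
    removed block, where position is the offset in the original payload.
    """
    residual = bytearray()
    indications: List[Tuple[int, str]] = []
    for offset in range(0, len(payload), BLOCK_SIZE):
        block = payload[offset:offset + BLOCK_SIZE]
        is_duplicate = len(block) == BLOCK_SIZE and any(
            old[offset:offset + BLOCK_SIZE] == block for old in history
        )
        if is_duplicate:
            indications.append((offset, block_hash(block)))
        else:
            residual += block
    return bytes(residual), indications


history = [b"AAAABBBBCCCC"]                 # payloads already sent to the next hop
residual, indications = deduplicate(b"AAAAXXXXCCCC", history)
```

In this sketch only the middle block `XXXX` remains in the payload, and the two removed blocks are represented by their positions and hash values.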
- a sampling tag set is stored in the first node, and the sampling tag set includes a hash value of a historical data block, where the historical data block is a data block obtained by sampling a preset position of the data portion of a historical message sent by the first node to the third node.
- the first node samples a preset position of the data portion of the first message to obtain a sampled data block.
- the first node calculates a hash value of the sampled data block. If the sampling tag set includes the hash value of the sampled data block, the first node determines that the first original message exists in the historical message. If the sampling tag set does not include the hash value of the sampled data block, the first node determines that the first original message does not exist in the historical message, and the first node adds the hash value of the sampled data block to the sampling tag set.
- the sampling tag set includes a hash value of the sampling data block
- the implementation method of the first node deduplicating the payload part of the first message includes: the first node uses the sampling data block whose hash value belongs to the sampling tag set as a duplicate data block, removes the duplicate data block from the data part of the first message, and adds an indication corresponding to the duplicate data block to the payload part of the first message, where the indication is used to indicate the hash value of the duplicate data block.
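The sampling-based variant can be sketched in the same way. The example below assumes two preset sampling positions in the data part, a fixed sample length, and SHA-256 as the hash; `SAMPLE_OFFSETS`, `SAMPLE_LEN`, and `sample_dedup` are hypothetical names, and each indication is modelled as a (position, hash) pair.

```python
import hashlib
from typing import List, Set, Tuple

SAMPLE_OFFSETS = (0, 8)  # hypothetical preset positions, ascending and non-overlapping
SAMPLE_LEN = 4           # hypothetical length of each sampled data block


def sample_dedup(data: bytes, tag_set: Set[str]) -> Tuple[bytes, List[Tuple[int, str]]]:
    """Sample the data part at the preset positions and drop known blocks.

    A sampled block whose hash is already in tag_set is treated as a
    duplicate data block; otherwise its hash is added to tag_set.
    """
    indications: List[Tuple[int, str]] = []
    for offset in SAMPLE_OFFSETS:
        block = data[offset:offset + SAMPLE_LEN]
        if len(block) < SAMPLE_LEN:
            continue  # data part too short to sample at this position
        digest = hashlib.sha256(block).hexdigest()
        if digest in tag_set:
            indications.append((offset, digest))
        else:
            tag_set.add(digest)

    # Rebuild the data part without the duplicate blocks.
    residual = bytearray()
    cursor = 0
    for offset, _ in indications:
        residual += data[cursor:offset]
        cursor = offset + SAMPLE_LEN
    residual += data[cursor:]
    return bytes(residual), indications


tags: Set[str] = set()
first = sample_dedup(b"HEADxxxxTAILyy", tags)   # nothing known yet, tags are recorded
second = sample_dedup(b"HEADzzzzTAILww", tags)  # both sampled blocks are duplicates
```

Because only hash values are compared, this sketch shares the limitation of any hash-only match: a collision would be treated as a duplicate unless the stored block content is also compared.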
- the first node also stores the protocol part of the historical message sent by the first node to the third node; the first repeated content also includes protocol information located in the protocol part of the first message, and the first indication information also includes a difference indication, which is used to indicate the difference between the protocol part of the first message and the protocol part of the first original message.
- the first message is a deduplicated message
- the first message carries second indication information of the second duplicate content
- the second original message exists in the historical message sent by the first node to the third node
- the payload part of the second original message includes the second duplicate content
- the first node sends the first message to the third node.
- the first message is a deduplicated message
- the first message carries second indication information for the second repeated content
- the second original message does not exist in the historical message sent by the first node to the third node
- the payload part of the second original message includes the second repeated content
- the first node obtains the second repeated content from the second data set according to the second indication information
- the second data set includes at least part of the payload part of the historical message received by the first node from the second node.
- the first node performs deduplication recovery processing on the payload part of the first message according to the second repeated content to obtain a third message
- the payload part of the third message includes the second repeated content.
- the first node sends a third message to the third node.
- the second repetitive content includes one or more repetitive data blocks
- the second indication information includes one or more indications
- the one or more indications in the second indication information correspond one-to-one to one or more repetitive data blocks in the second repetitive content
- each indication is used to indicate the hash value of the corresponding repetitive data block.
- each indication is also used to indicate the position of the corresponding repeated data block in the payload part of the original message corresponding to the first message
- the second data set includes the payload part of the historical message received by the first node from the second node.
- the implementation method for the first node to obtain the second repeated content from the second data set according to the second indication information includes: for each indication in the second indication information, the first node obtains the data block to be matched at the position of the payload part in the second data set according to the position indicated by the indication.
- the first node calculates the hash value of the data block to be matched.
- the first node determines the data block to be matched whose hash value is consistent with the hash value indicated by the indication as the repeated data block corresponding to the indication.
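The position-based lookup on the recovering node can be sketched as below. The sketch assumes fixed-size blocks and SHA-256, and `find_block` is a hypothetical helper: it reads the candidate block at the indicated position in each stored payload and accepts the first one whose hash matches the indication.

```python
import hashlib
from typing import List, Optional


def find_block(position: int, expected_hash: str, history: List[bytes],
               block_size: int = 4) -> Optional[bytes]:
    """Return the stored block at `position` whose hash equals expected_hash."""
    for old_payload in history:
        candidate = old_payload[position:position + block_size]
        if len(candidate) != block_size:
            continue  # this stored payload is too short at that position
        if hashlib.sha256(candidate).hexdigest() == expected_hash:
            return candidate
    return None  # no stored payload holds the indicated block


history = [b"AAAABBBBCCCC"]  # payloads received earlier from the upstream node
wanted = hashlib.sha256(b"CCCC").hexdigest()
found = find_block(8, wanted, history)
```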
- the payload portion includes a protocol portion and a data portion, and one or more repeated data blocks in the second repeated content are located in the data portion.
- the second data set includes a correspondence between a hash value of a historical data block and a historical data block, and the historical data block is a data block obtained by sampling a preset position of a data portion of a historical message received by the first node from the second node.
- An implementation method in which the first node obtains the second repeated content from the second data set according to the second indication information includes: the first node determines the historical data block in the second data set corresponding to the hash value indicated by the indication in the second indication information as the repeated data block corresponding to the indication.
- the indication in the second indication information is also used to indicate the position of the corresponding duplicate data block in the data portion of the original message corresponding to the first message.
- the implementation method of the first node performing deduplication recovery processing on the payload portion of the first message according to the second duplicate content includes: for each indication in the second indication information, the first node adds the duplicate data block corresponding to the indication at the position indicated by the indication in the data portion of the first message.
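Deduplication recovery for the sampling-based scheme can be sketched as follows, assuming the second data set is modelled as a dictionary from hash value to historical data block and each indication is a (position, hash) pair whose position refers to the original data part. `restore` is a hypothetical name.

```python
import hashlib
from typing import Dict, List, Tuple


def restore(residual: bytes, indications: List[Tuple[int, str]],
            block_store: Dict[str, bytes]) -> bytes:
    """Re-insert each removed block at the position given by its indication.

    Indications are applied in ascending position order, so every insert
    lands at the block's offset in the original data part.
    """
    rebuilt = bytearray(residual)
    for position, digest in sorted(indications):
        block = block_store[digest]          # KeyError if the block is unknown
        rebuilt[position:position] = block   # slice assignment inserts in place
    return bytes(rebuilt)


def digest_of(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


store = {digest_of(b"HEAD"): b"HEAD", digest_of(b"TAIL"): b"TAIL"}
message = restore(b"zzzzww", [(8, digest_of(b"TAIL")), (0, digest_of(b"HEAD"))], store)
```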
- the second repeated content also includes protocol information located in the protocol part
- the second indication information also includes a difference indication
- the difference indication is used to indicate the difference between the protocol part of the original message corresponding to the first message and the protocol part of the target message
- the target message is a historical message received by the first node from the second node, in which the data part and the data part of the original message have one or more repeated data blocks.
- the second data set also includes the protocol part of the message to which the historical data block belongs.
- the implementation process of the first node obtaining the second repeated content from the second data set according to the second indication information also includes: the first node obtains, from the second data set, the protocol part of the target message to which the one or more repeated data blocks belong.
- the implementation process of the first node performing deduplication recovery processing on the payload part of the first message according to the second duplicate content includes: the first node modifies the protocol part of the target message according to the difference indication, and uses the modified protocol part of the target message as the protocol part of the third message.
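The text does not specify how the difference indication is encoded. As one possible illustration, the sketch below assumes the two protocol parts have equal length and represents the difference as a list of (byte offset, new value) pairs; `make_diff` and `apply_diff` are hypothetical names.

```python
from typing import List, Tuple


def make_diff(current: bytes, reference: bytes) -> List[Tuple[int, int]]:
    """Sender side: list the bytes of `current` that differ from `reference`."""
    if len(current) != len(reference):
        raise ValueError("this sketch only handles equal-length protocol parts")
    return [(i, cur) for i, (cur, ref) in enumerate(zip(current, reference)) if cur != ref]


def apply_diff(reference: bytes, diff: List[Tuple[int, int]]) -> bytes:
    """Recovery side: modify the stored protocol part according to the diff."""
    rebuilt = bytearray(reference)
    for offset, value in diff:
        rebuilt[offset] = value
    return bytes(rebuilt)


stored_protocol = bytes([0x80, 0x60, 0x00, 0x01])    # protocol part of the target message
original_protocol = bytes([0x80, 0x60, 0x00, 0x02])  # protocol part of the original message
diff = make_diff(original_protocol, stored_protocol)
recovered = apply_diff(stored_protocol, diff)
```

When consecutive messages differ only in a few header fields, such a difference is much shorter than the protocol part itself, which is the point of carrying it instead of the full protocol information.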
- one or more local flow grouping sets are stored in the first node, and each local flow grouping set includes flow identifiers of multiple flows flowing through the first node.
- the first node parses the payload part of the first message. If the payload part of the first message carries a deduplication mark, the first node determines that the first message is a deduplication message. If the payload part of the first message does not carry a deduplication mark, the first node determines that the first message is a non-deduplication message.
- the first node adds the flow identifiers of the flows to which the messages with duplicate contents in the payload part belong among the multiple messages received belonging to different flows to the same local flow grouping set, and the senders of the different flows are all SFU servers.
- the first node sends first grouping information to the second node, where the first grouping information includes a correspondence between a node identifier of the first node and one or more local flow grouping sets.
- the first node stores one or more lower-level flow grouping sets corresponding to the third node, and each lower-level flow grouping set includes flow identifiers of multiple flows flowing through the third node.
- if the lower-level flow grouping sets corresponding to the third node include a target flow grouping set that contains the flow identifier of the flow to which the first message belongs, the first node determines whether the target historical messages sent to the third node include a message whose payload part has repeated content with the payload part of the first message, where the flow identifier of the flow to which each target historical message belongs is in the target flow grouping set. If none of the lower-level flow grouping sets corresponding to the third node includes the flow identifier of the flow to which the first message belongs, the first node sends the first message to the third node.
- the first node receives second grouping information sent by the third node, where the second grouping information includes a correspondence between a node identifier of the third node and one or more lower-level flow grouping sets.
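The use of flow grouping sets to narrow the search for repeated content can be sketched as below. The sketch assumes flow identifiers are strings, each lower-level flow grouping set is a Python set, and the history is a list of (flow identifier, payload) pairs; `target_group` and `candidate_history` are hypothetical names.

```python
from typing import List, Optional, Set, Tuple


def target_group(flow_id: str, groups: List[Set[str]]) -> Optional[Set[str]]:
    """Return the flow grouping set containing flow_id, if any."""
    for group in groups:
        if flow_id in group:
            return group
    return None


def candidate_history(flow_id: str, groups: List[Set[str]],
                      history: List[Tuple[str, bytes]]) -> Optional[List[bytes]]:
    """Select the historical payloads worth matching against.

    Returns None when no grouping set contains the flow, in which case the
    message is forwarded without any deduplication attempt.
    """
    group = target_group(flow_id, groups)
    if group is None:
        return None
    return [payload for fid, payload in history if fid in group]


groups = [{"f1", "f2"}, {"f3"}]                      # sets reported by the next hop
history = [("f1", b"a"), ("f3", b"b"), ("f2", b"c")]
to_match = candidate_history("f2", groups, history)
```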
- after the first node receives the fourth message whose destination port number is the SFU service port number, the first node sends a first node discovery message to the fourth node, the fourth node is the next hop of the fourth message on the first node, the destination of the fourth message is the SFU server, the first node discovery message carries the identifier of the SFU server, and the first node discovery message indicates that the first node is the subordinate node of the fourth node on the transmission path starting from the SFU server.
- the first node determines that the fourth node supports data deduplication.
- the first node receives a fourth node discovery message sent by the seventh node, the fourth node discovery message carries an identifier of the SFU server, and the fourth node discovery message indicates that the seventh node is a subordinate node of the first node on a transmission path starting from the SFU server.
- the first node determines that the seventh node supports data deduplication based on the fourth node discovery message, and sends a fourth node discovery response message corresponding to the fourth node discovery message to the seventh node, and the fourth node discovery response message indicates that the first node supports data deduplication.
- the node discovery process is triggered by a message whose destination is the SFU server
- if a node receives a node discovery response message sent by an upper-level node and also receives a node discovery message sent by a lower-level node, the node can determine that it is an intermediate node on the transmission path starting from the SFU server.
- after the first node receives the fifth message whose source port number is the SFU service port number, the first node sends a third node discovery message to the sixth node, the sixth node is the next hop of the fifth message on the first node, the sender of the fifth message is the SFU server, the third node discovery message carries the identifier of the SFU server, and the third node discovery message indicates that the first node is the upper node of the sixth node on the transmission path starting from the SFU server.
- the first node determines that the sixth node supports data deduplication.
- the first node receives a second node discovery message sent by the fifth node, the second node discovery message carries an identifier of the SFU server, and the second node discovery message indicates that the fifth node is the superior node of the first node on the transmission path starting from the SFU server.
- the first node determines that the fifth node supports data deduplication based on the second node discovery message, and sends a second node discovery response message corresponding to the second node discovery message to the fifth node, and the second node discovery response message indicates that the first node supports data deduplication.
- the node discovery process is triggered by a message whose sender is an SFU server
- if a node receives a node discovery message sent by an upper-level node and a node discovery response message sent by a lower-level node, the node can determine that it is an intermediate node on the transmission path starting from the SFU server.
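The outcome of node discovery can be summarised as a small decision function. The sketch below assumes the discovery exchange has already been reduced to two booleans: whether the upper-level neighbour and the lower-level neighbour on the path from the SFU server support data deduplication. `Role` and `classify_role` are hypothetical names, and the case in which neither neighbour supports deduplication is not covered by the text, so the sketch returns `None` for it.

```python
from enum import Enum
from typing import Optional


class Role(Enum):
    FIRST = "first"                # no deduplication-capable node upstream
    INTERMEDIATE = "intermediate"  # capable nodes on both sides
    LAST = "last"                  # no deduplication-capable node downstream


def classify_role(upper_supports: bool, lower_supports: bool) -> Optional[Role]:
    """Derive the node's role on the transmission path starting from the SFU server."""
    if upper_supports and lower_supports:
        return Role.INTERMEDIATE
    if lower_supports:
        return Role.FIRST
    if upper_supports:
        return Role.LAST
    return None  # assumption: the text does not describe this case


role = classify_role(upper_supports=True, lower_supports=True)
```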
- In a fourth aspect, a node is provided. The node includes multiple functional modules, and the multiple functional modules interact with each other to implement the method in the first aspect and its embodiments.
- the multiple functional modules can be implemented based on software, hardware, or a combination of software and hardware, and the multiple functional modules can be arbitrarily combined or divided based on specific implementations.
- the node is a first node
- the first node includes: an acquisition module, which is used to acquire a first message whose sender is an SFU server, and the first node is the first node that supports data deduplication on the transmission path of the first message.
- a processing module which is used to deduplicate the payload part of the first message to obtain a second message if there is a target message in the historical message sent by the first node to the second node, and the payload part of the target message has repeated content with the payload part of the first message, wherein the second message does not include the repeated content, and the second message carries a deduplication mark and indication information of the repeated content, wherein the deduplication mark is used to indicate that the second message is a deduplication message, and the second node is the next hop of the first message on the first node.
- a sending module which is used to send the second message to the second node.
- the repeated content includes one or more repeated data blocks
- the indication information includes one or more indications
- the one or more indications correspond one-to-one to the one or more repeated data blocks
- each indication is used to indicate a hash value of a corresponding repeated data block.
- a data set is stored in the first node, and the data set includes the payload part of the historical message sent by the first node to the second node, and the processing module is used to: perform content matching on the payload part of the first message and the payload part in the data set; if there is a target payload part in the data set that has a repeated data block with the payload part of the first message, determine that the target message exists in the historical message; for each repeated data block between the payload part of the first message and the target payload part, the first node calculates the hash value of the repeated data block; removes the repeated data block in the payload part of the first message, and adds an indication corresponding to the repeated data block to the payload part of the first message, wherein the indication is used to indicate the hash value of the repeated data block and the position of the repeated data block in the payload part of the first message.
- the processing module is used to: if there is no payload part with repeated data blocks with the payload part of the first message in the data set, determine that the target message does not exist in the historical message; and add the payload part of the first message to the data set.
- the payload part includes a protocol part and a data part, and the one or more repeated data blocks are located in the data part of the first message.
- a sampling tag set is stored in the first node, the sampling tag set includes a hash value of a historical data block, and the historical data block is a data block obtained by sampling a preset position of a data part of a historical message sent by the first node to the second node; the processing module is used to: sample the preset position of the data part of the first message to obtain a sampled data block; calculate the hash value of the sampled data block; if the sampling tag set includes the hash value of the sampled data block, determine that the target message exists in the historical message; use the sampled data block whose hash value belongs to the sampling tag set as a duplicate data block, remove the duplicate data block in the data part of the first message, and add an indication corresponding to the duplicate data block in the payload part of the first message, the indication being used to indicate the hash value of the duplicate data block.
- the first node obtains multiple sampled data blocks by sampling the preset positions of the data part of the first message, and the indication is also used to indicate the position of the repeated data block in the data part of the first message.
- the processing module is configured to: if the sampling tag set does not include the hash value of the sampling data block, determine that the target message does not exist in the historical message; and add the hash value of the sampling data block to the sampling tag set.
- the sampling tag set also includes the historical data block indicated by each hash value. The processing module is used to: if the sampling tag set includes the hash value of the sampling data block, perform content matching on the sampling data block and the historical data block indicated by that hash value; and when the content of the sampling data block is the same as that of the historical data block, determine that the target message exists in the historical message.
- the first node also stores the protocol part of the historical message sent by the first node to the second node;
- the repeated content also includes protocol information located in the protocol part of the first message, and the indication information also includes a difference indication, and the difference indication is used to indicate the difference between the protocol part of the first message and the protocol part of the target message.
- the first node stores one or more flow grouping sets corresponding to the second node, each of the flow grouping sets including flow identifiers of multiple flows flowing through the second node; the processing module is further used to, after the first node obtains the first message, if the flow grouping sets corresponding to the second node include a target flow grouping set that contains the flow identifier of the flow to which the first message belongs, determine whether the target message exists in the target historical message sent to the second node, where the flow identifier of the flow to which the target historical message belongs is in the target flow grouping set; the sending module is further used to send the first message to the second node if none of the flow grouping sets corresponding to the second node includes the flow identifier of the flow to which the first message belongs.
- the first node further includes a receiving module; the receiving module is used to receive grouping information sent by the second node, and the grouping information includes a correspondence between a node identifier of the second node and the one or more flow grouping sets.
- the first node is not the SFU server; the sending module is also used to send a first node discovery message to the third node after receiving a third message whose destination port number is the SFU service port number, the third node is the next hop of the third message on the first node, the destination of the third message is the SFU server, the first node discovery message carries an identifier of the SFU server, and the first node discovery message indicates that the first node is a subordinate node of the third node on the transmission path starting from the SFU server; the processing module is also used to determine that the first node is the first node that supports data deduplication on the transmission path starting from the SFU server in response to not receiving a first node discovery response message corresponding to the first node discovery message sent by the third node.
- the processing module is also used to: generate the first node discovery message based on the third message, the message header of the first node discovery message is the same as the message header of the third message, and the payload part of the first node discovery message carries an indication of the message type of the first node discovery message.
- the first node also includes a receiving module; the receiving module is used to receive a second node discovery message sent by a fourth node, the second node discovery message carries the identifier of the SFU server, and the second node discovery message indicates that the fourth node is a subordinate node of the first node on the transmission path starting from the SFU server; the processing module is also used to determine whether the fourth node supports data deduplication based on the second node discovery message; the sending module is also used to send a second node discovery response message corresponding to the second node discovery message to the fourth node, and the second node discovery response message indicates that the first node supports data deduplication.
- the sending module is also used to send a third node discovery message to the fifth node after receiving a fourth message whose source port number is the SFU service port number, the fifth node is the next hop of the fourth message on the first node, the sender of the fourth message is the SFU server, the third node discovery message carries an identifier of the SFU server, and the third node discovery message indicates that the first node is the superior node of the fifth node on the transmission path starting from the SFU server; the processing module is also used to determine that the fifth node supports data deduplication in response to receiving a third node discovery response message corresponding to the third node discovery message sent by the fifth node.
- the sending module is further configured to send the first message to the second node if the target message does not exist in the historical messages sent by the first node to the second node.
- In a fifth aspect, a node is provided. The node includes multiple functional modules, and the multiple functional modules interact with each other to implement the method in the second aspect and its respective embodiments.
- the multiple functional modules can be implemented based on software, hardware, or a combination of software and hardware, and the multiple functional modules can be arbitrarily combined or divided based on specific implementations.
- the node is a first node
- the first node includes: a receiving module, used to receive a first message sent by a second node, the sender of the first message is an SFU server, the first message carries a deduplication mark and indication information of repeated content, the deduplication mark is used to indicate that the first message is a deduplication message, and the first node is the last node that supports data deduplication on the transmission path of the first message; a processing module, used to determine that the first message is a deduplication message based on the deduplication mark; obtain the repeated content from a data set according to the indication information, the data set includes at least part of the payload part of the historical message received by the first node from the second node; perform deduplication recovery processing on the payload part of the first message according to the repeated content to obtain a second message, and the payload part of the second message includes the repeated content; a sending module, used to send the second message to a third node, and the third node is the next
- the repeated content includes one or more repeated data blocks
- the indication information includes one or more indications
- the one or more indications correspond one-to-one to the one or more repeated data blocks
- each indication is used to indicate a hash value of a corresponding repeated data block.
- each of the indications is also used to indicate the position of a corresponding duplicate data block in the payload part of the original message corresponding to the first message, and the data set includes the payload part of the historical message received by the first node from the second node; the processing module is used to: for each indication in the indication information, obtain the data block to be matched at the position of the payload part in the data set according to the position indicated by the indication; calculate the hash value of the data block to be matched; and determine the data block to be matched whose hash value is consistent with the hash value indicated by the indication as the duplicate data block corresponding to the indication.
- the payload portion includes a protocol portion and a data portion, and the one or more repeated data blocks are located in the data portion.
- the data set includes a correspondence between a hash value of a historical data block and the historical data block, wherein the historical data block is a data block obtained by sampling a preset position of a data portion of a historical message received by the first node from the second node; the processing module is used to determine a historical data block in the data set corresponding to the hash value indicated by the indication in the indication information as a duplicate data block corresponding to the indication.
- the indication is also used to indicate the position of the corresponding repeated data block in the data part of the original message corresponding to the first message; the processing module is used to add the repeated data block corresponding to the indication at the position indicated by the indication in the data part of the first message for each indication in the indication information.
- the repeated content also includes protocol information located in the protocol part, and the indication information also includes a difference indication, wherein the difference indication is used to indicate the difference between the protocol part of the original message corresponding to the first message and the protocol part of the target message, and the target message is a historical message received by the first node from the second node, in which the data part and the data part of the original message have the one or more repeated data blocks;
- the data set also includes the protocol part of the message to which the historical data blocks belong;
- the processing module is also used to: obtain the protocol part of the target message to which the one or more repeated data blocks belong from the data set; modify the protocol part of the target message according to the difference indication, and use the modified protocol part of the target message as the protocol part of the second message.
- the deduplication mark is located in the payload part of the first message, and one or more flow group sets are stored in the first node, each of the flow group sets including flow identifiers of multiple flows flowing through the first node; the processing module is also used to determine that there is a target flow group set including the flow identifier of the flow to which the first message belongs in the one or more flow group sets before the first node determines that the first message is a deduplication message based on the deduplication mark; and parse the payload part of the first message to obtain the deduplication mark.
- the processing module is used to: obtain the repeated content from the payload content of the target historical message in the data set according to the indication information, and the flow identifier of the flow to which the target historical message belongs belongs to the target flow grouping set.
- the receiving module is further used to receive a third message, and the flow identifier of the flow to which the third message belongs does not belong to any of the flow grouping sets; the sending module is further used to forward the third message.
- the receiving module is also used to receive a fourth message, and the flow identifier of the flow to which the fourth message belongs belongs to the flow group set; the processing module is also used to parse the payload part of the fourth message, determine that the payload part of the fourth message does not carry the deduplication mark; add at least part of the content of the payload part of the fourth message to the data set, and forward the fourth message.
- the processing module is further used to: add the flow identifiers of the flows to which the messages with duplicate content in the payload part belong among the multiple messages received belonging to different flows to the same flow grouping set, and the senders of the different flows are all the SFU servers.
- the sending module is further used to send grouping information to the second node, where the grouping information includes a correspondence between a node identifier of the first node and the one or more flow grouping sets.
- the sending module is also used to send a first node discovery message to the fourth node after receiving a fifth message whose destination port number is the SFU service port number, the fourth node is the next hop of the fifth message on the first node, the destination of the fifth message is the SFU server, the first node discovery message carries an identifier of the SFU server, and the first node discovery message indicates that the first node is a subordinate node of the fourth node on the transmission path starting from the SFU server; the processing module is also used to determine that the fourth node supports data deduplication in response to receiving a first node discovery response message corresponding to the first node discovery message sent by the fourth node.
- the processing module is also used to: generate the first node discovery message based on the fifth message, the message header of the first node discovery message is the same as the message header of the fifth message, and the payload part of the first node discovery message carries an indication of the message type of the first node discovery message.
- the receiving module is also used to receive a second node discovery message sent by the fifth node, the second node discovery message carries the identifier of the SFU server, and the second node discovery message indicates that the fifth node is the superior node of the first node on the transmission path starting from the SFU server; the processing module is also used to determine whether the fifth node supports data deduplication based on the second node discovery message; the sending module is also used to send a second node discovery response message corresponding to the second node discovery message to the fifth node, and the second node discovery response message indicates that the first node supports data deduplication.
- the sending module is also used to send a third node discovery message to the sixth node after receiving the sixth message whose source port number is the SFU service port number, the sixth node is the next hop of the sixth message on the first node, the sender of the sixth message is the SFU server, the third node discovery message carries the identifier of the SFU server, and the third node discovery message indicates that the first node is the superior node of the sixth node on the transmission path starting from the SFU server; the processing module is also used to determine that the first node is the last node supporting data deduplication on the transmission path starting from the SFU server in response to not receiving the third node discovery response message corresponding to the third node discovery message sent by the sixth node.
- a node in a sixth aspect, includes multiple functional modules, and the multiple functional modules interact with each other to implement the above-mentioned
- the plurality of functional modules can be implemented based on software, hardware or a combination of software and hardware, and the plurality of functional modules can be arbitrarily combined or divided based on specific implementation.
- the node is a first node
- the first node includes: a receiving module, used to receive a first message sent by a second node, the sender of the first message is an SFU server, and the first node is an intermediate node that supports data deduplication on the transmission path of the first message; a processing module, used to, if the first message is a non-deduplicated message, and there is a first original message in the historical message sent by the first node to the third node, and the payload part of the first original message and the payload part of the first message have a first repeated content, deduplicate the payload part of the first message to obtain a second message, the second message does not include the first repeated content, and the second message carries a deduplication mark and a first indication information of the first repeated content, the deduplication mark is used to indicate that the second message is a deduplicated message, and the third node is the next hop of the first message on the first node; a sending module, used to send a
- the sending module is further used to send the first message to the third node if the first message is a non-deduplicated message and the first original message does not exist in the historical messages sent by the first node to the third node.
- the first repeated content includes one or more repeated data blocks
- the first indication information includes one or more indications
- the one or more indications correspond one-to-one to the one or more repeated data blocks
- each indication is used to indicate a hash value of a corresponding repeated data block.
- a first data set is stored in the first node, and the first data set includes the payload part of the historical message sent by the first node to the third node.
- the processing module is also used to: perform content matching on the payload part of the first message and the payload part in the first data set; if there is a target payload part in the first data set that has a repeated data block with the payload part of the first message, determine that the first original message exists in the historical message; if there is no payload part in the first data set that has a repeated data block with the payload part of the first message, determine that the first original message does not exist in the historical message, and the first node adds the payload part of the first message to the first data set.
- the first data set includes the target payload part
- the processing module is used to: for each repeated data block shared by the payload part of the first message and the target payload part, calculate the hash value of the repeated data block; remove the repeated data block from the payload part of the first message, and add an indication corresponding to the repeated data block to the payload part of the first message, wherein the indication is used to indicate the hash value of the repeated data block and the position of the repeated data block in the payload part of the first message.
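The block-level deduplication described above can be illustrated with a minimal sketch. It assumes fixed-size blocks compared at the same offset in both payloads and uses SHA-256 as the hash; the block size, the hash function, and the name `dedup_payload` are illustrative assumptions and are not specified by the application.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical block size, chosen only for illustration


def dedup_payload(payload: bytes, target: bytes, block_size: int = BLOCK_SIZE):
    """Remove blocks of `payload` that also appear at the same offset in `target`.

    Returns (remaining_bytes, indications), where each indication is a
    (hash_hex, position) pair for one removed block.
    """
    remaining = bytearray()
    indications = []
    for pos in range(0, len(payload), block_size):
        block = payload[pos:pos + block_size]
        # Only full blocks that match the target payload count as repeated.
        if len(block) == block_size and target[pos:pos + block_size] == block:
            indications.append((hashlib.sha256(block).hexdigest(), pos))
        else:
            remaining += block
    return bytes(remaining), indications
```

In this sketch the position recorded in each indication is the offset of the block in the payload part of the first message before deduplication, which is what a downstream node needs to put the block back.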
- the payload part includes a protocol part and a data part, and the one or more repeated data blocks are located in the data part of the first message;
- the first node stores a sampling tag set, and the sampling tag set includes a hash value of a historical data block, and the historical data block is a data block obtained by sampling a preset position of the data part of the historical message sent by the first node to the third node;
- the processing module is further used to: sample the preset position of the data part of the first message to obtain a sampled data block; calculate the hash value of the sampled data block; if the sampling tag set includes the hash value of the sampled data block, determine that the first original message exists in the historical message; if the sampling tag set does not include the hash value of the sampled data block, determine that the first original message does not exist in the historical message, and the first node adds the hash value of the sampled data block to the sampling tag set.
- the sampling tag set includes a hash value of the sampling data block
- the processing module is used to: take the sampling data block whose hash value belongs to the sampling tag set as a repeated data block, remove the repeated data block from the data part of the first message, and add an indication corresponding to the repeated data block to the payload part of the first message, wherein the indication is used to indicate the hash value of the repeated data block.
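The sampling-based variant can be sketched as follows. The preset position (offset 0, length 8), the use of SHA-256, and the name `process_data_part` are assumptions made for illustration only.

```python
import hashlib

SAMPLE_OFFSET = 0  # hypothetical preset position in the data part
SAMPLE_LEN = 8     # hypothetical sampled block length


def process_data_part(data: bytes, tag_set: set):
    """Sample the preset position, then deduplicate or record the sample.

    Returns (data_part_to_send, indication). The indication is the hash of the
    removed block, or None when the message is forwarded without deduplication.
    """
    sampled = data[SAMPLE_OFFSET:SAMPLE_OFFSET + SAMPLE_LEN]
    digest = hashlib.sha256(sampled).hexdigest()
    if digest in tag_set:
        # Hash already seen: treat the sampled block as repeated and remove it.
        stripped = data[:SAMPLE_OFFSET] + data[SAMPLE_OFFSET + SAMPLE_LEN:]
        return stripped, digest
    # First time this block is seen: remember its hash and leave the data intact.
    tag_set.add(digest)
    return data, None
```

Storing only hash values in the sampling tag set keeps the state on the deduplicating node small; the node that restores the message is the one that needs the block contents.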
- the first node also stores the protocol part of the historical message sent by the first node to the third node;
- the first repeated content also includes protocol information located in the protocol part of the first message, and the first indication information also includes a difference indication, and the difference indication is used to indicate the difference between the protocol part of the first message and the protocol part of the first original message.
- the sending module is also used to send the first message to the third node if the first message is a deduplicated message, the first message carries second indication information for second repeated content, and there is a second original message in the historical message sent by the first node to the third node, and the payload part of the second original message includes the second repeated content.
- the processing module is also used to, if the first message is a deduplicated message, the first message carries second indication information for second repeated content, and no second original message whose payload part includes the second repeated content exists in the historical messages sent by the first node to the third node, obtain the second repeated content from a second data set according to the second indication information, the second data set including at least part of the payload part of the historical message received by the first node from the second node; perform deduplication and recovery processing on the payload part of the first message according to the second repeated content to obtain a third message, the payload part of the third message including the second repeated content; the sending module is also used to send the third message to the third node.
- the second repeated content includes one or more repeated data blocks
- the second indication information includes one or more indications
- the one or more indications correspond one-to-one to the one or more repeated data blocks
- each indication is used to indicate a hash value of a corresponding repeated data block.
- each of the indications is also used to indicate the position of the corresponding repeated data block in the payload part of the original message corresponding to the first message, and the second data set includes the payload part of the historical message received by the first node from the second node; the processing module is used to: for each indication in the second indication information, obtain the data block to be matched at the position of the payload part in the second data set according to the position indicated by the indication; calculate the hash value of the data block to be matched; and determine the data block to be matched whose hash value is consistent with the hash value indicated by the indication as the repeated data block corresponding to the indication.
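The position-based lookup described above can be sketched as follows, assuming that the second data set is a list of historical payload parts, that blocks have a fixed size, and that the hash is SHA-256. The name `find_repeated_block` is hypothetical.

```python
import hashlib


def find_repeated_block(indication, history, block_size):
    """Locate the repeated data block that an indication refers to.

    `indication` is a (hash_hex, position) pair; `history` is a list of payload
    parts previously received from the upstream node. Returns the matching
    block, or None if no stored payload has a block with that hash at that
    position.
    """
    digest, pos = indication
    for payload in history:
        candidate = payload[pos:pos + block_size]
        if len(candidate) != block_size:
            continue  # this stored payload is too short to hold a block here
        if hashlib.sha256(candidate).hexdigest() == digest:
            return candidate
    return None
```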
- the payload part includes a protocol part and a data part, and the one or more repeated data blocks are located in the data part;
- the second data set includes the correspondence between the hash value of the historical data block and the historical data block, and the historical data block is a data block obtained by sampling the preset position of the data part of the historical message received by the first node from the second node;
- the processing module is used to determine the historical data block in the second data set corresponding to the hash value indicated by the indication in the second indication information as the repeated data block corresponding to the indication.
- the indication is also used to indicate the position of the corresponding repeated data block in the data part of the original message corresponding to the first message; the processing module is used to add the repeated data block corresponding to the indication at the position indicated by the indication in the data part of the first message for each indication in the second indication information.
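Restoring the data part from the indications can be sketched as follows. The sketch assumes that the second data set is a dictionary from hash value to historical data block and that each indication is a (hash, position) pair whose position refers to the data part of the original message; the name `restore_data_part` is hypothetical.

```python
def restore_data_part(stripped: bytes, indications, block_store: dict) -> bytes:
    """Re-insert removed blocks into a deduplicated data part.

    Blocks are inserted in ascending order of their original position, so that
    every byte before a given position has already been restored when the
    block for that position is inserted.
    """
    data = bytearray(stripped)
    for digest, pos in sorted(indications, key=lambda item: item[1]):
        data[pos:pos] = block_store[digest]  # slice assignment inserts in place
    return bytes(data)
```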
- the second repeated content also includes protocol information located in the protocol part, and the second indication information also includes a difference indication, wherein the difference indication is used to indicate the difference between the protocol part of the original message corresponding to the first message and the protocol part of the target message, and the target message is a historical message received by the first node from the second node, in which the data part and the data part of the original message have the one or more repeated data blocks;
- the second data set also includes the protocol part of the message to which the historical data blocks belong;
- the processing module is also used to: obtain the protocol part of the target message to which the one or more repeated data blocks belong from the second data set; modify the protocol part of the target message according to the difference indication, and use the modified protocol part of the target message as the protocol part of the third message.
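One possible form of the difference indication is a list of byte offsets and values, sketched below. This encoding, the equal-length restriction, and the names `make_difference` and `apply_difference` are assumptions for illustration; the application does not specify how the difference is encoded.

```python
def make_difference(original: bytes, target: bytes):
    """Describe how `original` differs from `target` as (offset, byte) pairs.

    This sketch only handles protocol parts of equal length.
    """
    if len(original) != len(target):
        raise ValueError("sketch assumes protocol parts of equal length")
    return [(i, original[i]) for i in range(len(original)) if original[i] != target[i]]


def apply_difference(target: bytes, difference) -> bytes:
    """Rebuild the original protocol part from the stored target protocol part."""
    restored = bytearray(target)
    for offset, value in difference:
        restored[offset] = value
    return bytes(restored)
```

When two flows carry the same media and their protocol parts differ in only a few fields, such a difference is much shorter than the protocol part itself.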
- one or more local flow group sets are stored in the first node, and each of the local flow group sets includes flow identifiers of multiple flows flowing through the first node; the processing module is also used to: after determining that there is a target flow group set including the flow identifier of the flow to which the first message belongs in the one or more local flow group sets, parse the payload part of the first message; if the payload part of the first message carries a deduplication mark, determine that the first message is a deduplication message; if the payload part of the first message does not carry a deduplication mark, determine that the first message is a non-deduplication message.
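The classification step can be sketched as follows. The two-byte marker value and its placement at the start of the payload part are hypothetical, as are the function name and the returned labels; the application only states that the payload part carries a deduplication mark.

```python
DEDUP_MARK = b"\xde\xd0"  # hypothetical marker value and placement


def classify(flow_id, payload: bytes, local_flow_groups) -> str:
    """Classify a message using the local flow group sets and the deduplication mark."""
    # The payload is only parsed for flows that belong to some local flow group set.
    if not any(flow_id in group for group in local_flow_groups):
        return "not-grouped"
    if payload.startswith(DEDUP_MARK):
        return "deduplicated"
    return "non-deduplicated"
```

Checking the flow identifier first means that traffic outside every flow group set is never parsed.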
- the processing module is further used to: among multiple received messages belonging to different flows, add the flow identifiers of the flows whose messages have duplicate content in the payload part to the same local flow grouping set, where the senders of the different flows are all the SFU server.
- the sending module is further used to send first grouping information to the second node, where the first grouping information includes a correspondence between a node identifier of the first node and the one or more local flow grouping sets.
- the first node stores one or more lower-level flow group sets corresponding to the third node, each of the lower-level flow group sets including flow identifiers of multiple flows flowing through the third node.
- the processing module is further used to, after the first node receives the first message sent by the second node, if a target flow group set including the flow identifier of the flow to which the first message belongs exists in the lower-level flow group sets corresponding to the third node, determine whether the target historical messages sent to the third node include a message whose payload part has repeated content with the payload part of the first message, where the flow identifiers of the flows to which the target historical messages belong are in the target flow group set; the sending module is further used to send the first message to the third node if none of the lower-level flow group sets corresponding to the third node includes the flow identifier of the flow to which the first message belongs.
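The use of the lower-level flow group sets to narrow the search can be sketched as follows. The sketch assumes that the history sent to the third node is kept as (flow identifier, payload part) pairs; the name `candidate_history` is hypothetical.

```python
def candidate_history(flow_id, lower_groups, sent_history):
    """Select the historical payloads worth comparing against a new message.

    Returns the payload parts of historical messages whose flow is in the same
    lower-level flow group set as `flow_id`, or None when no group contains the
    flow, in which case the message is forwarded without deduplication.
    """
    for group in lower_groups:
        if flow_id in group:
            return [payload for flow, payload in sent_history if flow in group]
    return None
```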
- the receiving module is further used to receive second grouping information sent by the third node, where the second grouping information includes a correspondence between a node identifier of the third node and the one or more lower-level flow grouping sets.
- the sending module is also used to send a first node discovery message to a fourth node after receiving a fourth message whose destination port number is the SFU service port number, wherein the fourth node is the next hop of the fourth message on the first node, and the destination of the fourth message is the SFU server.
- the first node discovery message carries an identifier of the SFU server, and the first node discovery message indicates that the first node is a subordinate node of the fourth node on a transmission path starting from the SFU server; the processing module is also used to determine that the fourth node supports data deduplication in response to receiving a first node discovery response message corresponding to the first node discovery message sent by the fourth node.
- the receiving module is also used to receive a second node discovery message sent by the fifth node, the second node discovery message carries the identifier of the SFU server, and the second node discovery message indicates that the fifth node is the superior node of the first node on the transmission path starting from the SFU server; the processing module is also used to determine whether the fifth node supports data deduplication based on the second node discovery message; the sending module is also used to send a second node discovery response message corresponding to the second node discovery message to the fifth node, and the second node discovery response message indicates that the first node supports data deduplication.
- the sending module is also used to send a third node discovery message to the sixth node after receiving a fifth message whose source port number is the SFU service port number, the sixth node is the next hop of the fifth message on the first node, the sender of the fifth message is the SFU server, the third node discovery message carries an identifier of the SFU server, and the third node discovery message indicates that the first node is the superior node of the sixth node on the transmission path starting from the SFU server; the processing module is also used to determine that the sixth node supports data deduplication in response to receiving a third node discovery response message corresponding to the third node discovery message sent by the sixth node.
- the receiving module is also used to receive a fourth node discovery message sent by the seventh node, the fourth node discovery message carries the identifier of the SFU server, and the fourth node discovery message indicates that the seventh node is a subordinate node of the first node on the transmission path starting from the SFU server; the processing module is also used to determine whether the seventh node supports data deduplication based on the fourth node discovery message; the sending module is also used to send a fourth node discovery response message corresponding to the fourth node discovery message to the seventh node, and the fourth node discovery response message indicates that the first node supports data deduplication.
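How a node might derive a node discovery message from a triggering message can be sketched as follows. The header is modeled as a dictionary with `src_port` and `dst_port` keys, and the class and field names are hypothetical; the sketch only reflects that the discovery message reuses the message header of the triggering message, carries the identifier of the SFU server, and indicates the role of the sending node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscoveryMessage:
    header: dict      # copied from the triggering message
    sfu_id: str       # identifier of the SFU server
    sender_role: str  # "subordinate" or "superior", relative to the next hop


def build_discovery(trigger_header: dict, sfu_id: str, sfu_port: int):
    """Build a discovery message for the next hop of the triggering message."""
    if trigger_header["dst_port"] == sfu_port:
        # Traffic toward the SFU server: the next hop is nearer the server.
        role = "subordinate"
    elif trigger_header["src_port"] == sfu_port:
        # Traffic from the SFU server: the next hop is farther from the server.
        role = "superior"
    else:
        return None  # not SFU traffic, so no discovery is triggered
    return DiscoveryMessage(dict(trigger_header), sfu_id, role)
```

Because the header is reused, the discovery message follows the same forwarding path as the traffic that triggered it.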
- a data transmission system comprising: an SFU server and a plurality of nodes in a communication network, the plurality of nodes comprising a first node and a second node, the first node being located between the SFU server and the second node.
- the SFU server is used to send a message to the first node
- the first node is used to execute the method in the first aspect and its various embodiments
- the second node is used to execute the method in the second aspect and its various embodiments.
- the multiple nodes further include a third node, the third node is located between the first node and the second node, and the third node is used to execute the method in the above third aspect and its various embodiments.
- another data transmission system comprising: an SFU server and a first node in a communication network.
- the SFU server is used to execute the method in the first aspect and its respective embodiments
- the first node is used to execute the method in the second aspect and its respective embodiments.
- the communication network further includes a second node, the second node is located between the SFU server and the first node, and the second node is used to execute the method in the above third aspect and its various embodiments.
- a communication node comprising: a processor and a memory.
- the memory is used to store a computer program, and the computer program includes program instructions.
- the processor is used to call the computer program to implement the method in the first aspect and its various embodiments, or to implement the method in the second aspect and its various embodiments, or to implement the method in the third aspect and its various embodiments.
- a computer-readable storage medium on which instructions are stored.
- when the instructions are executed by a processor, the method of the above-mentioned first aspect and its various embodiments is implemented, or the method of the above-mentioned second aspect and its various embodiments is implemented, or the method of the above-mentioned third aspect and its various embodiments is implemented.
- a computer program product comprising a computer program, which, when executed by a processor, implements the method of the above-mentioned first aspect and its various embodiments, or implements the method of the above-mentioned second aspect and its various embodiments, or implements the method of the above-mentioned third aspect and its various embodiments.
- a chip which includes a programmable logic circuit and/or program instructions.
- when the chip is running, it implements the method in the above-mentioned first aspect and its various embodiments, or implements the method in the above-mentioned second aspect and its various embodiments, or implements the method in the above-mentioned third aspect and its various embodiments.
- FIG1 is a schematic diagram of an SFU communication architecture provided in an embodiment of the present application.
- FIG2 is a schematic diagram of the structure of a data transmission system provided in an embodiment of the present application.
- FIG3 is a schematic diagram of the structure of another data transmission system provided in an embodiment of the present application.
- FIG4 is a schematic diagram of the structure of another data transmission system provided in an embodiment of the present application.
- FIG5 is a schematic diagram of the structure of another data transmission system provided in an embodiment of the present application.
- FIG6 is a schematic diagram of a flow chart of a data transmission method provided in an embodiment of the present application.
- FIG7 is a schematic diagram of a message deduplication process provided in an embodiment of the present application.
- FIG8 is a schematic diagram of another message deduplication process provided in an embodiment of the present application.
- FIG9 is a schematic diagram of another message deduplication process provided in an embodiment of the present application.
- FIG10 is a schematic flow chart of another data transmission method provided in an embodiment of the present application.
- FIG11 is a flow chart of another data transmission method provided in an embodiment of the present application.
- FIG12 is a schematic diagram of the structure of a communication node provided in an embodiment of the present application.
- FIG. 13 is a schematic diagram of the structure of another communication node provided in an embodiment of the present application.
- FIG14 is a schematic diagram of the structure of another communication node provided in an embodiment of the present application.
- FIG. 15 is a schematic diagram of the hardware structure of a communication device provided in an embodiment of the present application.
- the audio and video data of the same user usually needs to be distributed to multiple recipients.
- the audio and video data of the conference host needs to be distributed to multiple conference members.
- the audio and video data of the anchor user needs to be distributed to multiple audience users.
- the current mainstream communication scenarios such as video conferencing and live broadcasting usually adopt the SFU communication architecture.
- the source user sends the audio and video data to the SFU server, and the SFU server is responsible for completing the replication and distribution of the audio and video data.
- the SFU server can be a video conferencing server.
- the SFU server can be a live broadcasting server.
- FIG1 is a schematic diagram of an SFU communication architecture provided by an embodiment of the present application.
- the SFU communication architecture includes a source user, an SFU server, and multiple receivers, such as receiver a, receiver b, and receiver c.
- the implementation process of the source user sending audio and video data to multiple receivers includes: the source user sends audio and video data to the SFU server; the SFU server encapsulates the audio and video data in the payload part of multiple messages respectively, and the destinations of the multiple messages are the multiple receivers respectively; the SFU server sends corresponding messages to the multiple receivers respectively through the communication network, such as the SFU server sends message a to receiver a, sends message b to receiver b, and sends message c to receiver c through the communication network.
- because the nodes in the communication network need to forward the messages sent by the SFU server to different receivers respectively, and the payload part of each message carries the complete audio and video data from the source user, the data volume of each message is large. As a result, the nodes in the communication network forward multiple large messages, so the transmitted data volume is large, which in turn leads to a large network bandwidth overhead.
- the present application proposes a technical solution.
- for an original message whose sender is the SFU server, when the first node or an intermediate node that supports data deduplication on the transmission path of the original message obtains the original message, it determines whether it has sent, to the next hop of the original message, a historical message whose payload part has duplicate content with the payload part of the original message.
- if the node has sent such a historical message to the next hop of the original message, the node performs deduplication processing on the payload part of the original message to obtain a deduplication message, which does not include the duplicate content and carries a deduplication mark indicating that it is a deduplication message, and then the node sends the deduplication message to the next hop of the original message.
- the intermediate node or the last node that supports data deduplication on the transmission path of the original message performs deduplication recovery processing on the payload part of the deduplication message, restores the original message, and then continues to send the original message to the destination.
- the original messages in this application all refer to non-deduplication messages.
- the present application can transmit deduplicated messages in the communication network in place of the original messages. Without affecting user services, deduplication and recovery processing reduce the amount of message data transmitted, thereby reducing network bandwidth overhead, improving transmission efficiency and quality, and reducing user costs. For example, in video conferencing scenarios or live broadcast scenarios that use the SFU communication architecture, when multiple recipients in the same video conference or live broadcast receive audio and video data from the same user, the received data overlaps heavily. By applying the technical solution of the present application, data deduplication can be performed on concurrent traffic sent by the same user to multiple recipients, which reduces the duplicate traffic sent by the same user to different recipients and the amount of message data transmitted, thereby reducing network bandwidth overhead.
- the data transmission system provided in the embodiment of the present application adopts the SFU communication architecture and can be applied to a variety of communication scenarios, such as video conferencing scenarios or live broadcast scenarios.
- the SFU server can be a video conferencing server.
- the SFU server can be a live broadcast server.
- a typical application scenario of the present application is a cloud-to-branch scenario.
- the SFU server is deployed on the cloud, that is, the cloud is where the SFU server is located.
- the branch is usually deployed in a local area network, and the branch is where the user is located.
- the cloud and the branch can communicate through a wide area network.
- the nodes that support data deduplication can be independent hardware devices, or they can also be deployed in the form of software on a server or network device.
- the application scenario of the present application is not limited to the cloud-to-branch scenario, and can also be deployed according to actual needs. For example, if multiple nodes that support data deduplication are deployed in a local area network, data deduplication and data recovery can be performed on the traffic in the local area network. For example, if multiple nodes supporting data deduplication are deployed in a certain section of a wide area network, data deduplication and data recovery can be performed on the traffic in this section of the network.
- the present application can be used to perform data deduplication and data recovery on various traffic that conforms to the SFU communication architecture.
- the present application is not limited to deploying only two layers of nodes that support data deduplication, but can also support the deployment of multiple layers of nodes that support data deduplication.
- the node close to the SFU server can be called the top node
- the node close to the user can be called the bottom node
- the node between the top node and the bottom node can be called the intermediate node.
- the data transmission system provided in the embodiment of the present application includes at least a top node and a bottom node.
- the top node has a data deduplication function and is responsible for deduplicating data for the traffic sent by the SFU server to the user.
- the bottom node has a data recovery function and is responsible for recovering data for the traffic sent by the SFU server to the user.
- the data transmission system provided in the embodiment of the present application may also include an intermediate node.
- the intermediate node has a data deduplication function and a data recovery function, and is responsible for deduplicating data or recovering data for the traffic sent by the SFU server to the user.
- FIG. 2 is a structural diagram of a data transmission system provided in an embodiment of the present application.
- the data transmission system 200 adopts a one-to-one branch two-layer node deployment model.
- the data transmission system 200 includes an SFU server 201 and nodes 202A and 202B in a communication network.
- node 202A is deployed close to the SFU server 201, that is, node 202A is the top node, that is, node 202A is the first node that supports data deduplication on the transmission path starting from the SFU server 201.
- Node 202B is deployed close to branch 21, that is, node 202B is the bottom node, that is, node 202B is the last node that supports data deduplication on the transmission path from the SFU server 201 to the branch 21.
- FIG. 3 is a schematic diagram of the structure of another data transmission system provided by an embodiment of the present application.
- a data transmission system 300 adopts a one-to-many branch two-layer node deployment model.
- the data transmission system 300 includes an SFU server 301 and nodes 302A, 302B, 302C, and 302D in a communication network.
- node 302A is deployed close to the SFU server 301, that is, node 302A is a top node, that is, node 302A is the first node that supports data deduplication on the transmission path starting from the SFU server 301.
- Node 302B is deployed close to branch 31, that is, node 302B is the bottom node corresponding to branch 31, that is, node 302B is the last node that supports data deduplication on the transmission path from the SFU server 301 to branch 31.
- Node 302C is deployed close to branch 32, that is, node 302C is the bottom node corresponding to branch 32, that is, node 302C is the last node that supports data deduplication on the transmission path from the SFU server 301 to branch 32.
- Node 302D is deployed close to branch 33, that is, node 302D is the bottom node corresponding to branch 33, that is, node 302D is the last node supporting data deduplication on the transmission path from SFU server 301 to branch 33.
- FIG. 4 is a structural diagram of another data transmission system provided by an embodiment of the present application.
- a data transmission system 400 adopts a multi-layer node multi-branch deployment model.
- the data transmission system 400 includes an SFU server 401 and nodes 402A, 402B, 402C, 402D, 402E, 402F, and 402G in a communication network.
- node 402A is deployed close to the SFU server 401, that is, node 402A is a top node, that is, node 402A is the first node that supports data deduplication on the transmission path starting from the SFU server 401.
- Node 402B is deployed close to branch 41, that is, node 402B is the bottom node corresponding to branch 41, that is, node 402B is the last node that supports data deduplication on the transmission path from the SFU server 401 to the branch 41.
- Node 402D is deployed close to branch 45, that is, node 402D is the bottom node corresponding to branch 45, that is, node 402D is the last node that supports data deduplication on the transmission path from SFU server 401 to branch 45.
- Node 402E is deployed close to branch 42, that is, node 402E is the bottom node corresponding to branch 42, that is, node 402E is the last node supporting data deduplication on the transmission path from SFU server 401 to branch 42.
- Node 402F is deployed close to branch 43, that is, node 402F is the bottom node corresponding to branch 43, that is, node 402F is the last node supporting data deduplication on the transmission path from SFU server 401 to branch 43.
- Node 402G is deployed close to branch 44, that is, node 402G is the bottom node corresponding to branch 44, that is, node 402G is the last node supporting data deduplication on the transmission path from SFU server 401 to branch 44.
- Node 402C is deployed between node 402A and nodes 402E, 402F and 402G, that is, node 402C is the intermediate node corresponding to branch 42, branch 43 and branch 44. In other words, node 402C is an intermediate node that supports data deduplication on each of the transmission paths from SFU server 401 to branch 42, to branch 43 and to branch 44.
- FIG. 5 is a structural diagram of another data transmission system provided in an embodiment of the present application.
- a data transmission system 500 includes an SFU server 501 and a node 502 in a communication network.
- the SFU server 501 is a top node, that is, the SFU server 501 is the first node that supports data deduplication on the transmission path starting from the SFU server 501.
- Node 502 is deployed close to branch 51, that is, node 502 is the bottom node corresponding to branch 51, that is, node 502 is the last node that supports data deduplication on the transmission path from the SFU server 501 to branch 51.
- an intermediate node that supports data deduplication may also be deployed between the SFU server 501 and the node 502.
- for the deployment of such an intermediate node, reference may be made to the data transmission system 400 shown in FIG. 4, and the details are not repeated here.
- the node supporting data deduplication in the communication network may be an independent hardware device, or may be deployed in the form of software to a server or network device in an existing network.
- Network devices include but are not limited to routers or switches.
- the embodiment of the present application does not limit the type of communication network.
- the communication network may be a dedicated line network, or it may be a non-dedicated line network.
- the communication network may include a tunnel, or it may not include a tunnel.
- the communication network may be a wide area network or a local area network, etc.
- the communication scenario involved in the embodiments of the present application adopts scalable video coding (SVC).
- SVC is a type of video coding, which is implemented by layered encoding and selective transmission of video data.
- SVC is used in the communication scenario as follows: the source host sends a multi-layer stream to the SFU server, where the multi-layer stream includes a base layer stream and an enhancement layer stream, and the SFU server then selects and distributes layers of the stream according to the capabilities of each receiving host. For example, for a receiving host with strong decoding capabilities, the SFU server sends both the base layer stream and the enhancement layer stream. For a receiving host with weak decoding capabilities, the SFU server sends only the base layer stream.
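- The selection rule above can be sketched as follows. This is an illustrative sketch only: the `Receiver` type, its `strong_decoder` flag and the function name `select_layers` are hypothetical and do not come from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Receiver:
    name: str
    strong_decoder: bool  # hypothetical capability flag


def select_layers(receiver: Receiver) -> list[str]:
    """Return the SVC layers the SFU server forwards to one receiving host."""
    layers = ["base"]  # every receiving host gets the base layer stream
    if receiver.strong_decoder:
        layers.append("enhancement")  # only capable hosts get the extra layer
    return layers
```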
- the communication scenario involved in the embodiments of the present application supports end-to-end encryption (E2EE).
- the source host uses the advanced encryption standard (AES) key to encrypt the business data, and then sends the encrypted ciphertext to the SFU server.
- the SFU server carries the ciphertext in the message and distributes it to the receiving host.
- the receiving host uses the AES key to decrypt the ciphertext to obtain the business data.
- the multiple receiving hosts can use the same AES key. For example, in a video conferencing scenario, the conference host can distribute the AES key to multiple conference members uniformly through signaling.
- the communication scenario involved in the embodiment of the present application adopts the SFU communication architecture, adopts SVC, and supports E2EE.
- the data transmission in this communication scenario has the following characteristics: signaling data is transmitted based on the transmission control protocol (TCP) and the transport layer security (TLS) protocol, and business data (such as audio and video data) is transmitted based on the user datagram protocol (UDP) with AES encryption.
- the deduplication and recovery processing of a message differs depending on the type of node that handles it.
- the present application describes how each of the three types of nodes transmits messages through the following three embodiments.
- FIG. 6 is a flow chart of a data transmission method 600 provided in an embodiment of the present application.
- the first node is referred to as node 11, the second node is referred to as node 12, the first message is referred to as message 11, and the second message is referred to as message 12.
- the method 600 can be applied to the data transmission system 200 shown in FIG. 2, in which case node 11 in the method 600 is node 202A, and node 12 is node 202B.
- the method 600 can be applied to the data transmission system 300 shown in FIG. 3, in which case node 11 in the method 600 is node 302A, and node 12 is node 302B, node 302C, or node 302D.
- the method 600 can be applied to the data transmission system 400 shown in FIG. 4, in which case node 11 in the method 600 is node 402A, and node 12 is node 402B, node 402C, or node 402D.
- the method 600 can be applied to the data transmission system 500 shown in FIG. 5, in which case node 11 in the method 600 is the SFU server 501, and node 12 is node 502.
- the method 600 includes but is not limited to the following steps 601 to 603.
- Step 601 Node 11 obtains message 11 whose sender is the SFU server. Node 11 is the first node on the transmission path of message 11 that supports data deduplication.
- message 11 is the original message, that is, the message is not deduplicated.
- the sender of message 11 is the SFU server
- node 11 is the first node on the transmission path of message 11 that supports data deduplication, that is, node 11 is the first node on the transmission path starting from the SFU server and ending at the destination of message 11 that supports data deduplication.
- the source Internet Protocol (IP) address of message 11 can be the IP address of the SFU server.
- the source IP address of message 11 can be represented by the public network address of the SFU server.
- the source IP address of message 11 can be the address obtained after network address translation (NAT) is performed on the IP address of the SFU server.
- the source IP address of message 11 can be represented by the public network address corresponding to the private network address of the SFU server.
- if node 11 is an SFU server, node 11 obtaining message 11 may mean that node 11 generates message 11.
- if node 11 is not an SFU server, for example, node 11 is a network device connected to an SFU server, node 11 obtaining message 11 may mean that node 11 receives message 11 sent by the SFU server.
- node 11 may determine whether the sender of the message is an SFU server according to the source port number carried by the message.
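- A minimal sketch of such a source-port check for a UDP message is given below. It relies only on the standard UDP header layout (source port, destination port, length, checksum, each 16 bits). The port value in `SFU_SOURCE_PORTS` and the function names are assumptions for illustration; the application does not specify which ports identify an SFU server.

```python
import struct

# Hypothetical set of source ports that identify the SFU server.
SFU_SOURCE_PORTS = frozenset({10000})


def udp_source_port(udp_segment: bytes) -> int:
    """Read the source port from the 8-byte UDP header."""
    if len(udp_segment) < 8:
        raise ValueError("UDP segment shorter than its header")
    src_port, _dst_port, _length, _checksum = struct.unpack("!HHHH", udp_segment[:8])
    return src_port


def sent_by_sfu(udp_segment: bytes) -> bool:
    """Decide whether the sender is an SFU server from the source port alone."""
    return udp_source_port(udp_segment) in SFU_SOURCE_PORTS
```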
- Step 602 If the target message exists in the historical message sent by node 11 to node 12, and the payload part of the target message has repeated content with the payload part of message 11, node 11 deduplicates the payload part of message 11 to obtain message 12.
- Message 12 does not include the repeated content, and message 12 carries a deduplication mark and indication information of the repeated content.
- the deduplication mark is used to indicate that message 12 is a deduplication message, and node 12 is the next hop of message 11 on node 11.
- the payload part is a UDP payload or a TCP payload.
- the payload part includes a protocol part and a data part.
- the protocol part can be used to carry application layer protocols, such as real-time transport protocol (RTP), file transfer protocol (FTP) or hypertext transfer protocol (HTTP).
- the data part can be used to carry application layer data, such as signaling data or service data.
- the repeated content between the payload portion of the target message and the payload portion of message 11 includes one or more repeated data blocks
- the indication information of the repeated content includes one or more indications
- the one or more indications correspond one-to-one to one or more repeated data blocks in the repeated content.
- Each indication is used to indicate the hash value of the corresponding repeated data block, and specifically, each indication includes the hash value of the corresponding repeated data block.
- the hash value of the repeated data block can be calculated based on the data content and length of the repeated data block, or part of the data content of the repeated data block can be used as the hash value of the repeated data block. The embodiment of the present application does not limit the calculation method of the hash value of the repeated data block.
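- As one possible instance of a hash computed from both the content and the length of a data block, the sketch below feeds the length and then the content into SHA-256 and keeps the first 8 bytes. The choice of SHA-256, the 4-byte length prefix and the truncation to 8 bytes are all assumptions made for illustration, since the embodiment leaves the calculation method open.

```python
import hashlib


def block_hash(block: bytes) -> bytes:
    """Hash a data block over its length and its content (assumed scheme)."""
    digest = hashlib.sha256()
    digest.update(len(block).to_bytes(4, "big"))  # length of the block
    digest.update(block)                          # content of the block
    return digest.digest()[:8]                    # truncated to keep indications small
```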
- node 11 needs to execute a judgment process, which is used to determine whether there is a target message in the historical message sent by itself to node 12 whose payload part has repeated content with the payload part of message 11.
- the historical message here refers to the original message sent by node 11 to node 12 before executing the judgment process.
- the following embodiments of the present application provide two possible implementation methods for node 11 to execute the judgment process, including possible implementation method A1 and possible implementation method A2.
- In possible implementation method A1, one or more repeated data blocks in the above-mentioned repeated content are located in the data part and/or the protocol part of the payload part of the message 11.
- Node 11 stores a data set, which includes the payload part of the historical message sent by node 11 to node 12.
- Node 11 executes a judgment process, including: node 11 performs content matching between the payload part of message 11 and the payload parts in the data set. If the data set contains a target payload part that shares repeated data blocks with the payload part of message 11, node 11 determines that there is a target message in the historical messages sent to node 12.
- Otherwise, node 11 determines that there is no target message in the historical messages sent to node 12.
- node 11 can add the payload part of message 11 to the data set to obtain an updated data set.
- the updated data set can be used by node 11 to perform deduplication processing on the message whose sender is the SFU server obtained subsequently.
- node 11 may perform content matching between the payload part of message 11 and a payload part in the data set by sampling one or more positions of the payload part of message 11 and sampling one or more positions of the payload part in the data set.
- If the sampled content at position 1 of the payload part of message 11 is the same as the sampled content at position 2 of a payload part in the data set, node 11 goes on to exactly match the content before and after position 1 of the payload part of message 11 against the content before and after position 2 of the payload part in the data set, so as to determine the repeated data blocks shared by the two payload parts.
- the specific implementation method of content matching is not limited in the embodiment of the present application.
- the content of the payload part of message 11 is "44336655", and the content of a payload part in the data set is "11332255".
- Node 11 matches the content of the payload part of message 11 with the payload part in the data set, and determines that the payload part of message 11 and the payload part in the data set have repeated data blocks "33" and "55". Then, node 11 can determine the payload part in the data set as the target payload part, and determine that the target message exists in the historical messages sent by node 11 to node 12.
- the target message is the historical message including the target payload part.
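- The judgment process of the example above can be sketched as follows. Note that this sketch swaps in Python's standard `difflib.SequenceMatcher` to find common blocks, rather than the sample-then-extend matching described earlier, since the embodiment does not limit the matching method. The function names and the `min_len` threshold are hypothetical, and offsets are 0-based, whereas the text counts bytes from 1.

```python
from difflib import SequenceMatcher


def repeated_blocks(payload: bytes, historical: bytes, min_len: int = 2) -> list[tuple[int, int]]:
    """Return (offset in payload, length) for each block shared with historical."""
    matcher = SequenceMatcher(None, payload, historical, autojunk=False)
    return [(m.a, m.size) for m in matcher.get_matching_blocks() if m.size >= min_len]


def find_target(payload: bytes, data_set: list[bytes], min_len: int = 2):
    """Return (target payload part, repeated blocks), or None if nothing repeats."""
    for historical in data_set:
        blocks = repeated_blocks(payload, historical, min_len)
        if blocks:
            return historical, blocks
    return None
```

- With the payload "44336655" and a data set holding "11332255", `find_target` reports the blocks at offsets 2 and 6, each of length 2, which correspond to "33" and "55".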
- the implementation process of node 11 performing deduplication processing on the payload part of message 11 includes: for each duplicate data block between the payload part of message 11 and the target payload part, node 11 calculates the hash value of the duplicate data block. Node 11 removes the duplicate data block in the payload part of message 11, and adds an indication corresponding to the duplicate data block in the payload part of message 11, the indication is used to indicate the hash value of the duplicate data block and the position of the duplicate data block in the payload part of message 11.
- the position of the duplicate data block in the payload part of message 11 can be represented by the starting position of the duplicate data block in the payload part of message 11 and the length of the duplicate data block, or can be represented by the ending position of the duplicate data block in the payload part of message 11 and the length of the duplicate data block, or can be represented by the starting position of the duplicate data block in the payload part of message 11 and the ending position of the duplicate data block in the payload part of message 11.
- the hash value of the repeated data block "33" is a
- the hash value of the repeated data block "55" is b.
- the content of the payload part of message 12 can be expressed as "4466; Indication 1: <a, position 3-4>; Indication 2: <b, position 7-8>".
- "4466" is the content that remains after the repeated content shared with the target payload part is removed from the payload part of message 11.
- Indication 1 "<a, position 3-4>" indicates that the data block corresponding to hash value a was located at the 3rd byte and 4th byte of the payload part of message 11.
- Indication 2 "<b, position 7-8>" indicates that the data block corresponding to hash value b was located at the 7th byte and 8th byte of the payload part of message 11.
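- The deduplication step of this example can be sketched as below: each duplicate block is cut out of the original payload and replaced by an indication holding its hash value and its position. The `Indication` type, the truncated SHA-256 hash and the function names are assumptions for illustration. Positions are stored as 0-based start offset plus length, which is one of the position encodings the embodiment allows.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Indication:
    hash_value: bytes
    start: int   # 0-based offset of the block in the original payload
    length: int


def block_hash(block: bytes) -> bytes:
    return hashlib.sha256(block).digest()[:8]  # assumed hash scheme


def deduplicate(payload: bytes, blocks: list[tuple[int, int]]) -> tuple[bytes, list[Indication]]:
    """Remove the given non-overlapping (start, length) blocks from payload."""
    remaining = bytearray()
    indications = []
    cursor = 0
    for start, length in sorted(blocks):
        remaining += payload[cursor:start]  # keep content before the block
        indications.append(Indication(block_hash(payload[start:start + length]), start, length))
        cursor = start + length             # skip over the removed block
    remaining += payload[cursor:]
    return bytes(remaining), indications
```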
- If node 11 has multiple subordinate nodes, multiple node-level data sets may be stored in node 11, where each node-level data set stores the payload parts of the historical messages sent by node 11 to one subordinate node. That is, node 11 stores one data set for each subordinate node.
- Alternatively, a global data set may be stored in node 11. The global data set includes a correspondence between payload parts and node identifiers, where a node identifier indicates that the historical message to which the corresponding payload part belongs was sent to the subordinate node indicated by that node identifier. That is, node 11 stores one common data set for all subordinate nodes.
- the subordinate node of node 11 refers to a node located after node 11 on the transmission path starting from the SFU server.
- node 11 may be node 302A, and node 302B, node 302C, and node 302D are all subordinate nodes of node 11.
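- The two storage layouts can be contrasted with the following sketch. Both classes and their method names are hypothetical. The node-level layout keeps a separate list per subordinate node, while the global layout stores each payload part once together with the identifiers of the subordinate nodes it was sent to.

```python
from collections import defaultdict


class NodeLevelDataSets:
    """One data set per subordinate node."""

    def __init__(self) -> None:
        self._payloads_by_node = defaultdict(list)

    def add(self, node_id: str, payload: bytes) -> None:
        self._payloads_by_node[node_id].append(payload)

    def payloads_for(self, node_id: str) -> list[bytes]:
        return list(self._payloads_by_node.get(node_id, []))


class GlobalDataSet:
    """One common data set mapping each payload part to node identifiers."""

    def __init__(self) -> None:
        self._nodes_by_payload = defaultdict(set)

    def add(self, node_id: str, payload: bytes) -> None:
        self._nodes_by_payload[payload].add(node_id)

    def payloads_for(self, node_id: str) -> list[bytes]:
        return [p for p, nodes in self._nodes_by_payload.items() if node_id in nodes]
```

- A payload part sent to several subordinate nodes is stored once in the global layout and once per node in the node-level layout, so the choice trades memory against lookup simplicity.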
- the top node stores the complete content of the payload part of the historical message sent to the lower node.
- When the top node determines whether the obtained original message has duplicate content with a historical message, it matches the entire content of the payload part of the message, without distinguishing whether the duplicate content is in the data part or the protocol part.
- the top node can perform content matching on the payload part of the acquired original message and the payload part of the stored historical message. If there are duplicate data blocks in the payload part of the original message and the payload part of the historical message, the top node calculates the hash value of the duplicate data block, removes the duplicate data block in the original message to obtain a deduplicated message, and further carries the hash value of the duplicate data block and the position of the duplicate data block in the deduplicated message to achieve data deduplication of the original message.
- FIG. 7 is a schematic diagram of a message deduplication process provided by an embodiment of the present application.
- the original message includes an Ethernet header, an IP header, a TCP/UDP header and a payload part
- the payload part includes a protocol part and a data part.
- the payload part of the original message is deduplicated to obtain a deduplicated message.
- the deduplicated message includes an Ethernet header, an IP header, a TCP/UDP header and a payload part
- the payload part includes a deduplication mark, an indication corresponding to a duplicate data block, and other contents in the payload part of the original message except the duplicate data block.
- the indication corresponding to the duplicate data block is used to indicate the hash value of the duplicate data block and the position of the duplicate data block in the payload part of the original message.
- the deduplication mark and the indication corresponding to the duplicate data block may be before the protocol part and the data part, or may also be after the protocol part and the data part, and the embodiment of the present application does not limit this.
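- One way the payload part of a deduplicated message could be laid out is sketched below, with the deduplication mark and the indications placed before the remaining content. The byte layout is entirely hypothetical: a 1-byte mark `0xDD`, a 1-byte indication count, then for each indication an 8-byte hash value, a 2-byte start offset and a 2-byte length. The application does not define a wire format.

```python
import struct

DEDUP_MARK = 0xDD                    # hypothetical deduplication mark
_INDICATION = struct.Struct("!8sHH")  # hash value, start offset, length


def encode_payload(indications: list[tuple[bytes, int, int]], remaining: bytes) -> bytes:
    """Serialize mark + indications + remaining content of the original payload."""
    out = bytearray([DEDUP_MARK, len(indications)])
    for hash_value, start, length in indications:
        out += _INDICATION.pack(hash_value, start, length)
    return bytes(out) + remaining


def decode_payload(payload: bytes) -> tuple[list[tuple[bytes, int, int]], bytes]:
    """Parse a payload produced by encode_payload."""
    if len(payload) < 2 or payload[0] != DEDUP_MARK:
        raise ValueError("payload does not carry the deduplication mark")
    count, offset = payload[1], 2
    indications = []
    for _ in range(count):
        indications.append(_INDICATION.unpack_from(payload, offset))
        offset += _INDICATION.size
    return indications, payload[offset:]
```

- A real design would also need a way to tell a marked payload apart from an original payload that happens to begin with the same byte, which this sketch does not address.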
- In possible implementation method A2, one or more repeated data blocks in the above repeated content are located in the data part of the payload part of message 11.
- Node 11 stores a sampling label set, which includes the hash value of the historical data block.
- the historical data block is a data block obtained by sampling the preset position of the data part of the historical message sent by node 11 to node 12.
- Node 11 executes a judgment process, including: node 11 samples the preset position of the data part of message 11 to obtain a sampled data block.
- Node 11 calculates the hash value of the sampled data block. If the sampling label set includes the hash value of the sampled data block, node 11 determines that the target message exists in the historical messages sent to node 12.
- Otherwise, node 11 determines that the target message does not exist in the historical messages sent to node 12.
- node 11 can add the hash value of the sampled data block to the sampling label set to obtain an updated sampling label set.
- the updated sampling label set can be used by node 11 to perform deduplication processing on subsequently obtained messages whose sender is the SFU server.
- the preset position of the data part is the sampling position of the data part set in advance.
- the preset position of the data part can be the entire data part, and the sampled data block obtained by node 11 sampling the preset position of the data part of message 11 is the complete content of the data part of message 11.
- the preset position of the data part can also be a local field of the data part. In this case, there can be one or more preset positions, and node 11 samples each preset position of the data part of message 11 to obtain one or more sampled data blocks.
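- The A2 judgment process with several preset positions can be sketched as follows. The preset positions are given as hypothetical (offset, length) pairs, and the truncated SHA-256 hash is an assumption. New hash values are added to the sampling label set only after the whole message has been checked, so that a message is never judged as a duplicate of itself.

```python
import hashlib


def block_hash(block: bytes) -> bytes:
    return hashlib.sha256(block).digest()[:8]  # assumed hash scheme


def judge(data_part: bytes, label_set: set[bytes],
          positions: list[tuple[int, int]]) -> list[tuple[int, int, bytes]]:
    """Return (offset, length, hash) for sampled blocks already in label_set."""
    hits, new_hashes = [], []
    for offset, length in positions:
        h = block_hash(data_part[offset:offset + length])
        if h in label_set:
            hits.append((offset, length, h))  # duplicate of a historical data block
        else:
            new_hashes.append(h)
    label_set.update(new_hashes)              # updated sampling label set
    return hits
```

- An empty result means that the target message does not exist in the historical messages, so the message would be forwarded without deduplication.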
- the implementation process of node 11 deduplicating the payload part of message 11 includes: node 11 takes each sampled data block whose hash value belongs to the sampling label set as a duplicate data block, removes the duplicate data block from the data part of message 11, and adds an indication corresponding to the duplicate data block to the payload part of message 11, where the indication is used to indicate the hash value of the duplicate data block.
- If node 11 obtains multiple sampled data blocks by sampling the preset positions of the data part of message 11, the indication corresponding to a duplicate data block is also used to indicate the position of that duplicate data block in the data part of message 11.
- For example, the indication corresponding to a duplicate data block can be expressed as <hash value, position>.
- When the top node samples the data part of the original message and obtains multiple sampled data blocks, the deduplicated message must indicate where in the original message each removed duplicate data block was located, so that subsequent nodes can perform data recovery on the deduplicated message.
- node 11 may store multiple node-level sampling label sets, each node-level sampling label set corresponds to a subordinate node, and each node-level sampling label set is used to store the hash value of the data block sampled at the preset position of the data part of the historical message sent by node 11 to the corresponding subordinate node. That is, node 11 stores a sampling label set for each subordinate node.
- node 11 may store a global sampling label set, which includes the correspondence between hash values and node identifiers, and the node identifier is used to indicate that the corresponding hash value comes from the historical message sent to the subordinate node indicated by the node identifier, that is, node 11 stores a common sampling label set for all subordinate nodes.
- the top node stores the hash value of the data block at the preset position of the data part of the historical message sent to the lower node.
- When the top node determines whether the acquired original message and a historical message have duplicate content, it calculates the hash value of the sampled data block obtained at the preset position of the data part of the original message and compares it with the stored hash values, so that the data block at the preset position of the data part of the original message is deduplicated as a whole.
- the top node can calculate the hash value of the sampled data block at a preset position of the data portion of the acquired original message, and compare it with the hash value of the stored historical data block. If the hash value of a sampled data block in the original message is the same as the hash value stored by the top node, the top node removes the sampled data block in the original message to obtain a deduplicated message, and further carries the hash value of the sampled data block in the deduplicated message to achieve data deduplication of the original message.
- the top node does not need to perform content matching on the payload part of the original message and the historical message, thereby improving the efficiency of message deduplication.
- FIG. 8 is a schematic diagram of another message deduplication process provided by an embodiment of the present application.
- the original message includes an Ethernet header, an IP header, a TCP/UDP header and a payload part
- the payload part includes a protocol part and a data part.
- the payload part of the original message is deduplicated to obtain a deduplicated message.
- the deduplicated message includes an Ethernet header, an IP header, a TCP/UDP header and a payload part.
- the payload part of the deduplicated message includes a deduplication mark, a hash value of the data part of the original message, and the protocol part of the original message.
- the deduplication mark and the hash value of the data part can be before the protocol part, or can also be after the protocol part, and the embodiment of the present application does not limit this.
- the sampling label set stored in node 11 may also include the historical data block indicated by each hash value, that is, the sampling label set may include the correspondence between historical data blocks and their hash values. In this case, node 11 determines that the target message exists in the historical messages as follows: if the sampling label set includes the hash value of the sampled data block, node 11 performs content matching between the sampled data block and the historical data block indicated by that hash value. If the two have the same content, node 11 determines that the target message exists in the historical messages.
- Since two data blocks with the same hash value may have different data contents, storing the correspondence between historical data blocks and their hash values enables node 11, after determining that the hash value of the sampled data block is the same as that of a historical data block, to further perform content matching between the two blocks. This achieves an exact match and improves the accuracy of message deduplication.
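- The effect of this additional content check can be illustrated with a deliberately weak hash function that makes collisions easy to produce. The function `weak_hash` and the dictionary-based label set are assumptions for illustration only; a practical hash would collide far less often, but the check works the same way.

```python
def weak_hash(block: bytes) -> int:
    """Deliberately weak hash (byte sum modulo 256) so that collisions are easy to show."""
    return sum(block) % 256


def is_duplicate(block: bytes, label_set: dict[int, bytes]) -> bool:
    """Treat block as a duplicate only if the hash matches and the content matches."""
    stored = label_set.get(weak_hash(block))
    return stored is not None and stored == block
```

- Here "ab" and "ba" share a hash value, yet only "ab" is accepted as a duplicate when the label set holds "ab".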
- the node 11 may also store the protocol part of the historical message sent by the node 11 to the node 12.
- the repeated content of the payload part of the message 11 and the payload part of the target message may also include the protocol information located in the protocol part of the message 11.
- the indication information of the repeated content may also include a difference indication, and the difference indication is used to indicate the difference between the protocol part of the message 11 and the protocol part of the target message.
- the difference between the protocol part of the message 11 and the protocol part of the target message specifically includes the difference information between the protocol part of the message 11 and the protocol part of the target message and the position of the difference information in the protocol part of the message 11.
- node 11 after determining that the target message exists in the historical message, performs content matching on the protocol part of message 11 and the protocol part of the target message to determine the difference information between the protocol part of message 11 and the protocol part of the target message and the position of the difference information in the protocol part of message 11.
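- A difference indication of this kind can be sketched as a list of (position, differing bytes) runs. The sketch assumes that the two protocol parts have the same length, as would be the case for two fixed-size headers of the same protocol; the function names are hypothetical. `apply_diff` shows how a downstream node holding the protocol part of the target message could rebuild the protocol part of message 11.

```python
def protocol_diff(current: bytes, target: bytes) -> list[tuple[int, bytes]]:
    """Return runs of bytes in current that differ from target, with their positions."""
    if len(current) != len(target):
        raise ValueError("this sketch assumes protocol parts of equal length")
    runs, i = [], 0
    while i < len(current):
        if current[i] != target[i]:
            start = i
            while i < len(current) and current[i] != target[i]:
                i += 1
            runs.append((start, current[start:i]))  # difference info and its position
        else:
            i += 1
    return runs


def apply_diff(target: bytes, runs: list[tuple[int, bytes]]) -> bytes:
    """Rebuild the current protocol part from the target and the difference runs."""
    out = bytearray(target)
    for position, data in runs:
        out[position:position + len(data)] = data
    return bytes(out)
```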
- FIG. 9 is a schematic diagram of yet another message deduplication process provided by an embodiment of the present application.
- the original message includes an Ethernet header, an IP header, a TCP/UDP header and a payload part
- the payload part includes a protocol part and a data part.
- the payload part of the original message is deduplicated to obtain a deduplicated message.
- the deduplicated message includes an Ethernet header, an IP header, a TCP/UDP header and a payload part.
- the payload part of the deduplicated message includes a deduplication mark, a hash value of the data part of the original message, and a difference indication.
- the difference indication is used to indicate the difference information between the protocol part of the original message and the protocol part of the target message and the position of the difference information in the protocol part of the original message.
- the top node in addition to deduplicating data blocks at preset positions in the data part of the message, can also deduplicate the protocol part of the message.
- the amount of data transmitted in the message can be further reduced, thereby reducing the network bandwidth overhead.
- Step 603 Node 11 sends message 12 to node 12 .
- After node 11 executes the above judgment process, if the target message does not exist in the historical messages sent by node 11 to node 12, node 11 sends message 11 to node 12. That is, node 11 forwards message 11 directly without deduplicating it, and the above steps 602 and 603 are not executed.
- after receiving a message whose sender is an SFU server, the top node can determine whether it has sent, to the next hop of the message, a historical message whose payload part has repeated content with the payload part of the message. If so, the top node can perform data deduplication on the message and then send the deduplicated message to the subordinate node. Since the data volume of the deduplicated message is smaller than that of the original message, the amount of data transmitted is reduced, thereby reducing the network bandwidth overhead.
- When the top node sends a deduplicated message to the subordinate node, it only needs to ensure that it has previously sent to that subordinate node a historical message whose payload part carries the content removed from the deduplicated message relative to the original message. This ensures that a node on the subsequent transmission path can perform data recovery on the deduplicated message, so that the user finally receives the original message carrying the complete data content, thereby ensuring user services.
- FIG. 10 is a flow chart of another data transmission method 1000 provided in an embodiment of the present application.
- the first node is referred to as node 21, the second node is referred to as node 22, the third node is referred to as node 23, the first message is referred to as message 21, and the second message is referred to as message 22.
- the method 1000 can be applied to the data transmission system 200 shown in FIG. 2, in which case node 21 in the method 1000 is node 202B, and node 22 is node 202A.
- the method 1000 can be applied to the data transmission system 300 shown in FIG. 3, in which case node 21 in the method 1000 is node 302B, node 302C or node 302D, and node 22 is node 302A.
- the method 1000 can be applied to the data transmission system 400 shown in FIG. 4, in which case node 21 in the method 1000 is node 402B or node 402D and node 22 is node 402A, or node 21 is node 402E, node 402F or node 402G and node 22 is node 402C.
- the method 1000 can be applied to the data transmission system 500 shown in FIG. 5, in which case node 21 in the method 1000 is node 502, and node 22 is the SFU server 501. As shown in FIG. 10, the method 1000 includes but is not limited to the following steps 1001 to 1005.
- Step 1001 node 21 receives message 21 sent by node 22.
- the sender of message 21 is the SFU server.
- Message 21 carries a deduplication mark and indication information of repeated content.
- the deduplication mark is used to indicate that message 21 is a deduplication message.
- Node 21 is the last node on the transmission path of message 21 that supports data deduplication.
- the sender of message 21 is the SFU server, and node 21 is the last node that supports data deduplication on the transmission path of message 21, that is, node 21 is the last node that supports data deduplication on the transmission path starting from the SFU server and ending at the destination of message 21.
- the source IP address of message 21 may be the IP address of the SFU server, or may be the address obtained after the IP address of the SFU server is NATed.
- the repeated content includes one or more repeated data blocks.
- the indication information of the repeated content includes one or more indications, and the one or more indications correspond to one or more repeated data blocks in the repeated content.
- Each indication is used to indicate the hash value of the corresponding repeated data block, and specifically, each indication includes the hash value of the corresponding repeated data block.
- the specific explanation of the repeated content and the indication information can refer to the relevant content in the above step 602, and the embodiment of the present application will not be repeated here.
- Step 1002 Node 21 determines that message 21 is a deduplicated message based on the deduplication flag.
- the deduplication mark is located in the payload of the message 21.
- the node 21 parses the payload of the message 21 to obtain the deduplication mark, and further determines that the message 21 is a deduplication message based on the deduplication mark.
- Step 1003 Node 21 obtains the repeated content from a data set according to the indication information of the repeated content.
- the data set includes at least part of the payload part of the historical message received by node 21 from node 22.
- the data set stored in the node 21 includes the payload part of the historical message received by the node 21 from the node 22. If the node in the data transmission system uses the possible implementation method A2 in the above step 602 to perform deduplication processing on the message, the data set stored in the node 21 includes the corresponding relationship between the hash value of the historical data block and the historical data block, and the historical data block is a data block obtained by sampling the preset position of the data part of the historical message received by the node 21 from the node 22. In that case, the data set stored in the node 21 may also include the protocol part of the message to which the historical data block belongs.
- Possible implementation B1 Message 21 is obtained by a node located before node 21 on the transmission path by performing deduplication processing on the original message using possible implementation A1 in step 602.
- Each indication in the indication information of repeated content is used to indicate the hash value of the corresponding repeated data block and the position of the corresponding repeated data block in the payload part of the original message corresponding to message 21.
- the implementation process of node 21 obtaining the repeated content from the data set according to the indication information of the repeated content includes: for each indication in the indication information, node 21 obtains, from a payload part in the data set, the to-be-matched data block at the position indicated by the indication. Node 21 calculates the hash value of the to-be-matched data block. Node 21 determines the to-be-matched data block whose hash value is consistent with the hash value indicated by the indication as the repeated data block corresponding to the indication.
- the content of the payload part of message 21 includes "4466; Indication 1: <a, position 3-4>; Indication 2: <b, position 7-8>".
- "4466" includes the content of the protocol part and/or the content of the data part.
- Indication 1 "<a, position 3-4>" is used to indicate that the data block corresponding to the hash value a is located at the 3rd byte and the 4th byte of the payload part of the original message corresponding to message 21.
- Indication 2 "<b, position 7-8>" is used to indicate that the data block corresponding to the hash value b is located at the 7th byte and the 8th byte of the payload part of the original message corresponding to message 21.
- node 21 obtains the to-be-matched data block "33" in the payload part according to the position indicated by Indication 1, and obtains the to-be-matched data block "55" in the payload part according to the position indicated by Indication 2. If the hash value of the to-be-matched data block "33" is a, node 21 uses the to-be-matched data block "33" as the repeated data block corresponding to Indication 1. If the hash value of the to-be-matched data block "55" is b, node 21 uses the to-be-matched data block "55" as the repeated data block corresponding to Indication 2.
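- As an illustrative sketch only, the position-then-hash lookup of possible implementation B1 could look like the following. SHA-256, the function names `block_hash` and `find_repeated_block`, and the stored payload values are assumptions made for this example; the application does not specify a hash function or a data layout.

```python
import hashlib
from typing import Iterable, Optional


def block_hash(block: bytes) -> str:
    # Assumption: SHA-256 stands in for the unspecified hash function.
    return hashlib.sha256(block).hexdigest()


def find_repeated_block(data_set: Iterable[bytes], indicated_hash: str,
                        first: int, last: int) -> Optional[bytes]:
    """Return the stored block at bytes first..last (1-based, inclusive)
    whose hash equals indicated_hash, or None if no stored payload matches."""
    for payload in data_set:
        candidate = payload[first - 1:last]
        if len(candidate) != last - first + 1:
            continue  # this stored payload is too short to hold the block
        if block_hash(candidate) == indicated_hash:
            return candidate
    return None


# Payload parts of historical messages received from node 22 (made-up values).
data_set = [b"11223344", b"99339955"]
block_1 = find_repeated_block(data_set, block_hash(b"33"), 3, 4)  # Indication 1
block_2 = find_repeated_block(data_set, block_hash(b"55"), 7, 8)  # Indication 2
```

- In this sketch the first stored payload is rejected because the blocks at bytes 3-4 and 7-8 hash to different values, and both repeated data blocks are taken from the second stored payload.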
- Possible implementation B2 Message 21 is obtained by a node located before node 21 on the transmission path by performing deduplication processing on the original message using possible implementation A2 in step 602.
- One or more repeated data blocks in the repeated content are located in the data part of the payload part of the original message corresponding to message 21.
- the implementation process of node 21 obtaining the repeated content from the data set according to the indication information of the repeated content includes: node 21 determines the historical data block in the data set corresponding to the hash value indicated by the indication in the indication information as the repeated data block corresponding to the indication.
- the indication in the indication information of repeated content does not need to indicate the position of the corresponding repeated data block in the original message corresponding to message 21, and node 21 assumes that the repeated data block is located at the sampling position of the data portion of the original message corresponding to message 21. If there are multiple sampling positions of the data portion pre-set in possible implementation manner A2 in step 602, the indication in the indication information of repeated content is also used to indicate the position of the corresponding repeated data block in the data portion of the original message corresponding to message 21.
- the data set stored in the node 21 may include the protocol part of the message to which the historical data block belongs.
- the repeated content may also include protocol information located in the protocol part.
- the indication information of the repeated content also includes a difference indication.
- the difference indication is used to indicate the difference between the protocol part of the original message corresponding to the message 21 and the protocol part of the target message. The target message is a historical message, received by the node 21 from the node 22, whose data part shares the above-mentioned one or more repeated data blocks with the data part of the original message corresponding to the message 21.
- the implementation process of the node 21 obtaining the repeated content from the data set according to the indication information of the repeated content also includes: the node 21 obtains the protocol part of the target message to which the one or more repeated data blocks belong from the data set.
- Step 1004 Node 21 performs deduplication recovery processing on the payload of message 21 according to the repeated content to obtain message 22, wherein the payload of message 22 includes the repeated content.
- message 22 is the original message corresponding to the above message 21.
- the implementation method in which node 21 performs deduplication recovery processing on the payload part of message 21 based on the repeated content is as follows: for each indication in the indication information, node 21 adds the repeated data block corresponding to the indication at the position indicated by the indication in the payload part of message 21.
- the content of the payload part of message 21 includes "4466; Indication 1: <a, position 3-4>; Indication 2: <b, position 7-8>". Assume that the repeated data block corresponding to hash value a is "33", and the repeated data block corresponding to hash value b is "55". Node 21 inserts data block "33" between "44" and "66" to obtain "443366", so that data block "33" is located at the 3rd byte and 4th byte of the payload part of the restored message. Node 21 then adds data block "55" after "443366" to obtain "44336655", so that data block "55" is located at the 7th byte and 8th byte of the payload part of the restored message.
- the bottom node can, according to the position in the payload part of the original message indicated by the indication carried in the deduplication message, calculate the hash value of the data block at that position in each of the multiple stored payload parts, so as to obtain a stored data block whose hash value is consistent with the hash value indicated by the indication. The bottom node then adds the stored data block at the position indicated by the indication to achieve data recovery of the deduplication message.
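- The "4466" to "44336655" recovery example above can be sketched as follows. The function name `restore_payload` and the list-of-pairs input are assumptions for illustration; positions are 1-based byte positions in the original payload part, as in the example.

```python
from typing import List, Tuple


def restore_payload(stripped: bytes, blocks: List[Tuple[int, bytes]]) -> bytes:
    """Re-insert repeated data blocks into a deduplicated payload part.

    blocks holds (first_byte_position, block) pairs, where the position is
    the 1-based position of the block in the ORIGINAL payload part.
    """
    restored = bytearray(stripped)
    # Insert in ascending position order so that each position is valid
    # at the moment its block is inserted.
    for first, block in sorted(blocks):
        restored[first - 1:first - 1] = block
    return bytes(restored)


restored = restore_payload(b"4466", [(3, b"33"), (7, b"55")])
```

- Processing the indications in ascending position order matters: block "55" can only land on bytes 7-8 once block "33" has already been restored at bytes 3-4.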
- the implementation method of node 21 performing deduplication recovery processing on the payload part of the message 21 according to the repeated content is: for each indication in the indication information, node 21 adds the repeated data block corresponding to the indication at the position indicated by the indication in the data part of the message 21.
- the implementation method of node 21 performing deduplication recovery processing on the payload part of the message 21 according to the repeated content is: node 21 adds the obtained repeated data block to the default deduplication position of the data part of the message 21, and the default deduplication position is the above-mentioned pre-set sampling position of the data part.
- the implementation process of node 21 performing deduplication recovery processing on the payload part of message 21 according to the duplicate content further includes: node 21 modifies the protocol part of the target message according to the difference indication, and uses the modified protocol part of the target message as the protocol part of message 22.
- for the difference indication, reference may be made to the relevant content in the above step 602, which will not be repeated in detail in the embodiment of the present application.
- the bottom node can search for the hash value carried in the deduplication message in the hash value of the stored historical data block, and add the historical data block corresponding to the hit hash value to the data part of the deduplication message to achieve the recovery of the data part of the deduplication message.
- the bottom node can also obtain the protocol part of the historical message to which the historical data block corresponding to the hit hash value belongs, and restore the protocol part of the original message corresponding to the deduplication message in combination with the difference indication for the protocol part in the deduplication message to achieve the recovery of the protocol part of the deduplication message.
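- A minimal sketch of the hash-lookup recovery under possible implementation B2 is given below. It assumes a single preset sampling position (`SAMPLE_OFFSET`), SHA-256 as the hash function, a protocol part modelled as a dictionary of fields, and a difference indication modelled as a dictionary of changed fields. All of these, including the field names `seq` and `stream`, are hypothetical; the actual format of the difference indication is described in step 602.

```python
import hashlib
from typing import Dict, Optional, Tuple

SAMPLE_OFFSET = 2  # hypothetical preset sampling position (0-based offset)

# hash -> (historical data block, protocol part of the message it came from)
History = Dict[str, Tuple[bytes, Dict[str, int]]]


def block_hash(block: bytes) -> str:
    # Assumption: SHA-256 stands in for the unspecified hash function.
    return hashlib.sha256(block).hexdigest()


def recover(stripped_data: bytes, indicated_hash: str, diff: Dict[str, int],
            history: History) -> Optional[Tuple[Dict[str, int], bytes]]:
    """Restore the protocol part and data part of a deduplicated message."""
    hit = history.get(indicated_hash)
    if hit is None:
        return None  # the indicated block has not been stored yet
    block, target_protocol = hit
    # Data part: put the block back at the default deduplication position.
    data = stripped_data[:SAMPLE_OFFSET] + block + stripped_data[SAMPLE_OFFSET:]
    # Protocol part: start from the target message and apply the differences.
    protocol = {**target_protocol, **diff}
    return protocol, data


history: History = {block_hash(b"BBBB"): (b"BBBB", {"seq": 7, "stream": 1})}
result = recover(b"aacc", block_hash(b"BBBB"), {"seq": 8}, history)
```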
- in step 1004, when node 21 performs data recovery on message 21, it may also delete the deduplication mark and the indication information of the repeated content in message 21 to restore the original message corresponding to message 21.
- Step 1005 node 21 sends message 22 to node 23, and node 23 is the next hop of message 21 on node 21.
- the node 23 may be the destination of the message 22, or may be a network device on the transmission path of the message 22 that does not support data deduplication.
- messages may arrive at the bottom node out of order.
- the bottom node may first receive a deduplicated message, and only afterwards receive the non-deduplicated message that carries the repeated content indicated in the deduplicated message.
- in this case, the bottom node cannot perform data recovery on the deduplicated message immediately after receiving it. The bottom node needs to wait for the arrival of the non-deduplicated message carrying the repeated content indicated in the deduplicated message, and then perform data recovery on the deduplicated message based on the payload of the non-deduplicated message.
- if the node 21 receives a non-deduplicated message, the node 21 directly forwards the non-deduplicated message.
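- One way the out-of-order handling described above might be organized is sketched below. The class `BottomNode`, its method names, the single-block-per-message simplification and the use of SHA-256 are all assumptions made for illustration, not details taken from the application.

```python
import hashlib
from typing import Dict, List, Tuple


def block_hash(block: bytes) -> str:
    # Assumption: SHA-256 stands in for the unspecified hash function.
    return hashlib.sha256(block).hexdigest()


class BottomNode:
    def __init__(self) -> None:
        self.known: Dict[str, bytes] = {}                # hash -> stored block
        self.waiting: List[Tuple[bytes, str, int]] = []  # parked dedup messages

    def _restore(self, stripped: bytes, digest: str, offset: int) -> bytes:
        block = self.known[digest]
        return stripped[:offset] + block + stripped[offset:]

    def on_dedup(self, stripped: bytes, digest: str, offset: int) -> List[bytes]:
        """Return the messages that can be sent on after this arrival."""
        if digest not in self.known:
            # The repeated content has not arrived yet: wait.
            self.waiting.append((stripped, digest, offset))
            return []
        return [self._restore(stripped, digest, offset)]

    def on_non_dedup(self, payload: bytes, offset: int, size: int) -> List[bytes]:
        """Store the block, forward the message as is, then retry parked ones."""
        block = payload[offset:offset + size]
        self.known[block_hash(block)] = block
        out = [payload]
        still_waiting = []
        for stripped, digest, pos in self.waiting:
            if digest in self.known:
                out.append(self._restore(stripped, digest, pos))
            else:
                still_waiting.append((stripped, digest, pos))
        self.waiting = still_waiting
        return out


node = BottomNode()
first = node.on_dedup(b"4466", block_hash(b"33"), 2)  # arrives too early
second = node.on_non_dedup(b"99339955", 2, 2)         # carries block "33"
```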
- the bottom node does not need to perform the message deduplication process.
- the bottom node can perform data recovery on deduplicated messages whose sender is the SFU server, so that the user can receive the original message carrying complete data content, thereby ensuring user services.
- FIG. 11 is a flow chart of another data transmission method 1100 provided in an embodiment of the present application.
- the first node is referred to as node 31, the second node is referred to as node 32, the third node is referred to as node 33, the first message is referred to as message 31, the second message is referred to as message 32, and the third message is referred to as message 33.
- the method 1100 can be applied to the data transmission system 400 shown in FIG. 4, in which case node 31 in the method 1100 is node 402C, node 32 is node 402A, and node 33 is node 402E, node 402F or node 402G. As shown in FIG. 11, the method 1100 includes but is not limited to the following steps 1101 to 1105.
- Step 1101 node 31 receives message 31 sent by node 32, the sender of message 31 is the SFU server, and node 31 is an intermediate node supporting data deduplication on the transmission path of message 31.
- message 31 can be a deduplicated message or a non-deduplicated message.
- the sender of message 31 is the SFU server, and node 31 is an intermediate node that supports data deduplication on the transmission path of message 31, that is, node 31 is an intermediate node that supports data deduplication on the transmission path with the SFU server as the starting point and the destination of message 31 as the end point.
- the source IP address of message 31 can be the IP address of the SFU server, or it can be the address obtained after the IP address of the SFU server is NATed.
- Step 1102 Node 31 determines whether message 31 is a deduplicated message or a non-deduplicated message.
- the deduplication message in the embodiment of the present application carries a deduplication mark.
- the deduplication mark is located in the payload part of the deduplication message.
- the node 31 parses the payload part of the message 31. If the payload part of the message 31 carries the deduplication mark, the node 31 determines that the message 31 is a deduplication message. If the payload part of the message 31 does not carry the deduplication mark, the node 31 determines that the message 31 is a non-deduplication message.
- Step 1103 If message 31 is a non-deduplicated message, node 31 determines whether a first original message exists among the historical messages sent to node 33. The payload part of the first original message and the payload part of message 31 have first repeated content, and node 33 is the next hop of message 31 on node 31.
- the first repeated content includes one or more repeated data blocks.
- a first data set is stored in node 31, and the first data set includes the payload part of the historical message sent by node 31 to node 33.
- the implementation process of node 31 judging whether there is a first original message in the historical message sent to node 33 includes: node 31 matches the payload part of message 31 against the payload parts in the first data set. If the first data set contains a target payload part that shares a repeated data block with the payload part of message 31, node 31 determines that there is a first original message in the historical message sent to node 33. If no payload part in the first data set shares a repeated data block with the payload part of message 31, node 31 determines that there is no first original message in the historical message sent to node 33.
- node 31 can add the payload part of message 31 to the first data set to obtain an updated first data set.
- the updated first data set can be used by node 31 to perform deduplication processing on the non-deduplication messages obtained subsequently and sent by the SFU server.
- the node 31 may store one node-level data set for each subordinate node, and each node-level data set is used to store the payload part of the historical message sent by the node 31 to the corresponding subordinate node.
- the node 31 may store a global data set, which includes the corresponding relationship between the payload part and the node identifier, and the node identifier is used to indicate that the corresponding payload part is sent to the subordinate node indicated by the node identifier, that is, the node 31 stores a common data set for all subordinate nodes.
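- The two storage layouts described above (one data set per subordinate node, or one global data set tagged with node identifiers) can be contrasted with the following sketch. The container types and function names are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

NodeLevelSets = Dict[str, List[bytes]]   # node identifier -> payload parts
GlobalSet = List[Tuple[bytes, str]]      # (payload part, node identifier)


def record_node_level(store: NodeLevelSets, node_id: str, payload: bytes) -> None:
    # One data set per subordinate node.
    store.setdefault(node_id, []).append(payload)


def history_node_level(store: NodeLevelSets, node_id: str) -> List[bytes]:
    return list(store.get(node_id, []))


def record_global(store: GlobalSet, node_id: str, payload: bytes) -> None:
    # One common data set; each entry remembers which node it was sent to.
    store.append((payload, node_id))


def history_global(store: GlobalSet, node_id: str) -> List[bytes]:
    return [payload for payload, sent_to in store if sent_to == node_id]


per_node: NodeLevelSets = {}
shared: GlobalSet = []
for node_id, payload in [("402E", b"aa"), ("402F", b"bb"), ("402E", b"cc")]:
    record_node_level(per_node, node_id, payload)
    record_global(shared, node_id, payload)
```

- Both layouts answer the same question, namely which payload parts have already been sent to a given subordinate node; they differ only in whether the lookup is a direct access or a filter over the common set.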
- a sampling label set is stored in the node 31, and the sampling label set includes a hash value of a historical data block.
- the historical data block is a data block obtained by sampling a preset position of a data part of a historical message sent by the node 31 to the node 33.
- the implementation process of the node 31 judging whether there is a first original message in the historical message sent to the node 33 includes: the node 31 samples a preset position of the data part of the message 31 to obtain a sampling data block.
- the node 31 calculates the hash value of the sampling data block. If the sampling label set includes the hash value of the sampling data block, the node 31 determines that there is a first original message in the historical message sent to the node 33.
- if the sampling label set does not include the hash value of the sampling data block, the node 31 determines that there is no first original message in the historical message sent to the node 33.
- the node 31 can add the hash value of the sampling data block to the sampling label set to obtain an updated sampling label set.
- the updated sampling label set can be used by the node 31 to perform deduplication processing on the subsequently acquired non-deduplication messages whose sender is the SFU server.
- This implementation method can refer to the possible implementation method A2 in the above step 602.
- each node-level sampling label set corresponds to a subordinate node, and is used to store the hash value of the data block sampled at the preset position of the data part of the historical message sent by the node 31 to the corresponding subordinate node. That is, the node 31 stores a sampling label set for each subordinate node.
- a global sampling label set may be stored in the node 31, and the global sampling label set includes the correspondence between the hash value and the node identifier, and the node identifier is used to indicate that the corresponding hash value comes from the historical message sent to the subordinate node indicated by the node identifier, that is, the node 31 stores a common sampling label set for all subordinate nodes.
- the sampling label set stored in the node 31 also includes the historical data block indicated by the hash value, that is, the sampling label set stored in the node 31 may include the correspondence between the historical data block and the hash value of the historical data block. In this case, the implementation method of the node 31 determining that the first original message exists in the historical message when the sampling label set includes the hash value of the sampling data block includes: if the sampling label set includes the hash value of the sampling data block, the node 31 performs content matching on the sampling data block and the historical data block indicated by that hash value. If the sampling data block has the same content as the historical data block indicated by that hash value, the node 31 determines that the first original message exists in the historical message.
- node 31 can further perform content matching on the sampled data block and the historical data block after determining that the hash value of the sampled data block is the same as the hash value of a historical data block, so as to achieve an accurate match, thereby improving the accuracy of deduplication of the message.
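- The hash lookup followed by content matching can be sketched as below. The sampling window (`SAMPLE_START`, `SAMPLE_END`), SHA-256 and the function names are assumptions for illustration only.

```python
import hashlib
from typing import Dict

SAMPLE_START, SAMPLE_END = 0, 4  # hypothetical preset sampling position


def block_hash(block: bytes) -> str:
    # Assumption: SHA-256 stands in for the unspecified hash function.
    return hashlib.sha256(block).hexdigest()


def remember(data_part: bytes, labels: Dict[str, bytes]) -> None:
    """Add the sampled block of a sent message to the sampling label set."""
    block = data_part[SAMPLE_START:SAMPLE_END]
    labels[block_hash(block)] = block


def has_first_original(data_part: bytes, labels: Dict[str, bytes]) -> bool:
    """Hash lookup first, then byte-for-byte comparison to rule out collisions."""
    block = data_part[SAMPLE_START:SAMPLE_END]
    stored = labels.get(block_hash(block))
    return stored is not None and stored == block


labels: Dict[str, bytes] = {}
before = has_first_original(b"ABCDxyz", labels)
remember(b"ABCDxyz", labels)
after = has_first_original(b"ABCD123", labels)
```

- The second comparison only changes the outcome when two different blocks share a hash value, which is exactly the case the content matching is meant to catch.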
- Step 1104 If the first original message exists in the historical message sent by node 31 to node 33, node 31 deduplicates the payload part of message 31 to obtain message 32.
- message 32 does not include the first repeated content, and message 32 carries a deduplication mark and first indication information for the first repeated content.
- the deduplication mark is used to indicate that message 32 is a deduplication message.
- the first repeated content includes one or more repeated data blocks.
- the first indication information for the first repeated content includes one or more indications, which correspond one to one with the one or more repeated data blocks in the first repeated content, and each indication is used to indicate a hash value of a corresponding repeated data block.
- one or more repeated data blocks in the first repeated content are located in the data part and/or the protocol part of the payload part of the message 31.
- the first original message exists in the historical message sent by the node 31 to the node 33, that is, the target payload part having repeated data blocks with the payload part of the message 31 exists in the first data set.
- the implementation process of node 31 performing deduplication processing on the payload part of the message 31 includes: for each repeated data block between the payload part of the message 31 and the target payload part, the node 31 calculates the hash value of the repeated data block.
- the node 31 removes the repeated data block in the payload part of the message 31, and adds an indication corresponding to the repeated data block to the payload part of the message 31, the indication is used to indicate the hash value of the repeated data block and the position of the repeated data block in the payload part of the message 31.
- the implementation process of node 31 performing deduplication processing on the payload part of the message 31 can refer to the implementation process of node 11 performing deduplication processing on the payload part of the message 11 under the possible implementation A1 in the above step 602.
- the intermediate node can perform content matching on the payload portion of the acquired non-deduplicated message and the payload portion of the stored historical message. If the payload portion of the non-deduplicated message and the payload portion of the historical message have duplicate data blocks, the intermediate node calculates the hash value of the duplicate data block, removes the duplicate data block from the non-deduplicated message to obtain a deduplicated message, and further carries the hash value of the duplicate data block and an indication of the location of the duplicate data block in the deduplicated message to achieve data deduplication of the non-deduplicated message.
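- A simplified sketch of this content-matching deduplication is shown below. It assumes fixed-size blocks compared at the same position in both payload parts, SHA-256 as the hash function, and a tuple `(hash, first_byte, last_byte)` as the indication; the application leaves block size, matching strategy and indication encoding open (see possible implementation A1 in step 602).

```python
import hashlib
from typing import List, Tuple

Indication = Tuple[str, int, int]  # (hash, first byte, last byte), 1-based


def block_hash(block: bytes) -> str:
    # Assumption: SHA-256 stands in for the unspecified hash function.
    return hashlib.sha256(block).hexdigest()


def deduplicate(payload: bytes, target_payload: bytes,
                block_size: int = 2) -> Tuple[bytes, List[Indication]]:
    """Remove blocks that also appear at the same position in target_payload."""
    kept = bytearray()
    indications: List[Indication] = []
    for start in range(0, len(payload), block_size):
        block = payload[start:start + block_size]
        same = target_payload[start:start + block_size] == block
        if len(block) == block_size and same:
            # Repeated data block: drop it and record its hash and position.
            indications.append((block_hash(block), start + 1, start + block_size))
        else:
            kept += block
    return bytes(kept), indications


stripped, indications = deduplicate(b"44336655", b"99339955")
```

- With these inputs the sketch reproduces the earlier example: the remaining content is "4466", and the two indications point at bytes 3-4 and bytes 7-8 of the original payload part.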
- one or more repeated data blocks in the first repeated content are located in the data part of the payload part of the message 31.
- the implementation process of node 31 performing deduplication processing on the payload part of the message 31 includes: node 31 uses the sampling data block whose hash value belongs to the sampling label set as a repeated data block, removes the repeated data block from the data part of the message 31, and adds an indication corresponding to the repeated data block to the payload part of the message 31, and the indication is used to indicate the hash value of the repeated data block.
- the implementation process of node 31 performing deduplication processing on the payload part of the message 31 can refer to the implementation process of node 11 performing deduplication processing on the payload part of the message 11 under the possible implementation A2 in the above step 602.
- the intermediate node can calculate the hash value of the sampled data block at the preset position of the data portion of the acquired non-deduplicated message, and compare it with the hash value of the stored historical data block. If the hash value of a sampled data block in the non-deduplicated message is the same as the hash value stored by the intermediate node, the intermediate node removes the sampled data block in the non-deduplicated message to obtain a deduplicated message, and further carries the hash value of the sampled data block in the deduplicated message to achieve data deduplication of the non-deduplicated message.
- the intermediate node does not need to perform content matching on the payload part of the non-deduplicated message and the historical message, thereby improving the efficiency of message deduplication.
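- The sampling-based deduplication can be sketched as follows, assuming a single preset sampling window (`SAMPLE_START`, `SAMPLE_END`) and SHA-256; both are illustrative assumptions, as is the function name `dedup_by_sampling`.

```python
import hashlib
from typing import Optional, Set, Tuple

SAMPLE_START, SAMPLE_END = 2, 6  # hypothetical preset sampling position


def block_hash(block: bytes) -> str:
    # Assumption: SHA-256 stands in for the unspecified hash function.
    return hashlib.sha256(block).hexdigest()


def dedup_by_sampling(data_part: bytes,
                      label_set: Set[str]) -> Tuple[bytes, Optional[str]]:
    """Return (data part to send, hash indication or None)."""
    block = data_part[SAMPLE_START:SAMPLE_END]
    digest = block_hash(block)
    if digest in label_set:
        # Hit: remove the sampled block and carry only its hash value.
        return data_part[:SAMPLE_START] + data_part[SAMPLE_END:], digest
    # Miss: send unchanged and remember the hash for later messages.
    label_set.add(digest)
    return data_part, None


label_set: Set[str] = set()
first = dedup_by_sampling(b"aaBBBBcc", label_set)
second = dedup_by_sampling(b"xxBBBByy", label_set)
```

- Only one hash is computed and one set lookup is performed per message, which is why this variant avoids the cost of matching whole payload parts.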
- the node 31 may also store the protocol part of the historical message sent by the node 31 to the node 33.
- the first repeated content shared by the payload part of the message 31 and the payload part of the first original message also includes protocol information located in the protocol part of the message 31.
- the first indication information also includes a difference indication, which is used to indicate the difference between the protocol part of the message 31 and the protocol part of the first original message.
- the implementation method of node 31 performing deduplication processing on the protocol part of the message 31 can refer to the implementation method of node 11 performing deduplication processing on the protocol part of the message 11 under the possible implementation method A2 in the above step 602.
- in addition to deduplicating data blocks at preset positions in the data part of the message, the intermediate node can also deduplicate the protocol part of the message. By carrying a difference indication in the payload part of the message to replace the protocol part, the amount of data transmitted in the message can be further reduced, thereby reducing the network bandwidth overhead.
- Step 1105 node 31 sends message 32 to node 33.
- after receiving a non-deduplicated message, the intermediate node can determine whether it has sent, to the next hop of the non-deduplicated message, a historical message whose payload part has repeated content with the payload part of the non-deduplicated message. If it has, the intermediate node can perform data deduplication on the non-deduplicated message, and then send a deduplication message to the subordinate node.
- because the data volume of the deduplication message is smaller than the data volume of the non-deduplication message, the amount of data transmitted can be reduced, thereby reducing the network bandwidth overhead.
- when the intermediate node sends a deduplication message to the subordinate node, it only needs to ensure that a historical message carrying the content removed from the deduplication message relative to the original message has been sent to the subordinate node. This ensures that a node on the subsequent transmission path can perform data recovery on the deduplication message, so that the user finally receives the original message carrying the complete data content, thereby ensuring user services.
- Step 1106 If the first original message does not exist in the historical messages sent by node 31 to node 33, node 31 sends message 31 to node 33.
- Step 1107 If message 31 is a deduplicated message, message 31 carries second indication information for second repeated content. Node 31 determines whether a second original message exists among the historical messages sent to node 33. The payload part of the second original message includes the second repeated content, and node 33 is the next hop of message 31 on node 31.
- the second repeated content includes one or more repeated data blocks.
- the second indication information for the second repeated content includes one or more indications, which correspond one to one with the one or more repeated data blocks in the second repeated content, and each indication is used to indicate a hash value of a corresponding repeated data block.
- message 31 is obtained by the node located before node 31 on the transmission path by performing deduplication processing on the original message using possible implementation method A1 in the above step 602.
- Each indication in the second indication information of the second repeated content is used to indicate the hash value of the corresponding repeated data block and the position of the corresponding repeated data block in the payload part of the original message corresponding to message 31.
- Node 31 stores a first data set, and the first data set includes the payload part of the historical message sent by node 31 to node 33.
- the implementation process of node 31 judging whether there is a second original message in the historical message sent to node 33 includes: for any indication in the second indication information, node 31 obtains, from a payload part in the first data set, the data block to be matched at the position indicated by the indication. Node 31 calculates the hash value of the data block to be matched. Node 31 determines the data block to be matched whose hash value is consistent with the hash value indicated by the indication as the repeated data block corresponding to the indication. If the first data set includes a payload part that contains the repeated data blocks corresponding to all of the indications in the second indication information, the node 31 determines that the second original message exists in the historical message sent to the node 33.
- the second original message is the historical message whose payload part contains the repeated data blocks corresponding to all of the indications in the second indication information. If the first data set does not include such a payload part, the node 31 determines that the second original message does not exist in the historical message sent to the node 33.
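- The check described above can be sketched as follows: a second original message exists only if a single stored payload part satisfies every indication. SHA-256, the tuple form of an indication and the function name `has_second_original` are assumptions made for this example.

```python
import hashlib
from typing import Iterable, List, Tuple

Indication = Tuple[str, int, int]  # (hash, first byte, last byte), 1-based


def block_hash(block: bytes) -> str:
    # Assumption: SHA-256 stands in for the unspecified hash function.
    return hashlib.sha256(block).hexdigest()


def has_second_original(first_data_set: Iterable[bytes],
                        indications: List[Indication]) -> bool:
    """True if one payload part sent to the next hop matches every indication."""
    for payload in first_data_set:
        if all(block_hash(payload[first - 1:last]) == digest
               for digest, first, last in indications):
            return True
    return False


sent_to_node_33 = [b"11223344", b"99339955"]
ok = has_second_original(
    sent_to_node_33, [(block_hash(b"33"), 3, 4), (block_hash(b"55"), 7, 8)])
split = has_second_original(
    sent_to_node_33, [(block_hash(b"22"), 3, 4), (block_hash(b"55"), 7, 8)])
```

- In the second call each indicated block exists somewhere in the first data set, but not within the same payload part, so the sketch reports that no second original message exists.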
- message 31 is obtained by the node located before node 31 on the transmission path by performing deduplication processing on the original message using possible implementation A2 in the above step 602.
- Each indication in the second indication information of the second repeated content is used to indicate the hash value of the corresponding repeated data block.
- Node 31 stores a sampling label set, which includes the hash value of the historical data block.
- the historical data block is a data block obtained by sampling the preset position of the data part of the historical message sent by node 31 to node 33.
- the implementation process of node 31 judging whether there is a second original message in the historical message sent to node 33 includes: if the sampling label set includes the hash value indicated by each indication in the second indication information, node 31 determines that there is a second original message in the historical message sent to node 33.
- the second original message is the historical message whose payload part includes all of the historical data blocks corresponding to the hash values indicated by the indications in the second indication information. If the sampling label set lacks the hash value indicated by at least one indication in the second indication information, node 31 determines that there is no second original message in the historical message sent to node 33.
- Step 1108 If the second original message exists in the historical messages sent by node 31 to node 33, node 31 sends message 31 to node 33.
- after receiving the deduplication message sent by the SFU server, the intermediate node can determine whether it has sent, to the next hop of the deduplication message, a historical message whose payload part carries the content removed from the deduplication message relative to the original message.
- if the intermediate node has sent such a historical message to the next hop of the deduplication message, the intermediate node can directly forward the deduplication message to the lower-level node.
- Step 1109 If the second original message does not exist in the historical message sent by node 31 to node 33, node 31 obtains the second repeated content from the second data set according to the second indication information, and the second data set includes at least part of the payload part of the historical message received by node 31 from node 32.
- each indication in the second indication information is used to indicate the hash value of the corresponding repeated data block and the position of the corresponding repeated data block in the payload part of the original message corresponding to the message 31.
- the second data set includes the payload part of the historical message received by the node 31 from the node 32.
- the implementation process of the node 31 obtaining the second repeated content from the second data set according to the second indication information includes: for each indication in the second indication information, the node 31 obtains the to-be-matched data block at the position of the payload part in the second data set according to the position indicated by the indication. The node 31 calculates the hash value of the to-be-matched data block.
- the node 31 determines the to-be-matched data block whose hash value is consistent with the hash value indicated by the indication as the repeated data block corresponding to the indication.
- the implementation process of the node 31 obtaining the second repeated content from the second data set according to the second indication information can refer to the implementation process of the node 21 obtaining the repeated content from the data set according to the indication information of the repeated content under the possible implementation B1 in the above step 1003.
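- The position-based lookup described above can be sketched as follows. The fixed block length `BLOCK_SIZE`, the SHA-256 hash and the representation of an indication as a `(position, hash)` pair are assumptions made for illustration only.

```python
import hashlib

BLOCK_SIZE = 4  # assumed fixed block length; the text does not specify one


def block_hash(block: bytes) -> str:
    # SHA-256 is an assumption; the text only refers to "a hash value".
    return hashlib.sha256(block).hexdigest()


def find_repeated_blocks(stored_payloads: list[bytes],
                         indications: list[tuple[int, str]]) -> dict[int, bytes]:
    """Map each indicated position to the stored block whose hash matches.

    For each (position, hash) indication, the block at that position in every
    stored payload part is hashed and compared with the indicated hash.
    Positions with no matching block are left out of the result.
    """
    found: dict[int, bytes] = {}
    for position, wanted_hash in indications:
        for payload in stored_payloads:
            candidate = payload[position:position + BLOCK_SIZE]
            if len(candidate) == BLOCK_SIZE and block_hash(candidate) == wanted_hash:
                found[position] = candidate
                break
    return found


# Payload parts of historical messages received from the upstream node.
second_data_set = [b"AAAABBBBCCCC", b"XXXXYYYYZZZZ"]
indications = [(4, block_hash(b"YYYY"))]
print(find_repeated_blocks(second_data_set, indications))  # {4: b'YYYY'}
```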
- one or more repeated data blocks in the second repeated content are located in the data part.
- the second data set includes the correspondence between the hash value of the historical data block and the historical data block, and the historical data block is a data block obtained by sampling the preset position of the data part of the historical message received by the node 31 from the node 32.
- the implementation process of node 31 obtaining the second repeated content from the second data set according to the second indication information includes: node 31 determines the historical data block in the second data set corresponding to the hash value indicated by the indication in the second indication information as the repeated data block corresponding to the indication.
- the implementation process of node 31 obtaining the second repeated content from the second data set according to the second indication information can refer to the implementation process of node 21 obtaining the repeated content from the data set according to the indication information of the repeated content under the possible implementation B2 in the above step 1003.
- the second data set may also include the protocol part of the message to which the historical data block belongs.
- the second repeated content may also include protocol information located in the protocol part.
- the second indication information also includes a difference indication, which is used to indicate the difference between the protocol part of the original message corresponding to the message 31 and the protocol part of the target message.
- the target message is a historical message, among the historical messages received by the node 31 from the node 32, whose data part shares the above one or more repeated data blocks with the data part of the original message corresponding to the message 31.
- the implementation process of the node 31 obtaining the second repeated content from the second data set according to the second indication information also includes: the node 31 obtains the protocol part of the target message to which the one or more repeated data blocks belong from the second data set.
- Step 1110 Node 31 performs deduplication recovery processing on the payload of message 31 according to the second repeated content to obtain message 33, where the payload of message 33 includes the second repeated content.
- message 31 is a deduplicated message
- message 33 is the original message corresponding to message 31 .
- the implementation of node 31 performing deduplication recovery processing on the payload part of message 31 according to the second repeated content is as follows: for each indication in the second indication information, node 31 adds the repeated data block corresponding to the indication at the position indicated by the indication in the payload part of message 31.
- This implementation can refer to the implementation of node 21 performing deduplication recovery processing on the payload part of message 21 according to the repeated content under the possible implementation B1 in the above step 1004.
- the intermediate node can calculate the hash values of the data blocks at that position in the payload part of multiple stored payload parts according to the position of the duplicate data block indicated by the indication carried in the deduplication message in the payload part of the original message, so as to obtain a storage data block whose hash value is consistent with the hash value indicated by the indication, and then add the storage data block to the position indicated by the indication to achieve data recovery of the deduplication message.
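- Once the repeated data blocks are known, adding them back at the indicated positions can be sketched as follows. The sketch assumes that each position is an offset in the payload part of the original message and that the deduplicated payload is simply the original with those blocks cut out; both are illustrative assumptions.

```python
def restore_payload(dedup_payload: bytes, repeated_blocks: dict[int, bytes]) -> bytes:
    """Re-insert removed blocks into a deduplicated payload.

    Blocks are inserted in ascending order of their original position, so each
    insertion shifts the remaining bytes back to where they were originally.
    """
    restored = bytearray(dedup_payload)
    for position in sorted(repeated_blocks):
        restored[position:position] = repeated_blocks[position]
    return bytes(restored)


# Original payload b"AAAABBBBCCCC" with the block at offset 4 removed.
print(restore_payload(b"AAAACCCC", {4: b"BBBB"}))  # b'AAAABBBBCCCC'
```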
- the implementation method of node 31 performing deduplication recovery processing on the payload part of the message 31 according to the second duplicate content is: for each indication in the second indication information, node 31 adds the duplicate data block corresponding to the indication at the position indicated by the indication in the data part of the message 31.
- node 31 adds the obtained duplicate data block to the default deduplication position of the data part of the message 31, and the default deduplication position is the sampling position of the data part pre-set above.
- This implementation method can refer to the possible implementation method B2 in the above step 1004, in which node 21 performs deduplication recovery processing on the payload part of the message 21 according to the duplicate content.
- the implementation process of node 31 performing deduplication recovery processing on the payload part of message 31 according to the second repeated content also includes: node 31 modifies the protocol part of the target message according to the difference indication, and uses the modified protocol part of the target message as the protocol part of message 33.
- the intermediate node can search for the hash value carried in the deduplication message among the hash values of the stored historical data blocks, and add the historical data block corresponding to the hit hash value to the data portion of the deduplication message to achieve recovery of the data portion of the deduplication message.
- the intermediate node can also obtain the protocol portion of the historical message to which the historical data block corresponding to the hit hash value belongs, and restore the protocol portion of the original message corresponding to the deduplication message in combination with the difference indication for the protocol portion in the deduplication message to achieve recovery of the protocol portion of the deduplication message.
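- The hash-based recovery of the data part and the protocol part can be sketched as follows. The encoding of the difference indication is not specified in the text, so the sketch assumes a list of `(offset, replacement bytes)` patches; the store layout, the function names and the sample bytes are likewise hypothetical.

```python
import hashlib


def block_hash(block: bytes) -> str:
    # SHA-256 is an assumption; the text only refers to "a hash value".
    return hashlib.sha256(block).hexdigest()


def apply_difference(target_protocol: bytes,
                     difference: list[tuple[int, bytes]]) -> bytes:
    """Patch the protocol part of the target message (assumed patch format)."""
    patched = bytearray(target_protocol)
    for offset, new_bytes in difference:
        patched[offset:offset + len(new_bytes)] = new_bytes
    return bytes(patched)


def recover_by_hash(store: dict[str, tuple[bytes, bytes]],
                    indicated_hash: str,
                    difference: list[tuple[int, bytes]]):
    """Return (protocol part, data block) of the original message, or None on a miss."""
    entry = store.get(indicated_hash)
    if entry is None:
        return None
    data_block, target_protocol = entry
    return apply_difference(target_protocol, difference), data_block


# hash -> (historical data block, protocol part of the message it came from)
store = {block_hash(b"VIDEOBLOCK"): (b"VIDEOBLOCK", b"\x80\x60\x00\x01")}

# Hypothetical difference: only the last two protocol bytes changed.
print(recover_by_hash(store, block_hash(b"VIDEOBLOCK"), [(2, b"\x00\x02")]))
```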
- in step 1110, when node 31 performs data recovery on message 31, it may also delete the deduplication mark and the second indication information in message 31 to restore the original message corresponding to message 31.
- Step 1111 node 31 sends message 33 to node 33 .
- the node 33 is a network device on the transmission path of the message 33 .
- the intermediate node may first receive a deduplication message, and then receive a non-deduplication message with duplicate content corresponding to the original message of the deduplication message.
- the non-deduplication message carries the duplicate content indicated in the deduplication message.
- the intermediate node needs to perform order-preserving processing on the out-of-order messages.
- after receiving the deduplication message, the intermediate node first caches the deduplication message. After the non-deduplication message carrying the duplicate content indicated in the deduplication message arrives, the intermediate node further determines whether to directly forward the deduplication message or perform data recovery on the deduplication message based on the payload part of the non-deduplication message.
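- The order-preserving handling of an early deduplication message can be sketched as follows. The class name, the message identifiers and the use of hash values to decide when the duplicate content has arrived are assumptions for illustration.

```python
class ReorderBuffer:
    """Hold deduplicated messages until the content they refer to has arrived."""

    def __init__(self) -> None:
        self.known_hashes: set[str] = set()
        self.pending: list[tuple[str, set[str]]] = []  # (message id, needed hashes)

    def on_dedup(self, message_id: str, needed_hashes: list[str]) -> list[str]:
        """Return the message ids that can be processed now."""
        needed = set(needed_hashes)
        if needed <= self.known_hashes:
            return [message_id]
        self.pending.append((message_id, needed))  # cache until the content arrives
        return []

    def on_non_dedup(self, payload_hashes: list[str]) -> list[str]:
        """Record new content and release any cached messages it unblocks."""
        self.known_hashes.update(payload_hashes)
        ready = [m for m, needed in self.pending if needed <= self.known_hashes]
        self.pending = [(m, needed) for m, needed in self.pending
                        if not needed <= self.known_hashes]
        return ready


buffer = ReorderBuffer()
print(buffer.on_dedup("msg-2", ["h1"]))  # [] (cached, content not seen yet)
print(buffer.on_non_dedup(["h1"]))       # ['msg-2'] (released)
```

- Each released message is then either forwarded as is or recovered, as described above.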
- the intermediate node can perform data deduplication on the non-deduplicated message whose sender is the SFU server, so as to send the deduplication message to the subordinate node. Since the data volume of the deduplication message is smaller than the data volume of the non-deduplicated message, the message transmission data volume can be reduced, thereby reducing the network bandwidth overhead.
- the intermediate node can also perform data recovery on the deduplication message whose sender is the SFU server, or directly forward the deduplication message to the subordinate node.
- when the intermediate node sends a deduplication message to the subordinate node, it only needs to ensure that a historical message whose payload part carries the content removed from the original message has already been sent to the subordinate node. This ensures that a node on the subsequent transmission path can perform data recovery on the deduplication message, so that the user finally receives the original message carrying the complete data content, thereby ensuring user services.
- an embodiment of the present application also proposes a traffic grouping scheme for an SFU communication architecture.
- Traffic grouping is used by nodes to determine which flows can be used as a group of flows, and then to perform data deduplication and/or data recovery on the messages in the same group of flows to improve message processing efficiency.
- group identification can be performed by the bottom node and the middle node, that is, the deduplication-capable flows are actively identified, and the deduplication-capable flows are grouped. After the grouping is completed, the bottom node and the middle node can send the grouping results to the upper node respectively, and the upper node can perform data deduplication or data recovery on the messages according to the grouping results.
- the top node does not need to perform group identification. The specific implementation methods of applying the traffic grouping scheme to the top node, the bottom node, and the middle node during data transmission are described below.
- Each flow grouping set includes flow identifiers of multiple flows flowing through node 12.
- a flow grouping set indicates a group, and the flows indicated by the flow identifiers in a flow grouping set belong to the same group.
- the flow grouping set stored in node 11 is the flow grouping set corresponding to the lower-level node.
- the flow grouping set corresponding to the stored lower-level node may be referred to as a lower-level flow grouping set.
- the flow identifier may be represented by one or more of the five-tuple information of the flow.
- the five-tuple information includes the source IP address, the destination IP address, the source port, the destination port, and the transport layer protocol.
- node 11 may first determine whether the flow grouping set corresponding to node 12 includes the flow identifier of the flow to which message 11 belongs. If there is a target flow grouping set including the flow identifier of the flow to which message 11 belongs in the flow grouping set corresponding to node 12, node 11 executes the above steps 602 and 603. If all flow grouping sets corresponding to node 12 do not include the flow identifier of the flow to which message 11 belongs, node 11 sends message 11 to node 12.
- node 11 determines whether the flow grouping set corresponding to node 12 includes the flow identifier of the flow to which message 11 belongs, that is, node 11 determines whether the flow to which message 11 belongs has been added to a certain group corresponding to node 12.
- node 11 can search the flow grouping set corresponding to node 12 according to the five-tuple information of message 11 to determine whether the flow to which message 11 belongs has been added to a certain group corresponding to node 12. For messages in the flow that has been added to the group, node 11 further determines whether to perform data deduplication processing on the message, and for messages in the flow that has not been added to the group, node 11 directly forwards the message.
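- The forwarding decision based on the five-tuple can be sketched as follows. The `FiveTuple` type, the action names and the sample addresses are illustrative assumptions; the sketch uses the full five-tuple as the flow identifier, although the text allows a subset of it.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


def choose_action(flow: FiveTuple, grouping_sets_of_next_hop: list[set]) -> str:
    """Return 'try_dedup' if the flow is in any grouping set, else 'forward'."""
    for grouping_set in grouping_sets_of_next_hop:
        if flow in grouping_set:
            return "try_dedup"  # continue with steps 602 and 603
    return "forward"            # send the message unchanged


flow_a = FiveTuple("10.0.0.1", "10.0.1.7", 5004, 6000, "UDP")
flow_b = FiveTuple("10.0.0.1", "10.0.1.8", 5004, 6002, "UDP")
flow_c = FiveTuple("10.0.0.9", "10.0.1.9", 5004, 6004, "UDP")

grouping_sets_for_node_12 = [{flow_a, flow_b}]
print(choose_action(flow_a, grouping_sets_for_node_12))  # try_dedup
print(choose_action(flow_c, grouping_sets_for_node_12))  # forward
```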
- node 11 receives the grouping information sent by node 12, and the grouping information includes the correspondence between the node identification of node 12 and one or more stream grouping sets corresponding to node 12.
- node 11 stores the correspondence between the node identification of the subordinate node and the subordinate stream grouping set.
- node 11 may only store the subordinate stream grouping set.
- the node identification may be information such as an IP address, a Media Access Control (MAC) address or a hardware address of the node that can uniquely identify the node in a communication network.
- the grouping information sent by node 12 to node 11 may also include one or more grouping identifiers corresponding to one or more stream grouping sets in the grouping information.
- Each grouping identifier is used to uniquely identify a stream grouping set corresponding to node 12.
- the grouping identifier may be a number that node 12 assigns to a stream grouping set within the node and that can uniquely identify the stream grouping set.
- the grouping information may also include a conference identifier such as a conference number.
- the grouping information may also include a live broadcast identifier such as a live broadcast room number.
- the grouping information is carried in a join message. That is, node 12 can send the grouping information to node 11 via a join message. After receiving the join message, node 11 stores the grouping information carried in the join message in a grouping table. Afterwards, node 11 can also send a join response message to node 12 so that node 12 confirms that the grouping is successful. Further, after receiving the join response message, node 12 can also send a join response confirmation message to node 11.
- the grouping information in the grouping table stored in the top node all comes from the lower-level nodes, and the grouping table can be called a lower-level grouping table.
- the grouping table stored in the top node can be as shown in Table 1.
- the grouping table uses the flow identifier as the key, the node identifier and the group identifier as the value.
- the node identifier A is used to identify a subordinate node A of the top node
- the node identifier B is used to identify another subordinate node B of the top node.
- Group identifier A1 is used to identify a flow grouping set of the subordinate node A
- group identifier A2 is used to identify another flow grouping set of the subordinate node A.
- Group identifier B1 is used to identify the flow grouping set of the subordinate node B.
- flow identifier 1 and flow identifier 2 belong to a flow grouping set corresponding to the subordinate node A
- flow identifier 3 and flow identifier 4 belong to another flow grouping set corresponding to the subordinate node A
- flow identifier 5 and flow identifier 6 belong to the flow grouping set corresponding to the subordinate node B.
- the top node can search the key column according to the flow identifier of the flow to which the message belongs. If the key column does not have the flow identifier of the flow to which the message belongs, the top node directly forwards the message. If the key column has the flow identifier of the flow to which the message belongs, the top node further determines whether to perform data deduplication on the message.
- the top node can determine whether it is necessary to perform deduplication processing on the message sent to the lower node according to the flow grouping set corresponding to the lower node. If the flow identifier of the flow to which the message belongs is not in the flow grouping set corresponding to the lower node, the top node directly forwards the message to the lower node without executing the message deduplication process, which can reduce the processing overhead of the top node.
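- The grouping table described above (flow identifier as the key, node identifier and group identifier as the value) can be sketched as a dictionary. The string identifiers and the function names are placeholders chosen for illustration; the contents mirror the description of Table 1.

```python
def store_grouping_info(grouping_table: dict, node_id: str,
                        grouping_sets: dict[str, list[str]]) -> None:
    """Record grouping information received from a lower-level node.

    grouping_sets maps a group identifier to the flow identifiers in that set.
    """
    for group_id, flow_ids in grouping_sets.items():
        for flow_id in flow_ids:
            grouping_table[flow_id] = (node_id, group_id)


def lookup_group(grouping_table: dict, flow_id: str):
    """Return (node id, group id) for a grouped flow, or None if ungrouped."""
    return grouping_table.get(flow_id)


grouping_table: dict = {}
store_grouping_info(grouping_table, "node-A",
                    {"A1": ["flow-1", "flow-2"], "A2": ["flow-3", "flow-4"]})
store_grouping_info(grouping_table, "node-B", {"B1": ["flow-5", "flow-6"]})

print(lookup_group(grouping_table, "flow-3"))  # ('node-A', 'A2')
print(lookup_group(grouping_table, "flow-9"))  # None
```

- A `None` result corresponds to forwarding the message directly, without the deduplication check.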
- when the flow identifier of the flow to which message 11 belongs is in a target flow grouping set corresponding to node 12, node 11, in determining whether the historical messages sent to node 12 include a target message whose payload part has duplicate content with the payload part of message 11, may check only the target historical messages sent to node 12, where the flow identifier of the flow to which each target historical message belongs is in the target flow grouping set.
- when there are multiple flow grouping sets corresponding to node 12, node 11 only needs to compare the payload part of message 11 with the payload parts of the historical messages belonging to the flows indicated by one flow grouping set. This reduces the number of historical messages that node 11 needs to check, which reduces the processing overhead of node 11 and improves its message processing efficiency, thereby improving the message transmission efficiency.
- node 11 uses the flow identifier of the flow to which message 11 belongs as a key, performs key value matching in the grouping table shown in Table 1, and determines that the flow identifier of the flow to which message 11 belongs belongs to the target flow grouping set corresponding to node 12.
- a data set is stored in the node 11, and the data set includes a payload portion of a historical message sent by the node 11 to a subordinate node.
- node 11 stores a common global data set for all lower-level nodes, and the global data set includes the payload part of the historical message sent by node 11 to all lower-level nodes.
- the global data set may include the correspondence between the payload part, the node identifier and the group identifier.
- the node identifier can be used to indicate that the historical message to which the corresponding payload part belongs is sent to the lower-level node indicated by the node identifier
- the group identifier can be used to indicate that the flow identifier of the flow to which the historical message containing the corresponding payload part belongs is in the flow grouping set indicated by the group identifier.
- the global data set stored in the top node may be as shown in Table 2.
- the global data set uses the node identifier and the group identifier as the key and the payload part as the value.
- the node identifier A is used to identify a subordinate node A of the top node
- the node identifier B is used to identify another subordinate node B of the top node.
- the group identifier A1 is used to identify a flow grouping set of the subordinate node A
- the group identifier A2 is used to identify another flow grouping set of the subordinate node A.
- the group identifier B1 is used to identify the flow grouping set of the subordinate node B.
- payload part 1, payload part 2 and payload part 3 come from historical messages sent by the top node to the subordinate node A and belong to a group.
- Payload part 4 and payload part 5 come from historical messages sent by the top node to the subordinate node A and belong to another group.
- Payload part 6 and payload part 7 come from historical messages sent by the top node to the subordinate node B and belong to the same group.
- the top node after receiving the message, can determine the node identifier and group identifier corresponding to the message based on Table 1, and then use the corresponding node identifier and group identifier as the key to perform key value matching in Table 2 to obtain the corresponding payload part, and further determine whether the payload part of the message and the obtained payload part have duplicate content.
- node 11 uses the five-tuple information of message 11 as a key, performs key-value matching in the grouping table shown in Table 1, and determines that message 11 corresponds to node identifier A and group identifier A2. Then node 11 uses node identifier A and group identifier A2 as keys, performs key-value matching in the global data set shown in Table 2, and obtains payload part 4 and payload part 5. Further, node 11 performs content matching on the payload part of message 11 with payload part 4 and payload part 5, respectively.
- if payload part 4 or payload part 5 has duplicate content with the payload part of message 11, node 11 determines that the target message exists in the historical messages sent to node 12, the target message being the historical message to which that payload part belongs.
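- The two-step lookup (grouping table, then global data set) can be sketched as follows. The table contents follow the description of Table 1 and Table 2 with placeholder byte strings; the block-aligned comparison in `shares_block` is an assumed stand-in for the content matching of step 602, which is not detailed in this passage.

```python
BLOCK_SIZE = 4  # assumed block length for the illustrative content match

grouping_table = {
    "flow-1": ("node-A", "A1"), "flow-2": ("node-A", "A1"),
    "flow-3": ("node-A", "A2"), "flow-4": ("node-A", "A2"),
    "flow-5": ("node-B", "B1"), "flow-6": ("node-B", "B1"),
}

# (node id, group id) -> payload parts of historical messages
global_data_set = {
    ("node-A", "A1"): [b"1111aaaa", b"2222bbbb", b"3333cccc"],
    ("node-A", "A2"): [b"AAAABBBB", b"CCCCDDDD"],
    ("node-B", "B1"): [b"6666ffff", b"7777gggg"],
}


def shares_block(a: bytes, b: bytes) -> bool:
    """True if a and b have at least one identical block-aligned chunk."""
    blocks = {a[i:i + BLOCK_SIZE] for i in range(0, len(a) - BLOCK_SIZE + 1, BLOCK_SIZE)}
    return any(b[i:i + BLOCK_SIZE] in blocks
               for i in range(0, len(b) - BLOCK_SIZE + 1, BLOCK_SIZE))


def find_target_payloads(flow_id: str, payload: bytes) -> list[bytes]:
    """Return stored payload parts of the same group that share content."""
    key = grouping_table.get(flow_id)
    if key is None:
        return []  # ungrouped flow: nothing to compare against
    return [stored for stored in global_data_set.get(key, [])
            if shares_block(stored, payload)]


print(find_target_payloads("flow-3", b"ZZZZBBBB"))  # [b'AAAABBBB']
```

- Only the payload parts stored under node identifier A and group identifier A2 are compared, which is the reduction in search scope described above.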
- the node 11 stores a node-level data set for each subordinate node, each node-level data set includes the payload part of the historical message sent by the node 11 to a subordinate node.
- the node-level data set may include the correspondence between the payload part and the grouping identifier.
- the top node may store a node-level data set corresponding to the lower-level node A and a node-level data set corresponding to the lower-level node B.
- the node-level data set corresponding to the lower-level node A may be as shown in Table 3
- the node-level data set corresponding to the lower-level node B may be as shown in Table 4.
- the top node after receiving the message, can determine the node identifier and group identifier corresponding to the message based on Table 1, and then find the node-level data set corresponding to the node identifier, and then use the group identifier as the key to perform key-value matching in the node-level data set shown in, for example, Table 3 or Table 4 to obtain the corresponding payload part, and further determine whether the payload part of the message and the obtained payload part have duplicate content.
- the node 11 stores a group-level data set for each flow grouping set, each group-level data set including the payload part of the historical messages that belong to the multiple flows indicated by one flow grouping set and that were sent by the node 11 to a lower-level node.
- Each group-level data set includes the payload parts corresponding to one flow grouping set.
- the top node after receiving the message, can determine the node identifier and group identifier corresponding to the message based on Table 1, and then find the group-level data set corresponding to the node identifier and the group identifier, and then obtain the payload part in the group-level data set, and further determine whether the payload part of the message and the obtained payload part have duplicate content.
- a sampling label set including a hash value of a historical data block is stored in node 11 .
- the historical data block is a data block obtained by sampling a preset position of the data part of a historical message sent by node 11 to node 12 .
- the node 11 stores a common global sampling label set for all lower-level nodes, and the global sampling label set includes a hash value of a data block sampled at a preset position of a data portion of a historical message sent by the node 11 to all lower-level nodes.
- the global sampling label set may include a correspondence between a hash value, a node identifier, and a group identifier.
- the node identifier may be used to indicate that the corresponding hash value is obtained based on a historical message sent to a lower-level node indicated by the node identifier
- the group identifier may be used to indicate that the flow identifier of the flow to which the historical message from which the corresponding hash value was obtained belongs is in the flow grouping set indicated by the group identifier.
- the global sampling label set stored in the top node may be as shown in Table 5.
- the global sampling label set uses the node identifier and the group identifier as the key and the hash value as the value.
- the node identifier A is used to identify a subordinate node A of the top node
- the node identifier B is used to identify another subordinate node B of the top node.
- the group identifier A1 is used to identify a flow grouping set of the subordinate node A
- the group identifier A2 is used to identify another flow grouping set of the subordinate node A.
- the group identifier B1 is used to identify the flow grouping set of the subordinate node B.
- the top node after receiving the message, can determine the node identifier and group identifier corresponding to the message based on Table 1, and then use the corresponding node identifier and group identifier as the key to perform key value matching in Table 5 to obtain the corresponding hash value, and further determine whether the obtained hash value includes the hash value of the sampled data block sampled from the preset position of the data part of the message 11.
- node 11 uses the five-tuple information of message 11 as a key, performs key value matching in the grouping table shown in Table 1, and determines that message 11 corresponds to node identifier A and group identifier A2. Then node 11 uses node identifier A and group identifier A2 as keys, performs key value matching in the global sampling label set shown in Table 5, and obtains hash value 4 and hash value 5. Node 11 calculates the hash value of the sampled data block sampled from the preset position of the data part of message 11. If the hash value of the sampled data block is hash value 4 or hash value 5, node 11 determines that the target message exists in the historical messages sent to node 12.
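- The same lookup against a global sampling label set can be sketched as follows. The preset sampling position (`SAMPLE_OFFSET`, `SAMPLE_LEN`), the SHA-256 hash and the sample byte strings are assumptions; the text only states that a block at a preset position of the data part is sampled and hashed.

```python
import hashlib

SAMPLE_OFFSET = 0  # assumed preset sampling position in the data part
SAMPLE_LEN = 8     # assumed length of the sampled data block


def sample_hash(data_part: bytes) -> str:
    """Hash the block sampled at the preset position of the data part."""
    sampled = data_part[SAMPLE_OFFSET:SAMPLE_OFFSET + SAMPLE_LEN]
    return hashlib.sha256(sampled).hexdigest()


grouping_table = {"flow-3": ("node-A", "A2"), "flow-5": ("node-B", "B1")}

# (node id, group id) -> hashes sampled from historical messages
global_sampling_label_set = {
    ("node-A", "A2"): {sample_hash(b"keyframe-old-tail"), sample_hash(b"audio000-old")},
    ("node-B", "B1"): {sample_hash(b"slide001-old")},
}


def target_message_exists(flow_id: str, data_part: bytes) -> bool:
    """True if the sampled hash of data_part is stored for the flow's group."""
    key = grouping_table.get(flow_id)
    if key is None:
        return False
    return sample_hash(data_part) in global_sampling_label_set.get(key, set())


print(target_message_exists("flow-3", b"keyframe-new-tail"))  # True
print(target_message_exists("flow-3", b"slide001-new"))       # False
```

- The second call misses because that hash is stored only under node identifier B and group identifier B1, not in the group of the queried flow.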
- the node 11 stores a node-level sampling label set for each subordinate node, each node-level sampling label set including a hash value of a data block sampled at a preset position of a data portion of a historical message sent by the node 11 to a subordinate node.
- the node-level sampling label set may include a correspondence between a hash value and a grouping identifier.
- the top node may store a node-level sampling label set corresponding to the lower node A and a node-level sampling label set corresponding to the lower node B.
- the node-level sampling label set corresponding to the lower node A may be as shown in Table 6, and the node-level sampling label set corresponding to the lower node B may be as shown in Table 7.
- the top node after receiving the message, can determine the node identifier and group identifier corresponding to the message based on Table 1, and then find the node-level sampling label set corresponding to the node identifier, and then use the group identifier as the key to perform key value matching in the node-level sampling label set shown in, for example, Table 6 or Table 7 to obtain the corresponding hash value, and further determine whether the obtained hash value includes the hash value of the sampling data block sampled from the preset position of the data part of the message 11.
- the node 11 stores a group-level sampling label set for each flow group set, each group-level sampling label set including a hash value of a data block sampled at a preset position of a data portion of a historical message belonging to multiple flows indicated by a flow group set and sent by the node 11 to a lower-level node.
- Each group-level sampling label set includes the hash values corresponding to one flow grouping set.
- the top node after receiving the message, can determine the node identifier and group identifier corresponding to the message based on Table 1, and then find the group-level sampling label set corresponding to the node identifier and the group identifier, and then obtain the hash value in the group-level sampling label set, and further determine whether the obtained hash value includes the hash value of the sampling data block sampled from the preset position of the data part of the message 11.
- Each flow grouping set includes flow identifiers of multiple flows flowing through the node 21.
- the flow grouping set stored in the node 21 is the flow grouping set corresponding to the node 21 itself.
- the flow grouping set corresponding to the node itself may be referred to as a local flow grouping set.
- node 21 determines that there is a target flow group set including the flow identifier of the flow to which message 21 belongs in the one or more flow group sets. That is, after receiving message 21, node 21 first determines whether the flow group set corresponding to node 21 includes the flow identifier of the flow to which message 21 belongs. Node 21 will parse the payload part of message 21 only after determining that the flow identifier of the flow to which message 21 belongs belongs to a certain flow group set corresponding to node 21.
- the implementation method of node 21 determining whether the flow group set corresponding to node 21 includes the flow identifier of the flow to which message 21 belongs can refer to the implementation method of node 11 determining whether the flow group set corresponding to node 12 includes the flow identifier of the flow to which message 11 belongs.
- the node 21 receives the message 23 , and the flow identifier of the flow to which the message 23 belongs does not belong to any flow grouping set corresponding to the node 21 .
- the node 21 forwards the message 23 .
- the bottom node can first determine whether the flow identifier of the flow to which the message belongs belongs to a certain flow grouping set corresponding to itself. If the flow identifier of the flow to which the message belongs belongs to a certain flow grouping set corresponding to itself, it means that the message may be a deduplicated message that has been deduplicated by the upper-level node, and the bottom node needs to further parse the payload part of the message to determine whether the message is a deduplicated message.
- if the flow identifier of the flow to which the message belongs does not belong to any flow grouping set corresponding to itself, it means that the upper-level node will not perform data deduplication on the message. That is, the message cannot be a deduplicated message, so the bottom node can directly forward the message without parsing the payload part of the message to determine whether the message is a deduplicated message. This can reduce the processing overhead of the bottom node.
- node 21 receives message 24, and the flow identifier of the flow to which message 24 belongs belongs to the flow group set corresponding to node 21.
- Node 21 parses the payload part of message 24 and determines that the payload part of message 24 does not carry a deduplication mark, that is, message 24 is a non-deduplicated message.
- Node 21 adds at least part of the content of the payload part of message 24 to the data set and forwards message 24.
- Message 24 here can be regarded as the first packet in a group of deduplicated messages received by node 21.
- Node 21 stores at least part of the content of the payload part of message 24 in the data set so as to perform data recovery on the deduplicated messages received subsequently.
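- The three cases handled by the bottom node (ungrouped flow, deduplicated message, non-deduplicated message of a grouped flow) can be sketched as follows. The one-byte `DEDUP_MARK` at the start of the payload part and the action names are hypothetical; the text does not define how the deduplication mark is encoded.

```python
DEDUP_MARK = b"\xdd"  # hypothetical encoding of the deduplication mark


def handle_at_bottom_node(flow_id: str, payload: bytes,
                          local_grouping_sets: list[set[str]],
                          data_set: list[bytes]) -> str:
    """Decide what the bottom node does with a received message."""
    if not any(flow_id in s for s in local_grouping_sets):
        return "forward"            # cannot be deduplicated: skip payload parsing
    if payload.startswith(DEDUP_MARK):
        return "recover"            # deduplicated message: restore before delivery
    data_set.append(payload)        # keep content for later recovery
    return "store_and_forward"


local_sets = [{"flow-1", "flow-2"}]
data_set: list[bytes] = []
print(handle_at_bottom_node("flow-9", b"anything", local_sets, data_set))         # forward
print(handle_at_bottom_node("flow-1", b"full-frame", local_sets, data_set))       # store_and_forward
print(handle_at_bottom_node("flow-2", DEDUP_MARK + b"ref", local_sets, data_set)) # recover
```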
- among multiple received messages that belong to different flows, node 21 adds the flow identifiers of the flows whose messages have duplicate content in their payload parts to the same flow grouping set, where the sender of these different flows is the same SFU server.
- node 21 can store at least part of the content of the payload part of message 23. After receiving a message whose sender is the same SFU server as the sender of message 23 and which belongs to a different flow from message 23, node 21 determines whether the payload part of that message and the payload part of message 23 have duplicate content. If so, node 21 determines that the flow to which that message belongs and the flow to which message 23 belongs can form a new group, and generates a new flow grouping set that includes the flow identifier of the flow to which that message belongs and the flow identifier of the flow to which message 23 belongs.
- node 21 may send grouping information to node 22, and the grouping information includes a correspondence between the node identifier of node 21 and one or more stream grouping sets corresponding to node 21.
- node 21 may send the new stream grouping set to node 22.
- node 21 may also send the grouping identifier corresponding to the updated stream grouping set and the new stream identifier to node 22, so that node 22 adds the new stream identifier to the stream grouping set indicated by the grouping identifier, thereby realizing synchronous update of the stream grouping set.
- the grouping information is carried in a join message. That is, node 21 can send the grouping information to node 22 via a join message. After receiving the join message, node 22 stores the grouping information carried in the join message in a grouping table. Afterwards, node 22 can also send a join response message to node 21 so that node 21 confirms that the grouping is successful. Further, after receiving the join response message, node 21 can also send a join response confirmation message to node 22.
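- As a non-limiting illustration, the handling of the join message and of a later incremental update on node 22 may be sketched in Python as follows. The class name, the dictionary layout of the grouping table and the message type label "join_response" are assumptions made for this sketch; the embodiments do not limit the message structure.

```python
from typing import Dict, Set


class UpperNode:
    """Hypothetical model of node 22 storing grouping information from lower nodes."""

    def __init__(self) -> None:
        # node identifier -> {grouping identifier -> set of flow identifiers}
        self.grouping_table: Dict[str, Dict[str, Set[str]]] = {}

    def on_join(self, node_id: str, grouping_sets: Dict[str, Set[str]]) -> dict:
        """Store the grouping information carried in a join message and build a response."""
        self.grouping_table[node_id] = {
            gid: set(flows) for gid, flows in grouping_sets.items()
        }
        return {"type": "join_response", "node_id": node_id}

    def on_update(self, node_id: str, group_id: str, flow_id: str) -> None:
        """Add a new flow identifier to the set indicated by the grouping identifier."""
        groups = self.grouping_table.setdefault(node_id, {})
        groups.setdefault(group_id, set()).add(flow_id)
```

- The join response confirmation message sent by node 21 is omitted from the sketch, since it does not change the stored grouping table.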
- the data set stored in the node 21 includes the payload part of the non-deduplicated message received by the node 21.
- For a flow whose sender is an SFU server and whose flow identifier does not belong to any flow grouping set corresponding to the bottom node, after the bottom node receives the message in the flow, it stores the payload part of the message in the flow. Afterwards, if the bottom node receives a message in another flow whose sender is the SFU server and whose flow identifier does not belong to any flow grouping set corresponding to the bottom node, the bottom node performs content matching on the payload parts of the messages of the two flows. If there is duplicate content in the payload parts of the messages of the two flows, it means that the two flows are matched successfully, and a flow grouping set including the flow identifiers of the two flows is further generated.
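- As a non-limiting illustration, the generation of a flow grouping set by payload matching may be sketched in Python as follows. The embodiments do not specify the content matching algorithm; comparing fixed-size aligned blocks is a simplification assumed here, and the names BLOCK_SIZE, FlowGrouper and observe are hypothetical.

```python
from typing import Dict, List, Optional, Set

BLOCK_SIZE = 4  # hypothetical block size in bytes, chosen only for the sketch


def blocks(payload: bytes) -> Set[bytes]:
    """Split a payload into aligned fixed-size blocks (an assumed matching unit)."""
    return {
        payload[i:i + BLOCK_SIZE]
        for i in range(0, len(payload) - BLOCK_SIZE + 1, BLOCK_SIZE)
    }


def has_duplicate_content(a: bytes, b: bytes) -> bool:
    """Two payloads match if they share at least one aligned block."""
    return bool(blocks(a) & blocks(b))


class FlowGrouper:
    def __init__(self) -> None:
        # SFU identifier -> {flow identifier -> stored payload of an ungrouped flow}
        self.pending: Dict[str, Dict[str, bytes]] = {}
        self.grouping_sets: List[Set[str]] = []

    def observe(self, sfu_id: str, flow_id: str, payload: bytes) -> Optional[Set[str]]:
        """Return a newly generated flow grouping set, or None if no match was found."""
        if any(flow_id in group for group in self.grouping_sets):
            return None  # the flow is already grouped
        candidates = self.pending.setdefault(sfu_id, {})
        match = next(
            (f for f, p in candidates.items()
             if f != flow_id and has_duplicate_content(payload, p)),
            None,
        )
        if match is None:
            candidates[flow_id] = payload  # keep the payload for later matching
            return None
        del candidates[match]
        new_set = {flow_id, match}
        self.grouping_sets.append(new_set)
        return new_set
```

- Only flows whose sender is the same SFU server are compared in this sketch, because the stored payloads are keyed by the SFU identifier.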
- the data set stored in node 21 includes a correspondence between a hash value of a data block sampled at a preset position of a data portion of a non-deduplicated message received by node 21 and the data block.
- After the bottom node receives the message in the flow, the preset position of the data part of the message is sampled to obtain a sampled data block, and the hash value of the sampled data block is calculated and stored. Afterwards, if the bottom node receives a message in another flow whose sender is the SFU server and whose flow identifier does not belong to any flow grouping set corresponding to the bottom node, the bottom node also samples the preset position of the data part of the message to obtain a sampled data block, and calculates the hash value of the sampled data block.
- the bottom node can also store the content of the sampled data block.
- the bottom node can further accurately match the data content of the sampled data blocks of the two messages to determine whether the data content of the sampled data blocks of the two messages is the same.
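- As a non-limiting illustration, matching two flows by the hash value of a sampled data block, followed by exact comparison of the block content, may be sketched in Python as follows. The sampling offset, the block length and the use of SHA-256 are assumptions made for this sketch; the embodiments do not fix the preset position or the hash function.

```python
import hashlib
from typing import Dict

SAMPLE_OFFSET = 0  # hypothetical preset position within the data part
SAMPLE_LENGTH = 8  # hypothetical sampled block length in bytes


def sample_block(data: bytes) -> bytes:
    """Sample the preset position of the data part."""
    return data[SAMPLE_OFFSET:SAMPLE_OFFSET + SAMPLE_LENGTH]


def block_hash(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def remember(stored: Dict[str, bytes], data: bytes) -> None:
    """Store the correspondence between the hash value and the sampled data block."""
    block = sample_block(data)
    stored[block_hash(block)] = block


def matches_stored(stored: Dict[str, bytes], data: bytes) -> bool:
    """Hash lookup first, then exact comparison to rule out a hash collision."""
    block = sample_block(data)
    candidate = stored.get(block_hash(block))
    return candidate is not None and candidate == block
```

- Storing the block content next to its hash value is what allows the second, exact comparison in this sketch.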
- the implementation of the above step 1003 may be: the node 21 obtains the repeated content from the payload content of the target historical message in the data set according to the indication information of the repeated content, and the flow identifier of the flow to which the target historical message belongs belongs to the target flow grouping set.
- the node 21 only needs to search for the payload part of the historical messages belonging to multiple flows indicated by one flow grouping set to obtain the repeated content, which reduces the number of historical messages that the node 21 needs to retrieve, thereby reducing the processing overhead of the node 21, and at the same time improving the message processing efficiency of the node 21, thereby improving the message transmission efficiency.
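- As a non-limiting illustration, restricting the search for the repeated content to the flows of the target flow grouping set may be sketched in Python as follows. The layout of the data set (flow identifier mapped to a dictionary from hash value to data block) and the function name are assumptions made for this sketch.

```python
from typing import Dict, List, Optional, Set


def recover_block(
    indication_hash: str,
    flow_id: str,
    grouping_sets: List[Set[str]],
    data_set: Dict[str, Dict[str, bytes]],
) -> Optional[bytes]:
    """Look up a repeated data block only among flows of the target flow grouping set."""
    target = next((group for group in grouping_sets if flow_id in group), None)
    if target is None:
        return None  # the flow is not grouped, so nothing can be recovered
    for member in target:
        # Historical messages of flows outside the target set are never searched.
        block = data_set.get(member, {}).get(indication_hash)
        if block is not None:
            return block
    return None
```

- In this sketch, a block stored for a flow outside the target flow grouping set is not returned even if its hash value matches, which mirrors the reduced retrieval scope described above.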
- one or more local flow grouping sets may be stored in the node 31, each of which includes flow identifiers of multiple flows flowing through the node 31.
- the node 31 may store one or more subordinate flow grouping sets corresponding to the node 33, each of which includes flow identifiers of multiple flows flowing through the node 33.
- node 31 determines that there is a target flow grouping set including the flow identifier of the flow to which message 31 belongs in the one or more local flow grouping sets. That is, after receiving message 31, node 31 first determines whether the local flow grouping set includes the flow identifier of the flow to which message 31 belongs. After determining that the flow identifier of the flow to which message 31 belongs belongs to a certain local flow grouping set, node 31 parses the payload part of message 31.
- the implementation method of node 31 determining whether the local flow grouping set includes the flow identifier of the flow to which message 31 belongs can refer to the implementation method of node 11 determining whether the flow grouping set corresponding to node 12 includes the flow identifier of the flow to which message 11 belongs.
- the intermediate node can first determine whether the flow identifier of the flow to which the message belongs belongs to a certain flow grouping set corresponding to itself after receiving the message. If the flow identifier of the flow to which the message belongs belongs to a certain flow grouping set corresponding to itself, it means that the message may be a deduplicated message that has been deduplicated by the upper-level node, and the intermediate node needs to further parse the payload part of the message to determine whether the message is a deduplicated message.
- the intermediate node can directly forward the message without parsing the payload part of the message to determine whether the message is a deduplication message. This can reduce the processing overhead of the intermediate node.
- node 31 adds the flow identifier of the flow to which the message with repeated content in the payload part belongs among the multiple messages received belonging to different flows to the same local flow grouping set, and the sender of these different flows is the same SFU server. Further, node 31 can send first grouping information to node 32, and the first grouping information includes the correspondence between the node identifier of node 31 and one or more local flow grouping sets.
- the generation method, sending method and function of the local flow grouping set stored in node 31 can refer to the generation method, sending method and function of the flow grouping set corresponding to node 21 stored in the above-mentioned node 21, respectively.
- the content in the second data set stored in node 31 can also refer to the content in the data set stored in node 21, and the embodiments of the present application will not be repeated here.
- node 31 may first determine whether the lower-level flow grouping set corresponding to node 33 includes the flow identifier of the flow to which message 31 belongs. If there is a target flow grouping set including the flow identifier of the flow to which message 31 belongs in the lower-level flow grouping set corresponding to node 33, node 31 executes the above step 1103, that is, node 31 determines whether there is a message in the historical message sent to node 33 whose payload part has duplicate content with the payload part of message 31. If all flow grouping sets corresponding to node 33 do not include the flow identifier of the flow to which message 31 belongs, node 31 sends message 31 to node 33.
- the intermediate node can determine whether it is necessary to perform deduplication processing on the received non-deduplication message that needs to be sent to the subordinate node based on the subordinate flow group set corresponding to the subordinate node. If the flow identifier of the flow to which the non-deduplication message belongs is not in the subordinate flow group set corresponding to the subordinate node, then the intermediate node directly forwards the non-deduplication message to the subordinate node without executing the message deduplication process, which can reduce the processing overhead of the intermediate node.
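- As a non-limiting illustration, the decision of the intermediate node on whether to run the deduplication process for a non-deduplicated message may be sketched in Python as follows. The table layout (node identifier of the subordinate node mapped to its flow grouping sets) and the function name are assumptions made for this sketch.

```python
from typing import Dict, List, Set


def should_attempt_dedup(
    flow_id: str,
    next_hop_id: str,
    subordinate_table: Dict[str, List[Set[str]]],
) -> bool:
    """Return True only if the subordinate node has grouped the flow of the message."""
    groups = subordinate_table.get(next_hop_id, [])
    return any(flow_id in group for group in groups)
```

- When the function returns False, the intermediate node in this sketch would forward the message to the subordinate node unchanged, without searching the historical messages.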
- when node 31 determines whether the historical messages sent to node 33 include a first original message whose payload part has first repeated content with the payload part of message 31, it can check only the target historical messages sent to node 33, where the flow identifier of the flow to which each target historical message belongs is in the target flow grouping set.
- when there are multiple lower-level flow grouping sets corresponding to node 33, node 31 only needs to perform repetitive content determination on the payload part of the historical messages belonging to multiple flows indicated by one lower-level flow grouping set and the payload part of message 31, which reduces the number of historical messages that node 31 needs to determine, thereby reducing the processing overhead of node 31, while improving the message processing efficiency of node 31, thereby improving the message transmission efficiency.
- node 31 receives second grouping information sent by node 33, and the second grouping information includes a correspondence between a node identifier of node 33 and one or more lower-level flow grouping sets corresponding to node 33.
- the acquisition method and function of the lower-level flow grouping set corresponding to node 33 stored in node 31 can refer to the acquisition method and function of the flow grouping set corresponding to node 12 stored in the above-mentioned node 11, and the content in the first data set or sampling label set stored in node 31 can also refer to the content in the data set or sampling label set stored in node 11, which will not be described in detail in the embodiments of the present application.
- the top node, middle node and bottom node defined in the embodiment of the present application are used to distinguish the positions of different nodes relative to the SFU server, and the functions of these nodes may be the same.
- the positions of multiple nodes in the data transmission system relative to the SFU server may be manually configured.
- multiple nodes in the data transmission system may respectively execute the node discovery process to determine their own positions relative to the SFU server and the connection relationships with other nodes to achieve automated deployment.
- a message whose sender is an SFU server or whose destination is an SFU server can trigger a node in a data transmission system to execute a node discovery process.
- the source port number of the message whose sender is an SFU server is the SFU service port number, and the node can judge whether the sender of the message is an SFU server based on the source port number of the message.
- the destination port number of the message whose destination is an SFU server is the SFU service port number, and the node can judge whether the destination of the message is an SFU server based on the destination port number of the message.
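- As a non-limiting illustration, the judgment based on the port numbers may be sketched in Python as follows. The port value 5000 and the returned direction labels are hypothetical; the actual SFU service port number depends on the deployment.

```python
from typing import Optional

SFU_SERVICE_PORT = 5000  # hypothetical value, not given by the embodiments


def sfu_direction(
    src_port: int, dst_port: int, sfu_port: int = SFU_SERVICE_PORT
) -> Optional[str]:
    """Classify a message by comparing its port numbers with the SFU service port."""
    if src_port == sfu_port:
        return "from_sfu"  # the sender of the message is an SFU server
    if dst_port == sfu_port:
        return "to_sfu"  # the destination of the message is an SFU server
    return None  # the message does not trigger the node discovery process
```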
- For the top node, for example, the node 11 in the method 600 shown in FIG6: if the node 11 is an SFU server, and the SFU server is configured with a function that supports data deduplication, the node 11 directly determines itself as the top node. If the node 11 is not an SFU server, the node 11 can determine its position relative to the SFU server through the following two possible implementations.
- a message destined for an SFU server triggers the node discovery process.
- node 11 can send node discovery message 11 to node 13.
- Node 13 is the next hop of message 13 on node 11.
- the destination of message 13 is the SFU server, and node discovery message 11 carries the identifier of the SFU server, and node discovery message 11 indicates that node 11 is the subordinate node of node 13 on the transmission path starting from the SFU server.
- node 11 determines that node 11 is the first node that supports data deduplication on the transmission path starting from the SFU server.
- the identifier of the SFU server can be the IP address of the SFU server, or the address obtained after the IP address of the SFU server is NATed.
- the node discovery message 11 indicates that the node 11 is a subordinate node of the node 13 on the transmission path starting from the SFU server. For example, if the destination port number of the node discovery message 11 is the SFU service port number, and the node discovery message 11 carries the identifier of the SFU server, then it means that the node 11 that sends the node discovery message 11 is a subordinate node of the node 13 that receives the node discovery message 11 on the transmission path starting from the SFU server.
- the node discovery message 11 carries a position indication for indicating that it is a subordinate node
- the position indication carried by the node discovery message 11 and the identifier of the SFU server jointly indicate that the node 11 that sends the node discovery message 11 is a subordinate node of the node 13 that receives the node discovery message 11 on the transmission path starting from the SFU server.
- node 11 generates node discovery message 11 according to message 13, the message header of node discovery message 11 is the same as the message header of message 13, and the payload of node discovery message 11 carries an indication of the message type of node discovery message 11.
- the implementation method of node 11 generating node discovery message 11 according to message 13 may be that node 11 copies message 13, then only retains the message header of the copied message, and adds an indication to the payload of the copied message that indicates that its message type is a node discovery message, thereby obtaining node discovery message 11.
- the message header includes an Ethernet header, an IP header, and a transport layer protocol header (UDP header or TCP header).
- node discovery message 11 may also be a new message constructed by node 11, and the embodiment of the present application does not limit the message structure of the node discovery message.
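- As a non-limiting illustration, generating a node discovery message by copying a received message, retaining its message header and replacing its payload with a message type indication may be sketched in Python as follows. The Message structure and the byte string used as the type indication are assumptions made for this sketch; in practice, length and checksum fields in the retained headers would need to be recomputed for the new payload.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Message:
    """Hypothetical message model: three headers plus a payload."""
    eth_header: bytes
    ip_header: bytes
    transport_header: bytes  # UDP header or TCP header
    payload: bytes


NODE_DISCOVERY_TYPE = b"NODE_DISCOVERY"  # hypothetical message type indication


def make_node_discovery(trigger: Message) -> Message:
    """Copy the triggering message, keep its headers, and replace the payload."""
    return replace(trigger, payload=NODE_DISCOVERY_TYPE)
```

- Because the headers are retained, the node discovery message in this sketch follows the same forwarding path as the triggering message and reaches the same next hop.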
- node 11 can also receive node discovery message 12 sent by node 14, node discovery message 12 carries the identifier of the SFU server, and node discovery message 12 indicates that node 14 is a subordinate node of node 11 on the transmission path starting from the SFU server.
- Node 11 determines that node 14 supports data deduplication based on node discovery message 12, and sends node discovery response message 12 corresponding to node discovery message 12 to node 14, node discovery response message 12 indicates that node 11 supports data deduplication, and node 11 can also determine that it is not the last node on the transmission path starting from the SFU server.
- Node 14 can be, for example, the above-mentioned node 12.
- the node discovery process is triggered by a message whose destination is an SFU server
- a node receives a node discovery message sent by a subordinate node and does not receive a node discovery response message sent by an upper node
- the node can determine that it is the first node on the transmission path starting from the SFU server.
- node discovery process is triggered by a message sent by the SFU server.
- After node 11 receives message 14 whose source port number is the SFU service port number, it sends node discovery message 13 to node 15. Node 15 is the next hop of message 14 on node 11. The sender of message 14 is the SFU server. Node discovery message 13 carries the identifier of the SFU server, and node discovery message 13 indicates that node 11 is the superior node of node 15 on the transmission path starting from the SFU server. In response to receiving the node discovery response message 13 corresponding to the node discovery message 13 sent by node 15, node 11 determines that node 15 supports data deduplication, and node 11 can also determine that it is not the last node on the transmission path starting from the SFU server. Node 15 can be, for example, the above-mentioned node 12. Further, node 11 can also send a node discovery response confirmation message to node 15.
- the node discovery message 13 indicates that the node 11 is the superior node of the node 15 on the transmission path starting from the SFU server. For example, if the source port number of the node discovery message 13 is the SFU service port number, and the node discovery message 13 carries the identifier of the SFU server, then it means that the node 11 sending the node discovery message 13 is the superior node of the node 15 receiving the node discovery message 13 on the transmission path starting from the SFU server.
- the position indication carried by the node discovery message 13 and the identifier of the SFU server together indicate that the node 11 sending the node discovery message 13 is the superior node of the node 15 receiving the node discovery message 13 on the transmission path starting from the SFU server.
- the node 11 generates a node discovery message 13 according to the message 14, the message header of the node discovery message 13 is the same as the message header of the message 14, and the payload part of the node discovery message 13 carries an indication of the message type of the node discovery message 13.
- the node discovery process is triggered by a message whose sender is an SFU server
- a node receives a node discovery response message sent by a subordinate node and does not receive a node discovery message sent by an upper node
- the node can determine that it is the first node on the transmission path starting from the SFU server.
- the node 21 in the method 1000 shown in Fig. 10 may determine its position relative to the SFU server through the following two possible implementations.
- a message destined for an SFU server triggers the node discovery process.
- After node 21 receives message 25 whose destination port number is the SFU service port number, it sends node discovery message 21 to node 24.
- Node 24 is the next hop of message 25 on node 21.
- the destination of message 25 is the SFU server, and node discovery message 21 carries the identifier of the SFU server, and node discovery message 21 indicates that node 21 is the subordinate node of node 24 on the transmission path starting from the SFU server.
- node 21 determines that node 24 supports data deduplication, and node 21 can also determine that it is not the first node on the transmission path starting from the SFU server.
- Node 24 can be, for example, the above-mentioned node 22. Further, node 21 can also send a node discovery response confirmation message to node 24.
- node 21 generates node discovery message 21 according to message 25, the message header of node discovery message 21 is the same as the message header of message 25, and the payload of node discovery message 21 carries an indication of the message type of node discovery message 21.
- the manner in which node 21 generates a node discovery message may refer to the manner in which node 11 generates a node discovery message.
- the node discovery process is triggered by a message whose destination is an SFU server
- a node receives a node discovery response message sent by an upper-level node and does not receive a node discovery message sent by a lower-level node
- the node can determine that it is the last node on the transmission path starting from the SFU server.
- node discovery process is triggered by a message sent by the SFU server.
- node 21 can receive node discovery message 22 sent by node 25, node discovery message 22 carries the identifier of the SFU server, and node discovery message 22 indicates that node 25 is the superior node of node 21 on the transmission path starting from the SFU server.
- Node 21 determines that node 25 supports data deduplication based on node discovery message 22, and sends node discovery response message 22 corresponding to node discovery message 22 to node 25, node discovery response message 22 indicates that node 21 supports data deduplication, and node 21 can also determine that it is not the first node on the transmission path starting from the SFU server.
- Node 25 can be, for example, the above-mentioned node 22.
- after node 21 receives message 26 whose source port number is the SFU service port number, it can also send a node discovery message 23 to node 26, and node 26 is the next hop of message 26 on node 21.
- the sender of message 26 is the SFU server, and node discovery message 23 carries the identifier of the SFU server, and node discovery message 23 indicates that node 21 is the superior node of node 26 on the transmission path starting from the SFU server.
- node 21 determines that node 21 is the last node that supports data deduplication on the transmission path starting from the SFU server.
- the node 21 generates a node discovery message 23 according to the message 26, the message header of the node discovery message 23 is the same as the message header of the message 26, and the payload part of the node discovery message 23 carries an indication of the message type of the node discovery message 23.
- the node discovery process is triggered by a message whose sender is an SFU server
- a node receives a node discovery message sent by an upper-level node and does not receive a node discovery response message sent by a lower-level node
- the node can determine that it is the last node on the transmission path starting from the SFU server.
- the node 31 may determine its position relative to the SFU server through the following two possible implementations.
- a message destined for an SFU server triggers the node discovery process.
- After node 31 receives message 34 whose destination port number is the SFU service port number, it sends node discovery message 31 to node 34. Node 34 is the next hop of message 34 on node 31. The destination of message 34 is the SFU server, the node discovery message 31 carries the identifier of the SFU server, and the node discovery message 31 indicates that node 31 is the subordinate node of node 34 on the transmission path starting from the SFU server. In response to receiving the node discovery response message 31 corresponding to the node discovery message 31 sent by node 34, node 31 determines that node 34 supports data deduplication, and node 31 can also determine that it is not the first node on the transmission path starting from the SFU server. Node 34 can be, for example, the above-mentioned node 32. Further, node 31 can also send a node discovery response confirmation message to node 34.
- the node 31 generates a node discovery message 31 according to the message 34, the message header of the node discovery message 31 is the same as the message header of the message 34, and the payload of the node discovery message 31 carries an indication of the message type of the node discovery message 31.
- the manner in which the node 31 generates the node discovery message may refer to the manner in which the node 11 generates the node discovery message.
- node 31 can also receive node discovery message 34 sent by node 37, node discovery message 34 carries the identifier of the SFU server, and node discovery message 34 indicates that node 37 is a subordinate node of node 31 on the transmission path starting from the SFU server.
- Node 31 determines that node 37 supports data deduplication based on node discovery message 34, and sends node discovery response message 34 corresponding to node discovery message 34 to node 37, node discovery response message 34 indicates that node 31 supports data deduplication, and node 31 can also determine that it is not the last node on the transmission path starting from the SFU server.
- Node 37 can be, for example, the above-mentioned node 33.
- the node discovery process is triggered by a message whose destination is an SFU server
- a node receives a node discovery response message sent by an upper-level node and receives a node discovery message sent by a lower-level node
- the node can determine itself as an intermediate node on the transmission path starting from the SFU server.
- node discovery process is triggered by a message sent by the SFU server.
- after node 31 receives message 35 whose source port number is the SFU service port number, it can send a node discovery message 33 to node 36.
- Node 36 is the next hop of message 35 on node 31.
- the sender of message 35 is the SFU server, and the node discovery message 33 carries the identifier of the SFU server, and the node discovery message 33 indicates that node 31 is the superior node of node 36 on the transmission path starting from the SFU server.
- node 31 determines that node 36 supports data deduplication, and node 31 can also determine that it is not the last node on the transmission path starting from the SFU server.
- Node 36 can be, for example, the above-mentioned node 33. Further, node 31 can also send a node discovery response confirmation message to node 36.
- the node 31 generates a node discovery message 33 according to the message 35, the message header of the node discovery message 33 is the same as the message header of the message 35, and the payload part of the node discovery message 33 carries an indication of the message type of the node discovery message 33.
- node 31 can also receive node discovery message 32 sent by node 35, node discovery message 32 carries the identifier of the SFU server, and node discovery message 32 indicates that node 35 is the superior node of node 31 on the transmission path starting from the SFU server.
- Node 31 determines that node 35 supports data deduplication based on node discovery message 32, and sends node discovery response message 32 corresponding to node discovery message 32 to node 35, node discovery response message 32 indicates that node 31 supports data deduplication, and node 31 can also determine that it is not the first node on the transmission path starting from the SFU server.
- Node 35 can be, for example, the above-mentioned node 32.
- the node discovery process is triggered by a message whose sender is an SFU server
- a node receives a node discovery message sent by an upper-level node and a node discovery response message sent by a lower-level node
- the node can determine that it is an intermediate node on the transmission path starting from the SFU server.
- each node in the data transmission system may also store a correspondence between the identifier of the SFU server and the node position, and the node position is used to indicate the position of the node itself relative to the corresponding SFU server. Since there may be multiple SFU servers in the data transmission system, the same node may be in different positions relative to different SFU servers. For example, a certain node is a top node relative to SFU server 1 and an intermediate node relative to SFU server 2.
- based on this correspondence, the node can determine in which position (top, intermediate or bottom) it should act when performing deduplication or recovery processing on a message, which is suitable for application scenarios with complex networking.
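- As a non-limiting illustration, deriving the node position from the result of the node discovery process and recording it per SFU server may be sketched in Python as follows. The two boolean inputs summarize whether a deduplication-capable upper-level node and lower-level node were discovered, regardless of which message triggered the process; the function names, the position labels and the None result for a node with no discovered peer are assumptions made for this sketch.

```python
from typing import Dict, Optional


def node_position(upper_peer_found: bool, lower_peer_found: bool) -> Optional[str]:
    """Map discovery results to a position relative to one SFU server."""
    if upper_peer_found and lower_peer_found:
        return "intermediate"
    if lower_peer_found:
        return "top"  # first deduplication-capable node on the path
    if upper_peer_found:
        return "bottom"  # last deduplication-capable node on the path
    return None  # no peer discovered; this case is not covered by the sketch


def record_position(
    positions: Dict[str, str], sfu_id: str,
    upper_peer_found: bool, lower_peer_found: bool,
) -> None:
    """Store the correspondence between an SFU server identifier and the node position."""
    position = node_position(upper_peer_found, lower_peer_found)
    if position is not None:
        positions[sfu_id] = position
```

- With such a table, the same node can hold a different position for each SFU server, as in the example of SFU server 1 and SFU server 2 above.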
- the order of the steps of the data transmission method provided in the embodiment of the present application can be adjusted appropriately, and steps can also be added or removed according to the situation. Any variation that a person skilled in the art can readily conceive of within the technical scope disclosed in this application shall fall within the protection scope of this application.
- the content of the data set and/or sampling label set stored in each node can be updated regularly, such as only storing data within a certain period of time, and automatically deleting expired data to reduce the memory resource consumption of the node.
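- As a non-limiting illustration, a data set that keeps entries only for a limited period and deletes expired entries may be sketched in Python as follows. The class name, the time-to-live parameter and the purge-on-read policy are assumptions made for this sketch; the embodiments only state that expired data is deleted automatically.

```python
import time
from typing import Dict, Optional, Tuple


class ExpiringDataSet:
    """Hypothetical store mapping a key (e.g. a hash value) to a data block with expiry."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        self._entries: Dict[str, Tuple[bytes, float]] = {}  # key -> (block, stored_at)

    def add(self, key: str, block: bytes, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._entries[key] = (block, now)

    def purge(self, now: Optional[float] = None) -> None:
        """Delete every entry older than the time-to-live."""
        now = time.monotonic() if now is None else now
        expired = [k for k, (_, t) in self._entries.items() if now - t > self.ttl]
        for key in expired:
            del self._entries[key]

    def get(self, key: str, now: Optional[float] = None) -> Optional[bytes]:
        """Purge expired entries, then return the block for the key if still present."""
        self.purge(now)
        entry = self._entries.get(key)
        return entry[0] if entry is not None else None
```

- The optional now argument exists only so that the behaviour can be checked deterministically; by default the sketch uses the monotonic clock.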
- a message whose destination or sender is the SFU server can trigger the node receiving the message to send a node discovery message in the transmission direction of the message, as described in the above embodiment.
- such a message can also trigger the node receiving the message to send a node discovery message in the reverse transmission direction of the message.
- the principle of the node sending a node discovery message in the reverse transmission direction of the message to determine its own position relative to the SFU server can refer to the principle of the node sending a node discovery message in the transmission direction of the message to determine its own position relative to the SFU server in the above-mentioned embodiment, and the embodiments of the present application will not be described one by one here.
- FIG12 is a schematic diagram of the structure of a communication node provided in an embodiment of the present application.
- the communication node is a first node, for example, it can be the node 11 in the method 600 shown in FIG6.
- the communication node 1200 includes an acquisition module 1201, a processing module 1202 and a sending module 1203.
- the communication node 1200 also includes a receiving module 1204.
- the acquisition module 1201 is used to acquire the first message whose sender is the SFU server, and the first node is the first node that supports data deduplication on the transmission path of the first message.
- the processing module 1202 is used to deduplicate the payload part of the first message if there is a target message in the historical message sent by the first node to the second node, and the payload part of the target message has repeated content with the payload part of the first message, to obtain a second message, the second message does not include repeated content, and the second message carries a deduplication mark and indication information of the repeated content, the deduplication mark is used to indicate that the second message is a deduplication message, and the second node is the next hop of the first message on the first node.
- the sending module 1203 is used to send the second message to the second node.
- the first node can be the above-mentioned node 11
- the second node can be the above-mentioned node 12
- the first message can be the above-mentioned message 11
- the second message can be the above-mentioned message 12.
- the repeated content includes one or more repeated data blocks
- the indication information includes one or more indications
- the one or more indications correspond to the one or more repeated data blocks one-to-one
- each indication is used to indicate a hash value of a corresponding repeated data block.
- a data set is stored in the first node, and the data set includes the payload part of the historical message sent by the first node to the second node.
- the processing module 1202 is used to: match the content of the payload part of the first message with the payload part in the data set. If there is a target payload part in the data set that has a duplicate data block with the payload part of the first message, determine that the target message exists in the historical message. For each duplicate data block between the payload part of the first message and the target payload part, the first node calculates the hash value of the duplicate data block.
- the processing module 1202 is configured to: if there is no payload part having a duplicate data block with the payload part of the first message in the data set, determine that the target message does not exist in the historical message, and add the payload part of the first message to the data set.
- the payload part includes a protocol part and a data part, and one or more repeated data blocks are located in the data part of the first message.
- a sampling tag set is stored in the first node, and the sampling tag set includes a hash value of a historical data block, and the historical data block is a data block obtained by sampling a preset position of the data part of a historical message sent by the first node to the second node.
- Processing module 1202 is used to: sample a preset position of the data part of the first message to obtain a sampled data block. Calculate the hash value of the sampled data block. If the sampling tag set includes the hash value of the sampled data block, determine that the target message exists in the historical message.
- the processing module 1202 is used to: take each sampled data block whose hash value belongs to the sampling tag set as a duplicate data block, remove the duplicate data block from the data part of the first message, and add an indication corresponding to the duplicate data block to the payload part of the first message, where the indication is used to indicate the hash value of the duplicate data block.
- the first node obtains multiple sampled data blocks by sampling the preset positions of the data part of the first message.
- the indication is also used to indicate the position of the repeated data block in the data part of the first message.
- the processing module 1202 is configured to: if the sampling tag set does not include the hash value of the sampling data block, determine that the target message does not exist in the historical message, and add the hash value of the sampling data block to the sampling tag set.
- the sampling tag set also includes the historical data block indicated by each hash value. If the sampling tag set includes the hash value of the sampled data block, the processing module 1202 is used to: perform content matching between the sampled data block and the historical data block indicated by the hash value of the sampled data block. When the content of the sampled data block is the same as the content of that historical data block, it is determined that the target message exists in the historical message.
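The sampling-based deduplication described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the block size, the preset offsets, the use of SHA-256, and the names `deduplicate` and `tag_set` are all assumptions, since the text does not fix any of them.

```python
import hashlib

BLOCK_SIZE = 8            # assumed length of a sampled block, in bytes
PRESET_OFFSETS = (0, 16)  # assumed preset sampling positions (ascending, non-overlapping)


def block_hash(block: bytes) -> str:
    """Hash of a data block (SHA-256 is an assumption; the text names no algorithm)."""
    return hashlib.sha256(block).hexdigest()


def deduplicate(data: bytes, tag_set: dict):
    """Return (remaining_data, indications) for the data part of one message.

    tag_set maps hash -> historical block, so the content can be compared
    before a sampled block is treated as a duplicate.
    """
    indications = []
    kept = bytearray()
    cursor = 0
    for offset in PRESET_OFFSETS:
        block = data[offset:offset + BLOCK_SIZE]
        if len(block) < BLOCK_SIZE:
            continue  # data part too short to sample at this position
        digest = block_hash(block)
        if tag_set.get(digest) == block:
            # Duplicate: drop the block and record its hash and position.
            kept += data[cursor:offset]
            cursor = offset + BLOCK_SIZE
            indications.append({"hash": digest, "position": offset})
        else:
            # Not seen before: remember it for later messages.
            tag_set.setdefault(digest, block)
    kept += data[cursor:]
    return bytes(kept), indications
```

On the first message every sampled block is new, so the data part is returned unchanged and the tag set is populated; a later message that repeats a sampled block is returned without that block and with one indication per removed block.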
- the first node also stores the protocol part of the historical message sent by the first node to the second node; the repeated content also includes protocol information located in the protocol part of the first message, and the indication information also includes a difference indication, which is used to indicate the difference between the protocol part of the first message and the protocol part of the target message.
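One possible encoding of the difference indication is a list of byte runs that differ between the two protocol parts. The sketch below assumes both protocol parts have the same length (as with a fixed-size header); the function name `protocol_diff` and the `(offset, bytes)` format are hypothetical and are not taken from the text.

```python
def protocol_diff(current: bytes, reference: bytes):
    """List (offset, new_bytes) runs where `current` differs from `reference`.

    Assumes equal-length protocol parts; this restriction is a simplification
    made for the sketch.
    """
    if len(current) != len(reference):
        raise ValueError("protocol parts must have equal length in this sketch")
    diff = []
    i = 0
    while i < len(current):
        if current[i] != reference[i]:
            start = i
            # Extend the run while the bytes keep differing.
            while i < len(current) and current[i] != reference[i]:
                i += 1
            diff.append((start, current[start:i]))
        else:
            i += 1
    return diff
```

When only a few fields change between messages, such as a sequence number or a timestamp, the resulting list is much shorter than the protocol part it replaces.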
- the first node stores one or more flow grouping sets corresponding to the second node, each flow grouping set including flow identifiers of multiple flows flowing through the second node.
- the processing module 1202 is further configured to, after the first node obtains the first message, if the flow grouping sets corresponding to the second node include a target flow grouping set containing the flow identifier of the flow to which the first message belongs, determine whether the target message exists among the target historical messages sent to the second node, where the flow identifier of the flow to which each target historical message belongs is in the target flow grouping set.
- the sending module 1203 is further configured to send the first message to the second node if all flow grouping sets corresponding to the second node do not include the flow identifier of the flow to which the first message belongs.
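The flow grouping check can be read as a filter applied before any content matching. The sketch below is illustrative only: `candidate_history`, the representation of history as `(flow_id, payload)` pairs, and the use of Python sets for flow grouping sets are assumptions.

```python
def find_target_group(flow_id, grouping_sets):
    """Return the flow grouping set that contains flow_id, or None."""
    for group in grouping_sets:
        if flow_id in group:
            return group
    return None


def candidate_history(flow_id, grouping_sets, history):
    """Select the historical payloads worth comparing against.

    history is a list of (flow_id, payload) pairs already sent to the next hop.
    Returns None when no grouping set contains the flow, in which case the
    message is forwarded without any deduplication attempt.
    """
    group = find_target_group(flow_id, grouping_sets)
    if group is None:
        return None
    return [payload for fid, payload in history if fid in group]
```

Restricting the search to flows in the same grouping set keeps the comparison set small, because only flows known to carry repeated content are considered.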
- the receiving module 1204 is configured to receive grouping information sent by the second node, where the grouping information includes a correspondence between a node identifier of the second node and one or more flow grouping sets.
- the first node is not an SFU server; the sending module 1203 is further used to send a first node discovery message to the third node after receiving a third message whose destination port number is the SFU service port number, the third node is the next hop of the third message on the first node, the destination of the third message is the SFU server, the first node discovery message carries the identifier of the SFU server, and the first node discovery message indicates that the first node is a subordinate node of the third node on the transmission path starting from the SFU server.
- the processing module 1202 is also used to determine, in response to not receiving from the third node a first node discovery response message corresponding to the first node discovery message, that the first node is the first node that supports data deduplication on the transmission path starting from the SFU server.
- the third node can be the above-mentioned node 13
- the third message can be the above-mentioned message 13
- the first node discovery message can be the above-mentioned node discovery message 11
- the first node discovery response message can be the above-mentioned node discovery response message 11.
- the processing module 1202 is also used to: generate a first node discovery message based on the third message, the message header of the first node discovery message is the same as the message header of the third message, and the payload part of the first node discovery message carries an indication of the message type of the first node discovery message.
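Generating the node discovery message from the triggering message can be sketched as copying the header fields and replacing the payload. The `Message` fields, the port value `SFU_PORT`, the type code `NODE_DISCOVERY`, and the payload layout (one type byte followed by the SFU server identifier) are hypothetical choices made for illustration.

```python
from dataclasses import dataclass, replace

SFU_PORT = 5004        # assumed SFU service port number
NODE_DISCOVERY = 0x01  # hypothetical message-type code


@dataclass(frozen=True)
class Message:
    src: str
    dst: str
    src_port: int
    dst_port: int
    payload: bytes


def build_discovery(trigger: Message, sfu_id: bytes):
    """Build a node discovery message from a message addressed to the SFU service.

    The header fields are copied unchanged, so the discovery message follows
    the same path as the triggering message; only the payload differs.
    Returns None if the trigger is not addressed to the SFU service port.
    """
    if trigger.dst_port != SFU_PORT:
        return None
    return replace(trigger, payload=bytes([NODE_DISCOVERY]) + sfu_id)
```

Because the header is identical to that of the triggering message, the next hop toward the SFU server receives the discovery message without any extra routing state.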
- the receiving module 1204 is used to receive a second node discovery message sent by the fourth node, the second node discovery message carries the identifier of the SFU server, and the second node discovery message indicates that the fourth node is a subordinate node of the first node on the transmission path starting from the SFU server.
- the processing module 1202 is also used to determine that the fourth node supports data deduplication based on the second node discovery message.
- the sending module 1203 is also used to send a second node discovery response message corresponding to the second node discovery message to the fourth node, and the second node discovery response message indicates that the first node supports data deduplication.
- the fourth node can be the above-mentioned node 14
- the second node discovery message can be the above-mentioned node discovery message 12
- the second node discovery response message can be the above-mentioned node discovery response message 12.
- the sending module 1203 is also used to send a third node discovery message to the fifth node after receiving the fourth message whose source port number is the SFU service port number.
- the fifth node is the next hop of the fourth message on the first node.
- the sender of the fourth message is the SFU server.
- the third node discovery message carries the identifier of the SFU server, and the third node discovery message indicates that the first node is the upper node of the fifth node on the transmission path starting from the SFU server.
- the processing module 1202 is also used to determine, in response to receiving a third node discovery response message corresponding to the third node discovery message sent by the fifth node, that the fifth node supports data deduplication.
- the fifth node can be the above-mentioned node 15
- the fourth message can be the above-mentioned message 14
- the third node discovery message can be the above-mentioned node discovery message 13
- the third node discovery response message can be the above-mentioned node discovery response message 13.
- the sending module 1203 is further configured to send the first message to the second node if the target message does not exist in the historical messages sent by the first node to the second node.
- Figure 13 is a schematic diagram of the structure of another communication node provided in an embodiment of the present application.
- the communication node is a first node, for example, it can be the node 21 in the method 1000 shown in Figure 10.
- the communication node 1300 includes a receiving module 1301, a processing module 1302 and a sending module 1303.
- the receiving module 1301 is used to receive a first message sent by a second node.
- the sender of the first message is an SFU server.
- the first message carries a deduplication mark and indication information of duplicate content.
- the deduplication mark is used to indicate that the first message is a deduplication message.
- the first node is the last node that supports data deduplication on the transmission path of the first message.
- the processing module 1302 is used to: determine that the first message is a deduplication message based on the deduplication mark; obtain the duplicate content from a data set according to the indication information, where the data set includes at least part of the content of the payload part of the historical message received by the first node from the second node; and perform deduplication recovery processing on the payload part of the first message according to the duplicate content to obtain a second message, where the payload part of the second message includes the duplicate content.
- the sending module 1303 is used to send a second message to a third node, and the third node is the next hop of the first message on the first node.
- the first node can be the above-mentioned node 21
- the second node can be the above-mentioned node 22
- the third node can be the above-mentioned node 23
- the first message can be the above-mentioned message 21
- the second message can be the above-mentioned message 22.
- the repeated content includes one or more repeated data blocks
- the indication information includes one or more indications
- the one or more indications correspond to the one or more repeated data blocks one-to-one
- each indication is used to indicate a hash value of a corresponding repeated data block.
- each indication is also used to indicate the position of the corresponding repeated data block in the payload part of the original message corresponding to the first message, and the data set includes the payload part of the historical message received by the first node from the second node.
- Processing module 1302 is used to: for each indication in the indication information, obtain the data block to be matched at the position of the payload part in the data set according to the position indicated by the indication. Calculate the hash value of the data block to be matched. The data block to be matched whose hash value is consistent with the hash value indicated by the indication is determined as the repeated data block corresponding to the indication.
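The lookup by position and hash can be sketched as below. The block length is passed explicitly because the text states only that the indication carries a hash value and a position; `find_block`, the `length` parameter, and the use of SHA-256 are assumptions.

```python
import hashlib


def find_block(data_set, position, length, expected_hash):
    """Search stored payload parts for the block named by one indication.

    For each stored payload part, take the bytes at `position`, hash them,
    and accept the first candidate whose hash equals `expected_hash`.
    Returns None if no stored payload part contains a matching block.
    """
    for payload in data_set:
        candidate = payload[position:position + length]
        if len(candidate) == length and \
                hashlib.sha256(candidate).hexdigest() == expected_hash:
            return candidate
    return None
```

Using the position narrows each stored payload part to a single candidate block, so only one hash per stored payload part needs to be computed.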
- the payload portion includes a protocol portion and a data portion, and one or more repeated data blocks are located in the data portion.
- the data set includes a correspondence between a hash value of a historical data block and a historical data block, where the historical data block is a data block obtained by sampling a preset position of a data portion of a historical message received by the first node from the second node.
- the processing module 1302 is configured to determine a historical data block in the data set corresponding to the hash value indicated by the indication in the indication information as a duplicate data block corresponding to the indication.
- the indication is also used to indicate the position of the corresponding repeated data block in the data portion of the original message corresponding to the first message.
- the processing module 1302 is used to: for each indication in the indication information, add the repeated data block corresponding to the indication at the position indicated by the indication in the data portion of the first message.
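Recovery from a hash-to-block correspondence can be sketched as re-inserting each block at its original position. The function name `restore`, the dictionary form of the indications, and `block_store` are hypothetical; positions are assumed to refer to the data portion of the original, non-deduplicated message.

```python
def restore(data: bytes, indications, block_store):
    """Rebuild the original data portion from a deduplicated one.

    indications: list of {"hash": ..., "position": ...} entries.
    block_store: mapping from hash value to the historical data block.
    Blocks are inserted in ascending position order, so that every
    position is valid at the moment its block is inserted.
    Raises KeyError if a referenced block is not in the store.
    """
    restored = bytearray(data)
    for ind in sorted(indications, key=lambda i: i["position"]):
        block = block_store[ind["hash"]]
        restored[ind["position"]:ind["position"]] = block
    return bytes(restored)
```

Sorting matters: once all blocks at lower positions have been restored, the prefix of the buffer equals the prefix of the original data portion, so the next position indexes the correct place.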
- the repeated content also includes protocol information located in the protocol part, and the indication information also includes a difference indication, the difference indication is used to indicate the difference between the protocol part of the original message corresponding to the first message and the protocol part of the target message, the target message is a historical message received by the first node from the second node, in which the data part and the data part of the original message have one or more repeated data blocks; the data set also includes the protocol part of the message to which the historical data block belongs.
- the processing module 1302 is also used to: obtain the protocol part of the target message to which one or more repeated data blocks belong from the data set. Modify the protocol part of the target message according to the difference indication, and use the modified protocol part of the target message as the protocol part of the second message.
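If the difference indication is encoded as a list of `(offset, new_bytes)` runs, which is an assumed format and not one specified in the text, applying it to the stored protocol part of the target message could look like this:

```python
def apply_protocol_diff(reference: bytes, diff):
    """Rebuild a protocol part from the stored one plus a difference indication.

    reference: protocol part of the target (historical) message.
    diff: list of (offset, new_bytes) runs that overwrite the reference.
    """
    restored = bytearray(reference)
    for offset, new_bytes in diff:
        restored[offset:offset + len(new_bytes)] = new_bytes
    return bytes(restored)
```

The result is used as the protocol part of the recovered message, while the stored reference itself is left unmodified for later messages.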
- the deduplication mark is located in the payload part of the first message, and one or more flow grouping sets are stored in the first node, each flow grouping set including flow identifiers of multiple flows flowing through the first node.
- the processing module 1302 is also used to determine that there is a target flow grouping set including the flow identifier of the flow to which the first message belongs in the one or more flow grouping sets before the first node determines that the first message is a deduplication message based on the deduplication mark; parse the payload part of the first message to obtain the deduplication mark.
- the processing module 1302 is configured to obtain the repeated content from the payload content of the target historical message in the data set according to the indication information, and the flow identifier of the flow to which the target historical message belongs belongs to the target flow grouping set.
- the receiving module 1301 is further configured to receive a third message, wherein the flow identifier of the flow to which the third message belongs does not belong to any flow grouping set.
- the sending module 1303 is further configured to forward the third message.
- the third message may be the message 23 described above.
- the receiving module 1301 is further used to receive a fourth message, and the flow identifier of the flow to which the fourth message belongs belongs to the flow grouping set.
- the processing module 1302 is further used to parse the payload part of the fourth message, determine that the payload part of the fourth message does not carry a deduplication mark; add at least part of the content of the payload part of the fourth message to the data set, and forward the fourth message.
- the fourth message can be the above-mentioned message 24.
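The three receive-side cases above (flow not in any grouping set, deduplication message, and non-deduplicated message of a grouped flow) can be summarized in a small dispatch function. The one-byte `DEDUP_MARK` at the start of the payload is a hypothetical encoding, and the returned action names are illustrative.

```python
DEDUP_MARK = b"\xd5"  # hypothetical one-byte deduplication mark at payload start


def handle(flow_id, payload: bytes, grouping_sets, data_set: list) -> str:
    """Decide what the last deduplication-capable node does with a message."""
    if not any(flow_id in group for group in grouping_sets):
        # Flow is in no grouping set: forward without parsing the payload.
        return "forward"
    if payload.startswith(DEDUP_MARK):
        # Deduplication message: recovery is required before forwarding.
        return "recover"
    # Grouped flow, no mark: keep the payload as history, then forward.
    data_set.append(payload)
    return "store_and_forward"
```

Checking the flow grouping sets first means the payload is parsed only for flows that can actually carry deduplicated messages.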
- the processing module 1302 is further configured to: among multiple received messages that belong to different flows, identify the messages whose payload parts have duplicate content, and add the flow identifiers of the flows to which those messages belong to the same flow grouping set, where the senders of the different flows are all SFU servers.
- the sending module 1303 is further configured to send grouping information to the second node, where the grouping information includes a correspondence between a node identifier of the first node and one or more flow grouping sets.
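Building flow grouping sets can be sketched as follows. For simplicity the sketch treats two payload parts as having duplicate content only when they are byte-for-byte identical, which is narrower than the block-level matching described in this application; `group_flows` and the `(flow_id, payload)` input format are assumptions.

```python
def group_flows(messages):
    """Group flow identifiers whose messages carried the same payload.

    messages: iterable of (flow_id, payload) pairs from different flows.
    Returns a list of disjoint sets of flow identifiers.
    """
    by_payload = {}
    for flow_id, payload in messages:
        by_payload.setdefault(payload, set()).add(flow_id)

    groups = []
    for flows in by_payload.values():
        if len(flows) < 2:
            continue  # payload seen on one flow only: nothing to group
        # Merge with any existing group that shares a flow identifier.
        merged = set(flows)
        for group in [g for g in groups if g & flows]:
            merged |= group
            groups.remove(group)
        groups.append(merged)
    return groups
```

The resulting sets, paired with the node identifier of the node that built them, correspond to the grouping information reported to the upstream node.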
- the sending module 1303 is also used to send a first node discovery message to the fourth node after receiving the fifth message whose destination port number is the SFU service port number.
- the fourth node is the next hop of the fifth message on the first node, and the destination of the fifth message is the SFU server.
- the first node discovery message carries the identifier of the SFU server, and the first node discovery message indicates that the first node is the lower node of the fourth node on the transmission path starting from the SFU server.
- the processing module 1302 is also used to determine that the fourth node supports data deduplication in response to receiving a first node discovery response message corresponding to the first node discovery message sent by the fourth node.
- the fourth node can be the above-mentioned node 24
- the fifth message can be the above-mentioned message 25
- the first node discovery message can be the above-mentioned node discovery message 21
- the first node discovery response message can be the above-mentioned node discovery response message 21.
- the processing module 1302 is also used to generate a first node discovery message based on the fifth message, the message header of the first node discovery message is the same as the message header of the fifth message, and the payload part of the first node discovery message carries an indication of the message type of the first node discovery message.
- the receiving module 1301 is also used to receive a second node discovery message sent by the fifth node, the second node discovery message carries the identifier of the SFU server, and the second node discovery message indicates that the fifth node is the upper node of the first node on the transmission path starting from the SFU server.
- the processing module 1302 is also used to determine that the fifth node supports data deduplication based on the second node discovery message.
- the sending module 1303 is also used to send a second node discovery response message corresponding to the second node discovery message to the fifth node, and the second node discovery response message indicates that the first node supports data deduplication.
- the fifth node can be the above-mentioned node 25
- the second node discovery message can be the above-mentioned node discovery message 22
- the second node discovery response message can be the above-mentioned node discovery response message 22.
- the sending module 1303 is further used to send a third node discovery message to the sixth node after receiving the sixth message whose source port number is the SFU service port number, the sixth node is the next hop of the sixth message on the first node, the sender of the sixth message is the SFU server, the third node discovery message carries the identifier of the SFU server, and the third node discovery message indicates that the first node is the upper node of the sixth node on the transmission path starting from the SFU server.
- the processing module 1302 is also used to determine, in response to not receiving from the sixth node a third node discovery response message corresponding to the third node discovery message, that the first node is the last node that supports data deduplication on the transmission path starting from the SFU server.
- the sixth node can be the above-mentioned node 26
- the sixth message can be the above-mentioned message 26
- the third node discovery message can be the above-mentioned node discovery message 23
- the third node discovery response message can be the above-mentioned node discovery response message 23.
- Figure 14 is a schematic diagram of the structure of another communication node provided in an embodiment of the present application.
- the communication node is a first node, for example, it can be the node 31 in the method 1100 shown in Figure 11.
- the communication node 1400 includes a receiving module 1401, a processing module 1402 and a sending module 1403.
- the receiving module 1401 is used to receive a first message sent by the second node, the sender of the first message is an SFU server, and the first node is an intermediate node that supports data deduplication on the transmission path of the first message.
- the processing module 1402 is used to: if the first message is a non-deduplicated message, a first original message exists in the historical message sent by the first node to the third node, and the payload part of the first original message and the payload part of the first message have a first repeated content, perform deduplication processing on the payload part of the first message to obtain a second message. The second message does not include the first repeated content, and the second message carries a deduplication mark and first indication information of the first repeated content. The deduplication mark is used to indicate that the second message is a deduplication message, and the third node is the next hop of the first message on the first node.
- the sending module 1403 is used to send a second message to the third node.
- the first node can be the above-mentioned node 31
- the second node can be the above-mentioned node 32
- the third node can be the above-mentioned node 33
- the first message can be the above-mentioned message 31
- the second message can be the above-mentioned message 32.
- the sending module 1403 is further configured to send the first message to the third node if the first message is a non-deduplicated message and the first original message does not exist in the historical messages sent by the first node to the third node.
- the first repetitive content includes one or more repetitive data blocks
- the first indication information includes one or more indications
- the one or more indications correspond one-to-one to the one or more repetitive data blocks
- each indication is used to indicate a hash value of a corresponding repetitive data block.
- a first data set is stored in the first node, and the first data set includes the payload part of the historical message sent by the first node to the third node.
- the processing module 1402 is also used to: content match the payload part of the first message with the payload part in the first data set. If there is a target payload part in the first data set that has a repeated data block with the payload part of the first message, it is determined that the first original message exists in the historical message. If there is no payload part in the first data set that has a repeated data block with the payload part of the first message, it is determined that the first original message does not exist in the historical message, and the first node adds the payload part of the first message to the first data set.
- when the first data set includes a target payload part, the processing module 1402 is used to: for each repeated data block between the payload part of the first message and the target payload part, calculate a hash value of the repeated data block, remove the repeated data block from the payload part of the first message, and add an indication corresponding to the repeated data block to the payload part of the first message, where the indication is used to indicate the hash value of the repeated data block and the position of the repeated data block in the payload part of the first message.
- the payload portion includes a protocol portion and a data portion, and one or more repeated data blocks are located in the data portion of the first message; a sampling tag set is stored in the first node, and the sampling tag set includes a hash value of a historical data block, and the historical data block is a data block obtained by sampling a preset position of the data portion of the historical message sent by the first node to the third node.
- the processing module 1402 is also used to: sample a preset position of the data portion of the first message to obtain a sampled data block. Calculate the hash value of the sampled data block. If the sampling tag set includes the hash value of the sampled data block, it is determined that the first original message exists in the historical message. If the sampling tag set does not include the hash value of the sampled data block, it is determined that the first original message does not exist in the historical message, and the first node adds the hash value of the sampled data block to the sampling tag set.
- when the sampling tag set includes the hash value of the sampled data block, the processing module 1402 is used to: take the sampled data block whose hash value belongs to the sampling tag set as a repeated data block, remove the repeated data block from the data part of the first message, and add an indication corresponding to the repeated data block to the payload part of the first message, where the indication is used to indicate the hash value of the repeated data block.
- the first node also stores the protocol part of the historical message sent by the first node to the third node; the first repeated content also includes protocol information located in the protocol part of the first message, and the first indication information also includes a difference indication, which is used to indicate the difference between the protocol part of the first message and the protocol part of the first original message.
- the sending module 1403 is also used to send the first message to the third node if the first message is a deduplicated message, the first message carries second indication information for the second duplicate content, and the second original message exists in the historical message sent by the first node to the third node, and the payload part of the second original message includes the second duplicate content.
- the processing module 1402 is further used to: if the first message is a deduplicated message, the first message carries second indication information for the second duplicate content, and no second original message whose payload part includes the second duplicate content exists in the historical message sent by the first node to the third node, obtain the second duplicate content from a second data set according to the second indication information, where the second data set includes at least part of the payload part of the historical message received by the first node from the second node; and perform deduplication recovery processing on the payload part of the first message according to the second duplicate content to obtain a third message, where the payload part of the third message includes the second duplicate content.
- the sending module 1403 is further used to send the third message to the third node.
- the third message can be the message 33 mentioned above.
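The four cases handled by the intermediate node can be summarized as a decision on two conditions: whether the incoming message is a deduplication message, and whether the next hop has already been sent a message containing the relevant content. The function and the action names below are illustrative only.

```python
def intermediate_action(is_dedup: bool, downstream_has_original: bool) -> str:
    """Choose what an intermediate deduplication node does with a message.

    is_dedup: the incoming message carries a deduplication mark.
    downstream_has_original: the history sent to the next hop contains a
    message whose payload part includes the repeated content.
    """
    if not is_dedup:
        # Non-deduplicated message: deduplicate only if the next hop can recover it.
        return "deduplicate" if downstream_has_original else "forward"
    # Deduplicated message: pass it through if the next hop can recover it,
    # otherwise recover it here before sending.
    return "forward" if downstream_has_original else "recover"
```

In the pass-through case the intermediate node does no recovery work, and the removed content is recovered further along the path by a node that holds the original.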
- the second repetitive content includes one or more repetitive data blocks
- the second indication information includes one or more indications
- the one or more indications correspond one-to-one to the one or more repetitive data blocks
- each indication is used to indicate a hash value of a corresponding repetitive data block.
- each indication is also used to indicate the position of the corresponding repeated data block in the payload part of the original message corresponding to the first message
- the second data set includes the payload part of the historical message received by the first node from the second node.
- Processing module 1402 is used to: for each indication in the second indication information, obtain the data block to be matched at the position of the payload part in the second data set according to the position indicated by the indication. Calculate the hash value of the data block to be matched. The data block to be matched whose hash value is consistent with the hash value indicated by the indication is determined as the repeated data block corresponding to the indication.
- the payload portion includes a protocol portion and a data portion, and one or more repeated data blocks are located in the data portion;
- the second data set includes a correspondence between a hash value of a historical data block and a historical data block, and the historical data block is a data block obtained by sampling a preset position of a data portion of a historical message received by the first node from the second node.
- the processing module 1402 is configured to determine a historical data block in the second data set corresponding to the hash value indicated by the indication in the second indication information as a repeated data block corresponding to the indication.
- the indication is also used to indicate the position of the corresponding repeated data block in the data portion of the original message corresponding to the first message.
- the processing module 1402 is used to: for each indication in the second indication information, add the repeated data block corresponding to the indication at the position indicated by the indication in the data portion of the first message.
- the second repeated content also includes protocol information located in the protocol part, and the second indication information also includes a difference indication, the difference indication is used to indicate the difference between the protocol part of the original message corresponding to the first message and the protocol part of the target message, the target message is a historical message received by the first node from the second node, in which the data part and the data part of the original message have one or more repeated data blocks; the second data set also includes the protocol part of the message to which the historical data block belongs.
- the processing module 1402 is also used to: obtain the protocol part of the target message to which one or more repeated data blocks belong from the second data set. Modify the protocol part of the target message according to the difference indication, and use the modified protocol part of the target message as the protocol part of the third message.
- one or more local flow grouping sets are stored in the first node, each local flow grouping set including flow identifiers of multiple flows flowing through the first node; the processing module 1402 is further used to: after determining that there is a target flow grouping set including the flow identifier of the flow to which the first message belongs in the one or more local flow grouping sets, parse the payload part of the first message. If the payload part of the first message carries a deduplication mark, determine that the first message is a deduplication message. If the payload part of the first message does not carry a deduplication mark, determine that the first message is a non-deduplication message.
- the processing module 1402 is further used to: among multiple received messages that belong to different flows, identify the messages whose payload parts have duplicate content, and add the flow identifiers of the flows to which those messages belong to the same local flow grouping set, where the senders of the different flows are all SFU servers.
- the sending module 1403 is further configured to send first grouping information to the second node, where the first grouping information includes a correspondence between a node identifier of the first node and one or more local flow grouping sets.
- one or more lower-level flow grouping sets corresponding to the third node are stored in the first node, and each lower-level flow grouping set includes flow identifiers of multiple flows flowing through the third node.
- the processing module 1402 is also used to: after the first node receives the first message sent by the second node, if the lower-level flow grouping sets corresponding to the third node include a target flow grouping set containing the flow identifier of the flow to which the first message belongs, determine whether the target historical messages sent to the third node include a message whose payload part has repeated content with the payload part of the first message, where the flow identifier of the flow to which each target historical message belongs is in the target flow grouping set.
- Sending module 1403 is also used to send the first message to the third node if all lower-level flow grouping sets corresponding to the third node do not include the flow identifier of the flow to which the first message belongs.
- the receiving module 1401 is further configured to receive second grouping information sent by a third node, where the second grouping information includes a correspondence between a node identifier of the third node and one or more lower-level flow grouping sets.
- the sending module 1403 is also used to send a first node discovery message to the fourth node after receiving a fourth message whose destination port number is the SFU service port number.
- the fourth node is the next hop of the fourth message on the first node.
- the destination of the fourth message is the SFU server.
- the first node discovery message carries the identifier of the SFU server, and the first node discovery message indicates that the first node is the lower node of the fourth node on the transmission path starting from the SFU server.
- the processing module 1402 is also used to determine that the fourth node supports data deduplication in response to receiving a first node discovery response message corresponding to the first node discovery message sent by the fourth node.
- the fourth node can be the above-mentioned node 34
- the fourth message can be the above-mentioned message 34
- the first node discovery message can be the above-mentioned node discovery message 31
- the first node discovery response message can be the above-mentioned node discovery response message 31.
- the receiving module 1401 is further used to receive a second node discovery message sent by the fifth node, the second node discovery message carries an identifier of the SFU server, and the second node discovery message indicates that the fifth node is the lower node of the first node on the transmission path starting from the SFU server.
- the processing module 1402 is also used to determine that the fifth node supports data deduplication according to the second node discovery message.
- the sending module 1403 is also used to send a second node discovery response message corresponding to the second node discovery message to the fifth node, and the second node discovery response message indicates that the first node supports data deduplication.
- the fifth node can be the above-mentioned node 35
- the second node discovery message can be the above-mentioned node discovery message 32
- the second node discovery response message can be the above-mentioned node discovery response message 32.
- the sending module 1403 is also used to send a third node discovery message to the sixth node after receiving the fifth message whose source port number is the SFU service port number.
- the sixth node is the next hop of the fifth message on the first node.
- the sender of the fifth message is the SFU server.
- the third node discovery message carries the identifier of the SFU server, and the third node discovery message indicates that the first node is the upper node of the sixth node on the transmission path starting from the SFU server.
- the processing module 1402 is also used to determine that the sixth node supports data deduplication in response to receiving a third node discovery response message corresponding to the third node discovery message sent by the sixth node.
- the sixth node can be the above-mentioned node 36
- the fifth message can be the above-mentioned message 35
- the third node discovery message can be the above-mentioned node discovery message 33
- the third node discovery response message can be the above-mentioned node discovery response message 33.
- the receiving module 1401 is also used to receive a fourth node discovery message sent by the seventh node, the fourth node discovery message carries the identifier of the SFU server, and the fourth node discovery message indicates that the seventh node is a subordinate node of the first node on the transmission path starting from the SFU server.
- the processing module 1402 is also used to determine that the seventh node supports data deduplication based on the fourth node discovery message.
- the sending module 1403 is also used to send a fourth node discovery response message corresponding to the fourth node discovery message to the seventh node, and the fourth node discovery response message indicates that the first node supports data deduplication.
- the seventh node can be the above-mentioned node 37
- the fourth node discovery message can be the above-mentioned node discovery message 34
- the fourth node discovery response message can be the above-mentioned node discovery response message 34.
- FIG. 15 is a schematic diagram of the hardware structure of a communication device provided in an embodiment of the present application.
- a communication device 1500 includes a processor 1501 and a memory 1502, and the processor 1501 is connected to the memory 1502 via a bus 1503.
- FIG. 15 illustrates the case in which the processor 1501 and the memory 1502 are independent of each other.
- alternatively, the processor 1501 and the memory 1502 may be integrated together.
- the communication device 1500 may be, for example, a network device or a server.
- the memory 1502 is used to store computer programs, including operating systems and program codes.
- the memory 1502 is a storage medium of any of various types, such as read-only memory (ROM), random access memory (RAM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM), flash memory, optical storage, register, optical disk storage, magnetic disk or other magnetic storage devices.
- the processor 1501 is a general-purpose processor or a special-purpose processor.
- the processor 1501 may be a single-core processor or a multi-core processor.
- the processor 1501 includes at least one circuit to execute the above-mentioned data transmission method provided in the embodiment of the present application, such as the steps performed by the node 11, the node 21 or the node 31 in the above-mentioned method embodiment.
- the communication device 1500 further includes a network interface 1504, and the network interface 1504 is connected to the processor 1501 and the memory 1502 via the bus 1503.
- the network interface 1504 enables the communication device 1500 to communicate with other devices.
- the communication device 1500 further includes an input/output (I/O) interface 1505, which is connected to the processor 1501 and the memory 1502 via the bus 1503.
- the processor 1501 can receive input commands or data, etc. through the I/O interface 1505.
- the I/O interface 1505 is used for the communication device 1500 to connect input devices, such as keyboards, mice, etc.
- the above-mentioned network interface 1504 and I/O interface 1505 are collectively referred to as a communication interface.
- the communication device 1500 further includes a display 1506, which is connected to the processor 1501 and the memory 1502 via the bus 1503.
- the display 1506 can be used to display the intermediate results and/or final results generated by the processor 1501 executing the above method.
- the display 1506 is a touch display screen to provide a human-computer interaction interface.
- the bus 1503 is any type of communication bus for interconnecting the internal devices of the communication device 1500, for example, a system bus.
- the embodiment of the present application takes the interconnection of the above-mentioned devices inside the communication device 1500 through the bus 1503 as an example.
- alternatively, the above-mentioned devices inside the communication device 1500 may be connected to each other in a manner other than the bus 1503, for example, through a logical interface inside the communication device 1500.
- the above devices may be arranged on separate chips, or at least partially or completely on the same chip. Whether each device is independently arranged on different chips or integrated on one or more chips often depends on the needs of product design.
- the embodiments of the present application do not limit the specific implementation form of the above-mentioned devices.
- the communication device 1500 shown in Figure 15 is merely exemplary. During implementation, the communication device 1500 may also include other components, which are not listed here. The communication device 1500 shown in Figure 15 may implement data transmission by executing all or part of the steps of the method provided in the above embodiment.
- the embodiment of the present application also provides a data transmission system, including: an SFU server and multiple nodes in a communication network.
- the multiple nodes include a first node and a second node, and the first node is located between the SFU server and the second node.
- the SFU server is used to send a message to the first node
- the first node is used to execute the steps executed by node 11 in the above method embodiment
- the second node is used to execute the steps executed by node 21 in the above method embodiment.
- the plurality of nodes further include a third node, and the third node is located between the first node and the second node.
- the third node is used to execute the steps executed by the node 31 in the above method embodiment.
- the present application also provides another data transmission system, including: an SFU server and a first node in a communication network.
- the SFU server is used to execute the steps executed by the node 11 in the above method embodiment
- the first node is used to execute the steps executed by the node 21 in the above method embodiment.
- the communication network further comprises a second node, and the second node is located between the SFU server and the first node.
- the second node is used to execute the steps executed by the node 31 in the above method embodiment.
- the embodiment of the present application also provides a communication node, including: a processor and a memory.
- the memory is used to store a computer program, and the computer program includes program instructions.
- the processor is used to call the computer program to implement the steps performed by node 11, node 21 or node 31 in the above method embodiment.
- the embodiment of the present application further provides a computer-readable storage medium, on which instructions are stored.
- when the instructions are executed by a processor, the steps executed by node 11, node 21 or node 31 in the above method embodiment are implemented.
- the embodiment of the present application further provides a computer program product, including a computer program.
- when the computer program is executed by a processor, the steps executed by the node 11, the node 21 or the node 31 in the above method embodiment are implemented.
- A and/or B can represent three cases: A exists alone, A and B exist at the same time, and B exists alone.
- the character "/" in this article generally indicates that the associated objects before and after are in an "or" relationship.
- the information (including but not limited to user device information, user personal information, etc.), data (including but not limited to data used for analysis, stored data, displayed data, etc.) and signals involved in this application are all authorized by the user or fully authorized by all parties, and the collection, use and processing of relevant data must comply with relevant laws, regulations and standards of relevant countries and regions.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Transfer Between Computers (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
Description
This application claims priority to Chinese Patent Application No. 202211233819.3, filed on October 10, 2022 and entitled "Traffic grouping and deduplication technology based on SFU architecture", and to Chinese Patent Application No. 202211643989.9, filed on December 20, 2022 and entitled "Data transmission method, apparatus and system", both of which are incorporated herein by reference in their entirety.
The present application relates to the field of network technology, and in particular to a data transmission method, apparatus and system.
In communication scenarios such as video conferencing or live broadcasting, the audio and video data of the same user usually needs to be distributed to multiple recipients. This results in a large amount of transmitted data and therefore a large network overhead.
Summary of the invention
The present application provides a data transmission method, apparatus and system that can reduce network bandwidth overhead by reducing the amount of data carried in transmitted messages.
In a first aspect, a data transmission method is provided. The method includes: a first node obtains a first message whose sender is a selective forwarding unit (SFU) server, where the first node is the first node that supports data deduplication on the transmission path of the first message. If a target message exists among the historical messages sent by the first node to a second node, where the payload part of the target message has repeated content with the payload part of the first message, the first node performs deduplication processing on the payload part of the first message to obtain a second message. The second message does not include the repeated content, and carries a deduplication mark and indication information for the repeated content. The deduplication mark indicates that the second message is a deduplication message. The second node is the next hop of the first message at the first node. The first node sends the second message to the second node.
In the present application, after receiving a message whose sender is an SFU server, the first node can determine whether it has previously sent, to the next hop of that message, a historical message whose payload part has repeated content with the payload part of that message. If it has, the first node can deduplicate the message and then send the deduplicated message to the subordinate node. Because a deduplicated message carries less data than the original message, the amount of transmitted data is reduced, which in turn reduces network bandwidth overhead.
Optionally, the repeated content includes one or more duplicate data blocks, and the indication information for the repeated content includes one or more indications. The one or more indications correspond one-to-one to the one or more duplicate data blocks in the repeated content. Each indication indicates the hash value of the corresponding duplicate data block.
In one possible implementation, the first node stores a data set that includes the payload parts of historical messages sent by the first node to the second node. The first node matches the content of the payload part of the first message against the payload parts in the data set. If the data set contains a target payload part that shares a duplicate data block with the payload part of the first message, the first node determines that the target message exists among the historical messages. Accordingly, the deduplication processing performed by the first node on the payload part of the first message includes: for each duplicate data block shared by the payload part of the first message and the target payload part, the first node calculates the hash value of the duplicate data block, removes the duplicate data block from the payload part of the first message, and adds to the payload part of the first message an indication corresponding to the duplicate data block. The indication indicates the hash value of the duplicate data block and the position of the duplicate data block in the payload part of the first message.
In the present application, the first node can match the content of the payload part of an obtained message against the payload parts of stored historical messages. If the two share a duplicate data block, the first node calculates the hash value of the duplicate data block, removes the duplicate data block from the message to obtain a deduplicated message, and carries in the deduplicated message an indication of the hash value and the position of the duplicate data block, thereby deduplicating the message data.
Optionally, if the data set contains no payload part that shares a duplicate data block with the payload part of the first message, the first node determines that the target message does not exist among the historical messages, and adds the payload part of the first message to the data set. The updated data set can be used by the first node to deduplicate subsequently obtained messages whose sender is the SFU server.
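The content-matching variant described above can be sketched as follows. This is only an illustration under stated assumptions, not the claimed implementation: the fixed 4-byte block granularity, the use of SHA-256 truncated to 8 bytes, the function names `block_hash` and `dedup_payload`, and the `(position, hash)` layout of an indication are all hypothetical choices that the text does not specify.

```python
import hashlib

BLOCK = 4  # assumed block granularity in bytes (hypothetical)


def block_hash(block: bytes) -> bytes:
    # Assumed hash function: SHA-256 truncated to 8 bytes.
    return hashlib.sha256(block).digest()[:8]


def dedup_payload(payload: bytes, data_set: list):
    """Deduplicate `payload` against payloads already sent to the next hop.

    Returns (remaining_bytes, indications), where each indication is
    (position_in_original_payload, hash_of_removed_block).
    Returns None when no block repeats; the payload is then added to the
    data set so that later messages can be matched against it.
    """
    # Collect every aligned block seen in historical payloads.
    known = {
        old[i:i + BLOCK]
        for old in data_set
        for i in range(0, len(old) - BLOCK + 1, BLOCK)
    }
    remaining = bytearray()
    indications = []
    for pos in range(0, len(payload), BLOCK):
        block = payload[pos:pos + BLOCK]
        if len(block) == BLOCK and block in known:
            # Duplicate block: drop it and record its hash and position.
            indications.append((pos, block_hash(block)))
        else:
            remaining += block
    if not indications:
        data_set.append(payload)
        return None
    return bytes(remaining), indications
```

A caller would forward the original message unchanged when `None` is returned, and otherwise send the shortened payload together with a deduplication mark and the indications.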
In another possible implementation, the payload part includes a protocol part and a data part, and the one or more duplicate data blocks are located in the data part of the first message.
Optionally, the first node stores a sampling tag set that includes hash values of historical data blocks, where a historical data block is a data block obtained by sampling a preset position of the data part of a historical message sent by the first node to the second node. The first node samples the preset position of the data part of the first message to obtain a sampled data block, and calculates the hash value of the sampled data block. If the sampling tag set includes the hash value of the sampled data block, the first node determines that the target message exists among the historical messages. Accordingly, the deduplication processing performed by the first node on the payload part of the first message includes: the first node treats a sampled data block whose hash value belongs to the sampling tag set as a duplicate data block, removes the duplicate data block from the data part of the first message, and adds to the payload part of the first message an indication corresponding to the duplicate data block, where the indication indicates the hash value of the duplicate data block.
In the present application, the first node can calculate the hash value of the sampled data block at the preset position of the data part of an obtained message, and compare it with the stored hash values of historical data blocks. If the hash value of a sampled data block in the message is the same as a hash value stored by the first node, the first node removes the sampled data block from the message to obtain a deduplicated message, and carries the hash value of the sampled data block in the deduplicated message, thereby deduplicating the message data.
Optionally, there are multiple preset positions, so that sampling the preset positions of the data part of the first message yields multiple sampled data blocks. In this case, the indication corresponding to a duplicate data block also indicates the position of the duplicate data block in the data part of the first message.
In the present application, when multiple sampling positions are preset for the data part, sampling the data part of a message yields multiple sampled data blocks. In this case, the position in the original message of each duplicate data block removed from the deduplicated message must be indicated, so that a subsequent node can restore the data of the deduplicated message.
Optionally, if the sampling tag set does not include the hash value of the sampled data block, the first node determines that the target message does not exist among the historical messages, and adds the hash value of the sampled data block to the sampling tag set. The updated sampling tag set can be used by the first node to deduplicate subsequently obtained messages whose sender is the SFU server.
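The sampling variant with several preset positions might look like the sketch below. The offsets in `SAMPLE_POSITIONS`, the block length `SAMPLE_LEN`, the truncated SHA-256 hash and the function name `sample_dedup` are assumptions made for illustration; the text only states that positions are preset and that a hash is compared against a stored tag set.

```python
import hashlib

SAMPLE_POSITIONS = (0, 8)  # assumed preset offsets into the data part
SAMPLE_LEN = 4             # assumed length of each sampled block


def sample_hash(block: bytes) -> bytes:
    # Assumed hash function: SHA-256 truncated to 8 bytes.
    return hashlib.sha256(block).digest()[:8]


def sample_dedup(data: bytes, tag_set: set):
    """Sample preset positions of `data` and drop blocks already tagged.

    Returns (remaining_data, indications); each indication is
    (offset_in_original_data, hash). Hashes not yet in `tag_set` are
    added to it for use with later messages.
    """
    hits = []
    for off in SAMPLE_POSITIONS:
        block = data[off:off + SAMPLE_LEN]
        if len(block) < SAMPLE_LEN:
            continue  # data part too short to sample at this position
        digest = sample_hash(block)
        if digest in tag_set:
            hits.append((off, digest))
        else:
            tag_set.add(digest)
    remaining = bytearray(data)
    # Delete from the highest offset down so earlier offsets stay valid.
    for off, _ in sorted(hits, reverse=True):
        del remaining[off:off + SAMPLE_LEN]
    return bytes(remaining), hits
```

With a single preset position the offset in each indication is redundant, which matches the text's statement that the position is only needed when there are multiple preset positions.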
Optionally, the sampling tag set also includes the historical data blocks indicated by the hash values. In this case, determining that the target message exists among the historical messages when the sampling tag set includes the hash value of the sampled data block includes: if the sampling tag set includes the hash value of the sampled data block, the first node matches the content of the sampled data block against the historical data block indicated by that hash value. When the content of the sampled data block is the same as the content of the historical data block indicated by its hash value, the first node determines that the target message exists among the historical messages.
Two data blocks with the same hash value may still have different content. By storing in the sampling tag set the correspondence between historical data blocks and their hash values, the first node can, after determining that the hash value of a sampled data block equals the hash value of a historical data block, further match the content of the sampled data block against that historical data block. This achieves an exact match and improves the accuracy of message deduplication.
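The hash-then-verify step can be illustrated with a short sketch. The dictionary `tag_map` (hash value to stored block), the function name `is_true_duplicate` and the deliberately weak `weak_hash` used to force a collision are hypothetical and serve only to show why the content comparison matters.

```python
def weak_hash(block: bytes) -> int:
    # Deliberately collision-prone hash, used only to demonstrate the check.
    return sum(block) % 256


def is_true_duplicate(block: bytes, tag_map: dict, hash_fn=weak_hash) -> bool:
    """Return True only if `block` matches a stored block byte for byte.

    `tag_map` maps a hash value to the historical block that produced it.
    Unknown hashes are recorded and reported as not duplicate.
    """
    digest = hash_fn(block)
    stored = tag_map.get(digest)
    if stored is None:
        tag_map[digest] = block
        return False
    # Same hash: confirm with a content comparison to rule out a collision.
    return stored == block
```

Here `b"ab"` and `b"ba"` share the same weak hash, yet only an exact repeat of `b"ab"` is treated as a duplicate.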
Optionally, the first node also stores the protocol parts of historical messages sent by the first node to the second node. The repeated content then also includes protocol information located in the protocol part of the first message, and the indication information for the repeated content also includes a difference indication, which indicates the difference between the protocol part of the first message and the protocol part of the target message.
In the present application, in addition to deduplicating data blocks at preset positions in the data part of a message, the first node can also deduplicate the protocol part of the message. Carrying a difference indication in the payload part of the message in place of the protocol part further reduces the amount of transmitted data, thereby reducing network bandwidth overhead.
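One way to picture a difference indication is a list of the byte positions at which the current protocol part differs from the stored one. The sketch below assumes equal-length protocol parts and a `(offset, new_byte)` encoding; both are illustrative assumptions, since the text does not define the format of the difference indication.

```python
def diff_indication(current: bytes, reference: bytes):
    """Return [(offset, new_byte), ...] where `current` differs from `reference`."""
    if len(current) != len(reference):
        raise ValueError("this sketch assumes equal-length protocol parts")
    return [(i, c) for i, (c, r) in enumerate(zip(current, reference)) if c != r]


def apply_diff(reference: bytes, diff) -> bytes:
    """Rebuild the current protocol part from the reference and the diff."""
    out = bytearray(reference)
    for offset, value in diff:
        out[offset] = value
    return bytes(out)
```

For protocol parts that differ only in a sequence number or timestamp, the diff is a handful of entries rather than the whole protocol part.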
Optionally, the first node stores one or more flow grouping sets corresponding to the second node. Each flow grouping set includes flow identifiers of multiple flows passing through the second node. After the first node obtains the first message, if the flow grouping sets corresponding to the second node include a target flow grouping set that contains the flow identifier of the flow to which the first message belongs, the first node determines whether the target message exists among the target historical messages sent to the second node, where the flow identifier of the flow to which each target historical message belongs is in the target flow grouping set. If none of the flow grouping sets corresponding to the second node contains the flow identifier of the flow to which the first message belongs, the first node sends the first message to the second node.
In the present application, the first node can determine, based on the flow grouping sets corresponding to a subordinate node, whether a message to be sent to that subordinate node needs deduplication processing. If the flow identifier of the flow to which the message belongs is not in any flow grouping set corresponding to the subordinate node, the first node forwards the message directly to the subordinate node without running the deduplication procedure, which reduces the processing overhead of the first node. In addition, when the subordinate node has multiple flow grouping sets, the first node only needs to compare the payload part of the message against the payload parts of historical messages belonging to the flows indicated by a single flow grouping set. This reduces the number of historical messages the first node must examine, lowering its processing overhead and improving its message processing efficiency, and therefore the message transmission efficiency.
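The gating role of the flow grouping sets can be sketched as a lookup that either narrows the history to one group or signals that the message should be forwarded unchanged. The function name `select_history`, the use of string flow identifiers and the per-flow history dictionary are assumptions for illustration.

```python
def select_history(flow_id, groups_for_next_hop, history_by_flow):
    """Pick the historical payloads worth comparing against.

    groups_for_next_hop: list of sets of flow identifiers for the next hop.
    history_by_flow: dict mapping a flow identifier to the payloads already
    sent to that next hop on that flow.
    Returns None when no group contains `flow_id`, meaning the message
    should be forwarded without running deduplication.
    """
    for group in groups_for_next_hop:
        if flow_id in group:
            # Only flows in the matching group are candidate sources.
            return [
                payload
                for member in sorted(group)
                for payload in history_by_flow.get(member, [])
            ]
    return None
```

Only the returned candidates would then be passed to the content-matching or sampling step.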
Optionally, the first node receives grouping information sent by the second node, where the grouping information includes a correspondence between the node identifier of the second node and the one or more flow grouping sets corresponding to the second node.
Optionally, the first node is not the SFU server. After receiving a third message whose destination port number is the SFU service port number, the first node sends a first node discovery message to a third node. The third node is the next hop of the third message at the first node, and the destination of the third message is the SFU server. The first node discovery message carries the identifier of the SFU server and indicates that the first node is a subordinate node of the third node on the transmission path starting from the SFU server. In response to not receiving, from the third node, a first node discovery response message corresponding to the first node discovery message, the first node determines that it is the first node that supports data deduplication on the transmission path starting from the SFU server.
Optionally, the first node generates the first node discovery message based on the third message. The header of the first node discovery message is the same as the header of the third message, and the payload part of the first node discovery message carries an indication of its message type.
Optionally, the first node receives a second node discovery message sent by a fourth node. The second node discovery message carries the identifier of the SFU server and indicates that the fourth node is a subordinate node of the first node on the transmission path starting from the SFU server. Based on the second node discovery message, the first node determines that the fourth node supports data deduplication, and sends to the fourth node a second node discovery response message corresponding to the second node discovery message. The second node discovery response message indicates that the first node supports data deduplication.
In the present application, in the scheme where the node discovery procedure is triggered by a message whose destination is the SFU server, if a node receives a node discovery message sent by a subordinate node but does not receive a node discovery response message from a superior node, the node can determine that it is the first node on the transmission path starting from the SFU server.
可选地,第一节点接收到源端口号为SFU服务端口号的第四报文之后,向第五节点发送第三节点发现报文,第五节点为第四报文在第一节点上的下一跳,第四报文的发送方为SFU服务器,第三节点发现报文携带有该SFU服务器的标识,且第三节点发现报文指示第一节点为第五节点在以该SFU服务器为起点的传输路径上的上级节点。响应于接收到第五节点发送的第三节点发现报文对应的第三节点发现响应报文,第一节点确定第五节点支持数据去重。Optionally, after the first node receives the fourth message whose source port number is the SFU service port number, the first node sends a third node discovery message to the fifth node, the fifth node is the next hop of the fourth message on the first node, the sender of the fourth message is the SFU server, the third node discovery message carries the identifier of the SFU server, and the third node discovery message indicates that the first node is the superior node of the fifth node on the transmission path starting from the SFU server. In response to receiving a third node discovery response message corresponding to the third node discovery message sent by the fifth node, the first node determines that the fifth node supports data deduplication.
In the present application, in the scheme where the node discovery process is triggered by a message whose sender is the SFU server, if a node receives a node discovery response message from a lower-level node but does not receive a node discovery message from an upper-level node, the node can determine that it is the first node on the transmission path starting from the SFU server.
Optionally, if no target message exists among the historical messages sent by the first node to the second node, the first node sends the first message to the second node.
In a second aspect, a data transmission method is provided. The method includes: a first node receives a first message sent by a second node. The sender of the first message is an SFU server. The first message carries a deduplication mark and indication information of duplicate content. The deduplication mark indicates that the first message is a deduplicated message. The first node is the last node that supports data deduplication on the transmission path of the first message. The first node determines, based on the deduplication mark in the first message, that the first message is a deduplicated message. The first node obtains the duplicate content from a data set according to the indication information, where the data set includes at least part of the content of the payload parts of historical messages received by the first node from the second node. The first node performs deduplication recovery processing on the payload part of the first message according to the duplicate content to obtain a second message. The payload part of the second message includes the duplicate content. The first node sends the second message to a third node, where the third node is the next hop of the first message on the first node.
In the present application, the first node can perform data recovery on deduplicated messages whose sender is the SFU server, so that the user receives the original message carrying the complete data content, thereby safeguarding the user's service.
Optionally, the duplicate content includes one or more duplicate data blocks. The indication information of the duplicate content includes one or more indications, which correspond one-to-one to the one or more duplicate data blocks in the duplicate content. Each indication indicates the hash value of the corresponding duplicate data block.
In one possible implementation, each indication also indicates the position of the corresponding duplicate data block in the payload part of the original message corresponding to the first message, and the data set includes the payload parts of historical messages received by the first node from the second node. The process by which the first node obtains the duplicate content from the data set according to the indication information includes: for each indication in the indication information, the first node obtains, according to the position indicated by the indication, a to-be-matched data block at that position of a payload part in the data set. The first node calculates the hash value of the to-be-matched data block. The first node determines the to-be-matched data block whose hash value matches the hash value indicated by the indication to be the duplicate data block corresponding to the indication.
In the present application, based on the position, in the payload part of the original message, of the duplicate data block indicated by an indication carried in the deduplicated message, the first node can calculate the hash value of the data block at that position in each of multiple stored payload parts, so as to find a stored data block whose hash value matches the hash value indicated by the indication. The first node then adds the stored data block at the position indicated by the indication, thereby recovering the data of the deduplicated message.
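The position-based recovery described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the hash function, the block length, or the encoding of an indication, so SHA-256, an explicit length field, and the tuple layout `(offset, length, digest)` are all hypothetical, as are the function names.

```python
import hashlib

def block_hash(block: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the unspecified hash function.
    return hashlib.sha256(block).digest()

def find_block(stored_payloads, offset, length, digest):
    """Look at the indicated position in every stored payload part and
    return the block whose hash matches the indicated hash."""
    for payload in stored_payloads:
        candidate = payload[offset:offset + length]
        if len(candidate) == length and block_hash(candidate) == digest:
            return candidate
    return None

def restore_payload(stripped: bytes, indications, stored_payloads) -> bytes:
    """Re-insert each duplicate block at its position in the original payload.

    indications: iterable of (offset, length, digest) tuples (hypothetical
    layout). Processing them in ascending offset keeps every insert position
    valid, because everything before it has already been restored.
    """
    out = bytearray(stripped)
    for offset, length, digest in sorted(indications):
        block = find_block(stored_payloads, offset, length, digest)
        if block is None:
            raise LookupError(f"no stored block matches offset {offset}")
        out[offset:offset] = block
    return bytes(out)

# Usage: two blocks were removed from the original payload.
history = [b"XXXXhelloYYYYworld"]
indications = [(4, 5, block_hash(b"hello")), (13, 5, block_hash(b"world"))]
restored = restore_payload(b"AAAABBBB", indications, history)
```

In this usage example the restored payload is `b"AAAAhelloBBBBworld"`, even though the stored historical payload differs from the original outside the duplicate blocks.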
In another possible implementation, the payload part includes a protocol part and a data part, and the one or more duplicate data blocks are located in the data part.
Optionally, the data set includes a correspondence between hash values of historical data blocks and the historical data blocks, where a historical data block is a data block obtained by sampling a preset position of the data part of a historical message received by the first node from the second node. The way in which the first node obtains the duplicate content from the data set according to the indication information includes: the first node determines the historical data block in the data set that corresponds to the hash value indicated by an indication in the indication information to be the duplicate data block corresponding to that indication.
Optionally, each indication in the indication information of the duplicate content also indicates the position of the corresponding duplicate data block in the data part of the original message corresponding to the first message. The way in which the first node performs deduplication recovery processing on the payload part of the first message according to the duplicate content includes: for each indication in the indication information, the first node adds the duplicate data block corresponding to the indication at the position, in the data part of the first message, indicated by the indication.
In the present application, the first node can look up the hash value carried in the deduplicated message among the hash values of the stored historical data blocks, and add the historical data block corresponding to the matching hash value to the data part of the deduplicated message, thereby recovering the data part of the deduplicated message.
Optionally, the duplicate content also includes protocol information located in the protocol part, and the indication information also includes a difference indication. The difference indication indicates the difference between the protocol part of the original message corresponding to the first message and the protocol part of a target message. The target message is a historical message, among those received by the first node from the second node, whose data part shares one or more duplicate data blocks with the data part of the original message. The data set also includes the protocol part of the message to which each historical data block belongs. The process by which the first node obtains the duplicate content from the data set according to the indication information also includes: the first node obtains, from the data set, the protocol part of the target message to which the one or more duplicate data blocks belong. Correspondingly, the process by which the first node performs deduplication recovery processing on the payload part of the first message according to the duplicate content also includes: the first node modifies the protocol part of the target message according to the difference indication, and uses the modified protocol part as the protocol part of the second message.
In the present application, the first node can obtain the protocol part of the historical message to which the historical data block corresponding to the matching hash value belongs and, combined with the difference indication for the protocol part carried in the deduplicated message, restore the protocol part of the original message corresponding to the deduplicated message, thereby recovering the protocol part of the deduplicated message.
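As a rough illustration of restoring the protocol part from a stored historical protocol part plus a difference indication, consider the sketch below. The text does not define how the difference indication is encoded; the list of `(offset, replacement_bytes)` patches used here is purely an assumption, and `apply_difference` is a hypothetical name.

```python
def apply_difference(target_protocol_part: bytes, difference) -> bytes:
    """Rebuild the original protocol part from the target message's
    protocol part.

    difference: list of (offset, replacement_bytes) patches (hypothetical
    encoding). Each patch overwrites bytes in place, so the length of the
    protocol part is unchanged.
    """
    out = bytearray(target_protocol_part)
    for offset, replacement in difference:
        out[offset:offset + len(replacement)] = replacement
    return bytes(out)

# Usage: only the last two bytes (for example, a counter field) differ
# between the stored protocol part and the original one.
stored = b"\x80\x60\x00\x01"
rebuilt = apply_difference(stored, [(2, b"\x00\x02")])
```

Carrying only the differing bytes is what lets the deduplicated message omit the rest of the protocol part; a real encoding might also need to handle protocol parts of different lengths, which this sketch does not.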
Optionally, the deduplication mark is located in the payload part of the first message. The first node stores one or more flow grouping sets, and each flow grouping set includes the flow identifiers of multiple flows passing through the first node. Before determining, based on the deduplication mark, that the first message is a deduplicated message, the first node determines that the one or more flow grouping sets contain a target flow grouping set that includes the flow identifier of the flow to which the first message belongs. The first node then parses the payload part of the first message to obtain the deduplication mark.
Optionally, the way in which the first node obtains the duplicate content from the data set according to the indication information includes: the first node obtains, according to the indication information, the duplicate content from the payload content of a target historical message in the data set, where the flow identifier of the flow to which the target historical message belongs is in the target flow grouping set.
In this implementation, when the first node has multiple flow grouping sets, it only needs to search the payload parts of historical messages belonging to the flows indicated by a single flow grouping set to obtain the duplicate content. This reduces the number of historical messages the first node has to search, which lowers its processing overhead and improves its message processing efficiency, and therefore improves message transmission efficiency.
Optionally, the first node receives a third message, where the flow identifier of the flow to which the third message belongs is not in any flow grouping set. The first node forwards the third message.
In the present application, an upper-level node deduplicates only messages in the flows indicated by the flow grouping sets corresponding to its lower-level node. Therefore, after receiving a message, the first node can first determine whether the flow identifier of the flow to which the message belongs is in one of its own flow grouping sets. If it is, the message may be a deduplicated message processed by the upper-level node, and the first node needs to parse the payload part of the message to determine whether it is a deduplicated message. If the flow identifier is not in any of the first node's flow grouping sets, the upper-level node will not deduplicate the message, so the message cannot be a deduplicated message. The first node can then forward the message directly without parsing its payload part. This reduces the processing overhead of the bottom node.
Optionally, the first node receives a fourth message, where the flow identifier of the flow to which the fourth message belongs is in a flow grouping set. The first node parses the payload part of the fourth message and determines that it does not carry a deduplication mark. The first node adds at least part of the content of the payload part of the fourth message to the data set, and forwards the fourth message. The fourth message here can be regarded as the first message in a group of deduplicable messages received by the first node. By storing at least part of the content of its payload part in the data set, the first node can later perform data recovery on the deduplicated messages it receives.
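The three cases described above (flow not in any flow grouping set, flow in a set but no deduplication mark, flow in a set with a deduplication mark) can be summarized in a small sketch. All names here are hypothetical, and the two-byte marker at the start of the payload is an assumption made only for the example; the text does not specify how the deduplication mark is encoded.

```python
from dataclasses import dataclass, field

DEDUP_MARK = b"\xde\xd0"  # hypothetical encoding of the deduplication mark

@dataclass
class LastNode:
    flow_groups: list = field(default_factory=list)  # list of sets of flow ids
    history: dict = field(default_factory=dict)      # group index -> payloads

    def find_group(self, flow_id):
        """Return the index of the flow grouping set containing flow_id."""
        for index, group in enumerate(self.flow_groups):
            if flow_id in group:
                return index
        return None

    def handle(self, flow_id: str, payload: bytes) -> str:
        group = self.find_group(flow_id)
        if group is None:
            # Not in any set: cannot be a deduplicated message,
            # so forward without parsing the payload part.
            return "forward"
        if payload.startswith(DEDUP_MARK):
            # Recovery would search only the history of this group.
            return "restore"
        # First message of a deduplicable group: keep it for later recovery.
        self.history.setdefault(group, []).append(payload)
        return "store_and_forward"

node = LastNode(flow_groups=[{"flow-1", "flow-2"}])
results = [
    node.handle("flow-9", b"anything"),
    node.handle("flow-1", b"full payload"),
    node.handle("flow-2", DEDUP_MARK + b"stripped"),
]
```

Keeping the history per flow grouping set reflects the point made above that the node only has to search historical messages of the target flow grouping set.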
Optionally, among the multiple messages it receives that belong to different flows, the first node adds the flow identifiers of the flows whose messages have duplicate content in their payload parts to the same flow grouping set. The senders of these different flows are all the SFU server.
Optionally, the first node sends grouping information to the second node, where the grouping information includes a correspondence between the node identifier of the first node and the one or more flow grouping sets.
Optionally, after receiving a fifth message whose destination port number is the SFU service port number, the first node sends a first node discovery message to a fourth node. The fourth node is the next hop of the fifth message on the first node, and the destination of the fifth message is the SFU server. The first node discovery message carries the identifier of the SFU server and indicates that the first node is a lower-level node of the fourth node on the transmission path starting from the SFU server. In response to receiving, from the fourth node, a first node discovery response message corresponding to the first node discovery message, the first node determines that the fourth node supports data deduplication.
Optionally, the first node generates the first node discovery message from the fifth message. The message header of the first node discovery message is the same as the message header of the fifth message, and the payload part of the first node discovery message carries an indication of its message type.
In the present application, in the scheme where the node discovery process is triggered by a message destined for the SFU server, if a node receives a node discovery response message from an upper-level node but does not receive a node discovery message from a lower-level node, the node can determine that it is the last node on the transmission path starting from the SFU server.
Optionally, the first node receives a second node discovery message sent by a fifth node. The second node discovery message carries the identifier of the SFU server and indicates that the fifth node is an upper-level node of the first node on the transmission path starting from the SFU server. Based on the second node discovery message, the first node determines that the fifth node supports data deduplication, and sends the fifth node a second node discovery response message corresponding to the second node discovery message. The second node discovery response message indicates that the first node supports data deduplication.
Optionally, after receiving a sixth message whose source port number is the SFU service port number, the first node sends a third node discovery message to a sixth node. The sixth node is the next hop of the sixth message on the first node, and the sender of the sixth message is the SFU server. The third node discovery message carries the identifier of the SFU server and indicates that the first node is an upper-level node of the sixth node on the transmission path starting from the SFU server. If the first node does not receive, from the sixth node, a third node discovery response message corresponding to the third node discovery message, the first node determines that it is the last node that supports data deduplication on the transmission path starting from the SFU server.
In the present application, in the scheme where the node discovery process is triggered by a message whose sender is the SFU server, if a node receives a node discovery message from an upper-level node but does not receive a node discovery response message from a lower-level node, the node can determine that it is the last node on the transmission path starting from the SFU server.
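The position rules stated for the two discovery schemes share one pattern: what matters is whether the node has heard from a deduplication-capable upper-level neighbor and from a deduplication-capable lower-level neighbor, whether through a discovery message or a discovery response message. A minimal sketch of that reading follows. The enum, the function name, and the mapping of "heard from both sides" to an intermediate node are assumptions for illustration, not wording from the text.

```python
from enum import Enum

class Role(Enum):
    FIRST = "first"
    INTERMEDIATE = "intermediate"
    LAST = "last"
    UNKNOWN = "unknown"

def infer_role(heard_from_upper: bool, heard_from_lower: bool) -> Role:
    """Infer a node's position on the path starting from the SFU server.

    heard_from_upper: a discovery message or discovery response message was
        received from an upper-level node.
    heard_from_lower: a discovery message or discovery response message was
        received from a lower-level node.
    """
    if heard_from_upper and heard_from_lower:
        return Role.INTERMEDIATE  # assumption: neighbors on both sides
    if heard_from_upper:
        return Role.LAST          # no deduplication-capable node below
    if heard_from_lower:
        return Role.FIRST         # no deduplication-capable node above
    return Role.UNKNOWN

roles = [infer_role(True, False), infer_role(False, True)]
```

Which kind of message counts as "heard" depends on the trigger direction: with messages destined for the SFU server, responses arrive from the upper-level node and discovery messages from the lower-level node, and the reverse holds for messages sent by the SFU server.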
In a third aspect, a data transmission method is provided. The method includes: a first node receives a first message sent by a second node. The sender of the first message is an SFU server. The first node is an intermediate node that supports data deduplication on the transmission path of the first message. If the first message is a non-deduplicated message, and a first original message exists among the historical messages sent by the first node to a third node, where the payload part of the first original message and the payload part of the first message share first duplicate content, the first node deduplicates the payload part of the first message to obtain a second message. The second message does not include the first duplicate content, and carries a deduplication mark and first indication information of the first duplicate content. The deduplication mark indicates that the second message is a deduplicated message. The third node is the next hop of the first message on the first node. The first node sends the second message to the third node.
In the present application, the first node can deduplicate non-deduplicated messages whose sender is the SFU server, and send the resulting deduplicated messages to the lower-level node. Because a deduplicated message carries less data than the corresponding non-deduplicated message, the amount of data transmitted is reduced, which lowers network bandwidth overhead. When sending a deduplicated message to a lower-level node, the first node only needs to ensure that it has previously sent that lower-level node a historical message whose payload part carries the content removed from the deduplicated message relative to the original message. This ensures that some node on the subsequent transmission path can perform data recovery on the deduplicated message, so that the user ultimately receives the original message carrying the complete data content, safeguarding the user's service.
Optionally, if the first message is a non-deduplicated message and no first original message exists among the historical messages sent by the first node to the third node, the first node sends the first message to the third node.
Optionally, the first duplicate content includes one or more duplicate data blocks, and the first indication information includes one or more indications. The one or more indications in the first indication information correspond one-to-one to the one or more duplicate data blocks in the first duplicate content, and each indication indicates the hash value of the corresponding duplicate data block.
In one possible implementation, the first node stores a first data set, which includes the payload parts of historical messages sent by the first node to the third node. The first node matches the content of the payload part of the first message against the payload parts in the first data set. If the first data set contains a target payload part that shares a duplicate data block with the payload part of the first message, the first node determines that the first original message exists among the historical messages. If the first data set contains no payload part that shares a duplicate data block with the payload part of the first message, the first node determines that the first original message does not exist among the historical messages, and adds the payload part of the first message to the first data set.
Optionally, the first data set includes the target payload part, and the process by which the first node deduplicates the payload part of the first message includes: for each duplicate data block shared by the payload part of the first message and the target payload part, the first node calculates the hash value of the duplicate data block. The first node removes the duplicate data block from the payload part of the first message, and adds an indication corresponding to the duplicate data block to the payload part of the first message. The indication indicates the hash value of the duplicate data block and the position of the duplicate data block in the payload part of the first message.
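The deduplication step above can be sketched as follows. The text does not say how duplicate data blocks are located, so this example assumes fixed-size blocks compared at identical offsets in the two payload parts; the block size, SHA-256, the `(offset, length, digest)` tuple layout, and the function name are all hypothetical. For simplicity the indications are returned separately rather than encoded into the payload part.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical fixed block size, chosen small for the example

def dedup_payload(payload: bytes, target: bytes, block_size: int = BLOCK_SIZE):
    """Remove blocks that also appear at the same offset in the target
    payload part and describe each removed block by an indication.

    Returns (stripped_payload, indications), where each indication is
    (offset_in_original_payload, length, sha256_digest).
    """
    kept = bytearray()
    indications = []
    for offset in range(0, len(payload), block_size):
        chunk = payload[offset:offset + block_size]
        is_full = len(chunk) == block_size
        if is_full and target[offset:offset + block_size] == chunk:
            digest = hashlib.sha256(chunk).digest()
            indications.append((offset, block_size, digest))
        else:
            kept += chunk  # differing or partial block stays in the message
    return bytes(kept), indications

# Usage: the first and third blocks repeat, the middle block differs.
stripped, indications = dedup_payload(b"AAAAhdr1BBBB", b"AAAAhdr0BBBB")
```

Here only `b"hdr1"` remains in the stripped payload, and two indications (for offsets 0 and 8) tell a downstream node which blocks to restore and where.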
In another possible implementation, the payload part includes a protocol part and a data part. The one or more duplicate data blocks in the first duplicate content are located in the data part of the first message. The first node stores a sampling tag set, which includes the hash values of historical data blocks. A historical data block is a data block obtained by sampling a preset position of the data part of a historical message sent by the first node to the third node. The first node samples the preset position of the data part of the first message to obtain a sampled data block, and calculates the hash value of the sampled data block. If the sampling tag set includes the hash value of the sampled data block, the first node determines that the first original message exists among the historical messages. If the sampling tag set does not include the hash value of the sampled data block, the first node determines that the first original message does not exist among the historical messages, and adds the hash value of the sampled data block to the sampling tag set.
Optionally, the sampling tag set includes the hash value of the sampled data block, and the way in which the first node deduplicates the payload part of the first message includes: the first node treats the sampled data block whose hash value is in the sampling tag set as a duplicate data block, removes the duplicate data block from the data part of the first message, and adds an indication corresponding to the duplicate data block to the payload part of the first message. The indication indicates the hash value of the duplicate data block.
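A minimal sketch of the sampling-based variant is shown below. The preset sampling position, the block length, the use of SHA-256, and the function name are assumptions made for the example; the text only says that a preset position of the data part is sampled and hashed.

```python
import hashlib

# Hypothetical preset sampling position within the data part.
SAMPLE_OFFSET = 0
SAMPLE_LENGTH = 8

def process_data_part(data: bytes, tag_set: set):
    """Sample the preset position, hash it, and consult the sampling tag set.

    Returns (data_to_send, indication). The indication is the block's hash
    when the block was removed, or None when the data part is sent unchanged.
    """
    end = SAMPLE_OFFSET + SAMPLE_LENGTH
    block = data[SAMPLE_OFFSET:end]
    digest = hashlib.sha256(block).digest()
    if digest in tag_set:
        # Same block was sent downstream before: strip it, keep only the hash.
        return data[:SAMPLE_OFFSET] + data[end:], digest
    # First occurrence: remember the hash and send the data part unchanged.
    tag_set.add(digest)
    return data, None

tags = set()
first_out, first_ind = process_data_part(b"FRAME001tail-a", tags)
second_out, second_ind = process_data_part(b"FRAME001tail-b", tags)
```

Compared with full payload matching, the sender only stores hashes rather than whole payload parts, and the check is a single set lookup per message, at the cost of detecting duplicates only at the sampled position.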
Optionally, the first node also stores the protocol parts of historical messages sent by the first node to the third node. The first duplicate content also includes protocol information located in the protocol part of the first message, and the first indication information also includes a difference indication, which indicates the difference between the protocol part of the first message and the protocol part of the first original message.
Optionally, if the first message is a deduplicated message that carries second indication information of second duplicate content, and a second original message, whose payload part includes the second duplicate content, exists among the historical messages sent by the first node to the third node, the first node sends the first message to the third node.
Optionally, if the first message is a deduplicated message that carries second indication information of second duplicate content, and no second original message, whose payload part includes the second duplicate content, exists among the historical messages sent by the first node to the third node, the first node obtains the second duplicate content from a second data set according to the second indication information. The second data set includes at least part of the content of the payload parts of historical messages received by the first node from the second node. The first node performs deduplication recovery processing on the payload part of the first message according to the second duplicate content to obtain a third message, whose payload part includes the second duplicate content. The first node sends the third message to the third node.
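Taken together, the intermediate node's behavior in this aspect depends on two questions: is the incoming message already deduplicated, and has the next hop already been sent a historical message containing the duplicate content? The sketch below only encodes that decision table; the function name and the action labels are hypothetical.

```python
def intermediate_action(is_deduplicated: bool, next_hop_has_content: bool) -> str:
    """Choose what an intermediate node does with a message from the SFU server.

    is_deduplicated: the incoming message carries a deduplication mark.
    next_hop_has_content: a historical message containing the duplicate
        content was previously sent to the next hop.
    """
    if is_deduplicated:
        if next_hop_has_content:
            # The next hop side can still recover it; pass it through.
            return "forward_unchanged"
        # Nobody downstream could recover it; rebuild the payload first.
        return "restore_then_forward"
    if next_hop_has_content:
        return "deduplicate_then_forward"
    # Nothing to deduplicate against yet (the node may also record the
    # payload so that later messages can be deduplicated).
    return "forward_unchanged"

actions = {
    (dedup, known): intermediate_action(dedup, known)
    for dedup in (False, True)
    for known in (False, True)
}
```

The key design point is that a deduplicated message is only ever sent toward a next hop that has already been sent the removed content, so recovery remains possible further along the path.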
Optionally, the second duplicate content includes one or more duplicate data blocks, and the second indication information includes one or more indications. The one or more indications in the second indication information correspond one-to-one to the one or more duplicate data blocks in the second duplicate content, and each indication indicates the hash value of the corresponding duplicate data block.
In one possible implementation, each indication also indicates the position of the corresponding duplicate data block in the payload part of the original message corresponding to the first message, and the second data set includes the payload parts of historical messages received by the first node from the second node. The way in which the first node obtains the second duplicate content from the second data set according to the second indication information includes: for each indication in the second indication information, the first node obtains, according to the position indicated by the indication, a to-be-matched data block at that position of a payload part in the second data set. The first node calculates the hash value of the to-be-matched data block. The first node determines the to-be-matched data block whose hash value matches the hash value indicated by the indication to be the duplicate data block corresponding to the indication.
In another possible implementation, the payload part includes a protocol part and a data part, and the one or more duplicate data blocks in the second duplicate content are located in the data part. The second data set includes a correspondence between hash values of historical data blocks and the historical data blocks, where a historical data block is a data block obtained by sampling a preset position of the data part of a historical message received by the first node from the second node. The way in which the first node obtains the second duplicate content from the second data set according to the second indication information includes: the first node determines the historical data block in the second data set that corresponds to the hash value indicated by an indication in the second indication information to be the duplicate data block corresponding to that indication.
Optionally, each indication in the second indication information also indicates the position of the corresponding duplicate data block in the data part of the original message corresponding to the first message. The way in which the first node performs deduplication recovery processing on the payload part of the first message according to the second duplicate content includes: for each indication in the second indication information, the first node adds the duplicate data block corresponding to the indication at the position, in the data part of the first message, indicated by the indication.
Optionally, the second duplicate content further includes protocol information located in the protocol part, and the second indication information further includes a difference indication. The difference indication indicates the difference between the protocol part of the original message corresponding to the first message and the protocol part of a target message, where the target message is a historical message, among those the first node received from the second node, whose data part shares one or more duplicate data blocks with the data part of the original message. The second data set further includes the protocol part of the message to which each historical data block belongs. The process by which the first node obtains the second duplicate content from the second data set according to the second indication information further includes: the first node obtains, from the second data set, the protocol part of the target message to which the one or more duplicate data blocks belong. Accordingly, the process by which the first node performs deduplication recovery on the payload of the first message according to the second duplicate content includes: the first node modifies the protocol part of the target message according to the difference indication, and uses the modified protocol part of the target message as the protocol part of the third message.
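As a rough illustration of applying a difference indication, the sketch below patches a stored protocol part with a list of byte-level edits. The encoding of the difference indication is not given in the text, so the `(offset, replacement bytes)` form and the function name `apply_difference` are assumptions made only for this example.

```python
from typing import Iterable, Tuple

def apply_difference(target_protocol: bytes,
                     difference: Iterable[Tuple[int, bytes]]) -> bytes:
    """Rebuild the original protocol part from the target message's protocol part.

    difference: hypothetical encoding, a sequence of (offset, replacement bytes)
    that overwrite the stored protocol part in place.
    """
    patched = bytearray(target_protocol)
    for offset, new_bytes in difference:
        patched[offset:offset + len(new_bytes)] = new_bytes
    return bytes(patched)
```

For example, if two messages differ only in a two-byte sequence number at offset 2, the difference indication would carry those two bytes instead of the whole protocol part.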
Optionally, the first node stores one or more local flow grouping sets, each of which includes the flow identifiers of multiple flows passing through the first node. After determining that the one or more local flow grouping sets contain a target flow grouping set that includes the flow identifier of the flow to which the first message belongs, the first node parses the payload of the first message. If the payload of the first message carries a deduplication mark, the first node determines that the first message is a deduplicated message. If the payload of the first message does not carry a deduplication mark, the first node determines that the first message is a non-deduplicated message.
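The classification step above might look like the following sketch. The two-byte `DEDUP_MARK` value and its placement at the start of the payload are assumptions for illustration; the text only says that the payload carries a deduplication mark. The `"forward"` result for flows outside every grouping set reflects the behaviour described elsewhere in this application for such messages.

```python
from typing import Iterable, Set

DEDUP_MARK = b"\xde\xd0"  # hypothetical marker value and position

def classify(flow_id: str, payload: bytes,
             local_flow_groups: Iterable[Set[str]]) -> str:
    """Decide how a received message should be handled."""
    if not any(flow_id in group for group in local_flow_groups):
        return "forward"            # payload is not parsed at all
    if payload.startswith(DEDUP_MARK):
        return "deduplicated"       # needs deduplication recovery
    return "not_deduplicated"       # payload can be cached for later recovery
```

Checking the flow grouping sets first means the node only parses payloads of flows that are known to carry duplicate content.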
Optionally, among multiple received messages that belong to different flows, the first node adds the flow identifiers of the flows whose messages have duplicate content in their payloads to the same local flow grouping set, where the sender of each of these flows is the SFU server.
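One way to maintain such grouping sets is sketched below. The function name `group_flows` and the decision to merge two existing sets when a duplicate is observed across them are assumptions; the text only states that flows with duplicate payload content are placed in the same set.

```python
from typing import List, Set

def group_flows(flow_a: str, flow_b: str, groups: List[Set[str]]) -> List[Set[str]]:
    """Record that messages of flow_a and flow_b carried duplicate payload content."""
    merged = {flow_a, flow_b}
    remaining = []
    for group in groups:
        if flow_a in group or flow_b in group:
            merged |= group          # assumption: overlapping sets are merged
        else:
            remaining.append(group)
    groups[:] = remaining + [merged]
    return groups
```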
Optionally, the first node sends first grouping information to the second node, where the first grouping information includes a correspondence between the node identifier of the first node and the one or more local flow grouping sets.
Optionally, the first node stores one or more lower-level flow grouping sets corresponding to the third node, each of which includes the flow identifiers of multiple flows passing through the third node. After the first node receives the first message sent by the second node, if the lower-level flow grouping sets corresponding to the third node contain a target flow grouping set that includes the flow identifier of the flow to which the first message belongs, the first node determines whether the target historical messages sent to the third node include a message whose payload has duplicate content with the payload of the first message, where the flow identifier of the flow to which each target historical message belongs is in the target flow grouping set. If none of the lower-level flow grouping sets corresponding to the third node includes the flow identifier of the flow to which the first message belongs, the first node sends the first message to the third node.
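The selection of target historical messages can be sketched as a filter over the messages already sent to the next hop. The name `candidate_history`, the `(flow_id, payload)` record layout, and the use of `None` to mean "forward unchanged" are illustrative assumptions.

```python
from typing import Iterable, List, Optional, Set, Tuple

def candidate_history(flow_id: str,
                      lower_groups: Iterable[Set[str]],
                      sent_history: Iterable[Tuple[str, bytes]]) -> Optional[List[bytes]]:
    """Return the payloads worth comparing against, or None to forward as-is.

    lower_groups: lower-level flow grouping sets reported by the next hop.
    sent_history: (flow_id, payload) pairs already sent to that next hop.
    """
    target = next((g for g in lower_groups if flow_id in g), None)
    if target is None:
        return None  # no set contains this flow: send the message unchanged
    return [payload for fid, payload in sent_history if fid in target]
```

Restricting the comparison to flows in the same grouping set narrows the search to history that is likely to contain duplicate content.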
Optionally, the first node receives second grouping information sent by the third node, where the second grouping information includes a correspondence between the node identifier of the third node and the one or more lower-level flow grouping sets.
Optionally, after receiving a fourth message whose destination port number is the SFU service port number, the first node sends a first node-discovery message to a fourth node. The fourth node is the next hop of the fourth message at the first node, and the destination of the fourth message is the SFU server. The first node-discovery message carries the identifier of the SFU server and indicates that the first node is a lower-level node of the fourth node on the transmission path starting from the SFU server. In response to receiving, from the fourth node, a first node-discovery response message corresponding to the first node-discovery message, the first node determines that the fourth node supports data deduplication.
Optionally, the first node receives a fourth node-discovery message sent by a seventh node. The fourth node-discovery message carries the identifier of the SFU server and indicates that the seventh node is a lower-level node of the first node on the transmission path starting from the SFU server. The first node determines, according to the fourth node-discovery message, that the seventh node supports data deduplication, and sends to the seventh node a fourth node-discovery response message corresponding to the fourth node-discovery message, where the fourth node-discovery response message indicates that the first node supports data deduplication.
In the present application, in the scheme where the node discovery process is triggered by a message whose destination is the SFU server, if a node receives a node-discovery response message from an upper-level node and also receives a node-discovery message from a lower-level node, the node can determine that it is an intermediate node on the transmission path starting from the SFU server.
Optionally, after receiving a fifth message whose source port number is the SFU service port number, the first node sends a third node-discovery message to a sixth node. The sixth node is the next hop of the fifth message at the first node, and the sender of the fifth message is the SFU server. The third node-discovery message carries the identifier of the SFU server and indicates that the first node is an upper-level node of the sixth node on the transmission path starting from the SFU server. In response to receiving, from the sixth node, a third node-discovery response message corresponding to the third node-discovery message, the first node determines that the sixth node supports data deduplication.
Optionally, the first node receives a second node-discovery message sent by a fifth node. The second node-discovery message carries the identifier of the SFU server and indicates that the fifth node is an upper-level node of the first node on the transmission path starting from the SFU server. The first node determines, according to the second node-discovery message, that the fifth node supports data deduplication, and sends to the fifth node a second node-discovery response message corresponding to the second node-discovery message, where the second node-discovery response message indicates that the first node supports data deduplication.
In the present application, in the scheme where the node discovery process is triggered by a message whose sender is the SFU server, if a node receives a node-discovery message from an upper-level node and also receives a node-discovery response message from a lower-level node, the node can determine that it is an intermediate node on the transmission path starting from the SFU server.
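In both discovery schemes, a node ends up knowing whether its upper-level and lower-level neighbours support data deduplication, and its role on the path follows from those two facts. The sketch below captures that rule; the function name, the string labels, and the `"none"` case for a node with no deduplication-capable neighbour are assumptions added for completeness.

```python
def path_role(upper_supports_dedup: bool, lower_supports_dedup: bool) -> str:
    """Derive a node's role on the path that starts at the SFU server."""
    if upper_supports_dedup and lower_supports_dedup:
        return "intermediate"   # discovery succeeded in both directions
    if lower_supports_dedup:
        return "first"          # no deduplication-capable upper-level node
    if upper_supports_dedup:
        return "last"           # no deduplication-capable lower-level node
    return "none"               # assumption: nothing to deduplicate with
```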
In a fourth aspect, a node is provided. The node includes multiple functional modules that interact with one another to implement the method in the first aspect and its implementations. The functional modules may be implemented in software, hardware, or a combination of software and hardware, and may be combined or divided in any manner depending on the specific implementation.
Specifically, the node is a first node, and the first node includes: an acquisition module, configured to acquire a first message whose sender is an SFU server, where the first node is the first node supporting data deduplication on the transmission path of the first message; a processing module, configured to, if the historical messages sent by the first node to a second node include a target message whose payload has duplicate content with the payload of the first message, deduplicate the payload of the first message to obtain a second message, where the second message does not include the duplicate content and carries a deduplication mark and indication information for the duplicate content, the deduplication mark indicates that the second message is a deduplicated message, and the second node is the next hop of the first message at the first node; and a sending module, configured to send the second message to the second node.
Optionally, the duplicate content includes one or more duplicate data blocks, and the indication information includes one or more indications in one-to-one correspondence with the one or more duplicate data blocks, where each indication indicates the hash value of the corresponding duplicate data block.
Optionally, the first node stores a data set that includes the payloads of historical messages sent by the first node to the second node, and the processing module is configured to: perform content matching between the payload of the first message and the payloads in the data set; if the data set contains a target payload that shares a duplicate data block with the payload of the first message, determine that the target message exists among the historical messages; for each duplicate data block shared by the payload of the first message and the target payload, calculate the hash value of the duplicate data block; and remove the duplicate data block from the payload of the first message and add to the payload of the first message an indication corresponding to the duplicate data block, where the indication indicates the hash value of the duplicate data block and the position of the duplicate data block in the payload of the first message.
Optionally, the processing module is configured to: if the data set contains no payload that shares a duplicate data block with the payload of the first message, determine that the target message does not exist among the historical messages; and add the payload of the first message to the data set.
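A sender-side sketch of this content-matching deduplication is shown below. The text does not specify how content matching is performed, so this example assumes fixed-size blocks compared at the same offset in each stored payload; `BLOCK`, `deduplicate`, the returned tuple layout, and the truncated SHA-256 hash are all hypothetical.

```python
import hashlib
from typing import List, Tuple

BLOCK = 4  # hypothetical block size in bytes

def block_hash(block: bytes) -> bytes:
    # Illustrative hash choice; not specified by the source.
    return hashlib.sha256(block).digest()[:8]

def deduplicate(payload: bytes, history: List[bytes]) -> Tuple[bytes, List[Tuple[int, bytes]]]:
    """Strip blocks already sent at the same offset; return (stripped, indications)."""
    kept = bytearray()
    indications = []  # (position in the original payload, hash of removed block)
    for pos in range(0, len(payload), BLOCK):
        block = payload[pos:pos + BLOCK]
        if len(block) == BLOCK and any(h[pos:pos + BLOCK] == block for h in history):
            indications.append((pos, block_hash(block)))
        else:
            kept += block
    if not indications:
        history.append(payload)  # no target message: remember this payload
    return bytes(kept), indications
```

A real sender would also attach the deduplication mark and serialize the indications into the outgoing message; that framing is omitted here because the text does not define it.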
Optionally, the payload includes a protocol part and a data part, and the one or more duplicate data blocks are located in the data part of the first message.
Optionally, the first node stores a sampling tag set that includes hash values of historical data blocks, where a historical data block is a data block obtained by sampling a preset position of the data part of a historical message sent by the first node to the second node; and the processing module is configured to: sample the preset position of the data part of the first message to obtain a sampled data block; calculate the hash value of the sampled data block; if the sampling tag set includes the hash value of the sampled data block, determine that the target message exists among the historical messages; and treat the sampled data block whose hash value belongs to the sampling tag set as a duplicate data block, remove the duplicate data block from the data part of the first message, and add to the payload of the first message an indication corresponding to the duplicate data block, where the indication indicates the hash value of the duplicate data block.
Optionally, there are multiple preset positions, the first node obtains multiple sampled data blocks by sampling the preset positions of the data part of the first message, and the indication further indicates the position of the duplicate data block in the data part of the first message.
Optionally, the processing module is configured to: if the sampling tag set does not include the hash value of the sampled data block, determine that the target message does not exist among the historical messages; and add the hash value of the sampled data block to the sampling tag set.
Optionally, the sampling tag set further includes the historical data blocks indicated by the hash values, and the processing module is configured to: if the sampling tag set includes the hash value of the sampled data block, perform content matching between the sampled data block and the historical data block indicated by that hash value; and when the content of the sampled data block is the same as the content of the historical data block indicated by that hash value, determine that the target message exists among the historical messages.
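The sampling-based variant, including the content check that guards against hash collisions, can be sketched as follows. The preset offsets, block size, hash function, and the per-block handling when there are several preset positions are assumptions for illustration; `tag_store` plays the role of the sampling tag set extended with the historical data blocks.

```python
import hashlib
from typing import Dict, List, Tuple

SAMPLE_OFFSETS = (0, 8)  # hypothetical preset positions (non-overlapping)
BLOCK = 4                # hypothetical sampled block size in bytes

def block_hash(block: bytes) -> bytes:
    # Illustrative hash choice; not specified by the source.
    return hashlib.sha256(block).digest()[:8]

def sample_dedup(data: bytes, tag_store: Dict[bytes, bytes]) -> Tuple[bytes, List[Tuple[int, bytes]]]:
    """Remove sampled blocks already seen; tag_store maps hash -> historical block."""
    duplicates = []
    for off in SAMPLE_OFFSETS:
        block = data[off:off + BLOCK]
        if len(block) < BLOCK:
            continue
        digest = block_hash(block)
        known = tag_store.get(digest)
        if known is None:
            tag_store[digest] = block          # new sample: record it
        elif known == block:                   # content check rules out collisions
            duplicates.append((off, digest))
    out = bytearray(data)
    for off, _ in reversed(duplicates):        # delete back to front
        del out[off:off + BLOCK]
    return bytes(out), duplicates
```

Sampling only preset positions avoids comparing whole payloads, at the cost of missing duplicate content that falls outside those positions.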
Optionally, the first node further stores the protocol parts of the historical messages sent by the first node to the second node; the duplicate content further includes protocol information located in the protocol part of the first message, and the indication information further includes a difference indication that indicates the difference between the protocol part of the first message and the protocol part of the target message.
Optionally, the first node stores one or more flow grouping sets corresponding to the second node, each of which includes the flow identifiers of multiple flows passing through the second node; the processing module is further configured to, after the first node acquires the first message, if the flow grouping sets corresponding to the second node contain a target flow grouping set that includes the flow identifier of the flow to which the first message belongs, determine whether the target message exists among the target historical messages sent to the second node, where the flow identifier of the flow to which each target historical message belongs is in the target flow grouping set; and the sending module is further configured to send the first message to the second node if none of the flow grouping sets corresponding to the second node includes the flow identifier of the flow to which the first message belongs.
Optionally, the first node further includes a receiving module, configured to receive grouping information sent by the second node, where the grouping information includes a correspondence between the node identifier of the second node and the one or more flow grouping sets.
Optionally, the first node is not the SFU server; the sending module is further configured to, after the first node receives a third message whose destination port number is the SFU service port number, send a first node-discovery message to a third node, where the third node is the next hop of the third message at the first node, the destination of the third message is the SFU server, and the first node-discovery message carries the identifier of the SFU server and indicates that the first node is a lower-level node of the third node on the transmission path starting from the SFU server; and the processing module is further configured to, in response to not receiving from the third node a first node-discovery response message corresponding to the first node-discovery message, determine that the first node is the first node supporting data deduplication on the transmission path starting from the SFU server.
Optionally, the processing module is further configured to generate the first node-discovery message according to the third message, where the message header of the first node-discovery message is the same as the message header of the third message, and the payload of the first node-discovery message carries an indication of the message type of the first node-discovery message.
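Generating a node-discovery message from the triggering message can be sketched as copying the header and replacing the payload. The `Message` type, the one-byte `DISCOVERY_TYPE` value, the role byte, and the payload layout are all hypothetical; the text only requires that the header match the triggering message and that the payload indicate the message type, the SFU server identifier, and the sender's position relative to the receiver.

```python
from dataclasses import dataclass

DISCOVERY_TYPE = b"\x01"  # hypothetical message-type indication

@dataclass(frozen=True)
class Message:
    header: bytes
    payload: bytes

def make_discovery(trigger: Message, sfu_id: bytes, sender_is_lower: bool) -> Message:
    """Build a node-discovery message that follows the same path as `trigger`."""
    role = b"L" if sender_is_lower else b"U"  # sender is lower-/upper-level node
    # Reusing the header unchanged keeps the discovery message on the same route.
    return Message(header=trigger.header, payload=DISCOVERY_TYPE + role + sfu_id)
```

Because the header is identical to that of the triggering message, ordinary forwarding carries the discovery message along the same path, so only deduplication-capable nodes on that path need to inspect the payload.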
Optionally, the first node further includes a receiving module, configured to receive a second node-discovery message sent by a fourth node, where the second node-discovery message carries the identifier of the SFU server and indicates that the fourth node is a lower-level node of the first node on the transmission path starting from the SFU server; the processing module is further configured to determine, according to the second node-discovery message, that the fourth node supports data deduplication; and the sending module is further configured to send to the fourth node a second node-discovery response message corresponding to the second node-discovery message, where the second node-discovery response message indicates that the first node supports data deduplication.
Optionally, the sending module is further configured to, after the first node receives a fourth message whose source port number is the SFU service port number, send a third node-discovery message to a fifth node, where the fifth node is the next hop of the fourth message at the first node, the sender of the fourth message is the SFU server, and the third node-discovery message carries the identifier of the SFU server and indicates that the first node is an upper-level node of the fifth node on the transmission path starting from the SFU server; and the processing module is further configured to, in response to receiving from the fifth node a third node-discovery response message corresponding to the third node-discovery message, determine that the fifth node supports data deduplication.
Optionally, the sending module is further configured to send the first message to the second node if the target message does not exist among the historical messages sent by the first node to the second node.
In a fifth aspect, a node is provided. The node includes multiple functional modules that interact with one another to implement the method in the second aspect and its implementations. The functional modules may be implemented in software, hardware, or a combination of software and hardware, and may be combined or divided in any manner depending on the specific implementation.
Specifically, the node is a first node, and the first node includes: a receiving module, configured to receive a first message sent by a second node, where the sender of the first message is an SFU server, the first message carries a deduplication mark and indication information for duplicate content, the deduplication mark indicates that the first message is a deduplicated message, and the first node is the last node supporting data deduplication on the transmission path of the first message; a processing module, configured to: determine, based on the deduplication mark, that the first message is a deduplicated message; obtain the duplicate content from a data set according to the indication information, where the data set includes at least part of the payload content of historical messages that the first node received from the second node; and perform deduplication recovery on the payload of the first message according to the duplicate content to obtain a second message whose payload includes the duplicate content; and a sending module, configured to send the second message to a third node, where the third node is the next hop of the first message at the first node.
Optionally, the duplicate content includes one or more duplicate data blocks, and the indication information includes one or more indications in one-to-one correspondence with the one or more duplicate data blocks, where each indication indicates the hash value of the corresponding duplicate data block.
Optionally, each indication further indicates the position of the corresponding duplicate data block in the payload of the original message corresponding to the first message, and the data set includes the payloads of historical messages that the first node received from the second node; the processing module is configured to: for each indication in the indication information, obtain, from a payload in the data set, the data block to be matched at the position indicated by the indication; calculate the hash value of the data block to be matched; and determine a data block to be matched whose hash value is consistent with the hash value indicated by the indication to be the duplicate data block corresponding to that indication.
Optionally, the payload includes a protocol part and a data part, and the one or more duplicate data blocks are located in the data part.
Optionally, the data set includes a correspondence between hash values of historical data blocks and the historical data blocks, where a historical data block is a data block obtained by sampling a preset position of the data part of a historical message that the first node received from the second node; and the processing module is configured to determine the historical data block in the data set that corresponds to the hash value indicated by an indication in the indication information to be the duplicate data block corresponding to that indication.
Optionally, the indication further indicates the position of the corresponding duplicate data block in the data part of the original message corresponding to the first message; and the processing module is configured to, for each indication in the indication information, add the duplicate data block corresponding to the indication at the position, in the data part of the first message, indicated by that indication.
Optionally, the duplicate content further includes protocol information located in the protocol part, and the indication information further includes a difference indication that indicates the difference between the protocol part of the original message corresponding to the first message and the protocol part of a target message, where the target message is a historical message, among those the first node received from the second node, whose data part shares the one or more duplicate data blocks with the data part of the original message; the data set further includes the protocol part of the message to which each historical data block belongs; and the processing module is further configured to: obtain, from the data set, the protocol part of the target message to which the one or more duplicate data blocks belong; and modify the protocol part of the target message according to the difference indication, and use the modified protocol part of the target message as the protocol part of the second message.
Optionally, the deduplication mark is located in the payload of the first message, and the first node stores one or more flow grouping sets, each of which includes the flow identifiers of multiple flows passing through the first node; the processing module is further configured to, before the first node determines based on the deduplication mark that the first message is a deduplicated message: determine that the one or more flow grouping sets contain a target flow grouping set that includes the flow identifier of the flow to which the first message belongs; and parse the payload of the first message to obtain the deduplication mark.
Optionally, the processing module is configured to obtain, according to the indication information, the duplicate content from the payload content of target historical messages in the data set, where the flow identifier of the flow to which each target historical message belongs is in the target flow grouping set.
Optionally, the receiving module is further configured to receive a third message, where the flow identifier of the flow to which the third message belongs is not in any of the flow grouping sets; and the sending module is further configured to forward the third message.
Optionally, the receiving module is further configured to receive a fourth message, where the flow identifier of the flow to which the fourth message belongs is in one of the flow grouping sets; and the processing module is further configured to: parse the payload of the fourth message and determine that the payload of the fourth message does not carry the deduplication mark; and add at least part of the payload content of the fourth message to the data set and forward the fourth message.
Optionally, the processing module is further configured to, among multiple received messages that belong to different flows, add the flow identifiers of the flows whose messages have duplicate content in their payloads to the same flow grouping set, where the sender of each of these flows is the SFU server.
Optionally, the sending module is further configured to send grouping information to the second node, where the grouping information includes a correspondence between the node identifier of the first node and the one or more flow grouping sets.
Optionally, the sending module is further configured to, after the first node receives a fifth message whose destination port number is the SFU service port number, send a first node-discovery message to a fourth node, where the fourth node is the next hop of the fifth message at the first node, the destination of the fifth message is the SFU server, and the first node-discovery message carries the identifier of the SFU server and indicates that the first node is a lower-level node of the fourth node on the transmission path starting from the SFU server; and the processing module is further configured to, in response to receiving from the fourth node a first node-discovery response message corresponding to the first node-discovery message, determine that the fourth node supports data deduplication.
可选地,所述处理模块,还用于:根据所述第五报文生成所述第一节点发现报文,所述第一节点发现报文的报文头与所述第五报文的报文头相同,所述第一节点发现报文的载荷部分携带有对所述第一节点发现报文的报文类型的指示。Optionally, the processing module is also used to: generate the first node discovery message based on the fifth message, the message header of the first node discovery message is the same as the message header of the fifth message, and the payload part of the first node discovery message carries an indication of the message type of the first node discovery message.
可选地,所述接收模块,还用于接收第五节点发送的第二节点发现报文,所述第二节点发现报文携带有所述SFU服务器的标识,且所述第二节点发现报文指示所述第五节点为所述第一节点在以所述SFU服务器为起点的传输路径上的上级节点;所述处理模块,还用于根据所述第二节点发现报文确定所述第五节点支持数据去重;所述发送模块,还用于向所述第五节点发送所述第二节点发现报文对应的第二节点发现响应报文,所述第二节点发现响应报文指示所述第一节点支持数据去重。Optionally, the receiving module is also used to receive a second node discovery message sent by the fifth node, the second node discovery message carries the identifier of the SFU server, and the second node discovery message indicates that the fifth node is the superior node of the first node on the transmission path starting from the SFU server; the processing module is also used to determine whether the fifth node supports data deduplication based on the second node discovery message; the sending module is also used to send a second node discovery response message corresponding to the second node discovery message to the fifth node, and the second node discovery response message indicates that the first node supports data deduplication.
可选地,所述发送模块,还用于接收到源端口号为SFU服务端口号的第六报文之后,向第六节点发送第三节点发现报文,所述第六节点为所述第六报文在所述第一节点上的下一跳,所述第六报文的发送方为所述SFU服务器,所述第三节点发现报文携带有所述SFU服务器的标识,且所述第三节点发现报文指示所述第一节点为所述第六节点在以所述SFU服务器为起点的传输路径上的上级节点;所述处理模块,还用于响应于未接收到所述第六节点发送的所述第三节点发现报文对应的第三节点发现响应报文,确定所述第一节点为以所述SFU服务器为起点的传输路径上支持数据去重的最后一个节点。Optionally, the sending module is also used to send a third node discovery message to the sixth node after receiving the sixth message whose source port number is the SFU service port number, the sixth node is the next hop of the sixth message on the first node, the sender of the sixth message is the SFU server, the third node discovery message carries the identifier of the SFU server, and the third node discovery message indicates that the first node is the superior node of the sixth node on the transmission path starting from the SFU server; the processing module is also used to determine that the first node is the last node supporting data deduplication on the transmission path starting from the SFU server in response to not receiving the third node discovery response message corresponding to the third node discovery message sent by the sixth node.
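The discovery exchange described above can be sketched as follows. This is a minimal illustration, not the format defined by the application: the `Message` type, the header tuple layout, the payload keys, and the string labels are all hypothetical names chosen for the example. It shows a discovery message that reuses the header of the triggering message and carries its type in the payload, and how a missing response could be read as "this node is the last deduplication-capable node".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    header: tuple   # hypothetical layout: (src_ip, src_port, dst_ip, dst_port)
    payload: dict   # hypothetical structured payload


def make_discovery(trigger: Message, sfu_id: str, sender_role: str) -> Message:
    """Build a discovery message that reuses the triggering message's header."""
    return Message(
        header=trigger.header,
        payload={"type": "NODE_DISCOVERY", "sfu": sfu_id, "sender_role": sender_role},
    )


def next_hop_status(response: Optional[Message]) -> str:
    """Interpret the outcome of a discovery sent toward the receivers."""
    if response is not None and response.payload.get("type") == "NODE_DISCOVERY_RESPONSE":
        return "next hop supports deduplication"
    # No response (for example after a timeout, an assumption of this sketch):
    # this node is the last deduplication-capable node on the path.
    return "last deduplication-capable node"
```

Reusing the header means the discovery message follows the same forwarding path as the traffic that triggered it, so it reaches the same next hop without any extra routing state.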
In a sixth aspect, a node is provided. The node includes multiple functional modules that interact with each other to implement the method in the third aspect and its embodiments. The functional modules may be implemented in software, hardware, or a combination of software and hardware, and may be combined or divided in any manner depending on the specific implementation.
Specifically, the node is a first node, and the first node includes a receiving module, a processing module, and a sending module. The receiving module is configured to receive a first message sent by a second node, where the sender of the first message is an SFU server and the first node is an intermediate node that supports data deduplication on the transmission path of the first message. The processing module is configured to, if the first message is a non-deduplicated message and the historical messages sent by the first node to a third node include a first original message whose payload part has first repeated content in common with the payload part of the first message, perform deduplication processing on the payload part of the first message to obtain a second message. The second message does not include the first repeated content, and carries a deduplication mark and first indication information for the first repeated content. The deduplication mark indicates that the second message is a deduplicated message, and the third node is the next hop of the first message on the first node. The sending module is configured to send the second message to the third node.
Optionally, the sending module is further configured to send the first message to the third node if the first message is a non-deduplicated message and the first original message does not exist among the historical messages sent by the first node to the third node.
Optionally, the first repeated content includes one or more repeated data blocks, and the first indication information includes one or more indications in one-to-one correspondence with the one or more repeated data blocks, where each indication indicates the hash value of the corresponding repeated data block.
Optionally, the first node stores a first data set that includes the payload parts of historical messages sent by the first node to the third node. The processing module is further configured to: perform content matching between the payload part of the first message and the payload parts in the first data set; if the first data set contains a target payload part that shares a repeated data block with the payload part of the first message, determine that the first original message exists among the historical messages; and if the first data set contains no payload part that shares a repeated data block with the payload part of the first message, determine that the first original message does not exist among the historical messages, in which case the first node adds the payload part of the first message to the first data set.
Optionally, the first data set includes the target payload part, and the processing module is configured to: for each repeated data block shared by the payload part of the first message and the target payload part, calculate the hash value of the repeated data block; remove the repeated data block from the payload part of the first message; and add, to the payload part of the first message, an indication corresponding to the repeated data block, where the indication indicates the hash value of the repeated data block and the position of the repeated data block in the payload part of the first message.
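As a rough illustration of the content-matching variant, the sketch below removes every block of a payload that also occurs at the same offset in a stored target payload and records a (hash, position) indication for it. Several things here are assumptions made for the example and are not taken from the application: the fixed block size `BLOCK`, the same-offset comparison, the use of SHA-256, and the function names.

```python
import hashlib

BLOCK = 4  # hypothetical block size in bytes, kept tiny for illustration


def block_hash(block: bytes) -> str:
    # SHA-256 is an assumption; the text only requires "a hash value".
    return hashlib.sha256(block).hexdigest()


def dedup_payload(payload: bytes, target: bytes):
    """Strip blocks of `payload` that match `target` at the same offset.

    Returns (remaining_bytes, indications), where each indication is a
    (hash, position) pair describing one removed block.
    """
    remaining = bytearray()
    indications = []
    for pos in range(0, len(payload), BLOCK):
        block = payload[pos:pos + BLOCK]
        if len(block) == BLOCK and target[pos:pos + BLOCK] == block:
            # Repeated block: drop it and keep only its hash and position.
            indications.append((block_hash(block), pos))
        else:
            remaining += block
    return bytes(remaining), indications
```

For example, deduplicating `b"AAAABBBBCCCC"` against a stored payload `b"AAAAXXXXCCCC"` leaves `b"BBBB"` plus two indications, one for offset 0 and one for offset 8.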
Optionally, the payload part includes a protocol part and a data part, and the one or more repeated data blocks are located in the data part of the first message. The first node stores a sampling tag set that includes the hash values of historical data blocks, where a historical data block is a data block obtained by sampling a preset position of the data part of a historical message sent by the first node to the third node. The processing module is further configured to: sample the preset position of the data part of the first message to obtain a sampled data block; calculate the hash value of the sampled data block; if the sampling tag set includes the hash value of the sampled data block, determine that the first original message exists among the historical messages; and if the sampling tag set does not include the hash value of the sampled data block, determine that the first original message does not exist among the historical messages, in which case the first node adds the hash value of the sampled data block to the sampling tag set.
Optionally, the sampling tag set includes the hash value of the sampled data block, and the processing module is configured to: treat the sampled data block whose hash value is in the sampling tag set as a repeated data block, remove the repeated data block from the data part of the first message, and add, to the payload part of the first message, an indication corresponding to the repeated data block, where the indication indicates the hash value of the repeated data block.
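The sampling variant can be sketched in a few lines. The sample offset and length, the class name, and the dictionary returned by `process` are hypothetical choices for this example. The point is only that the sender keeps hash values rather than whole payloads, and a hit in the tag set is enough to treat the sampled block as repeated.

```python
import hashlib

SAMPLE_OFFSET = 0  # hypothetical preset position within the data part
SAMPLE_LEN = 8     # hypothetical length of the sampled block


class SamplingDeduper:
    """Keeps only hash values (tags) of blocks sampled from sent data parts."""

    def __init__(self):
        self.tags = set()

    def process(self, data: bytes) -> dict:
        sample = data[SAMPLE_OFFSET:SAMPLE_OFFSET + SAMPLE_LEN]
        tag = hashlib.sha256(sample).hexdigest()
        if tag in self.tags:
            # Seen before: remove the sampled block and send its hash instead.
            rest = data[:SAMPLE_OFFSET] + data[SAMPLE_OFFSET + SAMPLE_LEN:]
            return {"dedup": True, "tag": tag, "data": rest}
        # First occurrence: remember the tag and send the data unchanged.
        self.tags.add(tag)
        return {"dedup": False, "tag": None, "data": data}
```

Compared with full content matching, this trades some deduplication coverage for much lower memory and lookup cost at the sending node, since only one block per message is hashed and stored.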
Optionally, the first node further stores the protocol parts of historical messages sent by the first node to the third node. The first repeated content further includes protocol information located in the protocol part of the first message, and the first indication information further includes a difference indication, where the difference indication indicates the difference between the protocol part of the first message and the protocol part of the first original message.
Optionally, the sending module is further configured to send the first message to the third node if the first message is a deduplicated message that carries second indication information for second repeated content, and the historical messages sent by the first node to the third node include a second original message whose payload part includes the second repeated content.
Optionally, the processing module is further configured to, if the first message is a deduplicated message that carries second indication information for second repeated content, and the historical messages sent by the first node to the third node do not include a second original message whose payload part includes the second repeated content: obtain the second repeated content from a second data set according to the second indication information, where the second data set includes at least part of the content of the payload parts of historical messages received by the first node from the second node; and perform deduplication recovery processing on the payload part of the first message according to the second repeated content to obtain a third message whose payload part includes the second repeated content. The sending module is further configured to send the third message to the third node.
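Taken together, the cases above give an intermediate node four possible actions. The sketch below condenses them into one function. The function name, the two boolean inputs, and the returned labels are hypothetical; `next_hop_has_original` stands for "the historical messages sent to the third node include a message that contains the repeated content".

```python
def intermediate_action(is_dedup_message: bool, next_hop_has_original: bool) -> str:
    """Choose what an intermediate deduplication node does with a message."""
    if is_dedup_message:
        # The next hop can only make use of a deduplicated message if it
        # has already been sent the content the indications refer to.
        return "forward as is" if next_hop_has_original else "recover, then forward"
    # Non-deduplicated message: deduplicate only if the next hop already
    # holds the repeated content.
    return "deduplicate, then forward" if next_hop_has_original else "forward as is"
```

The asymmetry matters: a deduplicated message is restored at the first hop whose downstream neighbour has not seen the repeated content, so a later node is never asked to rebuild a payload from content it was never sent.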
Optionally, the second repeated content includes one or more repeated data blocks, and the second indication information includes one or more indications in one-to-one correspondence with the one or more repeated data blocks, where each indication indicates the hash value of the corresponding repeated data block.
Optionally, each indication further indicates the position of the corresponding repeated data block in the payload part of the original message corresponding to the first message, and the second data set includes the payload parts of historical messages received by the first node from the second node. The processing module is configured to: for each indication in the second indication information, obtain, according to the position indicated by the indication, a to-be-matched data block at that position of a payload part in the second data set; calculate the hash value of the to-be-matched data block; and determine the to-be-matched data block whose hash value matches the hash value indicated by the indication as the repeated data block corresponding to the indication.
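The position-and-hash lookup on the recovering side might look like the sketch below. It assumes, beyond what the text states, that the block length is known to the receiver and that SHA-256 is the hash in use; `find_repeated_block` and its parameters are hypothetical names.

```python
import hashlib
from typing import Iterable, Optional


def find_repeated_block(stored_payloads: Iterable[bytes], position: int,
                        length: int, wanted_hash: str) -> Optional[bytes]:
    """Scan stored payloads for the block an indication refers to."""
    for payload in stored_payloads:
        # Candidate block at the indicated position of this stored payload.
        candidate = payload[position:position + length]
        if len(candidate) != length:
            continue  # stored payload too short to contain the block
        if hashlib.sha256(candidate).hexdigest() == wanted_hash:
            return candidate
    return None  # no stored payload contains the block
```

Because the indication carries the position, the node hashes one candidate block per stored payload instead of searching every offset.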
Optionally, the payload part includes a protocol part and a data part, and the one or more repeated data blocks are located in the data part. The second data set includes a correspondence between the hash values of historical data blocks and the historical data blocks, where a historical data block is a data block obtained by sampling a preset position of the data part of a historical message received by the first node from the second node. The processing module is configured to determine the historical data block in the second data set that corresponds to the hash value indicated by an indication in the second indication information as the repeated data block corresponding to that indication.
Optionally, the indication further indicates the position of the corresponding repeated data block in the data part of the original message corresponding to the first message. The processing module is configured to, for each indication in the second indication information, add the repeated data block corresponding to the indication at the position indicated by the indication in the data part of the first message.
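A minimal recovery sketch under these two options: the second data set is modelled as a dictionary from hash value to data block, and each indication is a (hash, position) pair whose position refers to the original data part. The function name and data shapes are assumptions for illustration only.

```python
import hashlib


def recover_data(data: bytes, indications, store: dict) -> bytes:
    """Re-insert removed blocks into a deduplicated data part.

    `indications` is a list of (hash, position) pairs and `store` maps
    hash -> block. Raises KeyError if a referenced block is not stored.
    """
    out = bytearray(data)
    # Ascending order ensures each position is valid once all earlier
    # blocks have been put back.
    for tag, pos in sorted(indications, key=lambda item: item[1]):
        out[pos:pos] = store[tag]
    return bytes(out)
```

With `store` holding the blocks `b"AAAA"` and `b"CCCC"` under their SHA-256 digests, calling `recover_data(b"BBBB", ...)` with indications at positions 0 and 8 rebuilds `b"AAAABBBBCCCC"`.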
Optionally, the second repeated content further includes protocol information located in the protocol part, and the second indication information further includes a difference indication. The difference indication indicates the difference between the protocol part of the original message corresponding to the first message and the protocol part of a target message, where the target message is a historical message, among those received by the first node from the second node, whose data part shares the one or more repeated data blocks with the data part of the original message. The second data set further includes the protocol part of the message to which each historical data block belongs. The processing module is further configured to: obtain, from the second data set, the protocol part of the target message to which the one or more repeated data blocks belong; modify the protocol part of the target message according to the difference indication; and use the modified protocol part of the target message as the protocol part of the third message.
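One simple way to picture a difference indication is a list of (offset, byte) pairs. The sketch below assumes, purely for illustration, that the two protocol parts have the same length; the encoding and the function names are hypothetical and are not specified by the application.

```python
def diff_protocol(original: bytes, reference: bytes):
    """List the (offset, byte) pairs where `original` differs from `reference`."""
    if len(original) != len(reference):
        raise ValueError("this sketch only handles equal-length protocol parts")
    return [(i, original[i]) for i in range(len(original)) if original[i] != reference[i]]


def apply_diff(reference: bytes, diff) -> bytes:
    """Rebuild the original protocol part from a stored reference and a diff."""
    out = bytearray(reference)
    for offset, value in diff:
        out[offset] = value
    return bytes(out)
```

When only a few header fields change between messages that carry the same media data, such a diff is much shorter than the protocol part itself.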
Optionally, the first node stores one or more local flow group sets, each of which includes the flow identifiers of multiple flows passing through the first node. The processing module is further configured to: after determining that the one or more local flow group sets include a target flow group set containing the flow identifier of the flow to which the first message belongs, parse the payload part of the first message; if the payload part of the first message carries a deduplication mark, determine that the first message is a deduplicated message; and if the payload part of the first message does not carry a deduplication mark, determine that the first message is a non-deduplicated message.
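The classification step can be sketched as below. The two-byte `DEDUP_MARK` value and its placement at the start of the payload are assumptions for this example, since the text only says that the payload carries the mark. The function name and return labels are likewise hypothetical.

```python
DEDUP_MARK = b"\xde\xd0"  # hypothetical mark value, assumed to prefix the payload


def classify(flow_id: str, payload: bytes, local_groups) -> str:
    """Classify a message using the local flow group sets and the mark."""
    if not any(flow_id in group for group in local_groups):
        # Flow is not in any group: the payload is not parsed at all.
        return "not grouped"
    if payload.startswith(DEDUP_MARK):
        return "deduplicated"
    return "non-deduplicated"
```

Checking group membership first keeps payload parsing off the fast path for traffic that is unrelated to the SFU flows.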
Optionally, the processing module is further configured to: among multiple received messages that belong to different flows, add the flow identifiers of the flows whose messages have duplicate content in their payload parts to the same local flow group set, where the sender of each of the different flows is the SFU server.
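A simplified sketch of building flow group sets follows. It treats two messages as duplicates only when their entire payloads are identical, which is a stronger condition than the partial overlap the text allows; the function name and input shape are hypothetical.

```python
import hashlib


def group_flows(messages):
    """Group flow identifiers whose messages carried identical payloads.

    `messages` is an iterable of (flow_id, payload) pairs. Returns a list
    of sets of flow identifiers.
    """
    flows_by_digest = {}
    for flow_id, payload in messages:
        digest = hashlib.sha256(payload).hexdigest()
        flows_by_digest.setdefault(digest, set()).add(flow_id)

    groups = []
    for flows in flows_by_digest.values():
        if len(flows) < 2:
            continue  # payload seen on one flow only, nothing to group
        merged = set(flows)
        kept = []
        for group in groups:
            if group & merged:
                merged |= group  # overlapping groups collapse into one
            else:
                kept.append(group)
        kept.append(merged)
        groups = kept
    return groups
```

Merging overlapping sets means that if flow f2 shares content with f1 and later with f4, all three end up in one group.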
Optionally, the sending module is further configured to send first grouping information to the second node, where the first grouping information includes a correspondence between the node identifier of the first node and the one or more local flow group sets.
Optionally, the first node stores one or more lower-level flow group sets corresponding to the third node, each of which includes the flow identifiers of multiple flows passing through the third node. The processing module is further configured to, after the first node receives the first message sent by the second node, if the lower-level flow group sets corresponding to the third node include a target flow group set that contains the flow identifier of the flow to which the first message belongs, determine whether the target historical messages sent to the third node include a message whose payload part has repeated content in common with the payload part of the first message, where the flow identifiers of the flows to which the target historical messages belong are in the target flow group set. The sending module is further configured to send the first message to the third node if none of the lower-level flow group sets corresponding to the third node includes the flow identifier of the flow to which the first message belongs.
Optionally, the receiving module is further configured to receive second grouping information sent by the third node, where the second grouping information includes a correspondence between the node identifier of the third node and the one or more lower-level flow group sets.
Optionally, the sending module is further configured to send a first node discovery message to a fourth node after a fourth message whose destination port number is the SFU service port number is received. The fourth node is the next hop of the fourth message on the first node, and the destination of the fourth message is the SFU server. The first node discovery message carries an identifier of the SFU server and indicates that the first node is a subordinate node of the fourth node on the transmission path starting from the SFU server. The processing module is further configured to determine, in response to receiving from the fourth node a first node discovery response message corresponding to the first node discovery message, that the fourth node supports data deduplication.
Optionally, the receiving module is further configured to receive a second node discovery message sent by a fifth node, where the second node discovery message carries the identifier of the SFU server and indicates that the fifth node is the superior node of the first node on the transmission path starting from the SFU server. The processing module is further configured to determine, according to the second node discovery message, that the fifth node supports data deduplication. The sending module is further configured to send, to the fifth node, a second node discovery response message corresponding to the second node discovery message, where the second node discovery response message indicates that the first node supports data deduplication.
Optionally, the sending module is further configured to send a third node discovery message to a sixth node after a fifth message whose source port number is the SFU service port number is received. The sixth node is the next hop of the fifth message on the first node, and the sender of the fifth message is the SFU server. The third node discovery message carries the identifier of the SFU server and indicates that the first node is the superior node of the sixth node on the transmission path starting from the SFU server. The processing module is further configured to determine, in response to receiving from the sixth node a third node discovery response message corresponding to the third node discovery message, that the sixth node supports data deduplication.
Optionally, the receiving module is further configured to receive a fourth node discovery message sent by a seventh node, where the fourth node discovery message carries the identifier of the SFU server and indicates that the seventh node is a subordinate node of the first node on the transmission path starting from the SFU server. The processing module is further configured to determine, according to the fourth node discovery message, that the seventh node supports data deduplication. The sending module is further configured to send, to the seventh node, a fourth node discovery response message corresponding to the fourth node discovery message, where the fourth node discovery response message indicates that the first node supports data deduplication.
In a seventh aspect, a data transmission system is provided, including an SFU server and multiple nodes in a communication network. The multiple nodes include a first node and a second node, and the first node is located between the SFU server and the second node. The SFU server is configured to send messages to the first node, the first node is configured to execute the method in the first aspect and its embodiments, and the second node is configured to execute the method in the second aspect and its embodiments.
Optionally, the multiple nodes further include a third node located between the first node and the second node, and the third node is configured to execute the method in the third aspect and its embodiments.
In an eighth aspect, another data transmission system is provided, including an SFU server and a first node in a communication network. The SFU server is configured to execute the method in the first aspect and its embodiments, and the first node is configured to execute the method in the second aspect and its embodiments.
Optionally, the communication network further includes a second node located between the SFU server and the first node, and the second node is configured to execute the method in the third aspect and its embodiments.
In a ninth aspect, a communication node is provided, including a processor and a memory. The memory is configured to store a computer program that includes program instructions. The processor is configured to invoke the computer program to implement the method in the first aspect and its embodiments, the method in the second aspect and its embodiments, or the method in the third aspect and its embodiments.
In a tenth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores instructions that, when executed by a processor, implement the method in the first aspect and its embodiments, the method in the second aspect and its embodiments, or the method in the third aspect and its embodiments.
In an eleventh aspect, a computer program product is provided, including a computer program that, when executed by a processor, implements the method in the first aspect and its embodiments, the method in the second aspect and its embodiments, or the method in the third aspect and its embodiments.
In a twelfth aspect, a chip is provided. The chip includes a programmable logic circuit and/or program instructions and, when running, implements the method in the first aspect and its embodiments, the method in the second aspect and its embodiments, or the method in the third aspect and its embodiments.
FIG. 1 is a schematic diagram of an SFU communication architecture provided in an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a data transmission system provided in an embodiment of the present application;
FIG. 3 is a schematic structural diagram of another data transmission system provided in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of yet another data transmission system provided in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of still another data transmission system provided in an embodiment of the present application;
FIG. 6 is a schematic flowchart of a data transmission method provided in an embodiment of the present application;
FIG. 7 is a schematic diagram of a message deduplication process provided in an embodiment of the present application;
FIG. 8 is a schematic diagram of another message deduplication process provided in an embodiment of the present application;
FIG. 9 is a schematic diagram of yet another message deduplication process provided in an embodiment of the present application;
FIG. 10 is a schematic flowchart of another data transmission method provided in an embodiment of the present application;
FIG. 11 is a schematic flowchart of yet another data transmission method provided in an embodiment of the present application;
FIG. 12 is a schematic structural diagram of a communication node provided in an embodiment of the present application;
FIG. 13 is a schematic structural diagram of another communication node provided in an embodiment of the present application;
FIG. 14 is a schematic structural diagram of yet another communication node provided in an embodiment of the present application;
FIG. 15 is a schematic diagram of the hardware structure of a communication device provided in an embodiment of the present application.
To make the objectives, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings.
In communication scenarios such as video conferencing or live streaming, the audio and video data of one user usually needs to be distributed to multiple receivers. In a video conference, for example, the audio and video data of the conference host needs to be distributed to multiple conference members. In a live stream, the audio and video data of the streamer needs to be distributed to multiple viewers. Mainstream video conferencing and live streaming scenarios currently adopt the SFU communication architecture, in which the source user sends audio and video data to an SFU server and the SFU server replicates and distributes that data. In a video conferencing scenario, the SFU server may be a video conferencing server. In a live streaming scenario, the SFU server may be a live streaming server.
For example, FIG. 1 is a schematic diagram of an SFU communication architecture provided in an embodiment of the present application. As shown in FIG. 1, the SFU communication architecture includes a source user, an SFU server, and multiple receivers, for example receiver a, receiver b, and receiver c. The source user sends audio and video data to the multiple receivers as follows: the source user sends the audio and video data to the SFU server; the SFU server encapsulates the audio and video data in the payload parts of multiple messages, each destined for one of the receivers; and the SFU server sends each message to the corresponding receiver through the communication network, for example message a to receiver a, message b to receiver b, and message c to receiver c. The nodes in the communication network must forward each of the messages that the SFU server sends to the different receivers, and the payload part of every message carries the complete audio and video data from the source user, so every message is large. As a result, the nodes transmit a large amount of data when forwarding the multiple messages, which leads to high network bandwidth overhead.
Based on this, the present application proposes a technical solution. For an original message whose sender is the SFU server, after the first node or an intermediate node that supports data deduplication on the transmission path of the original message obtains the original message, the node determines whether it has previously sent, to the next hop of the original message, a historical message whose payload part has duplicate content with the payload part of the original message. If so, the node performs deduplication processing on the payload part of the original message to obtain a deduplicated message. The deduplicated message does not include the duplicate content and carries a deduplication mark indicating that it is a deduplicated message. The node then sends the deduplicated message to the next hop of the original message. Before the deduplicated message reaches its destination, an intermediate node or the last node that supports data deduplication on the transmission path of the original message performs deduplication recovery processing on the payload part of the deduplicated message to restore the original message, and then continues to send the original message toward the destination. In this application, an original message always refers to a message that has not been deduplicated. The present application can transmit deduplicated messages in the communication network in place of original messages. Without affecting user services, deduplication and recovery processing reduce the amount of data transmitted, thereby reducing network bandwidth overhead, improving transmission efficiency and quality, and lowering user costs. For example, in a video conferencing or live broadcast scenario that uses the SFU communication architecture, when multiple receivers in the same video conference or live broadcast receive audio and video data from the same user, the data they receive overlaps heavily. By applying the technical solution of the present application, data deduplication can be performed on the concurrent traffic that the same user sends to multiple receivers, which cuts the duplicate traffic sent to different receivers, reduces the amount of data transmitted, and thereby reduces network bandwidth overhead.
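The deduplicate-then-restore idea can be sketched as follows. This is a deliberately simplified illustration, not the method of this application: it deduplicates whole payloads rather than data blocks, assumes a single next hop, and uses SHA-256 as the fingerprint. The names `Message`, `Deduplicator`, and `Restorer` are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Message:
    dst: str
    payload: bytes
    dedup: bool = False  # hypothetical stand-in for the deduplication mark


class Deduplicator:
    """Node near the SFU server: replaces an already-sent payload by its digest."""

    def __init__(self):
        self.sent = set()  # digests of payloads already sent to the next hop

    def process(self, msg: Message) -> Message:
        digest = hashlib.sha256(msg.payload).digest()
        if digest in self.sent:
            return Message(msg.dst, digest, dedup=True)
        self.sent.add(digest)
        return msg


class Restorer:
    """Node near the receivers: rebuilds the original payload from its cache."""

    def __init__(self):
        self.cache = {}  # digest -> payload seen in earlier original messages

    def process(self, msg: Message) -> Message:
        if msg.dedup:
            return Message(msg.dst, self.cache[msg.payload])
        self.cache[hashlib.sha256(msg.payload).digest()] = msg.payload
        return msg


top, bottom = Deduplicator(), Restorer()
frame = b"\x00" * 1200  # the same audio/video payload for two receivers
first = top.process(Message("receiver a", frame))
second = top.process(Message("receiver b", frame))
print(second.dedup, len(second.payload))        # True 32
print(bottom.process(first).payload == frame)   # True
print(bottom.process(second).payload == frame)  # True
```

In this sketch the second message carries 32 bytes on the link instead of 1200, and the restoring node still delivers the full payload to receiver b.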
The technical solution of the present application is described below from multiple perspectives, including the system, the method flow, the virtual apparatus, and the hardware apparatus.
The system involved in the present application is described below by way of example.
The data transmission system provided in the embodiments of the present application adopts the SFU communication architecture and can be applied to a variety of communication scenarios, such as video conferencing or live broadcast scenarios. Optionally, in a video conferencing scenario, the SFU server may be a video conferencing server. In a live broadcast scenario, the SFU server may be a live broadcast server.
A typical application scenario of the present application is the cloud-to-branch scenario. In the cloud-to-branch scenario, the SFU server is deployed in the cloud, that is, the cloud is where the SFU server is located. The branch is usually deployed in a wide area network and is where the users are located. The cloud and the branch can communicate through a local area network. Multiple nodes that support data deduplication are deployed between the cloud and the branch. When the SFU server sends traffic to a user, these nodes forward the traffic, with a node close to the cloud performing data deduplication on the traffic and a node close to the branch performing data recovery on it. In the present application, nodes that have a data deduplication function and/or a data recovery function are collectively referred to as nodes that support data deduplication. A node that supports data deduplication may be an independent hardware device, or may be deployed as software on a server or a network device. The application scenario of the present application is not limited to the cloud-to-branch scenario, and deployment can follow actual needs. For example, if multiple nodes that support data deduplication are deployed in a local area network, data deduplication and data recovery can be performed on the traffic in that local area network. As another example, if multiple nodes that support data deduplication are deployed in a section of a wide area network, data deduplication and data recovery can be performed on the traffic in that section. The present application can be used to perform data deduplication and data recovery on any traffic that conforms to the SFU communication architecture.
The present application is not limited to deploying two layers of nodes that support data deduplication; multiple layers may also be deployed. A node close to the SFU server may be called a top node, a node close to the user may be called a bottom node, and a node located between the top node and the bottom node may be called an intermediate node. The data transmission system provided in the embodiments of the present application includes at least a top node and a bottom node. The top node has a data deduplication function and is responsible for deduplicating the traffic that the SFU server sends to users. The bottom node has a data recovery function and is responsible for recovering the traffic that the SFU server sends to users. Optionally, the data transmission system may further include an intermediate node. The intermediate node has both a data deduplication function and a data recovery function and is responsible for deduplicating or recovering the traffic that the SFU server sends to users.
For example, FIG. 2 is a schematic structural diagram of a data transmission system provided by an embodiment of the present application. As shown in FIG. 2, the data transmission system 200 adopts a one-to-one-branch, two-layer node deployment model. The data transmission system 200 includes an SFU server 201 and, in the communication network, node 202A and node 202B. Node 202A is deployed close to the SFU server 201 and is the top node, that is, the first node that supports data deduplication on the transmission path starting from the SFU server 201. Node 202B is deployed close to branch 21 and is the bottom node, that is, the last node that supports data deduplication on the transmission path from the SFU server 201 to branch 21.
As another example, FIG. 3 is a schematic structural diagram of another data transmission system provided by an embodiment of the present application. As shown in FIG. 3, the data transmission system 300 adopts a one-to-many-branch, two-layer node deployment model. The data transmission system 300 includes an SFU server 301 and, in the communication network, node 302A, node 302B, node 302C, and node 302D. Node 302A is deployed close to the SFU server 301 and is the top node, that is, the first node that supports data deduplication on the transmission path starting from the SFU server 301. Node 302B is deployed close to branch 31 and is the bottom node corresponding to branch 31, that is, the last node that supports data deduplication on the transmission path from the SFU server 301 to branch 31. Node 302C is deployed close to branch 32 and is the bottom node corresponding to branch 32, that is, the last node that supports data deduplication on the transmission path from the SFU server 301 to branch 32. Node 302D is deployed close to branch 33 and is the bottom node corresponding to branch 33, that is, the last node that supports data deduplication on the transmission path from the SFU server 301 to branch 33.
As another example, FIG. 4 is a schematic structural diagram of yet another data transmission system provided by an embodiment of the present application. As shown in FIG. 4, the data transmission system 400 adopts a multi-layer-node, multi-branch deployment model. The data transmission system 400 includes an SFU server 401 and, in the communication network, node 402A, node 402B, node 402C, node 402D, node 402E, node 402F, and node 402G. Node 402A is deployed close to the SFU server 401 and is the top node, that is, the first node that supports data deduplication on the transmission path starting from the SFU server 401. Node 402B is deployed close to branch 41 and is the bottom node corresponding to branch 41, that is, the last node that supports data deduplication on the transmission path from the SFU server 401 to branch 41. Node 402D is deployed close to branch 45 and is the bottom node corresponding to branch 45, that is, the last node that supports data deduplication on the transmission path from the SFU server 401 to branch 45. Node 402E is deployed close to branch 42 and is the bottom node corresponding to branch 42, that is, the last node that supports data deduplication on the transmission path from the SFU server 401 to branch 42. Node 402F is deployed close to branch 43 and is the bottom node corresponding to branch 43, that is, the last node that supports data deduplication on the transmission path from the SFU server 401 to branch 43. Node 402G is deployed close to branch 44 and is the bottom node corresponding to branch 44, that is, the last node that supports data deduplication on the transmission path from the SFU server 401 to branch 44. Node 402C is deployed between node 402A and nodes 402E, 402F, and 402G, and is the intermediate node corresponding to branch 42, branch 43, and branch 44. That is, node 402C is an intermediate node that supports data deduplication on each of the transmission paths from the SFU server 401 to branch 42, to branch 43, and to branch 44.
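The node roles in the FIG. 4 deployment can be derived from each node's position on the path to a branch. The sketch below encodes the topology as described in the text (node 402A feeding nodes 402B, 402C, and 402D, and node 402C feeding nodes 402E, 402F, and 402G); the dictionary layout and function names are hypothetical and are chosen only for illustration.

```python
# Parent -> children among the deduplication-capable nodes described for FIG. 4.
CHILDREN = {
    "402A": ["402B", "402C", "402D"],
    "402C": ["402E", "402F", "402G"],
}
# Bottom node -> branch it is deployed close to.
BRANCH_OF = {"402B": 41, "402D": 45, "402E": 42, "402F": 43, "402G": 44}


def path_to_branch(branch, root="402A"):
    """Return the deduplication-capable nodes from the root to the branch's bottom node."""
    def walk(node, trail):
        trail = trail + [node]
        if BRANCH_OF.get(node) == branch:
            return trail
        for child in CHILDREN.get(node, []):
            found = walk(child, trail)
            if found:
                return found
        return None
    return walk(root, [])


def roles(path):
    """Label the first node top, the last node bottom, and any others intermediate."""
    last = len(path) - 1
    return {
        node: "top" if i == 0 else "bottom" if i == last else "intermediate"
        for i, node in enumerate(path)
    }


print(roles(path_to_branch(43)))
# {'402A': 'top', '402C': 'intermediate', '402F': 'bottom'}
print(roles(path_to_branch(41)))
# {'402A': 'top', '402B': 'bottom'}
```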
Optionally, the SFU server itself may support data deduplication, in which case the SFU server is the first node that supports data deduplication on the transmission path starting from the SFU server. For example, FIG. 5 is a schematic structural diagram of still another data transmission system provided by an embodiment of the present application. As shown in FIG. 5, the data transmission system 500 includes an SFU server 501 and a node 502 in the communication network. The SFU server 501 is the top node, that is, the first node that supports data deduplication on the transmission path starting from the SFU server 501. Node 502 is deployed close to branch 51 and is the bottom node corresponding to branch 51, that is, the last node that supports data deduplication on the transmission path from the SFU server 501 to branch 51. In the data transmission system 500 shown in FIG. 5, intermediate nodes that support data deduplication may also be deployed between the SFU server 501 and node 502. For details, refer to the data transmission system 400 shown in FIG. 4; they are not repeated here.
Optionally, in the data transmission systems shown in FIG. 2 to FIG. 5, a node in the communication network that supports data deduplication may be an independent hardware device, or may be deployed as software on a server or a network device in the existing network. Network devices include but are not limited to routers and switches. The embodiments of the present application do not limit the type of the communication network. For example, the communication network may be a dedicated line network or a non-dedicated line network. As another example, the communication network may or may not include a tunnel. As another example, the communication network may be a wide area network, a local area network, or the like.
In one possible implementation, the communication scenario involved in the embodiments of the present application uses scalable video coding (SVC). SVC is a type of video coding that encodes video data in layers and transmits the layers selectively. Specifically, the source host sends a multi-layer bitstream to the SFU server, where the multi-layer bitstream includes a base layer bitstream and an enhancement layer bitstream, and the SFU server then selectively distributes the layers according to the capabilities of each receiving host. For example, for a receiving host with strong decoding capability, the SFU server sends both the base layer bitstream and the enhancement layer bitstream. For a receiving host with weak decoding capability, the SFU server sends only the base layer bitstream.
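The selective distribution described above can be sketched as a small selection function. The layer names and the boolean capability flag are assumptions made for illustration; a real SFU server would derive the capability from signaling or negotiation that this passage does not specify.

```python
def select_layers(layers, supports_enhancement):
    """Return the layer names forwarded to one receiving host."""
    if supports_enhancement:
        return list(layers)
    # A host with weak decoding capability only receives the base layer.
    return [name for name in layers if name == "base"]


stream = ["base", "enhancement-1", "enhancement-2"]  # hypothetical layer names
print(select_layers(stream, supports_enhancement=True))
print(select_layers(stream, supports_enhancement=False))  # ['base']
```

Under this selection every receiving host gets the base layer, so the base layer is content that is sent to all of them.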
In one possible implementation, the communication scenario involved in the embodiments of the present application supports end-to-end encryption (E2EE). In this implementation, the source host encrypts the service data with an Advanced Encryption Standard (AES) key and sends the resulting ciphertext to the SFU server. The SFU server carries the ciphertext in messages and distributes them to the receiving hosts, and each receiving host decrypts the ciphertext with the AES key to obtain the service data. When the communication scenario includes multiple receiving hosts, the multiple receiving hosts may use the same AES key. For example, in a video conferencing scenario, the conference host may distribute the AES key to the conference members through signaling.
For example, the communication scenario involved in the embodiments of the present application adopts the SFU communication architecture, uses SVC, and supports E2EE. Data transmission in this communication scenario has the following characteristics: signaling data is transmitted over the Transmission Control Protocol (TCP) and the Transport Layer Security (TLS) protocol, and service data (such as audio and video data) is transmitted over the User Datagram Protocol (UDP) with AES encryption. In weak-network or weak-terminal scenarios, service data may also be transmitted over TCP with AES encryption.
The method flow involved in the present application is described below by way of example.
In the data transmission system provided in the embodiments of the present application, the top node, the bottom node, and the intermediate node each follow a different flow for deduplicating and recovering messages whose sender is the SFU server. The following three embodiments describe how each of the three types of nodes transmits messages.
The first embodiment applies to the top node, which performs deduplication processing on messages. For example, FIG. 6 is a schematic flowchart of a data transmission method 600 provided by an embodiment of the present application. To distinguish the embodiments from one another, in the method 600 the first node is referred to as node 11, the second node as node 12, the first message as message 11, and the second message as message 12. The method 600 can be applied to the data transmission system 200 shown in FIG. 2, in which case node 11 in the method 600 is node 202A and node 12 is node 202B. Alternatively, the method 600 can be applied to the data transmission system 300 shown in FIG. 3, in which case node 11 is node 302A and node 12 is node 302B, node 302C, or node 302D. Alternatively, the method 600 can be applied to the data transmission system 400 shown in FIG. 4, in which case node 11 is node 402A and node 12 is node 402B, node 402C, or node 402D. Alternatively, the method 600 can be applied to the data transmission system 500 shown in FIG. 5, in which case node 11 is the SFU server 501 and node 12 is node 502. As shown in FIG. 6, the method 600 includes but is not limited to the following steps 601 to 603.
Step 601: Node 11 obtains message 11, whose sender is the SFU server. Node 11 is the first node that supports data deduplication on the transmission path of message 11.
Message 11 is an original message, that is, a message that has not been deduplicated. The sender of message 11 is the SFU server, and node 11 is the first node that supports data deduplication on the transmission path of message 11, that is, on the transmission path that starts at the SFU server and ends at the destination of message 11. Optionally, the source Internet Protocol (IP) address of message 11 may be the IP address of the SFU server. For example, if the SFU server is deployed in a public network, the source IP address of message 11 may be the public network address of the SFU server. Alternatively, the source IP address of message 11 may be the address obtained after network address translation (NAT) is applied to the IP address of the SFU server. For example, if the SFU server is deployed in a private network, the source IP address of message 11 may be the public network address that corresponds to the private network address of the SFU server.
Optionally, if node 11 is the SFU server, node 11 obtaining message 11 may mean that node 11 generates message 11.
Alternatively, if node 11 is not the SFU server, for example node 11 is a network device connected to the SFU server, node 11 obtaining message 11 may mean that node 11 receives message 11 sent by the SFU server. Optionally, after receiving a message, node 11 may determine whether the sender of the message is the SFU server based on the source port number carried in the message.
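A minimal sketch of the source-port check mentioned above, for a UDP message. The source port occupies the first two bytes of the UDP header in network byte order; the port value in `SFU_SOURCE_PORTS` and the function names are hypothetical, since this passage does not say which ports the SFU server uses.

```python
import struct

# Hypothetical: source ports the SFU server is known to send from.
SFU_SOURCE_PORTS = {8801}


def udp_source_port(udp_segment: bytes) -> int:
    """Read the 16-bit source port from the start of a UDP header."""
    (src_port,) = struct.unpack_from("!H", udp_segment, 0)
    return src_port


def is_from_sfu(udp_segment: bytes) -> bool:
    """Return True if the segment's source port is one of the configured SFU ports."""
    return udp_source_port(udp_segment) in SFU_SOURCE_PORTS


# UDP header fields: source port, destination port, length, checksum.
header = struct.pack("!HHHH", 8801, 50000, 8, 0)
print(is_from_sfu(header))  # True
```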
Step 602: If the historical messages sent by node 11 to node 12 include a target message whose payload part has duplicate content with the payload part of message 11, node 11 performs deduplication processing on the payload part of message 11 to obtain message 12. Message 12 does not include the duplicate content, and message 12 carries a deduplication mark and indication information for the duplicate content. The deduplication mark indicates that message 12 is a deduplicated message. Node 12 is the next hop of message 11 at node 11.
Optionally, the payload part is a UDP payload or a TCP payload. The payload part includes a protocol part and a data part. The protocol part may carry an application layer protocol, such as the Real-time Transport Protocol (RTP), the File Transfer Protocol (FTP), or the Hypertext Transfer Protocol (HTTP). The data part may carry application layer data, such as signaling data or service data.
Optionally, the duplicate content between the payload part of the target message and the payload part of message 11 includes one or more duplicate data blocks, and the indication information for the duplicate content includes one or more indications that correspond one-to-one to the duplicate data blocks. Each indication indicates the hash value of the corresponding duplicate data block; specifically, each indication may include that hash value. Optionally, the hash value of a duplicate data block may be calculated from the data content and the length of the duplicate data block, or part of the data content of the duplicate data block may be used as its hash value. The embodiments of the present application do not limit how the hash value of a duplicate data block is calculated.
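One way to compute a hash value from both the content and the length of a data block is sketched below. The choice of SHA-256, the 4-byte length prefix, and the truncation to 8 bytes are assumptions for illustration only; the text explicitly leaves the calculation method open.

```python
import hashlib


def block_hash(block: bytes) -> bytes:
    """Hash the block length followed by the block content (hypothetical scheme)."""
    h = hashlib.sha256()
    h.update(len(block).to_bytes(4, "big"))  # length of the data block
    h.update(block)                          # content of the data block
    return h.digest()[:8]                    # truncated so the indication stays short


print(block_hash(b"33") == block_hash(b"33"))  # True
print(block_hash(b"33") == block_hash(b"55"))  # False
```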
Optionally, after obtaining message 11, whose sender is the SFU server, node 11 executes a judgment process to determine whether the historical messages that it has sent to node 12 include a target message whose payload part has duplicate content with the payload part of message 11. The historical messages here are the original messages that node 11 sent to node 12 before executing the judgment process. The following embodiments of the present application provide two possible implementations of the judgment process, namely possible implementation A1 and possible implementation A2.
In possible implementation A1, the one or more duplicate data blocks in the duplicate content are located in the data part and/or the protocol part of the payload part of message 11. Node 11 stores a data set that includes the payload parts of the historical messages sent by node 11 to node 12. The judgment process executed by node 11 includes: node 11 performs content matching between the payload part of message 11 and the payload parts in the data set. If the data set contains a target payload part that shares a duplicate data block with the payload part of message 11, node 11 determines that the historical messages sent to node 12 include a target message. If the data set contains no payload part that shares a duplicate data block with the payload part of message 11, node 11 determines that the historical messages sent to node 12 do not include a target message. Optionally, after determining that the historical messages sent to node 12 do not include a target message, node 11 may add the payload part of message 11 to the data set to obtain an updated data set. Node 11 can use the updated data set to deduplicate messages from the SFU server that it obtains later.
Optionally, node 11 may perform content matching between the payload part of message 11 and a payload part in the data set by sampling one or more positions of the payload part of message 11 and sampling one or more positions of the payload part in the data set. If the sampled content at position 1 of the payload part of message 11 is the same as the sampled content at position 2 of a payload part in the data set, node 11 goes on to exactly match the content before and after position 1 in the payload part of message 11 against the content before and after position 2 in that payload part of the data set, so as to determine the duplicate data blocks shared by the two payload parts. The embodiments of the present application do not limit the specific implementation of content matching.
For example, the content of the payload part of message 11 is "44336655", and the content of a payload part in the data set is "11332255". Node 11 performs content matching between the two payload parts and determines that they share the duplicate data blocks "33" and "55". Node 11 can then determine that payload part in the data set to be the target payload part, and determine that the historical messages sent by node 11 to node 12 include a target message. The target message is the historical message that includes the target payload part.
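The sample-then-extend matching described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the matching procedure of this application: it samples a fixed-size window at every position, indexes only the first occurrence of each window in the historical payload, and extends matches forward only. The function name and the `window` parameter are hypothetical, and offsets are 0-based.

```python
def find_duplicate_blocks(payload: bytes, history: bytes, window: int = 2):
    """Return (start, length) pairs for blocks of payload that also occur in history."""
    # Index each sampled window of the historical payload by its content.
    index = {}
    for j in range(len(history) - window + 1):
        index.setdefault(history[j:j + window], j)

    blocks, i = [], 0
    while i + window <= len(payload):
        j = index.get(payload[i:i + window])
        if j is None:
            i += 1
            continue
        # Sampled content matched: extend the match byte by byte.
        length = window
        while (i + length < len(payload) and j + length < len(history)
               and payload[i + length] == history[j + length]):
            length += 1
        blocks.append((i, length))
        i += length
    return blocks


print(find_duplicate_blocks(b"44336655", b"11332255"))  # [(2, 2), (6, 2)]
```

With the example payloads, the two blocks found are "33" at offset 2 and "55" at offset 6, which correspond to byte positions 3-4 and 7-8 when counted from 1.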
With possible implementation A1, the process by which node 11 deduplicates the payload part of message 11 includes: for each duplicate data block shared by the payload part of message 11 and the target payload part, node 11 calculates the hash value of the duplicate data block. Node 11 removes the duplicate data block from the payload part of message 11 and adds a corresponding indication to the payload part of message 11. The indication indicates the hash value of the duplicate data block and the position of the duplicate data block in the payload part of message 11. Optionally, the position of the duplicate data block in the payload part of message 11 may be expressed as the start position of the duplicate data block and its length, as the end position of the duplicate data block and its length, or as the start position and the end position of the duplicate data block.
For example, continuing the example above, the hash value of the duplicate data block "33" is a, and the hash value of the duplicate data block "55" is b. After the payload part of message 11 is deduplicated, the content of the payload part of the resulting message 12 can be expressed as "4466; Indication 1: <a, position 3-4>; Indication 2: <b, position 7-8>". Here, "4466" is the content that remains after the content duplicated in the target payload part is removed from the payload part of message 11. Indication 1 "<a, position 3-4>" indicates that the data block corresponding to hash value a is located at the 3rd and 4th bytes of the payload part of message 11, and Indication 2 "<b, position 7-8>" indicates that the data block corresponding to hash value b is located at the 7th and 8th bytes of the payload part of message 11.
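As a non-limiting illustration, the deduplication of implementation A1 can be sketched as follows. The fixed 2-byte block size, the position-aligned comparison, the use of SHA-256, and the names `block_hash` and `dedup_a1` are assumptions made for this sketch and are not specified by the present application.

```python
import hashlib

BLOCK = 2  # assumed block size in bytes; the source does not fix one


def block_hash(block: bytes) -> str:
    # SHA-256 is an assumption; the source only refers to "a hash value".
    return hashlib.sha256(block).hexdigest()


def dedup_a1(payload: bytes, target: bytes):
    """Remove blocks of `payload` that equal the block at the same
    position in `target` (the target payload part).

    Returns (remaining, indications). Each indication is
    (hash, start, end) with 1-based inclusive byte positions in the
    original payload.
    """
    remaining = bytearray()
    indications = []
    for off in range(0, len(payload), BLOCK):
        block = payload[off:off + BLOCK]
        if len(block) == BLOCK and target[off:off + BLOCK] == block:
            indications.append((block_hash(block), off + 1, off + BLOCK))
        else:
            remaining += block
    return bytes(remaining), indications


remaining, indications = dedup_a1(b"44336655", b"11332255")
print(remaining)                            # b'4466'
print([(s, e) for _, s, e in indications])  # [(3, 4), (7, 8)]
```

The positions here use the "starting position plus ending position" representation; the other two representations described above carry the same information.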
Optionally, if node 11 has multiple subordinate nodes, node 11 may store multiple node-level data sets, each of which stores the payload parts of the historical messages sent by node 11 to one subordinate node. That is, node 11 stores a separate data set for each subordinate node. Alternatively, node 11 may store one global data set that includes a correspondence between payload parts and node identifiers, where a node identifier indicates that the historical message to which the corresponding payload part belongs was sent to the subordinate node indicated by that node identifier. That is, node 11 stores one data set shared by all subordinate nodes. A subordinate node of node 11 is a node located after node 11 on a transmission path starting from the SFU server. For example, in the data transmission system 300 shown in FIG. 3, node 11 may be node 302A, in which case node 302B, node 302C, and node 302D are all subordinate nodes of node 11.
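The two storage options can be contrasted with a minimal sketch. The container types and the function names `record_per_node`, `record_global`, and `sent_to` are hypothetical and chosen only for illustration; the node identifiers are the reference numerals of FIG. 3 used as strings.

```python
from collections import defaultdict

# Option 1: one node-level data set per subordinate node.
per_node_sets: dict[str, list[bytes]] = defaultdict(list)


def record_per_node(node_id: str, payload: bytes) -> None:
    per_node_sets[node_id].append(payload)


# Option 2: one global data set mapping each payload part to the
# identifiers of the subordinate nodes it has been sent to.
global_set: dict[bytes, set[str]] = defaultdict(set)


def record_global(node_id: str, payload: bytes) -> None:
    global_set[payload].add(node_id)


def sent_to(node_id: str, payload: bytes) -> bool:
    # True if this payload part was previously sent to `node_id`.
    return node_id in global_set.get(payload, set())


record_per_node("302B", b"11332255")
record_global("302B", b"11332255")
record_global("302C", b"11332255")
print(sent_to("302C", b"11332255"))  # True
print(sent_to("302D", b"11332255"))  # False
```

With the global data set, a payload part sent to several subordinate nodes is stored once, at the cost of a node-identifier lookup on each match.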
In possible implementation A1 above, the top node stores the complete content of the payload parts of the historical messages it has sent to subordinate nodes. When determining whether an acquired original message has content that duplicates a historical message, the top node matches the entire content of the payload part, without distinguishing whether the duplicate content is in the data part or in the protocol part.
In the embodiments of the present application, the top node can perform content matching between the payload part of an acquired original message and the payload parts of stored historical messages. If the payload part of the original message and the payload part of a historical message share a duplicate data block, the top node calculates the hash value of the duplicate data block and removes the duplicate data block from the original message to obtain a deduplicated message. The deduplicated message further carries an indication of the hash value and the position of the duplicate data block, thereby deduplicating the data of the original message.
For example, FIG. 7 is a schematic diagram of a message deduplication process provided by an embodiment of the present application. As shown in FIG. 7, the original message includes an Ethernet header, an IP header, a TCP/UDP header, and a payload part, and the payload part includes a protocol part and a data part. The payload part of the original message is deduplicated to obtain a deduplicated message. The deduplicated message includes an Ethernet header, an IP header, a TCP/UDP header, and a payload part, and that payload part includes a deduplication mark, the indications corresponding to the duplicate data blocks, and the content of the payload part of the original message other than the duplicate data blocks. The indication corresponding to a duplicate data block indicates the hash value of the duplicate data block and its position in the payload part of the original message. In the payload part of the deduplicated message, the deduplication mark and the indications may precede the protocol part and the data part or follow them, which is not limited in the embodiments of the present application.
Possible implementation A2: one or more duplicate data blocks in the duplicate content are located in the data part of the payload part of message 11. Node 11 stores a sampling tag set, which includes hash values of historical data blocks. A historical data block is a data block obtained by sampling a preset position of the data part of a historical message sent by node 11 to node 12. The judgment process executed by node 11 includes the following: node 11 samples the preset position of the data part of message 11 to obtain a sampled data block, and calculates the hash value of the sampled data block. If the sampling tag set includes the hash value of the sampled data block, node 11 determines that a target message exists among the historical messages sent to node 12. If the sampling tag set does not include the hash value of the sampled data block, node 11 determines that no target message exists among the historical messages sent to node 12. Optionally, after determining that no target message exists among the historical messages sent to node 12, node 11 can add the hash value of the sampled data block to the sampling tag set to obtain an updated sampling tag set. The updated sampling tag set can be used by node 11 to deduplicate subsequently acquired messages whose sender is the SFU server.
The preset position of the data part is a sampling position of the data part that is set in advance. Optionally, the preset position may be the entire data part, in which case the sampled data block obtained by node 11 is the complete content of the data part of message 11. Alternatively, the preset position may be a local field of the data part. In this case there may be one or more preset positions, and node 11 samples each preset position of the data part of message 11 separately to obtain one or more sampled data blocks.
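The judgment process of implementation A2 can be sketched as follows. The preset positions `PRESET_POSITIONS`, the use of SHA-256, the rule that a single matching sampled block is sufficient, and the function names are assumptions of this sketch rather than details given in the present application.

```python
import hashlib

# Hypothetical preset sampling positions: (offset, length) in the data part.
PRESET_POSITIONS = [(0, 4), (8, 4)]


def sample_hashes(data: bytes) -> list[str]:
    # Sample each preset position and hash the sampled data block.
    return [hashlib.sha256(data[off:off + n]).hexdigest()
            for off, n in PRESET_POSITIONS]


def target_exists(data: bytes, tag_set: set[str]) -> bool:
    """Return True if any sampled block's hash is already in the
    sampling tag set. Otherwise record the new hashes (the optional
    update step) and return False."""
    hashes = sample_hashes(data)
    if any(h in tag_set for h in hashes):
        return True
    tag_set.update(hashes)
    return False


tags: set[str] = set()
print(target_exists(b"AAAAbbbbCCCCdddd", tags))  # False, hashes recorded
print(target_exists(b"AAAAxxxxZZZZyyyy", tags))  # True, block "AAAA" seen
```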
In combination with possible implementation A2 above, node 11 performs deduplication processing on the payload part of message 11 as follows: node 11 takes each sampled data block whose hash value belongs to the sampling tag set as a duplicate data block, removes the duplicate data block from the data part of message 11, and adds to the payload part of message 11 an indication corresponding to the duplicate data block. The indication indicates the hash value of the duplicate data block.
Optionally, when there are multiple preset positions, node 11 obtains multiple sampled data blocks by sampling the preset positions of the data part of message 11, and accordingly there may be one or more duplicate data blocks. In this case, the indication corresponding to a duplicate data block also indicates the position of the duplicate data block in the data part of message 11. For example, the indication corresponding to a duplicate data block can be expressed as <hash value, position>. In the embodiments of the present application, when multiple sampling positions are preset for the data part, the top node obtains multiple sampled data blocks from the data part of the original message. The position in the original message of each duplicate data block removed from the deduplicated message therefore needs to be indicated, so that a subsequent node can perform data recovery on the deduplicated message.
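A minimal sketch of this deduplication step with multiple preset positions is given below. It assumes non-overlapping preset positions expressed as (offset, length), SHA-256 as the hash, and an indication encoded as (hash, offset, length); none of these details are fixed by the present application.

```python
import hashlib


def digest(block: bytes) -> str:
    # SHA-256 is an assumption; the source only refers to "a hash value".
    return hashlib.sha256(block).hexdigest()


def dedup_a2(data: bytes, positions, tag_set):
    """Remove every sampled block whose hash is in `tag_set`.

    Returns (remaining_data, indications), where each indication is
    (hash, offset, length) relative to the original data part.
    """
    remaining = bytearray()
    indications = []
    cursor = 0
    for off, length in sorted(positions):
        block = data[off:off + length]
        h = digest(block)
        if h in tag_set:
            remaining += data[cursor:off]  # keep bytes before the block
            indications.append((h, off, length))
            cursor = off + length          # skip the duplicate block
    remaining += data[cursor:]
    return bytes(remaining), indications


tags = {digest(b"CCCC")}
rest, inds = dedup_a2(b"AAAAbbbbCCCCdddd", [(0, 4), (8, 4)], tags)
print(rest)                             # b'AAAAbbbbdddd'
print([(o, n) for _, o, n in inds])     # [(8, 4)]
```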
Optionally, if node 11 has multiple subordinate nodes, node 11 may store multiple node-level sampling tag sets, each corresponding to one subordinate node. Each node-level sampling tag set stores the hash values of the data blocks obtained by sampling the preset positions of the data parts of the historical messages sent by node 11 to the corresponding subordinate node. That is, node 11 stores a separate sampling tag set for each subordinate node. Alternatively, node 11 may store one global sampling tag set that includes a correspondence between hash values and node identifiers, where a node identifier indicates that the corresponding hash value comes from a historical message sent to the subordinate node indicated by that node identifier. That is, node 11 stores one sampling tag set shared by all subordinate nodes.
In possible implementation A2 above, the top node stores the hash values of the data blocks at the preset positions of the data parts of the historical messages it has sent to subordinate nodes. When determining whether an acquired original message has content that duplicates a historical message, the top node calculates the hash value of each sampled data block obtained by sampling the preset positions of the data part of the original message and compares it with the stored hash values. In this way, each data block at a preset position of the data part of the original message is deduplicated as a whole.
In the embodiments of the present application, the top node can calculate the hash value of the sampled data block at a preset position of the data part of an acquired original message and compare it with the stored hash values of historical data blocks. If the hash value of a sampled data block in the original message is the same as a hash value stored by the top node, the top node removes that sampled data block from the original message to obtain a deduplicated message, and carries the hash value of the sampled data block in the deduplicated message, thereby deduplicating the data of the original message. Compared with possible implementation A1 above, the top node does not need to perform content matching between the payload parts of the original message and the historical messages, which improves deduplication efficiency.
For example, FIG. 8 is a schematic diagram of another message deduplication process provided by an embodiment of the present application. As shown in FIG. 8, the original message includes an Ethernet header, an IP header, a TCP/UDP header, and a payload part, and the payload part includes a protocol part and a data part. The payload part of the original message is deduplicated to obtain a deduplicated message. The deduplicated message includes an Ethernet header, an IP header, a TCP/UDP header, and a payload part. Assuming that the preset sampling position of the data part is the entire data part, the payload part of the deduplicated message includes a deduplication mark, the hash value of the data part of the original message, and the protocol part of the original message. In the payload part of the deduplicated message, the deduplication mark and the hash value of the data part may precede the protocol part or follow it, which is not limited in the embodiments of the present application.
Optionally, the sampling tag set stored in node 11 also includes the historical data blocks indicated by the hash values. That is, the sampling tag set stored in node 11 may include a correspondence between historical data blocks and their hash values. In this case, determining that a target message exists among the historical messages when the sampling tag set includes the hash value of a sampled data block is implemented as follows: if the sampling tag set includes the hash value of the sampled data block, node 11 performs content matching between the sampled data block and the historical data block indicated by that hash value. If the content of the sampled data block is the same as the content of the historical data block indicated by that hash value, node 11 determines that a target message exists among the historical messages.
Two data blocks with the same hash value may have different data content. By storing the correspondence between historical data blocks and their hash values in the sampling tag set, node 11 can, after determining that the hash value of a sampled data block is the same as the hash value of a historical data block, further perform content matching between the sampled data block and that historical data block. This achieves an exact match and improves the accuracy of message deduplication.
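The benefit of the additional content match can be illustrated with a deliberately weak hash. The function `weak_hash` is a hypothetical stand-in chosen only so that a collision is easy to produce; it is not the hash used by the present application, and the dictionary layout is likewise an assumption.

```python
def weak_hash(block: bytes) -> int:
    # Deliberately weak: blocks with the same byte sum collide.
    return sum(block) % 256


def is_duplicate(block: bytes, tag_store: dict[int, bytes]) -> bool:
    """Hash lookup followed by an exact content comparison."""
    stored = tag_store.get(weak_hash(block))
    return stored is not None and stored == block


# The sampling tag set stores hash value -> historical data block.
tag_store = {weak_hash(b"ab"): b"ab"}
print(is_duplicate(b"ab", tag_store))  # True
print(is_duplicate(b"ba", tag_store))  # False: same hash, different content
```

Without the content comparison, the block "ba" would be removed and later restored as "ab" by the recovering node.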
In combination with possible implementation A2 above, node 11 may also store the protocol parts of the historical messages sent by node 11 to node 12. The duplicate content shared by the payload part of message 11 and the payload part of the target message may also include protocol information located in the protocol part of message 11. Accordingly, the indication information of the duplicate content may also include a difference indication, which indicates the difference between the protocol part of message 11 and the protocol part of the target message. Specifically, the difference includes the difference information between the protocol part of message 11 and the protocol part of the target message, and the position of that difference information in the protocol part of message 11.
Optionally, after determining that a target message exists among the historical messages, node 11 performs content matching between the protocol part of message 11 and the protocol part of the target message to determine the difference information between the two protocol parts and the position of that difference information in the protocol part of message 11.
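One possible way to derive such a difference indication is sketched below. It assumes that the two protocol parts have the same length and encodes the indication as a list of (offset, differing bytes); the function names `diff_indication` and `apply_diff` and the sample byte values are hypothetical.

```python
def diff_indication(proto: bytes, target_proto: bytes):
    """Return runs of bytes in `proto` that differ from `target_proto`
    as (offset, bytes). Assumes both protocol parts have equal length."""
    if len(proto) != len(target_proto):
        raise ValueError("sketch only handles equal-length protocol parts")
    diffs = []
    i = 0
    while i < len(proto):
        if proto[i] != target_proto[i]:
            start = i
            while i < len(proto) and proto[i] != target_proto[i]:
                i += 1
            diffs.append((start, proto[start:i]))
        else:
            i += 1
    return diffs


def apply_diff(target_proto: bytes, diffs) -> bytes:
    # Rebuild the original protocol part from the stored one.
    out = bytearray(target_proto)
    for off, chunk in diffs:
        out[off:off + len(chunk)] = chunk
    return bytes(out)


current = bytes([0x80, 0x60, 0x00, 0x02, 0x00, 0x00, 0x00, 0x10])
stored = bytes([0x80, 0x60, 0x00, 0x01, 0x00, 0x00, 0x00, 0x0F])
diffs = diff_indication(current, stored)
print(diffs)                                  # [(3, b'\x02'), (7, b'\x10')]
print(apply_diff(stored, diffs) == current)   # True
```

Only the two differing bytes and their offsets travel in the deduplicated message; a node that holds the stored protocol part can rebuild the rest.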
For example, FIG. 9 is a schematic diagram of yet another message deduplication process provided by an embodiment of the present application. As shown in FIG. 9, the original message includes an Ethernet header, an IP header, a TCP/UDP header, and a payload part, and the payload part includes a protocol part and a data part. The payload part of the original message is deduplicated to obtain a deduplicated message. The deduplicated message includes an Ethernet header, an IP header, a TCP/UDP header, and a payload part. Assuming that the preset sampling position of the data part is the entire data part, the payload part of the deduplicated message includes a deduplication mark, the hash value of the data part of the original message, and a difference indication. The difference indication indicates the difference information between the protocol part of the original message and the protocol part of the target message, and the position of that difference information in the protocol part of the original message.
In the embodiments of the present application, in addition to deduplicating the data blocks at the preset positions of the data part of a message, the top node can also deduplicate the protocol part of the message. Carrying a difference indication in the payload part of the message in place of the protocol part further reduces the amount of data transmitted, thereby reducing network bandwidth overhead.
Step 603: Node 11 sends message 12 to node 12.
Alternatively, after node 11 executes the above judgment process for determining whether the historical messages it has sent to node 12 include a target message whose payload part has content that duplicates the payload part of message 11, if no target message exists among the historical messages sent by node 11 to node 12, node 11 sends message 11 to node 12. That is, node 11 forwards message 11 directly without deduplicating it, and steps 602 and 603 above are not executed.
In the embodiments of the present application, after receiving a message whose sender is the SFU server, the top node can determine whether it has previously sent to the next hop of the message a historical message whose payload part has content that duplicates the payload part of the message. If so, the top node can deduplicate the message and then send the deduplicated message to the subordinate node. Because a deduplicated message contains less data than the message before deduplication, the amount of data transmitted is reduced, which reduces network bandwidth overhead. When sending a deduplicated message to a subordinate node, the top node only needs to ensure that it has previously sent to that subordinate node a historical message whose payload part carries the content that was removed from the original message to produce the deduplicated message. This ensures that a node on the subsequent transmission path can perform data recovery on the deduplicated message, so that the user ultimately receives the original message carrying the complete data content and the user service is safeguarded.
The second embodiment is applied to the bottom node, which performs deduplication recovery processing on messages. For example, FIG. 10 is a schematic flowchart of another data transmission method 1000 provided by an embodiment of the present application. To distinguish the content of the embodiments, in method 1000 the first node is referred to as node 21, the second node as node 22, the third node as node 23, the first message as message 21, and the second message as message 22. Method 1000 can be applied to the data transmission system 200 shown in FIG. 2, in which case node 21 is node 202B and node 22 is node 202A. Alternatively, method 1000 can be applied to the data transmission system 300 shown in FIG. 3, in which case node 21 is node 302B, node 302C, or node 302D, and node 22 is node 302A. Alternatively, method 1000 can be applied to the data transmission system 400 shown in FIG. 4, in which case node 21 is node 402B or node 402D and node 22 is node 402A, or node 21 is node 402E, node 402F, or node 402G and node 22 is node 402C. Alternatively, method 1000 can be applied to the data transmission system 500 shown in FIG. 5, in which case node 21 is node 502 and node 22 is the SFU server 501. As shown in FIG. 10, method 1000 includes but is not limited to the following steps 1001 to 1005.
Step 1001: Node 21 receives message 21 sent by node 22. The sender of message 21 is the SFU server. Message 21 carries a deduplication mark and indication information of duplicate content, where the deduplication mark indicates that message 21 is a deduplicated message. Node 21 is the last node on the transmission path of message 21 that supports data deduplication.
The sender of message 21 is the SFU server, and node 21 is the last node on the transmission path of message 21 that supports data deduplication. That is, node 21 is the last node supporting data deduplication on the transmission path that starts at the SFU server and ends at the destination of message 21. Optionally, the source IP address of message 21 may be the IP address of the SFU server, or may be the address obtained after network address translation (NAT) is applied to the IP address of the SFU server.
Optionally, the duplicate content includes one or more duplicate data blocks, and the indication information of the duplicate content includes one or more indications that are in one-to-one correspondence with the one or more duplicate data blocks. Each indication indicates the hash value of the corresponding duplicate data block. Specifically, each indication may include the hash value of the corresponding duplicate data block. For a detailed explanation of the duplicate content and the indication information, refer to the related content in step 602 above, which is not repeated here.
Step 1002: Node 21 determines, based on the deduplication mark, that message 21 is a deduplicated message.
Optionally, the deduplication mark is located in the payload part of message 21. After receiving message 21, node 21 parses the payload part of message 21 to obtain the deduplication mark, and then determines, based on the deduplication mark, that message 21 is a deduplicated message.
Step 1003: Node 21 obtains the duplicate content from a data set according to the indication information of the duplicate content. The data set includes at least part of the content of the payload parts of the historical messages received by node 21 from node 22.
If the nodes in the data transmission system use possible implementation A1 in step 602 above to deduplicate messages, the data set stored in node 21 includes the payload parts of the historical messages received by node 21 from node 22. If the nodes in the data transmission system use possible implementation A2 in step 602 above to deduplicate messages, the data set stored in node 21 includes a correspondence between the hash values of historical data blocks and the historical data blocks, where a historical data block is a data block obtained by sampling a preset position of the data part of a historical message received by node 21 from node 22. In the latter case, the data set stored in node 21 may also include the protocol part of the message to which each historical data block belongs.
Possible implementation B1: message 21 is obtained by a node located before node 21 on the transmission path deduplicating the original message using possible implementation A1 in step 602 above. Each indication in the indication information of the duplicate content indicates the hash value of the corresponding duplicate data block and the position of the corresponding duplicate data block in the payload part of the original message corresponding to message 21.
In combination with possible implementation B1 above, node 21 obtains the duplicate content from the data set according to the indication information of the duplicate content as follows. For each indication in the indication information, node 21 obtains, according to the position indicated by the indication, the to-be-matched data block at that position in a payload part in the data set. Node 21 calculates the hash value of the to-be-matched data block. Node 21 determines the to-be-matched data block whose hash value is consistent with the hash value indicated by the indication as the duplicate data block corresponding to the indication.
For example, the content of the payload part of message 21 includes "4466; Indication 1: <a, position 3-4>; Indication 2: <b, position 7-8>". Here, "4466" includes content of the protocol part and/or content of the data part. Indication 1 "<a, position 3-4>" indicates that the data block corresponding to hash value a is located at the 3rd and 4th bytes of the payload part of the original message corresponding to message 21, and Indication 2 "<b, position 7-8>" indicates that the data block corresponding to hash value b is located at the 7th and 8th bytes of the payload part of that original message. Assuming that the content of a payload part in the data set is "11332255", node 21 obtains the to-be-matched data block "33" from that payload part according to the position indicated by Indication 1, and obtains the to-be-matched data block "55" according to the position indicated by Indication 2. If the hash value of the to-be-matched data block "33" is a, node 21 takes "33" as the duplicate data block corresponding to Indication 1. If the hash value of the to-be-matched data block "55" is b, node 21 takes "55" as the duplicate data block corresponding to Indication 2.
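The lookup of implementation B1 can be sketched as follows. The use of SHA-256, the encoding of an indication as (hash, start, end) with 1-based inclusive positions, and the names `block_hash` and `lookup_block` are assumptions of this sketch.

```python
import hashlib


def block_hash(block: bytes) -> str:
    # SHA-256 is an assumption; it must match the hash used when deduplicating.
    return hashlib.sha256(block).hexdigest()


def lookup_block(indication, data_set):
    """Find the duplicate data block for one indication.

    `indication` is (hash, start, end) with 1-based inclusive byte
    positions; `data_set` is a list of stored historical payload parts.
    Returns the matching block, or None if no stored payload matches.
    """
    expected, start, end = indication
    for payload in data_set:
        candidate = payload[start - 1:end]  # to-be-matched data block
        if len(candidate) == end - start + 1 and block_hash(candidate) == expected:
            return candidate
    return None


data_set = [b"99999999", b"11332255"]
print(lookup_block((block_hash(b"33"), 3, 4), data_set))  # b'33'
print(lookup_block((block_hash(b"55"), 7, 8), data_set))  # b'55'
```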
Possible implementation B2: message 21 is obtained by a node located before node 21 on the transmission path deduplicating the original message using possible implementation A2 in step 602 above. The one or more duplicate data blocks in the duplicate content are located in the data part of the payload part of message 11.
In combination with possible implementation B2 above, node 21 obtains the duplicate content from the data set according to the indication information of the duplicate content as follows: node 21 determines the historical data block in the data set that corresponds to the hash value indicated by an indication in the indication information as the duplicate data block corresponding to that indication.
Optionally, if only one sampling position of the data part is preset in possible implementation A2 in step 602, an indication in the indication information does not need to indicate the position of the corresponding duplicate data block in the original message corresponding to message 21. Node 21 assumes by default that the duplicate data block is located at that sampling position in the data part of the original message. If multiple sampling positions of the data part are preset in possible implementation A2 in step 602, each indication also indicates the position of the corresponding duplicate data block in the data part of the original message corresponding to message 21.
In combination with possible implementation B2, the data set stored in node 21 may include the protocol part of the message to which each historical data block belongs. The duplicate content may further include protocol information located in the protocol part. Accordingly, the indication information for the duplicate content further includes a difference indication, which indicates the difference between the protocol part of the original message corresponding to message 21 and the protocol part of a target message. The target message is a historical message that node 21 received from node 22 and whose data part shares the one or more duplicate data blocks with the data part of the original message corresponding to message 21. In this case, the process by which node 21 obtains the duplicate content from the data set according to the indication information further includes: node 21 obtains from the data set the protocol part of the target message to which the one or more duplicate data blocks belong.
Step 1004: Node 21 performs deduplication recovery processing on the payload part of message 21 according to the duplicate content to obtain message 22, where the payload part of message 22 includes the duplicate content.
Message 22 is the original message corresponding to message 21.
In combination with possible implementation B1 in step 1003, node 21 performs deduplication recovery processing on the payload part of message 21 as follows: for each indication in the indication information, node 21 adds the duplicate data block corresponding to the indication at the position in the payload part of message 21 given by that indication.
For example, continuing the example given for the first possible implementation (B1) in step 1003, the payload part of message 21 contains "4466; Indication 1: <a, position 3-4>; Indication 2: <b, position 7-8>". Suppose the duplicate data block corresponding to hash value a is "33" and the duplicate data block corresponding to hash value b is "55". Node 21 inserts data block "33" between "44" and "66" to obtain "443366", so that "33" occupies the 3rd and 4th bytes of the payload part of the restored message. Node 21 then appends data block "55" after "443366" to obtain "44336655", so that "55" occupies the 7th and 8th bytes of the payload part of the restored message.
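The restoration in this example can be sketched as below. It is an illustration under stated assumptions: positions are 1-based inclusive byte offsets in the original payload, the indication list has already been stripped from the residual payload, and the function name `restore_payload` is hypothetical.

```python
def restore_payload(residual: bytes, blocks):
    """Reinsert removed blocks into a deduplicated payload.

    blocks: iterable of (start, end, data), where start..end are the 1-based
    inclusive byte positions the block occupied in the original payload.
    """
    restored = bytearray(residual)
    # Process blocks in ascending position order so that every byte before a
    # block's start has already been restored when the block is inserted.
    for start, end, data in sorted(blocks):
        if len(data) != end - start + 1:
            raise ValueError("block length does not match indicated span")
        restored[start - 1:start - 1] = data  # insert without overwriting
    return bytes(restored)


print(restore_payload(b"4466", [(7, 8, b"55"), (3, 4, b"33")]))  # b'44336655'
```

Sorting by position is the design choice that makes the indicated offsets valid: each offset refers to the original payload, so earlier blocks must be back in place before later ones are inserted.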
In this embodiment of the present application, the bottom node uses the position, carried in an indication in the deduplicated message, of the duplicate data block in the payload part of the original message. For each of the stored payload parts, it computes the hash value of the data block at that position and selects the stored data block whose hash value matches the hash value given by the indication. It then adds that stored data block at the indicated position, thereby recovering the data of the deduplicated message.
In combination with possible implementation B2 in step 1003, if an indication in the indication information indicates the position of the corresponding duplicate data block in the data part of the original message corresponding to message 21, node 21 performs deduplication recovery processing on the payload part of message 21 as follows: for each indication in the indication information, node 21 adds the duplicate data block corresponding to the indication at the position in the data part of message 21 given by that indication. If the indication does not indicate the position of the duplicate data block in the data part of the original message, node 21 adds the obtained duplicate data block at the default deduplication position of the data part of message 21, which is the preset sampling position of the data part described above.
Optionally, if the indication information for the duplicate content includes a difference indication for the protocol part, the deduplication recovery processing that node 21 performs on the payload part of message 21 further includes: node 21 modifies the protocol part of the target message according to the difference indication and uses the modified protocol part as the protocol part of message 22. For a detailed explanation of the difference indication, refer to the related content in step 602, which is not repeated here.
In this embodiment of the present application, the bottom node looks up the hash value carried in the deduplicated message among the hash values of the stored historical data blocks, and adds the historical data block corresponding to the matched hash value to the data part of the deduplicated message, thereby recovering the data part. In addition, the bottom node may obtain the protocol part of the historical message to which the matched historical data block belongs and, together with the difference indication for the protocol part carried in the deduplicated message, reconstruct the protocol part of the original message, thereby recovering the protocol part of the deduplicated message.
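A minimal sketch of this B2-style recovery follows. Several details are assumptions made only for illustration: SHA-256 as the hash function, a single sampling offset, a protocol part modeled as a dictionary of fields, and a difference indication modeled as a dictionary of overriding field values. The class `SampleStore` and the field names `seq` and `ssrc` are hypothetical; the source defers the actual difference format to step 602.

```python
import hashlib


def block_hash(block: bytes) -> str:
    # Assumed hash function; the source does not specify one.
    return hashlib.sha256(block).hexdigest()


class SampleStore:
    """Maps a block hash to the historical block and its protocol part."""

    def __init__(self):
        self._by_hash = {}

    def remember(self, block: bytes, protocol_fields: dict) -> None:
        self._by_hash[block_hash(block)] = (block, dict(protocol_fields))

    def recover(self, data_residual: bytes, digest: str,
                sample_offset: int, protocol_diff: dict):
        """Return (protocol_fields, data_part) or None if the hash is unknown."""
        entry = self._by_hash.get(digest)
        if entry is None:
            return None
        block, protocol = entry
        # Put the historical block back at the (0-based) sampling offset.
        data = data_residual[:sample_offset] + block + data_residual[sample_offset:]
        # Apply the difference indication on top of the target protocol part.
        return {**protocol, **protocol_diff}, data


store = SampleStore()
store.remember(b"ABCD", {"seq": 10, "ssrc": 7})
print(store.recover(b"xxyy", block_hash(b"ABCD"), 2, {"seq": 11}))
# ({'seq': 11, 'ssrc': 7}, b'xxABCDyy')
```

Unlike the position-based lookup of B1, this variant needs no hashing at recovery time, since the hash values were stored when the historical blocks were recorded.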
In step 1004, when node 21 performs data recovery on message 21, it may also delete the deduplication mark and the indication information for the duplicate content from message 21, so as to restore the original message corresponding to message 21.
Step 1005: Node 21 sends message 22 to node 23, where node 23 is the next hop of message 21 at node 21.
Optionally, node 23 may be the destination of message 22, or may be a network device on the transmission path of message 22 that does not support data deduplication.
Network instability may cause messages to arrive at the bottom node out of order. For example, the bottom node may receive a deduplicated message before it receives the non-deduplicated message that carries the duplicate content indicated in that deduplicated message. In this case, the bottom node cannot recover the deduplicated message immediately after receiving it. It must wait for the non-deduplicated message carrying the indicated duplicate content to arrive, and then recover the deduplicated message based on the payload part of that non-deduplicated message.
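One way to handle this out-of-order case is to park deduplicated messages until the block they reference arrives. The sketch below is an assumption-laden illustration, not the method of the source: each deduplicated message is reduced to a hypothetical identifier and a single block hash, SHA-256 stands in for the hash function, and the class name `PendingRecovery` is invented for the example.

```python
import hashlib
from collections import defaultdict


class PendingRecovery:
    """Parks deduplicated messages whose referenced block is not yet known."""

    def __init__(self):
        self.known = {}                   # hash -> block seen in a plain message
        self.waiting = defaultdict(list)  # hash -> ids of parked messages

    def on_dedup_message(self, message_id, digest):
        """Return the ids that can be recovered now (possibly empty)."""
        if digest in self.known:
            return [message_id]
        self.waiting[digest].append(message_id)
        return []

    def on_plain_block(self, block: bytes):
        """Record a block from a non-deduplicated message and release waiters."""
        digest = hashlib.sha256(block).hexdigest()
        self.known[digest] = block
        return self.waiting.pop(digest, [])


pending = PendingRecovery()
h = hashlib.sha256(b"33").hexdigest()
print(pending.on_dedup_message("m1", h))  # []  (must wait)
print(pending.on_plain_block(b"33"))      # ['m1']  (now recoverable)
print(pending.on_dedup_message("m2", h))  # ['m2']
```

A real node would presumably also bound how long a message may wait; the source does not describe a timeout, so none is modeled here.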
Optionally, if node 21 receives a non-deduplicated message, node 21 forwards it directly. In this embodiment of the present application, the bottom node does not need to perform the message deduplication process.
In this embodiment of the present application, the bottom node can perform data recovery on deduplicated messages whose sender is the SFU server, so that the user receives the original message carrying the complete data content, thereby safeguarding the user's service.
The third embodiment applies to an intermediate node, which performs deduplication processing or deduplication recovery processing on messages. For example, Figure 11 is a schematic flowchart of another data transmission method 1100 provided in an embodiment of the present application. To distinguish the embodiments from one another, in method 1100 the first node is referred to as node 31, the second node as node 32, the third node as node 33, the first message as message 31, the second message as message 32, and the third message as message 33. Method 1100 can be applied to the data transmission system 400 shown in Figure 4, in which case node 31 is node 402C, node 32 is node 402A, and node 33 is node 402E, node 402F, or node 402G. As shown in Figure 11, method 1100 includes but is not limited to the following steps 1101 to 1105.
Step 1101: Node 31 receives message 31 sent by node 32. The sender of message 31 is the SFU server, and node 31 is an intermediate node that supports data deduplication on the transmission path of message 31.
Optionally, message 31 may be a deduplicated message or a non-deduplicated message. The sender of message 31 is the SFU server, and node 31 is an intermediate node that supports data deduplication on the transmission path of message 31. That is, node 31 is an intermediate node that supports data deduplication on the transmission path that starts at the SFU server and ends at the destination of message 31. Optionally, the source IP address of message 31 may be the IP address of the SFU server, or the address obtained after the IP address of the SFU server undergoes NAT.
Step 1102: Node 31 determines whether message 31 is a deduplicated message or a non-deduplicated message.
A deduplicated message in this embodiment of the present application carries a deduplication mark. Optionally, the deduplication mark is located in the payload part of the deduplicated message. After receiving message 31, node 31 parses its payload part. If the payload part of message 31 carries the deduplication mark, node 31 determines that message 31 is a deduplicated message. Otherwise, node 31 determines that message 31 is a non-deduplicated message.
Step 1103: If message 31 is a non-deduplicated message, node 31 determines whether the historical messages it has sent to node 33 include a first original message, that is, a message whose payload part shares first duplicate content with the payload part of message 31. Node 33 is the next hop of message 31 at node 31.
Optionally, the first duplicate content includes one or more duplicate data blocks.
Possible implementation C1: node 31 stores a first data set, which includes the payload parts of the historical messages that node 31 has sent to node 33. To determine whether the historical messages sent to node 33 include the first original message, node 31 matches the content of the payload part of message 31 against the payload parts in the first data set. If the first data set contains a target payload part that shares a duplicate data block with the payload part of message 31, node 31 determines that the first original message exists among the historical messages sent to node 33. If the first data set contains no such payload part, node 31 determines that the first original message does not exist. Optionally, after determining that the first original message does not exist, node 31 may add the payload part of message 31 to the first data set to obtain an updated first data set. Node 31 can use the updated first data set to deduplicate subsequently obtained non-deduplicated messages whose sender is the SFU server. For this implementation, refer to possible implementation A1 in step 602.
Optionally, if node 31 has multiple subordinate nodes, node 31 may store multiple node-level data sets, each of which stores the payload parts of the historical messages that node 31 has sent to one subordinate node. That is, node 31 stores a separate data set for each subordinate node. Alternatively, node 31 may store a single global data set that records the correspondence between payload parts and node identifiers, where a node identifier indicates that the corresponding payload part was sent to the subordinate node it identifies. That is, node 31 stores one data set shared by all subordinate nodes.
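The two storage layouts described above can be contrasted with a small sketch. The class names `PerHopHistory` and `GlobalHistory`, the method names, and the use of the strings "402E" and "402F" as node identifiers are hypothetical choices for illustration.

```python
from collections import defaultdict


class PerHopHistory:
    """One data set per subordinate node."""

    def __init__(self):
        self._sets = defaultdict(list)

    def record(self, next_hop, payload: bytes) -> None:
        self._sets[next_hop].append(payload)

    def payloads_for(self, next_hop):
        return list(self._sets.get(next_hop, []))


class GlobalHistory:
    """One shared data set that tags each payload with a node identifier."""

    def __init__(self):
        self._entries = []  # list of (node identifier, payload)

    def record(self, next_hop, payload: bytes) -> None:
        self._entries.append((next_hop, payload))

    def payloads_for(self, next_hop):
        return [p for hop, p in self._entries if hop == next_hop]


for history in (PerHopHistory(), GlobalHistory()):
    history.record("402E", b"11332255")
    history.record("402F", b"99887766")
    print(history.payloads_for("402E"))  # [b'11332255']
```

Both layouts answer the same query, "what has already been sent to this next hop"; the per-hop layout avoids filtering at lookup time, while the global layout keeps a single structure to manage.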
Possible implementation C2: node 31 stores a sampling tag set, which includes the hash values of historical data blocks. A historical data block is a data block obtained by sampling a preset position in the data part of a historical message that node 31 has sent to node 33. To determine whether the historical messages sent to node 33 include the first original message, node 31 samples the preset position in the data part of message 31 to obtain a sampled data block and computes the hash value of the sampled data block. If the sampling tag set includes that hash value, node 31 determines that the first original message exists among the historical messages sent to node 33. If the sampling tag set does not include that hash value, node 31 determines that the first original message does not exist. Optionally, after determining that the first original message does not exist, node 31 may add the hash value of the sampled data block to the sampling tag set to obtain an updated sampling tag set. Node 31 can use the updated sampling tag set to deduplicate subsequently obtained non-deduplicated messages whose sender is the SFU server. For this implementation, refer to possible implementation A2 in step 602.
Optionally, if node 31 has multiple subordinate nodes, node 31 may store multiple node-level sampling tag sets, each corresponding to one subordinate node. Each node-level sampling tag set stores the hash values of the data blocks sampled at the preset position in the data part of the historical messages that node 31 has sent to the corresponding subordinate node. That is, node 31 stores a separate sampling tag set for each subordinate node. Alternatively, node 31 may store a single global sampling tag set that records the correspondence between hash values and node identifiers, where a node identifier indicates that the corresponding hash value comes from a historical message sent to the subordinate node it identifies. That is, node 31 stores one sampling tag set shared by all subordinate nodes.
Optionally, the sampling tag set stored in node 31 also includes the historical data blocks indicated by the hash values. That is, the sampling tag set may record the correspondence between each historical data block and its hash value. In this case, when the sampling tag set includes the hash value of the sampled data block, node 31 determines whether the first original message exists as follows: node 31 compares the content of the sampled data block with the historical data block indicated by that hash value. If the two have the same content, node 31 determines that the first original message exists among the historical messages.
Two data blocks with the same hash value may still have different content. By storing the correspondence between historical data blocks and their hash values in the sampling tag set, node 31 can, after finding that the hash value of the sampled data block equals that of a historical data block, further compare the content of the two blocks. This yields an exact match and improves the accuracy of message deduplication.
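The hash lookup followed by content verification can be sketched as follows. The sampling position (offset 0, length 4), SHA-256 as the hash function, and the names `SampleTagSet` and `match_or_learn` are assumptions introduced for this illustration; the source only states that the position is preset.

```python
import hashlib

# Assumed preset sampling position: 4 bytes starting at offset 0 of the data part.
SAMPLE_START, SAMPLE_LEN = 0, 4


class SampleTagSet:
    """Stores hash -> historical data block so hits can be verified by content."""

    def __init__(self):
        self._blocks = {}

    def match_or_learn(self, data_part: bytes):
        """Return the hash if the sampled block is a verified duplicate.

        Otherwise record the sample (when its hash is new) and return None.
        """
        sample = data_part[SAMPLE_START:SAMPLE_START + SAMPLE_LEN]
        digest = hashlib.sha256(sample).hexdigest()
        stored = self._blocks.get(digest)
        if stored is not None and stored == sample:
            return digest  # same hash and same content: exact match
        if stored is None:
            self._blocks[digest] = sample  # first sighting: update the tag set
        return None  # unseen block, or a hash collision with different content


tags = SampleTagSet()
print(tags.match_or_learn(b"ABCDtail1"))               # None (first sighting)
print(tags.match_or_learn(b"ABCDtail2") is not None)   # True (verified duplicate)
```

The byte comparison after the hash hit is what guards against collisions; it costs one extra comparison of a short block per hit.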
Step 1104: If the first original message exists among the historical messages that node 31 has sent to node 33, node 31 deduplicates the payload part of message 31 to obtain message 32. Message 32 does not include the first duplicate content, and it carries a deduplication mark and first indication information for the first duplicate content. The deduplication mark indicates that message 32 is a deduplicated message.
Optionally, the first duplicate content includes one or more duplicate data blocks, and the first indication information includes one or more indications in one-to-one correspondence with those duplicate data blocks. Each indication indicates the hash value of the corresponding duplicate data block.
In combination with possible implementation C1 in step 1103, the one or more duplicate data blocks in the first duplicate content are located in the data part and/or the protocol part of the payload part of message 31. That the first original message exists among the historical messages sent by node 31 to node 33 means that the first data set contains a target payload part that shares duplicate data blocks with the payload part of message 31. Node 31 deduplicates the payload part of message 31 as follows: for each duplicate data block shared by the payload part of message 31 and the target payload part, node 31 computes the hash value of the duplicate data block, removes the duplicate data block from the payload part of message 31, and adds to the payload part of message 31 an indication corresponding to the duplicate data block. The indication gives the hash value of the duplicate data block and its position in the payload part of message 31. For this process, refer to how node 11 deduplicates the payload part of message 11 under possible implementation A1 in step 602.
In this embodiment of the present application, the intermediate node matches the content of the payload part of an obtained non-deduplicated message against the payload parts of stored historical messages. If the two payload parts share a duplicate data block, the intermediate node computes the hash value of the duplicate data block, removes the block from the non-deduplicated message to obtain a deduplicated message, and carries in the deduplicated message an indication of the block's hash value and position. This achieves data deduplication of the non-deduplicated message.
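The C1-style deduplication can be illustrated with the sketch below. The matching strategy is an assumption: the source defers content matching to step 602, so this example simply compares fixed-size blocks at the same offsets of the two payloads. The block size of 2 bytes, SHA-256 as the hash function, and the function name `deduplicate` are hypothetical.

```python
import hashlib

BLOCK = 2  # assumed block size in bytes


def deduplicate(payload: bytes, target: bytes):
    """Remove blocks of payload that equal the same-offset block of target.

    Returns (residual, indications); each indication is (hash, start, end)
    with 1-based inclusive positions in the original payload.
    """
    residual = bytearray()
    indications = []
    for off in range(0, len(payload), BLOCK):
        chunk = payload[off:off + BLOCK]
        if len(chunk) == BLOCK and target[off:off + BLOCK] == chunk:
            digest = hashlib.sha256(chunk).hexdigest()
            indications.append((digest, off + 1, off + BLOCK))
        else:
            residual += chunk  # keep content that is not duplicated
    return bytes(residual), indications


residual, indications = deduplicate(b"44336655", b"11332255")
print(residual)                                # b'4466'
print([(s, e) for _, s, e in indications])     # [(3, 4), (7, 8)]
```

With the values from the earlier example, the payload "44336655" shrinks to "4466" plus two indications, which is the form the bottom node later restores.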
In combination with possible implementation C2 in step 1103, the one or more duplicate data blocks in the first duplicate content are located in the data part of the payload part of message 31. Node 31 deduplicates the payload part of message 31 as follows: node 31 treats each sampled data block whose hash value belongs to the sampling tag set as a duplicate data block, removes that block from the data part of message 31, and adds to the payload part of message 31 an indication corresponding to the block. The indication gives the hash value of the duplicate data block. For this process, refer to how node 11 deduplicates the payload part of message 11 under possible implementation A2 in step 602.
In this embodiment of the present application, the intermediate node computes the hash value of the sampled data block at the preset position in the data part of an obtained non-deduplicated message and compares it with the hash values of the stored historical data blocks. If the hash value of a sampled data block equals a hash value stored by the intermediate node, the intermediate node removes that sampled data block from the non-deduplicated message to obtain a deduplicated message and carries the block's hash value in the deduplicated message. This achieves data deduplication of the non-deduplicated message. Compared with possible implementation C1, the intermediate node does not need to match the content of the payload parts of the non-deduplicated message and the historical messages, which improves deduplication efficiency.
In combination with possible implementation C2 in step 1103, node 31 may also store the protocol parts of the historical messages that node 31 has sent to node 33. The first duplicate content shared by the payload part of message 31 and the payload part of the first original message further includes protocol information located in the protocol part of message 31. Accordingly, the first indication information further includes a difference indication, which indicates the difference between the protocol part of message 31 and the protocol part of the first original message. For how node 31 deduplicates the protocol part of message 31, refer to how node 11 deduplicates the protocol part of message 11 under possible implementation A2 in step 602.
In this embodiment of the present application, in addition to deduplicating the data block at the preset position in the data part of a message, the intermediate node can also deduplicate the protocol part of the message. Carrying a difference indication in the payload part in place of the protocol part further reduces the amount of data transmitted and thus the network bandwidth overhead.
Step 1105: Node 31 sends message 32 to node 33.
In this embodiment of the present application, after receiving a non-deduplicated message whose sender is the SFU server, the intermediate node determines whether it has previously sent to the next hop of that message a historical message whose payload part shares duplicate content with the payload part of the non-deduplicated message. If it has, the intermediate node deduplicates the non-deduplicated message and sends the resulting deduplicated message to the subordinate node. Because a deduplicated message contains less data than the non-deduplicated message, this reduces the amount of data transmitted and thus the network bandwidth overhead. When sending a deduplicated message to a subordinate node, the intermediate node only needs to ensure that it has already sent to that subordinate node a historical message whose payload part carries the content removed from the original message. This guarantees that some node on the subsequent transmission path can recover the deduplicated message, so that the user ultimately receives the original message carrying the complete data content and the user's service is safeguarded.
Step 1106: If the first original message does not exist among the historical messages that node 31 has sent to node 33, node 31 sends message 31 to node 33.
Step 1107: If message 31 is a deduplicated message, message 31 carries second indication information for second duplicate content. Node 31 determines whether a second original message exists among the historical messages sent to node 33. The payload part of the second original message includes the second duplicate content, and node 33 is the next hop of message 31 at node 31.
Optionally, the second duplicate content includes one or more duplicate data blocks, and the second indication information for the second duplicate content includes one or more indications in one-to-one correspondence with those duplicate data blocks. Each indication indicates the hash value of the corresponding duplicate data block.
Possible implementation D1: Message 31 is obtained by a node located before node 31 on the transmission path, which deduplicated the original message using possible implementation A1 in step 602 above. Each indication in the second indication information indicates the hash value of the corresponding duplicate data block and the position of that block in the payload part of the original message corresponding to message 31. Node 31 stores a first data set, which includes the payload parts of the historical messages sent by node 31 to node 33. Node 31 determines whether the second original message exists among the historical messages sent to node 33 as follows. For any indication in the second indication information, node 31 obtains, according to the position indicated by the indication, the to-be-matched data block at that position in a payload part in the first data set. Node 31 calculates the hash value of the to-be-matched data block. Node 31 determines a to-be-matched data block whose hash value is consistent with the hash value indicated by the indication to be the duplicate data block corresponding to the indication.
If the first data set contains a payload part that includes the duplicate data blocks corresponding to all of the indications in the second indication information, node 31 determines that the second original message exists among the historical messages sent to node 33. The second original message is the historical message whose payload part includes the duplicate data blocks corresponding to all of those indications. If the first data set contains no such payload part, node 31 determines that the second original message does not exist among the historical messages sent to node 33.
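The position-and-hash check of possible implementation D1 can be illustrated with a minimal sketch. It is not the claimed implementation: the hash function (SHA-256), the `(offset, length, hash)` layout of an indication, and the names `block_hash` and `find_second_original` are all assumptions made for illustration.

```python
import hashlib

def block_hash(block: bytes) -> str:
    # SHA-256 is an assumption; the source does not name a hash function.
    return hashlib.sha256(block).hexdigest()

def find_second_original(indications, first_data_set):
    """Return a stored payload part that contains every indicated block, else None.

    indications: list of (offset, length, hash) tuples (hypothetical layout).
    first_data_set: payload parts of historical messages sent to the next hop.
    """
    for payload in first_data_set:
        # Hash the to-be-matched block at each indicated position and compare.
        if all(
            block_hash(payload[off:off + length]) == digest
            for off, length, digest in indications
        ):
            return payload
    return None

history = [b"HEADER-AAAA-BBBB", b"HEADER-CCCC-DDDD"]
inds = [(7, 4, block_hash(b"CCCC")), (12, 4, block_hash(b"DDDD"))]
match = find_second_original(inds, history)
```

Here `match` is the second stored payload part, so the node would conclude that the second original message exists and could forward the deduplicated message unchanged.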
Possible implementation D2: Message 31 is obtained by a node located before node 31 on the transmission path, which deduplicated the original message using possible implementation A2 in step 602 above. Each indication in the second indication information indicates the hash value of the corresponding duplicate data block. Node 31 stores a sampling label set, which includes the hash values of historical data blocks. A historical data block is a data block obtained by sampling a preset position of the data part of a historical message sent by node 31 to node 33. Node 31 determines whether the second original message exists among the historical messages sent to node 33 as follows. If the sampling label set includes the hash values indicated by all of the indications in the second indication information, node 31 determines that the second original message exists among the historical messages sent to node 33. The second original message is the historical message whose payload part includes all of the historical data blocks corresponding to those hash values. If the sampling label set lacks the hash value indicated by any one of the indications in the second indication information, node 31 determines that the second original message does not exist among the historical messages sent to node 33.
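A minimal sketch of the sampling label set used in possible implementation D2 follows. The sampling offset and length, the SHA-256 hash, and the class name `SampleLabelSet` are assumptions for illustration; the source only states that blocks are sampled at a preset position of the data part.

```python
import hashlib

SAMPLE_OFFSET, SAMPLE_LEN = 0, 8  # hypothetical preset sampling position

def block_hash(block: bytes) -> str:
    # SHA-256 is an assumption; the source does not name a hash function.
    return hashlib.sha256(block).hexdigest()

class SampleLabelSet:
    """Hashes of blocks sampled from data parts already sent to one next hop."""

    def __init__(self):
        self._labels = set()

    def record_sent(self, data_part: bytes) -> None:
        sample = data_part[SAMPLE_OFFSET:SAMPLE_OFFSET + SAMPLE_LEN]
        self._labels.add(block_hash(sample))

    def has_all(self, indicated_hashes) -> bool:
        # The second original message is deemed to exist only if every
        # indicated hash value is present in the label set.
        return all(h in self._labels for h in indicated_hashes)

labels = SampleLabelSet()
labels.record_sent(b"frame-0001-rest")
known = block_hash(b"frame-00")
```

With this state, `labels.has_all([known])` is true, while adding any hash that was never recorded makes the check fail.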
Step 1108: If the second original message exists among the historical messages sent by node 31 to node 33, node 31 sends message 31 to node 33.
In this embodiment of the present application, after receiving a deduplicated message whose sender is the SFU server, an intermediate node can determine whether it has previously sent, to the next hop of the deduplicated message, a historical message whose payload part carries the content that was removed from the original message to produce the deduplicated message. If it has, the intermediate node can forward the deduplicated message directly to the lower-level node.
Step 1109: If the second original message does not exist among the historical messages sent by node 31 to node 33, node 31 obtains the second duplicate content from a second data set according to the second indication information. The second data set includes at least part of the payload parts of the historical messages that node 31 has received from node 32.
In combination with possible implementation D1 in step 1107 above, each indication in the second indication information indicates the hash value of the corresponding duplicate data block and the position of that block in the payload part of the original message corresponding to message 31. The second data set includes the payload parts of the historical messages that node 31 has received from node 32. Node 31 obtains the second duplicate content from the second data set according to the second indication information as follows. For each indication in the second indication information, node 31 obtains, according to the position indicated by the indication, the to-be-matched data block at that position in a payload part in the second data set. Node 31 calculates the hash value of the to-be-matched data block. Node 31 determines a to-be-matched data block whose hash value is consistent with the hash value indicated by the indication to be the duplicate data block corresponding to the indication. For this process, refer to the process by which node 21 obtains the duplicate content from the data set according to the indication information for the duplicate content under possible implementation B1 in step 1003 above.
In combination with possible implementation D2 in step 1107 above, the one or more duplicate data blocks in the second duplicate content are located in the data part. The second data set includes the correspondence between the hash values of historical data blocks and the historical data blocks, where a historical data block is a data block obtained by sampling a preset position of the data part of a historical message that node 31 has received from node 32. Node 31 obtains the second duplicate content from the second data set according to the second indication information as follows: node 31 determines the historical data block in the second data set that corresponds to the hash value indicated by an indication in the second indication information to be the duplicate data block corresponding to that indication. For this process, refer to the process by which node 21 obtains the duplicate content from the data set according to the indication information for the duplicate content under possible implementation B2 in step 1003 above.
In combination with possible implementation D2 in step 1107 above, the second data set may further include the protocol part of the message to which each historical data block belongs. The second duplicate content may further include protocol information located in the protocol part. Accordingly, the second indication information further includes a difference indication, which indicates the difference between the protocol part of the original message corresponding to message 31 and the protocol part of a target message. The target message is the historical message, among those that node 31 has received from node 32, whose data part shares the one or more duplicate data blocks with the data part of the original message corresponding to message 31. In this case, the process by which node 31 obtains the second duplicate content from the second data set according to the second indication information further includes: node 31 obtains, from the second data set, the protocol part of the target message to which the one or more duplicate data blocks belong.
Step 1110: Node 31 performs deduplication recovery on the payload part of message 31 according to the second duplicate content to obtain message 33. The payload part of message 33 includes the second duplicate content.
Here, message 31 is a deduplicated message, and message 33 is the original message corresponding to message 31.
In combination with possible implementation D1 in step 1109 above, node 31 performs deduplication recovery on the payload part of message 31 according to the second duplicate content as follows: for each indication in the second indication information, node 31 adds the duplicate data block corresponding to the indication at the position indicated by the indication in the payload part of message 31. For this implementation, refer to the way node 21 performs deduplication recovery on the payload part of message 21 according to the duplicate content under possible implementation B1 in step 1004 above.
In this embodiment of the present application, the intermediate node uses the position, carried in an indication of the deduplicated message, of a duplicate data block in the payload part of the original message. For each of the stored payload parts, it calculates the hash value of the data block at that position, in order to find a stored data block whose hash value is consistent with the hash value indicated by the indication. It then adds that stored data block at the indicated position, thereby restoring the data of the deduplicated message.
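The recovery step of possible implementation D1 can be sketched as follows. The sketch assumes that deduplication simply removed the duplicate blocks from the payload part (no placeholder bytes were left behind) and that each indicated position is an offset in the original payload part; the function name `restore_payload` is hypothetical.

```python
def restore_payload(dedup_payload: bytes, blocks) -> bytes:
    """Re-insert duplicate blocks at their original offsets.

    blocks: list of (original_offset, block_bytes) pairs, already resolved
    from the stored payload parts by position and hash value.
    """
    out = bytearray(dedup_payload)
    # Insert in ascending offset order so that each earlier insertion shifts
    # the remaining bytes into the positions they had in the original payload.
    for offset, block in sorted(blocks, key=lambda item: item[0]):
        out[offset:offset] = block
    return bytes(out)

restored = restore_payload(b"HEADER--", [(12, b"DDDD"), (7, b"CCCC")])
```

In this example `restored` equals `b"HEADER-CCCC-DDDD"`, the payload part from which the two blocks were removed.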
In combination with possible implementation D2 in step 1109 above, if an indication in the second indication information indicates the position of the corresponding duplicate data block in the data part of the original message corresponding to message 31, node 31 performs deduplication recovery on the payload part of message 31 according to the second duplicate content as follows: for each indication in the second indication information, node 31 adds the duplicate data block corresponding to the indication at the position indicated by the indication in the data part of message 31. If an indication in the second indication information does not indicate the position of the duplicate data block in the data part of the original message corresponding to message 31, node 31 adds the obtained duplicate data block at the default deduplication position of the data part of message 31, which is the preset sampling position of the data part described above. For this implementation, refer to the way node 21 performs deduplication recovery on the payload part of message 21 according to the duplicate content under possible implementation B2 in step 1004 above.
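The lookup and recovery of possible implementation D2 can be sketched as below. This is an illustration under stated assumptions: the second data set is modeled as a dictionary from hash value to historical data block, an indication is a `(hash, offset_or_None)` pair, the default deduplication position is offset 0, and SHA-256 stands in for the unspecified hash function.

```python
import hashlib

DEFAULT_OFFSET = 0  # hypothetical preset sampling (default deduplication) position

def block_hash(block: bytes) -> str:
    # SHA-256 is an assumption; the source does not name a hash function.
    return hashlib.sha256(block).hexdigest()

def restore_data_part(dedup_data: bytes, indications, second_data_set):
    """Restore the data part, or return None if an indicated hash is unknown.

    indications: list of (hash, offset_or_None) pairs (hypothetical layout).
    second_data_set: dict mapping hash value -> historical data block.
    """
    resolved = []
    for digest, offset in indications:
        block = second_data_set.get(digest)
        if block is None:
            return None
        # Fall back to the default position when no position is indicated.
        resolved.append((DEFAULT_OFFSET if offset is None else offset, block))
    out = bytearray(dedup_data)
    for offset, block in sorted(resolved, key=lambda item: item[0]):
        out[offset:offset] = block
    return bytes(out)

store = {block_hash(b"KEYFRAME"): b"KEYFRAME"}
restored = restore_data_part(b"-tail", [(block_hash(b"KEYFRAME"), None)], store)
```

Here the indication carries no position, so the block is added at the default position and `restored` is `b"KEYFRAME-tail"`.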
Optionally, if the second indication information includes a difference indication for the protocol part, the process by which node 31 performs deduplication recovery on the payload part of message 31 according to the second duplicate content further includes: node 31 modifies the protocol part of the target message according to the difference indication, and uses the modified protocol part as the protocol part of message 33.
In this embodiment of the present application, the intermediate node can look up the hash value carried in the deduplicated message among the hash values of the stored historical data blocks, and add the historical data block corresponding to the matching hash value to the data part of the deduplicated message, thereby restoring the data part. In addition, the intermediate node can obtain the protocol part of the historical message to which that historical data block belongs and, using the difference indication for the protocol part carried in the deduplicated message, reconstruct the protocol part of the original message corresponding to the deduplicated message, thereby restoring the protocol part.
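Restoring the protocol part from a difference indication can be sketched as follows. The source does not define the encoding of the difference indication, so the sketch assumes a list of `(offset, replacement_bytes)` patches of unchanged length; the function name and the four-byte header are purely illustrative.

```python
def apply_protocol_diff(target_protocol: bytes, diff) -> bytes:
    """Patch the target message's protocol part to obtain the original one.

    diff: list of (offset, replacement_bytes) pairs (hypothetical encoding).
    """
    out = bytearray(target_protocol)
    for offset, new_bytes in diff:
        # Overwrite in place; the header length stays the same.
        out[offset:offset + len(new_bytes)] = new_bytes
    return bytes(out)

# Illustrative header in which only a two-byte counter differs.
target_header = bytes([0x80, 0x60, 0x00, 0x01])
original_header = apply_protocol_diff(target_header, [(2, b"\x00\x02")])
```

Only the bytes named in the difference indication change, which is why the deduplicated message needs to carry the difference rather than the whole protocol part.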
In step 1110 above, when node 31 restores the data of message 31, it may also delete the deduplication mark and the second indication information from message 31, so as to recover the original message corresponding to message 31.
Step 1111: Node 31 sends message 33 to node 33.
Optionally, node 33 is a network device on the transmission path of message 33.
Network instability may cause messages to arrive at an intermediate node out of order. For example, the intermediate node may receive a deduplicated message first, and only afterwards receive the non-deduplicated message that carries the duplicate content indicated in that deduplicated message, that is, the content shared with the original message corresponding to the deduplicated message. In this case, the intermediate node needs to perform order-preserving processing on the out-of-order messages. After receiving the deduplicated message, the intermediate node first buffers it. Once the non-deduplicated message carrying the indicated duplicate content arrives, the intermediate node decides whether to forward the deduplicated message directly or to restore it based on the payload part of the non-deduplicated message.
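The order-preserving behavior described above can be sketched with a small buffer. This is a simplified illustration: each deduplicated message is assumed to depend on a single block hash, messages are represented only by identifiers, and the class name `ReorderBuffer` is hypothetical.

```python
from collections import defaultdict

class ReorderBuffer:
    """Holds deduplicated messages until their duplicate content has arrived."""

    def __init__(self):
        self.known = set()                # hashes seen in non-deduplicated messages
        self.waiting = defaultdict(list)  # missing hash -> buffered message ids

    def on_dedup(self, msg_id, needed_hash):
        """Return the messages that can be processed now (possibly none)."""
        if needed_hash in self.known:
            return [msg_id]
        self.waiting[needed_hash].append(msg_id)
        return []

    def on_non_dedup(self, block_hashes):
        """Record new content and release any messages that were waiting for it."""
        released = []
        for h in block_hashes:
            self.known.add(h)
            released.extend(self.waiting.pop(h, []))
        return released

buf = ReorderBuffer()
early = buf.on_dedup("m2", "h1")      # arrives before its duplicate content
released = buf.on_non_dedup(["h1"])   # content arrives; "m2" can now be handled
```

Each released message would then go through the forward-or-restore decision of steps 1107 to 1111.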
In this embodiment of the present application, an intermediate node can deduplicate a non-deduplicated message whose sender is the SFU server and send the resulting deduplicated message to the lower-level node. Because a deduplicated message carries less data than the non-deduplicated message, the amount of transmitted data, and therefore the network bandwidth overhead, is reduced. In addition, the intermediate node can restore a deduplicated message whose sender is the SFU server, or forward the deduplicated message directly to the lower-level node. Before sending a deduplicated message to a lower-level node, the intermediate node only needs to ensure that it has already sent that lower-level node a historical message whose payload part carries the content removed from the original message. This guarantees that some node on the subsequent transmission path can restore the deduplicated message, so that the user ultimately receives the original message with its complete data content and the user service is preserved.
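The forwarding decision of the intermediate node in steps 1105 to 1111 can be summarized in a short sketch. The function name and the returned action labels are hypothetical; the two inputs correspond to whether the received message is deduplicated and whether a historical message carrying the relevant content has already been sent to the next hop.

```python
def next_action(is_deduplicated: bool, next_hop_has_content: bool) -> str:
    """Choose what an intermediate node does with a message from the SFU server."""
    if not is_deduplicated:
        # Steps 1105 and 1106: deduplicate only if the next hop can restore it.
        return "deduplicate_then_send" if next_hop_has_content else "forward_unchanged"
    # Steps 1108 to 1111: forward as is, or restore before sending.
    return "forward_unchanged" if next_hop_has_content else "restore_then_send"

actions = {
    (dedup, has): next_action(dedup, has)
    for dedup in (False, True)
    for has in (False, True)
}
```

In every branch, a deduplicated message leaves the node only when the next hop already holds the removed content.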
Optionally, this embodiment of the present application further provides a traffic grouping scheme for the SFU communication architecture. Traffic grouping lets a node determine which flows form a group, so that data deduplication and/or data recovery is performed only among the messages of flows in the same group, which improves message processing efficiency. In the traffic grouping scheme for the SFU communication architecture, group identification can be performed by the bottom node and the intermediate nodes; that is, they actively identify flows that can be deduplicated and group those flows. After grouping is complete, the bottom node and the intermediate nodes each send their grouping results to their upper-level nodes, and an upper-level node can deduplicate or restore messages according to the grouping results. The top node does not need to perform group identification. The following describes how the top node, the bottom node, and the intermediate nodes each apply the traffic grouping scheme during data transmission.
Consider the top node, for example node 11 in method 600 shown in FIG. 6. Node 11 may store one or more flow grouping sets corresponding to node 12. Each flow grouping set includes the flow identifiers of multiple flows that pass through node 12. One flow grouping set indicates one group, and the flows indicated by the flow identifiers in a flow grouping set belong to the same group. The flow grouping sets stored in node 11 are those corresponding to its lower-level nodes; in this embodiment of the present application, a stored flow grouping set corresponding to a lower-level node may be referred to as a lower-level flow grouping set. A flow identifier may be represented by one or more items of the flow's five-tuple information, which includes the source IP address, destination IP address, source port, destination port, and transport-layer protocol.
When node 11 stores one or more flow grouping sets corresponding to node 12, after obtaining message 11, node 11 may first determine whether any flow grouping set corresponding to node 12 includes the flow identifier of the flow to which message 11 belongs. If the flow grouping sets corresponding to node 12 include a target flow grouping set that contains that flow identifier, node 11 performs steps 602 and 603 above. If none of the flow grouping sets corresponding to node 12 includes that flow identifier, node 11 sends message 11 to node 12.
When node 11 determines whether a flow grouping set corresponding to node 12 includes the flow identifier of the flow to which message 11 belongs, it is in effect determining whether that flow has been added to a group corresponding to node 12. Optionally, node 11 may search the flow grouping sets corresponding to node 12 using the five-tuple information of message 11. For a message in a flow that has been added to a group, node 11 goes on to determine whether to deduplicate the message; for a message in a flow that has not been added to any group, node 11 forwards the message directly.
Optionally, node 11 receives grouping information sent by node 12. The grouping information includes the correspondence between the node identifier of node 12 and the one or more flow grouping sets corresponding to node 12. If node 11 has multiple lower-level nodes, node 11 stores the correspondence between the node identifiers of the lower-level nodes and the lower-level flow grouping sets. If node 11 has only one lower-level node, node 11 may store only the lower-level flow grouping sets. A node identifier may be any information that uniquely identifies the node in the communication network, such as the node's IP address, Media Access Control (MAC) address, or hardware address.
Optionally, the grouping information sent by node 12 to node 11 may further include one or more group identifiers in one-to-one correspondence with the one or more flow grouping sets in the grouping information. Each group identifier uniquely identifies one flow grouping set corresponding to node 12. For example, a group identifier may be a number that node 12 assigns to a flow grouping set locally and that uniquely identifies that set. Optionally, in a video conference scenario, the grouping information may further include a conference identifier such as a conference number. In a live streaming scenario, the grouping information may further include a live stream identifier such as a live room number.
Optionally, the grouping information is carried in a join message. That is, node 12 can send the grouping information to node 11 in a join message. After receiving the join message, node 11 stores the grouping information carried in it in a grouping table. Node 11 may then send a join response message to node 12 so that node 12 can confirm that grouping succeeded. Further, after receiving the join response message, node 12 may send a join response acknowledgment message to node 11.
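The handling of a join message at the upper-level node can be sketched as follows. The source does not specify message formats, so the `JoinMessage` fields, the dictionary used as a join response, and the class name `UpperNode` are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class JoinMessage:
    node_id: str  # identifier of the lower-level node, e.g. its IP or MAC address
    groups: dict  # group identifier -> list of flow identifiers (hypothetical layout)

class UpperNode:
    def __init__(self):
        # Lower-level grouping table: flow identifier -> (node identifier, group identifier)
        self.group_table = {}

    def on_join(self, join: JoinMessage) -> dict:
        """Store the grouping information and build a join response."""
        for group_id, flow_ids in join.groups.items():
            for flow_id in flow_ids:
                self.group_table[flow_id] = (join.node_id, group_id)
        return {"type": "join_response", "to": join.node_id}

node11 = UpperNode()
reply = node11.on_join(JoinMessage("node12", {"G1": ["flow1", "flow2"]}))
```

The lower-level node would answer `reply` with a join response acknowledgment, which completes the exchange described above.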
It is worth noting that all grouping information in the grouping table stored in the top node comes from lower-level nodes, so this grouping table may be called a lower-level grouping table. For example, the grouping table stored in the top node may be as shown in Table 1.
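Table 1

Key (flow identifier)    Value (node identifier)    Value (group identifier)
Flow identifier 1        Node identifier A          Group identifier A1
Flow identifier 2        Node identifier A          Group identifier A1
Flow identifier 3        Node identifier A          Group identifier A2
Flow identifier 4        Node identifier A          Group identifier A2
Flow identifier 5        Node identifier B          Group identifier B1
Flow identifier 6        Node identifier B          Group identifier B1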
Referring to Table 1, the grouping table uses the flow identifier as the key and the node identifier and group identifier as the value. Node identifier A identifies lower-level node A of the top node, and node identifier B identifies another lower-level node B of the top node. Group identifier A1 identifies one flow grouping set of lower-level node A, and group identifier A2 identifies another flow grouping set of lower-level node A. Group identifier B1 identifies the flow grouping set of lower-level node B. According to Table 1, flow identifiers 1 and 2 belong to one flow grouping set corresponding to lower-level node A, flow identifiers 3 and 4 belong to another flow grouping set corresponding to lower-level node A, and flow identifiers 5 and 6 belong to the flow grouping set corresponding to lower-level node B. After receiving a message, the top node can search the key column using the flow identifier of the flow to which the message belongs. If the key column does not contain that flow identifier, the top node forwards the message directly. If the key column contains that flow identifier, the top node goes on to determine whether to deduplicate the message.
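The lookup described for Table 1 can be sketched as a dictionary keyed by flow identifier. The entries mirror the relationships stated in the text; the string labels and the function name `classify` are illustrative only.

```python
# Lower-level grouping table: flow identifier -> (node identifier, group identifier)
GROUP_TABLE = {
    "flow 1": ("node A", "group A1"),
    "flow 2": ("node A", "group A1"),
    "flow 3": ("node A", "group A2"),
    "flow 4": ("node A", "group A2"),
    "flow 5": ("node B", "group B1"),
    "flow 6": ("node B", "group B1"),
}

def classify(flow_id: str):
    """Decide whether a message is forwarded directly or considered for deduplication."""
    entry = GROUP_TABLE.get(flow_id)
    if entry is None:
        return ("forward", None)   # flow has not joined any group
    return ("check_dedup", entry)  # continue with the duplicate-content check

grouped = classify("flow 3")
ungrouped = classify("flow 9")
```

A message on a flow that is missing from the key column skips the deduplication procedure entirely.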
In this embodiment of the present application, the top node can use the flow grouping sets corresponding to a lower-level node to determine whether a message to be sent to that lower-level node needs to be deduplicated. If the flow identifier of the flow to which the message belongs is not in any flow grouping set corresponding to the lower-level node, the top node forwards the message directly to the lower-level node without running the deduplication procedure, which reduces the processing overhead of the top node.
Optionally, when the flow grouping sets corresponding to node 12 include a target flow grouping set that contains the flow identifier of the flow to which message 11 belongs, node 11 may restrict its search for a target message (a historical message sent to node 12 whose payload part has duplicate content with the payload part of message 11) to the target historical messages sent to node 12, that is, the historical messages whose flow identifiers belong to the target flow grouping set. In this way, when node 12 has multiple flow grouping sets, node 11 only needs to compare the payload part of message 11 with the payload parts of historical messages belonging to the flows indicated by one flow grouping set. This reduces the number of historical messages that node 11 must examine, which lowers its processing overhead and improves its message processing efficiency, and therefore improves message transmission efficiency.
Optionally, node 11 uses the flow identifier of the flow to which message 11 belongs as the key and performs key-value matching in a grouping table such as the one shown in Table 1, thereby determining that the flow identifier belongs to the target flow grouping set corresponding to node 12.
In combination with possible implementation A1 in step 602 above, node 11 stores a data set that includes the payload parts of the historical messages sent by node 11 to its lower-level nodes.
In a first implementation, node 11 stores one global data set shared by all of its lower-level nodes. The global data set includes the payload parts of the historical messages sent by node 11 to all lower-level nodes, and may include the correspondence among payload parts, node identifiers, and group identifiers. Here, a node identifier indicates that the historical message to which the corresponding payload part belongs was sent to the lower-level node indicated by that node identifier. A group identifier indicates that the flow identifier of the flow to which that historical message belongs is in the flow grouping set indicated by that group identifier.
For example, the global data set stored in the top node may be as shown in Table 2.
Table 2
Referring to Table 2, the global data set uses the node identifier and the group identifier as the key and the payload parts as the value. Node identifier A identifies one lower-level node A of the top node, and node identifier B identifies another lower-level node B of the top node. Group identifier A1 identifies one flow grouping set of lower-level node A, group identifier A2 identifies another flow grouping set of lower-level node A, and group identifier B1 identifies the flow grouping set of lower-level node B. Table 2 shows that payload parts 1, 2, and 3 come from historical messages that the top node sent to lower-level node A and that belong to one group; payload parts 4 and 5 come from historical messages that the top node sent to lower-level node A and that belong to another group; and payload parts 6 and 7 come from historical messages that the top node sent to lower-level node B and that belong to the same group. In this implementation, after receiving a message, the top node may determine the node identifier and group identifier corresponding to the message based on Table 1, use them as the key for key-value matching in Table 2 to obtain the corresponding payload parts, and then determine whether the payload part of the message has duplicate content with any of the obtained payload parts.
For example, the flow identifier is five-tuple information. Node 11 uses the five-tuple information of message 11 as a key, performs key-value matching in the grouping table shown in Table 1, and determines that message 11 corresponds to node identifier A and group identifier A2. Node 11 then uses node identifier A and group identifier A2 as the key, performs key-value matching in the global data set shown in Table 2, and obtains payload part 4 and payload part 5. Node 11 then matches the payload part of message 11 against payload part 4 and payload part 5 separately. If the payload part of message 11 has duplicate content with payload part 4 or payload part 5, node 11 determines that the historical messages sent to node 12 include a target message, namely the historical message containing the payload part (4 or 5) that has duplicate content with the payload part of message 11.
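The two-step lookup described above (grouping table first, then the global data set) can be sketched as follows. This is only an illustration under stated assumptions: the five-tuple values, the stored payload bytes, and the `has_duplicate_content` matcher are hypothetical placeholders, since the text does not fix how duplicate content is detected.

```python
from typing import Optional

# Hypothetical stand-in for Table 1: flow identifier (five-tuple) -> (node id, group id).
grouping_table = {
    ("10.0.0.1", 5000, "10.0.1.1", 6000, "UDP"): ("A", "A1"),
    ("10.0.0.1", 5002, "10.0.1.2", 6000, "UDP"): ("A", "A2"),
    ("10.0.0.1", 5004, "10.0.2.1", 6000, "UDP"): ("B", "B1"),
}

# Hypothetical stand-in for Table 2: (node id, group id) -> stored payload parts.
global_data_set = {
    ("A", "A1"): [b"AAAA1111", b"BBBB2222", b"CCCC3333"],
    ("A", "A2"): [b"DDDD4444", b"EEEE5555"],
    ("B", "B1"): [b"FFFF6666", b"GGGG7777"],
}

def has_duplicate_content(a: bytes, b: bytes, min_len: int = 4) -> bool:
    """Placeholder matcher (assumption): a and b share a run of at least min_len bytes."""
    return any(a[i:i + min_len] in b for i in range(len(a) - min_len + 1))

def find_target_payload(flow_id, payload: bytes) -> Optional[bytes]:
    """Return the stored payload part that overlaps with payload, if any."""
    key = grouping_table.get(flow_id)          # step 1: flow id -> (node id, group id)
    if key is None:
        return None                            # flow is not in any flow grouping set
    for stored in global_data_set.get(key, []):  # step 2: only this group's payloads
        if has_duplicate_content(payload, stored):
            return stored
    return None
```

Because the second lookup is keyed by both identifiers, payload parts stored for group A1 or for node B are never compared against a message that maps to group A2.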
In a second implementation, node 11 stores a separate node-level data set for each lower-level node. Each node-level data set includes the payload parts of the historical messages sent by node 11 to one lower-level node, and may include a correspondence between payload parts and group identifiers.
For example, continuing the example shown in Table 2, the top node may store one node-level data set corresponding to lower-level node A and another corresponding to lower-level node B. The node-level data set corresponding to lower-level node A may be as shown in Table 3, and the node-level data set corresponding to lower-level node B may be as shown in Table 4.
Table 3
Table 4
In this implementation, after receiving a message, the top node may determine the node identifier and group identifier corresponding to the message based on Table 1, locate the node-level data set corresponding to that node identifier, and then use the group identifier as the key for key-value matching in the node-level data set (for example, the one shown in Table 3 or Table 4) to obtain the corresponding payload parts. The top node then determines whether the payload part of the message has duplicate content with any of the obtained payload parts.
In a third implementation, node 11 stores a separate group-level data set for each flow grouping set. Each group-level data set includes the payload parts of the historical messages that node 11 sent to one lower-level node and that belong to the flows indicated by one flow grouping set. In other words, each group-level data set holds the payload parts corresponding to one flow grouping set.
In this implementation, after receiving a message, the top node may determine the node identifier and group identifier corresponding to the message based on Table 1, locate the group-level data set corresponding to that node identifier and group identifier, obtain the payload parts in that group-level data set, and then determine whether the payload part of the message has duplicate content with any of the obtained payload parts.
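The three storage layouts differ only in how the same payload parts are partitioned. The sketch below derives the node-level and group-level layouts from a global one and shows that each lookup returns the same candidates. The container shapes and function names are illustrative assumptions, not structures defined by this application.

```python
from collections import defaultdict

# Layout 1 (global): one store keyed by (node id, group id), mirroring the Table 2 example.
global_set = {
    ("A", "A1"): ["payload 1", "payload 2", "payload 3"],
    ("A", "A2"): ["payload 4", "payload 5"],
    ("B", "B1"): ["payload 6", "payload 7"],
}

def to_node_level(store):
    """Layout 2: one store per lower-level node, keyed by group id."""
    node_sets = defaultdict(dict)
    for (node_id, group_id), payloads in store.items():
        node_sets[node_id][group_id] = list(payloads)
    return dict(node_sets)

def to_group_level(store):
    """Layout 3: one independent store per flow grouping set, holding payloads only."""
    return [{"node": n, "group": g, "payloads": list(p)} for (n, g), p in store.items()]

def lookup_global(store, node_id, group_id):
    return store.get((node_id, group_id), [])

def lookup_node_level(node_sets, node_id, group_id):
    # First locate the node's store, then match on the group id inside it.
    return node_sets.get(node_id, {}).get(group_id, [])

def lookup_group_level(group_sets, node_id, group_id):
    # Locate the single store for this (node, group) pair and return all of it.
    for s in group_sets:
        if s["node"] == node_id and s["group"] == group_id:
            return s["payloads"]
    return []
```

A finer partition shortens each search at the cost of maintaining more separate stores; which trade-off is preferable is left open here.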
With reference to possible implementation A2 in step 602 above, node 11 stores a sampling label set that includes the hash values of historical data blocks. A historical data block is a data block obtained by sampling a preset position of the data part of a historical message sent by node 11 to node 12.
In a first implementation, node 11 stores one global sampling label set shared by all lower-level nodes. The global sampling label set includes the hash values of data blocks obtained by sampling the preset position of the data part of the historical messages sent by node 11 to all lower-level nodes, and may include a correspondence among hash values, node identifiers, and group identifiers. Here, a node identifier indicates that the corresponding hash value was obtained from a historical message sent to the lower-level node indicated by that node identifier, and a group identifier indicates that the flow identifier of the flow to which that historical message belongs is in the flow grouping set indicated by that group identifier.
For example, the global sampling label set stored in the top node may be as shown in Table 5.
Table 5
Referring to Table 5, the global sampling label set uses the node identifier and the group identifier as the key and the hash values as the value. Node identifier A identifies one lower-level node A of the top node, and node identifier B identifies another lower-level node B of the top node. Group identifier A1 identifies one flow grouping set of lower-level node A, group identifier A2 identifies another flow grouping set of lower-level node A, and group identifier B1 identifies the flow grouping set of lower-level node B. In this implementation, after receiving a message, the top node may determine the node identifier and group identifier corresponding to the message based on Table 1, use them as the key for key-value matching in Table 5 to obtain the corresponding hash values, and then determine whether the obtained hash values include the hash value of the sampled data block obtained by sampling the preset position of the data part of message 11.
For example, the flow identifier is five-tuple information. Node 11 uses the five-tuple information of message 11 as a key, performs key-value matching in the grouping table shown in Table 1, and determines that message 11 corresponds to node identifier A and group identifier A2. Node 11 then uses node identifier A and group identifier A2 as the key, performs key-value matching in the global sampling label set shown in Table 5, and obtains hash value 4 and hash value 5. Node 11 calculates the hash value of the sampled data block obtained by sampling the preset position of the data part of message 11. If that hash value equals hash value 4 or hash value 5, node 11 determines that the historical messages sent to node 12 include a target message.
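The sampling-based check above can be sketched as below. The sampling offset and length, the use of SHA-256, and the function names are assumptions made for illustration; the text only requires a hash of a data block sampled at a preset position.

```python
import hashlib

SAMPLE_OFFSET = 0    # assumed preset position within the data part
SAMPLE_LENGTH = 16   # assumed size of the sampled data block, in bytes

def sample_hash(data: bytes) -> str:
    """Hash the data block sampled at the preset position of the data part."""
    block = data[SAMPLE_OFFSET:SAMPLE_OFFSET + SAMPLE_LENGTH]
    return hashlib.sha256(block).hexdigest()

# Global sampling label set: (node id, group id) -> set of sampled-block hashes.
global_label_set = {}

def record_sent(node_id: str, group_id: str, data: bytes) -> None:
    """Remember the sampled-block hash of a message sent to a lower-level node."""
    global_label_set.setdefault((node_id, group_id), set()).add(sample_hash(data))

def target_exists(node_id: str, group_id: str, data: bytes) -> bool:
    """True if a historical message in the same group has the same sampled-block hash."""
    return sample_hash(data) in global_label_set.get((node_id, group_id), set())
```

Comparing fixed-size hashes avoids scanning full payload parts, at the cost of seeing only the sampled block rather than the whole data part.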
In a second implementation, node 11 stores a separate node-level sampling label set for each lower-level node. Each node-level sampling label set includes the hash values of data blocks obtained by sampling the preset position of the data part of the historical messages sent by node 11 to one lower-level node, and may include a correspondence between hash values and group identifiers.
For example, continuing the example shown in Table 5, the top node may store one node-level sampling label set corresponding to lower-level node A and another corresponding to lower-level node B. The node-level sampling label set corresponding to lower-level node A may be as shown in Table 6, and the node-level sampling label set corresponding to lower-level node B may be as shown in Table 7.
Table 6
Table 7
In this implementation, after receiving a message, the top node may determine the node identifier and group identifier corresponding to the message based on Table 1, locate the node-level sampling label set corresponding to that node identifier, and then use the group identifier as the key for key-value matching in the node-level sampling label set (for example, the one shown in Table 6 or Table 7) to obtain the corresponding hash values. The top node then determines whether the obtained hash values include the hash value of the sampled data block obtained by sampling the preset position of the data part of message 11.
In a third implementation, node 11 stores a separate group-level sampling label set for each flow grouping set. Each group-level sampling label set includes the hash values of data blocks obtained by sampling the preset position of the data part of the historical messages that node 11 sent to one lower-level node and that belong to the flows indicated by one flow grouping set. In other words, each group-level sampling label set holds the hash values corresponding to one flow grouping set.
In this implementation, after receiving a message, the top node may determine the node identifier and group identifier corresponding to the message based on Table 1, locate the group-level sampling label set corresponding to that node identifier and group identifier, obtain the hash values in that set, and then determine whether the obtained hash values include the hash value of the sampled data block obtained by sampling the preset position of the data part of message 11.
For a bottom node, for example node 21 in method 1000 shown in FIG. 10, node 21 may store one or more flow grouping sets. Each flow grouping set includes the flow identifiers of multiple flows that pass through node 21. The flow grouping sets stored in node 21 are the flow grouping sets corresponding to node 21 itself. In the embodiments of the present application, a stored flow grouping set that corresponds to the node itself may be referred to as a local flow grouping set.
Optionally, before node 21 determines, based on the deduplication mark, that message 21 is a deduplicated message (that is, before step 1002 above is performed), node 21 determines that the one or more flow grouping sets include a target flow grouping set that contains the flow identifier of the flow to which message 21 belongs. In other words, after receiving message 21, node 21 first determines whether the flow grouping sets corresponding to node 21 include the flow identifier of the flow to which message 21 belongs, and parses the payload part of message 21 only after determining that this flow identifier belongs to one of those flow grouping sets. For how node 21 makes this determination, refer to how node 11 determines whether the flow grouping sets corresponding to node 12 include the flow identifier of the flow to which message 11 belongs.
Optionally, node 21 receives message 23, and the flow identifier of the flow to which message 23 belongs does not belong to any flow grouping set corresponding to node 21. Node 21 forwards message 23.
In the embodiments of the present application, an upper-level node deduplicates only messages in the flows indicated by the flow grouping sets corresponding to its lower-level node. Therefore, after receiving a message, the bottom node may first determine whether the flow identifier of the flow to which the message belongs is in one of its own flow grouping sets. If it is, the message may be a deduplicated message that has been processed by the upper-level node, and the bottom node needs to parse the payload part of the message to determine whether it is a deduplicated message. If the flow identifier does not belong to any of the bottom node's flow grouping sets, the upper-level node will not have deduplicated the message, so the message cannot be a deduplicated message. The bottom node can therefore forward the message directly without parsing its payload part, which reduces the processing overhead of the bottom node.
Optionally, node 21 receives message 24, and the flow identifier of the flow to which message 24 belongs is in a flow grouping set corresponding to node 21. Node 21 parses the payload part of message 24 and determines that it does not carry a deduplication mark, that is, message 24 is a non-deduplicated message. Node 21 adds at least part of the content of the payload part of message 24 to the data set and forwards message 24. Message 24 can be regarded as the first packet in a group of deduplicable messages received by node 21. By storing at least part of the content of the payload part of message 24 in the data set, node 21 can later restore the data of deduplicated messages that it receives.
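The three cases a bottom node distinguishes on receipt (flow not in any local flow grouping set, deduplicated message, non-deduplicated first packet) can be sketched as a small dispatch function. The marker value `DEDUP_MARK`, its position at the start of the payload, and the returned action strings are hypothetical; how the deduplication mark is actually encoded is not assumed here.

```python
DEDUP_MARK = b"\x00DEDUP"  # hypothetical marker; the real encoding is not specified here

def handle_at_bottom_node(flow_id, payload, local_groups, data_set):
    """Decide what a bottom node does with a received message.

    local_groups: group id -> set of flow ids (local flow grouping sets)
    data_set:     group id -> list of stored payload parts
    """
    group_id = next((g for g, flows in local_groups.items() if flow_id in flows), None)
    if group_id is None:
        # Flow is in no local flow grouping set: it cannot have been
        # deduplicated upstream, so forward without parsing the payload.
        return "forward"
    if payload.startswith(DEDUP_MARK):
        # Deduplicated message: the payload must be restored before forwarding.
        return "restore"
    # Non-deduplicated message of a grouped flow: keep its payload so that
    # later deduplicated messages can be restored from it.
    data_set.setdefault(group_id, []).append(payload)
    return "store_and_forward"
```

Only the last two branches inspect the payload, which is where the saving in processing overhead comes from.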
Optionally, among received messages that belong to different flows, node 21 adds the flow identifiers of the flows whose messages have duplicate content in their payload parts to the same flow grouping set, where the senders of these different flows are the same SFU server. For example, after receiving message 23, node 21 may store at least part of the content of the payload part of message 23. When node 21 later receives a message whose sender is the same SFU server as the sender of message 23 and which belongs to a different flow from message 23, node 21 determines whether the payload part of that message and the payload part of message 23 have duplicate content. If they do, node 21 determines that the two flows can form a new group and generates a new flow grouping set that includes the flow identifier of the flow to which that message belongs and the flow identifier of the flow to which message 23 belongs.
Further, node 21 may send grouping information to node 22. The grouping information includes a correspondence between the node identifier of node 21 and the one or more flow grouping sets corresponding to node 21. When a flow grouping set corresponding to node 21 is updated, for example when a new flow identifier is added to it, node 21 may send the new flow grouping set to node 22. Alternatively, node 21 may send node 22 the group identifier of the updated flow grouping set together with the new flow identifier, so that node 22 adds the new flow identifier to the flow grouping set indicated by that group identifier, thereby keeping the flow grouping set synchronized.
Optionally, the grouping information is carried in a join message. That is, node 21 may send the grouping information to node 22 in a join message. After receiving the join message, node 22 stores the grouping information carried in it in the grouping table. Node 22 may then send a join response message to node 21 so that node 21 can confirm that the grouping succeeded. Further, after receiving the join response message, node 21 may send a join response confirmation message to node 22.
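The join exchange and the incremental update can be modeled with two small classes. This is a sketch only: the message fields (`type`, `node_id`, `groups`), the class names, and the method names are invented for illustration and do not describe an actual wire format.

```python
from dataclasses import dataclass, field

@dataclass
class UpperNode:
    # Grouping table: flow identifier -> (node identifier, group identifier).
    grouping_table: dict = field(default_factory=dict)
    confirmed: set = field(default_factory=set)

    def on_join(self, msg: dict) -> dict:
        """Store the grouping information from a join message and answer it."""
        for group_id, flow_ids in msg["groups"].items():
            for flow_id in flow_ids:
                self.grouping_table[flow_id] = (msg["node_id"], group_id)
        return {"type": "join_response", "node_id": msg["node_id"]}

    def on_update(self, node_id: str, group_id: str, new_flow_id: str) -> None:
        """Incremental update: add one new flow identifier to an existing group."""
        self.grouping_table[new_flow_id] = (node_id, group_id)

    def on_join_ack(self, msg: dict) -> None:
        self.confirmed.add(msg["node_id"])

@dataclass
class LowerNode:
    node_id: str
    groups: dict            # group id -> set of flow ids
    joined: bool = False

    def build_join(self) -> dict:
        return {"type": "join", "node_id": self.node_id,
                "groups": {g: sorted(f) for g, f in self.groups.items()}}

    def on_join_response(self, msg: dict) -> dict:
        self.joined = True  # grouping confirmed by the upper-level node
        return {"type": "join_response_ack", "node_id": self.node_id}
```

Sending only the group identifier and the new flow identifier, as `on_update` models, avoids retransmitting the whole flow grouping set on every change.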
With reference to possible implementation B1 in step 1003 above, the data set stored in node 21 includes the payload parts of the non-deduplicated messages received by node 21.
In this implementation, for a flow whose sender is an SFU server and whose flow identifier does not belong to any flow grouping set corresponding to the bottom node, the bottom node stores the payload part of a message in the flow after receiving it. If the bottom node later receives a message in another flow whose sender is the same SFU server and whose flow identifier also does not belong to any flow grouping set corresponding to the bottom node, the bottom node matches the payload parts of the messages of the two flows against each other. If the payload parts have duplicate content, the two flows match, and the bottom node generates a flow grouping set that includes the flow identifiers of the two flows.
With reference to possible implementation B2 in step 1003 above, the data set stored in node 21 includes a correspondence between each data block, obtained by sampling the preset position of the data part of a non-deduplicated message received by node 21, and the hash value of that data block.
In this implementation, for a flow whose sender is an SFU server and whose flow identifier does not belong to any flow grouping set corresponding to the bottom node, after receiving a message in the flow, the bottom node samples the preset position of the data part of the message to obtain a sampled data block, and calculates and stores the hash value of the sampled data block. If the bottom node later receives a message in another flow whose sender is the same SFU server and whose flow identifier also does not belong to any flow grouping set corresponding to the bottom node, the bottom node likewise samples the preset position of the data part of that message to obtain a sampled data block and calculates its hash value. If the hash values of the sampled data blocks obtained from the messages of the two flows are the same, the two flows match, and the bottom node generates a flow grouping set that includes the flow identifiers of the two flows. To improve the accuracy of flow grouping, in this implementation the bottom node may also store the content of the sampled data blocks. When the sampled data blocks of two messages belonging to different flows have the same hash value, the bottom node may further compare the data content of the two sampled data blocks exactly to determine whether they are identical.
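The hash-then-exact-match grouping described above might look like the following sketch. It assumes that every observed flow comes from the same SFU server and is not yet in any flow grouping set, uses SHA-256 as a stand-in hash, and forms only pairwise groups; the class name, offset, and block length are illustrative.

```python
import hashlib

class FlowGrouper:
    """Groups two flows when their sampled data blocks hash equal and match exactly."""

    def __init__(self, offset: int = 0, length: int = 16):
        self.offset, self.length = offset, length   # assumed preset sampling position
        self.samples = {}   # hash value -> (flow id, sampled data block)
        self.groups = []    # generated flow grouping sets (sets of flow ids)

    def observe(self, flow_id: str, data: bytes):
        """Process one message; return a new flow grouping set if one is formed."""
        block = data[self.offset:self.offset + self.length]
        digest = hashlib.sha256(block).digest()
        hit = self.samples.get(digest)
        if hit is None:
            # First time this hash is seen: store the hash together with the block.
            self.samples[digest] = (flow_id, block)
            return None
        other_flow, other_block = hit
        # Same hash: confirm with an exact comparison to rule out a collision.
        if other_flow == flow_id or other_block != block:
            return None
        group = {other_flow, flow_id}
        self.groups.append(group)
        return group
```

Storing the block next to its hash is what makes the exact comparison possible; with hashes alone, a collision would be indistinguishable from a true match.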
When node 21 stores one or more flow grouping sets, step 1003 above may be implemented as follows: based on the indication information of the duplicate content, node 21 obtains the duplicate content from the payload content of a target historical message in the data set, where the flow identifier of the flow to which the target historical message belongs is in the target flow grouping set. In this way, when node 21 corresponds to multiple flow grouping sets, node 21 only needs to search the payload parts of the historical messages of the flows indicated by one flow grouping set to obtain the duplicate content. This reduces the number of historical messages that node 21 needs to search, which reduces the processing overhead of node 21, improves its message processing efficiency, and can thereby improve message transmission efficiency.
For an intermediate node, for example node 31 in method 1100 shown in FIG. 11, node 31 may store one or more local flow grouping sets, each of which includes the flow identifiers of multiple flows that pass through node 31. In addition or alternatively, node 31 may store one or more lower-level flow grouping sets corresponding to node 33, each of which includes the flow identifiers of multiple flows that pass through node 33.
When node 31 stores one or more local flow grouping sets, before node 31 determines whether message 31 is a deduplicated message or a non-deduplicated message (that is, before step 1102 above is performed), node 31 determines that the one or more local flow grouping sets include a target flow grouping set that contains the flow identifier of the flow to which message 31 belongs. In other words, after receiving message 31, node 31 first determines whether the local flow grouping sets include the flow identifier of the flow to which message 31 belongs, and parses the payload part of message 31 only after determining that this flow identifier belongs to one of the local flow grouping sets. For how node 31 makes this determination, refer to how node 11 determines whether the flow grouping sets corresponding to node 12 include the flow identifier of the flow to which message 11 belongs.
In the embodiments of the present application, an upper-level node deduplicates only messages in the flows indicated by the flow grouping sets corresponding to its lower-level node. Therefore, after receiving a message, the intermediate node may first determine whether the flow identifier of the flow to which the message belongs is in one of its own flow grouping sets. If it is, the message may be a deduplicated message that has been processed by the upper-level node, and the intermediate node needs to parse the payload part of the message to determine whether it is a deduplicated message. If the flow identifier does not belong to any of the intermediate node's flow grouping sets, the upper-level node will not have deduplicated the message, so the message cannot be a deduplicated message. The intermediate node can therefore forward the message directly without parsing its payload part, which reduces the processing overhead of the intermediate node.
Optionally, among received messages that belong to different flows, node 31 adds the flow identifiers of the flows whose messages have duplicate content in their payload parts to the same local flow grouping set, where the senders of these different flows are the same SFU server. Further, node 31 may send first grouping information to node 32. The first grouping information includes a correspondence between the node identifier of node 31 and the one or more local flow grouping sets. For how the local flow grouping sets stored in node 31 are generated and sent and what they are used for, refer to the corresponding description of the flow grouping sets stored in node 21. For the content of the second data set stored in node 31, refer to the content of the data set stored in node 21. Details are not repeated here.
When node 31 stores one or more lower-level flow grouping sets corresponding to node 33, after node 31 obtains message 31, if message 31 is a non-deduplicated message, node 31 may first determine whether the lower-level flow grouping sets corresponding to node 33 include the flow identifier of the flow to which message 31 belongs. If they include a target flow grouping set that contains this flow identifier, node 31 performs step 1103 above, that is, node 31 determines whether the historical messages sent to node 33 include a message whose payload part has duplicate content with the payload part of message 31. If none of the lower-level flow grouping sets corresponding to node 33 includes this flow identifier, node 31 sends message 31 to node 33.
In the embodiments of the present application, the intermediate node can use the lower-level flow grouping sets corresponding to a lower-level node to determine whether a received non-deduplicated message that is to be sent to that lower-level node needs to be deduplicated. If the flow identifier of the flow to which the non-deduplicated message belongs is not in any lower-level flow grouping set corresponding to that lower-level node, the intermediate node forwards the message directly to the lower-level node without performing the message deduplication procedure, which reduces the processing overhead of the intermediate node.
可选地,在节点33对应的下级流分组集合中存在包括报文31所属流的流标识的目标流分组集合的情况下,节点31在判断向节点33发送的历史报文中是否存在载荷部分与报文31的载荷部分具有第一重复内容的第一原始报文时,可以只判断向节点33发送的目标历史报文中是否存在第一原始报文,目标历史报文所属流的流标识属于目标流分组集合。这样,当节点33对应的下级流分组集合有多个时,节点31只需对属于一个下级流分组集合所指示的多条流的历史报文的载荷部分与报文31的载荷部分进行重复内容判断,减少了节点31所需判断的历史报文的数量,从而减少了节点31的处理开销,同时提高了节点31的报文处理效率,从而可以提高报文传输效率。Optionally, in the case where there is a target flow grouping set including the flow identifier of the flow to which message 31 belongs in the lower-level flow grouping set corresponding to node 33, when node 31 determines whether there is a first original message whose payload part has first repetitive content with the payload part of message 31 in the historical message sent to node 33, it can only determine whether there is a first original message in the target historical message sent to node 33, and the flow identifier of the flow to which the target historical message belongs belongs to the target flow grouping set. In this way, when there are multiple lower-level flow grouping sets corresponding to node 33, node 31 only needs to perform repetitive content determination on the payload part of the historical messages belonging to multiple flows indicated by one lower-level flow grouping set and the payload part of message 31, which reduces the number of historical messages that node 31 needs to determine, thereby reducing the processing overhead of node 31, while improving the message processing efficiency of node 31, thereby improving the message transmission efficiency.
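As a non-limiting illustration, the flow-group check described above can be sketched as follows. The function names, the representation of a flow grouping set as a Python set of flow identifiers, and the representation of the history as (flow identifier, payload) pairs are assumptions made for this sketch and are not specified by the application.

```python
from typing import List, Optional, Set, Tuple


def find_target_group(flow_id: str, groups: List[Set[str]]) -> Optional[Set[str]]:
    """Return the lower-level flow grouping set that contains flow_id, if any."""
    for group in groups:
        if flow_id in group:
            return group
    return None


def candidate_history(
    flow_id: str,
    groups: List[Set[str]],
    history: List[Tuple[str, bytes]],
) -> Optional[List[bytes]]:
    """Select the historical payloads worth comparing against.

    Returns None when no grouping set contains flow_id, meaning the message
    can be forwarded without running the deduplication procedure.
    """
    target = find_target_group(flow_id, groups)
    if target is None:
        return None
    # Only history from flows in the target grouping set is examined.
    return [payload for fid, payload in history if fid in target]
```

Returning `None` rather than an empty list keeps "skip deduplication" distinct from "run deduplication, but there is no history yet".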
Optionally, node 31 receives second grouping information sent by node 33, where the second grouping information includes the correspondence between the node identifier of node 33 and the one or more lower-level flow grouping sets corresponding to node 33. For how node 31 obtains the lower-level flow grouping sets corresponding to node 33 and what they are used for, refer to how node 11 obtains the flow grouping set corresponding to node 12 and what it is used for, as described above. For the content of the first data set or sampling label set stored in node 31, refer to the content of the data set or sampling label set stored in node 11. Details are not repeated here.
The top node, intermediate node, and bottom node defined in the embodiments of the present application distinguish the positions of different nodes relative to the SFU server; the functions of these nodes may be the same. Optionally, the positions of multiple nodes in the data transmission system relative to the SFU server, and the connection relationships among these nodes, may be configured manually. Alternatively, the nodes in the data transmission system may each execute a node discovery process to determine their own position relative to the SFU server and their connection relationships with other nodes, which enables automated deployment.
In the embodiments of the present application, a message whose sender is the SFU server, or a message whose destination is the SFU server, can trigger a node in the data transmission system to execute the node discovery process. A message whose sender is the SFU server has the SFU service port number as its source port number, so a node can determine from the source port number of a message whether its sender is the SFU server. A message whose destination is the SFU server has the SFU service port number as its destination port number, so a node can determine from the destination port number of a message whether its destination is the SFU server. The following describes how the top node, the bottom node, and the intermediate node each execute the node discovery process.
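As a non-limiting illustration, the port-based check can be sketched as follows. The port value `10000` and the names `SFU_SERVICE_PORT`, `Direction`, and `classify` are hypothetical; the application does not fix a port number or an interface.

```python
from enum import Enum

# Hypothetical SFU service port; a real deployment would configure this value.
SFU_SERVICE_PORT = 10000


class Direction(Enum):
    FROM_SFU = "from_sfu"  # sender is the SFU server
    TO_SFU = "to_sfu"      # destination is the SFU server
    OTHER = "other"        # unrelated to the SFU service


def classify(src_port: int, dst_port: int, sfu_port: int = SFU_SERVICE_PORT) -> Direction:
    """Classify a message by comparing its ports with the SFU service port."""
    if src_port == sfu_port:
        return Direction.FROM_SFU
    if dst_port == sfu_port:
        return Direction.TO_SFU
    return Direction.OTHER
```

Either of the first two results could serve as the trigger for the node discovery process in the corresponding scheme.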
For a top node, take node 11 in method 600 shown in FIG. 6 as an example. If node 11 is the SFU server and the SFU server is configured with the data deduplication function, node 11 directly determines that it is the top node. If node 11 is not the SFU server, node 11 can determine its position relative to the SFU server through either of the following two possible implementations.
In one possible implementation, the node discovery process is triggered by a message whose destination is the SFU server.
In this implementation, after receiving message 13, whose destination port number is the SFU service port number, node 11 may send node discovery message 11 to node 13. Node 13 is the next hop of message 13 at node 11. The destination of message 13 is the SFU server. Node discovery message 11 carries the identifier of the SFU server and indicates that node 11 is a lower-level node of node 13 on the transmission path that starts at the SFU server. If node 11 does not receive, from node 13, node discovery response message 11 corresponding to node discovery message 11, node 11 determines that it is the first node supporting data deduplication on the transmission path that starts at the SFU server. The identifier of the SFU server may be the IP address of the SFU server, or the address obtained after the IP address of the SFU server undergoes NAT.
Optionally, node discovery message 11 may indicate in various ways that node 11 is a lower-level node of node 13 on the transmission path that starts at the SFU server. For example, if the destination port number of node discovery message 11 is the SFU service port number and node discovery message 11 carries the identifier of the SFU server, this indicates that node 11, which sends node discovery message 11, is a lower-level node of node 13, which receives it, on the transmission path that starts at that SFU server. As another example, if node discovery message 11 carries a position indication indicating that its sender is a lower-level node, that position indication and the identifier of the SFU server together indicate that node 11 is a lower-level node of node 13 on the transmission path that starts at that SFU server.
Optionally, node 11 generates node discovery message 11 from message 13: the header of node discovery message 11 is the same as the header of message 13, and the payload of node discovery message 11 carries an indication of its message type. For example, node 11 may copy message 13, keep only the header of the copy, and add to the payload of the copy an indication that the message type is node discovery message, thereby obtaining node discovery message 11. The header includes the Ethernet header, the IP header, and the transport layer protocol header (UDP header or TCP header). Alternatively, node discovery message 11 may be a new message constructed by node 11. The embodiments of the present application do not limit the structure of the node discovery message.
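As a non-limiting illustration, generating a node discovery message by copying the header of the triggering message can be sketched as follows. The `Message` type, the marker value `DISCOVERY_MARKER`, and the treatment of the header as opaque bytes are assumptions of this sketch. The SFU server identifier is assumed to travel in the copied header (for example, as the destination IP address), and any length or checksum fields a real network stack would have to recompute are ignored.

```python
from dataclasses import dataclass

# Hypothetical type indication placed in the payload of a discovery message.
DISCOVERY_MARKER = b"NODE-DISCOVERY"


@dataclass(frozen=True)
class Message:
    header: bytes   # Ethernet + IP + UDP/TCP headers, treated as opaque bytes
    payload: bytes


def make_discovery(trigger: Message) -> Message:
    """Copy the trigger's header and replace its payload with the type indication."""
    return Message(header=trigger.header, payload=DISCOVERY_MARKER)


def is_discovery(msg: Message) -> bool:
    """Recognize a discovery message by the type indication in its payload."""
    return msg.payload.startswith(DISCOVERY_MARKER)
```

Because the header is reused, the discovery message would follow the same next hop as the triggering message, which is how it reaches the neighboring node on the path.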
In this implementation, node 11 may also receive node discovery message 12 sent by node 14. Node discovery message 12 carries the identifier of the SFU server and indicates that node 14 is a lower-level node of node 11 on the transmission path that starts at the SFU server. Based on node discovery message 12, node 11 determines that node 14 supports data deduplication and sends node 14 the corresponding node discovery response message 12, which indicates that node 11 supports data deduplication. Node 11 can also determine that it is not the last node on the transmission path that starts at the SFU server. Node 14 may be, for example, node 12 above.
In the embodiments of the present application, in the scheme in which the node discovery process is triggered by a message whose destination is the SFU server, if a node receives a node discovery message from a lower-level node and does not receive a node discovery response message from an upper-level node, the node can determine that it is the first node on the transmission path that starts at the SFU server.
In another possible implementation, the node discovery process is triggered by a message whose sender is the SFU server.
In this implementation, after receiving message 14, whose source port number is the SFU service port number, node 11 sends node discovery message 13 to node 15. Node 15 is the next hop of message 14 at node 11. The sender of message 14 is the SFU server. Node discovery message 13 carries the identifier of the SFU server and indicates that node 11 is an upper-level node of node 15 on the transmission path that starts at the SFU server. In response to receiving, from node 15, node discovery response message 13 corresponding to node discovery message 13, node 11 determines that node 15 supports data deduplication. Node 11 can also determine that it is not the last node on the transmission path that starts at the SFU server. Node 15 may be, for example, node 12 above. Further, node 11 may also send a node discovery response confirmation message to node 15.
Optionally, node discovery message 13 may indicate in various ways that node 11 is an upper-level node of node 15 on the transmission path that starts at the SFU server. For example, if the source port number of node discovery message 13 is the SFU service port number and node discovery message 13 carries the identifier of the SFU server, this indicates that node 11, which sends node discovery message 13, is an upper-level node of node 15, which receives it, on the transmission path that starts at that SFU server. As another example, if node discovery message 13 carries a position indication indicating that its sender is an upper-level node, that position indication and the identifier of the SFU server together indicate that node 11 is an upper-level node of node 15 on the transmission path that starts at that SFU server.
Optionally, node 11 generates node discovery message 13 from message 14: the header of node discovery message 13 is the same as the header of message 14, and the payload of node discovery message 13 carries an indication of its message type.
In the embodiments of the present application, in the scheme in which the node discovery process is triggered by a message whose sender is the SFU server, if a node receives a node discovery response message from a lower-level node and does not receive a node discovery message from an upper-level node, the node can determine that it is the first node on the transmission path that starts at the SFU server.
For a bottom node, take node 21 in method 1000 shown in FIG. 10 as an example. Node 21 can determine its position relative to the SFU server through either of the following two possible implementations.
In one possible implementation, the node discovery process is triggered by a message whose destination is the SFU server.
In this implementation, after receiving message 25, whose destination port number is the SFU service port number, node 21 sends node discovery message 21 to node 24. Node 24 is the next hop of message 25 at node 21. The destination of message 25 is the SFU server. Node discovery message 21 carries the identifier of the SFU server and indicates that node 21 is a lower-level node of node 24 on the transmission path that starts at the SFU server. In response to receiving, from node 24, node discovery response message 21 corresponding to node discovery message 21, node 21 determines that node 24 supports data deduplication. Node 21 can also determine that it is not the first node on the transmission path that starts at the SFU server. Node 24 may be, for example, node 22 above. Further, node 21 may also send a node discovery response confirmation message to node 24.
Optionally, node 21 generates node discovery message 21 from message 25: the header of node discovery message 21 is the same as the header of message 25, and the payload of node discovery message 21 carries an indication of its message type. For how node 21 generates a node discovery message, refer to how node 11 generates a node discovery message above.
In the embodiments of the present application, in the scheme in which the node discovery process is triggered by a message whose destination is the SFU server, if a node receives a node discovery response message from an upper-level node and does not receive a node discovery message from a lower-level node, the node can determine that it is the last node on the transmission path that starts at the SFU server.
In another possible implementation, the node discovery process is triggered by a message whose sender is the SFU server.
In this implementation, node 21 may receive node discovery message 22 sent by node 25. Node discovery message 22 carries the identifier of the SFU server and indicates that node 25 is an upper-level node of node 21 on the transmission path that starts at the SFU server. Based on node discovery message 22, node 21 determines that node 25 supports data deduplication and sends node 25 the corresponding node discovery response message 22, which indicates that node 21 supports data deduplication. Node 21 can also determine that it is not the first node on the transmission path that starts at the SFU server. Node 25 may be, for example, node 22 above.
In this implementation, after receiving message 26, whose source port number is the SFU service port number, node 21 may also send node discovery message 23 to node 26. Node 26 is the next hop of message 26 at node 21. The sender of message 26 is the SFU server. Node discovery message 23 carries the identifier of the SFU server and indicates that node 21 is an upper-level node of node 26 on the transmission path that starts at the SFU server. If node 21 does not receive, from node 26, node discovery response message 23 corresponding to node discovery message 23, node 21 determines that it is the last node supporting data deduplication on the transmission path that starts at the SFU server.
Optionally, node 21 generates node discovery message 23 from message 26: the header of node discovery message 23 is the same as the header of message 26, and the payload of node discovery message 23 carries an indication of its message type.
In the embodiments of the present application, in the scheme in which the node discovery process is triggered by a message whose sender is the SFU server, if a node receives a node discovery message from an upper-level node and does not receive a node discovery response message from a lower-level node, the node can determine that it is the last node on the transmission path that starts at the SFU server.
For an intermediate node, take node 31 in method 1100 shown in FIG. 11 as an example. Node 31 can determine its position relative to the SFU server through either of the following two possible implementations.
In one possible implementation, the node discovery process is triggered by a message whose destination is the SFU server.
In this implementation, after receiving message 34, whose destination port number is the SFU service port number, node 31 sends node discovery message 31 to node 34. Node 34 is the next hop of message 34 at node 31. The destination of message 34 is the SFU server. Node discovery message 31 carries the identifier of the SFU server and indicates that node 31 is a lower-level node of node 34 on the transmission path that starts at the SFU server. In response to receiving, from node 34, node discovery response message 31 corresponding to node discovery message 31, node 31 determines that node 34 supports data deduplication. Node 31 can also determine that it is not the first node on the transmission path that starts at the SFU server. Node 34 may be, for example, node 32 above. Further, node 31 may also send a node discovery response confirmation message to node 34.
Optionally, node 31 generates node discovery message 31 from message 34: the header of node discovery message 31 is the same as the header of message 34, and the payload of node discovery message 31 carries an indication of its message type. For how node 31 generates a node discovery message, refer to how node 11 generates a node discovery message above.
In this implementation, node 31 may also receive node discovery message 34 sent by node 37. Node discovery message 34 carries the identifier of the SFU server and indicates that node 37 is a lower-level node of node 31 on the transmission path that starts at the SFU server. Based on node discovery message 34, node 31 determines that node 37 supports data deduplication and sends node 37 the corresponding node discovery response message 34, which indicates that node 31 supports data deduplication. Node 31 can also determine that it is not the last node on the transmission path that starts at the SFU server. Node 37 may be, for example, node 33 above.
In the embodiments of the present application, in the scheme in which the node discovery process is triggered by a message whose destination is the SFU server, if a node receives a node discovery response message from an upper-level node and also receives a node discovery message from a lower-level node, the node can determine that it is an intermediate node on the transmission path that starts at the SFU server.
In another possible implementation, the node discovery process is triggered by a message whose sender is the SFU server.
In this implementation, after receiving message 35, whose source port number is the SFU service port number, node 31 may send node discovery message 33 to node 36. Node 36 is the next hop of message 35 at node 31. The sender of message 35 is the SFU server. Node discovery message 33 carries the identifier of the SFU server and indicates that node 31 is an upper-level node of node 36 on the transmission path that starts at the SFU server. In response to receiving, from node 36, node discovery response message 33 corresponding to node discovery message 33, node 31 determines that node 36 supports data deduplication. Node 31 can also determine that it is not the last node on the transmission path that starts at the SFU server. Node 36 may be, for example, node 33 above. Further, node 31 may also send a node discovery response confirmation message to node 36.
Optionally, node 31 generates node discovery message 33 from message 35: the header of node discovery message 33 is the same as the header of message 35, and the payload of node discovery message 33 carries an indication of its message type.
In this implementation, node 31 may also receive node discovery message 32 sent by node 35. Node discovery message 32 carries the identifier of the SFU server and indicates that node 35 is an upper-level node of node 31 on the transmission path that starts at the SFU server. Based on node discovery message 32, node 31 determines that node 35 supports data deduplication and sends node 35 the corresponding node discovery response message 32, which indicates that node 31 supports data deduplication. Node 31 can also determine that it is not the first node on the transmission path that starts at the SFU server. Node 35 may be, for example, node 32 above.
In the embodiments of the present application, in the scheme in which the node discovery process is triggered by a message whose sender is the SFU server, if a node receives a node discovery message from an upper-level node and also receives a node discovery response message from a lower-level node, the node can determine that it is an intermediate node on the transmission path that starts at the SFU server.
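As a non-limiting illustration, the position rules stated for the two triggering schemes can be summarized in the following sketch. The names `Position`, `position_dest_triggered`, and `position_sender_triggered` are hypothetical. The `UNKNOWN` result, for a node that hears from neither neighbor, is an assumption of this sketch; the embodiments do not describe that case.

```python
from enum import Enum


class Position(Enum):
    TOP = "top"
    INTERMEDIATE = "intermediate"
    BOTTOM = "bottom"
    UNKNOWN = "unknown"  # not covered by the described rules


def _position(upper_peer_found: bool, lower_peer_found: bool) -> Position:
    """Map 'is there a deduplication-capable neighbor on each side' to a position."""
    if upper_peer_found and lower_peer_found:
        return Position.INTERMEDIATE
    if lower_peer_found:
        return Position.TOP      # no upper-level peer: first node on the path
    if upper_peer_found:
        return Position.BOTTOM   # no lower-level peer: last node on the path
    return Position.UNKNOWN


def position_dest_triggered(got_response_from_upper: bool,
                            got_discovery_from_lower: bool) -> Position:
    """Scheme triggered by a message whose destination is the SFU server."""
    return _position(got_response_from_upper, got_discovery_from_lower)


def position_sender_triggered(got_discovery_from_upper: bool,
                              got_response_from_lower: bool) -> Position:
    """Scheme triggered by a message whose sender is the SFU server."""
    return _position(got_discovery_from_upper, got_response_from_lower)
```

In both schemes the decision reduces to the same two facts; only the kind of message that reveals each neighbor differs.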
Optionally, each node in the data transmission system provided in the embodiments of the present application may also store a correspondence between SFU server identifiers and node positions, where a node position indicates the position of the node itself relative to the corresponding SFU server. The data transmission system may contain multiple SFU servers, and the same node may occupy different positions relative to different SFU servers. For example, a node may be a top node relative to SFU server 1 and an intermediate node relative to SFU server 2. By storing the correspondence between SFU server identifiers and node positions, a node that receives a message whose sender is a particular SFU server can determine in which positional role it should deduplicate or restore the message. This suits application scenarios with complex networking.
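As a non-limiting illustration, the stored correspondence can be as simple as a lookup table keyed by SFU server identifier. The class name `PositionTable`, the position strings, and the example addresses are hypothetical.

```python
from typing import Dict, Optional

VALID_POSITIONS = {"top", "intermediate", "bottom"}


class PositionTable:
    """Per-node table: SFU server identifier -> this node's position."""

    def __init__(self) -> None:
        self._by_sfu: Dict[str, str] = {}

    def record(self, sfu_id: str, position: str) -> None:
        """Store or update the node's position relative to one SFU server."""
        if position not in VALID_POSITIONS:
            raise ValueError(f"unknown position: {position}")
        self._by_sfu[sfu_id] = position

    def lookup(self, sfu_id: str) -> Optional[str]:
        """Return the stored position, or None if the SFU server is unknown."""
        return self._by_sfu.get(sfu_id)
```

For example, a node could record `"top"` for one SFU server and `"intermediate"` for another, then look up the sender of each incoming message to choose how to process it.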
The order of the steps of the data transmission method provided in the embodiments of the present application can be adjusted as appropriate, and steps can be added or removed as the situation requires. Any variation that a person skilled in the art can readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. For example, in the embodiments of the present application, the content of the data set and/or sampling label set stored in each node may be updated periodically, such as by storing only the data from a recent period of time and automatically deleting expired data, to reduce the memory consumption of the node. As another example, a message whose destination is the SFU server or whose sender is the SFU server can, in addition to triggering the receiving node to send a node discovery message in the transmission direction of the message as described in the above embodiments, also trigger the receiving node to send a node discovery message in the reverse transmission direction of the message. For the principle by which a node determines its position relative to the SFU server by sending a node discovery message in the reverse transmission direction, refer to the principle by which a node does so by sending a node discovery message in the transmission direction in the above embodiments. Details are not repeated here.
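As a non-limiting illustration, retaining only recent data can be sketched with a time-stamped store that evicts entries older than a configured age. The class name `ExpiringPayloadStore`, the `max_age_seconds` parameter, and the choice to pass the current time explicitly are assumptions of this sketch. Timestamps are assumed to be non-decreasing across calls.

```python
from collections import OrderedDict


class ExpiringPayloadStore:
    """Keeps payloads for at most max_age_seconds after they were last added."""

    def __init__(self, max_age_seconds: float) -> None:
        self.max_age = max_age_seconds
        # Insertion order matches timestamp order, oldest first.
        self._items: "OrderedDict[bytes, float]" = OrderedDict()

    def add(self, payload: bytes, now: float) -> None:
        """Insert or refresh a payload, then drop anything that has expired."""
        self._items.pop(payload, None)  # re-adding moves the entry to the end
        self._items[payload] = now
        self.evict(now)

    def evict(self, now: float) -> None:
        """Remove entries whose age exceeds max_age."""
        while self._items:
            _, stamp = next(iter(self._items.items()))
            if now - stamp <= self.max_age:
                break
            self._items.popitem(last=False)

    def __contains__(self, payload: bytes) -> bool:
        return payload in self._items

    def __len__(self) -> int:
        return len(self._items)
```

Passing `now` in explicitly keeps the sketch deterministic; a deployment might instead read a monotonic clock.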
下面对本申请实施例的虚拟装置举例说明。The following is an illustration of the virtual device in the embodiment of the present application.
例如,图12是本申请实施例提供的一种通信节点的结构示意图。该通信节点为第一节点,例如可以是图6示出的方法600中的节点11。如图12所示,通信节点1200包括获取模块1201、处理模块1202和发送模块1203。可选地,通信节点1200还包括接收模块1204。For example, FIG12 is a schematic diagram of the structure of a communication node provided in an embodiment of the present application. The communication node is a first node, for example, it can be the node 11 in the method 600 shown in FIG6. As shown in FIG12, the communication node 1200 includes an acquisition module 1201, a processing module 1202 and a sending module 1203. Optionally, the communication node 1200 also includes a receiving module 1204.
获取模块1201,用于获取发送方为SFU服务器的第一报文,第一节点为第一报文的传输路径上支持数据去重的首个节点。处理模块1202,用于如果第一节点向第二节点发送的历史报文中存在目标报文,目标报文的载荷部分与第一报文的载荷部分具有重复内容,对第一报文的载荷部分进行去重处理,得到第二报文,第二报文不包括重复内容,且第二报文携带有去重标记以及对重复内容的指示信息,去重标记用于指示第二报文为去重报文,第二节点为第一报文在第一节点上的下一跳。发送模块1203,用于向第二节点发送第二报文。这里,第一节点可以是上述节点11,第二节点可以是上述节点12,第一报文可以是上述报文11,第二报文可以是上述报文12。The acquisition module 1201 is used to acquire the first message whose sender is the SFU server, and the first node is the first node that supports data deduplication on the transmission path of the first message. The processing module 1202 is used to deduplicate the payload part of the first message if there is a target message in the historical message sent by the first node to the second node, and the payload part of the target message has repeated content with the payload part of the first message, to obtain a second message, the second message does not include repeated content, and the second message carries a deduplication mark and indication information of the repeated content, the deduplication mark is used to indicate that the second message is a deduplication message, and the second node is the next hop of the first message on the first node. The sending module 1203 is used to send the second message to the second node. Here, the first node can be the above-mentioned node 11, the second node can be the above-mentioned node 12, the first message can be the above-mentioned message 11, and the second message can be the above-mentioned message 12.
Optionally, the repeated content includes one or more repeated data blocks, and the indication information includes one or more indications in one-to-one correspondence with the one or more repeated data blocks. Each indication indicates the hash value of its corresponding repeated data block.
Optionally, the first node stores a data set that includes the payload parts of the historical messages sent by the first node to the second node. The processing module 1202 is configured to: perform content matching between the payload part of the first message and the payload parts in the data set; if the data set contains a target payload part that shares a repeated data block with the payload part of the first message, determine that the target message exists among the historical messages; for each repeated data block shared by the payload part of the first message and the target payload part, calculate the hash value of the repeated data block; and remove the repeated data block from the payload part of the first message and add to that payload part an indication corresponding to the repeated data block, where the indication indicates the hash value of the repeated data block and the position of the repeated data block in the payload part of the first message.
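The content-matching procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: it assumes fixed-size blocks that sit at the same offset in both payloads, uses SHA-256 as the hash, and the names `BLOCK_SIZE` and `deduplicate` are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical block size in bytes (assumption, not from the source)


def deduplicate(payload: bytes, data_set: list):
    """Return (remaining_bytes, indications), or None when no target payload exists.

    Each indication is (position_in_original_payload, sha256_of_block).
    """
    for target in data_set:
        remaining = bytearray()
        indications = []
        for pos in range(0, len(payload), BLOCK_SIZE):
            block = payload[pos:pos + BLOCK_SIZE]
            # A block counts as repeated only if the target payload holds the
            # same bytes at the same offset (simplifying assumption).
            if len(block) == BLOCK_SIZE and target[pos:pos + BLOCK_SIZE] == block:
                indications.append((pos, hashlib.sha256(block).digest()))
            else:
                remaining += block
        if indications:
            return bytes(remaining), indications
    # No target payload was found: remember this payload for future matching.
    data_set.append(payload)
    return None
```

When no historical payload matches, the sketch stores the new payload in the data set and returns `None`, in which case the caller would forward the first message unchanged.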
Optionally, the processing module 1202 is configured to: if the data set contains no payload part that shares a repeated data block with the payload part of the first message, determine that the target message does not exist among the historical messages, and add the payload part of the first message to the data set.
Optionally, the payload part includes a protocol part and a data part, and the one or more repeated data blocks are located in the data part of the first message.
Optionally, the first node stores a sampling tag set that includes the hash values of historical data blocks, where a historical data block is a data block obtained by sampling a preset position of the data part of a historical message sent by the first node to the second node. The processing module 1202 is configured to: sample the preset position of the data part of the first message to obtain a sampled data block; calculate the hash value of the sampled data block; if the sampling tag set includes the hash value of the sampled data block, determine that the target message exists among the historical messages; and treat each sampled data block whose hash value belongs to the sampling tag set as a repeated data block, remove the repeated data block from the data part of the first message, and add to the payload part of the first message an indication corresponding to the repeated data block, where the indication indicates the hash value of the repeated data block.
Optionally, there are multiple preset positions, and the first node obtains multiple sampled data blocks by sampling the preset positions of the data part of the first message. In this case the indication further indicates the position of the repeated data block in the data part of the first message.
Optionally, the processing module 1202 is configured to: if the sampling tag set does not include the hash value of the sampled data block, determine that the target message does not exist among the historical messages, and add the hash value of the sampled data block to the sampling tag set.
Optionally, the sampling tag set further includes the historical data blocks indicated by the hash values. The processing module 1202 is configured to: if the sampling tag set includes the hash value of the sampled data block, perform content matching between the sampled data block and the historical data block indicated by that hash value; and when the sampled data block and the indicated historical data block have the same content, determine that the target message exists among the historical messages.
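The sampling-based variant described in the preceding paragraphs can be sketched as follows. The sampling offsets, the block length, SHA-256 as the hash, and the names `PRESET_POSITIONS`, `BLOCK_LEN`, and `dedup_by_sampling` are all assumptions made for illustration; the source does not specify them.

```python
import hashlib

PRESET_POSITIONS = (0, 8)  # hypothetical sampling offsets, sorted and non-overlapping
BLOCK_LEN = 4              # hypothetical sampled block length in bytes


def dedup_by_sampling(data: bytes, tag_set: dict):
    """Return (remaining_data, indications); tag_set maps hash -> historical block."""
    remaining = bytearray()
    indications = []
    cursor = 0
    for pos in PRESET_POSITIONS:
        block = data[pos:pos + BLOCK_LEN]
        if len(block) < BLOCK_LEN:
            continue  # data part too short to sample at this position
        digest = hashlib.sha256(block).digest()
        # A hash hit is confirmed by comparing content, guarding against collisions.
        if tag_set.get(digest) == block:
            remaining += data[cursor:pos]
            cursor = pos + BLOCK_LEN
            indications.append((pos, digest))
        else:
            # Unknown hash: record it so later messages can match this block.
            tag_set.setdefault(digest, block)
    remaining += data[cursor:]
    return bytes(remaining), indications
```

Storing the block next to its hash costs memory but lets the sender rule out hash collisions before dropping data, which is the purpose of the optional content-matching step.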
Optionally, the first node further stores the protocol parts of the historical messages sent by the first node to the second node. The repeated content further includes protocol information located in the protocol part of the first message, and the indication information further includes a difference indication, which indicates the difference between the protocol part of the first message and the protocol part of the target message.
Optionally, the first node stores one or more flow grouping sets corresponding to the second node, and each flow grouping set includes the flow identifiers of multiple flows passing through the second node. The processing module 1202 is further configured to, after the first node acquires the first message, if the flow grouping sets corresponding to the second node include a target flow grouping set that contains the flow identifier of the flow to which the first message belongs, determine whether the target message exists among the target historical messages sent to the second node, where the flow identifier of the flow to which each target historical message belongs is in the target flow grouping set. The sending module 1203 is further configured to send the first message to the second node if none of the flow grouping sets corresponding to the second node includes the flow identifier of the flow to which the first message belongs.
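The flow-grouping check can be illustrated with a short sketch. The function name `select_history` and the data layout (grouping sets as Python sets, history as `(flow_id, payload)` pairs) are hypothetical choices for illustration only.

```python
def select_history(flow_id, flow_groups, history):
    """Return payloads eligible for matching, or None to forward unchanged.

    flow_groups: flow grouping sets stored for the next hop (list of sets).
    history: (flow_id, payload) pairs already sent to that next hop.
    """
    target_group = next((g for g in flow_groups if flow_id in g), None)
    if target_group is None:
        return None  # no grouping set holds this flow: skip deduplication
    # Only history from flows in the same grouping set is searched.
    return [payload for fid, payload in history if fid in target_group]
```

Restricting the search to one grouping set keeps the matching cost proportional to the flows that are expected to share content, rather than to all traffic sent to the next hop.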
Optionally, the receiving module 1204 is configured to receive grouping information sent by the second node, where the grouping information includes a correspondence between the node identifier of the second node and the one or more flow grouping sets.
Optionally, the first node is not the SFU server. The sending module 1203 is further configured to, after a third message whose destination port number is the SFU service port number is received, send a first node discovery message to a third node. The third node is the next hop of the third message at the first node, and the destination of the third message is the SFU server. The first node discovery message carries the identifier of the SFU server and indicates that the first node is a lower-level node of the third node on the transmission path starting from the SFU server. The processing module 1202 is further configured to, in response to not receiving from the third node a first node discovery response message corresponding to the first node discovery message, determine that the first node is the first node supporting data deduplication on the transmission path starting from the SFU server. Here, the third node may be the above-mentioned node 13, the third message may be the above-mentioned message 13, the first node discovery message may be the above-mentioned node discovery message 11, and the first node discovery response message may be the above-mentioned node discovery response message 11.
Optionally, the processing module 1202 is further configured to generate the first node discovery message based on the third message. The message header of the first node discovery message is the same as the message header of the third message, and the payload part of the first node discovery message carries an indication of the message type of the first node discovery message.
Optionally, the receiving module 1204 is configured to receive a second node discovery message sent by a fourth node. The second node discovery message carries the identifier of the SFU server and indicates that the fourth node is a lower-level node of the first node on the transmission path starting from the SFU server. The processing module 1202 is further configured to determine, based on the second node discovery message, that the fourth node supports data deduplication. The sending module 1203 is further configured to send to the fourth node a second node discovery response message corresponding to the second node discovery message, where the second node discovery response message indicates that the first node supports data deduplication. Here, the fourth node may be the above-mentioned node 14, the second node discovery message may be the above-mentioned node discovery message 12, and the second node discovery response message may be the above-mentioned node discovery response message 12.
Optionally, the sending module 1203 is further configured to, after a fourth message whose source port number is the SFU service port number is received, send a third node discovery message to a fifth node. The fifth node is the next hop of the fourth message at the first node, and the sender of the fourth message is the SFU server. The third node discovery message carries the identifier of the SFU server and indicates that the first node is an upper-level node of the fifth node on the transmission path starting from the SFU server. The processing module 1202 is further configured to, in response to receiving from the fifth node a third node discovery response message corresponding to the third node discovery message, determine that the fifth node supports data deduplication. Here, the fifth node may be the above-mentioned node 15, the fourth message may be the above-mentioned message 14, the third node discovery message may be the above-mentioned node discovery message 13, and the third node discovery response message may be the above-mentioned node discovery response message 13.
Optionally, the sending module 1203 is further configured to send the first message to the second node if the target message does not exist among the historical messages sent by the first node to the second node.
For another example, FIG. 13 is a schematic structural diagram of another communication node provided in an embodiment of the present application. The communication node is a first node and may be, for example, node 21 in the method 1000 shown in FIG. 10. As shown in FIG. 13, the communication node 1300 includes a receiving module 1301, a processing module 1302, and a sending module 1303.
The receiving module 1301 is configured to receive a first message sent by a second node. The sender of the first message is an SFU server, and the first message carries a deduplication mark and indication information for repeated content. The deduplication mark indicates that the first message is a deduplicated message, and the first node is the last node supporting data deduplication on the transmission path of the first message. The processing module 1302 is configured to: determine, based on the deduplication mark, that the first message is a deduplicated message; obtain the repeated content from a data set according to the indication information, where the data set includes at least part of the content of the payload parts of historical messages received by the first node from the second node; and perform deduplication recovery processing on the payload part of the first message according to the repeated content to obtain a second message, where the payload part of the second message includes the repeated content. The sending module 1303 is configured to send the second message to a third node, where the third node is the next hop of the first message at the first node. Here, the first node may be the above-mentioned node 21, the second node may be the above-mentioned node 22, the third node may be the above-mentioned node 23, the first message may be the above-mentioned message 21, and the second message may be the above-mentioned message 22.
Optionally, the repeated content includes one or more repeated data blocks, and the indication information includes one or more indications in one-to-one correspondence with the one or more repeated data blocks. Each indication indicates the hash value of its corresponding repeated data block.
Optionally, each indication further indicates the position of the corresponding repeated data block in the payload part of the original message corresponding to the first message, and the data set includes the payload parts of the historical messages received by the first node from the second node. The processing module 1302 is configured to: for each indication in the indication information, obtain, from a payload part in the data set, the to-be-matched data block located at the position indicated by the indication; calculate the hash value of the to-be-matched data block; and determine the to-be-matched data block whose hash value matches the hash value indicated by the indication as the repeated data block corresponding to the indication.
Optionally, the payload part includes a protocol part and a data part, and the one or more repeated data blocks are located in the data part.
Optionally, the data set includes a correspondence between the hash values of historical data blocks and the historical data blocks, where a historical data block is a data block obtained by sampling a preset position of the data part of a historical message received by the first node from the second node. The processing module 1302 is configured to determine the historical data block in the data set that corresponds to the hash value indicated by an indication in the indication information as the repeated data block corresponding to that indication.
Optionally, the indication further indicates the position of the corresponding repeated data block in the data part of the original message corresponding to the first message. The processing module 1302 is configured to, for each indication in the indication information, add the repeated data block corresponding to the indication at the position indicated by the indication in the data part of the first message.
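The recovery step at the last deduplication-capable node can be sketched as follows. The sketch assumes that each indication is a `(position, hash)` pair whose position refers to the original data part, and that the node keeps a hash-to-block mapping as described above; the names `restore` and `blocks_by_hash` are hypothetical.

```python
def restore(remaining: bytes, indications, blocks_by_hash: dict) -> bytes:
    """Rebuild the original data part from a deduplicated one.

    indications: (position_in_original_data, block_hash) pairs.
    blocks_by_hash: hash -> historical data block stored at the receiving node.
    """
    out = bytearray()
    cursor = 0
    for pos, digest in sorted(indications, key=lambda item: item[0]):
        gap = pos - len(out)  # bytes that were kept before this block
        out += remaining[cursor:cursor + gap]
        cursor += gap
        out += blocks_by_hash[digest]  # KeyError means the block is unknown
    out += remaining[cursor:]
    return bytes(out)
```

Processing the indications in ascending position order lets each removed block be reinserted at its original offset while the surviving bytes are copied through unchanged.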
Optionally, the repeated content further includes protocol information located in the protocol part, and the indication information further includes a difference indication. The difference indication indicates the difference between the protocol part of the original message corresponding to the first message and the protocol part of a target message, where the target message is a historical message, among those received by the first node from the second node, whose data part shares one or more repeated data blocks with the data part of the original message. The data set further includes the protocol part of the message to which each historical data block belongs. The processing module 1302 is further configured to: obtain from the data set the protocol part of the target message to which the one or more repeated data blocks belong; and modify the protocol part of the target message according to the difference indication and use the modified protocol part as the protocol part of the second message.
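One way to picture the difference indication is as the set of protocol fields whose values changed relative to the stored target message. The sketch below assumes the protocol part can be represented as a dictionary with the same field names in both messages; the field names in the usage note and the functions `protocol_diff` and `apply_protocol_diff` are hypothetical, and the source does not define the encoding of the difference indication.

```python
def protocol_diff(current: dict, target: dict) -> dict:
    """Fields of `current` whose values differ from `target` (the difference indication)."""
    return {name: value for name, value in current.items() if target.get(name) != value}


def apply_protocol_diff(target: dict, diff: dict) -> dict:
    """Rebuild the current protocol part from the stored target part and the diff."""
    return {**target, **diff}
```

For example, if two messages differ only in a sequence-number field, the sender would transmit just that field, and the receiver would copy every other field from the stored protocol part of the target message.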
Optionally, the deduplication mark is located in the payload part of the first message, and the first node stores one or more flow grouping sets, each of which includes the flow identifiers of multiple flows passing through the first node. The processing module 1302 is further configured to, before the first node determines based on the deduplication mark that the first message is a deduplicated message, determine that the one or more flow grouping sets include a target flow grouping set containing the flow identifier of the flow to which the first message belongs, and parse the payload part of the first message to obtain the deduplication mark.
Optionally, the processing module 1302 is configured to obtain, according to the indication information, the repeated content from the payload content of target historical messages in the data set, where the flow identifier of the flow to which each target historical message belongs is in the target flow grouping set.
Optionally, the receiving module 1301 is further configured to receive a third message, where the flow identifier of the flow to which the third message belongs is not in any flow grouping set. The sending module 1303 is further configured to forward the third message. Here, the third message may be the above-mentioned message 23.
Optionally, the receiving module 1301 is further configured to receive a fourth message, where the flow identifier of the flow to which the fourth message belongs is in a flow grouping set. The processing module 1302 is further configured to: parse the payload part of the fourth message and determine that it does not carry a deduplication mark; and add at least part of the content of the payload part of the fourth message to the data set and forward the fourth message. Here, the fourth message may be the above-mentioned message 24.
Optionally, the processing module 1302 is further configured to, among multiple received messages that belong to different flows, add the flow identifiers of the flows whose messages have repeated content in their payload parts to the same flow grouping set, where the sender of each of the different flows is the SFU server.
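The grouping rule can be sketched as follows. As a simplifying assumption, the sketch treats two payloads as having repeated content only when they are byte-for-byte identical, which is narrower than the block-level repetition the text allows; the name `build_flow_groups` and the `(flow_id, payload)` input layout are hypothetical.

```python
import hashlib
from collections import defaultdict


def build_flow_groups(messages):
    """Group flows whose payloads repeat across flows.

    messages: (flow_id, payload) pairs from flows sent by the same SFU server.
    Returns a list of sets of flow identifiers.
    """
    flows_by_digest = defaultdict(set)
    for flow_id, payload in messages:
        flows_by_digest[hashlib.sha256(payload).digest()].add(flow_id)
    groups = []
    for flows in flows_by_digest.values():
        # Only payloads seen on at least two flows indicate cross-flow repetition.
        if len(flows) > 1 and flows not in groups:
            groups.append(flows)
    return groups
```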
Optionally, the sending module 1303 is further configured to send grouping information to the second node, where the grouping information includes a correspondence between the node identifier of the first node and the one or more flow grouping sets.
Optionally, the sending module 1303 is further configured to, after a fifth message whose destination port number is the SFU service port number is received, send a first node discovery message to a fourth node. The fourth node is the next hop of the fifth message at the first node, and the destination of the fifth message is the SFU server. The first node discovery message carries the identifier of the SFU server and indicates that the first node is a lower-level node of the fourth node on the transmission path starting from the SFU server. The processing module 1302 is further configured to, in response to receiving from the fourth node a first node discovery response message corresponding to the first node discovery message, determine that the fourth node supports data deduplication. Here, the fourth node may be the above-mentioned node 24, the fifth message may be the above-mentioned message 25, the first node discovery message may be the above-mentioned node discovery message 21, and the first node discovery response message may be the above-mentioned node discovery response message 21.
Optionally, the processing module 1302 is further configured to generate the first node discovery message based on the fifth message. The message header of the first node discovery message is the same as the message header of the fifth message, and the payload part of the first node discovery message carries an indication of the message type of the first node discovery message.
Optionally, the receiving module 1301 is further configured to receive a second node discovery message sent by a fifth node. The second node discovery message carries the identifier of the SFU server and indicates that the fifth node is an upper-level node of the first node on the transmission path starting from the SFU server. The processing module 1302 is further configured to determine, based on the second node discovery message, that the fifth node supports data deduplication. The sending module 1303 is further configured to send to the fifth node a second node discovery response message corresponding to the second node discovery message, where the second node discovery response message indicates that the first node supports data deduplication. Here, the fifth node may be the above-mentioned node 25, the second node discovery message may be the above-mentioned node discovery message 22, and the second node discovery response message may be the above-mentioned node discovery response message 22.
Optionally, the sending module 1303 is further configured to, after a sixth message whose source port number is the SFU service port number is received, send a third node discovery message to a sixth node. The sixth node is the next hop of the sixth message at the first node, and the sender of the sixth message is the SFU server. The third node discovery message carries the identifier of the SFU server and indicates that the first node is an upper-level node of the sixth node on the transmission path starting from the SFU server. The processing module 1302 is further configured to, in response to not receiving from the sixth node a third node discovery response message corresponding to the third node discovery message, determine that the first node is the last node supporting data deduplication on the transmission path starting from the SFU server. Here, the sixth node may be the above-mentioned node 26, the sixth message may be the above-mentioned message 26, the third node discovery message may be the above-mentioned node discovery message 23, and the third node discovery response message may be the above-mentioned node discovery response message 23.
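The discovery exchanges described for the communication nodes 1200 and 1300 let a node infer its place on the path from which discovery responses arrive. The following sketch summarizes that inference; the `Role` enumeration and `classify_role` are hypothetical names, and the `ISOLATED` case (no response in either direction) is an assumption added for completeness rather than a case the text describes.

```python
from enum import Enum


class Role(Enum):
    FIRST = "first"
    INTERMEDIATE = "intermediate"
    LAST = "last"
    ISOLATED = "isolated"  # assumption: no peer in either direction; not named in the source


def classify_role(upstream_responded: bool, downstream_responded: bool) -> Role:
    """Infer a node's role from which discovery responses arrived.

    upstream_responded: the next hop toward the SFU server answered the discovery message.
    downstream_responded: the next hop away from the SFU server answered the discovery message.
    """
    if upstream_responded and downstream_responded:
        return Role.INTERMEDIATE
    if downstream_responded:
        return Role.FIRST  # no node closer to the SFU server supports deduplication
    if upstream_responded:
        return Role.LAST   # no node farther from the SFU server supports deduplication
    return Role.ISOLATED
```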
For another example, FIG. 14 is a schematic structural diagram of yet another communication node provided in an embodiment of the present application. The communication node is a first node and may be, for example, node 31 in the method 1100 shown in FIG. 11. As shown in FIG. 14, the communication node 1400 includes a receiving module 1401, a processing module 1402, and a sending module 1403.
The receiving module 1401 is configured to receive a first message sent by a second node, where the sender of the first message is an SFU server and the first node is an intermediate node supporting data deduplication on the transmission path of the first message. The processing module 1402 is configured to, if the first message is a non-deduplicated message and the historical messages sent by the first node to a third node include a first original message whose payload part shares first repeated content with the payload part of the first message, perform deduplication processing on the payload part of the first message to obtain a second message. The second message does not include the first repeated content and carries a deduplication mark and first indication information for the first repeated content. The deduplication mark indicates that the second message is a deduplicated message, and the third node is the next hop of the first message at the first node. The sending module 1403 is configured to send the second message to the third node. Here, the first node may be the above-mentioned node 31, the second node may be the above-mentioned node 32, the third node may be the above-mentioned node 33, the first message may be the above-mentioned message 31, and the second message may be the above-mentioned message 32.
Optionally, the sending module 1403 is further configured to send the first message to the third node if the first message is a non-deduplicated message and the first original message does not exist among the historical messages sent by the first node to the third node.
Optionally, the first repeated content includes one or more repeated data blocks, and the first indication information includes one or more indications in one-to-one correspondence with the one or more repeated data blocks. Each indication indicates the hash value of its corresponding repeated data block.
Optionally, the first node stores a first data set that includes the payload parts of the historical messages sent by the first node to the third node. The processing module 1402 is further configured to: perform content matching between the payload part of the first message and the payload parts in the first data set; if the first data set contains a target payload part that shares a repeated data block with the payload part of the first message, determine that the first original message exists among the historical messages; and if the first data set contains no payload part that shares a repeated data block with the payload part of the first message, determine that the first original message does not exist among the historical messages and add the payload part of the first message to the first data set.
Optionally, the first data set includes the target payload part. The processing module 1402 is configured to: for each repeated data block shared by the payload part of the first message and the target payload part, calculate the hash value of the repeated data block; and remove the repeated data block from the payload part of the first message and add to that payload part an indication corresponding to the repeated data block, where the indication indicates the hash value of the repeated data block and the position of the repeated data block in the payload part of the first message.
Optionally, the payload part includes a protocol part and a data part, and the one or more repeated data blocks are located in the data part of the first message. The first node stores a sampling tag set that includes the hash values of historical data blocks, where a historical data block is a data block obtained by sampling a preset position of the data part of a historical message sent by the first node to the third node. The processing module 1402 is further configured to: sample the preset position of the data part of the first message to obtain a sampled data block; calculate the hash value of the sampled data block; if the sampling tag set includes the hash value of the sampled data block, determine that the first original message exists among the historical messages; and if the sampling tag set does not include the hash value of the sampled data block, determine that the first original message does not exist among the historical messages and add the hash value of the sampled data block to the sampling tag set.
Optionally, when the sampling tag set includes the hash value of the sampled data block, the processing module 1402 is configured to: treat the sampled data block whose hash value belongs to the sampling tag set as a duplicate data block; remove the duplicate data block from the data part of the first message; and add, to the payload part of the first message, an indication corresponding to the duplicate data block, where the indication indicates the hash value of the duplicate data block.
Optionally, the first node further stores the protocol parts of the historical messages sent by the first node to the third node. The first duplicate content further includes protocol information located in the protocol part of the first message, and the first indication information further includes a difference indication, which indicates the difference between the protocol part of the first message and the protocol part of the first original message.
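One possible form of the difference indication, given only as a sketch, is a list of (offset, new byte) pairs. The encoding, the restriction to equal-length protocol parts and the name `protocol_diff` are assumptions of this illustration; the embodiment does not define the format of the difference indication.

```python
from typing import List, Tuple


def protocol_diff(original: bytes, current: bytes) -> List[Tuple[int, int]]:
    """List the (offset, new_byte) pairs at which `current` differs from `original`."""
    if len(original) != len(current):
        raise ValueError("this sketch only handles protocol parts of equal length")
    return [(i, new) for i, (old, new) in enumerate(zip(original, current)) if old != new]


# Two 4-byte protocol parts that differ only in the last byte (for example, a sequence number).
diff = protocol_diff(bytes([0x80, 0x60, 0x00, 0x01]), bytes([0x80, 0x60, 0x00, 0x02]))
```

When consecutive messages differ only in a few header fields, such a list is much shorter than the protocol part itself.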
Optionally, the sending module 1403 is further configured to send the first message to the third node if the first message is a deduplicated message that carries second indication information for second duplicate content, and a second original message whose payload part includes the second duplicate content exists among the historical messages sent by the first node to the third node.
Optionally, the processing module 1402 is further configured to: if the first message is a deduplicated message that carries second indication information for second duplicate content, and no second original message whose payload part includes the second duplicate content exists among the historical messages sent by the first node to the third node, obtain the second duplicate content from a second data set according to the second indication information, where the second data set includes at least part of the content of the payload parts of the historical messages received by the first node from the second node; and perform deduplication restoration on the payload part of the first message according to the second duplicate content to obtain a third message, where the payload part of the third message includes the second duplicate content. The sending module 1403 is further configured to send the third message to the third node. Here, the third message may be the message 33 described above.
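The choice between forwarding a deduplicated message unchanged and restoring it first can be sketched as follows. Modelling the history sent to the third node as a set of block hashes, and the second data set as a hash-to-block dictionary, are simplifying assumptions of this sketch, as are all names used.

```python
from typing import Dict, List, Set, Tuple


def decide_forwarding(
    indicated_hashes: List[bytes],
    sent_to_third_node: Set[bytes],
    second_data_set: Dict[bytes, bytes],
) -> Tuple[str, List[bytes]]:
    """Decide how an intermediate node forwards a deduplicated message.

    If the third node has already been sent every indicated block, it can restore
    the message itself, so the message is forwarded as is. Otherwise the blocks are
    fetched from the second data set so that the message can be restored here.
    """
    if all(h in sent_to_third_node for h in indicated_hashes):
        return "forward_deduplicated", []
    return "forward_restored", [second_data_set[h] for h in indicated_hashes]


sent = {b"h1"}                                           # hashes of blocks already sent downstream
received = {b"h1": b"block-1", b"h2": b"block-2"}        # blocks received from the second node
as_is = decide_forwarding([b"h1"], sent, received)
restored = decide_forwarding([b"h2"], sent, received)
```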
Optionally, the second duplicate content includes one or more duplicate data blocks, and the second indication information includes one or more indications in one-to-one correspondence with the one or more duplicate data blocks, where each indication indicates the hash value of the corresponding duplicate data block.
Optionally, each indication further indicates the position of the corresponding duplicate data block in the payload part of the original message corresponding to the first message, and the second data set includes the payload parts of the historical messages received by the first node from the second node. The processing module 1402 is configured to: for each indication in the second indication information, obtain, according to the position indicated by the indication, a to-be-matched data block at that position in a payload part in the second data set; calculate a hash value of the to-be-matched data block; and determine the to-be-matched data block whose hash value is consistent with the hash value indicated by the indication as the duplicate data block corresponding to the indication.
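The position-based lookup described above can be illustrated with the following sketch. The block size, the truncated SHA-256 hash and the names `block_hash` and `find_duplicate_block` are assumptions made for the example only.

```python
import hashlib
from typing import Iterable, Optional, Tuple

BLOCK_SIZE = 8  # hypothetical block size in bytes


def block_hash(block: bytes) -> bytes:
    # Truncated SHA-256; the hash function is an assumption of this sketch.
    return hashlib.sha256(block).digest()[:8]


def find_duplicate_block(
    indication: Tuple[bytes, int], stored_payloads: Iterable[bytes]
) -> Optional[bytes]:
    """Return the stored block at the indicated position whose hash matches, if any."""
    expected_hash, pos = indication
    for payload in stored_payloads:
        candidate = payload[pos:pos + BLOCK_SIZE]  # to-be-matched data block
        if len(candidate) == BLOCK_SIZE and block_hash(candidate) == expected_hash:
            return candidate
    return None


history = [b"ZZZZZZZZqqqqqqqq", b"AAAAAAAAyyyyyyyy"]  # payload parts in the second data set
found = find_duplicate_block((block_hash(b"AAAAAAAA"), 0), history)
missing = find_duplicate_block((block_hash(b"AAAAAAAA"), 8), history)
```

Because the indication carries the position, the receiver hashes only one block per stored payload instead of scanning every offset.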
Optionally, the payload part includes a protocol part and a data part, and the one or more duplicate data blocks are located in the data part. The second data set includes a correspondence between hash values of historical data blocks and the historical data blocks, where a historical data block is a data block obtained by sampling a preset position of the data part of a historical message received by the first node from the second node. The processing module 1402 is configured to determine the historical data block in the second data set that corresponds to the hash value indicated by an indication in the second indication information as the duplicate data block corresponding to that indication.
Optionally, the indication further indicates the position of the corresponding duplicate data block in the data part of the original message corresponding to the first message. The processing module 1402 is configured to: for each indication in the second indication information, add the duplicate data block corresponding to the indication at the position indicated by the indication in the data part of the first message.
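Restoration by hash lookup and re-insertion at the indicated positions can be sketched as follows. The placeholder hash keys (`b"h1"`, `b"h2"`), the dictionary representation of the second data set and the name `restore_data_part` are hypothetical and serve only this illustration.

```python
from typing import Dict, List, Tuple


def restore_data_part(
    data_part: bytes,
    indications: List[Tuple[bytes, int]],
    blocks_by_hash: Dict[bytes, bytes],
) -> bytes:
    """Re-insert each duplicate block at its position in the original data part."""
    out = bytearray(data_part)
    # Insert in ascending order of original position, so that each earlier
    # insertion shifts the remaining content back to its original offset.
    for block_hash_value, pos in sorted(indications, key=lambda ind: ind[1]):
        out[pos:pos] = blocks_by_hash[block_hash_value]
    return bytes(out)


restored = restore_data_part(
    b"xxxxxxxx",                                   # data part of the deduplicated message
    [(b"h2", 16), (b"h1", 0)],                     # indications: (hash, original position)
    {b"h1": b"AAAAAAAA", b"h2": b"CCCCCCCC"},      # hash-to-block correspondence
)
```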
Optionally, the second duplicate content further includes protocol information located in the protocol part, and the second indication information further includes a difference indication. The difference indication indicates the difference between the protocol part of the original message corresponding to the first message and the protocol part of a target message, where the target message is a historical message, among those received by the first node from the second node, whose data part shares one or more duplicate data blocks with the data part of the original message. The second data set further includes the protocol part of the message to which each historical data block belongs. The processing module 1402 is further configured to: obtain, from the second data set, the protocol part of the target message to which the one or more duplicate data blocks belong; modify the protocol part of the target message according to the difference indication; and use the modified protocol part of the target message as the protocol part of the third message.
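Applying a difference indication to the stored protocol part of the target message can be sketched as follows, assuming, purely for illustration, that the difference indication is a list of (offset, new byte) pairs. That encoding and the name `apply_diff` are not defined by the embodiment.

```python
from typing import List, Tuple


def apply_diff(target_protocol: bytes, diff: List[Tuple[int, int]]) -> bytes:
    """Rebuild a protocol part by patching the target message's protocol part."""
    out = bytearray(target_protocol)
    for offset, value in diff:
        out[offset] = value  # overwrite the byte that differs
    return bytes(out)


# Protocol part of the target message, patched at offset 3 to obtain
# the protocol part of the third message.
third_protocol = apply_diff(bytes([0x80, 0x60, 0x00, 0x01]), [(3, 0x02)])
```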
Optionally, the first node stores one or more local flow grouping sets, each of which includes the flow identifiers of multiple flows passing through the first node. The processing module 1402 is further configured to: after determining that the one or more local flow grouping sets contain a target flow grouping set that includes the flow identifier of the flow to which the first message belongs, parse the payload part of the first message; if the payload part of the first message carries a deduplication mark, determine that the first message is a deduplicated message; and if the payload part of the first message does not carry a deduplication mark, determine that the first message is a non-deduplicated message.
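The classification described above can be sketched as follows. The one-byte mark `DEDUP_MARK` at the start of the payload, the string flow identifiers and the returned labels are assumptions of this sketch; the embodiment does not fix the encoding of the deduplication mark.

```python
from typing import List, Set

DEDUP_MARK = b"\xDD"  # hypothetical deduplication mark at the start of the payload part


def classify(flow_id: str, payload: bytes, local_groups: List[Set[str]]) -> str:
    """Classify a received message, parsing its payload only for grouped flows."""
    if not any(flow_id in group for group in local_groups):
        return "not_grouped"  # payload is not parsed at all
    if payload.startswith(DEDUP_MARK):
        return "deduplicated"
    return "not_deduplicated"


groups = [{"flow-1", "flow-2"}]
r1 = classify("flow-1", DEDUP_MARK + b"rest", groups)
r2 = classify("flow-2", b"plain payload", groups)
r3 = classify("flow-9", DEDUP_MARK + b"rest", groups)
```

Checking the flow grouping first means that payloads of unrelated flows are never parsed.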
Optionally, the processing module 1402 is further configured to add, to the same local flow grouping set, the flow identifiers of those flows, among multiple received messages belonging to different flows, whose messages have duplicate content in their payload parts, where the senders of the different flows are all the SFU server.
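A simplified sketch of building local flow grouping sets is given below. It treats two messages as having duplicate content only when their whole payloads are identical, which is a deliberate simplification of the block-level comparison described in the embodiments; the function name and the flow identifiers are hypothetical.

```python
import hashlib
from typing import Dict, Iterable, List, Set, Tuple


def group_flows(messages: Iterable[Tuple[str, bytes]]) -> List[Set[str]]:
    """Group the flow identifiers of messages whose payloads are identical."""
    flows_by_content: Dict[bytes, Set[str]] = {}
    for flow_id, payload in messages:
        key = hashlib.sha256(payload).digest()
        flows_by_content.setdefault(key, set()).add(flow_id)
    # Only content seen on at least two different flows forms a grouping set.
    return [flows for flows in flows_by_content.values() if len(flows) > 1]


local_groups = group_flows([
    ("flow-1", b"frame-0001"),
    ("flow-2", b"frame-0001"),  # same content as flow-1, e.g. the same user's video
    ("flow-3", b"other-data"),
])
```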
Optionally, the sending module 1403 is further configured to send first grouping information to the second node, where the first grouping information includes a correspondence between the node identifier of the first node and the one or more local flow grouping sets.
Optionally, the first node stores one or more lower-level flow grouping sets corresponding to the third node, each of which includes the flow identifiers of multiple flows passing through the third node. The processing module 1402 is further configured to: after the first node receives the first message sent by the second node, if the lower-level flow grouping sets corresponding to the third node contain a target flow grouping set that includes the flow identifier of the flow to which the first message belongs, determine whether the target historical messages sent to the third node include a message whose payload part has duplicate content with the payload part of the first message, where the flow identifier of the flow to which each target historical message belongs is in the target flow grouping set. The sending module 1403 is further configured to send the first message to the third node if none of the lower-level flow grouping sets corresponding to the third node includes the flow identifier of the flow to which the first message belongs.
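The use of lower-level flow grouping sets to narrow the search for duplicate content can be sketched as follows. The representation of the history as (flow identifier, payload) pairs and the name `candidate_history` are assumptions of this illustration.

```python
from typing import List, Optional, Set, Tuple


def candidate_history(
    flow_id: str,
    lower_groups: List[Set[str]],
    history: List[Tuple[str, bytes]],
) -> Optional[List[bytes]]:
    """Select the historical payloads worth comparing against the first message.

    Returns None when no lower-level flow grouping set contains the flow, in which
    case the message is simply forwarded to the third node unchanged.
    """
    target_group = next((g for g in lower_groups if flow_id in g), None)
    if target_group is None:
        return None
    return [payload for fid, payload in history if fid in target_group]


lower_groups = [{"flow-1", "flow-2"}]                     # grouping sets reported by the third node
history = [("flow-1", b"frame-a"), ("flow-9", b"frame-b")]  # messages already sent to the third node
to_compare = candidate_history("flow-2", lower_groups, history)
ungrouped = candidate_history("flow-3", lower_groups, history)
```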
Optionally, the receiving module 1401 is further configured to receive second grouping information sent by the third node, where the second grouping information includes a correspondence between the node identifier of the third node and the one or more lower-level flow grouping sets.
Optionally, the sending module 1403 is further configured to: after a fourth message whose destination port number is the SFU service port number is received, send a first node discovery message to a fourth node, where the fourth node is the next hop of the fourth message at the first node, the destination of the fourth message is the SFU server, the first node discovery message carries the identifier of the SFU server, and the first node discovery message indicates that the first node is a lower-level node of the fourth node on the transmission path starting from the SFU server. The processing module 1402 is further configured to determine, in response to receiving a first node discovery response message that is sent by the fourth node and corresponds to the first node discovery message, that the fourth node supports data deduplication. Here, the fourth node may be the node 34 described above, the fourth message may be the message 34 described above, the first node discovery message may be the node discovery message 31 described above, and the first node discovery response message may be the node discovery response message 31 described above.
Optionally, the receiving module 1401 is further configured to receive a second node discovery message sent by a fifth node, where the second node discovery message carries the identifier of the SFU server and indicates that the fifth node is an upper-level node of the first node on the transmission path starting from the SFU server. The processing module 1402 is further configured to determine, according to the second node discovery message, that the fifth node supports data deduplication. The sending module 1403 is further configured to send, to the fifth node, a second node discovery response message corresponding to the second node discovery message, where the second node discovery response message indicates that the first node supports data deduplication. Here, the fifth node may be the node 35 described above, the second node discovery message may be the node discovery message 32 described above, and the second node discovery response message may be the node discovery response message 32 described above.
Optionally, the sending module 1403 is further configured to: after a fifth message whose source port number is the SFU service port number is received, send a third node discovery message to a sixth node, where the sixth node is the next hop of the fifth message at the first node, the sender of the fifth message is the SFU server, the third node discovery message carries the identifier of the SFU server, and the third node discovery message indicates that the first node is an upper-level node of the sixth node on the transmission path starting from the SFU server. The processing module 1402 is further configured to determine, in response to receiving a third node discovery response message that is sent by the sixth node and corresponds to the third node discovery message, that the sixth node supports data deduplication. Here, the sixth node may be the node 36 described above, the fifth message may be the message 35 described above, the third node discovery message may be the node discovery message 33 described above, and the third node discovery response message may be the node discovery response message 33 described above.
Optionally, the receiving module 1401 is further configured to receive a fourth node discovery message sent by a seventh node, where the fourth node discovery message carries the identifier of the SFU server and indicates that the seventh node is a lower-level node of the first node on the transmission path starting from the SFU server. The processing module 1402 is further configured to determine, according to the fourth node discovery message, that the seventh node supports data deduplication. The sending module 1403 is further configured to send, to the seventh node, a fourth node discovery response message corresponding to the fourth node discovery message, where the fourth node discovery response message indicates that the first node supports data deduplication. Here, the seventh node may be the node 37 described above, the fourth node discovery message may be the node discovery message 34 described above, and the fourth node discovery response message may be the node discovery response message 34 described above.
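The node discovery exchange described in the preceding paragraphs can be sketched as follows. The message fields, the `"upper"`/`"lower"` role labels and the class and method names are hypothetical; the embodiments do not specify a message format.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Discovery:
    sfu_id: str       # identifier of the SFU server
    sender: str       # node that sends the discovery message
    sender_role: str  # "upper" or "lower": role of the sender relative to the receiver


@dataclass(frozen=True)
class DiscoveryResponse:
    sfu_id: str
    sender: str
    supports_dedup: bool = True


class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        # (SFU identifier, peer name) -> role of that peer relative to this node
        self.dedup_peers: Dict[Tuple[str, str], str] = {}

    def on_discovery(self, msg: Discovery) -> DiscoveryResponse:
        # A peer that sends a discovery message is taken to support deduplication.
        self.dedup_peers[(msg.sfu_id, msg.sender)] = msg.sender_role
        return DiscoveryResponse(msg.sfu_id, self.name)

    def on_response(self, resp: DiscoveryResponse, peer_role: str) -> None:
        # A response confirms that the peer supports deduplication.
        if resp.supports_dedup:
            self.dedup_peers[(resp.sfu_id, resp.sender)] = peer_role


first, fifth = Node("node-1"), Node("node-5")
# The fifth node announces itself as the upper-level node of the first node.
response = first.on_discovery(Discovery("sfu-A", "node-5", "upper"))
fifth.on_response(response, "lower")
```

After the exchange, each node knows that its neighbour on the path from the SFU server supports data deduplication and whether that neighbour is upstream or downstream.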
Regarding the apparatus in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method and is not elaborated here.
The basic hardware structure involved in the embodiments of the present application is described below by way of example.
For example, FIG. 15 is a schematic diagram of the hardware structure of a communication device provided in an embodiment of the present application. As shown in FIG. 15, the communication device 1500 includes a processor 1501 and a memory 1502, and the processor 1501 is connected to the memory 1502 via a bus 1503. FIG. 15 illustrates the processor 1501 and the memory 1502 as independent of each other. Optionally, the processor 1501 and the memory 1502 are integrated together. The communication device 1500 may be, for example, a network device or a server.
The memory 1502 is configured to store a computer program, which includes an operating system and program code. The memory 1502 may be any of various types of storage media, for example, a read-only memory (ROM), a random access memory (RAM), an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM), a flash memory, an optical memory, a register, optical disc storage, compact disc storage, a magnetic disk or another magnetic storage device.
The processor 1501 is a general-purpose processor or a special-purpose processor. The processor 1501 may be a single-core processor or a multi-core processor. The processor 1501 includes at least one circuit to execute the data transmission method provided in the embodiments of the present application, for example, the steps performed by the node 11, the node 21 or the node 31 in the above method embodiments.
Optionally, the communication device 1500 further includes a network interface 1504, which is connected to the processor 1501 and the memory 1502 via the bus 1503. The network interface 1504 enables the communication device 1500 to communicate with other devices.
Optionally, the communication device 1500 further includes an input/output (I/O) interface 1505, which is connected to the processor 1501 and the memory 1502 via the bus 1503. The processor 1501 can receive input commands, data and the like through the I/O interface 1505. The I/O interface 1505 is used to connect the communication device 1500 to input devices such as a keyboard or a mouse. Optionally, in some possible scenarios, the network interface 1504 and the I/O interface 1505 are collectively referred to as a communication interface.
Optionally, the communication device 1500 further includes a display 1506, which is connected to the processor 1501 and the memory 1502 via the bus 1503. The display 1506 can be used to display intermediate results and/or final results generated when the processor 1501 executes the above method. In a possible implementation, the display 1506 is a touch display screen that provides a human-computer interaction interface.
The bus 1503 is any type of communication bus used to interconnect the internal components of the communication device 1500, for example, a system bus. The embodiments of the present application take interconnection of the above components inside the communication device 1500 through the bus 1503 as an example. Optionally, the above components inside the communication device 1500 are communicatively connected to each other in a manner other than the bus 1503, for example, through a logical interface inside the communication device 1500.
The above components may be arranged on separate chips, or at least some or all of them may be arranged on the same chip. Whether the components are arranged independently on different chips or integrated on one or more chips often depends on the needs of the product design. The embodiments of the present application do not limit the specific implementation form of the above components.
The communication device 1500 shown in FIG. 15 is merely exemplary. In an implementation, the communication device 1500 may also include other components, which are not listed here one by one. The communication device 1500 shown in FIG. 15 may implement data transmission by executing all or some of the steps of the methods provided in the above embodiments.
An embodiment of the present application further provides a data transmission system, including an SFU server and multiple nodes in a communication network. The multiple nodes include a first node and a second node, and the first node is located between the SFU server and the second node. The SFU server is configured to send messages to the first node, the first node is configured to execute the steps executed by the node 11 in the above method embodiments, and the second node is configured to execute the steps executed by the node 21 in the above method embodiments.
Optionally, the multiple nodes further include a third node located between the first node and the second node. The third node is configured to execute the steps executed by the node 31 in the above method embodiments.
An embodiment of the present application further provides another data transmission system, including an SFU server and a first node in a communication network. The SFU server is configured to execute the steps executed by the node 11 in the above method embodiments, and the first node is configured to execute the steps executed by the node 21 in the above method embodiments.
Optionally, the communication network further includes a second node located between the SFU server and the first node. The second node is configured to execute the steps executed by the node 31 in the above method embodiments.
An embodiment of the present application further provides a communication node, including a processor and a memory. The memory is configured to store a computer program, and the computer program includes program instructions. The processor is configured to call the computer program to implement the steps performed by the node 11, the node 21 or the node 31 in the above method embodiments.
An embodiment of the present application further provides a computer-readable storage medium storing instructions that, when executed by a processor, implement the steps executed by the node 11, the node 21 or the node 31 in the above method embodiments.
An embodiment of the present application further provides a computer program product, including a computer program that, when executed by a processor, implements the steps executed by the node 11, the node 21 or the node 31 in the above method embodiments.
A person of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disc or the like.
In the embodiments of the present application, the terms "first", "second" and "third" are used for descriptive purposes only and should not be understood as indicating or implying relative importance.
The term "and/or" in this application merely describes an association between associated objects and indicates that three relationships are possible. For example, A and/or B can represent three cases: A exists alone, A and B exist at the same time, and B exists alone. In addition, the character "/" in this document generally indicates an "or" relationship between the objects before and after it.
It should be noted that the information (including but not limited to user device information and user personal information), data (including but not limited to data used for analysis, stored data and displayed data) and signals involved in this application are all authorized by the user or fully authorized by all parties, and the collection, use and processing of the relevant data must comply with the relevant laws, regulations and standards of the relevant countries and regions.
The above description covers only optional embodiments of the present application and is not intended to limit the present application. Any modification, equivalent replacement or improvement made within the concept and principles of the present application shall fall within the protection scope of the present application.
Claims (63)
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211233819.3 | 2022-10-10 | ||
| CN202211233819 | 2022-10-10 | ||
| CN202211643989.9 | 2022-12-20 | ||
| CN202211643989.9A CN117880200A (en) | 2022-10-10 | 2022-12-20 | Data transmission method, device and system |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024078018A1 (en) | 2024-04-18 |
Family
ID=90590600
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2023/103315 Ceased WO2024078018A1 (en) | 2022-10-10 | 2023-06-28 | Data transmission method, apparatus and system |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN117880200A (en) |
| WO (1) | WO2024078018A1 (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN120075314A (en) * | 2025-04-29 | 2025-05-30 | 北京网藤科技有限公司 | Supervisor method and system for realizing communication between industrial protocols |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119628904B (en) * | 2024-11-28 | 2025-11-04 | 南京大学 | Data Deduplication System and Method Based on Secure IP Tunnels in Internet Networks |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101779421A (en) * | 2007-06-19 | 2010-07-14 | 松下电器产业株式会社 | Header size reductions of data packets |
| US20200167091A1 (en) * | 2018-11-27 | 2020-05-28 | Commvault Systems, Inc. | Using interoperability between components of a data storage management system and appliances for data storage and deduplication to generate secondary and tertiary copies |
| CN111324774A (en) * | 2020-02-26 | 2020-06-23 | 腾讯科技(深圳)有限公司 | Video duplicate removal method and device |
| CN113709510A (en) * | 2021-08-06 | 2021-11-26 | 联想(北京)有限公司 | High-speed data real-time transmission method and device, equipment and storage medium |
| CN114244781A (en) * | 2021-12-20 | 2022-03-25 | 苏州盛科通信股份有限公司 | DPDK-based message deduplication processing method and device |
Also Published As
| Publication number | Publication date |
|---|---|
| CN117880200A (en) | 2024-04-12 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN112119616B (en) | Packet replication method, apparatus and computer readable storage medium related to in-place operation implementation and management (IOAM) | |
| US8023419B2 (en) | Remote monitoring of real-time internet protocol media streams | |
| CN103748835B (en) | The dynamic renewal of label switched path | |
| US8184628B2 (en) | Network based multicast stream duplication and merging | |
| US20130259042A1 (en) | Multicast packet transmission | |
| CN113489652B (en) | Data stream amplifying method and device, converging shunt and storage medium | |
| CN100505897C (en) | Route device, terminal equipment, communication system and routing method | |
| CN101924701B (en) | Building method of multicast forwarding path and route equipment | |
| CN102215172B (en) | A kind of method and system for realizing cross-virtual private local area network multicast | |
| CN104618237B (en) | A kind of wide area network acceleration system and method based on TCP/UDP | |
| JP2001345847A (en) | Packet data transfer method and packet data transfer device | |
| WO2024078018A1 (en) | Data transmission method, apparatus and system | |
| CN101459606A (en) | Extranet networking method, system and device for multicast VPN | |
| CN109275044B (en) | System for realizing flexible scheduling of IP multicast stream | |
| CN112134776B (en) | Method for generating multicast forwarding table item and access gateway | |
| WO2019127134A1 (en) | Data transmission method and virtual switch | |
| CN112866002B (en) | Multicast traffic oriented in-band telemetry method, switching device node and computer readable storage medium | |
| CN105144639A (en) | Efficient multicast delivery to dually connected (VPC) hosts in overlay networks | |
| FR2924557A1 (en) | METHOD OF ROUTING MESSAGES OVER A NETWORK AND SYSTEM FOR IMPLEMENTING THE METHOD | |
| CN110999230B (en) | Method, network equipment and system for transmitting multicast message | |
| CN110115011B (en) | Multicast service processing method and access device | |
| CN104113513A (en) | Host computer discovering method, device and system | |
| US8068515B2 (en) | Faster multimedia synchronization of broadcast streams using router caching of RTCP packets | |
| CN110868353B (en) | A message processing method and device | |
| US12368677B2 (en) | Dual internet protocol (IP) network input reference validation |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23876228; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 23876228; Country of ref document: EP; Kind code of ref document: A1 |