WO2017028545A1 - Port queue congestion monitoring method and system - Google Patents
- Publication number: WO2017028545A1 (application PCT/CN2016/078670)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- port queue
- state
- abnormal
- port
- queue
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Definitions
- the present invention relates to the field of network communication technologies, and in particular, to a method and system for monitoring port queue congestion.
- IPTV Internet Protocol Television
- VoIP Voice over Internet Protocol
- FTP File Transfer Protocol
- Traffic shaping and congestion management are implemented by the H-QoS (hierarchical quality of service) queue technology of the Traffic Management (TM) chip hardware.
- H-QoS hierarchical quality of service
- TM Traffic Management
- The existing H-QoS queuing technology does not provide a monitoring mechanism for port queue congestion. Port queue congestion therefore cannot be detected in time, let alone in advance, and a blocked port queue cannot be accurately identified. As a result, packet sending on the port queue cannot be restored quickly, and users are unable to use network applications for a long time.
- The main purpose of the present invention is to provide a method and system for monitoring port queue congestion, so as to solve the problem that port queue congestion cannot be monitored and accurately determined, that packet sending on the port queue cannot be quickly restored, and that users therefore cannot use network applications for a long time.
- the present invention provides a method for monitoring port queue congestion, which is applied to a carrier-class router, and the router includes a plurality of ports, and the monitoring method for the port queue congestion includes:
- extending the current monitoring period by a preset delay duration to determine whether the second sending state of the port queue meets the abnormal sending state condition;
- determining whether the buffer status of the currently polled port queue meets a preset abnormal cache state condition includes:
- determining whether the first sending state of the port queue meets a preset abnormal sending state condition comprises:
- the first packet count corresponding to the port queue sent in the current monitoring period and in the previous monitoring period is obtained.
- extending the current monitoring period by a preset delay duration to determine whether the second sending state of the port queue satisfies the abnormal sending state condition includes:
- determining that the port queue has a jam includes:
- the port queue is enabled to enable the port queue to send packets.
- determining whether the buffer status of the currently polled port queue meets the preset abnormal cache state condition includes:
- a monitoring thread of the port queue is created and polling of the port queue jam is performed on all ports on the router in the preset monitoring period.
- the present invention further provides a monitoring system for port queue congestion, which is applied to a carrier-grade router, the router includes a plurality of ports, and the monitoring system for blocking the port queue includes:
- a buffer status determining module configured to determine, when the current monitoring period arrives, whether a buffer status of the currently polled port queue meets a preset abnormal cache status condition
- a first sending state determining module configured to determine, when the buffering state of the port queue meets the abnormal cache state condition, whether the first sending state of the port queue meets a preset abnormal sending state condition
- a second sending state determining module configured to, when the first sending state of the port queue meets the abnormal sending state condition, extend the current monitoring period by a preset delay duration to determine whether a second sending state of the port queue satisfies the abnormal sending state condition;
- the congestion determination module is configured to determine that the port queue is blocked when the second transmission state of the port queue satisfies the abnormal transmission status condition.
- the cache status determining module includes:
- a depth value obtaining unit configured to acquire a real-time depth value of the port queue when the current monitoring period arrives, where the depth value is used to measure a buffered amount of packet storage in the port queue;
- a depth value judging unit configured to determine whether the real-time depth value of the port queue is greater than or equal to a preset depth threshold to determine whether the cache state of the port queue satisfies the abnormal cache state condition
- the cache state determining unit is configured to determine, when the real-time depth value of the port queue is greater than or equal to the depth threshold, that the cache state of the port queue is the abnormal cache state.
- the first sending state determining module includes:
- a packet count first acquiring unit configured to, when the buffer state of the port queue meets the abnormal cache state condition, acquire a first packet count and a second packet count sent by the port queue in the current monitoring period and in the previous monitoring period, respectively, wherein the packet counts are kept by high-order and low-order dual counters;
- a packet count first comparing unit configured to compare whether the high and low bits of the first packet count are respectively equal to the high and low bits of the second packet count, to determine whether the first sending state of the port queue satisfies the abnormal sending state condition;
- a first sending state determining unit configured to determine, when the high and low bits of the first packet count are respectively equal to the high and low bits of the second packet count, that the first sending state of the port queue is the abnormal sending state.
- the second sending state determining module includes:
- a delay unit configured to extend the current monitoring period by the delay duration when the first sending state of the port queue satisfies the abnormal sending state condition
- a second counting unit configured to acquire, when the delay duration arrives, a third packet count sent by the port queue in the last monitoring period and the delay duration;
- a packet count second comparing unit configured to compare whether the high and low bits of the first packet count are respectively equal to the high and low bits of the third packet count, to determine whether the second sending state of the port queue satisfies the abnormal sending state condition;
- a second sending state determining unit configured to determine, when the high and low bits of the first packet count are respectively equal to the high and low bits of the third packet count, that the second sending state of the port queue is the abnormal sending state.
- the monitoring system for blocking the port queue further includes:
- a port queue closing module configured to close the port queue when it is detected that the port queue is blocked
- a buffered message clearing module configured to clear the buffered message in the port queue and retain relevant configuration parameters required for resetting the port queue
- the port queue reset module is configured to reset the port queue according to the retained related configuration parameter to restore an initial state corresponding to when the port queue does not send a message;
- the port queue enabling module is configured to enable the port queue to enable the port queue to send packets after the port queue is restored to the initial state.
- the monitoring system for blocking the port queue further includes:
- the monitoring thread creation module is configured to create a monitoring thread of the port queue and perform polling monitoring of the port queue congestion on all ports on the router in the preset monitoring period.
- An embodiment of the present invention further provides a monitoring device for port queue congestion, which is applied to a carrier-grade router including a plurality of ports, including:
- a memory for storing processor executable instructions
- processor is configured to:
- extending the current monitoring period by a preset delay duration to determine whether the second sending state of the port queue satisfies the abnormal sending state condition;
- Embodiments of the present invention also provide a non-transitory computer readable storage medium having stored therein instructions that, when executed by a processor of a carrier-grade router including a plurality of ports, cause the router to implement a method of monitoring port queue congestion, comprising the steps of:
- extending the current monitoring period by a preset delay duration to determine whether the second sending state of the port queue meets the abnormal sending state condition;
- The invention monitors all port queues for congestion by periodic polling, so a congestion problem can be pinned down precisely to one or several port queues, and recovery operations need to be performed only on those port queues, avoiding any impact on other, normal port queues.
- By monitoring the cache status of the port queue, it is possible to determine in time, or even in advance, whether the port queue has a potential congestion risk, and then to monitor the sending status of the port queue to determine whether it is currently blocked.
- The delay is used to further confirm whether the current port queue is blocked, which makes the congestion judgment more accurate. Blocked port queues can thus be identified promptly and accurately and then handled in time, reducing the negative impact of port queue congestion.
- FIG. 1 is a schematic flowchart of a first embodiment of a method for monitoring port queue congestion according to the present invention
- FIG. 2 is a schematic diagram of a refinement process of step S110 in FIG. 1;
- FIG. 3 is a schematic diagram of a refinement process of step S120 in FIG. 1;
- FIG. 4 is a schematic diagram of a refinement process of step S130 in FIG. 1;
- FIG. 5 is a schematic flowchart of a second embodiment of a method for monitoring port queue congestion according to the present invention.
- FIG. 6 is a schematic flowchart diagram of a third embodiment of a method for monitoring port queue congestion according to the present invention.
- FIG. 7 is a schematic diagram of functional modules of a first embodiment of a monitoring system for port queue congestion according to the present invention.
- FIG. 8 is a schematic diagram of a refinement function module of the cache state determination module in FIG. 7;
- FIG. 9 is a schematic diagram of a refinement function module of the first transmission state determining module in FIG. 7;
- FIG. 10 is a schematic diagram of a refinement function module of the second transmission state determining module in FIG. 7;
- FIG. 11 is a schematic diagram of functional modules of a second embodiment of a monitoring system for port queue congestion according to the present invention.
- FIG. 12 is a schematic diagram of functional modules of a third embodiment of a monitoring system for port queue congestion according to the present invention.
- FIG. 1 is a schematic flowchart diagram of a first embodiment of a method for monitoring port queue congestion according to the present invention.
- the monitoring method for the port queue congestion includes:
- Step S110 When the current monitoring period arrives, determine whether the buffer status of the currently polled port queue satisfies a preset abnormal cache state condition
- The monitoring method for port queue congestion is applied to a carrier-class router. The port queue corresponds to the root node of the H-QoS tree in the H-QoS queue technology of the downlink TM (Traffic Management) chip hardware used in carrier-class routers on current networks.
- Middle-tier queues sit between the root node of the tree (the port queue) and the user queues. User packets pass from the user queues to the middle-tier queues, which are then aggregated into the port queue. Finally, the port queue sends the packets to the corresponding port on the interface daughter card, which can be connected externally to an optical fiber.
- routers typically include several ports, so all ports on the router need to be monitored.
- So as not to reduce the router's efficiency, all ports are monitored by polling, and the monitoring period is the time taken to complete one polling round over all ports.
- The port queue monitoring period needs to be set according to actual conditions, for example to 100 milliseconds. If the monitoring period is set too large, a blocked port queue will not be detected quickly, and the traffic interruption will last too long. If it is set too small, the hardware and software overhead of port queue monitoring increases, which affects the normal operation of the router line card.
- The polling interval between two adjacent ports is related to the monitoring period. The intervals may be the same or different and may be set according to actual needs.
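As a rough illustration of how the polling interval relates to the monitoring period, the following sketch spreads the polls of all ports evenly across one period. The function name `poll_offsets` and the even spacing are assumptions made for illustration only; the description also allows unequal intervals.

```python
from typing import List


def poll_offsets(period_ms: float, num_ports: int) -> List[float]:
    """Start offset (ms) of each port's poll within one monitoring period.

    Assumes evenly spaced polls; unequal intervals are equally permitted.
    """
    if num_ports <= 0:
        raise ValueError("num_ports must be positive")
    step = period_ms / num_ports  # interval between two adjacent ports
    return [i * step for i in range(num_ports)]


# Example: the 100 ms period mentioned in the text and a hypothetical 4 ports.
offsets = poll_offsets(100, 4)
```

With these example values, adjacent ports are polled 25 ms apart.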
- the port queue blocking monitoring of any one of the ports is described in the embodiment, and the descriptions of the other ports are the same, and therefore are not described herein.
- the currently polled port queue specifically refers to the port queue that starts polling when the current monitoring period arrives.
- The cache status of a port queue refers to the state of the amount of packets buffered in the port queue, for example empty, not full (that is, not blocked), or full (at risk of congestion).
- The preset abnormal cache state condition is used to identify an abnormal cache state: if the condition is met, the port queue is in the abnormal cache state; if it is not met, the port queue is in the normal cache state, and monitoring simply waits for the next monitoring period.
- The abnormal cache state is defined as a state in which there is a congestion risk, that is, the cache state of a queue that is about to become blocked. The preset abnormal cache state condition is defined to match this state and is set according to actual needs; for example, the condition may be based on the amount buffered in the port queue. Therefore, when the cache status of the currently polled port queue satisfies the preset condition, it can at least be determined that this port queue is at risk of congestion.
- Step S120 When the buffering state of the port queue meets the abnormal cache state condition, determine whether the first sending state of the port queue meets a preset abnormal sending state condition;
- When the cache state of the currently polled port queue satisfies the preset abnormal cache state condition, that is, when this port queue is at least at risk of congestion, it is necessary to determine further whether it is actually blocked. This is done by checking whether the first sending state of the port queue satisfies a preset abnormal sending state condition.
- The sending state of the port queue ("first" is only a naming distinction) is judged against the preset abnormal sending state condition. If the condition is met, the port queue is in the abnormal sending state. If it is not met, the port queue is in the normal sending state, and monitoring waits for the next monitoring period.
- The abnormal sending state is defined as a state in which no packet is sent. The preset abnormal sending state condition is defined to match this state and is set according to actual needs; for example, the condition may be based on the packet count of the port queue. Therefore, when the sending state of the currently polled port queue satisfies the condition, it can be provisionally determined that the port queue is blocked.
- Step S130 When the first sending state of the port queue meets the abnormal sending state condition, the current monitoring period is extended by a preset delay duration to determine whether the second sending state of the port queue satisfies the abnormal sending state condition;
- Step S140 When the second sending state of the port queue satisfies the abnormal sending state condition, it is determined that the port queue is blocked.
- When the first sending state of the currently polled port queue meets the preset abnormal sending state condition, the congestion judgment may still be wrong. For example, a packet may enter the port queue at the critical moment when the monitoring period arrives but not yet have been sent out of the port queue, so it has not been counted, which leads to a false positive.
- The current monitoring period is therefore extended by the preset delay duration, and the sending state of the currently polled port queue ("second" is only a naming distinction) is checked again against the preset abnormal sending state condition. If the condition is met again, the port queue is confirmed not to be sending packets, that is, the currently polled port queue is determined to be blocked. Otherwise, it is determined that the currently polled port queue is not blocked, and monitoring waits for the next monitoring period.
- All port queues are monitored for congestion by periodic polling, so a congestion problem can be pinned down precisely to one or several port queues, and recovery operations need to be performed only on those port queues, avoiding any impact on other, normal port queues.
- by monitoring the cache status of the port queue it is possible to determine whether the port queue has a potential congestion risk in time or even in advance, and further monitor the transmission status of the port queue to determine whether there is currently a jam.
- The delay is used to further confirm whether the current port queue is blocked, which further improves the accuracy of the congestion judgment. Blocked port queues can thus be identified promptly and accurately and then handled in time, reducing the negative impact of port queue congestion.
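The three-stage decision of steps S110 to S140 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `QueueSample`, `is_blocked`, the `read` and `wait` callbacks, and the use of a single integer for the packet count are all hypothetical names and simplifications introduced here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class QueueSample:
    depth: int        # real-time depth of the port queue (buffered amount)
    sent_count: int   # cumulative sent-packet count, read without clearing


def is_blocked(read: Callable[[], QueueSample], prev_count: int,
               depth_threshold: int, wait: Callable[[], None]) -> bool:
    """Return True only if the queue fails every check in S110 to S130."""
    sample = read()
    if sample.depth < depth_threshold:    # S110: cache state is normal
        return False
    if sample.sent_count != prev_count:   # S120: packets were sent
        return False
    wait()                                # S130: extend the period by the delay
    recheck = read()
    # S140: still no packet sent after the delay, so the queue is blocked
    return recheck.sent_count == sample.sent_count
```

Passing `read` and `wait` as callbacks keeps the sketch independent of any particular hardware interface or timer.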
- FIG. 2 is a schematic diagram of the refinement process of step S110 in FIG. 1. Based on the above embodiment, in this embodiment, step S110 includes:
- step S1101 when the current monitoring period arrives, the real-time depth value of the port queue is obtained, where the depth value is used to measure the amount of packet buffered in the port queue.
- Step S1102 Determine whether the real-time depth value of the port queue is greater than or equal to a preset depth threshold to determine whether the cache state of the port queue satisfies the abnormal cache state condition;
- Step S1103 When the real-time depth value of the port queue is greater than or equal to the depth threshold, determine that the cache state of the port queue is the abnormal cache state.
- The depth value of the port queue measures the amount of packets buffered in the port queue. The real-time depth value of the port queue is obtained and compared with the preset depth threshold to determine the cache state of the port queue, that is, whether a certain amount of packets waiting to be sent has accumulated in the port queue.
- The preset depth threshold must be smaller than the cache size of the port queue; otherwise, the real-time depth value does not exceed the threshold even when the port queue is blocked. Nor may the threshold be set too large; otherwise, a blocked port queue cannot be detected quickly, and the traffic interruption lasts too long. In this embodiment, the preset depth threshold is therefore set according to the actual situation, for example at the kilobyte level. In addition, the port queue caches of all ports on a typical router are allocated by default with the same size, so a single preset depth threshold suffices for all the port queues on the router.
- The abnormal cache state condition is preferably that the real-time depth value of the port queue is greater than or equal to the preset depth threshold. If the real-time depth value is less than the threshold, the port queue is in the normal cache state, that is, it is not blocked. If the real-time depth value is greater than or equal to the threshold, the amount of packets buffered in the port queue has reached the threshold, the port queue is in the abnormal cache state, and it is at risk of congestion. Setting the depth threshold makes it possible to detect the risk of port queue congestion in time and then intervene on the port queue at risk, minimizing the impact of port queue congestion.
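The depth check of steps S1101 to S1103, together with the constraint that the threshold stay below the queue's cache size, might look like this. The function names and the byte values in the example are assumptions for illustration.

```python
def validate_depth_threshold(threshold: int, queue_capacity: int) -> None:
    """Reject thresholds that could never flag a blocked queue."""
    if not 0 < threshold < queue_capacity:
        raise ValueError("threshold must be positive and below the queue cache size")


def cache_state_is_abnormal(depth: int, threshold: int) -> bool:
    """S1102/S1103: abnormal when the real-time depth reaches the threshold."""
    return depth >= threshold


# Hypothetical values: a kilobyte-level threshold against a 64 KiB queue cache.
validate_depth_threshold(4096, 65536)
at_risk = cache_state_is_abnormal(depth=4096, threshold=4096)
```

Because all port queues are described as having the same default cache size, one validated threshold can be reused for every port.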
- FIG. 3 is a schematic diagram of the refinement process of step S120 in FIG. 1. Based on the above embodiment, in this embodiment, step S120 includes:
- Step S1201 When the cache state of the port queue meets the abnormal cache state condition, a first packet count and a second packet count sent by the port queue in the current monitoring period and in the previous monitoring period, respectively, are obtained, wherein the packet counts are kept by high-order and low-order dual counters;
- When the cache state of the currently polled port queue satisfies the abnormal cache state condition, that is, when the port queue is determined to be at risk of congestion, it is necessary to determine further whether the port queue is actually blocked, namely whether its sending state meets the preset abnormal sending state condition.
- the abnormal transmission state is preferably that the number of packets sent in the two adjacent monitoring periods is equal, that is, no packets are sent in the two adjacent monitoring periods.
- the two adjacent monitoring periods specifically refer to the current monitoring period and the previous monitoring period of the current monitoring period.
- The first packet count sent in the current monitoring period (denoted s1) and the second packet count sent in the previous monitoring period (denoted s0) are read, in a non-clearing read mode, from the preset dual counter used for packet count statistics.
- The non-clearing read mode means that the count data is not cleared when it is read.
- The dual counter used for packet count statistics may directly obtain the sent-packet count of the port queue provided by the TM chip hardware;
- Alternatively, because the packets sent by the TM chip go to the corresponding port on the interface subcard, the dual counter in this embodiment may obtain the number of packets received by the port on the interface subcard that corresponds to the port queue, and this number is used as the number of packets sent by the port queue.
- Because an existing single counter can overflow and wrap around within a short time across two consecutive readings of the sent-packet count, the sent-packet counts are preferably kept by high-order and low-order dual counters when determining whether the counts for two adjacent monitoring periods are equal.
- This embodiment specifically uses high-order and low-order dual counters to count the packets sent by the port queue.
- the double counter is the upper 8 bits and the lower 8 bits respectively
- the double counter is the upper 16 bits and the lower 16 bits respectively
- the double counter is the upper 32 bits and the lower 32 bits respectively.
- Step S1202 Comparing whether the upper and lower bits of the first packet count are respectively equal to the upper and lower bits of the second packet count to determine whether the first transmission state of the port queue satisfies the abnormal transmission state condition;
- Step S1203 When the upper and lower bits of the first packet count are respectively equal to the high and low bits of the second packet count, determining that the first sending state of the port queue is the abnormal sending state.
- s1 is compared with s0 to determine the sending state of the port queue, that is, whether the port queue is sending packets. If it is sending packets, it is in the normal sending state; otherwise, it is in the abnormal sending state.
- This embodiment does not compare the magnitudes of s1 and s0; it preferably compares only whether s1 and s0 are equal. Since the packet count in this embodiment is kept by high-order and low-order dual counters, the high bits and the low bits of s1 and s0 must each be compared: if the high bits of s1 and s0 are not equal and/or the low bits are not equal, the port queue has sent packets; if both the high bits and the low bits of s1 and s0 are equal, the port queue sent no packets across the two consecutive counts.
- the abnormal transmission status condition is that the port queue does not send a message within two consecutive counting times, that is, no message is sent in two adjacent monitoring periods.
- In this embodiment, judging the cache state first identifies the port queues at risk of congestion; judging the sending state then pinpoints the port queues that are actually blocked.
- The two-level progressive monitoring mechanism of cache state and sending state used in this embodiment identifies blocked port queues promptly and accurately, so that intervention can take place in time and the impact of port queue congestion is minimized.
- Using high-order and low-order dual counters to count the packets sent by the port queue solves the problem that a single counter can overflow within a short time between two consecutive readings of the sent-packet count, so it can be determined more accurately whether two consecutive sent-packet counts are equal.
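To make the overflow problem concrete, the sketch below models a pair of 16-bit counters (one of the widths listed above) and shows a case where the low counter alone looks unchanged although traffic was sent. The helper names `read_dual` and `counts_equal` and the example totals are hypothetical.

```python
from typing import Tuple

BITS = 16                 # width of each half; 8 or 32 would work the same way
MASK = (1 << BITS) - 1


def read_dual(total_sent: int) -> Tuple[int, int]:
    """Model a (high, low) counter pair holding a cumulative packet count."""
    return (total_sent >> BITS) & MASK, total_sent & MASK


def counts_equal(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """Equal only if both the high halves and the low halves match."""
    return a[0] == b[0] and a[1] == b[1]


s0 = read_dual(100)                  # previous monitoring period
s1 = read_dual(100 + (1 << BITS))    # exactly 65536 packets sent since then

low_only_looks_idle = s0[1] == s1[1]   # a single 16-bit counter wrapped around
dual_looks_idle = counts_equal(s0, s1) # the high half exposes the traffic
```

Here the low halves are identical after the wrap, while the high halves differ, so only the dual comparison reports that packets were sent.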
- Obtaining the number of packets received by the corresponding port of the interface subcard and using it as the sent-packet count of the port queue solves the problem that the sending state of the port queue cannot be determined when the TM chip hardware does not support a sent-packet count for the port queue.
- s1 and s0 are the parameters that hold the packet count sent in the current monitoring period and the packet count sent in the previous monitoring period, respectively. At the end of the current monitoring period, the value of s1 is assigned to s0 so that the sent-packet counts can be compared in the next monitoring period.
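The hand-over of s1 to s0 at the end of each period can be sketched as a small per-queue tracker. The class name `SendStateTracker` is hypothetical, and a plain integer stands in for the high/low counter pair to keep the example short.

```python
class SendStateTracker:
    """Keeps s0 for one port queue and rolls s1 into it every period."""

    def __init__(self, initial_count: int = 0) -> None:
        self.s0 = initial_count  # count recorded in the previous period

    def on_period(self, s1: int) -> bool:
        """Return True if the sending state is abnormal (s1 equals s0)."""
        abnormal = s1 == self.s0
        self.s0 = s1             # end of period: assign s1 to s0
        return abnormal


tracker = SendStateTracker(initial_count=5)
first = tracker.on_period(9)    # packets were sent since the last period
second = tracker.on_period(9)   # nothing sent since the last period
```

Keeping one tracker per port queue lets each polled queue be compared against its own previous count.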
- step S130 includes:
- Step S1301 When the first sending state of the port queue satisfies the abnormal sending state condition, extending the current monitoring period by the delay duration;
- At this point it cannot yet be concluded that the port queue is blocked and unable to send packets. For example, if a packet enters the port queue at the critical moment when the monitoring period arrives, it has not yet been sent out of the port queue (that is, the counter has not yet counted it). The packet count s1 for the current monitoring period may therefore equal the count s0 for the previous monitoring period, indicating that the port queue has sent nothing. If the port queue is in fact not blocked, this produces a misjudgment that no packets were sent in the two adjacent monitoring periods.
- To avoid this misjudgment, a delay duration T1 is introduced: the sent-packet count is read again after the delay, and only then are the counts compared.
- The delay duration T1 must be smaller than the monitoring period T0 of the port queue. Otherwise, by the time T1 has elapsed and the sent-packet count of the port queue has been read, the next monitoring period has already begun, and there is no guarantee that the sent-packet counts are compared within the same monitoring period.
- Furthermore, the product of the delay duration T1 and the number of ports polled within the monitoring period T0 must be smaller than T0. Otherwise, if every port is delayed by T1, then for at least one port the next monitoring period will already have begun by the time its sent-packet count has been read, and the sent-packet counts cannot be compared within the same monitoring period.
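The two timing constraints on the delay (T1 smaller than T0, and T1 times the number of ports smaller than T0) can be checked with a few lines. The function name and the port count and delay values in the example are assumptions; only the 100 ms period comes from the text.

```python
def delay_is_valid(t0_ms: float, t1_ms: float, num_ports: int) -> bool:
    """Check that a per-port delay T1 fits inside the monitoring period T0.

    Worst case: every polled port needs the delay, costing T1 * num_ports.
    """
    return 0 < t1_ms < t0_ms and t1_ms * num_ports < t0_ms


# With T0 = 100 ms and a hypothetical 8 ports:
ok = delay_is_valid(100, 10, 8)        # 10 * 8 = 80 ms, fits within 100 ms
too_long = delay_is_valid(100, 15, 8)  # 15 * 8 = 120 ms, exceeds 100 ms
```

The second condition implies the first whenever there is at least one port, but both are kept to mirror the two constraints as stated.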
- Step S1302 When the delay duration arrives, acquire a third packet count sent by the port queue in the last monitoring period and the delay duration;
- Step S1303 Comparing whether the upper and lower bits of the first packet count are respectively equal to the upper and lower bits of the third packet count to determine whether the second transmission state of the port queue satisfies the abnormal transmission state condition;
- Step S1304 When the upper and lower bits of the first packet count are respectively equal to the high and low bits of the third packet count, determine that the second transmission state of the port queue is the abnormal transmission state.
- The third packet count s2 is the sum of the sent-packet count s1 and the number of packets sent within the delay duration. Comparing whether the high and low bits of s1 and s2 are respectively equal therefore determines whether the number of packets sent within the delay duration is zero, which confirms or overturns the previous judgment and avoids misjudging port queue congestion. If the sending state of the port queue still meets the preset abnormal sending state condition after the delay, that is, if still no packet has been sent, the currently polled port queue can be accurately determined to be blocked.
- Introducing the delay and judging the sending state again over the extended time pinpoints the blocked port queue accurately and improves the accuracy of the congestion judgment.
- FIG. 5 is a schematic flowchart diagram of a second embodiment of a method for monitoring port queue congestion according to the present invention.
- In this embodiment, after step S140, the method further includes:
- Step S210 When congestion is detected in the port queue, close the port queue;
- Step S220 Clear the packets buffered in the port queue and retain the configuration parameters required for resetting the port queue;
- Step S230 Reset the port queue according to the retained configuration parameters to restore the initial state in which the port queue has not yet sent any packets;
- Step S240 After the port queue is restored to the initial state, enable the port queue so that it can send packets.
- When the currently polled port queue is found to be blocked, the port queue is closed to prevent packets from entering it. At the same time, all packets buffered in the port queue are cleared, but the related configuration parameters, such as the scheduling priority and weight of the queue and the rate-limit policy for sending packets, are retained.
- After the packets buffered in the blocked port queue are cleared and the configuration parameters required for resetting are retained, the port queue is reset according to the retained parameters to restore the initial state in which it has not yet sent any packets, and the port queue is then enabled so that it is opened again to send packets.
- The common ways of handling a blocked port queue are as follows. The first is to unplug and re-insert the interface subcard corresponding to the blocked port queue; the negative effect is that traffic on the normal ports of that interface subcard is interrupted for tens of seconds. The second is to unplug and re-insert the line card; the negative effect is that traffic on all normal ports of the line card is interrupted for several minutes. Both operations cause long traffic interruptions on other, normal port queues.
- By contrast, the close, clear, reset, and enable operation reduces the interruption of user traffic from tens of seconds to several seconds, narrows the impact range of a port queue congestion fault, and improves the stability of the carrier-class router.
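The close, clear, reset, and enable sequence of steps S210 to S240 can be sketched as below. The `PortQueue` class is a hypothetical in-memory stand-in for a hardware queue, and its fields are assumptions made for the example; a real router would drive the TM chip through its own driver interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PortQueue:
    """Hypothetical in-memory stand-in for a hardware port queue."""
    config: Dict[str, object]
    buffered: List[bytes] = field(default_factory=list)
    enabled: bool = True
    sent_count: int = 0


def recover_port_queue(queue: PortQueue) -> None:
    """Restore a blocked queue without touching any other queue."""
    queue.enabled = False               # S210: close so no new packets enter
    saved_config = dict(queue.config)   # S220: retain the configuration parameters
    queue.buffered.clear()              #       and drop every buffered packet
    queue.sent_count = 0                # S230: reset to the initial state
    queue.config = saved_config         #       using the retained parameters
    queue.enabled = True                # S240: enable sending again
```

Because only the one `PortQueue` object is modified, the sketch mirrors the point made above: recovery is confined to the blocked queue rather than to a whole subcard or line card.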
- FIG. 6 is a schematic flowchart diagram of a third embodiment of a method for monitoring port queue congestion according to the present invention.
- In the third embodiment, before step S110, the method further includes:
- Step S001 Create a monitoring thread of the port queue and perform polling monitoring of the port queue congestion on all ports on the router in the preset monitoring period.
- A monitoring thread for the port queues is created, and the monitoring thread is configured to poll all ports for port queue congestion in each preset monitoring period.
- Congestion of all port queues is thus monitored in real time, so that a blocked port queue can be identified more promptly and accurately and handled in a targeted manner. This provides an effective way to narrow the impact of port queue congestion and shorten the interruption of user traffic.
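The monitoring thread of step S001 can be sketched as a loop that visits every port once per monitoring period. This is an illustration under assumptions: `check_port`, `on_blocked`, and the use of a `threading.Event` for shutdown are choices made for the example, not details given in the patent.

```python
import threading
from typing import Callable, Iterable, List


def start_monitor(
    ports: Iterable[int],
    check_port: Callable[[int], bool],
    on_blocked: Callable[[int], None],
    period_seconds: float,
    stop: threading.Event,
) -> threading.Thread:
    """Start a daemon thread that polls every port once per monitoring period."""
    port_list: List[int] = list(ports)

    def run() -> None:
        while not stop.is_set():
            for port in port_list:
                if check_port(port):      # True means the port queue is blocked
                    on_blocked(port)
            stop.wait(period_seconds)     # sleep for T0, waking early if stopped

    thread = threading.Thread(target=run, name="port-queue-monitor", daemon=True)
    thread.start()
    return thread
```

Using `stop.wait` instead of a plain sleep lets the thread exit promptly when the event is set, which keeps shutdown from waiting out a full period.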
- FIG. 7 is a schematic diagram of functional modules of a first embodiment of a monitoring system for port queue congestion according to the present invention.
- the monitoring system for port queue congestion includes:
- the buffer status determining module 10 is configured to determine, when the current monitoring period arrives, whether the buffer status of the currently polled port queue meets a preset abnormal cache status condition;
- the cache state determination module 10 determines whether the cache state of the currently polled port queue satisfies a preset abnormal cache state condition.
- the currently polled port queue specifically refers to the port queue that starts polling when the current monitoring period arrives.
- The cache state of a port queue is the state of the amount of packets buffered in the port queue, for example empty, not full (that is, not blocked), or full (that is, at risk of congestion).
- The preset abnormal cache state condition is used to identify an abnormal cache state. If the condition is met, the cache state is abnormal; if not, the cache state is normal and the system continues to wait for the next monitoring period.
- The abnormal cache state is a state in which there is a risk of congestion, that is, the cache state of a queue that is about to become blocked. The preset abnormal cache state condition is defined accordingly and is set according to actual needs; for example, it may be based on the buffer size of the port queue or on the volume of received packets. When the cache state of the currently polled port queue satisfies the preset condition, it can at least be determined that the currently polled port queue is at risk of congestion.
- the first sending state determining module 20 is configured to determine, when the buffering state of the port queue meets the abnormal cache state condition, whether the first sending state of the port queue meets a preset abnormal sending state condition;
- The first sending state determining module 20 needs to further determine whether congestion really exists, that is, whether the first transmission state of the port queue satisfies a preset abnormal transmission state condition.
- The sending state of the port queue (the "first sending state" is only a naming difference) is judged against a preset abnormal sending state condition. If the condition is met, the sending state is abnormal; if not, the sending state is normal and the system continues to wait for the next monitoring period.
- The abnormal transmission state is a state in which no packet is sent. The preset abnormal transmission state condition is defined accordingly and is set according to actual needs; for example, it may be based on the number of packets in the port queue or the number of packets sent per unit time. When the transmission state of the currently polled port queue satisfies the preset condition, it can be largely concluded that the currently polled port queue is blocked.
- the second sending state determining module 30 is configured to: when the first sending state of the port queue meets the abnormal sending state condition, extend the current monitoring period by a preset delay duration to determine whether the second sending state of the port queue satisfies the abnormal sending state condition;
- the congestion determination module 40 is configured to determine that the port queue is blocked when the second transmission state of the port queue satisfies the abnormal transmission status condition.
- When the first transmission state of the polled port queue meets the preset abnormal transmission state condition, port queue congestion may still be misjudged. For example, a packet may enter the port queue at the critical moment when the monitoring period arrives but not yet have been sent out of the queue, so it is not counted, which results in a false positive.
- The second transmission state determination module 30 therefore extends the current monitoring period by a preset delay duration to determine more accurately whether the transmission state of the currently polled port queue (the "second transmission state" is merely a naming difference) satisfies the preset abnormal transmission state condition. If it does, the congestion determination module 40 can accurately determine that the currently polled port queue is not sending packets, that is, that it is blocked. If it does not, the currently polled port queue is not blocked, and the system continues to wait for the next monitoring period.
- Congestion of all port queues is monitored by periodic polling, so that a congestion problem can be pinned down to one or several specific port queues and only those port queues need to be restored, which avoids affecting other normal port queues.
- By monitoring the cache state of the port queue, it is possible to determine promptly or even in advance whether the port queue is at risk of congestion, and by further monitoring the transmission state of the port queue, to determine whether it is currently blocked.
- The delay is then used to check once more whether the port queue is blocked, which further improves the accuracy of the congestion judgment. Blocked port queues can therefore be identified promptly and accurately and handled in time, reducing the negative impact of port queue congestion.
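The ordering of the three checks performed by modules 10 to 40 can be sketched as a short-circuit pipeline. The three callables are hypothetical placeholders for the cache state check, the first sending state check, and the delayed second sending state check; only the ordering is taken from the text.

```python
from typing import Callable


def port_queue_is_blocked(
    cache_abnormal: Callable[[], bool],
    first_send_abnormal: Callable[[], bool],
    second_send_abnormal: Callable[[], bool],
) -> bool:
    """Run the three checks in order, stopping at the first normal result."""
    if not cache_abnormal():          # module 10: is the queue at risk?
        return False
    if not first_send_abnormal():     # module 20: nothing sent since last period?
        return False
    return second_send_abnormal()     # modules 30 and 40: still nothing after T1?
```

A queue whose cache state is normal never reaches the sending state checks, so the delay is only paid for queues that already look suspicious.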
- FIG. 8 is a schematic diagram of a refinement function module of the cache state determination module of FIG. Based on the foregoing embodiment, in the embodiment, the cache state determining module 10 includes:
- the depth value obtaining unit 101 is configured to acquire a real-time depth value of the port queue when the current monitoring period arrives, where the depth value is used to measure a buffer size of the port queue;
- the depth value determining unit 102 is configured to determine whether the real-time depth value of the port queue is greater than or equal to a preset depth threshold to determine whether the cache state of the port queue satisfies the abnormal cache state condition;
- the cache state determining unit 103 is configured to determine, when the real-time depth value of the port queue is greater than or equal to the depth threshold, that the cache state of the port queue is the abnormal cache state.
- The depth value of the port queue measures the amount of packets buffered in the port queue. The depth value obtaining unit 101 obtains the real-time depth value of the port queue, and the depth value determining unit 102 compares it with the preset depth threshold to determine the cache state of the port queue, that is, whether a certain number of packets waiting to be sent are buffered in the port queue.
- When the real-time depth value reaches the depth threshold, the cache state determining unit 103 determines that the cache state of the port queue satisfies the abnormal cache state condition.
- The preset depth threshold must be smaller than the buffer of the port queue; otherwise, even when the port queue is blocked, its real-time depth value can never reach the threshold. The threshold also must not be set too large; otherwise a blocked port queue cannot be detected quickly, and the traffic interruption lasts too long. In this embodiment the preset depth threshold is therefore set according to the actual situation, for example at the kilobyte level. Note also that on a typical router the port queue buffers of all ports are allocated by default with the same size, so a single preset depth threshold suffices for all port queues on the router.
- The abnormal cache state condition is preferably that the real-time depth value of the port queue is greater than or equal to the preset depth threshold. If the real-time depth value is less than the threshold, the port queue is in the normal cache state, that is, it is not blocked. If the real-time depth value is greater than or equal to the threshold, the amount of packets buffered in the port queue has reached the threshold and the port queue may be blocked, that is, it is in the abnormal cache state. Setting the preset depth threshold makes it possible to detect the risk of port queue congestion in time and then intervene on the port queue at risk, minimizing the impact of port queue congestion.
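The depth threshold test can be sketched as a single comparison guarded by the constraint that the threshold is smaller than the queue buffer. The function name, the parameter names, and the use of bytes as the unit are assumptions made for the example.

```python
def cache_state_is_abnormal(depth_bytes: int, threshold_bytes: int, buffer_bytes: int) -> bool:
    """Return True when the real-time depth has reached the preset threshold."""
    # A threshold at or above the buffer size could never be reached,
    # so reject it up front, as the text requires.
    if not 0 < threshold_bytes < buffer_bytes:
        raise ValueError("threshold must be positive and smaller than the queue buffer")
    return depth_bytes >= threshold_bytes
```

For instance, with an assumed 64 KiB buffer and a 4 KiB threshold, a depth of 4096 bytes is abnormal while 4095 bytes is still normal.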
- FIG. 9 is a schematic diagram of a refinement function module of the first transmission state determining module in FIG. Based on the foregoing embodiment, in the embodiment, the first sending state determining module 20 includes:
- the packet counting first obtaining unit 201 is configured to: when the cache state of the port queue meets the abnormal cache state condition, acquire the first packet count and the second packet count sent by the port queue in the current monitoring period and in the previous monitoring period respectively, wherein the packet counts are kept by a high-order and low-order dual counter;
- When the cache state of the currently polled port queue satisfies the abnormal cache state condition, that is, when the port queue is determined to be at risk of congestion, it is necessary to further determine whether the port queue is actually blocked, that is, whether the sending state of the currently polled port queue meets the preset abnormal sending state condition.
- the abnormal transmission state is preferably that the number of packets sent in the two adjacent monitoring periods is equal, that is, no packets are sent in the two adjacent monitoring periods.
- the two adjacent monitoring periods specifically refer to the current monitoring period and the previous monitoring period of the current monitoring period.
- The packet counting first obtaining unit 201 reads, in a non-clearing read mode, the first packet count sent in the current monitoring period (denoted s1) and the second packet count sent in the previous monitoring period (denoted s0) from the preset dual counter used for packet statistics.
- The non-clearing read mode means that the count data is not cleared when it is read.
- Where the TM chip hardware provides a sent-packet count for the port queue, the dual counter used for packet statistics obtains that count directly.
- Where the TM chip hardware does not provide such a count, note that packets sent by a port queue of the TM chip go to the corresponding port on the interface subcard. In this embodiment the dual counter therefore obtains the number of packets received by the port on the interface subcard that corresponds to the port queue and uses it as the number of packets sent by the port queue.
- Because an existing single counter can overflow and wrap around within a short time between two consecutive readings, which makes it unreliable for determining whether the packet counts sent in two adjacent monitoring periods are equal, the packets are preferably counted with a high-order and low-order dual counter.
- This embodiment specifically uses a high-order and low-order dual counter to count the packets sent by the port queue. For example, an 8-bit dual counter counts with a high 8 bits and a low 8 bits; a 16-bit dual counter counts with a high 16 bits and a low 16 bits; and a 32-bit dual counter counts with a high 32 bits and a low 32 bits.
- When the low-order counter is full, the high-order counter is incremented by one and the low-order counter is cleared; when the high-order counter is in turn full, the high-order counter is cleared.
- the packet counting first comparing unit 202 is configured to compare whether the high and low bits of the first packet count are respectively equal to the high and low bits of the second packet count, to determine whether the first sending state of the port queue satisfies the abnormal transmission state condition;
- the first sending state determining unit 203 is configured to: when the high and low bits of the first packet count are respectively equal to the high and low bits of the second packet count, determine that the first sending state of the port queue is the abnormal transmission state.
- After the first packet count s1 sent in the current monitoring period and the second packet count s0 sent in the previous monitoring period are acquired, the packet counting first comparing unit 202 compares s1 and s0 to determine the sending state of the port queue, that is, whether the port queue is sending packets. If it is, the port queue is in the normal sending state; otherwise it is in the abnormal sending state.
- This embodiment does not compare the magnitudes of s1 and s0 but preferably compares whether s1 and s0 are equal. Because the packet count in this embodiment is kept by a high-order and low-order dual counter, the high bits and the low bits of s1 and s0 must be compared separately. If the high bits of s1 and s0 are not equal and/or the low bits are not equal, the port queue sent packets between the two adjacent counts; if the high bits are equal and the low bits are equal, the first sending state determining unit 203 determines that the port queue sent no packets between the two adjacent counts.
- the abnormal transmission status condition is that the port queue does not send a message within two consecutive counting times, that is, no message is sent in two adjacent monitoring periods.
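The equality comparison of s1 and s0 described above can be sketched as follows, assuming each count is held as a `(high, low)` tuple; the function name and that representation are illustrative only.

```python
from typing import Tuple

Count = Tuple[int, int]  # (high-order part, low-order part); representation is assumed


def first_send_state_is_abnormal(s1: Count, s0: Count) -> bool:
    """True when no packet was sent between the previous and current readings."""
    high_equal = s1[0] == s0[0]
    low_equal = s1[1] == s0[1]
    # Only equality is tested, never magnitude: after a wrap-around s1 may be
    # numerically smaller than s0 even though packets were sent.
    return high_equal and low_equal
```

With an 8-bit dual counter, a reading that moves from `(255, 250)` to `(0, 3)` has wrapped around; a magnitude test would misread it, while the equality test correctly reports that packets were sent.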
- Judging the cache state locks onto the port queues at risk of congestion, and judging the sending state then determines more accurately which port queue is blocked.
- The two-layer progressive monitoring of cache state and sending state used in this embodiment locks onto blocked port queues promptly and accurately, so that intervention can be timely and the impact of port queue congestion is minimized.
- Counting the packets sent by the port queue with a high-order and low-order dual counter avoids the problem that a single counter can overflow within a short time between two consecutive readings, so it can be determined more reliably whether two consecutive sent-packet counts are equal.
- Taking the number of packets received by the corresponding port of the interface subcard as the port queue's sent-packet count solves the problem that the sending state of the port queue cannot be determined when the TM chip hardware does not support a sent-packet count for the port queue.
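The carry behaviour of the high-order and low-order dual counter, together with the non-clearing read, can be modelled in software as below. The class is a hypothetical model for illustration; in the patent the counts come from the TM chip or the interface subcard hardware.

```python
from typing import Tuple


class DualCounter:
    """Software model of a high-order/low-order counter pair (hypothetical)."""

    def __init__(self, bits: int = 32) -> None:
        if bits <= 0:
            raise ValueError("bits must be positive")
        self.bits = bits
        self.limit = 1 << bits      # number of values each part can hold
        self.high = 0
        self.low = 0

    def add(self, packets: int = 1) -> None:
        """Count sent packets, carrying from the low part into the high part."""
        if packets < 0:
            raise ValueError("packets must be non-negative")
        total = self.high * self.limit + self.low + packets
        total %= self.limit * self.limit   # a full high part wraps back to zero
        self.high, self.low = divmod(total, self.limit)

    def read(self) -> Tuple[int, int]:
        """Non-clearing read: return (high, low) without resetting the count."""
        return (self.high, self.low)
```

Pairing two parts squares the number of distinct values before a wrap-around, which is why two consecutive readings are far less likely to collide than with a single counter of the same width.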
- FIG. 10 is a schematic diagram of a refinement function module of the second transmission state determining module in FIG. Based on the foregoing embodiment, in the embodiment, the second sending state determining module 30 includes:
- the delay unit 301 is configured to extend the current monitoring period by the delay duration when the first sending status of the port queue satisfies the abnormal sending status condition;
- Even when the first sending state satisfies the abnormal sending state condition, it cannot yet be concluded that the port queue is blocked and unable to send packets. For example, a packet may enter the port queue at the critical moment when the monitoring period arrives without yet having been sent out of the queue, so the counter has not counted it. The count s1 for the current monitoring period may then equal the count s0 for the previous monitoring period, which suggests that the port queue sent nothing; if the port queue is in fact not blocked, this produces a misjudgment that no packet was sent in two adjacent monitoring periods.
- To avoid this, the delay unit 301 waits for the delay duration T1, after which the count is read again and the packet counts are compared once more.
- The delay duration T1 must be smaller than the monitoring period T0 of the port queue. Otherwise, by the time T1 has elapsed and the sent-packet count is acquired, the next monitoring period has already begun, and there is no guarantee that the sent-packet counts are compared within the same monitoring period.
- Likewise, the product of the delay duration T1 and the number of ports polled in the monitoring period T0 must be smaller than T0. Otherwise, if every port is delayed by T1, then for at least one port the sent-packet count is acquired only after the next monitoring period has begun rather than within the current one, so the sent-packet counts cannot be compared within the same monitoring period.
- the message counting second obtaining unit 302 is configured to: when the delay duration arrives, acquire a third packet count sent by the port queue in the last monitoring period and the delay duration;
- the packet counting second comparing unit 303 is configured to compare whether the high and low bits of the first packet count are respectively equal to the high and low bits of the third packet count, to determine whether the second sending state of the port queue satisfies the abnormal transmission state condition;
- the second sending state determining unit 304 is configured to: when the high and low bits of the first packet count are respectively equal to the high and low bits of the third packet count, determine that the second sending state of the port queue is the abnormal transmission state.
- When the delay duration elapses, the packet counting second obtaining unit 302 reads, in the non-clearing read mode, the count of packets received by the port on the interface subcard that corresponds to the port queue; the value collected at this time is the third packet count.
- The third packet count s2 is the sum of the sent packet count s1 and the number of packets sent within the delay duration. The packet counting second comparing unit 303 compares whether the high and low bits of s1 and s2 are respectively equal, that is, whether the number of packets sent within the delay duration is zero, which re-checks the previous judgment and avoids misjudging port queue congestion. If the transmission state of the port queue still satisfies the preset abnormal transmission state condition after the delay, that is, if packets are still not being sent, the second transmission state determining unit 304 can accurately determine that the currently polled port queue is blocked.
- Introducing the delayed count allows the sending state to be judged again over the extended time, which pins down the blocked port queue precisely and improves the accuracy of the congestion judgment.
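The two constraints on the delay duration, T1 smaller than T0 and the number of polled ports times T1 smaller than T0, can be checked when the monitor is configured. The sketch below assumes durations are given as integer milliseconds; the function and parameter names are hypothetical.

```python
def delay_fits_in_period(t0_ms: int, t1_ms: int, port_count: int) -> bool:
    """Return True when port_count * T1 < T0, so every delayed re-read stays in the period."""
    if t0_ms <= 0 or t1_ms <= 0 or port_count <= 0:
        raise ValueError("T0, T1 and port_count must all be positive")
    # With at least one port, port_count * T1 < T0 also implies T1 < T0.
    return port_count * t1_ms < t0_ms
```

For example, with an assumed period of 1000 ms and 48 ports, a 10 ms delay fits (480 ms in total) while a 25 ms delay does not (1200 ms in total).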
- FIG. 11 is a schematic diagram of functional modules of a second embodiment of a monitoring system for port queue congestion according to the present invention.
- the monitoring system for port queue congestion further includes:
- the port queue closing module 50 is configured to close the port queue when it is detected that the port queue is blocked.
- the cache message clearing module 60 is configured to clear the buffered message in the port queue and retain relevant configuration parameters required for resetting the port queue.
- the port queue resetting module 70 is configured to reset the port queue according to the retained related configuration parameter to restore an initial state corresponding to when the port queue does not send a message;
- the port queue enable module 80 is configured to enable the port queue to enable the port queue to send packets after the port queue is restored to the initial state.
- When the currently polled port queue is found to be blocked, the port queue closing module 50 closes the port queue to prevent packets from entering it.
- The cache message clearing module 60 clears all packets buffered in the port queue but retains the configuration parameters required for resetting the port queue, such as the scheduling priority and weight of the port queue and the rate-limit policy.
- After the packets buffered in the blocked port queue are cleared and the configuration parameters required for resetting are retained, the port queue reset module 70 resets the port queue according to the retained parameters to restore its initial state.
- The port queue enable module 80 then enables the port queue so that it is opened again to send packets.
- The common ways of handling a blocked port queue are as follows. The first is to unplug and re-insert the interface subcard corresponding to the blocked port queue; the negative effect is that traffic on the normal ports of that interface subcard is interrupted for tens of seconds. The second is to unplug and re-insert the line card; the negative effect is that traffic on all normal ports of the line card is interrupted for several minutes. Both operations cause long traffic interruptions on other, normal port queues.
- By contrast, the close, clear, reset, and enable operation reduces the interruption of user traffic from tens of seconds to several seconds, narrows the impact range of a port queue congestion fault, and improves the stability of the carrier-class router.
- FIG. 12 is a schematic diagram of functional modules of a third embodiment of a monitoring system for port queue congestion according to the present invention.
- the monitoring system for port queue congestion further includes:
- the monitoring thread creation module 90 is configured to create a monitoring thread of the port queue and perform polling monitoring of the port queue congestion on all ports on the router in the preset monitoring period.
- After the router starts, the port queues are created, and packet forwarding begins, the monitoring thread creation module 90 creates a monitoring thread for the port queues. The module also sets the monitoring period in which the monitoring thread polls all ports for port queue congestion, as well as the settings used to determine port queue congestion.
- Congestion of all port queues is thus monitored in real time, so that a blocked port queue can be identified more promptly and accurately and handled in a targeted manner. This provides an effective way to narrow the impact of port queue congestion and shorten the interruption of user traffic.
- An embodiment of the present invention further provides a monitoring device for port queue congestion, which is applied to a carrier-grade router including a plurality of ports, including:
- a memory for storing processor executable instructions
- wherein the processor is configured to:
- extend the current monitoring period by a preset delay duration to determine whether the second sending state of the port queue meets the abnormal sending state condition;
- Embodiments of the present invention also provide a non-transitory computer readable storage medium having stored therein instructions that, when executed by a processor of a carrier-grade router including a plurality of ports, cause the router to implement a method for monitoring port queue congestion, comprising the steps of:
- extending the current monitoring period by a preset delay duration to determine whether the second sending state of the port queue meets the abnormal sending state condition;
- The method for monitoring port queue congestion in the present application can be applied to a carrier-class router. Congestion of all port queues is monitored by periodic polling, so that a congestion problem can be pinned down to one or several specific port queues, and only those port queues need to be restored in a targeted manner, which avoids affecting normal port queues.
Landscapes
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
Description
The present application claims priority to Chinese Patent Application No. 201510501920.6, filed with the Chinese Patent Office on August 14, 2015, the entire contents of which are incorporated herein by reference.
The present invention relates to the field of network communication technologies, and in particular to a method and system for monitoring port queue congestion.
With the rapid development of network technology, more and more users are adopting new services such as IPTV (Internet Protocol Television), VoIP (Voice over Internet Protocol), distance learning, and telemedicine, and these emerging services place higher demands on the service capability of the network. When users make heavy use of network applications such as IPTV, VoIP, BT (Bit Torrent) download, and FTP download, the traffic through the port queues of carrier-class routers on the live network becomes very large, which puts those routers at risk of port queue congestion. When a port queue is blocked, the packets of all user queues under the port cannot be sent, so users cannot use network applications.
In current network communication systems, functions such as traffic shaping and congestion management are generally implemented by the H-QoS (Hierarchical Quality of Service) queue technology of the downstream TM (Traffic Management) chip hardware of carrier-class routers on the live network. However, existing H-QoS queue technology provides no mechanism for monitoring port queue congestion. Port queue congestion therefore cannot be detected promptly, let alone in advance, nor can it be accurately determined whether a port queue is blocked, so packet sending on the port queue cannot be restored quickly and users are unable to use network applications for a long time.
Summary of the Invention
The main purpose of the present invention is to provide a method and system for monitoring port queue congestion, aiming to solve the technical problem that port queue congestion cannot be monitored promptly and determined accurately, so that packet sending on the port queue cannot be restored quickly and users are unable to use network applications for a long time.
To achieve the above purpose, the present invention provides a method for monitoring port queue congestion, applied to a carrier-class router that includes a plurality of ports, the method comprising:
when the current monitoring period arrives, determining whether the cache state of the currently polled port queue satisfies a preset abnormal cache state condition;
when the cache state of the port queue satisfies the abnormal cache state condition, determining whether a first sending state of the port queue satisfies a preset abnormal sending state condition;
when the first sending state of the port queue satisfies the abnormal sending state condition, extending the current monitoring period by a preset delay duration to determine whether a second sending state of the port queue satisfies the abnormal sending state condition;
when the second sending state of the port queue satisfies the abnormal sending state condition, determining that the port queue is blocked.
Preferably, determining, when the current monitoring period arrives, whether the buffer state of the currently polled port queue satisfies the preset abnormal buffer state condition includes:
when the current monitoring period arrives, obtaining a real-time depth value of the port queue, where the depth value measures the amount of packet data buffered in the port queue;
judging whether the real-time depth value of the port queue is greater than or equal to a preset depth threshold, so as to determine whether the buffer state of the port queue satisfies the abnormal buffer state condition;
when the real-time depth value of the port queue is greater than or equal to the depth threshold, determining that the buffer state of the port queue is the abnormal buffer state.
Preferably, determining, when the buffer state of the port queue satisfies the abnormal buffer state condition, whether the first transmission state of the port queue satisfies the preset abnormal transmission state condition includes:
when the buffer state of the port queue satisfies the abnormal buffer state condition, obtaining a first packet count and a second packet count sent by the port queue in the current monitoring period and in the previous monitoring period, respectively, where each packet count is kept by a pair of counters, one for the high-order part and one for the low-order part;
comparing whether the high-order and low-order parts of the first packet count are equal to the high-order and low-order parts of the second packet count, respectively, so as to determine whether the first transmission state of the port queue satisfies the abnormal transmission state condition;
when the high-order and low-order parts of the first packet count are equal to the high-order and low-order parts of the second packet count, respectively, determining that the first transmission state of the port queue is the abnormal transmission state.
Preferably, extending, when the first transmission state of the port queue satisfies the abnormal transmission state condition, the current monitoring period by the preset delay duration to determine whether the second transmission state of the port queue satisfies the abnormal transmission state condition includes:
when the first transmission state of the port queue satisfies the abnormal transmission state condition, extending the current monitoring period by the delay duration;
when the delay duration has elapsed, obtaining a third packet count, which is the total number of packets sent by the port queue over the preceding monitoring period and the delay duration;
comparing whether the high-order and low-order parts of the first packet count are equal to the high-order and low-order parts of the third packet count, respectively, so as to determine whether the second transmission state of the port queue satisfies the abnormal transmission state condition;
when the high-order and low-order parts of the first packet count are equal to the high-order and low-order parts of the third packet count, respectively, determining that the second transmission state of the port queue is the abnormal transmission state.
Preferably, after determining that the port queue is congested when the second transmission state of the port queue satisfies the abnormal transmission state condition, the method includes:
when congestion of the port queue is detected, closing the port queue;
clearing the packets buffered in the port queue while retaining the configuration parameters needed to reset the port queue;
resetting the port queue according to the retained configuration parameters, so as to restore the initial state that the port queue had before it sent any packets;
after the port queue has been restored to the initial state, enabling the port queue so that it resumes sending packets.
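The recovery sequence above can be sketched in Python as follows. This is a minimal illustration only: the `PortQueue` class, its fields, the `shaping_rate_kbps` parameter and the `recover` function are hypothetical names that do not appear in the source, and treating the "initial state" as an empty buffer with a zeroed send counter is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class PortQueue:
    """Hypothetical in-memory model of one port queue (illustration only)."""
    config: dict                                   # parameters needed to reset the queue
    enabled: bool = True
    buffered: list = field(default_factory=list)   # packets waiting to be sent
    tx_count: int = 0                              # packets sent so far


def recover(queue: PortQueue) -> None:
    """Close, flush, reset and re-enable a queue found to be congested."""
    queue.enabled = False            # 1. close the port queue
    saved = dict(queue.config)       # 2. retain the configuration parameters
    queue.buffered.clear()           #    while discarding the buffered packets
    queue.config = saved             # 3. reset from the retained configuration
    queue.tx_count = 0               #    (assumed meaning of "initial state")
    queue.enabled = True             # 4. enable the queue so sending resumes


# Hypothetical queue that is stuck with two buffered packets.
q = PortQueue(config={"shaping_rate_kbps": 1000}, buffered=["p1", "p2"], tx_count=7)
recover(q)
```

Because only the affected queue object is touched, other port queues keep running, which matches the stated goal of recovering congested queues without disturbing normal ones.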
Preferably, before determining, when the current monitoring period arrives, whether the buffer state of the currently polled port queue satisfies the preset abnormal buffer state condition, the method includes:
creating a monitoring thread for the port queues and, within the preset monitoring period, polling all ports on the router to monitor them for port queue congestion.
Further, to achieve the above object, the present invention also provides a system for monitoring port queue congestion, applied to a carrier-grade router that includes a plurality of ports. The system includes:
a buffer state determining module, configured to determine, when the current monitoring period arrives, whether the buffer state of the currently polled port queue satisfies a preset abnormal buffer state condition;
a first transmission state determining module, configured to determine, when the buffer state of the port queue satisfies the abnormal buffer state condition, whether a first transmission state of the port queue satisfies a preset abnormal transmission state condition;
a second transmission state determining module, configured to extend, when the first transmission state of the port queue satisfies the abnormal transmission state condition, the current monitoring period by a preset delay duration to determine whether a second transmission state of the port queue satisfies the abnormal transmission state condition;
a congestion determining module, configured to determine that the port queue is congested when the second transmission state of the port queue satisfies the abnormal transmission state condition.
Preferably, the buffer state determining module includes:
a depth value obtaining unit, configured to obtain a real-time depth value of the port queue when the current monitoring period arrives, where the depth value measures the amount of packet data buffered in the port queue;
a depth value judging unit, configured to judge whether the real-time depth value of the port queue is greater than or equal to a preset depth threshold, so as to determine whether the buffer state of the port queue satisfies the abnormal buffer state condition;
a buffer state determining unit, configured to determine that the buffer state of the port queue is the abnormal buffer state when the real-time depth value of the port queue is greater than or equal to the depth threshold.
Preferably, the first transmission state determining module includes:
a first packet count obtaining unit, configured to obtain, when the buffer state of the port queue satisfies the abnormal buffer state condition, a first packet count and a second packet count sent by the port queue in the current monitoring period and in the previous monitoring period, respectively, where each packet count is kept by a pair of counters, one for the high-order part and one for the low-order part;
a first packet count comparing unit, configured to compare whether the high-order and low-order parts of the first packet count are equal to the high-order and low-order parts of the second packet count, respectively, so as to determine whether the first transmission state of the port queue satisfies the abnormal transmission state condition;
a first transmission state determining unit, configured to determine that the first transmission state of the port queue is the abnormal transmission state when the high-order and low-order parts of the first packet count are equal to the high-order and low-order parts of the second packet count, respectively.
Preferably, the second transmission state determining module includes:
a delay unit, configured to extend the current monitoring period by the delay duration when the first transmission state of the port queue satisfies the abnormal transmission state condition;
a second packet count obtaining unit, configured to obtain, when the delay duration has elapsed, a third packet count, which is the total number of packets sent by the port queue over the preceding monitoring period and the delay duration;
a second packet count comparing unit, configured to compare whether the high-order and low-order parts of the first packet count are equal to the high-order and low-order parts of the third packet count, respectively, so as to determine whether the second transmission state of the port queue satisfies the abnormal transmission state condition;
a second transmission state determining unit, configured to determine that the second transmission state of the port queue is the abnormal transmission state when the high-order and low-order parts of the first packet count are equal to the high-order and low-order parts of the third packet count, respectively.
Preferably, the system for monitoring port queue congestion further includes:
a port queue closing module, configured to close the port queue when congestion of the port queue is detected;
a buffered packet clearing module, configured to clear the packets buffered in the port queue while retaining the configuration parameters needed to reset the port queue;
a port queue reset module, configured to reset the port queue according to the retained configuration parameters, so as to restore the initial state that the port queue had before it sent any packets;
a port queue enabling module, configured to enable the port queue, after it has been restored to the initial state, so that it resumes sending packets.
Preferably, the system for monitoring port queue congestion further includes:
a monitoring thread creating module, configured to create a monitoring thread for the port queues and, within the preset monitoring period, poll all ports on the router to monitor them for port queue congestion.
An embodiment of the present invention further provides a device for monitoring port queue congestion, applied to a carrier-grade router that includes a plurality of ports. The device includes:
a processor;
a memory for storing instructions executable by the processor;
where the processor is configured to:
when the current monitoring period arrives, determine whether the buffer state of the currently polled port queue satisfies a preset abnormal buffer state condition;
when the buffer state of the port queue satisfies the abnormal buffer state condition, determine whether a first transmission state of the port queue satisfies a preset abnormal transmission state condition;
when the first transmission state of the port queue satisfies the abnormal transmission state condition, extend the current monitoring period by a preset delay duration to determine whether a second transmission state of the port queue satisfies the abnormal transmission state condition;
when the second transmission state of the port queue satisfies the abnormal transmission state condition, determine that the port queue is congested.
An embodiment of the present invention further provides a non-volatile computer-readable storage medium storing instructions that, when executed by a processor of a carrier-grade router that includes a plurality of ports, cause the router to carry out a method for monitoring port queue congestion. The method includes the following steps:
when the current monitoring period arrives, determining whether the buffer state of the currently polled port queue satisfies a preset abnormal buffer state condition;
when the buffer state of the port queue satisfies the abnormal buffer state condition, determining whether a first transmission state of the port queue satisfies a preset abnormal transmission state condition;
when the first transmission state of the port queue satisfies the abnormal transmission state condition, extending the current monitoring period by a preset delay duration to determine whether a second transmission state of the port queue satisfies the abnormal transmission state condition;
when the second transmission state of the port queue satisfies the abnormal transmission state condition, determining that the port queue is congested.
By monitoring all port queues for congestion through periodic polling, the present invention can pinpoint a congestion problem to one or more specific port queues, so that recovery need only be performed on the identified queues, avoiding any impact on other, normal port queues. In addition, by monitoring the buffer state of a port queue, a potential risk of congestion can be identified promptly or even in advance, and the transmission state of the port queue is then monitored to determine whether congestion actually exists. Furthermore, to rule out misjudgment of the transmission state, a delay is applied before checking again whether the port queue is congested. This makes the congestion determination more accurate, so that congested port queues can be identified promptly and accurately and handled in time, reducing the negative impact of port queue congestion.
FIG. 1 is a schematic flowchart of a first embodiment of the method for monitoring port queue congestion according to the present invention;
FIG. 2 is a detailed flowchart of step S110 in FIG. 1;
FIG. 3 is a detailed flowchart of step S120 in FIG. 1;
FIG. 4 is a detailed flowchart of step S130 in FIG. 1;
FIG. 5 is a schematic flowchart of a second embodiment of the method for monitoring port queue congestion according to the present invention;
FIG. 6 is a schematic flowchart of a third embodiment of the method for monitoring port queue congestion according to the present invention;
FIG. 7 is a schematic diagram of the functional modules of a first embodiment of the system for monitoring port queue congestion according to the present invention;
FIG. 8 is a detailed schematic diagram of the functional modules of the buffer state determining module in FIG. 7;
FIG. 9 is a detailed schematic diagram of the functional modules of the first transmission state determining module in FIG. 7;
FIG. 10 is a detailed schematic diagram of the functional modules of the second transmission state determining module in FIG. 7;
FIG. 11 is a schematic diagram of the functional modules of a second embodiment of the system for monitoring port queue congestion according to the present invention;
FIG. 12 is a schematic diagram of the functional modules of a third embodiment of the system for monitoring port queue congestion according to the present invention.
The realization of the object, the functional features and the advantages of the present invention will be further described with reference to the embodiments and the accompanying drawings.
It should be understood that the specific embodiments described herein merely explain the present invention and are not intended to limit it.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a first embodiment of the method for monitoring port queue congestion according to the present invention. In this embodiment, the method for monitoring port queue congestion includes:
Step S110: when the current monitoring period arrives, determining whether the buffer state of the currently polled port queue satisfies a preset abnormal buffer state condition;
In this embodiment, the method for monitoring port queue congestion is applied to a carrier-grade router. In the hardware H-QoS queue technology of the downstream TM (Traffic Management) chip of a carrier-grade router on the live network, the port queue corresponds to the root node of the H-QoS tree, and there may be several layers of intermediate queues between the port queue and the user queues. User packets are sent from the user queues to the intermediate queues, aggregated by the intermediate queues into the port queue, and finally sent through the port queue to the corresponding port of the interface daughter card, which can be connected to an external optical fiber.
In addition, a router generally includes a plurality of ports, so all ports on the router need to be monitored. Monitoring each port separately would impose considerable software and hardware overhead on the router and reduce its working efficiency. This embodiment therefore monitors all ports by polling, where the monitoring period is the time taken to complete one round of polling over all ports.
It should be further noted that the port queue monitoring period needs to be set according to the actual situation, for example on the order of a hundred milliseconds. If the monitoring period is too long, a congested port queue will not be detected quickly and traffic will be interrupted for too long. If it is too short, the software and hardware overhead of port queue monitoring increases and affects normal services running on the router line card. The interval between polling two adjacent ports is related to the monitoring period; the intervals between different pairs of adjacent ports may be the same or different, and are set according to actual needs. For ease of description, this embodiment describes congestion monitoring for the port queue of only one of the ports; the description applies equally to the other ports and is not repeated.
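As a rough illustration of how the monitoring period relates to the per-port polling interval, the sketch below spreads the polls evenly over one period. The function name `poll_offsets`, the 100 ms period and the four-port router are hypothetical choices, and equal spacing is only one of the options the text allows.

```python
from typing import List


def poll_offsets(period_ms: float, port_count: int) -> List[float]:
    """Return the time offset (in ms) within one monitoring period at which
    each port is polled, assuming equal intervals between adjacent ports."""
    if port_count <= 0:
        raise ValueError("port_count must be positive")
    return [i * period_ms / port_count for i in range(port_count)]


# Hypothetical router with 4 ports and a 100 ms monitoring period.
offsets = poll_offsets(100, 4)   # each port is revisited once per period
```

With these assumed values, adjacent ports are polled 25 ms apart, and any single port is checked once every 100 ms.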
When the current monitoring period arrives, that is, when polling of one of the ports on the router begins, it is determined whether the buffer state of the currently polled port queue satisfies the preset abnormal buffer state condition. In this embodiment, the currently polled port queue is the port queue whose polling begins when the current monitoring period arrives. The buffer state of a port queue is the state of the amount of packet data buffered in it, for example empty, not full (that is, not congested) or full (at risk of congestion). In this embodiment, the abnormal buffer state is judged against the preset buffer state condition: if the condition is satisfied, the queue is in the abnormal buffer state; otherwise it is in the normal buffer state, and monitoring waits for the next monitoring period to arrive.
Preferably, to allow port queue congestion to be detected promptly or even in advance, the abnormal buffer state is defined as a state in which there is a risk of congestion, that is, the buffer state of a queue that is about to become congested. The preset buffer state condition is determined to match this definition and is set according to actual needs; for example, the condition may be based on the buffer size of the port queue. Therefore, when the buffer state of the currently polled port queue satisfies the preset condition, it can at least be determined that the queue is at risk of congestion.
Step S120: when the buffer state of the port queue satisfies the abnormal buffer state condition, determining whether a first transmission state of the port queue satisfies a preset abnormal transmission state condition;
When the buffer state of the polled port queue satisfies the preset abnormal buffer state condition, that is, when it can at least be determined that the queue is at risk of congestion, it must be further determined whether congestion actually exists. This is judged by whether the first transmission state of the port queue satisfies the preset abnormal transmission state condition.
In this embodiment, the transmission state of the port queue ("first" is merely a label to distinguish it) is judged against the preset abnormal transmission state condition: if the condition is satisfied, the queue is in the abnormal transmission state; otherwise it is in the normal transmission state, and monitoring waits for the next monitoring period to arrive.
Preferably, to allow port queue congestion to be monitored accurately, the abnormal transmission state is defined as a state in which no packets are being sent. The preset transmission state condition is determined to match this definition and is set according to actual needs; for example, the condition may be based on the packet count of the port queue. Therefore, when the transmission state of the currently polled port queue satisfies the preset condition, it can largely be concluded that the queue is congested.
Step S130: when the first transmission state of the port queue satisfies the abnormal transmission state condition, extending the current monitoring period by a preset delay duration to determine whether a second transmission state of the port queue satisfies the abnormal transmission state condition;
Step S140: when the second transmission state of the port queue satisfies the abnormal transmission state condition, determining that the port queue is congested.
When the first transmission state of the polled port queue satisfies the preset abnormal transmission state condition, congestion may still be misjudged. For example, a packet may enter the port queue at the very moment the monitoring period arrives but not yet have been sent out of the queue, so it has not been counted, which leads to a misjudgment.
Therefore, to guard against such misjudgment, this embodiment extends the current monitoring period by the preset delay duration and then determines once more whether the transmission state of the currently polled port queue ("second" is merely a label to distinguish it) satisfies the preset abnormal transmission state condition. If the condition is satisfied again, it can be accurately determined that the currently polled port queue is not sending packets, that is, the queue is congested. Otherwise, the queue is determined not to be congested, and monitoring waits for the next monitoring period to arrive.
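Steps S110 to S140 can be put together in a short Python sketch. It is an illustration under stated assumptions, not the patented implementation: `check_port_queue` and its callable parameters are hypothetical, the packet count is modelled as a single cumulative integer, and reading the counter on every poll (so that the previous period's value is available) is an assumption.

```python
from typing import Callable, Tuple


def check_port_queue(
    read_depth: Callable[[], int],       # hypothetical: current buffered amount
    read_tx_count: Callable[[], int],    # hypothetical: cumulative sent-packet count
    previous_count: int,                 # count recorded in the previous period
    depth_threshold: int,
    wait_delay: Callable[[], None],      # hypothetical: blocks for the preset delay
) -> Tuple[bool, int]:
    """Return (congested, latest_count) for one polled port queue."""
    first_count = read_tx_count()        # count at the end of the current period
    if read_depth() < depth_threshold:
        return False, first_count        # S110: buffer state is normal
    if first_count != previous_count:
        return False, first_count        # S120: packets were sent this period
    wait_delay()                         # S130: extend the period by the delay
    third_count = read_tx_count()
    # S140: still no packet sent after the delay, so the queue is congested
    return third_count == first_count, third_count


# Hypothetical stuck queue: depth above threshold, counter frozen at 100.
counts = iter([100, 100])
result = check_port_queue(lambda: 5000, lambda: next(counts), 100, 4096, lambda: None)
```

The second read after the delay is what filters out the boundary case described above, where a packet entered the queue just as the period ended and had not yet been counted.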
In this embodiment, all port queues are monitored for congestion through periodic polling, so that a congestion problem can be pinpointed to one or more specific port queues and recovery need only be performed on the identified queues, avoiding any impact on other, normal port queues. In addition, by monitoring the buffer state of a port queue, a potential risk of congestion can be identified promptly or even in advance, and the transmission state of the port queue is then monitored to determine whether congestion actually exists. Furthermore, to rule out misjudgment of the transmission state, a delay is applied before checking again whether the port queue is congested. This further improves the accuracy of the congestion determination, so that congested port queues can be identified promptly and accurately and handled in time, reducing the negative impact of port queue congestion.
Referring to FIG. 2, FIG. 2 is a detailed flowchart of step S110 in FIG. 1. Based on the above embodiment, in this embodiment step S110 includes:
Step S1101: when the current monitoring period arrives, obtaining a real-time depth value of the port queue, where the depth value measures the amount of packet data buffered in the port queue;
Step S1102: judging whether the real-time depth value of the port queue is greater than or equal to a preset depth threshold, so as to determine whether the buffer state of the port queue satisfies the abnormal buffer state condition;
Step S1103: when the real-time depth value of the port queue is greater than or equal to the depth threshold, determining that the buffer state of the port queue is the abnormal buffer state.
In this embodiment, the depth value of a port queue measures the amount of packet data buffered in the queue. The real-time depth value of the port queue is obtained and compared with the preset depth threshold to judge the buffer state of the queue, that is, whether a certain amount of packets waiting to be sent has accumulated in it.
It should be noted that the preset depth threshold must be smaller than the buffer of the port queue. Otherwise, when the port queue is congested, its real-time depth value can never exceed the preset depth threshold, and it becomes impossible to judge correctly whether the queue is congested. The preset depth threshold must also not be set too high, or a congested port queue will not be detected quickly and traffic will be interrupted for too long. In this embodiment the preset depth threshold is therefore set according to the actual situation, for example on the order of kilobytes. It should be further noted that the port queue buffers of all ports on a router are generally allocated by default, with the same default size, so a single preset depth threshold is sufficient for all port queues on the router.
In this embodiment, the abnormal buffer state condition is preferably that the real-time depth value of the port queue is greater than or equal to the preset depth threshold. If the real-time depth value is less than the preset depth threshold, the port queue is in the normal buffer state, that is, it is not congested. Conversely, if the real-time depth value is greater than or equal to the preset depth threshold, the amount of packet data buffered in the port queue has reached or exceeded the threshold, so the queue is in the abnormal buffer state and at risk of congestion. Setting a preset depth threshold thus makes it possible to detect the risk of port queue congestion promptly and to intervene in time on queues at risk, minimizing the impact of port congestion.
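The threshold rule and the depth comparison described above might look like the following in Python. The function names, the 4 KB threshold and the 64 KB buffer capacity are hypothetical values chosen only for illustration; the source states merely that the threshold is kilobyte-level and smaller than the queue buffer.

```python
def validate_depth_threshold(threshold: int, buffer_capacity: int) -> None:
    """Reject a threshold that the queue depth could never exceed."""
    if not 0 < threshold < buffer_capacity:
        raise ValueError("threshold must be positive and below the buffer capacity")


def buffer_state_abnormal(depth: int, threshold: int) -> bool:
    """Abnormal buffer state: real-time depth is at or above the threshold."""
    return depth >= threshold


THRESHOLD = 4 * 1024     # hypothetical kilobyte-level depth threshold, in bytes
CAPACITY = 64 * 1024     # hypothetical default buffer size shared by all ports

validate_depth_threshold(THRESHOLD, CAPACITY)
at_risk = buffer_state_abnormal(5000, THRESHOLD)   # depth above the threshold
```

Because all port queues are assumed to share the same default buffer size, the single `THRESHOLD` constant can be reused for every port, as the text suggests.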
Referring to FIG. 3, FIG. 3 is a detailed flowchart of step S120 in FIG. 1. Based on the above embodiment, in this embodiment, step S120 includes:
Step S1201: when the buffer state of the port queue satisfies the abnormal buffer state condition, obtain a first packet count and a second packet count sent by the port queue in the current monitoring period and in the previous monitoring period, respectively, where the packet counts are kept by a dual counter with a high-order part and a low-order part;
When it is determined that the buffer state of the currently polled port queue satisfies the abnormal buffer state condition, that is, the currently polled port queue is at risk of congestion, it must be further determined whether the port queue is actually congested, that is, whether the transmission state of the currently polled port queue satisfies the preset abnormal transmission state condition.
In this embodiment, the abnormal transmission state is preferably that the packet counts for two adjacent monitoring periods are equal, that is, no packet was sent across the two adjacent monitoring periods. The two adjacent monitoring periods are the current monitoring period and the monitoring period immediately preceding it. Specifically, the first packet count sent in the current monitoring period (denoted s1) and the second packet count sent in the previous monitoring period (denoted s0) are read, in non-clearing mode, from the preset dual counter used for packet counting. Non-clearing mode means that the count data is not reset to zero when it is read.
In this embodiment, for a TM chip that supports port queue transmit packet counting, the transmit packet count provided by the TM chip hardware is obtained directly through the dual counter used for packet counting. For a TM chip that does not support port queue transmit packet counting, all packets of a port queue are sent directly to the corresponding port on the interface daughter card. In view of this, this embodiment optionally obtains, through the dual counter used for packet counting, the count of packets received by the corresponding port on the interface daughter card for the port queue, and uses this count as the port queue's transmit packet count.
Optionally, because a single counter may overflow and wrap around between two adjacent readings within a short time, a dual counter with high-order and low-order parts is preferably used for packet counting, so that whether the packet counts in two adjacent monitoring periods are equal can be judged more accurately.
The wider the counter used for the port queue's transmit packet count, the less likely the count is to overflow and wrap around within one monitoring period, and the more accurate the comparison of counts. To compare transmit packet counts accurately, this embodiment uses a dual counter with high-order and low-order parts to count the packets sent by the port queue. For example, with an 8-bit dual counter, the two counters hold the high 8 bits and the low 8 bits; with a 16-bit dual counter, the high 16 bits and the low 16 bits; with a 32-bit dual counter, the high 32 bits and the low 32 bits. When the low-order counter is full, the high-order counter is incremented by 1 and the low-order counter is cleared; when the high-order counter is full, the high-order counter is cleared.
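The carry and wraparound behavior of the dual counter can be modeled as follows. This Python sketch is only an illustration under stated assumptions: the class name `DualCounter`, its `width` parameter, and its methods are hypothetical, and a real TM chip would expose such counters as hardware registers.

```python
from typing import Tuple


class DualCounter:
    """Hypothetical software model of a high/low dual counter of a given bit width."""

    def __init__(self, width: int = 32) -> None:
        self.width = width
        self._mask = (1 << width) - 1
        self.high = 0
        self.low = 0

    def increment(self, packets: int = 1) -> None:
        """Add packets; a full low half carries into the high half, a full high half wraps to 0."""
        total = ((self.high << self.width) | self.low) + packets
        self.low = total & self._mask
        self.high = (total >> self.width) & self._mask

    def read(self) -> Tuple[int, int]:
        """Non-clearing read: return (high, low) without resetting the counter."""
        return (self.high, self.low)
```

With `width=8`, counting 255 packets gives `(0, 255)`, one more packet carries to `(1, 0)`, and the pair as a whole wraps only after 65536 packets, compared with 256 for a single 8-bit counter.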
Step S1202: compare whether the high-order part and the low-order part of the first packet count are respectively equal to the high-order part and the low-order part of the second packet count, to determine whether the first transmission state of the port queue satisfies the abnormal transmission state condition;
Step S1203: when the high-order part and the low-order part of the first packet count are respectively equal to the high-order part and the low-order part of the second packet count, determine that the first transmission state of the port queue is the abnormal transmission state.
After the first packet count s1 for the current monitoring period and the second packet count s0 for the previous monitoring period are obtained, s1 is compared with s0 to determine the transmission state of the port queue, that is, whether the port queue is sending packets. If the port queue is sending packets, it is in the normal transmission state; otherwise, it is in the abnormal transmission state.
To avoid misjudgment caused by overflow and wraparound of the packet count, this embodiment does not compare the magnitudes of s1 and s0 but preferably compares whether s1 and s0 are equal. Because the packet count in this embodiment is kept by a dual counter with high-order and low-order parts, the high-order parts and the low-order parts of s1 and s0 must each be compared. If the high-order parts differ and/or the low-order parts differ, the port queue sent packets between the two adjacent readings. If the high-order parts are equal and the low-order parts are equal, the port queue sent no packets between the two adjacent readings.
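The equality comparison of the two halves can be sketched in Python as follows. The function name `sent_nothing` and the `(high, low)` tuple representation are assumptions made for illustration; the numeric values are made up to show a wraparound of an 8-bit low-order part.

```python
from typing import Tuple

Count = Tuple[int, int]  # (high, low) snapshot of the dual counter


def sent_nothing(s0: Count, s1: Count) -> bool:
    """True when both halves match, i.e. no packet was counted between the two reads."""
    return s1[0] == s0[0] and s1[1] == s0[1]


# Illustrative 8-bit pair whose low half wrapped: low went 250 -> 4, high 0 -> 1.
before, after = (0, 250), (1, 4)
traffic_seen = not sent_nothing(before, after)  # equality test still detects traffic
low_half_decreased = after[1] < before[1]       # a magnitude test on the low half would mislead
```

Testing for equality rather than for "s1 greater than s0" means a wrapped counter is never mistaken for a counter that stood still, as long as the whole pair does not wrap back to exactly the same value within one period.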
In this embodiment, the abnormal transmission state condition is preferably that the port queue sent no packets between two adjacent readings, that is, no packet was sent across two adjacent monitoring periods. Judging the buffer state of the port queue identifies the port queues at risk of congestion, and judging the transmission state then determines more precisely which port queues are actually congested. The two-level progressive monitoring mechanism of buffer state and transmission state used in this embodiment identifies congested port queues more promptly and accurately, so that intervention can be timely and the impact of port queue congestion is minimized.
In addition, this embodiment uses a dual counter with high-order and low-order parts to count the packets sent by the port queue, which solves the problem that a single counter may overflow and wrap around between two adjacent readings within a short time, so that whether two adjacent transmit packet counts are equal can be judged more accurately. Also, by obtaining the received packet count of the corresponding port on the interface daughter card and using it as the port queue's transmit packet count, this embodiment solves the problem that the transmission state of a port queue cannot be judged when the TM chip hardware does not support port queue transmit packet counting. Furthermore, with s1 and s0 serving as the parameters for the packet count of the current monitoring period and that of the previous monitoring period, respectively, the value of s1 is preferably assigned to s0 at the end of the current monitoring period for use in the comparison of the next monitoring period.
Referring to FIG. 4, FIG. 4 is a detailed flowchart of step S130 in FIG. 1. Based on the above embodiment, in this embodiment, step S130 includes:
Step S1301: when the first transmission state of the port queue satisfies the abnormal transmission state condition, extend the current monitoring period by the delay duration;
When the transmission state of the polled port queue satisfies the abnormal transmission state condition, that is, the port queue sent no packets across two adjacent monitoring periods, it still cannot be concluded that the port queue is congested and unable to send. For example, if a packet enters the port queue at the critical moment when the monitoring period arrives, the packet has not yet left the port queue, so the counter has not yet counted it. The transmit packet count s1 for the current monitoring period may then equal the count s0 for the previous monitoring period, and the port queue is judged not to have sent packets. If the port queue is in fact not congested, this result is a false determination that no packet was sent across the two adjacent monitoring periods.
In this embodiment, to address the possible misjudgment of port queue congestion described above, a delay duration T1 is introduced: counting continues for T1 before the packet counts are compared again. The delay duration T1 must be smaller than the monitoring period T0 of the port queue. Otherwise, by the time T1 has elapsed and the port queue's transmit packet count has been obtained, the next monitoring period has already begun, and it can no longer be guaranteed that the transmit packet counts are compared within the same monitoring period.
In addition, the product of the delay duration T1 and the number N of ports polled within the monitoring period T0 must preferably be smaller than the monitoring period T0. Otherwise, if all N ports are delayed by T1 in turn, the transmit packet count of at least one port is obtained only after the next monitoring period has begun, and it cannot be guaranteed that the transmit packet counts are compared within the same monitoring period.
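The two timing constraints, T1 < T0 and N × T1 < T0, can be checked when the parameters are configured. The following Python sketch is illustrative only; the function name `validate_timing` and its parameter names are hypothetical, and the values in the usage note are made-up examples rather than values from the embodiment.

```python
def validate_timing(period_t0: float, delay_t1: float, ports_n: int) -> None:
    """Check that the delay fits inside one monitoring period even if every port is delayed."""
    if period_t0 <= 0 or delay_t1 <= 0 or ports_n <= 0:
        raise ValueError("T0, T1 and N must all be positive")
    if delay_t1 >= period_t0:
        raise ValueError("delay duration T1 must be smaller than monitoring period T0")
    if ports_n * delay_t1 >= period_t0:
        raise ValueError("N * T1 must be smaller than monitoring period T0")
```

For instance, with T0 = 1 s and N = 48 ports, a delay of 10 ms passes the check because 48 × 0.01 s = 0.48 s, while a delay of 50 ms fails it because 48 × 0.05 s = 2.4 s exceeds the period.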
Step S1302: when the delay duration has elapsed, obtain a third packet count, which is the total number of packets sent by the port queue in the previous monitoring period and within the delay duration;
Step S1303: compare whether the high-order part and the low-order part of the first packet count are respectively equal to the high-order part and the low-order part of the third packet count, to determine whether the second transmission state of the port queue satisfies the abnormal transmission state condition;
Step S1304: when the high-order part and the low-order part of the first packet count are respectively equal to the high-order part and the low-order part of the third packet count, determine that the second transmission state of the port queue is the abnormal transmission state.
When the preset delay duration has elapsed, the count of packets received by the corresponding port on the interface daughter card for the port queue is read in non-clearing mode. The third packet count s2 obtained at this point is the sum of the transmit packet count s1 of the preceding monitoring period and the count of packets sent within the delay duration. Comparing whether the high-order and low-order parts of s1 and s2 are respectively equal determines whether the count of packets sent within the delay duration is zero, which further confirms the earlier result and avoids misjudging port queue congestion. If, after the delay, the transmission state of the port queue still satisfies the preset abnormal transmission state condition, that is, it is still determined that no packet was sent, the currently polled port queue can be accurately determined to be congested.
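The delayed re-check of steps S1301 to S1304 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: `confirm_congested`, the `read_count` callback, and the injectable `sleep` parameter are hypothetical names introduced here.

```python
import time
from typing import Callable, Tuple

Count = Tuple[int, int]  # (high, low) snapshot of the dual counter


def confirm_congested(
    read_count: Callable[[], Count],
    s1: Count,
    delay_t1: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Wait for the delay duration, re-read the counter, and report whether it is unchanged."""
    sleep(delay_t1)      # extend the current monitoring period by T1
    s2 = read_count()    # non-clearing read: s1 plus packets sent during T1
    return s2[0] == s1[0] and s2[1] == s1[1]
```

Passing the sleep function as a parameter is a design choice made only so that the sketch can be exercised without actually waiting; a packet that entered the queue right at the period boundary would have been counted by the time `read_count` is called, so `confirm_congested` would return `False` for it.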
In this embodiment, to avoid the misjudgment caused by a packet entering the port queue at the critical moment when the monitoring period arrives, delayed counting is introduced so that the transmission state during the extended time is judged again. Congested port queues are thereby identified precisely, and the accuracy of determining that a port queue is congested is improved.
Referring to FIG. 5, FIG. 5 is a flowchart of a second embodiment of the port queue congestion monitoring method of the present invention. In this embodiment, the following steps are included after step S140:
Step S210: when congestion of the port queue is detected, close the port queue;
Step S220: clear the packets buffered in the port queue while retaining the configuration parameters needed to reset the port queue;
Step S230: reset the port queue according to the retained configuration parameters, restoring it to the initial state it was in before it sent any packets;
Step S240: after the port queue has been restored to the initial state, enable the port queue so that it resumes sending packets.
In this embodiment, when congestion is detected in a polled port queue, the port queue is closed so that packets cannot enter it. All packets buffered in the port queue are then cleared, while the configuration parameters needed to reset the port queue are retained, such as the priority and weight used for packet scheduling in the port queue and the rate-limiting policy for packet transmission.
After the packets buffered in the congested port queue have been cleared and the configuration parameters needed to reset the port queue have been retained, the port queue is reset according to the retained configuration parameters to restore the initial state it was in before it sent any packets. The port queue is then activated by enabling it, so that it again sends packets.
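The recovery sequence of steps S210 to S240 can be illustrated with the following Python sketch. It models the queue purely in software: the `PortQueueModel` class, its fields, and the configuration keys in the usage note are hypothetical stand-ins for whatever the TM chip driver actually exposes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PortQueueModel:
    """Hypothetical software model of one port queue."""
    config: Dict[str, int]                      # e.g. scheduling priority, weight, rate limit
    packets: List[bytes] = field(default_factory=list)
    enabled: bool = True
    congested: bool = False


def recover(queue: PortQueueModel) -> None:
    """Recover only the congested queue, leaving other ports untouched."""
    queue.enabled = False           # S210: close the queue so no packet can enter
    saved = dict(queue.config)      # S220: retain the configuration needed for the reset
    queue.packets.clear()           # S220: drop every buffered packet
    queue.config = saved            # S230: reset to the initial state from the saved config
    queue.congested = False
    queue.enabled = True            # S240: enable the queue so sending resumes
```

After `recover` is called on a queue configured with, say, `{"priority": 3, "weight": 10}`, the buffered packets are gone, the configuration is unchanged, and the queue is enabled again.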
When a port queue on a carrier-grade router in a live network becomes congested, there are two common remedies. The first is to reseat or restart the interface daughter card corresponding to the congested port queue, which interrupts traffic on the normal ports of that daughter card for tens of seconds. The second is to reseat or restart the line card, which interrupts traffic on all normal ports of the line card for several minutes. Both operations cause prolonged traffic interruption on other, normal port queues.
In this embodiment, the buffer state and transmission state of the port queues are monitored periodically, and when congestion is detected, only the congested port queue is recovered. Reseating or restarting the interface daughter card or line card is thus avoided, the interruption of user traffic is reduced from tens of seconds or several minutes to the order of seconds, the scope affected by a port queue congestion fault is narrowed, and the stability of the carrier-grade router is improved.
Referring to FIG. 6, FIG. 6 is a flowchart of a third embodiment of the port queue congestion monitoring method of the present invention. In this embodiment, the following step is included before step S110:
Step S001: create a monitoring thread for the port queues and, within the preset monitoring period, poll all ports on the router to monitor for port queue congestion.
In this embodiment, after the router starts, creates the port queues, and begins sending packets through them, the monitoring thread for the port queues is created. The monitoring period with which the thread polls all ports for port queue congestion, the depth threshold used to determine the buffer state of a port queue, and the delay duration used to further determine the transmission state of a port queue are set at the same time.
By creating a monitoring thread for the port queues and using polling, this embodiment monitors all port queues for congestion in real time, so that congested port queues can be determined more promptly and accurately. This provides an effective way to resolve congested port queues in a targeted manner, narrowing the scope affected by a port queue congestion fault and shortening the interruption of user traffic.
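The polling monitor of step S001, combined with the buffer state check and the two transmission state checks, might be organized as in the following Python sketch. All names (`Port`, `poll_once`, `start_monitor`, `on_congested`) are hypothetical. One simplifying assumption is labeled in the code: the sketch reads every port's count in each pass so that s0 is always fresh, whereas the text reads the counts only once the buffer state is abnormal.

```python
import threading
import time
from typing import Callable, Dict, List, Tuple

Count = Tuple[int, int]  # (high, low) snapshot of the dual counter


class Port:
    """Hypothetical stand-in for one port queue."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.depth = 0              # real-time depth value
        self.sent: Count = (0, 0)   # cumulative transmit count, read without clearing


def poll_once(ports: List[Port], previous: Dict[str, Count], threshold: int,
              delay_t1: float, sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """One polling pass over all ports; returns the names of congested port queues."""
    congested = []
    for port in ports:
        s0 = previous.get(port.name)
        s1 = port.sent
        previous[port.name] = s1    # assumption: s1 becomes s0 for the next period on every pass
        if port.depth < threshold:  # normal buffer state
            continue
        if s0 is None or s1 != s0:  # no history yet, or packets were sent
            continue
        sleep(delay_t1)             # extend the period by T1 and check again
        if port.sent == s1:
            congested.append(port.name)
    return congested


def start_monitor(ports: List[Port], threshold: int, period_t0: float, delay_t1: float,
                  on_congested: Callable[[str], None]) -> threading.Event:
    """Start a daemon monitoring thread; set the returned event to stop it."""
    stop = threading.Event()
    previous: Dict[str, Count] = {}

    def run() -> None:
        while not stop.wait(period_t0):
            for name in poll_once(ports, previous, threshold, delay_t1):
                on_congested(name)

    threading.Thread(target=run, daemon=True).start()
    return stop
```

The `on_congested` callback is where a recovery routine for the individual queue would be invoked, so that only the affected port queue is touched.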
Referring to FIG. 7, FIG. 7 is a functional block diagram of a first embodiment of the port queue congestion monitoring system of the present invention. In this embodiment, the port queue congestion monitoring system includes:
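A buffer state determination module 10, configured to determine, when the current monitoring period arrives, whether the buffer state of the currently polled port queue satisfies a preset abnormal buffer state condition;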
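When the current monitoring period arrives, that is, when polling of one of the ports on the router begins, the buffer state determination module 10 determines whether the buffer state of the currently polled port queue satisfies the preset abnormal buffer state condition. In this embodiment, the currently polled port queue is the port queue whose polling begins when the current monitoring period arrives. The buffer state of a port queue is the state of the volume of packets buffered in it, for example empty, not full (that is, not congested), or full (that is, at risk of congestion). In this embodiment, the abnormal buffer state is judged by the preset buffer state condition: if the condition is satisfied, the port queue is in the abnormal buffer state; otherwise, it is in the normal buffer state, and monitoring waits for the next monitoring period.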
Preferably, so that port queue congestion can be detected in time or even in advance, the abnormal buffer state is defined as a state in which there is a risk of congestion, that is, the buffer state just before congestion occurs. The preset buffer state condition is determined accordingly and is set according to actual needs; for example, the condition may concern the buffer size of the port queue or the volume of received packets. Therefore, when the buffer state of the currently polled port queue satisfies the preset condition, it can at least be determined that the currently polled port queue is at risk of congestion.
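A first transmission state determination module 20, configured to determine, when the buffer state of the port queue satisfies the abnormal buffer state condition, whether a first transmission state of the port queue satisfies a preset abnormal transmission state condition;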
When the buffer state of the polled port queue satisfies the preset abnormal buffer state condition, that is, when it can at least be determined that the currently polled port queue is at risk of congestion, the first transmission state determination module 20 must further determine whether congestion actually exists, by judging whether the first transmission state of the port queue satisfies the preset abnormal transmission state condition.
In this embodiment, the transmission state of the port queue (the "first" in first transmission state is only a naming distinction) is judged by the preset abnormal transmission state condition: if the condition is satisfied, the port queue is in the abnormal transmission state; otherwise, it is in the normal transmission state, and monitoring waits for the next monitoring period.
Preferably, so that port queue congestion can be monitored accurately, the abnormal transmission state is defined as a state in which no packets are sent. The preset transmission state condition is determined accordingly and is set according to actual needs; for example, the condition may concern the packet count of the port queue or the number of packets sent per unit time. Therefore, when the transmission state of the currently polled port queue satisfies the preset condition, it can essentially be determined that the currently polled port queue is congested.
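A second transmission state determination module 30, configured to extend, when the first transmission state of the port queue satisfies the abnormal transmission state condition, the current monitoring period by a preset delay duration to determine whether a second transmission state of the port queue satisfies the abnormal transmission state condition;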
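A congestion determination module 40, configured to determine that the port queue is congested when the second transmission state of the port queue satisfies the abnormal transmission state condition.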
When the first transmission state of the polled port queue satisfies the preset abnormal transmission state condition, port queue congestion may still be misjudged. For example, a packet may enter the port queue at the critical moment when the monitoring period arrives but not yet have left the port queue, so that the packet is not counted and a misjudgment results.
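Therefore, in view of this possible misjudgment of port congestion, in this embodiment the second transmission state determination module 30 extends the current monitoring period by the preset delay duration and determines again, more precisely, whether the transmission state of the currently polled port queue (the "second" in second transmission state is only a naming distinction) satisfies the preset abnormal transmission state condition. If the condition is satisfied again, the congestion determination module 40 can accurately determine that the currently polled port queue is not sending packets, that is, that it is congested. Otherwise, it is determined that the currently polled port queue is not congested, and monitoring waits for the next monitoring period.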
In this embodiment, all port queues are monitored for congestion by periodic polling, so that a congestion problem can be narrowed down to one or several specific port queues. Recovery then needs to be performed only on those port queues, without affecting other, normal port queues. In addition, monitoring the buffer state of a port queue makes it possible to determine in time, or even in advance, whether the port queue is at potential risk of congestion, and monitoring its transmission state then determines whether congestion currently exists. Furthermore, to rule out misjudgment when monitoring the transmission state, a delay is used to confirm whether the current port queue is congested. This further improves the accuracy of the congestion determination, so that congested port queues are identified promptly and accurately and handled in time, reducing the negative impact of port queue congestion.
Referring to FIG. 8, FIG. 8 is a detailed functional block diagram of the buffer state determination module in FIG. 7. Based on the above embodiment, in this embodiment, the buffer state determination module 10 includes:
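A depth value acquisition unit 101, configured to obtain the real-time depth value of the port queue when the current monitoring period arrives, where the depth value measures the buffer usage of the port queue;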
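A depth value judgment unit 102, configured to judge whether the real-time depth value of the port queue is greater than or equal to a preset depth threshold, to determine whether the buffer state of the port queue satisfies the abnormal buffer state condition;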
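A buffer state determination unit 103, configured to determine that the buffer state of the port queue is the abnormal buffer state when the real-time depth value of the port queue is greater than or equal to the depth threshold.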
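In this embodiment, the depth value of a port queue measures the volume of packets buffered in the port queue. The depth value acquisition unit 101 obtains the real-time depth value of the port queue, and the depth value judgment unit 102 compares the real-time depth value with the preset depth threshold to judge the buffer state of the port queue, that is, whether a certain number of packets waiting to be sent are buffered in the port queue. If the real-time depth value of the port queue is greater than or equal to the depth threshold, the buffer state determination unit 103 determines that the buffer state of the port queue satisfies the abnormal buffer state condition.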
It should be noted that the preset depth threshold must be smaller than the port queue buffer. Otherwise, when the port queue is congested, its real-time depth value can never exceed the preset depth threshold, so it is ultimately impossible to determine correctly whether the port queue is congested. The preset depth threshold also must not be set too large; otherwise, congestion of the port queue cannot be detected promptly, and the traffic interruption lasts too long. Therefore, in this embodiment, the preset depth threshold is set according to the actual situation, for example on the order of kilobytes. It should further be noted that the port queue buffers of all ports on a router are generally allocated by default and with the same size, so a single preset depth threshold suffices for all port queues on the router.
In this embodiment, the abnormal buffer state condition is preferably that the real-time depth value of the port queue is greater than or equal to the preset depth threshold. If the real-time depth value is smaller than the preset depth threshold, the port queue is in the normal buffer state, that is, the port queue is not congested. Conversely, if the real-time depth value is greater than or equal to the preset depth threshold, the volume of packets buffered in the port queue has reached the preset depth threshold, so that the port queue is at risk of congestion, that is, it is in the abnormal buffer state. Setting the preset depth threshold makes it possible to detect the risk of port queue congestion in time and to intervene promptly on the port queue at risk, minimizing the impact of port congestion.
Referring to FIG. 9, FIG. 9 is a detailed functional block diagram of the first transmission state determination module in FIG. 7. Based on the above embodiment, in this embodiment, the first transmission state determination module 20 includes:
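A first packet count acquisition unit 201, configured to obtain, when the buffer state of the port queue satisfies the abnormal buffer state condition, a first packet count and a second packet count sent by the port queue in the current monitoring period and in the previous monitoring period, respectively, where the packet counts are kept by a dual counter with a high-order part and a low-order part;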
When it is determined that the buffer state of the currently polled port queue satisfies the abnormal buffer state condition, that is, the currently polled port queue is at risk of congestion, it must be further determined whether the port queue is actually congested, that is, whether the transmission state of the currently polled port queue satisfies the preset abnormal transmission state condition.
In this embodiment, the abnormal transmission state is preferably that the packet counts for two adjacent monitoring periods are equal, that is, no packet was sent across the two adjacent monitoring periods. The two adjacent monitoring periods are the current monitoring period and the monitoring period immediately preceding it. Specifically, the first packet count acquisition unit 201 reads, in non-clearing mode, the first packet count sent in the current monitoring period (denoted s1) and the second packet count sent in the previous monitoring period (denoted s0) from the preset dual counter used for packet counting. Non-clearing mode means that the count data is not reset to zero when it is read.
In this embodiment, for a TM chip that supports port queue transmit packet counting, the transmit packet count provided by the TM chip hardware is obtained directly through the dual counter used for packet counting. For a TM chip that does not support port queue transmit packet counting, all packets of a port queue are sent directly to the corresponding port on the interface daughter card. In view of this, this embodiment optionally obtains, through the dual counter used for packet counting, the count of packets received by the corresponding port on the interface daughter card for the port queue, and uses this count as the port queue's transmit packet count.
Optionally, because a single counter may overflow and wrap around between two adjacent readings within a short time, a dual counter with high-order and low-order parts is preferably used for packet counting, so that whether the packet counts in two adjacent monitoring periods are equal can be judged more accurately.
The wider the counter used for the port queue's transmit packet count, the less likely the count is to overflow and wrap around within one monitoring period, and the more accurate the comparison of counts. To compare transmit packet counts accurately, this embodiment uses a dual counter with high-order and low-order parts to count the packets sent by the port queue. For example, with an 8-bit dual counter, the two counters hold the high 8 bits and the low 8 bits; with a 16-bit dual counter, the high 16 bits and the low 16 bits; with a 32-bit dual counter, the high 32 bits and the low 32 bits. When the low-order counter is full, the high-order counter is incremented by 1 and the low-order counter is cleared; when the high-order counter is full, the high-order counter is cleared.
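a first packet count comparing unit 202, configured to compare whether the high-order and low-order parts of the first packet count are respectively equal to the high-order and low-order parts of the second packet count, so as to determine whether the first transmission state of the port queue satisfies the abnormal transmission state condition;
a first transmission state determining unit 203, configured to determine that the first transmission state of the port queue is the abnormal transmission state when the high-order and low-order parts of the first packet count are respectively equal to those of the second packet count.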
After the first packet count s1 sent in the current monitoring period and the second packet count s0 sent in the previous monitoring period have been obtained, the first packet count comparing unit 202 compares s1 with s0 to judge the transmission state of the port queue, that is, to determine whether the port queue is sending packets. If the port queue is sending packets, it is in the normal transmission state; otherwise, it is in the abnormal transmission state.
To avoid misjudgment caused by overflow and wraparound of the packet count, this embodiment does not compare the magnitudes of s1 and s0 but preferably compares whether s1 and s0 are equal. Because the packet count in this embodiment is kept by a high-order and low-order dual counter, the high-order parts and the low-order parts of s1 and s0 must each be compared for equality. If the high-order parts differ and/or the low-order parts differ, the port queue sent packets between the two consecutive readings. If the high-order parts are equal and the low-order parts are equal, the first transmission state determining unit 203 determines that the port queue sent no packets between the two consecutive readings.
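The equality-based comparison can be illustrated with a short sketch. The function names and the `(high, low)` tuple representation are assumptions made for illustration; the source does not prescribe a data layout.

```python
from typing import Tuple

# Hypothetical representation: (high, low) halves of one dual-counter reading.
Count = Tuple[int, int]


def sent_packets(previous: Count, current: Count) -> bool:
    """Return True if the high halves differ and/or the low halves differ."""
    prev_high, prev_low = previous
    cur_high, cur_low = current
    return prev_high != cur_high or prev_low != cur_low


def is_abnormal_transmission(previous: Count, current: Count) -> bool:
    """Abnormal state: both halves are equal, so nothing was sent in between."""
    return not sent_packets(previous, current)
```

Because only equality is tested, a reading taken after a wraparound, for example `(0, 3)` following `(255, 250)` on an 8-bit dual counter, is still recognised as "packets were sent", whereas a magnitude comparison would see a smaller value.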
In this embodiment, the abnormal transmission state condition is preferably that the port queue sent no packets between two consecutive readings, that is, no packets were sent within two adjacent monitoring periods. Judging the buffer state of a port queue identifies port queues at risk of congestion, and judging the transmission state then pinpoints the port queues that are actually congested. This two-tier progressive monitoring of buffer state and transmission state identifies congested port queues more promptly and accurately, so that intervention can take place in time and the impact of port queue congestion is minimized.
In addition, this embodiment counts the packets sent by a port queue with a high-order and low-order dual counter, which solves the overflow and wraparound problem that a single counter has between two consecutive readings taken a short time apart, so that whether two consecutive transmit packet counts are equal can be judged more accurately. This embodiment also obtains the received packet count of the corresponding port on the interface sub-card and uses it as the transmit packet count of the port queue, which solves the problem that the transmission state of a port queue cannot be judged when the TM chip hardware does not support port-queue transmit packet counting.
Referring to FIG. 10, FIG. 10 is a schematic diagram of the detailed functional modules of the second transmission state determining module in FIG. 7. Based on the foregoing embodiment, in this embodiment the second transmission state determining module 30 includes:
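a delay unit 301, configured to extend the current monitoring period by the delay duration when the first transmission state of the port queue satisfies the abnormal transmission state condition;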
When the transmission state of the polled port queue satisfies the abnormal transmission state condition, that is, when the port queue sent no packets within two adjacent monitoring periods, it cannot yet be concluded that the port queue is congested and unable to send packets. For example, if a packet enters the port queue at the critical moment when the monitoring period arrives, the packet has not yet been sent out of the port queue (so the counter has not yet counted it). The transmit packet count s1 of the current monitoring period may therefore equal the transmit packet count s0 of the previous monitoring period, and the port queue is judged to have sent no packets. If the port queue is in fact not congested, this result is a misjudgment that no packets were sent within two adjacent monitoring periods.
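In this embodiment, to address this possible misjudgment of port queue congestion, the delay unit 301 waits for a delay duration T1 so that counting can continue, and the packet counts are then compared again. The delay duration T1 must be less than the monitoring period T0 of the port queue. Otherwise, by the time the delay duration T1 has elapsed and the transmit packet count of the port queue has been read, the next monitoring period has already begun, so the counts can no longer be guaranteed to be compared within the same monitoring period.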
In addition, preferably, the product of the delay duration T1 and the number N of ports polled within the monitoring period T0 must be less than the monitoring period T0. Otherwise, if all N ports are delayed by T1 in turn, at least one port will have its transmit packet count read only after the next monitoring period has begun, so the counts can no longer be guaranteed to be compared within the same monitoring period.
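The two timing constraints, T1 < T0 and N × T1 < T0, can be checked with a small helper before the monitor is configured. The function and parameter names are hypothetical and the values in the note below are illustrative only.

```python
def delay_is_valid(period_s: float, delay_s: float, polled_ports: int) -> bool:
    """Check the constraints T1 < T0 and N * T1 < T0.

    period_s is the monitoring period T0, delay_s is the delay duration T1,
    and polled_ports is the number N of ports polled in one period.
    """
    if period_s <= 0 or delay_s <= 0 or polled_ports < 1:
        raise ValueError("period and delay must be positive and N at least 1")
    # Worst case: every polled port needs the extra delay, one after another.
    worst_case_total_delay = polled_ports * delay_s
    return delay_s < period_s and worst_case_total_delay < period_s
```

For instance, with an assumed period of 10 s and 48 polled ports, a delay of 0.1 s per port passes the check (4.8 s in the worst case), while a delay of 0.5 s does not (24 s in the worst case).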
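a second packet count obtaining unit 302, configured to obtain, when the delay duration has elapsed, a third packet count, which is the total number of packets sent by the port queue within the previous monitoring period and the delay duration;
a second packet count comparing unit 303, configured to compare whether the high-order and low-order parts of the first packet count are respectively equal to the high-order and low-order parts of the third packet count, so as to determine whether the second transmission state of the port queue satisfies the abnormal transmission state condition;
a second transmission state determining unit 304, configured to determine that the second transmission state of the port queue is the abnormal transmission state when the high-order and low-order parts of the first packet count are respectively equal to those of the third packet count.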
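When the preset delay duration has elapsed, the second packet count obtaining unit 302 reads, in non-clearing mode (the read does not reset the counter), the count of packets received by the port on the interface sub-card that corresponds to the port queue. The third packet count s2 read at this point is the sum of the transmit packet count s1 of the previous monitoring period and the number of packets sent during the delay duration. The second packet count comparing unit 303 compares whether the high-order and low-order parts of s1 and s2 are respectively equal, that is, whether the number of packets sent during the delay duration is zero, so that the earlier judgment is confirmed and misjudgment of port queue congestion is avoided. If, after the delay, the transmission state of the port queue still satisfies the preset abnormal transmission state condition, that is, it is still determined that no packets were sent, the second transmission state determining unit 304 can accurately determine that the currently polled port queue is congested.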
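The delayed re-check can be sketched as follows. This is a minimal illustration under assumptions: `read_count` stands for whatever non-clearing read the hardware offers, and the injectable `sleep` parameter exists only so the sketch can be exercised without real waiting; neither name comes from the source.

```python
import time
from typing import Callable, Tuple

# Hypothetical representation: (high, low) halves of one dual-counter reading.
Count = Tuple[int, int]


def confirm_no_transmission(
    s1: Count,
    read_count: Callable[[], Count],
    delay_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Wait for the delay duration T1, then re-read the counter as s2.

    Returns True when s2 equals s1 in both halves, meaning no packet was
    sent during the delay, so the abnormal transmission state is confirmed.
    """
    sleep(delay_s)
    s2 = read_count()  # must be a non-clearing read
    return s2 == s1    # tuple equality compares the high and the low half
```

A packet that entered the queue just as the monitoring period arrived would be counted during the delay, so s2 would differ from s1 and the queue would not be reported as congested.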
In this embodiment, to avoid the misjudgment caused by a packet entering the port queue at the critical moment when the monitoring period arrives, a delayed count is introduced so that the transmission state during the extended time is judged further. Congested port queues are thereby identified precisely, and the accuracy of judging port queue congestion is improved.
Referring to FIG. 11, FIG. 11 is a schematic diagram of the functional modules of a second embodiment of the port queue congestion monitoring system of the present invention. In this embodiment, the port queue congestion monitoring system further includes:
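a port queue closing module 50, configured to close the port queue when congestion of the port queue is detected;
a buffered packet clearing module 60, configured to clear the packets buffered in the port queue while retaining the configuration parameters required for resetting the port queue;
a port queue reset module 70, configured to reset the port queue according to the retained configuration parameters, so as to restore the initial state that the port queue had before it sent any packets;
a port queue enabling module 80, configured to enable the port queue, after it has been restored to the initial state, so that the port queue resumes sending packets.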
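In this embodiment, when congestion of the polled port queue is detected, the port queue closing module 50 closes the port queue so that packets cannot enter it. At the same time, the buffered packet clearing module 60 clears all packets buffered in the port queue but retains the configuration parameters required for resetting the port queue, such as the packet scheduling priority and weight of the port queue and the rate-limiting policy for packet transmission.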
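After the packets buffered in the congested port queue have been cleared and the configuration parameters required for resetting the port queue have been retained, the port queue reset module 70 resets the port queue according to the retained configuration parameters, restoring the initial state that the port queue had before it sent any packets. The port queue enabling module 80 then activates the port queue by enabling it, so that the port queue resumes sending packets.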
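The close, clear, reset and enable sequence can be sketched against a purely hypothetical software model of a port queue. The `PortQueue` class, its fields and the configuration keys are assumptions made for illustration; a real TM chip would expose these operations through its own driver interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PortQueue:
    """Hypothetical software model of one port queue."""
    config: Dict[str, int]  # e.g. scheduling priority, weight, rate limit
    enabled: bool = True
    buffered: List[bytes] = field(default_factory=list)
    sent_count: int = 0     # assumed to be part of the resettable state


def recover_queue(queue: PortQueue) -> None:
    """Recover a single congested queue without touching other queues."""
    # 1. Close the queue so that no new packets can enter it.
    queue.enabled = False
    # 2. Clear the buffered packets but retain the configuration parameters.
    saved_config = dict(queue.config)
    queue.buffered.clear()
    # 3. Reset to the initial state (as before any packet was sent),
    #    re-applying the retained configuration.
    queue.sent_count = 0
    queue.config = saved_config
    # 4. Enable the queue again so that it resumes sending packets.
    queue.enabled = True
```

Only the affected queue object is modified, which mirrors the point of the embodiment: recovery is scoped to the congested port queue.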
When a port queue of a carrier-grade router on a live network becomes congested, two handling methods are commonly used. The first is to re-seat or restart the interface sub-card corresponding to the congested port queue; its drawback is that traffic on the normal ports of that interface sub-card is interrupted for tens of seconds. The second is to re-seat or restart the line card; its drawback is that traffic on all normal ports of the line card is interrupted for several minutes. Both operations cause long traffic interruptions on other, normal port queues.
In this embodiment, the buffer state and transmission state of the port queues are monitored periodically, and when congestion of a port queue is detected, only the congested port queue is recovered. This avoids re-seating or restarting interface sub-cards and line cards, reduces the interruption of user traffic from tens of seconds or several minutes to the order of seconds, narrows the scope affected by a port queue congestion fault, and improves the stability of the carrier-grade router.
Referring to FIG. 12, FIG. 12 is a schematic diagram of the functional modules of a third embodiment of the port queue congestion monitoring system of the present invention. In this embodiment, the port queue congestion monitoring system further includes:
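a monitoring thread creating module 90, configured to create a monitoring thread for the port queues and to perform, within the preset monitoring period, polling-based monitoring of port queue congestion on all ports of the router.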
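In this embodiment, when the router has started, the port queues have been created, and the port queues begin sending packets, the monitoring thread creating module 90 creates the monitoring thread for the port queues. The monitoring thread creating module 90 also sets the monitoring period in which the monitoring thread polls all ports for port queue congestion, the depth threshold used to determine the buffer state of a port queue, and the delay duration used to further determine the transmission state of a port queue.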
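A polling monitor thread with the three settings mentioned above can be sketched with the Python standard library. The names `MonitorSettings`, `start_monitor_thread` and `check_port` are hypothetical; the per-port check itself is left to the caller.

```python
import threading
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class MonitorSettings:
    period_s: float       # monitoring period T0
    depth_threshold: int  # depth threshold for the buffer-state check
    delay_s: float        # delay duration T1 for the second transmission check


def start_monitor_thread(
    ports: Iterable[int],
    check_port: Callable[[int, MonitorSettings], None],
    settings: MonitorSettings,
    stop: threading.Event,
) -> threading.Thread:
    """Create and start a thread that polls every port once per period."""
    port_list = list(ports)

    def run() -> None:
        # Event.wait returns False when the period times out and True once
        # stop has been set, which ends the loop.
        while not stop.wait(settings.period_s):
            for port in port_list:
                check_port(port, settings)

    thread = threading.Thread(target=run, name="port-queue-monitor", daemon=True)
    thread.start()
    return thread
```

Setting the `stop` event ends the loop at the next period boundary, so the thread can be shut down cleanly.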
By creating a monitoring thread for the port queues and using polling-based monitoring, this embodiment monitors congestion of all port queues in real time, so that congested port queues are identified more promptly and accurately. This provides an effective way to handle the congested port queues specifically, narrowing the scope affected by a port queue congestion fault and shortening the interruption of user traffic.
An embodiment of the present invention further provides a port queue congestion monitoring device, applied to a carrier-grade router including a plurality of ports, the device including:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
when the current monitoring period arrives, determine whether the buffer state of the currently polled port queue satisfies a preset abnormal buffer state condition;
when the buffer state of the port queue satisfies the abnormal buffer state condition, determine whether a first transmission state of the port queue satisfies a preset abnormal transmission state condition;
when the first transmission state of the port queue satisfies the abnormal transmission state condition, extend the current monitoring period by a preset delay duration to determine whether a second transmission state of the port queue satisfies the abnormal transmission state condition;
when the second transmission state of the port queue satisfies the abnormal transmission state condition, determine that the port queue is congested.
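The four configured steps can be read as one decision procedure, sketched below. Two points are assumptions rather than statements from this passage: the abnormal buffer state is modelled as the buffer depth exceeding a depth threshold, and `read_after_delay` is a hypothetical callable that waits for the delay duration and then returns the re-read count.

```python
from typing import Callable, Tuple

# Hypothetical representation: (high, low) halves of one dual-counter reading.
Count = Tuple[int, int]


def port_queue_is_congested(
    buffer_depth: int,
    depth_threshold: int,
    s0: Count,
    s1: Count,
    read_after_delay: Callable[[], Count],
) -> bool:
    """Two-tier check: buffer state first, then two transmission checks."""
    # Step 1: buffer state. ASSUMPTION: abnormal means depth above threshold.
    if buffer_depth <= depth_threshold:
        return False
    # Step 2: first transmission state. Unequal counts mean packets were sent.
    if s1 != s0:
        return False
    # Step 3: extend the period by the delay and read the third count s2.
    s2 = read_after_delay()
    # Step 4: congested only if nothing was sent during the delay either.
    return s2 == s1
```

The delayed read is reached only when both earlier checks are abnormal, so healthy queues incur no extra delay in this sketch.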
An embodiment of the present invention further provides a non-volatile computer-readable storage medium storing instructions that, when executed by a processor of a carrier-grade router including a plurality of ports, cause the router to perform a port queue congestion monitoring method, the method including the following steps:
when the current monitoring period arrives, determining whether the buffer state of the currently polled port queue satisfies a preset abnormal buffer state condition;
when the buffer state of the port queue satisfies the abnormal buffer state condition, determining whether a first transmission state of the port queue satisfies a preset abnormal transmission state condition;
when the first transmission state of the port queue satisfies the abnormal transmission state condition, extending the current monitoring period by a preset delay duration to determine whether a second transmission state of the port queue satisfies the abnormal transmission state condition;
when the second transmission state of the port queue satisfies the abnormal transmission state condition, determining that the port queue is congested.
The above are only preferred embodiments of the present invention and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.
The port queue congestion monitoring method of the present application can be applied to carrier-grade routers. It monitors congestion of all port queues by periodic polling, so that a congestion problem can be narrowed down to one or more specific port queues, and only those port queues need to be recovered, without affecting normal port queues.
Claims (13)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201510501920.6 | 2015-08-14 | | |
| CN201510501920.6A CN106470126A (en) | 2015-08-14 | 2015-08-14 | The monitoring method of port queue blocking and system |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2017028545A1 (en) | 2017-02-23 |
Family
ID=58051088
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2016/078670 Ceased WO2017028545A1 (en) | 2015-08-14 | 2016-04-07 | Port queue congestion monitoring method and system |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN106470126A (en) |
| WO (1) | WO2017028545A1 (en) |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109729014A (en) * | 2017-10-31 | 2019-05-07 | 深圳市中兴微电子技术有限公司 | A message storage method and device |
| CN113806102A (en) * | 2020-06-15 | 2021-12-17 | 中国移动通信集团浙江有限公司 | Message queue processing method and device and computing equipment |
| WO2023283902A1 (en) * | 2021-07-15 | 2023-01-19 | 新华三技术有限公司 | Message transmission method and apparatus |
| CN115757006A (en) * | 2022-09-27 | 2023-03-07 | 郑州云智信安安全技术有限公司 | Method and device for judging process running state based on port queue characteristics |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108390832B (en) * | 2018-02-12 | 2021-10-15 | 苏州盛科通信股份有限公司 | Method for configuring network chip calendar in mixed rate mode |
| CN110830382B (en) * | 2018-08-10 | 2025-01-14 | 华为技术有限公司 | Message processing method and device, communication equipment and switching circuit |
| CN114650232B (en) * | 2020-12-02 | 2024-03-12 | 中盈优创资讯科技有限公司 | Network quality analysis method and device based on QOS queue flow |
| CN118573617A (en) * | 2023-02-28 | 2024-08-30 | 北京车和家信息技术有限公司 | Information processing method, device, equipment and vehicle |
Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20030103458A1 (en) * | 2001-11-30 | 2003-06-05 | Lg Electronics Inc. | Congestion avoidance apparatus and method for communication network |
| US20040032827A1 (en) * | 2002-08-15 | 2004-02-19 | Charles Hill | Method of flow control |
| CN101459966A (en) * | 2009-01-06 | 2009-06-17 | 北京交通大学 | Ad Hoc network MAC layer QoS guarantee method based on IEEE802.16 |
| CN101646196A (en) * | 2009-08-31 | 2010-02-10 | 华为技术有限公司 | Method and device for ensuring business service quality |
| US20110235519A1 (en) * | 2010-03-24 | 2011-09-29 | Telefonaktiebolaget Lm Ericsson (Publ) | Delayed flow control action in transport network layer wcdma communications |
| CN102611605A (en) * | 2011-01-20 | 2012-07-25 | 华为技术有限公司 | Scheduling method, device and system of data exchange network |
| CN103312566A (en) * | 2013-06-28 | 2013-09-18 | 盛科网络(苏州)有限公司 | Message port congestion detection method and device |
| CN104581821A (en) * | 2015-01-28 | 2015-04-29 | 湘潭大学 | Congestion Control Method Based on Node Cache Length Fair Allocation Rate |
- 2015-08-14: CN application CN201510501920.6A (CN106470126A), status: Pending
- 2016-04-07: WO application PCT/CN2016/078670 (WO2017028545A1), status: Ceased
Non-Patent Citations (2)
| Title |
|---|
| SUN, GUODONG ET AL.: "A Congestion Control Scheme in Wireless Sensor Networks", JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, vol. 30, no. 10, 31 October 2008 (2008-10-31) * |
| SUN, GUODONG.: "Congestion Control in Wireless Sensor Networks", CHINA DOCTORAL DISSERTATIONS FULL-TEXT DATABASE, 15 November 2011 (2011-11-15) * |
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109729014A (en) * | 2017-10-31 | 2019-05-07 | 深圳市中兴微电子技术有限公司 | A message storage method and device |
| CN113806102A (en) * | 2020-06-15 | 2021-12-17 | 中国移动通信集团浙江有限公司 | Message queue processing method and device and computing equipment |
| CN113806102B (en) * | 2020-06-15 | 2023-11-21 | 中国移动通信集团浙江有限公司 | Message queue processing method, device and computing device |
| WO2023283902A1 (en) * | 2021-07-15 | 2023-01-19 | 新华三技术有限公司 | Message transmission method and apparatus |
| US12395449B2 (en) | 2021-07-15 | 2025-08-19 | New H3C Technologies Co., Ltd. | Packet transmission method and apparatus |
| CN115757006A (en) * | 2022-09-27 | 2023-03-07 | 郑州云智信安安全技术有限公司 | Method and device for judging process running state based on port queue characteristics |
| CN115757006B (en) * | 2022-09-27 | 2023-08-08 | 郑州云智信安安全技术有限公司 | Method and device for judging running state of process based on port queue characteristics |
Also Published As
| Publication number | Publication date |
|---|---|
| CN106470126A (en) | 2017-03-01 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 16836410; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 16836410; Country of ref document: EP; Kind code of ref document: A1 |