
HK1207221B - Method, apparatus and computer readable medium for node de-duplication in network - Google Patents


Info

Publication number
HK1207221B
Authority
HK
Hong Kong
Prior art keywords
duplicate
node
nodes
detector
duplicate detector
Prior art date
Application number
HK15107668.9A
Other languages
Chinese (zh)
Other versions
HK1207221A1 (en)
Inventor
M‧齐兹拉维斯基
T‧波斯皮西尔
T‧马克维克卡
Original Assignee
太阳风环球有限责任公司
Priority date
Filing date
Publication date
Priority claimed from US14/072,150 (US9584367B2)
Application filed by 太阳风环球有限责任公司
Publication of HK1207221A1
Publication of HK1207221B


Description

Method, apparatus, and computer-readable medium for node deduplication in a network

Technical Field

Embodiments of the present invention generally relate to network traffic monitoring, analysis, and/or reporting. More specifically, some embodiments are directed to methods, systems, and computer programs for node deduplication, for example deduplication of physical nodes monitored by a network monitoring system.

Background Art

Network management encompasses the activities, methods, procedures, and tools related to the operation, administration, maintenance, and/or provisioning of networked systems. Functions that may be performed as part of network management include, for example, planning, controlling, deploying, allocating, coordinating, and monitoring the resources of a network. Further functions may relate to network planning, frequency allocation, load balancing, configuration management, fault management, security management, performance management, bandwidth management, route analytics, and accounting management.

As mentioned above, one subset of network management is the monitoring of network traffic. Network administrators are interested in network traffic data for several reasons, including analyzing the impact of new applications on the network, troubleshooting network pain points, detecting slow or failing network devices, detecting heavy users of bandwidth, and securing the network. Various protocols have been developed for network traffic flow data. These protocols can carry several types of information, such as source Internet Protocol (IP) address, destination IP address, source port, destination port, IP protocol, ingress interface, IP type of service, start and end times, number of bytes, and next hop.

As networks become larger and more complex, systems that monitor, analyze, and report on traffic flow data must become more efficient at handling the growing number of network devices and the growing amount of information generated about network traffic.

Summary of the Invention

Certain embodiments are directed to methods, apparatuses, and computer program products for node deduplication. One method includes discovering, by a network monitoring apparatus, nodes in a network. The method may further include collecting, for each of the nodes discovered in the network, a list of Internet Protocol (IP) addresses, media access control (MAC) addresses, domain name system (DNS) names, and system names (sysnames); comparing the IP addresses of each discovered node with the IP addresses of current nodes and of the other discovered nodes; comparing the MAC addresses of each discovered node with the MAC addresses of current nodes and of the other discovered nodes; comparing the DNS names of each discovered node with the DNS names of current nodes and of the other discovered nodes; comparing the system names of each discovered node with the system names of current nodes and of the other discovered nodes; and determining, based on the comparison of the IP addresses, MAC addresses, DNS names, and system names, which nodes duplicate other discovered nodes and/or current nodes.

Brief Description of the Drawings

For a proper understanding of the invention, reference should be made to the accompanying drawings, in which:

FIG. 1 illustrates a block diagram of a system according to one embodiment;

FIG. 2 illustrates a block diagram of a system according to one embodiment;

FIG. 3 illustrates a flow chart of a method according to one embodiment;

FIG. 4 illustrates a flow chart of a method according to another embodiment;

FIG. 5 illustrates a block diagram of an apparatus according to an embodiment; and

FIG. 6 illustrates a flow chart of a method according to another embodiment.

Detailed Description

It will be readily understood that the components of the invention, as generally described and illustrated in the figures, may be arranged and designed in a wide variety of configurations. Thus, the following detailed description of embodiments of systems, methods, apparatuses, and computer program products for node deduplication, as represented in the accompanying figures, is not intended to limit the scope of the invention, but merely represents selected embodiments of the invention.

The features, structures, or characteristics of the invention described throughout this specification may be combined in any suitable manner in one or more embodiments. For example, the use of the phrases "certain embodiments", "some embodiments", or similar language throughout this specification refers to the fact that a particular feature, structure, or characteristic described in connection with an embodiment may be included in at least one embodiment of the invention. Thus, appearances of the phrases "in certain embodiments", "in some embodiments", "in other embodiments", or similar language throughout this specification do not necessarily all refer to the same group of embodiments, and the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. Additionally, if desired, the different functions discussed below may be performed in a different order and/or concurrently with each other. Furthermore, if desired, one or more of the described functions may be optional or may be combined. As such, the following description should be considered as merely illustrative of the principles, teachings, and embodiments of the invention, and not as limiting them.

It should be noted that, throughout this specification, the terms network device and network node, or simply device or node, may be used interchangeably to refer to any physical device capable of connecting and/or communicating on the Internet. Examples of such devices or nodes include, but are not limited to, routers, switches, servers, computers, laptops, tablets, telephones, printers, mobile devices, and any other current or future component capable of sending, receiving, or forwarding information over a communication channel.

FIG. 1 illustrates an example of a system according to one embodiment. The system includes a network monitoring system 100, network monitoring system storage 110, a switch device 130, and one or more network devices 120. The network monitoring system storage 110 stores network monitoring data and may be a database or any other suitable storage device. The network devices 120 may be nodes in the network monitored by the network monitoring system 100. Any number and type of network devices 120 may be supported in the system, so embodiments are not limited to the number and type of network devices shown in FIG. 1. In an embodiment, the network monitoring system storage 110 may store a discovery results database table, which stores and organizes information about the network devices discovered in the network.

In some cases, a network device can be reachable on multiple Internet Protocol (IP) addresses. For example, some devices have more than one IP address at the same time. In addition, some nodes are dynamic in that their IP addresses change over time. Such dynamic IP addresses are a problem for deduplication logic, because the logic may match an outdated primary IP address from the discovery results against the current IP address of a dynamic node.

Currently, systems generally apply logic in which one IP address equals one network node. As a result, a node with multiple IP address bindings may not be recognized as a single node. This behavior can lead to a state in which one physical device is monitored more than once, which clearly causes additional overhead and inefficiency in the network monitoring system.

Users therefore generally want a device that responds on multiple IP addresses to be monitored as a single node in the network. Embodiments of the invention provide automatic network discovery that can detect this situation and avoid processing the same physical node multiple times (for example, each time under a different IP address). One embodiment includes deduplication logic configured to identify duplicate nodes automatically so that a single node is not monitored multiple times.

Certain embodiments identify one or more pieces of information that serve as the node identifier needed to identify a network node uniquely. With this unique node identifier, certain embodiments can go on to define logic that detects duplicates among the nodes found during the discovery process (for example, the same physical network node under different IP addresses) and/or among nodes that are already being monitored under a different IP address.

According to certain embodiments, the information used to identify a node uniquely is information that can be polled easily, is available for most devices (i.e., is vendor independent), is available for Internet Control Message Protocol (ICMP) nodes as well as other types of nodes (an ICMP node may be considered a node that is not reachable via SNMP or WMI), and is the minimal set of information that remains efficient while still providing accurate results.

Certain embodiments can handle at least two typical use cases. One use case is running automatic network discovery on a subnetwork that includes devices reachable on multiple IP addresses. As is well known, devices that belong to one subnetwork are addressed by a common, identical, most-significant bit group in their IP addresses. Another use case is running automatic network discovery on a subnet that is already being monitored.
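The shared most-significant bit group mentioned above is what a subnet prefix expresses. A minimal sketch using Python's standard `ipaddress` module follows; the helper name `same_subnet` and the documentation-range addresses are illustrative assumptions, not part of the described system.

```python
import ipaddress


def same_subnet(addresses, network):
    """Return True if every address falls inside the given network prefix.

    Hypothetical helper: `network` is CIDR text such as "192.0.2.0/24",
    where /24 means the first 24 bits are the shared bit group.
    """
    net = ipaddress.ip_network(network)
    return all(ipaddress.ip_address(addr) in net for addr in addresses)


# Two addresses of one multi-homed device, both inside the discovered subnet.
print(same_subnet(["192.0.2.10", "192.0.2.200"], "192.0.2.0/24"))
# One address outside the subnet.
print(same_subnet(["192.0.2.10", "198.51.100.7"], "192.0.2.0/24"))
```

A discovery run scoped to one prefix can meet the same device once per address inside that prefix, which is the first use case described above.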

Certain embodiments can identify network nodes as duplicates even when one set of information was collected in a different time frame from the other. This can happen, for example, when a user decides to import the discovery results of a scheduled discovery profile. In that case, some embodiments may need to take possibly outdated information collected during discovery and compare it with the fresh information that is continuously collected for all monitored nodes. Deduplication according to certain embodiments can therefore take place during a discovery job, where discovered nodes are compared with one another so that duplicates are removed, and/or during import of the discovery results, where discovered nodes are imported and compared with existing nodes so that duplicate nodes are identified and removed.

According to one embodiment, a collected data set consisting of the DNS names, system names, IP addresses, and MAC addresses of each network node is used to help identify duplicate nodes. One embodiment includes logic that compares the pieces of information in the collected data set with one another (e.g., DNS names with DNS names, MAC addresses with MAC addresses, and so on). The logic can be implemented in duplicate detector components. The result of a detection is a match index that indicates a match, no match, or unknown. Certain embodiments also provide logic that aggregates the partial results from the detectors and computes a final conclusion as to whether a node is a duplicate. For example, in an embodiment, node deduplication may comprise several sub-iterations, each responsible for discovering a selected IP range. At the end of each iteration, deduplication is performed so that a node (e.g., an endpoint) is omitted as soon as it is considered a duplicate. If an iteration yields an unknown or non-duplicate result, the node is passed to the next step or iteration for processing.
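The three-way outcome described above (match, no match, unknown) can be sketched as a small comparison helper. This is only an illustration under stated assumptions: the names `MatchResult` and `compare_field` are hypothetical, and treating missing data on either side as "unknown" is an assumption the text does not spell out.

```python
from enum import Enum


class MatchResult(Enum):
    MATCH = "match"
    NO_MATCH = "no match"
    UNKNOWN = "unknown"


def compare_field(discovered_values, known_values):
    """Compare one kind of identifier (e.g. DNS names) for two nodes.

    Assumption: if either side has no data, nothing can be concluded.
    """
    discovered, known = set(discovered_values), set(known_values)
    if not discovered or not known:
        return MatchResult.UNKNOWN
    return MatchResult.MATCH if discovered & known else MatchResult.NO_MATCH


print(compare_field({"core-sw1"}, {"core-sw1", "edge-rtr2"}))
print(compare_field({"core-sw1"}, {"edge-rtr2"}))
print(compare_field(set(), {"edge-rtr2"}))
```

Under this sketch, an UNKNOWN or NO_MATCH result would leave the node in play for the next detector or iteration, as the paragraph above describes.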

In an embodiment, two groups of duplicate detectors may be provided. One group may be used during the automatic discovery process to filter newly discovered devices and remove duplicate nodes. The other group may be used during import of the discovery results to avoid adding duplicate nodes to the set of monitored nodes.

In one embodiment, the system is configured to collect, for all discovered nodes, a list of all DNS names, system names, IP addresses, and MAC addresses and to store them, for example as part of the discovery job result. This information may be stored in the network monitoring system storage or database 110. As mentioned above, the information may be used during at least two stages of discovery. For example, the DNS names, system names, IP addresses, and MAC addresses may be used while discovery is running, to check whether the currently discovered node duplicates any other node already found, and/or during import of the discovery results, to compare the discovery results with the existing nodes monitored by the system. In one embodiment, the MAC information is stored in persistent storage, for example as part of the discovery results.
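One way to picture the per-node lists is a small record type. The sketch below is an assumption for illustration only: the class name `DiscoveredNode`, its field names, and the `normalize_mac` helper do not come from the described system.

```python
from dataclasses import dataclass, field


@dataclass
class DiscoveredNode:
    """Identification data collected for one discovered node (hypothetical layout)."""
    node_id: int
    ip_addresses: set[str] = field(default_factory=set)
    mac_addresses: set[str] = field(default_factory=set)
    dns_names: set[str] = field(default_factory=set)
    sysnames: set[str] = field(default_factory=set)


def normalize_mac(mac: str) -> str:
    """Reduce a MAC address to bare upper-case hex so notations compare equal."""
    return "".join(ch for ch in mac if ch.isalnum()).upper()


node = DiscoveredNode(node_id=1)
node.ip_addresses.update({"10.0.0.5", "10.0.1.5"})  # one device, two addresses
node.mac_addresses.add(normalize_mac("00:1a:2b:3c:4d:5e"))
node.dns_names.add("core-sw1.example.net")
node.sysnames.add("core-sw1")
print(node)
```

Keeping each attribute as a set makes the later comparisons (equality of names, subset tests on MAC addresses) direct set operations.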

FIG. 2 illustrates a system 200 according to one embodiment, which may include an IP address duplicate detector 201, a MAC address duplicate detector 202, a DNS duplicate detector 203, and a system name duplicate detector 204. In an embodiment, a database 210 may communicate with the IP address duplicate detector 201, the MAC address duplicate detector 202, the DNS duplicate detector 203, and the system name duplicate detector 204, so that each of the detectors can retrieve the relevant addresses stored in the database 210. In an embodiment, the database 210 may be stored in the network monitoring system storage 110 shown in FIG. 1. It should be noted that the system 200 may be implemented entirely in hardware, or in a combination of hardware and software.

Each of the duplicate detectors 201, 202, 203, 204 may have a priority that defines the order of execution (e.g., a lower number indicates earlier execution), a weight that indicates the reliability of the result the duplicate detector provides (e.g., a weight of 0 has no effect on the final result), and a veto right that is applied with the highest precedence when determining whether a node is a duplicate.
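The three per-detector attributes can be sketched as a small configuration record. All names and numeric values below are illustrative assumptions; the text does not give concrete priorities or weights for any detector.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectorConfig:
    name: str
    priority: int           # lower number runs earlier
    weight: int             # reliability of the result; 0 contributes nothing
    has_veto: bool = False  # a veto overrides the other detectors


def execution_order(detectors):
    """Return the detectors sorted so that lower priority numbers run first."""
    return sorted(detectors, key=lambda d: d.priority)


# Hypothetical settings for the four detectors of FIG. 2.
detectors = [
    DetectorConfig("SystemName", priority=4, weight=85),
    DetectorConfig("IpAddress", priority=1, weight=100, has_veto=True),
    DetectorConfig("Dns", priority=3, weight=90),
    DetectorConfig("MacAddress", priority=2, weight=80),
]
print([d.name for d in execution_order(detectors)])
```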

The DNS duplicate detector 203 may be configured to compare the DNS name of a discovered node with those of all other discovered nodes and of the current nodes being monitored. In an embodiment, the DNS duplicate detector 203 may conclude that a discovered node is a duplicate if its DNS name is the same as the DNS name used by any monitored node or by any other discovered node.
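A name-equality check of this kind might look like the sketch below. The function name and data layout are hypothetical; lower-casing and stripping a trailing dot are assumptions added because DNS names are case-insensitive and may be written fully qualified.

```python
def find_dns_duplicates(candidate_names, known_nodes):
    """Return IDs of known nodes that share at least one DNS name with the candidate.

    `known_nodes` maps a node ID to its set of DNS names (hypothetical layout).
    """
    def canon(names):
        return {name.rstrip(".").lower() for name in names if name}

    wanted = canon(candidate_names)
    return sorted(
        node_id
        for node_id, names in known_nodes.items()
        if wanted & canon(names)
    )


known = {1: {"core-sw1.example.net"}, 2: {"db01.example.net"}}
print(find_dns_duplicates({"CORE-SW1.example.net."}, known))
print(find_dns_duplicates(set(), known))
```

The same shape would serve for a system name comparison by passing sysnames instead of DNS names.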

MAC addresses are generally designed to be unique (although there are cases in which the same MAC address is used on two different devices, such as cloned virtual machines hosted on two separate virtual hosts). The MAC address duplicate detector 202 may be configured to compare the MAC addresses of a discovered node with the MAC addresses of all previously collected nodes, which may be stored in a node MAC address table of the database 210. The MAC address duplicate detector 202 may be configured to conclude that a node is a duplicate if the set of discovered MAC addresses is a subset of the MAC addresses currently monitored for a node, or if the set of monitored MAC addresses is a subset of the discovered MAC addresses.

Table 1 below shows examples in which nodes are considered equal by the MAC address duplicate detector 202, where A, B, C, ... represent MAC addresses. Table 2 below shows examples in which nodes are not considered equal (e.g., based on MAC addresses).

Node A Node B Node C Node D A A A A B B B B C C C D

Table 1

Node A Node B Node C Node D A A D A B B E D C D F

Table 2

Two nodes that each have exactly two MAC addresses, for example 0000000000000000 and 00000000000000E0, should be considered equal. For example, node A and node B are equal if and only if the MAC address list of node A is a subset of the MAC address list of node B, or the MAC address list of node B is a subset of the MAC address list of node A. In this case, the system may rely on a different duplicate detector or a different deduplication method (e.g., system name matching), because the two nodes are equal according to their MAC addresses.
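The subset rule stated above maps directly onto set comparison. The sketch below is a minimal illustration; the function name is hypothetical, and rejecting empty lists is an added assumption, since an empty set is a subset of every set and would otherwise match any node.

```python
def macs_match(discovered, monitored):
    """Apply the subset rule: equal if either MAC list is a subset of the other."""
    discovered, monitored = set(discovered), set(monitored)
    if not discovered or not monitored:
        return False  # assumption: no MAC data is not evidence of a duplicate
    return discovered <= monitored or monitored <= discovered


print(macs_match({"A", "B"}, {"A", "B", "C"}))       # one list contained in the other
print(macs_match({"A", "B", "C"}, {"A", "B", "D"}))  # overlap only, neither is a subset
```

Note that mere overlap is not enough under this rule: two lists that share some addresses but each contain an address the other lacks are not treated as equal.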

The system name duplicate detector 204 may be configured to compare the system name of a discovered node with those of all other discovered nodes and of the current nodes being monitored. In an embodiment, the system name duplicate detector 204 may conclude that a discovered node is a duplicate if its system name is the same as the system name used by any monitored node or by any other discovered node. In an embodiment, the system name may be, for example, the sysname of a Simple Network Management Protocol (SNMP) node or the full computer name of a Windows Management Instrumentation (WMI) node. It should be noted that embodiments are not limited to SNMP and/or WMI as data sources; according to certain embodiments, other types of data sources are equally applicable (e.g., the CLI of a router or switch reached via SSH or Telnet).

As mentioned above, each duplicate detector may load its associated weight from a settings database table. The weight represents the reliability of the result provided by the associated duplicate detector. A weight of -1 may be set to deactivate the associated duplicate detector. According to an embodiment, weight values range from 0 to 100, where 0 is the least reliable and 100 is the most reliable.

As discussed above, all duplicate detectors (d1, …, dn) may be executed sequentially in the order defined by their priorities. Each duplicate detector may set an 'IsAuthoritative' flag to true, which then terminates execution of the remaining duplicate detectors. In that case, the vote of the duplicate detector whose 'IsAuthoritative' flag is true is considered final, and all other votes are ignored. According to one embodiment, if no 'IsAuthoritative' flag is set to true, the final result as to whether a node is a duplicate is computed as the sum of the vote values of all duplicate detectors, as follows:

Final decision = d1.IsDuplicate() * d1.Priority + … + dn.IsDuplicate() * dn.Priority,

where d1 denotes the first duplicate detector, d2 denotes the second duplicate detector, …, and dn denotes the n-th duplicate detector. Thus, dn.IsDuplicate() is a function that represents the conclusion of the n-th duplicate detector as to whether a node is a duplicate.
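The sequential execution, the 'IsAuthoritative' short circuit, and the summed final decision can be sketched as follows. This is an illustration under assumptions: the `Vote` record is hypothetical, `IsDuplicate()` is modelled as a boolean counted as 0 or 1, and the text does not say how the resulting score is turned into a yes/no answer, so the sketch returns the score only.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    is_duplicate: bool             # stands in for dN.IsDuplicate()
    priority: int                  # stands in for dN.Priority
    is_authoritative: bool = False


def final_decision(votes):
    """Sum IsDuplicate * Priority over votes given in execution order.

    An authoritative vote ends the run; its own value is returned and
    every other vote, earlier or later, is ignored.
    """
    total = 0
    for vote in votes:
        contribution = int(vote.is_duplicate) * vote.priority
        if vote.is_authoritative:
            return contribution
        total += contribution
    return total


print(final_decision([Vote(True, 1), Vote(False, 2), Vote(True, 3)]))
print(final_decision([Vote(True, 1), Vote(False, 2, is_authoritative=True), Vote(True, 3)]))
```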

As shown in Table 3 below, each duplicate detector may return a list of node IDs for all duplicate nodes it finds. Each node ID may be assigned an associated match index (MatchIndex) that indicates the likelihood of a match. In an embodiment, match index values range from 0 to 100, where 0 indicates the lowest likelihood of a match and 100 the highest. According to an embodiment, the system 200 may be configured to group the duplicate node information shown in Table 3 by node ID and to sum the match indexes for the same node ID. The system 200 may then be configured to select the node ID with the highest aggregate match index for discarding.

Duplicate Detector    Duplicate Node ID    Match Index
Dns                   5                    90
Mac                   1                    60
Mac                   5                    80
System Name           5                    85
Final Decision:       5                    90

Table 3
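The group-and-sum step described before Table 3 can be sketched with the detector rows of that table as input. The function name and tuple layout are illustrative assumptions.

```python
from collections import defaultdict


def select_duplicate(rows):
    """Group detector results by node ID, sum the match indexes, pick the top node.

    `rows` holds (detector, node_id, match_index) tuples (hypothetical layout).
    Returns (selected_node_id, totals); selected_node_id is None for no rows.
    """
    totals = defaultdict(int)
    for _detector, node_id, match_index in rows:
        totals[node_id] += match_index
    if not totals:
        return None, {}
    best = max(totals, key=totals.get)
    return best, dict(totals)


rows = [("Dns", 5, 90), ("Mac", 1, 60), ("Mac", 5, 80), ("SystemName", 5, 85)]
best, totals = select_duplicate(rows)
print(best, totals)
```

With these rows, node 5 accumulates the larger total and is the node selected, which agrees with the node ID shown in the final row of Table 3.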

Table 4 below shows an example result table according to one embodiment. In this example, each row of the table may represent one node. The 'DnsDuplicateDetector' column shows the conclusion of the DNS duplicate detector as to whether the node is a duplicate. Similarly, the 'MacAddressDuplicateDetector' column shows the conclusion of the MAC address duplicate detector, and the 'NameDuplicateDetector' column shows the conclusion of the system name duplicate detector. The final 'ExpectedResult' column shows the expected result for the node.

Table 4

FIG. 3 illustrates a flow chart of a method for node deduplication according to one embodiment. In the example of FIG. 3, at 300, one or more nodes are discovered. At 310, the IP addresses associated with the nodes are discovered. At 320, the technologies supported for obtaining device information, such as SNMP or WMI, are detected. At 330, information about the nodes is obtained via the detected supported technologies. At 340, deduplication of the discovered nodes is performed. At 350, it is determined whether a node is a duplicate. If so, at 360, the duplicate node is omitted from the set of discovery results. If the node is not a duplicate, at 370, the discovered node is saved to the database. At 380, the data associated with the discovered nodes is prepared for the discovery import.
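The decision part of FIG. 3 (steps 340 to 370) can be sketched as a filter over the nodes found. The sketch assumes steps 300 to 330 have already produced node records; the function names, the dictionary layout, and the use of a shared MAC address as the duplicate test are all illustrative assumptions.

```python
def discover(found_nodes, is_duplicate):
    """Keep only nodes that do not duplicate a node that was already kept."""
    kept, omitted = [], []
    for node in found_nodes:
        if is_duplicate(node, kept):   # steps 340 and 350
            omitted.append(node)       # step 360: leave it out of the results
        else:
            kept.append(node)          # step 370: save the node
    return kept, omitted


def shares_mac(node, others):
    """Simplified stand-in for the detectors: any MAC address in common."""
    return any(node["macs"] & other["macs"] for other in others)


found = [
    {"ip": "10.0.0.5", "macs": {"001A2B3C4D5E"}},
    {"ip": "10.0.1.5", "macs": {"001A2B3C4D5E"}},  # same device, second address
    {"ip": "10.0.0.9", "macs": {"001A2B3C4D5F"}},
]
kept, omitted = discover(found, shares_mac)
print([n["ip"] for n in kept], [n["ip"] for n in omitted])
```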

FIG. 4 illustrates a flow chart of a method for node deduplication according to another embodiment. In the example of FIG. 4, at 400, the result of a discovery job is saved. At 405, the discovery result is loaded from the database and deserialized. At 410, duplicate detection is performed primarily by the primary IP address associated with a node. At 415, it is checked whether a node with the same IP address is already monitored by any engine. If a node with the same IP address is already monitored, then at 420 the existing node information is updated. If no node with the same IP address is monitored, then at 425 duplicate detection is performed against the ignored nodes. At 430, it is determined whether the node is a duplicate. If so, at 435, the situation is logged and the duplicate node is discarded. If the node is not determined to be a duplicate, then at 440 a duplicate check is performed against all monitored nodes. At 445, it is determined whether the node is a duplicate. If it is, at 450 the situation is logged and the duplicate node is discarded. If it is not, at 455 the node information is saved to persistent storage. At 460, the result of the discovery import is a list of nodes together with their associated IP addresses and MAC addresses.
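The import flow of FIG. 4 can be sketched as a chain of checks per imported node. Everything named here is hypothetical: the dictionary layout, the outcome strings, the shared-MAC duplicate test, and merging MAC addresses as the "update" of step 420 are assumptions made only to keep the example runnable.

```python
def shares_mac(node, others):
    """Simplified stand-in for the detectors: any MAC address in common."""
    return any(node["macs"] & other["macs"] for other in others)


def import_discovery(results, monitored, ignored, is_duplicate):
    """Walk imported nodes through the checks of FIG. 4 and report each outcome."""
    outcome = {}
    for node in results:                                    # loaded at 405
        existing = next(
            (m for m in monitored if m["ip"] == node["ip"]), None
        )                                                   # 410 and 415
        if existing is not None:
            existing["macs"] |= node["macs"]                # 420 (assumed update)
            outcome[node["ip"]] = "updated"
        elif is_duplicate(node, ignored):                   # 425 and 430
            outcome[node["ip"]] = "discarded: ignored"      # 435
        elif is_duplicate(node, monitored):                 # 440 and 445
            outcome[node["ip"]] = "discarded: monitored"    # 450
        else:
            monitored.append(node)                          # 455
            outcome[node["ip"]] = "saved"
    return outcome


monitored = [{"ip": "10.0.0.5", "macs": {"AA"}}]
ignored = [{"ip": "10.0.9.9", "macs": {"EE"}}]
results = [
    {"ip": "10.0.0.5", "macs": {"AA", "AB"}},
    {"ip": "10.0.1.5", "macs": {"AB"}},
    {"ip": "10.0.2.7", "macs": {"EE"}},
    {"ip": "10.0.3.1", "macs": {"CC"}},
]
outcome = import_discovery(results, monitored, ignored, shares_mac)
print(outcome)
```

The second imported node has a new IP address, so only the MAC comparison against the monitored nodes catches it as a duplicate.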

FIG. 5 illustrates a block diagram of an apparatus 10 that may implement one embodiment of the invention. The apparatus 10 may include a bus 12 or other communication mechanism for communicating information between components of the apparatus 10. The apparatus 10 also includes a processor 22, coupled to the bus 12, for processing information and executing instructions or operations. The processor 22 may be any type of general-purpose or special-purpose processor. The apparatus 10 further includes a memory 14 for storing information and instructions to be executed by the processor 22. The memory 14 may comprise any combination of random access memory ("RAM"), read-only memory ("ROM"), static storage such as a magnetic or optical disk, or any other type of machine-readable or computer-readable medium. The apparatus 10 further includes a communication device 20, such as a network interface card or other communication interface, to provide access to a network. As a result, a user may interface with the apparatus 10 directly, or remotely through a network or any other method.

Computer-readable media may be any available media that can be accessed by the processor 22 and include volatile and non-volatile media, removable and non-removable media, and communication media. Communication media may include computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media.

The processor 22 is further coupled via the bus 12 to a presentation device 24, such as a display, monitor, screen, or web browser, for displaying information, such as network traffic information, to a user. A user input component 25, such as a keyboard, computer mouse, or web browser, is further coupled to the bus 12 to enable a user to interface with the apparatus 10. The processor 22 and the memory 14 may also be coupled via the bus 12 to a database system 30 and may thus be able to access and retrieve information stored in the database system 30. In one embodiment, the database system 30 is the network monitoring system storage 110 shown in FIG. 1. Although only a single database is shown in FIG. 5, any number of databases may be used according to certain embodiments.

In one embodiment, the memory 14 stores software modules that provide functionality when executed by the processor 22. The modules may include an operating system 15 that provides operating system functionality for the apparatus 10. As discussed above, the memory may also store one or more duplicate detectors 16 that support the node deduplication functionality. The one or more duplicate detectors 16 may include, for example, the IP address duplicate detector 201, the DNS duplicate detector 203, the MAC address duplicate detector 202, and the system name duplicate detector 204, as discussed above and depicted in FIG. 2. The apparatus 10 may also include one or more other functional modules 18 to provide additional functionality.

The database system 30 may include a database server and any type of database, such as a relational database or a flat-file database. The database system 30 may store data about the network traffic flows of each entity in the network, and/or any data associated with the apparatus 10 or its associated modules and components.

In certain embodiments, the processor 22, the duplicate detectors 16, and the other functional modules 18 may be implemented as separate physical and logical units or may be implemented in a single physical and logical unit. Furthermore, in some embodiments, the processor 22, the duplicate detectors 16, and the other functional modules 18 may be implemented in hardware, or as any suitable combination of hardware and software.

In some embodiments, the processor 22 is configured to control the apparatus 10 to discover nodes in a network. According to an embodiment, information identifying the discovered nodes may be stored, for example, in the database 110. The processor 22 may be configured to control the apparatus 10 to collect, for each of the nodes discovered in the network, a list of IP addresses, MAC addresses, DNS names, and system names.

According to one embodiment, the processor 22 may be configured to control the apparatus 10 to execute an IP duplicate detector configured to compare the IP address of each of the discovered nodes with the IP addresses of the current node and other discovered nodes, a MAC duplicate detector configured to compare the MAC address of each of the discovered nodes with the MAC addresses of the current node and other discovered nodes, a DNS duplicate detector configured to compare the DNS name of each of the discovered nodes with the DNS names of the current node and other discovered nodes, and a name duplicate detector configured to compare the system name of each of the discovered nodes with the system names of the current node and other discovered nodes. The processor 22 may then be configured to determine, based on the comparisons of the IP duplicate detector, the MAC duplicate detector, the DNS duplicate detector, and the name duplicate detector, duplicate nodes that duplicate other discovered nodes and/or the current node.
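The four comparisons described above can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the `Node` class, its field names, and the four function names are hypothetical, and the normalization choices (case-insensitive names, MAC addresses compared after lower-casing and unifying separators) are assumptions that the description does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Attributes collected for one discovered node (field names are hypothetical)."""
    node_id: int
    ip_addresses: set = field(default_factory=set)
    mac_addresses: set = field(default_factory=set)
    dns_names: set = field(default_factory=set)
    system_name: str = ""


def _normalize_mac(mac):
    # Assumption: MACs are compared lower-cased with ':' as the separator.
    return mac.lower().replace("-", ":")


def ip_duplicates(candidate, known):
    """IP duplicate detector: IDs of known nodes sharing at least one IP address."""
    return {n.node_id for n in known if candidate.ip_addresses & n.ip_addresses}


def mac_duplicates(candidate, known):
    """MAC duplicate detector: IDs of known nodes sharing at least one MAC address."""
    macs = {_normalize_mac(m) for m in candidate.mac_addresses}
    return {n.node_id for n in known
            if macs & {_normalize_mac(m) for m in n.mac_addresses}}


def dns_duplicates(candidate, known):
    """DNS duplicate detector: IDs of known nodes sharing a DNS name (case-insensitive)."""
    names = {d.lower() for d in candidate.dns_names}
    return {n.node_id for n in known if names & {d.lower() for d in n.dns_names}}


def name_duplicates(candidate, known):
    """Name duplicate detector: IDs of known nodes with the same non-empty system name."""
    name = candidate.system_name.lower()
    return {n.node_id for n in known if name and n.system_name.lower() == name}
```

Each detector returns the IDs of the already-known nodes that the candidate appears to duplicate, so the individual results can later be combined by whatever decision rule an embodiment uses.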

In an embodiment, the processor 22 may be configured to control the apparatus 10 to discard the duplicate nodes. According to one embodiment, the processor 22 may be configured to control the apparatus 10 to assign, to each of the IP duplicate detector, the MAC duplicate detector, the DNS duplicate detector, and the name duplicate detector, a priority that determines the order in which the detectors are run. The apparatus 10 may be controlled to determine the duplicate nodes, for example, by executing the following formula:

d1.IsDuplicate() * d1.Priority + … + dn.IsDuplicate() * dn.Priority.
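A minimal Python sketch of this priority-weighted sum follows. The names `DetectorVote`, `duplicate_score`, and `is_duplicate_node` are hypothetical. The claims state that the formula is executed only when no detector has its IsAuthoritative flag set to true, and that an authoritative vote is considered final; the sketch models that behavior. The `threshold` parameter is purely an assumption, since the text does not say what value the sum is compared against, and picking the highest-priority authoritative vote first is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class DetectorVote:
    """Result of one duplicate detector for one candidate node (hypothetical type)."""
    name: str
    is_duplicate: bool          # corresponds to d.IsDuplicate()
    priority: int               # corresponds to d.Priority
    is_authoritative: bool = False


def duplicate_score(votes):
    """Evaluate d1.IsDuplicate()*d1.Priority + ... + dn.IsDuplicate()*dn.Priority."""
    return sum(int(v.is_duplicate) * v.priority for v in votes)


def is_duplicate_node(votes, threshold):
    """Decide whether a node is a duplicate.

    An authoritative vote is treated as final; otherwise the summed score
    is compared with `threshold` (an assumed decision rule).
    """
    for vote in sorted(votes, key=lambda v: v.priority, reverse=True):
        if vote.is_authoritative:
            return vote.is_duplicate
    return duplicate_score(votes) >= threshold
```

For example, with priorities 3 (IP), 4 (MAC), 2 (DNS), and 1 (name), a node flagged only by the IP and MAC detectors scores 7.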

In an embodiment, each of the discovered nodes may be assigned a node ID. The processor 22 may be configured to control the apparatus 10 to assign a match index to each node ID, where the match index indicates the likelihood of a match between the discovered node and any of the current node and the other discovered nodes. According to one embodiment, the processor 22 may be configured to control the apparatus 10 to group the duplicate nodes by node ID and to sum the match indices for the same node ID. In addition, a weight is assigned to each of the IP duplicate detector, the MAC duplicate detector, the DNS duplicate detector, and the name duplicate detector. The weight indicates the reliability of the result provided by the respective duplicate detector.
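The grouping and summing of match indices can be illustrated with the short Python sketch below. The function names and the tuple layout are hypothetical. How a detector's weight is combined with its match index is not stated in the description; multiplying the two before summing is an assumption made here only for illustration.

```python
from collections import defaultdict


def sum_match_indices(matches, weights):
    """Group detector results by node ID and sum the weighted match indices.

    `matches` is an iterable of (detector_name, node_id, match_index) tuples;
    `weights` maps detector_name to that detector's reliability weight.
    """
    totals = defaultdict(float)
    for detector, node_id, match_index in matches:
        # Assumption: weight and match index are combined by multiplication.
        totals[node_id] += weights[detector] * match_index
    return dict(totals)


def best_match(totals):
    """Return the node ID with the highest summed match index, or None if empty."""
    return max(totals, key=totals.get) if totals else None
```

With weights of 1.0 (IP), 2.0 (MAC), 0.5 (DNS), and 0.5 (name), a node ID reported by the IP and MAC detectors accumulates 3.0, while one reported only by the DNS and name detectors accumulates 1.0, so the former is the more likely match.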

FIG. 6 illustrates an example flow diagram of a method according to one embodiment. The method includes, at 600, discovering nodes in a network, for example by a network monitoring apparatus. The method may then include, at 610, collecting, for each of the nodes discovered in the network, a list of Internet Protocol (IP) addresses, media access control (MAC) addresses, Domain Name System (DNS) names, and system names. At 620, the method includes comparing the IP address of each of the discovered nodes with the IP addresses of the current node and other discovered nodes. At 630, the method includes comparing the MAC address of each of the discovered nodes with the MAC addresses of the current node and other discovered nodes. At 640, the method includes comparing the DNS name of each of the discovered nodes with the DNS names of the current node and other discovered nodes. At 650, the method includes comparing the system name of each of the discovered nodes with the system names of the current node and other discovered nodes. The method may further include, at 660, determining, based on the comparisons of the IP addresses, MAC addresses, DNS names, and system names, duplicate nodes that duplicate other discovered nodes and/or the current node.
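The flow of steps 600 to 660 might be sketched end to end as below. This is a deliberately simplified illustration: the dictionary keys and the `deduplicate` function are hypothetical, values are compared as exact strings, and a node is treated as a duplicate as soon as any one attribute overlaps with a node already kept. That last rule is an assumption standing in for the priority- and weight-based decision described elsewhere in the specification.

```python
# Attribute lists collected per node at step 610 (key names are hypothetical).
ATTRIBUTES = ("ips", "macs", "dns_names", "system_names")


def deduplicate(discovered):
    """Return (kept, discarded) for a list of discovered node records.

    `discarded` holds (node_id, matching_attributes) pairs.
    """
    kept, discarded = [], []
    for node in discovered:                      # 600/610: discovered nodes with their lists
        matches = [
            attr for attr in ATTRIBUTES          # 620-650: one comparison per attribute
            if any(set(node[attr]) & set(other[attr]) for other in kept)
        ]
        if matches:                              # 660: node duplicates an earlier node
            discarded.append((node["id"], matches))
        else:
            kept.append(node)
    return kept, discarded
```

Processing nodes one at a time against the set already kept mirrors the comparison of each discovered node with "other discovered nodes"; a production system would more likely index the attributes rather than scan every kept node.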

In some embodiments, the functionality of any of the methods described herein, such as those of FIGS. 3, 4, and 6, may be implemented by software and/or computer program code stored in memory or another computer-readable or tangible medium, and executed by a processor. In other embodiments, the functionality may be performed by hardware, for example through the use of an application-specific integrated circuit (ASIC), a programmable gate array (PGA), a field-programmable gate array (FPGA), or any other combination of hardware and software.

One having ordinary skill in the art will readily understand that the invention as discussed above may be practiced with steps in a different order, and/or with hardware elements in configurations different from those disclosed. Therefore, although the invention has been described based upon these preferred embodiments, it would be apparent to those of skill in the art that certain modifications, variations, and alternative constructions are possible while remaining within the spirit and scope of the invention. In order to determine the metes and bounds of the invention, reference should therefore be made to the appended claims.

Claims (11)

1. A method for node deduplication in a network, comprising:
discovering, by a network monitoring apparatus, nodes in a network;
collecting, for each of the nodes discovered in the network, a list of Internet Protocol (IP) addresses, media access control (MAC) addresses, Domain Name System (DNS) names, and system names;
comparing, by an IP duplicate detector, the IP address of each of the discovered nodes with the IP addresses of a current node and other discovered nodes;
comparing, by a MAC duplicate detector, the MAC address of each of the discovered nodes with the MAC addresses of the current node and the other discovered nodes;
comparing, by a DNS duplicate detector, the DNS name of each of the discovered nodes with the DNS names of the current node and the other discovered nodes;
comparing, by a name duplicate detector, the system name of each of the discovered nodes with the system names of the current node and the other discovered nodes;
determining, based on the comparisons of the IP addresses, MAC addresses, DNS names, and system names, duplicate nodes that duplicate the other discovered nodes and/or the current node;
assigning a priority to each of the IP duplicate detector, the MAC duplicate detector, the DNS duplicate detector, and the name duplicate detector, the priority determining the order in which each of the comparing steps is performed; and
assigning a weight to each of the IP duplicate detector, the MAC duplicate detector, the DNS duplicate detector, and the name duplicate detector, wherein the weight indicates the reliability of the result provided by each of the IP duplicate detector, the MAC duplicate detector, the DNS duplicate detector, and the name duplicate detector.

2. The method according to claim 1, further comprising discarding the duplicate nodes.

3. The method according to claim 1, wherein the determining of duplicate nodes that duplicate the other discovered nodes and/or the current node comprises executing the following formula:
d1.IsDuplicate() * d1.Priority + … + dn.IsDuplicate() * dn.Priority,
wherein d1 refers to a first duplicate detector, …, and dn refers to an nth duplicate detector, dn.IsDuplicate() is a function representing the conclusion of the nth duplicate detector as to whether a node is a duplicate node, dn.Priority indicates the priority of the nth duplicate detector, and the formula is executed if no IsAuthoritative flag for d1, …, dn is set to true, an IsAuthoritative flag set to true indicating that the vote of the duplicate detector is considered final.

4. The method according to claim 1, wherein each of the discovered nodes is assigned a node ID, and wherein the method further comprises assigning a match index to each node ID, the match index indicating the likelihood of a match between the discovered node and any of the current node and the other discovered nodes.

5. The method according to claim 4, further comprising grouping the duplicate nodes by node ID and summing the match indices for the same node ID.

6. An apparatus for node deduplication in a network, comprising:
at least one processor and at least one memory including computer program code,
the at least one memory and the computer program code being configured, with the at least one processor, to cause the apparatus at least to:
discover nodes in a network;
collect, for each of the nodes discovered in the network, a list of Internet Protocol (IP) addresses, media access control (MAC) addresses, Domain Name System (DNS) names, and system names;
wherein the at least one processor is further configured to execute:
an IP duplicate detector configured to compare the IP address of each of the discovered nodes with the IP addresses of a current node and other discovered nodes;
a MAC duplicate detector configured to compare the MAC address of each of the discovered nodes with the MAC addresses of the current node and the other discovered nodes;
a DNS duplicate detector configured to compare the DNS name of each of the discovered nodes with the DNS names of the current node and the other discovered nodes;
a name duplicate detector configured to compare the system name of each of the discovered nodes with the system names of the current node and the other discovered nodes;
wherein the at least one memory and the computer program code are further configured, with the at least one processor, to cause the apparatus at least to:
determine, based on the results of the comparisons of the IP duplicate detector, the MAC duplicate detector, the DNS duplicate detector, and the name duplicate detector, duplicate nodes that duplicate the other discovered nodes and/or the current node;
assign a priority to each of the IP duplicate detector, the MAC duplicate detector, the DNS duplicate detector, and the name duplicate detector, the priority determining the order in which each of the comparing steps is performed; and
assign a weight to each of the IP duplicate detector, the MAC duplicate detector, the DNS duplicate detector, and the name duplicate detector, wherein the weight indicates the reliability of the result provided by each of the IP duplicate detector, the MAC duplicate detector, the DNS duplicate detector, and the name duplicate detector.

7. The apparatus according to claim 6, wherein the at least one memory and the computer program code are further configured, with the at least one processor, to cause the apparatus at least to discard the duplicate nodes.

8. The apparatus according to claim 6, wherein the at least one memory and the computer program code are further configured, with the at least one processor, to cause the apparatus at least to determine the duplicate nodes by executing the following formula:
d1.IsDuplicate() * d1.Priority + … + dn.IsDuplicate() * dn.Priority,
wherein d1 refers to a first duplicate detector, …, and dn refers to an nth duplicate detector, dn.IsDuplicate() is a function representing the conclusion of the nth duplicate detector as to whether a node is a duplicate node, dn.Priority indicates the priority of the nth duplicate detector, and the formula is executed if no IsAuthoritative flag for d1, …, dn is set to true, an IsAuthoritative flag set to true indicating that the vote of the duplicate detector is considered final.

9. The apparatus according to claim 6, wherein each of the discovered nodes is assigned a node ID, and wherein the at least one memory and the computer program code are further configured, with the at least one processor, to cause the apparatus at least to assign a match index to each node ID, the match index indicating the likelihood of a match between the discovered node and any of the current node and the other discovered nodes.

10. The apparatus according to claim 9, wherein the at least one memory and the computer program code are further configured, with the at least one processor, to cause the apparatus at least to group the duplicate nodes by node ID and to sum the match indices for the same node ID.

11. A computer-readable medium storing a computer program, wherein the computer program is configured to control a processor to perform a process, the process comprising:
discovering, by a network monitoring apparatus, nodes in a network;
collecting, for each of the nodes discovered in the network, a list of Internet Protocol (IP) addresses, media access control (MAC) addresses, Domain Name System (DNS) names, and system names;
comparing, by an IP duplicate detector, the IP address of each of the discovered nodes with the IP addresses of a current node and other discovered nodes;
comparing, by a MAC duplicate detector, the MAC address of each of the discovered nodes with the MAC addresses of the current node and the other discovered nodes;
comparing, by a DNS duplicate detector, the DNS name of each of the discovered nodes with the DNS names of the current node and the other discovered nodes;
comparing, by a name duplicate detector, the system name of each of the discovered nodes with the system names of the current node and the other discovered nodes;
determining, based on the comparisons of the IP addresses, MAC addresses, DNS names, and system names, duplicate nodes that duplicate the other discovered nodes and/or the current node;
assigning a priority to each of the IP duplicate detector, the MAC duplicate detector, the DNS duplicate detector, and the name duplicate detector, the priority determining the order in which each of the comparing steps is performed; and
assigning a weight to each of the IP duplicate detector, the MAC duplicate detector, the DNS duplicate detector, and the name duplicate detector, wherein the weight indicates the reliability of the result provided by each of the IP duplicate detector, the MAC duplicate detector, the DNS duplicate detector, and the name duplicate detector.
HK15107668.9A 2013-11-05 2015-08-08 Method, apparatus and computer readable medium for node de-duplication in network HK1207221B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US14/072,150 US9584367B2 (en) 2013-11-05 2013-11-05 Node de-duplication in a network monitoring system
US14/072,150 2013-11-05

Publications (2)

Publication Number Publication Date
HK1207221A1 HK1207221A1 (en) 2016-01-22
HK1207221B true HK1207221B (en) 2020-03-27
