CN116134785A

CN116134785A - Low-latency identification of network device attributes

Info

Publication number: CN116134785A
Application number: CN202180056570.0A
Authority: CN
Inventors: T. F. Le; M. Srivatsa
Original assignee: International Business Machines Corp
Current assignee: International Business Machines Corp
Priority date: 2020-08-10
Filing date: 2021-07-22
Publication date: 2023-05-16
Anticipated expiration: 2041-07-22
Also published as: JP2023536972A; US20220046032A1; CN116134785B; WO2022034405A1; GB202303453D0; GB2613117A; JP7764104B2; US11743272B2

Abstract

A method includes analyzing, by a machine learning model, a first network communication having a first set of inputs. The method further includes inferring, by the machine learning model and based on the analysis, that a first device that is a party to the first network communication exhibits a device attribute. The method also includes extracting, from the machine learning model, a first set of significant inputs that had a significant impact on that determination. The method also includes creating, using the first set of inputs, a rule for identifying the device attribute. The rule establishes a condition that, when present in a network communication, implies that a party to the network communication exhibits the device attribute.

Description

Low-latency identification of network device attributes

Background

The present disclosure relates to network security and, more particularly, to identifying attributes of devices in network communications.

Network traffic (e.g., traffic within a local area network, between separate local area networks, between separate virtual local area networks, and to or from a wide area network) involves sending messages between electronic devices on one or more networks. These electronic devices may include cellular telephones, personal computers, Internet of Things (sometimes referred to herein as "IoT") devices, network infrastructure components, game consoles, computer peripherals, and the like. The large number and wide variety of electronic devices on a typical network can make monitoring traffic on that network complicated. This makes it easier for untrusted actors to gain unauthorized access to the network by exploiting network devices that exhibit device vulnerabilities.

Some methods of monitoring network traffic and preventing unauthorized access include establishing rules that specify whether a communication is allowed to enter (or leave) a network (or virtual local area network) based on, for example, the network address from which the communication originates (e.g., the external Internet, a specific Internet Protocol address, or a specific virtual local area network) or the type of communication (e.g., a communication that responds to an already established link). These are often called firewall rules. Firewall rules can reduce the risk of unauthorized access to a network when sufficient information about the organization and population of the network is available (e.g., which devices and device types are present on the network and how they are grouped), together with the ability to control those devices (e.g., managing device permissions on a per-device basis). However, some networks are too large, too simple, or both, to provide that information or control.

In theory, preventing unauthorized network access could also be performed based on known attributes of the devices involved in network communications. By analyzing the attributes of the network device to which an incoming communication is directed, or of the network device from which an outgoing communication is being sent, a network administrator can determine whether a vulnerable device is participating in a communication that could put the network at risk. However, the devices on some networks are numerous and constantly changing, and some users do not have the skill to identify those attributes.

Summary

Some embodiments of the present disclosure may be illustrated as a method that includes analyzing, by a machine learning model, a first network communication having a first set of inputs. The method also includes inferring, by the machine learning model and based on the analysis, that a first device that is a party to the first network communication exhibits a device attribute. The method also includes extracting, from the machine learning model, a first set of significant inputs that had a significant impact on the determination. The method also includes creating, using the first set of inputs, a rule for identifying the device attribute. The rule establishes a condition that, when present in a network communication, implies that a party to the network communication exhibits the device attribute.

Some preferred embodiments of the illustrated method may also include identifying an input weight for each input in the first set of inputs. These preferred embodiments may also include ranking the input weights of the first set of inputs. These preferred embodiments may also include selecting the first set of significant inputs based on the ranking.
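The weight, rank, and select steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the input names, the weight values, the `select_significant_inputs` helper, and the choice of a fixed `top_k` cutoff are all hypothetical, since the disclosure does not specify how many ranked inputs are kept.

```python
from typing import Dict, List


def select_significant_inputs(input_weights: Dict[str, float], top_k: int = 2) -> List[str]:
    """Rank inputs by weight (highest first) and keep the top_k as significant.

    The fixed top_k cutoff is an assumption made for this sketch.
    """
    ranked = sorted(input_weights.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:top_k]]


# Hypothetical per-input weights for one analyzed communication.
weights = {
    "dns:ntp.manufacture.com": 0.61,
    "dns:cdn.example.net": 0.07,
    "format:firmware-update-request": 0.27,
    "dns:time.example.org": 0.05,
}

significant = select_significant_inputs(weights, top_k=2)
print(significant)
```

A relative cutoff (for example, keeping every input whose weight exceeds some fraction of the largest weight) would fit the same description equally well.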

Some preferred embodiments of the illustrated method may also include analyzing, by the machine learning model, a second network communication having a second set of inputs. These preferred embodiments may also include inferring, by the machine learning model and based on the analysis, that a second device that is a party to the second network communication exhibits the device attribute. In these preferred embodiments, the extracting may also include identifying an input weight for each input in the second set of inputs, and combining the input weights of the first set of inputs and the second set of inputs.

Some embodiments of the present disclosure may also be illustrated as a system including a processor and a memory in communication with the processor. The memory contains program instructions that, when executed by the processor, are configured to cause the processor to perform a method. The method performed by the processor includes analyzing, by a machine learning model, a first set of network communications having a first set of inputs. The method performed by the processor also includes inferring, by the machine learning model and based on the analysis, that each device in a set of devices exhibits a device attribute. In this example, each device is a party to a network communication in the first set of network communications. The method performed by the processor also includes extracting, from the machine learning model, a first set of significant inputs that had a significant impact on the determination. The method performed by the processor also includes creating, using the first set of inputs, a rule for identifying the device attribute. The rule establishes a condition that, when present in a set of real-time network communications, implies that a party to the set of real-time network communications exhibits the device attribute.

In some preferred embodiments of the illustrated system, the machine learning model is an attention-based model. In these preferred embodiments, the extracting may include identifying, for a particular device in the set of devices, a list of attention weights that expresses the importance of each particular input in the first set of inputs to the inference for that particular device. In these preferred embodiments, the extracting may also include, for a particular input of the particular device, combining the attention weight in the list with the attention weights of the corresponding input in the sets of inputs of the other devices in the set of devices. This combining may result in a combined weight for that input across all devices in the set of devices. In these preferred embodiments, the extracting may also include comparing the combined weight with other combined weights for other inputs in the set of inputs. In these preferred embodiments, the extracting may also include determining, based on the comparison, that the particular input is a significant input. In these preferred embodiments, the extracting may also include adding the particular input to the first set of significant inputs.
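As a rough illustration of the combining and comparing steps just described, the sketch below averages hypothetical per-device attention weights for each input and treats an input as significant when its combined weight exceeds the average combined weight. Both the averaging and the comparison criterion are assumptions of this sketch; the disclosure does not state how the weights are combined or compared, and the device and input names are invented.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List


def combine_attention_weights(per_device: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Combine each input's attention weights across devices (assumed: mean)."""
    collected = defaultdict(list)
    for weights in per_device.values():
        for input_name, weight in weights.items():
            collected[input_name].append(weight)
    return {name: mean(values) for name, values in collected.items()}


def significant_inputs(combined: Dict[str, float]) -> List[str]:
    """Compare combined weights; keep inputs above the average (assumed criterion)."""
    cutoff = mean(combined.values())
    return sorted(name for name, weight in combined.items() if weight > cutoff)


# Hypothetical attention weights for two devices that share the inferred attribute.
per_device = {
    "camera-1": {"dns:a.example": 0.7, "dns:b.example": 0.2, "dns:c.example": 0.1},
    "camera-2": {"dns:a.example": 0.5, "dns:b.example": 0.1, "dns:c.example": 0.4},
}

combined = combine_attention_weights(per_device)
first_significant_set = significant_inputs(combined)
print(combined)
print(first_significant_set)
```

Here `dns:c.example` matters a good deal to one device but not to the other, so its combined weight falls below the cutoff, which is the kind of device-specific noise that combining across the set of devices is meant to smooth out.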

Some embodiments of the present disclosure may also be illustrated as a computer program product including a computer-readable storage medium. The computer-readable storage medium has program instructions embodied therewith. The program instructions are executable by a computer to cause the computer to perform the method performed by the system in the embodiments described above.

In some preferred embodiments of the present disclosure, the program instructions further cause the computer to perform the method performed by the system in the preferred embodiments described above.

According to one aspect, there is provided a method comprising: analyzing, by a machine learning model, a first network communication having a first set of inputs; inferring, by the machine learning model and based on the analysis, that a first device that is a party to the first network communication exhibits a device attribute; extracting, from the machine learning model, a first set of significant inputs that had a significant impact on the determination; and creating, using the first set of inputs, a rule for identifying the device attribute, wherein the rule establishes a condition that, when present in a network communication, implies that a party to the network communication exhibits the device attribute.

According to another aspect, there is provided a system comprising: a processor; and a memory in communication with the processor, the memory containing program instructions that, when executed by the processor, are configured to cause the processor to perform a method, the method comprising: analyzing, by a machine learning model, a first set of network communications having a first set of inputs; inferring, by the machine learning model and based on the analysis, that each device in a set of devices exhibits a device attribute, wherein each device is a party to a network communication in the first set of network communications; extracting, from the machine learning model, a first set of significant inputs that had a significant impact on the determination; and creating, using the first set of inputs, a rule for identifying the device attribute, wherein the rule establishes a condition that, when present in a set of real-time network communications, implies that a party to the set of real-time network communications exhibits the device attribute.

According to another aspect, there is provided a computer program product comprising a computer-readable storage medium having program instructions embodied therewith, the program instructions being executable by a computer to cause the computer to: analyze, by a machine learning model, a first set of network communications having a first set of inputs; infer, by the machine learning model and based on the analysis, that each device in a set of devices exhibits a device attribute, wherein each device is a party to a network communication in the first set of network communications; extract, from the machine learning model, a first set of significant inputs that had a significant impact on the determination; and create, using the first set of inputs, a rule for identifying the device attribute, wherein the rule establishes a condition that, when present in a set of real-time network communications, implies that a party to the set of real-time network communications exhibits the device attribute.

The above summary is not intended to describe each illustrated embodiment or every implementation of the present disclosure.

Brief Description of the Drawings

Preferred embodiments of the present invention will now be described, by way of example only, with reference to the following drawings:

FIG. 1 depicts a method of creating and applying a set of rules to identify attributes of a device that is a party to a network communication, according to an embodiment of the present disclosure.

FIG. 2 depicts a graphical abstraction of a system for developing and applying a set of rules to identify attributes of a device that is a party to a network communication, according to an embodiment of the present disclosure.

FIG. 3 depicts a method of extracting rules for inferring an attribute of a device from a machine learning model that has itself been trained to infer that attribute, according to an embodiment of the present disclosure.

FIG. 4 depicts representative major components of a computer system that may be used in accordance with embodiments.

While the invention is amenable to various modifications and alternative forms, specifics thereof have been shown by way of example in the drawings and will be described in detail. It should be understood, however, that the intention is not to limit the invention to the particular embodiments described. On the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.

Detailed Description

Embodiments of the present disclosure relate to network security and, more particularly, to identifying attributes of devices in network communications. While the present disclosure is not necessarily limited to such applications, various aspects of the disclosure may be appreciated through a discussion of various examples using this context.

Unauthorized access to a particular communication network (e.g., a wired local area network, a wireless local area network, or a virtual local area network) is sometimes achieved by exploiting security vulnerabilities of devices that are authorized on that communication network. Once a vulnerability of a particular device (e.g., a cellular phone model or a security camera model) or device type (e.g., IoT devices, devices made by a particular manufacturer, devices with a particular brand of processor, or devices running particular software) is discovered or suspected, whether publicly, within the hacker community, or otherwise, unauthorized actors (e.g., cybercriminals) can design intrusion attempts that target those devices or device types. If an unauthorized actor is able to communicate with a vulnerable device on the network, the unauthorized actor may be able to exploit the vulnerability and gain unfiltered access to the vulnerable device. At that point, the cybercriminal may be able to take control of the device and use it to access other parts of the network without being detected. In other words, a cybercriminal may be able to use a vulnerable device as a conduit to gain unauthorized access to the network as a whole.

For this reason, it can be beneficial for the individuals who maintain a communication network (also referred to herein as "network administrators," which may include, for example, owners of residential networks and employees or contractors engaged to maintain enterprise networks) to monitor and filter incoming and outgoing network traffic to and from network devices suspected of having vulnerabilities. In theory, this is possible by filtering network communications based on the attributes of the network devices that are parties to those communications. This may include, for example, filtering communications from a local area network (also referred to herein as a "LAN") to the Internet, filtering communications from the Internet into the LAN, filtering communications between virtual LANs on a LAN (also referred to herein as "vLANs"), or filtering communications between devices on a single network (e.g., a single vLAN).

For example, a network administrator may conclude that IoT devices (e.g., smart refrigerators, video doorbells) frequently have outdated, insecure firmware, and may therefore wish to limit the ability of IoT devices to contact (or be contacted by) the Internet. Similarly, some network devices may only need to contact the Internet or other devices on the network for very specific purposes (e.g., time-server synchronization), and a network administrator may therefore wish to restrict those devices to communicating with time-server Internet addresses. Some particular devices (e.g., particular smartphone models) may be known to be vulnerable to a certain type of attack (e.g., brute-force login attempts), and a network administrator may therefore wish to block any Internet address that repeatedly attempts to log in to one of those particular devices. Further, some devices may only need to communicate across the network for very limited purposes, so a network administrator may wish to restrict those devices from communicating with any other network devices outside those purposes (e.g., a network administrator may wish to prevent a security camera from communicating with any storage server on the network unless that storage server is configured to store the security camera's footage).

However, for a network administrator to properly filter network communications based on those device attributes, the attributes of the network devices that are parties to the communications must be known. For very small networks with a typically static set of network devices (e.g., home networks, and the networks of small businesses that do not allow devices not owned by the business onto the network), a skilled network administrator may be able to monitor the attributes of the devices and filter Internet traffic based on those monitored devices.

However, many networks do not fall into these categories. For example, the network administrators of many residential networks are typical home users who may not even know of, or know how to identify, all the devices on their residential networks, let alone identify the attributes or potential vulnerabilities of those devices. Similarly, many enterprise-level networks have a very large population of connected devices, a population of connected devices that changes significantly over time, or both. Thus, for many networks, maintaining an up-to-date knowledge base of the devices on the network, their attributes, and their potential vulnerabilities may be beyond the capabilities of the network administrator, and may even be impossible.

For this reason, there is a theoretical benefit to automatically inferring, in real time, the attributes of network devices that are parties to network communications. This is theoretically possible, for example, by analyzing the network communications sent by, or intended to be received by, those network devices. For example, when a network controller device (e.g., a router device or a firewall device) detects a communication sent by a network device to an external IP address (e.g., a request to download a set of instructions, such as executable instructions), the network controller device (sometimes referred to herein as a "network controller," "controller," or "controller device") may analyze that device's traffic and determine that it is an IoT device (e.g., a smart water heater). If the external IP address to which the communication is intended to be sent is not associated with the manufacturer of that IoT device, the network controller may determine that an unauthorized actor is attempting to install intrusive software on the IoT device, and block the communication. By inferring the attributes of the IoT device in real time, the network controller can prevent the unauthorized actor from installing software on the device, even if the attributes of the IoT device were not known in advance.
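The blocking decision in the example above might be sketched as follows. This is a simplified illustration rather than an actual controller implementation: the `should_block` helper, the `"iot-device"` attribute label, and the address ranges (taken from the reserved documentation ranges) are all assumptions, and the attribute inference itself is treated as already done.

```python
import ipaddress
from typing import Iterable, Set


def should_block(device_attributes: Set[str], destination_ip: str,
                 manufacturer_networks: Iterable[str]) -> bool:
    """Block an outbound download request from an inferred IoT device when the
    destination is not within any network associated with its manufacturer."""
    if "iot-device" not in device_attributes:
        return False  # This sketch only restricts devices inferred to be IoT devices.
    destination = ipaddress.ip_address(destination_ip)
    return not any(destination in ipaddress.ip_network(network)
                   for network in manufacturer_networks)


# Hypothetical manufacturer address range (reserved documentation range).
manufacturer_networks = ["203.0.113.0/24"]

print(should_block({"iot-device"}, "198.51.100.7", manufacturer_networks))
print(should_block({"iot-device"}, "203.0.113.10", manufacturer_networks))
```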

However, inferring the attributes of network devices in real time by analyzing the communications of those devices is a computationally intensive activity. Typical general-purpose computing devices have historically struggled to do so, and have required a long time even when successful. However, machine learning models (e.g., neural network classifiers, feed-forward networks, recurrent neural networks, long short-term memory networks, and attention-based models) can be trained to accurately infer device attributes based on the traffic of those devices. In some instances, these machine learning models (sometimes referred to herein as "ML models") can be configured to infer device attributes in near real time. For example, these machine learning models can be trained to analyze historical network communications that are labeled by device type, and to associate patterns in those network communications with the device-type labels.

For example, a machine learning model may be fed the historical network communications of a network containing several different device types. The machine learning model may be configured to analyze each communication, individually or in combination with other communications, and to attempt to infer attributes of the device from which the communication originated or to which the communication was addressed. These machine learning models may be configured to infer device attributes based on, for example, communication destination, communication origin, communication format, or communication content.

For example, some machine learning models may be configured to analyze the DNS names (e.g., ntp.manufacture.com) in DNS queries sent by network devices, and to infer device attributes based on patterns in the DNS names (also referred to herein as "DNS addresses") of those DNS queries. In this way, each communication to a particular DNS name (or names), or a series of communications to a particular DNS name (or names), may be considered an input to the machine learning model. Some machine learning models may similarly be configured to analyze the actual traffic content (e.g., the network bytes of a communication) to infer device attributes based on patterns observed in the content of the communication. For example, a smart remote control from a particular manufacturer may have a particular signature in its requests to update firmware. The actual bytes of these communications can provide inputs to the machine learning model. Thus, the machine learning model can be trained to recognize patterns in these inputs and to infer device attributes based on those patterns.
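To make the notion of DNS names as model inputs more concrete, the sketch below groups observed DNS queries by device and counts how often each name was queried. The `(device_id, dns_name)` tuple format, the device identifiers, and the normalization step are assumptions of this illustration; an actual model may encode its inputs quite differently (for example, as ordered sequences of queries).

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def dns_inputs_by_device(queries: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Group DNS query names by device and count how often each name appears."""
    per_device: Dict[str, Counter] = {}
    for device_id, dns_name in queries:
        # Normalize case and any trailing root dot so equivalent names match.
        normalized = dns_name.lower().rstrip(".")
        per_device.setdefault(device_id, Counter())[normalized] += 1
    return per_device


# Hypothetical captured queries: (device identifier, queried DNS name).
observed_queries = [
    ("aa:bb:cc:00:00:01", "ntp.manufacture.com"),
    ("aa:bb:cc:00:00:01", "NTP.manufacture.com."),
    ("aa:bb:cc:00:00:01", "firmware.manufacture.com"),
    ("aa:bb:cc:00:00:02", "mail.example.com"),
]

inputs = dns_inputs_by_device(observed_queries)
print(inputs["aa:bb:cc:00:00:01"])
```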

The device attributes that a machine learning model is configured to infer may vary based on the use case for which the machine learning model is trained (e.g., the capabilities of the network, the security concerns of the network administrator, the population of devices on the network, and typical network activity). Some examples of device attributes that a machine learning model may be configured to infer include the operating system of a device, the presence of particular software on a device, a device classification (e.g., cellular phone, storage server, IoT device), the manufacturer of a device, the software version on a device (or the age of that software version), and the age of a device. These device attributes may be specified in labels (e.g., metadata) attached to the historical network communications in the training data. During training of the machine learning model, the inferred device attributes of a device that was a party to a historical communication may be compared with the actual device attributes in such labels. The configuration of the machine learning model (e.g., the weights and biases of the neurons of a neural network) may then be adjusted based on whether the inferred attributes match the actual device attributes.

Once a machine learning model has been sufficiently trained to infer device attributes based on traffic (e.g., from communication events, communication content, or communication format), the machine learning model can typically infer the attributes of the network devices involved in that traffic accurately enough to address network security concerns. Further, where the accuracy of the machine learning model is insufficient, the model can typically be retrained with more historical data (or real-time data) to correct the deficiency. For example, if an ML model can accurately infer that a communication is addressed to a device made by a particular company, but cannot accurately infer that a communication is addressed to an IoT device, the ML model can be trained further with communications sent to or from IoT devices on the network.

A typical, properly trained machine learning model, while sufficiently accurate at inferring the attributes of a device based on communications sent by or to that device, is computationally intensive. This computational intensity extends the amount of time the ML model needs to infer device attributes. While that computation time may be acceptable for some use cases, applying such an ML model to real-time network communications may result in noticeable, and sometimes unacceptable, network latency. In use cases in which network speed is particularly important, the disadvantage of this network latency may outweigh the benefit of the accuracy with which the ML model can infer device attributes.

Thus, while ML models are theoretically able to help infer device attributes for the purpose of filtering network traffic, in practice they are not fast enough to provide a standalone solution. For the reasons above, many use cases would benefit from a system that can infer the attributes of network devices from the communications of those devices with the accuracy of ML models, but without the latency that those models introduce.

Some embodiments of the present disclosure address some of the issues identified above by extracting rules for inferring device attributes from a machine learning model that has been trained to infer device attributes based on the communications of those devices. In some embodiments, the accuracy of these rules is then analyzed and, if the rules are sufficiently accurate, they are applied to network communications in real time. The rules can then be used to analyze real-time network communications in a computationally lightweight manner. This analysis can allow the attributes of the devices involved in those communications to be inferred quickly, making it possible to filter communications based on device attributes without introducing significant network latency.
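A minimal sketch of how extracted rules might be represented, checked for accuracy against labeled communications, and then applied cheaply is given below. The `Rule` structure, the subset-matching condition, the input and attribute names, and the 0.9 deployment threshold are all assumptions made for illustration; the disclosure does not prescribe a rule format or an accuracy threshold here.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Rule:
    """If every required input is observed, the party is taken to exhibit the attribute."""
    required_inputs: FrozenSet[str]
    attribute: str


def infer_attribute(rules: List[Rule], observed: Set[str]) -> Optional[str]:
    """Return the attribute of the first matching rule, or None (a cheap set test)."""
    for rule in rules:
        if rule.required_inputs <= observed:
            return rule.attribute
    return None


def rule_accuracy(rules: List[Rule],
                  labeled: List[Tuple[Set[str], Optional[str]]]) -> float:
    """Fraction of labeled examples for which the rules reproduce the label."""
    correct = sum(1 for observed, label in labeled
                  if infer_attribute(rules, observed) == label)
    return correct / len(labeled)


# Hypothetical rule built from one significant input.
rules = [Rule(frozenset({"dns:ntp.manufacture.com"}), "iot-device")]

# Hypothetical labeled communications: (observed inputs, true attribute or None).
labeled = [
    ({"dns:ntp.manufacture.com", "dns:cdn.example.net"}, "iot-device"),
    ({"dns:mail.example.com"}, None),
    ({"dns:updates.example.com"}, "iot-device"),
]

accuracy = rule_accuracy(rules, labeled)
deploy = accuracy >= 0.9  # Assumed threshold; below it, the rules are not applied live.
print(accuracy, deploy)
```

The per-communication cost here is a handful of set comparisons rather than a model forward pass, which is the source of the latency benefit the paragraph above describes.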

For example, some embodiments of the present disclosure may configure a machine learning model (e.g., an attention-based machine learning model) to analyze a set of network communications and infer an attribute of a network device that is a party to those communications (e.g., the sender or the intended recipient of a communication). In some embodiments of the present disclosure, the model may then be trained on historical network communications. These historical communications may be annotated with device attribute labels that provide information about the attributes of the network device (or devices) involved in each communication. For example, an attention-based machine learning model may be fed labeled historical communications that were sent or received on the network for which the attention-based model is being trained. The machine learning model may be trained by reinforcing the model when an inferred attribute is correct, and by adjusting or retraining the model when the inferred attribute does not match the label.

In some embodiments of the present disclosure, the machine learning model is trained until it can infer device attributes with an accuracy sufficient to satisfy the security considerations of the network. After this training process, the ML model can be analyzed to identify the reasoning behind the inferences it makes. Specifically, the model can be analyzed to identify significant inputs (e.g., particular domain names, particular communication formats, particular patterns in message bytes) that are especially influential in the ML model's device attribute inferences.

In some embodiments, for example, a wrapper such as a LIME wrapper may be applied to a classifier-type machine learning model while it analyzes communications. Such a wrapper may cause the classifier to analyze the same set of inputs repeatedly, making a slight adjustment to the inputs on each repetition. By detecting which adjustments affect the classifier's inference and which do not, the wrapper can identify the inputs that appear to matter most in the inferences the classifier makes for those communications. For example, if the classifier infers that a communication likely originated from a device made by a particular manufacturer, the wrapper may determine that the DNS name of the manufacturer's firmware update server was the most significant input to that inference.
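By way of non-limiting illustration, the perturb-and-observe idea described above can be sketched as follows. This is a simplified stand-in and not the LIME library itself (LIME additionally fits a local surrogate model); the function `perturbation_importance`, the `toy_classifier`, and all domain names are hypothetical and are assumed only for the example.

```python
import random
from typing import Callable, Dict, List, Sequence


def perturbation_importance(
    classify: Callable[[Sequence[str]], float],
    inputs: Sequence[str],
    trials: int = 200,
    seed: int = 0,
) -> Dict[str, float]:
    """Estimate how strongly each (unique) input moves the classifier's score.

    Each trial randomly drops some inputs, re-runs the classifier, and records
    the score under "input kept" or "input dropped". The importance of an input
    is the mean score when it is kept minus the mean score when it is dropped.
    """
    rng = random.Random(seed)
    kept: Dict[str, List[float]] = {name: [] for name in inputs}
    dropped: Dict[str, List[float]] = {name: [] for name in inputs}
    for _ in range(trials):
        mask = [rng.random() < 0.5 for _ in inputs]
        sample = [name for name, keep in zip(inputs, mask) if keep]
        score = classify(sample)
        for name, keep in zip(inputs, mask):
            (kept if keep else dropped)[name].append(score)

    def mean(values: List[float]) -> float:
        return sum(values) / len(values) if values else 0.0

    return {name: mean(kept[name]) - mean(dropped[name]) for name in inputs}


def toy_classifier(names: Sequence[str]) -> float:
    """Hypothetical stand-in for a trained classifier.

    Returns a score for "device was made by IoTProducer".
    """
    return 0.9 if "fw-update.iotproducer.com" in names else 0.1


queried = ["fw-update.iotproducer.com", "example.org", "time.example.net"]
importance = perturbation_importance(toy_classifier, queried)
most_important = max(importance, key=importance.get)
print(most_important)
```

In this sketch the firmware update server's DNS name receives the largest importance because removing it is the only adjustment that changes the toy classifier's score.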

In some embodiments, an attention-based model, rather than a classifier-type ML model (e.g., an RNN classifier), may be used to infer the attributes of network devices. This may be beneficial because, unlike with an RNN ML model, the importance of each part of the input to the attention-based model's conclusion can be examined from the structure of the trained attention-based model. Specifically, the model's attention weights can be used to determine, for a given device attribute, how important each part of a network communication is in inferring that attribute. Thus, with an attention-based model, no wrapper is needed to identify the significant inputs for inferring a particular device attribute. Further, in some embodiments, an algorithm may be used to analyze the attention weights directly and extract the significant inputs quickly.

In some embodiments of the present disclosure, rules that can infer a device attribute with or without the ML model may be formulated from the extracted inputs that are especially important to the ML model's inference of that attribute. These rules may take the form of conditional statements that a computer system (e.g., a finite state machine) can parse easily. Such a conditional statement may specify a device attribute to be inferred if a condition (e.g., particular content of a network communication) is found. For example, in some embodiments, if-then statements may be formulated from the extracted significant inputs. In some such embodiments, the condition of the if-then statement (i.e., the "if" portion) may be populated with one or more significant inputs, and the conclusion of the if-then statement (i.e., the "then" portion) may be populated with the corresponding device attribute. For example, if it is determined that a query for the DNS name "ntp.iotproducer.com" is especially significant in classifying a device as an IoT device made by the manufacturer IoTProducer, the corresponding rule may state: IF DNSname EQUALS "ntp.iotproducer.com" THEN device = IoT AND devicemanufacturer = IoTProducer.
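By way of non-limiting illustration, an if-then rule of this kind could be represented and applied as in the following sketch. The class name `InferenceRule` and its fields are hypothetical and are not taken from the disclosure; only the example DNS name and the attribute values come from the rule stated above.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Set


@dataclass
class InferenceRule:
    """A parsed if-then rule (hypothetical representation)."""

    required_dns_names: FrozenSet[str]  # the "if" portion
    conclusion: Dict[str, str]          # the "then" portion

    def apply(self, observed_dns_names: Set[str]) -> Dict[str, str]:
        """Return the inferred attributes, or an empty dict if the condition fails."""
        if self.required_dns_names <= observed_dns_names:
            return dict(self.conclusion)
        return {}


# IF DNSname EQUALS "ntp.iotproducer.com"
# THEN device = IoT AND devicemanufacturer = IoTProducer
rule = InferenceRule(
    required_dns_names=frozenset({"ntp.iotproducer.com"}),
    conclusion={"device": "IoT", "devicemanufacturer": "IoTProducer"},
)

print(rule.apply({"ntp.iotproducer.com", "example.org"}))
print(rule.apply({"example.org"}))
```

Evaluating such a rule is a set-membership check, which is why it can be far cheaper than running the ML model on each communication.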

In some embodiments, these rules may be created manually. For example, a network administrator may review the significant inputs to the ML model's inference of a device attribute and create an if-then rule for that attribute. In some instances this may be beneficial, because the network administrator can take the device's contextual information into account or recognize the need for a complex, multi-condition rule. For example, instead of a rule that infers that a device is an IoT device from a simpler condition, a rule may infer that the device is an IoT device if the device queries a particular time synchronization server at least 24 times in a day, or if at least 90% of the device's DNS queries are for that particular time synchronization server.

In some use cases, however, the number of device attributes for which inference rules are to be developed may make manual rule creation undesirable or impractical. Thus, in some embodiments, these rules may be created automatically. For example, an algorithm (e.g., a sliding window algorithm) may consider the model inputs whose weight exceeds a certain value (e.g., inputs in the 95th percentile) and generate all if-then sequences whose condition portion (the "if" portion) is composed of inputs above that weight. These conditions can then be associated with the device attribute inferred by the ML model. In other words, the inferred attribute may be entered into the "then" portion of the if-then statements, creating a series of if-then statements that pair different combinations of conditions (e.g., queries to both domain X and domain Y, or a query to either domain A or domain B) with the inferred attribute (e.g., a smartphone of a particular manufacturer).

The configuration of such an algorithm may depend on the nature of the machine learning model used to identify the significant inputs. For example, if a recurrent neural network is trained to infer the device attribute, the algorithm may be configured to analyze the output of a wrapper that identifies the key inputs affecting the neural network's conclusion. On the other hand, if an attention-based model is trained to infer the device attribute, the algorithm may be configured to analyze the model's attention weights directly. Inputs with high attention weights (e.g., the top five inputs, inputs with a weight above a threshold, inputs with a weight above a percentile) may be identified as significant inputs. Either algorithm may then create a proposed set of inference rules based on the identified significant inputs.
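By way of non-limiting illustration, the three selection strategies mentioned above (top inputs, weight above a threshold, weight above a percentile) could be sketched as follows. The weights shown are invented for the example, the function names are hypothetical, and the nearest-rank percentile definition is an assumption; a real embodiment would read the weights from the trained attention-based model.

```python
import math
from typing import Dict, List


def top_k(weights: Dict[str, float], k: int) -> List[str]:
    """Return the k inputs with the highest weight, highest first."""
    return sorted(weights, key=weights.get, reverse=True)[:k]


def above_threshold(weights: Dict[str, float], threshold: float) -> List[str]:
    """Return the inputs whose weight is strictly above a fixed threshold."""
    return [name for name, weight in weights.items() if weight > threshold]


def at_or_above_percentile(weights: Dict[str, float], percentile: int) -> List[str]:
    """Return the inputs at or above the given percentile (nearest-rank method)."""
    ordered = sorted(weights.values())
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    cutoff = ordered[rank - 1]
    return [name for name, weight in weights.items() if weight >= cutoff]


# Hypothetical attention weights per queried DNS name.
attention = {
    "ntp.iotproducer.com": 0.46,
    "device-metrics-us.iotproducer.com": 0.31,
    "example.org": 0.12,
    "cdn.example.net": 0.07,
    "time.example.net": 0.04,
}

print(top_k(attention, 2))
print(above_threshold(attention, 0.3))
print(at_or_above_percentile(attention, 80))
```

With these example weights, all three strategies select the same two DNS names; on real data they can differ, so the choice is an implementation preference.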

In some embodiments, the created inference rules may be tested to measure their accuracy. This may be beneficial for confirming that the created rules can accurately infer the desired device attribute before the rules are relied upon for real-time communications. For example, in some embodiments, the rules may be tested on a second set of labeled historical data. The accuracy of the rules can then be determined (e.g., by computing the F1 score of each rule).
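By way of non-limiting illustration, scoring a single rule against a second labeled data set could be sketched as follows. The rule, the held-out records, and the function name `f1_score_for_rule` are hypothetical and are assumed only for the example.

```python
from typing import Callable, Iterable, Set, Tuple

LabeledExample = Tuple[Set[str], bool]  # (DNS names queried, device has the attribute)


def f1_score_for_rule(
    rule: Callable[[Set[str]], bool],
    labeled: Iterable[LabeledExample],
) -> float:
    """Compute the F1 score of a boolean rule on labeled examples."""
    tp = fp = fn = 0
    for observed, has_attribute in labeled:
        predicted = rule(observed)
        if predicted and has_attribute:
            tp += 1
        elif predicted and not has_attribute:
            fp += 1
        elif not predicted and has_attribute:
            fn += 1
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def queries_ntp(names: Set[str]) -> bool:
    """Hypothetical rule: the device queried ntp.iotproducer.com."""
    return "ntp.iotproducer.com" in names


held_out = [
    ({"ntp.iotproducer.com"}, True),                  # true positive
    ({"ntp.iotproducer.com", "example.org"}, True),   # true positive
    ({"example.org"}, True),                          # false negative
    ({"example.org"}, False),                         # true negative
    ({"ntp.iotproducer.com"}, False),                 # false positive
]

score = f1_score_for_rule(queries_ntp, held_out)
print(round(score, 3))
```

Here precision and recall are both 2/3, so the rule's F1 score is 2/3; whether that is "sufficiently accurate" depends on the threshold chosen for the network.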

In some cases, however, further historical data may not be available for testing the accuracy of the inference rules. Thus, in some embodiments, the rules may also be tested on real-time communications. In these embodiments, it may be beneficial to rely concurrently on other methods of inferring or identifying device attributes or of filtering network communications (e.g., manually, using a machine learning model, using conventional firewall rules), so as to avoid relying on untested, potentially inaccurate inference rules. In some embodiments, the machine learning model from which the rules were created may also analyze the real-time communications, allowing the accuracy of the machine learning model to be compared with the accuracy of the rules on those communications.

If the rules are determined to be sufficiently accurate, they may be applied to real-time communications by a network controller device without requiring substantial computing resources. For example, the rules may be applied in real time by a general-purpose computer on the network or by a finite state machine located within a firewall device on the network. The conclusions of the rules (i.e., the inferred attributes of the network devices that are parties to the analyzed communications) may then be used by the network controller device when making traffic filtering decisions (e.g., blocking traffic from the Internet, blocking traffic between vLANs).

On the other hand, if the inference rules are determined not to be sufficiently accurate, they may be reviewed by a network administrator. The network administrator may, for example, compare the inferences made by the rules with the output of the ML model to verify that the rules reflect the ML model's output. If the inferences made by the ML model and by the network controller device using the rules do not match, the rules may not capture the ML model's decision logic (e.g., its attention weights). If, however, the inferences made by the ML model and by the rules do match, the ML model itself may not have been trained sufficiently. Thus, depending on the result of the review, the network administrator may decide to reconfigure the ML model (e.g., train the model further with new data), reconfigure the rules (e.g., add inputs to or remove inputs from the rules, or increase the complexity of the rules based on the context of the network communications), or both.

FIG. 1 depicts a method 100 of creating and applying a set of rules to infer an attribute of a device that is a party to a network communication, according to embodiments of the present disclosure. Method 100 may be performed by a computer system, such as computer system 401, that has access to a set of network communications and to a machine learning model configured to analyze those network communications. Method 100 may be performed, for example, by a network controller device embedded in a computer system that also runs the machine learning model (or that directs another computer on the network to run the machine learning model).

Method 100 is presented as creating and applying rules for inferring a single device attribute (e.g., the device is an IoT device, the device runs a particular OS, the device runs firmware that has not been updated in two years) of a device that is a party to one or more network communications. However, a similar method may also be applied to create and apply rules for inferring multiple device attributes. Similarly, multiple instances of method 100 may be performed to create one or more rules for each device attribute that a network administrator wishes to be able to infer. For example, if a network administrator wishes to be able to infer when a party to a network communication is a mobile phone, when a party is an IoT device, and when a party was made by a manufacturer of insecure devices, method 100 may be performed three times (once per attribute).

Method 100 begins at block 102, in which a device attribute is selected. The device attribute selected in block 102 may be an attribute that a network administrator has identified as important to network security concerns. For example, the network administrator may wish to determine whether a device involved in network communications is running an operating system that is known to be insecure, and to filter those network communications accordingly. Similarly, the network administrator may wish to determine whether a device participating in network communications was made by a manufacturer known for shipping insecure firmware in its devices or for neglecting to update firmware once a device has been manufactured. If method 100 is performed for a network on which nearly all network clients are company-owned except for employees' personal mobile phones, the network administrator may wish to determine whether a device is a mobile phone. Some network administrators may also wish to determine whether a device is an IoT device, because some network administrators believe that IoT devices are generally insecure.

Once the device attribute is selected in block 102, a machine learning model may be trained in block 104 to infer the device attribute based on network communications. The machine learning model may be preconfigured to accept a network communication as input and to output an inference of whether a network device that is a party to that communication exhibits the attribute selected in block 102. As previously discussed, the machine learning model may be an attention-based model, a recurrent neural network, a feed-forward neural network, or another type of model.

Block 104 may include, for example, receiving a set of historical communications in which each communication is annotated with a label stating whether the device that was a party to that communication exhibits the selected attribute. For example, if "is an IoT device" is selected as the device attribute in block 102, the historical data may include a first network communication in which a personal computer sends an email to the Internet and a second network communication in which an IoT light bulb sends a message to the light bulb manufacturer's server. In this example, the first network communication may be annotated with a label indicating that the device that is a party to the communication is not an IoT device (e.g., "no," "false," "non-IoT"). The second network communication, on the other hand, may be annotated with a label indicating that the device that is a party to the communication is an IoT device (e.g., "yes," "true," "IoT"). The set of historical communications may be referred to herein as "training data."

In some embodiments, different aspects of the historical messages may be included in the training data, depending on the desired capabilities of the trained machine learning model or the desired format of the inference rules derived from the ML model. In some embodiments, for example, the domain name to which each message is addressed may be included in the training data. Including domain names may be useful for training the machine learning model to infer a network device's attributes from patterns in the domain names with which the device communicates. In some embodiments, the content of each communication (i.e., the data sent in the communication) may be included in the training data. Including the content of communications may be useful for training the machine learning model to infer a network device's attributes from patterns in the device's network (or Internet) activity. In some embodiments, metadata about the format of the communications (e.g., header size, whether the communication uses jumbo packets, whether the message is encrypted, what programming language the message is written in) may be included in the training data. Information about communication formats may be useful for training the machine learning model to infer a network device's attributes from patterns in the way the device (or, for example, all devices on the network from the same manufacturer or running the same operating system) formats its network communications. In some embodiments, other aspects of the network communications may also be included.

In some embodiments, block 104 may include purposefully selecting which aspects of the historical messages to include in the training data based on the circumstances (e.g., the goals of the network administrator). For example, including absolutely all data that is available and potentially relevant to the set of network communications may increase the likelihood that the resulting machine learning model is accurate. However, increasing the size and complexity of the training data may increase the time needed to train the ML model, as well as the time needed to extract (and the difficulty of extracting) the inputs most significant to the ML model's inferences. Over-inclusion may therefore be detrimental in use cases where the speed and ease of training and rule creation are particularly important. As another example, including only those aspects of the historical communications that relate directly to the desired type of rule (such as inferring device attributes from patterns in the domain names queried by the device) may increase the likelihood that the ML model is trained quickly and that it is trained to make its inferences based on the desired inputs. However, excluding data from machine learning training may reduce the accuracy of the final trained model. Similarly, excluding data from the model may reduce the likelihood that useful context is included in the rules created from the trained ML model.

Block 104 includes training the machine learning model with the training data once the training data has been selected. For example, a labeled network communication (or set of communications) may be input into the machine learning model, and the machine learning model may output an inference of whether the network device that is a party to that communication (or those communications) exhibits the device attribute about which the ML model is configured to make inferences. Early in the training process, the ML model may frequently make incorrect inferences, causing the model to be retrained (including, for example, by adjusting biases or attention weights). The same network communication, or a new network communication, may then be input into the ML model again.

While the ML model is being trained in block 104 (or after the ML model is believed to be sufficiently trained), the controller device may determine in block 106 whether the model is accurate enough for the purposes of the network. In some embodiments, for example, the computer system overseeing method 100 may monitor the developing accuracy of the ML model as it is trained in block 104 and continually compare that accuracy with a desired accuracy threshold. This may be beneficial because it may enable the computer system to stop the training process of block 104 as soon as the ML model is sufficiently trained, avoiding unnecessary time and resources spent overtraining the model on the training data. In some embodiments, the training process of block 104 may be stopped periodically to allow the accuracy of the ML model to be checked in block 106. This may involve, for example, testing the ML model on the previous training data or on new training data. Testing the ML model with new training data may be beneficial because it can prevent the ML model from being overtrained on the previous training data to the point that it is not flexible enough to make accurate inferences on other training data (or on real-time data).
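By way of non-limiting illustration, the interplay of blocks 104 and 106 (train, check accuracy against a threshold, stop as soon as the threshold is met) could be sketched as follows. The function `train_until_accurate` and the `ToyModel` class are hypothetical; the toy model simply pretends that each epoch improves its accuracy and stands in for a real model and a real held-out evaluation.

```python
from typing import Callable, Tuple


def train_until_accurate(
    train_one_epoch: Callable[[], None],
    evaluate: Callable[[], float],
    threshold: float,
    max_epochs: int,
) -> Tuple[int, float]:
    """Alternate training (block 104) and accuracy checks (block 106).

    Stops as soon as the measured accuracy reaches the threshold, or after
    max_epochs, and returns (epochs run, final accuracy).
    """
    epochs = 0
    accuracy = evaluate()
    while accuracy < threshold and epochs < max_epochs:
        train_one_epoch()
        epochs += 1
        accuracy = evaluate()
    return epochs, accuracy


class ToyModel:
    """Hypothetical stand-in: gets 10 more of 100 test cases right per epoch."""

    def __init__(self) -> None:
        self.correct = 50

    def train_one_epoch(self) -> None:
        self.correct = min(100, self.correct + 10)

    def evaluate(self) -> float:
        return self.correct / 100


model = ToyModel()
epochs, accuracy = train_until_accurate(
    model.train_one_epoch, model.evaluate, threshold=0.85, max_epochs=20
)
print(epochs, accuracy)
```

Passing an `evaluate` callable that scores the model on data not used for training corresponds to the option, described above, of testing with new training data to guard against overtraining.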

The precise metric used to measure the accuracy of the ML model may be selected based on the circumstances and on network administrator preferences. Some examples include raw accuracy (e.g., total correct inferences divided by total inferences), the precision of the model (e.g., correct positive inferences divided by total positive inferences, or correct negative inferences divided by total negative inferences), recall (e.g., true positive inferences divided by the sum of true positive inferences and false negative inferences), and the F1 score (a combination of precision and recall).
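For reference, these metrics can be written in terms of the counts of true positive (TP), false positive (FP), true negative (TN), and false negative (FN) inferences. The symbols are introduced here for illustration only, and the F1 score is given in its standard form as the harmonic mean of precision and recall.

```latex
\begin{align*}
\text{accuracy} &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{precision (positive inferences)} &= \frac{TP}{TP + FP} \\
\text{precision (negative inferences)} &= \frac{TN}{TN + FN} \\
\text{recall} &= \frac{TP}{TP + FN} \\
F_1 &= \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}
\end{align*}
```

For example, with TP = 2, FP = 1, and FN = 1, precision and recall are both 2/3, and the F1 score is therefore also 2/3.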

If it is determined in block 106 that the machine learning model is not sufficiently accurate, the computer system overseeing method 100 (or a network administrator) may return to block 104 to continue training the machine learning model. In some embodiments, this may simply involve not interrupting the training process, allowing the ML model to train continuously. In some embodiments, however (such as embodiments in which the training process is interrupted at block 106), returning to block 104 may include adding new training data to the training process, which may increase the flexibility of the ML model.

If, however, it is determined in block 106 that the machine learning model is sufficiently accurate, the computer system overseeing method 100 may proceed to block 108, in which the inputs that are significant to the ML model's inference are extracted. As previously discussed, the particular process performed in block 108 may depend on the structure of the machine learning model, the nature of the training data, and preferences regarding the inference rules to be created from the significant inputs.

For example, if the machine learning model trained in block 104 is a recurrent neural network, block 108 may include applying a wrapper, such as a LIME wrapper, to the ML model to monitor the effect that various adjustments to the ML model's inputs have on the ML model's inferences. On the other hand, if the machine learning model trained in block 104 is an attention-based model, block 108 may simply involve querying the attention weight applied to each input of the ML model.

Further, if the training data used to train the ML model in block 104 is complex, the significant inputs extracted at block 108 may take a variety of forms. For example, if the ML model was trained on all available data related to a set of network communications, the extracted inputs may include any of that data (e.g., the recipient of a communication, the time of the communication, the frequency with which the communication was sent, the format of the communication). On the other hand, if the ML model was trained only on the DNS names queried by networked devices, the extracted inputs would include only the DNS name (or names) that are influential in inferring whether a device exhibits the device attribute.

In some instances, a significant input may be an input that, when recognized by the ML model, causes the ML model to infer a particular device attribute based on that input alone. For example, a significant input may take the form of a domain name that, if queried by a network device, causes the ML model to infer that the device is an IoT device. In some instances, however, a significant input may be an input that causes the ML model to infer a particular device attribute only when it is recognized together with other inputs. For example, in some instances a significant input may be a series of bytes in the header of a network communication. When recognized in isolation, that series of bytes may not be sufficient to support an inference of a device attribute. However, when the ML model recognizes that series of bytes and also recognizes a second series of bytes in a second network communication from the same network device on the same day, the ML model may be able to infer the network device's attribute accurately.

The implementation preferences of the network administrator may also affect the significant inputs extracted in block 108. In some use cases, for example, the network administrator may wish to extract only the single most significant input for rule creation. In other use cases, the network administrator may wish to extract the five most significant inputs.

The device attribute being inferred may also affect which inputs are determined to be significant. The inputs that are particularly useful for inferring that a device is an IoT device may be entirely different from the inputs that are useful for inferring that a device was made by a particular manufacturer, or for inferring that a device is a personal mobile phone.

In some embodiments, extracting the significant inputs may also include extracting weights that numerically express the importance of each input to the ML model's inference. For example, if the ML model trained in block 104 is an attention-based model, each extracted significant input may also be annotated with the attention weight of that input.

Thus, if the machine learning model was trained in block 104 to analyze the set of DNS names queried by a network device, the significant inputs extracted at block 108 may take the form of a list of (1) the ten DNS names whose presence in the input data was most influential in the ML model's inference, and (2) for each of those ten DNS names, the numerical weight applied to that DNS name.

Once the significant inputs are extracted in block 108, a set of inference rules is created in block 110 based on those extracted inputs. As previously discussed, these inference rules may take the form of if-then statements that contain a condition and a conclusion (here, an inference) based on that condition. For example, a created rule may take the form of a statement with the condition: if "DNS name" equals "ntp.manufacturer.com" and "device-metrics-us.manufacturer.com", and the conclusion: then "devicemanufacturer" equals "manufacturer." When parsed by a computer (such as a general-purpose computer system or a finite state machine), this rule would conclude that a network device that queries both "ntp.manufacturer.com" and "device-metrics-us.manufacturer.com" was made by "manufacturer."

In some embodiments, the inference rules at block 110 may be created manually (e.g., by a network administrator) or automatically. In use cases with a simple, short set of extracted important inputs, creating rules automatically can yield a manageable rule set. For example, if the top three DNS names were extracted at block 108, block 110 may include automatically incorporating each of those DNS names into a rule that concludes the device attribute is exhibited if that single DNS name is present. Block 110 may also include combining each possible pair of DNS names with an "AND" operator (resulting in a condition that requires both names), an "OR" operator (resulting in a condition that requires one or both DNS names), and an "exclusive OR (XOR)" operator (resulting in a condition that requires one DNS name but is not satisfied if both DNS names are present). In some embodiments, all three DNS names may be combined into one condition, such as a condition that requires the first two DNS names to be present and the third DNS name to be absent. The resulting set of inference rules may include 25 rules or fewer, which keeps selecting the accurate inference rule (or rules) manageable.
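The automatic combination step described above can be sketched as follows. The function name `generate_candidate_rules` and the example DNS names are assumptions made for illustration; each rule is modeled as a predicate over the set of DNS names a device queried.

```python
from itertools import combinations


def generate_candidate_rules(top_names):
    """Map a readable condition to a predicate over the set of queried DNS names."""
    rules = {}
    # One rule per single DNS name.
    for name in top_names:
        rules[name] = lambda seen, n=name: n in seen
    # AND, OR and XOR for each possible pair of DNS names.
    for a, b in combinations(top_names, 2):
        rules[f"{a} AND {b}"] = lambda seen, a=a, b=b: a in seen and b in seen
        rules[f"{a} OR {b}"] = lambda seen, a=a, b=b: a in seen or b in seen
        rules[f"{a} XOR {b}"] = lambda seen, a=a, b=b: (a in seen) != (b in seen)
    # One three-name condition: first two present, third absent.
    if len(top_names) >= 3:
        a, b, c = top_names[:3]
        rules[f"{a} AND {b} AND NOT {c}"] = (
            lambda seen: a in seen and b in seen and c not in seen
        )
    return rules


# Hypothetical top three DNS names.
names = ["a.example.com", "b.example.com", "c.example.com"]
candidates = generate_candidate_rules(names)
rule_count = len(candidates)  # 3 singles + 3 pairs x 3 operators + 1 triple
```

For three names this produces 13 candidates, which stays within the bound of 25 rules mentioned above.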

In some embodiments, the extracted inputs may allow for the creation of more complex inference rules. For example, if the important inputs extracted at block 108 include timestamps of network communications, the rules created at block 110 may make use of those timestamps. A rule could, for instance, include a condition that a network device sends communications to an IP address within a particular time window (e.g., between 2 a.m. and 4 a.m.) or at a particular frequency (e.g., at least 40 times per day). In embodiments in which most or all data related to the network communications is included in the training data, the extracted inputs may allow for even more complex rules. For example, a sequence mining algorithm (such as the PrefixSpan algorithm) may collect sequences of events from the important inputs extracted at block 108. The resulting rules may be based, for example, on a series of communications sent to a particular sequence of destinations, or a series of communications sent with a particular sequence of content.
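The two timestamp-based conditions mentioned above (a time-of-day window and a minimum daily frequency) might be checked as in the sketch below. The helper names are hypothetical, and interpreting "per day" as any 24-hour sliding window is an assumption; the disclosure does not specify how the period is measured.

```python
from datetime import datetime, timedelta


def sent_within_hours(timestamps, start_hour, end_hour):
    """True if any communication was sent with start_hour <= hour < end_hour."""
    return any(start_hour <= ts.hour < end_hour for ts in timestamps)


def meets_frequency(timestamps, min_count, window=timedelta(hours=24)):
    """True if some span shorter than `window` contains at least min_count communications."""
    ordered = sorted(timestamps)
    left = 0
    for right, ts in enumerate(ordered):
        # Shrink the window until it spans less than `window`.
        while ts - ordered[left] >= window:
            left += 1
        if right - left + 1 >= min_count:
            return True
    return False


start = datetime(2024, 1, 1, 0, 0)
# 40 communications, 30 minutes apart (all within 19.5 hours).
frequent = [start + timedelta(minutes=30 * i) for i in range(40)]
# 40 communications, 1 hour apart (at most 24 in any 24-hour window).
sparse = [start + timedelta(hours=i) for i in range(40)]
```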

In some use cases, a network administrator may prefer to keep the initial set of rules created at block 110 from becoming too complex. This can potentially avoid unnecessary rule testing and analysis in later blocks of method 100. To this end, block 110 may default to creating simple rules first (e.g., simple if-then statements). In some embodiments, it may also be preferable for rules to be created or reviewed by a network administrator. A network administrator can recognize context within the inputs that a rule creation algorithm cannot. The network administrator may therefore be able to eliminate rules that the administrator has reason to believe will be inaccurate, even though they were created using important inputs (e.g., a rule that requires a device to query both a manufacturer's NTP server and a competing manufacturer's NTP server).

After the set of inference rules is created in block 110, the accuracy of the rules can be tested in block 112. In some embodiments, testing the inference rules may include applying the inference rules to a new data set using a computer system, such as a general-purpose computer processor. For example, the computer may load an inference rule into memory, receive a new network communication, and compare that network communication to the text in the "condition" of the inference rule. If the data in the network communication matches the condition, the computer infers that a device that is a party to the communication exhibits the device attribute. For example, if the condition is "DNS name = ntp.smarttoaster.com" and the network communication is a query to ntp.smarttoaster.com, the computer will conclude that a device that is a party to the communication (here, the sending device) exhibits the device attribute (e.g., the device is a smart toaster).

Using a new data set in block 112, rather than the earlier training data, can beneficially reduce the likelihood that the rules fit the earlier historical training data well but are poorly suited to making inferences on data that differs from it. In other words, by testing the created inference rules on new data, method 100 can avoid rules that are "overtrained" to the earlier historical training data. In some embodiments, a mixture of new data and old training data (e.g., 80% old training data, 20% new data) may be used to test the rules.

The new data used in block 112 may take the form of a second set of historical training data or of live data from real-time communications in the network. If a second set of historical training data is used, the computer that applies the created rules to that second set may be able to compare its inferences with the labels in the training data, or with the inferences of the trained ML model on the second set. If live data is used as the new data set, however, it may be particularly beneficial to also analyze the new data set with the machine learning model trained in block 104, because live data is unlikely to be labeled. The inferences of the computer running the inference rules and those of the machine learning model can then be compared to assess accuracy.
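When live data carries no labels, one simple way to compare the two sets of inferences is an agreement rate between the rule and the trained model. The function below is a hypothetical sketch of that comparison; the disclosure does not prescribe a particular measure.

```python
def agreement_rate(rule_inferences, model_inferences):
    """Fraction of communications on which the rule and the ML model agree."""
    if len(rule_inferences) != len(model_inferences):
        raise ValueError("both inference lists must cover the same communications")
    if not rule_inferences:
        return 0.0
    matches = sum(1 for r, m in zip(rule_inferences, model_inferences) if r == m)
    return matches / len(rule_inferences)


# Hypothetical inferences for four live communications (True = attribute present).
rule_out = [True, False, True, True]
model_out = [True, False, False, True]
rate = agreement_rate(rule_out, model_out)
```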

In some embodiments, determining in block 112 whether the created inference rules are sufficiently accurate may include performing accuracy calculations similar to those performed on the trained ML model in block 106. For example, raw accuracy, precision, recall, F1 score, or a combination thereof may be calculated for each created rule. In block 112, these measurements may then be compared to one or more accuracy thresholds (e.g., a precision threshold, an F1 score threshold).
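The measurements named above follow their standard definitions and can be computed for a single rule from its predictions and the corresponding labels. The function names and the threshold values in this sketch are assumptions chosen for illustration.

```python
def score_rule(predictions, labels):
    """Return raw accuracy, precision, recall and F1 for one rule."""
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, l in pairs if p and l)
    fp = sum(1 for p, l in pairs if p and not l)
    fn = sum(1 for p, l in pairs if not p and l)
    tn = sum(1 for p, l in pairs if not p and not l)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def is_sufficiently_accurate(scores, thresholds):
    """True if every thresholded measurement meets its minimum."""
    return all(scores[name] >= minimum for name, minimum in thresholds.items())


# Hypothetical test data: one true positive, one false positive,
# one false negative, one true negative.
scores = score_rule([True, True, False, False], [True, False, True, False])
passes = is_sufficiently_accurate(scores, {"precision": 0.9, "f1": 0.8})
```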

If block 112 determines that a rule is not sufficiently accurate, the rule is reviewed in block 114. In some embodiments, block 114 may involve a human reviewer (e.g., a network administrator) analyzing the rule for sources of error. For example, the human reviewer may review the rule for potential parsing errors. The human reviewer may also review context within the training data that could explain the inaccuracy or help avoid it. The human reviewer may further review the important inputs extracted at block 108 for potential errors (e.g., an input based on content that is likely to be found in nearly all communications on the network).

In some instances, review of the rules in block 114 may show that a rule can be improved. For example, if a rule created at block 110 takes the form of an if-then statement that infers the device attribute whenever a communication is sent to a particular DNS name, and block 112 determines that the rule produces an unacceptable number of false-positive inferences, reviewing the rule at block 114 may reveal additional context in the training data suggesting that the rule be modified to infer the device attribute only if communications are sent to that DNS name at least 6 times within 24 hours. A rule may be modified by increasing the complexity of the rule creation algorithm (e.g., moving from a simple algorithm that inserts one domain into an if-then rule to a more complex algorithm that considers communication timing, the influence of other network communications, sequences among communications (e.g., the PrefixSpan algorithm), or all of the above). Modifying rules at block 114 may also be performed by a human reviewer, such as a network administrator.

In some cases, a rule may be found inaccurate in block 112 because the machine learning model on which the rule is based (or the important inputs on which the rule is based) is inaccurate for the new data used in block 112. In these cases, method 100 may include returning from block 114 to block 104, at which point the machine learning model can be retrained. In most cases, however, performing block 104 with a sufficiently large training set and setting the accuracy threshold in block 106 appropriately high should prevent inaccuracies at block 112 that stem from a poorly trained model. For this reason, block 114 will generally proceed to block 112, and is shown that way here.

After the rules have been reviewed and modified, block 112 again determines whether the rules are sufficiently accurate. This subsequent iteration of block 112 may involve the same analysis as previously discussed. If block 112 determines that a rule is sufficiently accurate, the rule is applied in block 116. In some embodiments, block 116 may involve adding the created rule to a list of "accepted" rules for potential later application to live network traffic. In some embodiments, block 116 may include applying the rule to live, real-time data. As already discussed, this can be performed by a general-purpose computer having a processor configured to compare live network communications with the conditions of the created rules (e.g., using string comparison). For most created rules, a simple finite state machine may be sufficient for this comparison. When the rules are applied to live network communications, the inferences made from them may be used in making manual or automatic traffic filtering decisions (e.g., a decision whether to block traffic to an IoT device, or whether to prevent a device with a vulnerable operating system from contacting a VLAN that hosts network controller devices or a VLAN with sensitive storage). In some embodiments, this rule application may be performed, for example, on a network firewall device or a router device.
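A string-comparison rule check feeding a filtering decision, as described above, might look like the following sketch. The rule table, the attribute label `iot_device`, and the block/allow policy are all hypothetical; a real firewall or router would integrate this differently.

```python
# Hypothetical accepted rules: DNS name in the condition -> inferred attribute.
ACCEPTED_RULES = {"ntp.smarttoaster.com": "iot_device"}

# Hypothetical policy: devices with these attributes are kept off sensitive VLANs.
BLOCKED_ATTRIBUTES = {"iot_device"}


def infer_attributes(queried_name, rules=ACCEPTED_RULES):
    """Compare the queried DNS name with each rule condition (string comparison)."""
    normalized = queried_name.lower().rstrip(".")  # DNS names are case-insensitive
    attribute = rules.get(normalized)
    return {attribute} if attribute else set()


def filtering_decision(queried_name):
    """Return 'block' if any inferred attribute is on the blocked list, else 'allow'."""
    if infer_attributes(queried_name) & BLOCKED_ATTRIBUTES:
        return "block"
    return "allow"


toaster_decision = filtering_decision("NTP.SmartToaster.com.")
other_decision = filtering_decision("www.example.org")
```

A dictionary lookup is used here because each condition is a single exact name; compound conditions would need a richer matcher.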

For ease of understanding, FIG. 2 depicts a graphical abstraction 200 of a system for developing and applying a set of rules to infer attributes of a device that is a party to a network communication, according to embodiments of the present disclosure. The purpose of FIG. 2 is to provide a simplified view of the inputs and outputs of the processes of the present disclosure. Accordingly, the components and processes suggested by graphical abstraction 200 are simplified abstractions and are not intended to be precise representations of embodiments of the present disclosure.

In graphical abstraction 200, a set of network communications 202 is input into a machine learning model 204 in order to train the machine learning model to infer a device attribute of the devices that are parties to those network communications. In some embodiments, the device attribute will have been selected beforehand, and the content of network communications 202 will have been selected for those training purposes. Network communications 202 may therefore contain labels indicating whether a party to the communication exhibits the device attribute. The content of network communications 202 may be chosen to target the reasoning behind the inferences made by machine learning model 204. For example, if a network administrator wishes to train machine learning model 204 to make inferences based on the actual bytes of the network communications rather than, for example, information about the sender or intended recipient, network communications 202 may be reduced to contain only the message content and no additional communication data.

Once machine learning model 204 has been properly trained to infer the device attribute based on network communications 202, machine learning model 204 is itself analyzed by computer system 206. Computer system 206 may be any computer capable of analyzing machine learning model 204 to extract the inputs that are significant in the inferences made by machine learning model 204. For example, if machine learning model 204 is a recurrent neural network, computer system 206 may be a system with a processor capable of running a LIME wrapper. If machine learning model 204 is an attention-based network, computer system 206 may be a system with a processor capable of identifying and ranking the attention weights of machine learning model 204. In some embodiments, computer system 206 may be a computer system that resides on the network and is responsible for managing network communications (e.g., a network controller device or a computer running network controller software).

Computer system 206 can then create inference rules 208 based on the extracted important inputs. In some embodiments, computer system 206 may, for example, use a single algorithm capable of both extracting the inputs and formulating inference rules 208 (shown here as a list of five conditions on the left and a corresponding list of five inferences on the right). This may be beneficial when extracting important inputs from an attention-based model, because the attention weights can be readily accessible to an algorithm dedicated to rule creation. In some embodiments, the rule creation algorithm may instead obtain the extracted important inputs from a separate wrapper applied to machine learning model 204.

Once inference rules 208 are created, computer system 212 applies them to a new set of network communications 210. In some embodiments, the new set of network communications 210 may be an additional training data set used to validate inference rules 208. In other embodiments, the new set of network communications 210 may be live, real-time network traffic. In some embodiments, computer system 212 may be the same computer system as computer system 206 (e.g., a network controller device) or a different computer system. In some embodiments, computer system 212 may be any computer system capable of performing string comparisons between the conditions in inference rules 208 and the set of network communications 210.

Computer system 212 may output inferences 214. As shown, inferences 214 indicate that computer system 212 found two conditions (namely, the first condition and the fourth condition) to be satisfied, and can therefore infer that a device that is a party to network communications 210 exhibits the corresponding first and fourth device attributes. As presented, FIG. 2 shows an example in which two device attributes are inferred. For example, computer system 212 may infer that the device that is a party to network communications 210 is an IoT device (e.g., the first condition) with firmware that is more than one year old (e.g., the fourth condition). In some embodiments of the invention, however, only one device attribute may be inferred per set of rules.

As previously discussed, extracting important inputs and creating rules based on those inputs can vary with the circumstances (e.g., ML model properties, network administrator preferences). For this reason, and as an aid to understanding, FIG. 3 presents an example method 300 of extracting rules for inferring an attribute of a device from an ML model that has also been trained to infer that attribute, according to embodiments of the present disclosure. Method 300 may be performed by a computer system, such as computer system 401, that is configured to monitor and control network communications on a network (e.g., a network controller device or a system hosting network controller software).

Method 300 begins at block 302, in which an attention-based machine learning model is trained to infer, based on one or more network communications, whether a device participating in those communications exhibits a preselected device attribute. Details of block 302 can be found throughout this disclosure. Specific examples can be found with respect to block 104 of FIG. 1 and machine learning model 204 of FIG. 2. Once the attention-based model has been trained in block 302, input extraction begins in block 304.

In block 304, the devices about which the attention-based model made an inference are identified. For example, each set of communications analyzed by the attention-based model in the data set may be assigned a unique hypothetical network device. For each set of communications in which the attention-based model infers that the preselected attribute is present, the hypothetical device may be added in block 304 to a list of "identified" devices. For example, if the attention model analyzes 10 sets of communications and infers that the preselected attribute (e.g., being an IoT device) is present in 5 of those sets, block 304 may result in identifying 5 unique devices that exhibit the preselected attribute (here, the attribute of being an IoT device). This holds even if the actual network from which the 10 sets of communications originated consists of only two devices, only one of which is an IoT device. For the purposes of method 300, it may be sufficient to treat each set of communications as corresponding to a unique device.

For each device identified in block 304 (e.g., each hypothetical device), the attention weight assigned to each input (e.g., a header of a communication, the address to which a communication was sent) is identified in block 306. In other words, for each set of communications, block 306 determines how relevant each aspect of the communications was to the inference that the corresponding device exhibits the preselected attribute. In an attention-based model, these attention weights take the form of numerical expressions of the importance applied to each input, and can be derived directly from the structure of the model. Block 306 may therefore take the form of extracting those attention weights directly from the model for each device identified in block 304, producing a list of attention weights for each identified device. For example, in some embodiments, the list may take the form of a list of the DNS names to which communications in the set were sent and, for each DNS name, a number representing the influence that the presence of that DNS name had on the inference.

In block 308, the attention weights identified in block 306 for each device are summed. For example, if 5 devices were identified in block 304, and the attention-based model had attention weights of 0.2, 0.1, 0.3, 0.1, and 0.2 for the DNS name firmware_fetch.manufacturer.com for those 5 devices respectively, block 308 may include adding those attention weights together, yielding a combined weight of 0.9 for the DNS name firmware_fetch.manufacturer.com across the set of identified devices. This summation can be repeated for every input for which attention weights are available. For example, if the data on which the attention model was trained contains queries for 30 DNS names, block 308 may produce a list of 30 combined weights. Each of these combined weights represents the total attention weight given to that DNS name across all 5 identified devices.

Once the combined weights have been created in block 308, they are ranked in block 310. This may include, for example, sorting the combined weights in descending order, so that the largest combined weight is listed first and the smallest is listed last. Because each combined weight corresponds to the sum of the weights for a single input across all the devices identified in block 304, the result of block 310 is a list of the inputs that were most influential in the attention-based model's inferences across all identified devices.
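The summing and ranking performed in blocks 308 and 310 can be sketched as follows. The function names are hypothetical, and the per-device weights (including those for the second DNS name, `time.example.org`) are assumed values chosen so that the first name sums to 0.9 across five devices.

```python
from collections import defaultdict


def combine_attention_weights(per_device_weights):
    """Sum each input's attention weight across all identified devices (block 308)."""
    combined = defaultdict(float)
    for weights in per_device_weights:
        for input_name, weight in weights.items():
            combined[input_name] += weight
    return dict(combined)


def rank_inputs(combined):
    """Sort inputs by combined weight, largest first (block 310)."""
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# One dictionary per identified (hypothetical) device: DNS name -> attention weight.
per_device = [
    {"firmware_fetch.manufacturer.com": 0.2, "time.example.org": 0.05},
    {"firmware_fetch.manufacturer.com": 0.1, "time.example.org": 0.05},
    {"firmware_fetch.manufacturer.com": 0.3, "time.example.org": 0.05},
    {"firmware_fetch.manufacturer.com": 0.1, "time.example.org": 0.05},
    {"firmware_fetch.manufacturer.com": 0.2, "time.example.org": 0.05},
]

combined = combine_attention_weights(per_device)
ranked = rank_inputs(combined)
top_name, top_weight = ranked[0]
```

Floating-point addition may leave the combined weight a hair away from 0.9, so any comparison against a threshold should allow a small tolerance.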

With the ranked list of combined attention weights, the computer system can create inference rules in block 312 based on the most influential inputs. For example, an algorithm may automatically take the 5 most influential inputs and insert each into a simple if-then statement whose "if" condition is the presence of the influential input and whose "then" conclusion is the inference of the device attribute. As previously discussed, in some embodiments these created rules may be more complex and include more context (e.g., requiring that two ranked inputs be present to produce the device attribute inference, requiring that an input be present at least X times within a time period Y, or requiring a particular sequence of inputs). The rules created as part of method 300 can then be tested for accuracy by any of the methods discussed herein. If accurate, the created rules can be applied to live network communications.
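Turning the ranked list into simple if-then rules, as block 312 describes, might look like the sketch below. The function name `create_rules`, the dictionary layout of a rule, and the example ranked list are illustrative assumptions.

```python
def create_rules(ranked_inputs, attribute, top_k=5):
    """Build one simple if-then rule per top-ranked input.

    ranked_inputs: list of (input_name, combined_weight), largest weight first.
    """
    return [
        {"if_present": name, "then": attribute, "combined_weight": weight}
        for name, weight in ranked_inputs[:top_k]
    ]


def apply_rule(rule, observed_inputs):
    """Return the inferred attribute if the rule's input is present, else None."""
    return rule["then"] if rule["if_present"] in observed_inputs else None


# Hypothetical ranked list with seven inputs, already sorted in descending order.
ranked_example = [(f"name{i}.example.com", 1.0 - 0.1 * i) for i in range(7)]
rules = create_rules(ranked_example, attribute="IoT device")
inferred = apply_rule(rules[0], {"name0.example.com", "other.example.com"})
```

Keeping the combined weight alongside each rule lets a reviewer see why the rule was generated when testing it for accuracy.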

FIG. 4 depicts the representative major components of an example computer system 401 that may be used in accordance with embodiments of the present disclosure. The particular components depicted are presented for the purpose of example only and are not necessarily the only such variations. Computer system 401 may include a processor 410, memory 420, an input/output interface (also referred to herein as I/O or an I/O interface) 430, and a main bus 440. Main bus 440 may provide communication pathways for the other components of computer system 401. In some embodiments, main bus 440 may connect to other components such as a specialized digital signal processor (not depicted).

Processor 410 of computer system 401 may include one or more CPUs 412. Processor 410 may additionally include one or more memory buffers or caches (not depicted) that provide temporary storage of instructions and data for CPU 412. CPU 412 may perform instructions on input provided from the caches or from memory 420 and output the result to the caches or memory 420. CPU 412 may include one or more circuits configured to perform one or more methods consistent with embodiments of the present disclosure. In some embodiments, computer system 401 may contain multiple processors 410, as is typical of a relatively large system. In other embodiments, however, computer system 401 may have a single processor with a single CPU 412.

Memory 420 of computer system 401 may include a memory controller 422 and one or more memory modules (not depicted) for temporarily or permanently storing data. In some embodiments, memory 420 may include random-access semiconductor memory, a storage device, or a storage medium (either volatile or non-volatile) for storing data and programs. Memory controller 422 may communicate with processor 410, facilitating storage and retrieval of information in the memory modules. Memory controller 422 may communicate with I/O interface 430, facilitating storage and retrieval of input or output in the memory modules. In some embodiments, the memory modules may be dual in-line memory modules.

I/O interface 430 may include an I/O bus 450, a terminal interface 452, a storage interface 454, an I/O device interface 456, and a network interface 458. I/O interface 430 may connect main bus 440 to I/O bus 450. I/O interface 430 may direct instructions and data from processor 410 and memory 420 to the various interfaces of I/O bus 450. I/O interface 430 may also direct instructions and data from the various interfaces of I/O bus 450 to processor 410 and memory 420. The various interfaces may include terminal interface 452, storage interface 454, I/O device interface 456, and network interface 458. In some embodiments, the various interfaces may include a subset of the aforementioned interfaces (e.g., an embedded computer system in an industrial application may not include terminal interface 452 and storage interface 454).

Logic modules throughout computer system 401 (including but not limited to memory 420, processor 410, and I/O interface 430) may communicate failures and changes to one or more components to a hypervisor or operating system (not depicted). The hypervisor or operating system may allocate the various resources available in computer system 401 and track the location of data in memory 420 and of processes assigned to the various CPUs 412. In embodiments that combine or rearrange elements, aspects of the logic modules' capabilities may be combined or redistributed. These variations would be apparent to one skilled in the art.

The present invention may be a system, a method, and/or a computer program product at any possible technical detail level of integration. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.

The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.

Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium, or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network, and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.

Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, configuration data for integrated circuitry, or either source code or object code written in any combination of one or more programming languages, including an object-oriented programming language such as Smalltalk, C++, or the like, and procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.

Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.

These computer readable program instructions may be provided to a processor of a computer or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the functions/acts specified in the flowchart and/or block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other device to produce a computer-implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the Figures. For example, two blocks shown in succession may, in fact, be accomplished as one step, executed concurrently, substantially concurrently, or in a partially or wholly temporally overlapping manner, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special-purpose hardware and computer instructions.

The descriptions of the various embodiments of the present disclosure have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims

1. A method comprising:

analyzing, by a machine learning model, a first network communication having a first set of inputs;

inferring, by the machine learning model and based on the analyzing, that a first device that is a party to the first network communication exhibits a device attribute;

extracting, from the machine learning model, a first set of significant inputs that had a significant impact on the inferring; and

creating a rule for identifying the device attribute using the first set of inputs, wherein the rule establishes a condition that, when present in a network communication, implies that a party to the network communication exhibits the device attribute.

2. The method of claim 1, wherein said extracting comprises:

identifying an input weight for each input in the first set of inputs;

ranking the input weights of the first set of inputs; and

selecting the first set of significant inputs based on the ranking.
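
As an illustrative sketch only, and not part of the claims, the ranking and selection recited in claim 2 could look like the following Python. The function name `select_significant_inputs`, the `top_k` cutoff, and the example weights are hypothetical; the claims do not state how many inputs are selected or what form the weights take.

```python
from typing import Dict, List


def select_significant_inputs(input_weights: Dict[str, float], top_k: int = 2) -> List[str]:
    """Rank inputs by weight (highest first) and keep the top_k of them."""
    ranked = sorted(input_weights.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _weight in ranked[:top_k]]


# Hypothetical weights for inputs (here, DNS names) seen in one communication.
weights = {
    "updates.example.com": 0.61,
    "ntp.example.org": 0.07,
    "cam.example.net": 0.32,
}
significant = select_significant_inputs(weights, top_k=2)
```

A fixed `top_k` is only one possible selection criterion; a weight threshold would fit the claim language equally well.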

3. The method of claim 2, further comprising:

analyzing, by the machine learning model, a second network communication having a second set of inputs; and

inferring, by the machine learning model and based on the analyzing, that a second device that is a party to the second network communication exhibits the device attribute;

wherein the extracting further comprises:

identifying an input weight for each input in the second set of inputs; and

combining the input weights of the first set of inputs and the second set of inputs.

4. The method of claim 1, wherein the machine learning model is an attention-based model.

5. The method of claim 1, wherein the rule is an if-then statement.

6. The method of claim 1, further comprising:

applying the rule to a real-time network communication;

detecting the condition in the real-time network communication;

inferring, based on the detecting, that a second device participating in the real-time network communication exhibits the device attribute; and

blocking the real-time network communication based on the inferring.
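
For illustration only, an if-then rule of the kind recited in claims 5 and 6 might be applied to live traffic roughly as follows. The `Communication` and `Rule` types, the `"block"`/`"allow"` verdict strings, the attribute label, and the DNS names are all hypothetical; the claims do not prescribe any data structure or blocking mechanism.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class Communication:
    """Hypothetical, simplified view of one real-time network communication."""
    source_device: str
    dns_queries: FrozenSet[str]


@dataclass(frozen=True)
class Rule:
    """If `condition` holds for a communication, a party exhibits `attribute`."""
    attribute: str
    condition: Callable[[Communication], bool]


def apply_rule(rule: Rule, comm: Communication) -> str:
    """Return "block" when the rule's condition is detected, else "allow"."""
    if rule.condition(comm):
        # Condition detected: infer the attribute and block the communication.
        return "block"
    return "allow"


# Hypothetical rule: querying this DNS name implies the device attribute.
rule = Rule(
    attribute="vulnerable-firmware",
    condition=lambda c: "cam.example.net" in c.dns_queries,
)
flagged = apply_rule(rule, Communication("dev-1", frozenset({"cam.example.net"})))
passed = apply_rule(rule, Communication("dev-2", frozenset({"ntp.example.org"})))
```

Because the rule is a plain predicate rather than a model evaluation, checking it costs little per communication, which is consistent with the low-latency aim stated in the title.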

7. The method of claim 1, wherein the first set of significant inputs includes domain names.

8. A system comprising:

a processor; and

a memory in communication with the processor, the memory containing program instructions that, when executed by the processor, are configured to cause the processor to perform a method comprising:

analyzing, by a machine learning model, a first set of network communications having a first set of inputs;

inferring, by the machine learning model and based on the analyzing, that each device in a set of devices exhibits a device attribute, wherein each device is a party to a network communication in the first set of network communications;

creating a rule for identifying the device attribute using the first set of inputs, wherein the rule establishes a condition that, when present in a set of real-time network communications, implies that a party to the set of real-time network communications exhibits the device attribute.

9. The system of claim 8, wherein the machine learning model is an attention-based model, and wherein the extracting comprises:

identifying, for a particular device in the set of devices, a list of attention weights expressing the importance of each particular input in the first set of inputs to reasoning for the particular device;

combining, for a particular input of the particular device, the attention weights in the list with the attention weights for corresponding inputs in the set of inputs for the other devices in the set of devices, resulting in combined weights for the inputs corresponding to all devices in the set of devices;

comparing the combined weight with other combined weights for other inputs in the set of inputs;

determining that the particular input is a significant input based on the comparison; and

adding the particular input to the first set of significant inputs.
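
A minimal sketch of the combining and comparing steps of claim 9 follows, under stated assumptions: attention weights are combined across devices by simple summation, and an input counts as significant when its combined weight exceeds the mean combined weight. Neither choice comes from the claims, and the function names, device identifiers, and DNS names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def combine_attention_weights(per_device: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Sum each input's attention weight over every device (assumed combiner)."""
    combined: Dict[str, float] = defaultdict(float)
    for weights in per_device.values():
        for input_name, weight in weights.items():
            combined[input_name] += weight
    return dict(combined)


def significant_inputs(combined: Dict[str, float]) -> List[str]:
    """Keep inputs whose combined weight exceeds the mean (assumed comparison)."""
    if not combined:
        return []
    mean = sum(combined.values()) / len(combined)
    return sorted(name for name, weight in combined.items() if weight > mean)


# Hypothetical per-device attention weights over queried DNS names.
per_device = {
    "dev-a": {"cam.example.net": 0.7, "ntp.example.org": 0.3},
    "dev-b": {"cam.example.net": 0.6, "ntp.example.org": 0.1, "cdn.example.com": 0.3},
}
combined = combine_attention_weights(per_device)
selected = significant_inputs(combined)
```

Summation favors inputs that matter for many devices at once; averaging or taking a maximum would be alternative combiners with different trade-offs.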

10. The system of claim 9, wherein the particular input for the particular device is a DNS name queried by the particular device, and wherein the corresponding inputs for the other devices in the set of devices are DNS names queried by those other devices.

11. The system of claim 10, wherein the rules include inferring that a network device exhibits the device attribute if the network device queries the DNS name.

12. The system of claim 8, wherein the rules include inferring that a network device exhibits the device attribute if the network device queries the DNS name and the second DNS name.

13. The system of claim 8, wherein the first set of significant inputs comprises specific byte sequences in real-time network communications.

14. The system of claim 8, wherein the first set of significant inputs includes DNS names, and the condition includes querying the DNS names at least a threshold number of times within a specified time period.
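
The threshold condition of claim 14 can be sketched as a sliding-window counter. This is an illustrative assumption about one way to evaluate such a condition; the class name `QueryRateCondition`, its parameters, and the example values are hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque
from typing import Deque


class QueryRateCondition:
    """True once `dns_name` was queried at least `threshold` times in the window."""

    def __init__(self, dns_name: str, threshold: int, window_seconds: float) -> None:
        self.dns_name = dns_name
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps: Deque[float] = deque()

    def observe(self, queried_name: str, timestamp: float) -> bool:
        """Record one DNS query; return whether the condition now holds."""
        if queried_name != self.dns_name:
            return False
        self._timestamps.append(timestamp)
        # Drop queries that fell out of the time window.
        while self._timestamps and self._timestamps[0] < timestamp - self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.threshold


# Hypothetical condition: three queries for the name within 60 seconds.
cond = QueryRateCondition("cam.example.net", threshold=3, window_seconds=60.0)
results = [cond.observe("cam.example.net", t) for t in (0.0, 10.0, 20.0, 100.0)]
```

In this run the condition is met on the third query and no longer holds at 100 seconds, once the earlier queries have aged out of the window.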

15. A computer program product comprising a computer-readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to:

infer, by the machine learning model and based on the analysis, that each device in a set of devices exhibits a device attribute, wherein each device is a party to a network communication in the first set of network communications;

16. The computer program product of claim 15, wherein the machine learning model is an attention-based model, and wherein the extracting comprises:

combining, for a particular input of the particular device, the attention weights in the list with the attention weights for corresponding inputs in the set of inputs for the other devices in the set of devices, resulting in a combined weight for the input corresponding to all devices in the set of devices.

17. The computer program product of claim 16, wherein the particular input for the particular device is a DNS name queried by the particular device, and wherein the corresponding inputs for the other devices in the set of devices are DNS names queried by those other devices.

18. The computer program product of claim 15, wherein the rules include inferring that a network device exhibits the device attribute if the network device queries the DNS name.

19. The computer program product of claim 15, wherein the rule includes inferring that a network device exhibits the device attribute if the network device queries the DNS name and the second DNS name.

20. The computer program product of claim 15, wherein the first set of significant inputs includes DNS names, and the condition includes querying the DNS names at least a threshold number of times within a certain period of time.

21. A computer program comprising program code means adapted to perform the method of any one of claims 1 to 7 when said program is run on a computer.