CN118540239A - Heterogeneous graph anomaly detection method based on multipath neighborhood node self-adaptive selection - Google Patents
Heterogeneous graph anomaly detection method based on multipath neighborhood node self-adaptive selection Download PDFInfo
- Publication number
- CN118540239A CN118540239A CN202410621742.XA CN202410621742A CN118540239A CN 118540239 A CN118540239 A CN 118540239A CN 202410621742 A CN202410621742 A CN 202410621742A CN 118540239 A CN118540239 A CN 118540239A
- Authority
- CN
- China
- Prior art keywords
- node
- neighborhood
- sampling
- nodes
- graph
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000001514 detection method Methods 0.000 title claims abstract description 21
- 238000005070 sampling Methods 0.000 claims abstract description 107
- 238000000034 method Methods 0.000 claims abstract description 52
- 230000008569 process Effects 0.000 claims abstract description 39
- 230000002776 aggregation Effects 0.000 claims abstract description 34
- 238000004220 aggregation Methods 0.000 claims abstract description 34
- 230000003044 adaptive effect Effects 0.000 claims abstract description 18
- 230000002787 reinforcement Effects 0.000 claims abstract description 13
- 230000007613 environmental effect Effects 0.000 claims abstract description 11
- 230000002708 enhancing effect Effects 0.000 claims abstract 3
- 239000003795 chemical substances by application Substances 0.000 claims description 57
- 230000006870 function Effects 0.000 claims description 56
- 239000011159 matrix material Substances 0.000 claims description 52
- 238000004422 calculation algorithm Methods 0.000 claims description 19
- 230000009471 action Effects 0.000 claims description 18
- 230000009466 transformation Effects 0.000 claims description 18
- 238000013528 artificial neural network Methods 0.000 claims description 14
- 238000012549 training Methods 0.000 claims description 11
- 230000004913 activation Effects 0.000 claims description 10
- 238000005295 random walk Methods 0.000 claims description 9
- 230000002159 abnormal effect Effects 0.000 claims description 8
- 230000007246 mechanism Effects 0.000 claims description 8
- 208000001613 Gambling Diseases 0.000 claims description 7
- 230000006399 behavior Effects 0.000 claims description 7
- 238000010606 normalization Methods 0.000 claims description 7
- 238000004364 calculation method Methods 0.000 claims description 6
- 238000011156 evaluation Methods 0.000 claims description 6
- 238000013507 mapping Methods 0.000 claims description 6
- 238000005457 optimization Methods 0.000 claims description 5
- 238000005065 mining Methods 0.000 claims description 3
- 238000012545 processing Methods 0.000 claims description 3
- 238000012546 transfer Methods 0.000 claims description 3
- 238000010586 diagram Methods 0.000 claims description 2
- 238000012512 characterization method Methods 0.000 claims 5
- 230000010354 integration Effects 0.000 claims 1
- 239000012633 leachable Substances 0.000 claims 1
- 230000008447 perception Effects 0.000 claims 1
- 238000007418 data mining Methods 0.000 abstract description 2
- 238000013135 deep learning Methods 0.000 abstract description 2
- 230000006872 improvement Effects 0.000 description 8
- 101001121408 Homo sapiens L-amino-acid oxidase Proteins 0.000 description 2
- 102100026388 L-amino-acid oxidase Human genes 0.000 description 2
- 238000013459 approach Methods 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 230000001360 synchronised effect Effects 0.000 description 2
- 101100012902 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) FIG2 gene Proteins 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000008092 positive effect Effects 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 238000000547 structure data Methods 0.000 description 1
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/02—Topology update or discovery
- H04L45/08—Learning-based routing, e.g. using neural networks or artificial intelligence
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/24—Multipath
- H04L45/245—Link aggregation, e.g. trunking
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Environmental & Geological Engineering (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
本发明公开了一种基于多路邻域节点自适应选择的异质图异常检测方法,属于深度学习和图数据挖掘技术领域,包括以下步骤:给定元路径模式,构建具有多维语义信息的多维图;扩展多路邻域的采样范围,动态调整邻域节点的采样概率,采样出适量的邻域节点;按照采样节点序列的依赖性、动态性和复杂性仿真强化学习任务中的基本环境要素;在多维图中部署多智能体,自适应地从多维采样节点序列中顺序选择出同配节点;利用关系感知邻域聚合学习全面的节点表征嵌入;将目标节点的表征嵌入传递到下游任务中进行异质图表示学习。本发明能够实现对异质图中邻域节点的细粒度选择,不仅增强中心节点在信息聚合过程的可靠性,还提高异质图在表示学习方面的性能。
The present invention discloses a heterogeneous graph anomaly detection method based on adaptive selection of multi-way neighborhood nodes, which belongs to the technical field of deep learning and graph data mining, and includes the following steps: given a meta-path pattern, constructing a multidimensional graph with multidimensional semantic information; expanding the sampling range of the multi-way neighborhood, dynamically adjusting the sampling probability of the neighborhood nodes, and sampling an appropriate amount of neighborhood nodes; simulating the basic environmental elements in the reinforcement learning task according to the dependency, dynamics and complexity of the sampling node sequence; deploying multiple agents in the multidimensional graph, adaptively selecting the same matching nodes in sequence from the multidimensional sampling node sequence; using relationship-aware neighborhood aggregation to learn comprehensive node representation embedding; passing the representation embedding of the target node to downstream tasks for heterogeneous graph representation learning. The present invention can realize the fine-grained selection of neighborhood nodes in the heterogeneous graph, not only enhancing the reliability of the central node in the information aggregation process, but also improving the performance of the heterogeneous graph in representation learning.
Description
技术领域Technical Field
本发明涉及深度学习和图数据挖掘技术领域,尤其是一种基于多路邻域节点自适应选择的异质图异常检测方法。The present invention relates to the technical field of deep learning and graph data mining, and in particular to a heterogeneous graph anomaly detection method based on multi-path neighborhood node adaptive selection.
背景技术Background Art
真实世界场景中的网络数据可以建模为由多种实体和关系构成的异质图,并在很多研究领域中得到深入的挖掘和广泛的应用。在互联网新时代中,海量知识数据的涌现无疑是一个巨大的考验。面对大规模的图数据时,人们常常面临着信息的不完整性和不可靠性问题,这使得有效地选择和处理这些充满不确定性的知识资源变得颇具挑战。Network data in real-world scenarios can be modeled as heterogeneous graphs consisting of multiple entities and relationships, and has been deeply explored and widely applied in many research fields. In the new era of the Internet, the emergence of massive knowledge data is undoubtedly a huge challenge. When faced with large-scale graph data, people often face the problems of incompleteness and unreliability of information, which makes it challenging to effectively select and process these uncertain knowledge resources.
虽然图神经网络在处理各类结构化图数据方面表现出色,但其表示学习的过程往往建立在同配性假设之上,这意味着图中的邻域节点倾向于拥有与中心节点相似的特征或类别。这一现象在现实世界的图数据中得到了普遍观察和实质验证。然而,图数据的质量参差不齐,在图建模时可能会遭遇诸如异常值、噪声、稀疏性、偏差以及长尾分布等挑战。这些异常数据的存在可能会破坏同配性假设,导致基于邻域的信息变得不可信。如果从这样的邻域环境中聚合信息,可能会引入误导性因素,从而损害图学习算法的性能,导致学到的图表示失真。在同质图中,同配性尚且不是普遍存在的,异配性特征反而更为常见。而在异质图中,同质的对等节点间需要经过多跳的关系路径才能建立连接,这也进一步增强了节点之间的异配性。Although graph neural networks perform well in processing various types of structured graph data, their representation learning process is often based on the assortativity assumption, which means that neighborhood nodes in the graph tend to have similar features or categories as the central node. This phenomenon has been widely observed and verified in real-world graph data. However, the quality of graph data varies, and challenges such as outliers, noise, sparsity, bias, and long-tail distribution may be encountered when modeling graphs. The existence of these abnormal data may undermine the assortativity assumption and make neighborhood-based information unreliable. If information is aggregated from such a neighborhood environment, misleading factors may be introduced, thereby damaging the performance of the graph learning algorithm and causing the learned graph representation to be distorted. In homogeneous graphs, assortativity is not yet ubiquitous, and heterogeneity is more common. In heterogeneous graphs, homogeneous peer nodes need to go through multi-hop relationship paths to establish connections, which further enhances the heterogeneity between nodes.
最近有研究工作指出:在一般的现实业务场景应用中,在运用到图神经网络时,并非所有邻居节点都对消息传递聚合具有正向作用,事实上,一些邻居节点可能还会传递给中心节点的表示学习产生负面影响的干扰信息。它们通过手工或启发式的方法来指定需要保留有益邻居的数量或阈值来缓解有害节点对目标中心节点的负面影响。然而,这些工作仅粗糙地根据邻居节点与中心节点之间的距离度量指标来评判节点的好坏,并没有以更细粒度的方式自适应地检测出与中心节点在空间和语义上毗邻的节点异常。Recent research has pointed out that in general real-world business scenarios, when applied to graph neural networks, not all neighbor nodes have a positive effect on message aggregation. In fact, some neighbor nodes may also pass interference information that has a negative impact on the representation learning of the central node. They use manual or heuristic methods to specify the number or threshold of beneficial neighbors that need to be retained to alleviate the negative impact of harmful nodes on the target central node. However, these works only roughly judge the quality of nodes based on the distance measurement indicators between neighbor nodes and central nodes, and do not adaptively detect node anomalies that are spatially and semantically adjacent to the central node in a more fine-grained manner.
因此,有必要提出一种基于多路邻域节点自适应选择的异质图异常检测方法,来帮助图神经网络更有效地捕捉图数据中多维度的结构和特征信息,以完成高质量的异质图表示学习。Therefore, it is necessary to propose a heterogeneous graph anomaly detection method based on adaptive selection of multi-way neighborhood nodes to help graph neural networks more effectively capture the multi-dimensional structure and feature information in graph data, so as to complete high-quality heterogeneous graph representation learning.
发明内容Summary of the invention
本发明针对异质图中高阶同质对等节点存在的异常,可能会使观察到的邻域不可靠,并导致节点表示学习的误导性信息聚合问题,提供了一种基于多路邻域节点自适应选择的异质图异常检测方法,实现了对异质图中邻域节点的细粒度选择,不仅增强了中心节点在信息聚合过程的可靠性,还提高了异质图在表示学习方面的性能。The present invention aims at the problem that anomalies existing in high-order homogeneous peer nodes in heterogeneous graphs may make the observed neighborhood unreliable and lead to misleading information aggregation in node representation learning. A heterogeneous graph anomaly detection method based on multi-way neighborhood node adaptive selection is provided, which realizes fine-grained selection of neighborhood nodes in heterogeneous graphs, which not only enhances the reliability of central nodes in the information aggregation process, but also improves the performance of heterogeneous graphs in representation learning.
为解决上述技术问题,本发明所采用的技术方案是:In order to solve the above technical problems, the technical solution adopted by the present invention is:
一种基于多路邻域节点自适应选择的异质图异常检测方法,包括以下步骤:A heterogeneous graph anomaly detection method based on multi-path neighborhood node adaptive selection comprises the following steps:
S1,给定元路径模式,构建具有多维语义信息的多维图;S1, given a meta-path pattern, construct a multi-dimensional graph with multi-dimensional semantic information;
S2,扩展多路邻域的采样范围,动态调整邻域节点的采样概率,采样出适量的邻域节点;S2, expand the sampling range of multi-path neighborhood, dynamically adjust the sampling probability of neighborhood nodes, and sample an appropriate number of neighborhood nodes;
S3,按照采样节点序列的依赖性、动态性和复杂性仿真强化学习任务中的基本环境要素;S3, simulates the basic environment elements in the reinforcement learning task according to the dependency, dynamics and complexity of the sampled node sequence;
S4,在多维图中部署多智能体,自适应地从多维采样节点序列中顺序选择出同配节点;S4, deploys multiple agents in the multidimensional graph and adaptively selects the same nodes from the multidimensional sampling node sequence;
S5,利用关系感知邻域聚合学习全面的节点表征嵌入;S5, learning comprehensive node representation embeddings using relation-aware neighborhood aggregation;
S6,将目标节点的表征嵌入传递到下游任务中进行异质图表示学习。S6, passes the representation embedding of the target node to the downstream task for heterogeneous graph representation learning.
本发明技术方案的进一步改进在于:在S1中,具体为给定n个不同的元路径模式根据每个元路径模式进行相应的连续矩阵乘法计算,构建出具有n维关系路径语义信息的多维图过程包括以下步骤:A further improvement of the technical solution of the present invention is that: in S1, specifically, given n different meta-path patterns According to each meta-path pattern, corresponding continuous matrix multiplication calculations are performed to construct a multi-dimensional graph with n-dimensional relationship path semantic information. The process includes the following steps:
S11,不同的下游图挖掘任务可能对目标节点的类型数量有特定的要求,每种目标节点类型对应构建一个多维图;S11, different downstream graph mining tasks may have specific requirements on the number of target node types, and a multidimensional graph is constructed for each target node type;
S12,沿着预设计的元路径中所有关系对应的邻接矩阵进行连续相乘来构建基于元路径的图的邻接矩阵表达如下:S12, along the pre-designed metapath The adjacency matrix of the graph based on the meta-path is constructed by continuously multiplying the adjacency matrices corresponding to all relations in The expression is as follows:
其中,el代表节点类型,是关系<el,el+1>对应的邻接矩阵;对n个不同的元路径模式进行类似的计算,构建出一组同质图对应的邻接矩阵将异质图转化为n维多维图。Among them, e l represents the node type, is the adjacency matrix corresponding to the relationship < el , el+1 >; for n different meta-path patterns Perform similar calculations to construct an adjacency matrix corresponding to a set of homogeneous graphs Convert heterogeneous graphs into n-dimensional multidimensional graphs.
本发明技术方案的进一步改进在于:在S2中,具体为从多路采样启发的角度扩展节点邻域的采样范围,利用上下文多臂赌博机动态地调整每维关系路径下邻域节点的采样概率,并根据概率分别从每个维度为每个节点采样出T个邻域节点,从而构建邻域采样池NSP;过程包括以下步骤:The further improvement of the technical solution of the present invention is that: in S2, specifically, the sampling range of the node neighborhood is expanded from the perspective of multi-path sampling inspiration, the context multi-armed bandit is used to dynamically adjust the sampling probability of the neighborhood nodes under each dimensional relationship path, and T neighborhood nodes are sampled for each node from each dimension according to the probability, so as to construct a neighborhood sampling pool NSP; the process includes the following steps:
S21,对于具有dΦ(v)维节点属性Xv的节点v,采用线性变换矩阵和偏置向量将节点属性变换到特定的d维隐嵌入空间中,得到节点的初始表征嵌入Hv,表达如下:S21, for a node v with d Φ(v) -dimensional node attributes X v , a linear transformation matrix is used and the bias vector Transform node attributes into a specific d-dimensional latent embedding space In the above example, the initial representation of the node is embedded in H v and is expressed as follows:
Hv=WΦ(v)Xv+bΦ(v) (2)H v =W Φ(v) X v +b Φ(v) (2)
S22,对于每个维度的中心节点,采用多路代表性的邻域采样启发式策略;S22, for the central node of each dimension, a multi-way representative neighborhood sampling heuristic strategy is adopted;
S23,用Qk来代表第k个邻域采样启发策略的概率矩阵,在回合ep中,结合K路邻域采样启发策略的采样概率,计算中心节点vi对邻域节点vj的加权多路采样概率 S23, use Qk to represent the probability matrix of the k-th neighborhood sampling heuristic strategy. In round ep, combine the sampling probability of the K-way neighborhood sampling heuristic strategy to calculate the weighted multi-way sampling probability of the central node vi to the neighboring node vj
其中,是邻域采样启发策略粒度概率,Qk满足行归一化 in, is the probability of the neighborhood sampling heuristic strategy granularity, Q k satisfies row normalization
S24,计算策略粒度概率是第k个邻域采样启发策略的概率矩阵Qk的权重;邻域采样启发策略的综合选择能够被视为上下文多臂赌博机中的对抗性问题,策略粒度概率作为多路邻域采样的赌博动作,自适应地调整每一路的采样概率,并随着模型训练回合的迭代而更新:S24, calculate the probability of strategy granularity is the weight of the probability matrix Q k of the kth neighborhood sampling heuristic strategy; the comprehensive selection of the neighborhood sampling heuristic strategy can be regarded as an adversarial problem in the contextual multi-armed bandit. The strategy granularity probability is used as a gambling action for multi-way neighborhood sampling, adaptively adjusting the sampling probability of each way and updating it with the iteration of the model training round:
其中,初始值且pmin≥0是一个可控常数,用于约束采样概率的下限;wk是第k路策略的权重,根据具有监督学习保证的上下文多臂赌博机强化学习算法,迭代更新规则如下:Among them, the initial value And p min ≥ 0 is a controllable constant used to constrain the lower limit of the sampling probability; w k is the weight of the k-th strategy. According to the contextual multi-armed bandit reinforcement learning algorithm with supervised learning guarantee, the iterative update rule is as follows:
其中,是更新周期,和是可调整的超参数,代表上下文多臂赌博机选择动作时的遗憾率;是在第ep回合第k个策略下采样节点的加权奖励值,表达如下:in, is the update cycle, and is an adjustable hyperparameter representing the regret rate of the contextual multi-armed bandit when selecting actions; is the weighted reward value of the sampling node under the k-th strategy in the ep-th round, expressed as follows:
其中,是节点vi选择邻域节点的集合;是节点vi和vj的平均多头注意力得分;in, is the set of neighboring nodes selected by node v i ; is the average multi-head attention score of nodes vi and vj ;
S25,在每个回合结束后,为每个节点v,在每维关系路径下,根据采样概率从节点集合中采样T个邻域节点,并将这些中心节点和邻域节点对添加到邻域采样池中:S25, after each round, for each node v, in each dimension relationship path Next, according to the sampling probability From the node set Sample T neighborhood nodes and add these center node and neighborhood node pairs to the neighborhood sampling pool:
其中,是采样概率。in, is the sampling probability.
本发明技术方案的进一步改进在于:在S22中,所述多路代表性的邻域采样启发式策略包括:一跳邻居策略、二跳邻居策略、最近邻邻居策略和随机游走邻居策略;具体为:A further improvement of the technical solution of the present invention is that: in S22, the multi-channel representative neighborhood sampling heuristic strategy includes: a one-hop neighbor strategy, a two-hop neighbor strategy, a nearest neighbor strategy and a random walk neighbor strategy; specifically:
所述一跳邻居策略:从归一化邻接矩阵中采样的直接连接节点,其中,是添加了自环的邻接矩阵,是满足的对角矩阵;The one-hop neighbor strategy: From the normalized adjacency matrix The directly connected nodes sampled in , where is the adjacency matrix with self-loops added, is satisfied The diagonal matrix of ;
所述二跳邻居策略:从归一化邻接矩阵的幂中采样的间接连接节点;The two-hop neighbor strategy: From the power of the normalized adjacency matrix Indirectly connected nodes sampled in;
所述最近邻邻居策略:从基于节点属性的余弦相似性的概率矩阵SKNN中采样的节点,其中,节点vi和vj之间的相似性得分为 The nearest neighbor strategy: nodes sampled from the probability matrix S KNN based on the cosine similarity of node attributes, where the similarity score between nodes vi and vj is
所述随机游走邻居策略:从随机游走权重矩阵SPPR中采样的节点,其中,个性化PageRank算法用于生成游走权重纯量系数c∈[0,1],表示列归一化邻接矩阵。The random walk neighbor strategy: nodes sampled from the random walk weight matrix S PPR , where the personalized PageRank algorithm is used to generate the walk weights scalar coefficient c∈[0,1], Represents a column-normalized adjacency matrix.
本发明技术方案的进一步改进在于:在S3中,具体为针对邻域采样池中每个中心节点的采样邻域节点集合,根据它们的采样概率进行顺序排列,并按照采样邻域节点序列的依赖关系、动态特性以及复杂程度仿真强化学习任务中的基本环境要素;过程包括以下步骤:A further improvement of the technical solution of the present invention is that: in S3, specifically for each central node in the neighborhood sampling pool, the sampling neighborhood node set is arranged in order according to their sampling probability, and the basic environmental elements in the reinforcement learning task are simulated according to the dependency, dynamic characteristics and complexity of the sampling neighborhood node sequence; the process includes the following steps:
S31,为满足强化环境要素中的依赖性,在对邻域节点的顺序选择时,根据邻域节点的采样概率对邻域节点进行排序,排列相近的邻域节点在语义上存在关联,随着智能体沿着时间步在邻域节点序列中前进,靠近的邻域节点会产生顺序依赖性,依赖过程的表达如下:S31, in order to meet the dependency in the enhanced environmental elements, when selecting the order of neighboring nodes, the neighboring nodes are sorted according to the sampling probability of the neighboring nodes. The neighboring nodes with similar arrangement are semantically associated. As the agent moves forward in the neighboring node sequence along the time step, the close neighboring nodes will produce sequential dependency. The expression of the dependency process is as follows:
ot+1=τot+(1-τ)ot+1 (8)o t+1 =τo t +(1-τ)o t+1 (8)
其中,ot是在时间步t的环境状态,τ是一个数值较小的超参数;Among them, o t is the state of the environment at time step t, and τ is a small hyperparameter;
S32,为满足强化环境要素中的动态性,对邻域节点序列设置跳过率ps,智能体以概率ps随机地跳过对某一个邻域节点的评估,并直接转移到下一个邻域节点,这时的状态序列长度和内容在每一个回合中都会发生明显的变化;S32, in order to meet the dynamics of the enhanced environmental elements, a skip rate ps is set for the neighborhood node sequence. The agent randomly skips the evaluation of a neighborhood node with probability ps and directly transfers to the next neighborhood node. At this time, the length and content of the state sequence will change significantly in each round;
S33,为满足强化环境要素中的复杂性,根据邻域节点的采样概率大小对邻域节点进行降序排列,智能体从采样概率较高的简单邻域节点开始选择判断,并利用较高的跳过率来规避较为困难的邻域节点,跳过率ps随着模型训练回合的迭代而更新:S33, in order to meet the complexity of the enhanced environmental elements, the neighborhood nodes are sorted in descending order according to the sampling probability of the neighborhood nodes. The agent starts to select and judge from the simple neighborhood nodes with higher sampling probability, and uses a higher skip rate to avoid more difficult neighborhood nodes. The skip rate ps is updated with the iteration of the model training round:
其中,ps是基本跳过率,是确保动态特性的最小跳过率;dt代表在时间步t时邻域节点vt的归一化复杂性度量,表达如下:Among them, ps is the basic skip rate, is the minimum skip rate to ensure dynamic characteristics; dt represents the normalized complexity measure of the neighborhood node vt at time step t, expressed as follows:
其中,即从T个采样节点中选取具有最大多路采样概率的节点序号;lt∈[0,1]是节点v与vt同配的归一化置信度,与智能体行为中选择决策的奖励值有关;κ是平衡中心节点v与邻域节点vt之间同配性和异配性特征的超参数。in, That is, the node number with the maximum multi-path sampling probability is selected from the T sampling nodes; l t ∈ [0,1] is the normalized confidence that nodes v and v t are assortative, which is related to the reward value of the decision-making selection in the agent's behavior; κ is a hyperparameter that balances the assortative and heteromatching characteristics between the central node v and the neighboring node v t .
本发明技术方案的进一步改进在于:在S4中,具体为在一个回合中,智能体依次对中心节点与它对应的最多T个采样邻域节点进行异配性判断,过程被视作T轮决策过程;在多维图中为n维关系路径构建的同质图部署n种智能体,它们在n维关系路径语义的马尔可夫博弈中学习如何选择同配性节点以实现多维度选择的合作均衡,从而构建邻域选择池NCP;过程包括以下步骤:The further improvement of the technical solution of the present invention is that: in S4, specifically in one round, the intelligent agent judges the heterogeneity of the central node and the maximum T sampled neighboring nodes corresponding to it in turn, and the process is regarded as a T-round decision process; in the multidimensional graph, a homogeneous graph constructed for the n-dimensional relationship path is deployed with n kinds of intelligent agents, and they learn how to select homogeneous nodes in the Markov game of the semantics of the n-dimensional relationship path to achieve a cooperative equilibrium of multi-dimensional selection, thereby constructing a neighborhood selection pool NCP; the process includes the following steps:
S41,采用最新的中心节点表征Hv,当前时间步的邻域节点表征Hvt和采样邻域节点集合NSP(v)共同作为当前时刻t的可观测状态ot,表达如下:S41, using the latest central node representation Hv , the neighborhood node representation Hvt of the current time step and the sampled neighborhood node set NSP(v) as the observable state ot at the current time t, expressed as follows:
S42,每个智能体的动作空间定义为一个由选择和不选择同配性节点组成的离散方案,表达如下:S42, the action space of each agent It is defined as a discrete scheme consisting of selecting and not selecting nodes of similarity, expressed as follows:
S43,在时间步t,采用线性层将中心节点表征Hv和邻域节点表征组合起来,获得整合中心节点与当前时间步邻域节点的信息估计表达如下:S43, at time step t, a linear layer is used to represent the central node H v and the neighboring node H v. Combined, we can get the information of the integrated center node and the neighboring nodes of the current time step. The estimated expression is as follows:
其中,σ(·)是激活函数,(·||·)是拼接操作,Wh和bh分别是线性映射中可学习的权重矩阵和偏置向量;借鉴对比预测编码的理论,估计中心节点与邻域节点集之间的互信息,获得更高阶的信息量估计表达如下:Among them, σ(·) is the activation function, (·||·) is the concatenation operation, W h and b h are the learnable weight matrix and bias vector in the linear mapping, respectively; referencing the theory of contrastive predictive coding, the mutual information between the central node and the set of neighboring nodes is estimated to obtain higher-order information. The estimated expression is as follows:
其中,Wd是可学习的线性变换矩阵;将上面两个估计式拼接起来,进行可学习的线性变换Wp和Softmax层输出归一化步骤,获得动作概率分布表达如下:Among them, Wd is a learnable linear transformation matrix; concatenate the above two estimation formulas, perform a learnable linear transformation Wp and a Softmax layer output normalization step, and obtain the action probability distribution The expression is as follows:
S44,在时间步t,根据图结构的数据特点,采用无监督的方式为智能体的动作分配奖励rt+1(v),表达如下:S44, at time step t, according to the data characteristics of the graph structure, an unsupervised approach is used to assign rewards r t+1(v) to the actions of the agent, expressed as follows:
其中,σ(·)是Sigmoid激活函数,Ω是在异质图上可观测到的正节点对的集合,而Ω-则是从所有未观测到的节点对中采样的负节点对的集合,ω1和ω2是可调整的超参数,分别代表正样本的奖励权重和负样本的惩罚权重;Where σ(·) is the Sigmoid activation function, Ω is the set of positive node pairs observed on the heterogeneous graph, and Ω - is the set of negative node pairs sampled from all unobserved node pairs. ω 1 and ω 2 are adjustable hyperparameters, representing the reward weight of positive samples and the penalty weight of negative samples, respectively.
S45,在时间步t,智能体α观测节点状态ot,根据策略执行动作ut,获得反馈rt+1,记录完成状态done,并进入下一个时间步t+1,继续观测新的节点状态ot+1,为每个智能体α建立一个经验回放缓冲池在每个时间步存入经验轨迹τα=(ot,ut,rt+1,ot+1,done);S45, at time step t, agent α observes the node state o t and follows the strategy Execute action u t , obtain feedback r t+1 , record the completed state done , and enter the next time step t+1 , continue to observe the new node state o t+1 , and establish an experience replay buffer pool for each agent α Store the empirical trajectory τ α =(o t ,u t ,r t+1 ,o t+1 ,done) at each time step;
S46,采用多演员注意力评论家算法实现上述多智能体行为的学习与优化,在多智能体强化学习环境中,训练去中心化的策略,通过多头注意力机制动态地选取每个智能体的重要信息,并集中学习策略的评价;S46, uses a multi-actor attention critic algorithm to achieve the learning and optimization of the above multi-agent behavior. In a multi-agent reinforcement learning environment, it trains decentralized strategies, dynamically selects important information of each agent through a multi-head attention mechanism, and centralizes the evaluation of learning strategies;
S47,在每个回合结束后,对于在每维关系路径下的中心节点v,智能体沿时间序列从邻域采样池中迭代选择候选节点,并将这些中心节点和选择节点对添加到邻域选择池中:S47, after each round, for each dimension relationship path For the central node v under the circumstance, the agent iteratively selects candidate nodes from the neighborhood sampling pool along the time series and adds these central node and selected node pairs to the neighborhood selection pool:
本发明技术方案的进一步改进在于:在S46中,具体过程包括以下步骤:A further improvement of the technical solution of the present invention is that in S46, the specific process includes the following steps:
S461,智能体αi的状态动作价值函数的表达如下:S461, the state-action value function of agent α i is expressed as follows:
其中,gi(·)是面向不同模块的多层感知机嵌入函数,Ws是可学习的线性变换矩阵,在其他智能体αj的贡献函数gou(o,u)中被共享使用,是贡献价值的注意力权重;Among them, gi(·) is the multi-layer perceptron embedding function for different modules, Ws is a learnable linear transformation matrix, which is shared in the contribution function g ou (o,u) of other agents α j . is the attention weight of the contribution value;
S462,所有智能体的价值函数优化能够近似看作多任务回归问题,它们的联合回归损失函数如下:S462, the value function optimization of all agents can be approximately regarded as a multi-task regression problem, and their joint regression loss function is as follows:
其中,θ和分别是评论家网络和目标评论家网络的参数;是目标价值函数,表达如下:Among them, θ and are the parameters of the critic network and the target critic network, respectively; is the target value function, expressed as follows:
其中,γ是折扣因子,τ是平衡最大化熵和奖励的温度超参数,是目标策略网络的参数;where γ is the discount factor, τ is the temperature hyperparameter that balances maximizing entropy and reward, are the parameters of the target policy network;
S463,每个智能体的策略网络通过梯度上升进行更新,它们各自的梯度表达如下:S463, the policy network of each agent is updated by gradient ascent, and their respective gradients are expressed as follows:
其中,Ai(o|u)=Qi(o,u;θ)-bi(o,u)是优势函数,是反事实基线,φ是策略网络的参数,ρ是可学习的策略对数调整系数;所有目标网络的参数都是通过软更新机制进行同步的。Among them, A i (o|u) = Qi (o,u; θ) - bi (o,u) is the advantage function, is the counterfactual baseline, φ is the parameter of the policy network, and ρ is the learnable policy logarithmic adjustment coefficient; the parameters of all target networks are synchronized through the soft update mechanism.
本发明技术方案的进一步改进在于:在S5中,具体为:为了获得全面的节点表征嵌入,采用关系感知邻域聚合,将邻域聚合分为关系路径内的聚合和关系路径间的聚合;过程包括以下步骤:A further improvement of the technical solution of the present invention is that: in S5, specifically: in order to obtain comprehensive node representation embedding, relationship-aware neighborhood aggregation is adopted, and neighborhood aggregation is divided into aggregation within the relationship path and aggregation between the relationship paths; the process includes the following steps:
S51,在关系路径内的聚合中,采用注意力机制来对邻域选择的权重进行动态调整,得到关系路径内的聚合表征嵌入表达如下:S51, in the aggregation within the relationship path, the attention mechanism is used to dynamically adjust the weight of the neighborhood selection to obtain the aggregation representation embedding within the relationship path The expression is as follows:
其中,Wh和Wt分别是针对中心节点和邻域节点的线性变换矩阵,它们在所有关系路径中是共享的;av,u表示节点vj对vi的重要性分数,其表达如下:Among them, W h and W t are the linear transformation matrices for the central node and the neighborhood nodes respectively, which are shared in all relationship paths; a v,u represents the importance score of node v j to vi , which is expressed as follows:
其中,Wa是注意力变换矩阵,tanh激活函数可以进一步区分邻域节点带给中心节点的异常信息;Among them, Wa is the attention transformation matrix, and the tanh activation function can further distinguish the abnormal information brought to the central node by the neighboring nodes;
S52,在关系路径间的聚合中,将所有关系路径下的语义表征嵌入信息拼接起来进行联合学习,得到关系路径间的聚合表征嵌入Zv:S52, in the aggregation between relation paths, the semantic representation embedding information under all relation paths is spliced together for joint learning to obtain the aggregated representation embedding Z v between relation paths:
其中,MLP(·)是面向最终图任务的多层感知机函数。Among them, MLP(·) is the multi-layer perceptron function for the final graph task.
本发明技术方案的进一步改进在于:在S6中,具体为将目标中心节点的最终表征嵌入Z作为输出,并传递到下游任务中进行异质图的表示学习,过程包括以下步骤:A further improvement of the technical solution of the present invention is that in S6, the final representation of the target central node is embedded into Z as output and passed to the downstream task for representation learning of the heterogeneous graph. The process includes the following steps:
S61,在训练的时候,针对图节点分类任务,首先将图神经网络的输出层维度设定为与节点类型数量一样,然后对于单标签分类场景,通过Softmax函数进行归一化处理,并结合最小化交叉熵损失函数来优化算法性能;而对于多标签分类场景,采用Sigmoid函数来处理每个标签的独立二分类问题,并结合最小化二元交叉熵损失函数来优化算法性能;S61, during training, for the graph node classification task, first set the output layer dimension of the graph neural network to be the same as the number of node types, then for the single-label classification scenario, normalize it through the Softmax function, and optimize the algorithm performance by minimizing the cross entropy loss function; for the multi-label classification scenario, use the Sigmoid function to handle the independent binary classification problem of each label, and optimize the algorithm performance by minimizing the binary cross entropy loss function;
S62,在训练的时候,针对图链接预测任务,首先需要对预测节点对进行内积运算或通过DistMult解码器处理,再结合最小化二元交叉熵损失函数来优化算法性能;S62, during training, for the graph link prediction task, it is first necessary to perform inner product operations on the predicted node pairs or process them through the DistMult decoder, and then combine the minimization of the binary cross entropy loss function to optimize the algorithm performance;
S63,基于下游任务的标签监督信号对模型进行学习优化,异质图神经网络的损失函数如下:S63, based on the label supervision signal of the downstream task, the model is learned and optimized. The loss function of the heterogeneous graph neural network is as follows:
其中,是关系感知邻域聚合模块的损失项,λ*是正则化权重,θ是模型参数集,||θ||2是模型参数的L2正则化范数。in, is the loss term of the relation-aware neighborhood aggregation module, λ * is the regularization weight, θ is the set of model parameters, and ||θ|| 2 is the L 2 regularization norm of the model parameters.
由于采用了上述技术方案,本发明取得的技术进步是:Due to the adoption of the above technical solution, the technical progress achieved by the present invention is:
本发明设计的一种基于多路邻域节点自适应选择的异质图异常检测方法,针对异质图中高阶同质对等节点存在的异配性,从而导致异质图神经网络聚合异常信息的问题,首先采用上下文多臂赌博机动态地扩展邻域节点采样的范围,然后利用多智能体强化学习自适应地选择与目标节点同配的邻域节点,最大程度地过滤掉异常邻域节点,实现了对异质图中邻域节点的细粒度选择,最后通过关系感知邻域聚合模块融合多种关系路径下的选择邻域信息,进一步降低了潜在的异常信息对异质图神经网络的影响;本发明实现了异质图的自适应异常检测方法,并提高了异质图神经网络在图节点分类和图链接预测任务上的表现性能。The present invention designs a heterogeneous graph anomaly detection method based on multi-path neighborhood node adaptive selection. Aiming at the problem that heterogeneous graph neural network aggregates abnormal information due to the heterogeneity of high-order homogeneous peer nodes in heterogeneous graphs, a contextual multi-armed bandit is firstly used to dynamically expand the range of neighborhood node sampling, and then multi-agent reinforcement learning is used to adaptively select neighborhood nodes that are homogeneous with target nodes, and abnormal neighborhood nodes are filtered out to the greatest extent, thereby realizing fine-grained selection of neighborhood nodes in heterogeneous graphs. Finally, a relationship-aware neighborhood aggregation module is used to fuse selected neighborhood information under multiple relationship paths, further reducing the impact of potential abnormal information on heterogeneous graph neural networks. The present invention realizes an adaptive anomaly detection method for heterogeneous graphs and improves the performance of heterogeneous graph neural networks in graph node classification and graph link prediction tasks.
附图说明BRIEF DESCRIPTION OF THE DRAWINGS
为了更清楚地说明本发明实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图做以简单地介绍,显而易见地,下面描述中的附图是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图;In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required for use in the embodiments or the description of the prior art are briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those of ordinary skill in the art, other drawings can be obtained based on these drawings without creative work.
图1是本发明实施例中提供的一种基于多路邻域节点自适应选择的异质图异常检测方法的流程图;FIG1 is a flow chart of a heterogeneous graph anomaly detection method based on multi-path neighborhood node adaptive selection provided in an embodiment of the present invention;
图2是本发明实施例中提供的一种基于多路邻域节点自适应选择的异质图异常检测方法的架构图。FIG2 is an architecture diagram of a heterogeneous graph anomaly detection method based on adaptive selection of multi-path neighborhood nodes provided in an embodiment of the present invention.
具体实施方式DETAILED DESCRIPTION
需要说明的是,本发明的说明书和权利要求书及上述附图中的术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、系统、产品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。It should be noted that the terms "including" and "having" and any variations thereof in the specification and claims of the present invention and the above-mentioned drawings are intended to cover non-exclusive inclusions. For example, a process, method, system, product or apparatus comprising a series of steps or units is not necessarily limited to those steps or units clearly listed, but may include other steps or units that are not clearly listed or inherent to these processes, methods, products or apparatuses.
本发明中的部分专业术语解释:Explanation of some professional terms in this invention:
异质图(heterogeneous graph):异质图定义为其中是节点集合,ε是边集合,E是实体类型集合,R是关系类型集合,χ是实体属性集合;此外,还定义了两个映射函数,即Φ(·)代表节点到实体类型的映射函数,Ψ(·)代表边到关系类型的映射函数;Heterogeneous graph: A heterogeneous graph is defined as in is a node set, ε is an edge set, E is an entity type set, R is a relationship type set, and χ is an entity attribute set. In addition, two mapping functions are defined, namely, Φ(·) represents the mapping function from node to entity type, and Ψ(·) represents the mapping function from edge to relationship type.
多维图(multi-dimensional graph):多维图是异质图的一个特例,遵循|E|=1和|R|>1,其中,|E|表示实体类型的数量,|R|表示关系类型的数量;Multi-dimensional graph: A multi-dimensional graph is a special case of a heterogeneous graph, which follows |E|=1 and |R|>1, where |E| represents the number of entity types and |R| represents the number of relationship types.
异质图神经网络(heterogeneous graph neural network):异质图神经网络分别在不同的关系内聚合来自中心节点的邻居节点的信息来嵌入图结构数据,然后融合关系间的图信息;中心节点v在第l层的表征嵌入为:其中,σ是激活函数,是通过连接或求和将节点v的信息与其相邻信息组合在一起的运算符,AGG(.)表示将来自不同关系的邻域信息映射到向量中的聚合函数;Heterogeneous graph neural network: Heterogeneous graph neural network aggregates information from neighbor nodes of the central node in different relationships to embed graph structure data, and then fuses graph information between relationships; the representation embedding of the central node v in the lth layer is: Where σ is the activation function, is an operator that combines the information of node v with its neighboring information by concatenation or summation. AGG(.) represents an aggregation function that maps neighborhood information from different relations into a vector;
异配节点(heterophilic node):给定一张包含n个节点实例的属性图,有b个节点异于常规节点(b<<n),它们的属性、连接或行为不同于大多数其他正常节点;异配节点检测的目标是学习一个能够识别并区分与中心节点显著不同的异常节点的模型。Heterophilic node: Given an attribute graph containing n node instances, there are b nodes that are different from regular nodes (b<<n), and their attributes, connections or behaviors are different from most other normal nodes; the goal of heterophilic node detection is to learn a model that can identify and distinguish abnormal nodes that are significantly different from the central node.
下面结合附图及实施例对本发明做进一步详细说明:The present invention is further described in detail below with reference to the accompanying drawings and embodiments:
如图1所示,一种基于多路邻域节点自适应选择的异质图异常检测方法,包括以下步骤:As shown in FIG1 , a heterogeneous graph anomaly detection method based on multi-path neighborhood node adaptive selection includes the following steps:
S1,给定n个不同的元路径模式根据每个元路径模式进行相应的连续矩阵乘法计算,构建出具有n维关系路径语义信息的多维图 S1, given n different meta-path patterns According to each meta-path pattern, corresponding continuous matrix multiplication calculations are performed to construct a multi-dimensional graph with n-dimensional relationship path semantic information.
具体过程为:The specific process is:
S11,不同的下游图挖掘任务可能对目标节点的类型数量有特定的要求,通常情况下,每种目标节点类型都会对应构建一个多维图;S11, Different downstream graph mining tasks may have specific requirements on the number of target node types. Usually, a multi-dimensional graph is constructed for each target node type;
S12,沿着预设计的元路径中所有关系对应的邻接矩阵进行连续相乘来构建基于元路径的图的邻接矩阵其表达如下:S12, along the pre-designed metapath The adjacency matrix of the graph based on the meta-path is constructed by continuously multiplying the adjacency matrices corresponding to all relations in It is expressed as follows:
其中,el代表节点类型,是关系<el,el+1>对应的邻接矩阵;对n个不同的元路径模式进行类似的计算,可以构建出一组同质图对应的邻接矩阵将异质图转化为n维多维图;Among them, e l represents the node type, is the adjacency matrix corresponding to the relationship < el , el+1 >; for n different meta-path patterns By performing similar calculations, we can construct an adjacency matrix corresponding to a set of homogeneous graphs. Convert heterogeneous graphs into n-dimensional multidimensional graphs;
S2,从多路采样启发的角度扩展节点邻域的采样范围,利用上下文多臂赌博机动态地调整每维关系路径下邻域节点的采样概率,并根据概率分别从每个维度为每个节点采样出适当的T个邻域节点,从而构建邻域采样池NSP;S2, expands the sampling range of the node neighborhood from the perspective of multi-path sampling inspiration, uses the contextual multi-armed bandit to dynamically adjust the sampling probability of the neighborhood nodes under each dimension of the relationship path, and samples appropriate T neighborhood nodes for each node from each dimension according to the probability, thereby constructing the neighborhood sampling pool NSP;
具体过程为:The specific process is:
S21,对于具有dΦ(v)维节点属性Xv的节点v,采用线性变换矩阵和偏置向量将节点属性变换到特定的d维隐嵌入空间中,得到节点的初始表征嵌入Hv,其表达如下:S21, for a node v with d Φ(v) -dimensional node attributes X v , a linear transformation matrix is used and the bias vector Transform node attributes into a specific d-dimensional latent embedding space In the above example, we get the initial representation embedding H v of the node, which is expressed as follows:
Hv=WΦ(v)Xv+bΦ(v) (2)H v =W Φ(v) X v +b Φ(v) (2)
S22,对于每个维度的中心节点,采用多路代表性的邻域采样启发式策略,这些策略包括:一跳邻居策略、二跳邻居策略、最近邻邻居策略和随机游走邻居策略;S22, for the central node of each dimension, a multi-way representative neighborhood sampling heuristic strategy is adopted, which includes: one-hop neighbor strategy, two-hop neighbor strategy, nearest neighbor strategy and random walk neighbor strategy;
具体为:Specifically:
一跳邻居策略:从归一化邻接矩阵中采样的直接连接节点,其中,是添加了自环的邻接矩阵,是满足的对角矩阵;One-hop neighbor strategy: From the normalized adjacency matrix The directly connected nodes sampled in , where is the adjacency matrix with self-loops added, is satisfied The diagonal matrix of ;
二跳邻居策略:从归一化邻接矩阵的幂中采样的间接连接节点;Two-hop neighbor strategy: from the power of the normalized adjacency matrix Indirectly connected nodes sampled in;
最近邻邻居策略:从基于节点属性的余弦相似性的概率矩阵SKNN中采样的节点,其中,节点vi和vj之间的相似性得分为 Nearest neighbor strategy: nodes are sampled from the probability matrix S KNN based on the cosine similarity of node attributes, where the similarity score between nodes vi and vj is
随机游走邻居策略:从随机游走权重矩阵SPPR中采样的节点,其中,个性化PageRank算法用于生成游走权重这里的纯量系数c∈[0,1],表示列归一化邻接矩阵;Random walk neighbor strategy: nodes sampled from the random walk weight matrix S PPR , where the personalized PageRank algorithm is used to generate the walk weights Here the scalar coefficient c∈[0,1], represents the column-normalized adjacency matrix;
S23,用Qk来代表第k个邻域采样启发式策略的概率矩阵,在回合ep中,结合这K路邻域采样启发式策略的采样概率,计算中心节点vi对邻域节点vj的加权多路采样概率 S23, use Qk to represent the probability matrix of the kth neighborhood sampling heuristic strategy. In round ep, combine the sampling probabilities of the K-way neighborhood sampling heuristic strategy to calculate the weighted multi-way sampling probability of the central node vi to the neighboring node vj
其中,是邻域采样启发式策略粒度概率,Qk满足行归一化 in, is the granularity probability of the neighborhood sampling heuristic strategy, Q k satisfies row normalization
S24,计算策略粒度概率它是第k个邻域采样启发式策略的概率矩阵Qk的权重;多路采样启发策略的综合选择可以被视为上下文多臂赌博机中的对抗性问题,策略粒度概率作为多路邻域采样的赌博动作,自适应地调整每一路的采样概率,并随着模型训练回合的迭代而更新:S24, calculate the probability of strategy granularity It is the weight of the probability matrix Qk of the kth neighborhood sampling heuristic strategy; the comprehensive selection of multi-way sampling heuristic strategies can be regarded as an adversarial problem in the contextual multi-armed bandit. The strategy granularity probability is used as a gambling action for multi-way neighborhood sampling, adaptively adjusting the sampling probability of each way and updating it with the iteration of the model training round:
其中,初始值且pmin≥0是一个可控常数,用于约束采样概率的下限;wk是第k路策略的权重,根据具有监督学习保证的上下文多臂赌博机强化学习算法,其迭代更新规则如下:Among them, the initial value And p min ≥ 0 is a controllable constant used to constrain the lower limit of the sampling probability; w k is the weight of the k-th strategy. According to the contextual multi-armed bandit reinforcement learning algorithm with supervised learning guarantee, its iterative update rule is as follows:
其中,是更新周期,和是可调整的超参数,代表上下文多臂赌博机选择动作时的遗憾率;是在第ep回合第k个策略下采样节点的加权奖励值,其表达如下:in, is the update cycle, and is an adjustable hyperparameter representing the regret rate of the contextual multi-armed bandit when selecting actions; is the weighted reward value of the sampling node under the k-th strategy in the ep-th round, which is expressed as follows:
其中,是节点vi选择邻域节点的集合;是节点vi和vj的平均多头注意力得分;in, is the set of neighboring nodes selected by node v i ; is the average multi-head attention score of nodes vi and vj ;
S25,在每个回合结束后,为每个节点v,在每维关系路径下,根据采样概率从节点集合中采样T个邻域节点,并将这些中心节点和邻域节点对添加到邻域采样池中:S25, after each round, for each node v, in each dimension relationship path Next, according to the sampling probability From the node set Sample T neighborhood nodes and add these center node and neighborhood node pairs to the neighborhood sampling pool:
S3,针对邻域采样池中每个中心节点的采样邻域节点集合,根据其采样概率进行顺序排列,并按照采样邻域节点序列的依赖关系、动态特性以及复杂程度仿真强化学习任务中的基本环境要素;S3, for each central node in the neighborhood sampling pool, the sampling neighborhood node set is arranged in order according to its sampling probability, and the basic environmental elements in the reinforcement learning task are simulated according to the dependency, dynamic characteristics and complexity of the sampling neighborhood node sequence;
具体过程为:The specific process is:
S31,为满足强化环境要素中的依赖性,在对邻域节点的顺序选择时,根据邻域节点的采样概率对邻域节点进行排序,排列相近的邻域节点在语义上存在关联,随着智能体沿着时间步在邻域节点序列中前进,靠近的邻域节点会产生顺序依赖性,依赖过程的表达如下:S31, in order to meet the dependency in the enhanced environmental elements, when selecting the order of neighboring nodes, the neighboring nodes are sorted according to the sampling probability of the neighboring nodes. The neighboring nodes with similar arrangement are semantically associated. As the agent moves forward in the neighboring node sequence along the time step, the close neighboring nodes will produce sequential dependency. The expression of the dependency process is as follows:
ot+1=τot+(1-τ)ot+1 (8)o t+1 =τo t +(1-τ)o t+1 (8)
其中,ot是在时间步t的环境状态,τ是一个数值较小的超参数;Among them, o t is the state of the environment at time step t, and τ is a small hyperparameter;
S32,为满足强化环境要素中的动态性,对邻域节点序列设置跳过率ps,智能体以概率ps随机地跳过对某一个邻域节点的评估,并直接转移到下一个邻域节点,这时的状态序列长度和内容在每一个回合中都会发生明显的变化;S32, in order to meet the dynamics of the enhanced environmental elements, a skip rate ps is set for the neighborhood node sequence. The agent randomly skips the evaluation of a neighborhood node with probability ps and directly transfers to the next neighborhood node. At this time, the length and content of the state sequence will change significantly in each round;
S33,为满足强化环境要素中的复杂性,根据邻域节点的采样概率大小对邻域节点进行降序排列,智能体从采样概率较高的简单邻域节点开始选择判断,并利用较高的跳过率来规避较为困难的邻域节点,跳过率ps随着模型训练回合的迭代而更新:S33, in order to meet the complexity of the enhanced environmental elements, the neighborhood nodes are sorted in descending order according to the sampling probability of the neighborhood nodes. The agent starts to select and judge from the simple neighborhood nodes with higher sampling probability, and uses a higher skip rate to avoid more difficult neighborhood nodes. The skip rate ps is updated with the iteration of the model training round:
其中,ps是基本跳过率,是确保动态特性的最小跳过率;dt代表在时间步t时邻域节点vt的归一化复杂性度量,其表达如下:Among them, ps is the basic skip rate, is the minimum skip rate to ensure dynamic characteristics; dt represents the normalized complexity measure of the neighborhood node vt at time step t, which is expressed as follows:
其中,即从T个采样节点中选取具有最大多路采样概率的节点序号;lt∈[0,1]是节点v与vt同配的归一化置信度,与智能体行为中选择决策的奖励值有关;κ是平衡中心节点v与邻域节点vt之间同配性和异配性特征的超参数;in, That is, the node number with the maximum multi-path sampling probability is selected from the T sampling nodes; l t ∈ [0,1] is the normalized confidence that nodes v and v t are assortative, which is related to the reward value of the decision-making selection in the agent's behavior; κ is a hyperparameter that balances the assortative and heteroassortative characteristics between the central node v and the neighboring node v t ;
S4,在一个回合中,智能体依次对中心节点与它对应的最多T个采样邻域节点进行异配性判断,该过程被视作T轮决策过程;在多维图中为n维关系路径构建的同质图部署n种智能体,它们在n维关系路径语义的马尔可夫博弈中学习如何选择同配性节点以实现多维度选择的合作均衡,从而构建邻域选择池NCP;S4, in one round, the agents make heterogeneity judgments on the central node and its corresponding up to T sampled neighboring nodes in turn, and this process is regarded as a T-round decision process; n types of agents are deployed in the homogeneous graph constructed for the n-dimensional relationship path in the multidimensional graph, and they learn how to select homogeneous nodes in the Markov game of the n-dimensional relationship path semantics to achieve cooperative equilibrium in multi-dimensional selection, thereby constructing a neighborhood selection pool NCP;
具体过程为:The specific process is:
S41,采用最新的中心节点表征Hv,当前时间步的邻域节点表征Hvt和采样邻域节点集合NSP(v)共同作为当前时刻t的可观测状态ot,其表达如下:S41, using the latest central node representation Hv , the neighborhood node representation Hvt of the current time step and the sampled neighborhood node set NSP(v) as the observable state ot at the current time t, which is expressed as follows:
ot(v)=[Hv,Hvt,HNSP(v)] (11)o t(v) =[H v ,H vt ,H NSP(v) ] (11)
S42,每个智能体的动作空间定义为一个由选择(select)和不选择(unselect)同配性节点组成的离散方案,其表达如下:S42, the action space of each agent It is defined as a discrete scheme consisting of selecting and unselecting nodes of similarity, which is expressed as follows:
S43,在时间步t,采用线性层将中心节点表征Hv和邻域节点表征组合起来,获得整合中心节点与当前时间步邻域节点的信息其估计表达如下:S43, at time step t, a linear layer is used to represent the central node H v and the neighboring node H v. Combined, we can get the information of the integrated center node and the neighboring nodes of the current time step. The estimated expression is as follows:
其中,σ(·)是激活函数,(·||·)是拼接操作,Wh和bh分别是线性映射中可学习的权重矩阵和偏置向量;借鉴对比预测编码的理论,估计中心节点与邻域节点集之间的互信息,获得更高阶的信息量其估计表达如下:Among them, σ(·) is the activation function, (·||·) is the concatenation operation, W h and b h are the learnable weight matrix and bias vector in the linear mapping, respectively; referencing the theory of contrastive predictive coding, the mutual information between the central node and the set of neighboring nodes is estimated to obtain higher-order information. The estimated expression is as follows:
其中,Wd是可学习的线性变换矩阵;将上面两个估计式拼接起来,进行可学习的线性变换Wp和Softmax层输出归一化步骤,获得动作概率分布其表达如下:Among them, Wd is a learnable linear transformation matrix; concatenate the above two estimation formulas, perform a learnable linear transformation Wp and a Softmax layer output normalization step, and obtain the action probability distribution It is expressed as follows:
S44,在时间步t,根据图结构的数据特点,采用无监督的方式为智能体的动作分配奖励rt+1(v),其表达如下:S44, at time step t, according to the data characteristics of the graph structure, an unsupervised approach is used to assign rewards r t+1(v) to the actions of the agent, which is expressed as follows:
其中,σ(·)是Sigmoid激活函数,Ω是在异质图上可观测到的正节点对的集合,而Ω-则是从所有未观测到的节点对中采样的负节点对的集合,ω1和ω2是可调整的超参数,分别代表正样本的奖励权重和负样本的惩罚权重;Where σ(·) is the Sigmoid activation function, Ω is the set of positive node pairs observed on the heterogeneous graph, and Ω - is the set of negative node pairs sampled from all unobserved node pairs. ω 1 and ω 2 are adjustable hyperparameters, representing the reward weight of positive samples and the penalty weight of negative samples, respectively.
S45,在时间步t,智能体α观测节点状态ot,根据策略执行动作ut,获得反馈rt+1,记录完成状态done,并进入下一个时间步t+1,继续观测新的节点状态ot+1,为每个智能体α建立一个经验回放缓冲池在每个时间步存入经验轨迹τα=(ot,ut,rt+1,ot+1,done);S45, at time step t, agent α observes the node state o t and follows the strategy Execute action u t , obtain feedback r t+1 , record the completed state done , and enter the next time step t+1 , continue to observe the new node state o t+1 , and establish an experience replay buffer pool for each agent α Store the empirical trajectory τ α =(o t ,u t ,r t+1 ,o t+1 ,done) at each time step;
S46,采用多演员注意力评论家算法实现上述多智能体行为的学习与优化,在多智能体强化学习环境中,训练去中心化的策略,通过多头注意力机制动态地选取每个智能体的重要信息,并集中学习策略的评价;S46, uses a multi-actor attention critic algorithm to achieve the learning and optimization of the above multi-agent behavior. In a multi-agent reinforcement learning environment, it trains decentralized strategies, dynamically selects important information of each agent through a multi-head attention mechanism, and centralizes the evaluation of learning strategies;
具体过程为:The specific process is:
S461,智能体αi的状态动作价值函数的表达如下:S461, the state-action value function of agent α i is expressed as follows:
其中,gi(·)是面向不同模块的多层感知机嵌入函数,Ws是可学习的线性变换矩阵,它在其他智能体αj的贡献函数gou(o,u)中被共享使用,是贡献价值的注意力权重;Among them, gi(·) is the multi-layer perceptron embedding function for different modules, Ws is a learnable linear transformation matrix, which is shared in the contribution function g ou (o,u) of other agents α j . is the attention weight of the contribution value;
S462,所有智能体的价值函数优化可以近似看作多任务回归问题,它们的联合回归损失函数如下:S462, the value function optimization of all agents can be approximately regarded as a multi-task regression problem, and their joint regression loss function is as follows:
其中,θ和分别是评论家网络和目标评论家网络的参数;是目标价值函数,其表达如下:Among them, θ and are the parameters of the critic network and the target critic network, respectively; is the target value function, which is expressed as follows:
其中,γ是折扣因子,τ是平衡最大化熵和奖励的温度超参数,是目标策略网络的参数;where γ is the discount factor, τ is the temperature hyperparameter that balances maximizing entropy and reward, are the parameters of the target policy network;
S463,每个智能体的策略网络通过梯度上升进行更新,它们各自的梯度表达如下:S463, the policy network of each agent is updated by gradient ascent, and their respective gradients are expressed as follows:
其中,Ai(o|u)=Qi(o,u;θ)-bi(o,u)是优势函数,是反事实基线,φ是策略网络的参数,ρ是可学习的策略对数调整系数;所有目标网络的参数都是通过软更新机制进行同步的;Among them, A i (o|u) = Qi (o,u; θ) - bi (o,u) is the advantage function, is the counterfactual baseline, φ is the parameter of the policy network, and ρ is the learnable policy logarithmic adjustment coefficient; all target network parameters are synchronized through the soft update mechanism;
S47,在每个回合结束后,对于在每维关系路径下的中心节点v,智能体沿时间序列从邻域采样池中迭代选择候选节点,并将这些中心节点和选择节点对添加到邻域选择池中:S47, after each round, for each dimension relationship path For the central node v under the circumstance, the agent iteratively selects candidate nodes from the neighborhood sampling pool along the time series and adds these central node and selected node pairs to the neighborhood selection pool:
S5,为了获得全面的节点表征嵌入,采用关系感知邻域聚合,将邻域聚合分为关系路径内的聚合和关系路径间的聚合;S5, in order to obtain comprehensive node representation embedding, relation-aware neighborhood aggregation is adopted, and neighborhood aggregation is divided into aggregation within relation paths and aggregation between relation paths;
具体过程为:The specific process is:
S51,在关系路径内的聚合中,采用注意力机制来对邻域选择的权重进行动态调整,得到关系路径内的聚合表征嵌入其表达如下:S51, in the aggregation within the relationship path, the attention mechanism is used to dynamically adjust the weight of the neighborhood selection to obtain the aggregation representation embedding within the relationship path It is expressed as follows:
其中,Wh和Wt分别是针对中心节点和邻域节点的线性变换矩阵,它们在所有关系路径中是共享的;av,u表示节点vj对vi的重要性分数,其表达如下:Among them, W h and W t are the linear transformation matrices for the central node and the neighborhood nodes respectively, which are shared in all relationship paths; a v,u represents the importance score of node v j to vi , which is expressed as follows:
其中,Wa是注意力变换矩阵,tanh激活函数可以进一步区分邻域节点带给中心节点的异常信息;Among them, Wa is the attention transformation matrix, and the tanh activation function can further distinguish the abnormal information brought to the central node by the neighboring nodes;
S52,在关系路径间的聚合中,将所有关系路径下的语义表征嵌入信息拼接起来进行联合学习,得到关系路径间的聚合表征嵌入Zv:S52, in the aggregation between relation paths, the semantic representation embedding information under all relation paths is spliced together for joint learning to obtain the aggregated representation embedding Z v between relation paths:
其中,MLP(·)是面向最终图任务的多层感知机函数;Among them, MLP(·) is the multi-layer perceptron function for the final graph task;
S6,将目标中心节点的最终表征嵌入Z作为输出,并传递到下游任务中进行异质图的表示学习;S6, embeds the final representation of the target central node into Z as output and passes it to the downstream task for representation learning of heterogeneous graphs;
具体过程为:The specific process is:
S61,在训练的时候,针对图节点分类任务,首先将图神经网络的输出层维度设定为与节点类型数量一样,然后对于单标签分类场景,通过Softmax函数进行归一化处理,并结合最小化交叉熵损失函数来优化算法性能;而对于多标签分类场景,采用Sigmoid函数来处理每个标签的独立二分类问题,并结合最小化二元交叉熵损失函数来优化算法性能;S61, during training, for the graph node classification task, first set the output layer dimension of the graph neural network to be the same as the number of node types, then for the single-label classification scenario, normalize it through the Softmax function, and optimize the algorithm performance by minimizing the cross entropy loss function; for the multi-label classification scenario, use the Sigmoid function to handle the independent binary classification problem of each label, and optimize the algorithm performance by minimizing the binary cross entropy loss function;
S62,在训练的时候,针对图链接预测任务,首先需要对预测节点对进行内积运算或通过DistMult解码器处理,再结合最小化二元交叉熵损失函数来优化算法性能;S62, during training, for the graph link prediction task, it is first necessary to perform inner product operations on the predicted node pairs or process them through the DistMult decoder, and then combine the minimization of the binary cross entropy loss function to optimize the algorithm performance;
S63,基于下游任务的标签监督信号对模型进行学习优化,该异质图神经网络的损失函数如下:S63, based on the label supervision signal of the downstream task, the model is learned and optimized. The loss function of the heterogeneous graph neural network is as follows:
其中,是关系感知邻域聚合模块的损失项,λ*是正则化权重,θ是模型参数集,||θ||2是模型参数的L2正则化范数。in, is the loss term of the relation-aware neighborhood aggregation module, λ * is the regularization weight, θ is the set of model parameters, and ||θ|| 2 is the L 2 regularization norm of the model parameters.
最后应说明的是:以上各实施例仅用以说明本发明的技术方案,而非对其限制;尽管参照前述各实施例对本发明进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分或者全部技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本发明各实施例技术方案的范围。Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, rather than to limit it. Although the present invention has been described in detail with reference to the aforementioned embodiments, those skilled in the art should understand that they can still modify the technical solutions described in the aforementioned embodiments, or replace some or all of the technical features therein by equivalents. However, these modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.
Claims (9)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410621742.XA CN118540239A (en) | 2024-05-20 | 2024-05-20 | Heterogeneous graph anomaly detection method based on multipath neighborhood node self-adaptive selection |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410621742.XA CN118540239A (en) | 2024-05-20 | 2024-05-20 | Heterogeneous graph anomaly detection method based on multipath neighborhood node self-adaptive selection |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN118540239A true CN118540239A (en) | 2024-08-23 |
Family
ID=92380685
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202410621742.XA Pending CN118540239A (en) | 2024-05-20 | 2024-05-20 | Heterogeneous graph anomaly detection method based on multipath neighborhood node self-adaptive selection |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118540239A (en) |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119130652A (en) * | 2024-11-07 | 2024-12-13 | 浙江大学 | Virtual currency wallet address anomaly detection method and system based on adaptive heterogeneous message aggregation |
| CN119477961A (en) * | 2024-10-17 | 2025-02-18 | 山东大学 | Self-supervised medical image segmentation method and system based on random walk path |
| CN119512759A (en) * | 2024-11-23 | 2025-02-25 | 暨南大学 | A load-aware elastic scaling method for multiple microservice replicas in computing power networks |
-
2024
- 2024-05-20 CN CN202410621742.XA patent/CN118540239A/en active Pending
Cited By (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119477961A (en) * | 2024-10-17 | 2025-02-18 | 山东大学 | Self-supervised medical image segmentation method and system based on random walk path |
| CN119130652A (en) * | 2024-11-07 | 2024-12-13 | 浙江大学 | Virtual currency wallet address anomaly detection method and system based on adaptive heterogeneous message aggregation |
| CN119130652B (en) * | 2024-11-07 | 2025-06-24 | 浙江大学 | Virtual coin wallet address anomaly detection method and system based on self-adaptive heterogeneous message aggregation |
| CN119512759A (en) * | 2024-11-23 | 2025-02-25 | 暨南大学 | A load-aware elastic scaling method for multiple microservice replicas in computing power networks |
| CN119512759B (en) * | 2024-11-23 | 2025-07-18 | 暨南大学 | A load-aware elastic scaling method for multiple microservice replicas in computing power networks |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Gauci et al. | Horizon: Facebook's open source applied reinforcement learning platform | |
| Haarnoja et al. | Latent space policies for hierarchical reinforcement learning | |
| Tang et al. | # exploration: A study of count-based exploration for deep reinforcement learning | |
| CN118540239A (en) | Heterogeneous graph anomaly detection method based on multipath neighborhood node self-adaptive selection | |
| Jia et al. | Dom-q-net: Grounded rl on structured language | |
| Navgaran et al. | Evolutionary based matrix factorization method for collaborative filtering systems | |
| JP2024008888A (en) | Method and system for solving QUBO problems using classical-quantum hybrid solver | |
| CN118133087A (en) | Building engineering quality monitoring method and device based on artificial intelligence and big data | |
| CN113641907B (en) | A hyperparameter adaptive depth recommendation method and device based on evolutionary algorithm | |
| CN117807542A (en) | Power grid abnormal mode data identification method and system based on genetic LM algorithm | |
| Hoang et al. | Deep Reinforcement Learning for Wireless Communications and Networking: Theory, Applications and Implementation | |
| CN112464057A (en) | Network data classification method, device, equipment and readable storage medium | |
| CN113326884A (en) | Efficient learning method and device for large-scale abnormal graph node representation | |
| CN119317921A (en) | Automated Design of Distributed RF Power Amplifiers Using GANs with Deep Reinforcement Learning | |
| CN112766496A (en) | Deep learning model security guarantee compression method and device based on reinforcement learning | |
| Bose et al. | Can graph neural networks go deeper without over-smoothing? Yes, with a randomized path exploration! | |
| Loni et al. | Learning activation functions for sparse neural networks | |
| CN113111308B (en) | Symbolic regression method and system based on data-driven genetic programming algorithm | |
| CN114861766B (en) | Dynamic link prediction method and system based on multi-granularity evolution | |
| Xu et al. | Distribution-guided Graph Learning for Sequential Recommendation | |
| CN115907913A (en) | A Next Shopping Basket Recommendation System Based on User Graph Augmented Graph Neural Network | |
| CN114691995A (en) | A Sequence Recommendation Method Based on Information Propagation and Attention Mechanism | |
| CN118940278A (en) | A source code vulnerability location method based on explainability model | |
| Andersen et al. | Increasing sample efficiency in deep reinforcement learning using generative environment modelling | |
| WO2024221015A2 (en) | Reinforcement learning based circuit topology parameter exploration |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination |