[go: up one dir, main page]

CN111507506A - Consensus embedding-based complex network community discovery method - Google Patents

Consensus embedding-based complex network community discovery method Download PDF

Info

Publication number
CN111507506A
CN111507506A CN202010202056.0A CN202010202056A CN111507506A CN 111507506 A CN111507506 A CN 111507506A CN 202010202056 A CN202010202056 A CN 202010202056A CN 111507506 A CN111507506 A CN 111507506A
Authority
CN
China
Prior art keywords
consensus
network
embedding
pop
solution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010202056.0A
Other languages
Chinese (zh)
Inventor
曾湘祥
杜妍孜
刘向荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen University
Original Assignee
Xiamen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen University filed Critical Xiamen University
Priority to CN202010202056.0A priority Critical patent/CN111507506A/en
Publication of CN111507506A publication Critical patent/CN111507506A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9536Search customisation based on social or collaborative filtering
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Public Health (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Medical Informatics (AREA)
  • Tourism & Hospitality (AREA)
  • Primary Health Care (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computing Systems (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Biomedical Technology (AREA)
  • Quality & Reliability (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A consensus embedding-based complex network community discovery method relates to a multi-objective optimization technology. The method comprises the following steps: 1) giving a maximum algebra maxgen and a particle swarm size pop; 2) given network G ═ V, E, the network size is n, carry on the network and represent and study; 3) representing a learning result by using a network, initializing a particle swarm to obtain 100 particles POP, wherein the iteration number t is 1; 4) carrying out updating and mutation based on consensus embedding on the POP; 5) stopping conditions are as follows: and if t is not more than maxgen, t ← t +1 and go to step 3), otherwise, stopping and returning pareto front edge solution, namely a plurality of community division results. The updating process is more efficient and accurate, and the obtained pareto frontier effect is more competitive; the accuracy rate of community discovery is improved, the convergence time of the method is effectively reduced, and the method has good practicability in practical application of function prediction, recommendation systems and the like.

Description

一种基于共识嵌入的复杂网络社区发现方法A Community Discovery Method for Complex Networks Based on Consensus Embedding

技术领域technical field

本发明涉及多目标优化技术,具体是涉及可应用于功能预测、推荐系统等领域的一种基于共识嵌入的复杂网络社区发现方法。The invention relates to a multi-objective optimization technology, in particular to a complex network community discovery method based on consensus embedding, which can be applied to the fields of function prediction, recommendation system and the like.

背景技术Background technique

具有自组织、自相似、吸引子、小世界、无标度中部分或全部性质的网络称为复杂网络(Complex Network)。复杂网络中相同类型节点之间连接较多,构成一个一个的小社区,不同类型节点之间连接较少,但这些连接将成为沟通不同社区的重要通道。A network with some or all of the properties of self-organization, self-similarity, attractor, small world, and scale-free is called a complex network. In a complex network, there are many connections between the same type of nodes, forming a small community, and there are fewer connections between different types of nodes, but these connections will become an important channel for communication between different communities.

为了探索复杂网络的结构特点,进而了解复杂网络的功能,人们对复杂网络的社区结构进行了广泛的研究,提出了众多社区发现方法,主要分为四种方法:凝聚方法、分裂方法、优化方法和模拟方法。在众多方法中这四种方法并不是独立的,一种方法可能会同时体现多种思想。In order to explore the structural characteristics of complex networks and then understand the functions of complex networks, people have carried out extensive research on the community structure of complex networks, and proposed many community discovery methods, which are mainly divided into four methods: agglomeration method, splitting method, and optimization method. and simulation methods. Among the many methods, these four methods are not independent, and one method may reflect multiple ideas at the same time.

复杂网络社区发现方法被广泛应用到许多基础应用当中,比如推荐系统、疾病预测与预防等。粒子可以定义为多目标粒子群中的一个解决方案solution,每个解决方案可以表示复杂网络中的一种社区划分。最近几年,多目标优化方法取得了很好的成功,因此,在基于多目标粒子群算法上做改进有着重要的意义。其中代表的方法有Maoguo Gong等人(M.Gong,Q.Cai,X.Chen,and L.Ma,“Complex network clustering by multiobjectivediscrete particle swarm optimization based on decomposition,”IEEETransactions on Evolutionary Computation,vol.18,no.1,pp.82–97,2014)的基于分解的多目标离散粒子群算法在复杂网络聚类中的应用(MODPSO)。但算法初始化时产生大量无用解,使得搜索空间过大,冗余度很高,算法只能在相对较小的网络中表现良好,在较大规模的网络上表现不佳。Complex network community discovery methods are widely used in many basic applications, such as recommender systems, disease prediction and prevention, etc. A particle can be defined as a solution in a multi-objective particle swarm, and each solution can represent a kind of community partition in a complex network. In recent years, the multi-objective optimization method has achieved good success, so it is of great significance to improve the multi-objective particle swarm optimization. Among the representative methods are Maoguo Gong et al. (M.Gong, Q.Cai, X.Chen, and L.Ma, "Complex network clustering by multiobjectivediscrete particle swarm optimization based on decomposition," IEEE Transactions on Evolutionary Computation, vol.18, no.1, pp.82–97, 2014) Application of Decomposition-Based Multi-objective Discrete Particle Swarm Optimization to Complex Network Clustering (MODPSO). However, when the algorithm is initialized, a large number of useless solutions are generated, which makes the search space too large and the redundancy is very high. The algorithm can only perform well in relatively small networks, and does not perform well in large-scale networks.

发明内容SUMMARY OF THE INVENTION

本发明的目的在于针对现有技术存在的收敛速度过慢,存在大量冗余解,没有深层次挖掘网络中潜藏的底层结构信息等问题,提供一种基于共识嵌入的复杂网络社区发现方法。The purpose of the present invention is to provide a complex network community discovery method based on consensus embedding, aiming at the problems of the prior art such as too slow convergence speed, a large number of redundant solutions, and no deep mining of the underlying structure information hidden in the network.

本发明包括以下步骤:The present invention includes the following steps:

1)给定最大代数maxgen,粒子群规模pop;1) Given the maximum algebra maxgen, the particle swarm size pop;

2)给定的网络G=(V,E),网络规模为n,进行网络表示学习;2) Given a network G=(V, E), the network scale is n, and network representation learning is performed;

3)利用网络表示学习结果,进行粒子群的初始化得到100个粒子POP,迭代次数t=1;3) Using the network to represent the learning results, initialize the particle swarm to obtain 100 particle POPs, and the number of iterations is t=1;

4)将POP进行基于共识嵌入的更新、变异;4) Update and mutate POP based on consensus embedding;

5)停止条件:若满足t≤maxgen,则t←t+1并转向步骤3),否则停止并返回帕累托前沿解,即多个社区划分结果。5) Stop condition: if t≤maxgen is satisfied, then t←t+1 and turn to step 3), otherwise stop and return to the Pareto frontier solution, that is, the result of dividing multiple communities.

在步骤2)中,所述进行网络表示学习的具体步骤可为:In step 2), the specific steps of performing network representation learning may be:

采用基于奇异值分解和特征值分解框架的保留任意阶相似度的网络嵌入方法AROPE,将高维的邻接矩阵映射到低维的连续的特征空间中,挖掘网络中潜藏的底层结构信息,得到节点的特征向量E={e1,e2,e3,...,en},其中ei={ei1,ei2,ei3,...,eid},d为降维后的维度。Using the network embedding method AROPE which preserves the similarity of any order based on singular value decomposition and eigenvalue decomposition framework, the high-dimensional adjacency matrix is mapped to the low-dimensional continuous feature space, and the underlying structural information hidden in the network is mined to obtain nodes. The feature vector E={e 1 ,e 2 ,e 3 ,...,e n }, where e i ={e i1 ,e i2 ,e i3 ,...,e id }, d is the dimension reduction dimension.

在步骤3)中,所述利用网络表示学习结果,进行粒子群的初始化得到100个粒子POP的具体步骤为:In step 3), the specific steps of using the network to represent the learning result and initializing the particle swarm to obtain 100 particle POPs are:

计算相似度得到

Figure BDA0002419728150000021
利用k-means得到基于相似度的初始化的粒子群POPt={x1,x2,x3,...,xpop}t;粒子的局部最优解也初始化为Pbest=POPt。Calculate the similarity to get
Figure BDA0002419728150000021
The initialized particle swarm POP t ={x 1 ,x 2 ,x 3 ,...,x pop } t is obtained by using k-means; the local optimal solution of the particle is also initialized as Pbest=POP t .

在步骤4)中,所述将POP进行基于共识嵌入的更新、变异的具体步骤为:In step 4), the specific steps for updating and mutating the POP based on consensus embedding are:

在每代中进行基于共识的更新;Consensus-based updates in each generation;

每隔10代进行共识嵌入;Consensus embedding every 10 generations;

每代中进行以变异概率为pm的对粒子进行基于邻域的单点变异。Neighborhood-based single-point mutation of particles with mutation probability pm is performed in each generation.

将粒子的更新变异过程迭代进行maxgen次。Iterates the particle update mutation process for maxgen times.

在每代中进行基于共识的更新时,将gbest中的社区作为共识,随机选择其中一个嵌入到当前的粒子中;每隔10代进行共识嵌入时,每隔10代提取当前粒子群中的特征解,从中提取所有二节点社区,在当前粒子群中进行投票,若支持度≥70%,则选定为共识社区,全部嵌入到当前的粒子中;When consensus-based updates are performed in each generation, the community in gbest is used as consensus, and one of them is randomly selected to be embedded in the current particle; when consensus embedding is performed every 10 generations, the features in the current particle swarm are extracted every 10 generations. solution, extract all two-node communities from it, and vote in the current particle swarm. If the support is ≥ 70%, it is selected as a consensus community and all embedded in the current particle;

其中,从解集中提取的特征解,是三种代表性的解,分别是:Among them, the characteristic solutions extracted from the solution set are three representative solutions, namely:

a.具有最小KKM的解a. The solution with minimal KKM

b.具有最小RC的解b. The solution with minimal RC

c.具有最小曼哈顿距离的局部knee解c. Local knee solution with minimum Manhattan distance

每代中进行以变异概率为pm的对粒子进行基于邻域的单点变异时,对每个需要变异的粒子,随机选择一个位置i,其标签用其邻域中的另一个标签替换,这保证了新的可能解的产生。When performing neighborhood-based single-point mutation of particles with mutation probability pm in each generation, for each particle that needs to be mutated, a position i is randomly selected, and its label is replaced with another label in its neighborhood. The generation of new possible solutions is guaranteed.

本发明采用的目标函数是KKM和RC的变体:The objective function adopted in the present invention is a variant of KKM and RC:

Figure BDA0002419728150000031
Figure BDA0002419728150000031

其中Vh∈V,

Figure BDA0002419728150000032
所以
Figure BDA0002419728150000033
Figure BDA0002419728150000034
是集合Vh中节点的内部和外部相似度之和,“|·|”表示集合的大小。where V h ∈ V,
Figure BDA0002419728150000032
so
Figure BDA0002419728150000033
and
Figure BDA0002419728150000034
is the sum of the internal and external similarity of nodes in the set V h , and “|·|” represents the size of the set.

在粒子的选择过程中采用的是非支配排序选择策略。The non-dominated sorting selection strategy is adopted in the selection process of particles.

与现有技术相比,本发明具有以下突出的技术效果:Compared with the prior art, the present invention has the following outstanding technical effects:

相比MODPSO的进化过程,本发明提出的基于共识嵌入的复杂网络社区发现算法粒子的更新方式不一样,在更新过程中更加高效、准确。实验数据也表明,本发明方法得到的帕累托前沿效果更有竞争力;基于本发明需要的收敛时间也更少。本发明的方法提高了社区发现的准确率,同时又能有效减少方法的收敛时间,在功能预测、推荐系统等实际应用当中有很好的实用性。比如在生物领域的新陈代谢网络的分析、基因调控网络的分析、主控基因的识别;在疾病的传播演化和预测防控领域找出传染病的关键社区、关键节点以预测传播途径及时切断传播途径;在互联网中实现广告的精准投放,在电子商务系统上建立更可靠的推荐系统,在搜索引擎中提供更加个性化的搜索结果。Compared with the evolution process of MODPSO, the update method of the algorithm particles of the complex network community discovery based on consensus embedding proposed by the present invention is different, and the update process is more efficient and accurate. The experimental data also show that the Pareto front effect obtained by the method of the present invention is more competitive; the convergence time required by the present invention is also less. The method of the invention improves the accuracy rate of community discovery, and at the same time can effectively reduce the convergence time of the method, and has good practicability in practical applications such as function prediction and recommendation system. For example, in the biological field, the analysis of the metabolic network, the analysis of the gene regulation network, and the identification of the master gene; in the field of disease transmission and evolution, prediction and prevention, find out the key communities and key nodes of infectious diseases to predict the transmission route and cut off the transmission route in time. ; Realize the precise placement of advertisements in the Internet, establish a more reliable recommendation system on the e-commerce system, and provide more personalized search results in the search engine.

附图说明Description of drawings

图1为在每代中粒子的更新方式示意图。Figure 1 is a schematic diagram of how particles are updated in each generation.

图2为每隔10代粒子的基于共识嵌入的更新方式示意图。Figure 2 is a schematic diagram of the update method based on consensus embedding of particles every 10 generations.

图3本发明的方法(NE-PSO)与MODPSO算法在小规模网络数据集上的运行时间的对比图。FIG. 3 is a comparison diagram of the running time of the method of the present invention (NE-PSO) and the MODPSO algorithm on a small-scale network dataset.

图4本发明的方法(NE-PSO)在大规模网络数据集上的运行时间的折线图。Figure 4. Line graph of the running time of the method of the present invention (NE-PSO) on a large-scale network dataset.

具体实施方式Detailed ways

以下实施例将结合附图对本发明作进一步的说明。The following embodiments will further illustrate the present invention in conjunction with the accompanying drawings.

本发明所述基于共识嵌入的复杂网络社区发现方法实施例,具体包括以下步骤:The embodiment of the method for discovering complex network communities based on consensus embedding according to the present invention specifically includes the following steps:

1)给定网络G=(V,E),最大代数maxgen,粒子群规模pop,网络规模n,变异概率pm;1) Given network G=(V, E), maximum algebra maxgen, particle swarm size pop, network size n, mutation probability pm;

2)利用基于奇异值分解和特征值分解框架的保留任意阶相似度的网络嵌入方法AROPE对网络G进行网络表示学习,将高维的邻接矩阵映射到低维的连续的特征空间中,挖掘网络中潜藏的底层结构信息,得到节点的特征向量E={e1,e2,e3,...,en},其中ei={ei1,ei2,ei3,...,eid},d为降维后的维度;2) Using the network embedding method AROPE that retains any order similarity based on singular value decomposition and eigenvalue decomposition framework, network representation learning is performed on the network G, and the high-dimensional adjacency matrix is mapped into the low-dimensional continuous feature space, and the network is mined. The underlying structure information hidden in the node is obtained, and the feature vector E={e 1 ,e 2 ,e 3 ,..., en } of the node is obtained, where e i = {e i1 ,e i2 ,e i3 ,..., e id }, d is the dimension after dimension reduction;

3)利用网络表示学习结果和节点的特征向量,计算节点之间的cos相似度得到

Figure BDA0002419728150000041
利用k-means得到基于相似度的初始化的粒子群POPt={x1,x2,x3,...,xpop}t;粒子的局部最优解也初始化为Pbest=POPt;迭代次数t=1;3) Using the network to represent the learning results and the feature vectors of nodes, calculate the cos similarity between nodes to get
Figure BDA0002419728150000041
Using k-means to obtain the initialized particle swarm POP t ={x 1 ,x 2 ,x 3 ,...,x pop } t based on similarity; the local optimal solution of the particle is also initialized as Pbest=POP t ; iterative times t=1;

4)将POP进行基于共识嵌入的更新、变异,其中:4) Update and mutate POP based on consensus embedding, where:

在每代中进行基于共识的更新(如图1),将gbest中的社区作为共识,随机选择其中一个嵌入到当前的粒子中;Perform consensus-based updates in each generation (as shown in Figure 1), take the community in gbest as consensus, and randomly select one of them to embed into the current particle;

每隔10代进行共识嵌入(如图2),每隔10代提取当前粒子群中的特征解,从中提取所有二节点社区,在当前粒子群中进行投票,若支持度≥70%,则选定为共识社区,全部嵌入到当前的粒子中;Consensus embedding is performed every 10 generations (as shown in Figure 2), the feature solution in the current particle swarm is extracted every 10 generations, all two-node communities are extracted from it, and voting is performed in the current particle swarm. Set as a consensus community, all embedded in the current particle;

每代中进行以变异概率为pm的对粒子进行基于邻域的单点变异,对每个需要变异的粒子,随机选择一个位置i,其标签用其邻域中的另一个标签替换,这保证了新的可能解的产生。In each generation, a neighborhood-based single-point mutation is performed on the particles with the mutation probability pm. For each particle that needs to be mutated, a position i is randomly selected, and its label is replaced by another label in its neighborhood, which ensures that generation of new possible solutions.

5)停止条件:若满足t≤maxgen,则t←t+1并转向步骤3,否则停止算法并返回帕累托前沿解,即多个社区划分结果。5) Stop condition: if t≤maxgen is satisfied, then t←t+1 and turn to step 3, otherwise stop the algorithm and return to the Pareto frontier solution, that is, the result of multiple community divisions.

图1为在每代中粒子的更新方式示意图。在每一代更新中,所有的粒子都会根据非支配关系进行排序,维护一个全局最优解gbest,gbest中划分的社区可以被看作是一个个共识社区,当前代的粒子随机被一个共识社区嵌入,并与原粒子做非支配排序选择,留下的即为产生的新粒子。Figure 1 is a schematic diagram of how particles are updated in each generation. In each generation update, all particles will be sorted according to the non-dominant relationship to maintain a global optimal solution gbest. The communities divided in gbest can be regarded as a consensus community, and the particles of the current generation are randomly embedded by a consensus community , and make a non-dominated sorting selection with the original particle, and what is left is the new particle generated.

图2为每隔10代粒子的基于共识嵌入的更新方式示意图。在每隔10代的更新中,在所有粒子组成的帕累托前沿中,选出局部knee点(即极值点)和KKM最小值点、RC最小值点,在这h个点中提取共识社区(本发明中设置的支持度为70%),并将得到的共识社区嵌入到当前代所有的粒子中,更新所有粒子。Figure 2 is a schematic diagram of the update method based on consensus embedding of particles every 10 generations. In the update every 10 generations, in the Pareto frontier composed of all particles, select the local knee points (ie extreme points), KKM minimum points, RC minimum points, and extract consensus among these h points community (the support degree set in the present invention is 70%), and the obtained consensus community is embedded into all particles of the current generation, and all particles are updated.

表1给出本发明的方法(NE-PSO)与其他基准算法在11个真实世界网络数据集上的模块度Q(最大值和平均值)的对比。Table 1 presents the comparison of the modularity Q (maximum and average) of the method of the present invention (NE-PSO) with other benchmark algorithms on 11 real-world network datasets.

表1Table 1

Figure BDA0002419728150000051
Figure BDA0002419728150000051

Figure BDA0002419728150000061
Figure BDA0002419728150000061

注:“——”说明对应的方法无法在给定的时间内得到结果。“?”说明对应的方法未公开导致对应的实验结果未知。Note: "——" indicates that the corresponding method cannot get the result within the given time. "?" indicates that the corresponding method is not disclosed, resulting in unknown experimental results.

从表1可以看出,本发明的方法在采用的11个数据集中有9个效果都比其他的基准算法好,特别是在网络节点多的数据集上非常有竞争力。并且根据图3和4在小规模网络数据集和大规模网络数据集上运行时间的折线图,可以看出,NE-PSO还明显减少了多目标粒子群算法的收敛时间。因此,本发明提高了社区发现的准确率,同时又能有效减少算法收敛时间,在实际应用当中有很好的实用性。It can be seen from Table 1 that the method of the present invention has better effects than other benchmark algorithms in 9 of the 11 data sets used, especially in the data set with many network nodes. And according to the line graphs of running time on small-scale network datasets and large-scale network datasets in Figures 3 and 4, it can be seen that NE-PSO also significantly reduces the convergence time of multi-objective particle swarm optimization. Therefore, the present invention improves the accuracy of community discovery, and at the same time can effectively reduce the algorithm convergence time, and has good practicability in practical applications.

上述实施例仅为本发明的较佳实施例,不能被认为用于限定本发明的实施范围。凡依本发明申请范围所作的均等变化与改进等,均应仍归属于本发明的专利涵盖范围之内。The above-mentioned embodiments are only preferred embodiments of the present invention, and should not be construed as limiting the scope of implementation of the present invention. All equivalent changes and improvements made according to the scope of the application of the present invention should still belong to the scope of the patent of the present invention.

Claims (5)

1. A consensus embedding-based complex network community discovery method is characterized by comprising the following steps:
1) giving a maximum algebra maxgen and a particle swarm size pop;
2) given network G ═ V, E, the network size is n, carry on the network and represent and study;
3) representing a learning result by using a network, initializing a particle swarm to obtain 100 particles POP, wherein the iteration number t is 1;
4) carrying out updating and mutation based on consensus embedding on the POP;
5) stopping conditions are as follows: and if t is not more than maxgen, t ← t +1 and go to step 3), otherwise, stopping and returning pareto front edge solution, namely a plurality of community division results.
2. The method for discovering complex web communities based on consensus embedding as claimed in claim 1, wherein in step 2), the specific steps of performing web representation learning are:
the method comprises the steps of adopting a network embedding method AROPE based on singular value decomposition and eigenvalue decomposition frames and keeping any order of similarity, mapping a high-dimensional adjacent matrix into a low-dimensional continuous eigenspace, mining underlying structure information hidden in a network, and obtaining an eigenvector E ═ { E } of a node1,e2,e3,...,enIn which ei={ei1,ei2,ei3,...,eidD is the dimensionality after dimensionality reduction.
3. The method as claimed in claim 1, wherein in step 3), the specific steps of using the network to represent the learning result and performing initialization of the particle swarm to obtain 100 particle POPs are as follows:
calculating similarity to obtain S ═ Sij)n*n(ii) a Particle swarm POP capable of obtaining initialization based on similarity by using k-meanst={x1,x2,x3,...,xpop}t(ii) a The local optimal solution for the particle is also initialized to Pbest ═ POPt
4. The method as claimed in claim 1, wherein in step 4), the step of updating and mutating the POP based on consensus embedding comprises:
performing consensus-based updates in each generation;
performing consensus embedding every 10 generations;
and performing neighborhood-based single-point mutation on the particles with mutation probability of pm in each generation.
5. The method for discovering complex network communities based on consensus embedding as claimed in claim 1, wherein in step 4), the updating and mutation iterates the updating and mutation process of the particles for maxgen times;
when updating based on consensus is carried out in each generation, taking the communities in the gbest as consensus, and randomly selecting one of the communities to be embedded into the current particle; when the consensus embedding is carried out every 10 generations, extracting the characteristic solution in the current particle swarm every 10 generations, extracting all two-node communities from the characteristic solution, voting in the current particle swarm, and if the support degree is more than or equal to 70%, selecting the characteristic solution as the consensus community and completely embedding the characteristic solution into the current particle;
wherein, the characteristic solutions extracted from the solution set are three representative solutions, which are respectively:
(1) solutions with minimum KKM
(2) Solutions with minimum RC
(3) Local knee solution with minimum manhattan distance
When single-point variation based on neighborhood is carried out on the particles with variation probability pm in each generation, a position i is randomly selected for each particle needing variation, and the label of the position i is replaced by another label in the neighborhood, so that the generation of a new possible solution is ensured.
CN202010202056.0A 2020-03-20 2020-03-20 Consensus embedding-based complex network community discovery method Pending CN111507506A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010202056.0A CN111507506A (en) 2020-03-20 2020-03-20 Consensus embedding-based complex network community discovery method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010202056.0A CN111507506A (en) 2020-03-20 2020-03-20 Consensus embedding-based complex network community discovery method

Publications (1)

Publication Number Publication Date
CN111507506A true CN111507506A (en) 2020-08-07

Family

ID=71869303

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010202056.0A Pending CN111507506A (en) 2020-03-20 2020-03-20 Consensus embedding-based complex network community discovery method

Country Status (1)

Country Link
CN (1) CN111507506A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022142467A1 (en) * 2020-12-30 2022-07-07 南方科技大学 Epidemic prevention and control method and apparatus, and device and medium
CN117391032A (en) * 2023-11-01 2024-01-12 浙江阶数律法科技有限公司 Circuit topology diagram generation method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080294403A1 (en) * 2004-04-30 2008-11-27 Jun Zhu Systems and Methods for Reconstructing Gene Networks in Segregating Populations
CN103971160A (en) * 2014-05-05 2014-08-06 北京航空航天大学 Particle swarm optimization method based on complex network
CN109859065A (en) * 2019-02-28 2019-06-07 桂林理工大学 Multiple target complex network community discovery method based on spectral clustering
US20190179615A1 (en) * 2016-10-27 2019-06-13 Tencent Technology (Shenzhen) Company Limited Community discovery method, device, server and computer storage medium
CN109921936A (en) * 2019-03-13 2019-06-21 南京邮电大学 Multiple target dynamic network community division method based on memetic frame

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080294403A1 (en) * 2004-04-30 2008-11-27 Jun Zhu Systems and Methods for Reconstructing Gene Networks in Segregating Populations
CN103971160A (en) * 2014-05-05 2014-08-06 北京航空航天大学 Particle swarm optimization method based on complex network
US20190179615A1 (en) * 2016-10-27 2019-06-13 Tencent Technology (Shenzhen) Company Limited Community discovery method, device, server and computer storage medium
CN109859065A (en) * 2019-02-28 2019-06-07 桂林理工大学 Multiple target complex network community discovery method based on spectral clustering
CN109921936A (en) * 2019-03-13 2019-06-21 南京邮电大学 Multiple target dynamic network community division method based on memetic frame

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XIANGRONG LIU , YANZI DU, MIN JIANG AND XIANGXIANG ZENG: "Multiobjective Particle Swarm Optimization Based on Network Embedding for Complex Network Community Detection", 《IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS》, vol. 7, no. 2, 4 February 2020 (2020-02-04), pages 437 - 449, XP011781725, DOI: 10.1109/TCSS.2020.2964027 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022142467A1 (en) * 2020-12-30 2022-07-07 南方科技大学 Epidemic prevention and control method and apparatus, and device and medium
CN117391032A (en) * 2023-11-01 2024-01-12 浙江阶数律法科技有限公司 Circuit topology diagram generation method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN114022693B (en) Single-cell RNA-seq data clustering method based on double self-supervision
Li et al. Disentangled graph contrastive learning with independence promotion
Khawar et al. Autofeature: Searching for feature interactions and their architectures for click-through rate prediction
CN113704500A (en) Knowledge graph community division method based on graph neural network
CN109002858B (en) Evidence reasoning-based integrated clustering method for user behavior analysis
CN113241115A (en) Depth matrix decomposition-based circular RNA disease correlation prediction method
CN111985623A (en) Attribute graph group discovery method based on maximized mutual information and graph neural network
CN115983341B (en) A node classification method based on relational aggregation hypergraph
Hong et al. Variational gridded graph convolution network for node classification
Cao et al. Combination of links and node contents for community discovery using a graph regularization approach
CN112561599A (en) Click rate prediction method based on attention network learning and fusing domain feature interaction
CN111667881A (en) A protein function prediction method based on multi-network topology
CN113989544A (en) Group discovery method based on deep map convolution network
CN111507506A (en) Consensus embedding-based complex network community discovery method
CN115019878A (en) Drug discovery method based on graph representation and deep learning
Duan et al. G-Prompt: Graphon-based Prompt Tuning for graph classification
Dey et al. A quantum inspired differential evolution algorithm for automatic clustering of real life datasets
Hou et al. Improving molecular graph generation with flow matching and optimal transport
Das et al. Markov clustering algorithms and their application in analysis of PPI network of malaria genes
CN113434815A (en) Community detection method based on similar and dissimilar constraint semi-supervised nonnegative matrix factorization
CN111126467B (en) A spatial spectrum clustering method of remote sensing images based on multi-objective sine-cosine algorithm
Xiong et al. Learning regularized noise contrastive estimation for robust network embedding
Yazdanparast et al. Linear time community detection by a novel modularity gain acceleration in label propagation
CN115512765B (en) A method for predicting pathogenic circRNAs based on quantum representation learning
Yan et al. Hybrid chain-hypergraph P systems for multiobjective ensemble clustering

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200807