CN118536542A - Method and system for constructing optimal architecture of graph neural network - Google Patents
- Publication number: CN118536542A (application CN202411002376.6A)
- Authority: CN (China)
- Legal status: Granted
Classifications
- G06N3/042 — Knowledge-based neural networks; logical representations of neural networks
- G06N3/045 — Combinations of networks
- G06N3/082 — Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
Description
Technical Field
The present invention relates to the technical field of graph neural network architecture construction, and in particular to a method and system for constructing an optimal graph neural network architecture, together with a method and system for predicting drug molecular properties.
Background Art
When processing graph data with rich relational structure, graph neural networks (GNNs) capture the non-Euclidean nature of graph data better than traditional neural networks such as recurrent neural networks. Consequently, many methods now use GNNs to predict molecular properties from the information in molecular graphs.
At the same time, graph neural network architecture search has emerged to obtain the optimal GNN architecture for a given task and to reduce the hyperparameter-tuning time of algorithm designers. Given a search space A and a dataset D, such a method uses a specific search strategy to select the best architecture a* in A, i.e. the architecture that maximizes the accuracy of the resulting network on D. The main search strategies currently in use are algorithms based on reinforcement learning, evolutionary learning, and differentiable methods.
Although various graph architecture search algorithms have achieved good results across domain tasks, designing such search algorithms itself requires substantial domain knowledge. Search methods such as random sampling can also achieve decent performance, but they require training and validating a large number of architectures, which demands considerable computing resources and time. Moreover, with existing methods, changing the search space, reference metrics, and so on requires modifying the code, which is very inconvenient for users outside the computer-science field.
Summary of the Invention
To solve the above technical problems, the present invention proposes a method and system for constructing an optimal graph neural network architecture, and a method and system for predicting drug molecular properties, so as to automatically obtain an optimal graph neural network architecture and save time and human-resource costs.
According to one aspect of the present invention, a method for constructing an optimal graph neural network architecture is proposed, the method comprising:
pre-designing a search space containing multiple graph neural network parameters;
pre-designing a prompt template for guiding a large language model to reply, the prompt template including a description of the pre-designed search space;
obtaining an initial triggering prompt based on the prompt template;
inputting the initial triggering prompt into a pre-trained large language model for sampling, to obtain initial sets of graph neural network configuration parameters;
constructing multiple corresponding graph neural networks based on these initial sets of configuration parameters;
training the graph neural networks separately on a training set and screening them on a test set;
iteratively repeating the above sampling, construction, training, and screening processes until a maximum number of iterations is reached, to obtain the final optimal graph neural network.
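The iterative procedure above can be sketched in code as follows. This is a minimal illustration, not the patent's implementation: the tiny configuration grid and the `pretend_llm`, `build`, and `score_fn` helpers are hypothetical stand-ins for the large language model, the network constructor, and the training/screening step.

```python
def search_optimal_gnn(sample_configs, build_gnn, evaluate, max_iters=3, n_configs=2):
    """Iteratively: sample configurations, build GNNs, train/screen, feed back."""
    history = []                                         # (config, score) pairs so far
    for _ in range(max_iters):
        for cfg in sample_configs(history, n_configs):   # LLM sampling step
            history.append((cfg, evaluate(build_gnn(cfg))))
    best_cfg, best_score = max(history, key=lambda cs: cs[1])
    return build_gnn(best_cfg), best_score

# Toy stand-ins (illustrative only): a tiny grid plays the role of the search
# space, and a "pretend LLM" proposes configurations it has not tried yet.
grid = [{"lr": lr} for lr in (1e-2, 1e-3, 1e-4)]

def pretend_llm(history, n):
    tried = [cfg for cfg, _ in history]
    fresh = [cfg for cfg in grid if cfg not in tried]    # exploration first
    return (fresh + grid)[:n]

build = lambda cfg: ("gnn", cfg)                 # stand-in network constructor
score_fn = lambda model: 1.0 - model[1]["lr"]    # pretend smaller lr is better

best_model, best = search_optimal_gnn(pretend_llm, build, score_fn)
```

In a real run, `sample_configs` would call the large language model with the assembled prompt, and `evaluate` would train the network on the training set and score it on the test set.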
Further, the multiple graph neural network parameters in the search space include: normalization operation, activation function, convolution-layer operation, readout-layer operation, number of loops, learning rate, hidden-layer dimension, batch size, and ratio, where each parameter has multiple candidate options.
Further, the prompt template also includes a task description, reply-format rules, and a search strategy. The task description explains that the task is to train and construct a graph neural network with an optimal architecture according to the graph neural network parameters in the search space; the reply-format rules specify the parameter format of the large language model's output; and the search strategy gives the method by which the large language model reasons its way to the optimal graph neural network configuration.
Further, the search strategy includes: in the exploration stage, i.e. when experimental data are lacking, focusing the search on graph neural network configurations from the search space that have not appeared before; in the exploitation stage, i.e. once experimental data are available, providing the common configurations of high-performance graph neural network architectures, their corresponding performance metrics, the mean of the performance metrics, and the deviations of the high-performance architectures' metrics from that mean, and, during the search, retaining what the high-performance configurations have in common while modifying the remaining parts of the configuration to generate new graph neural network architectures.
Further, iteratively repeating the sampling, construction, training, and screening processes to obtain the final optimal graph neural network includes: adopting a training scheme that combines low-precision (few-epoch) and high-precision (many-epoch) training, dividing training into low-precision rounds, high-precision rounds, and training to convergence. In the training stage on the training set, low-precision graph neural network models are first obtained through a small number of training epochs and their performance is measured on the test set; the best-performing half of the models are then trained for a larger number of epochs, and the training results are fed back to the pre-trained large language model. The large language model then samples and outputs the better graph neural network configuration parameters obtained by its reasoning. After multiple iterations, the best-performing models among all constructed graph neural networks are selected, trained until convergence, and the optimal graph neural network model is obtained by screening on the final performance.
According to another aspect of the present invention, a system for constructing an optimal graph neural network architecture is proposed, the system comprising:
a prompt template design module, configured to pre-design a search space containing multiple graph neural network parameters and to pre-design a prompt template for guiding a large language model to reply, the prompt template including a description of the pre-designed search space;
an initial prompt acquisition module, configured to obtain an initial triggering prompt based on the prompt template;
an optimal model construction module, configured to input the initial triggering prompt into a pre-trained large language model for sampling to obtain initial sets of graph neural network configuration parameters; to construct multiple corresponding graph neural networks based on these configuration parameters; to train the graph neural networks separately on a training set and screen them on a test set; and to iteratively repeat the above sampling, construction, training, and screening processes until a maximum number of iterations is reached, obtaining the final optimal graph neural network.
Further, the multiple graph neural network parameters in the prompt template design module include: normalization operation, activation function, convolution-layer operation, readout-layer operation, number of loops, learning rate, hidden-layer dimension, batch size, and ratio, where each parameter has multiple candidate options.
The prompt template also includes a task description, reply-format rules, and a search strategy. The task description explains that the task is to train and construct a graph neural network with an optimal architecture according to the graph neural network parameters in the search space; the reply-format rules specify the parameter format of the large language model's output; and the search strategy gives the method by which the large language model reasons its way to the optimal graph neural network configuration.
Further, the search strategy includes: in the exploration stage, i.e. when experimental data are lacking, focusing the search on graph neural network configurations from the search space that have not appeared before; in the exploitation stage, i.e. once experimental data are available, providing the common configurations of high-performance graph neural network architectures, their corresponding performance metrics, the mean of the performance metrics, and the deviations of the high-performance architectures' metrics from that mean, and, during the search, retaining what the high-performance configurations have in common while modifying the remaining parts of the configuration to generate new graph neural network architectures.
According to yet another aspect of the present invention, a method for predicting drug molecular properties based on a graph neural network is proposed, the method comprising:
obtaining a drug molecule graph dataset;
using the drug molecule graph dataset as the training set and test set, and applying the above method for constructing an optimal graph neural network architecture to obtain a molecular property prediction model based on the optimal graph neural network;
inputting the molecular graph to be predicted into the molecular property prediction model based on the optimal graph neural network, to obtain the predicted property result.
According to yet another aspect of the present invention, a system for predicting drug molecular properties based on a graph neural network is also proposed. The system has program modules corresponding to the steps of the above drug molecular property prediction method, and at runtime executes the steps of that method.
The present invention has the following technical effects:
The present invention proposes a method and system for constructing an optimal graph neural network architecture and a method and system for predicting drug molecular properties, comprising: pre-designing a search space containing multiple graph neural network parameters; pre-designing a prompt template for guiding a large language model to reply, the prompt template including a description of the pre-designed search space; obtaining an initial triggering prompt based on the prompt template; inputting the initial triggering prompt into a pre-trained large language model for sampling to obtain initial sets of graph neural network configuration parameters; constructing multiple corresponding graph neural networks; training them separately on a training set and screening them on a test set; iteratively repeating the sampling, construction, training, and screening processes until a maximum number of iterations is reached to obtain the final optimal graph neural network; and applying the constructed optimal graph neural network to drug property prediction tasks. The present invention can automatically obtain well-performing graph neural network architectures for the datasets of different drug property prediction tasks while saving training costs and resources. The invention is flexible to operate: users can adjust the dataset, reference metrics, search space, and so on as needed.
Brief Description of the Drawings
In order to more clearly illustrate the specific embodiments of the present invention or the technical solutions in the prior art, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a flow chart of a method for constructing an optimal graph neural network architecture according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of the structure of a message passing neural network in an embodiment of the present invention.
FIG. 3 is a schematic diagram of the structure of a graph neural network obtained by sampling in an embodiment of the present invention.
FIG. 4 is a structural diagram of a system for constructing an optimal graph neural network architecture according to an embodiment of the present invention.
FIG. 5 is a schematic diagram of the process of obtaining a molecular property prediction model based on an optimal graph neural network using the architecture construction method in an embodiment of the present invention.
Detailed Description
To make the purpose, technical solutions, and advantages of the present invention clearer, the technical solutions of the present invention are described clearly and completely below. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of the present invention.
An embodiment of the present invention proposes a method for constructing an optimal graph neural network architecture. As shown in FIG. 1, the method comprises:
S110: pre-designing a search space containing multiple graph neural network parameters;
S120: pre-designing a prompt template for guiding a large language model to reply, the prompt template including a description of the pre-designed search space;
S130: obtaining an initial triggering prompt based on the prompt template;
S140: inputting the initial triggering prompt into a pre-trained large language model for sampling to obtain initial sets of graph neural network configuration parameters; constructing multiple corresponding graph neural networks based on these configuration parameters; training the graph neural networks separately on a training set and screening them on a test set;
S150: iteratively repeating the sampling, construction, training, and screening processes of S140 until a maximum number of iterations is reached, to obtain the final optimal graph neural network.
The method starts at S110. In S110, a search space containing multiple graph neural network parameters is pre-designed. The parameters include: normalization operation, activation function, convolution-layer operation, readout-layer operation, number of loops, learning rate, hidden-layer dimension, batch size, and ratio; each parameter has multiple candidate options.
According to an embodiment of the present invention, the designed default search space uses a message passing neural network (MPNN) as the core backbone, as shown in FIG. 2. In each layer, alternative model configurations such as normalization and activation function are provided, and hyperparameters such as the learning rate and hidden-layer dimension are added; together these constitute the search space, as shown in Table 1.
Table 1
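Such a search space can be represented as a dictionary of candidate options per parameter. The sketch below follows the parameter list named above, but the concrete option values are assumed examples, not the actual contents of Table 1:

```python
# Illustrative search space: parameter names follow the patent's list,
# but the candidate option values below are assumed, not Table 1's.
SEARCH_SPACE = {
    "normalization": ["batch_norm", "layer_norm", "none"],
    "activation": ["relu", "elu", "tanh"],
    "convolution": ["gcn", "gat", "gin"],
    "readout": ["sum", "mean", "max"],
    "num_loops": [2, 3, 4],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "hidden_dim": [64, 128, 256],
    "batch_size": [32, 64, 128],
    "ratio": [0.1, 0.3, 0.5],
}

def is_valid(config):
    """A sampled configuration picks exactly one option per parameter."""
    return (config.keys() == SEARCH_SPACE.keys()
            and all(config[k] in SEARCH_SPACE[k] for k in config))

# e.g. the configuration made of each parameter's first option is valid:
default_cfg = {k: opts[0] for k, opts in SEARCH_SPACE.items()}
```

A validity check of this kind is useful because configurations proposed by the large language model in free text must be confirmed to lie inside the search space before a network is built from them.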
S120 is then executed. In S120, a prompt template for guiding the large language model to reply is pre-designed. Besides the description of the pre-designed search space, the prompt template also includes a task description, reply-format rules, and a search strategy: the task description explains that the task is to train and construct a graph neural network with an optimal architecture according to the graph neural network parameters in the search space; the reply-format rules specify the parameter format of the large language model's output; and the search strategy gives the method by which the large language model reasons its way to the optimal graph neural network configuration.
According to an embodiment of the present invention, the purpose of the prompt is mainly to make the large language model understand the predefined search space and to guide its reasoning toward better model configurations. The prompt template includes six parts: system setup, task description, search-space description, reply-format rules, search strategy, and re-emphasis. An example of the prompt template is given below, where the content in 【】 brackets is filled in from different variables.
In the system setup and task description, the large language model is told the role it plays in the conversation and the task it must perform, namely to train and construct a graph neural network with an optimal architecture according to the graph neural network parameters in the search space. The search-space description presents the designed search space in clear text so that the large language model can understand it better. The reply-format rules give examples so that the model replies in the required format, i.e. the required parameter format of the output, which makes it easy to automatically process the output and extract configurations to build models. The search strategy guides the large language model, through a decomposed reasoning method, to infer the optimal graph neural network configuration. Finally, the re-emphasis part repeats important information, such as the requirement to follow the reply format and the number of configurations to return, to ensure that the large language model replies as required.
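Assembling the six-part prompt from template variables might look like the following sketch. The section wording, the bracketed section labels, and the variable slots are illustrative assumptions, not the patent's actual template text:

```python
# Illustrative six-part prompt template; wording and slots are assumed.
PROMPT_TEMPLATE = (
    "[System setup] You are an expert in graph neural architecture search.\n"
    "[Task] Build a GNN with an optimal architecture on the {dataset} dataset "
    "by choosing from the search space below.\n"
    "[Search space] {search_space}\n"
    "[Reply format] Reply with exactly {n_configs} configurations, one per "
    "line, as: normalization=...; activation=...; ...\n"
    "[Search strategy] {strategy}\n"
    "[Re-emphasis] You MUST follow the reply format and return exactly "
    "{n_configs} configurations."
)

def build_prompt(dataset, search_space, strategy, n_configs):
    """Fill the template's variable slots to produce a concrete prompt."""
    return PROMPT_TEMPLATE.format(dataset=dataset, search_space=search_space,
                                  strategy=strategy, n_configs=n_configs)
```

Because the number of required configurations appears both in the reply-format rules and in the re-emphasis section, it is repeated automatically every time the prompt is rebuilt with new feedback.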
Some techniques are incorporated into the search strategy: in the exploration stage, i.e. when experimental data are lacking, the search focuses on graph neural network configurations from the search space that have not appeared before; in the exploitation stage, i.e. once experimental data are available, the common configurations of high-performance architectures, their performance metrics, the mean of the metrics, and the deviations of those metrics from the mean are provided, and the search retains what the high-performance configurations have in common while modifying the remaining parts to generate new graph neural network architectures.
Specifically, the sampling process is guided step by step, dividing the iterations into an exploration stage and an exploitation stage. In the early exploration stage, since experimental data are lacking, the large language model is guided to focus as much as possible on model configurations that have not yet appeared in the search space. In the exploitation stage, after a certain amount of experimental results has accumulated, the large model is guided to modify parts of the configuration according to the experimental results fed back in the prompt, to probe whether model performance improves. In addition, when validation results are fed back to the model, the previous performances are averaged according to the following formula to obtain the mean p̄, and the performance deviation Δi,j of each model is computed, which helps make better use of the performance of previous models:

p̄ = (1 / (m·n)) · Σi=1..m Σj=1..n p(ai,j),  Δi,j = p(ai,j) − p̄

where m is the number of iterations; n is the number of graph neural network architectures returned in one iteration; ai,j is the j-th graph neural network returned in the i-th iteration; p(ai,j) is its performance; and p̄ is the mean of the previous performances.
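The feedback quantities above can be computed directly from the recorded scores. In the sketch below, `scores` is a hypothetical m×n nested list holding the performance of the j-th architecture returned in the i-th iteration:

```python
def performance_feedback(scores):
    """scores[i][j]: performance of the j-th architecture returned in the
    i-th iteration (m iterations of n architectures each). Returns the mean
    over all m*n models and each model's deviation from that mean."""
    flat = [p for iteration in scores for p in iteration]
    mean = sum(flat) / len(flat)                          # p-bar over m*n models
    deviations = [[p - mean for p in iteration] for iteration in scores]
    return mean, deviations

# e.g. two iterations of two architectures each:
mean, dev = performance_feedback([[0.8, 0.6], [0.9, 0.7]])
```

The mean and per-model deviations would then be serialized into the feedback section of the next prompt.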
S130 is then executed. In S130, the initial triggering prompt is obtained based on the prompt template.
According to an embodiment of the present invention, the selected dataset, the default search space, the reference metrics, the performance of already-trained graph neural networks, and so on are taken as variables, and the initial triggering prompt is automatically constructed according to the pre-designed text prompt template. A prompt can simply be understood as text that guides the large language model to reply.
S140 is then executed. In S140, the initial triggering prompt is input into the pre-trained large language model for sampling, obtaining initial sets of graph neural network configuration parameters; multiple corresponding graph neural networks are constructed from these configuration parameters; the networks are trained separately on the training set and screened on the test set.
S150 is then executed. In S150, the sampling, construction, training, and screening processes of S140 are iteratively repeated until the maximum number of iterations is reached, obtaining the final optimal graph neural network.
According to an embodiment of the present invention, the prompt is input into an already-trained large language model. A large language model is a model with a large number of parameters for processing natural language, which typically has strong reasoning ability and rich built-in knowledge. Specifically, an existing trained large language model can be used by calling its interface.
During sampling, the large language model returns, in text form, the model configurations its reasoning considers better; all configurations are sampled from the search space. The network construction module parses these configurations to form the corresponding graph neural network models, as shown in FIG. 3.
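One plausible way for the network construction module to parse the model's text reply into configuration dictionaries, assuming (hypothetically) a `key=value; key=value` line format for each configuration:

```python
def parse_reply(reply_text):
    """Parse lines such as 'normalization=layer_norm; activation=relu' into
    configuration dictionaries, skipping any prose lines the model adds."""
    configs = []
    for line in reply_text.strip().splitlines():
        pairs = [item.split("=", 1) for item in line.split(";") if "=" in item]
        if pairs:                        # lines without key=value pairs are prose
            configs.append({k.strip(): v.strip() for k, v in pairs})
    return configs

# e.g. a reply mixing prose with two configuration lines:
reply = ("Here are my suggestions:\n"
         "normalization=layer_norm; activation=relu\n"
         "normalization=batch_norm; activation=elu")
cfgs = parse_reply(reply)
```

Each parsed dictionary would then be checked against the search space and handed to the model builder.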
These graph neural network models are then trained and validated on the dataset using combined low- and high-precision training, and the model with the most accurate predictions in each round of validation is returned to the large language model to guide it in deriving the next set of more accurate model configurations. Specifically, a training scheme combining low-precision and high-precision training is adopted, dividing training into low-precision rounds, high-precision rounds, and training to convergence. In the training stage, low-precision graph neural network models are first obtained through fewer training epochs and their performance is measured on the test set; the best-performing half of the models are trained for more epochs, the training results are fed back to the large language model, and the large language model again outputs the better configuration parameters obtained by its reasoning. After multiple iterations, the maximum number of iterations is reached; the best-performing models among all graph neural network models are selected and trained until convergence, and the optimal graph neural network model is obtained by screening on the final performance.
Compared with the traditional approach of training every model to convergence before screening, this training scheme shifts resources from poorly performing models to well-performing ones, substantially saving training resources and costs.
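The low-/high-precision screening described above resembles successive halving: train everything briefly, keep the best half, train the survivors longer. A sketch under that reading, with `train_and_score` a hypothetical stand-in for the actual training code:

```python
def screen_models(models, train_and_score, low_epochs=10, high_epochs=100):
    """Successive-halving-style screening: train every model briefly, keep
    the best-performing half, then train the survivors for more epochs."""
    coarse = [(m, train_and_score(m, low_epochs)) for m in models]
    coarse.sort(key=lambda ms: ms[1], reverse=True)       # rank by early score
    survivors = [m for m, _ in coarse[: max(1, len(coarse) // 2)]]
    return [(m, train_and_score(m, high_epochs)) for m in survivors]

# Toy check: represent each "model" by its hidden quality, and let the score
# scale with the training budget so that the ranking is visible early.
result = screen_models([0.2, 0.9, 0.5, 0.7],
                       lambda m, epochs: m * epochs / 100)
```

Only the surviving half consumes the large high-precision training budget, which is where the resource saving over train-everything-to-convergence comes from.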
The present invention can automatically obtain well-performing graph neural network architectures for different datasets while saving training costs and resources as much as possible. In addition, the invention is more flexible to operate, convenient for users without a computer-science background, and allows the dataset, reference metrics, search space, and so on to be adjusted.
Another embodiment of the present invention provides a system for constructing an optimal graph neural network architecture. As shown in FIG. 4, the system comprises:
a prompt template design module 410, configured to pre-design a search space containing multiple graph neural network parameters, and to pre-design a prompt template for guiding the reply of a large language model, the prompt template including a description of the pre-designed search space;
an initial prompt acquisition module 420, configured to obtain the initial triggering prompt based on the prompt template; and
an optimal model construction module 430, configured to input the initial triggering prompt into a pre-trained large language model for sampling to obtain initial sets of graph neural network configuration parameters; construct the corresponding graph neural networks from those parameter sets; train the networks on the training set and screen them on the test set; and iterate the above sampling, construction, training, and screening process until the maximum number of iterations is reached, obtaining the final optimal graph neural network.
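The sample-build-train-screen loop of module 430 can be outlined as follows. This is a hedged sketch of the control flow only: `llm_sample`, `build`, and `train_and_score` are hypothetical callbacks standing in for the LLM call, GNN construction, and training/evaluation described above.

```python
def search_optimal_gnn(llm_sample, build, train_and_score, max_iters=5):
    """Iterate: LLM proposes configs -> build -> train -> screen -> feed back."""
    history = []  # (config, score) pairs fed back into the next prompt
    for _ in range(max_iters):
        # Sample a batch of candidate configurations from the LLM,
        # conditioned on the results gathered so far.
        for cfg in llm_sample(history):
            model = build(cfg)               # construct the GNN
            score = train_and_score(model)   # train on train set, score on test set
            history.append((cfg, score))
    # Best configuration seen across all iterations.
    return max(history, key=lambda pair: pair[1])
```

Feeding `history` back into `llm_sample` is what lets the language model refine its proposals from one iteration to the next.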
In this embodiment, preferably, the graph neural network parameters in the prompt template design module 410 include: normalization operation, activation function, convolution layer operation, readout layer operation, number of loops, learning rate, hidden layer dimension, batch size, and ratio; each parameter has multiple options.
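A search space over these nine parameters might be written as a simple mapping from parameter name to its options. The option values below are illustrative assumptions only; the patent does not enumerate them.

```python
# Illustrative search space; every option value here is an assumption.
SEARCH_SPACE = {
    "normalization": ["batch_norm", "layer_norm", "none"],
    "activation":    ["relu", "elu", "tanh", "sigmoid"],
    "conv_layer":    ["GCNConv", "GATConv", "SAGEConv", "GINConv"],
    "readout":       ["mean", "sum", "max"],
    "num_loops":     [2, 3, 4, 5],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "hidden_dim":    [64, 128, 256],
    "batch_size":    [32, 64, 128],
    "ratio":         [0.0, 0.2, 0.5],
}
```

One configuration is then a choice of a single option per key, which is the format the large language model is asked to emit.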
In this embodiment, preferably, the prompt template in the prompt template design module 410 further includes a task description, a response format specification, and a search strategy. The task description states that the task is to train and construct a graph neural network with an optimal architecture from the parameters in the search space; the response format specification prescribes the parameter format of the large language model's output; and the search strategy gives the large language model a method for inferring the optimal graph neural network configuration.
In this embodiment, preferably, the search strategy in the prompt template design module 410 includes: in the exploration stage, i.e., when experimental data is lacking, focusing the search on configurations from the search space that have not yet appeared; in the exploitation stage, i.e., when experimental data is available, providing the common configurations of high-performance graph neural network architectures, their performance metrics, the mean of those metrics, and each architecture's deviation from the mean, then retaining what the high-performance configurations have in common while modifying the remaining parts of the configuration to generate new architectures.
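Putting the task description, response format specification, and explore/exploit search strategy together, a prompt template might look like the sketch below. The exact wording is a hypothetical stand-in, not the patent's template; `{search_space}` and `{history}` are placeholder fields filled in at run time.

```python
# Hypothetical prompt template; the patent does not disclose its exact wording.
PROMPT_TEMPLATE = """\
Task: construct a graph neural network with an optimal architecture by
choosing, for each parameter, one option from the search space below.
Search space: {search_space}
Response format: reply only with a JSON list of configuration objects,
one key per search-space parameter.
Search strategy:
- Exploration (no experimental data yet): prefer configurations from the
  search space that have not appeared before.
- Exploitation (experimental data available): the best configurations so
  far, their metrics, the metric mean, and each one's deviation from the
  mean are: {history}. Keep what the top configurations have in common
  and vary the remaining fields to generate new architectures.
"""

# Initial triggering prompt: no experiments have run yet, so history is empty.
prompt = PROMPT_TEMPLATE.format(search_space="{...}", history="[]")
```

On later iterations the same template is re-rendered with the accumulated experimental results substituted into `{history}`.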
In this embodiment, preferably, iterating the sampling, construction, training, and screening process in the optimal model construction module 430 to obtain the final optimal graph neural network includes: adopting a training scheme combining low and high precision, with the training rounds divided into low-precision, high-precision, and training to convergence; in the training stage on the training set, first obtaining low-precision graph neural network models through a small number of epochs and measuring their performance on the test set, then screening out the best-performing half of the models for further training over more epochs, and feeding the results back to the pre-trained large language model; the large language model then samples and outputs the better configuration parameters obtained by its reasoning; after multiple iterations, the best-performing models among all constructed graph neural networks are selected and trained until convergence, and the optimal graph neural network model is chosen according to the final performance.
For details of the system for constructing an optimal graph neural network architecture according to an embodiment of the present invention that are not elaborated here, please refer to the detailed description of the method embodiment above.
Another embodiment of the present invention further provides a method for predicting drug molecular properties based on a graph neural network, the method comprising:
obtaining a drug molecule graph dataset;
using the drug molecule graph dataset as the training set and test set, and obtaining a molecular property prediction model based on the optimal graph neural network by means of the architecture construction method of the above embodiment; and
inputting the molecular graph to be predicted into the prediction model based on the optimal graph neural network to obtain the predicted properties.
According to an embodiment of the present invention, a specific drug property prediction task and the corresponding dataset are selected for the execution pipeline. In this embodiment, the constructed optimal graph neural network supports any data format containing the two core fields of a drug molecule SMILES string and a target label; six sub-datasets of the MoleculeNet benchmark are provided in advance as examples.
As shown in FIG. 5, the constructed optimal graph neural network is applied to the drug property prediction task, predicting various properties of a drug from its SMILES string. The self-designed prompt template guides the large language model to reason over the search space and previous experimental results to obtain a better-performing GNN.
On the six sub-datasets used, the present invention is compared with five classic GNN models (GCN, GAT, GAT-V2, GraphSAGE, and GIN), and also with random sampling over the same search space. The comparison results are shown in Table 2.
Table 2
Using the same split on the same datasets, the present invention shows a clear advantage on most of them. It should be noted that because ESOL, Lipophilicity, and FreeSolv are regression tasks, performance on them is measured by RMSE (root mean square error), where smaller is better; Tox21, ToxCast, and BACE are classification tasks, so performance is measured by AUC (area under the ROC curve), where closer to 1 is better.
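The two evaluation metrics named above are standard and can be computed as follows. This is a minimal pure-Python sketch for reference; in practice a library implementation (e.g. from scikit-learn) would typically be used instead.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error: smaller is better (regression tasks)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def roc_auc(labels, scores):
    """Area under the ROC curve: the probability that a random positive
    is scored above a random negative (ties count half); closer to 1 is
    better (classification tasks)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A perfect classifier ranks every positive above every negative and scores an AUC of exactly 1.0, while random scoring gives about 0.5.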
Another embodiment of the present invention provides a drug molecular property prediction system based on a graph neural network; the system has program modules corresponding to the steps of the prediction method described in the above embodiment and executes those steps at runtime.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in those embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the technical solutions of the embodiments of the present invention.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411002376.6A CN118536542B (en) | 2024-07-25 | 2024-07-25 | A method and system for constructing an optimal architecture of a graph neural network |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN118536542A true CN118536542A (en) | 2024-08-23 |
| CN118536542B CN118536542B (en) | 2024-11-22 |
Family
ID=92381189
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202411002376.6A Active CN118536542B (en) | 2024-07-25 | 2024-07-25 | A method and system for constructing an optimal architecture of a graph neural network |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118536542B (en) |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| GB201821192D0 (en) * | 2018-12-24 | 2019-02-06 | Nanolayers Res Computing Limited | A computer-implemented method of training a graph neural network |
| WO2023124386A1 (en) * | 2021-12-29 | 2023-07-06 | 华为云计算技术有限公司 | Neural network architecture search method, apparatus and device, and storage medium |
| CN116861953A (en) * | 2023-07-04 | 2023-10-10 | 清华大学 | A search method, device, prediction method and product for graph neural network architecture |
| CN117216220A (en) * | 2023-09-25 | 2023-12-12 | 福建实达集团股份有限公司 | Use method and device of large language model |
| WO2024011475A1 (en) * | 2022-07-14 | 2024-01-18 | Robert Bosch Gmbh | Method and apparatus for graph neural architecture search under distribution shift |
| WO2024103609A1 (en) * | 2022-11-17 | 2024-05-23 | 苏州元脑智能科技有限公司 | Dialogue-model training method and apparatus, and dialogue response method and apparatus |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||