[go: up one dir, main page]

CN112784949A - Neural network architecture searching method and system based on evolutionary computation - Google Patents

Neural network architecture searching method and system based on evolutionary computation Download PDF

Info

Publication number
CN112784949A
CN112784949A CN202110120132.8A CN202110120132A CN112784949A CN 112784949 A CN112784949 A CN 112784949A CN 202110120132 A CN202110120132 A CN 202110120132A CN 112784949 A CN112784949 A CN 112784949A
Authority
CN
China
Prior art keywords
network
sub
module
pheromone
search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110120132.8A
Other languages
Chinese (zh)
Other versions
CN112784949B (en
Inventor
高明柯
杜欣军
赵�卓
逄涛
冒睿瑞
张浩博
郭威
王熠
刘晓娟
于楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 32 Research Institute
Original Assignee
CETC 32 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 32 Research Institute filed Critical CETC 32 Research Institute
Priority to CN202110120132.8A priority Critical patent/CN112784949B/en
Publication of CN112784949A publication Critical patent/CN112784949A/en
Application granted granted Critical
Publication of CN112784949B publication Critical patent/CN112784949B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/004Artificial life, i.e. computing arrangements simulating life
    • G06N3/006Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

本发明提供了一种基于进化计算的神经网络架构搜索方法及系统,包括:根据目标需求和平台要求,通过目标函数设置目标要求;根据设置的搜索空间大小,基于子网络模块集合随机生成N个无向有环图,作为进化寻优的网络搜索空间;在启发式信息引导下,结合信息素动态挥发和概率路径选择机制,通过蚁群算法在随机生成的N个无向有环图中搜索N条有向无环图寻优路径,构成候选集;通过训练和测试获取候选集中N条寻优路径的准确率和推理时延,选取最优结果作为当前最优网络结构;评估当前网络架构是否达到目标要求。本发明具有一定的应用灵活性和可扩展性,在资源受限的情况下获得在精度和速度之间实现良好权衡的神经网络模型。

Figure 202110120132

The invention provides a neural network architecture search method and system based on evolutionary computing, including: setting target requirements through an objective function according to target requirements and platform requirements; randomly generating N based on the set of sub-network modules according to the set size of the search space. Undirected cyclic graph, as a network search space for evolutionary optimization; under the guidance of heuristic information, combined with pheromone dynamic volatilization and probabilistic path selection mechanism, the ant colony algorithm is used to search randomly generated N undirected cyclic graphs N directed acyclic graph optimization paths form a candidate set; obtain the accuracy and inference delay of the N optimization paths in the candidate set through training and testing, and select the optimal result as the current optimal network structure; evaluate the current network architecture whether the target requirements are met. The invention has certain application flexibility and expansibility, and obtains a neural network model that achieves a good trade-off between accuracy and speed in the case of limited resources.

Figure 202110120132

Description

一种基于进化计算的神经网络架构搜索方法和系统A neural network architecture search method and system based on evolutionary computing

技术领域technical field

本发明涉及深度神经网络的架构设计及优化技术领域,具体地,涉及一种基于进化计算的神经网络架构搜索方法和系统。The invention relates to the technical field of architecture design and optimization of deep neural networks, in particular, to a method and system for searching neural network architectures based on evolutionary computing.

背景技术Background technique

深度学习对非结构化数据具有强大的自动特征提取功能,具有强大的自动表示能力,因此其在机器翻译、图像识别、语音识别、目标检测、自然语言处理等许多领域都取得了重大突破和进展。基于神经网络架构的设计对数据的特征表示和最终的性能的重要性,研究人员专注于设计各种复杂的神经网络架构,以获得良好的数据特征表示。但是,神经网络架构的设计在很大程度上依赖于研究人员的先验知识和经验,需要大量的时间和精力。而人类现有的先验知识和固定的思维范式在一定程度上难以发现更优的网络架构,初学者也很难根据自己的实际需要对网络架构进行合理的修改。因此,神经架构搜索(NeuralArchitecture Search,NAS)应运而生。NAS旨在有限的计算资源下,利用算法自动化设计具有最优性能的神经网络架构,尽可能减少人工的干预。通过使用强化学习方法得到的网络架构在图像分类任务上达到SOTA分类精度的研究被认为是NAS的开创性工作,也说明自动化网络架构设计思想是可行的。随后,利用演化学习来获得类似结果的大规模进化计算的研究工作再次验证了这一想法的可行性。NAS已迅速应用在目标检测、语义分割、对抗学习、建筑规模和多目标优化等方面。Deep learning has a powerful automatic feature extraction function for unstructured data and a powerful automatic representation ability, so it has made major breakthroughs and progress in many fields such as machine translation, image recognition, speech recognition, target detection, and natural language processing. . Based on the importance of the design of neural network architectures to the feature representation of the data and the final performance, researchers have focused on designing various complex neural network architectures to obtain good data feature representations. However, the design of neural network architectures heavily relies on researchers' prior knowledge and experience, which requires a lot of time and effort. However, with the existing prior knowledge and fixed thinking paradigm of human beings, it is difficult to find a better network architecture to a certain extent, and it is difficult for beginners to make reasonable modifications to the network architecture according to their actual needs. Therefore, Neural Architecture Search (NAS) came into being. NAS aims to use algorithms to automatically design neural network architectures with optimal performance under limited computing resources, minimizing manual intervention. The research that the network architecture obtained by using the reinforcement learning method to achieve SOTA classification accuracy on the image classification task is considered to be the pioneering work of NAS, and it also shows that the idea of automatic network architecture design is feasible. Subsequently, the research work of large-scale evolutionary computation using evolutionary learning to obtain similar results has once again verified the feasibility of this idea. NAS has been rapidly applied in object detection, semantic segmentation, adversarial learning, building scale, and multi-objective optimization.

由于NAS方法需要强大的算力支撑且需消耗庞大的计算量,因此,出现了重构搜索空间以减小搜索范围降低搜索复杂度以及通过参数共享、模型重用、梯度优化等策略加速网络架构的搜索以减少计算量的研究。早期的NAS在架构搜索阶段从无到有地训练每个候选网络架构,导致计算量激增;尽管采用参数共享策略来加快架构搜索的进程,但很可能导致候选架构的排名不准确,将使NAS难以从大量候选架构中选择最优的网络架构,从而进一步降低最终搜索的网络架构的性能。基于One-Shot的可微分神经网络架构搜索方法将搜索空间从离散放松到连续,从而可以使用梯度下降来同时搜索架构和学习权重,缩短了搜索时间,但当搜索轮数过大时,会导致搜索的架构中包含很多的跳跃连接,使得网络变得越浅。浅度网络可学习的参数更少,具有的表达能力更弱,导致网络性能急剧下降。改进的可微分神经网络架构搜索方法虽然采用早停机制直接控制跳跃连接的数量,但是早停的时机把控是一个重要的问题,过早的停止会导致架构搜索不全面。因此,如何在资源受限的条件下取得性能和效率之间的平衡是亟需解决的问题。Since the NAS method requires strong computing power and consumes a huge amount of computing power, there have been attempts to reconstruct the search space to reduce the search range, reduce the search complexity, and accelerate the network architecture through strategies such as parameter sharing, model reuse, and gradient optimization. Search for research that reduces computational effort. Early NAS trains each candidate network architecture from scratch in the architecture search phase, resulting in a surge in computation; although a parameter sharing strategy is adopted to speed up the process of architecture search, it is likely to result in inaccurate ranking of candidate architectures, which will make NAS It is difficult to select the optimal network architecture from a large number of candidate architectures, further reducing the performance of the final searched network architecture. The one-shot-based differentiable neural network architecture search method relaxes the search space from discrete to continuous, so that gradient descent can be used to search the architecture and learn weights at the same time, which shortens the search time, but when the number of search rounds is too large, it will lead to The searched architecture contains many skip connections, making the network shallower. Shallow networks have fewer parameters to learn and have weaker expressive power, resulting in a sharp drop in network performance. Although the improved differentiable neural network architecture search method uses the early stop mechanism to directly control the number of skip connections, the timing of early stop is an important issue, and premature stop will lead to incomplete architecture search. Therefore, how to achieve a balance between performance and efficiency under resource-constrained conditions is an urgent problem to be solved.

国内专利CN111353313A公开了一种基于进化神经网络架构搜索的情感分析模型构建方法,包括以下步骤:群初始化;以嵌入层为第一层,封装数个卷积层单元、数个池化单元和数个全连接单元,并以全连接单元结尾,随机产生M条染色体;采用准确率作为适用度函数进行适应度评估;采用轮盘赌法选择数个染色体个体,组成第一染色体种群;采用不等长染色体交叉方法对第一染色体种群的染色体个体进行两两交叉,得到数个染色体个体,组成第二染色体种群;对第二染色体种群的染色体个体的某一卷积层单元或池化单元或全连接单元进行添加或替换或删除;计算第二染色体种群的染色体个体的适应度,直至到达预设的迭代次数,并采用适应度选出最优的神经网络结构的染色体个体。Domestic patent CN111353313A discloses a sentiment analysis model construction method based on evolutionary neural network architecture search, including the following steps: group initialization; taking the embedding layer as the first layer, encapsulating several convolutional layer units, several pooling units and data. A fully connected unit ends with a fully connected unit, and M chromosomes are randomly generated; the fitness is evaluated by using the accuracy rate as the fitness function; the roulette method is used to select several chromosome individuals to form the first chromosome population; The long chromosome crossover method performs pairwise crossover of the chromosome individuals of the first chromosome population to obtain several chromosome individuals to form the second chromosome population; The connection units are added, replaced or deleted; the fitness of the chromosome individuals of the second chromosome population is calculated until the preset number of iterations is reached, and the chromosome individuals with the optimal neural network structure are selected by the fitness.

国内专利CN111144555A公开了一种基于改进进化算法的循环神经网络架构搜索方法、系统及介质,本发明方法包括训练多个循环神经网络子模型来更新共享权重;初始化产生种群和用于记录所有循环神经网络模型性能的历史记录表;从种群中随机采样产生样本,选取样本最优模型进行变异操作,以指定概率移除种群中最老或最差模型,将变异后的子节点加入种群和历史记录表;判断是否满足预设的结束条件,如果不满足则继续进行样本变异,否则输出历史记录表中的最优模型。本发明能够加快对循环神经网络架构进行搜索的过程,每个步骤更新种群时同时考虑到性能和搜索时间,能够极大地提高循环神经网络架构搜索的效率。Domestic patent CN111144555A discloses a cyclic neural network architecture search method, system and medium based on an improved evolutionary algorithm. The method of the invention includes training a plurality of cyclic neural network sub-models to update shared weights; The history record table of network model performance; randomly sample from the population to generate samples, select the optimal model of the sample for mutation operation, remove the oldest or worst model in the population with a specified probability, and add the mutated child nodes to the population and history records Table; judge whether the preset end conditions are met, if not, continue to perform sample mutation, otherwise output the optimal model in the history table. The invention can speed up the process of searching for the cyclic neural network architecture, and simultaneously consider performance and search time when updating the population in each step, and can greatly improve the efficiency of the cyclic neural network architecture searching.

国内专利CN110728355A公开了神经网络架构搜索方法、装置、计算机设备及存储介质,涉及深度学习技术领域,其中方法可包括:将神经网络架构划分为M个子结构,M为大于1的正整数;分别搜索各子结构中的拓扑结构;通过将各子结构中的拓扑结构相连得到神经网络架构,可提升搜索速度。The domestic patent CN110728355A discloses a neural network architecture search method, device, computer equipment and storage medium, and relates to the technical field of deep learning, wherein the method may include: dividing the neural network architecture into M substructures, where M is a positive integer greater than 1; The topology in each substructure; the neural network architecture is obtained by connecting the topology in each substructure, which can improve the search speed.

国内专利CN110232434A公开了一种基于属性图优化的神经网络架构评估方法,将神经网络架构建模为属性图,构建贝叶斯图神经网络代理模型;通过随机生成、训练以及测试一组神经网络架构,将这组神经网络架构以及测试对应的性能指标作为初始训练集,训练集用于训练贝叶斯图神经网络代理模型;根据当前的训练集,通过进化算法生成新的神经网络候选集并训练贝叶斯图神经网络代理模型;并通过最大化采集函数从神经网络候选集中选择一个潜在个体,然后对该个体进行训练、测试,并将其以及测试对应的性能指标添加到当前的训练集中;在固定成本的约束下,重复上述步骤直到在当前的训练集中得到最好的神经网络架构以及架构对应的权重。The domestic patent CN110232434A discloses a neural network architecture evaluation method based on attribute graph optimization. The neural network architecture is modeled as an attribute graph, and a Bayesian graph neural network proxy model is constructed; a set of neural network architectures are randomly generated, trained and tested. , take this set of neural network architecture and the performance indicators corresponding to the test as the initial training set, and the training set is used to train the Bayesian graph neural network proxy model; according to the current training set, a new neural network candidate set is generated and trained by the evolutionary algorithm Bayesian graph neural network proxy model; and select a potential individual from the neural network candidate set by maximizing the acquisition function, then train and test the individual, and add it and the performance indicators corresponding to the test to the current training set; Under the constraint of fixed cost, the above steps are repeated until the best neural network architecture and the corresponding weights of the architecture are obtained in the current training set.

国外专利JP2020522035A公开了一种用于确定神经网络结构的方法、系统和装置。方法包括使用具有控制器参数的控制器神经网络来根据控制器参数的当前值生成一批输出序列。在批处理中输出序列生成子卷积神经网络(CNN)的实例,包括具有由输出序列定义的体系结构的第一卷积单元的多个实例;训练子CNN的实例执行图像处理任务,并评估子CNS训练的实例的性能,用于确定子CNN训练实例的性能度量,并包括使用训练后的CNN训练实例的性能度量来调整控制器神经网络的控制器参数的当前值。The foreign patent JP2020522035A discloses a method, system and device for determining the structure of a neural network. The method includes using a controller neural network with controller parameters to generate a batch of output sequences based on current values of the controller parameters. Output sequences in batches generate instances of a subconvolutional neural network (CNN), including multiple instances of the first convolutional unit with an architecture defined by the output sequence; train instances of the subCNN to perform image processing tasks, and evaluate The performance of the sub-CNN training instance, which is used to determine the performance metric of the sub-CNN training instance, and includes using the performance metric of the trained CNN training instance to adjust the current values of the controller parameters of the controller neural network.

国外专利WO2018081563A1公开了一种确定神经网络体系结构的方法、系统和设备。方法包括:使用控制器神经网络生成一批输出序列,该批中的每个输出序列是用于执行特定神经网络任务的子神经网络的体系结构;对于批中的每个输出序列:通过各自的子神经网络实例训练由输出序列定义的体系结构;评估子神经网络训练实例在特定神经网络任务上的性能,以确定子神经网络训练实例在特定神经网络任务上的性能度量;并使用子神经网络训练实例的性能度量来调整控制器神经网络控制器参数的当前值。Foreign patent WO2018081563A1 discloses a method, system and device for determining the architecture of a neural network. The method includes: using a controller neural network to generate a batch of output sequences, each output sequence in the batch is the architecture of a sub-neural network used to perform a specific neural network task; for each output sequence in the batch: through the respective The sub-neural network instance trains the architecture defined by the output sequence; evaluates the performance of the sub-neural network training instance on a specific neural network task to determine a measure of the performance of the sub-neural network training instance on the specific neural network task; and uses the sub-neural network A performance measure of the training instance to adjust the current values of the controller neural network controller parameters.

发明内容SUMMARY OF THE INVENTION

针对现有技术中的缺陷,本发明的目的是提供一种基于进化计算的神经网络架构搜索方法及系统。In view of the defects in the prior art, the purpose of the present invention is to provide a neural network architecture search method and system based on evolutionary computation.

根据本发明提供的一种基于进化计算的神经网络架构搜索方法,包括:According to a kind of neural network architecture search method based on evolutionary computation provided by the present invention, comprising:

步骤S1:根据目标需求和平台要求,通过目标函数设置目标要求,目标要求包括:期望的准确度、推理时延、搜索空间大小和进化次数;Step S1: According to the target requirements and platform requirements, the target requirements are set through the target function, and the target requirements include: expected accuracy, inference delay, search space size and evolution times;

步骤S2:根据设置的搜索空间大小,基于子网络模块集合随机生成N个无向有环图,作为进化寻优的网络搜索空间;Step S2: According to the set size of the search space, randomly generate N undirected cyclic graphs based on the set of sub-network modules, as the network search space for evolutionary optimization;

步骤S3:在启发式信息引导下,结合信息素动态挥发和概率路径选择机制,通过蚁群算法在随机生成的N个无向有环图中搜索N条有向无环图寻优路径,构成候选集;Step S3: Under the guidance of heuristic information, combined with the dynamic volatilization of pheromone and the probability path selection mechanism, the ant colony algorithm is used to search for N directed acyclic graph optimization paths in the randomly generated N undirected acyclic graphs to form candidate set;

步骤S4:通过训练和测试获取候选集中N条寻优路径的准确率和推理时延,选取最优结果作为当前最优网络结构;Step S4: Obtain the accuracy and inference delay of the N optimization paths in the candidate set through training and testing, and select the optimal result as the current optimal network structure;

步骤S5:评估当前网络架构是否达到目标要求,当当前网络架构没有达到预设目标要求,当满足速度和精度要求时,则暂停寻优结果,基于当前最优网络架构的所有子网络模块内部结构实时演化突变,继续迭代直至达到预设目标要求;当满足预设目标要求时,则输出当前最优网络结果,否则退出搜索过程,输出搜索异常。Step S5: Evaluate whether the current network architecture meets the target requirements. When the current network architecture does not meet the preset target requirements, and when the speed and accuracy requirements are met, the optimization result is suspended, and the internal structure of all sub-network modules based on the current optimal network architecture Evolve mutation in real time, and continue to iterate until the preset target requirements are met; when the preset target requirements are met, the current optimal network result is output, otherwise, the search process is exited and the search exception is output.

优选地,所述步骤S1包括:Preferably, the step S1 includes:

所述目标函数公式如下:The objective function formula is as follows:

Figure BDA0002921713920000041
Figure BDA0002921713920000041

s.t.LAT(Net)≤T&ACC(Net)≥As.t.LAT(Net)≤T&ACC(Net)≥A

其中,目标函数定义为一个多目标搜索;Net表示通过进化算法得到的网络;ACC(Net)表示网络的准确率;LAT(Net)表示网络的推理时延;T表示期望的推理时延;A表示期望的准确率;所述期望的准确率根据预设的目标需求设定;所述期望的推理时延根据移动、嵌入式或通用平台类型设定。Among them, the objective function is defined as a multi-objective search; Net represents the network obtained by the evolutionary algorithm; ACC(Net) represents the accuracy of the network; LAT(Net) represents the inference delay of the network; T represents the expected inference delay; A Indicates the expected accuracy rate; the expected accuracy rate is set according to the preset target requirements; the expected inference delay is set according to the mobile, embedded or general platform type.

优选地,所述步骤S2包括:Preferably, the step S2 includes:

所述子网络模块是无向有环图中的结点;子网络模块包括多种类型且具有M个层的子网络,子网络类型可根据目标需求和平台要求扩展子网络模块的类型并进行选择;The sub-network module is a node in an undirected cyclic graph; the sub-network module includes multiple types of sub-networks with M layers, and the sub-network type can expand the type of the sub-network module according to the target requirements and platform requirements and carry out choose;

子网络的结构包括:多卷积层、ResNet块、深度可分离卷积、具有线性瓶颈的倒残差结构和基于压缩-激发结构的轻量级注意力结构。The structure of the sub-network includes: multi-convolutional layers, ResNet blocks, depthwise separable convolution, inverted residual structure with linear bottleneck, and lightweight attention structure based on compression-excitation structure.

优选地,所述网络搜索空间包括:所述网络搜索空间包括N个搜索子空间;Preferably, the network search space includes: the network search space includes N search subspaces;

Figure BDA0002921713920000042
Figure BDA0002921713920000042

Figure BDA0002921713920000043
Figure BDA0002921713920000043

其中,

Figure BDA0002921713920000044
表示第i代子网络模块集合,
Figure BDA0002921713920000045
表示第i代中第j个子网络模块,
Figure BDA0002921713920000046
表示第i代搜索空间的边集合,子网络模块之间通过边连接;
Figure BDA0002921713920000047
表示第i代中第j个子网络模块和第k个子网络模块之间的边;
Figure BDA0002921713920000048
表示第i代第n个搜索空间;i表示迭代序号。in,
Figure BDA0002921713920000044
represents the set of i-th generation sub-network modules,
Figure BDA0002921713920000045
represents the jth subnetwork module in the ith generation,
Figure BDA0002921713920000046
Represents the edge set of the i-th generation search space, and the sub-network modules are connected by edges;
Figure BDA0002921713920000047
represents the edge between the jth subnetwork module and the kth subnetwork module in the ith generation;
Figure BDA0002921713920000048
Represents the n-th search space of the i-th generation; i represents the iteration number.

优选地,所述步骤S3包括:Preferably, the step S3 includes:

步骤S3.1:在每一个搜索子空间中任选一点作为起点,选取距离起点最远的结点为终点,初始化蚂蚁数量、信息素强度常数和循环次数;Step S3.1: select a point in each search subspace as the starting point, select the node farthest from the starting point as the ending point, and initialize the number of ants, the constant of pheromone intensity and the number of cycles;

步骤S3.2:启发信息计算;Step S3.2: heuristic information calculation;

Figure BDA0002921713920000051
Figure BDA0002921713920000051

其中,ηI,J(t)表示t时刻从结点I到结点J的启发信息;DepI,J、WigI,J、ConI,J、FilI,J分别表示结点J的深度、宽度、连接度和滤波器数量,ω表示激励因子,0≤ω≤1;奖励机制定义为激励当前最优的网络架构中所有结点;ω值越小,启发式信息η越大,初始网络搜索空间中由于未产生进化,则ω初始值设置为1;Among them, η I,J (t) represents the heuristic information from node I to node J at time t; Dep I,J , Wig I,J , Con I,J , Fil I,J represent the depth of node J respectively , width, degree of connectivity and number of filters, ω represents the incentive factor, 0≤ω≤1; the reward mechanism is defined to motivate all nodes in the current optimal network architecture; the smaller the value of ω, the greater the heuristic information η, the initial Since there is no evolution in the network search space, the initial value of ω is set to 1;

步骤S3.3:概率路径选择;Step S3.3: Probabilistic path selection;

Figure BDA0002921713920000052
Figure BDA0002921713920000052

其中,

Figure BDA0002921713920000053
表示第t时刻蚂蚁m由点I移动到点J的概率;allowedm表示蚂蚁下一步可以选择的节点;α表示信息素启发因子,表示路径上残留的信息素在寻优过程中的作用,值越大则说明蚂蚁之间的协作能力越强,倾向于选择其它蚂蚁经过的路径;β表示期望值启发因子,表明准确性和推理时延在蚂蚁选择路径时的受重视程度,值越大,状态转移规律越接近贪心规则;τI,J(t)表示t时刻点I到点J路径上的信息素;τI,S(t)表示点I到allowedm中任一点路径上的信息素;
Figure BDA0002921713920000054
表示I到allowedm中任一点路径上的启发信息;in,
Figure BDA0002921713920000053
Represents the probability that the ant m moves from point I to point J at the t-th time; allowed m represents the node that the ant can choose next; α represents the pheromone heuristic factor, which represents the role of the remaining pheromone on the path in the optimization process, value The larger the value, the stronger the cooperation ability between ants and the tendency to choose the path passed by other ants; β represents the expected value heuristic factor, which indicates the importance of accuracy and inference delay when the ants choose the path. The transition rule is closer to the greedy rule; τ I, J (t) represents the pheromone on the path from point I to point J at time t; τ I, S (t) represents the pheromone on the path from point I to any point in allowed m ;
Figure BDA0002921713920000054
Indicates heuristic information on the path from I to allowed m at any point;

步骤S3.4:信息素动态挥发;Step S3.4: dynamic volatilization of pheromone;

Figure BDA0002921713920000055
Figure BDA0002921713920000055

其中,ρI,J(t)表示t时刻在I到J路径上的挥发系数;ηI,J(t)表示t时刻I到J路径上的启发信息;

Figure BDA0002921713920000056
ηi表示所有的启发信息,L表示当前网络下的结点总数;Among them, ρ I, J (t) represents the volatility coefficient on the path from I to J at time t; η I, J (t) represents the heuristic information on the path from I to J at time t;
Figure BDA0002921713920000056
η i represents all the heuristic information, L represents the total number of nodes under the current network;

步骤S3.5:信息素增量计算;Step S3.5: pheromone increment calculation;

Figure BDA0002921713920000057
Figure BDA0002921713920000057

其中,Q为信息素强度常量,是蚂蚁在一个循环中在所经过的路径上释放的信息素的总量;ηm表示第m只蚂蚁在此次循环中所受到的启发信息总量;Among them, Q is the constant of pheromone intensity, which is the total amount of pheromone released by the ants on the path traversed in a cycle; η m represents the total amount of enlightenment information received by the mth ant in this cycle;

步骤S3.6:信息素更新;Step S3.6: pheromone update;

τI,J(t+1)=(1-ρ)τI,J(t)+ΔτI,J(t,t+1)τ I, J (t+1)=(1-ρ)τ I, J (t)+Δτ I, J (t,t+1)

Figure BDA0002921713920000061
Figure BDA0002921713920000061

其中,ρ是信息素动态挥发系数;

Figure BDA0002921713920000062
表示第m只蚂蚁在本次循环中在路径(I,J)上留下的信息素增量,ΔτI,J(t,t+1)表示本次循环当中所有经过路径(I,J)的蚂蚁留下的信息素增量;K表示在本次循环中所有经过路径(I,J)的蚂蚁总数;where ρ is the dynamic volatility coefficient of pheromone;
Figure BDA0002921713920000062
Represents the pheromone increment left by the mth ant on the path (I, J) in this cycle, Δτ I, J (t, t+1) represents all the paths (I, J) passed in this cycle The pheromone increment left by the ants; K represents the total number of ants passing through the path (I, J) in this cycle;

步骤S3.7:寻优结束判断:当所有搜索子空间的寻优均达到最大循环次数则退出循环,输出所有搜索子空间寻优结果作为当代候选集;否则重复执行步骤S3.2至步骤S3.7,直至达到最大循环次数。Step S3.7: Judgment of the end of optimization: when the optimization of all search subspaces reaches the maximum number of cycles, exit the cycle, and output the optimization results of all search subspaces as the contemporary candidate set; otherwise, repeat steps S3.2 to S3 .7 until the maximum number of cycles is reached.

优选地,所述步骤S5中演化突变包括:将当前最优网络架构中所有子网络模块中的激励因子ω设置为常数,且0≤ω≤1;同时在突变集合中随机选取变异操作,进化子网络模块内部结构,生成下一代子网络模块,重复执行步骤S2至步骤S5,直至达到预设目标要求。Preferably, the evolution mutation in the step S5 includes: setting the excitation factor ω in all sub-network modules in the current optimal network architecture to be constant, and 0≤ω≤1; The internal structure of the sub-network module generates the next-generation sub-network module, and steps S2 to S5 are repeatedly executed until the preset target requirements are met.

根据本发明提供的一种基于进化计算的神经网络架构搜索系统,包括:A neural network architecture search system based on evolutionary computing provided by the present invention, comprising:

模块M1:根据目标需求和平台要求,通过目标函数设置目标要求,目标要求包括:期望的准确度、推理时延、搜索空间大小和进化次数;Module M1: According to the target requirements and platform requirements, the target requirements are set through the target function, and the target requirements include: expected accuracy, inference delay, search space size and evolution times;

模块M2:根据设置的搜索空间大小,基于子网络模块集合随机生成N个无向有环图,作为进化寻优的网络搜索空间;Module M2: According to the set size of the search space, N undirected cyclic graphs are randomly generated based on the set of sub-network modules as the network search space for evolutionary optimization;

模块M3:在启发式信息引导下,结合信息素动态挥发和概率路径选择机制,通过蚁群算法在随机生成的N个无向有环图中搜索N条有向无环图寻优路径,构成候选集;Module M3: Under the guidance of heuristic information, combined with the dynamic volatilization of pheromone and the probability path selection mechanism, the ant colony algorithm is used to search for N directed acyclic graphs in randomly generated N undirected acyclic graphs. candidate set;

模块M4:通过训练和测试获取候选集中N条寻优路径的准确率和推理时延,选取最优结果作为当前最优网络结构;Module M4: Obtain the accuracy and inference delay of N optimization paths in the candidate set through training and testing, and select the optimal result as the current optimal network structure;

模块M5:评估当前网络架构是否达到目标要求,当当前网络架构没有达到预设目标要求,当满足速度和精度要求时,则暂停寻优结果,基于当前最优网络架构的所有子网络模块内部结构实时演化突变,继续迭代直至达到预设目标要求;当满足预设目标要求时,则输出当前最优网络结果,否则退出搜索过程,输出搜索异常。Module M5: Evaluate whether the current network architecture meets the target requirements. When the current network architecture does not meet the preset target requirements, when the speed and accuracy requirements are met, the optimization result is suspended, and the internal structure of all sub-network modules based on the current optimal network architecture Evolve mutation in real time, and continue to iterate until the preset target requirements are met; when the preset target requirements are met, the current optimal network result is output, otherwise, the search process is exited and the search exception is output.

优选地,所述模块M1包括:Preferably, the module M1 includes:

所述目标函数公式如下:The objective function formula is as follows:

Figure BDA0002921713920000071
Figure BDA0002921713920000071

s.t.LAT(Net)≤T&ACC(Net)≥As.t.LAT(Net)≤T&ACC(Net)≥A

其中,目标函数定义为一个多目标搜索;Net表示通过进化算法得到的网络;ACC(Net)表示网络的准确率;LAT(Net)表示网络的推理时延;T表示期望的推理时延;A表示期望的准确率;所述期望的准确率根据预设的目标需求设定;所述期望的推理时延根据移动、嵌入式或通用平台类型设定。Among them, the objective function is defined as a multi-objective search; Net represents the network obtained by the evolutionary algorithm; ACC(Net) represents the accuracy of the network; LAT(Net) represents the inference delay of the network; T represents the expected inference delay; A Indicates the expected accuracy rate; the expected accuracy rate is set according to the preset target requirements; the expected inference delay is set according to the mobile, embedded or general platform type.

优选地,所述模块M2包括:Preferably, the module M2 includes:

所述子网络模块是无向有环图中的结点;子网络模块包括多种类型且具有M个层的子网络,子网络类型可根据目标需求和平台要求扩展子网络模块的类型并进行选择;The sub-network module is a node in an undirected cyclic graph; the sub-network module includes multiple types of sub-networks with M layers, and the sub-network type can expand the type of the sub-network module according to the target requirements and platform requirements and carry out choose;

子网络的结构包括:多卷积层、ResNet块、深度可分离卷积、具有线性瓶颈的倒残差结构和基于压缩-激发结构的轻量级注意力结构;The structure of the sub-network includes: multiple convolutional layers, ResNet blocks, depthwise separable convolution, inverted residual structure with linear bottleneck, and lightweight attention structure based on compression-excitation structure;

所述网络搜索空间包括:所述网络搜索空间包括N个搜索子空间;The network search space includes: the network search space includes N search subspaces;

Figure BDA0002921713920000072
Figure BDA0002921713920000072

Figure BDA0002921713920000073
Figure BDA0002921713920000073

其中,

Figure BDA0002921713920000074
表示第i代子网络模块集合,
Figure BDA0002921713920000075
表示第i代中第j个子网络模块,
Figure BDA0002921713920000076
表示第i代搜索空间的边集合,子网络模块之间通过边连接;
Figure BDA0002921713920000077
表示第i代中第j个子网络模块和第k个子网络模块之间的边;
Figure BDA0002921713920000078
表示第i代第n个搜索空间;i表示迭代序号。in,
Figure BDA0002921713920000074
represents the set of i-th generation sub-network modules,
Figure BDA0002921713920000075
represents the jth subnetwork module in the ith generation,
Figure BDA0002921713920000076
Represents the edge set of the i-th generation search space, and the sub-network modules are connected by edges;
Figure BDA0002921713920000077
represents the edge between the jth subnetwork module and the kth subnetwork module in the ith generation;
Figure BDA0002921713920000078
Represents the n-th search space of the i-th generation; i represents the iteration number.

优选地,所述模块M3包括:Preferably, the module M3 includes:

模块M3.1:在每一个搜索子空间中任选一点作为起点,选取距离起点最远的结点为终点,初始化蚂蚁数量、信息素强度常数和循环次数;Module M3.1: Select a point in each search subspace as the starting point, select the node farthest from the starting point as the end point, and initialize the number of ants, the constant of pheromone intensity and the number of cycles;

模块M3.2:启发信息计算;Module M3.2: Heuristic information calculation;

Figure BDA0002921713920000079
Figure BDA0002921713920000079

其中,ηI,J(t)表示t时刻从结点I到结点J的启发信息;DepI,J、WigI,J、ConI,J、FilI,J分别表示结点J的深度、宽度、连接度和滤波器数量,ω表示激励因子,0≤ω≤1;奖励机制定义为激励当前最优的网络架构中所有结点;ω值越小,启发式信息η越大,初始网络搜索空间中由于未产生进化,则ω初始值设置为1;Among them, η I,J (t) represents the heuristic information from node I to node J at time t; Dep I,J , Wig I,J , Con I,J , Fil I,J represent the depth of node J respectively , width, degree of connectivity and number of filters, ω represents the incentive factor, 0≤ω≤1; the reward mechanism is defined to motivate all nodes in the current optimal network architecture; the smaller the value of ω, the greater the heuristic information η, the initial Since there is no evolution in the network search space, the initial value of ω is set to 1;

模块M3.3:概率路径选择;Module M3.3: Probabilistic Path Selection;

Figure BDA0002921713920000081
Figure BDA0002921713920000081

其中,

Figure BDA0002921713920000082
表示第t时刻蚂蚁m由点I移动到点J的概率;allowedm表示蚂蚁下一步可以选择的节点;α表示信息素启发因子,表示路径上残留的信息素在寻优过程中的作用,值越大则说明蚂蚁之间的协作能力越强,倾向于选择其它蚂蚁经过的路径;β表示期望值启发因子,表明准确性和推理时延在蚂蚁选择路径时的受重视程度,值越大,状态转移规律越接近贪心规则;τI,J(t)表示t时刻点I到点J路径上的信息素;τI,S(t)表示点I到allowedm中任一点路径上的信息素;
Figure BDA0002921713920000083
表示I到allowedm中任一点路径上的启发信息;in,
Figure BDA0002921713920000082
Represents the probability that the ant m moves from point I to point J at the t-th time; allowed m represents the node that the ant can choose next; α represents the pheromone heuristic factor, which represents the role of the remaining pheromone on the path in the optimization process, value The larger the value, the stronger the cooperation ability between ants and the tendency to choose the path passed by other ants; β represents the expected value heuristic factor, which indicates the importance of accuracy and inference delay when the ants choose the path. The transition rule is closer to the greedy rule; τ I, J (t) represents the pheromone on the path from point I to point J at time t; τ I, S (t) represents the pheromone on the path from point I to any point in allowed m ;
Figure BDA0002921713920000083
Indicates heuristic information on the path from I to allowed m at any point;

模块M3.4:信息素动态挥发;Module M3.4: dynamic volatilization of pheromone;

Figure BDA0002921713920000084
Figure BDA0002921713920000084

其中,ρI,J(t)表示t时刻在I到J路径上的挥发系数;ηI,J(t)表示t时刻I到J路径上的启发信息;

Figure BDA0002921713920000085
ηi表示所有的启发信息,L表示当前网络下的结点总数;Among them, ρ I, J (t) represents the volatility coefficient on the path from I to J at time t; η I, J (t) represents the heuristic information on the path from I to J at time t;
Figure BDA0002921713920000085
η i represents all the heuristic information, L represents the total number of nodes under the current network;

模块M3.5:信息素增量计算;Module M3.5: pheromone incremental calculation;

Figure BDA0002921713920000086
Figure BDA0002921713920000086

其中,Q为信息素强度常量,是蚂蚁在一个循环中在所经过的路径上释放的信息素的总量;ηm表示第m只蚂蚁在此次循环中所受到的启发信息总量;Among them, Q is the constant of pheromone intensity, which is the total amount of pheromone released by the ants on the path traversed in a cycle; η m represents the total amount of enlightenment information received by the mth ant in this cycle;

模块M3.6:信息素更新;Module M3.6: pheromone update;

τI,J(t+1)=(1-ρ)τI,J(t)+ΔτI,J(t,t+1)τ I, J (t+1)=(1-ρ)τ I, J (t)+Δτ I, J (t, t+1)

Figure BDA0002921713920000087
Figure BDA0002921713920000087

其中,ρ是信息素动态挥发系数;

Figure BDA0002921713920000088
表示第m只蚂蚁在本次循环中在路径(I,J)上留下的信息素增量,ΔτI,J(t,t+1)表示本次循环当中所有经过路径(I,J)的蚂蚁留下的信息素增量;K表示在本次循环中所有经过路径(I,J)的蚂蚁总数;where ρ is the dynamic volatility coefficient of pheromone;
Figure BDA0002921713920000088
Represents the pheromone increment left by the mth ant on the path (I, J) in this cycle, Δτ I, J (t, t+1) represents all the paths (I, J) passed in this cycle The pheromone increment left by the ants; K represents the total number of ants passing through the path (I, J) in this cycle;

模块M3.7:寻优结束判断:当所有搜索子空间的寻优均达到最大循环次数则退出循环,输出所有搜索子空间寻优结果作为当代候选集;否则重复触发模块M3.2至模块M3.7执行,直至达到最大循环次数;Module M3.7: Judgment of the end of optimization: when the optimization of all search subspaces reaches the maximum number of cycles, exit the loop, and output the optimization results of all search subspaces as the contemporary candidate set; otherwise, repeat triggering modules M3.2 to M3 .7 execute until the maximum number of cycles is reached;

所述模块M5中演化突变包括:将当前最优网络架构中所有子网络模块中的激励因子ω设置为常数,且0≤ω≤1;同时在突变集合中随机选取变异操作,进化子网络模块内部结构,生成下一代子网络模块,重复触发模块M2至模块M5执行,直至达到预设目标要求。The evolution mutation in the module M5 includes: setting the excitation factor ω in all sub-network modules in the current optimal network architecture to a constant, and 0≤ω≤1; at the same time randomly selecting mutation operations in the mutation set, and evolving the sub-network modules The internal structure generates the next-generation sub-network module, and repeatedly triggers the execution of modules M2 to M5 until the preset target requirements are met.

与现有技术相比,本发明具有如下的有益效果:Compared with the prior art, the present invention has the following beneficial effects:

1、本发明通过模块化图论构建思想,构建以子网络模块为基本构件的系统级搜索子空间,可有效降低搜索空间复杂度,并以模块级的搜索方式加快架构搜索进程,提高搜索性能;利用蚁群算法的启发式优化能力,可从全局角度防止架构搜索陷入局部最优而导致网络性能下降;融合奖励和突变演化机制,通过结构级的方式尽可能全面进化,可避免架构搜索不全面。1. The present invention constructs a system-level search subspace with a sub-network module as the basic component through the construction idea of modular graph theory, which can effectively reduce the complexity of the search space, and speed up the architecture search process with a module-level search method, and improve the search performance. ;Using the heuristic optimization ability of the ant colony algorithm, it can prevent the architecture search from falling into a local optimum from a global perspective and lead to the degradation of network performance; Integrate the reward and mutation evolution mechanism, and evolve as comprehensively as possible through the structure-level method, which can avoid the architecture search. comprehensive.

2、本发明可根据应用的实际需求和平台要求,通过目标函数的设定期望目标,不受应用平台限制,同时鼓励整个网络的模块内结构的多样性,具有一定的应用灵活性和可扩展性,可在资源受限的情况下获得在精度和速度之间实现良好权衡的神经网络模型。2. The present invention can set the desired target through the objective function according to the actual needs of the application and the platform requirements, and is not limited by the application platform, and at the same time encourages the diversity of the structure of the modules of the entire network, and has certain application flexibility and scalability. properties, a neural network model that achieves a good trade-off between accuracy and speed can be obtained with limited resources.

附图说明Description of drawings

通过阅读参照以下附图对非限制性实施例所作的详细描述,本发明的其它特征、目的和优点将会变得更明显:Other features, objects and advantages of the present invention will become more apparent by reading the detailed description of non-limiting embodiments with reference to the following drawings:

图1为一种基于进化计算的神经网络架构搜索方法流程图。Figure 1 is a flowchart of a neural network architecture search method based on evolutionary computing.

具体实施方式Detailed ways

下面结合具体实施例对本发明进行详细说明。以下实施例将有助于本领域的技术人员进一步理解本发明,但不以任何形式限制本发明。应当指出的是,对本领域的普通技术人员来说,在不脱离本发明构思的前提下,还可以做出若干变化和改进。这些都属于本发明的保护范围。The present invention will be described in detail below with reference to specific embodiments. The following examples will help those skilled in the art to further understand the present invention, but do not limit the present invention in any form. It should be noted that, for those skilled in the art, several changes and improvements can be made without departing from the inventive concept. These all belong to the protection scope of the present invention.

实施例1Example 1

根据本发明提供的一种基于进化计算的神经网络架构搜索方法,包括:According to a kind of neural network architecture search method based on evolutionary computation provided by the present invention, comprising:

步骤S1:根据目标需求和平台要求,通过目标函数设置目标要求,目标要求包括:期望的准确度、推理时延、搜索空间大小和进化次数;Step S1: According to the target requirements and platform requirements, the target requirements are set through the target function, and the target requirements include: expected accuracy, inference delay, search space size and evolution times;

步骤S2:根据设置的搜索空间大小,基于子网络模块集合随机生成N个无向有环图,作为进化寻优的网络搜索空间;Step S2: According to the set size of the search space, randomly generate N undirected cyclic graphs based on the set of sub-network modules, as the network search space for evolutionary optimization;

步骤S3:在启发式信息引导下,结合信息素动态挥发和概率路径选择机制,通过蚁群算法在随机生成的N个无向有环图中搜索N条有向无环图寻优路径,构成候选集;Step S3: Under the guidance of heuristic information, combined with the dynamic volatilization of pheromone and the probability path selection mechanism, the ant colony algorithm is used to search for N directed acyclic graph optimization paths in the randomly generated N undirected acyclic graphs to form candidate set;

步骤S4:通过训练和测试获取候选集中N条寻优路径的准确率和推理时延,选取最优结果作为当前最优网络结构;Step S4: Obtain the accuracy and inference delay of the N optimization paths in the candidate set through training and testing, and select the optimal result as the current optimal network structure;

步骤S5:评估当前网络架构是否达到目标要求,当当前网络架构没有达到预设目标要求,当满足速度和精度要求时,则暂停寻优结果,基于当前最优网络架构的所有子网络模块内部结构实时演化突变,继续迭代直至达到预设目标要求;当满足预设目标要求时,则输出当前最优网络结果,否则退出搜索过程,输出搜索异常。Step S5: Evaluate whether the current network architecture meets the target requirements. When the current network architecture does not meet the preset target requirements, and when the speed and accuracy requirements are met, the optimization result is suspended, and the internal structure of all sub-network modules based on the current optimal network architecture Evolve mutation in real time, and continue to iterate until the preset target requirements are met; when the preset target requirements are met, the current optimal network result is output, otherwise, the search process is exited and the search exception is output.

具体地,所述步骤S1包括:Specifically, the step S1 includes:

所述目标函数公式如下:The objective function formula is as follows:

Figure BDA0002921713920000101
Figure BDA0002921713920000101

s.t.LAT(Net)≤T&ACC(Net)≥As.t.LAT(Net)≤T&ACC(Net)≥A

其中,目标函数定义为一个多目标搜索;Net表示通过进化算法得到的网络;ACC(Net)表示网络的准确率;LAT(Net)表示网络的推理时延;T表示期望的推理时延;A表示期望的准确率;所述期望的准确率根据预设的目标需求设定;所述期望的推理时延根据移动、嵌入式或通用平台类型设定。Among them, the objective function is defined as a multi-objective search; Net represents the network obtained by the evolutionary algorithm; ACC(Net) represents the accuracy of the network; LAT(Net) represents the inference delay of the network; T represents the expected inference delay; A Indicates the expected accuracy rate; the expected accuracy rate is set according to the preset target requirements; the expected inference delay is set according to the mobile, embedded or general platform type.

具体地,所述步骤S2包括:Specifically, the step S2 includes:

所述子网络模块是无向有环图中的结点;子网络模块包括多种类型且具有M个层的子网络,子网络类型可根据目标需求和平台要求扩展子网络模块的类型并进行选择;The sub-network module is a node in an undirected cyclic graph; the sub-network module includes multiple types of sub-networks with M layers, and the sub-network type can expand the type of the sub-network module according to the target requirements and platform requirements and carry out choose;

子网络的结构包括:多卷积层、ResNet块、深度可分离卷积、具有线性瓶颈的倒残差结构和基于压缩-激发结构的轻量级注意力结构。The structure of the sub-network includes: multi-convolutional layers, ResNet blocks, depthwise separable convolution, inverted residual structure with linear bottleneck, and lightweight attention structure based on compression-excitation structure.

具体地,所述网络搜索空间包括:所述网络搜索空间包括N个搜索子空间;Specifically, the network search space includes: the network search space includes N search subspaces;

Figure BDA0002921713920000102
Figure BDA0002921713920000102

Figure BDA0002921713920000103
Figure BDA0002921713920000103

其中,

Figure BDA0002921713920000104
表示第i代子网络模块集合,
Figure BDA0002921713920000105
表示第i代中第j个子网络模块,
Figure BDA0002921713920000111
表示第i代搜索空间的边集合,子网络模块之间通过边连接;
Figure BDA0002921713920000112
表示第i代中第j个子网络模块和第k个子网络模块之间的边;
Figure BDA0002921713920000113
表示第i代第n个搜索空间;i表示迭代序号。in,
Figure BDA0002921713920000104
represents the set of i-th generation sub-network modules,
Figure BDA0002921713920000105
represents the jth subnetwork module in the ith generation,
Figure BDA0002921713920000111
Represents the edge set of the i-th generation search space, and the sub-network modules are connected by edges;
Figure BDA0002921713920000112
represents the edge between the jth subnetwork module and the kth subnetwork module in the ith generation;
Figure BDA0002921713920000113
Represents the n-th search space of the i-th generation; i represents the iteration number.

具体地,所述步骤S3包括:Specifically, the step S3 includes:

步骤S3.1:在每一个搜索子空间中任选一点作为起点,选取距离起点最远的结点为终点,初始化蚂蚁数量、信息素强度常数和循环次数;Step S3.1: select a point in each search subspace as the starting point, select the node farthest from the starting point as the ending point, and initialize the number of ants, the constant of pheromone intensity and the number of cycles;

步骤S3.2:启发信息计算;Step S3.2: heuristic information calculation;

Figure BDA0002921713920000114
Figure BDA0002921713920000114

其中,ηI,J(t)表示t时刻从结点I到结点J的启发信息;DepI,J、WigI,J、ConI,J、FilI,J分别表示结点J的深度、宽度、连接度和滤波器数量,ω表示激励因子,0≤ω≤1;奖励机制定义为激励当前最优的网络架构中所有结点;ω值越小,启发式信息η越大,初始网络搜索空间中由于未产生进化,则ω初始值设置为1;Among them, η I,J (t) represents the heuristic information from node I to node J at time t; Dep I,J , Wig I,J , Con I,J , Fil I,J represent the depth of node J respectively , width, degree of connectivity and number of filters, ω represents the incentive factor, 0≤ω≤1; the reward mechanism is defined to motivate all nodes in the current optimal network architecture; the smaller the value of ω, the greater the heuristic information η, the initial Since there is no evolution in the network search space, the initial value of ω is set to 1;

步骤S3.3:概率路径选择;Step S3.3: Probabilistic path selection;

Figure BDA0002921713920000115
Figure BDA0002921713920000115

其中,

Figure BDA0002921713920000116
表示第t时刻蚂蚁m由点I移动到点J的概率;allowedm表示蚂蚁下一步可以选择的节点;α表示信息素启发因子,表示路径上残留的信息素在寻优过程中的作用,值越大则说明蚂蚁之间的协作能力越强,倾向于选择其它蚂蚁经过的路径;β表示期望值启发因子,表明准确性和推理时延在蚂蚁选择路径时的受重视程度,值越大,状态转移规律越接近贪心规则;τI,J(t)表示t时刻点I到点J路径上的信息素;τI,S(t)表示点I到allowedm中任一点路径上的信息素;
Figure BDA0002921713920000117
表示I到allowedm中任一点路径上的启发信息;in,
Figure BDA0002921713920000116
Represents the probability that the ant m moves from point I to point J at the t-th time; allowed m represents the node that the ant can choose next; α represents the pheromone heuristic factor, which represents the role of the remaining pheromone on the path in the optimization process, value The larger the value, the stronger the cooperation ability between ants and the tendency to choose the path passed by other ants; β represents the expected value heuristic factor, which indicates the importance of accuracy and inference delay when the ants choose the path. The transition rule is closer to the greedy rule; τ I, J (t) represents the pheromone on the path from point I to point J at time t; τ I, S (t) represents the pheromone on the path from point I to any point in allowed m ;
Figure BDA0002921713920000117
Indicates heuristic information on the path from I to allowed m at any point;

步骤S3.4:信息素动态挥发;Step S3.4: dynamic volatilization of pheromone;

Figure BDA0002921713920000118
Figure BDA0002921713920000118

其中,ρI,J(t)表示t时刻在I到J路径上的挥发系数;ηI,J(t)表示t时刻I到J路径上的启发信息;

Figure BDA0002921713920000119
ηi表示所有的启发信息,L表示当前网络下的结点总数;Among them, ρ I, J (t) represents the volatility coefficient on the path from I to J at time t; η I, J (t) represents the heuristic information on the path from I to J at time t;
Figure BDA0002921713920000119
η i represents all the heuristic information, L represents the total number of nodes under the current network;

步骤S3.5:信息素增量计算;Step S3.5: pheromone increment calculation;

Figure BDA0002921713920000121
Figure BDA0002921713920000121

其中,Q为信息素强度常量,是蚂蚁在一个循环中在所经过的路径上释放的信息素的总量;ηm表示第m只蚂蚁在此次循环中所受到的启发信息总量;Among them, Q is the constant of pheromone intensity, which is the total amount of pheromone released by the ants on the path traversed in a cycle; η m represents the total amount of enlightenment information received by the mth ant in this cycle;

步骤S3.6:信息素更新;Step S3.6: pheromone update;

τI,J(t+1)=(1-ρ)τI,J(t)+ΔτI,J(t,t+1)τ I, J (t+1)=(1-ρ)τ I, J (t)+Δτ I, J (t, t+1)

Figure BDA0002921713920000122
Figure BDA0002921713920000122

其中,ρ是信息素动态挥发系数;

Figure BDA0002921713920000123
表示第m只蚂蚁在本次循环中在路径(I,J)上留下的信息素增量,ΔτI,J(t,t+1)表示本次循环当中所有经过路径(I,J)的蚂蚁留下的信息素增量;K表示在本次循环中所有经过路径(I,J)的蚂蚁总数;where ρ is the dynamic volatility coefficient of pheromone;
Figure BDA0002921713920000123
Represents the pheromone increment left by the mth ant on the path (I, J) in this cycle, Δτ I, J (t, t+1) represents all the paths (I, J) passed in this cycle The pheromone increment left by the ants; K represents the total number of ants passing through the path (I, J) in this cycle;

步骤S3.7:寻优结束判断:当所有搜索子空间的寻优均达到最大循环次数则退出循环,输出所有搜索子空间寻优结果作为当代候选集;否则重复执行步骤S3.2至步骤S3.7,直至达到最大循环次数。Step S3.7: Judgment of the end of optimization: when the optimization of all search subspaces reaches the maximum number of cycles, exit the cycle, and output the optimization results of all search subspaces as the contemporary candidate set; otherwise, repeat steps S3.2 to S3 .7 until the maximum number of cycles is reached.

具体地,所述步骤S5中演化突变包括:将当前最优网络架构中所有子网络模块中的激励因子ω设置为常数,且0≤ω≤1;同时在突变集合中随机选取变异操作,进化子网络模块内部结构,生成下一代子网络模块,重复执行步骤S2至步骤S5,直至达到预设目标要求。Specifically, the evolution mutation in step S5 includes: setting the incentive factor ω in all sub-network modules in the current optimal network architecture to be constant, and 0≤ω≤1; The internal structure of the sub-network module generates the next-generation sub-network module, and steps S2 to S5 are repeatedly executed until the preset target requirements are met.

根据本发明提供的一种基于进化计算的神经网络架构搜索系统,包括:A neural network architecture search system based on evolutionary computing provided by the present invention, comprising:

模块M1:根据目标需求和平台要求,通过目标函数设置目标要求,目标要求包括:期望的准确度、推理时延、搜索空间大小和进化次数;Module M1: According to the target requirements and platform requirements, the target requirements are set through the target function, and the target requirements include: expected accuracy, inference delay, search space size and evolution times;

模块M2:根据设置的搜索空间大小,基于子网络模块集合随机生成N个无向有环图,作为进化寻优的网络搜索空间;Module M2: According to the set size of the search space, N undirected cyclic graphs are randomly generated based on the set of sub-network modules as the network search space for evolutionary optimization;

模块M3:在启发式信息引导下,结合信息素动态挥发和概率路径选择机制,通过蚁群算法在随机生成的N个无向有环图中搜索N条有向无环图寻优路径,构成候选集;Module M3: Under the guidance of heuristic information, combined with the dynamic volatilization of pheromone and the probability path selection mechanism, the ant colony algorithm is used to search for N directed acyclic graphs in randomly generated N undirected acyclic graphs. candidate set;

模块M4:通过训练和测试获取候选集中N条寻优路径的准确率和推理时延,选取最优结果作为当前最优网络结构;Module M4: Obtain the accuracy and inference delay of N optimization paths in the candidate set through training and testing, and select the optimal result as the current optimal network structure;

模块M5:评估当前网络架构是否达到目标要求,当当前网络架构没有达到预设目标要求,当满足速度和精度要求时,则暂停寻优结果,基于当前最优网络架构的所有子网络模块内部结构实时演化突变,继续迭代直至达到预设目标要求;当满足预设目标要求时,则输出当前最优网络结果,否则退出搜索过程,输出搜索异常。Module M5: Evaluate whether the current network architecture meets the target requirements. When the current network architecture does not meet the preset target requirements, when the speed and accuracy requirements are met, the optimization result is suspended, and the internal structure of all sub-network modules based on the current optimal network architecture Evolve mutation in real time, and continue to iterate until the preset target requirements are met; when the preset target requirements are met, the current optimal network result is output, otherwise, the search process is exited and the search exception is output.

具体地,所述模块M1包括:Specifically, the module M1 includes:

所述目标函数公式如下:The objective function formula is as follows:

Figure BDA0002921713920000131
Figure BDA0002921713920000131

s.t.LAT(Net)≤T&ACC(Net)≥As.t.LAT(Net)≤T&ACC(Net)≥A

其中,目标函数定义为一个多目标搜索;Net表示通过进化算法得到的网络;ACC(Net)表示网络的准确率;LAT(Net)表示网络的推理时延;T表示期望的推理时延;A表示期望的准确率;所述期望的准确率根据预设的目标需求设定;所述期望的推理时延根据移动、嵌入式或通用平台类型设定。Among them, the objective function is defined as a multi-objective search; Net represents the network obtained by the evolutionary algorithm; ACC(Net) represents the accuracy of the network; LAT(Net) represents the inference delay of the network; T represents the expected inference delay; A Indicates the expected accuracy rate; the expected accuracy rate is set according to the preset target requirements; the expected inference delay is set according to the mobile, embedded or general platform type.

具体地,所述模块M2包括:Specifically, the module M2 includes:

所述子网络模块是无向有环图中的结点;子网络模块包括多种类型且具有M个层的子网络,子网络类型可根据目标需求和平台要求扩展子网络模块的类型并进行选择;The sub-network module is a node in an undirected cyclic graph; the sub-network module includes multiple types of sub-networks with M layers, and the sub-network type can expand the type of the sub-network module according to the target requirements and platform requirements and carry out choose;

子网络的结构包括:多卷积层、ResNet块、深度可分离卷积、具有线性瓶颈的倒残差结构和基于压缩-激发结构的轻量级注意力结构;The structure of the sub-network includes: multiple convolutional layers, ResNet blocks, depthwise separable convolution, inverted residual structure with linear bottleneck, and lightweight attention structure based on compression-excitation structure;

所述网络搜索空间包括:所述网络搜索空间包括N个搜索子空间;The network search space includes: the network search space includes N search subspaces;

Figure BDA0002921713920000132
Figure BDA0002921713920000132

Figure BDA0002921713920000133
Figure BDA0002921713920000133

其中,

Figure BDA0002921713920000134
表示第i代子网络模块集合,
Figure BDA0002921713920000135
表示第i代中第j个子网络模块,
Figure BDA0002921713920000136
表示第i代搜索空间的边集合,子网络模块之间通过边连接;
Figure BDA0002921713920000137
表示第i代中第j个子网络模块和第k个子网络模块之间的边;
Figure BDA0002921713920000138
表示第i代第n个搜索空间;i表示迭代序号。in,
Figure BDA0002921713920000134
represents the set of i-th generation sub-network modules,
Figure BDA0002921713920000135
represents the jth subnetwork module in the ith generation,
Figure BDA0002921713920000136
Represents the edge set of the i-th generation search space, and the sub-network modules are connected by edges;
Figure BDA0002921713920000137
represents the edge between the jth subnetwork module and the kth subnetwork module in the ith generation;
Figure BDA0002921713920000138
Represents the n-th search space of the i-th generation; i represents the iteration number.

具体地,所述模块M3包括:Specifically, the module M3 includes:

模块M3.1:在每一个搜索子空间中任选一点作为起点,选取距离起点最远的结点为终点,初始化蚂蚁数量、信息素强度常数和循环次数;Module M3.1: Select a point in each search subspace as the starting point, select the node farthest from the starting point as the end point, and initialize the number of ants, the constant of pheromone intensity and the number of cycles;

模块M3.2:启发信息计算;Module M3.2: Heuristic information calculation;

Figure BDA0002921713920000139
Figure BDA0002921713920000139

其中,ηI,J(t)表示t时刻从结点I到结点J的启发信息;DepI,J、WigI,J、ConI,J、FilI,J分别表示结点J的深度、宽度、连接度和滤波器数量,ω表示激励因子,0≤ω≤1;奖励机制定义为激励当前最优的网络架构中所有结点;ω值越小,启发式信息η越大,初始网络搜索空间中由于未产生进化,则ω初始值设置为1;Among them, η I,J (t) represents the heuristic information from node I to node J at time t; Dep I,J , Wig I,J , Con I,J , Fil I,J represent the depth of node J respectively , width, degree of connectivity and number of filters, ω represents the incentive factor, 0≤ω≤1; the reward mechanism is defined to motivate all nodes in the current optimal network architecture; the smaller the value of ω, the greater the heuristic information η, the initial Since there is no evolution in the network search space, the initial value of ω is set to 1;

模块M3.3:概率路径选择;Module M3.3: Probabilistic Path Selection;

Figure BDA0002921713920000141
Figure BDA0002921713920000141

其中,

Figure BDA0002921713920000142
表示第t时刻蚂蚁m由点I移动到点J的概率;allowedm表示蚂蚁下一步可以选择的节点;α表示信息素启发因子,表示路径上残留的信息素在寻优过程中的作用,值越大则说明蚂蚁之间的协作能力越强,倾向于选择其它蚂蚁经过的路径;β表示期望值启发因子,表明准确性和推理时延在蚂蚁选择路径时的受重视程度,值越大,状态转移规律越接近贪心规则;τI,J(t)表示t时刻点I到点J路径上的信息素;τI,S(t)表示点I到allowedm中任一点路径上的信息素;
Figure BDA0002921713920000143
表示I到allowedm中任一点路径上的启发信息;in,
Figure BDA0002921713920000142
Represents the probability that the ant m moves from point I to point J at the t-th time; allowed m represents the node that the ant can choose next; α represents the pheromone heuristic factor, which represents the role of the remaining pheromone on the path in the optimization process, value The larger the value, the stronger the cooperation ability between ants and the tendency to choose the path passed by other ants; β represents the expected value heuristic factor, which indicates the importance of accuracy and inference delay when the ants choose the path. The transition rule is closer to the greedy rule; τ I, J (t) represents the pheromone on the path from point I to point J at time t; τ I, S (t) represents the pheromone on the path from point I to any point in allowed m ;
Figure BDA0002921713920000143
Indicates heuristic information on the path from I to allowed m at any point;

模块M3.4:信息素动态挥发;Module M3.4: dynamic volatilization of pheromone;

Figure BDA0002921713920000144
Figure BDA0002921713920000144

其中,ρI,J(t)表示t时刻在I到J路径上的挥发系数;ηI,J(t)表示t时刻I到J路径上的启发信息;

Figure BDA0002921713920000145
ηi表示所有的启发信息,L表示当前网络下的结点总数;Among them, ρ I, J (t) represents the volatility coefficient on the path from I to J at time t; η I, J (t) represents the heuristic information on the path from I to J at time t;
Figure BDA0002921713920000145
η i represents all the heuristic information, L represents the total number of nodes under the current network;

模块M3.5:信息素增量计算;Module M3.5: pheromone incremental calculation;

Figure BDA0002921713920000146
Figure BDA0002921713920000146

其中,Q为信息素强度常量,是蚂蚁在一个循环中在所经过的路径上释放的信息素的总量;ηm表示第m只蚂蚁在此次循环中所受到的启发信息总量;Among them, Q is the constant of pheromone intensity, which is the total amount of pheromone released by the ants on the path traversed in a cycle; η m represents the total amount of enlightenment information received by the mth ant in this cycle;

模块M3.6:信息素更新;Module M3.6: pheromone update;

τI,J(t+1)=(1-ρ)τI,J(t)+ΔτI,J(t,t+1)τ I, J (t+1)=(1-ρ)τ I, J (t)+Δτ I, J (t,t+1)

Figure BDA0002921713920000147
Figure BDA0002921713920000147

其中,ρ是信息素动态挥发系数;

Figure BDA0002921713920000148
表示第m只蚂蚁在本次循环中在路径(I,J)上留下的信息素增量,ΔτI,J(t,t+1)表示本次循环当中所有经过路径(I,J)的蚂蚁留下的信息素增量;K表示在本次循环中所有经过路径(I,J)的蚂蚁总数;where ρ is the dynamic volatility coefficient of pheromone;
Figure BDA0002921713920000148
Represents the pheromone increment left by the mth ant on the path (I, J) in this cycle, Δτ I, J (t, t+1) represents all the paths (I, J) passed in this cycle The pheromone increment left by the ants; K represents the total number of ants passing through the path (I, J) in this cycle;

模块M3.7:寻优结束判断:当所有搜索子空间的寻优均达到最大循环次数则退出循环,输出所有搜索子空间寻优结果作为当代候选集;否则重复触发模块M3.2至模块M3.7执行,直至达到最大循环次数;Module M3.7: Judgment of the end of optimization: when the optimization of all search subspaces reaches the maximum number of cycles, exit the loop, and output the optimization results of all search subspaces as the contemporary candidate set; otherwise, repeat triggering modules M3.2 to M3 .7 execute until the maximum number of cycles is reached;

所述模块M5中演化突变包括:将当前最优网络架构中所有子网络模块中的激励因子ω设置为常数,且0≤ω≤1;同时在突变集合中随机选取变异操作,进化子网络模块内部结构,生成下一代子网络模块,重复触发模块M2至模块M5执行,直至达到预设目标要求。The evolution mutation in the module M5 includes: setting the excitation factor ω in all sub-network modules in the current optimal network architecture to a constant, and 0≤ω≤1; at the same time randomly selecting mutation operations in the mutation set, and evolving the sub-network modules The internal structure generates the next-generation sub-network module, and repeatedly triggers the execution of modules M2 to M5 until the preset target requirements are met.

实施例2Example 2

实施例2是实施例1的变化例Example 2 is a modification of Example 1

本发明提出一种基于进化计算的神经网络架构搜索方法和系统,以模块化图论构建思想,在神经网络初始化模块的基础上,利用蚁群算法的优化能力和奖励进化机制,以模块级搜索结合结构级进化的方式探索最优的神经网络架构,可在资源受限的情况下兼顾速度和精度以解决以上技术问题。The present invention proposes a neural network architecture search method and system based on evolutionary computation. The modular graph theory is used to construct the idea. On the basis of the neural network initialization module, the optimization ability and the reward evolution mechanism of the ant colony algorithm are used to search at the module level. Combining the structure-level evolution method to explore the optimal neural network architecture can solve the above technical problems by taking into account the speed and accuracy under the condition of limited resources.

针对目前神经网络架构搜索在资源受限情况下存在性能和效率难平衡的问题,本发明的目的是提供一种基于进化计算的神经网络架构搜索方法和系统。本发明以子网络模块为基本构件,随机生成多个网络搜索子空间(有向无环图),并以子网络模块(结点)的深度、广度、连接度和滤波器数量作为子网络模块的复杂度结合激励因子作为启发信息,通过信息素动态挥发和随机概率路径选择机制,利用改进的蚁群算法在子空间中寻优,同时融合奖励和演化突变机制不断进化,以搜索兼顾性能和效率的最优神经网络架构。Aiming at the problem that performance and efficiency are difficult to balance in the current neural network architecture search under the condition of limited resources, the purpose of the present invention is to provide a neural network architecture search method and system based on evolutionary computing. The invention takes the sub-network module as the basic component, randomly generates multiple network search subspaces (directed acyclic graph), and uses the depth, breadth, connection degree and number of filters of the sub-network module (node) as the sub-network module The complexity of the algorithm is combined with the incentive factor as the heuristic information. Through the dynamic volatilization of pheromone and the random probability path selection mechanism, the improved ant colony algorithm is used to search for optimization in the subspace, and at the same time, the reward and evolutionary mutation mechanism are continuously evolved to search for both performance and performance. Optimal neural network architecture for efficiency.

所述进化计算在计算机科学领域,进化计算(Evolutionary Computation)是智能计算(Computational Intelligence)中涉及到组合优化问题的一个子域。其进化算法是受生物进化过程中“优胜劣汰”的自然选择机制和遗传信息的传递规律的影响,通过程序迭代模拟这一过程,把要解决的问题看作环境,在一些可能的解组成的种群中,通过自然演化寻求最优解。The evolutionary computation In the field of computer science, evolutionary computation (Evolutionary Computation) is a subfield of intelligent computation (Computational Intelligence) involving combinatorial optimization problems. Its evolutionary algorithm is influenced by the natural selection mechanism of "survival of the fittest" and the transmission law of genetic information in the process of biological evolution. This process is simulated iteratively through the program, and the problem to be solved is regarded as the environment. , the optimal solution is sought through natural evolution.

所述进化算法或称“演化算法”(Evolutionary Algorithms)是一个“算法簇”,尽管它有很多的变化,有不同的遗传基因表达方式,不同的交叉和变异算子,特殊算子的引用,以及不同的再生和选择方法,但它们产生的灵感都来自于大自然的生物进化。与传统的基于微积分的方法和穷举法等优化算法相比,进化算法是一种成熟的具有高鲁棒性和广泛适用性的全局优化方法,具有自组织、自适应、自学习的特性,能够不受问题性质的限制,有效地处理传统优化算法难以解决的复杂问题。The evolutionary algorithm or "Evolutionary Algorithms" is an "algorithm cluster", although it has many variations, there are different ways of genetic gene expression, different crossover and mutation operators, references to special operators, and different methods of regeneration and selection, but they are all inspired by nature's biological evolution. Compared with the traditional calculus-based method and the exhaustive method and other optimization algorithms, the evolutionary algorithm is a mature global optimization method with high robustness and wide applicability, and has the characteristics of self-organization, self-adaptation and self-learning. , which is not limited by the nature of the problem, and can effectively deal with complex problems that are difficult to solve by traditional optimization algorithms.

所述神经网络架构搜索(Neural Architecture Search,NAS)是深度学习研究热点之一。NAS旨在通过使用有限的计算资源,以自动化的方式设计具有最优性能的神经网络架构,尽可能减少人工干预。The Neural Architecture Search (NAS) is one of the hotspots in deep learning research. NAS aims to design neural network architectures with optimal performance in an automated manner with minimal human intervention by using limited computing resources.

所述是仿生学中群体智能算法之一,它是1991年由意大利学者M.Dorigo等人受到真实世界中蚂蚁觅食的行为的启发后而提出。在蚂蚁觅食过程中,每一只蚂蚁都会在走过路径上释放一种可以与其他蚂蚁进行信息交流的化学物质,叫做信息素。随着时间的推移,信息素会挥发,但蚂蚁爬过的路径越短,相应信息素挥发速度也越慢,留在路径上的信息素浓度相对来说就要高一些。蚂蚁能够发现这种信息素并感知它的浓度,能以更高的概率选中信息素浓度最高的路径。这样便会有更多的蚂蚁选择含有高浓度信息素的路径,越来越多的蚂蚁会被吸引到这条路径上,形成一种正反馈。基于这样的原理,蚂蚁们便能快速的找到一条离食物源最短的路径。The above is one of the swarm intelligence algorithms in bionics. It was proposed in 1991 by Italian scholar M. Dorigo and others after being inspired by the foraging behavior of ants in the real world. During the ant foraging process, each ant releases a chemical substance called pheromone that can communicate with other ants on the path it walks. Over time, the pheromone will volatilize, but the shorter the path ants crawl, the slower the corresponding pheromone volatilization rate, and the concentration of the pheromone left on the path will be relatively higher. Ants can find this pheromone and sense its concentration, and can choose the path with the highest pheromone concentration with a higher probability. In this way, more ants will choose the path with high concentration of pheromone, and more and more ants will be attracted to this path, forming a positive feedback. Based on this principle, ants can quickly find the shortest path from the food source.

本发明涉及一种基于进化计算的神经网络架构搜索方法和系统,包括搜索目标设置、搜索空间初始化、蚁群寻优、目标评估、演化突变等步骤。首先根据目标需求和平台要求,从准确度、推理时延、进化次数等方面设置搜索目标参数;然后初始化搜索空间,随机生成N个搜索子空间;在此基础上开始蚁群寻优,构建当代候选集;接着在数据集上完成目标评估,从候选集中选取当代最优网络架构;最后针对当代最优网络架构的所有子网络模块内部结构实施演化突变,继续迭代直至达到目标要求。The invention relates to a neural network architecture search method and system based on evolutionary computation, including the steps of search target setting, search space initialization, ant colony optimization, target evaluation, evolution mutation and the like. First, according to the target requirements and platform requirements, set the search target parameters in terms of accuracy, inference delay, and evolution times; then initialize the search space, and randomly generate N search subspaces; Candidate set; then complete the target evaluation on the data set, and select the contemporary optimal network architecture from the candidate set; finally, implement evolution mutation for the internal structure of all sub-network modules of the contemporary optimal network architecture, and continue to iterate until the target requirements are met.

所述搜索目标设置,即根据目标需求和平台要求,通过定义的目标函数设置期望的准确度、推理时延,并设定搜索空间大小、进化次数等参数。The search target setting, that is, according to the target requirements and platform requirements, the expected accuracy and inference delay are set through the defined target function, and parameters such as the size of the search space and the number of evolutions are set.

所述搜索空间初始化,即依据设定的搜索空间大小,针对子网络模块集合随机生成N个无向有环图,作为进化寻优的网络搜索空间。The search space initialization, that is, according to the set size of the search space, randomly generates N undirected cyclic graphs for the sub-network module set as the network search space for evolutionary optimization.

具体地,所述网络搜索空间是一个分层分块的无向有环图,每一个子网络模块表示无向有环图中的一个结点,允许模块级搜索和模块内部结构级突变和搜索。Specifically, the network search space is a hierarchical block undirected cyclic graph, each sub-network module represents a node in the undirected cyclic graph, allowing module-level search and module internal structure-level mutation and search .

具体地,所述子网络模块是无向有环图中的结点,定义为多种类型具有M个层的子网络,可扩展子网络类型并根据目标需求和应用平台进行选择,其子网络结构包括且不限于多卷积层、ResNet块、深度可分离卷积、具有线性瓶颈的倒残差结构、基于压缩-激发结构的轻量级注意力结构等。Specifically, the sub-network module is a node in an undirected cyclic graph, which is defined as multiple types of sub-networks with M layers. The sub-network type can be expanded and selected according to target requirements and application platforms. Structures include but are not limited to multiple convolutional layers, ResNet blocks, depthwise separable convolutions, inverted residual structures with linear bottlenecks, lightweight attention structures based on compression-excitation structures, etc.

所述蚁群寻优,即在启发式信息引导下,结合信息素动态挥发和概率路径选择机制,通过蚁群算法在随机生成的N个无向有环图中搜索N条有向无环图寻优路径,以此构成这一代的候选集。The ant colony optimization, that is, under the guidance of heuristic information, combined with the dynamic volatilization of pheromone and the probability path selection mechanism, through the ant colony algorithm to search N directed acyclic graphs in randomly generated N undirected acyclic graphs Find the optimal path to form the candidate set of this generation.

所述目标评估,即通过训练和测试获取候选集中N条寻优路径的准确率和推理时延,选取最优结果作为这一代的最优网络架构,并评估是否达到目标要求。The target evaluation is to obtain the accuracy and inference delay of the N optimization paths in the candidate set through training and testing, select the optimal result as the optimal network architecture of this generation, and evaluate whether the target requirements are met.

所述演化突变,即针对这一代最优网络架构中的所有子网络模块内部结构,给予奖励,并在突变集合中随机选取变异操作,生成下一代子网络模块,以此模拟自然界中优异的个体更容易留下后代的过程。The evolution mutation is to reward the internal structure of all sub-network modules in the optimal network architecture of this generation, and randomly select mutation operations in the mutation set to generate the next generation of sub-network modules, so as to simulate excellent individuals in nature. The process of leaving offspring is easier.

具体地,所述突变集合,包括模块结构保持不变、随机选择卷积类型、卷积内核大小、滤波器尺寸、插入卷积层、删除卷积层、添加连接、删除连接操作。Specifically, the mutation set includes the operations of keeping the module structure unchanged, randomly selecting the convolution type, convolution kernel size, filter size, inserting a convolution layer, deleting a convolution layer, adding a connection, and deleting a connection.

一种基于进化计算的神经网络架构搜索方法和系统,包括如下步骤:如图1所示,A neural network architecture search method and system based on evolutionary computing, including the following steps: as shown in Figure 1,

步骤1,搜索目标设定:Step 1, search target setting:

根据目标需求和平台要求,通过定义的目标函数设定期望的准确度和推理时延,同时设定搜索空间大小、进化次数等参数。为了寻求准确性和推理时延方面的均衡,目标函数定义为一个多目标搜索问题,目的是找到准确率高且推力时延低的神经网络架构。准确率可依据用户具体需求而设定,采用TOP-1或TOP-5;推理时延可依据移动、嵌入式或通用平台类型设定。总的目标函数具体形式化如下:According to the target requirements and platform requirements, the expected accuracy and inference delay are set through the defined objective function, and parameters such as the size of the search space and the number of evolutions are also set. In order to seek a balance between accuracy and inference latency, the objective function is defined as a multi-objective search problem to find a neural network architecture with high accuracy and low thrust latency. The accuracy rate can be set according to the specific needs of the user, using TOP-1 or TOP-5; the inference delay can be set according to the type of mobile, embedded or general platform. The overall objective function is specifically formalized as follows:

Figure BDA0002921713920000171
Figure BDA0002921713920000171

s.t. LAT(Net)≤T&ACC(Net)≥As.t. LAT(Net)≤T&ACC(Net)≥A

其中,Net表示通过进化算法得到的网络,ACC(Net)表示该网络的准确率,LAT(Net)表示该网络的推理时延,T表示期望的目标时延。A表示期望的准确率。s.t.LAT(Net)表示Among them, Net represents the network obtained by the evolutionary algorithm, ACC(Net) represents the accuracy of the network, LAT(Net) represents the inference delay of the network, and T represents the desired target delay. A represents the expected accuracy. s.t.LAT(Net) representation

步骤2,搜索空间初始化:Step 2, search space initialization:

依据设定的搜索空间大小,从子网络模块集合随机生成N个无向有环图,作为进化寻优的网络搜索空间。According to the set size of the search space, N undirected cyclic graphs are randomly generated from the set of sub-network modules as the network search space for evolutionary optimization.

每一个子网络模块表示无向有环图中的一个结点,定义为多种类型具有M个层的子网络,可根据目标需求和应用平台要求扩展子网络模块类型并进行选择,其子网络结构包括且不限于多卷积层(1x1 conv+滤波为3*3的2D conv+1x1 conv)、ResNet块、深度可分离卷积、具有线性瓶颈的倒残差结构、基于压缩-激发结构的轻量级注意力结构等。边表示子网络模块之间的连接。一个搜索空间由N个搜索子空间组成,即

Figure BDA0002921713920000172
Figure BDA0002921713920000173
形式化为:Each sub-network module represents a node in an undirected cyclic graph, and is defined as multiple types of sub-networks with M layers. The sub-network module type can be expanded and selected according to the target requirements and application platform requirements. Structures include but are not limited to multiple convolutional layers (1x1 conv + 2D conv filtered to 3*3 + 1x1 conv), ResNet blocks, depthwise separable convolutions, inverted residual structures with linear bottlenecks, compression-excitation structures based light Magnitude attention structure, etc. Edges represent connections between subnetwork modules. A search space consists of N search subspaces, namely
Figure BDA0002921713920000172
Figure BDA0002921713920000173
Formalized as:

Figure BDA0002921713920000181
Figure BDA0002921713920000181

其中,

Figure BDA0002921713920000182
表示第i代子网络模块集合,
Figure BDA0002921713920000183
表示第i代中第j个子网络模块,
Figure BDA0002921713920000184
表示第i代搜索空间的边集合,
Figure BDA0002921713920000185
表示第i代中第j个子网络模块和第k个子网络模块之间的边;
Figure BDA0002921713920000186
表示第i代第n个搜索空间;i表示迭代序号。in,
Figure BDA0002921713920000182
represents the set of i-th generation sub-network modules,
Figure BDA0002921713920000183
represents the jth subnetwork module in the ith generation,
Figure BDA0002921713920000184
represents the edge set of the ith generation search space,
Figure BDA0002921713920000185
represents the edge between the jth subnetwork module and the kth subnetwork module in the ith generation;
Figure BDA0002921713920000186
Represents the n-th search space of the i-th generation; i represents the iteration number.

步骤3,蚁群寻优:Step 3, ant colony optimization:

在启发式信息引导下,结合信息素动态挥发和随机概率路径选择机制,通过蚁群算法在随机生成的N个无向有环图中搜索N条有向无环的寻优路径,以此构成这一代的候选集。Under the guidance of heuristic information, combined with the dynamic volatilization of pheromone and the random probability path selection mechanism, the ant colony algorithm is used to search for N directed and acyclic optimization paths in the randomly generated N undirected cyclic graphs, thus forming candidate set for this generation.

步骤3.1:参数初始化。在每个搜索子空间中任选一点作为起点,自动选取距离起点最远的结点为终点,初始化蚂蚁数量,信息素强度常量,循环次数等。Step 3.1: Parameter initialization. Select a point in each search subspace as the starting point, automatically select the node farthest from the starting point as the end point, initialize the number of ants, pheromone intensity constant, cycle times, etc.

步骤3.2:启发信息计算。启发式函数对寻优过程的收敛性和稳定性有着重要的影响。在蚁群算法中,启发式信息η通常是两点之间距离的反比,但神经网络架构搜索的标准权衡准确性和推理时延,寻优过程中不仅要考虑影响计算量的子网络模块内部结构的复杂性,如深度、宽度、连接度和滤波器数量,还需要考虑奖励机制所给予的激励,因此,启发式信息定义为Step 3.2: Heuristic information calculation. The heuristic function has an important influence on the convergence and stability of the optimization process. In the ant colony algorithm, the heuristic information η is usually the inverse ratio of the distance between two points, but the standard of neural network architecture search balances accuracy and inference delay, and the optimization process should not only consider the internal sub-network module that affects the amount of calculation. The complexity of the structure, such as depth, width, connectivity, and number of filters, also needs to consider the incentives given by the reward mechanism, so the heuristic information is defined as

Figure BDA0002921713920000187
Figure BDA0002921713920000187

其中,ηI,J(t)表示t时刻从结点I到结点J的启发信息;DepI,J、WigI,J、ConI,J、FilI,J分别表示结点J的深度、宽度、连接度和滤波器数量,ω表示激励因子,0≤ω≤1,奖励机制定义为激励这一代最优的网络架构中所有结点。ω值越小,启发式信息η越大,初始网络搜索空间中由于未产生进化,所以ω初始值设置为1。Among them, η I,J (t) represents the heuristic information from node I to node J at time t; Dep I,J , Wig I,J , Con I,J , Fil I,J represent the depth of node J respectively , width, degree of connectivity and number of filters, ω represents the incentive factor, 0≤ω≤1, and the reward mechanism is defined to motivate all nodes in the optimal network architecture of this generation. The smaller the value of ω is, the greater the heuristic information η is. Since there is no evolution in the initial network search space, the initial value of ω is set to 1.

步骤3.3:概率路径选择:在子空间搜索过程的每一步路径选择中,受信息素启发因子和期望值启发因子的影响,蚂蚁m按照概率

Figure BDA0002921713920000188
决定下一步往哪条路上移动。Step 3.3: Probabilistic path selection: In each step of the path selection in the subspace search process, affected by the pheromone heuristic factor and the expected value heuristic factor, the ant m follows the probability
Figure BDA0002921713920000188
Decide which way to move next.

Figure BDA0002921713920000189
Figure BDA0002921713920000189

Figure BDA00029217139200001810
表示第t时刻蚂蚁m由点I移动到点J的概率,allowedm表示蚂蚁下一步可以选择的节点;信息素启发因子α,表示路径上残留的信息素在寻优过程中的作用,值越大则说明蚂蚁之间的协作能力越强,倾向于选择其它蚂蚁经过的路径;期望值启发因子β,表明准确性和推理时延在蚂蚁选择路径时的受重视程度,值越大,状态转移规律越接近贪心规则。τI,J(t)表示t时刻点I到点J路径上的信息素。τI,S(t)表示点I到allowedm中任一点路径上的信息素。
Figure BDA00029217139200001810
Represents the probability that ant m moves from point I to point J at the t-th time, and allowed m represents the node that the ant can choose next; The larger the value is, the stronger the cooperation ability between ants is, and they tend to choose the path passed by other ants; the expected value heuristic factor β indicates the importance of accuracy and inference delay when the ants choose the path. The larger the value is, the state transition law The closer to the greedy rule. τ I,J (t) represents the pheromone on the path from point I to point J at time t. τ I,S (t) represents the pheromone on the path from point I to any point in allowed m .

步骤3.4:信息素动态挥发:自然界中信息素挥发速度是一个动态的变化的过程,会随着时间的推移受到温度、湿度等环境因素而变化。在神经网络架构寻优过程中,结构越复杂的结点,启发信息越小;推理时延越大的结点,其挥发速度越快,信息素残留越少,因此所有蚂蚁到达终点后,信息素动态挥发系数ρ按如下关系计算,Step 3.4: Dynamic volatilization of pheromone: The volatilization rate of pheromone in nature is a dynamic process, which will be changed over time by environmental factors such as temperature and humidity. In the optimization process of the neural network architecture, the more complex the structure of the node, the smaller the heuristic information; the node with a larger inference delay, the faster the evaporation speed, the less pheromone residue, so after all the ants reach the end, the information The prime dynamic volatilization coefficient ρ is calculated according to the following relationship,

Figure BDA0002921713920000191
Figure BDA0002921713920000191

ρI,J(t)表示t时刻在I到J路径上的挥发系数,ηI,J(t)表示t时刻I到J路径上的启发信息,

Figure BDA0002921713920000192
ηi表示所有的启发信息,L表示该网络下的结点总数。ρ I, J (t) represents the volatility coefficient on the path from I to J at time t, η I, J (t) represents the heuristic information on the path from I to J at time t,
Figure BDA0002921713920000192
η i represents all the heuristic information, L represents the total number of nodes under the network.

步骤3.5:信息素增量计算:信息素更新模型是基本蚁群算法的随机搜索与快速收敛重要环节。根据全局性最优要求,修改蚁周模型,形式化如下:Step 3.5: Pheromone Incremental Calculation: The pheromone update model is an important part of the random search and fast convergence of the basic ant colony algorithm. According to the global optimal requirements, the ant-week model is modified and formalized as follows:

Figure BDA0002921713920000193
Figure BDA0002921713920000193

Q为信息素强度常量,是蚂蚁在一个循环中在所经过的路径上释放的信息素的总量,在一定程度上影响着算法的收敛速度。ηm表示第m只蚂蚁在此次循环中所受到的启发信息总量。Q is the constant of pheromone intensity, which is the total amount of pheromone released by ants on the path passed in a cycle, which affects the convergence speed of the algorithm to a certain extent. η m represents the total amount of heuristic information received by the mth ant in this cycle.

步骤3.6:信息素更新:初始时刻各路径上的信息素量是相等的,当蚂蚁完成一次循环后,信息素会随着时间的推移逐渐挥发,因此在蚂蚁进入下一个循环之前需要更新信息素浓度,形式化如下:Step 3.6: Pheromone update: The amount of pheromone on each path is equal at the initial moment. After the ant completes a cycle, the pheromone will gradually volatilize over time, so the pheromone needs to be updated before the ant enters the next cycle. Concentration, formalized as follows:

τI,J(t+1)=(1-ρ)τI,J(t)+ΔτI,J(t,t+1)τ I, J (t+1)=(1-ρ)τ I, J (t)+Δτ I, J (t, t+1)

Figure BDA0002921713920000194
Figure BDA0002921713920000194

ρ是信息素动态挥发系数,

Figure BDA0002921713920000195
表示第m只蚂蚁在本次循环中在路径(I,J)上留下的信息素量,ΔτI,J(t,t+1)表示本次循环当中所有经过路径(I,J)的蚂蚁留下的信息素增量。K表示在本次循环中所有经过路径(I,J)的蚂蚁总数。ρ is the pheromone dynamic volatility coefficient,
Figure BDA0002921713920000195
Represents the amount of pheromone left by the mth ant on the path (I, J) in this cycle, Δτ I, J (t, t+1) represents all the pheromone passing through the path (I, J) in this cycle Pheromone increments left by ants. K represents the total number of ants passing through the path (I, J) in this cycle.

步骤3.7:寻优结束判断:当所有搜索子空间的寻优均达到最大循环次数则退出循环,输出所有搜索子空间寻优结果作为当代候选集,进入步骤4;否则进入启发信息计算3.2,继续循环。Step 3.7: Judgment of the end of optimization: when the optimization of all search subspaces reaches the maximum number of cycles, exit the cycle, output all search subspace optimization results as the contemporary candidate set, and enter step 4; otherwise, enter heuristic information calculation 3.2, continue cycle.

步骤4:目标评估。通过训练和测试获取候选集中N条寻优路径的准确率和推理时延,选取最优结果作为这一代的最优网络架构,若未达到进化次数、未满足最终目标要求,但满足速度和精度要求,则暂存寻优结果,进入步骤5,否则丢弃该搜索子空间;若达到最大进化次数且满足目标要求,则针对所有进化过程中每一代的最优网络架构结果进行排序,进入步骤6,否则退出搜索过程,输出搜索异常。Step 4: Target evaluation. Obtain the accuracy and inference delay of N optimization paths in the candidate set through training and testing, and select the optimal result as the optimal network architecture of this generation. If required, then temporarily store the optimization results and go to step 5, otherwise discard the search subspace; if the maximum number of evolutions is reached and the target requirements are met, the results of the optimal network architecture of each generation in all evolutionary processes will be sorted and go to step 6 , otherwise exit the search process and output a search exception.

步骤5:演化突变。给予这一代最优网络架构中的所有子网络模块奖励,即将激励因子ω设置为某常数,0≤ω≤1,ω值越小,激励作用越大。同时在突变集合中随机选取变异操作,进化子网络模块内部结构,生成下一代子网络模块,返回步骤2。Step 5: Evolutionary mutation. Give rewards to all sub-network modules in the optimal network architecture of this generation, that is, set the incentive factor ω to a constant, 0≤ω≤1, the smaller the ω value, the greater the incentive effect. At the same time, the mutation operation is randomly selected in the mutation set, the internal structure of the sub-network module is evolved, the next-generation sub-network module is generated, and the process returns to step 2.

步骤6:输出最终结果。输出最优的神经网络架构。Step 6: Output the final result. Output the optimal neural network architecture.

本领域技术人员知道,除了以纯计算机可读程序代码方式实现本发明提供的系统、装置及其各个模块以外,完全可以通过将方法步骤进行逻辑编程来使得本发明提供的系统、装置及其各个模块以逻辑门、开关、专用集成电路、可编程逻辑控制器以及嵌入式微控制器等的形式来实现相同程序。所以,本发明提供的系统、装置及其各个模块可以被认为是一种硬件部件,而对其内包括的用于实现各种程序的模块也可以视为硬件部件内的结构;也可以将用于实现各种功能的模块视为既可以是实现方法的软件程序又可以是硬件部件内的结构。Those skilled in the art know that, in addition to implementing the system, device and each module provided by the present invention in the form of pure computer readable program code, the system, device and each module provided by the present invention can be completely implemented by logically programming the method steps. The same program is implemented in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, and embedded microcontrollers, among others. Therefore, the system, device and each module provided by the present invention can be regarded as a kind of hardware component, and the modules included in it for realizing various programs can also be regarded as the structure in the hardware component; A module for realizing various functions can be regarded as either a software program for realizing a method or a structure within a hardware component.

以上对本发明的具体实施例进行了描述。需要理解的是,本发明并不局限于上述特定实施方式,本领域技术人员可以在权利要求的范围内做出各种变化或修改,这并不影响本发明的实质内容。在不冲突的情况下,本申请的实施例和实施例中的特征可以任意相互组合。Specific embodiments of the present invention have been described above. It should be understood that the present invention is not limited to the above-mentioned specific embodiments, and those skilled in the art can make various changes or modifications within the scope of the claims, which do not affect the essential content of the present invention. The embodiments of the present application and features in the embodiments may be combined with each other arbitrarily, provided that there is no conflict.

Claims (10)

1. A neural network architecture searching method based on evolutionary computing is characterized by comprising the following steps:
step S1: setting target requirements through a target function according to the target requirements and the platform requirements, wherein the target requirements comprise: expected accuracy, reasoning time delay, search space size and evolution times;
step S2: according to the set size of the search space, randomly generating N directed cyclic graphs based on the sub-network module set to serve as a network search space for evolutionary optimization;
step S3: under the guidance of heuristic information, combining with a pheromone dynamic volatilization and probability path selection mechanism, searching N directed acyclic graphs for optimizing paths in N randomly generated acyclic graphs through an ant colony algorithm to form a candidate set;
step S4: acquiring the accuracy and reasoning time delay of N optimizing paths in the candidate set through training and testing, and selecting an optimal result as a current optimal network structure;
step S5: evaluating whether the current network architecture meets the target requirement, when the current network architecture does not meet the preset target requirement, and when the current network architecture meets the speed and precision requirements, pausing the optimization result, carrying out real-time evolution mutation on the internal structures of all the sub-network modules based on the current optimal network architecture, and continuing iteration until the preset target requirement is met; and when the preset target requirement is met, outputting the current optimal network result, otherwise, quitting the searching process and outputting the abnormal searching.
2. The evolutionary computing-based neural network architecture search method of claim 1, wherein the step S1 comprises:
the objective function formula is as follows:
Figure FDA0002921713910000011
s.t.LAT(Net)≤T&ACC(Net)≥A
wherein, the target function is defined as a multi-target search; net represents the network obtained by the evolutionary algorithm; acc (net) indicates the accuracy of the network; lat (net) represents the inference delay of the network; t represents the expected inference time delay; a represents the desired accuracy; the expected accuracy is set according to preset target requirements; the expected inference time delay is set according to mobile, embedded or general platform types.
3. The evolutionary computing-based neural network architecture search method of claim 1, wherein the step S2 comprises:
the sub-network modules are nodes in an undirected cyclic graph; the sub-network modules comprise a plurality of types of sub-networks with M layers, and the types of the sub-networks can be expanded and selected according to target requirements and platform requirements;
the structure of the sub-network comprises: multi-convolution layers, ResNet blocks, depth separable convolutions, reversed residual structure with linear bottlenecks, and lightweight attention structure based on compression-excitation structures.
4. The evolutionary computing-based neural network architecture search method of claim 3, wherein the network search space comprises: the network search space comprises N search subspaces;
Figure FDA0002921713910000021
Figure FDA0002921713910000022
wherein ,
Figure FDA0002921713910000023
representing the i-th generation of a collection of sub-network modules,
Figure FDA0002921713910000024
represents the jth sub-network module in the ith generation,
Figure FDA0002921713910000025
representing an edge set of the ith generation of search space, and connecting the sub-network modules through edges;
Figure FDA0002921713910000026
representing an edge between a jth sub-network module and a kth sub-network module in an ith generation;
Figure FDA0002921713910000027
representing the ith generation nth search space; i denotes an iteration number.
5. The evolutionary computing-based neural network architecture search method of claim 1, wherein the step S3 comprises:
step S3.1: selecting any point in each search subspace as a starting point, selecting a node farthest from the starting point as an end point, and initializing the ant number, the pheromone intensity constant and the cycle number;
step S3.2: calculating heuristic information;
Figure FDA0002921713910000028
wherein ,ηI,J(t) heuristic information from node I to node J at time t; depI,J、WigI,J、ConI,J、FilI,JRespectively representing the depth, the width, the connectivity and the number of filters of the joint J, wherein omega represents an excitation factor, and omega is more than or equal to 0 and less than or equal to 1; prize-giving deviceThe excitation mechanism is defined to excite all nodes in the current optimal network architecture; the smaller the omega value is, the larger heuristic information eta is, and the initial value of omega is set to be 1 because evolution is not generated in the initial network search space;
step S3.3: selecting a probability path;
Figure FDA0002921713910000029
wherein ,
Figure FDA00029217139100000210
representing the probability that the ant m moves from the point I to the point J at the t-th moment; allowedmRepresenting nodes which can be selected by ants in the next step; alpha represents pheromone elicitation factors, represents the effect of residual pheromones on the paths in the optimizing process, and the larger the value is, the stronger the cooperation capability among ants is, and the paths passed by other ants are prone to be selected; beta represents an expected value heuristic factor, which shows the accuracy and the degree of importance of reasoning time delay when ants select paths, and the larger the value is, the closer the state transition rule is to the greedy rule; tau isI,J(t) pheromones on the path from point I to point J at time t; tau isI,S(t) indicates points I to allowedmPheromone on any point path;
Figure FDA00029217139100000211
representing I to allowedmHeuristic information on the path of any point in the tree;
step S3.4: the pheromone is volatilized dynamically;
Figure FDA0002921713910000031
wherein ,ρI,J(t) represents the volatility coefficient on the path from I to J at the moment t; etaI,J(t) heuristic information on the path from time I to J at t;
Figure FDA0002921713910000032
representing all initiation information, wherein L represents the total number of nodes in the current network;
step S3.5: performing pheromone increment calculation;
Figure FDA0002921713910000033
wherein Q is a pheromone strength constant, which is the total amount of pheromones released by ants on a path traveled in one cycle; etamRepresenting the total amount of heuristic information suffered by the mth ant in the cycle;
step S3.6: updating pheromone;
τI,J(t+1)=(1-ρ)τI,J(t)+ΔτI,J(t,t+1)
Figure FDA0002921713910000034
wherein rho is the pheromone dynamic volatility coefficient;
Figure FDA0002921713910000035
represents the pheromone increment left by the mth ant on the path (I, J) in the current cycle, delta tauI,J(t, t +1) represents pheromone increment left by all ants passing through the path (I, J) in the current cycle; k represents the total number of ants passing through the paths (I, J) in the current cycle;
step S3.7: and (4) optimizing and judging: when the optimization of all the search subspaces reaches the maximum cycle times, the circulation is exited, and the optimization results of all the search subspaces are output as a current candidate set; otherwise, step S3.2 to step S3.7 are repeated until the maximum number of cycles is reached.
6. The evolutionary computing-based neural network architecture searching method of claim 1, wherein the evolving mutations in step S5 comprise: setting the excitation factors omega in all the sub-network modules in the current optimal network architecture as constants, wherein omega is more than or equal to 0 and less than or equal to 1; and simultaneously, randomly selecting mutation operation in the mutation set, promoting the internal structure of the sub-network module, generating a next generation sub-network module, and repeatedly executing the steps S2 to S5 until the preset target requirement is met.
7. An evolutionary computing-based neural network architecture search system, comprising:
module M1: setting target requirements through a target function according to the target requirements and the platform requirements, wherein the target requirements comprise: expected accuracy, reasoning time delay, search space size and evolution times;
module M2: according to the set size of the search space, randomly generating N directed cyclic graphs based on the sub-network module set to serve as a network search space for evolutionary optimization;
module M3: under the guidance of heuristic information, combining with a pheromone dynamic volatilization and probability path selection mechanism, searching N directed acyclic graphs for optimizing paths in N randomly generated acyclic graphs through an ant colony algorithm to form a candidate set;
module M4: acquiring the accuracy and reasoning time delay of N optimizing paths in the candidate set through training and testing, and selecting an optimal result as a current optimal network structure;
module M5: evaluating whether the current network architecture meets the target requirement, when the current network architecture does not meet the preset target requirement, and when the current network architecture meets the speed and precision requirements, pausing the optimization result, carrying out real-time evolution mutation on the internal structures of all the sub-network modules based on the current optimal network architecture, and continuing iteration until the preset target requirement is met; and when the preset target requirement is met, outputting the current optimal network result, otherwise, quitting the searching process and outputting the abnormal searching.
8. The evolutionary computing-based neural network architecture search system of claim 7, wherein the module M1 comprises:
the objective function formula is as follows:
Figure FDA0002921713910000041
s.t.LAT(Net)≤T&ACC(Net)≥A
wherein, the target function is defined as a multi-target search; net represents the network obtained by the evolutionary algorithm; acc (net) indicates the accuracy of the network; lat (net) represents the inference delay of the network; t represents the expected inference time delay; a represents the desired accuracy; the expected accuracy is set according to preset target requirements; the expected inference time delay is set according to mobile, embedded or general platform types.
9. The evolutionary computing-based neural network architecture search system of claim 7, wherein the module M2 comprises:
the sub-network modules are nodes in an undirected cyclic graph; the sub-network modules comprise a plurality of types of sub-networks with M layers, and the types of the sub-networks can be expanded and selected according to target requirements and platform requirements;
the structure of the sub-network comprises: the system comprises a multi-convolution layer, a ResNet block, a depth separable convolution, an inverted residual error structure with a linear bottleneck and a lightweight attention structure based on a compression-excitation structure;
the network search space includes: the network search space comprises N search subspaces;
Figure FDA0002921713910000042
Figure FDA0002921713910000043
wherein ,
Figure FDA0002921713910000044
representing an ith generation subnetworkA set of modules to be used in the method,
Figure FDA0002921713910000045
represents the jth sub-network module in the ith generation,
Figure FDA0002921713910000046
representing an edge set of the ith generation of search space, and connecting the sub-network modules through edges;
Figure FDA0002921713910000047
representing an edge between a jth sub-network module and a kth sub-network module in an ith generation;
Figure FDA0002921713910000048
representing the ith generation nth search space; i denotes an iteration number.
10. The evolutionary computing-based neural network architecture search system of claim 7, wherein the module M3 comprises:
module M3.1: selecting any point in each search subspace as a starting point, selecting a node farthest from the starting point as an end point, and initializing the ant number, the pheromone intensity constant and the cycle number;
module M3.2: calculating heuristic information;
Figure FDA0002921713910000051
wherein ,ηI,J(t) heuristic information from node I to node J at time t; depI,J、WigI,J、ConI,J、FilI,JRespectively representing the depth, the width, the connectivity and the number of filters of the joint J, wherein omega represents an excitation factor, and omega is more than or equal to 0 and less than or equal to 1; the reward mechanism is defined as exciting all nodes in the current optimal network architecture; the smaller the omega value is, the larger heuristic information eta is, and the initial value of omega is set to be 1 because evolution is not generated in the initial network search space;
module M3.3: selecting a probability path;
Figure FDA0002921713910000052
wherein ,
Figure FDA0002921713910000053
representing the probability that the ant m moves from the point I to the point J at the t-th moment; allowedmRepresenting nodes which can be selected by ants in the next step; alpha represents pheromone elicitation factors, represents the effect of residual pheromones on the paths in the optimizing process, and the larger the value is, the stronger the cooperation capability among ants is, and the paths passed by other ants are prone to be selected; beta represents an expected value heuristic factor, which shows the accuracy and the degree of importance of reasoning time delay when ants select paths, and the larger the value is, the closer the state transition rule is to the greedy rule; tau isI,J(t) pheromones on the path from point I to point J at time t; tau isI,S(t) indicates points I to allowedmPheromone on any point path;
Figure FDA0002921713910000054
representing I to allowedmHeuristic information on the path of any point in the tree;
module M3.4: the pheromone is volatilized dynamically;
Figure FDA0002921713910000055
wherein ,ρI,J(t) represents the volatility coefficient on the path from I to J at the moment t; etaI,J(t) heuristic information on the path from time I to J at t;
Figure FDA0002921713910000056
representing all initiation information, wherein L represents the total number of nodes in the current network;
module M3.5: performing pheromone increment calculation;
Figure FDA0002921713910000061
wherein Q is a pheromone strength constant, which is the total amount of pheromones released by ants on a path traveled in one cycle; etamRepresenting the total amount of heuristic information suffered by the mth ant in the cycle;
module M3.6: updating pheromone;
τI,J(t+1)=(1-ρ)τI,J(t)+ΔτI,J(t,t+1)
Figure FDA0002921713910000062
wherein rho is the pheromone dynamic volatility coefficient;
Figure FDA0002921713910000063
represents the pheromone increment left by the mth ant on the path (I, J) in the current cycle, delta tauI,J(t, t +1) represents pheromone increment left by all ants passing through the path (I, J) in the current cycle; k represents the total number of ants passing through the paths (I, J) in the current cycle;
module M3.7: and (4) optimizing and judging: when the optimization of all the search subspaces reaches the maximum cycle times, the circulation is exited, and the optimization results of all the search subspaces are output as a current candidate set; otherwise, repeatedly triggering the execution of the modules M3.2 to M3.7 until the maximum cycle number is reached;
the mutations evolved in the module M5 include: setting the excitation factors omega in all the sub-network modules in the current optimal network architecture as constants, wherein omega is more than or equal to 0 and less than or equal to 1; and meanwhile, randomly selecting mutation operation in the mutation set, promoting the internal structure of the sub-network module, generating a next generation sub-network module, and repeatedly triggering the execution of the modules M2 to M5 until the preset target requirement is met.
CN202110120132.8A 2021-01-28 2021-01-28 Neural network architecture searching method and system based on evolutionary computation Active CN112784949B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110120132.8A CN112784949B (en) 2021-01-28 2021-01-28 Neural network architecture searching method and system based on evolutionary computation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110120132.8A CN112784949B (en) 2021-01-28 2021-01-28 Neural network architecture searching method and system based on evolutionary computation

Publications (2)

Publication Number Publication Date
CN112784949A true CN112784949A (en) 2021-05-11
CN112784949B CN112784949B (en) 2023-08-11

Family

ID=75759475

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110120132.8A Active CN112784949B (en) 2021-01-28 2021-01-28 Neural network architecture searching method and system based on evolutionary computation

Country Status (1)

Country Link
CN (1) CN112784949B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469078A (en) * 2021-07-07 2021-10-01 西安电子科技大学 Hyperspectral image classification method based on automatic design long-time and short-time memory network
CN114117206A (en) * 2021-11-09 2022-03-01 北京达佳互联信息技术有限公司 Recommendation model processing method and device, electronic equipment and storage medium
CN114662669A (en) * 2022-03-28 2022-06-24 京东科技信息技术有限公司 Neural network architecture searching method and device
CN116108912A (en) * 2023-02-21 2023-05-12 北京航空航天大学 A Heuristic Neural Network Architecture Search Method
CN116522999A (en) * 2023-06-26 2023-08-01 深圳思谋信息科技有限公司 Model searching and time delay predictor training method, device, equipment and storage medium
CN117611974A (en) * 2024-01-24 2024-02-27 湘潭大学 Image recognition method and system based on searching of multiple group alternative evolutionary neural structures
CN117668701A (en) * 2024-01-30 2024-03-08 云南迅盛科技有限公司 AI artificial intelligence machine learning system and method
CN118014010A (en) * 2024-04-09 2024-05-10 南京信息工程大学 Multi-objective evolutionary nerve architecture searching method based on multiple group mechanisms and agent models
CN119716505A (en) * 2024-12-23 2025-03-28 电子科技大学 Simulation circuit test excitation generation method based on heuristic greedy algorithm

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12332882B1 (en) 2024-02-06 2025-06-17 Dropbox, Inc. Directed acyclic graph framework for executing search function operations

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109214498A (en) * 2018-07-10 2019-01-15 昆明理工大学 Ant group algorithm optimization method based on search concentration degree and dynamic pheromone updating
CN111144555A (en) * 2019-12-31 2020-05-12 中国人民解放军国防科技大学 Recurrent neural network architecture search method, system and medium based on improved evolutionary algorithm
CN111325356A (en) * 2019-12-10 2020-06-23 四川大学 Neural network search distributed training system and training method based on evolutionary computation
CN112101525A (en) * 2020-09-08 2020-12-18 南方科技大学 A method, apparatus and system for designing neural network through NAS

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109214498A (en) * 2018-07-10 2019-01-15 昆明理工大学 Ant group algorithm optimization method based on search concentration degree and dynamic pheromone updating
CN111325356A (en) * 2019-12-10 2020-06-23 四川大学 Neural network search distributed training system and training method based on evolutionary computation
CN111144555A (en) * 2019-12-31 2020-05-12 中国人民解放军国防科技大学 Recurrent neural network architecture search method, system and medium based on improved evolutionary algorithm
CN112101525A (en) * 2020-09-08 2020-12-18 南方科技大学 A method, apparatus and system for designing neural network through NAS

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KHALID M. SALAMA 等: "Learning neural network structures with ant colony algorithms", 《SWARM INTELL》 *
YUKANG CHEN 等: "RENAS: Reinforced Evolutionary Neural Architecture Search", 《ARXIV》 *
耿飞 等: "神经网络架构搜索综述", 《智能计算机与应用》, vol. 10, no. 6 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469078A (en) * 2021-07-07 2021-10-01 西安电子科技大学 Hyperspectral image classification method based on automatic design long-time and short-time memory network
CN114117206A (en) * 2021-11-09 2022-03-01 北京达佳互联信息技术有限公司 Recommendation model processing method and device, electronic equipment and storage medium
CN114662669A (en) * 2022-03-28 2022-06-24 京东科技信息技术有限公司 Neural network architecture searching method and device
CN116108912B (en) * 2023-02-21 2025-10-17 北京航空航天大学 Heuristic neural network architecture searching method
CN116108912A (en) * 2023-02-21 2023-05-12 北京航空航天大学 A Heuristic Neural Network Architecture Search Method
CN116522999A (en) * 2023-06-26 2023-08-01 深圳思谋信息科技有限公司 Model searching and time delay predictor training method, device, equipment and storage medium
CN116522999B (en) * 2023-06-26 2023-12-15 深圳思谋信息科技有限公司 Model searching and time delay predictor training method, device, equipment and storage medium
CN117611974A (en) * 2024-01-24 2024-02-27 湘潭大学 Image recognition method and system based on searching of multiple group alternative evolutionary neural structures
CN117611974B (en) * 2024-01-24 2024-04-16 湘潭大学 Image recognition method and system based on multi-population alternating evolution neural structure search
CN117668701B (en) * 2024-01-30 2024-04-12 云南迅盛科技有限公司 AI artificial intelligence machine learning system and method
CN117668701A (en) * 2024-01-30 2024-03-08 云南迅盛科技有限公司 AI artificial intelligence machine learning system and method
CN118014010A (en) * 2024-04-09 2024-05-10 南京信息工程大学 Multi-objective evolutionary nerve architecture searching method based on multiple group mechanisms and agent models
CN119716505A (en) * 2024-12-23 2025-03-28 电子科技大学 Simulation circuit test excitation generation method based on heuristic greedy algorithm

Also Published As

Publication number Publication date
CN112784949B (en) 2023-08-11

Similar Documents

Publication Publication Date Title
CN112784949B (en) Neural network architecture searching method and system based on evolutionary computation
Liu et al. Progressive neural architecture search
Chouikhi et al. PSO-based analysis of Echo State Network parameters for time series forecasting
Kavitha et al. Improved Harris Hawks optimization with hybrid deep learning based heating and cooling load prediction on residential buildings
Chen et al. Nonlinear system modelling via optimal design of neural trees
Lam et al. Automated generation of independent paths and test suite optimization using artificial bee colony
CN111860787A (en) A short-term prediction method and device for coupled directed graph structure traffic data containing missing data
CN112264999B (en) Method, device and storage medium for intelligent agent continuous space action planning
Gill et al. Training back propagation neural networks with genetic algorithm for weather forecasting
CN111144555A (en) Recurrent neural network architecture search method, system and medium based on improved evolutionary algorithm
CN114004008B (en) A Resource Allocation Optimization Method for Aircraft Assembly Line Based on Neural Network and Genetic Algorithm
CN117439053A (en) A Stacking integrated model power prediction method, device and storage medium
Tohidi et al. Short overview of advanced metaheuristic methods
Chauhan et al. Dqnas: Neural architecture search using reinforcement learning
CN110334869A (en) A training method for mangrove ecological health prediction based on dynamic group optimization algorithm
Qi et al. Collective intelligence evolution using ant colony optimization and neural networks
CN115453867B (en) A robust adaptive large-scale pneumatic conveying control method
CN114511078B (en) BP neural network prediction method and device based on multi-strategy sparrow search algorithm
Xiao et al. Relationships of swarm intelligence and artificial immune system
CN118230809A (en) Gene causal relationship prediction method, device and electronic equipment
Xiao et al. An analytical approach to the similarities between swarm intelligence and artificial neural network
CN118917353A (en) Method and system for searching evolutionary neural architecture based on performance level agent assistance
Aminian et al. A robust predictive model for base shear of steel frame structures using a hybrid genetic programming and simulated annealing method
CN118999570A (en) Robot path planning method, device and medium based on quantum search and reinforcement learning
CN118412067A (en) Method for predicting energy level of organic solar cell material

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant