
WO2022126448A1 - Method and system for neural architecture search based on evolutionary learning - Google Patents


Info

Publication number
WO2022126448A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
supernet
network model
population
weight
Prior art date
Legal status
Ceased
Application number
PCT/CN2020/136950
Other languages
English (en)
Chinese (zh)
Inventor
程然
谭浩
何成
侯章禄
邱畅啸
杨帆
Current Assignee
Huawei Technologies Co Ltd
Southern University of Science and Technology
Original Assignee
Huawei Technologies Co Ltd
Southern University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd, Southern University of Science and Technology filed Critical Huawei Technologies Co Ltd
Priority to PCT/CN2020/136950 priority Critical patent/WO2022126448A1/fr
Priority to CN202080107589.9A priority patent/CN116964594B/zh
Publication of WO2022126448A1 publication Critical patent/WO2022126448A1/fr

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/0985 Hyperparameter optimisation; Meta-learning; Learning-to-learn
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/12 Computing arrangements based on biological models using genetic models

Definitions

  • the present application relates to the field of artificial intelligence, and in particular, to a method and system for searching neural network structures based on evolutionary learning.
  • NAS: Neural Architecture Search (neural network structure search).
  • Auto-ML: automatic machine learning.
  • A neural network usually consists of many nodes. When searching for a neural network structure, a completely arbitrary combination of nodes can be used, that is, each node can be connected to any other node, and different operations between nodes can be chosen.
  • Because the search space grows exponentially with the number of nodes, the search space is huge and the search is very slow. Since the search space involved in NAS is huge and its performance evaluation often involves model training, it consumes a great deal of resources.
  • embodiments of the present application provide a method, system, electronic device and storage medium for searching a neural network structure based on evolutionary learning.
  • The present application provides a method for searching a neural network structure based on evolutionary learning, the method comprising: S101, initializing a population, where the population is a set of structure codes of multiple different neural network structures, and a structure code uses continuous real-number intervals to indicate the connections between any two nodes of a neural network structure and the mapping to the operations on those connections; S102, randomly selecting two structure codes in the population, decoding the two structure codes to obtain two neural network structures, and pairing the two neural network structures; the two structures respectively inherit the corresponding weights from the supernet to obtain a first neural network model and a second neural network model, where the supernet includes a set of multiple operations and the weight of each operation; S103, training the first and second neural network models respectively to obtain trained first and second neural network models, inputting labeled voice, video or graphic samples into the trained first and second neural network models, and calculating the error value between each output result and the label to obtain a winner and a loser, the error value of the winner being smaller than that of the loser.
  • S104, update the supernet according to the trained first and second neural network models.
  • S105, calculate a pseudo gradient value between the structure code of the loser and the structure code of the winner, and evolve the structure code of the loser toward the structure code of the winner based on the pseudo gradient value, so as to obtain a third neural network structure code.
  • The pseudo gradient is the gradient of the structure code update.
  • S106, replace the structure code of the neural network structure corresponding to the loser in the population with the third neural network structure code to obtain an updated population.
  • S107, output the optimal neural network model in the updated population, thereby completing the neural network structure search.
  • This embodiment uses a continuous real-number space to represent the neural network structure, which reduces the search space corresponding to operation selection, improves NAS search efficiency, and increases the diversity of neural network structures in the population to match the subsequent pairwise second-order learning evolution; it can alleviate the poor results and high computational cost of existing neural network structure search methods. A minimal sketch of the overall loop is given below.
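  • For concreteness, the following sketch shows one way steps S102-S106 could be wired together; it is not the patent's implementation, and the decode, inherit_weights, train_and_score, update_supernet and pseudo_gradient_step callables are hypothetical placeholders supplied by the caller.

```python
import random

def evolutionary_nas(population, supernet, decode, inherit_weights,
                     train_and_score, update_supernet, pseudo_gradient_step,
                     generations):
    """Sketch of steps S102-S106; S101 (population/supernet initialization) and
    S107 (picking the best model) are assumed to happen outside this loop."""
    for _ in range(generations):
        i, j = random.sample(range(len(population)), 2)            # S102: pair two codes
        net_a = inherit_weights(decode(population[i]), supernet)   # inherit supernet weights
        net_b = inherit_weights(decode(population[j]), supernet)
        err_a, net_a = train_and_score(net_a)                      # S103: train, then error on
        err_b, net_b = train_and_score(net_b)                      #       labeled samples
        if err_a < err_b:                                          # smaller error wins
            win, lose, net_w, net_l = i, j, net_a, net_b
        else:
            win, lose, net_w, net_l = j, i, net_b, net_a
        update_supernet(supernet, winner=net_w, loser=net_l)       # S104: write weights back
        population[lose] = pseudo_gradient_step(population[lose],  # S105-S106: loser learns
                                                population[win])   # from winner, replaces itself
    return population
```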
  • Outputting the optimal neural network model in the updated population, so as to complete the search of the neural network structure, includes: outputting the optimal neural network model in the updated population when the termination condition is satisfied.
  • the population is iteratively evolved by setting termination conditions, which improves the reliability of the neural network structure search.
  • Outputting the optimal neural network model in the updated population, thereby completing the search of the neural network structure, includes: if the termination condition is not satisfied, returning to S102 and performing iterative evolution on the updated population until the termination condition is met, and then outputting the optimal neural network model in the updated population, thereby completing the search of the neural network structure.
  • This embodiment finds a set of optimal neural network models through an iterative method based on the characteristics of the population, which can provide decision makers with multiple choices.
  • Inheriting the corresponding weights of the two neural network structures from the supernet to obtain the first neural network model and the second neural network model includes: the first neural network structure inherits, from the supernet, the connections identical to those of the first neural network structure and the first weights corresponding to the identical operations on those connections, to obtain the first neural network model;
  • the second neural network structure inherits, from the supernet, the connections identical to those of the second neural network structure and the second weights corresponding to the identical operations on those connections, to obtain the second neural network model.
  • This embodiment speeds up model construction by inheriting the weights of the supernet; in the iterative process, the weights inherited by a neural network structure from the supernet are already optimized weights, which significantly reduces the computational cost and running time required for searching the neural network.
  • Training the first and second neural network models respectively to obtain the trained first and second neural network models includes: training the weight values of the first neural network model at least once using stochastic gradient descent to obtain an optimized first neural network model, and training the weight values of the second neural network model at least once using stochastic gradient descent to obtain an optimized second neural network model.
  • first and second neural network models optimized for weight values are obtained by training the first and second neural network models.
  • Inputting the labeled samples into the trained first and second neural network models and calculating the error value between the output result and the label to obtain the winner and the loser
  • includes: inputting the labeled voice, video or graphic samples into the trained first neural network model and the trained second neural network model respectively; according to the first output result of the trained first neural network model, calculating a first error value between the first output result and the label of the sample; according to the second output result of the trained second neural network model, calculating a second error value between the second output result and the label of the sample;
  • comparing the first error value and the second error value, taking the model with the smaller error value as the winner and the model with the larger error value as the loser, thereby obtaining the winner and the loser.
  • The paired first and second neural network models are trained on the labeled samples and their performance is evaluated, which accelerates finding the optimal model.
  • Updating the supernet according to the trained first and second neural network models includes: under the condition that two nodes of the first and second neural network models contain the same connection and the operation corresponding to the connection is the same, updating the supernet by using the weight of the winner as the weight of the corresponding operation in the supernet.
  • This embodiment can synchronously optimize the operation weights of the supernet, so as to speed up the search; the weight update of the supernet can significantly reduce the computational cost and running time required for searching the neural network.
  • Updating the supernet according to the trained first and second neural network models includes: under the condition that the connections between two nodes of the first and second neural network models, or the operations corresponding to those connections, are not the same, using the weight of the first neural network model as the weight in the supernet of the connection that is identical to that of the first neural network and of the identical operation corresponding to the connection;
  • using the weight of the second neural network model as the weight in the supernet of the connection that is identical to that of the second neural network and of the identical operation corresponding to the connection; and updating the supernet accordingly.
  • This embodiment can synchronously optimize the operation weights of the supernet, so as to speed up the search; the weight update of the supernet can significantly reduce the computational cost and running time required for searching the neural network.
  • Calculating a pseudo gradient value between the structure code of the loser and the structure code of the winner, and evolving the structure code of the loser toward the structure code of the winner based on the pseudo gradient value to obtain the structure code of the third neural network structure, includes: calculating the difference between the structure code value of the loser and the structure code value of the winner, multiplying the difference by a random coefficient, and accumulating it with the historical pseudo gradient multiplied by another random coefficient to obtain the value of the pseudo gradient for updating the structure code of the loser; summing the structure code value of the loser and the value of the pseudo gradient to obtain
  • the structure code of the third neural network structure, thereby realizing the evolution of the structure code of the loser toward the structure code of the winner.
  • this embodiment enables the loser to perform structural evolution update by learning from the winner, so as to find the optimal neural network model more quickly.
  • The termination condition includes that all structure codes in the population have participated in pairing, or that a set number of iterations has been reached.
  • the population is fully iteratively evolved by setting termination conditions, which improves the reliability of the neural network structure search.
  • the present application provides a search system for a neural network structure based on evolutionary learning
  • The system includes: a population initialization module for initializing a population, where the population is a set of structure codes of a plurality of different neural network structures, and a structure code uses continuous real-number intervals to indicate the connections between any two nodes of a neural network structure and the mapping to the operations on those connections; an individual pairing module for randomly selecting two structure codes in the population, decoding the two structure codes to obtain two neural network structures, and pairing the two neural network structures; a weight inheritance module that inherits the corresponding weights of the two neural network structures from the supernet to obtain a first neural network model and a second neural network model, wherein the supernet includes a set of multiple operations and a weight for each operation; a training module for training the first and second neural network models respectively to obtain trained first and second neural network models; an evaluation module for inputting labeled voice, video or graphic samples into the trained first and second neural network models and calculating the error value between the output result and the label
  • to obtain the winner and the loser, the error value of the winner being smaller than that of the loser;
  • the supernet weight update module is used to update the supernet according to the trained first and second neural network models;
  • a structure code evolution module configured to calculate a pseudo gradient value between the structure code of the loser and the structure code of the winner, and to evolve the structure code of the loser toward the structure code of the winner based on the pseudo gradient value, so as to obtain a third neural network structure code;
  • the pseudo gradient is the gradient of the structure code update;
  • a population update module for replacing, with the third neural network structure code, the structure code of the neural network structure corresponding to the loser in the population, so as to obtain the updated population;
  • the model output module outputs the optimal neural network model in the updated population, thereby completing the search of the neural network structure.
  • the model output module is configured to output the optimal neural network model in the updated population under the condition that the termination condition is satisfied, so as to complete the search of the neural network structure.
  • The model output module is configured to: if the termination condition is not met, return to S102 and perform iterative evolution on the updated population, and after the termination condition is met, output the optimal neural network model in the updated population so as to complete the search of the neural network structure.
  • The weight inheritance module is configured to: inherit, for the first neural network structure, from the supernet the connections identical to those of the first neural network structure and the first weights corresponding to the identical operations on those connections, to obtain the first neural network model; and inherit, for the second neural network structure, from the supernet the connections identical to those of the second neural network structure and the second weights corresponding to the identical operations on those connections, to obtain the second neural network model.
  • the training module is used for: training the weight value of the first neural network model at least once by using stochastic gradient descent, to obtain the optimized first neural network model; training by using stochastic gradient descent The weight value of the second neural network model is obtained at least once to obtain the optimized second neural network model.
  • The evaluation module is used to: input the labeled voice, video or graphic samples into the trained first neural network model and the trained second neural network model respectively; according to the first output result of the trained first neural network model, calculate the first error value between the first output result and the label of the sample; according to the second output result of the trained second neural network model, calculate the second error value between the second output result and the label of the sample; compare the first error value and the second error value, record the model with the smaller error value as the winner and the model with the larger error value as the loser, and thus obtain the winner and the loser.
  • the supernet weight update module is configured to: under the condition that the two nodes of the first and second neural network models contain the same connection and the corresponding operations of the connection are the same, The weight of the winner is used as the weight of the corresponding operation in the supernet to update the supernet.
  • The supernet weight update module is configured to: under the condition that the connections between the two nodes of the first and second neural network models, or the operations corresponding to those connections, are not the same, use the weight of the first neural network model as the weight in the supernet of the connection that is identical to that of the first neural network and of the identical operation corresponding to the connection; use the weight of the second neural network model as the weight in the supernet of the connection that is identical to that of the second neural network and of the identical operation corresponding to the connection; and update the supernet.
  • The structure code evolution module is configured to: calculate the difference between the structure code value of the loser and the structure code value of the winner, multiply the difference by a random coefficient, accumulate it with the historical pseudo gradient multiplied by another random coefficient to obtain the value of the pseudo gradient for updating the structure code of the loser, and sum the structure code value of the loser and the value of the pseudo gradient to obtain
  • the structure code of the third neural network structure, thereby realizing the evolution of the structure code of the loser toward the structure code of the winner.
  • The termination condition includes that all structure codes in the population have participated in pairing, or that a set number of iterations has been reached.
  • The present application provides an electronic device, including a memory and a processor; the processor is configured to execute computer-executable instructions stored in the memory, and by running the computer-executable instructions the processor performs the search method described in any one of the above embodiments.
  • The present application provides a storage medium, including a readable storage medium and a computer program stored in the readable storage medium, where the computer program is used to implement the method for searching a neural network structure based on evolutionary learning described in any of the foregoing embodiments.
  • The method, system, electronic device and storage medium for searching a neural network structure based on evolutionary learning provided by the embodiments of the present application map a continuous space to the neural network structure so that continuous mathematical operations can be performed on the structure, which gives the algorithm better global search ability; the structure update method based on population pairing and second-order learning finds the optimal solution faster; at the same time, a set of solutions can be found based on the characteristics of the population, which provides multiple choices for decision makers and improves the reliability of the algorithm; and the weight inheritance and update of the supernet speed up model evaluation and significantly reduce the computational cost and running time required for searching the neural network.
  • FIG. 1 is a schematic diagram of an application environment of a neural network structure search provided by an embodiment of the present application
  • Figure 2 is a flowchart of the population-based neural network structure search proposed by the first scheme
  • FIG. 3 is a basic framework diagram of a neural network structure search based on evolutionary learning provided by a system embodiment of the present application;
  • FIG. 4 is a general flowchart of a method for searching a neural network structure based on evolutionary learning provided by an embodiment of the present application
  • FIG. 5a is a block diagram of a specific embodiment of a method for searching a neural network structure based on evolutionary learning provided by an embodiment of the application;
  • Figure 5b is a flow chart of population initialization
  • Figure 5c is a block diagram of the initialization flow of the supernet
  • FIG. 6 is a schematic diagram of operations between two nodes in a method for searching a neural network structure based on evolutionary learning provided by an embodiment of the present application;
  • FIG. 7 is a schematic diagram of connections and operations between two nodes of a supernet in a method for searching a neural network structure based on evolutionary learning provided by an embodiment of the present application;
  • FIG. 8 is a flowchart of a method for updating a supernet weight provided by an embodiment of the present application.
  • FIG. 9 is a flowchart of a population pairing-based structure updating method provided by an embodiment of the present application.
  • FIG. 10 is a system block diagram of a neural network structure search based on evolutionary learning provided by an embodiment of the application;
  • FIG. 11 is a block diagram of a system for updating a supernet provided by an embodiment of the present application.
  • FIG. 12 is a schematic diagram of an electronic device according to an embodiment of the present application.
  • the neural network structure search (NAS) technology is applied in a wide range of scenarios.
  • using algorithms to automatically design neural network structure models can achieve better performance than manually designed neural network structures;
  • using neural network structure search to generate neural network structure models to process data such as Magnetic Resonance Imaging (MRI), Computed Tomography (CT) and ultrasound images to determine whether a patient has a disease.
  • FIG. 1 is a schematic diagram of an application environment of a neural network structure search provided by an embodiment of the present application; as shown in FIG. 1 , the application scenario includes at least one background server 10 and a smart device 11 .
  • the smart device 11 can be connected to the backend server 10 through the Internet; the smart device 11 can include smart devices capable of outputting medical images, voice, video or pictures, such as magnetic resonance imagers, smart speakers, smart cameras, and smart phones.
  • The intelligent device 11 is provided with a picture, voice, medical image or video collection device, and the collected picture, voice, medical image or video data can be sent to the background server 10, so that the background server 10 can input the picture, voice, medical image or video into a neural network structure model generated by neural network structure search for classification, segmentation or identification.
  • NAS is a subset of hyperparameter optimization. Customized NAS methods are not actually fully automated; they rely on neural network structures specially hand-coded for the application or learning task as the starting point of the search. In general, the goal of a neural network structure search method is defined as the bilevel problem
  • min_α L_val(ω*(α), α)  subject to  ω*(α) = argmin_ω L_train(ω, α),
  • where α is defined as the structure code, ω is defined as the weight information, ω*(α) is the corresponding optimal weight, and L_train and L_val are the loss values on the training and validation sets, respectively.
  • The first solution is population-based neural network structure search, one of the most common approaches in current neural network structure search research. The general process is to initialize a population, select parent individuals, apply crossover, mutation and other operators to the topology of the parents to obtain the topology of the offspring, and finally use the idea of "survival of the fittest" to eliminate individuals with low fitness and retain the better individuals. By iterating this process, the population continuously evolves toward a global/local optimal solution.
  • Figure 2 shows the population-based neural network structure search proposed by the first scheme. As shown in Figure 2, the steps include: initialize the population, which is a collection of individuals containing different neural network structures; train each individual neural network structure and obtain its accuracy on the validation set as the fitness of the individual; judge whether the termination conditions set by the algorithm are met; if the judgment result is "No", generate offspring neural network structures from the parent neural network structures,
  • train the offspring neural network structures to obtain their accuracy on the validation set as the offspring fitness values, select individuals of at least one neural network structure from the parent and offspring neural network structures according to the fitness values, and take the set of selected individuals as the new population; output the selected new population.
  • the main difference between different population-based structure search algorithms lies in the steps of generating the child neural network structure from the parent neural network structure through different crossover and mutation operators.
  • There are many designs of crossover and mutation operators.
  • The AmoebaNet algorithm defines a macro template of the neural network structure and designs two mutation operators: an operator that changes the operations between nodes and an operator that changes the connections between nodes.
  • The Large-Scale Evolution algorithm does not define a macro template and proposes eleven different mutation operators, including an operator that changes the learning rate, an operator that inserts a convolutional layer, an operator that removes a convolutional layer, an operator that changes the number of channels, and so on.
  • the method can automatically evolve a complex neural network structure from a simple neural network structure.
  • Although the population-based neural network structure search method has the advantages of being suitable for parallelism and being highly reliable, the large number of individuals in the population whose fitness must be evaluated consumes substantial GPU resources and time; for example, AmoebaNet needs 3150 GPU days to complete the search task. It is therefore difficult for this method to strike a balance between structure search accuracy and resource consumption.
  • The second solution, differentiable architecture search, maps the neural network into a continuous space and solves it using gradient descent, so that parameters such as the structure and the weights of the neural network can be obtained at the same time.
  • The gradient of the validation loss with respect to the structure is approximated as
  • ∇_α L_val(ω − ξ·∇_ω L_train(ω, α), α),
  • where α is defined as the structure of the neural network model, ω represents the current weights, ω*(α) is the corresponding optimal weight, ξ represents the learning rate for one step of inner optimization, and L_val is the loss value on the validation set.
  • This method approximates ω*(α) by training ω for one step instead of training ω to convergence. It searches the neural network structure along the gradient direction, so it can quickly find a better neural network structure.
  • the structure search method based on differentiable neural network has the advantage of being fast.
  • However, since this method optimizes only a single individual rather than a population, and only a single structure is searched at a time, its reliability is low.
  • Moreover, this method only uses the gradient information of the single individual and cannot avoid locally optimal structures; and it uses a probability to encode each possible connection and operation, resulting in a huge search space for the encoding and a high optimization cost.
  • the following introduces the concept of a method and system for searching for a neural network structure based on evolutionary learning provided by the embodiments of the present application.
  • FIG. 3 provides a basic framework diagram of a neural network structure search based on evolutionary learning according to a system embodiment of the present application.
  • an embodiment of the present application provides a method and system for neural network structure search based on evolutionary learning.
  • The solution uses a population-based pairing mechanism and a second-order learning method to generate new neural network models for population update; it trains each newly generated neural network model with the gradient descent method starting from the supernet weights, and uses the trained neural network models to update the weights of the supernet model, thereby completing the automatic search of the neural network structure. In this process, performance evaluation is performed on the trained paired neural network models, and the loser of the evaluation learns from the winner to generate a new neural network model for population update.
  • the solution can solve the problems of poor effect of existing neural network structure search methods and high consumption of computing resources.
  • self-defined coding refers to coding the neural network structure according to the learning task or applying artificially set coding rules.
  • the nodes in the neural network structure can be represented by multiple real variables respectively, and the connection and operation between any two nodes are unified and independent codes.
  • a supernet is a directly defined neural network with the same number of nodes as the neural network model in the initialized population, including all connection relationships and operation relationships, and the corresponding weights of its operations are used for sharing.
  • the structure of the supernet is fixed, and its optimal weight can be optimized by standard backpropagation.
  • the optimized weight value is applicable to all neural network models to improve the recognition performance.
  • FIG. 4 is a flowchart of a method for searching a neural network structure based on evolutionary learning according to an embodiment of the present application.
  • the flow of the method is: S101, initialize the population and the supernet; each neural network structure of the population is a structural code, and the population initialization is to randomly initialize these codes.
  • S102 for each code initialized in the population, first decode it into a neural network structure and then perform random pairing, and the two paired neural network structures respectively inherit weights from the initialized supernet.
  • S103: according to the learning task, train and optimize the two paired neural network structures after weight inheritance to obtain two neural network models, evaluate the performance of the two trained neural network models on the validation set, and obtain the loser and the winner according to the evaluation results.
  • S104 update the corresponding weight value of the supernet according to the neural network model obtained after training in S103 and the evaluation result;
  • S105 according to the evaluation result, make the structural code of the loser learn from the structural code of the winner to obtain a new structure of the neural network coding, and then replace the structural coding of the loser in the population with the structural coding of the new neural network to update the population;
  • S106, judge whether the termination condition is met; if the termination condition is met, execute S107; otherwise, return to S102 to continue the iterative evolution of the updated population.
  • the termination condition is that all individuals in the population participate in pairing and reach the set number of iterations;
  • S107 output the preference model in the new population.
  • the preference model is the optimal neural network that meets the needs of the learning task.
  • FIG. 5a is a block diagram of an embodiment of a method for searching a neural network structure based on evolutionary learning according to an embodiment of the present application. As shown in Figure 5a, the method is implemented by performing the following steps.
  • the coding rules are customized for the application or learning task, and the structure coding of the neural network structure is generated according to the coding rules, and the continuous real number intervals are respectively mapped to the neural network structure.
  • Applications or learning tasks here include classifying, segmenting, or recognizing input pictures, speech, medical images, or videos.
  • S2011: set the nodes of the neural network structure, express the connections between the set nodes as continuous real numbers, randomly connect the nodes, and encode the connections of the nodes and the operations corresponding to the connections into the structure code α of the neural network; α is set as a vector containing the connections between nodes and the operations on these connections, so that continuous real-number intervals are respectively mapped to the neural network structure.
  • For example, the neural network structure can be set to have m nodes, and the continuous real-number spaces [0, 1), [1, 2), [2, 3), [3, 4), ..., [m-1, m) are mapped to the m nodes; the first two nodes represent the input, and each subsequent node randomly selects two nodes in front of it to connect to. Therefore, each node except the first two needs to store four variables: two of them are node codes indicating the connected predecessor nodes, and the other two are operation codes indicating the operations on the two connections.
  • Each node is thus represented by four variables; the structure code α is a vector containing these groups of four variables, and the value of each code is defined on a real interval whose upper and lower limits differ by 1.
  • For example, if the connection codes of a certain node are 0.5 and 2.3, the node is connected to node 0 and node 2.
  • N structure codes can be decoded into N neural network structures with the same number of nodes, different connection relationships and different operations.
  • Using continuous real number space to represent the neural network structure can increase the diversity of the neural network structure within the population to match the second-order learning evolution of the subsequent neural network.
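  • As a rough illustration of this encoding, the sketch below draws continuous connection codes for each node and recovers the predecessor indices by flooring; uniform sampling over [0, node_index) is an assumption made here for illustration only.

```python
import math
import random

def random_connection_codes(m):
    """For each node after the two input nodes, draw two continuous connection codes
    from [0, node_index); flooring a code gives the index of the predecessor node."""
    return [[random.uniform(0, node) for _ in range(2)] for node in range(2, m)]

def decode_connections(codes):
    # e.g. codes 0.5 and 2.3 mean the node is connected to node 0 and node 2
    return [[math.floor(c) for c in pair] for pair in codes]
```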
  • S2014 set multiple operations between every two nodes, and set the weight of each operation.
  • the neural network structure represented by any possible structural code in the population is a sub-network of the supernet, and the sub-network is recorded as a network unit. Only one operation can be selected between every two nodes of the neural network structure in the population.
  • the first operation is a 3*3 average pooling layer;
  • the second operation is a 3*3 max pooling layer;
  • the third operation is a 3*3 convolutional layer.
  • For example, the continuous real-number space [0, 1) can be mapped to the first operation, so an operation code in [0, 1) between node 0 and node 1 represents the average pooling operation; the continuous real-number space [1, 2) can be mapped to the second operation, so an operation code in [1, 2) represents the max pooling operation; and the continuous real-number space [2, 3) is mapped to the third operation, so an operation code in [2, 3) represents the convolution operation, as sketched below.
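  • A minimal sketch of how such an operation code could be decoded, using the interval boundaries from the example above (the operation names are illustrative labels, not identifiers from the patent):

```python
def decode_operation(op_code):
    """Map a continuous operation code to one of the three candidate operations."""
    if 0.0 <= op_code < 1.0:
        return "avg_pool_3x3"
    if 1.0 <= op_code < 2.0:
        return "max_pool_3x3"
    if 2.0 <= op_code < 3.0:
        return "conv_3x3"
    raise ValueError("operation code outside the defined real-number space")
```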
  • The schematic diagram of the connections between any two nodes of the supernet is shown in FIG. 7.
  • The supernet does not involve structure coding, and every two nodes of the supernet can include, in parallel, all possible connections and operations required for the application or learning task, including but not limited to average pooling, max pooling and convolution operations. Each operation contains its own weight information and needs to be trained separately.
  • The population initialization step is coded using coding rules for the connections between two nodes and for the operation search, which map independent continuous variable intervals to the connection between two nodes and the operation corresponding to that connection. This reduces the search space corresponding to operation selection, improves NAS search efficiency, and converts discrete real numbers, combination numbers and probability values into continuous real numbers.
  • Execute S202: randomly select the structure codes corresponding to two neural network structures in the population, decode them into two neural network structures for pairing, and have the two paired neural network structures inherit weights from the supernet.
  • the weight includes the weight value of the operation.
  • the nth neural network structure is recorded as the first neural network structure
  • the n+1th neural network structure is recorded as the second neural network structure
  • The first neural network structure inherits, from the initialized supernet, the connections identical to those of the first neural network structure and the weights corresponding to the identical operations on those connections, to obtain the first neural network model;
  • the second neural network structure inherits, from the initialized supernet, the connections identical to those of the second neural network structure and the weights corresponding to the identical operations on those connections, to obtain the second neural network model.
  • the operation weight of the convolutional layer is 2.6.
  • the first neural network structure inherits the same connection as the first neural network structure and the weight value corresponding to the same operation corresponding to the connection from the initialized supernet, so that the weight of the operation of the convolutional layer of the first neural network structure is 2.6.
  • the second neural network structure inherits from the supernet a connection with the same structure and a weight value corresponding to the same operation corresponding to the connection.
  • The weights inherited from the supernet by the paired neural network structures for the first time are the weight values in the initialized supernet corresponding to the identical connections and the identical operations on those connections; in each subsequent iteration, the weights inherited from the supernet by the paired neural network structures are the corresponding weight values in the updated supernet, as sketched below.
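  • The inheritance step can be pictured as a simple lookup, assuming (for illustration only) that both the decoded structure and the supernet store weights in a dictionary keyed by (from_node, to_node, operation); this layout is an assumption, not the patent's data format.

```python
def inherit_weights(structure, supernet_weights):
    """Copy, for every (from_node, to_node, operation) key present in the decoded
    structure, the matching weight from the supernet."""
    return {key: supernet_weights[key] for key in structure}
```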
  • S203: in combination with the learning task, perform one or more gradient descent training steps on the two neural network models after weight inheritance to optimize the weight values; verify the trained first neural network model and second neural network model on the validation set to obtain the error value of the first neural network model and the error value of the second neural network model; compare the two error values, record the neural network model with the smaller error value as the winner and the neural network model with the larger error value as the loser, and obtain the evaluation result.
  • Specifically, the stochastic gradient descent method is used to train the two neural network models respectively; formula (3) gives the weight drop value of the current neural network model, and formula (4) gives the optimized weight ω of the neural network model:
  • Δω(t) = m·Δω(t−1) − η(t)·∇_ω L(ω(t−1))   (3)
  • ω(t) = ω(t−1) + Δω(t)   (4)
  • where t is the number of iterations of the stochastic gradient descent method, Δω(t) represents the weight drop value at the t-th iteration, ω(t) represents the optimized weight value of the neural network model at the t-th iteration, m is the momentum, η(t) is the learning rate, and L is the error value (loss) of the neural network on the training set; the accuracy of the current neural network model is then evaluated on the validation set.
  • The first neural network model is trained along the gradient descent direction to optimize its operation weights, and the optimized weight value ω1 is computed from the calculated weight drop value Δω1(t), yielding the first neural network model after one optimization.
  • Likewise, the second neural network model is trained along the gradient direction to optimize its operation weights, and the optimized weight value ω2 is computed from the calculated weight drop value Δω2(t), yielding the optimized second neural network model.
  • S2032: verify the error values of the once-optimized first neural network model and the once-optimized second neural network model on the validation set, record the model with the smaller error value as the winner and the model with the larger error value as the loser, and obtain the evaluation result, as sketched below.
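  • A small sketch of one training step and the winner/loser comparison follows; it assumes the classical momentum form of formulas (3)-(4) reconstructed above and a dictionary-of-weights layout, neither of which is spelled out in the original text.

```python
def momentum_sgd_step(weights, grads, velocity, momentum, lr):
    """One update following formulas (3)-(4) as reconstructed above (momentum SGD
    is assumed): delta(t) = m*delta(t-1) - lr*grad, w(t) = w(t-1) + delta(t)."""
    new_velocity = {k: momentum * velocity[k] - lr * grads[k] for k in weights}
    new_weights = {k: weights[k] + new_velocity[k] for k in weights}
    return new_weights, new_velocity

def pick_winner_and_loser(model_a, model_b, validation_error):
    """Smaller validation error wins (step S2032); validation_error is a caller-supplied callable."""
    if validation_error(model_a) < validation_error(model_b):
        return model_a, model_b
    return model_b, model_a
```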
  • Update the weight in the supernet of the connection identical to the structure of the first neural network model and of the identical operation corresponding to that connection to the optimized weight value ω1 of the first neural network model, and update the weight in the supernet of the connection identical to the structure of the second neural network model and of the identical operation corresponding to that connection to the optimized weight of the second neural network model.
  • step S2051 is first performed, and according to the evaluation results of the two neural network models optimized after training in S204, the loser learns from the winner to obtain a new neural network model.
  • Pseudo-gradient-based learning update is adopted for the loser to optimize the structure code α of the loser, so that the structure code of the loser approaches the structure code of the winner, and the structure code of the new neural network structure then replaces the loser in the population.
  • Pseudo-gradient-based learning and updating can include first-order gradient learning and updating, second-order gradient learning and updating, or both, and may even be extended with constant terms or multiples of the gradient information. Specifically, in the paired neural network structures, let the structure code of the winner be αw and the structure code of the loser be αl; then the pseudo gradient Δl for updating the structure code of the loser's neural network model is:
  • ⁇ l (t) a* ⁇ *( ⁇ w (t)- ⁇ l (t))+b* ⁇ * ⁇ l (t-1)+c (5)
  • ⁇ l (t) represents the pseudo-gradient value of the structural encoding of the t-th generation loser
  • ⁇ and ⁇ represent two real values randomly sampled from a [0,1] uniform distribution
  • ⁇ , b are two [- A given real value between 1, 1], indicating the degree of confidence in gradients of different orders
  • c is a given real number between [-1, 1], indicating the bias effect on the pseudo gradient
  • ⁇ l (t -1) is the historically accumulated pseudo gradient value before the structure update of the loser
  • the initial value ⁇ l (0) is 0.
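  • A sketch of the update of formula (5), with the structure codes treated as plain Python lists (an assumption made here for illustration):

```python
import random

def pseudo_gradient_step(code_loser, code_winner, delta_prev, a, b, c):
    """Formula (5): the loser's structure code moves toward the winner's.
    eta and beta are resampled from U[0, 1] on every call; a, b, c are given
    constants in [-1, 1]; delta_prev is the accumulated pseudo gradient (zeros initially)."""
    eta, beta = random.random(), random.random()
    delta = [a * eta * (w - l) + b * beta * d + c
             for l, w, d in zip(code_loser, code_winner, delta_prev)]
    new_code = [l + d for l, d in zip(code_loser, delta)]  # third structure code
    return new_code, delta
```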
  • step S2052 is performed, and the structure code of the new neural network structure is used to replace the loser in the population, and the population is updated.
  • S206 judge whether the termination condition is met, if so, execute S207; otherwise, repeat steps 202-206, perform pairing and iterative learning according to the population, and continue to evolve and update the population until the set termination condition is reached.
  • The termination condition can be that all structure codes of the corresponding neural network structures in the population have been paired and learned.
  • When executing step S206, it can be judged that if n < N−1, the value of n is increased by 2 and execution returns to step S202, and if n ≥ N−1, S207 is executed.
  • the termination condition can also be reaching a set number of iterations.
  • When executing step S206, it can be judged that if t < T and n < N−1, the value of t is increased by 1, the value of n is increased by 2, and execution returns to step S202; if t < T and n ≥ N−1, the value of t is increased by 1, the value of n is reset to 1, and execution returns to step S202; if t ≥ T, S207 is executed. A sketch of this bookkeeping is given below.
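  • Under this reading of the S206 bookkeeping, the pairing loop could look like the following sketch, where pair_and_update is a hypothetical callable performing S202-S205 for one pair of structure codes:

```python
def search_loop(population, pair_and_update, T):
    """Sketch of the termination bookkeeping: t counts pairing iterations up to the
    set limit T, and n walks through the population two codes at a time, wrapping
    back to the first code so every structure code keeps participating in pairing."""
    N = len(population)
    t, n = 1, 1                                  # 1-based counters as in the text
    while t < T:
        pair_and_update(population, n - 1, n)    # pair the n-th and (n+1)-th codes
        t += 1
        n = n + 2 if n < N - 1 else 1            # advance, or start a new pass
    return population
```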
  • FIG. 8 is a flowchart of the method for updating supernet weights proposed by the present application; as shown in FIG. 8, it includes:
  • S302: randomly pair the decoded neural network structures in the population; the two paired neural network structures inherit, from the initialized supernet, the identical connections and the weights corresponding to the identical operations on those connections, to generate two neural network models.
  • The weight values inherited from the supernet by the paired neural network structures for the first time are the weight values in the initialized supernet of the identical connections and the identical operations on those connections; in each subsequent iteration, the weight values inherited from the supernet by the paired neural network structures are the corresponding weight values in the updated supernet.
  • S303: perform one or more gradient descent training steps on the two neural network models after they inherit the weights in S302.
  • the loser and the winner are obtained by calculating the error values of the two neural network models on the validation set, the neural network model with the smaller error value is the winner, and the neural network model with the larger error value is the loser.
  • Where the two neural network models share an identical connection and the identical operation corresponding to that connection, the weight value of that connection and operation in the supernet is updated to the weight value of the winner.
  • Otherwise, the weight values in the supernet of the operations corresponding to the connections of the two neural network models respectively use the optimized weight values of the two neural network models, as sketched below.
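  • The write-back rule described in the two items above can be sketched as follows, again assuming a dictionary keyed by (connection, operation); this is an illustration, not the patent's data layout.

```python
def update_supernet(supernet_weights, winner_weights, loser_weights):
    """Write trained weights back into the supernet: keys shared by both models take
    the winner's weight; keys present in only one model take that model's own weight."""
    shared = set(winner_weights) & set(loser_weights)
    supernet_weights.update({k: winner_weights[k] for k in winner_weights})               # winner first
    supernet_weights.update({k: loser_weights[k] for k in loser_weights if k not in shared})
    return supernet_weights
```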
  • On the other hand, the present application also proposes a structure update mechanism based on population pairing, which realizes non-repetitive pairing of the neural network structures in the population for competition; the loser performs pseudo-gradient-based second-order learning toward the winner, generating a new individual to replace the original loser.
  • FIG. 9 is a flowchart of the structure updating method based on population pairing proposed by the present application. As shown in FIG. 9, it includes:
  • For details, refer to steps S2011-S2012.
  • n and n+1 are the numbers of the paired two neural network structures; record the nth neural network structure as the first neural network structure , the n+1th neural network structure is recorded as the second neural network structure.
  • the paired two neural network structures inherit weight values from the supernet.
  • the weight value inherited from the supernet for the first time by the paired two neural network structures is the weight value of the corresponding connection between the two neural network structures in the initialization supernet and the same operation corresponding to the connection.
  • the weight value inherited from the supernet by the two neural network structures paired in the process is the weight value of the updated supernet.
  • If n ≥ N−1, the iteration is ended and S508 is executed; if n < N−1, the value of n is incremented by 2 and execution returns to step 402.
  • The continuous space is mapped to the neural network structure so that continuous mathematical operations can be performed on the structure, which endows the algorithm with better global search ability. Through the structure update method based on population pairing and second-order learning, the optimal solution can be found faster; at the same time, based on the characteristics of the population, a set of solutions can finally be found, which provides decision makers with multiple choices and improves the reliability of the algorithm; and the weight inheritance and update of the supernet speed up model evaluation and significantly reduce the computational cost and running time required for searching the neural network.
  • An embodiment of the present application provides a system for searching neural network structures based on evolutionary learning.
  • The system includes: a population initialization module 801, an individual pairing module 802, a training evaluation module 803, a supernet weight update module 804, a population update module 805 and a model output module 806.
  • the system initializes the population through the population initialization module 801, wherein each neural network structure in the population is a structure code, and the structure code uses a continuous real number interval to map the connections and corresponding operations between the nodes of the neural network structure.
  • The individual pairing module 802 randomly selects two structure codes in the population and decodes them into two neural network structures for pairing; the paired neural network structures respectively inherit the corresponding weights from the supernet to obtain the first neural network model and the second neural network model; the training evaluation module 803 trains and evaluates the two models to obtain the winner and the loser.
  • The supernet is updated according to the trained first and second neural network models through the supernet weight update module 804; the population update module 805 calculates the pseudo gradient value between the structure code of the loser and the structure code of the winner,
  • evolves the structure code of the loser toward the structure code of the winner based on the pseudo gradient value to obtain the structure code of the third neural network structure, and replaces the structure code of the neural network structure corresponding to the loser in the population with the structure code of the third neural network structure; if the termination condition is not met, the individual
  • pairing module 802 performs iterative evolution on the updated population.
  • The population initialization module 801 can also generate N neural network structures with the same number of nodes according to self-defined coding rules; through coding, continuous real-number intervals are mapped to the connections between the nodes of a single neural network structure and the corresponding discrete operations, where N is a natural number.
  • the system for searching neural network structures based on evolutionary learning further includes a supernet initialization module, which sets up a supernet according to a learning task, and the supernet includes N network units and a set of all operations.
  • The individual pairing module 802 inherits, for the first neural network structure, from the supernet the connections identical to those of the first neural network structure and the first weights corresponding to the identical operations on those connections, to obtain the first neural network model;
  • for the second neural network structure, it inherits from the supernet the connections identical to those of the second neural network structure and the second weights corresponding to the identical operations on those connections, to obtain the second neural network model.
  • In combination with the learning task, the training evaluation module 803 uses the stochastic gradient descent method to train the weight values of the first neural network model at least once to obtain the optimized first neural network model, and uses the stochastic gradient descent method to train the weight values of the second neural network model at least once to obtain the optimized second neural network model; it evaluates the optimized first and second neural network models respectively on the validation set, calculates the error value of the first neural network model from the optimized first neural network model and the error value of the second neural network model from the optimized second neural network model, compares the two error values, records the model with the smaller error value as the winner and the model with the larger error value as the loser, and obtains the evaluation result.
  • The supernet weight update module 804 takes the weight of the winner's operation as the weight in the supernet under the condition that two nodes have the same connection in the first and second neural networks and the connection has the same operation; under the condition that the first and second neural networks have different node connections, or the same node connection but different corresponding operations, it takes the weight of the first neural network model as the weight in the supernet of the connection identical to that of the first neural network and of the identical operation corresponding to the connection, and takes the weight of the second neural network model as the weight in the supernet of the connection identical to that of the second neural network and of the identical operation corresponding to the connection, thereby obtaining the updated supernet.
  • the population update module calculates the difference between the loser's structure code value and the winner's structure code value, multiplies the difference by a random coefficient, and adds the historical pseudo-gradient scaled by another random coefficient to obtain the pseudo-gradient value used to update the loser's structure code; the loser's structure code value and the pseudo-gradient value are summed to obtain the structure code of the third neural network structure (the update is written out as equations after this list).
  • the model output module 806 judges whether all the neural network structures in the population have taken part in pairing; if the judgment result is "no", it returns to the individual pairing module 802 to iteratively evolve the updated population; if the judgment result is "yes", it outputs the optimal neural network model in the updated population, thereby completing the search for the neural network structure.
  • the model output module 806 sets the number of iterations to T, where T is a natural number greater than 0, and determines whether the current number of executions is less than T; if the judgment result is "yes", it returns to the individual pairing module 802 to iteratively evolve the updated population; if the judgment result is "no", it outputs the optimal neural network model in the updated population, thereby completing the search for the neural network structure.
  • the model output module 806 may also inherit the corresponding weight values from the updated supernet when the number of execution iterations is greater than 1.
  • An embodiment of the present application provides a system for updating a supernet.
  • the system includes: a supernet initialization module 901 that randomly initializes a supernet, the supernet including N network units and the set of all operations; an individual pairing module 802 that randomly selects two neural network structures in the population for pairing, the two paired neural network structures respectively inheriting the corresponding weights from the supernet to obtain the first neural network model and the second neural network model; the first and second neural network models are trained, the trained first and second neural network models are evaluated, and a winner and a loser are obtained; under the condition that two nodes have the same connection in both the first and second neural networks and that connection carries the same operation, the winner's weight for that operation is taken as the weight in the supernet.
  • under the condition that the first and second neural networks have different node connections, or the same node connection but different corresponding operations, the weight of the first neural network model is taken as the weight in the supernet of the connection that is the same as in the first neural network structure and of the same operation corresponding to that connection, and the weight of the second neural network model is taken as the weight in the supernet of the connection that is the same as in the second neural network structure and of the same operation corresponding to that connection; the updated supernet is obtained.
  • An embodiment of the present application provides an electronic device 1000, as shown in FIG. 12, including a processor 1001 and a memory 1002; the processor 1001 is configured to execute computer-executable instructions stored in the memory 1002, and by running the computer-executable instructions the processor 1001 performs the method for searching for a neural network structure based on evolutionary learning described in any of the foregoing embodiments.
  • An embodiment of the present application provides a storage medium, including a readable storage medium and a computer program stored in the readable storage medium, where the computer program is used to implement the method for searching for a neural network structure based on evolutionary learning described in any of the foregoing embodiments.
  • computer-readable media may include, but are not limited to, magnetic storage devices (e.g., hard disks, floppy disks, or magnetic tapes), optical disks (e.g., compact discs (CDs), digital versatile discs (DVDs)), smart cards, and flash memory devices (e.g., erasable programmable read-only memory (EPROM), cards, sticks, or key drives).
  • various storage media described herein can represent one or more devices and/or other machine-readable media for storing information.
  • the term "machine-readable medium” may include, but is not limited to, wireless channels and various other media capable of storing, containing, and/or carrying instructions and/or data.
  • the size of the sequence numbers of the above-mentioned processes does not imply an order of execution; the execution order of each process should be determined by its functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be implemented through some interfaces, or as an indirect coupling or communication connection of devices or units, and may be in electrical, mechanical, or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • the functions, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium.
  • the technical solutions of the embodiments of the present application, in essence, or the parts that contribute to the prior art, or parts of the technical solutions, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, or an access network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application.
  • the aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media that can store program code.
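
Purely for illustration, the sketch below shows one way the real-valued encoding described above could be decoded; the operation names, the two-genes-per-node layout, and the [0, 1) interval are assumptions of this sketch, not taken from the patent.

```python
import numpy as np

# Hypothetical operation set; the patent text does not fix these names.
OPS = ["conv3x3", "conv5x5", "max_pool", "skip_connect"]

def decode(code, num_nodes=4):
    """Map a real-valued structure code in [0, 1) to a list of
    (predecessor node, operation) choices, one pair per node."""
    arch = []
    genes = np.asarray(code).reshape(num_nodes, 2)
    for node_id, (g_conn, g_op) in enumerate(genes):
        pred = int(g_conn * (node_id + 1))      # continuous value -> discrete connection
        op = OPS[int(g_op * len(OPS))]          # continuous value -> discrete operation
        arch.append((pred, op))
    return arch

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    code = rng.random(4 * 2)                    # one individual = one structure code
    print(decode(code))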
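
A minimal sketch of the weight-inheritance step, assuming the supernet is stored as a plain lookup table keyed by (from_node, to_node, operation); this data layout and the 3x3 weight shape are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

OPS = ["conv3x3", "conv5x5", "max_pool", "skip_connect"]   # illustrative only

def init_supernet(num_nodes, ops, rng):
    """Supernet as a table: one randomly initialised weight tensor per
    (from_node, to_node, operation) triple; node 0 is the cell input."""
    return {(i, j, op): rng.standard_normal((3, 3))
            for j in range(1, num_nodes + 1)
            for i in range(j)
            for op in ops}

def inherit_weights(supernet, arch):
    """Copy out of the supernet only the weights of the connections used by
    the sampled architecture and of the operations on those connections."""
    return {(pred, node, op): supernet[(pred, node, op)].copy()
            for node, (pred, op) in enumerate(arch, start=1)}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    supernet = init_supernet(num_nodes=4, ops=OPS, rng=rng)
    arch = [(0, "conv3x3"), (1, "max_pool"), (0, "skip_connect"), (2, "conv5x5")]
    model_weights = inherit_weights(supernet, arch)
    print(sorted(model_weights))
```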
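
A hedged sketch of the pairwise train-and-evaluate step: both inherited models are trained for a few stochastic-gradient-descent steps, evaluated on a validation set, and the one with the smaller error is recorded as the winner. The linear model and synthetic data are placeholders, since the patent does not prescribe the task; only the compare-and-record logic mirrors the description above.

```python
import numpy as np

def sgd_step(w, x, y, lr=0.1):
    """One stochastic gradient descent step on mean squared error."""
    grad = 2.0 * x.T @ (x @ w - y) / len(x)
    return w - lr * grad

def validation_error(w, x, y):
    return float(np.mean((x @ w - y) ** 2))

def pair_and_evaluate(w1, w2, x_tr, y_tr, x_val, y_val, steps=5):
    """Train both candidate models for a few SGD steps, evaluate them on the
    validation set, and return ((winner, error), (loser, error))."""
    for _ in range(steps):
        w1 = sgd_step(w1, x_tr, y_tr)
        w2 = sgd_step(w2, x_tr, y_tr)
    e1 = validation_error(w1, x_val, y_val)
    e2 = validation_error(w2, x_val, y_val)
    return ((w1, e1), (w2, e2)) if e1 <= e2 else ((w2, e2), (w1, e1))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_tr, x_val = rng.standard_normal((64, 3)), rng.standard_normal((32, 3))
    true_w = np.array([1.0, -2.0, 0.5])
    y_tr, y_val = x_tr @ true_w, x_val @ true_w
    (w_win, e_win), (_, e_lose) = pair_and_evaluate(
        rng.standard_normal(3), rng.standard_normal(3), x_tr, y_tr, x_val, y_val)
    print(e_win, "<=", e_lose)
```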
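
Under the same assumed dictionary layout as the inheritance sketch, the supernet update rule described above can be written as follows: entries used by both candidates (same connection, same operation) take the winner's trained weight, while entries used by only one candidate take that candidate's trained weight.

```python
def update_supernet(supernet, winner_weights, loser_weights):
    """Write trained weights back into the supernet, a dict keyed by
    (from_node, to_node, operation) as in the inheritance sketch above."""
    shared = winner_weights.keys() & loser_weights.keys()
    for key in shared:                         # same connection, same operation
        supernet[key] = winner_weights[key]    # keep the winner's weight
    for key in winner_weights.keys() - shared:
        supernet[key] = winner_weights[key]    # entries only the winner used
    for key in loser_weights.keys() - shared:
        supernet[key] = loser_weights[key]     # entries only the loser used
    return supernet
```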
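
The pseudo-gradient step can be summarised by the following hedged reconstruction, assuming the difference is taken so that the loser moves toward the winner and that the historical pseudo-gradient enters through a second random coefficient; the symbols are not fixed by the text above.

```latex
% r_1, r_2: random coefficients in [0, 1]; x: structure codes;
% \Delta_{t-1}: accumulated historical pseudo-gradient.
\Delta_{t} = r_{1}\,\bigl(x_{\text{winner}} - x_{\text{loser}}\bigr) + r_{2}\,\Delta_{t-1},
\qquad
x_{\text{loser}}^{\text{new}} = x_{\text{loser}} + \Delta_{t}
```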

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Genetics & Genomics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention relates to a neural architecture search method and system based on evolutionary learning. The method comprises the following steps: S101, initialize a population, each neural architecture in the population being a structure code; S102, randomly select two structure codes in the population, decode the two structure codes into two neural architectures for pairing, and inherit the corresponding weights from a supernet, so as to obtain first and second neural network models; S103, evaluate the trained first and second neural network models, so as to obtain a winner and a loser; S104, update the supernet according to the trained first and second neural network models; S105, calculate a pseudo-gradient value so that the loser learns from the winner, and obtain a structure code of a third neural architecture; S106, replace, in the population, the structure code of the loser with the structure code of the third neural architecture, and update the population; and S107, output an optimal neural network model from the population, and perform iterative evolution on the updated population.
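
Purely as an illustration of how steps S101 to S107 fit together, here is a compact, self-contained Python sketch of the search loop; the population size, code length, number of generations, and the synthetic fitness stub are assumptions, and the supernet-related steps are reduced to a placeholder comment.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(code):
    """Stand-in for 'inherit weights from the supernet, train with SGD and
    measure validation error' (S102-S103); a synthetic objective is used here."""
    return float(np.sum((code - 0.5) ** 2))

def evolutionary_nas(pop_size=10, code_len=8, generations=50):
    pop = rng.random((pop_size, code_len))      # S101: population of real-valued structure codes
    momentum = np.zeros_like(pop)               # accumulated historical pseudo-gradients
    for _ in range(generations):                # S107: iterative evolution
        i, j = rng.choice(pop_size, size=2, replace=False)                    # S102: random pairing
        win, lose = (i, j) if fitness(pop[i]) <= fitness(pop[j]) else (j, i)  # S103: winner / loser
        # S104 (supernet weight update) is omitted in this stub.
        r1, r2 = rng.random(), rng.random()
        momentum[lose] = r1 * (pop[win] - pop[lose]) + r2 * momentum[lose]    # S105: pseudo-gradient
        pop[lose] = pop[lose] + momentum[lose]  # S106: loser replaced by the third structure code
    return min(pop, key=fitness)                # S107: output the best structure code

if __name__ == "__main__":
    print(evolutionary_nas())
```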
PCT/CN2020/136950 2020-12-16 2020-12-16 Procédé et système de recherche d'architecture neuronale basés sur un apprentissage évolutif Ceased WO2022126448A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/136950 WO2022126448A1 (fr) 2020-12-16 2020-12-16 Procédé et système de recherche d'architecture neuronale basés sur un apprentissage évolutif
CN202080107589.9A CN116964594B (zh) 2020-12-16 2020-12-16 一种基于演化学习的神经网络结构搜索方法和系统

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/136950 WO2022126448A1 (fr) 2020-12-16 2020-12-16 Procédé et système de recherche d'architecture neuronale basés sur un apprentissage évolutif

Publications (1)

Publication Number Publication Date
WO2022126448A1 true WO2022126448A1 (fr) 2022-06-23

Family

ID=82059915

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/136950 Ceased WO2022126448A1 (fr) 2020-12-16 2020-12-16 Procédé et système de recherche d'architecture neuronale basés sur un apprentissage évolutif

Country Status (2)

Country Link
CN (1) CN116964594B (fr)
WO (1) WO2022126448A1 (fr)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115130483A (zh) * 2022-07-13 2022-09-30 湘潭大学 一种基于多目标群体智能算法的神经架构搜索方法及用途
CN115879509A (zh) * 2022-11-18 2023-03-31 西安电子科技大学 基于代理辅助进化算法的卷积神经网络结构优化方法
CN116304932A (zh) * 2023-05-19 2023-06-23 湖南工商大学 一种样本生成方法、装置、终端设备及介质
CN116957015A (zh) * 2023-06-20 2023-10-27 北京航空航天大学 一种基于演化神经网络的多层网络重要节点识别方法
CN117611974A (zh) * 2024-01-24 2024-02-27 湘潭大学 基于多种群交替进化神经结构搜索的图像识别方法及系统
CN117709205A (zh) * 2024-02-05 2024-03-15 华南师范大学 航空发动机剩余使用寿命预测方法、装置、设备以及介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190095780A1 (en) * 2017-08-18 2019-03-28 Beijing Sensetime Technology Development Co., Ltd Method and apparatus for generating neural network structure, electronic device, and storage medium
US20190385059A1 (en) * 2018-05-23 2019-12-19 Tusimple, Inc. Method and Apparatus for Training Neural Network and Computer Server
CN110782034A (zh) * 2019-10-31 2020-02-11 北京小米智能科技有限公司 神经网络的训练方法、装置及存储介质
CN111340220A (zh) * 2020-02-25 2020-06-26 北京百度网讯科技有限公司 用于训练预测模型的方法和装置
CN111368973A (zh) * 2020-02-25 2020-07-03 北京百度网讯科技有限公司 用于训练超网络的方法和装置
CN111563592A (zh) * 2020-05-08 2020-08-21 北京百度网讯科技有限公司 基于超网络的神经网络模型生成方法和装置

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6553357B2 (en) * 1999-09-01 2003-04-22 Koninklijke Philips Electronics N.V. Method for improving neural network architectures using evolutionary algorithms
CN110569972A (zh) * 2019-09-11 2019-12-13 北京百度网讯科技有限公司 超网络的搜索空间构建方法、装置以及电子设备
CN111325356A (zh) * 2019-12-10 2020-06-23 四川大学 一种基于演化计算的神经网络搜索分布式训练系统及训练方法
CN111275172B (zh) * 2020-01-21 2023-09-01 复旦大学 一种基于搜索空间优化的前馈神经网络结构搜索方法

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190095780A1 (en) * 2017-08-18 2019-03-28 Beijing Sensetime Technology Development Co., Ltd Method and apparatus for generating neural network structure, electronic device, and storage medium
US20190385059A1 (en) * 2018-05-23 2019-12-19 Tusimple, Inc. Method and Apparatus for Training Neural Network and Computer Server
CN110782034A (zh) * 2019-10-31 2020-02-11 北京小米智能科技有限公司 神经网络的训练方法、装置及存储介质
CN111340220A (zh) * 2020-02-25 2020-06-26 北京百度网讯科技有限公司 用于训练预测模型的方法和装置
CN111368973A (zh) * 2020-02-25 2020-07-03 北京百度网讯科技有限公司 用于训练超网络的方法和装置
CN111563592A (zh) * 2020-05-08 2020-08-21 北京百度网讯科技有限公司 基于超网络的神经网络模型生成方法和装置

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115130483A (zh) * 2022-07-13 2022-09-30 湘潭大学 一种基于多目标群体智能算法的神经架构搜索方法及用途
CN115879509A (zh) * 2022-11-18 2023-03-31 西安电子科技大学 基于代理辅助进化算法的卷积神经网络结构优化方法
CN116304932A (zh) * 2023-05-19 2023-06-23 湖南工商大学 一种样本生成方法、装置、终端设备及介质
CN116304932B (zh) * 2023-05-19 2023-09-05 湖南工商大学 一种样本生成方法、装置、终端设备及介质
CN116957015A (zh) * 2023-06-20 2023-10-27 北京航空航天大学 一种基于演化神经网络的多层网络重要节点识别方法
CN117611974A (zh) * 2024-01-24 2024-02-27 湘潭大学 基于多种群交替进化神经结构搜索的图像识别方法及系统
CN117611974B (zh) * 2024-01-24 2024-04-16 湘潭大学 基于多种群交替进化神经结构搜索的图像识别方法及系统
CN117709205A (zh) * 2024-02-05 2024-03-15 华南师范大学 航空发动机剩余使用寿命预测方法、装置、设备以及介质
CN117709205B (zh) * 2024-02-05 2024-05-07 华南师范大学 航空发动机剩余使用寿命预测方法、装置、设备以及介质

Also Published As

Publication number Publication date
CN116964594B (zh) 2025-10-28
CN116964594A (zh) 2023-10-27

Similar Documents

Publication Publication Date Title
WO2022126448A1 (fr) Procédé et système de recherche d'architecture neuronale basés sur un apprentissage évolutif
WO2022252455A1 (fr) Procédés et systèmes pour l'entraînement d'un réseau neuronal en graphe à l'aide d'un apprentissage par contraste supervisé
Jiang et al. Efficient network architecture search via multiobjective particle swarm optimization based on decomposition
CN112465120A (zh) 一种基于进化方法的快速注意力神经网络架构搜索方法
CN108334949B (zh) 一种基于优化深度卷积神经网络结构快速进化的图像分类器构建方法
WO2022083624A1 (fr) Procédé d'acquisition de modèle, et dispositif
CN111476285B (zh) 一种图像分类模型的训练方法及图像分类方法、存储介质
CN114118369B (zh) 一种基于群智能优化的图像分类卷积神经网络设计方法
US20240311651A1 (en) Method and apparatus for searching for neural network ensemble model, and electronic device
WO2018161468A1 (fr) Optimisation globale, recherche et procédé d'apprentissage automatique basé sur le principe génétique acquis par lamarck
CN113128432B (zh) 基于演化计算的机器视觉多任务神经网络架构搜索方法
Wen et al. Learning ensemble of decision trees through multifactorial genetic programming
CN114241267A (zh) 基于结构熵采样的多目标架构搜索骨质疏松图像识别方法
CN118014010B (zh) 基于多种群机制及代理模型的多目标演化神经架构搜索方法
CN107783998A (zh) 一种数据处理的方法以及装置
Xie et al. Automated design of CNN architecture based on efficient evolutionary search
WO2021042857A1 (fr) Procédé de traitement et appareil de traitement pour modèle de segmentation d'image
CN117809734B (zh) 一种基因调控网络的降维建模方法及系统
CN117611974A (zh) 基于多种群交替进化神经结构搜索的图像识别方法及系统
CN116822584A (zh) 基于高斯过程的多任务神经网络模型性能预测方法
CN118917353B (zh) 基于性能层级代理辅助的演化神经架构搜索方法和系统
CN114202669B (zh) 一种用于医疗图像分割的神经网络搜索方法
CN116090512A (zh) 神经网络的构建方法和装置
CN118153633B (zh) 一种改进的cnn架构优化设计方法
CN117669742A (zh) 一种基于进化集成的图神经网络解释方法、系统及设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20965466

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202080107589.9

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20965466

Country of ref document: EP

Kind code of ref document: A1

WWG Wipo information: grant in national office

Ref document number: 202080107589.9

Country of ref document: CN