CN115115034B

CN115115034B - Image processing network generation, image classification method and related device

Info

Publication number: CN115115034B
Application number: CN202210778672.XA
Authority: CN
Inventors: 张哲熙; 娄英欣; 梅树起; 薛涛; 严骏驰
Original assignee: Shenzhen Tencent Computer Systems Co Ltd; Shanghai Jiao Tong University
Current assignee: Shenzhen Tencent Computer Systems Co Ltd; Shanghai Jiao Tong University
Priority date: 2022-06-30
Filing date: 2022-06-30
Publication date: 2025-01-24
Anticipated expiration: 2042-06-30
Also published as: CN115115034A

Abstract

The present application discloses an image processing network generation, image classification method and related devices, and relates to the field of artificial intelligence technology. The present application can be applied to the fields of blockchain, cloud technology and map vehicle networking, and the method includes: obtaining an initial network structure unit, which includes at least one node, and there are connecting edges between the nodes; obtaining a candidate operation set, and the candidate operation set includes at least one candidate operation; according to the structural parameters of the candidate operation corresponding to each connecting edge, generating a probability distribution of each connecting edge selecting the corresponding candidate operation; according to the probability distribution corresponding to each connecting edge, only one candidate operation is selected for each connecting edge and added to the initial network structure unit to obtain the current network to be optimized; the probability distribution is processed into a discrete distribution continuous process, and the current network to be optimized is gradient optimized to obtain the target image processing network. The present application can effectively reduce the computing resources consumed by the generation of the image processing network, and can realize flexible deployment.

Description

Image processing network generation and image classification method and related device

Technical Field

The application relates to the technical field of artificial intelligence, in particular to an image processing network generation and image classification method and a related device.

Background

An image processing network is a neural network that processes images, such as a neural network that performs image detection or image classification processing.

At present, in the related scheme, an image processing network is generated by searching through a micro-network structure searching method, and all candidate operations are inevitably connected among nodes aiming at the nodes in the super network to be searched, so that a target image processing network is generated by searching. Under the related scheme, the search generation process has high demand on computing resources and is difficult to deploy on the device with limited video memory.

Therefore, the problem that the image processing network is high in computational resource consumption and difficult to flexibly deploy exists at present.

Disclosure of Invention

The embodiment of the application provides an image processing network generation and image classification method and a related device, which can effectively reduce the calculation resources consumed by the image processing network generation and can realize flexible deployment.

In order to solve the technical problems, the embodiment of the application provides the following technical scheme:

The image processing network generation method comprises the steps of obtaining an initial network structure unit, wherein the initial network structure unit comprises at least one node representing a feature map, connecting edges are arranged between the nodes, obtaining a candidate operation set, the candidate operation set comprises at least one candidate operation corresponding to each connecting edge, the candidate operation is used for processing the feature map, probability distribution of corresponding candidate operation is selected by each connecting edge in current forward propagation according to structural parameters of the candidate operation corresponding to each connecting edge, only one candidate operation is selected from the corresponding candidate operation according to the probability distribution corresponding to each connecting edge and is added to the initial network structure unit, a current network to be optimized is obtained, discrete distribution continuous processing is conducted on the probability distribution, gradient optimization is conducted on the structural parameters of the candidate operation in the current network to be optimized based on continuous processing results until preset optimization conditions are met, and the target image processing network is obtained.

According to one embodiment of the application, an image processing network generating device comprises a structural unit acquisition module, an operation set acquisition module, a probability calculation module and a single operation sampling module, wherein the structural unit acquisition module is used for acquiring an initial network structural unit, the initial network structural unit comprises at least one node representing a feature map, connection edges are arranged between the nodes, the operation set acquisition module is used for acquiring a candidate operation set, the candidate operation set comprises at least one candidate operation corresponding to each connection edge and is used for processing the feature map, the probability calculation module is used for generating a probability distribution of each connection edge in current forward propagation, which corresponds to the candidate operation, according to the structural parameters of each connection edge, the single operation sampling module is used for selecting only one candidate operation from the corresponding candidate operation for each connection edge and adding the candidate operation to the initial network structural unit to obtain a current network to be optimized, and the optimization module is used for carrying out discrete distribution continuous processing on the probability distribution and optimizing the structure parameters of the operation in the current network to be optimized until the candidate operation gradient meets a preset optimization condition based on a continuous processing result, and the single operation sampling module is used for obtaining a target image.

In some embodiments of the present application, the optimization module includes a sample input unit configured to process sample images respectively by using the current network to be optimized and a preset image processing network to obtain image processing results, and a network optimization unit configured to perform gradient optimization on structural parameters of candidate operations in the current network to be optimized based on a continuous processing result according to the image processing results and a network middle layer of the preset image processing network.

In some embodiments of the present application, the image processing result includes a first result corresponding to the current network to be optimized and a second result corresponding to the preset image processing network, the sample image is calibrated with a predetermined result, the network optimizing unit is configured to calculate a first loss according to the first result and the predetermined result, calculate a second loss according to the first result and the second result, calculate a third loss according to the preset image processing network and a network middle layer corresponding to the same level in the current network to be optimized, and perform gradient optimization on the structural parameters of candidate operations in the current network to be optimized based on the continuous processing result according to the first loss, the second loss and the third loss.

In some embodiments of the present application, the probability calculation module is configured to perform an exponential operation on a structural parameter of each candidate operation corresponding to each connection edge to obtain a parameter operation result corresponding to each candidate operation, sum the parameter operation results corresponding to each candidate operation corresponding to the connection edge to obtain a sum result corresponding to the connection edge, divide the sum of the parameter operation results corresponding to each candidate operation corresponding to the connection edge by the sum result corresponding to the connection edge to obtain a probability that each candidate operation is selected by the corresponding connection edge, sum the probability that each candidate operation is selected by the corresponding connection edge by a random sampling value to obtain a target probability value corresponding to each candidate operation, and obtain a probability distribution of each candidate operation selected by the corresponding connection edge in current forward propagation based on the target probability value corresponding to each candidate operation corresponding to each connection edge.

In some embodiments of the present application, the single operation sampling module is configured to determine a candidate operation corresponding to a maximum target probability value in a probability distribution corresponding to each connection edge, to obtain a target candidate operation corresponding to each connection edge, and add the target candidate operation corresponding to each connection edge to a position corresponding to each connection edge in the initial network structure unit.

In some embodiments of the present application, the network optimization unit is configured to perform a mean pooling operation on two network feature graphs corresponding to a network middle layer of the same hierarchy in the preset image processing network and the current network to be optimized to obtain two target feature graphs with uniform channel numbers, convert the two target feature graphs into a first weighted feature graph and a second weighted feature graph, and calculate the third loss according to the first weighted feature graph and the second weighted feature graph.

In some embodiments of the present application, the network optimization unit is configured to calculate the first weighted feature map and the second weighted feature map by using a mean square error loss function, so as to obtain the third loss.

In some embodiments of the present application, the network optimization unit is configured to perform calculation processing on the first result and the predetermined result by using a cross entropy loss function to obtain the first loss, and perform calculation processing on the first result and the second result by using a relative entropy loss function to obtain the second loss.

In some embodiments of the present application, the network optimization unit is configured to perform a second-order approximate estimation process on the network parameter in the current network to be optimized and the continuous processing result to obtain a second-order approximate estimation result, and perform an alternating gradient optimization on the network parameter and the structure parameter in the current network to be optimized based on the second-order approximate estimation result.

According to one embodiment of the application, an image classification method comprises the steps of obtaining an image to be classified, and performing classification processing on the image to be classified by adopting a target image processing network for performing image classification to obtain a classification result corresponding to the image to be classified, wherein the target image processing network is generated by the image processing network generation method according to any embodiment of the application.

According to one embodiment of the application, the image classification device comprises an image acquisition module and a classification module, wherein the image acquisition module is used for acquiring an image to be classified, the classification module is used for performing classification processing on the image to be classified by adopting a target image processing network for performing image classification to obtain a classification result corresponding to the image to be classified, and the target image processing network is generated by the image processing network generation method according to any one embodiment of the application.

According to another embodiment of the application, a computer readable storage medium has stored thereon a computer program which, when executed by a processor of a computer, causes the computer to perform the method according to the embodiment of the application.

According to another embodiment of the application, an electronic device comprises a memory storing a computer program and a processor reading the computer program stored in the memory to perform the method according to the embodiment of the application.

According to another embodiment of the application, a computer program product or computer program includes computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions to cause the computer device to perform the methods provided in the various alternative implementations described in the embodiments of the present application.

In the embodiment of the application, an initial network structure unit is acquired, wherein the initial network structure unit comprises at least one node representing a feature map, connecting edges are arranged between the nodes, a candidate operation set is acquired, the candidate operation set comprises at least one candidate operation corresponding to each connecting edge, the candidate operation is used for processing the feature map, probability distribution of the corresponding candidate operation is selected by each connecting edge in the current forward propagation according to structural parameters of the candidate operation corresponding to each connecting edge, only one candidate operation is selected from the corresponding candidate operation according to the probability distribution corresponding to each connecting edge and is added to the initial network structure unit to obtain a current network to be optimized, discrete distribution continuous processing is carried out on the probability distribution, and gradient optimization is carried out on the structural parameters of the candidate operation in the current network to be optimized based on a continuous processing result until a preset optimization condition is met, so that a target image processing network is obtained.

In this way, in the image processing network generation process, the probability distribution of the corresponding candidate operation is selected by calculating each connection edge, so that only one candidate operation is selected from the corresponding candidate operation for each connection edge and added to the network structure unit, the image processing network generation method based on the single-path sampling network structure search is realized, the requirements of the connection edges between nodes on all candidate operations are avoided, the consumption of the computing resources in the image processing network search generation process is effectively reduced, the network structure search efficiency is improved, the video memory occupation is saved, and the image processing network generation method can be flexibly deployed on the equipment with limited video memory.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 shows a schematic diagram of a system to which embodiments of the application may be applied.

Fig. 2 shows a flow chart of an image processing network generation method according to an embodiment of the application.

Fig. 3 shows a schematic diagram of an initial network structural unit according to one embodiment of the application.

FIG. 4 illustrates a schematic diagram of a network architecture element searching for candidate operations according to one embodiment of the application.

Fig. 5 shows a schematic diagram of network optimization according to an embodiment of the application.

Fig. 6 shows a flow chart of an image classification method according to an embodiment of the application.

Fig. 7 shows a block diagram of an image processing network generating apparatus according to an embodiment of the present application.

Fig. 8 shows a block diagram of an image classification apparatus according to an embodiment of the application.

Fig. 9 shows a block diagram of an electronic device according to an embodiment of the application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to fall within the scope of the application.

It will be appreciated that in the specific embodiments of the present application, related data such as images of items, user information, etc. are involved, and when the above embodiments of the present application are applied to specific products or technologies, user permissions or consents need to be obtained, and the collection, use and processing of related data need to comply with the relevant laws and regulations and standards of the relevant countries and regions.

Fig. 1 shows a schematic diagram of a system 100 in which embodiments of the application may be applied. As shown in fig. 1, the system 100 may include a server 101 and a terminal 102.

The server 101 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, basic cloud computing services such as big data and artificial intelligence platforms, and the like. In one implementation of the present example, server 101 is a cloud server, and server 101 may provide artificial intelligence cloud services, such as artificial intelligence cloud services that provide large image classification.

The terminal 102 may be any device, and the terminal 102 includes, but is not limited to, a cell phone, a computer, a smart voice interaction device, a smart home appliance, a vehicle terminal, a VR/AR device, a smart watch, a computer, and the like. In one embodiment, the server 101 or terminal 102 may be a node device in a blockchain network or a map internet of vehicles platform.

In one implementation manner of the present example, the server 101 or the terminal 102 may obtain an initial network structure unit, where the initial network structure unit includes at least one node representing a feature map, and there is a connection edge between the nodes, obtain a candidate operation set, where the candidate operation set includes at least one candidate operation corresponding to each connection edge, where the candidate operation is an operation for processing the feature map, calculate, according to a structural parameter of the candidate operation corresponding to each connection edge, a probability distribution of each connection edge in the current forward propagation, where the probability distribution of each connection edge selects a corresponding candidate operation, and according to the probability distribution of each connection edge, select, for each connection edge, only one candidate operation from the corresponding candidate operation to be added to the network structure unit, to obtain a current network to be optimized, perform discrete distribution serialization processing on the probability distribution, and perform gradient optimization on the structural parameter of the candidate operation in the current network to be optimized based on the serialization processing result until a predetermined optimization condition is met, to obtain a target image processing network.

In one implementation manner of the present example, the server 101 or the terminal 102 may acquire an image to be classified, and perform classification processing on the image to be classified by using a target image processing network for performing image classification, to obtain a classification result corresponding to the image to be classified, where the target image processing network is generated by using the image processing network generation method according to any embodiment of the present application.

Fig. 2 schematically shows a flow chart of an image processing network generation method according to an embodiment of the application. The execution subject of the image processing network generation method may be any device, such as the server 101 or the terminal 102 shown in fig. 1.

As shown in fig. 2, the image processing network generation method may include steps S210 to S250.

Step S210, an initial network structure unit is obtained, wherein the initial network structure unit comprises at least one node representing a feature map, and connecting edges are arranged between the nodes;

Step S220, a candidate operation set is obtained, wherein the candidate operation set comprises at least one candidate operation corresponding to each connecting edge, and the candidate operation is used for processing the feature map;

step S230, generating probability distribution of selecting the corresponding candidate operation by each connecting edge in the current forward propagation according to the structure parameters of the candidate operation corresponding to each connecting edge;

step S240, according to the probability distribution corresponding to each connection edge, only selecting one candidate operation from the corresponding candidate operations for each connection edge and adding the candidate operation to the initial network structure unit to obtain the current network to be optimized;

and S250, carrying out discrete distribution continuous processing on the probability distribution, and carrying out gradient optimization on the structural parameters of the candidate operation in the current network to be optimized based on the continuous processing result until the structural parameters meet the preset optimization conditions, so as to obtain the target image processing network.

The initial network structural unit is the neural network structural unit of the candidate operation to be searched. The neural network structural units can be called cell units, and the whole neural network can be formed by stacking the cell units. The initial network structure unit can comprise at least one node, each node represents a feature map, the feature map is a feature matrix, for example, the feature matrix is obtained through convolution operation, the nodes can be connected through a directional connecting edge, the connecting edge is an unknown candidate operation in the initial network structure, and the candidate operation at the connecting edge is to be searched. The initial network structural unit may be obtained from a predetermined location.

The candidate operation set includes at least one candidate operation corresponding to each connection edge, where the candidate operation is an operation for processing the feature map, and the candidate operation is an operation such as a convolution operation, a pooling operation, a jump connection, and the like. The candidate operation set may be a search space composed of candidate operations, and the candidate operation set may be acquired from a predetermined location.

Each initial network element may be abstracted to a directed acyclic graph comprising N nodes { x ⁽⁰⁾,x⁽¹⁾,…,x^(N-1) }, for example, fig. 3 illustrates an initial network element including 7 nodes (e.g., 0 and 1), where each node x ⁽ⁱ⁾ represents a feature graph in the network, and a candidate operation (such as a convolution operation) may be connected between two nodes connected by a connection edge, and the purpose of generating the image processing network is to select, by searching the network structure, a candidate operation from the candidate operation set that is most suitable for each connection edge. The feature map represented by the previous node can be processed through candidate operation to obtain the feature map represented by the next node.

Each time forward propagation is performed (before gradient optimization is performed until a predetermined optimization condition is met, multiple forward propagation may be performed, for example, each time after gradient optimization is performed on a current network to be optimized is completed, next forward propagation may be performed), probability distribution of selecting corresponding candidate operations by each connection side in current forward propagation may be calculated according to structural parameters of the candidate operations corresponding to the connection sides, and the probability distribution is generated by probability of selecting corresponding candidate operations at each connection side. For example, the connection edge a corresponds to 9 predetermined candidate operations, and the probability that the connection edge selects each of the 9 predetermined candidate operations may be calculated, thereby generating a probability distribution corresponding to the connection edge a.

According to the probability distribution corresponding to each connection edge, only one candidate operation is selected from the corresponding candidate operations for each connection edge to be added to the initial network structural unit, for example, the connection edge A corresponds to 9 predetermined candidate operations, only one is selected from the 9 predetermined candidate operations according to the probability distribution corresponding to the connection edge A, and the selected single candidate operation is added to the initial network structural unit at the connection edge A. And searching a candidate operation for each connecting edge in the initial network structure unit respectively to obtain a complete network comprising nodes and the candidate operation, namely the current network to be optimized.

The probability distribution generated based on the structural parameters of the candidate operation is discretized, discrete distribution continuous processing is carried out on the probability distribution, and gradient optimization can be carried out on the structural parameters of the candidate operation in the current network to be optimized based on the continuous processing result, so that the structural parameters of each candidate operation have gradients from a deep network when the structure parameters are reversely propagated through gradient optimization, further the structural parameters meeting the requirements can be optimized, and the target image processing network is obtained when the optimization is carried out until the preset optimization conditions are met.

The target image processing network is a network composed of network structural units (i.e., searched network structural units) of candidate operations (i.e., searched network structural units) that search for structural parameters that meet the requirements, for example, a searched network structural unit obtained after searching for candidate operations (e.g., skip-connect) between 7 nodes (e.g., 0 and 1) in the initial network structural units in fig. 3 shown in fig. 4. The image to be processed can be processed based on a target image processing network, and the target image processing network can be a neural network for performing functions such as image classification or image target detection.

In this way, based on steps S210 to S250, during the image processing network generation process, the probability distribution of the corresponding candidate operation is selected by calculating each connection edge, and then only one candidate operation is selected from the corresponding candidate operations for each connection edge and added to the network structure unit, so as to implement the image processing network generation method based on the single-path sampling network structure search, avoid the requirement of the connection edge between the nodes on all the candidate operations, effectively reduce the consumption of the computing resource in the image processing network search generation process, improve the efficiency of the network structure search, save the occupation of the video memory, and also flexibly deploy on the device with limited video memory.

Other specific alternative embodiments of the steps performed when the embodiment of fig. 2 performs image processing network generation are described below.

In one embodiment, step S230, generating a probability distribution of selecting a candidate operation corresponding to each connection edge in the current forward propagation according to the structure parameters of the candidate operation corresponding to each connection edge, includes:

The method comprises the steps of carrying out exponential operation on structural parameters of each candidate operation corresponding to each connecting edge to obtain parameter operation results corresponding to each candidate operation, summing the parameter operation results corresponding to each candidate operation corresponding to each connecting edge to obtain summation results corresponding to the connecting edges, dividing summation of the parameter operation results corresponding to each candidate operation corresponding to each connecting edge and summation results corresponding to the connecting edges to obtain probability that each candidate operation is selected by the corresponding connecting edge, summing the probability that each candidate operation is selected by the corresponding connecting edge with a random sampling value to obtain target probability values corresponding to each candidate operation, and obtaining probability distribution of each candidate operation selected by each connecting edge in current forward propagation based on the target probability values corresponding to each candidate operation corresponding to each connecting edge.

Specifically, the probability distribution of selecting the corresponding candidate operation for each connection edge may be calculated based on the following gummel-Max method formula:

wherein, Is the structural parameter of each candidate operation o corresponding to the connecting edge (i, j); The probability of selecting each candidate operation o is represented by a connecting edge (i, j), i.e., the probability that each candidate operation o is selected by the corresponding connecting edge (i, j), i.e., the connecting edge between node i and node j. Candidate operation o belongs to a set of at least one candidate operation corresponding to the connection edge I.e. the parameter operation result corresponding to the candidate operation o; G is a random sampling value of Gumbel distribution, and candidate operations sampled in any forward propagation can be different due to randomness; i.e. the target probability value corresponding to candidate operation o, and the set of V, i.e. the connecting edges, selects the probability distribution of the corresponding candidate operation.

In one embodiment, according to probability distribution corresponding to each connecting edge, only one candidate operation is selected from the corresponding candidate operations for each connecting edge and added to an initial network structure unit, wherein the method comprises the steps of determining the candidate operation corresponding to the maximum target probability value in the probability distribution corresponding to each connecting edge, and obtaining the target candidate operation corresponding to each connecting edge; and adding the target candidate operation corresponding to each connecting edge to the corresponding position of each connecting edge in the initial network structure unit.

Wherein, can be formulated asDetermining a maximum target probability value in a probability distribution corresponding to each connecting edge (i, j), by which the set of at least one candidate operation corresponding to the connecting edge can be determinedTarget probability value corresponding to each candidate operation oNormalized to normalized value a _i,j, then, the largest normalized value is processed to 1 by one-hot encoding (one_hot), the rest normalized values are processed to 0, and then, the target probability value corresponding to the normalized value corresponding to 1 is the largest.

And adding the determined target candidate operation to the corresponding position of each connecting edge in the initial network structure unit to obtain the network structure unit to be optimized of which the candidate operation is primarily searched.

In one embodiment, in step S250, the probability distribution is subjected to discrete distribution serialization processing, which may be the following formula of gummel-Softmax methodThe formed probability distribution discrete distribution is continuous, and a continuous processing result B _i,j is obtained:

where τ is the temperature coefficient.

In one embodiment, step S250 performs gradient optimization on the structure parameters of the candidate operation in the current network to be optimized based on the continuous processing result, and comprises directly performing gradient optimization on the structure parameters of the candidate operation in the current network to be optimized based on the continuous processing result.

In one embodiment, step S250 performs gradient optimization on the structure parameters of the candidate operation in the current network to be optimized based on the continuous processing result, and the method comprises the steps of adopting the current network to be optimized and a preset image processing network to process sample images respectively to obtain an image processing result, and performing gradient optimization on the structure parameters of the candidate operation in the current network to be optimized based on the continuous processing result according to the image processing result and a network middle layer of the preset image processing network.

The preset image processing network is a pre-designed and trained image processing network. The search process directly using candidate operations for gradient optimization has instability, and the network performance obtained by searching is reduced when the number of search rounds becomes long, because there is imbalance in the inter-layer gradient in the current network to be optimized, and non-parametric candidate operations such as jump connection and the like can provide an additional path for gradient conduction, so that as the search is performed, meaningless jump connection in the network is more prone to be selected as the candidate operation between nodes. The gradient optimization is carried out through the combination of the preset image processing networks, so that the current network to be optimized learns the interlayer gradient distribution of the preset image processing network, the gradient distribution of the current network to be optimized is further smoothed, the stability of the searching process is improved, and meanwhile, the current network to be optimized can be subjected to information supervision from the preset image processing network, the performance of the target image processing network can be further improved, and the image processing effect of the target image processing network is improved.

In one embodiment, the image processing result comprises a first result corresponding to a current network to be optimized and a second result corresponding to a preset image processing network, the sample image is calibrated with a preset result, the structure parameters of candidate operations in the current network to be optimized are subjected to gradient optimization based on the continuous processing result according to the processing result and the network middle layer of the preset image processing network, the method comprises the steps of calculating first loss according to the first result and the preset result, calculating second loss according to the first result and the second result, calculating third loss according to the preset image processing network and the network middle layer corresponding to the same level in the current network to be optimized, and carrying out gradient optimization on the structure parameters of the candidate operations in the current network to be optimized based on the continuous processing result according to the first loss, the second loss and the third loss.

The first loss, the second loss and the third loss respectively form learning guide relative to a preset image processing network for the current network to be optimized from different angles, and the structure parameters of candidate operation in the current network to be optimized are subjected to gradient optimization based on the continuous processing result according to the first loss, the second loss and the third loss, in this way, the current network to be optimized can be used as a student network in a knowledge distillation way, the preset image processing network is used as a teacher network, so that the current network to be optimized further effectively learns the interlayer gradient distribution of the preset image processing network, further smoothes the gradient distribution of the current network to be optimized, further improves the stability of the searching process, and can further effectively receive information supervision from the preset image processing network, further improves the performance of the target image processing network and improves the network stability.

Referring to fig. 5, the preset image processing network is divided into 3 network blocks, each network block is a hierarchical network middle layer, the current network to be optimized includes 3 cell units, each cell unit corresponds to an initial network structure unit, and each cell unit is a hierarchical network middle layer. Third losses can be calculated for network blocks and cell units of the same hierarchyCalculate a second penalty for the second result output by the last network block (i.e., teacher output) and the first result output by the last cell unit (i.e., student output)Calculating a first loss for a predetermined result (i.e., a true label) of sample image calibration and a first result (i.e., student output) of last cell unit output

In one embodiment, the third loss is calculated according to network intermediate layers corresponding to the same level in a preset image processing network and a current network to be optimized, and the method comprises the steps of carrying out mean pooling operation on two network feature images corresponding to the network intermediate layers of the same level in the preset image processing network and the current network to be optimized to obtain two target feature images with uniform channel numbers, respectively converting the two target feature images into a first weighted feature image and a second weighted feature image, and calculating the third loss according to the first weighted feature image and the second weighted feature image.

Specifically, two network feature maps corresponding to the same hierarchical network middle layer (for example, each cell 1 and each network block 1) are subjected to a mean pooling operation, and the channel numbers of the two network feature maps are uniformly reduced to a smaller value between the two network feature maps through the mean pooling operationAnd obtaining two target feature graphs with uniform channel numbers.

The target feature map corresponding to the network feature map output by the cell unit of the ith level can be represented by F _i and adopts a formulaThe target feature map F _i may be converted to a first weighted feature mapI _kd is the total number of cell units, j is the jth feature in the target feature map F _i, and the network feature map output by the ith level network block can be usedRepresentation, using the formulaThe target feature map can be mappedAnd converting the first weighted feature map into a second weighted feature map A _i, wherein the third loss obtained by calculation can effectively guide the learning of the current network to be optimized according to the first weighted feature map and the second weighted feature map.

In one embodiment, calculating the third loss from the first weighted feature map and the second weighted feature map includes calculating the first weighted feature map and the second weighted feature map using a mean square error loss function to obtain the third loss.

Specifically, the first weighted feature map and the second weighted feature map may be calculated based on the following mean square error loss function to obtain the third loss

Wherein, For the first weighted feature map, a _i is the second weighted feature map.

In one embodiment, calculating the first loss based on the first result and the predetermined result includes calculating the first loss using a cross entropy loss function to obtain the first loss, and calculating the second loss based on the first result and the second result includes calculating the first result and the second result using a relative entropy loss function to obtain the second loss.

Specifically, the first result and the predetermined result may be calculated based on the following cross entropy loss function to obtain the first loss

Where N is the total number of sub-results (e.g., classification probabilities) in the first result and the predetermined result, y _k is the kth sub-result in the predetermined result,Is the kth sub-result in the first result.

Specifically, the first result and the second result may be calculated based on the following relative entropy loss function (i.e., KL divergence loss function) to obtain the second loss

Wherein N is the total number of sub-results (e.g., classification probabilities) in the first result and the second result,Is the kth sub-result in the second result and p _L, is the kth sub-result in the first result.

In one embodiment, the gradient optimization of the structure parameters of the candidate operation in the current network to be optimized is performed based on the continuous processing result, and the method comprises the steps of performing second-order approximate estimation processing on the network parameters in the current network to be optimized and the continuous processing result to obtain a second-order approximate estimation result, and performing alternating gradient optimization on the network parameters and the structure parameters in the current network to be optimized based on the second-order approximate estimation result.

And carrying out second-order approximate estimation processing on the network parameters in the current network to be optimized and the continuous processing result to obtain a second-order approximate estimation result, enabling the loss function to be micro to both the network parameters omega and the structural parameters alpha based on the second-order approximate result, and carrying out alternating gradient optimization on the network parameters and the structural parameters in the current network to be optimized based on the second-order approximate estimation result until the preset optimization condition is met, so that the target image processing network can be obtained.

For example, the goal of the gradient optimization corresponding search process is the following two-objective optimization problem:

However, the loss function under the target is not differentiable with respect to the structural parameter α, the first order approximation can directly approximate the current network parameter ω as ω ^* (α), and the second order approximation approximates the optimal network parameter to one-step down gradient network parameter After second-order approximation estimation, the loss function can be used for carrying out micro-scale on the network parameter omega and the structural parameter alpha, then the current network to be optimized is trained by adopting the alternating gradient optimization of training the one-step structural parameter and training the one-step network parameter until the preset optimization condition is met, and the target image processing network can be obtained. The method of alternating gradient optimization may be to fix the value of the structural parameter α matrix on the training set, then gradient down the value of the network parameter ω matrix, and then gradient down the value of the structural parameter α matrix on the verification set.

Fig. 6 schematically shows a flow chart of an image classification method according to an embodiment of the application. The execution subject of the image classification method may be any device, such as the server 101 or the terminal 102 shown in fig. 1.

As shown in fig. 6, the image classification method may include steps S310 to S320.

Step S310, obtaining an image to be classified, step S320, performing classification processing on the image to be classified by using a target image processing network for performing image classification to obtain a classification result corresponding to the image to be classified, wherein the target image processing network is generated by the image processing network generation method according to any one of the embodiments of the present application.

In this way, in some embodiments, the target image processing network for image classification has the advantages of low consumption of computing resources, high generation efficiency, low occupation of video memory and flexible deployment on devices with limited video memory in the search generation process, and the target image processing network for image classification can further reduce the resource consumption of image classification tasks as a whole and promote the deployment flexibility of image classification tasks.

The foregoing embodiments are further described below in connection with classifying an image to be classified in a scenario in which the image to be classified is classified by applying the foregoing embodiments of the present application.

In this scenario, classifying the image to be classified may include steps (1) to (2).

And (1) searching the convolutional neural network in the image classification task by using a network structure searching method.

The method comprises the steps of obtaining an initial network structure unit, wherein the initial network structure unit comprises at least one node representing a feature graph, connecting edges are arranged between the nodes, obtaining a candidate operation set, the candidate operation set comprises at least one candidate operation corresponding to each connecting edge and used for processing the feature graph, calculating probability distribution of each connecting edge in current forward propagation, which corresponds to the candidate operation, according to structural parameters of the candidate operation corresponding to each connecting edge, selecting only one candidate operation from the corresponding candidate operation according to the probability distribution corresponding to each connecting edge, adding the candidate operation to the initial network structure unit, obtaining a current network to be optimized, carrying out discrete distribution continuous processing on the probability distribution, and carrying out gradient optimization on the structural parameters of the candidate operation in the current network to be optimized based on a continuous processing result until the structural parameters meet preset optimization conditions, thus obtaining the target image processing network. The target image processing network is a convolutional neural network searched for image classification.

Further, generating probability distribution of selecting corresponding candidate operations by each connecting edge in current forward propagation according to the structure parameters of the candidate operations corresponding to each connecting edge comprises carrying out exponential operation on the structure parameters of each candidate operation corresponding to each connecting edge to obtain parameter operation results corresponding to each candidate operation, summing the parameter operation results corresponding to each candidate operation corresponding to each connecting edge to obtain summation results corresponding to the connecting edge, dividing the summation results of the parameter operation results corresponding to each candidate operation corresponding to the connecting edge by the summation results corresponding to the connecting edge to obtain probability of selecting each candidate operation by the corresponding connecting edge, summing the probability of selecting each candidate operation by the corresponding connecting edge with a random sampling value to obtain target probability value corresponding to each candidate operation, and obtaining probability distribution of selecting the corresponding candidate operation by each connecting edge in current forward propagation based on the target probability value corresponding to each candidate operation corresponding to each connecting edge.

Further, according to the probability distribution corresponding to each connecting edge, only one candidate operation is selected from the corresponding candidate operations for each connecting edge and added to the initial network structure unit, wherein the method comprises the steps of determining the candidate operation corresponding to the maximum target probability value in the probability distribution corresponding to each connecting edge, and obtaining the target candidate operation corresponding to each connecting edge; and adding the target candidate operation corresponding to each connecting edge to the corresponding position of each connecting edge in the initial network structure unit.

Wherein, can be according to the formulaDetermining a maximum target probability value in a probability distribution corresponding to each connecting edge (i, j), by which the set of at least one candidate operation corresponding to the connecting edge can be determinedTarget probability value corresponding to each candidate operation oNormalized to normalized value a _i,j, then, the largest normalized value is processed to 1 by one-hot encoding (one_hot), the rest normalized values are processed to 0, and then, the target probability value corresponding to the normalized value corresponding to 1 is the largest.

Further, the probability distribution is continuously processed in discrete distribution, which can be the probability distribution according to the following formula of Gumbel-Softmax methodThe formed probability distribution discrete distribution is continuous, and a continuous processing result B _i,j is obtained:

where τ is the temperature coefficient.

Further, the gradient optimization of the structure parameters of the candidate operation in the current network to be optimized is performed based on the continuous processing result, which comprises the step of directly performing gradient optimization on the structure parameters of the candidate operation in the current network to be optimized based on the continuous processing result.

Further, the gradient optimization of the structure parameters of the candidate operation in the current network to be optimized is carried out based on the continuous processing result, and the method comprises the steps of respectively processing sample images by the current network to be optimized and a preset image processing network to obtain an image processing result, and carrying out gradient optimization on the structure parameters of the candidate operation in the current network to be optimized based on the continuous processing result according to the image processing result and a network middle layer of the preset image processing network.

The image processing result comprises a first result corresponding to the current network to be optimized and a second result corresponding to the preset image processing network, the sample image is calibrated with a preset result, the structure parameters of candidate operations in the current network to be optimized are subjected to gradient optimization based on the continuous processing result according to the processing result and the network middle layer of the preset image processing network, the method comprises the steps of calculating first loss according to the first result and the preset result, calculating second loss according to the first result and the second result, calculating third loss according to the preset image processing network and the network middle layer corresponding to the same level in the current network to be optimized, and carrying out gradient optimization on the structure parameters of the candidate operations in the current network to be optimized according to the first loss, the second loss and the third loss based on the continuous processing result.

Further, calculating third loss according to the network intermediate layer corresponding to the same level in the preset image processing network and the current network to be optimized comprises the steps of carrying out mean value pooling operation on two network feature images corresponding to the network intermediate layer of the same level in the preset image processing network and the current network to be optimized to obtain two target feature images with uniform channel number, respectively converting the two target feature images into a first weighted feature image and a second weighted feature image, and calculating third loss according to the first weighted feature image and the second weighted feature image.

Further, calculating a third loss according to the first weighted feature map and the second weighted feature map includes calculating the first weighted feature map and the second weighted feature map with a mean square error loss function to obtain the third loss.

Further, calculating the first loss according to the first result and the preset result comprises calculating the first loss by adopting a cross entropy loss function to obtain the first loss, and calculating the second loss according to the first result and the second result comprises calculating the first result and the second result by adopting a relative entropy loss function to obtain the second loss.

Wherein N is the total number of sub-results (e.g., classification probabilities) in the first result and the second result,Is the kth sub-result in the second result and p _L,k is the kth sub-result in the first result.

Further, the gradient optimization of the structure parameters of the candidate operation in the current network to be optimized is carried out based on the continuous processing result, and the gradient optimization method comprises the steps of carrying out second-order approximate estimation processing on the network parameters in the current network to be optimized and the continuous processing result to obtain a second-order approximate estimation result, and carrying out alternating gradient optimization on the network parameters and the structure parameters in the current network to be optimized based on the second-order approximate estimation result.

And (2) classifying the images by using the searched convolutional neural network.

The method comprises the steps of obtaining images to be classified, and classifying the images to be classified by adopting a searched convolutional neural network to obtain classification results corresponding to the images to be classified.

In the scene, the method and the device for classifying the images by using the embodiment of the application have the advantages that the convolutional neural network searching process has low consumption of computing resources, high generation efficiency, low occupation of video memory and flexible deployment on the device with limited video memory, the convolutional neural network is used for classifying the images, the resource consumption of the image classification task can be further reduced on the whole, the deployment flexibility of the image classification task is improved, and the convolutional neural network for classifying the images has excellent stability and can further effectively improve the classification accuracy of the image classification on the whole.

In order to facilitate better implementation of the image processing network generation method provided by the embodiment of the application, the embodiment of the application also provides an image processing network generation device based on the image processing network generation method. Where the meaning of the terms is the same as in the above-described image processing network generation method, specific implementation details may be referred to in the description of the method embodiment. Fig. 7 shows a block diagram of an image processing network generating apparatus according to an embodiment of the present application.

As shown in fig. 7, the image processing network generating apparatus 400 may include a structural unit acquiring module 410, an operation set acquiring module 420, a probability calculating module 430, a single operation sampling module 440, and an optimizing module 450.

The system comprises a structure unit acquisition module, an operation set acquisition module, a probability calculation module and a single operation sampling module, wherein the initial network structure unit comprises at least one node representing a feature map, a connection edge is arranged between the nodes, the operation set acquisition module is used for acquiring a candidate operation set, the candidate operation set comprises at least one candidate operation corresponding to each connection edge and used for processing the feature map, the probability calculation module is used for generating probability distribution of each connection edge selected to correspond to the candidate operation in the current forward propagation according to the structure parameters of the candidate operation corresponding to each connection edge, the single operation sampling module is used for selecting only one candidate operation from the corresponding candidate operation to be added to the initial network structure unit according to the probability distribution corresponding to each connection edge, the current network to be optimized is obtained, and the optimization module is used for carrying out discrete distribution continuous processing on the probability distribution and carrying out gradient optimization on the structure parameters of the candidate operation in the current network to be optimized until the structure parameters meet preset optimization conditions and the target image processing network is obtained.

The embodiment of the application also provides an image classification device based on the image classification method. Where the meaning of nouns is the same as in the image classification method described above, specific implementation details may be referred to in the description of the method embodiments. Fig. 8 shows a block diagram of an image classification apparatus according to an embodiment of the application.

As shown in fig. 8, the image classification apparatus 500 may include an image acquisition module 510 and a classification module 520.

The image classification system comprises an image acquisition module, a classification module and a classification module, wherein the image acquisition module is used for acquiring an image to be classified, the classification module is used for classifying the image to be classified by adopting a target image processing network for performing image classification to obtain a classification result corresponding to the image to be classified, and the target image processing network is generated by the image processing network generation method according to any one embodiment of the application.

It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functions of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the application. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.

In addition, the embodiment of the present application further provides an electronic device, which may be a terminal or a server, as shown in fig. 9, which shows a schematic structural diagram of the electronic device according to the embodiment of the present application, specifically:

The electronic device may include one or more processing cores 'processors 601, one or more computer-readable storage media's memory 602, power supply 603, and input unit 604, among other components. It will be appreciated by those skilled in the art that the electronic device structure shown in fig. 9 is not limiting of the electronic device and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components. Wherein:

Processor 601 is the control center of the electronic device and uses various interfaces and lines to connect the various parts of the overall computer device, and to perform various functions of the computer device and process data by running or executing software programs and/or modules stored in memory 602, and invoking data stored in memory 602. Alternatively, the processor 601 may include one or more processing cores, and preferably the processor 601 may integrate an application processor that primarily processes operating systems, user pages, applications, etc., and a modem processor that primarily processes wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 601.

The memory 602 may be used to store software programs and modules, and the processor 601 may execute various functional applications and data processing by executing the software programs and modules stored in the memory 602. The memory 602 may mainly include a storage program area that may store an operating system, application programs required for at least one function (such as a sound playing function, an image playing function, etc.), etc., and a storage data area that may store data created according to the use of the computer device, etc. In addition, the memory 602 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device. Accordingly, the memory 602 may also include a memory controller to provide access to the memory 602 by the processor 601.

The electronic device further comprises a power supply 603 for supplying power to the various components, preferably the power supply 603 may be logically connected to the processor 601 by a power management system, so that functions of managing charging, discharging, power consumption management and the like are achieved by the power management system. The power supply 603 may also include one or more of any components, such as a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.

The electronic device may further comprise an input unit 604, which input unit 604 may be used for receiving input digital or character information and for generating keyboard, mouse, joystick, optical or trackball signal inputs in connection with user settings and function control.

Although not shown, the electronic device may further include a display unit or the like, which is not described herein. In particular, in this embodiment, the processor 601 in the electronic device loads executable files corresponding to the processes of one or more computer programs into the memory 602 according to the following instructions, and the processor 601 executes the computer programs stored in the memory 602, so as to implement the functions of the foregoing embodiments of the present application.

The processor 601 may perform, for example, obtaining an initial network structure unit, where the initial network structure unit includes at least one node representing a feature map, and there is a connection edge between the nodes, obtaining a candidate operation set, where the candidate operation set includes at least one candidate operation corresponding to each connection edge, where the candidate operation is an operation for processing the feature map, calculating a probability distribution of selecting a corresponding candidate operation for each connection edge in current forward propagation according to a structure parameter of the candidate operation corresponding to each connection edge, selecting only one candidate operation from the corresponding candidate operations for each connection edge according to the probability distribution corresponding to each connection edge, adding the selected candidate operation to the network structure unit to obtain a current network to be optimized, performing discrete distribution serialization processing on the probability distribution, and performing gradient optimization on the structure parameter of the candidate operation in the current network to be optimized based on the serialization processing result until a predetermined optimization condition is met, thereby obtaining a target image processing network.

As another example, the processor 601 may perform obtaining an image to be classified, and performing classification processing on the image to be classified by using a target image processing network for performing image classification, to obtain a classification result corresponding to the image to be classified, where the target image processing network is generated by using the image processing network generating method according to any embodiment of the present application.

It will be appreciated by those of ordinary skill in the art that all or part of the steps of the various methods of the above embodiments may be performed by a computer program, or by computer program control related hardware, which may be stored in a computer readable storage medium and loaded and executed by a processor.

To this end, embodiments of the present application also provide a computer readable storage medium having stored therein a computer program that can be loaded by a processor to perform the steps of any of the methods provided by the embodiments of the present application.

The computer readable storage medium may include, among others, read Only Memory (ROM), random access Memory (RAM, random Access Memory), magnetic or optical disks, and the like.

Since the computer program stored in the computer readable storage medium may execute the steps of any one of the methods provided in the embodiments of the present application, the beneficial effects that can be achieved by the methods provided in the embodiments of the present application may be achieved, which are detailed in the previous embodiments and are not described herein.

According to one aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions to cause the computer device to perform the methods provided in the various alternative implementations of the application described above.

Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the embodiments disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains.

It will be understood that the application is not limited to the embodiments which have been described above and shown in the drawings, but that various modifications and changes can be made without departing from the scope thereof.

Claims

1. An image processing network generation method, comprising:

acquiring an initial network structure unit, wherein the initial network structure unit comprises at least one node representing a feature map, and connecting edges are arranged between the nodes;

acquiring a candidate operation set, wherein the candidate operation set comprises at least one candidate operation corresponding to each connecting edge, and the candidate operation is an operation for processing a feature map;

generating probability distribution of selecting the corresponding candidate operation by each connecting edge in the current forward propagation according to the structure parameters of the candidate operation corresponding to each connecting edge;

According to probability distribution corresponding to each connecting edge, selecting only one candidate operation from the corresponding candidate operations for each connecting edge, and adding the candidate operation to the initial network structure unit to obtain a current network to be optimized;

performing discrete distribution continuous processing on the probability distribution, and performing gradient optimization on the structural parameters of candidate operations in the current network to be optimized based on a continuous processing result until the structural parameters meet preset optimization conditions, so as to obtain a target image processing network;

the gradient optimizing the structure parameters of the candidate operation in the current network to be optimized based on the continuous processing result comprises the following steps:

Respectively processing the sample image by adopting the current network to be optimized and a preset image processing network to obtain an image processing result;

And carrying out gradient optimization on the structural parameters of the candidate operation in the current network to be optimized based on the continuous processing result according to the image processing result and the network middle layer of the preset image processing network.

2. The method according to claim 1, wherein the image processing result includes a first result corresponding to the current network to be optimized and a second result corresponding to the preset image processing network, and the sample image is calibrated with a predetermined result;

And performing gradient optimization on the structural parameters of the candidate operation in the current network to be optimized based on the continuous processing result according to the processing result and the network middle layer of the preset image processing network, wherein the gradient optimization comprises the following steps:

calculating a first loss according to the first result and the preset result;

calculating a second loss according to the first result and the second result;

calculating third loss according to the preset image processing network and the network middle layer corresponding to the same level in the current network to be optimized;

And according to the first loss, the second loss and the third loss, carrying out gradient optimization on the structural parameters of the candidate operation in the current network to be optimized based on a continuous processing result.

3. The method of claim 1, wherein generating a probability distribution for each connection edge in the current forward propagation to select a corresponding candidate operation according to the structure parameters of the candidate operation corresponding to each connection edge comprises:

Carrying out exponential operation on the structural parameters of each candidate operation corresponding to the connecting edges aiming at the connecting edges to obtain parameter operation results corresponding to each candidate operation;

summing the parameter operation results corresponding to each candidate operation corresponding to the connecting edge to obtain a summation result corresponding to the connecting edge;

Summing the parameter operation results corresponding to each candidate operation corresponding to the connecting edge and dividing the summation results corresponding to the connecting edge to obtain the probability that each candidate operation is selected by the corresponding connecting edge;

Summing the probability that each candidate operation is selected by the corresponding connecting edge with a random sampling value to obtain a target probability value corresponding to each candidate operation;

And obtaining probability distribution of selecting the corresponding candidate operation by each connecting edge in the current forward propagation based on the target probability value corresponding to each candidate operation corresponding to each connecting edge.

4. A method according to claim 3, wherein said selecting, for each of said connection edges, only one candidate operation from the corresponding candidate operations to be added to said initial network structural unit according to the probability distribution corresponding to each connection edge comprises:

Determining candidate operations corresponding to the maximum target probability values in the probability distribution corresponding to the connecting edges, and obtaining target candidate operations corresponding to the connecting edges;

And adding the target candidate operation corresponding to each connecting edge to the corresponding position of each connecting edge in the initial network structure unit.

5. The method according to claim 2, wherein calculating a third loss according to the preset image processing network and the network middle layer corresponding to the same hierarchy in the current network to be optimized comprises:

performing an average pooling operation on two network feature graphs corresponding to the same-level network middle layer in the preset image processing network and the current network to be optimized to obtain two target feature graphs with uniform channel numbers;

converting the two target feature maps into a first weighted feature map and a second weighted feature map respectively;

and calculating the third loss according to the first weighted characteristic diagram and the second weighted characteristic diagram.

6. The method of claim 5, wherein said calculating said third loss from said first weighted feature map and said second weighted feature map comprises:

And calculating the first weighted feature map and the second weighted feature map by adopting a mean square error loss function to obtain the third loss.

7. The method of claim 2 or 5 or 6, wherein said calculating a first loss from said first result and said predetermined result comprises:

Calculating the first result and the preset result by adopting a cross entropy loss function to obtain the first loss;

said calculating a second loss from said first result and said second result, comprising:

And calculating the first result and the second result by adopting a relative entropy loss function to obtain the second loss.

8. The method according to claim 1 or 2 or 5 or 6, wherein the gradient optimizing the structure parameters of the candidate operations in the current network to be optimized based on the continuous processing result comprises:

Performing second-order approximate estimation processing on the network parameters in the current network to be optimized and the continuous processing result to obtain a second-order approximate estimation result;

and carrying out alternating gradient optimization on the network parameters and the structural parameters in the current network to be optimized based on the second-order approximation estimation result.

9. An image classification method, comprising:

Acquiring an image to be classified;

classifying the image to be classified by using a target image processing network for image classification to obtain a classification result corresponding to the image to be classified, wherein the target image processing network is generated according to the method of any one of claims 1 to 8.

10. An image processing network generation apparatus, comprising:

the system comprises a structural unit acquisition module, a network unit generation module and a network unit generation module, wherein the structural unit acquisition module is used for acquiring an initial network structural unit, the initial network structural unit comprises at least one node representing a feature graph, and connecting edges are arranged between the nodes;

The operation set acquisition module is used for acquiring a candidate operation set, wherein the candidate operation set comprises at least one candidate operation corresponding to each connecting edge, and the candidate operation is an operation for processing a feature map;

the probability calculation module is used for generating probability distribution of selecting the corresponding candidate operation by each connecting edge in the current forward propagation according to the structure parameters of the candidate operation corresponding to each connecting edge;

The single-operation sampling module is used for selecting only one candidate operation from the corresponding candidate operations for each connecting edge according to the probability distribution corresponding to each connecting edge and adding the candidate operation to the initial network structure unit to obtain a current network to be optimized;

The optimization module is used for carrying out discrete distribution continuous processing on the probability distribution, carrying out gradient optimization on the structure parameters of the candidate operation in the current network to be optimized based on a continuous processing result until the structure parameters meet a preset optimization condition, and obtaining a target image processing network, wherein the gradient optimization on the structure parameters of the candidate operation in the current network to be optimized based on the continuous processing result comprises the steps of respectively processing sample images by adopting the current network to be optimized and a preset image processing network to obtain an image processing result, and carrying out gradient optimization on the structure parameters of the candidate operation in the current network to be optimized based on the continuous processing result according to the image processing result and a network intermediate layer of the preset image processing network.

11. An image classification apparatus, comprising:

The image acquisition module is used for acquiring images to be classified;

The classification module is configured to perform classification processing on the image to be classified by using a target image processing network for performing image classification, so as to obtain a classification result corresponding to the image to be classified, where the target image processing network is generated according to the method of any one of claims 1 to 8.

12. A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor of a computer, causes the computer to perform the method of any of claims 1 to 9.

13. An electronic device comprising a memory storing a computer program, and a processor reading the computer program stored in the memory to perform the method of any one of claims 1 to 9.

14. A computer program product, characterized in that the computer program product comprises a computer program which, when executed by a processor, implements the method of any one of claims 1 to 9.