
WO2021179117A1 - Method and apparatus for searching number of neural network channels - Google Patents

Method and apparatus for searching number of neural network channels Download PDF

Info

Publication number
WO2021179117A1
Authority
WO
WIPO (PCT)
Prior art keywords
sub-feature tensors
weighting coefficients
convolutional layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2020/078413
Other languages
French (fr)
Chinese (zh)
Inventor
邱畅啸
杨帆
钟刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to PCT/CN2020/078413 priority Critical patent/WO2021179117A1/en
Priority to CN202080091992.7A priority patent/CN114902240A/en
Publication of WO2021179117A1 publication Critical patent/WO2021179117A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/0495 Quantised networks; Sparse networks; Compressed networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Definitions

  • This application relates to the field of artificial intelligence, and more specifically, to a method and device for searching the number of neural network channels.
  • Artificial intelligence is a theory, method, technology, and application system that uses digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a way similar to human intelligence.
  • Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that machines have the functions of perception, reasoning, and decision-making.
  • With the rapid development of artificial intelligence technology, neural networks (for example, deep neural networks) have in recent years achieved great success in the processing and analysis of various media signals such as images, videos, and speech.
  • A neural network with good performance often has a sophisticated network structure, which requires highly skilled and experienced human experts to spend a great deal of effort to construct.
  • In order to better construct neural networks, a neural architecture search (NAS) method has been proposed: by automatically searching for neural network structures, network structures with excellent performance can be obtained.
  • Neural network structure search techniques can be divided into different categories according to the search method. Differentiable search is one of the important techniques of NAS. It is mainly divided into three stages: constructing a differentiable neural network search space, performing the network structure search, and decoding the search result to obtain the final network structure.
  • When the differentiable search technique is applied to network structure search, the main search target is currently the computation unit. The computation unit can be a single operation, such as convolution or pooling, or a block operation composed of multiple basic operations.
  • Although the differentiable search technique can choose among different computation units when searching a network structure, it does not support searching the number of channels of a single convolution, so it cannot meet the requirement when a network with a smaller amount of computation is desired.
  • In view of this, the present application provides a method and apparatus for searching the number of neural network channels, which enable the differentiable search technique to search the number of network channels and reduce the computational complexity of the network while maintaining network performance.
  • In a first aspect, a method for searching the number of neural network channels is provided, including: determining the number N of output channels of a convolutional layer, where N is a positive integer; dividing the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer that divides N, and n≥2; determining n groups of weighting coefficients, where each group of weighting coefficients includes multiple weighting coefficients that are in one-to-one correspondence with multiple sub-feature tensors among the n sub-feature tensors; determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.
  • As described above, the computation unit can be a single operation, such as convolution or pooling, or a block operation composed of multiple basic operations.
  • Although the differentiable search technique can choose among different computation units when searching a network structure, it does not support searching the number of channels of a single convolution, so it cannot meet the requirement when a network with a smaller amount of computation is desired.
  • the method for searching the number of neural network channels provided by the embodiments of the present application can realize the search of the number of neural network channels based on a differentiable search technology.
  • each group of weighting coefficients includes n weighting coefficients, and the n weighting coefficients have a one-to-one correspondence with the n sub-feature tensors.
  • Alternatively, each group of weighting coefficients includes m weighting coefficients, and the m weighting coefficients are in one-to-one correspondence with m sub-feature tensors among the n sub-feature tensors, where m is a positive integer smaller than n.
  • The embodiments of this application provide two possible implementations for determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients: the sub-feature tensor corresponding to each maximum value can be determined from all n sub-feature tensors, or from only part of the n sub-feature tensors.
  • In some implementations, before the sub-feature tensor corresponding to the maximum value of each of the n groups of weighting coefficients is determined, the method further includes: generating n candidate feature tensors according to the n groups of weighting coefficients and multiple sub-feature tensors among the n sub-feature tensors, where one group of weighting coefficients corresponds to one candidate feature tensor.
  • Determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values, further includes: determining the sub-feature tensor with the largest weight among the multiple sub-feature tensors used to generate each candidate feature tensor, so as to obtain the n sub-feature tensors with the largest weights.
  • That is, each candidate feature tensor can be calculated from a group of weighting coefficients and multiple sub-feature tensors among the n sub-feature tensors, where the multiple sub-feature tensors can be all n sub-feature tensors or only part of them; the sub-feature tensor with the largest weight in each candidate feature tensor is then taken as the sub-feature tensor corresponding to the maximum value.
  • Re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values includes: determining the number k of mutually different sub-feature tensors among the sub-feature tensors corresponding to the n maximum values, where k is a positive integer less than or equal to n, and re-determining the number of output channels of the convolutional layer as kN/n.
  • In other words, the number of output channels of the convolutional layer is re-determined according to the number of mutually different sub-feature tensors among the sub-feature tensors corresponding to the n maximum values.
  • The re-determined number of channels is k/n of the original number, which compresses the number of neural network channels and thereby reduces the computational complexity of the neural network.
  • In a second aspect, an image processing method is provided, including: acquiring an image to be processed; and classifying the image to be processed according to a target neural network to obtain a classification result of the image to be processed. The number of channels of the target neural network is determined by: determining the number N of output channels of a convolutional layer, where N is a positive integer; dividing the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer that divides N, and n≥2; determining n groups of weighting coefficients, where each group of weighting coefficients includes multiple weighting coefficients that are in one-to-one correspondence with multiple sub-feature tensors among the n sub-feature tensors; determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.
  • the neural network searched by the neural network channel number search method provided in the embodiments of the present application is used for image processing. Compared with the neural network without channel number compression, the overall computational complexity of the neural network is reduced.
  • each group of weighting coefficients includes n weighting coefficients, and the n weighting coefficients are in one-to-one correspondence with the n sub-feature tensors.
  • Alternatively, each group of weighting coefficients includes m weighting coefficients, and the m weighting coefficients are in one-to-one correspondence with m sub-feature tensors among the n sub-feature tensors, where m is a positive integer smaller than n.
  • In some implementations, before determining the sub-feature tensor corresponding to the maximum value of each of the n groups of weighting coefficients, the method further includes: generating n candidate feature tensors according to the n groups of weighting coefficients and multiple sub-feature tensors among the n sub-feature tensors, where one group of weighting coefficients corresponds to one candidate feature tensor.
  • Determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values, further includes: determining the sub-feature tensor with the largest weight among the multiple sub-feature tensors used to generate each candidate feature tensor, so as to obtain the n sub-feature tensors with the largest weights.
  • Re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values includes: determining the number k of mutually different sub-feature tensors among the sub-feature tensors corresponding to the n maximum values, where k is a positive integer less than or equal to n, and re-determining the number of output channels of the convolutional layer as kN/n.
  • In a third aspect, a neural network channel number search device is provided, including: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the program stored in the memory is executed, the processor is configured to perform the following process: determine the number N of output channels of a convolutional layer, where N is a positive integer; divide the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer that divides N, and n≥2; determine n groups of weighting coefficients, where each group of weighting coefficients includes multiple weighting coefficients that are in one-to-one correspondence with multiple sub-feature tensors among the n sub-feature tensors; determine the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and re-determine the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.
  • In a fourth aspect, an image processing device is provided, including: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the program stored in the memory is executed, the processor is configured to perform the following process: acquire an image to be processed; and classify the image to be processed according to a target neural network to obtain a classification result of the image to be processed. The number of channels of the target neural network is determined by: determining the number N of output channels of a convolutional layer, where N is a positive integer; dividing the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer that divides N, and n≥2; determining n groups of weighting coefficients, where each group of weighting coefficients includes multiple weighting coefficients that are in one-to-one correspondence with multiple sub-feature tensors among the n sub-feature tensors; determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.
  • In a fifth aspect, a computer-readable storage medium is provided, which stores program code for execution by a device, where the program code includes instructions for performing the method in any one of the implementations of the first aspect or the second aspect.
  • In a sixth aspect, a computer program product containing instructions is provided. When the computer program product runs on a computer, the computer is caused to execute the method in any one of the implementations of the first aspect or the second aspect.
  • In a seventh aspect, a chip is provided, including a processor and a data interface. The processor reads, through the data interface, instructions stored in a memory to execute the method in any one of the implementations of the first aspect or the second aspect.
  • The chip may further include a memory in which instructions are stored; the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor is configured to execute the method in any one of the implementations of the first aspect or the second aspect.
  • Fig. 1 is a schematic structural diagram of a convolutional neural network provided by an embodiment of the present application
  • FIG. 2 is a schematic block diagram of a method for searching a differentiable neural network structure provided by an embodiment of the present application
  • FIG. 3 is a schematic flowchart of a method for searching the number of neural network channels provided by an embodiment of the present application
  • FIG. 4 is a schematic block diagram of a method for searching the number of neural network channels provided by an embodiment of the present application
  • FIG. 5 is a schematic block diagram of another neural network channel number search method provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a super-resolution neural network structure provided by an embodiment of the present application.
  • FIG. 7 is a schematic flowchart of an image processing method provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of the hardware structure of a neural network channel number search device provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of the hardware structure of an image processing apparatus provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of the hardware structure of a neural network training device according to an embodiment of the present application.
  • FIG. 11 is a schematic structural block diagram of a neural network channel number search device provided by an embodiment of the present application.
  • FIG. 12 is a schematic structural block diagram of an image processing device provided by an embodiment of the present application.
  • The neural network obtained by the method for searching the number of neural network channels may be a convolutional neural network (CNN), a deep convolutional neural network (DCNN), a recurrent neural network (RNN), and so on. Since CNN is a very common neural network, the structure of a CNN is introduced below in conjunction with FIG. 1.
  • a convolutional neural network (CNN) 100 may include an input layer 110, a convolutional layer/pooling layer 120 (the pooling layer is optional), and a neural network layer 130.
  • the input layer 110 can obtain the image to be processed, and pass the obtained image to be processed to the convolutional layer/pooling layer 120 and the subsequent neural network layer 130 for processing, and the processing result of the image can be obtained.
  • the following describes the internal layer structure of CNN 100 in Figure 1 in detail.
  • The convolutional layer/pooling layer 120 may include layers 121 to 126 as shown in the example. In one implementation, layer 121 is a convolutional layer, layer 122 is a pooling layer, layer 123 is a convolutional layer, layer 124 is a pooling layer, layer 125 is a convolutional layer, and layer 126 is a pooling layer. In another implementation, layers 121 and 122 are convolutional layers, layer 123 is a pooling layer, layers 124 and 125 are convolutional layers, and layer 126 is a pooling layer. That is, the output of a convolutional layer can be used as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
  • Taking the convolutional layer 121 as an example, it can include many convolution operators. A convolution operator is also called a kernel; its role in image processing is equivalent to a filter that extracts specific information from the input image matrix. A convolution operator can essentially be a weight matrix, which is usually predefined. During the convolution operation on an image, the weight matrix is usually moved along the horizontal direction of the input image one pixel at a time (or two pixels at a time, depending on the value of the stride), so as to extract specific features from the image.
  • The size of the weight matrix should be related to the size of the image. It should be noted that the depth dimension of the weight matrix is the same as the depth dimension of the input image; during the convolution operation, the weight matrix extends through the entire depth of the input image. Therefore, convolution with a single weight matrix produces a convolution output with a single depth dimension, but in most cases a single weight matrix is not used; instead, multiple weight matrices of the same size (rows × columns), that is, multiple homogeneous matrices, are applied. The output of each weight matrix is stacked to form the depth dimension of the convolved image, where this dimension can be understood as being determined by the "multiple" mentioned above.
  • Different weight matrices can be used to extract different features in the image. For example, one weight matrix is used to extract edge information of the image, another weight matrix is used to extract specific colors of the image, and another weight matrix is used to eliminate unwanted noise in the image.
  • The multiple weight matrices have the same size (rows × columns), so the convolutional feature maps extracted by them also have the same size; the multiple extracted convolutional feature maps of the same size are then merged to form the output of the convolution operation.
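  • The stacking of feature maps described above can be illustrated with a minimal PyTorch sketch (the framework, layer sizes, and variable names below are illustrative assumptions, not part of this application):

```python
import torch
import torch.nn as nn

# A convolutional layer with 16 kernels (weight matrices): each kernel spans the full
# input depth (3 channels here) and produces one feature map; the 16 maps are stacked
# along the channel dimension to form the output.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, stride=1, padding=1)
x = torch.randn(1, 3, 32, 32)   # a batch containing one 3-channel 32x32 image
y = conv(x)
print(y.shape)                  # torch.Size([1, 16, 32, 32]) -> 16 output channels
```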
  • weight values in these weight matrices need to be obtained through a lot of training in practical applications, and each weight matrix formed by the weight values obtained through training can extract information from the input image, so that the convolutional neural network 100 can make correct predictions.
  • When the convolutional neural network 100 has multiple convolutional layers, the initial convolutional layer (such as 121) often extracts more general features, which can also be called low-level features; as the depth of the convolutional neural network 100 increases, the features extracted by the subsequent convolutional layers (for example, 126) become more and more complex, for example, features with high-level semantics.
  • A pooling layer is often introduced after a convolutional layer: it can be one convolutional layer followed by one pooling layer, or multiple convolutional layers followed by one or more pooling layers.
  • the pooling layer may include an average pooling operator and/or a maximum pooling operator for sampling the input image to obtain an image with a smaller size.
  • the average pooling operator can calculate the pixel values in the image within a specific range to generate an average value as the result of the average pooling.
  • the maximum pooling operator can take the pixel with the largest value within a specific range as the result of the maximum pooling.
  • the operators in the pooling layer should also be related to the image size.
  • the size of the image output after processing by the pooling layer can be smaller than the size of the image of the input pooling layer, and each pixel in the image output by the pooling layer represents the average value or the maximum value of the corresponding sub-region of the image input to the pooling layer.
  • After processing by the convolutional layer/pooling layer 120, the convolutional neural network 100 is still not able to output the required output information, because, as mentioned above, the convolutional layer/pooling layer 120 only extracts features and reduces the number of parameters brought by the input image. In order to generate the final output information (the required class information or other related information), the convolutional neural network 100 needs to use the neural network layer 130 to generate one output or a group of outputs of the required number of classes. Therefore, the neural network layer 130 may include multiple hidden layers (131, 132 to 13n as shown in FIG. 1) and an output layer 140. The parameters contained in the multiple hidden layers may be obtained by training based on relevant training data of a specific task type; for example, the task type can include image recognition, image classification, image super-resolution reconstruction, and so on.
  • After the multiple hidden layers in the neural network layer 130, the final layer of the entire convolutional neural network 100 is the output layer 140. The output layer 140 has a loss function similar to the categorical cross-entropy, which is specifically used to calculate the prediction error.
  • the neural network shown in Figure 1 can be obtained by the neural network structure search method.
  • As described above, the differentiable search technique is one of the important techniques of neural network structure search. It is mainly divided into three stages: constructing a differentiable neural network search space, performing the network structure search, and decoding the search result to obtain the final network structure.
  • the following is a brief introduction to the differentiable neural network structure search technology in conjunction with Figure 2.
  • the first step is to construct a differentiable neural network search space.
  • In the example, the candidate network computation units 1, 2, 3, 4, and 5 are deployed in one network, and the search space is constructed by a weighted summation method.
  • The weighting coefficients are obtained by the gumbel_softmax conversion function.
  • The gumbel_softmax conversion function can convert the weighting coefficients into a vector whose elements lie in [0, 1] and sum to 1.
  • The parameter temperature controls the output distribution: when the temperature is high, the output tends to a uniform distribution; when the temperature is very low, the output tends to a one-hot distribution, that is, only one element tends to 1 and the other elements tend to 0.
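  • As a concrete illustration of this temperature behaviour, the following sketch uses PyTorch's built-in gumbel_softmax (an assumption; the application only names the conversion function and does not prescribe an implementation):

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([1.0, 2.0, 0.5, 0.2, 1.5])  # one structural parameter per candidate unit

# High temperature -> the coefficients tend toward a uniform distribution;
# low temperature -> they tend toward a one-hot distribution.
a_flat = F.gumbel_softmax(logits, tau=10.0)   # roughly 0.2 each
a_sharp = F.gumbel_softmax(logits, tau=0.1)   # one element close to 1, the rest close to 0
print(a_flat, a_flat.sum())                   # both vectors sum to 1
print(a_sharp, a_sharp.sum())
```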
  • a1, a2, a3, a4, a5 are weighting coefficients, where a1 is the weighting coefficient of calculation unit 1, a2 is the weighting coefficient of calculation unit 2...
  • the second step is to search the network structure.
  • alternate training of the network parameters and weighting coefficients of the network calculation unit is performed, where the weighting coefficients represent the structural parameters of the network structure.
  • the input data is passed into the calculation units 1 to 5 respectively, and the network parameter training of the calculation unit is performed.
  • the data processed by the calculation units 1 to 5 are respectively multiplied by the weighting coefficients a1 to a5 and then added to obtain an output 1.
  • The data of output 1 is passed into the computation units 1' to 5' respectively, and the network parameter training of these computation units is performed.
  • the third step is to decode the search results.
  • the calculation unit with the largest weighting coefficient is retained, and other calculation units are deleted, and the final network structure is obtained as the search result.
  • According to the largest weighting coefficient among the weighting coefficients a1 to a5, the corresponding computation unit is retained and the other computation units are deleted; likewise, according to the largest weighting coefficient among the weighting coefficients b1 to b5, the corresponding computation unit is retained and the other computation units are deleted.
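  • A minimal sketch of this weighted-sum search space and its decoding is given below (PyTorch is assumed, and the candidate units and sizes are hypothetical examples rather than the units used in the application):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical candidate computation units 1..5 for one searchable position.
units = nn.ModuleList([
    nn.Conv2d(8, 8, 3, padding=1),
    nn.Conv2d(8, 8, 5, padding=2),
    nn.MaxPool2d(3, stride=1, padding=1),
    nn.AvgPool2d(3, stride=1, padding=1),
    nn.Identity(),
])
alpha = nn.Parameter(torch.zeros(len(units)))   # structural parameters of the search

def mixed_output(x, tau=1.0):
    a = F.gumbel_softmax(alpha, tau=tau)        # weighting coefficients a1..a5, summing to 1
    return sum(ai * unit(x) for ai, unit in zip(a, units))

out1 = mixed_output(torch.randn(2, 8, 16, 16))  # output 1, fed to the next searchable position

# Decoding the search result: keep only the unit with the largest weighting coefficient.
print("retained unit index:", torch.argmax(alpha).item())
```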
  • the calculation unit can be an operation, such as convolution, pooling, etc., or a block operation composed of multiple basic operations.
  • Although the differentiable search technique can choose among different computation units when searching a network structure, it does not support searching the number of channels of a single convolution, so it cannot meet the requirement when a network with a smaller amount of computation is desired.
  • the method for searching the number of neural network channels provided by the embodiments of the present application can realize the search of the number of neural network channels based on a differentiable search technology.
  • Fig. 3 shows a schematic flow chart of the method for searching the number of neural network channels provided by the present application.
  • the method shown in FIG. 3 can be executed by a neural network structure search device.
  • the neural network structure search device can be a computer, a server, a cloud device, and other devices with sufficient computing power to implement a neural network structure search.
  • the method shown in FIG. 3 includes steps 301 to 305, which are described in detail below.
  • S301 Determine the number N of output channels of the convolutional layer, where N is a positive integer.
  • the number N of output channels of the convolutional layer may be the maximum number of output channels of the convolutional layer, and the maximum number of output channels of the convolutional layer may be a value set according to a specific embodiment.
  • S302 Divide the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer that divides N, and n≥2.
  • the segmentation of the feature tensor output by the convolutional layer is the segmentation in the channel dimension.
  • For example, a picture is a three-dimensional tensor: the length and the width are each one dimension, and the third dimension is the number of channels.
  • Here, the feature tensor is divided in the channel dimension, so that the sub-feature tensors equally divide the number of channels of the convolutional layer.
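  • A minimal sketch of this channel-dimension split (assuming PyTorch tensors in NCHW layout; the sizes are illustrative):

```python
import torch

N, n = 32, 4                                   # N output channels, split into n sub-feature tensors
feature = torch.randn(1, N, 16, 16)            # feature tensor output by the convolutional layer
sub_tensors = torch.chunk(feature, n, dim=1)   # split along the channel dimension
print([t.shape[1] for t in sub_tensors])       # [8, 8, 8, 8] -> each sub-tensor has N/n channels
```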
  • S303 Determine n groups of weighting coefficients, where each group of weighting coefficients includes multiple weighting coefficients, and the multiple weighting coefficients are in one-to-one correspondence with multiple sub-feature tensors of the n sub-feature tensors.
  • each group of weighting coefficients includes n weighting coefficients, and the n weighting coefficients have a one-to-one correspondence with the n sub-feature tensors.
  • n weighting coefficients have a one-to-one correspondence with the n sub-feature tensors.
  • For example, when n = 4, four groups of weighting coefficients are determined, and each group of weighting coefficients includes four weighting coefficients a1, a2, a3, and a4.
  • These four weighting coefficients correspond one-to-one to the four sub-feature tensors T1, T2, T3, and T4, namely a1 corresponds to T1, a2 corresponds to T2, a3 corresponds to T3, and a4 corresponds to T4.
  • each set of weighting coefficients may be different from each other.
  • each group of weighting coefficients includes m weighting coefficients, and the m weighting coefficients have a one-to-one correspondence with m sub-feature tensors in the n sub-feature tensors, where m is a positive integer less than n.
  • m is a positive integer less than n.
  • For example, when n = 4 and m = 2, four groups of weighting coefficients are determined, and each group of weighting coefficients includes two weighting coefficients a1 and a2; these two weighting coefficients correspond one-to-one to any two of the four sub-feature tensors T1, T2, T3, and T4, for example, a1 corresponds to T1 and a2 corresponds to T2, or a1 corresponds to T1 and a2 corresponds to T3.
  • the weighting coefficients of each group may be different from each other.
  • The embodiments of this application provide two possible implementations for determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients: the sub-feature tensor corresponding to each maximum value can be determined from all n sub-feature tensors, or from only part of the n sub-feature tensors.
  • S304 Determine the sub-feature tensor corresponding to the maximum value of each group of weighting coefficients in the n groups of weighting coefficients, so as to obtain the sub-feature tensor corresponding to the n maximum values.
  • Continuing the example above, suppose the largest weighting coefficient in the first group is a1; then the sub-feature tensor corresponding to its maximum value is T1. The largest weighting coefficient in the second group is also a1, so the corresponding sub-feature tensor is also T1. The largest weighting coefficient in the third group is a2, so the corresponding sub-feature tensor is T2. The largest weighting coefficient in the fourth group is a3, so the corresponding sub-feature tensor is T3.
  • The sub-feature tensors corresponding to the four maximum values thus obtained are T1, T1, T2, and T3, respectively.
  • In a possible implementation, n candidate feature tensors may be generated according to the n groups of weighting coefficients and multiple sub-feature tensors among the n sub-feature tensors, where one group of weighting coefficients corresponds to one candidate feature tensor. The sub-feature tensor with the largest weight among the multiple sub-feature tensors used to generate each candidate feature tensor is then determined, so as to obtain the n sub-feature tensors with the largest weights.
  • Taking one group of weighting coefficients as an example: a candidate feature tensor is generated based on the four weighting coefficients a1, a2, a3, a4 in the group and the four sub-feature tensors T1, T2, T3, T4; if a1 is the largest weighting coefficient in the group, the sub-feature tensor with the largest weight in this candidate feature tensor is T1.
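  • A sketch of this candidate construction for one group of coefficients (PyTorch assumed; tensor shapes are illustrative):

```python
import torch
import torch.nn.functional as F

# One group of structural parameters, one weighting coefficient per sub-feature tensor.
alpha = torch.randn(4, requires_grad=True)
a = F.gumbel_softmax(alpha, tau=1.0)                  # a1..a4 in [0, 1], summing to 1

T = [torch.randn(1, 8, 16, 16) for _ in range(4)]     # sub-feature tensors T1..T4
candidate = sum(ai * Ti for ai, Ti in zip(a, T))      # one candidate feature tensor

largest = torch.argmax(a).item()                      # index of the max-weight sub-feature tensor
print("sub-feature tensor with the largest weight: T%d" % (largest + 1))
```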
  • S305 Re-determine the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.
  • Continuing the example above, the sub-feature tensors corresponding to the four maximum values are T1, T1, T2, and T3, respectively.
  • The mutually different sub-feature tensors among them are T1, T2, and T3, and their number is 3. Therefore, the number of output channels of the convolutional layer can be re-determined as 3N/4, which compresses the number of neural network channels and thereby reduces the computational complexity of the neural network.
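  • The re-determination of the channel count can be sketched as follows (a toy example with assumed index values matching the T1, T1, T2, T3 case above):

```python
import torch

# Index of the max-weight sub-feature tensor for each of the n groups,
# e.g. T1, T1, T2, T3 -> indices 0, 0, 1, 2.
max_indices = torch.tensor([0, 0, 1, 2])

N, n = 32, 4
k = max_indices.unique().numel()   # number of mutually different sub-feature tensors
new_channels = k * N // n          # re-determined number of output channels, kN/n
print(k, new_channels)             # 3, 24 -> i.e. 3N/4
```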
  • the overall process of the neural network channel number search method according to the embodiment of the present application will be introduced below in conjunction with FIG. 4.
  • the method of constructing a search space with a searchable number of convolutional channels in the embodiment of the present application is as follows.
  • the maximum number of output channels N of the convolutional layer may be a value set according to a specific embodiment.
  • the maximum number of search channels is determined according to the maximum number of output channels N of the convolutional layer.
  • Convolutional layer 1 outputs the feature tensor T, and the feature tensor T is divided in the channel dimension into four sub-feature tensors as shown in FIG. 4: T0, T1, T2, and T3. Thus, the number of channels of each sub-feature tensor is N/4.
  • In the gumbel_softmax conversion function, g is a randomly generated noise variable, τ is a set temperature value, x is the input, and y is the output, that is, the calculated weighting coefficient (in the standard gumbel_softmax formulation, y = softmax((x + g)/τ)).
  • For each group, a corresponding group of weighting coefficients a00, a01, a02, a03 can be calculated, where a00, a01, a02, and a03 form a vector whose elements all lie in [0, 1] and sum to 1.
  • 4 sets of weighting coefficients can be obtained.
  • The candidate feature tensor TC0 is a00×T0+a01×T1+a02×T2+a03×T3, TC1 is a10×T0+a11×T1+a12×T2+a13×T3, TC2 is a20×T0+a21×T1+a22×T2+a23×T3, and TC3 is a30×T0+a31×T1+a32×T2+a33×T3.
  • The 4 candidate feature tensors TC0, TC1, TC2, and TC3 are concatenated in the channel dimension into a feature tensor Tout as the output of convolutional layer 1, and the number of channels is still N.
  • the feature tensor Tout is input to the next convolutional layer 2.
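  • The construction of FIG. 4 can be summarized in the following sketch (a minimal PyTorch module written for illustration; the class name, layer sizes, and use of F.gumbel_softmax are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SearchableChannelConv(nn.Module):
    """Convolutional layer 1 with a searchable number of output channels (n = 4 groups)."""
    def __init__(self, in_ch, N, n=4):
        super().__init__()
        self.n = n
        self.conv = nn.Conv2d(in_ch, N, kernel_size=3, padding=1)
        self.alpha = nn.Parameter(torch.zeros(n, n))       # structural parameters, one row per group

    def forward(self, x, tau=1.0):
        T = torch.chunk(self.conv(x), self.n, dim=1)        # sub-feature tensors T0..T3
        a = F.gumbel_softmax(self.alpha, tau=tau, dim=-1)   # 4 groups of weighting coefficients
        # TCi = ai0*T0 + ai1*T1 + ai2*T2 + ai3*T3 for each group i
        TC = [sum(a[i, j] * T[j] for j in range(self.n)) for i in range(self.n)]
        return torch.cat(TC, dim=1)                          # Tout, still N channels

    def decode(self):
        # Keep the sub-feature tensor with the largest weight in each group; the number of
        # distinct retained sub-tensors k gives the new channel count kN/n.
        return self.alpha.argmax(dim=-1).unique()

layer = SearchableChannelConv(in_ch=16, N=32)
tout = layer(torch.randn(1, 16, 8, 8))     # Tout is then fed into convolutional layer 2
print(tout.shape, layer.decode())
```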
  • In this way, the sub-feature tensor with the largest contribution to each candidate feature tensor can be obtained. For example, for TC0, a00>a01>a02>a03, so the sub-feature tensor that contributes the most to TC0 is T0; for TC1, a10>a11>a12>a13, so the sub-feature tensor that contributes the most to TC1 is T0; for TC2, a23>a20>a21>a22, so the sub-feature tensor that contributes the most to TC2 is T3; for TC3, a32>a30>a31>a33, so the sub-feature tensor that contributes the most to TC3 is T2.
  • Since the sub-feature tensors that contribute the most to TC0 and TC1 are both T0, only T0, T2, and T3 need to be retained among the four sub-feature tensors T0, T1, T2, and T3, so the number of channels only needs to be 3N/4. This achieves compression of the number of channels.
  • the actual number of channels of the searched network structure in the convolutional layer 1 is only 3/4 of the original number of channels, the number of channels is reduced, and the overall computational complexity of the network is reduced.
  • FIG. 5 shows another way of generating candidate feature tensors.
  • The sub-feature tensor T0 is directly used as the candidate feature tensor TC0, and the sub-feature tensors T0 and T1 are weighted and summed to generate the candidate feature tensor TC1 as a10×T0+a11×T1.
  • The sub-feature tensors T0 and T2 are weighted and summed to generate the candidate feature tensor TC2 as a20×T0+a21×T2, and the sub-feature tensors T0 and T3 are weighted and summed to generate the candidate feature tensor TC3 as a30×T0+a31×T3.
  • In this case, the sub-feature tensor that contributes the most to TC0 is obviously T0; for TC1, a10>a11, so the sub-feature tensor that contributes the most to TC1 is T0; for TC2, a20>a21, so the sub-feature tensor that contributes the most to TC2 is T0; and for TC3, a31>a30, so the sub-feature tensor that contributes the most to TC3 is T3.
  • Since the sub-feature tensors that contribute the most to TC0, TC1, and TC2 are all T0, only T0 and T3 need to be retained among the four sub-feature tensors T0, T1, T2, and T3, so the number of channels only needs to be N/2. As a result, the number of channels can be compressed.
  • the combination method of generating candidate feature tensors shown in FIG. 5 is beneficial to search for a network structure with a smaller number of channels.
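  • A corresponding sketch of the FIG. 5 style combination (PyTorch assumed; the parameter layout and shapes are illustrative choices, not prescribed by the application):

```python
import torch
import torch.nn.functional as F

alpha = torch.randn(3, 2)                             # one pair of structural parameters per TC1..TC3
T = [torch.randn(1, 8, 16, 16) for _ in range(4)]     # sub-feature tensors T0..T3

TC0 = T[0]                                            # TC0 is T0 directly
a = F.gumbel_softmax(alpha, tau=1.0, dim=-1)          # rows: (a10, a11), (a20, a21), (a30, a31)
TC = [TC0] + [a[i, 0] * T[0] + a[i, 1] * T[i + 1] for i in range(3)]
Tout = torch.cat(TC, dim=1)                           # concatenated output, still N channels

# Decoding: T0 always contributes; additionally keep T(i+1) only where a[i, 1] > a[i, 0].
keep = {0} | {i + 1 for i in range(3) if a[i, 1] > a[i, 0]}
print(sorted(keep))                                   # e.g. [0, 3] -> N/2 channels in the example above
```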
  • FIG. 6 is a schematic diagram of a super-resolution neural network structure searched by the neural network channel number search method provided by an embodiment of the present application.
  • Its performance is equivalent to that of the super-resolution neural network structure searched without using the neural network channel number search method provided by the embodiments of the present application.
  • Compared with that network, in the super-resolution neural network structure searched by the neural network channel number search method provided by the embodiments of the present application, the number of channels of convolutional layer 0 is reduced by 50%, the number of channels of convolutional layer 1 is reduced by 50%, the number of channels of convolutional layer 2 is reduced by 0%, the number of channels of convolutional layer 3 is reduced by 0%, and the number of channels of convolutional layer 4 is reduced by 50%.
  • the overall computational complexity of the network is reduced by 37%.
  • The subjective effect of the image processed by the neural network structure searched by the method provided in the embodiments of the present application is equivalent to that of the original image, and is better than that of the image processed by an interpolation method.
  • The details of the image processed by the searched neural network structure are clearer than those of the original image.
  • FIG. 7 is a schematic flowchart of an image processing method according to an embodiment of the present application. It should be understood that the above definitions, explanations, and extensions of the relevant content of the method shown in FIG. 3 are also applicable to the method shown in FIG. 7, and repeated descriptions are appropriately omitted when introducing the method shown in FIG. 7.
  • the method shown in FIG. 7 can be applied to terminal equipment, including:
  • S701 Acquire an image to be processed.
  • S702 Classify the image to be processed according to the target neural network to obtain a classification result of the image to be processed.
  • The number of channels of the target neural network is determined by: determining the number N of output channels of a convolutional layer, where N is a positive integer; dividing the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer that divides N, and n≥2; determining n groups of weighting coefficients, where each group of weighting coefficients includes multiple weighting coefficients that are in one-to-one correspondence with multiple sub-feature tensors among the n sub-feature tensors; determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.
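  • For illustration, inference with such a target network could look like the following sketch (the network body, channel counts, and class count below are hypothetical placeholders, not the network of this application):

```python
import torch
import torch.nn as nn

# Hypothetical target network whose convolutional channel count was chosen by the search.
target_net = nn.Sequential(
    nn.Conv2d(3, 24, 3, padding=1), nn.ReLU(),   # 24 channels, e.g. kN/n after the search (assumed)
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(24, 10),                           # 10 classes (assumed)
)

image = torch.randn(1, 3, 224, 224)              # S701: the image to be processed
logits = target_net(image)                       # S702: classify with the target neural network
print(logits.argmax(dim=1))                      # classification result of the image
```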
  • FIG. 8 is a schematic diagram of the hardware structure of a neural network channel number search device provided by an embodiment of the present application.
  • the neural network channel number search device 800 shown in FIG. 8 (the device 800 may specifically be a computer device) includes a memory 801, a processor 802, a communication interface 803, and a bus 804. Among them, the memory 801, the processor 802, and the communication interface 803 realize the communication connection between each other through the bus 804.
  • the memory 801 may be a read only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM).
  • the memory 801 may store a program. When the program stored in the memory 801 is executed by the processor 802, the processor 802 is configured to execute each step of the method for searching the number of neural network channels in the embodiment of the present application.
  • The processor 802 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, and is used to execute related programs to implement the neural network channel number search method in the method embodiments of the present application.
  • the processor 802 may also be an integrated circuit chip with signal processing capability.
  • each step of the method for searching the number of neural network channels of the present application can be completed by an integrated logic circuit of hardware in the processor 802 or instructions in the form of software.
  • The above-mentioned processor 802 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
  • The storage medium is located in the memory 801; the processor 802 reads the information in the memory 801 and, in combination with its hardware, completes the functions required by the units included in the neural network channel number search device, or executes the neural network channel number search method of the method embodiments of the present application.
  • the communication interface 803 uses a transceiving device such as but not limited to a transceiver to implement communication between the device 800 and other devices or a communication network. For example, the information of the target neural network to be determined and the training data needed in the process of determining the target neural network can be obtained through the communication interface 803.
  • the bus 804 may include a path for transferring information between various components of the device 800 (for example, the memory 801, the processor 802, and the communication interface 803).
  • FIG. 9 is a schematic diagram of the hardware structure of an image processing apparatus according to an embodiment of the present application.
  • the image processing apparatus 900 shown in FIG. 9 includes a memory 901, a processor 902, a communication interface 903, and a bus 904.
  • the memory 901, the processor 902, and the communication interface 903 implement communication connections between each other through the bus 904.
  • the memory 901 may be ROM, static storage device and RAM.
  • the memory 901 may store a program. When the program stored in the memory 901 is executed by the processor 902, the processor 902 and the communication interface 903 are used to execute each step of the image processing method of the embodiment of the present application.
  • The processor 902 may be a general-purpose CPU, a microprocessor, an ASIC, a GPU, or one or more integrated circuits, and is used to execute related programs to realize the functions required by the units in the image processing apparatus of the embodiments of the present application, or to execute the image processing method in the method embodiments of the present application.
  • the processor 902 may also be an integrated circuit chip with signal processing capability.
  • each step of the image processing method of the embodiment of the present application can be completed by an integrated logic circuit of hardware in the processor 902 or instructions in the form of software.
  • the aforementioned processor 902 may also be a general-purpose processor, DSP, ASIC, FPGA or other programmable logic device, discrete gate or transistor logic device, or discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present application can be implemented or executed.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
  • The storage medium is located in the memory 901; the processor 902 reads the information in the memory 901 and, in combination with its hardware, completes the functions required by the units included in the image processing apparatus of the embodiments of the present application, or executes the image processing method of the method embodiments of the present application.
  • the communication interface 903 uses a transceiving device such as but not limited to a transceiver to implement communication between the device 900 and other devices or a communication network.
  • the image to be processed can be acquired through the communication interface 903.
  • the bus 904 may include a path for transferring information between various components of the device 900 (for example, the memory 901, the processor 902, and the communication interface 903).
  • FIG. 10 is a schematic diagram of the hardware structure of a neural network training device according to an embodiment of the present application. Similar to the aforementioned device 800 and device 900, the neural network training device 1000 shown in FIG. 10 includes a memory 1001, a processor 1002, a communication interface 1003, and a bus 1004. Among them, the memory 1001, the processor 1002, and the communication interface 1003 implement communication connections between each other through the bus 1004.
  • After the neural network has been searched by the neural network channel number search device shown in FIG. 8, the neural network can be trained by the neural network training device 1000 shown in FIG. 10, and the trained neural network can be used to execute the image processing method of the embodiments of the present application.
  • the device shown in FIG. 10 can obtain training data and the neural network to be trained from the outside through the communication interface 1003, and then the processor trains the neural network to be trained according to the training data.
  • It should be noted that although the device 800, the device 900, and the device 1000 only show a memory, a processor, and a communication interface, in a specific implementation process, those skilled in the art should understand that the device 800, the device 900, and the device 1000 may also include other devices necessary for normal operation. At the same time, according to specific needs, those skilled in the art should understand that the device 800, the device 900, and the device 1000 may also include hardware devices that implement other additional functions. In addition, those skilled in the art should understand that the device 800, the device 900, and the device 1000 may also include only the components necessary to implement the embodiments of the present application, and need not include all the components shown in FIGS. 8, 9, and 10.
  • FIG. 11 is a schematic structural block diagram of a neural network channel number search device provided by an embodiment of the present application, where the neural network channel number search device 1100 includes:
  • the first determining unit 1101 is configured to determine the number N of output channels of the convolutional layer, where N is a positive integer;
  • the dividing unit 1102 is configured to divide the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer that divides N, and n≥2;
  • the second determining unit 1103 is configured to determine n sets of weighting coefficients, where each set of weighting coefficients includes multiple weighting coefficients, and the multiple weighting coefficients have a one-to-one correspondence with multiple sub-feature tensors of the n sub-feature tensors;
  • the third determining unit 1104 is configured to determine the sub-feature tensor corresponding to the maximum value in each group of weighting coefficients in the n groups of weighting coefficients, so as to obtain the sub-feature tensor corresponding to the n maximum values;
  • the update unit 1105 is configured to update the number of output channels of the convolutional layer according to the sub-feature tensor corresponding to the n maximum values.
  • each group of weighting coefficients includes n weighting coefficients, and the n weighting coefficients are in one-to-one correspondence with the n sub-feature tensors.
  • Alternatively, each group of weighting coefficients includes m weighting coefficients, and the m weighting coefficients are in one-to-one correspondence with m sub-feature tensors among the n sub-feature tensors, where m is a positive integer less than n.
  • The third determining unit 1104 is further configured to generate n candidate feature tensors according to the n groups of weighting coefficients and multiple sub-feature tensors among the n sub-feature tensors, where one group of weighting coefficients corresponds to one candidate feature tensor.
  • The third determining unit 1104 is further configured to determine the sub-feature tensor with the largest weight among the multiple sub-feature tensors used to generate each candidate feature tensor, so as to obtain the n sub-feature tensors with the largest weights.
  • The update unit 1105 is specifically configured to determine the number k of mutually different sub-feature tensors among the sub-feature tensors corresponding to the n maximum values, where k is a positive integer less than or equal to n, and the number of output channels of the updated convolutional layer is kN/n.
  • FIG. 12 is a schematic structural block diagram of an image processing device according to an embodiment of the present application, where the image processing device 1200 includes:
  • the acquiring unit 1201 is configured to acquire the image to be processed
  • the classification unit 1202 is configured to classify the image to be processed according to a target neural network to obtain a classification result of the image to be processed, wherein the number of channels of the target neural network is determined by the device 1100.
  • the disclosed system, device, and method can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • The division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • If the function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A method and apparatus for searching the number of neural network channels, capable of enabling a differentiable search technique to solve the problem of searching the number of network channels and of reducing the computational complexity of a network while ensuring network performance. The method comprises: determining the number N of output channels of a convolutional layer, N being a positive integer (S301); dividing a feature tensor output by the convolutional layer into n sub-feature tensors, the number of channels of each sub-feature tensor being N/n, n being an integer that divides N, and n being greater than or equal to 2 (S302); determining n groups of weighting coefficients, each group of weighting coefficients comprising multiple weighting coefficients, the multiple weighting coefficients having a one-to-one correspondence to multiple sub-feature tensors among the n sub-feature tensors (S303); determining the sub-feature tensor corresponding to the maximum value in each group of the n groups of weighting coefficients to obtain the sub-feature tensors corresponding to the n maximum values (S304); and re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values (S305).

Description

Neural network channel number search method and device

Technical field

This application relates to the field of artificial intelligence, and more specifically, to a method and device for searching the number of neural network channels.

Background

Artificial intelligence (AI) is a theory, method, technology, and application system that uses digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.

With the rapid development of artificial intelligence technology, neural networks (for example, deep neural networks) have made great achievements in recent years in the processing and analysis of various media signals such as images, videos, and voice. A neural network with good performance often has a sophisticated network structure, which requires human experts with superb skills and rich experience to spend a lot of effort to construct. In order to construct neural networks better, a neural architecture search (NAS) method has been proposed to build neural networks: by automatically searching over neural network structures, a network structure with excellent performance can be obtained.

Neural network structure search techniques can be divided into different categories according to the search method. Differentiable search is one of the important NAS techniques, and it mainly consists of three stages: constructing a differentiable neural network search space, searching the network structure, and decoding the search result to obtain the final network structure. When differentiable search is applied to network structure search, what is currently searched is mainly the computation unit. A computation unit can be a single operation, such as convolution or pooling, or a block operation composed of multiple basic operations. Although differentiable search can search among different computation units, it does not support searching the number of channels of a single convolution, and therefore cannot meet the requirement when a network with a smaller amount of computation is desired.

Summary of the invention

The present application provides a neural network channel number search method and device, which enable differentiable search to be applied to the problem of searching the number of network channels and reduce the computational complexity of the network while ensuring network performance.

According to a first aspect, a method for searching the number of neural network channels is provided, including: determining the number N of output channels of a convolutional layer, where N is a positive integer; dividing the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer by which N is exactly divisible, and n ≥ 2; determining n groups of weighting coefficients, where each group of weighting coefficients includes multiple weighting coefficients, and the multiple weighting coefficients are in one-to-one correspondence with multiple sub-feature tensors among the n sub-feature tensors; determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.

When differentiable search is applied to neural network structure search, what is currently searched is mainly the computation unit. A computation unit can be a single operation, such as convolution or pooling, or a block operation composed of multiple basic operations. Although differentiable search can search among different computation units, it does not support searching the number of channels of a single convolution, and therefore cannot meet the requirement when a network with a smaller amount of computation is desired. The neural network channel number search method provided by the embodiments of the present application can realize the search of the number of neural network channels based on a differentiable search technique.

With reference to the first aspect, in a possible implementation manner, each group of weighting coefficients includes n weighting coefficients, and the n weighting coefficients are in one-to-one correspondence with the n sub-feature tensors.

With reference to the first aspect, in a possible implementation manner, each group of weighting coefficients includes m weighting coefficients, and the m weighting coefficients are in one-to-one correspondence with m sub-feature tensors among the n sub-feature tensors, where m is a positive integer smaller than n.

The embodiments of this application provide two possible implementations for determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients: the sub-feature tensor corresponding to each maximum value may be determined from all n sub-feature tensors, or from only some of the n sub-feature tensors.

With reference to the first aspect, in a possible implementation manner, before the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients is determined, the method further includes: generating n candidate feature tensors according to the n groups of weighting coefficients and multiple sub-feature tensors among the n sub-feature tensors, where one group of weighting coefficients corresponds to one candidate feature tensor.

With reference to the first aspect, in a possible implementation manner, determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients to obtain the sub-feature tensors corresponding to the n maximum values further includes: determining, among the multiple sub-feature tensors used to generate each candidate feature tensor, the sub-feature tensor with the largest weight, so as to obtain n sub-feature tensors with the largest weights.

In addition to the method provided in the first aspect, the embodiments of the present application also provide a way of determining the sub-feature tensor corresponding to the maximum value. That is, each candidate feature tensor can be calculated from a group of weighting coefficients and multiple sub-feature tensors among the n sub-feature tensors, where the multiple sub-feature tensors may be all n sub-feature tensors or only some of them; the sub-feature tensor with the largest weight in generating each candidate feature tensor is then taken as the sub-feature tensor corresponding to the maximum value.

With reference to the first aspect, in a possible implementation manner, re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values includes: determining the number k of mutually different sub-feature tensors among the sub-feature tensors corresponding to the n maximum values, where k is a positive integer less than or equal to n; the re-determined number of output channels of the convolutional layer is kN/n.

The number of output channels of the convolutional layer is re-determined according to the number of mutually different sub-feature tensors among the sub-feature tensors corresponding to the n maximum values. The re-determined number of channels is k/n of the original, so the number of neural network channels can be compressed, thereby reducing the computational complexity of the neural network.

According to a second aspect, an image processing method is provided, including: acquiring an image to be processed; and classifying the image to be processed according to a target neural network to obtain a classification result of the image to be processed. The determination of the number of channels of the target neural network includes: determining the number N of output channels of a convolutional layer, where N is a positive integer; dividing the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer by which N is exactly divisible, and n ≥ 2; determining n groups of weighting coefficients, where each group of weighting coefficients includes multiple weighting coefficients, and the multiple weighting coefficients are in one-to-one correspondence with multiple sub-feature tensors among the n sub-feature tensors; determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.

When the neural network searched out by the neural network channel number search method provided in the embodiments of the present application is used for image processing, the overall computational complexity of the neural network is reduced compared with a neural network whose number of channels has not been compressed.

With reference to the second aspect, in a possible implementation manner, each group of weighting coefficients includes n weighting coefficients, and the n weighting coefficients are in one-to-one correspondence with the n sub-feature tensors.

With reference to the second aspect, in a possible implementation manner, each group of weighting coefficients includes m weighting coefficients, and the m weighting coefficients are in one-to-one correspondence with m sub-feature tensors among the n sub-feature tensors, where m is a positive integer smaller than n.

With reference to the second aspect, in a possible implementation manner, before the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients is determined, the method further includes: generating n candidate feature tensors according to the n groups of weighting coefficients and multiple sub-feature tensors among the n sub-feature tensors, where one group of weighting coefficients corresponds to one candidate feature tensor.

With reference to the second aspect, in a possible implementation manner, determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients to obtain the sub-feature tensors corresponding to the n maximum values further includes: determining, among the multiple sub-feature tensors used to generate each candidate feature tensor, the sub-feature tensor with the largest weight, so as to obtain n sub-feature tensors with the largest weights.

With reference to the second aspect, in a possible implementation manner, re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values includes: determining the number k of mutually different sub-feature tensors among the sub-feature tensors corresponding to the n maximum values, where k is a positive integer less than or equal to n; the re-determined number of output channels of the convolutional layer is kN/n.

According to a third aspect, a neural network channel number search apparatus is provided, including: a memory configured to store a program; and a processor configured to execute the program stored in the memory. When the program stored in the memory is executed, the processor is configured to perform the following process: determining the number N of output channels of a convolutional layer, where N is a positive integer; dividing the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer by which N is exactly divisible, and n ≥ 2; determining n groups of weighting coefficients, where each group of weighting coefficients includes multiple weighting coefficients, and the multiple weighting coefficients are in one-to-one correspondence with multiple sub-feature tensors among the n sub-feature tensors; determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.

According to a fourth aspect, an image processing apparatus is provided, including: a memory configured to store a program; and a processor configured to execute the program stored in the memory. When the program stored in the memory is executed, the processor is configured to perform the following process: acquiring an image to be processed; and classifying the image to be processed according to a target neural network to obtain a classification result of the image to be processed. The determination of the number of channels of the target neural network includes: determining the number N of output channels of a convolutional layer, where N is a positive integer; dividing the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer by which N is exactly divisible, and n ≥ 2; determining n groups of weighting coefficients, where each group of weighting coefficients includes multiple weighting coefficients, and the multiple weighting coefficients are in one-to-one correspondence with multiple sub-feature tensors among the n sub-feature tensors; determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.

According to a fifth aspect, a computer-readable storage medium is provided. The computer-readable medium stores program code for execution by a device, and the program code includes instructions for executing the method in any one of the implementation manners of the first aspect or the second aspect.

According to a sixth aspect, a computer program product containing instructions is provided. When the computer program product runs on a computer, the computer is caused to execute the method in any one of the implementation manners of the first aspect or the second aspect.

According to a seventh aspect, a chip is provided. The chip includes a processor and a data interface, and the processor reads, through the data interface, instructions stored in a memory to execute the method in any one of the implementation manners of the first aspect or the second aspect.

Optionally, as a possible implementation manner, the chip may further include a memory in which instructions are stored, and the processor is configured to execute the instructions stored in the memory. When the instructions are executed, the processor is configured to execute the method in any one of the implementation manners of the first aspect or the second aspect.

Description of the drawings

Fig. 1 is a schematic structural diagram of a convolutional neural network provided by an embodiment of the present application;

Fig. 2 is a schematic block diagram of a differentiable neural network structure search method provided by an embodiment of the present application;

Fig. 3 is a schematic flowchart of a neural network channel number search method provided by an embodiment of the present application;

Fig. 4 is a schematic block diagram of a neural network channel number search method provided by an embodiment of the present application;

Fig. 5 is a schematic block diagram of another neural network channel number search method provided by an embodiment of the present application;

Fig. 6 is a schematic structural diagram of a super-resolution neural network provided by an embodiment of the present application;

Fig. 7 is a schematic flowchart of an image processing method provided by an embodiment of the present application;

Fig. 8 is a schematic diagram of the hardware structure of a neural network channel number search apparatus provided by an embodiment of the present application;

Fig. 9 is a schematic diagram of the hardware structure of an image processing apparatus provided by an embodiment of the present application;

Fig. 10 is a schematic diagram of the hardware structure of a neural network training apparatus according to an embodiment of the present application;

Fig. 11 is a schematic structural block diagram of a neural network channel number search apparatus provided by an embodiment of the present application;

Fig. 12 is a schematic structural block diagram of an image processing apparatus provided by an embodiment of the present application.

Detailed description of embodiments

The technical solutions in this application will be described below with reference to the accompanying drawings.

The neural network obtained by the neural network channel number search method provided by the embodiments of the present application may be a convolutional neural network (CNN), a deep convolutional neural network (DCNN), a recurrent neural network (RNN), or the like. Since the CNN is a very common neural network, the structure of the CNN is introduced below in conjunction with Figure 1.

The structure of the neural network specifically used by the image processing method of the embodiments of the present application may be as shown in Figure 1. In Figure 1, a convolutional neural network (CNN) 100 may include an input layer 110, a convolutional layer/pooling layer 120 (the pooling layer is optional), and a neural network layer 130. The input layer 110 can obtain an image to be processed and pass the obtained image to the convolutional layer/pooling layer 120 and the subsequent neural network layer 130 for processing, so that the processing result of the image can be obtained. The internal layer structure of the CNN 100 in Figure 1 is described in detail below.

Convolutional layer/pooling layer 120:

Convolutional layer:

As shown in Figure 1, the convolutional layer/pooling layer 120 may include, for example, layers 121-126. In one implementation, layer 121 is a convolutional layer, layer 122 is a pooling layer, layer 123 is a convolutional layer, layer 124 is a pooling layer, layer 125 is a convolutional layer, and layer 126 is a pooling layer; in another implementation, layers 121 and 122 are convolutional layers, layer 123 is a pooling layer, layers 124 and 125 are convolutional layers, and layer 126 is a pooling layer. That is, the output of a convolutional layer can be used as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.

The convolutional layer 121 is taken as an example below to introduce the internal working principle of a convolutional layer.

The convolutional layer 121 may include many convolution operators. A convolution operator is also called a kernel, and its role in image processing is equivalent to a filter that extracts specific information from the input image matrix. A convolution operator may essentially be a weight matrix, and this weight matrix is usually predefined. In the process of performing a convolution operation on an image, the weight matrix is usually moved along the horizontal direction of the input image one pixel at a time (or two pixels at a time, depending on the value of the stride), so as to extract specific features from the image. The size of the weight matrix should be related to the size of the image. It should be noted that the depth dimension of the weight matrix is the same as the depth dimension of the input image, and during the convolution operation the weight matrix extends over the entire depth of the input image. Therefore, convolution with a single weight matrix produces a convolution output with a single depth dimension. In most cases, however, a single weight matrix is not used; instead, multiple weight matrices of the same size (rows × columns), that is, multiple matrices of the same shape, are applied. The outputs of the weight matrices are stacked to form the depth dimension of the convolved image, where this dimension can be understood as being determined by the "multiple" mentioned above. Different weight matrices can be used to extract different features from the image: for example, one weight matrix is used to extract image edge information, another weight matrix is used to extract a specific color of the image, and yet another weight matrix is used to blur unwanted noise in the image. The multiple weight matrices have the same size (rows × columns), so the convolutional feature maps extracted by them also have the same size, and the extracted convolutional feature maps of the same size are then combined to form the output of the convolution operation.
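
As an illustration only, the following numpy sketch shows how applying several weight matrices (kernels) of the same size to one input yields one output channel per kernel; all array shapes and values are assumed for the example and are not taken from the patent.

```python
import numpy as np

def conv2d_valid(image, kernels, stride=1):
    """Each kernel (weight matrix) spans the full input depth and is slid over
    the image; stacking the per-kernel results gives one output channel per kernel."""
    c_in, h, w = image.shape
    n_k, _, kh, kw = kernels.shape                 # kernels: (num_kernels, c_in, kh, kw)
    oh = (h - kh) // stride + 1
    ow = (w - kw) // stride + 1
    out = np.zeros((n_k, oh, ow))
    for k in range(n_k):
        for i in range(oh):
            for j in range(ow):
                patch = image[:, i * stride:i * stride + kh, j * stride:j * stride + kw]
                out[k, i, j] = np.sum(patch * kernels[k])
    return out

img = np.random.rand(3, 8, 8)                      # 3-channel input
w = np.random.rand(16, 3, 3, 3)                    # 16 weight matrices
print(conv2d_valid(img, w).shape)                  # (16, 6, 6): 16 output channels
```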

In practical applications, the weight values in these weight matrices need to be obtained through a large amount of training. Each weight matrix formed by the trained weight values can extract information from the input image, so that the convolutional neural network 100 can make correct predictions.

When the convolutional neural network 100 has multiple convolutional layers, the initial convolutional layers (for example, layer 121) often extract more general features, which may also be called low-level features. As the depth of the convolutional neural network 100 increases, the features extracted by later convolutional layers (for example, layer 126) become more and more complex, such as high-level semantic features, and features with higher-level semantics are more suitable for the problem to be solved.

Pooling layer:

Since it is often necessary to reduce the number of training parameters, a pooling layer often needs to be introduced periodically after a convolutional layer. In layers 121-126 illustrated by 120 in Figure 1, this may be one convolutional layer followed by one pooling layer, or multiple convolutional layers followed by one or more pooling layers. In image processing, the sole purpose of the pooling layer is to reduce the spatial size of the image. The pooling layer may include an average pooling operator and/or a maximum pooling operator for sampling the input image to obtain an image of a smaller size. The average pooling operator calculates the average of the pixel values within a specific range of the image as the result of average pooling, and the maximum pooling operator takes the pixel with the largest value within a specific range as the result of maximum pooling. In addition, just as the size of the weight matrix in the convolutional layer should be related to the image size, the operators in the pooling layer should also be related to the image size. The size of the image output after processing by the pooling layer may be smaller than the size of the image input to the pooling layer, and each pixel in the image output by the pooling layer represents the average or maximum value of the corresponding sub-region of the image input to the pooling layer.
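
A minimal numpy sketch of non-overlapping max and average pooling is given below for illustration; the 2×2 window, shapes, and values are assumptions, not taken from the patent.

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping pooling: each output pixel is the maximum (or average)
    of the corresponding size x size sub-region of the input feature map."""
    c, h, w = x.shape
    x = x[:, :h - h % size, :w - w % size]          # drop any ragged border
    x = x.reshape(c, x.shape[1] // size, size, x.shape[2] // size, size)
    return x.max(axis=(2, 4)) if mode == "max" else x.mean(axis=(2, 4))

feat = np.random.rand(16, 6, 6)
print(pool2d(feat, size=2, mode="max").shape)       # (16, 3, 3): smaller spatial size, same channels
```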

Neural network layer 130:

After processing by the convolutional layer/pooling layer 120, the convolutional neural network 100 is not yet able to output the required output information. As mentioned above, the convolutional layer/pooling layer 120 only extracts features and reduces the parameters brought by the input image. However, in order to generate the final output information (the required class information or other related information), the convolutional neural network 100 needs to use the neural network layer 130 to generate the output of one or a group of required classes. Therefore, the neural network layer 130 may include multiple hidden layers (131, 132 to 13n as shown in Figure 1) and an output layer 140. The parameters contained in the multiple hidden layers can be obtained by pre-training with training data related to a specific task type; for example, the task type may include image recognition, image classification, image super-resolution reconstruction, and so on.

After the multiple hidden layers in the neural network layer 130, the last layer of the entire convolutional neural network 100 is the output layer 140. The output layer 140 has a loss function similar to the categorical cross entropy, which is specifically used to calculate the prediction error. Once the forward propagation of the entire convolutional neural network 100 (the propagation from 110 to 140 in Figure 1 is forward propagation) is completed, back propagation (the propagation from 140 to 110 in Figure 1 is back propagation) starts to update the weight values and biases of the aforementioned layers, so as to reduce the loss of the convolutional neural network 100 and the error between the result output by the convolutional neural network 100 through the output layer and the ideal result.

The neural network shown in Figure 1 can be obtained by a neural network structure search method. There are many categories of neural network structure search methods, among which differentiable search is one of the important techniques; it mainly consists of three stages: constructing a differentiable neural network search space, searching the network structure, and decoding the search result to obtain the final network structure. The differentiable neural network structure search technique is briefly introduced below in conjunction with Figure 2.

In the first step, a differentiable neural network search space is constructed. As shown in Figure 2, the candidate network computation units 1, 2, 3, 4, and 5 are deployed in one network and combined by weighted summation. The weighting coefficients are obtained through the gumbel_softmax conversion function, which converts the weighting coefficients into a vector whose elements lie in [0, 1] and sum to 1. The output distribution is controlled by a temperature parameter: when the temperature is very low, the output tends toward an average distribution, and when the temperature is very high, the output tends toward a one-hot distribution, that is, only one element tends to 1 and the other elements tend to 0. In Figure 2, a1, a2, a3, a4, and a5 are weighting coefficients, where a1 is the weighting coefficient of computation unit 1, a2 is the weighting coefficient of computation unit 2, and so on.

The second step is network structure search. For the network constructed in the first step, the network parameters of the computation units and the weighting coefficients are trained alternately, where the weighting coefficients represent the structural parameters of the network structure. Specifically, the input data is passed into computation units 1 to 5 respectively to train the network parameters of the computation units. The data processed by computation units 1 to 5 is then multiplied by the weighting coefficients a1 to a5 respectively and summed to obtain output 1. The data of output 1 is then passed into computation units 1' to 5' to train the network parameters of these computation units, and the data processed by computation units 1' to 5' is multiplied by the weighting coefficients b1 to b5 respectively and summed to obtain output 2.

The third step is to decode the search result. When the network training in the second step is completed, according to the final weighting coefficients, for each weighted-sum term the computation unit with the largest weighting coefficient is retained and the other computation units are deleted, and the resulting network structure is taken as the search result. Specifically, according to the largest of the weighting coefficients a1 to a5, the corresponding computation unit is retained and the other computation units are deleted; according to the largest of the weighting coefficients b1 to b5, the corresponding computation unit is retained and the other computation units are deleted.
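
The three stages above can be illustrated with the following numpy sketch; the candidate computation units, the use of a plain softmax in place of gumbel_softmax, and all values are simplifying assumptions made only for this example.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical candidate computation units (stand-ins for convolution, pooling, ...).
candidates = [lambda x: 0.5 * x, lambda x: x ** 2, lambda x: np.tanh(x)]

def weighted_sum_cell(x, arch_logits):
    """Stages 1 and 2: mix all candidate outputs with coefficients in [0, 1]
    that sum to 1, so the search space stays differentiable in arch_logits."""
    a = softmax(arch_logits)                        # a1, a2, a3
    return sum(ai * op(x) for ai, op in zip(a, candidates))

x = np.random.rand(4)
arch_logits = np.array([0.2, 1.3, -0.5])            # structure parameters being trained
y = weighted_sum_cell(x, arch_logits)

# Stage 3: decoding keeps only the unit with the largest weighting coefficient.
kept_unit = int(np.argmax(softmax(arch_logits)))     # here index 1
```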

When differentiable search is applied to neural network structure search, what is currently searched is mainly the computation unit. A computation unit can be a single operation, such as convolution or pooling, or a block operation composed of multiple basic operations. Although differentiable search can search among different computation units, it does not support searching the number of channels of a single convolution, and therefore cannot meet the requirement when a network with a smaller amount of computation is desired. The neural network channel number search method provided by the embodiments of the present application can realize the search of the number of neural network channels based on a differentiable search technique.

Fig. 3 shows a schematic flowchart of the neural network channel number search method provided by the present application. The method shown in Figure 3 can be executed by a neural network structure search apparatus, which may be a computer, a server, a cloud device, or another device whose computing power is sufficient to perform the neural network structure search. The method shown in Figure 3 includes steps 301 to 305, which are described in detail below.

S301: Determine the number N of output channels of a convolutional layer, where N is a positive integer.

The number N of output channels of the convolutional layer may be the maximum number of output channels of the convolutional layer, and the maximum number of output channels may be a value set according to the specific embodiment.

S302: Divide the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer by which N is exactly divisible, and n ≥ 2.

The segmentation of the feature tensor output by the convolutional layer is a segmentation in the channel dimension. For example, an image is a three-dimensional tensor, with the length and width each being one dimension and the third dimension being the number of channels. Splitting the feature tensor in the channel dimension allows each sub-feature tensor to take an equal share of the number of channels of the convolutional layer.
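
For illustration, a channel-dimension split might look like the following numpy sketch, where N = 64, n = 4, and the spatial size are assumed values:

```python
import numpy as np

N, n = 64, 4                                   # assumed values: N output channels, n sub-tensors
T = np.random.rand(N, 32, 32)                  # feature tensor output by the convolutional layer
subs = np.split(T, n, axis=0)                  # split along the channel dimension
print(len(subs), subs[0].shape)                # 4 (16, 32, 32): each sub-tensor has N/n channels
```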

S303: Determine n groups of weighting coefficients, where each group of weighting coefficients includes multiple weighting coefficients, and the multiple weighting coefficients are in one-to-one correspondence with multiple sub-feature tensors among the n sub-feature tensors.

Optionally, each group of weighting coefficients includes n weighting coefficients, and the n weighting coefficients are in one-to-one correspondence with the n sub-feature tensors. For example, four groups of weighting coefficients are determined, where each group includes four weighting coefficients a1, a2, a3, and a4, and these four weighting coefficients are in one-to-one correspondence with the four sub-feature tensors T1, T2, T3, and T4, that is, a1 corresponds to T1, a2 corresponds to T2, a3 corresponds to T3, and a4 corresponds to T4. It should be understood that the groups of weighting coefficients may be different from one another.

Optionally, each group of weighting coefficients includes m weighting coefficients, and the m weighting coefficients are in one-to-one correspondence with m sub-feature tensors among the n sub-feature tensors, where m is a positive integer smaller than n. For example, four groups of weighting coefficients are determined, where each group includes two weighting coefficients a1 and a2, and these two weighting coefficients are in one-to-one correspondence with any two of the four sub-feature tensors T1, T2, T3, and T4; for instance, in the first group a1 corresponds to T1 and a2 corresponds to T2, and in the second group a1 corresponds to T1 and a2 corresponds to T3, and so on. It should be understood that the groups of weighting coefficients may be different from one another.

The embodiments of this application provide two possible implementations for determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients: the sub-feature tensor corresponding to each maximum value may be determined from all n sub-feature tensors, or from only some of the n sub-feature tensors.

S304: Determine the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values.

Continuing the example in S303: in the first group the largest weighting coefficient is a1, so the sub-feature tensor corresponding to the maximum value is T1; in the second group the largest weighting coefficient is also a1, so the sub-feature tensor corresponding to the maximum value is also T1; in the third group the largest weighting coefficient is a2, so the sub-feature tensor corresponding to the maximum value is T2; and in the fourth group the largest weighting coefficient is a3, so the sub-feature tensor corresponding to the maximum value is T3. The sub-feature tensors corresponding to the four maximum values thus obtained are T1, T1, T2, and T3, respectively.

Optionally, before the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients is determined, n candidate feature tensors may be generated according to the n groups of weighting coefficients and multiple sub-feature tensors among the n sub-feature tensors, where one group of weighting coefficients corresponds to one candidate feature tensor. Then, among the multiple sub-feature tensors used to generate each candidate feature tensor, the sub-feature tensor with the largest weight is determined, so as to obtain n sub-feature tensors with the largest weights. Continuing the example in S303 and taking one group of weighting coefficients as an illustration: a candidate feature tensor TC1 is generated from the four weighting coefficients a1, a2, a3, a4 of the group and the four sub-feature tensors T1, T2, T3, T4, that is, TC1 = a1×T1 + a2×T2 + a3×T3 + a4×T4, and the sub-feature tensor with the largest weight in generating TC1, for example T1, can then be determined. In this way, n sub-feature tensors with the largest weights can be obtained.
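
A possible numpy sketch of generating one candidate feature tensor and identifying its max-weight sub-feature tensor is shown below; the coefficient values and shapes are hypothetical.

```python
import numpy as np

subs = [np.random.rand(16, 32, 32) for _ in range(4)]    # T1..T4 (hypothetical shapes)
a = np.array([0.55, 0.20, 0.15, 0.10])                   # one group a1..a4, summing to 1

TC1 = sum(ai * Ti for ai, Ti in zip(a, subs))             # candidate feature tensor
largest = int(np.argmax(a))                               # 0, i.e. T1 has the largest weight
```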

S305: Re-determine the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.

Specifically, the number k of mutually different sub-feature tensors among the sub-feature tensors corresponding to the n maximum values may be determined, where k is a positive integer less than or equal to n, and the re-determined number of output channels of the convolutional layer is kN/n.

Continuing the example in S304, the sub-feature tensors corresponding to the four maximum values are T1, T1, T2, and T3; the mutually different sub-feature tensors among them are T1, T2, and T3, whose number is 3. The re-determined number of output channels of the convolutional layer is therefore 3N/4. In this way, the number of neural network channels can be compressed, thereby reducing the computational complexity of the neural network.
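
The channel re-determination described above can be sketched as follows; the indices and values are for illustration only.

```python
import numpy as np

N, n = 64, 4
# Index of the max-weight sub-tensor for each of the n groups,
# e.g. T1, T1, T2, T3 in the example above (hypothetical indices).
max_index_per_group = [0, 0, 1, 2]
k = len(set(max_index_per_group))              # number of mutually different sub-tensors
new_channels = k * N // n                      # re-determined number of output channels
print(k, new_channels)                         # 3 48, i.e. 3N/4
```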

In order to better understand the neural network channel number search method provided by the embodiments of the present application, the overall process of the method is introduced below in conjunction with Figure 4. Taking convolutional layer 1 to convolutional layer 2 as an example, the method of constructing a search space in which the number of convolution channels is searchable in the embodiments of the present application is as follows.

First, the maximum number N of output channels of the convolutional layer is determined; this maximum number of output channels may be a value set according to the specific embodiment. The maximum value of the searchable number of channels is determined by the maximum number N of output channels of the convolutional layer.

Convolutional layer 1 outputs a feature tensor T, and the feature tensor T is split in the channel dimension into four sub-feature tensors, T0, T1, T2, and T3, as shown in Figure 4. In this way, the number of channels of each sub-feature tensor is N/4.

In Figure 4, the four sub-feature tensors are combined by weighted summation to generate one candidate feature tensor; repeating this four times generates four candidate feature tensors TC0, TC1, TC2, and TC3, where the weighting coefficients of the weighted summation are obtained after processing by the gumbel_softmax conversion function. The gumbel_softmax conversion function is defined as follows:

$$y_i = \frac{\exp\big((\log \pi_i + g_i)/\tau\big)}{\sum_{j=1}^{n} \exp\big((\log \pi_j + g_j)/\tau\big)}, \qquad g_j = -\log(-\log u_j),\ u_j \sim \mathrm{Uniform}(0,1)$$

Here, g is a randomly generated variable, τ is a set value, π is the input, and y is the output, that is, the calculated weighting coefficients. For example, from a group of inputs π01, π02, π03, π04, a corresponding group of weighting coefficients a01, a02, a03, a04 can be calculated, where a01, a02, a03, and a04 all lie in [0, 1] and sum to 1. Similarly, four groups of weighting coefficients can be obtained.
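
A sketch of the commonly used Gumbel-softmax form in numpy is given below for illustration; the exact parameterization of the temperature in the formula above may differ, so this should be read as an assumption rather than the patented definition.

```python
import numpy as np

def gumbel_softmax(pi, tau=1.0):
    """Convert the inputs pi into weighting coefficients that lie in [0, 1]
    and sum to 1, using Gumbel noise g and temperature tau."""
    u = np.random.uniform(1e-9, 1.0, size=len(pi))
    g = -np.log(-np.log(u))                      # Gumbel-distributed noise
    logits = (np.log(pi) + g) / tau
    e = np.exp(logits - logits.max())            # numerically stable softmax
    return e / e.sum()

pi = np.array([0.1, 0.2, 0.3, 0.4])              # one group of inputs pi01..pi04
a = gumbel_softmax(pi)                           # a01..a04, in [0, 1], summing to 1
print(a, a.sum())
```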

For the four groups of inputs, after processing by the gumbel_softmax conversion function, the weighting coefficients are as shown in Table 1.

Table 1

a00   a10   a20   a30
a01   a11   a21   a31
a02   a12   a22   a32
a03   a13   a23   a33

According to the four groups of weighting coefficients in Table 1 and the four sub-feature tensors, four candidate feature tensors TC0, TC1, TC2, and TC3 can be obtained, where TC0 = a00×T0 + a01×T1 + a02×T2 + a03×T3, TC1 = a10×T0 + a11×T1 + a12×T2 + a13×T3, TC2 = a20×T0 + a21×T1 + a22×T2 + a23×T3, and TC3 = a30×T0 + a31×T1 + a32×T2 + a33×T3.

The four candidate feature tensors TC0, TC1, TC2, and TC3 are spliced into one feature tensor Tout as the output of convolutional layer 1, whose number of channels is still N.
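
For illustration, the forward pass through this search space (split, weighted sums, splicing) could be sketched as follows; all shapes and coefficient values are assumptions.

```python
import numpy as np

N, n = 64, 4
T = np.random.rand(N, 32, 32)                          # output of convolutional layer 1
subs = np.split(T, n, axis=0)                          # T0..T3, N/4 channels each
A = np.random.dirichlet(np.ones(n), size=n)            # 4 groups of coefficients, each row sums to 1

candidates = [sum(A[i, j] * subs[j] for j in range(n)) # TC0..TC3
              for i in range(n)]
Tout = np.concatenate(candidates, axis=0)              # spliced candidate tensors
print(Tout.shape)                                      # (64, 32, 32): channel count is still N
```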

The feature tensor Tout is input into the next convolutional layer, convolutional layer 2.

After the search is completed, the sub-feature tensor that contributes the most to each candidate feature tensor can be obtained. For example, for TC0, a00 > a01 > a02 > a03, so the sub-feature tensor that contributes the most to TC0 is T0; for TC1, a10 > a11 > a12 > a13, so the sub-feature tensor that contributes the most to TC1 is T0; for TC2, a23 > a20 > a21 > a22, so the sub-feature tensor that contributes the most to TC2 is T3; for TC3, a32 > a30 > a31 > a33, so the sub-feature tensor that contributes the most to TC3 is T2. Since the sub-feature tensors that contribute the most to TC0 and TC1 are both T0, only T0, T2, and T3 need to be retained among the four sub-feature tensors T0, T1, T2, and T3, so only 3N/4 channels are needed, and the number of channels can thus be compressed.

The actual number of channels of the searched network structure in convolutional layer 1 is therefore only 3/4 of the original number of channels; with the number of channels reduced, the overall computational complexity of the network is reduced.

Optionally, Figure 5 shows another way of generating the candidate feature tensors. As shown in Figure 5, the sub-feature tensor T0 is used directly as the candidate feature tensor TC0; the weighted sum of sub-feature tensors T0 and T1 generates the candidate feature tensor TC1 = a10×T0 + a11×T1; the weighted sum of T0 and T2 generates TC2 = a20×T0 + a21×T2; and the weighted sum of T0 and T3 generates TC3 = a30×T0 + a31×T3. The sub-feature tensor that contributes the most to TC0 is obviously T0; for TC1, a10 > a11, so the sub-feature tensor that contributes the most to TC1 is T0; for TC2, a20 > a21, so the sub-feature tensor that contributes the most to TC2 is T0; for TC3, a31 > a30, so the sub-feature tensor that contributes the most to TC3 is T3. Since the sub-feature tensors that contribute the most to TC0, TC1, and TC2 are all T0, only T0 and T3 need to be retained among the four sub-feature tensors T0, T1, T2, and T3, so only N/2 channels are needed, and the number of channels can thus be compressed.

The combination method for generating candidate feature tensors shown in Figure 5 is conducive to searching out a network structure with an even smaller number of channels.
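
A sketch of the pairwise combination of Figure 5 and its decoding is given below, with hypothetical coefficient values chosen only to reproduce the example above.

```python
import numpy as np

subs = [np.random.rand(16, 8, 8) for _ in range(4)]           # T0..T3 (hypothetical shapes)

# Pairwise scheme of Figure 5: TC0 is T0 itself; TCi mixes T0 with Ti only.
pair_weights = {1: (0.6, 0.4), 2: (0.7, 0.3), 3: (0.2, 0.8)}  # (ai0, ai1), hypothetical values
TC = [subs[0]]
for i in (1, 2, 3):
    ai0, ai1 = pair_weights[i]
    TC.append(ai0 * subs[0] + ai1 * subs[i])

# Decoding: keep the dominant sub-tensor of each candidate feature tensor.
kept = {0} | {0 if pair_weights[i][0] > pair_weights[i][1] else i for i in (1, 2, 3)}
print(sorted(kept))                            # [0, 3]: only T0 and T3 are retained, i.e. N/2 channels
```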

Fig. 6 is a schematic diagram of a super-resolution neural network structure searched out by the neural network channel number search method provided by an embodiment of the present application. Its performance is comparable to that of a super-resolution neural network structure searched out without using the neural network channel number search method provided by the embodiments of the present application. In the searched super-resolution network structure, the number of channels of convolutional layer 0 is reduced by 50%, the number of channels of convolutional layer 1 is reduced by 50%, the number of channels of convolutional layer 2 is reduced by 0%, the number of channels of convolutional layer 3 is reduced by 0%, and the number of channels of convolutional layer 4 is reduced by 50%. After the number of channels is compressed, the overall computational complexity of the network is reduced by 37%.

The subjective effect of an image processed by the neural network structure searched out by the method provided in the embodiments of the present application is comparable to that of the original image, and is better than the effect of processing the image with the interpolation method. In terms of detail, the image processed by the searched neural network structure shows clearer details than the original image.

Fig. 7 is a schematic flowchart of an image processing method according to an embodiment of the present application. It should be understood that the definitions, explanations, and extensions of the relevant content of the method shown in Figure 3 above also apply to the method shown in Figure 7, and repeated descriptions are appropriately omitted below. The method shown in Figure 7 can be applied to a terminal device and includes:

S701: Acquire an image to be processed.

S702: Classify the image to be processed according to a target neural network to obtain a classification result of the image to be processed.

The determination of the number of channels of the target neural network includes: determining the number N of output channels of a convolutional layer, where N is a positive integer; dividing the feature tensor output by the convolutional layer into n sub-feature tensors, where the number of channels of each sub-feature tensor is N/n, n is an integer by which N is exactly divisible, and n ≥ 2; determining n groups of weighting coefficients, where each group of weighting coefficients includes multiple weighting coefficients, and the multiple weighting coefficients are in one-to-one correspondence with multiple sub-feature tensors among the n sub-feature tensors; determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and re-determining the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.

Fig. 8 is a schematic diagram of the hardware structure of a neural network channel number search apparatus provided by an embodiment of the present application. The neural network channel number search apparatus 800 shown in Figure 8 (the apparatus 800 may specifically be a computer device) includes a memory 801, a processor 802, a communication interface 803, and a bus 804. The memory 801, the processor 802, and the communication interface 803 communicate with one another through the bus 804.

The memory 801 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 801 may store a program; when the program stored in the memory 801 is executed by the processor 802, the processor 802 is configured to perform the steps of the neural network channel number search method in the embodiments of this application.

The processor 802 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, and is configured to execute a related program to implement the neural network channel number search method in the method embodiments of this application.

The processor 802 may alternatively be an integrated circuit chip with a signal processing capability. In an implementation process, the steps of the neural network channel number search method of this application may be completed by an integrated logic circuit of hardware in the processor 802 or by instructions in the form of software.

The processor 802 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the methods disclosed in the embodiments of this application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 801; the processor 802 reads the information in the memory 801 and, in combination with its hardware, performs the functions of the units included in the neural network channel number search apparatus, or performs the neural network channel number search method in the method embodiments of this application.

The communication interface 803 uses a transceiver apparatus, for example but not limited to a transceiver, to implement communication between the apparatus 800 and other devices or a communication network. For example, information about the target neural network to be determined and the training data required in the process of determining the target neural network may be obtained through the communication interface 803.

The bus 804 may include a path for transferring information between the components of the apparatus 800 (for example, the memory 801, the processor 802, and the communication interface 803).

FIG. 9 is a schematic diagram of the hardware structure of an image processing apparatus according to an embodiment of this application. The image processing apparatus 900 shown in FIG. 9 includes a memory 901, a processor 902, a communication interface 903, and a bus 904. The memory 901, the processor 902, and the communication interface 903 are communicatively connected to each other through the bus 904.

The memory 901 may be a ROM, a static storage device, or a RAM. The memory 901 may store a program; when the program stored in the memory 901 is executed by the processor 902, the processor 902 and the communication interface 903 are configured to perform the steps of the image processing method in the embodiments of this application.

The processor 902 may be a general-purpose CPU, a microprocessor, an ASIC, a GPU, or one or more integrated circuits, and is configured to execute a related program to implement the functions of the units in the image processing apparatus of the embodiments of this application, or to perform the image processing method in the method embodiments of this application.

The processor 902 may alternatively be an integrated circuit chip with a signal processing capability. In an implementation process, the steps of the image processing method of the embodiments of this application may be completed by an integrated logic circuit of hardware in the processor 902 or by instructions in the form of software.

The processor 902 may also be a general-purpose processor, a DSP, an ASIC, an FPGA or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the methods disclosed in the embodiments of this application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 901; the processor 902 reads the information in the memory 901 and, in combination with its hardware, performs the functions of the units included in the image processing apparatus of the embodiments of this application, or performs the image processing method in the method embodiments of this application.

The communication interface 903 uses a transceiver apparatus, for example but not limited to a transceiver, to implement communication between the apparatus 900 and other devices or a communication network. For example, the image to be processed may be obtained through the communication interface 903.

The bus 904 may include a path for transferring information between the components of the apparatus 900 (for example, the memory 901, the processor 902, and the communication interface 903).

FIG. 10 is a schematic diagram of the hardware structure of a neural network training apparatus according to an embodiment of this application. Similar to the apparatus 800 and the apparatus 900 described above, the neural network training apparatus 1000 shown in FIG. 10 includes a memory 1001, a processor 1002, a communication interface 1003, and a bus 1004. The memory 1001, the processor 1002, and the communication interface 1003 are communicatively connected to each other through the bus 1004.

After a neural network has been obtained through the search performed by the neural network channel number search apparatus shown in FIG. 8, the neural network may be trained by the neural network training apparatus 1000 shown in FIG. 10, and the trained neural network can then be used to perform the image processing method of the embodiments of this application.

Specifically, the apparatus shown in FIG. 10 may obtain the training data and the neural network to be trained from the outside through the communication interface 1003, and the processor then trains the neural network to be trained according to the training data.
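A minimal sketch of what the apparatus 1000 could do once it has received the searched network and the training data is shown below. The supervised image task, cross-entropy loss, optimizer, and PyTorch usage are assumptions introduced for illustration; the embodiment does not specify the training procedure.

```python
import torch
from torch import nn

def train(model: nn.Module, loader, epochs: int = 10, lr: float = 1e-3):
    """Train the searched network on data received through interface 1003 (sketch)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in loader:        # training data obtained from outside
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model
```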

It should be noted that although the apparatus 800, the apparatus 900, and the apparatus 1000 show only a memory, a processor, and a communication interface, in a specific implementation process a person skilled in the art should understand that the apparatus 800, the apparatus 900, and the apparatus 1000 may further include other components necessary for normal operation. In addition, depending on specific requirements, a person skilled in the art should understand that the apparatus 800, the apparatus 900, and the apparatus 1000 may further include hardware components implementing other additional functions. Furthermore, a person skilled in the art should understand that the apparatus 800, the apparatus 900, and the apparatus 1000 may alternatively include only the components necessary for implementing the embodiments of this application, and do not need to include all the components shown in FIG. 8, FIG. 9, and FIG. 10.

FIG. 11 is a schematic structural block diagram of a neural network channel number search apparatus according to an embodiment of this application. The neural network channel number search apparatus 1100 includes:

a first determining unit 1101, configured to determine the number N of output channels of a convolutional layer, where N is a positive integer;

a dividing unit 1102, configured to divide the feature tensor output by the convolutional layer into n sub-feature tensors, where each sub-feature tensor has N/n channels, n is an integer by which N is exactly divisible, and n ≥ 2;

a second determining unit 1103, configured to determine n groups of weighting coefficients, where each group of weighting coefficients includes a plurality of weighting coefficients in one-to-one correspondence with a plurality of the n sub-feature tensors;

a third determining unit 1104, configured to determine the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and

an update unit 1105, configured to update the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.

In a feasible implementation, each group of weighting coefficients includes n weighting coefficients, and the n weighting coefficients are in one-to-one correspondence with the n sub-feature tensors.

In a feasible implementation, each group of weighting coefficients includes m weighting coefficients, and the m weighting coefficients are in one-to-one correspondence with m of the n sub-feature tensors, where m is a positive integer less than n.

In a feasible implementation, the third determining unit 1104 is further configured to generate n candidate feature tensors according to the n groups of weighting coefficients and a plurality of the n sub-feature tensors, where one group of weighting coefficients corresponds to one candidate feature tensor.

In a feasible implementation, the third determining unit 1104 is further configured to determine, among the plurality of sub-feature tensors used to generate each candidate feature tensor, the sub-feature tensor with the largest weight, to obtain n sub-feature tensors with the largest weights.

In a feasible implementation, the update unit 1105 is specifically configured to determine the number k of mutually different sub-feature tensors among the sub-feature tensors corresponding to the n maximum values, where k is a positive integer less than or equal to n, and the updated number of output channels of the convolutional layer is kN/n, as illustrated by the sketch below.
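A purely hypothetical numerical trace of this update rule (the values below are not from the embodiment) is as follows:

```python
# Suppose n = 4 groups of weighting coefficients and N = 64 original output channels,
# and the per-group argmax picks the sub-feature tensors with indices [0, 0, 2, 3].
winners = [0, 0, 2, 3]
n, N = 4, 64
k = len(set(winners))        # 3 mutually different sub-feature tensors
updated_channels = k * N // n
print(updated_channels)      # 48 output channels instead of the original 64
```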

FIG. 12 is a schematic structural block diagram of an image processing apparatus according to an embodiment of this application. The image processing apparatus 1200 includes:

an obtaining unit 1201, configured to obtain an image to be processed; and

a classification unit 1202, configured to classify the image to be processed according to a target neural network to obtain a classification result of the image to be processed, where the number of channels of the target neural network is determined by the apparatus 1100.
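Purely as an illustrative usage sketch of the classification step in unit 1202, the following assumes PyTorch, a `target_net` object whose channel counts were already determined by the apparatus 1100, and a preprocessed CHW image tensor; none of these details are specified by the embodiment.

```python
import torch

def classify(target_net: torch.nn.Module, image: torch.Tensor) -> int:
    """Return the index of the most probable class for a single CHW image."""
    target_net.eval()
    with torch.no_grad():
        logits = target_net(image.unsqueeze(0))  # add a batch dimension
    return int(logits.argmax(dim=1).item())
```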

A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the specific application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each specific application, but such an implementation should not be considered to go beyond the scope of this application.

A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other manners. For example, the described apparatus embodiments are merely illustrative. For example, the division of the units is merely a logical function division; in actual implementation there may be other divisions, for example, a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings, direct couplings, or communication connections may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be in electrical, mechanical, or other forms.

The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual requirements to achieve the objectives of the solutions of the embodiments.

In addition, the functional units in the embodiments of this application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.

When the functions are implemented in the form of a software functional unit and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of this application essentially, or the part contributing to the prior art, or a part of the technical solutions may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for instructing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily figured out by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (16)

1. A neural network channel number search method, comprising: determining the number N of output channels of a convolutional layer, where N is a positive integer; dividing the feature tensor output by the convolutional layer into n sub-feature tensors, where each sub-feature tensor has N/n channels, n is an integer by which N is exactly divisible, and n ≥ 2; determining n groups of weighting coefficients, where each group of weighting coefficients includes a plurality of weighting coefficients in one-to-one correspondence with a plurality of the n sub-feature tensors; determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and updating the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.

2. The method according to claim 1, wherein each group of weighting coefficients includes n weighting coefficients, and the n weighting coefficients are in one-to-one correspondence with the n sub-feature tensors.

3. The method according to claim 1, wherein each group of weighting coefficients includes m weighting coefficients, and the m weighting coefficients are in one-to-one correspondence with m of the n sub-feature tensors, where m is a positive integer less than n.

4. The method according to any one of claims 1 to 3, wherein before the determining of the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, the method further comprises: generating n candidate feature tensors according to the n groups of weighting coefficients and a plurality of the n sub-feature tensors, where one group of weighting coefficients corresponds to one candidate feature tensor.

5. The method according to claim 4, wherein the determining of the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values, further comprises: determining, among the plurality of sub-feature tensors used to generate each candidate feature tensor, the sub-feature tensor with the largest weight, to obtain n sub-feature tensors with the largest weights.

6. The method according to any one of claims 1 to 5, wherein the updating of the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values comprises: determining the number k of mutually different sub-feature tensors among the sub-feature tensors corresponding to the n maximum values, where k is a positive integer less than or equal to n; and correspondingly, the updated number of output channels of the convolutional layer is kN/n.

7. An image processing method, comprising: obtaining an image to be processed; and classifying the image to be processed according to a target neural network to obtain a classification result of the image to be processed, wherein the number of channels of the target neural network is determined by: determining the number N of output channels of a convolutional layer, where N is a positive integer; dividing the feature tensor output by the convolutional layer into n sub-feature tensors, where each sub-feature tensor has N/n channels, n is an integer by which N is exactly divisible, and n ≥ 2; determining n groups of weighting coefficients, where each group of weighting coefficients includes a plurality of weighting coefficients in one-to-one correspondence with a plurality of the n sub-feature tensors; determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and updating the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.

8. The method according to claim 7, wherein each group of weighting coefficients includes n weighting coefficients, and the n weighting coefficients are in one-to-one correspondence with the n sub-feature tensors.

9. The method according to claim 7, wherein each group of weighting coefficients includes m weighting coefficients, and the m weighting coefficients are in one-to-one correspondence with m of the n sub-feature tensors, where m is a positive integer less than n.

10. The method according to any one of claims 7 to 9, wherein before the determining of the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, the method further comprises: generating n candidate feature tensors according to the n groups of weighting coefficients and a plurality of the n sub-feature tensors, where one group of weighting coefficients corresponds to one candidate feature tensor.

11. The method according to claim 10, wherein the determining of the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values, further comprises: determining, among the plurality of sub-feature tensors used to generate each candidate feature tensor, the sub-feature tensor with the largest weight, to obtain n sub-feature tensors with the largest weights.

12. The method according to any one of claims 7 to 11, wherein the updating of the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values comprises: determining the number k of mutually different sub-feature tensors among the sub-feature tensors corresponding to the n maximum values, where k is a positive integer less than or equal to n; and correspondingly, the updated number of output channels of the convolutional layer is kN/n.

13. A neural network channel number search apparatus, comprising: a memory configured to store a program; and a processor configured to execute the program stored in the memory, wherein when the program stored in the memory is executed, the processor is configured to: determine the number N of output channels of a convolutional layer, where N is a positive integer; divide the feature tensor output by the convolutional layer into n sub-feature tensors, where each sub-feature tensor has N/n channels, n is an integer by which N is exactly divisible, and n ≥ 2; determine n groups of weighting coefficients, where each group of weighting coefficients includes a plurality of weighting coefficients in one-to-one correspondence with a plurality of the n sub-feature tensors; determine the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and update the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.

14. An image processing apparatus, comprising: a memory configured to store a program; and a processor configured to execute the program stored in the memory, wherein when the program stored in the memory is executed, the processor is configured to: obtain an image to be processed; and classify the image to be processed according to a target neural network to obtain a classification result of the image to be processed, wherein the number of channels of the target neural network is determined by: determining the number N of output channels of a convolutional layer, where N is a positive integer; dividing the feature tensor output by the convolutional layer into n sub-feature tensors, where each sub-feature tensor has N/n channels, n is an integer by which N is exactly divisible, and n ≥ 2; determining n groups of weighting coefficients, where each group of weighting coefficients includes a plurality of weighting coefficients in one-to-one correspondence with a plurality of the n sub-feature tensors; determining the sub-feature tensor corresponding to the maximum value in each of the n groups of weighting coefficients, to obtain the sub-feature tensors corresponding to the n maximum values; and updating the number of output channels of the convolutional layer according to the sub-feature tensors corresponding to the n maximum values.

15. A computer-readable storage medium, wherein the computer-readable medium stores program code to be executed by a device, and the program code comprises instructions for performing the method according to any one of claims 1 to 6 or 7 to 12.

16. A chip, wherein the chip comprises a processor and a data interface, and the processor reads, through the data interface, instructions stored in a memory to perform the method according to any one of claims 1 to 6 or 7 to 12.