
WO2020221200A1 - Neural network construction method, image processing method and devices - Google Patents

Neural network construction method, image processing method and devices Download PDF

Info

Publication number
WO2020221200A1
WO2020221200A1 PCT/CN2020/087222 CN2020087222W
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
network
search
construction
stage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2020/087222
Other languages
English (en)
Chinese (zh)
Inventor
陈鑫
谢凌曦
田奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of WO2020221200A1 publication Critical patent/WO2020221200A1/fr
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55Clustering; Classification
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle

Definitions

  • This application relates to the field of artificial intelligence, and more specifically, to a neural network construction method, image processing method and device.
  • Artificial intelligence is a theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence.
  • Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
  • Research in the field of artificial intelligence includes robotics, natural language processing, computer vision, decision-making and reasoning, human-computer interaction, recommendation and search, and basic AI theory.
  • A neural network (for example, a deep neural network) with good performance often has a sophisticated network structure, which requires highly skilled and experienced human experts to spend a great deal of effort to construct.
  • To address this, methods of constructing neural networks through neural architecture search (NAS) have emerged. Such a search method generally builds a search network from a certain number of building units, then optimizes, within the search space, the connection relationships between the nodes of each building unit in the search network to obtain optimized building units, and finally builds the target neural network from the optimized building units.
  • However, this search method puts all possible operations into the search space, which requires a huge amount of video memory during the optimization process, so only a shallow search network can be stacked. The final target neural network to be built is often much deeper, leading to a large depth gap between the search network and the target neural network; building units optimized on the shallower search network are not entirely suitable for the deeper target neural network, so the final target neural network may not meet the application requirements well.
  • This application provides a neural network construction method, an image processing method, devices, a computer-readable storage medium, and a chip, so as to better construct a neural network that meets application needs.
  • In a first aspect, a method for constructing a neural network is provided, which includes: determining a search space and a plurality of building units; stacking the plurality of building units to obtain a search network, where the search network is a neural network used for searching for a neural network structure; optimizing the network structure of the building units in the search network within the search space to obtain optimized building units; and building the target neural network from the optimized building units.
  • The above search space is determined according to the application requirements of the target neural network to be constructed, and the plurality of building units are determined according to the search space and the size of the video memory resources of the device that constructs the target neural network. A building unit is a network structure obtained by connecting multiple nodes through basic neural network operations, and is the basic module used to build the neural network.
  • Specifically, the optimization process for the network structure of the building units in the search network includes N stages, where the i-th stage and the j-th stage are any two of the N stages. The size of the search space in the i-th stage is larger than the size of the search space in the j-th stage, while the number of building units included in the search network in the i-th stage is smaller than the number of building units included in the search network in the j-th stage. The reduction of the search space of the search network and the increase in the number of building units of the search network keep the video memory consumption generated in the optimization process within a preset range. The difference between the number of building units included in the search network in the N-th stage and the number of building units included in the target neural network is within a preset range; the number of building units included in the target neural network is determined according to the application requirements of the target neural network. N is a positive integer greater than 1, i and j are both positive integers less than or equal to N, and i is less than j.
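The staged trade-off described above can be sketched as a simple consistency check; the stage tuples in the usage example are invented numbers, not values from this application:

```python
def staged_schedule_valid(stages):
    """Check the property described above: from any earlier stage i to any
    later stage j, the search space shrinks (S_i > S_j) while the number of
    stacked building units grows (L_i < L_j).
    `stages` is a list of (search_space_size, num_building_units) tuples,
    one per stage, in order."""
    for i in range(len(stages)):
        for j in range(i + 1, len(stages)):
            s_i, l_i = stages[i]
            s_j, l_j = stages[j]
            if not (s_i > s_j and l_i < l_j):
                return False
    return True

# Three hypothetical stages: candidate-operation count falls 8 -> 5 -> 3
# while the stacked cell count rises 5 -> 11 -> 17.
print(staged_schedule_valid([(8, 5), (5, 11), (3, 17)]))
```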
  • The aforementioned search space is determined according to the application requirements of the target neural network to be constructed; specifically, it is determined according to the type of data the target neural network processes.
  • When the target neural network is a neural network for processing image data, the type and number of operations included in the search space should be adapted to the processing of image data; for example, the search space may include convolution operations, pooling operations, skip-connect operations, and so on.
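As an illustration of such an image-oriented search space, the snippet below lists a hypothetical candidate operation set (the names are common NAS choices, not taken from this application) and counts the discrete architectures it induces:

```python
# Hypothetical candidate operation set for an image-processing search space;
# these names are illustrative assumptions, not the application's own list.
CANDIDATE_OPS = [
    "sep_conv_3x3",
    "sep_conv_5x5",
    "dil_conv_3x3",
    "dil_conv_5x5",
    "max_pool_3x3",
    "avg_pool_3x3",
    "skip_connect",
    "none",
]

def search_space_size(num_edges, ops=CANDIDATE_OPS):
    """Number of discrete cell architectures when each of `num_edges`
    edges independently chooses one candidate operation."""
    return len(ops) ** num_edges
```

With 8 candidate operations and, say, 14 edges per cell, the discrete space already exceeds 4 x 10^12 architectures, which is why shrinking the candidate set across stages saves so much video memory.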
  • When the target neural network is a neural network for processing voice data, the type and number of operations included in the search space should be adapted to the processing of voice data; for example, the search space may include activation functions (such as ReLU and Tanh), and so on.
  • Optionally, the number of building units included in the target neural network is determined according to the application requirements of the target neural network; specifically, it is determined according to the type of data to be processed by the target neural network and/or the computational complexity involved.
  • For example, when the target neural network is used to process relatively simple text data, it only needs to contain a small number of building units, whereas when it is used to process more complex image data, it needs to contain a larger number of building units. In general, when the target neural network needs to process data of high complexity, it needs a larger number of building units; when the data has low complexity, a smaller number of building units suffices.
  • Optionally, the above-mentioned video memory resource may be replaced with a cache resource, which is the memory or storage unit used by the device constructing the neural network to store operation data during the optimization process. The foregoing cache resources may specifically include video memory resources.
  • Optionally, stacking the multiple building units to obtain a search network includes: stacking the multiple building units in sequence in a preset stacking manner to obtain the search network, where the output of a building unit located earlier in the search network is the input of the building unit located after it. The preset stacking manner may specify which types of building units are stacked at which positions, the number of stacked units, and so on.
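A minimal sketch of this sequential stacking, with each building unit modeled as an arbitrary callable whose output feeds the next unit:

```python
def stack_cells(x, cells):
    """Stack building units in sequence, as described above: the output of
    each building unit is the input of the building unit after it.
    `cells` is a list of callables (each standing in for one building unit)."""
    for cell in cells:
        x = cell(x)
    return x

# Toy usage: three identical "cells" that each add 1 to their input.
print(stack_cells(1, [lambda v: v + 1] * 3))  # 4
```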
  • In this application, the video memory resources saved by reducing the search space can be used to increase the number of building units, so that, when video memory resources are limited, as many building units as possible can be stacked to obtain a search network whose number of building units is close to that of the target neural network to be built. In this way, the optimized building units are better suited to constructing the target neural network, and the target neural network built from them can better meet the application requirements.
  • Specifically, this application gradually reduces the size of the search space while increasing the number of building units of the search network, so as to construct a target neural network that better meets the application needs. This reduces the dependence on video memory resources during the optimization process, so that a target neural network satisfying the application needs can be obtained with fewer video memory resources, and it also improves the utilization rate of video memory resources to a certain extent.
  • As the number of building units of the search network approaches that of the target neural network, the optimized building units in the search network become more suitable for building the target neural network. The depth of a neural network is positively correlated with the number of building units it contains; therefore, when the number of building units of the search network is close to that of the target neural network, the network depths of the two are also relatively close.
  • Optionally, the size of the search space in the i-th stage is S_i and in the j-th stage is S_j, and the number of building units included in the search network in the i-th stage is L_i and in the j-th stage is L_j, where L_j - L_i is determined according to S_i - S_j, or S_i - S_j is determined according to L_j - L_i. Specifically, S_i - S_j can be preset and L_j - L_i then determined from it, so that the difference between the video memory saved by shrinking the search space and the additional video memory consumed by the added building units is within a certain threshold range; alternatively, L_j - L_i can be preset and S_i - S_j then determined from it, under the same balance condition.
  • Optionally, the size of N is preset. The size of N can be determined according to the construction requirements of the target neural network: when the target neural network needs to be constructed in a relatively short time, N can be set to a smaller value; when a relatively long time is available for construction, N can be set to a larger value.
  • For example, the second stage relative to the first stage, and the fourth stage relative to the third stage, satisfy: the search space is reduced and the number of building units of the search network is increased; whereas the search space and the number of building units contained in the search network do not change between the second and third stages.
  • Optionally, j = i + 1.
  • Optionally, the change in the number of building units of the search network between any two adjacent stages is the same, and the change in the size of the search space between any two adjacent stages is also the same.
  • Optionally, the number of building units added in the (i+1)-th stage relative to the i-th stage may be determined based on the aforementioned value N, the number of building units included in the search network before optimization, and the number of building units of the target neural network. For example, let X be the number of building units added by the search network in the (i+1)-th stage relative to the i-th stage, let U be the number of building units contained in the search network before optimization starts, and let V be the number of building units in the target neural network.
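The text does not give the exact rule for X; one plausible reading is a linear schedule that grows the cell count from U to V in equal steps over the N stages, sketched below (the formula X = (V - U)/(N - 1) is an assumption, since the description only says X is determined from N, U, and V):

```python
def cells_per_stage(U, V, N):
    """Hypothetical linear schedule for the number of building units in the
    search network at each of N stages, growing from U (before optimization)
    toward V (the target neural network's count)."""
    if N == 1:
        return [U]
    step = (V - U) / (N - 1)  # X: cells added per stage transition
    return [round(U + k * step) for k in range(N)]

# E.g. U = 5 cells before optimization, V = 17 cells in the target network,
# N = 3 stages gives X = 6 added cells per stage transition.
print(cells_per_stage(5, 17, 3))  # [5, 11, 17]
```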
  • During the optimization process, the extent of the reduction of the search space and the extent of the increase in the number of building units of the search network can be determined in various ways, as long as, together, the reduction of the search space and the increase in the number of building units keep the video memory consumption generated in the optimization process within a preset range. For example, the amount by which the size of the search space is reduced can be set in advance and the increase in the number of building units of the search network then determined, or the increase can be preset and the reduction of the search space then determined. This application does not limit this; all implementations that keep the video memory consumption within the preset range fall within its protection scope.
  • Optionally, the number of operations of the first type included in the connection relationships between the nodes of the optimized building unit is within a preset range, where an operation of the first type is an operation that contains no trainable neural network parameters. Limiting the number of such operations to a certain range keeps the trainable parameters of the final target neural network at a relatively stable level, so that the performance of the target neural network remains stable. If there are too many operations of the first type, there will be fewer operations that contain trainable parameters, the neural network as a whole will have fewer trainable parameters, and its feature-expression ability will decrease.
  • The number of first-type operations in the building units obtained by each search will differ somewhat. Limiting the number of operations of the first type keeps the trainable parameters of the test network constructed from the searched neural network structure (that is, the building unit) at a relatively stable level, thereby reducing performance fluctuations on the corresponding tasks.
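A sketch of one way such a cap could be enforced when discretizing a cell; the per-edge scoring scheme and the fallback rule are assumptions, not the exact mechanism of this application:

```python
def cap_parameter_free_ops(edge_scores, parameter_free, max_count):
    """Choose one operation per edge, keeping at most `max_count` edges on a
    parameter-free op (e.g. skip-connect). Among edges whose best op is
    parameter-free, the lowest-scoring ones fall back to their best
    parameterized alternative.
    edge_scores: list of {op_name: score} dicts, one per edge."""
    chosen = [max(scores, key=scores.get) for scores in edge_scores]
    # Edges currently assigned a parameter-free op, weakest score first.
    pf_edges = sorted(
        (i for i, op in enumerate(chosen) if op in parameter_free),
        key=lambda i: edge_scores[i][chosen[i]],
    )
    for i in pf_edges[:max(0, len(pf_edges) - max_count)]:
        with_params = {op: sc for op, sc in edge_scores[i].items()
                       if op not in parameter_free}
        chosen[i] = max(with_params, key=with_params.get)
    return chosen
```

For example, with a cap of one skip-connect, an edge whose skip-connect score barely beats its convolution score is the first to be converted back to the convolution.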
  • Optionally, the building units in the search network include a first type of building unit, whose output feature maps are the same as its input feature maps in both number and size.
  • Optionally, the building units in the search network include a second type of building unit, whose output feature maps have 1/M the resolution of its input feature maps and are M times the input feature maps in number, where M is a positive integer greater than 1.
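The shape arithmetic of the two building-unit types can be illustrated by propagating a (channels, height, width) shape through a hypothetical stack pattern (the pattern itself is an invented example):

```python
def propagate_shape(shape, pattern, M=2):
    """Propagate a (channels, height, width) shape through a stack of
    building units: the first type ("normal") preserves the number and size
    of feature maps, the second type ("reduction") divides the resolution
    by M and multiplies the channel count by M, as described above."""
    c, h, w = shape
    for kind in pattern:
        if kind == "reduction":
            c, h, w = c * M, h // M, w // M
        # "normal" cells leave the shape unchanged
    return (c, h, w)

# 16 feature maps of 32x32 through normal/reduction/normal/reduction:
print(propagate_shape((16, 32, 32),
                      ["normal", "reduction", "normal", "reduction"]))
```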
  • In a second aspect, an image processing method is provided, which includes: acquiring an image to be processed; and classifying the image to be processed according to a target neural network to obtain a classification result of the image to be processed, where the target neural network is a neural network constructed according to any one of the implementations of the first aspect.
  • Before the target neural network used in the image processing method of the second aspect performs image classification, it needs to be trained on training images; the trained target neural network can then classify the image to be processed. Specifically, the neural network structure search method of the first aspect can be used to obtain the target neural network, the target neural network can then be trained on the training images, and after training is completed, it can be used to classify the image to be processed.
  • Since the target neural network is constructed using the first aspect, it better conforms, or is closer, to the application requirements of the neural network; using such a neural network for image classification can achieve better image classification effects (for example, more accurate classification results).
  • In a third aspect, an image processing method is provided, which includes: acquiring an image to be processed; and classifying the image to be processed according to a target neural network to obtain a classification result of the image to be processed.
  • The target neural network is built from multiple optimized building units, which are obtained by optimizing the network structure of the building units in the search network in N stages, where the i-th stage and the j-th stage are any two of the N stages. The size of the search space in the i-th stage is greater than the size of the search space in the j-th stage, and the number of building units included in the search network in the i-th stage is less than the number of building units included in the search network in the j-th stage. The reduction of the search space of the search network and the increase in the number of its building units keep the video memory consumption generated during the optimization process within a preset range, and the difference between the number of building units included in the search network in the N-th stage and the number of building units included in the target neural network is within a preset range. The number of building units included in the target neural network is determined according to the application requirements of the target neural network; N is a positive integer greater than 1, i and j are both positive integers less than or equal to N, and i is less than j.
  • In this way, the optimized building units of the search network are better suited to constructing the target neural network, and a target neural network with better performance can be obtained; using the target neural network for image classification can achieve better image classification results (for example, more accurate classification results).
  • Optionally, j = i + 1.
  • Optionally, the change in the number of building units of the search network between any two adjacent stages is the same, and the change in the size of the search space between any two adjacent stages is also the same.
  • Optionally, the above-mentioned target neural network is a neural network trained on training pictures. Specifically, the target neural network can be trained with the training pictures and the category information with which they are labeled; the trained neural network can then be used for image classification.
  • In a fourth aspect, an image processing method is provided, which includes: obtaining a road image; performing convolution processing on the road image according to a target neural network to obtain multiple convolution feature maps of the road image; and performing deconvolution processing on the multiple convolution feature maps of the road image according to the target neural network to obtain the semantic segmentation result of the road image. The above-mentioned target neural network is a neural network constructed according to any one of the implementations of the first aspect.
  • In a fifth aspect, an image processing method is provided, which includes: obtaining a face image; performing convolution processing on the face image according to a target neural network to obtain a convolution feature map of the face image; and comparing the convolution feature map of the face image with the convolution feature map of an ID image to obtain a verification result for the face image. The convolution feature map of the ID image may be obtained in advance and stored in a corresponding database; for example, convolution processing is performed on the ID document image in advance, and the resulting convolution feature map is stored in the database. The above-mentioned target neural network is a neural network constructed according to any one of the implementations of the first aspect.
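A minimal sketch of the comparison step, using cosine similarity between flattened feature maps; the metric and threshold are illustrative assumptions rather than the application's specified rule:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two flattened feature vectors (e.g. the
    convolution feature map of a face image and that of a stored ID image)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify(face_feat, id_feat, threshold=0.8):
    """Return True when the face feature map matches the ID feature map
    closely enough; the 0.8 threshold is a hypothetical choice."""
    return cosine_similarity(face_feat, id_feat) >= threshold
```

In practice the ID image's feature map would be precomputed and fetched from the database, so only one forward pass (over the face image) is needed at verification time.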
  • In a sixth aspect, a neural network construction device is provided, which includes: a memory for storing a program; and a processor for executing the program stored in the memory. When the program stored in the memory is executed, the processor is configured to execute the method in any one of the implementations of the first aspect.
  • In a seventh aspect, an image processing device is provided, which includes: a memory for storing a program; and a processor for executing the program stored in the memory. When the program stored in the memory is executed, the processor is configured to execute the method in any one of the second to fifth aspects.
  • In an eighth aspect, a computer-readable medium is provided, which stores program code for execution by a device; the program code includes instructions for executing the method in any one of the first to fifth aspects.
  • In a ninth aspect, a computer program product containing instructions is provided; when the computer program product runs on a computer, the computer executes the method in any one of the foregoing first to fifth aspects.
  • In a tenth aspect, a chip is provided, which includes a processor and a data interface; the processor reads instructions stored in a memory through the data interface and executes the method in any one of the implementations of the first to fifth aspects.
  • Optionally, the chip may further include a memory in which instructions are stored, and the processor is configured to execute the instructions stored in the memory; when the instructions are executed, the processor is configured to execute the method in any one of the implementations of the first to fifth aspects.
  • FIG. 1 is a schematic diagram of an artificial intelligence main body framework provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of an application environment provided by an embodiment of the application.
  • FIG. 3 is a schematic diagram of a convolutional neural network structure provided by an embodiment of the application.
  • FIG. 4 is a schematic diagram of a convolutional neural network structure provided by an embodiment of the application.
  • FIG. 5 is a schematic structural diagram of a neural network processor provided by an embodiment of this application.
  • FIG. 6 is a schematic structural diagram of a processor provided by an embodiment of the application.
  • FIG. 7 is a schematic diagram of the hardware structure of a chip provided by an embodiment of the application.
  • FIG. 8 is a schematic diagram of a system architecture provided by an embodiment of the application.
  • FIG. 9 is a schematic flowchart of a neural network construction method according to an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a construction unit of an embodiment of the present application.
  • FIG. 11 is a schematic diagram of a search network according to an embodiment of the present application.
  • FIG. 12 is a schematic diagram of a neural network construction method according to an embodiment of the present application.
  • FIG. 13 is a schematic diagram of a neural network construction system according to an embodiment of the present application.
  • FIG. 14 is a schematic diagram of a network structure optimization process of a construction unit of a search network according to an embodiment of the present application.
  • FIG. 15 is a schematic diagram of the processing procedure of the operation quantity specification module of the embodiment of the present application.
  • FIG. 16 is a schematic flowchart of an image processing method according to an embodiment of the present application.
  • FIG. 17 is a schematic block diagram of a neural network construction device according to an embodiment of the present application.
  • FIG. 18 is a schematic block diagram of an image processing device according to an embodiment of the present application.
  • FIG. 19 is a schematic block diagram of a neural network training device according to an embodiment of the present application.
  • Figure 1 shows a schematic diagram of an artificial intelligence main framework, which describes the overall workflow of the artificial intelligence system and is suitable for general artificial intelligence field requirements.
  • The intelligent information chain reflects a series of processes from data acquisition to processing; for example, it can be the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, and intelligent execution and output. In this process, the data goes through the condensing process of "data-information-knowledge-wisdom".
  • The infrastructure provides computing power support for the artificial intelligence system, realizes communication with the outside world, and provides support through the basic platform.
  • the infrastructure can communicate with the outside through sensors, and the computing power of the infrastructure can be provided by smart chips.
  • The smart chip here can be a hardware acceleration chip such as a central processing unit (CPU), a neural-network processing unit (NPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), or a field programmable gate array (FPGA).
  • the basic platform of infrastructure can include distributed computing framework and network and other related platform guarantees and support, and can include cloud storage and computing, interconnection networks, etc.
  • data can be obtained through sensors and external communication, and then these data can be provided to the smart chip in the distributed computing system provided by the basic platform for calculation.
  • the data in the upper layer of the infrastructure is used to represent the data source in the field of artificial intelligence.
  • This data involves graphics, images, voice, text, and IoT data of traditional devices, including business data of existing systems and sensory data such as force, displacement, liquid level, temperature, and humidity.
  • the above-mentioned data processing usually includes data training, machine learning, deep learning, search, reasoning, decision-making and other processing methods.
  • machine learning and deep learning can symbolize and formalize data for intelligent information modeling, extraction, preprocessing, training, etc.
  • Reasoning refers to the process of simulating human intelligent reasoning in a computer or intelligent system, using formal information to conduct machine thinking and solving problems based on reasoning control strategies.
  • the typical function is search and matching.
  • Decision-making refers to the decision-making process of intelligent information after reasoning, and usually provides functions such as classification, ranking, and prediction.
  • After the above data processing, some general capabilities can be formed based on the results, such as an algorithm or a general system, for example, translation, text analysis, computer vision processing, speech recognition, image recognition, and so on.
  • Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields; they encapsulate the overall artificial intelligence solution, productizing intelligent information decision-making and realizing practical applications. The application fields mainly include intelligent manufacturing, intelligent transportation, smart home, smart medical care, smart security, autonomous driving, safe cities, smart terminals, and so on.
  • the embodiments of this application can be applied to many fields in artificial intelligence, for example, smart manufacturing, smart transportation, smart home, smart medical care, smart security, automatic driving, safe cities and other fields.
  • Specifically, the embodiments of the present application can be applied in fields that require the use of (deep) neural networks, such as image classification, image retrieval, image semantic segmentation, image super-resolution, and natural language processing.
  • Recognizing the images in an album can help the user or the system classify and manage the album, improving the user experience.
  • the neural network structure search method of the embodiment of the present application can search for a neural network structure suitable for album classification, and then train the neural network according to the training pictures in the training picture library to obtain the album classification neural network.
  • the album classification neural network can be used to classify the pictures, so that different categories of pictures can be labeled for users to view and find.
  • the classification tags of these pictures can also be provided to the album management system for classification management, saving users management time, improving the efficiency of album management, and enhancing user experience.
  • a neural network suitable for album classification can be constructed through a neural network construction system (corresponding to the neural network structure search method in the embodiment of the present application).
  • the network structure of the building unit in the search network can be optimized by using the training image library to obtain the optimized building unit, and then the optimized building unit can be used to build the neural network.
  • the neural network can be trained according to the training pictures to obtain the album classification neural network.
  • the album classification neural network processes the input picture and outputs its category, for example, tulip.
  • a neural network suitable for data processing in an autonomous driving scenario can be constructed, and then the neural network can be trained with data from the autonomous driving scenario to obtain a sensor data processing network; finally, the sensor data processing network can be used to process the input road images to identify different objects in the road images.
  • the neural network construction system can construct a neural network according to the vehicle detection task.
  • the sensor data can be used to optimize the network structure of the building units in the search network to obtain optimized building units, and then the optimized building units can be used to build a neural network.
  • the neural network can be trained according to the sensor data to obtain the sensor data processing network.
  • the sensor data processing network processes the input road picture, and can identify the vehicle in the road picture (as shown in the rectangular frame in the lower right corner of Fig. 3).
  • a neural network can be composed of neural units.
  • a neural unit can refer to an arithmetic unit that takes x_s (s = 1, 2, …, n) and an intercept of 1 as inputs.
  • the output of the arithmetic unit can be: h_{W,b}(x) = f(W^T x) = f(∑_{s=1}^{n} W_s·x_s + b), where n is a natural number greater than 1, W_s is the weight of x_s, and b is the bias of the neural unit.
  • f is the activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network to convert the input signal in the neural unit into an output signal.
  • the output signal of the activation function can be used as the input of the next convolutional layer, and the activation function can be a sigmoid function.
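The neuron formula above can be sketched in a few lines of code. This is a minimal illustration with made-up weights and a sigmoid activation, not an implementation from the application:

```python
import math

def neuron_output(x, w, b):
    """Single neural unit: weighted sum of inputs plus bias,
    passed through a sigmoid activation f(s) = 1 / (1 + e^(-s))."""
    s = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-s))

# Example with hypothetical weights: two inputs and one bias.
y = neuron_output(x=[1.0, 2.0], w=[0.5, -0.25], b=0.1)
print(round(y, 4))  # sigmoid(0.5*1 - 0.25*2 + 0.1) = sigmoid(0.1)
```

The output lies in (0, 1), which is why the sigmoid result can directly serve as the input of the next convolutional layer.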
  • a neural network is a network formed by connecting multiple above-mentioned single neural units together, that is, the output of one neural unit can be the input of another neural unit.
  • the input of each neural unit can be connected with the local receptive field of the previous layer to extract the characteristics of the local receptive field.
  • the local receptive field can be a region composed of several neural units.
  • Deep neural network (DNN), also called multi-layer neural network, can be understood as a neural network with multiple hidden layers.
  • According to the positions of the different layers, the layers inside a DNN can be divided into three categories: the input layer, the hidden layers, and the output layer.
  • the first layer is the input layer
  • the last layer is the output layer
  • the number of layers in the middle are all hidden layers.
  • the layers are fully connected, that is to say, any neuron in the i-th layer must be connected to any neuron in the i+1th layer.
  • Although a DNN looks complicated, the work of each layer is not complicated. Simply put, each layer computes the following linear relationship expression: ȳ = α(W·x̄ + b̄), where x̄ is the input vector, ȳ is the output vector, b̄ is the offset vector, W is the weight matrix (also called coefficients), and α() is the activation function.
  • Each layer simply performs this operation on the input vector x̄ to obtain the output vector ȳ. Because a DNN has many layers, the number of coefficients W and offset vectors b̄ is also relatively large.
  • These parameters are defined in the DNN as follows, taking the coefficient W as an example: suppose that in a three-layer DNN, the linear coefficient from the fourth neuron in the second layer to the second neuron in the third layer is defined as W^3_{24}, where the superscript 3 represents the layer where the coefficient W is located, and the subscripts correspond to the output index 2 of the third layer and the input index 4 of the second layer.
  • In general, the coefficient from the k-th neuron in the (L-1)-th layer to the j-th neuron in the L-th layer is defined as W^L_{jk}.
  • the input layer has no W parameter.
  • more hidden layers make the network better able to model complex situations in the real world. Theoretically, a model with more parameters is more complex and has a greater "capacity", which means it can complete more complex learning tasks.
  • Training a deep neural network is also a process of learning a weight matrix, and its ultimate goal is to obtain the weight matrix of all layers of the trained deep neural network (a weight matrix formed by vectors W of many layers).
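The per-layer relation ȳ = α(W·x̄ + b̄) chains naturally into a forward pass. The sketch below uses hypothetical layer shapes and tanh as the example activation; it only illustrates the layer-by-layer computation described above:

```python
import numpy as np

def dnn_forward(x, layers):
    """Forward pass of a fully connected DNN: each layer computes
    y = activation(W @ x + b), and its output feeds the next layer."""
    for W, b in layers:
        x = np.tanh(W @ x + b)  # tanh as an example activation
    return x

rng = np.random.default_rng(0)
# Hypothetical 3-layer net: input 4 -> hidden 5 -> hidden 3 -> output 2.
shapes = [(5, 4), (3, 5), (2, 3)]
layers = [(rng.standard_normal(s), rng.standard_normal(s[0])) for s in shapes]
out = dnn_forward(rng.standard_normal(4), layers)
print(out.shape)  # (2,)
```

Training then means learning the weight matrices W and offset vectors b̄ of all layers, as the next bullet states.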
  • Convolutional neural network (convolutional neuron network, CNN) is a deep neural network with convolutional structure.
  • the convolutional neural network contains a feature extractor composed of a convolution layer and a sub-sampling layer.
  • the feature extractor can be regarded as a filter.
  • the convolutional layer refers to the neuron layer that performs convolution processing on the input signal in the convolutional neural network.
  • a neuron can be connected to only part of the neighboring neurons.
  • a convolutional layer usually contains several feature planes, and each feature plane can be composed of some rectangularly arranged neural units. Neural units in the same feature plane share weights, and the shared weights here are the convolution kernels.
  • Sharing weight can be understood as the way to extract image information has nothing to do with location.
  • the convolution kernel can be initialized in the form of a matrix of random size. During the training of the convolutional neural network, the convolution kernel can obtain reasonable weights through learning. In addition, the direct benefit of sharing weights is to reduce the connections between the layers of the convolutional neural network, while reducing the risk of overfitting.
  • Recurrent neural networks (RNN) are neural networks used to process sequence data; an RNN can process sequence data of any length.
  • the training of RNN is the same as the training of traditional CNN or DNN.
  • the neural network can use an error back propagation (BP) algorithm to modify the size of the parameters in the initial neural network model during the training process, so that the reconstruction error loss of the neural network model becomes smaller and smaller. Specifically, forward propagation of the input signal to the output produces an error loss, and the parameters in the initial neural network model are updated by back-propagating the error loss information, so that the error loss converges.
  • the backpropagation algorithm is a backpropagation motion dominated by error loss, and aims to obtain the optimal neural network model parameters, such as the weight matrix.
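The idea behind this error-loss-driven update can be shown on a one-parameter toy "network" y = w·x with squared-error loss; the learning rate and step count are arbitrary illustrative choices:

```python
def train(x, target, w=0.0, lr=0.1, steps=50):
    """Repeatedly move w against the gradient of the error loss
    L = (y - target)^2, so the loss converges toward zero."""
    for _ in range(steps):
        y = w * x                         # forward pass
        loss_grad = 2 * (y - target) * x  # dL/dw, "back-propagated" error
        w -= lr * loss_grad               # parameter update
    return w

w = train(x=2.0, target=6.0)
print(round(w, 3))  # converges toward w = 3, since 3 * 2 = 6
```

In a real multi-layer network, backpropagation applies the chain rule to compute the same kind of gradient for every weight matrix.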
  • an embodiment of the present application provides a system architecture 100.
  • the data collection device 160 is used to collect training data.
  • the training data may include training images and classification results corresponding to the training images, where the results of the training images may be manually pre-labeled results.
  • the data collection device 160 stores the training data in the database 130, and the training device 120 trains to obtain the target model/rule 101 based on the training data maintained in the database 130.
  • the training device 120 processes the input original image and compares the output image with the original image until the difference between the output image of the training device 120 and the original image is less than a certain threshold, thereby completing the training of the target model/rule 101.
  • the above-mentioned target model/rule 101 can be used to implement the neural network construction method or the image processing method of the embodiments of the present application.
  • the target model/rule 101 in the embodiment of the present application may specifically be a neural network.
  • the training data maintained in the database 130 may not all come from the collection of the data collection device 160, and may also be received from other devices.
  • the training device 120 does not necessarily perform the training of the target model/rule 101 completely based on the training data maintained by the database 130. It may also obtain training data from the cloud or other places for model training.
  • the above description should not be construed as limiting the embodiments of this application.
  • the target model/rule 101 trained according to the training device 120 can be applied to different systems or devices, such as the execution device 110 shown in FIG. 4, which can be a terminal, such as a mobile phone terminal, a tablet computer, a notebook computer, an augmented reality (AR)/virtual reality (VR) device, a vehicle-mounted terminal, etc., and can also be a server or cloud.
  • the execution device 110 is configured with an input/output (input/output, I/O) interface 112 for data interaction with external devices.
  • the user can input data to the I/O interface 112 through the client device 140.
  • the input data in this embodiment of the application may include: the image to be processed input by the client device.
  • the preprocessing module 113 and the preprocessing module 114 are used for preprocessing according to the input data (such as the image to be processed) received by the I/O interface 112.
  • the preprocessing module 113 and the preprocessing module 114 may not be provided (or there may be only one preprocessing module), and the calculation module 111 is directly used to process the input data.
  • the execution device 110 may call data, code, etc. in the data storage system 150 for corresponding processing.
  • the data, instructions, etc. obtained by corresponding processing may also be stored in the data storage system 150.
  • the I/O interface 112 returns the processing result, such as the denoising processed image obtained as described above, to the client device 140 to provide it to the user.
  • the training device 120 can generate corresponding target models/rules 101 based on different training data for different goals or tasks, and the corresponding target models/rules 101 can be used to achieve the above goals or complete The above tasks provide the user with the desired result.
  • the user can manually set input data, and the manual setting can be operated through the interface provided by the I/O interface 112.
  • the client device 140 can automatically send input data to the I/O interface 112. If the client device 140 is required to automatically send the input data and the user's authorization is required, the user can set the corresponding authority in the client device 140.
  • the user can view the result output by the execution device 110 on the client device 140, and the specific presentation form may be a specific manner such as display, sound, and action.
  • the client device 140 can also be used as a data collection terminal to collect the input data of the input I/O interface 112 and the output result of the output I/O interface 112 as new sample data, and store it in the database 130 as shown in the figure.
  • the I/O interface 112 directly stores the input data input to the I/O interface 112 and the output result of the I/O interface 112 in the database 130 as new sample data, as shown in the figure.
  • FIG. 4 is only a schematic diagram of a system architecture provided by an embodiment of the present application.
  • the positional relationship between the devices, devices, modules, etc. shown in the figure does not constitute any limitation.
  • in FIG. 4, the data storage system 150 is an external memory relative to the execution device 110; in other cases, the data storage system 150 may also be placed in the execution device 110.
  • the target model/rule 101 is obtained by training according to the training device 120.
  • the target model/rule 101 may be the neural network of the embodiments of this application; specifically, the neural network provided in the embodiments of this application can be a CNN, a deep convolutional neural network (DCNN), a recurrent neural network (RNN), and so on.
  • CNN is a very common neural network
  • the structure of CNN will be introduced in detail below in conjunction with Figure 5.
  • a convolutional neural network is a deep neural network with a convolutional structure and a deep learning architecture. A deep learning architecture refers to a machine learning algorithm that performs multiple levels of learning at different abstraction levels.
  • As a deep learning architecture, CNN is a feed-forward artificial neural network in which each neuron can respond to the image input into it.
  • a convolutional neural network (CNN) 200 may include an input layer 210, a convolutional layer/pooling layer 220 (where the pooling layer is optional), and a neural network layer 230.
  • the input layer 210 can obtain the image to be processed, and pass the obtained image to be processed to the convolutional layer/pooling layer 220 and the subsequent neural network layer 230 for processing, and the image processing result can be obtained.
  • the convolutional layer/pooling layer 220 may include layers 221-226. For example, in one implementation, layer 221 is a convolutional layer, layer 222 is a pooling layer, layer 223 is a convolutional layer, layer 224 is a pooling layer, layer 225 is a convolutional layer, and layer 226 is a pooling layer; in another implementation, layers 221 and 222 are convolutional layers, layer 223 is a pooling layer, layers 224 and 225 are convolutional layers, and layer 226 is a pooling layer. That is, the output of a convolutional layer can be used as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
  • the convolution layer 221 can include many convolution operators.
  • the convolution operator is also called a kernel. Its function in image processing is equivalent to a filter that extracts specific information from the input image matrix.
  • the convolution operator can essentially be a weight matrix, which is usually pre-defined. In the process of convolution on an image, the weight matrix is usually moved along the horizontal direction on the input image one pixel at a time (or two pixels at a time, depending on the value of the stride) to extract specific features from the image.
  • the size of the weight matrix should be related to the size of the image. It should be noted that the depth dimension of the weight matrix is the same as the depth dimension of the input image; during the convolution operation, the weight matrix extends to the entire depth of the input image. Therefore, convolution with a single weight matrix produces a convolution output with a single depth dimension, but in most cases a single weight matrix is not used; instead, multiple weight matrices of the same size (rows × columns), i.e., multiple homogeneous matrices, are applied.
  • the output of each weight matrix is stacked to form the depth dimension of the convolutional image, where the dimension can be understood as determined by the "multiple" mentioned above.
  • Different weight matrices can be used to extract different features in the image. For example, one weight matrix is used to extract edge information of the image, another weight matrix is used to extract specific colors of the image, and another weight matrix is used to eliminate unwanted noise in the image.
  • the multiple weight matrices have the same size (row ⁇ column), the size of the convolution feature maps extracted by the multiple weight matrices of the same size are also the same, and then the multiple extracted convolution feature maps of the same size are combined to form The output of the convolution operation.
  • in practical applications, the weight values in these weight matrices need to be obtained through a lot of training. Each weight matrix formed by the trained weight values can be used to extract information from the input image, so that the convolutional neural network 200 can make correct predictions.
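The sliding-window computation and the stacking of per-kernel outputs into a depth dimension can be sketched directly. The image and kernels below are toy values chosen for illustration, not from the application:

```python
import numpy as np

def conv2d(image, kernels, stride=1):
    """Slide each weight matrix (kernel) over the image; stacking the
    per-kernel outputs forms the depth dimension of the feature map."""
    k = kernels.shape[-1]
    h = (image.shape[0] - k) // stride + 1
    w = (image.shape[1] - k) // stride + 1
    out = np.zeros((len(kernels), h, w))
    for d, K in enumerate(kernels):
        for i in range(h):
            for j in range(w):
                patch = image[i*stride:i*stride+k, j*stride:j*stride+k]
                out[d, i, j] = np.sum(patch * K)  # weighted sum over the patch
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernels = np.stack([np.eye(3), np.ones((3, 3))])  # two 3x3 weight matrices
fmap = conv2d(image, kernels)
print(fmap.shape)  # (2, 3, 3): depth 2 comes from the two kernels
```

Two same-size kernels produce two same-size feature maps, which are combined along the depth dimension exactly as the text describes.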
  • the initial convolutional layers (such as 221) often extract more general features, which can also be called low-level features; as the depth of the convolutional neural network increases, the features extracted by the later convolutional layers (for example, 226) become more and more complex, such as high-level semantic features. Features with higher semantics are more suitable for the problem to be solved.
  • the layers 221-226 illustrated by 220 in Figure 5 can be one convolutional layer followed by one pooling layer, or multiple convolutional layers followed by one or more pooling layers.
  • the only purpose of the pooling layer is to reduce the size of the image space.
  • the pooling layer may include an average pooling operator and/or a maximum pooling operator for sampling the input image to obtain a smaller size image.
  • the average pooling operator can calculate the pixel values in the image within a specific range to generate an average value as the result of average pooling.
  • the maximum pooling operator can take the pixel with the largest value within a specific range as the result of the maximum pooling.
  • the operators in the pooling layer should also be related to the image size.
  • the size of the image output after processing by the pooling layer can be smaller than the size of the image of the input pooling layer, and each pixel in the image output by the pooling layer represents the average value or the maximum value of the corresponding sub-region of the image input to the pooling layer.
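Both pooling operators described above can be sketched with a toy 4×4 image and a 2×2 window (illustrative values only):

```python
import numpy as np

def pool2d(image, size=2, mode="max"):
    """Max or average pooling: each output pixel summarizes a
    size x size sub-region of the input, shrinking the spatial size."""
    h, w = image.shape[0] // size, image.shape[1] // size
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            patch = image[i*size:(i+1)*size, j*size:(j+1)*size]
            out[i, j] = patch.max() if mode == "max" else patch.mean()
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
print(pool2d(image, mode="max"))  # [[ 5.  7.] [13. 15.]]
print(pool2d(image, mode="avg"))  # [[ 2.5  4.5] [10.5 12.5]]
```

The 4×4 input becomes a 2×2 output, and each output pixel is the maximum or mean of its corresponding sub-region, matching the bullet above.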
  • after processing by the convolutional layer/pooling layer 220, the convolutional neural network 200 is not yet able to output the required output information, because, as mentioned above, the convolutional layer/pooling layer 220 only extracts features and reduces the parameters brought by the input image. To generate the final output information (the required class information or other related information), the convolutional neural network 200 needs to use the neural network layer 230 to generate the output of one or a group of required classes. Therefore, the neural network layer 230 may include multiple hidden layers (231, 232 to 23n as shown in FIG. 5) and an output layer 240. The parameters contained in the multiple hidden layers can be obtained by pre-training based on the relevant training data of a specific task type; for example, the task type can include image recognition, image classification, image super-resolution reconstruction, and so on.
  • after the multiple hidden layers in the neural network layer 230 comes the output layer 240, i.e., the final layer of the entire convolutional neural network 200.
  • the output layer 240 has a loss function similar to categorical cross-entropy, which is specifically used to calculate the prediction error.
  • a convolutional neural network (CNN) 200 may include an input layer 110, a convolutional layer/pooling layer 120 (the pooling layer is optional), and a neural network layer 130.
  • CNN convolutional neural network
  • compared with FIG. 5, the multiple convolutional layers/pooling layers in the convolutional layer/pooling layer 120 in FIG. 6 are parallel, and the features extracted by each are all input to the neural network layer 130 for processing.
  • the convolutional neural networks shown in FIGS. 5 and 6 are only two examples of possible convolutional neural networks used in the image processing method of the embodiments of the application; in specific applications, the convolutional neural network used in the image processing method of the embodiments of the application can also exist in the form of other network models.
  • the structure of the convolutional neural network obtained by the search method of the neural network structure of the embodiment of the present application may be as shown in the convolutional neural network structure in FIG. 5 and FIG. 6.
  • FIG. 7 is a hardware structure of a chip provided by an embodiment of the application.
  • the chip includes a neural network processor 50.
  • the chip may be set in the execution device 110 as shown in FIG. 1 to complete the calculation work of the calculation module 111.
  • the chip can also be set in the training device 120 as shown in FIG. 1 to complete the training work of the training device 120 and output the target model/rule 101.
  • the algorithms of each layer in the convolutional neural network as shown in FIG. 2 can be implemented in the chip as shown in FIG. 7.
  • the NPU is mounted as a co-processor to a main central processing unit (central processing unit, CPU) (host CPU), and the main CPU distributes tasks.
  • the core part of the NPU is the arithmetic circuit 503.
  • the controller 504 controls the arithmetic circuit 503 to extract data from the memory (weight memory or input memory) and perform calculations.
  • the arithmetic circuit 503 includes multiple processing units (process engines, PE). In some implementations, the arithmetic circuit 503 is a two-dimensional systolic array. The arithmetic circuit 503 may also be a one-dimensional systolic array or other electronic circuits capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 503 is a general-purpose matrix processor.
  • the arithmetic circuit fetches the corresponding data of matrix B from the weight memory 502 and buffers it on each PE in the arithmetic circuit.
  • the arithmetic circuit fetches matrix A data and matrix B from the input memory 501 to perform matrix operations, and the partial or final result of the obtained matrix is stored in an accumulator 508.
  • the vector calculation unit 507 can perform further processing on the output of the arithmetic circuit, such as vector multiplication, vector addition, exponential operation, logarithmic operation, size comparison and so on.
  • the vector calculation unit 507 can be used for network calculations in the non-convolutional/non-FC layer of the neural network, such as pooling, batch normalization, local response normalization, etc. .
  • the vector calculation unit 507 can store the processed output vector in the unified buffer 506.
  • the vector calculation unit 507 may apply a nonlinear function to the output of the arithmetic circuit 503, such as a vector of accumulated values, to generate the activation value.
  • the vector calculation unit 507 generates a normalized value, a combined value, or both.
  • the processed output vector can be used as an activation input to the arithmetic circuit 503, for example for use in subsequent layers in a neural network.
  • the unified memory 506 is used to store input data and output data.
  • the direct memory access controller (DMAC) 505 is used to transfer input data in the external memory to the input memory 501 and/or the unified memory 506, store the weight data in the external memory into the weight memory 502, and store the data in the unified memory 506 into the external memory.
  • the bus interface unit (BIU) 510 is used to implement interaction between the main CPU, the DMAC, and the fetch memory 509 through the bus.
  • An instruction fetch buffer 509 connected to the controller 504 is used to store instructions used by the controller 504;
  • the controller 504 is configured to call the instructions cached in the instruction fetch memory 509 to control the working process of the computing accelerator.
  • the unified memory 506, the input memory 501, the weight memory 502, and the instruction fetch memory 509 are all on-chip (On-Chip) memories, and the external memory is a memory external to the NPU.
  • the external memory can be a double data rate synchronous dynamic random access memory (DDR SDRAM), a high bandwidth memory (HBM), or other readable and writable memory.
  • each layer in the convolutional neural network shown in FIG. 2 can be executed by the arithmetic circuit 503 or the vector calculation unit 507.
  • the execution device 110 in FIG. 4 introduced above can execute the steps of the neural network construction method or the image processing method of the embodiment of the present application.
  • the CNN models shown in FIG. 5 and FIG. 6 and the chip shown in FIG. 7 can also be used to perform each step of the neural network construction method or the image processing method of the embodiments of the application.
  • the neural network construction method and the image processing method of the embodiments of the present application will be described in detail below with reference to the accompanying drawings.
  • an embodiment of the present application provides a system architecture 300.
  • the system architecture includes a local device 301, a local device 302, an execution device 210 and a data storage system 250, where the local device 301 and the local device 302 are connected to the execution device 210 through a communication network.
  • the execution device 210 may be implemented by one or more servers.
  • the execution device 210 can be used in conjunction with other computing devices, such as data storage, routers, load balancers and other devices.
  • the execution device 210 may be arranged on one physical site or distributed on multiple physical sites.
  • the execution device 210 may use the data in the data storage system 250 or call the program code in the data storage system 250 to implement the method for searching the neural network structure of the embodiment of the present application.
  • the execution device 210 may perform the following process: determine a search space and multiple building units; stack the multiple building units to obtain a search network, which is a neural network used to search for a neural network structure;
  • the network structure of the building units in the search network is optimized within the search space to obtain optimized building units, wherein the search space gradually shrinks during the optimization process while the number of building units gradually increases, and the shrinking of the search space together with the increase in the number of building units keeps the video memory consumption generated in the optimization process within a preset range;
  • the target neural network is built from the optimized building units.
  • a target neural network can be built, and the target neural network can be used for image classification or image processing.
  • Each local device can represent any computing device, such as personal computers, computer workstations, smart phones, tablets, smart cameras, smart cars or other types of cellular phones, media consumption devices, wearable devices, set-top boxes, game consoles, etc.
  • Each user's local device can interact with the execution device 210 through a communication network of any communication mechanism/communication standard.
  • the communication network can be a wide area network, a local area network, a point-to-point connection, or any combination thereof.
  • the local device 301 and the local device 302 obtain the relevant parameters of the target neural network from the execution device 210, deploy the target neural network on the local device 301 and the local device 302, and use the target neural network for image classification Or image processing and so on.
  • the target neural network can be directly deployed on the execution device 210.
  • the execution device 210 obtains the image to be processed from the local device 301 and the local device 302, and classifies the image to be processed according to the target neural network or other types of images. deal with.
  • the above-mentioned execution device 210 may also be referred to as a cloud device. At this time, the execution device 210 is generally deployed in the cloud.
  • the method for constructing a neural network in an embodiment of the present application will be described in detail below with reference to FIG. 9.
  • the method shown in FIG. 9 can be executed by a neural network construction device, which can be a device with sufficient computing power, such as a computer or a server.
  • the method shown in FIG. 9 includes steps 1001 to 1004, which are described in detail below.
  • the aforementioned search space is determined according to the application requirements of the target neural network to be constructed; specifically, the aforementioned search space may be determined according to the type of data processed by the target neural network.
  • when the above-mentioned target neural network is used to process image data, the types and number of operations contained in the above-mentioned search space should be adapted to image data processing; when the above-mentioned target neural network is used to process voice data, the types and number of operations contained in the search space should be adapted to voice data processing.
  • the foregoing multiple construction units are determined according to the search space and the size of the video memory resources of the device that constructs the target neural network.
  • the building unit in this application is a network structure obtained by connecting multiple nodes through the basic operation of a neural network, and the building unit is a basic module for building a neural network.
  • the 3 nodes (node 0, node 1, and node 2) located in the dashed box constitute a building unit that can receive the data output by nodes c_{k-2} and c_{k-1} (c_{k-2} and c_{k-1} can also be feature maps that meet the requirements; for example, c_{k-2} and c_{k-1} can be feature maps obtained after the input image has undergone certain convolution processing), and the input data will be processed by nodes 0 and 1.
  • the data output by node 0 will also be input to node 1 for processing, and the data output by node 0 and node 1 will be sent to node 2 for processing; node 2 finally outputs the data processed by the building unit.
  • nodes c_{k-2} and c_{k-1} can be regarded as input nodes; these two nodes input the data to be processed into the building unit. Within the building unit, nodes 0 and 1 are intermediate nodes, and node 2 is the output node.
  • the thick arrow in Figure 10 represents one or more basic operations. The calculation results of the basic operations that are imported into the same intermediate node are added at the intermediate node.
  • the thin arrow in Figure 10 represents the feature map connection in the channel dimension; the output feature map of output node 2 is formed by connecting the outputs of the two intermediate nodes (node 0 and node 1) in order along the channel dimension of the feature map.
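The building unit of Fig. 10 can be sketched as a small directed graph: intermediate nodes sum the results of the operations on their incoming edges, and the output node concatenates the intermediate outputs along the channel dimension. The operations used here (identity, zero, a scaling stand-in) and the edge layout are illustrative assumptions:

```python
import numpy as np

def cell(c_km2, c_km1, ops):
    """Building unit sketch: ops[(src, dst)] is the operation applied on
    the edge src -> dst. Results arriving at the same intermediate node
    are added; the output node concatenates along the channel dimension."""
    node0 = ops[("c_km2", 0)](c_km2) + ops[("c_km1", 0)](c_km1)
    node1 = (ops[("c_km2", 1)](c_km2) + ops[("c_km1", 1)](c_km1)
             + ops[(0, 1)](node0))
    return np.concatenate([node0, node1], axis=0)  # channel-dim concat

identity = lambda x: x
zero = lambda x: np.zeros_like(x)  # the "Zero" candidate operation
scale = lambda x: 0.5 * x          # stand-in for a learned operation

fm = np.ones((2, 4, 4))            # (channels, height, width) feature map
ops = {("c_km2", 0): identity, ("c_km1", 0): scale,
       ("c_km2", 1): zero, ("c_km1", 1): identity, (0, 1): scale}
out = cell(fm, fm, ops)
print(out.shape)  # (4, 4, 4): two intermediate nodes, 2 channels each
```

The spatial size is unchanged while the channel count doubles, which is exactly the concatenation behavior the bullets above describe.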
  • the aforementioned search space may include basic operations or a combination of basic operations in a preset convolutional neural network, and these basic operations or combinations of basic operations may be collectively referred to as basic operations.
  • the above search space can contain the following 8 basic operations:
  • Zero setting operation (Zero, all neurons in the corresponding position are set to zero).
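  • only the zero-setting operation is spelled out in the text above; as a hedged illustration, a search space can be modeled as a mapping from operation names to callables. All entries other than "zero" below are assumed, typical convolutional-search-space placeholders, not the application's actual list of 8 operations.

```python
# Illustrative candidate-operation registry (assumed names and placeholder
# implementations; only the zero operation is described in the text above).

def zero_op(x):
    # "all neurons in the corresponding position are set to zero"
    return [0.0 for _ in x]

search_space = {
    "zero": zero_op,
    "skip_connect": lambda x: list(x),               # assumed
    "avg_pool_3x3": lambda x: list(x),               # assumed placeholder
    "sep_conv_3x3": lambda x: [v * 1.0 for v in x],  # assumed placeholder
}

zeroed = search_space["zero"]([1.0, -2.0, 3.0])
```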
  • the above-mentioned search network is a neural network for searching the structure of a neural network.
  • the foregoing stacking of multiple building units to obtain a search network includes: stacking the multiple building units in sequence in a preset stacking manner to obtain the search network, wherein the output of a building unit located earlier in the search network serves as the input of a building unit located later in the search network.
  • the foregoing preset stacking manner may include what type of building units are stacked at which position, the number of stacks, and so on.
  • the optimization process of optimizing the network structure of the building units in the search network can include N stages. The i-th stage and the j-th stage are any two of the N stages; the size of the search space in the i-th stage is greater than the size of the search space in the j-th stage, and the number of building units included in the search network in the i-th stage is less than the number of building units included in the search network in the j-th stage. The reduction of the search space and the increase in the number of building units of the search network keep the video memory consumption of the optimization process within a preset range.
  • the difference between the number of building units included in the search network in the N-th stage and the number of building units included in the target neural network is within a preset range.
  • the number of building units included in the above target neural network is determined according to the application requirements of the target neural network; N is a positive integer greater than 1, i and j are both positive integers less than or equal to N, and i is less than j.
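  • the N-stage behavior described above can be sketched as a schedule that is checked for the required monotonicity. The concrete numbers below are illustrative assumptions, not values from this application.

```python
# Illustrative N-stage progressive schedule: for any earlier stage i and later
# stage j, the search space shrinks (ops) while the search network deepens
# (cells). The numbers are assumptions for demonstration only.

stages = [
    {"ops": 8, "cells": 5},   # stage 1: full search space, shallow search network
    {"ops": 5, "cells": 11},  # stage 2: fewer candidate operations, deeper network
    {"ops": 3, "cells": 17},  # stage 3: depth close to the target network
]

def check_schedule(stages):
    # search space never grows and cell count never shrinks, stage to stage
    for a, b in zip(stages, stages[1:]):
        assert a["ops"] >= b["ops"]
        assert a["cells"] <= b["cells"]
    return True

ok = check_schedule(stages)
```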
  • the above-mentioned video memory resource may be replaced with a cache resource.
  • a cache resource is a memory or storage unit used by the device constructing the neural network to store the calculation quantities generated during the optimization process.
  • the foregoing cache resources may specifically include video memory resources.
  • the number of building units included in the target neural network is determined according to the type of data to be processed by the target neural network and/or the complexity of calculation.
  • when the above target neural network is used to process some simple text data, it only needs to contain a small number of building units; when it is used to process some more complex image data, it needs to contain a larger number of building units.
  • in general, when the target neural network needs to process data of high computational complexity, it needs to contain a larger number of building units; when it needs to process data of low complexity, a smaller number of building units suffices.
  • the above-mentioned size of N is preset.
  • the size of the above N can be determined according to the construction requirements of the target neural network. Specifically, when the target neural network needs to be constructed in a relatively short time, N can be set to a smaller value; when a relatively long time is available to construct the target neural network, N can be set to a larger value.
  • the video memory resources saved by reducing the search space can be used to increase the number of building units, so that, under the condition of limited video memory resources, a search network whose number of building units is close to that of the target neural network to be built can be stacked as far as possible.
  • in this way, the optimized building units of the search network are better adapted to the construction of the target neural network, and the target neural network built from the optimized building units can better meet the application requirements.
  • this application gradually reduces the size of the search space and increases the number of building units of the search network, so as to construct a target neural network that can better meet the application requirements.
  • the dependence of the optimization process on video memory resources is thereby reduced, so that a target neural network satisfying the application requirements can be obtained with fewer video memory resources, which also improves the utilization rate of video memory resources to a certain extent.
  • the optimized construction unit in the search network is more suitable for building the target neural network.
  • the depth of the neural network is positively correlated with the number of building units it contains. Therefore, when the number of building units of the search network is close to that of the target neural network, the network depth of the search network is also relatively close to that of the target neural network.
  • from the i-th stage to the j-th stage, the search space becomes smaller and the number of building units of the search network increases, and the magnitude of the decrease in the search space may match the magnitude of the increase in the number of building units.
  • the extent of the reduction of the search space from the i-th stage to the j-th stage can be determined according to the increase in the number of building units of the search network from the i-th stage to the j-th stage, or the increase in the number of building units of the search network from the i-th stage to the j-th stage can be determined according to the reduction of the search space from the i-th stage to the j-th stage.
  • the size of the video memory resources can also be combined to determine the extent of the reduction in the search space from the i-th stage to the j-th stage and the increase in the number of construction units of the search network from the i-th stage to the j-th stage.
  • let S_i be the size of the search space in the i-th stage, S_j the size of the search space in the j-th stage, L_i the number of building units included in the search network in the i-th stage, and L_j the number of building units included in the search network in the j-th stage; then the size of L_j - L_i is determined according to the size of S_i - S_j, or the size of S_i - S_j is determined according to the size of L_j - L_i.
  • the size of S_i - S_j can be preset, and the size of L_j - L_i then determined from it, so that the difference between the video memory resources saved by reducing the search space and the additional video memory resources consumed by the increase in building units is within a certain threshold (the smaller the difference, the better).
  • the size of L_j - L_i can also be preset, and the size of S_i - S_j then determined from it, so that the difference between the additional video memory resources consumed by the increase in building units and the video memory resources saved by reducing the search space is within a certain threshold (the smaller the difference, the better).
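  • one way to read the balancing of S_i - S_j against L_j - L_i is as a memory budget: the memory freed by dropping candidate operations pays for the added building units. The per-operation and per-cell costs below are made-up constants for illustration; the application does not specify them.

```python
# Hedged memory-balance sketch: given search-space sizes s_i > s_j and the
# current cell count l_i, estimate how many cells can be added so that the
# freed memory covers their cost. All cost constants are assumptions.

MEM_PER_OP_PER_CELL = 2.0  # assumed memory for one candidate op in one cell
MEM_PER_CELL = 1.0         # assumed fixed overhead per stacked cell

def cells_to_add(s_i, s_j, l_i):
    freed = (s_i - s_j) * l_i * MEM_PER_OP_PER_CELL
    cost_per_new_cell = MEM_PER_CELL + s_j * MEM_PER_OP_PER_CELL
    return int(freed // cost_per_new_cell)

delta = cells_to_add(s_i=8, s_j=5, l_i=5)  # L_j - L_i under these assumptions
```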
  • the second stage compared with the first stage, and the fourth stage compared with the third stage, satisfy: the search space is reduced and the number of building units of the search network is increased.
  • the search space and the number of building units contained in the search network do not change between the second stage and the third stage.
  • the change in the number of building units of the search network between any two adjacent stages is the same, and the change in the size of the search space between any two adjacent stages is also the same.
  • the number of building units added in the (i+1)-th stage relative to the i-th stage may be determined based on the aforementioned value N, the number of building units included in the search network before optimization, and the number of building units of the target neural network.
  • for example, assume that the number of building units added by the search network in the (i+1)-th stage relative to the i-th stage is X, that the number of building units contained in the search network before optimization starts is U, and that the number of building units of the target neural network is V.
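  • the text does not spell out the formula relating X, U, V, and N here; one plausible, purely illustrative reading is that the gap between U and V is spread evenly over the N - 1 stage transitions:

```python
import math

# Hypothetical per-stage increase: X = ceil((V - U) / (N - 1)). This formula
# is an assumption used for illustration, not a statement of the application's
# actual rule.

def per_stage_increase(u, v, n):
    assert n > 1 and v >= u
    return math.ceil((v - u) / (n - 1))

x = per_stage_increase(u=5, v=17, n=3)
```

  under this assumption, a search network starting at U = 5 cells reaches the target's V = 17 cells after two transitions of X = 6 cells each.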
  • the extent of the reduction in the size of the search space and the extent of the increase in the number of building units of the search network can be determined in various ways, as long as the reduction of the search space and the increase in the number of building units of the search network keep the video memory consumption generated in the optimization process within a preset range.
  • the reduction in the size of the search space can be set in advance and the increase in the number of building units of the search network then determined, or the increase in the number of building units of the search network can be preset and the reduction in the size of the search space then determined.
  • This application does not limit this, and all implementations to ensure that the video memory consumption is within the preset range are within the protection scope of this application.
  • the number of first-type operations included in the connection relationships between the nodes of the optimized building unit is within a preset range, where a first-type operation is an operation that does not contain trainable neural network parameters.
  • This application limits the number of operations of the first type to a certain range, so that the trainable parameters of the final target neural network are maintained at a relatively stable level, and the performance of the target neural network remains stable.
  • the number of operations of the first type can also be specifically limited to a certain value, so that the final target neural network includes a fixed number of operations of the first type, so that the performance of the target neural network is more stable.
  • the first type of operation mentioned above is an operation that does not contain trainable parameters. If there are too many such operations, there will be fewer other operations containing trainable parameters, so that the neural network as a whole has fewer trainable parameters and its feature expression ability decreases.
  • the number of first-type operations in the construction units obtained by each search will have a certain difference.
  • limiting the number of first-type operations keeps the trainable parameters of the test network constructed from the searched neural network structure (i.e., the building units) at a relatively stable level, thereby reducing performance fluctuations on the corresponding tasks.
  • the number of operations of the first type included in the connection relationship between the nodes of the optimized construction unit may be limited during the optimization process.
  • for example, the number of first-type operations can be directly limited to a first number.
  • if the number of first-type operations in the building unit is already equal to the first number, the first-type operations are not changed during the optimization process.
  • if the number of first-type operations in the building unit is greater than the first number, some first-type operations can be deleted during the optimization process so that the number remaining after deletion equals the first number; if the number of first-type operations in the building unit is less than the first number, first-type operations can be added during the optimization process so that the number after the addition equals the first number.
  • the above-mentioned process of limiting the number of operations of the first type to a fixed number can be referred to as a standardized process for the number of operations of the first type.
  • the normalization process for the number of first-type operations may, based on preset normalization rules, retain Mc first-type operations in each type of building unit.
  • if the number of first-type operations in the building unit is already Mc, the input building-unit structure is output directly; otherwise, the following process is executed:
  • the network structure parameters corresponding to the first-type operations are sorted in descending order.
  • if the number of first-type operations is less than Mc, the first-type operation with the largest weight that is not yet in the building unit and that conforms to the network structure generation rules is added to the building-unit structure according to those rules, and the corresponding basic operations it replaces are deleted according to the network structure generation rules and network structure parameters; if the number of first-type operations is greater than Mc, the first-type operation with the smallest weight is removed from the building-unit structure, and corresponding other basic operations are added according to the network structure generation rules and network structure parameters. This process is repeated until the number of first-type operations in the building unit equals Mc.
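  • the retain-Mc procedure can be sketched as follows. The (name, weight, is_first_type) edge representation and the promotion rule are illustrative assumptions, not the application's exact data structures.

```python
# Hedged sketch: constrain the number of parameter-free ("first type")
# operations in a building unit to exactly mc, keeping the largest-weight
# first-type operations and, when there are too few, promoting the
# largest-weight remaining candidates (assumed to be replaceable under the
# network structure generation rules).

def normalize_first_type(edges, mc):
    first = sorted((e for e in edges if e[2]), key=lambda e: e[1], reverse=True)
    others = [e for e in edges if not e[2]]
    if len(first) >= mc:
        # too many: drop the smallest-weight first-type operations
        return others + first[:mc]
    # too few: promote the largest-weight other operations to first type
    promote = sorted(others, key=lambda e: e[1], reverse=True)[: mc - len(first)]
    kept = [e for e in others if e not in promote]
    promoted = [(name, w, True) for name, w, _ in promote]
    return kept + first + promoted

edges = [("skip_a", 0.9, True), ("skip_b", 0.2, True), ("conv_a", 0.5, False)]
result = normalize_first_type(edges, mc=1)  # keeps only the strongest skip
```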
  • the first type of operation described above may specifically be a skip-connect operation or a zero-setting operation.
  • the above search network can contain multiple types of building units. The following briefly introduces the common building units included in the search network.
  • the building units in the search network include the first type of building units.
  • the first type of building unit is a building unit in which the number (specifically, the number of channels) and size of the input feature maps are the same as the number and size of the output feature maps.
  • for example, if the input of a certain first-type building unit is a feature map of size C×D1×D2 (C is the number of channels, D1 and D2 are the width and height respectively), then after processing by the first-type building unit the size of the output feature map is still C×D1×D2.
  • the above-mentioned first type of building unit may specifically be a normal cell.
  • the building unit in the search network includes the second type of building unit.
  • the resolution of the output feature map of the second type of building unit is 1/M of that of the input feature map, the number of output feature maps of the second type of building unit is M times the number of input feature maps, and M is a positive integer greater than 1.
  • the value of M can generally be 2, 4, 6, or 8.
  • for example, if the input of a certain second-type building unit is a feature map of size C×D1×D2 (C is the number of channels, D1 and D2 are the width and height respectively, and the product of D1 and D2 can represent the resolution of the feature map), then after processing by the second-type building unit, a feature map whose number of channels is MC and whose resolution is 1/M of that of the input is obtained.
  • the above-mentioned second type of building unit may specifically be a down-sampling unit (reduction cell).
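  • the shape behavior of the two cell types can be written down explicitly. The convention that a reduction cell divides each spatial side by M while multiplying the channel count by M is an assumption (a common down-sampling convention); the text above only states that the resolution shrinks to 1/M and the channels grow by M.

```python
# Illustrative shape arithmetic for the two building-unit types. The exact
# way the 1/M resolution reduction is split between width and height is an
# assumption here.

def normal_cell_shape(c, d1, d2):
    # first-type cell: input and output shapes are identical
    return (c, d1, d2)

def reduction_cell_shape(c, d1, d2, m=2):
    # second-type cell: channels x m, each spatial side divided by m (assumed)
    return (c * m, d1 // m, d2 // m)

ns = normal_cell_shape(16, 32, 32)
rs = reduction_cell_shape(16, 32, 32, m=2)
```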
  • the structure of the search network may be as shown in FIG. 11.
  • the search network is formed by stacking 5 building units in turn.
  • first-type building units are located at the front and the end of the search network, and there is a second-type building unit between every two first-type building units.
  • the first building unit in the search network in Figure 11 can process the input image. After this first-type building unit processes the image, the processed feature map is input to the second-type building unit for processing, and so on backwards, until the last first-type building unit in the search network outputs the feature map.
  • the feature map output by the last first-type construction unit of the search network is sent to the classifier for processing, and the classifier classifies the image according to the feature map.
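  • the Figure 11 stack and the resulting shape propagation can be sketched as follows; the alternating N-R-N-R-N pattern follows the description above, while the concrete input shape and the reduction convention (channels ×2, sides halved) are assumptions.

```python
# Illustrative five-cell stack from Figure 11: first-type (N) cells at the
# front and back, a second-type (R) cell between every two first-type cells.

def build_stack(n_cells=5):
    return ["N" if i % 2 == 0 else "R" for i in range(n_cells)]

def propagate(shape, stack, m=2):
    c, d1, d2 = shape
    for cell in stack:
        if cell == "R":  # assumed reduction convention: channels x m, sides / m
            c, d1, d2 = c * m, d1 // m, d2 // m
        # "N" cells leave the shape unchanged
    return (c, d1, d2)

stack = build_stack()
out_shape = propagate((16, 32, 32), stack)  # shape entering the classifier
```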
  • the type of neural network to be constructed can be determined according to the task requirements of the neural network to be constructed (that is, the task type of the task to be processed by the neural network to be constructed).
  • the size of the search space and the number of construction units are determined, and the construction units are stacked to obtain the search network.
  • the network structure of the building unit in the search network can be optimized (training data can be used for optimization during the optimization process).
  • the optimization of the network structure of the building units can be divided into a progressive network structure search and an operation-quantity normalization process (that is, limiting the quantity of a certain operation to a certain range; in this application, this mainly means limiting the quantity of first-type operations to a certain range).
  • the progressive network structure search gradually reduces the size of the search space during the optimization process and gradually increases the number of building units, so as to obtain a search network whose number of building units is close to that of the neural network to be constructed (for the specific process, see the related description of the method shown in Figure 9 above).
  • the operation quantity specification process can be used to ensure that the quantity of the first type operation connections in the optimized construction unit is within a certain preset range.
  • This progressive network structure search and operation quantity specification process is equivalent to the optimization process of step 1003 in the method shown in FIG. 9.
  • FIG. 13 shows the process of the neural network construction system executing the neural network structure search method of the embodiment of the application. The content shown in Figure 13 will be described in detail below.
  • the neural network construction system shown in FIG. 13 mainly includes an operation warehouse 101, a progressive network structure search module 102, and an operation quantity specification module 103.
  • the operation warehouse 101 may include a preset basic operation in the convolutional neural network.
  • the progressive network structure search module 102 is used to optimize the network structure of the construction unit of the search network.
  • the search network 1022 itself is continuously updated by increasing the number of stacked building units 1021 and reducing the size of the search space, so as to realize continuous optimization of the network structure of the building units of the search network.
  • the operation quantity specification module 103 mainly restricts the quantity of a certain operation within a certain range.
  • the operation quantity specification module 103 mainly restricts the quantity of the first type of operation within a certain range.
  • the size of the operation warehouse 101 (equivalent to the search space above) and the initial number of building units can be determined according to the target task, and the search network can then be obtained by stacking building units according to that initial number.
  • the progressive structure search module 102 can then be used to optimize the search network; in the optimization process, the size of the search space is gradually reduced and the number of stacked building units is increased, yielding the searched building units.
  • the operation quantity specification module 103 restricts the first type of operations in the building units obtained by the progressive network structure search module 102 to a certain range, so as to obtain optimized building units. These optimized building units can be used for Build the final target neural network.
  • the processes performed by the progressive network structure search module 102 and the operation quantity specification module 103 are equivalent to the optimization process of step 1003 in the method shown in FIG. 9.
  • for the specific optimization process, refer to the related description of step 1003.
  • the specific process of the optimization operation performed by the progressive network structure search module 102 may be as shown in FIG. 14.
  • Figure 14 simplifies the actual operation to a certain extent, showing only the search process of the first-type building unit (specifically, the normal cell); the schematic of the first-type building unit is also simplified to show the search process rather than the specific structure.
  • Each arrow line in the figure represents a basic operation, and the number of types of operations is simplified in the schematic diagram; the number boxes represent nodes, and the nodes in this example are the feature maps of the convolutional neural network.
  • connection between nodes is composed of all possible basic operations in the pre-defined search space.
  • Figure 14 uses five basic operations, which are represented by five arrowed lines.
  • the learned network structure parameters are obtained.
  • the weights of the corresponding basic operations between node 0 and node 1 of the first type of construction unit are 0.21, 0.26, 0.18, 0.03, and 0.32, respectively (the weights are shown in FIG. 14).
  • according to the preset number of basic operations to be deleted, one or more operations with the smallest weights can be deleted.
  • the arrow line with the smallest weight (in the initial stage in Figure 14, the fourth arrow line between node 0 and node 1) represents the basic operation that is deleted, and the remaining operations are retained in the building-unit structure output at this stage. Note that, between different node pairs, the basic operations retained according to the corresponding network structure weights are not necessarily the same.
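  • the pruning step can be sketched with the weights quoted above: the candidate operation with the smallest learned weight on the edge is dropped and the rest survive. The index-based representation is illustrative.

```python
# Illustrative pruning of the weakest candidate operation on the edge from
# node 0 to node 1, using the example weights quoted in the text.

weights = [0.21, 0.26, 0.18, 0.03, 0.32]

def prune_smallest(weights, n_drop=1):
    # indices of candidate operations sorted by ascending weight; drop the
    # n_drop weakest and return the surviving indices in original order
    keep = sorted(range(len(weights)), key=lambda i: weights[i])[n_drop:]
    return sorted(keep)

survivors = prune_smallest(weights)  # the fourth operation (index 3) is deleted
```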
  • in the final stage, in addition to the same building-unit generation rules as in the other stages, additional rules are applied when generating the network structure, so that the generated building-unit structure has structural characteristics matching the corresponding task.
  • for example, the rule may be that each node retains at most two basic input operations; according to this rule and the corresponding network structure parameters, none of the basic operations between node 1 and node 3 are retained.
  • the finally generated first type of construction unit is shown in the bold arrowed line and corresponding nodes in the final stage in FIG. 14.
  • the generated building unit structure and corresponding network structure parameters and their corresponding operation types are output to subsequent modules or processes together.
  • the operation quantity specification module 103 shown in FIG. 13 is used to constrain the quantity of first-type operations within a fixed range (specifically, the quantity of first-type operations may be directly constrained to a certain value). The specific process executed by the operation quantity specification module 103 is described below in conjunction with FIG. 15.
  • FIG. 15 is a schematic diagram of the processing procedure of the operation quantity specification module of the embodiment of the present application.
  • the input is the building-unit structure output by the progressive network structure search module, together with the corresponding network structure parameters and their operation types.
  • the output is the building-unit structure after the number of operations has been normalized. It should be understood that the number of first-type operations in the building-unit structure output by the operation quantity specification module 103 is limited to a fixed number.
  • the specific execution process of the foregoing operation quantity specification module 103 includes:
  • step S1: judge whether the number M of first-type operations in the building unit is equal to Mc; if so, output the building-unit structure directly; otherwise, proceed to step S2.
  • step S2: if M is less than Mc, the first-type operation with the largest weight that does not belong to the building-unit structure and that conforms to the network structure generation rules replaces the corresponding other basic operation; if M is greater than Mc, the first-type operation with the smallest weight is removed from the building-unit structure and replaced by the corresponding other basic operation.
  • step S3: after the building unit is generated in step S2, it is sent back to step S1 to continue the judgment.
  • the first type of operation may specifically be a jump connection operation.
  • Table 1 compares, under similar constraints, the neural network constructed using the neural network construction method of the embodiment of this application with neural networks designed or searched using other methods.
  • Table 1 also gives the search time of each compared neural network structure.
  • CIFAR10, CIFAR100, ImageNet Top1, and ImageNet Top5 in Table 1 represent classification accuracy rates.
  • CIFAR10, CIFAR100, and ImageNet are different data sets.
  • Top1 and Top5 are sub-indicators: a prediction counts as correct under Top1 (or Top5) if the true label is the highest-scored result (or among the 5 highest-scored results).
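  • the Top1 / Top5 sub-indicators can be made precise with a small helper: a sample counts as correct under Top-k when its true label appears among the k highest-scored classes. The scores below are made-up.

```python
# Illustrative top-k correctness check for the Table 1 metrics.

def topk_correct(scores, label, k):
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return label in ranked[:k]

scores = [0.10, 0.05, 0.30, 0.25, 0.20, 0.04, 0.02, 0.01, 0.02, 0.01]
top1 = topk_correct(scores, label=3, k=1)  # class 3 is only second-ranked
top5 = topk_correct(scores, label=3, k=5)
```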
  • NASNet-A, AmoebaNet-B, ENAS, PNAS, and DARTS (2ND) respectively represent different network structures, and the size of the search overhead can be expressed by the time required for a single GPU to run (here, the time is generally expressed in days).
  • the classification accuracy of the neural network constructed using the neural network construction method of the embodiment of the application is higher than the classification accuracy of the neural network designed or searched by other methods on the image classification data set.
  • the search cost is smaller, which can save more resources in the search process.
  • Table 2 shows the comparison of the performance of the neural network structure on the public data set before using the operation quantity specification process and after using the operation quantity specification process.
  • Run 1 represents the accuracy of the first test
  • Run 2 represents the accuracy of the second test.
  • Run 3 represents the accuracy of the third test.
  • the average accuracy rate in Table 2 represents the average accuracy rate from the first test to the third test
  • the standard deviation in Table 2 is the standard deviation of the accuracy rate from the first test to the third test. It can be seen from Table 2 that the performance and stability of the neural network obtained after using the operation-quantity normalization process are significantly improved compared with before it is used.
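  • the Table 2 statistics reduce to the mean and standard deviation over the three runs; the run values below are illustrative stand-ins (not Table 2's numbers), and whether Table 2 uses the population or sample standard deviation is assumed.

```python
import statistics

# Illustrative computation of the Table 2 summary statistics over three runs.
runs = [97.1, 97.3, 97.2]      # assumed Run 1..3 accuracies, not Table 2's
avg = statistics.mean(runs)    # "average accuracy rate"
std = statistics.pstdev(runs)  # "standard deviation" (population form assumed)
```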
  • the neural network construction method of the embodiment of the application is described in detail above in conjunction with the accompanying drawings.
  • the neural network constructed by the neural network construction method of the embodiment of this application can be used for image processing (for example, image classification); these specific applications are introduced below.
  • FIG. 16 is a schematic flowchart of an image processing method according to an embodiment of the present application. The method shown in Figure 16 includes:
  • the above-mentioned target neural network may be constructed according to the method shown in FIG. 9.
  • the optimized construction unit of the search network can be better adapted to the construction of the target neural network, and a better performance target neural network can be obtained.
  • using the target neural network for image classification can achieve better image classification results (for example, more accurate classification results).
  • the change in the number of building units of the search network between any two adjacent stages is the same, and the change in the size of the search space between any two adjacent stages is also the same.
  • the above-mentioned target neural network is a neural network obtained by training through training pictures.
  • the target neural network can be trained by training pictures and the category information marked by the training pictures, and the trained neural network can be used for image classification.
  • FIG. 17 is a schematic diagram of the hardware structure of a neural network construction device provided by an embodiment of the present application.
  • the neural network construction device 3000 shown in FIG. 17 includes a memory 3001, a processor 3002, a communication interface 3003, and a bus 3004.
  • the memory 3001, the processor 3002, and the communication interface 3003 implement communication connections between each other through the bus 3004.
  • the memory 3001 may be a read only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM).
  • the memory 3001 may store a program. When the program stored in the memory 3001 is executed by the processor 3002, the processor 3002 is configured to execute each step of the neural network construction method of the embodiment of the present application.
  • the processor 3002 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, and is used to execute related programs to implement the neural network construction method of the method embodiment of this application.
  • the processor 3002 may also be an integrated circuit chip with signal processing capabilities.
  • each step of the neural network construction method of the present application can be completed by hardware integrated logic circuits in the processor 3002 or instructions in the form of software.
  • the above-mentioned processor 3002 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a storage medium mature in the field, such as a random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or register.
  • the storage medium is located in the memory 3001, and the processor 3002 reads the information in the memory 3001, combines its hardware to complete the functions required by the units included in the neural network construction device, or executes the neural network construction method of the method embodiment of the application .
  • the communication interface 3003 uses a transceiver device such as but not limited to a transceiver to implement communication between the device 3000 and other devices or communication networks. For example, the information of the neural network to be constructed and the training data needed in the process of constructing the neural network can be obtained through the communication interface 3003.
  • the bus 3004 may include a path for transferring information between various components of the device 3000 (for example, the memory 3001, the processor 3002, and the communication interface 3003).
  • FIG. 18 is a schematic diagram of the hardware structure of an image processing apparatus according to an embodiment of the present application.
  • the image processing apparatus 4000 shown in FIG. 18 includes a memory 4001, a processor 4002, a communication interface 4003, and a bus 4004. The memory 4001, the processor 4002, and the communication interface 4003 are communicatively connected to each other through the bus 4004.
  • the memory 4001 may be a ROM, a static storage device, or a RAM.
  • the memory 4001 may store a program. When the program stored in the memory 4001 is executed by the processor 4002, the processor 4002 and the communication interface 4003 are used to execute each step of the image processing method of the embodiment of the present application.
  • the processor 4002 may adopt a general-purpose CPU, a microprocessor, an ASIC, a GPU, or one or more integrated circuits to execute related programs, so as to realize the functions required by the units in the image processing apparatus of the embodiments of the present application, or to execute the image processing method of the method embodiments of this application.
  • the processor 4002 may also be an integrated circuit chip with signal processing capability.
  • each step of the image processing method of the embodiments of the present application can be completed by an integrated hardware logic circuit in the processor 4002 or by instructions in the form of software.
  • the aforementioned processor 4002 may also be a general-purpose processor, DSP, ASIC, FPGA or other programmable logic device, discrete gate or transistor logic device, or discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present application can be implemented or executed.
  • the general-purpose processor may be a microprocessor, or any conventional processor or the like.
  • the steps of the methods disclosed in the embodiments of the present application may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules within the decoding processor.
  • the software module may reside in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register.
  • the storage medium is located in the memory 4001; the processor 4002 reads the information in the memory 4001 and, in combination with its hardware, completes the functions required by the units included in the image processing apparatus of the embodiments of this application, or executes the image processing method of the method embodiments of this application.
  • the communication interface 4003 uses a transceiver device, such as but not limited to a transceiver, to implement communication between the device 4000 and other devices or a communication network.
  • the image to be processed can be acquired through the communication interface 4003.
  • the bus 4004 may include a path for transferring information between various components of the device 4000 (for example, the memory 4001, the processor 4002, and the communication interface 4003).
  • FIG. 19 is a schematic diagram of the hardware structure of the neural network training device according to an embodiment of the present application. Similar to the aforementioned device 3000 and device 4000, the neural network training device 5000 shown in FIG. 19 includes a memory 5001, a processor 5002, a communication interface 5003, and a bus 5004. The memory 5001, the processor 5002, and the communication interface 5003 are communicatively connected to each other through the bus 5004.
  • after a neural network is constructed by the neural network construction device shown in FIG. 17, it can be trained by the neural network training device 5000 shown in FIG. 19, and the trained neural network can then be used to implement the image processing method of the embodiments of this application.
  • the device shown in FIG. 19 can obtain training data and the neural network to be trained from the outside through the communication interface 5003, and the processor then trains the neural network according to the training data.
  • although the device 3000, the device 4000, and the device 5000 are shown with only a memory, a processor, and a communication interface, in a specific implementation process those skilled in the art should understand that these devices may also include other components necessary for normal operation. According to specific needs, they may further include hardware devices implementing other additional functions. Moreover, the device 3000, the device 4000, and the device 5000 may include only the components necessary to implement the embodiments of the present application, and need not include all the components shown in FIG. 17, FIG. 18, and FIG. 19.
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division into units is only a logical functional division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • if the functions are implemented in the form of a software functional unit and sold or used as an independent product, they can be stored in a computer-readable storage medium.
  • the technical solutions of this application, in essence, or the part contributing to the prior art, or a part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage media include any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
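The training flow described for the device 5000 above — training data and a network to be trained arrive through the communication interface, after which the processor fits the network to the data — can be sketched as a generic loop. This is a minimal illustration only: the 1-D linear model, the `train` function, and the learning-rate choice are assumptions made for the sketch, not part of the embodiments.

```python
# Minimal sketch (illustrative only) of the flow of training device 5000:
# training data and initial parameters arrive over the communication
# interface; the processor then fits the parameters to the data.

def train(params, data, lr=0.1, epochs=50):
    """Plain SGD for a 1-D linear model y = w * x + b."""
    w, b = params
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y  # prediction error on one sample
            w -= lr * err * x      # gradient step for the weight
            b -= lr * err          # gradient step for the bias
    return w, b

# Data generated by y = 2x + 1; training should recover w close to 2, b close to 1.
trained_w, trained_b = train((0.0, 0.0), [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
```

Any real trained network would then be handed off, as described above, to implement the image processing method.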

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a neural network construction method and to an image processing method and devices in the fields of computer vision and artificial intelligence. The neural network construction method comprises: determining a search space and a plurality of building units; stacking the plurality of building units to obtain a search network, the search network being a neural network used for neural architecture search; optimizing, within the search space, the architecture of the building units in the search network so as to obtain optimized building units, wherein, during the optimization process, the search space is gradually reduced and the number of building units is gradually increased, the reduction of the search space and the increase in the number of building units allowing the graphics memory consumed during the optimization process to remain within a preset range; and constructing the target neural network according to the optimized building units. The present invention can construct a neural network that better meets application requirements while graphics memory resources are ensured.
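The search procedure summarized in the abstract — stack building units into a search network, then optimize while progressively shrinking the search space and increasing the number of building units so that graphics-memory consumption stays within a preset range — can be sketched as follows. The operation names, the scoring input, and the toy memory model are illustrative assumptions, not the patented implementation.

```python
# Illustrative sketch of the progressive search loop from the abstract:
# each stage drops low-scoring candidate operations (shrinking the search
# space) and stacks more building units (deepening the search network),
# while a toy memory model keeps the estimated cost within a preset range.

CANDIDATE_OPS = ["skip", "sep_conv_3x3", "sep_conv_5x5",
                 "dil_conv_3x3", "max_pool_3x3", "avg_pool_3x3"]

def memory_cost(num_units, num_ops, cost_per_unit_op=0.35):
    # Toy model: consumption grows with units x remaining operations.
    return num_units * num_ops * cost_per_unit_op

def progressive_search(stages, memory_budget, scores):
    ops, num_units = list(CANDIDATE_OPS), 5
    for _ in range(stages):
        # (1) Architecture optimization of the current search network is
        #     omitted here; `scores` stands in for the learned weights.
        # (2) Shrink the search space: keep the highest-scoring operations.
        keep = max(2, len(ops) - 2)
        ops = sorted(ops, key=lambda op: scores[op], reverse=True)[:keep]
        # (3) Increase the number of building units in the search network.
        num_units += 3
        assert memory_cost(num_units, len(ops)) <= memory_budget, \
            "stage would exceed the preset graphics-memory range"
    return ops, num_units
```

With three stages and a budget of 12 memory units, the candidate set shrinks from six operations to two while the search network deepens from 5 to 14 building units, each intermediate stage staying within the budget — the trade-off that lets a deeper search network fit in a fixed amount of graphics memory.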
PCT/CN2020/087222 2019-04-28 2020-04-27 Neural network construction method, image processing method and devices Ceased WO2020221200A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910351894.1A CN110175671B (zh) 2019-04-28 2019-04-28 神经网络的构建方法、图像处理方法及装置
CN201910351894.1 2019-04-28

Publications (1)

Publication Number Publication Date
WO2020221200A1 true WO2020221200A1 (fr) 2020-11-05

Family

ID=67690236

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/087222 2019-04-28 2020-04-27 Neural network construction method, image processing method and devices

Country Status (2)

Country Link
CN (1) CN110175671B (fr)
WO (1) WO2020221200A1 (fr)


Families Citing this family (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110175671B (zh) * 2019-04-28 2022-12-27 华为技术有限公司 神经网络的构建方法、图像处理方法及装置
CN110428046B (zh) * 2019-08-28 2023-12-15 腾讯科技(深圳)有限公司 神经网络结构的获取方法及装置、存储介质
CN112445823B (zh) * 2019-09-04 2024-11-26 华为技术有限公司 神经网络结构的搜索方法、图像处理方法和装置
CN112465135A (zh) * 2019-09-06 2021-03-09 华为技术有限公司 数据处理方法、装置、计算机可读存储介质和芯片
CN110705599B (zh) * 2019-09-06 2021-10-19 北京理工大学 一种基于在线迁移学习的人体动作识别方法
CN110543944B (zh) * 2019-09-11 2022-08-02 北京百度网讯科技有限公司 神经网络结构搜索方法、装置、电子设备和介质
CN110599999A (zh) * 2019-09-17 2019-12-20 寇晓宇 数据交互方法、装置和机器人
CN112633460A (zh) * 2019-09-24 2021-04-09 华为技术有限公司 构建神经网络的方法与装置、及图像处理方法与装置
EP4030347A4 (fr) 2019-09-24 2022-11-16 Huawei Technologies Co., Ltd. Procédé et dispositif de construction de réseau neuronal, et procédé et dispositif de traitement d'image
CN112561027B (zh) * 2019-09-25 2025-02-07 华为技术有限公司 神经网络架构搜索方法、图像处理方法、装置和存储介质
US11681902B2 (en) * 2019-09-27 2023-06-20 Amazon Technologies, Inc. Transposed convolution using systolic array
CN110728359B (zh) * 2019-10-10 2022-04-26 北京百度网讯科技有限公司 搜索模型结构的方法、装置、设备和存储介质
CN112749778B (zh) * 2019-10-29 2023-11-28 北京灵汐科技有限公司 一种强同步下的神经网络映射方法及装置
CN110851566B (zh) * 2019-11-04 2022-04-29 沈阳雅译网络技术有限公司 一种应用于命名实体识别的可微分网络结构搜索的方法
CN111047563B (zh) * 2019-11-26 2023-09-01 深圳度影医疗科技有限公司 一种应用于医学超声图像的神经网络构建方法
CN111027714B (zh) * 2019-12-11 2023-03-14 腾讯科技(深圳)有限公司 基于人工智能的对象推荐模型训练方法、推荐方法及装置
CN111159542B (zh) * 2019-12-12 2023-05-05 中国科学院深圳先进技术研究院 一种基于自适应微调策略的跨领域序列推荐方法
CN111178316B (zh) * 2020-01-06 2022-04-15 武汉大学 一种高分辨率遥感影像土地覆盖分类方法
CN112513837A (zh) * 2020-01-22 2021-03-16 深圳市大疆创新科技有限公司 网络结构搜索方法和装置
CN111382868B (zh) * 2020-02-21 2024-06-18 华为技术有限公司 神经网络结构搜索方法和神经网络结构搜索装置
CN111401516B (zh) * 2020-02-21 2024-04-26 华为云计算技术有限公司 一种神经网络通道参数的搜索方法及相关设备
CN113361680B (zh) * 2020-03-05 2024-04-12 华为云计算技术有限公司 一种神经网络架构搜索方法、装置、设备及介质
CN113469891A (zh) * 2020-03-31 2021-10-01 武汉Tcl集团工业研究院有限公司 一种神经网络架构搜索方法、训练方法、图像补全方法
CN111488971B (zh) * 2020-04-09 2023-10-24 北京百度网讯科技有限公司 神经网络模型搜索方法及装置、图像处理方法及装置
CN111523596B (zh) * 2020-04-23 2023-07-04 北京百度网讯科技有限公司 目标识别模型训练方法、装置、设备以及存储介质
CN111612134B (zh) * 2020-05-20 2024-04-12 鼎富智能科技有限公司 神经网络结构搜索方法、装置、电子设备及存储介质
CN111797983B (zh) * 2020-05-25 2024-12-03 华为技术有限公司 一种神经网络构建方法以及装置
CN111666763B (zh) * 2020-05-28 2024-12-13 平安科技(深圳)有限公司 用于多任务场景的网络结构构建方法和装置
CN111680599B (zh) * 2020-05-29 2023-08-08 北京百度网讯科技有限公司 人脸识别模型处理方法、装置、设备和存储介质
CN113902088A (zh) * 2020-06-22 2022-01-07 华为技术有限公司 神经网络结构搜索的方法、装置与系统
CN111931904A (zh) * 2020-07-10 2020-11-13 华为技术有限公司 神经网络的构建方法和装置
CN111797999A (zh) * 2020-07-10 2020-10-20 深圳前海微众银行股份有限公司 纵向联邦建模优化方法、装置、设备及可读存储介质
CN111898510B (zh) * 2020-07-23 2023-07-28 合肥工业大学 一种基于渐进式神经网络的跨模态行人再识别方法
CN111898061B (zh) * 2020-07-29 2023-11-28 抖音视界有限公司 搜索网络的方法、装置、电子设备和计算机可读介质
CN111754532B (zh) * 2020-08-12 2023-07-11 腾讯科技(深圳)有限公司 图像分割模型搜索方法、装置、计算机设备及存储介质
CN112070213A (zh) * 2020-08-28 2020-12-11 Oppo广东移动通信有限公司 神经网络模型的优化方法、装置、设备及存储介质
CN112200223B (zh) * 2020-09-22 2024-11-08 北京迈格威科技有限公司 图像识别网络构建方法、装置、设备及介质
CN112381733B (zh) * 2020-11-13 2022-07-01 四川大学 面向图像恢复的多尺度神经网络结构搜索方法及网络应用
CN113052812B (zh) * 2021-03-22 2022-06-24 山西三友和智慧信息技术股份有限公司 一种基于AmoebaNet的MRI前列腺癌检测方法
CN115114470B (zh) * 2022-05-10 2025-05-13 腾讯医疗健康(深圳)有限公司 一种模型搜索方法、装置、设备及存储介质、程序产品
CN114926698B (zh) * 2022-07-19 2022-10-14 深圳市南方硅谷半导体股份有限公司 基于演化博弈论的神经网络架构搜索的图像分类方法
CN115527089B (zh) * 2022-08-11 2025-08-26 东华大学 基于Yolo的目标检测模型训练方法及其应用和装置
CN115475388B (zh) * 2022-09-20 2024-11-05 上海硬通网络科技有限公司 一种游戏画风搜索网络的训练、游戏搜索方法及相关设备

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304921A (zh) * 2018-02-09 2018-07-20 北京市商汤科技开发有限公司 卷积神经网络的训练方法及图像处理方法、装置
WO2018156942A1 (fr) * 2017-02-23 2018-08-30 Google Llc Optimisation d'architectures de réseau neuronal
CN109165720A (zh) * 2018-09-05 2019-01-08 深圳灵图慧视科技有限公司 神经网络模型压缩方法、装置和计算机设备
US20190026639A1 (en) * 2017-07-21 2019-01-24 Google Llc Neural architecture search for convolutional neural networks
CN109472366A (zh) * 2018-11-01 2019-03-15 郑州云海信息技术有限公司 一种机器学习模型的编码解码方法与装置
CN110175671A (zh) * 2019-04-28 2019-08-27 华为技术有限公司 神经网络的构建方法、图像处理方法及装置

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2462380B (en) * 2007-03-14 2012-02-15 Halliburton Energy Serv Inc Neural-network based surrogate model construction methods and applications thereof
US10699186B2 (en) * 2015-12-02 2020-06-30 Google Llc Determining orders of execution of a neural network
CN109615073B (zh) * 2018-12-03 2021-06-04 郑州云海信息技术有限公司 一种神经网络模型的构建方法、设备以及存储介质


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
REAL, ESTEBAN ET AL.: "Regularized Evolution for Image Classifier Architecture Search", arXiv:1802.01548v7 [cs], 16 February 2019 (2019-02-16), XP081222849 *
ZOPH, BARRET ET AL.: "Learning Transferable Architectures for Scalable Image Recognition", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 22 June 2018 (2018-06-22), XP055617714 *

Cited By (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11487288B2 (en) 2017-03-23 2022-11-01 Tesla, Inc. Data synthesis for autonomous control systems
US12020476B2 (en) 2017-03-23 2024-06-25 Tesla, Inc. Data synthesis for autonomous control systems
US12086097B2 (en) 2017-07-24 2024-09-10 Tesla, Inc. Vector computational unit
US11409692B2 (en) 2017-07-24 2022-08-09 Tesla, Inc. Vector computational unit
US11681649B2 (en) 2017-07-24 2023-06-20 Tesla, Inc. Computational array microprocessor system using non-consecutive data formatting
US11893393B2 (en) 2017-07-24 2024-02-06 Tesla, Inc. Computational array microprocessor system with hardware arbiter managing memory requests
US11403069B2 (en) 2017-07-24 2022-08-02 Tesla, Inc. Accelerated mathematical engine
US12216610B2 (en) 2017-07-24 2025-02-04 Tesla, Inc. Computational array microprocessor system using non-consecutive data formatting
US12307350B2 (en) 2018-01-04 2025-05-20 Tesla, Inc. Systems and methods for hardware-based pooling
US11797304B2 (en) 2018-02-01 2023-10-24 Tesla, Inc. Instruction set architecture for a vector computational unit
US11561791B2 (en) 2018-02-01 2023-01-24 Tesla, Inc. Vector computational unit receiving data elements in parallel from a last row of a computational array
US12455739B2 (en) 2018-02-01 2025-10-28 Tesla, Inc. Instruction set architecture for a vector computational unit
US11734562B2 (en) 2018-06-20 2023-08-22 Tesla, Inc. Data pipeline and deep learning system for autonomous driving
US11841434B2 (en) 2018-07-20 2023-12-12 Tesla, Inc. Annotation cross-labeling for autonomous control systems
US12079723B2 (en) 2018-07-26 2024-09-03 Tesla, Inc. Optimizing neural network structures for embedded systems
US11636333B2 (en) 2018-07-26 2023-04-25 Tesla, Inc. Optimizing neural network structures for embedded systems
US11562231B2 (en) 2018-09-03 2023-01-24 Tesla, Inc. Neural networks for embedded devices
US12346816B2 (en) 2018-09-03 2025-07-01 Tesla, Inc. Neural networks for embedded devices
US11983630B2 (en) 2018-09-03 2024-05-14 Tesla, Inc. Neural networks for embedded devices
US11893774B2 (en) 2018-10-11 2024-02-06 Tesla, Inc. Systems and methods for training machine models with augmented data
US11665108B2 (en) 2018-10-25 2023-05-30 Tesla, Inc. QoS manager for system on a chip communications
US11816585B2 (en) 2018-12-03 2023-11-14 Tesla, Inc. Machine learning models operating at different frequencies for autonomous vehicles
US12367405B2 (en) 2018-12-03 2025-07-22 Tesla, Inc. Machine learning models operating at different frequencies for autonomous vehicles
US11537811B2 (en) 2018-12-04 2022-12-27 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
US12198396B2 (en) 2018-12-04 2025-01-14 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
US11908171B2 (en) 2018-12-04 2024-02-20 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
US11610117B2 (en) 2018-12-27 2023-03-21 Tesla, Inc. System and method for adapting a neural network model on a hardware platform
US12136030B2 (en) 2018-12-27 2024-11-05 Tesla, Inc. System and method for adapting a neural network model on a hardware platform
US11748620B2 (en) 2019-02-01 2023-09-05 Tesla, Inc. Generating ground truth for machine learning from time series elements
US12223428B2 (en) 2019-02-01 2025-02-11 Tesla, Inc. Generating ground truth for machine learning from time series elements
US12014553B2 (en) 2019-02-01 2024-06-18 Tesla, Inc. Predicting three-dimensional features for autonomous driving
US12164310B2 (en) 2019-02-11 2024-12-10 Tesla, Inc. Autonomous and user controlled vehicle summon to a target
US11567514B2 (en) 2019-02-11 2023-01-31 Tesla, Inc. Autonomous and user controlled vehicle summon to a target
US12236689B2 (en) 2019-02-19 2025-02-25 Tesla, Inc. Estimating object properties using visual image data
US11790664B2 (en) 2019-02-19 2023-10-17 Tesla, Inc. Estimating object properties using visual image data
CN112784962A (zh) * 2021-01-21 2021-05-11 北京百度网讯科技有限公司 超网络的训练方法、装置、电子设备和存储介质
US12456038B2 (en) * 2021-01-21 2025-10-28 Genesys Logic, Inc. Computing circuit and data processing method based on convolutional neural network and computer readable storage medium
US20220230055A1 (en) * 2021-01-21 2022-07-21 Genesys Logic, Inc. Computing circuit and data processing method based on convolutional neural network and computer readable storage medium
CN112907446B (zh) * 2021-02-07 2022-06-07 电子科技大学 一种基于分组连接网络的图像超分辨率重建方法
CN112907446A (zh) * 2021-02-07 2021-06-04 电子科技大学 一种基于分组连接网络的图像超分辨率重建方法
CN112949827A (zh) * 2021-02-25 2021-06-11 商汤集团有限公司 神经网络生成、数据处理、智能行驶控制方法及装置
CN112949827B (zh) * 2021-02-25 2024-05-21 商汤集团有限公司 神经网络生成、数据处理、智能行驶控制方法及装置
CN113033773B (zh) * 2021-03-03 2023-01-06 北京航空航天大学 面向旋转机械故障诊断的分层多叉网络结构高效搜索方法
CN113033773A (zh) * 2021-03-03 2021-06-25 北京航空航天大学 面向旋转机械故障诊断的分层多叉网络结构高效搜索方法
CN113159115B (zh) * 2021-03-10 2023-09-19 中国人民解放军陆军工程大学 基于神经架构搜索的车辆细粒度识别方法、系统和装置
CN113159115A (zh) * 2021-03-10 2021-07-23 中国人民解放军陆军工程大学 基于神经架构搜索的车辆细粒度识别方法、系统和装置
CN113076938B (zh) * 2021-05-06 2023-07-25 广西师范大学 一种结合嵌入式硬件信息的深度学习目标检测方法
CN113076938A (zh) * 2021-05-06 2021-07-06 广西师范大学 一种结合嵌入式硬件信息的深度学习目标检测方法
CN113822426A (zh) * 2021-07-02 2021-12-21 腾讯科技(深圳)有限公司 神经网络结构搜索方法、装置、计算机设备和存储介质
US12462575B2 (en) 2021-08-19 2025-11-04 Tesla, Inc. Vision-based machine learning model for autonomous driving with adjustable virtual camera
CN114418828A (zh) * 2021-12-23 2022-04-29 北京百度网讯科技有限公司 显存管理方法、装置、设备、存储介质及程序产品
CN114841361A (zh) * 2022-03-26 2022-08-02 华为技术有限公司 一种模型训练方法及其相关设备
CN116797541A (zh) * 2023-05-14 2023-09-22 哈尔滨理工大学 一种基于Transformer的肺部CT图像超分辨率重建方法

Also Published As

Publication number Publication date
CN110175671B (zh) 2022-12-27
CN110175671A (zh) 2019-08-27

Similar Documents

Publication Publication Date Title
CN110175671B (zh) Neural network construction method, image processing method and apparatus
US20220215227A1 (en) Neural Architecture Search Method, Image Processing Method And Apparatus, And Storage Medium
US20230028237A1 (en) Method and apparatus for training image processing model
US20230206069A1 (en) Deep Learning Training Method for Computing Device and Apparatus
WO2021008206A1 (fr) Procédé de recherche d'architecture neuronale, et procédé et dispositif de traitement d'images
CN111797983B (zh) 一种神经网络构建方法以及装置
WO2021218517A1 (fr) Procédé permettant d'acquérir un modèle de réseau neuronal et procédé et appareil de traitement d'image
WO2021120719A1 (fr) Procédé de mise à jour de modèle de réseau neuronal, procédé et dispositif de traitement d'image
US20230153615A1 (en) Neural network distillation method and apparatus
WO2022083536A1 (fr) Procédé et appareil de construction de réseau neuronal
WO2021043193A1 (fr) Procédé de recherche de structures de réseaux neuronaux et procédé et dispositif de traitement d'images
US20230141145A1 (en) Neural network building method and apparatus
WO2022052601A1 (fr) Procédé d'apprentissage de modèle de réseau neuronal ainsi que procédé et dispositif de traitement d'image
WO2021022521A1 (fr) Procédé de traitement de données et procédé et dispositif d'apprentissage de modèle de réseau neuronal
WO2021244249A1 (fr) Procédé, système et dispositif d'instruction de classificateur et procédé, système et dispositif de traitement de données
WO2019228358A1 (fr) Procédé et appareil d'entraînement de réseau neuronal profond
WO2021164750A1 (fr) Procédé et appareil de quantification de couche convolutive
WO2021233342A1 (fr) Procédé et système de construction de réseau de neurones artificiels
US20220327835A1 (en) Video processing method and apparatus
WO2023071658A1 (fr) Procédé et appareil de traitement de modèle d'ia et procédé et appareil de calcul de modèle d'ia
WO2021129668A1 (fr) Procédé d'apprentissage de réseau neuronal et dispositif
WO2021227787A1 (fr) Procédé et appareil de formation de prédicteur de réseau neuronal, et procédé et appareil de traitement d'image
WO2023231794A1 (fr) Procédé et appareil de quantification de paramètres de réseau neuronal
CN114861859B (zh) Neural network model training method, data processing method and apparatus
WO2024160215A1 (fr) Procédé et appareil de traitement de données

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20798704

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20798704

Country of ref document: EP

Kind code of ref document: A1