
WO2022021083A1 - Image processing method, image processing device, and computer readable storage medium - Google Patents

Info

Publication number
WO2022021083A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
quantization parameter
neural network
input
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2020/105263
Other languages
French (fr)
Chinese (zh)
Inventor
曾志豪
曹子晟
应礼剑
李志强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SZ DJI Technology Co Ltd
Original Assignee
SZ DJI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SZ DJI Technology Co Ltd
Priority to PCT/CN2020/105263
Publication of WO2022021083A1
Anticipated expiration
Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining

Definitions

  • The present application relates to the technical field of image processing, and in particular to an image processing method, an image processing apparatus, and a computer-readable storage medium.
  • Neural networks have achieved great success in image processing and in speech recognition and processing, and are widely used in embedded devices. Because a neural network often contains many parameters, running it consumes a large amount of storage and computing resources, which makes it difficult to meet requirements for real-time performance and low power consumption on low-cost embedded devices.
  • Quantizing the neural network is one way to increase operation speed and reduce energy consumption. However, after a neural network is quantized by existing quantization methods, the accuracy of its output is insufficient.
  • One purpose of the present application is to solve the technical problem that the output of a neural network is insufficiently precise after quantization by existing quantization methods.
  • A first aspect of the embodiments of the present application provides an image processing method applied to a neural network, the method including:
  • the image to be processed is used as input to the neural network;
  • quantization processing is performed on the image to be processed, so that its image data is quantized into fixed-point data whose precision is adapted to the target dynamic range.
  • A second aspect of the embodiments of the present application provides an image processing apparatus applied to a neural network, including a processor and a memory storing a computer program;
  • the processor implements the following steps when executing the computer program:
  • the image to be processed is used as input to the neural network;
  • quantization processing is performed on the image to be processed, so that its image data is quantized into fixed-point data whose precision is adapted to the target dynamic range.
  • A third aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, any image processing method of the first aspect above is implemented.
  • Different quantization parameters are set for input images with different dynamic ranges, so that the image data of the image to be processed can be quantized with the quantization parameters suited to its dynamic range. The fixed-point image data obtained after quantization therefore has sufficient precision, which reduces the accuracy loss caused by quantization and meets the requirements of high-precision image regression tasks such as image denoising and super-resolution.
  • FIG. 1 is a flowchart of an image processing method provided by an embodiment of the present application.
  • FIG. 2 is a system block diagram of an early stage of an image processing method provided by an embodiment of the present application.
  • FIG. 3 is a system block diagram of an application stage of an image processing method provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present application.
  • Deep learning techniques have achieved great success in image processing, speech recognition and processing, and are widely used in embedded devices.
  • A neural network is a deep learning algorithm. Because a neural network often contains many parameters, running it consumes a large amount of storage and computing resources, which makes it difficult to meet real-time performance and low-power requirements when it is implemented on low-cost embedded devices.
  • Quantizing the neural network is a mainstream way to address this.
  • Quantization approximates data from one range of values by another; for example, floating-point data can be quantized to fixed-point data.
  • Data can be represented in two forms: floating-point numbers and fixed-point numbers.
  • A fixed-point number can be understood as a number with a fixed number of fractional bits. For example, in a 16-bit binary value, 5 bits can be fixed to store the fractional part and the remaining 11 bits can store the integer part.
  • A floating-point number can be understood as a number whose number of fractional bits is not fixed and can float as required.
  • Quantization converts floating-point numbers to fixed-point numbers, and this conversion can be regarded as a process of determining the number of fractional bits.
  • The number of fractional bits determines the precision of the data, and the number of integer bits determines the magnitude the data can express. Because the total number of bits is usually fixed (for example 4, 8, 16, or 32 bits), the more fractional bits there are, the fewer integer bits remain. The ideal representation therefore has just enough integer bits to express the value, with all remaining bits used as fractional bits for higher precision.
  • The quantization parameters determine the number of fractional bits and integer bits of the data obtained after quantization.
  • When one set of quantization parameters is used for all images, the quantized image data must have enough integer bits to express the actual pixel values of images with a large dynamic range, so the number of fractional bits is correspondingly smaller.
  • For an image with a small dynamic range, representing the pixel values does not require so many integer bits, but because the same set of quantization parameters is used, the image data has surplus integer bits and too few fractional bits. The precision of the data is insufficient, and ultimately the accuracy of the neural network's output is insufficient as well.
  • FIG. 1 is a flowchart of the image processing method provided by the embodiment of the present application.
  • The image processing method can be applied to a neural network, such as a convolutional neural network, and includes the following steps:
  • The image to be processed is the image to be fed to the neural network.
  • The neural network is a pre-trained neural network; for the specific training process, reference may be made to the prior art, and it is not repeated here.
  • The neural network may implement any function, such as image denoising, face recognition, super-resolution, or scene recognition, which is not limited here.
  • The dynamic range of an image may be the range of its pixel values. For example, an image whose maximum pixel value is 60 and whose minimum pixel value is 0 has a dynamic range that may be expressed as 0-60.
  • Different quantization parameter sets can be adapted to different dynamic ranges.
  • These candidate quantization parameter sets may be predetermined; the specific determination method is described later.
  • The image data of the image can be quantized into fixed-point data whose precision suits the dynamic range of the image.
  • In one case, the number of integer bits of the fixed-point data is just enough to express the pixel values of the image.
  • For example, with a total of 8 bits and a maximum pixel value of 15, 4 integer bits are just enough to express the pixel values of the image, and the remaining 4 bits are fractional bits. The data precision of the image is then the highest.
  • The fixed-point data may also have 5 or 6 integer bits.
  • Even when the number of integer bits is slightly more than just enough to express the pixel values of the image, the precision of the fixed-point data can still be considered to fit the dynamic range of the image.
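  • The trade-off between integer bits and fractional bits can be illustrated with a minimal sketch. It assumes an unsigned fixed-point format and saturating conversion; the function names `split_bits`, `to_fixed`, and `from_fixed` are hypothetical and are not taken from the application.

```python
def split_bits(max_pixel_value: int, total_bits: int = 8) -> tuple[int, int]:
    """Return (integer_bits, fractional_bits) for an unsigned fixed-point format."""
    # Fewest integer bits n such that max_pixel_value < 2**n (at least 1 bit).
    integer_bits = max(1, int(max_pixel_value).bit_length())
    if integer_bits > total_bits:
        raise ValueError("value does not fit in the given word length")
    return integer_bits, total_bits - integer_bits


def to_fixed(value: float, fractional_bits: int, total_bits: int = 8) -> int:
    """Encode a non-negative value as an unsigned fixed-point integer."""
    raw = round(value * (1 << fractional_bits))
    return max(0, min((1 << total_bits) - 1, raw))  # saturate to the word length


def from_fixed(raw: int, fractional_bits: int) -> float:
    """Decode an unsigned fixed-point integer back to a real value."""
    return raw / (1 << fractional_bits)


# Maximum pixel value 15 with 8 bits in total, as in the example above.
int_bits, frac_bits = split_bits(15)
tight = from_fixed(to_fixed(3.3, frac_bits), frac_bits)  # 4 fractional bits
loose = from_fixed(to_fixed(3.3, 2), 2)                  # only 2 fractional bits
```

  • With 4 fractional bits the value 3.3 is stored with a smaller error than with 2 fractional bits, which is the precision loss that surplus integer bits cause.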
  • After the image data of the image to be processed is quantized into precision-adapted fixed-point data (for convenience, adaptation of precision to dynamic range is simply referred to as precision-adapted), the fixed-point image data of the image to be processed can be input to the neural network.
  • Different quantization parameters are set for input images with different dynamic ranges, so that the image data of the image to be processed can be quantized with the quantization parameters suited to its dynamic range. The fixed-point image data obtained after quantization then has sufficient precision without distortion (that is, its value is consistent with the actual magnitude), which reduces the accuracy loss caused by quantization and meets the high-precision requirements of image regression tasks such as image denoising and super-resolution.
  • The image to be processed may be an image block of an original image.
  • The original image may be the object of the image processing task. It can be processed in blocks to obtain multiple image blocks, and each image block can be the image to be processed described in the embodiments of the present application.
  • For each image block, the image processing method provided in the embodiments of the present application can be used to quantize the block and input it to the neural network for forward propagation and other processing, so as to obtain the output image that the neural network produces for that block.
  • The output images corresponding to the image blocks can be fused to obtain a complete target output image, which is the output result corresponding to the original image.
  • The original image can be divided into blocks in various ways.
  • For example, the original image can be divided evenly, so that all image blocks have the same size; it can also be divided unevenly, so that the image blocks may differ in size.
  • Where image blocks overlap, the overlapping areas can be deduplicated so that only one copy of the image data for each area is retained, and the deduplicated image blocks are then fused to obtain the target output image.
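  • Block division and fusion with deduplication of overlapping areas can be sketched as follows. This is only an illustration under stated assumptions: the image is a 2-D list of pixel values, blocks are square with a fixed overlap, and the first copy of each overlapping pixel is the one retained. The names `split_blocks` and `fuse_blocks` are hypothetical.

```python
def split_blocks(image, block, overlap=0):
    """Split a 2-D image into (top, left, tile) blocks of at most block x block."""
    if not 0 <= overlap < block:
        raise ValueError("overlap must be smaller than the block size")
    height, width = len(image), len(image[0])
    step = block - overlap
    tiles = []
    for top in range(0, height, step):
        for left in range(0, width, step):
            tile = [row[left:left + block] for row in image[top:top + block]]
            tiles.append((top, left, tile))
    return tiles


def fuse_blocks(tiles, height, width):
    """Fuse blocks back into one image, keeping one copy of overlapping pixels."""
    fused = [[None] * width for _ in range(height)]
    for top, left, tile in tiles:
        for r, row in enumerate(tile):
            for c, value in enumerate(row):
                if fused[top + r][left + c] is None:  # deduplicate overlaps
                    fused[top + r][left + c] = value
    return fused


image = [[r * 10 + c for c in range(5)] for r in range(4)]  # 4 x 5 test image
tiles = split_blocks(image, block=3, overlap=1)
restored = fuse_blocks(tiles, height=4, width=5)
```

  • In an actual pipeline each tile would be replaced by the output image the neural network produces for it before fusion; here the tiles are fused unchanged to show that the split and fuse steps are consistent.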
  • The original network parameters of the trained neural network may be floating-point network parameters. Because the quantized image to be processed that is input to the neural network is fixed-point data, the original network parameters can be converted into fixed-point network parameters before the quantized image is input, so that the fixed-point image data and the fixed-point network parameters can be used for forward propagation under fixed-point arithmetic.
  • The network parameters of the neural network may include the network parameters of each layer, and may specifically include one or more of weight parameters, bias parameters, and the like.
  • Different neural networks can have different network parameters.
  • A neural network can include one or more kinds of layers, such as convolutional layers, fully connected layers, and nonlinear activation layers.
  • The network parameters corresponding to each layer can differ, which is not limited in the embodiments of the present application.
  • In one implementation, the original network parameters of the neural network can be quantized to obtain the corresponding fixed-point network parameters.
  • Various quantization methods are available, such as linear mapping, shifting, and truncation.
  • Uniform linear mapping is used as the example in the description below.
  • Consider a floating-point weight parameter whose value range is [X_min, X_max].
  • Suppose the floating-point weight parameter needs to be quantized to an 8-bit representation, that is, to the value range [0, 255].
  • The floating-point number can then be converted to a fixed-point number using two quantization parameters, s and z.
  • The quantization formula involves the following quantities: x is the floating-point value before quantization, y is the fixed-point value after quantization, s and z are the quantization parameters (s is the quantization step size and z is the zero point), and round() is the rounding function.
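  • The uniform linear mapping can be sketched as follows. The formula itself does not appear in this text, so the sketch assumes one commonly used form, y = clamp(round(x / s) + z, 0, 255) with s = (X_max - X_min) / 255 and z = round(-X_min / s). The function names and the clamping step are illustrative assumptions, not the application's stated method.

```python
def linear_quant_params(x_min: float, x_max: float, num_bits: int = 8):
    """Return (s, z): step size and zero point for the range [x_min, x_max]."""
    if x_max <= x_min:
        raise ValueError("x_max must be greater than x_min")
    q_max = (1 << num_bits) - 1          # 255 for an 8-bit representation
    s = (x_max - x_min) / q_max          # quantization step size
    z = round(-x_min / s)                # fixed-point value that represents 0.0
    return s, z


def quantize(x: float, s: float, z: int, num_bits: int = 8) -> int:
    """Map a floating-point value to a fixed-point value in [0, 2**num_bits - 1]."""
    q_max = (1 << num_bits) - 1
    y = round(x / s) + z                 # assumed form of the mapping
    return max(0, min(q_max, y))         # clamp values outside the range


s, z = linear_quant_params(-0.5, 1.5)
low = quantize(-0.5, s, z)    # lower end of the range
high = quantize(1.5, s, z)    # upper end of the range
clipped = quantize(10.0, s, z)
```

  • Under these assumptions the two ends of [X_min, X_max] map to the two ends of [0, 255], and values outside the range saturate.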
  • The quantization parameters corresponding to different quantization methods may differ.
  • For uniform linear mapping the quantization parameters are the quantization step size s and the zero point z, but other quantization methods do not necessarily use these same parameters.
  • In one embodiment, the quantization parameters may be determined by minimizing the KL divergence between the data distribution before quantization and the data distribution after quantization. In another embodiment, they can be determined from the actual maximum and minimum values observed in the data.
  • Different quantization parameter sets may correspond to different dynamic range intervals; for example, the first quantization parameter set may correspond to the interval [0, 63], the second to the interval [0, 127], and so on.
  • The above intervals are only examples; the specific interval corresponding to a quantization parameter set can be set by those skilled in the art as required, which is not limited here.
  • The minimum interval to which the target dynamic range of the image to be processed belongs can be determined, and the quantization parameter set corresponding to that minimum interval can then be taken as the target quantization parameter set.
  • For example, suppose the dynamic range of the image to be processed is 0-64 and the dynamic range intervals are [0, 63] and [0, 127]. In one embodiment, the minimum interval to which the target dynamic range belongs can be determined directly as [0, 127]. In another implementation, considering that the quantization parameter set corresponding to [0, 127] gives lower precision and that the range 0-64 exceeds [0, 63] only slightly, the dynamic range 0-64 can be reduced to a certain extent, so that the minimum interval to which the reduced target dynamic range belongs is [0, 63]. In another example, the target dynamic range may instead be enlarged to an appropriate degree, and the minimum interval to which the enlarged target dynamic range belongs is then determined.
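  • Selecting the minimum interval, with an optional reduction of the dynamic range, can be sketched as follows. The `tolerance` parameter, which shrinks the range by a fixed amount at each end, is a hypothetical way to realize "reduced to a certain extent"; the text does not specify how the reduction is performed.

```python
def select_interval(dynamic_range, intervals, tolerance=0.0):
    """Return the narrowest interval that contains the (optionally shrunk) range."""
    low, high = dynamic_range
    # Hypothetical reduction: pull both ends inward by `tolerance`.
    low_adj, high_adj = low + tolerance, high - tolerance
    if low_adj > high_adj:
        low_adj = high_adj = (low + high) / 2
    fits = [iv for iv in intervals if iv[0] <= low_adj and high_adj <= iv[1]]
    if not fits:
        raise ValueError("no candidate interval covers the dynamic range")
    return min(fits, key=lambda iv: iv[1] - iv[0])


candidates = [(0, 63), (0, 127)]
direct = select_interval((0, 64), candidates)                 # no reduction
reduced = select_interval((0, 64), candidates, tolerance=1)   # range shrunk to 1-63
```

  • The quantization parameter set associated with the returned interval would then serve as the target quantization parameter set.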
  • The image to be processed is the input image of the input layer of the neural network.
  • The fixed-point image data of the image to be processed can be input to the neural network for forward propagation.
  • During forward propagation, quantization processing may also be performed on the input images of the other layers of the neural network.
  • The input image of another layer can be the output image of the previous layer, and the operation performed by each layer can be regarded as feature extraction on its input image.
  • For example, the input image of the Nth layer is convolved with the convolution kernels of the Nth layer to obtain a feature map, which can be used as the input image of the (N+1)th layer.
  • One quantization parameter set may include multiple quantization parameter subsets, and each quantization parameter subset may correspond to one layer of the neural network.
  • For example, the quantization parameter subset of the 0th layer can be used to quantize the input image of the 0th layer (the input layer), that is, the image to be processed, and the quantization parameter subset of the second layer can be used to quantize the input image of the second layer.
  • The quantized input image of the Nth layer can be input to the Nth layer, where operations such as convolution with the network parameters of the Nth layer are performed.
  • The quantization method described in the above embodiments can also be described as layer-by-layer quantization.
  • In layer-by-layer quantization, the convolution kernels of the same layer share the same set of quantization parameters.
  • A channel-by-channel quantization method can also be used: on the basis of layer-by-layer quantization, each convolution kernel in each layer of the neural network is additionally quantized separately.
  • Taking a first convolution kernel, which can be any convolution kernel in the neural network, as an example, the first convolution kernel may be quantized according to the quantization parameters corresponding to it.
  • For example, suppose a layer has 32 convolution kernels, each of size 5*5*16, where 5*5 is the spatial size of the convolution kernel and 16 is its number of channels. Quantization parameters can be determined separately for each of the 32 convolution kernels, and each kernel can be quantized according to its own parameters. After the input image of the Nth layer is quantized with the quantization parameter subset of the Nth layer, the quantized input image can then be convolved with each quantized convolution kernel of the Nth layer.
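  • The difference between sharing one set of quantization parameters across a layer and determining them per convolution kernel can be sketched as follows. The sketch assumes parameters are derived from the minimum and maximum weight values under uniform linear mapping, and represents each kernel as a flat list of weights; all names are hypothetical.

```python
def minmax_params(values, num_bits=8):
    """Return (s, z) from the minimum and maximum of `values` (assumed method)."""
    q_max = (1 << num_bits) - 1
    low, high = min(values), max(values)
    s = (high - low) / q_max or 1.0      # fall back to 1.0 for a constant kernel
    z = round(-low / s)
    return s, z


def per_layer_params(kernels):
    """One (s, z) pair shared by every kernel of the layer."""
    return minmax_params([w for kernel in kernels for w in kernel])


def per_kernel_params(kernels):
    """A separate (s, z) pair for each convolution kernel."""
    return [minmax_params(kernel) for kernel in kernels]


# Two toy kernels with very different weight ranges.
kernels = [[-0.1, 0.05, 0.1], [-2.0, 0.5, 2.0]]
shared_s, _ = per_layer_params(kernels)
separate = per_kernel_params(kernels)
```

  • When kernels differ widely in range, the kernel with small weights gets a much finer step size under per-kernel parameters than under a shared set.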
  • The candidate quantization parameter sets may be predetermined, and the stage in which they are determined may be referred to as the early stage.
  • Multiple dynamic range intervals may be preset, such as the intervals [0, 63] and [0, 127] described earlier.
  • For each interval, its corresponding quantization parameter set can be determined.
  • Taking a first interval, which can be any of the dynamic range intervals, as an example: a plurality of sample images whose dynamic ranges belong to the first interval can be obtained, and the quantization parameter set corresponding to the first interval can be determined from those sample images.
  • In one implementation, the sample images can be obtained by dividing material images into blocks. After the dynamic range is divided into intervals, some intervals may cover only a small dynamic range; dividing the material images into blocks makes it easier to obtain sample images whose dynamic range falls in such intervals.
  • When the quantization parameter set includes multiple quantization parameter subsets, each corresponding to one layer of the neural network, the quantization parameter set for the first interval can be determined by inputting each sample image whose dynamic range belongs to the first interval into the neural network for forward propagation and obtaining the input image of each layer of the neural network.
  • The input image of the input layer is the sample image itself, and the input image of the Nth layer is the output image of the (N-1)th layer.
  • For each layer, a quantization parameter (a quantization parameter subset) for quantizing the image data of that layer's input image can be determined, such that the image data of the input image is quantized into fixed-point data whose precision matches the dynamic range of the input image.
  • The specific ways of determining the quantization parameters have been described above, such as minimizing the KL divergence between the data distributions before and after quantization, or using the actual maximum and minimum values.
  • The network parameters of the neural network may be floating-point network parameters, and the sample images input to the neural network may also be floating-point image data.
  • In one implementation, the neural network can directly use its original floating-point network parameters.
  • In another implementation, the original network parameters may first be quantized to obtain fixed-point network parameters; the specific way of doing so has been described above and is not repeated here.
  • The fixed-point network parameters can then be inversely quantized to obtain floating-point network parameters, which are used for operations such as convolution with the sample images.
  • The inverse quantization can be derived by inverting the method used for quantization.
  • For example, if the method used for quantization is uniform linear mapping, the inverse quantization formula involves the following quantities: y is the fixed-point value after quantization, x' is the floating-point value after inverse quantization, and s and z are the quantization parameters.
  • In the application stage, the neural network to which the image to be processed is input uses fixed-point network parameters. To match the application stage and make the determined quantization parameter sets more applicable, the neural network in the early stage can therefore use, instead of the original network parameters, the floating-point network parameters obtained by inverse quantization of the fixed-point network parameters as described above.
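  • Quantizing the network parameters and then inversely quantizing them can be sketched as follows. The inverse formula does not appear in this text; the sketch assumes the form x' = s * (y - z), which inverts the commonly used mapping y = round(x / s) + z. The names `quant_params`, `dequantize`, and `quantize_then_dequantize` are hypothetical.

```python
def quant_params(values, num_bits=8):
    """Return (s, z) from the minimum and maximum of `values` (assumed method)."""
    q_max = (1 << num_bits) - 1
    low, high = min(values), max(values)
    s = (high - low) / q_max or 1.0      # fall back to 1.0 for constant data
    return s, round(-low / s)


def quantize(x, s, z, num_bits=8):
    """Assumed uniform linear mapping to [0, 2**num_bits - 1]."""
    q_max = (1 << num_bits) - 1
    return max(0, min(q_max, round(x / s) + z))


def dequantize(y, s, z):
    """Assumed inverse quantization: x' = s * (y - z)."""
    return s * (y - z)


def quantize_then_dequantize(weights):
    """Floating-point weights that carry the rounding error of fixed-point storage."""
    s, z = quant_params(weights)
    return [dequantize(quantize(w, s, z), s, z) for w in weights]


weights = [-0.5, 0.3, 1.5]
s, z = quant_params(weights)
restored = quantize_then_dequantize(weights)
```

  • The restored weights differ from the originals by at most about half a quantization step, so activations computed with them in the early stage resemble those the fixed-point network produces in the application stage.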
  • The quantization parameter set may also be determined in real time during the forward propagation of the neural network.
  • For example, the quantization parameter subset of the 0th layer can be determined from the dynamic range of the image to be processed, and the image to be processed can be quantized according to that subset. The resulting fixed-point image data can be input to the 0th layer (the input layer) to obtain the output image of the 0th layer, which is the input image of the first layer.
  • The quantization parameter subset of the first layer can be determined from the dynamic range of the input image of the first layer, and the input image of the first layer can be quantized according to that subset. The quantized input image can then be input to the first layer to obtain the output image of the first layer, which is the input image of the second layer.
  • The quantization parameter subset of the second layer can be determined from the dynamic range of the input image of the second layer.
  • Determining the quantization parameter set in real time during forward propagation consumes more storage and computing resources, because the output image of each layer must be cached so that the quantization parameters can be determined from its dynamic range.
  • By contrast, pre-determining the candidate quantization parameter sets in the early stage occupies less storage and computing resources in the application stage.
  • The neural network to which the image is input is a neural network that has been trained and has a specific function.
  • During training, the image data input to the neural network may be preprocessed.
  • For example, the image data used for training may undergo zero-mean processing and/or normalization before being input to the neural network, to speed up the convergence of the network parameters during training. Therefore, for a trained neural network, the image data of images such as images to be processed or sample images can also be preprocessed before being input.
  • The pixel values of each channel of the input image can be preprocessed by a formula involving the following quantities: x represents the input image, i represents the channel of the input image, y represents the output image, the parameter k represents the scaling factor of the pixel values, and the parameter b represents the bias (shift coefficient) of the pixel values. The parameters k and b are floating-point numbers.
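  • The per-channel preprocessing can be sketched as follows. The formula does not appear in this text; the sketch assumes the affine form y_i = k * x_i + b applied to every pixel of channel i, and whether k and b vary per channel is not stated, so a single pair is used here. The function name `preprocess` and the example values are hypothetical.

```python
def preprocess(channels, k: float, b: float):
    """Apply y_i = k * x_i + b to every pixel of every channel (assumed form)."""
    return [[k * pixel + b for pixel in channel] for channel in channels]


# Two channels of raw 8-bit pixel values, flattened for brevity.
raw = [[0, 255], [128, 64]]
# With k = 1 and b = -128, values in [0, 255] are shifted to [-128, 127].
shifted = preprocess(raw, k=1.0, b=-128.0)
```

  • A shift of this kind is one way preprocessed pixel values can become negative, which is why signed dynamic range intervals may be needed.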
  • This embodiment may include an early stage and an application stage.
  • FIG. 2 is a system block diagram of the early stage of an image processing method provided by an embodiment of the present application.
  • In the early stage, a plurality of material images to be input to the convolutional neural network for forward propagation may be obtained, and preprocessing such as zero-mean processing and/or normalization may be performed on them.
  • Each preprocessed material image can be divided into blocks to obtain multiple image blocks, and each image block can be used as a sample image.
  • The dynamic range of each sample image can be measured, and multiple dynamic range intervals can be defined from the statistics. For example, if the dynamic ranges of the sample images are found to be distributed within [-128, 127] (because the sample images are preprocessed, their pixel values can be negative), multiple intervals can be defined within [-128, 127], for example the six intervals [-64, 63], [-64, 0), [0, 63], [-128, 127], [-128, 0), and [0, 127].
  • For the original network parameters of the convolutional neural network, quantization can be performed first to obtain fixed-point network parameters; the fixed-point network parameters can then be inversely quantized to obtain floating-point network parameters, which are applied to the convolutional neural network.
  • Each sample image is input into the convolutional neural network for forward propagation, and the output of each layer of the convolutional neural network is obtained. Because the output of the Nth layer is also the input of the (N+1)th layer, obtaining the output of each layer can also be regarded as obtaining the image data of the input image of each layer (the input image of the input layer is the sample image itself, and the input image of any other layer can be a feature map obtained by feature extraction on the sample image).
  • Either layer-by-layer quantization or channel-by-channel quantization can be chosen.
  • Layer-by-layer quantization means that each layer corresponds to one set of quantization parameters, and channel-by-channel quantization means that each convolution kernel corresponds to one set of quantization parameters. The following description takes layer-by-layer quantization as an example.
  • For each sample image, the minimum interval to which its dynamic range belongs can be determined; when determining the minimum interval, the dynamic range of the sample image can be appropriately scaled.
  • The image data of the input image of each layer, obtained after a sample image is input to the convolutional neural network, can be classified according to the interval to which the sample image belongs. For each interval, the quantization parameters corresponding to the image data of the input image of each layer can then be determined, giving the quantization parameter subset corresponding to that layer in that interval.
  • For example, for the sample images belonging to the interval [-64, 63], the image data of the input images of the Nth layer is obtained, and from that image data the quantization parameter subset corresponding to the Nth layer in the interval [-64, 63] is determined.
  • The quantization parameter subsets corresponding to all layers in the interval [-64, 63] constitute the quantization parameter set corresponding to that interval.
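  • The early-stage procedure can be sketched as a small calibration loop. This is a toy illustration under several assumptions: images are flat lists of pixel values, layers are plain Python functions standing in for convolution layers, and each quantization parameter subset is an (s, z) pair derived from the minimum and maximum values seen at that layer's input. All names are hypothetical.

```python
def smallest_interval(low, high, intervals):
    """Return the narrowest (low, high) interval that contains [low, high]."""
    fits = [iv for iv in intervals if iv[0] <= low and high <= iv[1]]
    if not fits:
        raise ValueError("no interval covers the sample")
    return min(fits, key=lambda iv: iv[1] - iv[0])


def calibrate(samples, layers, intervals, num_bits=8):
    """Map each interval to per-layer (s, z) pairs, or None where no data was seen."""
    q_max = (1 << num_bits) - 1
    ranges = {iv: [None] * len(layers) for iv in intervals}
    for sample in samples:
        iv = smallest_interval(min(sample), max(sample), intervals)
        x = sample
        for n, layer in enumerate(layers):
            low, high = min(x), max(x)       # range of the input to layer n
            seen = ranges[iv][n]
            ranges[iv][n] = (low, high) if seen is None else (
                min(seen[0], low), max(seen[1], high))
            x = layer(x)                     # output of layer n feeds layer n + 1
    params = {}
    for iv, per_layer in ranges.items():
        subsets = []
        for r in per_layer:
            if r is None:
                subsets.append(None)
                continue
            s = (r[1] - r[0]) / q_max or 1.0
            subsets.append((s, round(-r[0] / s)))
        params[iv] = subsets
    return params


# Toy stand-ins for network layers; a real layer would perform convolution.
layers = [lambda x: [2 * v for v in x], lambda x: [v + 1 for v in x]]
samples = [[0, 10, 60], [0, 100]]
params = calibrate(samples, layers, [(0, 63), (0, 127)])
```

  • Each entry of `params` plays the role of a quantization parameter set for one interval, and each element of its list plays the role of a quantization parameter subset for one layer.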
  • FIG. 3 is a system block diagram of the application stage of an image processing method provided by an embodiment of the present application.
  • By the application stage, the quantization parameter sets corresponding to the different dynamic range intervals have been determined.
  • Image preprocessing can first be performed on the original image.
  • The preprocessed original image can be divided into blocks, and each image block can be called an image to be processed.
  • For each image to be processed, the minimum interval to which its dynamic range belongs is determined, and the quantization parameter set corresponding to that minimum interval is taken as the target quantization parameter set.
  • The target quantization parameter set can be applied to the convolutional neural network, together with the fixed-point network parameters obtained by quantizing the original network parameters in the early stage.
  • The target quantization parameter set includes a quantization parameter subset for each layer. At the 0th layer (the input layer), the image to be processed can be quantized according to the quantization parameter subset corresponding to the 0th layer.
  • The fixed-point image data of the image to be processed can then be input to the 0th layer.
  • The output of the 0th layer is the image data of the input image of the first layer.
  • The input image of the first layer can be quantized according to the quantization parameter subset corresponding to the first layer, and the quantized input image can be processed by the first layer.
  • The subsequent layers are processed in the same way and are not described one by one here.
  • In the convolutional neural network, the fixed-point image data and the fixed-point network parameters undergo convolution and other operations layer by layer under fixed-point arithmetic, and the convolutional neural network finally outputs the output image corresponding to the image to be processed.
  • The output images corresponding to the image blocks can then be fused to obtain the final target output image.
  • the image to be processed can also be obtained by the following manner: the region of interest in the original image can be determined, the region of interest can be segmented or cut out, and the obtained image block can be used as the image to be processed.
  • the region of interest can be an area that draws a high degree of attention from the human eye. Since the human eye pays more attention to these areas, they place higher requirements on processing details; if the processing accuracy is insufficient, flaws are easily detected by the human eye.
  • the region of interest can be determined according to the content of the image, for example, it can be the subject of a person, an animal, etc. in the image, or it can be a specific area of the person in the image, such as the face, hair, and clothing.
  • the system may determine it by itself through an algorithm.
  • the original image may be semantically segmented to determine the object category contained in the original image.
  • Object categories can be the aforementioned characters, animals, or parts of characters, such as hair, faces, and other different categories.
  • For each object category its corresponding attention degree can be determined.
  • the degree of attention can be determined according to a preset correspondence between object categories and degrees of attention. Of course, the degree of attention can also be calculated by formulas and other methods. After the degree of attention is determined, the area with the highest degree of attention can be regarded as the region of interest.
  • the region of interest may also be an area selected by the user.
  • the user may be provided with information about each object in the identified image, and the user may select the object of interest.
  • Various selection tools can also be provided to the user, so that the user can define the region of interest directly.
  • the processing of the region of interest can have higher precision, and the region of interest in the output image can have more delicate effects, such as presenting more details. Moreover, only processing the region of interest can achieve the effect of saving computing power compared to processing the entire image.
  • in the image processing method provided by the embodiment of the present application, different quantization parameters are set for input images with different dynamic ranges, so that the image data of the image to be processed can be quantized using quantization parameters adapted to the dynamic range of that image. The fixed-point image data obtained after quantization can thus have sufficient precision without distortion (that is, its values are consistent with the actual magnitudes), which reduces the precision loss caused by quantization, gives the output of the neural network higher accuracy, and can meet the high-precision requirements of image regression tasks such as image denoising and super-resolution.
  • FIG. 4 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present application.
  • the image processing apparatus can be applied to a neural network, including: a processor 410 and a memory 420 storing a computer program;
  • the processor implements the following steps when executing the computer program:
  • the image to be processed is intended for input to the neural network
  • quantization processing is performed on the to-be-processed image, and the image data of the to-be-processed image is quantized into fixed-point data whose precision is adapted to the target dynamic range.
  • the to-be-processed image is an image block in the original image
  • the processor is further configured to perform fusion processing on the output images corresponding to the image blocks in the original image to obtain a target output image corresponding to the original image, wherein each output image is the image output by the neural network when the corresponding image block is taken as input.
  • when the processor performs fusion processing on the output images corresponding to the image blocks in the original image, it is configured to perform deduplication processing on the overlapping areas between the image blocks, and to fuse the deduplicated image blocks.
  • different quantization parameter sets correspond to different dynamic range intervals
  • when determining the target quantization parameter set, the processor is configured to: determine the minimum interval to which the target dynamic range belongs; and determine the quantization parameter set corresponding to the minimum interval as the target quantization parameter set.
  • the first interval is any interval in the intervals of the different dynamic ranges
  • when determining the quantization parameter set corresponding to the first interval, the processor is configured to obtain a sample image whose dynamic range belongs to the first interval, and determine the quantization parameter set corresponding to the first interval according to the sample image.
  • the quantization parameter set includes multiple quantization parameter subsets, and one of the quantization parameter subsets corresponds to a layer of the neural network;
  • when determining the quantization parameter set corresponding to the first interval according to the sample image, the processor is configured to obtain the input image of each layer of the neural network, wherein the input image of the input layer of the neural network is the sample image, and the input images of the remaining layers are the images obtained when the neural network performs forward propagation with the sample image as input; and, for each layer of the neural network, determine the quantization parameter subset corresponding to that layer according to the dynamic range of the input image of that layer.
  • the network parameters of the neural network are floating-point network parameters when the sample image is input.
  • the processor determines the floating-point network parameters in the following manner: quantizing the original network parameters of the neural network to obtain fixed-point network parameters; and performing inverse quantization on the fixed-point network parameters to obtain the floating-point network parameters.
  • when determining the minimum interval to which the target dynamic range belongs, the processor is configured to perform scaling processing on the target dynamic range, and determine the minimum interval to which the scaled target dynamic range belongs.
  • the quantization parameter set includes multiple quantization parameter subsets, and one of the quantization parameter subsets corresponds to a layer of the neural network;
  • the processor is further configured to, for each layer of the neural network, perform quantization processing on the input image of the layer according to the quantization parameter subset corresponding to the layer, and input the quantized input image into the layer, wherein the input image of the input layer of the neural network is the image to be processed, and the input image of each other layer is the output image of the previous layer.
  • each convolution kernel in the neural network has undergone quantization processing respectively.
  • when performing quantization processing on a first convolution kernel, the processor is configured to quantize the first convolution kernel according to the quantization parameter corresponding to the first convolution kernel, wherein the first convolution kernel is any convolution kernel in the neural network.
  • the network parameters of the neural network are fixed-point network parameters when the quantized image to be processed is input.
  • the fixed-point network parameters are obtained by quantizing the original network parameters of the neural network.
  • the fixed-point network parameters include weight parameters and/or bias parameters.
  • the processor is further configured to preprocess the image input to the neural network.
  • the preprocessing includes zero mean processing and/or normalization processing.
  • the quantization parameter set includes a step size parameter and a zero point parameter.
  • the image to be processed is an image block corresponding to the region of interest in the original image.
  • when determining the region of interest, the processor is configured to perform semantic segmentation on the original image to determine the object categories included in the original image; determine the degree of attention corresponding to each object category; and determine the region corresponding to the object category with the highest degree of attention as the region of interest.
  • the region of interest is selected by the user.
  • different quantization parameters are set for input images with different dynamic ranges, so that the image data of the image to be processed can be quantized using quantization parameters adapted to the dynamic range of that image. The fixed-point image data obtained after quantization can thus have sufficient precision without distortion (that is, its values are consistent with the actual magnitudes), which reduces the precision loss caused by quantization and meets the high-precision requirements of image regression tasks such as image denoising and super-resolution.
  • Embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium; when the computer program is executed by a processor, any image processing method provided by the embodiments of the present application is implemented.
  • Embodiments of the present application may take the form of a computer program product implemented on one or more storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having program code embodied therein.
  • Computer-usable storage media includes permanent and non-permanent, removable and non-removable media, and storage of information can be accomplished by any method or technology.
  • Information may be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cartridges, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Facsimile Image Signal Circuits (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed in embodiments of the present application is an image processing method applied to a neural network. The method comprises: determining a target dynamic range of an image to be processed, said image being intended for input to the neural network; determining, from multiple candidate quantization parameter sets, a target quantization parameter set adapted to the target dynamic range, different quantization parameter sets being adapted to different dynamic ranges; and performing quantization processing on said image according to the target quantization parameter set, such that the image data of said image is quantized into fixed-point data whose precision is adapted to the target dynamic range. The image processing method disclosed in the embodiments of the present application solves the technical problem that the precision of the output of a neural network is insufficient after quantization by existing quantization methods.

Description

Image processing method, image processing device, and computer-readable storage medium

Technical Field

The present application relates to the technical field of image processing, and in particular to an image processing method, an image processing apparatus, and a computer-readable storage medium.

Background

Neural networks have achieved great success in fields such as image processing and speech recognition and processing, and are widely used in embedded devices. Because a neural network often contains a large number of parameters, its operation consumes substantial storage and computing resources, which makes it difficult to meet the requirements of high real-time performance and low power consumption when running the neural network on low-cost embedded devices.

Quantizing a neural network is one way to increase operation speed and reduce energy consumption. However, after a neural network is quantized by existing quantization methods, the precision of its output is insufficient.

Summary of the Invention

In view of this, one purpose of the present application is to solve the technical problem that the output of a neural network has insufficient precision after quantization by existing quantization methods.

A first aspect of the embodiments of the present application provides an image processing method applied to a neural network, the method including:

determining a target dynamic range of an image to be processed, the image to be processed being intended for input to the neural network;

determining, from a plurality of candidate quantization parameter sets, a target quantization parameter set adapted to the target dynamic range, wherein different quantization parameter sets are adapted to different dynamic ranges; and

performing quantization processing on the image to be processed according to the target quantization parameter set, so that the image data of the image to be processed is quantized into fixed-point data whose precision is adapted to the target dynamic range.

A second aspect of the embodiments of the present application provides an image processing apparatus applied to a neural network, including a processor and a memory storing a computer program;

the processor implements the following steps when executing the computer program:

determining a target dynamic range of an image to be processed, the image to be processed being intended for input to the neural network;

determining, from a plurality of candidate quantization parameter sets, a target quantization parameter set adapted to the target dynamic range, wherein different quantization parameter sets are adapted to different dynamic ranges; and

performing quantization processing on the image to be processed according to the target quantization parameter set, so that the image data of the image to be processed is quantized into fixed-point data whose precision is adapted to the target dynamic range.

A third aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, any image processing method of the first aspect above is implemented.

In the image processing method provided by the embodiments of the present application, different quantization parameters are set for input images with different dynamic ranges. For an image to be processed, the quantization parameters adapted to the dynamic range of that image can therefore be used to quantize its image data, so that the fixed-point image data obtained after quantization has sufficient precision. This reduces the precision loss caused by quantization and can meet the high-precision requirements of image regression tasks such as image denoising and super-resolution.

Description of the Drawings

In order to explain the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flowchart of an image processing method provided by an embodiment of the present application.

FIG. 2 is a system block diagram of the preliminary stage of an image processing method provided by an embodiment of the present application.

FIG. 3 is a system block diagram of the application stage of an image processing method provided by an embodiment of the present application.

FIG. 4 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.

Deep learning techniques have achieved great success in fields such as image processing and speech recognition and processing, and are widely used in embedded devices. A neural network is a deep learning algorithm. Because a neural network often contains a large number of parameters, its operation consumes substantial storage and computing resources, which makes it difficult to meet the requirements of high real-time performance and low power consumption when it is implemented on low-cost embedded devices. Quantizing the neural network is a mainstream way to reduce its computational load and increase its operation speed.

Quantization approximates data from one value range to another; for example, floating-point data can be quantized into fixed-point data. In a computer, data can be represented as either floating-point numbers or fixed-point numbers. A fixed-point number can be understood as a number whose count of fractional bits is fixed: for example, in a 16-bit binary value, 5 bits can be permanently reserved for the fractional part and the remaining 11 bits for the integer part. A floating-point number can be understood as a number whose count of fractional bits is not fixed and can float as needed; its representation includes a sign bit, an exponent, and a mantissa, and the exponent indicates the position of the radix point in the data.
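The 16-bit layout described above can be made concrete with a minimal sketch. It assumes a signed two's-complement word, round-to-nearest, and saturation on overflow; none of these choices is stated in the source, and the names `to_fixed` and `from_fixed` are illustrative only.

```python
FRAC_BITS = 5     # fractional bits, as in the 16-bit example above
TOTAL_BITS = 16   # total word length (signed storage is an assumption)


def to_fixed(x: float) -> int:
    """Return the integer stored for x, rounding to the nearest step of 2**-FRAC_BITS."""
    scale = 1 << FRAC_BITS
    lowest = -(1 << (TOTAL_BITS - 1))
    highest = (1 << (TOTAL_BITS - 1)) - 1
    q = round(x * scale)
    # Saturate rather than wrap around when x is outside the representable range.
    return max(lowest, min(highest, q))


def from_fixed(q: int) -> float:
    """Recover the real value represented by the stored integer q."""
    return q / (1 << FRAC_BITS)


stored = to_fixed(3.14159)   # 101
approx = from_fixed(stored)  # 3.15625, i.e. 101 / 32
```

With 5 fractional bits the smallest representable step is 1/32, so the round-trip error is at most 1/64 for values inside the representable range.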

Because floating-point numbers require more storage space than fixed-point numbers, and most current processors are better at fixed-point operations, quantizing the floating-point data in a neural network into fixed-point data can increase the operation speed of the network, reduce the storage it occupies, and lower the energy it requires.

However, although quantizing a neural network brings the benefits above, the applicant has found that after a neural network is quantized by existing quantization methods, its output struggles to reach the required precision. This is especially true for image regression tasks such as denoising and super-resolution, which demand higher precision than image classification tasks.

Specifically, as described above, quantization converts floating-point numbers into fixed-point numbers, and this conversion can be regarded as the process of deciding the number of fractional bits. For a given value, the number of fractional bits determines its precision, and the number of integer bits determines the magnitude it can express. Since the total number of bits is usually fixed (for example 4, 8, 16, or 32 bits), the more fractional bits there are, the fewer integer bits remain. The ideal representation of a value therefore uses just enough integer bits to express its magnitude and devotes all remaining bits to the fractional part for higher precision.

In related quantization methods for neural networks, however, generality is given more weight, so the quantization parameters they determine are better suited to input images with a large dynamic range. The quantization parameters decide the number of fractional bits and integer bits of the quantized data. With quantization parameters suited to large-dynamic-range input images, the quantized image data has more integer bits, which ensures that the true pixel values of the image can be expressed, but correspondingly fewer fractional bits. For an input image with a small dynamic range, representing the pixel values does not need that many integer bits; yet because the same set of quantization parameters is used, its image data ends up with surplus integer bits while the fractional bits are too few. The precision of the image data is therefore insufficient, and ultimately the precision of the neural network's output is insufficient as well.

Based on this, an embodiment of the present application provides an image processing method. FIG. 1 is a flowchart of the image processing method provided by the embodiment of the present application. The image processing method can be applied to a neural network, for example a convolutional neural network, and includes the following steps:

S101. Determine the target dynamic range of the image to be processed.

The image to be processed is intended for input to the neural network. It should be noted that the neural network is a pre-trained neural network; for the specific training process, reference may be made to the prior art, and it is not repeated here. The function of the neural network can be any function, such as image denoising, face recognition, super-resolution, or scene recognition, and is not limited here.

The dynamic range of an image can be the range of its pixel values. For example, if the maximum pixel value of an image is 60 and the minimum pixel value is 0, its dynamic range can be expressed as 0-60.

S102. Determine, from a plurality of candidate quantization parameter sets, a target quantization parameter set adapted to the target dynamic range.

S103. Perform quantization processing on the image to be processed according to the target quantization parameter set, so that the image data of the image to be processed is quantized into fixed-point data whose precision is adapted to the target dynamic range.

Different quantization parameter sets can be adapted to different dynamic ranges. The candidate quantization parameter sets can be determined in advance; the specific way of determining them is described later. When an image is quantized according to the quantization parameter set adapted to its dynamic range, its image data can be quantized into fixed-point data whose precision is adapted to that dynamic range.
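Steps S101 and S102 can be sketched as a lookup over candidate intervals. The sketch below follows the rule stated elsewhere in this document that the set corresponding to the minimum interval containing the target dynamic range is chosen. The interval [-64, 63] appears in this document; the other intervals, the table contents, and the function name `select_parameter_set` are hypothetical placeholders.

```python
from typing import Dict, Optional, Tuple

Interval = Tuple[float, float]

# Hypothetical candidate table mapping a dynamic-range interval to a parameter set.
CANDIDATES: Dict[Interval, dict] = {
    (-64.0, 63.0): {"name": "set_64"},
    (-128.0, 127.0): {"name": "set_128"},
    (-256.0, 255.0): {"name": "set_256"},
}


def select_parameter_set(target: Interval,
                         candidates: Dict[Interval, dict] = CANDIDATES) -> Optional[dict]:
    """Return the set of the smallest interval that fully contains `target`."""
    lo, hi = target
    covering = [iv for iv in candidates if iv[0] <= lo and hi <= iv[1]]
    if not covering:
        return None  # no candidate interval covers this dynamic range
    smallest = min(covering, key=lambda iv: iv[1] - iv[0])
    return candidates[smallest]


chosen = select_parameter_set((0.0, 60.0))  # picks the set for [-64, 63]
```

Returning `None` when no interval covers the range is one possible policy; the source does not say how that case is handled.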

Precision adapted to the dynamic range can be understood in light of the ideal representation of data described above: the fixed-point data has just enough integer bits to express the pixel values of the image. For example, with a total of 8 bits and a maximum pixel value of 15, 4 integer bits are exactly enough to express the pixel values of the image, the remaining 4 bits are fractional bits, and the data precision of the image is then at its highest. Of course, in this example, if 2 or more fractional bits are considered sufficient precision for the image data, the fixed-point data could also have 5 or 6 integer bits. Although that is slightly more than the number of integer bits strictly needed to express the pixel values, the precision of the fixed-point data can still be regarded as adapted to the dynamic range of the image.
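The bit-allocation arithmetic in this example can be checked with a short sketch. It assumes non-negative integer pixel values and an unsigned word with no sign bit; the helper names `dynamic_range` and `split_bits` are illustrative and do not come from the source.

```python
def dynamic_range(pixels):
    """Return (min, max) over a 2-D list of pixel values."""
    flat = [v for row in pixels for v in row]
    return min(flat), max(flat)


def split_bits(max_pixel: int, total_bits: int = 8):
    """Give the integer part just enough bits for max_pixel; the rest are fractional."""
    int_bits = max(1, int(max_pixel).bit_length())
    if int_bits > total_bits:
        raise ValueError("max_pixel does not fit in total_bits")
    return int_bits, total_bits - int_bits


image = [[0, 7, 15], [3, 12, 9]]
lo, hi = dynamic_range(image)         # (0, 15)
int_bits, frac_bits = split_bits(hi)  # (4, 4), matching the example above
resolution = 2.0 ** -frac_bits        # 0.0625, the smallest representable step
```

For the 0-60 image mentioned earlier, the same helper gives 6 integer bits and 2 fractional bits, so the step size grows to 0.25.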

After the image to be processed is quantized, its image data has been converted into precision-adapted fixed-point data (for convenience, "precision adapted to the dynamic range" is abbreviated as "precision-adapted"), and this fixed-point image data can be input to the neural network.

In the image processing method provided by the embodiments of the present application, different quantization parameters are set for input images with different dynamic ranges. For an image to be processed, the quantization parameters adapted to the dynamic range of that image can therefore be used to quantize its image data, so that the fixed-point image data obtained after quantization has sufficient precision without distortion (that is, its values are consistent with the actual magnitudes). This reduces the precision loss caused by quantization and can meet the high-precision requirements of image regression tasks such as image denoising and super-resolution.

在一个实施例中,待处理图像可以是原始图像中的一个图像块。具体的,原始图像可以是图像处理任务的处理对象,可以对该原始图像进行分块处理,从而得到该原始图像的多个图像块,每一个图像块都可以是本申请实施例所描述的待处理图像。对每一个图像块,可以通过本申请实施例提供的图像处理方法进行量化及输入神经网络进行前向传播等处理,从而得到神经网络输出的各个图像块对应的输出图像。可以对每个图像块对应的输出图像进行融合处理,从而得到完整的目标输出图像,该目标输出图像为原始图像对应的输出结果。In one embodiment, the image to be processed may be an image block in the original image. Specifically, the original image may be the processing object of the image processing task, and the original image may be processed in blocks, thereby obtaining multiple image blocks of the original image, and each image block may be the to-be-to-be described in the embodiments of the present application. Process images. For each image block, the image processing method provided in the embodiment of the present application can be used to perform quantization and input neural network for forward propagation and other processing, so as to obtain an output image corresponding to each image block output by the neural network. The output images corresponding to each image block can be fused to obtain a complete target output image, which is the output result corresponding to the original image.

上述实施例,由于对原始图像进行了分块,从而可以针对不同动态范围的图像块进行不同的处理,得到的每一个图像块对应的输出图像均有较高的精度,最终融合得到的目标输出图像将有更大幅度的精度提升。In the above embodiment, since the original image is divided into blocks, different processing can be performed for image blocks with different dynamic ranges, and the obtained output image corresponding to each image block has high precision, and the target output obtained by final fusion The image will have a larger accuracy improvement.

在对原始图像进行分块处理时,可以有多种实施方式,比如可以将原始图像均匀划分,使划分出的每个图像块的尺寸一致,又比如,可以不均匀划分,使图像块之间可以尺寸不同,再比如,可以有重叠的划分,使图像块之间有重叠部分……本申请实施例对原始图像的分块方式不作具体限制。When the original image is divided into blocks, there can be various implementations. For example, the original image can be divided evenly, so that the size of each divided image block is consistent, and for example, it can be divided unevenly, so that the image blocks can be divided The sizes may be different. For another example, there may be overlapping divisions, so that there are overlapping parts between image blocks... This embodiment of the present application does not specifically limit the way of dividing the original image.

It should be noted that if the original image is divided with overlap, the overlapping areas between image blocks may be deduplicated during fusion so that only one copy of the image data is retained for any given area. The deduplicated image blocks are then fused to obtain the target output image.
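The division and fusion described above can be sketched as follows. This is a minimal illustration, not the method of the present application: the function names, the square block shape, and the "first writer wins" deduplication rule are assumptions, and the per-block neural network processing is omitted.

```python
def split_into_blocks(image, block, overlap=0):
    """Split a 2-D image (list of rows) into square blocks of side `block`.

    Adjacent blocks share `overlap` pixels (hypothetical parameter).
    Returns a list of (top, left, tile) tuples; edge tiles may be smaller.
    """
    height, width = len(image), len(image[0])
    step = block - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than block")
    tiles = []
    for top in range(0, height, step):
        for left in range(0, width, step):
            tile = [row[left:left + block] for row in image[top:top + block]]
            tiles.append((top, left, tile))
    return tiles


def fuse_blocks(tiles, height, width):
    """Reassemble tiles, keeping a single copy of every overlapping pixel."""
    fused = [[None] * width for _ in range(height)]
    for top, left, tile in tiles:
        for r, row in enumerate(tile):
            for c, value in enumerate(row):
                # Assumed deduplication rule: the first tile to cover a pixel wins.
                if fused[top + r][left + c] is None:
                    fused[top + r][left + c] = value
    return fused
```

In practice each tile would be passed through the neural network before fusion; other deduplication rules (for example, blending the overlapping pixels) are equally possible.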

In one embodiment, the original network parameters of the trained neural network may be floating-point network parameters. Because the quantized image to be processed that is input to the neural network is fixed-point data, the original network parameters may be converted into fixed-point network parameters before the quantized image is input, so that the fixed-point image data of the image to be processed and the fixed-point network parameters of the neural network can undergo forward propagation according to the rules of fixed-point arithmetic.

The network parameters of the neural network may include the network parameters of each layer, which may specifically include one or more of weight parameters, bias parameters, and the like. Different neural networks may of course have different network parameters. For example, a neural network may include one or more types of layers, such as convolutional layers, fully connected layers, and nonlinear activation layers, and the network parameters corresponding to each type of layer may differ. The embodiments of the present application impose no limitation in this regard.

As for how the original network parameters are converted into fixed-point network parameters, in one implementation the original network parameters of the neural network may be quantized to obtain the corresponding fixed-point network parameters. There are many quantization methods, such as linear mapping, shifting, and truncation. For ease of understanding, uniform linear mapping is used as an example below.

Suppose there is a floating-point weight parameter whose value range is [X_min, X_max]. If this parameter needs to be quantized to an 8-bit representation, that is, to the value range [0, 255], the floating-point number can be converted to a fixed-point number by means of two quantization parameters, s and z. The quantization formula is as follows:

y = clamp(round(s*x + z), 0, 255)

Figure PCTCN2020105263-appb-000001

Here, x is the floating-point value before quantization, y is the fixed-point value after quantization, and s and z are the quantization parameters, where s is the quantization step size and z is the zero point; round() is the rounding function. Once s and z are determined, a floating-point number can be quantized into a fixed-point number using the above formula.
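As a minimal sketch of the formula above, the following Python function applies the round-and-clamp mapping for given values of s and z. The function names and the sample values of s and z are assumptions made for illustration. The text does not specify a rounding mode; Python's built-in round(), which rounds ties to the nearest even integer, is used here as a stand-in.

```python
def clamp(value, low, high):
    """Limit `value` to the closed range [low, high]."""
    return max(low, min(high, value))


def quantize(x, s, z, qmin=0, qmax=255):
    """Uniform linear mapping: y = clamp(round(s*x + z), qmin, qmax).

    `s` and `z` follow the naming in the text; `qmin` and `qmax` are
    hypothetical parameters defaulting to the 8-bit range [0, 255].
    Rounding mode is an assumption (Python's round-half-to-even).
    """
    return clamp(round(s * x + z), qmin, qmax)


# Illustrative values only: s = 100 and z = 128 are not taken from the source.
print(quantize(0.5, s=100, z=128))   # in-range value
print(quantize(2.0, s=100, z=128))   # saturates at the upper bound
print(quantize(-2.0, s=100, z=128))  # saturates at the lower bound
```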

It should be noted that different quantization methods may correspond to different quantization parameters. In the above example, the quantization parameters for uniform linear mapping are the quantization step size and the zero point, but the quantization parameters for other quantization methods are not necessarily the same; that is, they are not necessarily the quantization step size s and the zero point z.

Once the quantization object (that is, the data to be quantized) is determined, the corresponding quantization parameters may be determined in several ways. In one implementation, the quantization parameters may be determined by minimizing the KL divergence between the data distribution before quantization and the data distribution after quantization. In another implementation, the quantization parameters may be determined from statistics of the actual maximum and minimum values.
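One possible reading of the maximum/minimum approach is sketched below. The expressions used for s and z (chosen so that the observed minimum maps to 0 and the observed maximum maps to 255) are an assumption made for illustration, not a formula taken from the present application, and the zero point is left unrounded for simplicity.

```python
def minmax_params(values, qmin=0, qmax=255):
    """Derive (s, z) from the observed minimum and maximum of `values`.

    Assumed formulas (not from the source):
        s = (qmax - qmin) / (max - min)
        z = qmin - s * min
    so that min -> qmin and max -> qmax under y = round(s*x + z).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: constant data; any positive step works.
        return 1.0, float(qmin - lo)
    s = (qmax - qmin) / (hi - lo)
    z = qmin - s * lo
    return s, z


s, z = minmax_params([-1.0, 0.0, 3.0])
print(s, z)
```

The KL-divergence approach mentioned in the text would replace the min/max statistics with a search over candidate ranges and is not shown here.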

In one embodiment, different quantization parameter sets may correspond to different dynamic range intervals. For example, a first quantization parameter set may correspond to the interval [0, 63], a second quantization parameter set may correspond to the interval [0, 127], and so on. These intervals are only examples; the interval to which a given quantization parameter set corresponds may be set by those skilled in the art as required and is not limited here. Accordingly, when determining the target quantization parameter set applicable to the image to be processed, the smallest interval to which the target dynamic range of the image belongs may first be determined, and the quantization parameter set corresponding to that smallest interval may then be taken as the target quantization parameter set.

It should be noted that some flexibility is possible when determining the smallest interval to which the target dynamic range belongs. For example, suppose the dynamic range of the image to be processed is 0-64 and the available intervals are [0, 63] and [0, 127]. In one implementation, the smallest interval to which the target dynamic range belongs may be determined directly as [0, 127]. In another implementation, considering that the quantization parameter set corresponding to [0, 127] has somewhat lower precision and that the dynamic range 0-64 does not exceed [0, 63] by much, the dynamic range 0-64 may be reduced to some extent, so that the smallest interval to which the reduced target dynamic range belongs is [0, 63]. In yet another example, the target dynamic range may instead be enlarged to an appropriate degree before the smallest interval to which the enlarged target dynamic range belongs is determined.
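The interval lookup, including the optional shrinking of the measured range, might look like the following sketch. The function name and the `tolerance` parameter are hypothetical; the present application does not specify how much the range may be reduced.

```python
def select_interval(lo, hi, intervals, tolerance=0):
    """Return the narrowest interval (a, b) that contains [lo, hi].

    `tolerance` (hypothetical) shrinks the measured range on each side
    before the lookup, so a range that only slightly exceeds an interval
    can still be assigned to it.
    """
    lo, hi = lo + tolerance, hi - tolerance
    candidates = [(a, b) for a, b in intervals if a <= lo and hi <= b]
    if not candidates:
        raise ValueError("no interval contains the dynamic range")
    return min(candidates, key=lambda interval: interval[1] - interval[0])


intervals = [(0, 63), (0, 127)]
print(select_interval(0, 64, intervals))               # strict lookup
print(select_interval(0, 64, intervals, tolerance=1))  # range shrunk by 1
```

With a strict lookup the range 0-64 falls into [0, 127]; with a tolerance of 1 it falls into [0, 63], which matches the two implementations described above.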

The image to be processed is the input image of the input layer of the neural network. After the image to be processed is quantized, the resulting fixed-point image data may be input to the neural network for forward propagation. In one embodiment, the input images of the other layers of the neural network may also be quantized during forward propagation. In a neural network, the input image of every layer other than the input layer may be the output image of the previous layer, and the operation performed by each layer can be regarded as feature extraction on its input image. Taking a convolutional neural network as an example, the input image of the Nth layer is convolved with the convolution kernels of the Nth layer to obtain a feature map, which may serve as the input image of the (N+1)th layer.

In this embodiment, one quantization parameter set may include multiple quantization parameter subsets, each of which may correspond to one layer of the neural network. For example, if the neural network has N layers, there may be N quantization parameter subsets, and the subset corresponding to each layer may be used to quantize the input image of that layer. For instance, the quantization parameter subset of layer 0 may be used to quantize the input image (that is, the image to be processed) of layer 0 (that is, the input layer), the quantization parameter subset of layer 2 may be used to quantize the input image of layer 2, and so on. After the input image of the Nth layer is quantized with the quantization parameter subset corresponding to the Nth layer, the quantized input image may be input to the Nth layer and undergo operations such as convolution with the network parameters of the Nth layer.

The quantization approach described in the above embodiment may also be called layer-by-layer quantization. For a convolutional neural network, this means that the convolution kernels of the same layer share the same set of quantization parameters.

In one implementation, to further improve precision, channel-by-channel quantization may also be used; that is, on top of layer-by-layer quantization, each convolution kernel in each layer of the neural network may be quantized separately. Specifically, if a first convolution kernel is any convolution kernel in the neural network, the first convolution kernel may be quantized according to the quantization parameters corresponding to that kernel.

Taking the Nth layer as an example, suppose the input image of the Nth layer is 32*32*16, where 32*32 is the size of the input image and 16 is its number of channels, and the Nth layer includes 32 convolution kernels, each of which may be 5*5*16, where 5*5 is the size of the kernel and 16 is its number of channels. Quantization parameters may then be determined separately for each of the 32 convolution kernels, and each kernel may be quantized according to its own quantization parameters. In this way, after the input image of the Nth layer is quantized with the quantization parameter subset of the Nth layer, the quantized input image may be convolved with each of the quantized convolution kernels in the Nth layer.
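The 32-kernel example can be sketched as below, with each 5*5*16 kernel flattened to 400 weights and given its own pair of quantization parameters. The random weights, the min/max formulas for s and z, and all function names are assumptions for illustration only.

```python
import random


def minmax_params(values, qmin=0, qmax=255):
    """Assumed min/max rule: map min(values) to qmin and max(values) to qmax."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 1.0, float(qmin - lo)
    s = (qmax - qmin) / (hi - lo)
    return s, qmin - s * lo


def quantize_kernel(weights, qmin=0, qmax=255):
    """Quantize one kernel with quantization parameters derived from itself."""
    s, z = minmax_params(weights, qmin, qmax)
    quantized = [max(qmin, min(qmax, round(s * w + z))) for w in weights]
    return quantized, (s, z)


# 32 hypothetical kernels of shape 5*5*16, flattened; each has a different spread,
# which is the situation where per-kernel parameters help most.
rng = random.Random(0)
kernels = [[rng.gauss(0.0, 0.1 * (k + 1)) for _ in range(5 * 5 * 16)]
           for k in range(32)]

# Channel-by-channel quantization: one (s, z) pair per kernel.
quantized = [quantize_kernel(kernel) for kernel in kernels]
```

Layer-by-layer quantization would instead pool the weights of all 32 kernels and derive a single (s, z) pair, so kernels with a small spread would use only a fraction of the 8-bit range.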

As described above, the candidate quantization parameter sets may be determined in advance, and the stage in which they are determined may be called the preliminary stage. In one implementation, multiple dynamic range intervals may be preset; such intervals have been described above, for example [0, 63] and [0, 127]. For each interval, the corresponding quantization parameter set may be determined. Taking a first interval as an example, where the first interval may be any one of the multiple dynamic range intervals, multiple sample images whose dynamic ranges belong to the first interval may be obtained, and the quantization parameter set corresponding to the first interval may be determined from those sample images.

One way to obtain sample images is to divide material images into blocks. Once the dynamic range has been divided into intervals, some intervals may cover a small dynamic range; dividing material images into blocks makes it easier to obtain sample images whose dynamic ranges fall within such intervals.

If a quantization parameter set includes multiple quantization parameter subsets, each corresponding to one layer of the neural network, then when determining the quantization parameter set corresponding to the first interval, the sample images whose dynamic ranges belong to the first interval may each be input to the neural network for forward propagation, and the input image of each layer of the neural network may be obtained. As described above, the input image of the input layer is the sample image itself, and the input image of the Nth layer is the output image of the (N-1)th layer.

After the input image of each layer of the neural network is obtained, the quantization parameters (the quantization parameter subset) used to quantize the image data of that layer's input image may be determined for each layer. The quantization parameters so determined can quantize the image data of the input image into fixed-point data whose precision matches the dynamic range of the input image. The specific ways of determining quantization parameters have been described above, for example minimizing the KL divergence between the data distributions before and after quantization, or using statistics of the actual maximum and minimum values.

It should be noted that in the preliminary stage of determining the quantization parameter sets, the network parameters of the neural network may be floating-point network parameters, and the sample images input to the neural network may also be floating-point image data. In one implementation, the neural network may directly use its original floating-point network parameters. In another implementation, the original network parameters of the neural network may first be quantized to obtain fixed-point network parameters; the specific way of doing so has been described above and is not repeated here. The fixed-point network parameters may then be inversely quantized to obtain the floating-point network parameters used for operations such as convolution with the sample images.

The specific implementation of inverse quantization can be derived by inverting the quantization method used. For example, if the quantization method is uniform linear mapping, inverse quantization can be implemented with the following formula:

x′ = (y - z)/s

Here, y is the fixed-point value after quantization, x′ is the floating-point value after inverse quantization, and s and z are the quantization parameters.
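The quantize-then-inverse-quantize round trip can be sketched as follows. The function names and sample parameter values are assumptions; the point of the sketch is only that, for values inside the representable range, the recovered x′ differs from x by at most half a quantization level, that is, 0.5/s.

```python
def quantize(x, s, z, qmin=0, qmax=255):
    """y = clamp(round(s*x + z), qmin, qmax), as in the quantization formula."""
    return max(qmin, min(qmax, round(s * x + z)))


def dequantize(y, s, z):
    """Inverse quantization: x' = (y - z) / s."""
    return (y - z) / s


def fake_quantize(x, s, z):
    """Quantize and immediately inverse-quantize (hypothetical helper name).

    The result is a floating-point value that already carries the
    rounding error of the fixed-point representation.
    """
    return dequantize(quantize(x, s, z), s, z)


# Illustrative parameters covering roughly [-1, 3]; not taken from the source.
s, z = 63.75, 63.75
for x in (-1.0, 0.0, 0.3, 3.0):
    print(x, fake_quantize(x, s, z))
```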

In the application stage, the neural network to which the image to be processed is input uses fixed-point network parameters. Therefore, in the preliminary stage, to match the application stage and make the determined quantization parameter sets more applicable, the neural network may use, instead of the original network parameters, the floating-point network parameters obtained by inverse quantization of the fixed-point network parameters as described above.

In one implementation, the quantization parameter set may also be determined in real time during forward propagation of the neural network. Specifically, the quantization parameter subset of layer 0 may be determined according to the dynamic range of the image to be processed, and the image to be processed may be quantized according to that subset. The resulting fixed-point image data may be input to layer 0 (the input layer) to obtain the output image of layer 0, that is, the input image of layer 1. The quantization parameter subset of layer 1 may then be determined according to the dynamic range of the input image of layer 1, and the input image of layer 1 may be quantized according to that subset. The quantized input image of layer 1 may be input to layer 1 to obtain the output image of layer 1, that is, the input image of layer 2. The quantization parameter subset of layer 2 may then be determined according to the dynamic range of the input image of layer 2, and so on.

Determining the quantization parameter set in real time during forward propagation, as described above, requires the output image of each layer to be cached so that quantization parameters can be determined from its dynamic range, and therefore consumes more storage and computing resources. By comparison, determining the candidate quantization parameter sets in advance in the preliminary stage occupies fewer storage and computing resources in the application stage.

As described above, in the embodiments of the present application, the neural network to which images are input is an already trained neural network with a specific function. In the training stage of the neural network, the image data input for training may have been preprocessed; for example, zero-mean processing and/or normalization may be applied to the training image data before it is input to the neural network, to speed up convergence of the network parameters during training. Therefore, for the trained neural network, the image data of images such as the image to be processed or the sample images may also be preprocessed before being input. In one example, the pixel values of each channel of the input image may be preprocessed with the following formula:

y_i = k*x_i + b

Here, x denotes the input image and i denotes the channel of the input image (for an RGB color image, i = 1, 2, 3); y denotes the output image; the parameter k is the scaling coefficient applied to the pixel values and the parameter b is the offset coefficient applied to the pixel values. Both k and b are floating-point numbers.
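A minimal sketch of this preprocessing step is given below. The nested-list image layout (height x width x channels), the function name, and the sample values k = 1.0 and b = -128.0 are assumptions for illustration; with these values, 8-bit pixel values in [0, 255] are shifted to [-128, 127].

```python
def preprocess(image, k, b):
    """Apply y_i = k * x_i + b to every channel i of every pixel.

    `image` is assumed to be a nested list indexed as image[row][col][channel].
    The same k and b are used for all channels, as in the formula.
    """
    return [[[k * value + b for value in pixel] for pixel in row]
            for row in image]


# A single RGB pixel, shifted so that the data is roughly zero-centered.
print(preprocess([[[0, 128, 255]]], k=1.0, b=-128.0))
```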

A relatively detailed embodiment is provided below.

This embodiment may include a preliminary stage and an application stage.

For the preliminary stage, reference may be made to FIG. 2, which is a system block diagram of the preliminary stage of an image processing method provided by an embodiment of the present application.

In the preliminary stage, multiple material images to be input to the convolutional neural network for forward propagation may first be obtained and preprocessed, for example by zero-mean processing and/or normalization. The preprocessed material images may be divided into blocks to obtain multiple image blocks, each of which may serve as one sample image.

The dynamic range of each sample image may be measured, and multiple dynamic range intervals may be defined according to the statistics. For example, if the statistics show that the dynamic ranges of the sample images are distributed within the interval [-128, 127] (pixel values may be negative because the sample images have been preprocessed), multiple intervals may be defined within [-128, 127], for example the six intervals [-64, 63], [-64, 0), [0, 63], [-128, 127], [-128, 0), and [0, 127].

The original network parameters of the convolutional neural network may first be quantized to obtain fixed-point network parameters, which are then inversely quantized to obtain floating-point network parameters. These floating-point network parameters may be applied to the convolutional neural network.

Each sample image is input to the convolutional neural network for forward propagation, and the output of each layer of the convolutional neural network is obtained. Because the output of the Nth layer is also the input of the (N+1)th layer, obtaining the output of each layer can also be regarded as obtaining the image data of the input image of each layer (the input image of the input layer is the sample image itself, and the input images of the other layers may be feature maps obtained by feature extraction on the sample image).

Quantization may be layer-by-layer or channel-by-channel. In layer-by-layer quantization, each layer corresponds to one set of quantization parameters; in channel-by-channel quantization, each convolution kernel corresponds to one set of quantization parameters. Layer-by-layer quantization is used as the example below.

For each sample image, the smallest interval to which it belongs may be determined; when doing so, the dynamic range of the sample image may be scaled appropriately. The image data of the input image of each layer, obtained after the sample images are input to the convolutional neural network, may be classified according to the interval to which the corresponding sample image belongs. Then, for each interval, the quantization parameters corresponding to the image data of the input images of each layer may be determined, yielding the quantization parameter subset for that layer in that interval.

For example, if there are 5 sample images whose dynamic ranges belong to the interval [-64, 63], then after the 5 sample images are each input to the convolutional neural network, the image data of their Nth-layer input images is obtained, and the quantization parameter subset corresponding to the Nth layer for the interval [-64, 63] is determined from that image data. The quantization parameter subsets corresponding to all layers for the interval [-64, 63] may together constitute the quantization parameter set corresponding to the interval [-64, 63].
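The preliminary-stage procedure can be sketched end to end as below. Everything here is a toy stand-in under stated assumptions: each "layer" is a scalar multiplication followed by a ReLU rather than a convolution, images are flat lists of floats, intervals are treated as closed, and quantization parameters are derived with an assumed min/max rule. None of the names come from the present application.

```python
from collections import defaultdict


def minmax_params(values, qmin=0, qmax=255):
    """Assumed min/max rule: map min(values) to qmin and max(values) to qmax."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 1.0, float(qmin - lo)
    s = (qmax - qmin) / (hi - lo)
    return s, qmin - s * lo


def layer_inputs(sample, weights):
    """Toy forward pass that records the input seen by each layer."""
    inputs, x = [], list(sample)
    for w in weights:
        inputs.append(x)
        x = [max(0.0, w * v) for v in x]  # stand-in for convolution + activation
    return inputs


def smallest_interval(sample, intervals):
    """Narrowest preset interval that contains the sample's dynamic range."""
    lo, hi = min(sample), max(sample)
    fits = [(a, b) for a, b in intervals if a <= lo and hi <= b]
    return min(fits, key=lambda interval: interval[1] - interval[0])


def calibrate(samples, weights, intervals):
    """Return {interval: [(s, z) for each layer]} from the sample images."""
    pooled = defaultdict(lambda: [[] for _ in weights])
    for sample in samples:
        interval = smallest_interval(sample, intervals)
        for n, x in enumerate(layer_inputs(sample, weights)):
            pooled[interval][n].extend(x)  # group layer-n inputs by interval
    return {interval: [minmax_params(values) for values in layers]
            for interval, layers in pooled.items()}
```

The returned dictionary plays the role of the candidate quantization parameter sets: one entry per dynamic range interval, each holding one quantization parameter subset per layer.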

For the application stage, reference may be made to FIG. 3, which is a system block diagram of the application stage of an image processing method provided by an embodiment of the present application.

The quantization parameter sets corresponding to the different dynamic range intervals have been determined in the preliminary stage. In the application stage, if the target of the image processing task is an original image, the original image may first be preprocessed. The preprocessed original image may be divided into blocks, and each image block may be called an image to be processed. The following description applies to each image block, that is, to each image to be processed.

For the image to be processed, its dynamic range (the target dynamic range) may first be determined. The smallest interval to which the target dynamic range belongs is identified, and the quantization parameter set corresponding to that smallest interval is determined as the target quantization parameter set.

The target quantization parameter set may be applied to the convolutional neural network, together with the fixed-point network parameters obtained by quantizing the original network parameters in the preliminary stage. The target quantization parameter set includes a quantization parameter subset for each layer. At layer 0 (the input layer), the image to be processed may be quantized according to the quantization parameter subset corresponding to layer 0, and the resulting fixed-point image data may be input to layer 0. The output of layer 0 is the image data of the input image of layer 1; this input image may be quantized according to the quantization parameter subset corresponding to layer 1, and the quantized input image may be input to layer 1, where it undergoes operations such as convolution with the fixed-point network parameters of layer 1 to produce the output of layer 1, that is, the image data of the input image of layer 2. The subsequent layers are processed in the same way and are not described one by one. In the convolutional neural network, the fixed-point image data and the fixed-point network parameters undergo convolution and other operations layer by layer according to the rules of fixed-point arithmetic, and the convolutional neural network finally outputs the output image corresponding to the image to be processed.
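The layer-by-layer flow described above can be sketched with a toy network. This is a floating-point simulation under stated assumptions, not the fixed-point arithmetic of the present application: each layer is a scalar multiplication followed by a ReLU, the quantized input is inversely quantized before the layer's arithmetic is applied, and the (s, z) values are illustrative only.

```python
def quantize(x, s, z, qmin=0, qmax=255):
    """y = clamp(round(s*x + z), qmin, qmax)."""
    return max(qmin, min(qmax, round(s * x + z)))


def dequantize(y, s, z):
    """x' = (y - z) / s."""
    return (y - z) / s


def forward(image, weights, subsets):
    """Run a toy network, quantizing each layer's input with its own (s, z).

    `subsets[n]` is the quantization parameter subset of layer n taken from
    the target quantization parameter set (hypothetical data layout).
    """
    x = list(image)
    for w, (s, z) in zip(weights, subsets):
        q = [quantize(v, s, z) for v in x]  # fixed-point input of this layer
        # Stand-in for the layer's fixed-point arithmetic, simulated in floats.
        x = [max(0.0, w * dequantize(v, s, z)) for v in q]
    return x


# Two layers with illustrative weights and per-layer (s, z) pairs.
print(forward([-10.0, 20.0], weights=[0.5, 2.0],
              subsets=[(2.0, 128.0), (8.0, 0.0)]))
```

A real deployment would keep the data in integer form between layers and fold the zero points and step sizes into the integer arithmetic; that design is outside the scope of this sketch.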

Because the image to be processed is only one image block of the original image, the output images corresponding to the individual image blocks may further be fused to obtain the final target output image.

In one implementation, the image to be processed may also be obtained as follows: a region of interest in the original image may be determined and then segmented or cropped out, and the resulting image block may serve as the image to be processed. It can be understood that a region of interest may be a region that attracts a high degree of attention from the human eye. Because the human eye pays more attention to such regions, they place higher demands on processing detail; if the processing precision is insufficient, flaws are easily noticed.

The region of interest may be determined according to the image content. For example, it may be a human subject or an animal subject in the image, or a specific area of a person in the image such as the face, hair, or clothing. In one implementation, the region of interest in the original image may be determined automatically by the system through an algorithm. Specifically, semantic segmentation may be performed on the original image to determine the object categories it contains. The object categories may be the aforementioned people and animals, or parts of a person such as hair and face. For each object category, a corresponding degree of attention may be determined, either from a preset correspondence between object categories and degrees of attention or by calculation using a formula or other means. Once the degrees of attention are determined, the region with the highest degree of attention may be taken as the region of interest.
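Selection by a preset category-to-attention table might be sketched as follows. The category names, attention scores, and bounding-box representation are hypothetical, and the semantic segmentation step that would produce the regions is assumed to have run already.

```python
def pick_region_of_interest(regions, attention):
    """Return the (category, bounding_box) pair with the highest attention.

    `regions` is a list of (category, bounding_box) pairs, assumed to come
    from a semantic segmentation step that is not shown here.
    `attention` maps a category to a preset degree of attention; categories
    missing from the table are given a score of 0.0 (an assumption).
    """
    return max(regions, key=lambda region: attention.get(region[0], 0.0))


# Hypothetical segmentation output and attention table.
regions = [
    ("background", (0, 0, 100, 100)),
    ("face", (10, 10, 40, 40)),
    ("hair", (10, 0, 40, 12)),
]
attention = {"face": 0.9, "hair": 0.6, "background": 0.1}
print(pick_region_of_interest(regions, attention))
```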

In one implementation, the region of interest may also be a region selected by the user. For example, during interaction, information about each object recognized in the image may be presented to the user, who selects the object of interest; alternatively, various selection tools may be provided so that the user can delineate the region of interest.

Processing the region of interest with the image processing method provided by the embodiments of the present application allows that region to be processed with higher precision, so that the region of interest in the output image has a finer appearance, for example showing more detail. Moreover, processing only the region of interest saves computing power compared with processing the entire image.

The above is a detailed description of the image processing method provided by the embodiments of the present application. In this method, different quantization parameters are set for input images with different dynamic ranges. The image data of an image to be processed can therefore be quantized with quantization parameters adapted to the dynamic range of that image, so that the resulting fixed-point image data has sufficient precision without distortion (that is, its numerical values remain consistent with the actual values). This reduces the precision loss caused by quantization and makes the output of the neural network more accurate, which can meet the high-precision requirements of image regression tasks such as image denoising and super-resolution.

Reference is now made to FIG. 4, which is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present application. The image processing apparatus may be applied to a neural network and includes a processor 410 and a memory 420 storing a computer program;

the processor implements the following steps when executing the computer program:

determining a target dynamic range of an image to be processed, the image to be processed being for input to the neural network;

determining, from a plurality of candidate quantization parameter sets, a target quantization parameter set adapted to the target dynamic range, wherein different quantization parameter sets are adapted to different dynamic ranges;

performing quantization processing on the image to be processed according to the target quantization parameter set, so that the image data of the image to be processed is quantized into fixed-point data whose precision is adapted to the target dynamic range.

Optionally, the image to be processed is an image block of an original image;

the processor is further configured to perform fusion processing on the output images corresponding to the respective image blocks of the original image to obtain a target output image corresponding to the original image, wherein each output image is the image output by the neural network when the corresponding image block is the input.

Optionally, when performing fusion processing on the output images corresponding to the respective image blocks of the original image, the processor is configured to: perform deduplication processing on the overlapping regions between the image blocks; and fuse the deduplicated image blocks.
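One way to read the deduplication and fusion steps is sketched below. The text does not say how overlapping pixels are resolved; this sketch assumes that each overlapping pixel is kept from the first block that covers it and that later copies are discarded. The function name and the `(top, left, block)` tile layout are hypothetical.

```python
from typing import List, Optional, Tuple

Tile = Tuple[int, int, List[List[float]]]  # (top, left, block rows)


def fuse_tiles(tiles: List[Tile], height: int, width: int) -> List[List[Optional[float]]]:
    """Paste output blocks into one image, keeping a single copy of overlapping pixels."""
    out: List[List[Optional[float]]] = [[None] * width for _ in range(height)]
    for top, left, block in tiles:
        for dr, row in enumerate(block):
            for dc, value in enumerate(row):
                r, c = top + dr, left + dc
                # A pixel already written belongs to an overlap: skip the duplicate.
                if out[r][c] is None:
                    out[r][c] = value
    return out
```

Other policies, such as averaging the overlapping pixels or cropping a fixed margin from every block, would fit the same description equally well.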

Optionally, different quantization parameter sets correspond to different dynamic range intervals;

when determining the target quantization parameter set, the processor is configured to: determine the smallest interval to which the target dynamic range belongs; and determine the quantization parameter set corresponding to that smallest interval as the target quantization parameter set.
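The interval lookup can be illustrated with a short sketch. It assumes that a dynamic range is a `(min, max)` pair of pixel values and that an interval "contains" a range when it covers both ends; the `Candidate` type and its `name` field, which stands in for a full quantization parameter set, are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Candidate(NamedTuple):
    low: float
    high: float
    name: str  # placeholder for the quantization parameter set of this interval


def dynamic_range(pixels: List[List[float]]) -> Tuple[float, float]:
    """Dynamic range of an image, taken here as (smallest value, largest value)."""
    flat = [v for row in pixels for v in row]
    return min(flat), max(flat)


def pick_smallest_interval(rng: Tuple[float, float],
                           candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the narrowest candidate interval that fully contains rng, if any."""
    lo, hi = rng
    fitting = [c for c in candidates if c.low <= lo and hi <= c.high]
    return min(fitting, key=lambda c: c.high - c.low) if fitting else None
```

Choosing the narrowest fitting interval matters because a narrower interval spreads the same number of fixed-point levels over fewer values, which gives a finer quantization step.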

Optionally, a first interval is any one of the different dynamic range intervals;

when determining the quantization parameter set corresponding to the first interval, the processor is configured to: acquire a sample image whose dynamic range belongs to the first interval; and determine the quantization parameter set corresponding to the first interval according to the sample image.

Optionally, the quantization parameter set includes a plurality of quantization parameter subsets, each quantization parameter subset corresponding to one layer of the neural network;

when determining the quantization parameter set corresponding to the first interval according to the sample image, the processor is configured to: acquire the input image of each layer of the neural network, wherein the input image of the input layer of the neural network is the sample image, and the input images of the remaining layers are images obtained by forward propagation of the neural network with the sample image as input; and, for each layer of the neural network, determine the quantization parameter subset corresponding to that layer according to the dynamic range of that layer's input image.
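The per-layer calibration described above can be sketched as follows. This is a minimal sketch, not the applicant's implementation: layers are modeled as plain Python callables on flat lists of floats, and the mapping from a layer's input range to a step and zero point uses a common unsigned affine scheme that the text does not specify. The name `calibrate` is hypothetical.

```python
from typing import Callable, List, Tuple

Layer = Callable[[List[float]], List[float]]


def calibrate(layers: List[Layer], sample: List[float],
              bits: int = 8) -> List[Tuple[float, int]]:
    """Run a float forward pass and derive one (step, zero_point) pair per layer."""
    params: List[Tuple[float, int]] = []
    x = sample  # the input layer sees the sample image itself
    for layer in layers:
        lo, hi = min(x), max(x)  # dynamic range of this layer's input
        # Assumed scheme: spread the range over 2**bits - 1 steps; avoid a zero step.
        step = (hi - lo) / (2 ** bits - 1) or 1.0
        zero_point = round(-lo / step)
        params.append((step, zero_point))
        x = layer(x)  # forward propagation produces the next layer's input
    return params
```

Running this once per dynamic range interval, with sample images drawn from that interval, would yield one parameter set (a list of per-layer subsets) per interval.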

Optionally, the network parameters of the neural network are floating-point network parameters when the sample image is input.

Optionally, the processor determines the floating-point network parameters by: quantizing the original network parameters of the neural network to obtain fixed-point network parameters; and inverse-quantizing the fixed-point network parameters to obtain the floating-point network parameters.
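The quantize-then-inverse-quantize round trip can be sketched as below. The resulting floats are presumably used during calibration so that the rounding error of the fixed-point weights is already present; that reading, the symmetric signed scheme, and the function names are assumptions, not details given in the text.

```python
from typing import List, Tuple


def quantize_weights(weights: List[float], bits: int = 8) -> Tuple[List[int], float]:
    """Quantize original weights to signed fixed-point values with one shared step."""
    qmax = 2 ** (bits - 1) - 1
    step = max(abs(w) for w in weights) / qmax or 1.0  # avoid a zero step
    fixed = [max(-qmax, min(qmax, round(w / step))) for w in weights]
    return fixed, step


def dequantize_weights(fixed: List[int], step: float) -> List[float]:
    """Inverse quantization: map fixed-point weights back to floats."""
    return [q * step for q in fixed]
```

Each recovered weight differs from its original by at most half a step, which is the error the calibration pass would then observe.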

Optionally, when determining the smallest interval to which the target dynamic range belongs, the processor is configured to: perform scaling processing on the target dynamic range; and determine the smallest interval to which the scaled target dynamic range belongs.

Optionally, the quantization parameter set includes a plurality of quantization parameter subsets, each quantization parameter subset corresponding to one layer of the neural network;

the processor is further configured to, for each layer of the neural network, perform quantization processing on the input image of that layer according to the quantization parameter subset corresponding to that layer, and input the quantized input image to that layer, wherein the input image of the input layer of the neural network is the image to be processed, and the input image of each remaining layer is the output image of the preceding layer.

Optionally, each convolution kernel in the neural network has been separately quantized.

Optionally, when performing quantization processing on a first convolution kernel, the processor is configured to quantize the first convolution kernel according to the quantization parameter corresponding to the first convolution kernel, wherein the first convolution kernel is any convolution kernel in the neural network.

Optionally, the network parameters of the neural network are fixed-point network parameters when the quantized image to be processed is input.

Optionally, the fixed-point network parameters are obtained by quantizing the original network parameters of the neural network.

Optionally, the fixed-point network parameters include weight parameters and/or bias parameters.

Optionally, the processor is further configured to preprocess the image input to the neural network.

Optionally, the preprocessing includes zero-mean processing and/or normalization processing.
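The two preprocessing options can be illustrated as follows. The text names the operations but not their exact form; this sketch assumes zero-mean processing subtracts the image mean and normalization divides by a fixed maximum value (255 here), both of which are illustrative choices.

```python
from typing import List


def zero_mean(pixels: List[float]) -> List[float]:
    """Subtract the mean so that the pixel values are centered on zero."""
    mean = sum(pixels) / len(pixels)
    return [p - mean for p in pixels]


def normalize(pixels: List[float], max_value: float = 255.0) -> List[float]:
    """Scale pixel values by an assumed maximum so that they fall in [0, 1]."""
    return [p / max_value for p in pixels]
```

Either step changes the dynamic range of the image, so the dynamic range used for selecting a quantization parameter set would be measured after preprocessing.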

Optionally, the quantization parameter set includes a step size parameter and a zero point parameter.
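A step size and a zero point are the two parameters of an ordinary affine quantizer, which can be sketched as below. The formula and the unsigned 8-bit output range are the common convention and are assumed here; the text does not state them.

```python
from typing import List


def quantize(values: List[float], step: float, zero_point: int,
             bits: int = 8) -> List[int]:
    """Map floats to unsigned fixed-point codes: q = round(v / step) + zero_point."""
    qmin, qmax = 0, 2 ** bits - 1
    return [max(qmin, min(qmax, round(v / step) + zero_point)) for v in values]


def dequantize(codes: List[int], step: float, zero_point: int) -> List[float]:
    """Recover approximate floats: v = (q - zero_point) * step."""
    return [(q - zero_point) * step for q in codes]
```

With a fixed number of bits, a smaller step resolves finer differences but covers a narrower range of values, which is why the step is matched to the dynamic range of the input.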

Optionally, the image to be processed is an image block corresponding to a region of interest in an original image.

Optionally, when determining the region of interest, the processor is configured to: perform semantic segmentation on the original image to determine the object categories contained in the original image; determine the degree of attention corresponding to each object category; and determine the region corresponding to the object category with the highest degree of attention as the region of interest.

Optionally, the region of interest is selected by the user.

For the specific implementation of the image processing apparatus in each of the embodiments above, reference may be made to the description of the image processing method given earlier, which is not repeated here.

In the image processing apparatus provided by the embodiments of the present application, different quantization parameters are set for input images with different dynamic ranges. The image data of an image to be processed can therefore be quantized with quantization parameters adapted to the dynamic range of that image, so that the resulting fixed-point image data has sufficient precision without distortion (that is, its numerical values remain consistent with the actual values). This reduces the precision loss caused by quantization and can meet the high-precision requirements of image regression tasks such as image denoising and super-resolution.

The embodiments of the present application further provide a computer-readable storage medium storing a computer program; when executed by a processor, the computer program implements any of the image processing methods provided by the embodiments of the present application.

The embodiments of the present application may take the form of a computer program product implemented on one or more storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) containing program code. Computer-usable storage media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to: phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.

Provided there is no conflict or contradiction, a person skilled in the art may combine the embodiments and implementations provided above according to actual needs to form other embodiments or implementations. For reasons of space, this application does not describe the embodiments or implementations formed by such other combinations, but it should be understood that solutions obtained by combinations requiring no creative effort also fall within the scope disclosed by the embodiments of the present application.

It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. The terms "comprise", "include", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device comprising a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, article, or device. Without further limitation, an element qualified by the phrase "comprising a..." does not preclude the presence of additional identical elements in the process, method, article, or device that comprises the element.

The method, apparatus, and computer-readable storage medium provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the embodiments, and the description of the above embodiments is intended only to help in understanding the method of the embodiments and its core idea. A person of ordinary skill in the art may, following the idea of the embodiments of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the embodiments of the present application.

Claims (43)

1. An image processing method, characterized in that it is applied to a neural network, the method comprising:
determining a target dynamic range of an image to be processed, the image to be processed being for input to the neural network;
determining, from a plurality of candidate quantization parameter sets, a target quantization parameter set adapted to the target dynamic range, wherein different quantization parameter sets are adapted to different dynamic ranges;
performing quantization processing on the image to be processed according to the target quantization parameter set, so that the image data of the image to be processed is quantized into fixed-point data whose precision is adapted to the target dynamic range.

2. The method according to claim 1, wherein the image to be processed is an image block of an original image;
the method further comprises:
performing fusion processing on the output images corresponding to the respective image blocks of the original image to obtain a target output image corresponding to the original image, wherein each output image is the image output by the neural network when the corresponding image block is the input.

3. The method according to claim 2, wherein performing fusion processing on the output images corresponding to the respective image blocks of the original image comprises:
performing deduplication processing on the overlapping regions between the image blocks;
fusing the deduplicated image blocks.

4. The method according to claim 1, wherein different quantization parameter sets correspond to different dynamic range intervals, and the target quantization parameter set is determined by:
determining the smallest interval to which the target dynamic range belongs;
determining the quantization parameter set corresponding to the smallest interval as the target quantization parameter set.

5. The method according to claim 4, wherein a first interval is any one of the different dynamic range intervals, and the quantization parameter set corresponding to the first interval is determined by:
acquiring a sample image whose dynamic range belongs to the first interval;
determining the quantization parameter set corresponding to the first interval according to the sample image.

6. The method according to claim 5, wherein the quantization parameter set includes a plurality of quantization parameter subsets, each quantization parameter subset corresponding to one layer of the neural network;
determining the quantization parameter set corresponding to the first interval according to the sample image comprises:
acquiring the input image of each layer of the neural network, wherein the input image of the input layer of the neural network is the sample image, and the input images of the remaining layers are images obtained by forward propagation of the neural network with the sample image as input;
for each layer of the neural network, determining the quantization parameter subset corresponding to that layer according to the dynamic range of that layer's input image.

7. The method according to claim 6, wherein the network parameters of the neural network are floating-point network parameters when the sample image is input.

8. The method according to claim 7, wherein the floating-point network parameters are determined by:
quantizing the original network parameters of the neural network to obtain fixed-point network parameters;
inverse-quantizing the fixed-point network parameters to obtain the floating-point network parameters.

9. The method according to claim 4, wherein determining the smallest interval to which the target dynamic range belongs comprises:
performing scaling processing on the target dynamic range;
determining the smallest interval to which the scaled target dynamic range belongs.

10. The method according to claim 1, wherein the quantization parameter set includes a plurality of quantization parameter subsets, each quantization parameter subset corresponding to one layer of the neural network;
the method further comprises:
for each layer of the neural network, performing quantization processing on the input image of that layer according to the quantization parameter subset corresponding to that layer, and inputting the quantized input image to that layer, wherein the input image of the input layer of the neural network is the image to be processed, and the input image of each remaining layer is the output image of the preceding layer.

11. The method according to claim 10, wherein each convolution kernel in the neural network has been separately quantized.

12. The method according to claim 11, wherein a first convolution kernel is any convolution kernel in the neural network, and performing quantization processing on the first convolution kernel comprises:
quantizing the first convolution kernel according to the quantization parameter corresponding to the first convolution kernel.

13. The method according to claim 1, wherein the network parameters of the neural network are fixed-point network parameters when the quantized image to be processed is input.

14. The method according to claim 13, wherein the fixed-point network parameters are obtained by quantizing the original network parameters of the neural network.

15. The method according to claim 13, wherein the fixed-point network parameters include weight parameters and/or bias parameters.

16. The method according to claim 1, wherein the method further comprises:
preprocessing the image input to the neural network.

17. The method according to claim 16, wherein the preprocessing includes zero-mean processing and/or normalization processing.

18. The method according to claim 1, wherein the quantization parameter set includes a step size parameter and a zero point parameter.

19. The method according to claim 1, wherein the image to be processed is an image block corresponding to a region of interest in an original image.

20. The method according to claim 19, wherein the region of interest is determined by:
performing semantic segmentation on the original image to determine the object categories contained in the original image;
determining the degree of attention corresponding to each object category;
determining the region corresponding to the object category with the highest degree of attention as the region of interest.

21. The method according to claim 19, wherein the region of interest is selected by a user.
22. An image processing apparatus, characterized in that it is applied to a neural network and comprises: a processor and a memory storing a computer program;
the processor implements the following steps when executing the computer program:
determining a target dynamic range of an image to be processed, the image to be processed being for input to the neural network;
determining, from a plurality of candidate quantization parameter sets, a target quantization parameter set adapted to the target dynamic range, wherein different quantization parameter sets are adapted to different dynamic ranges;
performing quantization processing on the image to be processed according to the target quantization parameter set, so that the image data of the image to be processed is quantized into fixed-point data whose precision is adapted to the target dynamic range.

23. The apparatus according to claim 22, wherein the image to be processed is an image block of an original image;
the processor is further configured to perform fusion processing on the output images corresponding to the respective image blocks of the original image to obtain a target output image corresponding to the original image, wherein each output image is the image output by the neural network when the corresponding image block is the input.

24. The apparatus according to claim 23, wherein, when performing fusion processing on the output images corresponding to the respective image blocks of the original image, the processor is configured to: perform deduplication processing on the overlapping regions between the image blocks; and fuse the deduplicated image blocks.

25. The apparatus according to claim 22, wherein different quantization parameter sets correspond to different dynamic range intervals;
when determining the target quantization parameter set, the processor is configured to: determine the smallest interval to which the target dynamic range belongs; and determine the quantization parameter set corresponding to the smallest interval as the target quantization parameter set.

26. The apparatus according to claim 25, wherein a first interval is any one of the different dynamic range intervals;
when determining the quantization parameter set corresponding to the first interval, the processor is configured to: acquire a sample image whose dynamic range belongs to the first interval; and determine the quantization parameter set corresponding to the first interval according to the sample image.

27. The apparatus according to claim 26, wherein the quantization parameter set includes a plurality of quantization parameter subsets, each quantization parameter subset corresponding to one layer of the neural network;
when determining the quantization parameter set corresponding to the first interval according to the sample image, the processor is configured to: acquire the input image of each layer of the neural network, wherein the input image of the input layer of the neural network is the sample image, and the input images of the remaining layers are images obtained by forward propagation of the neural network with the sample image as input; and, for each layer of the neural network, determine the quantization parameter subset corresponding to that layer according to the dynamic range of that layer's input image.

28. The apparatus according to claim 27, wherein the network parameters of the neural network are floating-point network parameters when the sample image is input.

29. The apparatus according to claim 28, wherein the processor determines the floating-point network parameters by: quantizing the original network parameters of the neural network to obtain fixed-point network parameters; and inverse-quantizing the fixed-point network parameters to obtain the floating-point network parameters.

30. The apparatus according to claim 25, wherein, when determining the smallest interval to which the target dynamic range belongs, the processor is configured to: perform scaling processing on the target dynamic range; and determine the smallest interval to which the scaled target dynamic range belongs.

31. The apparatus according to claim 25, wherein the quantization parameter set includes a plurality of quantization parameter subsets, each quantization parameter subset corresponding to one layer of the neural network;
the processor is further configured to, for each layer of the neural network, perform quantization processing on the input image of that layer according to the quantization parameter subset corresponding to that layer, and input the quantized input image to that layer, wherein the input image of the input layer of the neural network is the image to be processed, and the input image of each remaining layer is the output image of the preceding layer.

32. The apparatus according to claim 31, wherein each convolution kernel in the neural network has been separately quantized.

33. The apparatus according to claim 32, wherein, when performing quantization processing on a first convolution kernel, the processor is configured to quantize the first convolution kernel according to the quantization parameter corresponding to the first convolution kernel, wherein the first convolution kernel is any convolution kernel in the neural network.

34. The apparatus according to claim 22, wherein the network parameters of the neural network are fixed-point network parameters when the quantized image to be processed is input.

35. The apparatus according to claim 34, wherein the fixed-point network parameters are obtained by quantizing the original network parameters of the neural network.

36. The apparatus according to claim 34, wherein the fixed-point network parameters include weight parameters and/or bias parameters.

37. The apparatus according to claim 22, wherein the processor is further configured to preprocess the image input to the neural network.

38. The apparatus according to claim 37, wherein the preprocessing includes zero-mean processing and/or normalization processing.

39. The apparatus according to claim 22, wherein the quantization parameter set includes a step size parameter and a zero point parameter.

40. The apparatus according to claim 22, wherein the image to be processed is an image block corresponding to a region of interest in an original image.

41. The apparatus according to claim 40, wherein, when determining the region of interest, the processor is configured to: perform semantic segmentation on the original image to determine the object categories contained in the original image; determine the degree of attention corresponding to each object category; and determine the region corresponding to the object category with the highest degree of attention as the region of interest.

42. The apparatus according to claim 40, wherein the region of interest is selected by a user.

43. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program; when executed by a processor, the computer program implements the image processing method according to any one of claims 1-21.
PCT/CN2020/105263 2020-07-28 2020-07-28 Image processing method, image processing device, and computer readable storage medium Ceased WO2022021083A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/105263 WO2022021083A1 (en) 2020-07-28 2020-07-28 Image processing method, image processing device, and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/105263 WO2022021083A1 (en) 2020-07-28 2020-07-28 Image processing method, image processing device, and computer readable storage medium

Publications (1)

Publication Number Publication Date
WO2022021083A1 (en) 2022-02-03

Family

ID=80037273

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/105263 Ceased WO2022021083A1 (en) 2020-07-28 2020-07-28 Image processing method, image processing device, and computer readable storage medium

Country Status (1)

Country Link
WO (1) WO2022021083A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114581879A (en) * 2022-02-08 2022-06-03 广州小鹏自动驾驶科技有限公司 Image recognition method, image recognition device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103400159A (en) * 2013-08-05 2013-11-20 中国科学院上海微系统与信息技术研究所 Target classification identifying method in quick mobile context and classifier obtaining method for target classification and identification in quick mobile context
CN110163370A (en) * 2019-05-24 2019-08-23 上海肇观电子科技有限公司 Compression method, chip, electronic equipment and the medium of deep neural network
CN110799994A (en) * 2017-08-14 2020-02-14 美的集团股份有限公司 Adaptive Bit Width Reduction for Neural Networks
CN110874625A (en) * 2018-08-31 2020-03-10 杭州海康威视数字技术股份有限公司 Deep neural network quantification method and device
US20200154145A1 (en) * 2018-11-08 2020-05-14 Alibaba Group Holding Limited Content-weighted deep residual learning for video in-loop filtering

Similar Documents

Publication Publication Date Title
CN110880038B (en) FPGA-based system for accelerating convolution computing, convolutional neural network
TWI830938B (en) Method and system of quantizing artificial neural network and artificial neural network apparatus
CN110135580B (en) Convolution network full integer quantization method and application method thereof
US12299068B2 (en) Reduced dot product computation circuit
CN109002889B (en) Adaptive iterative convolution neural network model compression method
KR102728799B1 (en) Method and apparatus of artificial neural network quantization
CN112200295B (en) Sorting method, operation method, device and equipment of sparse convolutional neural network
CN110659734B (en) Low bit quantization method for depth separable convolution structure
KR102785479B1 (en) Adaptive quantization method and device, equipment, medium
CN108701250A (en) Data fixed point method and device
CN111240746B (en) Floating point data inverse quantization and quantization method and equipment
CN110363297A (en) Neural network training and image processing method, device, equipment and medium
CN111507473B (en) Pruning method and system based on Crossbar architecture
CN111027684A (en) Deep learning model quantification method and device, electronic equipment and storage medium
CN110874627B (en) Data processing method, data processing device and computer readable medium
CN116306879A (en) Data processing method, device, electronic equipment and storage medium
CN114677548B (en) Neural network image classification system and method based on resistive random access memory
CN109325530A (en) Compression method of deep convolutional neural network based on a small amount of unlabeled data
WO2022021083A1 (en) Image processing method, image processing device, and computer readable storage medium
WO2019165602A1 (en) Data conversion method and device
CN114222997A (en) Method and apparatus for post-training quantization of neural networks
CN112418388B (en) A method and device for implementing deep convolutional neural network processing
US20220405576A1 (en) Multi-layer neural network system and method
CN112199072A (en) Data processing method, device and equipment based on neural network layer
CN115905546B (en) Graph Convolutional Network Document Recognition Device and Method Based on Resistive Variable Memory

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20947173; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 20947173; Country of ref document: EP; Kind code of ref document: A1)