
WO2019237357A1 - Method and device for determining weight parameters of a neural network model - Google Patents

Method and device for determining weight parameters of a neural network model

Info

Publication number
WO2019237357A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
weight parameter
network model
error value
pending
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2018/091652
Other languages
English (en)
Chinese (zh)
Inventor
杨帆
钟刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201880092139.XA priority Critical patent/CN111937011B/zh
Priority to PCT/CN2018/091652 priority patent/WO2019237357A1/fr
Publication of WO2019237357A1 publication Critical patent/WO2019237357A1/fr
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/09Supervised learning
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0495Quantised networks; Sparse networks; Compressed networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • the present application relates to the field of data processing technology, and in particular, to a method and device for determining weight parameters of a neural network model.
  • neural network models have shown extremely superior performance in applications such as computer vision and speech processing, and have received widespread attention.
  • The cost of this success is the introduction of a large number of parameters and computations. Quantizing the model parameters of a neural network model reduces the precision redundancy of those parameters and achieves model compression while limiting the adverse impact on model accuracy.
  • Model compression not only reduces the memory bandwidth and energy consumed by data access; low-precision operations often also consume less energy, and on computing units that support multiple precisions, more low-precision operations than high-precision operations can be completed per unit time.
  • the embodiments of the present application provide a method and a device for determining a weight parameter of a neural network model.
  • By introducing a suitable correction value into the error between the output result of model training and the expected result, the quantization error can be reduced, and the over-fitting problem caused by a few large-valued weight parameters dominating the inference results of the neural network can be avoided.
  • In a first aspect, an embodiment of the present application provides a method for determining a weight parameter of a neural network model, including: processing sample data based on pending weight parameters of the neural network model to obtain an output result; calculating an original error value between the output result and a preset expected result, the original error value being a numerical representation of the difference between the output result and the expected result; correcting the original error value based on a correction value to obtain a correction error value; and determining a model weight parameter of the neural network model based on the correction error value and the pending weight parameters; wherein the correction value is obtained according to the following formula: R = (w_k - Q(w_k)) × Q(w_k), where:
  • R represents the correction value
  • w k represents the k-th pending weight parameter of the neural network model
  • Q (w k ) represents a quantized value of the k-th pending weight parameter
  • k is a non-negative integer.
  • The embodiment of the present application introduces a suitable correction value into the error between the output result of model training and the expected result, thereby reducing the quantization error and avoiding the over-fitting problem caused by a few large-valued weight parameters dominating the inference results of the neural network.
  • the correction error value is obtained according to the following formula: E1 = E0 + (λ/m) × Σ_k F((w_k - Q(w_k)) × Q(w_k)), where the sum runs over the m pending weight parameters and:
  • E1 represents the correction error value
  • E0 represents the original error value
  • λ is a constant
  • m is the total number of pending weight parameters used to process the sample data
  • F((w_k - Q(w_k)) × Q(w_k)) represents a function that takes the correction value as its independent variable
  • m is a positive integer.
  • In a feasible implementation manner, the function that takes the correction value as its independent variable calculates the absolute value of the correction value; correspondingly, the correction error value is obtained according to the following formula: E1 = E0 + (λ/m) × Σ_k |(w_k - Q(w_k)) × Q(w_k)|.
  • In a feasible implementation manner, the neural network model includes p network layers, each network layer includes q pending weight parameters, and the k-th pending weight parameter is the j-th pending weight parameter of the i-th network layer in the neural network model; correspondingly, the correction error value is obtained according to the following formula: E1 = E0 + (λ/m) × Σ_i Σ_j F((w_{i,j} - Q(w_{i,j})) × Q(w_{i,j})), where w_{i,j} denotes the j-th pending weight parameter of the i-th network layer. A numerical sketch of these formulas is given after the symbol definitions below.
  • p and q are positive integers
  • i and j are non-negative integers.
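  • The following is a minimal numerical sketch of the formulas above, assuming a uniform quantizer Q with step 2^-4 and choosing F as the absolute value; the function names and constant values are illustrative assumptions, not the patent's reference implementation.

```python
# Minimal sketch of the correction value R = (w_k - Q(w_k)) * Q(w_k) and the
# correction error value E1 = E0 + (lam / m) * sum_k |R_k|. The uniform
# quantizer and the constants below are assumptions for illustration only.

def quantize(w, step=2 ** -4):
    """Q(w): round w to the nearest multiple of `step` (assumed quantizer)."""
    return round(w / step) * step

def correction(w):
    """R = (w - Q(w)) * Q(w) for a single pending weight parameter."""
    return (w - quantize(w)) * quantize(w)

def corrected_error(e0, weights, lam=0.01):
    """E1 = E0 + (lam / m) * sum of |R_k| over all pending weight parameters."""
    m = len(weights)
    return e0 + lam / m * sum(abs(correction(w)) for w in weights)

weights = [0.93, -0.41, 0.07, 0.52]           # pending weight parameters w_k
print([correction(w) for w in weights])       # per-parameter correction values
print(corrected_error(e0=0.10, weights=weights))
```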
  • In a feasible implementation manner, processing the sample data based on the pending weight parameters of the neural network model includes: obtaining the pending weight parameters; quantizing the obtained pending weight parameters to obtain quantized weight parameters, where a quantized weight parameter is the quantized value of a pending weight parameter; using the quantized weight parameters as the model weight parameters of the neural network model and processing the sample data using a forward propagation algorithm; and obtaining the output result from an output layer of the neural network model.
  • That is, inference is performed on the input sample data based on the neural network model to obtain the output result.
  • In a feasible implementation manner, the model weight parameters of the neural network model are obtained in an iterative training manner, and when the iterative training meets an end condition, determining the model weight parameters of the neural network model based on the correction error value and the pending weight parameters includes: using the quantized weight parameters as the model weight parameters of the neural network model.
  • the embodiment of the present application realizes compression of the neural network model by quantizing the weight parameters, reduces the occupation of memory bandwidth and energy consumption, and improves the computing efficiency of the processor.
  • In a feasible implementation manner, when the iterative training does not meet the end condition, determining the model weight parameters of the neural network model based on the correction error value and the pending weight parameters includes: using a back-propagation algorithm to adjust the pending weight parameters layer by layer through the network layers of the neural network model, according to the correction error value, until the input layer of the neural network model is reached, to obtain the adjusted weight parameters of the neural network model.
  • the pending weight parameters of the neural network model are adjusted according to the following formula: w1_k = w0_k - η × ∂E1/∂w0_k, where:
  • w0 k represents the k-th pending weight parameter
  • w1 k represents the k-th adjusted weight parameter
  • η is a positive constant
  • The back-propagation algorithm mentioned in the embodiments of the present application is also referred to as the error back-propagation algorithm; its specific form is not limited in the embodiments of the present application.
  • Through the back-propagation algorithm, the weight parameters are trained, and the updated weight parameters further optimize the neural network model.
  • N is an integer greater than 1
  • M is a positive integer less than N
  • the end condition includes a combination of one or more of the following conditions: the original error value in the N-th training cycle is less than a preset first threshold; the correction error value in the N-th training cycle is less than a preset second threshold; the difference between the original error value in the N-th training cycle and the original error value in the (N-M)-th training cycle is less than a preset third threshold; the difference between the correction error value in the N-th training cycle and the correction error value in the (N-M)-th training cycle is less than a preset fourth threshold; the difference between the pending weight parameters in the N-th training cycle and the pending weight parameters in the (N-M)-th training cycle is less than a preset fifth threshold; and N is greater than a preset sixth threshold.
  • In a feasible implementation manner, when the N-th training cycle does not satisfy the end condition, one or more of the following physical quantities are stored: the original error value in the N-th training cycle; the correction error value in the N-th training cycle; the pending weight parameters in the N-th training cycle; and the cycle number N of the N-th training cycle.
  • This improves training efficiency and achieves a balance between the training effect and the resources consumed by training.
  • In a feasible implementation manner, obtaining the pending weight parameters includes: in the first training cycle of the iterative training, using preset initial weight parameters as the pending weight parameters; in a non-first training cycle of the iterative training, using the adjusted weight parameters of the neural network model as the pending weight parameters.
  • In a feasible implementation manner, the neural network model is used for image recognition; correspondingly, the sample data includes image samples; and correspondingly, the output result includes the recognition result of the image recognition, characterized in a probability form.
  • In a feasible implementation manner, the neural network model is used for voice recognition; correspondingly, the sample data includes sound samples; and correspondingly, the output result includes the recognition result of the voice recognition, characterized in a probability form.
  • In a feasible implementation manner, the neural network model is used for obtaining a super-resolution image; correspondingly, the sample data includes image samples; and correspondingly, the output result includes the pixel values of the super-resolution processed image.
  • The above feasible implementation manners exemplarily provide specific application scenarios of the neural network model of the embodiments of the present application. The embodiments can improve the recognition rate of image recognition and sound recognition, improve the quality of super-resolution processed images, and can also achieve significant beneficial effects in other application areas.
  • In a second aspect, an embodiment of the present application provides a device for determining a weight parameter of a neural network model, including: a forward propagation module, configured to process sample data based on pending weight parameters of the neural network model to obtain an output result; a comparison module, configured to calculate an original error value between the output result and a preset expected result, the original error value being a numerical representation of the difference between the output result and the expected result; a correction module, configured to correct the original error value based on a correction value to obtain a correction error value; and a determination module, configured to determine a model weight parameter of the neural network model based on the correction error value and the pending weight parameters; wherein the correction value is obtained according to the following formula: R = (w_k - Q(w_k)) × Q(w_k), where:
  • R represents the correction value
  • w k represents the k-th pending weight parameter of the neural network model
  • Q (w k ) represents a quantized value of the k-th pending weight parameter
  • k is a non-negative integer.
  • the correction error value is obtained according to the following formula: E1 = E0 + (λ/m) × Σ_k F((w_k - Q(w_k)) × Q(w_k)), where the sum runs over the m pending weight parameters and:
  • E1 represents the correction error value
  • E0 represents the original error value
  • λ is a constant
  • m is the total number of pending weight parameters used to process the sample data
  • F((w_k - Q(w_k)) × Q(w_k)) represents a function that takes the correction value as its independent variable
  • m is a positive integer.
  • In a feasible implementation manner, the function that takes the correction value as its independent variable calculates the absolute value of the correction value; correspondingly, the correction error value is obtained according to the following formula: E1 = E0 + (λ/m) × Σ_k |(w_k - Q(w_k)) × Q(w_k)|.
  • In a feasible implementation manner, the neural network model includes p network layers, each network layer includes q pending weight parameters, and the k-th pending weight parameter is the j-th pending weight parameter of the i-th network layer in the neural network model; correspondingly, the correction error value is obtained according to the following formula: E1 = E0 + (λ/m) × Σ_i Σ_j F((w_{i,j} - Q(w_{i,j})) × Q(w_{i,j})), where w_{i,j} denotes the j-th pending weight parameter of the i-th network layer.
  • p and q are positive integers
  • i and j are non-negative integers.
  • In a feasible implementation manner, the forward propagation module is specifically configured to: obtain the pending weight parameters; quantize the obtained pending weight parameters to obtain quantized weight parameters, where a quantized weight parameter is the quantized value of a pending weight parameter; use the quantized weight parameters as the model weight parameters of the neural network model and process the sample data using a forward propagation algorithm; and obtain the output result from an output layer of the neural network model.
  • In a feasible implementation manner, the model weight parameters of the neural network model are obtained in an iterative training manner, and when the iterative training meets an end condition, the determination module is specifically configured to: use the quantized weight parameters as the model weight parameters of the neural network model.
  • In a feasible implementation manner, when the iterative training does not satisfy the end condition, the back propagation module is specifically configured to: use a back-propagation algorithm to adjust the pending weight parameters layer by layer through the network layers of the neural network model, according to the correction error value, until the input layer of the neural network model is reached, to obtain the adjusted weight parameters of the neural network model.
  • the pending weight parameters of the neural network model are adjusted according to the following formula: w1_k = w0_k - η × ∂E1/∂w0_k, where:
  • w0 k represents the k-th pending weight parameter
  • w1 k represents the k-th adjusted weight parameter
  • η is a positive constant
  • N is an integer greater than 1
  • M is a positive integer less than N
  • the end condition includes a combination of one or more of the following conditions: the original error value in the N-th training cycle is less than a preset first threshold; the correction error value in the N-th training cycle is less than a preset second threshold; the difference between the original error value in the N-th training cycle and the original error value in the (N-M)-th training cycle is less than a preset third threshold; the difference between the correction error value in the N-th training cycle and the correction error value in the (N-M)-th training cycle is less than a preset fourth threshold; the difference between the pending weight parameters in the N-th training cycle and the pending weight parameters in the (N-M)-th training cycle is less than a preset fifth threshold; and N is greater than a preset sixth threshold.
  • In a feasible implementation manner, when the N-th training cycle does not satisfy the end condition, one or more of the following physical quantities are stored: the original error value in the N-th training cycle; the correction error value in the N-th training cycle; the pending weight parameters in the N-th training cycle; and the cycle number N of the N-th training cycle.
  • In a feasible implementation manner, the forward propagation module is specifically configured to: in the first training cycle of the iterative training, use preset initial weight parameters as the pending weight parameters; in a non-first training cycle of the iterative training, use the adjusted weight parameters of the neural network model as the pending weight parameters.
  • In a feasible implementation manner, the neural network model is used for image recognition; correspondingly, the sample data includes image samples; and correspondingly, the output result includes the recognition result of the image recognition, characterized in a probability form.
  • In a feasible implementation manner, the neural network model is used for voice recognition; correspondingly, the sample data includes sound samples; and correspondingly, the output result includes the recognition result of the voice recognition, characterized in a probability form.
  • In a feasible implementation manner, the neural network model is used for obtaining a super-resolution image; correspondingly, the sample data includes image samples; and correspondingly, the output result includes the pixel values of the super-resolution processed image.
  • an embodiment of the present application provides an electronic device including: one or more processors and one or more memories.
  • the one or more memories are coupled with one or more processors, and the one or more memories are used to store computer program code.
  • the computer program code includes computer instructions.
  • When the one or more processors execute the computer instructions, the electronic device performs the method for determining weight parameters of a neural network model according to the first aspect.
  • An embodiment of the present application provides a computer storage medium including computer instructions; when the computer instructions are run on an electronic device, the electronic device is caused to execute the method for determining a weight parameter of a neural network model according to any one of the first aspect.
  • an embodiment of the present application provides a computer program product that, when the computer program product runs on a computer, causes the computer to execute the method for determining a weight parameter of a neural network model according to any one of the first aspect.
  • an embodiment of the present application provides a chip including a processor and a memory.
  • the memory is used to store computer program code.
  • the computer program code includes computer instructions.
  • When the processor executes the computer instructions, the electronic device executes the method for determining weight parameters of a neural network model according to any one of the first aspect.
  • FIG. 1 is an exemplary schematic diagram of a neural network structure
  • FIG. 2 is a schematic structural diagram of an exemplary neuron
  • FIG. 3 is an exemplary schematic diagram of another neural network structure
  • FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • FIG. 5 is an exemplary flowchart of a method for determining weight parameters of a neural network model according to an embodiment of the present application
  • FIG. 6 is an exemplary structural block diagram of a device for determining a weight parameter of a neural network model according to an embodiment of the present application
  • FIG. 7 is a schematic structural diagram of another electronic device according to an embodiment of the present application.
  • Neural networks can be used to process various data, such as image data, audio data, and so on.
  • a neural network may include one or more network layers (also called neural network layers).
  • the network layer may be a convolutional layer, a fully connected layer, a deconvolutional layer, or a recurrent layer.
  • a typical neural network model is shown in Figure 1.
  • Neural network model and forward propagation: for a training sample set (x^(i), y^(i)), the neural network algorithm can provide a complex, nonlinear hypothesis model h_{W,b}(x) with parameters W and b, which can be used to fit the data.
  • The neural network model is also referred to herein simply as the neural network. Consider first a neural network that consists of only a single "neuron"; Figure 2 is a diagram of this "neuron". This "neuron" is a computational unit that takes x_1, x_2, x_3 and an intercept term +1 as input values, and its output is h_{W,b}(x) = f(W^T x) = f(Σ_{i=1}^{3} W_i x_i + b), where the function f is called the "activation function".
  • A so-called neural network is formed by connecting many single "neurons" together, so that the output of one "neuron" can be the input of another "neuron".
  • Figure 3 is a simple neural network.
  • In Figure 3, the circles are used to represent the inputs of the neural network, and the circle marked "+1" is called the bias node, that is, the intercept term.
  • the leftmost layer of the neural network is called the input layer, and the rightmost layer is called the output layer (in this example, the output layer has only one node).
  • a layer composed of all nodes in the middle is called a hidden layer (in other embodiments, the hidden layer may not exist, or there may be multiple layers).
  • In this example, the number of layers of the network is denoted n_l, and n_l = 3. Layer L_1 is the input layer, and layer L_{n_l} is the output layer. s_l denotes the number of nodes of layer l (not counting the bias unit). a_i^(l) denotes the activation value (output value) of the i-th unit of the l-th layer. For l = 1, a_i^(1) = x_i, that is, the i-th input value (the i-th feature of the input).
  • Given the weight parameters, the neural network can calculate the output according to the function h_{W,b}(x). Writing z^(l) for the vector of weighted inputs to layer l, the calculation steps of the neural network in this example are: z^(2) = W^(1) x + b^(1); a^(2) = f(z^(2)); z^(3) = W^(2) a^(2) + b^(2); h_{W,b}(x) = a^(3) = f(z^(3)). The above calculation steps are called forward propagation.
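  • To make these steps concrete, the following is a minimal sketch of forward propagation for a small network like the one in Figure 3, assuming a sigmoid activation function and placeholder weights (the patent fixes neither choice).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, layers):
    """Forward propagation: for each layer, compute z = W*a + b, then a = f(z)."""
    a = x
    for W, b in layers:                      # W is a list of rows, b a list of biases
        a = [sigmoid(sum(wij * aj for wij, aj in zip(row, a)) + bi)
             for row, bi in zip(W, b)]
    return a

# A 3-input network with one hidden layer of 3 units and a single output unit,
# mirroring the simple network of Figure 3; the weight values are placeholders.
hidden = ([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]], [0.1, 0.1, 0.1])
output = ([[0.3, 0.6, 0.9]], [0.2])
print(forward([1.0, 0.5, -0.5], [hidden, output]))   # h_{W,b}(x)
```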
  • The back-propagation (BP) algorithm, also known as the error back-propagation algorithm, mainly repeats and iterates two phases (excitation propagation and weight update) until the network's response to the input reaches the predetermined target range. The learning process of the error back-propagation algorithm consists of a forward propagation process and a back propagation process.
  • In the forward propagation process, the input information passes from the input layer through the hidden layers, is processed layer by layer, and is transmitted to the output layer. If the desired output value cannot be obtained at the output layer, some numerical representation of the error between the output and the expectation (such as the sum of squared differences) is taken as the objective function, and the process turns to back propagation: the partial derivatives of the objective function with respect to the weights of the neurons are computed layer by layer, and together they constitute the gradient of the objective function with respect to the weight vector, which serves as the basis for modifying the weights. The learning of the network is completed in the process of modifying the weights; when the error reaches the expected value, network learning ends.
  • The propagation phase in each iteration includes two steps: sending the training input to the network to obtain the excitation response (the forward propagation phase); and computing the difference between the excitation response and the target output corresponding to the training input to obtain the response errors of the hidden layers and the output layer (the back propagation phase).
  • In the weight update phase, the weight on each neuron is updated according to the following steps: multiply the input excitation by the response error to obtain the gradient of the weight; then multiply this gradient by a ratio, invert it, and add it to the weight. This ratio affects the speed and effectiveness of the training process and is therefore called the "training factor". The direction of the gradient indicates the direction in which the error grows; therefore, when updating the weights, the gradient must be inverted so as to reduce the error caused by the weights.
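  • A minimal sketch of this weight-update step (the learning-rate value is an arbitrary assumption):

```python
def update_weights(weights, gradients, eta=0.01):
    """Gradient-descent update as described above: invert the gradient
    direction, scale it by the training factor eta, and add it to the weight,
    i.e. w <- w - eta * dE/dw."""
    return [w - eta * g for w, g in zip(weights, gradients)]

print(update_weights([0.5, -0.3], [0.2, -0.1]))   # -> [0.498, -0.299]
```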
  • the data processing device involved in this embodiment of the present application is an electronic device that processes data such as images and voice by using a convolutional neural network, and may be, for example, a server or a terminal.
  • The electronic device may specifically be a desktop computer, a portable computer, a personal digital assistant (PDA), a tablet computer, an embedded device, a mobile phone, a smart wearable (such as a smart watch, a smart band, or smart glasses), a TV set-top box, a surveillance camera, and so on.
  • the embodiment of the present application does not limit the specific type of the electronic device.
  • FIG. 4 shows a schematic diagram of a hardware structure of an electronic device 400 according to an embodiment of the present application.
  • the electronic device 400 may include at least one processor 401, a communication bus 402, and a memory 403.
  • the electronic device 400 may further include at least one communication interface 404.
  • The processor 401 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or a field programmable gate array (FPGA).
  • the communication bus 402 may include a path for transmitting information between the aforementioned components.
  • The communication interface 404 uses any transceiver-like device to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
  • The memory 403 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions; a random access memory (RAM) or another type of dynamic storage device that can store information and instructions; an electrically erasable programmable read-only memory (EEPROM); a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.); a magnetic disk storage medium or other magnetic storage device; or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer; but it is not limited thereto.
  • the memory may exist independently and be connected to the processor through a bus. The memory can also be integrated with the processor.
  • The memory 403 is configured to store the application program code for executing the solution provided by the embodiments of the present application, as well as the neural network model structure, the weights, and the intermediate results of the operations, with the processor 401 controlling the execution. The processor 401 is configured to execute the application program code stored in the memory 403, so as to implement the method for determining weight parameters of a neural network model provided in the following embodiments of the present application.
  • the processor 401 may include one or more CPUs, such as CPU0 and CPU1 in FIG. 4.
  • the electronic device 400 may include multiple processors, such as the processor 401 and the processor 407 in FIG. 4. Each of these processors can be a single-CPU processor or a multi-CPU processor.
  • a processor herein may refer to one or more devices, circuits, and / or processing cores for processing data (such as computer program instructions).
  • the electronic device 400 may further include an output device 405 and an input device 406.
  • the output device 405 communicates with the processor 401 and can display information in a variety of ways.
  • the output device 405 may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, a projector, etc.
  • the input device 406 is in communication with the processor 401 and can accept user input in a variety of ways.
  • the input device 406 may be a mouse, a keyboard, a camera, a microphone, a touch screen device, or a sensing device.
  • Neural networks are composed of network layers, and each network layer processes its input data and passes the result to the next network layer. According to the different attributes of a network layer (such as a convolutional layer or a fully connected layer), different input weights are used to perform convolution, or multiplication and addition, operations on the input data. The manner of these operations is determined by the attributes of the network layer, but the weight values used in the operations are obtained through training; different data processing results can be obtained by adjusting the weight values.
  • Because the precision of neural network weight parameters is redundant, weight parameter compression can be achieved by converting the weights from high-precision data formats (such as FP32 or FP64) to low-precision data formats (such as INT8 or binary formats).
  • Quantizing weight parameters is a feasible implementation of weight parameter compression. Compressing the precision of neural network parameters strikes a balance among storage, energy consumption, and accuracy. The data that can be quantized includes, without limitation, the weights, feature tensors (activations), gradients, and other parameters of the neural network model.
  • In a typical quantization flow, the weight coefficients of the high-precision (non-quantized, generally FP32- or FP64-precision) neural network are first obtained through training, and the weight coefficients expressed in the high-precision data format are then re-expressed in a low-precision data format. The quantized model parameters are generally trained again while keeping the low-precision data format, and the weight values are adjusted through this retraining so that, at low precision, the model finally reaches an accuracy close to that of the high-precision model.
  • During training, the sample data is input to the neural network, the difference between the output and the expected data is calculated, and this difference is then used to calculate the gradient of the neural network weights to be adjusted (that is, the adjustment trend of the weights); adjusting the weights of the neural network reduces the difference, that is, achieves higher accuracy.
  • The gradient, however, is very likely smaller than the minimum interval value that can be expressed at the weight precision, which prevents the gradient from actually adjusting the weight value. For example, for an INT4 (4-bit integer) data format whose value range is defined as 0 to 1, the minimum expressible interval is 2^-4. The gradient is likely to be much smaller than this value, say 2^-6. For any value A under INT4, the result of adding the gradient should be A + 2^-6; but because the minimum interval of the INT4 data format cannot express 2^-6, the actual result is still A. Therefore, quantized weights cannot be used directly for training.
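  • The following sketch reproduces this numerical argument, assuming a low-precision representation whose smallest expressible interval is 2^-4:

```python
STEP = 2 ** -4                      # smallest interval expressible at this precision

def to_low_precision(x):
    """Round x to the nearest representable value (multiples of 2^-4)."""
    return round(x / STEP) * STEP

w = 0.5                             # a representable weight value A
grad = 2 ** -6                      # a gradient much smaller than the step
w_updated = to_low_precision(w - grad)
print(w_updated == w)               # True: the update is lost, A stays A
```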
  • For this reason, a common training method is to use high-precision reference weights to carry the unquantized weight information: the quantized weights are derived from the reference weights, the error calculation is performed with the quantized weights, and the resulting error gradient is applied to the unquantized weights (also called the reference weights).
  • In this way, the reference weights can accumulate adjustments that the quantized representation cannot yet express. For example, suppose the reference weight is initially 1 and floats slightly during the adjustment process: the quantized weight becomes 2 only when the reference weight exceeds 1.5, and becomes 0 only when the reference weight falls below 0.5; otherwise the quantized weight remains 1.
  • The quantized neural network calculation uses the quantized weights instead of the reference weights. The differences between the reference weights and the quantized weight values accumulate in the neural network calculation and ultimately affect the calculation result; this is the quantization error. In addition, the gradient used to adjust the reference weights is derived from the calculation result and the inference error obtained with the quantized weights, which is not accurate enough; this is the gradient error.
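  • A minimal sketch of this reference-weight scheme, using a toy one-layer linear "network" and a uniform quantizer (both illustrative assumptions rather than the patent's implementation):

```python
def quantize(w, step=2 ** -4):
    return round(w / step) * step

def train_step(ref_weights, sample, target, eta=0.05):
    """One cycle of the reference-weight scheme sketched above:
    1. derive the quantized weights from the reference weights;
    2. run the forward pass / error calculation with the quantized weights;
    3. apply the resulting gradient to the (unquantized) reference weights."""
    q = [quantize(w) for w in ref_weights]
    y = sum(wi * xi for wi, xi in zip(q, sample))      # toy linear "network"
    error = y - target
    grads = [error * xi for xi in sample]              # dE/dw at the quantized point
    return [w - eta * g for w, g in zip(ref_weights, grads)]

ref = [1.0, -0.2]
for _ in range(10):
    ref = train_step(ref, sample=[0.5, 1.5], target=0.4)
print(ref, [quantize(w) for w in ref])                 # reference vs quantized
```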
  • An embodiment of the present application provides a method for determining a weight parameter of a neural network model, as shown in FIG. 5, which specifically includes:
  • S501 Process the sample data based on the pending weight parameters of the neural network model to obtain an output result.
  • In a feasible implementation manner, this step includes the following sub-steps. S5011: Obtain the pending weight parameters. The model weight parameters of the neural network model are obtained in an iterative training manner. When entering training for the first time, that is, during the first training cycle of the iterative training, performing step S5011 includes obtaining default initial values, for example assigning constants such as 0 or 1 to the pending weight parameters, or assigning determined values to the pending weight parameters, such as the stored weight parameters of a pre-trained neural network model. During the iterative training process, that is, during a non-first training cycle of the iterative training, performing step S5011 includes obtaining the pending weight parameters updated by the back-propagation algorithm in the previous training cycle (that is, the adjusted weight parameters) as the pending weight parameters obtained in this step.
  • S5012: Quantize the obtained pending weight parameters to obtain quantized weight parameters, where a quantized weight parameter is the quantized value of a pending weight parameter.
  • That is, the high-precision expression of the weight parameters is transformed into the low-precision expression introduced in the foregoing; the specific quantization manner is not limited in this step.
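  • One possible realization of this quantization step is symmetric per-tensor INT8 quantization, sketched below; the scheme, the function name, and the values are assumptions, since the quantization manner is not limited here:

```python
def quantize_int8(weights):
    """Affine quantization of FP32 weights to INT8 (assumed symmetric,
    per-tensor scheme). Returns the integer codes and the scale needed to
    recover the quantized values Q(w_k)."""
    scale = max(abs(w) for w in weights) / 127.0
    codes = [max(-128, min(127, round(w / scale))) for w in weights]
    return codes, scale

codes, scale = quantize_int8([0.93, -0.41, 0.07, 0.52])
print(codes, [c * scale for c in codes])   # Q(w_k): the quantized values
```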
  • Next, the quantized pending weight parameters are used as the model weight parameters of the neural network model, the sample data is taken as input, and the calculation is performed according to the forward propagation algorithm introduced in the foregoing; it should be understood that this step does not limit the specific forward propagation algorithm. A calculation result for the sample data is then output from an output layer of the neural network model.
  • In a feasible implementation manner, the neural network model is used for image recognition, the sample data includes image samples, and the output result includes the recognition result of the image recognition, characterized in a probability form; specifically, it may be determined that the probability that a sample image is the target image is 90%.
  • In a feasible implementation manner, the neural network model is used for voice recognition, the sample data includes voice samples, and the output result includes the recognition result of the voice recognition, characterized in a probability form; specifically, it may be determined that the probability that a sample voice is the target voice is 20%.
  • the neural network model is used to obtain a super-resolution image
  • the sample data includes image samples, and the output result includes pixel values of the image after super-resolution processing.
  • the neural network model can also be used in other applications related to the field of artificial intelligence.
  • the sample data as input data and the output result as output data can also be other types of physical quantities, without limitation.
  • the difference between the expected output result, that is, the expected result, and the actual output result is calculated in this step, and the difference is characterized in a numerical form.
  • The difference may be a difference between recognition results: for example, if the recognition result is 90% and the expected result is 100%, the original error value is 10%. Alternatively, it may be the pixel difference between the original image corresponding to the sample image before down-sampling and the sample image after super-resolution processing, which can be expressed, for example, by the peak signal-to-noise ratio (PSNR) between the two, such as -0.2 decibels (dB), or by the squared differences between the pixels of the two images; this is determined according to the specific application of the neural network model and is not limited.
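  • For the super-resolution case, the PSNR mentioned above can be computed as in the following sketch (flat pixel lists and an 8-bit peak value are assumptions):

```python
import math

def psnr(img_a, img_b, peak=255.0):
    """Peak signal-to-noise ratio between two equal-sized images (flat pixel
    lists), one way to express the original error value for super-resolution."""
    mse = sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)
    return float('inf') if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

print(psnr([52, 55, 61, 59], [54, 55, 60, 58]))   # about 46.4 dB
```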
  • a correction value is first obtained.
  • the correction value is obtained according to the following formula: R = (w_k - Q(w_k)) × Q(w_k), where:
  • R represents the correction value
  • w k represents the k-th pending weight parameter of the neural network model
  • Q (w k ) represents a quantized value of the k-th pending weight parameter
  • k is a non-negative integer.
  • the correction error value is then obtained according to the following formula: E1 = E0 + (λ/m) × Σ_k F((w_k - Q(w_k)) × Q(w_k)), where the sum runs over the m pending weight parameters and:
  • E1 represents the correction error value
  • E0 represents the original error value
  • λ is a constant
  • m is the total number of pending weight parameters used to process the sample data
  • F((w_k - Q(w_k)) × Q(w_k)) represents a function that takes the correction value as its independent variable
  • m is a positive integer.
  • In a feasible implementation manner, the function that takes the correction value as its independent variable calculates the absolute value of the correction value; correspondingly, the correction error value is obtained according to the following formula: E1 = E0 + (λ/m) × Σ_k |(w_k - Q(w_k)) × Q(w_k)|.
  • In a feasible implementation manner, the neural network model includes p network layers, and each network layer includes q pending weight parameters, the k-th pending weight parameter being the j-th pending weight parameter of the i-th network layer of the neural network model; correspondingly, the correction error value is obtained according to the following formula: E1 = E0 + (λ/m) × Σ_i Σ_j F((w_{i,j} - Q(w_{i,j})) × Q(w_{i,j})), where w_{i,j} denotes the j-th pending weight parameter of the i-th network layer.
  • p and q are positive integers
  • i and j are non-negative integers.
  • A certain network layer of the neural network model may contain no pending weight parameters, that is, the corresponding q of that network layer is 0; obviously, such a network layer does not participate in the calculation of the correction error value.
  • Through the correction value (a regularization function), the difference between a weight parameter and its quantized value is used as a penalty term, guiding the non-quantized weight parameter toward its quantized weight value during the training process, so as to reduce the quantization error. Furthermore, the above difference used as the penalty term is multiplied by the quantized weight parameter, which avoids the over-fitting problem caused by a few large-valued weight parameters dominating the inference results of the neural network.
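  • The guiding effect of the penalty term can be seen from its gradient. Treating Q(w) as locally constant (an assumption; the patent does not specify how Q is handled during differentiation), the gradient of λ × |(w - Q(w)) × Q(w)| with respect to w is λ × sign((w - Q(w)) × Q(w)) × Q(w), so a gradient-descent step moves w toward its quantized value:

```python
def quantize(w, step=2 ** -4):
    return round(w / step) * step

def penalty_grad(w, lam=0.1):
    """Gradient of lam * |(w - Q(w)) * Q(w)| with respect to w, treating Q(w)
    as locally constant (an assumption made for this illustration)."""
    q = quantize(w)
    r = (w - q) * q
    sign = (r > 0) - (r < 0)
    return lam * sign * q

w = 0.91                 # Q(w) = 0.9375, so w sits below its quantized value
print(penalty_grad(w))   # negative: gradient descent pushes w up toward 0.9375
```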
  • a model weight parameter of the neural network model is determined based on the modified error value and the pending weight parameter.
  • N is an integer greater than 1
  • M is a positive integer less than N
  • the end condition includes a combination of one or more of the following conditions (a sketch of such a check is given after this list):
  • the original error value in the Nth training cycle is less than a preset first threshold
  • the correction error value in the Nth training cycle is less than a preset second threshold
  • the difference between the original error value in the N-th training cycle and the original error value in the (N-M)-th training cycle is less than a preset third threshold
  • the difference between the correction error value in the N-th training cycle and the correction error value in the (N-M)-th training cycle is less than a preset fourth threshold
  • the difference between the pending weight parameters in the N-th training cycle and the pending weight parameters in the (N-M)-th training cycle is less than a preset fifth threshold
  • N is greater than a preset sixth threshold.
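  • A sketch of such an end-condition check follows; which conditions are combined, and the threshold values, are arbitrary assumptions:

```python
def should_stop(state, prev, th):
    """Evaluate a combination of the end conditions listed above; the
    pending-weight-difference condition can be added analogously."""
    return (state["E0"] < th["first"]                  # original error small
            or state["E1"] < th["second"]              # correction error small
            or abs(state["E0"] - prev["E0"]) < th["third"]
            or abs(state["E1"] - prev["E1"]) < th["fourth"]
            or state["N"] > th["sixth"])               # cycle-count limit

state = {"E0": 0.08, "E1": 0.12, "N": 500}   # values of the N-th cycle
prev = {"E0": 0.081, "E1": 0.121}            # values stored M cycles earlier
print(should_stop(state, prev, {"first": 0.05, "second": 0.05,
                                "third": 1e-4, "fourth": 1e-4, "sixth": 1000}))
```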
  • step S504 may be performed in each training cycle, and may also be performed in every M training cycles.
  • the embodiment of this application does not limit the execution frequency of step S504.
  • A training cycle may be understood as the process of calculating a correction error value, adjusting the weight parameters according to the correction error value, and then using the adjusted weight parameters to obtain a new training result.
  • In a feasible implementation manner of step S504, when the N-th training cycle does not satisfy the end condition, one or more of the following physical quantities are stored: the original error value in the N-th training cycle; the correction error value in the N-th training cycle; the pending weight parameters in the N-th training cycle; and the cycle number N of the N-th training cycle. The stored physical quantities are called upon in subsequent executions of step S504.
  • the end of iterative training means that through training, the quantized weight parameters have been optimized to the desired degree, that is, they can be determined as the model weight parameters of the neural network model.
  • For example, a training sample set A is used to train the model and a test sample set B is used to test the model: the model is tested on test set B to obtain a first test result X; training with A then continues for M training cycles, after which the model is tested on B again to obtain a second test result Y; if the difference between X and Y is less than a threshold, the training ends, otherwise training with A continues. In this case, the end condition includes that the difference between the first test result X and the second test result Y is smaller than the threshold.
  • When the end condition is not satisfied, a back-propagation algorithm is used to adjust the pending weight parameters layer by layer through the network layers of the neural network model, according to the correction error value, until the input layer of the neural network model is reached, to obtain the adjusted weight parameters of the neural network model; the pending weight parameters are adjusted according to the following formula (a minimal sketch is given after this step): w1_k = w0_k - η × ∂E1/∂w0_k, where:
  • w0 k represents the k-th pending weight parameter
  • w1 k represents the k-th adjusted weight parameter
  • η is a positive constant
  • step S5011 is executed to continue the iterative training.
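  • A minimal sketch of this adjustment; the decomposition of ∂E1/∂w0_k into the original error gradient plus the correction term's gradient follows from E1 = E0 + penalty, while the numerical values are placeholders:

```python
def adjust_weight(w0, grad_e0, grad_penalty, eta=0.05):
    """w1_k = w0_k - eta * dE1/dw0_k, with dE1/dw0_k split into the original
    error gradient and the correction term's gradient (E1 = E0 + penalty)."""
    return w0 - eta * (grad_e0 + grad_penalty)

# Placeholder gradients: the penalty gradient (-0.09375, from the sketch above)
# pulls the weight toward its quantized value 0.9375.
print(adjust_weight(w0=0.91, grad_e0=0.02, grad_penalty=-0.09375))
```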
  • the embodiments of the present application provide a method and a device for determining a weight parameter of a neural network model.
  • By introducing a suitable correction value into the error between the output result of model training and the expected result, the quantization error is reduced and the over-fitting problem caused by a few large-valued weight parameters dominating the inference results of the neural network is avoided.
  • the electronic device includes a hardware structure and / or a software module corresponding to each function.
  • Those skilled in the art should appreciate that this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer-software-driven hardware depends on the specific application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.
  • the embodiments of the present application may divide the functional modules of the electronic device according to the foregoing method examples.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the above integrated modules can be implemented in the form of hardware or software functional modules. It should be noted that the division of the modules in the embodiments of the present application is schematic, and is only a logical function division. In actual implementation, there may be another division manner.
  • FIG. 6 shows a possible composition diagram of the electronic device involved in the foregoing embodiment.
  • a device 600 for determining a weight parameter of a neural network model includes:
  • the forward propagation module 601 is configured to process the sample data based on the neural network model's pending weight parameters to obtain an output result;
  • a comparison module 602 configured to calculate an original error value of the output result and a preset expected result, where the original error value is a numerical representation of a difference between the output result and the expected result;
  • a correction module 603, configured to correct the original error value based on the correction value to obtain a correction error value
  • a determining module 604 configured to determine a model weight parameter of the neural network model based on the modified error value and the pending weight parameter;
  • the correction value is obtained according to the following formula: R = (w_k - Q(w_k)) × Q(w_k), where:
  • R represents the correction value
  • w k represents the k-th pending weight parameter of the neural network model
  • Q (w k ) represents a quantized value of the k-th pending weight parameter
  • k is a non-negative integer.
  • the correction error value is obtained according to the following formula: E1 = E0 + (λ/m) × Σ_k F((w_k - Q(w_k)) × Q(w_k)), where the sum runs over the m pending weight parameters and:
  • E1 represents the correction error value
  • E0 represents the original error value
  • λ is a constant
  • m is the total number of pending weight parameters used to process the sample data
  • F((w_k - Q(w_k)) × Q(w_k)) represents a function that takes the correction value as its independent variable
  • m is a positive integer.
  • In a feasible implementation manner, the function that takes the correction value as its independent variable calculates the absolute value of the correction value; correspondingly, the correction error value is obtained according to the following formula: E1 = E0 + (λ/m) × Σ_k |(w_k - Q(w_k)) × Q(w_k)|.
  • In a feasible implementation manner, the neural network model includes p network layers, and each network layer includes q pending weight parameters, the k-th pending weight parameter being the j-th pending weight parameter of the i-th network layer of the neural network model; correspondingly, the correction error value is obtained according to the following formula: E1 = E0 + (λ/m) × Σ_i Σ_j F((w_{i,j} - Q(w_{i,j})) × Q(w_{i,j})), where w_{i,j} denotes the j-th pending weight parameter of the i-th network layer.
  • p and q are positive integers
  • i and j are non-negative integers.
  • The forward propagation module 601 is specifically configured to: obtain the pending weight parameters; quantize the obtained pending weight parameters to obtain quantized weight parameters, where a quantized weight parameter is the quantized value of a pending weight parameter; use the quantized weight parameters as the model weight parameters of the neural network model and process the sample data using a forward propagation algorithm; and obtain the output result from an output layer of the neural network model.
  • In a feasible implementation manner, the model weight parameters of the neural network model are obtained by iterative training, and when the iterative training meets the end condition, the determination module 604 is specifically configured to: use the quantized weight parameters as the model weight parameters of the neural network model.
  • In a feasible implementation manner, when the iterative training does not meet the end condition, the back propagation module 605 is specifically configured to: use a back-propagation algorithm to adjust the pending weight parameters layer by layer through the network layers of the neural network model, according to the correction error value, until the input layer of the neural network model is reached, to obtain the adjusted weight parameters of the neural network model.
  • the pending weight parameters of the neural network model are adjusted according to the following formula: w1_k = w0_k - η × ∂E1/∂w0_k, where:
  • w0 k represents the k-th pending weight parameter
  • w1 k represents the k-th adjusted weight parameter
  • η is a positive constant
  • N is an integer greater than 1
  • M is a positive integer less than N
  • In a feasible implementation manner, the end condition includes a combination of one or more of the following conditions: the original error value in the N-th training cycle is less than a preset first threshold; the correction error value in the N-th training cycle is less than a preset second threshold; the difference between the original error value in the N-th training cycle and the original error value in the (N-M)-th training cycle is less than a preset third threshold; the difference between the correction error value in the N-th training cycle and the correction error value in the (N-M)-th training cycle is less than a preset fourth threshold; the difference between the pending weight parameters in the N-th training cycle and the pending weight parameters in the (N-M)-th training cycle is less than a preset fifth threshold; and N is greater than a preset sixth threshold.
  • In a feasible implementation manner, when the N-th training cycle does not satisfy the end condition, one or more of the following physical quantities are stored: the original error value in the N-th training cycle; the correction error value in the N-th training cycle; the pending weight parameters in the N-th training cycle; and the cycle number N of the N-th training cycle.
  • In a feasible implementation manner, the forward propagation module 601 is specifically configured to: in the first training cycle of the iterative training, use preset initial weight parameters as the pending weight parameters; in a non-first training cycle of the iterative training, use the adjusted weight parameters of the neural network model as the pending weight parameters.
  • In a feasible implementation manner, the neural network model is used for image recognition; correspondingly, the sample data includes image samples; correspondingly, the output result includes the recognition result of the image recognition, characterized in a probability form.
  • In a feasible implementation manner, the neural network model is used for voice recognition; correspondingly, the sample data includes voice samples; correspondingly, the output result includes the recognition result of the voice recognition, characterized in a probability form.
  • In a feasible implementation manner, the neural network model is used for obtaining a super-resolution image; correspondingly, the sample data includes image samples; correspondingly, the output result includes the pixel values of the super-resolution processed image.
  • the electronic device includes but is not limited to the above-listed unit modules.
  • the electronic device may further include a communication unit.
  • the communication unit may include a sending unit for sending data or signals to other devices, and receiving data or signals sent by other devices. Receiving unit, etc.
  • The functions that can be implemented by the above functional units include, but are not limited to, the functions corresponding to the method steps of the above examples. For detailed descriptions of the other units of the electronic device, refer to the detailed descriptions of the corresponding method steps, which are not repeated here.
  • the processing unit 701 in FIG. 7 may be a processor or a controller.
  • The processing unit 701 may be a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or a field programmable gate array (FPGA).
  • a processor may also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a DSP and a microprocessor, and so on.
  • the storage unit 702 may be a memory.
  • the communication unit may be a transceiver, a radio frequency circuit, or a communication interface.
  • the processing unit 701 executes a method for determining a weight parameter of a neural network model as shown in FIG. 5.
  • An embodiment of the present application further includes a computer storage medium including computer instructions, and when the computer instructions are executed on an electronic device, the electronic device is caused to execute a method for determining a weight parameter of a neural network model shown in FIG. 5.
  • the embodiment of the present application further includes a computer program product, when the computer program product is run on a computer, the computer is caused to execute a method for determining a weight parameter of a neural network model shown in FIG. 5.
  • the disclosed apparatus and method may be implemented in other ways.
  • the device embodiments described above are only schematic.
  • The division of modules or units is only a logical function division; in actual implementation there may be other division manners. For example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, which may be electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may be one physical unit or multiple physical units, that is, they may be located in one place, or may be distributed to multiple different places. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit.
  • the above integrated unit may be implemented in the form of hardware or in the form of software functional unit.
  • the integrated unit When the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium.
  • The technical solutions of the embodiments of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes a number of instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods in the embodiments of the present application.
  • The foregoing storage media include various media that can store program code, such as USB flash drives, removable hard disks, read-only memories (ROM), random access memories (RAM), magnetic disks, or optical discs.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A method for determining weight parameters of a neural network model, comprising: processing sample data based on pending weight parameters of a neural network model to obtain an output result; calculating an original error value between the output result and a preset expected result, the original error value being a numerical representation of the difference between the output result and the expected result; correcting the original error value based on a correction value to obtain a corrected error value; and determining, based on the corrected error value and the pending weight parameters, model weight parameters of the neural network model. The correction value is obtained according to the following formula: R = (w_k - Q(w_k)) × Q(w_k), where R represents the correction value, w_k represents the k-th pending weight parameter of the neural network model, Q(w_k) represents the quantized value of the k-th pending weight parameter, and k is a non-negative integer.
PCT/CN2018/091652 2018-06-15 2018-06-15 Method and device for determining weight parameters of a neural network model Ceased WO2019237357A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201880092139.XA 2018-06-15 2018-06-15 Method and device for determining weight parameters of a neural network model
PCT/CN2018/091652 2018-06-15 2018-06-15 Method and device for determining weight parameters of a neural network model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/091652 2018-06-15 2018-06-15 Method and device for determining weight parameters of a neural network model

Publications (1)

Publication Number Publication Date
WO2019237357A1 (fr)

Family

ID=68841775

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/091652 Ceased WO2019237357A1 (fr) 2018-06-15 2018-06-15 Method and device for determining weight parameters of a neural network model

Country Status (2)

Country Link
CN (1) CN111937011B (fr)
WO (1) WO2019237357A1 (fr)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112836435A (zh) * 2021-03-02 2021-05-25 上海交通大学 Coarse-grid numerical simulation result correction method and apparatus, and electronic device
CN114358245B (zh) * 2021-12-24 2025-12-05 杭州海康威视数字技术股份有限公司 Model sensitivity determination method and apparatus, and electronic device
CN116992937A (zh) * 2022-04-19 2023-11-03 华为技术有限公司 Neural network model repair method and related device
CN115729110A (zh) * 2022-11-24 2023-03-03 广东汇天航空航天科技有限公司 Airspeed prediction method and apparatus, multi-rotor aircraft, and readable storage medium
CN115983506A (zh) * 2023-03-20 2023-04-18 华东交通大学 Water quality early-warning method and system, and readable storage medium
CN118075418B (zh) * 2024-04-25 2024-07-16 深圳市慧明捷科技有限公司 Video conference content output optimization method and apparatus, device, and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6269351B1 (en) * 1999-03-31 2001-07-31 Dryken Technologies, Inc. Method and system for training an artificial neural network
US7409372B2 (en) * 2003-06-20 2008-08-05 Hewlett-Packard Development Company, L.P. Neural network trained with spatial errors
US11049006B2 (en) * 2014-09-12 2021-06-29 Microsoft Technology Licensing, Llc Computing system for training neural networks
CN106096723B (zh) * 2016-05-27 2018-10-30 北京航空航天大学 Performance evaluation method for complex industrial products based on a hybrid neural network algorithm
US11049011B2 (en) * 2016-11-16 2021-06-29 Indian Institute Of Technology Delhi Neural network classifier

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104899641A (zh) * 2015-05-25 2015-09-09 杭州朗和科技有限公司 Deep neural network learning method, processor, and deep neural network learning system
CN106372724A (zh) * 2016-08-31 2017-02-01 西安西拓电气股份有限公司 Artificial neural network algorithm
CN106951960A (zh) * 2017-03-02 2017-07-14 平顶山学院 Neural network and learning method of the neural network
CN108009640A (zh) * 2017-12-25 2018-05-08 清华大学 Memristor-based neural network training apparatus and training method thereof

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113570029A (zh) * 2020-04-29 2021-10-29 华为技术有限公司 Method for obtaining a neural network model, image processing method, and apparatus
CN114255361A (zh) * 2020-09-10 2022-03-29 华为技术有限公司 Neural network model training method, image processing method, and apparatus
CN114386644A (zh) * 2020-10-19 2022-04-22 中国石油天然气股份有限公司 Shale gas well productivity analysis method and system based on a BP neural network
CN115511042A (zh) * 2021-06-21 2022-12-23 中国科学院微电子研究所 Training method and apparatus for a neural network model implementing continual learning, and electronic device
CN114298091A (zh) * 2021-12-13 2022-04-08 国网湖北省电力有限公司电力科学研究院 SF6 gas flow measurement value correction method, apparatus, device, and storage medium
CN116224352A (zh) * 2022-07-18 2023-06-06 武汉象印科技有限责任公司 Wide-dynamic-range pulsed laser ranging method and system based on a neural network
CN115577491A (zh) * 2022-08-29 2023-01-06 潍柴动力股份有限公司 Parameter correction method and apparatus, electronic device, and storage medium
CN119245192A (zh) * 2024-12-05 2025-01-03 广州智业节能科技有限公司 Artificial-intelligence-based optimization control method and system for air-conditioning equipment
CN119962597A (zh) * 2024-12-20 2025-05-09 北京忆元科技有限公司 Method and apparatus for neural network model training based on a compute-in-memory chip

Also Published As

Publication number Publication date
CN111937011B (zh) 2024-11-26
CN111937011A (zh) 2020-11-13

Similar Documents

Publication Publication Date Title
WO2019237357A1 (fr) Method and device for determining weight parameters of a neural network model
US11809515B2 (en) Reduced dot product computation circuit
CN112085186A (zh) Neural network quantization parameter determination method and related products
WO2019184823A1 (fr) Image processing method and device based on a convolutional neural network model
CN110378438A (zh) Training method and apparatus for an image segmentation model under label fault tolerance, and related device
CN110472725A (zh) Balanced binarized neural network quantization method and system
CN112085175A (zh) Data processing method and apparatus based on neural network computation
WO2020155712A1 (fr) Image processing method and apparatus, computing device, and computer storage medium
WO2022111002A1 (fr) Method and apparatus for training a neural network, and computer-readable storage medium
US20230133337A1 (en) Quantization calibration method, computing device and computer readable storage medium
US20250265505A1 (en) Model training method and related apparatus
CN115018072A (zh) Model training method, apparatus, device, and storage medium
US12093816B1 (en) Initialization of values for training a neural network with quantized weights
CN116402122A (zh) Neural network training method and apparatus, readable storage medium, and chip
CN118095373B (zh) Quantization method for a large language model, electronic device, chip system, and storage medium
CN113875228B (zh) Video frame interpolation method and apparatus, and computer-readable storage medium
WO2024244594A1 (fr) Method for quantizing a neural network model, data processing method, and related apparatuses
CN114186097A (zh) Method and apparatus for training a model
CN113570036A (zh) Hardware accelerator architecture supporting dynamic sparse neural network models
CN114580625A (zh) Method, device, and computer-readable storage medium for training a neural network
JP2025522114A (ja) Model training method and related device
WO2024060727A1 (fr) Neural network model training method and apparatus, device, and system
CN117597692A (zh) Apparatus, method, device, and medium for loss balancing in multi-task learning
CN115049059A (zh) Data processing method, apparatus, device, and storage medium
TW202301130A (zh) Deep learning network device, memory access method used by same, and non-volatile storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18922680

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18922680

Country of ref document: EP

Kind code of ref document: A1