
WO2018016608A1 - Neural network apparatus, vehicle control system, decomposition device, and program - Google Patents

Neural network apparatus, vehicle control system, decomposition device, and program Download PDF

Info

Publication number
WO2018016608A1
WO2018016608A1 PCT/JP2017/026363 JP2017026363W WO2018016608A1 WO 2018016608 A1 WO2018016608 A1 WO 2018016608A1 JP 2017026363 W JP2017026363 W JP 2017026363W WO 2018016608 A1 WO2018016608 A1 WO 2018016608A1
Authority
WO
WIPO (PCT)
Prior art keywords
input
matrix
vector
neural network
weight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2017/026363
Other languages
English (en)
Japanese (ja)
Inventor
満 安倍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Denso IT Laboratory Inc
Original Assignee
Denso IT Laboratory Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Denso IT Laboratory Inc filed Critical Denso IT Laboratory Inc
Priority to JP2018528880A priority Critical patent/JP6921079B2/ja
Priority to US16/318,779 priority patent/US11657267B2/en
Priority to CN201780056821.9A priority patent/CN109716362B/zh
Publication of WO2018016608A1 publication Critical patent/WO2018016608A1/fr
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0495Quantised networks; Sparse networks; Compressed networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Definitions

  • The present technology relates to a neural network device and program that input input information to an input layer of a neural network model and obtain output information from an output layer, a vehicle control system including the neural network device, and a decomposition processing apparatus for configuring the neural network.
  • FIG. 16 is a diagram illustrating an example of a neural network that classifies a four-dimensional input vector into one of three classes (identifies which of the three classes it belongs to). As shown in FIG. 16, when a four-dimensional input vector (also referred to as an input map) to be identified is given as the input layer a_0, the input information passes through the intermediate layers a_1 to a_3 and is output as the three-dimensional output layer a_4.
  • A weight matrix (also called a filter) W_1 and a bias vector b_1 are defined between the input layer a_0 and the intermediate layer a_1, and the intermediate layer a_1 is obtained by the following equation (1).
  • Here, f(·) is an activation function; for example, the following function (ReLU) is used.
  • Similarly, the intermediate layers a_2 and a_3 are obtained by the following expressions (2) and (3), and the output layer a_4 is obtained by the following expression (4).
  • Generalizing, let the input vector from the previous layer be x (D_I-dimensional), the weight matrix be W (D_I rows by D_O columns), and the bias be b (D_O-dimensional).
  • Then the output vector to the next layer (before applying the activation function), y (D_O-dimensional), is expressed by the following equation (5): y = W^T x + b.
  • For example, in a fully connected layer (hereinafter also referred to as an "FC layer"), when the weight matrix W consists of single-precision real numbers (32 bits), 32·D_I·D_O bits of memory are consumed. Each layer also requires D_I·D_O product-sum operations on single-precision real numbers, and this computation in particular requires processing time.
  • The FC layer is usually arranged at the end of the neural network, but a convolutional layer (hereinafter also referred to as a "CONV layer") can also be regarded as an FC layer by appropriately cutting out and rearranging the input map with a sliding window.
  • the present technology has been made in view of the above-described problems, and aims to reduce memory consumption and calculation amount in a neural network device.
  • A neural network device according to one aspect includes a storage unit (24) for storing a neural network model and an arithmetic unit (22) for inputting input information to an input layer of the neural network model and obtaining an output layer,
  • wherein the weight matrix (W) of at least one layer of the neural network model is configured as the product (M_wC_w) of a weight basis matrix (M_w), which is an integer matrix, and a weight coefficient matrix (C_w), which is a real matrix.
  • the neural network device is a neural network device that performs recognition using a neural network model, and has a configuration in which a logical operation is performed as an operation of at least one layer of the neural network model.
  • A neural network device according to one aspect is a neural network device that performs recognition using a neural network model, and has a configuration that stores a binary or ternary matrix used for the calculation of at least one layer of the neural network model.
  • A vehicle control system according to one aspect includes the neural network device (20), an in-vehicle sensor (30) that acquires the input information, and a vehicle control device (40) that controls the vehicle based on the output.
  • A decomposition processing apparatus according to one aspect includes an acquisition unit (11) that acquires a neural network model, a weight decomposition unit (12) that decomposes a weight matrix of at least one layer of the neural network model into the product (M_wC_w) of a weight basis matrix (M_w), which is an integer matrix, and a weight coefficient matrix (C_w), which is a real matrix, and an output unit (14) that outputs the weight basis matrix (M_w) and the weight coefficient matrix (C_w).
  • A program according to one aspect causes a computer to function as a neural network device that inputs input information to an input layer of a neural network model and obtains output information from an output layer. The storage unit (24) of the computer stores, for decomposing an input vector (x) into the sum of the product of an input basis matrix (M_x), which is an integer matrix, and an input coefficient vector (c_x), which is a real vector, and an input bias (b_x), the input coefficient vector (c_x) and the input bias (b_x) obtained by learning, and a lookup table (LUT) that defines the relationship between the value (x_j) of each element of the input vector, obtained based on the learned input coefficient vector (c_x) and input bias (b_x), and the corresponding value (m_x^(j)) of the input basis matrix.
  • The program causes the computer to function as a calculation unit that, in at least one fully connected layer of the neural network model, takes the output vector of the previous layer as the input vector (x) and obtains the product of the input vector (x) and the weight matrix (W) using the weight basis matrix (M_w) and the real weight coefficient matrix (C_w) read from the storage unit (24), the input coefficient vector (c_x), and the input basis matrix (M_x) corresponding to the input vector (x) obtained by referring to the lookup table (LUT) read from the storage unit (24).
  • A program according to one aspect causes a computer to function as a neural network device that inputs input information to an input layer of a neural network model and obtains output information from an output layer. The storage unit (24) of the computer stores, for decomposing an input vector (x) into the sum of the product of an input basis matrix (M_x), which is an integer matrix, and an input coefficient vector (c_x), which is a real vector, and an input bias (b_x), the input coefficient vector (c_x) and the input bias (b_x) obtained by learning.
  • The program causes the computer, in at least one fully connected layer of the neural network model, to take the output vector of the previous layer as the input vector (x) and to obtain the product of the input vector (x) and the weight matrix (W) using the weight basis matrix (M_w) read from the storage unit (24).
  • A neural network device according to one aspect includes a storage unit (24) for storing a neural network model and an arithmetic unit (22) for inputting input information to an input layer of the neural network model and obtaining an output layer,
  • wherein the arithmetic unit (22), in at least one layer of the neural network model, takes the output vector of the previous layer as an input vector (x) and decomposes it using an input basis matrix (M_x) that is an integer matrix.
  • FIG. 1 is a diagram illustrating calculation of a product of an integer-decomposed input vector and a weight matrix according to the embodiment.
  • FIG. 2 is a diagram illustrating a configuration of the decomposition processing apparatus according to the embodiment.
  • FIG. 3 is a diagram illustrating a process of decomposing the weight matrix according to the embodiment into a base matrix and a coefficient matrix.
  • FIG. 4 is a flowchart of an algorithm executed in the decomposition method according to the embodiment.
  • FIG. 5 is a diagram illustrating a modification of the process of decomposing the weight matrix according to the embodiment into a base matrix and a coefficient matrix.
  • FIG. 6 is a diagram illustrating a modified example of the process of decomposing an input vector according to the embodiment into a product of a base matrix and a coefficient vector and a bias.
  • FIG. 7 is a diagram illustrating the update by the full search of the base matrix of the input vector according to the embodiment.
  • FIG. 8 is a diagram illustrating optimization of the basis matrix of the input vector according to the embodiment.
  • FIG. 9 is a diagram illustrating optimization of the basis matrix of the input vector according to the embodiment.
  • FIG. 10 is a diagram illustrating optimization of the basis matrix of the input vector according to the embodiment.
  • FIG. 11 is a diagram illustrating a configuration of the neural network device according to the embodiment.
  • FIG. 12 is a diagram illustrating processing of the calculation unit in the FC layer of the neural network model according to the embodiment.
  • FIG. 13 is a diagram illustrating a relationship between an input map and an output map of the CONV layer according to the embodiment.
  • FIG. 14 is a diagram illustrating a relationship between an input map and an output map of the CONV layer according to the embodiment.
  • FIG. 15 is a diagram illustrating decomposition of the weight matrix of the CONV layer according to the embodiment.
  • FIG. 16 is a diagram illustrating an example of a neural network that classifies a four-dimensional input vector into three classes.
  • FIG. 17 is a diagram for explaining optimization of the basis matrix of the input vector in the modification of the embodiment.
  • FIG. 18 is a diagram for explaining optimization of a base matrix of an input vector in a modification of the embodiment.
  • FIG. 19 is a diagram showing a prototype and a number line on which the midpoints are plotted in a modification of the embodiment.
  • FIG. 20 is a diagram illustrating a prototype and a number line on which the midpoints are plotted in a modification of the embodiment.
  • FIG. 21 is a diagram for explaining the assignment of ⁇ in the modification of the embodiment.
  • FIG. 22 is a diagram illustrating a configuration of a neural network device according to a modification of the embodiment.
  • FIG. 23 is a diagram for explaining a binary tree search in a modification of the embodiment.
  • FIG. 24 is a diagram for explaining a binary tree search in a modification of the embodiment.
  • FIG. 25 is a diagram for explaining a binary tree search in a modification of the embodiment.
  • FIG. 26 is a diagram for explaining a binary tree search in a modification of the embodiment.
  • FIG. 27 is a diagram illustrating a binary tree according to a modification of the embodiment.
  • FIG. 28 is a diagram showing a configuration of a vehicle control system including the neural network device according to the embodiment.
  • The neural network device according to one aspect includes a storage unit (24) for storing a neural network model and an arithmetic unit (22) for inputting input information to an input layer of the neural network model and obtaining an output layer,
  • wherein the weight matrix (W) of at least one layer of the neural network model is configured as the product (M_wC_w) of a weight basis matrix (M_w), which is an integer matrix, and a weight coefficient matrix (C_w), which is a real matrix.
  • According to this configuration, the weight matrix (W) of a fully connected layer in the neural network is composed of the product (M_wC_w) of an integer weight basis matrix (M_w) and a real weight coefficient matrix (C_w),
  • so memory consumption can be reduced in the computation of that layer.
  • Furthermore, the product of the input basis matrix (M_x) and the weight basis matrix (M_w) becomes a product of integer matrices, so both memory consumption and the amount of computation can be reduced.
  • The weight basis matrix (M_w) may be a binary matrix,
  • the input basis matrix (M_x) may be a binary matrix,
  • and the calculation unit (22) may perform the product operation (M_wM_x) of the weight basis matrix (M_w) and the input basis matrix (M_x) by logical operations and bit counting.
  • According to this configuration, the product of the input basis matrix (M_x) and the weight basis matrix (M_w) in the operation for obtaining the product of the input vector (x) and the weight matrix (W) becomes a product of binary matrices and can be executed with logical operations and bit counting, so the operation for obtaining the product of the input vector (x) and the weight matrix (W) can be sped up.
  • The weight basis matrix (M_w) may be a ternary matrix,
  • the input basis matrix (M_x) may be a binary matrix,
  • and the calculation unit (22) may perform the product operation (M_wM_x) of the weight basis matrix (M_w) and the input basis matrix (M_x) by logical operations and bit counting.
  • According to this configuration, the product of the input basis matrix (M_x) and the weight basis matrix (M_w) in the operation for obtaining the product of the input vector (x) and the weight matrix (W) becomes a product of a binary matrix and a ternary matrix and can be executed with logical operations and bit counting, so the operation for obtaining the product of the input vector (x) and the weight matrix (W) can be sped up.
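  • As an illustration of the above (not part of the original disclosure), the following is a minimal Python sketch of how the dot product of two {-1, +1} vectors can be evaluated with a logical operation (XNOR) and a bit count; the bit-packing scheme and names are assumptions made for illustration only.

```python
import numpy as np

def pack_bits(v):
    """Pack a {-1, +1} vector into a Python int (bit = 1 where the element is +1)."""
    bits = 0
    for i, e in enumerate(v):
        if e > 0:
            bits |= 1 << i
    return bits

def binary_dot(a_bits, b_bits, dim):
    """Dot product of two {-1, +1} vectors: XNOR counts matching positions,
    and (#matches) - (#mismatches) = 2 * #matches - dim."""
    mask = (1 << dim) - 1
    matches = bin((~(a_bits ^ b_bits)) & mask).count("1")
    return 2 * matches - dim

# tiny check against ordinary arithmetic
rng = np.random.default_rng(0)
a = rng.choice([-1, 1], size=16)
b = rng.choice([-1, 1], size=16)
assert binary_dot(pack_bits(a), pack_bits(b), 16) == int(a @ b)
```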
  • The calculation unit (22) may decompose the input vector (x) by optimizing the input basis matrix (M_x) with respect to the input vector (x).
  • The calculation unit (22) may optimize the input basis matrix (M_x) by selecting, for each element (x_j) of the input vector (x), the nearest candidate among the sums (βc_x + b_x) of the products of all candidate rows (β) of the input basis matrix corresponding to that element and the learned input coefficient vector (c_x), and the learned input bias (b_x).
  • According to this configuration, the input basis matrix (M_x) can be optimized by a one-dimensional nearest neighbor search.
  • The storage unit (24) may store a lookup table (LUT) that defines the relationship between the value of each element (x_j) of the input vector and the value (m_x^(j)) of the input basis matrix in the corresponding nearest candidate,
  • and the calculation unit (22) may optimize the input basis matrix (M_x) for the input vector (x) by referring to the lookup table (LUT).
  • The storage unit (24) may store, for each element (x_i) of the input vector, the midpoints (mp_i) obtained when all candidate rows (β) of the input basis matrix corresponding to that element and the candidate approximate values (p) of the element derived from them are arranged in order of magnitude,
  • and the calculation unit (22) may optimize the input basis matrix (M_x) by determining, for each element (x_i) of the input vector, the row (m_x^(j)) of the input basis matrix corresponding to that element by a binary tree search using the midpoints (mp_i).
  • The neural network model may be a convolutional neural network model in which the plurality of filters of a convolutional layer are combined into the weight matrix (W) and
  • the convolutional layer is regarded as a fully connected layer, the weight matrix (W) being configured as the product of an integer weight basis matrix (M_w) and a real weight coefficient matrix (C_w), and the calculation unit (22) may obtain, in the convolutional layer regarded as a fully connected layer, the product of the decomposed input vector (x) and the decomposed weight matrix (W).
  • This configuration makes it possible to reduce the amount of memory consumed and the amount of computation in the computation of the convolutional layer of the convolutional neural network model.
  • the neural network device is a neural network device that performs recognition using a neural network model, and has a configuration in which a logical operation is performed as an operation of at least one layer of the neural network model.
  • This configuration makes it possible to perform neural network model operations at high speed by logical operations.
  • A neural network device according to one aspect is a neural network device that performs recognition using a neural network model, and has a configuration that stores a binary or ternary matrix used for the calculation of at least one layer of the neural network model.
  • the neural network model can be calculated at high speed using a binary or ternary matrix.
  • A vehicle control system according to one aspect includes the neural network device (20), an in-vehicle sensor (30) that acquires the input information, and a vehicle control device (40) that controls the vehicle based on the output.
  • This configuration makes it possible to control the vehicle based on recognition by a neural network model.
  • The decomposition processing apparatus according to one aspect includes an acquisition unit (11) that acquires a neural network model, a weight decomposition unit (12) that decomposes a weight matrix of at least one layer of the neural network model into the product (M_wC_w) of a weight basis matrix (M_w), which is an integer matrix, and a weight coefficient matrix (C_w), which is a real matrix, and an output unit (14) that outputs the weight basis matrix (M_w) and the weight coefficient matrix (C_w).
  • The decomposition processing apparatus may further include an input pre-decomposition unit (13) that learns the input coefficient vector (c_x) and the input bias (b_x) for decomposing the input vector (x) into the sum (x ≈ M_xc_x + b_x1) of the product of an input basis matrix (M_x), which is an integer matrix, and an input coefficient vector (c_x), which is a real vector, and the input bias (b_x),
  • and the output unit (14) may output the input coefficient vector (c_x) and the input bias (b_x) obtained by the learning.
  • According to this configuration, the coefficient vector (c_x) and the input bias (b_x) for decomposing the input vector (x) can be acquired in advance by learning.
  • The input pre-decomposition unit (13) may generate a lookup table (LUT) for optimizing the input basis matrix (M_x) for the input vector (x),
  • and the output unit (14) may output the lookup table (LUT).
  • According to this configuration, a lookup table (LUT) for decomposing the input vector (x) at high speed can be acquired in advance.
  • The program according to one aspect causes a computer to function as a neural network device that inputs input information to an input layer of a neural network model and obtains output information from an output layer. The storage unit (24) of the computer stores, for decomposing an input vector (x) into the sum of the product of an input basis matrix (M_x), which is an integer matrix, and an input coefficient vector (c_x), which is a real vector, and an input bias (b_x), the input coefficient vector (c_x) and the input bias (b_x) obtained by learning, and a lookup table (LUT) that defines the relationship between the value (x_j) of each element of the input vector, obtained based on the learned input coefficient vector (c_x) and input bias (b_x), and the corresponding value (m_x^(j)) of the input basis matrix.
  • The program causes the computer to function as a calculation unit that, in at least one fully connected layer of the neural network model, takes the output vector of the previous layer as the input vector (x) and obtains the product of the input vector (x) and the weight matrix (W) using the weight basis matrix (M_w) and the real weight coefficient matrix (C_w) read from the storage unit (24), the input coefficient vector (c_x), and the input basis matrix (M_x) corresponding to the input vector (x) obtained by referring to the lookup table (LUT) read from the storage unit (24).
  • According to this configuration, the weight matrix (W) of a fully connected layer in the neural network is composed of the product (M_wC_w) of an integer weight basis matrix (M_w) and a real weight coefficient matrix (C_w), and
  • the product of the input basis matrix (M_x) and the weight basis matrix (M_w) can be performed as a product of integer matrices.
  • The program according to one aspect causes a computer to function as a neural network device that inputs input information to an input layer of a neural network model and obtains output information from an output layer. The storage unit (24) of the computer stores, for decomposing an input vector (x) into the sum of the product of an input basis matrix (M_x), which is an integer matrix, and an input coefficient vector (c_x), which is a real vector, and an input bias (b_x), the input coefficient vector (c_x) and the input bias (b_x) obtained by learning.
  • The program causes the computer, in at least one fully connected layer of the neural network model, to take the output vector of the previous layer as the input vector (x) and to obtain the product of the input vector (x) and the weight matrix (W) using the weight basis matrix (M_w) read from the storage unit (24).
  • The neural network device according to one aspect includes a storage unit (24) for storing a neural network model and an arithmetic unit (22) for inputting input information to an input layer of the neural network model and obtaining an output layer,
  • wherein the arithmetic unit (22), in at least one layer of the neural network model, takes the output vector of the previous layer as an input vector (x) and decomposes it using an input basis matrix (M_x) that is an integer matrix,
  • and the weight matrix (W) is composed of binary or ternary elements.
  • According to this configuration, in the computation of the product of the input vector (x) and the weight matrix (W), the product of the input basis matrix (M_x) and the weight matrix (W) can be performed as a product of an integer matrix and a binary or ternary matrix, so the amount of computation can be reduced.
  • The computation of the FC layer of the neural network includes a step of calculating the product W^Tx of the weight matrix (filter) W and the input vector (input map) x.
  • In the present embodiment, the weight matrix W is decomposed into an integer basis matrix and a real coefficient matrix (integer decomposition),
  • and the input vector x is decomposed into an integer basis matrix and a real coefficient vector (integer decomposition), thereby reducing memory consumption
  • and shortening processing time.
  • FIG. 1 is a diagram for explaining the calculation of the integer-decomposed product W^Tx according to the present embodiment.
  • the bias b is omitted.
  • The basis number k_w is determined according to the size of the weight matrix W and is about 1/8 to 1/4 of it (roughly several tens to several hundreds), while the basis number k_x can be about 2 to 4, for example.
  • When this is expressed as an equation including the bias b, it becomes the following equation (6).
  • The basis matrix M_w^T obtained by decomposing the weight matrix W is a binary or ternary matrix,
  • and the basis matrix M_x obtained by decomposing the input vector x is a binary matrix.
  • The basis matrix M_x may also be a ternary matrix, as will be described later.
  • M_w^TM_x in the first term on the right-hand side of equation (6) is a product of a binary or ternary matrix and a binary or ternary matrix, so it can be calculated by logical operations (AND, XOR) and bit counting.
  • The sum of the second term and the third term on the right side can be calculated in advance, as will be described later. Therefore, most of the operations can be reduced to logical operations by the decomposition of FIG. 1 and equation (6).
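  • The following is a minimal NumPy sketch of equation (6) (illustration only, not part of the original disclosure); the dimensions and the randomly generated M_w, C_w, M_x, c_x, b_x, and b merely stand in for the results of the decomposition described below.

```python
import numpy as np

rng = np.random.default_rng(0)
D_I, D_O, k_w, k_x = 256, 64, 32, 4

# stand-ins for the decomposition results produced by the weight decomposition
# unit 12 and the input pre-decomposition unit 13 (illustrative values)
M_w = rng.choice([-1.0, 1.0], size=(D_I, k_w))   # weight basis matrix (binary)
C_w = rng.normal(size=(k_w, D_O))                # weight coefficient matrix (real)
M_x = rng.choice([-1.0, 1.0], size=(D_I, k_x))   # input basis matrix (binary)
c_x = rng.normal(size=k_x)                       # input coefficient vector (real)
b_x = 0.5                                        # input bias
b   = rng.normal(size=D_O)                       # layer bias

# equation (6): W^T x  ~=  C_w^T M_w^T M_x c_x + b_x C_w^T M_w^T 1 + b
first_term  = C_w.T @ (M_w.T @ M_x) @ c_x                  # M_w^T M_x is an integer-integer product
precomputed = b_x * (C_w.T @ (M_w.T @ np.ones(D_I))) + b   # can be computed once, in advance
y = first_term + precomputed

# consistency check against the direct use of the approximations
# W ~= M_w C_w and x ~= M_x c_x + b_x 1
W_approx = M_w @ C_w
x_approx = M_x @ c_x + b_x * np.ones(D_I)
assert np.allclose(y, W_approx.T @ x_approx + b)
```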
  • FIG. 2 is a diagram showing a configuration of a decomposition processing apparatus for configuring the deep neural network of the present embodiment.
  • the decomposition processing apparatus 10 includes a data acquisition unit 11, a weight decomposition unit 12, an input pre-decomposition unit 13, and a decomposition result output unit 14.
  • the data acquisition unit 11 acquires configuration information (including the weight (filter) W and bias b of each layer) of the neural network model according to the present embodiment and an input vector for learning.
  • The weight decomposition unit 12 decomposes the weight matrix W into the product of a real coefficient matrix C_w and a binary or ternary basis matrix M_w.
  • The input pre-decomposition unit 13 obtains, by learning, the coefficient vector c_x and the bias b_x for decomposing the input vector x into the sum of the product of a binary or ternary basis matrix M_x and a real coefficient vector c_x and the bias b_x, and generates a lookup table LUT for obtaining the basis matrix M_x from the input vector x.
  • The decomposition result output unit 14 reconstructs the neural network model using the coefficient matrix C_w and the binary or ternary basis matrix M_w obtained by the weight decomposition unit 12 and the lookup table LUT obtained by the input pre-decomposition unit 13, and outputs it to the neural network device 20 described later. Hereinafter, each function will be described in detail.
  • The weight decomposition unit 12 decomposes the weight matrix W into the product of a real coefficient matrix C_w and an integer basis matrix M_w.
  • FIG. 3 is a diagram illustrating the process of decomposing the weight matrix W into a basis matrix M_w with k_w bases and a coefficient matrix C_w.
  • The weight decomposition unit 12 decomposes the weight matrix W into a binary or ternary basis matrix M_w and a real coefficient matrix C_w.
  • As methods by which the weight decomposition unit 12 of the present embodiment performs this decomposition into a binary or ternary basis matrix M_w and a real coefficient matrix C_w, first to fourth decomposition methods are described below.
  • (First decomposition method) As a first decomposition method, a data-independent decomposition method will be described.
  • The weight decomposition unit 12 performs the decomposition by solving a cost function g_1 that represents the decomposition error.
  • Here, the basis matrix M_w is a binary matrix, M_w ∈ {-1, 1}^(D_I×k_w).
  • The weight decomposition unit 12 solves the cost function g_1 by the following procedure.
  • (1) Randomly initialize the basis matrix M_w and the coefficient matrix C_w.
  • (2) Fix the elements of the basis matrix M_w and optimize the elements of the coefficient matrix C_w by the least squares method, updating the elements of the coefficient matrix C_w so that the cost function g_1 is minimized.
  • (3) Fix the elements of the coefficient matrix C_w and update the elements of the basis matrix M_w by a full search (described below) so that the cost function g_1 is minimized.
  • (4) Repeat (2) and (3) until convergence. For example, when the cost function g_1 satisfies a predetermined convergence condition (for example, the amount of decrease is a certain value or less), it is determined that the cost function g_1 has converged.
  • (5) The solutions obtained in steps (1) to (4) are held as candidates.
  • Steps (1) to (5) are repeated, and the candidate basis matrix M_w and candidate coefficient matrix C_w with the smallest cost function g_1 are adopted as the final result. Steps (1) to (5) need not be repeated, but by repeating them a plurality of times, the problem of dependence on the initial values can be avoided.
  • Here, the elements of the row vector of the j-th row of the basis matrix M_w depend only on the elements of the j-th row of the weight matrix W. Therefore, each row vector of the basis matrix M_w can be optimized independently of the other rows, and the basis matrix M_w can be updated by an exhaustive search for each row (full search). There are only 2^(k_w) possible row vectors for the j-th row of the basis matrix M_w in the case of binary decomposition as in this embodiment (and only 3^(k_w) in the case of ternary decomposition). Therefore, all of these are checked exhaustively, and the row vector that minimizes the cost function g_1 is adopted. This is applied to all the row vectors of the basis matrix M_w to update the elements of the basis matrix M_w.
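  • The following is a minimal sketch of the first decomposition method (illustration only); a small k_w is assumed so that the 2^(k_w) exhaustive row search stays cheap, and the function name and fixed iteration count are illustrative choices rather than part of the disclosure.

```python
import itertools
import numpy as np

def decompose_weights(W, k_w, n_iter=20, seed=0):
    """Data-independent decomposition W ~= M_w C_w (first method, sketch).
    M_w: binary {-1, +1}, shape (D_I, k_w); C_w: real, shape (k_w, D_O)."""
    rng = np.random.default_rng(seed)
    D_I, D_O = W.shape
    M_w = rng.choice([-1.0, 1.0], size=(D_I, k_w))          # (1) random initialization
    candidates = np.array(list(itertools.product([-1.0, 1.0], repeat=k_w)))  # 2^k_w rows
    C_w = None
    for _ in range(n_iter):                                  # (4) iterate (here: fixed count)
        # (2) fix M_w, optimize C_w by least squares
        C_w, *_ = np.linalg.lstsq(M_w, W, rcond=None)
        # (3) fix C_w, update each row of M_w by exhaustive search (rows are independent)
        for i in range(D_I):
            errs = np.linalg.norm(candidates @ C_w - W[i], axis=1)
            M_w[i] = candidates[np.argmin(errs)]
    return M_w, C_w

W = np.random.default_rng(1).normal(size=(64, 16))
M_w, C_w = decompose_weights(W, k_w=4)
print("relative error:", np.linalg.norm(W - M_w @ C_w) / np.linalg.norm(W))
```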
  • (Second decomposition method) As a second decomposition method, the weight decomposition unit 12 performs the decomposition by solving a cost function g_2 that adds an L1 regularization term to the decomposition error.
  • Here, the basis matrix M_w is a binary matrix,
  • |C_w|_1 is the L1 norm of the elements of the coefficient matrix C_w,
  • and λ is its coefficient.
  • The weight decomposition unit 12 solves the cost function g_2 by the following procedure.
  • (1) Randomly initialize the basis matrix M_w and the coefficient matrix C_w.
  • (2) Fix the elements of the basis matrix M_w and optimize the elements of the coefficient matrix C_w by the proximal gradient method.
  • (3) Fix the elements of the coefficient matrix C_w and update the elements of the basis matrix M_w by a full search so that the cost function g_2 is minimized.
  • (4) Repeat (2) and (3) until convergence. For example, when the cost function g_2 satisfies a predetermined convergence condition (e.g., the amount of decrease is less than a predetermined value), it is judged to have converged.
  • (5) The solutions obtained in steps (1) to (4) are held as candidates.
  • Steps (1) to (5) are repeated, and the candidate base matrix M w and candidate coefficient matrix C w that have the smallest cost function g 2 are adopted as the final results. Note that steps (1) to (5) need not be repeated, but by repeating a plurality of times, the problem of dependence on the initial value can be avoided.
  • According to the second decomposition method, the coefficient matrix C_w can be made sparse.
  • By making the coefficient matrix C_w sparse, the parts related to the zero elements of the coefficient matrix C_w can be omitted in the calculation of the product C_w^TM_w^TM_x of equation (6), so the inner product can be calculated even faster.
  • In the first decomposition method, the decomposition error itself is expressed as the cost function g_1 and minimized.
  • However, what is ultimately approximated when the weight matrix W is decomposed into the product of the basis matrix M_w and the coefficient matrix C_w is the product W^Tx of the input vector x and the weight matrix W.
  • Therefore, as a third decomposition method, the weight decomposition unit 12 performs the decomposition by solving a cost function g_3 evaluated on actual data. According to the cost function g_3, the weight matrix W is decomposed in accordance with the distribution of the actual data, so the approximation accuracy of the decomposition is improved.
  • This approximate decomposition can be performed by sequentially obtaining the basis vectors m w (j) constituting the basis matrix M w .
  • The procedure of the third decomposition method is as follows. (1) The basis matrix M_w and the coefficient matrix C_w are obtained by the first or second decomposition method and used as initial values. (2) The elements of the basis matrix M_w are fixed, and the elements of the coefficient matrix C_w are optimized by the least squares method. (3) The elements of the coefficient matrix C_w are fixed, and the elements of the basis matrix M_w are optimized and updated. The update process of the basis matrix M_w will be described later.
  • (4) Steps (2) and (3) are repeated until convergence. (5) Returning to step (1), the basis matrix M_w and the coefficient matrix C_w are obtained again by the first or second decomposition method so that the initial values are changed.
  • Step (5) need not be repeated, but by repeating it a plurality of times, the problem of dependence on the initial values can be reduced.
  • Next, the update process of the basis matrix M_w in step (3) will be described.
  • In the third decomposition method, the value of each row vector of the basis matrix M_w is no longer independent of the other rows. Since the elements of the basis matrix M_w are binary or ternary, that is, discrete values, the optimization of the basis matrix M_w becomes a combinatorial optimization problem. Therefore, an algorithm such as a greedy algorithm, tabu search, or simulated annealing can be used to optimize the basis matrix M_w. Since a good initial value is obtained in step (1), these algorithms can minimize the decomposition error satisfactorily.
  • the base matrix Mw is optimized by the following procedure.
  • (3-1) T elements are selected at random from the elements of the base matrix Mw .
  • (3-2) All 2^T combinations of values for the selected elements (3^T in the case of the ternary decomposition described later) are tried, and the combination that minimizes the cost function g_3 is adopted.
  • (3-3) Repeat steps (3-1) and (3-2) until convergence.
  • The fourth decomposition method is a combination of the second and third decomposition methods. Specifically, the decomposition is performed by solving a cost function g_4 that adds the L1 regularization term of the second method to the data-dependent cost of the third method. According to the cost function g_4, the weight matrix W is decomposed in accordance with the distribution of actual data, so the approximation accuracy of the decomposition is improved and, at the same time, the coefficient matrix C_w can be made sparse. That is, the advantages of both the second and third decomposition methods can be obtained.
  • The specific decomposition procedure is the same as that of the third decomposition method.
  • the real matrix may be sequentially decomposed by the following algorithm.
  • FIG. 4 is a flowchart of the algorithm of this sequential decomposition executed in the decomposition method according to the present embodiment.
  • Here, the procedure of decomposing the weight matrix W into a basis matrix M_w with k_w bases and a coefficient matrix C_w by one of the above first to fourth decomposition methods is denoted as in the following expression.
  • the weight decomposition unit 12 acquires a weight matrix W to be decomposed (step S41).
  • Here, the residual matrix R is the difference between the weight matrix W and the sum of the products of the basis matrices M_w^(j) and coefficient matrices C_w^(j) obtained so far by the sequential decomposition.
  • The weight decomposition unit 12 decomposes the residual matrix R into a basis matrix M_w^(j) and a coefficient matrix C_w^(j) by the first or second decomposition method described above (step S43).
  • The basis number of this stage is k_wj.
  • When M_w^(j)C_w^(j) is obtained,
  • the weight decomposition unit 12 sets the difference between the current residual matrix R and M_w^(j)C_w^(j) as the new residual matrix R (step S44), increments the index j (step S45), and determines whether the index j is greater than N, that is, whether the N stages of sequential decomposition have been completed (step S46).
  • If not, the weight decomposition unit 12 returns to step S43 and decomposes the new residual matrix R obtained in step S44 again with the index j incremented in step S45. The above process is repeated, and when the index j becomes greater than N (YES in step S46), the process ends.
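  • A minimal sketch of this sequential (residual) decomposition follows (illustration only); it assumes a decompose_fn such as the decompose_weights sketch above and a fixed number of stages instead of the index test of step S46.

```python
import numpy as np

def sequential_decompose(W, k_w_per_stage, n_stages, decompose_fn):
    """Sequential decomposition sketch: repeatedly decompose the residual R
    and subtract the newly obtained product from it (cf. FIG. 4)."""
    R = W.copy()                                   # the residual starts as W itself
    stages = []
    for j in range(n_stages):
        M_j, C_j = decompose_fn(R, k_w_per_stage)  # step S43: decompose the residual
        stages.append((M_j, C_j))
        R = R - M_j @ C_j                          # step S44: new residual
    return stages

# usage sketch, reusing decompose_weights() from the previous example:
# stages = sequential_decompose(W, k_w_per_stage=4, n_stages=3,
#                               decompose_fn=decompose_weights)
# W_approx = sum(M @ C for M, C in stages)
```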
  • FIG. 5 is a diagram for explaining a modification of the process of decomposing the weight matrix W into a basis matrix M_w with k_w bases and a coefficient matrix C_w.
  • the vectors of the jth column of the weight matrix W are individually decomposed and collected. By decomposing for each vector in this way, the calculation cost for the decomposition can be suppressed.
  • Individual vectors may be decomposed by the first to fourth decomposition methods described above.
  • The column vector of the j-th column of the weight matrix W is written as w^(j), and the column vector of the j-th column of the coefficient matrix C_w is written as c_w^(j).
  • FIG. 6 is a diagram for explaining the process of decomposing the input vector x into the sum of the product of a basis matrix M_x with k_x bases and a coefficient vector c_x, and a bias b_x.
  • the input vector x is decomposed as shown in FIG. 6 and the following equation (12).
  • The reason for including the bias term b_x1 is that the input vector (map) is non-negative due to the influence of ReLU and its values are therefore biased upward. This bias term is not always necessary; whether it is needed depends on the output of the previous layer.
  • The input pre-decomposition unit 13 determines c_x and b_x in advance by learning. Thus, when the input vector x is obtained in each layer, the input vector can be decomposed by optimizing only M_x for it, and the processing can be sped up. In the present embodiment, the optimization of M_x for the input vector x is itself also accelerated by using a lookup table described later.
  • The input pre-decomposition unit 13 therefore also performs processing for determining this lookup table in advance by learning. These are described in order below.
  • the cost function J x can be solved by the following procedure.
  • (1) The base matrix M x is initialized at random.
  • (2) The basis matrix M_x is fixed, and the elements of the coefficient vector c_x and the bias b_x are optimized by the least squares method so that the cost function J_x is minimized, thereby updating the elements of the coefficient vector c_x and the bias b_x.
  • (3) The elements of the coefficient vector c x and the bias b x are fixed, and the elements of the base matrix M x are updated by a full search so that the cost function J x is minimized.
  • (4) Repeat (2) and (3) until convergence. For example, when the cost function J x satisfies a predetermined convergence condition (for example, the amount of decrease is equal to or less than a certain value), it is determined that the cost function J x has converged.
  • In the full search of step (3), each row can be updated independently according to the following equation (14) and the procedure shown in FIG. 7.
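  • The following is a minimal sketch of this alternating optimization (illustration only); it is shown for a single input vector, whereas in the embodiment c_x and b_x are learned in advance over training data, and the function name and iteration count are illustrative assumptions.

```python
import itertools
import numpy as np

def decompose_input(x, k_x, n_iter=10, seed=0):
    """Sketch of learning the decomposition x ~= M_x c_x + b_x 1 (cost J_x):
    alternate a least-squares fit of (c_x, b_x) with a per-element full search
    over the 2^k_x candidate rows of M_x."""
    rng = np.random.default_rng(seed)
    D_I = x.shape[0]
    M_x = rng.choice([-1.0, 1.0], size=(D_I, k_x))                     # (1) random init
    beta = np.array(list(itertools.product([-1.0, 1.0], repeat=k_x)))  # candidate rows
    c_x, b_x = None, None
    for _ in range(n_iter):
        # (2) fix M_x, fit c_x and b_x by least squares: x ~= [M_x 1] [c_x; b_x]
        A = np.hstack([M_x, np.ones((D_I, 1))])
        sol, *_ = np.linalg.lstsq(A, x, rcond=None)
        c_x, b_x = sol[:k_x], sol[k_x]
        # (3) fix (c_x, b_x), choose the nearest candidate row for each element x_j
        approx = beta @ c_x + b_x                                      # value of each candidate
        M_x = beta[np.argmin(np.abs(approx[None, :] - x[:, None]), axis=1)]
    return M_x, c_x, b_x

x = np.abs(np.random.default_rng(2).normal(size=128))   # non-negative, like a ReLU output
M_x, c_x, b_x = decompose_input(x, k_x=4)
print("max abs error:", np.max(np.abs(M_x @ c_x + b_x - x)))
```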
  • When the input vector x is obtained in each layer, it can be decomposed into a basis matrix M_x and a coefficient vector c_x by solving the cost function J_x described above.
  • However, if this decomposition is performed in every layer at execution time, it takes a great deal of processing time and cannot be used in practice, for example, for detecting pedestrians with an in-vehicle camera. The present inventor therefore paid attention to the following points.
  • c_x and b_x determine the value range of x.
  • M_x can be considered to indicate which value within the value range defined by c_x and b_x each element corresponds to.
  • Since the value range of x is similar for all elements, only c_x and b_x are determined in advance by the decomposition processing device 10 during learning, and only M_x is optimized at execution time by the neural network device 20 described later. By doing so, the decomposition at execution time can be sped up.
  • FIG. 10 is a diagram showing a state in which a plurality of bins are set by dividing the number line of FIG. 9 at equal intervals.
  • The input pre-decomposition unit 13 creates a lookup table LUT that defines the optimum β for each of a plurality of bins set by dividing the number line of FIG. 9 at equal intervals.
  • In this way, m_x^(j) can be obtained very quickly by finding the bin to which the element x_j of the input vector belongs and referring to the lookup table LUT.
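  • The following is a minimal sketch of such a lookup table (illustration only); populating each bin with the candidate closest to the bin center is one simple rule assumed here, and the value range, bin count, and names are illustrative.

```python
import itertools
import numpy as np

def build_lut(beta, c_x, b_x, lo, hi, n_bins):
    """For each bin on the number line [lo, hi), store the candidate row of M_x
    whose approximate value (beta c_x + b_x) is closest to the bin center."""
    approx = beta @ c_x + b_x                                  # value represented by each candidate
    centers = lo + (np.arange(n_bins) + 0.5) * (hi - lo) / n_bins
    idx = np.argmin(np.abs(centers[:, None] - approx[None, :]), axis=1)
    return beta[idx]                                           # shape (n_bins, k_x)

def lookup_row(lut, x_j, lo, hi):
    """Pick the row m_x^(j) for one input element x_j by finding its bin."""
    n_bins = lut.shape[0]
    b = int((x_j - lo) / (hi - lo) * n_bins)
    return lut[min(max(b, 0), n_bins - 1)]                     # clamp out-of-range values

k_x = 4
beta = np.array(list(itertools.product([-1.0, 1.0], repeat=k_x)))   # all candidate rows
c_x, b_x = np.array([3.8, 8.6, 1.2, 0.4]), 15.2                     # example values from the text
lut = build_lut(beta, c_x, b_x, lo=0.0, hi=30.0, n_bins=256)
print(lookup_row(lut, 20.0))
```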
  • The decomposition result output unit 14 calculates the sum of the second and third terms on the right side of equation (6) using M_w and C_w obtained by decomposing the weight matrix W in the weight decomposition unit 12 and the coefficient vector c_x and the bias b_x obtained by the input pre-decomposition unit 13. Since c_x, b_x, M_w, and C_w have all been obtained by the weight decomposition unit 12 or the input pre-decomposition unit 13 as described above, the sum of the second and third terms on the right side of equation (6) can be calculated in advance.
  • The decomposition result output unit 14 then outputs to the neural network device 20: c_x, M_w, and C_w for calculating the first term on the right side of equation (6); the precomputed sum of the second and third terms on the right side of equation (6);
  • and the lookup tables LUT(j) (j = 1, ..., D_I) for obtaining each row vector m_x^(j) of M_x.
  • Hereinafter, M_w is referred to as the “weight basis matrix”, C_w as the “weight coefficient matrix”, M_x as the “input basis matrix”, c_x as the “input coefficient vector”, and b_x as the “input bias”.
  • FIG. 11 is a diagram illustrating a configuration of the neural network device 20.
  • the neural network device 20 includes an input information acquisition unit 21, a calculation unit 22, an output information output unit 23, and a storage unit 24.
  • The storage unit 24 stores the neural network model and, for each FC layer, the items generated and output by the decomposition processing device 10:
  • the input coefficient vector c_x, the weight basis matrix M_w, and the weight coefficient matrix C_w for calculating the first term on the right side of equation (6); the sum of the second and third terms on the right side of equation (6) (b_xC_w^TM_w^T1 + b); and the lookup tables LUT(j) for obtaining each row vector m_x^(j) of the input basis matrix M_x.
  • the input information to be processed is input to the input information acquisition unit 21.
  • the calculation unit 22 reads the neural network model from the storage unit 24, inputs the input information acquired by the input information acquisition unit 21 to the input layer, executes calculation processing, and obtains an output layer.
  • FIG. 12 is a diagram illustrating the processing of the calculation unit 22 in the FC layer of the neural network model.
  • In the FC layer, the calculation unit 22 takes the output vector of the previous layer as the input vector x, decomposes the input vector x into the sum of the product of the binary input basis matrix M_x and the real input coefficient vector c_x and the input bias b_x, and obtains the product of the input vector x and the weight matrix W.
  • Specifically, the calculation unit 22 performs the calculation of equation (6) to obtain the product of the input vector x and the weight matrix W.
  • To this end, the calculation unit 22 first obtains the binary input basis matrix M_x corresponding to the input vector x by referring to the lookup table LUT read from the storage unit 24. Next, the calculation unit 22 calculates the first term on the right side of equation (6) (C_w^TM_w^TM_xc_x) using the obtained binary input basis matrix M_x and the weight coefficient matrix C_w, the weight basis matrix M_w, and the input coefficient vector c_x read from the storage unit 24.
  • The calculation unit 22 then calculates the sum (C_w^TM_w^TM_xc_x + b_xC_w^TM_w^T1 + b) of the value of the first term on the right side of equation (6) obtained by the above calculation (C_w^TM_w^TM_xc_x) and the sum of the second and third terms on the right side of equation (6) read from the storage unit 24 (b_xC_w^TM_w^T1 + b).
  • the computing unit 22 further calculates the output of the layer (the input of the next layer) by inputting the calculation result to an activation function (for example, ReLU).
  • the computing unit 22 finally obtains an output layer by performing computation according to the neural network model while performing the above computation in the FC layer.
  • the value of the output layer is output to the output information output unit 23.
  • The output information output unit 23 outputs the required output information based on the value of the output layer obtained by the calculation unit 22. For example, when the neural network model performs classification, the output information output unit 23 outputs information on the class with the highest likelihood in the output layer as the output information.
  • For the FC layer of the neural network, as described above, it is effective to save memory and increase speed by using the decomposed weight matrix W and the lookup table LUT for decomposing the input vector.
  • The CONV layer of the intermediate layers, whose various (three-dimensional) filters are arranged into a four-dimensional data structure, can also benefit from the above-described speed-up method.
  • FIGS. 13 and 14 are diagrams showing the relationship between the input map and the output map of the CONV layer. In FIGS. 13 and 14, the left side is the input map IM, the right side is the output map OM, and the rectangular parallelepipeds applied to the input map are the three-dimensional filters F1 and F2.
  • The filter F1 and the filter F2 are different filters; in total, C_out different filters are prepared.
  • The calculation amount for one output map is (f_h·f_w·C_in) × (H·W), and for all the filters together it is (f_h·f_w·C_in) × (H·W) × C_out. As it stands, this amount of calculation is very large.
  • the weight matrix W is generated by arranging the filters as column vectors in the row direction.
  • the CONV layer can also be regarded as the FC layer, and the above-described memory saving and high-speed computation can be performed.
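  • The following is a minimal sketch of regarding a CONV layer as an FC layer (illustration only; stride 1, no padding, and all names and sizes are assumptions): the sliding windows of the input map are cut out and rearranged into rows, and the filters are arranged as columns of the weight matrix W, so that the convolution becomes a single matrix product.

```python
import numpy as np

def im2col(input_map, f_h, f_w):
    """Rearrange a (C_in, H, W) input map into a matrix whose rows are the
    flattened sliding windows (stride 1, no padding)."""
    C_in, H, W = input_map.shape
    out_h, out_w = H - f_h + 1, W - f_w + 1
    cols = np.empty((out_h * out_w, C_in * f_h * f_w))
    for i in range(out_h):
        for j in range(out_w):
            cols[i * out_w + j] = input_map[:, i:i + f_h, j:j + f_w].ravel()
    return cols

rng = np.random.default_rng(0)
x_map = rng.normal(size=(3, 8, 8))                 # C_in = 3, H = W = 8
filters = rng.normal(size=(4, 3, 3, 3))            # C_out = 4, f_h = f_w = 3
W = filters.reshape(4, -1).T                       # filters arranged as columns of W
out = im2col(x_map, 3, 3) @ W                      # one row per output position, one column per filter
print(out.shape)                                   # (36, 4) = (H' * W', C_out)
```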
  • Table 1 is a table comparing the amount of calculation necessary for each FC layer in the neural network device 20 of the present embodiment with that of the conventional technique.
  • B is the bit width of a variable (register) that performs a logical operation.
  • D_I and D_O are on the order of hundreds to thousands, whereas k_x is about 2 to 4 and k_w is about D_O/8 to D_O/4 as described above. Therefore, in this embodiment, the amount of calculation is reduced as compared with the prior art.
  • Table 2 is a table comparing memory consumption in each FC layer with the prior art in the neural network device 20 of the present embodiment.
  • Here, single-precision real numbers (32 bits) are used as the real numbers.
  • the memory consumption is reduced as compared with the prior art.
  • As described above, in the present embodiment, the memory consumption and the amount of computation in the FC layer can be reduced, so the present embodiment is particularly effective when the number of layers of the neural network is large (a deep neural network)
  • and the above-described memory-saving and high-speed computation can be applied to a plurality of layers.
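  • The following back-of-envelope comparison (illustration only, under the assumptions stated in the comments; it is not a reproduction of Tables 1 and 2) gives a feel for the orders of magnitude involved for one FC layer.

```python
# Assumptions: the decomposed computation is dominated by equation (6), logical
# operations are performed B bits at a time, and reals are single precision.
D_I, D_O, k_x, B = 1024, 1024, 4, 64
k_w = D_O // 8

conv_ops   = D_I * D_O                       # conventional: real product-sum operations
prop_logic = k_w * k_x * D_I // B            # M_w^T M_x via logical ops + bit count (word ops)
prop_real  = k_w * k_x + D_O * k_w           # (M_w^T M_x) c_x, then C_w^T (.)  (real ops)
conv_mem   = 32 * D_I * D_O                  # bits to store W
prop_mem   = 1 * D_I * k_w + 32 * k_w * D_O  # bits to store binary M_w plus real C_w

print(conv_ops, prop_logic, prop_real)       # 1048576, 8192, 131584
print(conv_mem, prop_mem)                    # 33554432, 4325376
```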
  • the above-described decomposition processing device 10 and the neural network device 20 are each realized by a computer having a storage device, a memory, an arithmetic processing device, and the like executing a program.
  • the decomposition processing device 10 and the neural network device 20 have been described as separate devices, but these devices may be configured by the same computer.
  • In the above embodiment, a lookup table LUT that defines the β optimizing m_x^(j) for each of a plurality of bins is created and stored in the neural network device 20,
  • and the input basis matrix M_x is obtained by a method of finding, for each element x_j, the bin to which the element x_j belongs and obtaining the optimum β by referring to the lookup table LUT.
  • However, the optimal input basis search method is not limited to the above. Modifications of the optimal input basis search method are described below.
  • In this modification, the input pre-decomposition unit 13 calculates (βc_x + b_x) for all candidates β of m_x^(j).
  • For example, suppose that k_x = 4,
  • c_x = (3.8, 8.6, 1.2, 0.4)^T,
  • and b_x = 15.2.
  • A value obtained by calculating (βc_x + b_x) is referred to as a prototype p.
  • the input pre-decomposition unit 13 sorts (rearranges) the prototypes p according to the size of the values.
  • FIG. 18 shows the result of sorting the prototypes p by value for the example of FIG. 17. The rearranged prototypes are given the subscripts 1, 2, ..., 16 in ascending order of value and written as p_1, p_2, ..., p_16.
  • The β to be assigned to the value x_j of each element of the input vector can then be defined with the midpoints mp_i as boundaries.
  • a binary search method can be used.
  • FIG. 22 is a diagram illustrating a configuration of the neural network device 20 according to the present modification.
  • a binary tree (FIG. 27) described later is configured for each element x j of the input vector x instead of the lookup table LUT.
  • FIG. 27 is a diagram illustrating the binary tree search method described above.
  • In this way, the arithmetic unit 22 can finally obtain the solution with only about k_x comparisons.
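  • The following is a minimal sketch of this search (illustration only); a flat sorted array with bisection is used in place of an explicit binary tree, which gives the same logarithmic number of comparisons, and the example values of c_x and b_x are taken from the text above.

```python
import bisect
import itertools
import numpy as np

c_x, b_x = np.array([3.8, 8.6, 1.2, 0.4]), 15.2                      # example from the text
beta = np.array(list(itertools.product([-1.0, 1.0], repeat=4)))      # all candidate rows
p = beta @ c_x + b_x                                                  # prototypes

order = np.argsort(p)                                # sort prototypes in ascending order (p_1 .. p_16)
p_sorted, beta_sorted = p[order], beta[order]
midpoints = (p_sorted[:-1] + p_sorted[1:]) / 2       # boundaries between adjacent prototypes

def assign_row(x_j):
    """Choose the row of M_x for one element x_j: bisect the midpoints to find
    the nearest prototype (about log2 of the number of prototypes comparisons)."""
    return beta_sorted[bisect.bisect_left(midpoints, x_j)]

print(assign_row(20.0))
```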
  • The input basis matrix M_x may also be a ternary matrix.
  • In the above embodiment the weight matrix is a real matrix, but when the weight matrix is originally a binary or ternary matrix,
  • decomposition of the weight matrix is unnecessary. In this case, only the input vector needs to be decomposed into the sum of the product of a binary or ternary basis matrix and a real coefficient vector and a bias.
  • Neural networks whose weight matrices are originally binary or ternary are described, for example, in M. Courbariaux, Y. Bengio, and J.-P. David, "BinaryConnect: Training deep neural networks with binary weights during propagations," In NIPS, 3105-3113, 2015, and in F. Li and B. Liu, "Ternary weight networks," Technical Report arXiv:1605.04711, 2016.
  • FIG. 28 is a block diagram showing a configuration of a vehicle control system including the neural network device 20.
  • the vehicle control system 100 includes a neural network device 20, an in-vehicle sensor 30, and a vehicle control device 40.
  • the in-vehicle sensor 30 acquires input information input to the input device of the neural network device by performing sensing.
  • the in-vehicle sensor 30 may be, for example, a monocular camera, a stereo camera, a microphone, or a millimeter wave radar. These detected values may be input as they are to the neural network device 20 as input information, or input information may be generated by performing information processing on these detected values and input to the neural network device 20.
  • The neural network device 20 may detect a specific type of object (for example, a person or a vehicle) and surround it with a rectangular frame, determine which class each pixel belongs to (semantic segmentation), or perform other recognition processing.
  • the vehicle control device 40 controls the vehicle based on the output (recognition result) of the neural network device.
  • The vehicle control may be automatic driving of the vehicle, driving assistance of the vehicle (for example, forced braking when there is a danger of collision, lane keeping, and the like), or provision of information to the driver of the vehicle (for example, presentation of recognition results, notification of risk judgment results based on recognition results, and the like).

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Neurology (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)
  • Complex Calculations (AREA)
  • Feedback Control In General (AREA)

Abstract

A neural network apparatus (20) includes: a storage unit (24) that stores a neural network model; and an arithmetic unit (22) that inputs input information into the input layer of the neural network and obtains an output layer. A weight matrix (W) of an FC layer of the neural network model is configured as the product of an integer weight basis matrix (Mw) and a real weight coefficient matrix (Cw). In the FC layer, the arithmetic unit (22) calculates the product of an input vector (x) and the weight matrix (W) by taking the output vector of the previous layer as the input vector (x) and decomposing the input vector (x) into the product of a binary input basis matrix (Mx) and a real input coefficient vector (cx), plus an input bias (bx).
PCT/JP2017/026363 2016-07-21 2017-07-20 Appareil de réseau neuronal, système de commande de véhicule, dispositif de décomposition et programme Ceased WO2018016608A1 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2018528880A JP6921079B2 (ja) 2016-07-21 2017-07-20 ニューラルネットワーク装置、車両制御システム、分解処理装置、及びプログラム
US16/318,779 US11657267B2 (en) 2016-07-21 2017-07-20 Neural network apparatus, vehicle control system, decomposition device, and program
CN201780056821.9A CN109716362B (zh) 2016-07-21 2017-07-20 神经网络装置、车辆控制系统、分解处理装置以及程序

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2016143705 2016-07-21
JP2016-143705 2016-07-21

Publications (1)

Publication Number Publication Date
WO2018016608A1 true WO2018016608A1 (fr) 2018-01-25

Family

ID=60992638

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/026363 Ceased WO2018016608A1 (fr) 2016-07-21 2017-07-20 Appareil de réseau neuronal, système de commande de véhicule, dispositif de décomposition et programme

Country Status (4)

Country Link
US (1) US11657267B2 (fr)
JP (1) JP6921079B2 (fr)
CN (1) CN109716362B (fr)
WO (1) WO2018016608A1 (fr)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2019160319A (ja) * 2018-03-09 2019-09-19 キヤノン株式会社 多階層ニューラルネットワークモデルを最適化して適用する方法及び装置、及び記憶媒体
WO2020116194A1 (fr) * 2018-12-07 2020-06-11 ソニーセミコンダクタソリューションズ株式会社 Dispositif de traitement d'informations, procédé de traitement d'informations, programme, dispositif de commande de corps mobile et corps mobile
CN111382847A (zh) * 2018-12-27 2020-07-07 上海寒武纪信息科技有限公司 数据处理装置及相关产品
WO2021009965A1 (fr) * 2019-07-12 2021-01-21 株式会社メガチップス Processeur pour réseau neuronal, procédé de traitement pour réseau neuronal, et programme
JP2021012553A (ja) * 2019-07-08 2021-02-04 株式会社東芝 推論装置、学習装置、推論方法及び学習方法
JP2021149353A (ja) * 2020-03-18 2021-09-27 株式会社デンソー 情報処理装置、データ分解方法、及びデータ分解プログラム
WO2022244216A1 (fr) * 2021-05-20 2022-11-24 日本電信電話株式会社 Dispositif d'apprentissage, dispositif d'inférence, procédé d'apprentissage, procédé d'inférence, et programme
JP2022552452A (ja) * 2019-09-11 2022-12-16 ロベルト・ボッシュ・ゲゼルシャフト・ミト・ベシュレンクテル・ハフツング 同変ポリシーを伴う物理環境相互作用
JP2023068865A (ja) * 2021-11-04 2023-05-18 富士通株式会社 機械学習プログラム、機械学習方法及び機械学習装置
US11755880B2 (en) 2018-03-09 2023-09-12 Canon Kabushiki Kaisha Method and apparatus for optimizing and applying multilayer neural network model, and storage medium

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12165065B2 (en) * 2017-08-18 2024-12-10 Intel Corporation Efficient neural networks with elaborate matrix structures in machine learning environments
CN110162403B (zh) * 2019-05-28 2021-07-13 首都师范大学 一种基于人工神经网络的硬件资源分配方法及系统
CN110349107B (zh) * 2019-07-10 2023-05-26 北京字节跳动网络技术有限公司 图像增强的方法、装置、电子设备、及存储介质
US20200134417A1 (en) * 2019-12-24 2020-04-30 Intel Corporation Configurable processor element arrays for implementing convolutional neural networks
KR102512932B1 (ko) * 2020-01-31 2023-03-22 한국과학기술원 암 환자의 발현량 데이터로부터 암세포 내재적 특성을 추출하는 방법 및 이를 위한 장치
DE112020006752T5 (de) * 2020-02-17 2022-12-29 Mitsubishi Electric Corporation Modellerzeugungsvorrichtung, fahrzeugseitige Vorrichtung und Modellerzeugungsverfahren
KR20210111014A (ko) * 2020-03-02 2021-09-10 삼성전자주식회사 전자 장치 및 그 제어 방법
TWI723823B (zh) * 2020-03-30 2021-04-01 聚晶半導體股份有限公司 基於神經網路的物件偵測裝置和物件偵測方法
CN113554145B (zh) * 2020-04-26 2024-03-29 伊姆西Ip控股有限责任公司 确定神经网络的输出的方法、电子设备和计算机程序产品
CN111681263B (zh) * 2020-05-25 2022-05-03 厦门大学 基于三值量化的多尺度对抗性目标跟踪算法
DE102021210607A1 (de) * 2021-09-23 2023-03-23 Robert Bosch Gesellschaft mit beschränkter Haftung Verfahren und Vorrichtung zum Verarbeiten von mit einem neuronalen Netz assoziierten Daten
CN114462680B (zh) * 2022-01-05 2025-06-20 北京京东振世信息技术有限公司 车辆配置方法、装置、存储介质及电子设备

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05101025A (ja) * 1991-10-03 1993-04-23 Hamamatsu Photonics Kk ニユーラルネツトワーク装置及びその実行方法
JP2016042359A (ja) * 2014-08-18 2016-03-31 株式会社デンソーアイティーラボラトリ 認識装置、実数行列分解方法、認識方法

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU641418B2 (en) * 1989-09-20 1993-09-23 Fujitsu Limited A parallel data processing system for processing and transmitting data concurrently
US8457409B2 (en) * 2008-05-22 2013-06-04 James Ting-Ho Lo Cortex-like learning machine for temporal and hierarchical pattern recognition
JP6055391B2 (ja) * 2012-11-05 2016-12-27 株式会社デンソーアイティーラボラトリ 関連性判定装置、関連性判定プログラム、及び関連性判定方法
US9728184B2 (en) * 2013-06-18 2017-08-08 Microsoft Technology Licensing, Llc Restructuring deep neural network acoustic models
CN105224984B (zh) * 2014-05-31 2018-03-13 华为技术有限公司 一种基于深度神经网络的数据类别识别方法及装置
EP3192015A1 (fr) * 2014-09-09 2017-07-19 Intel Corporation Implémentations à entiers en virgule fixe améliorées pour réseaux de neurones
US10534994B1 (en) * 2015-11-11 2020-01-14 Cadence Design Systems, Inc. System and method for hyper-parameter analysis for multi-layer computational structures

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05101025A (ja) * 1991-10-03 1993-04-23 Hamamatsu Photonics Kk ニユーラルネツトワーク装置及びその実行方法
JP2016042359A (ja) * 2014-08-18 2016-03-31 株式会社デンソーアイティーラボラトリ 認識装置、実数行列分解方法、認識方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MOHAMMAD RASTEGARI ET AL., XNOR-NET: IMAGENET CLASSIFICATION USING BINARY CONVOLUTIONAL NEURAL NETWORKS, vol. 3, 9 May 2016 (2016-05-09), Retrieved from the Internet <URL:https://arxiv.org/pdf/1603.05279v3.pdf> [retrieved on 20171011] *

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11755880B2 (en) 2018-03-09 2023-09-12 Canon Kabushiki Kaisha Method and apparatus for optimizing and applying multilayer neural network model, and storage medium
JP2019160319A (ja) * 2018-03-09 2019-09-19 キヤノン株式会社 多階層ニューラルネットワークモデルを最適化して適用する方法及び装置、及び記憶媒体
WO2020116194A1 (fr) * 2018-12-07 2020-06-11 ソニーセミコンダクタソリューションズ株式会社 Dispositif de traitement d'informations, procédé de traitement d'informations, programme, dispositif de commande de corps mobile et corps mobile
US12125237B2 (en) 2018-12-07 2024-10-22 Sony Semiconductor Solutions Corporation Information processing apparatus, information processing method, program, mobile-object control apparatus, and mobile object
JP7497298B2 (ja) 2018-12-07 2024-06-10 ソニーセミコンダクタソリューションズ株式会社 情報処理装置、情報処理方法、プログラム、移動体制御装置、及び、移動体
JPWO2020116194A1 (ja) * 2018-12-07 2021-10-21 ソニーセミコンダクタソリューションズ株式会社 情報処理装置、情報処理方法、プログラム、移動体制御装置、及び、移動体
CN111382847A (zh) * 2018-12-27 2020-07-07 上海寒武纪信息科技有限公司 数据处理装置及相关产品
CN111382847B (zh) * 2018-12-27 2022-11-22 上海寒武纪信息科技有限公司 数据处理装置及相关产品
JP2021012553A (ja) * 2019-07-08 2021-02-04 株式会社東芝 推論装置、学習装置、推論方法及び学習方法
JP7114528B2 (ja) 2019-07-08 2022-08-08 株式会社東芝 推論装置、学習装置、推論方法及び学習方法
JP2021015510A (ja) * 2019-07-12 2021-02-12 株式会社メガチップス ニューラルネットワーク用プロセッサ、ニューラルネットワーク用処理方法、および、プログラム
JP7265946B2 (ja) 2019-07-12 2023-04-27 株式会社メガチップス ニューラルネットワーク用プロセッサ、ニューラルネットワーク用処理方法、および、プログラム
WO2021009965A1 (fr) * 2019-07-12 2021-01-21 株式会社メガチップス Processeur pour réseau neuronal, procédé de traitement pour réseau neuronal, et programme
JP2022552452A (ja) * 2019-09-11 2022-12-16 ロベルト・ボッシュ・ゲゼルシャフト・ミト・ベシュレンクテル・ハフツング 同変ポリシーを伴う物理環境相互作用
US12100198B2 (en) 2019-09-11 2024-09-24 Robert Bosch Gmbh Physical environment interaction with an equivariant policy
JP7597795B2 (ja) 2019-09-11 2024-12-10 ロベルト・ボッシュ・ゲゼルシャフト・ミト・ベシュレンクテル・ハフツング 同変ポリシーを伴う物理環境相互作用
JP7384081B2 (ja) 2020-03-18 2023-11-21 株式会社デンソー 情報処理装置、データ分解方法、及びデータ分解プログラム
JP2021149353A (ja) * 2020-03-18 2021-09-27 株式会社デンソー 情報処理装置、データ分解方法、及びデータ分解プログラム
JPWO2022244216A1 (fr) * 2021-05-20 2022-11-24
WO2022244216A1 (fr) * 2021-05-20 2022-11-24 日本電信電話株式会社 Dispositif d'apprentissage, dispositif d'inférence, procédé d'apprentissage, procédé d'inférence, et programme
JP7574920B2 (ja) 2021-05-20 2024-10-29 日本電信電話株式会社 学習装置、学習方法、及びプログラム
JP2023068865A (ja) * 2021-11-04 2023-05-18 富士通株式会社 機械学習プログラム、機械学習方法及び機械学習装置
JP7739949B2 (ja) 2021-11-04 2025-09-17 富士通株式会社 機械学習プログラム、機械学習方法及び機械学習装置

Also Published As

Publication number Publication date
CN109716362B (zh) 2024-01-09
CN109716362A (zh) 2019-05-03
JPWO2018016608A1 (ja) 2019-05-09
US20190286982A1 (en) 2019-09-19
US11657267B2 (en) 2023-05-23
JP6921079B2 (ja) 2021-08-18

Similar Documents

Publication Publication Date Title
JP6921079B2 (ja) ニューラルネットワーク装置、車両制御システム、分解処理装置、及びプログラム
KR102796191B1 (ko) 신경망의 최적화 방법
CN107688855B (zh) 针对于复杂神经网络的分层量化方法与装置
US11601134B2 (en) Optimized quantization for reduced resolution neural networks
Petersen et al. Differentiable sorting networks for scalable sorting and ranking supervision
EP3785176A1 (fr) Apprentissage d'un rang de troncature de matrices décomposées de valeurs singulières représentant des tenseurs de pondération dans des réseaux neuronaux
CN115879533B (zh) 一种基于类比学习的类增量学习方法及系统
JP2019032808A (ja) 機械学習方法および装置
KR20220058897A (ko) 컴퓨트-인-메모리 어레이의 컬럼 임계치들을 조정함으로써 xnor 등가 연산들을 수행
Ying et al. Self-optimizing feature generation via categorical hashing representation and hierarchical reinforcement crossing
EP3882823A1 (fr) Procédé et appareil comportant une approximation softmax
US20250139457A1 (en) Training a neural network to perform a machine learning task
KR20210039921A (ko) 신경망 모델을 최적화하도록 구성된 심층 신경망 시스템의 동작 방법
Chang et al. Differentiable architecture search with ensemble gumbel-softmax
WO2025024000A1 (fr) Élagage d'activations et de poids de réseaux neuronaux avec seuils programmables
Dyro et al. Second-order sensitivity analysis for bilevel optimization
Ahmed et al. Tiny Deep Ensemble: Uncertainty Estimation in Edge AI Accelerators via Ensembling Normalization Layers with Shared Weights
Guo et al. An interpretable neural network model through piecewise linear approximation
Swaney et al. Efficient skin segmentation via neural networks: HP-ELM and BD-SOM
Hämäläinen et al. Scalable robust clustering method for large and sparse data
Viana et al. A Quantum Genetic Algorithm Framework for the MaxCut Problem
Kirtas et al. Multiplicative RMSprop Using Gradient Normalization for Learning Acceleration
CN115293346A (zh) 基于数据库的模型训练方法及相关设备
EP4154186A1 (fr) Dispositif et procédé pour un réseau booléen de neurones artificiels
WO2025184850A1 (fr) Exécution d'une multiplication matricielle par réalisation d'une convolution avec un accélérateur de réseau de neurones profond

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17831124

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2018528880

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17831124

Country of ref document: EP

Kind code of ref document: A1