
WO2020213670A1 - Neural calculation device and neural calculation method - Google Patents


Info

Publication number
WO2020213670A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural
calculation
output
zero
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2020/016684
Other languages
French (fr)
Japanese (ja)
Inventor
Shinya Takamaeda
Kota Ueyoshi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hokkaido University NUC
Original Assignee
Hokkaido University NUC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hokkaido University NUC filed Critical Hokkaido University NUC
Publication of WO2020213670A1
Current legal status: Ceased

Classifications

    • G — PHYSICS
    • G06 — COMPUTING OR CALCULATING; COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/04 — Architecture, e.g. interconnection topology
    • G06N3/0464 — Convolutional networks [CNN, ConvNet]
    • G06N3/0495 — Quantised networks; Sparse networks; Compressed networks
    • G06N3/08 — Learning methods
    • G06N3/082 — Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N3/09 — Supervised learning

Definitions

  • The present invention belongs to the technical field of neural calculation devices and neural calculation methods.
  • For applications such as IoT (Internet of Things) and autonomous driving, techniques are needed that process deep neural networks (DNNs) at high speed and with low power consumption (for example, Non-Patent Document 1).
  • Most of the calculation in a neural network consists of multiplying weighting coefficients by the activation values derived from the input data and accumulating the multiplication results. Therefore, techniques such as pruning (for example, Non-Patent Document 1), which reduce the number of multiplications and additions by omitting weights that do not significantly affect the final recognition result, are used.
  • Neuron pruning (for example, Non-Patent Document 2) removes entire neurons that do not significantly affect the recognition result.
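As background, these two static reduction techniques can be sketched as follows. This is an illustrative sketch, not code from the patent; the thresholds, array shapes, and magnitude-based criteria are arbitrary assumptions.

```python
import numpy as np

def prune_weights(w, threshold):
    """Weight pruning: zero out weights whose magnitude is below the
    threshold, so their multiply-accumulate steps can be skipped."""
    mask = np.abs(w) >= threshold
    return w * mask

def prune_neurons(w, threshold):
    """Neuron pruning: remove an entire output neuron (a whole row of the
    weight matrix) when its weight norm is small, i.e. when the neuron
    barely affects the recognition result."""
    norms = np.linalg.norm(w, axis=1)
    keep = norms >= threshold
    return w[keep], keep

w = np.array([[0.9, -0.01, 0.5],
              [0.02, 0.01, -0.03],   # weak neuron: small norm
              [-0.7, 0.4, 0.02]])

sparse_w = prune_weights(w, threshold=0.05)
reduced_w, kept = prune_neurons(w, threshold=0.1)
print(sparse_w)
print(kept)   # the second neuron is removed
```

Both techniques are decided once, offline; the patent's point (below) is to make the omission decision dynamically, per input.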
  • The present invention has been made in view of the above problems and requirements, and one example of its object is to provide a neural calculation device capable of dynamically omitting calculations related to neurons.
  • The invention according to claim 1 comprises: neural network calculation means composed of a plurality of neuron calculators, each of which adds the multiplication results of input data and weighting coefficients, applies an activation function to the addition result, and outputs output data; zero output specifying means for performing a calculation that identifies, according to the input data, the neuron calculators whose outputs become zero; and control means for controlling, based on the calculation result of the zero output specifying means, the calculation of the neural network calculation means so as to omit the calculation of the neuron calculators whose outputs become zero.
  • The invention according to claim 2 is the neural calculation device according to claim 1, wherein the zero output specifying means is a binarized neural network calculator that imitates the calculation of the neural network calculation means, and the control means controls the calculation of the neural network calculation means based on the output of the zero output specifying means, to which 1-bit data converted from the input data is input.
  • The invention according to claim 3 is the neural calculation device according to claim 2, wherein the zero output specifying means is a binarized neural network calculator that imitates the calculation performed by the neural network calculation means before the activation function is applied.
  • The invention according to claim 4 is the neural calculation device according to claim 3, further comprising learning means for training the binarized neural network calculator based on the difference between the output of the zero output specifying means and the addition result in the neural network calculation means.
  • In the invention according to claim 5, the neural network calculation means is composed of a plurality of neural layers, a plurality of the zero output specifying means are provided corresponding to the respective neural layers, and the zero output specifying means corresponding to a neural layer performs a calculation that identifies, according to the output data of the preceding neural layer, the neuron calculators belonging to that neural layer whose outputs become zero.
  • The invention according to claim 6 is the neural calculation device according to claim 5, wherein the control means controls the calculation of the neural network calculation means so that the calculation of the neuron calculators identified by the zero output specifying means corresponding to a neural layer as having non-zero outputs is started in advance.
  • The invention according to claim 7 is the neural calculation device according to claim 5 or 6, wherein the control means controls the calculation of the neural network calculation means so as to shorten the time interval between the calculation in one neural layer and the calculation in the neural layer of the next layer.
  • The invention according to claim 8 is the neural calculation device according to any one of claims 1 to 7, wherein the activation function is a rectified linear function (ReLU).
  • In the invention according to claim 9, the zero output specifying means performs a calculation that identifies, from the pattern of the input data, the neuron calculators whose addition result is zero or negative.
  • The invention according to claim 10 is a neural calculation method for a neural network calculation device composed of a plurality of neuron calculators that add the multiplication results of input data and weighting coefficients, apply an activation function to the addition result, and output output data. The method includes a step in which zero output specifying means performs a calculation identifying the neuron calculators whose outputs become zero according to the input data, and a control step in which control means controls, based on the calculation result of the zero output specifying means, the calculation of the neural network calculation device so as to omit the calculation of the neuron calculators whose outputs become zero.
  • According to the present invention, in a neural network calculation composed of a plurality of neuron calculators that add the multiplication results of input data and weighting coefficients, apply an activation function to the addition result, and output output data, a calculation is performed to identify the neuron calculators whose outputs become zero according to the input data, and based on this calculation result the neural network calculation is controlled so as to omit the calculation of those neuron calculators. Because the neuron calculators whose outputs become zero can be identified from the pattern of the input data, calculations related to the neurons can be dynamically omitted.
  • A schematic diagram explains the identification function of a support vector machine used as the zero output specifying unit; a block diagram shows a modification of the neural network calculation unit and the zero output specifying unit.
  • FIG. 1 is a block diagram showing an example of a neural computing device according to an embodiment.
  • FIG. 2 is a block diagram showing an outline configuration example of the neural network calculation unit and the zero output specifying unit.
  • FIG. 3 is a block diagram showing an example of a neuron calculator.
  • FIG. 4 is a schematic diagram showing an example in which the calculation of the neuron calculator is omitted.
  • The neural calculation device 1 includes: a neural network calculation unit (NN calculation unit) 2 composed of neuron calculators that add the multiplication results of the input data i and the weighting coefficients, apply an activation function to the addition results, and output the output data o; a zero output specifying unit 3 that performs a calculation identifying the neuron calculators of the NN calculation unit 2 whose outputs become zero; a control unit 4 that, based on the calculation result of the zero output specifying unit 3, controls the calculation of the NN calculation unit 2 so as to omit the calculation of the neuron calculators whose outputs become zero; and a memory 5 that stores the weighting coefficients w and intermediate calculation data.
  • The neural network calculation unit 2, composed of a plurality of neuron calculators that add the multiplication results of the input data and the weighting coefficients, apply an activation function to the addition result, and output the output data, is an example of the neural network calculation means.
  • The zero output specifying unit 3 is an example of the zero output specifying means that performs a calculation identifying the neuron calculators whose outputs become zero according to the input data.
  • The control unit 4 is an example of the control means that controls, based on the calculation result of the zero output specifying means, the calculation of the neural network calculation means so as to omit the calculation of the neuron calculators whose outputs become zero.
  • The neural calculation device 1 receives the input data i0, i1, i2, …, in (n is a natural number; the same applies hereinafter) and outputs the output data o0, o1, o2, …, om (m is a natural number).
  • The neural network calculation unit 2 is composed of multiple neural layers 20, each having a plurality of neuron calculators 20a.
  • Each neuron calculator 20a has a multiplier-adder 21a that adds the products of the input data i0, i1, i2, …, in and the corresponding weighting coefficients w0, w1, w2, …, wn, and an activation function unit 25a that applies an activation function to the addition result.
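A neuron calculator of this kind — multiplier-adder followed by an activation function unit — can be sketched in a few lines. This is purely illustrative; the patent realizes it in hardware, and ReLU is used because the embodiment later sets the activation to the rectified linear function.

```python
import numpy as np

def multiply_adder(inputs, weights):
    """Multiplier-adder 21a: sum of products of the input data i_0..i_n
    and the weighting coefficients w_0..w_n."""
    return float(np.dot(inputs, weights))

def activation(u):
    """Activation function unit 25a; the embodiment uses the rectified
    linear function (ReLU)."""
    return max(u, 0.0)

def neuron_calculator(inputs, weights):
    return activation(multiply_adder(inputs, weights))

print(neuron_calculator([1.0, 2.0, -3.0], [0.5, 0.5, 1.0]))  # -> 0.0 (sum is -1.5)
```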
  • Each layer of the neural network is schematically drawn with the output-side neurons and the input-side neurons separated between adjacent neural layers 20. The group of neuron calculators 20a in the first layer L1 is configured with respect to the neurons of the input layer L0; the group in the second layer L2, with respect to the neurons of the first layer L1; and the group in the n-th layer Ln, with respect to the neurons of the (n-1)-th layer Ln-1.
  • The zero output specifying unit 3 is composed of a plurality of activity predictors 30, one installed for each neural layer 20. Each activity predictor 30 receives the same input data as its corresponding neural layer 20; for the second and subsequent layers, the output of the preceding neural layer 20 is input to the activity predictor 30. Based on the same input data pattern as the corresponding neural layer 20, the activity predictor 30 performs a calculation imitating the pre-activation part of the calculation in each neuron calculator 20a of that layer, and outputs its calculation result to the control unit 4.
  • Based on this calculation result, the control unit 4 controls the calculation of the neural layer 20 so that the multiplications and additions of the neuron calculators 20a whose outputs are predicted to be zero or less are omitted.
  • The neural network calculation unit 2 then performs calculations only for the neuron calculators 20a that remain connected in the network, without performing the omitted calculations.
  • The activity predictor 30 may be any calculator that can compute faster than the neural layer 20; examples include a binarized neural network calculator, an integerized neural network calculator, a support vector machine (SVM), and a random forest.
  • The activity predictor 30 extracts, for example, the most significant bit of each input datum on the input side and performs the binarized neural network calculation. The most significant bit may represent the sign of the input data (e.g., 0 for a minus sign and 1 for a plus sign) or the magnitude of the data value (e.g., 0 indicates 0 and 1 indicates the maximum value representable by the bit width). The input used for the binarized neural network calculation may also be 1-bit data obtained by converting the input data in some other way, instead of the most significant bit.
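For 8-bit values, the most-significant-bit conversion could look like this. The final inversion is an assumption made here so that the result follows the text's sign convention (0 = minus, 1 = plus); the raw two's-complement MSB is 1 for negative values.

```python
def msb_binarize(value, bits=8):
    """Extract the most significant bit of a `bits`-wide value and map it
    to the text's convention: returns 1 for a plus sign, 0 for a minus
    sign. (In two's complement the raw MSB is 1 for negatives, hence the
    inversion below.)"""
    msb = (value >> (bits - 1)) & 1
    return 1 - msb

print(msb_binarize(0b01100100))  # +100 in 8-bit two's complement -> 1 (plus)
print(msb_binarize(0b10011100))  # -100 in 8-bit two's complement -> 0 (minus)
```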
  • The calculations of the neural network calculation unit 2 and the zero output specifying unit 3 may be performed by dedicated electronic circuit hardware, by a von Neumann-type CPU, or by an AI chip such as a GPU; a combination of CPU and GPU, or a neuromorphic chip, may also be used.
  • The binarized neural network calculation in the zero output specifying unit 3 may be performed, for example, by a neural electronic circuit that implements the binarized multiplication as the exclusive NOR (XNOR) of a 1-bit weighting coefficient and 1-bit input data. The calculation accumulates the binarized multiplication results according to 1-bit connection presence/absence information between neurons, applies an activation function to the addition result, and outputs 1-bit output data.
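The XNOR-based multiply-accumulate can be sketched as follows: with 0 encoding −1 and 1 encoding +1, XNOR of two bits equals the product of the corresponding ±1 values. The sign-activation threshold of 0 is an illustrative assumption.

```python
def xnor(a, b):
    """Binarized multiplication: with 0 encoding -1 and 1 encoding +1,
    XNOR(a, b) encodes the product of the two +/-1 values."""
    return 1 - (a ^ b)

def binarized_neuron(input_bits, weight_bits, connected):
    """Accumulate XNOR products only where the 1-bit connection
    presence/absence information is 1, apply a sign activation, and emit
    1-bit output data."""
    total = sum(xnor(i, w)
                for i, w, c in zip(input_bits, weight_bits, connected) if c)
    n = sum(connected)
    signed_sum = 2 * total - n      # (#ones) - (#zeros) in +/-1 arithmetic
    return 1 if signed_sum > 0 else 0

print(binarized_neuron([1, 0, 1, 1], [1, 0, 1, 0], [1, 1, 1, 1]))  # -> 1
```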
  • FIG. 5 is a block diagram showing an outline configuration example of a binarized neural network system.
  • FIG. 6 is a block diagram showing an example of the neural electronic circuit of FIG.
  • The binarized neural network system NNS includes a plurality of core electronic circuits Core, each capable of realizing various types of neural networks as electronic circuits, and a system bus connecting the core electronic circuits Core. Each core electronic circuit Core has a binarized neural electronic circuit NN that can realize various types of neural networks as an electronic circuit, a memory access control unit MCnt that sets the weighting coefficients of the binarized neural electronic circuit NN, and a control unit Cnt that controls the neural electronic circuit NN and the memory access control unit MCnt.
  • Examples of realizable neural networks include fully connected neural networks in which the neurons of adjacent layers are fully connected to each other, neural networks that perform convolution calculations, neural networks expanded in the width of a neuron layer, and neural networks expanded in the number of layers.
  • The binarized neural electronic circuit NN has: an input memory array unit MAi that sequentially supplies the input data I1, …, Ip (p is a natural number; the same applies hereinafter) in parallel; a memory cell unit MC that sequentially supplies the weighting coefficient data in parallel; a plurality of process element units Pe that multiply the supplied input data I1, …, Ip by the weighting coefficients and output the multiplication results; addition activation units Act that, for each output, add the multiplication results from the process element units Pe and apply an activation function to the addition result; and an output memory array unit MAo that receives the 1-bit output data O1, …, Oq (q is a natural number) from the addition activation units Act.
  • The memory access control unit MCnt is, for example, a direct memory access (DMA) controller.
  • the memory access control unit MCnt sets the input data to be sequentially supplied to each process element unit Pe in the input memory array unit MAi according to the control of the control unit Cnt. Further, the memory access control unit MCnt sets in advance a weighting coefficient and a predetermined value indicating the presence / absence of connection between neurons in each memory cell unit MC according to the control of the control unit Cnt. Further, the memory access control unit MCnt takes out the output data output from the addition activation unit Act from the output memory array unit MAo under the control of the control unit Cnt.
  • the control unit Cnt has a CPU (Central Processing Unit) and the like.
  • The control unit Cnt manages timing, such as the synchronization of the elements of the binarized neural electronic circuit NN, and synchronizes calculation and data transfer.
  • the control unit Cnt controls switching of the selector element described later in the binarized neural electronic circuit NN.
  • The control unit Cnt controls the memory access control unit MCnt so that data output from another core electronic circuit Core is prepared for the input memory array unit MAi and supplied to it as input data.
  • the control unit Cnt controls the memory access control unit MCnt to transfer the output data acquired from the output memory array unit MAo to another core electronic circuit Core.
  • A host controller (for example, the control unit 4) may control the neural network system NNS and the control unit Cnt of each core electronic circuit Core. The host controller may also control the binarized neural electronic circuit NN and the memory access control unit MCnt instead of the control unit Cnt.
  • the host controller may be an external computer.
  • the bias memory array unit MAb stores bias data provided to each addition activation unit Act in advance.
  • the binarized neural electronic circuit NN realizes, for example, a two-layer neural network having p inputs and q outputs.
  • the memory cell unit MC has a memory cell 10 that stores a weighting coefficient.
  • the memory cell 10 stores a preset 1-bit weighting coefficient of “1” or “0” based on the brain function realized by the neural network to be constructed.
  • The memory cell unit MC may also have other memory cells (not shown) that store connection presence/absence information between neurons, preset based on the brain function. The no-connection information is, for example, a 1-bit predetermined value meaning NC (Not Connected); "1" or "0" is assigned as the predetermined value.
  • The memory cells 10 are lined up to form memory cell columns, and the memory cells 10 whose outputs go to one process element unit Pe are grouped to form a memory cell block CB. The memory cells 10 of a memory cell block CB correspond to the input data input in parallel. The memory cell unit MC preferably has at least p memory cell blocks CB, matching the parallelism of the input data I1, …, Ip input in parallel from the input memory array unit MAi, and the number of memory cells 10 per block is preferably equal to or greater than the number of cycles of the serial input data input bit by bit from the input memory array unit MAi. The memory cell unit MC sequentially outputs a 1-bit weighting coefficient from each memory cell block CB to the corresponding process element unit Pe, in step with the serial input data input bit by bit.
  • Each process element unit Pe receives the weighting coefficient from its memory cell block CB and the input data from the input memory array unit MAi. The memory cell block CB may alternately output a 1-bit weighting coefficient and 1-bit connection presence/absence information to the process element unit Pe, or each memory cell 10 may have an independent connection to the process element unit Pe so that they are output separately.
  • The memory cell unit MC is arranged in the binarized neural electronic circuit NN with an output parallelism of q, corresponding to the output data O1, …, Oq output in parallel to the output memory array unit MAo. The process element units Pe, p per column of parallel input data, form process element columns (for example, process element columns PC1 to PCq), with q columns corresponding to the output data output in parallel. The process element units Pe are thus laid out as a two-dimensional arithmetic unit array of p rows × q columns in the binarized neural electronic circuit NN.
  • Input data I1 is commonly input to the process element units Pe at positions (1,1), (1,2), …, (1,q); input data I2, to the units at (2,1), (2,2), …, (2,q); and input data Ip, to the units at (p,1), (p,2), …, (p,q).
  • Each process element unit Pe computes the exclusive NOR (XNOR) of the 1-bit weighting coefficient output from the corresponding memory cell 10 and the 1-bit input data, and outputs it as the multiplication result.
  • When no-connection information (for example, the predetermined value meaning "NC") is output from the connection presence/absence memory cell, the corresponding multiplication result is not added in the addition activation unit Act. The multiplication result and the connection presence/absence information may be output alternately as a pair, or the process element unit Pe may have a connection to the addition activation unit Act independent of the multiplication result so that the multiplication result and the connection presence/absence information are output separately.
  • The process element columns PC1, …, PCq output to the addition activation unit Act either the multiplication results from each process element unit Pe or partial sums obtained by adding some of the multiplication results.
  • The addition activation unit Act is arranged for each of the output data O1, …, Oq output in parallel. Based on the connection presence/absence information, it adds the multiplication results sequentially output from its process element column, applies an activation function to the addition result, and outputs the 1-bit output data to the output memory array unit MAo. When connection presence/absence information is not used, the addition activation unit Act simply adds the multiplication results sequentially output from the process element column, applies an activation function to the addition result, and outputs the 1-bit output data to the output memory array unit MAo.
  • For the addition, the addition activation unit Act uses, in units of one cycle of the input data, the value obtained by subtracting the number of times "0" is produced as a multiplication result from the number of times "1" is produced, and compares it with a predetermined value.
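This counting trick is the usual popcount formulation of binarized accumulation: with N 1-bit products, (#ones − #zeros) = 2·popcount − N. A sketch, with the comparison threshold as an illustrative assumption:

```python
def accumulate_by_popcount(mult_results, predetermined_value=0):
    """Add 1-bit multiplication results as the text describes: subtract
    the number of '0' results from the number of '1' results in one cycle
    of input data, then compare with a predetermined value (the
    activation)."""
    n = len(mult_results)
    ones = sum(mult_results)            # popcount
    value = ones - (n - ones)           # equals 2*ones - n
    return value, int(value > predetermined_value)

value, out_bit = accumulate_by_popcount([1, 1, 0, 1, 0])
print(value, out_bit)   # 3 ones - 2 zeros -> 1, output bit 1
```

The identity means a single popcount circuit plus a comparator replaces a full adder tree, which is why this formulation suits hardware.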
  • The bit-serial input is thus parallelized: each row of process element units Pe is shared with respect to one input datum, and each process element column, which is a column of process element units Pe, independently outputs its output data.
  • The binarized neural electronic circuit NN may also have logarithmic circuits so that it can handle multi-bit data in addition to binarized data. For example, it may have a circuit that computes the multiplication of the input data and the weighting coefficient by adding the logarithm of the input data and the logarithm of the weighting coefficient and then applying the inverse transformation.
  • Besides the binarized neural electronic circuit NN, the zero output specifying unit 3 may use any machine learning method that ultimately returns 0/1.
  • FIG. 7 is a flowchart showing an operation example of the neural computing device.
  • FIG. 8 is a block diagram showing an example of the neural network calculation unit and the zero output specifying unit.
  • FIGS. 9 and 10 are schematic views showing an example of the operation timing of the neural network calculation unit and the zero output specifying unit.
  • FIG. 11 is a schematic diagram showing an example of the result of the activity prediction accuracy by the neural computing device.
  • FIG. 12 is a schematic diagram showing an example of the result of the calculation reduction rate by the neural computing device.
  • First, the neural calculation device 1 acquires a trained neural network model (step S1). Specifically, the control unit 4 of the neural calculation device 1 acquires from the memory 5 the values of the weighting coefficients needed to construct the convolutional neural network, outputs them to the neural network calculation unit 2, and sets the trained neural network model. The control unit 4 also sets the activation function to the rectified linear function.
  • In the neural network calculation unit 2, when a neural layer 20 is a convolution layer, a multiplication/addition unit 21 that performs the convolution operation and a rectified linear function unit 25 are set. The neural layer 20 of the output layer may be a fully connected layer, and the neural layers 20 of the hidden layers may include fully connected layers.
  • The trained neural network model is constructed by pre-training the neural network calculation unit 2 with a predetermined training data set; for example, a training data set is applied to the neural network calculation unit 2 and training is performed by the error backpropagation method.
  • Next, the neural calculation device 1 trains the zero output specifying unit 3 (step S2). For example, as shown in FIG. 8, when the activity predictors 30 are binarized neural network calculators 31, the control unit 4 trains the zero output specifying unit 3 by applying to each binarized neural network calculator 31 the same training data set applied to the neural network calculation unit 2. More specifically, the control unit 4 obtains the OFM (Output Feature Map) that is the output of the multiplication/addition unit 21 in each neural layer 20 and the OFM that is the output of the binarized neural network calculator 31 corresponding to that neural layer 20.
  • The control unit 4 uses the OFM of the multiplication/addition unit 21 as a teacher signal, obtains the error with respect to the OFM of the binarized neural network calculator 31, and trains each binarized neural network calculator 31 by the error backpropagation method based on this error. The control unit 4 corrects the weights of the binarized neural network calculator 31 by an optimization method such as Newton's method or the steepest descent method, repeating the training until the value of the loss function falls to or below a predetermined value. The error is, for example, the square of the difference between the output of the binarized neural network calculator 31 and the addition result in the multiplication/addition unit 21. The weighting coefficients of the binarized neural network calculator 31 are modified in this way, forming a binarized neural network calculator 31 that imitates the multiplication/addition unit 21.
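The training loop of step S2 can be sketched as follows: the pre-activation output of a (hypothetical, randomly generated) trained layer serves as the teacher signal, the predictor sees only the 1-bit sign pattern of the input, and steepest descent minimizes the squared error. Everything here — shapes, learning rate, iteration count, the real-valued surrogate weights — is an illustrative assumption, not the patent's training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical trained layer standing in for the multiplication/addition unit 21
w_true = rng.normal(size=(4, 6))
X = rng.normal(size=(200, 6))
teacher = X @ w_true.T               # pre-activation OFM used as the teacher signal

# the predictor sees only the 1-bit (sign) version of the input
Xb = np.sign(X)
w_pred = np.zeros((4, 6))

lr = 0.01
for _ in range(500):                 # steepest descent on the squared error
    pred = Xb @ w_pred.T
    grad = 2.0 / len(X) * (pred - teacher).T @ Xb
    w_pred -= lr * grad

# the predictor should now agree with the true layer on the *sign*
# of the pre-activation sum most of the time
agree = np.mean((Xb @ w_pred.T > 0) == (teacher > 0))
print(round(agree, 2))
```

Only the sign of the prediction matters downstream, since it decides whether the ReLU output will be zero; the squared error against the full pre-activation value is simply a convenient training target.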
  • The binarized neural network calculator 31 is realized by, for example, the binarized neural electronic circuit NN. The zero output specifying unit 3 may be realized by a plurality of core electronic circuits Core.
  • The weighting coefficients for the binarized neural network calculator 31 are stored in the memory 5 and corrected as learning progresses; the control unit 4 learns by sequentially rewriting the weighting coefficients in the memory 5. The corrected weighting coefficients are transferred from the memory 5 to the memory access control unit MCnt of the zero output specifying unit 3 and set in the memory cell unit MC.
  • The convolution operation is realized under the control of the control unit Cnt, using the input data in the input memory array unit MAi and the weighting coefficients sequentially output from the memory cell unit MC to the process element units Pe. For example, when the input image is k × k pixels, the input data i1, i2, …, ik, …, ik² are input sequentially, and the value of q is the number of neurons in the next layer.
  • The learned activity predictor 30 or binarized neural network calculator 31 is an example of the binarized neural network calculator that imitates the calculation of the neural network calculation means, and in particular of one that imitates the calculation performed before the activation function is applied (the calculation of the multiplication/addition unit 21). The control unit 4 functions as an example of the learning means that trains the binarized neural network calculator based on the difference between the output of the zero output specifying means and the addition result in the neural network calculation means.
  • The neural calculation device 1 executes the neural network calculation using the zero output specifying unit 3 (step S3).
  • The control unit 4 reads the learned weighting coefficients for the neural net calculation unit 2 from the memory 5 and loads them into the neural net calculation unit 2.
  • The control unit 4 also reads the learned weighting coefficients for the zero output specifying unit 3 from the memory 5 and loads them into the zero output specifying unit 3.
  • The learned weighting coefficients for the zero output specifying unit 3 are set in the memory cell units MC corresponding to the respective binarized neural network calculators 31.
  • The calculation of the binarized neural network calculator 31 corresponding to the first neural layer 20 is started.
  • The input data are sequentially input to the first binarized neural network calculator 31 as a pattern of input data, each value being binarized by the most significant bit of the input data (for example, 0 indicating a minus sign and 1 a plus sign), and the calculation in the binarized neural network calculator 31 is performed.
  • The output data O1, O2, … of the addition activation unit Act are transmitted from the first binarized neural network calculator 31 to the control unit 4.
  • If the output data O1 is "0", the control unit 4 controls the first multiplication/addition unit 21 so as to omit the calculation of the multiply-adder 21a of the neuron calculator 20a corresponding to the output data O1. If the output data O1 is "1", the control unit 4 controls the first multiplication/addition unit 21 so that the calculation of the multiply-adder 21a of the neuron calculator 20a corresponding to the output data O1 is started with the same input data. The calculation result of the multiplication/addition unit 21 is output after the rectified linear function unit 25 is applied.
  • In this way, the activity predictor 30, that is, the binarized neural network calculator 31, functions as an example of the zero output specifying means that performs the calculation for specifying the neuron calculators whose output becomes zero according to the input data.
  • Likewise, if the output data O2 is "0", the control unit 4 controls the first multiplication/addition unit 21 so as to omit the calculation of the multiply-adder 21a of the neuron calculator 20a corresponding to the output data O2. If the output data O2 is "1", the control unit 4 controls the first multiplication/addition unit 21 so that the calculation of the multiply-adder 21a of the neuron calculator 20a corresponding to the output data O2 is started with the same input data. The control unit 4 controls the first multiplication/addition unit 21 in this way for each output data. For example, as shown in FIG. 4, in the first layer L1, only the multiply-adders 21a whose outputs are not expected to be zero perform their calculations.
  • The outputs from the first multiplication/addition unit 21 and the rectified linear function unit 25 are sequentially input to the second binarized neural network calculator 31 as a pattern of input data, each value being binarized by its most significant bit (for example, 0 indicating 0 and 1 indicating the maximum value expressible with the number of bits), and the calculation in the second binarized neural network calculator 31 is performed.
  • The outputs from the first multiplication/addition unit 21 and the rectified linear function unit 25 may be stored as intermediate data in a cache of the neural net calculation unit 2 or in the memory 5.
  • The output data of the second binarized neural network calculator 31 are sequentially output, and, according to the value of each output ("0" or "1"), the control unit 4 controls the second multiplication/addition unit 21 so that the calculation of the multiply-adder 21a of the corresponding neuron calculator 20a is either omitted or performed. For example, as shown in FIG. 4, in the second layer L2, only the multiply-adders 21a whose outputs are not expected to be zero perform their calculations.
  • The outputs from the second multiplication/addition unit 21 and the rectified linear function unit 25 are sequentially input to the third binarized neural network calculator 31 as input data, each value being binarized by its most significant bit (for example, 0 indicating 0 and 1 indicating the maximum value expressible with the number of bits), and the calculation in the third binarized neural network calculator 31 is performed.
  • The calculation proceeds in this way up to the last neural layer 20, and the neural net calculation unit 2 outputs the output data.
  • The calculation can thus be performed faster than in a conventional neural net calculation unit.
  • Whereas a conventional neural net calculation unit requires a calculation time t0 in a certain layer, the multiplication/addition unit 21 of the neural net calculation unit 2 calculates in a calculation time t2 (< t0) and the binarized neural network calculator 31 of the zero output specifying unit 3 calculates in a calculation time t1 (< t0); by slightly staggering the binarized neural network calculator 31 and the multiplication/addition unit 21 so that calculation starts from the neuron calculators 20a that are ready to be calculated, the whole calculation can be performed in a calculation time t3 (< t0).
  • In this case, the control unit 4 functions as an example of the control means that controls the calculation of the neural net calculation means so that calculation is started in advance from the neuron calculators whose outputs are specified to be non-zero by the zero output specifying means corresponding to the neural layer.
  • A time interval may also be provided between the calculation of a multiplication/addition unit 21 in the neural net calculation unit 2 and that of the multiplication/addition unit 21 of the next layer.
  • For example, after the calculation of the binarized neural network calculator 31 of the zero output specifying unit 3 is completed, the calculation of the multiplication/addition unit 21 of the next layer may be started. Further, there may be times when the multiplication/addition unit 21 of one layer and the binarized neural network calculator 31 operate simultaneously.
  • In this case, the control unit 4 functions as an example of the control means that controls the calculation of the neural net calculation means so as to reduce power consumption by providing a time interval between the calculation in one neural layer and the calculation in the neural layer of the next layer.
  • The control unit 4 is an example of the control means that controls the calculation of the neural net calculation means based on the output of the zero output specifying means to which 1-bit data obtained by conversion from the input data is input.
  • The activity predictor 30, that is, the binarized neural network calculator 31, functions as an example of the zero output specifying means that performs a calculation for specifying, from the pattern of the input data, the neuron calculators for which the addition result is zero or less.
  • The simulation was performed with an 11-layer model comprising 8 convolutional layers, 2 fully connected layers, and the input layer.
  • Before learning (broken line), the activity prediction accuracy of each binarized neural network calculator 31 tended to decrease in deeper layers, but after learning (dashed line), the activity prediction accuracy improved from the third layer (conv2_1) onward.
  • The horizontal axis shows each layer, and the vertical axis shows the activity prediction accuracy of each layer.
  • The activity prediction accuracy of each layer is the value obtained by taking, as the denominator, the number of neuron calculators 20a predicted to be zero, and, as the numerator, the number of neuron calculators 20a for which the prediction was correct, where the correct answer is the result obtained when the calculation is performed by the neural net calculation unit 2 without using the zero output specifying unit 3.
  • The calculation reduction rate of each layer is the value obtained by taking, as the denominator, the number of neuron calculators 20a of each layer, and, as the numerator, the number of neuron calculators 20a of each layer predicted to be zero by the binarized neural network calculator 31.
  • The zero output specifying unit 3 may omit the activity predictor 30 corresponding to the first layer L1 and the activity predictor 30 corresponding to the second layer L2.
  • As described above, the neural calculation device 1 includes the neural net calculation unit 2 composed of a plurality of neuron calculators 20a that add the multiplication results of the input data i and the weighting coefficients w, apply an activation function to the addition result, and output the output data o; the zero output specifying unit 3 performs a calculation for specifying the neuron calculators 20a whose output becomes zero according to the input data i; and the control unit 4 controls the neural network calculation of the neural net calculation unit 2 so as to omit the calculation of the neuron calculators 20a whose output becomes zero. Since the neuron calculators 20a whose output becomes zero can be specified according to the pattern of the input data, the calculation related to the neurons can be dynamically omitted.
  • When the zero output specifying unit 3 is a binarized neural network calculator 31 that imitates the calculation of the neural net calculation unit 2, and the control unit 4 controls the neural network calculation based on the output of the zero output specifying unit 3 to which the most significant bits of the input data have been input, the binarized calculation can be performed faster than that of the neural net calculation means; therefore, the neuron calculators 20a whose output becomes zero can be identified before the neural net calculation unit 2 performs its calculation.
  • When the zero output specifying unit 3 is a binarized neural network calculator 31 that imitates the calculation in the neural net calculation unit 2 before the activation function is applied, the calculation of applying the activation function is omitted, so the neuron calculators 20a whose output becomes zero can be identified even faster.
  • When the binarized neural network calculator 31 is trained based on the difference between the output of the zero output specifying unit 3 and the addition result in the neural net calculation unit 2, the prediction accuracy of the zero output specifying unit 3 is improved by the training.
  • When the neural net calculation unit 2 is composed of a plurality of neural layers 20, there are a plurality of activity predictors 30 corresponding to the respective neural layers 20, and the activity predictor 30 corresponding to a neural layer 20 specifies, according to the output data of the neural layer 20 of the previous layer, the neuron calculators 20a belonging to that neural layer 20 whose output becomes zero.
  • When the control unit 4 controls the calculation of the neural net calculation unit 2 so that calculation is started in advance from the neuron calculators 20a whose outputs are specified to be non-zero by the zero output specifying unit 3 corresponding to the neural layer 20, the calculation of the neural calculation device 1 can be realized even faster.
  • When the control unit 4 controls the calculation of the neural net calculation unit 2 so as to reduce power consumption by providing a time interval between the calculation in one neural layer 20 and the calculation in the neural layer 20 of the next layer, the power consumption of the neural net calculation unit 2, which consumes much power, can be reduced, and energy can be saved in the neural calculation device 1 as a whole.
  • When the activation function is a rectified linear function, the output becomes zero whenever the value before applying the rectified linear function is zero or less, so it is easy to omit the calculation in the neural net calculation unit 2.
  • When the zero output specifying unit 3 performs a calculation for specifying, from the pattern of the input data, the neuron calculators 20a for which the addition result is zero or less, it becomes easier to specify the neuron calculators 20a whose output becomes zero.
  • FIG. 13 is a block diagram in the case where the zero output specifying unit is a support vector machine.
  • FIG. 14 is a schematic diagram explaining the discriminant function and the like of the support vector machine used as the zero output specifying unit.
  • FIG. 15 is a block diagram showing a modification of the neural net calculation unit and the zero output specifying unit.
  • The activity predictor 30 may be a support vector machine classifier 32.
  • In this case, the neural calculation device 1 has the neural net calculation unit 2 and a zero output specifying unit 3A.
  • The zero output specifying unit 3A has a plurality of support vector machine classifiers 32 instead of the binarized neural network calculators 31.
  • A support vector machine classifier 32 is provided corresponding to the neural layer 20 of each layer.
  • In step S2, when the activity predictor 30 is the support vector machine classifier 32, the control unit 4 trains the zero output specifying unit 3A by applying, to each support vector machine classifier 32 of the zero output specifying unit 3A, the same learning data set that is applied to the neural net calculation unit 2. More specifically, the control unit 4 acquires the OFM that is the output of the multiplication/addition unit 21 in a neural layer 20 and the OFM that is the output of the support vector machine classifier 32 corresponding to that neural layer 20.
  • The neural calculation device 1 then executes the neural network calculation using the support vector machine classifiers 32 instead of the binarized neural network calculators 31, as in step S3.
  • The input to the support vector machine classifier 32 may be input data that has not been reduced to its most significant bit, or input data binarized by the most significant bit as in the binarized neural network calculator 31.
  • The present invention can also be applied to a fully connected layer instead of a convolutional layer. That is, when the neural layer 20 is a fully connected layer, the activity predictor 30 may be a fully connected binarized neural network calculator 33.
  • The fully connected multiplication/addition unit 22 of the neural layer 20 of each layer and the fully connected binarized neural network calculator 33 in the zero output specifying unit 3B correspond to each other.
  • The input data in the input memory array unit MAi and the weighting coefficients sequentially output from the memory cell unit MC to the process element units Pe are used for the fully connected calculation, and control is performed so that the fully connected operation is realized.
  • The circuit has process element columns PC1, PC2, …, PCq, in each of which process element units Pe of the input parallelism p are arranged, with the output parallelism q such columns, and memory cell units MC of the output parallelism q.
  • The control unit Cnt controls which of the process element columns PC1, PC2, …, PCq and the q memory cell units MC of the neural electronic circuit NN are used. Further, when a parallelism greater than p or q is required, the core electronic circuits Core may be connected in series or in parallel.
  • The binarized neural network calculator 31 that performs the convolution operation and the fully connected binarized neural network calculator 33 have the same configuration and, in performing the operation, differ only in the weighting coefficients and the like supplied from the memory cell unit MC; therefore, the operation of the binarized neural network calculator 31 and that of the binarized neural network calculator 33 are basically the same.
  • When the activity predictor 30 is the support vector machine classifier 32 or the binarized neural network calculator 33, the same effects as those of the binarized neural network calculator 31 can be obtained.
  • 1: Neural calculation device 2: Neural net calculation unit (neural net calculation means) 3, 3A, 3B: Zero output specifying unit (zero output specifying means) 4: Control unit (control means) 20: Neural layer (neural net calculation means) 20a: Neuron calculator 21, 22: Multiplication/addition unit 25: Rectified linear function unit 30: Activity predictor (zero output specifying means) 31, 33: Binarized neural network calculator (zero output specifying means) 32: Support vector machine classifier (zero output specifying means) i: Input data w: Weighting coefficient o: Output data
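The layer-by-layer flow described above can be sketched in simplified software form. This is an illustrative sketch only: the array shapes, the sign-based most-significant-bit binarization rule, and all function names are assumptions made for the example, not the circuits of the embodiment.

```python
import numpy as np

def binarize_by_msb(x):
    """Reduce each value to 1-bit data by its most significant (sign) bit:
    +1 for non-negative values, -1 otherwise (an assumed mapping)."""
    return np.where(x >= 0, 1, -1)

def predict_nonzero(x, w_bin):
    """Binarized predictor (stand-in for calculator 31): a cheap 1-bit
    multiply-accumulate whose sign predicts whether the full-precision
    pre-activation exceeds zero."""
    return (binarize_by_msb(x) @ w_bin) > 0

def relu_layer_with_skip(x, w, active):
    """Multiplication/addition plus rectified linear function, where the
    multiply-adds of neurons predicted to output zero are omitted."""
    y = np.zeros(w.shape[1])
    for j in np.nonzero(active)[0]:      # only neurons flagged non-zero
        y[j] = max(0.0, float(x @ w[:, j]))
    return y

rng = np.random.default_rng(0)
sizes = [9, 6, 4]                        # input and two neural layers
weights = [rng.standard_normal((a, b)) for a, b in zip(sizes, sizes[1:])]
bin_weights = [binarize_by_msb(w) for w in weights]  # stand-in for learned 1-bit weights

x = rng.standard_normal(sizes[0])
for w, wb in zip(weights, bin_weights):
    active = predict_nonzero(x, wb)      # zero output specifying step per layer
    x = relu_layer_with_skip(x, w, active)  # skipped neurons stay zero
```

In hardware terms, `predict_nonzero` plays the role of the binarized neural network calculator 31, and the gated loop in `relu_layer_with_skip` corresponds to the control unit 4 omitting individual multiply-adders.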


Abstract

The present invention provides a neural calculation device 1 and the like with which calculations relating to neurons can be dynamically omitted. The device comprises: a neural net calculation unit 2 composed of a plurality of neuron calculators 20a that add up the results obtained by multiplying input data i by a weighting coefficient w and apply an activation function to the addition result to output output data o; a zero output specifying unit 3 for performing a calculation to specify the neuron calculators 20a whose output becomes zero depending on the input data i; and a control unit 4 for controlling the calculation of the neural net calculation unit 2 on the basis of the output of the zero output specifying unit 3 so as to omit the calculation of the neuron calculators 20a whose output becomes zero.

Description

Neural calculation device and neural calculation method

The present invention belongs to the technical field of neural calculation devices and neural calculation methods.

In recent years, for applications such as IoT (Internet of Things) and autonomous driving, computer hardware that processes deep neural networks (DNNs) at high speed and with low power consumption has been in demand. Most of the computation of a neural network consists of multiplying input data by weighting coefficients and activation values, and accumulating the multiplication results. Therefore, there exist techniques such as pruning (for example, Non-Patent Document 1), which omits weights that do not significantly affect the final recognition result in order to reduce the number of multiplications and additions, and the similar technique of neuron pruning (for example, Non-Patent Document 2), which removes neurons that do not significantly affect the recognition result.

Non-Patent Document 1: Song Han et al., "Learning both Weights and Connections for Efficient Neural Networks", arXiv:1506.02626v3 [cs.NE], October 30, 2015.
Non-Patent Document 2: Tomoya Fujii et al., "A Threshold Neuron Pruning for a Binarized Deep Neural Network on an FPGA", IEICE Transactions on Information and Systems, Vol. E101.D, No. 2, pp. 376-386, 2018 (URL: https://www.jstage.jst.go.jp/article/transinf/E101.D/2/E101.D_2017RCP0013/_article/-char/ja/).

However, in the conventional techniques, since weighting coefficients and neurons are statically removed at the time of training the neural network, it has not been possible to omit calculations for invalid weights and neurons that appear dynamically depending on the input pattern.

The present invention has been made in view of the above problems and demands, and an example of its object is to provide a neural calculation device capable of dynamically omitting calculations related to neurons.

In order to solve the above problem, the invention according to claim 1 comprises: a neural net calculation means composed of a plurality of neuron calculators that add the multiplication results of input data and weighting coefficients and apply an activation function to the addition result to output output data; a zero output specifying means for performing a calculation for specifying the neuron calculators whose output becomes zero according to the input data; and a control means for controlling, based on the calculation result of the zero output specifying means, the calculation of the neural net calculation means so as to omit the calculation of the neuron calculators whose output becomes zero.

The invention according to claim 2 is the neural calculation device according to claim 1, wherein the zero output specifying means is a binarized neural network calculator that imitates the calculation of the neural net calculation means, and the control means controls the calculation of the neural net calculation means based on the output of the zero output specifying means to which 1-bit data obtained by conversion from the input data is input.

The invention according to claim 3 is the neural calculation device according to claim 2, wherein the zero output specifying means is a binarized neural network calculator that imitates the calculation of the neural net calculation means up to the point before the activation function is applied.

The invention according to claim 4 is the neural calculation device according to claim 3, further comprising a learning means for training the binarized neural network calculator based on the difference between the output of the zero output specifying means and the addition result in the neural net calculation means.

The invention according to claim 5 is the neural calculation device according to any one of claims 1 to 4, wherein the neural net calculation means is composed of a plurality of neural layers, there are a plurality of the zero output specifying means corresponding to the respective neural layers, and the zero output specifying means corresponding to a neural layer performs a calculation for specifying, according to the output data of the neural layer of the previous layer, the neuron calculators belonging to that neural layer whose output becomes zero.

The invention according to claim 6 is the neural calculation device according to claim 5, wherein the control means controls the calculation of the neural net calculation means so that calculation is started in advance from the neuron calculators specified by the zero output specifying means corresponding to the neural layer as having a non-zero output.

The invention according to claim 7 is the neural calculation device according to claim 5 or 6, wherein the control means controls the calculation of the neural net calculation means so as to reduce power consumption by providing a time interval between the calculation in a neural layer and the calculation in the neural layer of the next layer.

The invention according to claim 8 is the neural calculation device according to any one of claims 1 to 7, wherein the activation function is a rectified linear function.

The invention according to claim 9 is the neural calculation device according to claim 8, wherein the zero output specifying means performs a calculation for specifying, from the pattern of the input data, the neuron calculators for which the addition result is zero or less.

The invention according to claim 10 is a neural calculation method in a neural net calculation device composed of a plurality of neuron calculators that add the multiplication results of input data and weighting coefficients and apply an activation function to the addition result to output output data, the method including: an output specifying step in which a zero output specifying means performs a calculation for specifying the neuron calculators whose output becomes zero according to the input data; and a control step in which a control means controls, based on the calculation result of the zero output specifying means, the calculation of the neural net calculation device so as to omit the calculation of the neuron calculators whose output becomes zero.
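As a rough software illustration of the two steps of the method, the following sketch may help; the sign-based predicate used in the output specifying step is a placeholder assumed for this example, not the claimed binarized calculator, and all names and values are invented.

```python
import numpy as np

def output_specifying_step(inputs, weights):
    """Output specifying step: flag, per neuron, whether the output is
    expected to be zero (placeholder predicate: sign of a cheap estimate)."""
    return (np.sign(inputs) @ np.sign(weights)) <= 0   # True = expected zero

def control_step(inputs, weights, expected_zero):
    """Control step: omit the multiply-add of every neuron flagged as zero."""
    outputs = np.zeros(weights.shape[1])
    for j in range(weights.shape[1]):
        if not expected_zero[j]:                        # calculation omitted otherwise
            outputs[j] = max(0.0, float(inputs @ weights[:, j]))
    return outputs

inputs = np.array([1.0, -2.0, 0.5])
weights = np.array([[2.0, -1.0],
                    [1.0,  1.0],
                    [1.0, -2.0]])
flags = output_specifying_step(inputs, weights)
result = control_step(inputs, weights, flags)
```

Here the second neuron is flagged as zero and its multiply-add is never executed, while the first neuron is computed in full precision.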

According to the present invention, in a neural network calculation performed by a plurality of neuron calculators that add the multiplication results of input data and weighting coefficients and apply an activation function to the addition result to output output data, a calculation is performed to specify the neuron calculators whose output becomes zero according to the input data, and, based on this calculation result, the neural network calculation is controlled so as to omit the calculation of the neuron calculators whose output becomes zero. Since the neuron calculators whose output becomes zero can be specified according to the pattern of the input data, calculations related to neurons can be dynamically omitted.

FIG. 1 is a block diagram showing an example of a neural calculation device according to an embodiment.
FIG. 2 is a block diagram showing an outline configuration example of the neural net calculation unit and the zero output specifying unit of FIG. 1.
FIG. 3 is a block diagram showing an example of the neuron calculator of FIG. 2.
FIG. 4 is a schematic diagram showing an example of omitting the calculation of neuron calculators.
FIG. 5 is a block diagram showing an outline configuration example of a binarized neural network system.
FIG. 6 is a block diagram showing an example of the neural electronic circuit of FIG. 5.
FIG. 7 is a flowchart showing an operation example of the neural calculation device.
FIG. 8 is a block diagram showing an example of the neural net calculation unit and the zero output specifying unit.
FIG. 9 is a schematic diagram showing an example of the operation timing of the neural net calculation unit and the zero output specifying unit.
FIG. 10 is a schematic diagram showing another example of the operation timing of the neural net calculation unit and the zero output specifying unit.
FIG. 11 is a schematic diagram showing an example of activity prediction accuracy results obtained by the neural calculation device.
FIG. 12 is a schematic diagram showing an example of calculation reduction rate results obtained by the neural calculation device.
FIG. 13 is a block diagram showing a modification of the zero output specifying unit.
FIG. 14 is a schematic diagram explaining the discriminant function and the like of the support vector machine used as the zero output specifying unit.
FIG. 15 is a block diagram showing a modification of the neural net calculation unit and the zero output specifying unit.
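The two metrics plotted in FIG. 11 and FIG. 12 can be computed as in the following sketch. The flag arrays here are made up for illustration and are not the simulation data of the embodiment.

```python
import numpy as np

# Illustrative flags for one layer: True = neuron output is (predicted) zero.
predicted_zero = np.array([True, True, False, True, False, False])  # from the binarized predictor
actually_zero  = np.array([True, False, False, True, False, True])  # from the full calculation

# Activity prediction accuracy: denominator = neurons predicted zero,
# numerator = predictions that were correct.
accuracy = np.sum(predicted_zero & actually_zero) / np.sum(predicted_zero)

# Calculation reduction rate: denominator = all neurons in the layer,
# numerator = neurons predicted zero (their multiply-adds are skipped).
reduction = np.sum(predicted_zero) / predicted_zero.size
```

With these flags, 2 of the 3 zero predictions are correct (accuracy 2/3), and 3 of the 6 multiply-adds are skipped (reduction rate 0.5).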

Hereinafter, embodiments of the present invention will be described with reference to the drawings. The embodiments described below are embodiments in which the present invention is applied to a neural calculation device.

(1. Configuration and Functional Overview of the Neural Calculation Device)
First, the configuration and general functions of a neural calculation device according to an embodiment of the present invention will be described with reference to FIG. 1.

FIG. 1 is a block diagram showing an example of the neural calculation device according to the embodiment. FIG. 2 is a block diagram showing an outline configuration example of the neural net calculation unit and the zero output specifying unit. FIG. 3 is a block diagram showing an example of the neuron calculator. FIG. 4 is a schematic diagram showing an example of omitting the calculation of neuron calculators.

As shown in FIG. 1, the neural calculation device 1 includes: a neural net calculation unit (NN calculation unit) 2 composed of a plurality of neuron calculators that add the multiplication results of the input data i and the weighting coefficients, apply an activation function to the addition result, and output the output data o; a zero output specifying unit 3 that performs a calculation for specifying the neuron calculators of the neural net calculation unit 2 whose output becomes zero; a control unit 4 that, based on the output of the zero output specifying unit 3, controls the calculation of the neural net calculation unit 2 so as to omit the calculation of the neuron calculators whose output becomes zero; and a memory 5 that stores the weighting coefficients w, intermediate data of the calculation, and the like.

 The neural network calculation unit 2 is an example of a neural network calculation means composed of a plurality of neuron calculators that sum the products of input data and weighting coefficients and apply an activation function to the sum to output output data. The zero-output identification unit 3 is an example of a zero-output identification means that performs a calculation for identifying, according to the input data, the neuron calculators whose outputs become zero. The control unit 4 is an example of a control means that, based on the calculation result of the zero-output identification means, controls the calculation of the neural network calculation means so as to omit the calculations of the neuron calculators whose outputs become zero.

 The neural calculation device 1 outputs output data o0, o1, o2, ..., om (m is a natural number; the same applies hereinafter) for input data i0, i1, i2, ..., in (n is a natural number; the same applies hereinafter).

 As shown in FIG. 2, the neural network calculation unit 2 is composed of multiple neural layers 20, each having a plurality of neuron calculators 20a.

 As shown in FIG. 3, the neuron calculator 20a has a multiply-adder 21a that sums the products of the input data i0, i1, i2, ..., in and the corresponding weighting coefficients w0, w1, w2, ..., wn, and an activation function unit 25a that applies an activation function to the sum. When the activation function of the activation function unit 25a is a rectified linear function (ReLU), zero is output if the output of the multiply-adder 21a is zero or less, and the value x is output if the output of the multiply-adder 21a is a positive value x.
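As a concrete illustration, the neuron calculation described above can be sketched in a few lines of Python (this sketch is for explanation only and is not part of the embodiment; the function name is illustrative):

```python
def neuron_output(inputs, weights):
    """Sum of products (multiply-adder 21a) followed by a rectified
    linear activation (activation function unit 25a)."""
    s = sum(i * w for i, w in zip(inputs, weights))
    return max(0.0, s)  # ReLU: zero when s <= 0, s itself when s > 0
```

A negative weighted sum therefore always yields an output of exactly zero, which is what makes the zero-output prediction described below possible.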

 In FIG. 2, each layer of the neural network is schematically drawn with the output-side neurons and the input-side neurons separated between adjacent neural layers 20. Within the first neural layer 20, the group of neuron calculators 20a of the first layer L1 is configured for the neurons of the input layer L0. Within the second neural layer 20, the group of neuron calculators 20a of the second layer L2 is configured for the neurons of the first layer L1. Within the n-th neural layer 20, the group of neuron calculators 20a of the n-th layer Ln is configured for the neurons of the (n-1)-th layer Ln-1.

 As shown in FIG. 2, the zero-output identification unit 3 is composed of a plurality of activity predictors 30. Each activity predictor 30 is provided corresponding to one of the neural layers 20.

 The same input data as that of the corresponding neural layer 20 is input to each activity predictor 30. For the activity predictors 30 corresponding to the second and subsequent neural layers 20, the output of the preceding neural layer 20 is input.

 Based on the same input data pattern as the corresponding neural layer 20, the activity predictor 30 performs a calculation that imitates, for each neuron calculator 20a of the neural layer 20, the calculation up to the point before the activation function is applied. The calculation result of the activity predictor 30 is output to the control unit 4.

 For the neuron calculators 20a whose outputs are predicted by this calculation result to be zero or less, the control unit 4 controls the calculation of the neural layer 20 so as to omit the multiplications and additions involving those neuron calculators 20a.
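The effect of this control can be sketched as follows; here `predicted_active` stands in for the prediction flags produced by the activity predictor 30, and the function and parameter names are assumptions made for illustration:

```python
def layer_forward(inputs, weight_rows, predicted_active):
    """Compute a ReLU layer, skipping every neuron calculator that the
    activity predictor flags as producing a zero output."""
    outputs = []
    for weights, active in zip(weight_rows, predicted_active):
        if not active:
            outputs.append(0.0)  # multiply-add omitted; output known to be zero
            continue
        s = sum(i * w for i, w in zip(inputs, weights))
        outputs.append(max(0.0, s))  # ReLU applied only to computed neurons
    return outputs
```

The saving comes from the `continue` branch: every skipped neuron avoids its full multiply-accumulate over the input vector.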

 For example, as shown in FIG. 4, the neural network calculation unit 2 performs calculations only for the neuron calculators 20a connected in the network of neuron calculators 20a whose calculations are not omitted.

 The activity predictor 30 may be any calculator that can compute faster than the neural layer 20. Examples of the activity predictor 30 include a binarized neural calculator, an integerized neural calculator, a support vector machine (SVM), and a random forest. When the activity predictor 30 is a binarized neural calculator, the activity predictor 30 takes, on its input side, for example the most significant bit of the input data, and the binarized neural network calculation is performed. In the first layer, the most significant bit represents the sign of the input data (for example, 0 indicates a minus sign and 1 indicates a plus sign); in the second and subsequent layers, the most significant bit represents the magnitude of the data value (for example, 0 indicates zero, and 1 indicates the maximum value representable with that bit width). The input used for the binarized neural network calculation may be, besides the most significant bit of the input data, any 1-bit data obtained by converting the input data.
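A minimal sketch of the 1-bit conversion described above; the fixed-point encoding and the 8-bit width are assumptions made for illustration:

```python
def sign_bit(value):
    """First layer: 1 for a plus sign (value >= 0), 0 for a minus sign."""
    return 1 if value >= 0 else 0

def msb(value, bits=8):
    """Second and subsequent layers: most significant bit of an unsigned
    fixed-point value, used as a coarse magnitude indicator
    (0 = near zero, 1 = near the maximum representable value)."""
    return (value >> (bits - 1)) & 1
```

Either conversion reduces each input to a single bit, which is what lets the predictor run much faster than the full-precision neural layer 20.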

 The calculations of the neural network calculation unit 2 and the zero-output identification unit 3 may be performed by dedicated electronic circuit hardware, by a von Neumann-type CPU, by an AI chip such as a GPU, by a combination of a CPU and a GPU, or by a neuromorphic chip.

 The binarized neural network calculation in the zero-output identification unit 3 may be performed, for example, by a neural electronic circuit that performs binarized multiplication as the exclusive NOR (XNOR) of a 1-bit weighting coefficient and 1-bit input data. The binarized neural network calculation accumulates the binarized multiplication results based on 1-bit connection presence/absence information between neurons. The binarized neural network calculation then applies an activation function to the sum and outputs 1-bit output data.
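In Python terms, this binarized multiply-accumulate could look like the following sketch; the way the connection presence/absence information gates the accumulation is one possible reading of the description, and the names are illustrative:

```python
def xnor(w_bit, x_bit):
    """1-bit binarized 'multiplication': XNOR of weight bit and input bit."""
    return 1 - (w_bit ^ x_bit)

def binarized_accumulate(w_bits, x_bits, connected):
    """Accumulate XNOR products only where a neuron-to-neuron connection exists."""
    return sum(xnor(w, x) for w, x, c in zip(w_bits, x_bits, connected) if c)
```

With both operands encoded as single bits, the whole multiplication reduces to one logic gate per connection, which is why this circuit can act as a fast activity predictor.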

 (2. Configuration and Functions of the Binarized Neural Network System in the Zero-Output Identification Unit 3)
 Next, the configuration and general functions of the binarized neural network system in the case where the zero-output identification unit 3 is a binarized neural network will be described with reference to the drawings.

 FIG. 5 is a block diagram showing a schematic configuration example of the binarized neural network system. FIG. 6 is a block diagram showing an example of the neural electronic circuit of FIG. 5.

 As shown in FIG. 5, the binarized neural network system NNS includes a plurality of core electronic circuits Core, each capable of realizing various types of neural networks as electronic circuits, and a system bus bus connecting the core electronic circuits Core to one another.

 Each core electronic circuit Core has a binarized neural electronic circuit NN capable of realizing various types of neural networks as an electronic circuit, a memory access control unit MCnt that sets the weighting coefficients and the like of the binarized neural electronic circuit NN, and a control unit Cnt that controls the binarized neural electronic circuit NN and the memory access control unit MCnt. Examples of the various types of neural networks include a fully connected type in which the neurons of adjacent layers are fully connected to one another, a neural network that performs convolution operations, a neural network with intra-layer expansion of the neuron layers, and a neural network with an expanded number of layers.

 The binarized neural electronic circuit NN has: an input memory array unit MAi that sequentially supplies input data I1, ..., Ip (p is a natural number; the same applies hereinafter) in parallel; a memory cell unit MC that sequentially supplies weighting coefficient data in parallel; a plurality of process element units Pe that realize a multiplication function multiplying the supplied input data I1, ..., Ip by the weighting coefficients and output the multiplication results; addition activation units Act that sum the multiplication results from the process element units Pe for the parallel input data and apply an activation function to the sum; an output memory array unit MAo that sequentially stores the 1-bit output data O1, ..., Oq (q is a natural number; the same applies hereinafter) from the respective addition activation units Act; and a bias memory array unit MAb that sequentially provides bias values to the respective addition activation units Act.

 The memory access control unit MCnt is, for example, a direct memory access controller. Under the control of the control unit Cnt, the memory access control unit MCnt sets, in the input memory array unit MAi, the input data to be sequentially supplied to each process element unit Pe. Also under the control of the control unit Cnt, the memory access control unit MCnt sets in advance, in each memory cell unit MC, the weighting coefficients and the predetermined values indicating the presence or absence of connections between neurons. Further, under the control of the control unit Cnt, the memory access control unit MCnt retrieves the output data output from the addition activation units Act out of the output memory array unit MAo.

 The control unit Cnt has a CPU (central processing unit) or the like. The control unit Cnt manages timing such as the synchronization of each element of the binarized neural electronic circuit NN, and synchronizes calculations and data transfers. The control unit Cnt also performs switching control of the selector elements, described later, in the binarized neural electronic circuit NN.

 The control unit Cnt controls the memory access control unit MCnt so as to arrange the data output from another core electronic circuit Core for the input memory array unit MAi and supply it to the input memory array unit MAi as input data. The control unit Cnt also controls the memory access control unit MCnt so as to transfer the output data acquired from the output memory array unit MAo to another core electronic circuit Core.

 Note that a host controller (for example, the control unit 4) may control the neural network system NNS and the control unit Cnt of each core electronic circuit Core. The host controller may also control the binarized neural electronic circuit NN and the memory access control unit MCnt in place of the control unit Cnt. The host controller may be an external computer.

 The bias memory array unit MAb stores in advance the bias data to be provided to each addition activation unit Act.

 As shown in FIG. 6, the binarized neural electronic circuit NN realizes, for example, a two-layer neural network with p inputs and q outputs.

 The memory cell unit MC has memory cells 10 that store the weighting coefficients. Each memory cell 10 stores a 1-bit weighting coefficient of "1" or "0" that is preset based on the brain function to be realized by the neural network being constructed.

 The memory cell unit MC may also have separate memory cells (not shown) for connection presence/absence information, which store the connection presence/absence information between neurons preset based on the brain function. Here, the no-connection information is, for example, a 1-bit predetermined value meaning NC (Not Connected), and "1" or "0" or the like is assigned as the predetermined value.

 The memory cells 10 are arranged side by side to form columns of memory cells. The memory cells 10 that are output simultaneously to each process element unit Pe are grouped together to form a memory cell block CB. The memory cells 10 of a memory cell block CB correspond to the respective input data input in parallel.

 The memory cell unit MC preferably has at least p memory cell blocks CB, corresponding to the input parallelism p of the input data I1, ..., Ip input in parallel from the input memory array unit MAi. Within a memory cell block CB, the number of memory cells 10 is preferably equal to or greater than the number of cycles of the serial input data input bit by bit from the input memory array unit MAi.

 For each memory cell block CB, the memory cell unit MC sequentially outputs the 1-bit weighting coefficients to the process element unit Pe in correspondence with the serial input data input bit by bit. To each process element unit Pe, the weighting coefficient from its memory cell block CB and the input data from the input memory array unit MAi are input.

 The memory cell block CB may alternately and sequentially output a 1-bit weighting coefficient and 1-bit connection presence/absence information to the process element unit Pe. Alternatively, the memory cells 10 may have wiring to the process element unit Pe that is independent of each other, and may output to the process element unit Pe separately and sequentially.

 As shown in FIGS. 5 and 6, q memory cell units MC, corresponding to the output parallelism q of the output data O1, ..., Oq output in parallel to the output memory array unit MAo, are arranged in the binarized neural electronic circuit NN.

 As shown in FIG. 6, the p process element units Pe arranged for the respective parallel input data form a process element column (for example, process element column PC1) in the binarized neural electronic circuit NN. The q process element columns PC1 to PCq are arranged in q columns in the binarized neural electronic circuit NN, corresponding to the output data output in parallel. As shown in FIG. 3, the process element units Pe are arranged in the binarized neural electronic circuit NN as a two-dimensional array of arithmetic units with p rows × q columns.

 The process element units Pe at matrix positions (1,1), (1,2), ..., (1,q) are wired so that the input data I1 is input to them in common. The process element units Pe at (2,1), (2,2), ..., (2,q) are wired so that the input data I2 is input in common. The process element units Pe at (p,1), (p,2), ..., (p,q) are wired so that the input data Ip is input in common.

 The process element unit Pe calculates, as the multiplication result, the exclusive NOR (XNOR) of the 1-bit weighting coefficient output from the corresponding memory cell 10 and the 1-bit input data, and outputs it.

 When no-connection information (for example, the predetermined value meaning "NC") is output from the memory cell for connection presence/absence information, the corresponding multiplication result is not added in the addition activation unit Act. For example, the multiplication result and the connection presence/absence information may be alternately output as a pair. Alternatively, regarding the connection presence/absence information, there may be wiring from the process element unit Pe to the addition activation unit Act that is independent of the multiplication result, and the multiplication result and the connection presence/absence information may be output separately.

 Similarly, when the process element unit Pe calculates a partial sum of the multiplication results, a multiplication result is not added to the partial sum if no-connection information (for example, the predetermined value meaning "NC") is output from the memory cell for connection presence/absence information.

 The process element columns PC1, ..., PCq output, to the addition activation units Act, the multiplication results from the respective process element units Pe or partial-sum results obtained by adding some of the multiplication results.

 As shown in FIGS. 5 and 6, the addition activation units Act are arranged corresponding to the respective output data O1, ..., Oq output in parallel.

 The addition activation unit Act adds the multiplication results sequentially output from its process element column based on the connection presence/absence information, applies the activation function to the sum, and outputs 1-bit output data to the output memory array unit MAo. When the process element units Pe output partial sums of the multiplication results, the addition activation unit Act adds the partial sums sequentially output from the process element column, applies the activation function to the sum, and outputs 1-bit output data to the output memory array unit MAo.

 For each cycle unit of the input data in a process element column, the addition activation unit Act outputs "1" as the output data when the number of times "1" was calculated as a multiplication result, minus the number of times "0" was calculated, is equal to or greater than a predetermined threshold, and outputs "0" as the output data when that difference is less than the threshold.
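This thresholding rule can be written out as a short sketch (illustrative only; whether the bias value from the bias memory array unit MAb is folded into `threshold` is an assumption):

```python
def addition_activation(products, threshold):
    """Binary activation for one input cycle: output 1 when the count of
    1s among the XNOR multiplication results, minus the count of 0s,
    meets the threshold; otherwise output 0."""
    ones = sum(products)
    zeros = len(products) - ones
    return 1 if ones - zeros >= threshold else 0
```

Because `ones - zeros` equals `2 * ones - len(products)`, the circuit only needs a popcount of the 1-bit products and a single comparison.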

 As shown in FIG. 6, the bit-serial input is parallelized: each row of process element units Pe shares its input data, and each process element column, that is, each column of process element units Pe, outputs its output data independently.

 Note that the binarized neural electronic circuit NN may have logarithmization circuitry so that it can handle multi-bit data in addition to binary data. For example, the binarized neural electronic circuit NN may have a circuit that computes the product of input data and a weighting coefficient as the logarithmic sum of the logarithmized input data and the logarithmized weighting coefficient, and linearizes the result by inverse transformation. Furthermore, besides a binarized neural electronic circuit NN, the zero-output identification unit 3 may use any machine learning method that ultimately returns 0/1.

 (3. Operation Example of the Neural Calculation Device)
 Next, an operation example of the neural calculation device 1 will be described with reference to FIGS. 7 to 12. The operation will be described taking as an example the case where the neural calculation device 1 is a convolutional neural network and the zero-output identification unit 3 is a binarized neural network.

 FIG. 7 is a flowchart showing an operation example of the neural calculation device. FIG. 8 is a block diagram showing an example of the neural network calculation unit and the zero-output identification unit. FIGS. 9 and 10 are schematic diagrams showing an example of the operation timing of the neural network calculation unit and the zero-output identification unit. FIG. 11 is a schematic diagram showing an example of activity prediction accuracy results obtained by the neural calculation device. FIG. 12 is a schematic diagram showing an example of calculation reduction rate results obtained by the neural calculation device.

 As shown in FIG. 7, the neural calculation device 1 acquires a trained neural network model (step S1). Specifically, the control unit 4 of the neural calculation device 1 acquires from the memory 5 the values of the weighting coefficients necessary for constructing the convolutional neural network. The control unit 4 outputs the weighting coefficient values to the neural network calculation unit 2 and sets up the trained neural network model. The control unit 4 also sets the activation function to a rectified linear function.

 For example, as shown in FIG. 8, in the neural network calculation unit 2, when a neural layer 20 is a convolutional layer, a multiply-add unit 21 that performs the convolution operation and a rectified linear function unit 25 are set. The neural layer 20 of the output layer may be a fully connected layer. The hidden-layer neural layers 20 may include fully connected layers.

 The trained neural network model is constructed by training the neural network calculation unit 2 in advance with a predetermined training data set. For example, a training data set is applied to the neural network calculation unit 2 and training is performed by error back-propagation.

 Next, the neural calculation device 1 trains the zero-output identification unit 3 (step S2). For example, as shown in FIG. 8, when the activity predictors 30 are binarized neural calculators 31, the control unit 4 trains the zero-output identification unit 3 by applying, to each binarized neural calculator 31 of the zero-output identification unit 3, the same training data set as was applied to the neural network calculation unit 2. More specifically, the control unit 4 acquires the OFM (output feature map) that is the output of the multiply-add unit 21 in each neural layer 20, and the OFM that is the output of the binarized neural calculator 31 corresponding to that neural layer 20.

 Using the OFM of the multiply-add unit 21 as the teacher signal, the control unit 4 obtains the error with respect to the OFM of the binarized neural calculator 31, and trains each binarized neural calculator 31 by error back-propagation based on this error. The control unit 4 corrects the weights of the binarized neural calculator 31 by an optimization method such as Newton's method or steepest descent, repeating the training until the value of the loss function as the error falls to a predetermined level or below. The error is, for example, the square of the difference between the output of the binarized neural calculator 31 and the sum computed by the multiply-add unit 21. By applying the training data set, the weighting coefficients of the binarized neural calculators 31 are corrected, forming binarized neural calculators 31 that imitate the multiply-add units 21.
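As a minimal illustration of the training criterion described above (the flattened OFM representation is an assumption; the actual embodiment back-propagates this error to update the 1-bit weights), the squared-error loss could be computed as:

```python
def predictor_loss(predictor_ofm, teacher_ofm):
    """Squared error between the binarized predictor's OFM and the OFM of
    the multiply-add unit 21, which serves as the teacher signal."""
    return sum((p - t) ** 2 for p, t in zip(predictor_ofm, teacher_ofm))
```

Training continues, with the optimizer adjusting the predictor's weighting coefficients, until this loss falls to the predetermined level or below.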

 Here, the binarized neural calculator 31 is realized, for example, by the binarized neural electronic circuit NN. The zero-output identification unit 3 may also be realized by a plurality of core electronic circuits Core.

 The weighting coefficients for the binarized neural calculators 31 are stored in the memory 5 and are corrected each time the training progresses. The control unit 4 carries out the training while sequentially rewriting the weighting coefficients in the memory 5. The corrected weighting coefficients are transferred from the memory 5 to the memory access control unit MCnt of the zero-output identification unit 3 and set in the memory cell unit MC.

 When performing a convolution operation, the binarized neural electronic circuit NN realizes the convolution by controlling the input data in the input memory array unit MAi and the weighting coefficients sequentially output from the memory cell unit MC to the process element units Pe so that together they form the convolution operation. For an RGB image, the input data has p = 3. When the input image has k × k pixels, the input data i1, i2, ..., ik, ..., ik² are input sequentially. The value of q is the number of neurons in the next layer.

 The trained activity predictor 30 or binarized neural calculator 31 is an example of a binarized neural network calculator imitating the calculation of the neural network calculation means. The trained activity predictor 30 or binarized neural calculator 31 is also an example of a binarized neural calculator imitating the calculation in the neural network calculation means up to the point before the activation function is applied (the calculation of the multiply-add unit 21). The control unit 4 functions as an example of a learning means that trains the binarized neural calculator based on the difference between the output of the zero-output identification means and the sum computed by the neural network calculation means.

 Next, the neural calculation device 1 executes the neural network calculation using the zero-output identification unit 3 (step S3). The control unit 4 reads the trained weighting coefficients for the neural network calculation unit 2 from the memory 5 and loads them into the neural network calculation unit 2. The control unit 4 reads the trained weighting coefficients for the zero-output identification unit 3 from the memory 5 and loads them into the zero-output identification unit 3. In the case of the binarized neural electronic circuit NN, the trained weighting coefficients for the zero-output identification unit 3 are set in the memory cell units MC corresponding to the respective binarized neural calculators 31.

 Next, as shown in FIG. 9, the calculation of the binarized neural computer 31 corresponding to the first neural layer 20 is started. The input data are sequentially input to the first binarized neural computer 31 as an input-data pattern, and the computation in the binarized neural computer 31 is performed on the most significant bit of each input (for example, 0 indicating a minus sign and 1 a plus sign). As the calculation result of the binarized neural computer 31, the output data O1, O2, ... of the addition-activation unit Act are transmitted from the first binarized neural computer 31 to the control unit 4.
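A minimal software sketch of this binarized prediction step is given below. The function names are hypothetical, and the hardware XNOR-popcount datapath is modeled arithmetically with ±1 values; only the sign-bit convention (0 = minus, 1 = plus) is taken from the paragraph above.

```python
# Sketch of most-significant-bit binarization and the binary
# multiply-accumulate of the predictor. Names are hypothetical;
# the circuit computes the same quantity with XNOR and popcount.

def sign_bit(x):
    """Sign bit of a signed value: 0 indicates minus, 1 indicates plus."""
    return 1 if x >= 0 else 0

def binary_neuron(inputs, weights):
    """Binarized neuron: +1/-1 dot product, then threshold at zero.

    Returns 1 if the full-precision neuron is predicted to be active
    (non-zero ReLU output), 0 if it is predicted to output zero.
    """
    acc = 0
    for x, w in zip(inputs, weights):
        xb = 1 if sign_bit(x) == 1 else -1   # MSB mapped to +1/-1
        wb = 1 if w >= 0 else -1             # binarized weight
        acc += xb * wb
    return 1 if acc > 0 else 0

inputs = [13, -7, 2, -1]          # example signed activations
weights = [0.5, -0.2, 0.1, 0.9]
o1 = binary_neuron(inputs, weights)   # 0 or 1, sent to the control unit
```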

 If the output data O1 is "0", the control unit 4 controls the first multiplication/addition unit 21 so as to omit the calculation of the multiplication adder 21a of the neuron calculator 20a corresponding to the output data O1. If the output data O1 is "1", the control unit 4 controls the first multiplication/addition unit 21 so that the multiplication adder 21a of the neuron calculator 20a corresponding to the output data O1 starts its calculation with the same input data. The calculation result of the multiplication/addition unit 21 is output after the rectified linear function unit 25 is applied.

 As described above, the activity predictor 30 or the binarized neural computer 31 functions as an example of the zero output specifying means that performs, according to the input data, the calculation for identifying the neuron calculators whose output becomes zero.

 Next, if the output data O2 is "0", the control unit 4 controls the first multiplication/addition unit 21 so as to omit the calculation of the multiplication adder 21a of the neuron calculator 20a corresponding to the output data O2. If the output data O2 is "1", the control unit 4 controls the first multiplication/addition unit 21 so that the multiplication adder 21a of the neuron calculator 20a corresponding to the output data O2 starts its calculation with the same input data. The control unit 4 controls the first multiplication/addition unit 21 in this way for each output data. For example, as shown in FIG. 4, in the first layer L1, only the multiplication adders 21a whose outputs are predicted not to be zero perform their calculation.
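The per-neuron skip control described above can be sketched as follows; this is an illustrative software model under assumed names, not the control logic of the circuit itself.

```python
# Hypothetical sketch of the control flow: run only the neurons the
# predictor flagged as active; the skipped neurons' outputs are pinned
# to zero, which is exactly what their ReLU output would have been.

def layer_forward(predictions, neurons, inputs):
    """predictions: list of 0/1 flags from the binarized predictor.
    neurons: per-neuron weight vectors.
    """
    outputs = []
    for predicted_active, weights in zip(predictions, neurons):
        if predicted_active:
            pre = sum(x * w for x, w in zip(inputs, weights))  # multiply-add
            outputs.append(max(0.0, pre))                      # ReLU
        else:
            outputs.append(0.0)   # multiply-add omitted entirely
    return outputs
```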

 The outputs from the first multiplication/addition unit 21 and the rectified linear function unit 25 are sequentially input to the second binarized neural computer 31 as an input-data pattern, and the computation in the binarized neural computer 31 is performed on the most significant bit of each input (for example, 0 indicating 0, and 1 indicating the maximum value representable with that bit width).
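Since the ReLU outputs are non-negative, the most-significant-bit convention here differs from the signed case: the bit crudely quantizes each activation to either 0 or full scale. A sketch under an assumed 8-bit unsigned representation:

```python
# Hypothetical sketch: MSB of a non-negative 8-bit activation.
# Per the convention above, 0 stands for 0 and 1 stands for the maximum
# value representable with that bit width. The 8-bit width is an assumption.

def msb_unsigned(x, bits=8):
    """Most significant bit of a non-negative integer activation."""
    return (int(x) >> (bits - 1)) & 1
```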

 The outputs from the first multiplication/addition unit 21 and the rectified linear function unit 25 may also be stored as intermediate data in the cache of the neural network calculation unit 2 or in the memory 5.

 The output data of the second binarized neural computer 31 are sequentially output, and depending on the value of each output ("0" or "1"), the control unit 4 controls the second multiplication/addition unit 21 so as to omit or perform the calculation of the corresponding multiplication adder 21a of the neuron calculator 20a. For example, as shown in FIG. 4, in the second layer L2, only the multiplication adders 21a whose outputs are predicted not to be zero perform their calculation.

 The outputs from the second multiplication/addition unit 21 and the rectified linear function unit 25 are sequentially input to the third binarized neural computer 31 as input data, and the computation in the binarized neural computer 31 is performed on the most significant bit of each input (for example, 0 indicating 0, and 1 indicating the maximum value representable with that bit width).

 The calculation proceeds in this way up to the last neural layer 20, and the neural network calculation unit 2 outputs the output data.

 As shown in FIG. 9, by omitting the calculation of the multiplication adders 21a of the neuron calculators 20a whose outputs are predicted to be zero, the calculation becomes faster than with a conventional neural network calculation unit. Whereas a conventional neural network calculation unit takes calculation time t0 for a given layer, the multiplication/addition unit 21 of the neural network calculation unit 2 computes in time t2 (< t0) and the binarized neural computer 31 of the zero output specifying unit 3 computes in time t1 (< t0); moreover, by slightly staggering the binarized neural computer 31 and the multiplication/addition unit 21 so that calculation starts from the neuron calculators 20a that have become ready — that is, by processing in parallel — the layer can be computed in time t3 (< t0).

 The control unit 4 functions as an example of a control means that controls the calculation of the neural network calculation means so that the calculation is started first from the neuron calculators whose outputs have been identified as non-zero by the zero output specifying means corresponding to the neural layer.

 As shown in FIG. 10, a time interval may be provided between the calculation of the multiplication/addition unit 21 in the neural network calculation unit 2 and that of the multiplication/addition unit 21 of the next layer. The calculation of the multiplication/addition unit 21 of the next layer may be started after the calculation of the binarized neural computer 31 of the zero output specifying unit 3 has finished. The calculations of the multiplication/addition unit 21 and the binarized neural computer 31 of the same layer may also overlap in time.

 The control unit 4 functions as an example of a control means that controls the calculation of the neural network calculation means so as to reduce power consumption by providing a time interval between the calculation in one neural layer and the calculation in the next neural layer.

 As described above, the control unit 4 is an example of a control means that controls the calculation of the neural network calculation means based on the output of the zero output specifying means, which receives the 1-bit data obtained by converting the input data. The binarized neural computer 31 corresponding to a neural layer 20 is an example of a zero output specifying means that, according to the output data of the preceding neural layer, performs the calculation for identifying the neuron calculators belonging to that neural layer whose output becomes zero. The activity predictor 30 or the binarized neural computer 31 also functions as an example of a zero output specifying means that performs, from the pattern of the input data, the calculation for identifying the neuron calculators for which the addition result is zero or less.

 Next, the simulation results will be described with reference to FIGS. 11 and 12. The simulation was performed on a model with 8 convolutional layers and 2 fully connected layers — 11 layers in total including the input layer.

 As shown in FIG. 11, the activity prediction accuracy of each binarized neural computer 31 tended to decrease in the deeper layers before training (broken line), but after training it improved from the third layer (conv2_1) onward. The horizontal axis shows each layer, and the vertical axis shows the activity prediction accuracy of that layer. Taking as ground truth the calculation performed by the neural network calculation unit 2 without the zero output specifying unit 3, the activity prediction accuracy of each layer is the number of neuron calculators 20a for which the zero prediction was correct divided by the number of neuron calculators 20a predicted to be zero.

 As shown in FIG. 12, the computation reduction rate likewise improved from the third layer (conv2_1) onward. The computation reduction rate of each layer is the number of neuron calculators 20a of that layer predicted to be zero by the binarized neural computer 31 divided by the total number of neuron calculators 20a of that layer.
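The two per-layer metrics defined above can be written out directly; the function names below are illustrative.

```python
# Per-layer metrics as defined in the text. Inputs are parallel lists of
# 0/1 flags over the neurons of one layer. Function names are hypothetical.

def activity_prediction_accuracy(predicted_zero, truly_zero):
    """Correctly-predicted-zero count / predicted-zero count."""
    correct = sum(1 for p, t in zip(predicted_zero, truly_zero) if p and t)
    return correct / sum(predicted_zero)   # assumes at least one zero prediction

def computation_reduction_rate(predicted_zero):
    """Predicted-zero count / total neuron count of the layer."""
    return sum(predicted_zero) / len(predicted_zero)
```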

 In light of these simulation results, the zero output specifying unit 3 may omit the activity predictor 30 corresponding to the first layer L1 and the activity predictor 30 corresponding to the second layer L2.

 As described above, according to the present embodiment, in the neural network calculation of the neural calculation device 1 — which is composed of a plurality of neuron calculators 20a that each add the products of the input data i and the weighting coefficients w and apply an activation function to the sum to output the output data o — the zero output specifying unit 3 performs the calculation for identifying the neuron calculators 20a whose output becomes zero for the given input data i, and the control unit 4, based on this calculation result, controls the neural network calculation of the neural network calculation unit 2 so as to omit the calculation of those neuron calculators 20a. Since the neuron calculators 20a whose output becomes zero can thus be identified according to the pattern of the input data, the corresponding neuron calculations can be omitted dynamically.

 When the zero output specifying unit 3 is a binarized neural computer 31 that imitates the calculation of the neural network calculation unit 2, and the control unit 4 controls the calculation of the neural network calculation unit 2 based on the output of the zero output specifying unit 3, which receives the most significant bits of the input data, the binarized computation is faster than that of the neural network calculation means; the neuron calculators 20a whose output becomes zero can therefore be identified before the neural network calculation unit 2 performs its calculation.

 When the zero output specifying unit 3 is a binarized neural computer 31 that imitates the calculation of the neural network calculation unit 2 up to (but not including) the application of the activation function, the calculation for applying the activation function is omitted, so the neuron calculators 20a whose output becomes zero can be identified even faster.

 When the binarized neural computer 31 is trained based on the difference between the output of the zero output specifying unit 3 and the addition result in the neural network calculation unit 2, the training improves the prediction accuracy of the zero output specifying unit 3.

 When the neural network calculation unit 2 is composed of a plurality of neural layers 20, with a plurality of activity predictors 30 corresponding to the respective neural layers 20, and the activity predictor 30 corresponding to a neural layer 20 performs, according to the output data of the preceding neural layer 20, the calculation for identifying the neuron calculators 20a of that layer whose output becomes zero, the per-layer activity predictors 30 can identify those neuron calculators 20a more accurately.

 When the control unit 4 controls the calculation of the neural network calculation unit 2 so that the calculation is started first from the neuron calculators 20a identified as having non-zero output by the zero output specifying unit 3 corresponding to the neural layer 20, the calculation of the neural calculation device 1 can be performed even faster.

 When the control unit 4 controls the calculation of the neural network calculation unit 2 so as to reduce power consumption by providing a time interval between the calculation in one neural layer 20 and the calculation in the next neural layer 20, the neural network calculation unit 2 — the largest power consumer — can be economized, saving the energy consumption of the neural calculation device 1 as a whole.

 When the activation function is a rectified linear function, the output is zero whenever the value to which the rectified linear function is applied is zero or less, which makes it easy to omit the calculation of the neural network calculation unit 2.
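The property this paragraph relies on can be stated in two lines of code: because ReLU(z) = 0 exactly when z ≤ 0, predicting only the sign of the pre-activation suffices to predict a zero output. The function names below are illustrative.

```python
def relu(z):
    """Rectified linear function: zero for all non-positive inputs."""
    return z if z > 0 else 0.0

def output_is_zero(preactivation):
    """ReLU(z) == 0 if and only if z <= 0, so the sign alone decides."""
    return preactivation <= 0
```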

 When the zero output specifying unit 3 performs, from the pattern of the input data, the calculation for identifying the neuron calculators 20a for which the addition result is zero or less, the neuron calculators 20a whose output becomes zero become easier to identify.

 (4. Modifications)
 Next, modifications of the zero output specifying unit will be described with reference to the figures.
 FIG. 13 is a block diagram of the case where the zero output specifying unit is a support vector machine. FIG. 14 is a schematic diagram explaining the decision function of the support vector machine used as the zero output specifying unit. FIG. 15 is a block diagram showing a modification of the neural network calculation unit and the zero output specifying unit.

 As shown in FIG. 13, the activity predictor 30 may be a support vector machine classifier 32. The neural calculation device 1 has a neural network calculation unit 2 and a zero output specifying unit 3A. The zero output specifying unit 3A has a plurality of support vector machine classifiers 32 instead of the binarized neural computers 31. A support vector machine classifier 32 is provided for the neural layer 20 of each layer.

 In step S2, when the activity predictors 30 are support vector machine classifiers 32, the control unit 4 trains the zero output specifying unit 3A by applying, to each support vector machine classifier 32 of the zero output specifying unit 3A, the same training data set that was applied to the neural network calculation unit 2. More specifically, the control unit 4 acquires the OFM output by the multiplication/addition unit 21 of a neural layer 20 and the OFM output by the support vector machine classifier 32 corresponding to that neural layer 20.

 As shown in FIG. 14, the control unit 4 takes the OFM of the multiplication/addition unit 21 as the ground-truth label t, computes the difference from the OFM of the support vector machine classifier 32, and, based on the loss function derived from this difference, trains each support vector machine classifier 32 with an optimization method such as the constrained Newton method or the steepest descent method. The control unit 4 repeats the training until the loss function falls below a predetermined value. Applying the training data set modifies the weighting coefficients of the support vector machine classifier 32, forming a support vector machine classifier 32 that imitates the multiplication/addition unit 21. As shown in FIG. 14, the decision function of the support vector machine is y = w^(sum) · i + w0^(sum).
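Evaluating the linear decision function above and thresholding it at zero can be sketched as follows; the function name is hypothetical, and the trained weight vector w and bias w0 are assumed to be given.

```python
# Sketch of the linear SVM decision function y = w . i + w0 used to
# predict neuron activity: y > 0 predicts an active (non-zero) neuron,
# y <= 0 predicts a zero output. Names are illustrative.

def svm_predict_active(x, w, w0):
    """x: input vector, w: trained weights, w0: trained bias."""
    y = sum(wi * xi for wi, xi in zip(w, x)) + w0
    return 1 if y > 0 else 0
```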

 When the training of the support vector machine classifiers 32 is completed, the neural calculation device 1 executes the neural network calculation as in step S3, using the support vector machine classifiers 32 instead of the binarized neural computers 31. The input to the support vector machine classifier 32 may be either the input data without reduction to the most significant bit, or the input data reduced to the most significant bit as with the binarized neural computer 31.

 Next, as shown in FIG. 15, the present invention is applicable not only to convolutional layers but also to fully connected layers. That is, when the neural layer 20 is a fully connected layer, the activity predictor 30 may be a fully connected binarized neural computer 33. The fully connected multiplication/addition unit 22 of the neural layer 20 of each layer corresponds to the fully connected binarized neural computer 33 in the zero output specifying unit 3B.

 In the fully connected case, the binarized neural electronic circuit NN is controlled so that the input data in the input memory array unit MAi and the weighting coefficients sequentially output from the memory cell unit MC to the process element unit Pe implement the fully connected operation, thereby realizing that operation. For example, as shown in FIG. 6, in the case of a p × q fully connected neural network, the circuit has a process element column in which p process element units Pe are arranged (the input parallelism), q such process element columns PC1, PC2, ..., PCq (the output parallelism), and q memory cell units MC (the output parallelism). The control unit Cnt controls the neural electronic circuit NN so that the process element columns PC1, PC2, ..., PCq and the q memory cell units MC are used. For networks larger than p × q, core electronic circuits Core may be connected in series or in parallel.

 The binarized neural computer 31 that performs the convolution operation and the fully connected binarized neural computer 33 have the same configuration; in operation they differ only in the weights and the like supplied from the memory cell unit MC, so the operation of the binarized neural computer 33 is basically the same as that of the binarized neural computer 31.

 As described above, the same effect as with the binarized neural computer 31 is obtained whether the activity predictor 30 is a support vector machine classifier 32 or a fully connected binarized neural computer 33.

 1: Neural calculation device
 2: Neural network calculation unit (neural network calculation means)
 3, 3A, 3B: Zero output specifying unit (zero output specifying means)
 4: Control unit (control means)
 20: Neural layer (neural network calculation means)
 20a: Neuron calculator
 21, 22: Multiplication/addition unit
 25: Rectified linear function unit
 30: Activity predictor (zero output specifying means)
 31, 33: Binarized neural computer (zero output specifying means)
 32: Support vector machine classifier (zero output specifying means)
 i: Input data
 w: Weighting coefficient
 o: Output data

Claims (10)

 1. A neural calculation device comprising:
 a neural network calculation means composed of a plurality of neuron calculators, each of which adds the products of input data and weighting coefficients and applies an activation function to the addition result to output output data;
 a zero output specifying means that performs, according to the input data, a calculation for identifying the neuron calculators whose output becomes zero; and
 a control means that, based on the calculation result of the zero output specifying means, controls the calculation of the neural network calculation means so as to omit the calculation of the neuron calculators whose output becomes zero.
 2. The neural calculation device according to claim 1, wherein
 the zero output specifying means is a binarized neural network computer that imitates the calculation of the neural network calculation means, and
 the control means controls the calculation of the neural network calculation means based on the output of the zero output specifying means, which receives 1-bit data obtained by converting the input data.
 3. The neural calculation device according to claim 2, wherein the zero output specifying means is a binarized neural computer that imitates the calculation of the neural network calculation means up to (but not including) the application of the activation function.
 4. The neural calculation device according to claim 3, further comprising a learning means that trains the binarized neural computer based on the difference between the output of the zero output specifying means and the addition result in the neural network calculation means.
 5. The neural calculation device according to any one of claims 1 to 4, wherein
 the neural network calculation means is composed of a plurality of neural layers,
 a plurality of the zero output specifying means are provided corresponding to the respective neural layers, and
 the zero output specifying means corresponding to a neural layer performs, according to the output data of the preceding neural layer, the calculation for identifying the neuron calculators belonging to that neural layer whose output becomes zero.
 6. The neural calculation device according to claim 5, wherein the control means controls the calculation of the neural network calculation means so that the calculation is started first from the neuron calculators whose outputs have been identified as non-zero by the zero output specifying means corresponding to the neural layer.
 7. The neural calculation device according to claim 5 or 6, wherein the control means controls the calculation of the neural network calculation means so as to reduce power consumption by providing a time interval between the calculation in one neural layer and the calculation in the next neural layer.
 8. The neural calculation device according to any one of claims 1 to 7, wherein the activation function is a rectified linear function.
 9. The neural calculation device according to claim 8, wherein the zero output specifying means performs, from the pattern of the input data, the calculation for identifying the neuron calculators for which the addition result is zero or less.
 10. A neural calculation method in a neural network calculation device composed of a plurality of neuron calculators, each of which adds the products of input data and weighting coefficients and applies an activation function to the addition result to output output data, the method comprising:
 an output specifying step in which a zero output specifying means performs, according to the input data, a calculation for identifying the neuron calculators whose output becomes zero; and
 a control step in which a control means, based on the calculation result of the zero output specifying means, controls the calculation of the neural network calculation device so as to omit the calculation of the neuron calculators whose output becomes zero.
PCT/JP2020/016684 2019-04-19 2020-04-16 Neural calculation device and neural calculation method Ceased WO2020213670A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019-080194 2019-04-19
JP2019080194A JP2022095999A (en) 2019-04-19 2019-04-19 Neural calculation device and neural calculation method

Publications (1)

Publication Number Publication Date
WO2020213670A1 true WO2020213670A1 (en) 2020-10-22

Family

ID=72837256

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/016684 Ceased WO2020213670A1 (en) 2019-04-19 2020-04-16 Neural calculation device and neural calculation method

Country Status (2)

Country Link
JP (1) JP2022095999A (en)
WO (1) WO2020213670A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07302249A (en) * 1992-08-11 1995-11-14 Hitachi Ltd Feedforward neural network learning method
JP2017182319A (en) * 2016-03-29 2017-10-05 株式会社メガチップス Machine learning device
US20180174025A1 (en) * 2016-12-16 2018-06-21 SK Hynix Inc. Apparatus and method for normalizing neural network device
JP2018129033A (en) * 2016-12-21 2018-08-16 アクシス アーベー Pruning based on a class of artificial neural networks

Also Published As

Publication number Publication date
JP2022095999A (en) 2022-06-29

Similar Documents

Publication Publication Date Title
CN107862374B (en) Neural network processing system and processing method based on assembly line
CN107818367B (en) Processing system and processing method for neural network
CN107844826B (en) Neural network processing unit and processing system comprising same
US11544539B2 (en) Hardware neural network conversion method, computing device, compiling method and neural network software and hardware collaboration system
WO2022134391A1 (en) Fusion neuron model, neural network structure and training and inference methods therefor, storage medium, and device
US11017294B2 (en) Recognition method and apparatus
CN109359730B (en) A neural network processor for fixed output paradigm Winograd convolution
CN107609641A (en) Sparse neural network framework and its implementation
CN111582451B (en) Image recognition interlayer parallel pipeline type binary convolution neural network array architecture
JPWO2019155910A1 (en) Neural electronic circuit
US12265492B2 (en) Circular buffer for input and output of tensor computations
US11954580B2 (en) Spatial tiling of compute arrays with shared control
KR102396447B1 (en) Deep learning apparatus for ANN with pipeline architecture
US12197362B2 (en) Batch matrix multiplication operations in a machine learning accelerator
CN109582911A (en) For carrying out the computing device of convolution and carrying out the calculation method of convolution
Addanki et al. Placeto: Efficient progressive device placement optimization
JP7621197B2 (en) Learning Recognition Device
US20240143525A1 (en) Transferring non-contiguous blocks of data using instruction-based direct-memory access (dma)
CN111930681A (en) Computing device and related product
WO2020213670A1 (en) Neural calculation device and neural calculation method
US20240264948A1 (en) Transpose a tensor with a single transpose buffer
KR102090109B1 (en) Learning and inference apparatus and method
US20210097391A1 (en) Network model compiler and related product
CN113222134B (en) Brain-like computing system, method and computer readable storage medium
Li et al. TAIL: Exploiting Temporal Asynchronous Execution for Efficient Spiking Neural Networks with Inter-Layer Parallelism

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 20791232; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase
Ref country code: DE

122 Ep: pct application non-entry in european phase
Ref document number: 20791232; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase
Ref country code: JP