
WO2020213670A1 - Neural calculation device and neural calculation method - Google Patents


Info

Publication number
WO2020213670A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural
calculation
output
zero
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2020/016684
Other languages
French (fr)
Japanese (ja)
Inventor
Shinya Takamaeda
Kota Ueyoshi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hokkaido University NUC
Original Assignee
Hokkaido University NUC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hokkaido University NUC filed Critical Hokkaido University NUC
Publication of WO2020213670A1
Current legal status: Ceased

Classifications

    • G — PHYSICS
    • G06 — COMPUTING OR CALCULATING; COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/04 — Architecture, e.g. interconnection topology
    • G06N3/0464 — Convolutional networks [CNN, ConvNet]
    • G06N3/0495 — Quantised networks; Sparse networks; Compressed networks
    • G06N3/08 — Learning methods
    • G06N3/082 — Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N3/09 — Supervised learning

Definitions

  • The present invention belongs to the technical field of neural calculation devices and neural calculation methods.
  • For applications such as IoT (Internet of Things) and autonomous driving, techniques are needed that process deep neural networks (DNNs) at high speed and with low power consumption (for example, Non-Patent Document 1).
  • Most of the calculation in a neural network consists of multiplying weighting coefficients by the activation values derived from the input data and accumulating the multiplication results. Therefore, techniques such as pruning (for example, Non-Patent Document 1), which reduce the number of multiplications and additions by omitting weights that do not significantly affect the final recognition result, are used.
  • Neuron pruning (for example, Non-Patent Document 2) removes entire neurons that do not significantly affect the recognition result.
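As background, these two static reduction techniques can be sketched as follows. This is an illustrative sketch, not code from the patent; the thresholds, array shapes, and magnitude-based criteria are arbitrary assumptions.

```python
import numpy as np

def prune_weights(w, threshold):
    """Weight pruning: zero out weights whose magnitude is below the
    threshold, so their multiply-accumulate steps can be skipped."""
    mask = np.abs(w) >= threshold
    return w * mask

def prune_neurons(w, threshold):
    """Neuron pruning: remove an entire output neuron (a whole row of the
    weight matrix) when its weight norm is small, i.e. when the neuron
    barely affects the recognition result."""
    norms = np.linalg.norm(w, axis=1)
    keep = norms >= threshold
    return w[keep], keep

w = np.array([[0.9, -0.01, 0.5],
              [0.02, 0.01, -0.03],   # weak neuron: small norm
              [-0.7, 0.4, 0.02]])

sparse_w = prune_weights(w, threshold=0.05)
reduced_w, kept = prune_neurons(w, threshold=0.1)
print(sparse_w)
print(kept)   # the second neuron is removed
```

Both techniques are decided once, offline; the patent's point (below) is to make the omission decision dynamically, per input.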
  • The present invention has been made in view of the above problems and requirements, and one example of its object is to provide a neural calculation device capable of dynamically omitting calculations related to neurons.
  • The invention according to claim 1 comprises: neural network calculation means composed of a plurality of neuron calculators, each of which adds the multiplication results of input data and weighting coefficients, applies an activation function to the addition result, and outputs output data; zero output specifying means for performing a calculation that identifies, according to the input data, the neuron calculators whose outputs become zero; and control means for controlling, based on the calculation result of the zero output specifying means, the calculation of the neural network calculation means so as to omit the calculation of the neuron calculators whose outputs become zero.
  • The invention according to claim 2 is the neural calculation device according to claim 1, wherein the zero output specifying means is a binarized neural network calculator that imitates the calculation of the neural network calculation means, and the control means controls the calculation of the neural network calculation means based on the output of the zero output specifying means, to which 1-bit data converted from the input data is input.
  • The invention according to claim 3 is the neural calculation device according to claim 2, wherein the zero output specifying means is a binarized neural network calculator that imitates the calculation performed by the neural network calculation means before the activation function is applied.
  • The invention according to claim 4 is the neural calculation device according to claim 3, further comprising learning means for training the binarized neural network calculator based on the difference between the output of the zero output specifying means and the addition result in the neural network calculation means.
  • In the invention according to claim 5, the neural network calculation means is composed of a plurality of neural layers, a plurality of the zero output specifying means are provided corresponding to the respective neural layers, and the zero output specifying means corresponding to a neural layer performs a calculation that identifies, according to the output data of the preceding neural layer, the neuron calculators belonging to that neural layer whose outputs become zero.
  • The invention according to claim 6 is the neural calculation device according to claim 5, wherein the control means controls the calculation of the neural network calculation means so that the calculation of the neuron calculators identified by the zero output specifying means corresponding to a neural layer as having non-zero outputs is started in advance.
  • The invention according to claim 7 is the neural calculation device according to claim 5 or 6, wherein the control means controls the calculation of the neural network calculation means so as to shorten the time interval between the calculation in one neural layer and the calculation in the neural layer of the next layer.
  • The invention according to claim 8 is the neural calculation device according to any one of claims 1 to 7, wherein the activation function is a rectified linear function (ReLU).
  • In the invention according to claim 9, the zero output specifying means performs a calculation that identifies, from the pattern of the input data, the neuron calculators whose addition result is zero or negative.
  • The invention according to claim 10 is a neural calculation method for a neural network calculation device composed of a plurality of neuron calculators that add the multiplication results of input data and weighting coefficients, apply an activation function to the addition result, and output output data. The method includes a step in which zero output specifying means performs a calculation identifying the neuron calculators whose outputs become zero according to the input data, and a control step in which control means controls, based on the calculation result of the zero output specifying means, the calculation of the neural network calculation device so as to omit the calculation of the neuron calculators whose outputs become zero.
  • According to the present invention, in a neural network calculation composed of a plurality of neuron calculators that add the multiplication results of input data and weighting coefficients, apply an activation function to the addition result, and output output data, a calculation is performed to identify the neuron calculators whose outputs become zero according to the input data, and based on this calculation result the neural network calculation is controlled so as to omit the calculation of those neuron calculators. Because the neuron calculators whose outputs become zero can be identified from the pattern of the input data, calculations related to the neurons can be dynamically omitted.
  • A schematic diagram explains the identification function of a support vector machine used as the zero output specifying unit; a block diagram shows a modification of the neural network calculation unit and the zero output specifying unit.
  • FIG. 1 is a block diagram showing an example of a neural computing device according to an embodiment.
  • FIG. 2 is a block diagram showing an outline configuration example of the neural network calculation unit and the zero output specifying unit.
  • FIG. 3 is a block diagram showing an example of a neuron calculator.
  • FIG. 4 is a schematic diagram showing an example in which the calculation of the neuron calculator is omitted.
  • The neural calculation device 1 includes: a neural network calculation unit (NN calculation unit) 2 composed of neuron calculators that add the multiplication results of the input data i and the weighting coefficients, apply an activation function to the addition results, and output the output data o; a zero output specifying unit 3 that performs a calculation identifying the neuron calculators of the NN calculation unit 2 whose outputs become zero; a control unit 4 that, based on the calculation result of the zero output specifying unit 3, controls the calculation of the NN calculation unit 2 so as to omit the calculation of the neuron calculators whose outputs become zero; and a memory 5 that stores the weighting coefficients w and intermediate calculation data.
  • The neural network calculation unit 2, composed of a plurality of neuron calculators that add the multiplication results of the input data and the weighting coefficients, apply an activation function to the addition result, and output the output data, is an example of the neural network calculation means.
  • The zero output specifying unit 3 is an example of the zero output specifying means that performs a calculation identifying the neuron calculators whose outputs become zero according to the input data.
  • The control unit 4 is an example of the control means that controls, based on the calculation result of the zero output specifying means, the calculation of the neural network calculation means so as to omit the calculation of the neuron calculators whose outputs become zero.
  • The neural calculation device 1 receives the input data i0, i1, i2, …, in (n is a natural number; the same applies hereinafter) and outputs the output data o0, o1, o2, …, om (m is a natural number).
  • The neural network calculation unit 2 is composed of multiple neural layers 20, each having a plurality of neuron calculators 20a.
  • Each neuron calculator 20a has a multiplier-adder 21a that adds the products of the input data i0, i1, i2, …, in and the corresponding weighting coefficients w0, w1, w2, …, wn, and an activation function unit 25a that applies an activation function to the addition result.
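A neuron calculator of this kind — multiplier-adder followed by an activation function unit — can be sketched in a few lines. This is purely illustrative; the patent realizes it in hardware, and ReLU is used because the embodiment later sets the activation to the rectified linear function.

```python
import numpy as np

def multiply_adder(inputs, weights):
    """Multiplier-adder 21a: sum of products of the input data i_0..i_n
    and the weighting coefficients w_0..w_n."""
    return float(np.dot(inputs, weights))

def activation(u):
    """Activation function unit 25a; the embodiment uses the rectified
    linear function (ReLU)."""
    return max(u, 0.0)

def neuron_calculator(inputs, weights):
    return activation(multiply_adder(inputs, weights))

print(neuron_calculator([1.0, 2.0, -3.0], [0.5, 0.5, 1.0]))  # -> 0.0 (sum is -1.5)
```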
  • Each layer of the neural network is schematically drawn with the output-side neurons and the input-side neurons separated between adjacent neural layers 20. The group of neuron calculators 20a in the first layer L1 is configured with respect to the neurons of the input layer L0; the group in the second layer L2, with respect to the neurons of the first layer L1; and the group in the n-th layer Ln, with respect to the neurons of the (n-1)-th layer Ln-1.
  • The zero output specifying unit 3 is composed of a plurality of activity predictors 30, one installed for each neural layer 20. Each activity predictor 30 receives the same input data as its corresponding neural layer 20; for the second and subsequent layers, the output of the preceding neural layer 20 is input to the activity predictor 30. Based on the same input data pattern as the corresponding neural layer 20, the activity predictor 30 performs a calculation imitating the pre-activation part of the calculation in each neuron calculator 20a of that layer, and outputs its calculation result to the control unit 4.
  • Based on this calculation result, the control unit 4 controls the calculation of the neural layer 20 so that the multiplications and additions of the neuron calculators 20a whose outputs are predicted to be zero or less are omitted.
  • The neural network calculation unit 2 then performs calculations only for the neuron calculators 20a that remain connected in the network, without performing the omitted calculations.
  • The activity predictor 30 may be any calculator that can compute faster than the neural layer 20; examples include a binarized neural network calculator, an integerized neural network calculator, a support vector machine (SVM), and a random forest.
  • The activity predictor 30 extracts, for example, the most significant bit of each input datum on the input side and performs the binarized neural network calculation. The most significant bit may represent the sign of the input data (e.g., 0 for a minus sign and 1 for a plus sign) or the magnitude of the data value (e.g., 0 indicates 0 and 1 indicates the maximum value representable by the bit width). The input used for the binarized neural network calculation may also be 1-bit data obtained by converting the input data in some other way, instead of the most significant bit.
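For 8-bit values, the most-significant-bit conversion could look like this. The final inversion is an assumption made here so that the result follows the text's sign convention (0 = minus, 1 = plus); the raw two's-complement MSB is 1 for negative values.

```python
def msb_binarize(value, bits=8):
    """Extract the most significant bit of a `bits`-wide value and map it
    to the text's convention: returns 1 for a plus sign, 0 for a minus
    sign. (In two's complement the raw MSB is 1 for negatives, hence the
    inversion below.)"""
    msb = (value >> (bits - 1)) & 1
    return 1 - msb

print(msb_binarize(0b01100100))  # +100 in 8-bit two's complement -> 1 (plus)
print(msb_binarize(0b10011100))  # -100 in 8-bit two's complement -> 0 (minus)
```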
  • The calculations of the neural network calculation unit 2 and the zero output specifying unit 3 may be performed by dedicated electronic circuit hardware, by a von Neumann-type CPU, or by an AI chip such as a GPU; a combination of CPU and GPU, or a neuromorphic chip, may also be used.
  • The binarized neural network calculation in the zero output specifying unit 3 may be performed, for example, by a neural electronic circuit that implements the binarized multiplication as the exclusive NOR (XNOR) of a 1-bit weighting coefficient and 1-bit input data. The calculation accumulates the binarized multiplication results according to 1-bit connection presence/absence information between neurons, applies an activation function to the addition result, and outputs 1-bit output data.
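The XNOR-based multiply-accumulate can be sketched as follows: with 0 encoding −1 and 1 encoding +1, XNOR of two bits equals the product of the corresponding ±1 values. The sign-activation threshold of 0 is an illustrative assumption.

```python
def xnor(a, b):
    """Binarized multiplication: with 0 encoding -1 and 1 encoding +1,
    XNOR(a, b) encodes the product of the two +/-1 values."""
    return 1 - (a ^ b)

def binarized_neuron(input_bits, weight_bits, connected):
    """Accumulate XNOR products only where the 1-bit connection
    presence/absence information is 1, apply a sign activation, and emit
    1-bit output data."""
    total = sum(xnor(i, w)
                for i, w, c in zip(input_bits, weight_bits, connected) if c)
    n = sum(connected)
    signed_sum = 2 * total - n      # (#ones) - (#zeros) in +/-1 arithmetic
    return 1 if signed_sum > 0 else 0

print(binarized_neuron([1, 0, 1, 1], [1, 0, 1, 0], [1, 1, 1, 1]))  # -> 1
```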
  • FIG. 5 is a block diagram showing an outline configuration example of a binarized neural network system.
  • FIG. 6 is a block diagram showing an example of the neural electronic circuit of FIG.
  • The binarized neural network system NNS includes a plurality of core electronic circuits Core, each capable of realizing various types of neural networks as electronic circuits, and a system bus connecting the core electronic circuits Core. Each core electronic circuit Core has a binarized neural electronic circuit NN that can realize various types of neural networks as an electronic circuit, a memory access control unit MCnt that sets the weighting coefficients of the binarized neural electronic circuit NN, and a control unit Cnt that controls the neural electronic circuit NN and the memory access control unit MCnt.
  • Examples of realizable neural networks include fully connected neural networks in which the neurons of adjacent layers are fully connected to each other, neural networks that perform convolution calculations, neural networks expanded in the width of a neuron layer, and neural networks expanded in the number of layers.
  • The binarized neural electronic circuit NN has: an input memory array unit MAi that sequentially supplies the input data I1, …, Ip (p is a natural number; the same applies hereinafter) in parallel; a memory cell unit MC that sequentially supplies the weighting coefficient data in parallel; a plurality of process element units Pe that multiply the supplied input data I1, …, Ip by the weighting coefficients and output the multiplication results; addition activation units Act that, for each output, add the multiplication results from the process element units Pe and apply an activation function to the addition result; and an output memory array unit MAo that receives the 1-bit output data O1, …, Oq (q is a natural number) from the addition activation units Act.
  • The memory access control unit MCnt is, for example, a direct memory access (DMA) controller.
  • the memory access control unit MCnt sets the input data to be sequentially supplied to each process element unit Pe in the input memory array unit MAi according to the control of the control unit Cnt. Further, the memory access control unit MCnt sets in advance a weighting coefficient and a predetermined value indicating the presence / absence of connection between neurons in each memory cell unit MC according to the control of the control unit Cnt. Further, the memory access control unit MCnt takes out the output data output from the addition activation unit Act from the output memory array unit MAo under the control of the control unit Cnt.
  • the control unit Cnt has a CPU (Central Processing Unit) and the like.
  • The control unit Cnt manages timing, such as the synchronization of the elements of the binarized neural electronic circuit NN, and synchronizes calculation and data transfer.
  • the control unit Cnt controls switching of the selector element described later in the binarized neural electronic circuit NN.
  • The control unit Cnt controls the memory access control unit MCnt so that data output from another core electronic circuit Core is prepared for the input memory array unit MAi and supplied to it as input data.
  • the control unit Cnt controls the memory access control unit MCnt to transfer the output data acquired from the output memory array unit MAo to another core electronic circuit Core.
  • A host controller (for example, the control unit 4) may control the neural network system NNS and the control unit Cnt of each core electronic circuit Core. The host controller may also control the binarized neural electronic circuit NN and the memory access control unit MCnt instead of the control unit Cnt.
  • the host controller may be an external computer.
  • the bias memory array unit MAb stores bias data provided to each addition activation unit Act in advance.
  • the binarized neural electronic circuit NN realizes, for example, a two-layer neural network having p inputs and q outputs.
  • the memory cell unit MC has a memory cell 10 that stores a weighting coefficient.
  • the memory cell 10 stores a preset 1-bit weighting coefficient of “1” or “0” based on the brain function realized by the neural network to be constructed.
  • The memory cell unit MC may also have other memory cells (not shown) that store connection presence/absence information between neurons, preset based on the brain function. The no-connection information is, for example, a 1-bit predetermined value meaning NC (Not Connected); "1" or "0" is assigned as the predetermined value.
  • The memory cells 10 are lined up to form memory cell columns, and the memory cells 10 whose outputs go to one process element unit Pe are grouped to form a memory cell block CB. The memory cells 10 of a memory cell block CB correspond to the input data input in parallel. The memory cell unit MC preferably has at least p memory cell blocks CB, matching the parallelism of the input data I1, …, Ip input in parallel from the input memory array unit MAi, and the number of memory cells 10 per block is preferably equal to or greater than the number of cycles of the serial input data input bit by bit from the input memory array unit MAi. The memory cell unit MC sequentially outputs a 1-bit weighting coefficient from each memory cell block CB to the corresponding process element unit Pe, in step with the serial input data input bit by bit.
  • Each process element unit Pe receives the weighting coefficient from its memory cell block CB and the input data from the input memory array unit MAi. The memory cell block CB may alternately output a 1-bit weighting coefficient and 1-bit connection presence/absence information to the process element unit Pe, or each memory cell 10 may have an independent connection to the process element unit Pe so that they are output separately.
  • The memory cell unit MC is arranged in the binarized neural electronic circuit NN with an output parallelism of q, corresponding to the output data O1, …, Oq output in parallel to the output memory array unit MAo. The process element units Pe, p per column of parallel input data, form process element columns (for example, process element columns PC1 to PCq), with q columns corresponding to the output data output in parallel. The process element units Pe are thus laid out as a two-dimensional arithmetic unit array of p rows × q columns in the binarized neural electronic circuit NN.
  • Input data I1 is commonly input to the process element units Pe at positions (1,1), (1,2), …, (1,q); input data I2, to the units at (2,1), (2,2), …, (2,q); and input data Ip, to the units at (p,1), (p,2), …, (p,q).
  • Each process element unit Pe computes the exclusive NOR (XNOR) of the 1-bit weighting coefficient output from the corresponding memory cell 10 and the 1-bit input data, and outputs it as the multiplication result.
  • When no-connection information (for example, the predetermined value meaning "NC") is output from the connection presence/absence memory cell, the corresponding multiplication result is not added in the addition activation unit Act. The multiplication result and the connection presence/absence information may be output alternately as a pair, or the process element unit Pe may have a connection to the addition activation unit Act independent of the multiplication result so that the multiplication result and the connection presence/absence information are output separately.
  • The process element columns PC1, …, PCq output to the addition activation unit Act either the multiplication results from each process element unit Pe or partial sums obtained by adding some of the multiplication results.
  • The addition activation unit Act is arranged for each of the output data O1, …, Oq output in parallel. Based on the connection presence/absence information, it adds the multiplication results sequentially output from its process element column, applies an activation function to the addition result, and outputs the 1-bit output data to the output memory array unit MAo. When connection presence/absence information is not used, the addition activation unit Act simply adds the multiplication results sequentially output from the process element column, applies an activation function to the addition result, and outputs the 1-bit output data to the output memory array unit MAo.
  • For the addition, the addition activation unit Act uses, in units of one cycle of the input data, the value obtained by subtracting the number of times "0" is produced as a multiplication result from the number of times "1" is produced, and compares it with a predetermined value.
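This counting trick is the usual popcount formulation of binarized accumulation: with N 1-bit products, (#ones − #zeros) = 2·popcount − N. A sketch, with the comparison threshold as an illustrative assumption:

```python
def accumulate_by_popcount(mult_results, predetermined_value=0):
    """Add 1-bit multiplication results as the text describes: subtract
    the number of '0' results from the number of '1' results in one cycle
    of input data, then compare with a predetermined value (the
    activation)."""
    n = len(mult_results)
    ones = sum(mult_results)            # popcount
    value = ones - (n - ones)           # equals 2*ones - n
    return value, int(value > predetermined_value)

value, out_bit = accumulate_by_popcount([1, 1, 0, 1, 0])
print(value, out_bit)   # 3 ones - 2 zeros -> 1, output bit 1
```

The identity means a single popcount circuit plus a comparator replaces a full adder tree, which is why this formulation suits hardware.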
  • The bit-serial input is thus parallelized: each row of process element units Pe is shared with respect to one input datum, and each process element column, which is a column of process element units Pe, independently outputs its output data.
  • The binarized neural electronic circuit NN may also have logarithmic circuits so that it can handle multi-bit data in addition to binarized data. For example, it may have a circuit that computes the multiplication of the input data and the weighting coefficient by adding the logarithm of the input data and the logarithm of the weighting coefficient and then applying the inverse transformation.
  • Besides the binarized neural electronic circuit NN, the zero output specifying unit 3 may use any machine learning method that ultimately returns 0/1.
  • FIG. 7 is a flowchart showing an operation example of the neural computing device.
  • FIG. 8 is a block diagram showing an example of the neural network calculation unit and the zero output specifying unit.
  • FIGS. 9 and 10 are schematic views showing an example of the operation timing of the neural network calculation unit and the zero output specifying unit.
  • FIG. 11 is a schematic diagram showing an example of the result of the activity prediction accuracy by the neural computing device.
  • FIG. 12 is a schematic diagram showing an example of the result of the calculation reduction rate by the neural computing device.
  • First, the neural calculation device 1 acquires a trained neural network model (step S1). Specifically, the control unit 4 of the neural calculation device 1 acquires from the memory 5 the values of the weighting coefficients needed to construct the convolutional neural network, outputs them to the neural network calculation unit 2, and sets the trained neural network model. The control unit 4 also sets the activation function to the rectified linear function.
  • In the neural network calculation unit 2, when a neural layer 20 is a convolution layer, a multiplication/addition unit 21 that performs the convolution operation and a rectified linear function unit 25 are set. The neural layer 20 of the output layer may be a fully connected layer, and the neural layers 20 of the hidden layers may include fully connected layers.
  • The trained neural network model is constructed by pre-training the neural network calculation unit 2 with a predetermined training data set; for example, a training data set is applied to the neural network calculation unit 2 and training is performed by the error backpropagation method.
  • Next, the neural calculation device 1 trains the zero output specifying unit 3 (step S2). For example, as shown in FIG. 8, when the activity predictors 30 are binarized neural network calculators 31, the control unit 4 trains the zero output specifying unit 3 by applying to each binarized neural network calculator 31 the same training data set applied to the neural network calculation unit 2. More specifically, the control unit 4 obtains the OFM (Output Feature Map) that is the output of the multiplication/addition unit 21 in each neural layer 20 and the OFM that is the output of the binarized neural network calculator 31 corresponding to that neural layer 20.
  • The control unit 4 uses the OFM of the multiplication/addition unit 21 as a teacher signal, obtains the error with respect to the OFM of the binarized neural network calculator 31, and trains each binarized neural network calculator 31 by the error backpropagation method based on this error. The control unit 4 corrects the weights of the binarized neural network calculator 31 by an optimization method such as Newton's method or the steepest descent method, repeating the training until the value of the loss function falls to or below a predetermined value. The error is, for example, the square of the difference between the output of the binarized neural network calculator 31 and the addition result in the multiplication/addition unit 21. The weighting coefficients of the binarized neural network calculator 31 are modified in this way, forming a binarized neural network calculator 31 that imitates the multiplication/addition unit 21.
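The training loop of step S2 can be sketched as follows: the pre-activation output of a (hypothetical, randomly generated) trained layer serves as the teacher signal, the predictor sees only the 1-bit sign pattern of the input, and steepest descent minimizes the squared error. Everything here — shapes, learning rate, iteration count, the real-valued surrogate weights — is an illustrative assumption, not the patent's training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical trained layer standing in for the multiplication/addition unit 21
w_true = rng.normal(size=(4, 6))
X = rng.normal(size=(200, 6))
teacher = X @ w_true.T               # pre-activation OFM used as the teacher signal

# the predictor sees only the 1-bit (sign) version of the input
Xb = np.sign(X)
w_pred = np.zeros((4, 6))

lr = 0.01
for _ in range(500):                 # steepest descent on the squared error
    pred = Xb @ w_pred.T
    grad = 2.0 / len(X) * (pred - teacher).T @ Xb
    w_pred -= lr * grad

# the predictor should now agree with the true layer on the *sign*
# of the pre-activation sum most of the time
agree = np.mean((Xb @ w_pred.T > 0) == (teacher > 0))
print(round(agree, 2))
```

Only the sign of the prediction matters downstream, since it decides whether the ReLU output will be zero; the squared error against the full pre-activation value is simply a convenient training target.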
  • The binarized neural network calculator 31 is realized by, for example, the binarized neural electronic circuit NN. The zero output specifying unit 3 may be realized by a plurality of core electronic circuits Core.
  • The weighting coefficients for the binarized neural network calculator 31 are stored in the memory 5 and corrected as learning progresses; the control unit 4 learns by sequentially rewriting the weighting coefficients in the memory 5. The corrected weighting coefficients are transferred from the memory 5 to the memory access control unit MCnt of the zero output specifying unit 3 and set in the memory cell unit MC.
  • The convolution operation is realized under the control of the control unit Cnt, using the input data in the input memory array unit MAi and the weighting coefficients sequentially output from the memory cell unit MC to the process element units Pe. For example, when the input image is k × k pixels, the input data i1, i2, …, ik, …, ik² are input sequentially, and the value of q is the number of neurons in the next layer.
  • The learned activity predictor 30 or binarized neural network calculator 31 is an example of the binarized neural network calculator that imitates the calculation of the neural network calculation means, and in particular of one that imitates the calculation performed before the activation function is applied (the calculation of the multiplication/addition unit 21). The control unit 4 functions as an example of the learning means that trains the binarized neural network calculator based on the difference between the output of the zero output specifying means and the addition result in the neural network calculation means.
  • The neural calculation device 1 executes the neural network calculation using the zero output specifying unit 3 (step S3).
  • The control unit 4 reads the learned weighting coefficients for the neural net calculation unit 2 from the memory 5 and loads them into the neural net calculation unit 2.
  • The control unit 4 also reads the learned weighting coefficients for the zero output specifying unit 3 from the memory 5 and loads them into the zero output specifying unit 3.
  • The learned weighting coefficients for the zero output specifying unit 3 are set in the memory cell units MC corresponding to the respective binarized neural network calculators 31.
  • The calculation of the binarized neural network calculator 31 corresponding to the first neural layer 20 is started.
  • The input data are sequentially input to the first binarized neural network calculator 31 as a pattern of input data, each value being binarized by the most significant bit of the input data (for example, 0 indicating a minus sign and 1 a plus sign), and the calculation in the binarized neural network calculator 31 is performed.
  • The output data O1, O2, … of the addition activation unit Act are transmitted from the first binarized neural network calculator 31 to the control unit 4.
  • If the output data O1 is "0", the control unit 4 controls the first multiplication/addition unit 21 so as to omit the calculation of the multiply-adder 21a of the neuron calculator 20a corresponding to the output data O1. If the output data O1 is "1", the control unit 4 controls the first multiplication/addition unit 21 so that the calculation of the multiply-adder 21a of the neuron calculator 20a corresponding to the output data O1 is started with the same input data. The calculation result of the multiplication/addition unit 21 is output after the rectified linear function unit 25 is applied.
  • In this way, the activity predictor 30, that is, the binarized neural network calculator 31, functions as an example of the zero output specifying means that performs the calculation for specifying the neuron calculators whose output becomes zero according to the input data.
  • Likewise, if the output data O2 is "0", the control unit 4 controls the first multiplication/addition unit 21 so as to omit the calculation of the multiply-adder 21a of the neuron calculator 20a corresponding to the output data O2. If the output data O2 is "1", the control unit 4 controls the first multiplication/addition unit 21 so that the calculation of the multiply-adder 21a of the neuron calculator 20a corresponding to the output data O2 is started with the same input data. The control unit 4 controls the first multiplication/addition unit 21 in this way for each output data. For example, as shown in FIG. 4, in the first layer L1, only the multiply-adders 21a whose outputs are not expected to be zero perform their calculations.
  • The outputs from the first multiplication/addition unit 21 and the rectified linear function unit 25 are sequentially input to the second binarized neural network calculator 31 as a pattern of input data, each value being binarized by its most significant bit (for example, 0 indicating 0 and 1 indicating the maximum value expressible with the number of bits), and the calculation in the second binarized neural network calculator 31 is performed.
  • The outputs from the first multiplication/addition unit 21 and the rectified linear function unit 25 may be stored as intermediate data in a cache of the neural net calculation unit 2 or in the memory 5.
  • The output data of the second binarized neural network calculator 31 are sequentially output, and, according to the value of each output ("0" or "1"), the control unit 4 controls the second multiplication/addition unit 21 so that the calculation of the multiply-adder 21a of the corresponding neuron calculator 20a is either omitted or performed. For example, as shown in FIG. 4, in the second layer L2, only the multiply-adders 21a whose outputs are not expected to be zero perform their calculations.
  • The outputs from the second multiplication/addition unit 21 and the rectified linear function unit 25 are sequentially input to the third binarized neural network calculator 31 as input data, each value being binarized by its most significant bit (for example, 0 indicating 0 and 1 indicating the maximum value expressible with the number of bits), and the calculation in the third binarized neural network calculator 31 is performed.
  • The calculation proceeds in this way up to the last neural layer 20, and the neural net calculation unit 2 outputs the output data.
  • The calculation can thus be performed faster than in a conventional neural net calculation unit.
  • Whereas a conventional neural net calculation unit requires a calculation time t0 in a certain layer, the multiplication/addition unit 21 of the neural net calculation unit 2 calculates in a calculation time t2 (< t0) and the binarized neural network calculator 31 of the zero output specifying unit 3 calculates in a calculation time t1 (< t0); by slightly staggering the binarized neural network calculator 31 and the multiplication/addition unit 21 so that calculation starts from the neuron calculators 20a that are ready to be calculated, the whole calculation can be performed in a calculation time t3 (< t0).
  • In this case, the control unit 4 functions as an example of the control means that controls the calculation of the neural net calculation means so that calculation is started in advance from the neuron calculators whose outputs are specified to be non-zero by the zero output specifying means corresponding to the neural layer.
  • A time interval may also be provided between the calculation of a multiplication/addition unit 21 in the neural net calculation unit 2 and that of the multiplication/addition unit 21 of the next layer.
  • For example, after the calculation of the binarized neural network calculator 31 of the zero output specifying unit 3 is completed, the calculation of the multiplication/addition unit 21 of the next layer may be started. Further, there may be times when the multiplication/addition unit 21 of one layer and the binarized neural network calculator 31 operate simultaneously.
  • In this case, the control unit 4 functions as an example of the control means that controls the calculation of the neural net calculation means so as to reduce power consumption by providing a time interval between the calculation in one neural layer and the calculation in the neural layer of the next layer.
  • The control unit 4 is an example of the control means that controls the calculation of the neural net calculation means based on the output of the zero output specifying means to which 1-bit data obtained by conversion from the input data is input.
  • The activity predictor 30, that is, the binarized neural network calculator 31, functions as an example of the zero output specifying means that performs a calculation for specifying, from the pattern of the input data, the neuron calculators for which the addition result is zero or less.
  • The simulation was performed with an 11-layer model comprising 8 convolutional layers, 2 fully connected layers, and the input layer.
  • Before learning (broken line), the activity prediction accuracy of each binarized neural network calculator 31 tended to decrease in deeper layers, but after learning (dashed line), the activity prediction accuracy improved from the third layer (conv2_1) onward.
  • The horizontal axis shows each layer, and the vertical axis shows the activity prediction accuracy of each layer.
  • The activity prediction accuracy of each layer is the value obtained by taking, as the denominator, the number of neuron calculators 20a predicted to be zero, and, as the numerator, the number of neuron calculators 20a for which the prediction was correct, where the correct answer is the result obtained when the calculation is performed by the neural net calculation unit 2 without using the zero output specifying unit 3.
  • The calculation reduction rate of each layer is the value obtained by taking, as the denominator, the number of neuron calculators 20a of each layer, and, as the numerator, the number of neuron calculators 20a of each layer predicted to be zero by the binarized neural network calculator 31.
  • The zero output specifying unit 3 may omit the activity predictor 30 corresponding to the first layer L1 and the activity predictor 30 corresponding to the second layer L2.
  • As described above, the neural calculation device 1 includes the neural net calculation unit 2 composed of a plurality of neuron calculators 20a that add the multiplication results of the input data i and the weighting coefficients w, apply an activation function to the addition result, and output the output data o; the zero output specifying unit 3 performs a calculation for specifying the neuron calculators 20a whose output becomes zero according to the input data i; and the control unit 4 controls the neural network calculation of the neural net calculation unit 2 so as to omit the calculation of the neuron calculators 20a whose output becomes zero. Since the neuron calculators 20a whose output becomes zero can be specified according to the pattern of the input data, the calculation related to the neurons can be dynamically omitted.
  • When the zero output specifying unit 3 is a binarized neural network calculator 31 that imitates the calculation of the neural net calculation unit 2, and the control unit 4 controls the neural network calculation based on the output of the zero output specifying unit 3 to which the most significant bits of the input data have been input, the binarized calculation can be performed faster than that of the neural net calculation means; therefore, the neuron calculators 20a whose output becomes zero can be identified before the neural net calculation unit 2 performs its calculation.
  • When the zero output specifying unit 3 is a binarized neural network calculator 31 that imitates the calculation in the neural net calculation unit 2 before the activation function is applied, the calculation of applying the activation function is omitted, so the neuron calculators 20a whose output becomes zero can be identified even faster.
  • When the binarized neural network calculator 31 is trained based on the difference between the output of the zero output specifying unit 3 and the addition result in the neural net calculation unit 2, the prediction accuracy of the zero output specifying unit 3 is improved by the training.
  • When the neural net calculation unit 2 is composed of a plurality of neural layers 20, there are a plurality of activity predictors 30 corresponding to the respective neural layers 20, and the activity predictor 30 corresponding to a neural layer 20 specifies, according to the output data of the neural layer 20 of the previous layer, the neuron calculators 20a belonging to that neural layer 20 whose output becomes zero.
  • When the control unit 4 controls the calculation of the neural net calculation unit 2 so that calculation is started in advance from the neuron calculators 20a whose outputs are specified to be non-zero by the zero output specifying unit 3 corresponding to the neural layer 20, the calculation of the neural calculation device 1 can be realized even faster.
  • When the control unit 4 controls the calculation of the neural net calculation unit 2 so as to reduce power consumption by providing a time interval between the calculation in one neural layer 20 and the calculation in the neural layer 20 of the next layer, the power consumption of the neural net calculation unit 2, which consumes much power, can be reduced, and energy can be saved in the neural calculation device 1 as a whole.
  • When the activation function is a rectified linear function, the output becomes zero whenever the value before applying the rectified linear function is zero or less, so it is easy to omit the calculation in the neural net calculation unit 2.
  • When the zero output specifying unit 3 performs a calculation for specifying, from the pattern of the input data, the neuron calculators 20a for which the addition result is zero or less, it becomes easier to specify the neuron calculators 20a whose output becomes zero.
  • FIG. 13 is a block diagram in the case where the zero output specifying unit is a support vector machine.
  • FIG. 14 is a schematic diagram explaining the discriminant function and the like of the support vector machine used as the zero output specifying unit.
  • FIG. 15 is a block diagram showing a modification of the neural net calculation unit and the zero output specifying unit.
  • The activity predictor 30 may be a support vector machine classifier 32.
  • In this case, the neural calculation device 1 has the neural net calculation unit 2 and a zero output specifying unit 3A.
  • The zero output specifying unit 3A has a plurality of support vector machine classifiers 32 instead of the binarized neural network calculators 31.
  • A support vector machine classifier 32 is provided corresponding to the neural layer 20 of each layer.
  • In step S2, when the activity predictor 30 is the support vector machine classifier 32, the control unit 4 trains the zero output specifying unit 3A by applying, to each support vector machine classifier 32 of the zero output specifying unit 3A, the same learning data set that is applied to the neural net calculation unit 2. More specifically, the control unit 4 acquires the OFM that is the output of the multiplication/addition unit 21 in a neural layer 20 and the OFM that is the output of the support vector machine classifier 32 corresponding to that neural layer 20.
  • The neural calculation device 1 then executes the neural network calculation using the support vector machine classifiers 32 instead of the binarized neural network calculators 31, as in step S3.
  • The input to the support vector machine classifier 32 may be input data that has not been reduced to its most significant bit, or input data binarized by the most significant bit as in the binarized neural network calculator 31.
  • The present invention can also be applied to a fully connected layer instead of a convolutional layer. That is, when the neural layer 20 is a fully connected layer, the activity predictor 30 may be a fully connected binarized neural network calculator 33.
  • The fully connected multiplication/addition unit 22 of the neural layer 20 of each layer and the fully connected binarized neural network calculator 33 in the zero output specifying unit 3B correspond to each other.
  • The input data in the input memory array unit MAi and the weighting coefficients sequentially output from the memory cell unit MC to the process element units Pe are used for the fully connected calculation, and control is performed so that the fully connected operation is realized.
  • The circuit has process element columns PC1, PC2, …, PCq, in each of which process element units Pe of the input parallelism p are arranged, with the output parallelism q such columns, and memory cell units MC of the output parallelism q.
  • The control unit Cnt controls which of the process element columns PC1, PC2, …, PCq and the q memory cell units MC of the neural electronic circuit NN are used. Further, when a parallelism greater than p or q is required, the core electronic circuits Core may be connected in series or in parallel.
  • The binarized neural network calculator 31 that performs the convolution operation and the fully connected binarized neural network calculator 33 have the same configuration and, in performing the operation, differ only in the weighting coefficients and the like supplied from the memory cell unit MC; therefore, the operation of the binarized neural network calculator 31 and that of the binarized neural network calculator 33 are basically the same.
  • When the activity predictor 30 is the support vector machine classifier 32 or the binarized neural network calculator 33, the same effects as those of the binarized neural network calculator 31 can be obtained.
  • 1: Neural calculation device 2: Neural net calculation unit (neural net calculation means) 3, 3A, 3B: Zero output specifying unit (zero output specifying means) 4: Control unit (control means) 20: Neural layer (neural net calculation means) 20a: Neuron calculator 21, 22: Multiplication/addition unit 25: Rectified linear function unit 30: Activity predictor (zero output specifying means) 31, 33: Binarized neural network calculator (zero output specifying means) 32: Support vector machine classifier (zero output specifying means) i: Input data w: Weighting coefficient o: Output data
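The layer-by-layer flow described above can be sketched in simplified software form. This is an illustrative sketch only: the array shapes, the sign-based most-significant-bit binarization rule, and all function names are assumptions made for the example, not the circuits of the embodiment.

```python
import numpy as np

def binarize_by_msb(x):
    """Reduce each value to 1-bit data by its most significant (sign) bit:
    +1 for non-negative values, -1 otherwise (an assumed mapping)."""
    return np.where(x >= 0, 1, -1)

def predict_nonzero(x, w_bin):
    """Binarized predictor (stand-in for calculator 31): a cheap 1-bit
    multiply-accumulate whose sign predicts whether the full-precision
    pre-activation exceeds zero."""
    return (binarize_by_msb(x) @ w_bin) > 0

def relu_layer_with_skip(x, w, active):
    """Multiplication/addition plus rectified linear function, where the
    multiply-adds of neurons predicted to output zero are omitted."""
    y = np.zeros(w.shape[1])
    for j in np.nonzero(active)[0]:      # only neurons flagged non-zero
        y[j] = max(0.0, float(x @ w[:, j]))
    return y

rng = np.random.default_rng(0)
sizes = [9, 6, 4]                        # input and two neural layers
weights = [rng.standard_normal((a, b)) for a, b in zip(sizes, sizes[1:])]
bin_weights = [binarize_by_msb(w) for w in weights]  # stand-in for learned 1-bit weights

x = rng.standard_normal(sizes[0])
for w, wb in zip(weights, bin_weights):
    active = predict_nonzero(x, wb)      # zero output specifying step per layer
    x = relu_layer_with_skip(x, w, active)  # skipped neurons stay zero
```

In hardware terms, `predict_nonzero` plays the role of the binarized neural network calculator 31, and the gated loop in `relu_layer_with_skip` corresponds to the control unit 4 omitting individual multiply-adders.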


Abstract

The present invention provides a neural calculation device 1 and the like with which calculations relating to neurons can be dynamically omitted. The device comprises: a neural net calculation unit 2 composed of a plurality of neuron calculators 20a that add up the results obtained by multiplying input data i by a weighting coefficient w and apply an activation function to the addition result to output output data o; a zero output specifying unit 3 for performing a calculation to specify the neuron calculators 20a whose output becomes zero depending on the input data i; and a control unit 4 for controlling the calculation of the neural net calculation unit 2 on the basis of the output of the zero output specifying unit 3 so as to omit the calculation of the neuron calculators 20a whose output becomes zero.

Description

Neural calculation device and neural calculation method

The present invention belongs to the technical field of neural calculation devices and neural calculation methods.

In recent years, for applications such as IoT (Internet of Things) and autonomous driving, computer hardware that processes deep neural networks (DNNs) at high speed and with low power consumption has been in demand. Most of the computation of a neural network consists of multiplying input data by weighting coefficients and activation values, and accumulating the multiplication results. Therefore, there exist techniques such as pruning (for example, Non-Patent Document 1), which omits weights that do not significantly affect the final recognition result in order to reduce the number of multiplications and additions, and the similar technique of neuron pruning (for example, Non-Patent Document 2), which removes neurons that do not significantly affect the recognition result.

Non-Patent Document 1: Song Han et al., "Learning both Weights and Connections for Efficient Neural Networks", arXiv:1506.02626v3 [cs.NE], October 30, 2015.
Non-Patent Document 2: Tomoya Fujii et al., "A Threshold Neuron Pruning for a Binarized Deep Neural Network on an FPGA", IEICE Transactions on Information and Systems, Vol. E101.D, No. 2, pp. 376-386, 2018 (URL: https://www.jstage.jst.go.jp/article/transinf/E101.D/2/E101.D_2017RCP0013/_article/-char/ja/).

However, in the conventional techniques, since weighting coefficients and neurons are statically removed at the time of training the neural network, it has not been possible to omit calculations for invalid weights and neurons that appear dynamically depending on the input pattern.

The present invention has been made in view of the above problems and demands, and an example of its object is to provide a neural calculation device capable of dynamically omitting calculations related to neurons.

In order to solve the above problem, the invention according to claim 1 comprises: a neural net calculation means composed of a plurality of neuron calculators that add the multiplication results of input data and weighting coefficients and apply an activation function to the addition result to output output data; a zero output specifying means for performing a calculation for specifying the neuron calculators whose output becomes zero according to the input data; and a control means for controlling, based on the calculation result of the zero output specifying means, the calculation of the neural net calculation means so as to omit the calculation of the neuron calculators whose output becomes zero.

The invention according to claim 2 is the neural calculation device according to claim 1, wherein the zero output specifying means is a binarized neural network calculator that imitates the calculation of the neural net calculation means, and the control means controls the calculation of the neural net calculation means based on the output of the zero output specifying means to which 1-bit data obtained by conversion from the input data is input.

The invention according to claim 3 is the neural calculation device according to claim 2, wherein the zero output specifying means is a binarized neural network calculator that imitates the calculation of the neural net calculation means up to the point before the activation function is applied.

The invention according to claim 4 is the neural calculation device according to claim 3, further comprising a learning means for training the binarized neural network calculator based on the difference between the output of the zero output specifying means and the addition result in the neural net calculation means.

The invention according to claim 5 is the neural calculation device according to any one of claims 1 to 4, wherein the neural net calculation means is composed of a plurality of neural layers, there are a plurality of the zero output specifying means corresponding to the respective neural layers, and the zero output specifying means corresponding to a neural layer performs a calculation for specifying, according to the output data of the neural layer of the previous layer, the neuron calculators belonging to that neural layer whose output becomes zero.

The invention according to claim 6 is the neural calculation device according to claim 5, wherein the control means controls the calculation of the neural net calculation means so that calculation is started in advance from the neuron calculators specified by the zero output specifying means corresponding to the neural layer as having a non-zero output.

The invention according to claim 7 is the neural calculation device according to claim 5 or 6, wherein the control means controls the calculation of the neural net calculation means so as to reduce power consumption by providing a time interval between the calculation in a neural layer and the calculation in the neural layer of the next layer.

The invention according to claim 8 is the neural calculation device according to any one of claims 1 to 7, wherein the activation function is a rectified linear function.

The invention according to claim 9 is the neural calculation device according to claim 8, wherein the zero output specifying means performs a calculation for specifying, from the pattern of the input data, the neuron calculators for which the addition result is zero or less.

The invention according to claim 10 is a neural calculation method in a neural net calculation device composed of a plurality of neuron calculators that add the multiplication results of input data and weighting coefficients and apply an activation function to the addition result to output output data, the method including: an output specifying step in which a zero output specifying means performs a calculation for specifying the neuron calculators whose output becomes zero according to the input data; and a control step in which a control means controls, based on the calculation result of the zero output specifying means, the calculation of the neural net calculation device so as to omit the calculation of the neuron calculators whose output becomes zero.
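As a rough software illustration of the two steps of the method, the following sketch may help; the sign-based predicate used in the output specifying step is a placeholder assumed for this example, not the claimed binarized calculator, and all names and values are invented.

```python
import numpy as np

def output_specifying_step(inputs, weights):
    """Output specifying step: flag, per neuron, whether the output is
    expected to be zero (placeholder predicate: sign of a cheap estimate)."""
    return (np.sign(inputs) @ np.sign(weights)) <= 0   # True = expected zero

def control_step(inputs, weights, expected_zero):
    """Control step: omit the multiply-add of every neuron flagged as zero."""
    outputs = np.zeros(weights.shape[1])
    for j in range(weights.shape[1]):
        if not expected_zero[j]:                        # calculation omitted otherwise
            outputs[j] = max(0.0, float(inputs @ weights[:, j]))
    return outputs

inputs = np.array([1.0, -2.0, 0.5])
weights = np.array([[2.0, -1.0],
                    [1.0,  1.0],
                    [1.0, -2.0]])
flags = output_specifying_step(inputs, weights)
result = control_step(inputs, weights, flags)
```

Here the second neuron is flagged as zero and its multiply-add is never executed, while the first neuron is computed in full precision.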

According to the present invention, in a neural network calculation performed by a plurality of neuron calculators that add the multiplication results of input data and weighting coefficients and apply an activation function to the addition result to output output data, a calculation is performed to specify the neuron calculators whose output becomes zero according to the input data, and, based on this calculation result, the neural network calculation is controlled so as to omit the calculation of the neuron calculators whose output becomes zero. Since the neuron calculators whose output becomes zero can be specified according to the pattern of the input data, calculations related to neurons can be dynamically omitted.

FIG. 1 is a block diagram showing an example of a neural calculation device according to an embodiment.
FIG. 2 is a block diagram showing an outline configuration example of the neural net calculation unit and the zero output specifying unit of FIG. 1.
FIG. 3 is a block diagram showing an example of the neuron calculator of FIG. 2.
FIG. 4 is a schematic diagram showing an example of omitting the calculation of neuron calculators.
FIG. 5 is a block diagram showing an outline configuration example of a binarized neural network system.
FIG. 6 is a block diagram showing an example of the neural electronic circuit of FIG. 5.
FIG. 7 is a flowchart showing an operation example of the neural calculation device.
FIG. 8 is a block diagram showing an example of the neural net calculation unit and the zero output specifying unit.
FIG. 9 is a schematic diagram showing an example of the operation timing of the neural net calculation unit and the zero output specifying unit.
FIG. 10 is a schematic diagram showing another example of the operation timing of the neural net calculation unit and the zero output specifying unit.
FIG. 11 is a schematic diagram showing an example of activity prediction accuracy results obtained by the neural calculation device.
FIG. 12 is a schematic diagram showing an example of calculation reduction rate results obtained by the neural calculation device.
FIG. 13 is a block diagram showing a modification of the zero output specifying unit.
FIG. 14 is a schematic diagram explaining the discriminant function and the like of the support vector machine used as the zero output specifying unit.
FIG. 15 is a block diagram showing a modification of the neural net calculation unit and the zero output specifying unit.
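The two metrics plotted in FIG. 11 and FIG. 12 can be computed as in the following sketch. The flag arrays here are made up for illustration and are not the simulation data of the embodiment.

```python
import numpy as np

# Illustrative flags for one layer: True = neuron output is (predicted) zero.
predicted_zero = np.array([True, True, False, True, False, False])  # from the binarized predictor
actually_zero  = np.array([True, False, False, True, False, True])  # from the full calculation

# Activity prediction accuracy: denominator = neurons predicted zero,
# numerator = predictions that were correct.
accuracy = np.sum(predicted_zero & actually_zero) / np.sum(predicted_zero)

# Calculation reduction rate: denominator = all neurons in the layer,
# numerator = neurons predicted zero (their multiply-adds are skipped).
reduction = np.sum(predicted_zero) / predicted_zero.size
```

With these flags, 2 of the 3 zero predictions are correct (accuracy 2/3), and 3 of the 6 multiply-adds are skipped (reduction rate 0.5).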

Hereinafter, embodiments of the present invention will be described with reference to the drawings. The embodiments described below are embodiments in which the present invention is applied to a neural calculation device.

(1. Configuration and Functional Overview of the Neural Calculation Device)
First, the configuration and general functions of a neural calculation device according to an embodiment of the present invention will be described with reference to FIG. 1.

FIG. 1 is a block diagram showing an example of the neural calculation device according to the embodiment. FIG. 2 is a block diagram showing an outline configuration example of the neural net calculation unit and the zero output specifying unit. FIG. 3 is a block diagram showing an example of the neuron calculator. FIG. 4 is a schematic diagram showing an example of omitting the calculation of neuron calculators.

As shown in FIG. 1, the neural calculation device 1 includes: a neural net calculation unit (NN calculation unit) 2 composed of a plurality of neuron calculators that add the multiplication results of the input data i and the weighting coefficients, apply an activation function to the addition result, and output the output data o; a zero output specifying unit 3 that performs a calculation for specifying the neuron calculators of the neural net calculation unit 2 whose output becomes zero; a control unit 4 that, based on the output of the zero output specifying unit 3, controls the calculation of the neural net calculation unit 2 so as to omit the calculation of the neuron calculators whose output becomes zero; and a memory 5 that stores the weighting coefficients w, intermediate data of the calculation, and the like.

 The neural network calculation unit 2 is an example of a neural network calculation means composed of a plurality of neuron calculators that sum the products of input data and weighting coefficients and apply an activation function to the sum to output output data. The zero-output identification unit 3 is an example of a zero-output identification means that performs a calculation for identifying, according to the input data, the neuron calculators whose outputs become zero. The control unit 4 is an example of a control means that, based on the calculation result of the zero-output identification means, controls the calculation of the neural network calculation means so as to omit the calculations of the neuron calculators whose outputs become zero.

 The neural calculation device 1 outputs output data o0, o1, o2, ..., om (m is a natural number; the same applies hereinafter) for input data i0, i1, i2, ..., in (n is a natural number; the same applies hereinafter).

 As shown in FIG. 2, the neural network calculation unit 2 is composed of multiple neural layers 20, each having a plurality of neuron calculators 20a.

 As shown in FIG. 3, the neuron calculator 20a has a multiply-adder 21a that sums the products of the input data i0, i1, i2, ..., in and the corresponding weighting coefficients w0, w1, w2, ..., wn, and an activation function unit 25a that applies an activation function to the sum. When the activation function of the activation function unit 25a is a rectified linear function (ReLU), zero is output if the output of the multiply-adder 21a is zero or less, and the value x is output if the output of the multiply-adder 21a is a positive value x.
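As a concrete illustration, the neuron calculation described above can be sketched in a few lines of Python (this sketch is for explanation only and is not part of the embodiment; the function name is illustrative):

```python
def neuron_output(inputs, weights):
    """Sum of products (multiply-adder 21a) followed by a rectified
    linear activation (activation function unit 25a)."""
    s = sum(i * w for i, w in zip(inputs, weights))
    return max(0.0, s)  # ReLU: zero when s <= 0, s itself when s > 0
```

A negative weighted sum therefore always yields an output of exactly zero, which is what makes the zero-output prediction described below possible.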

 In FIG. 2, each layer of the neural network is schematically drawn with the output-side neurons and the input-side neurons separated between adjacent neural layers 20. Within the first neural layer 20, the group of neuron calculators 20a of the first layer L1 is configured for the neurons of the input layer L0. Within the second neural layer 20, the group of neuron calculators 20a of the second layer L2 is configured for the neurons of the first layer L1. Within the n-th neural layer 20, the group of neuron calculators 20a of the n-th layer Ln is configured for the neurons of the (n-1)-th layer Ln-1.

 As shown in FIG. 2, the zero-output identification unit 3 is composed of a plurality of activity predictors 30. Each activity predictor 30 is provided corresponding to one of the neural layers 20.

 The same input data as that of the corresponding neural layer 20 is input to each activity predictor 30. For the activity predictors 30 corresponding to the second and subsequent neural layers 20, the output of the preceding neural layer 20 is input.

 Based on the same input data pattern as the corresponding neural layer 20, the activity predictor 30 performs a calculation that imitates, for each neuron calculator 20a of the neural layer 20, the calculation up to the point before the activation function is applied. The calculation result of the activity predictor 30 is output to the control unit 4.

 For the neuron calculators 20a whose outputs are predicted by this calculation result to be zero or less, the control unit 4 controls the calculation of the neural layer 20 so as to omit the multiplications and additions involving those neuron calculators 20a.
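The effect of this control can be sketched as follows; here `predicted_active` stands in for the prediction flags produced by the activity predictor 30, and the function and parameter names are assumptions made for illustration:

```python
def layer_forward(inputs, weight_rows, predicted_active):
    """Compute a ReLU layer, skipping every neuron calculator that the
    activity predictor flags as producing a zero output."""
    outputs = []
    for weights, active in zip(weight_rows, predicted_active):
        if not active:
            outputs.append(0.0)  # multiply-add omitted; output known to be zero
            continue
        s = sum(i * w for i, w in zip(inputs, weights))
        outputs.append(max(0.0, s))  # ReLU applied only to computed neurons
    return outputs
```

The saving comes from the `continue` branch: every skipped neuron avoids its full multiply-accumulate over the input vector.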

 For example, as shown in FIG. 4, the neural network calculation unit 2 performs calculations only for the neuron calculators 20a connected in the network of neuron calculators 20a whose calculations are not omitted.

 The activity predictor 30 may be any calculator that can compute faster than the neural layer 20. Examples of the activity predictor 30 include a binarized neural calculator, an integerized neural calculator, a support vector machine (SVM), and a random forest. When the activity predictor 30 is a binarized neural calculator, the activity predictor 30 takes, on its input side, for example the most significant bit of the input data, and the binarized neural network calculation is performed. In the first layer, the most significant bit represents the sign of the input data (for example, 0 indicates a minus sign and 1 indicates a plus sign); in the second and subsequent layers, the most significant bit represents the magnitude of the data value (for example, 0 indicates zero, and 1 indicates the maximum value representable with that bit width). The input used for the binarized neural network calculation may be, besides the most significant bit of the input data, any 1-bit data obtained by converting the input data.
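A minimal sketch of the 1-bit conversion described above; the fixed-point encoding and the 8-bit width are assumptions made for illustration:

```python
def sign_bit(value):
    """First layer: 1 for a plus sign (value >= 0), 0 for a minus sign."""
    return 1 if value >= 0 else 0

def msb(value, bits=8):
    """Second and subsequent layers: most significant bit of an unsigned
    fixed-point value, used as a coarse magnitude indicator
    (0 = near zero, 1 = near the maximum representable value)."""
    return (value >> (bits - 1)) & 1
```

Either conversion reduces each input to a single bit, which is what lets the predictor run much faster than the full-precision neural layer 20.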

 The calculations of the neural network calculation unit 2 and the zero-output identification unit 3 may be performed by dedicated electronic circuit hardware, by a von Neumann-type CPU, by an AI chip such as a GPU, by a combination of a CPU and a GPU, or by a neuromorphic chip.

 The binarized neural network calculation in the zero-output identification unit 3 may be performed, for example, by a neural electronic circuit that performs binarized multiplication as the exclusive NOR (XNOR) of a 1-bit weighting coefficient and 1-bit input data. The binarized neural network calculation accumulates the binarized multiplication results based on 1-bit connection presence/absence information between neurons. The binarized neural network calculation then applies an activation function to the sum and outputs 1-bit output data.
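In Python terms, this binarized multiply-accumulate could look like the following sketch; the way the connection presence/absence information gates the accumulation is one possible reading of the description, and the names are illustrative:

```python
def xnor(w_bit, x_bit):
    """1-bit binarized 'multiplication': XNOR of weight bit and input bit."""
    return 1 - (w_bit ^ x_bit)

def binarized_accumulate(w_bits, x_bits, connected):
    """Accumulate XNOR products only where a neuron-to-neuron connection exists."""
    return sum(xnor(w, x) for w, x, c in zip(w_bits, x_bits, connected) if c)
```

With both operands encoded as single bits, the whole multiplication reduces to one logic gate per connection, which is why this circuit can act as a fast activity predictor.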

 (2. Configuration and Functions of the Binarized Neural Network System in the Zero-Output Identification Unit 3)
 Next, the configuration and general functions of the binarized neural network system in the case where the zero-output identification unit 3 is a binarized neural network will be described with reference to the drawings.

 FIG. 5 is a block diagram showing a schematic configuration example of the binarized neural network system. FIG. 6 is a block diagram showing an example of the neural electronic circuit of FIG. 5.

 As shown in FIG. 5, the binarized neural network system NNS includes a plurality of core electronic circuits Core, each capable of realizing various types of neural networks as electronic circuits, and a system bus bus connecting the core electronic circuits Core to one another.

 Each core electronic circuit Core has a binarized neural electronic circuit NN capable of realizing various types of neural networks as an electronic circuit, a memory access control unit MCnt that sets the weighting coefficients and the like of the binarized neural electronic circuit NN, and a control unit Cnt that controls the binarized neural electronic circuit NN and the memory access control unit MCnt. Examples of the various types of neural networks include a fully connected type in which the neurons of adjacent layers are fully connected to one another, a neural network that performs convolution operations, a neural network with intra-layer expansion of the neuron layers, and a neural network with an expanded number of layers.

 The binarized neural electronic circuit NN has: an input memory array unit MAi that sequentially supplies input data I1, ..., Ip (p is a natural number; the same applies hereinafter) in parallel; a memory cell unit MC that sequentially supplies weighting coefficient data in parallel; a plurality of process element units Pe that realize a multiplication function multiplying the supplied input data I1, ..., Ip by the weighting coefficients and output the multiplication results; addition activation units Act that sum the multiplication results from the process element units Pe for the parallel input data and apply an activation function to the sum; an output memory array unit MAo that sequentially stores the 1-bit output data O1, ..., Oq (q is a natural number; the same applies hereinafter) from the respective addition activation units Act; and a bias memory array unit MAb that sequentially provides bias values to the respective addition activation units Act.

 The memory access control unit MCnt is, for example, a direct memory access controller. Under the control of the control unit Cnt, the memory access control unit MCnt sets, in the input memory array unit MAi, the input data to be sequentially supplied to each process element unit Pe. Also under the control of the control unit Cnt, the memory access control unit MCnt sets in advance, in each memory cell unit MC, the weighting coefficients and the predetermined values indicating the presence or absence of connections between neurons. Further, under the control of the control unit Cnt, the memory access control unit MCnt retrieves the output data output from the addition activation units Act out of the output memory array unit MAo.

 The control unit Cnt has a CPU (central processing unit) or the like. The control unit Cnt manages timing such as the synchronization of each element of the binarized neural electronic circuit NN, and synchronizes calculations and data transfers. The control unit Cnt also performs switching control of the selector elements, described later, in the binarized neural electronic circuit NN.

 The control unit Cnt controls the memory access control unit MCnt so as to arrange the data output from another core electronic circuit Core for the input memory array unit MAi and supply it to the input memory array unit MAi as input data. The control unit Cnt also controls the memory access control unit MCnt so as to transfer the output data acquired from the output memory array unit MAo to another core electronic circuit Core.

 Note that a host controller (for example, the control unit 4) may control the neural network system NNS and the control unit Cnt of each core electronic circuit Core. The host controller may also control the binarized neural electronic circuit NN and the memory access control unit MCnt in place of the control unit Cnt. The host controller may be an external computer.

 The bias memory array unit MAb stores in advance the bias data to be provided to each addition activation unit Act.

 As shown in FIG. 6, the binarized neural electronic circuit NN realizes, for example, a two-layer neural network with p inputs and q outputs.

 The memory cell unit MC has memory cells 10 that store the weighting coefficients. Each memory cell 10 stores a 1-bit weighting coefficient of "1" or "0" that is preset based on the brain function to be realized by the neural network being constructed.

 The memory cell unit MC may also have separate memory cells (not shown) for connection presence/absence information, which store the connection presence/absence information between neurons preset based on the brain function. Here, the no-connection information is, for example, a 1-bit predetermined value meaning NC (Not Connected), and "1" or "0" or the like is assigned as the predetermined value.

 The memory cells 10 are arranged side by side to form columns of memory cells. The memory cells 10 that are output simultaneously to each process element unit Pe are grouped together to form a memory cell block CB. The memory cells 10 of a memory cell block CB correspond to the respective input data input in parallel.

 The memory cell unit MC preferably has at least p memory cell blocks CB, corresponding to the input parallelism p of the input data I1, ..., Ip input in parallel from the input memory array unit MAi. Within a memory cell block CB, the number of memory cells 10 is preferably equal to or greater than the number of cycles of the serial input data input bit by bit from the input memory array unit MAi.

 For each memory cell block CB, the memory cell unit MC sequentially outputs the 1-bit weighting coefficients to the process element unit Pe in correspondence with the serial input data input bit by bit. To each process element unit Pe, the weighting coefficient from its memory cell block CB and the input data from the input memory array unit MAi are input.

 The memory cell block CB may alternately and sequentially output a 1-bit weighting coefficient and 1-bit connection presence/absence information to the process element unit Pe. Alternatively, the memory cells 10 may have wiring to the process element unit Pe that is independent of each other, and may output to the process element unit Pe separately and sequentially.

 As shown in FIGS. 5 and 6, q memory cell units MC, corresponding to the output parallelism q of the output data O1, ..., Oq output in parallel to the output memory array unit MAo, are arranged in the binarized neural electronic circuit NN.

 As shown in FIG. 6, the p process element units Pe arranged for the respective parallel input data form a process element column (for example, process element column PC1) in the binarized neural electronic circuit NN. The q process element columns PC1 to PCq are arranged in q columns in the binarized neural electronic circuit NN, corresponding to the output data output in parallel. As shown in FIG. 3, the process element units Pe are arranged in the binarized neural electronic circuit NN as a two-dimensional array of arithmetic units with p rows × q columns.

 The process element units Pe at matrix positions (1,1), (1,2), ..., (1,q) are wired so that the input data I1 is input to them in common. The process element units Pe at (2,1), (2,2), ..., (2,q) are wired so that the input data I2 is input in common. The process element units Pe at (p,1), (p,2), ..., (p,q) are wired so that the input data Ip is input in common.

 The process element unit Pe calculates, as the multiplication result, the exclusive NOR (XNOR) of the 1-bit weighting coefficient output from the corresponding memory cell 10 and the 1-bit input data, and outputs it.

 When no-connection information (for example, the predetermined value meaning "NC") is output from the memory cell for connection presence/absence information, the corresponding multiplication result is not added in the addition activation unit Act. For example, the multiplication result and the connection presence/absence information may be alternately output as a pair. Alternatively, regarding the connection presence/absence information, there may be wiring from the process element unit Pe to the addition activation unit Act that is independent of the multiplication result, and the multiplication result and the connection presence/absence information may be output separately.

 Similarly, when the process element unit Pe calculates a partial sum of the multiplication results, a multiplication result is not added to the partial sum if no-connection information (for example, the predetermined value meaning "NC") is output from the memory cell for connection presence/absence information.

 The process element columns PC1, ..., PCq output, to the addition activation units Act, the multiplication results from the respective process element units Pe or partial-sum results obtained by adding some of the multiplication results.

 As shown in FIGS. 5 and 6, the addition activation units Act are arranged corresponding to the respective output data O1, ..., Oq output in parallel.

 The addition activation unit Act adds the multiplication results sequentially output from its process element column based on the connection presence/absence information, applies the activation function to the sum, and outputs 1-bit output data to the output memory array unit MAo. When the process element units Pe output partial sums of the multiplication results, the addition activation unit Act adds the partial sums sequentially output from the process element column, applies the activation function to the sum, and outputs 1-bit output data to the output memory array unit MAo.

 For each cycle unit of the input data in a process element column, the addition activation unit Act outputs "1" as the output data when the number of times "1" was calculated as a multiplication result, minus the number of times "0" was calculated, is equal to or greater than a predetermined threshold, and outputs "0" as the output data when that difference is less than the threshold.
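This thresholding rule can be written out as a short sketch (illustrative only; whether the bias value from the bias memory array unit MAb is folded into `threshold` is an assumption):

```python
def addition_activation(products, threshold):
    """Binary activation for one input cycle: output 1 when the count of
    1s among the XNOR multiplication results, minus the count of 0s,
    meets the threshold; otherwise output 0."""
    ones = sum(products)
    zeros = len(products) - ones
    return 1 if ones - zeros >= threshold else 0
```

Because `ones - zeros` equals `2 * ones - len(products)`, the circuit only needs a popcount of the 1-bit products and a single comparison.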

 As shown in FIG. 6, the bit-serial input is parallelized: each row of process element units Pe shares its input data, and each process element column, that is, each column of process element units Pe, outputs its output data independently.

 Note that the binarized neural electronic circuit NN may have logarithmization circuitry so that it can handle multi-bit data in addition to binary data. For example, the binarized neural electronic circuit NN may have a circuit that computes the product of input data and a weighting coefficient as the logarithmic sum of the logarithmized input data and the logarithmized weighting coefficient, and linearizes the result by inverse transformation. Furthermore, besides a binarized neural electronic circuit NN, the zero-output identification unit 3 may use any machine learning method that ultimately returns 0/1.

 (3. Operation Example of the Neural Calculation Device)
 Next, an operation example of the neural calculation device 1 will be described with reference to FIGS. 7 to 12. The operation will be described taking as an example the case where the neural calculation device 1 is a convolutional neural network and the zero-output identification unit 3 is a binarized neural network.

 FIG. 7 is a flowchart showing an operation example of the neural calculation device. FIG. 8 is a block diagram showing an example of the neural network calculation unit and the zero-output identification unit. FIGS. 9 and 10 are schematic diagrams showing an example of the operation timing of the neural network calculation unit and the zero-output identification unit. FIG. 11 is a schematic diagram showing an example of activity prediction accuracy results obtained by the neural calculation device. FIG. 12 is a schematic diagram showing an example of calculation reduction rate results obtained by the neural calculation device.

 As shown in FIG. 7, the neural calculation device 1 acquires a trained neural network model (step S1). Specifically, the control unit 4 of the neural calculation device 1 acquires from the memory 5 the values of the weighting coefficients necessary for constructing the convolutional neural network. The control unit 4 outputs the weighting coefficient values to the neural network calculation unit 2 and sets up the trained neural network model. The control unit 4 also sets the activation function to a rectified linear function.

 For example, as shown in FIG. 8, in the neural network calculation unit 2, when a neural layer 20 is a convolutional layer, a multiply-add unit 21 that performs the convolution operation and a rectified linear function unit 25 are set. The neural layer 20 of the output layer may be a fully connected layer. The hidden-layer neural layers 20 may include fully connected layers.

 The trained neural network model is constructed by training the neural network calculation unit 2 in advance with a predetermined training data set. For example, a training data set is applied to the neural network calculation unit 2 and training is performed by error back-propagation.

 Next, the neural calculation device 1 trains the zero-output identification unit 3 (step S2). For example, as shown in FIG. 8, when the activity predictors 30 are binarized neural calculators 31, the control unit 4 trains the zero-output identification unit 3 by applying, to each binarized neural calculator 31 of the zero-output identification unit 3, the same training data set as was applied to the neural network calculation unit 2. More specifically, the control unit 4 acquires the OFM (output feature map) that is the output of the multiply-add unit 21 in each neural layer 20, and the OFM that is the output of the binarized neural calculator 31 corresponding to that neural layer 20.

 Using the OFM of the multiply-add unit 21 as the teacher signal, the control unit 4 obtains the error with respect to the OFM of the binarized neural calculator 31, and trains each binarized neural calculator 31 by error back-propagation based on this error. The control unit 4 corrects the weights of the binarized neural calculator 31 by an optimization method such as Newton's method or steepest descent, repeating the training until the value of the loss function as the error falls to a predetermined level or below. The error is, for example, the square of the difference between the output of the binarized neural calculator 31 and the sum computed by the multiply-add unit 21. By applying the training data set, the weighting coefficients of the binarized neural calculators 31 are corrected, forming binarized neural calculators 31 that imitate the multiply-add units 21.
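As a minimal illustration of the training criterion described above (the flattened OFM representation is an assumption; the actual embodiment back-propagates this error to update the 1-bit weights), the squared-error loss could be computed as:

```python
def predictor_loss(predictor_ofm, teacher_ofm):
    """Squared error between the binarized predictor's OFM and the OFM of
    the multiply-add unit 21, which serves as the teacher signal."""
    return sum((p - t) ** 2 for p, t in zip(predictor_ofm, teacher_ofm))
```

Training continues, with the optimizer adjusting the predictor's weighting coefficients, until this loss falls to the predetermined level or below.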

 Here, the binarized neural calculator 31 is realized, for example, by the binarized neural electronic circuit NN. The zero-output identification unit 3 may also be realized by a plurality of core electronic circuits Core.

 The weighting coefficients for the binarized neural calculators 31 are stored in the memory 5 and are corrected each time the training progresses. The control unit 4 carries out the training while sequentially rewriting the weighting coefficients in the memory 5. The corrected weighting coefficients are transferred from the memory 5 to the memory access control unit MCnt of the zero-output identification unit 3 and set in the memory cell unit MC.

 When performing a convolution operation, the binarized neural electronic circuit NN realizes the convolution by controlling the input data in the input memory array unit MAi and the weighting coefficients sequentially output from the memory cell unit MC to the process element units Pe so that together they form the convolution operation. For an RGB image, the input data has p = 3. When the input image has k × k pixels, the input data i1, i2, ..., ik, ..., ik² are input sequentially. The value of q is the number of neurons in the next layer.

 The trained activity predictor 30 or binarized neural calculator 31 is an example of a binarized neural network calculator imitating the calculation of the neural network calculation means. The trained activity predictor 30 or binarized neural calculator 31 is also an example of a binarized neural calculator imitating the calculation in the neural network calculation means up to the point before the activation function is applied (the calculation of the multiply-add unit 21). The control unit 4 functions as an example of a learning means that trains the binarized neural calculator based on the difference between the output of the zero-output identification means and the sum computed by the neural network calculation means.

 Next, the neural calculation device 1 executes the neural network calculation using the zero-output identification unit 3 (step S3). The control unit 4 reads the trained weighting coefficients for the neural network calculation unit 2 from the memory 5 and loads them into the neural network calculation unit 2. The control unit 4 reads the trained weighting coefficients for the zero-output identification unit 3 from the memory 5 and loads them into the zero-output identification unit 3. In the case of the binarized neural electronic circuit NN, the trained weighting coefficients for the zero-output identification unit 3 are set in the memory cell units MC corresponding to the respective binarized neural calculators 31.

 Next, as shown in FIG. 9, the calculation of the binarized neural computer 31 corresponding to the first neural layer 20 is started. The input data are sequentially input to the first binarized neural computer 31 as an input-data pattern, and the computation in the binarized neural computer 31 is performed on the most significant bit of each input (for example, 0 indicating a minus sign and 1 a plus sign). As the calculation result of the binarized neural computer 31, the output data O1, O2, ... of the addition-activation unit Act are transmitted from the first binarized neural computer 31 to the control unit 4.
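A minimal software sketch of this binarized prediction step is given below. The function names are hypothetical, and the hardware XNOR-popcount datapath is modeled arithmetically with ±1 values; only the sign-bit convention (0 = minus, 1 = plus) is taken from the paragraph above.

```python
# Sketch of most-significant-bit binarization and the binary
# multiply-accumulate of the predictor. Names are hypothetical;
# the circuit computes the same quantity with XNOR and popcount.

def sign_bit(x):
    """Sign bit of a signed value: 0 indicates minus, 1 indicates plus."""
    return 1 if x >= 0 else 0

def binary_neuron(inputs, weights):
    """Binarized neuron: +1/-1 dot product, then threshold at zero.

    Returns 1 if the full-precision neuron is predicted to be active
    (non-zero ReLU output), 0 if it is predicted to output zero.
    """
    acc = 0
    for x, w in zip(inputs, weights):
        xb = 1 if sign_bit(x) == 1 else -1   # MSB mapped to +1/-1
        wb = 1 if w >= 0 else -1             # binarized weight
        acc += xb * wb
    return 1 if acc > 0 else 0

inputs = [13, -7, 2, -1]          # example signed activations
weights = [0.5, -0.2, 0.1, 0.9]
o1 = binary_neuron(inputs, weights)   # 0 or 1, sent to the control unit
```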

 If the output data O1 is "0", the control unit 4 controls the first multiplication/addition unit 21 so as to omit the calculation of the multiplication adder 21a of the neuron calculator 20a corresponding to the output data O1. If the output data O1 is "1", the control unit 4 controls the first multiplication/addition unit 21 so that the multiplication adder 21a of the neuron calculator 20a corresponding to the output data O1 starts its calculation with the same input data. The calculation result of the multiplication/addition unit 21 is output after the rectified linear function unit 25 is applied.

 As described above, the activity predictor 30 or the binarized neural computer 31 functions as an example of the zero output specifying means that performs, according to the input data, the calculation for identifying the neuron calculators whose output becomes zero.

 Next, if the output data O2 is "0", the control unit 4 controls the first multiplication/addition unit 21 so as to omit the calculation of the multiplication adder 21a of the neuron calculator 20a corresponding to the output data O2. If the output data O2 is "1", the control unit 4 controls the first multiplication/addition unit 21 so that the multiplication adder 21a of the neuron calculator 20a corresponding to the output data O2 starts its calculation with the same input data. The control unit 4 controls the first multiplication/addition unit 21 in this way for each output data. For example, as shown in FIG. 4, in the first layer L1, only the multiplication adders 21a whose outputs are predicted not to be zero perform their calculation.
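The per-neuron skip control described above can be sketched as follows; this is an illustrative software model under assumed names, not the control logic of the circuit itself.

```python
# Hypothetical sketch of the control flow: run only the neurons the
# predictor flagged as active; the skipped neurons' outputs are pinned
# to zero, which is exactly what their ReLU output would have been.

def layer_forward(predictions, neurons, inputs):
    """predictions: list of 0/1 flags from the binarized predictor.
    neurons: per-neuron weight vectors.
    """
    outputs = []
    for predicted_active, weights in zip(predictions, neurons):
        if predicted_active:
            pre = sum(x * w for x, w in zip(inputs, weights))  # multiply-add
            outputs.append(max(0.0, pre))                      # ReLU
        else:
            outputs.append(0.0)   # multiply-add omitted entirely
    return outputs
```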

 The outputs from the first multiplication/addition unit 21 and the rectified linear function unit 25 are sequentially input to the second binarized neural computer 31 as an input-data pattern, and the computation in the binarized neural computer 31 is performed on the most significant bit of each input (for example, 0 indicating 0, and 1 indicating the maximum value representable with that bit width).
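Since the ReLU outputs are non-negative, the most-significant-bit convention here differs from the signed case: the bit crudely quantizes each activation to either 0 or full scale. A sketch under an assumed 8-bit unsigned representation:

```python
# Hypothetical sketch: MSB of a non-negative 8-bit activation.
# Per the convention above, 0 stands for 0 and 1 stands for the maximum
# value representable with that bit width. The 8-bit width is an assumption.

def msb_unsigned(x, bits=8):
    """Most significant bit of a non-negative integer activation."""
    return (int(x) >> (bits - 1)) & 1
```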

 The outputs from the first multiplication/addition unit 21 and the rectified linear function unit 25 may also be stored as intermediate data in the cache of the neural network calculation unit 2 or in the memory 5.

 The output data of the second binarized neural computer 31 are sequentially output, and depending on the value of each output ("0" or "1"), the control unit 4 controls the second multiplication/addition unit 21 so as to omit or perform the calculation of the corresponding multiplication adder 21a of the neuron calculator 20a. For example, as shown in FIG. 4, in the second layer L2, only the multiplication adders 21a whose outputs are predicted not to be zero perform their calculation.

 The outputs from the second multiplication/addition unit 21 and the rectified linear function unit 25 are sequentially input to the third binarized neural computer 31 as input data, and the computation in the binarized neural computer 31 is performed on the most significant bit of each input (for example, 0 indicating 0, and 1 indicating the maximum value representable with that bit width).

 The calculation proceeds in this way up to the last neural layer 20, and the neural network calculation unit 2 outputs the output data.

 As shown in FIG. 9, by omitting the calculation of the multiplication adders 21a of the neuron calculators 20a whose outputs are predicted to be zero, the calculation becomes faster than with a conventional neural network calculation unit. Whereas a conventional neural network calculation unit takes calculation time t0 for a given layer, the multiplication/addition unit 21 of the neural network calculation unit 2 computes in time t2 (< t0) and the binarized neural computer 31 of the zero output specifying unit 3 computes in time t1 (< t0); moreover, by slightly staggering the binarized neural computer 31 and the multiplication/addition unit 21 so that calculation starts from the neuron calculators 20a that have become ready — that is, by processing in parallel — the layer can be computed in time t3 (< t0).

 The control unit 4 functions as an example of a control means that controls the calculation of the neural network calculation means so that the calculation is started first from the neuron calculators whose outputs have been identified as non-zero by the zero output specifying means corresponding to the neural layer.

 As shown in FIG. 10, a time interval may be provided between the calculation of the multiplication/addition unit 21 in the neural network calculation unit 2 and that of the multiplication/addition unit 21 of the next layer. The calculation of the multiplication/addition unit 21 of the next layer may be started after the calculation of the binarized neural computer 31 of the zero output specifying unit 3 has finished. The calculations of the multiplication/addition unit 21 and the binarized neural computer 31 of the same layer may also overlap in time.

 The control unit 4 functions as an example of a control means that controls the calculation of the neural network calculation means so as to reduce power consumption by providing a time interval between the calculation in one neural layer and the calculation in the next neural layer.

 As described above, the control unit 4 is an example of a control means that controls the calculation of the neural network calculation means based on the output of the zero output specifying means, which receives the 1-bit data obtained by converting the input data. The binarized neural computer 31 corresponding to a neural layer 20 is an example of a zero output specifying means that, according to the output data of the preceding neural layer, performs the calculation for identifying the neuron calculators belonging to that neural layer whose output becomes zero. The activity predictor 30 or the binarized neural computer 31 also functions as an example of a zero output specifying means that performs, from the pattern of the input data, the calculation for identifying the neuron calculators for which the addition result is zero or less.

 Next, the simulation results will be described with reference to FIGS. 11 and 12. The simulation was performed on a model with 8 convolutional layers and 2 fully connected layers — 11 layers in total including the input layer.

 As shown in FIG. 11, the activity prediction accuracy of each binarized neural computer 31 tended to decrease in the deeper layers before training (broken line), but after training it improved from the third layer (conv2_1) onward. The horizontal axis shows each layer, and the vertical axis shows the activity prediction accuracy of that layer. Taking as ground truth the calculation performed by the neural network calculation unit 2 without the zero output specifying unit 3, the activity prediction accuracy of each layer is the number of neuron calculators 20a for which the zero prediction was correct divided by the number of neuron calculators 20a predicted to be zero.

 As shown in FIG. 12, the computation reduction rate likewise improved from the third layer (conv2_1) onward. The computation reduction rate of each layer is the number of neuron calculators 20a of that layer predicted to be zero by the binarized neural computer 31 divided by the total number of neuron calculators 20a of that layer.
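The two per-layer metrics defined above can be written out directly; the function names below are illustrative.

```python
# Per-layer metrics as defined in the text. Inputs are parallel lists of
# 0/1 flags over the neurons of one layer. Function names are hypothetical.

def activity_prediction_accuracy(predicted_zero, truly_zero):
    """Correctly-predicted-zero count / predicted-zero count."""
    correct = sum(1 for p, t in zip(predicted_zero, truly_zero) if p and t)
    return correct / sum(predicted_zero)   # assumes at least one zero prediction

def computation_reduction_rate(predicted_zero):
    """Predicted-zero count / total neuron count of the layer."""
    return sum(predicted_zero) / len(predicted_zero)
```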

 In light of these simulation results, the zero output specifying unit 3 may omit the activity predictor 30 corresponding to the first layer L1 and the activity predictor 30 corresponding to the second layer L2.

 As described above, according to the present embodiment, in the neural network calculation of the neural calculation device 1 — which is composed of a plurality of neuron calculators 20a that each add the products of the input data i and the weighting coefficients w and apply an activation function to the sum to output the output data o — the zero output specifying unit 3 performs the calculation for identifying the neuron calculators 20a whose output becomes zero for the given input data i, and the control unit 4, based on this calculation result, controls the neural network calculation of the neural network calculation unit 2 so as to omit the calculation of those neuron calculators 20a. Since the neuron calculators 20a whose output becomes zero can thus be identified according to the pattern of the input data, the corresponding neuron calculations can be omitted dynamically.

 When the zero output specifying unit 3 is a binarized neural computer 31 that imitates the calculation of the neural network calculation unit 2, and the control unit 4 controls the calculation of the neural network calculation unit 2 based on the output of the zero output specifying unit 3, which receives the most significant bits of the input data, the binarized computation is faster than that of the neural network calculation means; the neuron calculators 20a whose output becomes zero can therefore be identified before the neural network calculation unit 2 performs its calculation.

 When the zero output specifying unit 3 is a binarized neural computer 31 that imitates the calculation of the neural network calculation unit 2 up to (but not including) the application of the activation function, the calculation for applying the activation function is omitted, so the neuron calculators 20a whose output becomes zero can be identified even faster.

 When the binarized neural computer 31 is trained based on the difference between the output of the zero output specifying unit 3 and the addition result in the neural network calculation unit 2, the training improves the prediction accuracy of the zero output specifying unit 3.

 When the neural network calculation unit 2 is composed of a plurality of neural layers 20, with a plurality of activity predictors 30 corresponding to the respective neural layers 20, and the activity predictor 30 corresponding to a neural layer 20 performs, according to the output data of the preceding neural layer 20, the calculation for identifying the neuron calculators 20a of that layer whose output becomes zero, the per-layer activity predictors 30 can identify those neuron calculators 20a more accurately.

 When the control unit 4 controls the calculation of the neural network calculation unit 2 so that the calculation is started first from the neuron calculators 20a identified as having non-zero output by the zero output specifying unit 3 corresponding to the neural layer 20, the calculation of the neural calculation device 1 can be performed even faster.

 When the control unit 4 controls the calculation of the neural network calculation unit 2 so as to reduce power consumption by providing a time interval between the calculation in one neural layer 20 and the calculation in the next neural layer 20, the neural network calculation unit 2 — the largest power consumer — can be economized, saving the energy consumption of the neural calculation device 1 as a whole.

 When the activation function is a rectified linear function, the output is zero whenever the value to which the rectified linear function is applied is zero or less, which makes it easy to omit the calculation of the neural network calculation unit 2.
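The property this paragraph relies on can be stated in two lines of code: because ReLU(z) = 0 exactly when z ≤ 0, predicting only the sign of the pre-activation suffices to predict a zero output. The function names below are illustrative.

```python
def relu(z):
    """Rectified linear function: zero for all non-positive inputs."""
    return z if z > 0 else 0.0

def output_is_zero(preactivation):
    """ReLU(z) == 0 if and only if z <= 0, so the sign alone decides."""
    return preactivation <= 0
```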

 When the zero output specifying unit 3 performs, from the pattern of the input data, the calculation for identifying the neuron calculators 20a for which the addition result is zero or less, the neuron calculators 20a whose output becomes zero become easier to identify.

 (4. Modifications)
 Next, modifications of the zero output specifying unit will be described with reference to the figures.
 FIG. 13 is a block diagram of the case where the zero output specifying unit is a support vector machine. FIG. 14 is a schematic diagram explaining the decision function of the support vector machine used as the zero output specifying unit. FIG. 15 is a block diagram showing a modification of the neural network calculation unit and the zero output specifying unit.

 As shown in FIG. 13, the activity predictor 30 may be a support vector machine classifier 32. The neural calculation device 1 has a neural network calculation unit 2 and a zero output specifying unit 3A. The zero output specifying unit 3A has a plurality of support vector machine classifiers 32 instead of the binarized neural computers 31. A support vector machine classifier 32 is provided for the neural layer 20 of each layer.

 In step S2, when the activity predictors 30 are support vector machine classifiers 32, the control unit 4 trains the zero output specifying unit 3A by applying, to each support vector machine classifier 32 of the zero output specifying unit 3A, the same training data set that was applied to the neural network calculation unit 2. More specifically, the control unit 4 acquires the OFM output by the multiplication/addition unit 21 of a neural layer 20 and the OFM output by the support vector machine classifier 32 corresponding to that neural layer 20.

 As shown in FIG. 14, the control unit 4 takes the OFM of the multiplication/addition unit 21 as the ground-truth label t, computes the difference from the OFM of the support vector machine classifier 32, and, based on the loss function derived from this difference, trains each support vector machine classifier 32 with an optimization method such as the constrained Newton method or the steepest descent method. The control unit 4 repeats the training until the loss function falls below a predetermined value. Applying the training data set modifies the weighting coefficients of the support vector machine classifier 32, forming a support vector machine classifier 32 that imitates the multiplication/addition unit 21. As shown in FIG. 14, the decision function of the support vector machine is y = w^(sum) · i + w0^(sum).
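Evaluating the linear decision function above and thresholding it at zero can be sketched as follows; the function name is hypothetical, and the trained weight vector w and bias w0 are assumed to be given.

```python
# Sketch of the linear SVM decision function y = w . i + w0 used to
# predict neuron activity: y > 0 predicts an active (non-zero) neuron,
# y <= 0 predicts a zero output. Names are illustrative.

def svm_predict_active(x, w, w0):
    """x: input vector, w: trained weights, w0: trained bias."""
    y = sum(wi * xi for wi, xi in zip(w, x)) + w0
    return 1 if y > 0 else 0
```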

 When the training of the support vector machine classifiers 32 is completed, the neural calculation device 1 executes the neural network calculation as in step S3, using the support vector machine classifiers 32 instead of the binarized neural computers 31. The input to the support vector machine classifier 32 may be either the input data without reduction to the most significant bit, or the input data reduced to the most significant bit as with the binarized neural computer 31.

 Next, as shown in FIG. 15, the present invention is applicable not only to convolutional layers but also to fully connected layers. That is, when the neural layer 20 is a fully connected layer, the activity predictor 30 may be a fully connected binarized neural computer 33. The fully connected multiplication/addition unit 22 of the neural layer 20 of each layer corresponds to the fully connected binarized neural computer 33 in the zero output specifying unit 3B.

 In the fully connected case, the binarized neural electronic circuit NN is controlled so that the input data in the input memory array unit MAi and the weighting coefficients sequentially output from the memory cell unit MC to the process element unit Pe implement the fully connected operation, thereby realizing that operation. For example, as shown in FIG. 6, in the case of a p × q fully connected neural network, the circuit has a process element column in which p process element units Pe are arranged (the input parallelism), q such process element columns PC1, PC2, ..., PCq (the output parallelism), and q memory cell units MC (the output parallelism). The control unit Cnt controls the neural electronic circuit NN so that the process element columns PC1, PC2, ..., PCq and the q memory cell units MC are used. For networks larger than p × q, core electronic circuits Core may be connected in series or in parallel.

 The binarized neural computer 31 that performs the convolution operation and the fully connected binarized neural computer 33 have the same configuration; in operation they differ only in the weights and the like supplied from the memory cell unit MC, so the operation of the binarized neural computer 33 is basically the same as that of the binarized neural computer 31.

 As described above, the same effect as with the binarized neural computer 31 is obtained whether the activity predictor 30 is a support vector machine classifier 32 or a fully connected binarized neural computer 33.

 1: Neural calculation device
 2: Neural network calculation unit (neural network calculation means)
 3, 3A, 3B: Zero output specifying unit (zero output specifying means)
 4: Control unit (control means)
 20: Neural layer (neural network calculation means)
 20a: Neuron calculator
 21, 22: Multiplication/addition unit
 25: Rectified linear function unit
 30: Activity predictor (zero output specifying means)
 31, 33: Binarized neural computer (zero output specifying means)
 32: Support vector machine classifier (zero output specifying means)
 i: Input data
 w: Weighting coefficient
 o: Output data

Claims (10)

 1. A neural calculation device comprising:
 a neural network calculation means composed of a plurality of neuron calculators, each of which adds the products of input data and weighting coefficients and applies an activation function to the addition result to output output data;
 a zero output specifying means that performs, according to the input data, a calculation for identifying the neuron calculators whose output becomes zero; and
 a control means that, based on the calculation result of the zero output specifying means, controls the calculation of the neural network calculation means so as to omit the calculation of the neuron calculators whose output becomes zero.
 2. The neural calculation device according to claim 1, wherein
 the zero output specifying means is a binarized neural network computer that imitates the calculation of the neural network calculation means, and
 the control means controls the calculation of the neural network calculation means based on the output of the zero output specifying means, which receives 1-bit data obtained by converting the input data.
 3. The neural calculation device according to claim 2, wherein the zero output specifying means is a binarized neural computer that imitates the calculation of the neural network calculation means up to (but not including) the application of the activation function.
 4. The neural calculation device according to claim 3, further comprising a learning means that trains the binarized neural computer based on the difference between the output of the zero output specifying means and the addition result in the neural network calculation means.
 5. The neural calculation device according to any one of claims 1 to 4, wherein
 the neural network calculation means is composed of a plurality of neural layers,
 a plurality of the zero output specifying means are provided corresponding to the respective neural layers, and
 the zero output specifying means corresponding to a neural layer performs, according to the output data of the preceding neural layer, the calculation for identifying the neuron calculators belonging to that neural layer whose output becomes zero.
 6. The neural calculation device according to claim 5, wherein the control means controls the calculation of the neural network calculation means so that the calculation is started first from the neuron calculators whose outputs have been identified as non-zero by the zero output specifying means corresponding to the neural layer.
 7. The neural calculation device according to claim 5 or 6, wherein the control means controls the calculation of the neural network calculation means so as to reduce power consumption by providing a time interval between the calculation in one neural layer and the calculation in the next neural layer.
 8. The neural calculation device according to any one of claims 1 to 7, wherein the activation function is a rectified linear function.
 9. The neural calculation device according to claim 8, wherein the zero output specifying means performs, from the pattern of the input data, the calculation for identifying the neuron calculators for which the addition result is zero or less.
 10. A neural calculation method in a neural network calculation device composed of a plurality of neuron calculators, each of which adds the products of input data and weighting coefficients and applies an activation function to the addition result to output output data, the method comprising:
 an output specifying step in which a zero output specifying means performs, according to the input data, a calculation for identifying the neuron calculators whose output becomes zero; and
 a control step in which a control means, based on the calculation result of the zero output specifying means, controls the calculation of the neural network calculation device so as to omit the calculation of the neuron calculators whose output becomes zero.
PCT/JP2020/016684 2019-04-19 2020-04-16 Neural calculation device and neural calculation method Ceased WO2020213670A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019-080194 2019-04-19
JP2019080194A JP2022095999A (en) 2019-04-19 2019-04-19 Neural calculation device and neural calculation method

Publications (1)

Publication Number Publication Date
WO2020213670A1 true WO2020213670A1 (en) 2020-10-22

Family

ID=72837256

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/016684 Ceased WO2020213670A1 (en) 2019-04-19 2020-04-16 Neural calculation device and neural calculation method

Country Status (2)

Country Link
JP (1) JP2022095999A (en)
WO (1) WO2020213670A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07302249A (en) * 1992-08-11 1995-11-14 Hitachi Ltd Feedforward neural network learning method
JP2017182319A (en) * 2016-03-29 2017-10-05 株式会社メガチップス Machine learning device
US20180174025A1 (en) * 2016-12-16 2018-06-21 SK Hynix Inc. Apparatus and method for normalizing neural network device
JP2018129033A (en) * 2016-12-21 2018-08-16 アクシス アーベー Pruning based on a class of artificial neural networks

Also Published As

Publication number Publication date
JP2022095999A (en) 2022-06-29

Similar Documents

Publication Publication Date Title
CN107862374B (en) Neural network processing system and processing method based on assembly line
CN107818367B (en) Processing system and processing method for neural network
CN107844826B (en) Neural network processing unit and processing system comprising same
US11544539B2 (en) Hardware neural network conversion method, computing device, compiling method and neural network software and hardware collaboration system
WO2022134391A1 (en) Fusion neuron model, neural network structure and training and inference methods therefor, storage medium, and device
US11017294B2 (en) Recognition method and apparatus
CN109359730B (en) A neural network processor for fixed output paradigm Winograd convolution
CN107609641A (en) Sparse neural network framework and its implementation
CN111582451B (en) Image recognition interlayer parallel pipeline type binary convolution neural network array architecture
JPWO2019155910A1 (en) Neural electronic circuit
US12265492B2 (en) Circular buffer for input and output of tensor computations
US11954580B2 (en) Spatial tiling of compute arrays with shared control
KR102396447B1 (en) Deep learning apparatus for ANN with pipeline architecture
US12197362B2 (en) Batch matrix multiplication operations in a machine learning accelerator
CN109582911A (en) For carrying out the computing device of convolution and carrying out the calculation method of convolution
Addanki et al. Placeto: Efficient progressive device placement optimization
JP7621197B2 (en) Learning Recognition Device
US20240143525A1 (en) Transferring non-contiguous blocks of data using instruction-based direct-memory access (dma)
CN111930681A (en) Computing device and related product
WO2020213670A1 (en) Neural calculation device and neural calculation method
US20240264948A1 (en) Transpose a tensor with a single transpose buffer
KR102090109B1 (en) Learning and inference apparatus and method
US20210097391A1 (en) Network model compiler and related product
CN113222134B (en) Brain-like computing system, method and computer readable storage medium
Li et al. TAIL: Exploiting Temporal Asynchronous Execution for Efficient Spiking Neural Networks with Inter-Layer Parallelism

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 20791232; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase
Ref country code: DE

122 Ep: pct application non-entry in european phase
Ref document number: 20791232; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase
Ref country code: JP