Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, exemplary embodiments according to the present invention will be described in detail below with reference to the accompanying drawings. It is to be understood that the described embodiments are merely a subset of the embodiments of the invention and not all of them, and that the invention is not limited to the example embodiments described herein. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the invention described herein without inventive step, shall fall within the scope of protection of the invention.
First, an example electronic device 100 for implementing the neural network-based computing method and apparatus of the present invention is described with reference to fig. 1.
As shown in FIG. 1, electronic device 100 includes one or more processors 102, one or more memory devices 104, an input device 106, and an output device 108, which are interconnected via a bus system 110 and/or other form of connection mechanism (not shown). It should be noted that the components and structure of the electronic device 100 shown in fig. 1 are exemplary only, and not limiting, and the electronic device may have other components and structures as desired.
The processor 102 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 100 to perform desired functions.
The storage 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory (cache). The non-volatile memory may include, for example, Read Only Memory (ROM), a hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and may be executed by the processor 102 to implement the functions of the embodiments of the invention described below and/or other desired functions. Various applications and various data, such as data used and/or generated by the applications, may also be stored in the computer-readable storage medium.
The input device 106 may be a device for receiving instructions input by a user and collecting data, and may include one or more of a keyboard, a mouse, a microphone, a touch screen, a camera, and the like.
The output device 108 may output various information (e.g., images or sounds) to an external (e.g., user), and may include one or more of a display, a speaker, and the like.
The input device 106 and the output device 108 are mainly used for interacting with a user, and the electronic device 100 may also be implemented without them.
In the following, a neural network based calculation method 200 according to an embodiment of the present invention will be described with reference to fig. 2.
In step S220, input data is input to the first neural network.
The input data may be any data, such as image data, sound data, or text data. The input data may be represented by a vector, referred to as an input vector. The input data may also be represented by a matrix composed of one or more vectors, each also referred to as an input vector.
A neural network may include one or more layers, and some or all of the layers may perform matrix multiplication operations on their input data. In particular, the operation of a neural network on data typically includes one or more matrix multiplications of the form Y = M × X, where M is a weight coefficient matrix of the neural network whose elements are parameters of the neural network, X is the input vector, and Y is the output vector. Let X be of length m, i.e., X includes m elements, and let Y be of length n, i.e., Y includes n elements. Thus, m × n floating-point or integer multiplications are required to complete one matrix multiplication involving the weight coefficient matrix M.
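As an illustration only, a minimal C sketch of such a matrix multiplication (assuming an n-by-m weight coefficient matrix stored in row-major order; the function and identifier names are chosen purely for illustration) is shown below:

/* Illustrative only: computes Y = M * X for an n-by-m weight coefficient
 * matrix M stored in row-major order and an input vector X of length m.
 * Exactly n * m multiplications are performed. */
void matrix_multiply(const float *M, const float *X, float *Y, int n, int m)
{
    for (int i = 0; i < n; ++i) {
        float sum = 0.0f;
        for (int j = 0; j < m; ++j) {
            sum += M[i * m + j] * X[j];  /* one multiplication per matrix element */
        }
        Y[i] = sum;
    }
}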
For the first neural network, there is at least one weight coefficient matrix that satisfies the following condition: at least one row of the weight coefficient matrix includes the same parameter more than once. That is, one or more weight coefficient matrices of the first neural network include identical parameters (i.e., matrix elements that are equal in value), and at least some of these identical parameters lie in the same row of the weight coefficient matrix. An example of a weight coefficient matrix M of the first neural network is shown below:

M =
3 3 5 4
2 1 5 6
5 2 5 5
For the weight coefficient matrix M in the above example, a total of 3 rows is included. The first and second elements in the first row are both 3, i.e., they are the same parameter. The first, third and fourth elements in the third row are all 5, i.e., they are the same parameter. That is, the weight coefficient matrix M includes two rows that each contain the same parameter more than once.
In step S240, input data is calculated based on the first neural network.
As previously described, the computation of the data by the first neural network includes one or more matrix multiplication computations. The first neural network may perform a matrix multiplication computation on input data of one or more layers thereof. The parameters of that layer of the first neural network constitute a matrix of weight coefficients involved in the matrix multiplication computation. In the first neural network, there is at least one weight coefficient matrix in which at least one row includes the same parameter.
A matrix multiplication operation involving a row of the weight coefficient matrix that includes the same parameter specifically includes the following sub-steps.
First, the elements of the input vector involved in the matrix multiplication operation that correspond to the same parameter within that row of the weight coefficient matrix are added to obtain an element sum.
As previously described, the first neural network may include one or more layers. One or more of the layers may perform a matrix multiplication computation on the input data for that layer. For the first layer, its input data is the input data of the first neural network. For the second layer, its input data is the output data of the first layer. By analogy, for any layer of the first neural network except the first layer, its input data is the output data of the previous layer. If a particular layer of the first neural network performs a matrix multiplication operation on the input data for that layer, then the input vector involved in the matrix multiplication operation is a vector representing the input data of that layer. An example of an input vector X involved in a matrix multiplication operation is shown below: X = [1 2 7 8]^T.
For the first row of the weight coefficient matrix M described above, the same parameters are the first element and the second element. These two elements correspond to the first element "1" and the second element "2" of the input vector X, respectively, and the element sum is 1 + 2 = 3.
For the third row of the weight coefficient matrix M described above, the same parameters are the first element, the third element and the fourth element. These three elements correspond to the first element "1", the third element "7" and the fourth element "8" of the input vector X, respectively, and the element sum is 1 + 7 + 8 = 16.
The element sum is then multiplied by the value of the same parameter within that row of the weight coefficient matrix to obtain a data product.
For the first row of the weight coefficient matrix M described above, the element sum 3 is multiplied by the value 3 of the same parameter in that row, and the data product is 3 × 3 = 9.
For the third row of the weight coefficient matrix M described above, the element sum 16 is multiplied by the value 5 of the same parameter in that row, and the data product is 16 × 5 = 80.
Finally, the multiplication calculation result involving that row of the weight coefficient matrix is calculated by addition based on the data product, i.e., the data product is added to the products of the remaining elements of the row with the corresponding elements of the input vector.
For the first row of the weight coefficient matrix M described above, the result of multiplication with X is 9 + 5 × 7 + 4 × 8 = 76.
For the third row of the weight coefficient matrix M described above, the result of multiplication with X is 80 + 2 × 2 = 84.
In step S260, the calculation result of the first neural network is calculated and output according to the multiplication calculation results of all rows of the weight coefficient matrix.
It will be appreciated that, in addition to the first and third rows described above that include the same parameter, the weight coefficient matrix M also includes a second row, in which no repeated parameter is present. For this second row, the result of multiplication with X is 2 × 1 + 1 × 2 + 5 × 7 + 6 × 8 = 87.
According to the multiplication calculation results of all rows of the weight coefficient matrix, the output vector of the weight coefficient matrix can be obtained. For the weight coefficient matrix M described above, the output vector is Y = [76 87 84]^T.
And obtaining and outputting the calculation result of the first neural network according to the output vectors of all the weight coefficient matrixes. This process is known to those of ordinary skill in the art and will not be described herein for the sake of brevity.
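Purely as an illustrative sketch of the worked example above (the variable names and the use of a standalone program are assumptions made for illustration), the following C code computes Y = [76 87 84]^T for the weight coefficient matrix M and the input vector X above by grouping the elements of X that correspond to the same parameter before multiplying:

#include <stdio.h>

int main(void)
{
    int X[4] = {1, 2, 7, 8};  /* input vector X */
    int Y[3];                 /* output vector Y */

    /* First row (3 3 5 4): X[0] and X[1] share the parameter 3. */
    Y[0] = (X[0] + X[1]) * 3 + X[2] * 5 + X[3] * 4;      /* 3 multiplications */

    /* Second row (2 1 5 6): no repeated parameter, ordinary computation. */
    Y[1] = X[0] * 2 + X[1] * 1 + X[2] * 5 + X[3] * 6;    /* 4 multiplications */

    /* Third row (5 2 5 5): X[0], X[2] and X[3] share the parameter 5. */
    Y[2] = (X[0] + X[2] + X[3]) * 5 + X[1] * 2;          /* 2 multiplications */

    printf("%d %d %d\n", Y[0], Y[1], Y[2]);              /* prints: 76 87 84 */
    return 0;
}

In this sketch the first and third rows require 3 and 2 multiplications instead of 4, so the complete matrix multiplication uses 9 multiplications instead of 12.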
The above-described neural network-based calculation method significantly reduces the amount of computation of a neural network-based calculation by reducing the number of multiplication operations in the matrix multiplication operations of the neural network, thereby shortening the computation time of the running platform, reducing the power consumption of the platform, prolonging its standby time, and lowering its heat dissipation requirements.
Optionally, the above calculation method 200 is implemented using code generated by a code generator, which is capable of automatically outputting the desired code in accordance with a specific coding specification.
First, the parameters of the first neural network and the topology data of each of its layers may be input to the code generator; that is, a description of the first neural network is input to the code generator, which generates corresponding code, for example C language code.
Data on the calculation operation of each layer of the first neural network, for example, data on the matrix multiplication operation, may be included in the topology data of each layer. An example of a piece of original code (pseudo code) input to the code generator according to an embodiment of the present invention is shown below; it corresponds to the first row of the weight coefficient matrix M of the first neural network described above:
Y[0]=M[0][0]*X[0];
Y[0]=Y[0]+M[0][1]*X[1];
Y[0]=Y[0]+M[0][2]*X[2];
Y[0]=Y[0]+M[0][3]*X[3];
Then, for a matrix multiplication operation such as the one described in step S240, the code generator directly writes the generated code with each parameter of the first neural network replaced by a known constant.
For example, for the original code, the following four parameters are included:
M[0][0]=3, M[0][1]=3, M[0][2]=5, M[0][3]=4.
The parameters in the original code are replaced with these known constants, resulting in the following replaced code:
Y[0]=3*X[0];
Y[0]=Y[0]+3*X[1];
Y[0]=Y[0]+5*X[2];
Y[0]=Y[0]+4*X[3];
by directly replacing the parameters of the first neural network with constants, the access to the memory in the calculation process can be reduced.
Finally, the code generator performs an equivalent transformation on the replaced code. Since a row of the weight coefficient matrix of the first neural network includes the same parameter, the replaced code may be equivalently transformed according to the distributive law a × b + c × b = (a + c) × b. After the equivalent transformation, the replaced code becomes:
Y[0]=(X[0]+X[1])*3;
Y[0]=Y[0]+X[2]*5;
Y[0]=Y[0]+X[3]*4;
It can be understood that the above technical solution reduces the number of multiplication operations by one compared with the code before optimization. The code finally generated by the code generator may be sent to a compiler to generate executable binary code. The code generator has high code-generating efficiency and can significantly improve the working efficiency of a code writer.
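Purely as an illustration, and not as the output of any particular code generator, the same parameter replacement and equivalent transformation applied to the third row (5 2 5 5) of the above weight coefficient matrix could yield code of the following form, in which two multiplication operations are saved:
Y[2]=(X[0]+X[2]+X[3])*5;
Y[2]=Y[2]+X[1]*2;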
FIG. 3 shows a schematic flow diagram of a neural network-based computational method 300, according to another embodiment of the present invention. As shown in fig. 3, the neural network-based calculation method 300 adds step S311, step S312, step S313, and step S314 as compared to the neural network-based calculation method 200 described above. A first neural network is obtained by step S311, step S312, step S313, and step S314. Steps S320, S340 and S360 in the method 300 are respectively similar to the corresponding steps in the method 200, and are not repeated herein for brevity.
In step S311, a second neural network is trained and obtained by using the training data and the labeled content corresponding thereto.
The training data is data whose calculation result is known. For example, for the case where the first neural network is used to perform text recognition on an image, then the training data may include images in which the text included is known. The annotation corresponding to such training data is a text annotation. For another example, where the first neural network is used to perform face recognition on an image, then the training data may include images in which faces are known to be included. The annotation content corresponding to such training data is a face annotation, e.g. a bounding box comprising a face.
The parameters of the neural network are adjusted using the training data and the labeled content corresponding to the training data, so that the calculation result obtained by the neural network is consistent with that labeled content, thereby determining the second neural network. The second neural network is capable of performing calculations on unlabeled input data to obtain ideal calculation results. For example, for a second neural network used to perform character or face recognition on an image, the image is input to the second neural network, and the second neural network can perform accurate character or face recognition on the input image and output an ideal recognition result.
In step S312, the parameters within at least one row of at least one weight coefficient matrix of the second neural network obtained in step S311 are clustered.
The second neural network includes many parameters, the number of which may even reach several billion. The parameters within at least one row of at least one weight coefficient matrix of the second neural network may be clustered according to their values, for example into k classes. Alternatively, k may be any integer between 2 and 1,000,000, depending on the application. The smaller the value of k, the smaller the amount of computation of the method 300.
Assume that the parameters within a row of a weight coefficient matrix of the second neural network can be represented as a vector N[1] = [1.91 2.56 3.58 4.01 3.59 8.46], and that the parameters within the vector are clustered into 3 classes. The first class of parameters includes 1.91 and 2.56, the second class of parameters includes 3.58, 4.01 and 3.59, and the third class of parameters includes 8.46.
Alternatively, the clustering operation may employ a K-means clustering method. The K-means clustering method is easy to implement and can reasonably classify the target parameters, thereby helping to ensure the calculation accuracy of the first neural network obtained therefrom.
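As a minimal sketch only (not a complete or optimized implementation, and with illustratively chosen names), a one-dimensional K-means clustering of the parameters within one row could take the following form in C, where the number of classes k, the initial class center values and the number of iterations are assumed to be supplied by the caller:

#include <math.h>

/* Illustrative one-dimensional K-means: clusters the n parameters in p[]
 * into k classes. centers[] must hold k initial values on entry; on return,
 * centers[] holds the class center values and label[i] the class of p[i]. */
void kmeans_1d(const double *p, int n, int k, double *centers, int *label, int iterations)
{
    for (int it = 0; it < iterations; ++it) {
        /* Assignment step: attach each parameter to the nearest class center. */
        for (int i = 0; i < n; ++i) {
            int best = 0;
            for (int c = 1; c < k; ++c) {
                if (fabs(p[i] - centers[c]) < fabs(p[i] - centers[best]))
                    best = c;
            }
            label[i] = best;
        }
        /* Update step: recompute each class center as the mean of its parameters. */
        for (int c = 0; c < k; ++c) {
            double sum = 0.0;
            int count = 0;
            for (int i = 0; i < n; ++i) {
                if (label[i] == c) {
                    sum += p[i];
                    ++count;
                }
            }
            if (count > 0)
                centers[c] = sum / count;
        }
    }
}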
In step S313, for the second neural network, a class center value for each of the at least one class is determined from the clustered parameters.
As set forth in step S312, the parameters within at least one row of at least one weight coefficient matrix of the second neural network may be clustered into k classes. For one or more of the k classes, a class center value of the class may be determined based on the parameters in it. The class center value substantially represents the values of the parameters of the class.
Alternatively, step S313 may include calculating an average value of the parameters of each of the one or more classes.
Again taking the above vector N[1] = [1.91 2.56 3.58 4.01 3.59 8.46] as an example, the average value of the first class of parameters is (1.91 + 2.56)/2 = 2.235, the average value of the second class of parameters is (3.58 + 4.01 + 3.59)/3 = 3.727, and the third class includes only one parameter, so its average value is the parameter itself, 8.46.
In one example, the average value obtained by the above calculation may be taken as the class center value of the corresponding class. Taking the average value as the class center value is not only easy to implement but also helps ensure the calculation accuracy of the first neural network obtained according to the method.
In another example, step S313 further includes rounding the above average value. Again taking the above vector N[1] = [1.91 2.56 3.58 4.01 3.59 8.46] as an example, the average value 2.235 of the first class of parameters is rounded to 2, the average value 3.727 of the second class of parameters is rounded to 4, and the average value 8.46 of the third class of parameters is rounded to 8. The rounded value may be used as the class center value of the corresponding class. Taking the rounded value as the class center value makes the parameters of the first neural network obtained therefrom integers, thereby further reducing the amount of calculation based on the first neural network.
In step S314, for the second neural network, parameters in each class are replaced with the class center value of the class to obtain the first neural network. For each class of parameters, the parameters of the class may be replaced with a class center value for the class.
Still taking the above vector N[1] = [1.91 2.56 3.58 4.01 3.59 8.46] as an example, assume that the class center value of the first class of parameters is finally determined to be 2, the class center value of the second class is 4, and the class center value of the third class is 8. The vector after parameter replacement is M[1] = [2 2 4 4 4 8], which can be used as a row of parameters of a weight coefficient matrix of the first neural network.
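Continuing the numeric example above, and purely as an illustrative sketch (the cluster assignment in label[] is taken as given from step S312, the names are chosen for illustration, and the array M1 corresponds to the row M[1] in the text), the determination of rounded class center values in step S313 and the parameter replacement in step S314 could be expressed in C as follows:

#include <stdio.h>
#include <math.h>

int main(void)
{
    double N[6]     = {1.91, 2.56, 3.58, 4.01, 3.59, 8.46};  /* row of the second neural network */
    int    label[6] = {0, 0, 1, 1, 1, 2};                    /* class of each parameter from step S312 */
    double sum[3]   = {0.0, 0.0, 0.0};
    int    count[3] = {0, 0, 0};
    double center[3];
    int    M1[6];                                             /* corresponding row of the first neural network */

    /* Step S313: class center value = rounded average of the parameters of each class. */
    for (int i = 0; i < 6; ++i) {
        sum[label[i]] += N[i];
        ++count[label[i]];
    }
    for (int c = 0; c < 3; ++c)
        center[c] = round(sum[c] / count[c]);

    /* Step S314: replace every parameter with the class center value of its class. */
    for (int i = 0; i < 6; ++i)
        M1[i] = (int)center[label[i]];

    for (int i = 0; i < 6; ++i)
        printf("%d ", M1[i]);                                 /* prints: 2 2 4 4 4 8 */
    printf("\n");
    return 0;
}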
While the process of obtaining the first neural network is described above by taking only one row of a weight coefficient matrix as an example, those skilled in the art can understand that the operations of step S312, step S313 and step S314 described above can be performed for multiple rows of a weight coefficient matrix of the second neural network. Furthermore, the above operations may also be performed for multiple rows of multiple weight coefficient matrices of the second neural network, where the above operations may be performed for one or more rows of each weight coefficient matrix. Preferably, the above operations are performed for all rows of all weight coefficient matrices of the second neural network. The more rows involved in the above operations, the smaller the amount of computation based on the first neural network and, accordingly, the more resources are saved.
The operations of step S313 and step S314 described above may be performed, for each row, for one or more of the classes obtained by clustering. Preferably, the operations of step S313 and step S314 are performed for all the classes obtained by clustering. The more classes for which these operations are performed, the less computationally intensive the first neural network-based calculation becomes.
By clustering the parameters of the second neural network and determining the parameters of the first neural network according to the clustering result, the number of distinct parameter values of the first neural network is limited, and these parameters can be represented by integers or fixed-point numbers instead of floating-point numbers, thereby significantly increasing the calculation speed of the first neural network while sacrificing little of its calculation accuracy.
In the above neural network-based calculation method 300, the second neural network is obtained by training with training data and the labeled content corresponding thereto. Those skilled in the art can understand that this manner of obtaining the second neural network is merely illustrative and not restrictive; for example, the second neural network may instead be pre-stored or determined empirically. Obtaining the second neural network through training makes it better suited to a specific scenario, and therefore can improve the calculation accuracy of the first neural network obtained from it.
FIG. 4 shows a schematic block diagram of a neural network-based computing device 400, according to one embodiment of the present invention. As shown in fig. 4, the apparatus 400 may include an input module 420, a calculation module 440, and an output module 460.
The input module 420 is used to input data to the first neural network. The same parameters are included in at least one row of at least one weight coefficient matrix of the first neural network. The input module 420 may be implemented by the processor 102 in the electronic device shown in fig. 1 executing program instructions stored in the storage 104, and may perform step S220 in the neural network-based computing method according to an embodiment of the present invention.
The calculation module 440 is configured to calculate the input data based on the first neural network. In a matrix multiplication operation involving the at least one row of the weight coefficient matrix, elements of the input vector involved in the matrix multiplication operation that correspond to the same parameter are added to obtain an element sum; the element sum is multiplied by the value of the same parameter to obtain a data product; and the multiplication calculation result involving the at least one row of the weight coefficient matrix is calculated by addition based on the data product. The calculation module 440 may be implemented by the processor 102 in the electronic device shown in fig. 1 executing program instructions stored in the storage 104, and may perform step S240 in the neural network-based calculation method according to an embodiment of the present invention.
The output module 460 is configured to calculate and output the calculation result of the first neural network according to the multiplication calculation result of all rows of the weight coefficient matrix. The output module 460 may be implemented by the processor 102 in the electronic device shown in fig. 1 executing program instructions stored in the storage 104, and may perform step S260 in the neural network-based computing method according to an embodiment of the present invention.
The above neural network-based computing device can significantly reduce the amount of computation of a neural network-based calculation, thereby shortening the computation time of the running platform, reducing the power consumption of the platform, prolonging its standby time, and lowering its heat dissipation requirements.
Fig. 5 shows a schematic block diagram of a neural network-based computing device 500, according to another embodiment of the present invention. As shown in fig. 5, the neural network-based computing device 500 adds a training module 511, a clustering module 512, a class center value determination module 513, and a replacement module 514, as compared to the neural network-based computing device 400 described above. The training module 511, the clustering module 512, the class center value determination module 513, and the replacement module 514 are configured to obtain a first neural network. The input module 520, the computing module 540, and the output module 560 in the computing apparatus 500 are respectively similar to the corresponding modules in the computing apparatus 400, and are not described herein again for brevity.
The training module 511 is configured to train and obtain a second neural network by using the training data and the labeled content corresponding to the training data. The training module 511 may be implemented by the processor 102 in the electronic device shown in fig. 1 executing program instructions stored in the storage device 104, and may perform step S311 of the neural network-based computing method according to the embodiment of the present invention.
It is to be appreciated that the second neural network can be obtained in other ways besides using the training module 511, such as being pre-stored or set based on experience. In other words, the computing device 500 may not include the training module 511.
The clustering module 512 is configured to cluster the parameters within at least one row of at least one weight coefficient matrix of the second neural network. Optionally, K-means clustering is employed for parameter clustering. The clustering module 512 may be implemented by the processor 102 in the electronic device shown in fig. 1 executing program instructions stored in the storage device 104, and may perform step S312 of the neural network-based computing method according to the embodiment of the present invention.
The class center value determination module 513 is configured to determine a class center value for each of at least one class from the clustered parameters for the second neural network. The class center value determination module 513 may be implemented by the processor 102 in the electronic device shown in fig. 1 executing program instructions stored in the storage 104, and may perform step S313 of the neural network-based calculation method according to the embodiment of the present invention.
Illustratively, the class center value determining module 513 includes a first average value calculating unit for calculating an average value of the parameter of each of the at least one class, wherein the average value is a class center value of the corresponding class.
Illustratively, the class center value determination module 513 includes a second average value calculation unit and a rounding unit. The second average calculation unit is used for calculating the average value of the parameters of each of the at least one class. And the rounding unit is used for rounding the average value. Wherein the rounded average is the class center value of the corresponding class.
A replacement module 514 is configured to replace parameters in the corresponding class with the class center value for the second neural network to obtain the first neural network. The replacement module 514 may be implemented by the processor 102 in the electronic device shown in fig. 1 executing program instructions stored in the storage 104, and may perform step S314 of the neural network-based computing method according to the embodiment of the present invention.
The structure, implementation and advantages of the above-mentioned neural network-based computing device can be understood by those skilled in the art from reading the above detailed description of the neural network-based computing method, and thus will not be described in detail herein.
FIG. 6 shows a schematic block diagram of an electronic device 600 for neural network-based computing, according to an embodiment of the present invention. The electronic device 600 includes an input means 610, a storage means 620, a processor 630 and an output means 640.
The input device 610 is used for receiving an operation instruction input by a user and collecting data. The input device 610 may include one or more of a keyboard, a mouse, a microphone, a touch screen, a camera, and the like.
The storage means 620 stores program codes for implementing respective steps in the neural network-based calculation method according to the embodiment of the present invention.
The processor 630 is configured to run the program codes stored in the storage device 620 to perform the corresponding steps of the neural network based computing method according to the embodiment of the present invention, and is configured to implement the input module 420, the computing module 440 and the output module 460 in the neural network based computing device according to the embodiment of the present invention.
In one embodiment, the program code, when executed by the processor 630, causes the electronic device 600 to perform the steps of:
inputting input data to a first neural network, wherein at least one row of at least one weight coefficient matrix of the first neural network comprises the same parameters;
calculating the input data based on the first neural network, wherein in a matrix multiplication operation involving at least one row of the matrix of weight coefficients,
adding elements corresponding to the same parameters in the input vectors involved in the matrix multiplication operation to obtain element sums;
multiplying said element sum by the value of said same parameter to obtain a data product;
calculating a multiplication calculation result involving at least one row of the weight coefficient matrix from the addition of the data products; and
and calculating and outputting the calculation result of the first neural network according to the multiplication calculation result of all the rows of the weight coefficient matrix.
In one embodiment, the program code, when executed by the processor 630, further causes the electronic device 600 to perform the steps of:
clustering parameters within at least one row of at least one weight coefficient matrix of a second neural network;
determining, for the second neural network, a class center value for each of at least one class from the clustered parameters; and
for the second neural network, replacing parameters in a corresponding class with the class center value to obtain the first neural network.
Illustratively, the clustering operation adopts a K-means clustering method.
Illustratively, the step of determining a class center value for each of the at least one class from the clustered parameters, which the program code causes the electronic device 600 to perform when executed by the processor 630, comprises:
calculating an average value of the parameter for each of the at least one class, wherein the average value is a class center value of the corresponding class.
Illustratively, the step of determining a class center value for each of the at least one class from the clustered parameters, which the program code causes the electronic device 600 to perform when executed by the processor 630, comprises:
calculating an average value of the parameters of each of the at least one class; and
rounding the average value;
wherein the rounded average is the class center value of the corresponding class.
In one embodiment, the program code, when executed by the processor 630, further causes the electronic device 600 to perform the steps of:
and training and obtaining a second neural network by using the training data and the labeled content corresponding to the training data.
Furthermore, according to an embodiment of the present invention, there is also provided a storage medium on which program instructions are stored, which when executed by a computer or a processor are used for executing the respective steps of the neural network-based computing method according to an embodiment of the present invention, and for implementing the respective modules in the neural network-based computing apparatus according to an embodiment of the present invention. The storage medium may include, for example, a memory card of a smart phone, a storage component of a tablet computer, a hard disk of a personal computer, a Read Only Memory (ROM), an Erasable Programmable Read Only Memory (EPROM), a portable compact disc read only memory (CD-ROM), a USB memory, or any combination of the above storage media. The computer-readable storage medium may be any combination of one or more computer-readable storage media.
In one embodiment, the computer program instructions may, when executed by a computer or a processor, implement the functional blocks of the neural network-based computing device according to the embodiment of the present invention, and/or may perform the neural network-based computing method according to the embodiment of the present invention.
In one embodiment, the computer program instructions, when executed by a computer or processor, perform the steps of:
inputting input data to a first neural network, wherein at least one row of at least one weight coefficient matrix of the first neural network comprises the same parameters;
calculating the input data based on the first neural network, wherein in a matrix multiplication operation involving at least one row of the matrix of weight coefficients,
adding elements corresponding to the same parameters in the input vectors involved in the matrix multiplication operation to obtain element sums;
multiplying said element sum by the value of said same parameter to obtain a data product;
calculating a multiplication calculation result involving at least one row of the weight coefficient matrix from the addition of the data products; and
and calculating and outputting the calculation result of the first neural network according to the multiplication calculation result of all the rows of the weight coefficient matrix.
Illustratively, the computer program instructions, when executed by a computer or processor, further perform the steps of:
clustering parameters within at least one row of at least one weight coefficient matrix of the second neural network;
determining, for the second neural network, a class center value for each of at least one class from the clustered parameters; and
for the second neural network, replacing parameters in a corresponding class with the class center value to obtain the first neural network.
Illustratively, the clustering operation adopts a K-means clustering method.
Illustratively, the step of determining a class center value for each of the at least one class, which the computer program instructions cause the computer or processor to perform when executed, comprises:
calculating an average value of the parameter for each of the at least one class, wherein the average value is a class center value of the corresponding class.
Illustratively, the step of determining a class center value for each of the at least one class, which the computer program instructions cause the computer or processor to perform when executed, comprises:
calculating an average value of the parameters of each of the at least one class; and
rounding the average value;
wherein the rounded average is the class center value of the corresponding class.
Illustratively, the computer program instructions, when executed by a computer or processor, further perform the steps of:
and training and obtaining the second neural network by using the training data and the labeled content corresponding to the training data.
According to the neural network-based calculation method and apparatus, the electronic device, and the storage medium provided by the embodiments of the present invention, the amount of computation of a neural network-based calculation can be significantly reduced, thereby shortening the computation time of the running platform, reducing the power consumption of the platform, prolonging its standby time, and lowering its heat dissipation requirements.
Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the foregoing illustrative embodiments are merely exemplary and are not intended to limit the scope of the invention thereto. Various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the present invention. All such changes and modifications are intended to be included within the scope of the present invention as set forth in the appended claims.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the units is only one logical functional division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another device, or some features may be omitted, or not executed.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the invention and aiding in the understanding of one or more of the various inventive aspects. However, the neural network-based calculation method of the present invention should not be interpreted to reflect the following intention: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
It will be understood by those skilled in the art that all of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where such features are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some of the modules in a neural network-based computing device according to embodiments of the present invention. The present invention may also be embodied as apparatus programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etcetera does not indicate any ordering; these words may be interpreted as names.
The above description is only for the specific embodiment of the present invention or the description thereof, and the protection scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and the changes or substitutions should be covered within the protection scope of the present invention. The protection scope of the present invention shall be subject to the protection scope of the claims.