CN111602145A - Optimization method of convolutional neural network and related products - Google Patents
- Publication number
- CN111602145A CN111602145A CN201880083507.4A CN201880083507A CN111602145A CN 111602145 A CN111602145 A CN 111602145A CN 201880083507 A CN201880083507 A CN 201880083507A CN 111602145 A CN111602145 A CN 111602145A
- Authority
- CN
- China
- Prior art keywords
- model
- convolutional layer
- layer
- loss value
- intermediate model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/096—Transfer learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
A method of optimizing a convolutional neural network and related products. The method comprises: obtaining a pre-trained model M; retraining the pre-trained model M on a data set D of a specified field to obtain an initial model M0; and performing a replace-layer operation on the initial model M0. The replace-layer operation comprises: determining, based on a bipartite graph maximum matching algorithm, a standard convolutional layer e in the initial model M0 that is suitable to be replaced by an efficient convolutional layer, and determining that the first intermediate model M1 obtained by replacing the standard convolutional layer e with the efficient convolutional layer has an improved effect; reforming the parameters of the first intermediate model M1 to obtain a second intermediate model M2; initializing and retraining the second intermediate model M2 to obtain a third intermediate model M3; and calculating a loss value of the third intermediate model M3. The replace-layer operation is performed repeatedly to obtain a plurality of third intermediate models M3 and a plurality of loss values, and the third intermediate model M3 with the smallest loss value is selected as the output model. The method has the advantage of low cost.
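The replace-layer search loop described in the abstract can be sketched in Python. This is a minimal, hypothetical illustration: the model representation, the layer swap, and the loss function are stand-ins (the patent's parameter reforming and retraining steps are collapsed into a single toy loss), not the patented implementation.

```python
def replace_layer_search(layer_names, loss_fn):
    """Try replacing each standard conv layer with an 'efficient' variant,
    score each candidate model, and keep the one with the smallest loss."""
    best_variant, best_loss = None, float("inf")
    for e in layer_names:
        # M1: model with layer e swapped for an efficient convolution
        variant = {name: ("efficient" if name == e else "standard")
                   for name in layer_names}
        # Stand-in for: reform parameters (M2), init + retrain (M3), eval loss
        loss = loss_fn(variant)
        if loss < best_loss:
            best_variant, best_loss = variant, loss
    return best_variant, best_loss

# Toy loss table: pretend replacing conv2 helps the most (hypothetical numbers)
TOY_LOSS = {"conv1": 0.9, "conv2": 0.4, "conv3": 0.7}

def toy_loss(variant):
    replaced = next(n for n, t in variant.items() if t == "efficient")
    return TOY_LOSS[replaced]

best, loss = replace_layer_search(["conv1", "conv2", "conv3"], toy_loss)
```

With the toy numbers above, the search keeps the variant in which conv2 is replaced, mirroring "select the third intermediate model M3 with the smallest loss value".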
Description
PCT national-phase application; the description has been published.
Claims (8)
- A method for optimizing a convolutional neural network, the method comprising the steps of: obtaining a pre-trained model M; retraining the pre-trained model M on a data set D of a specified field to obtain an initial model M0; and performing a replace-layer operation on the initial model M0; wherein the replace-layer operation comprises: determining, based on a bipartite graph maximum matching algorithm, a standard convolutional layer e in the initial model M0 that is suitable to be replaced by an efficient convolutional layer, and determining that the first intermediate model M1 obtained by replacing the standard convolutional layer e with the efficient convolutional layer has an improved effect; reforming the parameters of the first intermediate model M1 to obtain a second intermediate model M2; initializing and retraining the second intermediate model M2 to obtain a third intermediate model M3; and calculating a loss value of the third intermediate model M3; repeatedly performing the replace-layer operation to obtain a plurality of third intermediate models M3 and a plurality of loss values; and selecting the third intermediate model M3 with the smallest loss value as the output model.
- The method of claim 1, wherein determining, based on the bipartite graph maximum matching algorithm, that the standard convolutional layer e in the initial model M0 is suitable to be replaced by the efficient convolutional layer specifically comprises: finding, in the initial model M0, a group convolutional layer containing Ng groups such that the intra-layer connections have a minimum change in importance, wherein the importance is the L2 norm of all weights in each connection.
- An apparatus for optimizing a convolutional neural network, the apparatus comprising: an obtaining unit, configured to obtain a pre-trained model M; a training unit, configured to retrain the pre-trained model M on a data set D of a specified field to obtain an initial model M0; a replacement unit, configured to perform a replace-layer operation on the initial model M0, wherein the replace-layer operation comprises: determining, based on a bipartite graph maximum matching algorithm, a standard convolutional layer e in the initial model M0 that is suitable to be replaced by an efficient convolutional layer, and determining that the first intermediate model M1 obtained by replacing the standard convolutional layer e with the efficient convolutional layer has an improved effect; reforming the parameters of the first intermediate model M1 to obtain a second intermediate model M2; initializing and retraining the second intermediate model M2 to obtain a third intermediate model M3; and calculating a loss value of the third intermediate model M3; and a selecting unit, configured to control the replacement unit to repeatedly perform the replace-layer operation to obtain a plurality of third intermediate models M3 and a plurality of loss values, and to select the third intermediate model M3 with the smallest loss value as the output model.
- The apparatus of claim 4, wherein the replacement unit is specifically configured to find, in the initial model M0, a group convolutional layer containing Ng groups such that the intra-layer connections have a minimum change in importance, wherein the importance is the L2 norm of all weights in each connection.
- A computer-readable storage medium storing a program for electronic data exchange, wherein the program causes a terminal to perform the method as provided in any one of claims 1-3.
- A computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform the method as provided in any one of claims 1 to 3.
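As a rough illustration of the importance criterion recited in the claims (a sketch under stated assumptions, not the patented algorithm): the importance of a connection between an input and an output channel is taken as the L2 norm of that connection's kernel weights, and a candidate group count Ng is scored by the total importance of the connections a block-diagonal grouped convolution would discard. Channel order is kept fixed here for simplicity; the claimed bipartite-matching step would additionally reorder channels. The weight tensor values are made up.

```python
import math

def connection_importance(weight):
    # weight: nested list [C_out][C_in][kH][kW]; the importance of connection
    # (o, i) is the L2 norm of that connection's kernel weights.
    return [[math.sqrt(sum(v * v for row in kernel for v in row))
             for kernel in out_ch] for out_ch in weight]

def grouping_cost(weight, num_groups):
    """Total importance of connections removed by an Ng-group convolution.

    Channels are split into contiguous groups; connections that cross a
    group boundary are dropped, so the 'change in importance' is the summed
    importance of the dropped connections.
    """
    c_out, c_in = len(weight), len(weight[0])
    imp = connection_importance(weight)
    go, gi = c_out // num_groups, c_in // num_groups
    return sum(imp[o][i]
               for o in range(c_out) for i in range(c_in)
               if o // go != i // gi)  # connection crosses a group boundary

# Deterministic toy 4x4x3x3 weight tensor (hypothetical values)
w = [[[[(o + i + a + b + 1) * 0.1 for b in range(3)] for a in range(3)]
      for i in range(4)] for o in range(4)]

best_ng = min([2, 4], key=lambda ng: grouping_cost(w, ng))
```

With one group nothing is discarded, so the cost is zero; more groups discard more connections, so under this fixed channel order the smaller candidate group count wins.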
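The claims invoke a bipartite graph maximum matching algorithm without spelling it out. For reference, here is a generic augmenting-path maximum matching (Kuhn's algorithm) in Python; the adjacency used in the example is made up, since the patent does not publish its exact graph construction.

```python
def max_bipartite_matching(adj, n_right):
    """Kuhn's augmenting-path maximum matching.

    adj[u] lists the right-side vertices adjacent to left vertex u.
    Returns (matching size, match_right), where match_right[v] is the
    left vertex matched to right vertex v, or -1 if v is unmatched.
    """
    match_right = [-1] * n_right

    def try_augment(u, seen):
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                # v is free, or its current partner can be re-matched elsewhere
                if match_right[v] == -1 or try_augment(match_right[v], seen):
                    match_right[v] = u
                    return True
        return False

    matched = sum(try_augment(u, [False] * n_right) for u in range(len(adj)))
    return matched, match_right

# Hypothetical example: three left vertices, three right vertices
size, match_right = max_bipartite_matching([[0, 1], [0], [1, 2]], 3)
```

In the example, vertex 1 forces a re-matching of vertex 0 through the augmenting path, and a perfect matching of size 3 is found.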
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2018/112569 WO2020087254A1 (en) | 2018-10-30 | 2018-10-30 | Optimization method for convolutional neural network, and related product |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN111602145A (en) | 2020-08-28 |
Family
ID=70463304
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201880083507.4A Pending CN111602145A (en) | 2018-10-30 | 2018-10-30 | Optimization method of convolutional neural network and related products |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN111602145A (en) |
| WO (1) | WO2020087254A1 (en) |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP6042274B2 (en) * | 2013-06-28 | 2016-12-14 | 株式会社デンソーアイティーラボラトリ | Neural network optimization method, neural network optimization apparatus and program |
| CN105844653B (en) * | 2016-04-18 | 2019-07-30 | 深圳先进技术研究院 | Multilayer convolutional neural network optimization system and method |
| CN106485324A (en) * | 2016-10-09 | 2017-03-08 | 成都快眼科技有限公司 | Convolutional neural network optimization method |
| CN108319988B (en) * | 2017-01-18 | 2021-12-24 | 华南理工大学 | Acceleration method of deep neural network for handwritten Chinese character recognition |
- 2018
- 2018-10-30 CN CN201880083507.4A patent/CN111602145A/en active Pending
- 2018-10-30 WO PCT/CN2018/112569 patent/WO2020087254A1/en not_active Ceased
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114912569A (en) * | 2021-02-10 | 2022-08-16 | 华为技术有限公司 | Model training method and device |
| CN113128670A (en) * | 2021-04-09 | 2021-07-16 | 南京大学 | Neural network model optimization method and device |
| CN113128670B (en) * | 2021-04-09 | 2024-03-19 | 南京大学 | An optimization method and device for neural network models |
| CN114648671A (en) * | 2022-02-15 | 2022-06-21 | 成都臻识科技发展有限公司 | Detection model generation method and device based on deep learning |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2020087254A1 (en) | 2020-05-07 |
Similar Documents
| Publication | Title |
|---|---|
| JP6831347B2 | Learning device, learning method, and learning program |
| CN111178520B | Method and device for constructing a neural network |
| JP2020512639A5 | |
| PH12021551336A1 | Automated generation of machine learning models |
| WO2017157183A1 | Automatic multi-threshold feature filtering method and apparatus |
| WO2019091020A1 | Weight data storage method and neural network processor based on the method |
| CN111602145A | Optimization method of convolutional neural network and related products |
| WO2018227800A1 | Neural network training method and device |
| JP5624562B2 | Method and system for calculating website visitor ratings |
| CN107330446A | Optimization method of deep convolutional neural networks for image classification |
| CN110291540A | Batch renormalization layer |
| JP2015011510A | Neural network optimization method, neural network optimization apparatus, and program |
| TW202036388A | Neural network, method to prune weights and output feature maps of a layer of a neural network, and neural network analyzer |
| CN111144548A | Method and device for identifying the working condition of a pumping well |
| US20190279092A1 | Convolutional Neural Network Compression |
| WO2018107383A1 | Neural network convolution computation method and device, and computer-readable storage medium |
| CN112905894B | Collaborative filtering recommendation method based on enhanced graph learning |
| US20210232912A1 | Systems and Methods for Providing a Machine-Learned Model with Adjustable Computational Demand |
| CN106897265B | Word vector training method and device |
| CN112771547A | End-to-end learning in a communication system |
| You et al. | Recursive reduced kernel based extreme learning machine for aero-engine fault pattern recognition |
| CN103544528A | BP neural network classification method based on Hadoop |
| JP2016006617A | Learning device, learning method, and learning program |
| CN108604313B | Automated predictive modeling and framework |
| WO2015192798A1 | Topic mining method and device |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20200828 |