
CN111602145A - Optimization method of convolutional neural network and related products - Google Patents

Optimization method of convolutional neural network and related products

Info

Publication number
CN111602145A
Authority
CN
China
Prior art keywords
model
convolutional layer
layer
loss value
intermediate model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201880083507.4A
Other languages
Chinese (zh)
Inventor
Zhao Ruizhe (赵睿哲)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Corerain Technologies Co Ltd
Original Assignee
Shenzhen Corerain Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Corerain Technologies Co Ltd filed Critical Shenzhen Corerain Technologies Co Ltd
Publication of CN111602145A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • G06N 3/08: Learning methods
    • G06N 3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N 3/09: Supervised learning
    • G06N 3/096: Transfer learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A method of optimizing a convolutional neural network, and related products. The method comprises: obtaining a pre-trained model M; retraining the pre-trained model M on a data set D of a specified domain to obtain an initial model M0; and performing a replace-layer operation on the initial model M0. The replace-layer operation comprises: determining, based on a bipartite-graph maximum-matching algorithm, that a standard convolutional layer e in the initial model M0 is suitable to be replaced by an efficient convolutional layer, and replacing the standard convolutional layer e with the efficient convolutional layer to obtain a first intermediate model M1; reshaping the parameters of the first intermediate model M1 to obtain a second intermediate model M2; initializing and retraining the second intermediate model M2 to obtain a third intermediate model M3; and calculating a loss value of the third intermediate model M3. The replace-layer operation is performed repeatedly to obtain a plurality of third intermediate models M3 and a plurality of loss values, and the third intermediate model M3 with the smallest loss value is selected as the output model. The method has the advantage of low cost.
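The search procedure summarized above maps naturally onto a simple loop. The following is a minimal sketch of that loop, assuming user-supplied training, scoring, replacement, and reshaping callables; every name in it (optimize_cnn, replace_fn, reshape_fn, and so on) is an illustrative placeholder, not the patent's implementation.

```python
import copy

def optimize_cnn(model_m, dataset_d, candidate_layers,
                 train_fn, loss_fn, replace_fn, reshape_fn):
    # Retrain the pre-trained model M on the domain data set D -> M0.
    m0 = train_fn(copy.deepcopy(model_m), dataset_d)

    candidates = []
    for layer in candidate_layers:
        # Replace-layer operation: swap a standard convolutional layer
        # for an efficient (e.g. grouped) convolution -> M1 ...
        m1 = replace_fn(copy.deepcopy(m0), layer)
        # ... reshape the parameters of M1 -> M2 ...
        m2 = reshape_fn(m1, layer)
        # ... initialize and retrain -> M3, then score it.
        m3 = train_fn(m2, dataset_d)
        candidates.append((loss_fn(m3, dataset_d), m3))

    # Keep the third intermediate model M3 with the smallest loss value.
    return min(candidates, key=lambda c: c[0])[1]
```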

Description

PCT national-phase application; the description has been published.

Claims (8)

  1. A method for optimizing a convolutional neural network, the method comprising the steps of:
    obtaining a pre-trained model M;
    retraining the pre-trained model M on a data set D of a specified domain to obtain an initial model M0, and performing a replace-layer operation on the initial model M0;
    the replace-layer operation comprising: determining, based on a bipartite-graph maximum-matching algorithm, that a standard convolutional layer e in the initial model M0 is suitable to be replaced by an efficient convolutional layer, and replacing the standard convolutional layer e with the efficient convolutional layer to obtain a first intermediate model M1; reshaping the parameters of the first intermediate model M1 to obtain a second intermediate model M2; initializing and retraining the second intermediate model M2 to obtain a third intermediate model M3; and calculating a loss value of the third intermediate model M3; and
    repeatedly performing the replace-layer operation to obtain a plurality of third intermediate models M3 and a plurality of loss values, and selecting the third intermediate model M3 with the smallest loss value as the output model.
  2. The method of claim 1, wherein determining, based on the bipartite-graph maximum-matching algorithm, that the standard convolutional layer e in the initial model M0 is suitable to be replaced by the efficient convolutional layer specifically comprises:
    finding, in the initial model M0, a grouped convolutional layer containing Ng groups such that the intra-layer connections undergo a minimum change in importance:
    [Formula image PCTCN2018112569-APPB-100001, not reproduced in this publication]
    wherein the importance is the L2 norm of all weights in each connection:
    [Formula image PCTCN2018112569-APPB-100002, not reproduced in this publication]
  3. The method of claim 1 or 2, wherein the loss value is given by:
    [Formula image PCTCN2018112569-APPB-100003, not reproduced in this publication]
    wherein Lw is the loss value.
  4. An apparatus for optimizing a convolutional neural network, the apparatus comprising:
    an obtaining unit, configured to obtain a pre-trained model M;
    a training unit, configured to retrain the pre-trained model M on a data set D of a specified domain to obtain an initial model M0;
    a replacement unit, configured to perform a replace-layer operation on the initial model M0, the replace-layer operation comprising: determining, based on a bipartite-graph maximum-matching algorithm, that a standard convolutional layer e in the initial model M0 is suitable to be replaced by an efficient convolutional layer, and replacing the standard convolutional layer e with the efficient convolutional layer to obtain a first intermediate model M1; reshaping the parameters of the first intermediate model M1 to obtain a second intermediate model M2; initializing and retraining the second intermediate model M2 to obtain a third intermediate model M3; and calculating a loss value of the third intermediate model M3; and
    a selecting unit, configured to control the replacement unit to repeatedly perform the replace-layer operation to obtain a plurality of third intermediate models M3 and a plurality of loss values, and to select the third intermediate model M3 with the smallest loss value as the output model.
  5. The apparatus of claim 4, wherein the replacement unit is specifically configured to find, in the initial model M0, a grouped convolutional layer containing Ng groups such that the intra-layer connections undergo a minimum change in importance:
    [Formula image PCTCN2018112569-APPB-100004, not reproduced in this publication]
    wherein the importance is the L2 norm of all weights in each connection:
    [Formula image PCTCN2018112569-APPB-100005, not reproduced in this publication]
    [Formula image PCTCN2018112569-APPB-100006, not reproduced in this publication]
  6. The apparatus of claim 4 or 5, wherein the loss value is given by:
    [Formula image PCTCN2018112569-APPB-100007, not reproduced in this publication]
    wherein Lw is the loss value.
  7. A computer-readable storage medium storing a program for electronic data exchange, wherein the program causes a terminal to perform the method as provided in any one of claims 1-3.
  8. A computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform the method as provided in any one of claims 1 to 3.
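The sketches below illustrate, under stated assumptions, the operations recited in claims 1 to 3; none of them reproduces the patent's actual implementation. First, one plausible reading of "reshaping the parameters of the first intermediate model M1" when the efficient layer is a grouped convolution: copy the block-diagonal portion of the standard convolution's weights into the grouped layer and discard the cross-group weights. The block-diagonal copy rule, and the use of PyTorch, are assumptions of this sketch.

```python
import torch
import torch.nn as nn

def to_grouped_conv(conv: nn.Conv2d, num_groups: int) -> nn.Conv2d:
    """Rebuild a standard Conv2d as a grouped Conv2d by keeping the weights
    that fall inside the diagonal channel blocks (an assumed reshaping rule)."""
    assert conv.in_channels % num_groups == 0
    assert conv.out_channels % num_groups == 0
    grouped = nn.Conv2d(
        conv.in_channels, conv.out_channels, conv.kernel_size,
        stride=conv.stride, padding=conv.padding, dilation=conv.dilation,
        groups=num_groups, bias=conv.bias is not None,
    )
    in_per_g = conv.in_channels // num_groups
    out_per_g = conv.out_channels // num_groups
    with torch.no_grad():
        for g in range(num_groups):
            # Grouped weights have shape [out_channels, in_channels/groups, kH, kW],
            # so each output block only sees its own input block.
            grouped.weight[g * out_per_g:(g + 1) * out_per_g].copy_(
                conv.weight[g * out_per_g:(g + 1) * out_per_g,
                            g * in_per_g:(g + 1) * in_per_g])
        if conv.bias is not None:
            grouped.bias.copy_(conv.bias)
    return grouped
```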
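Claim 2 scores each connection between an input channel and an output channel by the L2 norm of its weights, and looks for the Ng-group layout that changes total importance the least. The exact matching formulation is in unreproduced formula images, so the sketch below substitutes a simple stand-in: contiguous channel blocks are paired by the Hungarian assignment (scipy's linear_sum_assignment) rather than the claim's general bipartite maximum matching, so that the importance retained inside groups is maximized.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def grouping_importance_change(weight: np.ndarray, num_groups: int):
    """Stand-in for the claim-2 criterion (assumed formulation).

    weight: standard convolution weights of shape [C_out, C_in, kH, kW].
    Connection importance is the L2 norm of the weights joining one input
    channel to one output channel, as claim 2 recites.
    """
    c_out, c_in = weight.shape[:2]
    # importance[j, i] = L2 norm of all weights between input i and output j.
    importance = np.linalg.norm(weight.reshape(c_out, c_in, -1), axis=2)

    # Contiguous channel blocks stand in for the patent's matched groups.
    out_blocks = np.array_split(np.arange(c_out), num_groups)
    in_blocks = np.array_split(np.arange(c_in), num_groups)

    # retained[a][b] = importance kept if output block a shares a group with
    # input block b; everything outside the chosen pairs is lost.
    retained = np.array([[importance[np.ix_(ob, ib)].sum() for ib in in_blocks]
                         for ob in out_blocks])

    # Assignment that maximizes retained importance, i.e. minimizes the
    # change in importance caused by grouping.
    rows, cols = linear_sum_assignment(-retained)
    change = importance.sum() - retained[rows, cols].sum()
    return change, list(zip(rows.tolist(), cols.tolist()))
```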
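The loss value Lw of claim 3 is likewise defined only in an unreproduced formula image. As a generic stand-in for scoring each third intermediate model M3, a mean cross-entropy over a held-out validation loader can serve; this choice is an assumption, not the patented formula.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def loss_value(model, val_loader, device="cpu"):
    """Generic stand-in for the claim-3 loss Lw: mean cross-entropy
    over a validation loader."""
    model.eval().to(device)
    total, count = 0.0, 0
    for x, y in val_loader:
        logits = model(x.to(device))
        total += F.cross_entropy(logits, y.to(device), reduction="sum").item()
        count += y.numel()
    return total / count
```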
CN201880083507.4A 2018-10-30 2018-10-30 Optimization method of convolutional neural network and related products Pending CN111602145A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/112569 WO2020087254A1 (en) 2018-10-30 2018-10-30 Optimization method for convolutional neural network, and related product

Publications (1)

Publication Number Publication Date
CN111602145A 2020-08-28

Family

ID=70463304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201880083507.4A Pending CN111602145A (en) 2018-10-30 2018-10-30 Optimization method of convolutional neural network and related products

Country Status (2)

Country Link
CN (1) CN111602145A (en)
WO (1) WO2020087254A1 (en)


Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6042274B2 (en) * 2013-06-28 2016-12-14 株式会社デンソーアイティーラボラトリ Neural network optimization method, neural network optimization apparatus and program
CN105844653B * 2016-04-18 2019-07-30 深圳先进技术研究院 Multilayer convolutional neural network optimization system and method
CN106485324A * 2016-10-09 2017-03-08 成都快眼科技有限公司 Convolutional neural network optimization method
CN108319988B (en) * 2017-01-18 2021-12-24 华南理工大学 Acceleration method of deep neural network for handwritten Chinese character recognition

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114912569A (en) * 2021-02-10 2022-08-16 华为技术有限公司 Model training method and device
CN113128670A (en) * 2021-04-09 2021-07-16 南京大学 Neural network model optimization method and device
CN113128670B (en) * 2021-04-09 2024-03-19 南京大学 An optimization method and device for neural network models
CN114648671A (en) * 2022-02-15 2022-06-21 成都臻识科技发展有限公司 Detection model generation method and device based on deep learning

Also Published As

Publication number Publication date
WO2020087254A1 (en) 2020-05-07

Similar Documents

Publication Publication Date Title
JP6831347B2 (en) Learning equipment, learning methods and learning programs
CN111178520B (en) Method and device for constructing neural network
JP2020512639A5 (en)
PH12021551336A1 (en) Automated generation of machine learning models
WO2017157183A1 (en) Automatic multi-threshold characteristic filtering method and apparatus
WO2019091020A1 (en) Weight data storage method, and neural network processor based on method
CN111602145A (en) Optimization method of convolutional neural network and related products
WO2018227800A1 (en) Neural network training method and device
JP5624562B2 (en) Method and system for calculating website visitor ratings
CN107330446A Optimization method of deep convolutional neural networks for image classification
CN110291540A Batch renormalization layers
JP2015011510A (en) Neural network optimization method, neural network optimization apparatus and program
TW202036388A (en) Neural network, method to prune weights and output feature maps of layer of neural network, and neural network analyzer
CN111144548A (en) Method and device for identifying working condition of pumping well
US20190279092A1 (en) Convolutional Neural Network Compression
WO2018107383A1 (en) Neural network convolution computation method and device, and computer-readable storage medium
CN112905894B (en) Collaborative filtering recommendation method based on enhanced graph learning
US20210232912A1 (en) Systems and Methods for Providing a Machine-Learned Model with Adjustable Computational Demand
CN106897265B (en) Word vector training method and device
CN112771547A (en) End-to-end learning in a communication system
You et al. Recursive reduced kernel based extreme learning machine for aero-engine fault pattern recognition
CN103544528A (en) BP neural-network classification method based on Hadoop
JP2016006617A (en) Learning device, learning method, and learning program
CN108604313B (en) Automated predictive modeling and framework
WO2015192798A1 (en) Topic mining method and device

Legal Events

Code: Description
PB01: Publication
SE01: Entry into force of request for substantive examination
RJ01: Rejection of invention patent application after publication (application publication date: 2020-08-28)