
CN111602145A - Optimization method of convolutional neural network and related products - Google Patents

Optimization method of convolutional neural network and related products

Info

Publication number
CN111602145A
Authority
CN
China
Prior art keywords
model
convolutional layer
layer
loss value
intermediate model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201880083507.4A
Other languages
Chinese (zh)
Inventor
Zhao Ruizhe (赵睿哲)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Corerain Technologies Co Ltd
Original Assignee
Shenzhen Corerain Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Corerain Technologies Co Ltd filed Critical Shenzhen Corerain Technologies Co Ltd
Publication of CN111602145A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • G06N 3/08: Learning methods
    • G06N 3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N 3/09: Supervised learning
    • G06N 3/096: Transfer learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A method of optimizing a convolutional neural network, and related products. The method comprises: obtaining a pre-trained model M; retraining the pre-trained model M on a data set D of a specified domain to obtain an initial model M0; and performing a replace-layer operation on the initial model M0. The replace-layer operation comprises: determining, based on a bipartite-graph maximum-matching algorithm, that a standard convolutional layer e in the initial model M0 is suitable to be replaced by an efficient convolutional layer, and replacing the standard convolutional layer e with the efficient convolutional layer to obtain a first intermediate model M1; reshaping the parameters of the first intermediate model M1 to obtain a second intermediate model M2; initializing and retraining the second intermediate model M2 to obtain a third intermediate model M3; and calculating a loss value of the third intermediate model M3. The replace-layer operation is performed repeatedly to obtain a plurality of third intermediate models M3 and a plurality of loss values, and the third intermediate model M3 with the smallest loss value is selected as the output model. The method has the advantage of low cost.
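The search procedure summarized above maps naturally onto a simple loop. The following is a minimal sketch of that loop, assuming user-supplied training, scoring, replacement, and reshaping callables; every name in it (optimize_cnn, replace_fn, reshape_fn, and so on) is an illustrative placeholder, not the patent's implementation.

```python
import copy

def optimize_cnn(model_m, dataset_d, candidate_layers,
                 train_fn, loss_fn, replace_fn, reshape_fn):
    # Retrain the pre-trained model M on the domain data set D -> M0.
    m0 = train_fn(copy.deepcopy(model_m), dataset_d)

    candidates = []
    for layer in candidate_layers:
        # Replace-layer operation: swap a standard convolutional layer
        # for an efficient (e.g. grouped) convolution -> M1 ...
        m1 = replace_fn(copy.deepcopy(m0), layer)
        # ... reshape the parameters of M1 -> M2 ...
        m2 = reshape_fn(m1, layer)
        # ... initialize and retrain -> M3, then score it.
        m3 = train_fn(m2, dataset_d)
        candidates.append((loss_fn(m3, dataset_d), m3))

    # Keep the third intermediate model M3 with the smallest loss value.
    return min(candidates, key=lambda c: c[0])[1]
```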

Description

PCT national-phase application; the description has been published.

Claims (8)

  1. A method for optimizing a convolutional neural network, the method comprising the steps of:
    obtaining a pre-trained model M;
    retraining the pre-trained model M on a data set D of a specified domain to obtain an initial model M0, and performing a replace-layer operation on the initial model M0;
    the replace-layer operation comprising: determining, based on a bipartite-graph maximum-matching algorithm, that a standard convolutional layer e in the initial model M0 is suitable to be replaced by an efficient convolutional layer, and replacing the standard convolutional layer e with the efficient convolutional layer to obtain a first intermediate model M1; reshaping the parameters of the first intermediate model M1 to obtain a second intermediate model M2; initializing and retraining the second intermediate model M2 to obtain a third intermediate model M3; and calculating a loss value of the third intermediate model M3; and
    repeatedly performing the replace-layer operation to obtain a plurality of third intermediate models M3 and a plurality of loss values, and selecting the third intermediate model M3 with the smallest loss value as the output model.
  2. The method of claim 1, wherein determining, based on the bipartite-graph maximum-matching algorithm, that the standard convolutional layer e in the initial model M0 is suitable to be replaced by the efficient convolutional layer specifically comprises:
    finding, in the initial model M0, a grouped convolutional layer containing Ng groups such that the intra-layer connections undergo a minimum change in importance:
    [Formula image PCTCN2018112569-APPB-100001, not reproduced in this publication]
    wherein the importance is the L2 norm of all weights in each connection:
    [Formula image PCTCN2018112569-APPB-100002, not reproduced in this publication]
  3. The method of claim 1 or 2, wherein the loss value is given by:
    [Formula image PCTCN2018112569-APPB-100003, not reproduced in this publication]
    wherein Lw is the loss value.
  4. An apparatus for optimizing a convolutional neural network, the apparatus comprising:
    an obtaining unit, configured to obtain a pre-trained model M;
    a training unit, configured to retrain the pre-trained model M on a data set D of a specified domain to obtain an initial model M0;
    a replacement unit, configured to perform a replace-layer operation on the initial model M0, the replace-layer operation comprising: determining, based on a bipartite-graph maximum-matching algorithm, that a standard convolutional layer e in the initial model M0 is suitable to be replaced by an efficient convolutional layer, and replacing the standard convolutional layer e with the efficient convolutional layer to obtain a first intermediate model M1; reshaping the parameters of the first intermediate model M1 to obtain a second intermediate model M2; initializing and retraining the second intermediate model M2 to obtain a third intermediate model M3; and calculating a loss value of the third intermediate model M3; and
    a selecting unit, configured to control the replacement unit to repeatedly perform the replace-layer operation to obtain a plurality of third intermediate models M3 and a plurality of loss values, and to select the third intermediate model M3 with the smallest loss value as the output model.
  5. The apparatus of claim 4, wherein the replacement unit is specifically configured to find, in the initial model M0, a grouped convolutional layer containing Ng groups such that the intra-layer connections undergo a minimum change in importance:
    [Formula image PCTCN2018112569-APPB-100004, not reproduced in this publication]
    wherein the importance is the L2 norm of all weights in each connection:
    [Formula image PCTCN2018112569-APPB-100005, not reproduced in this publication]
    [Formula image PCTCN2018112569-APPB-100006, not reproduced in this publication]
  6. The apparatus of claim 4 or 5, wherein the loss value is given by:
    [Formula image PCTCN2018112569-APPB-100007, not reproduced in this publication]
    wherein Lw is the loss value.
  7. A computer-readable storage medium storing a program for electronic data exchange, wherein the program causes a terminal to perform the method as provided in any one of claims 1-3.
  8. A computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform the method as provided in any one of claims 1 to 3.
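The sketches below illustrate, under stated assumptions, the operations recited in claims 1 to 3; none of them reproduces the patent's actual implementation. First, one plausible reading of "reshaping the parameters of the first intermediate model M1" when the efficient layer is a grouped convolution: copy the block-diagonal portion of the standard convolution's weights into the grouped layer and discard the cross-group weights. The block-diagonal copy rule, and the use of PyTorch, are assumptions of this sketch.

```python
import torch
import torch.nn as nn

def to_grouped_conv(conv: nn.Conv2d, num_groups: int) -> nn.Conv2d:
    """Rebuild a standard Conv2d as a grouped Conv2d by keeping the weights
    that fall inside the diagonal channel blocks (an assumed reshaping rule)."""
    assert conv.in_channels % num_groups == 0
    assert conv.out_channels % num_groups == 0
    grouped = nn.Conv2d(
        conv.in_channels, conv.out_channels, conv.kernel_size,
        stride=conv.stride, padding=conv.padding, dilation=conv.dilation,
        groups=num_groups, bias=conv.bias is not None,
    )
    in_per_g = conv.in_channels // num_groups
    out_per_g = conv.out_channels // num_groups
    with torch.no_grad():
        for g in range(num_groups):
            # Grouped weights have shape [out_channels, in_channels/groups, kH, kW],
            # so each output block only sees its own input block.
            grouped.weight[g * out_per_g:(g + 1) * out_per_g].copy_(
                conv.weight[g * out_per_g:(g + 1) * out_per_g,
                            g * in_per_g:(g + 1) * in_per_g])
        if conv.bias is not None:
            grouped.bias.copy_(conv.bias)
    return grouped
```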
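Claim 2 scores each connection between an input channel and an output channel by the L2 norm of its weights, and looks for the Ng-group layout that changes total importance the least. The exact matching formulation is in unreproduced formula images, so the sketch below substitutes a simple stand-in: contiguous channel blocks are paired by the Hungarian assignment (scipy's linear_sum_assignment) rather than the claim's general bipartite maximum matching, so that the importance retained inside groups is maximized.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def grouping_importance_change(weight: np.ndarray, num_groups: int):
    """Stand-in for the claim-2 criterion (assumed formulation).

    weight: standard convolution weights of shape [C_out, C_in, kH, kW].
    Connection importance is the L2 norm of the weights joining one input
    channel to one output channel, as claim 2 recites.
    """
    c_out, c_in = weight.shape[:2]
    # importance[j, i] = L2 norm of all weights between input i and output j.
    importance = np.linalg.norm(weight.reshape(c_out, c_in, -1), axis=2)

    # Contiguous channel blocks stand in for the patent's matched groups.
    out_blocks = np.array_split(np.arange(c_out), num_groups)
    in_blocks = np.array_split(np.arange(c_in), num_groups)

    # retained[a][b] = importance kept if output block a shares a group with
    # input block b; everything outside the chosen pairs is lost.
    retained = np.array([[importance[np.ix_(ob, ib)].sum() for ib in in_blocks]
                         for ob in out_blocks])

    # Assignment that maximizes retained importance, i.e. minimizes the
    # change in importance caused by grouping.
    rows, cols = linear_sum_assignment(-retained)
    change = importance.sum() - retained[rows, cols].sum()
    return change, list(zip(rows.tolist(), cols.tolist()))
```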
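The loss value Lw of claim 3 is likewise defined only in an unreproduced formula image. As a generic stand-in for scoring each third intermediate model M3, a mean cross-entropy over a held-out validation loader can serve; this choice is an assumption, not the patented formula.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def loss_value(model, val_loader, device="cpu"):
    """Generic stand-in for the claim-3 loss Lw: mean cross-entropy
    over a validation loader."""
    model.eval().to(device)
    total, count = 0.0, 0
    for x, y in val_loader:
        logits = model(x.to(device))
        total += F.cross_entropy(logits, y.to(device), reduction="sum").item()
        count += y.numel()
    return total / count
```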
CN201880083507.4A 2018-10-30 2018-10-30 Optimization method of convolutional neural network and related products Pending CN111602145A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/112569 WO2020087254A1 (en) 2018-10-30 2018-10-30 Optimization method for convolutional neural network, and related product

Publications (1)

Publication Number Publication Date
CN111602145A 2020-08-28

Family

ID=70463304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201880083507.4A Pending CN111602145A (en) 2018-10-30 2018-10-30 Optimization method of convolutional neural network and related products

Country Status (2)

Country Link
CN (1) CN111602145A (en)
WO (1) WO2020087254A1 (en)


Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6042274B2 (en) * 2013-06-28 2016-12-14 株式会社デンソーアイティーラボラトリ Neural network optimization method, neural network optimization apparatus and program
CN105844653B * 2016-04-18 2019-07-30 深圳先进技术研究院 Multilayer convolutional neural network optimization system and method
CN106485324A * 2016-10-09 2017-03-08 成都快眼科技有限公司 Convolutional neural network optimization method
CN108319988B (en) * 2017-01-18 2021-12-24 华南理工大学 Acceleration method of deep neural network for handwritten Chinese character recognition

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114912569A (en) * 2021-02-10 2022-08-16 华为技术有限公司 Model training method and device
CN113128670A (en) * 2021-04-09 2021-07-16 南京大学 Neural network model optimization method and device
CN113128670B (en) * 2021-04-09 2024-03-19 南京大学 An optimization method and device for neural network models
CN114648671A (en) * 2022-02-15 2022-06-21 成都臻识科技发展有限公司 Detection model generation method and device based on deep learning

Also Published As

Publication number Publication date
WO2020087254A1 (en) 2020-05-07

Similar Documents

Publication Publication Date Title
JP6831347B2 (en) Learning equipment, learning methods and learning programs
CN111178520B (en) Method and device for constructing neural network
JP2020512639A5 (en)
PH12021551336A1 (en) Automated generation of machine learning models
WO2017157183A1 (en) Automatic multi-threshold characteristic filtering method and apparatus
WO2019091020A1 (en) Weight data storage method, and neural network processor based on method
CN111602145A (en) Optimization method of convolutional neural network and related products
WO2018227800A1 (en) Neural network training method and device
JP5624562B2 (en) Method and system for calculating website visitor ratings
CN107330446A Optimization method of deep convolutional neural networks for image classification
CN110291540A Batch renormalization layers
JP2015011510A (en) Neural network optimization method, neural network optimization apparatus and program
TW202036388A (en) Neural network, method to prune weights and output feature maps of layer of neural network, and neural network analyzer
CN111144548A (en) Method and device for identifying working condition of pumping well
US20190279092A1 (en) Convolutional Neural Network Compression
WO2018107383A1 (en) Neural network convolution computation method and device, and computer-readable storage medium
CN112905894B (en) Collaborative filtering recommendation method based on enhanced graph learning
US20210232912A1 (en) Systems and Methods for Providing a Machine-Learned Model with Adjustable Computational Demand
CN106897265B (en) Word vector training method and device
CN112771547A (en) End-to-end learning in a communication system
You et al. Recursive reduced kernel based extreme learning machine for aero-engine fault pattern recognition
CN103544528A (en) BP neural-network classification method based on Hadoop
JP2016006617A (en) Learning device, learning method, and learning program
CN108604313B (en) Automated predictive modeling and framework
WO2015192798A1 (en) Topic mining method and device

Legal Events

Code: Description
PB01: Publication
SE01: Entry into force of request for substantive examination
RJ01: Rejection of invention patent application after publication (application publication date: 2020-08-28)