CN109190666B - Flower image classification method based on improved deep neural network - Google Patents
- Publication number
- CN109190666B (application CN201810854879A)
- Authority
- CN
- China
- Prior art keywords
- network
- training
- data set
- flower
- improved
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/24—Classification techniques
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
Abstract
The invention provides a flower image classification method based on an improved deep neural network. The method uses transfer learning: an InceptionV3 network trained on a large-scale dataset is applied to the classification of a flower image dataset, and its activation function is improved. Experiments on the public Oxford flower-102 dataset show that the model achieves higher classification accuracy on flower images than traditional methods and ordinary convolutional neural networks, and is also more accurate than the unimproved convolutional neural network: accuracy reaches 81.32% in the transfer stage and 92.85% in the fine-tuning stage.
Description
Technical Field
The invention relates to a flower image classification method, and in particular to a flower image classification method based on an improved deep neural network.
Background Art
Classifying images of flowers is a difficult task. The main difficulty comes from intra-class and inter-class variation. For example, some images from different categories vary less between themselves than images within a single category do, and small differences determine their classification. In addition, as continuously growing, non-rigid objects, flowers can deform in many ways, which adds further variation within each class. Many traditional methods have been applied to flower recognition.
Traditional methods need to build a classifier for each flower class and acquire a large number of flower samples to train these classifiers. In practice, the many different types of flowers make this work difficult and tedious. Typical flower images also vary greatly in scale and viewpoint, and exhibit partial occlusion, changing illumination, and multiple instances per image.
Because traditional flower classification methods require large amounts of manually annotated information and extract insufficient feature information, their classification ability is limited. On the Oxford flower-102 dataset, traditional flower classification methods all achieve accuracy below 81%.
In recent years, deep learning has made breakthroughs in many fields. Convolutional neural networks (CNNs) are widely used in image classification tasks because they readily learn high-level image features.
However, traditional convolutional neural networks have the following disadvantages in flower image classification: 1. The whole network applies a large number of nonlinear transformations and is prone to overfitting. 2. Traditional network architectures have too few layers, so the extracted image features are not comprehensive. 3. In the classic BP neural network, gradient dispersion occurs during error backpropagation when there are too many layers. 4. Flower images have small inter-class differences, so some similar classes of flowers may not be well distinguished by the network.
Summary of the Invention
In view of the above technical problems, the present invention provides a flower image classification method based on an improved deep neural network. The method can effectively improve the network's ability to extract feature information, reduce overfitting, and increase the network's classification ability.
The technical scheme adopted by the present invention is as follows:
An embodiment of the present invention provides a flower image classification method based on an improved deep neural network, including: training a basic InceptionV3 network on a large-scale dataset to obtain a pre-trained network; improving the pre-trained network to obtain an improved network suited to a flower recognition dataset, the dataset comprising a training dataset and a test dataset; transferring the improved network to the training dataset for transfer training to obtain a transfer-trained network; modifying the activation function of the transfer-trained network to the Tanh-ReLU function, obtained by correcting the Tanh and ReLU functions, to obtain a network with the improved activation function; fine-tuning the network with the improved activation function on the training dataset to obtain a fine-tuned network; and feeding the test dataset into the fine-tuned network to classify flower images.
Optionally, improving the pre-trained network to obtain an improved network suited to a flower recognition dataset includes: deleting the last fully connected layer of the pre-trained network, adding a global average pooling layer, adding a first fully connected layer after the global average pooling layer, and adding a second fully connected layer after the first fully connected layer, thereby obtaining the improved network; wherein the first fully connected layer contains 1024 nodes, uses the ReLU activation function, and applies Dropout with probability 0.5; the second fully connected layer uses the Softmax activation function and outputs 102 classes.
Optionally, transferring the improved network to the training dataset for transfer training to obtain a transfer-trained network includes: keeping the network weights of the original InceptionV3 part unchanged and using the training dataset to train the parameters of the last 4 layers of the network, thereby obtaining the transfer-trained network;
wherein the RMSprop optimizer is used to train the parameters; during gradient descent, each batch contains 32 samples and the number of epochs is set to 30.
Optionally, the Tanh-ReLU function is expressed as: Tanh-ReLU(x) = tanh(x) for x < 0, and Tanh-ReLU(x) = x for x ≥ 0.
Optionally, fine-tuning the network with the improved activation function on the training dataset to obtain the fine-tuned network includes: freezing the parameters of the first two Inception blocks in the network with the improved activation function so that their values remain unchanged during training, and using the training dataset to retrain the parameters of the remaining layers, thereby obtaining the fine-tuned network;
wherein the SGD optimizer is used to train the parameters, with the learning rate set to 0.001, the momentum parameter set to 0.9, and the cross-entropy loss function; during gradient descent, each batch contains 32 samples and the number of epochs is set to 30.
Optionally, the dataset is processed by data augmentation.
Optionally, processing the dataset by data augmentation includes:
tilting the images at different angles and flipping the images horizontally and vertically to increase the number of samples;
randomly cropping the images to 80% of their size and randomly scaling them between 80% and 120% to increase the number of samples; and
adding a moderate amount of Gaussian noise to the images.
The flower image classification method based on an improved deep neural network provided by the embodiments of the present invention uses transfer learning: an InceptionV3 network trained on a large-scale dataset is applied to the classification of a flower image dataset, and its activation function is improved. Experiments on the public Oxford flower-102 dataset show that the model achieves higher classification accuracy on flower images than traditional methods and ordinary convolutional neural networks, and is also more accurate than the unimproved convolutional neural network: accuracy reaches 81.32% in the transfer stage and 92.85% in the fine-tuning stage.
Brief Description of the Drawings
Fig. 1 is a schematic flowchart of the flower image classification method based on an improved deep neural network provided by an embodiment of the present invention;
Fig. 2 shows the curves of the Tanh function and the ReLU function;
Fig. 3 shows the Tanh-ReLU function and its derivative curve;
Fig. 4 shows the classification accuracy of different activation functions on the 102 flower classes as a function of the number of epochs.
Detailed Description of Embodiments
To make the technical problems, technical solutions, and advantages of the present invention clearer, they are described in detail below with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a schematic flowchart of the flower image classification method based on an improved deep neural network provided by an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
S101. Train the basic InceptionV3 network on a large-scale dataset to obtain a pre-trained network;
S102. Improve the pre-trained network to obtain an improved network suited to a flower recognition dataset, the dataset comprising a training dataset and a test dataset;
S103. Transfer the improved network to the training dataset for transfer training to obtain a transfer-trained network;
S104. Modify the activation function of the transfer-trained network to the Tanh-ReLU function, obtained by correcting the Tanh and ReLU functions, to obtain a network with the improved activation function;
S105. Fine-tune the network with the improved activation function on the training dataset to obtain a fine-tuned network;
S106. Feed the test dataset into the fine-tuned network to classify flower images.
Each of the above steps is described in detail below.
S101. Train the basic InceptionV3 network on a large-scale dataset to obtain a pre-trained network.
The embodiment of the present invention uses an InceptionV3 network trained on a large-scale dataset as the flower classification network, thereby obtaining the pre-trained network.
The Inception structures used in this embodiment improve on InceptionV2 with three kinds of Inception modules, as follows:
In the first Inception structure, each 5×5 convolution is replaced by two 3×3 convolutions.
In the second Inception structure, each n×n convolution is factorized into a stack of n×1 and 1×n convolutions. For the 17×17 grid, n is finally chosen to be 7.
In the third Inception structure, the outputs of the convolution kernel groups are expanded. This architecture is used on the coarsest grid to promote high-dimensional representations.
The InceptionV3 network model used in this embodiment makes the following improvements over V2: the SGD optimizer is replaced with RMSProp, an LSR (label-smoothing regularization) layer is added after the class fully connected layer, and the 7×7 convolution kernel is replaced by three 3×3 convolution kernels.
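The parameter savings of these factorizations can be checked with a quick calculation. This is an illustrative sketch only: it counts convolution weights, ignores biases, and assumes an equal (hypothetical) channel count C throughout, whereas real Inception modules vary channel widths.

```python
def conv_params(k_h, k_w, c_in, c_out):
    """Weight count of a k_h x k_w convolution mapping c_in to c_out channels."""
    return k_h * k_w * c_in * c_out

C = 64  # hypothetical channel count, for illustration only

# One 5x5 conv vs. two stacked 3x3 convs (same 5x5 receptive field)
p_5x5 = conv_params(5, 5, C, C)
p_two_3x3 = 2 * conv_params(3, 3, C, C)

# One 7x7 conv vs. three stacked 3x3 convs (same 7x7 receptive field)
p_7x7 = conv_params(7, 7, C, C)
p_three_3x3 = 3 * conv_params(3, 3, C, C)

# One 7x7 conv vs. the asymmetric 7x1 + 1x7 factorization
p_asym = conv_params(7, 1, C, C) + conv_params(1, 7, C, C)

print(p_two_3x3 / p_5x5)    # 18/25 = 0.72, a 28% reduction
print(p_three_3x3 / p_7x7)  # 27/49, a ~45% reduction
print(p_asym / p_7x7)       # 14/49, a ~71% reduction
```

The receptive field is preserved in each case, which is why the factorized forms can substitute for the larger kernels at a fraction of the parameter cost.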
S102. Improve the pre-trained network to obtain an improved network suited to a flower recognition dataset.
The flower classification experiment of this embodiment needs to classify 102 classes of flowers. To adapt the network to flower classification, it is improved as follows: the last fully connected layer of the pre-trained network is deleted and a global average pooling layer is added; to enlarge the receptive field, a first fully connected layer is added after the global average pooling layer, and a second fully connected layer is added after the first, thereby obtaining the improved network. The first fully connected layer contains 1024 nodes, uses the ReLU activation function, and applies Dropout with probability 0.5 to prevent overfitting; the second fully connected layer uses the Softmax activation function and outputs 102 classes. The improved network structure is shown in Table 1.
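This network surgery can be sketched in Keras (an illustrative reconstruction, not the patent's code; `weights=None` keeps the sketch self-contained, whereas the method assumes weights pre-trained on a large-scale dataset):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Base InceptionV3 without its original top fully connected layer.
base = tf.keras.applications.InceptionV3(
    weights=None, include_top=False, input_shape=(299, 299, 3))

x = layers.GlobalAveragePooling2D()(base.output)      # added global average pooling
x = layers.Dense(1024, activation="relu")(x)          # first FC layer, 1024 nodes, ReLU
x = layers.Dropout(0.5)(x)                            # Dropout with probability 0.5
outputs = layers.Dense(102, activation="softmax")(x)  # second FC layer, 102 classes

model = models.Model(inputs=base.input, outputs=outputs)
```

With the four added layers on top of the convolutional base, the resulting model should roughly match the 315-layer structure reported in Table 1, with input 299×299×3 and output a 102-way probability vector.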
Table 1. Improved network structure (315 layers)
As can be seen from Table 1, the network input is 299×299×3, i.e., the size of the input image. The network output is 1×1×102, corresponding to the probability of each flower class.
S103. Transfer the improved network to the training dataset for transfer training to obtain a transfer-trained network.
This step includes: keeping the network weights of the original InceptionV3 part unchanged and using the training dataset to train the parameters of the last 4 layers, thereby obtaining the transfer-trained network. Since there are few trainable parameters, the relatively stable RMSprop optimizer is chosen. During training, each gradient-descent batch contains 32 samples, and the number of epochs is set to 30.
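A sketch of this transfer-training setup in Keras. The model construction repeats the modified head from S102; `weights=None` and the commented-out `fit` call are placeholders for the pre-trained weights and the actual training data, which are not reproducible here:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Rebuild the modified network (sketch; the method assumes pre-trained weights).
base = tf.keras.applications.InceptionV3(
    weights=None, include_top=False, input_shape=(299, 299, 3))
x = layers.GlobalAveragePooling2D()(base.output)
x = layers.Dense(1024, activation="relu")(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(102, activation="softmax")(x)
model = models.Model(base.input, outputs)

# Transfer stage: keep the original InceptionV3 weights fixed and train
# only the 4 newly added top layers.
for layer in model.layers[:-4]:
    layer.trainable = False

model.compile(optimizer=tf.keras.optimizers.RMSprop(),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_generator, epochs=30)  # batches of 32 samples, as in the paper
```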
S104. Modify the activation function of the transfer-trained network to the Tanh-ReLU function, obtained by correcting the Tanh and ReLU functions, to obtain a network with the improved activation function.
The improved InceptionV3 network structure proposed by this embodiment is as shown in Table 1 above, except that the activation function is changed to the Tanh-ReLU function obtained by correcting the Tanh and ReLU functions. The last layer is a fully connected layer, which finally yields the flower classification probabilities. The key feature of the present invention is to combine the non-saturating ReLU function with the soft-saturation characteristic of the Tanh function. The Tanh function can be expressed as:
Tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))
The ReLU function can be expressed as:
ReLU(x) = max(0, x)
The respective function curves are shown in Fig. 2.
As can be seen from Fig. 2, the ReLU function sets all values less than zero to 0 and leaves values greater than zero unchanged. This gives the function fast convergence for positive inputs, but because all negative values are set to 0, node outputs may die irreversibly during training, disrupting the data flow. In the task of flower image classification, flowers have small inter-class differences, and since the ReLU output has no negative values, the bias accumulated between activation layers affects the classification result.
Based on this, the present invention takes the left part of the Tanh function for inputs less than zero and the right half of the ReLU function for inputs greater than zero, denoted the Tanh-ReLU function. The function is expressed as:
Tanh-ReLU(x) = tanh(x) for x < 0, and Tanh-ReLU(x) = x for x ≥ 0.
The Tanh-ReLU function curve and its derivative curve are shown in Fig. 3.
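The piecewise function and its first derivative can be written directly in NumPy (an illustrative implementation, not the patent's code):

```python
import numpy as np

def tanh_relu(x):
    """Tanh-ReLU: tanh(x) for x < 0, identity (the ReLU branch) for x >= 0."""
    return np.where(x < 0, np.tanh(x), x)

def tanh_relu_grad(x):
    """First derivative: 1 - tanh(x)^2 for x < 0, 1 for x >= 0."""
    return np.where(x < 0, 1.0 - np.tanh(x) ** 2, 1.0)

x = np.linspace(-5.0, 5.0, 11)
y = tanh_relu(x)
# Negative inputs saturate softly in (-1, 0); positive inputs pass through
# unchanged, so activations can take negative values and the layer's output
# mean can sit closer to zero than with ReLU.
```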
When the network backpropagates with the BP algorithm, the error propagates from the output layer, and at each layer the first derivative of the current activation function and the value of the current neuron must be multiplied, i.e. Grad = Error × (Tanh-ReLU)′(x) × x. The derivative of this function is essentially the same as that of ReLU, so it inherits ReLU's advantages and can effectively alleviate the vanishing-gradient problem, allowing deep neural networks to be trained directly in a supervised manner without unsupervised layer-by-layer pre-training.
For inputs greater than zero, Tanh-ReLU inherits the advantages of ReLU: its linear, non-saturating form gives faster convergence during gradient descent. For inputs less than zero, it overcomes ReLU's tendency to destroy the data manifold and has a soft-saturation characteristic (bounded between -1 and 0), improving robustness to noise. For the flower image classification task, it therefore classifies better given the small inter-class differences between flower images.
Compared with the ReLU function, Tanh-ReLU has the following two characteristics: 1. Its activation value is negative for x < 0, where its derivative is not 0. When the input of the ReLU function is negative, the output of its derivative becomes 0, which causes the dying-neuron problem. The Tanh-ReLU function alleviates this problem and gives the negative input range a soft-saturation characteristic, which improves robustness to noise. 2. It allows the output mean to be 0. All outputs of ReLU are non-negative, so the output mean is necessarily non-negative, which causes a mean shift in the network and may prevent convergence when training some very deep networks. The Tanh-ReLU function fixes this, allowing the mean to be 0.
In summary, targeting the characteristics of flower images, the present invention changes the activation function in the network structure of Table 1 to the Tanh-ReLU function, generating the improved InceptionV3 network structure.
S105. Fine-tune the network with the improved activation function on the training dataset to obtain a fine-tuned network.
This step includes: freezing the parameters of the first two Inception blocks in the network obtained in S104 so that their values remain unchanged during training, and using the training dataset to retrain the parameters of the remaining layers. Since there are many trainable parameters, the faster-converging SGD optimizer is chosen, with the learning rate set to 0.001, the momentum parameter set to 0.9, and the cross-entropy loss function. The number of epochs is set to 30, and the batch size during gradient descent is the same as in the transfer training stage.
Further, the dataset used in the embodiment of the present invention is processed by data augmentation. The augmentation may include:
tilting the images at different angles and flipping the images horizontally and vertically to increase the number of samples;
randomly cropping the images to 80% of their size and randomly scaling them between 80% and 120% to increase the number of samples; and
adding a moderate amount of Gaussian noise to the images.
[Example]
The advantages of the flower image classification method provided by the embodiments of the present invention are described below through experiments.
[Experiments and Analysis]
Experimental environment
The software and hardware environment used in this experiment is shown in Table 2.
Table 2. Experimental software and hardware environment
Under Linux, this experiment uses the Keras deep learning framework on top of TensorFlow to train and test flower images.
This experiment uses the public Oxford flower-102 dataset, a flower image database created by the Visual Geometry Group at the University of Oxford. It contains 102 flower categories, with 40 to 258 images per category and 8189 images in total. The database covers all the main difficulties in image recognition, such as illumination changes, viewpoint changes, complex backgrounds, highly varied flower morphology, and complex color changes. In addition, some different flowers are highly similar. The dataset is therefore of great significance for research on flower image classification.
Data augmentation
Data augmentation can greatly increase the sample size of the training dataset and improve the generalization ability of the network model. In essence, data augmentation artificially increases the sample size of a dataset through data transformations such as affine transformations.
The database has only 8189 flower images. For the 102-class classification task, this is on average only about 80 images per class, which is still a very small amount of data per class, so data augmentation is required to fully meet the needs of training the network.
1. Considering the different directions from which flowers may be photographed, and to ensure rotation and tilt invariance during image recognition, the images are tilted at different angles and flipped horizontally and vertically. This increases the number of samples.
2. Considering that under a complex background a part of a flower image still shows the same flower species, the images are randomly cropped to 80% of their size and randomly scaled between 80% and 120% to increase the number of samples.
3. Considering rain, fog, snow, and lighting changes, images captured in different seasons or at different times of day vary in hue, brightness, and saturation, so a moderate amount of Gaussian noise is added.
Through the above three augmentation methods, the training data generator continuously produces training data from the original data until the target number of epochs is reached. During training, the augmented data effectively reduces overfitting and increases the convolutional network's ability to recognize flower images.
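The three augmentation steps above can be approximated with Keras's `ImageDataGenerator`. This is a sketch under stated assumptions: the exact parameter values are assumptions, random cropping is approximated by the zoom range, and Gaussian noise is added through a custom `preprocessing_function`, since `ImageDataGenerator` has no built-in noise option:

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def add_gaussian_noise(img):
    """Moderately noise the image, simulating rain/fog/lighting variation."""
    return img + np.random.normal(0.0, 0.05, img.shape)

datagen = ImageDataGenerator(
    rotation_range=30,        # tilt at different angles (assumed range)
    horizontal_flip=True,     # horizontal flip
    vertical_flip=True,       # vertical flip
    zoom_range=(0.8, 1.2),    # 80%-120% random zoom (stands in for cropping)
    preprocessing_function=add_gaussian_noise,
)

# Example: stream augmented batches from an in-memory array.
images = np.random.rand(8, 299, 299, 3).astype("float32")
labels = np.eye(102)[np.random.randint(0, 102, 8)]
batch_x, batch_y = next(datagen.flow(images, labels, batch_size=8))
```

In practice such a generator is passed to `model.fit`, producing fresh augmented samples each epoch rather than a fixed expanded dataset.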
The database contains 8189 images, of which 7169 are used as the training set and 1020 as the test set. Data augmentation expands the dataset to 30 times its original size, effectively avoiding overfitting of the network.
Experiments on the InceptionV3 network structure based on the Tanh-ReLU activation function
After image data augmentation, the images are preprocessed. Since the 102 classes of flower images have unequal resolutions, all images are scaled to 299×299 pixels to satisfy the network's uniform input requirement. Since raw pixel values range from 0 to 255 and complicate the input computation, the pixel values are compressed from [0, 255] to [-1, 1] to simplify the network's input.
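The pixel-rescaling step can be expressed in a few lines (a NumPy sketch; the 299×299 resize itself would use an image library and is omitted here):

```python
import numpy as np

def preprocess_pixels(img):
    """Map pixel values from [0, 255] to [-1, 1], as expected by InceptionV3."""
    x = img.astype("float32")
    x /= 127.5   # now in [0, 2]
    x -= 1.0     # now in [-1, 1]
    return x

img = np.array([[0.0, 127.5, 255.0]])
out = preprocess_pixels(img)  # → [[-1., 0., 1.]]
```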
Considering that the small number of training samples requires suppressing more neurons, the Dropout ratio is set to 0.5 when training on the Oxford flower-102 database, preventing overfitting.
To replace the fully connected layer, the final pooling layer uses global average pooling: each feature map of the last layer is average-pooled over the whole map to form one feature point, and these feature points form the final feature vector, which is convenient for the final Softmax computation.
The transfer process fixes the parameters of the original InceptionV3 part and trains the parameters of the remaining top 4 layers. The optimizer is RMSProp and the loss function is categorical cross-entropy. The batch size is set to 32 and the number of epochs to 30, i.e., 32×30 = 960 iterations.
The fine-tuning process fixes the parameters of the first two Inception blocks of InceptionV3, i.e., the parameters of the first 127 layers, and trains the remaining top parameters. The optimizer is SGD, with the learning rate set to 0.0001 and the momentum parameter to 0.9; the loss function, batch size, and number of epochs are the same as in the transfer process.
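A sketch of this fine-tuning setup in Keras. As before, `weights=None` and the rebuilt head are placeholders for the transfer-trained model, which is not reproducible here; the layer index 127 follows the description above:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.applications.InceptionV3(
    weights=None, include_top=False, input_shape=(299, 299, 3))
x = layers.GlobalAveragePooling2D()(base.output)
x = layers.Dense(1024, activation="relu")(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(102, activation="softmax")(x)
model = models.Model(base.input, outputs)

# Fine-tuning stage: freeze the first two Inception blocks (the first 127
# layers) and retrain everything above them.
for layer in model.layers[:127]:
    layer.trainable = False
for layer in model.layers[127:]:
    layer.trainable = True

model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=1e-4, momentum=0.9),
    loss="categorical_crossentropy",
    metrics=["accuracy"])
# model.fit(train_generator, epochs=30)  # batches of 32 samples, as in the paper
```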
Results analysis
To verify whether the improved activation function raises classification accuracy, this work classifies the 102 flower categories using three activation functions: Tanh, ReLU, and Tanh-ReLU. The accuracies of the transfer and fine-tuning stages are shown in Table 3, where the first 30 epochs are the transfer stage and the last 30 epochs are the fine-tuning stage. During the transfer stage, because the RMSProp optimizer first searches for an initial learning rate and then decreases it by orders of magnitude, the accuracy fluctuates somewhat, but overall the Tanh-ReLU activation function achieves higher accuracy than the other two. The fine-tuning stage uses the comparatively stable SGD optimizer, computing one sample at a time, and the classification accuracy of the improved Tanh-ReLU activation function is visibly higher, as shown in Table 3 below.
Table 3. Classification accuracy on the 102 flower categories with different activation functions
In the 102-category flower image classification experiment, the Tanh function, the ReLU function, and the improved Tanh-ReLU function are compared directly. In both the transfer and fine-tuning stages the training batch size is 32 and the number of epochs is 30, for a total of 60 epochs. The experiments show that the improved activation function not only speeds up network convergence but also raises the recognition rate on flower images. As Table 3 shows, in the transfer stage the improved Tanh-ReLU function achieves accuracies 2.3% and 0.48% higher than the Tanh and ReLU activation functions, respectively, and in the fine-tuning stage its accuracies are 2.08% and 1.32% higher, respectively.
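This excerpt does not reproduce the Tanh-ReLU definition itself. A common way to combine the two functions (our assumption, shown only to illustrate the hybrid idea, not the patent's exact formula) is to keep ReLU's identity for positive inputs and Tanh's bounded, non-zero response for negative inputs, so negative activations still receive gradient:

```python
import math

def tanh_relu(x):
    """Hypothetical hybrid activation: linear (ReLU-like) for x >= 0,
    tanh for x < 0. This piecewise form is an assumption made for
    illustration; the patent's exact Tanh-ReLU definition is given
    elsewhere in the specification."""
    return x if x >= 0 else math.tanh(x)
```

Under this form the function is unbounded above like ReLU but bounded below by −1 like Tanh, and, unlike ReLU, it never produces a dead (zero-gradient) region for negative inputs.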
Classifying the 102 flower categories yields an accuracy of 92.85%. The flower images classified with the highest accuracy during the transfer and fine-tuning stages were analyzed: most of the flowers with high classification accuracy differ markedly from other flowers in color or shape, as shown in Figure 4.
In summary, to address the incomplete extraction of image feature information when traditional networks perform flower image classification, the present invention applies the idea of transfer learning and adopts the InceptionV3 architecture pre-trained on the ImageNet dataset. Because traditional networks with many parameters are prone to overfitting, data augmentation is used to enlarge the dataset appropriately. On top of the InceptionV3 architecture, the network's activation function is improved by adopting the Tanh-ReLU activation function. Experiments show that the resulting model classifies better than the unimproved network model and outperforms traditional methods and other deep neural network architectures. The transfer stage reaches an accuracy of 81.32% and the fine-tuning stage 92.85%, verifying the accuracy of the improved method on the flower image classification task and its feasibility for flower recognition.
It should be noted that the method provided by the present invention can also be extended to similar fields of research. It is broadly applicable to botanical classification, although further in-depth study should draw on expert botanical knowledge bases; it can likewise serve as a reference for research on animal species.
The embodiments described above are merely specific implementations of the present invention, provided to illustrate its technical solutions rather than to limit them, and the scope of protection of the present invention is not restricted to them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that any person familiar with the technical field may, within the technical scope disclosed by the present invention, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of variations, or make equivalent substitutions for some of their technical features; such modifications, variations, or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of the claims.
Claims (5)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201810854879.4A CN109190666B (en) | 2018-07-30 | 2018-07-30 | Flower image classification method based on improved deep neural network |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN109190666A CN109190666A (en) | 2019-01-11 |
| CN109190666B true CN109190666B (en) | 2022-04-29 |
Family
ID=64937434
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201810854879.4A Active CN109190666B (en) | 2018-07-30 | 2018-07-30 | Flower image classification method based on improved deep neural network |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN109190666B (en) |
Families Citing this family (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110347851A (en) * | 2019-05-30 | 2019-10-18 | 中国地质大学(武汉) | Image search method and system based on convolutional neural networks |
| CN111383357A (en) * | 2019-05-31 | 2020-07-07 | 纵目科技(上海)股份有限公司 | Network model fine-tuning method, system, terminal and storage medium adapting to target data set |
| CN110739051B (en) * | 2019-10-08 | 2022-06-03 | 中山大学附属第三医院 | A method for establishing a model of eosinophil proportion using pathological pictures of nasal polyps |
| CN111008674B (en) * | 2019-12-24 | 2022-05-03 | 哈尔滨工程大学 | An underwater target detection method based on fast cycle unit |
| CN114758357A (en) * | 2022-04-14 | 2022-07-15 | 哈尔滨理工大学 | Animal species identification method based on neural network and improved K-SVD algorithm |
| CN118452912B (en) * | 2024-05-30 | 2025-01-10 | 南通大学 | A 1D-CNN blood glucose concentration prediction method based on activation function |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107404656A (en) * | 2017-06-26 | 2017-11-28 | 武汉斗鱼网络科技有限公司 | Live video recommends method, apparatus and server |
| CN107423815A (en) * | 2017-08-07 | 2017-12-01 | 北京工业大学 | A kind of computer based low quality classification chart is as data cleaning method |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11074495B2 (en) * | 2013-02-28 | 2021-07-27 | Z Advanced Computing, Inc. (Zac) | System and method for extremely efficient image and pattern recognition and artificial intelligence platform |
- 2018-07-30: CN application CN201810854879.4A filed (granted as CN109190666B); status: Active
Non-Patent Citations (1)
| Title |
|---|
| Fine-grained image classification with an improved deep convolutional neural network; Yang Guoliang et al.; Journal of Jiangxi Normal University (Natural Science Edition); 2017-09-30; Vol. 41, No. 5; pp. 476–483 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN109190666A (en) | 2019-01-11 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN109190666B (en) | Flower image classification method based on improved deep neural network | |
| CN106991372B (en) | A dynamic gesture recognition method based on a hybrid deep learning model | |
| CN108491765B (en) | Method and system for classification and recognition of vegetable images | |
| CN110619369B (en) | Fine-grained image classification method based on feature pyramid and global average pooling | |
| Liu et al. | Meta-learning based prototype-relation network for few-shot classification | |
| CN109255340A (en) | It is a kind of to merge a variety of face identification methods for improving VGG network | |
| CN108304826A (en) | Facial expression recognizing method based on convolutional neural networks | |
| CN108596039A (en) | A kind of bimodal emotion recognition method and system based on 3D convolutional neural networks | |
| WO2018052587A1 (en) | Method and system for cell image segmentation using multi-stage convolutional neural networks | |
| CN108648191A (en) | Pest image-recognizing method based on Bayes's width residual error neural network | |
| CN110929610A (en) | Plant disease identification method and system based on CNN model and transfer learning | |
| CN107871136A (en) | Image Recognition Method Based on Convolutional Neural Network with Sparsity Random Pooling | |
| CN105205475A (en) | Dynamic gesture recognition method | |
| CN108416318A (en) | Diameter radar image target depth method of model identification based on data enhancing | |
| CN107085704A (en) | Fast Facial Expression Recognition Method Based on ELM Autoencoding Algorithm | |
| CN106682569A (en) | Fast traffic signboard recognition method based on convolution neural network | |
| CN106407986A (en) | Synthetic aperture radar image target identification method based on depth model | |
| US11695898B2 (en) | Video processing using a spectral decomposition layer | |
| CN106709482A (en) | Method for identifying genetic relationship of figures based on self-encoder | |
| CN111783688B (en) | A classification method of remote sensing image scene based on convolutional neural network | |
| CN106845525A (en) | A kind of depth confidence network image bracket protocol based on bottom fusion feature | |
| CN113723456B (en) | Automatic astronomical image classification method and system based on unsupervised machine learning | |
| CN112258431A (en) | Image classification model and classification method based on hybrid depthwise separable dilated convolution | |
| CN107423747A (en) | A kind of conspicuousness object detection method based on depth convolutional network | |
| CN105976397B (en) | A kind of method for tracking target |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |