CN110059558A

CN110059558A - Real-time orchard obstacle detection method based on an improved SSD network

Info

Publication number: CN110059558A
Application number: CN201910198144.5A
Authority: CN
Inventors: 刘慧; 张礼帅; 沈跃; 吴边; 张健
Original assignee: Jiangsu University
Current assignee: Jiangsu University
Priority date: 2019-03-15
Filing date: 2019-03-15
Publication date: 2019-07-26
Anticipated expiration: 2039-03-15
Also published as: CN110059558B

Abstract

The invention discloses a real-time orchard obstacle detection method based on an improved SSD. An improved SSD deep-learning object detection method is used to identify obstacles in the orchard environment. The lightweight network MobileNetV2 serves as the base network of the SSD model, reducing the time and computation spent on image feature extraction. The auxiliary layers use an inverted residual structure combined with dilated convolution as the basic structure for position prediction, so that multi-scale features can be integrated while avoiding the information loss caused by downsampling. The improved SSD object detection model is trained on a corresponding image dataset, and images collected by a camera are fed to the trained model to detect target positions. This addresses the problems of traditional obstacle detection algorithms, which are prone to background interference, locate obstacles inaccurately, and have difficulty detecting multiple obstacle categories simultaneously.

Description

A Real-time Detection Method of Orchard Obstacles Based on Improved SSD Network

Technical field

The invention belongs to the fields of computer vision and deep learning, and specifically relates to an obstacle detection method for the intelligent operation of mobile robots in outdoor orchard environments.

Background

Accurate identification of obstacles in farmland is one of the key technologies indispensable to unmanned agricultural vehicles. With the introduction of precision agriculture theory and the development of intelligent robots, the automatic navigation of intelligent agricultural vehicles has attracted growing attention at home and abroad. Autonomously navigated agricultural vehicles can replace manual labor, improve operating efficiency, and reduce agricultural production costs. To ensure that intelligent vehicles operate safely in the field without human intervention, real-time obstacle detection is required. Obstacle detection in the field is challenging because of the complex natural environment, the variability of obstacle shapes, and the wide variation in external conditions such as illumination. In the field, ultrasonic sensors locate obstacles with poor spatial accuracy and are susceptible to interference; lidar sensors detect obstacles more directly, but such systems are expensive. Compared with other obstacle detection methods, computer vision is low in cost and makes effective use of the color and texture information in the environment. This work combines computer vision with deep learning to detect pedestrian obstacles during the automatic operation of unmanned agricultural machinery.

In object detection, deep-learning-based methods are far more accurate than traditional detection methods based on hand-crafted features such as HOG and SIFT. Deep-learning-based object detection falls into two main categories. The first is convolutional network structures based on region proposals, represented by the R-CNN series (R-CNN, Fast R-CNN, Faster R-CNN). The second treats detection of the target position as a regression problem: a CNN processes the entire image directly and predicts the category and position of each target at the same time. Representative networks include YOLO and SSD (Single Shot MultiBox Detector), and they are generally faster than the first category.

Because the SSD object detection model needs no time-consuming region proposal or feature resampling steps, it convolves the entire image directly and predicts the categories and coordinates of the objects it contains, which greatly increases detection speed. At the same time, the use of small convolution kernels and multi-scale prediction greatly improves detection accuracy. The SSD network structure consists of two parts, a base network and an auxiliary network. The base network is a typical network with high accuracy in image classification, with its classification layers removed. The auxiliary network is a convolutional structure added on top of the base network for object detection; its layers decrease progressively in size, which enables multi-scale prediction. SSD performs well in the overall trade-off between detection speed and accuracy, but both still need further improvement, and its computational cost must be reduced before it can be deployed and run on mobile devices.

Summary of the invention

To address the above problems, the invention uses the lightweight network MobileNetV2 as the base network of the SSD model to reduce the time and computation spent on image feature extraction. The auxiliary layers use an inverted residual structure combined with dilated convolution as the basic structure for position prediction, so that multi-scale features can be integrated while avoiding the information loss caused by downsampling. This enables real-time obstacle detection and ensures that intelligent vehicles operate safely in the field without human intervention. Reducing the parameter count and computation of the deep learning model lowers its hardware requirements and achieves real-time performance, making it suitable for outdoor mobile devices.

The technical scheme of the invention is a real-time orchard obstacle detection method based on an improved SSD network, comprising the following steps:

Step 1: Construct a dataset of the orchard environment and divide it into a training set and a test set;

Step 2: On the basis of the TensorFlow deep learning framework, build the SSD object detection model, using MobileNetV2 as the feature extraction network and using the inverted residual structure combined with dilated convolution as the basic convolution structure of the SSD auxiliary layers;

Step 3: Initialize the parameters of the network model to obtain a pre-trained model;

Step 4: Using the training set and test set from Step 1, train the pre-trained model with the batch gradient descent algorithm, applying a hard negative mining strategy during training to strengthen the model's ability to identify false positives;

Step 5: Deploy the SSD object detection model, collect images with the camera and feed them to the model, and use the non-maximum suppression algorithm to remove redundant bounding boxes and obtain the detection results.

Further, the specific process of Step 1 is:

1.1) Acquire a large number of videos of orchard environments in different scenes from a camera mounted on the orchard agricultural machinery, extract images at 7.5 frames per second, and divide all images into a training set, a validation set, and a test set at a ratio of 2:1:1;

1.2) Manually annotate all of the above images. The annotated objects are the obstacle targets to be detected, and the annotation for each target consists of its category and the coordinates of the upper-left and lower-right corners of its bounding box;

1.3) Preprocess the training-set images, including horizontal flipping and translation to increase the number of samples, with the annotations transformed accordingly, and apply adaptive histogram equalization to improve image quality and reduce the effect of illumination changes.
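The split in 1.1 and the annotation handling in 1.3 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the random shuffle before splitting, and the clipping of translated boxes to the image bounds are all assumptions. Boxes use the (xmin, ymin, xmax, ymax) corner format described in 1.2.

```python
import random


def split_2_1_1(items, seed=0):
    """Shuffle items and split them into train/val/test at a 2:1:1 ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)  # shuffling is an assumption
    n_train = len(items) // 2
    n_val = len(items) // 4
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


def hflip_box(box, image_width):
    """Mirror an (xmin, ymin, xmax, ymax) box about the vertical image axis."""
    xmin, ymin, xmax, ymax = box
    return (image_width - xmax, ymin, image_width - xmin, ymax)


def translate_box(box, dx, dy, image_width, image_height):
    """Shift a box by (dx, dy) pixels and clip it to the image bounds."""
    xmin, ymin, xmax, ymax = box

    def clip(value, upper):
        return max(0, min(value, upper))

    return (clip(xmin + dx, image_width), clip(ymin + dy, image_height),
            clip(xmax + dx, image_width), clip(ymax + dy, image_height))
```

When an image is flipped or shifted, the same transform is applied to each of its boxes so that the labels stay aligned with the pixels.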

Further, in Step 2, the specific method of using MobileNetV2 as the feature extraction network and using the inverted residual structure combined with dilated convolution as the basic convolution structure of the SSD auxiliary layers is:

2.1) Remove the convolutional layers of MobileNetV2 used for classification and keep the feature extraction layers as the base network of the SSD;

2.2) Combine the inverted residual structure with dilated convolution, and apply a hierarchical feature fusion strategy to solve the computational discontinuity that dilated convolution introduces. The result serves as the basic structure of the auxiliary layers, which detect position and category from the features extracted by the base network.

Further, the specific method of Step 3 is:

3.1) Train MobileNetV2 on the large-scale ImageNet classification dataset until it reaches a high classification accuracy;

3.2) Remove the classification convolutional layers of MobileNetV2 and assign the parameters of its feature-extraction convolutional layers to the corresponding feature extraction layers of the SSD;

3.3) Randomly initialize the parameters of every SSD auxiliary layer from a Gaussian distribution with mean 0 and standard deviation 0.01.
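The random initialization in 3.3 can be illustrated with the standard library alone. The sketch below draws a weight matrix from a Gaussian with mean 0 and standard deviation 0.01; the function name, the 2-D shape, and the optional seed are assumptions made for the example, and a real model would rely on the framework's own initializer.

```python
import random


def init_gaussian(shape, mean=0.0, std=0.01, seed=None):
    """Return a rows x cols nested list sampled from N(mean, std**2)."""
    rng = random.Random(seed)
    rows, cols = shape
    return [[rng.gauss(mean, std) for _ in range(cols)] for _ in range(rows)]
```

Only the auxiliary layers are initialized this way; the base network starts from the ImageNet-trained MobileNetV2 parameters described in 3.2.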

Further, the specific method of Step 4 is:

4.1) The objective function used when training with the batch gradient descent algorithm is:

where N is the number of matched default bounding boxes (when N is 0, L is set directly to 0), c is the annotated category, l is the predicted bounding box, g is the annotated bounding box, L_loc is the smooth L1 error of the corresponding position prediction, and L_conf is the corresponding softmax multi-class error function:

where Pos is the set of positive examples in the sample; cx and cy are the center coordinates of the predicted box, w is its width, and h is its height; x_ij^k indicates whether the i-th predicted box matches the j-th ground-truth box for category k; l is the predicted box and g is the ground-truth box.

where x_ij^p indicates whether predicted box i matches ground-truth box j for category p; Neg is the set of negative examples in the sample; ĉ_i^0 corresponds to a predicted box containing no object; and ĉ_i^p is calculated as:

where ĉ_i^p is the probability that the target in the i-th predicted box belongs to the p-th category.
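The equations themselves do not appear in this text. The symbol definitions in 4.1 match the standard SSD MultiBox objective, so, as an assumption, the missing formulas most likely take the following form. The weighting factor α and the encoded ground-truth offsets ĝ come from the standard SSD formulation and are not mentioned in this document; c_i^p is used here for the class score that enters the softmax.

```latex
L(x, c, l, g) = \frac{1}{N}\Bigl(L_{conf}(x, c) + \alpha\, L_{loc}(x, l, g)\Bigr)

L_{loc}(x, l, g) = \sum_{i \in Pos}^{N} \sum_{m \in \{cx,\, cy,\, w,\, h\}}
    x_{ij}^{k}\, \mathrm{smooth}_{L1}\bigl(l_i^{m} - \hat{g}_j^{m}\bigr)

L_{conf}(x, c) = -\sum_{i \in Pos}^{N} x_{ij}^{p} \log\bigl(\hat{c}_i^{p}\bigr)
    - \sum_{i \in Neg} \log\bigl(\hat{c}_i^{0}\bigr),
\qquad
\hat{c}_i^{p} = \frac{\exp\bigl(c_i^{p}\bigr)}{\sum_{p} \exp\bigl(c_i^{p}\bigr)}
```

The first line combines the two error terms over the N matched boxes, the second is the smooth L1 localization error over the box center, width, and height, and the third is the softmax confidence error over positive and negative examples.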

4.2) During training, first train the detection model with the initial positive and negative samples, then use the trained model to detect and classify the samples, and add the wrongly detected samples to the negative sample set for further training. This strengthens the model's ability to identify false positives.

Further, the specific method of Step 5 is:

5.1) Remove the operations used during training to prevent overfitting and fix the network parameters to obtain the SSD object detection model for deployment;

5.2) Collect images with the camera and feed them to the model to obtain the category confidences and bounding-box coordinates of the detected targets;

5.3) Use the non-maximum suppression algorithm to remove redundant detection boxes and obtain more accurate detection results.

Further, the non-maximum suppression algorithm is as follows: the detection results are sorted by confidence from high to low, the corresponding overlap rates are calculated, and the overlap-rate threshold is set to 0.5; a detection result is accepted when it has high confidence and satisfies the overlap-rate threshold.
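A common greedy form of non-maximum suppression is sketched below. It assumes that the overlap rate is intersection over union (IoU) and that a box is discarded when its IoU with an already accepted, higher-confidence box exceeds the threshold; the text does not spell out either detail, and the function names are illustrative.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(detections, overlap_threshold=0.5):
    """Greedy NMS over a list of (box, confidence) pairs.

    Detections are visited from highest to lowest confidence; one is kept
    only if it does not overlap any already kept box above the threshold.
    """
    ordered = sorted(detections, key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in ordered:
        if all(iou(box, kept_box) <= overlap_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept
```

With the default threshold of 0.5, two boxes that cover nearly the same obstacle collapse to the higher-confidence one, while detections of separate obstacles are all retained.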

The advantages of this scheme are:

1) Through transfer learning, the parameters with which MobileNetV2 performs well on ImageNet classification are transferred to the feature extraction network of the SSD, which simplifies training of the object detection model and shortens training time.

2) The feature extraction network of the original SSD is replaced with the more lightweight MobileNetV2, and the auxiliary layers perform convolution with the improved inverted residual structure. This makes use of multi-feature information while reducing computation, improving both detection accuracy and detection speed.

3) The improved SSD object detection model is used to detect pedestrian obstacles in the field. The model is lightweight and occupies little storage, so it is suitable for deployment on mobile devices. It is also robust, detects obstacles in the orchard environment well, and provides a basis for obstacle-avoidance decisions.

Description of drawings

Figure 1 is a flowchart of the steps of the invention.

Figure 2 is a diagram of the improved inverted residual structure.

Figure 3 is a diagram of the hierarchical feature fusion structure.

Each dilated convolution layer is denoted (# input channels, receptive field, # output channels), where the effective receptive field of a dilated convolution kernel is n_k × n_k, with n_k = (n-1)·2^(k-1) + 1, k = 1, ..., K.
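Reading the receptive-field formula as n_k = (n-1)·2^(k-1) + 1, where n is the kernel size and 2^(k-1) is the dilation rate of the k-th branch, a short calculation shows how quickly the receptive field grows. The function name is illustrative.

```python
def effective_kernel_size(n, k):
    """Effective side length of an n x n kernel in branch k (dilation 2**(k-1))."""
    return (n - 1) * 2 ** (k - 1) + 1


# For the 3x3 kernels used here, branches k = 1..4 cover 3, 5, 9 and 17 pixels.
sizes = [effective_kernel_size(3, k) for k in range(1, 5)]
```

Each branch keeps the nine weights of a 3x3 kernel, so the receptive field widens without adding parameters or downsampling the feature map.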

Figure 4 shows the improved SSD object detection model.

Detailed description

The invention is described in further detail below with reference to the drawings and specific embodiments.

The invention provides a real-time orchard obstacle detection method based on an improved SSD, which mainly includes the following steps:

Step 1: Construct the dataset and divide it into a training set and a test set. This step includes the following sub-steps:

1.2) Manually annotate all of the above images. The annotated objects are the obstacle targets to be detected, and the annotation for each target consists of its category and the coordinates of the upper-left and lower-right corners of its bounding box.

Step 2: On the basis of the TensorFlow deep learning framework, use MobileNetV2 as the feature extraction network, and use the inverted residual structure combined with dilated convolution as the basic convolution structure of the SSD auxiliary layers.

This mainly includes the following steps:

1. Build the improved SSD object detection algorithm in the TensorFlow deep learning framework. The final layers of the lightweight network MobileNetV2 that are used for classification (conv2d 1x1, avgpool 7x7, conv2d 1x1) are removed, and the remainder serves as the base layers of the SSD for feature extraction.

2. Improve the convolution structure of the auxiliary layers of the SSD object detection model. The inverted residual structure is combined with dilated convolution and used as the basic convolution unit of the auxiliary layers, as shown in Figure 2. Dilated convolution enlarges the receptive field of the convolution kernel without a downsampling operation, which reduces the information loss caused by nonlinear transformations during learning and gives the kernels multi-scale receptive fields.

3. Because dilated convolution makes the kernel computation discontinuous, a hierarchical feature fusion algorithm is used to eliminate this negative effect. Specifically, the outputs of the convolution units of the dilated convolution layer are summed successively, and all of the summed results are concatenated to give the final output. See Figure 3.
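The hierarchical feature fusion in item 3 can be sketched on plain lists, with each branch output reduced to a 1-D feature vector for illustration. The running sum is assumed to start from the first branch; the text does not say which branch it starts from, and the function name is hypothetical.

```python
from itertools import accumulate


def hierarchical_feature_fusion(branch_outputs):
    """Sum branch outputs successively, then concatenate every running sum.

    branch_outputs: list of equal-length feature vectors, one per dilated
    convolution unit, ordered from smallest to largest dilation rate.
    """
    def add(u, v):
        return [a + b for a, b in zip(u, v)]

    running_sums = accumulate(branch_outputs, add)
    fused = []
    for partial in running_sums:
        fused.extend(partial)  # concatenation along the channel axis
    return fused
```

For three branches with outputs o1, o2 and o3, the result is the concatenation of o1, o1 + o2 and o1 + o2 + o3, so each larger-dilation output is blended with the denser outputs before it.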

The improved inverted residual structure uses ReLU6 as its activation function; compared with ReLU, ReLU6 is more robust in low-precision computation. The convolution kernels keep the typical 3x3 size. The ReLU6 function is given by equation (1).

ReLU6 = min(max(features, 0), 6)    (1)
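Equation (1) translates directly into code. The scalar version below is only an illustration of the clamping behavior; TensorFlow provides the same operation for tensors as `tf.nn.relu6`.

```python
def relu6(x):
    """ReLU capped at 6: negative inputs map to 0, inputs above 6 map to 6."""
    return min(max(x, 0.0), 6.0)


activations = [relu6(v) for v in (-3.0, 2.5, 9.0)]
```

Bounding the output to the range 0 to 6 keeps activations within a fixed interval, which is what makes the function well suited to low-precision arithmetic.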

The resulting improved SSD object detection model is shown in Figure 4.

Step 3: Initialize the parameters of the network model to obtain a pre-trained model. This mainly includes the following steps:

3.2) Remove the classification convolutional layers of MobileNetV2 and assign the parameters of its feature-extraction convolutional layers to the corresponding feature extraction layers of the SSD. For the MobileNetV2 part of the base network, a MobileNetV2 network already trained on the ImageNet classification dataset is used, and the parameters of the corresponding network structure are extracted as the initial values of the base network.

Step 4: Train the pre-trained model with the batch gradient descent algorithm, applying a hard negative mining strategy during training to strengthen the model's ability to identify false positives. The training procedure is as follows:

The batch gradient descent algorithm uses a batch size of 128, a momentum of 0.9, a weight decay coefficient of 2×10^-3, and a maximum of 100k iterations. The initial learning rate is 0.004 with a decay rate of 0.95, applied once every 10,000 iterations. The model is saved every 10,000 iterations, and the most accurate model is finally selected.
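The learning-rate schedule stated above (initial rate 0.004, multiplied by 0.95 once every 10,000 iterations) amounts to a staircase exponential decay. A minimal sketch, with an illustrative function name:

```python
def learning_rate(step, base_lr=0.004, decay_rate=0.95, decay_steps=10_000):
    """Staircase exponential decay: one decay per completed block of decay_steps."""
    return base_lr * decay_rate ** (step // decay_steps)


# Rate in effect at the start of training and after each 10,000-iteration block.
schedule = [learning_rate(s) for s in range(0, 100_001, 10_000)]
```

Over the full 100k iterations the rate is decayed ten times, ending at 0.004 × 0.95^10, roughly 0.0024.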

A hard negative mining strategy is used during training: the detection model is first trained with the initial positive and negative samples, the trained model is then used to detect and classify the samples, and the wrongly detected samples are added to the negative sample set for further training. This strengthens the model's ability to identify false positives.
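One way to read the mining step is sketched below: detections produced by the current model that do not overlap any annotated box sufficiently are treated as false positives and collected for the negative set. The IoU-based matching rule, the 0.5 matching threshold, and the function names are assumptions; the text only says that wrongly detected samples are added to the negatives.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def mine_hard_negatives(detected_boxes, ground_truth_boxes, match_iou=0.5):
    """Return detections that match no ground-truth box (false positives)."""
    return [box for box in detected_boxes
            if all(iou(box, gt) < match_iou for gt in ground_truth_boxes)]
```

The returned boxes would be appended to the negative sample set before the next round of training, so the model sees exactly the background regions it currently mistakes for obstacles.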

Step 5: Deploy the SSD model, collect images with the camera and feed them to the SSD object detection model, and use the non-maximum suppression algorithm to remove redundant bounding boxes and obtain the detection results. The specific implementation is as follows:

5.1) Remove the operations used during training to prevent overfitting and fix the network parameters to obtain the network model for deployment;

5.2) Collect images with the camera and feed them to the model to obtain the category confidences and bounding-box coordinates of the detected targets;

5.3) Use the non-maximum suppression algorithm to remove redundant detection boxes and obtain more accurate detection results.

1. Fix the model parameters trained in Step 4 and remove operations such as dropout that prevent overfitting, to obtain the final network model.

2. Test and evaluate the network model. The evaluation metrics are precision (P), recall (R), and their harmonic mean F₁, given by equations (2), (3), and (4), respectively.

In equations (2) and (3), TP is the number of correctly detected pedestrians, FP is the number of non-pedestrian targets mistakenly detected as pedestrians, and FN is the number of pedestrians mistakenly detected as background. F₁ is the harmonic mean of precision and recall; the closer it is to 1, the better the model performs.
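Equations (2) to (4) do not appear in this text. Assuming the standard definitions, which agree with the descriptions of TP, FP and FN above, precision is TP / (TP + FP), recall is TP / (TP + FN), and F₁ is 2PR / (P + R). The sketch below computes all three; returning 0 when a denominator is zero is a convention chosen for the example.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and their harmonic mean from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1
```

For example, 80 correct detections with 20 false alarms and 20 missed pedestrians give a precision, recall and F₁ of 0.8 each.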

3. Fix the parameters of the network model that meets expectations and deploy it on the target mobile device. Images of the orchard environment are acquired in real time by the camera and fed to the model, and non-maximum suppression is used to remove redundant detection boxes, with an IoU threshold of 0.4 and a confidence threshold of 0.5.

In summary, the real-time orchard obstacle detection method based on an improved SSD is mainly intended for the automatic navigation of unmanned agricultural machinery in orchard environments. Building on the basic principles of deep-learning object detection algorithms, it proposes an obstacle detection method based on an improved SSD object detection network. Starting from the SSD detection network, it improves the network structure and the training process to reduce computation, speed up detection, improve detection accuracy, and meet real-time requirements, while lowering the hardware requirements of the deep learning model so that it can be deployed on mobile devices. The method first collects video data, extracts images at a fixed frame rate for annotation, and builds a dataset for training. The deep learning model is initialized through transfer learning and then trained with the batch gradient descent algorithm, and the trained model is finally used for the actual obstacle detection task. The invention detects obstacles ahead quickly and accurately in the orchard environment and is an effective means of realizing intelligent agriculture and improving its reliability.

Claims

1. A real-time orchard obstacle detection method based on an improved SSD network, characterized in that it comprises the following steps:

Step 1: Construct a dataset of the orchard environment and divide it into a training set and a test set;

Step 2: On the basis of the TensorFlow deep learning framework, build the SSD object detection model, using MobileNetV2 as the feature extraction network and using the inverted residual structure combined with dilated convolution as the basic convolution structure of the SSD auxiliary layers;

Step 3: Initialize the parameters of the network model to obtain a pre-trained model;

Step 4: Using the training set and test set from Step 1, train the pre-trained model with the batch gradient descent algorithm, applying a hard negative mining strategy during training to strengthen the model's ability to identify false positives;

Step 5: Deploy the SSD object detection model, collect images with the camera and feed them to the model, and use the non-maximum suppression algorithm to remove redundant bounding boxes and obtain the detection results.

2. The real-time orchard obstacle detection method based on an improved SSD network according to claim 1, characterized in that the specific process of Step 1 is:

1.1) Acquire a large number of videos of orchard environments in different scenes from a camera mounted on the orchard agricultural machinery, extract images at 7.5 frames per second, and divide all images into a training set, a validation set, and a test set at a ratio of 2:1:1;

1.2) Manually annotate all of the above images, the annotated objects being the obstacle targets to be detected and the annotation for each target consisting of its category and the coordinates of the upper-left and lower-right corners of its bounding box;

1.3) Preprocess the training-set images, including horizontal flipping and translation to increase the number of samples, with the annotations transformed accordingly, and apply adaptive histogram equalization to improve image quality and reduce the effect of illumination changes.

3. The real-time orchard obstacle detection method based on an improved SSD network according to claim 1, characterized in that, in Step 2, the specific method of using MobileNetV2 as the feature extraction network and using the inverted residual structure combined with dilated convolution as the basic convolution structure of the SSD auxiliary layers is:

2.1) Remove the convolutional layers of MobileNetV2 used for classification and keep the feature extraction layers as the base network of the SSD;

2.2) Combine the inverted residual structure with dilated convolution and apply a hierarchical feature fusion strategy to solve the computational discontinuity that dilated convolution introduces, the result serving as the basic structure of the auxiliary layers, which detect position and category from the features extracted by the base network.

4. The real-time orchard obstacle detection method based on an improved SSD network according to claim 1, characterized in that the specific method of Step 3 is:

3.1) Train MobileNetV2 on the large-scale ImageNet classification dataset until it reaches a high classification accuracy;

3.2) Remove the classification convolutional layers of MobileNetV2 and assign the parameters of its feature-extraction convolutional layers to the corresponding feature extraction layers of the SSD;

3.3) Randomly initialize the parameters of every SSD auxiliary layer from a Gaussian distribution with mean 0 and standard deviation 0.01.

5. The real-time orchard obstacle detection method based on an improved SSD network according to claim 1, characterized in that the specific method of Step 4 is:

4.1) The objective function used when training with the batch gradient descent algorithm is:

where N is the number of matched default bounding boxes (when N is 0, L is set directly to 0), c is the annotated category, l is the predicted bounding box, g is the annotated bounding box, L_loc is the smooth L1 error of the corresponding position prediction, and L_conf is the corresponding softmax multi-class error function:

where Pos is the set of positive examples in the sample; cx and cy are the center coordinates of the predicted box, w is its width, and h is its height; x_ij^k indicates whether the i-th predicted box matches the j-th ground-truth box for category k; l is the predicted box and g is the ground-truth box.

where x_ij^p indicates whether predicted box i matches ground-truth box j for category p; Neg is the set of negative examples in the sample; ĉ_i^0 corresponds to a predicted box containing no object; and ĉ_i^p is calculated as:

where ĉ_i^p is the probability that the target in the i-th predicted box belongs to the p-th category.

4.2) During training, first train the detection model with the initial positive and negative samples, then use the trained model to detect and classify the samples, and add the wrongly detected samples to the negative sample set for further training, thereby strengthening the model's ability to identify false positives.

6. The real-time orchard obstacle detection method based on an improved SSD network according to claim 1, characterized in that the specific method of Step 5 is:

5.1) Remove the operations used during training to prevent overfitting and fix the network parameters to obtain the SSD object detection model for deployment;

5.2) Collect images with the camera and feed them to the model to obtain the category confidences and bounding-box coordinates of the detected targets;

5.3) Use the non-maximum suppression algorithm to remove redundant detection boxes and obtain more accurate detection results.

7. The real-time orchard obstacle detection method based on an improved SSD network according to claim 6, characterized in that the non-maximum suppression algorithm is as follows: the detection results are sorted by confidence from high to low, the corresponding overlap rates are calculated, and the overlap-rate threshold is set to 0.5; a detection result is accepted when it has high confidence and satisfies the overlap-rate threshold.