CN118262299A - Small ship detection method and system based on novel neck network and loss function - Google Patents
- Publication number
- CN118262299A CN118262299A CN202410426660.XA CN202410426660A CN118262299A CN 118262299 A CN118262299 A CN 118262299A CN 202410426660 A CN202410426660 A CN 202410426660A CN 118262299 A CN118262299 A CN 118262299A
- Authority
- CN
- China
- Prior art keywords
- information
- loss function
- loss
- small ship
- features
- Prior art date
- Legal status
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/54—Surveillance or monitoring of activities, e.g. for recognising suspicious objects of traffic, e.g. cars on the road, trains or boats
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/42—Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
Description
Technical Field
The present invention relates to the technical field of small-target detection in SAR images, and in particular to a small ship detection method and system based on a novel neck network and loss function.
Background Art
SAR offers all-day, all-weather imaging, and SAR ship detection is widely used in both military and civilian fields. The traditional approach is constant false alarm rate (CFAR) detection, which cannot adapt to complex background environments. In recent years, with the development of deep learning, many researchers have adopted deep-learning-based detection methods and achieved very good detection results.
Deep-learning-based object detection methods can be divided into single-stage and two-stage. Single-stage methods predict bounding boxes directly from prior boxes; they are fast but not accurate enough. Two-stage methods first generate a large number of region proposals from CNN (convolutional neural network) features and then identify the proposals with a second network; they are highly accurate but somewhat slower. Although these methods are advanced, they are not suited to the special scenario of small-target detection. The situation is even worse in SAR ship detection, because SAR images contain a large number of small ships.
A small-sized target is one with few pixels. Taking the COCO dataset's division criteria as an example, a small object has an area of less than 1024 pixels (32 × 32), a medium object between 1024 and 9216 (96 × 96), and a large object more than 9216. This means the visual features of small objects are limited, so for deep-learning-based detection algorithms, detection performance on small objects is far inferior to that on large objects.
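As a minimal sketch, the COCO size convention quoted above can be computed directly from a box's pixel area (the function name is illustrative):

```python
def size_class(area: float) -> str:
    """Classify a detection box by pixel area using the COCO thresholds
    quoted above: small below 32*32 = 1024, medium below 96*96 = 9216,
    large otherwise."""
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"

# e.g. a 20 x 30 ship box (area 600) is "small"; a 100 x 100 box is "large"
```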
Compared with general computer-vision datasets, the proportion of small targets in SAR ship detection datasets is extremely high. For example, the proportions of small, medium, and large objects in the COCO dataset are 41.4%, 34.3%, and 24.2%, respectively, whereas in the SAR-Ship-Dataset they are 56.5%, 43.2%, and 0.3%. Because such a large share of the ships in SAR ship detection datasets are small, the overall performance of a detection algorithm drops significantly.
Small-object detection performs poorly for two reasons. First, after passing through a CNN, the features of small objects are often contaminated by background noise, making it difficult to capture the discriminative information critical for classification. Second, in IoU (intersection-over-union)-based metrics, a small positional deviation of the bounding box disturbs small objects far more than large ones, which makes it hard to find a suitable IoU threshold and to provide high-quality positive and negative samples for training the network. Common remedies include anchor-box design, small-target data augmentation, and loss-function design.
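The IoU sensitivity described above is easy to reproduce numerically; the sketch below shifts a small and a large box by the same four-pixel offset (the box sizes are illustrative, not from the source):

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

small = (0, 0, 16, 16)      # 16 x 16 "small ship" box
large = (0, 0, 128, 128)    # 128 x 128 box
shifted = lambda r, d: (r[0] + d, r[1] + d, r[2] + d, r[3] + d)

# the same 4-pixel shift costs the small box far more IoU
print(iou(small, shifted(small, 4)))   # ~0.39
print(iou(large, shifted(large, 4)))   # ~0.88
```

The small box falls below a typical 0.5 IoU matching threshold from a deviation the large box barely notices, which is why fixed IoU thresholds starve small targets of positive samples.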
Summary of the Invention
To solve the above problems, the purpose of the present invention is to adopt multiple improvements on top of YOLOX (including to the neck and the loss function) to raise the detection performance for small ships.
To achieve this technical objective, the present application provides a small ship detection method based on a novel neck network and loss function, comprising the following steps:
improving the neck network of the YOLOX model, where an SAM module collects and fuses information from different levels and then distributes it back to the different levels, enhancing the neck's information-fusion capability;
at the same time,
improving the localization term of the YOLOX loss function, where an added gain function makes the supervision signal for small ships more sufficient;
and detecting small ships in SAR images with the improved YOLOX model.
Preferably, in improving the neck network, local and global features are obtained through the SAM module; two different convolutions are applied to the global features; average pooling or bilinear interpolation is used to rescale the global features; and at the end of each attention fusion, a RepConv block is added to further extract and fuse information.
Preferably, when the SAM module is used to improve the neck network, the SAM module consists of Conv convolutions, the Sigmoid activation function, upsampling (Up Sampling), downsampling (Down Sampling), and RepConv units.
Preferably, the global features are obtained by fusing the local features, where the fusion of local features yields high-resolution features that retain small-ship information.
Preferably, an average pooling operation is used to downsample the input features.
Preferably, in improving the neck network of the YOLOX model, the SAM module globally fuses features at different levels to obtain global information and injects that global information back into the features at each level, improving the neck's information-fusion capability without significantly increasing latency.
Preferably, in improving the localization loss, the gain function f(t) is expressed as:
where t denotes the area of the ground-truth bounding box (t = wi × hi, with wi and hi the width and height of the box), τ denotes the ratio of the small-ship loss to the total loss, and ω denotes the threshold.
The present invention also discloses a small ship detection system based on a novel neck network and loss function, comprising:
a data acquisition module for acquiring SAR images;
a detection and recognition module for detecting small ships in SAR images with the improved YOLOX model, where the improvement of the YOLOX model comprises: improving the neck network, with an SAM module collecting and fusing information from different levels and distributing it back to the different levels to enhance the neck's information-fusion capability; and improving the localization term of the loss function, with an added gain function making the supervision signal for small ships more sufficient.
Preferably, the detection and recognition module is further configured to obtain local and global features through the SAM module, apply two different convolutions to the global features, rescale the global features with average pooling or bilinear interpolation, and add a RepConv block at the end of each attention fusion to further extract and fuse information.
Preferably, the detection and recognition module is further configured to globally fuse features at different levels via the SAM module to obtain global information, and to inject that global information into the features at each level, improving the neck's information-fusion capability without significantly increasing latency.
The present invention discloses the following technical effects:
In the newly designed neck network, global information is obtained by globally integrating features at different levels and is injected back into the features at each level, achieving efficient information exchange and fusion. This avoids the information-loss problem of the FPN (Feature Pyramid Network), improves the neck's fusion capability, and helps detect small targets. The newly designed loss function adaptively adjusts the localization-loss gain to strengthen the supervision signal for small ships, improving the optimization of the small-ship detection algorithm.
Brief Description of the Drawings
To explain the embodiments of the present invention or the prior art more clearly, the drawings required by the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of the FPN with the scale attention module according to the present invention;
FIG. 2 is a schematic diagram of the SAM module according to the present invention;
FIG. 3 is a schematic diagram of the global feature module according to the present invention;
FIG. 4 is a schematic flow chart of the method according to the present invention.
Detailed Description of the Embodiments
To make the purpose, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. The components of the embodiments described and shown in the drawings here may be arranged and designed in a variety of configurations. The following detailed description of the embodiments provided in the drawings is therefore not intended to limit the scope of the claimed application but merely represents selected embodiments. All other embodiments obtained by those skilled in the art without creative effort fall within the scope of protection of the present application.
As shown in FIGS. 1-4, the present invention discloses a method for detecting small-sized ship targets in SAR images based on a novel neck network and loss function. The neck network globally integrates multi-level features and injects global and local information into the different levels; it avoids the information loss inherent in the traditional feature pyramid network and enhances information fusion without significantly increasing latency. During training, the smaller the localization loss of a small ship, the larger its localization-loss gain. In this way, supervision of small ships is strengthened in the loss function, biasing the loss computation toward small ships. A series of experiments on the SSDD and SAR-Ship-Dataset datasets shows that the method greatly improves small-ship detection performance.
Small-sized ship targets account for a very large share of SAR images, which makes detection difficult. Existing research on small-ship detection is scarce, and the available solutions are all based on feature fusion (fusing feature maps at different levels, FPN, and the like); there is little in-depth discussion of the proportion of small ships in the datasets, the reasons detection is hard, or possible solutions, so the improvement for small ships has been limited. Special strategies are therefore needed to improve small-ship detection results, which is the main motivation of this invention. The proposed method is based on YOLOX, because YOLOX is anchor-free, is trained from scratch, and performs well in SAR ship detection. On this basis, a new neck network and loss function are designed. For the neck, a scale attention module is used together with the FPN to fuse and exchange information across scales; it globally integrates multi-level features and injects the global information into the higher levels, which significantly enhances the neck's information fusion without significantly increasing latency and improves performance across object sizes. For the loss function, supervision of small ships is strengthened during training, biasing the loss computation toward small ships.
A deep CNN generates feature maps with different spatial resolutions: low-level features describe finer details and carry more localization information, while high-level features capture richer semantic information. Because of downsampling, the responses of small ships may vanish in the deeper layers. To alleviate this, many researchers build feature pyramids by reusing the multi-scale feature maps produced during forward propagation and then detect small ships on the higher-resolution maps, as in FPN and its variants.
However, FPN and its variants fully integrate only the features of adjacent layers, obtaining information from other layers only indirectly and "recursively". Only the information selected by the intermediate layers is exchanged; unselected information is discarded in transmission. As a result, a layer's information influences mainly its neighbors and has limited effect on the other layers globally, so the overall effectiveness of information fusion is limited.
To avoid the information loss of the traditional FPN structure during transmission, the original recursive scheme is abandoned and a new fusion mechanism is constructed. A unified module collects and fuses information from the different levels and then distributes it back to them, which not only avoids the information loss inherent in the traditional FPN but also enhances the neck's information fusion without significantly increasing latency.
The input to the neck consists of the feature maps B2, B3, B4, and B5 extracted from the backbone, with sizes 160 × 160, 80 × 80, 40 × 40, and 20 × 20, respectively. The SAM (scale attention module) collects and aligns the features of all levels and distributes them to the different levels C3, C4, and C5, as shown in FIG. 1.
The structure of the SAM module is shown in FIG. 2, where Local and Global denote local and global features, Conv denotes convolution, Sigmoid is the activation function, Up/Down Sampling denote upsampling and downsampling, and RepConv blocks denote RepConv units. The inputs are the local features B2-B5 and the global features. The global features are produced by fusing the features of B2-B5 (the local features), and two different convolutions are applied to the global features. Because of the size difference between local and global features, average pooling or bilinear interpolation is used to rescale the global features and ensure proper alignment. At the end of each attention fusion, a RepConv block is added to further extract and fuse information.
The B2, B3, B4, and B5 features are selected for fusion to obtain high-resolution features that retain small-ship information. The input features are downsampled with an average pooling operation so that their sizes match. Aligned features are obtained by resizing every feature to the smallest size in the group, which aggregates information effectively while minimizing computation.
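A minimal numpy sketch of this alignment step (function names are illustrative; nearest-neighbour upsampling stands in for the bilinear interpolation mentioned in the text to keep the sketch dependency-light):

```python
import numpy as np

def avg_pool_to(x, out_hw):
    """Average-pool a (C, H, W) map down to (C, out_hw, out_hw); assumes
    H and W are integer multiples of out_hw, as holds for the
    160/80/40/20 pyramid in the text."""
    c, h, w = x.shape
    k = h // out_hw
    return x.reshape(c, out_hw, k, out_hw, k).mean(axis=(2, 4))

def upsample_to(x, out_hw):
    """Nearest-neighbour upsampling (the text uses bilinear
    interpolation; nearest keeps this sketch simple)."""
    k = out_hw // x.shape[1]
    return x.repeat(k, axis=1).repeat(k, axis=2)

# B2..B5 with spatial sizes 160, 80, 40, 20; B4 (40 x 40) is the
# alignment size chosen in the text
feats = [np.random.rand(8, s, s) for s in (160, 80, 40, 20)]
aligned = [avg_pool_to(f, 40) if f.shape[1] > 40 else
           upsample_to(f, 40) if f.shape[1] < 40 else f
           for f in feats]
fused = np.concatenate(aligned, axis=0)   # collect along channels
```

In the actual module the aligned maps would be fused by convolutions and attention rather than plain concatenation; this only illustrates the size-alignment mechanics.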
A larger feature map retains more localization information but increases computational complexity, while a smaller feature map loses the localization information of smaller ships. The present invention therefore selects B4 as the alignment size to balance speed and accuracy.
After the fused global information is obtained, it is distributed to each level and injected with a simple attention operation to improve the detection capability of each branch.
By globally fusing features at different levels to obtain global information and injecting that information back into the features at each level, efficient information exchange and fusion is achieved. Without significantly increasing latency, the SAM markedly improves the neck's information fusion and the model's ability to detect small ships.
During training, small ships contribute little to the total loss in most iterations because they have far fewer positive samples than large ships. This insufficient supervision signal leads to poor detection performance, so more attention must be paid to the small-ship loss. For object detection, the loss function usually consists of three parts: classification loss, confidence loss, and localization loss. The classification and confidence losses are independent of ship size, while the localization loss is size-dependent. To balance the loss distribution and alleviate the under-supervision of small ships, a new localization loss is proposed that supervises small ships more effectively and trains the detection algorithm more evenly. The designed gain function f(t) multiplies the localization loss:
Loss_total = λ1 · Loss_cls + λ2 · Loss_obj + λ3 · f(t) · Loss_loc
where the left side is the total loss and the three terms on the right are the classification loss, the objectness (confidence) loss, and the localization loss, each with its coefficient λ. To ensure that the loss function favors small ships, the smaller a small ship's loss is during a training iteration, the larger its loss gain f(t). The gain function f(t) is therefore given by:
where t = wi × hi.
In the formula, t denotes the area of the ground-truth bounding box, with wi and hi its width and height, and τ denotes the ratio of the small-ship loss to the total loss. The gain coefficient introduces τ as a feedback signal sensitive to the loss distribution and adopts a nonlinear scale-sensitivity coefficient, making it responsive to scale changes. The smaller the ship size (t) and the small-ship loss ratio (τ), the larger the gain coefficient.
To avoid the problem of negative gain, f(t) is designed as a piecewise function: when t is smaller than the threshold ω, f(t) uses the τ-dependent expression given above; when t is larger than the threshold, the original scale-sensitive gain coefficient (2 − t) is used.
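The sub-threshold branch of f(t) appears only as an equation image that is not reproduced in the source, so the sketch below substitutes an illustrative placeholder with the stated properties (the gain grows as t and τ shrink, and never goes negative); only the (2 − t) branch and the overall loss combination follow the text:

```python
def gain(t, tau, omega=0.5):
    """Piecewise localization-loss gain f(t). Above the threshold omega,
    the scale-sensitive coefficient (2 - t) from the text is used
    (t assumed normalized to [0, 1]). The sub-threshold branch is an
    ILLUSTRATIVE PLACEHOLDER, not the patent's formula: smaller t and
    smaller tau yield a larger gain, and the gain stays positive."""
    if t < omega:
        return (2 - omega) + (1 - tau) * (omega - t)   # placeholder branch
    return 2 - t

def total_loss(l_cls, l_obj, l_loc, t, tau, lambdas=(1.0, 1.0, 1.0)):
    """Loss_total = l1*Loss_cls + l2*Loss_obj + l3*f(t)*Loss_loc."""
    l1, l2, l3 = lambdas
    return l1 * l_cls + l2 * l_obj + l3 * gain(t, tau) * l_loc
```

The gain multiplies only the localization term, so classification and objectness supervision are unchanged, matching the text's observation that only the localization loss depends on ship size.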
With this loss function, the supervision signal for small ships becomes more sufficient, training of the detection algorithm becomes more balanced, and accuracy improves steadily. The loss can be integrated easily into any detection algorithm and imposes almost no additional burden during training or inference.
Verification of the proposed technique:
The experiments were conducted on a Linux operating system using the PyTorch deep learning framework. The hardware was an Intel Xeon (Cascade Lake) Platinum 8269CY CPU @ 2.5 GHz and an NVIDIA GeForce TITAN V GPU (12 GB memory). The first 500 iterations were run at one third of the regular learning rate, after which the learning rate was gradually restored to 0.0025. For all models, momentum was set to 0.9 and weight decay to 0.0025.
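The warm-up schedule described above can be sketched as a simple step function (the "gradual" restoration is simplified to a step here, and the function name is illustrative; momentum 0.9 and weight decay 0.0025 would be set on the optimizer, not shown):

```python
def learning_rate(step, base_lr=0.0025, warmup_iters=500):
    """Warm-up schedule as described in the text: the first 500
    iterations run at one third of the base learning rate, after which
    the rate returns to the base value of 0.0025."""
    return base_lr / 3 if step < warmup_iters else base_lr
```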
Comparison with state-of-the-art object detection algorithms on SSDD:
To verify the detection performance of different algorithms on small, medium, and large ships in SSDD, the following experiments were conducted. Twenty-two detection algorithms were selected in Table 1, with AP (average precision), APs (AP for small targets), APm (AP for medium targets), APl (AP for large targets), and FPS (frames per second) as evaluation metrics.
Table 1
Compared with the other detection algorithms, YOLOX has significant advantages in speed and accuracy; it requires no anchor boxes and is trained from scratch, which is why the proposed method is based on it. The proposed algorithm increases AP by 0.031, and APs, APm, and APl by 0.04, 0.018, and 0.046, respectively. It improves performance for small, medium, and large ships alike, and the improvement for small ships is considerable. Given the large number of small ships, improving APs is the most valuable gain for SAR ship detection.
Two phenomena can be observed in the table. The first is fluctuation; the second is that many detection algorithms show a lower APl for large ships than for small and medium ones. Intuitively, APs should be much lower than APm or APl, yet sometimes the AP of large ships is higher and sometimes that of small ships is. Analyzing the dataset reveals the reason: SSDD contains few large ships (62.8%, 34.6%, and 2.6% for small, medium, and large ships, respectively). Missing or successfully detecting a single large ship during testing therefore has a major effect on the final APl, causing significant fluctuation. The anomaly of low APl in many algorithms arises because large ships are scarce in the dataset and are trained on less often, leading to poor detection performance.
Comparison with state-of-the-art object detection algorithms on SAR-Ship-Dataset:
In order to verify the detection performance of different detection algorithms for small, medium, and large ships on the SAR-Ship-Dataset, the following experiments are conducted in this section. Fourteen detection algorithms are compared in Table 2, with AP, APs, APm, APl, and FPS used as evaluation metrics:
Table 2
From Table 2, we can also see that single-stage detection algorithms achieve higher AP but lower speed, while anchor-free detection algorithms strike a better trade-off between accuracy and latency. Compared with YOLOX, the proposed method improves AP, APs, APm, and APl by 0.037, 0.048, 0.029, and 0.034, respectively. The two phenomena observed on SSDD appear in these data as well: missing or successfully detecting a single large ship during testing strongly affects the final APl, and APl sometimes falls anomalously below APs and APm because the dataset contains few large ships, which are trained on less frequently and therefore detected less reliably.
Evaluation of different components:
Tables 3 and 4 show the improvement contributed by each component on SSDD and the SAR-Ship-Dataset. The proposed neck raises AP to 0.70 and 0.674, demonstrating the effectiveness of using both global and local information when building the FPN. The proposed loss function further raises AP to 0.717 and 0.695, indicating the importance of increasing the localization-loss weight for small ships.
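The idea of up-weighting the localization loss for small ships can be sketched as follows. The patent does not disclose its exact weighting scheme here, so this sketch assumes a weight that grows as the ground-truth box shrinks relative to a reference area, clipped to keep it bounded; `ref_area` and the clip range are illustrative choices, not values from the text.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def size_weighted_iou_loss(pred, gt, ref_area=96 ** 2):
    """IoU loss scaled up for small ground-truth boxes (illustrative scheme).

    The weight grows as the GT box shrinks relative to ref_area, so small
    ships contribute a stronger supervision signal during training.
    """
    gt_area = (gt[2] - gt[0]) * (gt[3] - gt[1])
    weight = np.clip(np.sqrt(ref_area / gt_area), 1.0, 4.0)
    return weight * (1.0 - iou(pred, gt))
```

A perfectly localized box incurs zero loss regardless of size, while an imperfect prediction on a small ship is penalized up to four times as heavily as the same IoU error on a large one.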
Table 3. Ablation results of the proposed detection algorithm on SSDD
Table 4. Ablation results of the proposed detection algorithm on the SAR-Ship-Dataset
The above results also demonstrate that combining the two strategies is effective.
Small object detection is challenging because small objects carry little detailed information and may even vanish in deep networks. SAR ship detection datasets contain mostly small ships, so improving APs is a good way to raise overall performance. To the best of our knowledge, this paper is the first to systematically apply small object detection strategies and achieve good performance: the neck effectively fuses features from different levels, and the loss function addresses the insufficient supervision signal for small ships during training.
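A minimal sketch of mixing global context into local features, in the spirit of the neck described above: global average pooling summarizes the whole scene and is broadcast back onto every spatial position. This is an assumption-laden toy, not the patent's architecture, and it uses plain NumPy in place of a deep-learning framework.

```python
import numpy as np

def fuse_global_local(feat):
    """Add globally pooled context onto a (C, H, W) feature map.

    Global average pooling captures scene-level information (useful for
    separating sea clutter from ships), and broadcasting it over the
    spatial grid lets every local position see that context.
    """
    global_ctx = feat.mean(axis=(1, 2), keepdims=True)  # (C, 1, 1)
    return feat + global_ctx                            # broadcast to (C, H, W)

feat = np.random.default_rng(0).standard_normal((8, 16, 16))
fused = fuse_global_local(feat)
```

In a real neck this addition would typically be followed by a convolution and combined with upsampled features from deeper levels; the sketch only shows the global-local mixing step.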
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In the description of the present invention, it should be understood that the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include one or more such features. In the description of the present invention, "plurality" means two or more, unless otherwise clearly and specifically defined.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the present invention and their equivalents, the present invention is intended to encompass them as well.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410426660.XA CN118262299B (en) | 2024-04-10 | 2024-04-10 | Small ship detection method and system based on novel neck network and loss function |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410426660.XA CN118262299B (en) | 2024-04-10 | 2024-04-10 | Small ship detection method and system based on novel neck network and loss function |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN118262299A true CN118262299A (en) | 2024-06-28 |
| CN118262299B CN118262299B (en) | 2024-11-01 |
Family
ID=91612949
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202410426660.XA Active CN118262299B (en) | 2024-04-10 | 2024-04-10 | Small ship detection method and system based on novel neck network and loss function |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118262299B (en) |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114596335A (en) * | 2022-03-01 | 2022-06-07 | 广东工业大学 | Unmanned ship target detection tracking method and system |
| CN116823775A (en) * | 2023-06-29 | 2023-09-29 | 哈尔滨理工大学 | A deep learning-based display screen defect detection method |
| CN116824376A (en) * | 2023-06-28 | 2023-09-29 | 中国人民解放军海军潜艇学院 | A lightweight SAR image ship target detection method and system |
| CN117036656A (en) * | 2023-08-17 | 2023-11-10 | 集美大学 | A method for identifying floating objects on water surface in complex scenes |
| CN117132951A (en) * | 2023-09-11 | 2023-11-28 | 武汉理工大学重庆研究院 | A ship detection and tracking method in a bridge area monitoring scenario |
| CN117456163A (en) * | 2023-10-27 | 2024-01-26 | 数据空间研究院 | Ship target detection method, system and storage medium |
-
2024
- 2024-04-10 CN CN202410426660.XA patent/CN118262299B/en active Active
Patent Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114596335A (en) * | 2022-03-01 | 2022-06-07 | 广东工业大学 | Unmanned ship target detection tracking method and system |
| CN116824376A (en) * | 2023-06-28 | 2023-09-29 | 中国人民解放军海军潜艇学院 | A lightweight SAR image ship target detection method and system |
| CN116823775A (en) * | 2023-06-29 | 2023-09-29 | 哈尔滨理工大学 | A deep learning-based display screen defect detection method |
| CN117036656A (en) * | 2023-08-17 | 2023-11-10 | 集美大学 | A method for identifying floating objects on water surface in complex scenes |
| CN117132951A (en) * | 2023-09-11 | 2023-11-28 | 武汉理工大学重庆研究院 | A ship detection and tracking method in a bridge area monitoring scenario |
| CN117456163A (en) * | 2023-10-27 | 2024-01-26 | 数据空间研究院 | Ship target detection method, system and storage medium |
Non-Patent Citations (1)
| Title |
|---|
| ZHANG Xiaohan; YAO Libo; LV Yafei; JIAN Tao; ZHAO Zhiwei; ZANG Jie: "Data-adaptive SAR image ship target detection model with bidirectional feature fusion", Journal of Image and Graphics, no. 09, 16 September 2020 (2020-09-16) * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN118262299B (en) | 2024-11-01 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN110956126B (en) | A small target detection method based on joint super-resolution reconstruction | |
| CN111079739B (en) | Multi-scale attention feature detection method | |
| CN115035295B (en) | Remote sensing image semantic segmentation method based on shared convolution kernel and boundary loss function | |
| WO2023193400A1 (en) | Point cloud detection and segmentation method and apparatus, and electronic device | |
| WO2023116632A1 (en) | Video instance segmentation method and apparatus based on spatio-temporal memory information | |
| CN110991311A (en) | Target detection method based on dense connection deep network | |
| CN115223009A (en) | Small target detection method and device based on improved YOLOv5 | |
| CN111242127A (en) | Vehicle detection method with granularity level multi-scale characteristics based on asymmetric convolution | |
| CN111950612B (en) | FPN-based weak and small target detection method for fusion factor | |
| CN115546705B (en) | Target identification method, terminal device and storage medium | |
| CN115471729B (en) | Ship target identification method and system based on improved YOLOv5 | |
| CN116246119A (en) | 3D target detection method, electronic device and storage medium | |
| CN118736214A (en) | Infrared small target detection network based on feature enhancement | |
| Sun et al. | Roadway crack segmentation based on an encoder-decoder deep network with multi-scale convolutional blocks | |
| CN118262299B (en) | Small ship detection method and system based on novel neck network and loss function | |
| Wang et al. | YOLOV5s-Face face detection algorithm | |
| CN112487911B (en) | Real-time pedestrian detection method and device based on improvement yolov under intelligent monitoring environment | |
| WO2025185167A1 (en) | Road region detection method and apparatus, and electronic device | |
| Yang et al. | Traffic conflicts analysis in penang based on improved object detection with transformer model | |
| CN118864808A (en) | Weak small ship detection method based on SAR/AIS cross-modal fusion YOLO network | |
| CN118537781A (en) | Seed detection method, device, electronic equipment and storage medium | |
| Qi et al. | Vehicle detection under unmanned aerial vehicle based on improved YOLOv3 | |
| CN118609054A (en) | A method, system and medium for detecting and segmenting safety helmets in a logistics park | |
| Gao et al. | Apple maturity detection based on improved YOLOv8 | |
| CN117173423A (en) | Image small target detection methods, systems, equipment and media |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |