
CN116310900A - A target recognition and tracking method, device and system on a complex battlefield - Google Patents


Info

Publication number
CN116310900A
CN116310900A
Authority
CN
China
Prior art keywords
image
target
tracking
processing
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310183454.6A
Other languages
Chinese (zh)
Inventor
毕建权
杨朝红
邱晓波
张国辉
田相轩
金丽亚
王璇
陈波
王远
刑程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Army Academy of Armored Forces of PLA
Original Assignee
Army Academy of Armored Forces of PLA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Army Academy of Armored Forces of PLA filed Critical Army Academy of Armored Forces of PLA
Priority to CN202310183454.6A
Publication of CN116310900A
Legal status: Pending

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/17Terrestrial scenes taken from planes or by drones
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/14Transformations for image registration, e.g. adjusting or mapping for alignment of images
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/30Noise filtering
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/194Terrestrial scenes using hyperspectral data, i.e. more or other wavelengths than RGB
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Remote Sensing (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the invention provides a method, device, and system for recognizing and tracking targets on a complex battlefield. The method includes: acquiring a first image set containing a target object, where the first images in the first image set are of different types; deblurring and denoising the first image set to form a second image set; performing image fusion processing on the second image set to form a fused image; constructing a target tracking algorithm model for identifying the target object in an image and tracking and marking it, where the target tracking algorithm model is based on a single-stage target detection model and is obtained at least by adjusting the positions of the feature extraction module and the pooling layer and the number of convolutional layers in the single-stage target detection model; and inputting the fused image into the target tracking algorithm model to obtain recognition and tracking results for the target object based on the target tracking algorithm model.

Description

A Target Recognition and Tracking Method, Device, and System on a Complex Battlefield

Technical Field

Embodiments of the present invention relate to the technical field of target recognition and tracking, and in particular to a method and device for target recognition and tracking on a complex battlefield.

Background

With the accelerating trend toward unmanned operations in modern warfare, the application of unmanned aerial vehicles (UAVs) in the military field has become increasingly important. UAVs can not only undertake reconnaissance, strike, transport, and other tasks, but also effectively safeguard combat personnel; their disruptive effect on the battlefield is self-evident. As an important component of battlefield target reconnaissance, UAVs help commanders fully grasp the evolving situation across the entire battlefield. In future information warfare, improving battlefield situational awareness can effectively improve overall control of the war, and the major military powers are all strengthening the development and research of related technologies.

Because the battlefield situation is highly dynamic, there are high requirements for the real-time performance and accuracy of detecting and tracking typical battlefield targets. In addition, when performing target detection on battlefield image information obtained by UAV reconnaissance, military targets are affected by motion blur, noise pollution, weather interference, target size, and partial occlusion, so the performance of battlefield situational awareness struggles to meet practical needs. The main problems are as follows:

1) The battlefield environment is complex and changeable, and UAV reconnaissance missions often include low-visibility scenes such as night, fog, and rain. At the same time, because of UAV flight jitter and sensor problems, the captured images are inevitably affected by blur and noise. In such scenes, the performance of target detection and tracking algorithms based on conventional imaging systems is usually greatly degraded, making them difficult to use.

2) Image data of different modalities, such as visible-light images and infrared images, each have their own characteristics, but relying on any single modality for target detection has inherent limitations and cannot meet the requirements of multi-scene, high-precision target detection and tracking.

Summary of the Invention

An embodiment of the present invention provides a target recognition and tracking method on a complex battlefield, including:

acquiring a first image set containing a target object, where the first images in the first image set are of different types;

performing deblurring and denoising processing on the first image set to form a second image set;

performing image fusion processing on the second image set to form a fused image;

constructing a target tracking algorithm model for identifying a target object in an image and tracking and marking it, where the target tracking algorithm model is based on a single-stage target detection model and is obtained at least by adjusting the positions of the feature extraction module and the pooling layer and the number of convolutional layers in the single-stage target detection model;

inputting the fused image into the target tracking algorithm model, so as to obtain recognition and tracking results for the target object based on the target tracking algorithm model.

As an optional embodiment, the method further includes:

performing image blur detection on the first images in the first image set, where the image blur detection includes performing secondary blur processing on a first image and determining the blurriness of the first image based on a comparison of the first image before and after the secondary blur processing.

As an optional embodiment, the performing deblurring processing on the first image set includes:

combining a feature pyramid network, a dual-discriminator structure, and an image deblurring algorithm to form a first image processing network;

performing deblurring processing on the first images based on the first image processing network.

As an optional embodiment, the performing denoising processing on the first image set includes:

introducing an asymmetric loss function and a noise estimation sub-network into an image denoising network to form a second image processing network;

performing denoising processing on the first images based on the second image processing network.

As an optional embodiment, the method further includes:

constructing a feature fusion attention network based on a feature attention module capable of treating different image features and pixels unequally, a basic block structure composed of a local residual learning module and the feature attention module, and a feature fusion structure capable of assigning different weights to different features in the image, where the feature fusion attention network is used to dehaze images;

performing dehazing processing on the first images in the first image set based on the feature fusion attention network, so as to obtain the second image set.
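As a rough illustration of the "unequal treatment" idea behind the feature attention module, the following NumPy sketch implements the channel-attention half: a global descriptor is gated through two small projections so that channels are re-weighted differently. The projection shapes, the ReLU, and the sigmoid gate are common design choices assumed here for illustration, not details taken from the embodiment.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Channel attention: global average pooling, two 1x1 projections,
    and a sigmoid gate, so channels are weighted unequally.
    feat: (C, H, W); w1: (C_mid, C); w2: (C, C_mid) -- illustrative shapes."""
    pooled = feat.mean(axis=(1, 2))                     # (C,) global descriptor
    gate = sigmoid(w2 @ np.maximum(w1 @ pooled, 0.0))   # (C,) weights in (0, 1)
    return feat * gate[:, None, None]                   # re-weight each channel

rng = np.random.default_rng(1)
feat = rng.random((8, 16, 16))          # a toy feature map
w1 = rng.standard_normal((4, 8))        # hypothetical learned weights
w2 = rng.standard_normal((8, 4))
out = channel_attention(feat, w1, w2)
print(out.shape)  # (8, 16, 16)
```

In the full module a pixel-attention branch with an (1, H, W) gate would follow the same pattern spatially.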

As an optional embodiment, the first image set includes a first visible-light image and a first infrared image obtained by photographing the same scene, and the second image set formed includes a deblurred and denoised second visible-light image and a second infrared image;

the performing image fusion processing on the second image set to form a fused image includes:

performing geometric registration on the second visible-light image and the second infrared image;

performing image fusion processing on the registered second visible-light image and second infrared image to form the fused image.
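The embodiment does not fix a particular fusion rule at this point, so the following is a minimal sketch assuming a pixel-wise weighted average of the two already-registered modalities; the weight `w_vis` is an illustrative parameter, not a value from the embodiment.

```python
import numpy as np

def fuse_weighted(visible, infrared, w_vis=0.5):
    """Pixel-level weighted fusion of two geometrically registered images.
    Both inputs must share the same shape (i.e., registration already done)."""
    assert visible.shape == infrared.shape
    fused = w_vis * visible + (1.0 - w_vis) * infrared
    return np.clip(fused, 0.0, 1.0)   # keep the result a valid intensity image

# Toy registered frames: a bright visible image and a dim infrared image.
vis = np.full((4, 4), 0.8)
ir = np.full((4, 4), 0.2)
fused = fuse_weighted(vis, ir, w_vis=0.5)
print(fused[0, 0])  # 0.5
```

In practice the fusion in FIG. 4 would operate on features or multi-scale decompositions rather than raw pixels, but the registered-then-combine data flow is the same.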

As an optional embodiment, the constructing a target tracking algorithm model for identifying a target object in an image and tracking and marking it includes:

constructing the target tracking algorithm model based on basic convolutional layers, downsampling layers, residual modules, feature extraction modules, and a pooling layer, where a basic convolutional layer is composed of multiple convolutional layers, batch normalization (BN) layers, and activation function layers;

where the target tracking algorithm model includes six convolution stages, the feature extraction modules are located after the fourth and fifth convolution stages, the pooling layer is located after the sixth convolution stage, the numbers of residual modules in the second through sixth convolution stages are 1, 2, 4, 4, and 2, and the numbers of convolution channels of the six stages are 16, 24, 48, 96, 128, and 160, respectively.
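The stage layout described above can be captured as a small configuration table. The residual-module counts and channel widths come from the text; the stage-one stem row and all names are illustrative.

```python
# Configuration sketch of the modified single-stage backbone described above.
stages = [
    # (stage index, residual modules, output channels)
    (1, 0, 16),    # stem: basic conv stage; no residual count stated for stage 1
    (2, 1, 24),
    (3, 2, 48),
    (4, 4, 96),    # a feature extraction module follows this stage
    (5, 4, 128),   # a feature extraction module follows this stage
    (6, 2, 160),   # the pooling layer follows this stage
]

total_residual_modules = sum(n for _, n, _ in stages)
channels = [c for _, _, c in stages]
print(total_residual_modules, channels)  # 13 [16, 24, 48, 96, 128, 160]
```

Such a table is convenient for generating the network programmatically and for sanity-checking the stage counts against the specification.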

As an optional embodiment, the acquiring a first image set containing a target object includes:

acquiring the first image set containing the target object based on an unmanned aerial vehicle (UAV), a visible-light imaging sensor, and an infrared sensor.

Another embodiment of the present invention further provides a target recognition and tracking device on a complex battlefield, including:

an acquisition module, configured to acquire a first image set containing a target object, where the first images in the first image set are of different types;

a first processing module, configured to perform deblurring and denoising processing on the first image set to form a second image set;

a second processing module, configured to perform image fusion processing on the second image set to form a fused image;

a first construction module, configured to construct a target tracking algorithm model for identifying a target object in an image and tracking and marking it, where the target tracking algorithm model is based on a single-stage target detection model and is obtained at least by adjusting the positions of the feature extraction module and the pooling layer and the number of convolutional layers in the single-stage target detection model;

an input module, configured to input the fused image into the target tracking algorithm model, so as to obtain recognition and tracking results for the target object based on the target tracking algorithm model.

Another embodiment of the present invention further provides a target recognition and tracking system on a complex battlefield, including:

at least one processor; and

a memory communicatively connected to the at least one processor; where

the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to implement the target recognition and tracking method on a complex battlefield according to any one of the above.

Additional features and advantages of the present application will be set forth in the description that follows, and in part will be apparent from the description or may be learned by practice of the application. The objectives and other advantages of the application may be realized and attained by the structures particularly pointed out in the written description, the claims, and the accompanying drawings.

The technical solutions of the present application are described in further detail below with reference to the accompanying drawings and embodiments.

Brief Description of the Drawings

The accompanying drawings are provided for a further understanding of the present application and constitute a part of the specification; together with the embodiments of the present application, they serve to explain the present application and do not limit it. In the drawings:

FIG. 1 is a flowchart of a target recognition and tracking method on a complex battlefield in an embodiment of the present invention.

FIG. 2 is a diagram of the overall composition of the system.

FIG. 3 is an algorithm block diagram of the target recognition and tracking method on a complex battlefield in an embodiment of the present invention.

FIG. 4 is a flowchart of the image fusion algorithm.

FIG. 5 is a flowchart of the target tracking algorithm based on YOLOv3 + deepsort.

FIG. 6 is a structural block diagram of the target recognition and tracking method on a complex battlefield in an embodiment of the present invention.

Detailed Description of the Embodiments

Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings, but they are not intended to limit the present invention.

It should be understood that various modifications may be made to the embodiments disclosed herein. Therefore, the following description should not be regarded as limiting, but merely as exemplifying the embodiments. Those skilled in the art will envision other modifications within the scope and spirit of the present application.

The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the application and, together with the general description of the application given above and the detailed description of the embodiments given below, serve to explain the principles of the application.

These and other characteristics of the invention will become apparent from the following description of preferred forms of embodiment, given as non-limiting examples, with reference to the accompanying drawings.

It should also be understood that, although the invention has been described with reference to some specific examples, those skilled in the art can certainly implement many other equivalent forms of the invention, which have the features recited in the claims and therefore all fall within the scope of protection defined thereby.

The above and other aspects, features, and advantages of the present application will become more apparent in view of the following detailed description when taken in conjunction with the accompanying drawings.

Specific embodiments of the present application are described hereinafter with reference to the accompanying drawings; however, it should be understood that the disclosed embodiments are merely examples of the present application, which may be implemented in various ways. Well-known and/or repetitive functions and structures are not described in detail, so as to avoid obscuring the present application with unnecessary or redundant detail. Therefore, the specific structural and functional details disclosed herein are not intended to be limiting, but merely serve as a basis for the claims and as a representative basis for teaching those skilled in the art to variously employ the present application in virtually any suitable detailed structure.

This specification may use the phrases "in one embodiment", "in another embodiment", "in yet another embodiment", or "in other embodiments", each of which may refer to one or more of the same or different embodiments according to the present application.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings.

As shown in FIG. 1, an embodiment of the present invention provides a target recognition and tracking method on a complex battlefield, including:

acquiring a first image set containing a target object, where the first images in the first image set are of different types;

performing deblurring and denoising processing on the first image set to form a second image set;

performing image fusion processing on the second image set to form a fused image;

constructing a target tracking algorithm model for identifying a target object in an image and tracking and marking it, where the target tracking algorithm model is based on a single-stage target detection model and is obtained at least by adjusting the positions of the feature extraction module and the pooling layer and the number of convolutional layers in the single-stage target detection model;

inputting the fused image into the target tracking algorithm model, so as to obtain recognition and tracking results for the target object based on the target tracking algorithm model.
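The steps above can be sketched as a simple processing chain. Every stage function below is an illustrative placeholder (the actual networks are described later in the embodiment); only the data flow, from raw multi-type images through restoration and fusion to the tracker, follows the method.

```python
import numpy as np

# Hypothetical stand-ins for the networks described in the claims.
def deblur_denoise(images):
    """Stage 2 placeholder: per-image restoration (first/second processing networks)."""
    return [np.clip(img, 0.0, 1.0) for img in images]

def fuse(images):
    """Stage 3 placeholder: pixel-level fusion of the restored modalities."""
    return np.mean(np.stack(images), axis=0)

def track(fused_image):
    """Stages 4-5 placeholder: detector/tracker returning (box, track_id) pairs."""
    h, w = fused_image.shape[:2]
    return [((0, 0, w, h), 0)]  # one dummy detection covering the frame

def recognize_and_track(first_image_set):
    second_image_set = deblur_denoise(first_image_set)   # deblur + denoise
    fused = fuse(second_image_set)                       # image fusion
    return track(fused)                                  # model inference

# Example: one visible-light and one infrared frame of the same scene.
visible = np.random.rand(64, 64)
infrared = np.random.rand(64, 64)
results = recognize_and_track([visible, infrared])
print(results)
```

The point of the sketch is that fusion happens once, before detection, so the tracker sees a single multi-modal image per frame.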

The above method in this embodiment can be applied to a detection, tracking, and control system for typical battlefield targets. The overall composition of the control system is shown in FIG. 2. At the business level it mainly comprises six parts: a data layer, an algorithm model layer, an offline platform, an online platform, a data management system, and a target detection analysis and display system. The concrete presentation system platform comprises three parts: (1) a deep learning training platform, which mainly carries out data labeling, preprocessing, and model construction and training; (2) a UAV real-time monitoring system, which mainly performs real-time display of UAV flight parameters and UAV attitude control; and (3) a comprehensive target analysis and display system, which mainly carries out algorithm verification and analysis and data management functions, and specifically includes a local data storage module, a database, a data storage management system, a result display system, and a target detection analysis and display system.

Further, the algorithm framework of the above method in this embodiment may be seen in FIG. 3. The image preprocessing algorithm model mainly includes an image blur detection model and an image optimization preprocessing model, where the image optimization preprocessing model includes three modules: an image deblurring algorithm model, an image denoising algorithm model, and an image dehazing algorithm model (i.e., the first image processing network, the second image processing network, and the feature fusion attention network described below, respectively). The preprocessed images (i.e., the second image set) are used for target detection and tracking, including processing by a single-target detection algorithm and a multi-modal fusion target detection method, thereby realizing YOLO + deepsort tracking (introduced below) of typical targets observed by the UAV.

Specifically, the method in this embodiment further includes:

performing image blur detection on the first images in the first image set, where the image blur detection includes performing secondary blur processing on a first image and determining the blurriness of the first image based on a comparison of the first image before and after the secondary blur processing.

For example, the typical targets captured by a UAV are of many types and appear randomly, so a reference image is usually unavailable; the blur detection proposed in this embodiment is used to evaluate the quality of the captured images without a reference image. Image sharpness is an important index of image quality and corresponds well to human subjective perception, and low sharpness manifests as image blur. Therefore, this embodiment implements no-reference image quality evaluation with an image blur detection algorithm based on secondary blurring of the image.

Specifically, secondary blur processing is performed on the image under test (i.e., a first image in the first image set), and the re-blurred result is compared with the original image to see whether they differ significantly. If the original image is blurred, it changes little after the secondary blurring. If the original image is sharp, secondary blurring produces a large difference between the images before and after blurring, indicating that the original image is sharp. For evaluating the images before and after blurring, this embodiment preferably uses the gray variance product method (SMD2), which performs well in both accuracy and computation time. The secondary blurring method can give the degree of blur of an image without a sharp picture as a reference, that is, it evaluates the degree of blur of the image and realizes blur detection without a reference sharp image.
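A minimal NumPy sketch of the re-blur test with an SMD2 sharpness measure is shown below. The 3x3 mean filter used as the "secondary blur" and the normalized score in [0, 1] are illustrative choices, not details fixed by the embodiment.

```python
import numpy as np

def smd2(img):
    """Gray variance product (SMD2): sum over pixels of
    |f(x,y)-f(x+1,y)| * |f(x,y)-f(x,y+1)|; larger means sharper."""
    dx = np.abs(np.diff(img, axis=1))[:-1, :]   # horizontal differences
    dy = np.abs(np.diff(img, axis=0))[:, :-1]   # vertical differences
    return float(np.sum(dx * dy))

def box_blur(img, k=3):
    """Simple k x k mean filter standing in for the 'secondary blur'."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy_ in range(k):
        for dx_ in range(k):
            out += padded[dy_:dy_ + img.shape[0], dx_:dx_ + img.shape[1]]
    return out / (k * k)

def blur_score(img):
    """No-reference score: near 0 for an already-blurred input (re-blurring
    changes little), near 1 for a sharp input (re-blurring changes a lot)."""
    s_orig = smd2(img)
    if s_orig == 0:
        return 0.0
    s_blur = smd2(box_blur(img))
    return max(0.0, (s_orig - s_blur) / s_orig)

rng = np.random.default_rng(0)
sharp = rng.random((32, 32))          # high-frequency content acts as "sharp"
blurred = box_blur(box_blur(sharp))   # pre-blurred version of the same image
print(blur_score(sharp) > blur_score(blurred))  # True
```

The sharp image loses most of its SMD2 energy after one extra blur pass, while the pre-blurred image changes proportionally less, which is exactly the discrimination the method relies on.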

在对第一图像集进行去模糊处理时,包括:When deblurring the first set of images, include:

将特征金字塔网络、双判别器结构与图像去模糊算法结合形成第一图像处理网络;Combining feature pyramid network, double discriminator structure and image deblurring algorithm to form the first image processing network;

基于第一图像处理网络对第一图像进行去模糊处理。Deblurring the first image is performed based on the first image processing network.

For example, when blur detection determines that a first image is blurred and deblurring is required, the deblurring is performed by the first image processing network. In this embodiment the first image processing network uses the DeblurGANv2 algorithm; on top of its basic framework, a Feature Pyramid Network (FPN) structure is built for feature fusion, which aggregates multi-scale features while striking a balance between accuracy and efficiency. To handle larger and more complex real blurred regions and to improve processing efficiency, the first image processing network in this embodiment further adopts the following network structures:

(1) using Inception-ResNet as the basic backbone to form an FPN-Inception-ResNet-v2 network;

(2) using MobileNet as the basic backbone to form an FPN-MobileNet network, which achieves a 10- to 100-fold inference speed-up with excellent performance, meeting real-time requirements.

As to the discriminator, this embodiment adopts a dual-discriminator structure, that is, discriminator measurements are made at both the global and local scales. During training, optimizing the loss function of the above network model drives the network to convergence, yielding the trained network model, namely the first image processing network.
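The multi-scale aggregation performed by the FPN structure can be illustrated with a hedged NumPy sketch. The lateral 1x1 convolutions of a real FPN are omitted here (channel counts are assumed to already match), so only the top-down upsample-and-add pathway is shown:

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def fpn_topdown(features):
    """FPN-style top-down pathway: start from the coarsest map and
    repeatedly upsample and add the next finer lateral feature.
    `features` is ordered fine -> coarse; lateral 1x1 convs are an
    omitted assumption of this sketch, so channels must match."""
    merged = features[-1]
    outputs = [merged]
    for lateral in reversed(features[:-1]):
        merged = upsample2x(merged) + lateral
        outputs.append(merged)
    return outputs[::-1]  # ordered fine -> coarse again
```

Each output map thus carries information from every coarser scale, which is the property the deblurring network exploits for feature fusion.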

Further, denoising the first image set includes:

introducing an asymmetric loss function and a noise estimation sub-network into an image denoising network to form a second image processing network;

performing denoising on the first images based on the second image processing network.

For example, this embodiment preferably builds the second image processing network with the deep-learning CBDNet model. The algorithm removes real-world noise while causing less image blurring, and it meets the real-time requirement. In addition, introducing a noise estimation sub-network into the model further improves the denoising capability, and introducing an asymmetric loss function improves the model's generalization to real noise and supports interactive denoising by reasonably adjusting the noise intensity level.
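The asymmetric loss mentioned above penalizes under-estimated noise levels more heavily than over-estimated ones, since under-estimation harms the subsequent non-blind denoising more. A hedged NumPy sketch, with the weight `alpha` chosen as an assumption of the example:

```python
import numpy as np

def asymmetric_loss(sigma_hat, sigma_true, alpha=0.3):
    """Asymmetric MSE for the noise-estimation sub-network:
    per-element weight is |alpha - 1(e < 0)| * e^2 with
    e = sigma_hat - sigma_true.  With alpha < 0.5, under-estimation
    (e < 0) is penalized more heavily than over-estimation."""
    e = np.asarray(sigma_hat, dtype=float) - np.asarray(sigma_true, dtype=float)
    w = np.abs(alpha - (e < 0).astype(float))
    return float(np.mean(w * e ** 2))
```

With `alpha = 0.3`, an error of -1 is weighted 0.7 while an error of +1 is weighted 0.3, which biases the estimator toward slightly over-estimating the noise level.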

Preferably, the method in this embodiment further includes:

constructing a feature fusion attention network based on a feature attention module that treats different image features and pixels unequally, a basic block structure composed of a local residual learning module and the feature attention module, and a feature fusion structure capable of distinguishing the weights of different features in a hazy image, the feature fusion attention network being used to dehaze images;

performing dehazing on the first images in the first image set simultaneously, based on the feature fusion attention network, to obtain the second image set.

Specifically, this embodiment preferably uses an end-to-end feature fusion attention network (FFA-Net) to directly restore the haze-free image. The FFA-Net architecture consists of three main parts. a. A new feature attention (FA) module combines channel attention with a pixel attention mechanism; considering that different channel features carry entirely different weighted information and that haze is distributed unevenly across image pixels, the FA module in this embodiment treats different features and pixels unequally, which provides extra flexibility in handling different types of information and enhances the expressive power of CNNs. b. The basic block structure is composed of local residual learning and the feature attention module; local residual learning allows less important information, such as thin-haze or low-frequency regions, to be bypassed through multiple local residual connections, letting the main network focus on more effective information. c. An attention-based multi-level feature fusion (FFA) structure learns feature weights adaptively from the FA modules, giving important features more weight; this structure also preserves shallow-layer information and passes it to the deep layers. Once the feature fusion attention network is prepared, it processes the first image set to dehaze the images, helping obtain a clear second image set.
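Part a above, channel attention followed by pixel attention, can be sketched as follows. The convolution stacks of the real FA module are collapsed to single linear maps here; `w_ca` and `w_pa` are assumed weights of this sketch, not parameters specified by the embodiment:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def feature_attention(feat, w_ca, w_pa):
    """Sketch of an FA block: channel attention (one weight per channel
    from global average pooling) followed by pixel attention (one weight
    per spatial position).  feat: (C, H, W); w_ca: (C, C); w_pa: (C,)."""
    # channel attention: global average pool -> linear map -> sigmoid
    gap = feat.mean(axis=(1, 2))                         # (C,)
    ca = sigmoid(w_ca @ gap)                             # (C,)
    feat = feat * ca[:, None, None]
    # pixel attention: channel-wise projection -> sigmoid
    pa = sigmoid(np.tensordot(w_pa, feat, axes=(0, 0)))  # (H, W)
    return feat * pa[None, :, :]
```

The channel weights scale whole feature maps while the pixel weights scale individual positions, which is what lets the network treat thin-haze and dense-haze regions unequally.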

In this embodiment, to improve the accuracy of the target detection algorithm in complex battlefield scenes, visible-light images and infrared images are collected and fused, so that the complementary information of the two image types improves complex-target detection performance. Specifically, the first image set in this embodiment includes a first visible-light image and a first infrared image captured of the same scene, and the resulting second image set includes a deblurred and denoised second visible-light image and second infrared image; the first image set containing the target object is collected by means of an unmanned aerial vehicle, a visible-light imaging sensor and an infrared sensor.

Further, performing image fusion processing on the second image set to form a fused image includes:

geometrically registering the second visible-light image with the second infrared image;

performing image fusion processing on the registered second visible-light image and second infrared image to form the fused image.

The specific process is shown in Fig. 4. Because images of the same object obtained by different sensors and different imaging modes can differ by relative translation, rotation, scaling and distortion, strict geometric registration must be guaranteed before the visible-light and infrared images are fused. Image fusion is the process of superimposing the information of multiple source images to obtain a fused image, which is then analyzed and processed. According to the characteristics and abstraction level of the fusion processing, image fusion methods fall into three categories: decision-level fusion, feature-level fusion and pixel-level fusion; any of the three may be chosen flexibly according to the actual situation.

1) Decision-level fusion is a fusion method based on cognitive models. On the basis of feature extraction, the feature information of each image is identified and judged separately to obtain individual decisions, which are then merged, according to actual needs, into one globally optimal joint decision.

2) Feature-level fusion extracts the target features of the regions of interest in the source images and performs comprehensive analysis and fusion on this feature information. During feature extraction only the important information is retained, while unimportant and redundant information is usually removed by means such as dimensionality reduction. This fusion method compresses the information of the source images and markedly improves computation speed.

3) Pixel-level fusion selects a fusion strategy to process the pixels of strictly registered source images to obtain the fused image, for example using algorithms based on pyramid transforms or wavelet transforms. Pixel-level fusion preserves the original information of the target image to the greatest extent and has greater application value: it retains as much usable information from the visible-light and infrared images as possible so that the two complement each other effectively, improving target detection accuracy. A specific algorithm merges the visible-light image and the infrared image into one fused image, which is then fed to the target detection model.
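A minimal example of category 3): after registration, a fixed-weight average of the two source images. Real systems would typically use pyramid or wavelet fusion, and the 0.5 weight here is an assumption of the sketch:

```python
import numpy as np

def pixel_level_fusion(visible, infrared, w_vis=0.5):
    """Weighted-average pixel-level fusion of two strictly registered
    single-channel uint8 images of identical shape."""
    if visible.shape != infrared.shape:
        raise ValueError("images must be registered onto the same grid")
    fused = w_vis * visible.astype(float) + (1.0 - w_vis) * infrared.astype(float)
    return np.clip(fused, 0, 255).astype(np.uint8)
```

The shape check stands in for the strict-registration precondition: fusion only makes sense once every pixel of the visible image corresponds to the same scene point in the infrared image.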

Further, constructing the target tracking algorithm model used to identify the target object in an image and perform tracking and marking includes:

constructing the target tracking algorithm model based on basic convolutional layers, downsampling layers, residual modules, a feature extraction module and a pooling layer, each basic convolutional layer being composed of a convolutional layer, a BN layer and an activation function layer;

wherein the target tracking algorithm model includes six convolution stages, the feature extraction modules are located after the fourth and fifth convolution stages, the pooling layer is located after the sixth convolution stage, the numbers of residual modules in the second through sixth convolution stages are 1, 2, 4, 4 and 2, and the convolution channel numbers of the stages are 16, 24, 48, 96, 128 and 160 respectively.

Specifically, this embodiment preferably uses YOLOv3 (a single-stage object detection model) as the basic structural framework. The model mainly comprises three parts, the basic structure, the functional modules and the loss function, and on this basis the embodiment optimizes the model for typical UAV target detection scenes from these three aspects. The basic structure stays consistent with YOLOv3, containing the Backbone, Neck and Head. The basic components of the Backbone are: basic convolutional layers (Basicconv), downsampling layers (Downsample), residual modules (Res), RFB modules (feature extraction modules) and an SPP layer (pooling layer). Every basic convolutional layer in the network consists of a convolutional layer, a BN layer and a ReLU activation function layer. Further, the model contains an initialization convolution and six convolution stages in total, with five downsampling operations, so the feature-map stride at the deepest network layer is 32. RFB modules are added at the end of the fourth and fifth convolution stages to enlarge the receptive field of those stages; an SPP module is added at the end of the sixth convolution stage to capture higher-level semantic information and enhance generalization. Given that UAV detection scenes contain few target categories and only small variations in target pose and scale, this embodiment adjusts the number of convolutional layers in each stage: the numbers of residual modules in the second through sixth convolution stages are changed from the original 1, 2, 8, 8, 4 to 1, 2, 4, 4, 2, that is, Darknet-53 becomes Darknet-33. The convolution channel numbers of the six stages are also adjusted, from the original 32, 64, 128, 256, 512, 1024 to 16, 24, 48, 96, 128, 160.

The YOLOv3 algorithm proposed in this embodiment reduces the complexity of the network structure, lowers the computing-power requirement, shortens model training time and compresses the model size, which facilitates deployment on a UAV embedded platform and meets the real-time requirements of the algorithm as far as possible, with performance at least on par with YOLOv5m. Experimental results show that on the COCO dataset, with an IoU threshold of 0.5, the mean average precision (mAP) of the YOLOv3 algorithm proposed here is slightly higher than that of YOLOv5m.
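The stage layout described above can be recorded as a small configuration table. The depth arithmetic below mirrors the way Darknet-53 is conventionally counted and is an assumption of this sketch, not a count given by the embodiment:

```python
# Stage layout of the modified backbone ("Darknet-33") described above.
# Only the configuration is captured; layer implementations are omitted.
RESIDUAL_BLOCKS = [0, 1, 2, 4, 4, 2]      # stages 1..6 (stage 1 has none)
OUT_CHANNELS = [16, 24, 48, 96, 128, 160]  # per-stage channel counts

def deepest_stride(num_downsamples=5):
    """Five downsampling operations give a feature-map stride of 2**5."""
    return 2 ** num_downsamples

def darknet_depth(residual_blocks):
    """Darknet-style layer count: 2 convs per residual block, plus the
    stem conv, 5 downsample convs, and one final layer (an assumption
    mirroring how Darknet-53 gets its name): 2*13 + 5 + 1 + 1 = 33."""
    return 2 * sum(residual_blocks) + 5 + 1 + 1
```

Halving the residual-block counts and shrinking the channel widths is what produces the smaller model the paragraph above credits with faster training and easier embedded deployment.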

Further, with continued reference to Fig. 4 and Fig. 5, once the target tracking algorithm model has been prepared, it can process the fused image to obtain the recognition and tracking results for the target object. Specifically, since this embodiment uses the YOLOv3 algorithm as the target detection model, the model trained on typical battlefield targets can be used directly as the detector of the target tracking algorithm, so the technical route finally adopted for multi-target tracking of typical battlefield targets is YOLOv3 + DeepSORT. The YOLOv3 + DeepSORT target tracking algorithm of this embodiment, shown in Fig. 5, mainly comprises two parts: Kalman filtering and the Hungarian algorithm. The DeepSORT algorithm follows the current mainstream tracking-by-detection idea (real-time detection plus tracking prediction). The specific pipeline is: the YOLOv3 detector generates detection boxes (detections) → the tracker predicts the current frame's tracks from the previous frame's tracking results → the Hungarian algorithm (with a cost matrix computed from appearance information, Mahalanobis distance or IoU) matches the new tracks with the detections → the successfully matched detections and their corresponding tracks are updated by the Kalman filter. In this algorithm the YOLOv3 detection results serve as input: bounding box, confidence and feature. The confidence is mainly used to filter out part of the detection boxes; the bounding box and feature (ReID) are used in the subsequent match computation with the tracker. At run time the prediction module first predicts the tracks with a Kalman filter; this embodiment uses a constant-velocity motion and linear observation model (meaning only four quantities are tracked, initialized to constant values from the detector). The update module then performs matching, tracker update and feature-set update; at its core, the matching for the Hungarian algorithm is still carried out with IoU.

Based on the disclosure of the embodiments described above, the beneficial effects of the present application include:

Regarding image restoration: (1) For the highly non-uniform blurred images of targets collected by UAVs, especially images containing complex target motion, an image deblurring algorithm based on DeblurGANv2 is proposed. It makes full use of global and local features, uses a feature pyramid network to improve the generalization of image features, introduces a global residual structure to guarantee restoration accuracy, and adopts MobileNet as the basic network structure to raise inference speed and ensure real-time performance. (2) Traditional image denoising algorithms are rather crude: although they remove noise, they also filter out some image detail, making the image blurrier and degrading subsequent target detection. This embodiment proposes a deep-neural-network CBDNet model for denoising typical UAV target images, composed of a noise estimation sub-network and a non-blind denoising sub-network; training CBDNet on both synthetic noisy images and real noisy images improves the model's generalization to real noisy images.

Further, to meet the real-time requirements of typical UAV target detection, this embodiment proposes an improved YOLOv3 as the basic structural framework. Modifying the basic structure of the YOLOv3 network accommodates the characteristics of UAV detection scenes, namely few target categories and small variations in target pose and scale; combining spatial pyramid pooling with an enlarged receptive field captures higher-level semantic information and enhances generalization for typical UAV target detection; and a multi-task loss function realizes both target classification and localization. The model can be built on a visible-light target detection algorithm model; the improved YOLOv3 algorithm has been well validated in terms of inference speed, precision and recall, while greatly compressing the model and reducing detection time.

In addition, this embodiment applies a dual-band fusion target detection and recognition algorithm to the detection and recognition of typical battlefield targets, mainly for typical UAV target detection, which effectively improves detection performance and robustness. The dual-band fusion detection and recognition algorithm is obtained by improving the algorithm with YOLOv3 as the basic structural framework. Feature-level multi-source sensor image fusion extracts the target features of the regions of interest in multi-source images and performs comprehensive analysis and fusion on this feature information. In the feature-level fusion algorithm for visible-light and infrared images proposed in this embodiment, only the important information is retained during feature extraction, unimportant and redundant information is usually removed by means such as dimensionality reduction, and the important features extracted from visible light and infrared are fused, laying the foundation for subsequent accurate target detection and tracking and improving detection and tracking accuracy.

As shown in Fig. 6, another embodiment of the present invention further provides a target recognition and tracking device 100 on a complex battlefield, including:

an acquisition module, configured to acquire a first image set containing a target object, the first images in the first image set being of different types;

a first processing module, configured to perform deblurring and denoising on the first image set to form a second image set;

a second processing module, configured to perform image fusion processing on the second image set to form a fused image;

a first construction module, configured to construct a target tracking algorithm model for identifying the target object in an image and performing tracking and marking, the target tracking algorithm model being based on a single-stage target detection model and obtained at least by adjusting the positions of the feature extraction module and the pooling layer and the number of convolutional layers in the single-stage target detection model;

an input module, configured to input the fused image into the target tracking algorithm model to obtain the recognition and tracking results for the target object based on the target tracking algorithm model.

As an optional embodiment, the device further includes:

a detection module, configured to perform image blur detection on the first images in the first image set, the image blur detection including performing secondary blurring on a first image and determining the blurriness of the first image based on a comparison of the first image before and after the secondary blurring.

As an optional embodiment, performing deblurring on the first image set includes:

combining a feature pyramid network and a dual-discriminator structure with an image deblurring algorithm to form a first image processing network;

performing deblurring on the first images based on the first image processing network.

As an optional embodiment, performing denoising on the first image set includes:

introducing an asymmetric loss function and a noise estimation sub-network into an image denoising network to form a second image processing network;

performing denoising on the first images based on the second image processing network.

As an optional embodiment, the device further includes:

a second construction module, configured to construct a feature fusion attention network based on a feature attention module that treats different image features and pixels unequally, a basic block structure composed of a local residual learning module and the feature attention module, and a feature fusion structure capable of distinguishing the weights of different features in a hazy image, the feature fusion attention network being used to dehaze images;

a third processing module, configured to perform dehazing on the first images in the first image set simultaneously, based on the feature fusion attention network, to obtain the second image set.

As an optional embodiment, the first image set includes a first visible-light image and a first infrared image captured of the same scene, and the formed second image set includes a deblurred and denoised second visible-light image and second infrared image;

performing image fusion processing on the second image set to form the fused image includes:

geometrically registering the second visible-light image with the second infrared image;

performing image fusion processing on the registered second visible-light image and second infrared image to form the fused image.

As an optional embodiment, constructing the target tracking algorithm model for identifying the target object in an image and performing tracking and marking includes:

constructing the target tracking algorithm model based on basic convolutional layers, downsampling layers, residual modules, a feature extraction module and a pooling layer, each basic convolutional layer being composed of a convolutional layer, a BN layer and an activation function layer;

wherein the target tracking algorithm model includes six convolution stages, the feature extraction modules are located after the fourth and fifth convolution stages, the pooling layer is located after the sixth convolution stage, the numbers of residual modules in the second through sixth convolution stages are 1, 2, 4, 4 and 2, and the convolution channel numbers are 16, 24, 48, 96, 128 and 160 respectively.

As an optional embodiment, acquiring the first image set containing the target object includes:

acquiring the first image set containing the target object based on an unmanned aerial vehicle, a visible-light imaging sensor and an infrared sensor.

Further, another embodiment of the present invention provides a target recognition and tracking system on a complex battlefield, characterized by including:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to implement the target recognition and tracking method on a complex battlefield according to any one of the above.

Further, an embodiment of the present invention also provides a storage medium on which a computer program is stored; when the program is executed by a processor, the target recognition and tracking method on a complex battlefield described above is implemented. It should be understood that each solution in this embodiment has the technical effects corresponding to the above method embodiments, which will not be repeated here.

Further, an embodiment of the present invention also provides a computer program product tangibly stored on a computer-readable medium and including computer-executable instructions that, when executed, cause at least one processor to perform the target recognition and tracking method on a complex battlefield in the embodiments described above.

It should be noted that the computer storage medium of the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of computer-readable storage media include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus or device. A computer-readable signal medium, in turn, may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate or transmit a program configured for use by or in connection with an instruction execution system, apparatus or device. Program code contained on a computer-readable medium may be transmitted by any appropriate medium, including but not limited to wireless, wire, optical cable, RF and the like, or any suitable combination of the above.

In addition, those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, optical storage and the like) containing computer-usable program code.

The present application is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the present application. It should be understood that each procedure and/or block in the flowcharts and/or block diagrams, and combinations of procedures and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing equipment produce a system for implementing the functions specified in one or more procedures of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction system that implements the functions specified in one or more procedures of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more procedures of the flowcharts and/or one or more blocks of the block diagrams.

Obviously, those skilled in the art can make various changes and modifications to the present application without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present application and their technical equivalents, the present application is also intended to encompass them.

Claims (10)

1. A target recognition and tracking method on a complex battlefield, characterized in that it comprises:
acquiring a first image set containing a target object, the first images in the first image set being of different types;
performing deblurring and denoising processing on the first image set to form a second image set;
performing image fusion processing on the second image set to form a fused image;
constructing a target tracking algorithm model for identifying the target object in an image and applying a tracking mark, the target tracking algorithm model being based on a single-stage target detection model and obtained at least by adjusting the positions of the feature extraction module and the pooling layer in the single-stage target detection model and the number of convolutional layers; and
inputting the fused image into the target tracking algorithm model to obtain recognition and tracking results for the target object based on the target tracking algorithm model.

2. The target recognition and tracking method on a complex battlefield according to claim 1, characterized in that it further comprises:
performing image blurriness detection on the first images in the first image set, the image blurriness detection comprising performing secondary blur processing on a first image and determining the blurriness of the first image based on a comparison of the first image before and after the secondary blur processing.

3. The target recognition and tracking method on a complex battlefield according to claim 1, characterized in that performing deblurring processing on the first image set comprises:
combining a feature pyramid network and a dual-discriminator structure with an image deblurring algorithm to form a first image processing network; and
performing deblurring processing on the first images based on the first image processing network.

4. The target recognition and tracking method on a complex battlefield according to claim 1, characterized in that performing denoising processing on the first image set comprises:
introducing an asymmetric loss function and a noise estimation sub-network into an image denoising network to form a second image processing network; and
performing denoising processing on the first images based on the second image processing network.

5. The target recognition and tracking method on a complex battlefield according to claim 1, characterized in that it further comprises:
constructing a feature fusion attention network based on a feature attention module capable of treating different image features and pixels unequally, a basic block structure composed of a local residual learning module and the feature attention module, and a feature fusion structure capable of distinguishing the weights of different features in feature-rich images, the feature fusion attention network being used to perform dehazing processing on images; and
simultaneously performing dehazing processing on the first images in the first image set based on the feature fusion attention network to obtain the second image set.

6. The target recognition and tracking method on a complex battlefield according to claim 1, characterized in that the first image set comprises a first visible light image and a first infrared image obtained by photographing the same scene, and the second image set formed comprises a deblurred, denoised second visible light image and second infrared image;
performing image fusion processing on the second image set to form the fused image comprises:
performing geometric registration on the second visible light image and the second infrared image; and
performing image fusion processing on the registered second visible light image and second infrared image to form the fused image.

7. The target recognition and tracking method on a complex battlefield according to claim 1, characterized in that constructing the target tracking algorithm model for identifying the target object in an image and applying a tracking mark comprises:
constructing the target tracking algorithm model based on basic convolutional layers, downsampling layers, residual modules, feature extraction modules, and pooling layers, a basic convolutional layer being composed of multiple convolutional layers, BN layers, and activation function layers;
wherein the target tracking algorithm model comprises six convolution stages, the feature extraction modules are located after the fourth and fifth convolution stages, the pooling layer is located after the sixth convolution stage, the numbers of residual modules in the second through sixth convolution stages are 1, 2, 4, 4, and 2, and the numbers of convolution channels are 16, 24, 48, 96, 128, and 160, respectively.

8. The target recognition and tracking method on a complex battlefield according to claim 1, characterized in that acquiring the first image set containing the target object comprises:
acquiring the first image set containing the target object based on an unmanned aerial vehicle, a visible light imaging sensor, and an infrared sensor.

9. A target recognition and tracking device on a complex battlefield, characterized in that it comprises:
an acquisition module configured to acquire a first image set containing a target object, the first images in the first image set being of different types;
a first processing module configured to perform deblurring and denoising processing on the first image set to form a second image set;
a second processing module configured to perform image fusion processing on the second image set to form a fused image;
a first construction module configured to construct a target tracking algorithm model for identifying the target object in an image and applying a tracking mark, the target tracking algorithm model being based on a single-stage target detection model and obtained at least by adjusting the positions of the feature extraction module and the pooling layer in the single-stage target detection model and the number of convolutional layers; and
an input module configured to input the fused image into the target tracking algorithm model to obtain recognition and tracking results for the target object based on the target tracking algorithm model.

10. A target recognition and tracking system on a complex battlefield, characterized in that it comprises:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to implement the target recognition and tracking method on a complex battlefield according to any one of claims 1 to 8.
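The re-blur comparison in claim 2 can be sketched as follows. This is a minimal illustration of the underlying idea (a second blur destroys little of an already-blurry image's gradient energy, but a lot of a sharp one's), loosely in the spirit of no-reference re-blur metrics such as Crete et al.; the box-filter kernel size, the horizontal-difference gradient, and the scoring formula are illustrative assumptions, not the patent's specification.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def blurriness_score(gray: np.ndarray, k: int = 9) -> float:
    """Score in [0, 1]: higher means blurrier (hypothetical metric).

    Apply a secondary blur and measure how much horizontal gradient
    energy it destroys; sharp images lose a lot, already-blurry
    images lose little.
    """
    img = gray.astype(np.float64)
    reblurred = uniform_filter(img, size=k)        # the "secondary blur"
    d_orig = np.abs(np.diff(img, axis=1))          # gradients before re-blur
    d_blur = np.abs(np.diff(reblurred, axis=1))    # gradients after re-blur
    lost = np.maximum(d_orig - d_blur, 0.0)        # energy removed by re-blur
    total = d_orig.sum()
    if total == 0.0:
        return 1.0  # a flat image carries no sharpness evidence
    return 1.0 - lost.sum() / total                # sharp edge -> low score
```

A sharp step edge scores low, while pre-blurring the same image raises the score; that before/after comparison is the determination the claim describes.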
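Claim 6 fuses a geometrically registered visible/infrared pair, but does not recite a particular fusion rule. The sketch below therefore uses plain pixel-wise weighted averaging as a stand-in; the weight `w_vis` is an illustrative assumption, and registration (step one of the claim) is assumed to have already aligned both frames onto the same pixel grid.

```python
import numpy as np

def fuse_weighted(visible: np.ndarray, infrared: np.ndarray,
                  w_vis: float = 0.6) -> np.ndarray:
    """Pixel-wise weighted fusion of a co-registered visible/infrared pair."""
    if visible.shape != infrared.shape:
        # Geometric registration must map both images onto one grid first.
        raise ValueError("inputs must be registered onto the same grid")
    fused = (w_vis * visible.astype(np.float64)
             + (1.0 - w_vis) * infrared.astype(np.float64))
    return np.clip(fused, 0, 255).astype(np.uint8)
```

In practice a learned or multi-scale fusion network would replace this averaging, but the interface (two registered inputs, one fused output) matches the claimed processing step.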
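The backbone layout recited in claim 7 can be tabulated programmatically. The sketch below only enumerates what the claim states (six convolution stages; residual-module counts 1, 2, 4, 4, 2 for stages two through six; channel widths 16, 24, 48, 96, 128, 160; feature-extraction modules after stages four and five; pooling after stage six). The zero residual modules in stage one, the stride-2 downsample opening each stage, and the 640×640 input are assumptions added for illustration.

```python
# Channel widths for stages 1-6 as recited in claim 7.
STAGE_CHANNELS = [16, 24, 48, 96, 128, 160]
# Residual-module counts; the claim specifies stages 2-6 only,
# so stage 1 is assumed to contain none.
RESIDUAL_BLOCKS = [0, 1, 2, 4, 4, 2]

def backbone_plan(input_hw=(640, 640)):
    """Enumerate the six convolution stages of the claimed backbone.

    Assumes each stage opens with a stride-2 downsampling layer that
    halves the spatial resolution (an illustrative assumption).
    """
    h, w = input_hw
    plan = []
    for i, (ch, n_res) in enumerate(zip(STAGE_CHANNELS, RESIDUAL_BLOCKS), 1):
        h, w = h // 2, w // 2
        stage = {"stage": i, "channels": ch,
                 "residual_blocks": n_res, "out_hw": (h, w)}
        if i in (4, 5):
            stage["followed_by"] = "feature-extraction module"  # claim 7
        if i == 6:
            stage["followed_by"] = "pooling layer"              # claim 7
        plan.append(stage)
    return plan
```

Walking the plan for a 640×640 input gives 10×10 feature maps after stage six, which is where the claim places the final pooling layer.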
CN202310183454.6A 2023-02-28 2023-02-28 A target recognition and tracking method, device and system on a complex battlefield Pending CN116310900A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310183454.6A CN116310900A (en) 2023-02-28 2023-02-28 A target recognition and tracking method, device and system on a complex battlefield

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310183454.6A CN116310900A (en) 2023-02-28 2023-02-28 A target recognition and tracking method, device and system on a complex battlefield

Publications (1)

Publication Number Publication Date
CN116310900A true CN116310900A (en) 2023-06-23

Family

ID=86816203

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310183454.6A Pending CN116310900A (en) 2023-02-28 2023-02-28 A target recognition and tracking method, device and system on a complex battlefield

Country Status (1)

Country Link
CN (1) CN116310900A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117115246A (en) * 2023-07-07 2023-11-24 东北大学秦皇岛分校 A UAV horizontal attitude recognition method and recognition system
CN120125954A (en) * 2025-05-14 2025-06-10 中国刑事警察学院 A target recognition method, system, device and medium for complex scenes
CN120765512A (en) * 2025-09-03 2025-10-10 中国科学院宁波材料技术与工程研究所 A method and related device for repairing and completing fundus image reflection

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111626944A (en) * 2020-04-21 2020-09-04 温州大学 Video deblurring method based on space-time pyramid network and natural prior resistance
CN112241668A (en) * 2019-07-18 2021-01-19 杭州海康威视数字技术股份有限公司 Image processing method, device and equipment
CN112509001A (en) * 2020-11-24 2021-03-16 河南工业大学 Multi-scale and multi-feature fusion feature pyramid network blind restoration method
CN113792631A (en) * 2021-08-31 2021-12-14 电子科技大学 A method for aircraft detection and tracking based on multi-scale adaptation and edge domain attention
CN114241003A (en) * 2021-12-14 2022-03-25 成都阿普奇科技股份有限公司 All-weather lightweight high-real-time sea surface ship detection and tracking method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112241668A (en) * 2019-07-18 2021-01-19 杭州海康威视数字技术股份有限公司 Image processing method, device and equipment
CN111626944A (en) * 2020-04-21 2020-09-04 温州大学 Video deblurring method based on space-time pyramid network and natural prior resistance
CN112509001A (en) * 2020-11-24 2021-03-16 河南工业大学 Multi-scale and multi-feature fusion feature pyramid network blind restoration method
CN113792631A (en) * 2021-08-31 2021-12-14 电子科技大学 A method for aircraft detection and tracking based on multi-scale adaptation and edge domain attention
CN114241003A (en) * 2021-12-14 2022-03-25 成都阿普奇科技股份有限公司 All-weather lightweight high-real-time sea surface ship detection and tracking method

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117115246A (en) * 2023-07-07 2023-11-24 东北大学秦皇岛分校 A UAV horizontal attitude recognition method and recognition system
CN120125954A (en) * 2025-05-14 2025-06-10 中国刑事警察学院 A target recognition method, system, device and medium for complex scenes
CN120765512A (en) * 2025-09-03 2025-10-10 中国科学院宁波材料技术与工程研究所 A method and related device for repairing and completing fundus image reflection

Similar Documents

Publication Publication Date Title
Yang et al. Concrete defects inspection and 3D mapping using CityFlyer quadrotor robot
Yang et al. Visual perception enabled industry intelligence: state of the art, challenges and prospects
Yu et al. Underwater-GAN: Underwater image restoration via conditional generative adversarial network
CN116310900A (en) A target recognition and tracking method, device and system on a complex battlefield
CN112446270A (en) Training method of pedestrian re-identification network, and pedestrian re-identification method and device
CN112446380A (en) Image processing method and device
CN107918776B (en) A land use planning method, system and electronic device based on machine vision
Mujtaba et al. UAV-based Road Traffic Monitoring via FCN Segmentation and Deepsort for Smart Cities
CN116258744A (en) Target Tracking Method Based on Fusion of Visible Light, Infrared and LiDAR Data
CN116182894A (en) A monocular visual odometer method, device, system and storage medium
Malav et al. DHSGAN: An end to end dehazing network for fog and smoke
Babu et al. ABF de-hazing algorithm based on deep learning CNN for single I-Haze detection
CN115359091A (en) An Armor Plate Detection and Tracking Method for Mobile Robots
CN115035159B (en) A video multi-target tracking method based on deep learning and temporal feature enhancement
Liu et al. Remote Sensing Video Tracking: Current Status, Challenges and Future
Zhang et al. LL-WSOD: Weakly supervised object detection in low-light
Li et al. Driver drowsiness behavior detection and analysis using vision-based multimodal features for driving safety
CN119785249A (en) A flight simulation method and system based on unmanned aerial vehicle backlighting imaging technology
CN108010051A (en) Multisource video subject fusion tracking based on AdaBoost algorithms
CN118982665A (en) Semantic segmentation method for nighttime images based on detail enhancement and bidirectional guidance
Tian Effective image enhancement and fast object detection for improved UAV applications
Zhang et al. A self-supervised monocular depth estimation approach based on UAV aerial images
Tran et al. Drone-view haze removal via regional saturation-value translation and soft segmentation
Zhang et al. A compensation textures dehazing method for water alike area
Tanwar et al. Object detection using image dehazing: A journey of visual improvement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination