CN111401203A - Target identification method based on multi-dimensional image fusion - Google Patents
- Publication number: CN111401203A (application CN202010165922.3A)
- Authority: CN (China)
- Prior art keywords: image, pyramid, fusion, reference image, Laplace
- Prior art date: 2020-03-11
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/13—Satellite images
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/251—Fusion techniques of input or preprocessed data
Abstract
Description
Technical Field
The invention belongs to the technical field of artificial intelligence and computer vision, and in particular relates to a target recognition method based on multi-dimensional image fusion.
Background Art
Image target recognition identifies targets by comparing stored target information with the current image information. Image description is the premise of target recognition: numbers or symbols are used to represent the relevant features of each target in an image or scene, and even the relationships between targets, ultimately yielding an abstract expression of the target features and their relationships. When extracting distinctive features from an image, image recognition technology can use a template matching model. In some applications, image recognition must give not only what the recognized object is, but also where it is located. At present, image recognition technology is widely used in many fields, such as biomedicine, satellite remote sensing, robot vision, cargo inspection, target tracking, autonomous vehicle navigation, public security, banking, transportation, the military, e-commerce, and multimedia network communication. With the rapid development of artificial intelligence and computer vision, machine-vision-based and deep-learning-based target recognition have emerged, greatly improving the accuracy and efficiency of image recognition.
However, the image information acquired by a single-band sensor has shortcomings. For example, visible-light images are rich in detail but cannot be captured at night or in weak light; infrared images can be captured around the clock, but they record the temperature distribution of objects and cannot resolve fine details. Image fusion can integrate the multi-band information of a single sensor or the information provided by different types of sensors, eliminating possible redundancy and contradiction among multi-sensor data, so as to enhance the transparency of information in the image and improve the accuracy, reliability, and usability of interpretation, thereby forming a clear, complete, and accurate description of the target. An efficient image fusion method can comprehensively process multi-source information as needed, effectively improving the utilization of image information, the reliability of target recognition, and the degree of automation of the system.
In systems such as unmanned reconnaissance aircraft, vehicle-mounted panoramic situational awareness, and shipborne electro-optical search and tracking, target recognition based on multi-dimensional image fusion can meet many requirements of military electro-optical systems and provide automated, intelligent perception of external scenes. The technology is also widely applicable to aerial survey and industrial measurement in the civil field, and can therefore bring significant social and economic benefits to China's military and reconnaissance fields.
In recent years many scholars have studied target recognition methods, but most current work performs recognition on single-band sensor images. The Chinese journal Command Control & Simulation (2019, Vol. 28, No. 1, pp. 1-5) published a paper entitled "Recognition of Image Targets in Naval Battlefields Based on Deep Learning", in which Shan Lianping et al. analyzed the advantages and shortcomings of the region-proposal-based R-CNN family of models and the regression-based YOLO model, and surveyed the application of deep learning to naval-battlefield image target recognition. It can be seen that traditional target recognition algorithms cannot identify valid targets in images under low signal-to-noise-ratio conditions. Therefore, effective image recognition requires more effective, accurate, and real-time technical approaches.
Summary of the Invention
(1) Technical Problem to Be Solved
The technical problem to be solved by the present invention is: how to provide a target recognition method based on multi-dimensional image fusion for unmanned systems, in order to meet target recognition requirements in complex environments.
(2) Technical Solution
To solve the above technical problem, the present invention provides a target recognition method based on multi-dimensional image fusion, the method comprising:
Step 1: preprocess the images, comprising:
Step 11: compute the relative parameter transformation matrix between the images.
Upon receiving a recognition command from the unmanned reconnaissance system, a visible-light image is collected by the corresponding sensor as the reference image $g_B$, and an infrared image as the candidate image $g_C$;
Assume the number of matching points is N, where N is at least 3. For N = 3, three feature points $B_1$, $B_2$, $B_3$ are selected in the reference image, and the three corresponding matching points $C_1$, $C_2$, $C_3$ are selected in the candidate image $g_C$;
For the two multi-source images $g_B$ and $g_C$, N pairs of matching points are found in total; the relative parameter transformation matrix $P_{C \leftarrow B}$ between the reference image $g_B$ and the candidate image $g_C$ is then computed by the following least-squares formula:
$P_{C \leftarrow B} = C \cdot B^{T} \cdot (B \cdot B^{T})^{-1}$
where C is the 3×N matrix of homogeneous coordinates of the matching points in the candidate image coordinate system, B is the 3×N matrix of homogeneous coordinates of the matching points in the reference image coordinate system, $B^{T}$ is the transpose of B, and the relative parameter transformation matrix $P_{C \leftarrow B}$ is a 3×3 matrix;
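As an illustration, a minimal NumPy sketch of this least-squares estimate might look as follows; the function name and the (N, 2) array layout are assumptions for illustration, not part of the patent:

```python
import numpy as np

def relative_transform(pts_base, pts_cand):
    """Least-squares estimate of the 3x3 matrix P_{C<-B} that maps
    homogeneous reference-image points onto candidate-image points.

    pts_base, pts_cand: (N, 2) arrays of matched pixel coordinates, N >= 3.
    """
    B = np.vstack([pts_base.T, np.ones(len(pts_base))])  # 3 x N homogeneous
    C = np.vstack([pts_cand.T, np.ones(len(pts_cand))])  # 3 x N homogeneous
    # P_{C<-B} = C * B^T * (B * B^T)^(-1), the patent's formula
    return C @ B.T @ np.linalg.inv(B @ B.T)
```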
Step 12: transform the candidate image into the reference image coordinate system.
Inverse mapping is adopted: starting from the reference image $g_B$, the transformation function is used to find, for every pixel position in $g_B$, the corresponding position in the candidate image $g_C$. For each point $B_0$ in the reference image $g_B$, the homogeneous coordinates of the corresponding point $C_0$ in the candidate image $g_C$ can be computed from

$(x_C, y_C, 1)^{T} = P_{C \leftarrow B} \cdot (x_B, y_B, 1)^{T}$

where the horizontal coordinate $x_B$ of $B_0$ takes values in (1, 2, …, W) and its vertical coordinate $y_B$ takes values in (1, 2, …, H); likewise the horizontal coordinate $x_C$ of $C_0$ takes values in (1, 2, …, W) and its vertical coordinate $y_C$ takes values in (1, 2, …, H), W and H being the image width and height in pixels;
After the pixel gray value of point $C_0$ in the candidate image $g_C$ is assigned to the corresponding pixel $B_0$ of the reference image $g_B$, the transformed candidate image $g_{B \leftarrow C}$ is obtained; this image $g_{B \leftarrow C}$ is then output for fusion with the reference image $g_B$;
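A sketch of this inverse-mapping warp, under the same assumptions as above (nearest-neighbour sampling, and 0-based array indices rather than the 1-based pixel coordinates used in the text):

```python
import numpy as np

def warp_to_base(g_c, P_cb, width, height):
    """Resample the candidate image g_c in the reference frame: scan every
    reference pixel, map it through P_{C<-B}, and fetch the gray value."""
    out = np.zeros((height, width), dtype=g_c.dtype)
    for j in range(height):        # vertical coordinate in the base frame
        for i in range(width):     # horizontal coordinate in the base frame
            x, y, w = P_cb @ np.array([i, j, 1.0])
            u, v = int(round(x / w)), int(round(y / w))
            if 0 <= u < g_c.shape[1] and 0 <= v < g_c.shape[0]:
                out[j, i] = g_c[v, u]
    return out
```

Because every reference pixel is scanned, no output pixel is left unassigned, which is the advantage of inverse over forward mapping discussed later in the embodiment.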
Step 2: fuse the images based on multi-source features, comprising:
The reference image $g_B$ is fused with the transformed candidate image $g_{B \leftarrow C}$ using an image fusion algorithm based on a multi-resolution pyramid decomposition; the steps are as follows:
Step 21: perform the Gaussian pyramid decomposition of the image.
For the reference image $g_B$ as the source image, taking $G_0$ as the zero level of the Gaussian pyramid, the image $G_l$ at level $l$ of the Gaussian pyramid is:

$G_l(i, j) = \sum_{m=-2}^{2} \sum_{n=-2}^{2} w(m, n)\, G_{l-1}(2i + m,\, 2j + n)$, for $0 < l \le N_1$, $0 \le i < C_l$, $0 \le j < R_l$,

where $N_1$ is the number of the top level of the Gaussian pyramid, $C_l$ is the number of columns of the level-$l$ image, $R_l$ is the number of rows of the level-$l$ image, and $w(m, n)$ is a 5×5 window function with low-pass characteristics, the standard separable generating kernel:

$w = \frac{1}{256}\begin{bmatrix}1\\4\\6\\4\\1\end{bmatrix}\begin{bmatrix}1 & 4 & 6 & 4 & 1\end{bmatrix}$
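A runnable sketch of this decomposition (SciPy's convolve stands in for the windowed sum; the 1/16 per-axis weights reproduce the 1/256 kernel above):

```python
import numpy as np
from scipy.ndimage import convolve

_w = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
KERNEL = np.outer(_w, _w)            # the 5x5 generating kernel, sums to 1

def gaussian_pyramid(img, levels):
    """Blur with the 5x5 low-pass kernel, then keep every second row and
    column; repeat to build levels G_0 .. G_levels."""
    pyr = [img.astype(np.float64)]
    for _ in range(levels):
        blurred = convolve(pyr[-1], KERNEL, mode="nearest")
        pyr.append(blurred[::2, ::2])
    return pyr
```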
Step 22: build the Laplacian pyramid of the image.
An interpolated enlargement $G_l^{*}$ of $G_l$ is defined by

$G_l^{*}(i, j) = 4 \sum_{m=-2}^{2} \sum_{n=-2}^{2} w(m, n)\, G_l\!\left(\frac{i+m}{2}, \frac{j+n}{2}\right)$,

in which only the terms with integer $(i+m)/2$ and $(j+n)/2$ are taken. $G_l^{*}$ is the image obtained by interpolating and enlarging $G_l$; its size is the same as that of $G_{l-1}$, but it is not equal to $G_{l-1}$: the gray value of each new pixel interpolated between the original pixels is a weighted average of the original pixel gray values. Since $G_l$ is obtained by low-pass filtering $G_{l-1}$, i.e. $G_l$ is a blurred, down-sampled $G_{l-1}$, $G_l^{*}$ contains less detail than $G_{l-1}$;
From this the decomposed image $LP_l$ of each level of the Laplacian pyramid is obtained:

$LP_l = G_l - G_{l+1}^{*}$ for $0 \le l < N_2$, and $LP_{N_2} = G_{N_2}$,

where $N_2$ is the number of the top level of the Laplacian pyramid and $LP_l$ is the level-$l$ image of the Laplacian decomposition;
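Continuing the sketch above (reusing KERNEL, convolve, and gaussian_pyramid), the Laplacian pyramid follows directly; the expand function implements $G^{*}$ by zero-insertion upsampling followed by the same low-pass kernel, with the factor 4 restoring brightness:

```python
def expand(img, shape):
    """Interpolated enlargement G*: upsample by zero-insertion to `shape`,
    then low-pass filter; the factor 4 compensates the inserted zeros."""
    up = np.zeros(shape, dtype=np.float64)
    up[::2, ::2] = img
    return 4.0 * convolve(up, KERNEL, mode="nearest")

def laplacian_pyramid(img, levels):
    g = gaussian_pyramid(img, levels)
    lp = [g[l] - expand(g[l + 1], g[l].shape) for l in range(levels)]
    lp.append(g[levels])      # top level keeps the coarsest Gaussian image
    return lp
```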
Step 23: reconstruct the source image from the Laplacian pyramid.
Transforming the above expressions gives

$G_{N_2} = LP_{N_2}$, and $G_l = LP_l + G_{l+1}^{*}$ for $0 \le l < N_2$,

i.e. starting from the top level, each reconstructed level is interpolated, enlarged, and added to the Laplacian image of the level below, until the source image $G_0$ is recovered;
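In code, the recursion runs from the coarsest level down:

```python
def reconstruct(lp):
    """Recover the source image from its Laplacian pyramid."""
    img = lp[-1]                      # G_{N2} = LP_{N2}
    for level in reversed(lp[:-1]):   # G_l = LP_l + expand(G_{l+1})
        img = level + expand(img, level.shape)
    return img
```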
Step 24: image fusion based on Laplacian pyramid decomposition.
Let A and B be two source images and F the fused image; the fusion process is as follows (a code sketch follows this list):
Step 241: perform Laplacian pyramid decomposition on each source image to build its own Laplacian pyramid;
Step 242: fuse the corresponding decomposition levels of the two pyramids to obtain the Laplacian pyramid of the fused image;
Step 243: reconstruct an image from the fused Laplacian pyramid to obtain the final fused image F;
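A minimal sketch of steps 241-243, built on the functions above. The per-level fusion rule is an assumption: the patent does not fix one, so the sketch keeps the larger-magnitude coefficient on the detail levels and averages the top level, a common choice:

```python
def fuse_images(img_a, img_b, levels=4):
    lp_a = laplacian_pyramid(img_a, levels)            # step 241
    lp_b = laplacian_pyramid(img_b, levels)
    fused = [np.where(np.abs(a) >= np.abs(b), a, b)    # step 242
             for a, b in zip(lp_a[:-1], lp_b[:-1])]
    fused.append(0.5 * (lp_a[-1] + lp_b[-1]))
    return reconstruct(fused)                          # step 243
```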
Step 3: perform target recognition. This step comprises:
Step 31: annotate the big-data image set.
For the M collected big-data images, an annotation tool is used to select rectangular regions; the label of a background region is defined as 0 and the label of a target region as 1. The classified annotations constitute a training set and a validation set of a certain scale for training the deep learning model, enabling recognition of one target class; M is at least 12000;
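For illustration, one annotation record might be stored as below; the patent specifies only rectangular regions with labels 0 (background) and 1 (target), so the field names and file layout here are assumptions:

```python
# Hypothetical per-image annotation record (field names are illustrative).
annotation = {
    "image": "frame_000123_fused.png",
    "boxes": [
        {"x": 412, "y": 228, "w": 96, "h": 40, "label": 1},    # target
        {"x": 0,   "y": 0,   "w": 128, "h": 128, "label": 0},  # background
    ],
}
```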
Step 32: train on the data.
The target classification model is trained on the data set annotated in the previous step;
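The patent does not name a network architecture; as a hedged sketch, fine-tuning a pretrained CNN for the two classes (background / target) could look like this in PyTorch:

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)   # background vs. target
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

def train_epoch(loader):
    """One pass over a DataLoader yielding (image batch, label batch)."""
    model.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```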
Step 33: perform real-time image fusion.
Visible-light and infrared images are collected in real time and fused to obtain the fused image;
Step 34: perform target recognition and localization.
The trained classification model is used to detect ship targets in the fused image obtained in the previous step; the sizes and positions of all ship targets identified by the classification model are recorded, and each such rectangular region is marked on the image with a rectangular bounding box.
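A sketch of this final step; `detector` stands for whatever callable wraps the trained model and returns (x, y, w, h) rectangles for class-1 detections — its interface is an assumption:

```python
import cv2  # OpenCV, assumed available for drawing

def detect_and_mark(fused_img, detector):
    """Record every detected ship box and draw it on the fused frame."""
    boxes = detector(fused_img)
    for (x, y, w, h) in boxes:
        cv2.rectangle(fused_img, (x, y), (x + w, y + h), 255, 2)
    return fused_img, boxes
```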
(3) Beneficial Effects
Compared with the prior art, the present invention has the following beneficial effects:
(1) The present invention proposes a fusion method based on multi-dimensional image information. By studying the imaging characteristics of different sensors and the correlation between image features, a fusion algorithm based on the Laplacian pyramid decomposition with multi-resolution analysis is realized. Pyramid decomposition makes it possible to analyze objects of different sizes in an image: the high-resolution (lower) levels can be used to analyze details, while the low-resolution (top) levels can be used to analyze larger objects. Moreover, information obtained from analyzing the high-resolution lower levels can guide the analysis of the low-resolution upper levels, which greatly simplifies analysis and computation. Pyramid decomposition provides a convenient and flexible method for multi-resolution image analysis, and the Laplacian decomposition distributes the salient features of an image (such as edges) at different scales onto different decomposition levels. Since the image fusion method of the present invention matches the natural conditions of complex battlefield environments, it offers good fusion quality and rich detail compared with other existing image fusion methods.
(2) In the present invention, a deep-learning convolutional neural network performs target recognition on the fused image, providing accurate recognition and strong robustness to scale and illumination changes.
(3) In the present invention, feature points are used for registering heterogeneous images, and the least-squares method is used to compute the image transformation parameters, offering high accuracy, high speed, and good fusion and recognition results.
Brief Description of the Drawings
Figures 1(a)-1(d) show the results of the multi-dimensional image fusion experiment: Figure 1(a) is the startup interface of the multi-dimensional image fusion software; Figure 1(b) is the interface after the software loads images and video; Figure 1(c) is the point-selection interface for visible-infrared image registration; Figure 1(d) is the multi-dimensional image fusion result.
Figure 2 is the operation flow chart of the target recognition method based on multi-dimensional image fusion of the present invention.
Figure 3 is a schematic diagram of the pyramid-decomposition-based image fusion of the present invention.
Figures 4(a) and 4(b) show experimental results of target recognition on ship-target video images according to the preferred embodiment of the present invention.
Detailed Description of the Embodiments
To make the purpose, content, and advantages of the present invention clearer, the specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples.
To meet target recognition requirements in complex environments, the present invention provides a target recognition method based on multi-dimensional image fusion for unmanned systems. The method learns from and trains on big-data images obtained by fusing multiple sensors, thereby improving target recognition capability.
The main task of the present invention is to fuse multi-band video image sequences using image fusion methods and then intelligently recognize targets using deep learning. Video image sequences are therefore the objects the present invention processes. The target recognition technique based on multi-dimensional image fusion annotates multi-band fused-image big data and trains on it, in order to automatically recognize multiple targets in an image.
The image acquisition device of the present invention uses a Quanrui Video (全瑞视讯) visible-light camera, and the infrared channel uses an EXA-IR2 uncooled thermal imager. The computer hardware in this preferred embodiment uses an i7-7700 processor with a main frequency of 2.80 GHz and a 1 TB hard disk; on this computer the target recognition algorithm takes only about 0.1 s per frame. One frame of visible-light image and one frame of infrared image are collected for the image fusion operation, with the visible-light image as the reference image and the infrared image as the candidate image. The basic image fusion workflow is: first register the reference image and the candidate image to establish a mathematical transformation model between the two images; then, using that model, perform a unified coordinate transformation, i.e. transform all candidate image sequences into the coordinate system of the reference image, to form the fused image. For two images, because of scale, rotation, and translation transformations, solving the transformation relationship requires at least 3 pairs of matching points.
The target recognition method based on multi-dimensional image fusion provided by this preferred embodiment completes real-time image recognition according to the workflow shown in Figure 2; the recognition process comprises the following four major parts.
Specifically, the target recognition method based on multi-dimensional image fusion provided by the present invention comprises:
Step 1: preprocess the images, comprising:
Step 11: compute the relative parameter transformation matrix between the images.
Upon receiving a recognition command from the unmanned reconnaissance system, a visible-light image is collected by the corresponding sensor as the reference image $g_B$, and an infrared image as the candidate image $g_C$;
Assume the number of matching points is N, where N is at least 3. For N = 3, three feature points $B_1$, $B_2$, $B_3$ are selected in the reference image, and the three corresponding matching points $C_1$, $C_2$, $C_3$ are selected in the candidate image $g_C$;
For the two multi-source images $g_B$ and $g_C$, N pairs of matching points are found in total. If N ≥ 3, the relative parameter transformation matrix $P_{C \leftarrow B}$ between the reference image $g_B$ and the candidate image $g_C$ is computed by the following least-squares formula:

$P_{C \leftarrow B} = C \cdot B^{T} \cdot (B \cdot B^{T})^{-1}$

where C is the 3×N matrix of homogeneous coordinates of the matching points in the candidate image coordinate system, B is the 3×N matrix of homogeneous coordinates of the matching points in the reference image coordinate system, $B^{T}$ is the transpose of B, and the relative parameter transformation matrix $P_{C \leftarrow B}$ is a 3×3 matrix;
Step 12: transform the candidate image into the reference image coordinate system.
Since a certain transformation relationship exists between the reference image $g_B$ and the candidate image $g_C$, they must be brought into the same coordinate system before image fusion. In the present invention, the candidate image $g_C$ is transformed into the coordinate system of the reference image $g_B$.
Inverse mapping is adopted: starting from the reference image $g_B$, the transformation function is used to find, for every pixel position in $g_B$, the corresponding position in the candidate image $g_C$. For each point $B_0$ in the reference image $g_B$, the homogeneous coordinates of the corresponding point $C_0$ in the candidate image $g_C$ can be computed from

$(x_C, y_C, 1)^{T} = P_{C \leftarrow B} \cdot (x_B, y_B, 1)^{T}$

where the horizontal coordinate $x_B$ of $B_0$ takes values in (1, 2, …, W) and its vertical coordinate $y_B$ takes values in (1, 2, …, H); likewise the horizontal coordinate $x_C$ of $C_0$ takes values in (1, 2, …, W) and its vertical coordinate $y_C$ takes values in (1, 2, …, H);
After the pixel gray value $g_C(i, j)$ of point $C_0$ in the candidate image $g_C$ is assigned to the corresponding pixel $B_0$ of the reference image $g_B$, the transformed candidate image $g_{B \leftarrow C}$ is obtained; this image $g_{B \leftarrow C}$ is then output for fusion with the reference image $g_B$.
In general, image transformation can use two mapping modes: forward mapping and inverse mapping. Forward mapping transforms the candidate image into the coordinate space of the reference image according to the computed transformation parameters: every pixel of the candidate image is scanned and, through the transformation function, its corresponding position in the reference image is computed in turn. When two adjacent pixels of the candidate image map to two non-adjacent pixels of the reference image, discrete mosaic artifacts and hole pixels appear. The approach must therefore be reversed: for every point of the reference image, find the coordinates of the corresponding point in the candidate image. Inverse mapping starts from the reference image $g_B$ and uses the transformation function to find, for every pixel position in $g_B$, the corresponding position in the candidate image $g_C$: each pixel position of $g_B$ is scanned, the corresponding sampling point in $g_C$ is computed from the transformation function, and the gray value at that point is assigned to the corresponding pixel of $g_B$.
Inverse mapping works better than forward mapping because every pixel of the reference image is scanned and receives an appropriate gray value, avoiding the hole pixels and mosaic artifacts that arise in forward mapping when some points of the output image are never assigned.
Step 2: fuse the images based on multi-source features, comprising:
The reference image $g_B$ is fused with the transformed candidate image $g_{B \leftarrow C}$ using an image fusion algorithm based on a multi-resolution pyramid decomposition; the steps are as follows:
Step 21: perform the Gaussian pyramid decomposition of the image.
For the reference image $g_B$ as the source image, taking $G_0$ as the zero (bottom) level of the Gaussian pyramid, the image $G_l$ at level $l$ of the Gaussian pyramid is:

$G_l(i, j) = \sum_{m=-2}^{2} \sum_{n=-2}^{2} w(m, n)\, G_{l-1}(2i + m,\, 2j + n)$, for $0 < l \le N_1$, $0 \le i < C_l$, $0 \le j < R_l$,

where $N_1$ is the number of the top level of the Gaussian pyramid, $C_l$ is the number of columns of the level-$l$ image, $R_l$ is the number of rows of the level-$l$ image, and $w(m, n)$ is a 5×5 window function (generating kernel) with low-pass characteristics:

$w = \frac{1}{256}\begin{bmatrix}1\\4\\6\\4\\1\end{bmatrix}\begin{bmatrix}1 & 4 & 6 & 4 & 1\end{bmatrix}$
Step 22: build the Laplacian pyramid of the image.
An interpolated enlargement $G_l^{*}$ of $G_l$ is defined by

$G_l^{*}(i, j) = 4 \sum_{m=-2}^{2} \sum_{n=-2}^{2} w(m, n)\, G_l\!\left(\frac{i+m}{2}, \frac{j+n}{2}\right)$,

in which only the terms with integer $(i+m)/2$ and $(j+n)/2$ are taken. $G_l^{*}$ is the image obtained by interpolating and enlarging $G_l$; its size is the same as that of $G_{l-1}$, but it is not equal to $G_{l-1}$: the gray value of each new pixel interpolated between the original pixels is a weighted average of the original pixel gray values. Since $G_l$ is obtained by low-pass filtering $G_{l-1}$, i.e. $G_l$ is a blurred, down-sampled $G_{l-1}$, $G_l^{*}$ contains less detail than $G_{l-1}$.
From this the decomposed image $LP_l$ of each level of the Laplacian pyramid is obtained:

$LP_l = G_l - G_{l+1}^{*}$ for $0 \le l < N_2$, and $LP_{N_2} = G_{N_2}$,

where $N_2$ is the number of the top level of the Laplacian pyramid and $LP_l$ is the level-$l$ image of the Laplacian decomposition.
Step 23: reconstruct the source image from the Laplacian pyramid.
Transforming the above expressions gives

$G_{N_2} = LP_{N_2}$, and $G_l = LP_l + G_{l+1}^{*}$ for $0 \le l < N_2$,

i.e. starting from the top level, each reconstructed level is interpolated, enlarged, and added to the Laplacian image of the level below, until the source image $G_0$ is recovered.
Step 24: image fusion based on Laplacian pyramid decomposition.
The image fusion method based on Laplacian pyramid decomposition is shown in Figure 3. Let A and B be two source images and F the fused image; the fusion process is as follows:
Step 241: perform Laplacian pyramid decomposition on each source image to build its own Laplacian pyramid;
Step 242: fuse the corresponding decomposition levels of the two pyramids to obtain the Laplacian pyramid of the fused image;
Step 243: reconstruct an image from the fused Laplacian pyramid to obtain the final fused image F.
Step 3: perform target recognition. This step comprises:
Step 31: annotate the big-data image set.
For the M collected big-data images, an annotation tool is used to select rectangular regions; the label of a background region is defined as 0 and the label of a ship target region as 1. The classified annotations constitute a training set and a validation set of a certain scale for training the deep learning model, enabling recognition of one class of ship target; M is at least 12000.
Step 32: train on the data.
The ship target classification model is trained on the data set annotated in the previous step.
Step 33: perform real-time image fusion.
Visible-light and infrared images are collected in real time and fused to obtain the fused image.
Step 34: perform target recognition and localization.
The trained classification model is used to detect ship targets in the fused image obtained in the previous step; the sizes and positions of all ship targets identified by the classification model are recorded, and each such rectangular region is marked on the image with a rectangular bounding box.
Figures 4(a) and 4(b) show experimental results of target recognition based on multi-dimensional image fusion using this preferred embodiment. There are two ship targets in Figure 4(a) and one ship target in Figure 4(b). It can be seen that, by adopting the target recognition method based on multi-dimensional image fusion, the present invention achieves a good target recognition effect.
The above are only preferred embodiments of the present invention. It should be pointed out that those skilled in the art can make several improvements and modifications without departing from the technical principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.
Claims (1)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010165922.3A CN111401203A (en) | 2020-03-11 | 2020-03-11 | Target identification method based on multi-dimensional image fusion |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010165922.3A CN111401203A (en) | 2020-03-11 | 2020-03-11 | Target identification method based on multi-dimensional image fusion |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN111401203A true CN111401203A (en) | 2020-07-10 |
Family
ID=71430666
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010165922.3A Pending CN111401203A (en) | 2020-03-11 | 2020-03-11 | Target identification method based on multi-dimensional image fusion |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111401203A (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113283411A (en) * | 2021-07-26 | 2021-08-20 | 中国人民解放军国防科技大学 | Unmanned aerial vehicle target detection method, device, equipment and medium |
| CN114842427A (en) * | 2022-03-31 | 2022-08-02 | 南京邮电大学 | Intelligent traffic-oriented complex multi-target self-adaptive detection method |
Citations (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102609928A (en) * | 2012-01-12 | 2012-07-25 | 中国兵器工业第二0五研究所 | Visual variance positioning based image mosaic method |
| CN103778616A (en) * | 2012-10-22 | 2014-05-07 | 中国科学院研究生院 | Contrast pyramid image fusion method based on area |
| US8885976B1 (en) * | 2013-06-20 | 2014-11-11 | Cyberlink Corp. | Systems and methods for performing image fusion |
| CN104616273A (en) * | 2015-01-26 | 2015-05-13 | 电子科技大学 | Multi-exposure image fusion method based on Laplacian pyramid decomposition |
| CN106960428A (en) * | 2016-01-12 | 2017-07-18 | 浙江大立科技股份有限公司 | Visible ray and infrared double-waveband image co-registration Enhancement Method |
| CN107609601A (en) * | 2017-09-28 | 2018-01-19 | 北京计算机技术及应用研究所 | A kind of ship seakeeping method based on multilayer convolutional neural networks |
| CN109492700A (en) * | 2018-11-21 | 2019-03-19 | 西安中科光电精密工程有限公司 | A kind of Target under Complicated Background recognition methods based on multidimensional information fusion |
| CN109558848A (en) * | 2018-11-30 | 2019-04-02 | 湖南华诺星空电子技术有限公司 | A kind of unmanned plane life detection method based on Multi-source Information Fusion |
| CN110111581A (en) * | 2019-05-21 | 2019-08-09 | 哈工大机器人(山东)智能装备研究院 | Target identification method, device, computer equipment and storage medium |
| CN110322423A (en) * | 2019-04-29 | 2019-10-11 | 天津大学 | A kind of multi-modality images object detection method based on image co-registration |
- 2020-03-11: application CN202010165922.3A filed (CN); published as CN111401203A, status Pending
Patent Citations (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102609928A (en) * | 2012-01-12 | 2012-07-25 | 中国兵器工业第二0五研究所 | Visual variance positioning based image mosaic method |
| CN103778616A (en) * | 2012-10-22 | 2014-05-07 | 中国科学院研究生院 | Contrast pyramid image fusion method based on area |
| US8885976B1 (en) * | 2013-06-20 | 2014-11-11 | Cyberlink Corp. | Systems and methods for performing image fusion |
| CN104616273A (en) * | 2015-01-26 | 2015-05-13 | 电子科技大学 | Multi-exposure image fusion method based on Laplacian pyramid decomposition |
| CN106960428A (en) * | 2016-01-12 | 2017-07-18 | 浙江大立科技股份有限公司 | Visible ray and infrared double-waveband image co-registration Enhancement Method |
| CN107609601A (en) * | 2017-09-28 | 2018-01-19 | 北京计算机技术及应用研究所 | A kind of ship seakeeping method based on multilayer convolutional neural networks |
| CN109492700A (en) * | 2018-11-21 | 2019-03-19 | 西安中科光电精密工程有限公司 | A kind of Target under Complicated Background recognition methods based on multidimensional information fusion |
| CN109558848A (en) * | 2018-11-30 | 2019-04-02 | 湖南华诺星空电子技术有限公司 | A kind of unmanned plane life detection method based on Multi-source Information Fusion |
| CN110322423A (en) * | 2019-04-29 | 2019-10-11 | 天津大学 | A kind of multi-modality images object detection method based on image co-registration |
| CN110111581A (en) * | 2019-05-21 | 2019-08-09 | 哈工大机器人(山东)智能装备研究院 | Target identification method, device, computer equipment and storage medium |
Non-Patent Citations (3)
| Title |
|---|
| LI Liangfu et al., "Intelligent target recognition of electro-optical systems based on deep learning", Acta Armamentarii (《兵工学报》), vol. 43, pp. 162-168 * |
| JIANG Haijun et al., "Application of Laplacian pyramid fusion in infrared nondestructive testing", Infrared Technology (《红外技术》), vol. 41, no. 12, pp. 1151-1155 * |
| HAN Xiao et al., "Image fusion method based on improved Laplacian pyramid", Automation & Instrumentation (《自动化与仪器仪表》), no. 5, pp. 191-194 * |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113283411A (en) * | 2021-07-26 | 2021-08-20 | 中国人民解放军国防科技大学 | Unmanned aerial vehicle target detection method, device, equipment and medium |
| CN113283411B (en) * | 2021-07-26 | 2022-01-28 | 中国人民解放军国防科技大学 | Unmanned aerial vehicle target detection method, device, equipment and medium |
| CN114842427A (en) * | 2022-03-31 | 2022-08-02 | 南京邮电大学 | Intelligent traffic-oriented complex multi-target self-adaptive detection method |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Li et al. | Deep learning-based object detection techniques for remote sensing images: A survey | |
| Li et al. | Multigrained attention network for infrared and visible image fusion | |
| Chen et al. | Large-scale structure from motion with semantic constraints of aerial images | |
| CN110852182B (en) | Depth video human body behavior recognition method based on three-dimensional space time sequence modeling | |
| US10755146B2 (en) | Network architecture for generating a labeled overhead image | |
| Liangjun et al. | MSFA-YOLO: A multi-scale SAR ship detection algorithm based on fused attention | |
| Zhao et al. | Multiscale object detection in high-resolution remote sensing images via rotation invariant deep features driven by channel attention | |
| Wang et al. | Alignment-free RGBT salient object detection: Semantics-guided asymmetric correlation network and a unified benchmark | |
| Wang et al. | PACCDU: Pyramid attention cross-convolutional dual UNet for infrared and visible image fusion | |
| CN114943902A (en) | Urban vegetation unmanned aerial vehicle remote sensing classification method based on multi-scale feature perception network | |
| Guo et al. | PIF-Net: A deep point-image fusion network for multimodality semantic segmentation of very high-resolution imagery and aerial point cloud | |
| CN112700476A (en) | Infrared ship video tracking method based on convolutional neural network | |
| CN118429702A (en) | Anti-unmanned aerial vehicle data acquisition and intelligent labeling system based on multiple modes and operation method thereof | |
| CN104463962B (en) | Three-dimensional scene reconstruction method based on GPS information video | |
| CN115937704B (en) | Remote sensing image road segmentation method based on topology perception neural network | |
| CN111401203A (en) | Target identification method based on multi-dimensional image fusion | |
| CN116402859A (en) | A Moving Target Detection Method Based on Aerial Image Sequence | |
| Chen et al. | SACNet: A Novel Self-Supervised Learning Method for Shadow Detection from High-Resolution Remote Sensing Images | |
| Dong et al. | A Survey on Self-Supervised Monocular Depth Estimation Based on Deep Neural Networks | |
| Liu et al. | Tree species classification based on PointNet++ deep learning and true-colour point cloud | |
| Zhang et al. | Swin‐fisheye: Object detection for fisheye images | |
| Kaur et al. | Area Recognition in Aerial Images Leveraging Pre-trained ResNet-50 Architecture | |
| Fenglei et al. | A Boundary-Enhanced Semantic Segmentation Model for Buildings | |
| Liu et al. | Multimodal Absolute Visual Localization for Unmanned Aerial Vehicles | |
| Li et al. | CT-YoloTrad: fast and accurate recognition of point-distributed coded targets for UAV images incorporating CT-YOLOv7 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20200710 |