
CN116757980A - Infrared and visible light image fusion method and system based on feature block segmentation and separation - Google Patents

Infrared and visible light image fusion method and system based on feature block segmentation and separation

Info

Publication number
CN116757980A
Authority
CN
China
Prior art keywords
image
fusion
feature
infrared
block
Prior art date
Legal status
Granted
Application number
CN202310696821.2A
Other languages
Chinese (zh)
Other versions
CN116757980B (en)
Inventor
孙乐
李宇航
Current Assignee
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology filed Critical Nanjing University of Information Science and Technology
Priority to CN202310696821.2A priority Critical patent/CN116757980B/en
Publication of CN116757980A publication Critical patent/CN116757980A/en
Application granted granted Critical
Publication of CN116757980B publication Critical patent/CN116757980B/en
Current legal status: Active


Classifications

    • G06T5/50 - Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06N3/0464 - Convolutional networks [CNN, ConvNet]
    • G06N3/08 - Neural networks; Learning methods
    • G06T5/10 - Image enhancement or restoration using non-spatial domain filtering
    • G06V10/806 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level, of extracted features
    • G06T2207/10048 - Infrared image
    • G06T2207/20021 - Dividing image into blocks, subimages or windows
    • G06T2207/20056 - Discrete and fast Fourier transform [DFT, FFT]
    • G06T2207/20081 - Training; Learning
    • G06T2207/20084 - Artificial neural networks [ANN]
    • G06T2207/20221 - Image fusion; Image merging
    • Y02T10/40 - Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an infrared and visible light image fusion method and system based on feature block segmentation and separation. The method comprises the following steps: two image enhancement modes are designed according to the imaging characteristics of infrared and visible light, and primary and secondary feature blocks are rapidly separated using the SIFT algorithm; a fusion network that adaptively adjusts its output according to the amount of feature information is designed, and the network is trained on the primary-feature and secondary-feature tasks in a multi-task manner so that one network serves both tasks; the fused primary and secondary feature image blocks are then stitched together, and the stitching seams are eliminated by Poisson fusion to obtain an infrared and visible light fusion image with prominent salient features. Because salient feature regions stand out in the fusion result of the primary and secondary feature image fusion blocks, the invention can improve the efficiency and precision of subsequent processing algorithms and offers better result readability than state-of-the-art fusion algorithms.

Description

Infrared and visible light image fusion method and system based on feature block segmentation and separation
Technical Field
The invention relates to the technical field of computers and software, in particular to an infrared and visible light image fusion method and system based on feature block segmentation and separation.
Background
Infrared imaging can overcome the influence of severe working environments: because it captures the thermal radiation of the target, the imaged result does not depend on the intensity of light reflected from the environment. However, an infrared image can only describe the outline of the target, and the imaging result lacks texture details. A visible light image, by contrast, contains rich detail information but is limited by ambient light, making it a natural complement to the infrared image. A fused image can overcome the limitation of a single sensor and combine the advantages of both modalities. In high-level visual processing tasks such as segmentation, tracking and detection, the fused image can improve the performance of subsequent algorithms.
To obtain higher-quality fusion results, researchers have combined methods from multiple fields, including traditional methods and deep-learning-based methods. Traditional pixel-level image fusion algorithms, such as multi-scale transformation, sparse representation, saliency maps and other hybrid methods, work in simple application scenes. However, because traditional methods rely on manually specified fusion rules, they cannot cope with the varied fusion targets encountered in high-level visual tasks. Researchers have therefore addressed this problem by building end-to-end deep learning network models. While many approaches achieve excellent fusion results, existing fusion algorithms tend to pursue better overall visual quality and give priority to human readability. Yet image fusion also serves as a preprocessing module for high-level visual tasks, where the fusion result is expected to be highly interpretable to subsequent processing algorithms. This requires the network to be designed starting from the requirements of the high-level visual task. Such tasks need the fusion network to highlight primary feature regions and weaken secondary background regions so that more salient objects can be identified, which in turn requires targeted fusion strategies for the primary and secondary feature regions. Existing fusion algorithms, however, typically apply the same strategy everywhere, which makes the convergence target of the network ambiguous during training.
Various fusion algorithms that divide out salient regions have been proposed in response to this problem. Some researchers extract the main feature region by cross-aligning all pixels of the original images. However, because such fusion networks do not design separate main-feature extraction schemes for the infrared and visible images, the shared strategy degrades the extraction of salient targets. An infrared and visible image fusion method based on salient object extraction and low-illumination region enhancement has been proposed for this purpose: researchers design separate extraction strategies according to the different visual characteristics of infrared and visible light images, and salient objects are extracted by comparing the intensity of each pixel of the original image with the background intensity to define intensity saliency. However, the fusion rules must be customized manually, and original images with many noise points greatly interfere with salient-target extraction, so this approach is not suitable for high-level visual tasks. To solve this problem, methods that design the fusion task around the characteristics of the high-level visual task have been proposed. Researchers proposed the fusion network model TarDAL, which uses target-aware dual adversarial learning: two target-aware discriminators distinguish infrared thermal targets from visible-image background texture details. Its lightweight design allows TarDAL to be deployed readily in high-level visual tasks. However, TarDAL extracts salient regions from a single source and depends on the imaging quality of the infrared image; for complex and changeable scenes it is difficult to extract target salient regions accurately.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: an infrared and visible light image fusion method and system based on feature block segmentation and separation are provided, which concentrate the focus of image fusion on the main features, design the network starting from the requirements of high-level visual tasks, preserve the main features of the source images to the greatest extent, and thereby improve the effect of subsequent image processing algorithms.
The invention adopts the following technical scheme for solving the technical problems:
the invention provides an infrared and visible light image fusion method based on feature block segmentation and separation, which comprises the following steps:
s1, designing two image enhancement modes according to infrared and visible light imaging characteristics, and rapidly separating by using a SIFT algorithm to obtain a main feature block and a secondary feature block.
S2, spatial information and texture detail information in the main feature block and the secondary feature block are fused in parallel in a fusion network to obtain a main feature image fusion block and a secondary feature image fusion block, and the two image fusion blocks are trained in a multi-task mode.
The fusion network can adaptively adjust the network structure according to the potential contribution degree of the feature images so as to better serve the fusion task of the primary and secondary feature blocks.
And S3, stitching the trained main feature image fusion blocks and secondary feature image fusion blocks using Poisson fusion, and eliminating the stitching seams to obtain an infrared and visible light fusion image with prominent salient features.
Further, in step S1, the specific steps for obtaining the main feature block and the secondary feature block are as follows:
s101, performing image enhancement on an infrared image, wherein the specific content is as follows:
the infrared image is characterized by the intensity of the target thermal radiation, which causes the infrared image to exhibit a characteristic of a distinct target profile in the imaged image. In order to extract more targets, an edge detection algorithm is used for carrying out image enhancement on the infrared image; the method for filtering the potential noise in the infrared image by using the Gaussian filter comprises the following specific steps:
where G(x, y) represents the Gaussian function and σ represents the standard deviation of the Gaussian filter.
The Sobel operator is used for calculating the gradient amplitude and direction of each pixel in the infrared image so as to find the possible edge position, and the specific formula is as follows:
where G_x and G_y represent the gradient values in the horizontal and vertical directions, respectively, and I represents the input image.
The position of the local gradient maximum is found in the infrared image by suppressing the non-maximum value, and is regarded as a potential edge, and the specific suppression formula is as follows:
where Δx and Δy represent the gradient direction offsets, taking the values 0 and ±1 respectively.
Since potential edges may contain noise, two thresholds are applied to the gradient magnitude to determine the required edges; the specific rule is:
If G(x, y) ≥ T_h, the point is a strong edge point;
If T_l ≤ G(x, y) < T_h, the point is a weak edge point;
If G(x, y) < T_l, the point is a non-edge point;
where T_h = 0.2·G_max, T_l = 0.1·G_max, and G_max is the maximum value of the gradient magnitude.
The specific content of the image enhancement of the visible light image is as follows:
Visible light imaging depends on the quality of the ambient light: both dark-light and high-exposure environments greatly degrade the detail quality, so image enhancement is achieved by converting the image to the frequency domain and filtering the low- and high-frequency portions with band-pass filtering. The visible light image is converted to the frequency domain using a two-dimensional Fourier transform; the zero-frequency component is moved to the center of the spectrum, and the amplitude spectrum is calculated with the formula:
f_mp = 20 · log10(|f|)
where f represents the image converted to the frequency domain and f_mp represents the amplitude spectrum.
The amplitude spectrum is subjected to high-pass filtering, high-frequency information is reserved, and the specific formula is as follows:
f_hp = f_mp · H(u, v)
where H(u, v) represents the transfer function of the high-pass filter and f_hp represents the image retaining the high-frequency information.
An inverse Fourier transform is performed on the filtered image, followed by threshold processing, to obtain the final preprocessed visible light image; the specific formula of the threshold processing is:
where FFT() represents the fast Fourier transform function and I_vis represents the output visible light image.
S102, marking characteristic points of the preprocessed infrared and visible light images by using a SIFT algorithm.
S103, dividing the image into a main feature block and a secondary feature block according to the distribution of the feature points on the marked image.
Further, in step S103, the image segmentation process according to the distribution of the feature points is as follows:
taking the regions with densely distributed feature points as main feature regions, and clustering the feature points; the image is traversed using a moving window of adjustable size, and when the number of feature points within the window is greater than or equal to a threshold, the window is divided out as a main feature block Q_i. The threshold is calculated as:
where η is a threshold and h and w are the height and width of the image.
Repeatedly traversing for a plurality of times until the window cannot contain more characteristic points; at this time, the remaining area of the divided original image is a secondary feature block.
Further, in step S2, the specific steps of training the main feature image fusion block and the secondary feature image fusion block are as follows:
s201, performing feature extraction on the main feature block and the secondary feature block by using a plurality of convolution layers, and forcing the fused image to contain richer texture detail information so as to enhance the extraction capability of a subsequent processing task on the feature information; a plurality of gradient operator modules are connected in parallel in the multi-layer convolution to strengthen edge and texture details in the image, and a 1 multiplied by 1 regular convolution layer is used for eliminating channel dimension differences; considering that the characteristic image subjected to gradient calculation may lose part of information in the propagation process, adding the output of the gradient calculation module and the output of the convolution layer at the tail part of the convolution layer to integrate depth characteristics and detail characteristics, wherein the output of each layer of convolution layer is expressed as:
where ReLU is the activation function, BN is the batch normalization function, Conv is the convolution operation, Sobel is the gradient operator, and P_i is the output of the i-th convolution layer.
S202, after feature extraction, five layers of output feature maps are obtained for the infrared and visible light images respectively, and the feature maps are concatenated along the channel dimension, so that the channel dimension is doubled; the concatenated results are fed into the five-layer input of the feature fusion layer, whose output is expressed as:
S_i = concat(P_i_ir, P_i_vis, dim=1)
where S_i represents the channel-wise concatenation of the infrared and visible feature maps, P_i_ir represents the feature map of the infrared image, P_i_vis represents the feature map of the visible light image, and L_i^0 represents the output of the i-th feature fusion layer.
S203, feature maps from different scales differ considerably in feature expression, so the outputs of adjacent feature fusion layers are cross-connected; the connected results are passed through convolution, batch normalization and activation layers into the multi-scale cross fusion layer, at which point the main feature image fusion block and the secondary feature image fusion block are output, expressed as:
where L_i^1 represents the output of the i-th multi-scale cross fusion layer.
S204, calculating similarity functions of the infrared and visible light image feature images and the infrared and visible light images to obtain contribution degrees of the feature images in a future fusion result, wherein the specific calculation formula is as follows:
pSSIM(P, I_ir, I_vis) = SSIM(P, I_ir) + SSIM(P, I_vis)
where P is the output of the feature extraction layer, I_ir is the infrared image, I_vis is the visible light image, SSIM is the structural similarity function, C_i is the number of output channels allocated to each layer, and CN is the total number of channels.
The higher the contribution, the more abundant the feature information contained in that layer. At the tail end of the fusion network, an appropriate number of channels is allocated to each layer according to its contribution.
S205, to evaluate the importance of the network parameters and thus maintain the training effect across different tasks, the Fisher information matrix is used as the parameter importance evaluation term, calculated as the average curvature (i.e. second derivative) of the log-likelihood function with respect to the parameters. The smaller this value, the smaller the curvature, and the less a change in the parameter affects the probability density function, indicating that the parameter is of low importance. The specific formula is:
wherein ,is a multitask balance function, D is a training task, theta is a network parameter of the current task, and theta * Network parameters for the previous task; mu (mu) i Is a parameter importance evaluation item.
S206, according to the characteristics of the main feature image fusion block, a loss function sensitive to image gradient transformation is designed to train the main feature image fusion block; according to the characteristics of the secondary feature image fusion block, a mean-square-error loss function is designed to train the secondary feature image fusion block. The specific formulas are:
wherein ,is a gradient transformation metric function, h is the height of the image, w is the width of the image, I f Is a fused image.
Further, in step S3, the specific content of obtaining the infrared and visible light fusion image with prominent salient features is:
The visible light image is taken as the output image, and all feature image fusion blocks are stitched onto it seamlessly in sequence. Given the output image and a feature image fusion block, suppose a designated region is to be fused at position P of the output image; the pixel gradients of the output image are traversed while the pixel values of the feature block image are kept matched to the pixel gradients of the output image, so that the gradient difference between the two images near the stitching position is minimized, finally yielding the infrared and visible light fusion image. The specific formula is:
where x and y are pixel locations in the image, I_tgt is the output image, I_src is a feature image fusion block, and R_src is the designated region.
Furthermore, the invention also provides an infrared and visible light image fusion system based on feature block segmentation and separation, which comprises
And the source image segmentation module is used for designing two image enhancement modes according to the infrared and visible light imaging characteristics and rapidly separating by using a SIFT algorithm to obtain a main feature block and a secondary feature block.
The image fusion module is used for parallelly fusing the space information and the texture detail information in the main feature block and the secondary feature block in the fusion network to obtain a main feature image fusion block and a secondary feature image fusion block, and training the two image fusion blocks in a multi-task mode.
And an image stitching module, which stitches the trained main feature image fusion blocks and secondary feature image fusion blocks using Poisson fusion and eliminates the stitching seams to obtain an infrared and visible light fusion image with prominent salient features.
Furthermore, the invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the steps of the infrared and visible light image fusion method based on the feature block segmentation and separation when executing the computer program.
Furthermore, the invention also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program is executed by a processor to execute the infrared and visible light image fusion method based on the feature block segmentation and separation.
Compared with the prior art, the invention, by adopting the above technical solution, has the following remarkable technical effects:
(1) High processing efficiency: the primary/secondary feature separation and the predicted-participation module make salient feature regions stand out in the fusion result, which improves the efficiency and precision of subsequent processing algorithms.
(2) Lightweight network: the invention adjusts the output channel sizes by predicting how much information of the input feature maps will be retained in the fusion result. The fusion network keeps a lightweight architecture, adopts a multi-task training mode, and trains the fusion network in batches, reducing the network parameters by half.
(3) Excellent fusion effect: qualitative experiments and target detection tests on various data sets demonstrate the excellent performance of the invention. The feature-block-separated fusion gives the invention better result readability than state-of-the-art fusion algorithms.
Drawings
FIG. 1 is a flow chart of the overall steps of the present invention.
Fig. 2 is a schematic diagram of the present invention for image feature block segmentation.
FIG. 3 is a schematic representation of the fusion results of the present invention.
Detailed Description
The technical solutions of the present invention will be clearly and completely described below with reference to the drawings and the detailed description, and it is obvious that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which are derived by a person skilled in the art based on the embodiments of the invention, fall within the scope of protection of the invention.
The invention provides an infrared and visible light image fusion method based on feature block segmentation and separation, which is shown in fig. 1 and comprises the following steps:
s1, designing two image enhancement modes according to infrared and visible light imaging characteristics, and rapidly separating by using a SIFT algorithm to obtain a main feature block and a secondary feature block, wherein the main feature block and the secondary feature block are shown in FIG. 2, and the specific steps are as follows:
s101, performing image enhancement on an infrared image, wherein the specific content is as follows:
performing image enhancement on the infrared image using an edge detection algorithm; potential noise in the infrared image is first filtered with a Gaussian filter, whose formula is:
where G(x, y) represents the Gaussian function and σ represents the standard deviation of the Gaussian filter.
The Sobel operator is used for calculating the gradient amplitude and direction of each pixel in the infrared image so as to find the possible edge position, and the specific formula is as follows:
where G_x and G_y represent the gradient values in the horizontal and vertical directions, respectively, and I represents the input image.
The position of the local gradient maximum is found in the infrared image by suppressing the non-maximum value, and is regarded as a potential edge, and the specific suppression formula is as follows:
where Δx and Δy represent the gradient direction offsets, taking the values 0 and ±1 respectively.
Since potential edges may contain noise, two thresholds are applied to the gradient magnitude to determine the required edges; the specific rule is:
If G(x, y) ≥ T_h, the point is a strong edge point;
If T_l ≤ G(x, y) < T_h, the point is a weak edge point;
If G(x, y) < T_l, the point is a non-edge point;
where T_h = 0.2·G_max, T_l = 0.1·G_max, and G_max is the maximum value of the gradient magnitude.
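For illustration, the following is a minimal sketch of this Canny-style enhancement pipeline in Python with OpenCV and NumPy. Only the Sobel gradients and the thresholds T_h = 0.2·G_max and T_l = 0.1·G_max come from the text; the σ value, kernel sizes and the edge-overlay step are assumptions, and non-maximum suppression is omitted for brevity.

```python
import cv2
import numpy as np

def enhance_infrared(ir_img, sigma=1.0):
    """Canny-style enhancement of an infrared image (sketch of step S101)."""
    ir = ir_img.astype(np.float32)

    # 1. Suppress potential noise with a Gaussian filter G(x, y; sigma).
    blurred = cv2.GaussianBlur(ir, (5, 5), sigma)

    # 2. Sobel gradients G_x, G_y and the gradient magnitude G(x, y).
    gx = cv2.Sobel(blurred, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(blurred, cv2.CV_32F, 0, 1, ksize=3)
    mag = np.hypot(gx, gy)

    # 3. Double thresholding: T_h = 0.2 * G_max, T_l = 0.1 * G_max.
    g_max = mag.max()
    strong = mag >= 0.2 * g_max                   # strong edge points
    weak = (mag >= 0.1 * g_max) & ~strong         # weak edge points

    # 4. Overlay the detected edges on the original image as the enhanced
    #    infrared input (a simple additive overlay, assumed).
    edges = np.zeros_like(ir)
    edges[strong], edges[weak] = 255.0, 128.0
    out = cv2.addWeighted(ir, 1.0, edges, 0.5, 0.0)
    return np.clip(out, 0, 255).astype(np.uint8)
```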
The specific content of the image enhancement of the visible light image is as follows:
Visible light imaging depends on the quality of the ambient light: both dark-light and high-exposure environments greatly degrade the detail quality, so image enhancement is achieved by converting the image to the frequency domain and filtering the low- and high-frequency portions with band-pass filtering. The visible light image is converted to the frequency domain using a two-dimensional Fourier transform; the zero-frequency component is moved to the center of the spectrum, and the amplitude spectrum is calculated. The second-column image of the first row in FIG. 2 shows the enhancement effect for the visible light image; the specific formula of the enhancement process is:
f_mp = 20 · log10(|f|)
where f represents the image converted to the frequency domain and f_mp represents the amplitude spectrum.
The amplitude spectrum is subjected to high-pass filtering, high-frequency information is reserved, and the specific formula is as follows:
f_hp = f_mp · H(u, v)
where H(u, v) represents the transfer function of the high-pass filter and f_hp represents the image retaining the high-frequency information.
An inverse Fourier transform is performed on the filtered image, followed by threshold processing, to obtain the final preprocessed visible light image; the specific formula of the threshold processing is:
where FFT() represents the fast Fourier transform function and I_vis represents the output visible light image.
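A minimal sketch of this frequency-domain enhancement, using NumPy's FFT routines. The amplitude-spectrum formula f_mp = 20·log10(|f|) is from the text; the cutoff radius, the final normalization and the 0.1 threshold are assumptions, and the high-pass filter is applied here to the complex spectrum so the image can be reconstructed, whereas the text states it on the amplitude spectrum.

```python
import numpy as np

def enhance_visible(vis_img, cutoff=30):
    """Frequency-domain enhancement of a visible light image (sketch of S101)."""
    img = vis_img.astype(np.float32)
    h, w = img.shape

    # 2-D Fourier transform; move the zero-frequency component to the centre.
    f = np.fft.fftshift(np.fft.fft2(img))

    # Amplitude spectrum f_mp = 20 * log10(|f|) (computed for inspection).
    f_mp = 20.0 * np.log10(np.abs(f) + 1e-8)

    # High-pass transfer function H(u, v): suppress a disc of radius `cutoff`
    # around the centre and keep the high-frequency information.
    u, v = np.ogrid[:h, :w]
    dist = np.sqrt((u - h / 2.0) ** 2 + (v - w / 2.0) ** 2)
    f_hp = f * (dist > cutoff)

    # Inverse transform and simple thresholding to obtain I_vis.
    back = np.abs(np.fft.ifft2(np.fft.ifftshift(f_hp)))
    back = (back - back.min()) / (back.max() - back.min() + 1e-8)
    i_vis = np.where(back > 0.1, back, 0.0)       # threshold value assumed
    return (i_vis * 255).astype(np.uint8), f_mp
```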
S102, marking characteristic points of the preprocessed infrared and visible light images by using a SIFT algorithm, wherein the third-column image in FIG. 2 is a characteristic point marking result of the infrared image and the visible light image.
S103, the image is divided into main feature blocks and secondary feature blocks according to the distribution of feature points on the marked image. The fourth-column images in FIG. 2 show the division results for the infrared and visible light images: the boxed rectangular areas are the main feature blocks, and the remaining areas are the secondary feature blocks. The specific segmentation process is as follows:
taking the regions with densely distributed feature points as main feature regions, and clustering the feature points; the image is traversed using a moving window of adjustable size, and when the number of feature points within the window is greater than or equal to a threshold, the window is divided out as a main feature block Q_i. The threshold is calculated as:
where η is a threshold and h and w are the height and width of the image.
Repeatedly traversing for a plurality of times until the window cannot contain more characteristic points; at this time, the remaining area of the divided original image is a secondary feature block.
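A sketch of the SIFT-based separation of primary and secondary feature blocks (steps S102 and S103), assuming OpenCV's SIFT implementation. The window size, stride and the threshold form η·h·w are assumptions, since the patent text does not reproduce its threshold formula.

```python
import cv2
import numpy as np

def split_feature_blocks(img, eta=0.002, win=64, stride=32):
    """Separate main feature blocks Q_i from the secondary region (sketch)."""
    h, w = img.shape[:2]
    keypoints = cv2.SIFT_create().detect(img, None)
    if not keypoints:
        return [], np.ones((h, w), dtype=bool)
    pts = np.array([kp.pt for kp in keypoints])           # (x, y) locations

    threshold = eta * h * w                                # assumed threshold form
    main_blocks, covered = [], np.zeros((h, w), dtype=bool)

    # Traverse the image with a moving window; a window holding at least
    # `threshold` feature points is divided out as a main feature block Q_i.
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            inside = ((pts[:, 0] >= x) & (pts[:, 0] < x + win) &
                      (pts[:, 1] >= y) & (pts[:, 1] < y + win))
            if inside.sum() >= threshold and not covered[y:y + win, x:x + win].all():
                main_blocks.append((x, y, win, win))
                covered[y:y + win, x:x + win] = True

    # Everything not covered by a main feature block is the secondary block.
    return main_blocks, ~covered
```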
S2, parallelly fusing spatial information and texture detail information in the main feature block and the secondary feature block in a fusion network to obtain a main feature image fusion block and a secondary feature image fusion block, and training the two image fusion blocks in a multi-task mode, wherein the specific steps are as follows:
s201, performing feature extraction on the main feature block and the secondary feature block by using a plurality of convolution layers, and forcing the fused image to contain richer texture detail information so as to enhance the extraction capability of a subsequent processing task on the feature information; a plurality of gradient operator modules are connected in parallel in the multi-layer convolution to strengthen edge and texture details in the image, and a 1 multiplied by 1 regular convolution layer is used for eliminating channel dimension differences; considering that the characteristic image subjected to gradient calculation may lose part of information in the propagation process, adding the output of the gradient calculation module and the output of the convolution layer at the tail part of the convolution layer to integrate depth characteristics and detail characteristics, wherein the output of each layer of convolution layer is expressed as:
where ReLU is the activation function, BN is the batch normalization function, Conv is the convolution operation, Sobel is the gradient operator, and P_i is the output of the i-th convolution layer.
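A sketch in PyTorch of one such feature-extraction layer: a regular convolution in parallel with a fixed Sobel gradient branch, a 1×1 convolution to remove the channel-dimension difference, and a sum at the tail that integrates depth and detail features. Channel counts and kernel sizes are assumptions, not values given in the patent.

```python
import torch
import torch.nn as nn

class SobelConvBlock(nn.Module):
    """One feature-extraction layer with a parallel gradient branch (sketch of S201)."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        # Fixed Sobel kernels applied per input channel (depthwise).
        kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        weight = torch.stack([kx, kx.t()]).unsqueeze(1).repeat(in_ch, 1, 1, 1)
        self.sobel = nn.Conv2d(in_ch, 2 * in_ch, 3, padding=1,
                               groups=in_ch, bias=False)
        self.sobel.weight.data.copy_(weight)
        self.sobel.weight.requires_grad = False
        # 1x1 convolution eliminating the channel-dimension difference.
        self.align = nn.Conv2d(2 * in_ch, out_ch, 1)

    def forward(self, x):
        depth = self.conv(x)                  # depth features
        detail = self.align(self.sobel(x))    # edge and texture details
        return depth + detail                 # integrate depth and detail
```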
S202, after feature extraction, five layers of output feature maps are obtained for the infrared and visible light images respectively, and the feature maps are concatenated along the channel dimension, so that the channel dimension is doubled; the concatenated results are fed into the five-layer input of the feature fusion layer, whose output is expressed as:
S_i = concat(P_i_ir, P_i_vis, dim=1)
where S_i represents the channel-wise concatenation of the infrared and visible feature maps, P_i_ir represents the feature map of the infrared image, P_i_vis represents the feature map of the visible light image, and L_i^0 represents the output of the i-th feature fusion layer.
S203, feature maps from different scales differ considerably in feature expression, so the outputs of adjacent feature fusion layers are cross-connected; the connected results are passed through convolution, batch normalization and activation layers into the multi-scale cross fusion layer to reduce the dimension difference, and the outputs of the multi-scale cross fusion layer are expressed as:
where L_i^1 represents the output of the i-th multi-scale cross fusion layer.
S204, calculating similarity functions of the infrared and visible light image feature images and the infrared and visible light images to obtain contribution degrees of the feature images in a future fusion result, wherein the specific calculation formula is as follows:
pSSIM(P, I_ir, I_vis) = SSIM(P, I_ir) + SSIM(P, I_vis)
where P is the output of the feature extraction layer, I_ir is the infrared image, I_vis is the visible light image, SSIM is the structural similarity function, C_i is the number of output channels allocated to each layer, and CN is the total number of channels.
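A sketch of the contribution-based channel allocation of S204, assuming the pytorch_msssim package for SSIM and single-channel source images. The proportional rule C_i = CN·pSSIM_i / Σ_j pSSIM_j is an assumption, since the allocation formula itself is not reproduced in the text.

```python
import torch
import torch.nn.functional as F
from pytorch_msssim import ssim  # assumed SSIM implementation

def allocate_channels(feature_maps, ir_img, vis_img, total_channels=64):
    """Allocate output channels C_i from per-layer pSSIM contributions (sketch)."""
    scores = []
    for p in feature_maps:
        # Collapse each feature map to one channel and resize it so SSIM is
        # comparable with the source images (assumed reduction).
        p1 = F.interpolate(p.mean(dim=1, keepdim=True),
                           size=ir_img.shape[-2:], mode='bilinear',
                           align_corners=False)
        # pSSIM(P, I_ir, I_vis) = SSIM(P, I_ir) + SSIM(P, I_vis)
        scores.append(ssim(p1, ir_img, data_range=1.0) +
                      ssim(p1, vis_img, data_range=1.0))
    scores = torch.stack(scores)
    weights = scores / scores.sum()
    # Proportional allocation of the CN total channels (assumed rule).
    return torch.clamp((weights * total_channels).round().long(), min=1)
```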
S205, to evaluate the importance of the network parameters and thus maintain the training effect across different tasks, the Fisher information matrix is used as the parameter importance evaluation term, calculated as the average curvature (i.e. second derivative) of the log-likelihood function with respect to the parameters. The smaller this value, the smaller the curvature, and the less a change in the parameter affects the probability density function, indicating that the parameter is of low importance. The specific formula is:
wherein ,is a multitask balance function, D is a training task, theta is a network parameter of the current task, and theta * Network parameters for the previous task; mu (mu) i Is a parameter importance evaluation item.
S206, according to the characteristics of the main feature image fusion block, a loss function sensitive to image gradient transformation is designed to train the main feature image fusion block; according to the characteristics of the secondary feature image fusion block, a mean-square-error loss function is designed to train the secondary feature image fusion block. The specific formulas are:
wherein ,is a gradient transformation metric function, h is the height of the image, w is the width of the image, I f Is a fused image.
And S3, the trained main feature image fusion blocks and secondary feature image fusion blocks are stitched using Poisson fusion, and the stitching seams are eliminated to obtain an infrared and visible light fusion image with prominent salient features. FIG. 3 shows the fusion results for two groups of infrared and visible light images. Specifically:
The visible light image is taken as the output image, and all feature image fusion blocks are stitched onto it seamlessly in sequence. Given the output image and a feature image fusion block, suppose a designated region is to be fused at position P of the output image; the pixel gradients of the output image are traversed while the pixel values of the feature block image are kept matched to the pixel gradients of the output image, so that the gradient difference between the two images near the stitching position is minimized, finally yielding the infrared and visible light fusion image. The specific formula is:
where x and y are pixel locations in the image, I_tgt is the output image, I_src is a feature image fusion block, and R_src is the designated region.
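A sketch of the seam-free stitching step using OpenCV's Poisson-based seamlessClone, which solves the same gradient-matching problem described above: the block's pixels are blended so that the gradient difference near the seam is minimized. It assumes 3-channel (BGR) images; the NORMAL_CLONE mode is likewise an assumption.

```python
import cv2
import numpy as np

def stitch_block(output_img, fusion_block, top_left):
    """Paste one fused feature block into the output image with Poisson fusion."""
    h, w = fusion_block.shape[:2]
    x, y = top_left                                  # position P in the output
    mask = 255 * np.ones(fusion_block.shape, dtype=np.uint8)
    center = (x + w // 2, y + h // 2)                # seamlessClone expects (x, y)
    return cv2.seamlessClone(fusion_block, output_img, mask,
                             center, cv2.NORMAL_CLONE)
```

Applying this to each main and secondary fusion block in sequence, with the visible light image as the initial canvas, yields the final fusion result.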
As can be seen from FIG. 3, after division, block fusion and stitching, the fusion result is excellent in both regional detail and overall fusion quality. Furthermore, to evaluate the invention as a preprocessing module for high-level visual tasks, a target detection algorithm is used to verify the saliency of the main targets. The last column of FIG. 3 shows the detection results: the invention effectively helps the target detection algorithm detect targets such as automobiles and pedestrians.
The embodiment of the invention also provides an infrared and visible light image fusion system based on feature block segmentation and separation, which comprises a source image segmentation module, an image fusion module, an image stitching module and a computer program capable of running on a processor. It should be noted that each module in the above system corresponds to a specific step of the method provided by the embodiment of the present invention, and has a corresponding functional module and beneficial effect of executing the method. Technical details not described in detail in this embodiment may be found in the methods provided in the embodiments of the present invention.
The embodiment of the invention also provides an electronic device which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor. It should be noted that each module in the above system corresponds to a specific step of the method provided by the embodiment of the present invention, and has a corresponding functional module and beneficial effect of executing the method. Technical details not described in detail in this embodiment may be found in the methods provided in the embodiments of the present invention.
The embodiment of the invention also provides a computer readable storage medium, and the computer readable storage medium stores a computer program. It should be noted that each module in the above system corresponds to a specific step of the method provided by the embodiment of the present invention, and has a corresponding functional module and beneficial effect of executing the method. Technical details not described in detail in this embodiment may be found in the methods provided in the embodiments of the present invention.
The foregoing description is only exemplary embodiments of the present invention and is not intended to limit the scope of the present invention, and all equivalent structures or equivalent processes using the descriptions and the drawings of the present invention or directly or indirectly applied to other related technical fields are included in the scope of the present invention.

Claims (8)

1. The infrared and visible light image fusion method based on feature block segmentation and separation comprises the following steps:
s1, designing two image enhancement modes according to infrared and visible light imaging characteristics, and rapidly separating by using a SIFT algorithm to obtain a main feature block and a secondary feature block;
s2, parallelly fusing space information and texture detail information in the main feature block and the secondary feature block in a fusion network to obtain a main feature image fusion block and a secondary feature image fusion block, and training the two image fusion blocks in a multi-task mode;
and S3, stitching the trained main feature image fusion blocks and secondary feature image fusion blocks using Poisson fusion, and eliminating the stitching seams to obtain an infrared and visible light fusion image with prominent salient features.
2. The method for merging infrared and visible light images based on feature block segmentation and separation according to claim 1, wherein in step S1, the specific steps of obtaining the main feature block and the secondary feature block are as follows:
s101, performing image enhancement on an infrared image, wherein the specific content is as follows:
performing image enhancement on the infrared image by using an edge detection algorithm; the method for filtering the potential noise in the infrared image by using the Gaussian filter comprises the following specific steps:
wherein G(x, y) represents the Gaussian function and σ represents the standard deviation of the Gaussian filter;
calculating the gradient amplitude and direction of each pixel in the infrared image by using a Sobel operator, wherein the specific formula is as follows:
where G_x and G_y represent the gradient values in the horizontal and vertical directions, respectively, and I represents the input image;
the position of the local gradient maximum is found in the infrared image by suppressing the non-maximum value, and is regarded as a potential edge, and the specific suppression formula is as follows:
wherein Δx and Δy represent gradient directions;
dividing the gradient amplitude into two thresholds, and determining the required edge, wherein the specific formula is as follows:
If G(x, y) ≥ T_h, the point is a strong edge point;
If T_l ≤ G(x, y) < T_h, the point is a weak edge point;
If G(x, y) < T_l, the point is a non-edge point;
where T_h = 0.2·G_max, T_l = 0.1·G_max, and G_max is the maximum value of the gradient magnitude;
the specific content of the image enhancement of the visible light image is as follows:
converting the visible light image to a frequency domain using a two-dimensional fourier transform; the zero frequency component in the frequency domain is moved to the center of the frequency spectrum, and the amplitude spectrum is calculated, wherein the specific formula is as follows:
f_mp = 20 · log10(|f|)
where f represents the image converted to the frequency domain and f_mp represents the amplitude spectrum;
the amplitude spectrum is subjected to high-pass filtering, high-frequency information is reserved, and the specific formula is as follows:
f_hp = f_mp · H(u, v)
where H(u, v) represents the transfer function of the high-pass filter and f_hp represents the image retaining the high-frequency information;
performing inverse Fourier transform on the filtered image, and performing threshold processing to obtain final visible light image preprocessing, wherein the specific formula of the threshold processing is as follows:
where FFT() represents the fast Fourier transform function and I_vis represents the output visible light image;
s102, marking characteristic points of the preprocessed infrared and visible light images by using a SIFT algorithm;
s103, dividing the image into a main feature block and a secondary feature block according to the distribution of the feature points on the marked image.
3. The method for merging the infrared and visible light images based on feature block segmentation and separation according to claim 2, wherein in step S103, the specific content of image segmentation according to the distribution of feature points is:
taking the regions with densely distributed feature points as main feature regions, and clustering the feature points in those regions; traversing the image using a moving window of adjustable size, and dividing out the window as a main feature block Q_i when the number of feature points within the window is greater than or equal to a threshold, the threshold being calculated as:
wherein η is a threshold coefficient, and h and w are the height and width of the image;
repeatedly traversing for a plurality of times until the window cannot contain more characteristic points; at this time, the remaining area of the divided original image is a secondary feature block.
4. The method for fusing infrared and visible light images based on feature block segmentation and separation according to claim 1, wherein in step S2, the specific steps of training the main feature image fusion block and the secondary feature image fusion block are as follows:
s201, performing feature extraction on the main feature block and the secondary feature block using multiple convolution layers, connecting several gradient operator modules in parallel with the multi-layer convolution, and eliminating the channel dimension difference using a 1×1 convolution layer; adding the output of the gradient calculation module and the output of the convolution layer at the tail of the convolution layers, wherein the output of each convolution layer is expressed as:
where ReLU is the activation function, BN is the batch normalization function, Conv is the convolution operation, Sobel is the gradient operator, and P_i is the output of the i-th convolution layer;
s202, after feature extraction, obtaining five layers of output feature maps for the infrared and visible light images respectively, and concatenating the feature maps along the channel dimension so that the channel dimension is doubled; feeding the concatenated results into the five-layer input of the feature fusion layer, the output of which is expressed as:
S_i = concat(P_i_ir, P_i_vis, dim=1)
where S_i represents the channel-wise concatenation of the infrared and visible feature maps, P_i_ir represents the feature map of the infrared image, P_i_vis represents the feature map of the visible light image, and L_i^0 represents the output of the i-th feature fusion layer;
s203, cross-connecting the outputs of adjacent feature fusion layers, passing the connected results through convolution, batch normalization and activation layers into the multi-scale cross fusion layer, and at this point outputting the main feature image fusion block and the secondary feature image fusion block, expressed as:
where L_i^1 represents the output of the i-th multi-scale cross fusion layer;
s204, calculating similarity functions of the infrared and visible light image feature images and the infrared and visible light images to obtain contribution degrees of the feature images in a future fusion result, wherein the specific calculation formula is as follows:
pSSIM(P, I_ir, I_vis) = SSIM(P, I_ir) + SSIM(P, I_vis)
where P is the output of the feature extraction layer, I_ir is the infrared image, I_vis is the visible light image, SSIM is the structural similarity function, C_i is the number of output channels allocated to each layer, and CN is the total number of channels;
s205, using the Fisher information matrix as the parameter importance evaluation term, calculating the average curvature of the log-likelihood function with respect to the parameters, the specific formula being:
wherein ,is a multitask balance function, D is a training task, theta is a network parameter of the current task, and theta * Mu, network parameters for the previous task i The parameter importance evaluation item;
s206, designing a loss function sensitive to image gradient transformation according to the characteristics of the main feature image fusion block and training the main feature image fusion block with it; designing a mean-square-error loss function according to the characteristics of the secondary feature image fusion block and training the secondary feature image fusion block with it, the specific formulas being:
wherein ,is a gradient transformation metric function, h is the height of the image, w is the width of the image, I f Is a fused image.
5. The method for fusing infrared and visible light images based on feature block segmentation and separation according to claim 1, wherein in step S3, the specific contents of obtaining the infrared and visible light fused image with prominent features are as follows:
taking the visible light image as the output image and stitching all feature image fusion blocks onto it seamlessly in sequence; given the output image and a feature image fusion block, supposing a designated region is fused at position P of the output image, traversing the pixel gradients of the output image while keeping the pixel values of the feature block image matched to the pixel gradients of the output image, so that the gradient difference between the two images near the stitching position is minimized, finally obtaining the infrared and visible light fusion image, the specific formula being:
where x and y are pixel locations in the image, I_tgt is the output image, I_src is a feature image fusion block, and R_src is the designated region.
6. The infrared and visible light image fusion system based on feature block segmentation and separation is characterized by comprising
The source image segmentation module is used for designing two image enhancement modes according to the infrared and visible light imaging characteristics and rapidly separating by using a SIFT algorithm to obtain a main feature block and a secondary feature block;
the image fusion module is used for parallelly fusing the space information and the texture detail information in the main feature block and the secondary feature block in the fusion network to obtain a main feature image fusion block and a secondary feature image fusion block, and training the two image fusion blocks in a multi-task mode;
and an image stitching module, configured to stitch the trained main feature image fusion blocks and secondary feature image fusion blocks using Poisson fusion and eliminate the stitching seams to obtain an infrared and visible light fusion image with prominent salient features.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of any one of claims 1 to 5 when the computer program is executed by the processor.
8. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, performs the method of any one of claims 1 to 5.
CN202310696821.2A 2023-06-13 2023-06-13 Infrared and visible light image fusion method and system based on feature block segmentation and separation Active CN116757980B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310696821.2A CN116757980B (en) 2023-06-13 2023-06-13 Infrared and visible light image fusion method and system based on feature block segmentation and separation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310696821.2A CN116757980B (en) 2023-06-13 2023-06-13 Infrared and visible light image fusion method and system based on feature block segmentation and separation

Publications (2)

Publication Number Publication Date
CN116757980A true CN116757980A (en) 2023-09-15
CN116757980B CN116757980B (en) 2025-10-28

Family

ID=87952663

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310696821.2A Active CN116757980B (en) 2023-06-13 2023-06-13 Infrared and visible light image fusion method and system based on feature block segmentation and separation

Country Status (1)

Country Link
CN (1) CN116757980B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018120936A1 (en) * 2016-12-27 2018-07-05 Zhejiang Dahua Technology Co., Ltd. Systems and methods for fusing infrared image and visible light image
WO2021120404A1 (en) * 2019-12-17 2021-06-24 大连理工大学 Infrared and visible light fusing method
WO2021120406A1 (en) * 2019-12-17 2021-06-24 大连理工大学 Infrared and visible light fusion method based on saliency map enhancement
CN115601282A (en) * 2022-11-10 2023-01-13 江苏海洋大学(Cn) Infrared and visible light image fusion method based on multi-discriminator generation countermeasure network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Wei; Chen Hongbin: "Infrared and visible light image fusion combining NSST and LC saliency", Electronic Technology & Software Engineering, no. 08, 15 April 2020 (2020-04-15) *

Also Published As

Publication number Publication date
CN116757980B (en) 2025-10-28

Similar Documents

Publication Publication Date Title
CN112488210B (en) A method for automatic classification of 3D point clouds based on graph convolutional neural networks
Guo et al. BARNet: Boundary aware refinement network for crack detection
CN113065558A (en) Lightweight small target detection method combined with attention mechanism
CN106845487B (en) End-to-end license plate identification method
WO2021196873A1 (en) License plate character recognition method and apparatus, electronic device, and storage medium
CN119648999B (en) A target detection method, system, device and medium based on cross-modal fusion and guided attention mechanism
CN110378297A (en) A kind of Remote Sensing Target detection method based on deep learning
US11615612B2 (en) Systems and methods for image feature extraction
WO2020220663A1 (en) Target detection method and apparatus, device, and storage medium
KR20200027887A (en) Learning method, learning device for optimizing parameters of cnn by using multiple video frames and testing method, testing device using the same
CN117037104B (en) A multi-classification lane line detection method based on line anchor
CN118397253A (en) Small target detection method and device and terminal equipment
CN116596792B (en) Inland river foggy scene recovery method, system and equipment for intelligent ship
CN111738114A (en) Vehicle target detection method based on accurate sampling of remote sensing images without anchor points
CN119992390A (en) Small target detection method based on multi-scale hole fusion
CN119625269A (en) Aerial image small target detection method, system, medium and processor based on improved YOLOv8 algorithm
Wang et al. DAFPN-YOLO: An Improved UAV-Based Object Detection Algorithm Based on YOLOv8s.
CN110245600A (en) Adaptive start fast stroke width UAV road detection method
He et al. NTS-YOLO: A Nocturnal Traffic Sign Detection Method Based on Improved YOLOv5.
Bi et al. DR-YOLO: An improved multi-scale small object detection model for drone aerial photography scenes based on YOLOv7
CN114782919B (en) A road grid map construction method and system with real and simulated data enhancement
Luo et al. A lightweight YOLOv5-FFM model for occlusion pedestrian detection
CN116757980B (en) Infrared and visible light image fusion method and system based on feature block segmentation and separation
CN112419227B (en) Underwater target detection method and system based on small target search scaling technology
CN114612999B (en) Target behavior classification method, storage medium and terminal

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant