
CN109903301B - An Image Contour Detection Method Based on Multi-level Feature Channel Optimal Coding - Google Patents

An Image Contour Detection Method Based on Multi-level Feature Channel Optimal Coding

Info

Publication number
CN109903301B
CN109903301B (application CN201910080334.7A)
Authority
CN
China
Prior art keywords
layer
image
size
channels
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910080334.7A
Other languages
Chinese (zh)
Other versions
CN109903301A (en
Inventor
范影乐
方琳灵
周涛
武薇
佘青山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN201910080334.7A priority Critical patent/CN109903301B/en
Publication of CN109903301A publication Critical patent/CN109903301A/en
Application granted granted Critical
Publication of CN109903301B publication Critical patent/CN109903301B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to an image contour detection method based on multi-level feature channel optimized coding. For an input image I(x, y), the method first obtains the optimal scale m_opt and direction θ_opt of a Gabor filter based on a similarity index, and uses m_opt and θ_opt as the frequency-separation parameters of the non-subsampled contourlet transform (NSCT). The contour sub-image obtained by the NSCT is then fused with I(x, y) through feature enhancement to realize primary contour detection of I(x, y). Finally, a fully convolutional neural network is designed, comprising a feature encoder-decoder built from FCN-32s, FCN-16s and FCN-8s network units at different scales: the convolution and pooling modules of the feature encoder actively learn the network parameters, and the deconvolution and upsampling modules of the feature decoder produce an image contour mask map corresponding to I(x, y), realizing optimized coding of the multi-level feature channels and efficient, accurate detection of the image contour.

Description

Image contour detection method based on multi-level feature channel optimized coding
Technical Field
The invention belongs to the fields of machine learning and image processing, and particularly relates to an image contour detection method based on multi-level feature channel optimized coding.
Background
Contour information is important for the segmentation and recognition of image data: it enables fast delineation of the target region of an image and supports analysis and understanding of the image on a limited feature dimension, so automatic contour detection is one of the important research topics in machine learning and image processing. Conventional detection algorithms based on regional gradient information typically rely on linear filtering and local directional characteristics of the image, such as methods based on local image energy, but they generally do not account for important information such as active contours, texture edges and region boundaries. Contour detection methods based on deep learning have attracted attention: a deep network structure simulates the way the human visual system processes visual information, learns features actively, and greatly simplifies the otherwise complex feature extraction and data reconstruction process. However, such methods generally suffer from the following problems. (1) Segmenting and fusing images directly through a neural network can make the segmentation result imprecise and the feature information overly generic. (2) When deep learning is not combined with traditional feature-based methods, detection performance depends heavily on the quantity and quality of the training samples, and the ability to filter out redundant information, including texture background, is weak. (3) Some methods do consider multi-source feature extraction. For example, SAR image segmentation based on Gabor-NSCT and a spiking neural network extracts Gabor and NSCT features at multiple scales and then trains the extracted Gabor features and NSCT features as the inputs of two spiking neural networks; its segmentation performance therefore depends heavily on how well Gabor and NSCT perceive the image content, the ability to fuse and encode multi-source feature signals across scales is not fully exploited, and, in terms of model level and structure, the spiking neural network does not belong to the deep-learning category. Similarly, an image contour extraction method based on Gabor-NSCT and a visual mechanism also extracts multi-source features at different scales, but, constrained by the computational capability of the visual-mechanism model, it usually adopts a simplified fusion coding scheme and essentially lacks a learning process of the kind represented by convolutional neural network training, so the effectiveness of the multi-source features in expressing contours cannot be truly realized.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an image contour detection method based on multi-level feature channel optimized coding.
Although the NSCT has excellent performance in characterizing image details, it usually encodes by weighting the decomposition results over scale and direction, and the manual setting of these weighting parameters introduces large uncertainty into the detection result. Considering the effectiveness of the Gabor filter in perceiving the scale and direction of image targets, the invention first computes, for the input image I(x, y), the optimal scale m_opt and direction θ_opt of the Gabor filter, and then uses the obtained m_opt and θ_opt as the frequency-separation parameters of the NSCT, replacing the traditional redundant fusion scheme in which Gabor and NSCT must traverse all scales and directions. In addition, the invention performs feature-enhancement fusion of the contour sub-image obtained by the NSCT with I(x, y), which helps obtain the primary contour response E(x, y) of I(x, y) efficiently and accurately. E(x, y) is then fed into a fully convolutional neural network composed of FCN-32s, FCN-16s and FCN-8s network units: active learning of the network parameters is realized by the convolution and pooling modules of the feature encoder, an image contour mask map corresponding to I(x, y) is obtained through the deconvolution and upsampling modules of the feature decoder, and a point-wise multiplication of the mask with I(x, y) finally yields accurate detection of the image contour. The method specifically comprises the following steps:
Step 1: acquire the primary contour response of the input image I(x, y). First, the Gabor filter response of the input image I(x, y) is calculated, as shown in equations (1) to (4) (the equations in this description are rendered as images in the original publication and are not reproduced here).
In the formulas, the filter response represents the Gabor feature information of the image I(x, y) obtained by the Gabor filter at scale m and direction θ = nπ/K; σ_x and σ_y denote the standard deviations of the Gabor wavelet basis function along the x-axis and y-axis respectively; ω is the complex modulation frequency of the Gaussian function; the Gabor filter ψ_{m,n}(x, y) is obtained by taking ψ(x, y) as the mother wavelet and applying scale and rotation transformations to it; u, v is the template size of ψ_{m,n}(x, y); m = 0, ..., S−1 and n = 0, ..., K−1, where S and K denote the numbers of scales and directions respectively; α is the scale factor of ψ(x, y), with α > 1.
The optimal scale m_opt and direction θ_opt of the Gabor filter are then calculated based on the similarity index SSIM, as shown in equations (5) to (8). The SSIM value measures the similarity between the filter response and the known contour-labelled image I_mark; the optimal scale m_opt and direction θ_opt are obtained when it takes its maximum value. Its three factors are quantitative similarity measures of luminance, contrast and structure between the filter response and I_mark: u_Gabor and u_mark denote the mean luminances of the two images, δ_Gabor and δ_mark their luminance standard deviations, δ_Gabor² and δ_mark² their luminance variances, and δ_{G,m} their luminance covariance. In I_mark, pixels in the contour region are 1 and all other pixels are 0. To avoid system instability when the denominators of equations (6) to (8) approach zero, C1, C2 and C3 are set to positive constants smaller than 3% of the mean luminance of the filter response.
m_opt and θ_opt are used as the frequency-separation parameters of the NSCT; the NSCT decomposes the image I(x, y) into a contour sub-image. Since the NSCT decomposition preserves image size, the contour sub-image is fused directly with I(x, y) by a pixel-level feature-enhancement operation, finally yielding the primary contour response E(x, y) of the input image I(x, y), as shown in equations (9) and (10); there, the non-subsampled contourlet transform is taken under the scale m_opt and direction θ_opt parameters, t denotes the mean luminance of the contour sub-image, and max denotes the maximum-value function (the same below).
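A minimal sketch of the scale/direction search in Step 1 follows, assuming the standard SSIM definition and scikit-image's Gabor filter; the scale-to-frequency mapping, the magnitude response and the normalization are illustrative assumptions, not the patent's exact equations (1)-(8).

```python
# Hedged sketch: for each Gabor scale and orientation, compare the magnitude
# response with the labelled contour image using SSIM and keep the best pair.
import numpy as np
from skimage.filters import gabor
from skimage.metrics import structural_similarity as ssim

def optimal_gabor_params(image, contour_mark, n_scales=4, n_dirs=8, alpha=2.0):
    """Return the (scale index, direction) whose Gabor magnitude response is
    most similar (by SSIM) to the binary contour label image contour_mark."""
    best_m, best_theta, best_score = 0, 0.0, -np.inf
    for m in range(n_scales):
        freq = 0.25 / (alpha ** m)              # assumed mapping from scale m to frequency
        for n in range(n_dirs):
            theta = n * np.pi / n_dirs          # direction theta = n*pi/K
            real, imag = gabor(image, frequency=freq, theta=theta)
            response = np.hypot(real, imag)     # magnitude of the complex Gabor response
            response /= response.max() + 1e-12  # normalize to [0, 1] before SSIM
            score = ssim(response, contour_mark.astype(float), data_range=1.0)
            if score > best_score:
                best_m, best_theta, best_score = m, theta, score
    return best_m, best_theta                   # m_opt index and theta_opt (radians)
```

The subsequent NSCT decomposition with m_opt and θ_opt as frequency-separation parameters, and the fusion with I(x, y), are not shown here, since no standard Python NSCT implementation is assumed.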
Step 2: transmit the primary contour response E(x, y) obtained in step 1 to the fully convolutional neural network to obtain the heat maps F5, F4 and F3 trained by the FCN-32s, FCN-16s and FCN-8s network units respectively. The fully convolutional neural network is divided into a feature encoder and a feature decoder; the whole network contains 8 convolution blocks, 5 max-pooling layers, 5 upsampling layers and 2 convolution layers. The specific structure is as follows:
1. Feature encoder
VGG-16 is used as the base network, and the fully convolutional network is optimized and reconstructed from it. To improve computation speed and enhance generalization, a 1×1 convolution kernel is inserted between every two 3×3 convolution kernels, giving convolution blocks with a (3×3, 1×1, 3×3) structure; to strengthen the nonlinearity and translation invariance of the learned image features, a max-pooling layer is added after each convolution module. After pooling layer Max pool5, E(x, y) is reduced to 1/32 of the size of I(x, y) and is the feature map output by the trained FCN-32s network unit; after pooling layer Max pool4 and a 1×1 convolution layer, E(x, y) is reduced to 1/16 of the size of I(x, y) and is the feature map output by the trained FCN-16s network unit; similarly, after pooling layer Max pool3 and a 1×1 convolution layer, E(x, y) is reduced to 1/8 of the size of I(x, y) and is the feature map output by the trained FCN-8s network unit. Each pooling-layer output passes through a ReLU activation function to realize sparse coding. The feature encoder consists of the following thirteen layers, all with stride 1 (a hedged code sketch follows the layer listing):
Layer 1: convolution layer CONV1-1, 8 channels, 3×3 kernel; CONV1-2, 8 channels, 3×3 kernel;
Layer 2: max-pooling layer Max pool1, 2×2 pooling region;
Layer 3: convolution layer CONV2-1, 16 channels, 3×3 kernel; CONV2-2, 16 channels, 1×1 kernel; CONV2-3, 16 channels, 3×3 kernel;
Layer 4: max-pooling layer Max pool2, 2×2 pooling region;
Layer 5: convolution layer CONV3-1, 32 channels, 3×3 kernel; CONV3-2, 32 channels, 1×1 kernel; CONV3-3, 32 channels, 3×3 kernel;
Layer 6: max-pooling layer Max pool3, 2×2 pooling region;
Layer 7: convolution layer CONV4-1, 64 channels, 3×3 kernel; CONV4-2, 64 channels, 1×1 kernel; CONV4-3, 64 channels, 3×3 kernel;
Layer 8: max-pooling layer Max pool4, 2×2 pooling region;
Layer 9: convolution layer CONV5-1, 128 channels, 3×3 kernel; CONV5-2, 128 channels, 1×1 kernel; CONV5-3, 128 channels, 3×3 kernel;
Layer 10: max-pooling layer Max pool5, 2×2 pooling region;
Layer 11: convolution layer CONV6, 256 channels, 1×1 kernel;
Layer 12: convolution layer CONV7, 256 channels, 1×1 kernel;
Layer 13: convolution layer CONV8, 1 channel, 1×1 kernel;
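The following is a hedged PyTorch sketch of the feature encoder; the channel counts and kernel sizes follow the listing above, while the single-channel input, the "same" padding and the placement of ReLU after each pooling layer are assumptions where the text is silent.

```python
# Sketch of the thirteen-layer feature encoder (stride 1, 2x2 max pooling).
import torch.nn as nn

def conv_block(c_in, c_out, kernels):
    """Convolution block such as (3x3, 1x1, 3x3) with a fixed channel count."""
    layers, c = [], c_in
    for k in kernels:
        layers.append(nn.Conv2d(c, c_out, kernel_size=k, padding=k // 2))
        c = c_out
    return nn.Sequential(*layers)

class FeatureEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = conv_block(1, 8, [3, 3])          # CONV1-1, CONV1-2
        self.pool1 = nn.MaxPool2d(2)
        self.conv2 = conv_block(8, 16, [3, 1, 3])      # CONV2-1..3
        self.pool2 = nn.MaxPool2d(2)
        self.conv3 = conv_block(16, 32, [3, 1, 3])     # CONV3-1..3
        self.pool3 = nn.MaxPool2d(2)                   # output feeds the FCN-8s branch
        self.conv4 = conv_block(32, 64, [3, 1, 3])     # CONV4-1..3
        self.pool4 = nn.MaxPool2d(2)                   # output feeds the FCN-16s branch
        self.conv5 = conv_block(64, 128, [3, 1, 3])    # CONV5-1..3
        self.pool5 = nn.MaxPool2d(2)                   # output feeds the FCN-32s branch
        self.conv6 = nn.Conv2d(128, 256, 1)            # CONV6
        self.conv7 = nn.Conv2d(256, 256, 1)            # CONV7
        self.conv8 = nn.Conv2d(256, 1, 1)              # CONV8
        self.relu = nn.ReLU(inplace=True)

    def forward(self, e):
        p1 = self.relu(self.pool1(self.conv1(e)))
        p2 = self.relu(self.pool2(self.conv2(p1)))
        p3 = self.relu(self.pool3(self.conv3(p2)))     # 1/8 of the input size
        p4 = self.relu(self.pool4(self.conv4(p3)))     # 1/16 of the input size
        p5 = self.relu(self.pool5(self.conv5(p4)))     # 1/32 of the input size
        f32 = self.conv8(self.relu(self.conv7(self.relu(self.conv6(p5)))))
        return p3, p4, f32                             # skip features and coarsest map
```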
2. Feature decoder
The primary contour response E(x, y) is successively reduced to 1/8, 1/16 and 1/32 of its original size by the feature encoding, so the resulting feature maps have low resolution; a feature decoder is therefore added to apply bilinear upsampling to the low-resolution feature maps. The 32× downsampled feature map is upsampled by a factor of 32 with bilinear interpolation to obtain a heat map of the same size as I(x, y), denoted F5. A 1×1 prediction convolution layer that adjusts the number of feature-map channels is added after pooling layer Max pool4; its output is added element-wise to the two-fold upsampled 32× downsampled feature map, and the sum is upsampled by a factor of 16 with bilinear interpolation to obtain a heat map of the same size as I(x, y), denoted F4. Likewise, a 1×1 prediction convolution layer that adjusts the number of feature-map channels is added after pooling layer Max pool3; its output is added element-wise to the two-fold upsampled 16× downsampled feature map, and the sum is upsampled by a factor of 8 with bilinear interpolation to obtain a heat map of the same size as I(x, y), denoted F3.
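A hedged PyTorch sketch of the decoder's skip fusion and bilinear upsampling follows; the use of torch.nn.functional.interpolate and the channel counts fed into the 1×1 prediction convolutions are assumptions consistent with, but not dictated by, the description above.

```python
# Sketch of the FCN-style decoder: 1x1 prediction convolutions on the pool3
# and pool4 outputs, element-wise addition with the upsampled coarser map,
# and bilinear upsampling back to the input size to form F5, F4 and F3.
import torch.nn as nn
import torch.nn.functional as F

class FeatureDecoder(nn.Module):
    def __init__(self, c_pool3=32, c_pool4=64):
        super().__init__()
        self.pred4 = nn.Conv2d(c_pool4, 1, kernel_size=1)   # prediction conv after Max pool4
        self.pred3 = nn.Conv2d(c_pool3, 1, kernel_size=1)   # prediction conv after Max pool3

    def forward(self, p3, p4, f32, out_size):
        def up(x, size):                                     # bilinear upsampling helper
            return F.interpolate(x, size=size, mode="bilinear", align_corners=False)
        f5 = up(f32, out_size)                               # 32x bilinear upsample -> F5
        fuse4 = self.pred4(p4) + up(f32, p4.shape[-2:])      # 2x upsample + element-wise add
        f4 = up(fuse4, out_size)                             # 16x bilinear upsample -> F4
        fuse3 = self.pred3(p3) + up(fuse4, p3.shape[-2:])    # 2x upsample + element-wise add
        f3 = up(fuse3, out_size)                             # 8x bilinear upsample -> F3
        return f5, f4, f3
```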
Step 3: for the heat maps F5, F4 and F3 obtained in step 2, the max function takes the maximum value at each pixel, and the fused result is the image contour mask map F. After a ReLU activation, a loss is computed against the known contour-labelled image I_mark and the result is denoted loss; stochastic gradient descent is used to iteratively update the parameters of each network layer, and training ends when the loss value falls below the threshold ε, which is set to 1–3% of the total number of pixels in the training image samples, yielding the trained fully convolutional neural network.
Step 4: the image to be detected is passed through the Gabor filter constructed in steps 1–3, the non-subsampled contourlet transform and the trained fully convolutional neural network to obtain an image contour mask map, which is multiplied point-wise with the image to be detected to finally obtain the image contour detection result.
The invention has the following beneficial effects:
1. A novel primary contour response method with multi-level feature channel optimized coding is proposed. Although the NSCT can simulate the frequency-domain separation performed by the lateral geniculate nucleus (LGN) in visual information processing, the manual setting of weighting parameters during image decomposition makes the detection result highly uncertain. The response characteristics of the Gabor filter, on the other hand, resemble those of the human visual system: it is robust to illumination and pose and has good spatial locality and orientation selectivity. The invention therefore searches, for each image, for the optimal scale and direction of the Gabor filter response and uses them directly to set the frequency-separation parameters of the NSCT; the contour sub-image obtained by the NSCT is then fused with the original image through feature enhancement, which helps obtain the primary contour response efficiently and accurately. The resulting primary contour response method with multi-level feature channel optimized coding yields low-dimensional, low-redundancy image feature channels, with important application prospects for relieving network pressure, reducing the computational complexity of the convolutional neural network and improving network training efficiency.
2. A fully convolutional neural network is constructed for multi-scale training, so that the smoothness and fineness with which FCN-32s, FCN-16s and FCN-8s express the heat maps fully complement one another. The network is divided into a feature encoder and a feature decoder and works end to end without region selection on the target image: the feature encoder continuously and actively learns feature parameters through convolution and pooling while shrinking the feature maps proportionally, and the feature decoder preserves the two-dimensional character of the extracted features through deconvolution and upsampling, producing heat maps of the same size as the original image that represent its main contours, so that a prediction is obtained for every pixel while the spatial information of the original image is retained.
Drawings
In order to make the object, technical scheme and beneficial effect of the invention more clear, the invention provides the following drawings for explanation:
FIG. 1 is a flow chart of image contour detection according to the present invention;
FIG. 2 is a block diagram of a full convolution neural network architecture of the present invention;
Detailed Description
The invention is further described below with reference to the accompanying drawings; the overall flow is shown in FIG. 1.
Step 1: acquire the primary contour response of the input image I(x, y). First, the Gabor filter response of the input image I(x, y) is calculated, as shown in equations (11) to (14) (the equations in this embodiment are rendered as images in the original publication and are not reproduced here).
In the formulas, the filter response represents the Gabor feature information of the image I(x, y) obtained by the Gabor filter at scale m and direction θ = nπ/K; σ_x and σ_y denote the standard deviations of the Gabor wavelet basis function along the x-axis and y-axis respectively; ω is the complex modulation frequency of the Gaussian function; the Gabor filter ψ_{m,n}(x, y) is obtained by taking ψ(x, y) as the mother wavelet and applying scale and rotation transformations to it; u, v is the template size of ψ_{m,n}(x, y); m = 0, ..., S−1 and n = 0, ..., K−1, where S and K denote the numbers of scales and directions respectively; α is the scale factor of ψ(x, y), with α > 1.
The optimal scale m_opt and direction θ_opt of the Gabor filter are then calculated based on the similarity index SSIM, as shown in equations (15) to (18). The SSIM value measures the similarity between the filter response and the known contour-labelled image I_mark; the optimal scale m_opt and direction θ_opt are obtained when it takes its maximum value. Its three factors are quantitative similarity measures of luminance, contrast and structure between the filter response and I_mark: u_Gabor and u_mark denote the mean luminances of the two images, δ_Gabor and δ_mark their luminance standard deviations, δ_Gabor² and δ_mark² their luminance variances, and δ_{G,m} their luminance covariance. In I_mark, pixels in the contour region are 1 and all other pixels are 0. To avoid system instability when the denominators of equations (16) to (18) approach zero, C1, C2 and C3 are set to positive constants smaller than 3% of the mean luminance of the filter response.
m_opt and θ_opt are used as the frequency-separation parameters of the NSCT, which decomposes the image I(x, y) into a contour sub-image. Since the NSCT decomposition preserves image size, the contour sub-image is fused directly with I(x, y) by a pixel-level feature-enhancement operation, finally yielding the primary contour response E(x, y) of the input image I(x, y), as shown in equations (19) and (20); there, the non-subsampled contourlet transform is taken under the scale m_opt and direction θ_opt parameters, t denotes the mean luminance of the contour sub-image, and max denotes the maximum-value function (the same below).
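Since equations (19) and (20) are only available as images, the following is a speculative sketch of the pixel-level feature-enhancement fusion under an assumed max-based rule; the contour sub-image is taken as given (for example from an external NSCT implementation), and both the enhancement term and the fusion operator are labelled assumptions.

```python
# Speculative sketch of the fusion in equations (19)-(20): combine the NSCT
# contour sub-image (same size as the input) with I(x, y) using its mean
# luminance t and a pixel-wise max. Both the enhancement term and the use of
# max are assumptions; the patent's exact formulas are not reproduced here.
import numpy as np

def primary_contour_response(image, contour_sub):
    t = contour_sub.mean()              # t: mean luminance of the contour sub-image
    enhanced = image + t * contour_sub  # assumed feature-enhancement term
    return np.maximum(image, enhanced)  # assumed max-based pixel-level fusion
```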
Step 2: as shown in FIG. 2, the fully convolutional neural network is constructed to obtain the heat maps F5, F4 and F3 trained by the FCN-32s, FCN-16s and FCN-8s network units respectively. The network is divided into a feature encoder and a feature decoder; the whole network contains 8 convolution blocks, 5 max-pooling layers, 5 upsampling layers and 2 convolution layers. The specific structure is as follows:
1. Feature encoder
(1) The primary contour response E(x, y) passes through the CONV1 convolution block (3×3-8, 3×3-8), followed by 2×2 max pooling and a ReLU activation, giving a feature map of 1/2 the size of I(x, y) after convolution and Max pool1, as shown in equation (21); conv1() denotes the first-layer convolution operation, pool1() the first max-pooling operation, and relu() the activation function used for sparsification (the same below).
(2) The result passes through the CONV2 convolution block (3×3-16, 1×1-16, 3×3-16), 2×2 max pooling, a 1×1 prediction convolution that adjusts the number of feature-map channels, and a ReLU activation, giving a feature map of 1/4 the size of I(x, y) after convolution and Max pool2, as shown in equation (22); conv2() denotes the second-layer convolution and pool2() the second max-pooling operation.
(3) The result passes through the CONV3 convolution block (3×3-32, 1×1-32, 3×3-32), 2×2 max pooling and a ReLU activation, giving a feature map of 1/8 the size of I(x, y) after the convolution block, Max pool3 and the prediction convolution, as shown in equation (23); conv3() denotes the third-layer convolution, pool3() the third max-pooling operation, and conv1×1() a 1×1 convolution kernel (the same below).
(4) The result passes through the CONV4 convolution block (3×3-64, 1×1-64, 3×3-64), 2×2 max pooling, a 1×1 prediction convolution that adjusts the number of feature-map channels, and a ReLU activation, giving a feature map of 1/16 the size of I(x, y) after the convolution block, Max pool4 and the prediction convolution, as shown in equation (24); conv4() denotes the fourth-layer convolution and pool4() the fourth max-pooling operation.
(5) The result passes through the CONV5 convolution block (3×3-128, 1×1-128, 3×3-128), 2×2 max pooling and a ReLU activation, giving a feature map of 1/32 the size of I(x, y) after convolution and Max pool5, as shown in equation (25); conv5() denotes the fifth-layer convolution and pool5() the fifth max-pooling operation. (Equations (21) to (25) are rendered as images in the original publication.)
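As a quick check on the size bookkeeping in equations (21) to (25), the snippet below halves an assumed input size five times; after Max pool3, Max pool4 and Max pool5 the feature maps are 1/8, 1/16 and 1/32 of I(x, y).

```python
# Each of the five 2x2 max-pooling stages halves the spatial size; the
# 320 x 480 input size is an illustrative assumption.
h, w = 320, 480
for stage in range(1, 6):
    h, w = h // 2, w // 2
    print(f"after Max pool{stage}: {h} x {w}  (1/{2 ** stage} of the input size)")
```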
2. Feature decoder
(1) The 1/32-size feature map is upsampled by a factor of 32 with bilinear interpolation to obtain a heat map of the same size as I(x, y), denoted F5, as shown in equation (26); bilinear() denotes the bilinear upsampling operation (the same below).
(2) A 1×1 prediction convolution layer that adjusts the number of feature-map channels is added after pooling layer Max pool4; its output is added element-wise to the two-fold upsampled 1/32-size feature map, and the sum is upsampled by a factor of 16 with bilinear interpolation to obtain a heat map of the same size as I(x, y), denoted F4, as shown in equation (27); sum() denotes matrix addition (the same below).
(3) A 1×1 prediction convolution layer that adjusts the number of feature-map channels is added after pooling layer Max pool3; its output is added element-wise to the two-fold upsampled 1/16-size feature map, and the sum is upsampled by a factor of 8 with bilinear interpolation to obtain a heat map of the same size as I(x, y), denoted F3, as shown in equation (28). (Equations (26) to (28) are rendered as images in the original publication.)
Step 3: for the heat maps F5, F4 and F3 obtained in step 2, the max function takes the maximum value at each pixel, and the fused result is the image contour mask map F, as shown in equation (29). After a ReLU activation, a loss is computed against the training image with known manually labelled contours; the result is denoted loss, as shown in equation (30) (rendered as an image in the original). Stochastic gradient descent is used to iteratively update the parameters of each network layer, and training ends when the loss value falls below the threshold ε, which is set to 1–3% of the total number of pixels in the training image samples, yielding the trained fully convolutional neural network.
F = max(F5, F4, F3)    (29)
In equation (30), M and N are the numbers of rows and columns of the training image, F_{i,j} is the pixel value of the image contour mask F at coordinate (i, j), and I^mark_{i,j} is the pixel value of the known contour-labelled image I_mark at coordinate (i, j).
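A hedged sketch of the step-3 fusion and training loop follows; the absolute-difference form of the loss in equation (30), the learning rate, the shape of the data loader and the model returning the three heat maps directly are all assumptions.

```python
# Sketch of step 3: fuse the heat maps with a pixel-wise max, apply ReLU,
# compare with the contour label image, and update the network with SGD until
# the loss drops below the threshold epsilon. The L1-style loss is an assumed
# reading of equation (30).
import torch

def contour_mask_and_loss(f5, f4, f3, i_mark):
    mask = torch.relu(torch.maximum(torch.maximum(f5, f4), f3))  # F = max(F5, F4, F3)
    return mask, (mask - i_mark).abs().sum()                     # assumed per-pixel loss

def train(model, loader, epsilon, lr=1e-3):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)       # stochastic gradient descent
    while True:
        for image, i_mark in loader:                             # (network input, I_mark label)
            f5, f4, f3 = model(image)                            # assumed: model returns F5, F4, F3
            _, loss = contour_mask_and_loss(f5, f4, f3, i_mark)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            if loss.item() < epsilon:                            # stop when loss < threshold
                return model
```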
Step 4: the image to be detected is passed through the Gabor filter constructed in steps 1–3, the non-subsampled contourlet transform and the trained fully convolutional neural network to obtain an image contour mask map, which is multiplied point-wise with the image to be detected to finally obtain the image contour detection result.
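For step 4, a short hedged sketch of the inference path: the trained network produces the contour mask, which is multiplied point-wise with the image to be detected; the Gabor/NSCT preprocessing that produces the network input is not shown, and the model interface is the same assumption as in the training sketch.

```python
# Sketch of step 4: run the trained network, fuse the heat maps into the mask,
# and take the point-wise product with the image to be detected.
import torch

def detect_contours(model, image_tensor):
    with torch.no_grad():
        f5, f4, f3 = model(image_tensor)
        mask = torch.relu(torch.maximum(torch.maximum(f5, f4), f3))
    return mask * image_tensor           # point-wise multiplication with the input image
```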

Claims (1)

1. An image contour detection method based on multi-level feature channel optimized coding, characterized in that the method comprises the following steps (the equations referenced in this claim are rendered as images in the original publication):

Step 1: acquire the primary contour response of the input image I(x, y).
First, the Gabor filter response of the input image I(x, y) is calculated, as shown in equations (1) to (4). In the formulas, the response represents the Gabor feature information of I(x, y) obtained by the Gabor filter at scale m and direction θ = nπ/K; σ_x and σ_y denote the standard deviations of the Gabor wavelet basis function along the x-axis and y-axis; ω is the complex modulation frequency of the Gaussian function; the Gabor filter ψ_{m,n}(x, y) is obtained by scale and rotation transformation of the mother wavelet ψ(x, y); u, v is the template size of ψ_{m,n}(x, y); m = 0, ..., S−1, n = 0, ..., K−1, where S and K denote the numbers of scales and directions respectively; α is the scale factor of ψ(x, y), with α > 1.
Based on the similarity index SSIM, the optimal scale m_opt and direction θ_opt of the Gabor filter are calculated, as shown in equations (5) to (8). The SSIM value measures the similarity between the filter response and the known contour-labelled image I_mark, and m_opt and θ_opt are obtained when it takes its maximum value; its three factors are quantitative similarity measures of luminance, contrast and structure between the filter response and I_mark; u_Gabor and u_mark denote the mean luminances of the two images, δ_Gabor and δ_mark their luminance standard deviations, δ_Gabor² and δ_mark² their luminance variances, and δ_{G,m} their luminance covariance; to avoid system instability when the denominators of equations (6) to (8) approach zero, C1, C2 and C3 are set to positive constants smaller than 3% of the mean luminance of the filter response.
m_opt and θ_opt are used as the frequency-separation parameters of the non-subsampled contourlet transform (NSCT), which decomposes the image I(x, y) into a contour sub-image; since the NSCT decomposition preserves image size, the contour sub-image is fused directly with I(x, y) by a pixel-level feature-enhancement operation, finally yielding the primary contour response E(x, y) of the input image I(x, y), as shown in equations (9) and (10), where the transform is taken under the scale m_opt and direction θ_opt parameters, t denotes the mean luminance of the contour sub-image, and max denotes the maximum-value function (the same below).

Step 2: transmit the primary contour response E(x, y) obtained in step 1 to the fully convolutional neural network to obtain the heat maps F5, F4 and F3 trained by the FCN-32s, FCN-16s and FCN-8s network units respectively; the fully convolutional neural network is divided into a feature encoder and a feature decoder, and the whole network contains 8 convolution blocks, 5 max-pooling layers, 5 upsampling layers and 2 convolution layers; the specific structure is as follows:
1. Feature encoder
VGG-16 is used as the base network for the optimization and reconstruction of the fully convolutional network; to improve computation speed and enhance generalization, a 1×1 convolution kernel is inserted between every two 3×3 convolution kernels, giving convolution blocks with a (3×3, 1×1, 3×3) structure; to strengthen the nonlinearity and translation invariance of the learned image features, a max-pooling layer is added after each convolution module; after pooling layer Max pool5, E(x, y) is reduced to 1/32 of the size of I(x, y) and is the feature map output by the trained FCN-32s network unit; after pooling layer Max pool4 and a 1×1 convolution layer, E(x, y) is reduced to 1/16 of the size of I(x, y) and is the feature map output by the trained FCN-16s network unit; similarly, after pooling layer Max pool3 and a 1×1 convolution layer, E(x, y) is reduced to 1/8 of the size of I(x, y) and is the feature map output by the trained FCN-8s network unit; each pooling-layer output passes through a ReLU activation function to realize sparse coding; the feature encoder consists of the following thirteen layers, all with stride 1:
Layer 1: convolution layer CONV1-1, 8 channels, 3×3 kernel; CONV1-2, 8 channels, 3×3 kernel;
Layer 2: max-pooling layer Max pool1, 2×2 pooling region;
Layer 3: convolution layer CONV2-1, 16 channels, 3×3 kernel; CONV2-2, 16 channels, 1×1 kernel; CONV2-3, 16 channels, 3×3 kernel;
Layer 4: max-pooling layer Max pool2, 2×2 pooling region;
Layer 5: convolution layer CONV3-1, 32 channels, 3×3 kernel; CONV3-2, 32 channels, 1×1 kernel; CONV3-3, 32 channels, 3×3 kernel;
Layer 6: max-pooling layer Max pool3, 2×2 pooling region;
Layer 7: convolution layer CONV4-1, 64 channels, 3×3 kernel; CONV4-2, 64 channels, 1×1 kernel; CONV4-3, 64 channels, 3×3 kernel;
Layer 8: max-pooling layer Max pool4, 2×2 pooling region;
Layer 9: convolution layer CONV5-1, 128 channels, 3×3 kernel; CONV5-2, 128 channels, 1×1 kernel; CONV5-3, 128 channels, 3×3 kernel;
Layer 10: max-pooling layer Max pool5, 2×2 pooling region;
Layer 11: convolution layer CONV6, 256 channels, 1×1 kernel;
Layer 12: convolution layer CONV7, 256 channels, 1×1 kernel;
Layer 13: convolution layer CONV8, 1 channel, 1×1 kernel;
2. Feature decoder
The primary contour response E(x, y) is successively reduced to 1/8, 1/16 and 1/32 of its original size by the feature encoding, so the resulting feature maps have low resolution; a feature decoder is therefore added to apply bilinear upsampling to the low-resolution feature maps; the 32× downsampled feature map is upsampled by a factor of 32 with bilinear interpolation to obtain a heat map of the same size as I(x, y), denoted F5; a 1×1 prediction convolution layer that adjusts the number of feature-map channels is added after pooling layer Max pool4, its output is added element-wise to the two-fold upsampled 32× downsampled feature map, and the sum is upsampled by a factor of 16 with bilinear interpolation to obtain a heat map of the same size as I(x, y), denoted F4; a 1×1 prediction convolution layer that adjusts the number of feature-map channels is added after pooling layer Max pool3, its output is added element-wise to the two-fold upsampled 16× downsampled feature map, and the sum is upsampled by a factor of 8 with bilinear interpolation to obtain a heat map of the same size as I(x, y), denoted F3.

Step 3: for the heat maps F5, F4 and F3 obtained in step 2, the max function takes the maximum value at each pixel, and the fused result is the image contour mask map F; after a ReLU activation, a loss is computed against the known contour-labelled image I_mark and the result is denoted loss; stochastic gradient descent is used to iteratively update the parameters of each network layer, and training ends when the loss value falls below the threshold ε, which is set to 1–3% of the total number of pixels in the training image samples, yielding the trained fully convolutional neural network.

Step 4: the image to be detected is passed through the Gabor filter constructed in steps 1–3, the non-subsampled contourlet transform and the trained fully convolutional neural network to obtain an image contour mask map, which is multiplied point-wise with the image to be detected to finally obtain the image contour detection result.
CN201910080334.7A 2019-01-28 2019-01-28 An Image Contour Detection Method Based on Multi-level Feature Channel Optimal Coding Active CN109903301B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910080334.7A CN109903301B (en) 2019-01-28 2019-01-28 An Image Contour Detection Method Based on Multi-level Feature Channel Optimal Coding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910080334.7A CN109903301B (en) 2019-01-28 2019-01-28 An Image Contour Detection Method Based on Multi-level Feature Channel Optimal Coding

Publications (2)

Publication Number Publication Date
CN109903301A CN109903301A (en) 2019-06-18
CN109903301B true CN109903301B (en) 2021-04-13

Family

ID=66944375

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910080334.7A Active CN109903301B (en) 2019-01-28 2019-01-28 An Image Contour Detection Method Based on Multi-level Feature Channel Optimal Coding

Country Status (1)

Country Link
CN (1) CN109903301B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110378976B (en) * 2019-07-18 2020-11-13 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
CN110796643A (en) * 2019-10-18 2020-02-14 四川大学 Rail fastener defect detection method and system
CN111126494B (en) * 2019-12-25 2023-09-26 中国科学院自动化研究所 Image classification method and system based on anisotropic convolution
CN113076966B (en) * 2020-01-06 2023-06-13 字节跳动有限公司 Image processing method and device, neural network training method, storage medium
CN111310771B (en) * 2020-03-11 2023-07-04 中煤航测遥感集团有限公司 Road image extraction method, device and equipment of remote sensing image and storage medium
CN113761983B (en) * 2020-06-05 2023-08-22 杭州海康威视数字技术股份有限公司 Method, device and image acquisition device for updating human face liveness detection model
CN111985329B (en) * 2020-07-16 2024-03-29 浙江工业大学 Remote sensing image information extraction method based on FCN-8s and improved Canny edge detection
CN112116537B (en) * 2020-08-31 2023-02-10 中国科学院长春光学精密机械与物理研究所 Image reflected light elimination method and image reflected light elimination network construction method
US20220114424A1 (en) * 2020-10-08 2022-04-14 Niamul QUADER Multi-bandwidth separated feature extraction convolution layer for convolutional neural networks
CN112633099B (en) * 2020-12-15 2023-06-20 中国人民解放军战略支援部队信息工程大学 Gabornet-based signal processing method and system in the low-level visual area of the brain
CN113673538B (en) * 2021-08-16 2023-07-14 广西科技大学 A Biologically Inspired Multilevel and Multilevel Feedback Contour Detection Method
CN113673539B (en) * 2021-08-19 2023-06-20 广西科技大学 A step-by-step interactive contour recognition method based on deep learning model
CN119559203B (en) * 2025-01-23 2025-04-15 杭州电子科技大学 OCTA retinal vessel segmentation method based on Gabor modulation coding improved U-Net

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107239751A (en) * 2017-05-22 2017-10-10 西安电子科技大学 High Resolution SAR image classification method based on the full convolutional network of non-down sampling contourlet
KR101829287B1 (en) * 2016-11-29 2018-02-14 인천대학교 산학협력단 Nonsubsampled Contourlet Transform Based Infrared Image Super-Resolution
CN108764186A (en) * 2018-06-01 2018-11-06 合肥工业大学 Personage based on rotation deep learning blocks profile testing method
CN109033945A (en) * 2018-06-07 2018-12-18 西安理工大学 A kind of human body contour outline extracting method based on deep learning

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101829287B1 (en) * 2016-11-29 2018-02-14 인천대학교 산학협력단 Nonsubsampled Contourlet Transform Based Infrared Image Super-Resolution
CN107239751A (en) * 2017-05-22 2017-10-10 西安电子科技大学 High Resolution SAR image classification method based on the full convolutional network of non-down sampling contourlet
CN108764186A (en) * 2018-06-01 2018-11-06 合肥工业大学 Personage based on rotation deep learning blocks profile testing method
CN109033945A (en) * 2018-06-07 2018-12-18 西安理工大学 A kind of human body contour outline extracting method based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Multifocus Image Fusion Based on NSCT and Focused Area Detection";Yong Yang;《IEEE Sensors Journal》;20150531;第2824-2838页 *
"基于深度协同稀疏编码网络的海洋浮筏SAR图像目标识别";耿杰;《自动化学报》;20160430;第593-604页 *

Also Published As

Publication number Publication date
CN109903301A (en) 2019-06-18

Similar Documents

Publication Publication Date Title
CN109903301B (en) An Image Contour Detection Method Based on Multi-level Feature Channel Optimal Coding
CN110781775B (en) Remote sensing image water body information accurate segmentation method supported by multi-scale features
CN112183360B (en) A lightweight semantic segmentation method for high-resolution remote sensing images
CN110008915B (en) System and method for dense human pose estimation based on mask-RCNN
CN113673590B (en) Rain removal method, system and medium based on multi-scale hourglass densely connected network
CN111612807B (en) Small target image segmentation method based on scale and edge information
CN105551036B (en) A kind of training method and device of deep learning network
CN111145181B (en) Skeleton CT image three-dimensional segmentation method based on multi-view separation convolutional neural network
CN112308860A (en) Earth observation image semantic segmentation method based on self-supervision learning
CN116258976A (en) A Hierarchical Transformer Semantic Segmentation Method and System for High Resolution Remote Sensing Images
CN110070091B (en) Semantic segmentation method and system based on dynamic interpolation reconstruction and used for street view understanding
CN104616032A (en) Multi-camera system target matching method based on deep-convolution neural network
CN116797787B (en) Remote sensing image semantic segmentation method based on cross-modal fusion and graph neural network
CN114445665A (en) Hyperspectral image classification method based on Transformer enhanced non-local U-shaped network
CN113421210A (en) Surface point cloud reconstruction method based on binocular stereo vision
CN107977661A (en) The region of interest area detecting method decomposed based on full convolutional neural networks and low-rank sparse
CN114565628B (en) Image segmentation method and system based on boundary perception attention
CN110415253A (en) A kind of point Interactive medical image dividing method based on deep neural network
CN112686830B (en) Super-resolution method for a single depth map based on image decomposition
CN115937704B (en) Remote sensing image road segmentation method based on topology perception neural network
Zhang et al. A separation–aggregation network for image denoising
CN115205308A (en) A method for segmentation of blood vessels in fundus images based on linear filtering and deep learning
CN116309681A (en) A Weakly Supervised Segmentation Method for Medical Images Based on Class Activation Maps
CN103871060B (en) Image partition method based on steady direction wave zone probability graph model
Lei et al. HPLTS-GAN: A high-precision remote sensing spatiotemporal fusion method based on low temporal sensitivity

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant