Disclosure of Invention
To address the above problems, the invention provides a method and a device for detecting road changes in high-resolution remote sensing images based on deep learning, which solve the technical problems in the prior art. An attention module is introduced into the original skip connections of the U-Net model, and the change detection accuracy of the model is improved by increasing the weight of changed areas while reducing the weight of the feature information of unchanged areas.
In order to achieve this purpose, the invention provides the following scheme: the invention provides a high-resolution remote sensing image road change detection device based on deep learning, which comprises: an image correction unit, an image drawing unit, a data processing unit, a feature deep learning unit, a communication unit and terminal equipment;
the image correction unit, the image drawing unit, the data processing unit, the feature deep learning unit, the communication unit and the terminal equipment are sequentially connected;
the image correction unit is used for acquiring a high-resolution remote sensing image of a road, and performing radiation error correction, geometric error correction and image enhancement processing on the high-resolution remote sensing image to obtain a road correction image;
the image drawing unit draws a label map based on the road correction image;
the data processing unit is used for carrying out segmentation processing on the road correction image and the label map to obtain a segmentation data set;
the feature deep learning unit is used for performing deep learning on the segmentation data set, extracting features of the segmentation data set through an Inception structure, constructing an ASPP AttR2U-Net model based on the features, and acquiring the change result of the road based on the ASPP AttR2U-Net model;
the communication unit is used for transmitting the change result of the road to the terminal equipment.
Preferably, the image correction unit comprises an acquisition module and a correction module; the acquisition module, the correction module and the image drawing unit are sequentially connected;
the acquisition module is used for acquiring a high-resolution remote sensing image of the road;
the correction module is used for carrying out radiation error correction, geometric error correction and image enhancement processing on the high-resolution remote sensing image to obtain the road correction image.
Preferably, the image drawing unit comprises an interpretation module and a delineation module; the correction module, the interpretation module, the delineation module and the data processing unit are connected in sequence;
the interpretation module is used for loading and manually and visually interpreting the road correction image to obtain an interpreted image;
the delineation module is used for delineating the pattern spots of road changes in the interpreted image to obtain the label map.
Preferably, the data processing unit comprises a segmentation module and a collection module; the interpretation module and the delineation module are both connected with the segmentation module; the segmentation module, the collection module and the feature deep learning unit are connected in sequence;
the segmentation module is used for segmenting the road correction image and the label map to obtain a plurality of image segmentation maps;
the collection module is used for collecting all the image segmentation maps to obtain a segmentation data set.
Preferably, the feature deep learning unit comprises a feature extraction module, a recurrent residual convolution module and a model construction module; the collection module, the feature extraction module, the recurrent residual convolution module and the model construction module are connected in sequence;
the feature extraction module is used for performing deep learning on the segmentation data set and extracting features of the segmentation data set through an Inception structure;
the recurrent residual convolution module is used for performing recurrent residual convolution processing on the features of the segmentation data set;
the model building module builds an ASPP AttR2U-Net model based on the characteristics, and obtains the change result of the road based on the ASPP AttR2U-Net model.
Preferably, the communication unit includes a first communication module and a second communication module; the first communication module is arranged in the feature deep learning unit; the second communication module is arranged in the terminal equipment;
the first communication module is used for transmitting the result of the road change;
the second communication module is used for receiving the result of the road change.
Preferably, the first communication module and the second communication module are wirelessly connected over a 2.4 GHz link.
The method for detecting the road change of the high-resolution remote sensing image based on deep learning comprises the following steps:
S1, collecting a high-resolution remote sensing image of a road, and performing radiation error correction, geometric error correction and image enhancement processing on the high-resolution remote sensing image to obtain a road correction image;
S2, drawing a label map based on the road correction image;
S3, carrying out segmentation processing on the road correction image and the label map to obtain a segmentation data set;
S4, performing deep learning on the segmentation data set, extracting features of the segmentation data set through an Inception structure, building an ASPP AttR2U-Net model based on the features, and obtaining the road change result based on the ASPP AttR2U-Net model.
The invention discloses the following technical effects:
(1) The convolutional layers at the encoding end of the original U-Net model are replaced with Inception structures, which have a superior topology comprising a plurality of parallel convolution structures. This increases the depth and width of the original U-Net model, obtains different local features without adding model calculation parameters, increases feature diversity, and improves the generalization capability of the model.
(2) The invention can expand the receptive field by using different dilation rates without reducing the spatial dimension, thereby obtaining multi-scale context image features and improving the segmentation performance of the network model.
(3) The original skip connections of the U-Net model splice low-level features and high-level features in the band dimension to realize feature fusion, but because spatial consistency is insufficiently considered, edge details of the predicted image are lost. Therefore, the original skip connections of the U-Net model are changed and an attention mechanism is introduced into them. The attention mechanism adjusts the weight of each component in the feature map: learning of task-irrelevant features is suppressed by reducing their weight, and learning of task-relevant features is enhanced by increasing their weight. In the change detection task for high-resolution remote sensing images, the key is to extract the changed region from the two-phase images, so the attention mechanism is introduced to increase the information weight of the changed class, making the model focus on learning the changed regions, while the weight of the unchanged class is reduced. This improves the sensitivity of the model to the changed class and yields a more accurate picture of road changes.
(4) The invention improves the U-Net model; the improved model is called the ASPP AttR2U-Net model. The model introduces a recurrent residual convolution module in the up-sampling and down-sampling stages to reuse the feature maps, thereby enhancing feature propagation; it can enlarge the receptive field and extract context information at different scales; and it improves the change detection accuracy of the model by increasing the weight of the changed area while reducing the weight of the feature information of the unchanged area.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Referring to fig. 1, the present embodiment provides a device for detecting road changes in high-resolution remote sensing images based on deep learning, including: an image correction unit, an image drawing unit, a data processing unit, a feature deep learning unit, a communication unit and terminal equipment; the image correction unit, the image drawing unit, the data processing unit, the feature deep learning unit, the communication unit and the terminal equipment are connected in sequence.
The image correction unit is used for acquiring a high-resolution remote sensing image of a road, and performing radiation error correction, geometric error correction and image enhancement processing on the high-resolution remote sensing image to obtain a road correction image; the image drawing unit draws a label map based on the road correction image; the data processing unit is used for carrying out segmentation processing on the road correction image and the label map to obtain a segmentation data set; the feature deep learning unit is used for performing deep learning on the segmentation data set, extracting features of the segmentation data set through an Inception structure, constructing an ASPP AttR2U-Net model based on the features, and acquiring the road change result based on the ASPP AttR2U-Net model; the communication unit is used for transmitting the road change result to the terminal equipment; the terminal equipment is used for storing and viewing the road change result.
The image correction unit comprises an acquisition module and a correction module; the acquisition module, the correction module and the image drawing unit are sequentially connected; the acquisition module is used for acquiring a high-resolution remote sensing image of a road; the correction module is used for carrying out radiation error correction, geometric error correction and image enhancement processing on the high-resolution remote sensing image to obtain a road correction image.
The image drawing unit comprises an interpretation module and a delineation module; the correction module, the interpretation module, the delineation module and the data processing unit are connected in sequence; the interpretation module is used for loading and manually visually interpreting the road correction image to obtain an interpreted image; the delineation module is used for delineating the pattern spots of road changes in the interpreted image to obtain a label map.
The data processing unit comprises a segmentation module and a collection module; the interpretation module and the delineation module are both connected with the segmentation module; the segmentation module, the collection module and the feature deep learning unit are connected in sequence; the segmentation module is used for carrying out segmentation processing on the road correction image and the label map to obtain a plurality of image segmentation maps; the collection module is used for collecting all the image segmentation maps to obtain a segmentation data set.
The feature deep learning unit comprises a feature extraction module, a recurrent residual convolution module and a model construction module; the collection module, the feature extraction module, the recurrent residual convolution module and the model construction module are connected in sequence; the feature extraction module is used for performing deep learning on the segmentation data set and extracting features of the segmentation data set through an Inception structure; the recurrent residual convolution module is used for performing recurrent residual convolution processing on the features of the segmentation data set; the model construction module builds an ASPP AttR2U-Net model based on the features, and obtains the road change result based on the ASPP AttR2U-Net model.
The communication unit comprises a first communication module and a second communication module; the first communication module is arranged in the feature deep learning unit; the second communication module is arranged in the terminal equipment; the first communication module is used for transmitting the road change result; the second communication module is used for receiving the road change result. The first communication module and the second communication module are wirelessly connected over a 2.4 GHz link.
Referring to fig. 2 to 11, the embodiment provides a method for detecting road changes of high-resolution remote sensing images based on deep learning, which includes the following steps:
S1, collecting the high-resolution remote sensing image of the road, and carrying out radiation error correction, geometric error correction and image enhancement processing on the high-resolution remote sensing image to obtain a road correction image.
The acquired high-resolution remote sensing images are preprocessed, mainly by radiation correction and geometric correction. During satellite imaging, because of the different orbit positions, solar altitude angles and instantaneous fields of view of the sensor at different moments, the acquired remote sensing image may be geometrically distorted in position and cannot be used directly. Therefore, the first step in remote sensing image interpretation is to preprocess the acquired images. Preprocessing generally comprises radiation correction, geometric correction, image enhancement and the like; it weakens the influence of external factors on remote sensing image change detection by eliminating the 'pseudo changes' they cause.
Radiation correction is used to correct or eliminate radiation errors: when the sensor receives the electromagnetic radiation emitted by surface objects, influences such as atmospheric effects and illumination conditions make the value recorded by the sensor inconsistent with the spectral radiance actually emitted by the surface object, distorting the gray levels of the image. In the experiment in this embodiment, when radiation correction is performed on the two-phase GF-1 images, only radiation errors generated by sensor factors are considered; radiation errors caused by other factors are not.
Radiation correction is further divided into absolute radiation correction and relative radiation correction. Absolute radiation correction establishes mathematical models for standard radiation sources over different spectral ranges to quantitatively describe the relationship between the spectral radiance at the satellite imaging spectrometer and the output digital number, so that the remote sensing data truly reflect the intensity of the electromagnetic information of ground objects. Relative radiation correction selects one of the two-phase images as the reference image and the other as the image to be corrected, and establishes a linear or nonlinear mathematical model between the gray values of the two-phase images to eliminate the radiometric differences between them, so that the same ground object has the same radiance value in both phases, normalizing the spectral values. Here, relative radiation correction is adopted: one phase is taken as the reference image, the other as the image to be corrected, and regression analysis is used to establish a linear mapping between the two-phase images, with the formula:
y_i = k_i * x_i + b_i
where y_i is the radiance value of a pixel in the i-th band of the later-phase image after radiation correction, x_i is the radiance value of the corresponding pixel of the image to be corrected in the i-th band, and k_i and b_i are the slope and intercept of the linear regression equation for the i-th band. Pseudo-invariant feature points are selected in the two-phase images using an iteratively reweighted multivariate algorithm; after several iterations a threshold and weights are selected, and the slope k_i and intercept b_i in the formula are solved from the pseudo-invariant feature points by the least squares method.
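As a minimal illustrative sketch (not the patented implementation), the per-band linear mapping y_i = k_i * x_i + b_i can be fitted in plain Python by ordinary least squares over pseudo-invariant feature (PIF) points; the function names and the closed-form fit are assumptions for illustration:

```python
def fit_band_regression(x, y):
    """Least-squares slope k and intercept b for y ~ k*x + b.

    x: radiance values of PIF pixels in one band of the image to be corrected
    y: radiance values of the same pixels in the reference image
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    k = sxy / sxx
    b = mean_y - k * mean_x
    return k, b

def correct_band(pixels, k, b):
    """Apply y_i = k * x_i + b to every pixel of one band."""
    return [k * p + b for p in pixels]
```

In practice the fit would be repeated per band after the iterative PIF selection described above.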
Geometric errors of the image are caused by factors such as the height of the sensor platform, the curvature of the earth, changes in atmospheric refraction and changes in terrain. Geometric correction removes the geometric deformation errors in the geometric position, shape, size and spatial position of the same ground object that arise when the image information is projected onto a specified reference system. The image is geometrically corrected using control points. The number of control points is related to the order of the polynomial model used for geometric correction: for an n-th order polynomial, at least (n+1)*(n+2)/2 control points are needed in theory, and in practice the number of selected control points should exceed this theoretical minimum. The selection of control points mainly follows these principles: (1) control points should be distinct, permanent and fine features in the image, such as house corners, road intersections and airports; (2) additional control points should be selected in areas where the image features change greatly; (3) control points should be selected in the image edge areas to avoid extrapolation of the corrected image; (4) control points should be distributed evenly over the image. In this embodiment, the earlier-phase image is the reference image and the later-phase image is the image to be corrected, and a second-order polynomial model is used for geometric correction, with the formulas:
x_l = a_0 + a_1*x_1 + a_2*y_1 + a_3*x_1^2 + a_4*x_1*y_1 + a_5*y_1^2
y_l = b_0 + b_1*x_1 + b_2*y_1 + b_3*x_1^2 + b_4*x_1*y_1 + b_5*y_1^2
where (x_l, y_l) are the pixel coordinates of the corrected later-phase image, (x_1, y_1) are the pixel coordinates of the earlier-phase image, and a_i and b_i (i = 0, 1, 2, 3, 4, 5) are the coefficients of the second-order polynomial correction model, obtained by the least squares method from manually selected control points.
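The second-order polynomial mapping can be sketched as follows; coefficient estimation by least squares from control points is omitted here, and all function names are illustrative assumptions, not the patented implementation:

```python
def min_control_points(n):
    """Theoretical minimum number of control points for an n-th order polynomial:
    (n + 1) * (n + 2) / 2."""
    return (n + 1) * (n + 2) // 2

def apply_second_order(a, b, x1, y1):
    """Map earlier-phase pixel coordinates (x1, y1) to corrected coordinates.

    a, b: 6-element coefficient lists a_0..a_5 and b_0..b_5 for the
    x and y polynomials, assumed already fitted by least squares.
    """
    terms = [1.0, x1, y1, x1 * x1, x1 * y1, y1 * y1]
    xl = sum(ai * t for ai, t in zip(a, terms))
    yl = sum(bi * t for bi, t in zip(b, terms))
    return xl, yl
```

For the second-order model used here, min_control_points(2) gives the theoretical minimum of 6 control points.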
There are many image enhancement methods for remote sensing images, such as color enhancement and radiometric enhancement, and different methods yield different final results. This embodiment mainly performs gray-level stretching on the acquired two-phase images. Gray stretching is a simple and efficient linear image enhancement method. Piecewise linear gray stretching suppresses the low-frequency part of an image and improves the contrast and brightness of the high-frequency part, markedly improving the visual effect of the image, so that more information useful for the current task can be extracted during visual interpretation. A 2% linear stretch is applied to the two images used in the experiment: pixels whose gray values lie between the 2nd and 98th percentiles are stretched linearly to the gray range 0-255, while values below the 2nd percentile are set to 0 and values above the 98th percentile are set to 255, discarding the outlying values, according to the formula:
g(x,y) = 0, if f(x,y) < V_2%
g(x,y) = 255 * (f(x,y) - V_2%) / (V_98% - V_2%), if V_2% ≤ f(x,y) ≤ V_98%
g(x,y) = 255, if f(x,y) > V_98%
where g(x,y) is the processed image, f(x,y) is the input image, and V_2% and V_98% are the image gray values at the 2nd and 98th percentiles.
S2, drawing a label map based on the road correction image.
Using remote sensing image processing software, the two-phase images are loaded separately, and the pattern spots of road changes in the two-phase images are delineated by manual visual interpretation to serve as training samples for the model.
S3, carrying out segmentation processing on the road correction image and the label map to obtain a segmentation data set.
The processed two-phase images obtained in S1 and S2, together with the label map delineated by manual visual interpretation, are segmented into tiles of the same size, and the segmented images and label maps are divided into training samples and verification samples in a certain proportion.
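The tiling and train/validation split can be sketched as follows; the non-overlapping window strategy and the 80/20 split ratio are assumptions for illustration, since the patent only specifies "the same size" and "a certain proportion":

```python
def split_into_tiles(height, width, tile):
    """Return (row, col) offsets of non-overlapping tile windows of size
    tile x tile (sketch). Edge remainders smaller than `tile` are dropped
    for simplicity; the actual cropping strategy is an assumption.
    """
    return [(r, c)
            for r in range(0, height - tile + 1, tile)
            for c in range(0, width - tile + 1, tile)]

def train_val_split(items, train_ratio=0.8):
    """Split the tile list into training and verification samples by ratio."""
    k = int(len(items) * train_ratio)
    return items[:k], items[k:]
```

The same offsets would be applied identically to both phase images and the label map so that each training sample stays aligned with its label.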
S4, performing deep learning on the segmentation data set, extracting features of the segmentation data set through an Inception structure, building an ASPP AttR2U-Net model based on the features, and obtaining the road change result based on the ASPP AttR2U-Net model.
First, the convolution operation in the deep learning framework PyTorch is called to extract features from the input image. One convolution kernel can extract only one feature map and cannot extract all the different features of the whole image, so each convolutional layer uses several different convolution kernels to extract different types of features: the lower convolutional layers mainly extract shallow features of the image, such as boundary and contour information, while the higher convolutional layers extract high-level features of the image, such as its geometric and spatial relations, by superposing and integrating the information extracted by the lower layers. Second, the pooling operation in PyTorch is invoked. The feature maps of the input image after the convolutional layers have high dimensionality and contain some unimportant high-frequency information; if these high-dimensional feature maps were fed directly into the next convolutional layer, the computation of the model would increase and the excessive dimensionality would cause overfitting. A method is therefore needed to aggregate the convolved feature maps, that is, to describe a large region with a low-dimensional feature. Pooling reduces the dimensionality of the feature maps while retaining their main features, effectively reduces the number of parameters, and prevents overfitting. However, the convolution operation performs only a linear transformation of the input image; no matter how many hidden layers are stacked in the neural network, the output remains a combination of linear transformations and can express only simple mapping relations.
For complex task scenes such as remote sensing images, the expressive capability of such a linear model is insufficient and its generalization capability is very limited. Therefore, to improve the expressive and generalization capability of the model, an activation function must be introduced to map the linear features extracted by the convolutional layers into nonlinear features.
To address the missed detections and false detections of the U-Net model in change detection on high-resolution images, the invention improves the U-Net model. The first convolutional layer of each stage at the encoding end of the U-Net model is replaced with an Inception structure. The Inception structure was originally proposed in GoogLeNet. Compared with a traditional convolutional layer it has a better topology: it comprises a plurality of parallel convolution structures, each containing convolution kernels of different sizes, which increases the depth and width of the U-Net model. As shown in fig. 3, the Inception structure has convolution kernels of sizes 1×1, 3×3 and 5×5 and a pooling layer of size 3×3. A convolution kernel of size 1×1 is added in each channel to reduce the dimensionality and the amount of calculation. Finally, the extracted feature layers are concatenated, so that different local features can be obtained without adding model calculation parameters, feature diversity is increased, and the generalization capability of the model is enhanced.
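A minimal PyTorch sketch of an Inception-style block with the four parallel branches described above (1×1, 3×3 and 5×5 convolutions plus 3×3 pooling, each with a 1×1 channel-reduction convolution). The even channel allocation across branches and the omission of batch normalization and activations are simplifying assumptions, not the patented design:

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        c = out_ch // 4  # split output channels evenly across the 4 branches
        self.b1 = nn.Conv2d(in_ch, c, kernel_size=1)
        self.b3 = nn.Sequential(
            nn.Conv2d(in_ch, c, kernel_size=1),            # 1x1 reduction
            nn.Conv2d(c, c, kernel_size=3, padding=1),      # 3x3 branch
        )
        self.b5 = nn.Sequential(
            nn.Conv2d(in_ch, c, kernel_size=1),            # 1x1 reduction
            nn.Conv2d(c, c, kernel_size=5, padding=2),      # 5x5 branch
        )
        self.bp = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),  # 3x3 pooling
            nn.Conv2d(in_ch, c, kernel_size=1),
        )

    def forward(self, x):
        # Concatenate the four parallel feature maps along the channel axis
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.bp(x)], dim=1)
```

All branches preserve the spatial resolution, so their outputs can be concatenated directly and the block can replace a plain convolutional layer in the encoder.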
An ASPP module is introduced at the bottom of the U-Net model to acquire multi-scale context information of the feature maps. Atrous (dilated) convolution was first proposed in the DeepLab v1 model. Traditional convolution extracts features from closely arranged image pixels; atrous convolution inserts holes into the standard convolution module by padding with zero values. An atrous convolution with dilation rate r expands a convolution kernel of size m×m into one of size N_m×N_m, with receptive field RF. The calculation formulas for N_m and RF are as follows:
N_m = m + (m - 1) * (r - 1)
RF_k = RF_(k-1) + (m_k - 1) * S_k
where RF_k denotes the receptive field size of the k-th layer, S_k is the stride of the k-th layer convolution, RF_(k-1) is the receptive field size of the (k-1)-th layer, and m_k is the size of the k-th layer convolution kernel. This method enlarges the context receptive field without increasing the amount of calculation, so that the feature map output by the convolutional layer contains feature information over a wider range. The later DeepLab v2 model proposed the Atrous Spatial Pyramid Pooling (ASPP) module, whose structure is shown in fig. 4. For the multi-scale features of ground objects in the image, the ASPP module adopts atrous convolution modules with dilation rates of 6, 12, 18 and 24 to obtain receptive field information at different scales, so that context information at different scales can be obtained during feature extraction.
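The two formulas above can be checked numerically with a short sketch (`effective_kernel` and `receptive_field` are illustrative names; the receptive-field recurrence follows the formula as stated, starting from RF_0 = 1 for a single pixel):

```python
def effective_kernel(m, r):
    """Effective kernel size N_m of an m x m convolution with dilation rate r:
    N_m = m + (m - 1) * (r - 1)."""
    return m + (m - 1) * (r - 1)

def receptive_field(layers):
    """Accumulate RF_k = RF_(k-1) + (m_k - 1) * S_k over (m_k, S_k) pairs,
    starting from RF_0 = 1."""
    rf = 1
    for m_k, s_k in layers:
        rf += (m_k - 1) * s_k
    return rf
```

For the ASPP rates used here, a 3×3 kernel with dilation rates 6, 12, 18 and 24 behaves like kernels of size 13, 25, 37 and 49 respectively.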
An attention mechanism is added to the original skip connections, increasing the weight of information from the changed areas while reducing the weight of feature information from the unchanged areas, so that the change detection result is more accurate. In summary, a remote sensing image change detection model is constructed; the training samples are input into the improved U-Net model for training, and after a number of iterations, when the loss value on the training set no longer decreases, model training is complete.
The improved U-Net model is similar in structure to the original U-Net model and consists of an encoder, a decoder and skip connections. The encoder of the improved U-Net model consists of 4 convolutional layers and down-sampling layers, where the convolutional layer part adopts the aforementioned Inception structure, so that different local features can be obtained without adding model calculation parameters, feature diversity is increased, and the generalization capability of the model is enhanced. An ASPP module is introduced at the bottom of the encoder; four atrous convolutions with different dilation rates expand the receptive field and extract multi-scale information of the image target. The decoder consists of 4 sets of convolutional layers and up-sampling layers, its convolutional layer part being the same as the encoder's. A skip connection incorporating an attention module is used between each encoder and decoder layer, 4 layers in total. The skip connections splice shallow features and deep features in the band dimension, and the added attention module increases the feature weight of change information, improving the noise resistance of the model.
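A minimal PyTorch sketch of an additive attention gate on a skip connection, in the spirit of Attention U-Net; the layer sizes and the assumption that the gating signal has the same spatial resolution as the skip features are illustrative, not the patented design:

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    def __init__(self, skip_ch, gate_ch, inter_ch):
        super().__init__()
        self.w_x = nn.Conv2d(skip_ch, inter_ch, kernel_size=1)  # encoder (skip) features
        self.w_g = nn.Conv2d(gate_ch, inter_ch, kernel_size=1)  # decoder (gating) features
        self.psi = nn.Conv2d(inter_ch, 1, kernel_size=1)        # collapse to one weight map

    def forward(self, x, g):
        # x: encoder skip features; g: decoder gating signal (same H x W assumed)
        a = torch.relu(self.w_x(x) + self.w_g(g))
        alpha = torch.sigmoid(self.psi(a))  # per-pixel weights in (0, 1)
        return x * alpha  # re-weight skip features before concatenation
```

The gated output would then be spliced with the up-sampled decoder features in the band dimension, so that pixels likely to belong to changed areas carry more weight in the fusion.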
Finally, it should be noted that the above-mentioned embodiments are only specific embodiments of the present invention, used to illustrate the technical solutions of the present invention and not to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the art may, within the technical scope of the present disclosure, still modify or easily conceive changes to the technical solutions described in the foregoing embodiments, or make equivalent substitutions of some technical features; such modifications, changes or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention and are intended to be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.