Method for detecting damage of building after earthquake based on near-ground image data
Technical Field
The invention relates to the technical field of remote sensing application and the field of disaster assessment, in particular to a method for detecting post-earthquake building damage based on near-ground image data from cameras (handheld/vehicle-mounted/unmanned aerial vehicle) and smart phones, and more particularly to a method for detecting post-earthquake building damage from such near-ground image data by combining deep learning with a superpixel segmentation algorithm.
Background
After a natural disaster occurs, emergency rescue, decision-making command and post-earthquake reconstruction must be carried out as soon as possible so that casualties and property loss in the disaster area are kept to a minimum. Throughout the post-earthquake emergency response, damage information on buildings in the disaster area provides important guidance for decision makers and rescue workers. At present, the most common approach extracts damaged building objects through change detection between pre-disaster and post-disaster remote sensing images. This approach usually requires remote sensing images from the same sensor at different time phases, and in actual earthquake emergency monitoring and assessment it is quite difficult to obtain such same-sensor data. By contrast, post-disaster remote sensing data alone are relatively easy to acquire, so building damage detection based on post-disaster remote sensing images has gradually become a research hotspot in recent years. Among post-disaster single-temporal data sources, traditional satellite remote sensing images have long revisit periods and capture only the roof information of buildings; airborne Lidar point clouds and airborne oblique photogrammetry can compensate for the inherent inability of satellite imagery to detect facade damage, yet, owing to the complexity of modern buildings, especially in densely built-up areas, problems such as occlusion by ground objects and blind shooting angles remain. Therefore, there is still a need to collect near-ground image data, captured closer to the building and from more direct shooting angles, as auxiliary data for post-earthquake building damage detection.
With the maturing of camera (handheld/vehicle-mounted/unmanned aerial vehicle) and smart phone technology, it has become possible to acquire near-ground images of damaged buildings in the disaster area and perform fine damage detection using such cameras and the high-resolution cameras built into smart phones. Compared with traditional photogrammetry, data acquisition with these devices offers markedly higher image resolution, lower technical requirements on acquisition personnel and strong timeliness, and can effectively overcome the facade information loss and ground-object occlusion found in satellite and airborne data. Used in conjunction with satellite and airborne data, post-earthquake building damage detection based on near-ground image data therefore has the potential to become one of the important technical means for all-round, fine-grained detection of building damage after an earthquake.
Disclosure of Invention
Aiming at the problems, the invention provides a method for detecting damage of a building after earthquake based on near-ground image data, which comprises the following specific steps:
step one, acquiring near-ground image data of buildings after an earthquake with photographic equipment, preprocessing the image data, and preparing a building damage labeling sample set;
step two, substituting the training samples of damaged buildings and the corresponding labeling information into a deep neural network model for training;
step three, substituting the building near-ground image data to be detected into the deep neural network model trained in the step two to obtain a building damage pre-detection result;
step four, performing superpixel segmentation on the building near-ground image data to be detected, and fusing the building damage pre-detection result obtained in step three with the superpixel segmentation result on the basis of a majority voting rule to obtain a building damage fine detection result; the concrete implementation is as follows,
firstly, dividing the image superpixel segmentation result map into different regions based on the segmented superpixel blocks; then counting, within each region, the number of pixels belonging to each category in the corresponding damage pre-detection result; and finally, according to the pixel-count statistics, taking the category with the largest total number of pixels as the category label of the superpixel region, with the calculation formula:

$L_r = \arg\max_{c \in \{1, 2, \dots, M\}} \sum_{(i,j) \in r} \operatorname{sgn}\big(f(r(i,j)) = c\big)$

wherein $L_r$ is the category label to which region r belongs, M is the total number of categories in the damage pre-detection result, r(i, j) is the pixel with coordinates (i, j) in region r, f(r(i, j)) is the category label to which pixel r(i, j) belongs, and sgn(x) is a mathematical indicator function: if f(r(i, j)) = c, sgn returns 1, otherwise 0.
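The majority-voting rule described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the claimed implementation: the array names (`superpixels`, `pred`) and the assumption that both maps are integer label arrays of the same shape are assumptions made for the example.

```python
import numpy as np

def majority_vote_fusion(superpixels, pred, n_classes):
    """Relabel each superpixel region r with L_r, the class whose pixel
    count inside r is largest in the damage pre-detection map `pred`."""
    fused = np.empty_like(pred)
    for r in np.unique(superpixels):
        mask = superpixels == r
        # count pixels of each class c inside region r (the sgn sum)
        counts = np.bincount(pred[mask], minlength=n_classes)
        fused[mask] = counts.argmax()  # L_r = argmax_c of the counts
    return fused
```

Applied to a noisy prediction, isolated misclassified pixels inside a superpixel are absorbed into the region's majority class, which is exactly the noise-removal effect the fusion step relies on.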
Further, in the first step, the photographing device includes a handheld camera, a vehicle-mounted camera, an unmanned aerial vehicle camera, or a smart phone.
Further, in step one, the specific implementation of preprocessing the images and preparing the building damage labeling sample set is as follows,
(1) cropping the post-earthquake near-ground image data of the building and readjusting the image width and height;
(2) carrying out ambiguity analysis on the adjusted images:
a) converting the near-ground image of the building into a gray image;
b) calculating the Laplacian variance of the gray image: 1) first, performing edge detection on the image with the Laplacian of Gaussian (LoG) operator; 2) then, for the edge detection result of 1), calculating the variance within a given sampling window, the variance formula being:

$V = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$

wherein $x_i$ is the LoG value of the i-th pixel in the sampling window, $\bar{x}$ is the mean LoG value of all pixels in the sampling window, and n is the total number of pixels contained in the sampling window;
c) averaging the variance values of the windows returned in the step b), taking the average value as a measurement value V of the image ambiguity level, and filtering image data of which the ambiguity measurement value V is smaller than a threshold value;
(3) manually plotting wall-collapse and wall-crack damage objects on the building facade with a labeling tool to prepare the building damage labeling sample set.
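The blur screening in sub-steps a)–c) can be prototyped with plain NumPy as below. This is a sketch under stated assumptions: the 3×3 binomial smoothing is a cheap stand-in for the Gaussian part of LoG, and the window size and function names are illustrative, not prescribed by the invention.

```python
import numpy as np

def smooth3(gray):
    """3x3 binomial smoothing, approximating the Gaussian in LoG."""
    w = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], float) / 16.0
    h, ww = gray.shape
    p = np.pad(gray, 1, mode="edge")
    out = np.zeros((h, ww))
    for i in range(3):
        for j in range(3):
            out += w[i, j] * p[i:i + h, j:j + ww]
    return out

def laplacian(gray):
    """4-neighbour discrete Laplacian of the image."""
    p = np.pad(gray, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * gray

def blur_measure(gray, win=16):
    """Mean of per-window variances of the LoG response (sub-step c);
    a small value indicates a blurry image to be filtered out."""
    log = laplacian(smooth3(gray.astype(float)))
    h, w = log.shape
    variances = [log[y:y + win, x:x + win].var()
                 for y in range(0, h - win + 1, win)
                 for x in range(0, w - win + 1, win)]
    return float(np.mean(variances))
```

Because smoothing suppresses the high-frequency content that the Laplacian responds to, a blurred copy of an image always scores lower than the original, which is the property the threshold filter in sub-step c) exploits.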
Further, in step three, the deep neural network model is the instance segmentation model Mask R-CNN, which comprises the target detection model Faster R-CNN and the semantic segmentation model FCN; the Mask R-CNN model first adopts the Faster R-CNN network to detect building damage objects from the input image and mark each with a detection frame (ROI), and then adopts the FCN network to perform semantic-level segmentation and pixel-by-pixel labeling on the detected ROI regions;
wherein the Faster R-CNN first adopts a feature extraction network to extract the salient features of different targets from the image and generate a feature map; next, in the RPN stage, several candidate ROIs are taken at each anchor point on the feature map, foreground is distinguished from background, and the ROI positions are preliminarily adjusted; further, in the detection network, different target categories are distinguished and the ROI positions are finely adjusted; the FCN converts the fully connected layers of a traditional CNN into convolutional layers and upsamples the feature map of the last convolutional layer with a deconvolution (transposed convolution) layer to restore it to the same size as the input image, so that every pixel is predicted, pixel-level classification of the image is achieved, and semantic-level image segmentation is realized.
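The deconvolution upsampling that the FCN uses to restore the last feature map to the input size can be illustrated with a single-channel NumPy sketch. In a real FCN the kernels are learned during training; the fixed bilinear-style kernel and the names below are assumptions for illustration only.

```python
import numpy as np

def transposed_conv2d(feat, kernel, stride=2):
    """Scatter each feature value into the output, weighted by the kernel:
    the reverse of a strided convolution, so a small map is upsampled."""
    fh, fw = feat.shape
    kh, kw = kernel.shape
    out = np.zeros(((fh - 1) * stride + kh, (fw - 1) * stride + kw))
    for i in range(fh):
        for j in range(fw):
            out[i * stride:i * stride + kh,
                j * stride:j * stride + kw] += feat[i, j] * kernel
    return out

# a fixed bilinear-style kernel, often used to initialise FCN upsampling
bilinear3 = np.array([[0.25, 0.5, 0.25],
                      [0.5,  1.0, 0.5],
                      [0.25, 0.5, 0.25]])
```

A 4×4 feature map upsampled with stride 2 and this 3×3 kernel yields a 9×9 output ((4 − 1) × 2 + 3); cropping one border row and column recovers exactly twice the input size, which is how repeated transposed convolutions bring a coarse feature map back to image resolution.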
The invention utilizes the near-ground image data acquired from a camera (handheld/vehicle-mounted/unmanned aerial vehicle) and a smart phone, adopts a method combining deep learning and a superpixel segmentation algorithm, realizes the fine damage detection of the building after the earthquake, and is characterized in that:
(1) the data sources employed are cameras (handheld/vehicle-mounted/unmanned aerial vehicle) and smart phones. On the one hand, owing to the maturity of the related technology, using cameras (handheld/vehicle-mounted/unmanned aerial vehicle) and smart phones to acquire near-ground images of earthquake-stricken buildings offers timeliness and an abundance of sources, providing a good guarantee for post-earthquake emergency work; on the other hand, the near-ground image data acquired by these devices have high resolution, enabling fine damage detection of buildings and overcoming inherent defects of satellite and airborne data such as building facade information loss and ground-object occlusion.
(2) The data acquisition means adopted is multi-view/surround shooting of buildings from low-altitude or ground platforms. Taking the unmanned aerial vehicle as an example, existing methods for post-disaster building damage assessment with unmanned aerial vehicles mainly obtain oblique, downward-looking facade images by flying large-scale cruising routes over the disaster area. Like airborne remote sensing imagery, such data are limited by the high shooting angle, the complex terrain of disaster areas and other factors, and suffer from building facade information loss and ground-object occlusion. The method provided by the invention is applicable to oblique downward-looking images acquired along large-scale unmanned aerial vehicle cruising routes, but leans toward using cameras (handheld/vehicle-mounted/unmanned aerial vehicle) and smart phones for low-altitude/ground, short-distance, multi-view/surround shooting of a single building or a small group of buildings to acquire near-ground image data.
(3) The adopted building damage detection method is a two-step extraction method. Firstly, processing input near-ground image data of a building to be detected by adopting a deep learning method, and outputting a building damage pre-detection result; and secondly, fusing the building damage pre-detection result extracted in the first step by using the super-pixel segmentation result of the input image, so as to obtain a building damage fine detection result after noise removal and boundary optimization. Compared with the traditional one-step extraction method, the two-step extraction method can obviously improve the damage detection precision of the building.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a Mask R-CNN model used in the embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a Faster R-CNN model according to an embodiment of the present invention;
fig. 4 is a schematic structural composition diagram of an FCN model according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a backbone architecture (FPN network) of a Mask R-CNN model according to an embodiment of the present invention;
fig. 6 is a schematic diagram illustrating an effect of performing super-pixel fusion processing on a building damage pre-detection result according to an embodiment of the present invention. The graph (a) is a building damage pre-detection result output by a Mask R-CNN model, and the graph (b) is a building damage fine detection result after super-pixel fusion processing.
Detailed Description
The technical solution of the present invention will be described in detail below with reference to the accompanying drawings and examples, so that the technical contents thereof will be more clear and easy to understand. It should be noted that the scope of the present invention is not limited to the embodiments mentioned herein.
As shown in fig. 1, the embodiment comprises the following steps:
Step one, preprocessing the building near-ground image data acquired by a camera (handheld/vehicle-mounted/unmanned aerial vehicle) or smart phone after an earthquake, and preparing a building damage labeling sample set.
(1) readjusting, through resampling, the width and height of the camera (handheld/vehicle-mounted/unmanned aerial vehicle) or smart phone image data acquired after the earthquake so as to meet the input requirements of subsequent model training; in this embodiment, the image width and height are set to 3000 px;
(2) carrying out ambiguity analysis on image data of a camera (handheld/vehicle-mounted/unmanned aerial vehicle) or a smart phone acquired after an earthquake:
a) converting the near-ground image of the building into a gray image;
b) calculating the Laplacian variance of the gray image: 1) first, performing edge detection on the image with the Laplacian of Gaussian (LoG) operator; 2) then, for the edge detection result of 1), calculating the variance within a (2 × 3) sampling window, the variance formula being:

$V = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$

wherein $x_i$ is the LoG value of the i-th pixel in the sampling window, $\bar{x}$ is the mean LoG value of all pixels in the sampling window, and n is the total number of pixels contained in the sampling window.
c) averaging the variance values of the windows returned in step b) and taking the mean as the measure of the image ambiguity level; a normally sharp image has a large variance value, while a blurred image has a small one.
(3) manually labeling damage objects on the images that pass the ambiguity analysis; in this embodiment, damage objects such as wall collapse and wall cracks in the building facade images are manually plotted mainly by means of the labelme labeling tool.
And step two, substituting the training samples of the damaged buildings and the corresponding labeling information into the deep neural network model for training.
(1) in this embodiment, the Mask R-CNN model, a deep learning instance segmentation algorithm, is adopted to carry out damage detection on the post-earthquake near-ground building image data.
As shown in fig. 2, the instance segmentation model Mask R-CNN adopted in this embodiment mainly comprises the target detection model Faster R-CNN and the semantic segmentation model FCN (Fully Convolutional Network). As shown in fig. 3, structurally, the Faster R-CNN first adopts a feature extraction network to extract the salient features of different targets from the image and generate a feature map; next, in the RPN stage, several candidate ROIs are taken at each anchor point on the feature map, foreground is distinguished from background, and the ROI positions are preliminarily adjusted; further, in the detection network, different target categories are distinguished and the ROI positions are finely adjusted. As shown in fig. 4, structurally, the FCN converts the fully connected layers of the traditional CNN into convolutional layers and upsamples the feature map of the last convolutional layer with a deconvolution (transposed convolution) layer to restore it to the same size as the input image, thereby predicting every pixel, achieving pixel-level classification of the image and solving semantic-level image segmentation. In this step, the Mask R-CNN model first adopts the Faster R-CNN network to detect building damage objects from the input image and mark each with a detection frame (ROI), and then adopts the FCN network to perform semantic-level segmentation and pixel-by-pixel labeling on the detected ROI regions.
As shown in fig. 5, the instance segmentation model Mask R-CNN adopted in this embodiment uses an FPN (Feature Pyramid Network) as the backbone network architecture. By applying convolution kernels of different sizes to the input image, the FPN can simultaneously extract image feature information at different scales and can therefore detect building damage objects of different scales in the image. In this way, problems such as omission of small-scale damage objects and fragmentation of large-scale damage objects during building damage detection are avoided as far as possible, so that the damage detection method provided by the invention has a certain robustness across scales.
(2) substituting the manually labeled sample images and the corresponding labeling information into the Mask R-CNN model for training; the trained Mask R-CNN model has the capability of detecting and segmenting damage objects in building near-ground image data.
And step three, substituting the building near-ground image data to be detected into the deep neural network model trained in the step two to obtain a building damage pre-detection result.
(1) substituting the building near-ground image data to be detected into the instance segmentation model trained in step two to obtain the segmented damage pre-detection result;
(2) as shown in fig. 6(a), in the damage pre-detection result generated in step (1), pre-detection and positioning of the damaged object in the building image after the earthquake are realized.
And step four, performing superpixel segmentation on the building near-ground image data to be detected, and performing fusion processing on the building damage pre-detection result obtained in the step three by using a superpixel segmentation result to obtain a building damage fine detection result.
(1) performing object-oriented superpixel segmentation on the near-ground image of the building to be detected by means of the multiresolution segmentation algorithm provided by eCognition software, adjusting the Scale Parameter to a suitable value. In this embodiment, the scale parameter is set to 200 (image size 3000 px by 3000 px);
(2) fusing, based on a Majority Voting rule, the image superpixel segmentation result obtained in step (1) with the building damage pre-detection result generated in step three, thereby carrying out post-processing optimization of the building damage pre-detection regions by means of the homogeneous clustering property of the superpixel segmentation algorithm.
According to the Majority Voting rule adopted in this embodiment, firstly, the image superpixel segmentation result map is divided into different regions based on the segmented superpixel blocks; then, within each region, the number of pixels belonging to each category in the corresponding damage pre-detection result is counted; finally, according to the pixel-count statistics, the category with the largest total number of pixels is taken as the category label of the superpixel block region. The formula is expressed as follows:

$L_r = \arg\max_{c \in \{1, 2, \dots, M\}} \sum_{(i,j) \in r} \operatorname{sgn}\big(f(r(i,j)) = c\big)$

wherein $L_r$ is the category label to which region r belongs, M is the total number of categories in the damage pre-detection result, r(i, j) is the pixel with coordinates (i, j) in region r, f(r(i, j)) is the category label to which pixel r(i, j) belongs, and sgn(x) is a mathematical indicator function returning 1 if f(r(i, j)) = c and 0 otherwise.
(3) As shown in fig. 6, the building damage detection results before and after the superpixel fusion processing are presented, where (a) is the building damage pre-detection result before superpixel fusion and (b) is the building damage fine detection result after superpixel fusion. It can be seen that before superpixel fusion, the deep learning instance segmentation method, although identifying and locating the damage objects in the image fairly accurately, loses some detailed boundary information of the damage objects owing to the stacking of convolutional layers. The superpixel segmentation algorithm clusters homogeneous pixels in the image and retains relatively complete boundary information for each superpixel block, so the building damage fine detection result after superpixel fusion both detects and locates the damage objects accurately and retains their rich boundary information, remarkably improving the damage detection precision.
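The multiresolution segmentation provided by eCognition, used in step four, is proprietary software; an open-source stand-in for producing superpixel blocks is SLIC-style clustering, i.e. localized k-means over intensity and position. The toy NumPy sketch below is an assumption-laden illustration (grayscale only, fixed iteration count, hypothetical names), not the algorithm of the embodiment; its label map merely plays the role of the superpixel blocks fed into the majority-voting fusion.

```python
import numpy as np

def slic_superpixels(gray, n_segments=16, compactness=0.1, n_iter=5):
    """SLIC-style superpixels: k-means over (intensity, row, col),
    with each cluster centre searching only a local 2*step window."""
    gray = gray.astype(float)
    h, w = gray.shape
    step = int(np.sqrt(h * w / n_segments))
    # initialise centres on a regular grid: (row, col, intensity)
    centers = np.array([[y, x, gray[y, x]]
                        for y in range(step // 2, h, step)
                        for x in range(step // 2, w, step)])
    yy, xx = np.mgrid[0:h, 0:w]
    labels = np.zeros((h, w), dtype=int)
    for _ in range(n_iter):
        dist = np.full((h, w), np.inf)
        for k, (cy, cx, ci) in enumerate(centers):
            y0, y1 = max(0, int(cy) - step), min(h, int(cy) + step + 1)
            x0, x1 = max(0, int(cx) - step), min(w, int(cx) + step + 1)
            # intensity distance plus spatially-weighted position distance
            d = ((gray[y0:y1, x0:x1] - ci) ** 2
                 + compactness * ((yy[y0:y1, x0:x1] - cy) ** 2
                                  + (xx[y0:y1, x0:x1] - cx) ** 2) / step ** 2)
            better = d < dist[y0:y1, x0:x1]
            dist[y0:y1, x0:x1][better] = d[better]
            labels[y0:y1, x0:x1][better] = k
        for k in range(len(centers)):  # recompute centres from members
            m = labels == k
            if m.any():
                centers[k] = [yy[m].mean(), xx[m].mean(), gray[m].mean()]
    return labels
```

The `compactness` parameter trades boundary adherence against region regularity, loosely analogous to the role of the scale parameter in the embodiment's multiresolution segmentation.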