Background
With the spread of information technology, automatic teller machines (hereinafter ATMs) are used ever more widely. Because they are unattended and handle electronic financial transactions, ATMs bring convenience to the public but also create opportunities for criminals. Actual cases show that offenders often wear sunglasses, masks or hats to hide their faces. To prevent such cases in time, it is therefore particularly important to detect and identify faces covered by such accessories.
There has been little research on detecting an occluded face and identifying the type of occluding accessory. Froba et al. proposed the modified census transform (MCT) algorithm. Dong and Soh detect a moving target with a Gaussian mixture model, segment the face region by skin color detection on the moving target, extract facial features, and decide whether the face is occluded by means of a threshold. Yuan Baohua judges, in a located head image, whether the face carries occluding accessories by checking whether facial features are missing; the face image is mostly processed with simple image processing methods. Lie proposes first detecting a possible face and then judging whether it is occluded by accessories, using a frame difference method and an AdaBoost classifier for detection. Guo performs detection using ellipse fitting together with traditional skin color detection and facial feature detection. In general, however, the methods proposed in China and abroad for detecting and classifying faces with occluding accessories are not yet mature and still have many shortcomings.
In summary, existing methods have the following main shortcomings: (1) with skin color detection, no uniform threshold can be set that separates the face from the background across different skin tones, and accessories such as masks and hats can cover most of the face, making skin difficult to detect; (2) with facial feature detection, the features may be hidden when the face is occluded by an accessory, and because a customer's pose changes considerably during an actual deposit or withdrawal (for example when the head is lowered or turned to the side), the false detection rate is relatively high; (3) with template matching, head contours differ greatly, so a simple template model cannot guarantee the detection rate, while a complex template model is computationally expensive, slow and poorly suited to real-time use, making it difficult to apply to real video.
Disclosure of Invention
The invention aims to provide a face occlusion detection method and system based on video monitoring that achieve face occlusion detection with low time consumption, good real-time performance, high precision and low error. In particular, the method and system detect in real time faces wearing either of two occluding accessories, sunglasses and scarves, determine which of the two is present, and issue a warning when such a face is detected, thereby actively preventing criminal behavior. The method and system can be applied to the surveillance video of bank ATMs.
In order to achieve the purpose, the invention provides the following scheme:
a face occlusion detection method based on video monitoring comprises the following steps:
acquiring a dynamic video image acquired by video monitoring equipment;
detecting a moving object of the dynamic video image by adopting an algorithm combining a three-frame difference method and a Gaussian mixture background model, and determining a region image of the moving object;
extracting and marking each connected region in the moving object region image, and determining the maximum connected region;
carrying out vertical and horizontal projection processing on the maximum connected region, and intercepting a face region image;
and carrying out face occlusion detection on the face region image by adopting a K-nearest neighbor classification algorithm and a local binary pattern algorithm.
Optionally, the face occlusion detection method further includes:
and when the face is detected to be occluded, the video monitoring equipment sends out an alarm prompt.
Optionally, the detecting a moving object of the dynamic video image and determining a region image of the moving object by using an algorithm combining a three-frame difference method and a Gaussian mixture background model specifically includes:
judging whether a moving object enters the dynamic video image in real time by adopting an algorithm combining a three-frame difference method and a Gaussian mixture background model to obtain a first judgment result;
if the first judgment result shows that the moving object enters the dynamic video image, determining that the moving object exists in the dynamic video image, and extracting an image area corresponding to the moving object;
and if the first judgment result shows that no moving object has entered the dynamic video image, updating the background parameters of the Gaussian mixture background model.
Optionally, the extracting and marking each connected region in the moving object region image, and determining the maximum connected region specifically include:
carrying out mathematical morphology processing on the moving object region image;
extracting and marking each connected region of the processed moving object region image;
calculating the area of each marked connected region;
and determining the maximum connected region according to the area of each marked connected region.
Optionally, the performing vertical and horizontal projection processing on the maximum connected region to capture a face region image specifically includes:
acquiring the vertical coordinate of the highest point of the moving object contour in the maximum connected region by adopting a bwboundaries function;
performing vertical projection processing on the maximum connected region to obtain a vertical projection image;
determining two critical values of the vertical projection image, wherein the two critical values are a vertical first critical value and a vertical second critical value respectively; the vertical first critical value and the vertical second critical value are horizontal coordinates of left and right boundary points of the face area;
carrying out horizontal projection processing on the vertical projection image to obtain a horizontal projection image;
determining a horizontal critical value of the horizontal projection image; the horizontal critical value is a vertical coordinate of a critical point of a chin and a neck in the face area;
and intercepting a face region image from the moving object region image according to the vertical first critical value, the vertical second critical value, the horizontal critical value and the highest point vertical coordinate.
The invention also provides a face occlusion detection system based on video monitoring, which comprises:
the dynamic video image acquisition module is used for acquiring a dynamic video image acquired by the video monitoring equipment;
the moving object region image determining module is used for detecting a moving object of the dynamic video image by adopting an algorithm combining a three-frame difference method and a Gaussian mixture background model and determining a moving object region image;
the maximum connected region determining module is used for extracting and marking each connected region in the moving object region image and determining the maximum connected region;
the face region image intercepting module is used for performing vertical and horizontal projection processing on the maximum connected region and intercepting a face region image;
and the face occlusion detection module is used for carrying out face occlusion detection on the face region image by adopting a K-nearest neighbor classification algorithm and a local binary pattern algorithm.
Optionally, the face occlusion detection system further includes:
and the alarm reminding module is used for causing the video monitoring equipment to send out an alarm prompt when the face is detected to be occluded.
Optionally, the moving object region image determining module specifically includes:
the first judgment result obtaining unit is used for judging whether the moving object enters the dynamic video image in real time by adopting an algorithm combining a three-frame difference method and a Gaussian mixture background model to obtain a first judgment result;
a moving object region image extraction unit, configured to determine that the moving object exists in the dynamic video image and extract an image region corresponding to the moving object if the first determination result indicates that the moving object enters the dynamic video image;
and the background parameter updating unit is used for updating the background parameters of the Gaussian mixture background model when the first judgment result shows that no moving object has entered the dynamic video image.
Optionally, the maximum connected region determining module specifically includes:
the mathematical morphology processing unit is used for carrying out mathematical morphology processing on the moving object region image;
a connected region extraction marking unit for extracting and marking each connected region of the processed moving object region image;
a connected region area calculation unit for calculating the area of each marked connected region;
and a maximum connected region determining unit configured to determine the maximum connected region according to the area of each marked connected region.
Optionally, the face region image intercepting module specifically includes:
the highest point vertical coordinate acquisition unit is used for acquiring the vertical coordinate of the highest point of the moving object contour in the maximum connected region by adopting a bwboundaries function;
a vertical projection image obtaining unit, configured to perform vertical projection processing on the maximum connected region to obtain a vertical projection image;
the vertical critical value determining unit is used for determining two critical values of the vertical projection image, wherein the two critical values are a vertical first critical value and a vertical second critical value respectively; the vertical first critical value and the vertical second critical value are horizontal coordinates of left and right boundary points of the face area;
a horizontal projection image obtaining unit for performing horizontal projection processing on the vertical projection image to obtain a horizontal projection image;
a horizontal critical value determining unit for determining a horizontal critical value of the horizontal projection image; the horizontal critical value is a vertical coordinate of a critical point of a chin and a neck in the face area;
and the human face region image intercepting unit is used for intercepting a human face region image from the moving object region image according to the vertical first critical value, the vertical second critical value, the horizontal critical value and the highest point vertical coordinate.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
The invention provides a face occlusion detection method and system based on video monitoring. The method comprises: acquiring a dynamic video image collected by video monitoring equipment; detecting a moving object of the dynamic video image by adopting an algorithm combining a three-frame difference method and a Gaussian mixture background model, and determining a moving object region image; extracting and marking each connected region in the moving object region image, and determining the maximum connected region; carrying out vertical and horizontal projection processing on the maximum connected region, and intercepting a face region image; and carrying out face occlusion detection on the face region image by adopting a K-nearest neighbor classification algorithm and a local binary pattern algorithm. With these algorithms and this means of determining the face region image, face occlusion detection can be achieved with low time consumption, good real-time performance, high precision and low error.
In addition, when the face is detected to be occluded, the video monitoring equipment sends out an alarm prompt, actively preventing criminal behavior.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide a face occlusion detection method and system based on video monitoring that achieve face occlusion detection with low time consumption, good real-time performance, high precision and low error. In particular, the method and system detect in real time faces wearing either of two occluding accessories, sunglasses and scarves, determine which of the two is present, and issue a warning when such a face is detected, thereby actively preventing criminal behavior. The method and system can be applied to the surveillance video of bank ATMs.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Fig. 1 is a schematic flow chart of a face occlusion detection method based on video monitoring according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps.
Step 101: acquiring a dynamic video image collected by the video monitoring equipment.
Step 102: detecting a moving object of the dynamic video image by adopting an algorithm combining a three-frame difference method and a Gaussian mixture background model, and determining a moving object region image.
Step 103: extracting and marking each connected region in the moving object region image, and determining the maximum connected region.
Step 104: carrying out vertical and horizontal projection processing on the maximum connected region, and intercepting a face region image.
Step 105: carrying out face occlusion detection on the face region image by adopting a K-nearest neighbor classification algorithm and a local binary pattern algorithm.
Step 106: when the face is detected to be occluded, the video monitoring equipment sends out an alarm prompt.
Step 102 specifically includes the following.
Because the external environment of an ATM changes, moving object detection requires a method that can update the background, and the running speed of the code must also be taken into account. The method therefore combines a three-frame difference method with a Gaussian mixture background model, which improves detection speed while giving a better detection result. In addition, the update interval of the background parameters is limited, which further improves running speed.
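As an illustration of the three-frame difference part only, the following minimal sketch marks a pixel as moving when the middle frame differs from both its neighbors. It assumes grayscale frames stored as nested lists; the function name `three_frame_difference` and the threshold of 25 are illustrative choices, not values taken from this description, and the combination with the Gaussian mixture background model is not shown.

```python
def three_frame_difference(prev, curr, nxt, threshold=25):
    """Return a binary motion mask (lists of 0/1) for the middle frame.

    A pixel is marked as moving only if it differs from BOTH the previous
    and the next frame by more than `threshold` (an assumed value).
    """
    height, width = len(curr), len(curr[0])
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            d1 = abs(curr[y][x] - prev[y][x])
            d2 = abs(nxt[y][x] - curr[y][x])
            if d1 > threshold and d2 > threshold:
                mask[y][x] = 1
    return mask


# Toy 1x5 grayscale frames: a bright object moves one pixel to the right per frame.
frame1 = [[200, 10, 10, 10, 10]]
frame2 = [[10, 200, 10, 10, 10]]
frame3 = [[10, 10, 200, 10, 10]]
motion_mask = three_frame_difference(frame1, frame2, frame3)
```

Requiring both differences to exceed the threshold keeps only the object's position in the middle frame, rather than also marking where it was in the previous frame or will be in the next.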
Fig. 2 is a schematic flow chart of the method for determining the moving object region image according to the present invention. As shown in Fig. 2, the method includes:
Step 201: judging in real time whether a moving object enters the dynamic video image by adopting an algorithm combining a three-frame difference method and a Gaussian mixture background model, to obtain a first judgment result; if the first judgment result indicates that a moving object has entered the dynamic video image, executing step 202; if the first judgment result indicates that no moving object has entered the dynamic video image, executing step 203.
Step 202: determining that the moving object exists in the dynamic video image, and extracting the image region corresponding to the moving object.
Step 203: updating the background parameters of the Gaussian mixture background model.
Updating the background parameters addresses the interference that changes in the video scene (such as lighting) cause for detection. Each pixel is modeled, with parameters such as weight, learning rate and standard deviation. In each frame, the new pixel value is compared with the established models to judge whether it matches one of them. If it matches, the pixel is assigned to that model and the model is updated according to the new pixel value. If it does not match, a new Gaussian model is created from the pixel with initialized parameters and replaces the least probable of the existing models.
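The per-pixel match-and-update rule can be sketched as follows for a single grayscale pixel. This is a simplified illustration rather than the exact update used in the embodiment: the matching test (within 2.5 standard deviations), the use of one learning rate for weight, mean and variance, the ranking of components by weight divided by standard deviation, and all numeric defaults are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    weight: float
    mean: float
    std: float


def update_pixel_model(model, value, learning_rate=0.05, match_sigmas=2.5,
                       init_std=15.0, init_weight=0.05):
    """Update one pixel's mixture in place; return True if `value` matched."""
    matched = None
    for g in model:
        if abs(value - g.mean) <= match_sigmas * g.std:
            matched = g
            break
    if matched is not None:
        # Raise the matched component's weight and decay the others.
        for g in model:
            hit = 1.0 if g is matched else 0.0
            g.weight = (1 - learning_rate) * g.weight + learning_rate * hit
        # Pull the matched mean and variance towards the new value.
        diff = value - matched.mean
        matched.mean += learning_rate * diff
        variance = (1 - learning_rate) * matched.std ** 2 + learning_rate * diff ** 2
        matched.std = variance ** 0.5
    else:
        # No match: replace the least probable component (assumed here to be
        # the one with the lowest weight/std) by a new Gaussian on the value.
        weakest = min(model, key=lambda g: g.weight / g.std)
        weakest.weight, weakest.mean, weakest.std = init_weight, float(value), init_std
    total = sum(g.weight for g in model)
    for g in model:
        g.weight /= total  # keep the weights summing to 1
    return matched is not None


model = [Gaussian(0.7, 100.0, 10.0), Gaussian(0.3, 180.0, 10.0)]
first_matched = update_pixel_model(model, 104)   # close to the first component
second_matched = update_pixel_model(model, 30)   # far from both components
```

In the example, the value 104 is absorbed by the first component, while the value 30 matches nothing and replaces the weaker second component.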
The update interval of the background parameters is limited for two reasons. First, updating the background repeatedly effectively suppresses the effect of video jitter. Second, because most bank ATMs are fixed in place, the background could be updated at any time (background updating occurs only when no moving object has entered), but a time period can instead be set (its length chosen according to the actual scene) and the update carried out once per period, which improves running speed.
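One way to express this rule is a small scheduler that allows a background update only when no moving object is present and at least one period has elapsed since the last update. The class name `BackgroundUpdateScheduler`, its method and the 10-second period below are hypothetical and chosen only for illustration.

```python
class BackgroundUpdateScheduler:
    """Decide when the background model may be updated (illustrative sketch)."""

    def __init__(self, interval_seconds):
        self.interval_seconds = interval_seconds
        self.last_update = None  # time of the most recent update, if any

    def should_update(self, now, moving_object_present):
        # Never update while a moving object is in the scene.
        if moving_object_present:
            return False
        # Skip the update if the current period has not elapsed yet.
        if self.last_update is not None and now - self.last_update < self.interval_seconds:
            return False
        self.last_update = now
        return True


scheduler = BackgroundUpdateScheduler(interval_seconds=10.0)
events = [(0.0, False), (4.0, False), (10.0, True), (12.0, False), (15.0, False)]
decisions = [scheduler.should_update(t, moving) for t, moving in events]
```

Here the update at 4 s is skipped because the period has not elapsed, and the one at 10 s is skipped because a moving object is present.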
After step 102, the detected moving object is as shown in Fig. 3. White point noise appears in the result, and holes appear within the required moving object region. The following measures are taken. First, mathematical morphology is applied: dilation over a small area, hole filling, and then erosion over a large area; the result is shown in Fig. 4. Each connected region is then marked and its size computed to obtain the index of the maximum connected region, which is shown schematically in Fig. 5.
Therefore, step 103 specifically includes:
Carrying out mathematical morphology processing on the moving object region image.
Extracting and marking each connected region of the processed moving object region image.
Calculating the area of each marked connected region.
Determining the maximum connected region according to the area of each marked connected region.
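The sub-steps of step 103 can be sketched on a small binary mask as follows. The sketch uses a plain 3x3 dilation followed by a 3x3 erosion (a morphological closing) in place of the small-area dilation, hole filling and large-area erosion described above, and labels regions with 4-connectivity; both choices, and all function names, are assumptions made to keep the example short.

```python
from collections import deque


def _neighborhood(mask, y, x):
    """Values in the 3x3 window around (y, x), clipped at the image border."""
    h, w = len(mask), len(mask[0])
    return [mask[ny][nx]
            for ny in range(max(0, y - 1), min(h, y + 2))
            for nx in range(max(0, x - 1), min(w, x + 2))]


def dilate(mask):
    """A pixel becomes 1 if any pixel in its 3x3 window is 1."""
    return [[int(any(_neighborhood(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def erode(mask):
    """A pixel stays 1 only if every pixel in its 3x3 window is 1."""
    return [[int(all(_neighborhood(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def label_regions(mask):
    """Label 4-connected regions; return (label image, {label: area})."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    areas = {}
    next_label = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not labels[y][x]:
                next_label += 1
                labels[y][x] = next_label
                queue = deque([(y, x)])
                area = 0
                while queue:  # breadth-first flood fill of one region
                    cy, cx = queue.popleft()
                    area += 1
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
                areas[next_label] = area
    return labels, areas


def largest_region(mask):
    """Return a mask that keeps only the connected region with the largest area."""
    labels, areas = label_regions(mask)
    if not areas:
        return [[0] * len(mask[0]) for _ in mask]
    best = max(areas, key=areas.get)
    return [[int(v == best) for v in row] for row in labels]


# Toy mask: one isolated noise pixel and a 3x3 object with a hole in its center.
raw = [[0] * 11 for _ in range(7)]
raw[3][2] = 1
for y in range(2, 5):
    for x in range(6, 9):
        raw[y][x] = 1
raw[3][7] = 0
closed = erode(dilate(raw))
_, region_areas = label_regions(closed)
biggest = largest_region(closed)
```

In this toy case the closing fills the hole in the object, and selecting the maximum connected region discards the isolated noise pixel.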
In order to obtain an accurate face region, the maximum connected region is subjected to vertical projection and horizontal projection respectively.
First, vertical projection is performed; the result is shown in Fig. 6. Two special points can be seen in Fig. 6 and are marked in the schematic diagram of Fig. 7. The marked points give the horizontal coordinates of the left and right boundary points of the face region, denoted a1 and a2 respectively.
Then, horizontal projection is performed; the result is shown in Fig. 8. Similarly, one special point can be seen in Fig. 8 and is marked in the schematic diagram of Fig. 9. This point gives the vertical coordinate of the critical point between the chin and the neck of the detected face region, denoted b1.
The idea for finding a1, a2 and b1 is as follows. Since a1, a2 and b1 are the points at which the function corresponding to the projection image starts to change rapidly, the first derivative of that function can be calculated at each point. The larger the magnitude of the derivative, the steeper the curve, and the three critical points are found accordingly.
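A minimal sketch of this idea on a toy head-and-shoulders silhouette is given below. It approximates the first derivative by the difference between adjacent projection values. Taking a1 as the column after the steepest rise, a2 as the column before the steepest fall, and b1 as the row before the silhouette widens into the shoulders is an assumed reading of "starts to change rapidly", and the function names are illustrative.

```python
def vertical_projection(mask):
    """Number of foreground pixels in each column."""
    return [sum(column) for column in zip(*mask)]


def horizontal_projection(mask):
    """Number of foreground pixels in each row."""
    return [sum(row) for row in mask]


def first_difference(profile):
    """Discrete first derivative: difference between neighboring samples."""
    return [b - a for a, b in zip(profile, profile[1:])]


def face_bounds(mask):
    """Return (a1, a2, b1) located at the steepest changes of the projections."""
    dv = first_difference(vertical_projection(mask))
    dh = first_difference(horizontal_projection(mask))
    a1 = dv.index(max(dv)) + 1  # first column after the steepest rise
    a2 = dv.index(min(dv))      # last column before the steepest fall
    b1 = dh.index(max(dh))      # last row before the silhouette widens
    return a1, a2, b1


# Toy silhouette: a head three pixels wide above shoulders seven pixels wide.
silhouette = [
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
a1, a2, b1 = face_bounds(silhouette)
```

For this silhouette the vertical projection is [2, 2, 6, 6, 6, 2, 2] and the horizontal projection is [3, 3, 3, 3, 7, 7], so the head occupies columns 2 to 4 and ends at row 3.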
Then the vertical coordinate min_y of the highest point of the moving object contour is found using the bwboundaries function. The imcrop function is then called as imcrop(original image, [a1(1), min_y+30, a2(1)-a1(1), b1-min_y]) to cut the desired face region image out of the original image (the moving object region image).
That is, step 104 specifically includes:
Acquiring the vertical coordinate of the highest point of the moving object contour in the maximum connected region by adopting a bwboundaries function.
Carrying out vertical projection processing on the maximum connected region to obtain a vertical projection image.
Determining two critical values of the vertical projection image, wherein the two critical values are a vertical first critical value and a vertical second critical value respectively; the vertical first critical value and the vertical second critical value are the horizontal coordinates of the left and right boundary points of the face region.
Carrying out horizontal projection processing on the vertical projection image to obtain a horizontal projection image.
Determining a horizontal critical value of the horizontal projection image; the horizontal critical value is the vertical coordinate of the critical point between the chin and the neck in the face region.
Intercepting a face region image from the moving object region image according to the vertical first critical value, the vertical second critical value, the horizontal critical value and the highest point vertical coordinate.
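The final interception amounts to cropping a rectangle whose corner and size are derived from the four values. The sketch below mirrors the rectangle form [x, y, width, height] of the imcrop call quoted earlier, using zero-based indices and treating both end rows and end columns as included, which is an assumption about the intended convention. The `top_offset` parameter stands in for the scene-specific +30 offset and is hypothetical.

```python
def crop_rect(image, x0, y0, width, height):
    """Crop rows y0..y0+height and columns x0..x0+width (inclusive), clipped to the image."""
    rows = image[max(0, y0): y0 + height + 1]
    return [row[max(0, x0): x0 + width + 1] for row in rows]


def crop_face(image, a1, a2, b1, min_y, top_offset=0):
    """Build the rectangle [a1, min_y + top_offset, a2 - a1, b1 - min_y] and crop it."""
    return crop_rect(image, a1, min_y + top_offset, a2 - a1, b1 - min_y)


# Toy 6x7 image whose pixel value encodes its position as 10 * row + column.
image = [[10 * y + x for x in range(7)] for y in range(6)]
face = crop_face(image, a1=2, a2=4, b1=3, min_y=0)
```

With a1 = 2, a2 = 4, b1 = 3 and min_y = 0, the crop keeps rows 0 to 3 and columns 2 to 4 of the toy image.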
After the face region image is intercepted, face occlusion detection is performed using a K-nearest neighbor (KNN) classification algorithm and a local binary pattern (LBP) algorithm; the specific flow is shown in Fig. 10.
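As a rough illustration of how the two algorithms fit together, the sketch below computes a basic 8-neighbor LBP histogram for a grayscale patch and classifies it by majority vote among its k nearest training histograms. The Euclidean distance, k = 3, the function names and the tiny 3x3 training patches with their labels are all assumptions for the example; they are toy data, not face images from the AR database.

```python
from collections import Counter

# Offsets of the 8 neighbors, clockwise starting from the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(image):
    """Normalized 256-bin histogram of basic LBP codes over interior pixels."""
    h, w = len(image), len(image[0])
    hist = [0] * 256
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            center = image[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(OFFSETS):
                # Set the bit when the neighbor is at least as bright as the center.
                if image[y + dy][x + dx] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist) or 1
    return [count / total for count in hist]


def knn_classify(sample, training_set, k=3):
    """Majority label among the k training histograms closest to `sample`."""
    def squared_distance(item):
        return sum((a - b) ** 2 for a, b in zip(sample, item[0]))
    nearest = sorted(training_set, key=squared_distance)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy patches: flat texture versus a bright center; the labels are illustrative only.
flat_a = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
flat_b = [[8, 8, 8], [8, 8, 8], [8, 8, 8]]
peak_a = [[1, 1, 1], [1, 9, 1], [1, 1, 1]]
peak_b = [[2, 2, 2], [2, 7, 2], [2, 2, 2]]
training_set = [
    (lbp_histogram(flat_a), "sunglasses"),
    (lbp_histogram(flat_b), "sunglasses"),
    (lbp_histogram(peak_a), "normal"),
    (lbp_histogram(peak_b), "normal"),
]
query = [[7, 7, 7], [7, 7, 7], [7, 7, 7]]
predicted = knn_classify(lbp_histogram(query), training_set, k=3)
```

Because LBP codes depend only on the ordering of a pixel and its neighbors, the flat query patch produces the same histogram as the flat training patches regardless of its absolute brightness.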
The classical AR face database, shown in Fig. 11, is used for sample training.
A model is first trained on the order of two hundred images of normal faces, faces with sunglasses and faces with scarves; the trained model is then tested, and the detection results are shown in Table 1.
Table 1 Face occlusion training test results
| Occlusion type | Test pictures | Detection rate |
|---|---|---|
| Normal | 50 | 94% |
| Sunglasses | 120 | 95.4% |
| Scarf | 100 | 96% |
Then, the trained model is used to detect the played dynamic video images (the face region images obtained after processing). Owing to practical conditions, scarf detection is replaced by mask detection; the detection results are shown in Table 2.
Table 2 Actual face occlusion detection results
| Occlusion type | Number of entries | Detection rate |
|---|---|---|
| Normal | 50 | 90% |
| Sunglasses | 100 | 88% |
| Mask | 100 | 92% |
To achieve the above purpose, the invention also provides a face occlusion detection system based on video monitoring.
Fig. 12 is a schematic structural diagram of a face occlusion detection system based on video monitoring according to an embodiment of the present invention. As shown in Fig. 12, the face occlusion detection system according to the embodiment includes the following modules.
A dynamic video image obtaining module 100, configured to obtain a dynamic video image collected by a video monitoring device.
A moving object region image determining module 200, configured to detect a moving object of the dynamic video image by using an algorithm combining a three-frame difference method and a Gaussian mixture background model, and determine a moving object region image.
A maximum connected region determining module 300, configured to extract and mark each connected region in the moving object region image, and determine a maximum connected region.
A face region image intercepting module 400, configured to perform vertical and horizontal projection processing on the maximum connected region, and intercept a face region image.
A face occlusion detection module 500, configured to perform face occlusion detection on the face region image by using a K-nearest neighbor classification algorithm and a local binary pattern algorithm.
An alarm reminding module 600, configured to cause the video monitoring device to send out an alarm prompt when the face is detected to be occluded.
The moving object region image determining module 200 specifically includes:
A first judgment result obtaining unit, configured to judge in real time whether a moving object enters the dynamic video image by using an algorithm combining a three-frame difference method and a Gaussian mixture background model, to obtain a first judgment result.
A moving object region image extraction unit, configured to determine that the moving object exists in the dynamic video image and extract the image region corresponding to the moving object if the first judgment result shows that a moving object has entered the dynamic video image.
A background parameter updating unit, configured to update the background parameters of the Gaussian mixture background model when the first judgment result shows that no moving object has entered the dynamic video image.
The maximum connected region determining module 300 specifically includes:
A mathematical morphology processing unit, configured to perform mathematical morphology processing on the moving object region image.
A connected region extraction marking unit, configured to extract and mark each connected region of the processed moving object region image.
A connected region area calculating unit, configured to calculate the area of each marked connected region.
A maximum connected region determining unit, configured to determine the maximum connected region according to the area of each marked connected region.
The face region image intercepting module 400 specifically includes:
A highest point vertical coordinate acquisition unit, configured to acquire the vertical coordinate of the highest point of the moving object contour in the maximum connected region by using a bwboundaries function.
A vertical projection image obtaining unit, configured to perform vertical projection processing on the maximum connected region to obtain a vertical projection image.
A vertical critical value determining unit, configured to determine two critical values of the vertical projection image, wherein the two critical values are a vertical first critical value and a vertical second critical value respectively; the vertical first critical value and the vertical second critical value are the horizontal coordinates of the left and right boundary points of the face region.
A horizontal projection image obtaining unit, configured to perform horizontal projection processing on the vertical projection image to obtain a horizontal projection image.
A horizontal critical value determining unit, configured to determine a horizontal critical value of the horizontal projection image; the horizontal critical value is the vertical coordinate of the critical point between the chin and the neck in the face region.
A face region image intercepting unit, configured to intercept a face region image from the moving object region image according to the vertical first critical value, the vertical second critical value, the horizontal critical value and the highest point vertical coordinate.
Compared with the prior art, the invention has the following advantages:
Combining the three-frame difference method with the Gaussian mixture background model improves detection precision and saves the computation time of the system.
Limiting the update interval of the background parameters in the Gaussian mixture background model saves the computation time of the system without reducing adaptability to the actual scene.
Segmenting the face by a projection method saves time and gives a good detection result.
Detecting and classifying occluded faces with a method based on the K-nearest neighbor classification algorithm and the local binary pattern gives a good detection result.
The method or system provided by the invention relates to the field of intelligent monitoring, in particular to technologies such as image processing and machine learning. Aimed at criminal cases involving ATMs, the face occlusion detection method and system based on video monitoring achieve a good detection result for the common behavior of offenders wearing occluding accessories to hide their facial features, and can thus actively prevent criminal behavior.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.