Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
As shown in Fig. 1, the present application provides a method for identifying abnormal behavior in a financial escort process, which includes:
collecting personnel characteristic data, wherein the characteristic data comprises fingerprint data, facial data and iris data;
Fingerprint data, facial data and iris data are all unique biometric characteristics with high uniqueness and stability. In the financial escort process, the fingerprint data, facial data and iris data of escort personnel are collected to identify and verify personnel identity. In particular, because iris features are difficult to forge or imitate, iris recognition provides a higher level of authentication and strengthens the security measures that prevent unauthorized persons from accessing the escorted items.
Collecting personnel characteristic data, wherein the characteristic data comprises fingerprint data, facial data and iris data, comprises the following steps:
acquiring fingerprints by using professional fingerprint acquisition equipment to acquire fingerprint data;
acquiring face data of a person by using a high-resolution camera;
Performing multiple scans by using an iris acquisition device to acquire iris data under different angles and illumination conditions.
The optical fingerprint acquisition device captures the fine textures and characteristic points of the fingerprint, ensuring that the acquired fingerprint data have high resolution and definition. When a finger is pressed on the acquisition surface during fingerprint acquisition, the ridges and valleys on the surface of the finger reflect light differently, and the device captures these differences in reflected light through its sensor to form a fingerprint image. When collecting facial data, a high-resolution camera captures the facial details of a person and generates facial data, including the shape and outline of the eyes, nose, mouth and other parts, as well as skin texture and other characteristics. The facial expressions and movements of escort personnel are captured in real time, so that potential abnormal behaviors such as tension and anxiety can be identified. To adapt to different illumination environments, the camera is provided with automatic focusing and automatic exposure functions: in a dim environment the camera automatically increases the exposure, and in an overly bright environment it automatically reduces the exposure, so that the collected facial images have suitable contrast and brightness. The iris acquisition device is a high-precision biometric device that uses image processing and optical techniques to accurately capture the fine textures and characteristic points of the iris, realizing high-precision iris recognition. The iris acquisition device has a built-in dedicated optical system that can adjust the incidence angle and intensity of light within a certain range, focusing the light on the iris region and accurately capturing the texture information of the iris.
Extracting features from the personnel characteristic data, and establishing a multi-modal biological feature database;
First, feature extraction is performed on the personnel characteristic data obtained from the acquisition devices, covering fingerprint features, facial features and iris features. Fingerprint feature extraction determines the pattern type of the fingerprint image by analyzing the trend and mode of its overall ridge pattern, namely the whorl, loop and arch types: the whorl type has circular or spiral ridge lines, the loop type has dustpan-shaped ridge lines, and the arch type has smooth, arc-shaped ridge lines. The fingerprint image contains a number of unique minutiae, such as endpoints and bifurcation points. The fingerprint ridges are thinned into lines of single-pixel width by an image thinning method, and the coordinate position (x, y), direction (expressed as an angle) and type (endpoint or bifurcation point) of each minutia are recorded. Facial feature extraction locates facial key points such as the eye corners, nose and mouth corners through an active appearance model. A facial feature point detection algorithm learns the distribution pattern of the facial key points on a training set and searches for matching key point positions on the input facial image. The positions of the facial key points are then determined, and relations such as the relative distances and angles between the key points are calculated, for example the distance between the eyes and the distance from the tip of the nose to the corner of the mouth. Finally, texture information of the facial skin is extracted. Iris feature extraction locates the collected iris image to determine the inner and outer boundaries of the iris, normalizes the iris region to a fixed size, and filters the normalized iris region.
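As a minimal illustration of the key-point geometry step, the sketch below computes the distances and angles mentioned above; it assumes the landmark coordinates are already available from the feature point detector, and the landmark names used here are illustrative:

```python
import numpy as np

def landmark_geometry(pts):
    """Compute simple geometric relations between facial key points.

    `pts` maps landmark names to (x, y) coordinates, e.g. as produced by
    an active-appearance-model fit; the names below are illustrative.
    """
    def dist(a, b):
        return float(np.linalg.norm(np.subtract(pts[a], pts[b])))

    return {
        "eye_distance": dist("left_eye_corner", "right_eye_corner"),
        "nose_to_mouth": dist("nose_tip", "mouth_corner"),
        # angle of the line between the eye corners, in degrees
        "eye_line_angle": float(np.degrees(np.arctan2(
            pts["right_eye_corner"][1] - pts["left_eye_corner"][1],
            pts["right_eye_corner"][0] - pts["left_eye_corner"][0]))),
    }
```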
Extracting features from the personnel characteristic data, and establishing a multi-modal biological feature database, comprises the following steps:
Extracting characteristic information such as texture, direction and frequency of the fingerprint from the collected fingerprint data by using a fingerprint characteristic extraction algorithm;
Extracting feature information such as key points, textures, shapes and the like of the face from the acquired face data by using a face feature extraction algorithm, wherein the key points of the face comprise eye corners, mouth corners and the like;
extracting characteristic information such as texture, spots, lines and the like of the iris from the acquired iris data by using an iris characteristic extraction algorithm;
Integrating the extracted fingerprint features, facial features and iris features to form the multi-modal biological feature data of each person, and establishing the multi-modal biological feature database.
Feature extraction is performed by a convolutional neural network (CNN). First, a convolution operation is performed to output a feature map; the size of the feature map generated by the convolution operation is then calculated; next, a pooling operation is applied to the feature map; and finally a fully connected layer operation is performed.
The convolution operation expression is:

Y = conv2(W, X, 'valid') + b

wherein Y is the output feature map; W is the convolution kernel; X is the input image; 'valid' is the convolution type (no zero padding); b is a bias term; and conv2 denotes the two-dimensional convolution operation.
The size of the feature map generated after the convolution operation is calculated as:

o = (w - k + 2p) / s + 1

where o is the output size; w is the size of the input image; p is the number of padding layers; k is the size of the convolution kernel; s is the stride; and the division is rounded down.
The max pooling operation expression is:

S = max_pooling(X, k)

where S is the output feature map; max_pooling denotes the max pooling operation; X is the input feature map; and k is the window size.
The output Y of the fully connected layer is expressed as:

Y = W · A + b

wherein W represents the weights of the fully connected layer; A represents the output of the previous layer; and b represents the bias of the fully connected layer.
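The following NumPy sketch ties the four formulas above together; the array sizes and values are illustrative, and kernel flipping is omitted (cross-correlation), as is common in CNN implementations:

```python
import numpy as np

def conv2_valid(X, W, b=0.0):
    # Y = conv2(W, X, 'valid') + b: no zero padding, stride 1
    m, n = X.shape
    k = W.shape[0]
    Y = np.zeros((m - k + 1, n - k + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = np.sum(X[i:i + k, j:j + k] * W) + b
    return Y

def output_size(w, k, p, s):
    # o = (w - k + 2p) / s + 1, division rounded down
    return (w - k + 2 * p) // s + 1

def max_pooling(X, k):
    # S = max_pooling(X, k): non-overlapping k x k windows
    m, n = X.shape
    return X[:m - m % k, :n - n % k].reshape(m // k, k, n // k, k).max(axis=(1, 3))

def fully_connected(A, W, b):
    # Y = W . A + b
    return W @ A + b

X = np.random.rand(28, 28)               # input image
W = np.random.rand(3, 3)                 # 3 x 3 convolution kernel
feat = conv2_valid(X, W, b=0.1)          # 26 x 26, matching output_size(28, 3, 0, 1)
pooled = max_pooling(feat, 2)            # 13 x 13 after 2 x 2 max pooling
flat = pooled.flatten()
Wfc = np.random.rand(10, flat.size)      # fully connected layer weights
out = fully_connected(flat, Wfc, np.zeros(10))
```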
Acquiring real-time personnel feature data, and matching the extracted features of the real-time personnel feature data with the features stored in the multi-modal biological feature database by using a convolutional neural network to acquire a matching result;
Feature matching is performed by a convolutional neural network: a convolutional neural network is defined and trained to extract features from the input data, and a similarity matrix between samples is calculated from these features. In the similarity measurement stage, the similarity between feature vectors is calculated with measurement methods such as cosine similarity, completing the verification and matching tasks. Finally, the verification results of the multiple modalities are integrated to obtain the final verification conclusion, and based on this result, feedback (e.g., "verification success" or "verification failure") is provided to the user.
The output feature map expression is:

Y = conv2(W, X, 'valid')

wherein X is an m x n input image matrix; W is a k x k convolution kernel matrix (k < m, k < n); and Y is the resulting (m - k + 1) x (n - k + 1) output feature map matrix.
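A minimal sketch of the cosine similarity matching and multi-modal fusion described above; the function names, the averaging fusion and the threshold value are illustrative assumptions, and the feature vectors are assumed to come from the CNN described above:

```python
import numpy as np

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def match_person(live, database, threshold=0.85):
    """Fuse per-modality similarities against each enrolled person.

    `live` and each database entry map a modality name to its feature
    vector, e.g. {"fingerprint": ..., "face": ..., "iris": ...}.
    """
    best_id, best_score = None, -1.0
    for person_id, enrolled in database.items():
        sims = [cosine_similarity(live[m], enrolled[m]) for m in live]
        score = sum(sims) / len(sims)          # average fusion of the modalities
        if score > best_score:
            best_id, best_score = person_id, score
    if best_score >= threshold:
        return best_id, "verification success"
    return None, "verification failure"
```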
When the matching result indicates a successful match, continuously acquiring the video stream of the monitoring area in real time and preprocessing the video stream;
After the matching succeeds, the video stream of the monitoring area continues to be acquired in real time, and the acquired video frames are preprocessed for subsequent analysis in order to improve the visibility of the video images. First, the video stream images are resized. Because of the complexity of the monitoring environment, the video stream may contain various kinds of noise, so Gaussian filtering is performed for denoising. Contrast enhancement is then applied to the license plates of the escort vehicles and the faces of the escort personnel in the financial escort process.
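A minimal OpenCV sketch of this preprocessing chain; the target resolution and stream URL are illustrative, and CLAHE is used here as one common choice of contrast enhancement (the original does not name a specific method):

```python
import cv2

def preprocess_frame(frame, size=(640, 480)):
    # Resize to the analysis resolution
    frame = cv2.resize(frame, size)
    # Gaussian filtering to suppress noise from the monitoring environment
    frame = cv2.GaussianBlur(frame, (5, 5), 0)
    # Contrast enhancement on the luminance channel via CLAHE
    ycrcb = cv2.cvtColor(frame, cv2.COLOR_BGR2YCrCb)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    ycrcb[:, :, 0] = clahe.apply(ycrcb[:, :, 0])
    return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)

cap = cv2.VideoCapture("rtsp://monitoring-camera/stream")  # illustrative URL
ok, frame = cap.read()
if ok:
    frame = preprocess_frame(frame)
```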
Performing target detection on the preprocessed video stream by using a deep learning algorithm, and identifying a tracking target, wherein the tracking target comprises personnel and vehicles;
Escort personnel are distinguished from other persons according to the color and style of the escort uniform. By analyzing multi-frame images in the video stream, it is identified whether a person is walking or running normally or is making abnormal actions (such as suddenly squatting or waving an arm). Vehicles have significant shape features, such as the rectangular contour of the body and the round wheels; the color and license plate information of the vehicle are also important identification features. The driving state of the vehicle, such as its speed and driving direction, can also be obtained by analyzing the change in the vehicle's position across successive frames in the video stream, so as to monitor whether the escort vehicle travels along the preset route, whether there is abnormal acceleration or deceleration, and the like.
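As a small worked example of deriving the driving state from position changes across frames, the sketch below estimates speed and heading from per-frame vehicle positions; the frame rate and pixel-to-meter calibration are illustrative assumptions:

```python
import numpy as np

def motion_state(positions, fps=25.0, meters_per_pixel=0.05):
    """Estimate per-frame speed (m/s) and heading (degrees) of a vehicle.

    positions: list of (x, y) pixel centers of the tracked vehicle in
    successive frames; fps and meters_per_pixel calibrate the scene.
    """
    p = np.asarray(positions, dtype=float)
    d = np.diff(p, axis=0)                                       # displacement per frame
    speed = np.linalg.norm(d, axis=1) * fps * meters_per_pixel   # m/s
    heading = np.degrees(np.arctan2(d[:, 1], d[:, 0]))           # driving direction
    return speed, heading

speed, heading = motion_state([(100, 200), (104, 200), (109, 201)])
# abrupt changes in `speed` flag abnormal acceleration or deceleration
```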
Performing target detection on the preprocessed video stream by using a deep learning algorithm, and identifying a tracking target, wherein the tracking target comprises personnel and vehicles, comprises the following steps:
Extracting images from the video stream frame by frame for frame analysis;
performing enhancement processing on the extracted image and adjusting the image to a size suitable for the input of the deep learning model;
carrying out normalization processing on the image to enable the pixel value to be in a specific range;
performing feature extraction and classification by using a convolutional neural network;
Training the selected deep learning model by using the labeled data set;
Applying a trained model to each frame in the video stream to perform target detection, and identifying the positions and the categories of personnel and vehicles;
a tracking algorithm is applied between successive frames to track the motion trajectory of the object in the video.
Images are extracted from the video stream frame by frame, contrast enhancement and noise reduction are performed on the extracted images, and the images are adjusted to a size suitable for the deep learning model input. The images are then normalized so that the pixel values fall within a specific range. Feature extraction and classification are performed by the convolutional neural network: first the convolution operation, then the pooling operation, and finally the fully connected layer operation. The selected deep learning model is trained on the labeled data set, and the trained model is applied to each frame in the video stream for target detection: the convolution kernel slides over the image to extract features, and whether a person or vehicle target is present is judged from the output of the fully connected layer. If a target is present, the model determines the positions and classes of the persons and vehicles based on the feature map and the associated algorithms. Finally, a tracking algorithm is applied between successive frames to track the motion trajectory of the target in the video.
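As a hedged illustration of this per-frame pipeline, the sketch below wires the resizing, normalization and detection steps together; the `model.predict` API and the class IDs are assumptions, standing in for any trained CNN detector (e.g., a YOLO-family model):

```python
import cv2
import numpy as np

CLASSES = {0: "person", 1: "vehicle"}

def prepare(frame, size=(416, 416)):
    # Resize to the model input size and normalize pixels to [0, 1]
    img = cv2.resize(frame, size)
    return img.astype(np.float32) / 255.0

def detect_frame(model, frame, score_threshold=0.5):
    """Run a trained detector on one preprocessed frame.

    `model` is assumed to expose predict(img) returning
    (x, y, w, h, class_id, score) tuples; this API is illustrative.
    """
    detections = []
    for x, y, w, h, cls, score in model.predict(prepare(frame)):
        if score >= score_threshold and cls in CLASSES:
            detections.append({"box": (x, y, w, h),
                               "label": CLASSES[cls],
                               "score": score})
    return detections

# Per-frame loop: detections feed the tracker described in the next step
# for frame in video_frames:
#     for det in detect_frame(model, frame):
#         tracker.update(det["box"], det["label"])
```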
Tracking the detected tracking target in successive frames by utilizing a target tracking algorithm to acquire the motion trajectory of the tracking target;
Regions that may contain the target are first generated by an algorithm and are then classified and regressed to determine the exact location and class of the target. The optical flow method is then used: assuming that the intensity of pixels in the image remains unchanged over a short time, the gray-level changes of pixels between adjacent frames are analyzed to determine the direction and speed of pixel motion, thereby realizing target tracking. During tracking, features of the target are continuously extracted from the current frame as the target moves through the video. A sliding window method is adopted when searching for the most similar region in subsequent frames: a window close to the initial size of the target is slid over each subsequent frame, and the similarity between the region inside the window and the target features is calculated. When the most similar region is found (as determined by a set similarity threshold), the location of the target in that frame is considered found; once found, the position information of the target is updated in time and the target is accurately identified. A minimal code sketch of the optical flow step follows the steps below.
Tracking the detected tracking target in successive frames by using a target tracking algorithm to acquire the motion trajectory of the tracking target comprises the following steps:
performing target detection on each frame of the video stream by using a target detection algorithm;
Initializing the detected target, wherein the initialization comprises the initial position, the size and possible appearance characteristics of the target;
Calculating the motion of pixels in the image by using an optical flow method, so as to track the motion trajectory of the target;
extracting features of the target from the current frame;
Searching a region most similar to the target in a subsequent frame according to the characteristics of the extracted target;
When the most similar area is found, updating the position information of the target;
Estimating the motion state of the target according to the position information of the target in successive frames, and obtaining the motion trajectory of the target.
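A minimal sketch of the optical flow tracking step using pyramidal Lucas-Kanade; the video file name, the feature point parameters, and the use of the tracked points' mean as the target position are illustrative assumptions:

```python
import cv2

lk_params = dict(winSize=(21, 21), maxLevel=3,
                 criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

def track_points(prev_gray, curr_gray, points):
    # Lucas-Kanade optical flow between adjacent frames; assumes pixel
    # intensity stays roughly constant over the short inter-frame interval
    new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                                  points, None, **lk_params)
    good = status.ravel() == 1
    return new_pts[good]

cap = cv2.VideoCapture("escort_area.mp4")          # illustrative file name
ok, frame = cap.read()
prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
# Initialize with corner features inside the detected target's bounding box
points = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100,
                                 qualityLevel=0.3, minDistance=7)

trajectory = []
while True:
    ok, frame = cap.read()
    if not ok or points is None or len(points) == 0:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    new_pts = track_points(prev_gray, gray, points)
    if len(new_pts) == 0:
        break
    trajectory.append(new_pts.reshape(-1, 2).mean(axis=0))   # target position this frame
    prev_gray, points = gray, new_pts.reshape(-1, 1, 2)
```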
Extracting the pose information of the target by using a pose estimation algorithm, and constructing a multidimensional behavior feature vector by combining the motion trajectory and space-time features of the target;
A pose estimation algorithm is used to determine the poses of the escort personnel and vehicles in the video frames, such as the joint positions of the human body and the angles of the limbs. The motion trajectories of the escort personnel and vehicles are represented by position coordinate points at different moments; the temporal features cover the time, duration, rhythm and other temporal characteristics of their motion in the video, while the spatial features cover whether the vehicle is in a lane, an intersection, a parking lot or another spatial area. The pose information, motion trajectory features and space-time features are then combined into a multidimensional behavior feature vector.
Extracting the pose information of the target by using a pose estimation algorithm, and constructing a multidimensional behavior feature vector by combining the motion trajectory and space-time features of the target, comprises the following steps:
performing pose estimation on a target in the video stream by using a pose estimation algorithm to obtain the key points of the target and the position information of the key points;
extracting the pose information of the target from the output of the pose estimation algorithm, wherein the pose information comprises the coordinates of the key points, the relative positional relations between the key points, the motion trajectories of the key points and the like;
acquiring the motion trajectory of the target by using a target tracking algorithm, wherein the motion trajectory is the sequence of position information of the target in successive frames;
extracting the space-time features of the target, wherein the space-time features refer to the change information of the target in time and space, including the acceleration, speed change, pose change and the like of the target;
combining the extracted pose information, motion trajectory and space-time features to construct the multidimensional behavior feature vector, as sketched below.
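A minimal sketch of assembling the multidimensional behavior feature vector; the specific feature choices (flattened joint coordinates, last position, mean/max speed, mean acceleration) are illustrative assumptions:

```python
import numpy as np

def behavior_vector(keypoints, trajectory, fps=25.0):
    """Combine pose, trajectory and space-time features into one vector.

    keypoints: (J, 2) joint coordinates from the pose estimator;
    trajectory: (T, 2) target positions in successive frames, T >= 2.
    """
    pose = np.asarray(keypoints, dtype=float).ravel()        # pose information
    traj = np.asarray(trajectory, dtype=float)
    d = np.diff(traj, axis=0)                                # inter-frame displacement
    speed = np.linalg.norm(d, axis=1) * fps                  # speed over time
    accel = np.diff(speed)                                   # speed change
    spacetime = np.array([speed.mean(), speed.max(),
                          accel.mean() if accel.size else 0.0])
    return np.concatenate([pose, traj[-1], spacetime])       # behavior feature vector
```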
Establishing a preset abnormal behavior rule base, wherein the abnormal behavior rule base comprises a known abnormal behavior set;
An abnormal behavior rule base is established with the aim of identifying abnormal behaviors of financial escort personnel during task execution. By defining rules for these behaviors, abnormal behaviors are detected automatically through the monitoring technology, ensuring the security of the on-site financial escort.
Establishing a preset abnormal behavior rule base, wherein the abnormal behavior rule base comprises a known abnormal behavior set, comprises the following steps:
collecting sample data containing various abnormal behaviors, and establishing a preset abnormal behavior rule base;
extracting behavior feature information capable of describing an abnormal behavior sample, wherein the feature information comprises the pose, motion trajectory, speed and acceleration of the targets, their interactions with other targets, and the like;
and constructing an abnormal behavior rule base based on the extracted characteristic information.
To establish the preset abnormal behavior rule base, abnormal behavior sample data of escort personnel and vehicles are collected. The abnormal behaviors of escort personnel include abnormal sitting postures, dozing, making phone calls, eating, smoking, wearing earphones and the like; the abnormal behaviors of vehicles include failing to travel along the set trajectory, driving overspeed or too slowly, collision accidents between the target vehicle and other vehicles or persons, abnormal vehicle components, and the like. The abnormal behavior rule base is then constructed: behavior feature information is extracted from the abnormal behavior sample data, including the pose of the escort personnel, the motion trajectory of the escort vehicles, the vehicle speed and acceleration, and interactions with other targets.
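A minimal sketch of such a rule base as data plus predicates; every rule name and threshold below is an illustrative assumption, not a value from the original:

```python
ABNORMAL_BEHAVIOR_RULES = [
    # vehicle rules
    {"name": "vehicle_off_route", "check": lambda f: f["route_deviation_m"] > 10.0},
    {"name": "vehicle_overspeed", "check": lambda f: f["speed_mps"] > 16.7},   # ~60 km/h
    {"name": "vehicle_too_slow",  "check": lambda f: 0.0 < f["speed_mps"] < 1.0},
    # personnel rules
    {"name": "person_phone_call", "check": lambda f: f["hand_to_ear_frames"] > 25},
    {"name": "person_dozing",     "check": lambda f: f["eyes_closed_frames"] > 50},
]

def match_rules(features):
    # Return the names of all rules the extracted behavior features violate
    return [r["name"] for r in ABNORMAL_BEHAVIOR_RULES if r["check"](features)]
```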
Based on the output of the multidimensional behavior feature vector, the identified behaviors are analyzed in real time in combination with the preset abnormal behavior rule base, and whether the target is performing an abnormal behavior is identified.
First, the behaviors of the escort personnel and vehicles in the video frames, as described by the output multidimensional behavior feature vectors, are compared and analyzed against the preset abnormal behavior rule base to identify whether the escort personnel and vehicles are performing abnormal behaviors.
Based on the output of the multidimensional behavior feature vector, analyzing the identified behaviors in real time in combination with the preset abnormal behavior rule base and identifying whether the target is performing an abnormal behavior comprises the following steps:
Extracting the multidimensional behavior feature vectors of targets from the video stream in real time by using a pose estimation algorithm, a target tracking algorithm and a space-time feature extraction method;
matching the extracted multidimensional behavior feature vector with a preset abnormal behavior rule base;
Judging whether the target is performing abnormal behavior in real time according to the matching result of the feature vector and the rule base;
when abnormal behavior is identified, the system immediately triggers an alarm mechanism.
The positions of the joints of the human body and the motion trajectory information of the targets are identified by using a pose estimation algorithm, a target tracking algorithm and a space-time feature extraction method, wherein this information comprises the coordinates of joints such as the head, shoulders, elbows, wrists, hips, knees and ankles of the escort personnel, and the motion trajectory of the escort vehicles. The posture of the human body, such as standing, bending down or raising a hand, and the motion trajectory of the vehicle are described through the positional relations of the joint points, and features such as the motion direction and speed changes of the target are further analyzed: the motion direction vector of the target is obtained by calculating the vectors between adjacent points on the trajectory, the speed information of the target is obtained by calculating the rate of change of the distance between trajectory points over time, and the multidimensional behavior feature vector of the target is thereby extracted. After the multidimensional behavior feature vector of the target is extracted, it is matched one by one against the rules in the abnormal behavior rule base. If an abnormal behavior is matched, the system raises an alarm and sends the alarm information to a remote monitoring center over the network.
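A minimal end-to-end sketch of these trajectory computations and the rule matching loop; it reuses the illustrative ABNORMAL_BEHAVIOR_RULES above, and the alarm transport is deliberately left abstract:

```python
import numpy as np

def direction_vectors(track):
    # Vectors between adjacent trajectory points give the motion direction
    return np.diff(np.asarray(track, dtype=float), axis=0)

def speeds(track, fps=25.0):
    # Rate of change of distance between trajectory points over time
    return np.linalg.norm(direction_vectors(track), axis=1) * fps

def monitor_stream(feature_stream, rules, send_alarm):
    """Match per-frame behavior feature dicts against the rule base.

    feature_stream yields one feature dict per analyzed frame; rules is a
    list like ABNORMAL_BEHAVIOR_RULES above; send_alarm forwards the alarm
    to the remote monitoring center (network transport not specified here).
    """
    for feats in feature_stream:
        violated = [r["name"] for r in rules if r["check"](feats)]
        if violated:
            send_alarm({"event": "abnormal_behavior", "rules": violated})
```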
As shown in Fig. 3, an embodiment of the present application provides a system for identifying abnormal behavior in a financial escort process, the system including:
The data acquisition module 11 is used for acquiring personnel characteristic data, wherein the characteristic data comprises fingerprint data, facial data and iris data;
The feature extraction module 12 is used for performing feature extraction on the personnel feature data and establishing a multi-modal biological feature database;
The feature matching module 13 is used for acquiring real-time personnel feature data, and matching the features extracted from the real-time personnel feature data with the features stored in the multi-modal biological feature database by using a convolutional neural network to acquire a matching result;
The video stream obtaining module 14 is configured to continuously obtain the video stream of the monitoring area in real time and perform preprocessing when the matching result indicates a successful match;
The target recognition module 15 is used for performing target detection on the preprocessed video stream by using a deep learning algorithm and recognizing a tracking target, wherein the tracking target comprises personnel and vehicles;
The motion trajectory acquisition module 16 is configured to track the detected tracking target in successive frames by using a target tracking algorithm, and acquire the motion trajectory of the tracking target;
The multidimensional behavior feature vector construction module 17 is used for extracting the pose information of the target by using a pose estimation algorithm and constructing a multidimensional behavior feature vector by combining the motion trajectory and space-time features of the target;
an abnormal behavior rule base establishing module 18, configured to establish a preset abnormal behavior rule base, where the abnormal behavior rule base includes a known abnormal behavior set;
The abnormal behavior recognition module 19 is configured to analyze the recognized behavior in real time based on the output of the multidimensional behavior feature vector in combination with a preset abnormal behavior rule base, and recognize whether the target is performing an abnormal behavior.
Further, the data acquisition module 11 further includes:
acquiring fingerprints by using professional fingerprint acquisition equipment to acquire fingerprint data;
acquiring face data of a person by using a high-resolution camera;
Performing multiple scans by using an iris acquisition device to acquire iris data under different angles and illumination conditions.
Further, the feature extraction module 12 further includes:
Extracting characteristic information such as texture, direction and frequency of the fingerprint from the collected fingerprint data by using a fingerprint characteristic extraction algorithm;
Extracting feature information such as key points, textures, shapes and the like of the face from the acquired face data by using a face feature extraction algorithm, wherein the key points of the face comprise eye corners, mouth corners and the like;
extracting characteristic information such as texture, spots, lines and the like of the iris from the acquired iris data by using an iris characteristic extraction algorithm;
Integrating the extracted fingerprint features, facial features and iris features to form the multi-modal biological feature data of each person, and establishing the multi-modal biological feature database.
Further, the target recognition module 15 further includes:
Extracting images from the video stream frame by frame for frame analysis;
performing enhancement processing on the extracted image and adjusting the image to a size suitable for the input of the deep learning model;
carrying out normalization processing on the image to enable the pixel value to be in a specific range;
performing feature extraction and classification by using a convolutional neural network;
Training the selected deep learning model by using the labeled data set;
Applying a trained model to each frame in the video stream to perform target detection, and identifying the positions and the categories of personnel and vehicles;
a tracking algorithm is applied between successive frames to track the motion trajectory of the object in the video.
Further, the motion trajectory acquisition module 16 further includes:
performing target detection on each frame of the video stream by using a target detection algorithm;
Initializing the detected target, wherein the initialization comprises the initial position, the size and possible appearance characteristics of the target;
Calculating the motion of pixels in the image by using an optical flow method, so as to track the motion trajectory of the target;
extracting features of the target from the current frame;
Searching a region most similar to the target in a subsequent frame according to the characteristics of the extracted target;
When the most similar area is found, updating the position information of the target;
Estimating the motion state of the target according to the position information of the target in successive frames, and obtaining the motion trajectory of the target.
Further, the multidimensional behavior feature vector construction module 17 further includes:
performing pose estimation on a target in the video stream by using a pose estimation algorithm to obtain the key points of the target and the position information of the key points;
extracting the pose information of the target from the output of the pose estimation algorithm, wherein the pose information comprises the coordinates of the key points, the relative positional relations between the key points, the motion trajectories of the key points and the like;
acquiring the motion trajectory of the target by using a target tracking algorithm, wherein the motion trajectory is the sequence of position information of the target in successive frames;
extracting the space-time features of the target, wherein the space-time features refer to the change information of the target in time and space, including the acceleration, speed change, pose change and the like of the target;
combining the extracted pose information, motion trajectory and space-time features to construct the multidimensional behavior feature vector.
Further, the abnormal behavior rule base establishing module 18 further includes:
collecting sample data containing various abnormal behaviors, and establishing a preset abnormal behavior rule base;
extracting behavior feature information capable of describing an abnormal behavior sample, wherein the feature information comprises the pose, motion trajectory, speed and acceleration of the targets, their interactions with other targets, and the like;
and constructing an abnormal behavior rule base based on the extracted characteristic information.
Further, the abnormal behavior recognition module 19 further includes:
Extracting the multidimensional behavior feature vectors of targets from the video stream in real time by using a pose estimation algorithm, a target tracking algorithm and a space-time feature extraction method;
matching the extracted multidimensional behavior feature vector with a preset abnormal behavior rule base;
Judging whether the target is performing abnormal behavior in real time according to the matching result of the feature vector and the rule base;
when abnormal behavior is identified, the system immediately triggers an alarm mechanism.
For specific embodiments of the system for identifying abnormal behavior in the financial escort process, reference may be made to the above embodiments of the method for identifying abnormal behavior in the financial escort process, which are not repeated herein. Each of the above modules may be embedded in, or independent of, a processor in a computer device in the form of hardware, or may be stored in a memory of the computer device in the form of software, so that the processor can invoke and execute the operations corresponding to each module.
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be considered to be within the scope of this specification.
The above examples merely illustrate a few embodiments of the application; while they are described specifically and in detail, they are not to be construed as limiting the scope of the application. It should be noted that several variations and modifications can be made by those skilled in the art without departing from the concept of the application, all of which fall within the protection scope of the application. Accordingly, the scope of protection of the present application shall be determined by the appended claims.