CN111339810A

CN111339810A - Low-resolution large-angle face recognition method based on Gaussian distribution

Info

Publication number: CN111339810A
Application number: CN201910337099.7A
Authority: CN
Inventors: 施海波; 孙勇
Original assignee: Nanjing Teworth High Tech Co ltd
Current assignee: Nanjing Teworth High Tech Co ltd
Priority date: 2019-04-25
Filing date: 2019-04-25
Publication date: 2020-06-26

Abstract

The invention relates to the technical field of intelligent security face recognition and discloses a low-resolution large-angle face recognition method based on Gaussian distribution, comprising the following steps: step 1: reading a face data set used for training a face recognition model; step 2: performing face detection and cropping on the pictures in the data set, and screening out the training pictures that meet the conditions; step 3: reading the training pictures sequentially in batches and applying data enhancement such as random rotation and scaling; step 4: training the face recognition model using a softmax function based on Gaussian distribution as the loss function; step 5: judging whether the face recognition accuracy meets the design requirement; if it does, outputting the final trained face recognition model, otherwise continuing to read training data to train the model. The method improves the accuracy of the face recognition model across face angles and resolutions and enhances the robustness of the model.

Description

Low-resolution large-angle face recognition method based on Gaussian distribution

Technical Field

The invention relates to the technical field of intelligent security face recognition, in particular to a low-resolution large-angle face recognition method based on Gaussian distribution.

Background

With the continuous improvement of security standards in crowded areas such as airports, subways and shopping centers, machine-vision-based intelligent monitoring systems are receiving more and more attention. Because most surveillance cameras are set up for a wide field of view, the faces they capture are generally small and not seen head-on. Compared with a clear, high-resolution frontal face picture, a small-scale large-angle face picture is far less discriminative and carries far less information, so a surveillance video face recognition system needs to be specially optimized for small-scale and large-angle faces.

Face recognition algorithms have been studied for many years, but algorithms designed specifically for surveillance video are rare. The few that exist consider pose and resolution separately rather than together, and pay no attention to real-time performance. Surveillance video face recognition is more challenging than still-picture face recognition.

Existing face recognition model training methods can generally be divided into two types: metric learning-based methods and boundary-based classification methods. A metric learning method calculates the loss function directly on the features extracted from faces, so that features of the same person are pulled closer together and features of different people are pushed farther apart; examples are DeepID2, which uses contrastive loss, and FaceNet, which uses triplet loss. Metric learning based on the contrastive and triplet loss functions matches human intuition and works well in practice, but it has two serious problems: the model takes a long time to converge, and the result depends on how the training data are sampled, so an ideal sampling scheme both improves the final performance of the algorithm and speeds up training.

Unlike a metric learning method, a boundary classification method does not calculate the loss function directly on the feature layer. It still trains face recognition as a classification task and indirectly imposes a boundary constraint on the feature layer by modifying the softmax formula, so that the features finally produced by the network are more discriminative. The idea was first proposed in SphereFace; its refinements AM-softmax/CosFace and ArcFace currently achieve the best performance on existing public data sets and train faster. However, these methods do not consider the unbalanced distribution of sample data, so they perform well on high-definition frontal face pictures but have low recognition accuracy on low-resolution side-face pictures, and improvement is therefore needed.

Disclosure of Invention

(I) technical problem to be solved

Aiming at the defects of the prior art, the invention provides a low-resolution large-angle face recognition method based on Gaussian distribution, which improves the accuracy of the face recognition model across face angles and resolutions and enhances the robustness of the model, and thereby solves the problem that existing face recognition model training methods have low recognition accuracy on low-resolution side-face pictures.

(II) technical scheme

In order to achieve the purpose, the invention provides the following technical scheme: a low-resolution large-angle face recognition method based on Gaussian distribution comprises the following specific steps:

step 1: reading a face data set used for training a face recognition model;

step 2: carrying out face detection and cutting on pictures in the data, and screening out training pictures meeting conditions;

step 3: sequentially reading training pictures in batches, and performing data enhancement processing such as random rotation and zooming;

step 4: training a face recognition model by using a softmax function based on Gaussian distribution as a loss function;

step 5: judging whether the face recognition accuracy meets the design requirement; if it does, proceeding to output the final trained face recognition model, otherwise continuing to read training data to train the model;

step 6: outputting a face recognition model which is finally trained;

step 7: reading a face picture to be recognized;

step 8: carrying out face detection on the read face picture, and intercepting a face range;

step 9: extracting a face picture characteristic vector by using a face recognition model which is trained before;

step 10: calculating cosine distances between the extracted feature vectors to serve as face similarity measurement;

step 11: and finally, outputting the similarity between the face pictures and judging whether the face pictures are the same person or not.
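Steps 10 and 11 above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names and the 0.5 decision threshold are assumptions introduced here, since the text does not state how the similarity is turned into a same-person decision.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-12):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def same_person(feat_a, feat_b, threshold=0.5):
    """Return (similarity, decision); the threshold value is a placeholder assumption."""
    sim = cosine_similarity(feat_a, feat_b)
    return sim, sim >= threshold
```

In practice the threshold would be chosen on a validation set for the desired trade-off between false accepts and false rejects.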

Preferably, the specific processing procedure of Step3 is divided into random rotation of the training picture, random scaling of the training picture, and modified softmax loss function based on gaussian distribution.

Preferably, for the random rotation of the training picture, statistics on the face angles in surveillance images show that the angles fall within a range of ±30 degrees. Therefore, so that the model covers the whole range of face angles, a random rotation operation is added when training the face recognition model, and the output face rotation angle is:

θ_out = 30λ_θ - θ_in

wherein: θ_out is the output face rotation angle; λ_θ is a random number between -1 and 1; θ_in is the input face rotation angle. After rotation, the face angles are uniformly distributed in the range of -30 degrees to 30 degrees.
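The rotation formula can be sketched in a few lines. The sketch reads θ_out as the rotation to apply to a face whose current angle is θ_in, so that the resulting angle θ_in + θ_out equals 30λ_θ; that reading, and the function name `sample_rotation`, are assumptions made for illustration.

```python
import numpy as np

def sample_rotation(theta_in, rng):
    """Return theta_out = 30 * lambda_theta - theta_in, in degrees."""
    lam = rng.uniform(-1.0, 1.0)  # lambda_theta, a random number between -1 and 1
    return 30.0 * lam - theta_in

rng = np.random.default_rng(0)
theta_in = 12.0                              # example input face angle (degrees)
theta_out = sample_rotation(theta_in, rng)   # rotation to apply
final_angle = theta_in + theta_out           # equals 30 * lambda_theta
```

Whatever the input angle, the resulting angle depends only on λ_θ, which is why the rotated faces end up uniformly spread over -30 to 30 degrees.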

Preferably, for the random scaling of the training picture, statistics on the face resolution in surveillance images show that the face resolution falls within a range of 16x16 to 128x128. Therefore, so that the model covers the whole range of face resolutions, a random scaling operation is added when training the face recognition model. Since the resolution of the model training picture is 128x128, the training picture is first reduced using bicubic interpolation and then enlarged back to 128x128 using bicubic interpolation, and the resolution of the output face picture is:

s_out = λ_s

wherein: s_out is the resolution to which the face is reduced by bicubic interpolation; λ_s is a random integer between 16 and 128. After random scaling, the face resolution is uniformly distributed between 16x16 and 128x128.
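The shrink-then-enlarge step can be sketched as below, with the resampler left pluggable. The names `random_rescale` and `nearest_resize` are hypothetical; `nearest_resize` is a nearest-neighbour stand-in used only so that the sketch runs with NumPy alone, and it is not bicubic. A real pipeline would pass a bicubic resize from an image library instead.

```python
import numpy as np

def random_rescale(img, resize, rng, s_min=16, s_max=128, train_size=128):
    """Shrink img to a random s_out x s_out, then enlarge back to train_size."""
    s_out = int(rng.integers(s_min, s_max + 1))  # lambda_s, inclusive of s_max
    small = resize(img, s_out)                   # simulate a low-resolution face
    return resize(small, train_size), s_out

def nearest_resize(img, size):
    """Stand-in resampler (nearest neighbour, NOT bicubic), for demonstration only."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]
```

The output always has the training resolution, but its effective detail corresponds to the randomly drawn s_out.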

Preferably, for the improved softmax loss function based on Gaussian distribution, given input face data x, p_data(x) denotes the true probability distribution of the samples and p_model(x|θ) denotes the probability output by the face recognition model, so the KL divergence of the model is:

wherein: x is a face sample picture; X is the face data set; D_KL(p_data||p_model) is the KL divergence between the probability distribution of the real data and the probability distribution of the model; p_data(x) is the true data probability; p_model(x) is the probability that the model computes for the face picture; H(p) is a model-independent constant. The smaller D_KL(p_data||p_model) is, the closer the model prediction probability is to the real picture probability; after the model-independent constant H(p) is removed, what remains is the cross-entropy form generally used by face recognition loss functions.
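The displayed KL divergence equation is not present in this text. Under the standard definition of KL divergence, and assuming that H(p) denotes the entropy of p_data, the relation described by the symbol list would read:

```latex
D_{KL}(p_{data}\,\|\,p_{model})
  = \sum_{x \in X} p_{data}(x)\,\log\frac{p_{data}(x)}{p_{model}(x)}
  = -H(p) \;-\; \sum_{x \in X} p_{data}(x)\,\log p_{model}(x),
\qquad
H(p) = -\sum_{x \in X} p_{data}(x)\,\log p_{data}(x)
```

Dropping the constant term -H(p) leaves the cross entropy, -\sum_{x \in X} p_{data}(x) \log p_{model}(x), which is the quantity the loss functions in the following paragraphs minimize.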

Taking the softmax function as the model prediction probability and introducing a margin mechanism gives the cos-softmax loss function, which currently achieves high accuracy in single-picture face recognition. To prevent an overly small inter-class distance from harming the robustness of the face recognition model, the cos-softmax loss function is improved by introducing an inter-class distance constraint, with the inter-class distance serving as a regularization term:

L_{pro-cosSoftmax} = L_{cos-Softmax} + λL_conv

L_conv = -||conv(W, W)||_F

wherein: L_{pro-cosSoftmax} is the improved cos-softmax loss function, comprising the cos-softmax loss function L_{cos-Softmax} and the covariance loss function L_conv; N is the size of the training batch; s is a scaling factor that adjusts the weight of the loss function; m is the margin between classes, which adjusts the training intensity; W_j is the j-th column of the weight of the last fully-connected layer of the network, i.e. the representative vector of the j-th class in the training samples, and W_j^* is the normalized W_j; x_i is the feature vector extracted from the i-th picture in the batch, and x_i^* is the normalized x_i; cos(θ_{j,i}) is the cosine of the angle between the normalized feature W_j^* of the j-th class and the normalized feature x_i^* of the i-th sample picture, i.e. the distance between the two feature vectors, and is used in the softmax classification part of the loss function; λ is the weight coefficient of the regularization term; ||conv(W, W)||_F means computing the covariance matrix of W and then taking the Frobenius norm of that matrix, i.e. the square root of the sum of the squared absolute values of the matrix elements, which represents the inter-class distance.
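A minimal NumPy sketch of the two loss terms follows. Several points are assumptions rather than statements from the text: the cos-softmax equation itself is not shown here, so the additive-margin form used by CosFace/AM-softmax (target logit s(cos θ - m)) is assumed because it matches the symbols N, s and m; the covariance is taken across class vectors (the columns of W); and the default values of s, m and λ are placeholders.

```python
import numpy as np

def cos_softmax_loss(x, W, labels, s=30.0, m=0.35):
    """Additive-margin cosine softmax. x: (N, d), W: (d, C), labels: (N,)."""
    x_n = x / np.linalg.norm(x, axis=1, keepdims=True)   # x_i^*
    W_n = W / np.linalg.norm(W, axis=0, keepdims=True)   # W_j^*
    cos = x_n @ W_n                                      # cos(theta_{j,i}), shape (N, C)
    idx = np.arange(len(labels))
    logits = s * cos
    logits[idx, labels] = s * (cos[idx, labels] - m)     # margin on the true class only
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[idx, labels].mean())

def cov_regularizer(W):
    """L_conv = -||cov(W)||_F, with columns of W treated as the variables."""
    cov = np.cov(W, rowvar=False)                        # (C, C) covariance matrix
    return float(-np.linalg.norm(cov, ord="fro"))

def pro_cos_softmax_loss(x, W, labels, lam=0.01, s=30.0, m=0.35):
    """L_pro-cosSoftmax = L_cos-Softmax + lambda * L_conv."""
    return cos_softmax_loss(x, W, labels, s, m) + lam * cov_regularizer(W)
```

Because L_conv is the negated norm, minimizing the total loss pushes the Frobenius norm of the covariance matrix up, which is how the regularization term discourages class vectors from collapsing together.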

The training samples are imbalanced: high-quality samples are numerous and low-quality samples are few, and the features extracted from high-quality samples have large amplitude while those extracted from low-quality samples have small amplitude. Based on this observation, the feature amplitude is used as the sample sampling distribution, and the sample sampling probability is introduced into the model probability distribution:

p_model(x|θ) = p_model-cls * p_model-samp

wherein: p_model(x|θ) is the probability distribution output by the face recognition model θ for the input picture x, comprising the model classification probability p_model-cls and the model sampling probability p_model-samp; the model classification probability p_model-cls is the classification probability of the cos-softmax loss function; the model sampling probability p_model-samp is fitted with a Gaussian distribution, where μ is the feature mean and σ is the feature variance. Substituting p_model(x|θ) into the cross-entropy formula gives the improved softmax loss function based on Gaussian distribution.
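The Gaussian density itself is not shown in this text, so the sketch below fills it in under stated assumptions: the Gaussian is taken over the feature amplitude (L2 norm), μ and σ are estimated from the batch when not supplied, and σ is used as a standard deviation in the usual density formula. The function names are hypothetical.

```python
import numpy as np

def gaussian_sampling_prob(features, mu=None, sigma=None, eps=1e-8):
    """p_model-samp: Gaussian density of each sample's feature amplitude."""
    amp = np.linalg.norm(features, axis=1)        # feature amplitude per sample
    mu = amp.mean() if mu is None else mu         # assumed: batch estimate of mu
    sigma = amp.std() if sigma is None else sigma # assumed: batch estimate of sigma
    sigma = max(float(sigma), eps)                # guard against zero spread
    return np.exp(-(amp - mu) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

def gaussian_softmax_loss(p_cls, p_samp, eps=1e-12):
    """Cross entropy of p_model = p_cls * p_samp for the true class of each sample."""
    return float(-np.mean(np.log(p_cls * p_samp + eps)))
```

Since log(p_cls * p_samp) = log p_cls + log p_samp, the loss splits into the classification term plus a term that grows for samples whose amplitude lies far from the mean, which is one way of reading the statement that rarely seen samples receive more weight.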

Preferably, Step1, Step2, Step3, Step4, Step5 and Step6 constitute a model training section, and Step7, Step8, Step9, Step10 and Step11 constitute a face recognition section.
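The control flow of the model training part (Steps 3 to 6) can be sketched as a loop that keeps reading batches until the accuracy target is met. The callables `next_batch`, `train_step` and `evaluate`, and the `max_steps` and `eval_every` parameters, are hypothetical placeholders; the text does not specify how often accuracy is checked.

```python
def train_until_accurate(next_batch, train_step, evaluate, target_acc,
                         max_steps=10000, eval_every=100):
    """Train on successive batches until evaluate() reaches target_acc."""
    step, acc = 0, 0.0
    for step in range(1, max_steps + 1):
        train_step(next_batch())        # Steps 3-4: augment a batch and update the model
        if step % eval_every == 0:
            acc = evaluate()            # Step 5: check recognition accuracy
            if acc >= target_acc:
                break                   # Step 6: the model is ready to be output
    return step, acc
```

The face recognition part (Steps 7 to 11) then uses the trained model only for feature extraction and similarity comparison.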

(III) advantageous effects

Compared with the prior art, the invention provides a low-resolution large-angle face recognition method based on Gaussian distribution, which has the following beneficial effects:

1. In the low-resolution large-angle face recognition method based on Gaussian distribution, Step 3 reads the training pictures sequentially in batches and applies data enhancement such as random rotation and scaling, which improves the accuracy of the face recognition model across face angles and resolutions.

2. In the low-resolution large-angle face recognition method based on Gaussian distribution, Step 4 introduces a softmax function improved by Gaussian distribution, which takes the distribution of the training data into account and weights samples by Gaussian probability, improving the face recognition accuracy for under-represented pictures and enhancing the robustness of the model.

Drawings

Fig. 1 is a flowchart of a low-resolution large-angle face recognition method based on gaussian distribution according to the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1, a low-resolution large-angle face recognition method based on Gaussian distribution includes a model training part and a face recognition part. The model training part is: reading the model training data, performing preprocessing such as cropping and cleaning, then reading the processed data in batches, applying operations such as random rotation and scaling, and training the face recognition model using the improved softmax function based on Gaussian distribution as the loss function. The face recognition part is: inputting a face picture to be recognized, performing face detection on the picture and cropping the face region, extracting face feature vectors using the face recognition model trained in the model training part, calculating the cosine distances between the feature vectors, and finally outputting the similarity between faces. The specific steps are as follows:

step 1: reading a face data set used for training a face recognition model;

step 6: outputting a face recognition model which is finally trained;

step 7: reading a face picture to be recognized;

The specific processing procedure of Step3 is divided into random rotation of the training picture, random scaling of the training picture and modified softmax loss function based on Gaussian distribution.

For the random rotation of the training picture, statistics on the face angles in surveillance images show that the angles fall within a range of ±30 degrees. Therefore, so that the model covers the whole range of face angles, a random rotation operation is added when training the face recognition model, and the output face rotation angle is:

θ_out = 30λ_θ - θ_in

wherein: θ_out is the output face rotation angle; λ_θ is a random number between -1 and 1; θ_in is the input face rotation angle. After rotation, the face angles are uniformly distributed in the range of -30 degrees to 30 degrees, and training the face recognition model on these angle-enhanced face pictures makes it more robust to large face angles.

For the random scaling of the training picture, statistics on the face resolution in surveillance images show that the face resolution falls within a range of 16x16 to 128x128. Therefore, so that the model covers the whole range of face resolutions, a random scaling operation is added when training the face recognition model. Since the resolution of the model training picture is 128x128, the training picture is first reduced using bicubic interpolation and then enlarged back to 128x128 using bicubic interpolation, and the resolution of the output face picture is:

s_out = λ_s

wherein: s_out is the resolution to which the face is reduced by bicubic interpolation; λ_s is a random integer between 16 and 128. After random scaling, the face resolution is uniformly distributed between 16x16 and 128x128, and training the face recognition model on these resolution-enhanced face pictures makes it more robust to low-resolution faces and raises its recognition accuracy.

In the improved softmax loss function based on Gaussian distribution, for input face data x, p_data(x) denotes the true probability distribution of the samples and p_model(x|θ) denotes the probability output by the face recognition model; the model is trained by minimizing the KL divergence between the two, which reduces to the cross-entropy form. The improved cos-softmax loss function is:

L_{pro-cosSoftmax} = L_{cos-Softmax} + λL_conv

L_conv = -||conv(W, W)||_F

wherein: L_{pro-cosSoftmax} is the improved cos-softmax loss function, comprising the cos-softmax loss function L_{cos-Softmax} and the covariance loss function L_conv; N is the size of the training batch; s is a scaling factor that adjusts the weight of the loss function; m is the margin between classes, which adjusts the training intensity; W_j is the j-th column of the weight of the last fully-connected layer of the network, i.e. the representative vector of the j-th class in the training samples, and W_j^* is the normalized W_j; x_i is the feature vector extracted from the i-th picture in the batch, and x_i^* is the normalized x_i; cos(θ_{j,i}) is the cosine of the angle between the normalized feature W_j^* of the j-th class and the normalized feature x_i^* of the i-th sample picture, i.e. the distance between the two feature vectors, and is used in the softmax classification part of the loss function; λ is the weight coefficient of the regularization term; ||conv(W, W)||_F means computing the covariance matrix of W and then taking the Frobenius norm of that matrix, i.e. the square root of the sum of the squared absolute values of the matrix elements, which represents the inter-class distance. Adding the inter-class distance constraint increases the robustness of the model and improves its face recognition accuracy.

p_model(x|θ) = p_model-cls * p_model-samp

wherein: p_model(x|θ) is the probability distribution output by the face recognition model θ for the input picture x, comprising the model classification probability p_model-cls and the model sampling probability p_model-samp; the model classification probability p_model-cls is the classification probability of the cos-softmax loss function; the model sampling probability p_model-samp is fitted with a Gaussian distribution, where μ is the feature mean and σ is the feature variance. Substituting p_model(x|θ) into the cross-entropy formula gives the improved softmax loss function based on Gaussian distribution. With the Gaussian distribution introduced, the model gives larger weight to under-represented samples and therefore pays more attention to low-resolution large-angle face pictures, which greatly improves the recognition rate of the trained model on the large-angle low-resolution face pictures that are common in surveillance video.

To sum up, in the low-resolution large-angle face recognition method based on Gaussian distribution, Step 3 reads the training pictures sequentially in batches and applies data enhancement such as random rotation and scaling, which improves the accuracy of the face recognition model across face angles and resolutions; Step 4 introduces a softmax function improved by Gaussian distribution, which takes the distribution of the training data into account and weights samples by Gaussian probability, improving the face recognition accuracy for under-represented pictures and enhancing the robustness of the model.

It is to be noted that the term "comprises," "comprising," or any other variation thereof is intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims

1. A low-resolution large-angle face recognition method based on Gaussian distribution is characterized by comprising the following specific steps:

step 1: reading a face data set used for training a face recognition model;

step 6: outputting a face recognition model which is finally trained;

step 7: reading a face picture to be recognized;

2. The low-resolution large-angle face recognition method based on the Gaussian distribution as claimed in claim 1, characterized in that: the specific processing process of Step3 comprises random rotation of the training picture, random scaling of the training picture and an improved softmax loss function based on Gaussian distribution.

3. The low-resolution large-angle face recognition method based on the Gaussian distribution as claimed in claim 2, characterized in that: the random rotation of the training picture is realized by counting the angle range of the face of the monitoring image, and the angle distribution of the face of the monitoring image is found to be within the range of +/-30 degrees, so that in order to enable the model to cover all the angle ranges of the face, random rotation operation is added during the training of the face recognition model, and the output face rotation angle is as follows:

θ_out = 30λ_θ - θ_in

4. The low-resolution large-angle face recognition method based on the Gaussian distribution as claimed in claim 2, characterized in that: the random scaling of the training picture is based on statistics of the face resolution in surveillance images, which show that the face resolution falls within a range of 16x16 to 128x128, so that, in order for the model to cover the whole range of face resolutions, a random scaling operation is added when training the face recognition model; since the resolution of the model training picture is 128x128, the training picture is first reduced using bicubic interpolation and then enlarged back to 128x128 using bicubic interpolation, and the resolution of the output face picture is:

s_out = λ_s

5. The low-resolution large-angle face recognition method based on the Gaussian distribution as claimed in claim 2, characterized in that: in the improved softmax loss function based on Gaussian distribution, for input face data x, p_data(x) denotes the true probability distribution of the samples and p_model(x|θ) denotes the probability output by the face recognition model, so the KL divergence of the model is:

wherein: x is a face sample picture; X is the face data set; D_KL(p_data||p_model) is the KL divergence between the probability distribution of the real data and the probability distribution of the model; p_data(x) is the true data probability; p_model(x) is the probability that the model computes for the face picture; H(p) is a model-independent constant. The smaller D_KL(p_data||p_model) is, the closer the model prediction probability is to the real picture probability; after the model-independent constant H(p) is removed, what remains is the cross-entropy form generally used by face recognition loss functions.

Taking the softmax function as the model prediction probability and introducing a margin mechanism gives the cos-softmax loss function, which currently achieves high accuracy in single-picture face recognition. To prevent an overly small inter-class distance from harming the robustness of the face recognition model, the cos-softmax loss function is improved by introducing an inter-class distance constraint, with the inter-class distance serving as a regularization term:

L_{pro-cosSoftmax} = L_{cos-Softmax} + λL_conv

L_conv = -||conv(W, W)||_F

wherein: L_{pro-cosSoftmax} is the improved cos-softmax loss function, comprising the cos-softmax loss function L_{cos-Softmax} and the covariance loss function L_conv; N is the size of the training batch; s is a scaling factor that adjusts the weight of the loss function; m is the margin between classes, which adjusts the training intensity; W_j is the j-th column of the weight of the last fully-connected layer of the network, i.e. the representative vector of the j-th class in the training samples, and W_j^* is the normalized W_j; x_i is the feature vector extracted from the i-th picture in the batch, and x_i^* is the normalized x_i; cos(θ_{j,i}) is the cosine of the angle between the normalized feature W_j^* of the j-th class and the normalized feature x_i^* of the i-th sample picture, i.e. the distance between the two feature vectors, and is used in the softmax classification part of the loss function; λ is the weight coefficient of the regularization term; ||conv(W, W)||_F means computing the covariance matrix of W and then taking the Frobenius norm of that matrix, i.e. the square root of the sum of the squared absolute values of the matrix elements, which represents the inter-class distance.

p_model(x|θ) = p_model-cls * p_model-samp

wherein: p_model(x|θ) is the probability distribution output by the face recognition model θ for the input picture x, comprising the model classification probability p_model-cls and the model sampling probability p_model-samp; the model classification probability p_model-cls is the classification probability of the cos-softmax loss function; the model sampling probability p_model-samp conforms to a Gaussian distribution, where μ is the feature mean and σ is the feature variance; substituting p_model(x|θ) into the cross-entropy formula gives the improved softmax loss function based on Gaussian distribution.

6. The low-resolution large-angle face recognition method based on the Gaussian distribution as claimed in claim 1, characterized in that: the Step1, the Step2, the Step3, the Step4, the Step5 and the Step6 form a model training part, and the Step7, the Step8, the Step9, the Step10 and the Step11 form a face recognition part.