WO2023000792A1 - Methods and apparatuses for constructing a living body recognition model and for living body recognition, device and medium - Google Patents
Methods and apparatuses for constructing a living body recognition model and for living body recognition, device and medium
- Publication number
- WO2023000792A1 (PCT/CN2022/093514)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- living body
- living
- image data
- category
- loss function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/40—Spoof detection, e.g. liveness detection
Definitions
- the present disclosure relates to the field of computer technology, and in particular to a method, an apparatus, a device, and a medium for constructing a living body recognition model and for living body recognition.
- embodiments of the present disclosure provide a method, an apparatus, a device, and a medium for constructing a living body recognition model and for living body recognition.
- the embodiments of the present disclosure provide a method for constructing a living body recognition model.
- the above-mentioned method for constructing a living body recognition model includes: acquiring image data obtained by shooting a target object, the target object including a living object and non-living objects carried by multiple types of physical media; corresponding the image data of the living object to a first label representing the living category; based on the type differences of the physical media, corresponding the image data of the non-living objects to multiple types of second labels representing non-living categories; inputting the image data into a machine learning model for training; and performing multi-classification training on the machine learning model based on the first label and the multiple types of second labels, so as to obtain a living body recognition model.
- performing the multi-classification training on the machine learning model based on the first label and the multiple types of second labels to obtain the living body recognition model includes: in each round of training of the machine learning model, for the input current image data, outputting the respective probability values that the current image data belongs to the living category and to the non-living category corresponding to each of the multiple types of physical media; determining a target loss function for the current image data according to the respective probability values, the target loss function characterizing the degree of deviation between the predicted category of the current image data and the category corresponding to the label of the current image data; and stopping the training when the convergence degree of the target loss function meets a set value, so as to obtain the trained living body recognition model.
- the above target loss function is a weighted sum of a cross-entropy loss function and a triplet-center loss function.
- the above-mentioned cross-entropy loss function is used as the main loss function
- the above-mentioned triplet-center loss function is used as the auxiliary loss function
- the above-mentioned target loss function is the sum of the main loss function and the product of the auxiliary loss function and a weight coefficient, where the weight coefficient takes a value between 0 and 1 that ensures the convergence of the target loss function.
- corresponding the image data of the non-living objects to multiple types of second labels representing non-living categories based on the type differences of the physical media includes: dividing the physical media into a plurality of main categories based on differences in their attribute types; subdividing the physical media under each main category based on differences in at least one of shape and material, so as to obtain sub-categories, where the main categories and the sub-categories all belong to the non-living category; for the image data of each non-living object, determining the target main category or target sub-category corresponding to the physical medium of the current non-living object; and corresponding the image data of the current non-living object to the second label representing the target main category or the target sub-category.
- the above-mentioned main categories include: paper media, screen media, and material media for three-dimensional models; according to differences in shape or processing, the paper media are divided into two or more of the following sub-categories: plain paper, curved paper, cut paper, buttonhole paper, plain photo, curved photo, cropped photo, buttonhole photo; according to differences in device type, the screen media are divided into two or more of the following sub-categories: desktop screen, tablet computer screen, mobile phone screen, laptop computer screen; according to differences in material, the material media for three-dimensional models are divided into two or more of the following sub-categories: plaster model, wooden model, metal model, plastic model.
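For illustration only, the example taxonomy above can be encoded as a flat label map; the Python sketch below uses one integer id per class, matching the 13-class example (one living category plus twelve attack-medium sub-categories) used later in this disclosure. The grouping comments and the id assignment are assumptions, not requirements of the disclosure.

```python
# Illustrative 13-class label map: living category plus 12 attack-medium
# sub-categories, grouped by main category in the comments below.
LABELS = {
    0: "living",
    # paper media, subdivided by shape/processing
    1: "ordinary paper", 2: "curved paper", 3: "cut paper", 4: "buttonhole paper",
    # screen media, subdivided by device type
    5: "desktop screen", 6: "tablet computer screen",
    7: "mobile phone screen", 8: "laptop computer screen",
    # material media for three-dimensional models, subdivided by material
    9: "plaster model", 10: "wooden model", 11: "metal model", 12: "plastic model",
}
NUM_CLASSES = len(LABELS)  # 13
```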
- embodiments of the present disclosure provide a method for living body recognition.
- the above method for living body recognition includes: acquiring image data to be detected, the image data to be detected containing an object to be recognized; and inputting the image data to be detected into a living body recognition model, so as to output the classification result of the object to be recognized as the living category or as the type of physical medium corresponding to a non-living category; the living body recognition model is constructed by the above-mentioned method for constructing a living body recognition model.
- embodiments of the present disclosure provide an apparatus for constructing a living body recognition model.
- the above-mentioned apparatus for constructing a living body recognition model includes: a first data acquisition module, a label association module, an input module and a training module.
- the above-mentioned first data acquisition module is configured to acquire the image data of the target object obtained by shooting, and the above-mentioned target object includes: living objects and non-living objects carried by various types of physical media.
- the label association module is configured to correspond the image data of the living objects to the first label representing the living category, and to correspond, based on the type differences of the physical media, the image data of the non-living objects to the multiple types of second labels representing non-living categories.
- the above-mentioned input module is configured to input the above-mentioned image data into the machine learning model for training.
- the above-mentioned training module is configured to perform multi-classification training on the above-mentioned machine learning model based on the above-mentioned first label and multiple types of the above-mentioned second labels, so as to obtain a living body recognition model.
- embodiments of the present disclosure provide an apparatus for living body recognition.
- the above-mentioned apparatus for living body recognition includes: a second data acquisition module and a recognition module.
- the second data acquisition module is configured to acquire image data to be detected, and the image data to be detected includes an object to be identified.
- the recognition module is configured to input the image data to be detected into the living body recognition model, so as to output the classification result of the object to be recognized as the living body category or the physical medium type corresponding to the non-living body category.
- the above-mentioned living body recognition model is constructed by the above-mentioned method for constructing a living body recognition model or constructed by the above-mentioned device for constructing a living body recognition model.
- embodiments of the present disclosure provide an electronic device.
- the above-mentioned electronic device includes a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another through the communication bus; the memory is used to store a computer program; and the processor is used to execute the program stored in the memory, so as to implement the above-mentioned method for constructing a living body recognition model or method for living body recognition.
- embodiments of the present disclosure provide a computer-readable storage medium.
- a computer program is stored on the above-mentioned computer-readable storage medium, and when the above-mentioned computer program is executed by a processor, the above-mentioned method for constructing a living body recognition model or a method for living body recognition is realized.
- the image data of the non-living objects is corresponded to multiple second labels representing non-living categories.
- the second labels are used for multi-category learning.
- the learning of each attack category then only needs to focus on a smaller number of features, so the task is simpler, machine learning is easier and more efficient, and the living body recognition model obtained after training discriminates well between living objects and non-living objects.
- FIG. 1 schematically shows the system architecture of the method and device for constructing a living body recognition model applicable to an embodiment of the present disclosure
- FIG. 2 schematically shows a flowchart of a method for constructing a living body recognition model according to an embodiment of the present disclosure
- FIG. 3 schematically shows a detailed implementation flowchart of operation S203 according to an embodiment of the present disclosure
- FIG. 4 schematically shows a detailed implementation flowchart of operation S205 according to an embodiment of the present disclosure
- Fig. 5 schematically shows a schematic diagram of the implementation process of constructing a living body recognition model according to an embodiment of the present disclosure
- Figure 6 schematically shows the visual features, on the test set, of the model trained using the cross-entropy loss function (Cross Entropy Loss) as the target loss function;
- Figure 7 schematically shows the visual features, on the test set, of the living body recognition model trained using the weighted sum of the cross-entropy loss function (Cross Entropy Loss) and the triplet-center loss function (Triplet-Center Loss) as the target loss function;
- FIG. 8 schematically shows a flow chart of a method for living body recognition according to an embodiment of the present disclosure
- Fig. 9 schematically shows a structural block diagram of a device for constructing a living body recognition model according to an embodiment of the present disclosure
- Fig. 10 schematically shows a structural block diagram of a device for living body recognition according to an embodiment of the present disclosure.
- Fig. 11 schematically shows a structural block diagram of an electronic device provided by an embodiment of the present disclosure.
- Embodiments of the present disclosure provide a method, an apparatus, a device, and a medium for constructing a living body recognition model and for living body recognition.
- the above-mentioned method for constructing a living body recognition model includes: acquiring image data obtained by shooting a target object, the target object including a living object and non-living objects carried by multiple types of physical media; corresponding the image data of the living object to a first label representing the living category; based on the type differences of the physical media, corresponding the image data of the non-living objects to multiple types of second labels representing non-living categories; inputting the image data into a machine learning model for training; and performing multi-classification training on the machine learning model based on the first label and the multiple types of second labels (corresponding to at least 3 categories: one category for living objects, and at least 2 categories for non-living objects carried by physical media), so as to obtain a living body recognition model.
- a result of the above-mentioned living body recognition model classifying the above-mentioned image data is: a living body category, or a non-living body category corresponding to one type of physical medium among the above-mentioned multiple types of physical media.
- Fig. 1 schematically shows the system architecture of the method and device for constructing a living body recognition model applicable to the embodiments of the present disclosure.
- a system architecture 100 applicable to the method and device for constructing a living body recognition model includes: terminal devices 101 , 102 , 103 , a network 104 and a server 105 .
- the network 104 is used as a medium for providing communication links between the terminal devices 101 , 102 , 103 and the server 105 .
- the network 104 may include various connection types, such as wired links, wireless communication links, or fiber-optic cables, among others.
- users can use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages and the like.
- the terminal devices 101, 102, 103 may be installed with an image capture device, a picture/video playing application, and the like.
- Other communication client applications may also be installed, such as shopping applications, web browser applications, search applications, instant messaging tools, email clients, social platform software, etc. (just examples).
- the terminal devices 101, 102, 103 can be various electronic devices that have a display screen and support picture/video playback.
- the electronic devices can further include image capture devices.
- such electronic devices include but are not limited to smart phones, tablet computers, notebook computers, desktop computers, self-driving cars, surveillance equipment, and more.
- the server 105 may be a server that provides various services, such as a background management server that provides service support for data processing of images or videos captured by users using the terminal devices 101 , 102 , and 103 (just an example).
- the background management server can analyze and process the received data such as image/video processing requests, and feed back the processing results (such as web pages, information, or data obtained or generated according to user requests) to the terminal device.
- the data processing can be performing face recognition processing on the images or on the video frames of videos captured by the terminal devices 101, 102, 103, so as to determine whether the faces in the above images or video frames are real faces or some type of fake face.
- the method for constructing a living body recognition model provided by the embodiments of the present disclosure may generally be executed by the server 105 or a terminal device with certain computing capabilities.
- the apparatus for constructing a living body recognition model provided by the embodiments of the present disclosure may generally be set in the server 105 or the above-mentioned terminal devices with certain computing capabilities.
- the method for constructing a living body recognition model provided by the embodiments of the present disclosure may also be executed by a server or server cluster that is different from the server 105 and can communicate with the terminal devices 101 , 102 , 103 and/or the server 105 .
- the apparatus for constructing a living body recognition model may also be set in a server or a server cluster that is different from the server 105 and can communicate with the terminal devices 101 , 102 , 103 and/or the server 105 .
- the first exemplary embodiment of the present disclosure provides a method of constructing a living body recognition model.
- Fig. 2 schematically shows a flowchart of a method for constructing a living body recognition model according to an embodiment of the present disclosure.
- the method for constructing a living body recognition model includes the following operations: S201 , S202 , S203 , S204 and S205 .
- the above operations S201-S205 may be performed by a terminal device equipped with an image capture device, or by a server.
- image data of a target object captured by shooting is acquired, and the target object includes: living objects and non-living objects carried by various types of physical media.
- the image data of the above-mentioned living body object is corresponded to the first label representing the living body category.
- the image data of the above-mentioned non-living object is corresponded to multiple types of second labels representing the category of non-living objects.
- multi-classification training is performed on the machine learning model based on the first label and multiple types of the second label, so as to obtain a living body recognition model.
- a result of the above-mentioned living body recognition model classifying the above-mentioned image data is: a living body category, or a non-living body category corresponding to one type of physical medium among the above-mentioned multiple types of physical media.
- the above-mentioned living object is a real object, such as a real human part, such as a human face.
- the non-living objects carried by the above-mentioned physical medium may be: human faces on photos, human faces on A4 paper, human faces on screens (such as human faces on mobile phone screens), human faces corresponding to statues, etc.
- the living object can be other real animals, such as cats, dogs, birds, etc.
- in this case, the non-living objects carried by physical media are: cats/dogs/birds on photos, cats/dogs/birds on A4 paper, cats/dogs/birds on screens, cats/dogs/birds corresponding to statues, etc.
- the image data obtained by shooting the target object can be acquired by directly shooting the target object with the terminal device, or by retrieving the image data corresponding to photos and video frames from an image and video database (for example, one captured by monitoring devices).
- for example, the first label is denoted as 0, and the image data of the living object is corresponded (also referred to as associated) with the label 0; the label 0 indicates that the real classification of the living object is the living category.
- the number 0 for the above label is only an example and may be defined as another number, as long as the number corresponds to the represented meaning.
- the image data of the non-living object may be corresponding to multiple different category tags.
- for example, non-living objects are classified, according to differences in physical media, into these 12 non-living categories: ordinary paper, curved paper, cut paper, buttonhole paper, desktop screen, tablet computer screen, mobile phone screen, laptop computer screen, plaster model, wooden model, metal model and plastic model; the corresponding multi-category second labels can then be expressed as: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.
- label 1 indicates that the real classification of non-living objects is non-living objects corresponding to ordinary paper.
- similarly, labels 2 to 12 indicate that the real classification of the non-living object is, respectively, the non-living category corresponding to curved paper, the non-living category corresponding to cut paper, ..., the non-living category corresponding to the metal model, and the non-living category corresponding to the plastic model.
- the machine learning model may be a convolutional neural network, or other types of deep learning networks or other machine learning models.
- multi-classification training may be performed on the machine learning model based on the labels 0, 1, 2, . . . , 11, 12, so as to obtain a living body recognition model.
- the multi-classification training here corresponds to at least 3 categories: the living object corresponds to one category, and, according to the differences in physical medium type, the non-living objects carried by the physical media correspond to at least 2 categories.
- for screen-based attacks, the model mainly relies on the moiré features generated when a screen is photographed, while for paper-based attacks the model mainly relies on the unique fiber texture of paper and on features such as color gamut changes for identification.
- in the embodiments of the present disclosure, the image data of the non-living objects is corresponded to multiple second labels representing non-living categories based on the type differences of the physical media, and when training the machine learning model, multi-classification learning is performed according to the first label and the multi-category second labels of the non-living objects; in this way, the learning task is simpler, and machine learning is easier and more efficient.
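As a concrete, non-limiting illustration of this multi-classification setup, a minimal PyTorch training sketch is given below; the disclosure does not prescribe a framework or backbone, so the `resnet18` backbone, the optimizer settings, and `train_loader` (assumed to yield image batches with integer labels in {0, ..., 12} as in the example above) are all assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(num_classes=NUM_CLASSES)  # any CNN backbone would do
criterion = nn.CrossEntropyLoss()                 # multi-class, not binary
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

for images, labels in train_loader:               # labels in {0, ..., 12}
    logits = model(images)                        # shape: (batch, NUM_CLASSES)
    loss = criterion(logits, labels)              # cross-entropy over all classes
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```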
- Fig. 3 schematically shows a detailed implementation flowchart of operation S203 according to an embodiment of the present disclosure.
- the operation S203 of corresponding the image data of the non-living objects to multiple types of second labels representing non-living categories based on the type differences of the physical media includes the following sub-operations: S2031, S2032, S2033 and S2034.
- in sub-operation S2031, the physical media are divided into a plurality of main categories based on differences in their attribute types.
- in sub-operation S2032, the physical media under each main category are subdivided based on differences in at least one of shape and material, so as to obtain sub-categories.
- the above-mentioned main category and the above-mentioned sub-category belong to the non-living category.
- in sub-operation S2033, the target main category or target sub-category corresponding to the physical medium of the current non-living object is determined; in sub-operation S2034, the image data of the current non-living object is corresponded to the second label representing the target main category or the target sub-category.
- the above-mentioned main categories include: paper media, screen media, and material media for three-dimensional models.
- the above-mentioned paper-based media are divided into two or more of the following sub-categories: plain paper, curved paper, cut paper, buttonhole paper, plain photo, curved photo, cropped photo, buttonhole photo.
- the above-mentioned screen media can be divided into two or more of the following subcategories: desktop screens, tablet computer screens, mobile phone screens, and laptop computer screens.
- the above-mentioned material medium for the three-dimensional model is divided into two or more of the following subcategories: plaster model, wooden model, metal model, plastic model.
- Fig. 4 schematically shows a detailed implementation flowchart of operation S205 according to an embodiment of the present disclosure.
- the above-mentioned operation S205 of performing multi-classification training on the machine learning model based on the first label and the multiple types of second labels to obtain a living body recognition model includes the following sub-operations: S2051, S2052 and S2053.
- in sub-operation S2051, in each round of training of the machine learning model, for the input current image data, the respective probability values that the current image data belongs to the living category and to the non-living category corresponding to each type of physical medium among the multiple types of physical media are output.
- in sub-operation S2052, a target loss function for the current image data is determined according to the respective probability values; the target loss function is used to characterize the degree of deviation between the predicted category of the current image data and the category corresponding to the label of the current image data.
- in sub-operation S2053, the training is stopped when the convergence degree of the target loss function meets the set value, and the trained living body recognition model is obtained.
- the above target loss function is a weighted sum of a cross-entropy loss function and a triplet-center loss function.
- the triplet-center loss (Triplet-Center Loss) combines the advantages of the triplet loss (Triplet Loss) and the center loss (Center Loss).
- the triplet loss makes the sample features of the same class as close as possible during the learning process, and the sample features of different classes as far apart as possible, so as to increase the separability between classes.
- the center loss first provides a class center for each category; during model learning, the distance between a sample and its corresponding class center is minimized, which reduces the intra-class variance and makes the intra-class features more compact.
- the triplet-center loss function can therefore both increase the distance between classes and reduce the variance within classes.
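A sketch of the triplet-center loss in its standard formulation is shown below: pull each feature toward its own learnable class center, and push it at least a margin m away from the nearest other-class center. The margin value and feature dimension are assumptions, and averaging over the batch (rather than summing) is a common implementation choice.

```python
import torch
import torch.nn as nn

class TripletCenterLoss(nn.Module):
    """Pull features toward their own class center; push them at least
    `margin` away from the nearest other-class center. The centers are
    learnable parameters, trained jointly with the network."""
    def __init__(self, num_classes: int, feat_dim: int, margin: float = 5.0):
        super().__init__()
        self.margin = margin
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        dists = torch.cdist(features, self.centers)            # (batch, num_classes)
        pos = dists.gather(1, labels.unsqueeze(1)).squeeze(1)  # D(f_i, c_{y_i})
        # mask out the own-class center, then take the nearest remaining one
        masked = dists.scatter(1, labels.unsqueeze(1), float("inf"))
        neg = masked.min(dim=1).values                         # min_{j != y_i} D(f_i, c_j)
        return torch.clamp(pos + self.margin - neg, min=0).mean()
```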
- Fig. 5 schematically shows a schematic diagram of an implementation process of constructing a living body recognition model according to an embodiment of the present disclosure.
- for example, the target objects include: a real human face, a human face carried by ordinary paper, a human face carried by curved paper, a human face carried by a tablet computer screen, a human face carried by a mobile phone screen, a human face carried by a plaster model, and a human face carried by a metal model; the captured image data of these target objects is acquired, and labels 0 to 6 are respectively used to correspond to the image data of these target objects.
- the large number of acquired image data samples is divided into a training set and a test set, and the image data samples in the training set are input into the machine learning model for multi-classification training.
- the features of each image data sample can be extracted through a weight-sharing convolutional neural network, correspondingly expressed as features 0 to 6, and the target loss function is determined based on the labels of the input image data samples, where the target loss function is the weighted sum of the cross-entropy loss function and the triplet-center loss function.
- after training, the living body recognition model can process an item of image data containing an object to be recognized, randomly drawn from the test set, and obtain the classification result: the living category, or a non-living category whose corresponding medium type is ordinary paper, curved paper, tablet computer screen, mobile phone screen, plaster model or metal model.
- the accuracy of the living body recognition model can be tested on the test set, and the parameters of the living body recognition model can be adjusted accordingly, so that the living body recognition model generalizes across application scenarios.
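A minimal accuracy check on the held-out test set could look as follows, continuing the training sketch above (`test_loader` is assumed):

```python
model.eval()
correct = total = 0
with torch.no_grad():
    for images, labels in test_loader:
        preds = model(images).argmax(dim=1)        # predicted class ids
        correct += (preds == labels).sum().item()
        total += labels.numel()
print(f"test accuracy: {correct / total:.4f}")
```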
- the cross-entropy loss function is used as the main loss function, and the triplet-center loss function is used as the auxiliary loss function (corresponding to multiplication by the weight coefficient λ in formula (3) below).
- the setting of the main loss function ensures that the predicted category output for an image data sample input to the machine learning model is as close as possible to the category corresponding to its real label; the setting of the auxiliary loss function, in multi-category training scenarios with three or more categories, effectively promotes the reduction of the intra-class distance and the simultaneous increase of the inter-class distance.
- the model obtained by training with the weighted sum of the cross-entropy loss function and the triplet-center loss function as the target loss function was also compared with the model obtained by training with only the cross-entropy loss function.
- Figure 6 schematically shows the visual features, on the test set, of the model trained using the cross-entropy loss function (Cross Entropy Loss) as the target loss function, while Figure 7 schematically shows the visual features, on the test set, of the living body recognition model trained using the weighted sum of the cross-entropy loss function and the triplet-center loss function (Triplet-Center Loss) as the target loss function.
- in each figure, the circled part represents real-person features (corresponding to the living category), and the other points outside the circled area represent non-real-face features, i.e., the attack features in face anti-spoofing technology (corresponding to the non-living categories).
- the model training process with at least three categories (including the living category and at least two non-living categories) proposed by the embodiments of the present disclosure only needs to focus on fewer, more essential features; this not only achieves feature focus, but, combined with the target loss function composed of the weighted sum of the cross-entropy loss function and the triplet-center loss function, also makes the overall training process faster and gives it good convergence behavior.
- N represents the total number of samples
- x_i is the input image data sample
- y_i is the actual/real label corresponding to x_i
- taking the label set y_i ∈ {0, 1, 2, 3, ..., 11, 12} as an example, the value 0 represents the living category
- the other values 1 to 12 represent the different attack types, namely: ordinary paper, curved paper, cut paper, buttonhole paper, desktop screen, tablet computer screen, mobile phone screen, laptop computer screen, plaster model, wooden model, metal model and plastic model, each corresponding to a non-living category.
- CNN convolutional neural network
- f_i is the feature extracted by the CNN for sample x_i; c_{y_i} is the center point of the category y_i, and c_j with j ≠ y_i are the center points of the other categories; m is a preset margin hyperparameter of the triplet loss; the Euclidean distance D(f_i, c_{y_i}) is used to characterize the feature distance between the input image data sample x_i and the center point of its own category, and min_{j ≠ y_i} D(f_i, c_j) is the minimum value of the feature distance between x_i and the center points of the other categories; the triplet-center loss function can thus be written as $L_{tc} = \sum_{i=1}^{N} \max\left( D(f_i, c_{y_i}) + m - \min_{j \neq y_i} D(f_i, c_j),\; 0 \right)$.
- the purpose of setting the preset hyperparameter m is to increase the distance between classes; its specific value can be optimized in advance.
- training continues until the target loss function, composed of the triplet-center loss function L_tc and the weighted cross-entropy loss function, converges to a preset level; training the model parameters makes the intra-class distance D(f_i, c_{y_i}) decrease and the inter-class distance min_{j ≠ y_i} D(f_i, c_j) increase.
- the target loss function is the weighted sum of the cross-entropy loss function L_ce and the triplet-center loss function L_tc; denoting the target loss function by L, L satisfies the following expression: $L = L_{ce} + \lambda L_{tc}$ (3)
- $p_{y_i}$ is the probability (or score) that the image data sample x_i is identified as y_i, obtained after passing through the CNN network
- λ is the weight coefficient of the triplet-center loss function
- the value of λ satisfies 0 < λ < 1 and must guarantee the convergence of the target loss function; according to actual experimental results, on the premise of ensuring convergence of the target loss function, λ can be taken as large as possible to improve the training speed.
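Continuing the sketches above, formula (3) translates into a few lines of code; here `logits` are the classifier outputs and `features` the penultimate-layer embeddings (how both are extracted from the backbone is left implicit), and λ = 0.1 is only an example value satisfying 0 < λ < 1.

```python
ce_loss = nn.CrossEntropyLoss()
tc_loss = TripletCenterLoss(num_classes=NUM_CLASSES, feat_dim=512)
lam = 0.1  # example weight; the disclosure only requires 0 < lam < 1 with convergence

def target_loss(logits, features, labels):
    # L = L_ce + lam * L_tc, per formula (3)
    return ce_loss(logits, labels) + lam * tc_loss(features, labels)
```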
- the target loss function provided by the embodiments of the present disclosure, composed of the weighted sum of the cross-entropy loss function and the triplet-center loss function, matches well the multi-classification training process with three or more classes.
- the cross-entropy loss function serves as the main loss function and the triplet-center loss function as the auxiliary loss function; refer to formula (3).
- the setting of the main loss function L_ce ensures that the output corresponding to an image data sample input to the machine learning model is as close as possible to the category of its real label; based on the setting of the auxiliary loss function L_tc, in multi-class training with three or more categories, the feature distance between the input image data sample and its own category center point tends to decrease during training, while the minimum feature distance between the input sample and the other category center points tends to increase; this effectively promotes the reduction of the intra-class distance and the simultaneous increase of the inter-class distance, speeds up the convergence of training, and improves the aggregation within classes and the distinction between different classes.
- by contrast, the target loss function of the present disclosure does not adapt well to a binary classification scenario: there, all attack types act as one large class, so the intra-class distance is larger than in multi-classification and the class is hard to aggregate; the data corresponding to the non-living type does not aggregate well within its class during training, and the minimum feature distance between an input image data sample and the other category center points (in the binary scenario there is only one other category center point, and it is unstable) cannot increase in a regular manner, so the classes are hard to separate and the convergence speed is very slow.
- the idea proposed by the embodiments of the present disclosure, of combining multi-classification training with three or more classes and a target loss function in the weighted form of the cross-entropy loss function and the triplet-center loss function, is original and has excellent effects.
- a second exemplary embodiment of the present disclosure provides a method of living body recognition.
- Fig. 8 schematically shows a flow chart of a method for living body recognition according to an embodiment of the present disclosure.
- the living body recognition method provided by the embodiment of the present disclosure includes the following operations: S801 and S802.
- image data to be detected is acquired, and the image data to be detected includes an object to be recognized.
- the image data to be detected can be image data containing an object to be recognized in various application scenarios, for example face-recognition clock-in on a face recognition attendance machine, or security verification on a personal smart device.
- the image data to be detected may be: image data of a real user captured against the surrounding background, or image data captured against the surrounding background of an illegitimate user holding up a face photo or an A4 sheet printed with a human face.
- the above-mentioned image data to be detected is input into the living body recognition model, so as to output the classification result of the above-mentioned object to be recognized as a living body type or a physical medium type corresponding to a non-living body type.
- the living body recognition model can perform feature extraction and recognition on the object to be recognized in the input image data to be detected, and identify whether the classification result of the object to be recognized is the living category or a specific physical medium type among the non-living categories.
- the above-mentioned living body recognition model is constructed by the method for constructing a living body recognition model described in the first embodiment.
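For illustration, a single-image inference sketch under the same assumptions (`preprocess`, producing a normalized input tensor, and `image` are hypothetical):

```python
import torch.nn.functional as F

model.eval()
with torch.no_grad():
    logits = model(preprocess(image).unsqueeze(0))  # batch of one
    probs = F.softmax(logits, dim=1)[0]             # per-class probabilities
pred = int(probs.argmax())
print(f"classification result: {LABELS[pred]} (p={probs[pred]:.3f})")
```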
- the living body recognition model discriminates well between living objects and non-living objects, because multi-category learning is performed according to the first label of living objects and the multi-category second labels of non-living objects; the task is simpler, machine learning is easier and more efficient, and the model can quickly extract and classify the feature information of the object to be recognized in the image data to be detected, with high efficiency and high recognition accuracy.
- a third exemplary embodiment of the present disclosure provides an apparatus for constructing a living body recognition model.
- Fig. 9 schematically shows a structural block diagram of an apparatus for constructing a living body recognition model according to an embodiment of the present disclosure.
- an apparatus 900 for constructing a living body recognition model includes: a first data acquisition module 901, a label association module 902, an input module 903 and a training module 904.
- the above-mentioned first data acquisition module 901 is configured to acquire the image data of the target object obtained by shooting, and the above-mentioned target object includes: living objects and non-living objects carried by various types of physical media.
- the label association module 902 is configured to correspond the image data of the living object to the first label representing the living category, and to correspond, based on the type differences of the physical media, the image data of the non-living objects to the multiple types of second labels representing non-living categories.
- the label association module 902 includes functional modules or sub-modules for implementing the above-mentioned sub-operations S2031-S2034.
- the above-mentioned input module 903 is configured to input the above-mentioned image data into the machine learning model for training.
- the above-mentioned training module 904 is configured to perform multi-classification training on the above-mentioned machine learning model based on the above-mentioned first label and multiple types of the above-mentioned second labels, so as to obtain a living body recognition model.
- the result of the above-mentioned living body recognition model classifying the above-mentioned image data is: a living body category, or a non-living body category corresponding to one type of physical medium among the above-mentioned multiple types of physical media.
- the above-mentioned training module 904 includes functional modules or sub-modules for implementing the above-mentioned sub-operations S2051-S2053.
- a fourth exemplary embodiment of the present disclosure provides an apparatus for living body recognition.
- Fig. 10 schematically shows a structural block diagram of a device for living body recognition according to an embodiment of the present disclosure.
- an apparatus 1000 for living body recognition provided by an embodiment of the present disclosure includes: a second data acquisition module 1001 and a recognition module 1002.
- the second data acquisition module 1001 is configured to acquire image data to be detected, and the image data to be detected includes an object to be identified.
- the recognition module 1002 is configured to input the image data to be detected into the living body recognition model, so as to output the classification result of the object to be recognized as the living body category or the physical medium type corresponding to the non-living body category.
- the above-mentioned living body recognition model is constructed by the above-mentioned method for constructing a living body recognition model or constructed by the above-mentioned device for constructing a living body recognition model.
- the above-mentioned device 1000 for living body recognition may store a pre-built living body recognition model, or may perform data communication with a device for building a living body recognition model, so as to call the constructed living body recognition model to process the image data to be detected, In order to obtain the classification result of the object to be recognized.
- any number of the first data acquisition module 901, the label association module 902, the input module 903 and the training module 904 may be combined into one module, or any one of them may be split into multiple modules; alternatively, at least part of the functions of one or more of these modules may be combined with at least part of the functions of other modules and implemented in one module.
- At least one of the first data acquisition module 901, the label association module 902, the input module 903 and the training module 904 may be at least partially implemented as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system-on-chip, a system-on-substrate, a system-in-package, an application-specific integrated circuit (ASIC), or any other reasonable way of integrating or packaging circuits, such as hardware or firmware; or by any one of the three implementations of software, hardware and firmware, or by an appropriate combination of any of them.
- FPGA field programmable gate array
- PLA programmable logic array
- ASIC application-specific integrated circuit
- at least one of the first data acquisition module 901, the label association module 902, the input module 903 and the training module 904 may be at least partially implemented as a computer program module, and when the computer program module is executed, corresponding functions may be performed .
- any number of the second data acquisition module 1001 and the recognition module 1002 may be combined into one module, or any one of them may be split into multiple modules; alternatively, at least part of the functions of one or more of these modules may be combined with at least part of the functions of other modules and implemented in one module.
- At least one of the second data acquisition module 1001 and the recognition module 1002 may be at least partially implemented as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system-on-chip, a system-on-substrate, a system-in-package, an application-specific integrated circuit (ASIC), or any other reasonable way of integrating or packaging circuits, such as hardware or firmware; or by any one of the three implementations of software, hardware and firmware, or by an appropriate combination of any of them.
- FPGA field programmable gate array
- PLA programmable logic array
- ASIC application-specific integrated circuit
- at least one of the second data acquisition module 1001 and the recognition module 1002 may be at least partially implemented as a computer program module, and when the computer program module is executed, corresponding functions may be performed.
- a fifth exemplary embodiment of the present disclosure provides an electronic device.
- Fig. 11 schematically shows a structural block diagram of an electronic device provided by an embodiment of the present disclosure.
- an electronic device 1100 provided by an embodiment of the present disclosure includes a processor 1101, a communication interface 1102, a memory 1103, and a communication bus 1104, where the processor 1101, the communication interface 1102, and the memory 1103 communicate with one another via the communication bus 1104.
- the memory 1103 is used to store a computer program; the processor 1101 is used to execute the program stored in the memory to implement the above-mentioned method for constructing a living body recognition model or method for living body recognition.
- the sixth exemplary embodiment of the present disclosure also provides a computer-readable storage medium.
- a computer program is stored on the above-mentioned computer-readable storage medium, and when the above-mentioned computer program is executed by a processor, the method for constructing a living body recognition model or the method for living body recognition as described above is realized.
- the computer-readable storage medium may be included in the apparatus/device described in the above embodiments; or it may exist independently without being assembled into the apparatus/device.
- the above-mentioned computer-readable storage medium carries one or more programs, and when the above-mentioned one or more programs are executed, the method according to the embodiment of the present disclosure is implemented.
- the computer-readable storage medium may be a non-volatile computer-readable storage medium, such as, but not limited to: a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
- a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Multimedia (AREA)
- Biophysics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Human Computer Interaction (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Image Analysis (AREA)
Abstract
The present invention relates to methods and apparatuses for constructing a living body recognition model and for living body recognition, a device and a medium. The method for constructing the living body recognition model comprises: acquiring image data obtained by photographing a target subject, the target subject comprising: a living subject and non-living objects carried by multiple types of physical media; corresponding the image data of the living subject to a first label that represents a living category; on the basis of the type differences of the physical media, corresponding the image data of the non-living objects to multiple types of second labels that represent non-living categories; inputting the image data into a machine learning model for training; and performing multi-class training on the machine learning model on the basis of the first label and the multiple types of second labels, so as to obtain a living body recognition model.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110833025.X | 2021-07-22 | ||
| CN202110833025.XA CN115690918A (zh) | 2021-07-22 | 2021-07-22 | 构建活体识别模型和活体识别的方法、装置、设备及介质 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023000792A1 true WO2023000792A1 (fr) | 2023-01-26 |
Family
ID=84978915
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2022/093514 Ceased WO2023000792A1 (fr) | 2021-07-22 | 2022-05-18 | Procédés et appareils pour construire un modèle d'identification de corps vivant et pour une identification de corps vivant, dispositif et support |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN115690918A (fr) |
| WO (1) | WO2023000792A1 (fr) |
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116469179A (zh) * | 2023-04-20 | 2023-07-21 | 支付宝(杭州)信息技术有限公司 | 一种活体识别的方法和系统 |
| CN116524609A (zh) * | 2023-04-20 | 2023-08-01 | 支付宝(杭州)信息技术有限公司 | 活体检测方法及系统 |
| CN116543201A (zh) * | 2023-04-11 | 2023-08-04 | 中国人民解放军国防科技大学 | 结构化平面物体识别模型训练及识别方法 |
| CN116740782A (zh) * | 2023-05-30 | 2023-09-12 | 北京百度网讯科技有限公司 | 图像处理以及模型获取方法、装置、电子设备及存储介质 |
| CN117196560A (zh) * | 2023-11-07 | 2023-12-08 | 深圳市慧云智跑网络科技有限公司 | 一种基于物联网的打卡设备数据采集方法及系统 |
| CN118941833A (zh) * | 2024-10-12 | 2024-11-12 | 西南交通大学 | 一种基于层次特征记忆学习的皮肤病识别方法 |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110765923A (zh) * | 2019-10-18 | 2020-02-07 | 腾讯科技(深圳)有限公司 | 一种人脸活体检测方法、装置、设备及存储介质 |
| CN111191519A (zh) * | 2019-12-09 | 2020-05-22 | 同济大学 | 一种用于移动供电装置用户接入的活体检测方法 |
| CN112036331A (zh) * | 2020-09-03 | 2020-12-04 | 腾讯科技(深圳)有限公司 | 活体检测模型的训练方法、装置、设备及存储介质 |
| CN112270288A (zh) * | 2020-11-10 | 2021-01-26 | 深圳市商汤科技有限公司 | 活体识别、门禁设备控制方法和装置、电子设备 |
| CN112597885A (zh) * | 2020-12-22 | 2021-04-02 | 北京华捷艾米科技有限公司 | 人脸活体检测方法、装置、电子设备及计算机存储介质 |
| CN112883831A (zh) * | 2021-01-29 | 2021-06-01 | 北京市商汤科技开发有限公司 | 一种活体检测的方法、装置、电子设备及存储介质 |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111401348B (zh) * | 2020-06-05 | 2020-09-04 | 支付宝(杭州)信息技术有限公司 | 一种目标对象的活体检测方法及系统 |
| CN111767900B (zh) * | 2020-07-28 | 2024-01-26 | 腾讯科技(深圳)有限公司 | 人脸活体检测方法、装置、计算机设备及存储介质 |
| CN112085072B (zh) * | 2020-08-24 | 2022-04-29 | 北方民族大学 | 基于时空特征信息的草图检索三维模型的跨模态检索方法 |
| CN112464873B (zh) * | 2020-12-09 | 2025-04-04 | 携程计算机技术(上海)有限公司 | 模型的训练方法、人脸活体识别方法、系统、设备及介质 |
- 2021-07-22: CN application CN202110833025.XA filed, published as CN115690918A (status: Pending)
- 2022-05-18: PCT application PCT/CN2022/093514 filed, published as WO2023000792A1 (status: Ceased)
Patent Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110765923A (zh) * | 2019-10-18 | 2020-02-07 | 腾讯科技(深圳)有限公司 | 一种人脸活体检测方法、装置、设备及存储介质 |
| CN111191519A (zh) * | 2019-12-09 | 2020-05-22 | 同济大学 | 一种用于移动供电装置用户接入的活体检测方法 |
| CN112036331A (zh) * | 2020-09-03 | 2020-12-04 | 腾讯科技(深圳)有限公司 | 活体检测模型的训练方法、装置、设备及存储介质 |
| CN112270288A (zh) * | 2020-11-10 | 2021-01-26 | 深圳市商汤科技有限公司 | 活体识别、门禁设备控制方法和装置、电子设备 |
| CN112597885A (zh) * | 2020-12-22 | 2021-04-02 | 北京华捷艾米科技有限公司 | 人脸活体检测方法、装置、电子设备及计算机存储介质 |
| CN112883831A (zh) * | 2021-01-29 | 2021-06-01 | 北京市商汤科技开发有限公司 | 一种活体检测的方法、装置、电子设备及存储介质 |
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116543201A (zh) * | 2023-04-11 | 2023-08-04 | 中国人民解放军国防科技大学 | 结构化平面物体识别模型训练及识别方法 |
| CN116469179A (zh) * | 2023-04-20 | 2023-07-21 | 支付宝(杭州)信息技术有限公司 | 一种活体识别的方法和系统 |
| CN116524609A (zh) * | 2023-04-20 | 2023-08-01 | 支付宝(杭州)信息技术有限公司 | 活体检测方法及系统 |
| CN116740782A (zh) * | 2023-05-30 | 2023-09-12 | 北京百度网讯科技有限公司 | 图像处理以及模型获取方法、装置、电子设备及存储介质 |
| CN117196560A (zh) * | 2023-11-07 | 2023-12-08 | 深圳市慧云智跑网络科技有限公司 | 一种基于物联网的打卡设备数据采集方法及系统 |
| CN117196560B (zh) * | 2023-11-07 | 2024-02-13 | 深圳市慧云智跑网络科技有限公司 | 一种基于物联网的打卡设备数据采集方法及系统 |
| CN118941833A (zh) * | 2024-10-12 | 2024-11-12 | 西南交通大学 | 一种基于层次特征记忆学习的皮肤病识别方法 |
Also Published As
| Publication number | Publication date |
|---|---|
| CN115690918A (zh) | 2023-02-03 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2023000792A1 (fr) | Procédés et appareils pour construire un modèle d'identification de corps vivant et pour une identification de corps vivant, dispositif et support | |
| US10755084B2 (en) | Face authentication to mitigate spoofing | |
| JP7369500B2 (ja) | 閾値ベースのマッチングによる遠隔ユーザの身元確認 | |
| Oh et al. | Faceless person recognition: Privacy implications in social media | |
| CN107403173B (zh) | 一种人脸识别系统及方法 | |
| CN105659286B (zh) | 自动化图像裁剪和分享 | |
| KR101647691B1 (ko) | 하이브리드 기반의 영상 클러스터링 방법 및 이를 운용하는 서버 | |
| WO2022246989A1 (fr) | Procédé et appareil d'identification de données, et dispositif et support de stockage lisible | |
| US11126827B2 (en) | Method and system for image identification | |
| WO2018176954A1 (fr) | Procédé, dispositif et système de fourniture d'objets pour se faire des amis | |
| CN112330331A (zh) | 基于人脸识别的身份验证方法、装置、设备及存储介质 | |
| CN113221721A (zh) | 图像识别方法、装置、设备及介质 | |
| CN112041847B (zh) | 提供具有隐私标签的图像 | |
| CN113591603B (zh) | 证件的验证方法、装置、电子设备及存储介质 | |
| WO2021128846A1 (fr) | Procédé et appareil de contrôle de fichier électronique, et dispositif informatique et support de stockage | |
| CN111582228B (zh) | 活体掌纹的识别方法、装置、设备及存储介质 | |
| Geradts | Digital, big data and computational forensics | |
| US9332031B1 (en) | Categorizing accounts based on associated images | |
| WO2019129293A1 (fr) | Procédé et appareil de génération de données de caractéristiques, et procédé et appareil de mise en correspondance de caractéristiques | |
| US12306880B2 (en) | Systems and methods for classifying documents | |
| US20240184860A1 (en) | Methods and arrangements for providing impact imagery | |
| KR102060110B1 (ko) | 컨텐츠에 포함되는 객체를 분류하는 방법, 장치 및 컴퓨터 프로그램 | |
| CN117892280A (zh) | 视频双因子身份认证方法、系统、电子设备及存储介质 | |
| KR20200009887A (ko) | 디바이스에서 실시간 이미지 유사성을 결정하는 방법 및 시스템 | |
| Farooqui et al. | Automatic detection of fake profiles in online social network using soft computing |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 31.05.2024) |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 22844959; Country of ref document: EP; Kind code of ref document: A1 |