CN118411746B - Face fusion method, system, electronic device and storage medium

Info

Publication number
CN118411746B
Authority
CN
China
Prior art keywords
face
image
ornament
fusion
source
Prior art date
Legal status
Active
Application number
CN202410454968.5A
Other languages
Chinese (zh)
Other versions
CN118411746A (en)
Inventor
肖冠正
吴凯文
张子荷
范胜旭
Current Assignee
iMusic Culture and Technology Co Ltd
Original Assignee
iMusic Culture and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by iMusic Culture and Technology Co Ltd
Priority to CN202410454968.5A
Publication of CN118411746A
Application granted
Publication of CN118411746B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/02 Affine transformations
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing


Abstract

The application discloses a face fusion method, system, electronic device and storage medium. The method performs face detection and alignment processing on a source face image and a target face image respectively to obtain a source face affine transformation matrix and a target face affine transformation matrix, and crops the source face image and the target face image accordingly to obtain a source face region image and a target face region image. Feature extraction processing is performed on the source face region image to obtain ornament features and source face features; the source face features and the target face region image are jointly input into a face fusion model for face fusion processing to obtain an initial face fusion result; the initial face fusion result and the ornament features are input into an ornament generation model for reconstruction processing to obtain an ornament reconstruction result; and the ornament reconstruction result is overlaid onto the target face region image according to the inverse matrix of the target face affine transformation matrix to obtain a target face fusion result. The application improves the realism of face fusion and can be widely applied in the technical field of computer vision.

Description

Face fusion method, system, electronic device and storage medium
Technical Field
The present application relates to the field of computer vision, and in particular, to a face fusion method, a face fusion system, an electronic device, and a storage medium.
Background
In the related art, there are face fusion methods based on image processing techniques and feature-point alignment algorithms, which generate a new composite image by simply fusing the features of a source image with the appearance of a target image. However, such methods tend to perform poorly on complex facial expressions, poses and illumination changes, and often produce unnatural or distorted synthetic results. Moreover, when the face in the image wears ornaments, the synthesized face carries only the facial features of the source face and cannot accurately restore the ornament features.
In summary, these technical problems in the related art remain to be solved.
Disclosure of Invention
The embodiment of the application mainly aims to provide a face fusion method, a face fusion system, electronic equipment and a storage medium, which can improve the accuracy and naturalness of a fusion result.
In order to achieve the above object, an aspect of an embodiment of the present application provides a face fusion method, where the method includes:
acquiring a source face image and a target face image;
performing face detection and alignment processing on the source face image and the target face image respectively, and calculating a source face affine transformation matrix and a target face affine transformation matrix;
cropping the source face image and the target face image according to the source face affine transformation matrix and the target face affine transformation matrix, to obtain a source face region image and a target face region image;
performing feature extraction processing on the source face region image to obtain ornament features and source face features;
jointly inputting the source face features and the target face region image into a face fusion model for face fusion processing, to obtain an initial face fusion result;
inputting the initial face fusion result and the ornament features into an ornament generation model for reconstruction processing, to obtain an ornament reconstruction result;
and overlaying the ornament reconstruction result onto the target face region image according to the inverse matrix of the target face affine transformation matrix, to obtain a target face fusion result.
In some embodiments, the face detection and alignment processing are performed on the source face image and the target face image respectively, and the source face affine transformation matrix and the target face affine transformation matrix are obtained by calculation, including:
obtaining a standard face template image;
inputting the source face image and the target face image respectively into a face detection model for face detection processing, to obtain source face key point information and target face key point information;
and performing face alignment processing on the source face key point information and the target face key point information respectively according to the standard face template image, and calculating the source face affine transformation matrix and the target face affine transformation matrix.
In some embodiments, the performing feature extraction processing on the source face area image to obtain the ornament feature and the source face feature includes:
inputting the source face region image into an ornament classification model to perform ornament feature extraction processing to obtain ornament features;
performing bilinear interpolation on the source face region image to obtain a scaled image;
and inputting the scaled image into a face feature extraction model to perform face feature extraction processing to obtain source face features.
In some embodiments, before the step of inputting the source face feature and the target face region image into a face fusion model in a combined manner to perform face fusion processing to obtain an initial face fusion result, the method further includes pre-training the face fusion model, and specifically includes:
Acquiring high-resolution portrait training data and low-resolution portrait training data;
Performing face detection and alignment processing on the high-resolution portrait training data, and cutting to obtain a training target face image;
Respectively extracting face features of the high-resolution portrait training data and the low-resolution portrait training data to obtain training face features;
Inputting the training face features and the training target face images into the face fusion model to obtain a training fusion result;
Carrying out loss value calculation processing on the training fusion result according to the face fusion loss function to obtain a face fusion loss value;
And updating parameters of the face fusion model according to the face fusion loss value.
In some embodiments, the performing a loss value calculation process on the training fusion result according to a face fusion loss function to obtain a face fusion loss value includes:
calculating the face feature cosine similarity loss, the pixel loss, the perceptual loss and the adversarial loss according to the training fusion result;
and performing weighted summation on the face feature cosine similarity loss, the pixel loss, the perceptual loss and the adversarial loss according to the face fusion loss function, to obtain the face fusion loss value.
In some embodiments, before the initial face fusion result and the ornament feature are input into the ornament generation model for reconstruction processing, the method further includes training the ornament generation model in advance, and specifically includes:
constructing and obtaining a training image by a computer synthesis method;
performing ornament feature extraction processing on the training image to obtain training ornament features;
inputting the training ornament features and the training image into the ornament generation model to obtain a training generation result;
performing loss value calculation processing on the training generation result according to the ornament generation loss function to obtain an ornament generation loss value;
updating parameters of the ornament generation model according to the ornament generation loss value.
In some embodiments, the constructing the training image by a computer synthesis method includes:
Constructing and obtaining a simulated digital person and ornament model by a three-dimensional modeling technology;
Carrying out random condition modification processing on the simulated digital person to obtain a random model;
And synthesizing the ornament model and the random model to obtain a training image.
To achieve the above object, another aspect of an embodiment of the present application provides a face fusion system, including:
The first module is used for acquiring a source face image and a target face image;
The second module is used for performing face detection and alignment processing on the source face image and the target face image respectively, and calculating a source face affine transformation matrix and a target face affine transformation matrix;
The third module is used for cropping the source face image and the target face image according to the source face affine transformation matrix and the target face affine transformation matrix, to obtain a source face region image and a target face region image;
The fourth module is used for performing feature extraction processing on the source face region image to obtain ornament features and source face features;
The fifth module is used for jointly inputting the source face features and the target face region image into a face fusion model for face fusion processing, to obtain an initial face fusion result;
The sixth module is used for inputting the initial face fusion result and the ornament features into an ornament generation model for reconstruction processing, to obtain an ornament reconstruction result;
The seventh module is used for overlaying the ornament reconstruction result onto the target face region image according to the inverse matrix of the target face affine transformation matrix, to obtain a target face fusion result.
To achieve the above object, another aspect of the embodiments of the present application provides an electronic device, including a memory and a processor, where the memory stores a computer program, and the processor implements the method described above when executing the computer program.
To achieve the above object, another aspect of the embodiments of the present application proposes a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method described above.
The face fusion method, system, electronic device and storage medium have the following beneficial effects. Face detection and alignment processing are performed on the source face image and the target face image respectively, and a source face affine transformation matrix and a target face affine transformation matrix are obtained by calculation; the source face image and the target face image are cropped according to these matrices to obtain a source face region image and a target face region image; and feature extraction processing is performed on the source face region image to obtain ornament features and source face features. In this way the ornament features and source face features can be accurately extracted, improving the realism of the fusion effect. The scheme further inputs the source face features and the target face region image jointly into a face fusion model for face fusion processing to obtain an initial face fusion result, and inputs the initial face fusion result and the ornament features into an ornament generation model for reconstruction processing to obtain an ornament reconstruction result, so that the ornament features can be accurately restored. The ornament reconstruction result is then overlaid onto the target face region image according to the inverse of the target face affine transformation matrix to obtain the target face fusion result, which improves the accuracy and realism of the fusion.
Drawings
Fig. 1 is a flowchart of a face fusion method provided in an embodiment of the present application;
Fig. 2 is a schematic diagram of image preprocessing according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a source face feature extraction process according to an embodiment of the present application;
Fig. 4 is a schematic diagram of a target face extraction process according to an embodiment of the present application;
Fig. 5 is a schematic diagram of a face feature fusion process according to an embodiment of the present application;
Fig. 6 is a schematic diagram of an ornament reconstruction process according to an embodiment of the present application;
Fig. 7 is a schematic diagram of an affine transformation overlay process provided by an embodiment of the present application;
Fig. 8 is a schematic structural diagram of a face fusion system according to an embodiment of the present application;
Fig. 9 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application;
Fig. 10 is a schematic structural diagram of a storage medium according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with embodiments of the application, but are merely examples of systems and methods consistent with aspects of embodiments of the application as detailed in the accompanying claims.
It is to be understood that the terms "first," "second," and the like, as used herein, may describe various concepts, but those concepts are not limited by these terms unless otherwise specified; the terms are only used to distinguish one concept from another. For example, first information may also be referred to as second information and, similarly, second information may be referred to as first information without departing from the scope of the embodiments of the present application. The word "if," as used herein, may be interpreted as "when" or "in response to a determination," depending on the context.
As used herein, "at least one" includes one, two or more; "a plurality" includes two or more; "each" refers to every one of the corresponding plurality; and "any" refers to any one of the plurality.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the application only and is not intended to be limiting of the application.
Before the embodiments of the present application are described in detail, some of the terms and expressions involved are first explained; the following explanations apply to the terms and expressions used throughout the embodiments of the present application.
Computer Vision (CV) is the science of studying how to make machines "see": replacing human eyes with cameras and computers to recognize, track and measure objects, and further processing the images so that they are better suited for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and techniques in an attempt to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision techniques typically include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D techniques, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric recognition techniques such as face recognition and fingerprint recognition.
Face fusion is a technique for synthesizing two or more face images, aiming to generate a new image that has the features of the source image and the appearance of the target image. Face fusion technology has developed remarkably and is widely applied in fields such as virtual reality, entertainment and advertising.
In the related art, there are face fusion methods based on image processing techniques and feature-point alignment algorithms, which generate a new composite image by simply fusing the features of a source image with the appearance of a target image. However, such methods tend to perform poorly on complex facial expressions, poses and illumination changes, and often produce unnatural or distorted synthetic results. There are also methods that perform face fusion through a convolutional neural network; however, when the source face wears ornaments, after the source face features are fused onto the target face, the synthesized face carries only the facial features of the source face, and the ornament features cannot be accurately restored.
In view of the above, the embodiments of the present application provide a face fusion method, system, electronic device and storage medium. Face detection and alignment processing are performed on the source face image and the target face image respectively, and a source face affine transformation matrix and a target face affine transformation matrix are obtained by calculation. The source face image and the target face image are cropped according to these matrices to obtain a source face region image and a target face region image, and feature extraction is performed on the source face region image to obtain ornament features and source face features; in this way the ornament features and source face features can be accurately extracted, improving the realism of the fusion effect. The scheme further inputs the source face features and the target face region image jointly into a face fusion model for face fusion processing to obtain an initial face fusion result, and inputs the initial face fusion result and the ornament features into an ornament generation model for reconstruction processing to obtain an ornament reconstruction result, so that the ornament features can be accurately restored. Finally, the ornament reconstruction result is overlaid onto the target face region image according to the inverse of the target face affine transformation matrix to obtain the target face fusion result, which improves the accuracy and realism of the fusion.
The embodiment of the application provides a face fusion method, which relates to the technical field of computer vision. The face fusion method provided by the embodiment of the application can be applied to a terminal, to a server, or to software running in a terminal or server. In some embodiments, the terminal may be, but is not limited to, a smart phone, tablet computer, notebook computer, desktop computer, smart speaker, smart watch, vehicle terminal or the like. The server may be an independent physical server, a server cluster or distributed system formed by multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, and big data and artificial intelligence platforms; the server may also be a node server in a blockchain network. The software may be an application implementing the face fusion method, although it is not limited to the above forms.
The application is operational with numerous general purpose or special purpose computer system environments or configurations. Such as a personal computer, a server computer, a hand-held or portable device, a tablet device, a multiprocessor system, a microprocessor-based system, a set top box, a programmable consumer electronics, a network PC, a minicomputer, a mainframe computer, a distributed computing environment that includes any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
It should be noted that, in each specific embodiment of the present application, when related processing is required according to user information, user behavior data, user history data, user location information, and other data related to user identity or characteristics, permission or consent of the user is obtained first, and the collection, use, processing, and the like of the data comply with related laws and regulations and standards. In addition, when the embodiment of the application needs to acquire the sensitive personal information of the user, the independent permission or independent consent of the user is acquired through popup or jump to a confirmation page and the like, and after the independent permission or independent consent of the user is definitely acquired, the necessary relevant data of the user for enabling the embodiment of the application to normally operate is acquired.
Fig. 1 is an optional flowchart of a face fusion method according to an embodiment of the present application, where the method in fig. 1 may include, but is not limited to, steps S101 to S107.
Step S101, acquiring a source face image and a target face image;
Step S102, performing face detection and alignment processing on the source face image and the target face image respectively, and calculating a source face affine transformation matrix and a target face affine transformation matrix;
Step S103, cropping the source face image and the target face image according to the source face affine transformation matrix and the target face affine transformation matrix respectively, to obtain a source face region image and a target face region image;
Step S104, performing feature extraction processing on the source face region image to obtain ornament features and source face features;
Step S105, jointly inputting the source face features and the target face region image into a face fusion model for face fusion processing, to obtain an initial face fusion result;
Step S106, inputting the initial face fusion result and the ornament features into an ornament generation model for reconstruction processing, to obtain an ornament reconstruction result;
Step S107, overlaying the ornament reconstruction result onto the target face region image according to the inverse matrix of the target face affine transformation matrix, to obtain a target face fusion result.
In steps S101 to S107 of the embodiment of the present application, a source face image and a target face image input by a user are acquired. The source face image may be a self-portrait uploaded by the user or an image containing a face selected from an album, and may also be a cartoon image or a hand-drawn image. The target face image is the image with which the relevant features extracted from the source face image are to be fused; it can be understood as a material face image or template face image, such as an ID photo, a cartoon image or an ancient-costume image. The source face image and the target face image are then cropped according to the source face affine transformation matrix and the target face affine transformation matrix to obtain a source face region image and a target face region image. These region images are obtained by inputting the source and target face images into a face detection model for detection and alignment processing, computing the affine transformation matrices from the alignment result at a preset image size, and cropping the face regions from the two images accordingly. Feature extraction processing is then performed on the source face region image: an ornament feature detection model and a face feature detection model extract the ornament features and the source face features, where the ornaments may include glasses, masks, earrings and the like. Finally, the source face features and the target face region image are jointly input into a face fusion model for face fusion processing to obtain an initial face fusion result; the initial face fusion result and the ornament features are input into an ornament generation model for reconstruction processing to obtain an ornament reconstruction result; and the ornament reconstruction result is overlaid onto the target face region image according to the inverse matrix of the target face affine transformation matrix to obtain the target face fusion result. By calculating the inverse of the target face affine transformation matrix, the aligned face image can be restored to the pose and position of the original image, so that the ornament reconstruction result is overlaid back onto the face region cropped from the target face. In this way, the embodiment of the application solves the problems that existing face fusion methods tend to produce unnatural or distorted results when handling complex facial expressions, poses and illumination changes and cannot reconstruct eyewear features, thereby improving the realism and fidelity of the fusion effect.
In step S101 of some embodiments, the source face image and the target face image may be obtained through user upload; they may also be obtained in other ways, such as from an album or a material library, and the method is not limited in this respect. Referring to fig. 2, in the embodiment of the present application, the obtained images are preprocessed when the source face image and the target face image are acquired. Specifically, the source face image and the target face image are decoded, and the image data is recorded in the form of numpy arrays. The pixel values of the image data are normalized from the range 0-255 to the interval -1 to 1, yielding the preprocessed source face image and the preprocessed target face image.
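As an illustration only, the preprocessing described above can be sketched in a few lines of Python; the OpenCV decoding call is one possible choice, and the helper name and file paths are assumptions rather than the patent's implementation:

```python
import cv2
import numpy as np

def preprocess(path: str) -> np.ndarray:
    """Decode an image file and normalize its pixel values from 0-255 to -1..1."""
    bgr = cv2.imread(path)  # decoded image as a uint8 numpy array, H x W x 3
    if bgr is None:
        raise FileNotFoundError(path)
    return bgr.astype(np.float32) / 127.5 - 1.0  # map [0, 255] to [-1, 1]

source_face = preprocess("source_face.jpg")   # hypothetical file path
target_face = preprocess("target_face.jpg")   # hypothetical file path
```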
In step S102 of some embodiments, face detection and alignment processing are performed on the source face image and the target face image, respectively, to calculate a source face affine transformation matrix and a target face affine transformation matrix, which include:
obtaining a standard face template image;
inputting the source face image and the target face image respectively into the face detection model for face detection processing, to obtain source face key point information and target face key point information;
and performing face alignment processing on the source face key point information and the target face key point information respectively according to the standard face template image, and calculating the source face affine transformation matrix and the target face affine transformation matrix.
In the embodiment of the application, a standard face template image is first obtained. The standard face template image is an image with standard face proportions and can be obtained from a face image database. Referring to fig. 3, the preprocessed source face image is input into a face detection model for face detection to obtain key point information, which includes face position information. Face alignment with the standard face template image is then performed according to the key point information; a mainstream alignment method such as FFHQ dataset alignment can be adopted. The source face affine transformation matrix is calculated from the alignment result, and the source face image is cropped according to this matrix to obtain a source face region image of size 256 x 256. The face detection model may be a YOLO model or the like. Referring to fig. 4, face detection and alignment processing are likewise performed on the preprocessed target face image, the target face affine transformation matrix is calculated, and the target face image is cropped accordingly to obtain a target face region image of size 256 x 256. By performing face detection and alignment on the source and target face images, the face regions can be accurately located, which facilitates the subsequent face fusion and improves the accuracy of the fusion result.
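A minimal sketch of this align-and-crop step, assuming a five-point landmark detector and OpenCV; the template coordinates below are placeholders, not the patent's actual standard face template:

```python
import cv2
import numpy as np

# Placeholder 5-point template for a 256 x 256 crop (eyes, nose tip, mouth corners).
TEMPLATE_256 = np.array(
    [[89.3, 103.9], [166.7, 103.9], [128.0, 142.5],
     [96.8, 184.7], [159.2, 184.7]], dtype=np.float32)

def align_and_crop(img: np.ndarray, landmarks: np.ndarray):
    """Estimate the transform from detected key points to the standard
    template, then warp/crop a 256 x 256 face region."""
    M, _ = cv2.estimateAffinePartial2D(landmarks.astype(np.float32), TEMPLATE_256)
    face = cv2.warpAffine(img, M, (256, 256), flags=cv2.INTER_LINEAR)
    return face, M  # M is the 2 x 3 affine transformation matrix
```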
In step S103 of some embodiments, the source face image and the target face image are cropped according to the source face affine transformation matrix and the target face affine transformation matrix respectively, to obtain a source face region image and a target face region image.
In the embodiment of the application, the source face affine transformation matrix and the target face affine transformation matrix are obtained by performing face detection and alignment processing on the source face image and the target face image. Face alignment may include affine transformations such as translation, rotation, scaling and shearing, and the affine transformation matrix is calculated by establishing a homogeneous coordinate system. Referring to fig. 4, after the preprocessed target face image is input into the face detection model to obtain its key point information, face alignment is performed between the key point information and the standard face template image, the affine transformation matrix is calculated, and a target face region image of size 256 x 256 is cropped from the preprocessed target face image according to the calculated matrix. Obtaining the affine transformation matrix makes it easier to establish the correspondence between the fusion result and the target face image, which improves the fusion effect.
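As an illustrative calculation (not code from the patent), the homogeneous-coordinate formulation mentioned above lets translation, rotation and scaling be composed into a single matrix whose inverse maps aligned points back:

```python
import numpy as np

def make_affine(scale: float, theta: float, tx: float, ty: float) -> np.ndarray:
    """Build a 3 x 3 affine matrix in homogeneous coordinates."""
    c, s = scale * np.cos(theta), scale * np.sin(theta)
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0.0, 0.0, 1.0]])

A = make_affine(0.8, np.deg2rad(5.0), 12.0, -4.0)  # example parameters
p = np.array([100.0, 50.0, 1.0])             # an image point in homogeneous form
p_aligned = A @ p                            # point after alignment
p_restored = np.linalg.inv(A) @ p_aligned    # the inverse maps it back
```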
In step S104 of some embodiments, the performing feature extraction processing on the source face area image to obtain ornament features and source face features includes:
inputting the source face region image into an ornament classification model to perform ornament feature extraction processing to obtain ornament features;
performing bilinear interpolation on the source face region image to obtain a scaled image;
and inputting the scaled image into a face feature extraction model to perform face feature extraction processing to obtain source face features.
In the embodiment of the application, the cropped source face region image is input into an ornament classification model for ornament feature extraction to obtain the ornament features. Glasses are taken as the ornament for explanation: referring to fig. 3, the source face image may be a face image wearing glasses, so the cropped source face region image is input into a glasses classification model for glasses feature extraction to obtain glasses classification features. The cropped 256 x 256 source face region image is scaled to 112 x 112 by bilinear interpolation to obtain a scaled image. Finally, the scaled image is input into a face feature extraction model for face feature extraction to obtain the source face features; the face feature extraction model may be an ArcFace model or the like. By extracting the ornament features and the face features, the features can be better fused with the target face image, improving the fidelity of the image fusion.
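The two feature-extraction branches might look like the sketch below, assuming `ornament_net` (the ornament/glasses classification network) and `arcface` (the face feature extractor) are pretrained PyTorch modules loaded elsewhere; the names and the 1 x 512 output shape are assumptions:

```python
import cv2
import numpy as np
import torch

def extract_features(face_256: np.ndarray, ornament_net, arcface):
    """Extract ornament features from the 256x256 crop and identity features
    from a bilinearly downscaled 112x112 copy."""
    x256 = torch.from_numpy(face_256).permute(2, 0, 1).unsqueeze(0)  # 1x3x256x256
    ornament_feat = ornament_net(x256)  # ornament classification features
    face_112 = cv2.resize(face_256, (112, 112), interpolation=cv2.INTER_LINEAR)
    x112 = torch.from_numpy(face_112).permute(2, 0, 1).unsqueeze(0)  # 1x3x112x112
    id_feat = arcface(x112)             # e.g. a 1x512 identity embedding
    return ornament_feat, id_feat
```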
In step S105 of some embodiments, before the step of performing face fusion processing by combining the source face feature and the target face area image with the face fusion model, the method further includes pre-training the face fusion model, specifically including:
Acquiring high-resolution portrait training data and low-resolution portrait training data;
Performing face detection and alignment processing on the high-resolution portrait training data, and cutting to obtain a training target face image;
Respectively extracting face features of the high-resolution portrait training data and the low-resolution portrait training data to obtain training face features;
Inputting the training face features and the training target face images into the face fusion model to obtain a training fusion result;
Carrying out loss value calculation processing on the training fusion result according to the face fusion loss function to obtain a face fusion loss value;
And updating parameters of the face fusion model according to the face fusion loss value.
In the embodiment of the application, referring to fig. 5, the source face features and the target face region image are jointly input into the face fusion model for face fusion processing to obtain an initial face fusion result. The face fusion model adopts an Encoder-Decoder network architecture; its inputs are a target face region image of size 256 x 256 x 3 and source face features of size 1 x 512. When training the face fusion model, high-resolution portrait training data and low-resolution portrait training data are first obtained as training data. The high-resolution portrait training data are face images with a vertical resolution of at least 720 and a face region larger than 256 x 256; the low-resolution portrait training data are face images with a vertical resolution below 720 and a face region smaller than 256 x 256 but larger than 112 x 112. Face detection and alignment are performed on the high-resolution portrait data, 256 x 256 x 3 face region images are cropped out and, in parallel, scaled to 112 x 112 x 3 by bilinear interpolation, and face features are obtained through the face feature extraction model. Face detection and alignment are likewise performed on the low-resolution image data, 112 x 112 x 3 images are cropped out, and face features are obtained through the face feature extraction model. The 256 x 256 x 3 face region images cropped from the high-resolution dataset serve as training target face images, and the face features extracted from the high-resolution and low-resolution datasets serve as training face features. The training face features and training target face images are input into the face fusion model to obtain a training fusion result, and a face fusion loss value is calculated from the training fusion result according to the face fusion loss function. The loss between the training fusion result and the training target face image is computed, the model loss is back-propagated, and the model parameters are adjusted continuously until the loss value falls to or below a preset threshold, at which point optimization stops and a face fusion model meeting the requirements is obtained. Training the face fusion model with both high-resolution and low-resolution portrait data allows it to better cope with complex facial expressions, poses and illumination changes in the image and improves the realism of the fusion.
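A single training iteration could then be sketched as follows; this is a hedged sketch in which the model, optimizer and loss-function interfaces are assumptions, with the loss itself following the weighted sum defined in the next subsection:

```python
import torch

def train_step(fusion_model, optimizer, id_feat, target_face, fusion_loss_fn):
    """One optimization step for the Encoder-Decoder fusion model."""
    optimizer.zero_grad()
    fused = fusion_model(id_feat, target_face)        # forward pass
    loss = fusion_loss_fn(fused, target_face, id_feat)
    loss.backward()                                   # back-propagate the model loss
    optimizer.step()                                  # adjust the model parameters
    return loss.item()
```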
In some embodiments, the performing a loss value calculation process on the training fusion result according to a face fusion loss function to obtain a face fusion loss value includes:
calculating the face feature cosine similarity loss, the pixel loss, the perceptual loss and the adversarial loss according to the training fusion result;
and performing weighted summation on the face feature cosine similarity loss, the pixel loss, the perceptual loss and the adversarial loss according to the face fusion loss function, to obtain the face fusion loss value.
In the embodiment of the application, the expression of the face fusion loss function is shown as follows:
$L_{\mathrm{total}} = \alpha L_{\mathrm{id}} + \beta L_{\mathrm{pixel}} + \gamma L_{\mathrm{percept}} + \delta L_{\mathrm{GAN}}$;
where $L_{\mathrm{total}}$ denotes the face fusion loss value; $L_{\mathrm{id}}$ denotes the cosine similarity loss between the training face features and the face features extracted from the training fusion result by the face feature extractor; $L_{\mathrm{pixel}}$ denotes the pixel loss; $L_{\mathrm{percept}}$ denotes the perceptual loss; $L_{\mathrm{GAN}}$ denotes the adversarial loss; and $\alpha$, $\beta$, $\gamma$ and $\delta$ are the weights of the respective loss terms, for which one suitable set of values is 5, 1, 0.5 and 0.2. By weighted summation of the face feature cosine similarity loss, pixel loss, perceptual loss and adversarial loss, the face fusion loss value is obtained, so that the face fusion model can be trained better and the face fusion effect improved.
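A sketch of this weighted sum in PyTorch, using the weights suggested above; the concrete form of each term (L1 for the pixel loss, a VGG-style perceptual distance, a discriminator-based adversarial term) is an assumption:

```python
import torch.nn.functional as F

def fusion_loss(id_fused, id_source, fused, target, perceptual, adversarial):
    """Weighted face fusion loss: 5*Lid + 1*Lpixel + 0.5*Lpercept + 0.2*Lgan."""
    l_id = 1.0 - F.cosine_similarity(id_fused, id_source).mean()  # identity term
    l_pixel = F.l1_loss(fused, target)        # pixel loss (L1 assumed)
    l_percept = perceptual(fused, target)     # perceptual loss, e.g. VGG features
    l_gan = adversarial(fused)                # adversarial loss from a discriminator
    return 5.0 * l_id + 1.0 * l_pixel + 0.5 * l_percept + 0.2 * l_gan
```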
In step S106 of some embodiments, before the initial face fusion result and the ornament features are input into the ornament generation model for reconstruction processing to obtain an ornament reconstruction result, the method further includes pre-training the ornament generation model, specifically including:
constructing and obtaining a training image by a computer synthesis method;
performing ornament feature extraction processing on the training image to obtain training ornament features;
inputting the training ornament features and the training image into the ornament generation model to obtain a training generation result;
performing loss value calculation processing on the training generation result according to the ornament generation loss function to obtain an ornament generation loss value;
updating parameters of the ornament generation model according to the ornament generation loss value.
In the embodiment of the application, referring to fig. 6, the initial face fusion result and the ornament features are input into the ornament generation model for reconstruction processing to obtain the ornament reconstruction result. Glasses are taken as the ornament for specific explanation: the face ID fusion result, i.e., the face feature fusion result, is input into a glasses generation model together with the classification features of the source face's glasses to obtain the glasses reconstruction result. The glasses generation model uses an Encoder-Decoder network structure and is trained in the manner of a generative adversarial network: during training, images without glasses in the training data and the glasses classification feature vectors are jointly input into the glasses generation model to obtain the training generation result. The parameters of the ornament generation model are then updated according to the ornament generation loss value. The expression of the ornament generation loss function is as follows:
$L_{\mathrm{total}} = \alpha L_{\mathrm{pixel}} + \beta L_{\mathrm{percept}} + \gamma L_{\mathrm{GAN}}$;
where $L_{\mathrm{total}}$ is the ornament generation loss value, $L_{\mathrm{pixel}}$ is the pixel-level loss, $L_{\mathrm{percept}}$ is the perceptual loss, and $L_{\mathrm{GAN}}$ is the adversarial loss; $\alpha$, $\beta$ and $\gamma$ are the weights of the loss terms, for which one suitable set of values is 10, 1 and 0.2. By jointly inputting images of worn ornaments together with the ornament features into the model for training, the ornament generation model can be trained better, reconstruction of the ornament features is realized, and the realism of the image fusion is improved.
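The ornament-generation counterpart, with the weights suggested above (again, the individual terms are assumed implementations):

```python
import torch.nn.functional as F

def ornament_loss(generated, target, perceptual, adversarial):
    """Weighted ornament generation loss: 10*Lpixel + 1*Lpercept + 0.2*Lgan."""
    l_pixel = F.l1_loss(generated, target)
    l_percept = perceptual(generated, target)
    l_gan = adversarial(generated)
    return 10.0 * l_pixel + 1.0 * l_percept + 0.2 * l_gan
```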
In some embodiments, the constructing the training image by a computer synthesis method includes:
Constructing and obtaining a simulated digital person and ornament model by a three-dimensional modeling technology;
Carrying out random condition modification processing on the simulated digital person to obtain a random model;
And synthesizing the ornament model and the random model to obtain a training image.
In the embodiment of the application, glasses are taken as the ornament. The training images are constructed by a computer synthesis method: a high-precision simulated digital person is built in advance with 3D modeling, 10 common 3D glasses models are designed, and the figure is randomly modified under conditions such as accessories, skin color, pose angle and illumination; an image without glasses and 10 images wearing the different glasses are then synthesized to obtain the training images. The computer synthesis technique allows training images to be obtained rapidly, which improves the training speed of the ornament generation model.
In step S107 of some embodiments, the ornament reconstruction result is overlaid onto the target face region image according to the inverse matrix of the target face affine transformation matrix, to obtain a target face fusion result.
In the embodiment of the application, glasses are taken as the ornament. Referring to fig. 7, the glasses reconstruction result image is overlaid, by affine transformation, onto the target face region image cropped from the target face. Inverse normalization is then performed to map the pixel values back to the range 0-255, and the result is saved as a picture to serve as the final output. By calculating the inverse of the target face affine transformation matrix, the aligned face image can be restored to the pose and position of the original image, so that the glasses reconstruction result is overlaid back onto the face region cropped out of the target face, yielding the final target face fusion result.
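The paste-back step might be sketched as follows; this simplified version warps the whole reconstructed crop back into original image coordinates and de-normalizes it (assuming the [-1, 1] normalization used during preprocessing), whereas the embodiment overlays only the cropped face region:

```python
import cv2
import numpy as np

def paste_back(ornament_result: np.ndarray, target_image: np.ndarray,
               M: np.ndarray) -> np.ndarray:
    """Warp the reconstruction back with the inverse affine matrix and
    de-normalize to 0-255 for saving."""
    M_inv = cv2.invertAffineTransform(M)  # inverse of the 2x3 affine matrix
    h, w = target_image.shape[:2]
    restored = cv2.warpAffine(ornament_result, M_inv, (w, h))
    return ((restored * 0.5 + 0.5) * 255.0).clip(0, 255).astype(np.uint8)
```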
The solution of the embodiment of the present application is described in detail below in conjunction with a specific application scenario:
The face fusion method provided by the application can be applied to various photographing or image processing scenarios, such as advertising and marketing, picture applications, social APPs (Applications) and video processing APPs, providing these APPs with a face fusion function that can be used to produce entertaining videos, video ring-back tones and the like. For example, an image processing APP may provide a material library storing various types of material face images, such as cartoon images, ancient-costume images and group photos, with several images of each type. When the user wants to perform face fusion, he or she can click a "fusion" function button, select a material face image from the pop-up interface as the target face image (for example, a cartoon-style material face image), and then take a self-portrait with the camera or choose a portrait photo from the local album as the source face image. Face fusion is performed on the selected source and target face images to obtain a fused image that has the user's appearance characteristics and the style of the target face image. With the face fusion method of the application, the face features and ornament features of the source and target face images are rendered more distinctly, so that the fused image reconstructs the ornament features, distortion of the facial region under complex expressions, poses and illumination changes is reduced, and the fused image as a whole looks more natural and real.
Referring to fig. 8, an embodiment of the present application further provides a face fusion system, which may implement the face fusion method, where the system includes:
A first module 801, configured to acquire a source face image and a target face image;
A second module 802, configured to perform face detection and alignment processing on the source face image and the target face image, and calculate a source face affine transformation matrix and a target face affine transformation matrix;
A third module 803, configured to perform clipping processing on the source face image and the target face image according to the source face affine transformation matrix and the target face affine transformation matrix, to obtain a source face area image and a target face area image;
A fourth module 804, configured to perform feature extraction processing on the source face area image to obtain an ornament feature and a source face feature;
A fifth module 805, configured to jointly input the source face features and the target face region image into a face fusion model for face fusion processing, to obtain an initial face fusion result;
A sixth module 806, configured to input the initial face fusion result and the ornament feature into an ornament generating model for reconstruction processing, so as to obtain an ornament reconstruction result;
A seventh module 807, configured to cover the ornament reconstruction result to the target face area image according to the inverse matrix of the target face affine transformation matrix, to obtain a target face fusion result.
It can be understood that the content in the above method embodiment is applicable to the system embodiment, and the functions specifically implemented by the system embodiment are the same as those of the above method embodiment, and the achieved beneficial effects are the same as those of the above method embodiment.
The embodiment of the application also provides electronic equipment, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the face fusion method when executing the computer program. The electronic equipment can be any intelligent terminal including a tablet personal computer, a vehicle-mounted computer and the like.
It can be understood that the content in the above method embodiment is applicable to the embodiment of the present apparatus, and the specific functions implemented by the embodiment of the present apparatus are the same as those of the embodiment of the above method, and the achieved beneficial effects are the same as those of the embodiment of the above method.
Referring to fig. 9, which illustrates the hardware structure of an electronic device according to another embodiment, the electronic device includes:
The processor 901, which may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is used to execute related programs so as to implement the technical solution provided by the embodiments of the present application;
The memory 902, which may be implemented in the form of Read-Only Memory (ROM), static storage, dynamic storage, or Random Access Memory (RAM). The memory 902 may store an operating system and other application programs; when the technical solution provided by the embodiments of the present disclosure is implemented by software or firmware, the relevant program code is stored in the memory 902 and invoked by the processor 901 to execute the face fusion method of the embodiments of the present disclosure;
An input/output interface 903 for inputting and outputting information;
A communication interface 904, configured to implement communication between this device and other devices, either in a wired manner (e.g., USB or network cable) or wirelessly (e.g., mobile network, Wi-Fi, or Bluetooth);
a bus 905 that transfers information between the various components of the device (e.g., the processor 901, the memory 902, the input/output interface 903, and the communication interface 904);
Wherein the processor 901, the memory 902, the input/output interface 903 and the communication interface 904 are communicatively coupled to each other within the device via a bus 905.
Referring to fig. 10, an embodiment of the present application further provides a computer readable storage medium 1001, where the computer readable storage medium 1001 stores a computer program 1002, and the computer program 1002 implements the above-mentioned face fusion method when executed by a processor.
It can be understood that the content of the above method embodiment is applicable to this storage medium embodiment: the functions of the storage medium embodiment are the same as those of the above method embodiment, and the same beneficial effects are achieved.
The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs and non-transitory computer executable programs. In addition, the memory may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor; such remote memory may be connected to the processor through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
According to the face fusion method, system, electronic device, and storage medium described above, face detection and alignment processing are performed on the source face image and the target face image respectively, and a source face affine transformation matrix and a target face affine transformation matrix are calculated. The source face image and the target face image are then cropped according to these matrices to obtain a source face region image and a target face region image, and feature extraction processing is performed on the source face region image to obtain ornament features and source face features, so that the ornament features and source face features can be accurately extracted and the realism of the fusion effect is improved. The source face features and the target face region image are further jointly input into a face fusion model for face fusion processing to obtain an initial face fusion result, and the initial face fusion result and the ornament features are input into an ornament generation model for reconstruction processing to obtain an ornament reconstruction result, so that the ornament details can be accurately restored. Finally, the ornament reconstruction result is overlaid onto the target face image according to the inverse of the target face affine transformation matrix to obtain the target face fusion result, which improves the accuracy and realism of the fusion.
The embodiments described above are intended to describe the technical solutions of the embodiments of the present application more clearly and do not constitute a limitation on them; those skilled in the art will appreciate that, as technology evolves and new application scenarios emerge, the technical solutions provided by the embodiments of the present application remain equally applicable to similar technical problems.
It will be appreciated by persons skilled in the art that the embodiments of the application are not limited to the illustrated arrangements; more or fewer steps than those shown may be included, certain steps may be combined, or different steps may be used.
The system embodiments described above are merely illustrative; the units illustrated as separate components may or may not be physically separate, that is, they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.
The terms "first," "second," "third," "fourth," and the like in the description of the application and in the above figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present application, "at least one (item)" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate that only A exists, that only B exists, or that both A and B exist, where A and B may be singular or plural. The character "/" generally indicates that the associated objects are in an "or" relationship. "At least one of" and similar expressions mean any combination of the listed items, including any combination of single items or plural items. For example, "at least one of a, b, or c" may represent a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c may each be singular or plural.
In the several embodiments provided by the present application, it should be understood that the disclosed systems and methods may be implemented in other ways. For example, the system embodiments described above are merely illustrative: the division of the above units is merely a logical functional division, and there may be other divisions in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Furthermore, the couplings, direct couplings, or communication connections shown or discussed between components may be indirect couplings or communication connections through some interfaces, systems, or units, and may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
If the integrated units are implemented in the form of software functional units and sold or used as stand-alone products, they may be stored in a computer readable storage medium. Based on such an understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, which includes multiple instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the various embodiments of the present application. The storage medium includes various media capable of storing programs, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The preferred embodiments of the present application have been described above with reference to the accompanying drawings; this does not limit the scope of the claims of the embodiments of the present application. Any modifications, equivalent substitutions, and improvements made by those skilled in the art without departing from the scope and spirit of the embodiments of the present application shall fall within the scope of the claims of the embodiments of the present application.

Claims (9)

1. A method of face fusion, the method comprising:
Acquiring a source face image and a target face image;
Performing face detection and alignment processing on the source face image and the target face image respectively, and calculating to obtain a source face affine transformation matrix and a target face affine transformation matrix;
Cropping the source face image and the target face image respectively according to the source face affine transformation matrix and the target face affine transformation matrix to obtain a source face region image and a target face region image;
Performing feature extraction processing on the source face region image to obtain ornament features and source face features;
Jointly inputting the source face features and the target face region image into a face fusion model for face fusion processing to obtain an initial face fusion result;
Inputting the initial face fusion result and the ornament features into an ornament generation model for reconstruction processing to obtain an ornament reconstruction result;
Overlaying the ornament reconstruction result onto the target face image according to the inverse of the target face affine transformation matrix to obtain a target face fusion result;
Wherein, before the initial face fusion result and the ornament features are input into the ornament generation model for reconstruction processing to obtain the ornament reconstruction result, the method further comprises pre-training the ornament generation model, specifically comprising:
Constructing a training image by a computer synthesis method;
Performing ornament feature extraction processing on the training image to obtain training ornament features;
Inputting the training ornament features and the training image into the ornament generation model to obtain a training generation result;
Performing loss value calculation processing on the training generation result according to an ornament generation loss function to obtain an ornament generation loss value;
Updating parameters of the ornament generation model according to the ornament generation loss value.
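As a minimal sketch of this pre-training loop, assuming PyTorch: `synthesize_training_image`, `ornament_encoder`, and `generator` are hypothetical placeholders, and a plain L1 reconstruction term stands in for the claimed ornament generation loss function.

```python
import torch
import torch.nn.functional as F


def pretrain_ornament_generator(generator, ornament_encoder,
                                synthesize_training_image, steps, lr=1e-4):
    optimizer = torch.optim.Adam(generator.parameters(), lr=lr)
    for _ in range(steps):
        # Computer-synthesized training image (claim 6 describes one approach)
        image = synthesize_training_image()
        with torch.no_grad():
            ornament_feat = ornament_encoder(image)  # training ornament features
        # Reconstruct the image conditioned on the ornament features
        generated = generator(image, ornament_feat)
        # Illustrative stand-in for the ornament generation loss function
        loss = F.l1_loss(generated, image)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()  # update the generator's parameters
    return generator
```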
2. The method according to claim 1, wherein performing face detection and alignment processing on the source face image and the target face image respectively, and calculating to obtain the source face affine transformation matrix and the target face affine transformation matrix, comprises:
obtaining a standard face template image;
Respectively inputting the source face image and the target face image into a face detection model to perform face detection processing to obtain source face key point information and target face key point information;
And respectively carrying out face alignment processing on the source face key point information and the target face key point information according to the standard face template image, and calculating to obtain a source face affine transformation matrix and a target face affine transformation matrix.
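One common way to realize this alignment step is to estimate a similarity transform from the detected key points to the standard template, for example with OpenCV as sketched below; the five-point template coordinates are illustrative assumptions, not values from this application.

```python
import cv2
import numpy as np

# Assumed 5-point template (eye centers, nose tip, mouth corners)
# for a 112x112 aligned crop; the exact values are illustrative.
TEMPLATE_112 = np.float32([[38.3, 51.7], [73.5, 51.5], [56.0, 71.7],
                           [41.5, 92.4], [70.7, 92.2]])


def compute_affine(keypoints):
    """keypoints: (5, 2) landmark array from a face detection model."""
    M, _ = cv2.estimateAffinePartial2D(np.float32(keypoints), TEMPLATE_112)
    return M  # 2x3 affine matrix, usable with cv2.warpAffine
```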
3. The method of claim 1, wherein performing feature extraction processing on the source face region image to obtain the ornament features and the source face features comprises:
Inputting the source face region image into an ornament classification model for ornament feature extraction processing to obtain the ornament features;
Performing bilinear interpolation on the source face region image to obtain a scaled image;
Inputting the scaled image into a face feature extraction model for face feature extraction processing to obtain the source face features.
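The bilinear scaling step maps directly onto a standard resize; in the sketch below, the 112x112 target size and the `ornament_classifier` / `face_encoder` callables are assumptions made for illustration.

```python
import cv2


def extract_source_features(src_face, ornament_classifier, face_encoder):
    # Ornament features from the classification model
    ornament_feat = ornament_classifier(src_face)
    # Bilinear interpolation to the encoder's assumed input size
    scaled = cv2.resize(src_face, (112, 112), interpolation=cv2.INTER_LINEAR)
    # Source face features from the face feature extraction model
    src_feat = face_encoder(scaled)
    return ornament_feat, src_feat
```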
4. The method according to claim 1, wherein before the source face features and the target face region image are jointly input into the face fusion model for face fusion processing to obtain the initial face fusion result, the method further comprises pre-training the face fusion model, specifically comprising:
Acquiring high-resolution portrait training data and low-resolution portrait training data;
Performing face detection and alignment processing on the high-resolution portrait training data, and cropping to obtain a training target face image;
Performing face feature extraction on the high-resolution portrait training data and the low-resolution portrait training data respectively to obtain training face features;
Inputting the training face features and the training target face image into the face fusion model to obtain a training fusion result;
Performing loss value calculation processing on the training fusion result according to a face fusion loss function to obtain a face fusion loss value;
Updating parameters of the face fusion model according to the face fusion loss value.
5. The method of claim 4, wherein performing loss value calculation processing on the training fusion result according to the face fusion loss function to obtain the face fusion loss value comprises:
Calculating a face feature cosine similarity loss, a pixel loss, a perceptual loss, and an adversarial loss according to the training fusion result;
Performing weighted summation on the face feature cosine similarity loss, the pixel loss, the perceptual loss, and the adversarial loss according to the face fusion loss function to obtain the face fusion loss value.
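A hedged PyTorch sketch of this weighted summation follows; the weights and the concrete perceptual and adversarial terms are illustrative choices, not the claimed loss function itself.

```python
import torch.nn.functional as F


def fusion_loss(fused, target, src_id, fused_id, disc_logits,
                perceptual, w_id=10.0, w_pix=1.0, w_perc=1.0, w_adv=0.1):
    # Cosine-similarity identity loss between source and fused face features
    id_loss = 1.0 - F.cosine_similarity(src_id, fused_id, dim=-1).mean()
    pix_loss = F.l1_loss(fused, target)            # pixel loss
    perc_loss = perceptual(fused, target)          # e.g., a VGG-feature distance
    adv_loss = F.softplus(-disc_logits).mean()     # non-saturating GAN loss
    # Weighted summation into a single face fusion loss value
    return (w_id * id_loss + w_pix * pix_loss
            + w_perc * perc_loss + w_adv * adv_loss)
```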
6. The method of claim 1, wherein constructing the training image by a computer synthesis method comprises:
Constructing a simulated digital human and an ornament model by three-dimensional modeling;
Performing random condition modification on the simulated digital human to obtain a randomized model;
Compositing the ornament model with the randomized model to obtain the training image.
7. A face fusion system, the system comprising:
A first module, configured to acquire a source face image and a target face image;
A second module, configured to perform face detection and alignment processing on the source face image and the target face image respectively, and calculate a source face affine transformation matrix and a target face affine transformation matrix;
A third module, configured to crop the source face image and the target face image according to the source face affine transformation matrix and the target face affine transformation matrix, to obtain a source face region image and a target face region image;
A fourth module, configured to perform feature extraction processing on the source face region image to obtain ornament features and source face features;
A fifth module, configured to jointly input the source face features and the target face region image into a face fusion model for face fusion processing, to obtain an initial face fusion result;
A sixth module, configured to input the initial face fusion result and the ornament features into an ornament generation model for reconstruction processing, to obtain an ornament reconstruction result;
A seventh module, configured to overlay the ornament reconstruction result onto the target face image according to the inverse of the target face affine transformation matrix, to obtain a target face fusion result;
Wherein, before the sixth module inputs the initial face fusion result and the ornament features into the ornament generation model for reconstruction processing to obtain the ornament reconstruction result, the ornament generation model is pre-trained, specifically by:
Constructing a training image by a computer synthesis method;
Performing ornament feature extraction processing on the training image to obtain training ornament features;
Inputting the training ornament features and the training image into the ornament generation model to obtain a training generation result;
Performing loss value calculation processing on the training generation result according to an ornament generation loss function to obtain an ornament generation loss value;
Updating parameters of the ornament generation model according to the ornament generation loss value.
8. An electronic device, comprising a memory and a processor, wherein the memory stores a computer program, and the processor implements the method of any one of claims 1 to 6 when executing the computer program.
9. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1 to 6.
CN202410454968.5A 2024-04-16 2024-04-16 A face fusion method, system, electronic device and storage medium Active CN118411746B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410454968.5A CN118411746B (en) 2024-04-16 2024-04-16 A face fusion method, system, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410454968.5A CN118411746B (en) 2024-04-16 2024-04-16 A face fusion method, system, electronic device and storage medium

Publications (2)

Publication Number Publication Date
CN118411746A CN118411746A (en) 2024-07-30
CN118411746B (en) 2025-01-24

Family

ID=91990253

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410454968.5A Active CN118411746B (en) 2024-04-16 2024-04-16 A face fusion method, system, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN118411746B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119559335B (en) * 2024-12-02 2025-12-16 平安科技(深圳)有限公司 Image synthesis method and device, electronic equipment and storage medium
CN120126245B (en) * 2025-04-08 2025-07-18 中电金融设备系统(深圳)有限公司 A multi-user unlocking control method, device and storage medium for smart cash box

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115601459A (en) * 2022-09-19 2023-01-13 厦门美图之家科技有限公司(Cn) CNN-based face ornament generation method, device and equipment
CN117196937A (en) * 2023-09-08 2023-12-08 天翼爱音乐文化科技有限公司 Video face changing method, device and storage medium based on face recognition model

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476709B (en) * 2020-04-09 2023-04-07 广州方硅信息技术有限公司 Face image processing method and device and electronic equipment
CN111598818B (en) * 2020-04-17 2023-04-28 北京百度网讯科技有限公司 Face fusion model training method, device and electronic equipment
CN111932439A (en) * 2020-06-28 2020-11-13 深圳市捷顺科技实业股份有限公司 Method and related device for generating face image of mask
CN113366491B (en) * 2021-04-26 2022-07-22 华为技术有限公司 Eyeball tracking method, device and storage medium
CN113706428B (en) * 2021-07-02 2024-01-05 杭州海康威视数字技术股份有限公司 An image generation method and device
CN115457624B (en) * 2022-08-18 2023-09-01 中科天网(广东)科技有限公司 Face recognition method, device, equipment and medium for wearing mask by cross fusion of local face features and whole face features
CN116188915A (en) * 2023-03-29 2023-05-30 恒银金融科技股份有限公司 Training method and device for synthetic model of face mask image

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115601459A (en) * 2022-09-19 2023-01-13 厦门美图之家科技有限公司(Cn) CNN-based face ornament generation method, device and equipment
CN117196937A (en) * 2023-09-08 2023-12-08 天翼爱音乐文化科技有限公司 Video face changing method, device and storage medium based on face recognition model

Also Published As

Publication number Publication date
CN118411746A (en) 2024-07-30

Similar Documents

Publication Publication Date Title
CN110503703B (en) Methods and apparatus for generating images
CN111553267B (en) Image processing method, image processing model training method and device
Liu et al. Psgan++: Robust detail-preserving makeup transfer and removal
CN114973349B (en) Facial image processing method and facial image processing model training method
US20240169701A1 (en) Affordance-based reposing of an object in a scene
CN118411746B (en) A face fusion method, system, electronic device and storage medium
CN119338937A (en) Method and computing device for facial reproduction
JP7479507B2 (en) Image processing method and device, computer device, and computer program
CN114529785B (en) Model training method, video generating method and device, equipment and medium
CN114241558B (en) Model training method, video generating method and device, equipment and medium
CN112085835A (en) Three-dimensional cartoon face generation method and device, electronic equipment and storage medium
WO2025246674A1 (en) Image processing method and apparatus, electronic device, computer-readable storage medium, and computer program product
CN113362263A (en) Method, apparatus, medium, and program product for changing the image of a virtual idol
CN115049016A (en) Model driving method and device based on emotion recognition
WO2024233818A1 (en) Segmentation of objects in an image
US20250139848A1 (en) Image generation method and related apparatus
Liu et al. High Fidelity Makeup via 2D and 3D Identity Preservation Net
CN117456067A (en) Image processing methods, devices, electronic equipment and storage media
CN117808934A (en) A data processing method and related equipment
Basso New interpretation tools and metamorphosis of the image, how the self-synthesizing of visual elements influences the aesthetic evolution
CN119559335B (en) Image synthesis method and device, electronic equipment and storage medium
Miura et al. SynSLaG: Synthetic sign language generator
US12437497B2 (en) Method, device, and computer program product for image processing
CN117994173B (en) Repair network training method, image processing method, device and electronic equipment
CN119323513A (en) Face image exchange method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant