
WO2024016611A1 - Image processing method and apparatus, electronic device, and computer-readable storage medium - Google Patents


Info

Publication number
WO2024016611A1
Authority
WO
WIPO (PCT)
Prior art keywords
watermark
image
model
target
initial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2022/144343
Other languages
French (fr)
Chinese (zh)
Inventor
王勇涛
黄灏
叶晓雨
汤帜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University filed Critical Peking University
Publication of WO2024016611A1 publication Critical patent/WO2024016611A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40Spoof detection, e.g. liveness detection
    • G06V40/45Detection of the body part being alive
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
    • G06F21/16Program or content traceability, e.g. by watermarking
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification

Definitions

  • This application belongs to the field of artificial intelligence technology and involves image processing methods, devices, electronic equipment and computer-readable storage media.
  • This application provides an active defense method and system against deep forgery, and also provides an image processing method, device, electronic equipment and computer-readable storage medium.
  • the technical solution provided by this application includes the following aspects.
  • an active defense method against deep forgery is provided, the steps of which include:
  • The watermark is updated based on the defensive watermark obtained in the previous training round. Specifically, the watermark obtained in the current training round is multiplied by the coefficient α (usually 0.01) and added to the previous watermark multiplied by 1 − α to obtain the new defensive watermark.
  • Training watermark embedding and detection specifically including:
  • Train an encoder-decoder pair.
  • the encoder embeds the active defense watermark obtained in the previous step into the input image, and uses the loss function to ensure that the embedded information is invisible.
  • the decoder reads the embedded image and decodes the encoded watermark, ensuring the accuracy of the decoded information through the loss function.
  • the corresponding encoder and decoder weights are generated.
  • an active defense system against deep forgery which includes:
  • Deep forgery model interface module: provides functions for inputting images into the deep forgery model and obtaining the generated results.
  • Active defense watermark generation module: used to generate defensive watermarks that protect faces from multiple deep forgery models. Specifically, this module first accesses the deep forgery model, calls the basic watermark generation algorithm, and combines it with watermark fusion technology to generate a model-universal active defense watermark.
  • Active defense watermark embedding module: trains the encoder-decoder, and uses the encoder to embed the universal watermark generated by the active defense watermark generation module into the face image.
  • Watermark defense effect evaluation module: used to evaluate the degree to which the watermark distorts the output of the deep forgery model.
  • Deep forgery detection module: through the decoder provided by the active defense watermark embedding module, pictures with embedded watermarks are detected to determine whether these pictures have been modified by a deep forgery model.
  • By generating a model-universal active defense watermark and embedding it into media containing face information, the output of deep forgery models can be distorted, and the watermark can be used to detect whether the media content has undergone deep forgery, thereby preventing deep forgery and tampering.
  • the embodiments of the present application have defense capabilities against a variety of deep forgery models, and can achieve defense effects without the need for structural information of the deep forgery models.
  • an image processing method which method includes:
  • the first watermark and the second watermark are embedded in the initial image through the target encoding model to obtain a target image, and the difference between the initial image and the target image is less than a reference threshold.
  • Embedding the first watermark and the second watermark into an initial image through the target encoding model to obtain a target image includes: superimposing the first watermark onto the initial image to obtain a first image, and inputting the first image and the second watermark into the target coding model to obtain the target image output by the target coding model; or, inputting the second watermark and the initial image into the target encoding model to obtain a second image output by the target encoding model, and superimposing the first watermark onto the second image to obtain the target image.
  • Before embedding the first watermark and the second watermark into the initial image through the target encoding model to obtain the target image, the method further includes: acquiring a first sample image, a third watermark carrying first information, and an initial coding model; inputting the first sample image into the initial coding model to obtain the first result image output by the initial coding model; inputting the first sample image and the third watermark into the initial coding model to obtain a second result image output by the initial coding model, the second result image being embedded with the third watermark; determining a first loss function according to the first result image and the second result image, the first loss function being used to indicate the difference between the first result image and the second result image; and updating the initial coding model through the process of minimizing the first loss function to obtain the target coding model.
  • The method further includes: acquiring an updated image generated based on the target image; inputting the updated image into the target decoding model to obtain the fourth watermark output by the target decoding model; determining the bit error between the second watermark and the fourth watermark; and, when the bit error is greater than the error threshold, determining that the updated image is an image obtained by deep forging the target image using the deep forgery model.
  • Before inputting the updated image into the target decoding model, the method further includes: acquiring a second sample image and an initial decoding model, the second sample image being embedded with a fifth watermark carrying second information; inputting the second sample image into the initial decoding model to obtain the sixth watermark output by the initial decoding model; determining a second loss function according to the fifth watermark and the sixth watermark; and updating the initial decoding model through the process of minimizing the second loss function to obtain the target decoding model.
  • Obtaining the first watermark generated for the deep forgery model includes: obtaining a third sample image, a seventh watermark, and the deep forgery model; inputting the third sample image into the deep forgery model to obtain the third result image output by the deep forgery model; superimposing the seventh watermark onto the third sample image to obtain a fourth sample image; inputting the fourth sample image into the deep forgery model to obtain the fourth result image output by the deep forgery model; determining a third loss function according to the third result image and the fourth result image; determining gradient information according to the third loss function; and updating the seventh watermark according to the gradient information to obtain the first watermark generated for the deep forgery model.
  • When the number of deep forgery models and the number of pieces of gradient information are both multiple, updating the seventh watermark to obtain the first watermark generated for the deep forgery models includes: averaging the multiple pieces of gradient information to obtain target gradient information; calculating the target gradient information through a sign function to obtain a first calculation result; generating a second calculation result according to the first calculation result, and superimposing the second calculation result onto the fourth sample image to obtain a fifth sample image; applying upper and lower limit constraints to the fifth sample image to obtain a sixth sample image; removing the third sample image from the sixth sample image to obtain an eighth watermark; and performing a weighted average of the seventh watermark and the eighth watermark to obtain the first watermark.
  • an image processing device which device includes:
  • the acquisition module is used to obtain the first watermark generated for the deep forgery model and the second watermark carrying target information
  • the acquisition module is also used to acquire a target coding model, which is used to embed watermarks on images;
  • An embedding module configured to embed the first watermark and the second watermark into an initial image through the target encoding model to obtain a target image, where the difference between the initial image and the target image is less than a reference threshold.
  • The embedding module is configured to superimpose the first watermark onto the initial image to obtain a first image, and input the first image and the second watermark into the target encoding model to obtain the target image output by the target encoding model; or, input the second watermark and the initial image into the target encoding model to obtain the second image output by the target encoding model, and superimpose the first watermark onto the second image to obtain the target image.
  • The acquisition module is also used to acquire a first sample image, a third watermark carrying first information, and an initial encoding model; input the first sample image into the initial encoding model to obtain the first result image output by the initial encoding model; input the first sample image and the third watermark into the initial encoding model to obtain the second result image output by the initial encoding model, the second result image being embedded with the third watermark; determine a first loss function according to the first result image and the second result image, the first loss function being used to indicate the difference between the first result image and the second result image; and update the initial encoding model through the process of minimizing the first loss function to obtain the target encoding model, wherein the minimized first loss function is smaller than the reference threshold.
  • the acquisition module is also used to acquire an updated image generated based on the target image; input the updated image into the target decoding model to obtain the fourth watermark output by the target decoding model; determine the The bit error between the second watermark and the fourth watermark; when the bit error is greater than the error threshold, it is determined that the updated image is obtained by deep forging the target image through the deep forgery model.
  • The acquisition module is also used to acquire a second sample image and an initial decoding model, the second sample image being embedded with a fifth watermark carrying second information; input the second sample image into the initial decoding model to obtain the sixth watermark output by the initial decoding model; determine a second loss function according to the fifth watermark and the sixth watermark; and update the initial decoding model through the process of minimizing the second loss function to obtain the target decoding model.
  • The acquisition module is used to acquire a third sample image, a seventh watermark and the deep forgery model; input the third sample image into the deep forgery model to obtain the third result image output by the deep forgery model; superimpose the seventh watermark onto the third sample image to obtain a fourth sample image; input the fourth sample image into the deep forgery model to obtain the fourth result image output by the deep forgery model; determine a third loss function according to the third result image and the fourth result image; determine gradient information according to the third loss function; and update the seventh watermark according to the gradient information to obtain the first watermark generated for the deep forgery model.
  • the number of the deep forgery models and the number of the gradient information are both multiple, and the multiple deep forgery models correspond to the multiple gradient information one-to-one;
  • The acquisition module is used to average the multiple pieces of gradient information to obtain target gradient information; calculate the target gradient information through a sign function to obtain a first calculation result; generate a second calculation result according to the first calculation result, and superimpose the second calculation result onto the fourth sample image to obtain a fifth sample image; apply upper and lower limit constraints to the fifth sample image to obtain a sixth sample image; remove the third sample image from the sixth sample image to obtain an eighth watermark; and perform a weighted average of the seventh watermark and the eighth watermark to obtain the first watermark.
  • An electronic device includes a memory and a processor; at least one instruction is stored in the memory, and the at least one instruction is loaded and executed by the processor, so that the electronic device implements the active defense method against deep forgery or the image processing method provided by any of the exemplary embodiments of the present application.
  • A computer-readable storage medium is provided, in which at least one instruction is stored; the instruction is loaded and executed by a processor to enable the computer to implement the active defense method against deep forgery or the image processing method provided by any of the exemplary embodiments of the present application.
  • A computer program or computer program product includes computer instructions; when the computer instructions are executed by a computer, the computer implements the active defense method against deep forgery or the image processing method provided by any of the exemplary embodiments of the present application.
  • the embodiment provides an active defense method against deep forgery or an image processing method.
  • this kind of target image can not only defend against deep forgery models, but also carry customized target information, ensuring the accuracy of image processing.
  • Figure 1 is a schematic diagram of the generation of an active defense watermark provided by an embodiment of the present application
  • Figure 2 is a schematic diagram of an active defense watermark embedding and deep forgery detection provided by an embodiment of the present application
  • Figure 3 is a schematic diagram of an implementation environment provided by an embodiment of the present application.
  • Figure 4 is a flow chart of an image processing method provided by an embodiment of the present application.
  • Figure 5 is a schematic diagram of generating a first watermark provided by an embodiment of the present application.
  • Figure 6 is a schematic diagram of watermark embedding and deep forgery detection provided by an embodiment of the present application.
  • Figure 7 is a structural diagram of an image processing device provided by an embodiment of the present application.
  • Figure 8 is a structural diagram of an electronic device provided by an embodiment of the present application.
  • Deep forgery technology modifies faces through attribute modification or facial replacement. It can modify hair color, face shape and other appearance features, or replace faces on other videos and images to make characters behave inconsistently with their identities or convey false information.
  • StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation.
  • InterfaceGAN: Interpreting the Latent Space of GANs for Semantic Face Editing.
  • audio and video media created using deep fakes can be seen everywhere, and the social problems they cause have gradually attracted attention.
  • Audio and video media created by deep fakes may be used to spread false information, causing serious damage to users' public image.
  • This application designs an active defense system against deep forgery.
  • The system includes five modules: deep forgery model interface, watermark generation, watermark embedding, defense effect evaluation, and deep forgery detection, described as follows:
  • Deep forgery model interface module: provides functions for inputting images into the deep forgery model and obtaining the generated results.
  • Active defense watermark generation module: used to generate defensive watermarks that protect faces from multiple deep forgery models. Specifically, this module first accesses the deep forgery model, calls the basic watermark generation algorithm, and combines it with watermark fusion technology to generate a model-universal active defense watermark.
  • Active defense watermark embedding module: trains the encoder-decoder and uses the encoder to embed the universal watermark generated by the active defense watermark generation module into the face image.
  • Watermark defense effect evaluation module: used to evaluate the degree to which the watermark distorts the output of the deep forgery model.
  • Deep forgery detection module: through the decoder provided by the active defense watermark embedding module, pictures with embedded watermarks are detected to determine whether these pictures have been modified by a deep forgery model.
  • In this example, AttGAN and AttentionGAN are used as attack targets, and the PGD attack algorithm is used as the basic attack algorithm to explain how to generate active defense watermarks, how to embed watermarks, and how to perform deep forgery detection.
  • the first step is to obtain the active defense watermark, as shown in Figure 1:
  • The loss is the loss function between the deep forgery model outputs obtained from the original image and from the watermarked image:
  • loss = MSE(G(I), G(I + W))
  • I is the original image
  • W is the watermark
  • G is the deep fake model.
  • The adversarial perturbation P obtained on this model is multiplied by the coefficient α (usually 0.01) and added to the previous watermark multiplied by 1 − α to obtain a new defensive watermark.
  • the second step is to train watermark embedding and detection:
  • A pair of convolutional neural networks, an encoder and a decoder, is trained using the CelebA training set.
  • the encoder embeds the active defense watermark into the input image, and uses the loss function to constrain the embedded image to be close enough to the original image, that is, to minimize the mean square error and ensure that the embedded information is invisible.
  • Loss_encoding = MSE(E(I), E(I, N))
  • E is the encoder and N is the active defense watermark.
  • the decoder reads the embedded picture and decodes the encoded watermark.
  • the bit error between the decoding result and the original watermark is constrained through the loss function, that is, the BCE error function with logit is minimized.
  • Loss_decoding = BCEWithLogitsLoss(W, D(E(I, N)))
  • D is the decoder.
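  • As an illustration only, the following PyTorch-style sketch shows one possible way to combine the two losses above in a single training step; the encoder and decoder architectures, tensor shapes, and the equal weighting of the two loss terms are assumptions, since they are not fixed at this point in the text.

```python
import torch
import torch.nn as nn

# Minimal sketch (not the patented implementation): one joint training step for
# the encoder E and decoder D, following Loss_encoding = MSE(E(I), E(I, N)) and
# Loss_decoding = BCEWithLogitsLoss(W, D(E(I, N))). The encoder is assumed to
# accept an optional watermark argument; all architectures are placeholders.
mse = nn.MSELoss()
bce = nn.BCEWithLogitsLoss()

def train_step(encoder, decoder, optimizer, image, watermark_image, watermark_bits):
    """image, watermark_image: (B, 3, H, W); watermark_bits: (B, L) in {0, 1}."""
    optimizer.zero_grad()
    plain = encoder(image)                             # E(I), no watermark embedded
    embedded = encoder(image, watermark_image)         # E(I, N), watermark embedded
    decoded_logits = decoder(embedded)                 # D(E(I, N))
    loss_encode = mse(plain, embedded)                 # keep the embedding invisible
    loss_decode = bce(decoded_logits, watermark_bits)  # keep decoding accurate
    loss = loss_encode + loss_decode                   # equal weighting is an assumption
    loss.backward()
    optimizer.step()
    return loss_encode.item(), loss_decode.item()
```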
  • the third step is deep forgery detection, as shown in Figure 2:
  • the encoded watermark is decoded from the forged picture and compared with the original embedded watermark.
  • If the bit difference between the two is greater than or equal to the set threshold (0.4), the picture is considered to have undergone deep forgery.
  • The minimum change rate of the decoded watermark between pictures processed by the deep forgery model and pictures that were not forged is 41.0%, so forged pictures can be detected.
  • the embodiment of the present application achieved a 100% deepfake defense rate.
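  • A minimal sketch of the detection rule in this walkthrough, assuming the decoder outputs one logit per watermark bit as in the training sketch above, is:

```python
import torch

def is_deepfaked(decoder, suspect_image, embedded_bits, threshold=0.4):
    """Decode the watermark from a suspect image and flag deep forgery when the
    bit change rate reaches the threshold (0.4 in the walkthrough above)."""
    logits = decoder(suspect_image)                       # per-bit logits (assumption)
    decoded_bits = (torch.sigmoid(logits) > 0.5).float()
    bit_error = (decoded_bits != embedded_bits).float().mean().item()
    return bit_error >= threshold
```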
  • the embodiment of the present application provides an active defense method against deep forgery and an image processing method. These methods can be applied in the implementation environment as shown in Figure 3.
  • FIG. 3 at least one electronic device 31 and a server 32 are included.
  • the electronic device 31 can communicate with the server 32 .
  • the electronic device 31 can obtain the image that needs to be processed from the server 32 .
  • the electronic device 31 can also acquire images that need to be processed by itself, such as by taking pictures or other methods.
  • The electronic device 31 includes, but is not limited to, any electronic product that can perform human-computer interaction with the user through one or more methods such as a keyboard, touch pad, touch screen, remote control, voice interaction or handwriting device, for example a PC (Personal Computer), mobile phone, smartphone, PDA (Personal Digital Assistant), wearable device, PPC (Pocket PC), tablet, smart car, smart TV, smart speaker, etc.
  • the server 32 may be one server, a server cluster composed of multiple servers, or a cloud server.
  • an embodiment of the present application provides an image processing method, which can be applied to the electronic device shown in Figure 3. As shown in Figure 4, the method includes the following steps 401 to 403.
  • Step 401 Obtain the first watermark generated for the deep forgery model and the second watermark carrying target information.
  • the first watermark generated for the deep forgery model is a universal watermark that can defend against the deep forgery model.
  • Being able to defend against a deep forgery model means that, after an image embedded with such a first watermark is input into the deep forgery model, the deep forgery model will tamper with the image and output the tampered image.
  • the tampered image is distorted, so it can be known that the tampered image has been tampered with by the deep forgery model, forming a defense against the deep forgery model.
  • Deep fake models include, but are not limited to, StarGAN, InterfaceGAN, HiSD (Hierarchical Style Disentanglement) and AttGAN (Attention Generative Adversarial Networks), which are not limited here.
  • the first watermark is a watermark in the form of an image.
  • the first watermark is composed of multiple pixels.
  • each pixel can correspond to at least one channel, and each channel corresponds to a value.
  • the second watermark carrying target information is a personalized watermark that can be customized according to the actual needs of the user.
  • the target information carried by the second watermark is the information that the user wants to reflect.
  • the second watermark is a watermark in the form of an image, such as a logo.
  • the second watermark consists of multiple pixels. Each pixel can correspond to at least one channel, and each channel corresponds to a value.
  • The second watermark may also be a watermark in the form of a character string, such as an identification (ID) used by the user.
  • the embodiment of the present application does not limit the number of digits in the string, and the number of digits in the string can be determined based on experience or actual needs.
  • obtaining the first watermark generated for the deepfake model includes the following steps A1 to A3.
  • Step A1: Obtain the third sample image, the seventh watermark and the deep forgery model; superimpose the seventh watermark onto the third sample image to obtain the fourth sample image; input the third sample image into the deep forgery model to obtain the third result image output by the deep forgery model, and input the fourth sample image into the deep forgery model to obtain the fourth result image output by the deep forgery model.
  • the third sample image may be an image that has not yet embedded a watermark, and the third sample image is the clean sample shown in Figure 1 .
  • the third sample image can come from public data sets, including but not limited to CelebA (CelebFaces Attributes Dataset, face attribute data set).
  • the seventh watermark is an initialization watermark used to generate the first watermark.
  • the seventh watermark may be a watermark trained through some training processes, or a randomly generated watermark (also known as random noise).
  • The seventh watermark is of the same type as the first watermark. For example, when the first watermark is in the form of an image, the seventh watermark is also in the form of an image, and each channel of the pixels that make up the seventh watermark corresponds to a value. In some embodiments, each channel corresponds to a small value; for example, when the value range is 0 to 255, the value corresponding to each channel is between 0 and 1.
  • the size of the seventh watermark is less than or equal to the size of the third sample image, or in other words, the number of pixels included in the seventh watermark is less than or equal to the number of pixels included in the third sample image.
  • This seventh watermark can be superimposed on the third sample image to obtain a fourth sample image.
  • the fourth sample image is also the adversarial sample shown in Figure 1.
  • the number of pixels included in the seventh watermark is equal to the number of pixels included in the third sample image, then the pixels included in the seventh watermark correspond to the pixels included in the third sample image one-to-one.
  • superimposing the seventh watermark onto the third sample image to obtain a fourth sample image may include: adding values corresponding to channels included in corresponding pixels to obtain a fourth sample image.
  • When the number of pixels included in the seventh watermark is smaller than the number of pixels included in the third sample image, the embodiment of the present application can crop a part from the third sample image whose number of pixels equals the number of pixels included in the seventh watermark, so that a one-to-one pixel correspondence can be formed. Alternatively, the number of pixels of the seventh watermark can be increased through interpolation or other methods, so that the increased number of pixels equals the number of pixels included in the third sample image, which also forms a one-to-one pixel correspondence.
  • the values corresponding to the channels included in the corresponding pixels can be added according to the above description to obtain the fourth sample image, which will not be described again here.
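  • A rough sketch of this superposition is given below; the top-left alignment of the smaller watermark and the clipping of channel values to a valid range afterwards are assumptions used only for illustration.

```python
import numpy as np

def superimpose(image: np.ndarray, watermark: np.ndarray) -> np.ndarray:
    """Pixel-wise superposition of a watermark onto an image.

    image:     (H, W, C) array.
    watermark: (h, w, C) array with h <= H and w <= W.
    """
    h, w = watermark.shape[:2]
    result = image.astype(np.float32).copy()
    # Add the channel values of corresponding pixels (top-left alignment assumed).
    result[:h, :w] += watermark.astype(np.float32)
    return np.clip(result, 0, 255)  # keep values in a valid range (assumption)
```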
  • the third sample image can be input into the deep forgery model, so that the deep forgery model tamperes with the third sample image and outputs the tampered third sample image, that is, the third result image.
  • the third sample image is represented as I
  • the deep fake model is represented as G( ⁇ )
  • the third result image is represented as G(I).
  • the fourth sample image can be input into the deep forgery model, causing the deep forgery model to tamper with the fourth sample image and output the tampered fourth sample image, that is, the fourth result image.
  • the seventh watermark is expressed as W. Since the fourth sample image is obtained by superimposing the seventh watermark W on the third sample image I, the fourth sample image is expressed as I+W, and the deep forgery model is still expressed as G( ⁇ ), then the fourth result image is expressed as G(I+W).
  • Step A2 Determine a third loss function based on the third result image and the fourth result image, and determine gradient information based on the third loss function.
  • the third loss function determined according to the third result image and the fourth result image is used to indicate the difference between the third result image and the fourth result image, which difference also represents the difference between the third sample image and the fourth sample image. difference between.
  • the embodiment of the present application does not limit the determination method of the third loss function.
  • Taking as an example the third loss function obtained by calculating the third result image and the fourth result image through MSE (Mean-Square Error), the third loss function loss_gen is expressed as the following formula (1):
  • loss_gen = MSE(G(I), G(I + W))  (1)
  • the pixels included in the third result image and the pixels included in the fourth result image are in one-to-one correspondence. Therefore, you can first calculate the difference between the values corresponding to the channels included in the corresponding pixels, and perform a square calculation on the difference to obtain the square value corresponding to the pixel. After that, the sum of square values corresponding to different pixels is calculated, and the ratio of the sum of square values to the number of pixels is used as the third loss function.
  • the gradient calculation can be performed on the third loss function to obtain gradient information, which indicates the direction in which the value of the third loss function changes fastest.
  • the gradient information may be a vector, and the embodiment of the present application does not limit the form of the gradient information.
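  • A minimal autograd sketch of formula (1) and the gradient information is given below, assuming the deep forgery model G can be called as a differentiable function and the watermark has the same shape as the image; this is an illustration, not the exact implementation.

```python
import torch
import torch.nn.functional as F

def watermark_gradient(forgery_model, image, watermark):
    """Compute the third loss loss_gen = MSE(G(I), G(I + W)) and its gradient
    with respect to the watermark W (hypothetical helper; shapes assumed)."""
    watermark = watermark.clone().detach().requires_grad_(True)
    third_result = forgery_model(image).detach()         # G(I), treated as a constant
    fourth_result = forgery_model(image + watermark)     # G(I + W)
    loss_gen = F.mse_loss(third_result, fourth_result)   # formula (1)
    loss_gen.backward()
    return loss_gen.item(), watermark.grad.detach()
```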
  • Step A3 Update the seventh watermark according to the gradient information to obtain the first watermark generated for the deep forgery model.
  • the seventh watermark is an initialization watermark used to generate the first watermark. Therefore, after obtaining the gradient information, the embodiment of the present application updates the seventh watermark according to the gradient information, thereby obtaining the first watermark based on the seventh watermark. watermark.
  • the number of deepfake models and the number of gradient information are both multiple. Then for a third sample image, the above-mentioned step A1 and step A2 will be performed for each deepfake model respectively. Therefore, referring to Figure 1 and Figure 5, one gradient information can be obtained for each deep forgery model, so that multiple deep forgery models correspond to multiple gradient information one-to-one.
  • step A3 further includes the following steps A31 to A33.
  • Step A31 average multiple gradient information to obtain target gradient information.
  • the target gradient information is obtained by averaging multiple gradient information
  • the target gradient information is also the average value of the multiple gradient information
  • The target gradient information is expressed as g_avg.
  • averaging multiple gradient information is also called gradient fusion.
  • Step A32: Calculate the target gradient information through the sign function to obtain the first calculation result; generate the second calculation result according to the first calculation result, and superimpose the second calculation result onto the fourth sample image to obtain the fifth sample image; apply upper and lower bound constraints to the fifth sample image to obtain the sixth sample image; and remove the third sample image from the sixth sample image to obtain the eighth watermark.
  • The target gradient information is calculated through the sign function.
  • Through the sign function, the values less than 0 in the target gradient information are converted to -1, the values equal to 0 are kept as 0, and the values greater than 0 are converted to 1, thereby obtaining the first calculation result.
  • the sign function is represented as sign( ⁇ )
  • The first calculation result can be correspondingly expressed as sign(g_avg).
  • the embodiments of the present application do not limit the method of generating the second calculation result based on the first calculation result, and the generation method can be set according to actual requirements.
  • the generation method is to use the product of the first calculation result and the constant a as the second calculation result.
  • the value of the constant a is, for example, 0.1, which is not limited here.
  • The second calculation result can be expressed as a·sign(g_avg).
  • the third sample image is represented as I
  • the seventh watermark is represented as W
  • the fourth sample image is represented as I+W.
  • In the iterative notation used below, the seventh watermark is expressed as P_r, and the fourth sample image is expressed as the following formula (2):
  • I + P_r  (2)
  • the second calculation result can be superimposed on the fourth sample image to obtain the fifth sample image.
  • The fifth sample image is subject to upper and lower bound constraints according to the following formula (3) to obtain the sixth sample image:
  • clip_{I, ε}(I + P_r + a·sign(g_avg))  (3)
  • where clip_{I, ε}( · ) represents the calculation of an upper and lower limit constraint.
  • When the value corresponding to a channel of a pixel included in the fifth sample image is greater than the upper limit, the value is converted into the upper limit; when the value corresponding to a channel of a pixel included in the fifth sample image is less than the lower limit, the value is converted into the lower limit.
  • the upper limit and lower limit can be determined according to actual needs and are not limited here.
  • The third sample image is removed from the sixth sample image according to the following formula (4) to obtain the eighth watermark P_{r+1}, which is also called the adversarial perturbation:
  • P_{r+1} = clip_{I, ε}(I + P_r + a·sign(g_avg)) − I  (4)
  • r in the above formulas (2), (3) and (4) is used to represent the number of iterations, and the number of iterations is set according to actual needs. For example, when the eighth watermark is obtained through one iteration, the value of r is 0 and the eighth watermark is P_1. For another example, when the eighth watermark is obtained through 10 iterations, the value of r is 9 and the eighth watermark is P_10.
  • Step A33 perform a weighted average of the seventh watermark and the eighth watermark to obtain the first watermark.
  • The seventh watermark and the eighth watermark can be weighted and averaged according to the following formula (5) to obtain the first watermark W′:
  • W′ = α·P + (1 − α)·W  (5)
  • α is the weight, and the value of α is, for example, 0.01, which can be determined based on experience or actual needs.
  • W is the seventh watermark, which is equivalent to the above-mentioned P r
  • P is the eighth watermark, which is equivalent to the above-mentioned P r+1 .
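  • Putting steps A31 to A33 together, a minimal sketch of one update iteration is given below; the step size a, the weight α, and the interpretation of the upper and lower limits as a small range ε around the clean image (suggested by the clip_{I, ε} notation) are assumptions, not values fixed by this embodiment.

```python
import torch

def update_first_watermark(gradients, clean_image, seventh_watermark,
                           a=0.1, alpha=0.01, eps=0.05):
    """One iteration of steps A31-A33 (formulas (2)-(5)); hypothetical helper.

    gradients:         list of gradient tensors, one per deep forgery model.
    clean_image:       the third sample image I.
    seventh_watermark: P_r, the watermark entering this iteration.
    """
    g_avg = torch.stack(gradients).mean(dim=0)           # step A31: gradient fusion
    step = a * torch.sign(g_avg)                          # second calculation result a*sign(g_avg)
    fifth = clean_image + seventh_watermark + step        # fourth sample image (2) plus the step
    lower, upper = clean_image - eps, clean_image + eps   # upper/lower limits (assumption)
    sixth = torch.min(torch.max(fifth, lower), upper)     # formula (3): clip_{I, eps}
    eighth_watermark = sixth - clean_image                # formula (4): P_{r+1}
    # Formula (5): weighted average of the seventh and eighth watermarks.
    return alpha * eighth_watermark + (1 - alpha) * seventh_watermark
```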
  • the above steps A1 to A3 are a generation method based on the PGD (Projected Gradient Descent) algorithm. Moreover, the above steps A1 to A3 are described for one third sample image.
  • The embodiment of the present application can provide multiple groups of images, each group including multiple third sample images. In this case, the first third sample image in the first group is first processed according to steps A1 to A3 to obtain first watermark 1; then the second third sample image in the first group and first watermark 1 are substituted into step A1, and steps A1 to A3 are repeated to obtain first watermark 2, and so on, until all the third sample images in the first group have been used, yielding first watermark K, where K is the number of third sample images included in the first group.
  • Then the second group of images is used: the first third sample image in the second group and first watermark K are substituted into step A1, and steps A1 to A3 are repeated, and so on, until the last third sample image in the last group has also been used, resulting in the first watermark generated for the deep forgery model.
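  • Chaining the hypothetical helpers sketched above over groups of third sample images, the iterative procedure described here could look roughly as follows; `groups` (an iterable of image batches) and `forgery_models` (a list of G(·) callables) are illustrative names.

```python
def generate_universal_watermark(groups, forgery_models, init_watermark):
    """Iterate steps A1-A3 over every third sample image in every group,
    feeding the watermark obtained from one image into the next (sketch)."""
    watermark = init_watermark                        # the current seventh watermark
    for images in groups:                             # each group of third sample images
        for image in images:
            grads = [watermark_gradient(model, image, watermark)[1]
                     for model in forgery_models]     # one gradient per forgery model
            watermark = update_first_watermark(grads, image, watermark)
    return watermark                                  # the first watermark
```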
  • Obtaining the second watermark carrying target information may include: providing a second watermark input interface, and obtaining from the second watermark input interface the second watermark in the form of an image uploaded by the user, or obtaining from the second watermark input interface the second watermark entered by the user in the form of a character string.
  • the implementation of this application does not limit the acquisition method of the second watermark.
  • Step 402 Obtain a target coding model, which is used to embed watermarks on the image.
  • the target encoding model is the encoder mentioned above.
  • the target encoding model can be a model based on a convolutional neural network or other artificial intelligence models, which are not limited here.
  • the target coding model has the ability to embed watermarks on images, and can make the difference between the image before embedding the watermark and the image after embedding the watermark less than a reference threshold.
  • the embedded watermark is made invisible in the image, thereby enabling hidden embedding of the watermark.
  • the method provided by the embodiment of the present application also includes the following steps B1 to B3.
  • Step B1: Obtain the first sample image, the third watermark carrying the first information, and the initial coding model; input the first sample image into the initial coding model to obtain the first result image output by the initial coding model; and input the first sample image and the third watermark into the initial coding model to obtain a second result image output by the initial coding model, the second result image being embedded with the third watermark.
  • the first sample image may be an image that has not yet embedded a watermark.
  • the first sample image may be from a public data set such as CelebA.
  • The first sample image and the above-mentioned third sample image may be images located in the same data set, or they may be images located in different data sets, which is not limited here.
  • the third watermark carrying the first information is of the same type as the second watermark.
  • the second watermark and the third watermark are both in image form, or the second watermark and the third watermark are both in string form.
  • the first information carried by the third watermark can be randomly generated, and the first information is not limited in this embodiment of the application.
  • the obtained initial encoding model is an initialization model used to generate the target encoding model.
  • the first sample image can be input into the initial encoding model, and the initial encoding model outputs the first result image obtained by encoding.
  • the first sample image is represented as M
  • the initial encoding model is represented as E( ⁇ )
  • the first result image is represented as E(M).
  • both the first sample image and the third watermark can be input into the initial encoding model.
  • the initial encoding model outputs the second result image obtained by encoding.
  • The second result image can be considered to have been embedded with the third watermark.
  • The third watermark is represented as N.
  • the initial coding model is still represented as E( ⁇ )
  • the second result image is represented as E(M,N).
  • Step B2 Determine a first loss function based on the first result image and the second result image.
  • the first loss function is used to indicate the difference between the first result image and the second result image.
  • The first loss function determined according to the first result image and the second result image is used to indicate the difference between the first result image and the second result image; this difference also represents the difference between the first sample image and the first sample image that has been embedded with the third watermark.
  • The embodiment of the present application does not limit the method of determining the first loss function. Taking as an example the first loss function obtained by calculating the first result image and the second result image through MSE, the first loss function Loss_encoding is expressed as the following formula (6):
  • Loss_encoding = MSE(E(M), E(M, N))  (6)
  • Step B3 Update the initial coding model through the process of minimizing the first loss function to obtain the target coding model, where the minimized first loss function is smaller than the reference threshold.
  • the target encoding model is obtained by updating the initial encoding model parameters in the initial encoding model to the target encoding model parameters.
  • The minimized first loss function is smaller than the reference threshold, and the reference threshold can be a value infinitely close to 0, which is not limited here. Since the minimized first loss function is smaller than the reference threshold, and the first loss function represents the difference between the first sample image (the image before embedding the watermark) and the first sample image that has been embedded with the third watermark (the image after embedding the watermark), after the initial coding model is updated by minimizing the first loss function to obtain the target coding model, the target coding model can make the difference between the image before embedding the watermark and the image after embedding the watermark smaller than the reference threshold, so that the embedded watermark is invisible in the image, thereby achieving hidden embedding of the watermark.
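  • A minimal training-loop sketch for steps B1 to B3 is given below; the optimizer, learning rate, reference threshold value, and the assumption that the encoder's forward pass accepts an optional watermark argument are illustrative choices rather than details fixed by the embodiment.

```python
import torch
import torch.nn as nn

def train_target_encoder(initial_encoder, first_sample_images, third_watermark,
                         reference_threshold=1e-4, lr=1e-3, max_epochs=100):
    """Update the initial encoding model by minimizing formula (6) until the
    first loss function falls below the reference threshold (sketch)."""
    mse = nn.MSELoss()
    optimizer = torch.optim.Adam(initial_encoder.parameters(), lr=lr)
    loss_encode = None
    for _ in range(max_epochs):
        for first_sample in first_sample_images:
            optimizer.zero_grad()
            first_result = initial_encoder(first_sample)                    # E(M)
            second_result = initial_encoder(first_sample, third_watermark)  # E(M, N)
            loss_encode = mse(first_result, second_result)                  # formula (6)
            loss_encode.backward()
            optimizer.step()
        if loss_encode is not None and loss_encode.item() < reference_threshold:
            break
    return initial_encoder  # now serves as the target encoding model
```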
  • Step 403 Embed the first watermark and the second watermark into the initial image through the target encoding model to obtain the target image, and the difference between the initial image and the target image is less than the reference threshold.
  • the initial image may be a photographed image or an image captured from a video, which is not limited here.
  • the embodiment of the present application embeds the obtained first watermark and second watermark into the initial image through the target coding model to obtain the target image.
  • the initial image is the image before the watermark is embedded, and the target image is the image after the watermark is embedded. Since the target encoding model can make the difference between the image before embedding the watermark and the image after embedding the watermark less than the reference threshold, the difference between the initial image and the target image is less than the reference threshold. That is to say, the first watermark and the second watermark are hidden and embedded in the initial image, and the first watermark and the second watermark are not visible in the target image.
  • Since the first watermark is generated for the deep forgery model and is itself an invisible watermark, the embedding of the first watermark may be a direct superposition of the first watermark. The second watermark carries target information and is often a visible watermark, so the embedding of the second watermark needs to be implemented through the target coding model. Based on these considerations, embedding the first watermark and the second watermark into the initial image through the target coding model to obtain the target image can include the following two embedding methods.
  • Embedding method one: superimpose the first watermark onto the initial image to obtain the first image, input the first image and the second watermark into the target encoding model, and obtain the target image output by the target encoding model.
  • the first watermark is superimposed on the initial image to obtain the first image.
  • For the method of superimposing the first watermark, please refer to the method of superimposing the seventh watermark onto the third sample image in step 401, which will not be described again here.
  • the second watermark is embedded into the first image through the target encoding model to obtain the target image.
  • Embedding method two: the second watermark and the initial image are input into the target encoding model to obtain the second image output by the target encoding model, and the first watermark is superimposed onto the second image to obtain the target image.
  • the second watermark is first embedded into the initial image through the target encoding model to obtain the second image. After that, the first watermark is superimposed on the second image to obtain the target image.
  • The method of superimposing the first watermark can also refer to the method of superimposing the seventh watermark onto the third sample image in step 401, which will not be described again here.
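  • The two embedding orders can be sketched as follows, reusing the encoder signature assumed in the earlier sketches (image plus optional watermark); this is illustrative usage rather than the exact implementation.

```python
def embed_method_one(encoder, initial_image, first_watermark, second_watermark):
    """Method one: superimpose the first watermark, then encode the second."""
    first_image = initial_image + first_watermark        # superposition
    return encoder(first_image, second_watermark)        # target image

def embed_method_two(encoder, initial_image, first_watermark, second_watermark):
    """Method two: encode the second watermark, then superimpose the first."""
    second_image = encoder(initial_image, second_watermark)
    return second_image + first_watermark                 # target image
```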
  • embodiments of the present application can embed the first watermark and the second watermark into the target image, so that the target image can not only defend against deep forgery models, but also carry target information.
  • embodiments of the present application can also detect whether such a target image has been tampered with by a deep forgery model. Please refer to the following description for details.
  • the method further includes: obtaining an updated image generated based on the target image; inputting the updated image into the target decoding model to obtain The fourth watermark output by the target decoding model; determine the bit error between the second watermark and the fourth watermark; when the bit error is greater than the error threshold, determine that the updated image is a deep forgery of the target image through the deep forgery model The resulting image.
  • embodiments of the present application can upload the target image to a network scenario where image tampering may occur, and download the uploaded target image after a period of time to obtain an updated image, so as to detect whether the updated image has been tampered with, or Indicates whether the deepfake model has been deepfaked.
  • the target image can also be input into the deep forgery model to obtain an updated image output by the deep forgery model.
  • the embodiment of the present application does not limit the method of generating the updated image.
  • the target decoding model is the decoder in the above description.
  • The target decoding model can be a model based on a convolutional neural network or other artificial intelligence models, which is not limited here.
  • This target decoding model is used to extract watermarks from images with embedded watermarks. Since the update image is generated based on the target image, and the target image is embedded with the first watermark and the second watermark, the update image is also embedded with the watermark. Therefore, after the updated image is input to the target decoding model, the target decoding model can extract and output the fourth watermark from the updated image.
  • the bit error between the second watermark and the fourth watermark can be calculated, and the bit error can be compared with an error threshold.
  • The error threshold is, for example, 0.4, which is not limited here. If the bit error is greater than the error threshold, it is determined that the updated image is a tampered image, or in other words, that the updated image is an image obtained by deep forging the target image through a deep forgery model. The reason is that when the updated image has been tampered with, the fourth watermark extracted from it is the distorted first watermark and the distorted second watermark, so the difference between the fourth watermark and the normal undistorted second watermark is large, and the bit error between the second watermark and the fourth watermark is greater than the error threshold. If the bit error is less than or equal to the error threshold, it is determined that the updated image has not been tampered with, or in other words, that the updated image is not an image obtained by deep forging the target image through a deep forgery model. The reason is that when the updated image has not been tampered with, the fourth watermark extracted from it is the normal undistorted first watermark and second watermark, so the bit error between the second watermark and the fourth watermark is less than or equal to the error threshold.
  • the second watermark and the fourth watermark are watermarks in the form of images. Then determining the bit error between the second watermark and the fourth watermark includes: comparing the image quality index of the second watermark with the image quality index of the fourth watermark, and taking the difference between the image quality indexes as the bit error.
  • image quality indicators include but are not limited to PSNR (Peak Signal-to-Noise Ratio), SSIM (Structural Similarity, structural similarity), etc., which are not limited here.
  • the second watermark and the fourth watermark are watermarks in the form of character strings. Then determining the bit error between the second watermark and the fourth watermark includes: determining the total number of bits included in the second watermark, and the target number of different bits included in the second watermark and the fourth watermark, and setting the target number. The ratio of the quantity to the total quantity is used as the bit error.
  • For example, if the second watermark is a 4-bit string 1100 and the fourth watermark is also a 4-bit string 0000, then the above-mentioned total number is 4, the above-mentioned target number is 2, and the bit error is 0.5.
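  • A small sketch of this bit error calculation for string-form watermarks, reproducing the 1100 versus 0000 example, is:

```python
def string_bit_error(second_watermark: str, fourth_watermark: str) -> float:
    """Ratio of differing bits to the total number of bits in the second watermark."""
    assert len(second_watermark) == len(fourth_watermark)
    differing = sum(a != b for a, b in zip(second_watermark, fourth_watermark))
    return differing / len(second_watermark)

# Example from the text: "1100" vs "0000" -> 2 differing bits out of 4 -> 0.5
assert string_bit_error("1100", "0000") == 0.5
```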
  • the first watermark is superimposed on the initial image, and the obtained first image and the second watermark are input into the target encoding model to obtain the target image output by the target encoding model.
  • the target image is input into the deep forgery model to obtain an updated image output by the deep forgery model, and the updated image is input into the target decoding model to obtain a fourth watermark output from the updated image.
  • the second watermark and the fourth watermark are compared to determine whether the updated image has been tampered with based on the comparison result, thereby completing deep forgery detection.
  • Before the target decoding model is used in this embodiment of the present application, it can be trained first.
  • the method before inputting the updated image into the target decoding model, the method further includes the following steps C1 to C3.
  • Step C1 Obtain the second sample image and the initial decoding model.
  • the second sample image is embedded with the fifth watermark carrying the second information; input the second sample image into the initial decoding model to obtain the sixth watermark output by the initial decoding model.
  • the second sample image is an image that has been embedded with the fifth watermark.
  • the embodiment of the present application can directly obtain the second sample image that meets the requirements, or can also produce the second sample image through the process of embedding the watermark.
  • When the second result image in the above step B1 is used as the second sample image, the fifth watermark is also the third watermark in the above step B1.
  • the method of obtaining the second sample image is not limited here.
  • the fifth watermark carrying the second information is of the same type as the second watermark, and the second information can be randomly generated.
  • the initial decoding model is an initialization model used to generate the target decoding model.
  • the second sample image is input into the initial decoding model, and the initial decoding model outputs the decoded sixth watermark.
  • The fifth watermark and the sixth watermark are of the same type, for example, both are in image form or both are in string form. Taking the case where the second result image is used as the second sample image and the fifth watermark is the third watermark as an example, the fifth watermark is represented as N, the second sample image is represented as E(M, N), the initial decoding model is represented as D( · ), and the sixth watermark is represented as D(E(M, N)).
  • Step C2 Determine the second loss function based on the fifth watermark and the sixth watermark.
  • the second loss function determined according to the fifth watermark and the sixth watermark is used to indicate the difference between the fifth watermark and the sixth watermark.
  • This application does not limit the method of determining the second loss function. Taking as an example the second loss function obtained by calculating the fifth watermark and the sixth watermark through BCE (Binary Cross Entropy) with logits, the second loss function Loss_decoding is expressed as the following formula (7):
  • Loss_decoding = BCEWithLogitsLoss(N, D(E(M, N)))  (7)
  • BCEWithLogitsLoss denotes BCE with logits.
  • The fifth watermark is normalized to obtain the normalized fifth watermark, and the sixth watermark is normalized to obtain the normalized sixth watermark.
  • the Sigmoid (S-shaped growth curve) function can be used for normalization.
  • the normalized fifth watermark and the normalized sixth watermark are calculated through BCE to obtain the second loss function.
  • Step C3 Update the initial decoding model through the process of minimizing the second loss function to obtain the target decoding model.
  • In the process of minimizing the second loss function, gradient calculation is performed on the second loss function to obtain the second gradient, and gradient backpropagation is performed based on the second gradient to obtain the target decoding model parameters when the second loss function is minimized. Afterwards, the target decoding model is obtained by updating the initial decoding model parameters in the initial decoding model to the target decoding model parameters. Since the second loss function is minimized and the second loss function represents the difference between the fifth watermark and the sixth watermark, the difference between the fifth watermark and the sixth watermark is minimized.
  • Because the fifth watermark is the actually embedded watermark and the sixth watermark is the extracted watermark, minimizing the difference between them ensures that the fifth watermark and the sixth watermark are close enough, so that the watermark extracted by the target decoding model is more accurate.
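As a sketch of step C3 under assumed choices (the placeholder decoder architecture, the Adam optimizer, and the learning rate are not part of the original disclosure), the minimization can be written as a standard gradient-descent update loop:

```python
import torch
import torch.nn as nn

# Placeholder decoder: maps a 3x256x256 image to 64 watermark logits.
decoder = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 64),
)
optimizer = torch.optim.Adam(decoder.parameters(), lr=1e-3)
criterion = nn.BCEWithLogitsLoss()

def train_step(second_sample_image, fifth_watermark):
    """One update of the initial decoding model toward the target decoding model."""
    sixth_watermark_logits = decoder(second_sample_image)        # D(E(M, N))
    loss = criterion(sixth_watermark_logits, fifth_watermark)    # second loss function
    optimizer.zero_grad()
    loss.backward()      # second gradient via backpropagation
    optimizer.step()     # move toward the target decoding model parameters
    return loss.item()
```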
  • The embodiment of this application embeds both the first watermark generated for the deep forgery model and the second watermark carrying target information into the initial image, and uses the target encoding model in the embedding process, so that the difference between the initial image and the target image obtained after embedding is small.
  • As a result, the first watermark and the second watermark are invisible in the target image, which avoids affecting the content recorded in the target image.
  • Such a target image can both defend against deep forgery models and carry customized target information. Therefore, the quality and security of the target image are higher, and the accuracy of image processing is better.
  • the embodiments of the present application can also detect whether the target image has been tampered with by the deep fake model, thereby further ensuring the security of the target image.
  • An embodiment of the present application provides an image processing device. See Figure 7.
  • the device includes the following modules.
  • the acquisition module 701 is used to acquire the first watermark generated for the deep forgery model and the second watermark carrying target information
  • the acquisition module 701 is also used to acquire the target coding model, which is used to embed watermarks on images;
  • the embedding module 702 is used to embed the first watermark and the second watermark into the initial image through the target encoding model to obtain the target image, where the difference between the initial image and the target image is less than the reference threshold.
  • the embedding module 702 is used to superimpose the first watermark onto the initial image to obtain a first image, and input the first image and the second watermark into the target encoding model to obtain the target image output by the target encoding model;
  • or, alternatively, to input the second watermark and the initial image into the target encoding model to obtain a second image output by the target encoding model, and superimpose the first watermark onto the second image to obtain the target image.
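A minimal sketch of the two embedding orders handled by the embedding module 702; the `encoder` callable, the watermark tensors, and the clamping to a valid pixel range are placeholders rather than the actual implementation.

```python
import torch

def embed_order_a(encoder, initial_image, first_watermark, second_watermark):
    # Superimpose the first watermark, then let the encoder embed the second watermark.
    first_image = (initial_image + first_watermark).clamp(0.0, 1.0)
    return encoder(first_image, second_watermark)             # target image

def embed_order_b(encoder, initial_image, first_watermark, second_watermark):
    # Let the encoder embed the second watermark first, then superimpose the first watermark.
    second_image = encoder(initial_image, second_watermark)
    return (second_image + first_watermark).clamp(0.0, 1.0)   # target image
```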
  • the acquisition module 701 is also used to acquire a first sample image, a third watermark carrying first information, and an initial encoding model; input the first sample image into the initial encoding model to obtain a first result image output by the initial encoding model; input the first sample image and the third watermark into the initial encoding model to obtain a second result image output by the initial encoding model, the second result image being embedded with the third watermark; determine a first loss function according to the first result image and the second result image, the first loss function being used to indicate the difference between the first result image and the second result image; and update the initial encoding model through the process of minimizing the first loss function to obtain the target encoding model, where the minimized first loss function is smaller than the reference threshold.
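The first loss function described above can be sketched as follows, assuming the initial encoding model accepts an optional watermark input and using mean squared error as in the detailed CelebA embodiment; the function and argument names are illustrative only.

```python
import torch
import torch.nn.functional as F

def first_loss(initial_encoder, first_sample_image, third_watermark):
    first_result = initial_encoder(first_sample_image)                    # no watermark embedded
    second_result = initial_encoder(first_sample_image, third_watermark)  # third watermark embedded
    # The first loss indicates the difference between the two result images;
    # minimizing it keeps the embedded watermark invisible.
    return F.mse_loss(first_result, second_result)
```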
  • the acquisition module 701 is also used to acquire an updated image generated based on the target image; input the updated image into the target decoding model to obtain a fourth watermark output by the target decoding model; determine the bit error between the second watermark and the fourth watermark; and, when the bit error is greater than the error threshold, determine that the updated image is an image obtained by deep forging the target image through the deep forgery model.
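One possible (assumed) realization of this tamper check is a simple bit-error comparison; the 0.4 threshold follows the detailed embodiment, while the decoder interface and watermark length are placeholders.

```python
import torch

def is_deepfaked(target_decoder, updated_image, second_watermark_bits, error_threshold=0.4):
    """Return True if the updated image is judged to have been deep-forged."""
    logits = target_decoder(updated_image)                  # fourth watermark (raw logits)
    fourth_watermark_bits = (torch.sigmoid(logits) > 0.5).float()
    # Fraction of bits that differ between the embedded and the extracted watermark.
    bit_error = (fourth_watermark_bits != second_watermark_bits).float().mean().item()
    return bit_error > error_threshold
```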
  • the acquisition module 701 is also used to acquire a second sample image and an initial decoding model.
  • the second sample image is embedded with a fifth watermark carrying second information; the second sample image is input into the initial decoding model to obtain a sixth watermark output by the initial decoding model; a second loss function is determined based on the fifth watermark and the sixth watermark; and the initial decoding model is updated through the process of minimizing the second loss function to obtain the target decoding model.
  • the acquisition module 701 is used to acquire a third sample image, a seventh watermark and the deep forgery model; input the third sample image into the deep forgery model to obtain a third result image output by the deep forgery model; superimpose the seventh watermark onto the third sample image to obtain a fourth sample image; input the fourth sample image into the deep forgery model to obtain a fourth result image output by the deep forgery model; determine a third loss function based on the third result image and the fourth result image; determine gradient information based on the third loss function; and update the seventh watermark based on the gradient information to obtain the first watermark generated for the deep forgery model.
  • the acquisition module 701 is used to average the multiple pieces of gradient information to obtain target gradient information; calculate the target gradient information through the sign function to obtain a first calculation result; generate a second calculation result based on the first calculation result and superimpose the second calculation result onto the fourth sample image to obtain a fifth sample image; apply upper and lower bound constraints to the fifth sample image to obtain a sixth sample image; remove the third sample image from the sixth sample image to obtain an eighth watermark; and perform a weighted average of the seventh watermark and the eighth watermark to obtain the first watermark.
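This update resembles a sign-gradient (PGD-style) step followed by a weighted average. The sketch below is one way to realize it; the step size, the clipping bound, and the coefficient alpha (the detailed embodiment uses roughly 0.01) are assumptions.

```python
import torch

def update_first_watermark(gradients, third_sample, fourth_sample, seventh_watermark,
                           step=0.01, bound=0.05, alpha=0.01):
    # One piece of gradient information per deep forgery model; average them.
    target_gradient = torch.stack(gradients).mean(dim=0)
    first_result = torch.sign(target_gradient)               # sign function
    second_result = step * first_result                      # scaled perturbation step
    fifth_sample = fourth_sample + second_result             # superimpose onto the fourth sample image
    # Upper/lower bound constraint: stay within an L-infinity ball around the third sample image.
    sixth_sample = torch.clamp(fifth_sample, third_sample - bound, third_sample + bound)
    eighth_watermark = sixth_sample - third_sample            # remove the third sample image
    # Weighted average of the seventh and eighth watermarks gives the first watermark.
    return (1.0 - alpha) * seventh_watermark + alpha * eighth_watermark
```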
  • the embodiment of the present application also provides an electronic device.
  • the electronic device includes a memory and a processor; at least one instruction is stored in the memory, and the at least one instruction is loaded and executed by the processor, so that the electronic device implements the active defense method against deep forgery provided by any exemplary embodiment of the present application, or the image processing method corresponding to Figure 4.
  • the electronic device 800 can be a portable mobile electronic device, such as a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop or a desktop computer.
  • Electronic device 800 may also be referred to as user equipment, portable electronic device, laptop electronic device, desktop electronic device, and other names.
  • the electronic device 800 includes: a processor 801 and a memory 802.
  • the processor 801 may include one or more processing cores, such as a 4-core processor, an 8-core processor, etc.
  • the processor 801 can adopt at least one of the group consisting of DSP (Digital Signal Processing, digital signal processing), FPGA (Field-Programmable Gate Array, field programmable gate array), and PLA (Programmable Logic Array, programmable logic array).
  • the processor 801 can also include a main processor and a co-processor.
  • the main processor is a processor used to process data in the wake-up state, also called a CPU (Central Processing Unit); the co-processor is a low-power processor used to process data in the standby state.
  • the processor 801 may be integrated with a GPU (Graphics Processing Unit, image processor), and the GPU is responsible for rendering and drawing the content to be displayed on the display screen 805 .
  • the processor 801 may also include an AI (Artificial Intelligence, artificial intelligence) processor, which is used to process computing operations related to machine learning.
  • Memory 802 may include one or more computer-readable storage media, which may be non-transitory. Memory 802 may also include high-speed random access memory and non-volatile memory, such as one or more disk storage devices or flash memory storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 802 is used to store at least one instruction, and the at least one instruction is to be executed by the processor 801 to implement the active defense method against deep forgery provided by the method embodiments of this application, or the image processing method corresponding to Figure 4.
  • the electronic device 800 optionally further includes: a peripheral device interface 803 and at least one peripheral device.
  • the processor 801, the memory 802 and the peripheral device interface 803 may be connected through a bus or a signal line.
  • Each peripheral device can be connected to the peripheral device interface 803 through a bus, a signal line or a circuit board.
  • the peripheral device includes: at least one of the group consisting of a radio frequency circuit 804, a display screen 805, a camera component 806, an audio circuit 807, a positioning component 808 and a power supply 809.
  • the peripheral device interface 803 may be used to connect at least one I/O (Input/Output, input/output) related peripheral device to the processor 801 and the memory 802 .
  • the processor 801, the memory 802, and the peripheral device interface 803 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 801, the memory 802, and the peripheral device interface 803 can be implemented on a separate chip or circuit board, which is not limited in this embodiment.
  • the radio frequency circuit 804 is used to receive and transmit RF (Radio Frequency, radio frequency) signals, also called electromagnetic signals. Radio frequency circuit 804 communicates with communication networks and other communication devices through electromagnetic signals. The radio frequency circuit 804 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 804 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a user identity module card, and the like. Radio frequency circuitry 804 can communicate with other electronic devices through at least one wireless communication protocol.
  • the wireless communication protocol includes but is not limited to: metropolitan area network, mobile communication networks of all generations (2G, 3G, 4G and 5G), wireless LAN and/or Wi-Fi (Wireless Fidelity, wireless fidelity) network.
  • the radio frequency circuit 804 may also include NFC (Near Field Communication) related circuits, which is not limited in this application.
  • the display screen 805 is used to display UI (User Interface, user interface).
  • the UI can include graphics, text, icons, videos, and any combination thereof.
  • display screen 805 is a touch display screen
  • display screen 805 also has the ability to collect touch signals on or above the surface of display screen 805 .
  • the touch signal can be input to the processor 801 as a control signal for processing.
  • the display screen 805 can also be used to provide virtual buttons and/or virtual keyboards, also called soft buttons and/or soft keyboards.
  • the display screen 805 may be a flexible display screen, disposed on a curved or folded surface of the electronic device 800. Even the display screen 805 can be set into a non-rectangular irregular shape, that is, a special-shaped screen.
  • the display screen 805 can be made of LCD (Liquid Crystal Display, liquid crystal display), OLED (Organic Light-Emitting Diode, organic light-emitting diode) and other materials.
  • the camera assembly 806 is used to capture images or videos.
  • the camera assembly 806 includes a front camera and a rear camera.
  • the front camera is set on the front panel of the electronic device, and the rear camera is set on the back of the electronic device.
  • in some embodiments, there are at least two rear cameras, each of which is any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that, for example, the main camera and the depth-of-field camera can be fused to realize the background blur function.
  • camera assembly 806 may also include a flash.
  • the flash can be a single color temperature flash or a dual color temperature flash. Dual color temperature flash refers to a combination of warm light flash and cold light flash, which can be used for light compensation under different color temperatures.
  • Audio circuitry 807 may include a microphone and speakers.
  • the microphone is used to collect sound waves from the user and the environment, and convert the sound waves into electrical signals that are input to the processor 801 for processing, or to the radio frequency circuit 804 to implement voice communication.
  • the microphone can also be an array microphone or an omnidirectional collection microphone.
  • the speaker is used to convert electrical signals from the processor 801 or the radio frequency circuit 804 into sound waves.
  • the loudspeaker can be a traditional membrane loudspeaker or a piezoelectric ceramic loudspeaker.
  • audio circuitry 807 may also include a headphone jack.
  • the positioning component 808 is used to locate the current geographical location of the electronic device 800 to implement navigation or LBS (Location Based Service).
  • the positioning component 808 may be a positioning component based on the United States' GPS (Global Positioning System), China's BeiDou system, Russia's GLONASS system, or the European Union's Galileo system.
  • the power supply 809 is used to power various components in the electronic device 800 .
  • Power source 809 may be AC, DC, disposable batteries, or rechargeable batteries.
  • the rechargeable battery may support wired charging or wireless charging.
  • the rechargeable battery can also be used to support fast charging technology.
  • electronic device 800 also includes one or more sensors 810 .
  • the one or more sensors 810 include, but are not limited to: an acceleration sensor 811 , a gyroscope sensor 812 , a pressure sensor 813 , a fingerprint sensor 814 , an optical sensor 815 and a proximity sensor 816 .
  • the acceleration sensor 811 can detect the acceleration on the three coordinate axes of the coordinate system established by the electronic device 800 .
  • the acceleration sensor 811 can be used to detect the components of gravity acceleration on three coordinate axes.
  • the processor 801 can control the display screen 805 to display the user interface in a horizontal view or a vertical view according to the gravity acceleration signal collected by the acceleration sensor 811 .
  • the acceleration sensor 811 can also be used to collect game or user motion data.
  • the gyro sensor 812 can detect the body direction and rotation angle of the electronic device 800 , and the gyro sensor 812 can cooperate with the acceleration sensor 811 to collect the user's 3D movements on the electronic device 800 . Based on the data collected by the gyro sensor 812, the processor 801 can implement the following functions: motion sensing (such as changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
  • the pressure sensor 813 may be disposed on the side frame of the electronic device 800 and/or on the lower layer of the display screen 805 .
  • the pressure sensor 813 can detect the user's holding signal of the electronic device 800, and the processor 801 performs left and right hand identification or quick operation based on the holding signal collected by the pressure sensor 813.
  • the processor 801 controls the operability controls on the UI interface according to the user's pressure operation on the display screen 805.
  • the operability control includes at least one of the group consisting of a button control, a scroll bar control, an icon control, and a menu control.
  • the fingerprint sensor 814 is used to collect the user's fingerprint.
  • the processor 801 identifies the user's identity based on the fingerprint collected by the fingerprint sensor 814, or the fingerprint sensor 814 identifies the user's identity based on the collected fingerprint.
  • the processor 801 authorizes the user to perform relevant sensitive operations.
  • the sensitive operations include unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings.
  • the fingerprint sensor 814 may be disposed on the front, back, or side of the electronic device 800 . When the electronic device 800 is provided with a physical button or a manufacturer's logo, the fingerprint sensor 814 can be integrated with the physical button or the manufacturer's logo.
  • Optical sensor 815 is used to collect ambient light intensity.
  • the processor 801 can control the display brightness of the display screen 805 according to the ambient light intensity collected by the optical sensor 815. Specifically, when the ambient light intensity is high, the display brightness of the display screen 805 is increased; when the ambient light intensity is low, the display brightness of the display screen 805 is decreased.
  • the processor 801 can also dynamically adjust the shooting parameters of the camera assembly 806 according to the ambient light intensity collected by the optical sensor 815 .
  • the proximity sensor 816, also called a distance sensor, is usually provided on the front panel of the electronic device 800.
  • the proximity sensor 816 is used to collect the distance between the user and the front of the electronic device 800.
  • when the proximity sensor 816 detects that the distance between the user and the front of the electronic device 800 gradually decreases, the processor 801 controls the display screen 805 to switch from the screen-on state to the screen-off state; when the proximity sensor 816 detects that the distance between the user and the front of the electronic device 800 gradually increases, the processor 801 controls the display screen 805 to switch from the screen-off state to the screen-on state.
  • Those skilled in the art can understand that the structure shown in FIG. 8 does not constitute a limitation on the electronic device 800, which may include more or fewer components than shown, combine certain components, or adopt a different component arrangement.
  • Embodiments of the present application provide a computer-readable storage medium. At least one instruction is stored in the computer-readable storage medium, and the instruction is loaded and executed by a processor, so that the computer can implement the active defense method against deep forgery provided by any exemplary embodiment of the present application, or the image processing method corresponding to Figure 4.
  • Embodiments of the present application provide a computer program or computer program product.
  • the computer program or computer program product includes computer instructions.
  • When the computer instructions are executed by the computer, the computer implements the active defense method against deep forgery provided by any exemplary embodiment of the present application, or the image processing method corresponding to Figure 4.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Technology Law (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Editing Of Facsimile Originals (AREA)
  • Image Processing (AREA)

Abstract

The present application relates to the technical field of artificial intelligence, and discloses an image processing method and apparatus, an electronic device, and a computer-readable storage medium. The image processing method comprises: acquiring a first watermark generated for a deepfake model and a second watermark carrying target information; acquiring a target coding model, the target coding model being used for performing watermark embedding on an image; and embedding the first watermark and the second watermark into an initial image by means of the target coding model to obtain a target image, a difference between the initial image and the target image being less than a reference threshold.

Description

Image processing method, device, electronic equipment and computer-readable storage medium

This application claims priority to the Chinese patent application with application number 202210845845.5 and the application title "An active defense method and system against deep forgery" submitted on July 19, 2022, the entire content of which is incorporated into this application by reference.

Technical field

This application belongs to the field of artificial intelligence technology and involves image processing methods, devices, electronic equipment and computer-readable storage media.

Background technique

With the continuous development of artificial intelligence technology, some technologies for tampering with images have gradually emerged, and deepfake technology is one of them. Therefore, the image needs to be processed to prevent it from being tampered with by deepfake technology.

Contents of the invention

This application provides an active defense method and system against deep forgery, and also provides an image processing method, device, electronic equipment and computer-readable storage medium. The technical solution provided by this application includes the following aspects.

On the one hand, an active defense method against deep forgery is provided, the steps of which include:

1) Obtain the active defense watermark: prepare multiple deep forgery models with already-trained model parameters. This specifically includes:

1-1) Input any original training image, together with that image with the defensive watermark added (if it is the first training, the watermark is initialized to random noise), into the deep forgery model to obtain the tampered images of the original image and of the watermarked image.

1-2) Pass the loss back through the different deep forgery models to obtain the gradient sequence on the image.

1-3) Synthesize the gradient sequences of each image and each model and apply upper and lower bound constraints to them to obtain a defensive watermark.

1-4) During each training round, the watermark is updated on the basis of the defensive watermark obtained in the previous round. Specifically, the watermark obtained in this round is multiplied by the coefficient α (usually 0.01), and the previous watermark is multiplied by the coefficient 1−α, to obtain the new defensive watermark.

1-5) Repeat until the upper limit of the number of training rounds is reached, obtaining an active defense watermark that can distort the generation of multiple deep forgery models.

2) Train watermark embedding and detection, which specifically includes:

2-1) Prepare a certain number of face images;

2-2) Train an encoder-decoder. The encoder embeds the active defense watermark obtained in the previous step into the input image, and a loss function ensures that the embedded information is invisible. Afterwards, the decoder reads the embedded image and decodes the encoded watermark, and a loss function ensures the accuracy of the decoded information. When training is completed, the corresponding encoder and decoder weights are generated.

3) Deep forgery detection, which specifically includes:

3-1) Prepare the face images that need to be protected (or the frames split from the videos that need to be protected), as well as the deep forgery models to be defended against;

3-2) Use the encoder obtained in the previous step to embed the active defense watermark into the face image, then input the face image into the deep forgery model to obtain the forged image;

3-3) Use the decoder obtained in the previous step to decode the encoded watermark from the forged image and compare it with the originally embedded watermark. When the bit difference between the two is greater than or equal to the set threshold (usually 0.4), the image is considered to have been deep-forged.

On the one hand, an active defense system against deep forgery is provided, which includes:

1) Deep forgery model interface module: includes functions for inputting images into the deep forgery models and obtaining the generated results;

2) Active defense watermark generation module: used to generate the defensive watermark that protects faces from multiple deep forgery models; specifically, this module first completes access to the deep forgery models, then calls the basic watermark generation algorithm and combines it with watermark fusion technology to generate a model-universal active defense watermark.

3) Active defense watermark embedding module: this module trains the encoder-decoder and uses the encoder to embed the universal watermark generated by the active defense watermark generation module into the face image.

4) Watermark defense effect evaluation module: used to evaluate the degree to which the watermark distorts the output of the deep forgery model;

5) Deep forgery detection module: through the decoder provided by the active defense watermark embedding module, images with embedded watermarks are detected to determine whether a deep forgery model has modified them.

The beneficial effects brought by the technical solutions provided by the embodiments of this application at least include:

By generating a model-universal active defense watermark, embedding the watermark into media containing face information can distort the generation of deep forgery models, and the watermark can be used to detect whether the media content has undergone deep forgery, thereby thoroughly preventing deep forgery tampering. The embodiments of the present application can defend against a variety of deep forgery models, and can achieve this defense without needing structural information about the deep forgery models.

On the one hand, an image processing method is provided, which includes:

obtaining a first watermark generated for a deep forgery model, and a second watermark carrying target information;

obtaining a target encoding model, the target encoding model being used to embed watermarks into images;

embedding the first watermark and the second watermark into an initial image through the target encoding model to obtain a target image, the difference between the initial image and the target image being less than a reference threshold.

In an exemplary embodiment, embedding the first watermark and the second watermark into the initial image through the target encoding model to obtain the target image includes: superimposing the first watermark onto the initial image to obtain a first image, and inputting the first image and the second watermark into the target encoding model to obtain the target image output by the target encoding model; or inputting the second watermark and the initial image into the target encoding model to obtain a second image output by the target encoding model, and superimposing the first watermark onto the second image to obtain the target image.

In an exemplary embodiment, before embedding the first watermark and the second watermark into the initial image through the target encoding model to obtain the target image, the method further includes: acquiring a first sample image, a third watermark carrying first information, and an initial encoding model; inputting the first sample image into the initial encoding model to obtain a first result image output by the initial encoding model; inputting the first sample image and the third watermark into the initial encoding model to obtain a second result image output by the initial encoding model, the second result image being embedded with the third watermark; determining a first loss function according to the first result image and the second result image, the first loss function being used to indicate the difference between the first result image and the second result image; and updating the initial encoding model through the process of minimizing the first loss function to obtain the target encoding model, wherein the minimized first loss function is smaller than the reference threshold.

In an exemplary embodiment, after embedding the first watermark and the second watermark into the initial image through the target encoding model to obtain the target image, the method further includes: acquiring an updated image generated based on the target image; inputting the updated image into a target decoding model to obtain a fourth watermark output by the target decoding model; determining a bit error between the second watermark and the fourth watermark; and, when the bit error is greater than an error threshold, determining that the updated image is an image obtained by deep forging the target image through the deep forgery model.

In an exemplary embodiment, before inputting the updated image into the target decoding model, the method further includes: acquiring a second sample image and an initial decoding model, the second sample image being embedded with a fifth watermark carrying second information; inputting the second sample image into the initial decoding model to obtain a sixth watermark output by the initial decoding model; determining a second loss function according to the fifth watermark and the sixth watermark; and updating the initial decoding model through the process of minimizing the second loss function to obtain the target decoding model.

In an exemplary embodiment, obtaining the first watermark generated for the deep forgery model includes: acquiring a third sample image, a seventh watermark and the deep forgery model; inputting the third sample image into the deep forgery model to obtain a third result image output by the deep forgery model; superimposing the seventh watermark onto the third sample image to obtain a fourth sample image; inputting the fourth sample image into the deep forgery model to obtain a fourth result image output by the deep forgery model; determining a third loss function according to the third result image and the fourth result image; determining gradient information according to the third loss function; and updating the seventh watermark according to the gradient information to obtain the first watermark generated for the deep forgery model.

In an exemplary embodiment, there are multiple deep forgery models and multiple pieces of gradient information, with the multiple deep forgery models corresponding to the multiple pieces of gradient information one to one; updating the seventh watermark according to the gradient information to obtain the first watermark generated for the deep forgery model includes: averaging the multiple pieces of gradient information to obtain target gradient information; calculating the target gradient information through a sign function to obtain a first calculation result; generating a second calculation result according to the first calculation result, and superimposing the second calculation result onto the fourth sample image to obtain a fifth sample image; applying upper and lower bound constraints to the fifth sample image to obtain a sixth sample image; removing the third sample image from the sixth sample image to obtain an eighth watermark; and performing a weighted average of the seventh watermark and the eighth watermark to obtain the first watermark.

On the one hand, an image processing device is provided, which includes:

an acquisition module, configured to acquire a first watermark generated for a deep forgery model, and a second watermark carrying target information;

the acquisition module being further configured to acquire a target encoding model, the target encoding model being used to embed watermarks into images;

an embedding module, configured to embed the first watermark and the second watermark into an initial image through the target encoding model to obtain a target image, where the difference between the initial image and the target image is less than a reference threshold.

In an exemplary embodiment, the embedding module is configured to superimpose the first watermark onto the initial image to obtain a first image, and input the first image and the second watermark into the target encoding model to obtain the target image output by the target encoding model; or to input the second watermark and the initial image into the target encoding model to obtain a second image output by the target encoding model, and superimpose the first watermark onto the second image to obtain the target image.

In an exemplary embodiment, the acquisition module is further configured to acquire a first sample image, a third watermark carrying first information, and an initial encoding model; input the first sample image into the initial encoding model to obtain a first result image output by the initial encoding model; input the first sample image and the third watermark into the initial encoding model to obtain a second result image output by the initial encoding model, the second result image being embedded with the third watermark; determine a first loss function according to the first result image and the second result image, the first loss function being used to indicate the difference between the first result image and the second result image; and update the initial encoding model through the process of minimizing the first loss function to obtain the target encoding model, wherein the minimized first loss function is smaller than the reference threshold.

In an exemplary embodiment, the acquisition module is further configured to acquire an updated image generated based on the target image; input the updated image into a target decoding model to obtain a fourth watermark output by the target decoding model; determine a bit error between the second watermark and the fourth watermark; and, when the bit error is greater than an error threshold, determine that the updated image is an image obtained by deep forging the target image through the deep forgery model.

In an exemplary embodiment, the acquisition module is further configured to acquire a second sample image and an initial decoding model, the second sample image being embedded with a fifth watermark carrying second information; input the second sample image into the initial decoding model to obtain a sixth watermark output by the initial decoding model; determine a second loss function according to the fifth watermark and the sixth watermark; and update the initial decoding model through the process of minimizing the second loss function to obtain the target decoding model.

In an exemplary embodiment, the acquisition module is configured to acquire a third sample image, a seventh watermark and the deep forgery model; input the third sample image into the deep forgery model to obtain a third result image output by the deep forgery model; superimpose the seventh watermark onto the third sample image to obtain a fourth sample image; input the fourth sample image into the deep forgery model to obtain a fourth result image output by the deep forgery model; determine a third loss function according to the third result image and the fourth result image; determine gradient information according to the third loss function; and update the seventh watermark according to the gradient information to obtain the first watermark generated for the deep forgery model.

In an exemplary embodiment, there are multiple deep forgery models and multiple pieces of gradient information, with the multiple deep forgery models corresponding to the multiple pieces of gradient information one to one; the acquisition module is configured to average the multiple pieces of gradient information to obtain target gradient information; calculate the target gradient information through a sign function to obtain a first calculation result; generate a second calculation result according to the first calculation result and superimpose the second calculation result onto the fourth sample image to obtain a fifth sample image; apply upper and lower bound constraints to the fifth sample image to obtain a sixth sample image; remove the third sample image from the sixth sample image to obtain an eighth watermark; and perform a weighted average of the seventh watermark and the eighth watermark to obtain the first watermark.

On the one hand, an electronic device is provided. The electronic device includes a memory and a processor; at least one instruction is stored in the memory, and the at least one instruction is loaded and executed by the processor, so that the electronic device implements the active defense method against deep forgery, or the image processing method, provided by any of the exemplary embodiments of this application.

On the one hand, a computer-readable storage medium is provided. At least one instruction is stored in the computer-readable storage medium, and the instruction is loaded and executed by a processor, so that a computer implements the active defense method against deep forgery, or the image processing method, provided by any of the exemplary embodiments of this application.

On the other hand, a computer program or computer program product is provided. The computer program or computer program product includes computer instructions which, when executed by a computer, cause the computer to implement the active defense method against deep forgery, or the image processing method, provided by any of the exemplary embodiments of this application.

The beneficial effects brought by the technical solutions provided by the embodiments of this application at least further include:

The first watermark generated for the deep forgery model and the second watermark carrying target information are both embedded into the initial image, and the target encoding model is used in the embedding process, so that the difference between the initial image and the target image obtained after embedding is small. The first watermark and the second watermark are therefore invisible in the target image, which avoids affecting the content recorded in the target image. Moreover, such a target image can both defend against deep forgery models and carry customized target information, which ensures the accuracy of image processing.

Description of drawings

Figure 1 is a schematic diagram of the generation of an active defense watermark provided by an embodiment of the present application;

Figure 2 is a schematic diagram of active defense watermark embedding and deep forgery detection provided by an embodiment of the present application;

Figure 3 is a schematic diagram of an implementation environment provided by an embodiment of the present application;

Figure 4 is a flow chart of an image processing method provided by an embodiment of the present application;

Figure 5 is a schematic diagram of generating a first watermark provided by an embodiment of the present application;

Figure 6 is a schematic diagram of watermark embedding and deep forgery detection provided by an embodiment of the present application;

Figure 7 is a structural diagram of an image processing device provided by an embodiment of the present application;

Figure 8 is a structural diagram of an electronic device provided by an embodiment of the present application.

Detailed implementation

With the continuous development of deep learning technology, deepfake, a technology for modifying face images and videos, has exploded in popularity on the Internet. Deep forgery technology modifies faces through attribute modification or face replacement: it can modify hair color, face shape and other appearance features, and it can also transplant a face onto other videos and images, making a person appear to behave in ways inconsistent with their identity or to convey false information. For example, StarGAN (Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation) can generate face-tampered images with different facial features and expressions from a single original face picture. For another example, InterfaceGAN (Interpreting the Latent Space of GANs for Semantic Face Editing) can generate face images with controllable shooting angles through latent-variable editing.

On social media and short-video platforms, audio and video media created using deep forgery can be seen everywhere, and the social problems they cause have gradually attracted attention. For example, audio and video media created by deep forgery may be used to spread false information, causing great damage to users' public image.

In response, many short-video platforms have begun to take measures to regulate and ban face-swapping videos. However, the measures currently taken by the platforms against deep forgery are mainly passive detection, that is, training detectors to examine videos that have already been produced and released and to judge whether they are deep-forged content. This kind of detection can only defend passively and collect evidence after the fact; it cannot prevent the generation and spread of deep-forged content and has no way of cutting off the harmful influence of false content. Moreover, in the face of rapidly evolving deep forgery models, the detectors need to be continuously retrained and updated, which is very costly.

This application designs an active defense system against deep forgery. The system includes five modules: a deep forgery model interface, watermark generation, watermark embedding, defense effect evaluation, and deep forgery detection. Among them:

1) Deep forgery model interface module: includes functions for inputting images into the deep forgery models and obtaining the generated results;

2) Active defense watermark generation module: used to generate the defensive watermark that protects faces from multiple deep forgery models; specifically, this module first completes access to the deep forgery models, then calls the basic watermark generation algorithm and combines it with watermark fusion technology to generate a model-universal active defense watermark.

3) Active defense watermark embedding module: this module trains the encoder-decoder and uses the encoder to embed the universal watermark generated by the active defense watermark generation module into the face image.

4) Watermark defense effect evaluation module: used to evaluate the degree to which the watermark distorts the output of the deep forgery model;

5) Deep forgery detection module: through the decoder provided by the active defense watermark embedding module, images with embedded watermarks are detected to determine whether a deep forgery model has modified them.

In order to further illustrate the present application, a specific implementation is described below through an example, but the scope of application of the method is not limited in any way.

Taking the large-scale face attribute dataset CelebA (CelebFaces Attributes Dataset: http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html) and the deep forgery models HiSD, StarGAN, AttGAN and AttentionGAN trained on this dataset as the attack targets, and using the PGD attack algorithm as the basic attack algorithm, the following explains how to generate the active defense watermark, how to embed the watermark, and how to perform deep forgery detection.

Prepare the already-encapsulated deepfake models; read in the clean CelebA dataset, scale it to 256×256 and perform standardized preprocessing, and divide the CelebA dataset into a training set, a validation set and a test set.
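Purely for illustration, the preprocessing described above could look roughly as follows; the local dataset path, the normalization statistics, and the 80/10/10 split are assumptions not stated in the source.

```python
import torchvision.transforms as T
from torchvision.datasets import ImageFolder
from torch.utils.data import random_split

transform = T.Compose([
    T.Resize((256, 256)),                                      # scale to 256x256
    T.ToTensor(),
    T.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),    # standardize to roughly [-1, 1]
])

dataset = ImageFolder("data/celeba", transform=transform)      # hypothetical local copy of CelebA
n_train = int(0.8 * len(dataset))
n_val = int(0.1 * len(dataset))
n_test = len(dataset) - n_train - n_val
train_set, val_set, test_set = random_split(dataset, [n_train, n_val, n_test])
```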

The first step is to obtain the active defense watermark, as shown in Figure 1:

1) Input any batch of original training images, together with those images with the defensive watermark added (if it is the first training, the watermark is initialized to random noise), into the deep forgery model to obtain the tampered images of the original images and of the watermarked images.

2) Pass the loss back through the different deep forgery models to obtain the gradient sequence on the input images. The loss here is the loss function between the deep forgery model outputs obtained from the original image and from the watermarked image:

Loss_generation = MSE(G(I), G(I+W))

where I is the original image, W is the watermark, and G is the deep forgery model.
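A sketch of how Loss_generation could be evaluated across several deep forgery models; the list of wrapped models, the image batch, and the watermark tensor are placeholders.

```python
import torch
import torch.nn.functional as F

def generation_losses(deepfake_models, images, watermark):
    """Loss_generation = MSE(G(I), G(I + W)), evaluated per deep forgery model."""
    losses = []
    for G in deepfake_models:                    # each G is a wrapped deep forgery model
        clean_out = G(images)                    # G(I)
        marked_out = G(images + watermark)       # G(I + W)
        losses.append(F.mse_loss(clean_out, marked_out))
    return losses
```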

3) Fuse the gradient sequences of each image and each model and apply upper and lower bound constraints to them to obtain a defensive watermark. Specifically, when synthesizing the gradient sequences of the images, after the gradients of a batch of images (8 images) have been obtained on one model, the gradients are averaged to obtain g_avg, and the PGD algorithm is used to update iteratively 10 times in the positive direction of the gradient to obtain the adversarial perturbation P:

[The three iterative PGD update equations appear only as embedded images in the source (PCTCN2022144343-appb-000001 to -000003) and are not reproduced here.]

When fusing the gradient sequences of the models, the adversarial perturbation P obtained on the current model is multiplied by the coefficient α (usually 0.01) and the previous watermark is multiplied by 1−α to obtain the new defensive watermark:

W′ ← (1−α)W + αP

4) Repeat until 128 images have been trained on, obtaining an active defense watermark that can distort the generation of the deep forgery models.
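Putting steps 2) to 4) together, one possible sketch of the per-model PGD loop and the cross-model fusion W′ ← (1−α)W + αP is given below; the step size and the L∞ bound are assumptions, and the deep forgery model G is assumed to be differentiable so that the loss can be backpropagated through it, as in step 2).

```python
import torch
import torch.nn.functional as F

def pgd_perturbation(G, images, watermark, steps=10, step_size=0.01, eps=0.05):
    """Run PGD in the positive gradient direction to obtain the adversarial perturbation P."""
    P = watermark.clone()                               # shape (3, H, W), shared across the batch
    for _ in range(steps):
        P.requires_grad_(True)
        loss = F.mse_loss(G(images), G(images + P))     # Loss_generation for this batch of images
        g_avg = torch.autograd.grad(loss, P)[0]         # batch contribution aggregated by the mean reduction
        with torch.no_grad():
            P = (P + step_size * torch.sign(g_avg)).clamp(-eps, eps)  # upper/lower bound constraint
    return P.detach()

def fuse_watermark(W, P, alpha=0.01):
    """Cross-model fusion: W' <- (1 - alpha) * W + alpha * P."""
    return (1.0 - alpha) * W + alpha * P
```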

The second step is to train watermark embedding and detection:

1) Use the CelebA training set to train a pair of convolutional-neural-network encoder and decoder. The encoder embeds the active defense watermark into the input image, and the loss function constrains the embedded image to be sufficiently close to the original image, that is, the mean squared error is minimized so that the embedded information is invisible:

Loss_encoding = MSE(E(I), E(I,N))

where E is the encoder and N is the active defense watermark.

2) Afterwards, the decoder reads the embedded image and decodes the encoded watermark, and the loss function constrains the bit error between the decoding result and the original watermark, that is, the BCE-with-logits error function is minimized:

Loss_decoding = BCEWithLogitsLoss(W, D(E(I,N)))

where D is the decoder.

3) When training is completed, the corresponding encoder and decoder weights are generated.
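The two losses above can be combined into a single training loop. The sketch below follows the "keep the embedded image close to the original image" reading of step 1); the architectures, loss weighting, optimizer and watermark shape (a (1, num_bits) tensor) are assumptions made only for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def train_encoder_decoder(encoder, decoder, loader, watermark_bits, epochs=1, lam=1.0):
    """Jointly minimize Loss_encoding (invisibility) and Loss_decoding (recoverability)."""
    params = list(encoder.parameters()) + list(decoder.parameters())
    optimizer = torch.optim.Adam(params, lr=1e-4)
    bce = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for images, _ in loader:
            encoded = encoder(images, watermark_bits)               # E(I, N)
            loss_encoding = F.mse_loss(images, encoded)             # keep the embedding invisible
            decoded_logits = decoder(encoded)                       # D(E(I, N))
            target = watermark_bits.expand(images.size(0), -1)      # watermark_bits: shape (1, num_bits)
            loss_decoding = bce(decoded_logits, target)             # recover the watermark bits
            loss = loss_encoding + lam * loss_decoding
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return encoder, decoder
```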

The third step is deep forgery detection, as shown in Figure 2:

1) Select the CelebA test set as input;

2) Use the encoder obtained in the previous step to embed the active defense watermark into the face images, then input the face images into each deep forgery model to obtain the forged images;

3) Through the decoder obtained in the previous step, decode the encoded watermark from the forged image and compare it with the originally embedded watermark. When the bit difference between the two is greater than or equal to the set threshold (0.4), the image is considered to have been deep-forged. On the full CelebA test set, the minimum change rate of the decoded code between images that have passed through the deep forgery models and unforged images is 41.0%, so the forgery can be detected.

In a deep forgery attack test in which the model structures were unknown, the embodiment of the present application achieved a 100% deep forgery defense rate.

本申请实施例提供了一种针对深度伪造的主动防御方法,以及一种图像处理方法,这些方法均可应用于如图3所示的实施环境中。图3中,包括至少一个电子设备31和服务器32,电子设备31可与服务器32进行通信连接。以图像处理方法为例,电子设备31可以从服务器32获取需要处理的图像。当然,电子设备31也可以自行获取需要处理的图像,比如通过拍摄等方式获取需要处理的图像。The embodiment of the present application provides an active defense method against deep forgery and an image processing method. These methods can be applied in the implementation environment as shown in Figure 3. In FIG. 3 , at least one electronic device 31 and a server 32 are included. The electronic device 31 can communicate with the server 32 . Taking the image processing method as an example, the electronic device 31 can obtain the image that needs to be processed from the server 32 . Of course, the electronic device 31 can also acquire images that need to be processed by itself, such as by taking pictures or other methods.

示例性地,电子设备31包括但不限于任何一种可与用户通过键盘、触摸板、触摸屏、遥控器、语音交互或手写设备等一种或多种方式进行人机交互的电子产品,例如PC(Personal Computer,个人计算机)、手机、智能手机、PDA(Personal Digital Assistant,个人数字助手)、可穿戴设备、掌上电脑PPC(Pocket PC)、平板电脑、智能车机、智能电视、智能音箱等。Illustratively, the electronic device 31 includes but is not limited to any electronic product that can perform human-computer interaction with the user through one or more methods such as keyboard, touch pad, touch screen, remote control, voice interaction or handwriting device, such as a PC. (Personal Computer), mobile phones, smartphones, PDAs (Personal Digital Assistant), wearable devices, PPC (Pocket PC), tablets, smart cars, smart TVs, smart speakers, etc.

可选地,服务器32可以是一台服务器,可以是多台服务器组成的服务器集群,还可以是云化服务器。Optionally, the server 32 may be one server, a server cluster composed of multiple servers, or a cloud server.

Those skilled in the art should understand that the above-mentioned electronic device 31 and server 32 are only examples; other existing or future electronic devices and servers, if applicable to this application, should also fall within the protection scope of this application and are hereby incorporated by reference.

参见图4,本申请实施例提供了一种图像处理方法,该方法可应用于图3所示的电子设备中。如图4所示,该方法包括如下的步骤401至步骤403。Referring to Figure 4, an embodiment of the present application provides an image processing method, which can be applied to the electronic device shown in Figure 3. As shown in Figure 4, the method includes the following steps 401 to 403.

步骤401,获取针对深度伪造模型生成的第一水印,以及携带目标信息的第二水印。Step 401: Obtain the first watermark generated for the deep forgery model and the second watermark carrying target information.

The first watermark generated for the deepfake model is a universal watermark capable of defending against deepfake models. Illustratively, being able to defend against a deepfake model means that, after an image embedded with such a first watermark is input into the deepfake model, the deepfake model tampers with the image and outputs a tampered image; however, the tampered image is distorted, so it can be recognized that it has been tampered with by the deepfake model, which constitutes a defense against the deepfake model. Optionally, deepfake models include, but are not limited to, StarGAN, InterfaceGAN, HiSD (Hierarchical Style Disentanglement) and AttGAN (Attention Generative Adversarial Networks), without limitation.

示例性地,该第一水印为图像形式的水印。或者说,该第一水印由多个像素组成。其中,每个像素可以对应至少一个通道,每个通道对应一个取值。For example, the first watermark is a watermark in the form of an image. In other words, the first watermark is composed of multiple pixels. Among them, each pixel can correspond to at least one channel, and each channel corresponds to a value.

另外,携带目标信息的第二水印,是一种可以根据用户的实际需求定制的个性化水印,该第二水印携带的目标信息,也即是用户希望体现的信息。在一些实施方式中,该第二水印为图像形式的水印,比如一个标志(Logo)。则该第二水印由多个像素组成。每个像素可以对应至少一个通道,且每个通道对应一个取值。在另一些实施方式中,该第二水印为字符串形式的水印,比如用户使用的标识(Identification,ID)。本申请实施例不对该字符串的位数进 行限制,根据经验或者实际需求确定字符串的位数即可。In addition, the second watermark carrying target information is a personalized watermark that can be customized according to the actual needs of the user. The target information carried by the second watermark is the information that the user wants to reflect. In some implementations, the second watermark is a watermark in the form of an image, such as a logo. Then the second watermark consists of multiple pixels. Each pixel can correspond to at least one channel, and each channel corresponds to a value. In other implementations, the second watermark is a watermark in the form of a string, such as an identification (Identification, ID) used by the user. The embodiment of the present application does not limit the number of digits in the string, and the number of digits in the string can be determined based on experience or actual needs.

下面,分别对第一水印的获取方式和第二水印的获取方式进行说明。Next, the methods of obtaining the first watermark and the method of obtaining the second watermark will be described respectively.

在示例性实施例中,获取针对深度伪造模型生成的第一水印,包括如下的步骤A1至步骤A3。In an exemplary embodiment, obtaining the first watermark generated for the deepfake model includes the following steps A1 to A3.

步骤A1,获取第三样本图像、第七水印和深度伪造模型,将第七水印叠加至第三样本图像,得到第四样本图像;将第三样本图像输入深度伪造模型,得到深度伪造模型输出的第三结果图像,以及,将第四样本图像输入深度伪造模型,得到深度伪造模型输出的第四结果图像。Step A1: Obtain the third sample image, the seventh watermark and the deep forgery model, superimpose the seventh watermark to the third sample image, and obtain the fourth sample image; input the third sample image into the deep forgery model, and obtain the output of the deep forgery model. The third result image, and the fourth sample image is input into the deep forgery model to obtain the fourth result image output by the deep forgery model.

其中,第三样本图像可以是还未嵌入水印的图像,第三样本图像也即是图1所示的干净样本。该第三样本图像可以来自于公开数据集,包括但不限于CelebA(CelebFaces Attributes Dataset,人脸属性数据集)。第七水印是用于生成第一水印的初始化水印。示例性地,该第七水印可以是通过一些训练过程训练得到的水印,或者是随机生成的水印(又称为随机噪声)。该第七水印与第一水印具有相同的类型,比如第一水印为图像形式时,该第七水印也为图像形式,则组成第七水印的每个像素包括的通道对应一个取值。示例性地,每个通道对应一个较小的取值,比如在值域为0至255的情况下,每个通道对应的取值均位于0至1之间。另外,该第七水印的尺寸小于或者等于第三样本图像的尺寸,或者说,该第七水印包括的像素数量小于或者等于第三样本图像包括的像素数量。The third sample image may be an image that has not yet embedded a watermark, and the third sample image is the clean sample shown in Figure 1 . The third sample image can come from public data sets, including but not limited to CelebA (CelebFaces Attributes Dataset, face attribute data set). The seventh watermark is an initialization watermark used to generate the first watermark. For example, the seventh watermark may be a watermark trained through some training processes, or a randomly generated watermark (also known as random noise). The seventh watermark has the same type as the first watermark. For example, when the first watermark is in the form of an image and the seventh watermark is also in the form of an image, then each channel included in the pixels that make up the seventh watermark corresponds to a value. For example, each channel corresponds to a smaller value. For example, when the value range is 0 to 255, the value corresponding to each channel is between 0 and 1. In addition, the size of the seventh watermark is less than or equal to the size of the third sample image, or in other words, the number of pixels included in the seventh watermark is less than or equal to the number of pixels included in the third sample image.

The seventh watermark can be superimposed onto the third sample image to obtain the fourth sample image, which is the adversarial sample shown in Figure 1. In some implementations, the number of pixels in the seventh watermark equals the number of pixels in the third sample image, so the pixels of the seventh watermark correspond one-to-one to the pixels of the third sample image. Accordingly, superimposing the seventh watermark onto the third sample image to obtain the fourth sample image may include adding the values of the corresponding channels of corresponding pixels. Taking a pixel with R (Red), G (Green) and B (Blue) channels as an example, for two corresponding pixels of the seventh watermark and the third sample image, the R-channel values are added, the G-channel values are added, and the B-channel values are added, yielding the fourth sample image. In other implementations, the number of pixels in the seventh watermark is smaller than the number of pixels in the third sample image; in that case, a region containing the same number of pixels as the seventh watermark can be cropped from the third sample image, forming a one-to-one pixel correspondence. Alternatively, the number of pixels of the seventh watermark can be increased, for example by interpolation, until it equals the number of pixels of the third sample image, which likewise forms a one-to-one correspondence. In short, once a one-to-one pixel correspondence is formed, the values of the corresponding channels of corresponding pixels are added as described above to obtain the fourth sample image, which will not be repeated here.
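A small sketch of the superimposition just described (per-channel addition after making the pixel counts match); bilinear interpolation is used here as one possible way to enlarge the watermark, as mentioned above.

```python
import torch
import torch.nn.functional as F

def superimpose(image, watermark):
    # image: (B, 3, H, W); watermark: (3, h, w) or (B, 3, h, w) with h <= H and w <= W.
    if watermark.dim() == 3:
        watermark = watermark.unsqueeze(0)
    if watermark.shape[-2:] != image.shape[-2:]:
        # Enlarge the watermark so its pixels correspond one-to-one with the image pixels.
        watermark = F.interpolate(watermark, size=image.shape[-2:],
                                  mode="bilinear", align_corners=False)
    # Add the R, G and B channel values of corresponding pixels.
    return image + watermark
```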

Since the third sample image has been obtained, it can be input into the deepfake model, which tampers with the third sample image and outputs the tampered third sample image, i.e., the third result image. Denoting the third sample image as I and the deepfake model as G(·), the third result image is G(I). Since the fourth sample image has also been obtained, it can be input into the deepfake model, which tampers with the fourth sample image and outputs the tampered fourth sample image, i.e., the fourth result image. Denoting the seventh watermark as W, and since the fourth sample image is obtained by superimposing the seventh watermark W on the third sample image I, the fourth sample image is written I+W, the deepfake model is still G(·), and the fourth result image is G(I+W).

步骤A2,根据第三结果图像和第四结果图像确定第三损失函数,根据第三损失函数确定梯度信息。Step A2: Determine a third loss function based on the third result image and the fourth result image, and determine gradient information based on the third loss function.

The third loss function, determined from the third result image and the fourth result image, indicates the difference between the third result image and the fourth result image; this difference also reflects the difference between the third sample image and the fourth sample image. The embodiment of the present application does not limit how the third loss function is determined. Taking, as an example, computing the third loss function from the third result image and the fourth result image by MSE (Mean-Square Error), the third loss function Loss_generation is expressed as the following formula (1):

Loss_generation = MSE(G(I), G(I+W))   (1)

示例性地,在通过MSE的方式对第三结果图像和第四结果图像进行计算的过程中,第三结果图像包括的像素与第四结果图像包括的像素是一一对应的。因此,可以先计算相对应的像素包括的通道对应的取值之间的差值,对差值进行平方计算,得到像素对应的平方值。之后,计算不同像素对应的平方值之和,将该平方值之和与像素数量的比值作为第三损失函数。For example, in the process of calculating the third result image and the fourth result image through MSE, the pixels included in the third result image and the pixels included in the fourth result image are in one-to-one correspondence. Therefore, you can first calculate the difference between the values corresponding to the channels included in the corresponding pixels, and perform a square calculation on the difference to obtain the square value corresponding to the pixel. After that, the sum of square values corresponding to different pixels is calculated, and the ratio of the sum of square values to the number of pixels is used as the third loss function.

无论通过何种方式确定第三损失函数,均可以针对第三损失函数进行求取梯度计算,从而得到梯度信息,该梯度信息指示着第三损失函数的取值变化最快的方向。示例性地,该梯度信息可以为一个向量,本申请实施例不对该梯度信息的形式加以限定。No matter how the third loss function is determined, the gradient calculation can be performed on the third loss function to obtain gradient information, which indicates the direction in which the value of the third loss function changes fastest. For example, the gradient information may be a vector, and the embodiment of the present application does not limit the form of the gradient information.
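A sketch of steps A1-A2 for a single deepfake model G, assumed here to be a differentiable PyTorch module; it evaluates Loss_generation of formula (1) and returns its gradient with respect to the watermark.

```python
import torch
import torch.nn.functional as F

def watermark_gradient(G, I, W):
    # I: batch of clean (third sample) images; W: current watermark, same shape as I.
    W = W.detach().clone().requires_grad_(True)
    loss = F.mse_loss(G(I), G(I + W))   # Loss_generation = MSE(G(I), G(I + W)), formula (1)
    loss.backward()
    return W.grad                       # gradient information for this deepfake model
```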

步骤A3,根据梯度信息更新第七水印,得到针对深度伪造模型生成的第一水印。Step A3: Update the seventh watermark according to the gradient information to obtain the first watermark generated for the deep forgery model.

根据上文说明可知,第七水印是用于生成第一水印的初始化水印,因而本申请实施例在得到梯度信息之后,根据梯度信息更新第七水印,从而在第七水印的基础上得到第一水印。According to the above description, it can be seen that the seventh watermark is an initialization watermark used to generate the first watermark. Therefore, after obtaining the gradient information, the embodiment of the present application updates the seventh watermark according to the gradient information, thereby obtaining the first watermark based on the seventh watermark. watermark.

在示例性实施例中,深度伪造模型的数量和梯度信息的数量均为多个,则对于一个第三样本图像而言,会针对每个深度伪造模型分别执行上述的步骤A1和步骤A2。由此,参见图1和图5,可以针对每个深度伪造模型分别得到一个梯度信息,使得多个深度伪造模型与多个梯度信息一一对应。In an exemplary embodiment, the number of deepfake models and the number of gradient information are both multiple. Then for a third sample image, the above-mentioned step A1 and step A2 will be performed for each deepfake model respectively. Therefore, referring to Figure 1 and Figure 5, one gradient information can be obtained for each deep forgery model, so that multiple deep forgery models correspond to multiple gradient information one-to-one.

相应地,步骤A3进一步包括如下的步骤A31至步骤A33。Correspondingly, step A3 further includes the following steps A31 to A33.

步骤A31,对多个梯度信息进行平均,得到目标梯度信息。Step A31: average multiple gradient information to obtain target gradient information.

Since the target gradient information is obtained by averaging the multiple pieces of gradient information, it is the mean of those gradients and is denoted g_avg. Referring to Figures 1 and 5, averaging the multiple gradients is also referred to as gradient fusion.
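The fusion step itself is just an element-wise mean; a two-line sketch, continuing the helper above (deepfake_models, I and W are assumed to be provided by the surrounding loop):

```python
grads = [watermark_gradient(G, I, W) for G in deepfake_models]  # one gradient per model
g_avg = torch.stack(grads).mean(dim=0)                          # target gradient information
```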

Step A32, calculate the target gradient information with the sign function to obtain a first calculation result; generate a second calculation result from the first calculation result, and superimpose the second calculation result onto the fourth sample image to obtain a fifth sample image; apply upper and lower bound constraints to the fifth sample image to obtain a sixth sample image; and remove the third sample image from the sixth sample image to obtain an eighth watermark.

Applying the sign function to the target gradient information maps every negative value to -1, keeps values equal to 0 at 0, and maps every positive value to 1, which yields the first calculation result. Denoting the sign function as sign(·), the first calculation result is correspondingly written sign(g_avg).

Illustratively, the embodiment of the present application does not limit how the second calculation result is generated from the first calculation result; the generation method can be set according to actual needs. For example, the second calculation result may be the product of the first calculation result and a constant a, whose value is, for example, 0.1, without limitation. Correspondingly, the second calculation result can be expressed as a·sign(g_avg).

In step A1 above, the third sample image is denoted I, the seventh watermark W, and the fourth sample image I+W. Here, to make the iterative training process easier to express, the seventh watermark is denoted P_r and the fourth sample image is denoted I_r^adv, which is given by the following formula (2):

I_r^adv = I + P_r   (2)

Since the fourth sample image and the second calculation result have been obtained, the second calculation result can be superimposed onto the fourth sample image to obtain the fifth sample image I_r^adv + a·sign(g_avg). The fifth sample image is then constrained by upper and lower bounds according to the following formula (3) to obtain the sixth sample image I_{r+1}^adv:

I_{r+1}^adv = clip_{I,ε}{ I_r^adv + a·sign(g_avg) }   (3)

Here, clip_{I,ε}{·} denotes an upper/lower-bound constraint: when the value of a channel of a pixel in the fifth sample image is greater than the upper limit, the value is replaced by the upper limit, and when it is smaller than the lower limit, the value is replaced by the lower limit. The upper and lower limits may be determined according to actual needs and are not limited here.

After the sixth sample image is obtained, the third sample image is removed from it according to the following formula (4) to obtain the eighth watermark P_{r+1}, which is also called the adversarial perturbation:

P_{r+1} = I_{r+1}^adv - I   (4)

It should be noted that r in formulas (2), (3) and (4) above denotes the iteration count, which is set according to actual needs. For example, when the eighth watermark is obtained after a single iteration, r is 0 and the eighth watermark is P_1; when it is obtained after 10 iterations, r is 9 and the eighth watermark is P_10.

步骤A33,对第七水印和第八水印进行加权平均,得到第一水印。Step A33: perform a weighted average of the seventh watermark and the eighth watermark to obtain the first watermark.

在得到第八水印之后,便可以按照如下的公式(5)对第七水印和第八水印进行加权平均,得到第一水印W′:After obtaining the eighth watermark, the seventh watermark and the eighth watermark can be weighted and averaged according to the following formula (5) to obtain the first watermark W′:

W′ ← (1 - α)W + αP   (5)

Here, α is a weight whose value is, for example, 0.01 and can be determined from experience or actual needs. W is the seventh watermark, equal to P_r above, and P is the eighth watermark, equal to P_{r+1} above.
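The update of formulas (2)-(5) can be written compactly as below; the step size a, the clipping bound ε and the weight α are assumed example values (0.1, 0.05 and 0.01), and torch.sign plus element-wise min/max play the roles of sign(·) and clip_{I,ε}{·}.

```python
import torch

def pgd_blend_update(I, W, g_avg, a=0.1, eps=0.05, alpha=0.01):
    adv = I + W                                         # formula (2): the adversarial sample
    adv = adv + a * torch.sign(g_avg)                   # step along the fused gradient
    adv = torch.max(torch.min(adv, I + eps), I - eps)   # formula (3): upper/lower bound constraint
    adv = adv.clamp(0.0, 1.0)                           # keep pixel values in a valid range
    P_next = adv - I                                    # formula (4): new adversarial perturbation
    return (1 - alpha) * W + alpha * P_next             # formula (5): W' = (1 - α)W + αP
```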

The above steps A1 to A3 are a generation method based on the PGD (Projected Gradient Descent) algorithm. Moreover, steps A1 to A3 have been described for a single third sample image; the embodiment of the present application may provide multiple groups of images, each group including multiple third sample images. In that case, the first third sample image in the first group is first processed according to steps A1 to A3 to obtain first watermark 1; then the second third sample image in the first group and first watermark 1 are substituted into step A1, and steps A1 to A3 are repeated to obtain first watermark 2, and so on, until all third sample images in the first group have been used, yielding first watermark K, where K is the number of third sample images in the first group. After that, the second group of images is used: the first third sample image in the second group and first watermark K are substituted into step A1, steps A1 to A3 are repeated, and so on, until the last third sample image in the last group has been used, yielding the first watermark generated for the deepfake models.
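A sketch of this outer loop, chaining the helpers from the earlier sketches; image_groups (the batches of third sample images) and deepfake_models are assumed to be provided, and the 256x256 image size is only an example.

```python
import torch

W = 0.01 * torch.rand(1, 3, 256, 256)          # seventh watermark: small random initialisation
for group in image_groups:                      # each group holds several third sample images
    for I in group:                             # I: one clean image of shape (1, 3, 256, 256)
        grads = [watermark_gradient(G, I, W) for G in deepfake_models]
        g_avg = torch.stack(grads).mean(dim=0)  # gradient fusion (step A31)
        W = pgd_blend_update(I, W, g_avg)       # steps A32-A33, formulas (2)-(5)
first_watermark = W                             # the first watermark for the deepfake models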

In an exemplary embodiment, obtaining the second watermark carrying target information may include: providing a second watermark input interface, and obtaining a second watermark in image form uploaded by the user through the second watermark input interface, or obtaining a second watermark in string form typed by the user through the second watermark input interface. Of course, besides obtaining a second watermark provided by the user, a locally stored second watermark may be obtained, or a second watermark sent by another electronic device, a server, or the like may be obtained. The implementation of this application does not limit how the second watermark is obtained.

步骤402,获取目标编码模型,目标编码模型用于针对图像进行水印嵌入。Step 402: Obtain a target coding model, which is used to embed watermarks on the image.

其中,目标编码模型也即是上文所说的编码器。该目标编码模型可以是基于卷积神经网络的模型,也可以是其他人工智能模型,在此不作限定。在本申请实施例中,该目标编码模型具备针对图像进行水印嵌入的能力,并且能够使得嵌入水印之前的图像与嵌入水印之后的图像之间的差异小于参考阈值。或者说,使得所嵌入的水印在图像中不可见,从而可以实现对水印的隐藏嵌入。基于此,在训练得到该目标编码模型的过程中,需要结合该参考阈值进行训练。因此,在示例性实施例中,本申请实施例提供的方法还包括如下的步骤B1至步骤 B3。Among them, the target encoding model is the encoder mentioned above. The target encoding model can be a model based on a convolutional neural network or other artificial intelligence models, which are not limited here. In this embodiment of the present application, the target coding model has the ability to embed watermarks on images, and can make the difference between the image before embedding the watermark and the image after embedding the watermark less than a reference threshold. In other words, the embedded watermark is made invisible in the image, thereby enabling hidden embedding of the watermark. Based on this, in the process of training to obtain the target encoding model, it is necessary to combine the reference threshold for training. Therefore, in an exemplary embodiment, the method provided by the embodiment of the present application also includes the following steps B1 to B3.

Step B1, obtain a first sample image, a third watermark carrying first information, and an initial encoding model; input the first sample image into the initial encoding model to obtain a first result image output by the initial encoding model; and input the first sample image and the third watermark into the initial encoding model to obtain a second result image output by the initial encoding model, the second result image having the third watermark embedded.

The first sample image may be an image in which no watermark has yet been embedded, and it may come from a public data set such as CelebA. The first sample image may belong to the same data set as the third sample image described above, or the two may come from different data sets; this is not limited here. The third watermark carrying the first information is of the same type as the second watermark; for example, the second and third watermarks are both in image form, or both in string form. In addition, the first information carried by the third watermark may be randomly generated, and the first information is not limited in this embodiment of the application. The obtained initial encoding model is an initialization model used to generate the target encoding model.

Since the first sample image has been acquired, the first sample image can be input into the initial encoding model, which outputs the encoded first result image. Denoting the first sample image as M and the initial encoding model as E(·), the first result image is E(M). Since the third watermark has also been acquired, both the first sample image and the third watermark can be input into the initial encoding model, which outputs the encoded second result image; the second result image can be regarded as the first sample image with the third watermark embedded. Denoting the third watermark as N, with the initial encoding model still E(·), the second result image is E(M, N).

步骤B2,根据第一结果图像和第二结果图像确定第一损失函数,第一损失函数用于指示第一结果图像与第二结果图像之间的差异。Step B2: Determine a first loss function based on the first result image and the second result image. The first loss function is used to indicate the difference between the first result image and the second result image.

The first loss function, determined from the first result image and the second result image, indicates the difference between the first result image and the second result image; this difference also represents the difference between the first sample image and the first sample image in which the third watermark has been embedded. The embodiment of the present application does not limit how the first loss function is determined. Taking, as an example, computing the first loss function from the first result image and the second result image by MSE, the first loss function Loss_encoding is expressed as the following formula (6):

Loss_encoding = MSE(E(M), E(M, N))   (6)

其中,通过MSE的方式对第一结果图像和第二结果图像进行计算的过程,可以参见上文步骤301中说明的通过MSE的方式对第三结果图像和第四结果图像进行计算的过程,此处不再进行赘述。For the process of calculating the first result image and the second result image through MSE, please refer to the process of calculating the third result image and the fourth result image through MSE explained in step 301 above. No further details will be given.

步骤B3,通过最小化第一损失函数的过程更新初始编码模型,得到目标编码模型,其中,最小化的第一损失函数小于参考阈值。Step B3: Update the initial coding model through the process of minimizing the first loss function to obtain the target coding model, where the minimized first loss function is smaller than the reference threshold.

In the process of minimizing the first loss function, a gradient is computed for the first loss function to obtain a first gradient, and gradient backpropagation is performed based on the first gradient to obtain the target encoding model parameters at which the first loss function is minimized. Afterwards, the target encoding model is obtained by updating the initial encoding model parameters of the initial encoding model to the target encoding model parameters.

It should be noted that the minimized first loss function is smaller than the reference threshold, which may be a value approaching 0 and is not limited here. Since the minimized first loss function is smaller than the reference threshold, and the first loss function represents the difference between the first sample image (the image before the watermark is embedded) and the first sample image with the third watermark embedded (the image after the watermark is embedded), after the initial encoding model is updated by minimizing the first loss function to obtain the target encoding model, the target encoding model can keep the difference between the image before embedding and the image after embedding below the reference threshold, so that the embedded watermark is invisible in the image, achieving hidden embedding of the watermark.
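Under the same assumptions as the earlier training sketch (an encoder callable with or without a watermark, and an existing optimizer), one minimisation step of formula (6) might look as follows; M is a batch of first sample images and N the third watermark bits.

```python
loss_encoding = torch.nn.MSELoss()(encoder(M), encoder(M, N))   # formula (6)
optimizer.zero_grad()
loss_encoding.backward()   # first gradient
optimizer.step()           # initial encoding model parameters -> target encoding model parameters
```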

步骤403,通过目标编码模型将第一水印和第二水印嵌入初始图像,得到目标图像,初始图像与目标图像之间的差异小于参考阈值。Step 403: Embed the first watermark and the second watermark into the initial image through the target encoding model to obtain the target image, and the difference between the initial image and the target image is less than the reference threshold.

其中,初始图像可以是拍摄得到的图像,也可以是从视频中截取的图像,在此不作限定。本申请实施例通过目标编码模型将所获取的第一水印和第二水印嵌入初始图像,得到目标图像。该初始图像即为嵌入水印之前的图像,该目标图像即为嵌入水印之后的图像。由于目标编码模型能够使得嵌入水印之前的图像与嵌入水印之后的图像之间的差异小于参考阈值,因而初始图像与目标图像之间的差异小于参考阈值。也就是说,第一水印和第二水印是隐藏嵌入初始图像的,第一水印和第二水印在目标图像中不可见。The initial image may be a photographed image or an image captured from a video, which is not limited here. The embodiment of the present application embeds the obtained first watermark and second watermark into the initial image through the target coding model to obtain the target image. The initial image is the image before the watermark is embedded, and the target image is the image after the watermark is embedded. Since the target encoding model can make the difference between the image before embedding the watermark and the image after embedding the watermark less than the reference threshold, the difference between the initial image and the target image is less than the reference threshold. That is to say, the first watermark and the second watermark are hidden and embedded in the initial image, and the first watermark and the second watermark are not visible in the target image.

在示例性实施例中,考虑到第一水印是针对深度伪造模型生成的,因而第一水印本身就是不可见的水印,对第一水印的嵌入,可以是对第一水印的直接叠加。而第二水印是携带目标信息的,第二水印往往是可见的水印,因而需要通过目标编码模型实现第二水印的嵌入。基于这些考虑,通过目标编码模型将第一水印和第二水印嵌入初始图像,得到目标图像,可以包括如下的两种嵌入方式。In an exemplary embodiment, considering that the first watermark is generated for the deepfake model, the first watermark itself is an invisible watermark, and the embedding of the first watermark may be a direct superposition of the first watermark. The second watermark carries target information, and the second watermark is often a visible watermark, so it is necessary to implement the embedding of the second watermark through the target coding model. Based on these considerations, the first watermark and the second watermark are embedded in the initial image through the target coding model to obtain the target image, which can include the following two embedding methods.

嵌入方式一,将第一水印叠加至初始图像上,得到第一图像,将第一图像和第二水印输入目标编码模型,得到目标编码模型输出的目标图像。在嵌入方式一中,先在初始图像上叠加第一水印,得到第一图像。叠加该第一水印的方式可以参见步骤301中将第七水印叠加在第三样本图像上的方式,此处不再赘述。之后,再通过目标编码模型将第二水印嵌入第一图像,得到目标图像。Embedding method one: superimpose the first watermark onto the initial image to obtain the first image, input the first image and the second watermark into the target encoding model, and obtain the target image output by the target encoding model. In the first embedding method, the first watermark is superimposed on the initial image to obtain the first image. For the method of superimposing the first watermark, please refer to the method of superimposing the seventh watermark on the third sample image in step 301, which will not be described again here. After that, the second watermark is embedded into the first image through the target encoding model to obtain the target image.

嵌入方式二,将第二水印和初始图像输入目标编码模型,得到目标编码模型输出的第二图像,将第一水印叠加至第二图像上,得到目标图像。在嵌入方式二中,先通过目标编码模型将第二水印嵌入初始图像,得到第二图像。之后,再在第二图像上叠加第一水印,得到目标图像。叠加该第一水印的方式同样可以参见步骤301中将第七水印叠加在第三样本图像上的方式,此处不再赘述。In the second embedding method, the second watermark and the initial image are input into the target encoding model to obtain the second image output by the target encoding model, and the first watermark is superimposed on the second image to obtain the target image. In the second embedding method, the second watermark is first embedded into the initial image through the target encoding model to obtain the second image. After that, the first watermark is superimposed on the second image to obtain the target image. The method of superimposing the first watermark can also refer to the method of superimposing the seventh watermark on the third sample image in step 301, which will not be described again here.
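Both embedding orders can be sketched with the helpers above (superimpose() for the first watermark and the trained encoder for the second); whichever order is used, the resulting target image carries both watermarks.

```python
def embed_method_one(encoder, initial_image, first_watermark, second_watermark):
    first_image = superimpose(initial_image, first_watermark)   # overlay the first watermark
    return encoder(first_image, second_watermark)               # then hide the second watermark

def embed_method_two(encoder, initial_image, first_watermark, second_watermark):
    second_image = encoder(initial_image, second_watermark)     # hide the second watermark first
    return superimpose(second_image, first_watermark)           # then overlay the first watermark
```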

通过以上的说明,本申请实施例可以将第一水印和第二水印嵌入目标图像,使得该目标图像既能够防御深度伪造模型,又能够携带目标信息。另外,本申请实施例还可以检测此种目标图像是否被深度伪造模型篡改,详见如下的说明。Through the above description, embodiments of the present application can embed the first watermark and the second watermark into the target image, so that the target image can not only defend against deep forgery models, but also carry target information. In addition, embodiments of the present application can also detect whether such a target image has been tampered with by a deep forgery model. Please refer to the following description for details.

In an exemplary embodiment, after the first watermark and the second watermark are embedded into the initial image through the target encoding model to obtain the target image, the method further includes: obtaining an updated image generated based on the target image; inputting the updated image into a target decoding model to obtain a fourth watermark output by the target decoding model; determining a bit error between the second watermark and the fourth watermark; and, when the bit error is greater than an error threshold, determining that the updated image is an image obtained by deep-forging the target image through a deepfake model.

In the embodiment of the present application, the target image can be uploaded to a network scenario where image tampering may occur, and the uploaded target image can be downloaded after a period of time to obtain an updated image, so as to detect whether the updated image has been tampered with, that is, whether it has been deep-forged by a deepfake model. Alternatively, the target image can be input into a deepfake model to obtain the updated image output by the deepfake model. The embodiment of the present application does not limit the way in which the updated image is generated.

The target decoding model is the decoder described above. The target decoding model may be a model based on a convolutional neural network, or another artificial intelligence model, which is not limited here. The target decoding model is used to extract watermarks from images in which watermarks are embedded. Since the updated image is generated based on the target image, and the target image has the first and second watermarks embedded, the updated image also carries embedded watermarks. Therefore, after the updated image is input into the target decoding model, the target decoding model can extract and output the fourth watermark from the updated image.

之后,便可以计算第二水印与第四水印之间的位误差,对位误差与误差阈值进行比较,该误差阈值例如为0.4,在此不作限定。如果位误差大于误差阈值,则确定更新图像是被篡改过的图像,或者说,确定更新图像为通过深度伪造模型对目标图像进行深度伪造后得到的图像。其原因在于,在更新图像已经被篡改的情况下,从更新图像中提取的第四水印是已经扭曲的第一水印以及已经扭曲的第二水印,因而第四水印与正常未被扭曲的第二水印之间的差异较大,从而使得第二水印与第四水印之间的位误差大于误差阈值。而位如果位误差小于或等于误差阈值,则确定更新图像是未被篡改过的图像,或者说,确定更新图像不是通过深度伪造模型对目标图像进行深度伪造后得到的图像。其原因在于,在更新图像未被篡改的情况下,从更新图像中提取的第四水印即为正常未被扭曲的第一水印和第二水印,从而使得第二水印与第四水印之间的位误差小于或等于误差阈值。After that, the bit error between the second watermark and the fourth watermark can be calculated, and the bit error can be compared with an error threshold. The error threshold is, for example, 0.4, which is not limited here. If the bit error is greater than the error threshold, it is determined that the updated image is a tampered image, or in other words, it is determined that the updated image is an image obtained by deep forging the target image through a deep forgery model. The reason is that when the update image has been tampered with, the fourth watermark extracted from the update image is the distorted first watermark and the distorted second watermark. Therefore, the fourth watermark is different from the normal undistorted second watermark. The difference between the watermarks is large, so that the bit error between the second watermark and the fourth watermark is greater than the error threshold. If the bit error is less than or equal to the error threshold, it is determined that the updated image is an image that has not been tampered with, or in other words, it is determined that the updated image is not an image obtained by deep forging the target image through a deep forgery model. The reason is that when the update image has not been tampered with, the fourth watermark extracted from the update image is the normal undistorted first watermark and the second watermark, so that the difference between the second watermark and the fourth watermark is The bit error is less than or equal to the error threshold.

在一些实施方式中,第二水印和第四水印为图像形式的水印。则确定第二水印与第四水印之间的位误差,包括:比较第二水印的图像质量指标与第四水印的图像质量指标,将图像质量指标之间的差异作为该位误差。示例性地,图像质量指标包括但不限于PSNR(峰值信噪比,Peak Signal-to-Noise Ratio)、SSIM(Structural Similarity,结构相似性)等等,在此不作限定。In some implementations, the second watermark and the fourth watermark are watermarks in the form of images. Then determining the bit error between the second watermark and the fourth watermark includes: comparing the image quality index of the second watermark with the image quality index of the fourth watermark, and taking the difference between the image quality indexes as the bit error. For example, image quality indicators include but are not limited to PSNR (Peak Signal-to-Noise Ratio), SSIM (Structural Similarity, structural similarity), etc., which are not limited here.

在另一些实施方式中,第二水印和第四水印为字符串形式的水印。则确定第二水印与第四水印之间的位误差,包括:确定第二水印包括的比特位的总数量,以及第二水印与第四水印包括的存在差异的比特位的目标数量,将目标数量与总数量的比值作为该位误差。比如,第二水印是一个包括4个比特位的字符串1100,第四水印也是一个包括4个比特位的字符串0000,则上述的总数量为4,上述的目标数量为2,位误差为0.5。In other implementations, the second watermark and the fourth watermark are watermarks in the form of character strings. Then determining the bit error between the second watermark and the fourth watermark includes: determining the total number of bits included in the second watermark, and the target number of different bits included in the second watermark and the fourth watermark, and setting the target number. The ratio of the quantity to the total quantity is used as the bit error. For example, the second watermark is a string of 4 bits 1100, and the fourth watermark is also a string of 4 bits 0000, then the total number mentioned above is 4, the target number mentioned above is 2, and the bit error is 0.5.
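For string-form watermarks the bit error is simply the fraction of differing positions; the sketch below reproduces the example above (1100 vs. 0000 gives 2 differing bits out of 4, i.e. 0.5).

```python
def bit_error(embedded: str, recovered: str) -> float:
    assert len(embedded) == len(recovered)
    differing = sum(1 for a, b in zip(embedded, recovered) if a != b)
    return differing / len(embedded)

print(bit_error("1100", "0000"))   # 0.5, which would exceed a 0.4 error threshold
```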

比如,参见图6,对深度伪造检测过程进行举例说明。在初始图像上叠加第一水印,将得到的第一图像和第二水印一起输入目标编码模型,得到目标编码模型输出的目标图像。之后,将目标图像输入深度伪造模型,得到深度伪造模型输出的更新图像,将更新图像输入目标解码模型,得到更新图像输出的第四水印。然后,对第二水印和第四水印进行比较,从而根据比较结果确定更新图像是否被篡改,完成深度伪造检测。For example, see Figure 6 for an example of the deepfake detection process. The first watermark is superimposed on the initial image, and the obtained first image and the second watermark are input into the target encoding model to obtain the target image output by the target encoding model. Afterwards, the target image is input into the deep forgery model to obtain an updated image output by the deep forgery model, and the updated image is input into the target decoding model to obtain a fourth watermark output from the updated image. Then, the second watermark and the fourth watermark are compared to determine whether the updated image has been tampered with based on the comparison result, thereby completing deep forgery detection.
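Tying the pieces together, the flow of Figure 6 can be sketched as follows; deepfake_model stands for any of the forgery networks, and the helpers come from the earlier sketches (so the shapes and names here are assumptions, not a definitive implementation).

```python
target_image = embed_method_one(encoder, initial_image, first_watermark, second_watermark)
updated_image = deepfake_model(target_image)                        # possible tampering
tampered = is_deepfaked(decoder, updated_image, second_watermark)   # True when bit error >= 0.4
```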

当然,本申请实施例在使用目标解码模型之前,可以先训练得到该目标解码模型。则在示例性实施例中,将更新图像输入目标解码模型之前,方法还包括如下的步骤C1至步骤C3。Of course, before using the target decoding model in this embodiment of the present application, the target decoding model can be trained first. In an exemplary embodiment, before inputting the updated image into the target decoding model, the method further includes the following steps C1 to C3.

步骤C1,获取第二样本图像和初始解码模型,第二样本图像嵌入有携带第二信息的第五水印;将第二样本图像输入初始解码模型,得到初始解码模型输出的第六水印。Step C1: Obtain the second sample image and the initial decoding model. The second sample image is embedded with the fifth watermark carrying the second information; input the second sample image into the initial decoding model to obtain the sixth watermark output by the initial decoding model.

The second sample image is an image in which the fifth watermark has already been embedded. The embodiment of the present application may directly obtain a second sample image that meets the requirements, or may produce the second sample image through a watermark embedding process, for example by taking the second result image in step B1 above as the second sample image, in which case the fifth watermark is the third watermark in step B1. The way the second sample image is obtained is not limited here. The fifth watermark carrying the second information is of the same type as the second watermark, and the second information may be randomly generated. In addition, the initial decoding model is an initialization model used to generate the target decoding model.

After the second sample image is acquired, it is input into the initial decoding model, which outputs the decoded sixth watermark. The fifth watermark and the sixth watermark are of the same type, for example both in image form or both in string form. Taking the case where the second result image is the second sample image and the fifth watermark is the third watermark as an example, the fifth watermark is denoted N and the second sample image E(M, N); denoting the initial decoding model as D(·), the sixth watermark is D(E(M, N)).

步骤C2,根据第五水印和第六水印确定第二损失函数。Step C2: Determine the second loss function based on the fifth watermark and the sixth watermark.

The second loss function, determined from the fifth watermark and the sixth watermark, indicates the difference between the fifth watermark and the sixth watermark. This application does not limit how the second loss function is determined. Taking, as an example, computing the second loss function from the fifth and sixth watermarks with BCE (Binary Cross Entropy) with logits (logistic-regression outputs), the second loss function Loss_decoding is expressed as the following formula (7):

Loss_decoding = BCEWithLogitsLoss(N, D(E(M, N)))   (7)

BCEWithLogitsLoss is BCE with logits. In the calculation, the fifth watermark is normalized to obtain a normalized fifth watermark, and the sixth watermark is normalized to obtain a normalized sixth watermark; the Sigmoid function can be used for the normalization. The normalized fifth watermark and the normalized sixth watermark are then evaluated with BCE to obtain the second loss function.
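As a note on the loss itself: in PyTorch, BCEWithLogitsLoss fuses the sigmoid normalisation and the binary cross entropy, so (up to numerical stability) the following two expressions agree; logits stands for the decoder outputs and target_bits for the embedded watermark bits, both assumed to exist in the surrounding training code.

```python
loss_a = torch.nn.BCEWithLogitsLoss()(logits, target_bits)
loss_b = torch.nn.BCELoss()(torch.sigmoid(logits), target_bits)   # same value, less stable
```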

步骤C3,通过最小化第二损失函数的过程更新初始解码模型,得到目标解码模型。Step C3: Update the initial decoding model through the process of minimizing the second loss function to obtain the target decoding model.

In the process of minimizing the second loss function, a gradient is computed for the second loss function to obtain a second gradient, and gradient backpropagation is performed based on the second gradient to obtain the target decoding model parameters at which the second loss function is minimized. Afterwards, the target decoding model is obtained by updating the initial decoding model parameters of the initial decoding model to the target decoding model parameters. Because the second loss function, which represents the difference between the fifth watermark and the sixth watermark, is minimized, the difference between the fifth and sixth watermarks is minimized. Since the fifth watermark is the actually embedded watermark and the sixth watermark is the extracted watermark, minimizing their difference ensures that they are sufficiently close, which in turn ensures that the watermark extracted by the target decoding model is highly accurate.
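Mirroring the encoder case, one minimisation step of formula (7) under the same assumptions (encoder, decoder, optimizer, M and N as in the earlier sketches):

```python
loss_decoding = torch.nn.BCEWithLogitsLoss()(decoder(encoder(M, N)), N)   # formula (7)
optimizer.zero_grad()
loss_decoding.backward()   # second gradient
optimizer.step()           # initial decoding model parameters -> target decoding model parameters
```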

To sum up, the embodiment of the present application embeds both the first watermark generated for the deepfake model and the second watermark carrying target information into the initial image, and uses the target encoding model during embedding, so that the difference between the initial image and the resulting target image is small. The first and second watermarks are therefore invisible in the target image and do not affect the content recorded in the target image. Moreover, such a target image can both defend against deepfake models and carry customized target information, so the quality and security of the target image are high and the accuracy of image processing is good. The embodiment of the present application can also detect whether the target image has been tampered with by a deepfake model, further ensuring the security of the target image.

本申请实施例提供了一种图像处理装置,参见图7,该装置包括如下的几个模块。An embodiment of the present application provides an image processing device. See Figure 7. The device includes the following modules.

获取模块701,用于获取针对深度伪造模型生成的第一水印,以及携带目标信息的第二水印;The acquisition module 701 is used to acquire the first watermark generated for the deep forgery model and the second watermark carrying target information;

获取模块701,还用于获取目标编码模型,目标编码模型用于针对图像进行水印嵌入;The acquisition module 701 is also used to acquire the target coding model, which is used to embed watermarks on images;

嵌入模块702,用于通过目标编码模型将第一水印和第二水印嵌入初始图像,得到目标图像,初始图像与目标图像之间的差异小于参考阈值。The embedding module 702 is used to embed the first watermark and the second watermark into the initial image through the target encoding model to obtain the target image, where the difference between the initial image and the target image is less than the reference threshold.

在示例性实施例中,嵌入模块702,用于将第一水印叠加至初始图像上,得到第一图像,将第一图像和第二水印输入目标编码模型,得到目标编码模型输出的目标图像;或者,将第二水印和初始图像输入目标编码模型,得到目标编码模型输出的第二图像,将第一水印叠加至第二图像上,得到目标图像。In an exemplary embodiment, the embedding module 702 is used to superimpose the first watermark onto the initial image to obtain the first image, and input the first image and the second watermark into the target encoding model to obtain the target image output by the target encoding model; Alternatively, the second watermark and the initial image are input into the target encoding model to obtain the second image output by the target encoding model, and the first watermark is superimposed on the second image to obtain the target image.

在示例性实施例中,获取模块701,还用于获取第一样本图像、携带第一信息的第三水印以及初始编码模型;将第一样本图像输入初始编码模型,得到初始编码模型输出的第一结果 图像;将第一样本图像和第三水印输入初始编码模型,得到初始编码模型输出的第二结果图像,第二结果图像嵌入有第三水印;根据第一结果图像和第二结果图像确定第一损失函数,第一损失函数用于指示第一结果图像与第二结果图像之间的差异;通过最小化第一损失函数的过程更新初始编码模型,得到目标编码模型,其中,最小化的第一损失函数小于参考阈值。In an exemplary embodiment, the acquisition module 701 is also used to acquire the first sample image, the third watermark carrying the first information, and the initial encoding model; input the first sample image into the initial encoding model to obtain the initial encoding model output The first result image; input the first sample image and the third watermark into the initial coding model to obtain the second result image output by the initial coding model, and the second result image is embedded with the third watermark; according to the first result image and the second The result image determines the first loss function, and the first loss function is used to indicate the difference between the first result image and the second result image; the initial encoding model is updated through the process of minimizing the first loss function to obtain the target encoding model, where, The minimized first loss function is smaller than the reference threshold.

在示例性实施例中,获取模块701,还用于获取基于目标图像生成的更新图像;将更新图像输入目标解码模型,得到目标解码模型输出的第四水印;确定第二水印与第四水印之间的位误差;在位误差大于误差阈值的情况下,确定更新图像为通过深度伪造模型对目标图像进行深度伪造后得到的图像。In an exemplary embodiment, the acquisition module 701 is also used to acquire an updated image generated based on the target image; input the updated image into the target decoding model to obtain the fourth watermark output by the target decoding model; determine the relationship between the second watermark and the fourth watermark. bit error between; when the bit error is greater than the error threshold, it is determined that the updated image is an image obtained by deep forging the target image through a deep forgery model.

在示例性实施例中,获取模块701,还用于获取第二样本图像和初始解码模型,第二样本图像嵌入有携带第二信息的第五水印;将第二样本图像输入初始解码模型,得到初始解码模型输出的第六水印;根据第五水印和第六水印确定第二损失函数;通过最小化第二损失函数的过程更新初始解码模型,得到目标解码模型。In an exemplary embodiment, the acquisition module 701 is also used to acquire a second sample image and an initial decoding model. The second sample image is embedded with a fifth watermark carrying second information; input the second sample image into the initial decoding model to obtain The sixth watermark output by the initial decoding model; determine the second loss function based on the fifth watermark and the sixth watermark; update the initial decoding model through the process of minimizing the second loss function to obtain the target decoding model.

在示例性实施例中,获取模块701,用于获取第三样本图像、第七水印和深度伪造模型;将第三样本图像输入深度伪造模型,得到深度伪造模型输出的第三结果图像;将第七水印叠加至第三样本图像,得到第四样本图像;将第四样本图像输入深度伪造模型,得到深度伪造模型输出的第四结果图像;根据第三结果图像和第四结果图像确定第三损失函数;根据第三损失函数确定梯度信息;根据梯度信息更新第七水印,得到针对深度伪造模型生成的第一水印。In an exemplary embodiment, the acquisition module 701 is used to acquire the third sample image, the seventh watermark and the deep forgery model; input the third sample image into the deep forgery model to obtain the third result image output by the deep forgery model; convert the third sample image to the deep forgery model. The seven watermarks are superimposed on the third sample image to obtain the fourth sample image; the fourth sample image is input into the deep forgery model to obtain the fourth result image output by the deep forgery model; the third loss is determined based on the third result image and the fourth result image. function; determine the gradient information based on the third loss function; update the seventh watermark based on the gradient information to obtain the first watermark generated for the deep forgery model.

在示例性实施例中,深度伪造模型的数量和梯度信息的数量均为多个,多个深度伪造模型与多个梯度信息一一对应;获取模块701,用于对多个梯度信息进行平均,得到目标梯度信息;通过符号函数对目标梯度信息进行计算,得到第一计算结果;根据第一计算结果生成第二计算结果,将第二计算结果叠加至第四样本图像,得到第五样本图像;对第五样本图像进行上下限约束,得到第六样本图像;从第六样本图像中去除第三样本图像,得到第八水印;对第七水印和第八水印进行加权平均,得到第一水印。In an exemplary embodiment, there are multiple deep fake models and multiple gradient information, and multiple deep fake models correspond to multiple gradient information one-to-one; the acquisition module 701 is used to average multiple gradient information, Obtain the target gradient information; calculate the target gradient information through the symbolic function to obtain the first calculation result; generate the second calculation result based on the first calculation result, and superimpose the second calculation result to the fourth sample image to obtain the fifth sample image; Apply upper and lower bound constraints to the fifth sample image to obtain the sixth sample image; remove the third sample image from the sixth sample image to obtain the eighth watermark; perform a weighted average of the seventh watermark and the eighth watermark to obtain the first watermark.

需要说明的是,上述图7所示的装置所具备的技术效果,可以参见图4所示的方法实施例所具备的技术效果,此处不再赘述。并且,上述实施例提供的装置在实现其功能时,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将设备的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述实施例提供的装置与方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。It should be noted that for the technical effects of the device shown in Figure 7, please refer to the technical effects of the method embodiment shown in Figure 4, which will not be described again here. Moreover, when the device provided in the above embodiment implements its functions, only the division of the above functional modules is used as an example. In practical applications, the above functions can be allocated to different functional modules as needed, that is, the internal structure of the device Divide it into different functional modules to complete all or part of the functions described above. In addition, the apparatus and method embodiments provided in the above embodiments belong to the same concept, and the specific implementation process can be found in the method embodiments, which will not be described again here.

在示例性实施例中,本申请实施例还提供了一种电子设备。该电子设备包括存储器及处理器;存储器中存储有至少一条指令,至少一条指令由处理器加载并执行,以使电子设备实现本申请的任一种示例性实施例所提供的针对深度伪造的主动防御方法,或者图4对应的图像处理方法。In an exemplary embodiment, the embodiment of the present application also provides an electronic device. The electronic device includes a memory and a processor; at least one instruction is stored in the memory, and the at least one instruction is loaded and executed by the processor, so that the electronic device implements active deep forgery provided by any exemplary embodiment of the present application. Defense method, or the image processing method corresponding to Figure 4.

Referring to Figure 8, a schematic structural diagram of an electronic device 800 provided by an embodiment of the present application is shown. The electronic device 800 may be a portable mobile electronic device, such as a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop computer, or a desktop computer. The electronic device 800 may also be referred to by other names such as user equipment, portable electronic device, laptop electronic device, or desktop electronic device. Generally, the electronic device 800 includes a processor 801 and a memory 802.

The processor 801 may include one or more processing cores, for example a 4-core processor or an 8-core processor. The processor 801 may be implemented in at least one hardware form selected from the group consisting of a DSP (Digital Signal Processing) unit, an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 801 may also include a main processor and a coprocessor. The main processor is a processor used to process data in the awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor used to process data in the standby state. In some embodiments, the processor 801 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen 805. In some embodiments, the processor 801 may also include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.

The memory 802 may include one or more computer-readable storage media, which may be non-transitory. The memory 802 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 802 is used to store at least one instruction, and the at least one instruction is executed by the processor 801 to implement the active defense method against deep forgery provided by the method embodiments of the present application, or the image processing method corresponding to Figure 4.

In some embodiments, the electronic device 800 optionally further includes a peripheral device interface 803 and at least one peripheral device. The processor 801, the memory 802, and the peripheral device interface 803 may be connected through a bus or a signal line. Each peripheral device may be connected to the peripheral device interface 803 through a bus, a signal line, or a circuit board. Specifically, the peripheral devices include at least one of the group consisting of a radio frequency circuit 804, a display screen 805, a camera assembly 806, an audio circuit 807, a positioning assembly 808, and a power supply 809.

The peripheral device interface 803 may be used to connect at least one I/O (Input/Output) related peripheral device to the processor 801 and the memory 802. In some embodiments, the processor 801, the memory 802, and the peripheral device interface 803 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 801, the memory 802, and the peripheral device interface 803 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.

The radio frequency circuit 804 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 804 communicates with communication networks and other communication devices through electromagnetic signals. The radio frequency circuit 804 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 804 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and the like. The radio frequency circuit 804 may communicate with other electronic devices through at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to, metropolitan area networks, mobile communication networks of various generations (2G, 3G, 4G, and 5G), wireless local area networks, and/or Wi-Fi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 804 may also include circuits related to NFC (Near Field Communication), which is not limited in this application.

The display screen 805 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 805 is a touch display screen, the display screen 805 also has the ability to collect touch signals on or above its surface. The touch signal may be input to the processor 801 as a control signal for processing. In this case, the display screen 805 may also be used to provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 805, disposed on the front panel of the electronic device 800; in other embodiments, there may be at least two display screens 805, respectively disposed on different surfaces of the electronic device 800 or in a folded design; in still other embodiments, the display screen 805 may be a flexible display screen disposed on a curved surface or a folded surface of the electronic device 800. The display screen 805 may even be set to a non-rectangular irregular shape, that is, a special-shaped screen. The display screen 805 may be made of materials such as an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode) display.

The camera assembly 806 is used to capture images or videos. Optionally, the camera assembly 806 includes a front camera and a rear camera. Usually, the front camera is disposed on the front panel of the electronic device, and the rear camera is disposed on the back of the electronic device. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so as to realize a background blur function through fusion of the main camera and the depth-of-field camera, panoramic shooting and VR (Virtual Reality) shooting functions through fusion of the main camera and the wide-angle camera, or other fusion shooting functions. In some embodiments, the camera assembly 806 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash refers to a combination of a warm-light flash and a cold-light flash, and may be used for light compensation under different color temperatures.

The audio circuit 807 may include a microphone and a speaker. The microphone is used to collect sound waves from the user and the environment, and convert the sound waves into electrical signals that are input to the processor 801 for processing, or input to the radio frequency circuit 804 to implement voice communication. For the purpose of stereo collection or noise reduction, there may be multiple microphones, respectively disposed at different parts of the electronic device 800. The microphone may also be an array microphone or an omnidirectional microphone. The speaker is used to convert electrical signals from the processor 801 or the radio frequency circuit 804 into sound waves. The speaker may be a traditional membrane speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can not only convert electrical signals into sound waves audible to humans, but also convert electrical signals into sound waves inaudible to humans for purposes such as ranging. In some embodiments, the audio circuit 807 may also include a headphone jack.

The positioning assembly 808 is used to locate the current geographic location of the electronic device 800 to implement navigation or LBS (Location Based Service). The positioning assembly 808 may be a positioning assembly based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.

The power supply 809 is used to supply power to the components in the electronic device 800. The power supply 809 may be alternating current, direct current, a disposable battery, or a rechargeable battery. When the power supply 809 includes a rechargeable battery, the rechargeable battery may support wired charging or wireless charging. The rechargeable battery may also be used to support fast charging technology.

In some embodiments, the electronic device 800 further includes one or more sensors 810. The one or more sensors 810 include, but are not limited to, an acceleration sensor 811, a gyroscope sensor 812, a pressure sensor 813, a fingerprint sensor 814, an optical sensor 815, and a proximity sensor 816.

The acceleration sensor 811 can detect the magnitude of acceleration on the three coordinate axes of the coordinate system established by the electronic device 800. For example, the acceleration sensor 811 may be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 801 may control the display screen 805 to display the user interface in landscape view or portrait view according to the gravitational acceleration signal collected by the acceleration sensor 811. The acceleration sensor 811 may also be used to collect motion data of a game or a user.

The gyroscope sensor 812 can detect the body orientation and rotation angle of the electronic device 800, and may cooperate with the acceleration sensor 811 to collect the user's 3D actions on the electronic device 800. Based on the data collected by the gyroscope sensor 812, the processor 801 can implement the following functions: motion sensing (for example, changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.

The pressure sensor 813 may be disposed on the side frame of the electronic device 800 and/or beneath the display screen 805. When the pressure sensor 813 is disposed on the side frame of the electronic device 800, it can detect the user's grip signal on the electronic device 800, and the processor 801 performs left- and right-hand recognition or quick operations based on the grip signal collected by the pressure sensor 813. When the pressure sensor 813 is disposed beneath the display screen 805, the processor 801 controls operable controls on the UI according to the user's pressure operation on the display screen 805. The operable controls include at least one of the group consisting of a button control, a scroll bar control, an icon control, and a menu control.

The fingerprint sensor 814 is used to collect the user's fingerprint. The processor 801 identifies the user's identity based on the fingerprint collected by the fingerprint sensor 814, or the fingerprint sensor 814 identifies the user's identity based on the collected fingerprint. When the user's identity is recognized as a trusted identity, the processor 801 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings. The fingerprint sensor 814 may be disposed on the front, back, or side of the electronic device 800. When the electronic device 800 is provided with a physical button or a manufacturer logo, the fingerprint sensor 814 may be integrated with the physical button or the manufacturer logo.

The optical sensor 815 is used to collect the ambient light intensity. In one embodiment, the processor 801 may control the display brightness of the display screen 805 according to the ambient light intensity collected by the optical sensor 815. Specifically, when the ambient light intensity is high, the display brightness of the display screen 805 is increased; when the ambient light intensity is low, the display brightness of the display screen 805 is decreased. In another embodiment, the processor 801 may also dynamically adjust the shooting parameters of the camera assembly 806 according to the ambient light intensity collected by the optical sensor 815.

The proximity sensor 816, also called a distance sensor, is usually disposed on the front panel of the electronic device 800. The proximity sensor 816 is used to collect the distance between the user and the front of the electronic device 800. In one embodiment, when the proximity sensor 816 detects that the distance between the user and the front of the electronic device 800 gradually decreases, the processor 801 controls the display screen 805 to switch from the screen-on state to the screen-off state; when the proximity sensor 816 detects that the distance between the user and the front of the electronic device 800 gradually increases, the processor 801 controls the display screen 805 to switch from the screen-off state to the screen-on state.

Those skilled in the art can understand that the structure shown in Figure 8 does not constitute a limitation on the electronic device 800, which may include more or fewer components than shown, combine certain components, or adopt a different component arrangement.

An embodiment of the present application provides a computer-readable storage medium. At least one instruction is stored in the computer-readable storage medium, and the instruction is loaded and executed by a processor, so that a computer implements the active defense method against deep forgery provided by any exemplary embodiment of the present application, or the image processing method corresponding to Figure 4.

An embodiment of the present application provides a computer program or computer program product. The computer program or computer program product includes computer instructions which, when executed by a computer, cause the computer to implement the active defense method against deep forgery provided by any exemplary embodiment of the present application, or the image processing method corresponding to Figure 4.

All of the above optional technical solutions may be combined in any manner to form optional embodiments of the present application, which are not described one by one here. Those of ordinary skill in the art can understand that all or part of the steps for implementing the above embodiments may be completed by hardware, or by instructing relevant hardware through a program. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.

The present application has been described above through detailed implementation examples. Researchers and technicians in the field may make non-substantive changes in form or content based on the above steps without departing from the scope of protection of the present application. Therefore, the present application is not limited to the content disclosed in the above embodiments, and the protection scope of the present application shall be subject to the claims.

Claims (10)

1. An image processing method, characterized in that the method comprises:
acquiring a first watermark generated for a deep forgery model, and a second watermark carrying target information;
acquiring a target encoding model, the target encoding model being used for embedding watermarks into images; and
embedding the first watermark and the second watermark into an initial image through the target encoding model to obtain a target image, wherein a difference between the initial image and the target image is less than a reference threshold.

2. The method according to claim 1, characterized in that embedding the first watermark and the second watermark into the initial image through the target encoding model to obtain the target image comprises:
superimposing the first watermark onto the initial image to obtain a first image, inputting the first image and the second watermark into the target encoding model, and obtaining the target image output by the target encoding model;
or, inputting the second watermark and the initial image into the target encoding model to obtain a second image output by the target encoding model, and superimposing the first watermark onto the second image to obtain the target image.

3. The method according to claim 1, characterized in that before embedding the first watermark and the second watermark into the initial image through the target encoding model to obtain the target image, the method further comprises:
acquiring a first sample image, a third watermark carrying first information, and an initial encoding model;
inputting the first sample image into the initial encoding model to obtain a first result image output by the initial encoding model;
inputting the first sample image and the third watermark into the initial encoding model to obtain a second result image output by the initial encoding model, the second result image being embedded with the third watermark;
determining a first loss function based on the first result image and the second result image, the first loss function being used to indicate a difference between the first result image and the second result image; and
updating the initial encoding model through a process of minimizing the first loss function to obtain the target encoding model, wherein the minimized first loss function is less than the reference threshold.
4. The method according to any one of claims 1-3, characterized in that after embedding the first watermark and the second watermark into the initial image through the target encoding model to obtain the target image, the method further comprises:
acquiring an updated image generated based on the target image;
inputting the updated image into a target decoding model to obtain a fourth watermark output by the target decoding model;
determining a bit error between the second watermark and the fourth watermark; and
when the bit error is greater than an error threshold, determining that the updated image is an image obtained by deep-forging the target image through the deep forgery model.

5. The method according to claim 4, characterized in that before inputting the updated image into the target decoding model, the method further comprises:
acquiring a second sample image and an initial decoding model, the second sample image being embedded with a fifth watermark carrying second information;
inputting the second sample image into the initial decoding model to obtain a sixth watermark output by the initial decoding model;
determining a second loss function based on the fifth watermark and the sixth watermark; and
updating the initial decoding model through a process of minimizing the second loss function to obtain the target decoding model.

6. The method according to any one of claims 1-3, characterized in that acquiring the first watermark generated for the deep forgery model comprises:
acquiring a third sample image, a seventh watermark, and the deep forgery model;
inputting the third sample image into the deep forgery model to obtain a third result image output by the deep forgery model;
superimposing the seventh watermark onto the third sample image to obtain a fourth sample image;
inputting the fourth sample image into the deep forgery model to obtain a fourth result image output by the deep forgery model;
determining a third loss function based on the third result image and the fourth result image;
determining gradient information based on the third loss function; and
updating the seventh watermark based on the gradient information to obtain the first watermark generated for the deep forgery model.
7. The method according to claim 6, characterized in that there are multiple deep forgery models and multiple pieces of gradient information, the multiple deep forgery models corresponding to the multiple pieces of gradient information one to one; and
updating the seventh watermark based on the gradient information to obtain the first watermark generated for the deep forgery model comprises:
averaging the multiple pieces of gradient information to obtain target gradient information;
computing the target gradient information with a sign function to obtain a first calculation result;
generating a second calculation result based on the first calculation result, and superimposing the second calculation result onto the fourth sample image to obtain a fifth sample image;
applying upper and lower bound constraints to the fifth sample image to obtain a sixth sample image;
removing the third sample image from the sixth sample image to obtain an eighth watermark; and
computing a weighted average of the seventh watermark and the eighth watermark to obtain the first watermark.

8. An apparatus for adding a watermark, characterized in that the apparatus comprises:
an acquisition module, configured to acquire a first watermark generated for a deep forgery model, and a second watermark carrying target information;
the acquisition module being further configured to acquire a target encoding model, the target encoding model being used for embedding watermarks into images; and
an embedding module, configured to embed the first watermark and the second watermark into an initial image through the target encoding model to obtain a target image, wherein a difference between the initial image and the target image is less than a reference threshold.

9. An electronic device, characterized in that the electronic device comprises a memory and a processor; at least one instruction is stored in the memory, and the at least one instruction is loaded and executed by the processor, so that the electronic device implements the image processing method according to any one of claims 1-7.

10. A computer-readable storage medium, characterized in that at least one instruction is stored in the computer-readable storage medium, and the instruction is loaded and executed by a processor, so that a computer implements the image processing method according to any one of claims 1-7.
PCT/CN2022/144343 2022-07-19 2022-12-30 Image processing method and apparatus, electronic device, and computer-readable storage medium Ceased WO2024016611A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210845845.5 2022-07-19
CN202210845845.5A CN115273247B (en) 2022-07-19 2022-07-19 An active defense method and system against deep fakes

Publications (1)

Publication Number Publication Date
WO2024016611A1 (en)

Family

ID=83767960

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/144343 Ceased WO2024016611A1 (en) 2022-07-19 2022-12-30 Image processing method and apparatus, electronic device, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN115273247B (en)
WO (1) WO2024016611A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115273247B (en) * 2022-07-19 2025-07-15 北京大学 An active defense method and system against deep fakes
CN116127422A (en) * 2022-11-04 2023-05-16 马上消费金融股份有限公司 Image processing method, training method and device of image processing model
CN115631085B (en) * 2022-12-19 2023-04-11 浙江君同智能科技有限责任公司 Active defense method and device for image protection
CN117975578A (en) * 2024-02-05 2024-05-03 北京理工大学 Deep counterfeiting active detection method based on face features and related watermarks
CN119963391B (en) * 2025-02-07 2025-09-26 暨南大学 Deep fake facial image defense method, device, and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112883874B (en) * 2021-02-22 2022-09-06 中国科学技术大学 Active defense method aiming at deep face tampering
CN113362217B (en) * 2021-07-09 2025-04-04 浙江工业大学 A deep learning model poisoning defense method based on model watermarking
CN113689318B (en) * 2021-07-30 2023-07-07 南京信息工程大学 Deep semi-fragile watermarking method for image authentication and anti-sample defense
CN114254276A (en) * 2021-12-22 2022-03-29 支付宝(杭州)信息技术有限公司 Active defense method and device for image protection

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210233204A1 (en) * 2020-01-15 2021-07-29 Digimarc Corporation System for mitigating the problem of deepfake media content using watermarking
CN111768327A (en) * 2020-06-30 2020-10-13 苏州科达科技股份有限公司 Watermark adding and extracting method and device based on deep learning and storage medium
CN114155132A (en) * 2021-12-06 2022-03-08 北京声智科技有限公司 Image processing method, device, equipment and computer readable storage medium
CN115273247A (en) * 2022-07-19 2022-11-01 北京大学 Active defense method and system for deep forgery

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XUE FENG, ZHU XIN-SHAN, TANG ZHI : "Progress of research on digital watermarking techniques applied to multimedia", COMPUTER ENGINEERING AND APPLICATIONS, vol. 43, no. 13, 1 May 2007 (2007-05-01), pages 1 - 7, XP093131737 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118691452A (en) * 2024-04-11 2024-09-24 齐鲁工业大学(山东省科学院) A deepfake face-swapping detection method based on robust identity-aware watermarking
CN118333830A (en) * 2024-06-14 2024-07-12 齐鲁工业大学(山东省科学院) Dual-task cascade active Deepfake detection method based on QPCET watermark
CN120182082A (en) * 2025-05-15 2025-06-20 齐鲁工业大学(山东省科学院) An active deep fake defense method and system based on image texture

Also Published As

Publication number Publication date
CN115273247B (en) 2025-07-15
CN115273247A (en) 2022-11-01

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22951866

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 06.05.2025)