
CN116596813A - Image data enhancement method and device based on image destruction processing - Google Patents

Image data enhancement method and device based on image destruction processing

Info

Publication number
CN116596813A
Authority
CN
China
Prior art keywords
image
processing
diffusion
destruction
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310364614.7A
Other languages
Chinese (zh)
Inventor
暴宇健
汪骞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Longzhi Digital Technology Service Co Ltd
Original Assignee
Beijing Longzhi Digital Technology Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Longzhi Digital Technology Service Co Ltd filed Critical Beijing Longzhi Digital Technology Service Co Ltd
Priority to CN202310364614.7A priority Critical patent/CN116596813A/en
Publication of CN116596813A publication Critical patent/CN116596813A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

The disclosure relates to the technical field of image processing and provides an image data enhancement method and device based on image destruction processing. The method comprises: acquiring an image dataset to be data enhanced; performing destruction processing on a target image in the image dataset multiple times in succession by using the diffusion process of an image diffusion model, to obtain a first destroyed image corresponding to the target image; performing restoration processing on the first destroyed image multiple times in succession by using the back diffusion process of the image diffusion model, to obtain a first restored image corresponding to the target image, wherein each restoration processing corresponds to one destruction processing and is the inverse of that destruction processing; and generating a data-enhanced image dataset using the target image and the first restored image. These technical means solve the prior-art problem that images obtained by conventional data enhancement methods do not match how images change in actual application scenarios.

Description

Image data enhancement method and device based on image destruction processing
Technical Field
The disclosure relates to the technical field of image processing, and in particular relates to an image data enhancement method and device based on image destruction processing.
Background
In the field of computer vision, image data enhancement (data augmentation) is a commonly used technique for enriching training datasets and improving the generalization capability of models. Existing image data enhancement methods typically generate new image data by applying a series of affine transformations to the original image. Common transformations include random rotation, flipping, and cropping. For example, an existing method randomly selects a region of an original image and crops it, randomly rotates, slightly stretches, or flips the cropped image, and adds the transformed image to the training dataset. Such random transformations cannot accurately reproduce the image changes that occur in practice, and cannot effectively simulate visual and environmental changes such as changes in lighting and viewing angle, so the generated images often do not match how images change in actual application scenarios.
In the process of implementing the disclosed concept, the inventors found that the related art has at least the following technical problem: images obtained by conventional data enhancement methods do not match how images change in actual application scenarios.
Disclosure of Invention
In view of the above, the embodiments of the present disclosure provide an image data enhancement method, an apparatus, an electronic device, and a computer-readable storage medium based on image destruction processing, so as to solve the prior-art problem that images obtained by conventional data enhancement methods do not match how images change in actual application scenarios.
In a first aspect of the embodiments of the present disclosure, there is provided an image data enhancement method based on image destruction processing, including: acquiring an image dataset to be data enhanced; performing destruction processing on a target image in the image dataset multiple times in succession by using the diffusion process of an image diffusion model, to obtain a first destroyed image corresponding to the target image; performing restoration processing on the first destroyed image multiple times in succession by using the back diffusion process of the image diffusion model, to obtain a first restored image corresponding to the target image, wherein each restoration processing corresponds to one destruction processing and is the inverse of that destruction processing; and generating a data-enhanced image dataset using the target image and the first restored image.
In a second aspect of the embodiments of the present disclosure, there is provided an image data enhancement apparatus based on image destruction processing, including: an acquisition module configured to acquire an image dataset to be data enhanced; a diffusion module configured to perform destruction processing on a target image in the image dataset multiple times in succession by using the diffusion process of an image diffusion model, to obtain a first destroyed image corresponding to the target image; a back diffusion module configured to perform restoration processing on the first destroyed image multiple times in succession by using the back diffusion process of the image diffusion model, to obtain a first restored image corresponding to the target image, wherein each restoration processing corresponds to one destruction processing and is the inverse of that destruction processing; and an enhancement module configured to generate a data-enhanced image dataset using the target image and the first restored image.
In a third aspect of the disclosed embodiments, an electronic device is provided, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the above method when executing the computer program.
In a fourth aspect of the disclosed embodiments, a computer-readable storage medium is provided, which stores a computer program which, when executed by a processor, implements the steps of the above-described method.
Compared with the prior art, the embodiments of the present disclosure have the following beneficial effects. An image dataset to be data enhanced is acquired; destruction processing is performed on a target image in the image dataset multiple times in succession by using the diffusion process of an image diffusion model, to obtain a first destroyed image corresponding to the target image; restoration processing is performed on the first destroyed image multiple times in succession by using the back diffusion process of the image diffusion model, to obtain a first restored image corresponding to the target image, wherein each restoration processing corresponds to one destruction processing and is the inverse of that destruction processing; and a data-enhanced image dataset is generated using the target image and the first restored image. These technical means solve the prior-art problem that images obtained by conventional data enhancement methods do not match how images change in actual application scenarios, so that the images obtained by data enhancement do match those changes.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings that are required for the embodiments or the description of the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and other drawings may be obtained according to these drawings without inventive effort for a person of ordinary skill in the art.
FIG. 1 is a schematic diagram of an application scenario of an embodiment of the present disclosure;
FIG. 2 is a flowchart of an image data enhancement method based on image destruction processing according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of an image data enhancement device based on image destruction processing according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system configurations, techniques, etc. in order to provide a thorough understanding of the disclosed embodiments. However, it will be apparent to one skilled in the art that the present disclosure may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present disclosure with unnecessary detail.
An image data enhancing method and apparatus based on a destruction process of an image according to an embodiment of the present disclosure will be described in detail with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of an application scenario of an embodiment of the present disclosure. The application scenario may include terminal devices 101, 102, and 103, a server 104, and a network 105.
The terminal devices 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, and 103 are hardware, they may be various electronic devices having a display screen and supporting communication with the server 104, including but not limited to smartphones, tablets, laptop and desktop computers, etc.; when the terminal devices 101, 102, and 103 are software, they may be installed in the electronic device as above. Terminal devices 101, 102, and 103 may be implemented as multiple software or software modules, or as a single software or software module, as embodiments of the present disclosure are not limited in this regard. Further, various applications, such as a data processing application, an instant messaging tool, social platform software, a search class application, a shopping class application, and the like, may be installed on the terminal devices 101, 102, and 103.
The server 104 may be a server that provides various services, for example, a background server that receives a request transmitted from a terminal device with which communication connection is established, and the background server may perform processing such as receiving and analyzing the request transmitted from the terminal device and generate a processing result. The server 104 may be a server, a server cluster formed by a plurality of servers, or a cloud computing service center, which is not limited in the embodiments of the present disclosure.
The server 104 may be hardware or software. When the server 104 is hardware, it may be various electronic devices that provide various services to the terminal devices 101, 102, and 103. When the server 104 is software, it may be a plurality of software or software modules providing various services to the terminal devices 101, 102, and 103, or may be a single software or software module providing various services to the terminal devices 101, 102, and 103, which is not limited by the embodiments of the present disclosure.
The network 105 may be a wired network connected by coaxial cable, twisted pair, or optical fiber, or a wireless network that interconnects communication devices without wiring, for example Bluetooth, near field communication (NFC), or infrared, which is not limited by the embodiments of the present disclosure.
The user can establish a communication connection with the server 104 via the network 105 through the terminal devices 101, 102, and 103 to receive or transmit information or the like. It should be noted that the specific types, numbers and combinations of the terminal devices 101, 102 and 103, the server 104 and the network 105 may be adjusted according to the actual requirements of the application scenario, which is not limited by the embodiment of the present disclosure.
FIG. 2 is a flowchart of an image data enhancement method based on image destruction processing according to an embodiment of the present disclosure. The method of FIG. 2 may be performed by the computer or server of FIG. 1, or by software on the computer or server. As shown in FIG. 2, the image data enhancement method based on image destruction processing includes:
S201, acquiring an image dataset to be data enhanced;
S202, performing destruction processing on a target image in the image dataset multiple times in succession by using the diffusion process of an image diffusion model, to obtain a first destroyed image corresponding to the target image;
S203, performing restoration processing on the first destroyed image multiple times in succession by using the back diffusion process of the image diffusion model, to obtain a first restored image corresponding to the target image, wherein each restoration processing corresponds to one destruction processing and is the inverse of that destruction processing;
S204, generating a data-enhanced image dataset using the target image and the first restored image.
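Steps S201 to S204 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the three NumPy destruction operations, and the `restore_step` callable (a stand-in for the trained model's restoration processing) are all hypothetical, and the list of applied operations is passed explicitly to the back diffusion loop, whereas the disclosure has the model determine it.

```python
import numpy as np

def average_pool(img, k=2):
    """Destruction op: k x k average pooling, upsampled back to the input size."""
    h, w = img.shape
    hc, wc = h - h % k, w - w % k  # largest region divisible by k
    pooled = img[:hc, :wc].reshape(hc // k, k, wc // k, k).mean(axis=(1, 3))
    out = img.copy()
    out[:hc, :wc] = np.repeat(np.repeat(pooled, k, axis=0), k, axis=1)
    return out

def box_blur(img):
    """Destruction op: 3 x 3 box blur with edge padding."""
    h, w = img.shape
    p = np.pad(img, 1, mode="edge")
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

def random_mask(img, rng, frac=0.1):
    """Destruction op: zero out roughly `frac` of the pixels."""
    return img * (rng.random(img.shape) >= frac)

def diffuse(target, steps, rng):
    """S202: apply `steps` randomly chosen destruction ops in succession."""
    ops = [average_pool, box_blur, lambda x: random_mask(x, rng)]
    x, applied = target, []
    for _ in range(steps):
        idx = int(rng.integers(len(ops)))
        x = ops[idx](x)
        applied.append(idx)
    return x, applied

def back_diffuse(destroyed, applied, restore_step):
    """S203: one restoration per destruction, in reverse order.
    `restore_step(x, op_index)` is a hypothetical stand-in for the trained model."""
    x = destroyed
    for idx in reversed(applied):
        x = restore_step(x, idx)
    return x

def enhance(dataset, steps, restore_step, rng):
    """S201 + S204: originals plus one first restored image per target image."""
    enhanced = list(dataset)
    for target in dataset:
        destroyed, applied = diffuse(target, steps, rng)
        enhanced.append(back_diffuse(destroyed, applied, restore_step))
    return enhanced
```

With an identity `restore_step` the sketch simply returns the destroyed image; a trained model would replace that callable.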
The image diffusion model has two processes, a diffusion process and a back diffusion process. The diffusion process performs destruction processing on the target image in the image dataset multiple times in succession to obtain the first destroyed image corresponding to the target image. The back diffusion process determines each destruction processing applied to the target image in the diffusion process and executes the restoration processing corresponding to that destruction processing, to obtain the first restored image corresponding to the target image.
The image dataset comprises a plurality of target images; for ease of understanding, the description considers a single target image. A first destroyed image corresponding to the target image is obtained through the diffusion process, and a first restored image corresponding to the target image is obtained through the back diffusion process. A first restored image is obtained in this way for every target image, and all the target images together with their corresponding first restored images form the data-enhanced image dataset.
Structurally, the diffusion model is mainly a denoising model, which may be a U-net composed of a plurality of convolution layers and deconvolution layers. The embodiments of the present disclosure use a trained image diffusion model for image data enhancement.
According to the technical solution provided by the embodiments of the present disclosure, an image dataset to be data enhanced is acquired; destruction processing is performed on a target image in the image dataset multiple times in succession by using the diffusion process of the image diffusion model, to obtain a first destroyed image corresponding to the target image; restoration processing is performed on the first destroyed image multiple times in succession by using the back diffusion process of the image diffusion model, to obtain a first restored image corresponding to the target image, wherein each restoration processing corresponds to one destruction processing and is the inverse of that destruction processing; and a data-enhanced image dataset is generated using the target image and the first restored image. These technical means solve the prior-art problem that images obtained by conventional data enhancement methods do not match how images change in actual application scenarios, so that the images obtained by data enhancement do match those changes.
The scale of the data-enhanced image dataset is controlled by controlling the number of times the image diffusion model processes the target image, which in turn controls the number of first restored images corresponding to the target image. One pass through the diffusion process followed by the back diffusion process counts as processing the target image once.
Each time the image diffusion model processes the target image, one first restored image corresponding to the target image is obtained; processing the target image multiple times therefore yields a large number of first restored images. Because the destruction processing may differ each time the image diffusion model processes the target image, the resulting first restored images may differ, which is why the image diffusion model can be used for image data enhancement.
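The relationship between the number of passes and the size of the enhanced dataset can be illustrated with a short sketch. The names `enhance_with_passes` and `process_once` are hypothetical, and the `jitter` function below is only a placeholder for one diffusion plus back diffusion pass of a trained model.

```python
import numpy as np

def enhance_with_passes(dataset, process_once, passes):
    """Return the originals plus `passes` restored images per target image.

    `process_once(img)` stands in for one diffusion + back diffusion pass.
    """
    enhanced = list(dataset)
    for target in dataset:
        for _ in range(passes):
            enhanced.append(process_once(target))
    return enhanced

rng = np.random.default_rng(0)

def jitter(img):
    # Placeholder pass: a small random perturbation, NOT a diffusion model.
    return img + 0.01 * rng.standard_normal(img.shape)

dataset = [np.ones((4, 4)) for _ in range(3)]
enhanced = enhance_with_passes(dataset, jitter, passes=5)
print(len(enhanced))  # 3 originals + 3 * 5 restored images = 18
```

Under this reading, a dataset of n target images processed k times each grows to n * (1 + k) images.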
The method further comprises: acquiring a training dataset and training the image diffusion model with the training dataset, so that in the back diffusion process the image diffusion model can determine and execute the restoration processing corresponding to the destruction processing applied to the training images in the diffusion process.
The trained image diffusion model can perform destruction processing on the training images in the training dataset in the diffusion process, and can determine and execute the corresponding restoration processing in the back diffusion process. The diffusion process performs destruction processing on the training images multiple times, and the back diffusion process performs restoration processing multiple times. Each restoration processing corresponds to one destruction processing and is the inverse of that destruction processing.
After acquiring the image dataset to be data enhanced, the method further comprises: performing destruction processing on the target image in the image dataset and adding noise, multiple times in succession, by using the diffusion process of the image diffusion model, to obtain a second destroyed image corresponding to the target image; determining, by using the back diffusion process of the image diffusion model, the destruction processing applied to the target image and the noise added in each step of the diffusion process, and performing the corresponding restoration processing and noise removal on the second destroyed image multiple times in succession, to obtain a second restored image corresponding to the target image; and generating a data-enhanced image dataset using the target image and the second restored image.
Performing one destruction processing followed by one noise addition counts as processing the target image once in the diffusion process; processing the target image multiple times in the diffusion process yields the second destroyed image corresponding to the target image. Likewise, performing one restoration processing and one noise removal counts as processing the image once in the back diffusion process; processing it multiple times in the back diffusion process yields the second restored image corresponding to the target image.
The method further comprises: acquiring a training dataset; performing destruction processing on a training image in the training dataset and adding noise, multiple times in succession, by using the diffusion process of the image diffusion model, to obtain a third destroyed image corresponding to the training image; determining, by using the back diffusion process of the image diffusion model, the destruction processing applied to the training image and the noise added in each step of the diffusion process, and performing the corresponding restoration processing and noise removal on the third destroyed image multiple times in succession, to obtain a third restored image corresponding to the training image; calculating a noise loss between the noises added in the diffusion process and the noises determined in the back diffusion process, and calculating a destruction loss between the destruction processing performed in the diffusion process and the restoration processing performed in the back diffusion process; and updating the model parameters of the image diffusion model according to the noise loss and the destruction loss, to complete the training of the image diffusion model.
The mean square error may be used to calculate the loss between each noise added in the diffusion process and the corresponding noise predicted in the back diffusion process; the sum of all these losses is taken as the noise loss. The cross entropy loss function may be used to calculate the destruction loss between the label of each destruction processing and the restoration processing corresponding to that destruction processing.
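The two losses described above might be computed along the following lines. This is a sketch under assumptions that the text does not state: the model is assumed to output, at each step, a predicted noise array and a vector of logits over the destruction types, and the two losses are assumed to be combined as an unweighted sum. All function names are hypothetical.

```python
import numpy as np

def noise_loss(added, predicted):
    """Sum over steps of the mean squared error between added and predicted noise."""
    return float(sum(np.mean((a - p) ** 2) for a, p in zip(added, predicted)))

def destruction_loss(labels, logits):
    """Mean cross entropy between destruction labels and predicted logits.

    labels: shape (steps,), integer index of the destruction op at each step.
    logits: shape (steps, num_ops), assumed model scores for each op.
    """
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def total_loss(added, predicted, labels, logits):
    """Assumed combination: unweighted sum of the two losses."""
    return noise_loss(added, predicted) + destruction_loss(labels, logits)
```

A weighted sum would be an equally plausible choice; the disclosure only says that the parameters are updated according to both losses.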
The target image is corrupted and noise is added according to the following formula:
$$x_t = F_t(x_0) + \sigma\varepsilon, \quad \varepsilon \sim q(\varepsilon)$$
where $F_t$ is the $t$-th destruction processing, the destruction processing including a pooling operation, a blurring operation, and a masking operation; $x_t$ is the target image after the $t$-th destruction processing and noise addition; when $t$ equals 1, $x_0$ is the target image; when $t$ equals $N$, $x_N$ is the second destroyed image corresponding to the target image; $N$ is a preset number of times; $\varepsilon$ is the noise; $q(\cdot)$ is the target distribution, which includes the Gaussian distribution, the uniform distribution, and the t distribution; $\varepsilon \sim q(\varepsilon)$ means that $\varepsilon$ follows $q(\cdot)$; and $\sigma$ is the variance of $q(\cdot)$.
The destruction process may be a pooling operation, a blurring operation, a masking operation, and the like, and the target distribution may be a gaussian distribution, a uniform distribution, a t-distribution, and the like.
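The formula above can be illustrated with a small NumPy sketch. The concrete choices here are assumptions made for illustration only: `mask_op` is a hypothetical masking operation standing in for $F_t$, the uniform range and the degrees of freedom of the t distribution are arbitrary, and `sigma` is simply passed in as the scale factor that multiplies the noise.

```python
import numpy as np

def sample_noise(shape, rng, dist="gaussian"):
    """Draw epsilon from one of the target distributions named in the text."""
    if dist == "gaussian":
        return rng.standard_normal(shape)
    if dist == "uniform":
        return rng.uniform(-1.0, 1.0, shape)  # range is an arbitrary choice
    if dist == "t":
        return rng.standard_t(df=5, size=shape)  # df=5 is an arbitrary choice
    raise ValueError(f"unknown distribution: {dist}")

def mask_op(x, t, n):
    """Hypothetical F_t: zero out the first t/n fraction of the rows of x."""
    out = x.copy()
    out[: (x.shape[0] * t) // n] = 0.0
    return out

def corrupt(x0, t, n, sigma, rng, dist="gaussian"):
    """Compute x_t = F_t(x_0) + sigma * epsilon with epsilon ~ q."""
    return mask_op(x0, t, n) + sigma * sample_noise(x0.shape, rng, dist)

rng = np.random.default_rng(0)
x0 = np.ones((4, 4))
x_n = corrupt(x0, t=4, n=4, sigma=0.1, rng=rng, dist="t")
```

A pooling or blurring operation could be substituted for `mask_op` without changing the structure of `corrupt`.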
In an alternative embodiment, a loss is calculated with a loss function, and the model parameters of the image diffusion model are updated according to the loss to complete the training of the image diffusion model;
the loss function is as follows:
wherein $G_t$ is the $t$-th restoration processing, $F_t$ is the $t$-th destruction processing, and $G_t$ corresponds to $F_t$; $x_{t-1}$ is the target image after the $(t-1)$-th destruction processing and noise addition; when $t$ equals 1, $x_0$ is the target image; when $t$ equals $N$, $x_N$ is the second destroyed image corresponding to the target image; $N$ is a preset number of times; $\varepsilon$ is the noise and follows the target distribution; $\sigma$ is the variance of the target distribution; $\|\cdot\|_1$ denotes the 1-norm operation; and $T$ is the total number of times destruction processing and noise addition are performed, $T$ being equal in value to $N$.
In this embodiment, the noise loss and the destruction loss are calculated jointly as a single loss.
Any combination of the above optional solutions may be adopted to form an optional embodiment of the present disclosure, and such combinations are not described one by one here.
The following are device embodiments of the present disclosure that may be used to perform method embodiments of the present disclosure. For details not disclosed in the embodiments of the apparatus of the present disclosure, please refer to the embodiments of the method of the present disclosure.
Fig. 3 is a schematic diagram of an image data enhancement apparatus based on a destruction process for an image according to an embodiment of the present disclosure. As shown in fig. 3, the image data enhancement apparatus based on the destruction processing of an image includes:
an acquisition module 301 configured to acquire an image dataset to be data enhanced;
a diffusion module 302 configured to perform destruction processing on a target image in the image dataset multiple times in succession by using the diffusion process of an image diffusion model, to obtain a first destroyed image corresponding to the target image;
a back diffusion module 303 configured to perform restoration processing on the first destroyed image multiple times in succession by using the back diffusion process of the image diffusion model, to obtain a first restored image corresponding to the target image, wherein each restoration processing corresponds to one destruction processing and is the inverse of that destruction processing;
the enhancement module 304 is configured to generate a data-enhanced image dataset using the target image and the first restored image.
Optionally, the enhancement module 304 is further configured to control the scale of the data-enhanced image dataset by controlling the number of times the image diffusion model processes the target image, which in turn controls the number of first restored images corresponding to the target image, wherein one pass through the diffusion process followed by the back diffusion process counts as processing the target image once.
Optionally, the diffusion module 302 is further configured to acquire a training dataset and train the image diffusion model with it, so that, in the inverse diffusion process, the image diffusion model can determine and execute the recovery processing corresponding to the destruction processing that the diffusion process applied to the training images in the training dataset.
The trained image diffusion model can perform destruction processing on the training images in the training dataset during the diffusion process, and can determine and execute the corresponding recovery processing during the inverse diffusion process. The diffusion process performs destruction processing on the training images multiple times, and the inverse diffusion process performs recovery processing multiple times. Each recovery processing corresponds to one destruction processing and is the inverse of that destruction processing.
Optionally, the diffusion module 302 is further configured to: perform destruction processing and noise addition on the target image in the image dataset multiple times in succession by using the diffusion process of the image diffusion model, so as to obtain a second destruction image corresponding to the target image; determine, by using the inverse diffusion process of the image diffusion model, the destruction processing applied to the target image and the noise added at each step of the diffusion process, and perform the recovery processing corresponding to that destruction processing, together with noise removal, on the second destruction image multiple times in succession, so as to obtain a second restored image corresponding to the target image; and generate the data-enhanced image dataset by using the target image and the second restored image.
Performing one destruction processing followed by one noise addition on the target image counts as one step of the diffusion process; the diffusion process applies multiple such steps to the target image to obtain the second destruction image corresponding to the target image. Likewise, performing one recovery processing and one noise removal counts as one step of the inverse diffusion process; the inverse diffusion process applies multiple such steps to the second destruction image to obtain the second restored image corresponding to the target image.
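The step-by-step pairing of the two processes can be sketched with a toy example. Everything named here (`diffuse`, `inverse_diffuse`, the `steps` list) is hypothetical, and two strong simplifications are assumed: the destruction operations are exactly invertible, and the inverse process reuses the recorded noise and operations, whereas a trained model would have to predict them.

```python
import numpy as np

def diffuse(x0, steps, sigma, rng):
    """Each step: one destruction processing, then one noise addition."""
    x, history = x0.astype(float), []
    for destroy, restore in steps:
        eps = rng.standard_normal(x.shape)
        x = destroy(x) + sigma * eps
        history.append((restore, eps))  # what a trained model would have to infer
    return x, history

def inverse_diffuse(x_n, history, sigma):
    """Each step: remove the noise, then apply the matching recovery processing."""
    x = x_n
    for restore, eps in reversed(history):  # undo the last step first
        x = restore(x - sigma * eps)
    return x

# Toy (destroy, restore) pairs, chosen only because they invert exactly.
steps = [
    (lambda x: 0.5 * x, lambda x: 2.0 * x),  # halve contrast / double it
    (lambda x: x[::-1], lambda x: x[::-1]),  # vertical flip / flip back
    (lambda x: x - 0.1, lambda x: x + 0.1),  # darken / brighten
]

rng = np.random.default_rng(0)
x0 = rng.random((4, 4))
x_n, history = diffuse(x0, steps, sigma=0.05, rng=rng)
restored = inverse_diffuse(x_n, history, sigma=0.05)
print(np.allclose(restored, x0))  # True in this idealised setting
```

In this idealised setting the restored image equals the original; with a learned inverse process and non-invertible operations such as blurring or masking, the restored image would generally differ from the original, which is what makes it useful as an additional sample.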
Optionally, the diffusion module 302 is further configured to: acquire a training dataset; perform destruction processing and noise addition on a training image in the training dataset multiple times in succession by using the diffusion process of the image diffusion model, so as to obtain a third destruction image corresponding to the training image; determine, by using the inverse diffusion process of the image diffusion model, the destruction processing applied to the training image and the noise added at each step of the diffusion process, and perform the recovery processing corresponding to that destruction processing, together with noise removal, on the third destruction image multiple times in succession, so as to obtain a third restored image corresponding to the training image; calculate a noise loss between the noises added in the diffusion process and the noises determined in the inverse diffusion process, and calculate a damage loss between the destruction processing performed in the diffusion process and the recovery processing performed in the inverse diffusion process; and update the model parameters of the image diffusion model according to the noise loss and the damage loss, so as to complete the training of the image diffusion model.
The mean square error may be used to calculate the loss between each noise added in the diffusion process and the corresponding noise predicted in the inverse diffusion process; the sum of all these losses is taken as the noise loss. A cross-entropy loss function may be used to calculate the damage loss between the label of each destruction processing and the recovery processing corresponding to that destruction processing.
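A minimal sketch of the two losses, under stated assumptions: the function names are hypothetical, the model is assumed to output one vector of logits per step over the possible destruction types, and the cross entropy is averaged over steps (the text does not say whether it is summed or averaged).

```python
import numpy as np

def noise_loss(added, predicted):
    """Sum over steps of the MSE between added noise and predicted noise."""
    return sum(float(np.mean((a - p) ** 2)) for a, p in zip(added, predicted))

def damage_loss(labels, logits):
    """Cross entropy between destruction-type labels and predicted logits.

    Assumption: logits has shape (steps, num_destruction_types) and the
    result is the mean over steps.
    """
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

rng = np.random.default_rng(0)
added = [rng.standard_normal((4, 4)) for _ in range(3)]
predicted = [a + 0.1 for a in added]       # every prediction is off by 0.1
labels = [0, 2, 1]                         # e.g. 0 = pooling, 1 = blurring, 2 = masking
logits = np.zeros((3, 3))                  # an untrained, uniform prediction

total = noise_loss(added, predicted) + damage_loss(labels, logits)
```

Here the noise loss is approximately 3 * 0.1² = 0.03 and the damage loss is ln 3, the cross entropy of a uniform guess over three classes. Adding the two terms with equal weight is also an assumption of this sketch.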
Optionally, the diffusion module 302 is further configured to destroy the target image and add noise according to the following formula:
$$x_t = F_t(x_0) + \sigma\varepsilon, \quad \varepsilon \sim q(\varepsilon)$$
wherein $F_t$ is the t-th destruction processing, the destruction processing including a pooling operation, a blurring operation, and a masking operation; $x_t$ is the target image after the t-th destruction processing and noise addition; when t is equal to 1, $x_0$ is the target image; when t is equal to N, $x_N$ is the second destruction image corresponding to the target image; N is a preset number; $\varepsilon$ is noise; $q(\cdot)$ is the target distribution, the target distribution including a Gaussian distribution, a uniform distribution, and a t-distribution; $\varepsilon \sim q(\varepsilon)$ indicates that $\varepsilon$ follows $q(\cdot)$; and $\sigma$ is the variance of $q(\cdot)$.
The destruction processing may be a pooling operation, a blurring operation, a masking operation, and the like, and the target distribution may be a Gaussian distribution, a uniform distribution, a t-distribution, and the like.
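The destruction-plus-noise formula can be illustrated with NumPy. This is a sketch under assumptions, not the disclosed implementation: all function names are hypothetical, the t-th destruction processing is taken to be t applications of one chosen operation to the original image, and the target distribution is taken to be Gaussian.

```python
import numpy as np

def avg_pool(img, k=2):
    """Average-pool k x k blocks, then repeat values so the shape is unchanged."""
    h, w = img.shape
    assert h % k == 0 and w % k == 0, "toy version needs sides divisible by k"
    pooled = img.reshape(h // k, k, w // k, k).mean(axis=(1, 3))
    return np.repeat(np.repeat(pooled, k, axis=0), k, axis=1)

def box_blur(img, radius=1):
    """Mean filter over a (2*radius+1)^2 window with edge padding."""
    padded = np.pad(img, radius, mode="edge")
    size = 2 * radius + 1
    out = np.zeros(img.shape, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / size**2

def mask_patch(img, rng, size=2):
    """Zero out one randomly placed size x size square."""
    out = img.astype(float).copy()
    y = rng.integers(0, img.shape[0] - size + 1)
    x = rng.integers(0, img.shape[1] - size + 1)
    out[y:y + size, x:x + size] = 0.0
    return out

def destroy_and_add_noise(x0, t, sigma, rng, op=box_blur):
    """x_t = F_t(x_0) + sigma * eps, with F_t assumed to be t applications of `op`."""
    f = x0.astype(float)
    for _ in range(t):
        f = op(f)
    eps = rng.standard_normal(x0.shape)  # Gaussian q; other distributions are possible
    return f + sigma * eps, eps

rng = np.random.default_rng(0)
x0 = rng.random((8, 8))
x3, eps = destroy_and_add_noise(x0, t=3, sigma=0.05, rng=rng)
```

To use masking as the operation, `mask_patch` can be wrapped so that it takes a single argument, for example `op=lambda im: mask_patch(im, rng)`.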
Optionally, the diffusion module 302 is further configured to calculate a loss using the loss function, and update model parameters of the image diffusion model according to the loss to complete training of the image diffusion model;
the loss function is as follows:
wherein $G_t$ is the t-th recovery processing, $F_t$ is the t-th destruction processing, $G_t$ corresponds to $F_t$, $x_{t-1}$ is the target image after the (t-1)-th destruction processing and noise addition; when t is equal to 1, $x_0$ is the target image; when t is equal to N, $x_N$ is the second destruction image corresponding to the target image; N is a preset number; $\varepsilon$ is noise that follows the target distribution; $\sigma$ is the variance of the target distribution; $\|\cdot\|_1$ denotes the 1-norm operation; T is the total number of times destruction processing and noise addition are performed; and T is equal in value to N.
In this embodiment of the disclosure, the noise loss and the damage loss are calculated together as a single loss.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and the sequence numbers should not constitute any limitation on the implementation of the embodiments of the disclosure.
Fig. 4 is a schematic diagram of an electronic device 4 provided by an embodiment of the present disclosure. As shown in fig. 4, the electronic apparatus 4 of this embodiment includes: a processor 401, a memory 402 and a computer program 403 stored in the memory 402 and executable on the processor 401. The steps of the various method embodiments described above are implemented by processor 401 when executing computer program 403. Alternatively, the processor 401, when executing the computer program 403, performs the functions of the modules/units in the above-described apparatus embodiments.
The electronic device 4 may be a desktop computer, a notebook computer, a palmtop computer, a cloud server, or the like. The electronic device 4 may include, but is not limited to, the processor 401 and the memory 402. Those skilled in the art will appreciate that fig. 4 is merely an example of the electronic device 4 and does not limit it; the electronic device 4 may include more or fewer components than shown, or different components.
The processor 401 may be a central processing unit (Central Processing Unit, CPU) or other general purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like.
The memory 402 may be an internal storage unit of the electronic device 4, for example, a hard disk or a memory of the electronic device 4. The memory 402 may also be an external storage device of the electronic device 4, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card (Flash Card) or the like, which are provided on the electronic device 4. Memory 402 may also include both internal storage units and external storage devices of electronic device 4. The memory 402 is used to store computer programs and other programs and data required by the electronic device.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on this understanding, all or part of the flow of the methods of the above-described embodiments may be accomplished by a computer program instructing related hardware; the computer program may be stored in a computer readable storage medium and, when executed by a processor, may implement the steps of the method embodiments described above. The computer program may comprise computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunications signals.
The above embodiments are merely for illustrating the technical solution of the present disclosure, and are not limiting thereof; although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the disclosure, and are intended to be included in the scope of the present disclosure.

Claims (10)

1. An image data enhancement method based on a destruction process for an image, comprising:
acquiring an image data set to be data enhanced;
performing continuous multiple times of destruction processing on a target image in the image data set by using a diffusion process of an image diffusion model to obtain a first destruction image corresponding to the target image;
performing recovery processing on the first destruction image multiple times in succession by using an inverse diffusion process of the image diffusion model to obtain a first restored image corresponding to the target image, wherein each recovery processing corresponds to one destruction processing and is the inverse of that destruction processing;
and generating the data-enhanced image data set by utilizing the target image and the first restored image.
2. The method according to claim 1, characterized in that it comprises:
controlling the scale of the data-enhanced image dataset by controlling the number of times the target image is processed with the image diffusion model and thereby the number of first restored images corresponding to the target image, wherein one pass of the image diffusion model through the diffusion process and the inverse diffusion process is counted as processing the target image once.
3. The method according to claim 1, wherein before performing the destruction processing on the target image in the image dataset multiple times in succession by using the diffusion process of the image diffusion model to obtain the first destruction image corresponding to the target image, the method further comprises:
and acquiring a training data set, and training the image diffusion model by using the training data set, so that the image diffusion model can determine and execute the restoration process corresponding to the destruction process of the training image in the training data set by the diffusion process in the inverse diffusion process.
4. The method of claim 1, wherein after the acquiring the image dataset to be data enhanced, the method further comprises:
performing the destruction processing and noise adding on the target image in the image data set continuously for a plurality of times by using a diffusion process of an image diffusion model to obtain a second destruction image corresponding to the target image;
determining the destruction processing and the added noise of the target image each time in the diffusion process by using an inverse diffusion process of the image diffusion model, and continuously performing the recovery processing and the noise removal corresponding to the destruction processing on the second destruction image for a plurality of times to obtain a second recovery image corresponding to the target image;
and generating the data-enhanced image data set by using the target image and the second restored image.
5. The method of claim 4, wherein before performing the destruction processing and noise addition on the target image in the image dataset multiple times in succession by using the diffusion process of the image diffusion model to obtain the second destruction image corresponding to the target image, the method further comprises:
acquiring a training data set;
carrying out the destruction processing and noise adding on the training images in the training data set continuously for a plurality of times by utilizing the diffusion process of the image diffusion model to obtain a third destruction image corresponding to the training images;
determining the destruction processing and the added noise of the training image in each time in the diffusion process by using an inverse diffusion process of the image diffusion model, and continuously performing the recovery processing and the noise removal corresponding to the destruction processing on the third destruction image for a plurality of times to obtain a third recovery image corresponding to the training image;
calculating a noise loss between the plurality of noises added in the diffusion process and the plurality of noises determined in the inverse diffusion process, and calculating a damage loss between the destruction processing performed in the diffusion process and the recovery processing performed in the inverse diffusion process;
and updating model parameters of the image diffusion model according to the noise loss and the damage loss so as to complete training of the image diffusion model.
6. The method of claim 4, wherein the destruction process and noise addition are performed on the target image according to the following formula:
$$x_t = F_t(x_0) + \sigma\varepsilon, \quad \varepsilon \sim q(\varepsilon)$$
wherein $F_t$ is the t-th destruction processing, the destruction processing including a pooling operation, a blurring operation, and a masking operation; $x_t$ is the target image after the t-th destruction processing and noise addition; when t is equal to 1, $x_0$ is the target image; when t is equal to N, $x_N$ is the second destruction image corresponding to the target image; N is a preset number; $\varepsilon$ is noise; $q(\cdot)$ is the target distribution, the target distribution including a Gaussian distribution, a uniform distribution, and a t-distribution; $\varepsilon \sim q(\varepsilon)$ indicates that $\varepsilon$ follows $q(\cdot)$; and $\sigma$ is the variance of $q(\cdot)$.
7. The method according to claim 5, comprising:
calculating loss by using a loss function, and updating model parameters of the image diffusion model according to the loss so as to complete training of the image diffusion model;
the loss function is as follows:
wherein $G_t$ is the t-th recovery processing, $F_t$ is the t-th destruction processing, $G_t$ corresponds to $F_t$, $x_{t-1}$ is the target image after the (t-1)-th destruction processing and noise addition; when t is equal to 1, $x_0$ is the target image; when t is equal to N, $x_N$ is the second destruction image corresponding to the target image; N is a preset number; $\varepsilon$ is noise that follows the target distribution; $\sigma$ is the variance of the target distribution; $\|\cdot\|_1$ denotes the 1-norm operation; T is the total number of times the destruction processing and noise addition are performed; and T is equal in value to N.
8. An image data enhancement apparatus based on a destruction process of an image, comprising:
an acquisition module configured to acquire an image dataset to be data enhanced;
a diffusion module configured to perform destruction processing on a target image in the image dataset multiple times in succession by using a diffusion process of an image diffusion model to obtain a first destruction image corresponding to the target image;
an inverse diffusion module configured to perform recovery processing on the first destruction image multiple times in succession by using an inverse diffusion process of the image diffusion model to obtain a first restored image corresponding to the target image, wherein each recovery processing corresponds to one destruction processing and is the inverse of that destruction processing;
an enhancement module configured to generate the data-enhanced image dataset using the target image and the first restored image.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1 to 7.
CN202310364614.7A 2023-04-07 2023-04-07 Image data enhancement method and device based on image destruction processing Pending CN116596813A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310364614.7A CN116596813A (en) 2023-04-07 2023-04-07 Image data enhancement method and device based on image destruction processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310364614.7A CN116596813A (en) 2023-04-07 2023-04-07 Image data enhancement method and device based on image destruction processing

Publications (1)

Publication Number Publication Date
CN116596813A true CN116596813A (en) 2023-08-15

Family

ID=87588806

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310364614.7A Pending CN116596813A (en) 2023-04-07 2023-04-07 Image data enhancement method and device based on image destruction processing

Country Status (1)

Country Link
CN (1) CN116596813A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20250061548A1 (en) * 2023-08-18 2025-02-20 Adobe Inc. Hybrid sampling for diffusion models

Similar Documents

Publication Publication Date Title
US11978245B2 (en) Method and apparatus for generating image
US10915980B2 (en) Method and apparatus for adding digital watermark to video
CN108038469B (en) Method and apparatus for detecting human body
CN107633218B (en) Method and apparatus for generating images
CN109829432B (en) Method and apparatus for generating information
CN116385328A (en) Image data enhancement method and device based on noise addition to image
CN109766925B (en) Feature fusion method and device, electronic equipment and storage medium
CN111666994A (en) Sample image data enhancement method and device, electronic equipment and storage medium
CN113066034B (en) Face image restoration method and device, restoration model, medium and equipment
CN108765340B (en) Blurred image processing method, device and terminal device
CN109118456B (en) Image processing method and device
CN109255337B (en) Face key point detection method and device
CN111783731B (en) Method and device for extracting video features
CN108595211B (en) Method and apparatus for outputting data
US20210200971A1 (en) Image processing method and apparatus
CN110046622B (en) Targeted attack sample generation method, device, equipment and storage medium
CN117894038A (en) Method and device for generating object gesture in image
CN110288625B (en) Method and apparatus for processing image
CN112330788A (en) Image processing method, image processing device, readable medium and electronic equipment
CN112419179B (en) Method, apparatus, device and computer readable medium for repairing image
CN117671254A (en) Image segmentation method and device
CN113516697A (en) Method, apparatus, electronic device, and computer-readable storage medium for image registration
CN116596813A (en) Image data enhancement method and device based on image destruction processing
CN111612715B (en) Image restoration method and device and electronic equipment
CN115222948A (en) Image classification method, device, server and system based on quantum kernel method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination