
CN110874819A - Video image restoration method, device and storage medium - Google Patents


Info

Publication number
CN110874819A
CN110874819A (application CN201810991291.3A; granted as CN110874819B)
Authority
CN
China
Prior art keywords
repaired
pixel
image
value
model
Prior art date
Legal status (assumed; Google has not performed a legal analysis)
Granted
Application number
CN201810991291.3A
Other languages
Chinese (zh)
Other versions
CN110874819B (en)
Inventor
陈步华
陈戈
梁洁
庄一嵘
薛沛林
余媛
陈麒
Current Assignee (the listed assignees may be inaccurate; Google has not performed a legal analysis)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (assumed; not a legal conclusion)
Filing date
Publication date
Application filed by China Telecom Corp Ltd filed Critical China Telecom Corp Ltd
Priority to CN201810991291.3A
Publication of CN110874819A
Application granted
Publication of CN110874819B
Legal status: Active
Anticipated expiration

Classifications

    • G — PHYSICS
    • G06 — COMPUTING OR CALCULATING; COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 — Image enhancement or restoration
    • G06T5/77 — Retouching; Inpainting; Scratch removal
    • G06T5/70 — Denoising; Smoothing
    • G06T2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T2207/10 — Image acquisition modality
    • G06T2207/10016 — Video; Image sequence

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention provides a video image restoration method, a video image restoration device and a storage medium. The method comprises: obtaining the smooth blocks and edge blocks to be repaired in a received video frame; performing interpolation restoration on the smooth blocks; training an adversarial network model with the image to be repaired in the edge blocks; and repairing the image to be repaired in the edge blocks with the trained adversarial network model, generating a repaired video frame. The method, device and storage medium can reduce the data processing burden of the sending end and the transmission pressure of the channel, avoid transmission delay, greatly reduce the complexity of model training, guarantee the restoration effect, and improve restoration efficiency.

Description

Video image restoration method, device and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method and an apparatus for repairing a video image, and a storage medium.
Background
Against the background of the rapid development of the Internet and video services, where 4K/8K services cause the data volume to expand rapidly, higher requirements are placed on the video-quality reliability of the CDN. In transmission in existing IPTV systems, once a bit error or packet loss occurs, the video quality degrades rapidly (for example, the screen breaks up into artifacts), affecting the user's viewing experience. Such problems are usually repaired with error control techniques such as FEC (Forward Error Correction) and ARQ (Automatic Repeat reQuest), or with error concealment techniques. For example, under the FEC principle the sender adds redundant data to the transmission, which occupies extra bandwidth; under the ARQ principle the receiving end sends a packet-loss retransmission request to a retransmission server, which introduces extra transmission delay and increases channel transmission pressure, and in a poor channel video transmission errors still cannot be avoided. Error concealment techniques usually conceal erroneous data with an improved interpolation method, but this affects the restoration of the overall image's characteristic regularities, and the restoration of content with drastic pixel changes is poor.
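For context on the FEC trade-off mentioned above, the sketch below shows the simplest form of forward error correction: a single XOR parity packet lets the receiver rebuild any one lost packet of a group without retransmission, at the cost of one extra packet of bandwidth per group. This is illustrative background only, not the method of this disclosure:

```python
from functools import reduce

def xor_parity(packets):
    """Parity packet: byte-wise XOR of all data packets (single-loss FEC)."""
    return bytes(reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), packets))

def recover(received, parity):
    """Rebuild the one missing packet from the survivors plus the parity."""
    return xor_parity(received + [parity])

data = [b"abcd", b"efgh", b"ijkl"]
parity = xor_parity(data)
# Packet 1 is lost in transit; XOR of the survivors with the parity restores it.
assert recover([data[0], data[2]], parity) == data[1]
```

Real FEC schemes (e.g. Reed-Solomon) tolerate multiple losses, but the bandwidth overhead the Background criticizes is already visible here: four packets are sent to protect three.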
Disclosure of Invention
One or more embodiments of the present invention provide a video image restoration method, apparatus, and storage medium.
According to an aspect of the present disclosure, there is provided a video image restoration method including: judging whether a received video frame has data loss based on a preset detection rule; if yes, dividing the video frame into a plurality of non-overlapping sub-blocks and determining the sub-blocks that need to be repaired, the sub-blocks that need to be repaired comprising smooth blocks and edge blocks; performing interpolation restoration on the smooth block, determining the pixel value of a pixel point to be repaired according to the pixel values of the pixel points adjacent to it in the smooth block; training an adversarial network model with the image to be repaired in the edge block, and repairing the image to be repaired in the edge block with the trained adversarial network model; and generating a repaired video frame from the repaired smooth blocks and edge blocks.
Optionally, the determining whether there is data loss in the received video frame based on a preset detection rule includes: acquiring a first hash value corresponding to the video frame from the video frame; calculating a second hash value of the video frame; and comparing the first hash value with the second hash value, and if the first hash value is different from the second hash value, determining that the video frame has data loss.
Optionally, the dividing of the video frame into a plurality of non-overlapping sub-blocks and the determining of the sub-blocks that need to be repaired includes: dividing the video frame into N non-overlapping m × m sub-blocks, where m × m is the number of pixel points per sub-block; obtaining the average of the pixel values of the pixels in a sub-block and calculating the pixel mean square error of the sub-block based on that average; determining the sub-block to be an edge block if the pixel mean square error is greater than or equal to a pixel mean square error threshold; and determining the sub-block to be a smooth block if the pixel mean square error is smaller than the threshold.
Optionally, the determining of the pixel value of the pixel point to be repaired according to the pixel values of the adjacent pixel points in the smooth block includes: obtaining the average pixel value of a plurality of pixel points adjacent to the pixel point to be repaired in the smooth block; and replacing the pixel value of the pixel to be repaired with that average pixel value.
Optionally, the image after interpolation restoration of the smooth block is

g(x, y) = (1/M) · Σ_{(m,n)∈S} f(m, n)
wherein g(x, y) is the gray value of a first pixel point with coordinate (x, y), and f(m, n) is the gray value of a pixel point adjacent to the first pixel point with coordinate (m, n); S is the set of pixels adjacent to the first pixel, excluding the point (x, y); M is the total number of pixels in the set.
Optionally, the training of an adversarial network model with the image to be repaired in the edge block, and the repairing of that image with the trained model, includes: constructing a Wasserstein generative adversarial network (WGAN) model, the WGAN model comprising a generator and a discriminator; setting a model parameter θ of the WGAN model, inputting the image to be repaired in the edge block to the WGAN model, and training the WGAN model; in the iterative process, recording the pixel mean square error between the last image output by the WGAN model and the current output image as X, and dynamically adjusting the value of θ; if X is smaller than a preset threshold, determining that the WGAN model training is finished; and repairing the image to be repaired in the edge block with the trained WGAN model and outputting the repaired edge block.
Optionally, the solving formula of the model parameter θ is:

θ* = argmin_θ E(f_θ(z); x0)
wherein the optimal value θ* of θ is obtained through training, and z represents a random code vector; argmin_θ E(·) refers to the WGAN model parameter θ that minimizes the function E(·);
let

x* = f_{θ*}(z).
Initialize a WGAN model function f with a random model parameter θ, let the input of f be a fixed random code z and the picture x0 to be repaired, and train the parameters of the model f; if the pixel mean square error between the last image output by the WGAN model and the current output image is smaller than a preset threshold, it is determined that f can output the repaired image x.
According to another aspect of the present disclosure, there is provided a video image restoration apparatus including: a video image detection module for judging whether a received video frame has data loss based on a preset detection rule; a repair area discrimination module for, if yes, dividing the video frame into a plurality of non-overlapping sub-blocks and determining the sub-blocks that need to be repaired, the sub-blocks that need to be repaired comprising smooth blocks and edge blocks; a first image restoration module for performing interpolation restoration on the smooth block, determining the pixel value of a pixel point to be repaired according to the pixel values of the adjacent pixel points in the smooth block; a second image restoration module for training an adversarial network model with the image to be repaired in the edge block and repairing that image with the trained model; and a repair image generation module for generating a repaired video frame from the repaired smooth blocks and edge blocks.
Optionally, the video image detection module is configured to obtain a first hash value corresponding to the video frame from the video frame; calculating a second hash value of the video frame; and comparing the first hash value with the second hash value, and if the first hash value is different from the second hash value, determining that the video frame has data loss.
Optionally, the repair area determining module is configured to divide the video frame into N m × m sub-blocks that do not overlap with each other, where m × m is the number of pixels; obtaining the average value of the pixel values of the pixels in the sub-block, and calculating the mean square error of the pixels corresponding to the pixels in the sub-block based on the average value; determining the sub-block to be an edge block if the pixel mean square error is greater than or equal to a pixel mean square error threshold; and if the pixel mean square error is smaller than the pixel mean square error threshold value, determining the sub-block to be a smooth block.
Optionally, the first image restoration module is configured to obtain the average pixel value of a plurality of pixel points adjacent to the pixel point to be repaired in the smooth block, and to replace the pixel value of the pixel to be repaired with that average pixel value.
Optionally, the first image restoration module is configured to restore the smooth block by interpolation as

g(x, y) = (1/M) · Σ_{(m,n)∈S} f(m, n)
wherein g(x, y) is the gray value of a first pixel point with coordinate (x, y), and f(m, n) is the gray value of a pixel point adjacent to the first pixel point with coordinate (m, n); S is the set of pixels adjacent to the first pixel, excluding the point (x, y); M is the total number of pixels in the set.
Optionally, the second image restoration module is configured to construct a Wasserstein generative adversarial network (WGAN) model, the WGAN model comprising a generator and a discriminator; to set a model parameter θ of the WGAN model, input the image to be repaired in the edge block to the WGAN model, and train the WGAN model; in the iterative process, to record the pixel mean square error between the last image output by the WGAN model and the current output image as X and dynamically adjust the value of θ; if X is smaller than a preset threshold, to determine that the WGAN model training is finished; and to repair the image to be repaired in the edge block with the trained WGAN model and output the repaired edge block.
Optionally, the solving formula of the model parameter θ is:

θ* = argmin_θ E(f_θ(z); x0)
wherein the optimal value θ* of θ is obtained through training, and z represents a random code vector; argmin_θ E(·) refers to the WGAN model parameter θ that minimizes the function E(·);
the second image restoration module is configured to set

x* = f_{θ*}(z).
Initialize a WGAN model function f with a random model parameter θ, let the input of f be a fixed random code z and the picture x0 to be repaired, and train the parameters of the model f; if the pixel mean square error between the last image output by the WGAN model and the current output image is smaller than a preset threshold, it is determined that f can output the repaired image x.
According to still another aspect of the present disclosure, there is provided a video image restoration apparatus including: a memory; and a processor coupled to the memory, the processor configured to perform the method as described above based on instructions stored in the memory.
According to yet another aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by one or more processors, implement the steps of the method as described above.
The video image restoration method, device and storage medium obtain the smooth blocks and edge blocks to be repaired in a received video frame, perform interpolation restoration on the smooth blocks, train an adversarial network model with the image to be repaired in the edge blocks, and repair that image with the trained model to generate a repaired video frame. The method can reduce the data processing burden of the sending end and the transmission pressure of the channel, avoid transmission delay, greatly reduce the complexity of model training, guarantee the restoration effect, and improve restoration efficiency.
Drawings
To illustrate the embodiments of the present disclosure or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present disclosure, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart diagram illustrating one embodiment of a video image restoration method according to the present disclosure;
FIG. 2 is a schematic flow chart illustrating an inspection of a video frame in an embodiment of a video image restoration method according to the present disclosure;
FIG. 3A is a schematic diagram of a GAN model; FIG. 3B is an original video frame, FIG. 3C is a received video frame with missing data, and FIG. 3D is a video frame after being restored according to one embodiment of the video image restoration method of the present disclosure;
FIG. 4 is a block diagram representation of one embodiment of a video image restoration device according to the present disclosure;
fig. 5 is a block diagram of another embodiment of a video image restoration apparatus according to the present disclosure.
Detailed Description
The present disclosure now will be described more fully hereinafter with reference to the accompanying drawings, in which exemplary embodiments of the disclosure are shown. The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
The terms "first", "second", and the like are used hereinafter only for descriptive distinction and not for other specific meanings.
Fig. 1 is a schematic flow chart of an embodiment of a video image restoration method according to the present disclosure, as shown in fig. 1:
Step 101: judge whether the received video frame has data loss based on a preset detection rule. When the receiving end decodes, it determines whether each frame in the video has data loss; the error position and the lost information can also be determined.
Step 102: if yes, divide the video frame into a plurality of non-overlapping sub-blocks and determine the sub-blocks that need to be repaired, including smooth blocks and edge blocks. Smooth blocks are sub-blocks in which the pixel values of the video frame image change little; edge blocks are sub-blocks in which the pixel values change sharply.
Step 103: perform interpolation restoration on the smooth block, determining the pixel value of the pixel point to be repaired according to the pixel values of the adjacent pixel points in the smooth block. The intra-frame smooth blocks can be repaired with a spatial-domain interpolation method, and the video frame repaired by spatial-domain interpolation is output.
Step 104: train an adversarial network model with the image to be repaired in the edge block, and repair the image to be repaired in the edge block with the trained adversarial network model.
The edge blocks in the video frames can be restored with a deep self-similarity algorithm: based on deep learning, the image to be restored and the structure of the generative adversarial network serve as prior supervision information, and the edge blocks of the image to be restored are repaired by the deep self-similarity algorithm.
Step 105: generate a repaired video frame from the repaired smooth blocks and edge blocks.
The video image restoration method in this embodiment provides an intelligent restoration method for an erroneous video stream that can restore an erroneous video image. It is deployed at the video receiving end and uses only the image to be restored as input; on the basis of spatial interpolation restoration, it further restores the erroneous video image with an improved generative-adversarial-network solving method. Training is simple, the complexity of model solving is greatly reduced, and higher-quality image restoration can be obtained.
In one embodiment, there may be multiple detection rules. The difference between the received video code stream and the original video can be verified with a digital signature: if the digital signature of the corresponding frame differs, the video frame is considered to have a data error. For example, a first hash value corresponding to a video frame is obtained from the video frame, and a second hash value of the video frame is calculated, the two hash values being computed with the same method. The first and second hash values are compared, and if they differ, it is determined that the video frame has data loss.
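A minimal sketch of this hash comparison, assuming MD5 as the (unspecified) hash function and a hash value carried alongside each frame payload:

```python
import hashlib

def frame_hash(payload: bytes) -> str:
    """Hash of the frame payload (MD5 is an illustrative choice; the
    disclosure does not fix a particular hash function)."""
    return hashlib.md5(payload).hexdigest()

def has_data_loss(received_payload: bytes, embedded_hash: str) -> bool:
    """Compare the hash carried with the frame against one recomputed locally."""
    return frame_hash(received_payload) != embedded_hash

# A frame that arrives intact matches its embedded hash...
original = b"\x00\x01\x02\x03" * 100
h = frame_hash(original)
assert not has_data_loss(original, h)
# ...while a frame with a single corrupted byte does not.
corrupted = b"\xff" + original[1:]
assert has_data_loss(corrupted, h)
```

In a real system the hash would be computed by the sender and transported with the frame (e.g. in a side channel or container metadata), which this sketch leaves out.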
The video frame is divided into N non-overlapping m × m sub-blocks, where m × m is the number of pixel points per sub-block. The average of the pixel values of the pixel points in each sub-block is obtained, and the pixel mean square error of the sub-block is calculated based on that average. If the pixel mean square error is greater than or equal to the pixel mean square error threshold, the sub-block is determined to be an edge block; if it is smaller than the threshold, the sub-block is determined to be a smooth block.
For example, if the error detection mechanism detects that a video frame is corrupted, the video frame to be repaired is divided into N 3 × 3 sub-blocks and the pixel mean square error threshold is set to T0. The average of the pixel values of the 3 × 3 = 9 pixels in a sub-block is obtained, the sum of the squared differences between the 9 pixel values and the average is calculated, and the square root of the ratio of that sum to 9 (the number of pixels) is the pixel mean square error of the sub-block. A sub-block with pixel mean square error < T0 (small pixel variation) is labeled a smooth block; the other sub-blocks, with pixel mean square error ≥ T0 (large pixel variation), are labeled edge blocks.
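The classification step can be sketched as follows; the 3 × 3 block size matches the example above, while the threshold T0 = 10.0 is an illustrative assumption (the disclosure leaves T0 application-dependent):

```python
import math

def pixel_rmse(block):
    """Root of the mean squared deviation of the block's pixel values from
    their mean — the 'pixel mean square error' used to classify sub-blocks."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))

def classify_block(block, threshold):
    """Label a sub-block 'smooth' (little variation) or 'edge' (strong variation)."""
    return "edge" if pixel_rmse(block) >= threshold else "smooth"

flat = [[100, 101, 100], [99, 100, 100], [100, 100, 101]]   # nearly uniform
edge = [[0, 0, 255], [0, 0, 255], [0, 255, 255]]            # sharp transition
T0 = 10.0  # illustrative threshold
assert classify_block(flat, T0) == "smooth"
assert classify_block(edge, T0) == "edge"
```

Smooth blocks then go to the cheap interpolation path, edge blocks to the WGAN path.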
In one embodiment, to reduce the consumption of computing resources, a spatial interpolation method is used to repair the smooth blocks of the image to be repaired. Smoothing an image in the spatial domain is usually a weighted neighborhood average, replacing the gray level of each pixel by the average of several pixel gray levels: the average pixel value of a plurality of pixel points adjacent to the pixel point to be repaired in the smooth block is obtained, and the pixel value of the pixel to be repaired is replaced by that average.
The image after interpolation restoration of the smooth block is

g(x, y) = (1/M) · Σ_{(m,n)∈S} f(m, n)
wherein g(x, y) is the gray value of a first pixel point with coordinate (x, y), and f(m, n) is the gray value of a pixel point adjacent to the first pixel point with coordinate (m, n); S is the set of pixels adjacent to the first pixel, excluding the point (x, y); M is the total number of pixels in the set; and x, y = 0, 1, 2, …, N−1. The gray value of each pixel in the smoothed image g(x, y) is determined by the average of the gray values of the pixels of f(x, y) contained in a predetermined neighborhood of (x, y).
For example, for a 3 × 3 smooth block, one pixel point in the block is taken, the gray values of 6 pixel points adjacent to it are obtained, the average of those 6 gray values is calculated, and that average replaces the gray value of the pixel point. The same method replaces the original gray values of all 9 pixel points in the smooth block.
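The neighborhood-average repair can be sketched as below. As a simplification, this version averages every valid neighbour of the damaged pixel (up to 8, fewer at image borders), whereas the example above fixes the count at 6; the set S and its size M follow the interpolation formula:

```python
def repair_pixel(img, x, y):
    """Replace pixel (x, y) by the average grey value of its valid neighbours:
    g(x, y) = (1/M) * sum of f(m, n) over the neighbourhood S."""
    h, w = len(img), len(img[0])
    neighbours = [img[i][j]
                  for i in range(x - 1, x + 2)
                  for j in range(y - 1, y + 2)
                  if 0 <= i < h and 0 <= j < w and (i, j) != (x, y)]
    return sum(neighbours) / len(neighbours)

# Damaged centre pixel of a 3x3 smooth block, rebuilt from its 8 neighbours.
block = [[100, 102, 100],
         [104, 0,   100],   # 0 marks the lost pixel
         [100, 102, 100]]
print(round(repair_pixel(block, 1, 1)))  # -> 101
```

Because smooth blocks vary little by definition, this cheap local average is close to the lost value without any model training.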
In one embodiment, a deep self-similarity algorithm is used for repair: for the intra-frame edge blocks, the remaining sub-blocks of the image are repaired with an artificial-intelligence-based deep self-similarity algorithm. A Wasserstein generative adversarial network (WGAN) model is constructed, comprising a generator and a discriminator. A model parameter θ of the WGAN model is set, the image to be repaired in the edge block is input to the WGAN model, and the WGAN model is trained.
In the iterative process, the pixel mean square error between the last image output by the WGAN model and the current output image is recorded as X, and the value of θ is dynamically adjusted. If X is smaller than the preset threshold, the WGAN model training is determined to be finished. The image to be repaired in the edge block is then repaired with the trained WGAN model, and the repaired edge block is output.
The image restoration task can be expressed as an energy minimization problem whose optimization function comprises two terms, a data term relating the generated image to the original image and a prior-knowledge term:

x* = argmin_x E(x; x0) + R(x)
where x0 is the image to be restored, x is a restored image, and x* is the optimal restored image obtained by solving; E(x; x0) is a task-dependent data term expressing a functional relation between the generated image and the correct original image, and R(x) is a regularization term encoding prior knowledge of natural images. In a traditional machine learning method, a large amount of training is needed to obtain this prior knowledge.
A generative adversarial network (GAN) is a deep learning prediction mechanism that can be used to repair damaged images. A GAN consists of two major components: the Generator Neural Network and the Discriminator Neural Network.
As shown in fig. 3A, the generator network G(z) first takes random input and attempts to output a generated image. Specifically, the generator G(z) obtains an input z, where z is a sample from the probability distribution p(z) representing a randomly encoded tensor (vector), so that a picture of the same size as the output is initialized (i.e., the pixel values are randomly generated, randomly initializing the generated picture). The generator G(z) then sends the generated image data to the discriminator network D(x).
The task of the discriminator network is to receive the generated data and attempt to determine whether the received image is real or fake. The discriminator network D(x) requires prior knowledge of the distribution pdata(x) of the real image data x, which is also input to D(x) as reference empirical knowledge for judging whether the generated data is real.
If the discriminator network D(x) judges the generated image data to be fake, iterative adjustment of the parameter set θ of the generator network G(z) and the discriminator network D(x) continues until the optimal parameter set θ* of the whole GAN is reached; when the discriminator network D(x) judges the generated image to be real, the image generated by G(z) under the optimal parameters θ* is the completed repaired image x.
To avoid the problem of GAN training instability, the present disclosure adopts the Wasserstein GAN (WGAN), which converges faster and is more stable, as the basic model of the deep self-similarity algorithm. The WGAN approach assumes that the data has local correlation, which is itself prior information; in addition, the intact part of the damaged image is itself correct prior information. This prior knowledge does not come from the network "seeing" a large number of samples, so R(x) need not appear explicitly in the formula. After simplification, the solving formula of the simplified WGAN model parameter θ provided by the present disclosure is:
θ* = argmin_θ E(f_θ(z); x0)
where z represents a randomly encoded tensor (vector) used to initialize a generated picture of the same size as the output (i.e., the pixel values are randomly generated, randomly initializing the generated picture); θ is the model parameter of the WGAN, whose optimal value θ* is obtained through training. To avoid the vanishing-gradient problem of the generator and discriminator, the WGAN learns the parameters through the Wasserstein distance, abandoning the conventional gradient formulation; argmin_θ E(·) refers to the WGAN model parameter θ that minimizes the function E(·).
Initialize the WGAN model function f with a random model parameter θ, and train the parameters of the model f using a fixed random code z and the picture x0 to be repaired (the picture after spatial-domain repair is finished) as the inputs of f. After a certain number of iterations, if the pixel mean square error between the last and the current output image is less than 0.001 (threshold), f is considered able to output a repaired image. Video frames restored by the video image restoration method of the present disclosure are shown in figs. 3B, 3C, and 3D.
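The shape of the solving loop above — optimize θ of a generator f_θ on a fixed random code z against the picture to be repaired, stopping when successive outputs barely change — can be sketched with a toy stand-in for the generator. The per-pixel affine model, learning rate, sizes, and threshold below are all illustrative assumptions; a real WGAN generator is a deep network whose structure couples pixels and can therefore fill the missing regions, which this toy cannot:

```python
import numpy as np

rng = np.random.default_rng(0)

z = rng.uniform(-1, 1, size=16)       # fixed random code z (uniform for stability)
x0 = np.full(16, 0.5)                 # flattened "picture to be repaired"
mask = np.ones(16)
mask[5:8] = 0.0                       # zeros mark the lost pixels

# Toy stand-in for f_theta: a per-pixel affine map theta = (scale, bias).
theta = np.zeros((2, 16))
lr, eps, prev = 0.05, 1e-3, None
for step in range(20000):
    x = theta[0] * z + theta[1]       # current output f_theta(z)
    err = mask * (x - x0)             # data term E uses only intact pixels
    theta[0] -= lr * 2 * err * z      # gradient step on theta
    theta[1] -= lr * 2 * err
    if prev is not None and np.mean((x - prev) ** 2) < eps ** 2:
        break                         # successive outputs converged: stop
    prev = x.copy()

repaired = theta[0] * z + theta[1]
# Intact pixels are reproduced; actually filling the masked pixels relies on
# the structural prior of a real generator network, which this toy omits.
assert np.mean((mask * (repaired - x0)) ** 2) < 1e-3
```

The stopping criterion mirrors the disclosure's: iterate until the mean square error between successive outputs drops below a small threshold, then read off the generator's output as the repaired image.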
In one embodiment, as shown in fig. 4, the present disclosure provides a video image restoration apparatus 40, including: a video image detection module 41, a repair area discrimination module 42, a first image repair module 43, a second image repair module 44, and a repair image generation module 45.
The video image detection module 41 determines whether there is data loss in the received video frame based on a preset detection rule. If yes, the repair area determining module 42 divides the video frame into a plurality of non-overlapping sub-blocks, and determines sub-blocks to be repaired in the video frame, where the sub-blocks to be repaired include: a smooth block and an edge block.
The first image restoration module 43 performs interpolation restoration on the smooth block, determining the pixel value of the pixel point to be repaired according to the pixel values of the adjacent pixel points in the smooth block. The second image restoration module 44 trains an adversarial network model with the image to be repaired in the edge block and repairs that image with the trained model. The restored image generation module 45 generates a restored video frame from the repaired smooth blocks and edge blocks.
In one embodiment, the video image detection module 41 obtains a first hash value corresponding to a video frame from the video frame, calculates a second hash value of the video frame, compares the first hash value with the second hash value, and determines that the video frame has data loss if the first hash value is not the same as the second hash value.
The repair area determination module 42 divides the video frame into N m × m sub-blocks that do not overlap with each other, where m × m is the number of pixels. The repair area discrimination module 42 obtains an average value of pixel values of the pixel points in the sub-block, and calculates a mean square error of the pixel corresponding to the pixel point in the sub-block based on the average value. If the pixel mean square error is greater than or equal to the pixel mean square error threshold, the repair area discrimination module 42 determines the sub-block to be an edge block. If the pixel mean square error is less than the pixel mean square error threshold, the repair area discrimination module 42 determines the sub-block to be a smooth block.
The first image restoration module 43 obtains the average of the pixel values of the pixel points adjacent to the pixel point to be repaired in the smooth block, and replaces the pixel value of the pixel to be repaired with that average.
The first image restoration module 43 restores the interpolated image of the smooth block as

g(x, y) = (1/M) Σ_{(m,n)∈S} f(m, n)

wherein g(x, y) is the gray value of a first pixel point with coordinates (x, y), and f(m, n) is the gray value of a pixel point adjacent to the first pixel point with coordinates (m, n); S is the set of pixels adjacent to the first pixel, excluding the (x, y) point; M is the total number of pixels in the set.
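The interpolation formula can be sketched directly. An 8-neighbourhood is assumed here, while the embodiment only requires "adjacent" pixel points, so the exact neighbourhood shape is an illustrative choice.

```python
import numpy as np

def interpolate_pixel(frame: np.ndarray, x: int, y: int) -> float:
    """g(x, y) = (1/M) * sum of f(m, n) over the set S of neighbours of
    (x, y), excluding (x, y) itself and positions outside the frame."""
    h, w = frame.shape
    neighbours = [
        float(frame[m, n])
        for m in range(x - 1, x + 2)
        for n in range(y - 1, y + 2)
        if (m, n) != (x, y) and 0 <= m < h and 0 <= n < w
    ]
    return sum(neighbours) / len(neighbours)  # M = len(neighbours)

# A lost pixel surrounded by gray value 100 is restored to 100.
frame = np.full((3, 3), 100.0)
frame[1, 1] = 0.0
repaired = interpolate_pixel(frame, 1, 1)
```

At frame borders S simply shrinks, so M is the number of valid neighbours rather than a fixed 8.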
The second image restoration module 44 constructs a Wasserstein generative adversarial network (WGAN) model, where the WGAN model includes a generator and a discriminator, sets a model parameter θ of the WGAN model, inputs the image to be restored in the edge block to the WGAN model, and trains the WGAN model. During the iterative process, the second image restoration module 44 records the pixel mean square error between the previous image output by the WGAN model and the current output image as X, and dynamically adjusts the value of θ. If X is smaller than a preset threshold, the second image restoration module 44 determines that WGAN model training is complete, restores the image to be restored in the edge block using the trained WGAN model, and outputs the restored edge block.
The solving formula of the model parameter θ is:

θ* = argmin_θ E(f_θ(z), x0)

wherein the optimal value θ* of θ is obtained through training, z represents a random code vector, and argmin E(·) refers to the WGAN model parameter θ for which the function E(·) attains its minimum.
The second image restoration module 44 sets

x* = f_{θ*}(z)

that is, the WGAN model function f is initialized with a random model parameter θ, the input of f is the fixed random code z and the picture x0 to be repaired, and the parameters of the model f are trained; if the pixel mean square error between the previous image output by the WGAN model and the current output image is smaller than the preset threshold, it is determined that f can output the repaired image x*.
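The iterative stopping rule can be sketched as follows. The `toy_step` function below is a stand-in for one WGAN parameter update followed by a forward pass f_θ(z), introduced purely so the example runs without a deep learning framework; a real implementation would train the WGAN generator here.

```python
import numpy as np

def train_until_stable(step, z, threshold, max_iters=500):
    """Iterate the model and stop once the pixel mean square error X
    between two consecutive outputs drops below the preset threshold,
    mirroring the WGAN training loop's completion criterion."""
    prev = step(z)
    for i in range(max_iters):
        cur = step(z)
        x = float(np.mean((cur - prev) ** 2))  # X in the embodiment's notation
        if x < threshold:
            return cur, i + 1  # training considered complete
        prev = cur
    return prev, max_iters

# Toy stand-in: successive outputs converge geometrically to a target image.
state = {"t": 0}
target = np.ones((4, 4))
def toy_step(z):
    state["t"] += 1
    return target * (1.0 - 0.5 ** state["t"])

result, iters = train_until_stable(toy_step, None, 1e-4)
```

Because the criterion compares consecutive outputs only, it needs no reference image, consistent with the claim that no other normal images are learned in advance.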
Fig. 5 is a block diagram of another embodiment of a terminal according to the present disclosure. As shown in Fig. 5, the apparatus may include a memory 51, a processor 52, a communication interface 53, and a bus 54. The memory 51 is used for storing instructions, the processor 52 is coupled to the memory 51, and the processor 52 is configured to execute the video image restoration method based on the instructions stored in the memory 51.
The memory 51 may be a high-speed RAM memory, a non-volatile memory, or the like, and the memory 51 may be a memory array. The memory 51 may also be partitioned, and the blocks may be combined into virtual volumes according to certain rules. The processor 52 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the video image restoration methods disclosed herein.
In one embodiment, the present disclosure also provides a computer-readable storage medium, where the computer-readable storage medium stores computer instructions, and the instructions, when executed by a processor, implement the video image restoration method according to any of the above embodiments.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, apparatus, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
In the video image restoration method, apparatus, and storage medium of the above embodiments, the video frame is restored at the receiving end without obtaining additional data from the transmitting end or the transmission channel, so no additional bandwidth resources are consumed and no retransmission delay is introduced. Because the model is based on a generative adversarial network, training is simple: only the input image to be restored is used for training, and no other normal images need to be learned in advance, which greatly reduces the complexity of model solving, shortens training time, lowers the consumption of computing resources, and ensures real-time decoding at the receiving end. Because the prediction and generation capabilities of a deep learning model exceed those of a traditional shallow machine learning model, the regular features and image style of the undamaged part of the frame are fully learned; owing to the local correlation of the WGAN, the neighborhood features of the part to be repaired are not merely considered but given increased weight, so the restoration effect is good even where pixel values change drastically. The method is therefore easy to popularize, and ultimately the erroneous video stream can be repaired in real time, optimizing the video decoding quality of the IPTV service.
The method and system of the present disclosure may be implemented in a number of ways. For example, the methods and systems of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
The description of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the disclosure in the form disclosed. Many modifications and variations will be apparent to practitioners skilled in this art. The embodiments were chosen and described in order to best explain the principles of the disclosure and its practical application, and to enable others of ordinary skill in the art to understand the disclosure in various embodiments with various modifications as are suited to the particular use contemplated.

Claims (16)

1. A video image restoration method, comprising:
judging whether the received video frame has data loss or not based on a preset detection rule;
if yes, dividing the video frame into a plurality of non-overlapping sub-blocks, and determining the sub-blocks needing to be repaired in the video frame, wherein the sub-blocks needing to be repaired comprise: a smooth block and an edge block;
carrying out interpolation restoration on the smooth block; determining the pixel value of a pixel point to be repaired according to the pixel values of pixel points adjacent to the pixel point to be repaired in the smooth block;
training an adversarial network model by using the image to be repaired in the edge block, and repairing the image to be repaired in the edge block by using the trained adversarial network model;
and generating a repaired video frame according to the repaired smooth block and the repaired edge block.
2. The method of claim 1, wherein the determining whether there is data loss in the received video frame based on the preset detection rule comprises:
acquiring a first hash value corresponding to the video frame from the video frame;
calculating a second hash value of the video frame;
and comparing the first hash value with the second hash value, and if the first hash value is different from the second hash value, determining that the video frame has data loss.
3. The method of claim 2, wherein the dividing the video frame into a plurality of non-overlapping sub-blocks and determining the sub-blocks of the video frame that need repair comprises:
dividing the video frame into N non-overlapping m × m sub-blocks, wherein m × m is the number of pixels;
obtaining the average value of the pixel values of the pixels in the sub-block, and calculating the mean square error of the pixels corresponding to the pixels in the sub-block based on the average value;
determining the sub-block to be an edge block if the pixel mean square error is greater than or equal to a pixel mean square error threshold; and if the pixel mean square error is smaller than the pixel mean square error threshold value, determining the sub-block to be a smooth block.
4. The method according to claim 2, wherein the determining the pixel value of the pixel point to be repaired according to the pixel value of the pixel point adjacent to the pixel point to be repaired in the smooth block comprises:
obtaining an average pixel value of a plurality of pixel points adjacent to the pixel point to be repaired in the smooth block;
the average pixel value is used to replace the pixel value of the pixel to be repaired.
5. The method of claim 4, wherein,
the image after interpolation restoration of the sliding block is
Figure FDA0001780838850000021
Wherein g (x, y) is the gray value of a first pixel point with the coordinate (x, y), and f (m, n) is the gray value of a pixel point which is adjacent to the first pixel point and has the coordinate value (m, n); s is a set of pixels adjacent to the first pixel, excluding the (x, y) point; m is the total number of pixels in the set.
6. The method of claim 1, wherein the training an adversarial network model using the image to be repaired in the edge block, and the repairing the image to be repaired in the edge block using the trained adversarial network model comprises:
constructing a Wasserstein generative adversarial network (WGAN) model, wherein the WGAN model comprises a generator and a discriminator;
setting a model parameter θ of the WGAN model, inputting the image to be repaired in the edge block to the WGAN model, and training the WGAN model;
in the iterative process, recording the pixel mean square error between the previous image output by the WGAN model and the current output image as X, and dynamically adjusting the value of θ;
if X is smaller than a preset threshold, determining that the WGAN model training is finished;
and repairing the image to be repaired in the edge block by using the WGAN model after training, and outputting the repaired edge block.
7. The method of claim 6, wherein,
the solving formula of the model parameter theta is as follows:
Figure FDA0001780838850000022
wherein, the optimal value theta * of theta is obtained by training, z represents a vector of random coding, argmin E (-) refers to the WGAN model parameter theta which makes the function E (-) obtain the minimum value;
is provided with
Figure FDA0001780838850000023
The method comprises the steps of initializing a WGAN model function f by using random model parameters theta, enabling input of the function f to be fixed random codes z and pictures to be restored x0, training parameters of the model f, and if the mean square error of pixels between a last image output by the WGAN model and a current output image is smaller than a preset threshold value, determining that f can achieve output of restored pictures x *.
8. A video image restoration apparatus comprising:
the video image detection module is used for judging whether the received video frame has data loss or not based on a preset detection rule;
a repair area determination module, configured to, if yes, divide the video frame into a plurality of non-overlapping sub-blocks, and determine the sub-blocks that need to be repaired in the video frame, wherein the sub-blocks that need to be repaired comprise: a smooth block and an edge block;
the first image restoration module is used for carrying out interpolation restoration on the smooth block; determining the pixel value of a pixel point to be repaired according to the pixel values of pixel points adjacent to the pixel point to be repaired in the smooth block;
the second image restoration module is used for training an adversarial network model by using the image to be repaired in the edge block and repairing the image to be repaired in the edge block by using the trained adversarial network model;
and the repair image generation module is used for generating a repair video frame according to the repaired smooth block and the edge block.
9. The apparatus of claim 8, wherein,
the video image detection module is used for acquiring a first hash value corresponding to the video frame from the video frame; calculating a second hash value of the video frame; and comparing the first hash value with the second hash value, and if the first hash value is different from the second hash value, determining that the video frame has data loss.
10. The apparatus of claim 9, wherein,
the repair area determination module is used for dividing the video frame into N non-overlapping m × m sub-blocks, wherein m × m is the number of pixels; obtaining the average value of the pixel values of the pixels in the sub-block; calculating the pixel mean square error corresponding to the pixels in the sub-block based on the average value; determining the sub-block to be an edge block if the pixel mean square error is greater than or equal to a pixel mean square error threshold; and determining the sub-block to be a smooth block if the pixel mean square error is smaller than the pixel mean square error threshold.
11. The apparatus of claim 8, wherein,
the first image restoration module is used for obtaining an average pixel value of a plurality of pixel points adjacent to the pixel point to be repaired in the smooth block; the average pixel value is used to replace the pixel value of the pixel to be repaired.
12. The apparatus of claim 11, wherein,
the first image restoration module is used for restoring the interpolated image of the smooth block as

g(x, y) = (1/M) Σ_{(m,n)∈S} f(m, n)

wherein g(x, y) is the gray value of a first pixel point with coordinates (x, y), and f(m, n) is the gray value of a pixel point adjacent to the first pixel point with coordinates (m, n); S is the set of pixels adjacent to the first pixel, excluding the (x, y) point; M is the total number of pixels in the set.
13. The apparatus of claim 8, wherein,
the second image restoration module is used for constructing a Wasserstein generative adversarial network (WGAN) model, wherein the WGAN model comprises a generator and a discriminator; setting a model parameter θ of the WGAN model, inputting the image to be repaired in the edge block to the WGAN model, and training the WGAN model; in the iterative process, recording the pixel mean square error between the previous image output by the WGAN model and the current output image as X, and dynamically adjusting the value of θ; if X is smaller than a preset threshold, determining that the WGAN model training is finished; and repairing the image to be repaired in the edge block by using the trained WGAN model, and outputting the repaired edge block.
14. The apparatus of claim 13, wherein,
the solving formula of the model parameter θ is:

θ* = argmin_θ E(f_θ(z), x0)

wherein the optimal value θ* of θ is obtained through training, z represents a random code vector, and argmin E(·) refers to the WGAN model parameter θ for which the function E(·) attains its minimum;

the second image restoration module is used for setting

x* = f_{θ*}(z)

wherein the WGAN model function f is initialized with a random model parameter θ, the input of f is a fixed random code z and the picture x0 to be repaired, and the parameters of the model f are trained; if the pixel mean square error between the previous image output by the WGAN model and the current output image is smaller than the preset threshold, it is determined that f can output the repaired picture x*.
15. A video image restoration apparatus comprising:
a memory; and a processor coupled to the memory, the processor configured to perform the method of any of claims 1-7 based on instructions stored in the memory.
16. A computer readable storage medium having stored thereon computer program instructions which, when executed by one or more processors, implement the steps of the method of any one of claims 1 to 7.
CN201810991291.3A 2018-08-29 2018-08-29 Video image restoration method, device and storage medium Active CN110874819B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810991291.3A CN110874819B (en) 2018-08-29 2018-08-29 Video image restoration method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810991291.3A CN110874819B (en) 2018-08-29 2018-08-29 Video image restoration method, device and storage medium

Publications (2)

Publication Number Publication Date
CN110874819A true CN110874819A (en) 2020-03-10
CN110874819B CN110874819B (en) 2022-06-17

Family

ID=69714156

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810991291.3A Active CN110874819B (en) 2018-08-29 2018-08-29 Video image restoration method, device and storage medium

Country Status (1)

Country Link
CN (1) CN110874819B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101389038A (en) * 2008-09-28 2009-03-18 湖北科创高新网络视频股份有限公司 Video error blanketing method and apparatus based on macro block classification
CN101931821A (en) * 2010-07-21 2010-12-29 中兴通讯股份有限公司 Video transmission error control method and system
CN103124356A (en) * 2013-01-17 2013-05-29 浙江工业大学 Self-adaptive space domain error concealment method based on direction information
US20170372193A1 (en) * 2016-06-23 2017-12-28 Siemens Healthcare Gmbh Image Correction Using A Deep Generative Machine-Learning Model
CN107563510A (en) * 2017-08-14 2018-01-09 华南理工大学 A kind of WGAN model methods based on depth convolutional neural networks
CN107945140A (en) * 2017-12-20 2018-04-20 中国科学院深圳先进技术研究院 A kind of image repair method, device and equipment


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
He Dongjian: "Digital Image Processing", Xidian University Press, 28 February 2015 *
Ha Wenquan: "Image Inpainting Application Based on WGAN", Electronic Technology & Software Engineering *
Cao Zhiyi: "Occluded Image Inpainting Algorithm Based on Generative Adversarial Networks", Journal of Beijing University of Posts and Telecommunications *
Zeng Kai et al.: "Research Progress on Image Super-Resolution Reconstruction", Computer Engineering and Applications *
Gan Ling et al.: "A Block Matching Method for Video Inpainting", Computer Applications and Software *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112565763A (en) * 2020-11-30 2021-03-26 北京达佳互联信息技术有限公司 Abnormal image sample generation method and device, and image detection method and device
CN114638748A (en) * 2020-12-16 2022-06-17 阿里巴巴集团控股有限公司 Image processing method, image restoration method, computer equipment, storage medium
CN113643564A (en) * 2021-07-27 2021-11-12 中国科学院深圳先进技术研究院 A parking data restoration method, device, computer equipment and storage medium
CN115065819A (en) * 2022-06-06 2022-09-16 三星电子(中国)研发中心 Method and device for repairing splash screen, electronic equipment and storage medium
CN116977192A (en) * 2022-10-09 2023-10-31 中国移动通信有限公司研究院 Method and device for repairing defective video frame
CN118694981A (en) * 2024-08-23 2024-09-24 宁波康达凯能医疗科技有限公司 A method, device and medium for inter-frame error concealment based on stable diffusion model
CN118694981B (en) * 2024-08-23 2024-12-10 宁波康达凯能医疗科技有限公司 A method, device and medium for inter-frame error concealment based on stable diffusion model

Also Published As

Publication number Publication date
CN110874819B (en) 2022-06-17

Similar Documents

Publication Publication Date Title
CN110874819B (en) Video image restoration method, device and storage medium
Jiang et al. Wireless semantic communications for video conferencing
US11310509B2 (en) Method and apparatus for applying deep learning techniques in video coding, restoration and video quality analysis (VQA)
CN113658051B (en) An image defogging method and system based on recurrent generative adversarial network
CN110324664B (en) A neural network-based video frame supplementation method and its model training method
Wang et al. Selfpromer: Self-prompt dehazing transformers with depth-consistency
CN110072119B (en) Content-aware video self-adaptive transmission method based on deep learning network
WO2023050720A1 (en) Image processing method, image processing apparatus, and model training method
WO2022252372A1 (en) Image processing method, apparatus and device, and computer-readable storage medium
Agarwal et al. Compressing video calls using synthetic talking heads
CN117896546B (en) Data transmission method, system, electronic equipment and storage medium
WO2023051583A1 (en) Video coding unit division method and apparatus, and computer device and computer-readable storage medium
Nami et al. Lightweight multitask learning for robust JND prediction using latent space and reconstructed frames
Zhang et al. Diffusion-Based Wireless Semantic Communication for VR Image
CN114549302B (en) Image super-resolution reconstruction method and system
Wang et al. Reparo: QoE-aware live video streaming in low-rate networks by intelligent frame recovery
KR102248352B1 (en) Method and device for removing objects in video
Yin et al. Generative Video Semantic Communication via Multimodal Semantic Fusion with Large Model
CN107071447B (en) Correlated noise modeling method based on secondary side information in DVC
CN118505581A (en) Image defogging method based on multi-scale dense feature fusion and gate control jump connection
CN114359009B (en) Watermark embedding method, watermark embedding network construction method, system and storage medium for robust image based on visual perception
CN117764812A (en) Image generation method, device, electronic equipment and medium
CN113628121B (en) Method and device for processing and training multimedia data
CN115409721A (en) Dark light video enhancement method and device
CN118741055B (en) High-resolution image transmission method and system based on optical communication

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant