
CN110874819A - Video image restoration method, device and storage medium - Google Patents


Info

Publication number
CN110874819A
CN110874819A (application CN201810991291.3A; granted as CN110874819B)
Authority
CN
China
Prior art keywords
repaired
pixel
image
value
model
Prior art date
Legal status (assumed; Google has not performed a legal analysis)
Granted
Application number
CN201810991291.3A
Other languages
Chinese (zh)
Other versions
CN110874819B (en)
Inventor
陈步华
陈戈
梁洁
庄一嵘
薛沛林
余媛
陈麒
Current Assignee (the listed assignees may be inaccurate; Google has not performed a legal analysis)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (assumed; not a legal conclusion)
Filing date
Publication date
Application filed by China Telecom Corp Ltd filed Critical China Telecom Corp Ltd
Priority to CN201810991291.3A
Publication of CN110874819A
Application granted
Publication of CN110874819B
Legal status: Active
Anticipated expiration

Classifications

    • G — PHYSICS
    • G06 — COMPUTING OR CALCULATING; COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 — Image enhancement or restoration
    • G06T5/77 — Retouching; Inpainting; Scratch removal
    • G06T5/70 — Denoising; Smoothing
    • G06T2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T2207/10 — Image acquisition modality
    • G06T2207/10016 — Video; Image sequence

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention provides a video image restoration method, a video image restoration device and a storage medium. The method comprises: obtaining the smooth blocks and edge blocks to be repaired in a received video frame; performing interpolation restoration on the smooth blocks; training an adversarial network model with the image to be repaired in the edge blocks; and repairing the image to be repaired in the edge blocks with the trained adversarial network model, generating a repaired video frame. The method, device and storage medium can reduce the data processing burden of the sending end and the transmission pressure of the channel, avoid transmission delay, greatly reduce the complexity of model training, guarantee the restoration effect, and improve restoration efficiency.

Description

Video image restoration method, device and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method and an apparatus for repairing a video image, and a storage medium.
Background
Against the background of the rapid development of the Internet and video services, where 4K/8K services cause the data volume to expand rapidly, higher requirements are placed on the video-quality reliability of the CDN. In transmission in existing IPTV systems, once a bit error or packet loss occurs, the video quality degrades rapidly (for example, the screen breaks up into artifacts), affecting the user's viewing experience. Such problems are usually repaired with error control techniques such as FEC (Forward Error Correction) and ARQ (Automatic Repeat reQuest), or with error concealment techniques. For example, under the FEC principle the sender adds redundant data to the transmission, which occupies extra bandwidth; under the ARQ principle the receiving end sends a packet-loss retransmission request to a retransmission server, which introduces extra transmission delay and increases channel transmission pressure, and in a poor channel video transmission errors still cannot be avoided. Error concealment techniques usually conceal erroneous data with an improved interpolation method, but this affects the restoration of the overall image's characteristic regularities, and the restoration of content with drastic pixel changes is poor.
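For context on the FEC trade-off mentioned above, the sketch below shows the simplest form of forward error correction: a single XOR parity packet lets the receiver rebuild any one lost packet of a group without retransmission, at the cost of one extra packet of bandwidth per group. This is illustrative background only, not the method of this disclosure:

```python
from functools import reduce

def xor_parity(packets):
    """Parity packet: byte-wise XOR of all data packets (single-loss FEC)."""
    return bytes(reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), packets))

def recover(received, parity):
    """Rebuild the one missing packet from the survivors plus the parity."""
    return xor_parity(received + [parity])

data = [b"abcd", b"efgh", b"ijkl"]
parity = xor_parity(data)
# Packet 1 is lost in transit; XOR of the survivors with the parity restores it.
assert recover([data[0], data[2]], parity) == data[1]
```

Real FEC schemes (e.g. Reed-Solomon) tolerate multiple losses, but the bandwidth overhead the Background criticizes is already visible here: four packets are sent to protect three.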
Disclosure of Invention
One or more embodiments of the present invention provide a video image restoration method, apparatus, and storage medium.
According to an aspect of the present disclosure, there is provided a video image restoration method including: judging whether a received video frame has data loss based on a preset detection rule; if yes, dividing the video frame into a plurality of non-overlapping sub-blocks and determining the sub-blocks that need to be repaired, the sub-blocks that need to be repaired comprising smooth blocks and edge blocks; performing interpolation restoration on the smooth block, determining the pixel value of a pixel point to be repaired according to the pixel values of the pixel points adjacent to it in the smooth block; training an adversarial network model with the image to be repaired in the edge block, and repairing the image to be repaired in the edge block with the trained adversarial network model; and generating a repaired video frame from the repaired smooth blocks and edge blocks.
Optionally, the determining whether there is data loss in the received video frame based on a preset detection rule includes: acquiring a first hash value corresponding to the video frame from the video frame; calculating a second hash value of the video frame; and comparing the first hash value with the second hash value, and if the first hash value is different from the second hash value, determining that the video frame has data loss.
Optionally, the dividing of the video frame into a plurality of non-overlapping sub-blocks and the determining of the sub-blocks that need to be repaired includes: dividing the video frame into N non-overlapping m × m sub-blocks, where m × m is the number of pixel points per sub-block; obtaining the average of the pixel values of the pixels in a sub-block and calculating the pixel mean square error of the sub-block based on that average; determining the sub-block to be an edge block if the pixel mean square error is greater than or equal to a pixel mean square error threshold; and determining the sub-block to be a smooth block if the pixel mean square error is smaller than the threshold.
Optionally, the determining of the pixel value of the pixel point to be repaired according to the pixel values of the adjacent pixel points in the smooth block includes: obtaining the average pixel value of a plurality of pixel points adjacent to the pixel point to be repaired in the smooth block; and replacing the pixel value of the pixel to be repaired with that average pixel value.
Optionally, the image after interpolation restoration of the smooth block is

g(x, y) = (1/M) · Σ_{(m,n)∈S} f(m, n)
wherein g(x, y) is the gray value of a first pixel point with coordinate (x, y), and f(m, n) is the gray value of a pixel point adjacent to the first pixel point with coordinate (m, n); S is the set of pixels adjacent to the first pixel, excluding the point (x, y); M is the total number of pixels in the set.
Optionally, the training of an adversarial network model with the image to be repaired in the edge block, and the repairing of that image with the trained model, includes: constructing a Wasserstein generative adversarial network (WGAN) model, the WGAN model comprising a generator and a discriminator; setting a model parameter θ of the WGAN model, inputting the image to be repaired in the edge block to the WGAN model, and training the WGAN model; in the iterative process, recording the pixel mean square error between the last image output by the WGAN model and the current output image as X, and dynamically adjusting the value of θ; if X is smaller than a preset threshold, determining that the WGAN model training is finished; and repairing the image to be repaired in the edge block with the trained WGAN model and outputting the repaired edge block.
Optionally, the solving formula of the model parameter θ is:

θ* = argmin_θ E(f_θ(z); x0)
wherein the optimal value θ* of θ is obtained through training, and z represents a random code vector; argmin_θ E(·) refers to the WGAN model parameter θ that minimizes the function E(·);
let

x* = f_{θ*}(z).
Initialize a WGAN model function f with a random model parameter θ, let the input of f be a fixed random code z and the picture x0 to be repaired, and train the parameters of the model f; if the pixel mean square error between the last image output by the WGAN model and the current output image is smaller than a preset threshold, it is determined that f can output the repaired image x.
According to another aspect of the present disclosure, there is provided a video image restoration apparatus including: a video image detection module for judging whether a received video frame has data loss based on a preset detection rule; a repair area discrimination module for, if yes, dividing the video frame into a plurality of non-overlapping sub-blocks and determining the sub-blocks that need to be repaired, the sub-blocks that need to be repaired comprising smooth blocks and edge blocks; a first image restoration module for performing interpolation restoration on the smooth block, determining the pixel value of a pixel point to be repaired according to the pixel values of the adjacent pixel points in the smooth block; a second image restoration module for training an adversarial network model with the image to be repaired in the edge block and repairing that image with the trained model; and a repair image generation module for generating a repaired video frame from the repaired smooth blocks and edge blocks.
Optionally, the video image detection module is configured to obtain a first hash value corresponding to the video frame from the video frame; calculating a second hash value of the video frame; and comparing the first hash value with the second hash value, and if the first hash value is different from the second hash value, determining that the video frame has data loss.
Optionally, the repair area determining module is configured to divide the video frame into N m × m sub-blocks that do not overlap with each other, where m × m is the number of pixels; obtaining the average value of the pixel values of the pixels in the sub-block, and calculating the mean square error of the pixels corresponding to the pixels in the sub-block based on the average value; determining the sub-block to be an edge block if the pixel mean square error is greater than or equal to a pixel mean square error threshold; and if the pixel mean square error is smaller than the pixel mean square error threshold value, determining the sub-block to be a smooth block.
Optionally, the first image restoration module is configured to obtain the average pixel value of a plurality of pixel points adjacent to the pixel point to be repaired in the smooth block, and to replace the pixel value of the pixel to be repaired with that average pixel value.
Optionally, the first image restoration module is configured to restore the smooth block by interpolation as

g(x, y) = (1/M) · Σ_{(m,n)∈S} f(m, n)
wherein g(x, y) is the gray value of a first pixel point with coordinate (x, y), and f(m, n) is the gray value of a pixel point adjacent to the first pixel point with coordinate (m, n); S is the set of pixels adjacent to the first pixel, excluding the point (x, y); M is the total number of pixels in the set.
Optionally, the second image restoration module is configured to construct a Wasserstein generative adversarial network (WGAN) model, the WGAN model comprising a generator and a discriminator; to set a model parameter θ of the WGAN model, input the image to be repaired in the edge block to the WGAN model, and train the WGAN model; in the iterative process, to record the pixel mean square error between the last image output by the WGAN model and the current output image as X and dynamically adjust the value of θ; if X is smaller than a preset threshold, to determine that the WGAN model training is finished; and to repair the image to be repaired in the edge block with the trained WGAN model and output the repaired edge block.
Optionally, the solving formula of the model parameter θ is:

θ* = argmin_θ E(f_θ(z); x0)
wherein the optimal value θ* of θ is obtained through training, and z represents a random code vector; argmin_θ E(·) refers to the WGAN model parameter θ that minimizes the function E(·);
the second image restoration module is configured to set

x* = f_{θ*}(z).
Initialize a WGAN model function f with a random model parameter θ, let the input of f be a fixed random code z and the picture x0 to be repaired, and train the parameters of the model f; if the pixel mean square error between the last image output by the WGAN model and the current output image is smaller than a preset threshold, it is determined that f can output the repaired image x.
According to still another aspect of the present disclosure, there is provided a video image restoration apparatus including: a memory; and a processor coupled to the memory, the processor configured to perform the method as described above based on instructions stored in the memory.
According to yet another aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by one or more processors, implement the steps of the method as described above.
The video image restoration method, device and storage medium obtain the smooth blocks and edge blocks to be repaired in a received video frame, perform interpolation restoration on the smooth blocks, train an adversarial network model with the image to be repaired in the edge blocks, and repair that image with the trained model to generate a repaired video frame. The method can reduce the data processing burden of the sending end and the transmission pressure of the channel, avoid transmission delay, greatly reduce the complexity of model training, guarantee the restoration effect, and improve restoration efficiency.
Drawings
To illustrate the embodiments of the present disclosure or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present disclosure, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart diagram illustrating one embodiment of a video image restoration method according to the present disclosure;
FIG. 2 is a schematic flow chart illustrating an inspection of a video frame in an embodiment of a video image restoration method according to the present disclosure;
FIG. 3A is a schematic diagram of a GAN model; FIG. 3B is an original video frame, FIG. 3C is a received video frame with missing data, and FIG. 3D is a video frame after being restored according to one embodiment of the video image restoration method of the present disclosure;
FIG. 4 is a block diagram representation of one embodiment of a video image restoration device according to the present disclosure;
fig. 5 is a block diagram of another embodiment of a video image restoration apparatus according to the present disclosure.
Detailed Description
The present disclosure now will be described more fully hereinafter with reference to the accompanying drawings, in which exemplary embodiments of the disclosure are shown. The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
The terms "first", "second", and the like are used hereinafter only for descriptive distinction and not for other specific meanings.
Fig. 1 is a schematic flow chart of an embodiment of a video image restoration method according to the present disclosure, as shown in fig. 1:
Step 101: judge whether the received video frame has data loss based on a preset detection rule. When the receiving end decodes, it determines whether each frame in the video has data loss; the error position and the lost information can also be determined.
Step 102: if yes, divide the video frame into a plurality of non-overlapping sub-blocks and determine the sub-blocks that need to be repaired, including smooth blocks and edge blocks. Smooth blocks are sub-blocks in which the pixel values of the video frame image change little; edge blocks are sub-blocks in which the pixel values change sharply.
Step 103: perform interpolation restoration on the smooth block, determining the pixel value of the pixel point to be repaired according to the pixel values of the adjacent pixel points in the smooth block. The intra-frame smooth blocks can be repaired with a spatial-domain interpolation method, and the video frame repaired by spatial-domain interpolation is output.
Step 104: train an adversarial network model with the image to be repaired in the edge block, and repair the image to be repaired in the edge block with the trained adversarial network model.
The edge blocks in the video frames can be restored with a deep self-similarity algorithm: based on deep learning, the image to be restored and the structure of the generative adversarial network serve as prior supervision information, and the edge blocks of the image to be restored are repaired by the deep self-similarity algorithm.
Step 105: generate a repaired video frame from the repaired smooth blocks and edge blocks.
The video image restoration method in this embodiment provides an intelligent restoration method for an erroneous video stream that can restore an erroneous video image. It is deployed at the video receiving end and uses only the image to be restored as input; on the basis of spatial interpolation restoration, it further restores the erroneous video image with an improved generative-adversarial-network solving method. Training is simple, the complexity of model solving is greatly reduced, and higher-quality image restoration can be obtained.
In one embodiment, there may be multiple detection rules. The difference between the received video code stream and the original video can be verified with a digital signature: if the digital signature of the corresponding frame differs, the video frame is considered to have a data error. For example, a first hash value corresponding to a video frame is obtained from the video frame, and a second hash value of the video frame is calculated, the two hash values being computed with the same method. The first and second hash values are compared, and if they differ, it is determined that the video frame has data loss.
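A minimal sketch of this hash comparison, assuming MD5 as the (unspecified) hash function and a hash value carried alongside each frame payload:

```python
import hashlib

def frame_hash(payload: bytes) -> str:
    """Hash of the frame payload (MD5 is an illustrative choice; the
    disclosure does not fix a particular hash function)."""
    return hashlib.md5(payload).hexdigest()

def has_data_loss(received_payload: bytes, embedded_hash: str) -> bool:
    """Compare the hash carried with the frame against one recomputed locally."""
    return frame_hash(received_payload) != embedded_hash

# A frame that arrives intact matches its embedded hash...
original = b"\x00\x01\x02\x03" * 100
h = frame_hash(original)
assert not has_data_loss(original, h)
# ...while a frame with a single corrupted byte does not.
corrupted = b"\xff" + original[1:]
assert has_data_loss(corrupted, h)
```

In a real system the hash would be computed by the sender and transported with the frame (e.g. in a side channel or container metadata), which this sketch leaves out.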
The video frame is divided into N non-overlapping m × m sub-blocks, where m × m is the number of pixel points per sub-block. The average of the pixel values of the pixel points in each sub-block is obtained, and the pixel mean square error of the sub-block is calculated based on that average. If the pixel mean square error is greater than or equal to the pixel mean square error threshold, the sub-block is determined to be an edge block; if it is smaller than the threshold, the sub-block is determined to be a smooth block.
For example, if the error detection mechanism detects that a video frame is corrupted, the video frame to be repaired is divided into N 3 × 3 sub-blocks and the pixel mean square error threshold is set to T0. The average of the pixel values of the 3 × 3 = 9 pixels in a sub-block is obtained, the sum of the squared differences between the 9 pixel values and the average is calculated, and the square root of the ratio of that sum to 9 (the number of pixels) is the pixel mean square error of the sub-block. A sub-block with pixel mean square error < T0 (small pixel variation) is labeled a smooth block; the other sub-blocks, with pixel mean square error ≥ T0 (large pixel variation), are labeled edge blocks.
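The classification step can be sketched as follows; the 3 × 3 block size matches the example above, while the threshold T0 = 10.0 is an illustrative assumption (the disclosure leaves T0 application-dependent):

```python
import math

def pixel_rmse(block):
    """Root of the mean squared deviation of the block's pixel values from
    their mean — the 'pixel mean square error' used to classify sub-blocks."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))

def classify_block(block, threshold):
    """Label a sub-block 'smooth' (little variation) or 'edge' (strong variation)."""
    return "edge" if pixel_rmse(block) >= threshold else "smooth"

flat = [[100, 101, 100], [99, 100, 100], [100, 100, 101]]   # nearly uniform
edge = [[0, 0, 255], [0, 0, 255], [0, 255, 255]]            # sharp transition
T0 = 10.0  # illustrative threshold
assert classify_block(flat, T0) == "smooth"
assert classify_block(edge, T0) == "edge"
```

Smooth blocks then go to the cheap interpolation path, edge blocks to the WGAN path.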
In one embodiment, to reduce the consumption of computing resources, a spatial interpolation method is used to repair the smooth blocks of the image to be repaired. Smoothing an image in the spatial domain is usually a weighted neighborhood average, replacing the gray level of each pixel by the average of several pixel gray levels: the average pixel value of a plurality of pixel points adjacent to the pixel point to be repaired in the smooth block is obtained, and the pixel value of the pixel to be repaired is replaced by that average.
The image after interpolation restoration of the smooth block is

g(x, y) = (1/M) · Σ_{(m,n)∈S} f(m, n)
wherein g(x, y) is the gray value of a first pixel point with coordinate (x, y), and f(m, n) is the gray value of a pixel point adjacent to the first pixel point with coordinate (m, n); S is the set of pixels adjacent to the first pixel, excluding the point (x, y); M is the total number of pixels in the set; and x, y = 0, 1, 2, …, N−1. The gray value of each pixel in the smoothed image g(x, y) is determined by the average of the gray values of the pixels of f(x, y) contained in a predetermined neighborhood of (x, y).
For example, for a 3 × 3 smooth block, one pixel point in the block is taken, the gray values of 6 pixel points adjacent to it are obtained, the average of those 6 gray values is calculated, and that average replaces the gray value of the pixel point. The same method replaces the original gray values of all 9 pixel points in the smooth block.
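The neighborhood-average repair can be sketched as below. As a simplification, this version averages every valid neighbour of the damaged pixel (up to 8, fewer at image borders), whereas the example above fixes the count at 6; the set S and its size M follow the interpolation formula:

```python
def repair_pixel(img, x, y):
    """Replace pixel (x, y) by the average grey value of its valid neighbours:
    g(x, y) = (1/M) * sum of f(m, n) over the neighbourhood S."""
    h, w = len(img), len(img[0])
    neighbours = [img[i][j]
                  for i in range(x - 1, x + 2)
                  for j in range(y - 1, y + 2)
                  if 0 <= i < h and 0 <= j < w and (i, j) != (x, y)]
    return sum(neighbours) / len(neighbours)

# Damaged centre pixel of a 3x3 smooth block, rebuilt from its 8 neighbours.
block = [[100, 102, 100],
         [104, 0,   100],   # 0 marks the lost pixel
         [100, 102, 100]]
print(round(repair_pixel(block, 1, 1)))  # -> 101
```

Because smooth blocks vary little by definition, this cheap local average is close to the lost value without any model training.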
In one embodiment, a deep self-similarity algorithm is used for repair: for the intra-frame edge blocks, the remaining sub-blocks of the image are repaired with an artificial-intelligence-based deep self-similarity algorithm. A Wasserstein generative adversarial network (WGAN) model is constructed, comprising a generator and a discriminator. A model parameter θ of the WGAN model is set, the image to be repaired in the edge block is input to the WGAN model, and the WGAN model is trained.
In the iterative process, the pixel mean square error between the last image output by the WGAN model and the current output image is recorded as X, and the value of θ is dynamically adjusted. If X is smaller than the preset threshold, the WGAN model training is determined to be finished. The image to be repaired in the edge block is then repaired with the trained WGAN model, and the repaired edge block is output.
The image restoration task can be expressed as an energy minimization problem whose optimization function comprises two terms, a data term relating the generated image to the original image and a prior-knowledge term:

x* = argmin_x E(x; x0) + R(x)
where x0 is the image to be restored, x is a restored image, and x* is the optimal restored image obtained by solving; E(x; x0) is a task-dependent data term expressing a functional relation between the generated image and the correct original image, and R(x) is a regularization term encoding prior knowledge of natural images. In a traditional machine learning method, a large amount of training is needed to obtain this prior knowledge.
A generative adversarial network (GAN) is a deep learning prediction mechanism that can be used to repair damaged images. A GAN consists of two major components: the Generator Neural Network and the Discriminator Neural Network.
As shown in fig. 3A, the generator network G(z) first takes random input and attempts to output a generated image. Specifically, the generator G(z) obtains an input z, where z is a sample from the probability distribution p(z) representing a randomly encoded tensor (vector), so that a picture of the same size as the output is initialized (i.e., the pixel values are randomly generated, randomly initializing the generated picture). The generator G(z) then sends the generated image data to the discriminator network D(x).
The task of the discriminator network is to receive the generated data and attempt to determine whether the received image is real or fake. The discriminator network D(x) requires prior knowledge of the distribution pdata(x) of the real image data x, which is also input to D(x) as reference empirical knowledge for judging whether the generated data is real.
If the discriminator network D(x) judges the generated image data to be fake, iterative adjustment of the parameter set θ of the generator network G(z) and the discriminator network D(x) continues until the optimal parameter set θ* of the whole GAN is reached; when the discriminator network D(x) judges the generated image to be real, the image generated by G(z) under the optimal parameters θ* is the completed repaired image x.
To avoid the problem of GAN training instability, the present disclosure adopts the Wasserstein GAN (WGAN), which converges faster and is more stable, as the basic model of the deep self-similarity algorithm. The WGAN approach assumes that the data has local correlation, which is itself prior information; in addition, the intact part of the damaged image is itself correct prior information. This prior knowledge does not come from the network "seeing" a large number of samples, so R(x) need not appear explicitly in the formula. After simplification, the solving formula of the simplified WGAN model parameter θ provided by the present disclosure is:
θ* = argmin_θ E(f_θ(z); x0)
where z represents a randomly encoded tensor (vector) used to initialize a generated picture of the same size as the output (i.e., the pixel values are randomly generated, randomly initializing the generated picture); θ is the model parameter of the WGAN, whose optimal value θ* is obtained through training. To avoid the vanishing-gradient problem of the generator and discriminator, the WGAN learns the parameters through the Wasserstein distance, abandoning the conventional gradient formulation; argmin_θ E(·) refers to the WGAN model parameter θ that minimizes the function E(·).
Initialize the WGAN model function f with a random model parameter θ, and train the parameters of the model f using a fixed random code z and the picture x0 to be repaired (the picture after spatial-domain repair is finished) as the inputs of f. After a certain number of iterations, if the pixel mean square error between the last and the current output image is less than 0.001 (threshold), f is considered able to output a repaired image. Video frames restored by the video image restoration method of the present disclosure are shown in figs. 3B, 3C, and 3D.
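The shape of the solving loop above — optimize θ of a generator f_θ on a fixed random code z against the picture to be repaired, stopping when successive outputs barely change — can be sketched with a toy stand-in for the generator. The per-pixel affine model, learning rate, sizes, and threshold below are all illustrative assumptions; a real WGAN generator is a deep network whose structure couples pixels and can therefore fill the missing regions, which this toy cannot:

```python
import numpy as np

rng = np.random.default_rng(0)

z = rng.uniform(-1, 1, size=16)       # fixed random code z (uniform for stability)
x0 = np.full(16, 0.5)                 # flattened "picture to be repaired"
mask = np.ones(16)
mask[5:8] = 0.0                       # zeros mark the lost pixels

# Toy stand-in for f_theta: a per-pixel affine map theta = (scale, bias).
theta = np.zeros((2, 16))
lr, eps, prev = 0.05, 1e-3, None
for step in range(20000):
    x = theta[0] * z + theta[1]       # current output f_theta(z)
    err = mask * (x - x0)             # data term E uses only intact pixels
    theta[0] -= lr * 2 * err * z      # gradient step on theta
    theta[1] -= lr * 2 * err
    if prev is not None and np.mean((x - prev) ** 2) < eps ** 2:
        break                         # successive outputs converged: stop
    prev = x.copy()

repaired = theta[0] * z + theta[1]
# Intact pixels are reproduced; actually filling the masked pixels relies on
# the structural prior of a real generator network, which this toy omits.
assert np.mean((mask * (repaired - x0)) ** 2) < 1e-3
```

The stopping criterion mirrors the disclosure's: iterate until the mean square error between successive outputs drops below a small threshold, then read off the generator's output as the repaired image.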
In one embodiment, as shown in fig. 4, the present disclosure provides a video image restoration apparatus 40, including: a video image detection module 41, a repair area discrimination module 42, a first image repair module 43, a second image repair module 44, and a repair image generation module 45.
The video image detection module 41 determines whether there is data loss in the received video frame based on a preset detection rule. If yes, the repair area determining module 42 divides the video frame into a plurality of non-overlapping sub-blocks, and determines sub-blocks to be repaired in the video frame, where the sub-blocks to be repaired include: a smooth block and an edge block.
The first image restoration module 43 performs interpolation restoration on the smooth block, determining the pixel value of the pixel point to be repaired according to the pixel values of the adjacent pixel points in the smooth block. The second image restoration module 44 trains an adversarial network model with the image to be repaired in the edge block and repairs that image with the trained model. The restored image generation module 45 generates a restored video frame from the repaired smooth blocks and edge blocks.
In one embodiment, the video image detection module 41 obtains a first hash value corresponding to a video frame from the video frame, calculates a second hash value of the video frame, compares the first hash value with the second hash value, and determines that the video frame has data loss if the first hash value is not the same as the second hash value.
The repair area determination module 42 divides the video frame into N m × m sub-blocks that do not overlap with each other, where m × m is the number of pixels. The repair area discrimination module 42 obtains an average value of pixel values of the pixel points in the sub-block, and calculates a mean square error of the pixel corresponding to the pixel point in the sub-block based on the average value. If the pixel mean square error is greater than or equal to the pixel mean square error threshold, the repair area discrimination module 42 determines the sub-block to be an edge block. If the pixel mean square error is less than the pixel mean square error threshold, the repair area discrimination module 42 determines the sub-block to be a smooth block.
The first image restoration module 43 obtains the average of the pixel values of the pixel points adjacent to the pixel point to be repaired in the smooth block, and replaces the pixel value of the pixel to be repaired with that average.
The first image restoration module 43 restores the interpolated image of the smooth block as

g(x, y) = (1/M) Σ_{(m,n)∈S} f(m, n)

wherein g(x, y) is the gray value of a first pixel point with coordinates (x, y), and f(m, n) is the gray value of a pixel point adjacent to the first pixel point with coordinates (m, n); S is the set of pixels adjacent to the first pixel, excluding the (x, y) point; M is the total number of pixels in the set.
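The interpolation formula can be sketched directly. An 8-neighbourhood is assumed here, while the embodiment only requires "adjacent" pixel points, so the exact neighbourhood shape is an illustrative choice.

```python
import numpy as np

def interpolate_pixel(frame: np.ndarray, x: int, y: int) -> float:
    """g(x, y) = (1/M) * sum of f(m, n) over the set S of neighbours of
    (x, y), excluding (x, y) itself and positions outside the frame."""
    h, w = frame.shape
    neighbours = [
        float(frame[m, n])
        for m in range(x - 1, x + 2)
        for n in range(y - 1, y + 2)
        if (m, n) != (x, y) and 0 <= m < h and 0 <= n < w
    ]
    return sum(neighbours) / len(neighbours)  # M = len(neighbours)

# A lost pixel surrounded by gray value 100 is restored to 100.
frame = np.full((3, 3), 100.0)
frame[1, 1] = 0.0
repaired = interpolate_pixel(frame, 1, 1)
```

At frame borders S simply shrinks, so M is the number of valid neighbours rather than a fixed 8.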
The second image restoration module 44 constructs a Wasserstein generative adversarial network (WGAN) model, where the WGAN model includes a generator and a discriminator, sets a model parameter θ of the WGAN model, inputs the image to be restored in the edge block to the WGAN model, and trains the WGAN model. During the iterative process, the second image restoration module 44 records the pixel mean square error between the previous image output by the WGAN model and the current output image as X, and dynamically adjusts the value of θ. If X is smaller than a preset threshold, the second image restoration module 44 determines that WGAN model training is complete, restores the image to be restored in the edge block using the trained WGAN model, and outputs the restored edge block.
The solving formula of the model parameter θ is:

θ* = argmin_θ E(f_θ(z), x0)

wherein the optimal value θ* of θ is obtained through training, z represents a random code vector, and argmin E(·) refers to the WGAN model parameter θ for which the function E(·) attains its minimum.
The second image restoration module 44 sets

x* = f_{θ*}(z)

that is, the WGAN model function f is initialized with a random model parameter θ, the input of f is the fixed random code z and the picture x0 to be repaired, and the parameters of the model f are trained; if the pixel mean square error between the previous image output by the WGAN model and the current output image is smaller than the preset threshold, it is determined that f can output the repaired image x*.
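The iterative stopping rule can be sketched as follows. The `toy_step` function below is a stand-in for one WGAN parameter update followed by a forward pass f_θ(z), introduced purely so the example runs without a deep learning framework; a real implementation would train the WGAN generator here.

```python
import numpy as np

def train_until_stable(step, z, threshold, max_iters=500):
    """Iterate the model and stop once the pixel mean square error X
    between two consecutive outputs drops below the preset threshold,
    mirroring the WGAN training loop's completion criterion."""
    prev = step(z)
    for i in range(max_iters):
        cur = step(z)
        x = float(np.mean((cur - prev) ** 2))  # X in the embodiment's notation
        if x < threshold:
            return cur, i + 1  # training considered complete
        prev = cur
    return prev, max_iters

# Toy stand-in: successive outputs converge geometrically to a target image.
state = {"t": 0}
target = np.ones((4, 4))
def toy_step(z):
    state["t"] += 1
    return target * (1.0 - 0.5 ** state["t"])

result, iters = train_until_stable(toy_step, None, 1e-4)
```

Because the criterion compares consecutive outputs only, it needs no reference image, consistent with the claim that no other normal images are learned in advance.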
Fig. 5 is a block diagram of another embodiment of a terminal according to the present disclosure. As shown in Fig. 5, the apparatus may include a memory 51, a processor 52, a communication interface 53, and a bus 54. The memory 51 is used for storing instructions, the processor 52 is coupled to the memory 51, and the processor 52 is configured to execute the video image restoration method based on the instructions stored in the memory 51.
The memory 51 may be a high-speed RAM memory, a non-volatile memory, or the like, and the memory 51 may be a memory array. The memory 51 may also be partitioned, and the blocks may be combined into virtual volumes according to certain rules. The processor 52 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the video image restoration methods disclosed herein.
In one embodiment, the present disclosure also provides a computer-readable storage medium, where the computer-readable storage medium stores computer instructions, and the instructions, when executed by a processor, implement the video image restoration method according to any of the above embodiments.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, apparatus, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
In the video image restoration method, apparatus, and storage medium of the above embodiments, the video frame is restored at the receiving end without obtaining additional data from the transmitting end or the transmission channel, so no additional bandwidth resources are consumed and no retransmission delay is introduced. Because the model is based on a generative adversarial network, training is simple: only the input image to be restored is used for training, and no other normal images need to be learned in advance, which greatly reduces the complexity of model solving, shortens training time, lowers the consumption of computing resources, and ensures real-time decoding at the receiving end. Because the prediction and generation capabilities of a deep learning model exceed those of a traditional shallow machine learning model, the regular features and image style of the undamaged part of the frame are fully learned; owing to the local correlation of the WGAN, the neighborhood features of the part to be repaired are not merely considered but given increased weight, so the restoration effect is good even where pixel values change drastically. The method is therefore easy to popularize, and ultimately the erroneous video stream can be repaired in real time, optimizing the video decoding quality of the IPTV service.
The method and system of the present disclosure may be implemented in a number of ways. For example, the methods and systems of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
The description of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the disclosure in the form disclosed. Many modifications and variations will be apparent to practitioners skilled in this art. The embodiments were chosen and described in order to best explain the principles of the disclosure and its practical application, and to enable others of ordinary skill in the art to understand the disclosure in various embodiments with various modifications as are suited to the particular use contemplated.

Claims (16)

1. A video image restoration method, comprising:
judging whether the received video frame has data loss or not based on a preset detection rule;
if yes, dividing the video frame into a plurality of non-overlapping sub-blocks, and determining the sub-blocks needing to be repaired in the video frame, wherein the sub-blocks needing to be repaired comprise: a smooth block and an edge block;
carrying out interpolation restoration on the smooth block; determining the pixel value of a pixel point to be repaired according to the pixel values of pixel points adjacent to the pixel point to be repaired in the smooth block;
training an adversarial network model by using the image to be repaired in the edge block, and repairing the image to be repaired in the edge block by using the trained adversarial network model;
and generating a repaired video frame according to the repaired smooth block and the repaired edge block.
2. The method of claim 1, wherein the determining whether there is data loss in the received video frame based on the preset detection rule comprises:
acquiring a first hash value corresponding to the video frame from the video frame;
calculating a second hash value of the video frame;
and comparing the first hash value with the second hash value, and if the first hash value is different from the second hash value, determining that the video frame has data loss.
3. The method of claim 2, wherein the dividing the video frame into a plurality of non-overlapping sub-blocks and determining the sub-blocks of the video frame that need repair comprises:
dividing the video frame into N non-overlapping m × m sub-blocks, wherein m × m is the number of pixels;
obtaining the average value of the pixel values of the pixels in the sub-block, and calculating the mean square error of the pixels corresponding to the pixels in the sub-block based on the average value;
determining the sub-block to be an edge block if the pixel mean square error is greater than or equal to a pixel mean square error threshold; and if the pixel mean square error is smaller than the pixel mean square error threshold value, determining the sub-block to be a smooth block.
4. The method according to claim 2, wherein the determining the pixel value of the pixel point to be repaired according to the pixel value of the pixel point adjacent to the pixel point to be repaired in the smooth block comprises:
obtaining an average pixel value of a plurality of pixel points adjacent to the pixel point to be repaired in the smooth block;
the average pixel value is used to replace the pixel value of the pixel to be repaired.
5. The method of claim 4, wherein,
the image after interpolation restoration of the sliding block is
Figure FDA0001780838850000021
Wherein g (x, y) is the gray value of a first pixel point with the coordinate (x, y), and f (m, n) is the gray value of a pixel point which is adjacent to the first pixel point and has the coordinate value (m, n); s is a set of pixels adjacent to the first pixel, excluding the (x, y) point; m is the total number of pixels in the set.
6. The method of claim 1, wherein the training an adversarial network model using the image to be repaired in the edge block, and the repairing the image to be repaired in the edge block using the trained adversarial network model comprises:
constructing a Wasserstein generative adversarial network (WGAN) model, wherein the WGAN model comprises a generator and a discriminator;
setting a model parameter θ of the WGAN model, inputting the image to be repaired in the edge block to the WGAN model, and training the WGAN model;
in the iterative process, recording the pixel mean square error between the previous image output by the WGAN model and the current output image as X, and dynamically adjusting the value of θ;
if X is smaller than a preset threshold, determining that the WGAN model training is finished;
and repairing the image to be repaired in the edge block by using the WGAN model after training, and outputting the repaired edge block.
7. The method of claim 6, wherein,
the solving formula of the model parameter theta is as follows:
Figure FDA0001780838850000022
wherein, the optimal value theta * of theta is obtained by training, z represents a vector of random coding, argmin E (-) refers to the WGAN model parameter theta which makes the function E (-) obtain the minimum value;
is provided with
Figure FDA0001780838850000023
The method comprises the steps of initializing a WGAN model function f by using random model parameters theta, enabling input of the function f to be fixed random codes z and pictures to be restored x0, training parameters of the model f, and if the mean square error of pixels between a last image output by the WGAN model and a current output image is smaller than a preset threshold value, determining that f can achieve output of restored pictures x *.
8. A video image restoration apparatus comprising:
the video image detection module is used for judging whether the received video frame has data loss or not based on a preset detection rule;
a repair area determination module, configured to, if yes, divide the video frame into a plurality of non-overlapping sub-blocks, and determine the sub-blocks that need to be repaired in the video frame, wherein the sub-blocks that need to be repaired comprise: a smooth block and an edge block;
the first image restoration module is used for carrying out interpolation restoration on the smooth block; determining the pixel value of a pixel point to be repaired according to the pixel values of pixel points adjacent to the pixel point to be repaired in the smooth block;
the second image restoration module is used for training an adversarial network model by using the image to be repaired in the edge block and repairing the image to be repaired in the edge block by using the trained adversarial network model;
and the repair image generation module is used for generating a repair video frame according to the repaired smooth block and the edge block.
9. The apparatus of claim 8, wherein,
the video image detection module is used for acquiring a first hash value corresponding to the video frame from the video frame; calculating a second hash value of the video frame; and comparing the first hash value with the second hash value, and if the first hash value is different from the second hash value, determining that the video frame has data loss.
10. The apparatus of claim 9, wherein,
the repair area determination module is used for dividing the video frame into N non-overlapping m × m sub-blocks, wherein m × m is the number of pixels; obtaining the average value of the pixel values of the pixels in the sub-block; calculating the pixel mean square error corresponding to the pixels in the sub-block based on the average value; determining the sub-block to be an edge block if the pixel mean square error is greater than or equal to a pixel mean square error threshold; and determining the sub-block to be a smooth block if the pixel mean square error is smaller than the pixel mean square error threshold.
11. The apparatus of claim 8, wherein,
the first image restoration module is used for obtaining an average pixel value of a plurality of pixel points adjacent to the pixel point to be repaired in the smooth block; the average pixel value is used to replace the pixel value of the pixel to be repaired.
12. The apparatus of claim 11, wherein,
the first image restoration module is used for restoring the interpolated image of the smooth block as

g(x, y) = (1/M) Σ_{(m,n)∈S} f(m, n)

wherein g(x, y) is the gray value of a first pixel point with coordinates (x, y), and f(m, n) is the gray value of a pixel point adjacent to the first pixel point with coordinates (m, n); S is the set of pixels adjacent to the first pixel, excluding the (x, y) point; M is the total number of pixels in the set.
13. The apparatus of claim 8, wherein,
the second image restoration module is used for constructing a Wasserstein generative adversarial network (WGAN) model, wherein the WGAN model comprises a generator and a discriminator; setting a model parameter θ of the WGAN model, inputting the image to be repaired in the edge block to the WGAN model, and training the WGAN model; in the iterative process, recording the pixel mean square error between the previous image output by the WGAN model and the current output image as X, and dynamically adjusting the value of θ; if X is smaller than a preset threshold, determining that the WGAN model training is finished; and repairing the image to be repaired in the edge block by using the trained WGAN model, and outputting the repaired edge block.
14. The apparatus of claim 13, wherein,
the solving formula of the model parameter θ is:

θ* = argmin_θ E(f_θ(z), x0)

wherein the optimal value θ* of θ is obtained through training, z represents a random code vector, and argmin E(·) refers to the WGAN model parameter θ for which the function E(·) attains its minimum;

the second image restoration module is used for setting

x* = f_{θ*}(z)

wherein the WGAN model function f is initialized with a random model parameter θ, the input of f is a fixed random code z and the picture x0 to be repaired, and the parameters of the model f are trained; if the pixel mean square error between the previous image output by the WGAN model and the current output image is smaller than the preset threshold, it is determined that f can output the repaired picture x*.
15. A video image restoration apparatus comprising:
a memory; and a processor coupled to the memory, the processor configured to perform the method of any of claims 1-7 based on instructions stored in the memory.
16. A computer readable storage medium having stored thereon computer program instructions which, when executed by one or more processors, implement the steps of the method of any one of claims 1 to 7.
CN201810991291.3A 2018-08-29 2018-08-29 Video image restoration method, device and storage medium Active CN110874819B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810991291.3A CN110874819B (en) 2018-08-29 2018-08-29 Video image restoration method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810991291.3A CN110874819B (en) 2018-08-29 2018-08-29 Video image restoration method, device and storage medium

Publications (2)

Publication Number Publication Date
CN110874819A true CN110874819A (en) 2020-03-10
CN110874819B CN110874819B (en) 2022-06-17

Family

ID=69714156

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810991291.3A Active CN110874819B (en) 2018-08-29 2018-08-29 Video image restoration method, device and storage medium

Country Status (1)

Country Link
CN (1) CN110874819B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101389038A (en) * 2008-09-28 2009-03-18 湖北科创高新网络视频股份有限公司 Video error blanketing method and apparatus based on macro block classification
CN101931821A (en) * 2010-07-21 2010-12-29 中兴通讯股份有限公司 Video transmission error control method and system
CN103124356A (en) * 2013-01-17 2013-05-29 浙江工业大学 Self-adaptive space domain error concealment method based on direction information
US20170372193A1 (en) * 2016-06-23 2017-12-28 Siemens Healthcare Gmbh Image Correction Using A Deep Generative Machine-Learning Model
CN107563510A (en) * 2017-08-14 2018-01-09 华南理工大学 A kind of WGAN model methods based on depth convolutional neural networks
CN107945140A (en) * 2017-12-20 2018-04-20 中国科学院深圳先进技术研究院 A kind of image repair method, device and equipment


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
He Dongjian: "Digital Image Processing", Xidian University Press, 28 February 2015 *
Ha Wenquan: "Image Inpainting Application Based on WGAN", Electronic Technology & Software Engineering *
Cao Zhiyi: "Occluded Image Inpainting Algorithm Based on Generative Adversarial Networks", Journal of Beijing University of Posts and Telecommunications *
Zeng Kai et al.: "Research Progress on Image Super-Resolution Reconstruction", Computer Engineering and Applications *
Gan Ling et al.: "A Block Matching Method for Video Inpainting", Computer Applications and Software *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112565763A (en) * 2020-11-30 2021-03-26 北京达佳互联信息技术有限公司 Abnormal image sample generation method and device, and image detection method and device
CN114638748A (en) * 2020-12-16 2022-06-17 阿里巴巴集团控股有限公司 Image processing method, image restoration method, computer equipment, storage medium
CN113643564A (en) * 2021-07-27 2021-11-12 中国科学院深圳先进技术研究院 A parking data restoration method, device, computer equipment and storage medium
CN115065819A (en) * 2022-06-06 2022-09-16 三星电子(中国)研发中心 Method and device for repairing splash screen, electronic equipment and storage medium
CN116977192A (en) * 2022-10-09 2023-10-31 中国移动通信有限公司研究院 Method and device for repairing defective video frame
CN118694981A (en) * 2024-08-23 2024-09-24 宁波康达凯能医疗科技有限公司 A method, device and medium for inter-frame error concealment based on stable diffusion model
CN118694981B (en) * 2024-08-23 2024-12-10 宁波康达凯能医疗科技有限公司 A method, device and medium for inter-frame error concealment based on stable diffusion model

Also Published As

Publication number Publication date
CN110874819B (en) 2022-06-17

Similar Documents

Publication Publication Date Title
CN110874819B (en) Video image restoration method, device and storage medium
Jiang et al. Wireless semantic communications for video conferencing
US11310509B2 (en) Method and apparatus for applying deep learning techniques in video coding, restoration and video quality analysis (VQA)
CN113658051B (en) An image defogging method and system based on recurrent generative adversarial network
CN110324664B (en) A neural network-based video frame supplementation method and its model training method
Wang et al. Selfpromer: Self-prompt dehazing transformers with depth-consistency
CN110072119B (en) Content-aware video self-adaptive transmission method based on deep learning network
WO2023050720A1 (en) Image processing method, image processing apparatus, and model training method
WO2022252372A1 (en) Image processing method, apparatus and device, and computer-readable storage medium
Agarwal et al. Compressing video calls using synthetic talking heads
CN117896546B (en) Data transmission method, system, electronic equipment and storage medium
WO2023051583A1 (en) Video coding unit division method and apparatus, and computer device and computer-readable storage medium
Nami et al. Lightweight multitask learning for robust JND prediction using latent space and reconstructed frames
Zhang et al. Diffusion-Based Wireless Semantic Communication for VR Image
CN114549302B (en) Image super-resolution reconstruction method and system
Wang et al. Reparo: QoE-aware live video streaming in low-rate networks by intelligent frame recovery
KR102248352B1 (en) Method and device for removing objects in video
Yin et al. Generative Video Semantic Communication via Multimodal Semantic Fusion with Large Model
CN107071447B (en) Correlated noise modeling method based on secondary side information in DVC
CN118505581A (en) Image defogging method based on multi-scale dense feature fusion and gate control jump connection
CN114359009B (en) Watermark embedding method, watermark embedding network construction method, system and storage medium for robust image based on visual perception
CN117764812A (en) Image generation method, device, electronic equipment and medium
CN113628121B (en) Method and device for processing and training multimedia data
CN115409721A (en) Dark light video enhancement method and device
CN118741055B (en) High-resolution image transmission method and system based on optical communication

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant