
WO2018153161A1 - Method, apparatus, and device for video quality evaluation, and storage medium - Google Patents


Info

Publication number
WO2018153161A1
WO2018153161A1 (PCT/CN2017/119261; CN2017119261W)
Authority
WO
WIPO (PCT)
Prior art keywords
image block
pixel
video
gradient
distortion metric
Prior art date
Legal status
Ceased
Application number
PCT/CN2017/119261
Other languages
English (en)
Chinese (zh)
Inventor
刘祥凯
Current Assignee
Sanechips Technology Co Ltd
Original Assignee
Sanechips Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Sanechips Technology Co Ltd filed Critical Sanechips Technology Co Ltd
Publication of WO2018153161A1 publication Critical patent/WO2018153161A1/fr

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N17/00 Diagnosis, testing or measuring for television systems or their details
    • H04N17/02 Diagnosis, testing or measuring for television systems or their details for colour television signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/154 Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion

Definitions

  • the present invention relates to the field of multimedia information processing, and in particular to a video quality evaluation method, apparatus, and device, and a storage medium.
  • Video quality evaluation can be divided into two categories, subjective video quality evaluation and objective video quality evaluation.
  • Subjective video quality evaluation refers to organizing testers to view a set of distorted videos according to a prescribed experimental procedure and to score the quality of the videos subjectively.
  • subjective video quality evaluation may take the mean score of each test video as the Mean Opinion Score (MOS), or may take the difference between the score of each test video and the score of its corresponding original reference video as the Difference Mean Opinion Score (DMOS).
  • Subjective video quality evaluation yields results closest to the true visual perceptual quality of the human eye, but the experiments are time-consuming and laborious and cannot be applied in real-time video compression and processing systems.
  • an objective video quality evaluation algorithm can automatically predict the quality of a video and is therefore more practical. In order to evaluate whether an objective video quality evaluation algorithm can accurately predict the true visual perceptual quality of the human eye, a comprehensive data set is needed for test verification.
  • the contribution of subjective video quality evaluation is to establish a public test video data set and provide corresponding MOS or DMOS data for testing the performance of different objective video quality evaluation algorithms.
  • objective video quality evaluation algorithms can be roughly divided into three categories according to whether the original reference video is needed during computation: Full Reference (FR), Reduced Reference (RR), and No Reference (NR) video quality evaluation algorithms.
  • the earliest full reference algorithms, such as the Mean Square Error (MSE) and the Peak Signal to Noise Ratio (PSNR), measure distortion by pixel-wise differences between the original and distorted images.
  • subsequently, algorithms for extracting and comparing certain pixel statistical features of the original image and the distorted image appeared, such as the structural similarity algorithm (Structural Similarity Index Measurement, SSIM).
  • the basic process of a full reference video quality evaluation algorithm is to extract multiple visual statistical features from the original and distorted video images to form feature vectors, and to estimate the degree of distortion of the video image by comparing the distance between the feature vectors.
  • full reference video quality evaluation algorithms also include the Video Quality Model (VQM) algorithm and the MOtion based Video Integrity Evaluation index (MOVIE) algorithm.
  • a disadvantage of existing full reference video quality algorithms is that it is difficult to meet the requirements of quality prediction accuracy and low computational complexity at the same time. Because of its simple calculation, PSNR is widely used in video image processing systems with high real-time requirements, such as video coding; however, experimental results show that the correlation between video quality scores calculated by PSNR and real subjective scores is poor.
  • Advanced video quality evaluation algorithms such as VQM and MOVIE can effectively predict video quality and approximate the subjective perceived quality score of the human eye.
  • however, the calculation of the VQM and MOVIE algorithms is extremely complex, so they can only be applied to offline computation of video quality. Since a video encoder needs to calculate the distortion of the reconstructed image in real time during encoding and select encoding parameters accordingly, video quality evaluation algorithms such as VQM and MOVIE cannot be applied in a video encoder.
  • the embodiments of the present invention provide a video quality evaluation method, apparatus, device, and storage medium, which reduce the computational complexity while ensuring the accuracy of the video quality estimate, and can be applied to video image processing systems with strict real-time requirements, such as video coding.
  • An embodiment of the present invention provides a video quality evaluation method, where the method includes:
  • dividing each frame image of the video to be quality-evaluated into image blocks according to a preset size;
  • determining a distortion metric value of each image block according to the standard deviation of the spatio-temporal gradient of the image block and the mean square error of its pixel values;
  • determining a distortion metric value of each frame image according to the distortion metric values of the image blocks contained in that frame image; and
  • determining a distortion metric value of the video according to the distortion metric values of the frame images.
  • An embodiment of the present invention provides a video quality evaluation apparatus, where the apparatus includes:
  • a dividing module configured to divide each frame of the video to be quality-evaluated into image blocks according to a preset size;
  • a first determining module configured to determine a distortion metric value of each image block according to a standard deviation of a space time domain gradient of each image block and a mean square error of the pixel value
  • a second determining module configured to determine a distortion metric value of each frame image according to a distortion metric value of the image block included in each frame image
  • a third determining module configured to determine a distortion metric value of the video according to the distortion metric of each frame image.
  • An embodiment of the present invention provides a video quality evaluation apparatus, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor implements the above video quality evaluation method when executing the program.
  • Embodiments of the present invention provide a computer readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the video quality evaluation method described above.
  • An embodiment of the present invention provides a video quality evaluation method, apparatus, device, and storage medium. The method includes: first, dividing each frame image of the video to be quality-evaluated into image blocks according to a preset size; then, determining the distortion metric value of each image block from the standard deviation of its spatio-temporal gradient and the mean square error of its pixel values; further, determining the distortion metric value of each frame image from the distortion metric values of the image blocks it contains; and finally, determining the distortion metric value of the video from the distortion metric values of the frame images.
  • FIG. 1 is a schematic flowchart of an implementation process of a video quality evaluation method according to an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of an implementation process of a video quality evaluation method according to Embodiment 2 of the present invention
  • FIG. 3 is a first template for calculating a horizontal gradient of a pixel according to Embodiment 2 of the present invention;
  • FIG. 4 is a second template for calculating a vertical gradient of a pixel according to Embodiment 2 of the present invention;
  • FIG. 5 is a third template for calculating a time domain gradient of a pixel according to Embodiment 2 of the present invention.
  • FIG. 6 is a correlation scatter diagram between the video distortion scores calculated by a video quality evaluation method according to Embodiment 3 of the present invention and the subjective DMOS scores of the LIVE data set;
  • FIGS. 7 to 9 are the corresponding correlation scatter diagrams for the VQM, PSNR, and SSIM methods, respectively;
  • FIG. 10 is a schematic structural diagram of a video quality evaluation apparatus according to Embodiment 4 of the present invention.
  • FIG. 11 is a schematic diagram of a hardware entity of a terminal according to an embodiment of the present invention.
  • the embodiment of the present invention provides a video quality evaluation method, which is applied to a video quality evaluation device, and the video quality evaluation device includes, but is not limited to, a terminal such as a computer, a tablet computer, or a smart phone.
  • FIG. 1 is a schematic flowchart of an implementation process of a video quality evaluation method according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
  • Step S101 dividing each image frame in the video that needs to be subjected to quality evaluation into image blocks according to a preset size
  • the size of the image block can be set according to actual needs, and is generally an image block with equal numbers of rows and columns, such as 8*8 or 16*16.
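The block partitioning of step S101 can be sketched as follows (a minimal NumPy sketch; the function name and the choice to discard edge pixels that do not fill a whole block are illustrative, not part of the patent):

```python
import numpy as np

def split_into_blocks(frame, block_size=8):
    """Split a 2-D frame into non-overlapping block_size x block_size tiles.

    Rows/columns that do not fill a whole block are discarded here for
    simplicity; a production implementation might pad instead.
    """
    h, w = frame.shape
    h_trim, w_trim = h - h % block_size, w - w % block_size
    frame = frame[:h_trim, :w_trim]
    blocks = frame.reshape(h_trim // block_size, block_size,
                           w_trim // block_size, block_size)
    # Reorder axes so each tile is contiguous, then flatten the tile grid.
    return blocks.swapaxes(1, 2).reshape(-1, block_size, block_size)

frame = np.arange(32 * 32, dtype=np.float64).reshape(32, 32)
blocks = split_into_blocks(frame, 8)
print(blocks.shape)  # (16, 8, 8)
```

The same routine serves both the first (distorted) and second (reference) video, since step S202 requires both to be divided with the same block size.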
  • Step S102 determining a distortion metric value of each image block according to the standard deviation of the spatio-temporal gradient of each image block and the mean square error of its pixel values;
  • step S102 further includes:
  • Step S102a determining a spatio-temporal gradient of each pixel in the image block
  • the image block includes at least one pixel.
  • the step S102a first calculates the horizontal gradient, vertical gradient, and temporal gradient of each pixel in the image block, and then calculates the spatio-temporal gradient of each pixel from these three gradients according to formula (1-1):
  • G_st = sqrt(G_h² + G_v² + G_t²)  (1-1)
  • where G_st is the spatio-temporal gradient of the pixel, G_h is the horizontal gradient of the pixel, G_v is the vertical gradient of the pixel, and G_t is the temporal gradient of the pixel.
  • Step S102b determining a standard deviation of the spatio-temporal gradient of the image block
  • Step S102c determining a mean square error of a pixel value of the image block
  • Step S102d Determine a distortion metric value of the image block according to the standard deviation of the spatio-temporal gradient of the image block and the mean square error of its pixel values.
  • the distortion metric value of the image block is determined according to formula (1-2):
  • D = log(MSE / σ)  (1-2)
  • where D is the distortion metric value of the image block, MSE is the mean square error of the pixel values of the image block, and σ is the standard deviation of the spatio-temporal gradient of the image block.
  • the human eye is insensitive to distortion in edge or texture regions of an image but sensitive to distortion in flat regions; it is likewise insensitive to distortion in fast-moving video images. To account for these visual characteristics of the human eye, the distortion of an image block is divided by the standard deviation of its spatio-temporal gradient values on the basis of the original video distortion, reflecting the reduced visual sensitivity of the human eye to distortion in regions of high spatio-temporal complexity. In this way, not only is the accuracy of the video quality estimate ensured, but the computational complexity is also reduced.
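The per-block adjustment just described can be sketched as follows. The formula text states only that the MSE is divided by the gradient standard deviation and a logarithm is taken; the natural logarithm and the eps guard below are assumptions, as the original formula image is not reproduced here:

```python
import numpy as np

def block_distortion(orig_block, dist_block, st_grad_std, eps=1e-6):
    """Formula (1-2) as described in the text: the block's pixel MSE is
    divided by the standard deviation of its spatio-temporal gradient,
    and the logarithm of the ratio is the block's distortion metric.
    eps guards against division by zero on perfectly flat blocks (an
    implementation choice, not from the patent)."""
    mse = np.mean((np.asarray(orig_block, dtype=np.float64)
                   - np.asarray(dist_block, dtype=np.float64)) ** 2)
    return float(np.log((mse + eps) / (st_grad_std + eps)))
```

For the same MSE, a block with a larger gradient standard deviation (more spatio-temporal detail) yields a smaller D, which matches the masking argument above.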
  • Step S103 determining a distortion metric value of each frame image according to a distortion metric value of an image block included in each frame image in the video;
  • the average value of the distortion metric values of all the image blocks included in a frame image is determined as the distortion metric value of that frame image.
  • Step S104 Determine a distortion metric value of the video according to a distortion metric value of each frame image included in the video.
  • the average value of the distortion metric values of the images of all the frames included in the video is determined as the distortion metric value of the video.
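Steps S103 and S104 amount to two nested averages; a minimal sketch (function names illustrative):

```python
import numpy as np

def frame_distortion(block_scores):
    """Step S103: average the distortion metric values of all blocks in a frame."""
    return float(np.mean(block_scores))

def video_distortion(per_frame_block_scores):
    """Step S104: average the per-frame distortion metric values over all frames."""
    return float(np.mean([frame_distortion(s) for s in per_frame_block_scores]))

# Two frames: block scores [1, 3] (frame mean 2) and [5] (frame mean 5).
print(video_distortion([[1.0, 3.0], [5.0]]))  # 3.5
```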
  • the method includes: first, dividing each frame image of the video to be quality-evaluated into image blocks according to a preset size; then, determining the distortion metric value of each image block from the standard deviation of its spatio-temporal gradient and the mean square error of its pixel values; further, determining the distortion metric value of each frame image from the distortion metric values of the image blocks it contains; and finally, determining the distortion metric value of the video from the distortion metric values of the frame images.
  • the embodiment of the present invention further provides a video quality evaluation method, which is applied to a video quality evaluation apparatus, and the video quality evaluation apparatus includes, but is not limited to, a terminal such as a computer, a tablet computer, or a smart phone.
  • FIG. 2 is a schematic flowchart of an implementation process of a video quality evaluation method according to an embodiment of the present invention. As shown in FIG. 2, the method includes the following steps:
  • Step S201 acquiring a first video and a second video.
  • the first video is a video requiring quality evaluation, that is, a video in which distortion has occurred.
  • the second video is the original video corresponding to the first video, that is, a video in which no distortion has occurred.
  • acquiring the first video includes acquiring the pixel values of all pixels in each frame image of the first video.
  • acquiring the second video comprises acquiring pixel values of all pixels in each frame image of the second video.
  • Step S202 dividing each frame image of the first video and the second video into image blocks according to a preset size
  • when the first video and the second video are divided, they are divided into blocks of the same size.
  • the first video and the second video are divided into image blocks according to a size of 4*4.
  • Step S203 determining a standard deviation of the spatio-temporal gradient of each image block in the first video
  • step S203 further includes:
  • Step S203a determining, according to the preset first template, a horizontal gradient of each pixel in each image block in the first video;
  • the first template is a template for calculating a horizontal gradient of a pixel
  • the first template is as shown in FIG. 3
  • the first column of the first template holds the weight of the column to the left of the pixel to be calculated,
  • the second column holds the weight of the column in which the pixel to be calculated lies,
  • and the third column holds the weight of the column to the right of the pixel to be calculated.
  • f(k, i, j) is the pixel value of the pixel at position (i, j) in the kth frame of the first video.
  • Step S203b determining, according to the preset second template, a vertical gradient of each pixel in each image block in the first video
  • the second template is a template for calculating a vertical gradient of a pixel
  • the second template is as shown in FIG. 4
  • the first row of the second template is a weight of a row above the pixel to be calculated
  • the second line is the weight of the row of pixels to be calculated
  • the third row is the weight of the row below the pixel to be calculated.
  • f(k, i, j) is the pixel value of the pixel at position (i, j) in the kth frame of the first video.
  • Step S203c Calculate a time domain gradient of each pixel in each image block in the first video according to a preset third template.
  • the third template is a template for calculating the temporal gradient of a pixel, and is shown in FIG. 5.
  • the third template consists of three 3×3 matrices: the left 3×3 matrix holds the weights for the frame preceding the frame containing the pixel to be calculated, the middle 3×3 matrix holds the weights for the frame containing the pixel to be calculated, and the right 3×3 matrix holds the weights for the frame following it.
  • the first template shown in FIG. 3, the second template shown in FIG. 4, and the third template shown in FIG. 5 are merely exemplary illustrations; in practical applications, the first, second, and third templates can be set according to actual needs.
  • the sizes of the first, second, and third templates may be N×N, where N is an odd number greater than 1, such as 3×3, 5×5, or 7×7.
  • the weights on the left and right sides of the pixel to be calculated can be set according to actual needs.
  • the setting should follow two principles: the absolute values of the weights at left-right symmetric positions are the same, with the weight on one side positive and on the other side negative;
  • and the closer a pixel is to the pixel to be calculated, the larger the absolute value of its weight.
  • in FIG. 3, the second column is the column of the pixel to be calculated, so the weight of the second column is 0; the absolute values of the weights at symmetric positions in the first and third columns are the same, with the first column negative and the third column positive.
  • the weight (6) of the pixel point close to the pixel to be calculated is larger than the weight (3) of the pixel point farther from the pixel to be calculated.
  • the weight setting of the second template needs to ensure that the weight of the row of the pixel to be calculated is 0, and the absolute value of the weight at the symmetric position of the upper side and the lower side of the pixel to be calculated is the same. And the weight of one side is positive, the weight of one side is negative, and the absolute value of the weight of the pixel closer to the pixel to be calculated is larger.
  • in FIG. 4, the second row is the row of the pixel to be calculated, so the weight of the second row is 0; the absolute values of the weights at symmetric positions in the first and third rows are the same, with the first row negative and the third row positive.
  • the weight (6) of the pixel point close to the pixel to be calculated is larger than the weight (3) of the pixel point farther from the pixel to be calculated.
  • the weight setting of the third template must ensure that the weights for the frame containing the pixel to be calculated are 0; that the absolute values of the weights at symmetric positions in the preceding and following frames are the same, with the weights on one side positive and on the other side negative; and that the absolute value of the weight of a pixel closer to the pixel to be calculated is larger.
  • in FIG. 5, the middle 3×3 matrix, corresponding to the frame of the pixel to be calculated, is all 0; the left 3×3 matrix holds the weights for the preceding frame and the right 3×3 matrix the weights for the following frame, and the absolute values of the weights at symmetric positions of the left and right matrices are the same, with one side positive and the other negative.
  • the weight (6) of the pixel near the pixel to be calculated is larger than the weight (3) of the pixel far from the pixel to be calculated.
  • Step S203d determining the spatio-temporal gradient of each pixel of each image block in the first video according to formula (1-1);
  • Step S203e Determine a standard deviation of the spatial time domain gradient of each image block according to a spatial time domain gradient of each pixel of each image block.
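Steps S203a to S203e can be sketched as follows, using the weights described for the first and second templates (6 beside the pixel, 3 on the diagonals, opposite signs on either side). The exact temporal weights appear only in Figure 5, so a matching 3/6/3 pattern for the previous and next frames is assumed here:

```python
import numpy as np

# First template (FIG. 3, as described): own column 0, left negative,
# right positive, weight 6 beside the pixel and 3 on the diagonals.
H_KERNEL = np.array([[-3, 0, 3],
                     [-6, 0, 6],
                     [-3, 0, 3]], dtype=np.float64)
# Second template (FIG. 4): the transpose, top row negative, bottom positive.
V_KERNEL = H_KERNEL.T
# Assumed temporal weights (FIG. 5): current frame all 0, previous frame
# negative, next frame positive, centre weight 6 and neighbours 3.
T_KERNEL = np.array([[3, 3, 3],
                     [3, 6, 3],
                     [3, 3, 3]], dtype=np.float64)

def conv3x3_valid(img, kernel):
    """Plain 3x3 correlation over interior pixels only (no padding)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for di in range(3):
        for dj in range(3):
            out += kernel[di, dj] * img[di:di + h - 2, dj:dj + w - 2]
    return out

def spatiotemporal_gradient(prev_f, cur_f, next_f):
    """Formula (1-1): combine horizontal, vertical and temporal gradients."""
    gh = conv3x3_valid(cur_f, H_KERNEL)
    gv = conv3x3_valid(cur_f, V_KERNEL)
    gt = conv3x3_valid(next_f, T_KERNEL) - conv3x3_valid(prev_f, T_KERNEL)
    return np.sqrt(gh ** 2 + gv ** 2 + gt ** 2)
```

On a static horizontal ramp this yields a constant horizontal gradient and zero vertical and temporal gradients; step S203e is then simply `np.std(...)` over the spatio-temporal gradient values inside each block.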
  • Step S204 determining a mean square error of a pixel value of each image block in the first video
  • the mean square error MSE of the pixel values of an image block in the kth frame image of the first video is calculated according to formula (2-4):
  • MSE = (1 / (M*N)) Σ_{(i,j) in block} [f(k, i, j) - g(k, i, j)]²  (2-4)
  • where M*N is the preset image block size, f(k, i, j) is the pixel value of the pixel at position (i, j) in the kth frame of the first video, and g(k, i, j) is the pixel value of the pixel at position (i, j) in the kth frame of the second video.
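The per-block mean square error of formula (2-4), between the distorted video f and the reference video g, can be sketched directly (function name illustrative):

```python
import numpy as np

def block_mse(f_block, g_block):
    """Formula (2-4): mean of the squared pixel differences between the
    distorted block f and the co-located reference block g."""
    diff = (np.asarray(f_block, dtype=np.float64)
            - np.asarray(g_block, dtype=np.float64))
    return float(np.mean(diff ** 2))

print(block_mse(np.full((4, 4), 2.0), np.zeros((4, 4))))  # 4.0
```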
  • Step S205 determining a distortion metric value of each image block in the first video according to a standard deviation of a space time domain gradient and a mean square error of a pixel value of each image block in the first video;
  • the distortion metric value of each image block in the first video is determined according to formula (1-2).
  • Step S206 determining a distortion metric value of each frame image according to a distortion metric value of an image block included in each frame image in the first video;
  • Step S207 determining a distortion metric value of the video according to a distortion metric value of each frame image in the first video.
  • the method includes: first acquiring a first video and a second video, and dividing each frame image of both into image blocks according to a preset size; then determining a distortion metric value for each image block in the first video from the standard deviation of its spatio-temporal gradient and the mean square error of its pixel values; then determining a distortion metric value for each frame image from the distortion metric values of the image blocks contained in that frame image of the first video; and finally determining the distortion metric value of the video from the distortion metric values of the frame images of the first video.
  • In this way, the accuracy of the video quality estimate is ensured while the computational complexity is reduced, so the method can be applied to video image processing systems with strict real-time requirements, such as video coding.
  • the embodiment of the present invention first provides a video quality evaluation method to overcome the problem that the existing video quality evaluation method cannot simultaneously achieve video quality prediction accuracy and maintain low computational complexity.
  • the method includes the following steps:
  • in the first step, a horizontal gradient, a vertical gradient, and a temporal gradient are calculated for each pixel of the video image, and the spatio-temporal gradient of the pixel is computed from them;
  • in the second step, the standard deviation of the pixel spatio-temporal gradients within each image block is computed. The standard deviation σ of the pixel spatio-temporal gradient values may be computed in units of 8×8 or 16×16 blocks, and σ serves as a measure of the spatio-temporal complexity of the image block content.
  • in the third step, the mean square error (MSE) of each image block is computed, the MSE is divided by the spatio-temporal gradient standard deviation of the image block, and the logarithm of the ratio is taken as the final distortion metric of the image block.
  • that is, the traditional mean square error is used as the objective distortion criterion of the video, and on the basis of formula (1-2) the final distortion metric D of the video is adjusted according to the spatio-temporal content complexity σ of the video.
  • the video quality evaluation method provided in the embodiment of the present invention is compared with the PSNR algorithm, the SSIM algorithm, and the VQM algorithm in the prior art.
  • the PSNR algorithm compares each frame of the original video and the distorted video pixel by pixel. It is based on independent pixel differences and ignores the influence of sequence content and viewing conditions on distortion visibility, so it tends to correlate poorly with subjectively perceived video quality.
  • the SSIM algorithm is an algorithm for extracting and calculating a certain pixel statistical feature of an image of an original video and a distorted video.
  • the VQM algorithm decomposes the original video and the distorted video into different channels (such as edge, luminance, chrominance, and frame difference) through different filters (such as edge detection), and then extracts pixel-level features and spatio-temporal image-block-level statistical features.
  • pixel-level features include, for each pixel, the gradient magnitude, gradient direction, color difference, contrast, and frame difference.
  • the statistical features of the spatio-temporal image blocks are computed within 8*8 image blocks (such as the mean and standard deviation of the pixel-level features), thereby integrating the pixel-level features into spatio-temporal image-block-level features.
  • finally, the distortion of the video sequence is obtained through spatio-temporal integration and weighted fusion of the distortion of each feature.
  • the evaluation criteria used include Pearson Linear Correlation Coefficient (PLCC) and Spearman Rank Order Correlation Coefficient (SROCC).
  • PLCC mainly evaluates the prediction accuracy of the algorithm, that is, evaluates the linear fit between the predicted distortion and the real MOS.
  • SROCC mainly evaluates the monotonicity of the algorithm's predictions, that is, whether the ordering of the predicted distortions is consistent with the ordering of the real MOS values.
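The two criteria can be sketched directly: SROCC is the Pearson correlation of the ranks. Tie handling is omitted in this sketch, whereas a library implementation such as scipy.stats.spearmanr averages tied ranks:

```python
import numpy as np

def plcc(pred, mos):
    """Pearson Linear Correlation Coefficient between predictions and MOS."""
    return float(np.corrcoef(pred, mos)[0, 1])

def srocc(pred, mos):
    """Spearman Rank Order Correlation: Pearson correlation of the ranks.
    (No special treatment of ties in this sketch.)"""
    def ranks(x):
        return np.argsort(np.argsort(np.asarray(x))).astype(np.float64)
    return float(np.corrcoef(ranks(pred), ranks(mos))[0, 1])

# A perfectly monotone but nonlinear prediction: SROCC is exactly 1
# (ranks agree), while PLCC is slightly below 1 (the fit is not linear).
pred = [1.0, 2.0, 3.0, 4.0]
mos = [1.0, 4.0, 9.0, 16.0]
```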
  • the video subjective quality evaluation data set used in the experiment is the LIVE data set of the Laboratory for Image and Video Engineering (LIVE) of the University of Texas, which includes four types of video distortion: wireless network transmission distortion, wired network transmission distortion, H.264 compression distortion, and MPEG-2 compression distortion.
  • Experimental comparison methods include the classic objective video quality assessment methods PSNR and SSIM, and the video quality model VQM established by the National Telecommunications and Information Administration (NTIA).
  • Table 1 shows the PLCC correlation coefficients with the subjective scoring data for the video quality evaluation method provided by the embodiment of the present invention and for the PSNR, SSIM, and VQM algorithms.
  • the correlation between the video distortion predicted by the video quality evaluation algorithm of the present invention and the actual subjective DMOS score of the LIVE data set is shown in FIG. 6 (the abscissa is the video distortion value predicted by the method of the present invention, and the ordinate is the actual subjective DMOS value of the video).
  • the correlation between the video quality score predicted by the VQM, PSNR, and SSIM comparison methods and the actual subjective DMOS score of the LIVE data set is shown in Figures 7-9, respectively.
  • the values calculated by the method provided by the embodiment of the present invention and by the VQM method are both distortion estimates of the video (a larger value indicates greater video distortion), so the abscissa and ordinate data in FIG. 6 and FIG. 7 are positively correlated.
  • the PSNR method and the SSIM method calculate video quality estimates (a smaller value indicates greater distortion), so the abscissa and ordinate data in FIGS. 8 and 9 are negatively correlated.
  • the simulation results show that the video quality evaluation method provided by the embodiment of the present invention achieves a video quality prediction result that is more consistent with the subjective perceived quality of the human eye than the prior art, while requiring only low computational complexity.
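The PLCC comparison described above is simply a Pearson linear correlation between each method's predicted scores and the subjective DMOS values. A minimal sketch follows; the score vectors below are invented for illustration and are not LIVE data:

```python
import numpy as np

def plcc(predicted, subjective):
    """Pearson linear correlation coefficient between two score vectors."""
    p = np.asarray(predicted, dtype=float)
    s = np.asarray(subjective, dtype=float)
    p = p - p.mean()
    s = s - s.mean()
    return float((p * s).sum() / np.sqrt((p * p).sum() * (s * s).sum()))

# Hypothetical distortion estimates vs. DMOS values: both grow with
# distortion, so the correlation is positive, as in FIG. 6 and FIG. 7.
pred_distortion = [0.8, 1.9, 3.1, 4.2, 5.0]
dmos = [20.0, 31.0, 44.0, 58.0, 66.0]
print(plcc(pred_distortion, dmos) > 0.99)  # → True
```

In published VQA studies a nonlinear regression is often fitted before computing PLCC; the raw correlation above is the simplest variant.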
  • FIG. 10 is a schematic structural diagram of a video quality evaluation apparatus according to Embodiment 4 of the present invention.
  • the apparatus 1000 includes: a partitioning module 1001, a first determining module 1002, a second determining module 1003, and a third determining module 1004, wherein:
  • the dividing module 1001 is configured to divide each frame image in the video that needs to be quality-evaluated into image blocks according to a preset size.
  • the first determining module 1002 is configured to determine the distortion metric value of each image block according to the standard deviation of the spatio-temporal gradients of the image block and the mean square error of its pixel values.
  • the first determining module 1002 further includes:
  • a first determining unit configured to determine a spatio-temporal gradient of each pixel in the image block, wherein the image block includes at least one pixel;
  • the first determining unit further includes: a first determining subunit configured to determine a horizontal gradient, a vertical gradient, and a time domain gradient of each pixel in the image block; and a second determining subunit configured to determine the spatio-temporal gradient of each pixel in the image block according to the horizontal gradient, the vertical gradient, and the time domain gradient of that pixel.
  • the second determining subunit further includes: a determining sub-subunit configured to determine the spatio-temporal gradient of each pixel in the image block according to formula (1-1), where g is the spatio-temporal gradient of the pixel, g_x is the horizontal gradient of the pixel, g_y is the vertical gradient of the pixel, and g_t is the time domain gradient of the pixel.
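A common reading of formula (1-1) is the Euclidean magnitude g = sqrt(g_x² + g_y² + g_t²). Since the formula image is not reproduced in this text, that form, as well as the simple forward differences below, are assumptions made for illustration:

```python
import numpy as np

def spatio_temporal_gradient(prev_frame, curr_frame):
    """Per-pixel spatio-temporal gradient magnitude of curr_frame.

    Assumes formula (1-1) is g = sqrt(g_x^2 + g_y^2 + g_t^2); the
    forward-difference operators below are an illustrative choice.
    """
    f = curr_frame.astype(np.float64)
    gx = np.zeros_like(f)
    gx[:, :-1] = f[:, 1:] - f[:, :-1]               # horizontal gradient
    gy = np.zeros_like(f)
    gy[:-1, :] = f[1:, :] - f[:-1, :]               # vertical gradient
    gt = f - prev_frame.astype(np.float64)          # time domain gradient
    return np.sqrt(gx ** 2 + gy ** 2 + gt ** 2)
```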
  • a second determining unit configured to determine a standard deviation of the spatio-temporal gradients of the image block;
  • a third determining unit configured to determine a mean square error of the pixel values of the image block;
  • a fourth determining unit configured to determine the distortion metric value of the image block according to the standard deviation of the spatio-temporal gradients of the image block and the mean square error of its pixel values.
  • the fourth determining unit further includes:
  • a third determining subunit configured to determine the distortion metric value of each image block according to formula (1-2), where D is the distortion metric value of the image block, MSE is the mean square error of the pixel values of the image block, and σ is the standard deviation of the spatio-temporal gradients of the image block.
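Formula (1-2) itself is likewise not reproduced in this text. The sketch below computes its two stated inputs (the block MSE and the standard deviation σ of the block's spatio-temporal gradients) and combines them with a hypothetical masking-style ratio D = MSE / (1 + σ), purely as a placeholder:

```python
import numpy as np

def block_distortion(ref_block, dist_block, grad_block):
    """Distortion metric value of one image block.

    mse and sigma are the inputs named for formula (1-2); the combination
    D = mse / (1 + sigma) is a HYPOTHETICAL placeholder, since the exact
    formula is not reproduced in this excerpt.
    """
    ref = ref_block.astype(np.float64)
    dist = dist_block.astype(np.float64)
    mse = np.mean((ref - dist) ** 2)                 # pixel-value MSE
    sigma = np.std(grad_block.astype(np.float64))    # gradient std. dev.
    return mse / (1.0 + sigma)
```

The ratio form reflects the usual masking intuition: a given MSE is less visible in blocks with strong spatio-temporal activity.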
  • the second determining module 1003 is configured to determine a distortion metric value of each frame image according to a distortion metric value of the image block included in each frame image.
  • the second determining module includes: a fifth determining unit configured to determine an average value of the distortion metric values of all image blocks included in each frame image as the distortion metric value of that frame image;
  • the third determining module 1004 is configured to determine a distortion metric value of the video according to the distortion metric value of each frame image.
  • the third determining module includes: a sixth determining unit configured to determine an average value of the distortion metric values of all frame images included in the video as the distortion metric value of the video.
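The fifth and sixth determining units are both plain averages; as a short sketch:

```python
import numpy as np

def frame_distortion(block_metrics):
    # Fifth determining unit: frame score = mean of its block scores.
    return float(np.mean(block_metrics))

def video_distortion(frame_metrics):
    # Sixth determining unit: video score = mean of its frame scores.
    return float(np.mean(frame_metrics))

print(video_distortion([frame_distortion([1.0, 3.0]),
                        frame_distortion([2.0, 4.0])]))  # → 2.5
```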
  • if the video quality evaluation method described above is implemented in the form of a software function module and sold or used as a standalone product, it may also be stored in a computer readable storage medium.
  • based on such an understanding, the technical solution of the embodiments of the present invention may be embodied in the form of a software product stored in a storage medium, including a plurality of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the various embodiments of the present invention.
  • the foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read only memory (ROM), a magnetic disk, or an optical disk.
  • an embodiment of the present invention provides a video quality evaluation apparatus, such as a terminal, including a memory, a processor, and a computer program stored on the memory and operable on the processor, wherein the processor, when executing the program, implements the video quality evaluation method described above.
  • an embodiment of the present invention provides a computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the video quality evaluation method described above.
  • FIG. 11 is a schematic diagram of a hardware entity of a terminal according to an embodiment of the present invention.
  • the hardware entity of the terminal 1100 includes: a processor 1101, a communication interface 1102, and a memory 1103, where:
  • the processor 1101 typically controls the overall operation of the terminal 1100.
  • Communication interface 1102 can cause terminal 1100 to communicate with other terminals or servers over a network.
  • the memory 1103 is configured to store instructions and applications executable by the processor 1101, and may also cache data to be processed or already processed by the processor 1101 and by each module in the terminal 1100 (e.g., image data, audio data, voice communication data, and video communication data); it can be implemented by a flash memory (FLASH) or a random access memory (RAM).
  • a computer program (also referred to as a program, software, software application, script, or code) can be written in any programming language (including compiled or interpreted languages, and declarative or procedural languages) and can be deployed in any form (including as a stand-alone program, or as a module, component, subroutine, object, or other unit suitable for use in a computing environment).
  • a computer program can, but does not necessarily, correspond to a file in a file system.
  • the program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
  • the computer program can be deployed to be executed on one or more computers located at one site or distributed across multiple sites and interconnected by a communication network.
  • the processes and logic flows described in the specification can be performed by one or more programmable processors executing one or more computer programs to perform actions by operating on input data and generating output.
  • the above described processes and logic flows can also be performed by dedicated logic circuitry, and the apparatus can also be implemented as dedicated logic circuitry, such as an FPGA or ASIC.
  • processors suitable for the execution of a computer program include, for example, a general purpose microprocessor and a special purpose microprocessor, and any one or more processors of any type of digital computer.
  • a processor will receive instructions and data from a read only memory or a random access memory or both.
  • the essential elements of a computer are a processor for performing actions in accordance with instructions and one or more memory devices for storing instructions and data.
  • a computer also generally includes one or more mass storage devices (e.g., magnetic disks, magneto-optical disks, or optical disks) for storing data, or is operatively coupled to receive data from or transfer data to such devices, or both; however, a computer need not have such devices.
  • the computer can be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio player or mobile video player, a game console, a global positioning system (GPS) receiver, or a mobile storage device.
  • Suitable devices for storing computer program instructions and data include all forms of non-volatile memory, media, and storage devices, including, for example, semiconductor storage devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM discs.
  • the processor and memory can be supplemented by or included in dedicated logic circuitry.
  • Embodiments of the subject matter described in the specification can be implemented in a computing system.
  • the computing system includes a back-end component (e.g., a data server), or a middleware component (e.g., an application server), or a front-end component (e.g., a client computer having a graphical user interface or a web browser through which a user can interact with an implementation of the subject matter described herein), or any combination of one or more such back-end, middleware, or front-end components.
  • the components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs) and wide area networks (WANs), inter-networks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks).
  • the features described in this application can be implemented on a smart television module (or a connected television module, a hybrid television module, etc.).
  • the smart TV module can include processing circuitry configured to integrate more traditional television program sources (eg, program sources received via cable, satellite, air, or other signals) with Internet connectivity.
  • the smart TV module can be physically integrated into a television set or can include stand-alone devices such as set top boxes, Blu-ray or other digital media players, game consoles, hotel television systems, and other ancillary equipment.
  • the smart TV module can be configured to enable viewers to search for and find videos, movies, pictures or other content on the network, on local cable channels, on satellite television channels, or on local hard drives.
  • a set top box (STB) or set top box unit (STU) may include an information appliance device that contains a tuner and is coupled to the television set and an external signal source, tuning the signal into content that is then displayed on the television screen or another display device.
  • the smart TV module can be configured to provide a home screen or a top screen including icons for a variety of different applications (eg, web browsers and multiple streaming services, connecting cable or satellite media sources, other network "channels", etc.).
  • the smart TV module can also be configured to provide an electronic program guide to the user.
  • the companion application of the smart TV module can be run on a mobile terminal to provide the user with additional information about available programs, and to enable the user to control the smart TV module.
  • this feature can be implemented on a portable computer or other personal computer (PC), smart phone, other mobile phone, handheld computer, tablet PC, or other computing device.
  • in the embodiments of the present invention, the first video and the second video are first acquired, and each frame image of the first video and the second video is divided into image blocks according to a preset size; the distortion metric value of each image block is then determined according to the standard deviation of the spatio-temporal gradients of the image block and the mean square error of its pixel values; the distortion metric value of each frame image in the first video is further determined according to the distortion metric values of the image blocks included in that frame image; and finally the distortion metric value of the video is determined according to the distortion metric value of each frame image in the first video. In this way, not only can the video quality be estimated accurately, but the computational complexity is also reduced, so the method can be applied to video image processing systems with high real-time requirements, such as video coding.
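Putting the steps of this summary together, a compact end-to-end sketch might look as follows. The gradient operator, the block size of 8, computing gradients on the distorted video, and the MSE/σ combination are all illustrative assumptions, since the excerpt does not reproduce formulas (1-1) and (1-2) exactly:

```python
import numpy as np

BLOCK = 8  # preset block size (the embodiment leaves this configurable)

def evaluate_video(ref_frames, dist_frames):
    """Block -> frame -> video distortion flow described in the summary."""
    frame_scores = []
    prev = dist_frames[0].astype(np.float64)
    for ref, dist in zip(ref_frames, dist_frames):
        r = ref.astype(np.float64)
        d = dist.astype(np.float64)
        # Spatio-temporal gradients of the distorted frame (forward diffs).
        gx = np.zeros_like(d); gx[:, :-1] = d[:, 1:] - d[:, :-1]
        gy = np.zeros_like(d); gy[:-1, :] = d[1:, :] - d[:-1, :]
        gt = d - prev
        grad = np.sqrt(gx ** 2 + gy ** 2 + gt ** 2)
        # Per-block distortion: MSE masked by gradient spread (hypothetical).
        scores = []
        h, w = d.shape
        for y in range(0, h - BLOCK + 1, BLOCK):
            for x in range(0, w - BLOCK + 1, BLOCK):
                mse = np.mean((r[y:y+BLOCK, x:x+BLOCK]
                               - d[y:y+BLOCK, x:x+BLOCK]) ** 2)
                sigma = np.std(grad[y:y+BLOCK, x:x+BLOCK])
                scores.append(mse / (1.0 + sigma))
        frame_scores.append(np.mean(scores))  # frame score = mean block score
        prev = d
    return float(np.mean(frame_scores))       # video score = mean frame score
```

All operations are single-pass block statistics, which is consistent with the low computational complexity claimed above.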

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a video quality evaluation method, apparatus, and device, and a storage medium. The method comprises: dividing each frame image of a video, on which quality evaluation is to be performed, into image blocks of a preset size; determining a distortion metric value of each image block according to a standard deviation of spatio-temporal gradients of each image block and a mean square error of pixel values; determining a distortion metric value of each frame image according to the distortion metric values of the image blocks included in that frame image; and determining a distortion metric value of the video according to the distortion metric value of each frame image.
PCT/CN2017/119261 2017-02-24 2017-12-28 Video quality evaluation method, apparatus and device, and storage medium Ceased WO2018153161A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710102413.4A CN108513132B (zh) 2017-02-24 2017-02-24 一种视频质量评价方法及装置
CN201710102413.4 2017-02-24

Publications (1)

Publication Number Publication Date
WO2018153161A1 true WO2018153161A1 (fr) 2018-08-30

Family

ID=63253122

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/119261 Ceased WO2018153161A1 (fr) 2017-02-24 2017-12-28 Procédé, appareil, et dispositif d'évaluation de qualité vidéo, et support de stockage

Country Status (2)

Country Link
CN (1) CN108513132B (fr)
WO (1) WO2018153161A1 (fr)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110401832B (zh) * 2019-07-19 2020-11-03 南京航空航天大学 一种基于时空管道建模的全景视频客观质量评估方法
CN111083468B (zh) * 2019-12-23 2021-08-20 杭州小影创新科技股份有限公司 一种基于图像梯度的短视频质量评价方法及系统
CN112365418B (zh) * 2020-11-11 2024-05-03 抖音视界有限公司 一种图像失真评测的方法、装置及计算机设备
CN114332088B (zh) * 2022-03-11 2022-06-03 电子科技大学 一种基于运动估计的全参考视频质量评估方法
CN116033144A (zh) * 2023-01-17 2023-04-28 维沃移动通信有限公司 视频质量评估方法、装置、电子设备及可读存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101742355A (zh) * 2009-12-24 2010-06-16 厦门大学 基于空时域特征提取的无线视频部分参考测评方法
CN102984540A (zh) * 2012-12-07 2013-03-20 浙江大学 一种基于宏块域失真度估计的视频质量评价方法
CN103458265A (zh) * 2013-02-01 2013-12-18 深圳信息职业技术学院 一种视频质量评价方法、装置
US20140015923A1 (en) * 2012-07-16 2014-01-16 Cisco Technology, Inc. Stereo Matching for 3D Encoding and Quality Assessment
CN106028026A (zh) * 2016-05-27 2016-10-12 宁波大学 一种基于时空域结构的高效视频质量客观评价方法


Also Published As

Publication number Publication date
CN108513132B (zh) 2020-11-10
CN108513132A (zh) 2018-09-07

Similar Documents

Publication Publication Date Title
Ghadiyaram et al. A subjective and objective study of stalling events in mobile streaming videos
CN111193923B (zh) 视频质量评估方法、装置、电子设备及计算机存储介质
CN111918066B (zh) 视频编码方法、装置、设备及存储介质
WO2018153161A1 (fr) Procédé, appareil, et dispositif d'évaluation de qualité vidéo, et support de stockage
Gu et al. Hybrid no-reference quality metric for singly and multiply distorted images
Zhang et al. Subjective and objective quality assessment of panoramic videos in virtual reality environments
Moorthy et al. Visual quality assessment algorithms: what does the future hold?
US8804815B2 (en) Support vector regression based video quality prediction
CN103152600B (zh) 一种立体视频质量评价方法
KR102523149B1 (ko) 부트스트래핑을 통한 지각 품질 모델 불확실성의 정량화
CN105049838B (zh) 一种用于压缩立体视频质量的客观评价方法
US20210044791A1 (en) Video quality determination system and method
Zeng et al. 3D-SSIM for video quality assessment
Yang et al. Subjective quality assessment of screen content images
Ghadiyaram et al. A no-reference video quality predictor for compression and scaling artifacts
Nezhivleva et al. Comparing of Modern Methods Used to Assess the Quality of Video Sequences During Signal Streaming with and Without Human Perception
Jin et al. Quantifying the importance of cyclopean view and binocular rivalry-related features for objective quality assessment of mobile 3D video
CN111311584B (zh) 视频质量评估方法及装置、电子设备、可读介质
Keimel et al. Video is a cube
CN114630139A (zh) 直播视频的质量评估方法及其相关设备
Ortiz-Jaramillo et al. Content-aware objective video quality assessment
CN117478973A (zh) 针对插帧视频的无参考视频质量评价方法、系统及终端
CN116980604A (zh) 视频编码方法、视频解码方法及相关设备
CN104981841B (zh) 3d图像数据分割的方法和装置
CN113038129A (zh) 一种用于机器学习的数据样本获取的方法及设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17897421

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17897421

Country of ref document: EP

Kind code of ref document: A1