CN116471262A - Video quality evaluation method, apparatus, device, storage medium, and program product - Google Patents
- Publication number
- CN116471262A CN116471262A CN202310520120.3A CN202310520120A CN116471262A CN 116471262 A CN116471262 A CN 116471262A CN 202310520120 A CN202310520120 A CN 202310520120A CN 116471262 A CN116471262 A CN 116471262A
- Authority
- CN
- China
- Prior art keywords
- score
- video
- module
- motion estimation
- scene
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H04L65/80 — Responding to QoS
- H04L65/70 — Media network packetisation
- H04L65/75 — Media network packet handling
- H04N21/440227 — Processing of video elementary streams involving reformatting operations by decomposing into layers, e.g. base layer and one or more enhancement layers
- H04N21/440281 — Processing of video elementary streams involving reformatting operations by altering the temporal resolution, e.g. by frame skipping
- Y02P90/30 — Computing systems specially adapted for manufacturing
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Image Analysis (AREA)
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
Abstract
Description
Technical Field
The embodiments of the present application relate to the field of computer technology, and in particular to a video quality assessment method, apparatus, device, storage medium, and program product.
Background Art
With the continuous development of real-time video streaming technology, live video streaming has become an indispensable part of people's daily life. Whether it is video conferencing, online streaming, or live TV, high-quality video transmission is required. However, real-time video streaming suffers from many problems that affect the user experience, such as degraded video quality, increased latency, and buffering.
Typically, advanced coding standards can reduce the video bit rate while maintaining objective visual quality, bringing a smoother viewing experience to low-bandwidth viewers. However, existing encoders cannot fully meet users' objective requirements. On the one hand, real-time streaming requires the encoder to encode at high speed, which affects encoding quality. On the other hand, adjustments to encoding parameters (such as frame rate and bit rate) do not have a linear relationship with viewers' subjective experience. For example, when a video already has a high frame rate and bit rate, further increasing the encoding parameters does not significantly improve the viewer's subjective experience. Conversely, when a video has a low frame rate and bit rate, appropriately raising the frame rate brings a better viewing experience than raising the bit rate. Therefore, an accurate and efficient video quality assessment method is needed to evaluate video quality accurately for subsequent parameter decisions, so as to better meet user needs.
Summary of the Invention
Embodiments of the present application provide a video quality assessment method, apparatus, device, storage medium, and program product, which integrate the motion information of the video and closely combine the video scene with viewers' subjective preferences, so that the final video quality assessment result is more accurate and has a wider range of application.
In a first aspect, an embodiment of the present application provides a video quality assessment method, the method including:
obtaining an image frame sequence and video parameters of a video to be evaluated;
inputting the image frame sequence into a motion estimation module, a subjective image quality module, a content attractiveness module, and a scene module respectively, to obtain a motion estimation score, a subjective image quality score, a content attractiveness score, and a scene score, where the content attractiveness module is trained on samples obtained through data stratification;
inputting the video parameters into an objective parameter module to obtain an objective parameter score;
calculating a video quality score of the video to be evaluated based on the motion estimation score, the subjective image quality score, the content attractiveness score, the scene score, and the objective parameter score.
In a second aspect, an embodiment of the present application further provides a video quality assessment apparatus, including:
a data acquisition unit configured to obtain an image frame sequence and video parameters of a video to be evaluated;
a data fusion processing unit configured to input the image frame sequence into a motion estimation module, a subjective image quality module, a content attractiveness module, and a scene module respectively, to obtain a motion estimation score, a subjective image quality score, a content attractiveness score, and a scene score, where the content attractiveness module is trained on samples obtained through data stratification; and to input the video parameters into an objective parameter module to obtain an objective parameter score;
a score calculation unit configured to calculate a video quality score of the video to be evaluated based on the motion estimation score, the subjective image quality score, the content attractiveness score, the scene score, and the objective parameter score.
In a third aspect, an embodiment of the present application further provides a video quality assessment device, the device including:
one or more processors; and
a storage apparatus for storing one or more programs,
where, when the one or more programs are executed by the one or more processors, the one or more processors implement the video quality assessment method described in the embodiments of the present application.
In a fourth aspect, an embodiment of the present application further provides a non-volatile storage medium storing computer-executable instructions, where the computer-executable instructions, when executed by a computer processor, are used to perform the video quality assessment method described in the embodiments of the present application.
In a fifth aspect, an embodiment of the present application further provides a computer program product, the computer program product including a computer program stored in a computer-readable storage medium, where at least one processor of a device reads and executes the computer program from the computer-readable storage medium, so that the device performs the video quality assessment method described in the embodiments of the present application.
In the embodiments of the present application, an image frame sequence and video parameters of a video to be evaluated are obtained; the image frame sequence is input into a motion estimation module, a subjective image quality module, a content attractiveness module, and a scene module respectively to obtain a motion estimation score, a subjective image quality score, a content attractiveness score, and a scene score, where the content attractiveness module is trained on samples obtained through data stratification; the video parameters are input into an objective parameter module to obtain an objective parameter score; and the video quality score of the video to be evaluated is then calculated from the motion estimation score, the subjective image quality score, the content attractiveness score, the scene score, and the objective parameter score. This video quality assessment approach fuses motion estimation, subjective image quality, content attractiveness, the video scene, and objective video parameters, and closely combines the video scene with viewers' subjective preferences, so the resulting video quality assessment is more accurate and applicable to a wider range of scenarios. Because the training data of the content attractiveness module consists of samples obtained through data stratification, different content attractiveness scores can be obtained for content with the same subjective picture quality, which significantly improves the accuracy of subsequent video quality assessment.
Brief Description of the Drawings
FIG. 1 is a flowchart of a video quality assessment method provided by an embodiment of the present application;
FIG. 2 is a flowchart of a method for obtaining a motion estimation score through a motion estimation module provided by an embodiment of the present application;
FIG. 3 is a flowchart of a method for training a content attractiveness module provided by an embodiment of the present application;
FIG. 4 is a flowchart of a method for video frame rate decision-making provided by an embodiment of the present application;
FIG. 5 is a structural block diagram of a video quality assessment apparatus provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a video quality assessment device provided by an embodiment of the present application.
Detailed Description of the Embodiments
The embodiments of the present application are further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the embodiments of the present application and are not intended to limit them. It should also be noted that, for ease of description, the drawings show only the parts related to the embodiments of the present application rather than the complete structures.
The terms "first", "second", and the like in the specification and claims of the present application are used to distinguish similar objects, and are not used to describe a specific order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments of the present application can be implemented in orders other than those illustrated or described here. Objects distinguished by "first", "second", and the like are generally of one type, and the number of objects is not limited; for example, there may be one or more first objects. In addition, "and/or" in the specification and claims indicates at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
The video quality assessment method provided by the embodiments of the present application can be applied to live streaming scenarios, for example, to evaluate the quality of a live video and obtain a corresponding video quality score, or to any other scenario in which video quality needs to be evaluated.
FIG. 1 is a flowchart of a video quality assessment method provided by an embodiment of the present application. As shown in FIG. 1, the method specifically includes the following steps:
Step S101: Obtain an image frame sequence and video parameters of a video to be evaluated.
The video to be evaluated may be an acquired live streaming video, for example a video captured during a live broadcast when the quality of the live picture content needs to be evaluated. The execution subject may be a server; that is, the server evaluates the video quality and obtains an evaluation result, so as to provide decision support for adjusting the video parameters.
The image frame sequence is a sequence composed of multiple image frames. Optionally, it may be composed of consecutive frames obtained by sampling the video to be evaluated. The video parameters characterize objective video quality parameters of the video to be evaluated, for example the frame rate, bit rate, quantization parameter value, and resolution.
Step S102: Input the image frame sequence into a motion estimation module, a subjective image quality module, a content attractiveness module, and a scene module respectively, to obtain a motion estimation score, a subjective image quality score, a content attractiveness score, and a scene score.
The motion estimation module, the subjective image quality module, the content attractiveness module, and the scene module are pre-trained neural network modules, each of which outputs a score of the corresponding category for the input image frame sequence. For example, the image frame sequence is input into the motion estimation module to output a motion estimation score, into the subjective image quality module to output a subjective image quality score, into the content attractiveness module to output a content attractiveness score, and into the scene module to output a scene score. During training of the subjective image quality module and the content attractiveness module, a large amount of labeled data from different regions and countries is used to reflect the preferences of viewers in different regions.
The motion estimation module is based on an optical flow model. When motion is perceived as discontinuous or jerky, it has a negative impact on the user experience. The optical flow model accurately captures motion by mapping the displacement of each pixel from one image to the next.
In one embodiment, as shown in FIG. 2, FIG. 2 is a flowchart of a method for obtaining a motion estimation score through a motion estimation module provided by an embodiment of the present application. As shown in the figure, the method includes:
Step S1021: Input every two consecutive frames of the image frame sequence as a group into the motion estimation module.
In one embodiment, for the frames in the image frame sequence, two consecutive frames are input as a group into the motion estimation module, which outputs the relative displacement of the pixels corresponding to each group of images.
Step S1022: Calculate, through the motion estimation module, the relative displacement of the pixels in each group of frames, and calculate the motion estimation score based on the relative displacement of the pixels.
The relative displacement of a pixel includes a displacement in the horizontal direction and a displacement in the vertical direction. For an image frame sequence containing N frames, N/2 motion evaluation results of pixel relative displacements are output. Exemplarily, each motion evaluation result records, for two consecutive frames, the relative displacement of each pixel of one frame with respect to the other frame. After the relative displacement of each pixel is obtained, the motion can be characterized by the average of the relative displacements of all pixels in the image, i.e., the average magnitude, to obtain the corresponding motion estimation score. Exemplarily, a higher average magnitude indicates more pronounced motion and corresponds to a higher motion estimation score, and vice versa.
In another embodiment, when the motion estimation score is calculated through the motion estimation module, the obtained relative displacement of each pixel is further stacked with the pixel's three-channel color values, and the resulting features are input into the motion estimation module to obtain the motion estimation score. Exemplarily, the relative displacement of a pixel is represented as a two-dimensional vector and the three-channel color values of the pixel as a three-dimensional vector, and their stacked result is a five-dimensional vector. Optionally, the dimensions represent, in order, the pixel's horizontal displacement, vertical displacement, R color component, G color component, and B color component. The stacked result is input into the motion estimation module as the image feature to obtain the motion estimation score of the whole image.
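As an illustration of the two variants described above, the following minimal Python sketch assumes a generic optical-flow callable `estimate_flow` (a placeholder rather than any specific library call) that returns per-pixel (dx, dy) displacements; the network that would consume the five-dimensional features is not shown:

```python
import numpy as np

def motion_features(frame_a, frame_b, estimate_flow):
    # frame_a, frame_b: H x W x 3 RGB frames from one consecutive pair.
    # estimate_flow: placeholder optical-flow callable -> H x W x 2 (dx, dy).
    flow = estimate_flow(frame_a, frame_b)

    # Average displacement magnitude over all pixels: larger values mean
    # more pronounced motion and thus a higher motion estimation score.
    magnitude = np.sqrt(flow[..., 0] ** 2 + flow[..., 1] ** 2)
    motion_score = float(magnitude.mean())

    # Variant with stacked features: per-pixel five-dimensional vector
    # (dx, dy, R, G, B) that can be fed back into the motion estimation
    # network to score the whole image.
    rgb = frame_a.astype(np.float32) / 255.0
    features = np.concatenate([flow, rgb], axis=-1)  # H x W x 5

    return motion_score, features
```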
The subjective image quality module is used to capture the quality of each frame in the image frame sequence. Optionally, it uses an image quality assessment model with a feature extraction head that produces a regression output. The model is trained with supervised learning on a large dataset containing images that vary in brightness, contrast, saturation, noise, distortion, scene, and object type. A human-annotated quality score is associated with each image in the dataset to reflect the subjective quality of the image. The model used by the subjective image quality module differs from an objective quality model: a neural network is mainly used to fit people's subjective perception of conditions such as image blur, heavy noise, and frame loss rather than objective image quality, and it is regionally customized according to the picture-quality preferences of viewers in different countries.
The content attractiveness module is trained on samples obtained through data stratification. Optionally, FIG. 3 is a flowchart of a method for training a content attractiveness module provided by an embodiment of the present application. As shown in FIG. 3, the method includes:
Step S1023: Based on the scores given to sample data by the motion estimation module, the subjective image quality module, the scene module, and the objective parameter module, perform data stratification on the sample data to obtain stratified samples.
In one embodiment, when the samples are stratified, the trained motion estimation module, subjective image quality module, scene module, and objective parameter module are used to score the sample data to obtain the corresponding scores, and the sample data is then stratified according to these scores to obtain stratified samples. It should be noted that, in the data stratification, any two or three of the above motion estimation module, subjective image quality module, scene module, and objective parameter module may also be used for scoring, and the sample data is stratified based on the resulting scores to obtain stratified samples.
Optionally, after the selected modules score the sample data, sample data whose scores are identical, or whose score differences are smaller than a preset value, may be grouped into one stratified sample. For example, the scoring result for each sample image in the sample data may be recorded as [a, b, c, d], where a is the score from the motion estimation module, b the score from the subjective image quality module, c the score from the scene module, and d the score from the objective parameter module. Exemplarily, each score, after normalization by the corresponding module, lies in the range [0, 1]. During stratification, images whose scores are all identical may form one stratum, or sample data whose score differences are smaller than a preset value, for example 0.1, may form one stratum.
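As an illustration of this grouping rule, the following sketch buckets samples whose four normalized module scores fall into the same cell of a 0.1-wide grid, which is one simple way to satisfy the "identical or differing by less than a preset value" condition; the exact grouping strategy is a design choice rather than something prescribed here:

```python
from collections import defaultdict

def stratify(samples, step=0.1):
    # samples: iterable of (sample_id, [a, b, c, d]) pairs, where a..d are the
    # normalized scores from the motion estimation, subjective image quality,
    # scene and objective parameter modules, each in [0, 1].
    strata = defaultdict(list)
    for sample_id, scores in samples:
        # Samples mapped to the same key differ by less than `step` in every
        # one of the four scores, so they form one stratum.
        key = tuple(round(s / step) for s in scores)
        strata[key].append(sample_id)
    return dict(strata)
```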
Step S1024: Train the content attractiveness module based on the stratified samples.
In one embodiment, the data is stratified by the other four modules (the motion estimation module, the subjective image quality module, the scene module, and the objective parameter module). The data in each stratum has similar motion estimation, subjective image quality, scene, and objective parameter scores but represents different content. For each stratum, annotation teams in different countries score the sample data to produce regionalized attractiveness labels, and a neural network is then fitted by regression training. This yields, for each country, fine-grained attractiveness scores for different content whose subjective picture quality is approximately the same.
In one embodiment, the output of the scene module characterizes the scene category of the image, such as a game live-streaming scene, a personal show live-streaming scene, or a PK live-streaming scene. Exemplarily, eight different live-streaming scenes may be preset, and the output is an 8-dimensional vector in which each dimension corresponds to one category: the dimension whose value is 1 indicates the recognized scene category, and dimensions whose value is 0 indicate categories that were not recognized.
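A small sketch of how such a one-hot output could be decoded follows; only the first three category names appear in the description above, and the remaining five are hypothetical placeholders:

```python
import numpy as np

# Only "game", "personal_show" and "pk" are named above; the other five
# entries are hypothetical placeholders for the eight preset scenes.
LIVE_SCENES = ["game", "personal_show", "pk", "scene_4",
               "scene_5", "scene_6", "scene_7", "scene_8"]

def decode_scene(one_hot):
    # one_hot: 8-dimensional vector with 1 at the recognized scene category.
    return LIVE_SCENES[int(np.argmax(np.asarray(one_hot)))]
```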
Step S103: Input the video parameters into an objective parameter module to obtain an objective parameter score.
The video parameters may be objective quality parameters of the video such as the frame rate, bit rate, quantization parameter value, and resolution, which are input into the objective parameter module to obtain the objective parameter score. Exemplarily, assuming the input video parameters include the frame rate, bit rate, and quantization parameter value, the output is a score for each of these parameters, which together form the objective parameter score. Each score is normalized to fall within the range [0, 1].
Step S104: Calculate a video quality score of the video to be evaluated based on the motion estimation score, the subjective image quality score, the content attractiveness score, the scene score, and the objective parameter score.
In one embodiment, after the motion estimation score, the subjective image quality score, the content attractiveness score, the scene score, and the objective parameter score are obtained, a fusion calculation is performed to obtain the video quality score of the video to be evaluated. Optionally, the motion estimation score, the subjective image quality score, the content attractiveness score, the scene score, and the objective parameter score may be multiplied by their respective weights, and the products summed to obtain the video quality score of the video to be evaluated, where the weight values are obtained through neural network training and optimization. In another embodiment, another calculation formula may be defined, and the obtained motion estimation score, subjective image quality score, content attractiveness score, scene score, and objective parameter score are substituted into that formula to calculate the video quality score of the video to be evaluated.
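A minimal sketch of the weighted-sum fusion is given below; the weight values are illustrative placeholders only, since in the scheme described above they are obtained through neural network training and optimization:

```python
# Placeholder weights; in the described scheme they are learned, not hand-set.
WEIGHTS = {
    "motion": 0.20,
    "subjective_quality": 0.30,
    "attractiveness": 0.20,
    "scene": 0.10,
    "objective_params": 0.20,
}

def fuse_scores(scores, weights=WEIGHTS):
    # scores: mapping from the five module names to their normalized values.
    return sum(weights[name] * scores[name] for name in weights)
```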
As can be seen from the above, an image frame sequence and video parameters of a video to be evaluated are obtained; the image frame sequence is input into a motion estimation module, a subjective image quality module, a content attractiveness module, and a scene module respectively to obtain a motion estimation score, a subjective image quality score, a content attractiveness score, and a scene score, where the content attractiveness module is trained on samples obtained through data stratification; the video parameters are input into an objective parameter module to obtain an objective parameter score; and the video quality score of the video to be evaluated is then calculated from the motion estimation score, the subjective image quality score, the content attractiveness score, the scene score, and the objective parameter score. This video quality assessment approach fuses motion estimation, subjective image quality, content attractiveness, the video scene, and objective video parameters, and closely combines the video scene with viewers' subjective preferences, so the resulting video quality assessment is more accurate and applicable to a wider range of scenarios. Because the training data of the content attractiveness module consists of samples obtained through data stratification, different content attractiveness scores can be obtained for content with the same subjective picture quality, which significantly improves the accuracy of subsequent video quality assessment.
FIG. 4 is a flowchart of a method for video frame rate decision-making provided by an embodiment of the present application. As shown in FIG. 4, the method includes:
Step S201: Adjust the frame rate of the video to be evaluated to obtain videos to be evaluated at different frame rates, and input the videos to be evaluated at different frame rates into the objective parameter module to obtain multiple objective parameter scores.
In one embodiment, the video parameters of the same video stream or video content may be adjusted. Taking the frame rate as an example, videos to be evaluated at different frame rates are obtained and separately input into the objective parameter module to obtain multiple objective parameter scores.
Step S202: Calculate the video quality scores of the videos to be evaluated at the different frame rates based on the motion estimation score, the subjective image quality score, the content attractiveness score, the scene score, and each objective parameter score.
Since the same video frame sequence is used, its motion estimation score, subjective image quality score, content attractiveness score, and scene score remain unchanged; these values are obtained following the scheme described in the preceding examples, and only the different objective parameter scores are used as variables in the fusion calculation to obtain the video quality scores of the videos to be evaluated at the different frame rates.
Step S203: Make a video frame rate decision based on the video quality scores corresponding to the multiple different frame rates.
In one embodiment, after the video quality scores corresponding to the multiple different frame rates are obtained, the video frame rate decision may be made according to how the video quality score changes across the frame rates. Optionally, the video frame rate is raised when raising it significantly increases the video quality score, and lowered when lowering it does not significantly decrease the video quality score. Exemplarily, taking an original video to be evaluated encoded at 15 fps as an example, frame rates of 9, 12, 15, 18, 21, and 24 are input into the objective parameter module to observe the relative gain as the frame rate changes. If the final video quality score increases substantially between 15 and 18 fps, the visual experience is optimized through frame interpolation. If the final video quality score decreases only slightly between 12 and 15 fps, frames are dropped to reduce the bit rate and lower bandwidth costs.
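The decision logic of this example can be sketched as follows, assuming a helper `quality_at(fps)` that returns the fused video quality score with only the objective parameter score recomputed for the given frame rate; the two thresholds are illustrative, since the description only requires a "significant" gain when raising the frame rate and no significant loss when lowering it:

```python
def frame_rate_decision(current_fps, candidate_fps, quality_at,
                        gain_threshold=0.05, loss_threshold=0.01):
    # quality_at(fps): fused quality score at the candidate frame rate.
    base = quality_at(current_fps)
    higher = [f for f in candidate_fps if f > current_fps]
    lower = [f for f in candidate_fps if f < current_fps]

    if higher and max(quality_at(f) for f in higher) - base > gain_threshold:
        return "interpolate_frames"   # raising the frame rate clearly helps
    if lower and base - max(quality_at(f) for f in lower) < loss_threshold:
        return "drop_frames"          # lowering the frame rate barely hurts
    return "keep_current_frame_rate"
```

For the 15 fps example above, the call would be `frame_rate_decision(15, [9, 12, 15, 18, 21, 24], quality_at)`.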
It should be noted that the frame rate is used above as the video quality parameter to illustrate the decision process; the same decision can also be made for the bit rate, the quantization parameter, and the like based on the same idea and design logic.
As can be seen from the above, the frame rate of the video to be evaluated is adjusted to obtain videos to be evaluated at different frame rates, the videos at different frame rates are input into the objective parameter module to obtain multiple objective parameter scores, the video quality scores of the videos at the different frame rates are calculated based on the motion estimation score, the subjective image quality score, the content attractiveness score, the scene score, and each objective parameter score, and the video frame rate decision is made based on the video quality scores corresponding to the multiple different frame rates. This approach combines the assessment of subjective video quality with the decision to adjust video parameters, rather than making mechanical adjustments simply based on objective factors and video parameters, and can significantly improve the user's viewing experience while balancing bandwidth and device computing power.
FIG. 5 is a structural block diagram of a video quality assessment apparatus provided by an embodiment of the present application. As shown in FIG. 5, the apparatus is configured to execute the video quality assessment method provided by the above embodiments and has the functional modules and beneficial effects corresponding to the method. As shown in FIG. 5, the apparatus specifically includes a data acquisition unit 101, a data fusion processing unit 102, and a score calculation unit 103, where the data fusion processing unit 102 includes a motion estimation module 1021, a subjective image quality module 1022, a content attractiveness module 1023, a scene module 1024, and an objective parameter module 1025.
The data acquisition unit 101 is configured to obtain an image frame sequence and video parameters of a video to be evaluated.
The data fusion processing unit 102 is configured to input the image frame sequence into the motion estimation module 1021, the subjective image quality module 1022, the content attractiveness module 1023, and the scene module 1024 respectively, to obtain a motion estimation score, a subjective image quality score, a content attractiveness score, and a scene score, where the content attractiveness module is trained on samples obtained through data stratification; and to input the video parameters into the objective parameter module 1025 to obtain an objective parameter score.
The score calculation unit 103 is configured to calculate a video quality score of the video to be evaluated based on the motion estimation score, the subjective image quality score, the content attractiveness score, the scene score, and the objective parameter score.
As can be seen from the above, an image frame sequence and video parameters of a video to be evaluated are obtained; the image frame sequence is input into a motion estimation module, a subjective image quality module, a content attractiveness module, and a scene module respectively to obtain a motion estimation score, a subjective image quality score, a content attractiveness score, and a scene score, where the content attractiveness module is trained on samples obtained through data stratification; the video parameters are input into an objective parameter module to obtain an objective parameter score; and the video quality score of the video to be evaluated is then calculated from the motion estimation score, the subjective image quality score, the content attractiveness score, the scene score, and the objective parameter score. This video quality assessment approach fuses motion estimation, subjective image quality, content attractiveness, the video scene, and objective video parameters, and closely combines the video scene with viewers' subjective preferences, so the resulting video quality assessment is more accurate and applicable to a wider range of scenarios. Because the training data of the content attractiveness module consists of samples obtained through data stratification, different content attractiveness scores can be obtained for content with the same subjective picture quality, which significantly improves the accuracy of subsequent video quality assessment.
In a possible embodiment, the data fusion processing unit 102 is specifically configured to:
input every two consecutive frames of the image frame sequence as a group into the motion estimation module; and
calculate, through the motion estimation module, the relative displacement of the pixels in each group of frames, and calculate the motion estimation score based on the relative displacement of the pixels.
In a possible embodiment, the data fusion processing unit 102 is further configured to:
after the relative displacement of the pixels in each group of frames is calculated through the motion estimation module, stack the relative displacement of the pixels in each group of frames with the three-channel color values of the pixels, and input the result into the motion estimation module to obtain the motion estimation score.
In a possible embodiment, the subjective image quality module is generated based on an image quality assessment model and includes a feature extraction head with a regression output.
In a possible embodiment, the data fusion processing unit 102 is further configured to:
before the multiple frames of the video to be evaluated are obtained and the image frame sequence is generated, perform data stratification on sample data based on the scores given to the sample data by the motion estimation module, the subjective image quality module, the scene module, and the objective parameter module, to obtain stratified samples; and
train the content attractiveness module based on the stratified samples.
In a possible embodiment, the data fusion processing unit 102 is specifically configured to:
group sample data whose scores are identical or whose score differences are smaller than a preset value into one stratified sample.
In a possible embodiment, the score calculation unit 103 is configured to:
multiply the motion estimation score, the subjective image quality score, the content attractiveness score, the scene score, and the objective parameter score by their respective weights; and
sum the products to obtain the video quality score of the video to be evaluated.
In a possible embodiment, the video parameters include a video frame rate, and the data fusion processing unit 102 is further configured to:
adjust the frame rate of the video to be evaluated to obtain videos to be evaluated at different frame rates, and input the videos to be evaluated at different frame rates into the objective parameter module to obtain multiple objective parameter scores; and
calculate the video quality scores of the videos to be evaluated at the different frame rates based on the motion estimation score, the subjective image quality score, the content attractiveness score, the scene score, and each objective parameter score.
The apparatus further includes a decision unit configured to make a video frame rate decision based on the video quality scores corresponding to the multiple different frame rates.
FIG. 6 is a schematic structural diagram of a video quality assessment device provided by an embodiment of the present application. As shown in FIG. 6, the device includes a processor 201, a memory 202, an input apparatus 203, and an output apparatus 204. There may be one or more processors 201 in the device; one processor 201 is taken as an example in FIG. 6. The processor 201, the memory 202, the input apparatus 203, and the output apparatus 204 in the device may be connected through a bus or in other ways; connection through a bus is taken as an example in FIG. 6. As a computer-readable storage medium, the memory 202 may be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the video quality assessment method in the embodiments of the present application. The processor 201 executes various functional applications and data processing of the device by running the software programs, instructions, and modules stored in the memory 202, that is, implements the video quality assessment method described above. The input apparatus 203 may be used to receive input digital or character information and to generate key signal inputs related to user settings and function control of the device. The output apparatus 204 may include a display device such as a display screen.
An embodiment of the present application further provides a non-volatile storage medium containing computer-executable instructions, where the computer-executable instructions, when executed by a computer processor, are used to perform the video quality assessment method described in the above embodiments, including:
obtaining an image frame sequence and video parameters of a video to be evaluated;
inputting the image frame sequence into a motion estimation module, a subjective image quality module, a content attractiveness module, and a scene module respectively, to obtain a motion estimation score, a subjective image quality score, a content attractiveness score, and a scene score, where the content attractiveness module is trained on samples obtained through data stratification;
inputting the video parameters into an objective parameter module to obtain an objective parameter score;
calculating a video quality score of the video to be evaluated based on the motion estimation score, the subjective image quality score, the content attractiveness score, the scene score, and the objective parameter score.
It should be noted that, in the above embodiments of the video quality assessment apparatus, the included units and modules are divided only according to functional logic, but the division is not limited thereto, as long as the corresponding functions can be realized. In addition, the specific names of the functional units are only intended to distinguish them from one another and are not used to limit the protection scope of the embodiments of the present application.
In some possible implementations, various aspects of the method provided by the present application may also be implemented in the form of a program product, which includes program code. When the program product runs on a computer device, the program code causes the computer device to perform the steps of the methods according to the various exemplary embodiments of the present application described above in this specification; for example, the computer device may perform the video quality assessment method described in the embodiments of the present application. The program product may be implemented using any combination of one or more readable media.
Claims (12)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310520120.3A CN116471262A (en) | 2023-05-09 | 2023-05-09 | Video quality evaluation method, apparatus, device, storage medium, and program product |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN116471262A true CN116471262A (en) | 2023-07-21 |
Family
ID=87173633
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202310520120.3A Pending CN116471262A (en) | Video quality evaluation method, apparatus, device, storage medium, and program product | 2023-05-09 | 2023-05-09 |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN116471262A (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114079777A (en) * | 2020-08-20 | 2022-02-22 | 华为技术有限公司 | Video processing method and device |
| CN114596259A (en) * | 2022-01-20 | 2022-06-07 | 百果园技术(新加坡)有限公司 | Method, device, equipment and storage medium for determining reference-free video quality |
| CN114900692A (en) * | 2022-04-28 | 2022-08-12 | 有半岛(北京)信息科技有限公司 | Video stream frame rate adjustment method and its device, equipment, medium and product |
- 2023-05-09 CN CN202310520120.3A patent/CN116471262A/en active Pending
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN117649411A (en) * | 2024-01-30 | 2024-03-05 | 深圳市新良田科技股份有限公司 | Video image quality detection method and system |
| CN117649411B (en) * | 2024-01-30 | 2024-04-19 | 深圳市新良田科技股份有限公司 | Video image quality detection method and system |
| CN118865216A (en) * | 2024-09-26 | 2024-10-29 | 博大视野(厦门)科技有限公司 | Feature point detection method, device, readable storage medium and computer program product |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |