WO2024149035A1 - Methods and apparatus of affine motion compensation for block boundaries and motion refinement in video coding
Classifications
- H04N19/54—Motion estimation other than block-based using feature points or meshes
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
Definitions
- the present invention is a non-Provisional Application of and claims priority to U.S. Provisional Patent Application No. 63/479,754, filed on January 13, 2023, U.S. Provisional Patent Application No. 63/480,333, filed on January 18, 2023, U.S. Provisional Patent Application No. 63/512,308, filed on July 7, 2023 and U.S. Provisional Patent Application No. 63/590,006, filed on October 13, 2023.
- the U.S. Provisional Patent Applications are hereby incorporated by reference in their entireties.
- the present invention relates to video coding systems.
- the present invention relates to affine motion compensation techniques to improve the coding performance of video coding systems.
- Versatile Video Coding (VVC) is the latest international video coding standard developed by the Joint Video Experts Team (JVET) of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group (MPEG). The standard has been published as ISO/IEC 23090-3:2021, Information technology - Coded representation of immersive media - Part 3: Versatile video coding, published Feb. 2021.
- VVC is developed based on its predecessor HEVC (High Efficiency Video Coding) by adding more coding tools to improve coding efficiency and also to handle various types of video sources including 3-dimensional (3D) video signals.
- Fig. 1A illustrates an exemplary adaptive Inter/Intra video encoding system incorporating loop processing.
- Intra Prediction 110 the prediction data is derived based on previously coded video data in the current picture.
- Motion Estimation (ME) is performed at the encoder side and Motion Compensation (MC) is performed based on the result of ME to provide prediction data derived from other picture (s) and motion data.
- Switch 114 selects Intra Prediction 110 or Inter-Prediction 112 and the selected prediction data is supplied to Adder 116 to form prediction errors, also called residues.
- the prediction error is then processed by Transform (T) 118 followed by Quantization (Q) 120.
- the transformed and quantized residues are then coded by Entropy Encoder 122 to be included in a video bitstream corresponding to the compressed video data.
- the bitstream associated with the transform coefficients is then packed with side information such as motion and coding modes associated with Intra prediction and Inter prediction, and other information such as parameters associated with loop filters applied to underlying image area.
- the side information associated with Intra Prediction 110, Inter prediction 112 and in-loop filter 130 is provided to Entropy Encoder 122 as shown in Fig. 1A. When an Inter-prediction mode is used, a reference picture or pictures have to be reconstructed at the encoder end as well.
- the transformed and quantized residues are processed by Inverse Quantization (IQ) 124 and Inverse Transformation (IT) 126 to recover the residues.
- the residues are then added back to prediction data 136 at Reconstruction (REC) 128 to reconstruct video data.
- the reconstructed video data may be stored in Reference Picture Buffer 134 and used for prediction of other frames.
- incoming video data undergoes a series of processing in the encoding system.
- the reconstructed video data from REC 128 may be subject to various impairments due to a series of processing.
- in-loop filter 130 is often applied to the reconstructed video data before the reconstructed video data are stored in the Reference Picture Buffer 134 in order to improve video quality.
- deblocking filter (DF), Sample Adaptive Offset (SAO) and Adaptive Loop Filter (ALF) may be used.
- the loop filter information may need to be incorporated in the bitstream so that a decoder can properly recover the required information. Therefore, loop filter information is also provided to Entropy Encoder 122 for incorporation into the bitstream.
- Loop filter 130 is applied to the reconstructed video before the reconstructed samples are stored in the reference picture buffer 134.
- the system in Fig. 1A is intended to illustrate an exemplary structure of a typical video encoder. It may correspond to the High Efficiency Video Coding (HEVC) system, VP8, VP9, H.264 or VVC.
- the decoder can use similar or a portion of the same functional blocks as the encoder, except for Transform 118 and Quantization 120, since the decoder only needs Inverse Quantization 124 and Inverse Transform 126.
- the decoder uses an Entropy Decoder 140 to decode the video bitstream into quantized transform coefficients and needed coding information (e.g. ILPF information, Intra prediction information and Inter prediction information) .
- the Intra prediction 150 at the decoder side does not need to perform the mode search. Instead, the decoder only needs to generate Intra prediction according to Intra prediction information received from the Entropy Decoder 140.
- the decoder only needs to perform motion compensation (MC 152) according to Inter prediction information received from the Entropy Decoder 140 without the need for motion estimation.
- in HEVC, only a translational motion model is applied for motion compensation prediction (MCP), while in the real world there are many kinds of motion, e.g. zoom in/out, rotation, perspective motions and other irregular motions.
- a block-based affine transform motion compensation prediction is applied. As shown in Figs. 2A-B, the affine motion field of the block 210 is described by motion information of two control point motion vectors (4-parameter) in Fig. 2A or three control point motion vectors (6-parameter) in Fig. 2B.
- for the 4-parameter affine motion model, the motion vector at sample location (x, y) in a block is derived according to the first equation below; for the 6-parameter affine motion model, the motion vector at sample location (x, y) is derived according to the second equation below.
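- for reference, the standard VVC forms of these equations are given below, where W and H denote the width and height of the current block and (mv_0x, mv_0y), (mv_1x, mv_1y) and (mv_2x, mv_2y) are the control point motion vectors:

```latex
% 4-parameter affine model (two CPMVs):
\begin{aligned}
mv_x &= \frac{mv_{1x}-mv_{0x}}{W}\,x \;-\; \frac{mv_{1y}-mv_{0y}}{W}\,y \;+\; mv_{0x}\\
mv_y &= \frac{mv_{1y}-mv_{0y}}{W}\,x \;+\; \frac{mv_{1x}-mv_{0x}}{W}\,y \;+\; mv_{0y}
\end{aligned}
% 6-parameter affine model (three CPMVs):
\begin{aligned}
mv_x &= \frac{mv_{1x}-mv_{0x}}{W}\,x \;+\; \frac{mv_{2x}-mv_{0x}}{H}\,y \;+\; mv_{0x}\\
mv_y &= \frac{mv_{1y}-mv_{0y}}{W}\,x \;+\; \frac{mv_{2y}-mv_{0y}}{H}\,y \;+\; mv_{0y}
\end{aligned}
```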
- block based affine transform prediction is applied.
- the motion vector of the centre sample of each subblock is calculated according to the above equations, and rounded to 1/16 fraction accuracy.
- the motion compensation interpolation filters are applied to generate the prediction of each subblock with the derived motion vector.
- the subblock size of chroma-components is also set to be 4 ⁇ 4.
- the MV of a 4 ⁇ 4 chroma subblock is calculated as the average of the MVs of the top-left and bottom-right luma subblocks in the collocated 8x8 luma region.
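- a minimal Python sketch of this subblock MV derivation is given below (illustrative only, not the normative integer-exact VVC process; CPMVs are assumed to be given in luma-sample units and the 1/16-pel rounding is approximated):

```python
# Per-4x4-subblock MVs from CPMVs, with chroma MVs taken as the average of the
# top-left and bottom-right luma subblock MVs of each collocated 8x8 luma region.
def derive_subblock_mvs(cpmv, w, h, six_param):
    (v0x, v0y), (v1x, v1y) = cpmv[0], cpmv[1]
    a, d = (v1x - v0x) / w, (v1y - v0y) / w
    if six_param:
        (v2x, v2y) = cpmv[2]
        b, e = (v2x - v0x) / h, (v2y - v0y) / h
    else:
        b, e = -d, a                      # 4-parameter model
    mvs = {}
    for y in range(0, h, 4):
        for x in range(0, w, 4):
            cx, cy = x + 2, y + 2         # centre of the 4x4 subblock
            mvx = a * cx + b * cy + v0x
            mvy = d * cx + e * cy + v0y
            mvs[(x, y)] = (round(mvx * 16) / 16, round(mvy * 16) / 16)
    return mvs

def chroma_subblock_mv(luma_mvs, x8, y8):
    # average of the top-left and bottom-right luma 4x4 subblock MVs in the 8x8 region
    tl, br = luma_mvs[(x8, y8)], luma_mvs[(x8 + 4, y8 + 4)]
    return ((tl[0] + br[0]) / 2, (tl[1] + br[1]) / 2)
```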
- as is the case for translational-motion inter prediction, there are also two affine motion inter prediction modes: affine merge mode and affine AMVP mode.
- AF_MERGE mode can be applied for CUs with both width and height larger than or equal to 8.
- in the affine merge (AF_MERGE) mode, the CPMVs (Control Point MVs) of the current CU are generated based on the motion information of the spatial neighbouring CUs. There can be up to five CPMVP (CPMV Prediction) candidates and an index is signalled to indicate the one to be used for the current CU.
- the following three types of CPMV candidate are used to form the affine merge candidate list: inherited affine merge candidates extrapolated from the CPMVs of the neighbouring CUs, constructed affine merge candidates derived using the translational MVs of the neighbouring CUs, and zero MVs.
- in VVC, there are at most two inherited affine candidates, which are derived from the affine motion models of the neighbouring blocks, one from the left neighbouring CUs and one from the above neighbouring CUs.
- the candidate blocks are the same as those shown in Fig. 4.
- for the left predictor, the scan order is A0->A1, and for the above predictor, the scan order is B0->B1->B2.
- Only the first inherited candidate from each side is selected. No pruning check is performed between two inherited candidates.
- when a neighbouring affine CU is identified, its control point motion vectors are used to derive the CPMVP candidate in the affine merge list of the current CU, as shown in Fig. 5.
- Constructed affine candidate means the candidate is constructed by combining the neighbouring translational motion information of each control point.
- the motion information for the control points is derived from the specified spatial neighbours and temporal neighbours for a current block 610 as shown in Fig. 6.
- for CPMV 1 , the B2->B3->A2 blocks are checked and the MV of the first available block is used.
- for CPMV 2 , the B1->B0 blocks are checked, and for CPMV 3 , the A1->A0 blocks are checked.
- TMVP is used as CPMV 4 if it is available.
- after the MVs of the four control points are obtained, affine merge candidates are constructed based on that motion information.
- the following combinations of control point MVs are used to construct the candidates in order: {CPMV 1 , CPMV 2 , CPMV 3 } , {CPMV 1 , CPMV 2 , CPMV 4 } , {CPMV 1 , CPMV 3 , CPMV 4 } , {CPMV 2 , CPMV 3 , CPMV 4 } , {CPMV 1 , CPMV 2 } , {CPMV 1 , CPMV 3 } .
- the combination of 3 CPMVs constructs a 6-parameter affine merge candidate and the combination of 2 CPMVs constructs a 4-parameter affine merge candidate. To avoid motion scaling process, if the reference indices of control points are different, the related combination of control point MVs is discarded.
- Affine AMVP mode can be applied for CUs with both width and height larger than or equal to 16.
- An affine flag in the CU level is signalled in the bitstream to indicate whether affine AMVP mode is used and then another flag is signalled to indicate whether 4-parameter affine or 6-parameter affine is used.
- the difference of the CPMVs of current CU and their predictors CPMVPs is signalled in the bitstream.
- the affine AMVP candidate list size is 2 and it is generated by using the following four types of CPMV candidate in order: inherited affine AMVP candidates, constructed affine AMVP candidates, translational MVs from neighbouring CUs, and zero MVs.
- the checking order of inherited affine AMVP candidates is the same as the checking order of inherited affine merge candidates. The only difference is that, for the AMVP candidate, only the affine CU that has the same reference picture as the current block is considered. No pruning process is applied when inserting an inherited affine motion predictor into the candidate list.
- Constructed AMVP candidate is derived from the specified spatial neighbours shown in Fig. 6. The same checking order is used as that in the affine merge candidate construction. In addition, the reference picture index of the neighbouring block is also checked. In the checking order, the first block that is inter coded and has the same reference picture as the current CU is used. When the current CU is coded with the 4-parameter affine mode, and mv 0 and mv 1 are both available, they are added as one candidate in the affine AMVP list. When the current CU is coded with the 6-parameter affine mode, and all three CPMVs are available, they are added as one candidate in the affine AMVP list. Otherwise, the constructed AMVP candidate is set as unavailable.
- mv 0 , mv 1 and mv 2 will be added as the translational MVs in order to predict all control point MVs of the current CU, when available. Finally, zero MVs are used to fill the affine AMVP list if it is still not full.
- the CPMVs of affine CUs are stored in a separate buffer.
- the stored CPMVs are only used to generate the inherited CPMVPs in the affine merge mode and affine AMVP mode for the lately coded CUs.
- the subblock MVs derived from CPMVs are used for motion compensation, MV derivation of merge/AMVP list of translational MVs and de-blocking.
- affine motion data inheritance from the CUs of the above CTU is treated differently from the inheritance from the normal neighbouring CUs. If the candidate CU for affine motion data inheritance is in the above CTU line, the bottom-left and bottom-right subblock MVs in the line buffer instead of the CPMVs are used for the affine MVP derivation. In this way, the CPMVs are only stored in a local buffer. If the candidate CU is 6-parameter affine coded, the affine model is degraded to a 4-parameter model. As shown in Fig. 7, along the top CTU boundary, the bottom-left and bottom-right subblock motion vectors of a CU are used for affine inheritance of the CUs in the bottom CTUs.
- line 710 and line 712 indicate the x and y coordinates of the picture with the origin (0, 0) at the upper left corner.
- Legend 720 shows the meaning of various motion vectors, where arrow 722 represents the CPMVs for affine inheritance in the local buffer, arrow 724 represents sub-block vectors for MC/merge/skip/AMVP/deblocking/TMVPs in the local buffer and for affine inheritance in the line buffer, and arrow 726 represents sub-block vectors for MC/merge/skip/AMVP/deblocking/TMVPs.
- Subblock based affine motion compensation can save memory access bandwidth and reduce computation complexity compared to pixel based motion compensation, at the cost of prediction accuracy penalty.
- Prediction Refinement with Optical Flow (PROF) is used to refine the subblock based affine motion compensated prediction without increasing the memory access bandwidth for motion compensation.
- in VVC, after the subblock based affine motion compensation is performed, the luma prediction sample is refined by adding a difference derived by the optical flow equation. The PROF is described as the following four steps:
- Step 1) The subblock-based affine motion compensation is performed to generate subblock prediction I (i, j) .
- Step 2) The spatial gradients g x (i, j) and g y (i, j) of the subblock prediction are calculated at each sample location using a 3-tap filter [-1, 0, 1] .
- shift1 is used to control the gradient’s precision.
- the subblock (i.e. 4x4) prediction is extended by one sample on each side for the gradient calculation. To avoid additional memory bandwidth and additional interpolation computation, those extended samples on the extended borders are copied from the nearest integer pixel position in the reference picture.
- Step 3) The luma prediction refinement ΔI (i, j) is calculated by the optical flow equation ΔI (i, j) = g x (i, j) *Δv x (i, j) + g y (i, j) *Δv y (i, j) , where Δv (i, j) is the difference between the sample MV computed for sample location (i, j) , denoted by v (i, j) , and the subblock MV of the subblock to which sample (i, j) belongs, as shown in Fig. 8.
- the Δv (i, j) is quantized in the unit of 1/32 luma sample precision.
- sub-block 822 corresponds to a reference sub-block for sub-block 820 as pointed by the motion vector vSB (812) .
- the reference sub-block 822 represents a reference sub-block resulted from translational motion of block 820.
- Reference sub-block 824 corresponds to a reference sub-block with PROF.
- the motion vector for each pixel is refined by ⁇ v (i, j) .
- the refined motion vector v (i, j) 814 for the top-left pixel of the sub-block 820 is derived based on the sub-block MV vSB (812) modified by ⁇ v (i, j) 816.
- ⁇ v (i, j) can be calculated for the first subblock, and reused for other subblocks in the same CU.
- the centre of the subblock (x SB , y SB ) is calculated as ( (W SB -1) /2, (H SB -1) /2) , where W SB and H SB are the subblock width and height, respectively.
- Step 4) Finally, the luma prediction refinement ⁇ I (i, j) is added to the subblock prediction I (i, j) .
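- a compact Python sketch of these four PROF steps is given below (illustrative only: the gradient precision shift shift1, the 1/32-pel quantization of Δv and other integer details of the normative process are omitted):

```python
import numpy as np

def prof_refine(pred_padded, dvx, dvy):
    # pred_padded: (H+2, W+2) subblock prediction extended by one sample per side
    # dvx, dvy:    (H, W) per-sample MV offsets relative to the subblock MV
    gx = pred_padded[1:-1, 2:] - pred_padded[1:-1, :-2]   # horizontal gradient, 3-tap [-1, 0, 1]
    gy = pred_padded[2:, 1:-1] - pred_padded[:-2, 1:-1]   # vertical gradient
    delta_i = gx * dvx + gy * dvy                          # optical-flow refinement per sample
    return pred_padded[1:-1, 1:-1] + delta_i               # refined subblock prediction

def prof_dv(a, b, d, e, w_sb=4, h_sb=4):
    # Per-sample MV offsets from the affine parameters (a, b, d, e as used later
    # in this description), relative to the subblock centre ((W_SB-1)/2, (H_SB-1)/2);
    # reusable for every subblock of the same CU.
    cx, cy = (w_sb - 1) / 2, (h_sb - 1) / 2
    dx = np.arange(w_sb)[None, :] - cx
    dy = np.arange(h_sb)[:, None] - cy
    return a * dx + b * dy, d * dx + e * dy    # (dvx, dvy)
```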
- PROF is not applied in two cases for an affine coded CU: 1) all control point MVs are the same, which indicates the CU only has translational motion; 2) the affine motion parameters are greater than a specified limit because the subblock based affine MC (Motion Compensation) is degraded to CU based MC to avoid large memory access bandwidth requirement.
- a fast encoding method is applied to reduce the encoding complexity of affine motion estimation with PROF.
- PROF is not applied at the affine motion estimation stage in the following two situations: a) if this CU is not the root block and its parent block does not select the affine mode as its best mode, PROF is not applied since the possibility for the current CU to select the affine mode as the best mode is low; b) if the magnitudes of the four affine parameters (C, D, E, F) are all smaller than a predefined threshold and the current picture is not a low delay picture, PROF is not applied because the improvement introduced by PROF is small for this case. In this way, the affine motion estimation with PROF can be accelerated.
- JVET-AC0158 Zhi Zhang, et al., “EE2-2.5: Pixel based affine motion compensation” , 29th Meeting, by teleconference, 11–20 January 2023, Document: JVET-AC0158
- in JVET-AC0158, pixel based affine motion compensation is disclosed, which proposes to change the minimum affine subblock size from 4x4 to 1x1 for both luma and chroma components; the 1x1 subblock size allows pixel based affine MC.
- when the affine subblock width or height is smaller than 4, PROF is disabled.
- MP-DMVR Multi-Pass Decoder-Side Motion Vector Refinement
- a multi-pass decoder-side motion vector refinement is applied.
- in the first pass, bilateral matching (BM) is applied to the coding block.
- in the second pass, BM is applied to each 16x16 subblock within the coding block.
- in the third pass, the MV in each 8x8 subblock is refined by applying Bi-Directional Optical Flow (BDOF) .
- BDOF Bi-Directional Optical Flow
- a refined MV is derived by applying BM to a coding block. Similar to Decoder-Side Motion Vector Refinement (DMVR) , in bi-prediction operation, a refined MV is searched around the two initial MVs (MV0 and MV1) in the reference picture lists L0 and L1. The refined MVs (MV0_pass1 and MV1_pass1) are derived around the initial MVs based on the minimum bilateral matching cost between the two reference blocks in L0 and L1.
- DMVR Decoder-Side Motion Vector Refinement
- BM performs local search to derive integer sample precision intDeltaMV.
- the local search applies a 3 ⁇ 3 square search pattern to loop through the search range [–sHor, sHor] in the horizontal direction and [–sVer, sVer] in the vertical direction, where the values of sHor and sVer are determined by the block dimension, and the maximum value of sHor and sVer is 8.
- Mean-Removal SAD (MRSAD) cost function is applied to remove the DC effect of distortion between reference blocks.
- MRSAD Mean-Removal SAD
- the existing fractional sample refinement is further applied to derive the final deltaMV.
- the refined MVs after the first pass are then derived as:
- MV0_pass1 = MV0 + deltaMV
- MV1_pass1 = MV1 - deltaMV
- a refined MV is derived by applying BM to a 16 ⁇ 16 grid subblock. For each subblock, a refined MV is searched around the two MVs (MV0_pass1 and MV1_pass1) obtained in the first pass, in the reference picture list L0 and L1.
- the refined MVs (MV0_pass2 (sbIdx2) and MV1_pass2 (sbIdx2) ) are derived based on the minimum bilateral matching cost between the two reference subblocks in L0 and L1.
- BM For each subblock, BM performs full search to derive integer sample precision intDeltaMV.
- the full search has a search range [–sHor, sHor] in horizontal direction and [–sVer, sVer] in vertical direction, wherein, the values of sHor and sVer are determined according to the block dimension, and the maximum value of sHor and sVer is 8.
- the search area (2*sHor + 1) * (2*sVer + 1) is divided into up to 5 diamond-shaped search regions as shown in Fig. 9, where the five regions are shown in different shades.
- Each search region is assigned a costFactor, which is determined by the distance (intDeltaMV) between each search point and the starting MV, and each diamond region is processed in the order starting from the centre of the search area. In each region, the search points are processed in the raster scan order starting from the top left going to the bottom right corner of the region.
- when the minimum cost within the current search region is smaller than a threshold, the int-pel full search is terminated; otherwise, the int-pel full search continues to the next search region until all search points are examined. Additionally, if the difference between the previous minimum cost and the current minimum cost in the iteration is less than a threshold that is equal to the area of the block, the search process terminates.
- the existing VVC DMVR fractional sample refinement is further applied to derive the final deltaMV (sbIdx2) .
- the refined MVs at the second pass are then derived as:
- MV0_pass2 (sbIdx2) = MV0_pass1 + deltaMV (sbIdx2)
- MV1_pass2 (sbIdx2) = MV1_pass1 - deltaMV (sbIdx2)
- a refined MV is derived by applying BDOF to an 8 ⁇ 8 grid subblock. For each 8 ⁇ 8 subblock, BDOF refinement is applied to derive scaled Vx and Vy without clipping starting from the refined MV of the parent subblock of the second pass.
- the derived bioMv (Vx, Vy) is rounded to 1/16 sample precision and clipped between -32 and 32.
- the refined MVs (MV0_pass3 (sbIdx3) and MV1_pass3 (sbIdx3) ) at the third pass are derived as:
- MV0_pass3 (sbIdx3) = MV0_pass2 (sbIdx2) + bioMv
- MV1_pass3 (sbIdx3) = MV1_pass2 (sbIdx2) - bioMv
- the coding block is divided into 8 ⁇ 8 subblocks. For each subblock, whether to apply BDOF or not is determined by checking the SAD between the two reference subblocks against a threshold. If decided to apply BDOF to a subblock, for every sample in the subblock, a sliding 5 ⁇ 5 window is used and the existing BDOF process is applied for every sliding window to derive Vx and Vy. The derived motion refinement (Vx, Vy) is applied to adjust the bi-predicted sample value for the centre sample of the window.
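- the following Python sketch shows how the three passes compose the refined MVs (structure only; bm_search and bdof_refine are hypothetical callables standing in for the bilateral matching search and the BDOF solver described above, and MVs/deltas are assumed to be 2-component numpy arrays or any type supporting + and -):

```python
def subblocks(w, h, sub_w, sub_h):
    # enumerate top-left corners of the sub-grid
    return [(x, y) for y in range(0, h, sub_h) for x in range(0, w, sub_w)]

def multipass_dmvr(w, h, mv0, mv1, bm_search, bdof_refine):
    # Pass 1: CU-level bilateral matching around the initial MVs
    d1 = bm_search(0, 0, w, h, mv0, mv1)
    mv0_p1, mv1_p1 = mv0 + d1, mv1 - d1
    refined = {}
    for (x, y) in subblocks(w, h, 16, 16):        # Pass 2: 16x16 subblock BM
        d2 = bm_search(x, y, 16, 16, mv0_p1, mv1_p1)
        mv0_p2, mv1_p2 = mv0_p1 + d2, mv1_p1 - d2
        for (x8, y8) in subblocks(16, 16, 8, 8):  # Pass 3: 8x8 BDOF MV refinement
            bio = bdof_refine(x + x8, y + y8, mv0_p2, mv1_p2)  # 1/16-pel, clipped to [-32, 32]
            refined[(x + x8, y + y8)] = (mv0_p2 + bio, mv1_p2 - bio)
    return refined
```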
- JVET-AE0148 Zhi Zhang, et al., “Non-EE2: Affine subblock BDOF refinement” , 31st Meeting, Geneva, CH, 11–19 July 2023, Document: JVET-AE0148
- JVET-AE0148 proposes to apply BDOF subblock MV refinement and sample adjustment to an affine coded block when the block meets the BDOF condition and the block is determined to use subblock MC (e.g., OBMC being applied to subblocks) .
- An affine coded block derives MVs for each 4 ⁇ 4 subblock from the affine model.
- the BDOF process starts with grouping 4×4 subblocks that have identical MVs; when the subblocks cannot be grouped, BDOF MV refinement is processed in a 4×4 subblock grid, and otherwise in an 8×8 subblock grid.
- the BDOF enabling condition is same as ECM-9.0, e.g., two reference pictures have equal POC distance to the current picture, and equal weight prediction.
- DMVR is applied to affine merge coded blocks and affine MMVD coded blocks when DMVR condition is satisfied. It is also extended to adaptive BM merge mode.
- An affine motion field is modelled as follows (6-parameter affine case) : mv x (x, y) = a*x + b*y + mv 0x , mv y (x, y) = d*x + e*y + mv 0y .
- (mv x , mv y ) is the motion vector at location (x, y) and (mv 0x , mv 0y ) is the base MV representing the translational motion of the affine model. Parameters a, b, d and e represent the non-translation parameters (rotation, scaling) .
- Motion vectors (mv 0x , mv 0y ) , (mv 1x , mv 1y ) and (mv 2x , mv 2y ) are called the control point motion vectors (CPMVs) of the considered affine coding unit.
- the bilateral matching cost is calculated per subblock.
- the subblock bilateral matching costs and refined subblock MVs are used to determine the overall best refined CPMVs for the affine block. More specifically, the CPMVs are refined according to the following steps:
- Step 3: Perform linear regression using the refined subblock MVs from step 1 as input and output a set of control-point motion vectors.
- the non-translation parameters of affine model are refined after the base MVs are determined.
- each of the CPMVs is fixed as a base MV in turn, and an offset is added to the non-translation parameters of the affine model by minimizing the bilateral matching cost, and then the other two CPMVs are calculated according to the base MV and the refined non-translation parameters.
- both CPMVs and non-translation parameters refinements are applied.
- the MMVD offset is added to the affine DMVR refined affine merge base candidate if the base candidate meets the affine DMVR refinement condition.
- an affine merge list that only contains affine merge candidates that meet the affine DMVR conditions are constructed and then CPMVs refinement and non-translation parameters refinement are applied.
- the affine motion compensation technique is used with various other coding tools.
- various techniques to improve coding efficiency are disclosed for using affine motion compensation with other coding tools.
- a method and apparatus for video coding using affine motion compensation are disclosed.
- input data comprising a current block are received, wherein the input data comprise pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side, and the current block is coded using an affine model.
- a first MV (Motion Vector) is determined based on a subblock-based BDOF (Bi-Directional Optical Flow) process.
- a second MV is determined based on a PROF (Prediction Refinement with Optical Flow) process.
- the current block is encoded or decoded by using information comprising the first MV and the second MV.
- a third MV is derived based on the first MV and the second MV, and the third MV is used for said encoding or decoding the current block.
- the third MV is derived by fusing the first MV and the second MV or the third MV is derived as a sum of or a weighted sum of the first MV and the second MV.
- the third MV is derived as a weighted sum of the first MV and the second MV using a pre-defined weight or using an adaptive weight according to a magnitude of the first MV, the second MV or both.
- prediction refinement based on the BDOF process and the PROF process is derived based on the third MV and one or more sample gradients, and said one or more sample gradients are derived from an affine predictor for the current block. Furthermore, said one or more sample gradients derived from the affine predictor for the current block can be reused for the prediction refinement.
- said encoding or decoding the current block by using the information comprising the first MV and the second MV is performed only if one or more conditions are satisfied.
- the PROF process is first applied to generate one or more affine predictors, and then subblock-based BDOF is applied to derive a refined MV and gradient for a subblock based on said one or more affine predictors.
- the BDOF process is first applied to refine one or more affine motion vectors, and then the PROF process is applied to derive one or more refined predictors.
- a first offset of the first MV and a second offset of the second MV are fused according to a magnitude of MV refinement, magnitude of sample gradients, a pre-defined weight, OBMC (Overlapped Boundary Motion Compensation) flag, TM (Template Matching) or BM (Block Matching) cost, sample intensity characteristics, the affine model, or a combination thereof.
- according to another method, an affine MC (Motion Compensation) block size for the current block is determined according to the OBMC on/off condition of the current block, the affine motion model of the current block, or both, or whether to allow sample-based or subblock-based affine MC is determined according to the OBMC on/off condition of the current block.
- Affine MC samples are generated for an affine MC area with the affine MC block size determined, or subblock-based affine MC is applied if the sample-based or subblock-based affine MC is allowed.
- the affine MC block size depends on the OBMC on/off condition and one or more absolute values associated with summation and subtraction of two, three or four of model parameters associated with the affine motion model.
- a larger affine MC block size is used when the OBMC on/off condition is off and said one or more absolute values are larger than a threshold.
- a smaller affine MC block size is used when the OBMC on/off condition is on or said one or more absolute values are smaller than a threshold.
- the sample-based or subblock-based affine MC is allowed if the OBMC on/off condition is on, and wherein subblock width and height associated with the subblock-based affine MC is smaller than a threshold.
- one or more larger affine MC block sizes are selected if one of absolute values of model parameters is smaller than a first predefined threshold, or absolute values of summation and subtraction of two, three or four of model parameters associated with the affine mode are smaller than a second predefined threshold.
- the first predefined threshold and/or the second predefined threshold are dependent on a maximum value of MV gradient of PROF (Prediction Refinement with Optical Flow) and/or a difference between two control point MVs for the current block.
- interpolation filter tap size for an interpolation filter is determined according to block width, block height or block size, or according to a difference between subblock motion vectors.
- Affine MC (Motion Compensation) samples are generated using the interpolation filter with the interpolation filter tap size determined.
- the current block is encoded or decoded by using the affine MC samples.
- a larger interpolation filter tap size is selected for a larger block size and/or a smaller interpolation filter tap size is selected for a smaller block size. In one embodiment, a larger interpolation filter tap size is selected for a larger difference between the subblock motion vectors, and/or a smaller interpolation filter tap size is selected for a smaller difference between the subblock motion vectors.
- a target prediction mode between a sample-based affine MC (Motion Compensation) and affine BDOF (Bi-Directional Optical Flow) is determined for a current subblock of the current block according to the affine model, subblock TM (Template Matching) or BM (Block Matching) cost, sample intensity characteristics or magnitude of BDOF MV refinement.
- Prediction samples are generated for the current block using the sample-based affine MC or the affine BDOF according to the target prediction mode.
- the current block is encoded or decoded by using the prediction samples generated.
- the current subblock is encoded or decoded by using the sample-based affine MC to generate the prediction samples if the TM or BM cost of the current subblock or the magnitude of BDOF MV refinement is larger than a threshold.
- the current subblock is encoded or decoded by using the sample-based affine MC to generate the prediction samples if one, two, three or four of affine parameters associated with the affine model or a combination of two, three or four of the affine parameters of the current block is larger than a threshold.
- the current subblock is encoded or decoded by using the sample-based affine MC to generate the prediction samples if one, two, three or four of affine parameters associated with the affine model or a combination of two, three or four of the affine parameters of the current block is smaller than a threshold.
- Fig. 1A illustrates an exemplary adaptive Inter/Intra video coding system incorporating loop processing.
- Fig. 1B illustrates a corresponding decoder for the encoder in Fig. 1A.
- Fig. 2A illustrates an example of the affine motion field of a block described by motion information of two control point motion vectors (4-parameter) .
- Fig. 2B illustrates an example of the affine motion field of a block described by motion information of three control point motion vectors (6-parameter) .
- Fig. 3 illustrates an example of block based affine transform prediction, where the motion vector of each 4 ⁇ 4 luma subblock is derived from the control-point MVs.
- Fig. 4 illustrates the neighbouring blocks used for deriving spatial merge candidates for VVC.
- Fig. 5 illustrates an example of derivation for inherited affine candidates based on control-point MVs of a neighbouring block.
- Fig. 6 illustrates an example of affine candidate construction by combining the translational motion information of each control point from spatial neighbours and a temporal neighbour.
- Fig. 7 illustrates an example of affine motion information storage for motion information inheritance.
- Fig. 8 illustrates an example of sub-block based affine motion compensation, where the motion vectors for individual pixels of a sub-block are derived according to motion vector refinement.
- Fig. 9 illustrates the five diamond-shaped search regions for subblock based bilateral matching MV refinement.
- Fig. 10 illustrates an example of affine MC block size dependency on the horizontal and vertical MV differences between two pixels according to an embodiment of the current invention.
- Fig. 11 illustrates a flowchart of an exemplary video coding system that performs affine motion compensation using subblock-based BDOF derived MV and PROF derived MV according to an embodiment of the present invention.
- Fig. 12 illustrates a flowchart of an exemplary video coding system that determines the affine motion compensation block size based on OBMC on/off condition and/or the affine motion model according to an embodiment of the present invention.
- Fig. 13 illustrates a flowchart of an exemplary video coding system that determines interpolation filter tap size for affine motion compensation based on the affine block size according to an embodiment of the present invention.
- Fig. 14 illustrates a flowchart of an exemplary video coding system that combines affine BDOF and sample-based affine motion compensation according to an embodiment of the present invention.
- in the VTM (VVC Test Model) , the minimum affine MC block size (or the minimum affine MC size) is fixed as 4x4.
- the pixel-based affine MC is proposed to be applied to the affine CUs with OBMC disabled.
- the OBMC on/off condition might not be the best criterion to determine whether to apply pixel-based affine MC since the interaction between the coding tools becomes more complicated.
- there are more choices for the minimum affine MC block size, e.g. rectangular shapes, instead of only selecting between the 4x4 and 1x1 affine MC block sizes.
- the minimum affine MC block size is dependent on the horizontal and vertical MV differences between two pixels. That is, if the horizontal/vertical MV difference between two pixels is smaller than one predefined threshold, the pixels between these two pixels are merged into one MC block. For example, if the horizontal difference between v0 and v1 of the current subblock 1010 is smaller than a threshold and the horizontal difference between v0 of the current subblock 1010 and v3 of the right neighbouring subblock 1020 is larger than the threshold, the MC block width is determined as W in Fig. 10.
- the MC block height is determined as H.
- This method can be applied to luma component only, chroma component only, or both luma and chroma components.
- the threshold can be conditioned by the maximum value of MV gradient of PROF, the difference between two control point MVs with the consideration of CU width/height, or the difference between two control point MVs without the consideration of CU width/height.
- the W and H can be any number larger than or equal to 1 and W can be equal or unequal to H.
- the subblock MV of each MxN subblock is derived.
- the M and N are positive integers, such as 1, 2, 4, 8, or 16.
- the MV of each subblock is derived according to the affine model.
- the MV can be the centre MV of the subblock. If the MVx difference of two horizontal adjacent subblocks is smaller than a threshold, the subblock width for MC can be increased by M. The same procedure can be applied sequentially or in parallel, and also can be applied for the vertical direction.
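- a simple Python sketch of this horizontal merging of MxN subblocks is shown below (illustrative only; the threshold and the MV representation are assumptions, and the same idea applies in the vertical direction):

```python
def merge_mc_blocks_horizontally(sub_mvs, m, threshold):
    # sub_mvs: list of (mvx, mvy) for the subblocks of one subblock row, left to right
    widths = []                       # resulting MC block widths (multiples of M)
    run_w, prev = m, sub_mvs[0]
    for mv in sub_mvs[1:]:
        if abs(mv[0] - prev[0]) < threshold:
            run_w += m                # MVx difference small -> extend the MC block by M
        else:
            widths.append(run_w)      # close the current MC block
            run_w = m
        prev = mv
    widths.append(run_w)
    return widths
```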
- the minimum affine MC block size is dependent on the affine motion model. That is, if one of the absolute values of model parameters a, b, d and e is smaller than a predefined threshold or the absolute values of summation and subtraction of two, three or four of the model parameters are smaller than a predefined threshold, the larger affine MC block size is selected, which can be of square or rectangular shape. If one of the absolute values of model parameters a, b, d and e is larger than a predefined threshold, or the absolute values of the summation and subtraction of two, three or four of the model parameters are larger than a predefined threshold, the smaller affine MC block size is selected.
- for example, if the absolute value of the summation of parameters a and b is smaller than a threshold T1, the MC block width is determined as 32, and if the absolute value of the summation of parameters a and b is smaller than a threshold T2 but larger than T1, the MC block width is determined as 16 and so on.
- similarly, if the absolute value of the summation of parameters d and e is smaller than the threshold T1, the MC block height is determined as 32, and if the absolute value of the summation of parameters d and e is smaller than the threshold T2 but larger than T1, the MC block height is determined as 16 and so on.
- the threshold Ti can be multiple and the MC block width and height can be different and determined independently according to different model parameters.
- the size of MC block width and height can be any number larger than or equal to 1.
- the threshold Ti can be conditioned by the maximum value of MV gradient of PROF, the difference between two control point MVs with the consideration of CU width/height, or the difference between two control point MVs without the consideration of CU width/height. This method can be applied to luma component only, chroma component only, or both luma and chroma components.
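- the threshold ladder above can be expressed as in the following Python sketch (the T1/T2 values and the 32/16/4 size ladder are illustrative assumptions, not values fixed by this description):

```python
def mc_block_dims(a, b, d, e, t1, t2):
    def dim_from(metric):
        if metric < t1:
            return 32                 # very smooth affine motion -> large MC block
        if metric < t2:
            return 16
        return 4                      # fall back to a small MC block
    width = dim_from(abs(a + b))      # width driven by the horizontal parameters
    height = dim_from(abs(d + e))     # height driven by the vertical parameters
    return width, height
```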
- a mathematical calculation of a, b, c, d, e or f, subblock width or height, MC block width or height can be performed to determine the MV derivation subblock or MC subblock width or height.
- a*N + b*M is larger than, or is larger than or equal to a threshold [i]
- the subblock width or height is smaller than K [i] , where different thresholds and K can be pre-determined or adaptively determined.
- a*N + b*M is smaller than, or is smaller than or equal to a threshold [i]
- the subblock width or height is smaller than K [i] , where different thresholds and K can be pre-determined or adaptively determined.
- a*N + b*M is larger than, or is larger than or equal to a threshold [i]
- the subblock width or height is larger than K [i] , where different threshold and K can be pre-determined or adaptively determined.
- a*N + b*M is smaller than, or is smaller than or equal to a threshold [i]
- the subblock width or height is larger than K [i] , where different thresholds and K can be pre-determined or adaptively determined.
- a*N + d*M is larger than, or is larger than or equal to a threshold [i]
- the subblock width or height is smaller than K [i] , where different thresholds and K can be pre-determined or adaptively determined.
- a*N + d*M is smaller than, or is smaller than or equal to a threshold [i] , the subblock width or height is smaller than K [i] , where different thresholds and K can be pre-determined or adaptively determined.
- a*N + d*M is larger than, or is larger than or equal to a threshold [i]
- the subblock width or height is larger than K [i] , where different thresholds and K can be pre-determined or adaptively determined.
- a*N + d*M is smaller than, or is smaller than or equal to a threshold [i]
- the subblock width or height is larger than K [i] , where different thresholds and K can be pre-determined or adaptively determined.
- in one example, when the subblock size is not 1x1, the PROF can be applied. If the subblock size is 1xM, the vertical direction PROF is applied. If the subblock size is Nx1, the horizontal direction PROF is applied. In another example, when the subblock size is not 1x1, the PROF can be applied. If the subblock size is 1xM, the horizontal direction PROF is applied. If the subblock size is Nx1, the vertical direction PROF is applied.
- the sample-based MC or subblock size MC can also be applied to chroma.
- the filter-tap length, threshold, block size can be different from luma, or aligned with luma, or aligned with subsampled luma (subsampled with the chroma subsample ratio) .
- the interpolation filter tap sizes for affine MC can be determined according to the affine MC block width and height independently. For smaller block size, the smaller filter tap size is used, and for the larger block size, the larger filter tap size is used. If the MC block width and height are different, the horizontal and vertical filter tap size can also be different. In one embodiment, the filter tap length of the shorter-length boundary is smaller than the longer length boundary.
- the filter tap length is determined by the affine model or the subblock MVs. For example, if the affine model parameter value is larger, which means the subblock MVs are more diverse, a shorter filter tap length is applied.
- the filter tap length can also be determined according to the size, width or height of the required reference samples region.
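- a possible realization of this tap-size selection is sketched below in Python (the size-to-tap mapping and the MV-diversity threshold are illustrative assumptions):

```python
def pick_filter_taps(mc_w, mc_h, max_subblock_mv_diff, diff_threshold=2.0):
    def taps_for(length):
        if length >= 8:
            return 8                  # e.g. a full-length luma interpolation filter
        if length >= 4:
            return 6
        return 4                      # short filter for very small MC blocks
    h_taps, v_taps = taps_for(mc_w), taps_for(mc_h)
    if max_subblock_mv_diff > diff_threshold:
        # more diverse subblock MVs -> shorter taps to limit the reference region
        h_taps, v_taps = min(h_taps, 6), min(v_taps, 6)
    return h_taps, v_taps
```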
- the OBMC on/off condition can be used to determine the affine MC block size. If the OBMC is disabled, the affine minimum MC block size can be smaller (e.g. 1x1, 2x2 or 4x4) and if the OBMC is enabled, the affine minimum MC block size can be larger (e.g. 4x4, 8x8 or 16x16) .
- the OBMC on/off condition, and/or the affine motion model, and/or the horizontal and/or vertical MV difference can be used to determine the affine MC block size. For example, if the OBMC is disabled and the absolute values of summation and subtraction of two, three or four of the model parameters are larger than a predefined threshold, the affine minimum MC block size can be smaller (e.g. 1x1, 1x2, 2x1, 2x2, 2x4, 4x2, 4x4, etc. ) . If the OBMC is enabled and the absolute values of summation and subtraction of two, three or four of the model parameters are smaller than a predefined threshold, the affine minimum MC block size can be larger (e.g. 4x4, 4x8, 8x4, 8x8, 16x16, 16x32, 32x16, 32x32, etc. ) .
- when the OBMC is applied, the sample-based affine MC or the MxN subblock-based affine MC will be turned off.
- the M or N is smaller than a threshold, such as 4.
- the sample-based affine MC or the MxN subblock-based affine MC still can be applied.
- the M or N is smaller than a threshold, such as 4.
- only the CU boundary OBMC is applied.
- the inside CU subblock OBMC is not applied.
- the sample-based affine MC still can be applied. However, only the CU boundary OBMC is applied.
- the inside CU subblock OBMC is not applied.
- when the sample-based affine MC or the MxN subblock-based affine MC is applied (e.g., in one of the affine merge/skip modes) , the OBMC is not applied.
- the M or N is smaller than a threshold, such as 4.
- when the sample-based affine MC or the MxN subblock-based affine MC is applied (e.g., in one of the affine merge/skip modes) , the CU-boundary OBMC can be applied, but the inside-CU subblock OBMC is turned off.
- the M or N is smaller than a threshold, such as 4.
- the OBMC can be replaced by other coding tools, e.g. multi-hypothesis compensation coding tools or LIC.
- the BDOF and PROF are mutually exclusive under specific conditions.
- in one example, when the enabling condition for BDOF is satisfied and BDOF is enabled, PROF will be disabled.
- in another example, when the subblock-based BDOF in the MP-DMVR third pass is enabled, PROF can still be enabled. However, if the sample-based BDOF is enabled, PROF will be disabled.
- in another example, when the sample-based BDOF is enabled, PROF can still be enabled. However, if the subblock-based BDOF in the MP-DMVR third pass is enabled, PROF will be disabled.
- in another example, when the BDOF is enabled, PROF will be disabled. However, if the subblock-based or sample-based BDOF derived delta MV (i.e., bioMv as described in the sub-section entitled: Third pass – Subblock based bi-directional optical flow MV refinement) is zero or smaller than a threshold, or there is no solution for BDOF, or the BM cost of subblock-based or sample-based BDOF is smaller than a threshold, or the affine parameters (i.e., 4 or 6 affine parameters) are larger than a threshold, or the difference between the L0 and L1 prediction samples is larger than a threshold, PROF can be enabled and BDOF is disabled. In another example, the BDOF can be applied first.
- however, if the sample value difference after applying BDOF is smaller than a threshold or larger than a threshold, the BDOF is disabled, and the PROF is applied.
- the PROF can be applied first. However, if the sample value difference after applying PROF is smaller than a threshold or larger than a threshold, the PROF is disabled and the BDOF is applied
- in one example, when the enabling condition for PROF is satisfied and PROF is enabled, BDOF will be disabled.
- in another example, when PROF is enabled, BDOF will be disabled. However, if the PROF derived delta MV (i.e., Δv x (i, j) and Δv y (i, j) in the Section entitled: Prediction Refinement with Optical Flow for Affine Mode) is zero or smaller than a threshold, or the PROF refinement ΔI (i, j) is zero or smaller than a threshold, or there is no solution for PROF, or the BM cost of subblock-based or sample-based BDOF is larger than a threshold, or the affine parameters (i.e., 4 or 6 affine parameters) are smaller than a threshold, BDOF can be enabled and PROF is disabled.
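- the mutual-exclusion logic above can be summarized by the following Python sketch (every argument and threshold is an illustrative placeholder assumed to be computed elsewhere, not a normative definition):

```python
def choose_refinement(bdof_on, prof_on, bdof_dmv, bdof_cost, affine_mag,
                      prof_dmv, thr_mv, thr_cost, thr_affine):
    if bdof_on:
        # prefer BDOF, but fall back to PROF when its refinement is negligible,
        # its BM cost is already small, or the affine model is strongly non-translational
        if bdof_dmv < thr_mv or bdof_cost < thr_cost or affine_mag > thr_affine:
            return "PROF"
        return "BDOF"
    if prof_on:
        # symmetric rule: prefer PROF, fall back to BDOF for negligible PROF offsets
        if prof_dmv < thr_mv or affine_mag < thr_affine:
            return "BDOF"
        return "PROF"
    return None
```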
- BDOF and PROF are applied in cascade if both methods are enabled.
- subblock-based BDOF is first applied and then PROF is applied based on the BDOF refined MVs. At last, sample-based BDOF is applied on the PROF refined predictor to perform the final prediction refinement.
- subblock-based BDOF is first applied and then sample-based BDOF is applied on the subblock-based BDOF refined predictor. At last, PROF is applied based on the BDOF refined predictor to perform the final prediction refinement.
- PROF is first applied to refine the affine predictor and then subblock-based BDOF is applied to further refine the affine subblock MVs.
- sample-based BDOF is applied on the subblock-based BDOF refined predictor to perform the final prediction refinement.
- the PROF is first applied to generate the affine predictors. Based on these predictors, the subblock-based BDOF is applied to derive the refined MV for a subblock. According to the new subblock MV, the PROF is applied again to generate the predictors. The regenerated predictors are further refined by sample-based BDOF if one or more conditions are satisfied.
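- this cascade can be expressed as in the Python sketch below (prof_predict, subblock_bdof_refine and sample_bdof_refine are hypothetical callables standing in for the PROF, subblock-based BDOF and sample-based BDOF processes):

```python
def cascade_prof_bdof(subblock, mv, prof_predict, subblock_bdof_refine,
                      sample_bdof_refine, conditions_ok=True):
    pred = prof_predict(subblock, mv)                 # PROF on the initial affine MV
    refined_mv = subblock_bdof_refine(subblock, pred) # subblock-based BDOF MV refinement
    pred = prof_predict(subblock, refined_mv)         # regenerate the predictor with PROF
    if conditions_ok:                                 # e.g. the conditions mentioned above
        pred = sample_bdof_refine(subblock, pred)     # final sample-based BDOF refinement
    return pred
```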
- the block-based MC is first applied to generate the affine predictors.
- the subblock-based BDOF is applied to derive the refined MV for a subblock.
- the PROF is applied again to generate the predictors.
- the regenerated predictors are further refined by sample-based BDOF if one or more conditions are satisfied.
- the block-based MC is first applied to generate the affine predictors.
- the subblock-based BDOF is applied to derive the refined MV for a subblock.
- the block-based MC is applied again to generate the predictors.
- the regenerated predictors are further refined by sample-based BDOF if one or more conditions are satisfied.
- the PROF is applied to further refine the predictors.
- the PROF is first applied to generate the affine predictors. Based on these predictors, the subblock-based BDOF is applied to derive the refined MV for a subblock. According to the new subblock MV, the block-based MC is applied again to generate the predictors. The regenerated predictors are further refined by sample-based BDOF if one or more conditions are satisfied. After that, the PROF is applied to further refine the predictors.
- in the above embodiments, a multiple-step MC (e.g. two or three or more step MCs) can be used.
- on the generated MC predictors (e.g. the predictors from the first MC) , one or more of the PROF, sample-based BDOF or block-based BDOF can be applied.
- after the MC is regenerated, one or more of the PROF, sample-based BDOF or block-based BDOF can be applied.
- only the subblock-based BDOF is applied to refine the sample or subblock MV.
- the sample-based BDOF is not applied.
- the PROF is first applied to generate the affine predictors. Based on these predictors, the subblock-based BDOF is applied to derive the refined MV for a subblock. According to the new subblock MV, the PROF is applied again to generate the final predictors.
- the block-based MC is first applied to generate the affine predictors. Based on these predictors, the subblock-based BDOF is applied to derive the refined MV for a subblock. According to the new subblock MV, the PROF is applied again to generate the final predictors.
- BDOF and PROF are applied with fusion if both methods are enabled.
- the subblock-based BDOF derived MV (i.e., bioMv as described in the sub-section entitled: Third pass – Subblock based bi-directional optical flow MV refinement) and the PROF derived MV (i.e., Δv x (i, j) and Δv y (i, j) in the Section entitled: Prediction Refinement with Optical Flow for Affine Mode) can be fused by a pre-defined weight or an adaptive weight according to the magnitude of the derived MV, the magnitude of the affine parameters, the L0 and L1 prediction sample difference, the BM cost of BDOF, the BDOF block size or the CU size.
- the prediction refinement of BDOF and PROF can be calculated by the fused MV and the sample gradient on affine predictor.
- the ⁇ v x ′ (i, j) and ⁇ v y ′ (i, j) are derived by blending, summation or weighted summation of the BDOF refined MV and PROF derived MV offset.
- the equation (3) is applied to refine the predictors.
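- the fusion and the subsequent sample refinement can be sketched as follows in Python (the weight w stands in for the pre-defined or adaptive weight, and "equation (3)" is assumed here to be the optical-flow refinement ΔI = g_x*Δv_x' + g_y*Δv_y'):

```python
def fuse_and_refine(pred, gx, gy, bdof_dv, prof_dv, w=0.5):
    dvx = w * bdof_dv[0] + (1.0 - w) * prof_dv[0]   # fused horizontal MV offset dv_x'
    dvy = w * bdof_dv[1] + (1.0 - w) * prof_dv[1]   # fused vertical MV offset dv_y'
    delta_i = gx * dvx + gy * dvy                   # per-sample prediction refinement
    return pred + delta_i                           # refined affine predictor
```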
- the PROF is first applied to generate the affine predictors. Based on these predictors, the subblock-based BDOF is applied to derive the refined MV for a subblock. According to the new subblock MV, the block-based MC is applied to generate new predictors. The regenerated predictors are further refined by both sample-based BDOF and PROF.
- the sample-based MVs are derived by blending, summation or weighted summation of the sample-based BDOF refined MV and PROF derived MV offset.
- the equation (3) is applied to refine the predictors.
- the block-based MC is first applied to generate the affine predictors. Based on these predictors, the subblock-based BDOF is applied to derive the refined MV for a subblock. According to the new subblock MV, the block-based MC is applied to generate new predictors. The regenerated predictors are further refined by both sample-based BDOF and PROF. The sample-based MVs are derived by blending, summation or weighted summation of the sample-based BDOF refined MV and the PROF derived MV offset. The equation (3) is applied to refine the predictors.
- the block-based MC is first applied to generate the affine predictors. Based on these predictors, the subblock-based BDOF and sample-based BDOF are applied to derive the refined MVs for a subblock and for each sample.
- the already generated block-based MC predictors are further refined by both sample-based BDOF and PROF.
- the sample-based MVs are derived by blending, summation or weighted summation of the sample-based BDOF refined MV and PROF derived MV offset.
- the equation (3) is applied to refine the predictors.
- the block-based MC is first applied to generate the affine predictors.
- the already generated block-based MC predictors are further refined by both sample-based BDOF and PROF.
- the sample-based MVs are derived by blending, summation or weighted summation of the sample-based BDOF refined MV and the PROF derived MV offset.
- the equation (3) is applied to refine the predictors.
- the refined MV can be selected from the subblock-based BDOF derived MV (i.e., bioMv as described in the sub-section entitled: Third pass – Subblock based bi-directional optical flow MV refinement) and the PROF derived MV (i.e., Δv x (i, j) and Δv y (i, j) in the Section entitled: Prediction Refinement with Optical Flow for Affine Mode) in a sample-based, block-based or CU-based manner by comparing the magnitudes of the derived MVs. If the magnitude of the BDOF derived MV is larger than, or smaller than, that of the PROF derived MV, the refined MV will be the BDOF derived MV.
- the BDOF can be applied when one or more conditions are satisfied.
- the above inventions can be sample-based, block-based or CU-based.
- if these conditions are not satisfied, the BDOF can be skipped. If the BM cost of subblock-based or sample-based BDOF is larger than a threshold, we can reduce the magnitude of the subblock-based or sample-based BDOF derived MV (i.e., bioMv as described in the sub-section entitled: Third pass – Subblock based bi-directional optical flow MV refinement) by a scaling factor and iteratively apply subblock-based or sample-based BDOF until the BM cost is smaller than a threshold.
- the threshold can be sample-based, block-based, or CU-based and according to subblock size, QP value or subblock MV magnitude.
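- one possible form of this iterative scaling is sketched below in Python (bdof_refine and bm_cost are hypothetical callables, and the scaling factor and iteration limit are assumptions):

```python
def iterative_bdof(subblock, mv0, mv1, bdof_refine, bm_cost,
                   cost_thr, scale=0.5, max_iters=4):
    bio_x, bio_y = bdof_refine(subblock, mv0, mv1)
    for _ in range(max_iters):
        # stop once the bilateral matching cost with the current bioMv is acceptable
        if bm_cost(subblock, (mv0[0] + bio_x, mv0[1] + bio_y),
                   (mv1[0] - bio_x, mv1[1] - bio_y)) < cost_thr:
            break
        bio_x, bio_y = bio_x * scale, bio_y * scale   # reduce the refinement magnitude
    return bio_x, bio_y
```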
- the subblock-based BDOF could be replaced by MV refinement or derivation by using BDOF.
- PROF is applied to refine the predictor before each stage of affine BDOF refinement.
- in JVET-AE0148, three stages of affine BDOF refinement are proposed.
- An affine predictor is first refined by first BDOF refinement iteration (i.e., 8x8 BDOF MV refinement) if the subblock size condition is satisfied.
- the affine predictor is then further refined by the second BDOF refinement iteration (i.e., 8x8 or 4x4 BDOF MV refinement) .
- finally, sample-based BDOF refinement is applied to directly refine the sample value of each sample.
- the sample-based BDOF refinement can be combined with PROF. That is, the BDOF MV refinement (i.e., (Vx, Vy) described in Section entitled: Sample-based BDOF) can be fused with PROF MV refinement (i.e., ( ⁇ v x (i, j) , ⁇ v y (i, j) ) described in Section entitled: Prediction Refinement with Optical Flow for Affine Mode) according to the magnitude of MV refinement, magnitude of sample gradients, a predefined weight, OBMC flag, TM or BM costs, sample intensity characteristics (i.e., similarity between L0 and L1 samples, similarity between L0 and L1 gradients, other characteristics derived from sample intensity etc. ) , or the affine model. The fused MV refinement is then used to calculate the sample refinement by multiplying with the sample gradient.
- some stages of affine BDOF refinement can be replaced by PROF.
- the sample-based BDOF refinement can be replaced by PROF.
- the sample-based BDOF refinement can be replaced by PROF under certain conditions. If the BDOF MV refinement (i.e., (Vx, Vy) described in Section entitled: Sample-based BDOF) of sample-based BDOF is smaller than a threshold or smaller than the PROF MV refinement, the PROF is applied to replace affine BDOF.
- the sample gradients and the MV refinements of PROF are recalculated after the BDOF refinement. Specifically, after n-th BDOF refinement, the subblock MVs and the sample values of each subblock can be changed. PROF is then performed on the refined subblock MVs and samples to calculate the MV refinements and sample gradients for the sample refinements.
- sample-based affine motion compensation (described in Section 6. ) can be combined with affine BDOF.
- the affine BDOF can fall back to sample-based affine motion compensation (MC) depending on the affine model. That is, if one, two, three or four of the affine parameters (i.e., a, b, d and e) as shown in the equation below or the combination of two, three or four of the affine parameters of a PU is larger and/or smaller than a threshold, the sample-based affine MC is used.
- the affine BDOF process can switch to sample-based affine MC depending on the subblock TM or BM costs, sample intensity characteristics (i.e., similarity between L0 and L1 samples, similarity between L0 and L1 gradients, other characteristics derived from sample intensity etc. ) or the magnitude of BDOF MV refinement. Specifically, if the TM or BM cost of an affine subblock or the magnitude of BDOF MV refinement is larger than a threshold, the subblock is then processed by sample-based affine MC to generate the predictor.
- sample intensity characteristics i.e., similarity between L0 and L1 samples, similarity between L0 and L1 gradients, other characteristics derived from sample intensity etc.
- sample-based affine MC or subblock-based affine MC with PROF is used to generate the predictor before each stage of affine BDOF depending on the OBMC flag or affine model. If OBMC flag is false or one, two, three or four of the affine parameters (i.e., a, b, d and e) or the combination of two, three or four of the affine parameters of a PU is larger and/or smaller than a threshold, sample-based affine MC is used to generate the predictor before each stage of affine BDOF. Otherwise, the subblock-based affine MC with PROF is applied.
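- a possible per-subblock switch following the above criteria is sketched below in Python (the input quantities, thresholds and the two prediction callables are illustrative assumptions):

```python
def predict_subblock(tm_or_bm_cost, affine_param_mag, bdof_mv_mag,
                     thr_cost, thr_affine, thr_mv,
                     sample_based_affine_mc, affine_bdof):
    # switch to per-sample affine MC when matching cost, affine-model strength
    # or the BDOF MV refinement magnitude exceeds its threshold
    if (tm_or_bm_cost > thr_cost or affine_param_mag > thr_affine
            or bdof_mv_mag > thr_mv):
        return sample_based_affine_mc()
    return affine_bdof()                  # otherwise keep the affine BDOF path
```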
- subblock OBMC is applied to further refine the affine predictor after one or more than one stages of affine BDOF process.
- the BM cost of the linear regression derived CPMV set is further calculated and compared with the BM cost after BDOF MV refinement to determine whether performing the linear regression is better.
- the linear regression is skipped. Specifically, after the n-th BDOF refinement iteration, the BM cost of the entire CU is calculated. The linear regression is performed on the BDOF refined subblock MVs to generate the new CPMV set, and the BM cost can also be calculated based on the new CPMV set. If the BM cost of the new CPMV set is larger than the BM cost of the entire CU after the n-th BDOF refinement iteration, the new CPMV set is not stored.
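The gating of the regression-derived CPMV set can be sketched as follows; `bm_cost` and `linear_regression_cpmvs` are hypothetical stand-ins for the codec's bilateral-matching cost and regression routines, and `cu.cpmvs` is an assumed container for the current CPMV set.

```python
def update_cpmvs_after_bdof(cu, refined_subblock_mvs, bm_cost,
                            linear_regression_cpmvs):
    """Sketch of the gating step after the n-th BDOF refinement iteration:
    keep the regression-derived CPMV set only if it does not increase the
    bilateral-matching cost of the entire CU."""
    cost_after_bdof = bm_cost(cu, cu.cpmvs)              # BM cost of the entire CU
    new_cpmvs = linear_regression_cpmvs(refined_subblock_mvs)
    cost_new = bm_cost(cu, new_cpmvs)                    # BM cost with regression CPMVs
    if cost_new <= cost_after_bdof:
        cu.cpmvs = new_cpmvs                             # store the new CPMV set
    # otherwise the new CPMV set is not stored
    return cu.cpmvs
```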
- non-translation parameters of the affine model (as described in Section entitled: DMVR for Affine Merge Coded Blocks) are further refined after the linear regression derived CPMV set is determined. That is, if the BM cost of the linear regression derived CPMV set is smaller than the BM cost of the entire CU after the n-th BDOF refinement iteration, the linear regression derived CPMV set is stored and refinement of the non-translation parameters of the affine model is further performed to refine the affine model.
- any of the foregoing proposed affine motion compensation methods can be implemented in encoders and/or decoders.
- any of the proposed affine motion compensation methods can be implemented in an inter coding module inside an encoder, and/or an inter decoding module inside a decoder.
- any of the proposed methods can be implemented as a circuit coupled to the inter coding module of the encoder and/or the decoder, so as to provide the information needed by the inter coding.
- any of the proposed affine motion compensation methods can be implemented in an Inter coding module (e.g. MC 152 in Fig. 1B) in a decoder or an Inter coding module in an encoder (e.g. Inter Pred. 112 in Fig. 1A) .
- any of the proposed affine motion compensation methods can also be implemented as a circuit coupled to the intra/inter coding module at the decoder or the encoder.
- the decoder or encoder may also use an additional processing unit to implement the required affine motion compensation processing.
- while the Intra/Inter Pred. units (e.g. unit 110/112 in Fig. 1A and unit 150/152 in Fig. 1B) are shown as individual processing units, they may correspond to executable software or firmware codes stored on a media, such as a hard disk or flash memory, for a CPU (Central Processing Unit) or programmable devices (e.g. DSP (Digital Signal Processor) or FPGA (Field Programmable Gate Array)) .
- Fig. 11 illustrates a flowchart of an exemplary video coding system that performs affine motion compensation using subblock-based BDOF derived MV and PROF derived MV according to an embodiment of the present invention.
- the steps shown in the flowchart may be implemented as program codes executable on one or more processors (e.g., one or more CPUs) at the encoder side.
- the steps shown in the flowchart may also be implemented based on hardware such as one or more electronic devices or processors arranged to perform the steps in the flowchart.
- input data comprising a current block are received in step 1110, wherein the input data comprise pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side, and the current block is coded using an affine model.
- a first MV (Motion Vector) is determined based on a BDOF (Bi-Directional Optical Flow) process in step 1120.
- a second MV is determined based on a PROF (Prediction Refinement with Optical Flow) process in step 1130.
- the current block is encoded or decoded by using information comprising the first MV and the second MV in step 1140.
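Read as pseudocode, the flow of Fig. 11 might look like the sketch below; the helper functions are hypothetical stand-ins for the BDOF, PROF and prediction processes named in steps 1110 to 1140.

```python
def code_block_fig11(block, bdof_mv_refinement, prof_mv_refinement,
                     affine_prediction, reconstruct):
    """Sketch of steps 1110-1140: derive a BDOF-based MV and a PROF-based MV
    for an affine-coded block, then encode/decode using information
    comprising both MVs."""
    first_mv = bdof_mv_refinement(block)       # step 1120: BDOF-derived MV
    second_mv = prof_mv_refinement(block)      # step 1130: PROF-derived MV
    prediction = affine_prediction(block, first_mv, second_mv)
    return reconstruct(block, prediction)      # step 1140: encode or decode
```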
- Fig. 12 illustrates a flowchart of an exemplary video coding system that determines the affine motion compensation block size based on OBMC on/off condition and/or the affine motion model according to an embodiment of the present invention.
- input data comprising a current block are received in step 1210, wherein the input data comprise pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side, and the current block is coded in an affine mode.
- an OBMC (Overlapped Block Motion Compensation) on/off condition of the current block, an affine motion model of the current block, or both are determined in step 1220.
- Affine MC (Motion Compensation) block size for the current block is determined according to the OBMC on/off condition of the current block, the affine motion model of the current block, or both, or whether to allow sample-based or subblock-based affine MC is determined according to the OBMC on/off condition of the current block in step 1230.
- Affine MC samples are generated for an affine MC area with the affine MC block size determined, or subblock-based affine MC is applied if the sample-based or subblock-based affine MC is allowed in step 1240.
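One possible reading of the decision in steps 1230-1240 is sketched below; the concrete granularities (per-sample, 4x4, 8x8) and the parameter threshold are illustrative assumptions only.

```python
def affine_mc_granularity(obmc_enabled, affine_params, thr=0.25):
    """Sketch of steps 1230-1240: choose the affine MC block size (or
    sample-based affine MC) from the OBMC on/off condition and the affine
    model.  When OBMC is off, subblock boundary discontinuities are not
    smoothed, so finer-grained affine MC is used; the same holds when the
    affine parameters indicate fast motion variation inside a subblock."""
    a, b, d, e = affine_params
    variation = max(abs(a), abs(b), abs(d), abs(e))
    if not obmc_enabled or variation > thr:
        return (1, 1)        # sample-based affine MC
    if variation > thr / 2:
        return (4, 4)        # smaller affine MC blocks for moderate variation
    return (8, 8)            # larger affine MC blocks otherwise
```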
- Fig. 13 illustrates a flowchart of an exemplary video coding system that determines interpolation filter tap size for affine motion compensation based on the affine block size according to an embodiment of the present invention.
- input data comprising a current block are received in step 1310, wherein the input data comprise pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side, and the current block is coded in an affine mode.
- Interpolation filter tap size for an interpolation filter is determined according to block width, block height or block size, or according to a difference between subblock motion vectors in step 1320.
- Affine MC (Motion Compensation) samples are generated using the interpolation filter with the interpolation filter tap size determined in step 1330.
- the current block is encoded or decoded by using the affine MC samples in step 1340.
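The tap-size selection of step 1320 could be sketched as below; the specific tap lengths (6 vs 8) and the size/MV-difference thresholds are assumptions for illustration, not values taken from the invention.

```python
def interp_filter_taps(block_w, block_h, subblock_mv_diff=None):
    """Sketch of step 1320: choose the interpolation filter tap size from
    the block width/height/size, or from the difference between subblock
    MVs.  Small affine blocks or strongly diverging subblock MVs use a
    shorter filter to limit memory bandwidth; larger blocks keep the longer
    default filter."""
    if subblock_mv_diff is not None and subblock_mv_diff > 1.0:
        return 6             # shorter filter when subblock MVs diverge
    if block_w * block_h <= 16 * 16 or min(block_w, block_h) <= 8:
        return 6             # shorter filter for small affine blocks
    return 8                 # default 8-tap filter otherwise
```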
- Fig. 14 illustrates a flowchart of an exemplary video coding system that combines affine BDOF and sample-based affine motion compensation according to an embodiment of the present invention.
- input data comprising a current block are received in step 1410, wherein the input data comprise pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side, and the current block is coded using an affine model.
- a target prediction mode between a sample-based affine MC (Motion Compensation) and affine BDOF (Bi-Directional Optical Flow) is determined for a current subblock of the current block according to the affine model, subblock TM (Template Matching) or BM (Block Matching) cost, sample intensity characteristics or magnitude of BDOF MV refinement in step 1420.
- Prediction samples are generated for the current block using the sample-based affine MC or the affine BDOF according to the target prediction mode in step 1430.
- the current block is encoded or decoded by using the prediction samples generated in step 1440.
- Embodiments of the present invention as described above may be implemented in various hardware, software codes, or a combination of both.
- an embodiment of the present invention can be one or more circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein.
- An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein.
- the invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or field programmable gate array (FPGA) .
- These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention.
- the software code or firmware code may be developed in different programming languages and different formats or styles.
- the software code may also be compiled for different target platforms.
- different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.
Abstract
A method and apparatus for video coding using affine motion compensation are disclosed. According to one method, for a current block coded using an affine model, a first MV is determined based on a BDOF process. A second MV is determined based on a PROF process. The current block is encoded or decoded using information comprising the first MV and the second MV. According to another method, an OBMC on/off condition of the current block, an affine motion model of the current block, or both are determined. An affine MC block size for the current block is determined according to the OBMC on/off condition of the current block, the affine motion model of the current block, or both, or whether to allow sample-based or subblock-based affine MC is determined according to the OBMC on/off condition of the current block.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202380091254.6A CN120584488A (zh) | 2023-01-13 | 2023-12-19 | Method and apparatus of affine motion compensation for block boundaries and motion refinement in video coding |
Applications Claiming Priority (8)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202363479754P | 2023-01-13 | 2023-01-13 | |
| US63/479754 | 2023-01-13 | ||
| US202363480333P | 2023-01-18 | 2023-01-18 | |
| US63/480333 | 2023-01-18 | ||
| US202363512308P | 2023-07-07 | 2023-07-07 | |
| US63/512308 | 2023-07-07 | ||
| US202363590006P | 2023-10-13 | 2023-10-13 | |
| US63/590006 | 2023-10-13 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024149035A1 (fr) | 2024-07-18 |
Family
ID=91897678
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2023/139960 Ceased WO2024149035A1 (fr) | 2023-01-13 | 2023-12-19 | Procédés et appareil de compensation de mouvement affine pour limites de bloc et raffinement de mouvement dans un codage vidéo |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN120584488A (fr) |
| WO (1) | WO2024149035A1 (fr) |
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20220078442A1 (en) * | 2019-03-20 | 2022-03-10 | Huawei Technologies Co., Ltd. | Method and apparatus for prediction refinement with optical flow for an affine coded block |
| US20200351495A1 (en) * | 2019-05-02 | 2020-11-05 | Tencent America LLC | Method and apparatus for improvements of affine prof |
| US20210029370A1 (en) * | 2019-07-23 | 2021-01-28 | Tencent America LLC | Method and apparatus for video coding |
| US20210392370A1 (en) * | 2020-06-10 | 2021-12-16 | Kt Corporation | Method and apparatus for encoding/decoding a video signal, and a recording medium storing a bitstream |
Also Published As
| Publication number | Publication date |
|---|---|
| CN120584488A (zh) | 2025-09-02 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23915789; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 202380091254.6; Country of ref document: CN |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWP | Wipo information: published in national office | Ref document number: 202380091254.6; Country of ref document: CN |