WO2020185034A1 - Method for deriving a delta motion vector, and image decoding device - Google Patents
Method for deriving a delta motion vector, and image decoding device
- Publication number
- WO2020185034A1 (PCT/KR2020/003532)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- unit
- motion vector
- integer
- samples
- flag
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/13—Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/91—Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
Definitions
- The present invention relates to encoding and decoding of an image, and more particularly to a delta motion vector derivation method and an image decoding apparatus that improve the efficiency of encoding and decoding by adaptively controlling the complexity of the process of deriving the delta motion vector.
- Because moving picture data is much larger than audio data or still image data, storing or transmitting it without compression requires substantial hardware resources, including memory.
- Moving picture data is therefore compressed by an encoder before being stored or transmitted, and a decoder receives the compressed moving picture data, decompresses it, and reproduces it.
- Video compression technologies include H.264/AVC and HEVC (High Efficiency Video Coding), which improves coding efficiency by about 40% compared to H.264/AVC.
- The present invention aims to provide an improved video encoding and decoding technology. In particular, one aspect of the present invention relates to a technique for improving the efficiency of encoding and decoding by selectively determining whether to perform all or part of the DMVR process, as indicated by the application flag.
- An aspect of the present invention provides a method of deriving a delta motion vector used for decoder-side motion vector refinement (DMVR), comprising: obtaining, from a bitstream, an application flag related to whether the DMVR is applied, a motion vector of a current subblock, and a reference picture of the current subblock; and deriving the delta motion vector using an integer sample offset that indicates, from the position indicated by the motion vector, the integer sample having the minimum sum of absolute differences (SAD) among candidate integer samples corresponding to the current subblock, wherein whether the deriving is performed is determined according to the application flag.
- SAD sum of absolute differences
- Another aspect of the present invention provides an image decoding apparatus that derives a delta motion vector used for decoder-side motion vector refinement (DMVR), comprising an acquisition unit that acquires, from a bitstream, an application flag related to whether the DMVR is applied, a motion vector of a current subblock, and a reference picture of the current subblock, the delta motion vector being calculated using an integer sample offset that indicates, from the position indicated by the motion vector, the integer sample having the minimum sum of absolute differences (SAD) among candidate integer samples corresponding to the current subblock.
- SAD minimum sum of absolute differences
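- The minimum-SAD selection described above can be illustrated with a minimal sketch. It assumes a single reference picture, a given target block to compare against, and a search range of ±2 integer samples; none of these details, nor the function names, are taken from the source, which only states that the integer sample offset points to the candidate with the minimum SAD.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def crop(picture, top, left, height, width):
    """Extract a height x width block whose top-left sample is (top, left)."""
    return [row[left:left + width] for row in picture[top:top + height]]


def best_integer_offset(target, reference, top, left, search_range=2):
    """Return ((dy, dx), cost) for the candidate integer offset with minimum SAD.

    (top, left) is the position indicated by the initial motion vector.
    search_range is a hypothetical parameter, not a value from the source.
    """
    height, width = len(target), len(target[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference picture.
            if y < 0 or x < 0:
                continue
            if y + height > len(reference) or x + width > len(reference[0]):
                continue
            cost = sad(target, crop(reference, y, x, height, width))
            if best is None or cost < best[1]:
                best = ((dy, dx), cost)
    return best


# Example: the target block actually sits one sample down and one sample
# right of the position indicated by the initial motion vector.
reference = [[r * 8 + c for c in range(8)] for r in range(8)]
target = crop(reference, 3, 4, 2, 2)
print(best_integer_offset(target, reference, top=2, left=3))  # ((1, 1), 0)
```

- In this sketch the returned offset plays the role of the integer sample offset; a decoder following the source would skip the search entirely when the application flag indicates that the DMVR is not applied.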
- FIG. 1 is an exemplary block diagram of an image encoding apparatus capable of implementing the techniques of the present disclosure.
- FIG. 2 is a diagram for explaining a method of dividing a block using a QTBTTT structure.
- FIG. 3A is a diagram illustrating a plurality of intra prediction modes.
- FIG. 3B is a diagram illustrating a plurality of intra prediction modes including wide-angle intra prediction modes.
- FIG. 4 is an exemplary block diagram of an image decoding apparatus capable of implementing the techniques of the present disclosure.
- FIG. 5 is a diagram for describing RDPCM.
- FIG. 6 is a diagram for describing a DMVR.
- FIG. 7 is an exemplary block diagram of an inter prediction unit capable of implementing the techniques of this disclosure.
- FIG. 8 is a flowchart illustrating an example of a method of deriving a delta motion vector.
- FIG. 9 is a diagram for explaining candidate integer samples.
- FIG. 10 is a flowchart illustrating an example of a method of determining whether to perform integer-unit refinement.
- FIG. 11 is a flowchart illustrating an example of a method of determining whether to interpolate candidate fractional samples.
- FIG. 1 is an exemplary block diagram of an image encoding apparatus capable of implementing the techniques of the present disclosure.
- An image encoding apparatus and its sub-elements are described below with reference to FIG. 1.
- The image encoding apparatus may include a picture dividing unit 110, a prediction unit 120, a subtractor 130, a transform unit 140, a quantization unit 145, a rearrangement unit 150, an entropy encoding unit 155, an inverse quantization unit 160, an inverse transform unit 165, an adder 170, a filter unit 180, and a memory 190.
- Each component of the image encoding apparatus may be implemented by hardware or software, or by a combination of hardware and software.
- functions of each component may be implemented as software, and a microprocessor may be implemented to execute a function of software corresponding to each component.
- One image is composed of a plurality of pictures. Each picture is divided into a plurality of regions, and encoding is performed for each region. For example, one picture is divided into one or more tiles and/or slices. Here, one or more tiles may be defined as a tile group. Each tile or slice is divided into one or more Coding Tree Units (CTUs), and each CTU is divided into one or more Coding Units (CUs) by a tree structure. Information applied to each CU is encoded as the syntax of the CU, and information commonly applied to the CUs included in one CTU is encoded as the syntax of the CTU.
- CTUs Coding Tree Units
- Information commonly applied to all blocks in one slice is encoded as the syntax of the slice header, and information applied to all blocks constituting one picture is encoded in a picture parameter set (PPS) or picture header. Further, information commonly referred to by a plurality of pictures is encoded in a sequence parameter set (SPS). In addition, information commonly referred to by one or more SPSs is encoded in a video parameter set (VPS). Also, information commonly applied to one tile or tile group may be encoded as syntax of a tile or tile group header.
- PPS picture parameter set
- SPS sequence parameter set
- VPS video parameter set
- the picture dividing unit 110 determines the size of a coding tree unit (CTU).
- Information on the size of the CTU (CTU size) is encoded as the syntax of the SPS or PPS and transmitted to the image decoding apparatus.
- After dividing each picture constituting an image into a plurality of CTUs (Coding Tree Units) having a predetermined size, the picture dividing unit 110 recursively splits each CTU using a tree structure. A leaf node in the tree structure becomes a coding unit (CU), which is a basic unit of coding.
- CU coding unit
- The tree structure may be a quad tree (QuadTree, QT) in which an upper node (or parent node) is divided into four lower nodes (or child nodes) of the same size, a binary tree (BinaryTree, BT) in which an upper node is divided into two lower nodes, a ternary tree (TernaryTree, TT) in which an upper node is divided into three lower nodes in a 1:2:1 ratio, or a structure in which two or more of the QT, BT, and TT structures are mixed.
- QT quad tree
- BT binary tree
- TT ternary tree
- QTBT QuadTree plus BinaryTree
- QTBTTT QuadTree plus BinaryTree TernaryTree
- MTT Multiple-Type Tree
- the CTU may be first divided into a QT structure.
- the quadtree division may be repeated until the size of a splitting block reaches the minimum block size (MinQTSize) of a leaf node allowed in QT.
- a first flag (QT_split_flag) indicating whether each node of the QT structure is divided into four nodes of a lower layer is encoded by the entropy encoder 155 and signaled to the image decoding apparatus. If the leaf node of the QT is not larger than the maximum block size (MaxBTSize) of the root node allowed in BT, it may be further divided into one or more of a BT structure or a TT structure.
- MaxBTSize maximum block size
- A plurality of division directions may exist. For example, the block of a corresponding node may be divided in one of two directions: horizontally or vertically.
- A second flag indicating whether a node is split and, if it is split, a flag indicating the split direction (vertical or horizontal) and/or a flag indicating the split type (binary or ternary) are encoded by the entropy encoder 155 and signaled to the image decoding apparatus.
- A CU split flag (split_cu_flag) indicating whether the node is divided may also be encoded.
- When the node is not divided, the block of the corresponding node becomes a leaf node in the split tree structure and becomes a coding unit (CU), which is a basic unit of encoding.
- CU coding unit
- a split flag indicating whether each node of the BT structure is divided into blocks of a lower layer and split type information indicating a type to be divided are encoded by the entropy encoder 155 and transmitted to the image decoding apparatus.
- a type of dividing the block of the corresponding node into two blocks having an asymmetric shape may further exist.
- the asymmetric form may include a form of dividing a block of a corresponding node into two rectangular blocks having a size ratio of 1:3, or a form of dividing a block of a corresponding node in a diagonal direction.
- the CU can have various sizes according to the QTBT or QTBTTT split from the CTU.
- Hereinafter, a block corresponding to a CU to be encoded or decoded (i.e., a leaf node of the QTBTTT) is referred to as the 'current block'.
- the shape of the current block may be not only square but also rectangular.
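- The QT, BT, and TT splits described above can be sketched as a single step of a recursive partitioner. The mode labels and the (x, y, width, height) tuple layout are hypothetical, and the sketch assumes block dimensions divisible by 4; it does not model MinQTSize, MaxBTSize, or the asymmetric split types.

```python
def split_block(x, y, w, h, mode):
    """Split block (x, y, w, h) into child blocks according to mode.

    Modes (hypothetical labels):
      'QT'   - four equally sized children
      'BT_V' - two children, left and right
      'BT_H' - two children, top and bottom
      'TT_V' - three children side by side with widths in a 1:2:1 ratio
      'TT_H' - three stacked children with heights in a 1:2:1 ratio
    """
    if mode == "QT":
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if mode == "BT_V":
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if mode == "BT_H":
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if mode == "TT_V":
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    if mode == "TT_H":
        q = h // 4
        return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
    raise ValueError(f"unknown split mode: {mode}")


# A 64x64 CTU split by QT, then one 32x32 child split by a vertical TT.
children = split_block(0, 0, 64, 64, "QT")
print(children[0])                        # (0, 0, 32, 32)
print(split_block(*children[0], "TT_V"))  # [(0, 0, 8, 32), (8, 0, 16, 32), (24, 0, 8, 32)]
```

- The rectangular children produced by the BT and TT modes show why the current block may be rectangular as well as square.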
- the prediction unit 120 predicts the current block and generates a prediction block.
- the prediction unit 120 includes an intra prediction unit 122 and an inter prediction unit 124.
- each of the current blocks in a picture can be predictively coded.
- Prediction of the current block may be performed using an intra prediction technique (using data from the picture containing the current block) or an inter prediction technique (using data from a picture coded before the picture containing the current block).
- Inter prediction includes both unidirectional prediction and bidirectional prediction.
- the intra prediction unit 122 predicts pixels in the current block by using pixels (reference pixels) located around the current block in the current picture including the current block.
- the plurality of intra prediction modes may include two non-directional modes including a planar mode and a DC mode, and 65 directional modes.
- the surrounding pixels to be used and the calculation expression are defined differently.
- Directional modes shown by dotted arrows in FIG. 3B (intra prediction modes 67 to 80 and -1 to -14) may additionally be used. These may be referred to as "wide-angle intra prediction modes". Arrows in FIG. 3B indicate the corresponding reference samples used for prediction and do not indicate a prediction direction. The prediction direction is opposite to the direction indicated by the arrow.
- The wide-angle intra prediction modes are modes in which, when the current block is rectangular, a specific directional mode is predicted in the opposite direction without additional bit transmission. In this case, the wide-angle intra prediction modes available for the current block may be determined based on the ratio of the width and height of the rectangular current block.
- For example, intra prediction modes 67 to 80 can be used when the current block is rectangular with a height smaller than its width, and the wide-angle intra prediction modes with an angle greater than -135 degrees (intra prediction modes -1 to -14) can be used when the current block is rectangular with a height greater than its width.
- the intra prediction unit 122 may determine an intra prediction mode to be used to encode the current block.
- The intra prediction unit 122 may encode the current block using several intra prediction modes and select an appropriate intra prediction mode from the tested modes. For example, the intra prediction unit 122 may calculate rate-distortion values using rate-distortion analysis for the tested intra prediction modes and select the intra prediction mode with the best rate-distortion characteristics among them.
- the intra prediction unit 122 selects one intra prediction mode from among a plurality of intra prediction modes, and predicts the current block using a neighboring pixel (reference pixel) determined according to the selected intra prediction mode and an equation.
- Information on the selected intra prediction mode is encoded by the entropy encoder 155 and transmitted to the image decoding apparatus.
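- The rate-distortion based mode selection mentioned above can be sketched as follows. The Lagrangian cost J = D + λ·R is a common formulation and is an assumption here; the source does not specify the cost function, and the mode names, distortion values, and rates below are invented purely for illustration.

```python
def select_mode(candidates, lam):
    """Return the mode with the lowest cost J = D + lam * R.

    candidates: iterable of (mode, distortion, rate_bits) tuples.
    lam: Lagrange multiplier trading distortion against rate (hypothetical).
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])[0]


# Illustrative numbers only: (mode, distortion, rate in bits).
tested = [("planar", 100, 10), ("dc", 120, 4), ("angular_34", 80, 30)]
print(select_mode(tested, lam=2))   # planar  (costs 120, 128, 140)
print(select_mode(tested, lam=10))  # dc      (costs 200, 160, 380)
```

- A larger multiplier penalizes rate more heavily, which is why the cheaper-to-signal mode wins in the second call.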
- the inter prediction unit 124 generates a prediction block for the current block through a motion compensation process.
- The inter prediction unit 124 searches a reference picture, encoded and decoded before the current picture, for the block most similar to the current block, and generates a prediction block for the current block using the block found. It then generates a motion vector corresponding to the displacement between the current block in the current picture and the prediction block in the reference picture.
- motion estimation is performed on a luma component, and a motion vector calculated based on the luma component is used for both the luma component and the chroma component.
- Motion information including information on a reference picture used to predict the current block and information on a motion vector is encoded by the entropy encoder 155 and transmitted to an image decoding apparatus.
- the subtractor 130 generates a residual block by subtracting the prediction block generated by the intra prediction unit 122 or the inter prediction unit 124 from the current block.
- the transform unit 140 divides the residual block into one or more transform blocks, applies the transform to one or more transform blocks, and transforms residual values of the transform blocks from the pixel domain to the frequency domain.
- transformed blocks are referred to as coefficient blocks comprising one or more transform coefficient values.
- a 2D transformation kernel may be used for transformation, and a 1D transformation kernel may be used for horizontal and vertical transformation respectively.
- the transform kernel may be based on discrete cosine transform (DCT), discrete sine transform (DST), or the like.
- the transform unit 140 may transform residual signals in the residual block by using the entire size of the residual block as a transform unit.
- the transform unit 140 may divide the residual block into two sub-blocks in a horizontal or vertical direction, and may perform transformation on only one of the two sub-blocks. Accordingly, the size of the transform block may be different from the size of the residual block (and thus the size of the prediction block).
- Non-zero residual sample values may not exist or may be very sparse in a subblock on which transformation is not performed.
- the residual samples of the subblock on which the transformation is not performed are not signaled, and may be regarded as "0" by the image decoding apparatus.
- The transform unit 140 may provide the entropy encoding unit 155 with information on the coding mode (or transform mode) of the residual block (e.g., information indicating whether the residual block or a residual subblock is transformed, information indicating the partition type selected to divide the residual block into subblocks, and information identifying the subblock on which transformation is performed).
- the entropy encoder 155 may encode information about a coding mode (or transform mode) of the residual block.
- the quantization unit 145 quantizes the transform coefficients output from the transform unit 140 and outputs the quantized transform coefficients to the entropy encoding unit 155.
- The quantization unit 145 may directly quantize the residual block of a certain block or frame without transformation.
- The rearrangement unit 150 may rearrange the coefficient values of the quantized residual values.
- The rearrangement unit 150 may change a two-dimensional coefficient array into a one-dimensional coefficient sequence through coefficient scanning. For example, the rearrangement unit 150 may scan from the DC coefficient to the coefficients in the high-frequency region using a zig-zag scan or a diagonal scan to output a one-dimensional coefficient sequence.
- Depending on the size of the transform unit and the intra prediction mode, a vertical scan that scans the two-dimensional coefficient array in the column direction or a horizontal scan that scans it in the row direction may be used instead of the zig-zag scan. That is, the scan method to be used may be selected from among the zig-zag scan, diagonal scan, vertical scan, and horizontal scan according to the size of the transform unit and the intra prediction mode.
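- The coefficient scans named above can be sketched as follows. The zig-zag order shown is the conventional one that alternates direction on successive anti-diagonals; the exact orders used by a particular codec may differ, so the function names and ordering are illustrative assumptions.

```python
def zigzag_scan(block):
    """Flatten a 2D block along anti-diagonals, alternating direction."""
    h, w = len(block), len(block[0])
    out = []
    for s in range(h + w - 1):
        # All positions (r, c) on the anti-diagonal r + c == s, top to bottom.
        coords = [(r, s - r) for r in range(h) if 0 <= s - r < w]
        if s % 2 == 0:
            coords.reverse()  # even anti-diagonals run bottom-left to top-right
        out.extend(block[r][c] for r, c in coords)
    return out


def vertical_scan(block):
    """Flatten a 2D block column by column."""
    return [block[r][c] for c in range(len(block[0])) for r in range(len(block))]


def horizontal_scan(block):
    """Flatten a 2D block row by row."""
    return [v for row in block for v in row]


coeffs = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
print(zigzag_scan(coeffs))      # [1, 2, 4, 7, 5, 3, 6, 8, 9]
print(vertical_scan(coeffs))    # [1, 4, 7, 2, 5, 8, 3, 6, 9]
print(horizontal_scan(coeffs))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```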
- The entropy encoding unit 155 generates a bitstream by encoding the sequence of one-dimensional quantized transform coefficients output from the rearrangement unit 150, using encoding methods such as Context-based Adaptive Binary Arithmetic Coding (CABAC) and Exponential Golomb.
- CABAC Context-based Adaptive Binary Arithmetic Coding
- Exponential Golomb
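- As an illustration of one of the methods named above, the following sketch implements order-0 Exponential Golomb coding for non-negative integers. The choice of order 0 and the use of bit strings are assumptions made for readability; the source does not say which variant is used or for which syntax elements.

```python
def exp_golomb_encode(value):
    """Encode a non-negative integer as an order-0 Exp-Golomb bit string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    binary = bin(value + 1)[2:]           # binary form of value + 1
    return "0" * (len(binary) - 1) + binary  # prefix of leading zeros


def exp_golomb_decode(bits):
    """Decode one codeword from the front of bits; return (value, remaining bits)."""
    zeros = 0
    while bits[zeros] == "0":
        zeros += 1
    end = 2 * zeros + 1                   # codeword length
    return int(bits[zeros:end], 2) - 1, bits[end:]


for v in range(5):
    print(v, exp_golomb_encode(v))
# 0 1
# 1 010
# 2 011
# 3 00100
# 4 00101
```

- Small values receive short codewords, which suits syntax elements whose small values are the most frequent.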
- The entropy encoder 155 encodes information related to block division, such as the CTU size, CU division flag, QT division flag, MTT division type, and MTT division direction, so that the image decoding apparatus can divide blocks in the same way as the image encoding apparatus.
- The entropy encoder 155 encodes information on a prediction type indicating whether the current block is encoded by intra prediction or inter prediction and, according to the prediction type, encodes intra prediction information (i.e., intra prediction mode information) or inter prediction information (reference picture and motion vector information).
- the inverse quantization unit 160 inverse quantizes the quantized transform coefficients output from the quantization unit 145 to generate transform coefficients.
- the inverse transform unit 165 converts transform coefficients output from the inverse quantization unit 160 from the frequency domain to the spatial domain to restore the residual block.
- The adder 170 restores the current block by adding the restored residual block and the prediction block generated by the prediction unit 120.
- the pixels in the reconstructed current block are used as reference pixels when intra-predicting the next block.
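- The residual, quantization, and reconstruction path described above can be sketched on a tiny block. To keep the example short, the transform is skipped and a uniform quantizer with a hypothetical step size is used; both are simplifying assumptions rather than the method of the source.

```python
def subtract(current, prediction):
    """Residual block = current block - prediction block."""
    return [[c - p for c, p in zip(cr, pr)] for cr, pr in zip(current, prediction)]


def quantize(residual, step):
    """Uniform quantization with rounding to the nearest level."""
    return [[(abs(v) + step // 2) // step * (1 if v >= 0 else -1) for v in row]
            for row in residual]


def dequantize(levels, step):
    """Inverse quantization: scale levels back by the step size."""
    return [[v * step for v in row] for row in levels]


def add(prediction, residual):
    """Reconstructed block = prediction block + restored residual block."""
    return [[p + r for p, r in zip(pr, rr)] for pr, rr in zip(prediction, residual)]


current = [[52, 55], [61, 59]]
prediction = [[50, 50], [60, 60]]
step = 4  # hypothetical quantization step

residual = subtract(current, prediction)  # [[2, 5], [1, -1]]
levels = quantize(residual, step)         # [[1, 1], [0, 0]]
restored = dequantize(levels, step)       # [[4, 4], [0, 0]]
print(add(prediction, restored))          # [[54, 54], [60, 60]]
```

- Because the encoder reconstructs from the quantized levels rather than from the original residual, it obtains the same reference pixels that a decoder would.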
- The filter unit 180 filters the reconstructed pixels to reduce blocking artifacts, ringing artifacts, blurring artifacts, and the like that occur due to block-based prediction and transformation/quantization.
- The filter unit 180 may include a deblocking filter 182 and a sample adaptive offset (SAO) filter 184.
- The deblocking filter 182 filters the boundary between reconstructed blocks to remove blocking artifacts caused by block-based encoding/decoding, and the SAO filter 184 performs additional filtering on the deblocking-filtered image.
- the SAO filter 184 is a filter used to compensate for a difference between a reconstructed pixel and an original pixel caused by lossy coding.
- the reconstructed block filtered through the deblocking filter 182 and the SAO filter 184 is stored in the memory 190.
- the reconstructed picture may be used as a reference picture for inter prediction of a block in a picture to be encoded later.
- FIG. 4 is an exemplary block diagram of an image decoding apparatus capable of implementing the techniques of the present disclosure.
- an image decoding apparatus and sub-components of the apparatus will be described with reference to FIG. 4.
- The image decoding apparatus may include an entropy decoding unit 410, a rearrangement unit 415, an inverse quantization unit 420, an inverse transform unit 430, a prediction unit 440, an adder 450, a filter unit 460, and a memory 470.
- each component of the image decoding apparatus may be implemented as hardware or software, or may be implemented as a combination of hardware and software.
- functions of each component may be implemented as software, and a microprocessor may be implemented to execute a function of software corresponding to each component.
- The entropy decoding unit 410 decodes the bitstream generated by the image encoding apparatus, extracts information related to block division to determine the current block to be decoded, and extracts the prediction information, the information on the residual signal, and the like necessary to restore the current block.
- the entropy decoding unit 410 determines the size of the CTU by extracting information on the CTU size from a sequence parameter set (SPS) or a picture parameter set (PPS), and divides the picture into CTUs of the determined size. Then, the CTU is determined as the uppermost layer of the tree structure, that is, the root node, and the CTU is divided using the tree structure by extracting partition information for the CTU.
- SPS sequence parameter set
- PPS picture parameter set
- For example, when the CTU is split using the QTBTTT structure, a first flag (QT_split_flag) related to QT splitting is extracted first, and each node is split into four nodes of a lower layer.
- For a node corresponding to a leaf node of the QT, a second flag (MTT_split_flag) related to MTT splitting and information on the split direction (vertical/horizontal) and/or split type (binary/ternary) are extracted, and the leaf node is split in the MTT structure.
- Each node may thus undergo zero or more recursive QT splits followed by zero or more recursive MTT splits.
- For example, MTT splitting may occur immediately, or, conversely, only multiple QT splits may occur.
- As another example, when the CTU is split using the QTBT structure, the first flag (QT_split_flag) related to QT splitting is extracted and each node is split into four nodes of a lower layer.
- For a node corresponding to a leaf node of the QT, a split flag indicating whether the node is further split into BT and split direction information are extracted.
- The entropy decoder 410 extracts information on a prediction type indicating whether the current block is intra predicted or inter predicted.
- When the prediction type information indicates intra prediction, the entropy decoder 410 extracts a syntax element for the intra prediction information (intra prediction mode) of the current block.
- When the prediction type information indicates inter prediction, the entropy decoder 410 extracts a syntax element for the inter prediction information, that is, information indicating a motion vector and the reference picture referenced by the motion vector.
- The entropy decoder 410 extracts from the bitstream information on the coding mode of the residual block (e.g., information on whether the residual block is encoded or only a subblock of the residual block is encoded, information indicating the partition type selected to divide the residual block into subblocks, information identifying the encoded residual subblock, quantization parameters, etc.). In addition, the entropy decoder 410 extracts information on the quantized transform coefficients of the current block as information on the residual signal.
- The rearrangement unit 415 converts the sequence of one-dimensional quantized transform coefficients entropy-decoded by the entropy decoder 410 back into a two-dimensional coefficient array (i.e., a block), in the reverse order of the coefficient scanning performed by the image encoding apparatus.
- The inverse quantization unit 420 inverse quantizes the quantized transform coefficients, and the inverse transform unit 430 inversely transforms the inverse quantized transform coefficients from the frequency domain to the spatial domain, based on the information on the coding mode of the residual block, to reconstruct the residual signals and thereby generate a reconstructed residual block for the current block.
- When the whole residual block was transformed, the inverse transform unit 430 generates the reconstructed residual block for the current block by performing inverse transformation on the inverse quantized transform coefficients using the size of the current block (and thus the size of the residual block to be reconstructed) as the transform unit.
- When only a subblock was transformed, the inverse transform unit 430 performs inverse transformation on the inverse quantized transform coefficients using the size of the transformed subblock as the transform unit to restore the residual signals for the transformed subblock, and fills the residual signals for the untransformed subblocks with the value "0", thereby generating the reconstructed residual block of the current block.
- the prediction unit 440 may include an intra prediction unit 442 and an inter prediction unit 444.
- The intra prediction unit 442 is activated when the prediction type of the current block is intra prediction, and the inter prediction unit 444 is activated when the prediction type of the current block is inter prediction.
- The intra prediction unit 442 determines the intra prediction mode of the current block among a plurality of intra prediction modes from the syntax element for the intra prediction mode extracted by the entropy decoding unit 410, and predicts the current block using reference pixels around the current block according to that intra prediction mode.
- The inter prediction unit 444 determines the motion vector of the current block and the reference picture referenced by the motion vector using the syntax elements for the inter prediction information extracted by the entropy decoding unit 410, and predicts the current block using the motion vector and the reference picture.
- the adder 450 adds the residual block output from the inverse transform unit 430 and the prediction block output from the inter prediction unit 444 or the intra prediction unit 442 to restore the current block.
- the pixels in the reconstructed current block are used as reference pixels for intra prediction of a block to be decoded later.
- the filter unit 460 may include a deblocking filter 462 and an SAO filter 464.
- the deblocking filter 462 performs deblocking filtering on the boundary between reconstructed blocks in order to remove blocking artifacts caused by decoding in units of blocks.
- the SAO filter 464 performs additional filtering on the reconstructed block after deblocking filtering in order to compensate for the difference between the reconstructed pixel and the original pixel caused by lossy coding.
- The reconstructed block filtered through the deblocking filter 462 and the SAO filter 464 is stored in the memory 470. When all blocks in one picture are reconstructed, the reconstructed picture is used as a reference picture for inter prediction of a block in a picture to be decoded later.
- VVC next-generation video coding standard
- HEVC High Efficiency Video Coding
- MTS multiple transform set or multiple transform selection
- TS transform skip
- the TS mode transform skip mode
- In the TS mode, quantization and entropy coding are performed at the pixel level without applying transformation to residual samples.
- MTS and TS modes have a relationship with each other in the conversion process of residual samples, and new ideas are proposed for this relationship in the process of VVC standardization discussion.
- Table 1 shows a comparison between one of the new ideas for the MTS and TS mode with a conventional VVC draft.
- the conventional VVC draft defines syntax elements for MTS in transform_unit() syntax.
- the maximum size of a transform block to which MTS can be applied is 32×32, and when this condition is satisfied, a syntax element tu_mts_flag indicating whether MTS is applied to the corresponding transform block is signaled.
- syntax elements for the TS mode are defined in the residual_coding() syntax.
- the maximum size of a transform block to which the TS mode can be applied is 4×4, and when this condition is satisfied, transform_skip_flag, which is a syntax element indicating whether transform is skipped, is signaled in the corresponding transform block.
- mts_idx is an index indicating a transform kernel applied to residual samples along the horizontal and vertical directions of the corresponding transform block.
- the new idea is to unify the MTS and TS modes in the transform_unit() syntax using tu_mts_idx.
- the new idea extends the TS mode, which was previously applied to transform blocks of at most 4×4, so that it can be applied to transform blocks of up to 32×32. This is to prevent duplication of syntax coding with MTS.
- Table 2 shows an example of selecting the MTS and TS modes using tu_mts_idx.
- tu_mts_idx can have a value of 0-5, and values 0-5 are binarized using truncated unary binarization. Any one of the MTS and TS modes may be selected by each value of tu_mts_idx, and transform kernels to be applied to the transform block may be designated.
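As a hedged illustration of the binarization mentioned above, the following Python sketch produces truncated unary bin strings for tu_mts_idx, assuming a maximum value (cMax) of 5. The helper name `truncated_unary` is hypothetical, and the mapping from each index value to a transform kernel or to the TS mode (Table 2) is not reproduced here.

```python
def truncated_unary(value: int, c_max: int) -> str:
    """Return the truncated unary bin string for a value in [0, c_max]."""
    if not 0 <= value <= c_max:
        raise ValueError("value must lie in [0, c_max]")
    # 'value' ones, followed by a terminating zero unless value == c_max.
    bins = "1" * value
    if value < c_max:
        bins += "0"
    return bins


# Bin strings for the six possible tu_mts_idx values (cMax = 5 is assumed).
TU_MTS_IDX_BINS = {v: truncated_unary(v, 5) for v in range(6)}
```

Under this scheme small index values receive short bin strings, and the largest value needs no terminating zero because the decoder already knows cMax.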
- RDPCM residual differential pulse-code modulation
- RDPCM is performed on residual samples after intra prediction and inter prediction when lossy compression is performed in the TS mode.
- in the TS mode, no transform is applied to the residual samples and entropy encoding is applied to them directly, so the encoding performance of RDPCM is generally lower than that of the DCT.
- for a specific image such as screen content, however, RDPCM has high encoding performance and is useful for compression. This is because such an image contains many high-frequency residual samples that occur at the boundaries of graphic elements with high color contrast. Accordingly, RDPCM can provide superior compression performance by reducing the total energy of the residual samples to be entropy-encoded in the TS mode.
- there are two types of RDPCM: an implicit method, in which prediction is performed in a horizontal direction and a vertical direction after intra prediction, and an explicit method, in which prediction is performed in a horizontal direction or a vertical direction after inter prediction.
- in the explicit method, information on the prediction direction of the RDPCM is signaled to the image decoding apparatus through the bitstream.
- FIG. 5(a) shows the horizontal direction prediction of RDPCM
- FIG. 5(b) shows the vertical direction prediction of RDPCM.
- RDPCM is performed using the residual components of the nearest left column or upper row according to a prediction direction among residual samples.
- the second residual signal, which is the result after prediction, may be expressed as in Equation 1.
- in Equation 1, Q(r) is a reconstructed residual signal including quantization noise.
- the image encoding apparatus entropy-encodes the second residual signal, then signals it to the image decoding apparatus, and restores (stores) the second residual signal to predict the residual signal of the next row.
- RDPCM in the vertical direction sequentially proceeds for all rows in the residual block, and the image decoding apparatus restores the residual signal of the i-th row by sequentially summing the reconstructed secondary residual signals as shown in Equation (2).
- in the case of implicit RDPCM, the prediction direction information of the RDPCM is estimated from intra prediction information decoded in advance, whereas in the case of explicit RDPCM, the RDPCM prediction mode information is decoded from the bitstream.
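The row-by-row procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the codec's actual arithmetic: `quantize` is a hypothetical uniform quantizer that returns the reconstructed value Q(·) directly, the function names are invented for this example, and entropy coding is omitted.

```python
def quantize(value, step):
    """Hypothetical uniform quantizer: nearest multiple of step (Q(value))."""
    sign = -1 if value < 0 else 1
    return sign * ((abs(value) + step // 2) // step) * step


def rdpcm_vertical_encode(residual, step=1):
    """Return the quantized second residual signal for vertical RDPCM."""
    rows, cols = len(residual), len(residual[0])
    second = [[0] * cols for _ in range(rows)]
    recon = [[0] * cols for _ in range(rows)]  # reconstructed residual Q(r)
    for i in range(rows):
        for j in range(cols):
            # Predict from the reconstructed sample in the row above.
            pred = recon[i - 1][j] if i > 0 else 0
            second[i][j] = quantize(residual[i][j] - pred, step)
            # Restore (store) the sample so the next row can be predicted.
            recon[i][j] = pred + second[i][j]
    return second


def rdpcm_vertical_decode(second):
    """Restore each row by accumulating the second residuals column-wise."""
    rows, cols = len(second), len(second[0])
    recon = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            recon[i][j] = (recon[i - 1][j] if i > 0 else 0) + second[i][j]
    return recon
```

Because the encoder predicts from reconstructed rather than original samples, the quantization error does not accumulate down the rows: with this quantizer each restored sample differs from the original by at most half a step.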
- decoder-side motion vector refinement (DMVR) is a method of improving the accuracy of a motion vector (MV) used in a merge mode by refining the MV in an image decoding apparatus.
- the video decoding apparatus decodes motion information from a bitstream, and derives an initial MV (MV 0 , MV 1 ) for a current subblock (currSb) in a current picture (currPic) based on the decoded motion information.
- the video decoding apparatus configures samples (candidate samples) in a prediction block corresponding to a current subblock based on a location within a reference picture (refPicL0, refPicL1) indicated by the initial MV.
- Candidate samples include integer samples (candidate integer samples) and decimal samples (candidate decimal samples).
- the video decoding apparatus determines a candidate sample having a minimum sum of absolute differences (SAD) by searching around the initial MV.
- SAD sum of absolute differences
- a delta motion vector (MV diff) representing the displacement between the position indicated by the initial MV and the position of the searched candidate sample is derived, and the 'improved' MV (MV 0 ', MV 1 ') is finally derived using the initial MV and the delta MV.
- a list (sadList[i]) of candidate integer samples of a preset number and position is set, and the delta MV is calculated using this list.
- dMvLx derived in integer units is improved in decimal units.
- a specific condition in which the fractional unit improvement is performed is when the candidate sample (bestIdx) having the smallest SAD in the derivation of the integer unit corresponds to a position indicated by the initial MV (sadList[4]).
- recognizing the problem that the fractional-unit improvement is performed unconditionally in the DMVR, the present invention proposes a new method for selectively performing the fractional improvement according to the characteristics of an image.
- FIG. 7 shows an exemplary block diagram of an inter prediction unit 444 capable of implementing the techniques of the present disclosure.
- the inter prediction unit 444 may be configured to include an acquisition unit 710, a configuration unit 720, a derivation unit 730, an improvement unit 740, a derivation unit 750, and a prediction execution unit 760.
- the configuration unit 720 may be configured to include an integer sample configuration unit 722 and an interpolation unit 724.
- the derivation unit 730 may be configured to include a determination unit 732 and an integer unit derivation unit 734.
- functions of each component will be described with reference to FIGS. 8 to 11.
- the image encoding apparatus may perform inter prediction on a current subblock, encode motion information on a prediction block of the current subblock, and signal the encoded motion information.
- the image encoding apparatus may determine whether improvement in a fractional unit is necessary, and set the determination result as the value of an application flag to be signaled.
- the entropy decoder 410 may decode the application flag and the motion information of the current subblock from the bitstream.
- the acquisition unit 710 may acquire a motion vector of the current subblock and a reference picture of the current subblock based on the motion information decoded from the bitstream (S810). As a result, the acquisition unit 710 may acquire the application flag, the motion vector of the current subblock, and the reference picture of the current subblock from the bitstream.
- the motion vector acquired by the acquisition unit 710 may correspond to an initial MV used for the DMVR, and the application flag may be a syntax element indicating information related to whether or not the DMVR is applied.
- a process of deriving the delta MV in integer units (S820 to S840) and a process of improving the delta MV derived in integer units to decimal units (S860) may be performed.
- the process of deriving the delta MV in integer units (S820 to S840) may include a process of constructing prediction samples corresponding to the current subblock (S820), a process of determining an integer sample having a minimum SAD from among the prediction samples (S830), and a process of deriving the delta MV in integer units by using the integer sample offset derived from the determined integer sample (S840).
- the configuration unit 720 may configure samples (prediction samples corresponding to the current subblock) for the prediction block of the current subblock, centering on the position indicated by the initial MV in the reference picture (S820).
- the prediction samples may include integer samples (candidate integer samples) and/or decimal samples (candidate decimal samples).
- the determination unit 732 may search for candidate integer samples located at preset positions among the prediction samples and determine an integer sample having the minimum SAD (S830). The determination unit 732 may search all candidate integer samples or search only some of the candidate integer samples in order to derive an integer sample having a minimum SAD.
- the integer unit derivation unit 734 may derive the delta MV in integer units by using the integer sample offset (S840).
- the integer sample offset may be a displacement between a position in a reference picture indicated by an initial MV and a position of an integer sample having a minimum SAD.
- the integer sample offset may indicate an integer sample having a minimum SAD from a position in a reference picture indicated by an initial MV.
- the delta MV can be derived in integer units.
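The integer-unit derivation (S820 to S840) can be sketched as an exhaustive search over the candidate integer positions. The sketch below makes several assumptions that the text does not state: the SAD is computed between two candidate blocks taken from refPicL0 and refPicL1 at mirrored offsets, pictures are plain lists of rows, all accessed positions lie inside the pictures, and the function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def fetch_block(picture, x, y, width, height):
    """Copy a width x height block whose top-left sample is at (x, y)."""
    return [row[x:x + width] for row in picture[y:y + height]]


def derive_integer_delta_mv(ref_l0, ref_l1, pos_l0, pos_l1, size, search_range=2):
    """Return (integer sample offset, SAD) of the best candidate position.

    pos_l0 / pos_l1 are the (x, y) positions indicated by the initial MVs.
    A search range of 2 visits 5 x 5 = 25 candidate integer positions.
    """
    width, height = size
    best_offset, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Assumption: the L1 candidate moves opposite to the L0 candidate.
            block0 = fetch_block(ref_l0, pos_l0[0] + dx, pos_l0[1] + dy, width, height)
            block1 = fetch_block(ref_l1, pos_l1[0] - dx, pos_l1[1] - dy, width, height)
            cost = sad(block0, block1)
            if best_cost is None or cost < best_cost:
                best_offset, best_cost = (dx, dy), cost
    return best_offset, best_cost
```

The returned offset plays the role of the integer sample offset: it is the displacement from the position indicated by the initial MV to the integer sample with the minimum SAD, and it gives the delta MV in integer units.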
- a process of improving the delta MV to a decimal unit may be performed.
- the process of determining whether to perform fractional improvement may be performed first.
- the improvement unit 740 may determine whether to perform the fractional improvement based on the application flag decoded from the bitstream (S850).
- if it is determined that the improvement in decimal units is to be performed, the improvement unit 740 may improve the delta MV derived in integer units to decimal units by using the decimal sample offset derived from the candidate integer samples (S860). In contrast, if it is determined that the improvement in decimal units is not to be performed, the delta MV derived in integer units may itself be used in the process of deriving the improved MV (S870).
- the derivation unit 750 may derive the improved MV from the initial MV and the delta MV (S870).
- the improved MV can be derived by summing the initial MV and the delta MV.
- the delta MV may be a delta MV to which both integer unit derivation and decimal unit improvement are applied, or delta MV to which only integer unit derivation is applied.
- the prediction execution unit 760 may predict the current subblock based on the improved MV (S880). That is, the prediction execution unit 760 may generate or induce a prediction block for the current subblock.
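A minimal sketch of steps S850 to S870 follows, assuming that motion vectors are stored in 1/16-sample units and that the fractional sample offset is given as a fraction of one sample. Both are assumptions made for this example, and `derive_improved_mv` is a hypothetical name.

```python
MV_UNITS_PER_SAMPLE = 16  # assumed MV storage precision (1/16 sample)


def derive_improved_mv(initial_mv, integer_offset, fractional_offset, apply_fractional):
    """Sum the initial MV and the delta MV.

    The delta MV always contains the integer sample offset; the fractional
    sample offset is added only when the application flag allows it.
    """
    delta = [component * MV_UNITS_PER_SAMPLE for component in integer_offset]
    if apply_fractional:
        delta = [d + round(f * MV_UNITS_PER_SAMPLE)
                 for d, f in zip(delta, fractional_offset)]
    return tuple(m + d for m, d in zip(initial_mv, delta))
```

With the flag cleared, the function returns the initial MV shifted by whole samples only, which corresponds to the case where the delta MV derived in integer units is used directly.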
- candidate integer samples used for the DMVR may be placed at a preset position in the reference picture.
- the positions where candidate integer samples are located may be preset based on the position indicated by the initial MV. For example, as shown in FIG. 9, when the initial MV indicates the position marked C (center) in the reference picture, the candidate integer samples may be located at position C and to the left of, to the right of, above, and below position C.
- the positions where candidate integer samples are located may be included within a preset search range. For example, if the search range is set to 2, a total of 25 integer samples, covering 2 columns to the left, 2 columns to the right, 2 rows above, and 2 rows below position C, may correspond to the candidate integer samples.
- the determination unit 732 may search the candidate integer samples to determine the integer sample having the minimum SAD among them. In addition, the determination unit 732 may search all of the candidate integer samples or only a part of them in order to derive the integer sample having the minimum SAD.
- for example, the determination unit 732 calculates the SAD of the C position and, if the calculated SAD is less than a threshold value, determines the C position as the integer sample having the minimum SAD and may end the integer sample search step. Alternatively, if the SAD of the C position is greater than or equal to the threshold value, the determination unit 732 may calculate the SADs of the remaining 24 candidate integer samples and derive the integer sample having the smallest SAD among them.
- as another example, the determination unit 732 calculates the SADs of the C position, the P1 position, the P2 position, the P3 position, and the P4 position, and when the SAD of the C position is the minimum, determines the C position as the integer sample having the minimum SAD and may end the integer sample search step (step 1). In contrast, when the SAD of the C position is not the minimum, the determination unit 732 sets the position having the smallest SAD among the P1 position, P2 position, P3 position, P4 position, and P5 position as the new C position, and the first step may be performed once more (step 2).
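The two partial-search strategies can be sketched as follows. In this hedged illustration, `cost(dx, dy)` is a hypothetical callable returning the SAD at an integer offset from the position indicated by the initial MV, P1 to P4 are assumed to be the left, right, upper, and lower neighbours of C, and the role of the P5 position is not modelled.

```python
def search_with_threshold(cost, search_range, threshold):
    """Stop at C if its SAD is below the threshold, else search the rest."""
    best_offset, best_cost = (0, 0), cost(0, 0)
    if best_cost < threshold:
        return best_offset
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            if (dx, dy) == (0, 0):
                continue  # C has already been evaluated
            candidate = cost(dx, dy)
            if candidate < best_cost:
                best_offset, best_cost = (dx, dy), candidate
    return best_offset


def two_step_cross_search(cost):
    """Evaluate C and its four neighbours; re-centre at most once."""
    center = (0, 0)
    for _ in range(2):  # step 1, then at most one repetition (step 2)
        cx, cy = center
        candidates = [center, (cx - 1, cy), (cx + 1, cy), (cx, cy - 1), (cx, cy + 1)]
        # min() keeps the first minimum, so C wins ties against its neighbours.
        best = min(candidates, key=lambda p: cost(*p))
        if best == center:
            break
        center = best
    return center
```

Both variants trade search completeness for fewer SAD evaluations: the first computes a single SAD when the initial MV is already good enough, and the second evaluates at most ten positions instead of 25.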
- the improvement unit 740 may determine whether to perform the improvement in decimal units based on the application flag.
- the application flag may directly indicate whether the fractional improvement is applied or performed, or may indicate whether the current subblock or an upper region (block, tile, slice, picture, sequence, etc.) containing the current subblock is included in a specific image in which sub-pel precision is generally not used. Here, screen content may be included in the specific image.
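- when the application flag indicates that 'the fractional improvement of the delta MV is applied' or that 'the current subblock or the upper region including the current subblock is not included in the specific image', the fractional-unit improvement can be performed.
- in contrast, when the application flag indicates that 'the fractional improvement of the delta MV is not applied' or that 'the current subblock or the upper region including the current subblock is included in the specific image', the improvement unit 740 may skip the fractional improvement.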
- the application flag may be decoded from a sequence parameter set (SPS) of the bitstream or may be decoded from a tile group header of the bitstream.
- SPS sequence parameter set
- Table 4 shows an example in which the application flag (tile_group_scc_subpel_disabled_flag) is defined in the tile group header of the bitstream and is signaled from the video encoding apparatus to the video decoding apparatus.
- the tile_group_scc_subpel_disabled_flag is an example in which the application flag is implemented as a syntax element indicating 'whether the current subblock or an upper region including the current subblock is included in the screen content'.
- the improvement unit 740 may improve the delta MV derived in the integer unit in the decimal unit by using the decimal sample offset derived from the candidate integer samples.
- the fractional improvement may include a process of deriving a fractional sample offset from candidate integer samples, and a process of improving or adjusting a delta MV derived in an integer unit by using a fractional sample offset in a fractional unit.
- the improvement unit 740 may perform the fractional improvement using a parametric error surface equation.
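One common form of the parametric error surface fits a separable quadratic to the SAD at the best integer position and at its four neighbours, then takes the minimum of that quadratic as the fractional sample offset. The sketch below uses that form with floating-point arithmetic for readability; the function name and the guard against a non-positive denominator are assumptions of this example, and a real decoder would use integer arithmetic at a fixed sub-sample precision.

```python
def fractional_sample_offset(e_center, e_left, e_right, e_up, e_down):
    """Estimate the sub-sample position of the SAD minimum.

    Models E(x, y) = A * (x - x0)**2 + B * (y - y0)**2 + C and solves
    for (x0, y0) from the five SAD values around the best integer sample.
    """
    def solve_axis(e_minus, e_plus):
        denominator = 2 * (e_minus + e_plus - 2 * e_center)
        if denominator <= 0:
            return 0.0  # flat or non-convex surface: keep the integer position
        return (e_minus - e_plus) / denominator

    return solve_axis(e_left, e_right), solve_axis(e_up, e_down)
```

Since the five SAD values are already available from the integer search, this approach refines the delta MV without computing any additional SAD on interpolated samples.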
- the example described above corresponds to an embodiment in which only whether or not to perform the improvement process in a fractional unit is determined according to the value of the application flag.
- the present invention may also determine whether to perform the integer unit derivation process depending on the value of the application flag. That is, the present invention may selectively determine whether to perform the DMVR itself (integer unit derivation and decimal unit improvement) according to the value of the application flag.
- the acquisition unit 710 may acquire the application flag, the initial MV, and the reference picture of the current subblock from the bitstream (S1010).
- the derivation unit 730 may determine whether to derive the delta MV in integer units according to the value of the application flag (S1020). If it is determined that the integer unit derivation is to be performed, the process of constructing the prediction samples (S1030), the process of determining the integer sample having the smallest SAD among the prediction samples (S1040), and the process of deriving the delta MV by using the integer sample (S1050) may be performed. In contrast, if it is determined that the integer unit derivation is not to be performed, the processes S1030 to S1050 may not be performed.
- the fractional unit improvement of the delta MV is performed using a fractional sample offset, and the fractional sample offset may be derived using the candidate integer samples obtained in the processes S1030 and S1040. Therefore, when the integer unit derivation is not performed according to the value of the application flag, it is preferable that the fractional unit improvement also not be performed. Accordingly, when the integer unit derivation is not performed, the process of improving the delta MV to the decimal unit (S1060) and the process of deriving the improved MV (S1070) may not be performed.
- determining that the integer unit derivation is not to be performed may mean that the DMVR itself is not performed. Seen more broadly, the process of determining whether to derive the delta MV in integer units according to the value of the application flag (S1020) may be understood as a process of determining whether to apply the DMVR to the current subblock.
- the prediction execution unit 760 may predict the current subblock based on the improved MV when the integer unit derivation is performed, and may predict the current subblock based on the initial MV when the integer unit derivation is not performed (S1080).
- the process of configuring candidate samples may include a process of configuring candidate integer samples (S1110) and a process of configuring candidate decimal samples through interpolation using values of the candidate integer samples (S1130).
- when the fractional improvement is not performed, the 'process of configuring candidate decimal samples' for the improvement in decimal units may be unnecessary. That is, when the fractional improvement is not performed according to the application flag, the process of configuring the candidate samples may consist only of the process of configuring the candidate integer samples.
- the configuration unit 720 may determine whether to perform the process of interpolating candidate decimal samples based on the application flag decoded from the bitstream (S1120).
- if it is determined that the interpolation is to be performed, the configuration unit 720 may construct candidate decimal samples using values of the candidate integer samples (S1130). In contrast, if it is determined that the process of interpolating the candidate decimal samples is not to be performed, the configuration unit 720 may not perform the interpolation process for constructing the candidate decimal samples. In this case, since only candidate integer samples are available, the DMVR can perform only the integer unit derivation of the delta MV.
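The flag-controlled sample configuration (S1110 to S1130) can be sketched as below. The interpolation filter is not specified in the text, so this example assumes a simple two-tap average that produces horizontal half-sample values; the function name and the dictionary layout are likewise hypothetical.

```python
def configure_candidate_samples(integer_samples, interpolate):
    """Build candidate samples; interpolate only when the flag allows it."""
    samples = {"integer": [row[:] for row in integer_samples]}  # S1110
    if interpolate:  # decision taken from the application flag (S1120)
        # Assumed filter: rounded average of two horizontal neighbours (S1130).
        samples["decimal"] = [
            [(row[x] + row[x + 1] + 1) >> 1 for x in range(len(row) - 1)]
            for row in integer_samples
        ]
    return samples
```

When `interpolate` is false the interpolation cost is avoided entirely and only the integer samples remain available, which matches the case where the DMVR performs only the integer unit derivation.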
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The invention relates to a method for deriving a delta motion vector and to an image decoding device. One embodiment of the present invention provides a method for deriving a delta motion vector used for decoder-side motion vector refinement (DMVR), comprising the steps of: acquiring, from a bitstream, an application flag related to whether the DMVR is applied, a motion vector of a current subblock, and a reference picture of the current subblock; deriving the delta motion vector in integer units from a position indicated by the motion vector by using an integer sample offset that indicates an integer sample having the minimum sum of absolute differences (SAD) among candidate integer samples corresponding to the current subblock; and improving, in fractional units, the delta motion vector derived in integer units by using a fractional sample offset derived from the candidate integer samples, wherein whether the improving step is performed is determined according to the value of the application flag. Representative drawing: FIG. 7
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR20190028991 | 2019-03-13 | ||
| KR10-2019-0028991 | 2019-03-13 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2020185034A1 (fr) | 2020-09-17 |
Family
ID=72426396
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/KR2020/003532 (WO2020185034A1 (fr), Ceased) | Method for deriving a delta motion vector, and image decoding device | 2019-03-13 | 2020-03-13 |
Country Status (2)
| Country | Link |
|---|---|
| KR (1) | KR20200110235A (fr) |
| WO (1) | WO2020185034A1 (fr) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR20150083826A (ko) * | 2015-06-30 | 2015-07-20 | 에스케이텔레콤 주식회사 | Method and apparatus for image decoding using inter prediction |
| WO2018121506A1 (fr) * | 2016-12-27 | 2018-07-05 | Mediatek Inc. | Method and apparatus of bilateral template MV refinement for video coding |
| US20180199057A1 (en) * | 2017-01-12 | 2018-07-12 | Mediatek Inc. | Method and Apparatus of Candidate Skipping for Predictor Refinement in Video Coding |
-
2020
- 2020-03-13 KR KR1020200031115A patent/KR20200110235A/ko not_active Withdrawn
- 2020-03-13 WO PCT/KR2020/003532 patent/WO2020185034A1/fr not_active Ceased
Non-Patent Citations (2)
| Title |
|---|
| BENJAMIN BROSS: "Versatile Video Coding (Draft 4)", JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11, JVET_M1001-V7, 13TH MEETING, 18 January 2019 (2019-01-18), Marrakech, MA, pages 1 - 290, XP030203323, Retrieved from the Internet <URL:http://phenix.int-evry.fr/jvet> [retrieved on 20200518] * |
| JIANLE CHEN: "Algorithm description for Versatile Video Coding and Test Model 4 (VTM 4)", JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11, JVET-M1002-V2, 13TH MEETING, 18 January 2019 (2019-01-18), Marrakech, MA, pages 1 - 62, XP030254429, Retrieved from the Internet <URL:http://phenix.int-evry.fr/jvet> [retrieved on 20200518] * |
Also Published As
| Publication number | Publication date |
|---|---|
| KR20200110235A (ko) | 2020-09-23 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2020185009A1 (fr) | Method and apparatus for efficiently coding residual blocks | |
| WO2020185004A1 (fr) | Intra prediction method and device for predicting a prediction unit and dividing a prediction unit into sub-units | |
| WO2021025478A1 (fr) | Method and device for intra prediction coding of video data | |
| WO2017043786A1 (fr) | Method and device for intra prediction in a video coding system | |
| WO2013069932A1 (fr) | Image encoding method and apparatus, and image decoding method and apparatus | |
| WO2017052000A1 (fr) | Method and apparatus for inter prediction based on motion vector refinement in an image coding system | |
| WO2016204374A1 (fr) | Method and device for image filtering in an image coding system | |
| WO2020185050A1 (fr) | Image encoding and decoding using intra block copy | |
| WO2018056602A1 (fr) | Inter-prediction apparatus and method in an image coding system | |
| WO2020231228A1 (fr) | Inverse quantization device and method used in an image decoding device | |
| WO2019240425A1 (fr) | Inter-prediction method and image decoding device | |
| WO2021060804A1 (fr) | Method for restoring a residual block of a chroma block, and decoding device | |
| WO2021145691A1 (fr) | Video encoding and decoding using adaptive color transform | |
| WO2020190077A1 (fr) | Device and method for intra prediction based on prediction mode estimation | |
| WO2022177375A1 (fr) | Method for generating a prediction block using a weighted sum of an intra prediction signal and an inter prediction signal, and device using same | |
| WO2022114742A1 (fr) | Apparatus and method for video encoding and decoding | |
| WO2020185027A1 (fr) | Method and device for efficiently applying transform skip mode to a data block | |
| WO2024058430A1 (fr) | Video coding method and apparatus that adaptively use a single tree and a dual tree in a block | |
| WO2022177317A1 (fr) | Video coding method and device using intra prediction based on subblock partitioning | |
| WO2022177380A1 (fr) | Video encoding and decoding based on inter prediction | |
| WO2022031003A1 (fr) | Method for predicting a quantization parameter used in an image encoding/decoding device | |
| WO2020185034A1 (fr) | Method for deriving a delta motion vector, and image decoding device | |
| WO2021112544A1 (fr) | Video encoding and decoding using differential modulation | |
| WO2022114752A1 (fr) | Block splitting structure for efficient prediction and transformation, and video encoding and decoding method and apparatus using the block splitting structure | |
| WO2021040430A1 (fr) | Video encoding and decoding using differential coding |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20770390; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 20770390; Country of ref document: EP; Kind code of ref document: A1 |