
WO2021018081A1 - Configurable coding tree unit size in video coding - Google Patents


Info

Publication number
WO2021018081A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
block
size
dimensions
bitstream representation
Prior art date
Legal status
Ceased
Application number
PCT/CN2020/104784
Other languages
French (fr)
Inventor
Zhipin DENG
Li Zhang
Kai Zhang
Hongbin Liu
Current Assignee
Beijing ByteDance Network Technology Co Ltd
ByteDance Inc
Original Assignee
Beijing ByteDance Network Technology Co Ltd
ByteDance Inc
Priority date
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd, ByteDance Inc filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN202080053833.8A priority Critical patent/CN114175649A/en
Publication of WO2021018081A1 publication Critical patent/WO2021018081A1/en


Classifications

    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70: characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N19/119: Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/132: Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/176: Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/96: Tree coding, e.g. quad-tree coding

Definitions

  • This document is related to video and image coding and decoding technologies.
  • Digital video accounts for the largest bandwidth use on the internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, it is expected that the bandwidth demand for digital video usage will continue to grow.
  • the disclosed techniques may be used by a video or image decoder or encoder to perform coding or decoding of video in which a configurable coding tree unit size is used.
  • a method of video processing includes performing a conversion between a video comprising one or more video regions comprising one or more video blocks and a bitstream representation of the video, wherein the conversion conforms to a rule that allows use of different sizes for the one or more video blocks in different video regions of the one or more video regions for performing the conversion.
  • a method of video processing includes determining, based on a size of a video block of a video region of a video exceeding a threshold, that the video block is split using a quadtree-based splitting until a size condition is met and an indication of the quadtree-based splitting is excluded from a bitstream representation of the video, and performing, based on the determining, a conversion between the video and the bitstream representation.
  • a method of video processing includes determining, based on dimensions of a video block of a video region of a video exceeding a threshold, whether an indication for ternary-tree (TT) splitting of the video block is signaled in a bitstream representation of the video, and performing, based on the determining, a conversion between the video and the bitstream representation.
  • a method of video processing includes determining, based on dimensions of a video block of a video region of a video exceeding a threshold, whether an indication for binary-tree (BT) splitting of the video block is signaled in a bitstream representation of the video, and performing, based on the determining, a conversion between the video and the bitstream representation.
  • a method of video processing includes performing a conversion between a video comprising a video region comprising a video block and a bitstream representation of the video, wherein the conversion comprises an affine model parameters calculation, and wherein the affine model parameters calculation is based on dimensions of the video block.
  • a method of video processing includes performing a conversion between a video comprising a video region comprising a video block and a bitstream representation of the video, wherein the conversion comprises an application of an intra block copy (IBC) tool, and wherein a size of an IBC buffer is based on maximum configurable and/or allowable dimensions of the video block.
  • a method of video processing includes performing a conversion between a video comprising one or more video regions comprising one or more video blocks and a bitstream representation of the video, wherein the conversion is performed according to a rule that specifies a relationship between an indication of a size of a video block of the one or more video blocks and an indication of a maximum size of a transform block (TB) used for the video block.
  • the above-described method may be implemented by a video encoder apparatus that comprises a processor.
  • these methods may be embodied in the form of processor-executable instructions and stored on a computer-readable program medium.
  • FIG. 1 is a block diagram of an example of a hardware platform used for implementing techniques described in the present document.
  • FIG. 2 is a block diagram of an example video processing system in which disclosed techniques may be implemented.
  • FIG. 3 is a flowchart for an example method of video processing.
  • FIG. 4 is a flowchart for another example method of video processing.
  • FIG. 5 is a flowchart for yet another example method of video processing.
  • FIG. 6 is a flowchart for yet another example method of video processing.
  • FIG. 7 is a flowchart for yet another example method of video processing.
  • FIG. 8 is a flowchart for yet another example method of video processing.
  • FIG. 9 is a flowchart for yet another example method of video processing.
  • the present document provides various techniques that can be used by a decoder of image or video bitstreams to improve the quality of decompressed or decoded digital video or images.
  • video is used herein to include both a sequence of pictures (traditionally called video) and individual images.
  • a video encoder may also implement these techniques during the process of encoding in order to reconstruct decoded frames used for further encoding.
  • Section headings are used in the present document for ease of understanding and do not limit the embodiments and techniques to the corresponding sections. As such, embodiments from one section can be combined with embodiments from other sections.
  • This document is related to video coding technologies. Specifically, it is directed to configurable coding tree units (CTUs) in video coding and decoding. It may be applied to existing video coding standards such as HEVC, or to the Versatile Video Coding (VVC) standard to be finalized. It may also be applicable to future video coding standards or video codecs.
  • Video coding standards have evolved primarily through the development of the well-known ITU-T and ISO/IEC standards.
  • the ITU-T produced H.261 and H.263, ISO/IEC produced MPEG-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262/MPEG-2 Video, H.264/MPEG-4 Advanced Video Coding (AVC), and H.265/HEVC standards.
  • the video coding standards are based on the hybrid video coding structure wherein temporal prediction plus transform coding are utilized.
  • the Joint Video Exploration Team (JVET) was founded jointly by VCEG and MPEG in 2015.
  • JVET's exploration reference software is named the Joint Exploration Model (JEM).
  • VTM-5.0 software allows 4 different CTU sizes: 16x16, 32x32, 64x64 and 128x128.
  • the minimum CTU size was redefined to 32x32 due to the adoption of JVET-O0526.
  • the CTU size in VVC working draft 6 is encoded in the SPS header in a UE-coded syntax element called log2_ctu_size_minus5.
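As an illustrative, non-normative sketch, the CTU size can be recovered from this syntax element as follows (the function name is hypothetical; the bitstream parsing itself is omitted):

```python
def ctb_size_from_sps(log2_ctu_size_minus5: int) -> int:
    # CtbLog2SizeY = log2_ctu_size_minus5 + 5 per VVC working draft 6
    ctb_log2_size_y = log2_ctu_size_minus5 + 5
    return 1 << ctb_log2_size_y

# Values 0..2 give the allowed CTU sizes 32, 64, and 128.
print([ctb_size_from_sps(v) for v in range(3)])  # [32, 64, 128]
```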
  • VVC draft 6 defines virtual pipeline data units (VPDUs) and adopts JVET-O0526.
  • log2_ctu_size_minus5 plus 5 specifies the luma coding tree block size of each CTU. It is a requirement of bitstream conformance that the value of log2_ctu_size_minus5 be less than or equal to 2.
  • log2_min_luma_coding_block_size_minus2 plus 2 specifies the minimum luma coding block size.
  • MinCbLog2SizeY = log2_min_luma_coding_block_size_minus2 + 2 (7-17)
  • MinCbSizeY = 1 << MinCbLog2SizeY (7-18)
  • IbcBufWidthY = 128 * 128 / CtbSizeY (7-19)
  • IbcBufWidthC = IbcBufWidthY / SubWidthC (7-20)
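A minimal sketch of the derivations (7-17) to (7-20) above (the function and variable names are illustrative, not from the draft text; SubWidthC defaults to 2 as in 4:2:0):

```python
def derive_min_cb_and_ibc_buffer(ctb_size_y: int,
                                 log2_min_luma_coding_block_size_minus2: int,
                                 sub_width_c: int = 2):
    # (7-17)/(7-18): minimum luma coding block size
    min_cb_log2_size_y = log2_min_luma_coding_block_size_minus2 + 2
    min_cb_size_y = 1 << min_cb_log2_size_y
    # (7-19)/(7-20): the IBC buffer width shrinks as the CTU grows,
    # keeping the luma buffer area at 128*128 samples
    ibc_buf_width_y = 128 * 128 // ctb_size_y
    ibc_buf_width_c = ibc_buf_width_y // sub_width_c
    return min_cb_size_y, ibc_buf_width_y, ibc_buf_width_c

print(derive_min_cb_and_ibc_buffer(128, 0))  # (4, 128, 64)
print(derive_min_cb_and_ibc_buffer(32, 0))   # (4, 512, 256)
```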
  • CtbWidthC and CtbHeightC which specify the width and height, respectively, of the array for each chroma CTB, are derived as follows:
  • If chroma_format_idc is equal to 0 (monochrome) or separate_colour_plane_flag is equal to 1, CtbWidthC and CtbHeightC are both set equal to 0.
  • Otherwise, CtbWidthC and CtbHeightC are derived as CtbWidthC = CtbSizeY / SubWidthC and CtbHeightC = CtbSizeY / SubHeightC.
  • slice_log2_diff_max_bt_min_qt_luma specifies the difference between the base 2 logarithm of the maximum size (width or height) in luma samples of a luma coding block that can be split using a binary split and the minimum size (width or height) in luma samples of a luma leaf block resulting from quadtree splitting of a CTU in the current slice.
  • the value of slice_log2_diff_max_bt_min_qt_luma shall be in the range of 0 to CtbLog2SizeY-MinQtLog2SizeY, inclusive.
  • When not present, the value of slice_log2_diff_max_bt_min_qt_luma is inferred as follows:
  • If slice_type is equal to 2 (I), slice_log2_diff_max_bt_min_qt_luma is inferred to be equal to sps_log2_diff_max_bt_min_qt_intra_slice_luma.
  • Otherwise, slice_log2_diff_max_bt_min_qt_luma is inferred to be equal to sps_log2_diff_max_bt_min_qt_inter_slice.
  • slice_log2_diff_max_tt_min_qt_luma specifies the difference between the base 2 logarithm of the maximum size (width or height) in luma samples of a luma coding block that can be split using a ternary split and the minimum size (width or height) in luma samples of a luma leaf block resulting from quadtree splitting of a CTU in the current slice.
  • the value of slice_log2_diff_max_tt_min_qt_luma shall be in the range of 0 to CtbLog2SizeY-MinQtLog2SizeY, inclusive.
  • When not present, the value of slice_log2_diff_max_tt_min_qt_luma is inferred as follows:
  • If slice_type is equal to 2 (I), slice_log2_diff_max_tt_min_qt_luma is inferred to be equal to sps_log2_diff_max_tt_min_qt_intra_slice_luma.
  • Otherwise, slice_log2_diff_max_tt_min_qt_luma is inferred to be equal to sps_log2_diff_max_tt_min_qt_inter_slice.
  • slice_log2_diff_min_qt_min_cb_chroma specifies the difference between the base 2 logarithm of the minimum size in luma samples of a chroma leaf block resulting from quadtree splitting of a chroma CTU with treeType equal to DUAL_TREE_CHROMA and the base 2 logarithm of the minimum coding block size in luma samples for chroma CUs with treeType equal to DUAL_TREE_CHROMA in the current slice.
  • the value of slice_log2_diff_min_qt_min_cb_chroma shall be in the range of 0 to CtbLog2SizeY-MinCbLog2SizeY, inclusive.
  • When not present, the value of slice_log2_diff_min_qt_min_cb_chroma is inferred to be equal to sps_log2_diff_min_qt_min_cb_intra_slice_chroma.
  • slice_max_mtt_hierarchy_depth_chroma specifies the maximum hierarchy depth for coding units resulting from multi-type tree splitting of a quadtree leaf with treeType equal to DUAL_TREE_CHROMA in the current slice.
  • the value of slice_max_mtt_hierarchy_depth_chroma shall be in the range of 0 to CtbLog2SizeY-MinCbLog2SizeY, inclusive.
  • When not present, the value of slice_max_mtt_hierarchy_depth_chroma is inferred to be equal to sps_max_mtt_hierarchy_depth_intra_slices_chroma.
  • slice_log2_diff_max_bt_min_qt_chroma specifies the difference between the base 2 logarithm of the maximum size (width or height) in luma samples of a chroma coding block that can be split using a binary split and the minimum size (width or height) in luma samples of a chroma leaf block resulting from quadtree splitting of a chroma CTU with treeType equal to DUAL_TREE_CHROMA in the current slice.
  • the value of slice_log2_diff_max_bt_min_qt_chroma shall be in the range of 0 to CtbLog2SizeY-MinQtLog2SizeC, inclusive.
  • When not present, the value of slice_log2_diff_max_bt_min_qt_chroma is inferred to be equal to sps_log2_diff_max_bt_min_qt_intra_slice_chroma.
  • slice_log2_diff_max_tt_min_qt_chroma specifies the difference between the base 2 logarithm of the maximum size (width or height) in luma samples of a chroma coding block that can be split using a ternary split and the minimum size (width or height) in luma samples of a chroma leaf block resulting from quadtree splitting of a chroma CTU with treeType equal to DUAL_TREE_CHROMA in the current slice.
  • the value of slice_log2_diff_max_tt_min_qt_chroma shall be in the range of 0 to CtbLog2SizeY-MinQtLog2SizeC, inclusive.
  • When not present, the value of slice_log2_diff_max_tt_min_qt_chroma is inferred to be equal to sps_log2_diff_max_tt_min_qt_intra_slice_chroma.
  • MinQtLog2SizeY, MinQtLog2SizeC, MinQtSizeY, MinQtSizeC, MaxBtSizeY, MaxBtSizeC, MinBtSizeY, MaxTtSizeY, MaxTtSizeC, MinTtSizeY, MaxMttDepthY and MaxMttDepthC are derived as follows:
  • MinQtLog2SizeY = MinCbLog2SizeY + slice_log2_diff_min_qt_min_cb_luma (7-86)
  • MinQtLog2SizeC = MinCbLog2SizeY + slice_log2_diff_min_qt_min_cb_chroma (7-87)
  • MinQtSizeY = 1 << MinQtLog2SizeY (7-88)
  • MinQtSizeC = 1 << MinQtLog2SizeC (7-89)
  • MaxBtSizeY = 1 << (MinQtLog2SizeY + slice_log2_diff_max_bt_min_qt_luma) (7-90)
  • MaxBtSizeC = 1 << (MinQtLog2SizeC + slice_log2_diff_max_bt_min_qt_chroma) (7-91)
  • MinBtSizeY = 1 << MinCbLog2SizeY (7-92)
  • MaxTtSizeY = 1 << (MinQtLog2SizeY + slice_log2_diff_max_tt_min_qt_luma) (7-93)
  • MaxTtSizeC = 1 << (MinQtLog2SizeC + slice_log2_diff_max_tt_min_qt_chroma) (7-94)
  • MinTtSizeY = 1 << MinCbLog2SizeY (7-95)
  • MaxMttDepthY = slice_max_mtt_hierarchy_depth_luma (7-96)
  • MaxMttDepthC = slice_max_mtt_hierarchy_depth_chroma (7-97)
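The slice-level derivations (7-86) to (7-97) can be sketched as follows (a non-normative illustration; the shortened key names and the example syntax-element values are assumptions, not draft text):

```python
def derive_partition_limits(min_cb_log2_size_y: int, s: dict) -> dict:
    # (7-86)/(7-87): minimum quadtree leaf sizes in log2
    min_qt_log2_y = min_cb_log2_size_y + s["log2_diff_min_qt_min_cb_luma"]
    min_qt_log2_c = min_cb_log2_size_y + s["log2_diff_min_qt_min_cb_chroma"]
    return {
        "MinQtSizeY": 1 << min_qt_log2_y,                                        # (7-88)
        "MinQtSizeC": 1 << min_qt_log2_c,                                        # (7-89)
        "MaxBtSizeY": 1 << (min_qt_log2_y + s["log2_diff_max_bt_min_qt_luma"]),  # (7-90)
        "MaxBtSizeC": 1 << (min_qt_log2_c + s["log2_diff_max_bt_min_qt_chroma"]),# (7-91)
        "MinBtSizeY": 1 << min_cb_log2_size_y,                                   # (7-92)
        "MaxTtSizeY": 1 << (min_qt_log2_y + s["log2_diff_max_tt_min_qt_luma"]),  # (7-93)
        "MaxTtSizeC": 1 << (min_qt_log2_c + s["log2_diff_max_tt_min_qt_chroma"]),# (7-94)
        "MinTtSizeY": 1 << min_cb_log2_size_y,                                   # (7-95)
        "MaxMttDepthY": s["max_mtt_hierarchy_depth_luma"],                       # (7-96)
        "MaxMttDepthC": s["max_mtt_hierarchy_depth_chroma"],                     # (7-97)
    }

limits = derive_partition_limits(2, {
    "log2_diff_min_qt_min_cb_luma": 2, "log2_diff_min_qt_min_cb_chroma": 2,
    "log2_diff_max_bt_min_qt_luma": 3, "log2_diff_max_bt_min_qt_chroma": 2,
    "log2_diff_max_tt_min_qt_luma": 2, "log2_diff_max_tt_min_qt_chroma": 2,
    "max_mtt_hierarchy_depth_luma": 3, "max_mtt_hierarchy_depth_chroma": 3,
})
print(limits["MinQtSizeY"], limits["MaxBtSizeY"], limits["MaxTtSizeY"])  # 16 128 64
```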
  • Max chroma transform size is derived from the chroma sampling ratio relative to the max luma transform size.
  • sps_max_luma_transform_size_64_flag equal to 1 specifies that the maximum transform size in luma samples is equal to 64.
  • sps_max_luma_transform_size_64_flag equal to 0 specifies that the maximum transform size in luma samples is equal to 32.
  • The variables MinTbLog2SizeY, MaxTbLog2SizeY, MinTbSizeY, and MaxTbSizeY are derived as follows:
  • MaxTbLog2SizeY = sps_max_luma_transform_size_64_flag ? 6 : 5 (7-28)
  • MinTbSizeY = 1 << MinTbLog2SizeY (7-29)
  • MaxTbSizeY = 1 << MaxTbLog2SizeY (7-30)
  • sps_sbt_max_size_64_flag equal to 0 specifies that the maximum CU width and height for allowing subblock transform is 32 luma samples.
  • sps_sbt_max_size_64_flag equal to 1 specifies that the maximum CU width and height for allowing subblock transform is 64 luma samples.
  • MaxSbtSize = Min(MaxTbSizeY, sps_sbt_max_size_64_flag ? 64 : 32) (7-31)
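The two flags above combine as in (7-28), (7-30), and (7-31); a small sketch (function name is illustrative):

```python
def derive_transform_limits(sps_max_luma_transform_size_64_flag: bool,
                            sps_sbt_max_size_64_flag: bool):
    max_tb_log2_size_y = 6 if sps_max_luma_transform_size_64_flag else 5  # (7-28)
    max_tb_size_y = 1 << max_tb_log2_size_y                               # (7-30)
    # (7-31): the subblock-transform limit can never exceed the maximum transform size
    max_sbt_size = min(max_tb_size_y, 64 if sps_sbt_max_size_64_flag else 32)
    return max_tb_size_y, max_sbt_size

print(derive_transform_limits(True, True))   # (64, 64)
print(derive_transform_limits(False, True))  # (32, 32)
```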
  • the maximum transform size and CTU size are defined independently.
  • For example, the CTU size could be 32 while the transform size could be 64. It is desirable that the maximum transform size be equal to or smaller than the CTU size.
  • the block partition process depends on the maximum transform block size rather than the VPDU size. Therefore, if the maximum transform block size is 32x32 then, in addition to prohibiting the 128x128 TT split, the 64x128 vertical BT split, and the 128x64 horizontal BT split to obey the VPDU rule, the process further prohibits the TT split for 64x64 blocks, the vertical BT split for 32x64/16x64/8x64 coding blocks, and the horizontal BT split for 64x8/64x16/64x32 coding blocks, which may hurt coding efficiency.
  • the CTU size is signaled at SPS level.
  • the adoption of reference picture resampling (a.k.a. adaptive resolution change) allows pictures to be coded with different resolutions in one bitstream, so the CTU size may be different across multiple layers.
  • the video unit size/dimension may be either the height or width of a video unit (e.g., width or height of a picture/sub-picture/slice/brick/tile/CTU/CU/CB/TU/TB) . If a video unit size is denoted by MxN, then M denotes the width and N denotes the height of the video unit.
  • a coding block may be a luma coding block, and/or a chroma coding block.
  • the size/dimension in luma samples for a coding block may be used in this invention to represent the size/dimension measured in luma samples.
  • a 128x128 coding block (or a coding block size 128x128 in luma samples) may indicate a 128x128 luma coding block and/or a 64x64 chroma coding block for the 4:2:0 color format.
  • For the 4:2:2 color format, it may refer to a 128x128 luma coding block and/or a 64x128 chroma coding block.
  • For the 4:4:4 color format, it may refer to a 128x128 luma coding block and/or a 128x128 chroma coding block.
  • one or multiple sets of CTU dimensions may be explicitly signaled at a video unit level such as VPS/DPS/SPS/PPS/APS/Picture/Subpicture/Slice/Slice header/Tile/Brick level.
  • the CTU dimensions may be different across different layers.
  • the CTU dimensions of an inter-layer picture may be implicitly derived according to the downsample/upsample scaling factor.
  • the CTU dimensions in the inter-layer coded picture may be derived as (M×S)×(N×T) or (M/S)×(N/T).
  • different CTU dimensions may be explicitly signalled for multiple layers at video unit level, e.g., for inter-layer resampling pictures/subpictures, the CTU dimensions may be signaled at
  • TT or BT split may be dependent on VPDU dimensions (such as width and/or height) .
  • For example, the VPDU has dimension VSize in luma samples and the coding tree block has dimension CtbSizeY in luma samples.
  • VSize = min(M, CtbSizeY).
  • M is an integer value such as 64.
  • whether TT or BT split is allowed or not may be independent of the maximum transform size.
  • TT split may be disabled when a coding block width or height in luma samples is greater than min (VSize, maxTtSize) .
  • TT split may be disabled for 128x128/128x64/64x128 coding block.
  • TT split may be allowed for 64x64 coding block.
  • vertical BT split may be disabled when a coding block width in luma samples is less than or equal to VSize, but its height in luma samples is greater than VSize.
  • vertical BT split may be disabled for 64x128 coding block.
  • vertical BT split may be allowed for 32x64/16x64/8x64 coding block.
  • vertical BT split may be disabled when a coding block exceeds the Picture/Subpicture width in luma samples, but its height in luma samples is greater than VSize.
  • horizontal BT split may be allowed when a coding block exceeds the Picture/Subpicture width in luma samples.
  • horizontal BT split may be disabled when a coding block width in luma samples is greater than VSize, but its height in luma samples is less than or equal to VSize.
  • vertical BT split may be disabled for 128x64 coding block.
  • horizontal BT split may be allowed for 64x8/64x16/64x32 coding block.
  • horizontal BT split may be disabled when a coding block exceeds the Picture/Subpicture height in luma samples, but its width in luma samples is greater than VSize.
  • vertical BT split may be allowed when a coding block exceeds the Picture/Subpicture height in luma samples.
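The VPDU-based TT/BT restrictions in the bullets above can be sketched as a predicate (an illustration, not the normative allowTtSplit/allowBtSplit derivation; M defaults to 64 as the bullets suggest, and the picture-boundary cases are omitted):

```python
def vpdu_split_allowances(cb_width, cb_height, ctb_size_y, max_tt_size, m=64):
    v_size = min(m, ctb_size_y)
    # TT is disabled when width or height exceeds min(VSize, maxTtSize)
    tt_limit = min(v_size, max_tt_size)
    allow_tt = cb_width <= tt_limit and cb_height <= tt_limit
    # Vertical BT is disabled when width <= VSize but height > VSize
    allow_bt_ver = not (cb_width <= v_size < cb_height)
    # Horizontal BT is disabled when width > VSize but height <= VSize
    allow_bt_hor = not (cb_height <= v_size < cb_width)
    return allow_tt, allow_bt_ver, allow_bt_hor

# With a 128x128 CTU (VSize = 64): TT disabled for 128x128, allowed for 64x64;
# vertical BT disabled for 64x128; horizontal BT disabled for 128x64.
print(vpdu_split_allowances(128, 128, 128, 64))  # (False, True, True)
print(vpdu_split_allowances(64, 128, 128, 64))   # (False, False, True)
print(vpdu_split_allowances(32, 64, 128, 64))    # (True, True, True)
```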
  • the TT or BT split flag may not be signaled and may be implicitly derived to be zero.
  • the TT and/or BT split flag may be explicitly signaled in the bitstream.
  • the TT or BT split flag may be signaled but ignored by the decoder.
  • the TT or BT split flag may be signaled but it must be zero in a conformance bitstream.
  • the CTU dimensions (such as width and/or height) may be larger than 128.
  • the signaled CTU dimensions may be 256 or even larger (e.g., log2_ctu_size_minus5 may be equal to 3 or larger) .
  • the derived CTU dimensions may be 256 or even larger.
  • the derived CTU dimensions for resampling pictures/subpictures may be larger than 128.
  • the QT split flag may be inferred to be true, and the QT split may be recursively applied until the dimensions of the split coding blocks reach a specified value (e.g., the maximum transform block size, or 128, or 64, or 32).
  • the recursive QT split may be conducted implicitly, without signaling, until the split coding block size reaches the maximum transform block size.
  • the QT split flag may not be signalled for a coding block larger than the maximum transform block size, and the QT split may be forced for the coding block until the split coding block size reaches the maximum transform block size.
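The implicit recursive QT split described above can be sketched as follows (an illustrative helper, not draft text):

```python
def implicit_qt_split(width: int, height: int, max_tb_size: int):
    """Quad-split a block, with no flags signaled, until each leaf
    fits within the maximum transform block size."""
    if width <= max_tb_size and height <= max_tb_size:
        return [(width, height)]
    leaves = []
    for _ in range(4):
        leaves += implicit_qt_split(width // 2, height // 2, max_tb_size)
    return leaves

# A 256x256 CTU with a 64x64 maximum transform block splits into 16 leaves.
leaves = implicit_qt_split(256, 256, 64)
print(len(leaves), leaves[0])  # 16 (64, 64)
```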
  • TT split flag may be conditionally signalled for CU/PU dimensions (width and/or height) larger than 128.
  • both horizontal and vertical TT split flags may be signalled for a 256x256 CU.
  • vertical TT split but not horizontal TT split may be signalled for a 256x128/256x64 CU/PU.
  • horizontal TT split but not vertical TT split may be signalled for a 128x256/64x256 CU/PU.
  • when TT split is prohibited for CU dimensions larger than 128, the TT split flag may not be signalled and is implicitly derived as zero.
  • horizontal TT split may be prohibited for 256x128/256x64 CU/PU.
  • vertical TT split may be prohibited for 128x256/64x256 CU/PU.
  • BT split flag may be conditionally signalled for CU/PU dimensions (width and/or height) larger than 128.
  • both horizontal and vertical BT split flags may be signalled for 256x256/256x128/128x256 CU/PU.
  • horizontal BT split flag may be signaled for 64x256 CU/PU.
  • vertical BT split flag may be signaled for 256x64 CU/PU.
  • when BT split is prohibited for CU dimensions larger than 128, the BT split flag may not be signalled and is implicitly derived as zero.
  • vertical BT split may be prohibited for a Kx256 CU/PU (e.g., with K equal to or smaller than 64 in luma samples), and the vertical BT split flag may not be signaled and is derived as zero.
  • vertical BT split may be prohibited for 64x256 CU/PU.
  • vertical BT split may be prohibited to avoid 32x256 CU/PU at picture/subpicture boundaries.
  • horizontal BT split may be prohibited for a 256xK coding block (e.g., with K equal to or smaller than 64 in luma samples), and the horizontal BT split flag may not be signaled and is derived as zero.
  • horizontal BT split may be prohibited for 256x64 coding block.
  • horizontal BT split may be prohibited to avoid 256x32 coding block at picture/subpicture boundaries.
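The conditional TT/BT signalling rules for CU/PU dimensions above 128 can be sketched as follows (a non-normative reading of the examples above; the threshold K is assumed to be 64, and the function name is hypothetical):

```python
def large_cu_signalable_split_flags(cu_w: int, cu_h: int, k: int = 64):
    """Which TT/BT split flags may still be signalled for CU/PU
    dimensions above 128, per the examples above."""
    flags = {"tt_hor": True, "tt_ver": True, "bt_hor": True, "bt_ver": True}
    if cu_w > 128 and cu_h <= 128:   # e.g. 256x128, 256x64: horizontal TT prohibited
        flags["tt_hor"] = False
    if cu_h > 128 and cu_w <= 128:   # e.g. 128x256, 64x256: vertical TT prohibited
        flags["tt_ver"] = False
    if cu_h > 128 and cu_w <= k:     # Kx256 with K <= 64: vertical BT prohibited
        flags["bt_ver"] = False
    if cu_w > 128 and cu_h <= k:     # 256xK with K <= 64: horizontal BT prohibited
        flags["bt_hor"] = False
    return flags

print(large_cu_signalable_split_flags(256, 256))  # all four flags True
print(large_cu_signalable_split_flags(64, 256))   # tt_ver and bt_ver False
```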
  • affine model parameters calculation may be dependent on the CTU dimensions.
  • the derivation of scaled motion vectors, and/or control point motion vectors in affine prediction may be dependent on the CTU dimensions.
  • the intra block copy (IBC) buffer may depend on the maximum configurable/allowable CTU dimensions.
  • the above-mentioned specified coding tool may be palette, and/or intra block copy (IBC), and/or intra skip mode, and/or triangle prediction mode, and/or CIIP mode, and/or regular merge mode, and/or decoder-side motion derivation, and/or bi-directional optical flow, and/or prediction refinement based optical flow, and/or affine prediction, and/or sub-block based TMVP, etc.
  • screen content coding tool such as palette and/or intra block copy (IBC) mode may be applied to large CU/PU.
  • a syntax constraint may be explicitly used for disabling the specified coding tool(s) for a large CU/PU.
  • the palette/IBC flag may be explicitly signalled for a CU/PU which is not a large CU/PU.
  • alternatively, a bitstream constraint may be used for disabling the specified coding tool(s) for a large CU/PU.
  • the maximum TU size may be dependent on CTU dimensions (width and/or height) , or CTU dimensions may be dependent on the maximum TU size
  • a bitstream constraint may be used that the maximum TU size shall be smaller than or equal to the CTU dimensions.
  • the signaling of maximum TU size may depend on the CTU dimensions.
  • the signaled maximum TU size must be smaller than N.
  • the indication of whether the maximum luma transform size is 64 or 32 may not be signaled and the maximum luma transform size may be derived as 32 implicitly.
  • Newly added parts are enclosed in bolded double braces, e.g., {{a}} denotes that “a” has been added, whereas the deleted parts from the VVC working draft are enclosed in bolded double brackets, e.g., [[b]] denotes that “b” has been deleted.
  • the modifications are based on the latest VVC working draft (JVET-O2001-v11)
  • the embodiment below is for the invented method that makes the maximum TU size dependent on the CTU size.
  • sps_max_luma_transform_size_64_flag equal to 1 specifies that the maximum transform size in luma samples is equal to 64.
  • sps_max_luma_transform_size_64_flag equal to 0 specifies that the maximum transform size in luma samples is equal to 32.
  • The variables MinTbLog2SizeY, MaxTbLog2SizeY, MinTbSizeY, and MaxTbSizeY are derived as follows:
  • MaxTbLog2SizeY = sps_max_luma_transform_size_64_flag ? 6 : 5 (7-28)
  • MinTbSizeY = 1 << MinTbLog2SizeY (7-29)
  • MaxTbSizeY = {{min(CtbSizeY, 1 << MaxTbLog2SizeY)}} (7-30)
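The effect of the modified equation (7-30), clipping the maximum transform size to the CTU size, can be sketched as (illustrative helper name):

```python
def max_tb_size_y(ctb_size_y: int, sps_max_luma_transform_size_64_flag: bool) -> int:
    # Modified (7-30): the maximum transform size never exceeds the CTU size
    max_tb_log2_size_y = 6 if sps_max_luma_transform_size_64_flag else 5
    return min(ctb_size_y, 1 << max_tb_log2_size_y)

# With a 32x32 CTU, a signaled 64-sample maximum transform size is clipped to 32.
print(max_tb_size_y(32, True))   # 32
print(max_tb_size_y(128, True))  # 64
```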
  • the embodiment below is for the invented method that makes the TT and BT split process dependent on the VPDU size.
  • variable allowBtSplit is derived as follows:
  • allowBtSplit is set equal to FALSE
  • allowBtSplit is set equal to FALSE
  • allowBtSplit is set equal to FALSE
  • allowBtSplit is set equal to FALSE
  • variable allowTtSplit is derived as follows:
  • allowTtSplit is set equal to FALSE:
  • treeType is equal to DUAL_TREE_CHROMA and (cbWidth/SubWidthC) * (cbHeight/SubHeightC) is less than or equal to 32
  • treeType is equal to DUAL_TREE_CHROMA and modeType is equal to INTRA
  • allowTtSplit is set equal to TRUE.
  • log2_ctu_size_minus5 plus 5 specifies the luma coding tree block size of each CTU. It is a requirement of bitstream conformance that the value of log2_ctu_size_minus5 be less than or equal to [[2]] {{3 (could be larger per specified)}}.
  • {{CtbLog2SizeY is used to indicate the CTU size in luma samples of the current video unit.
  • CtbLog2SizeY is calculated by the above equation. Otherwise, CtbLog2SizeY may depend on the actual CTU size, which may be explicitly signalled or implicitly derived for the current video unit. (an example)}}
  • variable availableFlagLX is derived as follows:
  • availableFlagLX is set equal to TRUE:
  • refIdxLXCorner [0] is equal to refIdxLXCorner [2]
  • availableFlagLX is set equal to FALSE.
  • the second control point motion vector cpMvLXCorner [1] is derived as follows:
  • cpMvLXCorner[1][0] = (cpMvLXCorner[0][0] << [[7]] {{CtbLog2SizeY}}) + ((cpMvLXCorner[2][1] - cpMvLXCorner[0][1]) (8-606)
  • cpMvLXCorner[1][1] = (cpMvLXCorner[0][1] << [[7]] {{CtbLog2SizeY}}) + ((cpMvLXCorner[2][0] - cpMvLXCorner[0][0]) (8-607)
  • FIG. 1 is a block diagram of a video processing apparatus 1300.
  • the apparatus 1300 may be used to implement one or more of the methods described herein.
  • the apparatus 1300 may be embodied in a smartphone, tablet, computer, Internet of Things (IoT) receiver, and so on.
  • the apparatus 1300 may include one or more processors 1302, one or more memories 1304 and video processing hardware 1306.
  • the processor(s) 1302 may be configured to implement one or more methods described in the present document.
  • the memory (memories) 1304 may be used for storing data and code used for implementing the methods and techniques described herein.
  • the video processing hardware 1306 may be used to implement, in hardware circuitry, some techniques described in the present document. In some embodiments, the hardware 1306 may be at least partially internal to the processors 1302, e.g., a graphics co-processor.
  • the video coding methods may be implemented using an apparatus that is implemented on a hardware platform as described with respect to FIG. 1.
  • Some embodiments of the disclosed technology include making a decision or determination to enable a video processing tool or mode.
  • when the video processing tool or mode is enabled, the encoder will use or implement the tool or mode in the processing of a block of video, but may not necessarily modify the resulting bitstream based on the usage of the tool or mode. That is, a conversion from the block of video to the bitstream representation of the video will use the video processing tool or mode when it is enabled based on the decision or determination.
  • when the video processing tool or mode is enabled, the decoder will process the bitstream with the knowledge that the bitstream has been modified based on the video processing tool or mode. That is, a conversion from the bitstream representation of the video to the block of video will be performed using the video processing tool or mode that was enabled based on the decision or determination.
  • Some embodiments of the disclosed technology include making a decision or determination to disable a video processing tool or mode.
  • the encoder will not use the tool or mode in the conversion of the block of video to the bitstream representation of the video.
  • the decoder will process the bitstream with the knowledge that the bitstream has not been modified using the video processing tool or mode that was enabled based on the decision or determination.
  • FIG. 2 is a block diagram showing an example video processing system 200 in which various techniques disclosed herein may be implemented.
  • the system 200 may include input 202 for receiving video content.
  • the video content may be received in a raw or uncompressed format, e.g., 8 or 10 bit multi-component pixel values, or may be in a compressed or encoded format.
  • the input 202 may represent a network interface, a peripheral bus interface, or a storage interface. Examples of network interface include wired interfaces such as Ethernet, passive optical network (PON) , etc. and wireless interfaces such as Wi-Fi or cellular interfaces.
  • the system 200 may include a coding component 204 that may implement the various coding or encoding methods described in the present document.
  • the coding component 204 may reduce the average bitrate of video from the input 202 to the output of the coding component 204 to produce a coded representation of the video.
  • the coding techniques are therefore sometimes called video compression or video transcoding techniques.
  • the output of the coding component 204 may be either stored, or transmitted via a communication connection, as represented by the component 206.
  • the stored or communicated bitstream (or coded) representation of the video received at the input 202 may be used by the component 208 for generating pixel values or displayable video that is sent to a display interface 210.
  • the process of generating user-viewable video from the bitstream representation is sometimes called video decompression.
  • while certain video processing operations are referred to as “coding” operations or tools, it will be appreciated that the coding tools or operations are used at an encoder, and corresponding decoding tools or operations that reverse the results of the coding will be performed by a decoder
  • peripheral bus interface or a display interface may include universal serial bus (USB) or high definition multimedia interface (HDMI) or Displayport, and so on.
  • storage interfaces include SATA (serial advanced technology attachment) , PCI, IDE interface, and the like.
  • FIG. 3 is a flowchart for a method 300 of video processing.
  • the method 300 includes, at operation 310, performing a conversion between a video comprising one or more video regions comprising one or more video blocks and a bitstream representation of the video, the conversion conforming to a rule that allows use of different sizes for the one or more video blocks in different video regions of the one or more video regions for performing the conversion.
  • FIG. 4 is a flowchart for a method 400 of video processing.
  • the method 400 includes, at operation 410, determining, based on a size of a video block of a video region of a video exceeding a threshold, that the video block is split using a quadtree-based splitting until a size condition is met and an indication of the quadtree-based splitting is excluded from a bitstream representation of the video.
  • the method 400 includes, at operation 420, performing, based on the determining, a conversion between the video and the bitstream representation.
  • FIG. 5 is a flowchart for a method 500 of video processing.
  • the method 500 includes, at operation 510, determining, based on dimensions of a video block of a video region of a video exceeding a threshold, whether an indication for ternary-tree (TT) splitting of the video block is signaled in a bitstream representation of the video.
  • the method 500 includes, at operation 520, performing, based on the determining, a conversion between the video and the bitstream representation.
  • FIG. 6 is a flowchart for a method 600 of video processing.
  • the method 600 includes, at operation 610, determining, based on dimensions of a video block of a video region of a video exceeding a threshold, whether an indication for binary-tree (BT) splitting of the video block is signaled in a bitstream representation of the video.
  • the method 600 includes, at operation 620, performing, based on the determining, a conversion between the video and the bitstream representation.
  • FIG. 7 is a flowchart for a method 700 of video processing.
  • the method 700 includes, at operation 710, performing a conversion between a video comprising a video region comprising a video block and a bitstream representation of the video, the conversion comprising an affine model parameters calculation, and the affine model parameters calculation being based on dimensions of the video block.
  • FIG. 8 is a flowchart for a method 800 of video processing.
  • the method 800 includes, at operation 810, performing a conversion between a video comprising a video region comprising a video block and a bitstream representation of the video, the conversion comprising an application of an intra block copy (IBC) tool, and a size of an IBC buffer being based on maximum configurable and/or allowable dimensions of the video block.
  • FIG. 9 is a flowchart for a method 900 of video processing.
  • the method 900 includes, at operation 910, performing a conversion between a video comprising one or more video regions comprising one or more video blocks and a bitstream representation of the video, the conversion being performed according to a rule that specifies a relationship between an indication of a size of a video block of the one or more video blocks and an indication of a maximum size of a transform block (TB) used for the video block.
  • a method of video processing comprising performing a conversion between a video comprising one or more video regions comprising one or more video blocks and a bitstream representation of the video, wherein the conversion conforms to a rule that allows use of different sizes for the one or more video blocks in different video regions of the one or more video regions for performing the conversion.
  • A5 The method of solution A2, wherein the syntax element is included in a video parameter set (VPS) , a decoding parameter set (DPS) , an adaptation parameter set (APS) , a picture header, a subpicture header, a slice header, a tile header, or a brick header.
  • A6 The method of solution A1, wherein the one or more video regions correspond to video layers, and wherein the one or more video blocks correspond to coding tree units (CTUs) representing logical partitions used for coding the video into the bitstream representation.
  • A12 The method of solution A8, wherein a size of a video block of the one or more video blocks of an inter-layer picture or an intra-layer picture is M×N, wherein the inter-layer picture or the intra-layer picture is resampled by a first scale factor (S) in a width dimension and by a second scale factor (T) in a height dimension, wherein the dimensions of video blocks for inter-layer referencing or intra-layer referencing are (M×S) × (N×T) or (M/S) × (N/T), and wherein M, N, S, and T are positive integers.
  • A14 The method of solution A13, wherein the different sizes are signaled in a sequence parameter set (SPS) or a picture parameter set (PPS) .
  • a method of video processing comprising determining, based on a size of a video block of a video region of a video exceeding a threshold, that the video block is split using a quadtree-based splitting until a size condition is met and an indication of the quadtree-based splitting is excluded from a bitstream representation of the video; and performing, based on the determining, a conversion between the video and the bitstream representation.
  • a method of video processing comprising determining, based on dimensions of a video block of a video region of a video exceeding a threshold, whether an indication for ternary-tree (TT) splitting of the video block is signaled in a bitstream representation of the video; and performing, based on the determining, a conversion between the video and the bitstream representation.
  • A27 The method of any of solutions A22 to A26, wherein the video block is a coding unit (CU) or a prediction unit (PU) .
  • a method of video processing comprising determining, based on dimensions of a video block of a video region of a video exceeding a threshold, whether an indication for binary-tree (BT) splitting of the video block is signaled in a bitstream representation of the video; and performing, based on the determining, a conversion between the video and the bitstream representation.
  • A33 The method of any of solutions A28 to A32, wherein the video block is a coding unit (CU) or a prediction unit (PU) .
  • a method of video processing comprising performing a conversion between a video comprising a video region comprising a video block and a bitstream representation of the video, wherein the conversion comprises an affine model parameters calculation, and wherein the affine model parameters calculation is based on dimensions of the video block.
  • A36 The method of solution A34 or A35, wherein the video block corresponds to a coding tree unit (CTU) representing a logical partition used for coding the video into the bitstream representation.
  • a method of video processing comprising performing a conversion between a video comprising a video region comprising a video block and a bitstream representation of the video, wherein the conversion comprises an application of an intra block copy (IBC) tool, and wherein a size of an IBC buffer is based on maximum configurable and/or allowable dimensions of the video block.
  • A38 The method of solution A37, wherein a width of the IBC buffer in luma samples is equal to N×N divided by a width or a height of the video block, wherein N×N is the maximum configurable dimensions of the video block in luma samples, and wherein N is an integer.
  • A40 The method of any of solutions A37 to A39, wherein the video block corresponds to a coding tree unit (CTU) representing a logical partition used for coding the video into the bitstream representation.
  • An apparatus in a video system comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions upon execution by the processor, cause the processor to implement the method in any one of solutions A1 to A42.
  • a computer program product stored on a non-transitory computer readable media, the computer program product including program code for carrying out the method in any one of solutions A1 to A42.
  • a method of video processing comprising performing a conversion between a video comprising one or more video regions comprising one or more video blocks and a bitstream representation of the video, wherein the conversion is performed according to a rule that specifies a relationship between an indication of a size of a video block of the one or more video blocks and an indication of a maximum size of a transform block (TB) used for the video block.
  • bitstream representation excludes an indication of the maximum size of the luma transform block when at least one of the dimensions of the video block is smaller than N, wherein the maximum size of the luma transform block is implicitly derived as 32, and wherein N is a positive integer.
  • An apparatus in a video system comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions upon execution by the processor, cause the processor to implement the method in any one of solutions B1 to B19.
  • a computer program product stored on a non-transitory computer readable media including program code for carrying out the method in any one of solutions B1 to B19.
  • the disclosed and other solutions, examples, embodiments, modules and the functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them.
  • the disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, data processing apparatus.
  • the computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them.
  • data processing apparatus encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers.
  • the apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
  • a propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus.
  • a computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
  • a computer program does not necessarily correspond to a file in a file system.
  • a program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document) , in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code) .
  • a computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
  • the processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output.
  • the processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit) .
  • processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer.
  • a processor will receive instructions and data from a read only memory or a random-access memory or both.
  • the essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data.
  • a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks.
  • a computer need not have such devices.
  • Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks.
  • the processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
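The quadtree-splitting behaviour summarized above (method 400) can be modelled with a small sketch. This is illustrative only, not the normative splitting process; the helper name and the use of a maximum transform-block size as the threshold are assumptions made for the example:

```python
def implicit_qt_split_sizes(cb_size, max_tb_size=64):
    """Model of an inferred quadtree split: while the coding block is
    larger than the threshold, it is split in half in each dimension,
    and no split flag needs to be signalled in the bitstream."""
    sizes = []
    while cb_size > max_tb_size:
        cb_size //= 2          # a quadtree split halves width and height
        sizes.append(cb_size)
    return sizes

# A 256x256 block with a 64-sample threshold is implicitly split twice:
# 256 -> 128 -> 64, with no split indication in the bitstream.
```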
Abstract

Methods, systems, and devices for coding or decoding video which include configurable coding tree units (CTUs) are described. An example method of video processing includes performing a conversion between a video comprising one or more video regions comprising one or more video blocks and a bitstream representation of the video, wherein the conversion conforms to a rule that allows use of different sizes for the one or more video blocks in different video regions of the one or more video regions for performing the conversion. Another example method of video processing includes determining, based on dimensions of a video block of a video region of a video exceeding a threshold, whether an indication for binary-tree (BT) splitting of the video block is signaled in a bitstream representation of the video, and performing, based on the determining, a conversion between the video and the bitstream representation.

Description

CONFIGURABLE CODING TREE UNIT SIZE IN VIDEO CODING
CROSS-REFERENCE TO RELATED APPLICATION
Under the applicable patent law and/or rules pursuant to the Paris Convention, this application is made to timely claim the priority to and benefits of International Patent Application No. PCT/CN2019/097926 filed on July 26, 2019. For all purposes under the law, the entire disclosures of the aforementioned applications are incorporated by reference as part of the disclosure of this application.
TECHNICAL FIELD
This document is related to video and image coding and decoding technologies.
BACKGROUND
Digital video accounts for the largest bandwidth use on the internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, it is expected that the bandwidth demand for digital video usage will continue to grow.
SUMMARY
The disclosed techniques may be used by video or image decoders or encoders to perform coding or decoding of video in which a configurable coding tree unit size is used.
In an example aspect a method of video processing is disclosed. The method includes performing a conversion between a video comprising one or more video regions comprising one or more video blocks and a bitstream representation of the video, wherein the conversion conforms to a rule that allows use of different sizes for the one or more video blocks in different video regions of the one or more video regions for performing the conversion.
In another example aspect a method of video processing is disclosed. The method includes determining, based on a size of a video block of a video region of a video exceeding a threshold, that the video block is split using a quadtree-based splitting until a size condition is met and an indication of the quadtree-based splitting is excluded from a bitstream representation of the video, and performing, based on the determining, a conversion between the video and the bitstream representation.
In yet another example aspect a method of video processing is disclosed. The method includes determining, based on dimensions of a video block of a video region of a video exceeding a threshold, whether an indication for ternary-tree (TT) splitting of the video block is signaled in a bitstream representation of the video, and performing, based on the determining, a conversion between the video and the bitstream representation.
In yet another example aspect a method of video processing is disclosed. The method includes determining, based on dimensions of a video block of a video region of a video exceeding a threshold, whether an indication for binary-tree (BT) splitting of the video block is signaled in a bitstream representation of the video, and performing, based on the determining, a conversion between the video and the bitstream representation.
In yet another example aspect a method of video processing is disclosed. The method includes performing a conversion between a video comprising a video region comprising a video block and a bitstream representation of the video, wherein the conversion comprises an affine model parameters calculation, and wherein the affine model parameters calculation is based on dimensions of the video block.
In yet another example aspect a method of video processing is disclosed. The method includes performing a conversion between a video comprising a video region comprising a video block and a bitstream representation of the video, wherein the conversion comprises an application of an intra block copy (IBC) tool, and wherein a size of an IBC buffer is based on maximum configurable and/or allowable dimensions of the video block.
In yet another example aspect a method of video processing is disclosed. The method includes performing a conversion between a video comprising one or more video regions comprising one or more video blocks and a bitstream representation of the video, wherein the conversion is performed according to a rule that specifies a relationship between an indication of a size of a video block of the one or more video blocks and an indication of a maximum size of a transform block (TB) used for the video block.
In another example aspect, the above-described method may be implemented by a video encoder apparatus that comprises a processor.
In yet another example aspect, these methods may be embodied in the form of processor-executable instructions and stored on a computer-readable program medium.
These, and other, aspects are further described in the present document.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of an example of a hardware platform used for implementing techniques described in the present document.
FIG. 2 is a block diagram of an example video processing system in which disclosed techniques may be implemented.
FIG. 3 is a flowchart for an example method of video processing.
FIG. 4 is a flowchart for another example method of video processing.
FIG. 5 is a flowchart for yet another example method of video processing.
FIG. 6 is a flowchart for yet another example method of video processing.
FIG. 7 is a flowchart for yet another example method of video processing.
FIG. 8 is a flowchart for yet another example method of video processing.
FIG. 9 is a flowchart for yet another example method of video processing.
DETAILED DESCRIPTION
The present document provides various techniques that can be used by a decoder of image or video bitstreams to improve the quality of decompressed or decoded digital video or images. For brevity, the term “video” is used herein to include both a sequence of pictures (traditionally called video) and individual images. Furthermore, a video encoder may also implement these techniques during the process of encoding in order to reconstruct decoded frames used for further encoding.
Section headings are used in the present document for ease of understanding and do not limit the embodiments and techniques to the corresponding sections. As such, embodiments from one section can be combined with embodiments from other sections.
1. Summary
This document is related to video coding technologies. Specifically, it is directed to configurable coding tree units (CTUs) in video coding and decoding. It may be applied to an existing video coding standard like HEVC, or to the standard to be finalized (Versatile Video Coding). It may also be applicable to future video coding standards or video codecs.
2. Initial discussion
Video coding standards have evolved primarily through the development of the well-known ITU-T and ISO/IEC standards. The ITU-T produced H.261 and H.263, ISO/IEC produced MPEG-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262/MPEG-2 Video, H.264/MPEG-4 Advanced Video Coding (AVC), and H.265/HEVC standards. Since H.262, video coding standards have been based on a hybrid video coding structure in which temporal prediction plus transform coding is utilized. To explore future video coding technologies beyond HEVC, the Joint Video Exploration Team (JVET) was founded by VCEG and MPEG jointly in 2015. Since then, many new methods have been adopted by JVET and put into the reference software named Joint Exploration Model (JEM). The JVET meeting is held once every quarter, and the new coding standard targets a 50% bitrate reduction as compared to HEVC. The new video coding standard was officially named Versatile Video Coding (VVC) at the April 2018 JVET meeting, and the first version of the VVC test model (VTM) was released at that time. As there is a continuous effort contributing to VVC standardization, new coding techniques are adopted into the VVC standard at every JVET meeting. The VVC working draft and test model VTM are then updated after every meeting. The VVC project is now aiming for technical completion (FDIS) at the July 2020 meeting.
2.1 CTU size in VVC
VTM-5.0 software allows 4 different CTU sizes: 16x16, 32x32, 64x64 and 128x128. However, at the July 2019 JVET meeting, the minimum CTU size was redefined to 32x32 due to the adoption of JVET-O0526. The CTU size in VVC working draft 6 is encoded in the SPS header in a ue(v)-coded syntax element called log2_ctu_size_minus5.
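The ue(v) (unsigned Exp-Golomb) coding of log2_ctu_size_minus5 can be sketched as follows. This is an illustrative decoder operating on a string of bits, not the normative parsing process:

```python
def decode_ue(bits, pos=0):
    """Decode one unsigned Exp-Golomb (ue(v)) codeword from a string of
    '0'/'1' characters, returning (value, next_position)."""
    leading_zeros = 0
    while bits[pos + leading_zeros] == '0':
        leading_zeros += 1
    # codeword layout: leading_zeros zeros, a '1', then leading_zeros info bits
    end = pos + 2 * leading_zeros + 1
    value = int(bits[pos + leading_zeros:end], 2) - 1
    return value, end

# "011" decodes to log2_ctu_size_minus5 = 2, i.e. a 128x128 CTU.
value, _ = decode_ue("011")
ctu_size = 1 << (value + 5)
```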
Below are the corresponding spec modifications in VVC draft 6 with the definition of Virtual pipeline data units (VPDUs) and the adoption of JVET-O0526.
7.3.2.3. Sequence parameter set RBSP syntax
[Sequence parameter set RBSP syntax table, shown as an image in the original publication]
7.4.3.3. Sequence parameter set RBSP semantics
log2_ctu_size_minus5 plus 5 specifies the luma coding tree block size of each CTU. It is a requirement of bitstream conformance that the value of log2_ctu_size_minus5 be less than or equal to 2.
log2_min_luma_coding_block_size_minus2 plus 2 specifies the minimum luma coding block size.
The variables CtbLog2SizeY, CtbSizeY, MinCbLog2SizeY, MinCbSizeY, IbcBufWidthY, IbcBufWidthC and Vsize are derived as follows:
CtbLog2SizeY = log2_ctu_size_minus5+5             (7-15)
CtbSizeY = 1<<CtbLog2SizeY         (7-16)
MinCbLog2SizeY = log2_min_luma_coding_block_size_minus2+2      (7-17)
MinCbSizeY = 1<<MinCbLog2SizeY       (7-18)
IbcBufWidthY = 128*128/CtbSizeY       (7-19)
IbcBufWidthC = IbcBufWidthY/SubWidthC            (7-20)
VSize = Min (64, CtbSizeY)         (7-21)
The variables CtbWidthC and CtbHeightC, which specify the width and height, respectively, of the array for each chroma CTB, are derived as follows:
– If chroma_format_idc is equal to 0 (monochrome) or separate_colour_plane_flag is equal to 1, CtbWidthC and CtbHeightC are both equal to 0.
– Otherwise, CtbWidthC and CtbHeightC are derived as follows:
CtbWidthC = CtbSizeY /SubWidthC        (7-22)
CtbHeightC = CtbSizeY /SubHeightC       (7-23)
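The derivations in equations (7-15) through (7-23) can be expressed directly in code. A minimal sketch, assuming 4:2:0 chroma (i.e. SubWidthC = SubHeightC = 2) as the default; the function name and dictionary keys are chosen for the example:

```python
def derive_ctb_variables(log2_ctu_size_minus5, log2_min_luma_cb_size_minus2,
                         sub_width_c=2, sub_height_c=2):
    ctb_log2_size_y = log2_ctu_size_minus5 + 5             # (7-15)
    ctb_size_y = 1 << ctb_log2_size_y                      # (7-16)
    min_cb_log2_size_y = log2_min_luma_cb_size_minus2 + 2  # (7-17)
    min_cb_size_y = 1 << min_cb_log2_size_y                # (7-18)
    ibc_buf_width_y = 128 * 128 // ctb_size_y              # (7-19)
    ibc_buf_width_c = ibc_buf_width_y // sub_width_c       # (7-20)
    v_size = min(64, ctb_size_y)                           # (7-21)
    ctb_width_c = ctb_size_y // sub_width_c                # (7-22)
    ctb_height_c = ctb_size_y // sub_height_c              # (7-23)
    return {"CtbSizeY": ctb_size_y, "MinCbSizeY": min_cb_size_y,
            "IbcBufWidthY": ibc_buf_width_y, "IbcBufWidthC": ibc_buf_width_c,
            "VSize": v_size, "CtbWidthC": ctb_width_c,
            "CtbHeightC": ctb_height_c}
```

For example, log2_ctu_size_minus5 = 0 (a 32x32 CTU) gives IbcBufWidthY = 512 and VSize = 32, while log2_ctu_size_minus5 = 2 (a 128x128 CTU) gives IbcBufWidthY = 128 and VSize = 64.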
For log2BlockWidth ranging from 0 to 4 and for log2BlockHeight ranging from 0 to 4, inclusive, the up-right diagonal and raster scan order array initialization process as specified in clause 6.5.2 is invoked with 1<<log2BlockWidth and 1<<log2BlockHeight as inputs, and the output is assigned to DiagScanOrder [log2BlockWidth] [log2BlockHeight] and RasterScanOrder [log2BlockWidth] [log2BlockHeight] .
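The up-right diagonal scan order referenced above can be sketched as follows. This is an illustrative re-implementation of the idea behind clause 6.5.2, not the normative text:

```python
def up_right_diag_scan(blk_width, blk_height):
    """Return the (x, y) positions of a blk_width x blk_height block in
    up-right diagonal scan order: each anti-diagonal is scanned from its
    bottom-left end towards its top-right end."""
    order = []
    x, y = 0, 0
    while len(order) < blk_width * blk_height:
        while y >= 0:
            if x < blk_width and y < blk_height:
                order.append((x, y))
            x, y = x + 1, y - 1   # move up-right along the diagonal
        x, y = 0, x               # start the next diagonal at the left edge
    return order
```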
slice_log2_diff_max_bt_min_qt_luma specifies the difference between the base 2 logarithm of the maximum size (width or height) in luma samples of a luma coding block that can be split using a binary split and the minimum size (width or height) in luma samples of a luma leaf block resulting from quadtree splitting of a CTU in the current slice. The value of slice_log2_diff_max_bt_min_qt_luma shall be in the range of 0 to CtbLog2SizeY-MinQtLog2SizeY, inclusive. When not present, the value of slice_log2_diff_max_bt_min_qt_luma is inferred as follows:
– If slice_type equal to 2 (I) , the value of slice_log2_diff_max_bt_min_qt_luma is inferred to be equal to sps_log2_diff_max_bt_min_qt_intra_slice_luma
– Otherwise (slice_type equal to 0 (B) or 1 (P) ) , the value of slice_log2_diff_max_bt_min_qt_luma is inferred to be equal to sps_log2_diff_max_bt_min_qt_inter_slice.
slice_log2_diff_max_tt_min_qt_luma specifies the difference between the base 2 logarithm of the maximum size (width or height) in luma samples of a luma coding block that can be split using a ternary split and the minimum size (width or height) in luma samples of a luma leaf block resulting from quadtree splitting of a CTU in the current slice. The value of slice_log2_diff_max_tt_min_qt_luma shall be in the range of 0 to CtbLog2SizeY-MinQtLog2SizeY, inclusive. When not present, the value of slice_log2_diff_max_tt_min_qt_luma is inferred as follows:
– If slice_type equal to 2 (I) , the value of slice_log2_diff_max_tt_min_qt_luma is inferred to be equal to sps_log2_diff_max_tt_min_qt_intra_slice_luma
– Otherwise (slice_type equal to 0 (B) or 1 (P) ) , the value of slice_log2_diff_max_tt_min_qt_luma is inferred to be equal to sps_log2_diff_max_tt_min_qt_inter_slice.
slice_log2_diff_min_qt_min_cb_chroma specifies the difference between the base 2 logarithm of the minimum size in luma samples of a chroma leaf block resulting from quadtree splitting of a chroma CTU with treeType equal to DUAL_TREE_CHROMA and the base 2 logarithm of the minimum coding block size in luma samples for chroma CUs with treeType equal to DUAL_TREE_CHROMA in the current slice. The value of slice_log2_diff_min_qt_min_cb_chroma shall be in the range of 0 to CtbLog2SizeY-MinCbLog2SizeY, inclusive. When not present, the value of slice_log2_diff_min_qt_min_cb_chroma is inferred to be equal to sps_log2_diff_min_qt_min_cb_intra_slice_chroma.
slice_max_mtt_hierarchy_depth_chroma specifies the maximum hierarchy depth for coding units resulting from multi-type tree splitting of a quadtree leaf with treeType equal to DUAL_TREE_CHROMA in the current slice. The value of slice_max_mtt_hierarchy_depth_chroma shall be in the range of 0 to CtbLog2SizeY-MinCbLog2SizeY, inclusive. When not present, the value of slice_max_mtt_hierarchy_depth_chroma is inferred to be equal to sps_max_mtt_hierarchy_depth_intra_slices_chroma.
slice_log2_diff_max_bt_min_qt_chroma specifies the difference between the base 2 logarithm of the maximum size (width or height) in luma samples of a chroma coding block that can be split using a binary split and the minimum size (width or height) in luma samples of a chroma leaf block resulting from quadtree splitting of a chroma CTU with treeType equal to DUAL_TREE_CHROMA in the current slice. The value of slice_log2_diff_max_bt_min_qt_chroma shall be in the range of 0 to CtbLog2SizeY-MinQtLog2SizeC, inclusive. When not present, the value of slice_log2_diff_max_bt_min_qt_chroma is inferred to be equal to sps_log2_diff_max_bt_min_qt_intra_slice_chroma.
slice_log2_diff_max_tt_min_qt_chroma specifies the difference between the base 2 logarithm of the maximum size (width or height) in luma samples of a chroma coding block that can be split using a ternary split and the minimum size (width or height) in luma samples of a chroma leaf block resulting from quadtree splitting of a chroma CTU with treeType equal to DUAL_TREE_CHROMA in the current slice. The value of slice_log2_diff_max_tt_min_qt_chroma shall be in the range of 0 to CtbLog2SizeY-MinQtLog2SizeC, inclusive. When not present, the value of slice_log2_diff_max_tt_min_qt_chroma is inferred to be equal to sps_log2_diff_max_tt_min_qt_intra_slice_chroma.
The variables MinQtLog2SizeY, MinQtLog2SizeC, MinQtSizeY, MinQtSizeC, MaxBtSizeY, MaxBtSizeC, MinBtSizeY, MaxTtSizeY, MaxTtSizeC, MinTtSizeY, MaxMttDepthY and MaxMttDepthC are derived as follows:
MinQtLog2SizeY = MinCbLog2SizeY+slice_log2_diff_min_qt_min_cb_luma   (7-86)
MinQtLog2SizeC = MinCbLog2SizeY+slice_log2_diff_min_qt_min_cb_chroma (7-87)
MinQtSizeY = 1<<MinQtLog2SizeY        (7-88)
MinQtSizeC = 1<<MinQtLog2SizeC         (7-89)
MaxBtSizeY = 1<< (MinQtLog2SizeY+slice_log2_diff_max_bt_min_qt_luma) (7-90)
MaxBtSizeC = 1<< (MinQtLog2SizeC+slice_log2_diff_max_bt_min_qt_chroma) (7-91)
MinBtSizeY = 1<<MinCbLog2SizeY        (7-92)
MaxTtSizeY = 1<< (MinQtLog2SizeY+slice_log2_diff_max_tt_min_qt_luma) (7-93)
MaxTtSizeC = 1<< (MinQtLog2SizeC+slice_log2_diff_max_tt_min_qt_chroma) (7-94)
MinTtSizeY = 1<<MinCbLog2SizeY          (7-95)
MaxMttDepthY = slice_max_mtt_hierarchy_depth_luma        (7-96)
MaxMttDepthC = slice_max_mtt_hierarchy_depth_chroma      (7-97)
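The derivation in equations (7-86) to (7-97) can be sketched in Python as follows; the function name and the example input values are illustrative only, not part of the draft text.

```python
# Sketch of the slice-level partition-variable derivation (eqs. 7-86 to 7-97).
# Inputs are MinCbLog2SizeY and the slice-header syntax-element values;
# the concrete example numbers below are illustrative, not normative.

def derive_partition_vars(min_cb_log2_size_y,
                          slice_log2_diff_min_qt_min_cb_luma,
                          slice_log2_diff_min_qt_min_cb_chroma,
                          slice_log2_diff_max_bt_min_qt_luma,
                          slice_log2_diff_max_bt_min_qt_chroma,
                          slice_log2_diff_max_tt_min_qt_luma,
                          slice_log2_diff_max_tt_min_qt_chroma,
                          slice_max_mtt_hierarchy_depth_luma,
                          slice_max_mtt_hierarchy_depth_chroma):
    min_qt_log2_size_y = min_cb_log2_size_y + slice_log2_diff_min_qt_min_cb_luma    # (7-86)
    min_qt_log2_size_c = min_cb_log2_size_y + slice_log2_diff_min_qt_min_cb_chroma  # (7-87)
    return {
        "MinQtSizeY": 1 << min_qt_log2_size_y,                                          # (7-88)
        "MinQtSizeC": 1 << min_qt_log2_size_c,                                          # (7-89)
        "MaxBtSizeY": 1 << (min_qt_log2_size_y + slice_log2_diff_max_bt_min_qt_luma),   # (7-90)
        "MaxBtSizeC": 1 << (min_qt_log2_size_c + slice_log2_diff_max_bt_min_qt_chroma), # (7-91)
        "MinBtSizeY": 1 << min_cb_log2_size_y,                                          # (7-92)
        "MaxTtSizeY": 1 << (min_qt_log2_size_y + slice_log2_diff_max_tt_min_qt_luma),   # (7-93)
        "MaxTtSizeC": 1 << (min_qt_log2_size_c + slice_log2_diff_max_tt_min_qt_chroma), # (7-94)
        "MinTtSizeY": 1 << min_cb_log2_size_y,                                          # (7-95)
        "MaxMttDepthY": slice_max_mtt_hierarchy_depth_luma,                             # (7-96)
        "MaxMttDepthC": slice_max_mtt_hierarchy_depth_chroma,                           # (7-97)
    }

# Example: MinCbLog2SizeY = 2 (4x4 minimum CU) with hypothetical slice values.
vars_ = derive_partition_vars(2, 3, 2, 0, 1, 1, 1, 2, 2)
```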
2.2 Maximum transform size in VVC
In VVC Draft 5, the maximum transform size is signalled in the SPS, but it is fixed at 64 and is not configurable. However, at the July 2019 JVET meeting, it was decided to allow the maximum luma transform size to be either 64 or 32, selected by a flag at the SPS level. The maximum chroma transform size is derived from the maximum luma transform size using the chroma sampling ratio.
Below are the corresponding spec modifications in VVC draft 6 with the adoption of JVET-O05xxx.
7.3.2.3. Sequence parameter set RBSP syntax
Figure PCTCN2020104784-appb-000002
Figure PCTCN2020104784-appb-000003
7.4.3.3. Sequence parameter set RBSP semantics
sps_max_luma_transform_size_64_flag equal to 1 specifies that the maximum transform size in luma samples is equal to 64. sps_max_luma_transform_size_64_flag equal to 0 specifies that the maximum transform size in luma samples is equal to 32.
When CtbSizeY is less than 64, the value of sps_max_luma_transform_size_64_flag shall be equal to 0. The variables MinTbLog2SizeY, MaxTbLog2SizeY, MinTbSizeY, and MaxTbSizeY are derived as follows:
MinTbLog2SizeY = 2       (7-27)
MaxTbLog2SizeY = sps_max_luma_transform_size_64_flag? 6: 5     (7-28)
MinTbSizeY = 1<<MinTbLog2SizeY       (7-29)
MaxTbSizeY = 1<<MaxTbLog2SizeY          (7-30)
sps_sbt_max_size_64_flag equal to 0 specifies that the maximum CU width and height for allowing subblock transform is 32 luma samples. sps_sbt_max_size_64_flag equal to 1 specifies that the maximum CU width and height for allowing subblock transform is 64 luma samples.
MaxSbtSize = Min (MaxTbSizeY, sps_sbt_max_size_64_flag ? 64: 32) (7-31)
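The derivation in equations (7-27) to (7-31) can be sketched in Python as follows (the function name is illustrative, not from the draft):

```python
# Sketch of the max-transform-size derivation (eqs. 7-27 to 7-31).
def derive_max_transform_sizes(sps_max_luma_transform_size_64_flag,
                               sps_sbt_max_size_64_flag):
    min_tb_log2_size_y = 2                                                # (7-27)
    max_tb_log2_size_y = 6 if sps_max_luma_transform_size_64_flag else 5  # (7-28)
    min_tb_size_y = 1 << min_tb_log2_size_y                               # (7-29)
    max_tb_size_y = 1 << max_tb_log2_size_y                               # (7-30)
    # Subblock-transform limit is additionally clipped to MaxTbSizeY.
    max_sbt_size = min(max_tb_size_y,
                       64 if sps_sbt_max_size_64_flag else 32)            # (7-31)
    return min_tb_size_y, max_tb_size_y, max_sbt_size
```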
3. Examples of technical problems addressed by the disclosed technical solutions
There are several problems in the latest VVC working draft JVET-O2001-v11, which are described below.
1) In current VVC draft 6, the maximum transform size and the CTU size are defined independently. E.g., the CTU size could be 32, whereas the transform size could be 64. It is desirable that the maximum transform size be equal to or smaller than the CTU size.
2) In current VVC draft 6, the block partition process depends on the maximum transform block size rather than the VPDU size. Therefore, if the maximum transform block size is 32x32, then in addition to prohibiting the 128x128 TT split, the 64x128 vertical BT split, and the 128x64 horizontal BT split to obey the VPDU rule, it further prohibits the TT split for 64x64 blocks, the vertical BT split for 32x64/16x64/8x64 coding blocks, and the horizontal BT split for 64x8/64x16/64x32 coding blocks, which may not be efficient in terms of coding efficiency.
3) Current VVC draft 6 allows CTU size equal to 32, 64, and 128. However, it is possible that the CTU size could be larger than 128. Thus some syntax elements need to be modified.
a) If larger CTU size is allowed, the block partition structure and the signaling of block split flags may be redesigned.
b) If larger CTU size is allowed, then some of the current designs (e.g., affine parameter derivation, IBC prediction, IBC buffer size, merge triangle prediction, CIIP, regular merge mode, etc.) may be redesigned.
4) In current VVC draft 6, the CTU size is signaled at the SPS level. However, since the adoption of reference picture resampling (a.k.a. adaptive resolution change) allows pictures to be coded with different resolutions in one bitstream, the CTU size may be different across multiple layers.
4. Example embodiments and techniques
The listing of solutions below should be considered as examples to explain some concepts. These items should not be interpreted in a narrow way. Furthermore, these items can be combined in any manner.
In this document, C=min (a, b) indicates that C is equal to the minimum value between a and b.
In this document, the video unit size/dimension may be either the height or width of a video unit (e.g., width or height of a picture/sub-picture/slice/brick/tile/CTU/CU/CB/TU/TB) . If a video unit size is denoted by MxN, then M denotes the width and N denotes the height of the video unit.
In this document, “a coding block” may be a luma coding block and/or a chroma coding block. In this invention, the size/dimension in luma samples for a coding block represents the size/dimension measured in luma samples. For example, a 128x128 coding block (or a coding block of size 128x128 in luma samples) may indicate a 128x128 luma coding block and/or a 64x64 chroma coding block for the 4:2:0 color format. Similarly, for the 4:2:2 color format, it may refer to a 128x128 luma coding block and/or a 64x128 chroma coding block. For the 4:4:4 color format, it may refer to a 128x128 luma coding block and/or a 128x128 chroma coding block.
Configurable CTU size related
1. It is proposed that different CTU dimensions (such as width and/or height) may be allowed for different video units such as Layers/Pictures/Subpictures/Slices/Tiles/Bricks.
a) In one example, one or multiple sets of CTU dimensions may be explicitly signaled at a video unit level such as VPS/DPS/SPS/PPS/APS/Picture/Subpicture/Slice/Slice header/Tile/Brick level.
b) In one example, when the reference picture resampling (a.k.a. Adaptive Resolution Change) is allowed, the CTU dimensions may be different across different layers.
i. For example, the CTU dimensions of an inter-layer picture may be implicitly derived according to the downsample/upsample scaling factor.
1. For example, if the signaled CTU dimensions for a base layer are M×N (such as M=128 and N=128) and the inter-layer coded picture is resampled by a scaling factor S in width and a scaling factor T in height, each of which may be larger or smaller than 1 (such as S=1/4 and T=1/2, denoting that the inter-layer coded picture is downsampled by a factor of 4 in width and a factor of 2 in height), then the CTU dimensions in the inter-layer coded picture may be derived as (M×S) × (N×T), or (M/S) × (N/T).
ii. For example, different CTU dimensions may be explicitly signalled for multiple layers at the video unit level; e.g., for inter-layer resampling pictures/subpictures, the CTU dimensions may be signaled at the VPS/DPS/SPS/PPS/APS/picture/subpicture/slice/slice header/tile/brick level and may be different from the base-layer CTU size.
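As a non-normative illustration of bullet b) above, the inter-layer CTU dimensions might be derived from the base-layer dimensions and the resampling scale factors roughly as follows. The function name is invented, and the (M×S) × (N×T) alternative is chosen here purely for illustration.

```python
# Hypothetical sketch: derive inter-layer CTU dimensions from base-layer CTU
# dimensions (M x N) and resampling scale factors S (width) and T (height),
# following the (M*S) x (N*T) alternative described above.
from fractions import Fraction

def derive_inter_layer_ctu_dims(base_w, base_h, scale_w, scale_h):
    # scale_w / scale_h may be larger or smaller than 1 (e.g. "1/4", "1/2").
    return int(base_w * Fraction(scale_w)), int(base_h * Fraction(scale_h))

# Base layer 128x128; picture downsampled by 4 in width and 2 in height.
```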
2. It is proposed that whether TT or BT split is allowed or not may be dependent on the VPDU dimensions (such as width and/or height). Suppose the VPDU has dimension VSize in luma samples, and the coding tree block has dimension CtbSizeY in luma samples.
a) In one example, VSize = min (M, CtbSizeY) . M is an integer value such as 64.
b) In one example, whether TT or BT split is allowed or not may be independent of the maximum transform size.
c) In one example, TT split may be disabled when a coding block width or height in luma samples is greater than min (VSize, maxTtSize) .
i. In one example, when maximum transform size is equal to 32x32 but VSize is equal to 64x64, TT split may be disabled for 128x128/128x64/64x128 coding block.
ii. In one example, when maximum transform size is equal to 32x32 but VSize is equal to 64x64, TT split may be allowed for 64x64 coding block.
d) In one example, vertical BT split may be disabled when a coding block width in luma samples is less than or equal to VSize, but its height in luma samples is greater than VSize.
i. In one example, in case of maximum transform size 32x32 but VPDU size equal to 64x64, vertical BT split may be disabled for 64x128 coding block.
ii. In one example, in case of maximum transform size 32x32 but VPDU size equal to 64x64, vertical BT split may be allowed for 32x64/16x64/8x64 coding block.
e) In one example, vertical BT split may be disabled when a coding block exceeds the Picture/Subpicture width in luma samples, but its height in luma samples is greater than VSize.
i. Alternatively, horizontal BT split may be allowed when a coding block exceeds the Picture/Subpicture width in luma samples.
f) In one example, horizontal BT split may be disabled when a coding block width in luma samples is greater than VSize, but its height in luma samples is less than or equal to VSize.
i. In one example, in case of maximum transform size 32x32 but VPDU size equal to 64x64, horizontal BT split may be disabled for a 128x64 coding block.
ii. In one example, in case of maximum transform size 32x32 but VPDU size equal to 64x64, horizontal BT split may be allowed for 64x8/64x16/64x32 coding block.
g) In one example, horizontal BT split may be disabled when a coding block exceeds the Picture/Subpicture height in luma samples, but its width in luma samples is greater than VSize.
i. Alternatively, vertical BT split may be allowed when a coding block exceeds the Picture/Subpicture height in luma samples.
h) In one example, when TT or BT split is disabled, the TT or BT split flag may not be signaled and may be implicitly derived to be zero.
i. Alternatively, when TT and/or BT split is enabled, the TT and/or BT split flag may be explicitly signaled in the bitstream.
ii. Alternatively, when TT or BT split is disabled, the TT or BT split flag may be signaled but ignored by the decoder.
iii. Alternatively, when TT or BT split is disabled, the TT or BT split flag may be signaled but it must be zero in a conformance bitstream.
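The VPDU-based restrictions of item 2 can be sketched as follows. This is an illustrative, non-normative Python sketch covering only the rules in bullets a), c), d), and f) above; all function names are invented, and the full draft derivations contain additional conditions not shown here.

```python
# Sketch of the VPDU-based split restrictions in item 2, for a coding block
# of cb_w x cb_h luma samples. Non-normative illustration only.

def vpdu_size(ctb_size_y, m=64):
    # 2.a): VSize = min(M, CtbSizeY), with M an integer value such as 64.
    return min(m, ctb_size_y)

def allow_tt(cb_w, cb_h, vsize, max_tt_size):
    # 2.c): TT disabled when width or height exceeds min(VSize, maxTtSize).
    return max(cb_w, cb_h) <= min(vsize, max_tt_size)

def allow_bt_ver(cb_w, cb_h, vsize):
    # 2.d): vertical BT disabled when width <= VSize but height > VSize.
    return not (cb_w <= vsize and cb_h > vsize)

def allow_bt_hor(cb_w, cb_h, vsize):
    # 2.f): horizontal BT disabled when width > VSize but height <= VSize.
    return not (cb_w > vsize and cb_h <= vsize)
```

With a 64x64 VPDU and maxTtSize 64, this reproduces the examples above: TT is disabled for 128x128 but allowed for 64x64; vertical BT is disabled for 64x128 but allowed for 32x64.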
3. It is proposed that the CTU dimensions (such as width and/or height) may be larger than 128.
a) In one example, the signaled CTU dimensions may be 256 or even larger (e.g., log2_ctu_size_minus5 may be equal to 3 or larger) .
b) In one example, the derived CTU dimensions may be 256 or even larger.
i. For example, the derived CTU dimensions for resampling pictures/subpictures may be larger than 128.
4. It is proposed that when larger CTU dimensions are allowed (such as CTU width and/or height larger than 128), the QT split flag may be inferred to be true and the QT split may be recursively applied until the dimensions of the split coding block reach a specified value (e.g., the specified value may be set to the maximum transform block size, or 128, or 64, or 32).
a) In one example, the recursive QT split may be implicitly conducted without signaling, until the split coding block size reaches the maximum transform block size.
b) In one example, when a 256x256 CTU is applied to dual tree, the QT split flag may not be signalled for a coding block larger than the maximum transform block size, and the QT split may be forced for the coding block until the split coding block size reaches the maximum transform block size.
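Item 4 can be illustrated with a small recursive sketch (non-normative; the function name is invented). It enumerates the leaf blocks produced when the QT split is inferred without signalling until the blocks reach the specified limit.

```python
# Sketch of item 4: for CTUs larger than a specified limit, the QT split is
# inferred (no flag signalled) and applied recursively until the block
# dimensions reach the limit (e.g. the maximum transform block size).
def implicit_qt_leaves(x, y, w, h, limit):
    """Return the leaf blocks produced by the inferred quadtree splits."""
    if w <= limit and h <= limit:
        return [(x, y, w, h)]  # explicit split signalling may resume here
    leaves = []
    for dy in (0, h // 2):
        for dx in (0, w // 2):
            leaves += implicit_qt_leaves(x + dx, y + dy, w // 2, h // 2, limit)
    return leaves

# A 256x256 CTU with a 64-sample limit yields 16 implicit 64x64 leaves.
```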
5. It is proposed that TT split flag may be conditionally signalled for CU/PU dimensions (width and/or height) larger than 128.
a) In one example, both horizontal and vertical TT split flags may be signalled for a 256x256 CU.
b) In one example, vertical TT split but not horizontal TT split may be signalled for a 256x128/256x64 CU/PU.
c) In one example, horizontal TT split but not vertical TT split may be signalled for a 128x256/64x256 CU/PU.
d) In one example, when the TT split flag is prohibited for CU dimensions larger than 128, it may not be signalled and may be implicitly derived as zero.
i. In one example, horizontal TT split may be prohibited for 256x128/256x64 CU/PU.
ii. In one example, vertical TT split may be prohibited for 128x256/64x256 CU/PU.
6. It is proposed that BT split flag may be conditionally signalled for CU/PU dimensions (width and/or height) larger than 128.
a) In one example, both horizontal and vertical BT split flags may be signalled for 256x256/256x128/128x256 CU/PU.
b) In one example, horizontal BT split flag may be signaled for 64x256 CU/PU.
c) In one example, vertical BT split flag may be signaled for 256x64 CU/PU.
d) In one example, when the BT split flag is prohibited for CU dimensions larger than 128, it may not be signalled and may be implicitly derived as zero.
i. In one example, vertical BT split may be prohibited for a Kx256 CU/PU (such as K equal to or smaller than 64 in luma samples), and the vertical BT split flag may not be signaled and may be derived as zero.
1. For example, in the above case, vertical BT split may be prohibited for 64x256 CU/PU.
2. For example, in the above case, vertical BT split may be prohibited to avoid 32x256 CU/PU at picture/subpicture boundaries.
ii. In one example, vertical BT split may be prohibited when a coding block exceeds the Picture/Subpicture width in luma samples, but its height in luma samples is greater than M (such as M=64 in luma samples) .
iii. In one example, horizontal BT split may be prohibited for a 256xK coding block (such as K equal to or smaller than 64 in luma samples), and the horizontal BT split flag may not be signaled and may be derived as zero.
1. For example, in the above case, horizontal BT split may be prohibited for 256x64 coding block.
2. For example, in the above case, horizontal BT split may be prohibited to avoid 256x32 coding block at picture/subpicture boundaries.
iv. In one example, horizontal BT split may be prohibited when a coding block exceeds the Picture/Subpicture height in luma samples, but its width in luma samples is greater than M (such as M=64 in luma samples) .
7. It is proposed that the affine model parameters calculation may be dependent on the CTU dimensions.
a) In one example, the derivation of scaled motion vectors, and/or control point motion vectors in affine prediction may be dependent on the CTU dimensions.
8. It is proposed that the intra block copy (IBC) buffer may depend on the maximum configurable/allowable CTU dimensions.
a) For example, the IBC buffer width in luma samples may be equal to NxN divided by CTU width (or height) in luma samples, wherein N may be the maximum configurable CTU size in luma samples, such as N = 1 << (log2_ctu_size_minus5 + 5) .
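Bullet a) of item 8 can be illustrated with a short sketch. The function name is invented, and log2_ctu_size_minus5_max stands for the maximum configurable value of log2_ctu_size_minus5.

```python
# Sketch of item 8.a): the IBC buffer width in luma samples, derived as
# N*N divided by the actual CTU size, where N is the maximum configurable
# CTU size, N = 1 << (log2_ctu_size_minus5 + 5).
def ibc_buffer_width(log2_ctu_size_minus5_max, ctb_size_y):
    n = 1 << (log2_ctu_size_minus5_max + 5)  # max configurable CTU size
    return (n * n) // ctb_size_y

# With max CTU size 128 (log2_ctu_size_minus5 = 2) and a 64-sample CTU,
# the buffer holds 256 luma samples per row.
```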
9. It is proposed that a set of specified coding tool (s) may be disabled for a large CU/PU, where the large CU/PU refers to a CU/PU where either the CU/PU width or CU/PU height is larger than N (such as N=64 or 128) .
a) In one example, the above-mentioned specified coding tool (s) may be palette, and/or intra block copy (IBC) , and/or intra skip mode, and/or triangle prediction mode, and/or CIIP mode, and/or regular merge mode, and/or decoder side motion derivation, and/or bi-directional optical flow, and/or prediction refinement based optical flow, and/or affine prediction, and/or sub-block based TMVP, etc.
i. Alternatively, screen content coding tool (s) such as palette and/or intra block copy (IBC) mode may be applied to large CU/PU.
b) In one example, a syntax constraint may be explicitly used to disable the specified coding tool (s) for a large CU/PU.
i. For example, the palette/IBC flag may be explicitly signaled for a CU/PU which is not a large CU/PU.
c) In one example, a bitstream constraint may be used to disable the specified coding tool (s) for a large CU/PU.
Configurable maximum transform size related
10. It is proposed that the maximum TU size may be dependent on the CTU dimensions (width and/or height) , or the CTU dimensions may be dependent on the maximum TU size.
a) In one example, a bitstream constraint may be used such that the maximum TU size shall be smaller than or equal to the CTU dimensions.
b) In one example, the signaling of maximum TU size may depend on the CTU dimensions.
i. For example, when the CTU dimensions are smaller than N (e.g. N=64) , the signaled maximum TU size must be smaller than N.
ii. For example, when the CTU dimensions are smaller than N (e.g. N=64) , the indication of whether the maximum luma transform size is 64 or 32 (e.g., sps_max_luma_transform_size_64_flag) may not be signaled and the maximum luma transform size may be derived as 32 implicitly.
5. Embodiments
Newly added parts are enclosed in bolded double braces, e.g., { {a} } denotes that “a” has been added, whereas the deleted parts from the VVC working draft are enclosed in bolded double brackets, e.g., [ [b] ] denotes that “b” has been deleted. The modifications are based on the latest VVC working draft (JVET-O2001-v11) .
5.1 An example embodiment#1
The embodiment below is for the invented method of making the maximum TU size dependent on the CTU size.
7.4.3.3. Sequence parameter set RBSP semantics
sps_max_luma_transform_size_64_flag equal to 1 specifies that the maximum transform size in luma samples is equal to 64. sps_max_luma_transform_size_64_flag equal to 0 specifies that the maximum transform size in luma samples is equal to 32.
When CtbSizeY is less than 64, the value of sps_max_luma_transform_size_64_flag shall be equal to 0. The variables MinTbLog2SizeY, MaxTbLog2SizeY, MinTbSizeY, and MaxTbSizeY are derived as follows:
MinTbLog2SizeY = 2      (7-27)
MaxTbLog2SizeY = sps_max_luma_transform_size_64_flag? 6: 5     (7-28)
MinTbSizeY = 1<<MinTbLog2SizeY      (7-29)
MaxTbSizeY = { {min (CtbSizeY, 1<<MaxTbLog2SizeY) } }       (7-30)
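A minimal sketch of the modified derivation (the function name is illustrative): with the change to eq. (7-30), the maximum transform size can never exceed the CTU size.

```python
# Sketch of embodiment #1: eq. (7-30) modified so that MaxTbSizeY is
# clipped to the CTU size, MaxTbSizeY = min(CtbSizeY, 1 << MaxTbLog2SizeY).
def max_tb_size_y(ctb_size_y, sps_max_luma_transform_size_64_flag):
    max_tb_log2_size_y = 6 if sps_max_luma_transform_size_64_flag else 5  # (7-28)
    return min(ctb_size_y, 1 << max_tb_log2_size_y)                       # (7-30), modified
```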
5.2 An example embodiment#2
The embodiment below is for the invented method of making the TT and BT split processes dependent on the VPDU size.
6.4.2 Allowed binary split process
The variable allowBtSplit is derived as follows:
….
– Otherwise, if all of the following conditions are true, allowBtSplit is set equal to FALSE
– btSplit is equal to SPLIT_BT_VER
– cbHeight is greater than [ [MaxTbSizeY] ] { {VSize} }
– x0+cbWidth is greater than pic_width_in_luma_samples
– Otherwise, if all of the following conditions are true, allowBtSplit is set equal to FALSE
– btSplit is equal to SPLIT_BT_HOR
– cbWidth is greater than [ [MaxTbSizeY] ] { {VSize} }
– y0+cbHeight is greater than pic_height_in_luma_samples
– Otherwise if all of the following conditions are true, allowBtSplit is set equal to FALSE
– btSplit is equal to SPLIT_BT_VER
– cbWidth is less than or equal to [ [MaxTbSizeY] ] { {VSize} }
– cbHeight is greater than [ [MaxTbSizeY] ] { {VSize} }
– Otherwise if all of the following conditions are true, allowBtSplit is set equal to FALSE
– btSplit is equal to SPLIT_BT_HOR
– cbWidth is greater than [ [MaxTbSizeY] ] { {VSize} }
– cbHeight is less than or equal to [ [MaxTbSizeY] ] { {VSize} }
6.4.3 Allowed ternary split process
The variable allowTtSplit is derived as follows:
– If one or more of the following conditions are true, allowTtSplit is set equal to FALSE:
– cbSize is less than or equal to 2*MinTtSizeY
– cbWidth is greater than Min ( [ [MaxTbSizeY] ] { {VSize} } , maxTtSize)
– cbHeight is greater than Min ( [ [MaxTbSizeY] ] { {VSize} } , maxTtSize)
– mttDepth is greater than or equal to maxMttDepth
– x0+cbWidth is greater than pic_width_in_luma_samples
– y0+cbHeight is greater than pic_height_in_luma_samples
– treeType is equal to DUAL_TREE_CHROMA and (cbWidth/SubWidthC) * (cbHeight/SubHeightC) is less than or equal to 32
– treeType is equal to DUAL_TREE_CHROMA and modeType is equal to INTRA
– Otherwise, allowTtSplit is set equal to TRUE.
5.3 An example embodiment#3
The embodiment below is for the invented method of making the affine model parameters calculation dependent on the CTU size.
7.4.3.3. Sequence parameter set RBSP semantics
log2_ctu_size_minus5 plus 5 specifies the luma coding tree block size of each CTU. It is a requirement of bitstream conformance that the value of log2_ctu_size_minus5 be less than or equal to [ [2] ] { {3 (could be larger if so specified) } } .
CtbLog2SizeY = log2_ctu_size_minus5+5
{ {CtbLog2SizeY is used to indicate the CTU size in luma samples of the current video unit. When a single CTU size is used for the current video unit, CtbLog2SizeY is calculated by the above equation. Otherwise, CtbLog2SizeY may depend on the actual CTU size, which may be explicitly signalled or implicitly derived for the current video unit. (an example) } }
8.5.5.5 Derivation process for luma affine control point motion vectors from a neighbouring block
The variables mvScaleHor, mvScaleVer, dHorX and dVerX are derived as follows:
– If isCTUboundary is equal to TRUE, the following applies:
mvScaleHor=MvLX [xNb] [yNb+nNbH-1] [0] << [ [7] ] { {CtbLog2SizeY} }    (8-533)
mvScaleVer=MvLX [xNb] [yNb+nNbH-1] [1] << [ [7] ] { {CtbLog2SizeY} }     (8-534)
– Otherwise (isCTUboundary is equal to FALSE) , the following applies:
mvScaleHor=CpMvLX [xNb] [yNb] [0] [0] << [ [7] ] { {CtbLog2SizeY} }    (8-537)
mvScaleVer=CpMvLX [xNb] [yNb] [0] [1] << [ [7] ] { {CtbLog2SizeY} }                           (8-538)
8.5.5.6 Derivation process for constructed affine control point motion vector merging candidates
When availableFlagCorner [0] is equal to TRUE and availableFlagCorner [2] is equal to TRUE, the following applies:
– For X being replaced by 0 or 1, the following applies:
– The variable availableFlagLX is derived as follows:
– If all of following conditions are TRUE, availableFlagLX is set equal to TRUE:
– predFlagLXCorner [0] is equal to 1
– predFlagLXCorner [2] is equal to 1
– refIdxLXCorner [0] is equal to refIdxLXCorner [2]
– Otherwise, availableFlagLX is set equal to FALSE.
– When availableFlagLX is equal to TRUE, the following applies:
– The second control point motion vector cpMvLXCorner [1] is derived as follows:
cpMvLXCorner [1] [0] = (cpMvLXCorner [0] [0] << [ [7] ] { {CtbLog2SizeY} } ) + ( (cpMvLXCorner [2] [1] -cpMvLXCorner [0] [1] )                          (8-606)
<<( [ [7] ] { {CtbLog2SizeY} } +Log2 (cbHeight/cbWidth) ) )
cpMvLXCorner [1] [1] = (cpMvLXCorner [0] [1] << [ [7] ] { {CtbLog2SizeY} } ) + ( (cpMvLXCorner [2] [0] -cpMvLXCorner [0] [0] )                          (8-607)
<<( [ [7] ] { {CtbLog2SizeY} } +Log2 (cbHeight/cbWidth) ) )
8.5.5.9 Derivation process for motion vector arrays from affine control point motion vectors
The variables mvScaleHor, mvScaleVer, dHorX and dVerX are derived as follows:
mvScaleHor=cpMvLX [0] [0] << [ [7] ] { {CtbLog2SizeY} }                        (8-665)
mvScaleVer=cpMvLX [0] [1] << [ [7] ] { {CtbLog2SizeY} }                          (8-666)
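A minimal sketch of the modified shift in embodiment #3 (the helper name is invented): the fixed left shift of 7, which is tied to the 128 = 2^7 maximum CTU size of draft 6, is replaced by CtbLog2SizeY so that the scaled motion-vector precision tracks the actual CTU size.

```python
# Sketch of embodiment #3: in the affine motion-vector derivations, the
# constant shift "<< 7" is replaced by "<< CtbLog2SizeY".
def affine_mv_scale(cp_mv, ctb_log2_size_y):
    # cp_mv: (horizontal, vertical) control-point motion-vector components.
    return cp_mv[0] << ctb_log2_size_y, cp_mv[1] << ctb_log2_size_y

# With a 128-sample CTU (CtbLog2SizeY = 7) this matches the original "<< 7".
```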
FIG. 1 is a block diagram of a video processing apparatus 1300. The apparatus 1300 may be used to implement one or more of the methods described herein. The apparatus 1300 may be embodied in a smartphone, tablet, computer, Internet of Things (IoT) receiver, and so on. The apparatus 1300 may include one or more processors 1302, one or more memories 1304 and video processing hardware 1306. The processor (s) 1302 may be configured to implement one or more methods described in the present document. The memory (memories) 1304 may be used for storing data and code used for implementing the methods and techniques described herein. The video processing hardware 1306 may be used to implement, in hardware circuitry, some techniques described in the present document. In some embodiments, the hardware 1306 may be at least partially internal to the processors 1302, e.g., a graphics co-processor.
In some embodiments, the video coding methods may be implemented using an apparatus that is implemented on a hardware platform as described with respect to FIG. 1.
Some embodiments of the disclosed technology include making a decision or determination to enable a video processing tool or mode. In an example, when the video processing tool or mode is enabled, the encoder will use or implement the tool or mode in the processing of a block of video, but may not necessarily modify the resulting bitstream based on the usage of the tool or mode. That is, a conversion from the block of video to the bitstream representation of the video will use the video processing tool or mode when it is enabled based on the decision or determination. In another example, when the video processing tool or mode is enabled, the decoder will process the bitstream with the knowledge that the bitstream has  been modified based on the video processing tool or mode. That is, a conversion from the bitstream representation of the video to the block of video will be performed using the video processing tool or mode that was enabled based on the decision or determination.
Some embodiments of the disclosed technology include making a decision or determination to disable a video processing tool or mode. In an example, when the video processing tool or mode is disabled, the encoder will not use the tool or mode in the conversion of the block of video to the bitstream representation of the video. In another example, when the video processing tool or mode is disabled, the decoder will process the bitstream with the knowledge that the bitstream has not been modified using the video processing tool or mode that was enabled based on the decision or determination.
FIG. 2 is a block diagram showing an example video processing system 200 in which various techniques disclosed herein may be implemented. Various implementations may include some or all of the components of the system 200. The system 200 may include input 202 for receiving video content. The video content may be received in a raw or uncompressed format, e.g., 8 or 10 bit multi-component pixel values, or may be in a compressed or encoded format. The input 202 may represent a network interface, a peripheral bus interface, or a storage interface. Examples of network interface include wired interfaces such as Ethernet, passive optical network (PON) , etc. and wireless interfaces such as Wi-Fi or cellular interfaces.
The system 200 may include a coding component 204 that may implement the various coding or encoding methods described in the present document. The coding component 204 may reduce the average bitrate of video from the input 202 to the output of the coding component 204 to produce a coded representation of the video. The coding techniques are therefore sometimes called video compression or video transcoding techniques. The output of the coding component 204 may be either stored, or transmitted via a communication connection, as represented by the component 206. The stored or communicated bitstream (or coded) representation of the video received at the input 202 may be used by the component 208 for generating pixel values or displayable video that is sent to a display interface 210. The process of generating user-viewable video from the bitstream representation is sometimes called video decompression. Furthermore, while certain video processing operations are referred to as “coding” operations or tools, it will be appreciated that the coding tools or operations are used at an encoder and corresponding decoding tools or operations that reverse the results of the coding will be performed by a decoder.
Examples of a peripheral bus interface or a display interface may include universal serial bus (USB) or high definition multimedia interface (HDMI) or Displayport, and so on. Examples of storage interfaces include SATA (serial advanced technology attachment) , PCI, IDE interface, and the like. The techniques described in the present document may be embodied in various electronic devices such as mobile phones, laptops, smartphones or other devices that are capable of performing digital data processing and/or video display.
FIG. 3 is a flowchart for a method 300 of video processing. The method 300 includes, at operation 310, performing a conversion between a video comprising one or more video regions comprising one or more video blocks and a bitstream representation of the video, the conversion conforming to a rule that allows use of different sizes for the one or more video blocks in different video regions of the one or more video regions for performing the conversion.
FIG. 4 is a flowchart for a method 400 of video processing. The method 400 includes, at operation 410, determining, based on a size of a video block of a video region of a video exceeding a threshold, that the video block is split using a quadtree-based splitting until a size condition is met and an indication of the quadtree-based splitting is excluded from a bitstream representation of the video.
The method 400 includes, at operation 420, performing, based on the determining, a conversion between the video and the bitstream representation.
FIG. 5 is a flowchart for a method 500 of video processing. The method 500 includes, at operation 510, determining, based on dimensions of a video block of a video region of a video exceeding a threshold, whether an indication for ternary-tree (TT) splitting of the video block is signaled in a bitstream representation of the video.
The method 500 includes, at operation 520, performing, based on the determining, a conversion between the video and the bitstream representation.
FIG. 6 is a flowchart for a method 600 of video processing. The method 600 includes, at operation 610, determining, based on dimensions of a video block of a video region of a video exceeding a threshold, whether an indication for binary-tree (BT) splitting of the video block is signaled in a bitstream representation of the video.
The method 600 includes, at operation 620, performing, based on the determining, a conversion between the video and the bitstream representation.
FIG. 7 is a flowchart for a method 700 of video processing. The method 700 includes, at operation 710, performing a conversion between a video comprising a video region comprising a video block and a bitstream representation of the video, the conversion comprising an affine  model parameters calculation, and the affine model parameters calculation being based on dimensions of the video block.
FIG. 8 is a flowchart for a method 800 of video processing. The method 800 includes, at operation 810, performing a conversion between a video comprising a video region comprising a video block and a bitstream representation of the video, the conversion comprising an application of an intra block copy (IBC) tool, and a size of an IBC buffer being based on maximum configurable and/or allowable dimensions of the video block.
FIG. 9 is a flowchart for a method 900 of video processing. The method 900 includes, at operation 910, performing a conversion between a video comprising one or more video regions comprising one or more video blocks and a bitstream representation of the video, the conversion being performed according to a rule that specifies a relationship between an indication of a size of a video block of the one or more video blocks and an indication of a maximum size of a transform block (TB) used for the video block.
In some embodiments, the following technical solutions may be implemented:
A1. A method of video processing, comprising performing a conversion between a video comprising one or more video regions comprising one or more video blocks and a bitstream representation of the video, wherein the conversion conforms to a rule that allows use of different sizes for the one or more video blocks in different video regions of the one or more video regions for performing the conversion.
A2. The method of solution A1, wherein the rule further specifies that a syntax element is included in the bitstream representation indicative of one or more sizes of video blocks permitted in the bitstream representation.
A3. The method of solution A2, wherein the syntax element is included in a sequence parameter set (SPS) .
A4. The method of solution A2, wherein the syntax element is included in a picture parameter set (PPS) .
A5. The method of solution A2, wherein the syntax element is included in a video parameter set (VPS) , a decoding parameter set (DPS) , an adaptation parameter set (APS) , a picture header, a subpicture header, a slice header, a tile header, or a brick header.
A6. The method of solution A1, wherein the one or more video regions correspond to video layers, and wherein the one or more video blocks correspond to coding tree units (CTUs) representing logical partitions used for coding the video into the bitstream representation.
A7. The method of solution A6, wherein the different sizes for the one or more video blocks are used in the video layers when a reference picture resampling tool is enabled for at least one of the one or more video regions.
A8. The method of solution A6, wherein at least one of the one or more video regions comprises an inter-layer picture or an intra-layer picture, and wherein the dimensions of the one or more video blocks for inter-layer referencing or intra-layer referencing are implicitly based on a scale factor.
A9. The method of solution A8, wherein the scale factor comprises an upsample scale factor or a downsample scale factor.
A10. The method of solution A8, wherein the scale factor is derived from a size of a current picture comprising the one or more blocks and a size of a reference picture associated with the current picture.
A11. The method of solution A8, wherein the scale factor is derived from one or more syntax elements in the bitstream representation.
A12. The method of solution A8, wherein a size of a video block of the one or more video blocks of an inter-layer picture or an intra-layer picture is M×N, wherein the inter-layer picture or the intra-layer picture is resampled by a first scale factor (S) in a width dimension and by a second scale factor (T) in a height dimension, wherein the dimensions of video blocks for inter-layer referencing or intra-layer referencing are (M×S) × (N×T) or (M/S) × (N/T) , and wherein M, N, S, and T are positive integers.
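As an illustrative, non-normative sketch of the dimension derivation in solution A12 (the function and parameter names are assumptions, not specification syntax):

```python
def scaled_ctu_dims(m, n, s, t, upsample=True):
    """Dimensions of video blocks used for inter-layer or intra-layer
    referencing when an M x N block is resampled by a first scale factor S
    (width) and a second scale factor T (height): (M*S) x (N*T) when
    upsampling, (M/S) x (N/T) when downsampling."""
    if upsample:
        return m * s, n * t
    return m // s, n // t

# A 128x128 CTU resampled by a factor of 2 in each dimension
# references 256x256 blocks when upsampling, 64x64 when downsampling.
assert scaled_ctu_dims(128, 128, 2, 2) == (256, 256)
assert scaled_ctu_dims(128, 128, 2, 2, upsample=False) == (64, 64)
```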
A13. The method of solution A8, wherein the different sizes for the one or more video blocks used in the video layers are signaled in the bitstream representation.
A14. The method of solution A13, wherein the different sizes are signaled in a sequence parameter set (SPS) or a picture parameter set (PPS) .
A15. The method of solution A14, wherein each of the different sizes is different from a size of a base-layer CTU.
A16. The method of solution A6, wherein the dimensions of the CTUs comprise a height and a width, and wherein the height and/or the width is greater than 128.
A17. The method of solution A6, wherein the dimensions of the CTUs comprise a height and a width, and wherein the height and/or the width is greater than or equal to 256.
A18. A method of video processing, comprising determining, based on a size of a video block of a video region of a video exceeding a threshold, that the video block is split using a quadtree-based splitting until a size condition is met and an indication of the quadtree-based splitting is excluded from a bitstream representation of the video; and performing, based on the determining, a conversion between the video and the bitstream representation.
A19. The method of solution A18, wherein the threshold is 128.
A20. The method of solution A18 or A19, wherein the size condition corresponds to a maximum transform block size of 64 or 32.
A21. The method of any of solutions A18 to A20, wherein the video block corresponds to a coding tree unit (CTU) representing a logical partition used for coding the video into the bitstream representation.
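A minimal sketch of the implicit splitting rule in solutions A18 to A20, assuming square CTUs, a threshold of 128, and a hypothetical helper name:

```python
def implicit_qt_splits(ctu_size, threshold=128, max_tb_size=64):
    """When the CTU size exceeds the threshold, apply quadtree splitting
    without signaling any split indication until the leaf size satisfies
    the maximum transform-block size condition. Returns (depth, leaf_size)."""
    depth, size = 0, ctu_size
    if size <= threshold:
        return depth, size  # normal signaling applies; no implicit split
    while size > max_tb_size:
        size //= 2          # a quadtree split halves each dimension
        depth += 1
    return depth, size

assert implicit_qt_splits(256) == (2, 64)                  # 256 -> 128 -> 64
assert implicit_qt_splits(256, max_tb_size=32) == (3, 32)
assert implicit_qt_splits(128) == (0, 128)                 # below threshold
```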
A22. A method of video processing, comprising determining, based on dimensions of a video block of a video region of a video exceeding a threshold, whether an indication for ternary-tree (TT) splitting of the video block is signaled in a bitstream representation of the video; and performing, based on the determining, a conversion between the video and the bitstream representation.
A23. The method of solution A22, wherein the threshold is 128.
A24. The method of solution A22 or A23, wherein the indication comprises a horizontal TT flag and a vertical TT flag when the dimensions of the video block are 256×256.
A25. The method of solution A22 or A23, wherein the indication consists of a vertical TT flag when the dimensions of the video block are 256×128 or 256×64.
A26. The method of solution A22 or A23, wherein the indication consists of a horizontal TT flag when the dimensions of the video block are 128×256 or 64×256.
A27. The method of any of solutions A22 to A26, wherein the video block is a coding unit (CU) or a prediction unit (PU) .
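The signaling conditions in solutions A24 to A26 can be summarized as follows (a sketch with a threshold of 128; the names are not normative syntax):

```python
def tt_flags_signaled(width, height, threshold=128):
    """Which ternary-tree split indications are signaled for a block with
    at least one dimension exceeding the threshold."""
    if width > threshold and height > threshold:
        return {"horizontal_tt", "vertical_tt"}   # e.g. 256x256
    if width > threshold:
        return {"vertical_tt"}                    # e.g. 256x128, 256x64
    if height > threshold:
        return {"horizontal_tt"}                  # e.g. 128x256, 64x256
    return set()                                  # regular signaling applies

assert tt_flags_signaled(256, 256) == {"horizontal_tt", "vertical_tt"}
assert tt_flags_signaled(256, 64) == {"vertical_tt"}
assert tt_flags_signaled(64, 256) == {"horizontal_tt"}
```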
A28. A method of video processing, comprising determining, based on dimensions of a video block of a video region of a video exceeding a threshold, whether an indication for binary-tree (BT) splitting of the video block is signaled in a bitstream representation of the video; and performing, based on the determining, a conversion between the video and the bitstream representation.
A29. The method of solution A28, wherein the threshold is 128.
A30. The method of solution A28 or A29, wherein the indication comprises a horizontal BT flag and a vertical BT flag when the dimensions of the video block are 256×256, 256×128 or 128×256.
A31. The method of solution A28 or A29, wherein the indication consists of a horizontal BT flag when the dimensions of the video block are 64×256.
A32. The method of solution A28 or A29, wherein the indication consists of a vertical BT flag when the dimensions of the video block are 256×64.
A33. The method of any of solutions A28 to A32, wherein the video block is a coding unit (CU) or a prediction unit (PU) .
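Reading the split flags in solutions A30 to A32 as binary-tree (BT) flags, the BT conditions differ from the TT case in that both flags remain available as long as neither dimension drops below 128. A non-normative sketch:

```python
def bt_flags_signaled(width, height, threshold=128):
    """Which binary-tree split indications are signaled for a block with
    at least one dimension exceeding the threshold."""
    if max(width, height) <= threshold:
        return set()                  # regular signaling applies
    flags = set()
    if width >= threshold:
        flags.add("vertical_bt")      # wide enough for a vertical split
    if height >= threshold:
        flags.add("horizontal_bt")    # tall enough for a horizontal split
    return flags

assert bt_flags_signaled(256, 128) == {"horizontal_bt", "vertical_bt"}
assert bt_flags_signaled(64, 256) == {"horizontal_bt"}
assert bt_flags_signaled(256, 64) == {"vertical_bt"}
```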
A34. A method of video processing, comprising performing a conversion between a video comprising a video region comprising a video block and a bitstream representation of the video, wherein the conversion comprises an affine model parameters calculation, and wherein the affine model parameters calculation is based on dimensions of the video block.
A35. The method of solution A34, wherein the affine model parameters calculation is part of an affine prediction process that further comprises a derivation of scaled motion vectors and/or control point motion vectors, and wherein the derivation is based on the dimensions of the video block.
A36. The method of solution A34 or A35, wherein the video block corresponds to a coding tree unit (CTU) representing a logical partition used for coding the video into the bitstream representation.
A37. A method of video processing, comprising performing a conversion between a video comprising a video region comprising a video block and a bitstream representation of the video, wherein the conversion comprises an application of an intra block copy (IBC) tool, and wherein a size of an IBC buffer is based on maximum configurable and/or allowable dimensions of the video block.
A38. The method of solution A37, wherein a width of the IBC buffer in luma samples is equal to N×N divided by a width or a height of the video block, wherein N×N is the maximum configurable dimensions of the video block in luma samples, and wherein N is an integer.
A39. The method of solution A38, wherein N = 1 << (log2_ctu_size_minus5 + 5) , wherein log2_ctu_size_minus5 denotes an indication of a coding tree unit (CTU) size.
A40. The method of any of solutions A37 to A39, wherein the video block corresponds to a coding tree unit (CTU) representing a logical partition used for coding the video into the bitstream representation.
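Solutions A38 and A39 amount to the following derivation (a sketch; the syntax element name log2_ctu_size_minus5 is taken from the text, the function name is not):

```python
def ibc_buffer_width(log2_ctu_size_minus5, ctu_size):
    """Width of the IBC reference buffer in luma samples: N*N divided by
    the CTU width (or height), where N is the maximum configurable CTU
    dimension derived from the signaled syntax element."""
    n = 1 << (log2_ctu_size_minus5 + 5)
    return (n * n) // ctu_size

# With log2_ctu_size_minus5 = 2, N = 128; a smaller actual CTU yields
# a proportionally wider buffer of the same total area.
assert ibc_buffer_width(2, 128) == 128
assert ibc_buffer_width(2, 64) == 256
assert ibc_buffer_width(3, 256) == 256   # N = 256
```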
A41. The method of any of solutions A1 to A40, wherein performing the conversion comprises generating the bitstream representation from the video.
A42. The method of any of solutions A1 to A40, wherein performing the conversion comprises generating the video from the bitstream representation.
A43. An apparatus in a video system comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions, upon execution by the processor, cause the processor to implement the method in any one of solutions A1 to A42.
A44. A computer program product stored on a non-transitory computer-readable medium, the computer program product including program code for carrying out the method in any one of solutions A1 to A42.
In some embodiments, the following technical solutions may be implemented:
B1. A method of video processing, comprising performing a conversion between a video comprising one or more video regions comprising one or more video blocks and a bitstream representation of the video, wherein the conversion is performed according to a rule that specifies a relationship between an indication of a size of a video block of the one or more video blocks and an indication of a maximum size of a transform block (TB) used for the video block.
B2. The method of solution B1, wherein the relationship specifies that the maximum size of the TB is based on the size of the video block.
B3. The method of solution B1, wherein the relationship specifies that the size of the video block is based on the maximum size of the TB.
B4. The method of solution B2 or B3, wherein the maximum size of the TB is smaller than or equal to the dimensions of the video block.
B5. The method of solution B2 or B3, wherein an inclusion of an indication of the maximum size of the TB in the bitstream representation is based on the dimensions of the video block.
B6. The method of solution B5, wherein at least one of the dimensions of the video block is smaller than N, wherein the indication of the maximum size of the TB indicates that the maximum size of the TB is smaller than N, and wherein N is a positive integer.
B7. The method of solution B6, wherein N = 64.
B8. The method of solution B5, wherein a maximum size of a luma transform block associated with the video region is 64 or 32.
B9. The method of solution B8, wherein the bitstream representation excludes an indication of the maximum size of the luma transform block when at least one of the dimensions of the video block is smaller than N, wherein the maximum size of the luma transform block is implicitly derived as 32, and wherein N is a positive integer.
B10. The method of solution B9, wherein N = 64.
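The conditional signaling in solutions B5 to B10 can be sketched as follows (hypothetical names; in an actual bitstream the flag would only be parsed from the SPS when present):

```python
def max_luma_tb_size(ctb_size, max_tb_64_flag=None):
    """Maximum luma transform-block size as a function of the luma CTB
    size. When the CTB is smaller than 64, the flag is absent from the
    bitstream and the maximum TB size is implicitly derived as 32;
    otherwise the flag selects between 64 and 32."""
    if ctb_size < 64:
        return 32                      # indication excluded; implicit value
    return 64 if max_tb_64_flag else 32

assert max_luma_tb_size(32) == 32      # flag not signaled, derived as 32
assert max_luma_tb_size(128, max_tb_64_flag=True) == 64
assert max_luma_tb_size(128, max_tb_64_flag=False) == 32
```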
B11. The method of any of solutions B1 to B10, wherein the video block corresponds to a coding tree block (CTB) representing a logical partition used for coding the video into the bitstream representation.
B12. The method of any of solutions B1 to B10, wherein the video block corresponds to a luma coding tree block (CTB) representing a logical partition used for coding a luma component of the video into the bitstream representation.
B13. The method of any of solutions B1 to B10, wherein the indication of the size of the video block corresponds to a syntax element or a variable representing whether a size of the luma coding tree block (CTB) is greater than 32.
B14. The method of any of solutions B1 to B10, wherein the indication of the size of the video block corresponds to a syntax element or a variable representing whether a size of the luma coding tree block (CTB) is greater than or equal to 64.
B15. The method of any of solutions B1 to B10, wherein the maximum size of the transform block corresponds to a maximum size of a luma transform block.
B16. The method of any of solutions B1 to B10, wherein the indication of the maximum size of the transform block corresponds to a syntax element or a variable representing whether a maximum size of a luma transform block is equal to 64.
B17. The method of solution B16, wherein the syntax element is a flag.
B18. The method of any of solutions B1 to B17, wherein performing the conversion comprises generating the bitstream representation from the video region.
B19. The method of any of solutions B1 to B17, wherein performing the conversion comprises generating the video region from the bitstream representation.
B20. An apparatus in a video system comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions, upon execution by the processor, cause the processor to implement the method in any one of solutions B1 to B19.
B21. A computer program product stored on a non-transitory computer-readable medium, the computer program product including program code for carrying out the method in any one of solutions B1 to B19.
The disclosed and other solutions, examples, embodiments, modules and the functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them. The disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term “data processing apparatus” encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus.
A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document) , in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code) . A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
The processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit) .
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read only memory or a random-access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks. However, a computer need not have such devices. Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
While this patent document contains many specifics, these should not be construed as limitations on the scope of any subject matter or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular techniques. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.
Only a few implementations and examples are described and other implementations, enhancements and variations can be made based on what is described and illustrated in this patent document.

Claims (44)

  1. A method of video processing, comprising:
    performing a conversion between a video comprising one or more video regions comprising one or more video blocks and a bitstream representation of the video,
    wherein the conversion conforms to a rule that allows use of different sizes for the one or more video blocks in different video regions of the one or more video regions for performing the conversion.
  2. The method of claim 1, wherein the rule further specifies that a syntax element is included in the bitstream representation indicative of one or more sizes of video blocks permitted in the bitstream representation.
  3. The method of claim 2, wherein the syntax element is included in a sequence parameter set (SPS) .
  4. The method of claim 2, wherein the syntax element is included in a picture parameter set (PPS) .
  5. The method of claim 2, wherein the syntax element is included in a video parameter set (VPS) , a decoding parameter set (DPS) , an adaptation parameter set (APS) , a picture header, a subpicture header, a slice header, a tile header, or a brick header.
  6. The method of claim 1, wherein the one or more video regions correspond to video layers, and wherein the one or more video blocks correspond to coding tree units (CTUs) representing logical partitions used for coding the video into the bitstream representation.
  7. The method of claim 6, wherein the different sizes for the one or more video blocks are used in the video layers when a reference picture resampling tool is enabled for at least one of the one or more video regions.
  8. The method of claim 6, wherein at least one of the one or more video regions comprises an inter-layer picture or an intra-layer picture, and wherein the dimensions of the one or more video blocks for inter-layer referencing or intra-layer referencing are implicitly based on a scale factor.
  9. The method of claim 8, wherein the scale factor comprises an upsample scale factor or a downsample scale factor.
  10. The method of claim 8, wherein the scale factor is derived from a size of a current picture comprising the one or more blocks and a size of a reference picture associated with the current picture.
  11. The method of claim 8, wherein the scale factor is derived from one or more syntax elements in the bitstream representation.
  12. The method of claim 8, wherein a size of a video block of the one or more video blocks of an inter-layer picture or an intra-layer picture is M×N, wherein the inter-layer picture or the intra-layer picture is resampled by a first scale factor (S) in a width dimension and by a second scale factor (T) in a height dimension, wherein the dimensions of video blocks for inter-layer referencing or intra-layer referencing are (M×S) × (N×T) or (M/S) × (N/T) , and wherein M, N, S, and T are positive integers.
  13. The method of claim 8, wherein the different sizes for the one or more video blocks used in the video layers are signaled in the bitstream representation.
  14. The method of claim 13, wherein the different sizes are signaled in a sequence parameter set (SPS) or a picture parameter set (PPS) .
  15. The method of claim 14, wherein each of the different sizes is different from a size of a base-layer CTU.
  16. The method of claim 6, wherein the dimensions of the CTUs comprise a height and a width, and wherein the height and/or the width is greater than 128.
  17. The method of claim 6, wherein the dimensions of the CTUs comprise a height and a width, and wherein the height and/or the width is greater than or equal to 256.
  18. A method of video processing, comprising:
    determining, based on a size of a video block of a video region of a video exceeding a threshold, that the video block is split using a quadtree-based splitting until a size condition is met and an indication of the quadtree-based splitting is excluded from a bitstream representation of the video; and
    performing, based on the determining, a conversion between the video and the bitstream representation.
  19. The method of claim 18, wherein the threshold is 128.
  20. The method of claim 18 or 19, wherein the size condition corresponds to a maximum transform block size of 64 or 32.
  21. The method of any of claims 18 to 20, wherein the video block corresponds to a coding tree unit (CTU) representing a logical partition used for coding the video into the bitstream representation.
  22. A method of video processing, comprising:
    determining, based on dimensions of a video block of a video region of a video exceeding a threshold, whether an indication for ternary-tree (TT) splitting of the video block is signaled in a bitstream representation of the video; and
    performing, based on the determining, a conversion between the video and the bitstream representation.
  23. The method of claim 22, wherein the threshold is 128.
  24. The method of claim 22 or 23, wherein the indication comprises a horizontal TT flag and a vertical TT flag when the dimensions of the video block are 256×256.
  25. The method of claim 22 or 23, wherein the indication consists of a vertical TT flag when the dimensions of the video block are 256×128 or 256×64.
  26. The method of claim 22 or 23, wherein the indication consists of a horizontal TT flag when the dimensions of the video block are 128×256 or 64×256.
  27. The method of any of claims 22 to 26, wherein the video block is a coding unit (CU) or a prediction unit (PU) .
  28. A method of video processing, comprising:
    determining, based on dimensions of a video block of a video region of a video exceeding a threshold, whether an indication for binary-tree (BT) splitting of the video block is signaled in a bitstream representation of the video; and
    performing, based on the determining, a conversion between the video and the bitstream representation.
  29. The method of claim 28, wherein the threshold is 128.
  30. The method of claim 28 or 29, wherein the indication comprises a horizontal BT flag and a vertical BT flag when the dimensions of the video block are 256×256, 256×128 or 128×256.
  31. The method of claim 28 or 29, wherein the indication consists of a horizontal BT flag when the dimensions of the video block are 64×256.
  32. The method of claim 28 or 29, wherein the indication consists of a vertical BT flag when the dimensions of the video block are 256×64.
  33. The method of any of claims 28 to 32, wherein the video block is a coding unit (CU) or a prediction unit (PU) .
  34. A method of video processing, comprising:
    performing a conversion between a video comprising a video region comprising a video block and a bitstream representation of the video, wherein the conversion comprises an affine model parameters calculation, and wherein the affine model parameters calculation is based on dimensions of the video block.
  35. The method of claim 34, wherein the affine model parameters calculation is part of an affine prediction process that further comprises a derivation of scaled motion vectors and/or control point motion vectors, and wherein the derivation is based on the dimensions of the video block.
  36. The method of claim 34 or 35, wherein the video block corresponds to a coding tree unit (CTU) representing a logical partition used for coding the video into the bitstream representation.
  37. A method of video processing, comprising:
    performing a conversion between a video comprising a video region comprising a video block and a bitstream representation of the video, wherein the conversion comprises an application of an intra block copy (IBC) tool, and wherein a size of an IBC buffer is based on maximum configurable and/or allowable dimensions of the video block.
  38. The method of claim 37, wherein a width of the IBC buffer in luma samples is equal to N×N divided by a width or a height of the video block, wherein N×N is the maximum configurable dimensions of the video block in luma samples, and wherein N is an integer.
  39. The method of claim 38, wherein N = 1 << (log2_ctu_size_minus5 + 5) , wherein log2_ctu_size_minus5 denotes an indication of a coding tree unit (CTU) size.
  40. The method of any of claims 37 to 39, wherein the video block corresponds to a coding tree unit (CTU) representing a logical partition used for coding the video into the bitstream representation.
  41. The method of any of claims 1 to 40, wherein performing the conversion comprises generating the bitstream representation from the video.
  42. The method of any of claims 1 to 40, wherein performing the conversion comprises generating the video from the bitstream representation.
  43. An apparatus in a video system comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions, upon execution by the processor, cause the processor to implement the method recited in one or more of claims 1 to 42.
  44. A computer program product stored on a non-transitory computer-readable medium, the computer program product including program code for carrying out the method recited in one or more of claims 1 to 42.
PCT/CN2020/104784 2019-07-26 2020-07-27 Configurable coding tree unit size in video coding Ceased WO2021018081A1 (en)

Priority Applications (1): CN202080053833.8A — Configurable coding tree unit size in video coding (priority 2019-07-26, filed 2020-07-27).
Applications Claiming Priority (2): CN2019097926 and CNPCT/CN2019/097926 (priority 2019-07-26).

Publication: WO2021018081A1 (en), published 2021-02-04.

Family

ID=74229220

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/CN2020/104790 Ceased WO2021018084A1 (en) 2019-07-26 2020-07-27 Interdependence of transform size and coding tree unit size in video coding
PCT/CN2020/104784 Ceased WO2021018081A1 (en) 2019-07-26 2020-07-27 Configurable coding tree unit size in video coding

Family Applications Before (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/104790 Ceased WO2021018084A1 (en) 2019-07-26 2020-07-27 Interdependence of transform size and coding tree unit size in video coding

Country Status (2)

Country Link
CN (2) CN114175649A (en)
WO (2) WO2021018084A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023034748A1 (en) * 2021-08-31 2023-03-09 Bytedance Inc. Method, apparatus, and medium for video processing

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2025003032A1 (en) * 2023-06-30 2025-01-02 Interdigital Ce Patent Holdings, Sas Intra sub-partitions (isp) combination with intra block copy (ibc)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017123980A1 (en) * 2016-01-15 2017-07-20 Qualcomm Incorporated Multi-type-tree framework for video coding
US20180146205A1 (en) * 2014-04-14 2018-05-24 Avago Technologies General Ip (Singapore) Pte. Ltd. Pipelined video decoder system
WO2018131523A1 (en) * 2017-01-12 2018-07-19 ソニー株式会社 Image processing device and image processing method
EP3381186A1 (en) * 2015-11-25 2018-10-03 Qualcomm Incorporated(1/3) Flexible transform tree structure in video coding
WO2018217024A1 (en) * 2017-05-26 2018-11-29 에스케이텔레콤 주식회사 Apparatus and method for image encoding or decoding supporting various block sizes
US20190075328A1 (en) * 2016-03-16 2019-03-07 Mediatek Inc. Method and apparatus of video data processing with restricted block size in video coding
WO2019059676A1 (en) * 2017-09-20 2019-03-28 한국전자통신연구원 Method and device for encoding/decoding image, and recording medium having stored bitstream

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2015203103B2 (en) * 2010-08-17 2016-06-30 Samsung Electronics Co., Ltd. Video encoding method and apparatus using transformation unit of variable tree structure, and video decoding method and apparatus
SG191845A1 (en) * 2011-01-12 2013-08-30 Mitsubishi Electric Corp Moving image encoding device, moving image decoding device, moving image encoding method, and moving image decoding method
US9788019B2 (en) * 2011-03-09 2017-10-10 Hfi Innovation Inc. Method and apparatus of transform unit partition with reduced complexity
PL3849186T3 (en) * 2011-06-24 2024-01-03 Mitsubishi Electric Corporation Moving image encoding apparatus, moving image decoding apparatus, moving image encoding method and moving image decoding method
JP5810700B2 (en) * 2011-07-19 2015-11-11 ソニー株式会社 Image processing apparatus and image processing method
WO2013139212A1 (en) * 2012-03-21 2013-09-26 Mediatek Singapore Pte. Ltd. Method and apparatus for intra mode derivation and coding in scalable video coding
US9467701B2 (en) * 2012-04-05 2016-10-11 Qualcomm Incorporated Coded block flag coding
JP2014045434A (en) * 2012-08-28 2014-03-13 Nippon Hoso Kyokai <Nhk> Image encoder, image decoder and programs thereof
JP6341426B2 (en) * 2012-09-10 2018-06-13 サン パテント トラスト Image decoding method and image decoding apparatus
US9648335B2 (en) * 2013-07-12 2017-05-09 Qualcomm Incorporated Bitstream restrictions on picture partitions across layers
KR101709775B1 (en) * 2013-07-23 2017-02-23 인텔렉추얼디스커버리 주식회사 Method and apparatus for image encoding/decoding
WO2015012600A1 (en) * 2013-07-23 2015-01-29 성균관대학교 산학협력단 Method and apparatus for encoding/decoding image
US20150078457A1 (en) * 2013-09-13 2015-03-19 Qualcomm Incorporated Representation format signaling in multi-layer video coding
EP3120561B1 (en) * 2014-03-16 2023-09-06 VID SCALE, Inc. Method and apparatus for the signaling of lossless video coding
WO2016074147A1 (en) * 2014-11-11 2016-05-19 Mediatek Singapore Pte. Ltd. Separated coding tree for luma and chroma
US20180091810A1 (en) * 2015-03-23 2018-03-29 Lg Electronics Inc. Method for processing video signal and device therefor
JP6704932B2 (en) * 2015-03-31 2020-06-03 リアルネットワークス,インコーポレーテッド Residual transform and inverse transform method in video coding system
KR102199463B1 (en) * 2015-08-31 2021-01-06 삼성전자주식회사 Method and apparatus for image transform, and method and apparatus for image inverse transform based on scan order
JP6559337B2 (en) * 2015-09-23 2019-08-14 Nokia Technologies Oy 360-degree panoramic video encoding method, encoding apparatus, and computer program
US10284845B2 (en) * 2016-05-25 2019-05-07 Arris Enterprises Llc JVET quadtree plus binary tree (QTBT) structure with multiple asymmetrical partitioning
US10779007B2 (en) * 2017-03-23 2020-09-15 Mediatek Inc. Transform coding of video data

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180146205A1 (en) * 2014-04-14 2018-05-24 Avago Technologies General Ip (Singapore) Pte. Ltd. Pipelined video decoder system
EP3381186A1 (en) * 2015-11-25 2018-10-03 Qualcomm Incorporated Flexible transform tree structure in video coding
WO2017123980A1 (en) * 2016-01-15 2017-07-20 Qualcomm Incorporated Multi-type-tree framework for video coding
US20190075328A1 (en) * 2016-03-16 2019-03-07 Mediatek Inc. Method and apparatus of video data processing with restricted block size in video coding
WO2018131523A1 (en) * 2017-01-12 2018-07-19 Sony Corporation Image processing device and image processing method
WO2018217024A1 (en) * 2017-05-26 2018-11-29 SK Telecom Co., Ltd. Apparatus and method for image encoding or decoding supporting various block sizes
WO2019059676A1 (en) * 2017-09-20 2019-03-28 Electronics and Telecommunications Research Institute Method and device for encoding/decoding image, and recording medium having stored bitstream

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MENG WANG ET AL.: "Extended Quad-tree Partitioning for Future Video Coding", 2019 Data Compression Conference (DCC), 29 March 2019 (2019-03-29), pages 300–309, XP033548541, ISSN: 2375-0359 *
SHIH-TA HSIANG ET AL.: "CE1.7.0.1: Signaling maximum CU size for BT/TT split", Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 11th Meeting: Ljubljana, SI, 10–18 July 2018, JVET-K0229, 10 July 2018 (2018-07-10) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023034748A1 (en) * 2021-08-31 2023-03-09 Bytedance Inc. Method, apparatus, and medium for video processing

Also Published As

Publication number Publication date
CN114175650A (en) 2022-03-11
WO2021018084A1 (en) 2021-02-04
CN114175649A (en) 2022-03-11

Similar Documents

Publication Publication Date Title
US11553179B2 (en) Sample determination for adaptive loop filtering
CN113711604B (en) Signaling of chroma and luma syntax elements in video codecs
US12003712B2 (en) Handling video unit boundaries and virtual boundaries
US12439044B2 (en) Block size dependent use of video coding mode
US20230156189A1 (en) Sample padding in adaptive loop filtering
US11490082B2 (en) Handling video unit boundaries and virtual boundaries based on color format
US12184872B2 (en) Cross-component adaptive loop filter
US11539946B2 (en) Sample padding for cross-component adaptive loop filtering
CN113853798B (en) Signaling syntax elements based on chroma format
WO2021018081A1 (en) Configurable coding tree unit size in video coding
HK40063730A (en) Determination of picture partition mode based on block size
CN114902684A (en) Controlling cross-boundary filtering in video coding and decoding
HK40063730B (en) Determination of picture partition mode based on block size

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20847494

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20847494

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 26-08-22)