
WO2025042227A1 - Method and device for image encoding/decoding - Google Patents

Method and device for image encoding/decoding

Info

Publication number
WO2025042227A1
Authority
WO
WIPO (PCT)
Prior art keywords
block
filter
reference picture
interpolating
determined
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/KR2024/012566
Other languages
English (en)
Korean (ko)
Inventor
이영렬
임수연
송현주
최민경
유영환
이진영
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industry Academy Cooperation Foundation of Sejong University
Original Assignee
Industry Academy Cooperation Foundation of Sejong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industry Academy Cooperation Foundation of Sejong University
Publication of WO2025042227A1

Classifications

    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals, including:
    • H04N19/117 Filters, e.g. for pre-processing or post-processing
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/122 Selection of transform size, e.g. 8x8 or 2x4x8 DCT; selection of sub-band transforms of varying structure or type
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/176 Adaptive coding characterised by the coding unit, the unit being a block, e.g. a macroblock
    • H04N19/177 Adaptive coding characterised by the coding unit, the unit being a group of pictures [GOP]
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/587 Predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
    • H04N19/593 Predictive coding involving spatial prediction techniques

Definitions

  • the present disclosure proposes an intra-subpartition based intra-screen prediction technique for effective intra-screen prediction.
  • the present disclosure proposes a technique of an adaptive reference block interpolation filter for effective inter-screen prediction.
  • the present disclosure proposes improved prediction techniques compared to conventional techniques.
  • the present disclosure seeks to overcome the limitations of existing Intra SubPartition-based intra-screen prediction techniques.
  • existing Intra SubPartition-based intra-screen prediction techniques have limitations in that they divide one CU into K subpartitions and then apply the same intra-screen prediction mode to all subpartitions, different transformation kernels cannot be applied to different subpartitions, and the division method is also limited.
  • the present disclosure seeks to overcome the limitations of existing interpolation filters by providing an interpolation filter that adaptively improves encoding/decoding efficiency according to the frequency characteristics of a block.
  • the video encoding/decoding method, device, and recording medium of the present disclosure include the steps of: determining a filter for interpolating a reference picture of a current block in inter-screen prediction of a current video; interpolating the reference picture with the determined filter to obtain an interpolated reference picture; and performing motion prediction based on the interpolated reference picture, wherein the filter for interpolating the reference picture can be determined by comparing the size of the current block, the absolute value of a correlation of the reference block of the current block, or a Mean SAD (Sum of Absolute Differences) value of the reference block with a threshold value.
  • when the filter for interpolating the reference picture is determined by comparing the size of the current block with a threshold value, the filter may be determined as an 8-tap or 6-tap interpolation filter in response to the size of the current block being larger than the threshold value, and as a 12-tap interpolation filter in response to the size of the current block being smaller than or equal to the threshold value.
  • the threshold value may vary depending on the size of the current block.
  • the MeanSAD value of the reference block can be calculated based on the width and height of the reference block and the sample values of the reference block.
  • in response to the MeanSAD value of the reference block being greater than a threshold value, the filter for interpolating the reference picture may be determined as a 12-tap interpolation filter or an interpolation filter longer than 12 taps.
  • when the resolution of the current video is less than 4K, the filter for interpolating the reference picture can be determined by comparing the absolute value of the correlation of the reference block with a threshold value, and when the resolution of the current video is 4K or more, the filter can be determined as an 8-tap or 6-tap interpolation filter.
  • when the absolute correlation value of the reference block is greater than or equal to a threshold value, the filter for interpolating the reference picture may be determined as an 8-tap or 6-tap interpolation filter, and when the absolute correlation value is less than the threshold value, it may be determined as a 12-tap interpolation filter.
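As a rough sketch of the three selection criteria above, the following Python illustrates one possible decision per criterion; the threshold values, the MeanSAD stand-in, and all function names are illustrative assumptions, not values or formulas taken from the disclosure (the actual formulas are those of Figures 19 and 20).

```python
import numpy as np

def mean_sad(ref: np.ndarray) -> float:
    # Illustrative stand-in for the disclosure's MeanSAD (Fig. 20):
    # average absolute difference between horizontally adjacent
    # samples, normalized by the block's width and height.
    h, w = ref.shape
    return np.abs(np.diff(ref.astype(np.int64), axis=1)).sum() / (w * h)

def filter_taps_by_size(w: int, h: int, thr: int = 16) -> int:
    # Size rule: blocks larger than the threshold get an 8-tap
    # (or 6-tap) filter; smaller or equal blocks get 12 taps.
    return 8 if max(w, h) > thr else 12

def filter_taps_by_correlation(abs_corr: float, thr: float = 0.9) -> int:
    # Correlation rule: |correlation| >= threshold suggests mostly
    # low-frequency content -> 8 (or 6) taps, otherwise 12 taps.
    return 8 if abs_corr >= thr else 12

def filter_taps_by_mean_sad(ref: np.ndarray, thr: float = 2.0) -> int:
    # MeanSAD rule: a large MeanSAD suggests high-frequency content,
    # for which a 12-tap (or longer) filter is selected.
    return 12 if mean_sad(ref) > thr else 8
```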
  • the Intra SubPartition-based intra-screen prediction technique of the present disclosure is applied more adaptively to the situation at hand than the conventional Intra SubPartition-based intra-screen prediction technique, and thus can improve efficiency in the encoding/decoding of images.
  • FIG. 1 is a block diagram showing an image encoding device according to an embodiment of the present invention.
  • FIG. 2 is a block diagram showing an image decoding device (200) according to one embodiment of the present invention.
  • Figure 3 shows a flow diagram of intra-subpartition (ISP)-based intra-screen prediction.
  • Figure 4 is a diagram showing adjacent blocks used to generate an MPM list.
  • Figure 5 is a diagram showing an example of an intra-screen prediction mode.
  • Figure 6 is an example diagram illustrating a CU division method in ISP-based intra-screen prediction.
  • Figure 7 is a diagram illustrating an example of using a previous subpartition in configuring the MPM list of the current subpartition.
  • FIG. 8 is a diagram illustrating an embodiment of implicitly determining a combination of transformation kernels depending on the splitting direction and the prediction mode of the subpartition.
  • Figure 9 is a diagram illustrating an example of performing a transformation by making a non-rectangular subpartition into a rectangular shape.
  • Figure 10 shows an example of an equation for filter coefficients.
  • Figure 11 shows examples of formulas for DCT-II coefficients and IDCT-II coefficients.
  • Figure 12 is a diagram showing Magnitude-Frequency graphs representing the frequency characteristics at 1/2 luma sample positions of a 6-tap Gaussian-based IF, an 8-tap DCT-IF, and the proposed 12-tap DCT-IF.
  • Figure 13 shows an example of an auto-correlation formula.
  • Figure 14 shows an example of 12-tap DCT-IF coefficients.
  • Figure 17 shows the experimental results of the performance of the proposed method.
  • Figure 19 illustrates a formula for a method that uses correlation to determine the frequency characteristics of a reference block.
  • Figure 20 shows an equation for a method that uses Mean SAD to determine the frequency characteristics of a reference block.
  • first, second, A, B, etc. may be used to describe various components, but the components should not be limited by the terms. The terms are only used to distinguish one component from another.
  • the first component may be referred to as the second component, and similarly, the second component may also be referred to as the first component.
  • the term "and/or" includes any combination of a plurality of related described items or any one of a plurality of related described items.
  • an image can be composed of a series of still images, and these still images can be divided into GOP (Group of Pictures) units, and each still image can be referred to as a picture or a frame.
  • each picture can be divided into predetermined areas such as slices, tiles, and blocks.
  • one GOP can include units such as I picture, P picture, and B picture.
  • An I picture can mean a picture that is encoded/decoded on its own without using a reference picture
  • a P picture and a B picture can mean a picture that is encoded/decoded by performing processes such as motion estimation and motion compensation using a reference picture.
  • I pictures and P pictures can be used as reference pictures; for B pictures, I pictures and P pictures can be used as reference pictures, but this definition can also be changed depending on the encoding/decoding settings.
  • a picture referred to for encoding/decoding is called a reference picture
  • a block or pixel referred to is called a reference block or reference pixel
  • the reference data may be not only a pixel value in the spatial domain, but also a coefficient value in the frequency domain, and various encoding/decoding information generated and determined during the encoding/decoding process.
  • in the case of YCbCr 4:2:0, an image can be composed of one luminance component (Y in this example) and two chrominance components (Cb/Cr in this example), and in this case the composition ratio of the chrominance components to the luminance component can be 1:2 horizontally and vertically. In the case of 4:4:4, the components can have the same composition ratio horizontally and vertically.
  • the description will be made based on some color spaces (Y in this example) of some color formats (YCbCr in this example), and the same or similar application (settings dependent on a specific color space) can be made to other color spaces (Cb, Cr in this example) according to the color format.
  • the settings dependent on each color space can mean having settings proportional to or dependent on the composition ratio of each component (for example, determined according to 4:2:0, 4:2:2, or 4:4:4, etc.)
  • the independent settings for each color space can mean having settings only for the corresponding color space regardless of or independently of the composition ratio of each component.
  • some components can have independent settings or dependent settings depending on the encoder/decoder.
  • the configuration information or syntax elements required in the video encoding process can be determined at the unit level such as video, sequence, picture, slice, tile, block, etc., and can be included in the bitstream as units such as VPS (Video Parameter Set), SPS (Sequence Parameter Set), PPS (Picture Parameter Set), Slice Header, Tile Header, or Block Header and transmitted to the decoder, and the decoder can parse the units at the same level to restore the configuration information transmitted from the encoder and use it in the video decoding process.
  • Each parameter set has a unique ID value, and a lower parameter set can carry the ID value of the upper parameter set that it references.
  • setting information occurring in the above unit may include content about independent settings for each unit, or content about settings that are dependent on previous, subsequent, or upper units.
  • the dependent setting may be understood as indicating, with flag information, that the unit follows the settings of the previous, subsequent, or upper units (for example, a 1-bit flag: if 1, the settings are followed; if 0, they are not).
  • the setting information in the present invention will be explained focusing on examples of independent settings, but examples in which content about a dependent relationship on the setting information of previous or subsequent units, or upper units of the current unit, is added or substituted may also be included.
  • Encoding/decoding of video may generally be performed at the input size, but encoding/decoding may also be performed through size adjustment, for example in hierarchical encoding (scalable video coding), where the overall resolution of the video is adjusted by expansion or reduction. Information about this can be switched by assigning selection information in units such as the above-described VPS, SPS, PPS, and Slice Header.
  • the hierarchy between each unit can be set as VPS > SPS > PPS > Slice header, etc.
  • the image encoding device and the image decoding device may be user terminals such as a personal computer (PC), a notebook computer, a personal digital assistant (PDA), a portable multimedia player (PMP), a PlayStation Portable (PSP), a wireless communication terminal, a smart phone, or a TV, or server terminals such as an application server and a service server, and may include various devices equipped with a communication device such as a communication modem for communicating with various devices or wired/wireless communication networks, a memory for storing various programs and data for inter- or intra-prediction to encode or decode an image, and a processor for executing a program and performing calculations and control.
  • the video encoding device and the video decoding device may be separate devices, but may be made into a single video encoding/decoding device depending on the implementation. In that case, some components of the video encoding device may be implemented to include at least the same structure as some components of the video decoding device as substantially the same technical elements or to perform at least the same function.
  • since the image decoding device corresponds to a computing device that applies the image encoding method performed in the image encoding device to decoding, the description below will focus on the image encoding device.
  • a computing device may include a memory storing a program or software module implementing an image encoding method and/or an image decoding method, and a processor connected to the memory to execute the program.
  • the image encoding device and the image decoding device may be referred to as an encoder and a decoder, respectively.
  • an image encoding device may include an image segmentation unit (101), an intra-screen prediction unit (102), an inter-screen prediction unit (103), a subtraction unit (104), a transformation unit (105), a quantization unit (106), an entropy encoding unit (107), an inverse quantization unit (108), an inverse transformation unit (109), an adder (110), a filter unit (111), and a memory (112).
  • some components may not be essential components that perform essential functions in the present invention, but may be optional components that are merely used to improve performance.
  • the present invention may be implemented by including only essential components for implementing the essence of the present invention, excluding components that are merely used to improve performance, and a structure that includes only essential components, excluding optional components that are merely used to improve performance, is also included in the scope of the present invention.
  • the image segmentation unit (101) can segment an input image into at least one block.
  • the input image can have various shapes and sizes such as a picture, a slice, a tile, and a segment.
  • the block can mean an encoding unit (CU, or encoding block), a prediction unit (PU, or prediction block), or a transformation unit (TU, or transformation block). All or some of the encoding units, the prediction units, and the transformation units can be the same.
  • the encoding unit and the prediction unit can be the same, but the transformation unit can be a lower unit of the prediction unit.
  • the encoding unit, the prediction unit, and the transformation unit can all be different.
  • the segmentation can be performed based on at least one of quadtree division, binary tree division, ternary tree division, and geometric division.
  • Quadtree division is a method of dividing an upper block into lower blocks whose width and height are half of those of the upper block.
  • Binary tree partitioning is a method of dividing an upper block into two sub-blocks, each of which is half the width or height of the upper block.
  • Ternary tree partitioning is a method of dividing an upper block into three sub-blocks, each of which is one-third the width or height of the upper block.
  • Geometric partitioning is a method of dividing an upper block into sub-blocks by dividing lines of various angles based on slope and/or distance from the center.
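As a concrete illustration of the axis-aligned splits just described, the sketch below computes the sub-block dimensions each split produces (geometric partitioning is omitted since its sub-blocks are not rectangles); this is a hypothetical helper for illustration, not the device's actual segmentation code.

```python
def quadtree_split(w: int, h: int):
    # Four sub-blocks, each half the width and half the height.
    return [(w // 2, h // 2)] * 4

def binary_split(w: int, h: int, horizontal: bool = True):
    # Two sub-blocks, each half the height (horizontal split)
    # or half the width (vertical split) of the upper block.
    return [(w, h // 2)] * 2 if horizontal else [(w // 2, h)] * 2

def ternary_split(w: int, h: int, horizontal: bool = True):
    # Three sub-blocks, each one third of the height or width,
    # as described in the text above.
    return [(w, h // 3)] * 3 if horizontal else [(w // 3, h)] * 3
```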
  • the prediction unit (102, 103) may include an inter-prediction unit (103) that performs inter-prediction and an intra-prediction unit (102) that performs intra-prediction. It may be determined whether to use inter-prediction or intra-prediction for a prediction unit, and specific information (e.g., intra-prediction mode, motion vector, reference picture, etc.) according to each prediction method may be determined. At this time, the processing unit where the prediction is performed and the processing unit where the prediction method and specific contents are determined may be different.
  • the prediction method and prediction mode, etc. may be determined in prediction units, and prediction may be performed in transformation units.
  • the current block may be a target block of each step of the encoding/decoding method or the operation of each part of the encoding/decoding device.
  • when the prediction unit in which the prediction method and the prediction mode, etc. are determined is viewed as the current block, the transformation unit in which the prediction is performed may be expressed as a sub-block of the current block; conversely, when the transformation unit in which the prediction is performed is viewed as the current block, the prediction unit in which the prediction method and the prediction mode, etc. are determined may be expressed as an upper block of the current block.
  • the residual value (residual block) between the generated prediction block and the original block can be input to the transformation unit (105).
  • the prediction mode information, motion vector information, etc. used for prediction can be encoded by the entropy encoding unit (107) together with the residual value and transmitted to the decoder.
  • in a specific encoding mode, it is also possible to encode the original block as it is and transmit it to the decoder without generating a prediction block through the prediction units (102, 103).
  • the intra-screen prediction unit (102) can generate a prediction block based on reference pixel information around the current block, which is pixel information within the current picture. If the prediction mode of a block surrounding the current block on which intra prediction is to be performed is inter prediction, a reference pixel included in the surrounding block to which inter prediction is applied can be replaced with a reference pixel within another surrounding block to which intra prediction is applied. That is, if a reference pixel is not available, the unavailable reference pixel information can be replaced with at least one of the available reference pixels.
  • the intra-screen prediction unit (102) may include an Adaptive Intra Smoothing (AIS) filter, a reference pixel interpolation unit, and a DC filter.
  • the AIS filter is a filter that performs filtering on the reference pixels of the current block and can adaptively determine whether to apply the filter according to the prediction mode of the current prediction unit. If the prediction mode of the current block is a mode that does not perform AIS filtering, the AIS filter may not be applied.
  • the reference pixel interpolation unit of the intra-screen prediction unit (102) can interpolate the reference pixels to generate a reference pixel at a fractional position when the intra prediction mode of the prediction unit is a mode that performs intra prediction based on interpolated reference pixel values.
  • if the prediction mode of the current prediction unit is a prediction mode that generates a prediction block without interpolating the reference pixels, the reference pixels may not be interpolated.
  • the DC filter can generate a prediction block through filtering when the prediction mode of the current block is the DC mode.
  • the inter-screen prediction unit (103) generates a prediction block using the previously restored reference image and motion information stored in the memory (112).
  • the motion information may include, for example, a motion vector, a reference picture index, a list 1 prediction flag, a list 0 prediction flag, etc.
  • the inter-screen prediction unit (103) can derive a prediction block based on information of at least one picture among the previous picture or the subsequent picture of the current picture.
  • the prediction block of the current block can also be derived based on information of a part of the current picture in which encoding is completed.
  • the inter-screen prediction unit (103) according to one embodiment of the present invention can include a reference picture interpolation unit, a motion prediction unit, and a motion compensation unit.
  • the reference picture interpolation unit can receive reference picture information from the memory (112) and generate sub-integer pixel information from the reference picture.
  • a DCT-based 8-tap interpolation filter (DCT-based Interpolation Filter) with different filter coefficients can be used to generate sub-integer pixel information in units of 1/4 pixel.
  • a DCT-based 4-tap interpolation filter (DCT-based Interpolation Filter) with different filter coefficients can be used to generate sub-integer pixel information in units of 1/8 pixel.
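To make the interpolation step concrete, here is a minimal sketch of applying a 1D interpolation filter along one row of samples; the fixed-point scheme (taps summing to 64, 6-bit shift) is a common DCT-IF convention, and the example taps are the widely published 8-tap half-sample luma coefficients, not necessarily the exact filters of this device.

```python
import numpy as np

# Widely published 8-tap DCT-IF half-sample luma taps (sum = 64).
HALF_PEL_TAPS = np.array([-1, 4, -11, 40, 40, -11, 4, -1], dtype=np.int64)

def interpolate_row(samples: np.ndarray, taps: np.ndarray) -> np.ndarray:
    # Apply a 1D FIR interpolation filter to one row of integer-position
    # samples, producing one fractional-position sample per input sample.
    n = len(taps)
    padded = np.pad(samples.astype(np.int64), (n // 2 - 1, n // 2),
                    mode='edge')  # replicate border samples
    out = np.empty(len(samples), dtype=np.int64)
    for i in range(len(samples)):
        # Weighted sum, then round and normalize (taps sum to 64).
        out[i] = (np.dot(padded[i:i + n], taps) + 32) >> 6
    return np.clip(out, 0, 255)  # clip to the 8-bit sample range

row = np.array([100, 102, 110, 130, 160, 180, 185, 186])
half_pel = interpolate_row(row, HALF_PEL_TAPS)
```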
  • the subtraction unit (104) subtracts the block currently to be encoded from the prediction block generated from the intra-screen prediction unit (102) or inter-screen prediction unit (103) to generate a residual block of the current block.
  • the residual block including the residual data can be transformed using a transformation method such as DCT, DST, or KLT (Karhunen-Loève Transform).
  • the transformation method can be determined based on the intra prediction mode of the prediction unit used to generate the residual block. For example, depending on the intra prediction mode, DCT may be used in the horizontal direction and DST may be used in the vertical direction.
  • the above transformation unit (105) and/or quantization unit (106) may be optionally included in the image encoding device (100). That is, the image encoding device (100) may encode the residual block by performing at least one of transformation or quantization on the residual data of the residual block, or skipping both transformation and quantization. Even if neither transformation nor quantization is performed in the image encoding device (100), a block that enters the input of the entropy encoding unit (107) is typically referred to as a transformation block.
  • the entropy encoding unit (107) entropy-encodes the input data.
  • the entropy encoding may use various encoding methods, such as, for example, Exponential Golomb, CAVLC (Context-Adaptive Variable Length Coding), and CABAC (Context-Adaptive Binary Arithmetic Coding).
  • the entropy encoding unit (107) can encode various information such as coefficient information of a transform block, block type information, prediction mode information, division unit information, prediction unit information, transmission unit information, motion vector information, reference frame information, block interpolation information, and filtering information.
  • the coefficients of a transform block can be encoded in units of sub blocks within the transform block.
  • various syntax elements can be encoded, such as Last_sig, a syntax element indicating the position of the first non-zero coefficient in reverse scan order, Coded_sub_blk_flag, a flag indicating whether there is at least one non-zero coefficient in a subblock, Sig_coeff_flag, a flag indicating whether the coefficient is non-zero, Abs_greater1_flag, a flag indicating whether the absolute value of the coefficient is greater than 1, Abs_greater2_flag, a flag indicating whether the absolute value of the coefficient is greater than 2, and Sign_flag, a flag indicating the sign of the coefficient.
  • the residual value of the coefficient that is not encoded by the above syntax elements can be encoded via the syntax element remaining_coeff.
  • the inverse quantization unit (108) and the inverse transformation unit (109) inversely quantize the values quantized in the quantization unit (106) and inversely transform the values transformed in the transformation unit (105).
  • the residual values generated in the inverse quantization unit (108) and the inverse transformation unit (109) can be combined with the prediction units predicted through the motion estimation unit, the motion compensation unit, and the intra-screen prediction unit (102) included in the prediction units (102, 103) to generate a reconstructed block.
  • the adder (110) adds the prediction blocks generated in the prediction units (102, 103) and the residual blocks generated through the inverse transformation unit (109) to generate a reconstructed block.
  • the filter unit (111) may include at least one of a deblocking filter, an offset correction unit, and an ALF (Adaptive Loop Filter).
  • the offset correction unit can correct the offset from the original image on a pixel basis for the image on which deblocking has been performed.
  • a method can be used in which the pixels included in the image are divided into a certain number of regions, the regions to be offset are determined, and the offset is applied to the regions, or a method can be used in which the offset is applied by considering the edge information of each pixel.
  • Adaptive Loop Filtering can be performed based on a comparison between the filtered restored image and the original image. After dividing the pixels included in the image into predetermined groups, one filter to be applied to each group is determined, and filtering can be performed differentially for each group. Information on whether to apply ALF can be transmitted per coding unit (CU) for the luminance signal, and the shape and filter coefficients of the ALF filter to be applied can differ for each block. Alternatively, an ALF filter of the same shape (fixed shape) can be applied regardless of the characteristics of the target block.
  • the image decoding device (200) may include an entropy decoding unit (201), an inverse quantization unit (202), an inverse transformation unit (203), an adder (204), a filter unit (205), a memory (206), and prediction units (207, 208).
  • an image bitstream generated by an image encoding device (100) is input to an image decoding device (200)
  • the input bitstream can be decoded according to a process opposite to the process performed in the image encoding device (100).
  • the adder (204) adds the prediction block generated by the intra-screen prediction unit (207) or the inter-screen prediction unit (208) and the residual block generated through the inverse transformation unit (203) to generate a restored block. It operates substantially the same as the adder (110) of Fig. 1.
  • the filter unit (205) reduces various types of noise occurring in restored blocks.
  • the filter unit (205) may include a deblocking filter, an offset correction unit, and an ALF.
  • the deblocking filter of the video decoding device (200) can receive information related to the deblocking filter provided from the video encoding device (100) and perform deblocking filtering on the current block in the video decoding device (200).
  • the offset correction unit can perform offset correction on the restored image based on information such as the type of offset correction applied to the image during encoding and the offset value.
  • the prediction unit (207, 208) can generate a prediction block based on prediction block generation related information provided from the entropy decoding unit (201) and previously decoded block or picture information provided from the memory (206).
  • the prediction unit (207, 208) may include an intra-screen prediction unit (207) and an inter-screen prediction unit (208). Although not illustrated separately, the prediction unit (207, 208) may further include a prediction unit determination unit.
  • the prediction unit determination unit may receive various information such as prediction unit information input from the entropy decoding unit (201), prediction mode information of an intra-prediction method, and motion prediction-related information of an inter-prediction method, and may distinguish a prediction unit from a current encoding unit and determine whether the prediction unit performs inter-prediction or intra-prediction.
  • the inter-screen prediction unit (208) may perform inter-screen prediction for the current prediction unit based on information included in at least one of a previous picture or a subsequent picture of the current picture including the current prediction unit, by using information necessary for inter-prediction of the current prediction unit provided from the video encoding device (100). Alternatively, inter-screen prediction can be performed based on information from some previously-reconstructed region within the current picture containing the current prediction unit.
  • in order to perform inter-screen prediction, it can be determined, based on the coding unit, whether the motion prediction method of the prediction unit included in the coding unit is Skip mode, Merge mode, or AMVP mode.
  • the intra-screen prediction unit (207) generates a prediction block using previously restored pixels located around the current block.
  • the intra-screen prediction unit (207) may include an Adaptive Intra Smoothing (AIS) filter, a reference pixel interpolation unit, and a DC filter.
  • the AIS filter is a filter that performs filtering on the reference pixels of the current block and may adaptively determine whether to apply the filter according to the prediction mode of the current prediction unit.
  • the AIS filter may be performed on the reference pixels of the current block using the prediction mode and AIS filter information of the prediction unit provided by the image encoding device (100). If the prediction mode of the current block is a mode that does not perform AIS filtering, the AIS filter may not be applied.
  • the current block may be the same as described in the encoding device.
  • the reference pixel interpolation unit of the intra-screen prediction unit (207) can interpolate the reference pixels to generate a reference pixel at a fractional position when the prediction mode of the prediction unit is a mode that performs intra prediction based on interpolated reference pixel values.
  • the generated reference pixel at the fractional unit position can be used as a prediction pixel of a pixel in the current block.
  • if the prediction mode of the current prediction unit is a prediction mode that generates a prediction block without interpolating the reference pixels, the reference pixels may not be interpolated.
  • the DC filter can generate a prediction block through filtering when the prediction mode of the current block is the DC mode.
  • the current block can be the same as described in the encoding device.
  • the intra-screen prediction unit (207) operates substantially the same as the intra-screen prediction unit (102) of Fig. 1.
  • the inter-screen prediction unit (208) generates an inter-screen prediction block using reference pictures and motion information stored in the memory (206).
  • the inter-screen prediction unit (208) operates substantially the same as the inter-screen prediction unit (103) of Fig. 1.
  • Figure 3 shows a flow diagram of intra-subpartition (ISP)-based intra-screen prediction.
  • the Luma block of the present disclosure can be divided into K subpartitions in the horizontal or vertical direction.
  • the sizes of the subpartitions can be the same, but in some cases, they can be different from each other.
  • the encoding process of prediction and transformation can be performed individually for each subpartition, but the prediction mode and transformation kernel can be shared between the subpartitions.
  • each subpartition can perform prediction by sharing the prediction mode of the upper block, the Luma block.
  • the ISP-based intra-picture prediction can be applied to blocks ranging in size from 4x8 or 8x4 up to 64x64.
  • the above K can be a natural number greater than or equal to 1.
  • the K can be determined according to the size of the block. For example, in the case of 4x8 and 8x4 blocks, K can be set to 2, and in other cases, K can be determined as 4. In this case, at least 16 samples (or pixels) can exist in each subpartition.
  • if the size of the block to which the ISP-based intra-screen prediction technology is applied is WxH, the size of each subpartition can be WxH/K in the case of horizontal partitioning and W/KxH in the case of vertical partitioning.
  • the processing order of each subpartition can proceed from top to bottom in the case of horizontal partitioning, and from left to right in the case of vertical partitioning. According to the processing order, the subpartition that is restored first can be used for the prediction and transformation of the next subpartition.
  • the transformation kernel can be implicitly determined based on the block size. For example, if W is greater than or equal to 4 and H is greater than or equal to 16, DST7 can be used, otherwise DCT2 can be used.
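Putting the ISP rules above together (the choice of K, the subpartition size, and the implicit kernel), a compact sketch might look as follows; it merely restates the rules from the text, and the function names are illustrative.

```python
def isp_subpartitions(w: int, h: int, horizontal: bool = True):
    # K = 2 for 4x8 and 8x4 blocks, otherwise K = 4, so each
    # subpartition keeps at least 16 samples.
    k = 2 if (w, h) in ((4, 8), (8, 4)) else 4
    return [(w, h // k)] * k if horizontal else [(w // k, h)] * k

def implicit_kernel(w: int, h: int) -> str:
    # Implicit first-transform kernel choice as stated above:
    # DST-7 when W >= 4 and H >= 16, otherwise DCT-2.
    return "DST7" if w >= 4 and h >= 16 else "DCT2"
```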
  • LFNST (Low Frequency Non-Separable Transform)
  • the intra_subpartitions_mode_flag signaled from the bitstream can indicate whether the ISP technology is applied to the current block. Accordingly, if the intra_subpartitions_mode_flag is true, the intra_subpartitions_split_flag can be additionally signaled.
  • the intra_subpartitions_split_flag can determine at least one of the splitting direction, the number of splits, or the splitting shape to which the ISP technology of the current block is applied. For example, the vertical splitting or the horizontal splitting can be determined depending on the intra_subpartitions_split_flag value.
  • even if the intra_subpartitions_mode_flag value is true, since the subpartitions (or sub-blocks) of the current block share the prediction mode of the current block, the prediction mode of the current block can be determined by applying the Most Probable Modes (MPM) technique of the current block regardless of the intra_subpartitions_mode_flag value.
  • intra_luma_mpm_flag can be examined, and if intra_luma_mpm_flag is true, intra_luma_not_planar_flag can be examined.
  • intra_luma_mpm_flag can indicate whether the intra-screen prediction mode of the current block is determined using any one of the MPM candidates.
  • intra_luma_not_planar_flag can indicate whether the intra-screen prediction mode of the current block is the Planar mode. That is, if intra_luma_not_planar_flag is true, the intra-screen prediction mode of the current block is determined as the Planar mode, and if intra_luma_not_planar_flag is false, the intra-screen prediction mode of the current block can be determined from MPM candidates excluding the Planar mode.
  • intra_luma_not_planar_flag is false, intra_luma_mpm_index can be additionally examined.
  • the intra_luma_mpm_index indicates the prediction mode of one of the MPM candidates, and the prediction mode can be determined as the intra-screen prediction mode of the current block.
  • the intra-screen prediction mode of the current block can be determined as one of the remaining prediction modes excluding the MPM candidates among the available intra-screen prediction modes.
  • the one of the prediction modes can be determined as intra_luma_mpm_remainder.
  • the intra_luma_mpm_remainder can indicate one of the remaining prediction modes.
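The flag cascade of the preceding bullets can be summarized as the following decision sketch; `read_flag` and `read_index` are hypothetical bitstream-reader helpers, not syntax from the disclosure.

```python
PLANAR = 0

def parse_intra_mode(reader, mpm_list):
    # Decision cascade for the luma intra prediction mode.
    if reader.read_flag("intra_luma_mpm_flag"):
        if not reader.read_flag("intra_luma_not_planar_flag"):
            return PLANAR  # the mode is Planar
        # Otherwise index into the MPM candidates, excluding Planar.
        idx = reader.read_index("intra_luma_mpm_index")
        return [m for m in mpm_list if m != PLANAR][idx]
    # Not an MPM candidate: one of the remaining prediction modes.
    return reader.read_index("intra_luma_mpm_remainder")
```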
  • Figure 4 is a diagram showing adjacent blocks used to generate an MPM list.
  • the intra-screen prediction mode of the current block can be determined based on the MPM list.
  • the MPM list can include at least one MPM candidate.
  • the MPM candidates can be determined based on the surrounding blocks of the current block.
  • the surrounding blocks of the current block can be at least one of a block adjacent to the left side of the current block or a block adjacent to the top side of the current block.
  • the block adjacent to the left side of the current block can be the lowermost block among the blocks adjacent to the left side when there are multiple blocks adjacent to the left side of the current block.
  • the block adjacent to the top of the current block can be the rightmost block among the blocks adjacent to the top when there are multiple blocks adjacent to the top of the current block.
  • the prediction mode of the left-bottommost block adjacent to the current block in Fig. 4 is called L
  • the prediction mode of the top-rightmost block adjacent to the current block is called A
  • the vertical mode is called V
  • the horizontal mode is called H.
  • Max can mean max(A, L), and Min can mean min(A, L).
  • the Planar mode can be assigned first, and candidates derived from the adjacent blocks can be added to the candidate list after the Planar mode.
  • the MPM list may consist of ⁇ Planar, Max, Max-1, Max+1, Max-2, Max+2 ⁇ .
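A minimal sketch of that construction, assuming directional modes indexed as in Figure 5 and leaving out the wrap-around handling a real codec would need:

```python
def build_mpm_list(mode_l: int, mode_a: int, planar: int = 0):
    # {Planar, Max, Max-1, Max+1, Max-2, Max+2}, where Max = max(A, L),
    # with duplicates removed while preserving order.
    mx = max(mode_l, mode_a)
    mpm, seen = [], set()
    for m in (planar, mx, mx - 1, mx + 1, mx - 2, mx + 2):
        if m not in seen:
            seen.add(m)
            mpm.append(m)
    return mpm
```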
  • the ISP-based intra-screen prediction mode of the present disclosure can be based on the ISP-based intra-screen prediction mode described above.
  • the ISP-based intra-picture prediction mode of the present disclosure can use different intra-picture prediction modes between subpartitions (sub-blocks). Or, as described above, the same intra-picture prediction mode can be used between subpartitions.
  • a flag indicating whether different intra-picture prediction modes are used between subpartitions can be transmitted through a bitstream. Or, whether different intra-picture prediction modes are used between subpartitions can be determined based on attributes of a current block.
  • the attributes of the current block can mean the position, size, shape, width, length ratio of width/height, length of either width or height, split depth, image type (I/P/B), color components (e.g., luminance, chrominance), value of intra-picture prediction mode, whether intra-picture prediction mode is a non-directional mode, angle of intra-picture prediction mode, position of reference pixel, etc.
  • the prediction modes of the subpartitions can be prediction modes spaced apart from the horizontal intra prediction mode, and the two prediction modes can be symmetrical with respect to the horizontal intra prediction mode.
  • the prediction modes of the subpartitions follow the prediction mode of the current block.
  • the prediction modes of the subpartitions may be different from each other, obtained by adjusting the prediction mode of the current block through the addition of an offset value.
  • a subpartition may be obtained by dividing the current block by a method other than vertical or horizontal division, and the sizes of the subpartitions may be different from each other. Or, some of the subpartitions may have the same size while others differ. Or, the subpartitions may all have the same size.
  • the processing order of each subpartition of the ISP technology of the present disclosure can be at least one of a first order proceeding from top to bottom, a second order proceeding from bottom to top, a third order proceeding from left to right, or a fourth order proceeding from right to left.
  • the processing order of each subpartition can be performed in a diagonal direction by combining the first order and the third order or the fourth order.
  • the processing order of each subpartition can be performed in a diagonal direction by combining the second order and the third order or the fourth order.
  • information for obtaining the prediction mode of each subpartition may be encoded/decoded for each subpartition.
  • the information may be an index indicating one of the prediction modes available to the current block.
  • the information may include at least one of the above-described MPM-related flags or indexes. That is, the method for obtaining the prediction mode within a screen of an existing CU (coding unit) may be applied to a subpartition (or sub-block) as is.
  • the prediction mode of the current subpartition can be determined using at least one of information from the current CU (coding unit), adjacent CUs, or previous subpartitions.
  • the previous subpartition can indicate a subpartition that is prior to the current subpartition in terms of processing order. In other words, it can be a subpartition that has performed at least one of prediction or transformation before the current subpartition.
  • the prediction mode of the current subpartition can be determined based on the intra-screen prediction mode of the current block.
  • the prediction mode of the first subpartition can be a prediction mode obtained by adding a value K to the intra-screen prediction mode of the current block
  • the prediction mode of the second subpartition can be a prediction mode obtained by subtracting a value K from the intra-screen prediction mode of the current block.
  • the prediction mode of the subpartition can be determined as a mode within a range of ⁇ K of the intra-screen prediction mode of the current block. The determination can be performed based on the aforementioned attribute of the current block or index information signaled from a bitstream.
  • the prediction mode of the current subpartition can be determined based on the prediction mode of the previous subpartition.
  • the MPM list of the current subpartition can be constructed based on the prediction mode of the previous subpartition.
  • the MPM list of the current subpartition can include a prediction mode obtained by adding or subtracting a value m from the prediction mode of the previous subpartition.
  • m can be a natural number greater than or equal to 1.
  • the value m can be explicitly signaled from the bitstream or implicitly determined from the attributes of the current block described above.
  • the MPM list of the current subpartition can include the prediction mode of the previous subpartition.
  • the MPM list of the current subpartition can be constructed excluding the previous subpartition.
  • the prediction modes available for the current block are divided into a plurality of groups, and a first group used to obtain the prediction mode of the first subpartition among the plurality of groups may be excluded from obtaining the prediction mode of the second subpartition. That is, the prediction mode of the second subpartition may be obtained from any one of the remaining groups excluding the first group among the plurality of groups.
  • the number of the plurality of groups may be a natural number greater than 1, such as 2, 3, or 4, and the number of prediction modes in each group may be a natural number greater than or equal to 1, such as 1, 2, 3, 4, 5, or 6.
  • the prediction mode of the second subpartition can be determined as a prediction mode obtained by adding a value m to, or subtracting it from, the prediction mode of the first subpartition.
  • m can be a natural number greater than or equal to 1.
  • the value m can be explicitly signaled from the bitstream or implicitly determined from the properties of the current block described above.
  • the prediction mode of the current subpartition can be determined within a certain range (K-m) to (K+m) around the prediction mode (K) of the current block, excluding the prediction mode of the previous subpartition.
  • m can be a natural number greater than or equal to 1.
  • the value of m can be explicitly signaled from the bitstream, or can be implicitly determined from the properties of the current block described above.
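Read as code, this rule enumerates the candidate modes in the range around the current block's mode K while excluding the previous subpartition's mode; a hypothetical one-liner:

```python
def candidate_modes(k: int, prev_mode: int, m: int):
    # Modes in [K - m, K + m], excluding the previous subpartition's mode.
    return [mode for mode in range(k - m, k + m + 1) if mode != prev_mode]
```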
  • prediction of the current subpartition can be performed using the restored sample of the previous subpartition.
  • Figure 5 is a diagram showing an example of an intra-screen prediction mode.
  • the subpartitions of the present disclosure can be independently encoded in at least one of transformation, quantization, or entropy encoding processes following ISP-based intra-picture prediction.
  • Each subpartition can be encoded using a different transformation kernel depending on at least one of the size or prediction mode of the subpartition in the first transformation.
  • the 1D (1-Dimensional) horizontal transformation kernel can be DST-7 and the 1D vertical transformation kernel can be DST-7.
  • the transformation kernel derived for each mode may be KLT as a 1D horizontal transformation kernel and KLT as a 1D vertical transformation kernel.
  • a combination of the transformation kernels of each subpartition can be transmitted via a bitstream. That is, the transformation kernel of the subpartition can be explicitly determined based on information transmitted via the bitstream, or implicitly determined based on at least one of the attributes of the current block described above or the prediction mode within the screen.
  • Figure 6 is an example diagram illustrating a CU division method in ISP-based intra-screen prediction.
  • the splitting method can be transmitted in the bitstream via a flag or index, or determined based on the properties of the current block as described above.
  • if the size of the CU being split is Wx1 or 1xH, it can only be split in the vertical or horizontal direction, and the maximum number of subpartitions can be 4.
  • Although Fig. 6 illustrates only the division of a square block as an example, divisions similar to Fig. 6 may also be possible for non-square blocks. However, the number of methods for dividing non-square blocks may be smaller than that for dividing square blocks.
  • Figure 7 is a diagram illustrating an example of using a previous subpartition in configuring the MPM list of the current subpartition.
  • the current subpartition can construct the MPM list of the current subpartition using the prediction mode P of the previous subpartition.
  • the MPM list of the current subpartition can be constructed using at least one of L, the prediction mode of the left neighboring block of the current subpartition, or A, the prediction mode of the upper neighboring block (see the sketch below).
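A minimal sketch of such an MPM construction, assuming a hypothetical ordering (P first, then L and A, padded with defaults); the disclosure does not specify the list size or the padding rule:

```python
def build_mpm_list(p: int | None, l: int | None, a: int | None,
                   size: int = 3) -> list[int]:
    """Hypothetical MPM list: the previous subpartition's mode P, then
    the left (L) and above (A) neighbor modes, deduplicated, padded
    with defaults (0 = Planar, 1 = DC, 50 = vertical in VVC numbering)."""
    mpm: list[int] = []
    for mode in (p, l, a, 0, 1, 50):
        if mode is not None and mode not in mpm:
            mpm.append(mode)
        if len(mpm) == size:
            break
    return mpm

print(build_mpm_list(p=18, l=18, a=50))  # -> [18, 50, 0]
```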
  • FIG. 8 is a diagram illustrating an embodiment of implicitly determining a combination of transform kernels depending on the splitting direction and the prediction mode of the subpartition (an illustrative mapping is sketched below).
  • the combination of transform kernels of each subpartition can be a 1D horizontal transform kernel of DST-7 and a 1D vertical transform kernel of DST-7.
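A purely illustrative sketch of such an implicit rule; the actual combinations are those of Fig. 8, which is not reproduced here, so the mode threshold and the kernel pairs below are assumptions:

```python
def kernel_pair(split_dir: str, intra_mode: int) -> tuple[str, str]:
    """Illustrative (horizontal, vertical) kernel choice from the split
    direction and the subpartition's intra mode; 34 is the diagonal
    mode in VVC numbering and is used here only as a placeholder cut."""
    if split_dir == "horizontal":
        return ("DST-7", "DCT-2") if intra_mode >= 34 else ("DST-7", "DST-7")
    return ("DCT-2", "DST-7") if intra_mode >= 34 else ("DST-7", "DST-7")
```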
  • Figure 9 is a diagram illustrating an example of performing a transform by expanding a non-rectangular subpartition into a rectangular shape.
  • in this example, a triangular subpartition is expanded to create a rectangular shape.
  • the coefficient value of the expanded blank area can be determined as 0.
  • here the CU is divided into four triangular subpartitions in an X shape, and the example concerns the subpartition located at the upper part of the CU.
  • the transform can be performed by expanding the area adjacent to the two short sides of the triangle to create a rectangular shape.
  • the coefficient value of the expanded empty area can be determined as 0 (a zero-padding sketch follows the Fig. 9 items below).
  • the expanded rectangular shape of the first subpartition located at the upper part of the four-divided CU of Fig. 9B may be a rectangle whose width is larger than its height.
  • the expanded rectangular shape of the second subpartition located at the left part of the four-divided CU of Fig. 9B may be a rectangle whose height is larger than its width.
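A minimal sketch of the zero-padding idea of Fig. 9, assuming hypothetical function and argument names; the coefficients inside the triangle are placed into the smallest enclosing rectangle and the blank area is filled with 0 before the transform:

```python
import numpy as np

def expand_triangle_to_rect(coeffs: list[tuple[int, int, float]],
                            height: int, width: int) -> np.ndarray:
    """coeffs holds (row, col, value) triples lying inside the triangular
    subpartition; positions outside it stay 0 in the expanded rectangle."""
    rect = np.zeros((height, width), dtype=np.float64)  # blank area = 0
    for r, c, v in coeffs:
        rect[r, c] = v
    return rect  # a rectangle that a separable transform can process
```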
  • the present disclosure proposes a new interpolation filter and a method of applying it to effectively perform ME (Motion Estimation) and MC (Motion Compensation) in inter prediction.
  • ME: Motion Estimation
  • MC: Motion Compensation
  • IF: Interpolation Filter
  • AMVR: Adaptive Motion Vector Resolution
  • Figure 10 shows an example of an equation for filter coefficients.
  • N represents the total number of taps of IF and n represents the sample position.
  • w[n] can represent the filter coefficient.
  • the 8-tap DCT-IF applied to the 1/2 and 1/4 luma sample positions is derived from the DCT-II (Discrete Cosine Transform-II) and IDCT-II (Inverse DCT-II) below, and can restore both low-frequency and high-frequency components well. Therefore, effective ME/MC can be possible when the reference block has high-frequency characteristics.
  • the current VVC applies the same DCT-IF regardless of the degree of high frequency of the reference block.
  • Figure 11 shows examples of formulas for DCT-II coefficients and IDCT-II coefficients.
  • X(k) may represent the DCT-II coefficients and x(n) the samples reconstructed by the IDCT-II (the standard forms are reproduced below).
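Figure 11 itself is not reproduced here; the standard DCT-II/IDCT-II pair on which DCT-IF derivations are commonly based is given below as a reference (the normalization used in Fig. 11 may differ):

$$X(k) = \sqrt{\tfrac{2}{N}}\; c_k \sum_{n=0}^{N-1} x(n)\,\cos\!\left(\frac{(2n+1)\,k\,\pi}{2N}\right), \qquad k = 0,\dots,N-1$$

$$x(n) = \sqrt{\tfrac{2}{N}} \sum_{k=0}^{N-1} c_k\, X(k)\,\cos\!\left(\frac{(2n+1)\,k\,\pi}{2N}\right), \qquad c_k = \begin{cases}1/\sqrt{2}, & k = 0\\ 1, & k > 0\end{cases}$$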
  • Figure 12 is a diagram showing Magnitude-versus-Frequency graphs representing the frequency characteristics, at the 1/2 luma sample position, of a 6-tap Gaussian-based IF, the 8-tap DCT-IF, and the proposed 12-tap DCT-IF.
  • the 6-tap Gaussian-based IF restores low-frequency components well
  • the 8-tap and 12-tap DCT-IFs restore both low-frequency and high-frequency components well
  • the proposed 12-tap DCT-IF restores high-frequency components even better.
  • Figure 13 shows an example of an auto-correlation formula.
  • the frequency characteristics of a block can be found by applying the auto-correlation formula to the reference block.
  • N means the height of the reference block
  • M means the width
  • x denotes a sample of the reference block, and x̄ may denote the average of the reference block samples (one plausible form of the formula is reproduced below).
  • a larger absolute value of the correlation value means that the reference block has low frequency characteristics
  • a smaller absolute value means that the reference block has high frequency characteristics
  • the threshold may be different if a different equation representing similarity other than auto-correlation is used.
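Figure 13 itself is not reproduced here; one plausible form of such a normalized, lag-1 horizontal auto-correlation, consistent with the symbol definitions above, is the following (the exact formula is defined by the figure):

$$\rho = \frac{\displaystyle\sum_{i=0}^{N-1}\sum_{j=0}^{M-2}\bigl(x(i,j)-\bar{x}\bigr)\bigl(x(i,j+1)-\bar{x}\bigr)}{\displaystyle\sum_{i=0}^{N-1}\sum_{j=0}^{M-1}\bigl(x(i,j)-\bar{x}\bigr)^{2}}$$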
  • Figure 14 shows an example of 12-tap DCT-IF coefficients.
  • the encoder and decoder can both derive this information, so that no flag needs to be transmitted: if the absolute value of the correlation is greater than or equal to a set threshold, the 8-tap DCT-IF of VVC can be applied, and if it is below the threshold, the proposed 12-tap DCT-IF, which also restores high-frequency components well, can be applied (a selection sketch follows below).
  • the total weight of the 12-tap DCT-IF coefficients is set to 128, and if the weight value changes, the filter coefficients change accordingly.
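A minimal sketch of the implicit selection rule just described; the threshold value is a placeholder, since the disclosure leaves it configurable:

```python
def select_interp_filter(abs_corr: float, threshold: float) -> str:
    """Run identically in encoder and decoder, so no flag is signaled.
    A large absolute correlation indicates a low-frequency reference
    block; a small one indicates high-frequency content."""
    if abs_corr >= threshold:
        return "8-tap DCT-IF (VVC)"
    return "12-tap DCT-IF (proposed)"
```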
  • Figure 15 shows the locations of fractional samples and integer samples.
  • the index column of Fig. 14 can represent the integer sample positions (A0 to A11) to which filtering is applied to generate a fractional sample (a0, b0, c0 of Fig. 15).
  • a 1/4-pixel filter can be a filter coefficient for generating a sample at position a0
  • a 1/2-pixel filter can be a filter coefficient for generating a sample at position b0
  • a 3/4-pixel filter can be a filter coefficient for generating a sample at position c0.
  • Figure 16 shows the generation of the sample at position b0 as an example of generating a fractional sample (a sketch follows below).
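A minimal sketch of generating the half-pel sample b0 from the twelve integer samples A0 to A11 of one row; the coefficients below are placeholders that merely sum to the stated weight of 128, not the actual values of Fig. 14:

```python
# Placeholder 12-tap half-pel coefficients (sum = 128); the real values
# are those listed in Fig. 14, which is not reproduced here.
HALF_PEL_COEFFS = [-1, 2, -5, 11, -24, 81, 81, -24, 11, -5, 2, -1]
assert sum(HALF_PEL_COEFFS) == 128

def interpolate_b0(row: list[int]) -> int:
    """row holds the integer samples A0..A11 centered on position b0."""
    acc = sum(c * s for c, s in zip(HALF_PEL_COEFFS, row))
    return (acc + 64) >> 7  # round, then divide by the total weight 128
```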
  • alternatively, rather than deriving the frequency characteristics in both the encoder and the decoder, the proposed method can be applied by having the encoder send a flag to the decoder indicating whether to use a filter that restores high-frequency characteristics well.
  • the proposed method was implemented on VTM 16.0, the VVC reference software, and can be realized by selecting the interpolation filter through the correlation calculation in both encoder and decoder without flag transmission; it can also be realized with the flag transmission method.
  • Figure 17 shows the experimental results of the performance of the proposed method.
  • the experiment was conducted in the RA (Random Access) configuration and used the class A1, A2, B, C, and D sequences.
  • QPs of 22, 27, 32, and 37 were used for each sequence; 32 frames were encoded for classes A1 and A2, and 64 frames for classes B, C, and D.
  • the results of each experiment are expressed in BD-rate, and Fig. 17 shows that the proposed method improves the encoding efficiency.
  • the proposed method can be applied to the normal AMVP mode of inter prediction.
  • the block to which the proposed method is applied can have the size of WxH.
  • W and H can be greater than or equal to 4.
  • the proposed method shows a performance improvement in VTM 16.0, the VVC reference software.
  • Figure 18 shows an example of the Mean SAD formula.
  • the Mean Sum of Absolute Difference (SAD) of the reference block can be utilized instead of the correlation.
  • the equation in Fig. 18 represents the Mean SAD equation, where N represents the height, M represents the width, and x(i,j) can represent the sample value at the (i, j) position of the reference block.
  • the interpolation filter to be applied can also be selected by comparing Mean SAD' with the threshold, where x(0, i) in the Mean SAD first-row term represents the samples of the first row and x(i, 0) in the Mean SAD first-column term represents the samples of the first column.
  • a larger Mean SAD value means that the reference block has high-frequency characteristics, so if the Mean SAD value computed for the reference block is larger than a predefined threshold (i.e., the block includes high-frequency components), the proposed 12-tap DCT-IF or an IF with an even longer tap length can be used. Otherwise, the existing interpolation filter of the VVC standard can be used. The same applies to Mean SAD'.
  • the threshold for blocks of other sizes can be determined in proportion to the number of samples, taking the 4x4 block as the baseline.
  • the 12-tap DCT-IF can be applied when the reference block's Mean SAD > threshold_n (a selection sketch follows below).
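A minimal sketch of the Mean SAD-based selection; the exact Mean SAD formula is the one of Fig. 18, which is not reproduced here, so the mean-absolute-deviation reading below and the baseline threshold are assumptions:

```python
def mean_sad(block: list[list[int]]) -> float:
    """One plausible reading of Fig. 18: the mean absolute deviation of
    the reference-block samples from the block average."""
    n, m = len(block), len(block[0])
    avg = sum(sum(row) for row in block) / (n * m)
    return sum(abs(v - avg) for row in block for v in row) / (n * m)

def pick_filter(block: list[list[int]], threshold_4x4: float) -> str:
    """threshold_n scales with the sample count relative to the 4x4
    baseline, as described above; the baseline value is a placeholder."""
    n, m = len(block), len(block[0])
    threshold_n = threshold_4x4 * (n * m) / 16.0
    return "12-tap DCT-IF" if mean_sad(block) > threshold_n else "VVC IF"
```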
  • Video resolution may be additionally considered when applying the filter of the present disclosure.
  • various embodiments of the present disclosure may be implemented by hardware, firmware, software, or a combination thereof.
  • the embodiments may be implemented by one or more ASICs (Application Specific Integrated Circuits), DSPs (Digital Signal Processors), DSPDs (Digital Signal Processing Devices), PLDs (Programmable Logic Devices), FPGAs (Field Programmable Gate Arrays), general processors, controllers, microcontrollers, microprocessors, and the like.
  • the scope of the present disclosure includes software or machine-executable instructions (e.g., an operating system, an application, firmware, a program, etc.) that cause operations according to the methods of various embodiments to be executed on a device or a computer, and a non-transitory computer-readable medium having such software or instructions stored thereon and being executable on the device or the computer.
  • the present disclosure has industrial applicability in the technical fields of video encoding/decoding methods, devices, and recording media.


Abstract

The present disclosure relates to an image encoding/decoding method and device, and a recording medium, the method comprising the steps of: determining a filter for interpolating a reference picture of a current block in inter prediction of a current video; interpolating the reference picture with the determined filter so as to obtain an interpolated reference picture; and performing motion prediction on the basis of the interpolated reference picture, wherein the filter for interpolating the reference picture can be determined by comparing the size of the current block, the absolute correlation value of the reference block of the current block, or the mean sum of absolute differences (SAD) value of the reference block with a threshold value.
PCT/KR2024/012566 2023-08-22 2024-08-22 Image encoding/decoding method and device Pending WO2025042227A1 (fr)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
KR20230110122 2023-08-22
KR10-2023-0110122 2023-08-22
KR10-2023-0110123 2023-08-22
KR20230110123 2023-08-22
KR20240006836 2024-01-16
KR10-2024-0006836 2024-01-16

Publications (1)

Publication Number Publication Date
WO2025042227A1 (fr) 2025-02-27

Family

ID=94732520

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2024/012566 Pending WO2025042227A1 (fr) Image encoding/decoding method and device

Country Status (2)

Country Link
KR (1) KR20250029008A (fr)
WO (1) WO2025042227A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20180019566A (ko) * 2015-06-18 2018-02-26 퀄컴 인코포레이티드 인트라 예측 및 인트라 모드 코딩
KR101956284B1 (ko) * 2011-06-30 2019-03-08 엘지전자 주식회사 보간 방법 및 이를 이용한 예측 방법
KR20200123836A (ko) * 2018-06-11 2020-10-30 삼성전자주식회사 부호화 방법 및 그 장치, 복호화 방법 및 그 장치
KR20220071945A (ko) * 2020-11-24 2022-05-31 현대자동차주식회사 기하학적 변환에 기반하는 블록 복사를 이용하는 인트라 예측방법과 장치
KR20230025036A (ko) * 2019-02-21 2023-02-21 엘지전자 주식회사 인트라 예측을 위한 비디오 신호의 처리 방법 및 장치


Also Published As

Publication number Publication date
KR20250029008A (ko) 2025-03-04


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24856838

Country of ref document: EP

Kind code of ref document: A1