WO2025152924A1 - Encoding method, decoding method, and related device - Google Patents
Encoding method, decoding method, and related device
- Publication number
- WO2025152924A1 (PCT/CN2025/072261; CN2025072261W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- node
- prediction mode
- current node
- context information
- decoded
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/147—Data rate or code amount at the encoder output according to rate distortion criteria
-
- H04N13/161—Stereoscopic and multi-view video systems: encoding, multiplexing or demultiplexing different image signal components
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/159—Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
- H04N19/18—Adaptive coding characterised by the coding unit, the unit being a set of transform coefficients
- H04N19/597—Predictive coding specially adapted for multi-view video sequence encoding
- H04N19/61—Transform coding in combination with predictive coding
- H04N19/96—Tree coding, e.g. quad-tree coding
Definitions
- Embodiments of the present application provide an encoding method, a decoding method, and a related device that can select a more appropriate prediction mode for the current node, which helps improve the predictive-coding effect and the coding quality.
- an encoding method, performed by an encoding end, the method comprising:
- the encoder determines a transformation tree structure of the point cloud to be encoded based on the geometric reconstruction information, wherein the transformation tree structure includes at least one first node layer;
- the node to be coded is a child node of the current node;
- the attribute transformation coefficients are coded to obtain a code stream.
- a decoding method, performed by a decoding end, the method comprising:
- the determination module is further used to determine, in the reference frame of the current node in the at least one first node layer, a node having the same geometric position as the current node as a reference frame node;
- the determination module is further configured to select a prediction mode adopted by the current node from an inter-frame prediction mode and a non-prediction mode through rate-distortion optimization if the current node completely matches the reference frame node and the intra-frame prediction mode is not enabled;
- a prediction and transformation module used for predicting and transforming a node to be coded according to the prediction mode to obtain an attribute transformation coefficient; the node to be coded is a child node of the current node;
- the encoding module is used to encode the attribute transformation coefficients to obtain a code stream.
- a decoding device comprising:
- a parsing module is used to parse the bitstream and obtain the transformation coefficient reconstruction value of the point cloud to be decoded
- the decoding module is used to determine the prediction mode adopted by the current node through entropy decoding if the current node completely matches the reference frame node and the intra-frame prediction mode is not turned on; wherein the prediction mode is an inter-frame prediction mode or a non-prediction mode;
- When the current node fully matches the corresponding reference frame node and the intra-frame prediction mode is not turned on, the encoding end selects the prediction mode adopted by the current node from the inter-frame prediction mode and the non-prediction mode through rate-distortion optimization, and then predicts, transforms, and encodes the attribute information of the current node's child nodes (i.e., the nodes to be encoded) according to the selected prediction mode to obtain a code stream.
- In this way, the embodiment of the present application can select a more appropriate prediction mode for the current node, which helps improve the predictive-coding effect and the coding quality.
- FIG1 is a schematic diagram of a coding and decoding system provided in an embodiment of the present application.
- FIG2a is a flow chart of encoding performed by an encoder based on an AVS-PCC encoding framework
- FIG2b is a flow chart of encoding performed by an encoder based on the encoding framework of MPEG G-PCC;
- FIG3b is a decoding flow chart of a decoder based on the decoding framework of MPEG G-PCC;
- FIG4 is a schematic flow chart of an encoding method provided in an embodiment of the present application.
- FIG5 is a schematic flow chart of another encoding method provided in an embodiment of the present application.
- FIG8 is a schematic flow chart of another decoding method provided in an embodiment of the present application.
- FIG12 is a schematic block diagram of an electronic device further provided in an embodiment of the present application.
- FIG. 13 is a schematic diagram of the hardware structure of a terminal implementing an embodiment of the present application.
- each point in the point cloud has the same amount of attribute information.
- each point in the point cloud can have two kinds of attribute information: color information and laser reflection intensity.
- each point in the point cloud can have three kinds of attribute information: color information, material information, and laser reflection intensity information.
- Point cloud coding refers to the process of encoding the geometric coordinate information and attribute information of each point in the point cloud to obtain a compressed code stream.
- Point cloud coding can include two main processes: geometric coordinate information encoding and attribute information encoding.
- the point cloud coding framework that can compress point clouds can be the geometry-based point cloud compression (G-PCC) codec framework or the video-based point cloud compression (V-PCC) codec framework provided by the Moving Picture Experts Group (MPEG), or the AVS-PCC codec framework provided by the Audio Video Standard (AVS).
- Point cloud decoding refers to the process of decoding the compressed bitstream obtained by point cloud encoding to reconstruct the point cloud. Specifically, it refers to reconstructing the geometric coordinate information and attribute information of each point in the point cloud based on the geometry bitstream and attribute bitstream in the compressed bitstream. After the decoding end obtains the compressed bitstream, the geometry bitstream is first entropy decoded to obtain the quantized information of each point in the point cloud, and then inverse quantization is performed to reconstruct the geometric coordinate information of each point.
- Fig. 1 is a schematic diagram of a codec system 10 provided in an embodiment of the present application.
- the technical solution of the embodiment of the present application involves performing codec (including encoding or decoding) on point cloud data.
- the source device 100 and the destination device 110 may include any one or more of a desktop computer, a notebook (i.e., laptop) computer, a tablet computer, a set-top box, a mobile phone, a wearable device (e.g., a smart watch or a wearable camera), a television, a camera, a display device, a vehicle-mounted device, a virtual reality (VR) device, an augmented reality (AR) device, a mixed reality (MR) device, a digital media player, a video game console, a video conferencing device, a video streaming device, a broadcast receiver device, a broadcast transmitter device, a spacecraft, an aircraft, a robot, a satellite, and the like.
- the source device 100 includes a data source 101, a memory 102, an encoder 200, and an output interface 104.
- the destination device 110 includes an input interface 111, a decoder 300, a memory 113, and a display device 114.
- the source device 100 represents an example of an encoding device
- the destination device 110 represents an example of a decoding device.
- the source device 100 and the destination device 110 may omit some of the components shown in FIG. 1, or may include components other than those shown in FIG. 1.
- the source device 100 may acquire point cloud data through an external capture device.
- the destination device 110 may be connected to an external display device interface without including an integrated display device.
- the memory 102 and the memory 113 may be external memories.
- the source device 100 and the destination device 110 can perform unidirectional data transmission or bidirectional data transmission. If it is bidirectional data transmission, the source device 100 and the destination device 110 can operate in a substantially symmetrical manner, that is, each of the source device 100 and the destination device 110 includes an encoder and a decoder.
- the data source 101 represents the source of the point cloud data (i.e., the original, unencoded point cloud data) and provides the point cloud data to the encoder 200, and the encoder 200 encodes the point cloud data.
- the source device 100 may include a capture device (e.g., a camera device, a sensor device, or a scanning device), an archive of previously captured point cloud data, or a feed interface for receiving point cloud data from a data content provider.
- the camera device may include an ordinary camera, a stereo camera, and a light field camera, etc.
- the sensor device may include a laser device, a radar device, etc.
- the scanning device may include a three-dimensional laser scanning device, etc.
- the point cloud data can be obtained by collecting the visual scene of the real world through the capture device.
- the data source 101 may generate computer graphics-based data as source data, or combine real-time data, archived data, and computer-generated data.
- the data source generates point cloud data based on a virtual object (e.g., a virtual three-dimensional object and a virtual three-dimensional scene obtained by three-dimensional modeling).
- the encoder 200 encodes the captured, pre-captured, or computer-generated data.
- the encoder 200 may rearrange the point cloud data from the order in which it was received (sometimes referred to as the "display order") into an encoding order.
- the encoder 200 may generate a bitstream including the encoded point cloud data.
- the source device 100 may then output the encoded point cloud data to the communication medium 120 via the output interface 104 for receipt or retrieval by, for example, the input interface 111 of the destination device 110.
- the destination device 110 can access the encoded point cloud data from the server, for example via a wireless channel (e.g., a Wi-Fi connection) or a wired connection (e.g., a digital subscriber line (DSL), a cable modem, etc.) for accessing the encoded point cloud data stored on the server.
- Octree-based geometric coding evenly divides a preset bounding box in three-dimensional space using a tree data structure in which each node has eight child nodes. Using "1" and "0" to indicate whether each child node of the octree is occupied yields the occupancy code information (Occupancy Code), which serves as the code stream of the point cloud geometry information.
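The occupancy-code construction described above can be sketched in a few lines; the function below is an illustrative reconstruction (the names, argument layout, and the point-to-child bit mapping are assumptions, not taken from the application):

```python
def occupancy_code(points, origin, size):
    """Return the 8-bit occupancy code of one octree node.

    points: iterable of (x, y, z) integer coordinates inside the node's cube
    origin: (x, y, z) minimum corner of the node's cube
    size:   edge length of the cube (a power of two)
    """
    half = size // 2
    code = 0
    for x, y, z in points:
        # Child index 0..7 from the three half-space bits (the x, y, z
        # bit order is an assumption; real codecs fix it normatively).
        idx = (((x - origin[0]) >= half) << 2) | \
              (((y - origin[1]) >= half) << 1) | \
              int((z - origin[2]) >= half)
        code |= 1 << idx
    return code

# Two occupied octants (children 0 and 7) give code 0b10000001 = 129.
print(occupancy_code([(0, 0, 0), (1, 1, 1)], (0, 0, 0), 2))  # -> 129
```

An empty node yields code 0; the eight bits of the result are exactly the "1"/"0" per-child flags described above.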
- Geometric entropy encoding: statistical compression encoding is performed on the occupancy code information of the octree, the prediction residual information of the prediction tree, and the vertex information of the triangle representation, and finally a binary (0 or 1) compressed code stream is output.
- Statistical coding is a lossless coding method that can effectively reduce the bit rate required to express the same signal.
- the commonly used statistical coding method is context-based adaptive binary arithmetic coding (CABAC).
- Lifting transform coding refers to introducing a weight update strategy for neighborhood points based on the prediction of adjacent layers of LoD, and ultimately obtaining the predicted attribute information of each point and the corresponding attribute residual information.
- Hierarchical region-adaptive transform coding means that the attribute information is transformed into a transform domain through the RAHT transform; the transformed results are called transform coefficients.
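One level of the RAHT transform mentioned above can be sketched as a weighted Haar merge of two occupied sibling nodes; the function below is a minimal illustration under that assumption, not the application's exact transform:

```python
import math

# Illustrative single merge step of a region-adaptive hierarchical
# transform (RAHT): two occupied sibling nodes with attribute values
# a1, a2 and point-count weights w1, w2 yield a low-pass (DC)
# coefficient, carried up to the parent node, and a high-pass (AC)
# coefficient, which is quantized and entropy-coded.
def raht_merge(a1, w1, a2, w2):
    s = math.sqrt(w1 + w2)
    t1, t2 = math.sqrt(w1) / s, math.sqrt(w2) / s
    dc = t1 * a1 + t2 * a2
    ac = -t2 * a1 + t1 * a2
    return dc, ac, w1 + w2  # the merged weight propagates upward

# Equal weights reduce to an orthonormal Haar pair: sum/sqrt(2), diff/sqrt(2).
dc, ac, w = raht_merge(2.0, 1, 4.0, 1)
```

Because each basis pair is orthonormal, the inverse transform at the decoder simply applies the transposed rotation to (dc, ac).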
- Attribute quantization: the degree of quantization is usually determined by the quantization parameter.
- the transform coefficients or attribute residual information obtained by attribute information processing are quantized, and the quantized results are entropy coded.
- the quantized attribute residual information is entropy coded; in RAHT, the quantized transform coefficients are entropy coded.
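The quantization step described above can be sketched as scalar uniform quantization; the mapping from the quantization parameter to the step size `qstep` is codec-specific and omitted here, so the numbers below are purely illustrative:

```python
# Minimal scalar-quantization sketch: transform coefficients (or attribute
# residuals) are divided by a step size and rounded; the decoder multiplies
# back. The rounding error is the (lossy) quantization distortion.
def quantize(coeff, qstep):
    # Round-to-nearest uniform quantization of one coefficient.
    return round(coeff / qstep)

def dequantize(level, qstep):
    # Reconstruction at the decoder from the entropy-decoded level.
    return level * qstep

level = quantize(13.7, 4.0)
print(level, dequantize(level, 4.0))  # -> 3 12.0
```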
- the encoder 200 encodes the geometric coordinate information of each point in the point cloud to obtain a geometric bitstream, and encodes the attribute information of each point in the point cloud to obtain an attribute bitstream.
- the encoder 200 can transmit the encoded geometric bitstream and attribute bitstream together to the decoder 300.
- FIG3a shows a decoding flowchart performed by a decoder based on the decoding framework of AVS-PCC
- FIG3b shows a decoding flowchart performed by a decoder based on the decoding framework of MPEG G-PCC
- the above decoder may be the decoder 300 shown in FIG1.
- After receiving the compressed code stream (i.e., the attribute bitstream and the geometry bitstream) transmitted by the encoder 200, the decoder 300 decodes the geometry bitstream to reconstruct the geometric coordinate information of each point in the point cloud, and decodes the attribute bitstream to reconstruct the attribute information of each point in the point cloud.
- Entropy decoding: entropy decoding is performed on the geometry bitstream and the attribute bitstream respectively to obtain geometry syntax elements and attribute syntax elements.
- Geometric decoding: for the AVS-PCC coding framework, geometric decoding includes two modes, namely octree-based geometric decoding and prediction-tree-based geometric decoding. For the G-PCC coding framework, geometric decoding includes three modes, namely octree-based geometric decoding, trisoup-based geometric decoding, and prediction-tree-based geometric decoding.
- Octree-based geometry decoding: the octree is reconstructed based on the geometry syntax elements parsed from the geometry bitstream.
- Prediction-tree-based geometry decoding: the prediction tree is reconstructed based on the geometry syntax elements parsed from the geometry bitstream.
- Inverse coordinate transformation: the reconstructed geometric coordinate information is inversely transformed to convert the reconstructed coordinates (positions) of the points in the point cloud from the transform domain back to the initial domain.
- Dequantization: the attribute syntax elements are dequantized.
- attribute information processing determines the color information of the points in the point cloud either by prediction plus the inverse-quantized prediction residual or prediction-residual transform coefficient, or by inverse-transforming the inverse-quantized transform coefficients.
- alternatively, attribute information processing determines the color information of the points in the point cloud by applying the inverse RAHT to the inverse-quantized attribute information, or by using LoD with inverse lifting on the inverse-quantized attribute information.
- Inverse color conversion: convert the color information from the YCbCr color space back to the RGB color space. In some examples, the inverse color conversion may not be performed.
- an embodiment of the present application provides a coding and decoding method and related equipment, which can select the prediction mode adopted by the node to be encoded from the inter-frame prediction mode and the non-prediction mode through rate-distortion optimization when the intra-frame prediction mode is not turned on, thereby facilitating improving the prediction coding effect and improving the coding quality.
- the encoding method provided by the embodiments of the present application can be performed by the encoding end, such as the encoder 200 shown in Figure 1, Figure 2a, or Figure 2b.
- the decoding method provided by the embodiments of the present application can be performed by the decoding end, such as the decoder 300 described in Figure 1, Figure 3a, or Figure 3b.
- the encoding end and the decoding end can be implemented by software, hardware, or a combination thereof.
- the encoding end can be referred to as an encoding end device or an encoding device
- the decoding end can be referred to as a decoding end device or a decoding device.
- the prediction mode used by the current node is selected from the intra prediction mode, inter prediction mode, and non-prediction mode through RDO.
- a suitable prediction mode may be selected from the three prediction modes mentioned above according to Table 1 below.
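The rate-distortion optimization (RDO) used to pick a prediction mode can be sketched as minimizing the Lagrangian cost D + λ·R over the candidate modes; the function and the cost values below are made up for illustration and are not from the application:

```python
# Hypothetical RDO sketch: each candidate mode has a distortion D and a
# rate R (in bits); the mode minimizing D + lam * R is selected.
def select_mode(candidates, lam):
    """candidates: dict mapping mode name -> (distortion, rate_bits)."""
    return min(candidates, key=lambda m: candidates[m][0] + lam * candidates[m][1])

modes = {
    "intra": (10.0, 40),  # lower distortion, more bits
    "inter": (6.0, 20),
    "none":  (20.0, 8),   # no prediction: cheap but distorted
}
print(select_mode(modes, lam=0.5))  # -> inter (cost 6.0 + 10.0 = 16.0)
```

Raising λ favors cheaper modes (here, "none"); lowering it favors lower-distortion modes, which is how the encoder trades bitrate against quality.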
- the attribute transformation coefficient of the node to be encoded may be obtained according to the following steps 441 to 443 .
- the reconstructed attribute value of the child node at the corresponding position of the reference frame node may be determined as the predicted attribute value of the node to be encoded at the corresponding position of the current node.
- the attribute transformation coefficient of the node to be encoded may also be obtained according to the following step 444 .
- if the prediction mode of the current node is the non-prediction mode, it is only necessary to transform the original attribute values of the current node's child nodes to be encoded to obtain the corresponding attribute transformation coefficients.
- the attribute transformation coefficients may be referred to as AC transformation coefficients.
- the corresponding attribute transformation coefficient is obtained by transforming the original attribute value of the node to be encoded.
- the upsampling prediction and RAHT process is as follows:
- the same bottom-up transformation-tree construction operation as for the current frame is performed on the reference frame, generating reference frame node attribute values corresponding to the current frame nodes, which can be used as inter-frame prediction values.
- This inter-frame prediction mode is called the revision inter-frame prediction mode, which can be expressed as the Inter (revision) mode. Since the RAHT transform is encoded from top to bottom, when the child nodes of the current node are encoded, the current node serves as the parent node; its attribute value has already been encoded and reconstructed and can therefore be used.
- the AC residual transform coefficients obtained in step 440 may be encoded to obtain a bitstream.
- the prediction mode is the non-prediction mode
- the AC transform coefficients obtained according to the original attribute values of the nodes to be encoded in step 440 may be encoded to obtain a bitstream.
- When RDO is used to select the best prediction mode, the selected prediction mode needs to be entropy-encoded, and the encoding result is written into the bitstream.
- 2 bits are used to encode the above three prediction modes, and the 2 bits are isNullFlag and isIntraFlag.
- each bit uses 108 contexts to perform arithmetic encoding on these prediction modes.
- the 108 contexts can be jointly determined based on the following four context information a) to d):
- the current node uses inter-frame prediction mode
- Intra-frame prediction mode is turned on, and the current node does not completely match the reference frame node
- Intra-frame prediction mode is not enabled.
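One plausible mapping of the three prediction modes onto the two flag bits isNullFlag and isIntraFlag mentioned above can be sketched as follows; the exact flag semantics are an assumption, not stated verbatim here:

```python
# Illustrative mapping of the three prediction modes onto two flag bits.
def mode_to_flags(mode):
    is_null = 1 if mode == "none" else 0   # isNullFlag: non-prediction mode?
    # isIntraFlag distinguishes intra from inter and only needs to be
    # coded when isNullFlag == 0.
    is_intra = 1 if mode == "intra" else 0
    return is_null, is_intra

assert mode_to_flags("none") == (1, 0)
assert mode_to_flags("intra") == (0, 1)
assert mode_to_flags("inter") == (0, 0)
```

Each flag bit is then arithmetic-coded under a context chosen from the context information described above.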
- the second context information corresponds to one of the following three states:
- the context may be determined based on the following two types of context information:
- Intra prediction mode is on and inter prediction matches
- the third context information may also be determined by using the number of occupied child nodes in the current node; wherein the third context information corresponds to two contexts.
- by mapping the context information of the number of occupied child nodes in the current node to one of two states, the embodiment of the present application can help reduce the number of context types.
- the two states corresponding to the third context information may include the following two:
- the number of child nodes occupied by the current node is in the first interval
- Intra-frame prediction mode is turned on, and inter-frame prediction does not match
- Intra-frame prediction mode is not enabled.
- the index of the context corresponding to the prediction mode of the current node may be determined according to the prediction mode of the current node and its matching state with the first context information and the second context information.
- the index of the context corresponding to the prediction mode may be determined based on the first context information, the second context information, and the third context information.
- the value range of the index of the context corresponding to the prediction mode of the current node can be [0, 8]; for the above 18 contexts, the value range of the index can be [0, 17].
- by treating the neighbor-node context information of the current node as a single kind, or by mapping the number of occupied child nodes in the current node to one of two states, the embodiment of the present application can help reduce the types of contexts, thereby reducing the computational complexity of the encoding process, reducing memory overhead, and improving encoding efficiency.
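If the three kinds of context information are combined as independent factors (3 × 3 × 2 = 18 states), the context index could be computed as below; the factor ordering is an assumption made for illustration:

```python
# Hypothetical context-index computation: first context info (3 states),
# second context info (3 states), and third context info (2 states)
# combine into 3 * 3 * 2 = 18 contexts, indexed 0..17.
def context_index(first_state, second_state, third_state):
    assert 0 <= first_state < 3
    assert 0 <= second_state < 3
    assert third_state in (0, 1)
    return (first_state * 3 + second_state) * 2 + third_state

print(context_index(0, 0, 0), context_index(2, 2, 1))  # -> 0 17
```

Dropping the third factor (keeping only 3 × 3 states) would give the smaller [0, 8] index range mentioned above.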
- Fig. 7 shows a schematic flow chart of a decoding method 500 provided in an embodiment of the present application. As shown in Fig. 7, the method 500 includes steps 510 to 550.
- the data to be decoded can be parsed from the bitstream, and the data to be decoded can be entropy decoded and dequantized to obtain the reconstruction value of the transform coefficient.
- the data to be decoded can be geometrically decoded to obtain the geometric reconstruction information of the point cloud to be decoded.
- the process of constructing the transformation tree structure of the point cloud to be decoded is similar to the process of constructing the transformation tree structure of the point cloud to be encoded, and reference may be made to the description of step 410 in FIG. 4 , which will not be repeated here.
- the process of determining the reference frame node of the current node can refer to the relevant description in step 420 in FIG. 4 , which will not be repeated here.
- the prediction mode of the current node is determined by entropy decoding, and the prediction mode includes an intra prediction mode, an inter prediction mode, or a non-prediction mode.
- the prediction mode of the node may be inferred.
- inter-frame prediction is not performed on nodes in the lower node layer.
- it can be determined whether intra-frame prediction is enabled, and if enabled, an intra-frame prediction mode is selected, otherwise a non-prediction mode is selected.
- the prediction mode is an inter-frame prediction mode
- the predicted attribute value of the child node of the current node (i.e., the node to be decoded)
- the predicted attribute value is transformed to obtain the AC transform coefficient of the predicted attribute value
- the decoded transform coefficient reconstruction value (i.e., the AC residual coefficient reconstruction value)
- the DC coefficient of the node to be decoded can be inherited from the parent node, and the RAHT inverse transform is performed based on the AC coefficient reconstruction value and the DC coefficient to obtain the reconstruction value of the node to be decoded.
- the transformation coefficient reconstruction value of the node to be decoded is inversely transformed to obtain the reconstructed attribute value.
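The decoder-side reconstruction for the inter-frame case described above, adding the decoded AC residuals back onto the transformed prediction before the inverse transform, can be sketched as follows (the function name and values are illustrative):

```python
# Illustrative decoder-side step: predicted attribute values are
# transformed to AC coefficients, the decoded AC residuals are added
# back, and the result (together with the DC coefficient inherited from
# the parent node) feeds the inverse RAHT.
def reconstruct_ac(predicted_ac, residual_ac):
    # AC coefficient reconstruction = prediction + decoded residual.
    return [p + r for p, r in zip(predicted_ac, residual_ac)]

print(reconstruct_ac([1.5, -0.5], [0.25, 0.25]))  # -> [1.75, -0.25]
```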
- when the current node fully matches the corresponding reference frame node and the intra-frame prediction mode is not turned on, the decoding end determines by entropy decoding whether the prediction mode of the current node is an inter-frame prediction mode or a non-prediction mode, and then inverse-transforms the transform-coefficient reconstruction values of the current node's child nodes (i.e., the nodes to be decoded) according to that prediction mode to obtain the reconstructed attribute values.
- the embodiment of the present application can select a more appropriate prediction mode for the current node, which is conducive to improving the prediction decoding effect and improving the decoding quality.
- entropy decoding may be performed according to the following steps 560 to 590 to determine the prediction mode adopted by the current node.
- the first context information corresponds to one of the following three states:
- the intra-frame prediction mode is turned on, and the current node does not completely match the reference frame node;
- Intra prediction mode is not enabled.
- the non-prediction mode is the mode most commonly used among the current node, the neighboring parent nodes, and the decoded sibling child nodes.
- the number of occupied child nodes is in the first interval
- Parse the bitstream, perform entropy decoding according to the context index, and obtain the prediction mode.
- the embodiment of the present application uses a better RDO algorithm when selecting the prediction mode of the current node, and effectively reduces the number of contexts in the encoding and decoding prediction mode, thereby achieving the purpose of improving the encoding efficiency.
- the performance gains of the three color attribute channels Luma, Chroma Cb, and Chroma Cr are 0.2, 0.5, and 0.5, respectively.
- the encoding method provided in the embodiment of the present application may be executed by an encoding device.
- the encoding device provided in the embodiment of the present application is described by taking the encoding method executed by the encoding device as an example.
- Fig. 10 shows a schematic block diagram of a coding apparatus 1000 provided in an embodiment of the present application.
- the coding apparatus 1000 includes a determination module 1010 , a prediction and transformation module 1020 , and a coding module 1030 .
- the determination module 1010 is further configured to select a prediction mode adopted by the current node from an inter-frame prediction mode and a non-prediction mode through rate-distortion optimization if the current node completely matches the reference frame node and the intra-frame prediction mode is not enabled;
- a prediction and transformation module 1020 is used to predict and transform the node to be coded according to the prediction mode to obtain an attribute transformation coefficient; the node to be coded is a child node of the current node;
- the encoding module 1030 is used to encode the attribute transformation coefficients to obtain a code stream.
- the prediction and transformation module 1020 is specifically used for:
- if the prediction mode is the inter-frame prediction mode, determine a predicted attribute value of the node to be encoded
- the attribute transformation coefficient is obtained according to the first transformation coefficient and the second transformation coefficient.
- the prediction and transformation module 1020 is specifically used for:
- the prediction and transformation module 1020 is specifically used for:
- the intra-frame prediction mode is not enabled.
- the number of occupied child nodes is in a first interval
- the number of occupied child nodes is in a second interval; wherein the possible value range of the number of occupied child nodes of the current node is divided into the first interval and the second interval.
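The interval-based context derivation above can be sketched as a simple threshold on the occupancy count. This is an illustrative sketch only: the split point `split=4` and the maximum of 8 children (an octree assumption) are not specified by the source and are chosen for illustration:

```python
def occupancy_context_state(num_occupied, split=4, max_children=8):
    """Map the occupied-child count of the current node to one of two
    context states by splitting the possible range [1, max_children]
    into a first interval [1, split) and a second interval
    [split, max_children]. split and max_children are assumed values."""
    assert 1 <= num_occupied <= max_children
    return 0 if num_occupied < split else 1
```

Collapsing the eight possible occupancy counts into two states is exactly the kind of context reduction the embodiment describes: fewer contexts to maintain means lower memory overhead and faster adaptation.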
- the embodiment of the present application can select a more appropriate prediction mode for the current node, which is conducive to improving the prediction coding effect and improving the coding quality.
- the embodiment of the present application helps reduce the number of context types by determining the context information of the current node according to its neighbor nodes, or by determining which of the two states the context information corresponds to according to the number of occupied child nodes of the current node, thereby helping to reduce the computational complexity of the encoding process, reduce memory overhead, and improve encoding efficiency.
- Fig. 11 shows a schematic block diagram of a decoding device 1100 provided in an embodiment of the present application.
- the decoding device 1100 includes a parsing module 1110 , a determining module 1120 , a decoding module 1130 and an inverse transform module 1140 .
- the parsing module 1110 is used to parse the bit stream to obtain the transformation coefficient reconstruction value of the point cloud to be decoded;
- a determination module 1120 configured to determine a transformation tree structure of the to-be-decoded point cloud based on geometric reconstruction information; the transformation tree structure comprises at least one first node layer;
- the decoding module 1130 is used to determine the prediction mode adopted by the current node through entropy decoding if the current node completely matches the reference frame node and the intra-frame prediction mode is not turned on; wherein the prediction mode is an inter-frame prediction mode or a non-prediction mode;
- the inverse transformation module 1140 is used to perform an inverse transformation on the transformation coefficient reconstruction value of the node to be decoded according to the prediction mode to obtain a reconstructed attribute value; wherein the node to be decoded is a child node of the current node.
- the prediction mode is the inter-frame prediction mode, transforming the predicted attribute value of the node to be decoded to obtain a second transformation coefficient
- the first transform coefficient is inversely transformed to obtain the reconstructed attribute value.
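The two bullets above describe transform-domain inter prediction at the decoder: the predicted attribute values are transformed (second coefficients), added to the decoded residual coefficients to form the first coefficients, and then inverse-transformed. A minimal data-flow sketch, with an identity stand-in for the real RAHT pair and all function names assumed for illustration:

```python
def reconstruct_inter(resid_coeffs, pred_attrs, transform, inverse_transform):
    """Transform-domain inter prediction at the decoder:
    1) transform the predicted attribute values -> second coefficients,
    2) add the decoded residual coefficients -> first coefficients,
    3) inverse-transform to recover the reconstructed attribute values."""
    second = transform(pred_attrs)
    first = [r + p for r, p in zip(resid_coeffs, second)]
    return inverse_transform(first)

# Toy "transform": identity, used only to show the data flow; a real
# decoder would use the RAHT forward/inverse pair here.
identity = lambda v: list(v)
recon = reconstruct_inter([1.0, -0.5], [10.0, 20.0], identity, identity)
# recon == [11.0, 19.5]
```

Predicting in the transform domain rather than the sample domain means only small residual coefficients need to be entropy-coded when the inter prediction is accurate.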
- the first context information corresponds to one of the following three states:
- the second context information corresponds to one of the following three states:
- the non-prediction mode is the mode most commonly used among the current node, neighboring parent nodes, and decoded same-layer child nodes.
- the determination module 1120 is further configured to:
- An index of a context corresponding to the prediction mode is determined according to the first context information, the second context information and the third context information.
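Combining three pieces of context information into one index can be sketched as mixed-radix packing. This is an assumption-laden illustration: the source states that the first and second context informations each take one of three states, while the two-state third context and the packing order are illustrative choices:

```python
def context_index(first_state, second_state, third_state,
                  n_first=3, n_second=3, n_third=2):
    """Pack three context states into a single context index via
    mixed-radix encoding. n_third=2 and the packing order are assumed."""
    assert 0 <= first_state < n_first
    assert 0 <= second_state < n_second
    assert 0 <= third_state < n_third
    return (first_state * n_second + second_state) * n_third + third_state
```

With these assumed radices the total number of contexts is 3 * 3 * 2 = 18, small enough to keep the entropy coder's memory footprint low, which matches the stated goal of reducing the number of contexts for the prediction-mode flag.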
- the number of occupied child nodes is in a first interval
- Wearable devices include: smart watches, smart bracelets, smart headphones, smart glasses, smart jewelry (smart rings, smart necklaces, smart anklets, etc.), smart wristbands, smart clothing, etc.
- the vehicle-mounted device can also be called a vehicle-mounted terminal, a vehicle-mounted controller, a vehicle-mounted module, a vehicle-mounted component, a vehicle-mounted chip or a vehicle-mounted unit, etc. It should be noted that the specific type of the terminal is not limited in the embodiments of the present application.
- the server can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that can provide cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (CDN), or cloud computing services based on big data and artificial intelligence platforms.
- the processor 1310 is configured to determine a transformation tree structure of a point cloud to be encoded based on the geometric reconstruction information, wherein the transformation tree structure includes at least one first node layer;
- the node to be coded is a child node of the current node;
- the attribute transformation coefficients are coded to obtain a code stream.
- when the current node fully matches the corresponding reference frame node and the intra-frame prediction mode is not enabled, the encoding end selects the prediction mode adopted by the current node from the inter-frame prediction mode and the non-prediction mode through rate-distortion optimization, and then predicts, transforms, and encodes the attribute information of the child nodes of the current node, that is, the nodes to be encoded, according to the selected prediction mode to obtain the code stream.
- the processor 1310 is used to parse the bitstream to obtain a transformation coefficient reconstruction value of the point cloud to be decoded
- the prediction mode is an inter-frame prediction mode or a non-prediction mode
- when the current node fully matches the corresponding reference frame node and the intra-frame prediction mode is not enabled, the decoding end determines by entropy decoding the prediction mode of the current node, which is either an inter-frame prediction mode or a non-prediction mode, and then performs an inverse transformation on the transformation coefficient reconstruction values of the child nodes of the current node, that is, the nodes to be decoded, according to that prediction mode to obtain the reconstructed attribute values.
- the embodiment of the present application can select a more appropriate prediction mode for the current node, which is conducive to improving the prediction decoding effect and improving the decoding quality.
- the processor is the processor in the terminal described in the above embodiment.
- the readable storage medium includes a computer-readable storage medium, such as a ROM, RAM, a magnetic disk or an optical disk.
- the readable storage medium may be a non-transient readable storage medium.
- the embodiments of the present application further provide a computer program/program product, which is stored in a storage medium and executed by at least one processor to implement the various processes of the above-mentioned method embodiments and achieve the same technical effects. To avoid repetition, details are not repeated here.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The present application belongs to the technical field of encoding and decoding. Disclosed are an encoding method, a decoding method, and a related device. The encoding method comprises the following steps: on the basis of geometric reconstruction information, an encoding end determines a transformation tree structure of a point cloud to be encoded, the transformation tree structure comprising at least one first node layer; from a reference frame of a current node in the at least one first node layer, determining, as a reference frame node, a node having the same geometric position as a node to be encoded; if the current node completely matches the reference frame node and an intra-frame prediction mode is not enabled, selecting, by means of rate-distortion optimization and from among an inter-frame prediction mode and a non-prediction mode, a prediction mode used by the current node; on the basis of the prediction mode, performing prediction and transformation on the node to be encoded so as to obtain an attribute transformation coefficient, the node to be encoded being a child node of the current node; and performing encoding processing on the attribute transformation coefficient so as to obtain a bitstream.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410072679.9 | 2024-01-17 | ||
| CN202410072679.9A CN120343259A (zh) | 2024-01-17 | 2024-01-17 | 编解码方法及相关设备 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025152924A1 true WO2025152924A1 (fr) | 2025-07-24 |
Family
ID=96351954
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2025/072261 Pending WO2025152924A1 (fr) | 2024-01-17 | 2025-01-14 | Procédé de codage, procédé de décodage et dispositif associé |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN120343259A (fr) |
| WO (1) | WO2025152924A1 (fr) |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112385236A (zh) * | 2020-06-24 | 2021-02-19 | 北京小米移动软件有限公司 | 点云的编码和解码方法 |
| CN114095735A (zh) * | 2020-08-24 | 2022-02-25 | 北京大学深圳研究生院 | 一种基于块运动估计和运动补偿的点云几何帧间预测方法 |
| US20220286713A1 (en) * | 2019-03-20 | 2022-09-08 | Lg Electronics Inc. | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device and point cloud data reception method |
| CN115471627A (zh) * | 2021-06-11 | 2022-12-13 | 维沃移动通信有限公司 | 点云的几何信息编码处理方法、解码处理方法及相关设备 |
| CN115714864A (zh) * | 2021-08-23 | 2023-02-24 | 鹏城实验室 | 点云属性编码方法、装置、解码方法以及装置 |
| CN116636214A (zh) * | 2020-12-22 | 2023-08-22 | Oppo广东移动通信有限公司 | 点云编解码方法与系统、及点云编码器与点云解码器 |
-
2024
- 2024-01-17 CN CN202410072679.9A patent/CN120343259A/zh active Pending
-
2025
- 2025-01-14 WO PCT/CN2025/072261 patent/WO2025152924A1/fr active Pending
Patent Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20220286713A1 (en) * | 2019-03-20 | 2022-09-08 | Lg Electronics Inc. | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device and point cloud data reception method |
| CN112385236A (zh) * | 2020-06-24 | 2021-02-19 | 北京小米移动软件有限公司 | 点云的编码和解码方法 |
| CN114095735A (zh) * | 2020-08-24 | 2022-02-25 | 北京大学深圳研究生院 | 一种基于块运动估计和运动补偿的点云几何帧间预测方法 |
| CN116636214A (zh) * | 2020-12-22 | 2023-08-22 | Oppo广东移动通信有限公司 | 点云编解码方法与系统、及点云编码器与点云解码器 |
| CN115471627A (zh) * | 2021-06-11 | 2022-12-13 | 维沃移动通信有限公司 | 点云的几何信息编码处理方法、解码处理方法及相关设备 |
| CN115714864A (zh) * | 2021-08-23 | 2023-02-24 | 鹏城实验室 | 点云属性编码方法、装置、解码方法以及装置 |
Also Published As
| Publication number | Publication date |
|---|---|
| CN120343259A (zh) | 2025-07-18 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2023024840A1 (fr) | Procédés de codage et de décodage de nuage de points, codeur, décodeur et support de stockage | |
| CN115474041B (zh) | 点云属性的预测方法、装置及相关设备 | |
| CN114598883B (zh) | 点云属性的预测方法、编码器、解码器及存储介质 | |
| WO2025060705A1 (fr) | Procédé et appareil de traitement de nuage de points, support de stockage et dispositif électronique | |
| WO2022188582A1 (fr) | Procédé et appareil de sélection d'un point voisin dans un nuage de points, et codec | |
| WO2025152924A1 (fr) | Procédé de codage, procédé de décodage et dispositif associé | |
| CN119815053B (zh) | 点云属性编码方法、点云属性解码方法、装置及电子设备 | |
| CN119815052B (zh) | 编码方法、解码方法及相关设备 | |
| WO2025077667A1 (fr) | Procédé et appareil de détermination d'informations d'attribut de nuage de points, et dispositif électronique | |
| WO2025067194A1 (fr) | Procédé de traitement de codage de nuage de points, procédé de traitement de décodage de nuage de points et dispositif associé | |
| WO2025218556A1 (fr) | Procédé, appareil et dispositif d'optimisation de sommet trisoup | |
| WO2025082237A1 (fr) | Procédé et appareil de codage, procédé et appareil de décodage, et dispositif électronique | |
| CN120835147A (zh) | 点云信息的解码、编码方法、装置及相关设备 | |
| WO2025218557A1 (fr) | Procédé et appareil de reconstruction géométrique et dispositif | |
| CN120343269A (zh) | 点云重建方法、装置及相关设备 | |
| CN120835153A (zh) | 解码方法、编码方法、装置、解码端及编码端 | |
| US20250247566A1 (en) | Encoding method, decoding method, code stream, encoder, decoder and storage medium | |
| US20240037799A1 (en) | Point cloud coding/decoding method and apparatus, device and storage medium | |
| CN114697666B (zh) | 屏幕编码方法、屏幕解码方法及相关装置 | |
| WO2025218571A1 (fr) | Procédé de décodage de grille basé sur une tranche, procédé de codage de grille basé sur une tranche et dispositif associé | |
| WO2024217302A1 (fr) | Procédé de traitement de codage de nuage de points, procédé de traitement de décodage de nuage de points et dispositif associé | |
| CN120188479A (zh) | 点云编解码方法、装置、设备及存储介质 | |
| CN118828023A (zh) | 点云编码处理方法、点云解码处理方法及相关设备 | |
| WO2024244900A1 (fr) | Procédé et appareil de traitement de nuage de points, dispositif informatique et support de stockage | |
| WO2024245112A1 (fr) | Procédé de traitement de codage de nuage de points, procédé de traitement de décodage de nuage de points et appareil associé |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 25741485 Country of ref document: EP Kind code of ref document: A1 |