
WO2025039113A1 - Encoding method, decoding method, code stream, encoder, decoder, and storage medium - Google Patents

Encoding method, decoding method, code stream, encoder, decoder, and storage medium

Info

Publication number
WO2025039113A1
WO2025039113A1 PCT/CN2023/113792 CN2023113792W WO2025039113A1 WO 2025039113 A1 WO2025039113 A1 WO 2025039113A1 CN 2023113792 W CN2023113792 W CN 2023113792W WO 2025039113 A1 WO2025039113 A1 WO 2025039113A1
Authority
WO
WIPO (PCT)
Prior art keywords
value
syntax element
element information
current macroblock
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/CN2023/113792
Other languages
English (en)
Chinese (zh)
Inventor
孙泽星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to PCT/CN2023/113792
Publication of WO2025039113A1

Links

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/161 Encoding, multiplexing or demultiplexing different image signal components
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96 Tree coding, e.g. quad-tree coding

Definitions

  • the embodiments of the present application relate to the field of point cloud coding technology, and in particular to a coding and decoding method, a bit stream, an encoder, a decoder, and a storage medium.
  • the geometric information of the point cloud and the attribute information corresponding to each point are encoded separately.
  • the encoding method of geometric information can be divided into octree-based geometric coding and prediction tree-based geometric coding.
  • In the process of encoding geometric information, the geometric macroblock can be encoded based on the largest coding unit (LCU). However, when decoding the geometric macroblock, some syntax elements transmitted in the bitstream are unnecessary, resulting in low coding efficiency.
  • the embodiments of the present application provide a coding and decoding method, a bit stream, an encoder, a decoder and a storage medium, which can improve the coding and decoding efficiency.
  • an embodiment of the present application provides a decoding method, which is applied to a decoder, and the method includes:
  • decoding a bitstream to determine a value of a first syntax element information and a value of at least one second syntax element information; wherein the first syntax element information is used to indicate the number of bits occupied by the point value of the current macroblock, and the i-th second syntax element information is used to indicate the i-th bit corresponding to the point value of the current macroblock, where i is an integer greater than or equal to 0 and less than the value of the first syntax element information;
  • a point value of the current macroblock is determined according to a value of the first syntax element information and a value of at least one second syntax element information.
  • an embodiment of the present application provides an encoding method, which is applied to an encoder, and the method includes:
  • the value of the first syntax element information and the value of at least one second syntax element information are encoded, and the obtained encoding bits are written into a bitstream.
  • an embodiment of the present application provides a code stream, which is generated by bit encoding according to information to be encoded; wherein the information to be encoded includes at least one of the following:
  • an encoder comprising a first determining unit and an encoding unit, wherein:
  • a first determination unit is configured to determine a point value of the current macroblock; and determine a value of a first syntax element information and a value of at least one second syntax element information according to the point value of the current macroblock; wherein the first syntax element information is used to indicate the number of bits occupied by the point value of the current macroblock, and the i-th second syntax element information is used to indicate the i-th bit corresponding to the point value of the current macroblock, where i is an integer greater than or equal to 0 and less than the value of the first syntax element information;
  • the encoding unit is configured to encode a value of the first syntax element information and a value of at least one second syntax element information, and write the obtained encoding bits into a bitstream.
  • the first processor is configured to execute the method according to the first aspect when running the computer program.
  • an embodiment of the present application provides a decoder, the decoder comprising a decoding unit and a second determining unit, wherein:
  • a decoding unit configured to decode a bit stream, and determine a value of a first syntax element information and a value of at least one second syntax element information; wherein the first syntax element information is used to indicate the number of bits occupied by the point value of the current macroblock, and the i-th second syntax element information is used to indicate the i-th bit corresponding to the point value of the current macroblock, where i is an integer greater than or equal to 0 and less than the value of the first syntax element information;
  • the second determination unit is configured to determine a point value of the current macroblock according to a value of the first syntax element information and a value of at least one second syntax element information.
  • an embodiment of the present application provides a decoder, the decoder comprising a second memory and a second processor, wherein:
  • a second memory for storing a computer program that can be run on a second processor
  • the second processor is used to execute the method described in the second aspect when running the computer program.
  • an embodiment of the present application provides a computer-readable storage medium, which stores a computer program.
  • When the computer program is executed by a first processor, it implements the method described in the first aspect; when the computer program is executed by a second processor, it implements the method described in the second aspect.
  • the embodiment of the present application provides a coding and decoding method, a bit stream, an encoder, a decoder, and a storage medium.
  • the point value of the current macroblock is determined; then, according to the point value of the current macroblock, the value of the first syntax element information and the value of at least one second syntax element information are determined; then, the value of the first syntax element information and the value of at least one second syntax element information are encoded, and the obtained coded bits are written into the bit stream.
  • the bit stream is decoded to determine the value of the first syntax element information and the value of at least one second syntax element information; then, according to the value of the first syntax element information and the value of at least one second syntax element information, the point value of the current macroblock is determined.
  • the first syntax element information is used to indicate the number of bits occupied by the point value of the current macroblock
  • the i-th second syntax element information is used to indicate the i-th bit corresponding to the point value of the current macroblock
  • i is an integer greater than or equal to 0 and less than the value of the first syntax element information.
  • the decoding end no longer needs to decode the syntax elements used to represent the maximum number of points in the geometric prediction tree, but instead determines the point value of the current macroblock through the first syntax element information and the second syntax element information, thereby saving bit rate, improving encoding and decoding efficiency, and further improving geometric encoding and decoding performance.
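  • The relationship between the two kinds of syntax element described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function names are hypothetical, the point value is assumed to be a non-negative integer, and the i-th bit is assumed to be counted from the least significant bit, which the text does not state.

```python
from typing import List, Tuple


def encode_point_value(point_value: int) -> Tuple[int, List[int]]:
    """Split a non-negative point value into (number of bits, list of bits).

    The first return value plays the role of the first syntax element;
    bits[i] plays the role of the i-th second syntax element.
    Assumption: bit 0 is the least significant bit.
    """
    if point_value < 0:
        raise ValueError("point value must be non-negative")
    num_bits = point_value.bit_length()
    bits = [(point_value >> i) & 1 for i in range(num_bits)]
    return num_bits, bits


def decode_point_value(num_bits: int, bits: List[int]) -> int:
    """Rebuild the point value from the bit count and the individual bits."""
    if len(bits) != num_bits:
        raise ValueError("expected exactly num_bits second syntax elements")
    value = 0
    for i, bit in enumerate(bits):
        value |= bit << i
    return value


num_bits, bits = encode_point_value(5)
restored = decode_point_value(num_bits, bits)
```

  • In this sketch only as many bit-level elements are written as the value actually needs, so small point values cost few elements.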
  • FIG1A is a schematic diagram of a three-dimensional point cloud image
  • FIG1B is a partial enlarged view of a three-dimensional point cloud image
  • FIG2A is a schematic diagram of six viewing angles of a point cloud image
  • FIG2B is a schematic diagram of a data storage format corresponding to a point cloud image
  • FIG3 is a schematic diagram of the positions of reference nodes selected by each sub-node
  • FIG4 is a schematic diagram of the positions of four groups of reference neighbor nodes of a current node
  • FIG5 is a schematic diagram showing the positions of each sub-block corresponding to six adjacent parent blocks
  • FIG6 is a schematic diagram showing the positions of 18 adjacent blocks around a current block and their Morton sequence numbers
  • FIG7 is a schematic diagram of a simplified prediction tree structure
  • FIG8A is a schematic diagram of a framework of an AVS encoder
  • FIG8B is a schematic diagram of a framework of an AVS decoder
  • FIG9 is a schematic diagram of a network architecture for point cloud encoding and decoding
  • FIG10 is a first flow chart of a decoding method provided in an embodiment of the present application.
  • FIG11 is a second flow chart of a decoding method provided in an embodiment of the present application.
  • FIG12 is a third flow chart of a decoding method provided in an embodiment of the present application.
  • FIG13 is a schematic diagram of a flow chart of an encoding method provided in an embodiment of the present application.
  • FIG14 is a schematic diagram of the structure of an encoder provided in an embodiment of the present application.
  • FIG15 is a schematic diagram of a specific hardware structure of an encoder provided in an embodiment of the present application.
  • FIG16 is a schematic diagram of the structure of a decoder provided in an embodiment of the present application.
  • FIG17 is a schematic diagram of a specific hardware structure of a decoder provided in an embodiment of the present application.
  • FIG. 18 is a schematic diagram of the composition structure of a coding and decoding system provided in an embodiment of the present application.
  • The terms "first/second/third" involved in the embodiments of the present application are only used to distinguish similar objects and do not represent a specific ordering of the objects. It can be understood that "first/second/third" can be interchanged in a specific order or sequence where permitted, so that the embodiments of the present application described here can be implemented in an order other than that illustrated or described here.
  • Point Cloud is a three-dimensional representation of the surface of an object.
  • Point cloud (data) on the surface of an object can be collected through acquisition equipment such as photoelectric radar, lidar, laser scanner, and multi-view camera.
  • a point cloud is a set of irregularly distributed discrete points in space that express the spatial structure and surface properties of a three-dimensional object or scene.
  • FIG1A shows a three-dimensional point cloud image
  • FIG1B shows a partial magnified view of the three-dimensional point cloud image. It can be seen that the point cloud surface is composed of densely distributed points.
  • Two-dimensional images have information expressed at each pixel point, and the distribution is regular, so there is no need to record its position information additionally; however, the distribution of points in point clouds in three-dimensional space is random and irregular, so it is necessary to record the position of each point in space in order to fully express a point cloud.
  • each position in the acquisition process has corresponding attribute information, usually RGB color values, and the color value reflects the color of the object; for point clouds, in addition to color information, the attribute information corresponding to each point is also commonly the reflectance value, which reflects the surface material of the object. Therefore, the points in the point cloud can include the geometric information of the point and the attribute information of the point.
  • the geometric information of the point can be the three-dimensional coordinate information (x, y, z) of the point, so the geometric information of the point can also be called the position information of the point.
  • the attribute information of the point can include color information (three-dimensional color information) and/or reflectance (one-dimensional reflectance information r), etc.
  • the color information can be information on any color space.
  • the color information can be RGB information. Among them, R represents red (Red, R), G represents green (Green, G), and B represents blue (Blue, B).
  • the color information may be luminance and chrominance (YCbCr, YUV) information, where Y represents brightness (Luma), Cb (U) represents blue color difference, and Cr (V) represents red color difference.
  • the points in the point cloud may include the three-dimensional coordinate information of the points and the reflectivity value of the points.
  • the points in the point cloud may include the three-dimensional coordinate information of the points and the three-dimensional color information of the points.
  • a point cloud obtained by combining the principles of laser measurement and photogrammetry may include the three-dimensional coordinate information of the points, the reflectivity value of the points and the three-dimensional color information of the points.
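  • The three kinds of point described above (geometry plus reflectance, geometry plus color, or geometry plus both) can be modeled with one record type whose attributes are optional. The class and field names below are illustrative assumptions, not names taken from any codec.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Point:
    """One point of a point cloud: mandatory geometry, optional attributes."""
    position: Tuple[float, float, float]           # (x, y, z)
    color: Optional[Tuple[int, int, int]] = None   # e.g. (R, G, B)
    reflectance: Optional[int] = None              # one-dimensional value r


# Geometry and reflectance only.
laser_point = Point(position=(1.0, 2.0, 3.0), reflectance=87)
# Geometry and color only.
photo_point = Point(position=(1.0, 2.0, 3.0), color=(255, 128, 0))
# Geometry, reflectance and color together.
fused_point = Point(position=(1.0, 2.0, 3.0), color=(255, 128, 0), reflectance=87)
```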
  • Figures 2A and 2B show a point cloud image and its corresponding data storage format.
  • Figure 2A provides six viewing angles of the point cloud image
  • Figure 2B consists of a file header information part and a data part.
  • the header information includes the data format, data representation type, the total number of point cloud points, and the content represented by the point cloud.
  • the point cloud is in the ".ply" format, represented by ASCII code, with a total number of 207242 points, and each point has three-dimensional coordinate information (x, y, z) and three-dimensional color information (r, g, b).
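  • A header of this kind can be read with a few lines of code. The sketch below assumes the common ASCII PLY header layout; the example header text and the property names red/green/blue are illustrative assumptions and are not copied from Figure 2B.

```python
def parse_ply_header(text: str) -> dict:
    """Extract format, vertex count and vertex properties from a PLY header."""
    lines = [line.strip() for line in text.splitlines()]
    if not lines or lines[0] != "ply":
        raise ValueError("not a PLY file")
    header = {"format": None, "vertex_count": None, "properties": []}
    current_element = None
    for line in lines[1:]:
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "format":
            header["format"] = parts[1]
        elif parts[0] == "element":
            current_element = parts[1]
            if current_element == "vertex":
                header["vertex_count"] = int(parts[2])
        elif parts[0] == "property" and current_element == "vertex":
            # (type, name), e.g. ("float", "x")
            header["properties"].append((parts[1], parts[2]))
        elif parts[0] == "end_header":
            break
    return header


# Hypothetical header matching the description: ASCII, 207242 points,
# three coordinates and three color components per point.
EXAMPLE_HEADER = """ply
format ascii 1.0
element vertex 207242
property float x
property float y
property float z
property uchar red
property uchar green
property uchar blue
end_header
"""

header = parse_ply_header(EXAMPLE_HEADER)
```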
  • Point clouds can be divided into the following categories according to the way they are obtained:
  • Static point cloud: the object is stationary, and the device that obtains the point cloud is also stationary;
  • Dynamic point cloud: the object is moving, but the device that obtains the point cloud is stationary;
  • Dynamically acquired point cloud: the device used to acquire the point cloud is in motion.
  • point clouds can be divided into two categories according to their usage:
  • Category 1: Machine perception point cloud, which can be used in autonomous navigation systems, real-time inspection systems, geographic information systems, visual sorting robots, disaster relief robots, etc.
  • Category 2: Point cloud perceived by the human eye, which can be used in point cloud application scenarios such as digital cultural heritage, free viewpoint broadcasting, 3D immersive communication, and 3D immersive interaction.
  • Point clouds can flexibly and conveniently express the spatial structure and surface properties of three-dimensional objects or scenes. Point clouds are obtained by directly sampling real objects, so they can provide a strong sense of reality while ensuring accuracy. Therefore, they are widely used, including virtual reality games, computer-aided design, geographic information systems, automatic navigation systems, digital cultural heritage, free viewpoint broadcasting, three-dimensional immersive remote presentation, and three-dimensional reconstruction of biological tissues and organs.
  • Point clouds can be collected mainly through the following methods: computer generation, 3D laser scanning, 3D photogrammetry, etc.
  • Computers can generate point clouds of virtual three-dimensional objects and scenes; 3D laser scanning can obtain point clouds of static real-world three-dimensional objects or scenes, at a rate of millions of points per second; 3D photogrammetry can obtain point clouds of dynamic real-world three-dimensional objects or scenes, at a rate of tens of millions of points per second.
  • the number of points in each point cloud frame is 700,000, and each point has coordinate information xyz (float) and color information RGB (uchar).
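  • The raw size of one such frame follows directly from these figures. The calculation below assumes a 4-byte float and a 1-byte uchar, which is the usual layout but is not stated in the text.

```python
import struct

POINTS_PER_FRAME = 700_000
BYTES_PER_FLOAT = struct.calcsize("<f")   # 4 bytes (assumed 32-bit float)
BYTES_PER_UCHAR = struct.calcsize("<B")   # 1 byte

# Three float coordinates (x, y, z) plus three uchar color components (R, G, B).
bytes_per_point = 3 * BYTES_PER_FLOAT + 3 * BYTES_PER_UCHAR
frame_bytes = POINTS_PER_FRAME * bytes_per_point
frame_megabytes = frame_bytes / 1_000_000

print(f"{bytes_per_point} bytes per point, {frame_megabytes} MB per frame")
```

  • Under these assumptions each point takes 15 bytes and an uncompressed frame takes 10.5 MB, which is why compression is needed before storage or transmission.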
  • the point cloud coding framework that can compress point clouds can be the geometry-based point cloud compression (G-PCC) codec framework or the video-based point cloud compression (V-PCC) codec framework provided by the Moving Picture Experts Group (MPEG), or the AVS-PCC codec framework provided by the Audio Video Standard (AVS).
  • G-PCC codec framework can be used to compress the first type of static point cloud and the third type of dynamically acquired point cloud, which can be based on the point cloud compression test platform (Test Model Compression 13, TMC13), and the V-PCC codec framework can be used to compress the second type of dynamic point cloud, which can be based on the point cloud compression test platform (Test Model Compression 2, TMC2). Therefore, the G-PCC codec framework is also called point cloud codec TMC13, and the V-PCC codec framework is also called point cloud codec TMC2.
  • the geometric information of the point cloud and the attribute information corresponding to each point are encoded separately.
  • the geometric information is transformed so that all the point clouds are contained in a bounding box.
  • the preprocessing process includes quantization and removal of duplicate points. Quantization mainly plays a role in scaling. Due to quantization rounding, the geometric information of some points is the same. Whether to remove duplicate points is determined based on the parameters.
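  • The preprocessing step can be sketched as scaling, rounding, and an optional duplicate filter. The function name, the `scale` parameter, and the round-half-up rule are assumptions made for illustration; the text only says that quantization scales the coordinates and that a parameter decides whether duplicates are removed.

```python
import math
from typing import Iterable, List, Tuple

Coord = Tuple[int, int, int]


def quantize_points(points: Iterable[Tuple[float, float, float]],
                    scale: float,
                    remove_duplicates: bool = True) -> List[Coord]:
    """Scale and round coordinates, optionally dropping points that collide."""
    quantized = [
        tuple(math.floor(c * scale + 0.5) for c in point)  # round half up
        for point in points
    ]
    if not remove_duplicates:
        return quantized
    seen = set()
    unique = []
    for q in quantized:
        if q not in seen:          # keep only the first point at each position
            seen.add(q)
            unique.append(q)
    return unique


cloud = [(0.1, 0.2, 0.3), (0.2, 0.1, 0.4), (5.0, 5.0, 5.0)]
kept = quantize_points(cloud, scale=1.0)
all_points = quantize_points(cloud, scale=1.0, remove_duplicates=False)
```

  • In the example the first two points round to the same position, so only one of them survives when duplicate removal is enabled.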
  • the bounding box is divided in the order of breadth-first traversal (octree/quadtree/binary tree, etc.), and the placeholder code of each node is encoded.
  • the encoder divides the bounding box into sub-cubes in sequence, and continues to divide the non-empty (containing points in the point cloud) sub-cubes until the leaf node obtained by the division is a 1×1×1 unit cube. Then, in the case of geometric lossless coding, the number of points contained in the leaf node is encoded, and finally the geometric octree encoding is completed to generate a binary code stream.
  • the decoder obtains the placeholder code of each node by continuously parsing in the order of breadth-first traversal, and continuously divides the nodes in sequence until the division is a 1×1×1 unit cube. The number of points contained in each leaf node is parsed, and finally the geometric reconstructed point cloud information is restored.
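  • The breadth-first division described above can be sketched as a pair of functions. This is a simplified illustration that assumes a cubic bounding box of side 2**depth with integer coordinates, pure octree division (no quadtree or binary-tree steps), no entropy coding, and an arbitrary mapping from child position to bit index; none of these choices is taken from the AVS reference software.

```python
from collections import deque
from typing import List, Tuple

Coord = Tuple[int, int, int]


def encode_octree(points: List[Coord], depth: int) -> Tuple[List[int], List[int]]:
    """Return (occupancy codes in breadth-first order, point counts per leaf)."""
    codes, leaf_counts = [], []
    queue = deque([(0, 0, 0, depth, list(points))])
    while queue:
        ox, oy, oz, level, pts = queue.popleft()
        if level == 0:                       # 1x1x1 leaf: record its point count
            leaf_counts.append(len(pts))
            continue
        half = 1 << (level - 1)
        children = [[] for _ in range(8)]
        for x, y, z in pts:
            idx = (int(x >= ox + half) << 2) | (int(y >= oy + half) << 1) | int(z >= oz + half)
            children[idx].append((x, y, z))
        code = 0
        for idx, child_pts in enumerate(children):
            if child_pts:                    # only non-empty sub-cubes are divided further
                code |= 1 << idx
                queue.append((ox + half * ((idx >> 2) & 1),
                              oy + half * ((idx >> 1) & 1),
                              oz + half * (idx & 1),
                              level - 1, child_pts))
        codes.append(code)
    return codes, leaf_counts


def decode_octree(codes: List[int], leaf_counts: List[int], depth: int) -> List[Coord]:
    """Rebuild the points by replaying the same breadth-first traversal."""
    code_iter, count_iter = iter(codes), iter(leaf_counts)
    points = []
    queue = deque([(0, 0, 0, depth)])
    while queue:
        ox, oy, oz, level = queue.popleft()
        if level == 0:
            points.extend([(ox, oy, oz)] * next(count_iter))
            continue
        half = 1 << (level - 1)
        code = next(code_iter)
        for idx in range(8):
            if code & (1 << idx):
                queue.append((ox + half * ((idx >> 2) & 1),
                              oy + half * ((idx >> 1) & 1),
                              oz + half * (idx & 1),
                              level - 1))
    return points


cloud = [(0, 0, 0), (3, 3, 3), (3, 3, 3), (1, 2, 0)]
codes, counts = encode_octree(cloud, depth=2)
decoded = decode_octree(codes, counts, depth=2)
```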
  • In AVS-PCC geometry coding there are two encoding methods: octree-based encoding (Octree geometry encoding, OctGeomEnc) and prediction-tree-based encoding (Predictive geometry coding, PredGeomTree).
  • Context model 1 can be used for cat1-A and cat2 point cloud sequences; context model 2 can be used for cat1-B and cat3 sequences.
  • context model 1 includes the sub-layer neighbor prediction of the current point and the neighbor prediction of the current point layer.
  • the neighbor information that can be obtained when encoding the child node of the current point includes the neighbor child nodes in the three directions of left, front and bottom.
  • the context model of the child node layer is designed as follows: for the child node layer to be encoded, find the occupancy of the three coplanar nodes, three colinear nodes, one co-point node in the left, front and bottom direction of the same layer as the child node to be encoded, and the node in the negative direction of the dimension with the shortest node side length and two node side lengths away from the current child node to be encoded.
  • the occupancy of the three coplanar nodes, the three colinear nodes, and the nodes at the negative direction of the dimension with the shortest node side length and two node side lengths away from the current sub-node to be encoded is considered in detail.
  • There are two possibilities for the common neighbor: occupied or unoccupied. A separate context model is assigned for the case where the common neighbor node is occupied. If the common neighbor is also unoccupied, the occupancy of the neighbors at the current node layer, described next, is considered. That is, the neighbors at the sub-node layer to be encoded correspond to a total of 127 + 2 - 1 = 128 context models.
  • FIG 4 (a) represents the first group of reference neighbor nodes, specifically the upper right rear coplanar neighbor nodes; (b) represents the second group of reference neighbor nodes, specifically the left front lower coplanar neighbor nodes; (c) represents the third group of reference neighbor nodes, specifically the upper right rear collinear neighbor nodes; (d) represents the fourth group of reference neighbor nodes, specifically the left front lower collinear neighbor nodes; where the dotted frame node is the current node, and the solid frame node is the neighbor node.
  • Step 2: Consider the distance between the most recently occupied node and the current node.
  • the distance has three values, among which the importance of the left front and lower coplanar neighbors or the upper right and rear collinear neighbors is the highest, and the value of the distance is set to 1; when the left front and lower coplanar neighbors and the upper right and rear collinear neighbors are not occupied, the importance of the left front and lower collinear neighbors is the second highest, and the value of the distance is set to 2; if none of these four groups of neighbor nodes are occupied, then the value of the distance is set to 3.
  • This method uses a two-layer context reference relationship configuration, as shown in formula (1).
  • the first layer is the occupancy of the encoded adjacent blocks of the parent node of the current sub-block to be encoded (i.e., ctxIdxParent), and the second layer is the occupancy of the encoded adjacent blocks at the same depth as the current sub-block to be encoded (i.e., ctxIdxChild).
  • idx = LUT[ctxIdxParent][ctxIdxChild] (1)
  • The ctxIdxChild of the second layer is given by formula (2), which is built from the occupancy of the three encoded sub-blocks at a distance of 1 from the current sub-block.
  • the adjacent parent blocks that are coplanar and colinear with them are found by table lookup, and the ctxIdxParent is calculated according to the occupancy according to formula (3).
  • the node filled with dots is the current node
  • the child node filled with grids is the sub-block to be encoded.
  • Each sub-graph shows the relative position relationship of the 6 adjacent parent blocks found by the i-th sub-block, including 3 coplanar parent blocks (P i,0 ,P i,1 ,P i,2 ) and 3 colinear parent blocks (P i,3 ,P i,4 ,P i,5 ).
  • The correspondence between each sub-block and its adjacent parent blocks is obtained from Table 1.
  • the numbers in Table 2 correspond to the Morton sequence in Figure 6.
  • This method takes into account the different sub-block positions and the geometric center rotation symmetry. As can be seen from Figure 6, with the current block as the center, this method has a larger receptive field and can use up to 18 adjacent parent blocks that have been encoded around it.
  • the method used in formula (3) is the combination of the occupancy of the three coplanar parent blocks and the sum of the occupancy of the three colinear parent blocks.
  • Table 2 shows the corresponding relationship between a child block i and its adjacent parent block j, wherein the numbers in Table 2 correspond to the Morton sequence numbers in FIG. 6 .
  • the geometric information of the point cloud is first used at the encoding end to perform Morton code sorting, and then the geometric information of the point cloud is predictively encoded using a KD-Tree, similar to a single chain structure that predicts the geometric information of the child node by using the parent node.
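  • Morton code sorting can be sketched as bit interleaving followed by an ordinary sort. The interleaving order (x in the highest of each group of three bits, then y, then z) and the 10-bit coordinate width are assumptions; the text does not specify them.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def morton3d(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y and z into one Morton code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i + 2)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i)
    return code


def morton_sort(points: List[Coord]) -> List[Coord]:
    """Order points by their Morton code, so spatial neighbors end up close together."""
    return sorted(points, key=lambda p: morton3d(*p))


ordered = morton_sort([(2, 0, 0), (0, 0, 1), (1, 1, 1)])
```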
  • Fig. 7 is a schematic diagram of a simplified prediction tree structure. As shown in Fig. 7, the prediction tree adopts a single chain structure, and each tree node has only one child node except for the only leaf node. Except for the root node predicted by the default value, other nodes are provided with geometric prediction values by their parent nodes.
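  • The single-chain prediction can be sketched as delta coding along an ordered list of points. The sketch assumes that the predicted value of each node is simply the position of its parent and that the default prediction for the root is (0, 0, 0); the actual predictor used by the codec is not given in the text.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def chain_residuals(ordered_points: List[Coord],
                    default: Coord = (0, 0, 0)) -> List[Coord]:
    """Residual of each node against its parent; the root uses a default value."""
    residuals = []
    parent = default
    for point in ordered_points:
        residuals.append(tuple(c - p for c, p in zip(point, parent)))
        parent = point                 # the current node is the next node's parent
    return residuals


def reconstruct_chain(residuals: List[Coord],
                      default: Coord = (0, 0, 0)) -> List[Coord]:
    """Invert chain_residuals by accumulating residuals along the chain."""
    points = []
    parent = default
    for residual in residuals:
        point = tuple(p + r for p, r in zip(parent, residual))
        points.append(point)
        parent = point
    return points


chain = [(1, 1, 1), (2, 1, 1), (2, 3, 1)]
res = chain_residuals(chain)
rebuilt = reconstruct_chain(res)
```

  • When the points are sorted so that neighbors in the chain are close in space, the residuals stay small and are cheaper to entropy code than the raw coordinates.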
  • Condition 2: The current block contains only one point cloud data point;
  • Condition 3: The sum of the number of Morton code bits to be encoded for the points in the current block is greater than twice the number of directions that have not reached the minimum side length.
  • a flag is introduced to indicate whether the current node uses the isolated point direct coding mode.
  • the flag uses a context for entropy coding. If the flag is true (True), the isolated point mode is used to directly encode the geometric coordinates of the point, and the octree division is terminated. If the flag is false (False), the occupancy code is encoded and the octree division continues.
  • this flag can be inferred to be False and not encoded. If the parent block of the current block already allows the use of isolated point coding mode, and the current block is the only child node of the parent block, then the current block must not contain isolated points. Therefore, under this condition, the bits for encoding the flag can be omitted.
  • FIG8A is a schematic diagram of the framework of an AVS encoder.
  • the slices are independently encoded.
  • the geometric information of the point cloud and the attribute information in the point cloud are encoded separately.
  • the AVS encoder first encodes the geometric information.
  • the AVS encoder performs coordinate transformation (including coordinate translation and coordinate quantization) on the geometric position so that all the point clouds are contained in a bounding box; then the bounding box is constructed and entropy encoded by an octree to generate a binary code stream (specifically a geometric code stream). After the geometric encoding is completed, the geometric information is reconstructed.
  • attribute encoding is mainly performed on color and reflectivity information. First, it is determined whether to perform color space conversion. If color space conversion is performed, the color information is converted from RGB color space to YUV color space. Then, the reconstructed point cloud is recolored using the original point cloud so that the unencoded attribute information corresponds to the reconstructed geometric information.
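  • The optional color space conversion can be illustrated with one common RGB to YCbCr mapping. The text does not say which conversion matrix the AVS encoder uses; the full-range BT.601 coefficients below are an assumption chosen only to show the shape of the step.

```python
from typing import Tuple


def rgb_to_ycbcr(r: int, g: int, b: int) -> Tuple[int, int, int]:
    """Convert 8-bit RGB to 8-bit YCbCr using full-range BT.601 coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 + 0.564 * (b - y)      # blue color difference
    cr = 128 + 0.713 * (r - y)      # red color difference
    # Round and clamp each component to the 8-bit range.
    return tuple(min(255, max(0, int(round(v)))) for v in (y, cb, cr))


white = rgb_to_ycbcr(255, 255, 255)
black = rgb_to_ycbcr(0, 0, 0)
gray = rgb_to_ycbcr(128, 128, 128)
```

  • For neutral colors both color-difference components sit at the midpoint 128, so only the luma component carries information.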
  • Color information encoding is divided into two modules: attribute prediction and attribute transformation.
  • the attribute prediction process is as follows: first, reorder the point cloud, and then perform differential prediction. There are two reordering methods: Morton reordering and Hilbert reordering.
  • the attribute transformation process is as follows: first, wavelet transform is performed on the point cloud attributes, and the transform coefficients are quantized; secondly, the attribute reconstruction value is obtained by inverse quantization and inverse wavelet transform; then the difference between the original attribute and the attribute reconstruction value is calculated to obtain the attribute residual and quantize it; finally, the quantized transform coefficients and attribute residuals are entropy encoded to generate a binary code stream (specifically, the attribute code stream).
  • FIG8B is a schematic diagram of the framework of an AVS decoder.
  • the geometric code stream and the attribute code stream in the binary code stream are decoded independently.
  • the geometric position of the point cloud is obtained through entropy decoding-octree reconstruction-inverse coordinate quantization and inverse coordinate translation.
  • the attribute information of the point cloud is obtained through entropy decoding-inverse quantization-attribute prediction compensation-inverse space transformation; or the attribute information of the point cloud is obtained through entropy decoding-inverse quantization-attribute inverse transformation-inverse space transformation.
  • the slice to be encoded can be restored based on the geometric position and attribute information; after the slices are merged, the three-dimensional image model of the input point cloud can be restored.
  • FIG9 is a schematic diagram of a network architecture of a point cloud encoding and decoding.
  • the network architecture includes one or more electronic devices 13 to 1N and a communication network 01, wherein the electronic devices 13 to 1N can perform video interaction through the communication network 01.
  • the electronic device can be various types of devices with point cloud encoding and decoding functions.
  • the electronic device can include a mobile phone, a tablet computer, a personal computer, a personal digital assistant, a navigator, a digital phone, a video phone, a television, a sensor device, a server, etc., which is not limited by the embodiment of the present application.
  • the decoder or encoder in the embodiment of the present application can be the above-mentioned electronic device. That is to say, the electronic device in the embodiment of the present application has the point cloud encoding and decoding function, generally including a point cloud encoder (ie, encoder) and a point cloud decoder (ie, decoder).
  • Condition 1 The geometric position is limitedly lossy and the attributes are lossy;
  • the general test sequence includes five categories: Cat1A, Cat1B, Cat1C, Cat2-frame and Cat3. Among them, Cat1A and Cat2-frame point clouds only contain reflectance attribute information, Cat1B and Cat3 point clouds only contain color attribute information, and Cat1C point cloud contains both color and reflectance attribute information.
  • the points in the point cloud are processed in a certain order (the original acquisition order of the point cloud, the Morton order, the Hilbert order, etc.), and the prediction algorithm is first used to obtain the attribute prediction value, and the attribute residual is obtained according to the attribute value and the attribute prediction value. Then, the attribute residual is quantized to generate a quantized residual, and finally the quantized residual is encoded;
  • the points in the point cloud are processed in a certain order (the original acquisition order of the point cloud, Morton order, Hilbert order, etc.).
  • the prediction algorithm is first used to obtain the attribute prediction value, and then the decoding is performed to obtain the quantized residual.
  • the quantized residual is then dequantized, and finally the attribute reconstruction value is obtained based on the attribute prediction value and the dequantized residual.
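The prediction branch described above can be sketched as a small round trip. This is only an illustration under stated assumptions: the predictor here simply reuses the previously reconstructed attribute value, and the names `encode_prediction_branch`, `decode_prediction_branch` and `qstep` are hypothetical, since the text does not specify the prediction algorithm or the quantizer.

```python
def encode_prediction_branch(attrs, qstep):
    """Return quantized residuals for a list of integer attribute values."""
    quantized = []
    prev_recon = 0  # assumed predictor seed for the first point
    for value in attrs:
        pred = prev_recon                    # attribute prediction value (assumed predictor)
        q = round((value - pred) / qstep)    # quantize the attribute residual
        quantized.append(q)
        prev_recon = pred + q * qstep        # mirror the decoder's reconstruction
    return quantized


def decode_prediction_branch(quantized, qstep):
    """Rebuild attribute values from quantized residuals."""
    recon = []
    prev_recon = 0
    for q in quantized:
        pred = prev_recon                    # same prediction as the encoder
        prev_recon = pred + q * qstep        # prediction plus dequantized residual
        recon.append(prev_recon)
    return recon
```

Because the encoder predicts from reconstructed values rather than original ones, both sides stay in step and the per-point error stays within half a quantization step.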
  • Prediction transform branch - limited resources attribute compression uses a method based on intra-frame prediction and discrete cosine transform (DCT).
  • the points in the point cloud are processed in a certain order (the original acquisition order of the point cloud, the Morton order, the Hilbert order, etc.), and the entire point cloud is first divided into several small groups with a maximum length of Y (such as 2), and then these small groups are combined into several large groups (the number of points in each large group does not exceed X, such as 4096), and then the prediction algorithm is used to obtain the attribute prediction value, and the attribute residual is obtained according to the attribute value and the attribute prediction value.
  • the attribute residual is transformed by DCT in small groups to generate transformation coefficients, and then the transformation coefficients are quantized to generate quantized transformation coefficients, and finally the quantized transformation coefficients are encoded in large groups;
  • the points in the point cloud are processed in a certain order (the original acquisition order of the point cloud, Morton order, Hilbert order, etc.).
  • the entire point cloud is divided into several small groups with a maximum length of Y (such as 2), and then these small groups are combined into several large groups (the number of points in each large group does not exceed X, such as 4096).
  • the quantized transform coefficients are decoded in large groups, and then the prediction algorithm is used to obtain the attribute prediction value.
  • the quantized transform coefficients are dequantized and inversely transformed in small groups.
  • the attribute reconstruction value is obtained based on the attribute prediction value and the dequantized and inversely transformed coefficients.
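The two-level grouping used by the limited-resources branch can be sketched as follows. The function name `split_groups` and the greedy packing of small groups into large groups are assumptions for illustration; the text only states the limits Y (small-group length) and X (points per large group).

```python
def split_groups(num_points, small_len=2, large_max=4096):
    """Split point indices 0..num_points-1 into small groups, then pack them into large groups."""
    # Small groups: consecutive runs of at most small_len (Y) points.
    small = [list(range(start, min(start + small_len, num_points)))
             for start in range(0, num_points, small_len)]
    large, current, count = [], [], 0
    for group in small:
        # Start a new large group when adding this small group would exceed large_max (X).
        if current and count + len(group) > large_max:
            large.append(current)
            current, count = [], 0
        current.append(group)
        count += len(group)
    if current:
        large.append(current)
    return large
```

For instance, `split_groups(10, 2, 4)` yields three large groups holding 4, 4 and 2 points.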
  • Prediction transform branch - resources are not limited. Attribute compression adopts a method based on intra-frame prediction and DCT transform. When encoding the quantized transform coefficients, there is no limit on the maximum number of points X, that is, all coefficients are encoded together:
  • the points in the point cloud are processed in a certain order (the original acquisition order of the point cloud, the Morton order, the Hilbert order, etc.).
  • the entire point cloud is divided into several small groups with a maximum length of Y (such as 2).
  • the prediction algorithm is used to obtain the attribute prediction value.
  • the attribute residual is obtained according to the attribute value and the attribute prediction value.
  • the attribute residual is subjected to DCT transformation in groups to generate transformation coefficients.
  • the transformation coefficients are then quantized to generate quantized transformation coefficients.
  • the quantized transformation coefficients of the entire point cloud are encoded.
  • the points in the point cloud are processed in a certain order (the original acquisition order of the point cloud, Morton order, Hilbert order, etc.).
  • the entire point cloud is divided into several small groups with a maximum length of Y (such as 2), and the quantized transformation coefficients of the entire point cloud are obtained by decoding.
  • the prediction algorithm is used to obtain the attribute prediction value, and then the quantized transformation coefficients are dequantized and inversely transformed in groups.
  • the attribute reconstruction value is obtained based on the attribute prediction value and the dequantized and inversely transformed coefficients.
  • the entire point cloud is subjected to multi-layer wavelet transform to generate transform coefficients, which are then quantized to generate quantized transform coefficients, and finally the quantized transform coefficients of the entire point cloud are encoded;
  • decoding obtains the quantized transform coefficients of the entire point cloud, and then dequantizes and inversely transforms the quantized transform coefficients to obtain attribute reconstruction values.
  • when AVS-PCC encodes the geometric information of the point cloud, it encodes geometric macroblocks based on the largest coding unit (LCU): first, the point cloud slice is spatially divided to obtain different geometric macroblocks, and then each geometric macroblock is adaptively encoded.
  • the decoding type of the current macroblock is first obtained: octree decoding or prediction tree decoding.
  • AVS-PCC currently contains a syntax element, geom_max_tree_size_log2_minus8, that has no effect at the decoding end. In other words, when geometric macroblocks are decoded, some syntax elements transmitted in the bitstream are redundant, which lowers coding efficiency.
  • an embodiment of the present application provides an encoding method, which determines the point value of the current macroblock; then determines the value of the first syntax element information and the value of at least one second syntax element information according to the point value of the current macroblock; then encodes the value of the first syntax element information and the value of at least one second syntax element information, and writes the obtained coded bits into the bitstream.
  • An embodiment of the present application also provides a decoding method, which decodes the bitstream, determines the value of the first syntax element information and the value of at least one second syntax element information; then determines the point value of the current macroblock according to the value of the first syntax element information and the value of at least one second syntax element information.
  • the first syntax element information is used to indicate the number of bits occupied by the point value of the current macroblock
  • the i-th second syntax element information is used to indicate the i-th bit corresponding to the point value of the current macroblock, where i is an integer greater than or equal to 0 and less than the value of the first syntax element information.
  • FIG. 10 shows a schematic diagram of a decoding method provided by an embodiment of the present application. As shown in FIG. 10, the method may include:
  • S1001 Decode a bitstream and determine a value of first syntax element information and a value of at least one second syntax element information.
  • the decoding method of the embodiment of the present application is applied to a decoder, specifically to an AVS-PCC decoding framework, or AVS-GPCC decoding framework.
  • the embodiment of the present application specifically provides a point cloud geometry decoding solution, more specifically, a method for decoding parameters related to a geometric macroblock and a prediction tree to improve encoding and decoding efficiency.
  • the point cloud slice may be spatially divided based on the largest coding unit (LCU) to determine at least one geometric macroblock, wherein the current macroblock is any one of the at least one geometric macroblock.
  • decoding the bitstream and determining a value of the first syntax element information and a value of at least one second syntax element information may include:
  • Decode the bitstream and determine the value of the first syntax element information; when i is greater than or equal to 0 and less than the value of the first syntax element information, loop through the following step to obtain the value of at least one second syntax element information:
  • the bitstream is decoded, the value of the i-th second syntax element information is determined, and i is incremented by 1.
  • the value of the first syntax element information is decoded first; then, while i is greater than or equal to 0 and less than the value of the first syntax element information, the step of decoding the bitstream to determine the value of the i-th second syntax element information is executed in a loop until i equals the value of the first syntax element information, thereby obtaining the value of at least one second syntax element information.
  • if the value of the first syntax element information is N, the number of at least one second syntax element information is also N; that is, the number of at least one second syntax element information is the same as the value of the first syntax element information.
  • the first syntax element information is used to indicate the number of bits occupied by the point value of the current macroblock.
  • the first syntax element information can also be referred to as the number of bits occupied by the point value of the current macroblock, which can be represented by num_bits_in_lcu_num_points.
  • the i-th second syntax element information is used to indicate the i-th bit corresponding to the point value of the current macroblock.
  • the i-th second syntax element information may also be referred to as the i-th bit of the point value of the current macroblock, which may be represented by lcu_num_points[i].
  • i is an integer greater than or equal to 0 and less than the value of the first syntax element information.
  • the value of the first syntax element information is an unsigned integer, specifically a 5-bit unsigned integer.
  • the value of the second syntax element information is a binary variable, specifically 0 or 1.
  • the value of the first syntax element information and the value of at least one second syntax element information can be obtained by decoding based on a bypass model.
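The parsing loop of S1001 can be sketched as below. The `BitReader` class and the function `parse_lcu_num_points_syntax` are hypothetical helpers; bypass decoding is modeled here as reading raw bits, and reading the 5-bit count most-significant bit first is an assumption the text does not confirm.

```python
class BitReader:
    """Minimal reader over a sequence of 0/1 values (stand-in for a bypass decoder)."""

    def __init__(self, bits):
        self.bits = list(bits)
        self.pos = 0

    def read_bit(self):
        bit = self.bits[self.pos]
        self.pos += 1
        return bit

    def read_uint(self, width):
        value = 0
        for _ in range(width):  # assumed MSB-first order
            value = (value << 1) | self.read_bit()
        return value


def parse_lcu_num_points_syntax(reader):
    """Read num_bits_in_lcu_num_points, then that many lcu_num_points[i] flags."""
    num_bits = reader.read_uint(5)  # first syntax element: 5-bit unsigned integer
    flags = []
    i = 0
    while 0 <= i < num_bits:        # loop condition as stated in the text
        flags.append(reader.read_bit())  # i-th second syntax element (0 or 1)
        i += 1
    return num_bits, flags
```

The number of flags read always equals the decoded value of the first syntax element, so the reader consumes exactly 5 + N bits per macroblock in this sketch.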
  • S1002 Determine a point value of a current macroblock according to a value of a first syntax element information and a value of at least one second syntax element information.
  • the point value of the current macroblock can be calculated based on the value of the first syntax element information and the value of at least one second syntax element information.
  • when determining the point value of the current macroblock according to the value of the first syntax element information and the value of at least one second syntax element information, referring to FIG. 11, the method may include:
  • S1101 Determine the initial point value of the current macroblock.
  • S1102 Determine N midpoint values of the current macroblock according to the value of at least one second syntax element information.
  • S1103 Determine the point value of the current macroblock according to the initial point value of the current macroblock and the N intermediate point values of the current macroblock.
  • the value of N is equal to the value of the first syntax element information, and the number of at least one second syntax element information is N.
  • each value of the second syntax element information corresponds to a determined midpoint value.
  • determining the N midpoint values of the current macroblock according to the value of at least one second syntax element information may include: when i is greater than or equal to 0 and less than the value of the first syntax element information, loopingly performing the following steps to obtain the N midpoint values of the current macroblock:
  • the i-th midpoint value of the current macroblock is determined according to the value of the i-th second syntax element information and the value of i, and i is incremented by 1.
  • the step of determining the i-th midpoint value of the current macroblock according to the value of the i-th second syntax element information and the value of i is executed in a loop until i is equal to the value of the first syntax element information, thereby obtaining N midpoint values of the current macroblock.
  • determining the i-th midpoint value of the current macroblock based on the value of the i-th second syntax element information and the value of i can include: performing an i-bit left shift operation on the value of the i-th second syntax element information to obtain the i-th midpoint value of the current macroblock.
  • the value of the i-th second syntax element information can be represented by lcu_num_points[i].
  • the i-th midpoint value of the current macroblock can be specifically expressed as: lcu_num_points[i] << i.
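  • when determining the initial point value of the current macroblock, the method may include: setting the initial point value of the current macroblock to 0.
  • when determining the point value of the current macroblock according to the initial point value and the N intermediate point values, the method may include: performing a cumulative operation on the initial point value of the current macroblock and the N intermediate point values of the current macroblock to obtain the point value of the current macroblock, which can be expressed as $$\text{lcu\_num\_points} = \text{lcu\_num\_points0} + \sum_{i=0}^{N-1} \left(\text{lcu\_num\_points}[i] \ll i\right)$$ where: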
  • lcu_num_points represents the point value of the current macroblock
  • lcu_num_points0 represents the initial point value of the current macroblock
  • lcu_num_points[i] represents the value of the i-th second syntax element information.
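The shift-and-accumulate computation of S1101 to S1103 can be sketched as follows. The function name `lcu_num_points_from_bits` is hypothetical; the logic follows the text, where each decoded flag lcu_num_points[i] is shifted left by i and added to an initial value of 0.

```python
def lcu_num_points_from_bits(bit_values, initial=0):
    """Accumulate (bit << i) over all decoded flags, starting from the initial point value."""
    total = initial                  # initial point value, set to 0 in the text
    for i, bit in enumerate(bit_values):
        midpoint = bit << i          # i-th intermediate (midpoint) value
        total += midpoint            # running sum is the (i+1)-th point value
    return total
```

With the flags 0, 1, 0, 1, 1 (least-significant bit first), the intermediate values are 0, 2, 0, 8 and 16, which sum to 26.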
  • the method may include:
  • S1203 Determine whether i is greater than or equal to 0 and less than N.
  • S1204 Determine the value of the i-th point of the current macroblock.
  • S1205 Determine the i-th midpoint value of the current macroblock according to the value of the i-th second syntax element information and the value of i.
  • S1206 Determine the i+1th point value of the current macroblock according to the i-th point value of the current macroblock and the i-th middle point value of the current macroblock.
  • if the condition in step S1203 holds, steps S1204 to S1207 are performed. If the condition in step S1203 does not hold, that is, when i is equal to N, the N-th point value of the current macroblock has been obtained and is used as the point value of the current macroblock.
  • the 0th point value of the current macroblock is the initial point value of the current macroblock, which can be set to 0. And when i is equal to N-1, the Nth point value calculated at this time is the point value (lcu_num_points) of the current macroblock.
  • determining the i-th midpoint value of the current macroblock based on the value of the i-th second syntax element information and the value of i can include: performing an i-bit left shift operation on the value of the i-th second syntax element information to obtain the i-th midpoint value of the current macroblock.
  • determining the (i+1)-th point value of the current macroblock based on the i-th point value of the current macroblock and the i-th midpoint value of the current macroblock can include: adding the i-th point value of the current macroblock and the i-th midpoint value of the current macroblock to determine the (i+1)-th point value of the current macroblock.
  • the decoding end can calculate the point value lcu_num_points of the current macroblock.
  • the method may further include: decoding the bitstream to determine first identification information; if the first identification information indicates that the prediction tree decoding mode is enabled for the current sequence, decoding the bitstream to determine the value of the third syntax element information.
  • the method may also include: if the first identification information indicates that the prediction tree decoding mode is not enabled for the current sequence, the step of decoding the bitstream to determine the value of the third syntax element information is not performed.
  • the current sequence includes the current macroblock
  • the third syntax element information is used to indicate the maximum number of points of the prediction tree corresponding to the current macroblock.
  • the third syntax element information can be represented by geom_max_tree_size_log2_minus8, and the value of the third syntax element information is an unsigned integer.
  • $$\text{geom\_max\_tree\_size} = 2^{(\text{geom\_max\_tree\_size\_log2\_minus8} + 8)} \quad (5)$$
  • the method may further include: if the value of the first identification information is a first value, determining that the first identification information indicates that the current sequence turns on the prediction tree decoding mode; if the value of the first identification information is a second value, determining that the first identification information indicates that the current sequence does not turn on the prediction tree decoding mode.
  • the first value is different from the second value, and the first value and the second value can be in parameter form or in digital form.
  • the first identification information can be a parameter written in the profile or a flag value, which is not specifically limited here.
  • the first value can be set to 1 and the second value can be set to 0; or, the first value can be set to 0 and the second value can be set to 1; or, the first value can be set to true and the second value can be set to false; or, the first value can be set to false and the second value can be set to true.
  • the first value is set to 1 and the second value is set to 0, but this is not specifically limited.
  • the decoding method of the current macroblock is first determined: octree decoding method or prediction tree decoding method. If the current sequence does not enable the prediction tree decoding method, that is, the current macroblock is decoded using the octree decoding method, there is no need to decode geom_max_tree_size_log2_minus8 in the code stream. If the current sequence enables the prediction tree decoding method, then for the current macroblock, geom_max_tree_size_log2_minus8 in the code stream can also be decoded to determine the maximum number of points in the geometric prediction tree of the current macroblock.
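The conditional decoding described above can be sketched as below. The function `parse_max_tree_size` and the callable `read_unsigned` are hypothetical stand-ins for the real parsing routine; the flag convention (1 means the prediction tree mode is enabled) follows the example values given in the text.

```python
def parse_max_tree_size(first_flag, read_unsigned):
    """Decode geom_max_tree_size_log2_minus8 only when the prediction tree mode is enabled."""
    if first_flag == 1:                       # first value: prediction tree mode enabled
        log2_minus8 = read_unsigned()         # third syntax element (unsigned integer)
        return 1 << (log2_minus8 + 8)         # equation (5): 2^(log2_minus8 + 8)
    return None                               # second value: nothing is read from the bitstream
```

Returning `None` in the disabled case is a design choice of this sketch; it makes it explicit that no maximum tree size is available when only octree decoding is used.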
  • the embodiment of the present application modifies the geometry parameter set (GPS) of AVS-GPCC by removing the syntax element geom_max_tree_size_log2_minus8, so the decoder does not need to decode this syntax element. Specifically, the number of bits occupied by the point value of the current macroblock, num_bits_in_lcu_num_points, is decoded first, and then each bit of the point value of the current macroblock, i.e., lcu_num_points[i], is decoded. The decoder then uses these two syntax elements of the current macroblock, num_bits_in_lcu_num_points and lcu_num_points[i], to calculate the point value of the current macroblock.
  • Table 3 shows a syntax table corresponding to the GPS header, and the description of its syntax elements is shown in Table 3.
  • Table 4 shows the syntax table corresponding to the current macroblock LCU in the prediction tree decoding mode, and the description of its syntax elements is shown in Table 4.
  • An embodiment of the present application provides a decoding method, which decodes a bit stream, determines the value of a first syntax element information and the value of at least one second syntax element information; then determines the point value of the current macroblock based on the value of the first syntax element information and the value of at least one second syntax element information.
  • the first syntax element information is used to indicate the number of bits occupied by the point value of the current macroblock
  • the i-th second syntax element information is used to indicate the i-th bit corresponding to the point value of the current macroblock, where i is an integer greater than or equal to 0 and less than the value of the first syntax element information.
  • in this way, the decoding end no longer needs to decode the syntax element used to characterize the maximum number of points in the geometric prediction tree. Instead, the point value of the current macroblock is determined by the first syntax element information and the second syntax element information, thereby saving bit rate, improving encoding and decoding efficiency, and further improving geometric encoding and decoding performance.
  • FIG. 13 shows a schematic flow chart of an encoding method provided in an embodiment of the present application. As shown in FIG. 13, the method may include:
  • the encoding method of the embodiment of the present application is applied to an encoder, specifically to an AVS-PCC encoding framework, or AVS-GPCC encoding framework.
  • the embodiment of the present application specifically provides a point cloud geometry encoding scheme, more specifically, provides an encoding method for geometric macroblocks and prediction tree related parameters to improve encoding and decoding efficiency.
  • the point cloud to be processed is spatially divided to determine at least one geometric macroblock, wherein the at least one geometric macroblock includes the current macroblock, or in other words, the current macroblock is any one of the at least one macroblock.
  • the point cloud to be processed may be a point cloud slice.
  • the slice is spatially divided based on the largest coding unit (LCU) to obtain at least one geometric macroblock.
  • the encoder counts the number of points of each geometric macroblock and can determine the respective point values.
  • S1302 Determine a value of a first syntax element information and a value of at least one second syntax element information according to a point value of a current macroblock.
  • the first syntax element information is used to indicate the number of bits occupied by the point value of the current macroblock.
  • the first syntax element information can also be referred to as the number of bits occupied by the point value of the current macroblock, which can be represented by num_bits_in_lcu_num_points.
  • the i-th second syntax element information is used to indicate the i-th bit corresponding to the point value of the current macroblock.
  • the i-th second syntax element information can also be referred to as the i-th bit of the point value of the current macroblock, which can be represented by lcu_num_points[i].
  • i is an integer greater than or equal to 0 and less than the value of the first syntax element information.
  • the point value of the current macroblock when encoding the point value of the current macroblock, it is first necessary to determine the value of the corresponding syntax element information. In some embodiments, it may specifically include: binarizing the point value of the current macroblock to determine the value of the first syntax element information and the value of at least one second syntax element information.
  • the value of the first syntax element information is an unsigned integer, specifically a 5-bit unsigned integer.
  • the value of the second syntax element information is a binary variable, specifically 0 or 1.
  • the point value of the current macroblock is binarized to generate a binary string; based on the generated binary string, the value of the first syntax element information and the value of at least one second syntax element information can be determined.
  • for example, if the generated binary string is 11010 (a point value of 26), the value of the first syntax element information can be determined to be 5.
  • the value of the 0th second syntax element information is 0, the value of the 1st second syntax element information is 1, the value of the 2nd second syntax element information is 0, the value of the 3rd second syntax element information is 1, and the value of the 4th second syntax element information is 1.
  • the number of at least one second syntax element information is N; that is, the number of at least one second syntax element information is the same as the value of the first syntax element information.
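The binarization step on the encoder side can be sketched as follows. The function name `binarize_point_count` is hypothetical, and the range check reflects the assumption that a 5-bit unsigned count can describe at most 31 bits.

```python
def binarize_point_count(count):
    """Return (num_bits_in_lcu_num_points, [lcu_num_points[0], lcu_num_points[1], ...])."""
    if count < 0:
        raise ValueError("point count must be non-negative")
    num_bits = count.bit_length()            # number of bits occupied by the point value
    if num_bits > 31:                        # must fit the 5-bit unsigned first syntax element
        raise ValueError("point count needs more than 31 bits")
    flags = [(count >> i) & 1 for i in range(num_bits)]  # i-th flag is bit i, LSB first
    return num_bits, flags
```

For the point value 26 (binary 11010) this gives a bit count of 5 and the flags 0, 1, 0, 1, 1, matching the example in the text.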
  • S1303 Encode the value of the first syntax element information and the value of at least one second syntax element information, and write the obtained coded bits into a bitstream.
  • the value of the first syntax element information and the value of at least one second syntax element information may be encoded.
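  • when encoding the value of the first syntax element information and the value of at least one second syntax element information, the method may include: encoding the value of the first syntax element information; and, when i is greater than or equal to 0 and less than the value of the first syntax element information, looping through the following step:
  • the value of the i-th second syntax element information is encoded, and i is incremented by 1.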
  • the value of the first syntax element information is encoded first; then, when i is greater than or equal to 0 and less than the value of the first syntax element information, the step of encoding the value of the i-th second syntax element information is executed repeatedly until i is equal to the value of the first syntax element information, thereby encoding the value of the first syntax element information and the value of at least one second syntax element information into the bitstream.
  • the value of the first syntax element information and the value of at least one second syntax element information may be encoded based on a bypass model.
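The writing order of S1303 can be sketched as below. The function `write_lcu_num_points_syntax` is hypothetical; bypass encoding is modeled as appending raw bits to a list, and writing the 5-bit count most-significant bit first is an assumption.

```python
def write_lcu_num_points_syntax(num_bits, flags, out=None):
    """Append the 5-bit count, then each lcu_num_points[i] flag, to a list of bits."""
    out = [] if out is None else out
    for shift in range(4, -1, -1):           # first syntax element as 5 bits, assumed MSB first
        out.append((num_bits >> shift) & 1)
    i = 0
    while 0 <= i < num_bits:                 # loop condition as stated in the text
        out.append(flags[i])                 # i-th second syntax element
        i += 1
    return out
```

A decoder that reads the same 5-bit count and then that many flags recovers exactly the values written here.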
  • the method may further include: determining first identification information; wherein the first identification information is used to indicate whether the current sequence turns on the prediction tree encoding mode; and encoding the value of the first identification information, and writing the obtained encoding bits into the bit stream.
  • determining the first identification information when determining the first identification information, it may include: if the current sequence turns on the prediction tree encoding mode, then determining the first identification information as the first value; if the current sequence does not turn on the prediction tree encoding mode, then determining the first identification information as the second value.
  • the first value is different from the second value, and the first value and the second value can be in parameter form or in digital form.
  • the first identification information can be a parameter written in the profile or a value of a flag, which is not specifically limited here.
  • the first value can be set to 1 and the second value can be set to 0; or, the first value can be set to 0 and the second value can be set to 1; or, the first value can be set to true and the second value can be set to false; or, the first value can be set to false and the second value can be set to true.
  • the first value is set to 1 and the second value is set to 0, but this is not specifically limited.
  • the current sequence includes the current macroblock.
  • the first identification information can be a GPS parameter. If the current sequence turns on the prediction tree coding mode, then the first identification information can be determined to be the first value. At this time, for the current macroblock, it is necessary to further determine the coding mode, such as the octree coding mode or the prediction tree coding mode. If the current sequence does not turn on the prediction tree coding mode, then the first identification information can be determined to be the second value, that is, the coding mode of the current macroblock is the octree coding mode.
  • the method may further include: determining the maximum number of points of the prediction tree corresponding to the current macroblock; and determining the encoding method of the current macroblock based on a comparison between the point value of the current macroblock and the maximum number of points.
  • the encoding method of the current macroblock may include an octree encoding method and a prediction tree encoding method, and the encoding end performs adaptive encoding in these two methods.
  • the encoding end when determining the encoding method of the current macroblock based on the comparison between the point value of the current macroblock and the maximum point number, it may include:
  • if the point value of the current macroblock is less than the maximum number of points, it is determined that the current macroblock uses the prediction tree encoding method;
  • if the point value of the current macroblock is greater than the maximum number of points, it is determined that the current macroblock uses the octree encoding method.
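The mode decision can be sketched as a simple comparison. The function `choose_geometry_mode` and its string return values are hypothetical. The text does not say what happens when the point value equals the maximum number of points; this sketch assumes the octree method in that case.

```python
def choose_geometry_mode(point_count, max_tree_size, pred_tree_enabled=True):
    """Pick the geometry coding method for one macroblock."""
    if pred_tree_enabled and point_count < max_tree_size:
        return "prediction_tree"   # fewer points than the prediction tree limit
    return "octree"                # limit exceeded, equal (assumed), or mode disabled
```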
  • the method may further include: when the current macroblock uses a prediction tree encoding method, determining the value of the third syntax element information according to the maximum number of points; encoding the value of the third syntax element information, and writing the obtained encoding bits into the bitstream.
  • the method may further include: when the current macroblock does not use the prediction tree encoding method, that is, the current macroblock uses the octree encoding method, then there is no need to determine the value of the third syntax element information, nor is there any need to encode the value of the third syntax element information into the bitstream.
  • the current sequence includes the current macroblock
  • the third syntax element information is used to indicate the maximum number of points of the prediction tree corresponding to the current macroblock.
  • the third syntax element information can be represented by geom_max_tree_size_log2_minus8, and the value of the third syntax element information is an unsigned integer.
  • $$\text{geom\_max\_tree\_size\_log2\_minus8} = \log_2(\text{geom\_max\_tree\_size}) - 8 \quad (6)$$
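Equation (6) can be evaluated with integer arithmetic, as sketched below. The function name `tree_size_to_syntax_value` is hypothetical, and the check that the size is a power of two no smaller than 256 is an assumption made so that the result is a non-negative integer.

```python
def tree_size_to_syntax_value(geom_max_tree_size):
    """Compute geom_max_tree_size_log2_minus8 = log2(geom_max_tree_size) - 8."""
    size = geom_max_tree_size
    if size < 256 or size & (size - 1):      # assumed: power of two, at least 2^8
        raise ValueError("size must be a power of two >= 256")
    return (size.bit_length() - 1) - 8       # exact log2 of a power of two, minus 8
```

For a maximum tree size of 1024 this returns 2, and shifting 1 left by 2 + 8 bits recovers 1024.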
  • the encoding method of the current sequence is first determined: octree encoding method or prediction tree encoding method. If the current sequence turns on the prediction tree encoding method, then for the current macroblock, it is necessary to further determine the encoding method of the current macroblock; if the current macroblock uses prediction tree encoding, then it is necessary to determine the value of the third syntax element geom_max_tree_size_log2_minus8 and write it into the bitstream; if the current sequence does not turn on the prediction tree encoding method, that is, the current macroblock uses octree encoding, then it is not necessary to determine the value of the third syntax element geom_max_tree_size_log2_minus8, nor to write it into the bitstream.
  • the embodiment of the present application modifies the geometry parameter set (GPS) of AVS-GPCC by removing the syntax element geom_max_tree_size_log2_minus8. This syntax element is only used at the encoding end to control the encoding of each macroblock: if the current macroblock adopts the prediction tree encoding method, the number of encoded points in the current macroblock must be kept below a certain threshold (such as the maximum number of points). The decoding end does not need this syntax element for control; instead, it calculates the point value of the current macroblock from the syntax element lcu_num_points of each macroblock.
  • the decoding end calculates the point value of the current macroblock according to the two syntax elements num_bits_in_lcu_num_points and lcu_num_points[i] corresponding to the current macroblock.
  • the embodiment of the present application further provides a code stream, which is generated by bit encoding according to the information to be encoded; wherein the information to be encoded includes at least one of the following:
  • a value of a first syntax element information, a value of at least one second syntax element information, and first identification information.
  • the first identification information is used to indicate whether the prediction tree encoding mode is turned on in the current sequence
  • the first syntax element information is used to indicate the number of bits occupied by the point value of the current macroblock
  • the i-th second syntax element information is used to indicate the i-th bit corresponding to the point value of the current macroblock, where i is an integer greater than or equal to 0 and less than the value of the first syntax element information.
  • the encoder can write num_bits_in_lcu_num_points and lcu_num_points[i] into the bitstream, so that the decoder can calculate the point value lcu_num_points of the current macroblock based on num_bits_in_lcu_num_points and lcu_num_points[i] obtained by decoding.
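The relationship between the point value and the two syntax elements can be sketched as follows. This is only an illustration: it assumes that index i = 0 is the least significant bit, and the function name is hypothetical.

```python
def binarize_point_count(lcu_num_points: int) -> tuple[int, list[int]]:
    """Split a point value into (num_bits_in_lcu_num_points, lcu_num_points[i])."""
    if lcu_num_points < 0:
        raise ValueError("the point value must be non-negative")
    # First syntax element: number of bits occupied by the point value.
    num_bits_in_lcu_num_points = lcu_num_points.bit_length()
    # Second syntax elements: one binary value per bit.
    # Assumption: index i = 0 is the least significant bit.
    bits = [(lcu_num_points >> i) & 1 for i in range(num_bits_in_lcu_num_points)]
    return num_bits_in_lcu_num_points, bits
```

For example, a macroblock with 13 points (binary 1101) would give a first syntax element of 4 and the bits 1, 0, 1, 1 for i = 0 to 3.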
  • the embodiment of the present application provides a coding method, which determines the point value of the current macroblock; determines the value of the first syntax element information and the value of at least one second syntax element information according to the point value of the current macroblock; encodes the value of the first syntax element information and the value of at least one second syntax element information, and writes the obtained coded bits into the bitstream.
  • the first syntax element information is used to indicate the number of bits occupied by the point value of the current macroblock
  • the i-th second syntax element information is used to indicate the i-th bit corresponding to the point value of the current macroblock
  • i is an integer greater than or equal to 0 and less than the value of the first syntax element information.
  • the decoding end no longer needs to decode the syntax elements used to characterize the maximum number of points in the geometric prediction tree, but determines the point value of the current macroblock through the first syntax element information and the second syntax element information, thereby saving bit rate, improving encoding and decoding efficiency, and further improving geometric encoding and decoding performance.
  • when AVS-PCC encodes the geometric information of the point cloud, it encodes based on LCU geometric macroblocks; that is, the point cloud slices are first spatially divided to obtain different geometric macroblocks, and then each geometric macroblock is adaptively encoded.
  • the decoding type of the current geometric macroblock is first obtained: octree decoding or prediction tree decoding. If the current geometric macroblock is decoded using a prediction tree, it is necessary to decode the point value of the current geometric macroblock.
  • geom_max_tree_size_log2_minus8 is an unsigned integer that represents the maximum number of points in the geometric prediction tree. The maximum number of points in the geometric prediction tree is calculated as shown in the aforementioned formula (5).
  • the embodiment of the present application is a single-chain structure that is decoded in sequence, so there is no longer a cache pressure problem at the decoding end, that is, there is no need to determine the maximum number of points.
  • the syntax element geom_max_tree_size_log2_minus8 can also be used to indicate whether the current macroblock can adaptively select between the prediction tree and the octree; if the number of coding points in the current macroblock is greater than the maximum number of points, then only octree encoding can be selected. However, there is no adaptive selection at the decoding end. Therefore, in the embodiment of the present application, the syntax element geom_max_tree_size_log2_minus8 has no effect at the decoding end, and the decoding end can obtain the point value of the prediction tree decoding in the current geometric macroblock according to lcu_num_points.
  • the technical solution makes corresponding corrections to the relevant syntax elements of the prediction tree decoding point cloud in the geometric coding in AVS-GPCC, specifically:
  • the syntax element geom_max_tree_size_log2_minus8 in GPS can be removed, as shown in the aforementioned Table 3; and the point calculation method for decoding using the prediction tree in the geometry macroblock needs to be specified, as follows:
  • the number of bits occupied by the geometry macroblock (LCU) point value is num_bits_in_lcu_num_points, a 5-bit unsigned integer; this syntax element indicates the number of bits used by the geometry macroblock point value.
  • the present technical scheme removes the syntax element geom_max_tree_size_log2_minus8 in the GPS in the current AVS-GPCC geometry coding by correcting it. Since this syntax element is only used to control the encoding of each LCU at the encoding end, if the current LCU adopts prediction tree encoding, the number of encoding points in the current LCU must be controlled to be less than a certain threshold, and the decoding end does not need this syntax element to control.
  • the number of points of the current geometry macroblock is calculated from the syntax element lcu_num_points of each LCU geometry macroblock. Specifically, the number of bits occupied by the geometry macroblock point value, num_bits_in_lcu_num_points, is decoded first, and then each bit of the point value in the geometry macroblock, that is, lcu_num_points[i], is decoded.
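The parsing order described above can be sketched as follows. The sketch assumes the syntax elements are stored as raw fixed-length bits (the source does not state the entropy coding used here) and that the 5-bit field is stored most significant bit first; BitReader and parse_lcu_num_points_syntax are hypothetical helpers.

```python
class BitReader:
    """Reads bits from a sequence of 0/1 integers (illustrative helper)."""

    def __init__(self, bits):
        self._bits = list(bits)
        self._pos = 0

    def read_bit(self) -> int:
        bit = self._bits[self._pos]
        self._pos += 1
        return bit

    def read_uint(self, width: int) -> int:
        # Assumption: fixed-width fields are stored most significant bit first.
        value = 0
        for _ in range(width):
            value = (value << 1) | self.read_bit()
        return value


def parse_lcu_num_points_syntax(reader: BitReader):
    # First, the 5-bit count of bits occupied by the point value.
    num_bits_in_lcu_num_points = reader.read_uint(5)
    # Then one binary syntax element lcu_num_points[i] per bit.
    lcu_num_points = [reader.read_bit() for _ in range(num_bits_in_lcu_num_points)]
    return num_bits_in_lcu_num_points, lcu_num_points
```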
  • the calculation method is further specified in the embodiment of the present application, so that the point value of the geometric macroblock can still be determined when one syntax element is removed; thereby saving bit rate, improving encoding and decoding efficiency, and further improving geometric encoding and decoding performance.
  • Figure 14 shows a schematic diagram of the composition structure of an encoder provided by an embodiment of the present application.
  • the encoder 140 may include a first determining unit 1401 and an encoding unit 1402, wherein:
  • the first determining unit 1401 is configured to determine the point value of the current macroblock; and determine the value of the first syntax element information and the value of at least one second syntax element information according to the point value of the current macroblock; wherein the first syntax element information is used to indicate the number of bits occupied by the point value of the current macroblock, and the i-th second syntax element information is used to indicate the i-th bit corresponding to the point value of the current macroblock, where i is an integer greater than or equal to 0 and less than the value of the first syntax element information;
  • the encoding unit 1402 is configured to encode a value of the first syntax element information and a value of at least one second syntax element information, and write the obtained coded bits into a bitstream.
  • the first determination unit 1401 is further configured to spatially divide the point cloud to be processed and determine at least one geometric macroblock; wherein the at least one geometric macroblock includes the current macroblock.
  • the first determination unit 1401 is further configured to perform binarization processing on the point value of the current macroblock to determine the value of the first syntax element information and the value of at least one second syntax element information.
  • the encoding unit 1402 is further configured to encode the value of the first syntax element information; and when i is greater than or equal to 0 and less than the value of the first syntax element information, loop the following steps to achieve encoding of the value of at least one second syntax element information: encode the value of the i-th second syntax element information, and increment i by 1.
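The encoding loop can be sketched as follows. The sketch writes raw bits into a list instead of using a real entropy coder, and it assumes the first syntax element is written as a 5-bit field, most significant bit first (elsewhere the text describes it as a 5-bit unsigned integer); the function name is hypothetical.

```python
def write_point_count_syntax(num_bits_in_lcu_num_points: int,
                             lcu_num_points_bits: list[int]) -> list[int]:
    """Emit the first syntax element, then each second syntax element in turn."""
    if not 0 <= num_bits_in_lcu_num_points < 32:
        raise ValueError("the first syntax element must fit in 5 bits")
    if len(lcu_num_points_bits) != num_bits_in_lcu_num_points:
        raise ValueError("one second syntax element is required per bit")
    out = []
    # Encode the first syntax element (assumed 5 bits, MSB first).
    for shift in range(4, -1, -1):
        out.append((num_bits_in_lcu_num_points >> shift) & 1)
    # Loop while 0 <= i < value of the first syntax element.
    i = 0
    while i < num_bits_in_lcu_num_points:
        out.append(lcu_num_points_bits[i])  # encode the i-th second syntax element
        i += 1                              # then increment i
    return out
```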
  • the value of the first syntax element information is an unsigned integer
  • the value of the second syntax element information is a binary variable
  • the first determining unit 1401 is further configured to determine first identification information; wherein the first identification information is used to indicate whether the current sequence starts the prediction tree encoding mode;
  • the encoding unit 1402 is further configured to encode the value of the first identification information and write the obtained encoding bits into the bit stream.
  • the first determination unit 1401 is further configured to determine that the first identification information is a first value if the current sequence starts the prediction tree coding mode; if the current sequence does not start the prediction tree coding mode, determine that the first identification information is a second value.
  • the first determination unit 1401 is further configured to determine the maximum number of points of the prediction tree corresponding to the current macroblock; and determine the encoding method of the current macroblock based on the comparison between the point value of the current macroblock and the maximum number of points.
  • the first determination unit 1401 is further configured to determine that the current macroblock uses a prediction tree encoding method if the point value of the current macroblock is less than the maximum point number; if the point value of the current macroblock is greater than the maximum point number, determine that the current macroblock uses an octree encoding method.
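The comparison above can be sketched as follows. The text does not say what happens when the point value equals the maximum number of points; treating that case as octree encoding is an assumption of this sketch, and the function name and return strings are hypothetical.

```python
def select_lcu_coding_mode(lcu_num_points: int, geom_max_tree_size: int) -> str:
    """Choose the encoding method of the current macroblock."""
    if lcu_num_points < geom_max_tree_size:
        return "prediction_tree"
    # Greater than the maximum: octree encoding.
    # Assumption: the equal case, which the text leaves open, is also octree.
    return "octree"
```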
  • the first determining unit 1401 is further configured to determine the value of the third syntax element information according to the maximum number of points when the current macroblock uses a prediction tree coding method;
  • the encoding unit 1402 is further configured to encode the value of the third syntax element information and write the obtained coded bits into the bitstream.
  • a "unit” may be a part of a circuit, a part of a processor, a part of a program or software, etc., and of course, it may be a module, or it may be non-modular.
  • the components in the present embodiment may be integrated into a processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit may be implemented in the form of hardware or in the form of a software functional module.
  • if the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of this embodiment, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to execute all or part of the steps of the method described in this embodiment.
  • the aforementioned storage medium includes: a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and other media that can store program codes.
  • an embodiment of the present application provides a computer-readable storage medium, which is applied to the encoder 140.
  • the computer-readable storage medium stores a computer program, and when the computer program is executed by the first processor, the method described in any one of the aforementioned embodiments is implemented.
  • the encoder 140 may include: a first communication interface 1501, a first memory 1502 and a first processor 1503; each component is coupled together through a first bus system 1504. It can be understood that the first bus system 1504 is used to realize the connection and communication between these components.
  • the first bus system 1504 also includes a power bus, a control bus and a status signal bus. However, for the sake of clarity, various buses are marked as the first bus system 1504 in Figure 15. Among them,
  • the first communication interface 1501 is used for receiving and sending signals during the process of sending and receiving information with other external network elements;
  • a first memory 1502 used for storing a computer program that can be run on the first processor 1503;
  • the first processor 1503 is configured to, when running the computer program, execute:
  • the first syntax element information is used to indicate the number of bits occupied by the point value of the current macroblock
  • the i-th second syntax element information is used to indicate the i-th bit corresponding to the point value of the current macroblock, where i is an integer greater than or equal to 0 and less than the value of the first syntax element information
  • the value of the first syntax element information and the value of at least one second syntax element information are encoded, and the obtained encoding bits are written into a bitstream.
  • the first memory 1502 in the embodiment of the present application can be a volatile memory or a non-volatile memory, or can include both volatile and non-volatile memories.
  • the non-volatile memory can be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory.
  • the volatile memory can be a random access memory (RAM), which is used as an external cache.
  • many forms of RAM can be used, for example: static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate synchronous DRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct Rambus RAM (DRRAM).
  • the first processor 1503 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method can be completed by the hardware integrated logic circuit or software instructions in the first processor 1503.
  • the above-mentioned first processor 1503 can be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components.
  • the methods, steps and logic block diagrams disclosed in the embodiments of the present application can be implemented or executed.
  • the general-purpose processor can be a microprocessor or the processor can also be any conventional processor, etc.
  • the steps of the method disclosed in the embodiments of the present application can be executed directly by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory or an electrically erasable programmable memory, a register, etc.
  • the storage medium is located in the first memory 1502, and the first processor 1503 reads the information in the first memory 1502 and completes the steps of the above method in combination with its hardware.
  • the embodiments described in this application may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof.
  • the processing unit may be implemented in one or more Application Specific Integrated Circuits (ASIC), Digital Signal Processing (DSP), Digital Signal Processing Device (DSPD), Programmable Logic Device (PLD), Field-Programmable Gate Array (FPGA), general-purpose processor, controller, microcontroller, microprocessor, other electronic units for performing the functions described in this application, or a combination thereof.
  • the techniques described in this application may be implemented by modules (such as procedures, functions, etc.) that perform the functions described in this application.
  • the software code may be stored in a memory and executed by a processor.
  • the memory may be implemented in the processor or external to the processor.
  • the first processor 1503 is further configured to execute any one of the methods described in the foregoing embodiments when running the computer program.
  • the present embodiment provides an encoder.
  • by making corresponding corrections to relevant syntax elements in the geometric coding process, the decoding end no longer needs to decode the syntax elements used to characterize the maximum number of points in the geometric prediction tree, but instead determines the point value of the current macroblock through the first syntax element information and the second syntax element information, thereby saving bit rate, improving encoding and decoding efficiency, and further improving geometric encoding and decoding performance.
  • the decoder 160 may include a decoding unit 1601 and a second determining unit 1602, wherein:
  • the decoding unit 1601 is configured to decode the bitstream and determine the value of the first syntax element information and the value of at least one second syntax element information; wherein the first syntax element information is used to indicate the number of bits occupied by the point value of the current macroblock, and the i-th second syntax element information is used to indicate the i-th bit corresponding to the point value of the current macroblock, where i is an integer greater than or equal to 0 and less than the value of the first syntax element information;
  • the second determining unit 1602 is configured to determine a point value of the current macroblock according to a value of the first syntax element information and a value of at least one second syntax element information.
  • the decoding unit 1601 is further configured to decode the code stream and determine the value of the first syntax element information; and when i is greater than or equal to 0 and less than the value of the first syntax element information, loop the following steps to obtain the value of at least one second syntax element information: decode the code stream, determine the value of the i-th second syntax element information, and increment i by 1.
  • the number of at least one second syntax element information is N; the second determination unit 1602 is further configured to determine an initial point value of the current macroblock; determine N intermediate point values of the current macroblock according to the value of at least one second syntax element information; wherein the value of N is equal to the value of the first syntax element information; and determine the point value of the current macroblock according to the initial point value of the current macroblock and the N intermediate point values of the current macroblock.
  • the second determination unit 1602 is further configured to, when i is greater than or equal to 0 and less than the value of the first syntax element information, loop through the following steps to obtain the N intermediate point values of the current macroblock: determine the i-th intermediate point value of the current macroblock based on the value of the i-th second syntax element information and the value of i, and increment i by 1.
  • the second determining unit 1602 is further configured to perform an i-bit left shift operation on the value of the i-th second syntax element information to obtain the i-th intermediate point value of the current macroblock.
  • the second determining unit 1602 is further configured to set the initial point value of the current macroblock to 0.
  • the second determination unit 1602 is further configured to perform a cumulative operation based on the initial point value of the current macroblock and the N intermediate point values of the current macroblock to obtain the point value of the current macroblock.
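The reconstruction performed by the second determination unit can be sketched as follows: start from an initial point value of 0, form each intermediate point value by shifting the i-th second syntax element left by i bits, and accumulate. The function name is hypothetical.

```python
def point_value_from_bits(lcu_num_points_bits: list[int]) -> int:
    """Rebuild the point value of the current macroblock from its bits."""
    point_value = 0  # initial point value
    i = 0
    while i < len(lcu_num_points_bits):
        # i-th intermediate point value: left shift by i bits.
        intermediate = lcu_num_points_bits[i] << i
        # Cumulative operation.
        point_value += intermediate
        i += 1
    return point_value
```

With the bits 1, 0, 1, 1 the intermediate point values are 1, 0, 4 and 8, which accumulate to 13.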
  • the value of the first syntax element information is an unsigned integer
  • the value of the second syntax element information is a binary variable
  • the decoding unit 1601 is further configured to decode the code stream and determine the first identification information; and if the first identification information indicates that the current sequence turns on the prediction tree decoding mode, decode the code stream and determine the value of the third syntax element information; wherein the current sequence includes the current macroblock, and the third syntax element information is used to indicate the maximum number of points of the prediction tree corresponding to the current macroblock.
  • the decoding unit 1601 is further configured to not perform the step of decoding the bitstream and determining the value of the third syntax element information if the first identification information indicates that the prediction tree decoding mode is not enabled for the current sequence.
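The conditional parsing in this embodiment can be sketched as follows. The sketch assumes that a non-zero flag value means the prediction tree mode is turned on, and it takes the bitstream read as a caller-supplied function because the source does not state how the third syntax element is coded; all names are hypothetical.

```python
from typing import Callable, Optional


def parse_max_tree_size(pred_tree_enabled_flag: int,
                        read_uint: Callable[[], int]) -> Optional[int]:
    """Return the maximum number of points, or None when it is not signalled."""
    # Assumption: a non-zero first identification value means "turned on".
    if not pred_tree_enabled_flag:
        # The third syntax element is not decoded at all in this case.
        return None
    geom_max_tree_size_log2_minus8 = read_uint()
    # Inverse of formula (6).
    return 1 << (geom_max_tree_size_log2_minus8 + 8)
```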
  • a "unit" can be a part of a circuit, a part of a processor, a part of a program or software, etc., and of course it can also be a module, or it can be non-modular.
  • the components in this embodiment can be integrated into a processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or in the form of a software functional module.
  • if the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • this embodiment provides a computer-readable storage medium, which is applied to the decoder 160, and the computer-readable storage medium stores a computer program. When the computer program is executed by the second processor, the method described in any one of the above embodiments is implemented.
  • the decoder 160 may include: a second communication interface 1701, a second memory 1702, and a second processor 1703; each component is coupled together through a second bus system 1704. It can be understood that the second bus system 1704 is used to realize the connection and communication between these components.
  • the second bus system 1704 also includes a power bus, a control bus, and a status signal bus. However, for the sake of clarity, various buses are labeled as the second bus system 1704 in FIG. 17.
  • the second communication interface 1701 is used for receiving and sending signals during the process of sending and receiving information with other external network elements;
  • the second memory 1702 is used to store a computer program that can be run on the second processor 1703;
  • the second processor 1703 is configured to execute, when running the computer program:
  • decoding a bitstream and determining a value of a first syntax element information and a value of at least one second syntax element information; wherein the first syntax element information is used to indicate the number of bits occupied by the point value of the current macroblock, and the i-th second syntax element information is used to indicate the i-th bit corresponding to the point value of the current macroblock, where i is an integer greater than or equal to 0 and less than the value of the first syntax element information;
  • a point value of the current macroblock is determined according to a value of the first syntax element information and a value of at least one second syntax element information.
  • the second processor 1703 is further configured to execute any one of the methods described in the foregoing embodiments when running the computer program.
  • the present embodiment provides a decoder.
  • by making corresponding corrections to the relevant syntax elements in the geometric coding process, the decoding end no longer needs to decode the syntax elements used to represent the maximum number of points in the geometric prediction tree, but instead determines the point value of the current macroblock through the first syntax element information and the second syntax element information, thereby saving bit rate, improving encoding and decoding efficiency, and further improving geometric encoding and decoding performance.
  • the coding and decoding system 180 may include an encoder 1801 and a decoder 1802.
  • the encoder 1801 may be the encoder described in any one of the aforementioned embodiments
  • the decoder 1802 may be the decoder described in any one of the aforementioned embodiments.
  • the point value of the current macroblock is determined; then, according to the point value of the current macroblock, the value of the first syntax element information and the value of at least one second syntax element information are determined; then, the value of the first syntax element information and the value of at least one second syntax element information are encoded, and the obtained coded bits are written into the bitstream.
  • the bitstream is decoded, the value of the first syntax element information and the value of at least one second syntax element information are determined; then, according to the value of the first syntax element information and the value of at least one second syntax element information, the point value of the current macroblock is determined.
  • the first syntax element information is used to indicate the number of bits occupied by the point value of the current macroblock
  • the i-th second syntax element information is used to indicate the i-th bit corresponding to the point value of the current macroblock
  • i is an integer greater than or equal to 0 and less than the value of the first syntax element information.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Disclosed in the embodiments of the present application are an encoding method, a decoding method, a code stream, an encoder, a decoder, and a storage medium. The decoding method comprises the following steps: decoding a code stream to determine the value of first syntax element information and the value of at least one piece of second syntax element information, the first syntax element information being used to indicate the number of bits occupied by the point value of a current macroblock, and an i-th piece of second syntax element information being used to indicate an i-th bit corresponding to the point value of the current macroblock, i being an integer greater than or equal to 0 and less than the value of the first syntax element information; and, on the basis of the value of the first syntax element information and the value of the at least one piece of second syntax element information, determining the point value of the current macroblock. In this way, encoding and decoding efficiency can be improved.
PCT/CN2023/113792 2023-08-18 2023-08-18 Encoding method, decoding method, code stream, encoder, decoder, and storage medium Pending WO2025039113A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2023/113792 WO2025039113A1 (fr) 2023-08-18 2023-08-18 Encoding method, decoding method, code stream, encoder, decoder, and storage medium

Publications (1)

Publication Number Publication Date
WO2025039113A1 (fr) 2025-02-27

Family

ID=94731245

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/113792 Pending WO2025039113A1 (fr) 2023-08-18 2023-08-18 Encoding method, decoding method, code stream, encoder, decoder, and storage medium

Country Status (1)

Country Link
WO (1) WO2025039113A1 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
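CN115379191A (zh) * 2022-08-22 2022-11-22 Tencent Technology (Shenzhen) Co., Ltd. Point cloud decoding method, point cloud encoding method, and related device
CN116320352A (zh) * 2023-03-16 2023-06-23 Tencent Technology (Shenzhen) Co., Ltd. Point cloud processing method and apparatus, computer device, and storage medium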

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23949307

Country of ref document: EP

Kind code of ref document: A1