WO2024212042A1 - Encoding method, decoding method, code stream, encoder, decoder and recording medium - Google Patents
Encoding method, decoding method, code stream, encoder, decoder and recording medium
- Publication number
- WO2024212042A1 (PCT/CN2023/087289)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- value
- identification information
- inter-frame prediction mode
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/109—Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
Definitions
- the embodiments of the present application relate to the field of point cloud encoding and decoding technology, and in particular, to an encoding and decoding method, a bit stream, an encoder, a decoder, and a storage medium.
- G-PCC refers to geometry-based point cloud compression.
- the geometry coding of G-PCC can be divided into octree-based geometry coding and prediction tree-based geometry coding.
- For the prediction tree-based geometry coding it is necessary to first establish a prediction tree; then traverse each node in the prediction tree, and after determining the prediction mode of each node, predict the geometric position information of the node according to the prediction mode to obtain the prediction residual, and finally encode the parameters such as the prediction mode and prediction residual of each node to generate a binary code stream.
- the encoding of the inter-frame prediction mode number is usually performed by converting the inter-frame prediction mode number into binary for direct encoding.
- the performance of the encoded inter-frame prediction mode number is not optimal, which reduces the encoding and decoding efficiency.
- the embodiments of the present application provide a coding and decoding method, a bit stream, an encoder, a decoder and a storage medium, which can reduce the number of coding bits, thereby saving bit rate and improving coding and decoding efficiency.
- an embodiment of the present application provides a decoding method, which is applied to a decoder, and the method includes:
- Decoding a bitstream and determining a value of at least one mode identification information; wherein the mode identification information includes at least the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode;
- determining the inter-frame prediction mode value of the current node according to the value of the at least one mode identification information; and determining the prediction value of the current node according to the inter-frame prediction mode value.
- an embodiment of the present application provides an encoding method, which is applied to an encoder, and the method includes:
- determining an inter-frame prediction mode value of a current node, and determining a value of at least one mode identification information according to the inter-frame prediction mode value; wherein the mode identification information includes at least the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode;
- the value of at least one mode identification information is encoded, and the obtained encoded bits are written into the code stream.
- an embodiment of the present application provides a code stream, which is generated by bit encoding according to information to be encoded; wherein the information to be encoded includes at least one of the following:
- the first identification information is used to indicate whether the current node uses the inter-frame prediction mode
- the second identification information is used to indicate whether the current node enables the target inter-frame encoding/decoding method
- the mode identification information includes at least the i-th mode identification information
- the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i
- i is an integer greater than or equal to 0 and less than N
- N represents the maximum value of the inter-frame prediction mode.
- an embodiment of the present application provides an encoder, the encoder comprising a first determining unit and an encoding unit; wherein,
- a first determining unit configured to determine an inter-frame prediction mode value of a current node
- the first determination unit is further configured to determine a value of at least one mode identification information according to the inter-frame prediction mode value; wherein the mode identification information includes at least the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode;
- the encoding unit is configured to encode the value of at least one mode identification information and write the obtained encoding bits into the bit stream.
- an embodiment of the present application provides an encoder, the encoder comprising a first memory and a first processor; wherein,
- a first memory for storing a computer program that can be run on the first processor
- the first processor is used to execute the method described in the second aspect when running a computer program.
- an embodiment of the present application provides a decoder, the decoder comprising a decoding unit and a second determining unit; wherein:
- a decoding unit configured to decode a bitstream and determine a value of at least one mode identification information; wherein the mode identification information includes at least the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; i is an integer greater than or equal to 0 and less than N, and N represents a maximum value of the inter-frame prediction mode;
- the second determination unit is configured to determine the inter-frame prediction mode value of the current node according to the value of at least one mode identification information; and determine the prediction value of the current node according to the inter-frame prediction mode value.
- an embodiment of the present application provides a decoder, the decoder comprising a second memory and a second processor; wherein:
- a second memory for storing a computer program that can be run on a second processor
- the second processor is used to execute the method described in the first aspect when running a computer program.
- an embodiment of the present application provides a computer-readable storage medium, which stores a computer program.
- when the computer program is executed, it implements the method as described in the first aspect, or implements the method as described in the second aspect.
- the embodiment of the present application provides a coding and decoding method, a bitstream, an encoder, a decoder and a storage medium.
- the inter-frame prediction mode value of the current node is determined; according to the inter-frame prediction mode value, the value of at least one mode identification information is determined; wherein the mode identification information at least includes the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode; the value of at least one mode identification information is encoded, and the obtained encoding bits are written into the bitstream.
- the bitstream is decoded to determine the value of at least one mode identification information; wherein the mode identification information at least includes the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode; according to the value of at least one mode identification information, the inter-frame prediction mode value of the current node is determined; according to the inter-frame prediction mode value, the prediction value of the current node is determined.
- the inter-frame prediction mode value is no longer converted into binary for direct encoding, but the value of at least one mode identification information is determined according to the inter-frame prediction mode value, and then the value of the at least one mode identification information is encoded; wherein, the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i.
- This takes into account the frequency distribution of the inter-frame prediction modes, that is, the more frequently an inter-frame prediction mode occurs, the smaller its corresponding inter-frame prediction mode value (the closer it is to 0) and the fewer mode identification flags need to be encoded for it, thereby reducing the number of encoding bits, saving bit rate, and thus improving encoding and decoding efficiency.
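- To make the binarization concrete, the following is an illustrative (non-normative) Python sketch of such a flag-based scheme. It assumes the flags are signalled for i = 1, …, N (the i = 0 flag would always be true) and uses a plain bit list in place of context-based arithmetic coding:

```python
def encode_mode_flags(mode_value: int, n_max: int) -> list[int]:
    """Binarize an inter-frame prediction mode value as a series of
    'greater than or equal to i' flags (truncated-unary style)."""
    bits = []
    for i in range(1, n_max + 1):
        flag = 1 if mode_value >= i else 0
        bits.append(flag)
        if flag == 0:        # once a flag is 0, all later flags are implied to be 0
            break
    return bits


def decode_mode_flags(bits: list[int], n_max: int) -> int:
    """Recover the mode value by counting leading 1-flags."""
    mode_value = 0
    for i, flag in enumerate(bits, start=1):
        if flag == 0:
            break
        mode_value = i
        if i == n_max:       # all n_max flags are 1: the maximum value was coded
            break
    return mode_value


# a frequently used mode with value 0 costs a single flag,
# while the rarer maximum mode value costs n_max flags
assert encode_mode_flags(0, 4) == [0]
assert encode_mode_flags(2, 4) == [1, 1, 0]
assert decode_mode_flags([1, 1, 0], 4) == 2
```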
- FIG1A is a schematic diagram of a three-dimensional point cloud image
- FIG1B is a partial enlarged view of a three-dimensional point cloud image
- FIG2A is a schematic diagram of six viewing angles of a point cloud image
- FIG2B is a schematic diagram of a data storage format corresponding to a point cloud image
- FIG3 is a schematic diagram of a network architecture for point cloud encoding and decoding
- FIG4A is a schematic diagram of a composition framework of a G-PCC encoder
- FIG4B is a schematic diagram of a composition framework of a G-PCC decoder
- FIG5A is a schematic diagram of a low plane position in the Z-axis direction
- FIG5B is a schematic diagram of a high plane position in the Z-axis direction
- FIG6 is a schematic diagram of a node encoding sequence
- FIG7A is a schematic diagram of one type of plane identification information
- FIG7B is a schematic diagram of another type of planar identification information
- FIG8 is a schematic diagram of sibling nodes of a current node
- FIG9 is a schematic diagram of the intersection of a laser radar and a node
- FIG10 is a schematic diagram of neighborhood nodes at the same partition depth and the same coordinates
- FIG11 is a schematic diagram of a current node being located at a low plane position of a parent node
- FIG12 is a schematic diagram of a high plane position of a current node located at a parent node
- FIG13 is a schematic diagram of predictive coding of planar position information of a laser radar point cloud
- FIG14 is a schematic diagram of IDCM encoding
- FIG15 is a schematic diagram of coordinate transformation of a rotating laser radar to obtain a point cloud
- FIG16 is a schematic diagram of predictive coding in the X-axis or Y-axis direction
- FIG17A is a schematic diagram showing an angle of the Y plane predicted by a horizontal azimuth angle
- FIG17B is a schematic diagram showing an angle of predicting the X-plane by using a horizontal azimuth angle
- FIG18 is another schematic diagram of predictive coding in the X-axis or Y-axis direction
- FIG19A is a schematic diagram of three intersection points included in a sub-block
- FIG19B is a schematic diagram of a triangular facet set fitted using three intersection points
- FIG19C is a schematic diagram of upsampling of a triangular face set
- FIG20 is a schematic diagram of the structure of a geometric information inter-frame encoding and decoding
- FIG21 is a schematic diagram of a flow chart of a decoding method provided in an embodiment of the present application.
- FIG22 is a schematic diagram of a flow chart of an encoding method provided in an embodiment of the present application.
- FIG23 is a schematic diagram of a flow chart of another encoding method provided in an embodiment of the present application.
- FIG24 is a schematic diagram of the composition structure of an encoder provided in an embodiment of the present application.
- FIG25 is a schematic diagram of a specific hardware structure of an encoder provided in an embodiment of the present application.
- FIG26 is a schematic diagram of the composition structure of a decoder provided in an embodiment of the present application.
- FIG27 is a schematic diagram of a specific hardware structure of a decoder provided in an embodiment of the present application.
- FIG. 28 is a schematic diagram of the composition structure of a coding and decoding system provided in an embodiment of the present application.
- The terms "first", "second" and "third" involved in the embodiments of the present application are only used to distinguish similar objects and do not represent a specific ordering of the objects. It can be understood that "first", "second" and "third" can be interchanged in a specific order or sequence where permitted, so that the embodiments of the present application described here can be implemented in an order other than that illustrated or described here.
- Point Cloud is a three-dimensional representation of the surface of an object.
- Point cloud (data) on the surface of an object can be collected through acquisition equipment such as photoelectric radar, lidar, laser scanner, and multi-view camera.
- a point cloud is a set of irregularly distributed discrete points in space that express the spatial structure and surface properties of a three-dimensional object or scene.
- FIG1A shows a three-dimensional point cloud image
- FIG1B shows a partial magnified view of the three-dimensional point cloud image. It can be seen that the point cloud surface is composed of densely distributed points.
- Two-dimensional images have information expressed at each pixel point, and the distribution is regular, so there is no need to record its position information additionally; however, the distribution of points in point clouds in three-dimensional space is random and irregular, so it is necessary to record the position of each point in space in order to fully express a point cloud.
- each position acquired in the acquisition process has corresponding attribute information, usually RGB color values, where the color value reflects the color of the object; for point clouds, in addition to color information, the attribute information commonly associated with each point is the reflectance value, which reflects the surface material of the object. Therefore, point cloud data usually includes the position information of the points and the attribute information of the points. The position information of a point can also be called the geometric information of the point.
- the geometric information of the point can be the three-dimensional coordinate information of the point (x, y, z).
- the attribute information of the point can include color information and/or reflectivity, etc.
- reflectivity can be one-dimensional reflectivity information (r); color information can be information on any color space, or color information can also be three-dimensional color information, such as RGB information.
- where R represents red (Red, R), G represents green (Green, G), and B represents blue (Blue, B).
- the color information may be luminance and chrominance (YCbCr, YUV) information, where Y represents brightness (Luma), Cb (U) represents blue color difference, and Cr (V) represents red color difference.
- the points in the point cloud may include the three-dimensional coordinate information of the points and the reflectivity value of the points.
- the points in the point cloud may include the three-dimensional coordinate information of the points and the three-dimensional color information of the points.
- a point cloud obtained by combining the principles of laser measurement and photogrammetry may include the three-dimensional coordinate information of the points, the reflectivity value of the points and the three-dimensional color information of the points.
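- As a minimal illustration of this data model (the field names are hypothetical, not taken from any codec specification), a single point record could be represented as:

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PointCloudPoint:
    # geometric information: three-dimensional coordinates (x, y, z)
    position: Tuple[float, float, float]
    # optional one-dimensional reflectance attribute (e.g. from a lidar sensor)
    reflectance: Optional[float] = None
    # optional three-dimensional color attribute, e.g. (R, G, B)
    color: Optional[Tuple[int, int, int]] = None
```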
- Figures 2A and 2B show a point cloud image and its corresponding data storage format.
- Figure 2A provides six viewing angles of the point cloud image
- the data storage format in Figure 2B consists of a file header information part and a data part.
- the header information includes the data format, data representation type, the total number of point cloud points, and the content represented by the point cloud.
- the point cloud is in the ".ply" format, represented by ASCII code, with a total number of 207242 points, and each point has three-dimensional coordinate information (x, y, z) and three-dimensional color information (r, g, b).
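- For illustration, an ASCII ".ply" header of the kind described above could look as follows; this is a typical layout of the standard PLY format, not a copy of the file shown in Figure 2B:

```python
# illustrative ASCII PLY header for a point cloud with 207242 points,
# each point carrying (x, y, z) coordinates and (r, g, b) colors
PLY_HEADER = "\n".join([
    "ply",
    "format ascii 1.0",
    "element vertex 207242",
    "property float x",
    "property float y",
    "property float z",
    "property uchar red",
    "property uchar green",
    "property uchar blue",
    "end_header",
])
```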
- Point clouds can be divided into the following categories according to the way they are obtained:
- Static point cloud: the object is stationary, and the device that obtains the point cloud is also stationary;
- Dynamic point cloud: the object is moving, but the device that obtains the point cloud is stationary;
- Dynamically acquired point cloud: the device used to acquire the point cloud is in motion.
- point clouds can be divided into two categories according to their usage:
- Category 1: machine perception point clouds, which can be used in autonomous navigation systems, real-time inspection systems, geographic information systems, visual sorting robots, disaster relief robots, etc.
- Category 2: point clouds perceived by the human eye, which can be used in point cloud application scenarios such as digital cultural heritage, free viewpoint broadcasting, 3D immersive communication, and 3D immersive interaction.
- Point clouds can flexibly and conveniently express the spatial structure and surface properties of three-dimensional objects or scenes. Point clouds are obtained by directly sampling real objects, so they can provide a strong sense of reality while ensuring accuracy. Therefore, they are widely used, including virtual reality games, computer-aided design, geographic information systems, automatic navigation systems, digital cultural heritage, free viewpoint broadcasting, three-dimensional immersive remote presentation, and three-dimensional reconstruction of biological tissues and organs.
- Point clouds can be collected mainly through the following methods: computer generation, 3D laser scanning, 3D photogrammetry, etc.
- Computers can generate point clouds of virtual three-dimensional objects and scenes; 3D laser scanning can obtain point clouds of static real-world three-dimensional objects or scenes, and can obtain millions of point clouds per second; 3D photogrammetry can obtain point clouds of dynamic real-world three-dimensional objects or scenes, and can obtain tens of millions of point clouds per second.
- the number of points in each point cloud frame is 700,000, and each point has coordinate information xyz (float) and color information RGB (uchar).
- point cloud compression has become a key issue in promoting the development of the point cloud industry.
- since the point cloud is a collection of a massive number of points, storing the point cloud will not only consume a lot of memory, but will also be inconvenient for transmission; there is also not enough bandwidth to support direct transmission of the point cloud at the network layer without compression. Therefore, the point cloud needs to be compressed.
- the point cloud coding framework that can compress point clouds can be the geometry-based point cloud compression (G-PCC) codec framework or the video-based point cloud compression (V-PCC) codec framework provided by the Moving Picture Experts Group (MPEG), or the AVS-PCC codec framework provided by AVS.
- the G-PCC codec framework can be used to compress the first type of static point cloud and the third type of dynamically acquired point cloud, which can be based on the point cloud compression test platform (Test Model Compression 13, TMC13), and the V-PCC codec framework can be used to compress the second type of dynamic point cloud, which can be based on the point cloud compression test platform (Test Model Compression 2, TMC2). Therefore, the G-PCC codec framework is also called the point cloud codec TMC13, and the V-PCC codec framework is also called the point cloud codec TMC2.
- FIG3 is a schematic diagram of a network architecture of a point cloud encoding and decoding provided by the embodiment of the present application.
- the network architecture includes one or more electronic devices 13 to 1N and a communication network 01, wherein the electronic devices 13 to 1N can perform video interaction through the communication network 01.
- the electronic device can be various types of devices with point cloud encoding and decoding functions.
- the electronic device can include a mobile phone, a tablet computer, a personal computer, a personal digital assistant, a navigator, a digital phone, a video phone, a television, a sensor device, a server, etc., which is not limited by the embodiment of the present application.
- the decoder or encoder in the embodiment of the present application can be the above-mentioned electronic device.
- the electronic device in the embodiment of the present application has a point cloud encoding and decoding function, generally including a point cloud encoder (ie, encoder) and a point cloud decoder (ie, decoder).
- the point cloud data is first divided into multiple slices by slice division.
- the geometric information of the point cloud and the attribute information corresponding to each point are encoded separately.
- FIG4A shows a schematic diagram of the composition framework of a G-PCC encoder.
- the geometric information is transformed so that all point clouds are contained in a bounding box (Bounding Box), and then quantized.
- This step of quantization mainly plays a role in scaling. Due to the quantization rounding, the geometric information of a part of the point cloud is the same, so whether to remove duplicate points is determined based on parameters.
- the process of quantization and removal of duplicate points is also called voxelization.
- the Bounding Box is divided into octrees or a prediction tree is constructed.
- arithmetic coding is performed on the points in the divided leaf nodes to generate a binary geometric bit stream; or, arithmetic coding is performed on the intersections (Vertex) generated by the division (surface fitting is performed based on the intersections) to generate a binary geometric bit stream.
- For attribute encoding: after the geometric encoding is completed and the geometric information is reconstructed, color conversion is required first to convert the color information (i.e., attribute information) from the RGB color space to the YUV color space. Then, the point cloud is recolored using the reconstructed geometric information so that the uncoded attribute information corresponds to the reconstructed geometric information. Attribute encoding is mainly performed on color information, and there are two main transformation methods: one is the distance-based lifting transform that relies on level of detail (LOD) division, and the other is the direct region adaptive hierarchical transform (RAHT).
- FIG4B shows a schematic diagram of the composition framework of a G-PCC decoder.
- the geometric bit stream and the attribute bit stream in the binary bit stream are first decoded independently.
- the geometric information of the point cloud is obtained through arithmetic decoding, octree/prediction tree reconstruction, geometry reconstruction and inverse coordinate conversion;
- the attribute information of the point cloud is obtained through arithmetic decoding, inverse quantization, LOD division/RAHT and inverse color conversion, and the point cloud data to be encoded (i.e., the output point cloud) is restored based on the geometric information and attribute information.
- the current geometric coding of G-PCC can be divided into octree-based geometric coding (marked by a dotted box) and prediction tree-based geometric coding (marked by a dotted box).
- the octree-based geometry encoding includes: first, coordinate transformation of the geometric information so that all point clouds are contained in a Bounding Box. Then quantization is performed. This step of quantization mainly plays a role of scaling. Due to the quantization rounding, the geometric information of some points is the same. Whether to remove duplicate points is determined based on parameters. The process of quantization and removal of duplicate points is also called voxelization. Next, the Bounding Box is continuously divided into trees (such as octrees, quadtrees, binary trees, etc.) in the order of breadth-first traversal, and the placeholder code of each node is encoded.
- the bounding box of the point cloud is calculated. Assume that dx > dy > dz , the bounding box corresponds to a cuboid.
- binary tree partitioning will be performed based on the x-axis to obtain two child nodes.
- quadtree partitioning will be performed based on the x- and y-axes to obtain four child nodes.
- octree partitioning will be performed until the leaf node obtained by partitioning is a 1×1×1 unit cube.
- K indicates the maximum number of binary tree/quadtree partitions before octree partitioning.
- M is used to indicate that the minimum block side length corresponding to binary tree/quadtree partitioning is 2^M.
- the reason why parameters K and M meet the above conditions is that in the process of geometric implicit partitioning in G-PCC, the priority of partitioning is binary tree, quadtree and octree.
- If the node block size does not meet the conditions of binary tree/quadtree partitioning, the node will be partitioned by octree until it is divided into the minimum leaf node unit of 1×1×1.
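- A simplified, illustrative sketch of this implicit binary-tree/quadtree/octree decision is given below; the parameters K and M are as defined above, while the function signature and the exact tie-breaking rules are assumptions rather than the normative G-PCC behavior:

```python
def choose_partition(log2_sizes, num_qtbt_done: int, K: int, M: int) -> str:
    """log2_sizes = (dx, dy, dz): the node spans 2^dx x 2^dy x 2^dz.

    Binary/quadtree partitioning has priority over octree partitioning,
    but only for at most K such partitions and only while the block side
    length stays above the minimum side length 2^M."""
    dx, dy, dz = log2_sizes
    d_max = max(dx, dy, dz)
    long_axes = [d == d_max for d in (dx, dy, dz)]
    if num_qtbt_done < K and d_max > M and long_axes.count(True) < 3:
        if long_axes.count(True) == 1:
            return "binary"   # split only along the single longest axis
        return "quad"         # split along the two longest axes
    return "octree"           # cube-shaped (or constrained) nodes use octree
```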
- the geometric information encoding mode based on octree can effectively encode the geometric information of point cloud by utilizing the correlation between adjacent points in space.
- the encoding efficiency of point cloud geometric information can be further improved by using plane coding.
- Fig. 5A and Fig. 5B provide a kind of plane position schematic diagram.
- Fig. 5A shows a kind of low plane position schematic diagram in the Z-axis direction
- Fig. 5B shows a kind of high plane position schematic diagram in the Z-axis direction.
- (a), (a0), (a1), (a2), (a3) here all belong to the low plane position in the Z-axis direction.
- the four occupied subnodes of the current node are located at the high plane position of the current node in the Z-axis direction, so it can be considered that the current node belongs to a Z plane and is a high plane in the Z-axis direction.
- FIG. 6 provides a schematic diagram of the node coding order, that is, the node coding is performed in the order of 0, 1, 2, 3, 4, 5, 6, and 7 as shown in FIG. 6.
- the octree coding method is used for (a) in FIG. 5A, the placeholder information of the current node is represented as: 11001100.
- the plane coding method is used, first, an identifier needs to be encoded to indicate that the current node is a plane in the Z-axis direction.
- the plane position of the current node needs to be represented; secondly, only the placeholder information of the low plane node in the Z-axis direction needs to be encoded (that is, the placeholder information of the four subnodes 0, 2, 4, and 6). Therefore, based on the plane coding method, only 6 bits need to be encoded to encode the current node, which can reduce the representation of 2 bits compared with the octree coding of the related art. Based on this analysis, plane coding has a more obvious coding efficiency than octree coding.
- Prob(i)_new = (L × Prob(i) + δ(coded node)) / (L + 1)    (3)
- where L = 255; in addition, δ(coded node) is 1 if the coded node is a plane, and 0 otherwise.
- local_node_density_new = local_node_density + 4 × numSiblings    (4)
- FIG8 shows a schematic diagram of the sibling nodes of the current node. As shown in FIG8, the current node is a node filled with slashes, and the nodes filled with grids are sibling nodes, then the number of sibling nodes of the current node is 5 (including the current node itself).
- planarEligibleKOctreeDepth: if (pointCount − numPointCountRecon) is less than nodeCount × 1.3, then planarEligibleKOctreeDepth is true; otherwise, planarEligibleKOctreeDepth is false. In this way, when planarEligibleKOctreeDepth is true, all nodes in the current layer are plane-encoded; otherwise, all nodes in the current layer are not plane-encoded, and only octree coding is used.
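- The probability update of equation (3) and the per-layer eligibility decision can be sketched as follows; this is an illustrative rendering of the quantities named above, not the normative implementation:

```python
L = 255  # window length used in the probability update of equation (3)


def update_plane_probability(prob: float, coded_node_is_plane: bool) -> float:
    """Equation (3): Prob(i)_new = (L * Prob(i) + delta(coded node)) / (L + 1)."""
    delta = 1.0 if coded_node_is_plane else 0.0
    return (L * prob + delta) / (L + 1)


def planar_eligible_k_octree_depth(point_count: int,
                                   num_point_count_recon: int,
                                   node_count: int) -> bool:
    """Layer-level test described above: plane coding is enabled for all nodes
    of the current layer only when the remaining points are sparse enough
    relative to the number of nodes in the layer."""
    return (point_count - num_point_count_recon) < node_count * 1.3
```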
- Figure 9 shows a schematic diagram of the intersection of a laser radar and a node.
- a node filled with a grid is simultaneously passed through by two laser beams (Laser), so the current node is not a plane in the vertical direction of the Z axis;
- a node filled with a slash is small enough that it cannot be passed through by two lasers at the same time, so the node filled with a slash may be a plane in the vertical direction of the Z axis.
- the plane identification information and the plane position information may be predictively coded.
- the predictive encoding of the plane position information may include:
- the plane position information is divided into three elements: predicted as a low plane, predicted as a high plane, and unpredictable;
- After determining the spatial distance between the current node and the node at the same division depth and the same coordinates, if the spatial distance is less than a preset distance threshold, the spatial distance can be determined to be "near"; or, if the spatial distance is greater than the preset distance threshold, the spatial distance can be determined to be "far".
- FIG10 shows a schematic diagram of neighborhood nodes at the same division depth and the same coordinates.
- the bold large cube represents the parent node (Parent node), the small cube filled with a grid inside it represents the current node (Current node), and the intersection position (Vertex position) of the current node is shown;
- the small cube filled with white represents the neighborhood nodes at the same division depth and the same coordinates, and the distance between the current node and the neighborhood node is the spatial distance, which can be judged as "near” or "far”; in addition, if the neighborhood node is a plane, then the plane position (Planar position) of the neighborhood node is also required.
- the current node is a small cube filled with a grid
- the neighboring node is searched for a small cube filled with white at the same octree partition depth level and the same vertical coordinate, and the distance between the two nodes is judged as "near" and "far", and the plane position of the reference node is referenced.
- FIG11 shows a schematic diagram of a current node being located at a low plane position of a parent node.
- (a), (b), and (c) show three examples of the current node being located at a low plane position of a parent node.
- the specific description is as follows:
- FIG12 shows a schematic diagram of a current node being located at a high plane position of a parent node.
- (a), (b), and (c) show three examples of the current node being located at a high plane position of a parent node.
- the specific description is as follows:
- Figure 13 shows a schematic diagram of predictive encoding of the laser radar point cloud plane position information.
- when the laser radar emission angle is θ_bottom, it can be mapped to the bottom plane (Bottom virtual plane); when the laser radar emission angle is θ_top, it can be mapped to the top plane (Top virtual plane).
- the plane position of the current node is predicted by using the laser radar acquisition parameters, and the position at which the laser ray intersects the current node is quantized into multiple intervals, which is finally used as the context information of the plane position of the current node.
- the specific calculation process is as follows: assuming that the coordinates of the laser radar are (x_Lidar, y_Lidar, z_Lidar), and the geometric coordinates of the current node are (x, y, z), first calculate the vertical tangent value tanθ of the current node relative to the laser radar, and the calculation formula is as follows:
- since each Laser has a certain offset angle relative to the LiDAR, it is also necessary to calculate the relative tangent value tanθ_corr,L of the current node relative to the Laser.
- the specific calculation is as follows:
- the relative tangent value tanθ_corr,L of the current node is used to predict the plane position of the current node. Specifically, assuming that the tangent value of the lower boundary of the current node is tan(θ_bottom), and the tangent value of the upper boundary is tan(θ_top), the plane position is quantized into 4 quantization intervals according to tanθ_corr,L, that is, the context information of the plane position is determined.
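- The following sketch illustrates the idea of quantizing the position where the laser ray crosses the current node into 4 context intervals; the exact formulas, fixed-point arithmetic and per-laser correction in G-PCC differ, so this is illustrative only:

```python
import math


def plane_position_context(node_pos, node_size, lidar_origin,
                           laser_tan, laser_z_offset) -> int:
    """Return one of 4 context intervals for the Z plane position of a node."""
    x = node_pos[0] - lidar_origin[0]
    y = node_pos[1] - lidar_origin[1]
    z = node_pos[2] - lidar_origin[2]
    r = max(math.hypot(x, y), 1e-9)          # horizontal distance to the lidar
    # tangent of the laser ray seen from the lidar origin at this distance,
    # including the vertical offset of the laser (the "corrected" tangent)
    tan_corr = laser_tan + laser_z_offset / r
    tan_bottom = z / r                       # lower boundary of the node
    tan_top = (z + node_size) / r            # upper boundary of the node
    # quantize where tan_corr falls inside [tan_bottom, tan_top] into 4 bins
    t = (tan_corr - tan_bottom) / max(tan_top - tan_bottom, 1e-9)
    return min(3, max(0, int(t * 4)))
```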
- the octree-based geometric information coding mode only has an efficient compression rate for points with correlation in space.
- the use of the direct coding model (DCM) can greatly reduce the complexity.
- whether DCM is used is not indicated by flag information, but is inferred from the parent node and neighbor nodes of the current node. There are three ways to determine whether the current node is eligible for DCM encoding, as follows:
- the current node has no sibling child nodes, that is, the parent node of the current node has only one child node, and the parent node of the parent node of the current node has only two occupied child nodes, that is, the current node has at most one neighbor node.
- the parent node of the current node has only one child node, the current node.
- the six neighbor nodes that share a face with the current node are also empty nodes.
- FIG14 provides a schematic diagram of IDCM coding. If the current node does not have the DCM coding qualification, it will be divided into octrees. If it has the DCM coding qualification, the number of points contained in the node will be further determined. When the number of points is less than a threshold value (for example, 2), the node will be DCM-encoded, otherwise the octree division will continue.
- When IDCM_flag is true, the current node is encoded using DCM; otherwise octree coding is still used.
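- An illustrative sketch of the eligibility test and of the FIG. 14 decision is given below; the field names on the node object are hypothetical:

```python
def idcm_eligible(num_siblings: int,
                  grandparent_occupied_children: int,
                  occupied_face_neighbours: int) -> bool:
    """A node may qualify for DCM through any one of the three listed conditions."""
    # condition 1: no sibling nodes and the grandparent has only two occupied
    # children, i.e. the current node has at most one neighbour node
    if num_siblings == 1 and grandparent_occupied_children <= 2:
        return True
    # condition 2: the parent node has only one child, the current node
    if num_siblings == 1:
        return True
    # condition 3: the six face-sharing neighbour nodes are all empty
    return occupied_face_neighbours == 0


def choose_node_coding(node, point_threshold: int = 2) -> str:
    """FIG. 14 style decision: eligible and sparse nodes are DCM coded,
    otherwise octree division continues."""
    if idcm_eligible(node.num_siblings,
                     node.grandparent_occupied_children,
                     node.occupied_face_neighbours) \
            and node.point_count < point_threshold:
        return "DCM"      # IDCM_flag would be set accordingly
    return "octree"
```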
- the DCM coding mode of the current node needs to be encoded.
- There are currently two DCM modes, namely: (a) the node contains only one point (or multiple points, but they are repeated points); (b) the node contains two points.
- the geometric information of each point needs to be encoded. Assuming that the side length of the node is 2^d, d bits are required to encode each component of the geometric coordinates of the node, and the bit information is directly encoded into the bit stream. It should be noted here that when encoding the lidar point cloud, the three-dimensional coordinate information can be predictively encoded by using the lidar acquisition parameters, thereby further improving the encoding efficiency of the geometric information.
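- A minimal sketch of this direct bit writing, assuming the point position is expressed as a non-negative offset inside the node:

```python
def encode_dcm_coordinates(point_offset, d: int) -> list[int]:
    """Write the d bits of each coordinate component directly, MSB first,
    for a node whose side length is 2^d."""
    bits = []
    for component in point_offset:           # x, y, z offsets inside the node
        for b in range(d - 1, -1, -1):
            bits.append((component >> b) & 1)
    return bits


# example: a point at offset (3, 0, 5) in a node of side length 2^3 costs 3*d = 9 bits
# encode_dcm_coordinates((3, 0, 5), 3) == [0, 1, 1, 0, 0, 0, 1, 0, 1]
```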
- If the current node does not meet the requirements of a DCM node (that is, the number of points is greater than 2 and they are not duplicate points), it will exit directly.
- If the second point of the current node is a repeated point, it is then encoded whether the number of repeated points of the current node is greater than 1. When the number of repeated points is greater than 1, exponential Golomb coding is performed on the remaining number of repeated points.
- the coordinate information of the points contained in the current node is encoded.
- the following will introduce the lidar point cloud and the human eye point cloud in detail.
- the axis with the smaller node coordinate geometry position will be used as the priority coded axis directAxis, and then the geometry information of the priority coded axis directAxis will be encoded as follows. Assume that the bit depth of the coded geometry corresponding to the priority coded axis is nodeSizeLog2, and assume that the coordinates of the two points are pointPos[0] and pointPos[1].
- the specific encoding process is as follows:
- the geometry information of the priority coded coordinate axis directAxis is first encoded as follows, assuming that the priority coded axis corresponds to a coded geometry bit depth of nodeSizeLog2, and assuming that the coordinates of the two points are pointPos[0] and pointPos[1].
- the specific encoding process is as follows:
- the geometric coordinate information of the current node can be predicted, so as to further improve the efficiency of the geometric information encoding of the point cloud.
- the geometric information nodePos of the current node is first used to obtain a directly encoded main axis direction, and then the geometric information of the encoded direction is used to predict the geometric information of another dimension.
- the axis direction of the direct encoding is directAxis
- the bit depth of the direct encoding is nodeSizeLog2
- FIG15 provides a schematic diagram of coordinate transformation for obtaining point clouds using a rotating laser radar.
- the (x, y, z) coordinates of each node can be converted to (R, φ, i).
- the laser scanner can perform laser scanning at a preset angle, and a different θ(i) can be obtained for each value of i.
- for example, when i is equal to 1, θ(1) can be obtained, and the corresponding scanning angle is -15°; when i is equal to 2, θ(2) can be obtained, and the corresponding scanning angle is -13°; when i is equal to 10, θ(10) can be obtained, and the corresponding scanning angle is +13°; when i is equal to 19, θ(19) can be obtained, and the corresponding scanning angle is +15°.
- First, the LaserIdx corresponding to the current point (i.e., pointLaserIdx in Figure 15) and the LaserIdx of the current node (i.e., nodeLaserIdx) are calculated; secondly, the LaserIdx of the node (nodeLaserIdx) is used to predictively encode the LaserIdx of the point (pointLaserIdx). The calculation method of the LaserIdx of the node or point is as follows.
- the LaserIdx of the current node is first used to predict the pointLaserIdx of the point. After the LaserIdx of the current point is encoded, the three-dimensional geometric information of the current point is predicted and encoded using the acquisition parameters of the laser radar.
- FIG16 shows a schematic diagram of predictive coding in the X-axis or Y-axis direction.
- a box filled with a grid represents a current node
- a box filled with a slash represents an already coded node.
- the LaserIdx corresponding to the current node is first used to obtain the corresponding predicted value of the horizontal azimuth angle; secondly, the node geometry information corresponding to the current point is used to obtain the horizontal azimuth angle corresponding to the node. Assuming the geometric coordinates of the node are nodePos, the horizontal azimuth angle of the node is calculated from the node geometry information as follows:
- Figure 17A shows a schematic diagram of predicting the angle of the Y plane through the horizontal azimuth angle
- Figure 17B shows a schematic diagram of predicting the angle of the X plane through the horizontal azimuth angle.
- the predicted value of the horizontal azimuth angle corresponding to the current point is calculated as follows:
- FIG18 shows another schematic diagram of predictive coding in the X-axis or Y-axis direction.
- the portion filled with a grid represents the low plane
- the portion filled with dots represents the high plane.
- FIG18 marks the horizontal azimuth angle of the low plane of the current node, the horizontal azimuth angle of the high plane of the current node, and the predicted horizontal azimuth angle corresponding to the current node.
- int context = (angleL < 0 && angleR < 0)
- the LaserIdx corresponding to the current point will be used to predict the Z-axis direction of the current point. That is, the radius information radius of the radar coordinate system is calculated by using the x and y information of the current point. Then, the tangent value of the current point and the vertical offset are obtained by using the laser LaserIdx of the current point, and the predicted value of the Z-axis direction of the current point, namely Z_pred, can be obtained.
- Z_pred is used to perform predictive coding on the geometric information of the current point in the Z-axis direction to obtain the prediction residual Z_res, and finally Z_res is encoded.
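- An illustrative sketch of this Z-axis prediction is given below; the laser parameters are assumed to come from the lidar calibration, and the exact rounding used in G-PCC differs:

```python
import math


def predict_z(point_x: float, point_y: float,
              laser_tan: float, laser_z_offset: float,
              lidar_origin=(0.0, 0.0, 0.0)) -> float:
    """Z_pred: the height at which the laser with the given tangent and
    vertical offset crosses the horizontal position of the current point."""
    radius = math.hypot(point_x - lidar_origin[0], point_y - lidar_origin[1])
    return radius * laser_tan + laser_z_offset + lidar_origin[2]


# encoder side: Z_res = z - Z_pred is coded;
# decoder side: z is reconstructed as Z_pred + Z_res.
```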
- G-PCC currently introduces a plane coding mode. In the process of geometric division, it will determine whether the child nodes of the current node are in the same plane. If the child nodes of the current node meet the conditions of the same plane, the child nodes of the current node will be represented by the plane.
- the decoder follows the order of breadth-first traversal. Before decoding the placeholder information of each node, it will first use the reconstructed geometric information to determine whether the current node is to be plane decoded or IDCM decoded. If the current node meets the conditions for plane decoding, it will first decode the plane identification and plane position information of the current node, and then decode the placeholder information of the current node based on the plane information. If the current node meets the conditions for IDCM decoding, it will first decode whether the current node is a true IDCM node.
- If it is a true IDCM node, it will continue to parse the DCM decoding mode of the current node, then obtain the number of points in the current DCM node, and finally decode the geometric information of each point. For nodes that do not meet the conditions for plane decoding, the placeholder information of the current node is decoded directly, without plane information. For nodes that do not meet the requirements of DCM decoding, plane decoding or octree decoding is used to decode the placeholder information of the current node. By continuously parsing in this way, the placeholder code of each node is obtained, and the nodes are continuously divided in turn until a 1×1×1 unit cube is obtained. The number of points contained in each leaf node is parsed, and finally the geometrically reconstructed point cloud information is restored.
- the prior information is first used to determine whether the node starts IDCM. That is, the starting conditions of IDCM are as follows:
- the current node has no sibling child nodes, that is, the parent node of the current node has only one child node, and the parent node of the parent node of the current node has only two occupied child nodes, that is, the current node has at most one neighbor node.
- the parent node of the current node has only one child node, the current node.
- the six neighbor nodes that share a face with the current node are also empty nodes.
- If a node meets the conditions for DCM coding, first decode whether the current node is a real DCM node, that is, IDCM_flag; when IDCM_flag is true, the current node adopts DCM coding, otherwise it still adopts octree coding.
- If the numPoints of the current node obtained by decoding is less than or equal to 1, continue decoding to see whether the second point is a repeated point; if the second point is not a repeated point, it can be implicitly inferred that the second case of the DCM mode (containing only one point) is satisfied; if the second point obtained by decoding is a repeated point, it can be inferred that the third case of the DCM mode (containing multiple points, all of which are repeated points) is satisfied; then continue decoding to see whether the number of repeated points is greater than 1 (entropy decoding), and if it is greater than 1, continue decoding the number of remaining repeated points (using exponential Golomb decoding).
- If the current node does not meet the requirements of a DCM node (that is, the number of points is greater than 2 and they are not duplicate points), it will exit directly.
- the coordinate information of the points contained in the current node is decoded.
- the following will introduce the lidar point cloud and the human eye point cloud in detail.
- the axis with the smaller node coordinate geometry position will be used as the priority decoding axis directAxis, and then the geometry information of the priority decoding axis directAxis will be decoded first in the following way.
- the geometry bit depth to be decoded corresponding to the priority decoding axis is nodeSizeLog2
- the coordinates of the two points are pointPos[0] and pointPos[1] respectively.
- the specific decoding process is as follows:
- the geometry information of the priority decoded coordinate axis directAxis is first decoded as follows, assuming that the priority decoded axis corresponds to a geometry bit depth to be decoded of nodeSizeLog2, and assuming that the coordinates of the two points are pointPos[0] and pointPos[1].
- the specific decoding process is as follows:
- the LaserIdx of the current node i.e., nodeLaserIdx
- the LaserIdx of the node i.e., nodeLaserIdx
- the calculation method of the LaserIdx of the node or point is the same as that of the encoder.
- the prediction residual between the LaserIdx of the current point and the LaserIdx of the node is decoded to obtain ResLaserIdx.
- the three-dimensional geometric information of the current point is predicted and decoded using the acquisition parameters of the laser radar.
- the specific algorithm is as follows:
- the node geometry information corresponding to the current point is used to obtain the horizontal azimuth angle corresponding to the node. Assuming the geometric coordinates of the node are nodePos, the horizontal azimuth angle of the node is calculated from the node geometry information as follows:
- int context = (angleL < 0 && angleR < 0)
- the Z-axis direction of the current point will be predicted and decoded using the LaserIdx corresponding to the current point, that is, the radius information radius of the radar coordinate system is calculated by using the x and y information of the current point, and then the tangent value of the current point and the vertical offset are obtained using the laser LaserIdx of the current point, so that the predicted value of the Z-axis direction of the current point, namely Z_pred, can be obtained.
- the decoded Z_res and Z_pred are used to reconstruct and restore the geometric information of the current point in the Z-axis direction.
- geometric information coding based on triangle soup (trisoup)
- geometric division must also be performed first, but different from geometric information coding based on binary tree/quadtree/octree, this method does not need to divide the point cloud into 1×1×1 unit cubes step by step, but stops dividing when the side length of the sub-block is W.
- the intersection points (vertices) of the point cloud surface with the twelve edges of the block are obtained.
- the vertex coordinates of each block are encoded in turn to generate a binary code stream.
- Predictive geometry coding includes: first, sorting the input point cloud.
- the currently used sorting methods include unordered, Morton order, azimuth order, and radial distance order.
- the prediction tree structure is established by using two different methods, including: KD-Tree (high-latency slow mode) and low-latency fast mode (using laser radar calibration information).
- each node in the prediction tree is traversed, and the geometric position information of the node is predicted by selecting different prediction modes to obtain the prediction residual, and the geometric prediction residual is quantized using the quantization parameter.
- the prediction residual of the prediction tree node position information, the prediction tree structure, and the quantization parameters are encoded to generate a binary code stream.
- the decoding end reconstructs the prediction tree structure by continuously parsing the bit stream, and then obtains the geometric position prediction residual information and quantization parameters of each prediction node through parsing, and dequantizes the prediction residual to recover the reconstructed geometric position information of each node, and finally completes the geometric reconstruction of the decoding end.
- attribute encoding is mainly performed on color information.
- the color information is converted from the RGB color space to the YUV color space.
- the point cloud is recolored using the reconstructed geometric information so that the unencoded attribute information corresponds to the reconstructed geometric information.
- For color information encoding, there are two main transformation methods: one is the distance-based lifting transform that relies on LOD division, and the other is the direct RAHT transform. Both methods convert color information from the spatial domain to the frequency domain, and obtain high-frequency coefficients and low-frequency coefficients through the transformation.
- the coefficients are quantized and encoded to generate a binary code stream, as shown in Figures 4A and 4B.
- the Morton code can be used to search for the nearest neighbor.
- the Morton code corresponding to each point in the point cloud can be obtained from the geometric coordinates of the point.
- The specific method for calculating the Morton code is described as follows. Each component of the three-dimensional coordinate is represented by a d-bit binary number, and the three components can be expressed as x = x_1 x_2 … x_d, y = y_1 y_2 … y_d and z = z_1 z_2 … z_d, where x_1, y_1, z_1 are the highest (most significant) bits.
- The Morton code M interleaves the bits of x, y, z starting from the highest bit, arranging x_l, y_l, z_l in sequence down to the lowest bit.
- The calculation formula of M is as follows: M = Σ_{l=1}^{d} [ x_l·2^{3(d−l)+2} + y_l·2^{3(d−l)+1} + z_l·2^{3(d−l)} ].
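- The bit interleaving can be sketched as follows; this is an illustrative implementation of the formula above:

```python
def morton_code(x: int, y: int, z: int, d: int) -> int:
    """Interleave the d-bit components in the order x, y, z, starting from
    the most significant bit, as described above."""
    m = 0
    for l in range(d - 1, -1, -1):           # l = d-1 is the most significant bit
        m = (m << 1) | ((x >> l) & 1)
        m = (m << 1) | ((y >> l) & 1)
        m = (m << 1) | ((z >> l) & 1)
    return m


# example with d = 3: x = 0b011, y = 0b101, z = 0b110
# morton_code(3, 5, 6, 3) == 0b011101110 == 238
```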
- Condition 1: the geometric position is limitedly lossy and the attributes are lossy;
- Condition 3: the geometric position is lossless and the attributes are limitedly lossy;
- Condition 4: the geometric position and attributes are lossless.
- the general test sequences include four categories: Cat1A, Cat1B, Cat3-fused, and Cat3-frame.
- the Cat3-frame point cloud only contains reflectance attribute information
- the Cat1A and Cat1B point clouds only contain color attribute information
- the Cat3-fused point cloud contains both color and reflectance attribute information.
- the bounding box is divided into sub-cubes in sequence, and the non-empty sub-cubes (containing points in the point cloud) are divided again until the leaf node obtained by division is a 1×1×1 unit cube.
- the number of points contained in the leaf node needs to be encoded, and finally the encoding of the geometric octree is completed to generate a binary code stream.
- the decoding end obtains the placeholder code of each node by continuously parsing in the order of breadth-first traversal, and continuously divides the nodes in turn until a 1×1×1 unit cube is obtained.
- For geometric lossless decoding, it is necessary to parse the number of points contained in each leaf node and finally restore the geometrically reconstructed point cloud information.
- the prediction tree structure is established by using two different methods, including: based on KD-Tree (high-latency slow mode) and using lidar calibration information (low-latency fast mode).
- lidar calibration information each point can be divided into different Lasers, and the prediction tree structure is established according to different Lasers.
- each node in the prediction tree is traversed, and the geometric position information of the node is predicted by selecting different prediction modes to obtain the prediction residual, and the geometric prediction residual is quantized using the quantization parameter.
- the prediction residual of the prediction tree node position information, the prediction tree structure, and the quantization parameters are encoded to generate a binary code stream.
- the decoding end reconstructs the prediction tree structure by continuously parsing the bit stream, and then obtains the geometric position prediction residual information and quantization parameters of each prediction node through parsing, and dequantizes the prediction residual to restore the reconstructed geometric position information of each node, and finally completes the geometric reconstruction at the decoding end.
- the point coordinates of the point cloud input are (x, y, z).
- the position information of the point cloud is converted into the radar coordinate system (radius, laserIdx).
- the geometric coordinates of the point are pointPos
- the starting coordinates of the laser ray are LidarOrigin
- the number of lasers is LaserNum
- the tangent value of each Laser is tan θ_i
- the offset position of each Laser in the vertical direction is Z_i
- the calculation method of the node or point LaserIdx is as follows:
- the depth information radius is calculated as follows:
- LidarOrigin is generally 0.
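- As an illustration only, the following Python sketch approximates this conversion; the helper name to_lidar_coords and the nearest-laser selection rule are assumptions, since the exact formulas are defined by the codec specification and are not reproduced here.

```python
import math

def to_lidar_coords(point_pos, lidar_origin, tan_theta, z_offset):
    """Hypothetical sketch of converting (x, y, z) to (radius, laserIdx).

    tan_theta[i] and z_offset[i] are the per-laser elevation tangent tan θ_i and
    vertical offset Z_i; the selection rule below is an approximation of the spec.
    """
    x = point_pos[0] - lidar_origin[0]
    y = point_pos[1] - lidar_origin[1]
    z = point_pos[2] - lidar_origin[2]
    radius = math.sqrt(x * x + y * y)          # depth information of the point
    # pick the laser whose predicted height is closest to the point height
    laser_idx = min(
        range(len(tan_theta)),
        key=lambda i: abs(z - (tan_theta[i] * radius + z_offset[i])),
    )
    return radius, laser_idx
```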
- FIG20 shows a schematic diagram of the structure of inter-frame coding and decoding of geometric information.
- the current point to be coded in the current frame is shown filled with a grid pattern, and a denotes the previously coded node of the current point; there are a first reference frame and a second reference frame, wherein the first reference frame can be the previous frame of the current frame, and the second reference frame can be a global motion compensation (Global Motion Compensation, GMC) reference frame.
- GMC Global Motion Compensation
- the first reference frame i.e., the previous reference frame
- the second reference frame i.e., the reference frame of the previous frame after global motion
- in the second reference frame, find the point g that has the same horizontal azimuth angle and laserID as node a, and use the points e and f encoded or decoded after point g in the second reference frame as inter-frame candidate points; at the same time, the horizontal azimuth angles of point e and point f are replaced by that of the parent node of the current point to be encoded
- different prediction points including several intra-frame candidate points and up to 4 inter-frame candidate points
- RDO rate distortion optimization
- the geometric prediction residual is quantized using the quantization parameter.
- the prediction mode, prediction residual, prediction tree structure, quantization parameter and other parameters of the prediction tree node position information are encoded to generate a binary code stream.
- the decoding end continuously parses the bitstream, reconstructs the prediction tree structure, and traverses the prediction tree to find the previous decoded node a before the current point to be decoded;
- the prediction mode is decoded; if the prediction mode is inter-frame prediction mode, the prediction point is selected from the following at most four candidate points using the decoded prediction mode:
- the first reference frame i.e., the previous reference frame
- the second reference frame i.e., the reference frame of the previous frame after global motion
- in the second reference frame, find the point g that has the same horizontal azimuth angle and laserID as node a, and use the points e and f encoded or decoded after point g in the second reference frame as inter-frame candidate points; at the same time, the horizontal azimuth angles of point e and point f are replaced by that of the parent node of the current point to be decoded
- the geometric position prediction residual information and quantization parameters of different prediction points are obtained by analysis, and the prediction residual is dequantized, so that the reconstructed geometric position information of each node can be restored, and finally the geometric reconstruction at the decoding end is completed.
- inter-frame prediction mode number
- in the related technology, the inter-frame prediction mode number is converted into binary and encoded directly.
- that is, the existing technical solution simply splits the inter-frame prediction mode number into binary bits for direct encoding, and does not take into account the frequency of occurrence of the different inter-frame prediction mode numbers. As a result, the performance of encoding the inter-frame prediction mode number is not optimal, which reduces the encoding and decoding efficiency.
- an embodiment of the present application provides an encoding method to determine the inter-frame prediction mode value of the current node; determine the value of at least one mode identification information according to the inter-frame prediction mode value; wherein the mode identification information includes at least the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode; encode the value of at least one mode identification information, and write the obtained encoding bits into the bitstream.
- An embodiment of the present application also provides a decoding method: decoding a bit stream, and determining a value of at least one mode identification information; wherein the mode identification information includes at least the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode; according to the value of at least one mode identification information, determining the inter-frame prediction mode value of the current node; according to the inter-frame prediction mode value, determining the prediction value of the current node.
- the inter-frame prediction mode value is no longer converted into binary for direct encoding, but the value of at least one mode identification information is determined according to the inter-frame prediction mode value, and then the value of the at least one mode identification information is encoded; wherein, the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i.
- This takes into account the frequency distribution of the inter-frame prediction modes, that is, the more likely an inter-frame prediction mode is to appear, the smaller its corresponding inter-frame prediction mode value is, thereby reducing the number of encoding bits, saving bit rate, improving encoding and decoding efficiency, and also achieving the purpose of improving compression efficiency.
- FIG21 a schematic flow chart of a decoding method provided by an embodiment of the present application is shown. As shown in FIG21, the method may include:
- S2101 Decode a bitstream and determine a value of at least one mode identification information.
- the decoding method of the embodiment of the present application is applied to a decoder.
- the decoding method may specifically refer to a point cloud inter-frame prediction method; more specifically, a decoding method of a point cloud inter-frame geometric information coding mode, or a Golomb decoding method of a point cloud inter-frame geometric information coding mode, to achieve decoding processing of an inter-frame prediction mode value.
- the mode identification information includes at least the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; wherein i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode.
- as for the points in the point cloud, they may be all of the points in the point cloud, or may be some of the points in the point cloud that are relatively concentrated in space.
- the current node may specifically refer to the node currently to be decoded in the point cloud.
- the inter-frame prediction mode value can be from 0 to N, that is, there can be at most N+1 inter-frame prediction modes.
- the i-th mode identification information can be represented by flagi.
- for the mode identification information, at most N pieces of mode identification information can be included here, specifically: flag0, flag1, ..., flagN-1.
- decoding the code stream and determining the value of at least one mode identification information may include: decoding the code stream based on a first decoding mode and determining the value of at least one mode identification information.
- the first decoding mode may include at least one of the following: a decoding mode with fixed context information, a decoding mode with adaptive context information, and a decoding mode without using context information.
- the value of at least one mode identification information can be obtained by decoding using a decoding mode of fixed context information, or by decoding using a decoding mode of adaptive context information, or by decoding using a decoding mode that does not use context information, and no specific limitation is made herein.
- flagi can be decoded using a decoding mode of fixed context information/a decoding mode of adaptive context information/a decoding mode without using context information; then, based on the value of flagi, it can be determined whether the inter-frame prediction mode value of the current node is greater than or equal to i.
- decoding the bitstream and determining the value of at least one mode identification information may include:
- i+1 mode identification information is used as at least one mode identification information; wherein the i+1 mode identification information includes the 0th mode identification information, the 1st mode identification information, ..., the i-th mode identification information;
- if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is greater than i,
- i is updated to i+1, and the code stream continues to be decoded to determine the value of the i-th mode identification information, until the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is less than or equal to i.
- the method may further include:
- if the value of the i-th mode identification information is the first value, it is determined that the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is less than or equal to i;
- if the value of the i-th mode identification information is the second value, it is determined that the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is greater than i.
- the first value is different from the second value, and the first value and the second value can be in parameter form or in digital form.
- the i-th mode identification information can be a parameter written in the profile or a flag value, which is not specifically limited here.
- the first value can be set to 1 and the second value can be set to 0; or, the first value can be set to 0 and the second value can be set to 1; or, the first value can be set to true and the second value can be set to false; or, the first value can be set to false and the second value can be set to true; but this is not specifically limited here.
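- A minimal sketch of the above parsing loop is given below, assuming the convention that a flag value of 0 means "less than or equal to i" and 1 means "greater than i"; decode_flag(i) is a hypothetical helper standing for the fixed-context/adaptive-context/bypass decoding of flagi.

```python
def decode_inter_mode_flags_only(decode_flag, n_max):
    """Sketch of the ">= i" flag scheme without a Golomb tail (n_max = N).

    decode_flag(i) returns the decoded value of flag_i; a value of 0 is
    assumed to mean "mode value <= i" and 1 to mean "mode value > i".
    """
    for i in range(n_max):                # at most flag0 .. flag(N-1)
        if decode_flag(i) == 0:           # flag_i says the mode value is <= i
            return i                      # so the mode value equals i
    return n_max                          # every flag said "greater": mode = N
```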
- decoding the bitstream and determining the value of at least one mode identification information may include:
- i+1 mode identification information is used as at least one mode identification information; wherein the i+1 mode identification information includes the 0th mode identification information, the 1st mode identification information, ..., the i-th mode identification information;
- if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is not equal to i, then i is updated to i+1 and the code stream continues to be decoded to determine the value of the i-th mode identification information, until the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is equal to i.
- the method may further include:
- if the value of the i-th mode identification information is the first value, it is determined that the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is not equal to i;
- if the value of the i-th mode identification information is the second value, it is determined that the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is equal to i.
- the first value is different from the second value.
- Taking flagi written into the bitstream as an example, assuming that the first value is set to 0 and the second value is set to 1: if the value of flagi is 0, then it can be determined that the inter-frame prediction mode value of the current node is not equal to i; if the value of flagi is 1, then it can be determined that the inter-frame prediction mode value of the current node is equal to i.
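- Under that assumed convention, the "equal to i" variant of the parsing loop can be sketched as follows; decode_flag(i) is again a hypothetical helper.

```python
def decode_inter_mode_eq_flags_only(decode_flag, n_max):
    """Sketch of the "== i" variant: flag_i = 1 means the mode value equals i."""
    for i in range(n_max):                # at most flag0 .. flag(N-1)
        if decode_flag(i) == 1:           # mode value equals i, stop parsing
            return i
    return n_max                          # no flag matched: mode value is N
```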
- S2102 Determine an inter-frame prediction mode value of a current node according to a value of at least one mode identification information.
- the inter-frame prediction mode value of the current node can be determined according to the value of the at least one mode identification information.
- determining the inter-frame prediction mode value of the current node based on the value of at least one mode identification information may include: when the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is less than or equal to i, setting the inter-frame prediction mode value of the current node to be equal to i.
- the method may further include:
- if the (N-1)-th mode identification information indicates that the inter-frame prediction mode value of the current node is less than or equal to N-1, the inter-frame prediction mode value of the current node is set to be equal to N-1;
- if the (N-1)-th mode identification information indicates that the inter-frame prediction mode value of the current node is greater than N-1, the inter-frame prediction mode value of the current node is set to be equal to N.
- first decode flag0 using the decoding mode of fixed context information/the decoding mode of adaptive context information/the decoding mode without using context information. If the value of flag0 indicates that the inter-frame prediction mode value of the current node is less than or equal to 0, then end the decoding operation, and the inter-frame prediction mode value is equal to 0 at this time. If the value of flag0 indicates that the inter-frame prediction mode value of the current node is greater than 0, decode flag1 using the decoding mode of fixed context information/the decoding mode of adaptive context information/the decoding mode without using context information.
- If the value of flag1 indicates that the inter-frame prediction mode value of the current node is less than or equal to 1, then end the decoding operation, and the inter-frame prediction mode value is equal to 1 at this time. If the value of flag1 indicates that the inter-frame prediction mode value of the current node is greater than 1, the decoding mode of fixed context information/the decoding mode of adaptive context information/the decoding mode without using context information is used to decode flag2.
- determining the inter-frame prediction mode value of the current node according to the value of at least one mode identification information may include:
- when the M-th mode identification information indicates that the inter-frame prediction mode value of the current node is greater than M, decoding the bitstream based on the second decoding mode to determine the first inter-frame prediction mode residual value of the current node;
- the inter-frame prediction mode value of the current node is determined according to the values of the M+1 mode identification information and the first inter-frame prediction mode residual value.
- M+1 mode identification information may include: 0th mode identification information, 1st mode identification information, ..., Mth mode identification information; M is an integer greater than or equal to 0 and less than N.
- the second decoding mode may be an Exponential-Golomb decoding method, such as a K-order Exponential-Golomb decoding method.
- the K-order Exponential-Golomb coding is a lossless data compression method that can achieve a very high coding efficiency; therefore, in order to improve the coding efficiency, when the inter-frame prediction mode value of the current node is greater than M, the Exponential-Golomb decoding method can be used to decode the residual value of the first inter-frame prediction mode.
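- For illustration, a simple sketch of K-order Exponential-Golomb decoding is given below; read_bit() is a hypothetical bit-reader helper, and the actual entropy-coding details follow the codec specification.

```python
def exp_golomb_decode(read_bit, k):
    """Sketch of K-order Exponential-Golomb decoding of a non-negative value.

    read_bit() is assumed to return the next bit (0 or 1) from the bitstream.
    """
    leading_zeros = 0
    while read_bit() == 0:                   # count the unary prefix of zeros
        leading_zeros += 1
    value = 1                                # the terminating '1' already read
    for _ in range(leading_zeros + k):       # read the remaining info bits
        value = (value << 1) | read_bit()
    return value - (1 << k)                  # remove the offset so codes start at 0
```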
- the method may further include: initializing the inter prediction mode value to 0.
- the inter-frame prediction mode value is initialized to 0; secondly, the decoding mode of fixed context information/the decoding mode of adaptive context information/the decoding mode without using context information is used to decode flag0.
- if the value of flag0 indicates that the inter-frame prediction mode value of the current node is less than or equal to 0, the decoding operation is terminated, and the inter-frame prediction mode value is equal to 0; if the value of flag0 indicates that the inter-frame prediction mode value of the current node is greater than 0, the decoding mode of fixed context information/the decoding mode of adaptive context information/the decoding mode without using context information is used to decode flag1;
- by analogy, decode flagM using a decoding mode with fixed context information/a decoding mode with adaptive context information/a decoding mode without context information; if the value of flagM indicates that the inter-frame prediction mode value of the current node is less than or equal to M, terminate the decoding operation, and the inter-frame prediction mode value is equal to M; if the value of flagM indicates that the inter-frame prediction mode value of the current node is greater than M, use the Exponential-Golomb decoding method to decode the first inter-frame prediction mode residual value; then determine the inter-frame prediction mode value of the current node based on the values of the M+1 mode identification information and the first inter-frame prediction mode residual value.
- determining the inter-frame prediction mode value of the current node based on the values of M+1 mode identification information and the residual value of the first inter-frame prediction mode may include: adding the values of the M+1 mode identification information and the residual value of the first inter-frame prediction mode to determine the inter-frame prediction mode value of the current node.
- interMode = residualMode1 + (flag0 + flag1 + ... + flagM)   (20)
- determining the inter-frame prediction mode value of the current node based on the values of M+1 mode identification information and the residual value of the first inter-frame prediction mode may include: adding the residual value of the first inter-frame prediction mode to (M+1) to determine the inter-frame prediction mode value of the current node.
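- Combining the flag prefix with the Exponential-Golomb tail, the decoder-side logic can be sketched as follows; decode_flag(i) and decode_egk() are hypothetical helpers, and a flag value of 1 is assumed to mean "greater than i", so that the sum of the M+1 flag values equals M+1 and matches formula (20).

```python
def decode_inter_mode_with_golomb(decode_flag, decode_egk, m_max):
    """Sketch of decoding the mode value with a flag prefix and a Golomb tail.

    A flag value of 1 is taken to mean "mode value greater than i"; if every
    flag up to flagM is 1, the Exponential-Golomb residual is added to their
    sum (M + 1), which is formula (20) above.
    """
    flags = []
    for i in range(m_max + 1):            # decode flag0 .. flagM
        flag_i = decode_flag(i)
        flags.append(flag_i)
        if flag_i == 0:                   # flag_i says the mode value is <= i
            return i                      # so the mode value equals i
    residual = decode_egk()               # first inter prediction mode residual
    return residual + sum(flags)          # formula (20): residual + (M + 1)
```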
- determining the inter-frame prediction mode value of the current node according to the value of at least one mode identification information may include: when the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is equal to i, setting the inter-frame prediction mode value of the current node to be equal to i.
- the method may further include:
- if the (N-1)-th mode identification information indicates that the inter-frame prediction mode value of the current node is equal to N-1, the inter-frame prediction mode value of the current node is set to be equal to N-1;
- if the (N-1)-th mode identification information indicates that the inter-frame prediction mode value of the current node is not equal to N-1, the inter-frame prediction mode value of the current node is set to be equal to N.
- if the value of flag1 indicates that the inter-frame prediction mode value of the current node is equal to 1, the decoding operation is terminated, and the inter-frame prediction mode value is equal to 1 at this time. If the value of flag1 indicates that the inter-frame prediction mode value of the current node is not equal to 1, then use the decoding mode of fixed context information/the decoding mode of adaptive context information/the decoding mode without using context information to decode flag2.
- determining the inter-frame prediction mode value of the current node according to the value of at least one mode identification information may include:
- when the M-th mode identification information indicates that the inter-frame prediction mode value of the current node is not equal to M, decoding the bitstream based on the second decoding mode to determine the second inter-frame prediction mode residual value of the current node;
- the inter-frame prediction mode value of the current node is determined according to the values of the M+1 mode identification information and the second inter-frame prediction mode residual value.
- M+1 mode identification information may include: 0th mode identification information, 1st mode identification information, ..., Mth mode identification information; M is an integer greater than or equal to 0 and less than N.
- the second decoding mode may be an Exponential Golomb decoding method, such as a K-order Exponential Golomb decoding method.
- K-order Exponential Golomb is a lossless data compression method that can achieve very high coding efficiency; therefore, in order to improve coding efficiency, when the inter-frame prediction mode value of the current node is not equal to M, the Exponential Golomb decoding method may also be used to decode the residual value of the second inter-frame prediction mode.
- the method may further include: initializing the inter prediction mode value to 0.
- the inter-frame prediction mode value is initialized to 0; secondly, the decoding mode of fixed context information/the decoding mode of adaptive context information/the decoding mode without using context information is used to decode flag0. If the value of flag0 indicates that the inter-frame prediction mode value of the current node is equal to 0, the decoding operation is terminated, and the inter-frame prediction mode value is equal to 0 at this time; if the value of flag0 indicates that the inter-frame prediction mode value of the current node is not equal to 0, the decoding mode of fixed context information/the decoding mode of adaptive context information/the decoding mode without using context information is used to decode flag1.
- by analogy, decode flagM using the decoding mode of fixed context information/the decoding mode of adaptive context information/the decoding mode without using context information; if the value of flagM indicates that the inter-frame prediction mode value of the current node is equal to M, then end the decoding operation, and the inter-frame prediction mode value is equal to M; if the value of flagM indicates that the inter-frame prediction mode value of the current node is not equal to M, then use the Exponential-Golomb decoding method to decode the second inter-frame prediction mode residual value; then determine the inter-frame prediction mode value of the current node based on the values of the M+1 mode identification information and the second inter-frame prediction mode residual value.
- determining the inter-frame prediction mode value of the current node based on the values of M+1 mode identification information and the residual value of the second inter-frame prediction mode can include: performing a negation operation on the values of the M+1 mode identification information to determine the negated value of the M+1 mode identification information; performing an addition operation on the negated value of the M+1 mode identification information and the residual value of the second inter-frame prediction mode to determine the inter-frame prediction mode value of the current node.
- the first value is set to 0 and the second value is set to 1, that is, when the value of flagi is 0, it is determined that the inter-frame prediction mode value of the current node is not equal to i, and the decoding operation needs to be continued.
- M+1 mode identification information such as flag0, flag1, ..., flagM
- determining the inter-frame prediction mode value of the current node based on the values of the M+1 mode identification information and the second inter-frame prediction mode residual value may include: performing an addition operation on the second inter-frame prediction mode residual value and (M+1) to determine the inter-frame prediction mode value of the current node.
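- The "equal to i" variant with the Exponential-Golomb tail can be sketched similarly, assuming a flag value of 0 means "not equal to i"; decode_flag(i) and decode_egk() are hypothetical helpers.

```python
def decode_inter_mode_eq_with_golomb(decode_flag, decode_egk, m_max):
    """Sketch of the "== i" variant with a Golomb tail (flag 0 = "not equal to i")."""
    flags = []
    for i in range(m_max + 1):            # decode flag0 .. flagM
        flag_i = decode_flag(i)
        flags.append(flag_i)
        if flag_i == 1:                   # mode value equals i, stop here
            return i
    residual = decode_egk()               # second inter prediction mode residual
    negated = [1 - f for f in flags]      # negation of the M + 1 flag values (all 1)
    return residual + sum(negated)        # equivalently residual + (M + 1)
```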
- the inter-frame prediction mode value of the current node can be determined according to the value of the i-th mode identification information; or, in order to improve the coding efficiency, the exponential Golomb decoding method can be combined to determine the inter-frame prediction mode value of the current node.
- S2103 Determine the prediction value of the current node according to the inter-frame prediction mode value.
- the selected node can be determined according to the inter-frame prediction mode value, and then the prediction value of the current node can be determined.
- determining the prediction value of the current node according to the inter-frame prediction mode value may include: determining the selected node from at least one candidate node according to the inter-frame prediction mode value; and determining the prediction value of the current node according to the selected node.
- At least one candidate node includes at least one of the following: at least one second candidate node and at least one fourth candidate node; wherein at least one second candidate node is a candidate node in the first reference frame, and at least one fourth candidate node is a candidate node in the second reference frame.
- at least one candidate node may be composed of at least one second candidate node and/or at least one fourth candidate node, and the number of candidate nodes is not limited here.
- the current node is located in the current frame, wherein the current frame refers to the frame to be decoded, the first reference frame and the second reference frame are both decoded frames, and the first reference frame is different from the second reference frame.
- the first reference frame may be the previous K frames of the current frame, where K is an integer greater than 0; the second reference frame may be obtained by performing global motion on the first reference frame.
- the first reference frame may be the previous frame of the current frame; the second reference frame may be obtained by performing global motion on the previous frame.
- the first reference frame may be a previous frame of the current frame; the second reference frame may be a previous frame of the previous frame of the current frame.
- the current frame is Frame t
- the first reference frame may be Frame t-1
- the second reference frame may be Frame t-2
- t is an integer.
- the method may further include:
- based on the first candidate node, determining the 1st second candidate node to the p-th second candidate node in a preset manner in the first reference frame;
- the selected node is determined according to the pth second candidate node; wherein p is a positive integer greater than 0, and the value of p is associated with the inter-frame prediction mode value.
- the method may further include:
- based on the third candidate node, determining the 1st fourth candidate node to the q-th fourth candidate node in a preset manner in the second reference frame;
- the selected node is determined according to the q-th fourth candidate node; wherein q is a positive integer greater than 0, and the value of q is associated with the inter-frame prediction mode value.
- determining the previous decoded node of the current node may include: determining a prediction tree corresponding to the current frame; and determining the previous decoded node of the current node based on the decoding order of the prediction tree.
- two different methods can be used to construct the prediction tree structure, which can include: KD-Tree (high latency slow mode) and low latency fast mode (using laser radar calibration information).
- KD-Tree high latency slow mode
- low latency fast mode using laser radar calibration information.
- the decoding order of the prediction tree can be one of the following: unordered, Morton order, azimuth order, radial distance order, etc., which is not specifically limited here.
- the prediction tree structure is reconstructed by decoding the bitstream, and then the prediction tree is traversed to determine the previous decoded node of the current node in the decoding order of the prediction tree.
- the geometric parameters here refer to the parameters in the radar coordinate system.
- the geometric parameters may include: the horizontal azimuth angle and the radar laser index serial number laserID.
- the geometric parameters satisfy the first condition, which may include: the radar laser index serial number is the same as the radar laser index serial number of the previous decoded node; the horizontal azimuth angle is the same as the horizontal azimuth angle of the previous decoded node.
- the geometric parameters of the previous decoded node and the first candidate node satisfy the first condition, which may specifically include: the radar laser index serial number of the first candidate node is the same as the radar laser index serial number of the previous decoded node; and the horizontal azimuth angle of the first candidate node is the same as the horizontal azimuth angle of the previous decoded node.
- the geometric parameters of the previous decoded node and the third candidate node satisfy the first condition, which may specifically include: the radar laser index serial number of the third candidate node is the same as the radar laser index serial number of the previous decoded node; and the horizontal azimuth angle of the third candidate node is the same as the horizontal azimuth angle of the previous decoded node.
- that is, the first candidate node found has the same laserID as the laserID of the previous decoded node, and the same horizontal azimuth angle as the previous decoded node;
- the third candidate node found has the same laserID as the laserID of the previous decoded node, and the same horizontal azimuth angle as the previous decoded node.
- the geometric parameters satisfy the first condition, which may include: the radar laser index serial number is the same as the radar laser index serial number of the previous decoded node; the horizontal azimuth angle is greater than and closest to the horizontal azimuth angle of the previous decoded node.
- the geometric parameters of the previous decoded node and the first candidate node satisfy the first condition, which may specifically include: the radar laser index serial number of the first candidate node is the same as the radar laser index serial number of the previous decoded node; and the horizontal azimuth angle of the first candidate node is greater than and closest to the horizontal azimuth angle of the previous decoded node.
- the geometric parameters of the previous decoded node and the third candidate node satisfy the first condition, which may specifically include: the radar laser index serial number of the third candidate node is the same as the radar laser index serial number of the previous decoded node; and the horizontal azimuth angle of the third candidate node is greater than and closest to the horizontal azimuth angle of the previous decoded node.
- that is, the first candidate node found has the same laserID as the laserID of the previous decoded node, and is the first node whose horizontal azimuth angle is greater than that of the previous decoded node;
- the third candidate node found has the same laserID as the laserID of the previous decoded node, and is the first node whose horizontal azimuth angle is greater than that of the previous decoded node.
- the geometric parameters satisfy the first condition, which may include: the radar laser index serial number is the same as the radar laser index serial number of the previous decoded node; the horizontal azimuth angle is less than and closest to the horizontal azimuth angle of the previous decoded node.
- the geometric parameters of the previous decoded node and the first candidate node satisfy the first condition, which may specifically include: the radar laser index serial number of the first candidate node is the same as the radar laser index serial number of the previous decoded node; and the horizontal azimuth angle of the first candidate node is less than and closest to the horizontal azimuth angle of the previous decoded node.
- the geometric parameters of the previous decoded node and the third candidate node satisfy the first condition, which may specifically include: the radar laser index serial number of the third candidate node is the same as the radar laser index serial number of the previous decoded node; and the horizontal azimuth angle of the third candidate node is less than and closest to the horizontal azimuth angle of the previous decoded node.
- that is, the first candidate node found has the same laserID as the laserID of the previous decoded node, and is the first node whose horizontal azimuth angle is smaller than that of the previous decoded node;
- the third candidate node found has the same laserID as the laserID of the previous decoded node, and is the first node whose horizontal azimuth angle is smaller than that of the previous decoded node.
- the second reference frame is a reference frame obtained by global motion of the first reference frame
- the horizontal azimuth angles of the third candidate node and the at least one fourth candidate node need to be replaced by the horizontal azimuth angle of the parent node of the current node
- the preset manner may be based on the decoding order of the prediction tree, or may be based on the order of the magnitude of the horizontal azimuth angle, which is not specifically limited here. The following describes these two specific implementations.
- determining the first second candidate node to the pth second candidate node in a preset manner in the first reference frame may include: determining the first second candidate node, the second second candidate node, ..., the pth second candidate node decoded after the first candidate node in the first reference frame in sequence according to the decoding order of the prediction tree.
- determining the first fourth candidate node to the qth fourth candidate node in a preset manner in the second reference frame may include: determining the first fourth candidate node, the second fourth candidate node, ..., the qth fourth candidate node decoded after the third candidate node in the second reference frame in sequence according to the decoding order of the prediction tree.
- the at least one second candidate node includes a node c and a node d.
- the previous decoded node a of the current node is first determined; then the first candidate node b whose geometric parameters satisfy the first condition with the previous decoded node a is determined; according to the decoding order of the prediction tree, the first second candidate node c and the second second candidate node d encoded or decoded after the first candidate node b are determined in sequence in the first reference frame; the second second candidate node d determined at this time is the selected node.
- determining the 1st second candidate node to the p-th second candidate node in a preset manner in the first reference frame may include: determining, in the first reference frame, in order of magnitude of the horizontal azimuth angles, the 1st second candidate node whose horizontal azimuth angle is greater than and closest to the horizontal azimuth angle of the first candidate node, the 2nd second candidate node whose horizontal azimuth angle is greater than and closest to the horizontal azimuth angle of the 1st second candidate node, ..., and the p-th second candidate node whose horizontal azimuth angle is greater than and closest to the horizontal azimuth angle of the (p-1)-th second candidate node; wherein the radar laser index serial numbers of the 1st second candidate node, the 2nd second candidate node, ..., and the p-th second candidate node are all the same as the radar laser index serial number of the previous decoded node.
- determining the 1st to q-th fourth candidate nodes in a preset manner in the second reference frame may include: determining, in order of horizontal azimuth angles in the second reference frame, the 1st fourth candidate node whose horizontal azimuth angle is greater than and closest to the horizontal azimuth angle of the third candidate node, the 2nd fourth candidate node whose horizontal azimuth angle is greater than and closest to the horizontal azimuth angle of the 1st fourth candidate node, ..., and the q-th fourth candidate node whose horizontal azimuth angle is greater than and closest to the horizontal azimuth angle of the (q-1)-th fourth candidate node; wherein the radar laser index serial numbers of the 1st fourth candidate node, the 2nd fourth candidate node, ..., and the q-th fourth candidate node are all the same as the radar laser index serial number of the previous decoded node.
- At least one second candidate node includes node c and node d.
- the selected node is determined to be node d according to the inter-frame prediction mode value
- the previous decoded node a of the current node is first determined; then the first candidate node b whose geometric parameters satisfy the first condition with the previous decoded node a is determined; and then, in order of horizontal azimuth angle:
- the 1st second candidate node c, which has the same radar laser index serial number and the first horizontal azimuth angle greater than the horizontal azimuth angle of the first candidate node b, is determined, and the 2nd second candidate node d, which has the same radar laser index serial number and the first horizontal azimuth angle greater than the horizontal azimuth angle of the 1st second candidate node c, is determined; the 2nd second candidate node d determined at this time is the selected node.
- the first reference frame i.e., the previous reference frame
- the second reference frame i.e., the reference frame of the previous frame after global motion
- nodes e and f encoded or decoded after node g in the second reference frame are used as inter-frame candidate points
- the horizontal azimuth angles of nodes e and f are replaced by the horizontal azimuth angle of the parent node of the current node
- the selected node can be determined from at least one of node c, node d, node e and node f, and then the prediction value of the current node can be determined; alternatively, the first candidate node and the second candidate node can also be included here, so the selected node can also be determined from at least one of node b, node g, node c, node d, node e and node f, and then the prediction value of the current node can be determined; there is no specific limitation on this.
- the selected node is determined according to the inter-frame prediction mode value, and an inter-frame candidate node set that is the same as the encoding end can also be constructed here.
- the inter-frame candidate node set can be composed of at least one of node c, node d, node e and node f, or at least one of node b, node g, node c, node d, node e and node f. Then, using the inter-frame prediction mode value obtained by decoding, the selected node can be determined from the inter-frame candidate node set, and then the prediction value of the current node can be determined.
- the method may also include: decoding the code stream to determine the prediction residual value and quantization parameter of the current node; dequantizing the prediction residual value according to the quantization parameter to obtain a dequantized residual value; and determining the reconstruction information of the current node based on the dequantized residual value and the prediction value.
- determining the reconstruction information of the current node according to the inverse quantization residual value and the prediction value may include: performing an addition operation according to the inverse quantization residual value and the prediction value to determine the reconstruction information of the current node.
- the predicted residual value of the current node is obtained by decoding the code stream, and the quantization parameter is obtained by decoding the code stream; then the predicted residual value is inversely quantized according to the quantization parameter to obtain the inverse quantized residual value; then the inverse quantized residual value and the predicted value are summed to obtain the reconstruction information of the current node, such as restoring the reconstructed geometric position information of the current node, and finally completing the geometric reconstruction at the decoding end.
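- A minimal sketch of this reconstruction step is given below; the scalar quantization step qstep and the element-wise dequantization rule are assumptions used only for illustration.

```python
def reconstruct_position(pred_value, residual, qstep):
    """Sketch of reconstructing the node position at the decoder.

    pred_value and residual are per-component lists; qstep is a hypothetical
    scalar quantization step derived from the quantization parameter.
    """
    dequantized = [r * qstep for r in residual]               # inverse quantization
    return [p + d for p, d in zip(pred_value, dequantized)]   # prediction + residual
```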
- the decoding method is mainly for decoding optimization of the inter-frame prediction mode value, and here a flag bit can also be used to determine whether the current node uses the inter-frame prediction mode. Therefore, in some embodiments, the method also includes: decoding the code stream to determine the value of the first identification information; if the first identification information indicates that the current node uses the inter-frame prediction mode, then performing the step of decoding the code stream to determine the value of at least one mode identification information.
- the method further includes:
- if the value of the first identification information is the third value, it is determined that the first identification information indicates that the current node does not use the inter-frame prediction mode;
- if the value of the first identification information is the fourth value, it is determined that the first identification information indicates that the current node uses the inter-frame prediction mode.
- the third value is different from the fourth value, and the third value and the fourth value can be in parameter form or in digital form.
- the first identification information can be a parameter written in the profile or a flag value, which is not specifically limited here.
- the third value can be set to 1 and the fourth value can be set to 0; or, the third value can be set to 0 and the fourth value can be set to 1; or, the third value can be set to true and the fourth value can be set to false; or, the third value can be set to false and the fourth value can be set to true; but this is not specifically limited here.
- a flag bit can be set to determine whether to enable the decoding method of the embodiment of the present application. Therefore, in some embodiments, the method further includes: decoding the code stream to determine the value of the second identification information; if the second identification information indicates that the current node enables the target inter-frame decoding mode, then performing the step of decoding the code stream to determine the value of at least one mode identification information.
- the method further includes:
- if the value of the second identification information is the fifth value, it is determined that the second identification information indicates that the current node does not enable the target inter-frame decoding mode;
- if the value of the second identification information is the sixth value, it is determined that the second identification information indicates that the current node enables the target inter-frame decoding mode.
- the fifth value is different from the sixth value, and the fifth value and the sixth value can be in parameter form or in digital form.
- the second identification information can be a parameter written in the profile or a value of a flag, which is not specifically limited here.
- the fifth value can be set to 1 and the sixth value can be set to 0; or, the fifth value can be set to 0 and the sixth value can be set to 1; or, the fifth value can be set to true and the sixth value can be set to false; or, the fifth value can be set to false and the sixth value can be set to true; but no specific limitation is made here.
- a 1-bit flag (i.e., the second identification information) can be used here to indicate whether the target inter-frame decoding mode is enabled or not.
- This flag can be placed in the header information of the high-level syntax element, such as the geometry header; and this flag can be conditionally enabled under certain conditions. If this flag does not appear in the bitstream, its default value is a fixed value. At the decoding end, if this flag does not appear in the bitstream, decoding may not be performed, and its default value is a fixed value.
- This embodiment provides a decoding method, firstly decoding a code stream, determining the value of at least one mode identification information; wherein the mode identification information includes at least the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode; then determining the inter-frame prediction mode value of the current node according to the value of at least one mode identification information; and then determining the prediction value of the current node according to the inter-frame prediction mode value.
- the inter-frame prediction mode value is no longer converted into binary for direct decoding, but the value of at least one mode identification information is determined by decoding the code stream, and then the inter-frame prediction mode value is determined according to the value of the at least one mode identification information; wherein the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i, so that the frequency distribution of the inter-frame prediction modes is taken into account, that is, the more likely an inter-frame prediction mode is to appear, the smaller its corresponding inter-frame prediction mode value is, so that the number of coding bits can be reduced, the bit rate can be saved, and the encoding and decoding efficiency can be improved.
- FIG22 a schematic diagram of a flow chart of an encoding method provided in an embodiment of the present application is shown. As shown in FIG22, the method may include:
- S2201 Determine the inter-frame prediction mode value of the current node.
- the encoding method of the embodiment of the present application is applied to an encoder.
- the encoding method may specifically refer to a point cloud inter-frame prediction method; more specifically, it may be an encoding method of a point cloud inter-frame geometric information encoding mode, or it may also be a Golomb encoding method of a point cloud inter-frame geometric information encoding mode, to achieve encoding processing of an inter-frame prediction mode value.
- as for the points in the point cloud, they may be all of the points in the point cloud, or may be some of the points in the point cloud that are relatively concentrated in space.
- the current node can specifically refer to the node to be encoded in the point cloud.
- determining the inter-frame prediction mode value of the current node may include:
- inter-frame candidate node set wherein the inter-frame candidate node set includes at least one candidate node
- a selected node is determined from the inter-frame candidate node set, and an inter-frame prediction mode value of the current node is determined according to an index position of the selected node in the inter-frame candidate node set.
- determining the selected node from the inter-frame candidate node set may include:
- a cost calculation is performed on at least one candidate node in the inter-frame candidate node set to determine the cost value of each candidate node; the minimum cost value is determined from the cost values of the at least one candidate node, and the candidate node corresponding to the minimum cost value is used as the selected node.
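- For illustration, the selection can be sketched as follows; cost_of(node) is a hypothetical rate-distortion cost function, and the index of the chosen candidate in the set gives the inter-frame prediction mode value.

```python
def select_candidate(candidates, cost_of):
    """Sketch of the RDO-style selection: pick the candidate with minimum cost.

    Returns the selected node and its index in the candidate set, the index
    being used as the inter-frame prediction mode value of the current node.
    """
    costs = [cost_of(node) for node in candidates]             # per-candidate cost
    best_index = min(range(len(candidates)), key=lambda i: costs[i])
    return candidates[best_index], best_index
```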
- the inter-frame candidate node set includes at least one candidate node.
- the inter-frame candidate node set may include one candidate node, two candidate nodes, or more candidate nodes, which is not specifically limited here.
- the method may include:
- S2301 Determine the previous encoded node of the current node.
- S2302 Determine a first candidate node whose geometric parameters satisfy a first condition with those of a previously encoded node in a first reference frame, and determine at least one second candidate node in the first reference frame based on the first candidate node.
- S2303 Determine a third candidate node whose geometric parameters satisfy the first condition with those of the previous encoded node in the second reference frame, determine at least one fourth candidate node in the second reference frame based on the third candidate node, and set the horizontal azimuth angle of at least one fourth candidate node to the horizontal azimuth angle of the parent node of the current node.
- S2304 Determine an inter-frame candidate node set according to at least one second candidate node and/or at least one fourth candidate node.
- the current node is located in the current frame.
- the current frame refers to the frame to be encoded
- the first reference frame and the second reference frame are both frames that have been encoded
- the first reference frame is different from the second reference frame.
- determining the previous encoded node of the current node may include: determining a prediction tree corresponding to the current frame; and determining the previous encoded node of the current node based on the encoding order of the prediction tree.
- two different methods can be used to construct the prediction tree structure, which can include: KD-Tree (high latency slow mode) and low latency fast mode (using laser radar calibration information).
- KD-Tree high latency slow mode
- low latency fast mode using laser radar calibration information.
- the encoding order of the prediction tree can be one of the following: unordered, Morton order, azimuth order, radial distance order, etc., which is not specifically limited here.
- the first reference frame may be the previous K frames of the current frame, where K is an integer greater than 0; the second reference frame may be obtained by performing global motion on the first reference frame.
- the first reference frame may be the previous frame of the current frame; the second reference frame may be obtained by performing global motion on the previous frame.
- the first reference frame may be a previous frame of the current frame; the second reference frame may be a previous frame of the previous frame of the current frame.
- the current frame is Frame t
- the first reference frame may be Frame t-1
- the second reference frame may be Frame t-2
- t is an integer.
- the geometric parameters here refer to the parameters in the radar coordinate system.
- the geometric parameters may include: the horizontal azimuth angle and the radar laser index serial number laserID.
- the geometric parameters satisfy the first condition, which may include: the radar laser index serial number is the same as the radar laser index serial number of the previous encoded node; the horizontal azimuth angle is the same as the horizontal azimuth angle of the previous encoded node.
- determining a first candidate node whose geometric parameters satisfy the first condition with those of the previous encoded node in the first reference frame may specifically include: the radar laser index serial number of the first candidate node is the same as the radar laser index serial number of the previous encoded node; and the horizontal azimuth angle of the first candidate node is the same as the horizontal azimuth angle of the previous encoded node.
- a third candidate node whose geometric parameters satisfy the first condition with those of the previous encoded node in the second reference frame is determined, which may specifically include: the radar laser index serial number of the third candidate node is the same as the radar laser index serial number of the previous encoded node; and the horizontal azimuth angle of the third candidate node is the same as the horizontal azimuth angle of the previous encoded node.
- that is, the first candidate node found has the same laserID as the laserID of the previous encoded node, and the same horizontal azimuth angle as the previous encoded node;
- the third candidate node found has the same laserID as the laserID of the previous encoded node, and the same horizontal azimuth angle as the previous encoded node.
- the geometric parameters satisfy the first condition, which may include: the radar laser index serial number is the same as the radar laser index serial number of the previous encoded node; the horizontal azimuth angle is greater than and closest to the horizontal azimuth angle of the previous encoded node.
- determining a first candidate node whose geometric parameters satisfy the first condition with those of the previous encoded node in the first reference frame may specifically include: the radar laser index serial number of the first candidate node is the same as the radar laser index serial number of the previous encoded node; and the horizontal azimuth angle of the first candidate node is greater than and closest to the horizontal azimuth angle of the previous encoded node.
- a third candidate node whose geometric parameters satisfy the first condition with those of the previous encoded node in the second reference frame is determined, which may specifically include: the radar laser index serial number of the third candidate node is the same as the radar laser index serial number of the previous encoded node; and the horizontal azimuth angle of the third candidate node is greater than and closest to the horizontal azimuth angle of the previous encoded node.
- that is, the first candidate node found has the same laserID as the laserID of the previous encoded node, and is the first node whose horizontal azimuth angle is greater than that of the previous encoded node;
- the third candidate node found has the same laserID as the laserID of the previous encoded node, and is the first node whose horizontal azimuth angle is greater than that of the previous encoded node.
- the geometric parameters satisfying the first condition may include: the radar laser index serial number is the same as the radar laser index serial number of the previous encoded node; the horizontal azimuth angle is less than and closest to the horizontal azimuth angle of the previous encoded node.
- determining a first candidate node whose geometric parameters satisfy the first condition with those of the previous encoded node in the first reference frame may specifically include: the radar laser index serial number of the first candidate node is the same as the radar laser index serial number of the previous encoded node; and the horizontal azimuth angle of the first candidate node is smaller than and closest to the horizontal azimuth angle of the previous encoded node.
- a third candidate node whose geometric parameters satisfy the first condition with those of the previous encoded node in the second reference frame is determined, which may specifically include: the radar laser index serial number of the third candidate node is the same as the radar laser index serial number of the previous encoded node; and the horizontal azimuth angle of the third candidate node is smaller than and closest to the horizontal azimuth angle of the previous encoded node.
- that is, the first candidate node found has the same laserID as the laserID of the previous encoded node, and is the first node whose horizontal azimuth angle is smaller than that of the previous encoded node;
- the third candidate node found has the same laserID as the laserID of the previous encoded node, and is the first node whose horizontal azimuth angle is smaller than that of the previous encoded node.
- the second reference frame is a reference frame obtained by global motion of the first reference frame
- the horizontal azimuth angles of the third candidate node and the at least one fourth candidate node need to be replaced by the horizontal azimuth angle of the parent node of the current node
- determining at least one second candidate node in the first reference frame according to the first candidate node may include:
- determining at least one second candidate node in a first reference frame based on a first candidate node may include: determining the 1st second candidate node to the mth second candidate node in the first reference frame in sequence according to the encoding order of the prediction tree; wherein m is a positive integer greater than 0.
- the 1st second candidate node to the mth second candidate node may specifically be: the 1st second candidate node, the 2nd second candidate node, ..., the mth second candidate node.
- m here may be the same as or different from p at the decoding end, and p is less than or equal to the value of m.
- determining at least one fourth candidate node in the second reference frame based on the third candidate node may include: determining the first fourth candidate node to the nth fourth candidate node in the second reference frame in sequence according to the encoding order of the prediction tree; wherein n is a positive integer greater than 0.
- the first fourth candidate node to the nth fourth candidate node may specifically be: the first fourth candidate node, the second fourth candidate node, ..., the nth fourth candidate node. It should be noted that n here may be the same as or different from q at the decoding end, and q is less than or equal to the value of n.
- in the first reference frame, it is assumed that the at least one second candidate node includes node c and node d.
- determining the first second candidate node to the mth second candidate node in a preset manner in the first reference frame may include: determining in order of horizontal azimuth angles in the first reference frame the first second candidate node whose horizontal azimuth angle is greater than and closest to the horizontal azimuth angle of the first candidate node, the second second candidate node whose horizontal azimuth angle is greater than and closest to the horizontal azimuth angle of the first second candidate node, ..., the mth second candidate node whose horizontal azimuth angle is greater than and closest to the horizontal azimuth angle of the m-1th second candidate node; wherein the radar laser index numbers of the first second candidate node, the second second candidate node, ..., and the mth second candidate node are all the same as the radar laser index number of the previous encoded node.
- determining the first fourth candidate node to the nth fourth candidate node in a preset manner in the second reference frame may include: determining in order of horizontal azimuth angles in the second reference frame the first fourth candidate node whose horizontal azimuth angle is greater than and closest to the horizontal azimuth angle of the third candidate node, the second fourth candidate node whose horizontal azimuth angle is greater than and closest to the horizontal azimuth angle of the first fourth candidate node, ..., the nth fourth candidate node whose horizontal azimuth angle is greater than and closest to the horizontal azimuth angle of the n-1th fourth candidate node; wherein the radar laser index numbers of the first fourth candidate node, the second fourth candidate node, ..., and the nth fourth candidate node are all the same as the radar laser index number of the previous encoded node.
- At least one second candidate node includes node c and node d.
- the first second candidate node c is determined, which has the same radar laser index number and whose horizontal azimuth angle is the first one greater than the horizontal azimuth angle of the first candidate node b
- the second second candidate node d is determined, which has the same radar laser index number and whose horizontal azimuth angle is the first one greater than the horizontal azimuth angle of the first second candidate node c.
- the first reference frame, i.e., the previous frame
- the second reference frame, i.e., the reference frame obtained after applying global motion to the previous frame
- node a has the same laserID as node g, and nodes e and f, which are encoded or decoded after node g in the second reference frame, are used as inter-frame candidate points; at the same time, the horizontal azimuth angles of nodes e and f are replaced by the horizontal azimuth angle of the parent node of the current node
- the selection of nodes c and d may also be performed in the first reference frame according to the horizontal azimuth angle: determine the node c whose horizontal azimuth angle is greater than and closest to the horizontal azimuth angle of the first candidate node b, and determine the node d whose horizontal azimuth angle is greater than and closest to the horizontal azimuth angle of node c.
- similarly to the selection of nodes c and d, it is also possible to select nodes e and f in the second reference frame according to the horizontal azimuth angles.
- the inter-frame candidate node set may include at least one of node c, node d, node e and node f; or, the inter-frame candidate node set may also include at least one of node b, node g, node c, node d, node e and node f, without specific limitation.
- the selected node can be determined from the inter-frame candidate node set.
- when the selected node is determined from the inter-frame candidate node set, the rate-distortion optimization (RDO) method can be used to select among the different candidate nodes, as illustrated in the sketch following this passage.
- it can be: the cost of each candidate node in the inter-frame candidate node set is calculated using the rate-distortion cost method to determine the cost value of at least one candidate node; the minimum cost value is selected from these cost values, and then the candidate node corresponding to the minimum cost value is used as the selected node.
- the inter-frame prediction mode value of the current node can be determined according to the index position of the selected node in the inter-frame candidate node set.
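- As an illustrative, non-normative sketch of the RDO-based selection described above, the following C++ fragment chooses the candidate with the minimum rate-distortion cost and uses its index position in the inter-frame candidate node set as the inter-frame prediction mode value; the type and function names (Candidate, rdCost, selectInterMode) and the cost formula are assumptions introduced only for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Point3 { double x, y, z; };

struct Candidate {
    Point3 position;   // geometric position of the candidate node
    double bitCost;    // estimated bits needed to signal this candidate
};

// Hypothetical rate-distortion cost: distortion + lambda * rate.
static double rdCost(const Point3& current, const Candidate& c, double lambda) {
    const double distortion = std::abs(current.x - c.position.x)
                            + std::abs(current.y - c.position.y)
                            + std::abs(current.z - c.position.z);
    return distortion + lambda * c.bitCost;
}

// Returns the inter-frame prediction mode value, i.e. the index position of
// the minimum-cost candidate inside the inter-frame candidate node set.
int selectInterMode(const Point3& current,
                    const std::vector<Candidate>& candidates,
                    double lambda) {
    int bestIndex = 0;
    double bestCost = std::numeric_limits<double>::max();
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        const double cost = rdCost(current, candidates[i], lambda);
        if (cost < bestCost) { bestCost = cost; bestIndex = static_cast<int>(i); }
    }
    return bestIndex;   // index position == inter-frame prediction mode value
}
```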
- S2202 Determine a value of at least one mode identification information according to the inter-frame prediction mode value.
- the mode identification information includes at least the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; wherein i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode.
- the inter-frame prediction mode value can be from 0 to N, that is, there can be at most N+1 inter-frame prediction modes.
- the i-th mode identification information can be represented by flagi.
- for the mode identification information, at most N pieces of mode identification information can be included here, specifically: flag0, flag1, ..., flagN-1.
- determining the value of at least one mode identification information according to the inter-frame prediction mode value may include:
- determining the value of the i-th mode identification information according to the inter-frame prediction mode value; if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is less than or equal to i, the i+1 pieces of mode identification information are used as the at least one mode identification information; wherein the i+1 pieces of mode identification information include the 0th mode identification information, the 1st mode identification information, ..., the i-th mode identification information;
- if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is greater than i, i is updated to i+1, and the step of determining the value of the i-th mode identification information according to the inter-frame prediction mode value is continued until the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is less than or equal to i.
- the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is less than or equal to i
- determining the value of the i-th mode identification information according to the inter-frame prediction mode value may include:
- if the inter-frame prediction mode value is less than or equal to i, the value of the i-th mode identification information is determined to be the first value;
- if the inter-frame prediction mode value is greater than i, the value of the i-th mode identification information is determined to be the second value.
- the first value is different from the second value, and the first value and the second value can be in parameter form or in digital form.
- the i-th mode identification information can be a parameter written in the profile, or can be the value of a flag; no specific limitation is made here.
- the first value can be set to 1 and the second value can be set to 0; or, the first value can be set to 0 and the second value can be set to 1; or, the first value can be set to true and the second value can be set to false; or, the first value can be set to false and the second value can be set to true; but this is not specifically limited here.
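- The following minimal sketch illustrates one possible way of deriving the flag values under the "greater than i" interpretation just described, assuming the first value is 0 and the second value is 1 (only one of the permitted assignments); the function name deriveGreaterThanFlags is hypothetical.

```cpp
#include <vector>

// interMode is assumed to lie in [0, N]; at most N flags are produced.
std::vector<int> deriveGreaterThanFlags(int interMode, int N) {
    std::vector<int> flags;
    for (int i = 0; i < N; ++i) {
        const int flag = (interMode > i) ? 1 : 0;   // second value : first value
        flags.push_back(flag);
        if (flag == 0) break;   // flag_i indicates interMode <= i, so i == interMode
    }
    return flags;   // e.g. interMode = 2, N = 4  ->  {1, 1, 0}
}
```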
- determining the value of at least one mode identification information according to the inter-frame prediction mode value may include:
- determining the value of the i-th mode identification information according to the inter-frame prediction mode value; if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is equal to i, the i+1 pieces of mode identification information are used as the at least one mode identification information; wherein the i+1 pieces of mode identification information include the 0th mode identification information, the 1st mode identification information, ..., the i-th mode identification information;
- if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is not equal to i, i is updated to i+1, and the step of determining the value of the i-th mode identification information according to the inter-frame prediction mode value is continued until the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is equal to i.
- the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is equal to i
- determining the value of the i-th mode identification information according to the inter-frame prediction mode value may include:
- if the inter-frame prediction mode value is not equal to i, the value of the i-th mode identification information is determined to be the first value;
- if the inter-frame prediction mode value is equal to i, the value of the i-th mode identification information is determined to be the second value.
- the first value and the second value may be different.
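- A corresponding minimal sketch of the "equal to i" variant, assuming the first value is 0 (mode value not equal to i) and the second value is 1 (mode value equal to i); the function name deriveEqualFlags is hypothetical.

```cpp
#include <vector>

std::vector<int> deriveEqualFlags(int interMode, int N) {
    std::vector<int> flags;
    for (int i = 0; i < N; ++i) {
        const int flag = (interMode == i) ? 1 : 0;   // second value : first value
        flags.push_back(flag);
        if (flag == 1) break;   // flag_i indicates interMode == i
    }
    return flags;   // e.g. interMode = 2, N = 4 -> {0, 0, 1}; interMode = N -> N zeros
}
```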
- S2203 Encode the value of at least one mode identification information, and write the obtained coded bits into the bit stream.
- at the encoding end, after obtaining the value of at least one mode identification information, the value of the at least one mode identification information needs to be written into the bit stream; in this way, the value of the at least one mode identification information can be obtained at the decoding end by decoding the bit stream.
- encoding the value of at least one mode identification information and writing the obtained coded bits into the bitstream may include: encoding the value of at least one mode identification information based on the first coding mode and writing the obtained coded bits into the bitstream.
- the first encoding mode may include at least one of the following: an encoding mode with fixed context information, an encoding mode with adaptive context information, and an encoding mode without using context information.
- encoding processing can be performed using an encoding mode of fixed context information, or encoding processing can be performed using an encoding mode of adaptive context information, or encoding processing can be performed using an encoding mode that does not use context information, and no specific limitation is made herein.
- the coding mode of fixed context information/the coding mode of adaptive context information/the coding mode without using context information can be used to encode flagi, and the obtained coded bits can be written into the bitstream.
- encoding the value of at least one mode identification information and writing the obtained encoding bits into the bitstream may include:
- if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is less than or equal to i, the value of the i-th mode identification information is encoded, and the obtained coded bits are written into the bit stream;
- if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is greater than i, the value of the i-th mode identification information is encoded and the obtained coded bits are written into the bitstream; and i is updated to i+1, and the step of determining the value of the i-th mode identification information according to the inter-frame prediction mode value is continued until the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is less than or equal to i.
- the coding mode of fixed context information/the coding mode of adaptive context information/the coding mode without context information is used to encode the value of flag0, and it is determined whether the inter-frame prediction mode value of the current node is greater than 0; if the inter-frame prediction mode value is not greater than 0, the encoding operation is terminated; if the inter-frame prediction mode value is greater than 0, the coding mode of fixed context information/the coding mode of adaptive context information/the coding mode without context information is continued to be used to encode the value of flag1, and it is determined whether the inter-frame prediction mode value of the current node is greater than 1; if the inter-frame prediction mode value is not greater than 1, the encoding operation is terminated; if the inter-frame prediction mode value is greater than 1, the coding mode of fixed context information/the coding mode of adaptive context information/the coding mode without context information is continued to be used to encode the value of flag2, and it is determined whether the inter-frame prediction mode value of the current node is greater than 2; similarly, if the inter-frame prediction mode value is greater than M-1, the coding mode of fixed context information/the coding mode of adaptive context information/the coding mode without context information is continued to be used to encode the value of flagM, and it is determined whether the inter-frame prediction mode value of the current node is greater than M. If the inter-frame prediction mode value is not greater than M, the encoding operation is terminated; where M is an integer greater than or equal to 0 and less than N.
- the method may further include:
- when i is equal to N-1, the value of the N-1th mode identification information is determined according to the inter-frame prediction mode value, the value of the N-1th mode identification information is encoded, and the obtained encoded bits are written into the bit stream.
- determining the value of the N-1th mode identification information according to the inter-frame prediction mode value may include:
- if the inter-frame prediction mode value is less than or equal to N-1, the value of the N-1th mode identification information is determined to be the first value;
- if the inter-frame prediction mode value is equal to N, the value of the N-1th mode identification information is determined to be the second value.
- for the N+1 inter-frame prediction modes, there are at most N pieces of corresponding mode identification information, namely: the 0th mode identification information flag0, the 1st mode identification information flag1, ..., the N-1th mode identification information flagN-1.
- the encoding mode of fixed context information/the encoding mode of adaptive context information/the encoding mode without context information can be used to encode the value of flagi, and determine whether the inter-frame prediction mode value of the current node is greater than i; if the inter-frame prediction mode value is not greater than i, the encoding operation is terminated; if the inter-frame prediction mode value is greater than i, the encoding mode of fixed context information/the encoding mode of adaptive context information/the encoding mode without context information is continued to be used to encode the value of flagi+1, and determine whether the inter-frame prediction mode value of the current node is greater than i+1.
- i is an integer greater than or equal to 0 and less than N-1.
- the encoding mode of fixed context information/the encoding mode of adaptive context information/the encoding mode without using context information can still be used to encode the value of flagN-1, and determine whether the inter-frame prediction mode value of the current node is greater than N-1; if the inter-frame prediction mode value is not greater than N-1, the encoding operation is terminated, that is, the inter-frame prediction mode value is equal to N-1; if the inter-frame prediction mode value is greater than N-1, the encoding operation is terminated, that is, the inter-frame prediction mode value is equal to N.
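- Putting the above steps together, the following encoder-side sketch writes the flags for the "greater than" scheme, terminating either at the first flag carrying the first value or after flagN-1; BinSink and writeBin() are stand-ins for the entropy coder (fixed-context, adaptive-context or bypass coding), not a real codec API, and the first/second values are assumed to be 0/1.

```cpp
#include <vector>

struct BinSink {
    std::vector<int> bins;
    void writeBin(int b) { bins.push_back(b); }   // stand-in for entropy coding
};

// Encodes interMode in [0, N] as flag0 ... flag_i (at most N flags).
void encodeInterModeGreaterThan(int interMode, int N, BinSink& sink) {
    for (int i = 0; i < N; ++i) {
        const int flag = (interMode > i) ? 1 : 0;
        sink.writeBin(flag);
        if (flag == 0) return;   // interMode <= i: the encoding operation terminates
    }
    // flagN-1 carried the second value: interMode can only be N, nothing more is written.
}
```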
- the method may further include:
- the Mth mode identification information indicates that the inter-frame prediction mode value of the current node is greater than M, determining the first inter-frame prediction mode residual value of the current node according to the inter-frame prediction mode value and the values of the M+1 mode identification information;
- the first inter-frame prediction mode residual value of the current node is encoded based on the second encoding mode, and the obtained encoding bits are written into the bitstream.
- M+1 mode identification information may include: 0th mode identification information, 1st mode identification information, ..., Mth mode identification information; M is an integer greater than or equal to 0 and less than N.
- the second coding mode may be an exponential Golomb coding method, such as a K-order exponential Golomb coding method.
- K-order exponential Golomb is a lossless data compression method that can achieve very high coding efficiency; therefore, in order to improve coding efficiency, when the inter-frame prediction mode value of the current node is greater than M, the exponential Golomb coding method may be used to encode the residual value of the first inter-frame prediction mode.
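- As a hedged illustration of the second coding mode, the following is one common formulation of a K-order exponential Golomb writer; the BitSink type and the prefix-of-ones-terminated-by-zero convention are assumptions for the sketch and may differ from the codec's normative bit ordering.

```cpp
#include <vector>

struct BitSink {
    std::vector<int> bits;
    void writeBit(int b) { bits.push_back(b); }
};

// Encodes a non-negative value (e.g. an inter-frame prediction mode residual)
// with a K-order exponential Golomb code.
void writeExpGolombK(unsigned value, int k, BitSink& out) {
    while (value >= (1u << k)) {   // prefix: one bit per exhausted group of size 2^k
        out.writeBit(1);
        value -= (1u << k);
        ++k;                       // group size doubles after each prefix bit
    }
    out.writeBit(0);               // prefix terminator
    for (int i = k - 1; i >= 0; --i)
        out.writeBit(static_cast<int>((value >> i) & 1u));   // k-bit suffix
}
```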
- the encoding end first uses the coding mode of fixed context information/the coding mode of adaptive context information/the coding mode without using context information to encode flag0. If the value of flag0 indicates that the inter-frame prediction mode value of the current node is less than or equal to 0, the encoding operation is terminated; if the value of flag0 indicates that the inter-frame prediction mode value of the current node is greater than 0, the coding mode of fixed context information/the coding mode of adaptive context information/the coding mode without using context information is used to continue encoding flag1, and so on; if the value of flagM indicates that the inter-frame prediction mode value of the current node is greater than M, the first inter-frame prediction mode residual value of the current node is calculated and encoded.
- determining the first inter-frame prediction mode residual value of the current node based on the inter-frame prediction mode value and the values of M+1 mode identification information may include: performing a subtraction operation on the inter-frame prediction mode value and the values of M+1 mode identification information to determine the first inter-frame prediction mode residual value of the current node.
- residual_mode1 = inter mode - (flag0 + flag1 + ... + flagM)    (24)
- determining the first inter-frame prediction mode residual value of the current node based on the inter-frame prediction mode value and the values of M+1 mode identification information may include: performing a subtraction operation on the inter-frame prediction mode value and (M+1) to determine the first inter-frame prediction mode residual value of the current node.
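- The two equivalent ways of forming the first inter-frame prediction mode residual described above can be sketched as follows; since flag0 ... flagM all carry the second value (taken here as 1) when the residual is needed, both forms return the same number. Function names are illustrative.

```cpp
#include <numeric>
#include <vector>

int firstResidualFromFlags(int interMode, const std::vector<int>& flags /* flag0..flagM */) {
    // residual_mode1 = inter mode - (flag0 + flag1 + ... + flagM), formula (24)
    const int flagSum = std::accumulate(flags.begin(), flags.end(), 0);
    return interMode - flagSum;
}

int firstResidualFromCount(int interMode, int M) {
    // equivalent closed form: residual_mode1 = inter mode - (M + 1)
    return interMode - (M + 1);
}
// Example: interMode = 7, M = 2, flags = {1, 1, 1}  ->  both forms give 4.
```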
- encoding the value of at least one mode identification information and writing the obtained encoding bits into the bitstream may include:
- if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is equal to i, the value of the i-th mode identification information is encoded, and the obtained encoding bits are written into the bitstream;
- if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is not equal to i, the value of the i-th mode identification information is encoded and the obtained coded bits are written into the bitstream; and i is updated to i+1, and the step of determining the value of the i-th mode identification information according to the inter-frame prediction mode value is continued until the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is equal to i.
- the coding mode of fixed context information/the coding mode of adaptive context information/the coding mode without using context information is used to encode the value of flag0, and it is determined whether the inter-frame prediction mode value of the current node is equal to 0; if the inter-frame prediction mode value is equal to 0, the encoding operation is terminated; if the inter-frame prediction mode value is not equal to 0, the coding mode of fixed context information/the coding mode of adaptive context information/the coding mode without using context information is continued to be used to encode the value of flag1, and it is determined whether the inter-frame prediction mode value of the current node is equal to 1; if the inter-frame prediction mode value is equal to 1, the encoding operation is terminated; if the inter-frame prediction mode value is not equal to 1, the coding mode of fixed context information/the coding mode of adaptive context information/the coding mode without using context information is continued to be used to encode the value of flag2, and it is determined whether the inter-frame prediction mode value of the current node is equal to 2; and so on.
- the method may further include:
- when i is equal to N-1, the value of the N-1th mode identification information is determined according to the inter-frame prediction mode value, the value of the N-1th mode identification information is encoded, and the obtained encoded bits are written into the bit stream.
- determining the value of the N-1th mode identification information according to the inter-frame prediction mode value may include:
- if the inter-frame prediction mode value is equal to N, the value of the N-1th mode identification information is determined to be the first value;
- if the inter-frame prediction mode value is equal to N-1, the value of the N-1th mode identification information is determined to be the second value.
- N+1 inter-frame prediction modes there are at most N corresponding mode identification information, namely: the 0th mode identification information flag0, the 1st mode identification information flag1, ..., the N-1th mode identification information flagN-1.
- the coding mode of fixed context information/the coding mode of adaptive context information/the coding mode without context information can be used to encode the value of flagi, and it is determined whether the inter-frame prediction mode value of the current node is equal to i; if the inter-frame prediction mode value is equal to i, the encoding operation is terminated; if the inter-frame prediction mode value is not equal to i, the coding mode of fixed context information/the coding mode of adaptive context information/the coding mode without context information is continued to be used to encode the value of flagi+1, and it is determined whether the inter-frame prediction mode value of the current node is equal to i+1.
- i is an integer greater than or equal to 0 and less than N-1.
- the coding mode of fixed context information/coding mode of adaptive context information/coding mode without context information can still be used to encode the value of flagN-1, and determine whether the inter-frame prediction mode value of the current node is equal to N-1; if the inter-frame prediction mode value is equal to N-1, the coding operation is terminated, that is, the inter-frame prediction mode value is equal to N-1; if the inter-frame prediction mode value is not equal to N-1, the coding operation is terminated, that is, the inter-frame prediction mode value is equal to N.
- the method may further include:
- the Mth mode identification information indicates that the inter-frame prediction mode value of the current node is not equal to M, determining the second inter-frame prediction mode residual value of the current node according to the inter-frame prediction mode value and the values of the M+1 mode identification information;
- the second inter-frame prediction mode residual value of the current node is encoded based on the second encoding mode, and the obtained encoding bits are written into the bitstream.
- M+1 mode identification information may include: 0th mode identification information, 1st mode identification information, ..., Mth mode identification information; M is an integer greater than or equal to 0 and less than N.
- the second coding mode may be an exponential Golomb coding method, such as a K-order exponential Golomb coding method.
- K-order exponential Golomb is a lossless data compression method that can achieve very high coding efficiency; therefore, in order to improve coding efficiency, when the inter-frame prediction mode value of the current node is not equal to M, the exponential Golomb coding method may be used to encode the second inter-frame prediction mode residual value.
- the encoding end first uses the coding mode of fixed context information/the coding mode of adaptive context information/the coding mode without using context information to encode flag0. If the value of flag0 indicates that the inter-frame prediction mode value of the current node is equal to 0, the encoding operation is terminated; if the value of flag0 indicates that the inter-frame prediction mode value of the current node is not equal to 0, then the coding mode of fixed context information/the coding mode of adaptive context information/the coding mode without using context information is used to encode flag1; and so on, the coding mode of fixed context information/the coding mode of adaptive context information/the coding mode without using context information is used to encode flagM; if the value of flagM indicates that the inter-frame prediction mode value of the current node is equal to M, the encoding operation is terminated; if the value of flagM indicates that the inter-frame prediction mode value of the current node is not equal to M, it is necessary to calculate the second inter-frame prediction mode residual value based on the values of the M+1 mode identification information and the inter-frame prediction mode value, and to encode it.
- determining the second inter-frame prediction mode residual value of the current node based on the inter-frame prediction mode value and the values of M+1 mode identification information may include: performing a negation operation on the values of the M+1 mode identification information to determine the negated value of the M+1 mode identification information; performing a subtraction operation based on the inter-frame prediction mode value and the negated value of the M+1 mode identification information to determine the second inter-frame prediction mode residual value of the current node.
- the first value is set to 0 and the second value is set to 1, that is, when the value of flagi is 0, it is determined that the inter-frame prediction mode value of the current node is not equal to i, and the encoding operation needs to be continued.
- M+1 mode identification information such as flag0, flag1, ..., flagM
- determining the second inter-frame prediction mode residual value of the current node based on the inter-frame prediction mode value and the values of M+1 mode identification information may include: performing a subtraction operation on the inter-frame prediction mode value and (M+1) to determine the second inter-frame prediction mode residual value of the current node.
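- The second inter-frame prediction mode residual for the "equal to i" variant can be sketched in the same way; here the first value is taken as 0 (not equal), so each negated flag !(flag_i) contributes 1 and the sum again equals M+1. Function names are illustrative.

```cpp
#include <vector>

int secondResidualFromFlags(int interMode, const std::vector<int>& flags /* flag0..flagM */) {
    int negatedSum = 0;
    for (int f : flags) negatedSum += (f == 0) ? 1 : 0;   // !(flag_i)
    return interMode - negatedSum;   // inter mode - (!(flag0)+!(flag1)+...+!(flagM))
}

int secondResidualFromCount(int interMode, int M) {
    return interMode - (M + 1);      // equivalent closed form
}
// Example: interMode = 5, M = 2, flags = {0, 0, 0}  ->  both forms give 2.
```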
- the point cloud inter-frame geometric information encoding mode can be used for encoding; or, in order to improve the encoding efficiency, the exponential Golomb encoding method can be combined for encoding.
- the method may also include:
- the initial residual value is quantized according to the quantization parameter to determine the predicted residual value of the current node.
- determining the initial residual value of the current node according to the original value and the predicted value of the current node may include: performing a subtraction operation according to the original value and the predicted value of the current node to determine the initial residual value of the current node.
- the prediction value of the current node can be determined; then, the initial residual value of the current node can be calculated by subtracting the prediction value from the original value of the current node; and the initial residual value is quantized according to the quantization parameter to obtain the prediction residual value of the current node.
- the method may further include: encoding the prediction residual value of the current node, and writing the obtained coded bits into the bitstream.
- the method may further include: encoding the quantization parameter, and writing the obtained encoded bits into a bitstream.
- the geometric prediction value of the current node is first determined; then the difference operation is performed based on the geometric position information of the current node and the geometric prediction value to obtain the geometric prediction residual; and the geometric prediction residual is quantized using the quantization parameter. Finally, through continuous iteration, the inter-frame prediction mode value, prediction residual, prediction tree structure, quantization parameter and other parameters of each node position information in the prediction tree are encoded, and the obtained coded bits are written into the bitstream.
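- The following sketch illustrates forming the initial residual (original value minus prediction value) and quantizing it with a quantization parameter; the uniform quantizer and the QP-to-step mapping below are assumptions for illustration only, not the codec's normative quantizer design.

```cpp
#include <cmath>

struct Point3   { double x, y, z; };
struct Residual { int x, y, z; };

Residual quantizePredictionResidual(const Point3& original, const Point3& predicted, double qp) {
    const double step = std::pow(2.0, qp / 6.0);   // assumed QP-to-step mapping
    auto q = [step](double v) { return static_cast<int>(std::round(v / step)); };
    return { q(original.x - predicted.x),          // initial residual, then quantization
             q(original.y - predicted.y),
             q(original.z - predicted.z) };
}
```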
- the encoding method is mainly for encoding optimization of the inter-frame prediction mode value, and here a flag bit can also be used to determine whether the current node uses the inter-frame prediction mode. Therefore, in some embodiments, the method can also include:
- determining a value of first identification information; and if the first identification information indicates that the current node uses the inter-frame prediction mode, the step of determining the inter-frame prediction mode value of the current node is performed.
- determining the value of the first identification information may include:
- if the first identification information indicates that the current node does not use the inter-frame prediction mode, the value of the first identification information is determined to be a third value;
- if the first identification information indicates that the current node uses the inter-frame prediction mode, the value of the first identification information is determined to be a fourth value.
- the third value is different from the fourth value, and the third value and the fourth value can be in parameter form or in digital form.
- the first identification information can be a parameter written in the profile or a flag value, which is not specifically limited here.
- the third value can be set to 1 and the fourth value can be set to 0; or, the third value can be set to 0 and the fourth value can be set to 1; or, the third value can be set to true and the fourth value can be set to false; or, the third value can be set to false and the fourth value can be set to true; but this is not specifically limited here.
- the method may further include: encoding the value of the first identification information, and writing the obtained encoded bits into a bit stream.
- a flag bit may be set to determine whether to enable the decoding method of the embodiment of the present application. Therefore, in some embodiments, the method may further include:
- determining a value of second identification information; and if the second identification information indicates that the current node enables the target inter-frame coding mode, the step of determining a value of at least one mode identification information according to the inter-frame prediction mode value is performed.
- determining the value of the second identification information may include:
- if the second identification information indicates that the current node does not enable the target inter-frame coding mode, the value of the second identification information is determined to be a fifth value;
- if the second identification information indicates that the current node enables the target inter-frame coding mode, the value of the second identification information is determined to be a sixth value.
- the fifth value is different from the sixth value, and the fifth value and the sixth value can be in parameter form or in digital form.
- the second identification information can be a parameter written in the profile or a value of a flag, which is not specifically limited here.
- the fifth value can be set to 1 and the sixth value can be set to 0; or, the fifth value can be set to 0 and the sixth value can be set to 1; or, the fifth value can be set to true and the sixth value can be set to false; or, the fifth value can be set to false and the sixth value can be set to true; but no specific limitation is made here.
- the method may further include: encoding the value of the second identification information, and writing the obtained encoded bits into the bit stream.
- the value of the second identification information can be directly obtained by decoding at the decoding end, and it can be determined whether the current node enables the target inter-frame coding method, thereby improving decoding efficiency.
- a 1-bit flag (i.e., the second identification information) can be used to indicate whether the target inter-frame coding mode is enabled or not.
- This flag can be placed in the header information of a high-level syntax element, such as a geometry header; and this flag can be conditionally enabled under certain conditions. If this flag does not appear in the bitstream, its default value is a fixed value.
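- One possible, purely illustrative way such a conditionally present header flag could be handled at parsing time is sketched below; the structure and field names are hypothetical, and the presence condition and default value are assumptions rather than the normative syntax.

```cpp
struct GeometryHeaderStub {
    bool interPredictionEnabled = false;   // assumed presence condition
    bool targetInterModeFlag = false;      // the 1-bit enable flag (second identification information)
};

struct BitReaderStub {
    const int* bits; int pos = 0;
    int readBit() { return bits[pos++]; }
};

void parseGeometryHeaderStub(BitReaderStub& br, GeometryHeaderStub& gh) {
    if (gh.interPredictionEnabled) {
        gh.targetInterModeFlag = (br.readBit() != 0);   // flag present in the bitstream
    } else {
        gh.targetInterModeFlag = false;                 // not present: fixed default value (assumed 0)
    }
}
```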
- the embodiment of the present application further provides a code stream, which is generated by bit encoding according to the information to be encoded; wherein the information to be encoded may include at least one of the following:
- the mode identification information includes at least the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode.
- the first identification information is used to indicate whether the current node uses the inter-frame prediction mode
- the second identification information is used to indicate whether the current node enables the target inter-frame encoding/decoding mode.
- the encoder can encode the information to be encoded and write the obtained encoded bits into the bitstream, which is then transmitted from the encoder to the decoder. Later, at the decoder, by decoding the bitstream, the value of at least one mode identification information and the residual value of the inter-frame prediction mode of the current node can be obtained, so that the inter-frame prediction mode value of the current node can be directly determined.
- This embodiment provides a coding method, first determining the inter-frame prediction mode value of the current node; then determining the value of at least one mode identification information according to the inter-frame prediction mode value; wherein the mode identification information at least includes the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode; finally, encoding the value of at least one mode identification information, and writing the obtained coded bit into the bitstream.
- the inter-frame prediction mode value is no longer converted into binary for direct encoding; instead, the value of at least one mode identification information is determined according to the inter-frame prediction mode value, and then the value of the at least one mode identification information is encoded; wherein the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i, so that the frequency distribution of the inter-frame prediction modes is taken into account, that is, the more likely an inter-frame prediction mode is to appear, the smaller its corresponding inter-frame prediction mode value, so that the number of coding bits can be reduced, the bit rate can be saved, and the encoding and decoding efficiency can be improved.
- the prediction mode of the inter-frame prediction tree is mainly improved.
- the specific performance of the encoding prediction mode is as follows:
- the inter-frame prediction mode value is encoded.
- the solution adopted by the related art is to convert the inter-frame prediction mode number into binary and encode it directly.
- the specific performance of the decoding prediction mode is:
- the coded inter-frame prediction mode value (inter mode) is decoded to obtain the inter-frame prediction mode of the current node.
- a coding mode for inter mode is designed here, taking into account the frequency distribution of inter-frame coding modes; the more likely an inter mode is to appear, the further forward it is placed, which can reduce the number of coding bits and achieve the purpose of improving compression efficiency.
- the tool proposed by the present technology can use a 1-bit flag to indicate whether it is enabled or not.
- This flag is placed in the header information of the high-level syntax element, such as the geometry header, and this flag is conditionally enabled under certain conditions. If this flag does not appear in the bitstream, its default value is a fixed value. Similarly, the flag needs to be decoded at the decoding end; if this flag does not appear in the bitstream, it does not need to be decoded, and its default value is a fixed value.
- the present technology is to design a coding and decoding scheme for the inter-frame prediction mode value (or "inter-frame prediction mode number", represented by inter mode):
- flagx represents whether inter mode is greater than x. If so, flagx is 1 (i.e., inter mode is greater than x, and processing continues downward); if not, flagx is 0 (i.e., inter mode is not greater than x, and encoding ends at this time).
- the value of inter mode ranges from 0 to N by default, that is, there are at most N+1 inter modes.
- the specific process is as follows:
- a flag flag0 indicates whether the inter mode is greater than 0; if not, the decoding is terminated and the inter mode is 0.
- a flag flag1 is encoded using fixed context/no context/adaptive context to indicate whether the inter mode is greater than 1; if not, decoding is terminated and the inter mode is 1.
- a flag flagN-1 is encoded using fixed context/no context/adaptive context to indicate whether the inter mode is greater than N-1; if not, decoding is terminated and the inter mode is N-1; if yes, decoding is terminated and the inter mode is N.
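- A decoder-side sketch of the process just described follows; readBin() stands in for fixed-context/no-context/adaptive-context decoding, and the type and function names are hypothetical.

```cpp
struct BinReader {
    const int* bins; int pos = 0;
    int readBin() { return bins[pos++]; }
};

// Flags are read until one signals "not greater than"; the inter mode is the
// index of that flag, or N when flagN-1 still signals "greater than N-1".
int decodeInterModeGreaterThan(BinReader& src, int N) {
    for (int i = 0; i < N; ++i) {
        const int flag = src.readBin();   // flag_i: is inter mode greater than i?
        if (flag == 0) return i;          // not greater than i -> inter mode == i
    }
    return N;                             // flagN-1 was 1 -> inter mode == N
}
```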
- flagx represents whether inter mode is equal to x. If not, flagx is 0 (i.e., inter mode is not equal to x, and the processing continues downward); if yes, flagx is 1 (i.e., inter mode is equal to x, and the encoding ends).
- the value of inter mode ranges from 0 to N by default, that is, there are at most N+1 inter modes.
- the specific process is as follows:
- a flag flag0 indicates whether the inter mode is equal to 0; if so, the decoding is terminated and the inter mode is 0.
- a flag flag1 is encoded using fixed context/no context/adaptive context to indicate whether the inter mode is equal to 1; if so, decoding is terminated and the inter mode is 1.
- a flag flagN-1 is encoded using fixed context/no context/adaptive context to indicate whether the inter mode is equal to N-1; if so, decoding is terminated, and the inter mode is N-1; if not, decoding is terminated, and the inter mode is N.
- the present technology designs a Golomb-based scheme to encode and decode the inter-frame prediction mode value (or "inter-frame prediction mode number", denoted by inter mode):
- flagx represents whether inter mode is greater than x. If so, flagx is 1 (i.e., inter mode is greater than x, and processing continues downward); if not, flagx is 0 (i.e., inter mode is not greater than x, and encoding ends at this time).
- the value of inter mode ranges from 0 to N by default, that is, there are at most N+1 inter modes.
- the specific process is as follows:
- if flagM indicates that the inter mode is greater than M, the value actually encoded (for example, with exponential Golomb coding) is updated as:
- inter mode = inter mode - (M+1); or,
- inter mode = inter mode - (flag0+flag1+...+flagM).
- a flag flag0 indicates whether the inter mode is greater than 0; if not, decoding is terminated.
- Decoding using fixed context/no context/adaptive context uses a flag flag1 to indicate whether the inter mode is greater than 1; if not, decoding ends.
- a flag flagM indicates whether the inter mode is greater than M; if not, the decoding is terminated.
- if flagM indicates that the inter mode is greater than M, the inter-frame prediction mode residual value is further decoded (for example, with exponential Golomb decoding) and the inter mode is reconstructed from it.
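- The reconstruction at the decoding end for this Golomb-based scheme can be sketched as follows, assuming the first value is 0 and the second value is 1 and that the exponential Golomb reader mirrors the writer sketched earlier; all names are illustrative.

```cpp
struct BinSource {
    const int* bins; int pos = 0;
    int readBin() { return bins[pos++]; }
};

int readExpGolombK(BinSource& src, int k) {
    int value = 0;
    while (src.readBin() == 1) {          // prefix: one bit per exhausted group
        value += (1 << k);
        ++k;
    }
    int suffix = 0;
    for (int i = 0; i < k; ++i) suffix = (suffix << 1) | src.readBin();
    return value + suffix;                // k-bit suffix added to the prefix total
}

int decodeInterModeWithGolombTail(BinSource& src, int M, int k) {
    for (int i = 0; i <= M; ++i) {
        const int flag = src.readBin();   // flag_i: is inter mode greater than i?
        if (flag == 0) return i;          // inter mode == i
    }
    const int residual = readExpGolombK(src, k);   // inter mode was greater than M
    return residual + (M + 1);                     // undo inter mode - (M + 1)
}
```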
- flagx represents whether inter mode is equal to x. If not, flagx is 0 (i.e., inter mode is not equal to x, and the processing continues downward); if yes, flagx is 1 (i.e., inter mode is equal to x, and the encoding ends).
- the value of inter mode ranges from 0 to N by default, that is, there are at most N+1 inter modes.
- the specific process is as follows:
- if flagM indicates that the inter mode is not equal to M, the value actually encoded (for example, with exponential Golomb coding) is updated as:
- inter mode = inter mode - (M+1), or
- inter mode = inter mode - (!(flag0)+!(flag1)+...+!(flagM)).
- Decoding using fixed context/no context/adaptive context uses a flag flag0 to indicate whether the inter mode is equal to 0; if so, decoding ends.
- Decoding using fixed context/no context/adaptive context uses a flag flag1 to indicate whether the inter mode is equal to 1; if so, decoding ends.
- Decoding using fixed context/no context/adaptive context uses a flag flagM to indicate whether the inter mode is equal to M; if so, decoding ends.
- if flagM indicates that the inter mode is not equal to M, the inter-frame prediction mode residual value is further decoded (for example, with exponential Golomb decoding) and the inter mode is reconstructed from it.
- for the inter-frame prediction mode value, the inter-frame prediction mode value is no longer converted into binary for direct encoding; instead, the value of at least one mode identification information is determined according to the inter-frame prediction mode value, and then the value of the at least one mode identification information is encoded; wherein the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i, which takes into account the frequency distribution of the inter-frame prediction modes, that is, the more likely an inter-frame prediction mode is to appear, the smaller its corresponding inter-frame prediction mode value is, thereby reducing the number of encoding bits, saving bit rate, and thus improving encoding and decoding efficiency.
- the encoder 240 may include: a first determination unit 2401 and an encoding unit 2402; wherein,
- a first determining unit 2401 is configured to determine an inter-frame prediction mode value of a current node
- the first determining unit 2401 is further configured to determine a value of at least one mode identification information according to the inter-frame prediction mode value; wherein the mode identification information includes at least the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode;
- the encoding unit 2402 is configured to encode the value of at least one mode identification information and write the obtained encoding bits into the bit stream.
- the encoding unit 2402 is further configured to encode a value of at least one mode identification information based on the first encoding mode, and write the obtained encoding bits into the bit stream.
- the first encoding mode includes at least one of the following: an encoding mode with fixed context information, an encoding mode with adaptive context information, and an encoding mode without using context information.
- the first determination unit 2401 is further configured to determine the value of the i-th mode identification information according to the inter-frame prediction mode value; and if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is less than or equal to i, then the i+1 mode identification information is used as at least one mode identification information; wherein the i+1 mode identification information includes the 0th mode identification information, the 1st mode identification information, ..., the i-th mode identification information; if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is greater than i, then i is updated based on i+1, and the step of determining the value of the i-th mode identification information according to the inter-frame prediction mode value is continued until the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is less than or equal to i.
- the first determination unit 2401 is further configured to determine that the value of the i-th mode identification information is a first value if the inter-frame prediction mode value is less than or equal to i; if the inter-frame prediction mode value is greater than i, determine that the value of the i-th mode identification information is a second value.
- the encoding unit 2402 is further configured to, if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is less than or equal to i, encode the value of the i-th mode identification information and write the obtained coding bits into the bitstream; if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is greater than i, encode the value of the i-th mode identification information and write the obtained coding bits into the bitstream; and update i based on i+1, and continue to execute the step of determining the value of the i-th mode identification information according to the inter-frame prediction mode value, until the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is less than or equal to i.
- the first determining unit 2401 is further configured to determine the value of the N-1th mode identification information according to the inter-frame prediction mode value when i is equal to N-1;
- the encoding unit 2402 is further configured to encode the value of the N-1th mode identification information and write the obtained coded bits into the bit stream.
- the first determination unit 2401 is further configured to determine that the value of the N-1th mode identification information is the first value if the inter-frame prediction mode value is less than or equal to N-1; if the inter-frame prediction mode value is equal to N, determine that the value of the N-1th mode identification information is the second value.
- the first determining unit 2401 is further configured to, when i is equal to M, determine the first inter-frame prediction mode residual value of the current node according to the inter-frame prediction mode value and the values of M+1 mode identification information if the Mth mode identification information indicates that the inter-frame prediction mode value of the current node is greater than M;
- the encoding unit 2402 is also configured to encode the residual value of the first inter-frame prediction mode of the current node based on the second encoding mode, and write the obtained encoding bits into the bit stream; wherein the M+1 mode identification information includes: the 0th mode identification information, the 1st mode identification information, ..., the Mth mode identification information; M is an integer greater than or equal to 0 and less than N.
- the first determining unit 2401 is further configured to perform a subtraction operation on the inter-frame prediction mode value and the values of the M+1 mode identification information to determine the first inter-frame prediction mode residual value of the current node.
- the first determining unit 2401 is further configured to perform a subtraction operation on the inter-frame prediction mode value and (M+1) to determine the first inter-frame prediction mode residual value of the current node.
- the first determination unit 2401 is further configured to determine the value of the i-th mode identification information according to the inter-frame prediction mode value; and if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is equal to i, then the i+1 mode identification information is used as at least one mode identification information; wherein the i+1 mode identification information includes the 0th mode identification information, the 1st mode identification information, ..., the i-th mode identification information; if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is not equal to i, then i is updated based on i+1, and the step of determining the value of the i-th mode identification information according to the inter-frame prediction mode value is continued until the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is equal to i
- the first determination unit 2401 is further configured to determine the value of the i-th mode identification information as the first value if the inter-frame prediction mode value is not equal to i; if the inter-frame prediction mode value is equal to i, determine the value of the i-th mode identification information as the second value.
- the encoding unit 2402 is further configured to, if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is equal to i, encode the value of the i-th mode identification information and write the obtained coding bits into the bitstream; if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is not equal to i, encode the value of the i-th mode identification information and write the obtained coding bits into the bitstream; and update i based on i+1, and continue to execute the step of determining the value of the i-th mode identification information according to the inter-frame prediction mode value, until the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is equal to i.
- the first determining unit 2401 is further configured to determine the value of the N-1th mode identification information according to the inter-frame prediction mode value when i is equal to N-1;
- the encoding unit 2402 is further configured to encode the value of the N-1th mode identification information and write the obtained coded bits into the bit stream.
- the first determination unit 2401 is further configured to determine that if the inter-frame prediction mode value is equal to N, the value of the N-1th mode identification information is the first value; if the inter-frame prediction mode value is equal to N-1, the value of the N-1th mode identification information is determined to be the second value.
- the first determining unit 2401 is further configured to, when i is equal to M, if the Mth mode identification information indicates that the inter-frame prediction mode value of the current node is not equal to M, determine the second inter-frame prediction mode residual value of the current node according to the inter-frame prediction mode value and the values of the M+1 mode identification information;
- the encoding unit 2402 is also configured to encode the second inter-frame prediction mode residual value of the current node based on the second encoding mode, and write the obtained encoding bits into the bitstream; wherein the M+1 mode identification information includes: the 0th mode identification information, the 1st mode identification information, ..., the Mth mode identification information; M is an integer greater than or equal to 0 and less than N.
- the first determination unit 2401 is further configured to perform a negation operation on the values of the M+1 mode identification information to determine the negated values of the M+1 mode identification information; and perform a subtraction operation on the inter-frame prediction mode value and the negated values of the M+1 mode identification information to determine the second inter-frame prediction mode residual value of the current node.
- the first determining unit 2401 is further configured to perform a subtraction operation on the inter-frame prediction mode value and (M+1) to determine the second inter-frame prediction mode residual value of the current node.
- the second encoding mode comprises: an Exponential Golomb encoding mode.
- the first determination unit 2401 is further configured to determine an inter-frame candidate node set; wherein the inter-frame candidate node set includes at least one candidate node; determine a selected node from the inter-frame candidate node set, and determine the inter-frame prediction mode value of the current node based on the index position of the selected node in the inter-frame candidate node set.
- the first determination unit 2401 is further configured to perform cost calculation on at least one candidate node in the inter-frame candidate node set based on a rate-distortion cost method to determine the cost value of at least one candidate node; and determine the minimum cost value from the cost values of at least one candidate node, and select the candidate node corresponding to the minimum cost value as the selected node.
- the first determination unit 2401 is also configured to determine a previous encoded node of the current node; determine a first candidate node whose geometric parameters satisfy a first condition with those of the previous encoded node in the first reference frame, and determine at least one second candidate node in the first reference frame based on the first candidate node; determine a third candidate node whose geometric parameters satisfy the first condition with those of the previous encoded node in the second reference frame, determine at least one fourth candidate node in the second reference frame based on the third candidate node, and set the horizontal azimuth angle of at least one fourth candidate node to the horizontal azimuth angle of the father node of the current node; and determine an inter-frame candidate node set based on at least one second candidate node and/or at least one fourth candidate node.
- the first determination unit 2401 is further configured to determine a prediction tree corresponding to a current frame, wherein the current frame includes a current node; and determine a previous encoded node of the current node based on a coding order of the prediction tree.
- the first reference frame is a frame before the current frame; and the second reference frame is obtained by performing global motion on the previous frame.
- the first determination unit 2401 is further configured to determine a predicted value of the current node based on the selected node; determine an initial residual value of the current node based on the original value and the predicted value of the current node; and quantize the initial residual value based on a quantization parameter to determine a predicted residual value of the current node.
- the encoding unit 2402 is further configured to encode the prediction residual value of the current node and write the obtained encoding bits into the bitstream.
- the encoding unit 2402 is further configured to encode the quantization parameter and write the obtained encoded bits into the bitstream.
- the first determination unit 2401 is further configured to determine a value of the first identification information; and if the first identification information indicates that the current node uses an inter-frame prediction mode, execute the step of determining the inter-frame prediction mode value of the current node.
- the first determination unit 2401 is further configured to determine that the value of the first identification information is a third value if the first identification information indicates that the current node does not use the inter-frame prediction mode; if the first identification information indicates that the current node uses the inter-frame prediction mode, determine that the value of the first identification information is a fourth value.
- the encoding unit 2402 is further configured to encode the value of the first identification information and write the obtained encoded bits into the bit stream.
- the first determination unit 2401 is further configured to determine the value of the second identification information; and if the second identification information indicates that the current node enables the target inter-frame coding method, then execute the step of determining the value of at least one mode identification information according to the inter-frame prediction mode value.
- the first determination unit 2401 is further configured to determine that the value of the second identification information is a fifth value if the second identification information indicates that the current node does not enable the target inter-frame coding method; if the second identification information indicates that the current node enables the target inter-frame coding method, determine that the value of the second identification information is a sixth value.
- the encoding unit 2402 is further configured to encode the value of the second identification information and write the obtained encoded bits into the bit stream.
- a "unit” may be a part of a circuit, a part of a processor, a part of a program or software, etc., and of course, it may be a module, or it may be non-modular.
- the components in the present embodiment may be integrated into a processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit may be implemented in the form of hardware or in the form of a software functional module.
- the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
- the technical solution of this embodiment, in essence or in the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to execute all or part of the steps of the method described in this embodiment.
- the aforementioned storage medium includes: a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and other media that can store program codes.
- an embodiment of the present application provides a computer-readable storage medium, which is applied to the encoder 240.
- the computer-readable storage medium stores a computer program, and when the computer program is executed by the first processor, the method described in any one of the aforementioned embodiments is implemented.
- the encoder 240 may include: a first communication interface 2501, a first memory 2502 and a first processor 2503; each component is coupled together through a first bus system 2504. It can be understood that the first bus system 2504 is used to realize the connection and communication between these components.
- the first bus system 2504 also includes a power bus, a control bus and a status signal bus; however, for the sake of clarity, the various buses are all marked as the first bus system 2504 in Figure 25. Specifically:
- the first communication interface 2501 is used to receive and send signals during the process of sending and receiving information with other external network elements;
- a first memory 2502 used for storing a computer program that can be run on the first processor 2503;
- the first processor 2503 is configured to, when running the computer program, execute:
- determining the inter-frame prediction mode value of the current node; determining the value of at least one mode identification information according to the inter-frame prediction mode value; wherein the mode identification information includes at least the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode;
- the value of at least one mode identification information is encoded, and the obtained encoded bits are written into the code stream.
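As a non-normative sketch of this derivation, the Python function below maps an inter-frame prediction mode value to the values of the mode identification information. It assumes bit values of 1 for "greater than i" and 0 for "less than or equal to i" (the reading used in the decoding description later in this document), and an optional cutoff M after which the remaining magnitude is carried by a residual value; these conventions are assumptions for illustration only.

```python
def mode_flags_from_value(mode_value, N, M=None):
    """Derive the mode identification bits for one node (illustrative sketch).

    mode_value: inter-frame prediction mode value, an integer in [0, N].
    N: maximum value of the inter-frame prediction mode.
    M: optional cutoff; if set and mode_value exceeds M, signalling stops at the
       M-th flag and the remainder is returned as a residual value (assumed to be
       coded separately, e.g. with an Exp-Golomb code).
    Returns (flags, residual), where flags[i] is 1 for "greater than i" and 0 for
    "less than or equal to i" (bit assignments assumed for illustration).
    """
    flags = []
    residual = None
    for i in range(N):
        greater = mode_value > i
        flags.append(1 if greater else 0)
        if not greater:
            break                              # mode_value equals i; signalling stops
        if M is not None and i == M:
            residual = mode_value - (M + 1)    # first inter-frame prediction mode residual value
            break
        # if i == N - 1 and the flag is 1, mode_value equals N and no further flag is needed
    return flags, residual
```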
- the first memory 2502 in the embodiment of the present application can be a volatile memory or a non-volatile memory, or can include both volatile and non-volatile memories.
- the non-volatile memory can be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory.
- the volatile memory can be a random access memory (RAM), which is used as an external cache.
- many forms of RAM are available, for example: static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate synchronous DRAM (DDR SDRAM), enhanced synchronous DRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct Rambus RAM (DRRAM).
- the first processor 2503 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method can be completed by the hardware integrated logic circuit or software instructions in the first processor 2503.
- the above-mentioned first processor 2503 can be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components.
- the first processor 2503 can implement or execute the methods, steps and logic block diagrams disclosed in the embodiments of the present application.
- the general-purpose processor can be a microprocessor or the processor can also be any conventional processor, etc.
- the steps of the method disclosed in the embodiments of the present application can be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
- the software module can be located in a mature storage medium in the field such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory or an electrically erasable programmable memory, a register, etc.
- the storage medium is located in the first memory 2502, and the first processor 2503 reads the information in the first memory 2502 and completes the steps of the above method in combination with its hardware.
- the processing unit can be implemented in one or more application specific integrated circuits (Application Specific Integrated Circuits, ASIC), digital signal processors (Digital Signal Processing, DSP), digital signal processing devices (DSP Device, DSPD), programmable logic devices (Programmable Logic Device, PLD), field programmable gate arrays (Field-Programmable Gate Array, FPGA), general processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application or a combination thereof.
- the technology described in this application can be implemented by a module (such as a process, function, etc.) that performs the functions described in this application.
- the software code can be stored in a memory and executed by a processor.
- the memory can be implemented in the processor or outside the processor.
- the first processor 2503 is further configured to execute the method described in any one of the aforementioned embodiments when running the computer program.
- This embodiment provides an encoder, in which, for an inter-frame prediction mode value, the inter-frame prediction mode value is no longer converted into binary for direct encoding; instead, the value of at least one mode identification information is determined according to the inter-frame prediction mode value, and then the value of the at least one mode identification information is encoded; wherein the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i.
- the decoder 260 may include: a decoding unit 2601 and a second determining unit 2602; wherein,
- the decoding unit 2601 is configured to decode the bitstream and determine the value of at least one mode identification information; wherein the mode identification information includes at least the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode;
- the second determination unit 2602 is configured to determine the inter-frame prediction mode value of the current node according to the value of at least one mode identification information; and determine the prediction value of the current node according to the inter-frame prediction mode value.
- the decoding unit 2601 is further configured to decode the code stream based on the first decoding mode and determine a value of at least one mode identification information.
- the first decoding mode includes at least one of the following: a decoding mode with fixed context information, a decoding mode with adaptive context information, and a decoding mode without using context information.
- the decoding unit 2601 is further configured to decode the bitstream to determine the value of the i-th mode identification information;
- the second determining unit 2602 is further configured to use i+1 mode identification information as at least one mode identification information if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is less than or equal to i; wherein the i+1 mode identification information includes the 0th mode identification information, the 1st mode identification information, ..., the i-th mode identification information;
- the decoding unit 2601 is also configured to update i based on i+1 if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is greater than i, and continue to perform the steps of decoding the code stream and determining the value of the i-th mode identification information until the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is less than or equal to i.
- the second determination unit 2602 is further configured to: if the value of the i-th mode identification information is a first value, determine that the inter-frame prediction mode value of the current node indicated by the i-th mode identification information is less than or equal to i; if the value of the i-th mode identification information is a second value, determine that the inter-frame prediction mode value of the current node indicated by the i-th mode identification information is greater than i.
- the second determining unit 2602 is further configured to set the inter-frame prediction mode value of the current node to be equal to i when the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is less than or equal to i.
- the second determination unit 2602 is further configured to, when i is equal to N-1, if the N-1th mode identification information indicates that the inter-frame prediction mode value of the current node is less than or equal to N-1, then the inter-frame prediction mode value of the current node is set to be equal to N-1; if the N-1th mode identification information indicates that the inter-frame prediction mode value of the current node is greater than N-1, then the inter-frame prediction mode value of the current node is set to be equal to N.
- the second determination unit 2602 is further configured to, when i is equal to M, if the M-th mode identification information indicates that the inter-frame prediction mode value of the current node is greater than M, decode the code stream based on the second decoding mode to determine the first inter-frame prediction mode residual value of the current node; and determine the inter-frame prediction mode value of the current node according to the values of M+1 mode identification information and the first inter-frame prediction mode residual value; wherein the M+1 mode identification information includes: the 0th mode identification information, the 1st mode identification information, ..., the Mth mode identification information; M is an integer greater than or equal to 0 and less than N.
- the second determining unit 2602 is further configured to perform an addition operation on the values of the M+1 mode identification information and the first inter-frame prediction mode residual value to determine the inter-frame prediction mode value of the current node.
- the second determining unit 2602 is further configured to perform an addition operation on the first inter-frame prediction mode residual value and (M+1) to determine the inter-frame prediction mode value of the current node.
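A matching decoder-side sketch, under the same assumed 0/1 convention, with read_flag and read_residual as hypothetical callables standing in for the entropy decoder (for example, the first and second decoding modes):

```python
def mode_value_from_flags(read_flag, N, M=None, read_residual=None):
    """Reconstruct the inter-frame prediction mode value (illustrative sketch).

    read_flag: callable returning the next decoded flag, 0 for "less than or equal
               to i" and 1 for "greater than i" (assumed convention).
    N: maximum value of the inter-frame prediction mode.
    M: optional cutoff after which a residual is decoded instead of further flags.
    read_residual: callable decoding the first inter-frame prediction mode residual
                   value (e.g. via the second decoding mode); required when M is used.
    """
    i = 0
    while True:
        if read_flag() == 0:
            return i                          # mode value is less than or equal to i, hence i
        if M is not None and i == M:
            return read_residual() + (M + 1)  # mode value is greater than M
        if i == N - 1:
            return N                          # flag says "greater than N-1": the maximum value
        i += 1                                # flag says "greater than i": read the next flag
```

Checking the M cutoff before the N-1 termination keeps this sketch consistent with the encoder-side sketch above even when M is chosen equal to N-1.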
- the decoding unit 2601 is further configured to decode the bitstream to determine the value of the i-th mode identification information;
- the second determining unit 2602 is further configured to, if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is equal to i, use i+1 mode identification information as at least one mode identification information; wherein the i+1 mode identification information includes the 0th mode identification information, the 1st mode identification information, ..., the i-th mode identification information;
- the decoding unit 2601 is also configured to update i based on i+1 if the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is not equal to i, and continue to perform the steps of decoding the code stream and determining the value of the i-th mode identification information until the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is equal to i.
- the second determination unit 2602 is further configured to: if the value of the i-th mode identification information is a first value, determine that the inter-frame prediction mode value of the current node indicated by the i-th mode identification information is not equal to i; if the value of the i-th mode identification information is a second value, determine that the inter-frame prediction mode value of the current node indicated by the i-th mode identification information is equal to i.
- the second determining unit 2602 is further configured to set the inter-frame prediction mode value of the current node to be equal to i when the i-th mode identification information indicates that the inter-frame prediction mode value of the current node is equal to i.
- the second determination unit 2602 is further configured to, when i is equal to N-1, if the N-1th mode identification information indicates that the inter-frame prediction mode value of the current node is equal to N-1, then the inter-frame prediction mode value of the current node is set to be equal to N-1; if the N-1th mode identification information indicates that the inter-frame prediction mode value of the current node is not equal to N-1, then the inter-frame prediction mode value of the current node is set to be equal to N.
- the second determination unit 2602 is further configured to, when i is equal to M, if the M-th mode identification information indicates that the inter-frame prediction mode value of the current node is not equal to M, decode the code stream based on the second decoding mode to determine the second inter-frame prediction mode residual value of the current node; and determine the inter-frame prediction mode value of the current node according to the values of M+1 mode identification information and the second inter-frame prediction mode residual value; wherein the M+1 mode identification information includes: the 0th mode identification information, the 1st mode identification information, ..., the Mth mode identification information; M is an integer greater than or equal to 0 and less than N.
- the second determination unit 2602 is further configured to perform a negation operation on the values of the M+1 mode identification information to determine the negated values of the M+1 mode identification information; and to perform an addition operation on the negated values of the M+1 mode identification information and the residual value of the second inter-frame prediction mode to determine the inter-frame prediction mode value of the current node.
- the second determining unit 2602 is further configured to perform an addition operation on the second inter-frame prediction mode residual value and (M+1) to determine the inter-frame prediction mode value of the current node.
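For the "equal to i" variant just described, a short sketch of the reconstruction from the M+1 decoded flag values and the second inter-frame prediction mode residual value, assuming 0 stands for "not equal to i" and 1 for "equal to i" (bit assignments are an assumption for illustration):

```python
def mode_value_from_equal_flags(flag_values, residual):
    """Illustrative sketch of the 'equal to i' variant once the mode exceeds M.

    flag_values: the M+1 decoded flags, assumed 0 for "not equal to i" and 1 for
                 "equal to i" (so all are 0 when the mode value exceeds M).
    residual: the second inter-frame prediction mode residual value.
    """
    negated = [1 - value for value in flag_values]   # negation operation on the M+1 flags
    return sum(negated) + residual                   # equals residual + (M + 1) when all flags are 0
```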
- the second decoding mode comprises: an Exponential Golomb decoding mode.
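The embodiments name an Exponential Golomb decoding mode but do not fix its order here; the sketch below assumes an order-0 Exp-Golomb code and a hypothetical read_bit callable.

```python
def decode_exp_golomb_order0(read_bit):
    """Decode one order-0 Exp-Golomb codeword (illustrative sketch).

    read_bit: callable returning the next bit of the code stream as 0 or 1.
    """
    leading_zeros = 0
    while read_bit() == 0:            # unary prefix: count the leading zero bits
        leading_zeros += 1
    suffix = 0
    for _ in range(leading_zeros):    # read the same number of suffix bits
        suffix = (suffix << 1) | read_bit()
    return (1 << leading_zeros) - 1 + suffix
```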
- the decoding unit 2601 is further configured to decode the code stream to determine the value of the first identification information; and if the first identification information indicates that the current node uses the inter-frame prediction mode, execute the step of decoding the code stream to determine the value of at least one mode identification information.
- the second determination unit 2602 is further configured to determine that the first identification information indicates that the current node does not use the inter-frame prediction mode if the value of the first identification information is a third value; and to determine that the first identification information indicates that the current node uses the inter-frame prediction mode if the value of the first identification information is a fourth value.
- the decoding unit 2601 is further configured to decode the code stream and determine the value of the second identification information; and if the second identification information indicates that the current node enables the target inter-frame decoding method, execute the step of decoding the code stream and determining the value of at least one mode identification information.
- the second determination unit 2602 is further configured to, if the value of the second identification information is the fifth value, determine that the second identification information indicates that the current node does not enable the target inter-frame decoding method; if the value of the second identification information is the sixth value, determine that the second identification information indicates that the current node enables the target inter-frame decoding method.
- the second determination unit 2602 is further configured to determine a selected node from at least one candidate node according to the inter-frame prediction mode value; and determine a prediction value of the current node according to the selected node.
- At least one candidate node includes at least one of the following: at least one second candidate node and at least one fourth candidate node; wherein, at least one second candidate node is a candidate node in a first reference frame, and at least one fourth candidate node is a candidate node in a second reference frame; and the first reference frame is a previous frame of the current frame, the second reference frame is obtained by global motion of the previous frame, and the current frame includes the current node.
- the second determination unit 2602 is also configured to determine the previous decoded node of the current node; based on the previous decoded node, determine the first candidate node in the first reference frame; wherein the geometric parameters of the previous decoded node and the first candidate node satisfy the first condition; based on the first candidate node, determine the 1st second candidate node to the pth second candidate node in the first reference frame according to a preset method; and use the pth second candidate node as the selected node; wherein p is a positive integer greater than 0, and the value of p is associated with the inter-frame prediction mode value.
- the second determination unit 2602 is also configured to determine the previous decoded node of the current node; based on the previous decoded node, determine the third candidate node in the second reference frame; wherein the geometric parameters of the previous decoded node and the third candidate node satisfy the first condition; based on the third candidate node, determine the 1st fourth candidate node to the qth fourth candidate node in the second reference frame according to a preset method; and use the qth fourth candidate node as the selected node; wherein q is a positive integer greater than 0, and the value of q is associated with the inter-frame prediction mode value.
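Purely as an illustrative sketch of this selection logic: find_first_candidate, find_third_candidate and next_candidate are hypothetical helpers standing in for the "first condition" on geometric parameters and the "preset method", neither of which is defined here, and the even/odd mapping from the inter-frame prediction mode value to p, q and the choice of reference frame is likewise an assumption.

```python
def select_node(mode_value, prev_decoded_node, first_ref_frame, second_ref_frame,
                find_first_candidate, find_third_candidate, next_candidate):
    """Hypothetical sketch: choose the selected node from one of the two reference frames.

    The helper callables and the mapping from mode_value to (p, q) below are
    assumptions for illustration only; the embodiments define the actual rules.
    """
    if mode_value % 2 == 0:
        p = mode_value // 2 + 1                   # p is associated with the mode value
        candidate = find_first_candidate(prev_decoded_node, first_ref_frame)
        for _ in range(p):                        # step to the p-th second candidate node
            candidate = next_candidate(candidate, first_ref_frame)
    else:
        q = (mode_value + 1) // 2                 # q is associated with the mode value
        candidate = find_third_candidate(prev_decoded_node, second_ref_frame)
        for _ in range(q):                        # step to the q-th fourth candidate node
            candidate = next_candidate(candidate, second_ref_frame)
    return candidate
```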
- the decoding unit 2601 is further configured to decode the bitstream and determine the prediction residual value and quantization parameter of the current node;
- the second determining unit 2602 is further configured to perform inverse quantization processing on the prediction residual value according to the quantization parameter to obtain an inverse quantized residual value; and determine the reconstruction information of the current node according to the inverse quantized residual value and the prediction value.
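A minimal sketch of these final reconstruction steps, assuming a plain uniform inverse quantization in which the quantization parameter acts as the step size (the embodiments do not spell out the exact rule here):

```python
def reconstruct_node(pred_residual, quant_param, pred_value):
    """Illustrative sketch of the last decoding steps for one node.

    A uniform inverse quantization with quant_param as the step size is assumed;
    the embodiments do not fix the exact inverse quantization rule here.
    """
    dequant_residual = pred_residual * quant_param    # inverse quantized residual value
    return dequant_residual + pred_value              # reconstruction information of the node
```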
- a "unit" can be a part of a circuit, a part of a processor, a part of a program or software, etc., and of course it can also be a module, or it can be non-modular.
- the components in this embodiment can be integrated into a processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.
- the above-mentioned integrated unit can be implemented in the form of hardware or in the form of a software functional module.
- if the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
- this embodiment provides a computer-readable storage medium, which is applied to the decoder 260, and the computer-readable storage medium stores a computer program. When the computer program is executed by the second processor, the method described in any one of the above embodiments is implemented.
- the decoder 260 may include: a second communication interface 2701, a second memory 2702 and a second processor 2703; each component is coupled together through a second bus system 2704. It can be understood that the second bus system 2704 is used to realize the connection and communication between these components.
- the second bus system 2704 also includes a power bus, a control bus and a status signal bus; however, for the sake of clarity, the various buses are all marked as the second bus system 2704 in Figure 27. Specifically:
- the second communication interface 2701 is used for receiving and sending signals during the process of sending and receiving information with other external network elements;
- the second memory 2702 is used to store a computer program that can be run on the second processor 2703;
- the second processor 2703 is configured to execute, when running the computer program:
- decoding the bitstream and determining the value of at least one mode identification information; wherein the mode identification information includes at least the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode;
- determining the inter-frame prediction mode value of the current node according to the value of the at least one mode identification information; and determining the prediction value of the current node according to the inter-frame prediction mode value.
- the second processor 2703 is further configured to execute any one of the methods described in the foregoing embodiments when running the computer program.
- the present embodiment provides a decoder, in which, for the inter-frame prediction mode value, the inter-frame prediction mode value is no longer decoded directly from its binary form; instead, the value of at least one mode identification information is determined by decoding the code stream, and then the inter-frame prediction mode value is determined according to the value of the at least one mode identification information; wherein the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i. This takes into account the frequency distribution of the inter-frame prediction modes, that is, the more likely an inter-frame prediction mode is to appear, the smaller its corresponding inter-frame prediction mode value is, thereby reducing the number of encoding bits, saving bit rate, and further improving encoding and decoding efficiency.
- a schematic diagram of the composition structure of a coding and decoding system provided in an embodiment of the present application is shown.
- a coding and decoding system 280 may include an encoder 2801 and a decoder 2802.
- the encoder 2801 may be the encoder described in any one of the aforementioned embodiments
- the decoder 2802 may be the decoder described in any one of the aforementioned embodiments.
- at the encoding end, the inter-frame prediction mode value of the current node is determined; according to the inter-frame prediction mode value, the value of at least one mode identification information is determined; wherein the mode identification information at least includes the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode; the value of the at least one mode identification information is encoded, and the obtained coded bits are written into the bitstream.
- at the decoding end, the bitstream is decoded and the value of at least one mode identification information is determined; wherein the mode identification information at least includes the i-th mode identification information, and the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i; i is an integer greater than or equal to 0 and less than N, and N represents the maximum value of the inter-frame prediction mode; the inter-frame prediction mode value of the current node is determined according to the value of the at least one mode identification information; and the prediction value of the current node is determined according to the inter-frame prediction mode value.
- the inter-frame prediction mode value is no longer converted into binary for direct encoding, but the value of at least one mode identification information is determined according to the inter-frame prediction mode value, and then the value of the at least one mode identification information is encoded; wherein, the i-th mode identification information is used to indicate whether the inter-frame prediction mode value of the current node is greater than or equal to i.
- This takes into account the frequency distribution of the inter-frame prediction modes, that is, the more likely an inter-frame prediction mode is to appear, the smaller its corresponding inter-frame prediction mode value is, thereby reducing the number of encoding bits, saving bit rate, and thus improving encoding and decoding efficiency.
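To make the bit-saving intuition concrete, here is a small round-trip usage example of the two sketches given earlier (mode_flags_from_value and mode_value_from_flags), assuming N = 4 and the same 0/1 convention; the maximum mode value N is an assumption chosen only for illustration.

```python
# Round-trip check of the illustrative sketches above (assumed N = 4, no cutoff M).
N = 4
for mode_value in range(N + 1):
    flags, _ = mode_flags_from_value(mode_value, N)
    bit_iter = iter(flags)
    decoded = mode_value_from_flags(lambda: next(bit_iter), N)
    assert decoded == mode_value
    # Smaller (more frequent) mode values get shorter codes:
    # mode 0 costs 1 bit, the maximum mode N costs N bits.
    print(mode_value, flags)
```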
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
Disclosed in the embodiments of the present application are an encoding method, a decoding method, a code stream, an encoder, a decoder and a storage medium. The decoding method comprises: decoding a code stream, and determining the value of at least one piece of mode identification information, the mode identification information comprising at least an i-th piece of mode identification information, the i-th mode identification information being used to indicate whether an inter-frame prediction mode value of the current node is greater than or equal to i, i being an integer greater than or equal to 0 and less than N, and N representing the maximum value of the inter-frame prediction mode; determining the inter-frame prediction mode value of the current node according to the value of the at least one piece of mode identification information; and determining a predicted value of the current node according to the inter-frame prediction mode value. The number of coded bits can therefore be reduced, which makes it possible to lower the code rate and improve the coding and decoding efficiency.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2023/087289 WO2024212042A1 (fr) | 2023-04-10 | 2023-04-10 | Procédé de codage, procédé de décodage, flux de code, codeur, décodeur et support d'enregistrement |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2023/087289 WO2024212042A1 (fr) | 2023-04-10 | 2023-04-10 | Procédé de codage, procédé de décodage, flux de code, codeur, décodeur et support d'enregistrement |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024212042A1 true WO2024212042A1 (fr) | 2024-10-17 |
Family
ID=93058637
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2023/087289 Pending WO2024212042A1 (fr) | 2023-04-10 | 2023-04-10 | Procédé de codage, procédé de décodage, flux de code, codeur, décodeur et support d'enregistrement |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2024212042A1 (fr) |
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20220086426A1 (en) * | 2018-12-28 | 2022-03-17 | Electronics And Telecommunications Research Institute | Method and apparatus for deriving intra-prediction mode |
| US20220207780A1 (en) * | 2020-12-29 | 2022-06-30 | Qualcomm Incorporated | Inter prediction coding for geometry point cloud compression |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2024145904A1 (fr) | Procédé de codage, procédé de décodage, flux de code, codeur, décodeur, et support de stockage | |
| WO2024212042A1 (fr) | Procédé de codage, procédé de décodage, flux de code, codeur, décodeur et support d'enregistrement | |
| WO2024212038A1 (fr) | Procédé de codage, procédé de décodage, flux de code, codeur, décodeur et support d'enregistrement | |
| WO2024212045A1 (fr) | Procédé de codage, procédé de décodage, flux de code, codeur, décodeur et support de stockage | |
| WO2024212043A1 (fr) | Procédé de codage, procédé de décodage, flux de code, codeur, décodeur et support de stockage | |
| WO2025015523A1 (fr) | Procédé de codage, procédé de décodage, flux de bits, codeur, décodeur et support de stockage | |
| WO2025076663A1 (fr) | Procédé de codage, procédé de décodage, codeur, décodeur, et support de stockage | |
| WO2024216477A1 (fr) | Procédés de codage/décodage, codeur, décodeur, flux de code et support de stockage | |
| WO2025145433A1 (fr) | Procédé de codage de nuage de points, procédé de décodage de nuage de points, codec, flux de code et support de stockage | |
| WO2024216476A1 (fr) | Procédé de codage/décodage, codeur, décodeur, flux de code, et support de stockage | |
| WO2025010601A9 (fr) | Procédé de codage, procédé de décodage, codeurs, décodeurs, flux de code et support de stockage | |
| WO2025007355A9 (fr) | Procédé de codage, procédé de décodage, flux de code, codeur, décodeur et support de stockage | |
| WO2025007360A1 (fr) | Procédé de codage, procédé de décodage, flux binaire, codeur, décodeur et support d'enregistrement | |
| WO2025010604A1 (fr) | Procédé de codage de nuage de points, procédé de décodage de nuage de points, décodeur, flux de code et support d'enregistrement | |
| WO2024234132A9 (fr) | Procédé de codage, procédé de décodage, flux de code, codeur, décodeur et support d'enregistrement | |
| WO2024207456A1 (fr) | Procédé de codage et de décodage, codeur, décodeur, flux de code et support de stockage | |
| WO2025010600A9 (fr) | Procédé de codage, procédé de décodage, flux de code, codeur, décodeur et support de stockage | |
| WO2024216479A9 (fr) | Procédé de codage et de décodage, flux de code, codeur, décodeur et support de stockage | |
| WO2025076672A1 (fr) | Procédé de codage, procédé de décodage, codeur, décodeur, flux de code, et support de stockage | |
| WO2024207481A1 (fr) | Procédé de codage, procédé de décodage, codeur, décodeur, support de stockage et de flux binaire | |
| WO2025007349A1 (fr) | Procédés de codage et de décodage, flux binaire, codeur, décodeur et support de stockage | |
| WO2024148598A1 (fr) | Procédé de codage, procédé de décodage, codeur, décodeur et support de stockage | |
| WO2025076668A1 (fr) | Procédé de codage, procédé de décodage, codeur, décodeur et support de stockage | |
| WO2024145910A1 (fr) | Procédé de codage, procédé de décodage, flux de bits, codeur, décodeur et support de stockage | |
| TW202431857A (zh) | 編解碼方法、碼流、編碼器、解碼器以及儲存媒介 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23932342; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |