WO2025217813A1 - Point cloud encoding method, point cloud decoding method, encoder, decoder, bitstream, and storage medium - Google Patents
Point cloud encoding method, point cloud decoding method, encoder, decoder, bitstream, and storage medium
- Publication number
- WO2025217813A1 (PCT/CN2024/088076)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- parameter
- value
- intersection
- values
- point cloud
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
- G06T9/40—Tree coding, e.g. quadtree, octree
Definitions
- the present application relates to the field of point cloud encoding and decoding technology, and in particular to a point cloud encoding and decoding method, codec, bit stream and storage medium.
- intersection centroid offset value can more accurately restore the shape of the point cloud.
- a context model can be used to encode and decode the intersection centroid offset value.
- encoding and decoding the intersection centroid offset value requires a large amount of context, which is not conducive to improving encoding and decoding performance.
- the present invention provides a point cloud encoding and decoding method, codec, code stream, and storage medium. The following describes various aspects of the present invention.
- a point cloud decoding method is provided, which is applied to a decoder, including: determining a first parameter of a current block based on at least one intersection in a reference block, the first parameter being used to indicate the number of inaccurate values in an intersection prediction value determined based on the at least one intersection; determining a second parameter based on the first parameter, the second parameter including a first value, and the first value corresponding to multiple values of the first parameter; determining a context model of a third parameter based on the second parameter, the third parameter being used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; and decoding the third parameter based on the context model of the third parameter.
- a point cloud encoding method is provided, which is applied to an encoder, including: determining a first parameter of a current block based on at least one intersection point in a reference block, the first parameter being used to indicate the number of inaccurate values in an intersection prediction value determined based on the at least one intersection point; determining a second parameter based on the first parameter, the second parameter including a first value, and the first value corresponding to multiple values of the first parameter; determining a context model of a third parameter based on the second parameter, the third parameter being used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; and encoding the third parameter based on the context model of the third parameter.
- a decoder comprising: a first determination unit, configured to determine a first parameter of a current block based on at least one intersection in a reference block, the first parameter being used to indicate the number of inaccurate values in an intersection prediction value determined based on the at least one intersection; a second determination unit, configured to determine a second parameter based on the first parameter, the second parameter including a first value, and the first value corresponding to multiple values of the first parameter; a third determination unit, configured to determine a context model of a third parameter based on the second parameter, the third parameter being used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; and a decoding unit, configured to decode the third parameter based on the context model of the third parameter.
- a decoder comprising: a memory for storing a computer program; and a processor for executing the method of the first aspect when running the computer program.
- an encoder comprising: a first determination unit, configured to determine a first parameter of a current block based on at least one intersection in a reference block, the first parameter being used to indicate the number of inaccurate values in an intersection prediction value determined based on the at least one intersection; a second determination unit, configured to determine a second parameter based on the first parameter, the second parameter including a first value, and the first value corresponding to multiple values of the first parameter; a third determination unit, configured to determine a context model of a third parameter based on the second parameter, the third parameter being used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; and an encoding unit, configured to encode the third parameter based on the context model of the third parameter.
- an encoder comprising: a memory for storing a computer program; and a processor for executing the method of the second aspect when running the computer program.
- a computer-readable storage medium stores a computer program, and when the computer program is executed, the method of the first aspect or the second aspect is implemented.
- a non-volatile computer-readable storage medium for storing a bit stream, wherein the bit stream is generated by an encoding method using an encoder, or the bit stream is decoded by a decoding method using a decoder, wherein the decoding method is the method of the first aspect and the encoding method is the method of the second aspect.
- a code stream comprising a code stream generated according to the method of the second aspect.
- the second parameter can be used as a parameter of the context model for determining the third parameter (which can be used to determine the intersection centroid offset value of the current block), wherein one value of the second parameter corresponds to multiple values of the first parameter.
- the number of index values (or values) of the second parameter is less than the number of index values of the first parameter.
- the context model of the third parameter determined by the second parameter can reduce the number of required context models, which is beneficial to improving the performance of encoding and decoding the third parameter.
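The many-to-one mapping from the first parameter to the second parameter can be sketched as follows. This is only an illustration under stated assumptions, not the mapping used in the application: the range 0..12 for the first parameter, the bucket boundaries in `BUCKET_UPPER_BOUNDS`, and the function names are all hypothetical.

```python
# Hypothetical bucket boundaries (inclusive upper bounds). The source only
# states that one value of the second parameter corresponds to multiple
# values of the first parameter; the concrete split below is assumed.
BUCKET_UPPER_BOUNDS = (0, 2, 5, 12)


def to_second_parameter(first_parameter: int) -> int:
    """Map the first parameter (count of inaccurate predictions) to a bucket."""
    if first_parameter < 0:
        raise ValueError("first parameter must be non-negative")
    for bucket, upper in enumerate(BUCKET_UPPER_BOUNDS):
        if first_parameter <= upper:
            return bucket
    raise ValueError("first parameter outside the assumed range 0..12")


def context_index(first_parameter: int) -> int:
    """Pick the context model of the third parameter from the bucket index."""
    return to_second_parameter(first_parameter)


# 13 possible first-parameter values collapse onto 4 context indices.
first_values = list(range(13))
second_values = sorted({to_second_parameter(v) for v in first_values})
```

With these assumed boundaries, thirteen values of the first parameter share four context models, which is the kind of reduction the passage describes.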
- FIG. 1A is a schematic diagram of a three-dimensional point cloud image.
- FIG. 1B is a partially enlarged view of a three-dimensional point cloud image.
- FIG. 2A is a schematic diagram of six viewing angles of a point cloud image.
- FIG. 2B is a schematic diagram of a data storage format corresponding to a point cloud image.
- FIG. 3 is a schematic diagram of a network architecture for point cloud encoding and decoding.
- FIG. 4A is a schematic diagram of a composition framework of a G-PCC encoder.
- FIG. 4B is a schematic diagram of a composition framework of a G-PCC decoder.
- FIG. 5A is a schematic diagram of a low plane position in the Z-axis direction.
- FIG. 5B is a schematic diagram of a high plane position in the Z-axis direction.
- FIG. 6 is a schematic diagram of a node encoding sequence.
- FIG. 7A is a schematic diagram of plane identification information.
- FIG. 7B is a schematic diagram of another type of plane identification information.
- FIG. 8 is a schematic diagram of sibling nodes of a current node.
- FIG. 9 is a schematic diagram of calculating a centroid offset value.
- FIG. 10A is a schematic diagram showing three intersection points included in a sub-block.
- FIG. 10B is a schematic diagram of a triangular facet set fitted using three intersection points.
- FIG. 10C is a schematic diagram of upsampling of a triangular facet set.
- FIG. 11 is a schematic flow chart of calculating a centroid offset value.
- FIG. 12 is a schematic diagram of the context required for decoding the Intersameflag.
- FIG. 13 is a flow chart of the point cloud decoding method provided in an embodiment of the present application.
- FIG. 14 is a flow chart of the point cloud encoding method provided in an embodiment of the present application.
- FIG. 15 is a schematic flow chart of calculating a centroid offset value provided in an embodiment of the present application.
- FIG. 16 is a schematic diagram of the context required for decoding the Intersameflag provided in an embodiment of the present application.
- FIG. 17 is a schematic diagram of the structure of a decoder provided in an embodiment of the present application.
- FIG. 18 is a schematic structural diagram of a decoder provided in another embodiment of the present application.
- FIG. 19 is a schematic diagram of the structure of an encoder provided in one embodiment of the present application.
- FIG. 20 is a schematic diagram of the structure of an encoder provided in another embodiment of the present application.
- The terms “first/second/third” in the embodiments of the present application are only used to distinguish similar objects and do not represent a specific ordering of the objects. It can be understood that “first/second/third” can be interchanged in a specific order or sequence where permitted, so that the embodiments of the present application described here can be implemented in an order other than that illustrated or described here.
- a point cloud is a set of irregularly distributed discrete points in space that represent the spatial structure and surface properties of a three-dimensional object or scene. These points contain geometric information representing spatial location and attribute information representing the point cloud's appearance and texture.
- Figure 1A shows a 3D point cloud image
- Figure 1B shows a zoomed-in view of a 3D point cloud image. It can be seen that the point cloud surface is composed of densely distributed points.
- In a two-dimensional image, each pixel contains information and is distributed regularly, so there's no need to record its location. However, the distribution of points in a point cloud in three-dimensional space is random and irregular, so recording the location of each point in space is necessary to fully represent the point cloud. Similar to a two-dimensional image, each location in the acquisition process has corresponding attribute information, typically an RGB color value, which reflects the object's color. For a point cloud, in addition to color information, each point's corresponding attribute information often includes reflectance values, which reflect the surface texture of the object. Therefore, point cloud data typically includes both point location information and point attribute information. Point location information can also be referred to as point geometric information.
- point geometric information can be the point's three-dimensional coordinates (x, y, z).
- Point attribute information can include color information and/or reflectance.
- reflectance can be one-dimensional reflectance information (r).
- Color information can be information in any color space, or it can be three-dimensional color information, such as RGB.
- R represents red (red)
- G represents green (green)
- B represents blue (blue).
- color information can be luminance and chrominance (YCbCr, YUV) information.
- Y represents brightness (luma)
- Cb (U) represents blue color difference
- Cr(V) represents red color difference.
- a point cloud generated using laser measurement principles can include both its 3D coordinate information and its reflectivity.
- a point cloud generated using photogrammetry principles can include both its 3D coordinate information and its 3D color information.
- a point cloud generated using a combination of laser measurement and photogrammetry principles can include both its 3D coordinate information, its reflectivity value, and its 3D color information.
- Figures 2A and 2B show a point cloud image and its corresponding data storage format.
- Figure 2A provides six viewing angles of the point cloud image
- Figure 2B consists of a file header and data.
- the header includes the data format, data representation type, the total number of points in the point cloud, and the content represented by the point cloud.
- the point cloud is in ".ply" format, represented by ASCII code, with a total of 207,242 points.
- Each point has 3D coordinate information (x, y, z) and 3D color information (r, g, b).
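The header-plus-data layout described above can be illustrated with a small reader for an ASCII ".ply" file. This is a simplified sketch, not a full PLY parser: it assumes a single `vertex` element, whitespace-separated numeric fields, and the property names used in the hypothetical `SAMPLE_PLY` text (a two-point file rather than the 207,242-point file in the figure).

```python
SAMPLE_PLY = """ply
format ascii 1.0
element vertex 2
property float x
property float y
property float z
property uchar red
property uchar green
property uchar blue
end_header
0 0 0 255 0 0
1 2 3 0 255 0
"""


def parse_ascii_ply(text):
    """Return (point_count, property_names, points) from an ASCII PLY string."""
    lines = text.strip().splitlines()
    if lines[0].strip() != "ply":
        raise ValueError("not a PLY file")
    point_count = 0
    names = []
    index = 1
    # Header: read the vertex count and the per-point property names.
    while lines[index].strip() != "end_header":
        tokens = lines[index].split()
        if tokens[:2] == ["element", "vertex"]:
            point_count = int(tokens[2])
        elif tokens[0] == "property":
            names.append(tokens[-1])
        index += 1
    # Data: one line per point, fields in the order declared in the header.
    rows = lines[index + 1 : index + 1 + point_count]
    points = [dict(zip(names, (float(t) for t in row.split()))) for row in rows]
    return point_count, names, points


count, names, points = parse_ascii_ply(SAMPLE_PLY)
```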
- Point clouds can be divided into the following categories according to the acquisition method:
- Static point cloud: the object is stationary and the device that acquires the point cloud is also stationary;
- Dynamic point cloud: the object is moving, but the device that acquires the point cloud is stationary;
- Dynamically acquired point cloud: the device used to acquire the point cloud is in motion.
- point clouds can be divided into two categories according to their usage:
- Category 1: machine perception point cloud, which can be used in scenarios such as autonomous navigation systems, real-time inspection systems, geographic information systems, visual sorting robots, and disaster relief robots;
- Category 2: human eye perception point cloud, which can be used in point cloud application scenarios such as digital cultural heritage, free viewpoint broadcasting, 3D immersive communication, and 3D immersive interaction.
- Point clouds can flexibly and conveniently express the spatial structure and surface properties of three-dimensional objects or scenes. Moreover, since point clouds are obtained by directly sampling real objects, they can provide a strong sense of reality while ensuring accuracy. Therefore, they are widely used, including virtual reality games, computer-aided design, geographic information systems, automatic navigation systems, digital cultural heritage, free viewpoint broadcasting, three-dimensional immersive remote presentation, and three-dimensional reconstruction of biological tissues and organs.
- Point clouds are primarily collected through computer generation, 3D laser scanning, and 3D photogrammetry.
- Computers can generate point clouds of virtual 3D objects and scenes; 3D laser scanning can obtain point clouds of static real-world 3D objects or scenes, acquiring millions of points per second; and 3D photogrammetry can obtain point clouds of dynamic real-world 3D objects or scenes, acquiring tens of millions of points per second.
- the data volume for 10 seconds is approximately 1280 × 720 × 12 bits × 30 frames × 10 seconds, which is approximately 0.39 GB.
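The arithmetic behind this figure can be checked directly. The sketch below assumes that "30 frames" means 30 frames per second and that "GB" is counted as 2^30 bytes; both readings are assumptions.

```python
width, height = 1280, 720          # pixels per frame
bits_per_pixel = 12
frames_per_second = 30             # assumed reading of "30 frames"
duration_seconds = 10

total_bits = width * height * bits_per_pixel * frames_per_second * duration_seconds
total_bytes = total_bits // 8
size_in_gib = total_bytes / 2**30  # GB taken as 2**30 bytes (assumption)

print(f"{total_bytes} bytes = {size_in_gib:.2f} GB")
```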
- because a point cloud is a collection of a massive number of points, storing it not only consumes a lot of memory but is also not conducive to transmission. There is also not enough bandwidth to support direct transmission of the point cloud at the network layer without compression. Therefore, the point cloud needs to be compressed.
- point cloud coding frameworks that can compress point clouds include the geometry-based point cloud compression (G-PCC) codec framework or the video-based point cloud compression (V-PCC) codec framework provided by the Moving Picture Experts Group (MPEG), or the AVS-PCC codec framework provided by AVS.
- G-PCC codec framework can be used to compress the first type of static point clouds and the third type of dynamically acquired point clouds, and can be based on the Point Cloud Compression Test Platform (Test Model Compression 13, TMC13).
- the V-PCC codec framework can be used to compress the second type of dynamic point clouds, and can be based on the Point Cloud Compression Test Platform (Test Model Compression 2, TMC2). Therefore, the G-PCC codec framework is also called the Point Cloud Codec TMC13, and the V-PCC codec framework is also called the Point Cloud Codec TMC2.
- FIG3 is a schematic diagram of a network architecture of a point cloud encoding and decoding system provided by an embodiment of the present application.
- the network architecture includes one or more electronic devices 13 to 1N and a communication network 01, wherein the electronic devices 13 to 1N can perform video interaction through the communication network 01.
- the electronic device can be various types of devices with point cloud encoding and decoding functions.
- the electronic device can include a mobile phone, a tablet computer, a personal computer, a personal digital assistant, a navigator, a digital phone, a video phone, a television, a sensor device, a server, etc., which is not limited by the embodiment of the present application.
- the decoder or encoder in the embodiment of the present application can be the above-mentioned electronic device.
- the electronic device in the embodiment of the present application has a point cloud encoding and decoding function, generally including a point cloud encoder (i.e., encoder) and a point cloud decoder (i.e., decoder).
- the point cloud data to be encoded is first divided into multiple slices (also called strips).
- the geometric information of the point cloud and the attribute information corresponding to each point are encoded separately.
- Figure 4A shows a schematic diagram of the G-PCC encoder architecture.
- the geometric information is transformed so that the entire point cloud is contained within a bounding box.
- Quantization is then performed. This quantization step primarily serves a scaling purpose. Due to quantization rounding, the geometric information of some point clouds becomes identical. Parameters are then used to determine whether to remove duplicate points. This process of quantization and removing duplicate points is also known as voxelization.
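The quantization and duplicate-removal step (voxelization) can be sketched as follows. The function name `voxelize`, the `scale` factor, and the round-half-up rule are assumptions chosen for illustration, not details taken from the G-PCC encoder.

```python
import math


def voxelize(points, scale, remove_duplicates=True):
    """Scale and round coordinates, then optionally drop duplicate points."""
    # Round half up so that, e.g., 1.5 becomes 2 (assumed rounding rule).
    quantized = [
        tuple(math.floor(coord * scale + 0.5) for coord in point)
        for point in points
    ]
    if not remove_duplicates:
        return quantized
    seen = set()
    unique = []
    for voxel in quantized:
        if voxel not in seen:  # keep the first occurrence, preserve order
            seen.add(voxel)
            unique.append(voxel)
    return unique


cloud = [(0.1, 0.2, 0.3), (0.4, 0.1, 0.2), (1.6, 1.5, 0.9)]
voxels = voxelize(cloud, scale=1.0)
```

The first two points fall on the same integer position after rounding, so only one of them survives when duplicate removal is enabled.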
- the bounding box is then partitioned into an octree or a prediction tree is constructed. During this process, arithmetic coding is performed on the points within the leaf nodes of the partition to generate a binary geometry bitstream.
- arithmetic coding is performed on the intersections (vertex) of the partition (surface fitting based on the intersections) to generate a binary geometry bitstream.
- color conversion is performed to convert the color information (i.e., attribute information) from the RGB color space to the YUV color space.
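As an illustration of the color conversion step, the sketch below converts one 8-bit RGB triple to YCbCr. It uses the full-range BT.601 coefficients, which is an assumption: the text does not state which conversion matrix the encoder applies.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to 8-bit YCbCr (full-range BT.601, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clamp each component to the 8-bit range.
    return tuple(min(255, max(0, round(value))) for value in (y, cb, cr))


white = rgb_to_ycbcr(255, 255, 255)
black = rgb_to_ycbcr(0, 0, 0)
```

Gray-scale inputs such as white and black carry no color difference, so both chroma components land on the mid value 128.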
- the reconstructed geometry information is then used to recolor the point cloud so that the unencoded attribute information corresponds to the reconstructed geometry information.
- Attribute encoding is mainly performed on color information. In the process of color information encoding, there are three main transformation methods.
- the first two methods rely on the level of detail (LOD) division, which are distance-based lifting transformation and prediction transformation respectively.
- the third method is to directly perform RAHT. All three methods will convert color information from the spatial domain to the frequency domain, and obtain high-frequency coefficients and low-frequency coefficients through transformation. Finally, the coefficients are quantized and then arithmetic coding is performed on the quantized coefficients to generate a binary attribute bit stream.
- Figure 4B shows a schematic diagram of the composition framework of a G-PCC decoder.
- the geometric bit stream and attribute bit stream in the binary bit stream are first decoded independently.
- the geometric information of the point cloud is obtained through arithmetic decoding-reconstruction of the octree/reconstruction of the prediction tree-reconstruction of the geometry-coordinate inverse conversion;
- the attribute information of the point cloud is obtained through arithmetic decoding-inverse quantization-LOD partitioning/RAHT-color inverse conversion, and the point cloud data to be encoded (i.e., the output point cloud) is restored based on the geometric information and attribute information.
- the current geometric coding and decoding of G-PCC can be divided into octree-based geometric coding and decoding (marked by a dotted box) and prediction tree-based geometric coding and decoding (marked by a dotted box).
- octree geometry encoding includes: first, coordinate conversion of the geometric information so that all point clouds are contained in a bounding box. Then quantization is performed. This step of quantization mainly plays a role of scaling. Due to the quantization rounding, the geometric information of some points is the same. The parameters are used to decide whether to remove duplicate points. The process of quantization and removal of duplicate points is also called voxelization. Next, the bounding box is continuously divided into trees (such as octrees, quadtrees, binary trees, etc.) in the order of breadth-first traversal, and the placeholder code of each node is encoded.
- the bounding box of the point cloud is calculated. Assuming dx > dy > dz , the bounding box corresponds to a cuboid.
- binary tree partitioning is first performed along the x-axis, resulting in two child nodes.
- octree partitioning is performed until the resulting leaf node is a 1 × 1 × 1 unit cube.
- the points in the leaf node are encoded to generate a binary bitstream.
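A breadth-first octree partition that emits one 8-bit placeholder (occupancy) code per node can be sketched as below. The child numbering (x as the most significant bit, z as the least significant bit) and the function names are assumptions, the binary/quadtree stages are omitted, and no entropy coding is performed.

```python
from collections import deque


def child_index(point, origin, half):
    """Index 0..7 of the child containing the point (x is the MSB, z the LSB)."""
    x, y, z = (int(p >= o + half) for p, o in zip(point, origin))
    return (x << 2) | (y << 1) | z


def encode_octree(points, size):
    """Return occupancy codes in breadth-first order down to 1x1x1 leaves."""
    codes = []
    queue = deque([((0, 0, 0), size, list(points))])
    while queue:
        origin, node_size, node_points = queue.popleft()
        if node_size == 1:
            continue  # leaf node: nothing left to split
        half = node_size // 2
        children = [[] for _ in range(8)]
        for point in node_points:
            children[child_index(point, origin, half)].append(point)
        codes.append("".join("1" if child else "0" for child in children))
        for idx, child_points in enumerate(children):
            if child_points:
                child_origin = (
                    origin[0] + half * ((idx >> 2) & 1),
                    origin[1] + half * ((idx >> 1) & 1),
                    origin[2] + half * (idx & 1),
                )
                queue.append((child_origin, half, child_points))
    return codes


codes = encode_octree([(0, 0, 0), (3, 3, 3)], size=4)
```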
- Two parameters, K and M, are introduced during the binary/quadtree/octree partitioning process.
- Parameter K indicates the maximum number of binary/quadtree partitions to be performed before octree partitioning.
- Parameter M indicates the minimum block side length of 2^M corresponding to the binary/quadtree partitioning.
- Octree-based geometric information encoding can effectively encode point cloud geometry by leveraging the correlation between adjacent points in space. However, for relatively flat nodes or those with planar characteristics, plane coding can further improve the performance of point cloud geometry encoding.
- Figures 5A and 5B are schematic diagrams of plane positions.
- Figure 5A shows a low plane position in the Z-axis direction.
- Figure 5B shows a high plane position in the Z-axis direction.
- In Figure 5A, (a), (a0), (a1), (a2), and (a3) all belong to the low plane position in the Z-axis direction.
- the four child nodes occupied in the current node are all located at the low plane position of the current node in the Z-axis direction, so it can be considered that the current node belongs to a Z plane and is a low plane in the Z-axis direction.
- FIG6 provides a schematic diagram of the node coding order, that is, the node coding is performed in the order of 0, 1, 2, 3, 4, 5, 6, and 7 as shown in FIG6.
- the placeholder information of the current node is represented as: 10101010.
- Coding method: first, an identifier must be encoded to indicate that the current node is a plane in the Z-axis direction; second, if the current node is a plane in the Z-axis direction, the plane position of the current node must also be represented; finally, only the placeholder information of the low plane node in the Z-axis direction (i.e., the placeholder information of the four child nodes 0, 2, 4, and 6) needs to be encoded. Therefore, with the plane coding method, only 6 bits need to be encoded for the current node, which saves 2 bits compared with the octree coding of the related technology.
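The 6-bit count can be reproduced with a short sketch. It assumes that the placeholder string lists child nodes 0 to 7 from left to right and that the least significant bit of the child index selects the low (0) or high (1) Z plane; the function names are hypothetical.

```python
def z_plane(occupancy):
    """Return (plane_mode, plane_pos) for the Z axis of an 8-character code."""
    occupied = [i for i, bit in enumerate(occupancy) if bit == "1"]
    z_positions = {i & 1 for i in occupied}  # 0 = low plane, 1 = high plane
    if len(z_positions) != 1:
        return 0, None  # children on both planes (or none): not planar
    return 1, z_positions.pop()


def planar_payload(occupancy):
    """Describe what is coded for a Z-planar node, or None if not planar."""
    plane_mode, plane_pos = z_plane(occupancy)
    if plane_mode == 0:
        return None
    in_plane = "".join(occupancy[i] for i in range(8) if (i & 1) == plane_pos)
    return {
        "plane_mode": plane_mode,
        "plane_pos": plane_pos,
        "occupancy": in_plane,
        "bits": 1 + 1 + len(in_plane),  # flag + position + 4 occupancy bits
    }


payload = planar_payload("10101010")
```

For the placeholder information 10101010, the sketch reports a low Z plane and a total of 6 bits instead of 8.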
- Plane coding offers a more noticeable coding gain than octree coding. Therefore, for an occupied node, if the plane coding method is used in a certain dimension, the plane identification (planarMode) and plane position (PlanePos) information of the current node in that dimension must be represented first, and the placeholder information of the current node is then encoded based on the plane information of the current node.
- For PlaneMode_i, 0 indicates that the current node is not a plane in the i-axis direction, while 1 indicates that the current node is a plane in the i-axis direction. If the current node is a plane in the i-axis direction, then for PlanePosition_i, 0 indicates that the plane position is low, while 1 indicates that the plane position is high.
- the octree-based geometric information coding mode only has an efficient compression rate for points that are correlated in space.
- the use of the direct coding model (DCM) can greatly reduce the complexity.
- the use of DCM is not indicated by flag information, but is inferred from the parent node and neighbor information of the current node. There are three ways to determine whether the current node is eligible for DCM encoding, as follows:
- the current node has no sibling child nodes, that is, the parent node of the current node has only one child node, and the parent node of the parent node of the current node has only two occupied child nodes, that is, the current node has at most one neighbor node.
- the parent node of the current node has only one child node, the current node, and the six neighbor nodes that share a face with the current node are all empty nodes.
- Figure 8 provides a schematic diagram of IDCM encoding. If the current node does not meet the DCM encoding requirements, it will be divided into octrees. If it meets the DCM encoding requirements, the number of points contained in the node will be further determined. If the number of points is less than a threshold (e.g., 2), the node will be DCM-encoded; otherwise, the octree division will continue.
- IDCM_flag: indicates whether the current node is a true isolated point.
- the DCM encoding mode of the current node needs to be encoded.
- There are two DCM modes: (a) the node contains only one point (or multiple points that are duplicates); (b) the node contains two points.
- the geometric information of each point needs to be encoded. Assuming that the side length of the node is 2^d, encoding each component of the node's geometric coordinates requires d bits, and this bit information is directly encoded into the bitstream. It should be noted here that when encoding the lidar point cloud, the three-dimensional coordinate information can be predictively encoded by using the lidar acquisition parameters, thereby further improving the encoding performance of the geometric information.
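Direct coding of a point inside a node whose side length is 2 to the power d can be sketched as writing each coordinate component, relative to the node origin, as a d-bit binary string. The function names and the string-of-bits representation are assumptions made for readability; a real bitstream writer would pack the bits.

```python
def dcm_encode_point(point, node_origin, d):
    """Write each component's offset inside the node as exactly d bits."""
    bits = ""
    for coord, origin in zip(point, node_origin):
        offset = coord - origin
        if not 0 <= offset < (1 << d):
            raise ValueError("point lies outside the node")
        bits += format(offset, f"0{d}b")
    return bits


def dcm_decode_point(bits, node_origin, d):
    """Inverse of dcm_encode_point: read three d-bit offsets."""
    return tuple(
        origin + int(bits[i * d : (i + 1) * d], 2)
        for i, origin in enumerate(node_origin)
    )


encoded = dcm_encode_point((5, 2, 7), node_origin=(4, 0, 4), d=2)
decoded = dcm_decode_point(encoded, node_origin=(4, 0, 4), d=2)
```

Each point costs 3·d bits in this sketch, matching the d bits per component stated above.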
- G-PCC currently introduces a plane coding mode. During the geometric partitioning process, it determines whether the child nodes of the current node are in the same plane. If the child nodes of the current node meet the condition of being in the same plane, the child nodes of the current node are represented by that plane.
- the decoder follows a breadth-first traversal. Before decoding each node's occupancy information, it first uses the reconstructed geometric information to determine whether the current node is for plane decoding or IDCM decoding. If the current node meets the requirements for plane decoding, it first decodes the plane identifier and plane position information of the current node. Then, based on the plane information, it decodes the current node's occupancy information. If the current node meets the requirements for IDCM decoding, it first decodes whether the current node is a true IDCM node.
- If so, it continues to parse the DCM decoding mode of the current node, then obtains the number of points in the current DCM node, and finally decodes the geometric information of each point. For nodes that do not meet either plane decoding or DCM decoding requirements, the current node's occupancy information is decoded. By continuously parsing in this way, the placeholder code of each node is obtained, and the node is continuously partitioned until a 1 × 1 × 1 unit cube is obtained. The number of points contained in each leaf node is parsed, and the geometrically reconstructed point cloud information is finally recovered.
- geometric partitioning must also be performed first in the trisoup-based geometric information coding framework.
- this method does not require step-by-step partitioning of the point cloud into unit cubes with side lengths of 1 × 1 × 1. Instead, the partitioning stops when the sub-blocks (blocks) have a side length of W. Based on the surface formed by the distribution of the point cloud in each block, the intersections (vertices) of that surface with the twelve edges of the block are obtained. The vertex coordinates of each block are encoded in sequence to generate a binary code stream.
- centroid offset of the intersection of the cube is encoded/decoded.
- the centroid of the up to twelve vertices (intersections) is first calculated, denoted as C_mean.
- the centroid C is calculated using the actual point cloud set of the cube, as follows: C = C_mean + δ·n, where n is the normal vector of the centroid, as shown in Figure 9.
- δ is the intersection centroid offset, which needs to be encoded and decoded.
- the resulting point cloud triangles are CV1V2, CV2V3, CV3V4, and CV4V1. These triangles are used to restore the point cloud.
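The centroid and triangle construction can be sketched as follows. The relation C = C_mean + offset × n used here is an assumption of this sketch, as are the function names and the requirement that the vertices are already ordered around the surface.

```python
def mean_centroid(vertices):
    """C_mean: arithmetic mean of the intersection points."""
    count = len(vertices)
    return tuple(sum(v[axis] for v in vertices) / count for axis in range(3))


def offset_centroid(c_mean, normal, offset):
    """C: C_mean moved along the normal by the centroid offset (assumed form)."""
    return tuple(c + offset * n for c, n in zip(c_mean, normal))


def triangle_fan(centroid, vertices):
    """Triangles C-V1-V2, C-V2-V3, ..., closing with C-Vn-V1."""
    count = len(vertices)
    return [
        (centroid, vertices[i], vertices[(i + 1) % count])
        for i in range(count)
    ]


quad = [(0, 0, 0), (2, 0, 0), (2, 2, 0), (0, 2, 0)]
c_mean = mean_centroid(quad)
c = offset_centroid(c_mean, normal=(0, 0, 1), offset=0.5)
triangles = triangle_fan(c, quad)
```

With four ordered intersections the fan contains the four triangles named in the text.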
- the decoding end first decodes vertex coordinates to complete triangle reconstruction. This process is shown in Figures 10A, 10B, and 10C.
- the block shown in Figure 10A contains three intersection points (v1, v2, v3).
- the set of triangles formed by these three intersection points in a certain order is called a triangle soup, or trisoup, as shown in Figure 10B.
- sampling is performed on this set of triangles, and the resulting sampled points are used as the reconstructed point cloud within the block, as shown in Figure 10C.
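One simple way to turn a triangle into reconstructed points is to sample it on a regular barycentric grid and round to integer positions. This is a simplified stand-in chosen for illustration, not the sampling procedure of the codec; `sample_triangle` and `steps` are hypothetical names.

```python
def sample_triangle(a, b, c, steps):
    """Sample a triangle on a barycentric grid and round to integer voxels."""
    points = set()
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            k = steps - i - j  # barycentric weights i, j, k sum to steps
            point = tuple(
                round((i * a[axis] + j * b[axis] + k * c[axis]) / steps)
                for axis in range(3)
            )
            points.add(point)  # the set merges samples on the same voxel
    return sorted(points)


samples = sample_triangle((0, 0, 0), (4, 0, 0), (0, 4, 0), steps=4)
```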
- the intersection point centroid offset value can be obtained by decoding at the decoding end.
- the following describes in detail the process of decoding the intersection point centroid offset value at the decoding end.
- the decoding end can decode the up to twelve intersections in the current block.
- these intersections are then compared with the up to twelve intersections in the reference block (or decoded block) to determine qualitySKIP, which reflects how inaccurately the up to twelve intersections in the current block are predicted.
- qualitySKIP is used to indicate the number of predicted inaccurate values of up to twelve intersections. If qualitySKIP is less than K, it means that the up to twelve intersections of the current block are predicted more accurately, and the intersection centroid offset value of the current block is most likely equal to the intersection centroid offset value of the reference block. If qualitySKIP is less than K, possibleSKIP can be recorded as 1.
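The comparison can be sketched as counting mismatches between the intersections of the current block and those of the reference block. The per-edge list representation (one entry per block edge, `None` for no intersection), the exact-match criterion, the value of K, and setting possibleSKIP to 0 otherwise are all assumptions of this sketch.

```python
def quality_skip(current, reference):
    """Count the edges whose predicted intersection differs from the actual one."""
    if len(current) != len(reference):
        raise ValueError("both blocks must list the same edges")
    return sum(1 for cur, ref in zip(current, reference) if cur != ref)


def possible_skip(quality, k):
    """1 when the prediction is accurate enough (quality < K), else 0 (assumed)."""
    return 1 if quality < k else 0


# Twelve edges per block; None marks an edge without an intersection.
current = [3, None, 5, None, None, 1, None, None, 7, None, None, 2]
reference = [3, None, 4, None, None, 1, None, 6, 7, None, None, 2]
quality = quality_skip(current, reference)
```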
- the intersection centroid offset value of the reference block is also called the inter-frame prediction value driftSKIP of the intersection centroid offset value.
- driftSKIP can be obtained by previously decoding the reference block.
- the reference block can be the previous block of the current block. Of course, the reference block can also be other blocks.
- the Intersameflag flag is decoded to determine whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value. If the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value, decoding ends; if the intersection centroid offset value of the current block is not equal to the inter-frame prediction value of the intersection centroid offset value, subsequent decoding continues.
- if the value of Intersameflag is 1, it means that the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; if the value of Intersameflag is 0, it means that the intersection centroid offset value of the current block is not equal to the inter-frame prediction value of the intersection centroid offset value.
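The decoding flow described above can be sketched as follows. This is only an illustration under stated assumptions: `read_flag` and `read_explicit_drift` are hypothetical callables standing in for the arithmetic decoder, and falling back to an explicitly decoded offset when the flag is absent or 0 is an assumption about the "subsequent decoding" step, which the text does not detail.

```python
def resolve_drift(quality_skip, k, drift_skip, read_flag, read_explicit_drift):
    """Return the intersection centroid offset value of the current block.

    quality_skip: number of inaccurately predicted intersections (qualitySKIP)
    k:            threshold K
    drift_skip:   inter-frame prediction value (driftSKIP)
    read_flag / read_explicit_drift: hypothetical decoder callbacks
    """
    possible_skip = 1 if quality_skip < k else 0
    # Intersameflag is only read when the prediction looks reliable.
    if possible_skip == 1 and read_flag() == 1:
        return drift_skip
    # Assumed fallback: decode the offset by other means.
    return read_explicit_drift()


# Flag is 1, so the inter-frame prediction value is reused.
reused = resolve_drift(0, 6, 3, lambda: 1, lambda: -2)
# Flag is 0, so the offset is decoded explicitly.
explicit = resolve_drift(0, 6, 3, lambda: 0, lambda: -2)
```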
- intersection centroid offset value ⁇ may also be encoded in the above manner at the encoding end.
- the Intersameflag flag may be encoded and decoded using arithmetic coding.
- a context model may be used to encode and decode the Intersameflag flag.
- an embodiment of the present application provides a point cloud encoding method, including: determining a first parameter of a current block based on at least one intersection in a reference block, the first parameter being used to indicate the number of inaccurate values in an intersection prediction value determined based on the at least one intersection; determining a second parameter based on the first parameter, the second parameter including a first value, and the first value corresponding to multiple values of the first parameter; determining a context model of a third parameter based on the second parameter, the third parameter being used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; encoding the third parameter based on the context model of the third parameter.
- An embodiment of the present application also provides a point cloud decoding method, including: determining a first parameter of a current block based on at least one intersection in a reference block, the first parameter being used to indicate the number of inaccurate values in an intersection prediction value determined based on the at least one intersection; determining a second parameter based on the first parameter, the second parameter including a first value, and the first value corresponding to multiple values of the first parameter; determining a context model of a third parameter based on the second parameter, the third parameter being used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; and decoding the third parameter based on the context model of the third parameter.
- the second parameter can be used as a parameter of the context model for determining the third parameter (such as Intersameflag), wherein one value of the second parameter corresponds to multiple values of the first parameter.
- the number of index values (or values) of the second parameter is less than the number of index values of the first parameter.
- in a related approach, the context model of the third parameter is determined directly using the first parameter.
- in the embodiments of the present application, the context model of the third parameter is instead determined by the second parameter, which reduces the number of required context models and is conducive to improving the performance of encoding and decoding the third parameter.
- Figure 13 is a flow chart of the point cloud decoding method provided in an embodiment of the present application.
- the method shown in Figure 13 can be applied to a decoder.
- the point cloud decoding method shown in Figure 13 can be used to decode the geometric information of a point cloud.
- the decoding method can be applied to a trisoup-based decoding method.
- the decoding method is based on a geometry-based solid content test model (GES-TM).
- GES-TM is a coding and decoding framework proposed for dense point clouds (such as point clouds collected in augmented reality (AR) or virtual reality (VR) scenes).
- a first parameter of a current block is determined according to at least one intersection point in a reference block.
- the at least one intersection point may be an intersection point between a point cloud in the reference block and the 12 edges of the reference block.
- the number of the at least one intersection point is greater than 0 and less than or equal to 12.
- the at least one intersection point may be obtained by decoding the reference block.
- the reference block may be a decoded block.
- the reference block may be a reference block in the trisoup-based encoding and decoding method described above.
- the reference block may also be referred to as a block.
- the first parameter may be used to indicate the number of inaccurate values in the intersection point prediction value determined based on at least one intersection point.
- the first parameter may be, for example, the qualitySKIP described above (of course, the first parameter may also be represented by any other letters and/or numbers).
- the value of the first parameter may be 0 to 11, or in other words, the value of the first parameter may be an integer greater than or equal to 0 and less than or equal to 11.
- the first parameter can be determined based on a relationship between at least one intersection point in the reference block and at least one intersection point in the current block. For example, if intersection point 1 in the reference block is the same as intersection point 1 in the current block, the predicted value of intersection point 1 in the current block is accurate; if intersection point 1 in the reference block is different from intersection point 1 in the current block, the predicted value of intersection point 1 in the current block is inaccurate.
- the first parameter can be determined by comparing at least one intersection point in the reference block with at least one intersection point in the current block.
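The comparison can be sketched as a per-edge count of mismatches. This is a minimal illustration, assuming each block is represented as a list with one entry per edge (`None` where the edge has no intersection); the representation, the function name `count_inaccurate`, and the sample values are made up for the example.

```python
def count_inaccurate(reference_edges, current_edges):
    """Count edges where the reference block's intersection differs from the current one."""
    if len(reference_edges) != len(current_edges):
        raise ValueError("both blocks must describe the same number of edges")
    return sum(1 for ref, cur in zip(reference_edges, current_edges) if ref != cur)


# Illustrative values only: position of the intersection on each of 12 edges.
reference = [4, None, 7, 2, None, None, 1, 5, None, None, 3, None]
current = [4, None, 6, 2, None, None, 1, 5, None, 8, 3, None]
quality_skip = count_inaccurate(reference, current)
```

Here two edges differ (the third and the tenth), so the count is 2.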
- in step S1320, the second parameter is determined according to the first parameter.
- the second parameter may be a newly defined parameter.
- the second parameter may be represented by Index_New (of course, the second parameter may also be represented by any other letters and/or numbers).
- the second parameter may include a first value, and the first value corresponds to multiple values of the first parameter. In other words, when the value of the first parameter is any one of the multiple values, the value of the second parameter is the first value.
- the probabilities (or probability distributions) corresponding to the multiple values are equal or approximately equal. Therefore, the contexts indexed by the multiple values can be simplified to one to reduce the number of contexts.
- the multiple values may be values within the first value range among the values of the first parameter.
- the multiple values may be continuous values.
- the multiple values may be discontinuous values or scattered values.
- the first value range may include some or all values from 1 to 11.
- the first value range may be 1 to 11.
- the first value range may be P to 11, where P is an integer greater than 1, and P may be, for example, 2, 3, 4, or 5.
- the first value range may be 1 to Q, where Q is an integer less than 11 and greater than 1, and Q may be, for example, 10, 9, 8, or 7.
- the first value can be, for example, 0 or 1, thereby reducing the complexity of encoding and decoding. Taking the first value as 0 as an example, when the value of the first parameter is any one of multiple values, the value of the second parameter is 0. Taking the first value as 1 as an example, when the value of the first parameter is any one of multiple values, the value of the second parameter is 1.
- the value of the second parameter is 0 or 1.
- the second parameter also includes a second value.
- This second value may correspond to a value other than the aforementioned multiple values of the first parameter.
- the second value corresponds to one of the values of the first parameter, and this one value may be any value other than the aforementioned multiple values.
- This one value may be, for example, 0.
- if the first value corresponds to the values 1 to 11 of the first parameter, then the value of the first parameter corresponding to the second value is 0.
- the first value and the second value of the second parameter may correspond to all values of the first parameter.
- the second value can be 1 or 0.
- the first value is different from the second value. If the first value is 0, the second value is 1; if the first value is 1, the second value is 0.
- the relationship between the value of the first parameter and the value of the second parameter can be implemented by a function or a mapping table, and this embodiment of the present application does not specifically limit this.
- the second parameter can be determined based on the first parameter and the first mapping relationship.
- the first mapping relationship can be used to indicate the mapping relationship between the value of the first parameter and the value of the second parameter.
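A function-based form of the first mapping relationship can be sketched as follows. This assumes the merged values are 1 to 11 and that the first value is 1 and the second value is 0; both are only one of the options the text allows, and the name `to_second_parameter` is hypothetical.

```python
def to_second_parameter(first_parameter,
                        merged_values=frozenset(range(1, 12)),
                        first_value=1,
                        second_value=0):
    """Map the first parameter (0..11) to the second parameter.

    Every value in `merged_values` collapses to `first_value`;
    any other value maps to `second_value`.
    """
    if not 0 <= first_parameter <= 11:
        raise ValueError("first parameter must be in 0..11")
    return first_value if first_parameter in merged_values else second_value


# The same relationship expressed as a mapping table indexed by the first parameter.
mapping_table = [to_second_parameter(q) for q in range(12)]
```

Swapping `first_value` and `second_value` gives the complementary assignment described in the text.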
- in step S1330, a context model of the third parameter is determined based on the second parameter.
- the third parameter may be used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value.
- the third parameter may be the Intersameflag described above (of course, the third parameter may also be represented by any other letters and/or numbers).
- the inter-frame prediction value of the intersection centroid offset value may be a centroid offset value obtained by prediction.
- the inter-frame prediction value of the intersection centroid offset value may be an intersection centroid offset value of a reference block.
- if the value of the third parameter is 1, it means that the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; if the value of the third parameter is 0, it means that the intersection centroid offset value of the current block is not equal to the inter-frame prediction value of the intersection centroid offset value.
- alternatively, if the value of the third parameter is 0, it means that the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; if the value of the third parameter is 1, it means that the intersection centroid offset value of the current block is not equal to the inter-frame prediction value of the intersection centroid offset value.
- the third parameter may be determined based on a context model.
- the third parameter may be decoded based on the context model.
- the context model of the third parameter is related to the second parameter.
- the context model of the third parameter may be determined based on the second parameter and the fourth parameter.
- the fourth parameter may be used to indicate an inter-frame prediction value of the intersection centroid offset value.
- the fourth parameter may be, for example, the driftSKIP described above (of course, the fourth parameter may also be represented by any other letters and/or numbers).
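One way to combine the second and fourth parameters into a single context index is sketched below. The text does not state how the two are combined, so the row-major layout, the reduction of driftSKIP to a small class index `drift_class`, and the function names are all assumptions made for illustration.

```python
def context_index(index_new, drift_class, num_drift_classes):
    """Pick a context from the second parameter and a class derived from the fourth.

    index_new:   value of the second parameter (for example 0 or 1)
    drift_class: hypothetical small integer derived from driftSKIP
    """
    if not 0 <= drift_class < num_drift_classes:
        raise ValueError("drift_class out of range")
    return index_new * num_drift_classes + drift_class


def num_contexts(num_index_values, num_drift_classes):
    """Total number of contexts needed under this layout."""
    return num_index_values * num_drift_classes


# Under this layout, fewer index values directly means fewer contexts.
before = num_contexts(12, 3)  # indexing by the first parameter (0..11)
after = num_contexts(2, 3)    # indexing by the second parameter (0 or 1)
```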
- the third parameter is decoded only when the value of the fifth parameter satisfies the first condition, and the value of the fifth parameter is determined based on the value of the first parameter.
- the fifth parameter may be, for example, the possibleSKIP described above (of course, the fifth parameter may also be represented by any other letters and/or numbers).
- the first condition may include that the value of the fifth parameter is 1. If the value of the fifth parameter is 1, the decoder decodes the third parameter; if the value of the fifth parameter is not 1, such as the value of the fifth parameter is 0, the decoder does not decode the third parameter.
- if the value of the first parameter is less than a first threshold, the value of the fifth parameter is 1.
- the first threshold may be, for example, K. If the value of the first parameter is less than K, the value of the fifth parameter is 1.
- K is an integer less than or equal to 12. K may be, for example, 6 or 12.
- in step S1340, the third parameter is decoded according to the context model of the third parameter.
- the decoder may use arithmetic decoding to decode the third parameter according to the context model to determine the value of the third parameter. If the value of the third parameter is 1, the centroid offset value of the current block is determined to be the inter-frame prediction value of the centroid offset value; if the value of the third parameter is 0, the centroid offset value of the current block is determined not to be the inter-frame prediction value of the centroid offset value. If the centroid offset value of the current block is not the inter-frame prediction value of the centroid offset value, the decoder may use other methods to determine the centroid offset value of the current block.
- the decoder may determine the reconstructed values of the point cloud within the current block based on the intersection points in the current block.
- Figure 14 is a flow chart of the point cloud encoding method provided in an embodiment of the present application.
- the point cloud encoding method of Figure 14 can be applied to an encoder.
- the point cloud encoding method shown in Figure 14 can be used to encode the geometric information of a point cloud.
- the encoding method can be applied to a trisoup-based encoding method.
- the encoding method is based on a geometry-based solid content test model (GES-TM).
- GES-TM is a coding and decoding framework proposed for dense point clouds (such as point clouds collected in augmented reality (AR) or virtual reality (VR) scenes).
- a first parameter of a current block is determined according to at least one intersection point in a reference block.
- the at least one intersection point may be an intersection point between a point cloud in the reference block and the 12 edges of the reference block.
- the number of the at least one intersection point is greater than 0 and less than or equal to 12.
- the at least one intersection point may be obtained by decoding the reference block.
- the reference block may be a decoded block.
- the reference block may be a reference block in the trisoup-based encoding and decoding method described above.
- the reference block may also be referred to as a block.
- the first parameter may be used to indicate the number of inaccurate values in the intersection point prediction value determined based on at least one intersection point.
- the first parameter may be, for example, the qualitySKIP described above (of course, the first parameter may also be represented by any other letters and/or numbers).
- the value of the first parameter may be 0 to 11, or in other words, the value of the first parameter may be an integer greater than or equal to 0 and less than or equal to 11.
- the first parameter can be determined based on a relationship between at least one intersection point in the reference block and at least one intersection point in the current block. For example, if intersection point 1 in the reference block is the same as intersection point 1 in the current block, the predicted value of intersection point 1 in the current block is accurate; if intersection point 1 in the reference block is different from intersection point 1 in the current block, the predicted value of intersection point 1 in the current block is inaccurate.
- the first parameter can be determined by comparing at least one intersection point in the reference block with at least one intersection point in the current block.
- in step S1420, the second parameter is determined according to the first parameter.
- the second parameter may be a newly defined parameter.
- the second parameter may be represented by Index_New (of course, the second parameter may also be represented by any other letters and/or numbers).
- the second parameter may include a first value, and the first value corresponds to multiple values of the first parameter. In other words, when the value of the first parameter is any one of the multiple values, the value of the second parameter is the first value.
- the probabilities (or probability distributions) corresponding to the multiple values are equal or approximately equal. Therefore, the contexts indexed by the multiple values can be simplified to one to reduce the number of contexts.
- the multiple values may be values within the first value range among the values of the first parameter.
- the multiple values may be continuous values.
- the multiple values may be discontinuous values or scattered values.
- the first value range may include some or all of the values from 1 to 11.
- the first value range may be 1 to 11.
- the first value range may be P to 11, where P is an integer greater than 1, and P may be 2, 3, 4, or 5, for example.
- the first value range may be 1 to Q, where Q is an integer less than 11 and greater than 1.
- Q may be, for example, 10, 9, 8, or 7.
- the first value can be, for example, 0 or 1, thereby reducing the complexity of encoding and decoding. Taking the first value as 0 as an example, when the value of the first parameter is any one of multiple values, the value of the second parameter is 0. Taking the first value as 1 as an example, when the value of the first parameter is any one of multiple values, the value of the second parameter is 1.
- the value of the second parameter is 0 or 1.
- the second parameter also includes a second value.
- This second value may correspond to a value other than the aforementioned multiple values of the first parameter.
- the second value corresponds to one of the values of the first parameter, and this one value may be any value other than the aforementioned multiple values.
- This one value may be, for example, 0.
- if the first value corresponds to the values 1 to 11 of the first parameter, then the value of the first parameter corresponding to the second value is 0.
- the first value and the second value of the second parameter may correspond to all values of the first parameter.
- the second value can be 1 or 0.
- the first value is different from the second value. If the first value is 0, the second value is 1; if the first value is 1, the second value is 0.
- the relationship between the value of the first parameter and the value of the second parameter can be implemented by a function or a mapping table, and this embodiment of the present application does not specifically limit this.
- the second parameter can be determined based on the first parameter and the first mapping relationship.
- the first mapping relationship can be used to indicate the mapping relationship between the value of the first parameter and the value of the second parameter.
- in step S1430, a context model of the third parameter is determined based on the second parameter.
- the third parameter may be used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value.
- the third parameter may be the Intersameflag described above (of course, the third parameter may also be represented by any other letters and/or numbers).
- the inter-frame prediction value of the intersection centroid offset value may be a centroid offset value obtained by prediction.
- the inter-frame prediction value of the intersection centroid offset value may be an intersection centroid offset value of a reference block.
- if the value of the third parameter is 1, it indicates that the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; if the value of the third parameter is 0, it indicates that the intersection centroid offset value of the current block is not equal to the inter-frame prediction value of the intersection centroid offset value.
- alternatively, if the value of the third parameter is 0, it indicates that the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; if the value of the third parameter is 1, it indicates that the intersection centroid offset value of the current block is not equal to the inter-frame prediction value of the intersection centroid offset value.
- the third parameter may be determined based on a context model.
- the third parameter may be encoded based on the context model.
- the context model of the third parameter is related to the second parameter.
- the context model of the third parameter may be determined based on the second parameter and the fourth parameter.
- the fourth parameter may be used to indicate an inter-frame prediction value of the intersection centroid offset value.
- the fourth parameter may be, for example, the driftSKIP described above (of course, the fourth parameter may also be represented by any other letters and/or numbers).
- the encoding of the third parameter is performed when the value of the fifth parameter satisfies the first condition, and the value of the fifth parameter is determined based on the value of the first parameter.
- the fifth parameter can be, for example, the possibleSKIP described above (of course, the fifth parameter can also be represented by any other letters and/or numbers).
- the first condition may include that the value of the fifth parameter is 1. If the value of the fifth parameter is 1, the encoder encodes the third parameter; if the value of the fifth parameter is not 1, such as the value of the fifth parameter is 0, the encoder does not encode the third parameter.
- if the value of the first parameter is less than a first threshold, the value of the fifth parameter is 1.
- the first threshold may be, for example, K. If the value of the first parameter is less than K, the value of the fifth parameter is 1.
- K is an integer less than or equal to 12. K may be, for example, 6 or 12.
- the third parameter is encoded according to the context model of the third parameter.
- the encoder may encode the third parameter according to the context model using arithmetic coding.
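The encoder-side decision can be sketched as the mirror of the decoder flow. This is an illustration only: `write_flag` and `write_explicit_drift` are hypothetical callbacks standing in for the arithmetic encoder, and writing the offset explicitly when the flag is absent or 0 is an assumption, since the text does not describe that path.

```python
def encode_drift(quality_skip, k, drift, drift_skip, write_flag, write_explicit_drift):
    """Signal the intersection centroid offset value of the current block.

    The flag (third parameter) is written only when quality_skip < k,
    i.e. when the fifth parameter would be 1.
    """
    if quality_skip < k:
        same = 1 if drift == drift_skip else 0
        write_flag(same)
        if same == 1:
            return  # the decoder reuses the inter-frame prediction value
    # Assumed fallback: signal the offset by other means.
    write_explicit_drift(drift)


flags, drifts = [], []
encode_drift(0, 6, 3, 3, flags.append, drifts.append)   # equal: flag 1 only
encode_drift(0, 6, -2, 3, flags.append, drifts.append)  # different: flag 0, then offset
encode_drift(7, 6, 3, 3, flags.append, drifts.append)   # flag skipped entirely
```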
- indexing the context model by the full qualitySKIP value (0 to 11) is redundant. Since the probability distributions for the non-zero qualitySKIP values (1 to 11) are approximately the same, the context models indexed by these values can be merged into one.
- This application proposes an optimization algorithm for reducing the context model, which can be achieved without compromising performance or time complexity.
- the improvement proposed in this application is to convert qualitySKIP into a new index value Index_New when decoding the Intersameflag flag. Because the probability distributions for the non-zero qualitySKIP values (1 to 11) are approximately the same, the original 12 contexts (indexed 0 to 11) can be reduced to 2 contexts (for example, one for the value 0 and another shared by the values 1 to 11), reducing the number of contexts on the decoding end without affecting performance or time complexity.
- Method 2 uses a look-up table, that is, the relationship between the value of qualitySKIP and the value of Index_New can be realized through a mapping table.
- mapping table between Index_New and qualitySKIP is:
- TABLE_QUANLITY_CONTEX = [0,1,1,1,1,1,1,1,1,1,1,1] or [1,0,0,0,0,0,0,0,0,0,0,0];
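The look-up approach can be sketched as a 12-entry table indexed by qualitySKIP. This assumes one entry per qualitySKIP value from 0 to 11 and uses the first of the two assignments; the helper name `index_new` is hypothetical.

```python
# One entry per qualitySKIP value 0..11 (assumed length 12).
TABLE_QUANLITY_CONTEX = (0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)


def index_new(quality_skip):
    """Convert qualitySKIP into the reduced index Index_New by table look-up."""
    if not 0 <= quality_skip < len(TABLE_QUANLITY_CONTEX):
        raise ValueError("qualitySKIP must be in 0..11")
    return TABLE_QUANLITY_CONTEX[quality_skip]


# Number of distinct index values before and after the conversion.
distinct_before = len(TABLE_QUANLITY_CONTEX)
distinct_after = len(set(TABLE_QUANLITY_CONTEX))
```

The table keeps the conversion to a single array access, and the 12 original index values collapse to 2.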
- Figure 15 shows a framework diagram for decoding the intersection centroid offset value.
- Figure 16 shows a framework diagram for the context required for decoding Intersameflag.
- the framework diagram shown in Figure 15 is similar to the framework shown in Figure 11.
- Tables 1 and 2 show the test results for Method 1 and Method 2, respectively.
- the solution of this application has almost no impact on performance and time complexity, while reducing the number of contexts on the decoding end by 20.
- the number of contexts used by the related art solutions is 73.
- the number of contexts used by Method 1 or Method 2 is 53, equivalent to a 27% reduction in the number of contexts.
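The reported reduction can be checked with a short calculation that uses only the two context counts given in the text:

```python
baseline_contexts = 73  # related art solutions
reduced_contexts = 53   # Method 1 or Method 2

saved = baseline_contexts - reduced_contexts
reduction_percent = 100 * saved / baseline_contexts

print(saved, round(reduction_percent, 1))  # prints "20 27.4"
```

The difference is 20 contexts, or about 27.4% of the baseline, which the text rounds to 27%.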
- This application uses a new indexing method when decoding the intersection centroid offset value, converting the qualitySKIP index into a new index Index_New.
- the new indexing method can reduce the number of contexts without affecting performance and time complexity.
- Figure 17 is a schematic diagram of the structure of a decoder provided by an embodiment of the present application.
- the decoder 1700 may include a first determining unit 1710 , a second determining unit 1720 , a third determining unit 1730 , and a decoding unit 1740 .
- the first determining unit 1710 is configured to determine a first parameter of the current block according to at least one intersection point in the reference block, where the first parameter is used to indicate the number of inaccurate values in the intersection prediction value determined based on the at least one intersection point.
- the second determining unit 1720 is configured to determine a second parameter according to the first parameter, where the second parameter includes a first value, and the first value corresponds to multiple values of the first parameter.
- the third determination unit 1730 is configured to determine a context model of a third parameter based on the second parameter, where the third parameter is used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value.
- the decoding unit 1740 is configured to decode the third parameter according to the context model of the third parameter.
- the multiple values are values of the first parameter that are within a first value range.
- the first value range includes some or all of the values from 1 to 11.
- the first value is 0 or 1.
- the second parameter further includes a second value, and the second value corresponds to one of the values of the first parameter.
- the value of the first parameter corresponding to the second value is 0.
- the second value is 1 or 0.
- the second parameter is determined based on the first parameter and a first mapping relationship, where the first mapping relationship is used to indicate a mapping relationship between a value of the first parameter and a value of the second parameter.
- the context model is determined based on the second parameter and a fourth parameter, and the fourth parameter is used to indicate an inter-frame prediction value of the intersection centroid offset value.
- the decoding of the third parameter is performed when the value of the fifth parameter satisfies the first condition, and the value of the fifth parameter is determined based on the value of the first parameter.
- the first condition is that the value of the fifth parameter is 1.
- if the value of the first parameter is less than a first threshold, the value of the fifth parameter is 1.
- the decoder further includes a fourth determination unit configured to determine a reconstructed value of the point cloud within the current block based on the intersection point in the current block.
- a "unit" may be a part of a circuit, a part of a processor, a part of a program or software, etc.; it may be modular or non-modular.
- the various components in this embodiment may be integrated into a processing unit.
- each unit may exist physically separately, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit may be implemented in the form of hardware or in the form of software functional modules.
- if the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
- the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which can be a personal computer, server, or network device, etc.) or a processor to execute all or part of the steps of the method described in this embodiment.
- the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
- an embodiment of the present application provides a computer-readable storage medium, which is applied to the decoder 1700.
- the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, it implements the decoding method described in any one of the aforementioned embodiments.
- the decoder 1800 may include a communication interface 1810, a memory 1820, and a processor 1830; the components are coupled together through a bus system 1840. It can be understood that the bus system 1840 is used to achieve connection and communication between these components.
- in addition to a data bus, the bus system 1840 also includes a power bus, a control bus, and a status signal bus.
- for clarity, the various buses are all labeled as bus system 1840 in Figure 18, in which:
- the communication interface 1810 is used for receiving and sending signals while exchanging information with other external network elements;
- the memory 1820 is used for storing computer programs;
- Processor 1830 is configured to, when running the computer program, perform the following: determining a first parameter of a current block based on at least one intersection in a reference block, the first parameter being used to indicate the number of inaccurate values in an intersection prediction value determined based on the at least one intersection; determining a second parameter based on the first parameter, the second parameter including a first value, and the first value corresponding to multiple values of the first parameter; determining a context model of a third parameter based on the second parameter, the third parameter being used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; and decoding the third parameter based on the context model of the third parameter.
- the memory 1820 in the embodiment of the present application can be a volatile memory or a non-volatile memory, or can include both volatile and non-volatile memories.
- the non-volatile memory can be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory.
- the volatile memory can be a random access memory (RAM), which is used as an external cache.
- many forms of RAM may be used, for example: static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DRRAM).
- Processor 1830 may be an integrated circuit chip with signal processing capabilities. During implementation, each step of the above method can be completed by hardware integrated logic circuits or software instructions in processor 1830.
- the above-mentioned processor 1830 can be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.
- a general-purpose processor can be a microprocessor or any conventional processor.
- the steps of the method disclosed in conjunction with the embodiments of this application can be directly implemented and executed by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor.
- the software module can be located in a storage medium well known in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers.
- the storage medium is located in the memory 1820 , and the processor 1830 reads the information in the memory 1820 and completes the steps of the above method in combination with its hardware.
- the embodiments described in this application can be implemented using hardware, software, firmware, middleware, microcode, or a combination thereof.
- the processing unit can be implemented in one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application, or a combination thereof.
- the technology described in this application can be implemented by modules (such as processes, functions, etc.) that perform the functions described in this application.
- the software code can be stored in a memory and executed by a processor.
- the memory can be implemented in the processor or outside the processor.
- the processor 1830 is further configured to execute, when running the computer program, the decoding method described in any one of the aforementioned embodiments, that is, the decoding method according to any one of claims 1 to 6.
- FIG. 19 is a schematic diagram of the structure of an encoder provided by an embodiment of the present application.
- the encoder 1900 includes a first determination unit 1910 , a second determination unit 1920 , a third determination unit 1930 , and an encoding unit 1940 .
- the first determining unit 1910 is configured to determine a first parameter of the current block according to at least one intersection point in the reference block, where the first parameter is used to indicate the number of inaccurate values in the intersection prediction value determined based on the at least one intersection point.
- the second determining unit 1920 is configured to determine a second parameter according to the first parameter, where the second parameter includes a first value, and the first value corresponds to multiple values of the first parameter.
- the third determination unit 1930 is configured to determine a context model of a third parameter based on the second parameter, where the third parameter is used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value.
- the encoding unit 1940 is configured to encode the third parameter according to the context model of the third parameter.
- the multiple values are values of the first parameter that are within a first value range.
- the first value range includes some or all of the values from 1 to 11.
- the first value is 0 or 1.
- the second parameter further includes a second value, and the second value corresponds to one of the values of the first parameter.
- the value of the first parameter corresponding to the second value is 0.
- the second value is 1 or 0.
- the second parameter is determined based on the first parameter and a first mapping relationship, where the first mapping relationship is used to indicate a mapping relationship between a value of the first parameter and a value of the second parameter.
- the context model is determined based on the second parameter and a fourth parameter, and the fourth parameter is used to indicate an inter-frame prediction value of the intersection centroid offset value.
- the encoding of the third parameter is performed when the value of the fifth parameter satisfies the first condition, and the value of the fifth parameter is determined based on the value of the first parameter.
- the first condition is that the value of the fifth parameter is 1.
- the value of the fifth parameter is 1.
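The mapping from the first parameter to the second parameter, and the selection of a context model from the second and fourth parameters, can be sketched as follows. This is a minimal illustration under assumptions the text leaves open: the first value range is taken to be all values from 1 to 11, the first value is taken to be 1, the second value is taken to be 0, and the fourth parameter is assumed to take `NUM_FOURTH_VALUES` discrete values. All names are hypothetical and are not taken from any codec implementation.

```python
# Illustrative sketch only; every name and concrete value here is an assumption.
FIRST_VALUE_RANGE = range(1, 12)  # assumed: first-parameter values 1..11
FIRST_VALUE = 1                   # assumed: second-parameter value shared by 1..11
SECOND_VALUE = 0                  # assumed: second-parameter value for first parameter 0
NUM_FOURTH_VALUES = 3             # hypothetical number of fourth-parameter values


def map_first_to_second(first_param: int) -> int:
    """First mapping relationship: several first-parameter values share one value."""
    if first_param == 0:
        return SECOND_VALUE
    if first_param in FIRST_VALUE_RANGE:
        return FIRST_VALUE
    raise ValueError(f"first parameter {first_param} is outside the assumed range")


def context_index(first_param: int, fourth_param: int) -> int:
    """Pick a context model index for the third parameter.

    The index depends on the second parameter (derived from the first
    parameter) and on the fourth parameter.
    """
    if not 0 <= fourth_param < NUM_FOURTH_VALUES:
        raise ValueError("fourth parameter is outside the assumed range")
    second_param = map_first_to_second(first_param)
    return second_param * NUM_FOURTH_VALUES + fourth_param
```

Under these assumptions, first-parameter values 3 and 7 select the same context model for a given fourth parameter, because both map to the same second-parameter value.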
- a "unit" can be a portion of a circuit, a portion of a processor, a portion of a program or software, etc., and of course it can also be a module, or it can be non-modular.
- the various components in this embodiment can be integrated into a processing unit, or each unit can exist physically separately, or two or more units can be integrated into a single unit.
- the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional modules.
- if the integrated unit is implemented as a software functional module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
- This computer software product is stored in a storage medium and includes several instructions for causing a computer device (which can be a personal computer, server, or network device, etc.) or a processor to execute all or part of the steps of the method described in this embodiment.
- the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a mobile hard drive, ROM, RAM, a magnetic disk, or an optical disk.
- an embodiment of the present application provides a computer-readable storage medium, which is applied to the encoder 1900.
- the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, it implements the encoding method described in any one of the aforementioned embodiments.
- the encoder 2000 may include: a communication interface 2010, a memory 2020 and a processor 2030; each component is coupled together through a bus system 2040. It can be understood that the bus system 2040 is used to realize the connection and communication between these components.
- the bus system 2040 also includes a power bus, a control bus and a status signal bus.
- for clarity, the various buses are all labeled as the bus system 2040 in FIG. 20. Among these components:
- Memory 2020 is used for storing a computer program;
- Processor 2030 is used to, when running the computer program, perform the following: determining a first parameter of the current block based on at least one intersection in the reference block, the first parameter being used to indicate the number of inaccurate values in the intersection prediction value determined based on the at least one intersection; determining a second parameter based on the first parameter, the second parameter including a first value, and the first value corresponding to multiple values of the first parameter; determining a context model of a third parameter based on the second parameter, the third parameter being used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; and encoding the third parameter based on the context model of the third parameter.
- the memory 2020 in the embodiment of the present application can be a volatile memory or a non-volatile memory, or can include both volatile and non-volatile memories.
- the non-volatile memory can be ROM, PROM, EPROM, EEPROM or flash memory.
- the volatile memory can be RAM, which is used as an external cache.
- many forms of RAM are available, such as SRAM, DRAM, SDRAM, DDR SDRAM, ESDRAM, SLDRAM and DRRAM.
- the memory 2020 of the systems and methods described in this application is intended to comprise, without being limited to, these and any other suitable types of memory.
- Processor 2030 may be an integrated circuit chip with signal processing capabilities. During implementation, each step of the above method may be completed by hardware integrated logic circuits or software instructions within processor 2030.
- Processor 2030 may be a general-purpose processor, DSP, ASIC, FPGA, or other programmable logic device, discrete gate or transistor logic device, or discrete hardware component. It may implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application.
- a general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application may be directly implemented and executed by a hardware decoding processor, or by a combination of hardware and software modules within the decoding processor.
- the software modules may be located in a storage medium known in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. This storage medium is located in memory 2020.
- Processor 2030 reads information from memory 2020 and, in conjunction with its hardware, completes the steps of the above method.
- the embodiments described herein can be implemented with hardware, software, firmware, middleware, microcode, or a combination thereof.
- the processing unit can be implemented in one or more ASICs, DSPs, DSPDs, PLDs, FPGAs, general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described herein, or a combination thereof.
- the technology described herein can be implemented by modules (e.g., processes, functions, etc.) that perform the functions described herein.
- the software code can be stored in a memory and executed by a processor.
- the memory can be implemented in the processor or outside the processor.
- the processor 2030 is further configured to execute the encoding method described in any one of the aforementioned embodiments when running the computer program.
- An embodiment of the present application also provides a computer-readable storage medium, which is a non-volatile computer-readable storage medium for storing a bit stream.
- the bit stream can be generated by an encoding method of an encoder, or the bit stream can be decoded by a decoding method of a decoder, wherein the decoding method can be the decoding method described in any of the foregoing embodiments, and the encoding method can be the encoding method described in any of the foregoing embodiments.
Description
The present application relates to the field of point cloud encoding and decoding technology, and in particular to a point cloud encoding and decoding method, a codec, a bitstream, and a storage medium.
The intersection centroid offset value can be used to restore the shape of a point cloud more accurately. When the intersection centroid offset value is encoded and decoded, a context model can be used to reduce bitstream overhead. However, encoding and decoding the intersection centroid offset value uses a large number of contexts, which is not conducive to improving encoding and decoding performance.
Summary of the Invention
Embodiments of the present application provide a point cloud encoding and decoding method, a codec, a bitstream, and a storage medium. The aspects of the present application are described below.
In a first aspect, a point cloud decoding method is provided, which is applied to a decoder and includes: determining a first parameter of a current block based on at least one intersection in a reference block, the first parameter being used to indicate the number of inaccurate values in an intersection prediction value determined based on the at least one intersection; determining a second parameter based on the first parameter, the second parameter including a first value, and the first value corresponding to multiple values of the first parameter; determining a context model of a third parameter based on the second parameter, the third parameter being used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; and decoding the third parameter based on the context model of the third parameter.
In a second aspect, a point cloud encoding method is provided, which is applied to an encoder and includes: determining a first parameter of a current block based on at least one intersection in a reference block, the first parameter being used to indicate the number of inaccurate values in an intersection prediction value determined based on the at least one intersection; determining a second parameter based on the first parameter, the second parameter including a first value, and the first value corresponding to multiple values of the first parameter; determining a context model of a third parameter based on the second parameter, the third parameter being used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; and encoding the third parameter based on the context model of the third parameter.
In a third aspect, a decoder is provided, comprising: a first determination unit, configured to determine a first parameter of a current block based on at least one intersection in a reference block, the first parameter being used to indicate the number of inaccurate values in an intersection prediction value determined based on the at least one intersection; a second determination unit, configured to determine a second parameter based on the first parameter, the second parameter including a first value, and the first value corresponding to multiple values of the first parameter; a third determination unit, configured to determine a context model of a third parameter based on the second parameter, the third parameter being used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; and a decoding unit, configured to decode the third parameter based on the context model of the third parameter.
In a fourth aspect, a decoder is provided, comprising: a memory for storing a computer program; and a processor for executing the method of the first aspect when running the computer program.
In a fifth aspect, an encoder is provided, comprising: a first determination unit, configured to determine a first parameter of a current block based on at least one intersection in a reference block, the first parameter being used to indicate the number of inaccurate values in an intersection prediction value determined based on the at least one intersection; a second determination unit, configured to determine a second parameter based on the first parameter, the second parameter including a first value, and the first value corresponding to multiple values of the first parameter; a third determination unit, configured to determine a context model of a third parameter based on the second parameter, the third parameter being used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; and an encoding unit, configured to encode the third parameter based on the context model of the third parameter.
In a sixth aspect, an encoder is provided, comprising: a memory for storing a computer program; and a processor for executing the method of the second aspect when running the computer program.
In a seventh aspect, a computer-readable storage medium is provided, wherein the computer-readable storage medium stores a computer program, and when the computer program is executed, the method of the first aspect or the second aspect is implemented.
In an eighth aspect, a non-volatile computer-readable storage medium for storing a bitstream is provided, wherein the bitstream is generated by an encoding method of an encoder, or the bitstream is decoded by a decoding method of a decoder, the decoding method being the method of the first aspect and the encoding method being the method of the second aspect.
In a ninth aspect, a bitstream is provided, comprising a bitstream generated according to the method of the second aspect.
The embodiments of the present application propose using the second parameter to determine the context model of the third parameter (which can be used to determine the intersection centroid offset value of the current block), where one value of the second parameter corresponds to multiple values of the first parameter. In this way, the second parameter has fewer index values (or values) than the first parameter. Compared with the related art, in which the first parameter is used directly to determine the context model of the third parameter, determining the context model from the second parameter reduces the number of context models required, which helps improve the performance of encoding and decoding the third parameter.
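The reduction in the number of context models can be illustrated by enumerating the contexts before and after the many-to-one mapping. The sketch below is an assumption-laden example, not the method of the application: it supposes that the first parameter takes the values 0 to 11, that values 1 to 11 all map to a single second-parameter value, and that a second, hypothetical selector with three values also takes part in context selection.

```python
from itertools import product

# Assumed value sets; none of these sizes are fixed by the text.
FIRST_PARAM_VALUES = range(0, 12)  # assumed first-parameter values 0..11
OTHER_SELECTOR_VALUES = range(3)   # hypothetical additional context selector


def to_second_param(first_param: int) -> int:
    """Assumed many-to-one mapping: 0 stays 0, values 1..11 collapse to 1."""
    return 0 if first_param == 0 else 1


# Contexts when the first parameter selects the context directly.
direct_contexts = set(product(FIRST_PARAM_VALUES, OTHER_SELECTOR_VALUES))

# Contexts when the second parameter is used instead.
mapped_contexts = {(to_second_param(f), o) for f, o in direct_contexts}

print(len(direct_contexts), len(mapped_contexts))  # prints "36 6"
```

With these assumed sizes, the number of context models drops from 36 to 6; the actual saving depends on the real value ranges.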
FIG. 1A is a schematic diagram of a three-dimensional point cloud image.
FIG. 1B is a partially enlarged view of a three-dimensional point cloud image.
FIG. 2A is a schematic diagram of six viewing angles of a point cloud image.
FIG. 2B is a schematic diagram of a data storage format corresponding to a point cloud image.
FIG. 3 is a schematic diagram of a network architecture for point cloud encoding and decoding.
FIG. 4A is a schematic diagram of the framework of a G-PCC encoder.
FIG. 4B is a schematic diagram of the framework of a G-PCC decoder.
FIG. 5A is a schematic diagram of a low plane position in the Z-axis direction.
FIG. 5B is a schematic diagram of a high plane position in the Z-axis direction.
FIG. 6 is a schematic diagram of a node encoding order.
FIG. 7A is a schematic diagram of plane identification information.
FIG. 7B is a schematic diagram of another type of plane identification information.
FIG. 8 is a schematic diagram of sibling nodes of a current node.
FIG. 9 is a schematic diagram of calculating a centroid offset value.
FIG. 10A is a schematic diagram of three intersections included in a sub-block.
FIG. 10B is a schematic diagram of a set of triangular facets fitted using three intersections.
FIG. 10C is a schematic diagram of upsampling a set of triangular facets.
FIG. 11 is a schematic flowchart of calculating a centroid offset value.
FIG. 12 is a schematic diagram of the context required for decoding Intersameflag.
FIG. 13 is a schematic flowchart of the point cloud decoding method provided in an embodiment of the present application.
FIG. 14 is a schematic flowchart of the point cloud encoding method provided in an embodiment of the present application.
FIG. 15 is a schematic flowchart of calculating a centroid offset value provided in an embodiment of the present application.
FIG. 16 is a schematic diagram of the context required for decoding Intersameflag provided in an embodiment of the present application.
FIG. 17 is a schematic diagram of the structure of a decoder provided in an embodiment of the present application.
FIG. 18 is a schematic diagram of the structure of a decoder provided in another embodiment of the present application.
FIG. 19 is a schematic diagram of the structure of an encoder provided in an embodiment of the present application.
FIG. 20 is a schematic diagram of the structure of an encoder provided in another embodiment of the present application.
To provide a more detailed understanding of the features and technical content of the embodiments of the present application, the implementation of the embodiments is described in detail below with reference to the accompanying drawings. The drawings are for reference and illustration only and are not intended to limit the embodiments of the present application.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the art to which this application pertains. The terms used herein are for the purpose of describing the embodiments of this application only and are not intended to limit this application.
The following description refers to "some embodiments", which describe a subset of all possible embodiments. It can be understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other where no conflict arises.
It should also be noted that the terms "first\second\third" in the embodiments of the present application are only used to distinguish similar objects and do not represent a specific ordering of the objects. Where permitted, "first\second\third" may be interchanged in a specific order or sequence, so that the embodiments described here can be implemented in an order other than that illustrated or described here.
A point cloud is a set of irregularly distributed discrete points in space that expresses the spatial structure and surface properties of a three-dimensional object or scene. These points contain geometric information representing spatial position and attribute information representing the appearance and texture of the point cloud. FIG. 1A shows a three-dimensional point cloud image, and FIG. 1B shows a partially enlarged view of it; the surface of the point cloud is composed of densely distributed points.
In a two-dimensional image, every pixel carries information and the pixels are regularly distributed, so there is no need to record position information separately. The points of a point cloud, however, are distributed randomly and irregularly in three-dimensional space, so the position of each point must be recorded to represent the point cloud completely. As with a two-dimensional image, each position captured during acquisition has corresponding attribute information, typically an RGB color value that reflects the color of the object. For a point cloud, besides color, another common attribute of each point is the reflectance value, which reflects the surface material of the object. Point cloud data therefore typically includes the position information and the attribute information of the points. The position information of a point can also be called its geometric information; for example, it can be the three-dimensional coordinates (x, y, z) of the point. The attribute information of a point can include color information and/or reflectance, among others. For example, reflectance can be one-dimensional reflectance information (r). Color information can be information in any color space, or it can be three-dimensional color information such as RGB, where R represents red, G represents green, and B represents blue. As another example, color information can be luminance-chrominance (YCbCr, YUV) information, where Y represents luma, Cb (U) represents the blue color difference, and Cr (V) represents the red color difference.
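A point with geometric and attribute information, and the relationship between RGB and YCbCr color information, can be sketched as follows. The text does not say which YCbCr variant is meant; the conversion below assumes the full-range BT.601 coefficients used by JFIF, and the `Point` class and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Point:
    """One point: geometry (x, y, z) plus optional attributes."""
    x: float
    y: float
    z: float
    rgb: Optional[Tuple[int, int, int]] = None  # three-dimensional color
    reflectance: Optional[float] = None         # one-dimensional reflectance


def _clamp8(value: float) -> int:
    """Round and clamp to the 8-bit range 0..255."""
    return max(0, min(255, round(value)))


def rgb_to_ycbcr(r: int, g: int, b: int) -> Tuple[int, int, int]:
    """Convert 8-bit RGB to YCbCr (assumed: full-range BT.601 / JFIF)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return _clamp8(y), _clamp8(cb), _clamp8(cr)


# A point with color only, as might come from photogrammetry.
p = Point(1.0, 2.0, 3.0, rgb=(255, 255, 255))
print(rgb_to_ycbcr(*p.rgb))  # prints "(255, 128, 128)"
```

Neutral colors such as white map to Cb = Cr = 128, which is why the two chrominance components are described as color differences.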
For a point cloud obtained according to the principle of laser measurement, the points may include three-dimensional coordinate information and reflectance values. For a point cloud obtained according to the principle of photogrammetry, the points may include three-dimensional coordinate information and three-dimensional color information. For a point cloud obtained by combining laser measurement and photogrammetry, the points may include three-dimensional coordinate information, reflectance values, and three-dimensional color information.
FIG. 2A and FIG. 2B show a point cloud image and its corresponding data storage format. FIG. 2A provides six viewing angles of the point cloud image, and FIG. 2B consists of a file header part and a data part. The header contains the data format, the data representation type, the total number of points in the point cloud, and the content represented by the point cloud. For example, the point cloud is in ".ply" format and represented in ASCII, the total number of points is 207242, and each point has three-dimensional coordinate information (x, y, z) and three-dimensional color information (r, g, b).
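The header information described above can be read with a few lines of code. The sketch below parses the header of an ASCII PLY file and extracts the format, the vertex count, and the per-vertex properties. The sample header is a hypothetical reconstruction that matches the description (FIG. 2B itself is not reproduced here), and the function name is an assumption.

```python
# Hypothetical header modeled on the description; not copied from FIG. 2B.
SAMPLE_HEADER = """ply
format ascii 1.0
element vertex 207242
property float x
property float y
property float z
property uchar red
property uchar green
property uchar blue
end_header
"""


def parse_ply_header(text: str) -> dict:
    """Return the format, vertex count, and vertex properties of a PLY header."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "ply":
        raise ValueError("not a PLY file")
    header = {"format": None, "vertex_count": 0, "properties": []}
    current_element = None
    for line in lines[1:]:
        tokens = line.split()
        if not tokens or tokens[0] == "comment":
            continue
        if tokens[0] == "end_header":
            return header
        if tokens[0] == "format":
            header["format"] = tokens[1]
        elif tokens[0] == "element":
            current_element = tokens[1]
            if current_element == "vertex":
                header["vertex_count"] = int(tokens[2])
        elif tokens[0] == "property" and current_element == "vertex":
            # Scalar property line: "property <type> <name>".
            header["properties"].append((tokens[1], tokens[2]))
    raise ValueError("missing end_header")


info = parse_ply_header(SAMPLE_HEADER)
print(info["format"], info["vertex_count"])  # prints "ascii 207242"
```

The data part that follows `end_header` would then hold one line per point, with values in the order given by the property list.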
Point clouds can be divided into the following categories according to how they are acquired:
Static point cloud: the object is stationary, and the device that acquires the point cloud is also stationary;
Dynamic point cloud: the object is moving, but the device that acquires the point cloud is stationary;
Dynamically acquired point cloud: the device that acquires the point cloud is moving.
Point clouds can also be divided into two categories according to their use:
Category 1: machine-perception point clouds, which can be used in scenarios such as autonomous navigation systems, real-time inspection systems, geographic information systems, visual sorting robots, and rescue and disaster relief robots;
Category 2: human-perception point clouds, which can be used in point cloud application scenarios such as digital cultural heritage, free-viewpoint broadcasting, three-dimensional immersive communication, and three-dimensional immersive interaction.
Point clouds can flexibly and conveniently express the spatial structure and surface properties of three-dimensional objects or scenes. Because point clouds are obtained by directly sampling real objects, they can provide a strong sense of realism while maintaining accuracy. They are therefore widely used, in applications including virtual reality games, computer-aided design, geographic information systems, automatic navigation systems, digital cultural heritage, free-viewpoint broadcasting, three-dimensional immersive telepresence, and three-dimensional reconstruction of biological tissues and organs.
Point clouds are mainly acquired through computer generation, 3D laser scanning, and 3D photogrammetry. Computers can generate point clouds of virtual three-dimensional objects and scenes. 3D laser scanning can obtain point clouds of static real-world three-dimensional objects or scenes, capturing on the order of millions of points per second. 3D photogrammetry can obtain point clouds of dynamic real-world three-dimensional objects or scenes, capturing on the order of tens of millions of points per second. These technologies reduce the cost and time required to acquire point cloud data and improve its accuracy. Changes in how point cloud data is acquired have made it possible to obtain large amounts of it; as application demand grows, however, the processing of massive 3D point cloud data runs into bottlenecks imposed by storage space and transmission bandwidth.
As an example, consider a point cloud video with a frame rate of 30 frames per second (fps) in which each frame contains 700,000 points and each point has coordinate information xyz (float) and color information RGB (uchar). The data volume of 10 s of this point cloud video is approximately 0.7 million × (4 bytes × 3 + 1 byte × 3) × 30 fps × 10 s = 3.15 GB, where 1 byte is 8 bits. By comparison, for a 1280 × 720 two-dimensional video with a YUV sampling format of 4:2:0 and a frame rate of 30 fps, the data volume of 10 s is approximately 1280 × 720 × 12 bits × 30 frames × 10 s ≈ 0.39 GB, and the data volume of 10 s of two-view three-dimensional video is approximately 0.39 × 2 = 0.78 GB. The data volume of a point cloud video thus far exceeds that of two-dimensional and three-dimensional video of the same duration. To manage data better, save server storage space, and reduce the transmission traffic and transmission time between server and client, point cloud compression has become a key issue in promoting the development of the point cloud industry.
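The data-volume comparison can be reproduced with a short calculation. The sketch below assumes 8-bit bytes, 4-byte float coordinates, 1-byte uchar color components, and 12 bits per pixel for 4:2:0 video; the function names are hypothetical.

```python
def point_cloud_video_bytes(points_per_frame: int, fps: int, seconds: int,
                            coord_bytes: int = 4, color_bytes: int = 1) -> int:
    """Uncompressed size of a point cloud video with xyz and RGB per point."""
    bytes_per_point = 3 * coord_bytes + 3 * color_bytes  # 12 + 3 = 15 bytes
    return points_per_frame * bytes_per_point * fps * seconds


def yuv420_video_bytes(width: int, height: int, fps: int, seconds: int,
                       bits_per_pixel: int = 12) -> int:
    """Uncompressed size of a 4:2:0 video (12 bits per pixel on average)."""
    total_bits = width * height * bits_per_pixel * fps * seconds
    return total_bits // 8


pc_bytes = point_cloud_video_bytes(700_000, fps=30, seconds=10)
video_bytes = yuv420_video_bytes(1280, 720, fps=30, seconds=10)

print(pc_bytes)     # prints "3150000000"
print(video_bytes)  # prints "414720000"
```

The first result is 3.15 × 10^9 bytes, matching the 3.15 GB figure. The second is about 0.39 when expressed in units of 2^30 bytes, so the point cloud video is several times larger than the two-dimensional video under either unit convention.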
In other words, because a point cloud is a collection of a massive number of points, storing it consumes a large amount of memory and makes transmission difficult; no available bandwidth can support transmitting an uncompressed point cloud directly at the network layer. Point clouds therefore need to be compressed.
Currently, a point cloud coding framework capable of compressing point clouds may be the geometry-based point cloud compression (G-PCC) codec framework or the video-based point cloud compression (V-PCC) codec framework provided by the Moving Picture Experts Group (MPEG), or the AVS-PCC codec framework provided by AVS. The G-PCC codec framework can be used to compress the first category (static point clouds) and the third category (dynamically acquired point clouds), and may be based on the point cloud compression test platform (test model compression 13, TMC13). The V-PCC codec framework can be used to compress the second category (dynamic point clouds), and may be based on the point cloud compression test platform (test model compression 2, TMC2). For this reason, the G-PCC codec framework is also called the point cloud codec TMC13, and the V-PCC codec framework is also called the point cloud codec TMC2.
An embodiment of the present application provides a network architecture of a point cloud encoding and decoding system that includes a decoding method and an encoding method. Figure 3 is a schematic diagram of a point cloud encoding and decoding network architecture provided by an embodiment of the present application. As shown in Figure 3, the network architecture includes one or more electronic devices 13 to 1N and a communication network 01, where the electronic devices 13 to 1N can perform video interaction through the communication network 01. In practice, the electronic device may be any type of device with point cloud encoding and decoding functions; for example, it may be a mobile phone, a tablet computer, a personal computer, a personal digital assistant, a navigator, a digital phone, a video phone, a television, a sensing device, a server, and so on, which is not limited by the embodiments of the present application. The decoder or encoder in the embodiments of the present application may be such an electronic device.
The electronic device in the embodiments of the present application has point cloud encoding and decoding functions and generally includes a point cloud encoder (i.e., the encoder) and a point cloud decoder (i.e., the decoder).
The related technologies are described below using the G-PCC codec framework and the AVS codec framework as examples.
It can be understood that, in the G-PCC codec framework, the point cloud data to be encoded is first divided into multiple slices (also called strips) through slice partitioning. Within each slice, the geometric information of the point cloud and the attribute information corresponding to each point are encoded separately.
Figure 4A shows a schematic diagram of the framework of a G-PCC encoder. As shown in Figure 4A, in the geometry encoding process, a coordinate transformation is applied to the geometric information so that the whole point cloud is contained in a bounding box, and quantization is then performed. This quantization step mainly serves as scaling. Because quantization rounds coordinates, some points end up with identical geometric information, so a parameter determines whether duplicate points are removed. The process of quantizing and removing duplicate points is also called voxelization. Next, the bounding box is partitioned with an octree, or a prediction tree is constructed. In this process, either the points in the leaf nodes of the partition are arithmetically coded to generate a binary geometry bitstream, or the intersections (vertices) produced by the partition are arithmetically coded (with surface fitting based on the intersections) to generate a binary geometry bitstream. In the attribute encoding process, once geometry encoding is complete and the geometric information has been reconstructed, color conversion is performed first, converting the color information (i.e., the attribute information) from the RGB color space to the YUV color space. The reconstructed geometric information is then used to recolor the point cloud so that the unencoded attribute information corresponds to the reconstructed geometric information. Attribute encoding mainly targets color information, and there are three main transform methods. The first two rely on level of detail (LOD) partitioning: the distance-based lifting transform and the predicting transform. The third applies RAHT directly. All three convert the color information from the spatial domain to the frequency domain, producing high-frequency and low-frequency coefficients. Finally, the coefficients are quantized and the quantized coefficients are arithmetically coded to generate a binary attribute bitstream.
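The quantization and duplicate-removal step (voxelization) described above can be illustrated with a minimal sketch. The function name `voxelize`, the `scale` parameter, and the round-half-up rule are assumptions made for illustration; they are not taken from the G-PCC reference software.

```python
import math

def voxelize(points, scale, remove_duplicates=True):
    """Quantize float coordinates onto an integer grid, optionally dropping duplicates."""
    quantized = [
        tuple(math.floor(c * scale + 0.5) for c in point)  # scale, then round half up
        for point in points
    ]
    if remove_duplicates:
        # dict.fromkeys keeps the first occurrence of each voxel and preserves order
        quantized = list(dict.fromkeys(quantized))
    return quantized

points = [(0.10, 0.20, 0.30), (0.12, 0.22, 0.31), (1.0, 1.0, 1.0)]
voxels = voxelize(points, scale=10)
print(voxels)
```

The first two input points fall into the same voxel after rounding, which is the situation in which the duplicate-removal parameter matters.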
Figure 4B shows a schematic diagram of the framework of a G-PCC decoder. As shown in Figure 4B, for the acquired binary bitstream, the geometry bitstream and the attribute bitstream are first decoded independently. When the geometry bitstream is decoded, the geometric information of the point cloud is obtained through arithmetic decoding, octree or prediction-tree reconstruction, geometry reconstruction, and inverse coordinate transformation. When the attribute bitstream is decoded, the attribute information of the point cloud is obtained through arithmetic decoding, inverse quantization, LOD partitioning or RAHT, and inverse color conversion. The point cloud data (i.e., the output point cloud) is then restored from the geometric information and the attribute information.
It should be noted that, as shown in Figure 4A or Figure 4B, the current geometry coding of G-PCC can be divided into octree-based geometry coding (marked with a dashed box) and prediction-tree-based geometry coding (marked with a dash-dotted box).
Octree-based geometry encoding (octree geometry encoding, OctGeomEnc) proceeds as follows. First, a coordinate transformation is applied to the geometric information so that the whole point cloud is contained in a bounding box. Quantization is then performed; this step mainly serves as scaling. Because quantization rounds coordinates, some points end up with identical geometric information, and a parameter determines whether duplicate points are removed. The process of quantizing and removing duplicate points is also called voxelization. Next, the bounding box is repeatedly partitioned by a tree (for example an octree, quadtree, or binary tree) in breadth-first order, and the occupancy code of each node is encoded. In the related art, a company proposed an implicit geometry partitioning method. The bounding box of the point cloud is computed first; assuming d_x > d_y > d_z, the bounding box is a cuboid. During geometry partitioning, binary-tree partitioning along the x-axis is applied first, yielding two child nodes each time, until the condition d_x = d_y > d_z is met. Quadtree partitioning along the x- and y-axes is then applied, yielding four child nodes each time, until the condition d_x = d_y = d_z is met. From then on, octree partitioning is applied until the resulting leaf nodes are 1×1×1 unit cubes, at which point partitioning stops and the points in the leaf nodes are encoded to generate a binary bitstream. Two parameters, K and M, are introduced in the binary-tree/quadtree/octree partitioning process. Parameter K indicates the maximum number of binary-tree/quadtree partitions performed before octree partitioning; parameter M indicates that the minimum block side length for binary-tree/quadtree partitioning is 2^M. K and M must satisfy the following conditions: with d_max = max(d_x, d_y, d_z) and d_min = min(d_x, d_y, d_z), parameter K satisfies K ≥ d_max − d_min and parameter M satisfies M ≥ d_min. These conditions arise because, in the current G-PCC implicit geometry partitioning, the partitioning modes are prioritized as binary tree, quadtree, then octree; only when the node block size does not satisfy the binary-tree/quadtree conditions is the node partitioned by octree, down to the minimum leaf unit of 1×1×1. The octree-based geometry coding mode encodes point cloud geometry efficiently by exploiting the correlation between neighboring points in space. For relatively flat nodes, or nodes with planar characteristics, planar coding can further improve the coding performance of point cloud geometry.
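The ordering of binary-tree, quadtree, and octree splits described above can be sketched as follows. This is a simplified illustration only: it assumes d_x, d_y, d_z are the base-2 logarithms of the bounding-box side lengths, and it ignores the constraints imposed by parameters K and M. The function name is hypothetical.

```python
def implicit_partition_sequence(dx, dy, dz):
    """Return the split type used at each depth for a box of size 2^dx x 2^dy x 2^dz."""
    sizes = [dx, dy, dz]
    names = {1: "binary", 2: "quadtree", 3: "octree"}
    steps = []
    while max(sizes) > 0:
        largest = max(sizes)
        # Only the axes that are currently longest are halved at this depth.
        axes = [i for i, size in enumerate(sizes) if size == largest]
        steps.append(names[len(axes)])
        for i in axes:
            sizes[i] -= 1
    return steps

print(implicit_partition_sequence(3, 2, 1))
```

For a box with d_x > d_y > d_z the sequence starts with binary splits along x, moves to quadtree splits once d_x = d_y, and ends with octree splits once all three dimensions are equal.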
For example, Figure 5A and Figure 5B provide schematic diagrams of plane positions. Figure 5A shows low plane positions in the Z-axis direction, and Figure 5B shows high plane positions in the Z-axis direction. As shown in Figure 5A, (a), (a0), (a1), (a2), and (a3) all belong to the low plane position in the Z-axis direction. Taking (a) as an example, the four occupied child nodes of the current node are all located at the low plane position of the current node in the Z-axis direction, so the current node can be regarded as belonging to a Z plane that is a low plane in the Z-axis direction. Similarly, as shown in Figure 5B, (b), (b0), (b1), (b2), and (b3) all belong to the high plane position in the Z-axis direction. Taking (b) as an example, the four occupied child nodes of the current node are located at the high plane position of the current node in the Z-axis direction, so the current node can be regarded as belonging to a Z plane that is a high plane in the Z-axis direction.
Further, taking (a) in Figure 5A as an example, the performance of octree coding and planar coding can be compared. Figure 6 provides a schematic diagram of the node coding order, in which nodes are coded in the order 0, 1, 2, 3, 4, 5, 6, 7. If octree coding is used for (a) in Figure 5A, the occupancy information of the current node is represented as 10101010. If planar coding is used instead, a flag is first encoded to indicate that the current node is a plane in the Z-axis direction; then, because the current node is a plane in the Z-axis direction, its plane position is also encoded; finally, only the occupancy information of the low-plane child nodes in the Z-axis direction (i.e., the four child nodes 0, 2, 4, and 6) needs to be encoded. Encoding the current node with planar coding therefore requires only 6 bits (one flag bit, one position bit, and four occupancy bits), which is 2 bits fewer than the octree coding of the related art. Based on this analysis, planar coding offers a clear coding gain over octree coding. Accordingly, for an occupied node, if planar coding is used in a given dimension, the plane flag (planarMode) and plane position (PlanePos) of the current node in that dimension are represented first, and the occupancy information of the current node is then encoded based on this plane information. For example, Figure 7A shows a schematic diagram of plane identification information. As shown in Figure 7A, the node is a low plane in the Z-axis direction; accordingly, the plane flag takes the value true or 1, i.e., planarMode_Z = true, and the plane position is the low plane, i.e., PlanePosition_Z = low. Figure 7B shows another schematic diagram of plane identification information. As shown in Figure 7B, the node is not a plane in the Z-axis direction; accordingly, the plane flag takes the value false or 0, i.e., planarMode_Z = false.
Note that for PlaneMode_i, 0 indicates that the current node is not a plane in the i-axis direction, and 1 indicates that it is a plane in the i-axis direction. If the current node is a plane in the i-axis direction, then for PlanePosition_i, 0 indicates that the plane position is the low plane and 1 indicates that the plane position is the high plane. Here i denotes the coordinate dimension, which may be the X-axis, Y-axis, or Z-axis direction, so i = 0, 1, 2.
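The bit count in the planar-coding example can be reproduced with a small sketch. It assumes, as in the example, that child nodes 0, 2, 4, 6 form the low Z plane and 1, 3, 5, 7 the high Z plane, and it writes raw bits rather than the entropy-coded symbols a real codec would produce. The function name and the fallback behaviour for non-planar nodes are illustrative assumptions.

```python
LOW_Z_CHILDREN = (0, 2, 4, 6)    # low plane in Z, per the example in the text
HIGH_Z_CHILDREN = (1, 3, 5, 7)   # assumed complementary high plane

def planar_z_bits(occupancy: str) -> str:
    """Return plane flag + plane position + in-plane occupancy, or flag 0 + full occupancy."""
    occupied = {i for i, bit in enumerate(occupancy) if bit == "1"}
    for position, children in enumerate((LOW_Z_CHILDREN, HIGH_Z_CHILDREN)):
        if occupied and occupied <= set(children):
            plane_bits = "".join(occupancy[i] for i in children)
            return "1" + str(position) + plane_bits
    return "0" + occupancy       # not planar in Z: fall back to the full occupancy code

bits = planar_z_bits("10101010")
print(bits, len(bits))
```

For the occupancy code 10101010 the sketch emits one flag bit, one position bit, and four occupancy bits, which matches the 6 bits stated in the text.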
The octree-based geometry coding mode compresses efficiently only those points that are correlated in space. For points that are isolated in the geometric space, using the direct coding model (DCM) can greatly reduce complexity. For the nodes of the octree, the use of DCM is not signalled by flag information; instead it is inferred from the parent node and neighbor information of the current node. There are three ways to determine whether the current node is eligible for DCM coding:
(1) The current node has no sibling nodes, that is, the parent of the current node has only one child node, and the parent of that parent has only two occupied child nodes, so the current node has at most one neighbor node.
(2) The parent of the current node has only one occupied child node, namely the current node, and the six neighbor nodes that share a face with the current node are all empty.
(3) The number of sibling nodes of the current node is greater than 1.
For example, Figure 8 provides a schematic diagram of IDCM coding. If the current node is not eligible for DCM coding, it is partitioned by octree. If it is eligible, the number of points contained in the node is checked further: when the number of points is less than a threshold (for example, 2), the node is DCM-coded; otherwise octree partitioning continues. When the DCM coding mode is applied, a flag IDCM_flag is first encoded to indicate whether the current node is a true isolated point. When IDCM_flag is true, the current node is coded with DCM; otherwise it is still coded with the octree. When the current node is DCM-coded, its DCM mode is encoded. There are currently two DCM modes: (a) only one point exists (or several points that are duplicates of one another); (b) two points exist. Finally, the geometric information of each point is encoded. Assuming the side length of the node is 2^d, each component of a point's geometric coordinates requires d bits, and these bits are written directly into the bitstream. Note that when a lidar point cloud is encoded, the coordinate information in the three dimensions can be predictively coded using the lidar acquisition parameters, which further improves the coding performance of the geometric information.
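The statement that each coordinate component needs d bits for a node of side length 2^d can be illustrated with a short sketch. The function `direct_code_point` and the `node_origin` parameter are hypothetical names introduced here; the sketch assumes d ≥ 1 and writes the offsets inside the node as plain fixed-length binary.

```python
def direct_code_point(point, node_origin, d):
    """Write each coordinate's offset inside a node of side 2**d as d raw bits (d >= 1)."""
    bits = ""
    for coord, origin in zip(point, node_origin):
        offset = coord - origin
        if not 0 <= offset < (1 << d):
            raise ValueError("point lies outside the node")
        bits += format(offset, f"0{d}b")   # fixed-length binary, most significant bit first
    return bits

code = direct_code_point((13, 10, 15), node_origin=(8, 8, 8), d=3)
print(code, len(code))
```

A point in a node of side 8 (d = 3) therefore costs 3 × 3 = 9 raw bits in this sketch, independent of how sparsely the surrounding space is occupied.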
Note that when partitioning reaches the leaf nodes, the number of duplicate points in each leaf node must be encoded in the case of lossless geometry coding. Finally, the occupancy information of all nodes is encoded to generate a binary bitstream. In addition, G-PCC currently includes a planar coding mode: during geometry partitioning, it determines whether the child nodes of the current node lie in the same plane, and if they do, that plane is used to represent the child nodes of the current node.
For octree-based geometry decoding, the decoder proceeds in breadth-first order. Before decoding the occupancy information of each node, it first uses the already reconstructed geometric information to determine whether the current node is subject to planar decoding or IDCM decoding. If the current node satisfies the conditions for planar decoding, the plane flag and plane position of the current node are decoded first, and the occupancy information of the current node is then decoded based on the plane information. If the current node satisfies the conditions for IDCM decoding, the decoder first decodes whether the current node is a true IDCM node; if it is, the decoder parses the DCM mode of the current node, obtains the number of points in the current DCM node, and finally decodes the geometric information of each point. For nodes that satisfy neither the planar decoding nor the DCM decoding conditions, the occupancy information of the current node is decoded. By parsing the occupancy code of each node in this way and partitioning the nodes in turn until 1×1×1 unit cubes are obtained, and then parsing the number of points contained in each leaf node, the geometrically reconstructed point cloud is recovered.
For geometry coding based on a triangle soup (trisoup), geometry partitioning is also performed first. Unlike geometry coding based on binary trees, quadtrees, and octrees, however, this method does not partition the point cloud level by level down to 1×1×1 unit cubes. Instead, partitioning stops when the side length of a sub-block (block) is W. Based on the surface formed by the distribution of the points in each block, the at most twelve intersections (vertices) between that surface and the twelve edges of the block are obtained. The vertex coordinates of each block are encoded in turn to generate a binary bitstream.
After the at most twelve vertices have been encoded/decoded, the intersection centroid offset value of the cube is encoded/decoded so that the shape of the point cloud can be restored more accurately. To encode the intersection centroid offset value, the centroid of the at most twelve vertices (intersections) is computed first and denoted C_mean. At the encoder, a centroid C is computed from part of the real point set of the cube, and the offset is taken along the normal vector at the centroid point, as shown in Figure 9. Here α is the intersection centroid offset value, which needs to be encoded and decoded. The final triangle patches of the point cloud are then C V_1 V_2, C V_2 V_3, C V_3 V_4, and C V_4 V_1, and the point cloud is restored from these triangle patches.
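The construction of the centroid and of the triangle patches around it can be sketched as follows. The sketch assumes the vertices are already ordered around the surface, and it builds the fan around whatever centroid point is supplied; how the offset α shifts that point is not modelled here. Both function names are illustrative.

```python
def centroid_of(vertices):
    """Arithmetic mean of the block's vertices (C_mean in the text)."""
    count = len(vertices)
    return tuple(sum(v[axis] for v in vertices) / count for axis in range(3))

def triangle_fan(centroid, vertices):
    """Triangles (C, V_i, V_{i+1}), wrapping around so the last one is (C, V_n, V_1)."""
    count = len(vertices)
    return [(centroid, vertices[i], vertices[(i + 1) % count]) for i in range(count)]

vertices = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]   # V_1 .. V_4, ordered
c_mean = centroid_of(vertices)
triangles = triangle_fan(c_mean, vertices)
print(c_mean, len(triangles))
```

With four ordered vertices the fan contains four triangles, corresponding to C V_1 V_2, C V_2 V_3, C V_3 V_4, and C V_4 V_1.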
For trisoup-based reconstruction of point cloud geometry, the decoder first decodes the vertex coordinates and uses them to reconstruct the triangle patches. This process is shown in Figures 10A, 10B, and 10C. The block shown in Figure 10A contains three intersections (v1, v2, v3). The set of triangle patches formed from these three intersections in a certain order is called a triangle soup, i.e., trisoup, as shown in Figure 10B. Sampling is then performed on this set of triangle patches, and the resulting sample points are taken as the reconstructed point cloud within the block, as shown in Figure 10C.
The intersection centroid offset value α can be obtained at the decoder through decoding. The process by which the decoder decodes α is described in detail below.
The decoder can decode the at most twelve intersections in the current block. It then compares these intersections with the at most twelve intersections in a reference block (or an already decoded block) to determine the prediction inaccuracy qualitySKIP of the intersections in the current block. qualitySKIP indicates the number of inaccurately predicted values among the at most twelve intersections. If qualitySKIP is less than a threshold K, the intersections of the current block are predicted fairly accurately, and the intersection centroid offset value of the current block is very likely equal to that of the reference block; in this case possibleSKIP can be set to 1. The intersection centroid offset value of the reference block is also called the inter-frame prediction value driftSKIP of the intersection centroid offset value. driftSKIP is obtained from the earlier decoding of the reference block. The reference block may be the block preceding the current block, or it may be another block.
If possibleSKIP is 1, the flag Intersameflag is decoded to determine whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value. If the two are equal, decoding ends; if they are not equal, the subsequent decoding continues.
For example, if Intersameflag is 1, the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; if Intersameflag is 0, the two are not equal.
Similarly, the encoder can encode the intersection centroid offset value α in the manner described above.
To reduce the bitstream overhead of Intersameflag, Intersameflag can be encoded and decoded using arithmetic coding, with a context model used in the process.
The context model of Intersameflag can be determined based on qualitySKIP and driftSKIP. For example: Intersameflag = decode(ctxtMemOctree.ctxDriftSKIP[qualitySKIP][driftSKIP==0]).
The framework for decoding Intersameflag is shown in Figure 11. First, qualitySKIP is computed and compared with K. If qualitySKIP < K, then possibleSKIP = 1; if qualitySKIP ≥ K, then possibleSKIP = 0. If possibleSKIP = 1, Intersameflag is decoded; if possibleSKIP is not 1, the intersection centroid offset value of the current block is obtained through other decoding methods. If Intersameflag = 1, the intersection centroid offset value of the current block is equal to driftSKIP. If Intersameflag is not 1, the intersection centroid offset value of the current block is obtained through other decoding methods.
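The decision flow described for Figure 11 can be written as a minimal sketch. The callables `decode_flag` and `decode_explicit` are hypothetical stand-ins for the arithmetic decoder and for the "other methods" mentioned in the text; only the branching logic is taken from the description.

```python
def decode_centroid_offset(quality_skip, drift_skip, threshold_k,
                           decode_flag, decode_explicit):
    """Follow the possibleSKIP / Intersameflag branching described for Figure 11."""
    possible_skip = 1 if quality_skip < threshold_k else 0
    if possible_skip == 1:
        intersameflag = decode_flag(quality_skip, drift_skip)
        if intersameflag == 1:
            return drift_skip          # offset equals the inter-frame prediction value
    return decode_explicit()           # otherwise the offset is decoded by other means

# Usage with stub decoders standing in for the real bitstream reader.
same = decode_centroid_offset(2, 5, 4, lambda q, d: 1, lambda: 9)
different = decode_centroid_offset(2, 5, 4, lambda q, d: 0, lambda: 9)
not_eligible = decode_centroid_offset(6, 5, 4, lambda q, d: 1, lambda: 9)
print(same, different, not_eligible)
```

In the third call qualitySKIP is not below the threshold, so the flag decoder is never consulted and the offset comes from the fallback path.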
Decoding Intersameflag uses the parameters qualitySKIP and driftSKIP. qualitySKIP corresponds to 12 index values, 0 to 11; driftSKIP==0 corresponds to 2 index values, 0 and 1. Decoding Intersameflag therefore uses 12 × 2 = 24 contexts in total, as shown in Figure 12. Decoding Intersameflag in this way requires a large number of contexts, which affects encoding and decoding performance.
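The count of 24 contexts follows directly from the two-dimensional indexing ctxDriftSKIP[qualitySKIP][driftSKIP==0]. The sketch below flattens that index into a single number; the flattening formula and the function name are illustrative choices, not part of the reference software.

```python
NUM_QUALITY_VALUES = 12   # qualitySKIP takes the index values 0..11
NUM_DRIFT_VALUES = 2      # (driftSKIP == 0) is either 0 or 1

def context_index(quality_skip, drift_skip):
    """Flatten [qualitySKIP][driftSKIP == 0] into one context number."""
    return quality_skip * NUM_DRIFT_VALUES + int(drift_skip == 0)

all_contexts = {
    context_index(q, d) for q in range(NUM_QUALITY_VALUES) for d in (0, 1)
}
print(len(all_contexts))
```

Every combination of the two indices selects a distinct context, so each of the 24 probability models has to be stored and adapted separately.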
On this basis, an embodiment of the present application provides a point cloud encoding method, including: determining a first parameter of a current block according to at least one intersection in a reference block, the first parameter indicating the number of inaccurate values among the intersection prediction values determined from the at least one intersection; determining a second parameter according to the first parameter, the second parameter including a first value, where the first value corresponds to multiple values of the first parameter; determining a context model of a third parameter according to the second parameter, the third parameter indicating whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; and encoding the third parameter according to the context model of the third parameter.
An embodiment of the present application further provides a point cloud decoding method, including: determining a first parameter of a current block according to at least one intersection in a reference block, the first parameter indicating the number of inaccurate values among the intersection prediction values determined from the at least one intersection; determining a second parameter according to the first parameter, the second parameter including a first value, where the first value corresponds to multiple values of the first parameter; determining a context model of a third parameter according to the second parameter, the third parameter indicating whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; and decoding the third parameter according to the context model of the third parameter.
考虑到第一参数(如qualitySKIP)中的多个值对应的概率近似相等,因此,申请人提出可以用第二参数作为确定第三参数(如Intersameflag)的上下文模型的参数,其中,第二参数的一个值对应第一参数的多个值。通过这种方式,第二参数的索引值(或取值)的数量小于第一参数的索引值的数量,相比 于相关技术直接使用第一参数确定第三参数的上下文模型的方式,由第二参数确定第三参数的上下文模型,可以降低所需要的上下文模型的数量,有利于提升编解码第三参数的性能。Considering that the probabilities corresponding to multiple values in the first parameter (such as qualitySKIP) are approximately equal, the applicant proposes that the second parameter can be used as a parameter of the context model for determining the third parameter (such as Intersameflag), wherein one value of the second parameter corresponds to multiple values of the first parameter. In this way, the number of index values (or values) of the second parameter is less than the number of index values of the first parameter. In the related art, the context model of the third parameter is determined directly using the first parameter. The context model of the third parameter is determined by the second parameter, which can reduce the number of required context models and is conducive to improving the performance of encoding and decoding the third parameter.
The point cloud decoding method of the embodiments of the present application is first described in detail below with examples.
FIG. 13 is a schematic flowchart of the point cloud decoding method provided in an embodiment of the present application. The method shown in FIG. 13 can be applied to a decoder and can be used to decode the geometric information of a point cloud. In some implementations, the decoding method can be applied to a trisoup-based decoding method. In some implementations, the decoding method is based on the geometry-based solid content test model (GES-TM). GES-TM is a coding and decoding framework proposed for dense point clouds (such as point clouds collected in augmented reality (AR) or virtual reality (VR) scenes).
Referring to FIG. 13, in step S1310, a first parameter of a current block is determined based on at least one intersection in a reference block.
The at least one intersection may be an intersection between the point cloud in the reference block and the 12 edges of the reference block. The number of such intersections is greater than 0 and less than or equal to 12. The at least one intersection may be obtained by decoding the reference block. The reference block may be a decoded block. The reference block may be the reference block in the trisoup-based encoding and decoding method described above. The reference block may also be referred to as a block.
The first parameter may be used to indicate the number of inaccurate values among the intersection prediction values determined from the at least one intersection. The first parameter may be, for example, the qualitySKIP described above (of course, the first parameter may also be denoted by any other letters and/or numbers). The value of the first parameter may be 0 to 11; in other words, it is an integer greater than or equal to 0 and less than or equal to 11.
The first parameter may be determined based on the relationship between at least one intersection in the reference block and at least one intersection in the current block. For example, if intersection 1 in the reference block is the same as intersection 1 in the current block, the predicted value of intersection 1 in the current block is accurate; if intersection 1 in the reference block differs from intersection 1 in the current block, the predicted value of intersection 1 in the current block is inaccurate. The first parameter can be determined by comparing the at least one intersection in the reference block with the at least one intersection in the current block.
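The comparison described above might be sketched as follows. This is only an illustration under stated assumptions: each block is represented as a 12-entry sequence (one entry per edge, `None` where the edge has no intersection), an edge counts as inaccurate whenever the two entries differ, and the function name is hypothetical. The text gives the range of the first parameter as 0 to 11; the sketch does not model how that upper bound is enforced.

```python
from typing import Optional, Sequence

NUM_EDGES = 12  # a cubic block has 12 edges


def count_inaccurate_predictions(
    reference: Sequence[Optional[int]],
    current: Sequence[Optional[int]],
) -> int:
    """Count the edges whose reference-block intersection differs from the current block's."""
    if len(reference) != NUM_EDGES or len(current) != NUM_EDGES:
        raise ValueError("expected one entry per edge (12 entries)")
    # Assumption: an edge is "inaccurate" when the two entries differ,
    # including the case where only one of the blocks has an intersection.
    return sum(1 for ref, cur in zip(reference, current) if ref != cur)
```

With identical inputs the count is 0; each edge that differs adds one to the count.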
Continuing to refer to FIG. 13, in step S1320, a second parameter is determined based on the first parameter.
The second parameter may be a newly defined parameter. For example, the second parameter may be denoted Index_New (of course, the second parameter may also be denoted by any other letters and/or numbers).
The second parameter may include a first value, and the first value corresponds to multiple values of the first parameter. In other words, whenever the first parameter takes any one of the multiple values, the second parameter takes the first value. In some implementations, the probabilities (or probability distributions) corresponding to the multiple values are equal or approximately equal. Therefore, the contexts indexed by the multiple values can be merged into one to reduce the number of contexts.
The embodiments of the present application do not specifically limit the multiple values. For example, the multiple values may be the values of the first parameter that lie within a first value range. The multiple values may be consecutive. Alternatively, they may be non-consecutive or scattered.
The first value range may include some or all of the values from 1 to 11. For example, the first value range may be 1 to 11. As another example, the first value range may be P to 11, where P is an integer greater than 1, such as 2, 3, 4, or 5. As yet another example, the first value range may be 1 to Q, where Q is an integer greater than 1 and less than 11, such as 10, 9, 8, or 7.
The first value may be, for example, 0 or 1, which reduces the complexity of encoding and decoding. Taking a first value of 0 as an example, when the first parameter takes any one of the multiple values, the second parameter takes the value 0. Taking a first value of 1 as an example, when the first parameter takes any one of the multiple values, the second parameter takes the value 1.
Taking the multiple values being 1 to 11 as an example, when the first parameter takes any value from 1 to 11, the second parameter takes the first value, which is 0 or 1.
In some implementations, the second parameter also includes a second value. The second value may correspond to the values of the first parameter other than the multiple values above. For example, the second value corresponds to a single value of the first parameter, which may be any value other than the multiple values above, such as 0. For example, if the first value corresponds to the values 1 to 11 of the first parameter, then the second value corresponds to the value 0 of the first parameter. In other words, the first value and the second value of the second parameter may together cover all values of the first parameter.
The second value may be 1 or 0, and it differs from the first value. If the first value is 0, the second value is 1; if the first value is 1, the second value is 0.
The relationship between the value of the first parameter and the value of the second parameter may be implemented by a function or by a mapping table; the embodiments of the present application do not specifically limit this. For example, the second parameter may be determined based on the first parameter and a first mapping relationship, where the first mapping relationship indicates the mapping between the values of the first parameter and the values of the second parameter. Indicating this relationship through a mapping table can improve the running speed of the codec.
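As a rough illustration of such a mapping table, the sketch below builds a 12-entry table from a chosen set of merged values. The function name and the default choice of first value are assumptions made for illustration, as is the rule that every value outside the merged set maps to the second value; the application leaves these details open.

```python
from typing import Iterable, List

NUM_FIRST_PARAM_VALUES = 12  # the first parameter takes values 0..11


def build_mapping_table(merged_values: Iterable[int], first_value: int = 1) -> List[int]:
    """Return a 12-entry table mapping each first-parameter value to a second-parameter value."""
    if first_value not in (0, 1):
        raise ValueError("first_value must be 0 or 1")
    merged = set(merged_values)
    if not merged <= set(range(NUM_FIRST_PARAM_VALUES)):
        raise ValueError("merged values must lie in 0..11")
    second_value = 1 - first_value  # the second value differs from the first value
    # Assumption: every value outside the merged set maps to the second value.
    return [first_value if v in merged else second_value
            for v in range(NUM_FIRST_PARAM_VALUES)]
```

For example, `build_mapping_table(range(1, 12))` merges the values 1 to 11, while `build_mapping_table(range(3, 12))` corresponds to a first value range of P to 11 with P = 3.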
Continuing to refer to FIG. 13, in step S1330, a context model of a third parameter is determined based on the second parameter.
The third parameter may be used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value. The third parameter may be the Intersameflag described above (of course, the third parameter may also be denoted by any other letters and/or numbers).
The inter-frame prediction value of the intersection centroid offset value may be a centroid offset value obtained by prediction. It may be the intersection centroid offset value of the reference block.
If the third parameter is 1, the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; if the third parameter is 0, the two are not equal. Alternatively, the opposite convention may be used: if the third parameter is 0, the intersection centroid offset value of the current block is equal to the inter-frame prediction value; if the third parameter is 1, the two are not equal.
The third parameter may be determined based on a context model. In other words, the third parameter may be decoded based on a context model. The context model of the third parameter is related to the second parameter.
In some implementations, the context model of the third parameter may be determined based on the second parameter and a fourth parameter. The fourth parameter may be used to indicate the inter-frame prediction value of the intersection centroid offset value. The fourth parameter may be, for example, the driftSKIP described above (of course, the fourth parameter may also be denoted by any other letters and/or numbers).
In some implementations, the context model of the third parameter may be determined based on the second parameter and on whether the fourth parameter is 0. If the second parameter has 2 possible values and the test of whether the fourth parameter is 0 has 2 possible outcomes, the third parameter uses 2*2=4 contexts. Compared with the 24 contexts of the conventional scheme (12 values of the first parameter times 2), this is 20 fewer contexts, a substantial reduction.
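The context arithmetic above can be checked with a short sketch. The function names and the flattened index layout are assumptions made for illustration; the application only specifies that the context is selected by the second parameter and by whether the fourth parameter is 0.

```python
def context_index(index_new: int, drift_skip: int) -> int:
    """Flatten the pair (Index_New, driftSKIP == 0) into a single context index in 0..3."""
    if index_new not in (0, 1):
        raise ValueError("index_new must be 0 or 1")
    # Hypothetical layout: two consecutive slots per Index_New value.
    return index_new * 2 + int(drift_skip == 0)


def context_count(num_index_values: int) -> int:
    """Number of contexts: index values times the 2 outcomes of driftSKIP == 0."""
    return num_index_values * 2
```

Here `context_count(12)` corresponds to indexing directly by the first parameter, and `context_count(2)` to indexing by the second parameter.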
In some implementations, the third parameter is decoded only when the value of a fifth parameter satisfies a first condition, and the value of the fifth parameter is determined based on the value of the first parameter. The fifth parameter may be, for example, the possibleSKIP described above (of course, the fifth parameter may also be denoted by any other letters and/or numbers). Decoding the third parameter only when the first condition is satisfied avoids unnecessary decoding operations, which helps improve codec performance.
The first condition may include the fifth parameter being equal to 1. The decoder decodes the third parameter only if the fifth parameter is 1; if the fifth parameter is not 1 (for example, it is 0), the decoder does not decode the third parameter.
In some implementations, if the value of the first parameter is less than a first threshold, the value of the fifth parameter is 1. The first threshold may be, for example, K: if the value of the first parameter is less than K, the value of the fifth parameter is 1. K is an integer less than or equal to 12, such as 6 or 12.
Continuing to refer to FIG. 13, in step S1340, the third parameter is decoded based on the context model of the third parameter.
The decoder may use arithmetic decoding, together with the context model, to decode the third parameter and determine its value. If the third parameter is 1, the centroid offset value of the current block is determined to be the inter-frame prediction value of the centroid offset value; if the third parameter is 0, the centroid offset value of the current block is determined not to be the inter-frame prediction value. In the latter case, the decoder may determine the centroid offset value of the current block in another way.
In some implementations, the decoder may determine the reconstructed values of the point cloud within the current block based on the intersections in the current block.
The point cloud decoding method provided by the embodiments of the present application has been described in detail above with reference to FIG. 13. The point cloud encoding method provided by the embodiments of the present application is described in detail below with reference to FIG. 14.
FIG. 14 is a schematic flowchart of the point cloud encoding method provided in an embodiment of the present application. The method shown in FIG. 14 can be applied to an encoder and can be used to encode the geometric information of a point cloud. In some implementations, the encoding method can be applied to a trisoup-based encoding method. In some implementations, the encoding method is based on the geometry-based solid content test model (GES-TM). GES-TM is a coding and decoding framework proposed for dense point clouds (such as point clouds collected in augmented reality (AR) or virtual reality (VR) scenes).
Referring to FIG. 14, in step S1410, a first parameter of a current block is determined based on at least one intersection in a reference block.
The at least one intersection may be an intersection between the point cloud in the reference block and the 12 edges of the reference block. The number of such intersections is greater than 0 and less than or equal to 12. The at least one intersection may be obtained by decoding the reference block. The reference block may be a decoded block. The reference block may be the reference block in the trisoup-based encoding and decoding method described above. The reference block may also be referred to as a block.
The first parameter may be used to indicate the number of inaccurate values among the intersection prediction values determined from the at least one intersection. The first parameter may be, for example, the qualitySKIP described above (of course, the first parameter may also be denoted by any other letters and/or numbers). The value of the first parameter may be 0 to 11; in other words, it is an integer greater than or equal to 0 and less than or equal to 11.
The first parameter may be determined based on the relationship between at least one intersection in the reference block and at least one intersection in the current block. For example, if intersection 1 in the reference block is the same as intersection 1 in the current block, the predicted value of intersection 1 in the current block is accurate; if intersection 1 in the reference block differs from intersection 1 in the current block, the predicted value of intersection 1 in the current block is inaccurate. The first parameter can be determined by comparing the at least one intersection in the reference block with the at least one intersection in the current block.
Continuing to refer to FIG. 14, in step S1420, a second parameter is determined based on the first parameter.
The second parameter may be a newly defined parameter. For example, the second parameter may be denoted Index_New (of course, the second parameter may also be denoted by any other letters and/or numbers).
The second parameter may include a first value, and the first value corresponds to multiple values of the first parameter. In other words, whenever the first parameter takes any one of the multiple values, the second parameter takes the first value. In some implementations, the probabilities (or probability distributions) corresponding to the multiple values are equal or approximately equal. Therefore, the contexts indexed by the multiple values can be merged into one to reduce the number of contexts.
The embodiments of the present application do not specifically limit the multiple values. For example, the multiple values may be the values of the first parameter that lie within a first value range. The multiple values may be consecutive. Alternatively, they may be non-consecutive or scattered.
The first value range may include some or all of the values from 1 to 11. For example, the first value range may be 1 to 11. As another example, the first value range may be P to 11, where P is an integer greater than 1, such as 2, 3, 4, or 5. As yet another example, the first value range may be 1 to Q, where Q is an integer greater than 1 and less than 11, such as 10, 9, 8, or 7.
The first value may be, for example, 0 or 1, which reduces the complexity of encoding and decoding. Taking a first value of 0 as an example, when the first parameter takes any one of the multiple values, the second parameter takes the value 0. Taking a first value of 1 as an example, when the first parameter takes any one of the multiple values, the second parameter takes the value 1.
Taking the multiple values being 1 to 11 as an example, when the first parameter takes any value from 1 to 11, the second parameter takes the first value, which is 0 or 1.
In some implementations, the second parameter also includes a second value. The second value may correspond to the values of the first parameter other than the multiple values above. For example, the second value corresponds to a single value of the first parameter, which may be any value other than the multiple values above, such as 0. For example, if the first value corresponds to the values 1 to 11 of the first parameter, then the second value corresponds to the value 0 of the first parameter. In other words, the first value and the second value of the second parameter may together cover all values of the first parameter.
The second value may be 1 or 0, and it differs from the first value. If the first value is 0, the second value is 1; if the first value is 1, the second value is 0.
The relationship between the value of the first parameter and the value of the second parameter may be implemented by a function or by a mapping table; the embodiments of the present application do not specifically limit this. For example, the second parameter may be determined based on the first parameter and a first mapping relationship, where the first mapping relationship indicates the mapping between the values of the first parameter and the values of the second parameter. Indicating this relationship through a mapping table can improve the running speed of the codec.
Continuing to refer to FIG. 14, in step S1430, a context model of a third parameter is determined based on the second parameter.
The third parameter may be used to indicate whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value. The third parameter may be the Intersameflag described above (of course, the third parameter may also be denoted by any other letters and/or numbers).
The inter-frame prediction value of the intersection centroid offset value may be a centroid offset value obtained by prediction. It may be the intersection centroid offset value of the reference block.
If the third parameter is 1, the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; if the third parameter is 0, the two are not equal. Alternatively, the opposite convention may be used: if the third parameter is 0, the intersection centroid offset value of the current block is equal to the inter-frame prediction value; if the third parameter is 1, the two are not equal.
The third parameter may be determined based on a context model. In other words, the third parameter may be encoded based on a context model. The context model of the third parameter is related to the second parameter.
In some implementations, the context model of the third parameter may be determined based on the second parameter and a fourth parameter. The fourth parameter may be used to indicate the inter-frame prediction value of the intersection centroid offset value. The fourth parameter may be, for example, the driftSKIP described above (of course, the fourth parameter may also be denoted by any other letters and/or numbers).
In some implementations, the context model of the third parameter may be determined based on the second parameter and on whether the fourth parameter is 0. If the second parameter has 2 possible values and the test of whether the fourth parameter is 0 has 2 possible outcomes, the third parameter uses 2*2=4 contexts. Compared with the 24 contexts of the conventional scheme (12 values of the first parameter times 2), this is 20 fewer contexts, a substantial reduction.
In some implementations, the third parameter is encoded only when the value of a fifth parameter satisfies a first condition, and the value of the fifth parameter is determined based on the value of the first parameter. The fifth parameter may be, for example, the possibleSKIP described above (of course, the fifth parameter may also be denoted by any other letters and/or numbers). Encoding the third parameter only when the first condition is satisfied avoids unnecessary encoding operations, which helps improve codec performance.
The first condition may include the fifth parameter being equal to 1. The encoder encodes the third parameter only if the fifth parameter is 1; if the fifth parameter is not 1 (for example, it is 0), the encoder does not encode the third parameter.
In some implementations, if the value of the first parameter is less than a first threshold, the value of the fifth parameter is 1. The first threshold may be, for example, K: if the value of the first parameter is less than K, the value of the fifth parameter is 1. K is an integer less than or equal to 12, such as 6 or 12.
Continuing to refer to FIG. 14, in step S1440, the third parameter is encoded based on the context model of the third parameter. The encoder may use arithmetic coding, together with the context model, to encode the third parameter.
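The encoder-side decision can be sketched as follows, stopping short of the arithmetic coder itself. This is a sketch under stated assumptions rather than the application's implementation: the function name is hypothetical, the flag convention "1 means equal" is one of the two conventions the text allows, the second parameter is taken to be 0 for qualitySKIP equal to 0 and 1 otherwise, and `threshold_k` stands for the first threshold K.

```python
from typing import Optional, Tuple


def plan_intersameflag(
    quality_skip: int,
    drift: int,
    drift_skip: int,
    threshold_k: int,
) -> Optional[Tuple[int, int, int]]:
    """Return (flag, Index_New, driftSKIP == 0) for the arithmetic encoder,
    or None when the flag is not coded."""
    if quality_skip >= threshold_k:  # possibleSKIP == 0: the flag is skipped
        return None
    flag = 1 if drift == drift_skip else 0  # assumed convention: 1 means "equal"
    index_new = 0 if quality_skip == 0 else 1  # assumed mapping of the first parameter
    return flag, index_new, int(drift_skip == 0)
```

The last two elements of the returned tuple are the two indices that select the context model; an actual encoder would pass the flag and that context to its arithmetic coder.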
The embodiments of the present application are described in more detail below with specific examples. It should be noted that the examples below are intended only to help those skilled in the art understand the embodiments of the present application, not to limit the embodiments to the specific values or scenarios illustrated. Those skilled in the art can clearly make various equivalent modifications or changes based on the examples given below, and such modifications or changes also fall within the scope of the embodiments of the present application.
In the related art, when the Intersameflag flag is decoded during decoding of the intersection centroid offset value, the context models indexed by qualitySKIP (0 to 11) are redundant. Because the probability distributions for the nonzero values of qualitySKIP (1 to 11) are similar, the contexts indexed by those values can be merged into one. The present application proposes an optimization that reduces the number of context models without affecting performance or time complexity.
The improvement proposed in the present application is to convert qualitySKIP into a new index value, Index_New, when decoding the flag Intersameflag. Because the probability distributions for the nonzero values of qualitySKIP (1 to 11) are similar, the original 12 contexts (one for each value from 0 to 11) can be reduced to 2 contexts (for example, one for 0 and another shared by 1 to 11), reducing the number of contexts at the decoder without affecting performance or time complexity.
Intersameflag=decode(ctxtMemOctree.ctxDriftSKIP[Index_New][driftSKIP==0])
Here, the index values required for decoding Intersameflag are Index_New and driftSKIP==0.
方法一:Method 1:
如果0<=quantitySKIP<K-1,则Index_New=0,否则Index_New=1。If 0<=quantitySKIP<K-1, then Index_New=0, otherwise Index_New=1.
伪代码如下:The pseudo code is as follows:
If 0<=quantitySKIP<K-1:If 0<=quantitySKIP<K-1:
Index_New=0Index_New=0
ElseElse
Index_New=1Index_New=1
以K=2进行举例说明。如果K取2,则有如下方案。Take K = 2 as an example. If K is 2, the following solutions are available.
如果quantitySKIP==0,则Index_New=0,否则Index_New=1。If quantitySKIP==0, then Index_New=0, otherwise Index_New=1.
伪代码如下:The pseudo code is as follows:
If quantitySKIP==0:
    Index_New=0
Else (that is, quantitySKIP takes any value from 1 to 11):
    Index_New=1
The contexts used to decode Intersameflag are:
Index_New: 2 index values;
driftSKIP==0: 2 index values.
Intersameflag=decode(ctxtMemOctree.ctxDriftSKIP[Index_New][driftSKIP==0])
Method 1 requires 2*2=4 contexts in total, which is 20 fewer than the related art (24 contexts).
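The Method 1 mapping can be sketched in Python as follows. This is only an illustrative sketch, not the reference decoder: the function name index_new_method1 and the range check are assumptions, and the sketch assumes that qualitySKIP and quantitySKIP in the text denote the same parameter.

```python
def index_new_method1(quality_skip: int, k: int = 2) -> int:
    """Collapse qualitySKIP (0..11) into a two-valued context index."""
    if not 0 <= quality_skip <= 11:
        raise ValueError("qualitySKIP is expected to lie in 0..11")
    # Method 1: values below K-1 share index 0, all other values share index 1.
    return 0 if quality_skip < k - 1 else 1


# With K = 2, only qualitySKIP == 0 maps to index 0.
mapping = [index_new_method1(q) for q in range(12)]

# The second context dimension is the boolean driftSKIP == 0 (two values).
num_contexts = len(set(mapping)) * 2
```

With K=2 the 12 values of qualitySKIP collapse to two indices, so the flag needs 2*2=4 contexts instead of 12*2=24.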
Method 2:
Method 2 uses a look-up table: the relationship between the value of qualitySKIP and the value of Index_New is implemented through a mapping table.
The mapping table between Index_New and qualitySKIP is:
TABLE_QUANLITY_CONTEX=[0,1,1,1,1,1,1,1,1,1,1,1] or [1,0,0,0,0,0,0,0,0,0,0,0];
Index_New=TABLE_QUANLITY_CONTEX[qualitySKIP].
The contexts used to decode Intersameflag are:
Index_New: 2 index values;
driftSKIP==0: 2 index values.
Intersameflag=decode(ctxtMemOctree.ctxDriftSKIP[Index_New][driftSKIP==0])
Method 2 requires 2*2=4 contexts in total, which is 20 fewer than the related art (24 contexts).
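The look-up table of Method 2 can be sketched in Python as follows. The table contents and the name TABLE_QUANLITY_CONTEX are taken from the text; the helper name index_new_method2, the name of the alternative table, and the range check are assumptions added for illustration.

```python
# First table given in the text: qualitySKIP == 0 -> 0, qualitySKIP in 1..11 -> 1.
TABLE_QUANLITY_CONTEX = (0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)

# Alternative table given in the text: the same partition with the labels swapped.
TABLE_QUANLITY_CONTEX_ALT = (1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)


def index_new_method2(quality_skip: int, table=TABLE_QUANLITY_CONTEX) -> int:
    """Map qualitySKIP to Index_New through a 12-entry look-up table."""
    if not 0 <= quality_skip < len(table):
        raise ValueError("qualitySKIP is expected to lie in 0..11")
    return table[quality_skip]


# Both tables split qualitySKIP into the same two groups: {0} and {1, ..., 11}.
same_partition = all(
    (index_new_method2(q) == index_new_method2(0))
    == (index_new_method2(q, TABLE_QUANLITY_CONTEX_ALT)
        == index_new_method2(0, TABLE_QUANLITY_CONTEX_ALT))
    for q in range(12)
)
```

Either table yields two distinct index values, so the choice between them only relabels the two contexts and does not change their number.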
Figure 15 shows the framework for decoding the intersection centroid offset value, and Figure 16 shows the framework of the contexts required for decoding Intersameflag. The framework shown in Figure 15 is similar to the one shown in Figure 11. First, qualitySKIP is calculated and compared with K. If qualitySKIP<K, possibleSKIP=1; if qualitySKIP≥K, possibleSKIP=0. If possibleSKIP=1, Intersameflag is decoded; otherwise, the intersection centroid offset value of the current block is decoded by other methods. If Intersameflag=1, the intersection centroid offset value of the current block is equal to driftSKIP; otherwise, the intersection centroid offset value of the current block is decoded by other methods.
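The decision flow described for Figure 15 can be sketched in Python as follows. This is a sketch under stated assumptions: decode_bin and decode_other are hypothetical callables standing in for the entropy decoder and for the "other methods" mentioned in the text, and the index mapping inside is the K=2 scheme of Method 1.

```python
from typing import Callable


def decode_centroid_offset(
    quality_skip: int,
    drift_skip: int,
    k: int,
    decode_bin: Callable[[int, int], int],   # hypothetical: decodes one flag with context [i][j]
    decode_other: Callable[[], int],         # hypothetical: fallback decoding path
) -> int:
    """Return the intersection centroid offset value of the current block."""
    # possibleSKIP is 1 only when qualitySKIP is below the threshold K.
    possible_skip = 1 if quality_skip < k else 0
    if possible_skip == 1:
        # Reduced context index: 0 for qualitySKIP == 0, 1 for every other value.
        index_new = 0 if quality_skip == 0 else 1
        intersameflag = decode_bin(index_new, int(drift_skip == 0))
        if intersameflag == 1:
            # The offset equals its inter-frame prediction, driftSKIP.
            return drift_skip
    # Either the flag was not decoded or it was 0: use the other decoding path.
    return decode_other()
```

Passing the decoder as a callable keeps the sketch independent of any particular arithmetic coder; only the context selection [Index_New][driftSKIP==0] comes from the text.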
Referring to Figure 16, the context model of Intersameflag can be determined based on Index_New and driftSKIP==0. Since Index_New takes 2 values and driftSKIP==0 takes 2 values, 4 contexts are required to decode Intersameflag, which greatly reduces the number of contexts needed for this flag.
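The 2x2 context arrangement can be sketched in Python as follows. The class BinContext is a simple frequency-count model used purely as a stand-in; it is an assumption for illustration and not the probability model of the actual codec. The names ctx_drift_skip and select_context are likewise hypothetical.

```python
class BinContext:
    """Stand-in adaptive model for one binary flag (frequency counts)."""

    def __init__(self) -> None:
        self.counts = [1, 1]  # one pseudo-count each for bit 0 and bit 1

    def p_one(self) -> float:
        return self.counts[1] / sum(self.counts)

    def update(self, bit: int) -> None:
        self.counts[bit] += 1


# ctxDriftSKIP[Index_New][driftSKIP == 0]: 2 x 2 = 4 context models.
ctx_drift_skip = [[BinContext() for _ in range(2)] for _ in range(2)]


def select_context(quality_skip: int, drift_skip: int) -> BinContext:
    """Pick the context model used to code Intersameflag."""
    index_new = 0 if quality_skip == 0 else 1
    return ctx_drift_skip[index_new][int(drift_skip == 0)]
```

Because every non-zero qualitySKIP selects the same row, statistics gathered for, say, qualitySKIP=5 are shared with qualitySKIP=11, which is the merging the text relies on.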
Tables 1 and 2 show the test results for Method 1 and Method 2, respectively. As can be seen from Tables 1 and 2, the solution of this application has almost no impact on performance or time complexity, while removing 20 contexts at the decoder. When decoding the intersection centroid offset value α, the related art uses 73 contexts in total; with the solution of this application, Method 1 or Method 2 uses 53 contexts, which corresponds to a reduction of about 27% in the number of contexts.
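The context counts quoted above can be checked with a few lines of Python. All numbers come from the text; the variable names are illustrative only.

```python
# Contexts for the flag: (values of the first index) x (values of driftSKIP == 0).
related_art_flag_contexts = 12 * 2   # qualitySKIP 0..11 used directly
proposed_flag_contexts = 2 * 2       # Index_New has two values

saved = related_art_flag_contexts - proposed_flag_contexts

total_before = 73                    # contexts used to decode the offset value (related art)
total_after = total_before - saved

reduction_percent = 100 * saved / total_before
print(f"saved={saved}, total_after={total_after}, reduction={reduction_percent:.1f}%")
```

The saving of 20 contexts out of 73 is where the figure of roughly 27% comes from.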
When decoding the intersection centroid offset value, this application uses a new indexing scheme that converts the qualitySKIP index into a new index, Index_New. The new indexing scheme reduces the number of contexts without affecting performance or time complexity.
Table 1
Table 2
The method embodiments of this application are described in detail above with reference to Figures 1 to 16. The apparatus embodiments of this application are described in detail below with reference to Figures 17 to 20. It should be understood that the descriptions of the method embodiments and the apparatus embodiments correspond to each other; for parts not described in detail, refer to the preceding method embodiments.
Figure 17 is a schematic structural diagram of a decoder provided by an embodiment of this application. As shown in Figure 17, the decoder 1700 may include a first determining unit 1710, a second determining unit 1720, a third determining unit 1730, and a decoding unit 1740.
The first determining unit 1710 is configured to determine a first parameter of the current block according to at least one intersection in a reference block, where the first parameter indicates the number of inaccurate values among the intersection prediction values determined based on the at least one intersection.
The second determining unit 1720 is configured to determine a second parameter according to the first parameter, where the second parameter includes a first value, and the first value corresponds to multiple values of the first parameter.
The third determining unit 1730 is configured to determine a context model of a third parameter according to the second parameter, where the third parameter indicates whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value.
The decoding unit 1740 is configured to decode the third parameter according to the context model of the third parameter.
In some implementations, the multiple values are those values of the first parameter that fall within a first value range.
In some implementations, the first value range includes some or all of the values from 1 to 11.
In some implementations, the first value is 0 or 1.
In some implementations, the second parameter further includes a second value, and the second value corresponds to one of the values of the first parameter.
In some implementations, the value of the first parameter corresponding to the second value is 0.
In some implementations, the second value is 1 or 0.
In some implementations, the second parameter is determined based on the first parameter and a first mapping relationship, where the first mapping relationship indicates the mapping between values of the first parameter and values of the second parameter.
In some implementations, the context model is determined based on the second parameter and a fourth parameter, where the fourth parameter indicates the inter-frame prediction value of the intersection centroid offset value.
In some implementations, the third parameter is decoded when the value of a fifth parameter satisfies a first condition, and the value of the fifth parameter is determined based on the value of the first parameter.
In some implementations, the first condition is that the value of the fifth parameter is 1.
In some implementations, if the value of the first parameter is less than a first threshold, the value of the fifth parameter is 1.
In some implementations, the decoder further includes a fourth determining unit configured to determine a reconstructed value of the point cloud within the current block according to the intersections in the current block.
It can be understood that, in the embodiments of this application, a "unit" may be part of a circuit, part of a processor, part of a program or software, and so on; it may be a module, or it may be non-modular. Moreover, the components in this embodiment may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.
If the integrated unit is implemented as a software functional module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium. On this understanding, the technical solution of this embodiment, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, can be embodied as a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or some of the steps of the method described in this embodiment. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Accordingly, an embodiment of this application provides a computer-readable storage medium applied to the decoder 1700. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the decoding method described in any one of the preceding embodiments.
Based on the composition of the decoder 1700 and the computer-readable storage medium described above, Figure 18 shows a schematic diagram of a specific hardware structure of a decoder 1800 provided by an embodiment of this application. As shown in Figure 18, the decoder 1800 may include a communication interface 1810, a memory 1820, and a processor 1830, coupled together through a bus system 1840. It can be understood that the bus system 1840 provides the connection and communication between these components. In addition to a data bus, the bus system 1840 includes a power bus, a control bus, and a status signal bus. For clarity, however, all of these buses are labeled as the bus system 1840 in Figure 18. Specifically:
The communication interface 1810 is used for receiving and sending signals while exchanging information with other external network elements;
The memory 1820 is used for storing a computer program;
The processor 1830 is used for performing the following when running the computer program: determining a first parameter of the current block according to at least one intersection in a reference block, where the first parameter indicates the number of inaccurate values among the intersection prediction values determined based on the at least one intersection; determining a second parameter according to the first parameter, where the second parameter includes a first value, and the first value corresponds to multiple values of the first parameter; determining a context model of a third parameter according to the second parameter, where the third parameter indicates whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; and decoding the third parameter according to the context model of the third parameter.
It can be understood that the memory 1820 in the embodiments of this application may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DRRAM). The memory 1820 of the systems and methods described in this application is intended to include, without being limited to, these and any other suitable types of memory.
The processor 1830 may be an integrated circuit chip with signal processing capability. During implementation, each step of the above method can be completed by integrated hardware logic circuits in the processor 1830 or by instructions in the form of software. The processor 1830 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of this application may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 1820; the processor 1830 reads the information in the memory 1820 and completes the steps of the above method in combination with its hardware.
It can be understood that the embodiments described in this application can be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof. For a hardware implementation, the processing unit can be implemented in one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application, or a combination thereof. For a software implementation, the techniques described in this application can be implemented by modules (for example, procedures and functions) that perform the functions described in this application. The software code can be stored in a memory and executed by a processor. The memory can be implemented inside or outside the processor.
Optionally, as another embodiment, the processor 1830 is further configured to execute, when running the computer program, the decoding method described in any one of the preceding embodiments.
Figure 19 is a schematic structural diagram of an encoder provided by an embodiment of this application. As shown in Figure 19, the encoder 1900 includes a first determining unit 1910, a second determining unit 1920, a third determining unit 1930, and an encoding unit 1940.
The first determining unit 1910 is configured to determine a first parameter of the current block according to at least one intersection in a reference block, where the first parameter indicates the number of inaccurate values among the intersection prediction values determined based on the at least one intersection.
The second determining unit 1920 is configured to determine a second parameter according to the first parameter, where the second parameter includes a first value, and the first value corresponds to multiple values of the first parameter.
The third determining unit 1930 is configured to determine a context model of a third parameter according to the second parameter, where the third parameter indicates whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value.
The encoding unit 1940 is configured to encode the third parameter according to the context model of the third parameter.
In some implementations, the multiple values are those values of the first parameter that fall within a first value range.
In some implementations, the first value range includes some or all of the values from 1 to 11.
In some implementations, the first value is 0 or 1.
In some implementations, the second parameter further includes a second value, and the second value corresponds to one of the values of the first parameter.
In some implementations, the value of the first parameter corresponding to the second value is 0.
In some implementations, the second value is 1 or 0.
In some implementations, the second parameter is determined based on the first parameter and a first mapping relationship, where the first mapping relationship indicates the mapping between values of the first parameter and values of the second parameter.
In some implementations, the context model is determined based on the second parameter and a fourth parameter, where the fourth parameter indicates the inter-frame prediction value of the intersection centroid offset value.
In some implementations, the third parameter is encoded when the value of a fifth parameter satisfies a first condition, and the value of the fifth parameter is determined based on the value of the first parameter.
In some implementations, the first condition is that the value of the fifth parameter is 1.
In some implementations, if the value of the first parameter is less than a first threshold, the value of the fifth parameter is 1.
It can be understood that, in the embodiments of this application, a "unit" may be part of a circuit, part of a processor, part of a program or software, and so on; it may be a module, or it may be non-modular. Moreover, the components in this embodiment may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.
If the integrated unit is implemented as a software functional module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium. On this understanding, the technical solution of this embodiment, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, can be embodied as a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or some of the steps of the method described in this embodiment. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc.
Accordingly, an embodiment of this application provides a computer-readable storage medium applied to the encoder 1900. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the encoding method described in any one of the preceding embodiments.
Based on the composition of the encoder 1900 and the computer-readable storage medium described above, Figure 20 shows a schematic diagram of a specific hardware structure of an encoder 2000 provided by an embodiment of this application. As shown in Figure 20, the encoder 2000 may include a communication interface 2010, a memory 2020, and a processor 2030, coupled together through a bus system 2040. It can be understood that the bus system 2040 provides the connection and communication between these components. In addition to a data bus, the bus system 2040 includes a power bus, a control bus, and a status signal bus. For clarity, however, all of these buses are labeled as the bus system 2040 in Figure 20. Specifically:
The communication interface 2010 is used for receiving and sending signals while exchanging information with other external network elements;
The memory 2020 is used for storing a computer program;
The processor 2030 is used for performing the following when running the computer program: determining a first parameter of the current block according to at least one intersection in a reference block, where the first parameter indicates the number of inaccurate values among the intersection prediction values determined based on the at least one intersection; determining a second parameter according to the first parameter, where the second parameter includes a first value, and the first value corresponds to multiple values of the first parameter; determining a context model of a third parameter according to the second parameter, where the third parameter indicates whether the intersection centroid offset value of the current block is equal to the inter-frame prediction value of the intersection centroid offset value; and encoding the third parameter according to the context model of the third parameter.
It can be understood that the memory 2020 in the embodiments of this application may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a ROM, PROM, EPROM, EEPROM, or flash memory. The volatile memory may be a RAM, which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as SRAM, DRAM, SDRAM, DDR SDRAM, ESDRAM, SLDRAM, and DRRAM. The memory 2020 of the systems and methods described in this application is intended to include, without being limited to, these and any other suitable types of memory.
The processor 2030 may be an integrated circuit chip with signal processing capability. During implementation, each step of the above method can be completed by integrated hardware logic circuits in the processor 2030 or by instructions in the form of software. The processor 2030 may be a general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of this application may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 2020; the processor 2030 reads the information in the memory 2020 and completes the steps of the above method in combination with its hardware.
It can be understood that the embodiments described in this application can be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof. For a hardware implementation, the processing unit can be implemented in one or more ASICs, DSPs, DSPDs, PLDs, FPGAs, general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application, or a combination thereof. For a software implementation, the techniques described in this application can be implemented by modules (for example, procedures and functions) that perform the functions described in this application. The software code can be stored in a memory and executed by a processor. The memory can be implemented inside or outside the processor.
Optionally, as another embodiment, the processor 2030 is further configured to execute, when running the computer program, the encoding method described in any one of the preceding embodiments.
本申请实施例还提供一种计算机可读存储介质,所述计算机可读存储介质为存储比特流的非易失性计算机可读存储介质,所述比特流可以通过利用编码器的编码方法而生成,或者,所述比特流通过利用解码器的解码方法而解码,其中,所述解码方法可以为前文任一实施例所述的解码方法、所述编码方法可以为前文任一实施例所述的编码方法。An embodiment of the present application also provides a computer-readable storage medium, which is a non-volatile computer-readable storage medium for storing a bit stream. The bit stream can be generated by an encoding method of an encoder, or the bit stream can be decoded by a decoding method of a decoder, wherein the decoding method can be the decoding method described in any of the foregoing embodiments, and the encoding method can be the encoding method described in any of the foregoing embodiments.
需要说明的是,在本申请中,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者装置不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者装置所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括该要素的过程、方法、物品或者装置中还存在另外的相同要素。It should be noted that, in this application, the terms "comprises," "includes," or any other variations thereof are intended to encompass non-exclusive inclusion, such that a process, method, article, or apparatus comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such process, method, article, or apparatus. In the absence of further limitations, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus comprising the element.
The serial numbers of the foregoing embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.
The methods disclosed in the several method embodiments provided in this application can be combined arbitrarily, provided there is no conflict, to obtain new method embodiments.
The features disclosed in the several product embodiments provided in this application can be combined arbitrarily, provided there is no conflict, to obtain new product embodiments.
The features disclosed in the several method or device embodiments provided in this application can be combined arbitrarily, provided there is no conflict, to obtain new method embodiments or device embodiments.
The foregoing describes merely specific implementations of the present application, but the scope of protection of the present application is not limited thereto. Any variation or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in this application shall fall within the scope of protection of this application. Therefore, the scope of protection of this application shall be determined by the scope of protection of the claims.
Claims (32)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2024/088076 WO2025217813A1 (en) | 2024-04-16 | 2024-04-16 | Point cloud encoding method, point cloud decoding method, encoder, decoder, bitstream, and storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2024/088076 WO2025217813A1 (en) | 2024-04-16 | 2024-04-16 | Point cloud encoding method, point cloud decoding method, encoder, decoder, bitstream, and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025217813A1 (en) | 2025-10-23 |
Family
ID=97402826
Family Applications (1)
| Application Number | Status | Publication | Priority Date | Filing Date | Title |
|---|---|---|---|---|---|
| PCT/CN2024/088076 | Pending | WO2025217813A1 (en) | 2024-04-16 | 2024-04-16 | Point cloud encoding method, point cloud decoding method, encoder, decoder, bitstream, and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025217813A1 (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2022109810A1 (en) * | 2020-11-24 | 2022-06-02 | 浙江大学 | Point cloud encoding method and apparatus, point cloud decoding method and apparatus, and storage medium |
| US20230099908A1 (en) * | 2021-09-27 | 2023-03-30 | Qualcomm Incorporated | Coding point cloud data using direct mode for inter-prediction in g-pcc |
| CN117795554A (en) * | 2023-10-06 | 2024-03-29 | 北京小米移动软件有限公司 | Methods for decoding and encoding 3D point clouds |
| WO2024065270A1 (en) * | 2022-09-28 | 2024-04-04 | Oppo广东移动通信有限公司 | Point cloud encoding method and apparatus, point cloud decoding method and apparatus, devices, and storage medium |
- 2024-04-16: WO application PCT/CN2024/088076 (patent/WO2025217813A1/en), status: active, pending
Similar Documents
| Publication | Title |
|---|---|
| TW202425653A (en) | Point cloud encoding and decoding method, device, equipment and storage medium |
| WO2025217813A1 (en) | Point cloud encoding method, point cloud decoding method, encoder, decoder, bitstream, and storage medium |
| WO2024174086A1 (en) | Decoding method, encoding method, decoders and encoders |
| TW202425635A (en) | Point cloud encoding method and apparatus, point cloud decoding method and apparatus, devices, and storage medium |
| WO2024145904A1 (en) | Encoding method, decoding method, code stream, encoder, decoder, and storage medium |
| US20250337964A1 (en) | Encoding method, decoding method, encoder, decoder and storage medium |
| WO2024174092A9 (en) | Encoding/decoding method, code stream, encoder, decoder, and storage medium |
| WO2022116122A1 (en) | Intra-frame prediction method and apparatus, codec, device, and storage medium |
| US20250337924A1 (en) | Encoding method, decoding method, bitstream, encoder, decoder and storage medium |
| US20250392732A1 (en) | Coding method, coder, electronic device, and storage medium |
| WO2025217889A1 (en) | Encoding method, decoding method, code stream, decoder, encoder, and storage medium |
| WO2024212043A1 (en) | Encoding method, decoding method, code stream, encoder, decoder, and storage medium |
| WO2024187380A1 (en) | Encoding method, decoding method, code stream, encoder, decoder and storage medium |
| WO2025213480A1 (en) | Encoding method and apparatus, decoding method and apparatus, point cloud encoder, point cloud decoder, bit stream, device, and storage medium |
| WO2025076663A1 (en) | Encoding method, decoding method, encoder, decoder, and storage medium |
| WO2024207235A1 (en) | Encoding/decoding method, bitstream, encoder, decoder, and storage medium |
| WO2024216649A1 (en) | Point cloud encoding and decoding method, encoder, decoder, code stream, and storage medium |
| WO2023173237A1 (en) | Encoding method, decoding method, bit stream, encoder, decoder, and storage medium |
| WO2025010590A1 (en) | Decoding method, coding method, decoder, and coder |
| WO2024212228A1 (en) | Coding method, coder, electronic device, and storage medium |
| WO2023173238A1 (en) | Encoding method, decoding method, code stream, encoder, decoder, and storage medium |
| WO2024212038A1 (en) | Encoding method, decoding method, code stream, encoder, decoder, and storage medium |
| WO2025039113A1 (en) | Encoding method, decoding method, code stream, encoder, decoder, and storage medium |
| WO2025145433A1 (en) | Point cloud encoding method, point cloud decoding method, codec, code stream, and storage medium |
| WO2024212042A1 (en) | Coding method, decoding method, code stream, coder, decoder, and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24935366; Country of ref document: EP; Kind code of ref document: A1 |