
GB2626029A - Enhanced method for point cloud compression - Google Patents


Info

Publication number
GB2626029A
Authority
GB
United Kingdom
Prior art keywords
predictor
azimuth
point
encoding
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
GB2300234.8A
Inventor
Le Floch Hervé
Tannhauser Falk
Ouedraogo Naël
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc filed Critical Canon Inc
Priority to GB2300234.8A priority Critical patent/GB2626029A/en
Priority to GB2300627.3A priority patent/GB2626043A/en
Priority to GB2305110.5A priority patent/GB2629559A/en
Publication of GB2626029A publication Critical patent/GB2626029A/en


Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/001Model-based coding, e.g. wire frame
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/004Predictors, e.g. intraframe, interframe coding
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/005Statistical coding, e.g. Huffman, run length coding
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/40Tree coding, e.g. quadtree, octree
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • H04N19/107Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96Tree coding, e.g. quad-tree coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Image Processing (AREA)
  • Image Generation (AREA)

Abstract

The present invention concerns a method of encoding and decoding a 3D dynamic point cloud, comprising a sequence of frames of 3D point clouds, in a bitstream, each 3D point cloud comprising a set of 3D points (for example in a G-PCC, Geometry-based Point Cloud Compression, encoder). The encoding method comprises obtaining an azimuth predictor for encoding the azimuth of a current 3D point using a coding mode (such as INTER coding), wherein the azimuth predictor is obtained by applying a multiplier to a selected azimuth predictor. One or more probability distributions are then determined based on the coding mode and one or more further parameters. The multiplier is encoded using the obtained one or more probability distributions. Also disclosed is an equivalent decoding method in which a multiplier is decoded using a probability distribution determined based on a coding mode and a further parameter. The further parameter may be a selected INTRA or INTER predictor index, or a number of times an azimuth angle is scaled in dependence on a radius of the selected predictor or the 3D point.

Description

METHOD AND APPARATUS FOR COMPRESSION AND ENCODING OF 3D DYNAMIC POINT CLOUD
FIELD OF THE INVENTION
The present disclosure concerns a method and a device for compression and encoding of a 3D point cloud. It concerns more particularly 3D dynamic point clouds. A 3D dynamic point cloud is a temporal sequence of 3D point clouds, each 3D point cloud comprising a variable number of 3D points with variable 3D positions.
BACKGROUND OF INVENTION
In particular, the 3D point cloud can be a set of 3D points captured by a LIDAR located on the top of a car. For example, the LIDAR can be a rotating LIDAR containing lasers at different elevations, rotating horizontally around the vertical axis.
For compressing 3D points, a standard called G-PCC V1 (ISO/IEC FDIS 23090-9:2022(E)) has emerged, which proposes to encode 3D points based on INTRA prediction. However, recent improvements over the standard are proposed in an MPEG document called Technology Under Consideration ("ISO/IEC JTC 1/SC 29/WG 7 N00281", called herein TUC). In particular, new tools like the spherical representation or INTER prediction have been introduced.
The compression (or encoding, both terms being used in this disclosure referring to the same techniques) techniques proposed in these two documents are based on predictive encoding where the encoding of a particular point is based on the identification of a predictor, and the encoding of a residual constituted by the difference between the current point and the 3D coordinates of a previously encoded point (the predictor). As the encoding may be destructive, the residual is actually computed as the difference of the current point and the decoded version of the predictor. INTRA modes correspond to modes where the predictor is chosen in the current frame among the previously encoded/decoded points, meaning the current 3D point cloud. INTER modes correspond to modes where the predictor is chosen in a reference frame, meaning a previously encoded/decoded 3D point cloud.
Whatever the used mode (INTER or INTRA), for a rotating LiDAR and a spherical representation, the initial prediction values of the azimuth are shifted by a multiple of the rotating LiDAR speed, leading to new azimuthal prediction values. This multiple (called multiplier or qPhi in this disclosure) is a coding parameter sent to the decoder.
This shift brings the prediction closer to the point to encode. As a consequence, the azimuth residual is smaller and easier to compress. The multiplier is usually encoded with an entropy encoder, and especially with an arithmetic encoder.
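As a purely illustrative numerical example (the values are not taken from the standard): if the selected azimuth predictor is 1000, the azimuth of the point to encode is 1790 and the azimuth rotation step is 256, a multiplier of 3 shifts the prediction to 1000 + 3 * 256 = 1768, leaving an azimuth residual of 22 to encode instead of 790.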
However, the coding of the qPhi is not optimal. Improvements in terms of coding efficiency of this parameter may be advantageous for both INTRA and INTER modes.
SUMMARY OF THE INVENTION
The present invention has been devised to address one or more of the foregoing concerns. It concerns different improvements over techniques defined in the standard, the TUC propositions and adopted tools for future version of G-PCC.
According to a first aspect of the invention it is proposed a method of encoding a 3D dynamic point cloud, comprising a sequence of frames of 3D point clouds, in a bitstream, each 3D point cloud comprising a set of 3D points, the method comprising: obtaining an azimuth predictor for encoding the azimuth of a current 3D point using a coding mode, wherein the azimuth predictor is obtained by applying a multiplier to a selected azimuth predictor, the multiplier being representative of a number of azimuthal angle steps to be added to the selected azimuth predictor; obtaining one or more probability distributions determined based on the coding mode and one or more further parameters; and encoding the multiplier using the obtained one or more probability distributions.
In an embodiment, encoding the multiplier comprises encoding each of one or more elements composing the multiplier. For a given value of the multiplier, one or more elements need to be encoded. For example, encoding a multiplier value of 1 may require the encoding of 0 and 1. Here there are two elements 0 and 1. The sign of the multiplier (positive and/or negative) may also be considered as an element composing the multiplier and that is to be encoded.
In an embodiment, a plurality of determined probability distributions is associated with an element of the multiplier value, and the element is encoded using one probability distribution selected among the plurality based on the coding mode and one or more further parameters.
In an embodiment, the coding mode is an INTRA coding mode or an INTER coding mode.
In an embodiment, the one or more further parameters are chosen among: an index of the selected predictor when INTRA coding mode is used; an index of the selected predictor when INTER coding mode is used; a number, or an estimation thereof, of times an azimuthal angle is scaled depending on a radius of the selected predictor or the 3D point.
In an embodiment, the method further comprises encoding the azimuth of the current 3D point using the obtained azimuth predictor.
According to another aspect of the invention it is proposed a method of decoding a bitstream comprising an encoded 3D dynamic point cloud, the 3D dynamic point cloud comprising a sequence of frames of 3D point clouds, each 3D point cloud comprising a set of 3D points, the method comprising: determining a coding mode from the bitstream; obtaining one or more further parameters; obtaining one or more probability distributions determined based on the coding mode and the obtained one or more further parameters; and decoding a multiplier using the obtained one or more probability distributions.
In an embodiment, the one or more further parameters are chosen among: an index of the predictor determined based on the coding mode; a number, or estimation thereof, of times an azimuthal angle is scaled depending on a radius of the selected predictor or the 3D point.
In an embodiment, the method further comprises: selecting an azimuth predictor; determining a modified azimuth predictor by applying the decoded multiplier to the selected azimuth predictor; and decoding the azimuth of the current 3D point using the determined azimuth predictor.
In an embodiment, the coding mode is an INTRA coding mode or an INTER coding mode.
According to another aspect of the invention it is proposed a device for encoding a 3D dynamic point cloud, comprising a sequence of frames of 3D point clouds, in a bitstream, each 3D point cloud comprising a set of 3D points, the device comprising a processor configured for: obtaining an azimuth predictor for encoding the azimuth of a current 3D point using a coding mode, wherein the azimuth predictor is obtained by applying a multiplier to a selected azimuth predictor; obtaining one or more probability distributions determined based on the coding mode and one or more further parameters; and encoding the multiplier using the obtained one or more probability distributions.
According to another aspect of the invention it is proposed a device for decoding a bitstream comprising an encoded 3D dynamic point cloud, the 3D dynamic point cloud comprising a sequence of frames of 3D point clouds, each 3D point cloud comprising a set of 3D points, the device comprising a processor configured for: determining a coding mode from the bitstream; obtaining one or more further parameters; obtaining one or more probability distributions determined based on the coding mode and the obtained one or more further parameters; and decoding a multiplier using the obtained one or more probability distributions.
According to another aspect of the invention it is proposed a computer program product for a programmable apparatus, the computer program product comprising a sequence of instructions for implementing a method according to the invention, when loaded into and executed by the programmable apparatus.
According to another aspect of the invention it is proposed a computer-readable storage medium storing instructions of a computer program for implementing a method according to the invention.
According to another aspect of the invention it is proposed a computer program which upon execution causes the method of the invention to be performed.
At least parts of the methods according to the invention may be computer implemented. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a "circuit", "module" or "system".
Furthermore, the present invention may take the form of a computer program product embodied in any tangible medium of expression having computer usable program code embodied in the medium.
Since the present invention can be implemented in software, the present invention can be embodied as computer readable code for provision to a programmable apparatus on any suitable carrier medium. A tangible, non-transitory carrier medium may comprise a storage medium such as a floppy disk, a CD-ROM, a hard disk drive, a magnetic tape device or a solid state memory device and the like. A transient carrier medium may include a signal such as an electrical signal, an electronic signal, an optical signal, an acoustic signal, a magnetic signal or an electromagnetic signal, e.g. a microwave or RF signal.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the invention will now be described, by way of example only, and with reference to the following drawings in which: Figure 1 illustrates an example of LIDAR system; Figure 2 illustrates different compression tools proposed in this standard or different new adopted tools for the future version of the standard; Figure 3 illustrates an example of the main steps of a general encoding method for a point cloud and the construction of the associated bitstream; Figure 4 illustrates the main steps of an example of decoding method for decoding a bitstream generated by the encoder; Figure 5 illustrates examples of points obtained by a rotating lidar; Figure 6 illustrates an example of tree construction for a rotating lidar; Figure 7 illustrates the main step of an example of the compression algorithm used for encoding the 3D geometry of the input points; Figure 8 illustrates a schematic version of an example of encoding process for choosing between INTER and INTRA compression according to embodiments of the invention; Figure 9 illustrates the INTER prediction as proposed in the G-PCC reference software and adopted tools document; Figure 10 is an illustration of both the INTRA prediction mode and the calculation of the qPhi; Figure 11 is an illustration of both the INTER prediction mode and the calculation of the qPhi; Figure 12 illustrates the encoding of the qPhi value from the prior art for a given input point; Figure 13 illustrates the function for estimating the bit cost in an embodiment of the invention; Figure 14 illustrates the decoding of the qPhi value from the prior art; Figure 15 illustrates a property of the calculation of the parameter qPhi when the current point to encode has a small radius; and Figure 16 illustrates a block diagram of a device adapted to incorporate embodiments of the invention.
DETAILED DESCRIPTION OF THE INVENTION
Figure 1 illustrates an example of LIDAR system where the 4 arrows 1000 are representative of 4 laser heads. The 4 heads rotate around the z axis at a given rotation speed. The rotation speed, sometimes called geomAngularAzimuthSpeed in the prior art reference software, is a parameter of the encoder and decoder. During the compression, this value will be attributed to a second variable of the encoder or decoder called azimuthSpeed in the prior art reference software. Each head emits a laser beam regularly and calculates a 3D position (x, y, z coordinates in a Cartesian system) according to measured feedback relative to the laser beam emission. In addition to these 3D positions, attributes may be present for each point. At each 3D position, a set of attributes can be associated. For example, the attributes can be a color, a normal, a reflectance, etc. This kind of LIDAR is called a rotating LIDAR, but the invention is not limited to this particular example of LIDAR and may be adapted to any kind of 3D dynamic point cloud.
In the context of MPEG standardization, a compression standard called G-PCC V1 has emerged and is under finalization. A second version of this standard is under development. Figure 2 illustrates different compression tools proposed in the future version of this standard: In step 2000, the initial point cloud is considered. It corresponds to a frame captured at time 't'. This initial point cloud may optionally be pre-processed in step 2001 to quantize the 3D positions of the 3D points. This first step is often called 'voxelization'. The voxelization slightly moves the 3D positions of the initial point cloud so that they are moved to the center of the nearest voxel in a bounding box.
A step 2002, called geometry encoding, consists in encoding the 3D positions of the point cloud. G-PCC proposes three alternatives, 2003, 2004 and 2009, for the encoding of the geometry: The first alternative 2003 is based on the construction of an octree and the encoding of the generated octree structure.
o The second alternative 2004 is based on the construction of a tree where each 3D point position is encoded in reference to previously encoded point positions in the tree. This second alternative is called geometry prediction.
o The third alternative 2009 is based on the construction of a surface of triangles from an initial partial octree coding. Some parameters are added in the initial octree coding enabling the construction of the surface. Intersections between the surface and voxels enable the generation of a point cloud approximating the initial point cloud to encode. Once the geometry is encoded, either with octree-based techniques, geometry prediction techniques or trisoup, the geometry is decoded to obtain a decoded 3D position of the points. The decoded positions, which can be different from the initial positions if the geometry is encoded in lossy mode, can be re-colored in step 2005. It means that attributes of the encoded-decoded points can be changed according to the differences between the initial 3D position of a point and its decoded position. For example, the color or normal of each point can be recalculated taking into account the modification between the initial position and the decoded position of a point. If the geometry is encoded in lossless mode, there is no need to apply this step 2005. The result of this recoloring is a new set of attributes related to the encoded-decoded 3D points.
After recoloring, the attributes are encoded in a step 2006. G-PCC proposes two modes for encoding the attributes, called RAHT, step 2007, and LoD/Lifting, step 2008.
Usually, dense point clouds use the octree-based techniques (2003) for the geometry encoding step 2002, thus for compressing the positions of the point cloud called the geometry. Sparse point clouds typically use the geometry prediction techniques of step 2004. Indeed, geometry prediction is more efficient when dealing with sparse content. LIDAR generated point clouds are often sparse. Trisoup is dedicated to low bit-rate and is used for dense point clouds. It consists in encoding the positions of the point cloud by transmission of a triangular-based surface (the parameters of this surface being the compression parameters) and in intersecting the surface with voxels. The intersections will be considered as the reconstructed points.
Embodiments of the invention are related to step 2004 of geometry prediction.
The contemplated standard G-PCC V1 suffers from drawbacks.
One major drawback is the independent temporal compression of the frames of the temporal point cloud. Temporal independency means that the current frame, whose capture starts at time 't', is compressed and encoded independently of the previous frame, whose capture starts at time 't-1'. The redundant information between the two frames cannot be exploited. That is the reason why the new technologies under development in the G-PCC context, described in the TUC document, have proposed INTER tools for taking advantage of the temporal redundancy and improving the compression ratio. The INTER tools proposed by the TUC document exist both for geometry encoding (2002) and attributes encoding (2006). Recently, these INTER tools have been added to a list of adopted tools for the future version of G-PCC.
The predictive encoding of a 3D point consists in determining a predictor of the 3D point to encode. The predictor is the decoded version of a previously encoded point. The predictor is in the current frame, meaning the current 3D point cloud, for INTRA encoding. The predictor is in a previously encoded frame, meaning a previously encoded 3D point cloud, named the reference frame, in case of INTER encoding.
The predictor is determined by evaluating a set of predictors and selecting one of them, based on this evaluation, as the predictor to be used at encoding and decoding. The determination is made by evaluating, for each predictor, a cost function corresponding to the rate-distortion of the encoding based on that predictor.
In the adopted tools reference software, for a given point to encode, several INTRA prediction values are possible. When one of these prediction values is chosen as the best value for encoding, it is encoded through a prediction index (called predIdx). Usually an arithmetic encoder is used in order to encode this variable. In addition to this prediction index, a flag (called interFlag) is coded indicating whether INTER or INTRA prediction is used.
In the adopted tools reference software, for a given point to encode, two INTER prediction functions are possible. The index (called 'refNodeFlag') of the chosen INTER prediction function (if INTER mode is better than INTRA mode) is encoded by an arithmetic encoder based on a simple arithmetic context. If the first INTER prediction function is chosen, 'refNodeFlag' is encoded with the Boolean 'false' with the arithmetic encoder. If the second INTER prediction function is chosen, 'refNodeFlag' is encoded with the Boolean 'true' with the arithmetic encoder.
Figure 3 illustrates an example of the main steps of a general encoding method for a point cloud and the construction of the associated bitstream. The three main elements are the point cloud/frame 3000 to be encoded, the compression algorithm 3003 which will generate the bitstream 3006.
In a step 3001, the point cloud is divided in slices, which are subsets of the 3D points of the point cloud. Each slice is compressed in step 3005 for the geometry and in step 3007 for the 3D points attributes.
The compression step 3003, comprises a first step 3004 of initialization, a step of generation of the geometry bitstream 3005, followed by the generation of the attributes bitstream 3007.
The generated bitstream 3006 comprises a metadata part 3008, 3009 and 3010 followed by data units 3011-3014. The metadata part comprises a sequence parameter set, SPS, 3008 comprising parameters defined for the whole sequence, a geometry parameter set, GPS, 3009, comprising parameters of the geometry and an attribute parameter set, APS, 3010 comprising parameters of attributes. The data units are of two sorts, geometry data unit, GDU and attributes data unit, ADU. Each data unit is constituted of a header, GDU header 3011 and ADU header 3013, and a payload, GDU data 3012 and ADU data 3014, comprising the raw encoded geometry data and attributes data respectively.
The metadata are typically generated during the initialization step 3004, the GDUs during the geometry bitstream generation step 3005, the ADUs during the attributes bitstream generation step 3007.
Figure 4 illustrates an example of the main steps of a decoding method for decoding a bitstream generated by the encoder as specified in relation with Figure 3. In a first step 4009, the bitstream 4000 is obtained (e.g. read). The metadata comprising the SPS 4001, the GPS 4002 and the APS 4003, are obtained in a step 4010. These metadata are, for example, read and decompressed and then used in a step 4011 for initializing the Raw data decompression step 4012.
The raw data decompression step 4012 is then able to decode the GDUs, 4004, 4005 and the ADUs, 4006, 4007 based on the metadata.
Figure 5 illustrates examples of 3D points obtained by a rotating lidar. On the left, the lidar 5000 is rotating clockwise around the 'z' axis; 3D points are obtained based on the reflection of the beam on obstacles, here two cars 5001 and 5002 in front of a wall 5003. The projection on the plane (x, y) is illustrated. For each point, its coordinates (x, y, z) are obtained.
The right part illustrates the same points expressed in spherical coordinates, where each point is defined by a laser index corresponding to an elevation angle θ defined by each laser head geometry as illustrated by Figure 1, the azimuthal angle (rotation angle) φ, and the radius r giving the distance between the lidar and the point. Accordingly, each 3D point may be represented by its spherical coordinates (r, φ, θ). The geometry prediction method uses mainly the spherical coordinates.
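As a minimal, non-normative sketch of this representation, the conversion from Cartesian to spherical coordinates may look as follows; the floating-point computation, the choice of the radius as the full 3D distance and the mapping of the elevation angle to a laser index are simplifying assumptions, the reference software works with its own fixed-point conventions.

#include <cmath>

struct SphericalPoint {
  double r;      // distance between the LIDAR and the point
  double phi;    // azimuthal (rotation) angle
  double theta;  // elevation angle (mapped to a laser index in practice)
};

SphericalPoint toSpherical(double x, double y, double z) {
  SphericalPoint s;
  s.r = std::sqrt(x * x + y * y + z * z);
  s.phi = std::atan2(y, x);
  s.theta = std::atan2(z, std::sqrt(x * x + y * y));
  return s;
}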
According to the future version of G-PCC (V2), the encoding of the geometry is now described. As explained in relation to Figure 2, the new version of G-PCC proposes three methods for encoding the geometry of the 3D points. In G-PCC v1, the predictive approach starts by defining a prediction structure on the point cloud. Such a structure can be described by a prediction tree where each point in the point cloud is associated with a vertex of the tree. Each vertex can be predicted only from its ancestors in the tree.
Various prediction strategies are possible. The standard proposes four different strategies for determining a predictor: no prediction, meaning the current point is encoded with a predefined predictor; delta prediction, where the predictor is the direct ancestor of the current point in the tree (i.e. p0); linear2 prediction, where the predictor is a linear combination of the two ancestors (i.e. 2p0-p1); and linear3 prediction, where the predictor is a linear combination of the three ancestors (i.e. p0+p1-p2), where p0, p1, p2 are the positions of the parent, grandparent, and grand-grandparent of the current vertex.
The tree structure is encoded by traversing the tree in a depth order and encoding for each vertex the number of its children. The positions of the vertices are encoded by storing the chosen prediction mode and the obtained prediction residuals. Arithmetic coding is used to further compress the generated values. Building the optimal prediction tree is an NP-hard problem. Prior to generating each predictive tree, the input points corresponding to the tree are sorted according to a sorting method. This helps to guide the tree construction process to build a more efficient tree. The sorting methods available are none, Morton order, azimuth angle order, and radial distance order. Especially, the azimuthal sorting by rounded azimuth, radius, and elevation generates stable ordering shape and improves the coding efficiency.
In G-PCC's predictive tree, for each node in the prediction tree, one predictor index 'n' is encoded in the bitstream. This index points to a selected predictor PRn among a list of possible predictors. The predictors are PR0, PR1, PR2 and PR3, defined as follows (an illustrative sketch is given after this list): 1) "None": PR0 = (rmin, φ0, θ0), where rmin is the minimum radius value (provided in the geometry parameter set), and φ0 and θ0 are equal to 0 if the node has no parent, or are equal to the φ and θ values of the point coded in the parent node.
2) "Delta": PRi = p0 = (ro, (Po, 00), where r0, To and 80 are respectively the radius r, the azimuthal angle T and the laser index e values of the parent point p0 coded in the parent node.
3) "Linear2": PR2 = 2"p0-p1, where p0 and p1 are the parent and grandparent points/nodes.
4) "Linear3": PR3= p0+p1-p2, where p0, p1 and p2 are the parent, the grandparent and great grandparent points/nodes.
This way of generating predictors is not the most efficient one. In the TUC document and in the document containing adopted tools for a future version of G-PCC, new prediction functions have been proposed, which are now described. Instead of using the list of G-PCC prediction functions as previously explained, a list of N predictors is built from a prediction buffer of N pairs of one radius and one azimuthal angle (rn, φn). The coding of the predictor index is simply performed using a unary coding with one context per bin index.
The derivation of a predictor is performed as follows: If the point being predicted is the first point of the tree (i.e., there is no parent node), the predictor PR0 is set equal to (rmin, 0, 0) and the other predictors PRn>0 are set equal to (0, 0, 0). If the point has a parent point, the predictor PR0 is set equal to (r0, φ0, θ0), where θ0 is the laser index θ value of the parent point p0 coded in the parent node, and where (r0, φ0) is the first pair in the buffer (as will be understood from the buffer management, it is also equal to respectively the radius r and the azimuthal angle φ of the parent point p0 coded in the parent node); the predictors PRn>0 are set equal to (rn, φn + qPhin * φstep, θ0), where θ0 is the laser index θ value of the parent point p0 coded in the parent node, where (rn, φn) is the n-th pair in the buffer, and where qPhin equals 0 if |φ0 - φn| < φstep, else qPhin equals the integer division (φ0 - φn) / φstep.
Since it is better to avoid integer division in the decoder, (φ0 - φn) / φstep is approximated using the divApprox function of G-PCC: qPhin = divApprox(φ0 - φn, φstep, 0).
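As a rough sketch of this derivation (assuming integer azimuth values), the azimuthal shift of the n-th predictor may be computed as below; a plain integer division stands in for the divApprox approximation of the reference software, so the exact rounding behaviour may differ.

#include <cstdint>
#include <cstdlib>

// Returns the number of azimuthal angle steps qPhi_n for the n-th predictor, given the
// azimuth phi0 of the first buffer pair, the azimuth phiN of the n-th pair and the
// azimuthal step phiStep.
int32_t qPhiForPredictor(int32_t phi0, int32_t phiN, int32_t phiStep) {
  int32_t delta = phi0 - phiN;
  if (std::abs(delta) < phiStep)
    return 0;
  return delta / phiStep;  // stand-in for divApprox(phi0 - phiN, phiStep, 0)
}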
This parameter will be called in the following text 'qPhi' or 'multiplier'. Some additional explanations about this parameter qPhi are proposed in the description related to figure 10 and 11. In particular, we propose to improve the coding of this parameter.
The buffer used for the predictors' derivation is managed as follows. Each pair of the buffer is first initialized to (0, 0). After the (de)coding of a point, the buffer is updated as follows: If the absolute value of the (de)coded radius residual rres is higher than a threshold Th, it is considered that the laser has probed a new object. Then a new element (r0, φ0) is inserted in front of the buffer, with r0 and φ0 the reconstructed radius and the reconstructed azimuthal angle of the (de)coded point. The last element of the buffer is discarded. This is performed by letting the buffer element (rn, φn) be equal to (rn-1, φn-1) for n=3 to 1, then setting the first buffer element values from the decoded point.
If the absolute value of the (de)coded rres is not higher than the threshold Th, it is considered that the laser has probed an object present in the buffer. Then, the element of the buffer with index predIdx, corresponding to the index of the predictor that has been used for the prediction, is moved to the front of the list and is updated with (r0, φ0), the reconstructed radius and the reconstructed azimuthal angle of the (de)coded point. This is performed by letting the buffer elements (rn, φn) be equal to (rn-1, φn-1) for n=predIdx to 1, then setting the first buffer element values from the decoded point.
Th is equal to ps.predgeom_radius_threshold_for_pred_list and has been fixed in the encoder to 2048 >> ps.geom_angular_radius_inv_scale_log2, with '>>' being a bit shift.
The parameter 'geom_angular_radius_inv_scale_log2' is an inverse scaling factor applied to the radius in predictive geometry coding. It is used to modify the precision in bits of the radius values.
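The buffer update described above may be sketched as follows for a buffer of four (radius, azimuth) pairs; the container type, the function name and the way the decoded values are passed in are illustrative assumptions.

#include <array>
#include <cstdint>
#include <cstdlib>

struct PredEntry { int32_t r; int32_t phi; };

// rRes: (de)coded radius residual; Th: threshold; predIdx: index of the used predictor;
// rDec, phiDec: reconstructed radius and azimuth of the (de)coded point.
void updatePredictorBuffer(std::array<PredEntry, 4>& buf, int32_t rRes, int32_t Th,
                           int predIdx, int32_t rDec, int32_t phiDec) {
  // New object probed: shift the whole buffer (last element discarded).
  // Known object probed: shift only the entries in front of the used predictor
  // (move-to-front behaviour).
  int start = (std::abs(rRes) > Th) ? int(buf.size()) - 1 : predIdx;
  for (int n = start; n >= 1; --n)
    buf[n] = buf[n - 1];
  buf[0] = {rDec, phiDec};
}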
In particular, it is proposed to improve the coding of this parameter qPhi. The prediction index 'n' and the number 'qPhi' of azimuthal angle steps are encoded in the bitstream for each node, while the value of φstep is encoded in the geometry parameter set by geom_angular_azimuth_speed_minus1. The residual (rres, φres, θres) is also encoded in the bitstream.
φres can be additionally quantized according to the radius r before being encoded in the bitstream.
On both the encoder and decoder side, the coordinates (rdec, φdec, θdec) of the points are retrieved by doing the reverse process: a reconstructed point (rdec, φdec, θdec) is obtained by: (rdec, φdec, θdec) = PRn + (0, qPhin * φstep, 0) + (rres, φres,rec, θres), where φres,rec is obtained from the quantized φres after being inversely quantized.
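A minimal sketch of this reconstruction is given below; the types and names are illustrative, and the inverse quantization of the azimuth residual is assumed to have been applied beforehand.

#include <cstdint>

struct SphPoint { int32_t r; int32_t phi; int32_t theta; };

// prN: selected predictor PRn; qPhi: decoded number of azimuthal steps; phiStep: azimuth
// step; res: decoded residual (rres, phires_rec, thetares).
SphPoint reconstruct(SphPoint prN, int32_t qPhi, int32_t phiStep, SphPoint res) {
  SphPoint dec;
  dec.r = prN.r + res.r;
  dec.phi = prN.phi + qPhi * phiStep + res.phi;
  dec.theta = prN.theta + res.theta;
  return dec;
}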
In addition, some INTER prediction tools have been added in the predictive geometry scheme. These tools will be explained later.
Figure 6 illustrates an example of tree construction for a rotating lidar.
Input 3D points 6007 are given in spherical coordinates (ri, φi, θi), 'i' being the index of the point. For a rotating Lidar, there is a direct correspondence between the elevation θi and the laserId. This is the reason why these two terms are used interchangeably.
Before starting the encoding, the tree is constructed in step 6006. The tree construction consists in sorting the points in an increasing azimuth order for a same given elevation, except for a short subset of 3D points. This short subset of points is illustrated with the points 6000 and 6003. As these are the first encoded points for a given (different) elevation, their ancestors have a different elevation and there is no guarantee the azimuth of the ancestor is lower. For example, in the tree 6008, the point 6001 has the point 6000 as parent. The point 6002 has the point 6001 as parent. It means that the azimuth φ1 is higher than φ0, and the azimuth φ2 is higher than φ1. The point 6003 has the point 6000 as parent because it is the 3D point at elevation θ1 with the lowest azimuth; the first point with an elevation higher than θ0 (point 6003) thus has the point 6000 as parent. The point 6004 has the point 6003 as parent. The point 6005 has the point 6004 as parent.
In the TUC document, the generated tree is used for: successively selecting the 3D points to encode (a point being a node of the tree); and using the parent as predictor of the elevation.
The radius and azimuth are predicted from the new INTRA and INTER prediction functions. In a preferred embodiment, it is supposed in this disclosure that this global prediction mode is kept, even if the INTRA or INTER prediction functions could also be used for elevation prediction, or just for radius or just for azimuth prediction.
Figure 7 illustrates the main step of an example of the compression algorithm used for encoding the 3D geometry of the input points corresponding to step 2004 in Figure 2. This version of the algorithm is used, for example, for a 3D point cloud coming from LIDAR or any 3D point cloud with a spherical representation.
As explained previously, a tree is constructed. According to this tree, a point is selected in step 7000. This point is in its spherical representation (radius, azimuth, elevation or laserId). In step 7001, a prediction of this point is done. The prediction can be done from INTRA prediction functions (figure 10), or INTER prediction functions (figure 9). The prediction value contains a prediction value for the radius of the current point to encode and a prediction value for the azimuth of the current point to encode. The azimuth may be modified by adding a multiple of the rotation speed (called qPhi or multiplier).
After this modification, a new azimuthal prediction is obtained. The modification of the azimuthal prediction value requires the encoding of the qPhi parameter. This parameter will be decoded at the decoder to generate the same prediction value as the one used at the encoder.
The predictors (calculated from used prediction functions) are previous reconstructed 3D points, reconstructed meaning that these points have been encoded and decoded, coming from a reference frame for INTER prediction or from the current frame for INTRA prediction. The best predictor is chosen among all the predictors. A flag is set to notify if it is an INTRA or INTER predictor. The index of this predictor (INTER or INTRA) is encoded and added in the bitstream in step 7006. The predictors are used for the prediction of radius and azimuth. The elevation being predicted from the parent in the tree. Once the prediction is done (7001), the residual is calculated in the spherical domain, quantized and encoded in a binary representation in step 7002 before being added in the bitstream in step 7006. The decoded residual in the spherical domain is calculated in step 7003 and added to the prediction in the step 7007. The reconstructed point is transformed in the Cartesian domain/representation in step 7008. In step 7013, the input point is transformed in Cartesian coordinates. The residual in the Cartesian domain is obtained in step 7004. This residual is quantized and encoded in a binary stream in step 7006.
In step 7009, the residual is decoded in the Cartesian domain before being added, in step 7010, to the point obtained in step 7008. The result is the reconstructed point in the Cartesian domain, which is transformed into the spherical domain in step 7011 before being stored in step 7012. The storage used in step 7012 contains encoded/decoded points that can be used for INTRA or INTER prediction.
For INTRA prediction, the encoded/decoded points are used as described above for constructing the list of predictors.
For INTER prediction, the stored point will be used for the encoding of the next frame: they will be used as reference frame for the next frame as illustrated by Figure 10.
Figure 8 illustrates a rate distortion algorithm. This figure is done for the coding of all the points of the point cloud. In 8000, the INTRA predictors are tested. In 8001, for each tested INTRA predictor, the estimate bits function estimates the coding cost in bits of the tested INTRA predictor. In 8002 and 8003, the same operations are done for the INTER predictors. In 8004, the best predictor (among the INTRA and INTER predictors) is selected and encoded in 8005 for generating the bitstream. The associated residual information is also encoded in 8005. In particular, the bit estimation 8003 requires estimating the coding cost of the chosen INTER prediction function, the index of this chosen INTER function being called refNodeFlag. In particular, the bit encoding 8005 requires encoding the chosen INTER prediction function, the index of this chosen INTER function being called refNodeFlag.
The function called Estimate Bits (8001 or 8003) estimates the cost of all the coding parameters. In particular, and in relation with this invention, the estimate of the cost of the encoding comprises the coding cost of the qPhi parameter.
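A minimal sketch of this rate-driven selection is shown below; estimateBits() is a placeholder for modules 8001/8003 (and is assumed to include the cost of the multiplier qPhi), and the Candidate type is an illustrative assumption.

#include <cstddef>
#include <limits>
#include <vector>

struct Candidate { bool inter; int predIdx; };

// Returns the index of the cheapest candidate predictor; the selected index, flag and
// residual are then encoded by module 8005.
size_t selectBestPredictor(const std::vector<Candidate>& candidates,
                           double (*estimateBits)(const Candidate&)) {
  size_t best = 0;
  double bestCost = std::numeric_limits<double>::max();
  for (size_t i = 0; i < candidates.size(); ++i) {
    double cost = estimateBits(candidates[i]);
    if (cost < bestCost) {
      bestCost = cost;
      best = i;
    }
  }
  return best;
}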
According to an aspect of the invention, it is proposed to change the coding method of the parameter qPhi by adding new contexts. It means that the proposed solution leads to a modification of the Estimate Bits module 8001 or 8003 for the part related to qPhi, and to a modification of the encoding module 8005 for the coding of qPhi. This will be illustrated later.
Some associated modifications will be done for the decoding part for being compatible with the modifications made by the encoder.
Figure 9 illustrates the INTER prediction as proposed in a first and second version of the TUC document and in the adopted tools reference software.
When inter-coding is applied, the radius, azimuth and laserID of the current point are predicted based on a point that is near the azimuth position of a previously encoded/decoded point in the reference frame. The reference frame is an already encoded/decoded frame. This frame may have been motion compensated.
This improved method is illustrated in Figure 9. The method consists of the following steps: * For a given 3D point in the frame at time 't', called Curr Point 9000 in the figure, choose the previous encoded/decoded point, Prev Dec Point 9001, in the same frame.
* Potentially quantize the azimuth of the previous encoded/decoded point (quantized azimuth can be called 'scaled azimuth' or phi quantized). The potentially quantized value is called phiQ.
* Determine a reference point 9004 corresponding to the previous decoded point 9001 in the reference frame. This reference point 9004 typically has the same azimuth phiQ as the previous decoded point 9001. Depending on the resolution of each frame, this reference point 9004 with the same azimuth phiQ may or may not exist in the reference frame.
* Choose as a first prediction point (called Inter Pred Point 9002 in the figure) the point of the reference frame with the same elevation and the first quantized azimuth higher than phiQ. This is the point following the reference point 9004 in the reference frame when this reference point 9004 exists.
* Choose as a second prediction point (called Additional Inter Pred Point 9003 in the figure) the point of the reference frame with the same elevation and the second quantized azimuth higher than phiQ.
When the INTER predictor is chosen by the encoder, a flag called "interFlag" is encoded, signalling to the decoder to use the INTER predictor. If the first INTER prediction point is chosen, a flag called "refNodeFlag" is set to false. If the second INTER prediction point is chosen, the flag called "refNodeFlag" is set to true. This flag value is next encoded with an arithmetic encoder. In the state of the art, the arithmetic encoder described later in relation to Figure 11 is based on one single arithmetic context. The flag "refNodeFlag" indicates which predictor is selected between the first and the second INTER predictor.
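A minimal sketch of this signalling is given below; encodeBit() is a placeholder for the arithmetic-encoder call on a single adaptive context and is an assumption, not a reference-software function.

// useInter: true when the INTER mode has been selected by the rate-distortion search;
// useSecondInterPoint: true when the Additional Inter Pred Point is selected.
void signalInterPredictor(bool useInter, bool useSecondInterPoint,
                          void (*encodeBit)(bool)) {
  encodeBit(useInter);               // "interFlag"
  if (useInter)
    encodeBit(useSecondInterPoint);  // "refNodeFlag": false = first, true = second point
}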
Figure 10 is an illustration of both the INTRA prediction mode and the calculation of the qPhi. The minimal azimuthal speed called azimuthSpeed is assumed to be known. This minimal azimuthal speed defines the minimal theoretical azimuth increment between 2 consecutive points acquired by a rotating LiDAR for a same elevation. This value is theoretical because in practical cases it may be subject to noise. Let us suppose that the current point to encode is in 10004. 10001 is the INTRA buffer containing already encoded/decoded points. It is composed of several predictors, four in the illustrated example. A predictor, e.g. Pi in 10002, is composed of two values: a radius value Pi[0] and an azimuth value Pi[1]. The radius value will be directly used as predictor of the radius of the current point to encode 10004. The azimuthal value of the predictor is not directly used. Assuming that the predictor '1' has been chosen by the encoder, meaning that predIndex=1 in 10005, this predictor corresponds to the encoded point 10006. In 10007, the azimuth value of this point is first considered: pred[1] = Pi[1]. Then, the azimuthal value is aligned on the last input predictor of the INTRA buffer. This alignment consists in the following steps: - The azimuth difference, called deltaPhi 10003, with the last input predictor of the INTRA buffer is calculated.
-An integer value called qphi0 is calculated as being a multiple of the azimuth speed so that pred[1] = Pi[1] + qphi0 * azimuthSpeed is as close as possible to Po[1] (i.e. Pi[1] for i=0).
As the INTRA buffer is calculated in the same way in the encoder and decoder from already encoded/decoded points, the value qPhi0 does not need to be encoded. After these first two steps, the new prediction value for the azimuth, pred[1], is used and a second value called qPhi is calculated. After the calculation of this value, the new predictor pred[1] becomes 10010: pred[1] = pred[1] + qPhi * azimuthSpeed. In summary, pred[1] = Pi[1] + (qPhi0+qPhi) * azimuthSpeed, where qPhi0 is determined as bringing the prediction value as close as possible to the azimuth P0[1] of the last input predictor of the INTRA buffer, and qPhi as bringing it as close as possible to the azimuth of the point to encode. qPhi0 does not need to be encoded as the decoder can calculate this value from the INTRA buffer. Only qPhi needs to be encoded. Decomposing the multiplier into the sum of qPhi0 and qPhi is advantageous, as qPhi is a smaller value to encode than the complete multiplier value qPhi0+qPhi. With this value, the azimuthal residual for the current point to encode can be calculated in 10011: residual[1] = point[1] - pred[1], wherein point[1] is the azimuth of the current point to encode. This residual will be quantized and encoded with an arithmetic encoder. The multiplier value qPhi needs to be encoded so that the decoder, after decoding qPhi, can use it for obtaining the same predictor as the one used at the encoder. The azimuthal value of the current point to decode can then be calculated after decoding of the azimuthal residual.
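The INTRA azimuth prediction described above may be sketched as follows; the nearest-integer rounding used here is a simplifying assumption and may differ from the divApprox-based computation of the reference software, and the function and type names are illustrative.

#include <cmath>
#include <cstdint>

struct IntraAzimuthPrediction { int32_t qPhi; int32_t pred; };

static int32_t nearestMultiple(int32_t delta, int32_t step) {
  return (int32_t)std::lround((double)delta / (double)step);
}

// predictorPhi: Pi[1]; lastBufferPhi: P0[1]; pointPhi: azimuth of the point to encode.
IntraAzimuthPrediction predictAzimuthIntra(int32_t predictorPhi, int32_t lastBufferPhi,
                                           int32_t pointPhi, int32_t azimuthSpeed) {
  int32_t qPhi0 = nearestMultiple(lastBufferPhi - predictorPhi, azimuthSpeed);
  int32_t aligned = predictorPhi + qPhi0 * azimuthSpeed;  // qPhi0 is not encoded
  int32_t qPhi = nearestMultiple(pointPhi - aligned, azimuthSpeed);
  int32_t pred = aligned + qPhi * azimuthSpeed;           // qPhi is encoded
  return {qPhi, pred};  // residual[1] = pointPhi - pred is then quantized and coded
}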
Figure 11 is similar to Figure 10 but for INTER prediction. This figure describes some similar steps with just a few changes: The minimal azimuthal speed called azimuthSpeed is still known. Let us suppose that the current point to encode is in 11005. 11001 illustrates the calculation of predictors from the INTER process as illustrated by figure 9. INTER prediction is based on a previously encoded point cloud (usually captured at a time earlier than the time when the current point cloud has been captured). A predictor is calculated. For example, this predictor can be the two points P0 11002 and P1 11003. Both predictors are tested and the best one according to a rate-distortion criterion is chosen by the encoder. Assuming the best predictor chosen at encoding is the point 11004, the predictor Pi 11004 is composed of 2 values: a radius value Pi[0] and an azimuth value Pi[1]. The radius value will be directly used as predictor of the radius of the current point to encode 11005 (pred[0] = Pi[0]). The azimuthal value Pi[1] of the predictor is not directly used. Let us explain how pred[1] is calculated. An integer value called qPhi is calculated. After the calculation of this value, the predictor pred[1] is determined as: pred[1] = Pi[1] + qPhi * azimuthSpeed. With this value, the azimuthal residual for the current point to encode can be calculated: residual[1] = point[1] - pred[1], wherein point[1] is the azimuth of the current point to encode. This residual will be quantized and encoded with an arithmetic encoder. The encoded value of the current point 11005 is now 11006: Pi[1] + qPhi * azimuthSpeed + residual[1]. The residual and the multiplier qPhi need to be encoded so that the decoder can obtain the same predictor as the one used at the encoder. The decoded azimuthal value of the current point to decode, corresponding to the value Pi[1] + qPhi * azimuthSpeed + residual[1], can be calculated from this same prediction.
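The INTER variant may be sketched as below; unlike the INTRA case there is no buffer-alignment step, the transmitted qPhi directly shifts the azimuth of the selected INTER predictor. The same rounding caveat as in the previous sketch applies.

#include <cmath>
#include <cstdint>

// predictorPhi: Pi[1]; pointPhi: azimuth of the point to encode. Writes the multiplier
// to *qPhiOut (to be encoded in the bitstream) and returns the azimuth prediction pred[1].
int32_t predictAzimuthInter(int32_t predictorPhi, int32_t pointPhi, int32_t azimuthSpeed,
                            int32_t* qPhiOut) {
  int32_t qPhi = (int32_t)std::lround((double)(pointPhi - predictorPhi) /
                                      (double)azimuthSpeed);
  *qPhiOut = qPhi;
  return predictorPhi + qPhi * azimuthSpeed;  // residual[1] = pointPhi - returned value
}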
Figure 12 illustrates the encoding of the qPhi value from the prior art for a given input point. This encoding takes as parameters the multiplier qPhi to encode and a Boolean value called interFlag specifying if INTRA or INTER prediction is used. This function operates in several steps: In a step 12001, a context is calculated according to the value of interFlag. If interFlag is true, the context will be set to 1, otherwise it will be set to 0. Contexts are used by arithmetic encoders. They are associated with probability models that contain the statistics of the encoded values. These statistics can be known in advance or calculated on the fly. These probability models are used by an arithmetic encoder. As two contexts are possible, it means that two probability tables (for a given parameter to encode) are constructed and used by the arithmetic encoder.
In a step 12002, the arithmetic encoder encodes a flag indicating whether the multiplier value is zero. This coding is based on a probability table, called '_ctxPhiGtN0' in an exemplary embodiment, that contains two probability models: one probability model corresponding to a value of 'interFlag' equal to 0 and a second probability model corresponding to a value of 'interFlag' equal to 1.
Each of these two models is updated on the fly, according to the value of the flag and according to the selected context. If the multiplier is zero, the encoding is terminated.
In a step 12003, the arithmetic encoder encodes a flag indicating whether the absolute value of the multiplier is one. This arithmetic encoding is based on a probability table, called '_ctxPhiGtN1' in an exemplary embodiment, that contains two probability models: one probability model corresponding to a value of 'interFlag' equal to 0 and a second probability model corresponding to a value of 'interFlag' equal to 1. Each of these two models is updated on the fly, according to the value of the flag and the value of the context.
If the absolute value of the multiplier is one, the sign of the multiplier is coded by an arithmetic encoder with a given adaptive probability model called '_ctxSignPhi[interCtxIdx]', where interCtxIdx takes the value 0 or 1 according to the 'interFlag' value, and the encoding is terminated.
In a step 12004, arithmetic encoders are used in order to encode the next 3 bits of the absolute value of the multiplier. Once again the arithmetic encoders make use of adaptive probability models and contexts dedicated to these values. The probability tables are called '_ctxResidualPhi' in an exemplary embodiment. At the end of this module, all the absolute values of the multiplier between 0 and 9 should have been coded. If the absolute value of the multiplier is higher than 9, then the module 12005 is used.
In 12005, the absolute value of the multiplier (for values higher than 9) is encoded by using an exponential Golomb encoder.
In 12006, the sign of the multiplier is coded by an arithmetic encoder and a dedicated adaptive probability table, called '_ctxSignPhi[interCtxIdx]' in an exemplary embodiment, and the encoding is terminated.
It is to be noted that for all these arithmetic encoding steps, only the value of the flag indicating INTRA or INTER coding mode is used as the context for selecting and updating the associated probability models.
In an exemplary embodiment, the encoding may be coded as follows:

void encodePhiMultiplier(const int32_t multiplier, const bool interFlag)
{
  int interCtxIdx = interFlag ? 1 : 0;
  _aec->encode(multiplier != 0, _ctxPhiGtN[interCtxIdx][0]);
  if (!multiplier)
    return;

  int32_t value = abs(multiplier) - 1;
  _aec->encode(value > 0, _ctxPhiGtN[interCtxIdx][1]);
  if (!value) {
    _aec->encode(multiplier < 0, _ctxSignPhi[interCtxIdx]);
    return;
  }

  value--;
  int valueMinus7 = value - 7;
  value = std::min(value, 7);
  _aec->encode((value >> 2) & 1, _ctxResidualPhi[interCtxIdx][0]);
  _aec->encode((value >> 1) & 1, _ctxResidualPhi[interCtxIdx][1 + (value >> 2)]);
  _aec->encode((value >> 0) & 1, _ctxResidualPhi[interCtxIdx][3 + (value >> 1)]);
  if (valueMinus7 >= 0)
    _aec->encodeExpGolomb(valueMinus7, 0, _ctxEGPhi[interCtxIdx]);

  _aec->encode(multiplier < 0, _ctxSignPhi[interCtxIdx]);
}

Figure 13 illustrates the way the functions 8001 and 8003 of figure 8 estimate the bit cost of the arithmetic encoders used in figure 12. The process is almost similar to the process described in figure 12, except that no real encoding is done, only estimation.
In a step 13001, a context is calculated according to the value of interFlag. If interFlag is true, the context will be set to 1 otherwise it will be set to 0. This step is similar to step 12001 of figure 12.
In a step 13002, the estimation of the coding cost of signalling whether the multiplier is zero is conducted. This estimation is based on a context table, called '_ctxPhiGtN0' in an exemplary embodiment, that contains two probability models. If the multiplier is zero, the function is terminated.
In a step 13003, the estimation of the coding cost of signalling whether the absolute value of the multiplier is one is conducted. If the absolute value of the multiplier is one, the coding cost of the sign of the multiplier is estimated with the given adaptive probability model called '_ctxSignPhi[interCtxIdx]' and the function is terminated.
In a step 13004, the estimation of the coding cost of the next 3 bits of the absolute value of the multiplier is conducted. If the absolute value of the multiplier is higher than 9, then the module 13005 is run.
In a step 13005, the bit cost of coding the absolute value of the multiplier (for values higher than 9) is estimated by using an exponential Golomb encoder.
In a step 13006, the bit cost for the sign of the multiplier is estimated and the function is terminated.
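A rough sketch of such a bit-cost estimation is given below, mirroring the steps of Figure 12; estimateBit(), the simplified BitModel type and the truncation of the larger-magnitude bins are assumptions made for illustration, the reference software uses its own fixed-point probability representation.

#include <cmath>
#include <cstdint>
#include <cstdlib>

struct BitModel { double pOne = 0.5; };  // simplified adaptive probability model

// Cost in bits of coding 'bit' with model 'm' (ideal arithmetic-coding cost).
static double estimateBit(bool bit, const BitModel& m) {
  return -std::log2(bit ? m.pOne : 1.0 - m.pOne);
}

double estimatePhiMultiplierBits(int32_t multiplier, bool interFlag,
                                 BitModel gtN[2][2], BitModel sign[2]) {
  int ctx = interFlag ? 1 : 0;
  double bits = estimateBit(multiplier != 0, gtN[ctx][0]);
  if (!multiplier)
    return bits;
  int32_t value = std::abs(multiplier) - 1;
  bits += estimateBit(value > 0, gtN[ctx][1]);
  if (!value)
    return bits + estimateBit(multiplier < 0, sign[ctx]);
  // The costs of the next 3 bits and of the exp-Golomb suffix would be accumulated in
  // the same way; they are omitted from this sketch.
  return bits + estimateBit(multiplier < 0, sign[ctx]);
}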
According to a first embodiment of the invention, the encoding of the multiplier qPhi, already described in relation to figure 12, is amended. In this embodiment, the encoding takes a further argument, called predIdx in an example, which is the index of the INTRA predictor if INTRA prediction is used. If INTRA prediction is used, the variable interFlag is set to false, corresponding to a value of 0, otherwise it is set to 1.
The step 12001 of determining a context for the subsequent arithmetic encoding steps is modified to further take into account the index predIdx. Accordingly, the context is calculated according to the values of interFlag and predIdx. If interFlag is true, the context will be set to 2, otherwise it will be set to 0 or 1. If INTRA encoding is used, meaning that the interFlag value is 0, the context is set to 0 if predIdx is different from zero, and set to 1 otherwise. The contexts are used by the arithmetic encoders in the following steps, which are not modified compared to the encoding process described in relation to figure 12. It should be noted that the main changes over the prior art come from the addition of contexts. As three contexts are now defined, three adaptive models for the coding of the multiplier are used and updated.
In an embodiment, the context 'interCtxIdx' may be calculated according to the following code:

void encodePhiMultiplier(const int32_t multiplier, const bool interFlag, const int predIdx)
{
  int interCtxIdx = interFlag ? 2 : 0;
  if (interFlag == 0 && predIdx == 0)
    interCtxIdx = 1;
  ...
}

In other words, the first embodiment may be summarized as follows.
In the context of the current version of the standard, the encoding is exemplified by the source code of the encoder. This source code is called, in the following, the reference model or reference software. Some of the embodiments proposed in this disclosure are illustrated by amendments to this reference model.
As previously explained, the first embodiment is related to predictive geometry when spherical coordinates and rotating LiDAR are used. This embodiment proposes to improve the coding of the parameter called number of Azimuth Angle Steps 'k'. This parameter 'k' is the multiplier qPhi.
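For illustration, a minimal sketch of how the multiplier is applied is given below; the function and parameter names are illustrative assumptions and do not come from the reference software.

#include <cstdint>

// Minimal sketch: the azimuth prediction is obtained by adding 'k' scaled
// azimuthal angle steps (the azimuth speed, possibly scaled according to the
// radius) to the azimuth of the selected predictor.
static int32_t predictAzimuth(int32_t predictorAzimuth, int32_t k, int32_t scaledStep)
{
  return predictorAzimuth + k * scaledStep;
}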
In the current version of the reference model, the number of azimuthal angle steps is encoded by using arithmetic coding using different adaptive models and contexts.
This embodiment proposes to add new contexts for encoding the number of azimuthal angle steps. These new contexts are calculated from the Intra predictor index.
Global context of the first embodiment: In the reference software, the number of azimuth steps k is used as follows:

φ_pred = k * S(φ_step, r) + φ_n

with:
- S(φ_step, r): a scaled azimuthal angle step (can be viewed as the azimuthSpeed)
- φ_n: the azimuthal angle prediction provided by the 'n'-th predictor

Current version of the standard and source code.
In the reference software, the number of azimuthal steps is encoded from arithmetic coding and contexts according to the following steps:
- Comparison to zero:
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_prefix
  o Source code:
    _aec->encode(multiplier != 0, _ctxPhiGtN[interCtxIdx][0]);
- Comparison of the absolute value to one:
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_prefix
  o Source code:
    int32_t value = abs(multiplier) - 1;
    _aec->encode(value != 0, _ctxPhiGtN[interCtxIdx][1]);
- Encoding of the next 3 bits for the absolute value (if needed):
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_minus2
  o Source code:
    _aec->encode((value >> 2) & 1, _ctxResidualPhi[interCtxIdx][0]);
    _aec->encode((value >> 1) & 1, _ctxResidualPhi[interCtxIdx][1 + (value >> 2)]);
    _aec->encode((value >> 0) & 1, _ctxResidualPhi[interCtxIdx][3 + (value >> 1)]);
- Encoding of the remaining bits for the absolute value (if needed):
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_minus9
  o Source code:
    _aec->encodeExpGolomb(valueMinus7, 0, _ctxEGPhi[interCtxIdx]);
- Encoding of the sign (if needed):
  o Adaptive model according to the current version of the standard: ptn_phi_mul_sign
  o Source code:
    _aec->encode(multiplier < 0, _ctxSignPhi[interCtxIdx]);

The context interCtxIdx used by the different models is defined according to the inter flag by:
int interCtxIdx = interFlag ? 1 : 0;

In the reference software, the adaptive models are declared as:
AdaptiveBitModel _ctxPhiGtN[2][2];
AdaptiveBitModel _ctxSignPhi[2];
AdaptiveBitModel _ctxEGPhi[2];
AdaptiveBitModel _ctxResidualPhi[2][7];

The adaptive models contain some probability tables and are used in the reference software. In the current version of the standard for a future version of G-PCC, the names and dimensions of these adaptive models are slightly different but equivalent. '_aec' is an instance of an arithmetic encoder whose adaptive model is selected according to the calculated context.
In this first embodiment, it is proposed to add contexts to the adaptive models based on the intra predictor index. The new context is defined by:

int interCtxIdx = (inter ? 2 : 0) + (!inter && predIdx != 0);

This modification is applied inside 3 functions (illustrated here as software functions):
void PredGeomEncoder::encodePhiMultiplier(...)
float PredGeomEncoder::estimateBits(...)
int32_t PredGeomDecoder::decodePhiMultiplier(...)

The adaptive models related to these new contexts are defined by:
AdaptiveBitModel _ctxPhiGtN[3][2];
AdaptiveBitModel _ctxSignPhi[3];
AdaptiveBitModel _ctxEGPhi[3];
AdaptiveBitModel _ctxResidualPhi[3][7];

The remainder of the source code is unchanged.
- Comparison to zero:
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_prefix
  o Source code:
    _aec->encode(multiplier != 0, _ctxPhiGtN[interCtxIdx][0]);
- Comparison of the absolute value to one:
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_prefix
  o Source code:
    int32_t value = abs(multiplier) - 1;
    _aec->encode(value != 0, _ctxPhiGtN[interCtxIdx][1]);
- Encoding of the next 3 bits for the absolute value (if needed):
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_minus2
  o Source code:
    _aec->encode((value >> 2) & 1, _ctxResidualPhi[interCtxIdx][0]);
    _aec->encode((value >> 1) & 1, _ctxResidualPhi[interCtxIdx][1 + (value >> 2)]);
    _aec->encode((value >> 0) & 1, _ctxResidualPhi[interCtxIdx][3 + (value >> 1)]);
- Encoding of the remaining bits for the absolute value (if needed):
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_minus9
  o Source code:
    _aec->encodeExpGolomb(valueMinus7, 0, _ctxEGPhi[interCtxIdx]);
- Encoding of the sign (if needed):
  o Adaptive model according to the current version of the standard: ptn_phi_mul_sign
  o Source code:
    _aec->encode(multiplier < 0, _ctxSignPhi[interCtxIdx]);

According to this first embodiment, the current version of the standard related to the parameter qPhi is displayed in the following table representing a specification for the coding of the number of azimuthal steps.
| Adaptive model | Index | ctxIdx | Number of contexts |
| ptn_phi_mul_abs_minus2 | 23 | 8 x ptn_inter_flag[nodeIdx] + Exp2(BinIdx) + PartVal - 1 | 16 |
| ptn_phi_mul_abs_minus9 | 24 | Prefix: ptn_inter_flag[nodeIdx]; Suffix: bypass | 2 (prefix), 0 (suffix) |
| ptn_phi_mul_abs_prefix | 25 | 2 x ptn_inter_flag[nodeIdx] + BinIdxTu | 4 |
| ptn_phi_mul_sign | 26 | ptn_inter_flag[nodeIdx] | 2 |

The modifications of the text of the standard in relation to this embodiment are displayed in the following table representing a proposed modification of the current version of the standard for coding the number of azimuthal steps:

| Adaptive model | Index | ctxIdx | Number of contexts |
| ptn_phi_mul_abs_minus2 | 23 | 16 x ptn_inter_flag[nodeIdx] + 8 x (ptn_inter_flag[nodeIdx] ? 0 : ptn_pred_idx[nodeIdx] ≠ 0) + Exp2(BinIdx) + PartVal - 1 | 24 |
| ptn_phi_mul_abs_minus9 | 24 | Prefix: 2 x ptn_inter_flag[nodeIdx] + (ptn_inter_flag[nodeIdx] ? 0 : ptn_pred_idx[nodeIdx] ≠ 0); Suffix: bypass | 3 (prefix), 0 (suffix) |
| ptn_phi_mul_abs_prefix | 25 | 4 x ptn_inter_flag[nodeIdx] + 2 x (ptn_inter_flag[nodeIdx] ? 0 : ptn_pred_idx[nodeIdx] ≠ 0) + BinIdxTu | 6 |
| ptn_phi_mul_sign | 26 | 2 x ptn_inter_flag[nodeIdx] + (ptn_inter_flag[nodeIdx] ? 0 : ptn_pred_idx[nodeIdx] ≠ 0) | 3 |

The first column of these two tables displays the name of the adaptive model according to the current version of the standard. The second column displays the index of the adaptive model within the whole set of adaptive models.
This first embodiment has been tested and compared to the prior art. The experimental conditions are:
Reference branch:
* TM_Reference_release-v20.0-rc1
Tested branch (the reference software amended according to an example of this embodiment):
* TM_Reference_release-v20.0-rc1 + modification proposed in this embodiment
Tested modes:
* Inter disabled (Intra only) and Inter enabled
Sequences:
* Cat-3 sequences (a list of rotating LiDAR sequences)
The results obtained with this embodiment show improvements. In summary, the first embodiment is related to a method for coding the number of azimuthal angle steps. Compression gains are obtained by using the proposed method.
According to a second embodiment of the invention, the encoding of the multiplier qPhi, already described in relation to figure 12, is amended. In this embodiment, the encoding takes a further argument, called refNodeFlag in an example, which is the index of the INTER predictor if INTER prediction is used. If INTER prediction is used, the variable interFlag is set to true, corresponding to a value of 1, otherwise it is set to 0.
The step 12001 of determining a context for the subsequent arithmetic encoding steps is modified to further take into account the index refNodeFlag. Accordingly, the context is calculated according to the values of interFlag and refNodeFlag. If interFlag is false, the context will be set to 0; otherwise it will be set to 1 or 2. If INTER encoding is used, meaning that the interFlag value is 1, the context is set to 2 if the refNodeFlag value is 1, and set to 1 otherwise. The contexts are used by the arithmetic encoders in the following steps, which are not modified compared to the encoding process described in relation to figure 12. It should be noted that the main changes over the prior art come from the addition of contexts and adaptive models for the coding of the multiplier.

In an embodiment, the context may be calculated according to the following code:

void encodePhiMultiplier(const int32_t multiplier, const bool interFlag, const bool refNodeFlag)
{
  int interCtxIdx = interFlag ? 1 : 0;
  if (interFlag == 1 && refNodeFlag == 1)
    interCtxIdx += 1;
  // remaining encoding steps unchanged
}

According to a third embodiment of the invention, the encoding of the multiplier qPhi, already described in relation to figure 12, is amended. In this embodiment, which may be seen as a combination of the first and second embodiments, the encoding takes two further arguments, called predIdx and refNodeFlag in an example, which are respectively the index of the INTRA predictor if INTRA prediction is used and the index of the INTER predictor if INTER prediction is used. If INTER prediction is used, the variable interFlag is set to true, corresponding to a value of 1, otherwise it is set to 0.
The step 12001 of determining a context for the subsequent arithmetic encoding steps is modified to further take into account the indexes predIdx and refNodeFlag. Accordingly, the context is calculated according to the values of interFlag, predIdx and refNodeFlag. If interFlag is true, the context will be set to 2 or 3; otherwise it will be set to 0 or 1. It is set to 2 if refNodeFlag is different from true and interFlag is true. It is set to 3 if refNodeFlag is true and interFlag is true. It is set to 0 if predIdx is different from 0 and interFlag is false. It is set to 1 if predIdx equals 0 and interFlag is false. It should be noted that the main changes over the prior art come from the addition of contexts and adaptive models for the coding of the multiplier.

In an embodiment, the context may be calculated according to the following code:

void encodePhiMultiplier(const int32_t multiplier, const bool interFlag, const bool refNodeFlag, const int predIdx)
{
  int interCtxIdx = interFlag ? 2 : 0;
  if (interFlag == 0 && predIdx == 0)
    interCtxIdx += 1;
  if (interFlag == 1 && refNodeFlag == 1)
    interCtxIdx += 1;
  // remaining encoding steps unchanged
}

Figure 14 illustrates the decoding of the qPhi value from the prior art. The decoding is the exact symmetric of the encoding process described in relation to figure 12. This decoding takes as parameter a Boolean value called interFlag specifying if INTRA or INTER prediction is used and returns the value of the decoded qPhi. This function operates in several steps: In a step 14001, a context is calculated according to the value of interFlag. If interFlag is true, the context will be set to 1; otherwise it will be set to 0. Contexts are used by arithmetic decoders; they must correspond to the contexts used for encoding.
They are associated with probability tables that contain the statistics of the encoded value. These statistics can be known in advance or calculated on the fly. These probability tables are used by the arithmetic decoder.
In a step 14002, the arithmetic decoder decodes a flag indicating whether the multiplier value is zero. This decoding is based on a context table, called '_ctxPhiGtN0' in an exemplary embodiment, that contains two probability models. Each model used by the arithmetic decoder is defined, and updated on the fly, according to the value of the flag and to the context 'interFlag'. If the multiplier is zero, the decoding is terminated and the value zero is returned.
In a step 14003, the arithmetic decoder decodes a flag indicating whether the absolute value of the multiplier is one. This arithmetic decoding is based on a context table, called '_ctxPhiGtN1' in an exemplary embodiment, that contains different probability tables. Each model used by the arithmetic decoder is defined, and updated on the fly, according to the value of the flag and to the context 'interFlag'. If the absolute value of the multiplier is one, the sign of the multiplier is decoded by an arithmetic decoder and a given adaptive probability model called '_ctxSignPhi[interCtxIdx]', and the decoding is terminated. The value 'interCtxIdx' is the context calculated according to 'interFlag'. The decoded value (one or minus one) is returned as the decoded value.
In a step 14004, arithmetic decoders are used in order to decode the next 3 bits of the absolute value of the multiplier. Once again the arithmetic decoders make use of adaptive probability models and contexts dedicated to these values, called '_ctxResidualPhi' in an exemplary embodiment. At the end of this module, all the absolute values of the multiplier between 0 and 9 should have been decoded.
In 14005, the potential absolute values of the multiplier (values higher than 9) are decoded by using an exponential Golomb decoder. If the value to decode is lower than 9, no data will be read and no additional value will be added to the previously decoded qPhi.
In 14006, the sign of the multiplier is decoded by an arithmetic decoder and a dedicated adaptive probability model, called '_ctxSignPhi[interCtxIdx]' in an exemplary embodiment, and the decoding is terminated. The decoded value is returned.
It is to be noted that for all these arithmetic decoding steps, only the value of the flag indicating INTRA or INTER coding mode is used as the context of the decoding. In an exemplary embodiment, the decoding may be coded as follows:

int32_t decodePhiMultiplier(const bool interFlag)
{
  int interCtxIdx = interFlag ? 1 : 0;
  if (!_aed->decode(_ctxPhiGtN[interCtxIdx][0]))
    return 0;

  int value = 1;
  value += _aed->decode(_ctxPhiGtN[interCtxIdx][1]);
  if (value == 1) {
    const auto sign = _aed->decode(_ctxSignPhi[interCtxIdx]);
    return sign ? -1 : 1;
  }

  auto* ctxs = &_ctxResidualPhi[interCtxIdx][0] - 1;
  value = 1;
  for (int n = 3; n > 0; n--)
    value = (value << 1) | _aed->decode(ctxs[value]);
  value ^= 1 << 3;

  if (value == 7)
    value += _aed->decodeExpGolomb(0, _ctxEGPhi[interCtxIdx]);

  const auto sign = _aed->decode(_ctxSignPhi[interCtxIdx]);
  return sign ? -(value + 2) : (value + 2);
}

The decoding of the multiplier in the first, second and third embodiments is obtained by adopting the same amendments to the calculation of the context in step 14001 as those described for the calculation of the context at encoding.
Figure 15 illustrates a property of the calculation of the parameter qPhi when the current point to encode has a small radius. This property comes from the fact that the Cartesian distance between two points differing by a small azimuthal angle and having the same radius depends on this radius value. For a small radius the distance is small, and it grows as the radius grows. According to this property, and due to the quantization, when the radius is small enough, a difference of a few azimuthal steps may have no effect on the location of the reconstructed point. This property may allow a modification of the way qPhi is calculated. As the purpose of this disclosure is related to the encoding of qPhi, and as this property is used in an embodiment of the invention for the coding of qPhi, we need to describe this property of the encoder. It is to be noted that the same property is obtained symmetrically at the decoder.
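As a rough numeric illustration (the values are chosen here for illustration only): two points at radius r that differ by an azimuthal angle Δφ are separated by a chord of length 2 x r x sin(Δφ/2) ≈ r x Δφ. With Δφ = 0.001 radian, this distance is about 0.1 unit for r = 100 but about 100 units for r = 100000; for small radii it therefore easily falls below the Cartesian quantization step, while for large radii it does not.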
In this figure, a coordinate system is illustrated. The horizontal axis is the 'x' axis of encoded points in the Cartesian system. The vertical axis is the 'y' axis of encoded points in the Cartesian system. This figure can be seen as the 'top' view of a point to encode.
The predictor in the spherical system is illustrated in 15000. The azimuthal step corresponding to the azimuth speed in the spherical domain is illustrated by the small arc φ_step 15002. It means that the azimuth of the predictor point 15000 will be incremented by qPhi x φ_step in order to be used as the azimuth predictor 15001 of the current point.
The point to encode 15003 is inside the Cartesian quantization step illustrated by the square around the point 15003. We recall that the encoding of a point is first done in the spherical domain, corresponding to step 7002 of figure 7, before being completed in the Cartesian domain, corresponding to step 7005 of figure 7. It means that any point encoded in the spherical domain and decoded/reconstructed in the Cartesian domain inside the square wherein the point 15003 is drawn will be reconstructed, before its Cartesian residual is encoded as illustrated in 7005, as the center of this same square.
Based on this observation, we can deduce that several different qPhi values may lead to the same reconstructed point in the Cartesian representation before the Cartesian residual is encoded as displayed in 7005. In addition, as the codec calculates the value qPhi so that the predictor 15001 is as close to the point to encode 15003 as possible, the calculated qPhi will be 2 in this example. It means that the coding of a value 2 for qPhi will need useless bits, because the least consuming value in terms of bit cost is 0, and this value would lead to the same result. In short, "when a radius is sufficiently small (i.e., when a point is sufficiently close to the LiDAR sensor), it is possible that a rotation of one 'step' angle does not change the Cartesian coordinates of the reconstructed point".
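A minimal sketch of this observation is given below; it is purely illustrative (none of these names come from the reference software) and simply checks whether two candidate qPhi values fall in the same Cartesian quantization cell.

#include <cmath>
#include <cstdint>

// Illustrative only: returns true if adding 'qPhi' azimuth steps to the
// predictor azimuth does not move the reconstructed point out of the
// Cartesian quantization cell obtained with qPhi = 0.
static bool qPhiHasNoEffect(double radius, double predAzimuth,
                            double azimuthStep, int qPhi, double quantStep)
{
  const double phi0 = predAzimuth;
  const double phi1 = predAzimuth + qPhi * azimuthStep;
  // Quantized Cartesian cell indices for both candidate azimuths.
  const long x0 = std::lround(radius * std::cos(phi0) / quantStep);
  const long y0 = std::lround(radius * std::sin(phi0) / quantStep);
  const long x1 = std::lround(radius * std::cos(phi1) / quantStep);
  const long y1 = std::lround(radius * std::sin(phi1) / quantStep);
  return x0 == x1 && y0 == y1;  // same cell, hence same reconstructed centre
}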
This property means also that the step of azimuthal angle considered for encoding a point can be increased for low values of the radius with no prejudice to the precision of the encoding. And this is exactly what is done in the prior art.
As a reminder, the azimuthal angle step φ_step depends on the actual rotation speed of the LiDAR. But for small values of the radius, this rotation speed is artificially increased, leading to a corresponding increase of the azimuthal angle step. This is done by iteratively doubling the speed until a condition depending on the radius is reached.
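A simplified sketch of this doubling rule is shown below; the threshold and the function name are illustrative assumptions, and the actual condition used by the reference software appears in the code listings later in this description.

#include <cstdint>

// Simplified sketch: double the azimuthal angle step for small radii until
// radius * step reaches an illustrative threshold, so that smaller numbers
// of steps have to be coded.
static int64_t scaleAzimuthStep(int64_t step, int64_t radius, int64_t threshold)
{
  if (radius <= 0 || step <= 0)
    return step;
  while (radius * step < threshold)
    step <<= 1;  // doubling the step halves the count of steps to code
  return step;
}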
The process is conducted on the reconstructed value of the radius so as to be symmetric at encoding and decoding. Accordingly, the encoded value of the multiplier qPhi counts a number of azimuthal angle steps to be added to the predictor, where the azimuthal angle step may be different for two points, but is the same for a given point at encoding and decoding.
According to a fourth embodiment, it is proposed to take into account the number of times the azimuth speed is scaled by a factor of two, or equivalently the number of times the azimuthal angle step is scaled by a factor of two, when calculating the value qPhi, and to use this value in the determination of the context in the encoding of the value qPhi. This value is called iterationSpeedAzimuth in an exemplary embodiment. It is to be noted that the scaling may be done according to a factor different from two, which is only an example.
In this exemplary embodiment, the step 12001 of figure 12 for the determination of the context may be calculated based on the encoding mode interFlag indicating if the point is encoded using INTER or INTRA mode and on the iterationSpeedAzimuth indicating the number of times the speed has been artificially scaled by a factor two for computing qPhi. If interFlag is true, the context will be set to 5, 6, 7, 8 or 9 otherwise it will be set to 0, 1, 2, 3 or 4. The context takes a basic value, 0 when the predictor is an INTRA predictor and 5 if the predictor is an INTER predictor, and is incremented according to the value of iterationSpeedAzimuth. The increment is bounded to 4. This context is then used in the subsequent arithmetic encoding steps for encoding qPhi.
In an embodiment, the context may be calculated according to the following code:

void encodePhiMultiplier(const int32_t multiplier, const bool interFlag, const int iterationSpeedAzimuth)
{
  int interCtxIdx = interFlag ? 5 : 0;
  if (iterationSpeedAzimuth > 0) {
    if (iterationSpeedAzimuth >= 4)
      interCtxIdx += 4;
    else
      interCtxIdx += iterationSpeedAzimuth;
  }
  // remaining encoding steps unchanged
}

In a first variant of the fourth embodiment, the newly introduced context is not used in all the arithmetic encoding steps of the encoding process of qPhi as described in relation to figure 12. In this variant, the encoding of the sign of qPhi provided in steps 12003 and 12006 uses the classical context based only on the encoding mode interFlag, while all other encoding steps use the newly introduced context.
In some embodiments, the residual of the radius is encoded using the value of qPhi as a context for the encoding. This means that the decoding of the residual requires the prior decoding of qPhi. In these embodiments, the computation of the number of times the speed is scaled may be based on the radius of the predictor not corrected by the residual of the radius, which is not yet available. As a variant, the encoding of the residual of the radius may be modified so as not to require the value of qPhi. In this case, the computation of the number of times the speed is scaled can be based on the decoded radius, namely the radius of the predictor corrected with its residual, which can be decoded prior to the decoding of qPhi in this case.
In other words, the fourth embodiment may be summarized as follows.
This embodiment is related to predictive geometry when spherical coordinates and rotating LiDAR are used. This disclosure proposes to improve the coding of the parameter called number of Azimuth Angle Steps 'k'.
In the current version of the reference model, the number of azimuthal angle steps is encoded by using arithmetic coding using different adaptive models and contexts.
This embodiment proposes to add new contexts for encoding the number of azimuthal angle steps. These new contexts are calculated from (an estimation of) the number of times the azimuthal speed is scaled.
This embodiment proposes to: Calculate an approximated value (called "iterationSpeedAzimuth") of the number of times the azimuth speed is scaled (in a symmetrical way at encoding and decoding).
To use this calculated value (called "iterationSpeedAzimuth") for generating additional contexts for the arithmetic coding of the number of Azimuth Angle Steps 'k'.
In the reference software, the number of azimuth steps k is used as follows:

φ_pred = k * S(φ_step, r) + φ_n

with:
- S(φ_step, r): a scaled azimuthal angle step (can be viewed as the azimuthSpeed)
- φ_n: the azimuthal angle prediction provided by the 'n'-th predictor

The azimuth step is potentially scaled (for small values of the radius) in order to bound the dynamic range of the number of azimuth steps 'k', as exemplified in the following code representing the scaling of the azimuth speed as done in the reference software:

int r = rPred + residual[0] << 3;
auto speedTimesR = int64_t(_geomAngularAzimuthSpeed) * r;
int phiBound = divExp2RoundHalfInf(speedTimesR, _azimuthTwoPiLog2 + 1);
if (r && !phiBound) {
  const int32_t pi = 1 << (_azimuthTwoPiLog2 - 1);
  int32_t speedTimesR32 = speedTimesR;
  while (speedTimesR32 < pi) {
    speedTimesR32 <<= 1;
    *azimuthSpeed <<= 1;
  }
}

By increasing the azimuthSpeed, the values of the number of azimuth angle steps are lower.
In the reference software, the number of azimuthal angle steps is encoded from arithmetic coding and contexts according to the following steps:
- Comparison to zero:
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_prefix
  o Source code:
    _aec->encode(multiplier != 0, _ctxPhiGtN[interCtxIdx][0]);
- Comparison of the absolute value to one:
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_prefix
  o Source code:
    int32_t value = abs(multiplier) - 1;
    _aec->encode(value != 0, _ctxPhiGtN[interCtxIdx][1]);
- Encoding of the next 3 bits for the absolute value (if needed):
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_minus2
  o Source code:
    _aec->encode((value >> 2) & 1, _ctxResidualPhi[interCtxIdx][0]);
    _aec->encode((value >> 1) & 1, _ctxResidualPhi[interCtxIdx][1 + (value >> 2)]);
    _aec->encode((value >> 0) & 1, _ctxResidualPhi[interCtxIdx][3 + (value >> 1)]);
- Encoding of the remaining bits for the absolute value (if needed):
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_minus9
  o Source code:
    _aec->encodeExpGolomb(valueMinus7, 0, _ctxEGPhi[interCtxIdx]);
- Encoding of the sign (if needed):
  o Adaptive model according to the current version of the standard: ptn_phi_mul_sign
  o Source code:
    _aec->encode(multiplier < 0, _ctxSignPhi[interCtxIdx]);

The context interCtxIdx used by the different models is defined according to the inter flag by:
int interCtxIdx = interFlag ? 1 : 0;

In the reference software, the adaptive models are declared as:
AdaptiveBitModel _ctxPhiGtN[2][2];
AdaptiveBitModel _ctxSignPhi[2];
AdaptiveBitModel _ctxEGPhi[2];
AdaptiveBitModel _ctxResidualPhi[2][7];

In this embodiment, it is proposed to add new contexts to the adaptive models for the coding of 'k'. The new contexts are based on an estimation of the number of times the azimuth speed is increased. This requires calculating this estimation both at the encoder and the decoder. This is done by adding the following source code.

Estimation of the number of times the azimuth speed is increased (encoder/decoder): The way the number of times the azimuth speed is increased is estimated is illustrated by the following extract of the reference model representing the estimation of the number of times the azimuth speed is scaled (with a limit of 4). This code is implemented both in the encoder and the decoder.
r = pred[0];
auto azimuthSpeed = _geomAngularAzimuthSpeed;
int recordAzimuthSpeed = 0;
r = r << 3;
auto speedTimesR = int64_t(azimuthSpeed) * r;
int phiBound = divExp2RoundHalfInf(speedTimesR, _azimuthTwoPiLog2 + 1);
if (r && !phiBound) {
  recordAzimuthSpeed++;
  const int32_t pi = 1 << (_azimuthTwoPiLog2 - 1);
  int32_t speedTimesR32 = speedTimesR;
  while (speedTimesR32 < pi && recordAzimuthSpeed < 4) {
    speedTimesR32 <<= 1;
    azimuthSpeed <<= 1;
    recordAzimuthSpeed++;
  }
}

Interpretation of the calculated additional parameter: recordAzimuthSpeed is an estimation of the number of times 'azimuthSpeed' is scaled.
recordAzimuthSpeed == 0 if azimuthSpeed is estimated as not increased.
recordAzimuthSpeed > 0 if azimuthSpeed is estimated as potentially increased.
Usage of the estimated number 'recordAzimuthSpeed' for coding and decoding the qPhi value:
- The newly calculated parameter is used as an additional parameter in the functions in charge of encoding, decoding or estimating the cost of qPhi. An example for the decoding is displayed below:
  int qphi = decodePhiMultiplier(mode, interFlag, recordAzimuthSpeed);
- The new context selection is defined by:
  int interCtxIdx_ = (interFlag ? 5 : 0) + iterationSpeedAzimuth;
- This modification is applied inside 3 functions:
  void PredGeomEncoder::encodePhiMultiplier(...)
  float PredGeomEncoder::estimateBits(...)
  int32_t PredGeomDecoder::decodePhiMultiplier(...)

The 'new' adaptive models related to these new contexts are defined by:
AdaptiveBitModel _ctxPhiGtN[10][2];
AdaptiveBitModel _ctxSignPhi[2];
AdaptiveBitModel _ctxEGPhi[10];
AdaptiveBitModel _ctxResidualPhi[10][7];

The remainder of the source code is unchanged. An example for the encoding of qPhi is displayed below:
- Comparison to zero:
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_prefix
  o Source code:
    _aec->encode(multiplier != 0, _ctxPhiGtN[interCtxIdx_][0]);
- Comparison of the absolute value to one:
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_prefix
  o Source code:
    int32_t value = abs(multiplier) - 1;
    _aec->encode(value != 0, _ctxPhiGtN[interCtxIdx_][1]);
- Encoding of the next 3 bits for the absolute value (if needed):
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_minus2
  o Source code:
    _aec->encode((value >> 2) & 1, _ctxResidualPhi[interCtxIdx_][0]);
    _aec->encode((value >> 1) & 1, _ctxResidualPhi[interCtxIdx_][1 + (value >> 2)]);
    _aec->encode((value >> 0) & 1, _ctxResidualPhi[interCtxIdx_][3 + (value >> 1)]);
- Encoding of the remaining bits for the absolute value (if needed):
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_minus9
  o Source code:
    _aec->encodeExpGolomb(valueMinus7, 0, _ctxEGPhi[interCtxIdx_]);

Modification of the current version of the standard: The current version of the standard related to the parameter qPhi is displayed in the following table representing the specification for the coding of the number of azimuthal steps.
| Adaptive model | Index | ctxIdx | Number of contexts |
| ptn_phi_mul_abs_minus2 | 23 | 8 x ptn_inter_flag[nodeIdx] + Exp2(BinIdx) + PartVal - 1 | 16 |
| ptn_phi_mul_abs_minus9 | 24 | Prefix: ptn_inter_flag[nodeIdx]; Suffix: bypass | 2 (prefix), 0 (suffix) |
| ptn_phi_mul_abs_prefix | 25 | 2 x ptn_inter_flag[nodeIdx] + BinIdxTu | 4 |
| ptn_phi_mul_sign | 26 | ptn_inter_flag[nodeIdx] | 2 |

The modifications of this specification are displayed in the following table representing the proposed modification of the current version of the standard for coding the number of azimuthal steps. The variable called ptn_iteration_speed inside the following table is the 'recordAzimuthSpeed' variable of the source code.
| Adaptive model | Index | ctxIdx | Number of contexts |
| ptn_phi_mul_abs_minus2 | 23 | 40 x ptn_inter_flag[nodeIdx] + 8 x ptn_iteration_speed[nodeIdx] + Exp2(BinIdx) + PartVal - 1 | 80 |
| ptn_phi_mul_abs_minus9 | 24 | Prefix: 5 x ptn_inter_flag[nodeIdx] + ptn_iteration_speed[nodeIdx]; Suffix: bypass | 10 (prefix), 0 (suffix) |
| ptn_phi_mul_abs_prefix | 25 | 10 x ptn_inter_flag[nodeIdx] + 2 x ptn_iteration_speed[nodeIdx] + BinIdxTu | 20 |
| ptn_phi_mul_sign | 26 | ptn_inter_flag[nodeIdx] | 2 |

The first column of these two tables displays the name of the adaptive model according to the current version of the standard. The second column displays the index of the adaptive model within the whole set of adaptive models.
This embodiment has been tested and compared to the prior art. The experimental conditions are:
Reference branch:
* TM_Reference_release-v20.0-rc1
Tested branch:
* TM_Reference_release-v20.0-rc1 + modification proposed in this embodiment
Tested modes:
* Inter disabled (Intra only) and Inter enabled
Sequences:
* Cat-3 sequences (sequences containing rotating LiDAR content)
The results obtained with this embodiment show improvements. In summary, the fourth embodiment is related to a method for coding the number of azimuthal angle steps and compression gains are obtained by using the proposed method.
According to a fifth embodiment, it is proposed to take into account the index of the INTRA predictor predIdx when the INTRA mode is used, as well as the number of times the speed is artificially scaled by a factor of two, when calculating the value qPhi, and to use these values in the determination of the context in the encoding of the value qPhi. The number of times the speed is scaled is called iterationSpeedAzimuth in an exemplary embodiment.
In this exemplary embodiment, the step 12001 of figure 12 for the determination of the context may be calculated based on the encoding mode interFlag indicating if the point is encoded using INTER or INTRA mode, on the index of the INTRA predictor predIdx, and on iterationSpeedAzimuth indicating the number of times the speed has been artificially scaled by a factor of two for computing qPhi. If interFlag is true, the context will be set to 5, 6, 7, 8 or 9; otherwise it will be set to 0, 1, 2, 3, 4 or 10. The context takes a base value, 0 when the predictor is an INTRA predictor and 5 if the predictor is an INTER predictor, and is incremented according to the value of iterationSpeedAzimuth if this value is strictly positive. The increment is bounded to 4. If iterationSpeedAzimuth is zero, then the context will be set to 10 if predIdx is not zero.
In an embodiment, the context may be calculated according to the following code:

void encodePhiMultiplier(const int32_t multiplier, const bool interFlag, const int iterationSpeedAzimuth, const int predIdx)
{
  int interCtxIdx = interFlag ? 5 : 0;
  if (iterationSpeedAzimuth > 0) {
    if (iterationSpeedAzimuth >= 4)
      interCtxIdx += 4;
    else
      interCtxIdx += iterationSpeedAzimuth;
  } else {
    if (interFlag == 0 && predIdx != 0)
      interCtxIdx = 10;
  }
  // remaining encoding steps unchanged
}

In a variant of the fifth embodiment, the newly introduced context is not used in all the arithmetic encoding steps of the encoding process of qPhi as described in relation to figure 12. In this variant, the encoding of the sign of qPhi provided in steps 12003 and 12006 uses the context of the first embodiment, based on the encoding mode interFlag and the index of the INTRA predictor when the INTRA encoding mode is selected, while all other encoding steps use the newly introduced context.
It is to be noted that using different contexts for different values composing the multiplier may be applied in all described embodiments. In this case, some values composing the multiplier may be encoded using the context described in these embodiments, while some other values composing the multiplier are encoded using a context based on a second criterion. This second criterion may be the coding mode among the INTER and INTRA coding modes. In some embodiments, this second criterion may be further based on the INTRA predictor index predIdx.
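A minimal sketch of such a mixed scheme is given below (illustrative only, with assumed names): the magnitude-related elements of the multiplier use one of the newly introduced contexts, while its sign uses a context based only on the coding mode.

// Illustrative sketch of using different contexts for different elements
// composing the multiplier: 'ctxMagnitude' follows one of the embodiments
// described above, while 'ctxSign' depends only on the coding mode.
struct MultiplierContexts {
  int ctxMagnitude;  // used for the zero flag, the "abs == 1" flag and the 3 bins
  int ctxSign;       // used only for the sign
};

static MultiplierContexts selectContexts(bool interFlag, int iterationSpeedAzimuth)
{
  MultiplierContexts c;
  int inc = iterationSpeedAzimuth > 4 ? 4 : iterationSpeedAzimuth;  // bounded to 4
  c.ctxMagnitude = (interFlag ? 5 : 0) + inc;  // as in the fourth embodiment
  c.ctxSign = interFlag ? 1 : 0;               // classical inter/intra context
  return c;
}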
In other words, the embodiment may be summarized as follows: This embodiment is related to predictive geometry when spherical coordinates and rotating LiDAR are used. This disclosure proposes to improve the coding of the parameter called number of Azimuth Angle Steps 'k'.
In the current version of the reference model, the number of azimuthal angle steps is encoded by using arithmetic coding and different adaptive models and contexts. This embodiment proposes to add new contexts for the arithmetic coding of the number of azimuthal angle steps based on 1) the Intra predictor index and 2) an estimation of the number of times the azimuthal speed is scaled.
This embodiment proposes the encoding of the number of azimuthal angle steps by:
- Using the Intra predictor index as additional contexts for the arithmetic coding
- Calculating an approximated value (called "iterationSpeedAzimuth") of the number of times the azimuth speed is increased (at encoding and decoding)
- Using these parameters as additional contexts for the arithmetic coding

Context 1: In the reference software, the number of azimuth steps k is used as follows:

φ_pred = k * S(φ_step, r) + φ_n

with:
- S(φ_step, r): a scaled azimuthal angle step
- φ_n: the azimuthal angle prediction provided by the 'n'-th predictor

Context 2: The azimuth step is potentially scaled (for small values of the radius) in order to bound the dynamic range of the number of azimuth steps 'k', as displayed in the following source code representing the scaling of the azimuth speed as done in the reference software:

int r = rPred + residual[0] << 3;
auto speedTimesR = int64_t(_geomAngularAzimuthSpeed) * r;
int phiBound = divExp2RoundHalfInf(speedTimesR, _azimuthTwoPiLog2 + 1);
if (r && !phiBound) {
  const int32_t pi = 1 << (_azimuthTwoPiLog2 - 1);
  int32_t speedTimesR32 = speedTimesR;
  while (speedTimesR32 < pi) {
    speedTimesR32 <<= 1;
    *azimuthSpeed <<= 1;
  }
}

By increasing the azimuthSpeed, the values of the number of azimuth angle steps are lower.

Current version of the standard and source code.
In the reference software, the number of azimuthal steps is encoded from arithmetic coding and contexts according to the following steps:
- Comparison to zero:
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_prefix
  o Source code:
    _aec->encode(multiplier != 0, _ctxPhiGtN[interCtxIdx][0]);
- Comparison of the absolute value to one:
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_prefix
  o Source code:
    int32_t value = abs(multiplier) - 1;
    _aec->encode(value != 0, _ctxPhiGtN[interCtxIdx][1]);
- Encoding of the next 3 bits for the absolute value (if needed):
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_minus2
  o Source code:
    _aec->encode((value >> 2) & 1, _ctxResidualPhi[interCtxIdx][0]);
    _aec->encode((value >> 1) & 1, _ctxResidualPhi[interCtxIdx][1 + (value >> 2)]);
    _aec->encode((value >> 0) & 1, _ctxResidualPhi[interCtxIdx][3 + (value >> 1)]);
- Encoding of the remaining bits for the absolute value (if needed):
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_minus9
  o Source code:
    _aec->encodeExpGolomb(valueMinus7, 0, _ctxEGPhi[interCtxIdx]);
- Encoding of the sign (if needed):
  o Adaptive model according to the current version of the standard: ptn_phi_mul_sign
  o Source code:
    _aec->encode(multiplier < 0, _ctxSignPhi[interCtxIdx]);

The context interCtxIdx used by the different models is defined according to the inter flag by:
int interCtxIdx = interFlag ? 1 : 0;

In the reference software, the adaptive models are declared as:
AdaptiveBitModel _ctxPhiGtN[2][2];
AdaptiveBitModel _ctxSignPhi[2];
AdaptiveBitModel _ctxEGPhi[2];
AdaptiveBitModel _ctxResidualPhi[2][7];
In this embodiment, it is proposed to add new contexts to the adaptive models for the coding of 'k' based on:
- The intra predictor index
- An estimation of the number of times the azimuth speed is increased.
This requires calculating this estimation both at the encoder and the decoder. This is done by adding the following source code.
Estimation of the number of times the azimuth speed is increased (encoder/decoder): The way the number of times the azimuth speed is increased is estimated is displayed in the following source code extract representing the estimation of the number of times the azimuth speed is scaled (with a limit of 4). This code is implemented both in the encoder and the decoder.
r = pred[0];
auto azimuthSpeed = _geomAngularAzimuthSpeed;
int recordAzimuthSpeed = 0;
r = r << 3;
auto speedTimesR = int64_t(azimuthSpeed) * r;
int phiBound = divExp2RoundHalfInf(speedTimesR, _azimuthTwoPiLog2 + 1);
if (r && !phiBound) {
  recordAzimuthSpeed++;
  const int32_t pi = 1 << (_azimuthTwoPiLog2 - 1);
  int32_t speedTimesR32 = speedTimesR;
  while (speedTimesR32 < pi && recordAzimuthSpeed < 4) {
    speedTimesR32 <<= 1;
    azimuthSpeed <<= 1;
    recordAzimuthSpeed++;
  }
}

Usage of the estimated number 'recordAzimuthSpeed' for coding the qPhi value at the decoder: The newly calculated parameter (recordAzimuthSpeed) and the Intra prediction index are used as additional parameters in the functions in charge of encoding, decoding or estimating the cost of qPhi.
An example for the decoding is displayed below:

int32_t PredGeomDecoder::decodePhiMultiplier(GPredicter::Mode mode, const bool interFlag, const int32_t azimuthSpeedIndex, const int predIdx)

The new context is defined by:

int interCtxIdx = (interFlag ? 5 : 0) + azimuthSpeedIndex;
if (azimuthSpeedIndex == 0 && !interFlag && predIdx != 0)
  interCtxIdx = 10;

This modification is applied inside 3 functions:
void PredGeomEncoder::encodePhiMultiplier(...)
float PredGeomEncoder::estimateBits(...)
int32_t PredGeomDecoder::decodePhiMultiplier(...)

The adaptive models related to these new contexts are defined by:
AdaptiveBitModel _ctxPhiGtN[11][2];
AdaptiveBitModel _ctxSignPhi[3];
AdaptiveBitModel _ctxEGPhi[11];
AdaptiveBitModel _ctxResidualPhi[11][7];
The remainder of the source code is almost unchanged. An example of encoding the qPhi multiplier is described below:
- Comparison to zero:
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_prefix
  o Source code:
    _aec->encode(multiplier != 0, _ctxPhiGtN[interCtxIdx_][0]);
- Comparison of the absolute value to one:
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_prefix
  o Source code:
    int32_t value = abs(multiplier) - 1;
    _aec->encode(value != 0, _ctxPhiGtN[interCtxIdx_][1]);
- Encoding of the next 3 bits for the absolute value (if needed):
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_minus2
  o Source code:
    _aec->encode((value >> 2) & 1, _ctxResidualPhi[interCtxIdx_][0]);
    _aec->encode((value >> 1) & 1, _ctxResidualPhi[interCtxIdx_][1 + (value >> 2)]);
    _aec->encode((value >> 0) & 1, _ctxResidualPhi[interCtxIdx_][3 + (value >> 1)]);
- Encoding of the remaining bits for the absolute value (if needed):
  o Adaptive model according to the current version of the standard: ptn_phi_mul_abs_minus9
  o Source code:
    _aec->encodeExpGolomb(valueMinus7, 0, _ctxEGPhi[interCtxIdx_]);
- Encoding of the sign:
  o Adaptive model according to the current version of the standard: ptn_phi_mul_sign
  o Source code:
    _aec->encode(multiplier < 0, _ctxSignPhi[interCtxIdx]);

Modification of the current version of the standard: The text of the standard related to the parameter qPhi is displayed in the following table representing the specification for the coding of the number of azimuthal steps.
| Adaptive model | Index | ctxIdx | Number of contexts |
| ptn_phi_mul_abs_minus2 | 23 | 8 x ptn_inter_flag[nodeIdx] + Exp2(BinIdx) + PartVal - 1 | 16 |
| ptn_phi_mul_abs_minus9 | 24 | Prefix: ptn_inter_flag[nodeIdx]; Suffix: bypass | 2 (prefix), 0 (suffix) |
| ptn_phi_mul_abs_prefix | 25 | 2 x ptn_inter_flag[nodeIdx] + BinIdxTu | 4 |
| ptn_phi_mul_sign | 26 | ptn_inter_flag[nodeIdx] | 2 |

The modifications of the current version of the standard in relation to this embodiment are displayed in the following table representing the proposed modification of the current version of the standard for coding the number of azimuthal steps. The variable called ptn_iteration_speed inside the following table is the 'recordAzimuthSpeed' variable of the source code.
| Adaptive model | Index | ctxIdx | Number of contexts |
| ptn_phi_mul_abs_minus2 | 23 | 48 x ptn_inter_flag[nodeIdx] + (ptn_inter_flag[nodeIdx] == 0) x (((ptn_iteration_speed[nodeIdx] == 0) x 8 x (ptn_pred_idx[nodeIdx] ≠ 0)) + 8 x (ptn_iteration_speed[nodeIdx] ≠ 0)) + 8 x ptn_iteration_speed[nodeIdx] + Exp2(BinIdx) + PartVal - 1 | 88 |
| ptn_phi_mul_abs_minus9 | 24 | Prefix: 6 x ptn_inter_flag[nodeIdx] + (ptn_inter_flag[nodeIdx] == 0) x (((ptn_iteration_speed[nodeIdx] == 0) x (ptn_pred_idx[nodeIdx] ≠ 0)) + (ptn_iteration_speed[nodeIdx] ≠ 0)) + ptn_iteration_speed[nodeIdx]; Suffix: bypass | 11 (prefix), 0 (suffix) |
| ptn_phi_mul_abs_prefix | 25 | 12 x ptn_inter_flag[nodeIdx] + (ptn_inter_flag[nodeIdx] == 0) x (((ptn_iteration_speed[nodeIdx] == 0) x 2 x (ptn_pred_idx[nodeIdx] ≠ 0)) + 2 x (ptn_iteration_speed[nodeIdx] ≠ 0)) + 2 x ptn_iteration_speed[nodeIdx] + BinIdxTu | 22 |
| ptn_phi_mul_sign | 26 | 2 x ptn_inter_flag[nodeIdx] + (ptn_inter_flag[nodeIdx] ? 0 : ptn_pred_idx[nodeIdx] ≠ 0) | 3 |

The proposed embodiment has been tested and compared according to the following conditions:
Reference branch:
* TM_Reference_release-v20.0-rc1
Tested branch (current embodiment):
* TM_Reference_release-v20.0-rc1 + modification proposed in this embodiment
Tested modes:
* Inter disabled and Inter enabled
Sequences:
* Cat-3 sequences
The results obtained with this embodiment show improvements. In summary, the embodiment is related to a method for coding the number of azimuthal angle steps and compression gains are obtained by using the proposed method.
Figure 16 illustrates a block diagram of a device adapted to incorporate embodiments of the invention.
Preferably, the device comprises a central processing unit (CPU) 16001 capable of executing instructions from program ROM 16003 on powering up of the receiving device, and instructions relating to a software application from main memory 16002 after the powering up. The main memory 16002 is for example of Random Access Memory (RAM) type, which functions as a working area of the CPU 16001, and the memory capacity thereof can be expanded by an optional RAM connected to an expansion port (not illustrated). Instructions relating to the software application may be loaded to the main memory 16002 from the hard disc (HD) 16006 or the program ROM 16003 for example. Such a software application, when executed by the CPU 16001, causes the steps of the flowcharts shown in the previous figures to be performed.
Reference numeral 16004 is a network interface that allows the connection of the device to the communication network. The software application when executed by the CPU is adapted to receive data streams through the network interface from other devices.
Reference numeral 16005 represents a user interface to display information to, and/or receive inputs from a user.
In this section, we are going to describe roughly how arithmetic encoding works. In particular, we are going to describe the context adaptive binary arithmetic encoder used in embodiments of the invention.
Binary arithmetic encoding consists in reading a sequence of bins (sequences of 0 and 1) and in generating a floating interval (for each bin, a new floating interval is calculated based on the previously calculated interval) between zero and one. The interval is constructed based on the probability of the bins (probability of zero or one). When the entire sequence of bins is processed, this interval can be identified by a floating value. This floating value is represented in a binary form (the bits of the bitstream) and can be used by the arithmetic decoder for decoding the bins. It is supposed that the probabilities of the bins are known by the decoder. By this means, it is possible to obtain code words (bitstream) close to the Shannon entropy. For example, 20 binary bins can be encoded (for some adequate probabilities) in 16 bits or less.
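As a toy illustration of this interval construction (floating-point arithmetic is used for clarity, so this sketch is only suitable for short bin sequences; a real codec works on integers with renormalization):

#include <vector>

// Toy binary arithmetic encoder: narrows [low, high) for each bin according
// to the probability of '0', and finally returns any value inside the final
// interval. That value, written in binary, identifies the whole sequence.
static double toyArithmeticEncode(const std::vector<int>& bins, double p0)
{
  double low = 0.0, high = 1.0;
  for (int b : bins) {
    const double split = low + (high - low) * p0;
    if (b == 0)
      high = split;   // '0' takes the lower sub-interval
    else
      low = split;    // '1' takes the upper sub-interval
  }
  return (low + high) / 2.0;
}

For instance, with p0 = 0.9, the sequence 0, 0, 0, 0 is represented by any value in [0, 0.6561), which can be described with fewer bits than the four raw bins.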
For practical applications of the arithmetic encoder, there is no need to wait for the entire sequence of bins to be encoded before generating bits for the bitstream.
Context Adaptive Binary arithmetic encoding is an improvement over binary arithmetic encoding. It uses additional contexts. Contexts make it possible to specify 'sharper' probability tables. In other words, a context allows probability modelling for specific conditions. These specific conditions are equivalent to conditional probabilities. If the selected condition enables the probability of a binary value to increase (for example if the probability of 1 increases when a condition 'A' is true), it is interesting to use this condition state. Indeed, as the probability of the binary value increases, the Shannon entropy decreases and the arithmetic encoder compresses better (because the arithmetic encoder is able to generate bits/bitstream close to the Shannon entropy).
The probability model used by the encoder and decoder (for a given bin) can be known in advance or can be calculated on the fly. For example, the probability model can be constructed from previously decoded elements. In such a way, the probability model will be adaptive and will match as closely as possible the evolution of the probability distribution of the given bins to encode (the closer the probability model is to the real probability, the more efficient the arithmetic codec is).
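A minimal sketch of such an adaptive probability model is shown below; the counter-based update rule is an illustrative assumption and not the exact update used by the reference software.

// Illustrative adaptive probability model for one bin: the probability of
// '1' is re-estimated after each coded symbol, so that the model tracks the
// statistics of previously encoded/decoded elements.
struct ToyAdaptiveBitModel {
  int count1 = 1;   // count of observed '1' (initialised to avoid p = 0)
  int total  = 2;   // total observations

  double probabilityOfOne() const { return double(count1) / double(total); }

  void update(bool bin)
  {
    count1 += bin ? 1 : 0;
    total += 1;
  }
};

Since the decoder applies exactly the same update after each decoded bin, the encoder and decoder probability models remain synchronized without being transmitted.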
In an embodiment of the invention, it is proposed to encode with a Context Adaptive Binary Arithmetic encoder the bins related to refNodeFlag. The choice of the contexts is based on the value called IsPreviousPredictorExist as explained in relation with Figure 15.
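A minimal sketch of this context selection is given below; the model name _ctxRefNodeFlag and the exact number of contexts are assumptions made here for illustration.

// Illustrative context selection for coding refNodeFlag with a context
// adaptive binary arithmetic encoder: one adaptive model is used when a
// previous predictor exists, another one otherwise.
static int refNodeFlagContext(bool isPreviousPredictorExist)
{
  return isPreviousPredictorExist ? 1 : 0;
}
// Possible use, following the pattern of the other listings:
// _aec->encode(refNodeFlag, _ctxRefNodeFlag[refNodeFlagContext(isPreviousPredictorExist)]);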
Any step of the algorithms of the invention may be implemented in software by execution of a set of instructions or program by a programmable computing machine, such as a PC ("Personal Computer"), a DSP ("Digital Signal Processor") or a microcontroller; or else implemented in hardware by a machine or a dedicated component, such as an FPGA ("Field-Programmable Gate Array") or an ASIC ("Application-Specific Integrated Circuit').
Although the present invention has been described hereinabove with reference to specific embodiments, the present invention is not limited to the specific embodiments, and modifications will be apparent to a skilled person in the art which lie within the scope of the present invention.
Many further modifications and variations will suggest themselves to those versed in the art upon making reference to the foregoing illustrative embodiments, which are given by way of example only and which are not intended to limit the scope of the invention, that being determined solely by the appended claims. In particular the different features from different embodiments may be interchanged, where appropriate.
Each of the embodiments of the invention described above can be implemented solely or as a combination of a plurality of the embodiments. Also, features from different embodiments can be combined where necessary or where the combination of elements or features from individual embodiments in a single embodiment is beneficial.
In the claims, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. The mere fact that different features are recited in mutually different dependent claims does not indicate that a combination of these features cannot be advantageously used.

Claims (15)

1. A method of encoding a 3D dynamic point cloud, comprising a sequence of frames of 3D point clouds, in a bitstream, each 3D point cloud comprising a set of 3D points, the method comprising:
- obtaining an azimuth predictor for encoding the azimuth of a current 3D point using a coding mode, wherein the azimuth predictor is obtained by applying a multiplier to a selected azimuth predictor;
- obtaining one or more probability distributions determined based on the coding mode and one or more further parameters; and
- encoding the multiplier using the obtained one or more probability distributions.

2. The method of claim 1, wherein encoding the multiplier comprises encoding each of one or more elements composing the multiplier.

3. The method of claim 2, wherein a plurality of determined probability distributions is associated with an element of the multiplier and wherein the element is encoded using one probability distribution selected among the plurality based on a coding mode and one or more further parameters.

4. The method of one of claims 1 to 3, wherein the coding mode is an INTRA coding mode or an INTER coding mode.

5. The method of claim 4, wherein the one or more further parameters are chosen among:
- an index of the selected predictor when INTRA coding mode is used;
- an index of the selected predictor when INTER coding mode is used;
- a number, or an estimation thereof, of times an azimuthal angle is scaled depending on a radius of the selected predictor or the 3D point.

6. The method of one of claims 1 to 5, further comprising encoding the azimuth of the current 3D point using the obtained azimuth predictor.

7. A method of decoding a bitstream comprising an encoded 3D dynamic point cloud, the 3D dynamic point cloud comprising a sequence of frames of 3D point clouds, each 3D point cloud comprising a set of 3D points, the method comprising:
- determining a coding mode from the bitstream;
- obtaining one or more further parameters;
- obtaining one or more probability distributions determined based on the coding mode and the obtained one or more further parameters; and
- decoding a multiplier using the obtained one or more probability distributions.

8. The method of claim 7, wherein the one or more further parameters are chosen among:
- an index of the predictor determined based on the coding mode;
- a number, or estimation thereof, of times an azimuthal angle is scaled depending on a radius of the selected predictor or the 3D point.

9. The method of one of claims 7 to 8, further comprising:
- selecting an azimuth predictor;
- determining a modified azimuth predictor by applying the decoded multiplier to the selected azimuth predictor; and
- decoding the azimuth of the current 3D point using the determined azimuth predictor.

10. The method of one of claims 7 to 9, wherein the coding mode is an INTRA coding mode or an INTER coding mode.

11. A device for encoding a 3D dynamic point cloud, comprising a sequence of frames of 3D point clouds, in a bitstream, each 3D point cloud comprising a set of 3D points, the device comprising a processor configured for:
- obtaining an azimuth predictor for encoding the azimuth of a current 3D point using a coding mode, wherein the azimuth predictor is obtained by applying a multiplier to a selected azimuth predictor;
- obtaining one or more probability distributions determined based on the coding mode and one or more further parameters; and
- encoding the multiplier using the obtained one or more probability distributions.

12. A device for decoding a bitstream comprising an encoded 3D dynamic point cloud, the 3D dynamic point cloud comprising a sequence of frames of 3D point clouds, each 3D point cloud comprising a set of 3D points, the device comprising a processor configured for:
- determining a coding mode from the bitstream;
- obtaining one or more further parameters;
- obtaining one or more probability distributions determined based on the coding mode and the obtained one or more further parameters; and
- decoding a multiplier using the obtained one or more probability distributions.

13. A computer program product for a programmable apparatus, the computer program product comprising a sequence of instructions for implementing a method according to any one of claims 1 to 10, when loaded into and executed by the programmable apparatus.

14. A computer-readable storage medium storing instructions of a computer program for implementing a method according to any one of claims 1 to 10.

15. A computer program which upon execution causes the method of any one of claims 1 to 10 to be performed.