WO2024217301A1 - Point cloud coding processing method, point cloud decoding processing method and related device - Google Patents
- Publication number
- WO2024217301A1 (PCT/CN2024/086903)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- frame
- node
- prediction
- target
- inter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/157—Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
- H04N19/159—Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/18—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
Definitions
- The present application belongs to the field of computer technology, and specifically relates to a point cloud encoding processing method, a point cloud decoding processing method, and related devices.
- A point cloud is a representation of a three-dimensional object or scene: a set of discrete points, irregularly distributed in space, that express the spatial structure and surface properties of the object or scene. Accurately reflecting the information in space requires a large number of discrete points, so point cloud data must be encoded and compressed to reduce the bandwidth occupied by its storage and transmission. Point cloud data usually consists of geometric information describing a position, such as three-dimensional coordinates (x, y, z), and attribute information at that position, such as color (R, G, B) or reflectance. In point cloud coding and compression, geometric information and attribute information are encoded separately.
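As a non-normative illustration of the data layout just described, the following Python sketch models a point cloud as parallel lists of geometry and attributes (the `PointCloud` name and fields are hypothetical, not from the application):

```python
from dataclasses import dataclass

# Hypothetical minimal container: per-point geometry (x, y, z) plus
# per-point attributes such as color (R, G, B); the two are encoded
# separately in point cloud compression.
@dataclass
class PointCloud:
    positions: list  # list of (x, y, z) integer coordinates
    colors: list     # list of (r, g, b) tuples

cloud = PointCloud(
    positions=[(0, 0, 0), (1, 0, 2), (3, 1, 1)],
    colors=[(255, 0, 0), (0, 255, 0), (0, 0, 255)],
)
assert len(cloud.positions) == len(cloud.colors)
```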
- The embodiments of the present application provide a point cloud encoding processing method, a point cloud decoding processing method, and related devices, which can solve the problem of low encoding efficiency.
- The code stream of the point cloud to be encoded includes the first encoding result.
- When the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the node of the target layer.
- A point cloud decoding processing method, executed by a decoding end, includes:
- when the target distance is less than or equal to the first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the node of the target layer; or
- when the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the node of the target layer; or
- when the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the node of the target layer.
- A point cloud coding processing device comprising:
- a first determination module, used to determine a target distance between a target layer and a root node of a transform tree corresponding to a point cloud to be encoded;
- a second determination module, used to determine a target prediction residual value based on the target distance; and
- an encoding module, used to encode the target prediction residual value to obtain a first encoding result;
- wherein the code stream of the point cloud to be encoded includes the first encoding result;
- when the target distance is less than or equal to the first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the node of the target layer; or
- when the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the node of the target layer; or
- when the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the node of the target layer.
- A point cloud decoding processing device comprising:
- a determination module, used to determine a target distance between a target layer and a root node of a transform tree corresponding to a point cloud to be decoded;
- a decoding module, used to decode a first encoding result in a code stream of the point cloud to be decoded to obtain a target prediction residual value; and
- an acquisition module, used to acquire attribute information of the nodes of the target layer based on the target prediction residual value and the target distance;
- wherein, when the target distance is less than or equal to the first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the node of the target layer; or
- when the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the node of the target layer; or
- when the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the node of the target layer.
- A terminal comprising a processor and a memory, wherein the memory stores a program or instruction that can be run on the processor; when the program or instruction is executed by the processor, the steps of the method described in the first aspect or the second aspect are implemented.
- A terminal comprising a processor and a communication interface, wherein the processor is used to: determine a target distance between a target layer and a root node of a transform tree corresponding to a point cloud to be encoded; determine a target prediction residual value based on the target distance; and encode the target prediction residual value to obtain a first encoding result; wherein the code stream of the point cloud to be encoded includes the first encoding result; when the target distance is less than or equal to a first preset distance, the target prediction residual value includes a first prediction residual value obtained by inter-frame prediction of the attribute information of a node of the target layer; or when the target distance is greater than the first preset distance and less than or equal to a second preset distance, the target prediction residual value includes a second prediction residual value obtained by predicting the attribute information of the node of the target layer; or when the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by intra-frame prediction of the attribute information of the node of the target layer.
- A terminal comprising a processor and a communication interface, wherein the processor is used to: determine a target distance between a target layer and a root node of a transform tree corresponding to a point cloud to be decoded; decode a first encoding result in a code stream of the point cloud to be decoded to obtain a target prediction residual value; and obtain attribute information of the nodes of the target layer based on the target prediction residual value and the target distance; wherein, when the target distance is less than or equal to a first preset distance, the target prediction residual value includes a first prediction residual value obtained by inter-frame prediction of the attribute information of the nodes of the target layer; or when the target distance is greater than the first preset distance and less than or equal to a second preset distance, the target prediction residual value includes a second prediction residual value obtained by predicting the attribute information of the nodes of the target layer; or when the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by intra-frame prediction of the attribute information of the nodes of the target layer.
- A readable storage medium on which a program or instruction is stored.
- When the program or instruction is executed by a processor, the steps of the method described in the first aspect or the second aspect are implemented.
- A coding and decoding system comprising an encoding end device and a decoding end device, wherein the encoding end device can be used to execute the steps of the method described in the first aspect, and the decoding end device can be used to execute the steps of the method described in the second aspect.
- A chip comprising a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is used to run a program or instruction to implement the method described in the first aspect or the second aspect.
- A computer program/program product stored in a storage medium, where the program/program product is executed by at least one processor to implement the steps of the method described in the first aspect or the second aspect.
- A target prediction residual value is determined based on the target distance, and the target prediction residual value is encoded.
- When the target distance is less than or equal to the first preset distance, the target prediction residual value includes a first prediction residual value obtained by inter-frame prediction of the attribute information of the node of the target layer; when the target distance is greater than the first preset distance and less than or equal to a second preset distance, it includes a second prediction residual value obtained by predicting the attribute information of the node of the target layer; and when the target distance is greater than the second preset distance, it includes a third prediction residual value obtained by intra-frame prediction of the attribute information of the node of the target layer.
- FIG. 1 is a flow chart of a prediction method in the related art;
- FIG. 2 is a schematic diagram of a node in the related art;
- FIG. 3 is a schematic diagram of binary RAHT decomposition of a node block in the related art;
- FIG. 4 is a schematic diagram of a prediction method in the related art;
- FIG. 5 is a first flowchart of a point cloud coding processing method provided in an embodiment of the present application;
- FIG. 6 is a first flowchart of a point cloud decoding processing method provided in an embodiment of the present application;
- FIG. 7 is a second flowchart of a point cloud coding processing method provided in an embodiment of the present application;
- FIG. 8 is a second flowchart of a point cloud decoding processing method provided in an embodiment of the present application;
- FIG. 9 is a schematic diagram of the structure of a point cloud coding processing device provided in an embodiment of the present application;
- FIG. 10 is a schematic diagram of the structure of a point cloud decoding processing device provided in an embodiment of the present application;
- FIG. 11 is a schematic diagram of the structure of a communication device provided in an embodiment of the present application;
- FIG. 12 is a schematic diagram of the structure of a terminal provided in an embodiment of the present application.
- The terms “first”, “second”, and the like in the present application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that terms used in this way are interchangeable where appropriate, so that the embodiments of the present application can be implemented in an order other than those illustrated or described herein. Objects distinguished by “first” and “second” are generally of one type, and the number of objects is not limited; for example, the first object can be one or more.
- The term “or” in the present application represents at least one of the connected objects.
- “A or B” covers three schemes: Scheme 1, including A but not B; Scheme 2, including B but not A; Scheme 3, including both A and B.
- The character “/” generally indicates that the associated objects are in an “or” relationship.
- An indication in this application can be a direct indication (explicit indication) or an indirect indication (implicit indication).
- A direct indication can be understood as the sender explicitly informing the receiver, in the sent indication, of specific information, operations to be performed, or request results.
- An indirect indication can be understood as the receiver determining the corresponding information according to the indication sent by the sender, or making a judgment and determining the operation to be performed or the request result according to the judgment result.
- The codec end corresponding to the point cloud codec processing method in the embodiments of the present application may be a terminal, which may also be referred to as a terminal device or user equipment (UE).
- The terminal may be a mobile phone, a tablet personal computer, a laptop or notebook computer, a personal digital assistant (PDA), a handheld computer, a netbook, an ultra-mobile personal computer (UMPC), a mobile Internet device (MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device, a vehicle-mounted device (vehicle user equipment, VUE), a pedestrian terminal (pedestrian user equipment, PUE), or another terminal-side device; wearable devices include smart watches, bracelets, headsets, glasses, and the like.
- In a point cloud encoded and decoded with G-PCC, the geometric information and attribute information are encoded separately: the geometric information is encoded first, and the attribute information is then encoded using the reconstructed geometric information.
- RAHT: Region-Adaptive Hierarchical Transform.
- The RAHT attribute coding method can be divided into intra-frame RAHT and inter-frame RAHT according to the prediction method adopted.
- Intra-frame prediction uses only the nodes that have already been encoded in the current frame to predict the current node, while inter-frame prediction uses the nodes of a reference frame to predict the current node.
- Intra-frame RAHT, also called region-adaptive transform based on upsampling prediction, transforms layer by layer from top to bottom.
- RAHT is applied to each unit node containing 2×2×2 child nodes.
- The parent node (previous layer), coplanar neighbor parent nodes (previous layer), collinear neighbor parent nodes (previous layer), encoded coplanar neighbor child nodes (same layer), and encoded collinear neighbor child nodes (same layer) are used to perform weighted prediction on the attributes of the occupied child nodes of the node.
- The obtained attribute prediction value and the true value are each RAHT transformed, and their difference gives the AC coefficient residual, which is quantized and encoded.
- In some cases, intra-frame prediction is not enabled; the original attribute value of the current node to be encoded is then directly RAHT transformed, and the transform coefficients are quantized and encoded.
- The first step is to build the transform tree structure: starting from the bottom layer, the octree structure is built from the bottom up. In the process of building the transform tree, corresponding Morton code information, attribute information, and weight information are generated for the merged nodes.
- The second step is to parse the tree from top to bottom.
- Upsampling prediction and region-adaptive hierarchical transform (RAHT) are performed layer by layer.
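The bottom-up construction in the first step can be sketched as follows. This is a hedged illustration, not the application's exact procedure: sibling nodes whose Morton codes share a parent (dropping one octree level is `code >> 3`) are merged, accumulating attribute sums and weights (point counts).

```python
# Hedged sketch: building one level of the transform tree bottom-up.
# Each node is (morton_code, attribute_sum, weight); input is sorted by code.
def build_parent_level(nodes):
    parents = []
    for code, attr, weight in nodes:
        parent_code = code >> 3  # 2x2x2 children collapse into one parent
        if parents and parents[-1][0] == parent_code:
            pc, pa, pw = parents[-1]
            parents[-1] = (pc, pa + attr, pw + weight)  # merge siblings
        else:
            parents.append((parent_code, attr, weight))
    return parents

leaves = [(0b000, 10.0, 1), (0b011, 2.0, 1), (0b1000, 5.0, 2)]
level1 = build_parent_level(leaves)
# the first two leaves share parent code 0; the third has parent code 1
assert level1 == [(0, 12.0, 2), (1, 5.0, 2)]
```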
- If the current node is a root node, no upsampling prediction is performed, and the attribute information of the node is directly RAHT transformed to obtain the DC coefficient and AC coefficients.
- Neighbor search: the search range is the parent node of the current child node to be encoded (1), the coplanar neighbor nodes of that parent node (6), the collinear neighbor nodes of that parent node (12), the coplanar neighbor nodes of the current child node to be encoded (6), and the collinear neighbor nodes of the current child node to be encoded (12). These neighbor nodes are searched in turn; if a neighbor node exists, its corresponding index information and the number of neighbors of the parent node (including the parent node itself) are recorded. Then proceed to the fourth step.
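The neighbor counts above (6 coplanar, 12 collinear) follow from the geometry of a unit cube; a small sketch enumerating the offsets confirms them (illustrative only):

```python
from itertools import product

# Offsets to the 26 surrounding unit nodes; face-sharing (coplanar) neighbors
# differ in exactly one coordinate, edge-sharing (collinear) in exactly two.
offsets = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
coplanar = [d for d in offsets if sum(abs(c) for c in d) == 1]   # 6 face neighbors
collinear = [d for d in offsets if sum(abs(c) for c in d) == 2]  # 12 edge neighbors
assert len(coplanar) == 6
assert len(collinear) == 12
```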
- The third step is weighted prediction.
- The nearest neighbors found in the neighbor search are used to perform weighted prediction on each child node of the current node to be encoded, as shown in Figure 2.
- The prediction weight of the parent node is specified as 9; the prediction weight of a neighbor child node coplanar with the current child node to be encoded is 5; the prediction weight of a neighbor child node collinear with the current child node to be encoded is 2; the prediction weight of a neighbor parent node coplanar with the current child node to be encoded is 3; and the prediction weight of a neighbor parent node collinear with the current child node to be encoded is 1.
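The weighted prediction with the weights just listed can be sketched as follows (a minimal illustration; the normalization by the total weight of existing neighbors is an assumption about the common form of such predictors):

```python
# Stated weights: parent 9, coplanar child 5, collinear child 2,
# coplanar parent 3, collinear parent 1.
PRED_WEIGHTS = {
    "parent": 9,
    "coplanar_child": 5,
    "collinear_child": 2,
    "coplanar_parent": 3,
    "collinear_parent": 1,
}

def predict_attribute(neighbors):
    """neighbors: list of (kind, attribute_value) for neighbors that exist."""
    total_w = sum(PRED_WEIGHTS[k] for k, _ in neighbors)
    if total_w == 0:
        return 0.0
    return sum(PRED_WEIGHTS[k] * v for k, v in neighbors) / total_w

pred = predict_attribute([("parent", 100.0), ("coplanar_child", 80.0)])
assert abs(pred - (9 * 100.0 + 5 * 80.0) / 14) < 1e-9
```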
- The parent node can be directly used to predict each child node of the current block to be encoded.
- Neighbor nodes include neighbor parent nodes and encoded neighbor child nodes.
- If the condition is not met, the current neighbor parent node cannot be used to predict the child node of the current block to be encoded; if the condition is met, the following judgment continues.
- The fourth step is the RAHT transform.
- The process is as follows:
- The original attribute values and/or attribute prediction values are first normalized, and the processed values are then RAHT transformed.
- Each 2×2×2 block performs the following transforms:
- LLL is the DC coefficient;
- LLH, LHL, LHH, HLL, HLH, HHL, and HHH are AC coefficients.
- The AC coefficients are quantized and entropy coded in the order LLH, LHL, HLL, LHH, HLH, HHL, HHH.
- Each 2×2×2 node block usually has fewer than 8 occupied child nodes, so not all AC coefficients exist; non-existent AC coefficients are not encoded.
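The patent does not reproduce the transform formula here; as a hedged reference, the commonly published region-adaptive pairwise (butterfly) step of RAHT, applied to one pair of sibling attributes `a0`, `a1` with weights `w0`, `w1` (point counts), can be sketched as:

```python
import math

# Standard published RAHT butterfly (assumed form, not quoted from the patent):
# an orthonormal 2x2 rotation weighted by the point counts of the two siblings.
def raht_pair(a0, w0, a1, w1):
    s = math.sqrt(w0 + w1)
    dc = (math.sqrt(w0) * a0 + math.sqrt(w1) * a1) / s   # low-pass (L)
    ac = (-math.sqrt(w1) * a0 + math.sqrt(w0) * a1) / s  # high-pass (H)
    return dc, ac

def raht_pair_inverse(dc, ac, w0, w1):
    # The inverse is the transpose of the orthonormal forward matrix.
    s = math.sqrt(w0 + w1)
    a0 = (math.sqrt(w0) * dc - math.sqrt(w1) * ac) / s
    a1 = (math.sqrt(w1) * dc + math.sqrt(w0) * ac) / s
    return a0, a1

dc, ac = raht_pair(10.0, 1, 20.0, 3)
r0, r1 = raht_pair_inverse(dc, ac, 1, 3)
assert abs(r0 - 10.0) < 1e-9 and abs(r1 - 20.0) < 1e-9
```

Cascading this pairwise step along the three axes of a 2×2×2 block yields the LLL (DC) and the seven AC coefficients named above.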
- If the current node is not a root node and no prediction is performed, the AC coefficients of the transformed original attribute values are quantized and entropy encoded;
- if the current node is not a root node and prediction is performed, the residual between the AC coefficients of the transformed original attribute values and the AC coefficients of the transformed attribute prediction values is calculated, and the residual is quantized and entropy encoded.
- At the decoding end, the coefficients are decoded and dequantized, and the inverse RAHT transform is then performed.
- The inverse RAHT transform is the inverse process of the RAHT transform.
- The inverse transform formula is as follows:
- T 1 and T w0+1 are the transform coefficients;
- T 01 and T 11 are the reconstructed attribute values;
- a and b are calculated in the same way as a and b in the forward transform.
- If no prediction was performed, the DC coefficient inherits the attribute reconstruction value of the parent node, and an inverse transform is then performed to obtain the attribute reconstruction value of each child node;
- if prediction was performed, the DC coefficient inherits the attribute reconstruction value of the parent node, the AC coefficients are the AC coefficient residuals plus the AC coefficients of the attribute prediction value, and an inverse transform is then performed to obtain the attribute reconstruction value of each child node.
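The decoder-side coefficient recovery just described can be sketched as follows (a hedged illustration; dequantization is modeled as a simple multiply by the quantization step, which is an assumption):

```python
# DC is inherited from the parent's reconstructed attribute; each AC
# coefficient is (dequantized AC residual + AC coefficient of the
# attribute prediction value).
def reconstruct_coefficients(parent_dc, quantized_residuals, predicted_ac, qstep):
    ac = [q * qstep + p for q, p in zip(quantized_residuals, predicted_ac)]
    return parent_dc, ac

dc, ac = reconstruct_coefficients(50.0, [2, -1], [10.0, 4.0], 0.5)
assert dc == 50.0
assert ac == [11.0, 3.5]
```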
- Inter-frame RAHT uses the inter-frame prediction attribute values of qualified nodes in the reference frame to predict the attributes of nodes in the current frame, then RAHT transforms the attribute prediction value and the true attribute value separately, takes their difference, and quantizes and encodes the AC coefficient residual.
- Inter-RAHT performs inter-frame prediction of DC and AC coefficients between RAHT frames: inter-frame prediction is used only for the first five layers of nodes, and the nodes of the remaining layers use only intra-frame prediction.
- The DC coefficient prediction residual is the DC coefficient of the root node of the current frame minus the DC coefficient of the root node of the reference frame:
- DC residual = DC current − DC reference
- The prediction process is as follows:
- Reference frame reconstruction: points in the reference frame within a certain range of the current point are merged into one point; their attributes are added, and the coordinates of the current point are taken. The same prediction tree construction and analysis can then be performed. The specific process is as follows:
- The rule is: if a point in the reference frame is closer to the current point of the current frame than the Morton code distance to the next point of the current frame, that point is used to reconstruct the current point; its attribute value is accumulated and the weight is increased by one. When the condition is no longer met, the process continues with the next reconstruction point.
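The merging rule above can be sketched as follows. This is a hedged, one-dimensional illustration over Morton-code distances (the exact distance metric and traversal order are assumptions):

```python
# Fold reference-frame points into the current point while each is closer
# (in Morton-code order) to the current point than the next current-frame
# point is; accumulate attributes and grow the weight by one per merge.
def merge_reference_points(ref_points, cur_code, next_code):
    """ref_points: list of (morton_code, attribute); returns (attr_sum, weight)."""
    attr_sum, weight = 0.0, 0
    for code, attr in ref_points:
        if abs(code - cur_code) < abs(next_code - cur_code):
            attr_sum += attr
            weight += 1
        else:
            break  # stop at the first point that fails the condition
    return attr_sum, weight

attr, w = merge_reference_points([(10, 3.0), (11, 5.0), (30, 7.0)],
                                 cur_code=9, next_code=20)
assert (attr, w) == (8.0, 2)
```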
- Tree construction: the reconstructed reference frame and the current frame undergo the same bottom-up prediction tree construction, yielding the same structure.
- The prediction tree is constructed to generate the attribute value of the current-frame node and the predicted attribute value and weight value of the corresponding reference-frame node.
- Selection of the predicted attribute value: in the first five layers, if the attribute value of the node at the corresponding position in the reference frame is non-zero and lies within 20% to 250% of the attribute value of the current parent node, the attribute value Attr predicted_inter of the reference-frame node is used as the attribute prediction value; otherwise, the intra-frame prediction value Attr predicted_intra of the current frame is used.
- The remaining layers use only the original intra-frame RAHT prediction method.
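The selection rule for the first five layers can be sketched as (illustrative only; function and argument names are hypothetical):

```python
# Use the reference-frame attribute only if it is non-zero and lies within
# 20% to 250% of the current parent node's attribute; otherwise fall back
# to the intra-frame prediction value.
def select_prediction(attr_ref, attr_parent, attr_intra):
    if attr_ref != 0 and 0.2 * attr_parent <= attr_ref <= 2.5 * attr_parent:
        return attr_ref   # Attr predicted_inter
    return attr_intra     # Attr predicted_intra

assert select_prediction(attr_ref=50.0, attr_parent=100.0, attr_intra=90.0) == 50.0
assert select_prediction(attr_ref=0.0, attr_parent=100.0, attr_intra=90.0) == 90.0
assert select_prediction(attr_ref=300.0, attr_parent=100.0, attr_intra=90.0) == 90.0
```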
- In the related art, the tree structure corresponding to the point cloud to be encoded is divided into two parts.
- The nodes in the upper layers are predicted using the inter-frame prediction algorithm, and the nodes in the lower layers are predicted using the intra-frame prediction algorithm.
- This division is inflexible, resulting in low coding efficiency.
- FIG. 5 is a flow chart of a point cloud coding processing method provided in an embodiment of the present application, which can be applied to the coding end.
- The point cloud coding processing method includes the following steps:
- Step 101: Determine a target distance between a target layer and a root node of a transform tree corresponding to a point cloud to be encoded.
- Optionally, the geometric information and attribute information of the point cloud to be encoded can be obtained, and a transform tree corresponding to the point cloud to be encoded is generated based on the geometric information.
- The transform tree can include multiple layers; the root node can be considered the first layer of the transform tree, and the target layer can be any layer in the transform tree.
- The target distance can refer to the number of layers from the target layer to the root node.
- For example, the target distance between the second layer of the transform tree and the root node can be one layer, the target distance between the third layer and the root node can be two layers, the target distance between the fourth layer and the root node can be three layers, and so on.
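The target-distance definition above amounts to a simple offset (a minimal sketch; the function name is hypothetical):

```python
# The root is layer 1, so a layer's distance to the root is its index minus one.
def target_distance(layer_index):
    """layer_index: 1-based layer of the transform tree (1 = root layer)."""
    return layer_index - 1

assert target_distance(2) == 1  # second layer: one layer from the root
assert target_distance(4) == 3  # fourth layer: three layers from the root
```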
- Step 102: Determine a target prediction residual value based on the target distance.
- Step 103: Encode the target prediction residual value to obtain a first encoding result.
- The code stream of the point cloud to be encoded includes the first encoding result.
- When the target distance is less than or equal to the first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the node of the target layer; or
- when the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the node of the target layer; or
- when the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the node of the target layer.
- The first preset distance and the second preset distance may both be expressed as a number of layers from the root node.
- The first preset distance is smaller than the second preset distance.
- For example, the first preset distance may be 5 layers and the second preset distance 10 layers; or 7 and 13 layers; or 8 and 15 layers; and so on.
- This embodiment does not limit the values of the first preset distance and the second preset distance.
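The three-region selection described by Steps 101-103 can be sketched as follows, using the example thresholds d1 = 5 and d2 = 10 layers (the embodiment does not fix them):

```python
# Mode selection by distance from the root:
#   distance <= d1          -> inter-frame prediction (first residual value)
#   d1 < distance <= d2     -> chosen by rate-distortion optimization (second)
#   distance > d2           -> intra-frame prediction (third residual value)
def prediction_mode(distance, d1=5, d2=10):
    if distance <= d1:
        return "inter"
    if distance <= d2:
        return "rdo"
    return "intra"

assert prediction_mode(3) == "inter"
assert prediction_mode(7) == "rdo"
assert prediction_mode(12) == "intra"
```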
- When the target distance is small, the node of the target layer is an upper-layer node closer to the root node. Since upper-layer nodes of the transform tree are larger and contain more points, even if there is motion between two frames, these upper-layer nodes basically do not change much.
- The inter-frame attribute prediction residual may therefore be smaller than the intra-frame attribute prediction residual, so using inter-frame prediction in the upper layers gives a better effect.
- In the middle layers, the inter-frame or intra-frame prediction method can be determined by a rate-distortion optimization algorithm.
- When the target distance is large, the node of the target layer is a lower-layer node farther from the root node. Since lower-layer blocks are small and contain fewer points, a certain amount of motion between two frames may greatly change the number of points in each block, making inter-frame prediction unsuitable. Therefore, using intra-frame prediction in the lower layers gives a better effect.
- The inter-frame prediction method in the related art determines whether the current layer enables the inter-frame prediction mode by its distance (number of layers) from the root node: when the distance from the current layer to the root node is less than or equal to a set threshold, the corresponding inter-frame prediction algorithm is applied; when the distance is greater than the threshold, only the existing intra-frame prediction method is used. However, for some sequences, inter-frame prediction of some nodes beyond the threshold may yield smaller residuals, and thus a better prediction effect, than intra-frame prediction.
- This embodiment adopts a more flexible way of selecting between the intra-frame and inter-frame prediction modes, instead of simply using the number of layers from the root node to split the divided octree into two parts with inter-frame prediction for the upper-layer nodes and intra-frame prediction for the remaining nodes.
- This embodiment uses a more flexible method to determine the prediction mode used by the current node, so as to make the prediction residual smaller.
- In summary, a target prediction residual value is determined based on the target distance, and the target prediction residual value is encoded.
- When the target distance is less than or equal to the first preset distance, the target prediction residual value includes a first prediction residual value obtained by inter-frame prediction of the attribute information of the node of the target layer; when the target distance is greater than the first preset distance and less than or equal to a second preset distance, it includes a second prediction residual value obtained by predicting the attribute information of the node of the target layer; and when the target distance is greater than the second preset distance, it includes a third prediction residual value obtained by intra-frame prediction of the attribute information of the node of the target layer.
- the target prediction residual value includes a second prediction residual value obtained by predicting the attribute information of the node of the target layer based on a rate-distortion optimization algorithm.
- A first cost value corresponding to the inter-frame residual coefficient and a second cost value corresponding to the intra-frame residual coefficient may be determined based on a rate-distortion optimization algorithm, and the second prediction residual value may be determined based on the first cost value and the second cost value.
- When the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by predicting the attribute information of the nodes of the target layer based on a rate-distortion optimization algorithm. In this way, the costs of using intra-frame prediction and inter-frame prediction can be compared through the rate-distortion optimization algorithm, so that a suitable prediction method is selected to obtain the second prediction residual value.
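The three-band selection rule can be sketched as a small helper; this is an illustrative sketch rather than codec source, with `lvl` as the target distance and `thrh1`/`thrh2` standing for the first and second preset distances:

```python
def select_prediction_mode(lvl: int, thrh1: int, thrh2: int) -> str:
    """Pick the prediction mode for a transform-tree layer.

    lvl is the distance (number of layers) from the current layer to the
    root node; thrh1 <= thrh2 are the first and second preset distances.
    """
    if lvl <= thrh1:
        return "inter"   # upper layers: inter-frame prediction by default
    if lvl <= thrh2:
        return "rdo"     # middle layers: choose per node via RDO
    return "intra"       # lower layers: intra-frame prediction by default
```

For example, with `thrh1=2` and `thrh2=5`, layers 0..2 use inter-frame prediction, layers 3..5 are decided per node by RDO, and deeper layers use intra-frame prediction.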
- determining a target prediction residual value based on the target distance includes:
- the target prediction residual value includes the second prediction residual value.
- The cost may also be described as an expense; the rate-distortion cost is Cost = D + λ × R, where:
- D (distortion): measured by the SAD (Sum of Absolute Differences) of the prediction residual coefficients;
- R (rate: bit rate): the number of bits required to encode the prediction residual coefficients;
- prediction residual coefficient: the intra-frame residual coefficient or the inter-frame residual coefficient;
- λ: the value of λ depends on the quantization parameter QP.
- the process of determining the inter-frame residual coefficients of the attribute information of the nodes of the target layer may include: determining the inter-frame prediction value of the attribute information of the nodes of the target layer, transforming the inter-frame prediction value to obtain the inter-frame transformation coefficient, and obtaining the inter-frame residual coefficient based on the inter-frame transformation coefficient; or, may include: if there is a transformation coefficient of a reference node having the same position as the node of the target layer in the reference frame of the point cloud to be encoded, then determining the transformation coefficient of the reference node having the same position as the inter-frame transformation coefficient of the attribute information of the node, and determining the inter-frame residual coefficient of the attribute information of the node based on the transformation coefficient of the node and the inter-frame transformation coefficient of the attribute information of the node.
- Determining the intra-frame residual coefficient of the attribute information of the nodes of the target layer may include: determining the intra-frame prediction value of the attribute information of the nodes of the target layer, transforming the intra-frame prediction value to obtain the intra-frame transform coefficient, and obtaining the intra-frame residual coefficient based on the intra-frame transform coefficient.
- Inter-frame residual coefficients and intra-frame residual coefficients of the attribute information of the nodes of the target layer are determined; based on the rate-distortion optimization algorithm, a first cost value corresponding to the inter-frame residual coefficient and a second cost value corresponding to the intra-frame residual coefficient are determined; and based on the first cost value and the second cost value, a second prediction residual value is determined.
- the cost of using intra-frame prediction and inter-frame prediction can be determined through the rate-distortion optimization algorithm, so as to select a suitable prediction method for prediction processing to obtain the second prediction residual value, so that the attribute prediction residual is small and the coding efficiency is high.
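The RDO comparison might be sketched as below, with the SAD of the residual coefficients as the distortion term D and a crude magnitude-based bit count standing in for the rate term R (a real encoder would use its entropy coder's rate estimate; `estimate_bits` and the default λ are illustrative assumptions):

```python
def sad(coeffs):
    # distortion term D: sum of absolute differences of the residual coefficients
    return sum(abs(c) for c in coeffs)

def estimate_bits(coeffs):
    # crude rate proxy R: bits needed for each coefficient magnitude
    return sum(abs(int(c)).bit_length() + 1 for c in coeffs)

def choose_residual(inter_res, intra_res, lam=1.0):
    """Return ('inter'|'intra', residual) with the lower Cost = D + lam * R."""
    cost_inter = sad(inter_res) + lam * estimate_bits(inter_res)
    cost_intra = sad(intra_res) + lam * estimate_bits(intra_res)
    if cost_inter < cost_intra:
        return "inter", inter_res
    # ties may be resolved either way; this sketch falls back to intra
    return "intra", intra_res
```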
- When the first cost value is less than the second cost value, the second prediction residual value is the inter-frame residual coefficient;
- when the second cost value is less than the first cost value, the second prediction residual value is the intra-frame residual coefficient;
- when the first cost value is equal to the second cost value, the second prediction residual value is either the inter-frame residual coefficient or the intra-frame residual coefficient.
- The first cost value being equal to the second cost value indicates that the cost of encoding the intra-frame residual coefficient is the same as the cost of encoding the inter-frame residual coefficient, and either the intra-frame prediction mode or the inter-frame prediction mode can be selected.
- When the first cost value is less than the second cost value, the second prediction residual value is the inter-frame residual coefficient; when the second cost value is less than the first cost value, the second prediction residual value is the intra-frame residual coefficient; when the first cost value is equal to the second cost value, the second prediction residual value is the inter-frame residual coefficient or the intra-frame residual coefficient.
- the code stream of the point cloud to be encoded also includes a second encoding result, and the second encoding result is used to represent that the second prediction residual value is the intra-frame residual coefficient or the second prediction residual value is the inter-frame residual coefficient.
- a flag can be used to mark the prediction mode used by the node, and the second encoding result can be the encoding result of the flag.
- A binary flag may be used to mark the prediction mode used by the node: when the flag is 1, it indicates that the node uses the intra-frame prediction mode and the second prediction residual value is the intra-frame residual coefficient; when the flag is 0, it indicates that the node uses the inter-frame prediction mode and the second prediction residual value is the inter-frame residual coefficient. Alternatively, the meanings of 0 and 1 may be swapped.
- the code stream of the point cloud to be encoded also includes a second encoding result, and the second encoding result is used to characterize that the second prediction residual value is the intra-frame residual coefficient or the second prediction residual value is the inter-frame residual coefficient, so that the decoding end can determine whether the prediction mode is an intra-frame prediction mode or an inter-frame prediction mode through the second encoding result, and thus select a suitable prediction mode for decoding.
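The flag convention can be captured in a pair of trivial helpers; the 1-means-intra mapping below is the first convention mentioned above, and the opposite mapping is equally valid:

```python
def mode_to_flag(mode: str) -> int:
    # convention: 1 = intra-frame prediction, 0 = inter-frame prediction
    return 1 if mode == "intra" else 0

def flag_to_mode(flag: int) -> str:
    # inverse mapping used by the decoder to pick the prediction mode
    return "intra" if flag == 1 else "inter"
```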
- the determining of the inter-frame residual coefficients of the attribute information of the node of the target layer includes:
- An inter-frame residual coefficient is obtained based on the inter-frame transform coefficient.
- the reference frame data of the first data type may be encoded and reconstructed point cloud frame data.
- the reference frame data of the first data type may include attribute information and geometric information.
- the transforming the inter-frame prediction value to obtain the inter-frame transform coefficient may include performing a RAHT transform on the inter-frame prediction value to obtain the inter-frame transform coefficient.
- the obtaining of the inter-frame residual coefficient based on the inter-frame transform coefficient may include performing a RAHT transform on the attribute information of the node to obtain the transform coefficient of the node; and subtracting the inter-frame transform coefficient from the transform coefficient of the node to obtain the inter-frame residual coefficient.
- the inter-frame residual coefficient may also be described as an inter-frame prediction residual coefficient.
- the inter-frame prediction value of the attribute information of the node of the target layer is determined, and the inter-frame prediction value is transformed to obtain the inter-frame transformation coefficient; and the inter-frame residual coefficient is obtained based on the inter-frame transformation coefficient.
- the inter-frame prediction value is transformed to obtain the inter-frame transformation coefficient, and then the inter-frame residual coefficient is obtained.
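The transform-then-subtract pipeline can be illustrated with a stand-in transform: a single unweighted Haar butterfly takes the place of the full weighted RAHT, purely for illustration. The encoder transmits the coefficient difference; the decoder adds the same prediction coefficients back and inverse-transforms:

```python
import math

def haar(a, b):
    # stand-in for RAHT: one unweighted Haar butterfly -> (DC, AC)
    s = 1 / math.sqrt(2)
    return (s * (a + b), s * (a - b))

def inverse_haar(dc, ac):
    s = 1 / math.sqrt(2)
    return (s * (dc + ac), s * (dc - ac))

# encoder side: residual = T(attr) - T(prediction)
attr, pred = (10.0, 6.0), (9.0, 7.0)
t_attr, t_pred = haar(*attr), haar(*pred)
residual = (t_attr[0] - t_pred[0], t_attr[1] - t_pred[1])

# decoder side: T(attr) = residual + T(prediction), then inverse transform
rec_coeff = (residual[0] + t_pred[0], residual[1] + t_pred[1])
rec_attr = inverse_haar(*rec_coeff)
assert all(abs(r - a) < 1e-9 for r, a in zip(rec_attr, attr))
```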
- the method before determining the inter-frame prediction value of the attribute information of the node of the target layer, the method further includes:
- the determining the inter-frame prediction value of the attribute information of the node of the target layer includes:
- An inter-frame prediction value of attribute information of a node of the target layer is determined based on the prediction tree and the transform tree.
- the reconstruction of the reference frame can be realized by reconstructing the points in the reference frame corresponding to the point cloud to be encoded. For each point, there will be a reconstructed reference point with the same position coordinates.
- The attribute of this reference point is the sum of the attributes of the points within a certain range of the reference frame, obtained according to the following rule: if a point in the reference frame is closer (in Morton code distance) to the current point of the current frame than to the next point of the current frame, this point is used to reconstruct the current point, its attribute value is accumulated, and the weight is incremented by one; this continues until the condition is no longer met, after which the next reconstruction point is processed.
- the starting and ending distances for judgment are specified.
- If the distance is too large, the attribute values are considered too different and unsuitable for reconstruction. Specifically: only points in the reference frame whose distance to the first point of the current frame is not greater than 64 will be used for reconstruction, and only points in the reference frame whose distance to the last point of the current frame is not greater than 64 will be used for reconstruction.
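The accumulation rule can be sketched as follows, using 1-D Morton codes and, for simplicity, applying the distance bound per point (the text applies the 64-unit bound relative to the first and last points of the current frame); all names and the tuple layout are illustrative:

```python
def reconstruct_reference_point(cur_pt, next_pt, ref_points, max_dist=64):
    """Accumulate reference-frame attributes for one current-frame point.

    Each entry of ref_points is (morton_code, attribute).  A reference
    point is used while it is closer (in Morton-code distance) to the
    current point than to the next point of the current frame and within
    the allowed range; its attribute is accumulated and the weight is
    incremented by one.  Accumulation stops once the condition fails.
    """
    attr_sum, weight = 0, 0
    for code, attr in ref_points:
        if abs(code - cur_pt) < abs(code - next_pt) and abs(code - cur_pt) <= max_dist:
            attr_sum += attr
            weight += 1
        else:
            break  # condition no longer met; move on to the next reconstruction point
    return attr_sum, weight
```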
- the operation of constructing a prediction tree from bottom to top can be performed on the reconstructed reference frame, and the tree structure of the constructed prediction tree is the same as the tree structure of the transformation tree.
- Determining the inter-frame prediction value of the attribute information of the nodes of the target layer based on the prediction tree and the transform tree may include: obtaining, through the prediction tree and the transform tree, the attribute values of the nodes of each layer and the corresponding attribute values of the nodes of each layer of the reconstructed reference frame; if the attribute value of the child node at the corresponding position in the reference frame is not 0 and is within 20% to 250% of the attribute value of the parent node in the current frame, using the attribute value of that reference-frame child node as the inter-frame prediction value of the attribute information of the child node at the corresponding position in the current frame; otherwise, using the intra-frame prediction value of the current frame as the inter-frame prediction value of the attribute information.
- the points in the reference frame corresponding to the to-be-encoded point cloud are reconstructed; a prediction tree is constructed based on the reconstructed reference frame, and the tree structure of the prediction tree is the same as the tree structure of the transform tree; and the inter-frame prediction value of the attribute information of the node of the target layer is determined based on the prediction tree and the transform tree.
- the inter-frame prediction value of the attribute information of the node of the target layer is determined by the prediction tree constructed by the reconstructed reference frame.
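A minimal sketch of the 20%-250% validity check described above, assuming non-negative attribute values; the function and argument names are illustrative:

```python
def inter_prediction_value(ref_child_attr, cur_parent_attr, intra_value):
    """Apply the 20%-250% validity check on the co-located reference child.

    If the reference-frame child attribute is non-zero and lies within
    20% to 250% of the current frame's parent attribute, it is used as
    the inter-frame prediction value; otherwise the intra-frame
    prediction value is used instead.
    """
    if ref_child_attr != 0 and \
       0.2 * cur_parent_attr <= ref_child_attr <= 2.5 * cur_parent_attr:
        return ref_child_attr
    return intra_value
```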
- the reference frame data of the first data type is encoded and reconstructed point cloud frame data.
- the determining of the inter-frame residual coefficients of the attribute information of the node of the target layer includes:
- the transformation coefficient of the reference node having the same position is determined as the inter-frame transformation coefficient of the attribute information of the node
- An inter-frame residual coefficient of the attribute information of the node is determined based on the transformation coefficient of the node and the inter-frame transformation coefficient of the attribute information of the node.
- the determining of the inter-frame residual coefficient of the attribute information of the node based on the transformation coefficient of the node and the inter-frame transformation coefficient of the attribute information of the node may include performing RAHT transformation on the attribute information of the node to obtain the transformation coefficient of the node; and subtracting the inter-frame transformation coefficient of the attribute information of the node from the transformation coefficient of the node to obtain the inter-frame residual coefficient.
- the inter-frame residual coefficient may also be described as an inter-frame prediction residual coefficient.
- the transformation coefficient of the reference node having the same position is determined as the inter-frame transformation coefficient of the attribute information of the node; based on the transformation coefficient of the node and the inter-frame transformation coefficient of the attribute information of the node, the inter-frame residual coefficient of the attribute information of the node is determined.
- the inter-frame transformation coefficient can be directly obtained through the reference frame data, and then the inter-frame residual coefficient can be obtained.
- the reference frame data of the second data type is reconstructed transformation coefficients of an encoded point cloud frame.
- the code stream of the point cloud to be encoded further includes a third encoding result
- the third encoding result is an encoding result of at least one of the first preset distance and the second preset distance.
- FIG. 6 is a flow chart of a point cloud decoding processing method provided in an embodiment of the present application, which can be applied to a decoding end.
- the point cloud decoding processing method includes the following steps:
- Step 201: determine a target distance between a target layer and the root node of a transform tree corresponding to the point cloud to be decoded.
- Step 202: decode the first encoding result in the code stream of the point cloud to be decoded to obtain a target prediction residual value.
- Step 203: acquire attribute information of the nodes of the target layer based on the target prediction residual value and the target distance.
- When the target distance is less than or equal to the first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the nodes of the target layer;
- or, when the target distance is greater than the first preset distance and less than or equal to a second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the nodes of the target layer;
- or, when the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the nodes of the target layer.
- the target prediction residual value includes a second prediction residual value obtained by predicting the attribute information of the node of the target layer based on a rate-distortion optimization algorithm.
- the target prediction residual value includes the second prediction residual value
- the code stream of the point cloud to be decoded also includes a second encoding result
- the second encoding result is used to characterize that the second prediction residual value is an intra-frame residual coefficient or the second prediction residual value is an inter-frame residual coefficient
- the acquiring the attribute information of the node of the target layer based on the target prediction residual value and the target distance includes:
- the second encoding result represents that the second prediction residual value is an intra-frame residual coefficient
- determining the attribute information of the node of the target layer based on the intra-frame transform coefficient of the node of the target layer and the second prediction residual value
- attribute information of the node of the target layer is determined based on the inter-frame transform coefficient of the node of the target layer and the second prediction residual value.
- determining the attribute information of the node of the target layer based on the inter-frame transform coefficient of the node of the target layer and the second prediction residual value includes:
- the attribute information of the node of the target layer is determined based on the inter-frame transform coefficient and the second prediction residual value.
- the method before determining the inter-frame prediction value of the attribute information of the node of the target layer, the method further includes:
- the determining of the inter-frame prediction value of the attribute information of the node of the target layer includes:
- An inter-frame prediction value of attribute information of a node of the target layer is determined based on the prediction tree and the transform tree.
- the reference frame data of the first data type is decoded and reconstructed point cloud frame data.
- determining the attribute information of the node of the target layer based on the inter-frame transform coefficient of the node of the target layer and the second prediction residual value includes:
- the transformation coefficient of the reference node having the same position is determined as the inter-frame transformation coefficient of the attribute information of the node
- the attribute information of the node is obtained based on the second prediction residual value and the inter-frame transform coefficient of the attribute information of the node.
- the reference frame data of the second data type is reconstructed transform coefficients of a decoded point cloud frame.
- the code stream of the point cloud to be decoded further includes a third encoding result
- the method further includes:
- the third encoding result is decoded to obtain at least one of the first preset distance and the second preset distance.
- this embodiment is an implementation of the decoding side corresponding to the embodiment shown in Figure 4. Its specific implementation can refer to the relevant description of the embodiment shown in Figure 4. In order to avoid repeated description, this embodiment will not be repeated, and the same beneficial effects can be achieved.
- Embodiment 1:
- This embodiment is executed by the encoding end.
- This embodiment proposes a RAHT inter-frame prediction scheme based on dual thresholds, by introducing two syntax elements thrh1 (i.e., the first preset distance) and thrh2 (i.e., the second preset distance) in the parameter set (the parameter set can be an attribute parameter set APS, or an attribute data unit ADU, or other parameter sets), and further subdividing the constructed transform tree into three levels: upper layer, middle layer, and lower layer, and using different prediction modes for nodes in different layers.
- thrh1 and thrh2 are each a number of layers from the root node, and generally thrh1 < thrh2.
- Let the distance from the current node layer to the root node be lvl (i.e., the target distance).
- Step (11) Reorder the point cloud and construct an N-layer transformation tree structure for the reordered point cloud data based on the geometric distance. A bottom-up construction method is adopted.
- RAHT: Region-Adaptive Hierarchical Transform.
- When predicting the nodes of the transform tree, one of two prediction methods can be selected: intra-frame prediction or inter-frame prediction. As shown in Figure 7, the specific selection method is as follows:
- the inter-frame prediction mode is used by default in the upper layer.
- the RDO algorithm is used to determine the cost of using intra-frame and inter-frame prediction.
- If the cost of inter-frame prediction is smaller, the current node uses the inter-frame prediction mode; otherwise, it uses the intra-frame prediction mode.
- A flag indicates the prediction mode used by the current node; for example, 1 indicates the inter-frame prediction mode and 0 indicates the intra-frame prediction mode, or vice versa.
- the intra prediction mode is used by default in the lower layer.
- the inter-frame prediction used needs to select different prediction methods according to different reference frame data types.
- If the reference frame data is point cloud frame information that has been encoded and reconstructed, including attribute information and geometric information, the same bottom-up transform tree construction operation as for the current frame can be performed, and the reference frame node attribute value corresponding to the current frame node can be generated and used as the inter-frame prediction value.
- the attributes of this reference point are the sum of the attributes of the points within a certain range in the reference frame obtained according to certain rules.
- The rule is: if a point in the reference frame is closer (in Morton code distance) to the current point of the current frame than to the next point of the current frame, this point is used to reconstruct the current point, its attribute value is accumulated, and the weight is incremented by one; when the condition is no longer met, the next reconstruction point is determined.
- start and end distances are specified. If the distance is too far, it will be considered that the attribute values are too different and not suitable for reconstruction.
- If the current node is the root node, no prediction is performed, and the attribute information of the node is directly subjected to RAHT transformation to obtain the DC coefficient and AC coefficients.
- the DC coefficient prediction residual is used to replace the DC coefficient obtained by direct transformation.
- the DC coefficient prediction residual is: the DC coefficient of the root node of the current frame minus the DC coefficient of the root node of the reference frame.
- DC_residual = DC_current − DC_reference
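The root-node DC handling above reduces to a subtraction at the encoder and the matching addition at the decoder; a minimal sketch with illustrative function names:

```python
def encode_root_dc(dc_current: float, dc_reference: float) -> float:
    # DC_residual = DC_current - DC_reference (transmitted in place of DC_current)
    return dc_current - dc_reference

def decode_root_dc(dc_residual: float, dc_reference: float) -> float:
    # DC_current = DC_residual + DC_reference
    return dc_residual + dc_reference
```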
- If the current node is not the root node, it is necessary to determine whether to predict the current eight child nodes; for details, refer to the description of the second step of the intra-frame RAHT technique in the content description of the GPCC codec. If no prediction is performed, the original attribute values are directly subjected to RAHT transformation to obtain AC coefficients, which are then quantized and entropy coded.
- the related method of the third step weighted prediction of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each placeholder sub-node of the current 2*2*2 node.
- Attr_predicted = Attr_predicted_inter if the co-located reference child attribute is valid (non-zero and within 20% to 250% of the current parent attribute); otherwise, Attr_predicted = Attr_predicted_intra.
- the related method of the third step weighted prediction of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each placeholder sub-node of the current 2*2*2 node.
- the RDO algorithm is used to determine which prediction mode to use.
- the specific process is as follows:
- the inter-frame prediction value and the intra-frame prediction value are respectively subjected to RAHT transformation to obtain corresponding transformation coefficients.
- The transform coefficients of the inter-frame prediction value and of the intra-frame prediction value are each subtracted from the RAHT transform coefficient of the true attribute value to obtain the inter-frame prediction residual coefficient and the intra-frame prediction residual coefficient, respectively.
- D (distortion): measured by the SAD (Sum of Absolute Differences) of the prediction residual coefficients.
- R (rate: bit rate) is the number of bits required to encode the prediction residual coefficients.
- If Cost_inter is less than Cost_intra, the inter-frame prediction residual is used; otherwise, the intra-frame prediction residual is used.
- A flag is used to mark this; for example, 1 marks the use of the inter-frame prediction residual and 0 marks the use of the intra-frame prediction residual, or vice versa.
- the determined AC residual coefficients are quantized and entropy coded.
- the related method of the third step weighted prediction of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each placeholder sub-node of the current 2*2*2 node.
- The intra-frame prediction value is used as the attribute prediction value; the original attribute value and the intra-frame attribute prediction value are each subjected to RAHT transformation and their difference is taken, and the residual is quantized and entropy coded.
- If the reference frame data is the reconstructed transform coefficients and a transform coefficient of the co-located reference node exists, it can be directly used as the inter-frame prediction transform coefficient and compared with the transform coefficient of the intra-frame prediction attribute value in the current frame node.
- the related method of the third step weighted prediction of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each placeholder sub-node of the current 2*2*2 node.
- the RDO algorithm is used to determine which prediction mode to use.
- the specific process is as follows:
- the original attribute value and the intra-frame prediction value of the current node are transformed respectively to obtain the corresponding transformation coefficients, and the intra-frame prediction residual coefficients are obtained by difference.
- If the transform coefficient of the co-located reference node exists, the inter-frame prediction residual coefficient is obtained by subtracting it from the actual RAHT transform coefficient of the current node; if it does not exist, the intra-frame prediction residual coefficient is used directly, and subsequent quantization and entropy coding are performed.
- D (distortion): measured by the SAD (Sum of Absolute Differences) of the prediction residual coefficients.
- R (rate: bit rate) is the number of bits required to encode the prediction residual coefficients.
- λ depends on QP and is calculated by a corresponding formula (other formulas are also possible).
- If Cost_inter is less than Cost_intra, the inter-frame prediction residual is used; otherwise, the intra-frame prediction residual is used.
- A flag is used to mark this; for example, 1 marks the use of the inter-frame prediction residual and 0 marks the use of the intra-frame prediction residual, or vice versa.
- the determined AC residual coefficients are quantized and entropy encoded.
- the related method of the third step weighted prediction of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each placeholder sub-node of the current 2*2*2 node.
- The intra-frame prediction value is used as the attribute prediction value; the original attribute value and the intra-frame attribute prediction value are each subjected to RAHT transformation and their difference is taken, and the residual is quantized and entropy coded.
- Embodiment 2:
- This embodiment is executed by a decoding end.
- Step (21) First, decode the input code stream to obtain syntax elements thrh1 and thrh2.
- the quantized coded transform coefficients or coefficient residuals are entropy decoded and dequantized from the input bitstream to obtain the reconstructed transform coefficients or transform residual coefficients.
- the flag bit of the prediction mode used by the middle-level node is decoded from the input bitstream.
- Step (22) Reorder the point cloud and construct an N-layer transformation tree structure for the reordered point cloud data based on the geometric distance. A bottom-up construction method is adopted.
- Step (23) parse the tree from top to bottom. Starting from the root node, perform upsampling prediction and RAHT inverse transformation layer by layer to obtain the reconstructed attribute value of the node.
- When predicting the nodes of the transform tree, one of two prediction methods can be selected: intra-frame prediction or inter-frame prediction. As shown in Figure 8, the specific selection method is as follows:
- the corresponding prediction value obtained according to the flag is added to the residual transform coefficient obtained by decoding and reconstruction to reconstruct the transform coefficient.
- the intra prediction mode is used by default in the lower layer.
- the inter-frame prediction used needs to select different prediction methods according to different reference frame data types.
- If the reference frame data is point cloud frame information that has been decoded and reconstructed, including attribute information and geometric information, the same bottom-up transform tree construction operation as for the current frame can be performed, and the reference frame node attribute value corresponding to the current frame node can be generated and used as the inter-frame prediction value.
- the specific process has been introduced at the encoding end and will not be repeated here.
- When the inter-frame prediction value is obtained, it is added to the decoded transform residual coefficient to obtain the reconstructed transform coefficient, and the RAHT inverse transform is performed to obtain the attribute reconstruction value of each child node.
- The decoded DC coefficient residual of the root node is added to the DC coefficient of the reference frame root node to obtain the DC coefficient of the current frame; the reconstructed AC coefficients of the root node are used, and then the RAHT inverse transform is performed to obtain the attribute reconstruction value of each child node;
- the current node is not the root node, it is necessary to determine whether to predict the current eight child nodes: for details, please refer to the description of the second step of the intra-frame RAHT technology in the content description section of the GPCC codec.
- the DC coefficient directly inherits the attribute reconstruction value of the parent node, and uses the reconstructed AC coefficient of the current node to perform RAHT inverse transformation to obtain the attribute reconstruction value of each child node;
- the related method of the third step weighted prediction of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each placeholder sub-node of the current 2*2*2 node.
- the attribute prediction value obtained by the 2*2*2 node is transformed by RAHT to obtain the AC coefficient of the attribute prediction value.
- the DC coefficient inherits the attribute reconstruction value of the parent node, and the AC coefficient is the reconstructed AC coefficient residual plus the AC coefficient of the attribute prediction value.
- the RAHT inverse transformation is performed to obtain the attribute reconstruction value of each child node.
- the DC coefficient directly inherits the attribute reconstruction value of the parent node, and uses the reconstructed AC coefficient of the current node to perform RAHT inverse transformation to obtain the attribute reconstruction value of each child node;
- the related method of the third step weighted prediction of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each placeholder sub-node of the current 2*2*2 node.
- RAHT transform is performed on the inter-frame prediction value and the intra-frame prediction value respectively to obtain the corresponding AC coefficients.
- the DC coefficient inherits the attribute reconstruction value of the parent node, and the AC coefficient is the reconstructed AC coefficient residual plus the AC coefficient corresponding to the flag. Then, the RAHT inverse transform is performed to obtain the attribute reconstruction value of each child node.
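The per-node coefficient reconstruction for a predicted middle-layer node (DC inherited from the parent, each AC coefficient recovered as the decoded residual plus the prediction's AC coefficient) can be sketched as follows; names are illustrative:

```python
def reconstruct_node_coeffs(parent_recon_dc, ac_residuals, predicted_acs):
    """Rebuild a node's transform coefficients at the decoder.

    The DC coefficient inherits the parent node's attribute reconstruction
    value; each AC coefficient is the reconstructed AC residual plus the
    AC coefficient of the attribute prediction value.
    """
    acs = [res + pred for res, pred in zip(ac_residuals, predicted_acs)]
    return parent_recon_dc, acs
```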
- the DC coefficient directly inherits the attribute reconstruction value of the parent node, and uses the reconstructed AC coefficient of the current node to perform RAHT inverse transformation to obtain the attribute reconstruction value of each child node;
- the weighted prediction method of the third step of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each occupied child node of the current 2*2*2 node.
- the intra-frame attribute prediction value obtained by the 2*2*2 node is subjected to RAHT transformation to obtain the AC coefficient of the attribute prediction value.
- the DC coefficient inherits the attribute reconstruction value of the parent node, and the AC coefficient is the reconstructed AC coefficient residual plus the AC coefficient of the attribute prediction value. After that, the RAHT inverse transform is performed to obtain the attribute reconstruction value of each child node.
- the transform coefficient can be directly used as the transform coefficient for inter-frame prediction and compared with the transform coefficient of the intra-frame prediction attribute value in the current frame node.
- the decoded reconstruction transform coefficients are directly subjected to RAHT inverse transformation to obtain the attribute reconstruction value of each child node.
- the DC coefficient directly inherits the attribute reconstruction value of the parent node, and uses the reconstructed AC coefficient of the current node to perform RAHT inverse transformation to obtain the attribute reconstruction value of each child node;
- the weighted prediction method of the third step of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each occupied child node of the current 2*2*2 node.
- This embodiment proposes a dual-threshold inter-frame prediction method, which divides the octree into three levels: upper, middle, and lower. Different prediction methods are used for the different levels according to the node characteristics of each level. In particular, for nodes in the middle level, the RDO algorithm is used to determine whether the current node uses intra-frame or inter-frame prediction. In addition, the two set thresholds and the flags need to be transmitted in the bitstream.
- the three-layer structure of this embodiment is more flexible and has higher coding efficiency.
- this embodiment is more flexible. Considering that some nodes in the middle layer are better suited to inter-frame prediction, a double threshold is used to divide the octree into three levels; for nodes in the middle layer, the RDO algorithm determines whether to adopt the inter-frame prediction mode, and a flag marks the choice. This can reduce the bit rate and improve the coding efficiency.
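A minimal sketch of the dual-threshold layer classification described above, assuming `depth` is a layer's distance from the root of the transform tree, `d1`/`d2` are the two preset distances, and the middle-layer RDO costs are computed elsewhere (all names are illustrative, not the codec's actual syntax):

```python
def select_prediction_mode(depth, d1, d2, rdo_cost_inter=None, rdo_cost_intra=None):
    # Dual-threshold split of transform-tree layers:
    #   upper  (depth <= d1):       inter-frame prediction only, no flag
    #   middle (d1 < depth <= d2):  RDO decides per node, choice signalled by a flag
    #   lower  (depth > d2):        intra-frame prediction only, no flag
    if depth <= d1:
        return "inter", None
    if depth <= d2:
        # RDO costs are required for middle-layer nodes.
        if rdo_cost_inter <= rdo_cost_intra:
            return "inter", 1  # flag written to the bitstream
        return "intra", 0
    return "intra", None
```

Only the middle layer spends a flag bit per node; the upper and lower layers infer the mode from the thresholds alone, which is where the bit-rate saving comes from.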
- the point cloud coding processing method provided in the embodiment of the present application can be executed by a point cloud coding processing device, or by a control module in the point cloud coding processing device for executing the point cloud coding processing method.
- the point cloud coding processing device provided in the embodiment of the present application is described by taking, as an example, the case where the point cloud coding processing device executes the point cloud coding processing method.
- FIG. 9 is a structural diagram of a point cloud coding processing device provided in an embodiment of the present application.
- the point cloud coding processing device 300 includes:
- the first determination module 301 is used to determine the target distance between the target layer and the root node of the transform tree corresponding to the point cloud to be encoded;
- a second determination module 302 is used to determine a target prediction residual value based on the target distance
- the encoding module 303 is used to encode the target prediction residual value to obtain a first encoding result
- the code stream of the point cloud to be encoded includes the first encoding result
- the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the node of the target layer;
- the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the node of the target layer;
- the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the node of the target layer.
- the target prediction residual value includes a second prediction residual value obtained by predicting the attribute information of the node of the target layer based on a rate-distortion optimization algorithm.
- the second determining module includes:
- a first determining unit configured to determine an inter-frame residual coefficient and an intra-frame residual coefficient of the attribute information of the node of the target layer when the target distance is greater than the first preset distance and less than or equal to the second preset distance;
- a second determining unit configured to determine, based on a rate-distortion optimization algorithm, a first cost value corresponding to the inter-frame residual coefficient and a second cost value corresponding to the intra-frame residual coefficient
- a third determining unit configured to determine the second prediction residual value based on the first cost value and the second cost value
- the target prediction residual value includes the second prediction residual value.
- the second prediction residual value is the inter-frame residual coefficient
- the second prediction residual value is the intra-frame residual coefficient
- the second prediction residual value is the inter-frame residual coefficient or the intra-frame residual coefficient.
- the code stream of the point cloud to be encoded also includes a second encoding result, and the second encoding result is used to represent that the second prediction residual value is the intra-frame residual coefficient or the second prediction residual value is the inter-frame residual coefficient.
- the first determining unit includes:
- a determination subunit configured to determine, when the data type of the reference frame data of the to-be-encoded point cloud is the first data type, an inter-frame prediction value of the attribute information of the node of the target layer, and transform the inter-frame prediction value to obtain an inter-frame transformation coefficient;
- the acquisition subunit is used to acquire the inter-frame residual coefficient based on the inter-frame transform coefficient.
- the first determining unit is further configured to:
- the determining subunit is specifically used for:
- An inter-frame prediction value of attribute information of a node of the target layer is determined based on the prediction tree and the transform tree.
- the reference frame data of the first data type is encoded and reconstructed point cloud frame data.
- the first determining unit is specifically configured to:
- the transformation coefficient of the reference node having the same position is determined as the inter-frame transformation coefficient of the attribute information of the node
- An inter-frame residual coefficient of the attribute information of the node is determined based on the transformation coefficient of the node and the inter-frame transformation coefficient of the attribute information of the node.
- the reference frame data of the second data type is reconstructed transformation coefficients of an encoded point cloud frame.
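For the second data type, where the reference frame data consists of reconstructed transform coefficients, the co-located reference node's coefficient serves directly as the inter-frame transformation coefficient. A hedged sketch of the per-component residual computation and its decoder-side inverse (function names are illustrative):

```python
def inter_residual(node_coeffs, ref_coeffs):
    # Encoder side: residual = the node's transform coefficient minus the
    # co-located reference node's reconstructed coefficient, per component.
    return [c - r for c, r in zip(node_coeffs, ref_coeffs)]

def inter_reconstruct(residual, ref_coeffs):
    # Decoder side: coefficient = decoded residual plus the same
    # co-located reference coefficient.
    return [d + r for d, r in zip(residual, ref_coeffs)]
```

Because both sides use the same reconstructed reference coefficients, the decoder recovers the node's coefficients exactly (up to quantization of the residual, omitted here).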
- the code stream of the point cloud to be encoded further includes a third encoding result
- the third encoding result is an encoding result of at least one of the first preset distance and the second preset distance.
- the point cloud coding processing device in the embodiment of the present application may be an apparatus, an apparatus or electronic device having an operating system, or a component, integrated circuit, or chip in a terminal.
- the device or electronic device can be a mobile terminal or a non-mobile terminal.
- the mobile terminal may include, but is not limited to, the types of terminals listed above, and the non-mobile terminal may be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine, a self-service machine, or the like, which is not specifically limited in the embodiment of the present application.
- the point cloud coding processing device provided in the embodiment of the present application can implement each process implemented by the method embodiment of Figure 5 and achieve the same technical effect. To avoid repetition, it will not be repeated here.
- the point cloud decoding processing method provided in the embodiment of the present application can be executed by a point cloud decoding processing device, or by a control module in the point cloud decoding processing device for executing the point cloud decoding processing method.
- the point cloud decoding processing device provided in the embodiment of the present application is described by taking, as an example, the case where the point cloud decoding processing device executes the point cloud decoding processing method.
- FIG. 10 is a structural diagram of a point cloud decoding processing device provided in an embodiment of the present application.
- the point cloud decoding processing device 400 includes:
- a determination module 401 is used to determine a target distance between a target layer and a root node of a transform tree corresponding to a point cloud to be decoded;
- a decoding module 402 configured to decode a first encoding result in a bitstream of a point cloud to be decoded, and obtain a target prediction residual value
- An acquisition module 403 is used to acquire attribute information of the node of the target layer based on the target prediction residual value and the target distance;
- when the target distance is less than or equal to the first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the node of the target layer;
- the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the node of the target layer;
- the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the node of the target layer.
- the target prediction residual value includes a second prediction residual value obtained by predicting the attribute information of the node of the target layer based on a rate-distortion optimization algorithm.
- the target prediction residual value includes the second prediction residual value
- the code stream of the point cloud to be decoded also includes a second encoding result
- the second encoding result is used to represent that the second prediction residual value is an intra-frame residual coefficient or the second prediction residual value is an inter-frame residual coefficient
- the acquisition module includes:
- a decoding unit configured to decode the second encoding result when the target distance is greater than the first preset distance and less than or equal to a second preset distance
- a determination unit configured to determine attribute information of a node of the target layer based on an intra-frame transform coefficient of the node of the target layer and the second prediction residual value when it is determined that the second encoding result represents that the second prediction residual value is an intra-frame residual coefficient;
- when it is determined that the second encoding result represents that the second prediction residual value is an inter-frame residual coefficient, attribute information of the node of the target layer is determined based on the inter-frame transform coefficient of the node of the target layer and the second prediction residual value.
- the determining unit includes:
- a first determination subunit is used to determine the inter-frame prediction value of the attribute information of the node of the target layer when the data type of the reference frame data of the to-be-decoded point cloud is the first data type, and to transform the inter-frame prediction value to obtain an inter-frame transformation coefficient;
- the second determination subunit is used to determine the attribute information of the node of the target layer based on the inter-frame transformation coefficient and the second prediction residual value.
- the determining unit is further configured to:
- the first determining subunit is specifically used for:
- An inter-frame prediction value of attribute information of a node of the target layer is determined based on the prediction tree and the transform tree.
- the reference frame data of the first data type is decoded and reconstructed point cloud frame data.
- when the data type of the reference frame data of the point cloud to be decoded is the second data type and it is determined that the second encoding result represents that the second prediction residual value is an inter-frame residual coefficient, the determining unit is specifically configured to:
- the transformation coefficient of the reference node having the same position is determined as the inter-frame transformation coefficient of the attribute information of the node
- the attribute information of the node is obtained based on the second prediction residual value and the inter-frame transform coefficient of the attribute information of the node.
- the reference frame data of the second data type is reconstructed transform coefficients of a decoded point cloud frame.
- the code stream of the point cloud to be decoded further includes a third encoding result
- the decoding module is further used to:
- the third encoding result is decoded to obtain at least one of the first preset distance and the second preset distance.
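The middle-layer decoding flow described above — decode the per-node flag, then add the second prediction residual value to the transform coefficients of whichever prediction the flag selects — could be sketched as follows (illustrative names, assuming flag 0 means intra-frame and 1 means inter-frame):

```python
def decode_node_coeffs(flag, residual, intra_coeffs, inter_coeffs):
    # Middle-layer decoding: the decoded flag selects which prediction's
    # transform coefficients the decoded residual is added to, per component.
    base = intra_coeffs if flag == 0 else inter_coeffs
    return [b + r for b, r in zip(base, residual)]
```

The recovered coefficients are then fed to the inverse RAHT (together with the DC coefficient inherited from the parent node) to obtain the attribute reconstruction value of each child node.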
- the point cloud decoding processing device in the embodiment of the present application may be an apparatus, an apparatus or electronic device having an operating system, or a component, integrated circuit, or chip in a terminal.
- the device or electronic device can be a mobile terminal or a non-mobile terminal.
- the mobile terminal may include, but is not limited to, the types of terminals listed above, and the non-mobile terminal may be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine, a self-service machine, or the like, which is not specifically limited in the embodiment of the present application.
- the point cloud decoding processing device provided in the embodiment of the present application can implement the various processes implemented by the method embodiment of Figure 6 and achieve the same technical effect. To avoid repetition, it will not be repeated here.
- the embodiment of the present application further provides a communication device 500, including a processor 501 and a memory 502, wherein the memory 502 stores a program or instruction that can be run on the processor 501.
- the communication device 500 is an encoding end device
- the program or instruction is executed by the processor 501 to implement the various steps of the above-mentioned point cloud encoding processing method embodiment, and can achieve the same technical effect.
- the communication device 500 is a decoding end device
- the program or instruction is executed by the processor 501 to implement the various steps of the above-mentioned point cloud decoding processing method embodiment, and can achieve the same technical effect. To avoid repetition, it will not be repeated here.
- the embodiment of the present application also provides a terminal, including a processor and a communication interface, wherein the processor is used to: determine the target distance between the target layer and the root node of the transform tree corresponding to the point cloud to be encoded; determine the target prediction residual value based on the target distance; encode the target prediction residual value to obtain a first encoding result; wherein the code stream of the point cloud to be encoded includes the first encoding result; when the target distance is less than or equal to the first preset distance, the target prediction residual value includes a first prediction residual value obtained by inter-frame prediction of the attribute information of the node of the target layer; or when the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by predicting the attribute information of the node of the target layer; or when the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by intra-frame prediction of the attribute information of the node of the target layer.
- An embodiment of the present application also provides a terminal, including a processor and a communication interface, wherein the processor is used to: determine a target distance between a target layer and a root node of a transform tree corresponding to a point cloud to be decoded; decode a first encoding result in a code stream of the point cloud to be decoded to obtain a target prediction residual value; and obtain attribute information of the nodes of the target layer based on the target prediction residual value and the target distance; wherein, when the target distance is less than or equal to a first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the nodes of the target layer; or when the target distance is greater than the first preset distance and less than or equal to a second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the nodes of the target layer; or when the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the nodes of the target layer.
- FIG. 12 is a schematic diagram of the hardware structure of a terminal implementing an embodiment of the present application.
- the terminal 600 includes, but is not limited to, at least some of the following components: a radio frequency unit 601, a network module 602, an audio output unit 603, an input unit 604, a sensor 605, a display unit 606, a user input unit 607, an interface unit 608, a memory 609, and a processor 610.
- the terminal 600 may also include a power source (such as a battery) for supplying power to each component, and the power source may be logically connected to the processor 610 through a power management system, so as to implement functions such as managing charging, discharging, and power consumption management through the power management system.
- the terminal structure shown in FIG. 12 does not constitute a limitation on the terminal, and the terminal may include more or fewer components than shown in the figure, or combine certain components, or arrange the components differently, which will not be described in detail here.
- the input unit 604 may include a graphics processing unit (GPU) 6041 and a microphone 6042, and the graphics processing unit 6041 processes image data of still pictures or videos obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode.
- the display unit 606 may include a display panel 6061, and the display panel 6061 may be configured in the form of a liquid crystal display, an organic light emitting diode, etc.
- the user input unit 607 includes a touch panel 6071 and at least one of other input devices 6072.
- the touch panel 6071 is also called a touch screen.
- the touch panel 6071 may include two parts: a touch detection device and a touch controller.
- Other input devices 6072 may include, but are not limited to, a physical keyboard, function keys (such as a volume control key, a switch key, etc.), a trackball, a mouse, and a joystick, which will not be repeated here.
- after receiving downlink data from the network side device, the radio frequency unit 601 can transmit the data to the processor 610 for processing; in addition, the radio frequency unit 601 can send uplink data to the network side device.
- the RF unit 601 includes but is not limited to an antenna, an amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, etc.
- the memory 609 can be used to store software programs or instructions and various data.
- the memory 609 may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, wherein the first storage area may store an operating system, an application program or instruction required for at least one function (such as a sound playback function, an image playback function, etc.), etc.
- the memory 609 may include a volatile memory or a non-volatile memory, or the memory 609 may include both volatile and non-volatile memories.
- the non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory.
- the volatile memory may be a random access memory (RAM), a static random access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), a double data rate synchronous dynamic random access memory (DDR SDRAM), an enhanced synchronous dynamic random access memory (ESDRAM), a synchlink dynamic random access memory (SLDRAM), or a direct Rambus random access memory (DRRAM).
- the memory 609 in the embodiment of the present application includes, but is not limited to, these and any other suitable types of memory.
- the processor 610 may include one or more processing units; optionally, the processor 610 integrates an application processor and a modem processor, wherein the application processor mainly processes operations related to an operating system, a user interface, and application programs, and the modem processor mainly processes wireless communication signals, such as a baseband processor. It is understandable that the modem processor may not be integrated into the processor 610.
- the processor 610 is configured to:
- the code stream of the point cloud to be encoded includes the first encoding result
- the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the node of the target layer;
- the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the node of the target layer;
- the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the node of the target layer.
- the target prediction residual value includes a second prediction residual value obtained by predicting the attribute information of the node of the target layer based on a rate-distortion optimization algorithm.
- the processor 610 is specifically configured to:
- the target prediction residual value includes the second prediction residual value.
- the second prediction residual value is the inter-frame residual coefficient
- the second prediction residual value is the intra-frame residual coefficient
- the second prediction residual value is the inter-frame residual coefficient or the intra-frame residual coefficient.
- the code stream of the point cloud to be encoded further includes a second encoding result, and the second encoding result is used to represent that the second prediction residual value is the intra-frame residual coefficient or the second prediction residual value is the inter-frame residual coefficient.
- the processor 610 is specifically configured to:
- An inter-frame residual coefficient is obtained based on the inter-frame transform coefficient.
- processor 610 is further configured to:
- An inter-frame prediction value of attribute information of a node of the target layer is determined based on the prediction tree and the transform tree.
- the reference frame data of the first data type is encoded and reconstructed point cloud frame data.
- the processor 610 is specifically configured to:
- the transformation coefficient of the reference node having the same position is determined as the inter-frame transformation coefficient of the attribute information of the node
- An inter-frame residual coefficient of the attribute information of the node is determined based on the transformation coefficient of the node and the inter-frame transformation coefficient of the attribute information of the node.
- the reference frame data of the second data type is reconstructed transformation coefficients of an encoded point cloud frame.
- the code stream of the point cloud to be encoded further includes a third encoding result
- the third encoding result is an encoding result of at least one of the first preset distance and the second preset distance.
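As an illustration of signalling the two preset distances (the "third encoding result"), a hypothetical fixed-layout header is shown below; the actual bitstream syntax and entropy coding are codec-defined, so this is only a sketch:

```python
import struct

def write_thresholds_header(d1, d2):
    # Hypothetical header: the first and second preset distances packed as
    # two unsigned bytes (real codecs would use the codec's own syntax).
    return struct.pack("<BB", d1, d2)

def read_thresholds_header(buf):
    # Decoder-side counterpart: recover (d1, d2) from the header bytes.
    return struct.unpack("<BB", buf[:2])
```

Signalling the thresholds once per frame (or sequence) lets both sides derive each layer's prediction regime without per-node flags outside the middle layer.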
- the processor 610 is configured to:
- when the target distance is less than or equal to the first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the node of the target layer;
- the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the node of the target layer;
- the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the node of the target layer.
- the target prediction residual value includes a second prediction residual value obtained by predicting the attribute information of the node of the target layer based on a rate-distortion optimization algorithm.
- the target prediction residual value includes the second prediction residual value
- the code stream of the point cloud to be decoded also includes a second encoding result
- the second encoding result is used to characterize that the second prediction residual value is an intra-frame residual coefficient or the second prediction residual value is an inter-frame residual coefficient
- the processor 610 is specifically used to:
- the second encoding result represents that the second prediction residual value is an intra-frame residual coefficient
- determining the attribute information of the node of the target layer based on the intra-frame transform coefficient of the node of the target layer and the second prediction residual value
- attribute information of the node of the target layer is determined based on the inter-frame transform coefficient of the node of the target layer and the second prediction residual value.
- the processor 610 is specifically configured to:
- the attribute information of the node of the target layer is determined based on the inter-frame transform coefficient and the second prediction residual value.
- processor 610 is further configured to:
- An inter-frame prediction value of attribute information of a node of the target layer is determined based on the prediction tree and the transform tree.
- the reference frame data of the first data type is decoded and reconstructed point cloud frame data.
- the processor 610 is specifically configured to:
- the transformation coefficient of the reference node having the same position is determined as the inter-frame transformation coefficient of the attribute information of the node
- the attribute information of the node is obtained based on the second prediction residual value and the inter-frame transform coefficient of the attribute information of the node.
- the reference frame data of the second data type is reconstructed transform coefficients of a decoded point cloud frame.
- the code stream of the point cloud to be decoded further includes a third encoding result
- the processor 610 is further configured to:
- the third encoding result is decoded to obtain at least one of the first preset distance and the second preset distance.
- the embodiments of the present application can improve coding efficiency.
- the terminal of the embodiment of the present application also includes: instructions or programs stored in the memory 609 and executable on the processor 610.
- the processor 610 calls the instructions or programs in the memory 609 to execute the methods executed by the modules shown in Figure 9 or Figure 10, and achieves the same technical effect. To avoid repetition, it will not be repeated here.
- An embodiment of the present application also provides a readable storage medium, on which a program or instruction is stored.
- the various processes of the above-mentioned point cloud encoding processing method embodiment are implemented, or when the program or instruction is executed by a processor, the various processes of the above-mentioned point cloud decoding processing method embodiment are implemented, and the same technical effect can be achieved. To avoid repetition, it will not be repeated here.
- the processor is the processor in the terminal described in the above embodiment.
- the readable storage medium includes a computer readable storage medium, such as a computer read-only memory ROM, a random access memory RAM, a magnetic disk or an optical disk.
- the readable storage medium may be a non-transient readable storage medium.
- An embodiment of the present application further provides a chip, which includes a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement the various processes of the above-mentioned point cloud encoding processing method embodiment, or to implement the various processes of the above-mentioned point cloud decoding processing method embodiment, and can achieve the same technical effect. To avoid repetition, it will not be repeated here.
- the chip mentioned in the embodiments of the present application can also be called a system-level chip, a system chip, a chip system or a system-on-chip chip, etc.
- the embodiments of the present application further provide a computer program/program product, which is stored in a storage medium.
- the computer program/program product is executed by at least one processor to implement the various processes of the above-mentioned point cloud encoding processing method or point cloud decoding processing method embodiment, and can achieve the same technical effect. To avoid repetition, it will not be repeated here.
- An embodiment of the present application also provides a coding and decoding system, including: an encoding end device and a decoding end device, wherein the encoding end device can be used to execute the steps of the point cloud encoding processing method as described above, and the decoding end device can be used to execute the steps of the point cloud decoding processing method as described above.
- the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, or, of course, by hardware alone.
- the computer software product is stored in a storage medium (such as a ROM, a RAM, a magnetic disk or an optical disk) and includes several instructions for causing a terminal or a network-side device to execute the methods described in the embodiments of the present application.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese Patent Application No. 202310408599.1, filed in China on April 17, 2023, the entire contents of which are incorporated herein by reference.
The present application belongs to the field of computer technology, and specifically relates to a point cloud encoding processing method, a point cloud decoding processing method and related devices.
A point cloud is a representation of a three-dimensional object or scene: a set of discrete points, irregularly distributed in space, that expresses the spatial structure and surface attributes of the object or scene. To accurately reflect the information in space, the number of discrete points required is very large, and to reduce the bandwidth occupied when storing and transmitting point cloud data, the data needs to be encoded and compressed. Point cloud data usually consists of geometry information describing a position, such as three-dimensional coordinates (x, y, z), and attribute information of that position, such as color (R, G, B) or reflectance. In point cloud coding and compression, the geometry information and the attribute information are encoded separately.
In the related art, when attribute information is encoded based on the Region Adaptive Hierarchical Transform (RAHT) algorithm, the tree structure corresponding to the point cloud to be encoded is split into two fixed parts: nodes in the upper layers are predicted with an inter-frame prediction algorithm, and nodes in the lower layers are predicted with an intra-frame prediction algorithm. This fixed split is inflexible and leads to low coding efficiency.
SUMMARY OF THE INVENTION
The embodiments of the present application provide a point cloud encoding processing method, a point cloud decoding processing method and related devices, which can solve the problem of low encoding efficiency.
In a first aspect, a point cloud encoding processing method is provided, which is executed by an encoding end and includes:
determining a target distance between a target layer and a root node of a transform tree corresponding to a point cloud to be encoded;
determining a target prediction residual value based on the target distance; and
encoding the target prediction residual value to obtain a first encoding result;
wherein the code stream of the point cloud to be encoded includes the first encoding result;
in a case where the target distance is less than or equal to a first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on attribute information of nodes of the target layer; or
in a case where the target distance is greater than the first preset distance and less than or equal to a second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the nodes of the target layer; or
in a case where the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the nodes of the target layer.
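The three-way mode selection described above can be sketched as follows. This is a minimal illustration, not code from the application; the parameter names `d1`/`d2` for the two preset distances and the mode labels are assumptions for the sketch:

```python
def select_prediction_mode(target_distance, d1, d2):
    """Choose a prediction mode from the target distance (depth below the
    root) of the current layer of the transform tree.

    d1, d2 -- hypothetical names for the first and second preset
    distances, with d1 <= d2.
    """
    if target_distance <= d1:
        return "inter"        # first prediction residual: inter-frame prediction
    elif target_distance <= d2:
        return "inter_intra"  # second prediction residual: prediction processing
    else:
        return "intra"        # third prediction residual: intra-frame prediction

# Example: with d1 = 3 and d2 = 6, layers at depth 0-3 use inter-frame
# prediction, depths 4-6 the intermediate prediction processing, and
# deeper layers intra-frame prediction.
assert select_prediction_mode(2, 3, 6) == "inter"
assert select_prediction_mode(5, 3, 6) == "inter_intra"
assert select_prediction_mode(8, 3, 6) == "intra"
```

Because the mode depends only on the layer depth and the two preset distances, the decoder can make the identical decision without any extra signalling per node.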
In a second aspect, a point cloud decoding processing method is provided, which is executed by a decoding end and includes:
determining a target distance between a target layer and a root node of a transform tree corresponding to a point cloud to be decoded;
decoding a first encoding result in a code stream of the point cloud to be decoded to obtain a target prediction residual value; and
acquiring attribute information of nodes of the target layer based on the target prediction residual value and the target distance;
wherein, in a case where the target distance is less than or equal to a first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the nodes of the target layer; or
in a case where the target distance is greater than the first preset distance and less than or equal to a second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the nodes of the target layer; or
in a case where the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the nodes of the target layer.
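On the decoding side, once the same distance-based mode decision has produced a prediction for each node, reconstruction is simply prediction plus decoded residual. A minimal sketch (the function name and the list representation of node attributes are assumptions for illustration):

```python
def reconstruct_attributes(residuals, predictions):
    """Decoder side: add each decoded target prediction residual back to
    the prediction obtained with the same mode as the encoder, yielding
    the reconstructed attribute values of the nodes of the target layer."""
    return [p + r for p, r in zip(predictions, residuals)]

# Residuals (2, -1, 0) on predictions (10, 20, 30) reconstruct (12, 19, 30).
assert reconstruct_attributes([2, -1, 0], [10, 20, 30]) == [12, 19, 30]
```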
In a third aspect, a point cloud encoding processing device is provided, including:
a first determination module, configured to determine a target distance between a target layer and a root node of a transform tree corresponding to a point cloud to be encoded;
a second determination module, configured to determine a target prediction residual value based on the target distance; and
an encoding module, configured to encode the target prediction residual value to obtain a first encoding result;
wherein the code stream of the point cloud to be encoded includes the first encoding result;
in a case where the target distance is less than or equal to a first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on attribute information of nodes of the target layer; or
in a case where the target distance is greater than the first preset distance and less than or equal to a second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the nodes of the target layer; or
in a case where the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the nodes of the target layer.
In a fourth aspect, a point cloud decoding processing device is provided, including:
a determination module, configured to determine a target distance between a target layer and a root node of a transform tree corresponding to a point cloud to be decoded;
a decoding module, configured to decode a first encoding result in a code stream of the point cloud to be decoded to obtain a target prediction residual value; and
an acquisition module, configured to acquire attribute information of nodes of the target layer based on the target prediction residual value and the target distance;
wherein, in a case where the target distance is less than or equal to a first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the nodes of the target layer; or
in a case where the target distance is greater than the first preset distance and less than or equal to a second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the nodes of the target layer; or
in a case where the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the nodes of the target layer.
In a fifth aspect, a terminal is provided, which includes a processor and a memory, wherein the memory stores a program or instructions that can run on the processor, and when the program or instructions are executed by the processor, the steps of the method described in the first aspect, or the steps of the method described in the second aspect, are implemented.
In a sixth aspect, a terminal is provided, including a processor and a communication interface, wherein the processor is configured to: determine a target distance between a target layer and a root node of a transform tree corresponding to a point cloud to be encoded; determine a target prediction residual value based on the target distance; and encode the target prediction residual value to obtain a first encoding result; wherein the code stream of the point cloud to be encoded includes the first encoding result; in a case where the target distance is less than or equal to a first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on attribute information of nodes of the target layer; or in a case where the target distance is greater than the first preset distance and less than or equal to a second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the nodes of the target layer; or in a case where the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the nodes of the target layer.
In a seventh aspect, a terminal is provided, including a processor and a communication interface, wherein the processor is configured to: determine a target distance between a target layer and a root node of a transform tree corresponding to a point cloud to be decoded; decode a first encoding result in a code stream of the point cloud to be decoded to obtain a target prediction residual value; and acquire attribute information of nodes of the target layer based on the target prediction residual value and the target distance; wherein, in a case where the target distance is less than or equal to a first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the nodes of the target layer; or in a case where the target distance is greater than the first preset distance and less than or equal to a second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the nodes of the target layer; or in a case where the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the nodes of the target layer.
In an eighth aspect, a readable storage medium is provided, on which a program or instructions are stored; when the program or instructions are executed by a processor, the steps of the method described in the first aspect, or the steps of the method described in the second aspect, are implemented.
In a ninth aspect, a coding and decoding system is provided, including an encoding end device and a decoding end device, wherein the encoding end device can be used to execute the steps of the method described in the first aspect, and the decoding end device can be used to execute the steps of the method described in the second aspect.
In a tenth aspect, a chip is provided, the chip including a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the method described in the first aspect, or to implement the method described in the second aspect.
In an eleventh aspect, a computer program/program product is provided, the computer program/program product being stored in a storage medium and executed by at least one processor to implement the steps of the method described in the first aspect, or the steps of the method described in the second aspect.
In the embodiments of the present application, a target prediction residual value is determined based on the target distance and is then encoded. In a case where the target distance is less than or equal to a first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on attribute information of nodes of the target layer; or in a case where the target distance is greater than the first preset distance and less than or equal to a second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the nodes of the target layer; or in a case where the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the nodes of the target layer. By comparing the target distance with the first and second preset distances, different prediction methods are selected; this gives high flexibility and can improve coding efficiency.
FIG. 1 is a flow chart of a prediction method in the related art;
FIG. 2 is a schematic diagram of a node in the related art;
FIG. 3 is a schematic diagram of a binary RAHT decomposition of a node block in the related art;
FIG. 4 is a schematic diagram of a prediction method in the related art;
FIG. 5 is a first flowchart of a point cloud encoding processing method provided in an embodiment of the present application;
FIG. 6 is a first flowchart of a point cloud decoding processing method provided in an embodiment of the present application;
FIG. 7 is a second flowchart of a point cloud encoding processing method provided in an embodiment of the present application;
FIG. 8 is a second flowchart of a point cloud decoding processing method provided in an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a point cloud encoding processing device provided in an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a point cloud decoding processing device provided in an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a communication device provided in an embodiment of the present application;
FIG. 12 is a schematic structural diagram of a terminal provided in an embodiment of the present application.
The technical solutions in the embodiments of the present application will be described clearly below in conjunction with the drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. Based on the embodiments of the present application, all other embodiments obtained by a person of ordinary skill in the art fall within the scope of protection of the present application.
The terms "first", "second" and the like in the present application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that terms so used are interchangeable where appropriate, so that the embodiments of the present application can be implemented in orders other than those illustrated or described herein. The objects distinguished by "first" and "second" are usually of one type, and the number of objects is not limited; for example, there may be one or more first objects. In addition, "or" in the present application denotes at least one of the connected objects; for example, "A or B" covers three cases: including A but not B; including B but not A; and including both A and B. The character "/" generally indicates an "or" relationship between the associated objects.
The term "indicate" in the present application can be a direct (explicit) indication or an indirect (implicit) indication. A direct indication can be understood as the sender explicitly informing the receiver, in the sent indication, of specific information, an operation to be performed, or a request result; an indirect indication can be understood as the receiver determining the corresponding information according to the indication sent by the sender, or making a judgment and determining the operation to be performed or the request result according to the judgment result.
The encoding/decoding end corresponding to the point cloud encoding and decoding processing methods in the embodiments of the present application may be a terminal, which may also be called a terminal device or user equipment (UE). The terminal may be a terminal-side device such as a mobile phone, a tablet personal computer, a laptop computer (also called a notebook computer), a personal digital assistant (PDA), a palmtop computer, a netbook, an ultra-mobile personal computer (UMPC), a mobile Internet device (MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device, a vehicle user equipment (VUE) or a pedestrian user equipment (PUE); wearable devices include smart watches, bracelets, earphones, glasses, etc. It should be noted that the embodiments of the present application do not limit the specific type of the terminal.
For ease of understanding, some of the content involved in the embodiments of the present application is described below:
1. Geometry-based Point Cloud Compression (GPCC) encoding and decoding
In GPCC, the geometry information and the attribute information of a point cloud are encoded separately: the geometry information is encoded first, and the attribute information is then encoded using the reconstructed geometry information.
There are two main ways of encoding attribute information: prediction and lifting transform based on levels of detail (LoD), and the Region Adaptive Hierarchical Transform (RAHT).
RAHT-based attribute coding can be further divided into intra-frame RAHT and inter-frame RAHT according to the prediction method used.
Intra-frame prediction uses only the already-encoded nodes of the current frame to predict the current node, whereas inter-frame prediction uses nodes of a reference frame to predict the current node.
a) Intra-frame RAHT technology:
Intra-frame RAHT is a region-adaptive transform based on upsampling prediction, applied layer by layer from top to bottom. In each layer, RAHT is applied to every unit node containing 2×2×2 child nodes. When encoding a unit node of the current layer containing up to 8 child nodes, the attributes of its occupied child nodes are predicted by a weighted combination of its parent node (previous layer), the coplanar neighbor parent nodes (previous layer), the collinear neighbor parent nodes (previous layer), the already-encoded coplanar neighbor child nodes (same layer) and the already-encoded collinear neighbor child nodes (same layer). The predicted attribute values and the true values are then both RAHT-transformed and subtracted, giving the AC-coefficient residuals, which are quantized and coded. However, when the number of child nodes to be encoded, the number of neighbor grandparent nodes or the number of neighbor parent nodes does not meet certain conditions, intra-frame prediction is not enabled: the original attribute values of the current node to be encoded are RAHT-transformed directly, and the transform coefficients are quantized and coded.
The specific flow of the intra-frame RAHT technique is as follows:
Step 1: build the transform tree structure. Starting from the bottom layer, the octree structure is built bottom-up; in the process of building the transform tree, the corresponding Morton code information, attribute information and weight information are generated for each merged node.
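The Morton code mentioned in Step 1 interleaves the bits of a node's (x, y, z) coordinates so that siblings of one 2×2×2 block are contiguous. A sketch of the interleaving (the exact bit order — which axis takes the high bit — varies between codecs and is an assumption here):

```python
def morton3d(x, y, z, bits=10):
    """Interleave the bits of (x, y, z) into a single Morton code, the
    ordering key used when building the transform tree.  Here x takes the
    highest of each bit triple (an assumed convention)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i + 2)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i)
    return code

# Sibling nodes differ only in the lowest 3 bits, so merging a 2x2x2
# block of children into its parent node is just a shift: code >> 3.
assert morton3d(1, 0, 0) == 0b100
assert morton3d(1, 1, 1) == 0b111
assert morton3d(1, 0, 0) >> 3 == morton3d(1, 1, 1) >> 3
```

This is why the bottom-up tree construction can proceed by sorting points on their Morton codes and grouping runs that share all but the lowest 3 bits.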
Step 2: parse the tree from top to bottom. Starting from the root node, upsampling prediction and the region adaptive hierarchical transform (RAHT) are performed layer by layer.
If the current node is the root node, no upsampling prediction is performed; the attribute information of the node is RAHT-transformed directly to obtain the DC coefficient and the AC coefficients.
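For reference, the elementary RAHT operation that produces the DC and AC coefficients is a weighted two-point butterfly. The formula below is the conventional form used in G-PCC-style RAHT, given here as a sketch rather than quoted from the application:

```python
import math

def raht_pair(a1, w1, a2, w2):
    """One RAHT butterfly on two sibling nodes with attribute values
    a1, a2 and weights (point counts) w1, w2.  The DC coefficient is
    passed up to the parent level; the AC coefficient is what gets
    predicted, quantized and entropy coded."""
    s = math.sqrt(w1 + w2)
    dc = (math.sqrt(w1) * a1 + math.sqrt(w2) * a2) / s
    ac = (-math.sqrt(w2) * a1 + math.sqrt(w1) * a2) / s
    return dc, ac

# With equal weights, DC is proportional to the sum and AC to the
# difference of the two attributes.
dc, ac = raht_pair(10.0, 1, 14.0, 1)
assert abs(dc - 24.0 / math.sqrt(2)) < 1e-9
assert abs(ac - 4.0 / math.sqrt(2)) < 1e-9
```

Because the weights carry the point counts, heavily populated subtrees dominate the DC term, which is what makes the transform "region adaptive".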
If the current node is not the root node, it is necessary to determine whether the current eight child nodes should be predicted, as shown in Figure 1:
1. Check whether the number of occupied child nodes NumValidc equals 1: if the current node to be encoded has exactly one occupied (non-empty) child node, the number of its neighbor parent nodes NumValidP is set to val (for example, 19), no prediction is performed, the original attribute information of the current node is RAHT-transformed directly, and the resulting AC coefficients are quantized and entropy coded.
Otherwise, continue with step 2:
2. Check whether the number of neighbor grandparent nodes NumValidGP is greater than or equal to TH1 (for example, 2): if the number of neighbor grandparent nodes (including the grandparent node itself) of the current child nodes to be encoded is less than TH1, no prediction is performed; the original attribute information of the current node is RAHT-transformed directly, and the resulting AC coefficients are quantized and entropy coded. Otherwise, proceed to step 3:
3. Neighbor search. The search range is: the parent node of the current child nodes to be encoded (1), the coplanar neighbor nodes of that parent node (6), the collinear neighbor nodes of that parent node (12), the coplanar neighbor nodes of the current child node to be encoded (6) and the collinear neighbor nodes of the current child node to be encoded (12). These neighbor nodes are searched in turn; if a neighbor node exists, its index information is recorded, along with the number of neighbors of the parent node (including the parent node itself). Then proceed to step 4.
4. Check whether the number of neighbor parent nodes is greater than or equal to TH2 (for example, 6): if the number of neighbor parent nodes of the current child nodes to be encoded is less than TH2, no weighted prediction is performed; the original attribute information of the current node to be encoded is RAHT-transformed directly, and the resulting AC coefficients are quantized and entropy coded. Otherwise, weighted prediction is performed.
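The gating conditions of steps 1-4 can be condensed into a single predicate. This is a sketch using the example threshold values from the text (TH1 = 2, TH2 = 6); the function and argument names are assumptions:

```python
def should_predict(num_valid_c, num_valid_gp, num_valid_p, th1=2, th2=6):
    """Gate the upsampling prediction for one 2x2x2 unit node.  Returns
    False when the node is coded without prediction, i.e. its original
    attributes go straight into the RAHT transform."""
    if num_valid_c == 1:       # step 1: only a single occupied child node
        return False
    if num_valid_gp < th1:     # step 2: too few neighbor grandparent nodes
        return False
    if num_valid_p < th2:      # step 4: too few neighbor parents found in step 3
        return False
    return True

assert should_predict(1, 5, 10) is False   # single occupied child
assert should_predict(4, 1, 10) is False   # too few grandparents
assert should_predict(4, 3, 5) is False    # too few neighbor parents
assert should_predict(4, 3, 7) is True     # all conditions met
```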
Step 3: weighted prediction. The nearest neighbors found in the neighbor search are used to perform a weighted prediction for each child node of the current node to be encoded, as shown in Figure 2.
The prediction weight of the parent node is defined as 9; a neighbor child node coplanar with the current child node to be encoded has prediction weight 5; a neighbor child node collinear with the current child node to be encoded has prediction weight 2; a neighbor parent node coplanar with the current child node to be encoded has prediction weight 3; and a neighbor parent node collinear with the current child node to be encoded has prediction weight 1.
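The weighted prediction with the weights just listed is a weighted average over the usable neighbors. A sketch (the dictionary keys and function name are assumptions; the weight values 9/5/2/3/1 are from the text):

```python
# Prediction weights from the text: parent 9, coplanar child neighbor 5,
# collinear child neighbor 2, coplanar parent neighbor 3, collinear
# parent neighbor 1.
WEIGHTS = {"parent": 9, "coplanar_child": 5, "collinear_child": 2,
           "coplanar_parent": 3, "collinear_parent": 1}

def weighted_prediction(neighbors):
    """neighbors: list of (kind, attribute) pairs for the usable
    neighbors of one child node to be encoded.  Returns the weighted
    average used as the attribute prediction for that child node."""
    total_w = sum(WEIGHTS[k] for k, _ in neighbors)
    return sum(WEIGHTS[k] * a for k, a in neighbors) / total_w

# A child predicted from its parent (attribute 8) and one already-coded
# coplanar child neighbor (attribute 12): (9*8 + 5*12) / (9 + 5).
pred = weighted_prediction([("parent", 8.0), ("coplanar_child", 12.0)])
assert abs(pred - (9 * 8 + 5 * 12) / 14) < 1e-9
```

Note how an already-encoded child neighbor (weight 5) pulls the prediction harder than a coplanar parent neighbor (weight 3) would, reflecting that same-layer reconstructed attributes are more reliable predictors.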
The parent node can be used directly to predict every child node of the current block to be encoded. For the other neighbor nodes (neighbor parent nodes and already-encoded neighbor child nodes), a further check is needed to decide whether they can be used to predict the child nodes of the current block. The check proceeds as follows:
1. As shown in Figure 2, for the neighbor parent nodes with indexes 1-7, their child nodes have not yet been encoded, so only the neighbor parent nodes themselves can be used for prediction.
First, check whether the neighbor parent node at that position exists; if it does not, continue with the next neighbor parent node.
If it exists, it is necessary to determine whether the current neighbor parent node can be used for prediction, i.e. to screen the neighbor parent nodes. The specific process is as follows:
1) Two prediction thresholds are set based on the attribute value of the parent node to further filter the nearest neighbors, screening out unreasonable neighbor parent nodes and improving prediction accuracy. Let the two thresholds be limitLow and limitHigh, and let the attribute value of the parent node be attrPar; then:
limitLow = attrPar * 2
limitHigh = attrPar * 25
设当前邻居父节点的属性值为attrNei,对它进行以下判断:
limitLow<attrnei*10<limitHighAssume that the attribute value of the current neighbor's parent node is attrNei, and make the following judgments on it:
limitLow<attrnei*10<limitHigh
若不满足该条件,则当前邻居父节点不能用来预测当前待编码块的子节点;若满足该条件,则继续进行如下判断;If the condition is not met, the current neighbor parent node cannot be used to predict the child node of the current block to be encoded; if the condition is met, continue to make the following judgment;
2)判断当前邻居父节点是否满足与当前每一个待编码子节点共面、共线这一条件。若不满足该条件,则不能使用当前邻居节点对当前待编码子节点进行加权预测;若满足该条件,则使用当前邻居节点来对当前待编码子节点进行加权预测。2) Determine whether the current neighboring parent node meets the condition of being coplanar and colinear with each of the current child nodes to be encoded. If this condition is not met, the current neighboring node cannot be used to perform weighted prediction on the current child node to be encoded; if this condition is met, the current neighboring node is used to perform weighted prediction on the current child node to be encoded.
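With the stated thresholds, the screening condition amounts to requiring attrNei to lie strictly between 0.2x and 2.5x of attrPar. A minimal sketch (the function name is illustrative, not from the source):

```python
def neighbor_passes_threshold(attr_par, attr_nei):
    """Screen a neighbor parent node: usable for prediction only if
    limitLow < attrNei*10 < limitHigh, with limitLow = attrPar*2
    and limitHigh = attrPar*25."""
    limit_low = attr_par * 2
    limit_high = attr_par * 25
    return limit_low < attr_nei * 10 < limit_high
```

A neighbor whose attribute is far smaller or far larger than the parent's is rejected before it can distort the weighted prediction.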
2. For the neighbor parent nodes with indexes 8-19, their child nodes have already been encoded, so if a neighbor child node exists at the corresponding position, it can be used for prediction in place of the neighbor parent node it belongs to, giving a better prediction.
First, determine whether the neighbor parent node at that position exists; if it does not, move on to the next neighbor parent node.
If it exists, determine whether the current neighbor parent node can be used for prediction, that is, screen the neighbor parent node. The specific process is as follows:
1) Two prediction thresholds are set from the attribute value of the parent node to further screen the nearest neighbors, filtering out unsuitable neighbor parent nodes and improving prediction accuracy. Let the attribute value of the current neighbor parent node be attrNei; the following condition is checked:
limitLow < attrNei*10 < limitHigh
If the condition is not met, the current neighbor parent node cannot be used to predict the child nodes of the current block to be encoded; if it is met, continue with the following check:
2) Determine whether a same-layer neighbor child node exists at the corresponding position under the neighbor parent node. For a coplanar neighbor parent node, if one of its child nodes is also coplanar with the current child node to be predicted, that coplanar neighbor child node replaces the coplanar neighbor parent node in the weighted prediction of the child node; for a collinear neighbor parent node, if one of its child nodes is also collinear with the current child node to be predicted, that collinear neighbor child node replaces the collinear neighbor parent node in the weighted prediction. If no such child node exists, the neighbor parent node continues to be used for weighted prediction.
3. Finally, each child node of the current node to be encoded uses the neighbor nodes that satisfy the conditions as its reference point set for weighted prediction, yielding the attribute prediction value. The original attribute values and the predicted attribute values are then each RAHT-transformed and differenced to obtain the AC coefficient residual, which is quantized and entropy coded:
Attr_res = AttrCoff_org - AttrCoff_pred
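The weighted prediction over the selected reference set can be sketched as a weighted average using the weights listed above (9/5/2/3/1). Normalizing by the weight sum is an assumption here; the source specifies the weights but not the averaging step:

```python
# Prediction weights from the scheme above. Combining them as a
# normalized weighted average is an assumption, not stated in the source.
PRED_WEIGHTS = {
    "parent": 9,
    "coplanar_child": 5,
    "collinear_child": 2,
    "coplanar_parent": 3,
    "collinear_parent": 1,
}

def weighted_attr_prediction(references):
    """references: list of (kind, attribute_value) pairs, where kind is
    one of the PRED_WEIGHTS keys. Returns the attribute prediction value."""
    num = sum(PRED_WEIGHTS[kind] * attr for kind, attr in references)
    den = sum(PRED_WEIGHTS[kind] for kind, _ in references)
    return num / den
```

With only the parent available, the prediction degenerates to the parent's attribute value, which matches the statement that the parent can always be used directly.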
The fourth step is the RAHT transform. The process is as follows:
The RAHT transform is applied to the original attribute values and/or the attribute prediction values. If the current block is not predicted, only the original attribute values are RAHT-transformed; if the current block is predicted, both the original attribute values and the attribute prediction values are RAHT-transformed.
Before the RAHT transform, the original attribute values and/or attribute prediction values are first normalized, and the transform is then applied to the normalized values.
First the original attribute values are normalized. Let the original attribute value of the current child node be A_i and its weight be w_i (the number of points contained in the current node); the normalized value A'_i is then computed from A_i and w_i.
If prediction is used, the attribute prediction value A_pi of the current child node is normalized in the same way to obtain A'_pi.
The RAHT transform is applied to the processed attribute prediction value A'_pi or original attribute value A'_i. For a 2*2*2 node block, the transform is carried out along the three directions in turn, with four two-point transforms performed in each direction. For each 2*2*2 block the procedure is:
1) Transform along the first direction to obtain low-frequency L nodes and high-frequency H nodes;
2) Transform the L and H nodes along the second direction to obtain low-frequency LL nodes and high-frequency LH, HL and HH nodes;
3) Transform the LL, LH, HL and HH nodes along the third direction to obtain the low-frequency LLL node and the high-frequency LLH, LHL, LHH, HLL, HLH, HHL and HHH nodes.
LLL is the DC coefficient; LLH, LHL, LHH, HLL, HLH, HHL and HHH are AC coefficients.
The AC coefficients are quantized and entropy coded in the order LLH, LHL, HLL, LHH, HLH, HHL, HHH.
Because point clouds are sparse, a 2*2*2 node block usually has fewer than 8 occupied child nodes, so not all AC coefficients exist; AC coefficients that do not exist are not encoded.
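For a fully occupied block with one point per child node (unit weights), the three passes above can be sketched as follows. The butterfly uses the standard RAHT weighting a = sqrt(w0/(w0+w1)), b = sqrt(w1/(w0+w1)), which is an assumption where the source's formula images were lost:

```python
import math

def raht_butterfly(t0, t1, w0, w1):
    """Two-point RAHT transform; returns (low, high).
    a, b assumed to follow the standard sqrt-weight RAHT definition."""
    a = math.sqrt(w0 / (w0 + w1))
    b = math.sqrt(w1 / (w0 + w1))
    return a * t0 + b * t1, -b * t0 + a * t1

def raht_2x2x2(block):
    """block maps (x, y, z) in {0, 1}^3 to an attribute value; all eight
    children occupied with weight 1. Returns coefficients keyed 'LLL'..'HHH'."""
    # direction 1 (x): 4 two-point transforms, child weight 1
    s1 = {}
    for y in (0, 1):
        for z in (0, 1):
            lo, hi = raht_butterfly(block[(0, y, z)], block[(1, y, z)], 1, 1)
            s1[("L", y, z)], s1[("H", y, z)] = lo, hi
    # direction 2 (y): 4 transforms, merged weight 2 per node
    s2 = {}
    for lab in ("L", "H"):
        for z in (0, 1):
            lo, hi = raht_butterfly(s1[(lab, 0, z)], s1[(lab, 1, z)], 2, 2)
            s2[(lab + "L", z)], s2[(lab + "H", z)] = lo, hi
    # direction 3 (z): 4 transforms, merged weight 4 per node
    coeffs = {}
    for lab in ("LL", "LH", "HL", "HH"):
        lo, hi = raht_butterfly(s2[(lab, 0)], s2[(lab, 1)], 4, 4)
        coeffs[lab + "L"], coeffs[lab + "H"] = lo, hi
    return coeffs
```

For a constant block, all AC coefficients vanish and the DC coefficient LLL carries the whole (weight-scaled) signal, which is why only the DC path propagates up the tree.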
The specific process is shown in Figure 3.
Specifically, for a two-point transform, let the input attribute values be T01 and T11 and the resulting transform coefficients be T1 and Tw0+1; the transform is:
T1 = a*T01 + b*T11
Tw0+1 = -b*T01 + a*T11
where a and b are computed from the weights of the current node, which are obtained from the weight array:
a = sqrt(w0/(w0+w1)), b = sqrt(w1/(w0+w1))
If the current node is the root node and no prediction is performed, the DC and AC coefficients obtained by transforming the original attribute values are quantized and entropy coded;
If the current node is not the root node and no prediction is performed, the AC coefficients obtained by transforming the original attribute values are quantized and entropy coded;
If the current node is not the root node and prediction is performed, the residual between the AC coefficients of the transformed original attribute values and the AC coefficients of the transformed attribute prediction values is computed, and this residual is quantized and entropy coded.
At the decoding end, the coefficients are decoded and dequantized, and the inverse RAHT transform is then performed.
The inverse RAHT transform is the inverse of the forward transform:
T01 = a*T1 - b*Tw0+1
T11 = b*T1 + a*Tw0+1
where T1 and Tw0+1 are the transform coefficients, T01 and T11 are the reconstructed attribute values, and a and b are computed in the same way as in the forward transform.
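Because the forward matrix [[a, b], [-b, a]] is orthonormal (a^2 + b^2 = 1), the inverse is its transpose, so a forward/inverse pair reconstructs the inputs exactly. A sketch under the assumed sqrt-weight definition of a and b:

```python
import math

def raht_forward(t01, t11, w0, w1):
    # a, b assumed to follow the standard sqrt-weight RAHT definition
    a = math.sqrt(w0 / (w0 + w1))
    b = math.sqrt(w1 / (w0 + w1))
    return a * t01 + b * t11, -b * t01 + a * t11   # (T1, Tw0+1)

def raht_inverse(low, high, w0, w1):
    a = math.sqrt(w0 / (w0 + w1))
    b = math.sqrt(w1 / (w0 + w1))
    return a * low - b * high, b * low + a * high  # (T01, T11)
```

For equal inputs with equal weights, the high-frequency coefficient is zero, which is the property the entropy coder exploits on smooth regions.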
If the current node is the root node, the inverse transform is performed directly to obtain the attribute reconstruction value of each child node;
If the current node is not the root node and no prediction is performed, the DC coefficient inherits the attribute reconstruction value of the parent node, and the inverse transform then yields the attribute reconstruction value of each child node;
If the current node is not the root node and prediction is performed, the DC coefficient inherits the attribute reconstruction value of the parent node, the AC coefficients are the AC coefficient residuals plus the AC coefficients of the attribute prediction values, and the inverse transform then yields the attribute reconstruction value of each child node.
b) Inter-frame RAHT technique:
The inter-frame RAHT technique uses the inter-frame prediction attribute values of qualifying nodes in a reference frame to predict the attributes of nodes in the current frame; the predicted and true attribute values are then each RAHT-transformed and differenced, and the AC coefficient residual is quantized and encoded.
The inter-frame scheme is as follows:
Inter-frame RAHT is a method for inter-frame prediction of the RAHT DC and AC coefficients. Inter-frame prediction is applied only to the first five layers of nodes; the remaining layers use only intra-frame prediction.
In the first five layers (upper-level nodes):
For the DC coefficient:
The DC coefficient prediction residual DC_residual is the DC coefficient DC_current of the current frame's root node minus the DC coefficient DC_reference of the reference frame's root node:
DC_residual = DC_current - DC_reference
For the AC coefficients:
The prediction process is as follows:
1. Reference frame reconstruction: the points of the reference frame that lie within a certain range of the current point are merged into one point, their attributes are summed, and the merged point takes the coordinates of the current point. The same prediction tree can then be built and parsed for both frames. The specific process is as follows:
a) For each point there is a reconstructed reference point at the same position. The attribute of this reference point is the sum of the attributes of the points within a certain range of the reference frame, selected according to the rule below.
b) The rule is: if a point of the reference frame is closer, in Morton code distance, to the current point of the current frame than to the next point of the current frame, that point is judged to be used to reconstruct the current point; its attribute value is accumulated and the weight is increased by one. This continues until the condition no longer holds, after which the next reconstruction point is considered.
c) In addition, start and end distance bounds are specified: points that are too far away are assumed to have very different attribute values and are unsuitable for reconstruction. Specifically:
Only points of the reference frame whose distance to the first point of the current frame is no greater than 64 are used for reconstruction.
Only points of the reference frame whose distance to the last point of the current frame is no greater than 64 are used for reconstruction.
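A simplified one-dimensional sketch of the merge rule, treating Morton codes as integers; the helper name and flat point list are illustrative, and the distance bound is applied per current point here rather than at the first/last points of the frame:

```python
def merge_reference_points(cur_code, next_code, ref_points, max_dist=64):
    """Fold reference-frame points into the reconstructed reference point
    of the current point: a point is used while it is closer (in Morton
    code distance) to the current point than to the next current-frame
    point, and within max_dist. Returns (summed attribute, weight)."""
    attr_sum = 0.0
    weight = 0
    for code, attr in ref_points:
        if abs(code - cur_code) > max_dist:
            continue  # too far: unsuitable for reconstruction
        if abs(code - cur_code) < abs(code - next_code):
            attr_sum += attr
            weight += 1
    return attr_sum, weight
```

The accumulated weight then plays the same role as the point count w_i in the normalization before the RAHT transform.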
2. Tree construction: a prediction tree is built bottom-up for both the reconstructed reference frame and the current frame, yielding prediction trees of identical structure and producing the attribute value of each current-frame node together with the predicted attribute value and weight of the corresponding reference-frame node.
3. As shown in Figure 4, the attribute prediction value is selected as follows: in the first five layers, if the attribute value of the node at the corresponding position of the reference frame is nonzero and lies within 20% to 250% of the attribute value of the current parent node, that reference-frame node attribute value Attr_predicted_inter is used as the attribute prediction value; otherwise the intra-frame prediction value Attr_predicted_intra of the current frame is used. The selected prediction value Attr_predicted is RAHT-transformed to obtain the AC coefficients of the prediction. The AC coefficient prediction residual AC_residual is the AC coefficients AC_current of the transformed current frame minus the predicted AC coefficients AC_predicted, and the residual is quantized and encoded:
AC_residual = AC_current - AC_predicted
AC_predicted = Transform(Attr_predicted)
Attr_predicted = Attr_predicted_inter ? Attr_predicted_inter : Attr_predicted_intra
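The 20%-250% gate can be sketched as follows; inclusive bounds are an assumption, since the source does not specify whether the comparison is strict:

```python
def select_attr_predictor(inter_pred, parent_attr, intra_pred):
    """Use the reference frame's value when it is nonzero and within
    20%..250% of the current parent node's attribute value; otherwise
    fall back to the current frame's intra-frame prediction."""
    if inter_pred != 0 and 0.2 * parent_attr <= inter_pred <= 2.5 * parent_attr:
        return inter_pred
    return intra_pred
```

This mirrors the ternary expression above: the inter value wins only when it passes the gate.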
4. The remaining layers use only the original intra-frame RAHT prediction method.
In the related art, when attribute information is encoded with the region-adaptive hierarchical transform (RAHT) algorithm, the tree structure corresponding to the point cloud to be encoded is split in two: the upper-layer nodes are predicted with an inter-frame prediction algorithm and the lower-layer nodes with an intra-frame prediction algorithm. This is inflexible and leads to low coding efficiency.
The point cloud encoding processing method, point cloud decoding processing method and related devices provided by the embodiments of this application are described in detail below through some embodiments and their application scenarios, with reference to the accompanying drawings.
Referring to FIG. 5, FIG. 5 is a flowchart of a point cloud encoding processing method provided by an embodiment of this application, which can be applied at the encoding end. As shown in FIG. 5, the point cloud encoding processing method includes the following steps:
Step 101: determine the target distance between a target layer of the transform tree corresponding to the point cloud to be encoded and the root node.
Here, the geometry information and attribute information of the point cloud to be encoded can be obtained, and the transform tree corresponding to the point cloud to be encoded is generated based on its geometry information. The transform tree may include multiple layers; the root node can be regarded as the first layer, and the target layer can be any layer of the tree. The target distance refers to the number of layers between the target layer and the root node: for example, the target distance between the second layer and the root node may be one layer, between the third layer and the root node two layers, between the fourth layer and the root node three layers, and so on.
Step 102: determine a target prediction residual value based on the target distance.
Step 103: encode the target prediction residual value to obtain a first encoding result.
The code stream of the point cloud to be encoded includes the first encoding result.
When the target distance is less than or equal to a first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the nodes of the target layer; or
when the target distance is greater than the first preset distance and less than or equal to a second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the nodes of the target layer; or
when the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the nodes of the target layer.
The first preset distance and the second preset distance may both be expressed as numbers of layers from the root node, with the first preset distance smaller than the second. For example, the first preset distance may be 5 layers and the second 10 layers; or 7 and 13 layers; or 8 and 15 layers; and so on. This embodiment does not limit the first and second preset distances.
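The three-way split by target distance can be sketched directly (function and mode names are illustrative):

```python
def prediction_mode_for_layer(target_distance, first_preset, second_preset):
    """Map a layer's distance from the root to the prediction scheme:
    inter-frame prediction near the root, a rate-distortion-optimized
    choice in the middle band, and intra-frame prediction below it."""
    if target_distance <= first_preset:
        return "inter"
    if target_distance <= second_preset:
        return "rdo"
    return "intra"
```

With the example thresholds of 5 and 10 layers, layers 1-5 use inter prediction, layers 6-10 are decided per node by RDO, and deeper layers use intra prediction.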
Note that when the target distance is less than or equal to the first preset distance, the nodes of the target layer are upper-layer nodes close to the root. Because upper-layer nodes of the transform tree are large and contain many points, they change little even when there is motion between two frames; in that case the inter-frame attribute prediction residual is likely to be smaller than the intra-frame one, so inter-frame prediction works better in the upper layers.
When the target distance is greater than the first preset distance and less than or equal to the second preset distance, a rate-distortion optimization algorithm can be used to choose between the inter-frame and intra-frame prediction modes.
When the target distance is greater than the second preset distance, the nodes of the target layer are lower-layer nodes far from the root. Because lower-layer blocks are small and contain few points, the number of points in each block can change substantially when there is motion between two frames, making inter-frame prediction unsuitable; intra-frame prediction therefore works better in the lower layers.
Note that the inter-frame prediction method in the related art decides whether the current layer enables the inter-frame prediction mode purely by its distance (number of layers) from the root node: when the distance is less than or equal to a set threshold, the corresponding inter-frame prediction algorithm is applied; when it is greater, only the existing intra-frame method is used. For some sequences, however, inter-frame prediction on certain nodes beyond the threshold can give smaller residuals, and thus better prediction, than intra-frame prediction.
This embodiment selects between intra- and inter-frame prediction in a more flexible way, instead of simply splitting the partitioned octree in two by the number of layers from the root, applying inter-frame prediction to the upper-layer nodes and blanket intra-frame prediction to the rest. A more flexible criterion is used to decide the prediction mode of the current node, making the prediction residual smaller.
In an embodiment of this application, a target prediction residual value is determined based on the target distance and encoded: when the target distance is less than or equal to the first preset distance, the target prediction residual value includes a first prediction residual value obtained by inter-frame prediction of the attribute information of the nodes of the target layer; or when the target distance is greater than the first preset distance and less than or equal to the second preset distance, it includes a second prediction residual value obtained by prediction processing of that attribute information; or when the target distance is greater than the second preset distance, it includes a third prediction residual value obtained by intra-frame prediction of that attribute information. Comparing the target distance with the first and second preset distances to determine different prediction modes is highly flexible and can improve coding efficiency.
Optionally, when the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the nodes of the target layer based on a rate-distortion optimization algorithm.
In this case, a first cost value corresponding to the inter-frame residual coefficients and a second cost value corresponding to the intra-frame residual coefficients can be determined based on the rate-distortion optimization algorithm, and the second prediction residual value is determined based on the first and second cost values.
In this implementation, when the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by predicting the attribute information of the nodes of the target layer based on a rate-distortion optimization algorithm. This supports evaluating the cost of intra-frame and inter-frame prediction through rate-distortion optimization, so that a suitable prediction mode can be selected to obtain the second prediction residual value.
Optionally, determining the target prediction residual value based on the target distance includes:
when the target distance is greater than the first preset distance and less than or equal to the second preset distance, determining the inter-frame residual coefficients and intra-frame residual coefficients of the attribute information of the nodes of the target layer;
determining, based on the rate-distortion optimization algorithm, a first cost value corresponding to the inter-frame residual coefficients and a second cost value corresponding to the intra-frame residual coefficients; and
determining the second prediction residual value based on the first cost value and the second cost value,
where the target prediction residual value includes the second prediction residual value.
The cost may also be described as the expense.
In one implementation, the second cost value Cost_intra corresponding to the intra-frame residual coefficients and the first cost value Cost_inter corresponding to the inter-frame residual coefficients can be computed separately. The cost of encoding the intra-frame residual coefficients or the inter-frame residual coefficients is obtained as:
Cost = D + λ*R
where D (distortion) is the sum of absolute differences (SAD) between the reconstructed attributes obtained using the prediction residual coefficients (intra-frame or inter-frame) and the original attributes;
R (rate) is the number of bits required to encode the prediction residual coefficients (intra-frame or inter-frame); and
the value of λ depends on the quantization parameter QP.
Note that determining the inter-frame residual coefficients of the attribute information of the nodes of the target layer may include: determining the inter-frame prediction values of the attribute information of the nodes of the target layer, transforming the inter-frame prediction values to obtain inter-frame transform coefficients, and obtaining the inter-frame residual coefficients based on the inter-frame transform coefficients. Alternatively, it may include: if the reference frame of the point cloud to be encoded contains transform coefficients of a reference node at the same position as a node of the target layer, taking the transform coefficients of that co-located reference node as the inter-frame transform coefficients of the node's attribute information, and determining the inter-frame residual coefficients of the node's attribute information based on the node's transform coefficients and those inter-frame transform coefficients.
Determining the intra-frame residual coefficients of the attribute information of the nodes of the target layer may include: determining the intra-frame prediction values of the attribute information of the nodes of the target layer, transforming the intra-frame prediction values to obtain intra-frame transform coefficients, and obtaining the intra-frame residual coefficients based on the intra-frame transform coefficients.
In this implementation, the inter-frame residual coefficients and intra-frame residual coefficients of the attribute information of the nodes of the target layer are determined; a first cost value corresponding to the inter-frame residual coefficients and a second cost value corresponding to the intra-frame residual coefficients are determined based on the rate-distortion optimization algorithm; and the second prediction residual value is determined based on the first and second cost values. In this way, for middle-layer nodes, the costs of intra-frame and inter-frame prediction can be compared through rate-distortion optimization and a suitable prediction mode selected to obtain the second prediction residual value, so that the attribute prediction residual is smaller and the coding efficiency higher.
可选地,在所述第一代价值小于所述第二代价值的情况下,所述第二预测残差值为所述帧间残差系数;或,Optionally, when the first generation value is less than the second generation value, the second prediction residual value is the inter-frame residual coefficient; or,
在所述第二代价值小于所述第一代价值的情况下,所述第二预测残差值为所述帧内残差系数;或,When the second generation value is less than the first generation value, the second prediction residual value is the intra-frame residual coefficient; or
在所述第一代价值等于所述第二代价值的情况下,所述第二预测残差值为所述帧间残差系数或所述帧内残差系数。When the first generation value is equal to the second generation value, the second prediction residual value is the inter-frame residual coefficient or the intra-frame residual coefficient.
需要说明的是,所述第一代价值等于所述第二代价值表征编码帧内残差系数的代价和编码帧间残差系数的代价相同,可以选择帧内预测方式或帧间预测方式。It should be noted that the first cost value being equal to the second cost value indicates that the cost of encoding the intra-frame residual coefficient is the same as the cost of encoding the inter-frame residual coefficient, and either the intra-frame prediction mode or the inter-frame prediction mode can be selected.
该实施方式中,在所述第一代价值小于所述第二代价值的情况下,所述第二预测残差值为所述帧间残差系数;在所述第二代价值小于所述第一代价值的情况下,所述第二预测残差值为所述帧内残差系数;在所述第一代价值等于所述第二代价值的情况下,所述第二预测残差值为所述帧间残差系数或所述帧内残差系数。从而能够选择代价值较小的预测方式进行预测,使得属性预测残差较小,编码效率较高。In this implementation, when the first cost value is less than the second cost value, the second prediction residual value is the inter-frame residual coefficient; when the second cost value is less than the first cost value, the second prediction residual value is the intra-frame residual coefficient; when the first cost value is equal to the second cost value, the second prediction residual value is the inter-frame residual coefficient or the intra-frame residual coefficient. Thus, the prediction method with a smaller cost value can be selected for prediction, so that the attribute prediction residual is smaller and the coding efficiency is higher.
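The selection rule above can be sketched as a small helper. Note that `select_residual_by_rdo`, `rate_of`, `distortion_of`, and the Lagrange multiplier `lam` are hypothetical stand-ins for the encoder's actual rate and distortion models, which the embodiment does not specify; the cost form `D + lambda * R` is the usual RDO formulation:

```python
def select_residual_by_rdo(inter_residual, intra_residual,
                           rate_of, distortion_of, lam=0.5):
    """Pick the residual whose rate-distortion cost is smaller.

    Cost = D + lambda * R (standard RDO form). The first cost value
    corresponds to the inter-frame residual, the second to the intra one.
    """
    cost_inter = distortion_of(inter_residual) + lam * rate_of(inter_residual)
    cost_intra = distortion_of(intra_residual) + lam * rate_of(intra_residual)
    if cost_inter < cost_intra:          # first cost value smaller -> inter
        return inter_residual, "inter"
    if cost_intra < cost_inter:          # second cost value smaller -> intra
        return intra_residual, "intra"
    return inter_residual, "inter"       # equal costs: either may be chosen
```

With a toy model (rate = count of nonzero coefficients, distortion = sum of squares), a small inter residual is preferred over a larger intra one.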
可选地,所述待编码点云的码流还包括第二编码结果,所述第二编码结果用于表征所述第二预测残差值为所述帧内残差系数或者所述第二预测残差值为所述帧间残差系数。Optionally, the code stream of the point cloud to be encoded also includes a second encoding result, and the second encoding result is used to represent that the second prediction residual value is the intra-frame residual coefficient or the second prediction residual value is the inter-frame residual coefficient.
其中,可以使用标志位(flag)标志节点使用的预测模式,第二编码结果可以是对该标志位的编码结果。示例地,可以使用一个二元标志位flag标志节点使用的预测模式,在flag为1时,表征节点使用帧内预测模式,第二预测残差值为所述帧内残差系数,在flag为0时,表征节点使用帧间预测模式,第二预测残差值为所述帧间残差系数;或者,在flag为0时,表征节点使用帧内预测模式,第二预测残差值为所述帧内残差系数,在flag为1时,表征节点使用帧间预测模式,第二预测残差值为所述帧间残差系数。A flag may be used to mark the prediction mode used by a node, and the second encoding result may be the encoding result of this flag. For example, a binary flag may be used to mark the prediction mode of the node: when the flag is 1, it indicates that the node uses the intra-frame prediction mode and the second prediction residual value is the intra-frame residual coefficient, and when the flag is 0, it indicates that the node uses the inter-frame prediction mode and the second prediction residual value is the inter-frame residual coefficient; or, when the flag is 0, it indicates that the node uses the intra-frame prediction mode and the second prediction residual value is the intra-frame residual coefficient, and when the flag is 1, it indicates that the node uses the inter-frame prediction mode and the second prediction residual value is the inter-frame residual coefficient.
该实施方式中,所述待编码点云的码流还包括第二编码结果,所述第二编码结果用于表征所述第二预测残差值为所述帧内残差系数或者所述第二预测残差值为所述帧间残差系数,从而解码端能够通过第二编码结果确定预测方式为帧内预测方式或帧间预测方式,从而选择合适的预测方式进行解码。In this embodiment, the code stream of the point cloud to be encoded also includes a second encoding result, and the second encoding result is used to characterize that the second prediction residual value is the intra-frame residual coefficient or the second prediction residual value is the inter-frame residual coefficient, so that the decoding end can determine whether the prediction mode is an intra-frame prediction mode or an inter-frame prediction mode through the second encoding result, and thus select a suitable prediction mode for decoding.
可选地,在所述待编码点云的参考帧数据的数据类型为第一数据类型的情况下,所述确定所述目标层的节点的属性信息的帧间残差系数,包括:Optionally, when the data type of the reference frame data of the to-be-encoded point cloud is the first data type, the determining of the inter-frame residual coefficients of the attribute information of the node of the target layer includes:
确定所述目标层的节点的属性信息的帧间预测值,并对所述帧间预测值进行变换处理得到帧间变换系数;Determine an inter-frame prediction value of the attribute information of the node of the target layer, and transform the inter-frame prediction value to obtain an inter-frame transformation coefficient;
基于所述帧间变换系数获取帧间残差系数。 An inter-frame residual coefficient is obtained based on the inter-frame transform coefficient.
其中,所述第一数据类型的参考帧数据可以为已编码且已重建的点云帧数据。第一数据类型的参考帧数据可以包含属性信息和几何信息。The reference frame data of the first data type may be encoded and reconstructed point cloud frame data. The reference frame data of the first data type may include attribute information and geometric information.
一种实施方式中,所述对所述帧间预测值进行变换处理得到帧间变换系数,可以包括,对帧间预测值进行RAHT变换,得到帧间变换系数。In one implementation, the transforming the inter-frame prediction value to obtain the inter-frame transform coefficient may include performing a RAHT transform on the inter-frame prediction value to obtain the inter-frame transform coefficient.
一种实施方式中,所述基于所述帧间变换系数获取帧间残差系数,可以包括,对节点的属性信息进行RAHT变换,得到节点的变换系数;将所述帧间变换系数与节点的变换系数相减得到帧间残差系数。帧间残差系数也可以描述为帧间预测残差系数。In one implementation, the obtaining of the inter-frame residual coefficient based on the inter-frame transform coefficient may include performing a RAHT transform on the attribute information of the node to obtain the transform coefficient of the node; and subtracting the inter-frame transform coefficient from the transform coefficient of the node to obtain the inter-frame residual coefficient. The inter-frame residual coefficient may also be described as an inter-frame prediction residual coefficient.
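As a hedged illustration of the coefficient-domain subtraction, the sketch below reduces the transform to a single two-node RAHT butterfly (the standard weighted-Haar form, up to sign convention); the function names and this reduction are simplifications for illustration, not the embodiment's full multi-level transform:

```python
import math

def raht_butterfly(a1, a2, w1, w2):
    """One RAHT merge of two sibling attributes with occupancy weights.

    The DC coefficient carries the weighted mean, the AC coefficient the
    weighted difference (weighted-Haar form).
    """
    s = math.sqrt(w1 + w2)
    dc = (math.sqrt(w1) * a1 + math.sqrt(w2) * a2) / s
    ac = (-math.sqrt(w2) * a1 + math.sqrt(w1) * a2) / s
    return dc, ac

def inter_residual(actual, predicted, w1, w2):
    """Transform both the actual attributes and the inter prediction,
    then subtract coefficient-wise: residual = T(actual) - T(predicted)."""
    dc_a, ac_a = raht_butterfly(actual[0], actual[1], w1, w2)
    dc_p, ac_p = raht_butterfly(predicted[0], predicted[1], w1, w2)
    return dc_a - dc_p, ac_a - ac_p
```

A perfect inter prediction yields an all-zero residual, which is the case the scheme is designed to approach.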
该实施方式中,在所述待编码点云的参考帧数据的数据类型为第一数据类型的情况下,确定所述目标层的节点的属性信息的帧间预测值,并对所述帧间预测值进行变换处理得到帧间变换系数;基于所述帧间变换系数获取帧间残差系数。这样,在参考帧数据为已编码且已重建的点云帧数据的情况下,对帧间预测值进行变换处理得到帧间变换系数,进而得到帧间残差系数。In this implementation, when the data type of the reference frame data of the point cloud to be encoded is the first data type, the inter-frame prediction value of the attribute information of the node of the target layer is determined, and the inter-frame prediction value is transformed to obtain the inter-frame transformation coefficient; and the inter-frame residual coefficient is obtained based on the inter-frame transformation coefficient. In this way, when the reference frame data is the encoded and reconstructed point cloud frame data, the inter-frame prediction value is transformed to obtain the inter-frame transformation coefficient, and then the inter-frame residual coefficient is obtained.
可选地,所述确定所述目标层的节点的属性信息的帧间预测值之前,所述方法还包括:Optionally, before determining the inter-frame prediction value of the attribute information of the node of the target layer, the method further includes:
对所述待编码点云对应的参考帧中的点进行重构处理;Reconstructing points in a reference frame corresponding to the point cloud to be encoded;
基于重构处理后的参考帧构建预测树,所述预测树的树结构与所述变换树的树结构相同;Constructing a prediction tree based on the reconstructed reference frame, wherein the tree structure of the prediction tree is the same as the tree structure of the transformation tree;
所述确定所述目标层的节点的属性信息的帧间预测值,包括:The determining the inter-frame prediction value of the attribute information of the node of the target layer includes:
基于所述预测树及所述变换树确定所述目标层的节点的属性信息的帧间预测值。An inter-frame prediction value of attribute information of a node of the target layer is determined based on the prediction tree and the transform tree.
其中,对所述待编码点云对应的参考帧中的点进行重构处理可以实现参考帧重构,对于每一个点,都会有一个相同位置坐标的重构参考点,此参考点的属性为根据一定规则得到的参考帧中一定范围内的点的属性和;该规则为:若参考帧中的某点离当前帧当前点的距离比离当前帧下一个点的莫顿码距离更近,则此点被判定为用于重建当前点,属性值累加,权重加一,直到不满足条件,继续判断下一个重建点。此外,规定了判断开始和结束距离,距离太远会被认为属性值相差较大,不适合用于重建。具体的:与当前帧第一点的距离不大于64的参考帧中的点才会被用于重建。与当前帧最后一点的距离不大于64的参考帧中的点才会被用于重建。可以对重构后的参考帧执行自下而上构建预测树的操作,构建的预测树的树结构与所述变换树的树结构相同。Reconstructing the points in the reference frame corresponding to the point cloud to be encoded realizes reference-frame reconstruction. For each point, there is a reconstructed reference point with the same position coordinates; the attribute of this reference point is the sum of the attributes of points within a certain range in the reference frame, obtained according to the following rule: if a point in the reference frame is closer, in Morton code distance, to the current point of the current frame than to the next point of the current frame, that point is determined to be used for reconstructing the current point, its attribute value is accumulated, and the weight is increased by one; this continues until the condition is no longer met, after which the next reconstruction point is considered. In addition, start and end distances are specified for this judgment; a point that is too far away is considered to have a significantly different attribute value and is unsuitable for reconstruction. Specifically, only points in the reference frame whose distance to the first point of the current frame is not greater than 64 are used for reconstruction, and only points in the reference frame whose distance to the last point of the current frame is not greater than 64 are used for reconstruction. A bottom-up prediction tree construction operation can then be performed on the reconstructed reference frame, and the tree structure of the constructed prediction tree is the same as the tree structure of the transform tree.
一种实施方式中,所述基于所述预测树及所述变换树确定所述目标层的节点的属性信息的帧间预测值,可以包括:通过所述预测树及所述变换树获取各层节点的属性值和对应的重构参考帧各层节点的属性值,若参考帧对应位置子节点的属性值不为0,且在当前帧父节点属性值的20%到250%内,则使用参考帧该子节点属性值作为当前帧对应位置子节点的属性信息的帧间预测值,否则就使用当前帧帧内预测值作为属性信息的帧间预测值。In one implementation, the determining, based on the prediction tree and the transform tree, of the inter-frame prediction value of the attribute information of the node of the target layer may include: obtaining, through the prediction tree and the transform tree, the attribute values of the nodes of each layer and the attribute values of the nodes of each layer of the corresponding reconstructed reference frame; if the attribute value of the child node at the corresponding position in the reference frame is not 0 and is within 20% to 250% of the attribute value of the parent node in the current frame, using the attribute value of that reference-frame child node as the inter-frame prediction value of the attribute information of the child node at the corresponding position in the current frame; otherwise, using the intra-frame prediction value of the current frame as the inter-frame prediction value of the attribute information.
该实施方式中,对所述待编码点云对应的参考帧中的点进行重构处理;基于重构处理后的参考帧构建预测树,所述预测树的树结构与所述变换树的树结构相同;基于所述预测树及所述变换树确定所述目标层的节点的属性信息的帧间预测值。这样,在参考帧数据为已编码且已重建的点云帧数据的情况下,实现通过重构处理后的参考帧构建的预测树确定所述目标层的节点的属性信息的帧间预测值。In this implementation, the points in the reference frame corresponding to the to-be-encoded point cloud are reconstructed; a prediction tree is constructed based on the reconstructed reference frame, and the tree structure of the prediction tree is the same as the tree structure of the transform tree; and the inter-frame prediction value of the attribute information of the node of the target layer is determined based on the prediction tree and the transform tree. In this way, when the reference frame data is encoded and reconstructed point cloud frame data, the inter-frame prediction value of the attribute information of the node of the target layer is determined through the prediction tree constructed from the reconstructed reference frame.
可选地,所述第一数据类型的参考帧数据为已编码且已重建的点云帧数据。Optionally, the reference frame data of the first data type is encoded and reconstructed point cloud frame data.
可选地,在所述待编码点云的参考帧数据的数据类型为第二数据类型的情况下,所述确定所述目标层的节点的属性信息的帧间残差系数,包括:Optionally, when the data type of the reference frame data of the to-be-encoded point cloud is the second data type, the determining of the inter-frame residual coefficients of the attribute information of the node of the target layer includes:
若所述待编码点云的参考帧中存在与所述目标层的节点具有相同位置的参考节点的变换系数,则将所述具有相同位置的参考节点的变换系数确定为所述节点的属性信息的帧间变换系数;If there is a transformation coefficient of a reference node having the same position as the node of the target layer in the reference frame of the point cloud to be encoded, the transformation coefficient of the reference node having the same position is determined as the inter-frame transformation coefficient of the attribute information of the node;
基于所述节点的变换系数及所述节点的属性信息的帧间变换系数确定所述节点的属性信息的帧间残差系数。An inter-frame residual coefficient of the attribute information of the node is determined based on the transformation coefficient of the node and the inter-frame transformation coefficient of the attribute information of the node.
其中,所述基于所述节点的变换系数及所述节点的属性信息的帧间变换系数确定所述节点的属性信息的帧间残差系数,可以包括,对节点的属性信息进行RAHT变换,得到节点的变换系数;将所述节点的属性信息的帧间变换系数与节点的变换系数相减得到帧间残差系数。帧间残差系数也可以描述为帧间预测残差系数。The determining of the inter-frame residual coefficient of the attribute information of the node based on the transformation coefficient of the node and the inter-frame transformation coefficient of the attribute information of the node may include performing RAHT transformation on the attribute information of the node to obtain the transformation coefficient of the node; and subtracting the inter-frame transformation coefficient of the attribute information of the node from the transformation coefficient of the node to obtain the inter-frame residual coefficient. The inter-frame residual coefficient may also be described as an inter-frame prediction residual coefficient.
该实施方式中,若所述待编码点云的参考帧中存在与所述目标层的节点具有相同位置的参考节点的变换系数,则将所述具有相同位置的参考节点的变换系数确定为所述节点的属性信息的帧间变换系数;基于所述节点的变换系数及所述节点的属性信息的帧间变换系数确定所述节点的属性信息的帧间残差系数。这样,在参考帧数据为已编码的点云帧的重建变换系数的情况下,可以直接通过参考帧数据获取帧间变换系数,进而得到帧间残差系数。In this implementation, if there is a transformation coefficient of a reference node having the same position as the node of the target layer in the reference frame of the point cloud to be encoded, the transformation coefficient of the reference node having the same position is determined as the inter-frame transformation coefficient of the attribute information of the node; based on the transformation coefficient of the node and the inter-frame transformation coefficient of the attribute information of the node, the inter-frame residual coefficient of the attribute information of the node is determined. In this way, when the reference frame data is the reconstructed transformation coefficient of the encoded point cloud frame, the inter-frame transformation coefficient can be directly obtained through the reference frame data, and then the inter-frame residual coefficient can be obtained.
可选地,所述第二数据类型的参考帧数据为已编码的点云帧的重建变换系数。Optionally, the reference frame data of the second data type is reconstructed transformation coefficients of an encoded point cloud frame.
可选地,所述待编码点云的码流还包括第三编码结果,所述第三编码结果为所述第一预设距离及所述第二预设距离中的至少一项的编码结果。Optionally, the code stream of the point cloud to be encoded further includes a third encoding result, and the third encoding result is an encoding result of at least one of the first preset distance and the second preset distance.
参见图6,图6是本申请实施例提供的一种点云解码处理方法的流程图,可以应用于解码端,如图6所示,点云解码处理方法包括以下步骤:Referring to FIG. 6 , FIG. 6 is a flow chart of a point cloud decoding processing method provided in an embodiment of the present application, which can be applied to a decoding end. As shown in FIG. 6 , the point cloud decoding processing method includes the following steps:
步骤201、确定待解码点云对应的变换树的目标层与根节点之间的目标距离;Step 201, determining a target distance between a target layer and a root node of a transform tree corresponding to a point cloud to be decoded;
步骤202、对待解码点云的码流中的第一编码结果进行解码,得到目标预测残差值;Step 202: Decode the first encoding result in the code stream of the point cloud to be decoded to obtain a target prediction residual value;
步骤203、基于所述目标预测残差值及所述目标距离获取所述目标层的节点的属性信息;Step 203: acquiring attribute information of the nodes of the target layer based on the target prediction residual value and the target distance;
其中,在所述目标距离小于或等于第一预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行帧间预测得到的第一预测残差值;或Wherein, when the target distance is less than or equal to the first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the node of the target layer; or
在所述目标距离大于所述第一预设距离且小于或等于第二预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行预测处理得到的第二预测残差值;或In a case where the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the node of the target layer; or
在所述目标距离大于所述第二预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行帧内预测得到的第三预测残差值。 When the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the node of the target layer.
可选地,在所述目标距离大于所述第一预设距离且小于或等于第二预设距离的情况下,所述目标预测残差值包括基于率失真优化算法对所述目标层的节点的属性信息进行预测处理得到的第二预测残差值。Optionally, when the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by predicting the attribute information of the node of the target layer based on a rate-distortion optimization algorithm.
可选地,所述目标预测残差值包括所述第二预测残差值,所述待解码点云的码流还包括第二编码结果,所述第二编码结果用于表征所述第二预测残差值为帧内残差系数或者所述第二预测残差值为帧间残差系数;所述基于所述目标预测残差值及所述目标距离获取所述目标层的节点的属性信息,包括:Optionally, the target prediction residual value includes the second prediction residual value, and the code stream of the point cloud to be decoded also includes a second encoding result, and the second encoding result is used to characterize that the second prediction residual value is an intra-frame residual coefficient or the second prediction residual value is an inter-frame residual coefficient; the acquiring the attribute information of the node of the target layer based on the target prediction residual value and the target distance includes:
在所述目标距离大于所述第一预设距离且小于或等于第二预设距离的情况下,对所述第二编码结果进行解码;When the target distance is greater than the first preset distance and less than or equal to the second preset distance, decoding the second encoding result;
在确定所述第二编码结果表征所述第二预测残差值为帧内残差系数的情况下,基于所述目标层的节点的帧内变换系数及所述第二预测残差值确定所述目标层的节点的属性信息;或In the case where it is determined that the second encoding result represents that the second prediction residual value is an intra-frame residual coefficient, determining the attribute information of the node of the target layer based on the intra-frame transform coefficient of the node of the target layer and the second prediction residual value; or
在确定所述第二编码结果表征所述第二预测残差值为帧间残差系数的情况下,基于所述目标层的节点的帧间变换系数及所述第二预测残差值确定所述目标层的节点的属性信息。When it is determined that the second encoding result represents that the second prediction residual value is an inter-frame residual coefficient, attribute information of the node of the target layer is determined based on the inter-frame transform coefficient of the node of the target layer and the second prediction residual value.
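A minimal decoder-side sketch of the branch above, assuming additive residuals in the coefficient domain; `decode_node_coeff` and the scalar coefficients are illustrative only (the actual decoder operates on sets of RAHT coefficients and then inverse-transforms to obtain attribute values):

```python
def decode_node_coeff(flag, residual, intra_coeff, inter_coeff):
    """Recover the node's transform coefficient from the decoded mode flag
    and residual: coeff = predicted coeff + residual. Here flag == 1
    selects the intra branch and flag == 0 the inter branch (one of the
    two conventions mentioned in the text)."""
    predicted = intra_coeff if flag == 1 else inter_coeff
    return predicted + residual
```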
可选地,在所述待解码点云的参考帧数据的数据类型为第一数据类型的情况下,所述基于所述目标层的节点的帧间变换系数及所述第二预测残差值确定所述目标层的节点的属性信息,包括:Optionally, when the data type of the reference frame data of the point cloud to be decoded is the first data type, determining the attribute information of the node of the target layer based on the inter-frame transform coefficient of the node of the target layer and the second prediction residual value includes:
确定所述目标层的节点的属性信息的帧间预测值,并对所述帧间预测值进行变换处理得到帧间变换系数;Determine an inter-frame prediction value of the attribute information of the node of the target layer, and transform the inter-frame prediction value to obtain an inter-frame transformation coefficient;
基于所述帧间变换系数及所述第二预测残差值确定所述目标层的节点的属性信息。The attribute information of the node of the target layer is determined based on the inter-frame transform coefficient and the second prediction residual value.
可选地,所述确定所述目标层的节点的属性信息的帧间预测值之前,所述方法还包括:Optionally, before determining the inter-frame prediction value of the attribute information of the node of the target layer, the method further includes:
对所述待解码点云对应的参考帧中的点进行重构处理;Reconstructing points in a reference frame corresponding to the point cloud to be decoded;
基于重构处理后的参考帧构建预测树,所述预测树的树结构与所述变换树的树结构相同;Constructing a prediction tree based on the reconstructed reference frame, wherein the tree structure of the prediction tree is the same as the tree structure of the transformation tree;
所述确定所述目标层的节点的属性信息的帧间预测值,包括:The determining of the inter-frame prediction value of the attribute information of the node of the target layer includes:
基于所述预测树及所述变换树确定所述目标层的节点的属性信息的帧间预测值。An inter-frame prediction value of attribute information of a node of the target layer is determined based on the prediction tree and the transform tree.
可选地,所述第一数据类型的参考帧数据为已解码且已重建的点云帧数据。Optionally, the reference frame data of the first data type is decoded and reconstructed point cloud frame data.
可选地,在所述待解码点云的参考帧数据的数据类型为第二数据类型的情况下,所述基于所述目标层的节点的帧间变换系数及所述第二预测残差值确定所述目标层的节点的属性信息,包括:Optionally, when the data type of the reference frame data of the point cloud to be decoded is the second data type, determining the attribute information of the node of the target layer based on the inter-frame transform coefficient of the node of the target layer and the second prediction residual value includes:
若确定所述待解码点云的参考帧中存在与所述目标层的节点具有相同位置的参考节点的变换系数,则将所述具有相同位置的参考节点的变换系数确定为所述节点的属性信息的帧间变换系数;If it is determined that there is a transformation coefficient of a reference node having the same position as the node of the target layer in the reference frame of the point cloud to be decoded, the transformation coefficient of the reference node having the same position is determined as the inter-frame transformation coefficient of the attribute information of the node;
基于所述第二预测残差值及所述节点的属性信息的帧间变换系数获取所述节点的属性信息。 The attribute information of the node is obtained based on the second prediction residual value and the inter-frame transform coefficient of the attribute information of the node.
可选地,所述第二数据类型的参考帧数据为已解码的点云帧的重建变换系数。Optionally, the reference frame data of the second data type is reconstructed transform coefficients of a decoded point cloud frame.
可选地,所述待解码点云的码流还包括第三编码结果,所述方法还包括:Optionally, the code stream of the point cloud to be decoded further includes a third encoding result, and the method further includes:
对所述第三编码结果进行解码,得到所述第一预设距离及所述第二预设距离中的至少一项。The third encoding result is decoded to obtain at least one of the first preset distance and the second preset distance.
需要说明的是,本实施例作为与图4所示的实施例中对应的解码侧的实施方式,其具体的实施方式可以参见图4所示的实施例的相关说明,为了避免重复说明,本实施例不再赘述,且还可以达到相同有益效果。It should be noted that this embodiment is an implementation of the decoding side corresponding to the embodiment shown in Figure 4. Its specific implementation can refer to the relevant description of the embodiment shown in Figure 4. In order to avoid repeated description, this embodiment will not be repeated, and the same beneficial effects can be achieved.
以下通过两个具体的实施例对本申请实施例的点云编码处理方法和点云解码处理方法进行说明:The point cloud encoding processing method and the point cloud decoding processing method of the embodiment of the present application are described below through two specific embodiments:
实施例1:Embodiment 1:
该实施例由编码端执行。This embodiment is executed by the encoding end.
本实施例提出了一种基于双阈值的RAHT帧间预测方案,通过在参数集中引入两个语法元素thrh1(即第一预设距离)和thrh2(即第二预设距离)(该参数集可以为属性参数集APS,或属性数据单元ADU,或其他的参数集),将构建好的变换树进一步细分成三个层级:上层、中层、下层,对不同层级中的节点采用不同的预测模式。其中thrh1和thrh2分别为距离根节点的层数,一般来说thrh1<thrh2。设当前层距离根节点的距离为lvl(即目标距离)。This embodiment proposes a dual-threshold RAHT inter-frame prediction scheme. By introducing two syntax elements, thrh1 (i.e., the first preset distance) and thrh2 (i.e., the second preset distance), into a parameter set (which may be the attribute parameter set APS, the attribute data unit ADU, or another parameter set), the constructed transform tree is further subdivided into three levels: upper, middle, and lower, and different prediction modes are used for nodes at different levels. Here, thrh1 and thrh2 are each a number of layers counted from the root node, and in general thrh1<thrh2. Let the distance from the current layer to the root node be lvl (i.e., the target distance).
具体的编码方式描述如下:The specific encoding method is described as follows:
步骤(11):对点云进行重排序,基于几何距离对重排序后的点云数据构建N层变换树结构。采用自下而上的构建方法。Step (11): Reorder the point cloud and construct an N-layer transformation tree structure for the reordered point cloud data based on the geometric distance. A bottom-up construction method is adopted.
在构建变换树的过程中,需要为合并后的节点生成对应的莫顿码信息、属性信息以及权重信息。In the process of constructing the transformation tree, it is necessary to generate corresponding Morton code information, attribute information and weight information for the merged nodes.
步骤(12):自上而下解析树。自顶向下从根节点开始逐层进行上采样预测和区域自适应分层变换(RAHT)。Step (12): Parse the tree from top to bottom. Starting from the root node, upsampling prediction and regional adaptive hierarchical transform (RAHT) are performed layer by layer.
对变换树的节点进行预测时可以选用帧内预测或帧间预测两种预测方法。如图7所示,具体的选择方式如下:When predicting the nodes of the transform tree, two prediction methods can be selected: intra-frame prediction or inter-frame prediction. As shown in Figure 7, the specific selection method is as follows:
1)对于上层(lvl<=thrh1)的节点:由于变换树上层的节点较大,包含的点数多,即使两帧之间有运动也基本不会在这些上层的节点上有太大变化,此时若使用帧间预测的话,帧间属性预测残差可能会比采用帧内属性预测残差要小,意味着在上层使用帧间预测效果要更好。1) For nodes in the upper layer (lvl<=thrh1): Since the nodes in the upper layer of the transform tree are larger and contain more points, even if there is movement between two frames, there will basically not be much change in these upper layer nodes. At this time, if inter-frame prediction is used, the inter-frame attribute prediction residual may be smaller than the intra-frame attribute prediction residual, which means that the use of inter-frame prediction in the upper layer will have a better effect.
所以在上层默认使用帧间预测模式。Therefore, the inter-frame prediction mode is used by default in the upper layer.
2):对于中层(thrh1<lvl<=thrh2)的节点:在中间层的这些节点有些可能适合采用帧间预测,有些可能适合采用帧内预测,本实施例采用率失真优化(Rate–distortion optimization,RDO)算法选择合适的预测技术。2): For the nodes in the middle layer (thrh1<lvl<=thrh2): some of these nodes in the middle layer may be suitable for inter-frame prediction, and some may be suitable for intra-frame prediction. This embodiment uses the rate-distortion optimization (RDO) algorithm to select the appropriate prediction technology.
通过RDO算法分别判断使用帧内和帧间预测的花销(cost),当采用帧间预测的花销小于采用帧内预测的花销时,当前节点就使用帧间预测模式,否则就使用帧内预测模式。同时使用一个二元标志位flag标志当前节点使用的预测模式。例如使用1标志使用帧间预测模式,0标志使用帧内预测模式,反之亦然。The RDO algorithm is used to evaluate the cost of using intra-frame and inter-frame prediction respectively. When the cost of inter-frame prediction is less than the cost of intra-frame prediction, the current node uses the inter-frame prediction mode; otherwise it uses the intra-frame prediction mode. At the same time, a binary flag is used to mark the prediction mode used by the current node, for example, 1 to mark the inter-frame prediction mode and 0 to mark the intra-frame prediction mode, or vice versa.
3)对于下层(lvl>thrh2)的节点:由于下层块小,包含的点数较少,当两帧之间有一定的运动时,每个块内包含的点数可能就会发生很大的变化,不适合采用帧间预测。3) For nodes in the lower layer (lvl>thrh2): Since the lower layer blocks are small and contain fewer points, when there is a certain amount of motion between two frames, the number of points contained in each block may change greatly, which is not suitable for inter-frame prediction.
所以在下层默认使用帧内预测模式。Therefore, the intra prediction mode is used by default in the lower layer.
其中,对于1)中的上层节点和2)中的中层节点,所使用的帧间预测,需要根据参考帧数据类型的不同选择不同的预测方式。Among them, for the upper-layer nodes in 1) and the middle-layer nodes in 2), the inter-frame prediction used needs to select different prediction methods according to different reference frame data types.
(a):当参考帧数据是已经编码并重建完成的点云帧信息,包含属性信息和几何信息,则可以与当前帧执行相同的自下而上构建变换树的操作,并生成与当前帧节点对应的参考帧节点属性值,可以将其作为帧间预测值。(a): When the reference frame data is the point cloud frame information that has been encoded and reconstructed, including attribute information and geometric information, the same bottom-up transformation tree construction operation as the current frame can be performed, and the reference frame node attribute value corresponding to the current frame node can be generated, which can be used as the inter-frame prediction value.
具体过程如下:The specific process is as follows:
(a1):参考帧重构:把当前点位置一定范围内的参考帧中的点合并成一个点,属性相加,坐标取当前点的坐标。这样之后就能执行相同的预测树构建和解析。具体过程如下:(a1): Reference frame reconstruction: Merge the points in the reference frame within a certain range of the current point into one point, add the attributes, and take the coordinates of the current point. Then, the same prediction tree construction and analysis can be performed. The specific process is as follows:
对于每一个点,都会有一个相同位置坐标的重构参考点。此参考点的属性为根据一定规则得到的参考帧中一定范围内的点的属性和。For each point, there is a reconstructed reference point with the same position coordinates. The attributes of this reference point are the sum of the attributes of the points within a certain range in the reference frame obtained according to certain rules.
该规则为:若参考帧中的某点离当前帧当前点的距离比离当前帧下一个点的莫顿码距离更近,则此点被判定为用于重建当前点,属性值累加,权重加一。直到不满足条件。继续判断下一个重建点。The rule is: if the distance of a point in the reference frame to the current point of the current frame is closer than the Morton code distance to the next point of the current frame, then this point is determined to be used to reconstruct the current point, the attribute value is accumulated, and the weight is increased by one. Until the condition is not met, continue to determine the next reconstruction point.
此外,规定了判断开始和结束距离,距离太远会被认为属性值相差较大,不适合用于重建。具体的:In addition, the start and end distances are specified. If the distance is too far, it will be considered that the attribute values are too different and not suitable for reconstruction.
与当前帧第一点的距离不大于64的参考帧中的点才会被用于重建。Only points in the reference frame whose distance to the first point of the current frame is no greater than 64 will be used for reconstruction.
与当前帧最后一点的距离不大于64的参考帧中的点才会被用于重建。Only points in the reference frame whose distance to the last point of the current frame is no greater than 64 will be used for reconstruction.
(a2):自下而上构建树:对重构后的参考帧和当前帧执行自下而上构建预测树的操作,得到了相同结构的预测树,生成当前帧各层节点的属性值,权重值和对应的重构参考帧各层节点的预测属性值,权重值。(a2): Bottom-up tree construction: The bottom-up prediction tree construction operation is performed on the reconstructed reference frame and the current frame to obtain a prediction tree with the same structure, generating the attribute values and weight values of the nodes at each layer of the current frame and the corresponding predicted attribute values and weight values of the nodes at each layer of the reconstructed reference frame.
(a3):自上而下解析树:(a3): Top-down parse tree:
(a31):对于上层节点,当当前层距离根节点的层数小于等于thrh1时,使用帧间预测方法:(a31): For the upper layer nodes, when the number of layers from the current layer to the root node is less than or equal to thrh1, the inter-frame prediction method is used:
首先判断当前2*2*2的节点是否进行预测:First, determine whether the current 2*2*2 node is predicted:
若当前节点为根节点,则不进行预测,直接对节点的属性信息进行RAHT变换,得到直流系数(DC)和交流系数(AC)。使用得到的DC系数预测残差代替直接变换得到的DC系数。If the current node is a root node, no prediction is performed, and the attribute information of the node is directly subjected to RAHT transformation to obtain the DC coefficient and AC coefficients. The obtained DC coefficient prediction residual is used in place of the DC coefficient obtained by direct transformation.
DC系数预测残差为:当前帧根节点的DC系数减去参考帧根节点的DC系数。
DC_residual = DC_current - DC_reference
The DC coefficient prediction residual is: the DC coefficient of the root node of the current frame minus the DC coefficient of the root node of the reference frame.
DC_residual = DC_current - DC_reference
若当前节点不为根节点,则需要判断对当前八个子节点是否进行预测:具体可以参考GPCC编解码的内容说明部分中对帧内RAHT技术第二步的描述。若不进行预测,直接对原始属性值进行RAHT变换得到AC系数,对AC系数进行量化和熵编码。If the current node is not the root node, it is necessary to determine whether to predict the current eight child nodes; for details, refer to the description of the second step of the intra-frame RAHT technique in the content description section of the GPCC codec. If no prediction is performed, the original attribute values are directly subjected to RAHT transformation to obtain AC coefficients, which are then quantized and entropy coded.
若判断当前2*2*2节点进行预测,利用GPCC编解码的内容说明部分中对帧内RAHT技术第三步加权预测的相关方法得到当前2*2*2节点各个占位子节点的帧内属性预测值。If it is determined that the current 2*2*2 node is to be predicted, the related method of the third step weighted prediction of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each placeholder sub-node of the current 2*2*2 node.
之后对于每一个2*2*2的节点的占位子节点进行判断:若参考帧对应位置子节点的属性值不为0且在当前帧父节点属性值的20%到250%内,则使用参考帧该子节点属性值作为当前帧对应位置子节点属性预测值,否则就使用当前帧帧内预测值作为属性预测值。Then, a judgment is made for the placeholder child nodes of each 2*2*2 node: if the attribute value of the child node at the corresponding position in the reference frame is not 0 and is between 20% and 250% of the attribute value of the parent node in the current frame, the attribute value of the child node in the reference frame is used as the attribute prediction value of the child node at the corresponding position in the current frame; otherwise, the intra-frame prediction value of the current frame is used as the attribute prediction value.
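The per-child predictor selection above can be sketched as follows. This is a minimal illustration only: the function name and the use of scalar attribute values are assumptions (actual attributes may be multi-component color values); only the 20%-250% rule from the text is implemented.

```python
def select_child_predictions(ref_children, intra_preds, parent_attr):
    """Per-child predictor selection for one occupied 2*2*2 node.

    For each occupied child slot: use the co-located reference-frame
    child's attribute as the inter prediction if it is non-zero and lies
    within 20% to 250% of the current frame's parent attribute value;
    otherwise fall back to the intra prediction for that child.
    """
    preds = []
    for ref_attr, intra_attr in zip(ref_children, intra_preds):
        if ref_attr != 0 and 0.2 * parent_attr <= ref_attr <= 2.5 * parent_attr:
            preds.append(ref_attr)    # inter: copy co-located reference child
        else:
            preds.append(intra_attr)  # intra fallback
    return preds
```

For example, with a parent attribute of 100, a reference child value of 50 is accepted as the inter prediction, while 0 and 300 fall back to the intra values.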
之后对该2*2*2的节点得到的预测属性值Attr_predicted进行RAHT变换得到预测值的AC系数,将当前帧变换得到的AC系数AC_current减去预测得到的AC系数AC_predicted得到AC系数预测残差AC_residual,对残差进行量化编码。Then, the predicted attribute values Attr_predicted obtained for this 2*2*2 node are RAHT-transformed to obtain the AC coefficients of the prediction. The predicted AC coefficients AC_predicted are subtracted from the AC coefficients AC_current of the current frame transform to obtain the AC coefficient prediction residual AC_residual, and the residual is quantized and coded, as follows:

AC_residual = AC_current - AC_predicted
AC_predicted = Transform(Attr_predicted)
Attr_predicted = Attr_predicted_inter ? Attr_predicted_inter : Attr_predicted_intra
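A minimal sketch of the AC-residual computation described above, with `transform` as a stand-in for the forward RAHT (any function mapping child attributes to a (DC, AC, ...) coefficient list works for illustration); all names are hypothetical:

```python
def ac_prediction_residual(attr_current, attr_predicted, transform):
    """AC residual for one 2*2*2 node: AC_residual = AC_current - AC_predicted.

    `transform` stands in for the forward RAHT; here it only needs to
    map child attribute values to a coefficient list whose first entry
    is the DC coefficient and whose remaining entries are AC coefficients.
    """
    ac_current = transform(attr_current)[1:]    # drop DC, keep AC
    ac_predicted = transform(attr_predicted)[1:]
    return [c - p for c, p in zip(ac_current, ac_predicted)]
```

With a toy 2-point Haar-like transform `lambda a: [a[0] + a[1], a[0] - a[1]]`, attributes [6, 2] against a prediction [5, 3] give the AC residual [2].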
(a32):对于中层节点,当当前层距离根节点的层数大于thrh1且小于等于thrh2时,使用RDO算法决定是采用帧间预测还是采用帧内预测方法:(a32): For middle-level nodes, when the number of layers from the current layer to the root node is greater than thrh1 and less than or equal to thrh2, the RDO algorithm is used to decide whether to use inter-frame prediction or intra-frame prediction:
首先判断当前2*2*2的节点是否进行预测:具体可以参考GPCC编解码的内容说明部分中对帧内RAHT技术第二步的描述。若不进行预测,直接对原始属性值进行RAHT变换得到变换系数,对变换系数进行量化和熵编码。First, determine whether the current 2*2*2 node is predicted: For details, please refer to the description of the second step of the intra-frame RAHT technology in the content description of GPCC encoding and decoding. If no prediction is performed, the original attribute value is directly RAHT transformed to obtain the transformation coefficient, and the transformation coefficient is quantized and entropy encoded.
若判断当前2*2*2节点进行预测,利用GPCC编解码的内容说明部分中对帧内RAHT技术第三步加权预测的相关方法得到当前2*2*2节点各个占位子节点的帧内属性预测值。If it is determined that the current 2*2*2 node is to be predicted, the related method of the third step weighted prediction of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each placeholder sub-node of the current 2*2*2 node.
之后对于每一个2*2*2的节点利用RDO算法进行判断使用哪一种预测模式,具体过程如下:Then, for each 2*2*2 node, the RDO algorithm is used to determine which prediction mode to use. The specific process is as follows:
对帧间预测值和帧内预测值分别进行RAHT变换,得到相应的变换系数。The inter-frame prediction value and the intra-frame prediction value are respectively subjected to RAHT transformation to obtain corresponding transformation coefficients.
分别将帧间预测值和帧内预测值的变换系数与真实值的RAHT变换系数作差,得到帧间预测残差系数和帧内预测残差系数。The predicted transform coefficients are each subtracted from the RAHT transform coefficients of the true values to obtain the inter-frame prediction residual coefficients and the intra-frame prediction residual coefficients.
根据RDO算法进行判断是采用帧间预测残差系数还是帧内预测残差系数:According to the RDO algorithm, it is determined whether to use the inter-frame prediction residual coefficient or the intra-frame prediction residual coefficient:
分别计算采用帧内和帧间残差系数的Cost_inter和Cost_intra,编码帧内或者帧间残差系数的花销(Cost)由下式得到:The costs Cost_inter and Cost_intra of using the inter-frame and intra-frame residual coefficients are calculated respectively; the cost (Cost) of encoding the intra-frame or inter-frame residual coefficients is given by:

Cost = D + λ*R
其中D(distortion:失真)为使用该预测残差系数后得到的重建属性与原始属性的绝对误差和(SAD)(也可以是其他方法)。Here D (distortion) is the sum of absolute differences (SAD) between the reconstructed attributes obtained using the prediction residual coefficients and the original attributes (other distortion measures may also be used).
其中R(rate:码率)为编码该预测残差系数需要使用的比特数。R (rate: bit rate) is the number of bits required to encode the prediction residual coefficients.
其中λ的值取决于QP,由下式计算(也可以是其他公式):The value of λ depends on QP and is calculated by the following formula (other formulas are also possible):

λ^2 = 0.85*2^((QP-4)/3)
如果Cost_inter小于Cost_intra,就使用帧间预测残差,否则就使用帧内预测残差,并使用一个标志位进行标记。例如,1标志使用帧间预测残差,0标志使用帧内预测残差,反之亦然。If Cost_inter is less than Cost_intra, the inter-frame prediction residual is used; otherwise, the intra-frame prediction residual is used, and a flag bit marks the choice. For example, 1 may mark the inter-frame prediction residual and 0 the intra-frame prediction residual, or vice versa.
对判断得到的AC残差系数进行量化熵编码。The determined AC residual coefficients are quantized and entropy coded.
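The RDO decision for middle-layer nodes can be sketched as below, using the cost Cost = D + λ*R and the λ formula from the text; the function name and its inputs (precomputed distortion and rate per mode) are assumptions:

```python
import math

def rdo_select(d_inter, r_inter, d_intra, r_intra, qp):
    """RDO mode decision for a middle-layer node: Cost = D + lambda * R.

    D is a distortion measure (e.g. SAD between reconstructed and
    original attributes), R the bits needed to code the residual
    coefficients, and lambda^2 = 0.85 * 2 ** ((qp - 4) / 3) per the text.
    Returns (flag, cost): flag 1 = inter residual, 0 = intra residual.
    """
    lam = math.sqrt(0.85 * 2 ** ((qp - 4) / 3))
    cost_inter = d_inter + lam * r_inter
    cost_intra = d_intra + lam * r_intra
    flag = 1 if cost_inter < cost_intra else 0
    return flag, min(cost_inter, cost_intra)
```

The flag returned here corresponds to the flag bit written to the bitstream for middle-layer nodes.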
(a33):对于下层节点,当当前层距离根节点的层数大于thrh2时,使用帧内预测方法:(a33): For lower-layer nodes, when the number of layers from the current layer to the root node is greater than thrh2, the intra-frame prediction method is used:
首先判断当前2*2*2的节点是否进行预测:具体可以参考GPCC编解码的内容说明部分中对帧内RAHT技术第二步的描述。若不进行预测,直接对原始属性值进行RAHT变换得到变换系数,对变换系数进行量化和熵编码。First, determine whether the current 2*2*2 node is predicted: For details, please refer to the description of the second step of the intra-frame RAHT technology in the content description of GPCC encoding and decoding. If no prediction is performed, the original attribute value is directly RAHT transformed to obtain the transformation coefficient, and the transformation coefficient is quantized and entropy encoded.
若判断当前2*2*2节点进行预测,利用GPCC编解码的内容说明部分中对帧内RAHT技术第三步加权预测的相关方法得到当前2*2*2节点各个占位子节点的帧内属性预测值。If it is determined that the current 2*2*2 node is to be predicted, the related method of the third step weighted prediction of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each placeholder sub-node of the current 2*2*2 node.
之后使用帧内预测值作为属性预测值,对原始属性值和帧内属性预测值分别进行RAHT变换并作差,对残差进行量化熵编码。Then, the intra-frame prediction value is used as the attribute prediction value, and the original attribute value and the intra-frame attribute prediction value are subjected to RAHT transformation and difference, and the residual is quantized and entropy coded.
(b):当参考帧数据是已经编码并完成的点云帧的重建变换系数时,可以直接将该变换系数作为帧间预测的变换系数,与当前帧节点中帧内预测属性值的变换系数进行比较。(b): When the reference frame data is the reconstructed transform coefficient of an already encoded and completed point cloud frame, the transform coefficient can be directly used as the transform coefficient for inter-frame prediction and compared with the transform coefficient of the intra-frame prediction attribute value in the current frame node.
具体编码方式如下:The specific encoding method is as follows:
(b1):对于上层节点,当当前层距离根节点的层数小于等于thrh1时,使用帧间预测方法:(b1): For the upper layer nodes, when the number of layers from the current layer to the root node is less than or equal to thrh1, the inter-frame prediction method is used:
对当前节点的属性信息进行RAHT变换得到变换系数,再判断在参考帧中是否存在与当前待编码节点具有相同位置的参考节点的变换系数,若存在,则将该变换系数作为变换系数的帧间预测值,并与当前帧的当前节点的变换系数作差得到变换残差系数,进行后续的量化和熵编码。Perform RAHT transform on the attribute information of the current node to obtain the transform coefficient, and then determine whether there is a transform coefficient of a reference node with the same position as the current node to be encoded in the reference frame. If so, use the transform coefficient as the inter-frame prediction value of the transform coefficient, and subtract it from the transform coefficient of the current node in the current frame to obtain the transform residual coefficient for subsequent quantization and entropy coding.
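A sketch of this coefficient-domain inter prediction, under the assumption that the reference frame's reconstructed coefficients are looked up by node position in a dictionary (`ref_coeff_map` and the function name are hypothetical):

```python
def coeff_residual_b1(node_pos, current_coeffs, ref_coeff_map):
    """(b1) upper layers, reference data = reconstructed transform coefficients.

    If the reference frame holds coefficients for a node at the same
    position, they serve as the inter prediction and the difference is
    coded; otherwise the current coefficients are coded directly.
    `ref_coeff_map` maps node positions to reference coefficient lists.
    """
    ref = ref_coeff_map.get(node_pos)
    if ref is None:
        return current_coeffs  # no co-located reference node: code as-is
    return [c - p for c, p in zip(current_coeffs, ref)]
```

The returned residual (or raw) coefficients would then go through the quantization and entropy coding described above.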
(b2):对于中层节点,当当前层距离根节点的层数大于thrh1且小于等于thrh2时,使用RDO算法决定是采用帧间预测还是采用帧内预测方法:(b2): For middle-level nodes, when the number of layers from the current layer to the root node is greater than thrh1 and less than or equal to thrh2, the RDO algorithm is used to decide whether to use inter-frame prediction or intra-frame prediction:
首先判断当前2*2*2的节点是否进行预测:具体可以参考GPCC编解码的内容说明部分中对帧内RAHT技术第二步的描述。若不进行预测,直接对原始属性值进行RAHT变换得到变换系数,对变换系数进行量化和熵编码。First, determine whether the current 2*2*2 node is predicted: for details, refer to the description of the second step of the intra-frame RAHT technique in the GPCC codec content description section. If no prediction is performed, the RAHT transform is applied directly to the original attribute values to obtain the transform coefficients, which are then quantized and entropy coded.
若判断进行预测,则利用GPCC编解码的内容说明部分中对帧内RAHT技术第三步加权预测的相关方法得到当前2*2*2节点各个占位子节点的帧内属性预测值。If it is determined to make a prediction, the related method of the third step weighted prediction of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each placeholder sub-node of the current 2*2*2 node.
之后对于每一个2*2*2的节点利用RDO算法进行判断使用哪一种预测模式,具体过程如下:Then, for each 2*2*2 node, the RDO algorithm is used to determine which prediction mode to use. The specific process is as follows:
对当前节点的原始属性值和帧内预测值分别进行变换,得到相应的变换系数,作差得到帧内预测残差系数。The original attribute value and the intra-frame prediction value of the current node are transformed respectively to obtain the corresponding transformation coefficients, and the intra-frame prediction residual coefficients are obtained by difference.
若在参考帧中存在与当前节点相同位置的变换系数,将其作为变换系数的帧间预测值,与当前节点真实RAHT变换系数作差得到帧间预测残差系数。若不存在,直接使用帧内预测残差系数,并进行后续的量化和熵编码。If there is a transform coefficient at the same position as the current node in the reference frame, it is used as the inter-frame prediction value of the transform coefficient, and the inter-frame prediction residual coefficient is obtained by subtracting it from the actual RAHT transform coefficient of the current node. If it does not exist, the intra-frame prediction residual coefficient is directly used, and subsequent quantization and entropy coding are performed.
根据RDO算法进行判断是采用帧间预测残差系数还是帧内预测残差系数: According to the RDO algorithm, it is determined whether to use the inter-frame prediction residual coefficient or the intra-frame prediction residual coefficient:
分别计算采用帧内和帧间残差系数的Cost_inter和Cost_intra,编码帧内或者帧间残差系数的花销(Cost)由下式得到:The costs Cost_inter and Cost_intra of using the inter-frame and intra-frame residual coefficients are calculated respectively; the cost (Cost) of encoding the intra-frame or inter-frame residual coefficients is given by:

Cost = D + λ*R
其中D(distortion:失真)为使用该预测残差系数后得到的重建属性与原始属性的绝对误差和(SAD)(也可以是其他方法)。Here D (distortion) is the sum of absolute differences (SAD) between the reconstructed attributes obtained using the prediction residual coefficients and the original attributes (other distortion measures may also be used).
其中R(rate:码率)为编码该预测残差系数需要使用的比特数。R (rate: bit rate) is the number of bits required to encode the prediction residual coefficients.
其中λ的值取决于QP,由下式计算(也可以是其他公式):The value of λ depends on QP and is calculated by the following formula (other formulas are also possible):

λ^2 = 0.85*2^((QP-4)/3)
如果Cost_inter小于Cost_intra,就使用帧间预测残差,否则就使用帧内预测残差,并使用一个标志位进行标记。例如,1标志使用帧间预测残差,0标志使用帧内预测残差,反之亦然。If Cost_inter is less than Cost_intra, the inter-frame prediction residual is used; otherwise, the intra-frame prediction residual is used, and a flag bit marks the choice. For example, 1 may mark the inter-frame prediction residual and 0 the intra-frame prediction residual, or vice versa.
对判断得到的AC残差系数进行量化和熵编码。The determined AC residual coefficients are quantized and entropy encoded.
(b3):对于下层节点,当当前层距离根节点的层数大于thrh2时,使用帧内预测方法(同(a3)中的下层节点的编码方式一样):(b3): For lower-layer nodes, when the number of layers from the current layer to the root node is greater than thrh2, the intra-frame prediction method is used (the same as the encoding of the lower-layer nodes in (a3)):
首先判断当前2*2*2的节点是否进行预测:具体可以参考GPCC编解码的内容说明部分中对帧内RAHT技术第二步的描述。若不进行预测,直接对原始属性值进行RAHT变换得到变换系数,对变换系数进行量化和熵编码。First, determine whether the current 2*2*2 node is predicted: For details, please refer to the description of the second step of the intra-frame RAHT technology in the content description of GPCC encoding and decoding. If no prediction is performed, the original attribute value is directly RAHT transformed to obtain the transformation coefficient, and the transformation coefficient is quantized and entropy encoded.
若判断当前2*2*2节点进行预测,利用GPCC编解码的内容说明部分中对帧内RAHT技术第三步加权预测的相关方法得到当前2*2*2节点各个占位子节点的帧内属性预测值。If it is determined that the current 2*2*2 node is to be predicted, the related method of the third step weighted prediction of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each placeholder sub-node of the current 2*2*2 node.
之后使用帧内预测值作为属性预测值,对原始属性值和帧内属性预测值分别进行RAHT变换并作差,对残差进行量化熵编码。Then, the intra-frame prediction value is used as the attribute prediction value, and the original attribute value and the intra-frame attribute prediction value are subjected to RAHT transformation and difference, and the residual is quantized and entropy coded.
实施例2:Embodiment 2:
该实施例由解码端执行。This embodiment is executed by a decoding end.
步骤(21):首先从输入码流中解码得到语法元素thrh1和thrh2。Step (21): First, decode the input code stream to obtain syntax elements thrh1 and thrh2.
从输入码流中对量化编码的变换系数或系数残差进行熵解码和反量化,得到重建后的变换系数或变换残差系数。从输入码流中解码得到中层节点使用的预测模式的标志位。The quantized coded transform coefficients or coefficient residuals are entropy decoded and dequantized from the input bitstream to obtain the reconstructed transform coefficients or transform residual coefficients. The flag bit of the prediction mode used by the middle-level node is decoded from the input bitstream.
由于属性解码是在几何解码之后的,所以在解码属性信息的时候,已经得到了重建好的每个点的位置坐标。Since attribute decoding comes after geometric decoding, the reconstructed position coordinates of each point are already obtained when the attribute information is decoded.
步骤(22):对点云进行重排序,基于几何距离对重排序后的点云数据构建N层变换树结构。采用自下而上的构建方法。Step (22): Reorder the point cloud and construct an N-layer transformation tree structure for the reordered point cloud data based on the geometric distance. A bottom-up construction method is adopted.
步骤(23):自上而下解析树。自顶向下从根节点开始逐层进行上采样预测和RAHT反变换,得到节点的重建属性值。Step (23): parse the tree from top to bottom. Starting from the root node, perform upsampling prediction and RAHT inverse transformation layer by layer to obtain the reconstructed attribute value of the node.
对变换树的节点进行预测时可以选用帧内预测或帧间预测两种预测方法。如图8所示,具体的选择方式如下:When predicting the nodes of the transform tree, two prediction methods can be selected: intra-frame prediction or inter-frame prediction. As shown in Figure 8, the specific selection method is as follows:
1)对于上层(lvl<=thrh1)的节点: 1) For nodes in the upper layer (lvl<=thrh1):
默认使用帧间预测模式。By default, inter prediction mode is used.
2):对于中层(thrh1<lvl<=thrh2)的节点:根据解码得到的预测模式的标志位flag,确定采用的是帧内预测还是帧间预测。2): For the nodes in the middle layer (thrh1<lvl<=thrh2): determine whether intra-frame prediction or inter-frame prediction is used according to the prediction mode flag obtained by decoding.
根据flag得到对应的预测值和解码重建得到的残差变换系数相加重建变换系数。The corresponding prediction value obtained according to the flag is added to the residual transform coefficient obtained by decoding and reconstruction to reconstruct the transform coefficient.
3)对于下层(lvl>thrh2)的节点:由于下层块小,包含的点数较少,当两帧之间有一定的运动时,每个块内包含的点数可能就会发生很大的变化,不适合采用帧间预测。3) For nodes in the lower layer (lvl>thrh2): Since the lower layer blocks are small and contain fewer points, when there is a certain amount of motion between two frames, the number of points contained in each block may change greatly, which is not suitable for inter-frame prediction.
所以在下层默认使用帧内预测模式。Therefore, the intra prediction mode is used by default in the lower layer.
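The three-level mode selection in steps 1)-3) can be sketched as a simple function of the layer depth `lvl` (distance from the root) and the decoded flag; the names are hypothetical:

```python
def prediction_mode(lvl, thrh1, thrh2, flag=None):
    """Decoder-side prediction mode per layer depth lvl.

    Upper layers (lvl <= thrh1) default to inter prediction, lower
    layers (lvl > thrh2) default to intra prediction; for middle layers
    (thrh1 < lvl <= thrh2) the decoded flag bit decides.
    """
    if lvl <= thrh1:
        return "inter"
    if lvl <= thrh2:
        return "inter" if flag == 1 else "intra"
    return "intra"
```

Only the middle layers consume a flag bit from the bitstream; the upper and lower layers need no signaling.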
其中,对于1)中的上层节点和2)中的中层节点,所使用的帧间预测,需要根据参考帧数据类型的不同选择不同的预测方式。Among them, for the upper-layer nodes in 1) and the middle-layer nodes in 2), the inter-frame prediction used needs to select different prediction methods according to different reference frame data types.
(a):当参考帧数据是已经编码并重建完成的点云帧信息,包含属性信息和几何信息,则可以与当前帧执行相同的自下而上构建变换树的操作,并生成与当前帧节点对应的参考帧节点属性值,可以将其作为帧间预测值。具体过程在编码端已经介绍,此处不再赘述。(a): When the reference frame data is the point cloud frame information that has been encoded and reconstructed, including attribute information and geometric information, the same bottom-up transformation tree construction operation can be performed as the current frame, and the reference frame node attribute value corresponding to the current frame node can be generated, which can be used as the inter-frame prediction value. The specific process has been introduced at the encoding end and will not be repeated here.
当得到帧间预测值时,与解码得到的变换残差系数相加得到重建的变换系数,进行RAHT反变换,得到每个子节点的属性重建值。When the inter-frame prediction value is obtained, it is added to the decoded transform residual coefficient to obtain the reconstructed transform coefficient, and the RAHT inverse transform is performed to obtain the attribute reconstruction value of each child node.
(a1):对于上层节点,当当前层距离根节点的层数小于等于thrh1时,使用帧间预测方法:(a1): For the upper layer nodes, when the number of layers from the current layer to the root node is less than or equal to thrh1, the inter-frame prediction method is used:
首先判断当前2*2*2的节点是否进行预测:First, determine whether the current 2*2*2 node is predicted:
若当前节点为根节点,将解码得到的根节点DC系数残差加上参考帧根节点的DC系数,得到当前帧的DC系数;使用根节点重建后的AC系数,之后进行RAHT反变换,得到每个子节点的属性重建值;If the current node is the root node, the decoded DC coefficient residual of the root node is added to the DC coefficient of the reference frame root node to obtain the DC coefficient of the current frame; the reconstructed AC coefficients of the root node are used, and the RAHT inverse transform is then performed to obtain the attribute reconstruction value of each child node;
若当前节点不为根节点,则需要判断对当前八个子节点是否进行预测:具体可以参考GPCC编解码的内容说明部分中对帧内RAHT技术第二步的描述。If the current node is not the root node, it is necessary to determine whether to predict the current eight child nodes: for details, please refer to the description of the second step of the intra-frame RAHT technology in the content description section of the GPCC codec.
若不进行预测,DC系数直接继承父节点的属性重建值,同时使用当前节点重建后的AC系数,进行RAHT反变换,得到每个子节点的属性重建值;If no prediction is performed, the DC coefficient directly inherits the attribute reconstruction value of the parent node, and uses the reconstructed AC coefficient of the current node to perform RAHT inverse transformation to obtain the attribute reconstruction value of each child node;
若判断当前2*2*2节点进行预测,利用GPCC编解码的内容说明部分中对帧内RAHT技术第三步加权预测的相关方法得到当前2*2*2节点各个占位子节点的帧内属性预测值。If it is determined that the current 2*2*2 node is to be predicted, the related method of the third step weighted prediction of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each placeholder sub-node of the current 2*2*2 node.
之后对于每一个2*2*2的节点的占位子节点进行判断:若参考帧对应位置子节点的属性值不为0且在当前帧父节点属性值的20%到250%内,则使用参考帧该子节点属性值作为当前帧对应位置子节点属性预测值,否则就使用当前帧帧内预测值作为属性预测值。Then, a judgment is made for the placeholder child nodes of each 2*2*2 node: if the attribute value of the child node at the corresponding position in the reference frame is not 0 and is between 20% and 250% of the attribute value of the parent node in the current frame, the attribute value of the child node in the reference frame is used as the attribute prediction value of the child node at the corresponding position in the current frame; otherwise, the intra-frame prediction value of the current frame is used as the attribute prediction value.
之后对该2*2*2的节点得到的属性预测值进行RAHT变换得到属性预测值的AC系数。DC系数继承父节点的属性重建值,AC系数为重建好的AC系数残差加上属性预测值的AC系数,之后进行RAHT反变换,得到每个子节点的属性重建值。Then, the attribute prediction value obtained by the 2*2*2 node is transformed by RAHT to obtain the AC coefficient of the attribute prediction value. The DC coefficient inherits the attribute reconstruction value of the parent node, and the AC coefficient is the reconstructed AC coefficient residual plus the AC coefficient of the attribute prediction value. Then, the RAHT inverse transformation is performed to obtain the attribute reconstruction value of each child node.
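A minimal sketch of this reconstruction step, with `inverse_transform` standing in for the inverse RAHT; the names and the scalar-attribute simplification are assumptions:

```python
def reconstruct_node(dc_from_parent, ac_residual, ac_predicted, inverse_transform):
    """Decoder-side reconstruction of one 2*2*2 node.

    The DC coefficient is inherited from the parent's reconstructed
    attribute; each AC coefficient is the decoded residual plus the AC
    coefficient of the prediction; the inverse RAHT (stand-in:
    `inverse_transform`) then yields the child attribute values.
    """
    ac = [r + p for r, p in zip(ac_residual, ac_predicted)]
    return inverse_transform([dc_from_parent] + ac)
```

With a toy 2-point inverse transform `lambda c: [(c[0] + c[1]) / 2, (c[0] - c[1]) / 2]`, a parent DC of 8, residual [2], and predicted AC [2] reconstruct child attributes [6.0, 2.0].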
(a2):对于中层节点,当当前层距离根节点的层数大于thrh1且小于等于thrh2时,利用flag确定是采用帧间预测还是采用帧内预测方法:(a2): For middle-level nodes, when the number of layers from the current layer to the root node is greater than thrh1 and less than or equal to thrh2, the flag is used to determine whether to use inter-frame prediction or intra-frame prediction:
首先判断当前2*2*2的节点是否进行预测:具体可以参考GPCC编解码的内容说明部分中对帧内RAHT技术第二步的描述。First, determine whether the current 2*2*2 node is predicted: for details, refer to the description of the second step of the intra-frame RAHT technique in the GPCC codec content description section.
若不进行预测,DC系数直接继承父节点的属性重建值,同时使用当前节点重建后的AC系数,进行RAHT反变换,得到每个子节点的属性重建值;If no prediction is performed, the DC coefficient directly inherits the attribute reconstruction value of the parent node, and uses the reconstructed AC coefficient of the current node to perform RAHT inverse transformation to obtain the attribute reconstruction value of each child node;
若判断当前2*2*2节点进行预测,利用GPCC编解码的内容说明部分中对帧内RAHT技术第三步加权预测的相关方法得到当前2*2*2节点各个占位子节点的帧内属性预测值。If it is determined that the current 2*2*2 node is to be predicted, the related method of the third step weighted prediction of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each placeholder sub-node of the current 2*2*2 node.
根据flag判断当前使用的是帧内还是帧间预测模式。Determine whether intra-frame or inter-frame prediction mode is currently used based on the flag.
对帧间预测值和帧内预测值分别进行RAHT变换,得到相应的AC系数。RAHT transform is performed on the inter-frame prediction value and the intra-frame prediction value respectively to obtain the corresponding AC coefficients.
DC系数继承父节点的属性重建值,AC系数为重建好的AC系数残差加上该flag对应的AC系数,之后进行RAHT反变换,得到每个子节点的属性重建值。The DC coefficient inherits the attribute reconstruction value of the parent node, and the AC coefficient is the reconstructed AC coefficient residual plus the AC coefficient corresponding to the flag. Then, the RAHT inverse transform is performed to obtain the attribute reconstruction value of each child node.
(a3):对于下层节点,当当前层距离根节点的层数大于thrh2时,使用帧内预测方法:(a3): For lower-layer nodes, when the number of layers from the current layer to the root node is greater than thrh2, the intra-frame prediction method is used:
首先判断当前2*2*2的节点是否进行预测:具体可以参考GPCC编解码的内容说明部分中对帧内RAHT技术第二步的描述。First, determine whether the current 2*2*2 node is predicted: For details, please refer to the description of the second step of the intra-frame RAHT technology in the content description section of GPCC encoding and decoding.
若不进行预测,DC系数直接继承父节点的属性重建值,同时使用当前节点重建后的AC系数,进行RAHT反变换,得到每个子节点的属性重建值;If no prediction is performed, the DC coefficient directly inherits the attribute reconstruction value of the parent node, and uses the reconstructed AC coefficient of the current node to perform RAHT inverse transformation to obtain the attribute reconstruction value of each child node;
若判断当前2*2*2节点进行预测,利用GPCC编解码的内容说明部分中对帧内RAHT技术第三步加权预测的相关方法得到当前2*2*2节点各个占位子节点的帧内属性预测值。If it is determined that the current 2*2*2 node is to be predicted, the related method of the third step weighted prediction of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each placeholder sub-node of the current 2*2*2 node.
之后对该2*2*2的节点得到的帧内属性预测值进行RAHT变换得到属性预测值的AC系数。Then, the intra-frame attribute prediction value obtained by the 2*2*2 node is subjected to RAHT transformation to obtain the AC coefficient of the attribute prediction value.
DC系数继承父节点的属性重建值,AC系数为重建好的AC系数残差加上属性预测值的AC系数,之后进行RAHT反变换,得到每个子节点的属性重建值。The DC coefficient inherits the attribute reconstruction value of the parent node, and the AC coefficient is the reconstructed AC coefficient residual plus the AC coefficient of the attribute prediction value. After that, the RAHT inverse transform is performed to obtain the attribute reconstruction value of each child node.
(b):当参考帧数据是已经编码并完成的点云帧的重建变换系数时,可以直接将该变换系数作为帧间预测的变换系数,与当前帧节点中帧内预测属性值的变换系数进行比较。(b): When the reference frame data is the reconstructed transform coefficient of an already encoded and completed point cloud frame, the transform coefficient can be directly used as the transform coefficient for inter-frame prediction and compared with the transform coefficient of the intra-frame prediction attribute value in the current frame node.
(b1):对于上层节点,当当前层距离根节点的层数小于等于thrh1时,使用帧间预测方法(b1): For the upper layer nodes, when the number of layers from the current layer to the root node is less than or equal to thrh1, the inter-frame prediction method is used
判断在参考帧中是否存在与当前待解码节点具有相同位置的参考节点的变换系数,若存在,则将该变换系数作为变换系数的帧间预测值,并与解码得到的重建变换残差系数求和得到当前节点的重建变换系数,进行RAHT反变换,得到每个子节点的属性重建值。Determine whether the reference frame contains the transform coefficients of a reference node at the same position as the current node to be decoded. If so, use those coefficients as the inter-frame prediction of the transform coefficients and sum them with the decoded reconstructed transform residual coefficients to obtain the reconstructed transform coefficients of the current node; then perform the RAHT inverse transform to obtain the attribute reconstruction value of each child node.
若不存在,则直接对解码得到的重建变换系数进行RAHT反变换得到每个子节点的属性重建值。If it does not exist, the decoded reconstruction transform coefficients are directly subjected to RAHT inverse transformation to obtain the attribute reconstruction value of each child node.
(b2):对于中层节点,当当前层距离根节点的层数大于thrh1且小于等于thrh2时,利用flag确定是采用帧间预测还是采用帧内预测方法:(b2): For middle-level nodes, when the number of layers from the current layer to the root node is greater than thrh1 and less than or equal to thrh2, the flag is used to determine whether to use inter-frame prediction or intra-frame prediction:
首先判断当前2*2*2的节点是否进行预测:具体可以参考GPCC编解码的内容说明部分中对帧内RAHT技术第二步的描述。First, determine whether the current 2*2*2 node is predicted: For details, please refer to the description of the second step of the intra-frame RAHT technology in the content description section of GPCC encoding and decoding.
若不进行预测,DC系数直接继承父节点的属性重建值,同时使用当前节点重建后的AC系数,进行RAHT反变换,得到每个子节点的属性重建值; If no prediction is performed, the DC coefficient directly inherits the attribute reconstruction value of the parent node, and uses the reconstructed AC coefficient of the current node to perform RAHT inverse transformation to obtain the attribute reconstruction value of each child node;
若判断当前2*2*2节点进行预测,利用GPCC编解码的内容说明部分中对帧内RAHT技术第三步加权预测的相关方法得到当前2*2*2节点各个占位子节点的帧内属性预测值。If it is determined that the current 2*2*2 node is to be predicted, the related method of the third step weighted prediction of the intra-frame RAHT technology in the content description part of the GPCC codec is used to obtain the intra-frame attribute prediction value of each placeholder sub-node of the current 2*2*2 node.
1、对帧内预测值进行变换得到的帧内预测的变换系数。1. Transform the intra-frame prediction value to obtain the transform coefficient of the intra-frame prediction.
2、判断在参考帧中是否存在与当前节点相同位置的变换系数,若不存在,直接使用帧内预测系数,与解码得到的重建变换残差系数求和得到当前节点的重建变换系数,进行RAHT反变换,得到每个子节点的属性重建值。2. Determine whether the reference frame contains transform coefficients at the same position as the current node. If not, directly use the intra-frame prediction coefficients and sum them with the decoded reconstructed transform residual coefficients to obtain the reconstructed transform coefficients of the current node; then perform the RAHT inverse transform to obtain the attribute reconstruction value of each child node.
3、若存在,根据flag判断当前使用的是帧内还是帧间预测模式。根据flag得到预测的变换系数,与解码得到的重建变换残差系数求和得到当前节点的重建变换系数,进行RAHT反变换,得到每个子节点的属性重建值。3. If it exists, determine whether the current prediction mode is intra-frame or inter-frame according to the flag. Get the predicted transform coefficient according to the flag, sum it with the decoded reconstruction transform residual coefficient to get the reconstruction transform coefficient of the current node, perform RAHT inverse transform, and get the attribute reconstruction value of each child node.
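Steps 1-3 above can be sketched as follows; the function signature is an assumption, with `ref_coeffs=None` representing the absence of co-located reference coefficients:

```python
def decode_b2_coeffs(flag, residual, intra_pred_coeffs, ref_coeffs):
    """(b2) decoder, middle layers, coefficient-domain reference data.

    When no co-located reference coefficients exist, the intra
    prediction coefficients are used unconditionally; otherwise the
    decoded flag picks inter (reference coefficients) or intra
    (coefficients of the intra-predicted attributes). The prediction is
    added to the decoded residual to reconstruct the coefficients.
    """
    if ref_coeffs is None:
        pred = intra_pred_coeffs
    else:
        pred = ref_coeffs if flag == 1 else intra_pred_coeffs
    return [r + p for r, p in zip(residual, pred)]
```

The reconstructed coefficients then feed the RAHT inverse transform to produce the child attribute reconstruction values.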
(b3):对于下层节点,当当前层距离根节点的层数大于thrh2时,使用帧内预测方法,与(a3)中的下层解码方式相同,此处不再赘述。(b3): For lower-layer nodes, when the number of layers from the current layer to the root node is greater than thrh2, the intra-frame prediction method is used, the same as the lower-layer decoding in (a3), and is not repeated here.
至此属性解码完成。At this point, attribute decoding is complete.
本实施例提出了一种双阈值帧间预测方法,将八叉树分成了三个层级:上层、中层、下层。根据不同层级的节点特性,对不同的层级使用了不同的预测方法。特别地,对于中层的节点,利用了RDO算法来判断当前节点是采用帧内还是帧间预测方法。同时需要在码流里传输设定的两个阈值和标志位。本实施例的三层结构更加灵活,编码效率更高。This embodiment proposes a dual-threshold inter-frame prediction method that divides the octree into three levels: upper, middle, and lower. Different prediction methods are used for the different levels according to their node characteristics. In particular, for middle-level nodes, the RDO algorithm is used to determine whether the current node adopts the intra-frame or the inter-frame prediction method. The two set thresholds and the flag bits also need to be transmitted in the bitstream. The three-level structure of this embodiment is more flexible and achieves higher coding efficiency.
与现有技术相比,本实施例更加灵活,考虑到了中间层的一些节点更适合使用帧间预测方法,所以使用了双阈值将八叉树分成了三个层级,对中间层级的节点,使用了RDO算法进行判断是否采用帧间预测模式,并使用一个标志位进行了标记,这样做更加灵活,可以降低码率,提高编码效率。Compared with the prior art, this embodiment is more flexible. Considering that some nodes in the middle layer are more suitable for the inter-frame prediction method, a double threshold is used to divide the octree into three levels. For the nodes in the middle layer, the RDO algorithm is used to determine whether to adopt the inter-frame prediction mode, and a flag is used to mark it. This is more flexible and can reduce the bit rate and improve the coding efficiency.
需要说明的是,本申请实施例提供的点云编码处理方法,执行主体可以为点云编码处理装置,或者,该点云编码处理装置中的用于执行点云编码处理的方法的控制模块。本申请实施例中以点云编码处理装置执行点云编码处理的方法为例,说明本申请实施例提供的点云编码处理装置。It should be noted that the point cloud coding processing method provided in the embodiment of the present application can be executed by a point cloud coding processing device, or a control module in the point cloud coding processing device for executing the point cloud coding processing method. In the embodiment of the present application, the point cloud coding processing device provided in the embodiment of the present application is described by taking the method for executing the point cloud coding processing by the point cloud coding processing device as an example.
请参见图9,图9是本申请实施例提供的一种点云编码处理装置的结构图,如图9所示,点云编码处理装置300包括:Please refer to FIG. 9 , which is a structural diagram of a point cloud coding processing device provided in an embodiment of the present application. As shown in FIG. 9 , the point cloud coding processing device 300 includes:
第一确定模块301,用于确定待编码点云对应的变换树的目标层与根节点之间的目标距离;The first determination module 301 is used to determine the target distance between the target layer and the root node of the transform tree corresponding to the point cloud to be encoded;
第二确定模块302,用于基于所述目标距离确定目标预测残差值;A second determination module 302 is used to determine a target prediction residual value based on the target distance;
编码模块303,用于对所述目标预测残差值进行编码,得到第一编码结果;The encoding module 303 is used to encode the target prediction residual value to obtain a first encoding result;
其中,所述待编码点云的码流包括所述第一编码结果;The code stream of the point cloud to be encoded includes the first encoding result;
在所述目标距离小于或等于第一预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行帧间预测得到的第一预测残差值;或In a case where the target distance is less than or equal to a first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the node of the target layer; or
在所述目标距离大于所述第一预设距离且小于或等于第二预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行预测处理得到的第二预测残差值; 或When the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the node of the target layer; or
在所述目标距离大于所述第二预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行帧内预测得到的第三预测残差值。When the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the node of the target layer.
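The three depth-based cases above can be summarized as a small decision rule. The following Python sketch is purely illustrative; the function and parameter names (select_prediction_mode, first_preset, second_preset) do not appear in the application:

```python
def select_prediction_mode(target_distance: int,
                           first_preset: int,
                           second_preset: int) -> str:
    """Pick how the residual of a transform-tree layer is predicted,
    based on the layer's distance to the root node.

    Assumes first_preset <= second_preset.
    """
    if target_distance <= first_preset:
        # Layers close to the root: inter-frame prediction
        # (first prediction residual value).
        return "inter"
    if target_distance <= second_preset:
        # Middle layers: prediction processing, e.g. a rate-distortion
        # choice between inter and intra (second prediction residual value).
        return "prediction_processing"
    # Deep layers: intra-frame prediction (third prediction residual value).
    return "intra"
```

For example, with first_preset = 2 and second_preset = 5, a layer at distance 3 from the root would fall into the middle case.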
可选地,在所述目标距离大于所述第一预设距离且小于或等于第二预设距离的情况下,所述目标预测残差值包括基于率失真优化算法对所述目标层的节点的属性信息进行预测处理得到的第二预测残差值。Optionally, when the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by predicting the attribute information of the node of the target layer based on a rate-distortion optimization algorithm.
可选地,所述第二确定模块包括:Optionally, the second determining module includes:
第一确定单元,用于在所述目标距离大于所述第一预设距离且小于或等于第二预设距离的情况下,确定所述目标层的节点的属性信息的帧间残差系数和帧内残差系数;A first determining unit, configured to determine an inter-frame residual coefficient and an intra-frame residual coefficient of the attribute information of the node of the target layer when the target distance is greater than the first preset distance and less than or equal to the second preset distance;
第二确定单元,用于基于率失真优化算法确定与所述帧间残差系数对应的第一代价值和与所述帧内残差系数对应的第二代价值;A second determining unit, configured to determine a first cost value corresponding to the inter-frame residual coefficient and a second cost value corresponding to the intra-frame residual coefficient based on a rate-distortion optimization algorithm;
第三确定单元,用于基于所述第一代价值和所述第二代价值确定第二预测残差值;a third determining unit, configured to determine a second prediction residual value based on the first cost value and the second cost value;
其中,所述目标预测残差值包括所述第二预测残差值。Among them, the target prediction residual value includes the second prediction residual value.
可选地,在所述第一代价值小于所述第二代价值的情况下,所述第二预测残差值为所述帧间残差系数;或,Optionally, when the first cost value is less than the second cost value, the second prediction residual value is the inter-frame residual coefficient; or,
在所述第二代价值小于所述第一代价值的情况下,所述第二预测残差值为所述帧内残差系数;或,When the second cost value is less than the first cost value, the second prediction residual value is the intra-frame residual coefficient; or
在所述第一代价值等于所述第二代价值的情况下,所述第二预测残差值为所述帧间残差系数或所述帧内残差系数。When the first cost value is equal to the second cost value, the second prediction residual value is the inter-frame residual coefficient or the intra-frame residual coefficient.
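The cost comparison carried out by the units above can be sketched as follows. This is an illustrative Python fragment, not the actual encoder: the cost callable stands in for the rate-distortion optimization algorithm, and the returned flag corresponds to the information carried by the second encoding result.

```python
def rdo_select(inter_residual, intra_residual, cost):
    """Choose between the inter-frame and intra-frame residual coefficients.

    cost: a callable standing in for the rate-distortion cost of coding
    a residual. Returns (second_prediction_residual, is_inter).
    """
    first_cost = cost(inter_residual)    # cost of the inter-frame residual
    second_cost = cost(intra_residual)   # cost of the intra-frame residual
    if first_cost < second_cost:
        return inter_residual, True
    if second_cost < first_cost:
        return intra_residual, False
    # Equal costs: either coefficient may be chosen; here, the inter one.
    return inter_residual, True
```

With a toy cost such as the sum of absolute coefficient values, rdo_select([1, 0], [2, 2], lambda r: sum(map(abs, r))) selects the inter-frame residual.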
可选地,所述待编码点云的码流还包括第二编码结果,所述第二编码结果用于表征所述第二预测残差值为所述帧内残差系数或者所述第二预测残差值为所述帧间残差系数。Optionally, the code stream of the point cloud to be encoded also includes a second encoding result, and the second encoding result is used to represent that the second prediction residual value is the intra-frame residual coefficient or the second prediction residual value is the inter-frame residual coefficient.
可选地,所述第一确定单元包括:Optionally, the first determining unit includes:
确定子单元,用于在所述待编码点云的参考帧数据的数据类型为第一数据类型的情况下,确定所述目标层的节点的属性信息的帧间预测值,并对所述帧间预测值进行变换处理得到帧间变换系数;A determination subunit, configured to determine, when the data type of the reference frame data of the to-be-encoded point cloud is the first data type, an inter-frame prediction value of the attribute information of the node of the target layer, and transform the inter-frame prediction value to obtain an inter-frame transformation coefficient;
获取子单元,用于基于所述帧间变换系数获取帧间残差系数。The acquisition subunit is used to acquire the inter-frame residual coefficient based on the inter-frame transform coefficient.
可选地,所述第一确定单元还用于:Optionally, the first determining unit is further configured to:
对所述待编码点云对应的参考帧中的点进行重构处理;Reconstructing points in a reference frame corresponding to the point cloud to be encoded;
基于重构处理后的参考帧构建预测树,所述预测树的树结构与所述变换树的树结构相同;Constructing a prediction tree based on the reconstructed reference frame, wherein the tree structure of the prediction tree is the same as the tree structure of the transformation tree;
所述确定子单元具体用于:The determining subunit is specifically used for:
基于所述预测树及所述变换树确定所述目标层的节点的属性信息的帧间预测值。An inter-frame prediction value of attribute information of a node of the target layer is determined based on the prediction tree and the transform tree.
可选地,所述第一数据类型的参考帧数据为已编码且已重建的点云帧数据。Optionally, the reference frame data of the first data type is encoded and reconstructed point cloud frame data.
可选地,在所述待编码点云的参考帧数据的数据类型为第二数据类型的情况下,所述第一确定单元具体用于: Optionally, when the data type of the reference frame data of the to-be-encoded point cloud is the second data type, the first determining unit is specifically configured to:
若所述待编码点云的参考帧中存在与所述目标层的节点具有相同位置的参考节点的变换系数,则将所述具有相同位置的参考节点的变换系数确定为所述节点的属性信息的帧间变换系数;If there is a transformation coefficient of a reference node having the same position as the node of the target layer in the reference frame of the point cloud to be encoded, the transformation coefficient of the reference node having the same position is determined as the inter-frame transformation coefficient of the attribute information of the node;
基于所述节点的变换系数及所述节点的属性信息的帧间变换系数确定所述节点的属性信息的帧间残差系数。An inter-frame residual coefficient of the attribute information of the node is determined based on the transformation coefficient of the node and the inter-frame transformation coefficient of the attribute information of the node.
可选地,所述第二数据类型的参考帧数据为已编码的点云帧的重建变换系数。Optionally, the reference frame data of the second data type is reconstructed transformation coefficients of an encoded point cloud frame.
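For the second data type, the inter-frame residual of a node thus reduces to a difference against the reconstructed transform coefficient of the co-located reference node, when such a node exists. A minimal sketch under that reading (all names are hypothetical; coefficients are modeled as scalars and node positions as tuples):

```python
def inter_residual_coeff(node_coeff, ref_coeffs, node_pos):
    """Return the inter-frame residual coefficient of a node, or None when
    the reference frame has no transform coefficient at the same position.

    node_coeff: the node's own transform coefficient.
    ref_coeffs: mapping position -> reconstructed transform coefficient
                of the encoded reference frame (second data type).
    """
    ref_coeff = ref_coeffs.get(node_pos)
    if ref_coeff is None:
        return None  # no co-located reference node
    # The co-located coefficient serves as the inter-frame transform
    # coefficient; the residual is the difference against it.
    return node_coeff - ref_coeff
```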
可选地,所述待编码点云的码流还包括第三编码结果,所述第三编码结果为所述第一预设距离及所述第二预设距离中的至少一项的编码结果。Optionally, the code stream of the point cloud to be encoded further includes a third encoding result, and the third encoding result is an encoding result of at least one of the first preset distance and the second preset distance.
本申请实施例中的点云编码处理装置可以是装置,具有操作系统的装置或电子设备,也可以是终端中的部件、集成电路、或芯片。该装置或电子设备可以是移动终端,也可以为非移动终端。示例性的,移动终端可以包括但不限于上述所列举的终端的类型,非移动终端可以为服务器、网络附属存储器(Network Attached Storage,NAS)、个人计算机(personal computer,PC)、电视机(television,TV)、柜员机或者自助机等,本申请实施例不作具体限定。The point cloud coding processing device in the embodiment of the present application can be a device, a device or electronic device with an operating system, or a component, integrated circuit, or chip in a terminal. The device or electronic device can be a mobile terminal or a non-mobile terminal. Exemplarily, the mobile terminal can include but is not limited to the types of terminals listed above, and the non-mobile terminal can be a server, a network attached storage (Network Attached Storage, NAS), a personal computer (personal computer, PC), a television (television, TV), a teller machine or a self-service machine, etc., which is not specifically limited in the embodiment of the present application.
本申请实施例提供的点云编码处理装置能够实现图5的方法实施例实现的各个过程,并达到相同的技术效果,为避免重复,这里不再赘述。The point cloud coding processing device provided in the embodiment of the present application can implement each process implemented by the method embodiment of Figure 5 and achieve the same technical effect. To avoid repetition, it will not be repeated here.
需要说明的是,本申请实施例提供的点云解码处理方法,执行主体可以为点云解码处理装置,或者,该点云解码处理装置中的用于执行点云解码处理的方法的控制模块。本申请实施例中以点云解码处理装置执行点云解码处理的方法为例,说明本申请实施例提供的点云解码处理装置。It should be noted that the point cloud decoding processing method provided in the embodiments of the present application may be executed by a point cloud decoding processing device, or by a control module in the point cloud decoding processing device for executing the point cloud decoding processing method. In the embodiments of the present application, the point cloud decoding processing device provided herein is described by taking, as an example, the case where the point cloud decoding processing device executes the point cloud decoding processing method.
请参见图10,图10是本申请实施例提供的一种点云解码处理装置的结构图,如图10所示,点云解码处理装置400包括:Please refer to FIG. 10 , which is a structural diagram of a point cloud decoding processing device provided in an embodiment of the present application. As shown in FIG. 10 , the point cloud decoding processing device 400 includes:
确定模块401,用于确定待解码点云对应的变换树的目标层与根节点之间的目标距离;A determination module 401 is used to determine a target distance between a target layer and a root node of a transform tree corresponding to a point cloud to be decoded;
解码模块402,用于对待解码点云的码流中的第一编码结果进行解码,得到目标预测残差值;A decoding module 402, configured to decode a first encoding result in a bitstream of a point cloud to be decoded, and obtain a target prediction residual value;
获取模块403,用于基于所述目标预测残差值及所述目标距离获取所述目标层的节点的属性信息;An acquisition module 403 is used to acquire attribute information of the node of the target layer based on the target prediction residual value and the target distance;
其中,在所述目标距离小于或等于第一预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行帧间预测得到的第一预测残差值;或Wherein, when the target distance is less than or equal to the first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the node of the target layer; or
在所述目标距离大于所述第一预设距离且小于或等于第二预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行预测处理得到的第二预测残差值;或In a case where the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the node of the target layer; or
在所述目标距离大于所述第二预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行帧内预测得到的第三预测残差值。When the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the node of the target layer.
可选地,在所述目标距离大于所述第一预设距离且小于或等于第二预设距离的情况下,所述目标预测残差值包括基于率失真优化算法对所述目标层的节点的属性信息进行预测处理得到的第二预测残差值。Optionally, when the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by predicting the attribute information of the node of the target layer based on a rate-distortion optimization algorithm.
可选地,所述目标预测残差值包括所述第二预测残差值,所述待解码点云的码流还包括第二编码结果,所述第二编码结果用于表征所述第二预测残差值为帧内残差系数或者所述第二预测残差值为帧间残差系数;所述获取模块包括:Optionally, the target prediction residual value includes the second prediction residual value, the code stream of the point cloud to be decoded also includes a second encoding result, and the second encoding result is used to represent that the second prediction residual value is an intra-frame residual coefficient or the second prediction residual value is an inter-frame residual coefficient; the acquisition module includes:
解码单元,用于在所述目标距离大于所述第一预设距离且小于或等于第二预设距离的情况下,对所述第二编码结果进行解码;a decoding unit, configured to decode the second encoding result when the target distance is greater than the first preset distance and less than or equal to a second preset distance;
确定单元,用于在确定所述第二编码结果表征所述第二预测残差值为帧内残差系数的情况下,基于所述目标层的节点的帧内变换系数及所述第二预测残差值确定所述目标层的节点的属性信息;或A determination unit, configured to determine attribute information of a node of the target layer based on an intra-frame transform coefficient of the node of the target layer and the second prediction residual value when it is determined that the second encoding result represents that the second prediction residual value is an intra-frame residual coefficient; or
在确定所述第二编码结果表征所述第二预测残差值为帧间残差系数的情况下,基于所述目标层的节点的帧间变换系数及所述第二预测残差值确定所述目标层的节点的属性信息。When it is determined that the second encoding result represents that the second prediction residual value is an inter-frame residual coefficient, attribute information of the node of the target layer is determined based on the inter-frame transform coefficient of the node of the target layer and the second prediction residual value.
可选地,所述确定单元包括:Optionally, the determining unit includes:
第一确定子单元,用于在所述待解码点云的参考帧数据的数据类型为第一数据类型的情况下,确定所述目标层的节点的属性信息的帧间预测值,并对所述帧间预测值进行变换处理得到帧间变换系数;A first determination subunit is used to determine the inter-frame prediction value of the attribute information of the node of the target layer when the data type of the reference frame data of the to-be-decoded point cloud is the first data type, and to transform the inter-frame prediction value to obtain an inter-frame transformation coefficient;
第二确定子单元,用于基于所述帧间变换系数及所述第二预测残差值确定所述目标层的节点的属性信息。The second determination subunit is used to determine the attribute information of the node of the target layer based on the inter-frame transformation coefficient and the second prediction residual value.
可选地,所述确定单元还用于:Optionally, the determining unit is further configured to:
对所述待解码点云对应的参考帧中的点进行重构处理;Reconstructing points in a reference frame corresponding to the point cloud to be decoded;
基于重构处理后的参考帧构建预测树,所述预测树的树结构与所述变换树的树结构相同;Constructing a prediction tree based on the reconstructed reference frame, wherein the tree structure of the prediction tree is the same as the tree structure of the transformation tree;
所述第一确定子单元具体用于:The first determining subunit is specifically used for:
基于所述预测树及所述变换树确定所述目标层的节点的属性信息的帧间预测值。An inter-frame prediction value of attribute information of a node of the target layer is determined based on the prediction tree and the transform tree.
可选地,所述第一数据类型的参考帧数据为已解码且已重建的点云帧数据。Optionally, the reference frame data of the first data type is decoded and reconstructed point cloud frame data.
可选地,在所述待解码点云的参考帧数据的数据类型为第二数据类型的情况下,在确定所述第二编码结果表征所述第二预测残差值为帧间残差系数的情况下,所述确定单元具体用于:Optionally, when the data type of the reference frame data of the to-be-decoded point cloud is the second data type, when determining that the second encoding result represents that the second prediction residual value is an inter-frame residual coefficient, the determining unit is specifically configured to:
若确定所述待解码点云的参考帧中存在与所述目标层的节点具有相同位置的参考节点的变换系数,则将所述具有相同位置的参考节点的变换系数确定为所述节点的属性信息的帧间变换系数;If it is determined that there is a transformation coefficient of a reference node having the same position as the node of the target layer in the reference frame of the point cloud to be decoded, the transformation coefficient of the reference node having the same position is determined as the inter-frame transformation coefficient of the attribute information of the node;
基于所述第二预测残差值及所述节点的属性信息的帧间变换系数获取所述节点的属性信息。The attribute information of the node is obtained based on the second prediction residual value and the inter-frame transform coefficient of the attribute information of the node.
可选地,所述第二数据类型的参考帧数据为已解码的点云帧的重建变换系数。Optionally, the reference frame data of the second data type is reconstructed transform coefficients of a decoded point cloud frame.
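Mirroring the encoder side, the decoder can recover a node's transform coefficient by adding the decoded residual back to the co-located reference coefficient; the attribute information then follows from the inverse transform, which is omitted here. An illustrative sketch with hypothetical names:

```python
def reconstruct_coeff(second_residual, ref_coeffs, node_pos):
    """Recover a node's transform coefficient from the decoded second
    prediction residual value and the reconstructed transform coefficients
    of the decoded reference frame (second data type)."""
    ref_coeff = ref_coeffs.get(node_pos)
    if ref_coeff is None:
        raise ValueError("no co-located reference node at this position")
    # Inverse of (residual = node_coeff - ref_coeff) on the encoder side.
    return ref_coeff + second_residual
```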
可选地,所述待解码点云的码流还包括第三编码结果,所述解码模块还用于: Optionally, the code stream of the point cloud to be decoded further includes a third encoding result, and the decoding module is further used to:
对所述第三编码结果进行解码,得到所述第一预设距离及所述第二预设距离中的至少一项。The third encoding result is decoded to obtain at least one of the first preset distance and the second preset distance.
本申请实施例中的点云解码处理装置可以是装置,具有操作系统的装置或电子设备,也可以是终端中的部件、集成电路、或芯片。该装置或电子设备可以是移动终端,也可以为非移动终端。示例性的,移动终端可以包括但不限于上述所列举的终端的类型,非移动终端可以为服务器、网络附属存储器(Network Attached Storage,NAS)、个人计算机(personal computer,PC)、电视机(television,TV)、柜员机或者自助机等,本申请实施例不作具体限定。The point cloud decoding processing device in the embodiment of the present application can be a device, a device or electronic device with an operating system, or a component, integrated circuit, or chip in a terminal. The device or electronic device can be a mobile terminal or a non-mobile terminal. Exemplarily, the mobile terminal can include but is not limited to the types of terminals listed above, and the non-mobile terminal can be a server, a network attached storage (Network Attached Storage, NAS), a personal computer (personal computer, PC), a television (television, TV), a teller machine or a self-service machine, etc., which is not specifically limited in the embodiment of the present application.
本申请实施例提供的点云解码处理装置能够实现图6的方法实施例实现的各个过程,并达到相同的技术效果,为避免重复,这里不再赘述。The point cloud decoding processing device provided in the embodiment of the present application can implement the various processes implemented by the method embodiment of Figure 6 and achieve the same technical effect. To avoid repetition, it will not be repeated here.
可选地,如图11所示,本申请实施例还提供一种通信设备500,包括处理器501和存储器502,存储器502上存储有可在所述处理器501上运行的程序或指令,例如,该通信设备500为编码端设备时,该程序或指令被处理器501执行时实现上述点云编码处理方法实施例的各个步骤,且能达到相同的技术效果。该通信设备500为解码端设备时,该程序或指令被处理器501执行时实现上述点云解码处理方法实施例的各个步骤,且能达到相同的技术效果,为避免重复,这里不再赘述。Optionally, as shown in FIG11 , the embodiment of the present application further provides a communication device 500, including a processor 501 and a memory 502, wherein the memory 502 stores a program or instruction that can be run on the processor 501. For example, when the communication device 500 is an encoding end device, the program or instruction is executed by the processor 501 to implement the various steps of the above-mentioned point cloud encoding processing method embodiment, and can achieve the same technical effect. When the communication device 500 is a decoding end device, the program or instruction is executed by the processor 501 to implement the various steps of the above-mentioned point cloud decoding processing method embodiment, and can achieve the same technical effect. To avoid repetition, it will not be repeated here.
本申请实施例还提供一种终端,包括处理器及通信接口,所述处理器用于:确定待编码点云对应的变换树的目标层与根节点之间的目标距离;基于所述目标距离确定目标预测残差值;对所述目标预测残差值进行编码,得到第一编码结果;其中,所述待编码点云的码流包括所述第一编码结果;在所述目标距离小于或等于第一预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行帧间预测得到的第一预测残差值;或在所述目标距离大于所述第一预设距离且小于或等于第二预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行预测处理得到的第二预测残差值;或在所述目标距离大于所述第二预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行帧内预测得到的第三预测残差值。该终端实施例与上述点云编码处理方法实施例对应,上述点云编码处理方法实施例的各个实施过程和实现方式均可适用于该终端实施例中,且能达到相同的技术效果。The embodiment of the present application also provides a terminal, including a processor and a communication interface, wherein the processor is used to: determine the target distance between the target layer and the root node of the transform tree corresponding to the point cloud to be encoded; determine the target prediction residual value based on the target distance; encode the target prediction residual value to obtain a first encoding result; wherein the code stream of the point cloud to be encoded includes the first encoding result; when the target distance is less than or equal to the first preset distance, the target prediction residual value includes a first prediction residual value obtained by inter-frame prediction of the attribute information of the node of the target layer; or when the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by predicting the attribute information of the node of the target layer; or when the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by intra-frame prediction of the attribute information of the node of the target layer. 
This terminal embodiment corresponds to the above-mentioned point cloud encoding processing method embodiment, and each implementation process and implementation method of the above-mentioned point cloud encoding processing method embodiment can be applied to this terminal embodiment and can achieve the same technical effect.
本申请实施例还提供一种终端,包括处理器及通信接口,所述处理器用于:确定待解码点云对应的变换树的目标层与根节点之间的目标距离;对待解码点云的码流中的第一编码结果进行解码,得到目标预测残差值;基于所述目标预测残差值及所述目标距离获取所述目标层的节点的属性信息;其中,在所述目标距离小于或等于第一预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行帧间预测得到的第一预测残差值;或在所述目标距离大于所述第一预设距离且小于或等于第二预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行预测处理得到的第二预测残差值;或在所述目标距离大于所述第二预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行帧内预测得到的第三预测残差值。An embodiment of the present application also provides a terminal, including a processor and a communication interface, wherein the processor is used to: determine a target distance between a target layer and a root node of a transform tree corresponding to a point cloud to be decoded; decode a first encoding result in a code stream of the point cloud to be decoded to obtain a target prediction residual value; obtain attribute information of the nodes of the target layer based on the target prediction residual value and the target distance; wherein, when the target distance is less than or equal to a first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the nodes of the target layer; or when the target distance is greater than the first preset distance and less than or equal to a second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the nodes of the target layer; or when the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the nodes of the target layer.
This terminal embodiment corresponds to the above-mentioned point cloud decoding processing method embodiment, and each implementation process and implementation method of the above-mentioned point cloud decoding processing method embodiment can be applied to this terminal embodiment and can achieve the same technical effect.
具体地,图12为实现本申请实施例的一种终端的硬件结构示意图。Specifically, FIG12 is a schematic diagram of the hardware structure of a terminal implementing an embodiment of the present application.
该终端600包括但不限于:射频单元601、网络模块602、音频输出单元603、输入单元604、传感器605、显示单元606、用户输入单元607、接口单元608、存储器609以及处理器610等中的至少部分部件。The terminal 600 includes but is not limited to: a radio frequency unit 601, a network module 602, an audio output unit 603, an input unit 604, a sensor 605, a display unit 606, a user input unit 607, an interface unit 608, a memory 609 and at least some of the components of a processor 610.
本领域技术人员可以理解,终端600还可以包括给各个部件供电的电源(比如电池),电源可以通过电源管理系统与处理器610逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。图12中示出的终端结构并不构成对终端的限定,终端可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置,在此不再赘述。Those skilled in the art will appreciate that the terminal 600 may also include a power source (such as a battery) for supplying power to each component, and the power source may be logically connected to the processor 610 through a power management system, so as to implement functions such as managing charging, discharging, and power consumption management through the power management system. The terminal structure shown in FIG12 does not constitute a limitation on the terminal, and the terminal may include more or fewer components than shown in the figure, or combine certain components, or arrange components differently, which will not be described in detail here.
应理解的是,本申请实施例中,输入单元604可以包括图形处理器(Graphics Processing Unit,GPU)6041和麦克风6042,图形处理器6041对在视频捕获模式或图像捕获模式中由图像捕获装置(如摄像头)获得的静态图片或视频的图像数据进行处理。显示单元606可包括显示面板6061,可以采用液晶显示器、有机发光二极管等形式来配置显示面板6061。用户输入单元607包括触控面板6071以及其他输入设备6072中的至少一种。触控面板6071,也称为触摸屏。触控面板6071可包括触摸检测装置和触摸控制器两个部分。其他输入设备6072可以包括但不限于物理键盘、功能键(比如音量控制按键、开关按键等)、轨迹球、鼠标、操作杆,在此不再赘述。It should be understood that in the embodiment of the present application, the input unit 604 may include a graphics processing unit (GPU) 6041 and a microphone 6042, and the graphics processor 6041 processes the image data of the static picture or video obtained by the image capture device (such as a camera) in the video capture mode or the image capture mode. The display unit 606 may include a display panel 6061, and the display panel 6061 may be configured in the form of a liquid crystal display, an organic light emitting diode, etc. The user input unit 607 includes a touch panel 6071 and at least one of other input devices 6072. The touch panel 6071 is also called a touch screen. The touch panel 6071 may include two parts: a touch detection device and a touch controller. Other input devices 6072 may include, but are not limited to, a physical keyboard, function keys (such as a volume control key, a switch key, etc.), a trackball, a mouse, and a joystick, which will not be repeated here.
本申请实施例中,射频单元601接收来自网络侧设备的下行数据后,可以传输给处理器610进行处理;另外,射频单元601可以向网络侧设备发送上行数据。通常,射频单元601包括但不限于天线、放大器、收发信机、耦合器、低噪声放大器、双工器等。In the embodiment of the present application, after receiving downlink data from the network side device, the RF unit 601 can transmit the data to the processor 610 for processing; in addition, the RF unit 601 can send uplink data to the network side device. Generally, the RF unit 601 includes but is not limited to an antenna, an amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, etc.
存储器609可用于存储软件程序或指令以及各种数据。存储器609可主要包括存储程序或指令的第一存储区和存储数据的第二存储区,其中,第一存储区可存储操作系统、至少一个功能所需的应用程序或指令(比如声音播放功能、图像播放功能等)等。此外,存储器609可以包括易失性存储器或非易失性存储器,或者,存储器609可以包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(Read-Only Memory,ROM)、可编程只读存储器(Programmable ROM,PROM)、可擦除可编程只读存储器(Erasable PROM,EPROM)、电可擦除可编程只读存储器(Electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(Random Access Memory,RAM),静态随机存取存储器(Static RAM,SRAM)、动态随机存取存储器(Dynamic RAM,DRAM)、同步动态随机存取存储器(Synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(Double Data Rate SDRAM,DDRSDRAM)、增强型同步动态随机存取存储器(Enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(Synch link DRAM,SLDRAM)和直接内存总线随机存取存储器(Direct Rambus RAM,DRRAM)。本申请实施例中的存储器609包括但不限于这些和任意其它适合类型的存储器。The memory 609 can be used to store software programs or instructions and various data. The memory 609 may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, wherein the first storage area may store an operating system, an application program or instruction required for at least one function (such as a sound playback function, an image playback function, etc.), etc. In addition, the memory 609 may include a volatile memory or a non-volatile memory, or the memory 609 may include both volatile and non-volatile memories. Among them, the non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), a static random access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), a double data rate synchronous dynamic random access memory (DDRSDRAM), an enhanced synchronous dynamic random access memory (ESDRAM), a synchronous link dynamic random access memory (SLDRAM) and a direct RAM bus random access memory (DRRAM). The memory 609 in the embodiments of the present application includes, but is not limited to, these and any other suitable types of memory.
处理器610可包括一个或多个处理单元;可选的,处理器610集成应用处理器和调制解调处理器,其中,应用处理器主要处理涉及操作系统、用户界面和应用程序等的操作,调制解调处理器主要处理无线通信信号,如基带处理器。可以理解的是,上述调制解调处理器也可以不集成到处理器610中。The processor 610 may include one or more processing units; optionally, the processor 610 integrates an application processor and a modem processor, wherein the application processor mainly processes operations related to an operating system, a user interface, and application programs, and the modem processor mainly processes wireless communication signals, such as a baseband processor. It is understandable that the modem processor may not be integrated into the processor 610.
其中,在所述终端为编码端的情况下:Wherein, when the terminal is a coding terminal:
所述处理器610用于:The processor 610 is configured to:
确定待编码点云对应的变换树的目标层与根节点之间的目标距离;Determine the target distance between the target layer and the root node of the transform tree corresponding to the point cloud to be encoded;
基于所述目标距离确定目标预测残差值;Determining a target prediction residual value based on the target distance;
对所述目标预测残差值进行编码,得到第一编码结果;Encoding the target prediction residual value to obtain a first encoding result;
其中,所述待编码点云的码流包括所述第一编码结果;The code stream of the point cloud to be encoded includes the first encoding result;
在所述目标距离小于或等于第一预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行帧间预测得到的第一预测残差值;或In a case where the target distance is less than or equal to a first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the node of the target layer; or
在所述目标距离大于所述第一预设距离且小于或等于第二预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行预测处理得到的第二预测残差值;或In a case where the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the node of the target layer; or
在所述目标距离大于所述第二预设距离的情况下,所述目标预测残差值包括对所述目标层的节点的属性信息进行帧内预测得到的第三预测残差值。When the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the node of the target layer.
可选地,在所述目标距离大于所述第一预设距离且小于或等于第二预设距离的情况下,所述目标预测残差值包括基于率失真优化算法对所述目标层的节点的属性信息进行预测处理得到的第二预测残差值。Optionally, when the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by predicting the attribute information of the node of the target layer based on a rate-distortion optimization algorithm.
可选地,所述处理器610具体用于:Optionally, the processor 610 is specifically configured to:
在所述目标距离大于所述第一预设距离且小于或等于第二预设距离的情况下,确定所述目标层的节点的属性信息的帧间残差系数和帧内残差系数;When the target distance is greater than the first preset distance and less than or equal to the second preset distance, determine the inter-frame residual coefficient and the intra-frame residual coefficient of the attribute information of the node of the target layer;
基于率失真优化算法确定与所述帧间残差系数对应的第一代价值和与所述帧内残差系数对应的第二代价值;Determine a first cost value corresponding to the inter-frame residual coefficient and a second cost value corresponding to the intra-frame residual coefficient based on a rate-distortion optimization algorithm;
基于所述第一代价值和所述第二代价值确定第二预测残差值;determining a second prediction residual value based on the first cost value and the second cost value;
其中,所述目标预测残差值包括所述第二预测残差值。Among them, the target prediction residual value includes the second prediction residual value.
可选地,在所述第一代价值小于所述第二代价值的情况下,所述第二预测残差值为所述帧间残差系数;或,Optionally, when the first cost value is less than the second cost value, the second prediction residual value is the inter-frame residual coefficient; or,
在所述第二代价值小于所述第一代价值的情况下,所述第二预测残差值为所述帧内残差系数;或,When the second cost value is less than the first cost value, the second prediction residual value is the intra-frame residual coefficient; or
在所述第一代价值等于所述第二代价值的情况下,所述第二预测残差值为所述帧间残差系数或所述帧内残差系数。When the first cost value is equal to the second cost value, the second prediction residual value is the inter-frame residual coefficient or the intra-frame residual coefficient.
可选地,所述待编码点云的码流还包括第二编码结果,所述第二编码结果用于表征所述第二预测残差值为所述帧内残差系数或者所述第二预测残差值为所述帧间残差系数。Optionally, the code stream of the point cloud to be encoded further includes a second encoding result, and the second encoding result is used to indicate that the second prediction residual value is the intra-frame residual coefficient or that the second prediction residual value is the inter-frame residual coefficient.
Optionally, when the data type of the reference frame data of the point cloud to be encoded is the first data type, the processor 610 is specifically configured to:
determine an inter-frame prediction value of the attribute information of the nodes of the target layer, and transform the inter-frame prediction value to obtain an inter-frame transform coefficient;
obtain an inter-frame residual coefficient based on the inter-frame transform coefficient.
Optionally, the processor 610 is further configured to:
reconstruct the points in the reference frame corresponding to the point cloud to be encoded;
construct a prediction tree based on the reconstructed reference frame, where the tree structure of the prediction tree is the same as the tree structure of the transform tree;
determine the inter-frame prediction value of the attribute information of the nodes of the target layer based on the prediction tree and the transform tree.
Optionally, the reference frame data of the first data type is encoded and reconstructed point cloud frame data.
Optionally, when the data type of the reference frame data of the point cloud to be encoded is the second data type, the processor 610 is specifically configured to:
if a transform coefficient of a reference node at the same position as a node of the target layer exists in the reference frame of the point cloud to be encoded, determine the transform coefficient of the reference node at the same position as the inter-frame transform coefficient of the attribute information of the node;
determine an inter-frame residual coefficient of the attribute information of the node based on the transform coefficient of the node and the inter-frame transform coefficient of the attribute information of the node.
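The same-position lookup and residual computation just described can be sketched as follows. This is an illustrative sketch under assumed data structures: `ref_coeffs` is taken to be a mapping from node position to the reference frame's reconstructed transform coefficient, and the function name is not from the application:

```python
def inter_residual_from_reference(node_pos, node_coeff, ref_coeffs):
    # If the reference frame holds a transform coefficient at the same
    # position, use it as the inter-frame transform coefficient and
    # return the inter-frame residual coefficient; otherwise signal
    # that no co-located reference is available.
    ref_coeff = ref_coeffs.get(node_pos)
    if ref_coeff is None:
        return None
    return node_coeff - ref_coeff
```

When `None` is returned, an encoder following this scheme would have to fall back to another prediction path for that node.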
Optionally, the reference frame data of the second data type is the reconstructed transform coefficients of an encoded point cloud frame.
Optionally, the code stream of the point cloud to be encoded further includes a third encoding result, and the third encoding result is an encoding result of at least one of the first preset distance and the second preset distance.
When the terminal is the decoding end:
the processor 610 is configured to:
determine the target distance between the target layer of the transform tree corresponding to the point cloud to be decoded and the root node;
decode the first encoding result in the code stream of the point cloud to be decoded to obtain a target prediction residual value;
obtain the attribute information of the nodes of the target layer based on the target prediction residual value and the target distance;
where, when the target distance is less than or equal to a first preset distance, the target prediction residual value includes a first prediction residual value obtained by performing inter-frame prediction on the attribute information of the nodes of the target layer; or
when the target distance is greater than the first preset distance and less than or equal to a second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the nodes of the target layer; or
when the target distance is greater than the second preset distance, the target prediction residual value includes a third prediction residual value obtained by performing intra-frame prediction on the attribute information of the nodes of the target layer.
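The three distance-based cases above amount to a simple mode map over the layer's distance from the root. A minimal Python sketch, with function and mode names chosen for illustration rather than taken from the application:

```python
def prediction_mode(target_distance, first_preset, second_preset):
    # Map a transform-tree layer's distance from the root to the
    # prediction mode used for its attribute residual, per the three
    # cases described above.
    if target_distance <= first_preset:
        return "inter"   # first prediction residual: inter-frame prediction
    if target_distance <= second_preset:
        return "rdo"     # second residual: RDO choice between inter and intra
    return "intra"       # third residual: intra-frame prediction
```

Layers near the root (coarse, temporally stable coefficients) thus use inter prediction, deep layers use intra prediction, and the middle band lets the encoder decide per node.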
Optionally, when the target distance is greater than the first preset distance and less than or equal to the second preset distance, the target prediction residual value includes a second prediction residual value obtained by performing prediction processing on the attribute information of the nodes of the target layer based on a rate-distortion optimization algorithm.
Optionally, the target prediction residual value includes the second prediction residual value, the code stream of the point cloud to be decoded further includes a second encoding result, and the second encoding result is used to indicate whether the second prediction residual value is an intra-frame residual coefficient or an inter-frame residual coefficient; the processor 610 is specifically configured to:
decode the second encoding result when the target distance is greater than the first preset distance and less than or equal to the second preset distance;
when it is determined that the second encoding result indicates that the second prediction residual value is an intra-frame residual coefficient, determine the attribute information of the nodes of the target layer based on the intra-frame transform coefficients of the nodes of the target layer and the second prediction residual value; or
when it is determined that the second encoding result indicates that the second prediction residual value is an inter-frame residual coefficient, determine the attribute information of the nodes of the target layer based on the inter-frame transform coefficients of the nodes of the target layer and the second prediction residual value.
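On the decoding side, the signalled flag selects which predicted transform coefficient the residual is added back to before the transform is inverted. A hedged sketch under assumed interfaces (`inverse_transform` stands in for whatever attribute transform the codec uses; the additive reconstruction `predicted + residual` is an illustrative simplification):

```python
def decode_second_residual(flag_is_intra, intra_coeff, inter_coeff,
                           residual, inverse_transform):
    # Pick the predicted transform coefficient the second encoding
    # result points at, reconstruct the node's coefficient, and invert
    # the transform to recover the attribute information.
    predicted = intra_coeff if flag_is_intra else inter_coeff
    return inverse_transform(predicted + residual)
```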
Optionally, when the data type of the reference frame data of the point cloud to be decoded is the first data type, the processor 610 is specifically configured to:
determine an inter-frame prediction value of the attribute information of the nodes of the target layer, and transform the inter-frame prediction value to obtain an inter-frame transform coefficient;
determine the attribute information of the nodes of the target layer based on the inter-frame transform coefficient and the second prediction residual value.
Optionally, the processor 610 is further configured to:
reconstruct the points in the reference frame corresponding to the point cloud to be decoded;
construct a prediction tree based on the reconstructed reference frame, where the tree structure of the prediction tree is the same as the tree structure of the transform tree;
determine the inter-frame prediction value of the attribute information of the nodes of the target layer based on the prediction tree and the transform tree.
Optionally, the reference frame data of the first data type is decoded and reconstructed point cloud frame data.
Optionally, when the data type of the reference frame data of the point cloud to be decoded is the second data type, the processor 610 is specifically configured to:
if it is determined that a transform coefficient of a reference node at the same position as a node of the target layer exists in the reference frame of the point cloud to be decoded, determine the transform coefficient of the reference node at the same position as the inter-frame transform coefficient of the attribute information of the node;
obtain the attribute information of the node based on the second prediction residual value and the inter-frame transform coefficient of the attribute information of the node.
Optionally, the reference frame data of the second data type is the reconstructed transform coefficients of a decoded point cloud frame.
Optionally, the code stream of the point cloud to be decoded further includes a third encoding result, and the processor 610 is further configured to:
decode the third encoding result to obtain at least one of the first preset distance and the second preset distance.
The embodiments of the present application can improve coding efficiency.
Specifically, the terminal in the embodiments of the present application further includes instructions or a program stored in the memory 609 and executable on the processor 610. The processor 610 calls the instructions or program in the memory 609 to execute the methods executed by the modules shown in FIG. 9 or FIG. 10, achieving the same technical effects; to avoid repetition, details are not repeated here.
An embodiment of the present application further provides a readable storage medium storing a program or instructions. When the program or instructions are executed by a processor, the processes of the foregoing point cloud encoding processing method embodiment or the foregoing point cloud decoding processing method embodiment are implemented, with the same technical effects; to avoid repetition, details are not repeated here.
The processor is the processor in the terminal described in the foregoing embodiment. The readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. In some examples, the readable storage medium may be a non-transitory readable storage medium.
An embodiment of the present application further provides a chip. The chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the processes of the foregoing point cloud encoding processing method embodiment or the foregoing point cloud decoding processing method embodiment, with the same technical effects; to avoid repetition, details are not repeated here.
It should be understood that the chip mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip.
An embodiment of the present application further provides a computer program/program product stored in a storage medium. The computer program/program product is executed by at least one processor to implement the processes of the foregoing point cloud encoding processing method embodiment or the foregoing point cloud decoding processing method embodiment, with the same technical effects; to avoid repetition, details are not repeated here.
An embodiment of the present application further provides a coding and decoding system, including an encoding device and a decoding device, where the encoding device may be configured to perform the steps of the point cloud encoding processing method described above, and the decoding device may be configured to perform the steps of the point cloud decoding processing method described above.
It should be noted that, in this document, the terms "comprise", "include", or any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "including a..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element. In addition, it should be pointed out that the scope of the methods and apparatuses in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, and may also include performing the functions in a substantially simultaneous manner or in a reverse order depending on the functions involved; for example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. Furthermore, features described with reference to certain examples may be combined in other examples.
From the description of the foregoing implementations, a person skilled in the art can clearly understand that the methods of the foregoing embodiments can be implemented by means of a computer software product plus a necessary general-purpose hardware platform, and certainly also by hardware. Such a computer software product is stored in a storage medium (such as a ROM, a RAM, a magnetic disk, or an optical disc) and includes several instructions for causing a terminal or a network-side device to execute the methods described in the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the foregoing specific implementations, which are merely illustrative rather than restrictive. Inspired by the present application, a person of ordinary skill in the art may devise many other forms of implementation without departing from the purpose of the present application and the scope protected by the claims, all of which fall within the protection of the present application.
Claims (24)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310408599.1A CN118827998A (en) | 2023-04-17 | 2023-04-17 | Point cloud encoding processing method, point cloud decoding processing method and related equipment |
| CN202310408599.1 | 2023-04-17 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024217301A1 true WO2024217301A1 (en) | 2024-10-24 |
Family
ID=93063373
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2024/086903 Pending WO2024217301A1 (en) | 2023-04-17 | 2024-04-10 | Point cloud coding processing method, point cloud decoding processing method and related device |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN118827998A (en) |
| WO (1) | WO2024217301A1 (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109196559A (en) * | 2016-05-28 | 2019-01-11 | Microsoft Technology Licensing, LLC | Motion-compensated compression of dynamic voxelized point clouds |
| CN112702598A (en) * | 2020-12-03 | 2021-04-23 | Zhejiang Smart Video Security Innovation Center Co., Ltd. | Method and device for encoding and decoding based on displacement operation, electronic device, and medium |
| WO2021138788A1 (en) * | 2020-01-06 | 2021-07-15 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Intra-frame prediction method and apparatus, coder, decoder and storage medium |
| US20220086461A1 (en) * | 2019-01-10 | 2022-03-17 | Industry Academy Cooperation Foundation Of Sejong University | Image encoding/decoding method and apparatus |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115720270A (en) * | 2021-08-24 | 2023-02-28 | Peng Cheng Laboratory | Point cloud encoding method, point cloud decoding method, point cloud encoding device, and point cloud decoding device |
| CN113766229B (en) * | 2021-09-30 | 2023-04-28 | MIGU Culture Technology Co., Ltd. | Encoding method, decoding method, apparatus, device, and readable storage medium |
| CN115499660B (en) * | 2022-09-27 | 2025-07-18 | University of South China | Method and system for quickly determining inter-frame coding modes for dynamic 3D point cloud compression |
- 2023-04-17: CN application CN202310408599.1A filed (CN118827998A, pending)
- 2024-04-10: PCT application PCT/CN2024/086903 filed (WO2024217301A1, pending)
Also Published As
| Publication number | Publication date |
|---|---|
| CN118827998A (en) | 2024-10-22 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US20240205430A1 (en) | Block-Based Predictive Coding For Point Cloud Compression | |
| WO2022257971A1 (en) | Point cloud encoding processing method, point cloud decoding processing method, and related device | |
| WO2022257978A1 (en) | Point cloud coding method and apparatus, and point cloud decoding method and apparatus | |
| KR20210136082A (en) | Techniques and apparatus for inter-channel prediction and transformation for point cloud attribute coding | |
| CN115474041B (en) | Point cloud attribute prediction method, device and related equipment | |
| WO2022140937A1 (en) | Point cloud encoding method and system, point cloud decoding method and system, point cloud encoder, and point cloud decoder | |
| CN115086658B (en) | Point cloud data processing method and device, storage medium and encoding and decoding equipment | |
| CN116965025A (en) | Bit allocation for neural network feature compression | |
| KR20230173695A (en) | Entropy encoding, decoding method and device | |
| WO2024217301A1 (en) | Point cloud coding processing method, point cloud decoding processing method and related device | |
| TW202446075A (en) | Efficient warping-based neural video coder | |
| CN116233387B (en) | Point cloud coding and decoding methods, devices and communication equipment | |
| CN116233386B (en) | Point cloud attribute encoding method, point cloud attribute decoding method and terminal | |
| KR20240006667A (en) | Point cloud attribute information encoding method, decoding method, device and related devices | |
| CN118678075B (en) | Point cloud encoding processing method, point cloud decoding processing method and related equipment | |
| WO2024217340A1 (en) | Point cloud coding processing method, point cloud decoding processing method, and related device | |
| CN119815053B (en) | Point cloud attribute coding method, point cloud attribute decoding device and electronic equipment | |
| WO2024217302A1 (en) | Point cloud coding processing method, point cloud decoding processing method and related device | |
| CN120343259A (en) | Coding and decoding method and related equipment | |
| CN119071493A (en) | Point cloud encoding processing method, point cloud decoding processing method and related equipment | |
| WO2025077667A1 (en) | Method and apparatus for determining attribute information of point cloud, and electronic device | |
| WO2024217303A1 (en) | Transform coefficient coding method, transform coefficient decoding method, and terminal | |
| CN120835147A (en) | Point cloud information decoding, encoding method, device and related equipment | |
| WO2024120431A1 (en) | Point cloud coding method, point cloud decoding method, and related devices | |
| CN121002538A (en) | Compression schemes for point cloud attributes with implicit inter-frame prediction |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24791886; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |