WO2025242011A1 - Three-dimensional model reconstruction method based on multi-source input and apparatus
- Publication number: WO2025242011A1 (PCT/CN2025/095518)
- Authority: WIPO (PCT)
- Prior art keywords: information, image, point cloud, target, reconstructed
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G: PHYSICS
- G06: COMPUTING OR CALCULATING; COUNTING
- G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00: 3D [Three Dimensional] image rendering
- G06T17/00: Three dimensional [3D] modelling, e.g. data description of 3D objects
Definitions
- This application relates to the field of three-dimensional model reconstruction, and in particular to a method and apparatus for three-dimensional model reconstruction based on multi-source input.
- 3D reconstruction is a mathematical process and computer technology for restoring the original three-dimensional information of a scene.
- Early 3D model reconstruction methods typically used single-view or multi-view image information as input. Limited by the input data, the reconstructed 3D models were usually incomplete and lacked realism.
- High-precision 3D reconstruction schemes based on pure visual input involve cumbersome data acquisition and low work efficiency. Because they rely heavily on the Structure from Motion (SFM) module, their applicable scenarios are limited and their degree of automation is low; they struggle to meet requirements for high speed, batch processing, and high frequency; and their accuracy and precision are low, with severe distortion.
- This application provides a method and apparatus for reconstructing a three-dimensional model based on multi-source input. Mapping results, including point cloud information and image information of the object to be reconstructed, are obtained, and a three-dimensional model is established from them. Geometric residuals and optical residuals are then determined from the point cloud information and image information, and the three-dimensional model is further adjusted, improving modeling efficiency, accuracy, and realism.
- this application provides a 3D model reconstruction method based on multi-source input.
- This method is applied to a 3D model reconstruction platform.
- the method includes the following steps: obtaining the mapping results of the object to be reconstructed, wherein the mapping results include image information and point cloud information of the object to be reconstructed. Further, a 3D model of the object to be reconstructed is established based on the mapping results. On this basis, target image information and target point cloud information of the object to be reconstructed under target pose observation are determined from the mapping results.
- the target pose includes position information and attitude information. The 3D model is then instructed to generate a target depth image and a target digital image based on the target pose.
- the 3D model reconstruction platform determines the geometric residual based on the target point cloud information and the target depth image, determines the optical residual based on the target image information and the target digital image, and then adjusts the 3D model based on the geometric and optical residuals.
- a 3D model is built by acquiring mapping results from multiple inputs, including point cloud information and image information of the object to be reconstructed. This allows the built 3D model to output corresponding depth and digital images based on specific poses. Then, by combining the target point cloud information and target image information from the mapping results, optical and geometric residuals are constructed. The 3D model is then adjusted based on these optical and geometric residuals, jointly supervising the training of the 3D model.
- the geometric accuracy and detail feature richness of the 3D model are significantly improved, resulting in highly realistic rendered objects.
- the specific steps for obtaining the mapping result of the object to be reconstructed are as follows: obtain the initial image information and initial point cloud information of the object to be reconstructed, together with the image inertial information and point cloud inertial information recorded by the acquisition device while capturing the initial image information and the initial point cloud information, respectively; then generate the mapping result from the initial image information, initial point cloud information, point cloud inertial information, and image inertial information, thereby obtaining the mapping result of the modeling object.
- the solution provided in this application acquires multi-source input information such as image information, point cloud information, and corresponding inertial information, and fuses frame-by-frame point cloud, frame-by-frame image, and corresponding motion prior information to construct a map of the object to be reconstructed, achieving high-precision mapping. Furthermore, by introducing corresponding inertial information and other motion prior information, along with point cloud information, pose fitting can be performed more quickly, improving mapping efficiency.
- the 3D model reconstruction platform determines descriptor information based on the initial image information, wherein the descriptor information includes the texture features of at least one frame of the initial image information, and then determines keyframe images based on the initial image information and the descriptor information.
- determining keyframe images filters for images that satisfy specific descriptor information, for example images with relatively rich texture features or details. By selecting keyframe images for constructing the optical residuals when adjusting the 3D model, both the adjustment efficiency and the model's rendering of fine details can be further improved.
- the 3D model reconstruction platform can determine the target pose for observing the keyframe image based on the keyframe image.
- the target pose is determined based on keyframe images, enabling the construction of optical residuals between the digital image output by the 3D model and keyframe images with relatively rich texture features or details, and simultaneously enabling the construction of geometric residuals between the depth image output by the 3D model and the corresponding high-accuracy point cloud information. Using the poses obtained from keyframe images as the basis for residual construction therefore further improves the efficiency and effectiveness of 3D model training iterations.
- the 3D model reconstruction platform can further determine image pose information based on image information and image inertial information, and determine point cloud pose information based on point cloud information and point cloud inertial information.
- a point cloud map is generated based on the image information, point cloud information, point cloud inertial information, and image inertial information. Specifically, this includes the following steps: generating a point cloud map based on image information, point cloud information, point cloud pose information, and image pose information.
- the 3D model reconstruction platform can determine global pose information based on the point cloud pose information and image pose information.
- the solution provided in this application fuses image information with image inertial information, point cloud information, and point cloud inertial information to obtain corresponding image poses and point cloud poses. This allows for accurate mapping during image and point cloud construction, as the adjacent and matching relationships between the image and point cloud information can be determined based on their respective poses. By establishing global pose information, rapid matching and conversion between image and point cloud poses can be achieved, facilitating the output of a complete and accurate rendered image from the 3D model based on a specific pose from either the point cloud or the image.
- the 3D model reconstruction platform performs loop closure detection on the mapping results to obtain matching information.
- the matching information is used to indicate the matching status between the current frame and historical frames, and the mapping results are adjusted according to the matching information.
- the 3D model reconstruction platform acquires supplementary image information, supplementary point cloud information, supplementary image inertial information, and supplementary laser inertial information of the object to be reconstructed, and updates the mapping results based on the supplementary image information, supplementary point cloud information, supplementary laser inertial information, and supplementary image inertial information.
- the mapping speed is improved, and the current mapping progress can be fed back to the user in real time. This allows the user to know in a timely manner whether there are any local areas that have been missed or details that are missing within the object to be reconstructed during the mapping process, and to perform targeted supplementary scanning of the corresponding areas, thereby completing the construction of high-quality raw data for mapping.
- the user can configure the adjustment of the 3D model.
- the 3D model reconstruction platform obtains the number of iterations input by the user and instructs the 3D model to be adjusted according to the number of iterations. The number of iterations is used to indicate the number of adjustments to the 3D model. After the 3D model meets the number of iterations, the 3D model is output.
- the user controls the iterative training process of the 3D model through configuration, so that after the 3D model meets the corresponding adjustment requirements, the adjusted 3D model is output for subsequent service use.
- the second aspect or any implementation thereof is a device implementation corresponding to the first aspect or any implementation thereof.
- the description in the first aspect or any implementation thereof applies to the second aspect or any implementation thereof, and will not be repeated here.
- this application provides a computing device cluster, including at least one computing device, each computing device including a processor and a memory; the processor of the at least one computing device is used to execute instructions stored in the memory of the at least one computing device, so that the computing device cluster performs the method described in the first aspect and any implementation thereof.
- this application provides a computer program product containing instructions that, when executed by a cluster of computing devices, cause the cluster of computing devices to perform the method described in the first aspect and any implementation thereof.
- this application provides a computer-readable storage medium including computer program instructions, which, when executed by a cluster of computing devices, enable the cluster of computing devices to perform the method described in the first aspect and any implementation thereof.
- Figure 1 is a schematic diagram of an application scenario of the three-dimensional model reconstruction method based on multi-source input provided in the embodiments of this application;
- Figure 2 is a schematic diagram of another application scenario of the three-dimensional model reconstruction method based on multi-source input provided in the embodiments of this application;
- Figure 3 is a flowchart illustrating a three-dimensional model reconstruction method based on multi-source input provided in an embodiment of this application;
- Figure 4 is a schematic diagram of a mapping process for a three-dimensional model reconstruction method based on multi-source input provided in an embodiment of this application;
- Figure 5 is a schematic diagram of another mapping process of the three-dimensional model reconstruction method based on multi-source input provided in the embodiments of this application;
- Figure 6 is a schematic diagram of a keyframe determination process for a three-dimensional model reconstruction method based on multi-source input provided in an embodiment of this application;
- Figure 7 is a schematic diagram of a three-dimensional model reconstruction process of the three-dimensional model reconstruction method based on multi-source input provided in an embodiment of this application;
- Figure 8 is a schematic diagram of the device provided in an embodiment of this application.
- Figure 9 is a schematic diagram of the computing device for the three-dimensional model reconstruction method based on multi-source input provided in the embodiments of this application;
- Figure 10 is a schematic diagram of a computing device cluster for a three-dimensional model reconstruction method based on multi-source input provided in an embodiment of this application;
- Figure 11 is another schematic diagram of the computing device cluster of the three-dimensional model reconstruction method based on multi-source input provided in the embodiments of this application;
- Figure 12 is another schematic diagram of the computing device cluster of the three-dimensional model reconstruction method based on multi-source input provided in the embodiments of this application.
- references to "one embodiment" or "some embodiments" as described in this specification mean that one or more embodiments of this application include a specific feature, structure, or characteristic described in connection with that embodiment. Therefore, the phrases "in one embodiment," "in some embodiments," "in other embodiments," "in still other embodiments," etc., appearing in different parts of this specification do not necessarily refer to the same embodiment, but rather mean "one or more, but not all, embodiments," unless otherwise specifically emphasized.
- the terms “comprising,” “including,” “having,” and variations thereof mean “including but not limited to,” unless otherwise specifically emphasized.
- 3D model reconstruction refers to the process of converting real-world objects or scenes into computer-processable 3D digital models using computer vision and graphics processing techniques.
- Object to be reconstructed: the object or entity to be reconstructed using computer vision and graphics processing techniques. It can be a 3D space, a scene, an object-level model, a digital human, etc.
- Depth image: also known as a distance image, a depth image uses the distance (depth) from the image acquisition device to each point in the scene as its pixel values, directly reflecting the geometry of the visible surfaces of the scene. Depth images can be converted into point cloud data through coordinate transformation, and point cloud data with regularity and the necessary information can be converted back into depth image data.
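To make the conversion concrete, here is a minimal sketch of one direction (depth image to point cloud) under an assumed pinhole camera model; the helper function and intrinsic values are illustrative, not from the application:

```python
# Hedged sketch: back-projecting a depth image into a point cloud with a
# pinhole camera model. Intrinsics (fx, fy, cx, cy) are illustrative values.
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project each pixel (u, v) with depth d to a 3D point (X, Y, Z)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # keep only pixels with valid depth

cloud = depth_to_point_cloud(np.random.rand(480, 640),
                             fx=525.0, fy=525.0, cx=319.5, cy=239.5)
```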
- a digital image is an image represented by a finite number of digital pixels, represented by an array or matrix, where the illumination position and intensity are discrete.
- a digital image is an image obtained by digitizing an analog image, using pixels as its basic element, and can be stored and processed by a digital computer or digital circuit.
- the digital image can be a binary image, a color image, etc., such as an RGB image or a panchromatic image.
- Point clouds are datasets of points in space that can represent three-dimensional shapes or objects, typically acquired by a 3D scanner.
- the position of each point in a point cloud is described by a set of Cartesian coordinates (X,Y,Z), and some may contain color information (R,G,B) or information about the intensity of the object's reflective surface.
- the point cloud information can be a point cloud acquired directly by a laser scanner (LiDAR), or a point cloud acquired directly by other sensors, such as acoustic radar.
- a 3D model is a mathematical representation of a real or fictional object in three-dimensional space. This representation typically consists of a series of 3D coordinate points connected to form lines and surfaces, creating a complete geometric structure that accurately reproduces the shape, structure, and appearance of the object, enabling scene reconstruction. Examples include 3D Gaussian Splatting (3DGS), a scene reconstruction and rendering technique based on 3D Gaussian kernels. It combines the advantages of explicit and implicit representations, enabling scene reconstruction and efficient real-time rendering from pure image input and generating synthetic data from new perspectives; it represents a new paradigm for reconstruction in the current field of computer vision. Another example is the Neural Radiance Field (NeRF), an implicit 3D model reconstruction method based on deep neural networks. By learning the radiance and color information of each point in the scene, it can synthesize realistic images from any perspective; it constructs an implicit representation of the scene by sampling points in 3D space and predicting the radiance and color of each point.
- Rasterization: the core step in 3D Gaussian splatting rendering; a mathematical process that converts the mathematical description of an object and its associated color information into pixels at the corresponding screen locations and the colors used to fill those pixels.
- 3D Gaussian splatting: the mathematical process of rendering a 2D image from the distribution and color information of Gaussian kernels.
- Inertial measurement unit (IMU): a device that measures the three-axis attitude angles (or angular rates) and acceleration of an object.
- Pose: the position and orientation of an object relative to a reference coordinate system. Specifically, a pose includes the object's spatial position information and its rotation direction information, and can describe the position and orientation of any rigid body in three-dimensional space. Position typically refers to the coordinates of the object's center or a specific reference point in three-dimensional space; orientation typically refers to the object's rotation angle or rotation matrix, representing the object's rotation relative to the reference coordinate system.
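A minimal sketch of this definition, packing a position and a rotation matrix into the 4x4 homogeneous transform commonly used to represent a pose (the helper name is illustrative):

```python
# Hedged sketch: a pose as defined above, i.e., a position plus an
# orientation (rotation matrix), packed into a homogeneous transform.
import numpy as np

def pose_to_matrix(position, rotation):
    """position: (3,) translation; rotation: (3, 3) rotation matrix."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = position
    return T

T = pose_to_matrix(np.array([1.0, 0.0, 0.5]), np.eye(3))
```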
- Explicit representation: the traditional form of representation for 3D models, which models a scene or object explicitly, allowing users to edit and view it; it includes meshes, point clouds, voxels, etc.
- Implicit representation: based on machine learning methods such as deep neural networks, it describes the three-dimensional information of the scene in a parameterized way, constructing a mapping from three-dimensional spatial coordinates to the corresponding geometric/texture information.
- Simultaneous Localization and Mapping (SLAM): a real-time explicit 3D model reconstruction method that aims to enable a robot, starting from an unknown location in an unknown environment, to locate its own position and orientation by repeatedly observing map features (such as corners, pillars, etc.) during movement, and then incrementally build a map based on its own position, thereby achieving simultaneous localization and map construction.
- this application provides a three-dimensional model reconstruction method based on multi-source input. Mapping results, including point cloud information and image information of the object to be reconstructed, are obtained, and a three-dimensional model is established from them. Geometric residuals and optical residuals are then determined from the point cloud information and image information, and the three-dimensional model is further adjusted, improving modeling efficiency, accuracy, and realism.
- Figure 1 is a schematic diagram of an application scenario of the 3D model reconstruction method based on multi-source input provided in the embodiments of this application. Specifically:
- the 3D model reconstruction platform 12 can run on infrastructure 15, which includes at least one computing device for computation during the 3D model reconstruction process.
- User A can directly access the 3D model reconstruction platform 12, control the 3D model reconstruction process through it, and input corresponding operation commands during the process.
- the 3D model reconstruction platform 12 obtains the mapping result 16 of the object to be reconstructed and reconstructs a 3D model 17 based on the mapping result 16.
- the 3D model 17, as the output, can be used by subsequent digital asset platforms and digital twin simulation platforms.
- the digital asset platform can uniformly manage and schedule the reconstructed 3D model and the various types of multimodal synthetic data derived from it.
- by digital asset format, the assets can be divided into original 3D models, color images, depth maps, color point clouds, semantic maps, instance segmentation results, mesh models, etc.
- Synthetic data can be accumulated and can also be directly applied to the training of large-scale embodied models.
- the digital twin simulation platform can directly load dense 3D models and display highly realistic rendering results of 3D space to users in real time within the simulator interface. It supports users to roam in the simulation scene in real time and can also use the precise geometric information of the 3D model to realize downstream tasks such as ranging and navigation.
- Figure 2 is a schematic diagram of another application scenario of the 3D model reconstruction method based on multi-source input provided in the embodiments of this application. Specifically:
- the 3D model reconstruction platform 12 can be deployed on infrastructure 14 to perform fusion mapping based on multi-source inputs and output mapping results 16. Alternatively, it can be deployed on infrastructure 15 to reconstruct a 3D model 17 based on the mapping results 16 and output the 3D model 17. Furthermore, user A can directly access the 3D model reconstruction platform 12. Specifically, when the 3D model reconstruction platform 12 is mapping the object to be reconstructed, the user can monitor the mapping results in real time and input corresponding commands to control the mapping process.
- infrastructure 14 is a computing device deployed on the user's end, located close to the object to be reconstructed 10 for local mapping and computation. It also allows the user to view the mapping results 16 in real time.
- infrastructure 15 is deployed in the cloud.
- the cloud-provided infrastructure 15 provides the computing power for the 3D model reconstruction platform 12 during the 3D model reconstruction process based on the mapping results 16, meeting the computational needs of 3D model reconstruction.
- the digital asset 3D model reconstruction platform and the digital twin simulation platform can directly call the 3D model 17 on the cloud, making it more convenient and efficient.
- infrastructure 14 and infrastructure 15 can both be computing devices deployed on the user side, that is, the 3D model reconstruction platform 12 runs directly on the user-side device, thereby enabling the user to have full control over the process of fusion mapping and 3D model reconstruction on the 3D model reconstruction platform 12.
- data acquisition device 11 acquires data from the object 10 to obtain its multi-source input data.
- the acquisition device 11 can be various mapping devices such as robots, professional data acquisition equipment, and portable handheld devices, and it can include the sensors required for high-precision mapping, such as LiDAR, an IMU, and RGB (RGB color mode) cameras. Spatiotemporal alignment of the data across all sensors is achieved through methods such as extrinsic parameter calibration between the LiDAR and cameras, extrinsic calibration between the LiDAR and IMU, and hardware-triggered timestamp synchronization of the sensors.
- the acquisition devices mentioned above can be used individually or in combination, acquiring data simultaneously; this application does not limit this. When the acquisition device 11 collects the corresponding data, it obtains raw data such as initial image information, initial point cloud information, and inertial information.
- a unified data structure and a unified application programming interface (API) can be used during the acquisition process to improve mapping efficiency.
- the 3D Gaussian model will be used as an example of a three-dimensional model in the following embodiments.
- the 3D Gaussian model is merely one example of the three-dimensional models covered by this application.
- as noted in the definition of a three-dimensional model above, many other models in the field of three-dimensional model reconstruction can achieve three-dimensional reconstruction of a scene; the example should not be construed as a specific limitation.
- Figure 3 is a flowchart illustrating a three-dimensional model reconstruction method based on multi-source input provided in an embodiment of this application. Specifically:
- Step S201: Obtain the mapping results of the object to be reconstructed.
- the mapping results include the image information and point cloud information of the object to be reconstructed.
- the 3D model reconstruction platform 12 acquires the mapping results of the object to be reconstructed, which serve as input for the 3D model reconstruction.
- the mapping results can be provided by the user using different methods based on their business needs and the characteristics of the object to be reconstructed; this application does not limit this. It is worth noting that the mapping results include image information and point cloud information of the object to be reconstructed.
- the image information can be the initial image information of the object to be reconstructed acquired by the acquisition device, or it can be specific image information after fusion and fitting, reflecting information such as the color and shape of the object to be reconstructed.
- the point cloud information can be, for example, laser point cloud information. Laser point cloud information can accurately determine the depth and geometric information of the object to be reconstructed; compared with shapes determined by structured light or pure vision methods, it offers higher precision and accuracy.
- the mapping result can also be generated by the 3D model reconstruction platform 12 itself.
- for the specific steps of step S201, please refer to Figure 4.
- Figure 4 is another flowchart illustrating the 3D model reconstruction method based on multi-source input provided in this embodiment of the application. Specifically:
- Step S301: Obtain the initial image information and initial point cloud information of the object to be reconstructed, as well as the image inertial information and point cloud inertial information recorded by the acquisition device while capturing the initial image information and the initial point cloud information, respectively.
- the 3D model reconstruction platform 12 acquires, through the acquisition device 11, the initial image information and initial point cloud information of the object to be reconstructed, together with the corresponding image inertial information and point cloud inertial information.
- Figure 5 is a schematic diagram of a mapping process for a 3D model reconstruction method based on multi-source input provided in the application embodiment, wherein:
- Step S3011: Obtain the initial point cloud information of the object to be reconstructed.
- the 3D model reconstruction platform 12 obtains the initial point cloud information of the object to be reconstructed from the data reported by the acquisition device 11.
- the initial point cloud information can be a set of points acquired by the acquisition device 11, such as a LiDAR, along a certain acquisition path with the corresponding poses.
- the initial point cloud information includes frame-by-frame point cloud information.
- Step S3012: Obtain the inertial information of the object to be reconstructed.
- the 3D model reconstruction platform 12 acquires the inertial information of the object to be reconstructed during the acquisition of initial point cloud information and initial image information by collecting data reported by the acquisition device 11.
- the acquisition device 11 can simultaneously integrate a LiDAR, an RGB camera, and an IMU to achieve synchronous data acquisition and improve acquisition accuracy.
- the inertial information of the object to be reconstructed is the motion prior information recorded while acquiring the initial point cloud information and initial image information. This motion prior information can include angular velocity, acceleration, and other quantities.
- Step S3013: Obtain the initial image information of the object to be reconstructed.
- the 3D model reconstruction platform 12 obtains the initial image information of the object to be reconstructed from the data reported by the acquisition device 11.
- This initial image information can be acquired by the acquisition device 11, such as an RGB camera, including frame-by-frame images of the object to be reconstructed during the acquisition process.
- Step S302: Generate the mapping results based on the initial image information, initial point cloud information, point cloud inertial information, and image inertial information.
- the 3D model reconstruction platform 12 integrates frame-by-frame laser point clouds with motion prior information provided by the IMU. Furthermore, it integrates image features with the motion prior information provided by the IMU to jointly estimate the pose state of the entire initial image information and initial point cloud information, generating the mapping result.
- Step S3024: Generate point cloud pose information.
- the 3D model reconstruction platform 12 integrates frame-by-frame laser point cloud with motion prior information provided by IMU, matches the features of adjacent point clouds, and estimates the point cloud pose information in real time.
- Step S3025: Determine descriptor information based on the initial image information.
- the 3D model reconstruction platform 12 determines the descriptor information of each image frame based on the initial image information reported by the acquisition device 11.
- the descriptor information can be perspective-n-point (PnP) descriptor information.
- Figure 6 is a schematic diagram of a keyframe determination process for a 3D model reconstruction method based on multi-source input provided in an embodiment of this application. The steps are as follows:
- Step S3025: Determine descriptor information based on the initial image information.
- the 3D model reconstruction platform 12 can perform frame-by-frame judgment on the initial image information and determine the image's descriptor information frame by frame.
- the descriptor information is a PnP descriptor.
- the PnP descriptor information may include: the PnP residual, an index of image pose reliability (a high residual indicates unreliable pose accuracy); and the number of PnP matched pairs, an index of image texture feature richness (a low number of pairs indicates severe visual degradation).
- the 3D model reconstruction platform 12 evaluates the initial image information frame by frame against a preset rule.
- this preset rule can also be configured by the user to filter images that conform to specific requirements.
- This step allows for the filtering of input data, especially for selecting relevant image data that meets specific requirements, thus providing specific images for subsequent mapping and 3D model reconstruction.
- Step S303: Determine the keyframe images based on the initial image information and descriptor information.
- the 3D model reconstruction platform 12 filters the initial image information frame by frame according to the descriptor information.
- the images retained by this filtering have pose accuracy and image texture feature richness that satisfy the PnP descriptor requirements, and are regarded as keyframe images.
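A minimal sketch of such a keyframe filter; the threshold values and field names are assumptions for illustration, since the application does not specify concrete numbers:

```python
# Hedged sketch of keyframe filtering by PnP descriptor information.
from dataclasses import dataclass

@dataclass
class FrameDescriptor:
    pnp_residual: float   # pose reliability: a high residual means an unreliable pose
    pnp_pair_count: int   # texture richness: a low count means visual degradation

def is_keyframe(desc: FrameDescriptor,
                max_residual: float = 2.0,   # assumed threshold
                min_pairs: int = 50) -> bool:  # assumed threshold
    """Keep a frame only if its pose is reliable and its texture is rich enough."""
    return desc.pnp_residual <= max_residual and desc.pnp_pair_count >= min_pairs

frames = [FrameDescriptor(1.2, 80), FrameDescriptor(5.0, 12)]
keyframes = [f for f in frames if is_keyframe(f)]  # keeps only the first frame
```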
- Step S3026: Generate image pose information.
- the 3D model reconstruction platform 12 integrates image features and motion prior information provided by the IMU to estimate the spatiotemporal displacement relationship between two adjacent frames and output the image pose.
- Step S3027: The 3D model reconstruction platform 12 performs loop closure detection.
- a robot moves along a corridor, enters a room, and then returns to the corridor.
- the robot continuously acquires images and records its pose (position and orientation).
- each image is processed to extract key features, such as Oriented FAST and Rotated BRIEF (ORB) features.
- These feature points can be represented using feature descriptors.
- a feature matching algorithm is used to find similar feature points. If a sufficient number of matching feature points are found (e.g., exceeding a certain threshold), a potential loop closure is considered to exist. For each loop closure candidate, the corresponding image and robot pose are recorded.
- a pose graph is constructed, where nodes represent robot poses and edges represent relative transformations between adjacent poses.
- a constraint edge is added, indicating that the robot returns to its previous position.
- the feature matching results of the loop closure candidates and the robot pose information are used to verify the authenticity of the loop closure. Further validation can be achieved by calculating the geometric consistency among loop closure candidates, for example, by using a Random Sample Consensus (RANSAC) algorithm to estimate the optimal relative pose.
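A minimal sketch of this loop-closure candidate check using OpenCV's ORB features and a RANSAC-based geometric consistency test; the match-count thresholds are illustrative assumptions:

```python
# Hedged sketch of loop-closure verification: ORB feature matching followed
# by a RANSAC geometric consistency check. Inputs are grayscale uint8 images.
import cv2
import numpy as np

def is_loop_closure(img_current, img_historical, min_matches=40):
    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(img_current, None)
    kp2, des2 = orb.detectAndCompute(img_historical, None)
    if des1 is None or des2 is None:
        return False
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    if len(matches) < min_matches:
        return False  # not enough matched feature points for a candidate
    # Geometric consistency: estimate a fundamental matrix with RANSAC and
    # count the inlier matches consistent with a single rigid motion.
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    _, inliers = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)
    return inliers is not None and int(inliers.sum()) >= min_matches // 2
```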
- Step S3028: The 3D model reconstruction platform 12 performs backend optimization.
- after loop closure detection is completed, the 3D model reconstruction platform 12 performs closed-loop optimization on the mapping results based on the matching information of the current loop closures, eliminating mapping misalignment caused by sensor errors.
- the matching results are optimized through constraints from point cloud information and image information.
- Step S3029: The 3D model reconstruction platform 12 generates the mapping results.
- the 3D model reconstruction platform 12 can match the corresponding initial point cloud information and initial image information with the image pose information and point cloud pose information to generate the corresponding mapping result.
- the mapping result includes image information and point cloud information.
- the image information can be a set of images from the initial image information, and the point cloud information can be a set of point clouds from the initial point cloud information.
- the image information includes keyframe images.
- in some embodiments, the 3D model reconstruction platform 12 does not determine descriptor information or identify keyframe images; in that case, the image information in the mapping result is simply a subset of the initial image information.
- the mapping process of steps S3011-S3029 is one embodiment of this application; any combination of the steps, in any execution order, can fall within the protection scope of this application.
- steps S3027, S3025, and S3028 can be additional steps through which the 3D model reconstruction platform 12 builds a map based on the initial image information, initial point cloud information, and corresponding initial inertial information.
- mapping efficiency can be greatly improved.
- Point cloud information can provide more accurate geometric and depth information of the object to be reconstructed.
- the introduction of IMU inertial information can more quickly and accurately realize the input of motion prior information when the acquisition device acquires initial point cloud information and initial image information.
- the accuracy of pose matching and fitting is higher, avoiding the heavy computation otherwise needed to estimate relatively coarse geometric and motion information from image information alone, thus greatly improving mapping efficiency and accuracy.
- the 3D model reconstruction platform 12 can provide real-time mapping feedback to the user during the mapping process.
- the user can promptly take remedial measures such as supplementary data acquisition.
- Step S3023: The 3D model reconstruction platform 12 obtains the supplementary acquisition command.
- the 3D model reconstruction platform 12 waits for the acquisition device 11 to report the supplementary acquisition data based on the supplementary acquisition command input by user A.
- This data includes supplementary image information, supplementary point cloud information, supplementary image inertial information, and supplementary point cloud inertial information.
- the platform updates the mapping results. This improves mapping efficiency and ensures the integrity of the mapping results for the object to be reconstructed.
- the 3D model reconstruction platform 12 provides several interfaces that allow users to make customized queries for relevant content; a sketch of these interfaces follows the list:
- GetCameraInfo(), returning CameraInfo: gets the camera parameters and status at the current moment. Object definition: current image frame timestamp, image number, image intrinsic parameters, image pose information, whether it is a keyframe, and image data.
- GetLidarInfo(), returning LidarInfo: gets the current LiDAR parameters and status. Object definition: LiDAR type (including solid-state LiDAR and multi-line LiDAR), number of LiDAR lines, LiDAR current frame timestamp, LiDAR point cloud frame number, LiDAR pose information, whether a loop closure is detected at the current position, the LiDAR point cloud frame number associated with the loop closure, and the original LiDAR point cloud data.
- GetIMUInfo(): gets the IMU parameters and status at the current time. Object definition: IMU type (including six-axis and nine-axis), IMU timestamp, and IMU data.
- GetMapInfo(), returning MapInfo: gets the global point cloud mapping result at the current moment. Object definition: point cloud coordinates and point cloud color values.
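As a concrete illustration, these query interfaces might map onto data structures like the following Python sketch; the field names paraphrase the object definitions above, and all types and signatures are assumptions, since the application specifies only the object contents:

```python
# Hedged sketch of the query interfaces; all types are assumptions.
from dataclasses import dataclass
import numpy as np

@dataclass
class CameraInfo:
    timestamp: float        # current image frame timestamp
    frame_id: int           # image number
    intrinsics: np.ndarray  # image intrinsic parameters, shape (3, 3)
    pose: np.ndarray        # image pose information, shape (4, 4)
    is_keyframe: bool       # whether this frame is a keyframe
    image: np.ndarray       # image data

@dataclass
class MapInfo:
    points: np.ndarray      # point cloud coordinates, shape (N, 3)
    colors: np.ndarray      # point cloud color values, shape (N, 3)

class MappingPlatform:
    """Hypothetical facade exposing the query interfaces described above."""
    def GetCameraInfo(self) -> CameraInfo: ...
    def GetMapInfo(self) -> MapInfo: ...
```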
- Interfaces that rely on user input include:
- mapping process is one embodiment of this application. It should be understood that the above mapping steps do not limit the mapping results in this application during the 3D model reconstruction process.
- Step S202: Establish a three-dimensional model of the object to be reconstructed based on the mapping results.
- the 3D model reconstruction platform 12 establishes a 3D Gaussian model based on the mapping results.
- this 3D Gaussian model typically includes the following five parameters (collected in the data-structure sketch after the list):
- Location: also known as the mean, it represents the coordinates of the center of the 3D Gaussian kernel in three-dimensional space.
- Covariance: represents the shape distribution of the 3D Gaussian kernel. The three column vectors of the covariance matrix represent the three principal axis directions of the Gaussian ellipsoid.
- Scaling factor: represents the size of each 3D Gaussian kernel.
- Opacity: represents the transparency information of the 3D Gaussian kernel. The higher the opacity, the closer the Gaussian kernel is to the object's surface.
- Lighting parameters: encode lighting information from different viewpoints, reflecting the changes in lighting within the 3D scene.
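Collected as a plain data structure, the five parameters might look like this sketch; the array shapes, and the use of spherical harmonics for the lighting parameters, follow common 3DGS practice and are assumptions here:

```python
# Hedged sketch of the five per-kernel parameters listed above.
from dataclasses import dataclass
import numpy as np

@dataclass
class GaussianKernel:
    position: np.ndarray   # mean: center of the kernel in 3D space, shape (3,)
    covariance: np.ndarray # shape distribution of the kernel, shape (3, 3)
    scale: np.ndarray      # scaling factor per principal axis, shape (3,)
    opacity: float         # higher values lie closer to the object surface
    sh_coeffs: np.ndarray  # lighting parameters (spherical harmonic coefficients)
```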
- feature points or landmarks are selected from the mapping results. These feature points can be fixed points in the environment, such as markings on walls, corners of rooms, etc.
- the mean vector μ of each feature point is determined, where μ is typically the estimated position of the feature point in the world coordinate system.
- the covariance matrix Σ is determined, which describes the uncertainty of the feature point's position.
- a 3D Gaussian model is then built. For each selected feature point, a 3D Gaussian distribution model is constructed using the mean vector and covariance matrix determined above to complete the 3D Gaussian model construction.
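A minimal sketch of this initialization follows; the application does not state how Σ is obtained, so seeding each covariance from the spread of a point's k nearest neighbors is an assumption for illustration:

```python
# Hedged sketch: one 3D Gaussian per feature point, mean at the estimated
# world position, covariance seeded from the k nearest neighbors' spread.
import numpy as np

def init_gaussians(points: np.ndarray, k: int = 8):
    """points: (N, 3) feature-point positions in the world frame."""
    gaussians = []
    for p in points:
        d = np.linalg.norm(points - p, axis=1)
        nbrs = points[np.argsort(d)[1:k + 1]]      # k nearest neighbors (self excluded)
        mu = p                                     # mean: estimated world position
        sigma = np.cov(nbrs.T) + 1e-6 * np.eye(3)  # positional uncertainty, regularized
        gaussians.append((mu, sigma))
    return gaussians
```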
- Step S203: Determine the target image information and target point cloud information.
- the target image information and target point cloud information of the object to be reconstructed are determined from the mapping results under the observation of the target pose.
- the target pose includes position information and attitude information.
- the target pose can be the position and attitude information used to observe the target image information and target point cloud information in the mapping results.
- This target pose can be input by user A, or it can be selected by the 3D model reconstruction platform 12 according to preset rules, such as setting random rules, and the 3D model reconstruction platform 12 determines the target pose according to the preset rules.
- Step S203 (optional): Determine the target pose based on the keyframe images.
- the target pose can be determined based on keyframe images determined by the 3D model reconstruction platform 12 during the mapping process. Specifically, when the keyframe images are determined based on descriptor information, they carry corresponding pose information. When determining the target pose, the pose information of the keyframe images can be directly determined as the target pose. As mentioned above, keyframe images are usually determined through descriptor information and have the characteristics of high image texture feature richness and accurate pose. Therefore, by determining the target pose based on keyframe images, the efficiency and effectiveness of subsequent 3D model adjustment and training can be improved.
- Step S204: Instruct the 3D model to generate a target depth image and a target digital image based on the target pose.
- the 3D model reconstruction platform 12 instructs the 3D Gaussian model to generate a target depth image and a target digital image based on the target pose. Taking the target digital image as an example: based on the target pose, the Gaussian model's coordinates are transformed from the world coordinate system into the target pose's coordinate system, and the position of each 3D Gaussian in that coordinate system is projected onto the 2D image plane. The portion outside the image boundary is then cropped. Points are sampled on the image plane, the probability that each sampled point belongs to a given 3D Gaussian is calculated, and the sampled probability values are converted into grayscale or color values to obtain the final image.
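The projection step can be sketched as follows; this follows the EWA-splatting approximation commonly used in 3DGS implementations and is an illustration rather than the application's exact procedure (T_cam_world and K are an assumed camera extrinsic matrix and intrinsic matrix):

```python
# Hedged sketch: project one 3D Gaussian onto the image plane of the target
# pose, propagating its covariance through the projection Jacobian.
import numpy as np

def project_gaussian(mu_world, cov_world, T_cam_world, K):
    R, t = T_cam_world[:3, :3], T_cam_world[:3, 3]
    mu_cam = R @ mu_world + t                 # world frame -> camera frame
    x, y, z = mu_cam
    fx, fy = K[0, 0], K[1, 1]
    uv = (K @ (mu_cam / z))[:2]               # pinhole projection to pixels
    # Jacobian of the perspective projection evaluated at mu_cam.
    J = np.array([[fx / z, 0.0, -fx * x / z**2],
                  [0.0, fy / z, -fy * y / z**2]])
    cov_2d = J @ R @ cov_world @ R.T @ J.T    # 2x2 image-plane covariance
    return uv, cov_2d
```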
- Step S205: Determine the geometric residual based on the target point cloud information and the target depth image.
- the 3D model reconstruction platform 12 determines the geometric residual based on the target point cloud information of the object to be reconstructed under the target pose observation in the mapping results, and the target depth image generated by the 3D Gaussian model based on the target pose. For example, the 3D model reconstruction platform 12 can determine the target point cloud information corresponding to the target pose based on the target pose, compare the target point cloud information with the target depth image, and then calculate the distance between these point clouds and the expected position of the Gaussian distribution to determine the geometric residual.
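A minimal sketch of such a geometric residual, assuming the point cloud is already expressed in the target camera frame and using a nearest-pixel depth lookup with an L1 penalty (both simplifications are assumptions):

```python
# Hedged sketch: compare the model's rendered depth at each projected
# point-cloud location against the point's measured depth.
import numpy as np

def geometric_residual(depth_image, target_points_cam, K):
    """target_points_cam: (N, 3) point cloud in the target camera frame."""
    z = target_points_cam[:, 2]
    uv = (K @ (target_points_cam / z[:, None]).T).T[:, :2]  # project to pixels
    u = np.clip(uv[:, 0].round().astype(int), 0, depth_image.shape[1] - 1)
    v = np.clip(uv[:, 1].round().astype(int), 0, depth_image.shape[0] - 1)
    rendered = depth_image[v, u]              # rendered depth at those pixels
    return np.abs(rendered - z).mean()        # L1 depth residual
```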
- Step S206: Determine the optical residual based on the target image information and the target digital image.
- the 3D model reconstruction platform 12 determines the optical residual based on the target image information of the object to be reconstructed under the target pose observation in the mapping results, and the target digital image generated by the 3D Gaussian model based on the target pose. For example, the 3D model reconstruction platform 12 can determine, based on the target pose, the target image information corresponding to that pose; if the target pose is the pose information of a keyframe image, the target image information is that keyframe image. The residual can be defined as the difference between the actual pixel values and the rendered pixel values, so the optical residual is determined by comparing the target image information with the target digital image.
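A minimal sketch of the optical residual as a per-pixel L1 difference; 3DGS implementations commonly combine L1 with a structural-similarity term, but the application does not fix a particular form, so plain L1 is an assumption:

```python
# Hedged sketch: per-pixel photometric difference between the target image
# (e.g., a keyframe) and the digital image rendered by the model.
import numpy as np

def optical_residual(target_image, rendered_image):
    return np.abs(target_image.astype(np.float32)
                  - rendered_image.astype(np.float32)).mean()
```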
- Step S207: Adjust the 3D model based on the geometric residuals and optical residuals.
- the 3D model reconstruction platform 12 adjusts the 3D Gaussian model based on the optical and geometric residuals obtained in the above steps.
- a nonlinear optimization algorithm, such as gradient descent, is used to minimize the geometric and/or optical residuals while updating the parameters of each Gaussian distribution, until convergence or the maximum number of iterations is reached.
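Putting the two residuals together, the adjustment loop might look like the following PyTorch-style sketch; the differentiable render() function, the Adam optimizer, and the weighting coefficient lambda_geo are all assumptions for illustration:

```python
# Hedged sketch: gradient descent on a weighted sum of optical and
# geometric residuals. `params` is a list of tensors with requires_grad=True,
# and render(params) is an assumed differentiable rendering function
# returning (rgb, depth) tensors.
import torch

def adjust_model(params, render, target_image, target_depth,
                 lambda_geo=0.5, iterations=1000, lr=1e-3):
    optimizer = torch.optim.Adam(params, lr=lr)
    for _ in range(iterations):
        optimizer.zero_grad()
        rgb, depth = render(params)                 # differentiable rendering
        loss = (torch.abs(rgb - target_image).mean()          # optical residual
                + lambda_geo * torch.abs(depth - target_depth).mean())  # geometric
        loss.backward()                             # gradients w.r.t. the Gaussians
        optimizer.step()
    return params
```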
- the optical and geometric residuals are simultaneously introduced to adjust the 3D model.
- the geometric residuals are introduced through point cloud information to supervise the training of the 3D model, so that the 3D model, after adjustment, is based on real scale information and has higher geometric accuracy and richer detail features.
- Step S208: Perform anti-aliasing calculations based on the target pose.
- rendering anomalies occur when there is a significant discrepancy between the resolution of the user-input rendering viewpoint and that of the training viewpoint, because the opacity parameter trained at the original resolution is only valid for that resolution. When the user performs a significant scaling operation, the opacity parameter error causes abnormal occlusion within the image plane. Based on this, one embodiment of this application calculates an opacity compensation coefficient at the current pose, defined in terms of the covariance matrix Σ and the identity matrix I, to compensate for the opacity of the target digital image during the 3D Gaussian model rendering process. By compensating the opacity parameter according to the current resolution, the rasterization framework can correctly render texture details at different resolutions.
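The application's exact formula for the compensation coefficient is not reproduced in this text. Purely as an assumed illustration, Mip-Splatting-style anti-aliasing computes such a coefficient from the determinant ratio of the screen-space covariance before and after dilation by sI:

```latex
% Assumed illustration only, not the application's stated formula:
% opacity compensation from the determinant ratio of the 2D covariance.
\hat{\alpha} = \alpha \cdot \sqrt{\frac{\det(\Sigma)}{\det(\Sigma + s\,I)}}
```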
- Step S209: The 3D model reconstruction platform 12 instructs the 3D model to adjust its pose during the rendering process.
- image pose supervision is achieved.
- when the image pose shifts beyond a certain threshold, the image pose is adjusted to reduce the error.
- ghosting phenomena appearing in the 3D Gaussian model reconstruction results are reduced, and rendering quality is further improved.
- the 3D model reconstruction platform 12 can provide users with corresponding interfaces for configuring and controlling the 3D model reconstruction process, for example:
- GetGaussMapInfo(), returning GaussMapInfo: gets the parameters of the 3D Gaussian model. Object definition: Gaussian kernel center position, covariance, scaling factor, opacity, and spherical harmonic function parameters.
- GetTrainingInfo(): gets the current status and feedback of the 3D Gaussian model training task. Object definition: iteration steps, optical residual, geometric residual, peak signal-to-noise ratio (PSNR, used to evaluate image rendering quality), and structural similarity (used to evaluate the similarity between the rendered image and the ground truth).
- GetCameraOptimizationInfo(), returning CameraOptimizationInfo: gets the image pose optimization result. Object definition: keyframe number, keyframe timestamp, original pose, optimized image pose, and image pose correction magnitude.
- Interfaces that rely on user input include:
- SetConfig(Params, UseGeoLoss, UseCameraOpt, UseAntiAliasing): sets the training parameters and input parameters for the 3D Gaussian reconstruction module.
- Params: includes basic parameter configurations such as training iteration steps and optimizer learning rate. The reconstruction framework provides recommended settings, based on the scale of the reconstructed scene, for user reference.
- UseGeoLoss: whether to use geometric residual supervision.
- UseCameraOpt: whether to enable joint optimization of camera extrinsic parameters.
- UseAntiAliasing: whether to enable anti-aliasing calculation.
- user A can configure whether to enable anti-aliasing calculation, the number of training iterations, etc., through the 3D model reconstruction platform 12. For example, user A inputs the number of iterations, and the 3D model reconstruction platform 12 instructs the number of adjustments to the 3D model according to the number of iterations input by user A. After the 3D model has undergone the number of iterations based on the geometric residuals and optical residuals, it outputs the adjusted 3D model.
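For example, the configuration call might be used as follows; the stub class and all parameter values are hypothetical, with only the interface name and flags taken from the list above:

```python
# Hypothetical usage of the SetConfig interface described above.
class ReconstructionPlatform:
    def SetConfig(self, Params, UseGeoLoss, UseCameraOpt, UseAntiAliasing):
        self.config = dict(Params=Params, UseGeoLoss=UseGeoLoss,
                           UseCameraOpt=UseCameraOpt,
                           UseAntiAliasing=UseAntiAliasing)

platform = ReconstructionPlatform()
platform.SetConfig(
    Params={"iterations": 30000, "learning_rate": 1.6e-4},  # assumed fields
    UseGeoLoss=True,       # supervise training with geometric residuals
    UseCameraOpt=True,     # jointly optimize camera extrinsic parameters
    UseAntiAliasing=True,  # enable opacity compensation at render time
)
```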
- Figure 8 is a structural schematic diagram of the three-dimensional model reconstruction platform provided in an embodiment of this application. Specifically, it includes the following modules:
- the acquisition module 901 is used to acquire the mapping result of the object to be reconstructed, wherein the mapping result includes the image information and point cloud information of the object to be reconstructed;
- the establishment module 903 is used to establish a 3D model of the object to be reconstructed based on the mapping results.
- the determination module 902 is also used to determine the target image information and target point cloud information of the object to be reconstructed under the target pose observation from the mapping results.
- the target pose includes position information and attitude information.
- Instruction module 904 is used to instruct the 3D model to generate a target depth image and a target digital image based on the target pose;
- the determination module 902 is used to determine the geometric residual based on the target point cloud information and the target depth image; the determination module 902 is also used to determine the optical residual based on the target image information and the target digital image.
- Adjustment module 905 is used to adjust the three-dimensional model based on geometric residuals and optical residuals.
- the three-dimensional model reconstruction platform 12 may further include:
- the generation module 906 is used to generate mapping results based on initial image information, initial point cloud information, point cloud inertial information, and image inertial information.
- the update module 907 is used to update the mapping results based on the supplemented image information, supplemented point cloud information, supplemented laser inertial information, and supplemented image inertial information.
- Output module 908 is used to output a 3D model after the number of iterations is satisfied.
- this application embodiment describes example implementations of the acquisition module 901, determination module 902, establishment module 903, instruction module 904, and adjustment module 905; the implementation of the generation module 906, update module 907, and output module 908 can refer to those of the aforementioned modules.
- the acquisition module 901, the determination module 902, the establishment module 903, the indication module 904, and the adjustment module 905 can all be implemented in software or in hardware.
- the implementation of the acquisition module 901 will be described below.
- the implementation of the determination module 902, the establishment module 903, the indication module 904, and the adjustment module 905 can refer to the implementation of the acquisition module 901.
- the acquisition module 901 may include code running on a computing instance.
- the computing instance may include at least one of a physical host (computing device), a virtual machine, and a container. Further, the aforementioned computing instance may be one or more.
- module 901 may include code running on multiple hosts/virtual machines/containers. It should be noted that the multiple hosts/virtual machines/containers used to run the code may be distributed in the same region or in different regions. Further, the multiple hosts/virtual machines/containers used to run the code may be distributed in the same availability zone or in different availability zones, each availability zone including one data center or multiple geographically proximate data centers. Typically, a region may include multiple availability zones.
- the acquisition module 901 may include at least one computing device, such as a server.
- the acquisition module 901 may also be a device implemented using an application-specific integrated circuit (ASIC) or a programmable logic device (PLD).
- the PLD may be implemented using a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
- the acquisition module 901, the determination module 902, the establishment module 903, the instruction module 904, and the adjustment module 905 can each be used to execute any step of the multi-source input-based 3D model reconstruction method.
- the steps implemented by the acquisition module 901, determination module 902, establishment module 903, instruction module 904, and adjustment module 905 can be specified as needed.
- FIG. 9 is a schematic diagram of the structure of a computing device according to an embodiment of this application for a three-dimensional model reconstruction method based on multi-source input.
- the computing device 900 includes: a bus 911, a processor 912, a memory 910, and a communication interface 909.
- the processor 912, the memory 910, and the communication interface 909 communicate via the bus 911.
- the computing device 900 can be a server or a terminal device. It should be understood that this application does not limit the number of processors and memories in the computing device 900.
- Bus 911 can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. Buses can be categorized as address buses, data buses, control buses, etc. For ease of illustration, only one line is used in Figure 9, but this does not indicate that there is only one bus or one type of bus. Bus 911 can include pathways for transmitting information between various components of computing device 900 (e.g., memory 910, processor 912, communication interface 909).
- the processor 912 may include any one or more of the following processors: a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), or a digital signal processor (DSP).
- the memory 910 may include volatile memory, such as random access memory (RAM).
- the memory 910 may also include non-volatile memory, such as read-only memory (ROM), flash memory, hard disk drive (HDD), or solid state drive (SSD).
- the memory 910 stores executable program code, which the processor 912 executes to implement the functions of the acquisition module 901, the determination module 902, the establishment module 903, the instruction module 904, and the adjustment module 905, thereby realizing the 3D model reconstruction method based on multi-source input.
- the memory 910 stores instructions for the 3D model reconstruction platform to execute the 3D model reconstruction method based on multi-source input.
- the communication interface 909 uses transceiver modules, such as, but not limited to, network interface cards and transceivers, to enable communication between the computing device 900 and other devices or communication networks.
- the computing device cluster includes at least one computing device.
- the computing device can be a server, such as a central server, an edge server, or a local server in a local data center.
- the computing device can also be a terminal device such as a desktop computer, a laptop computer, or a smartphone.
- Figure 10 is a schematic diagram of a computing device cluster for a multi-source input-based 3D model reconstruction method according to an embodiment of this application.
- the computing device cluster includes at least one computing device 900.
- the memory 910 of one or more computing devices 900 in the computing device cluster may store the same instructions for the 3D model reconstruction platform to execute the multi-source input-based 3D model reconstruction method.
- one or more computing devices 900 in the computing device cluster can also be used to execute some of the instructions of the 3D model reconstruction platform for executing a 3D model reconstruction method based on multi-source input.
- a combination of one or more computing devices 900 can jointly execute the instructions of the 3D model reconstruction platform for executing a 3D model reconstruction method based on multi-source input.
- the memory 910 in different computing devices 900 within the computing device cluster can store different instructions for executing some functions of the 3D model reconstruction platform. That is, the instructions stored in the memory 910 of different computing devices 900 can implement the functions of one or more modules among the acquisition module 901, determination module 902, establishment module 903, instruction module 904, and adjustment module 905.
- the memory 910 of one or more computing devices 900 in the computing device cluster may also store partial instructions for executing the multi-source input-based 3D model reconstruction method.
- a combination of one or more computing devices 900 can jointly execute instructions for executing the multi-source input-based 3D model reconstruction method.
- Figure 11 is another schematic diagram of the computing device cluster of the multi-source input-based 3D model reconstruction method according to an embodiment of this application.
- two computing devices 900A and 900B are connected through a communication interface 909.
- the memory in computing device 900A stores instructions for executing the determination module 902, the establishment module 903, and the adjustment module 905.
- the memory in computing device 900B stores instructions for executing the functions of the acquisition module 901 and the instruction module 904.
- the memory 910 of computing devices 900A and 900B jointly stores the instructions used by the 3D model reconstruction platform to execute the multi-source input-based 3D model reconstruction method.
- the connection method of the computing device cluster shown in Figure 11 can be considered as follows: because the multi-source input-based 3D model reconstruction method provided in this application requires a large amount of data to be transmitted to the acquisition module 901, in order to avoid overloading computing device 900A, the functions of the acquisition module 901 and the instruction module 904 are delegated to computing device 900B.
- the functions of computing device 900A shown in Figure 11 can also be performed by multiple computing devices 900; similarly, the functions of computing device 900B can also be performed by multiple computing devices 900.
- Figure 12 is another schematic diagram of the computing device cluster structure of the 3D model reconstruction method based on multi-source input according to an embodiment of this application.
- one or more computing devices in the computing device cluster can be connected via a network.
- the network can be a wide area network (WAN) or a local area network (LAN), etc.
- Figure 12 shows one possible implementation, where two computing devices 900C and 900D are connected via a network. Specifically, they are connected to the network through the communication interface in each computing device.
- the memory 910 in computing device 900C stores instructions for executing the determination module 902, the establishment module 903, and the adjustment module 905. Simultaneously, the memory 910 in computing device 900D stores instructions for executing the acquisition module 901 and the instruction module 904.
- the connection method of the computing device cluster shown in Figure 12 can be considered as follows: the multi-source input-based three-dimensional model reconstruction method provided in this application requires a large amount of data transmission, so the devices are connected through a network; since the execution of these functions is relatively independent, and in order to achieve the best storage and computing performance, the data transmission function is assigned to computing device 900D.
- the functions of computing device 900C shown in Figure 12 can also be performed by multiple computing devices 900; similarly, the functions of computing device 900D can also be performed by multiple computing devices 900.
- the computer program product may be a software or program product containing instructions, capable of running on a computing device or stored on any usable medium.
- when the computer program product is run on at least one computing device, it causes the at least one computing device to execute the above-described multi-source input-based 3D model reconstruction method applied to a 3D model reconstruction platform.
- the computer-readable storage medium can be any available medium on which a computing device can store data, or a data storage device such as a data center containing one or more available media.
- the available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state drive).
- the computer-readable storage medium includes instructions that instruct a computing device to execute the above-described method applied to a 3D model reconstruction platform for performing a 3D model reconstruction based on multi-source input.
Description
This disclosure claims priority to Chinese Patent Application No. 202410626989.0, filed on May 20, 2024, entitled "A Three-Dimensional Reconstruction Method and Apparatus Based on Cloud Computing Technology", and to Chinese Patent Application No. 202411096682.0, filed on August 9, 2024, entitled "A Three-Dimensional Model Reconstruction Method and Apparatus Based on Multi-Source Input", the entire contents of which are incorporated herein by reference.
This application relates to the field of three-dimensional model reconstruction, and in particular to a method and apparatus for three-dimensional model reconstruction based on multi-source input.
3D reconstruction is a mathematical process and computer technology for restoring the original three-dimensional information of a scene. Early 3D model reconstruction methods typically used single-view or multi-view image information as input. Limited by the input data, the reconstructed 3D models were usually incomplete and lacked realism. High-precision 3D reconstruction schemes based on pure visual input had cumbersome data acquisition processes and low work efficiency. Due to the heavy reliance on the Structure from Motion (SFM) module, the applicable scenarios were limited, the degree of automation was low, it was difficult to meet the requirements of high speed, batch processing, and high frequency, and the accuracy and precision were low with severe distortion.
In recent years, with the iteration of sensor hardware technology, reconstruction techniques that fuse information from three-dimensional sensors such as Light Detection and Ranging (LiDAR) sensors and inertial measurement units (IMU) have developed rapidly, aiming to obtain denser and more accurate three-dimensional models.
This application provides a method and apparatus for reconstructing a three-dimensional model based on multi-source input. By acquiring mapping results including point cloud information and image information of the object to be reconstructed, a three-dimensional model is established. Furthermore, geometric residuals and optical residuals are determined through the point cloud information and image information, and the three-dimensional model is further adjusted to improve modeling efficiency, modeling accuracy, and modeling realism.
In a first aspect, this application provides a 3D model reconstruction method based on multi-source input. This method is applied to a 3D model reconstruction platform. Specifically, the method includes the following steps: obtaining the mapping result of the object to be reconstructed, where the mapping result includes image information of the object to be reconstructed and point cloud information of the object to be reconstructed. Further, a 3D model of the object to be reconstructed is established based on the mapping result. On this basis, target image information and target point cloud information of the object to be reconstructed under observation from a target pose are determined from the mapping result, where the target pose includes position information and attitude information; the 3D model is then instructed to generate a target depth image and a target digital image based on the target pose. Thus, the 3D model reconstruction platform determines a geometric residual based on the target point cloud information and the target depth image, determines an optical residual based on the target image information and the target digital image, and then adjusts the 3D model based on the geometric residual and the optical residual.
In the solution provided in this application, a 3D model is built by acquiring multi-source mapping results that include point cloud information and image information of the object to be reconstructed. This allows the built 3D model to output corresponding depth images and digital images for specific poses. By combining the target point cloud information and target image information from the mapping results, optical and geometric residuals are constructed, and the 3D model is then adjusted based on these residuals, which jointly supervise the training of the 3D model. By simultaneously employing geometric supervision and optical supervision, the 3D model is adjusted to possess both realistic optical information and realistic geometric information, so that the geometric accuracy and richness of detail features are significantly improved and the rendered output is highly realistic.
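As an illustrative aid only (the text above recites no source code), the joint residual-based adjustment can be sketched as a toy gradient-descent loop. The array names and the direct optimization of rendered outputs are hypothetical simplifications; in practice the residuals would drive updates to the 3D model's parameters rather than to the rendered images themselves.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical observations under the target pose: a depth map derived from
# the target point cloud information, and the target image information.
target_depth = rng.uniform(0.5, 5.0, size=(8, 8))
target_image = rng.uniform(0.0, 1.0, size=(8, 8, 3))

# Toy stand-ins for the model's outputs under the target pose, initialized
# imperfectly; a real system would re-render these from the model each step.
rendered_depth = target_depth + rng.normal(0.0, 0.5, size=(8, 8))
rendered_image = np.clip(target_image + rng.normal(0.0, 0.2, size=(8, 8, 3)), 0.0, 1.0)

lr = 0.1
for _ in range(200):
    geometric_residual = rendered_depth - target_depth   # depth image vs. point cloud
    optical_residual = rendered_image - target_image     # digital image vs. captured image
    # Gradient of the summed squared residuals with respect to the outputs.
    rendered_depth -= lr * 2.0 * geometric_residual
    rendered_image -= lr * 2.0 * optical_residual

print(float(np.abs(geometric_residual).max()), float(np.abs(optical_residual).max()))
```

Both residuals shrink toward zero, illustrating how geometric and optical supervision jointly drive the adjustment.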
In conjunction with the first aspect, in one possible implementation of the first aspect, the specific steps for obtaining the mapping result of the object to be reconstructed are as follows: obtain the initial image information and initial point cloud information of the object to be reconstructed, as well as the image inertial information corresponding to the acquisition device when collecting the initial image information and the point cloud inertial information corresponding to the acquisition device when collecting the initial point cloud information; then generate the mapping result based on the image information, the point cloud information, the point cloud inertial information, and the image inertial information, thereby obtaining the mapping result of the object to be reconstructed.
The solution provided in this application acquires multi-source input information such as image information, point cloud information, and corresponding inertial information, and fuses frame-by-frame point clouds, frame-by-frame images, and corresponding motion prior information to construct a map of the object to be reconstructed, achieving high-precision mapping. Furthermore, by introducing motion prior information such as the corresponding inertial information, along with the point cloud information, pose fitting can be performed more quickly, improving mapping efficiency.
In conjunction with the first aspect, in one possible implementation of the first aspect, the 3D model reconstruction platform determines descriptor information based on the initial image information, wherein the descriptor information includes the texture features of at least one frame of image in the initial image information, and then determines keyframe images based on the initial image information and the descriptor information.
In the solution provided in this application, the determination of keyframe images helps to filter out images that meet the requirements of specific descriptor information, for example, images with relatively rich texture features or detail. When adjusting the 3D model, constructing the optical residual from selected keyframe images can further improve the adjustment efficiency of the 3D model and the rendering quality of detailed features.
In conjunction with the first aspect, in one possible implementation of the first aspect, the 3D model reconstruction platform can determine the target pose for observing the keyframe image based on the keyframe image.
In the solution provided in this application, the target pose is determined based on keyframe images, enabling the construction of optical residuals between the digital image output by the 3D model and the keyframe images with relatively rich texture features or detail. Simultaneously, it also enables the construction of geometric residuals between the depth image output by the 3D model and the corresponding point cloud information with high accuracy. Therefore, using the pose obtained from the keyframe images as the basis for residual construction can further improve the efficiency and effectiveness of 3D model training iterations.
In conjunction with the first aspect, in one possible implementation of the first aspect, the 3D model reconstruction platform can further determine image pose information based on the image information and the image inertial information, and determine point cloud pose information based on the point cloud information and the point cloud inertial information. On this basis, generating a point cloud map based on the image information, the point cloud information, the point cloud inertial information, and the image inertial information specifically includes the following step: generating the point cloud map based on the image information, the point cloud information, the point cloud pose information, and the image pose information. Furthermore, the 3D model reconstruction platform can determine global pose information based on the point cloud pose information and the image pose information.
In the solution provided in this application, image information is fused with image inertial information, and point cloud information with point cloud inertial information, to obtain the corresponding image poses and point cloud poses. As a result, during mapping, the adjacency and matching relationships between the image information and the point cloud information can be determined from the corresponding image poses and point cloud poses, achieving accurate mapping. By establishing global pose information, rapid matching and conversion between image poses and point cloud poses can be achieved, allowing the 3D model to output a complete, accurate rendered image based on either a point cloud pose or an image pose.
In conjunction with the first aspect, in one possible implementation of the first aspect, the 3D model reconstruction platform performs loop closure detection on the mapping results to obtain matching information. The matching information is used to indicate the matching status between the current frame and historical frames, and the mapping results are adjusted according to the matching information.
In the solution provided in this application, due to the cumulative measurement errors of the acquisition device during long-term operation, especially when the acquisition device returns to a mapped area after a long period of operation, a misalignment may occur between the current sensor information and the already constructed mapping results. Therefore, loop closure detection is required during the mapping process to check whether the current frame can form a loop with historical information, construct a pose graph, adjust the mapping results, and improve mapping accuracy.
In conjunction with the first aspect, in one possible implementation of the first aspect, the 3D model reconstruction platform acquires supplementary image information, supplementary point cloud information, supplementary image inertial information, and supplementary laser inertial information of the object to be reconstructed, and updates the mapping results based on the supplementary image information, supplementary point cloud information, supplementary laser inertial information, and supplementary image inertial information.
In the solution provided in this application, by adopting multi-source input for mapping the object to be reconstructed, the mapping speed is improved, and the current mapping progress can be fed back to the user in real time. This allows the user to know in a timely manner whether any local areas have been missed or details are lacking within the object to be reconstructed during the mapping process, and to perform targeted supplementary scanning of the corresponding areas, thereby completing the construction of high-quality raw data for mapping.
In conjunction with the first aspect, in one possible implementation of the first aspect, the user can configure the adjustment of the 3D model. The 3D model reconstruction platform obtains the number of iterations input by the user and instructs the 3D model to be adjusted according to the number of iterations, where the number of iterations is used to indicate the number of adjustments to the 3D model. After the 3D model meets the number of iterations, the 3D model is output.
In the solution provided in this application, the user controls the iterative training process of the 3D model through configuration, so that after the 3D model meets the corresponding adjustment requirements, the adjusted 3D model is output for subsequent service use.
The second aspect or any implementation thereof is an apparatus implementation corresponding to the first aspect or any implementation thereof. The description in the first aspect or any implementation thereof applies to the second aspect or any implementation thereof, and will not be repeated here.
In a third aspect, this application provides a computing device cluster, including at least one computing device, each computing device including a processor and a memory; the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device, so that the computing device cluster performs the method described in the first aspect or any implementation of the first aspect.
In a fourth aspect, this application provides a computer program product containing instructions that, when executed by a cluster of computing devices, cause the cluster of computing devices to perform the method described in the first aspect or any implementation of the first aspect.
In a fifth aspect, this application provides a computer-readable storage medium including computer program instructions that, when executed by a cluster of computing devices, cause the cluster of computing devices to perform the method described in the first aspect or any implementation of the first aspect.
Figure 1 is a schematic diagram of an application scenario of the multi-source input-based three-dimensional model reconstruction method provided in an embodiment of this application;
Figure 2 is a schematic diagram of another application scenario of the multi-source input-based three-dimensional model reconstruction method provided in an embodiment of this application;
Figure 3 is a flowchart of the multi-source input-based three-dimensional model reconstruction method provided in an embodiment of this application;
Figure 4 is a schematic diagram of a mapping process of the multi-source input-based three-dimensional model reconstruction method provided in an embodiment of this application;
Figure 5 is a schematic diagram of another mapping process of the multi-source input-based three-dimensional model reconstruction method provided in an embodiment of this application;
Figure 6 is a schematic diagram of a keyframe determination process of the multi-source input-based three-dimensional model reconstruction method provided in an embodiment of this application;
Figure 7 is a schematic diagram of a three-dimensional model reconstruction process of the multi-source input-based three-dimensional model reconstruction method provided in an embodiment of this application;
Figure 8 is a schematic structural diagram of the apparatus provided in an embodiment of this application;
Figure 9 is a schematic structural diagram of the computing device for the multi-source input-based three-dimensional model reconstruction method provided in an embodiment of this application;
Figure 10 is a schematic structural diagram of a computing device cluster for the multi-source input-based three-dimensional model reconstruction method provided in an embodiment of this application;
Figure 11 is another schematic structural diagram of the computing device cluster for the multi-source input-based three-dimensional model reconstruction method provided in an embodiment of this application;
Figure 12 is another schematic structural diagram of the computing device cluster for the multi-source input-based three-dimensional model reconstruction method provided in an embodiment of this application.
The technical solutions in the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of this application.
In this document, the term "embodiment" means that a particular feature, structure, or characteristic described in connection with an embodiment may be included in at least one embodiment of this application. The appearance of this phrase in various places throughout the specification does not necessarily refer to the same embodiment, nor is it a separate or alternative embodiment mutually exclusive with other embodiments. It will be explicitly and implicitly understood by those skilled in the art that the embodiments described herein can be combined with other embodiments.
References to "one embodiment" or "some embodiments" as described in this specification mean that one or more embodiments of this application include a specific feature, structure, or characteristic described in connection with that embodiment. Therefore, the phrases "in one embodiment," "in some embodiments," "in other embodiments," "in still other embodiments," etc., appearing in different parts of this specification do not necessarily refer to the same embodiment, but rather mean "one or more, but not all, embodiments," unless otherwise specifically emphasized. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless otherwise specifically emphasized.
First, the terms appearing herein are explained as follows:
3D Model Reconstruction: the process of converting real-world objects or scenes into computer-processable 3D digital models using computer vision and graphics processing techniques.
Object to be reconstructed: the object or entity to be reconstructed using computer vision and graphics processing techniques. It can be a 3D space, a scene, an object-level model, a digital human, etc.
Depth Image: also known as a range image, a depth image is an image whose pixel values are the distances (depths) from the image acquisition device to points in the scene; it directly reflects the geometry of the visible surfaces of the scene. Depth images can be converted into point cloud data through coordinate transformation, and point cloud data with regular structure and the necessary information can also be converted back into depth image data.
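Since a depth image can be converted to point cloud data via coordinate transformation, a minimal back-projection sketch under a standard pinhole camera model may help; the intrinsic parameters fx, fy, cx, cy here are made-up values, not parameters from this application.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (in meters) into a camera-frame point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Example with a synthetic 4x4 constant-depth image and illustrative intrinsics.
pts = depth_to_points(np.full((4, 4), 2.0), fx=500.0, fy=500.0, cx=2.0, cy=2.0)
print(pts.shape)  # (16, 3)
```

The inverse direction, projecting a structured point cloud back onto the image plane, follows the same pinhole equations in reverse.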
Digital image: an image represented by a finite number of digital-valued pixels, represented by an array or matrix, in which the illumination positions and intensities are discrete. A digital image is obtained by digitizing an analog image, uses pixels as its basic element, and can be stored and processed by a digital computer or digital circuit. In one embodiment of this application, the digital image can be a binary image, a color image, etc., such as an RGB image or a panchromatic image.
Point Clouds: datasets of points in space that can represent three-dimensional shapes or objects, typically acquired by a 3D scanner. For example, the position of each point in a point cloud is described by a set of Cartesian coordinates (X, Y, Z), and some points may carry color information (R, G, B) or reflectance intensity information of the object surface. In one embodiment of this application, the point cloud information can be a point cloud directly acquired by a laser scanner (LiDAR), or a point cloud directly acquired by other sensors, such as acoustic radar (sonar).
3D model: a mathematical representation of a real or fictional object in three-dimensional space. This representation typically consists of a series of 3D coordinate points connected into lines and surfaces, forming a complete geometric structure that accurately reproduces the shape, structure, and appearance of the object, enabling scene reconstruction. Examples include 3D Gaussian Splatting (3DGS), a scene reconstruction and rendering technique based on 3D Gaussian primitives that combines the advantages of explicit and implicit representations, enables scene reconstruction and efficient real-time rendering from pure image input, and generates synthetic data from new viewpoints; it represents a new reconstruction paradigm in the current field of computer vision. Another example is the Neural Radiance Field (NeRF), an implicit 3D model reconstruction method based on deep neural networks that learns the radiance and color information of each point in the scene and can thus synthesize realistic images from any viewpoint. It constructs an implicit representation of the scene by sampling points in 3D space and predicting the radiance and color for each point.
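For reference (this formula is not recited in the application itself), the per-ray compositing commonly used by NeRF-style methods accumulates sampled densities and colors along a ray:

```latex
\hat{C}(\mathbf{r}) = \sum_{i=1}^{N} T_i \left(1 - e^{-\sigma_i \delta_i}\right) \mathbf{c}_i,
\qquad
T_i = \exp\!\Big(-\sum_{j=1}^{i-1} \sigma_j \delta_j\Big)
```

Here \(\sigma_i\) and \(\mathbf{c}_i\) are the density and color predicted at the i-th sample, \(\delta_i\) is the spacing between samples, and \(T_i\) is the accumulated transmittance, so nearer samples occlude farther ones; this is the sense in which sampling points and predicting radiance and color yields a rendered image.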
Rasterization: the core step of 3D Gaussian splatting rendering; a mathematical process that converts the mathematical description of an object and its associated color information into the pixels at the corresponding screen positions and the colors used to fill those pixels. In 3D Gaussian splatting, it refers to the mathematical process of rendering a 2D image from the distribution and color information of the Gaussian kernels.
Inertial Measurement Unit (IMU): a device that measures the three-axis attitude angles (or angular rates) and acceleration of an object.
Pose: the position and orientation of an object relative to a reference coordinate system. Specifically, a pose includes the object's spatial position information and its rotation direction information, and can be used to describe the position and orientation of any rigid body in three-dimensional space. Position typically refers to the coordinates of the object's center or a specific reference point in three-dimensional space; orientation typically refers to the object's rotation angle or rotation matrix, representing the object's rotation relative to the reference coordinate system.
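As a small illustration of this position-plus-orientation convention (a generic sketch, not a data structure defined by this application), a pose is often packed into a 4x4 homogeneous transform:

```python
import numpy as np

def make_pose(rotation, position):
    """Compose a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = position
    return T

def transform_points(T, points):
    """Apply pose T to an (N, 3) array of points."""
    homo = np.hstack([points, np.ones((points.shape[0], 1))])
    return (T @ homo.T).T[:, :3]

# 90-degree rotation about the z-axis plus a translation.
Rz = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
T = make_pose(Rz, np.array([1.0, 2.0, 0.0]))
print(transform_points(T, np.array([[1.0, 0.0, 0.0]])))  # [[1. 3. 0.]]
```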
Explicit representation: the traditional form of representation for 3D models, which explicitly models a scene or object and allows users to edit and view it; it includes meshes, point clouds, voxels, etc.
Implicit representation: based on machine learning methods such as deep neural networks, a parameterized form of representation that describes the three-dimensional information of a scene, constructing a mapping from three-dimensional spatial coordinates to the corresponding geometric/texture information.
Simultaneous Localization and Mapping (SLAM): a real-time explicit 3D model reconstruction method that aims to enable a robot to start from an unknown location in an unknown environment, locate its own position and orientation by repeatedly observing map features (such as corners, pillars, etc.) during movement, and then incrementally build a map based on its own position, thereby achieving simultaneous localization and map construction.
Structure from Motion (SFM): a non-real-time explicit 3D model reconstruction method that, given a series of images with a certain degree of overlap, simultaneously estimates the position and pose of the camera when each image was taken, as well as a sparse point cloud of the object or scene being photographed.
In the field of 3D reconstruction, using single-view or multi-view image information as input data to reconstruct the 3D model of the object to be reconstructed is a common method. However, due to limitations of the input data, accurate point cloud information is often estimated from view images; view fusion is time-consuming, inefficient, and has poor accuracy, and the reconstructed 3D model is incomplete, with severe distortion of depth information and a lack of realism. Moreover, the lack of accurate geometric verification information during the optimization process makes it difficult to optimize and adjust geometric depth information, further contributing to the low efficiency of 3D model reconstruction.
Based on this, this application provides a three-dimensional model reconstruction method based on multi-source input. By acquiring mapping results including point cloud information and image information of the object to be reconstructed, a three-dimensional model is established. Furthermore, geometric residuals and optical residuals are determined from the point cloud information and image information, and the three-dimensional model is further adjusted, improving modeling efficiency, modeling accuracy, and modeling realism.
Please refer to Figure 1, which is a schematic diagram of an application scenario of the multi-source input-based 3D model reconstruction method provided in an embodiment of this application. Specifically:
As one embodiment of this application, the 3D model reconstruction platform 12 can run on infrastructure 15, where infrastructure 15 contains at least one computing device used for computation during the 3D model reconstruction process. User A can directly access the 3D model reconstruction platform 12, control the 3D model reconstruction process through it, and input corresponding operation instructions during the process. The 3D model reconstruction platform 12 obtains the mapping result 16 of the object to be reconstructed and reconstructs a 3D model 17 based on the mapping result 16. The 3D model 17, as the output result, can be invoked by a subsequent digital asset platform or digital twin simulation platform. The digital asset platform is a platform that can uniformly manage and schedule the reconstructed 3D models and the various types of multimodal synthetic data derived from them. By digital asset format, these can be divided into original 3D models, color images, depth maps, color point clouds, semantic maps, individualized segmentation results, mesh models, and so on. The synthetic data can be accumulated and can also be directly applied to the training of embodied large models. In addition, the digital twin simulation platform can directly load dense 3D models and display highly realistic renderings of the 3D space to users in real time within the simulator interface; it supports real-time roaming in the simulation scene, and can also use the precise geometric information of the 3D model to realize downstream tasks such as ranging and navigation.
Please continue to refer to Figure 2, which is a schematic diagram of another application scenario of the multi-source input-based 3D model reconstruction method provided in an embodiment of this application. Specifically:
The 3D model reconstruction platform 12 can be deployed on infrastructure 14 to perform fusion mapping based on multi-source input and output the mapping result 16, and it can also be deployed on infrastructure 15 to reconstruct the 3D model 17 based on the mapping result 16 and output the 3D model 17. User A can directly access the 3D model reconstruction platform 12; in particular, while the platform is mapping the object to be reconstructed, the user can monitor the mapping result in real time and input corresponding instructions to control the mapping process. For example, infrastructure 14 is a computing device deployed on the user side, placed close to the object to be reconstructed 10 to enable nearby mapping and nearby computation; it also allows the user to view the mapping result 16 in real time, so that if a local area is missed or details are missing, the user can quickly perform supplementary collection to achieve high-precision acquisition. On this basis, infrastructure 15 is deployed on the cloud, and the cloud-provided infrastructure 15 supplies the computing power for the 3D model reconstruction platform 12 during the 3D model reconstruction process based on the mapping result 16, meeting the computational needs of 3D model reconstruction. Meanwhile, the digital asset platform and the digital twin simulation platform can invoke the 3D model 17 directly on the cloud, which is more convenient and improves usage efficiency. In another example, infrastructure 14 and infrastructure 15 can both be computing devices deployed on the user side, that is, the 3D model reconstruction platform 12 runs directly on user-side devices, giving the user full control over the entire fusion mapping and 3D model reconstruction process.
For the object to be reconstructed 10, the acquisition device 11 collects data from it to obtain multi-source input data. The acquisition device 11 can be any of various mapping devices such as robots, professional data acquisition equipment, and portable handheld devices, and can include the sensors required for high-precision mapping, such as a LiDAR, an IMU, and an RGB (RGB color mode) camera. Spatiotemporal alignment of the data from all sensors is achieved through methods such as extrinsic calibration between the LiDAR and the camera, calibration between the LiDAR and the IMU, and hardware-triggered timestamp synchronization of the sensors. The acquisition device can also include a computing unit (responsible for data acquisition and for deploying and running the high-precision mapping module), a storage unit (responsible for storing and managing multimodal datasets), and a network unit (responsible for high-speed data transmission). One or more of the above acquisition devices can collect data simultaneously in combination; this application does not limit this. Thus, when the acquisition device 11 collects the corresponding data, it obtains raw data such as the initial image information, initial point cloud information, and inertial information. To facilitate data processing when the 3D model reconstruction platform 12 performs mapping, a unified data structure and a unified application programming interface (API) can be used during acquisition to improve mapping efficiency.
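As a hedged sketch of the timestamp-synchronization idea (the function, threshold, and data layout are illustrative, not the mechanism of this application), nearest-timestamp association between camera and LiDAR streams on a common clock might look like:

```python
import numpy as np

def associate_by_timestamp(cam_ts, lidar_ts, max_dt=0.02):
    """For each camera timestamp, find the nearest LiDAR timestamp.

    Returns (camera index, LiDAR index) pairs whose time difference is within
    max_dt seconds. Timestamps are assumed to be on a common, already
    synchronized clock and sorted in ascending order.
    """
    lidar_ts = np.asarray(lidar_ts)
    idx = np.searchsorted(lidar_ts, cam_ts)
    pairs = []
    for i, t in enumerate(cam_ts):
        candidates = [j for j in (idx[i] - 1, idx[i]) if 0 <= j < len(lidar_ts)]
        j = min(candidates, key=lambda j: abs(lidar_ts[j] - t))
        if abs(lidar_ts[j] - t) <= max_dt:
            pairs.append((i, j))
    return pairs

print(associate_by_timestamp([0.00, 0.10, 0.25], [0.01, 0.11, 0.19, 0.26]))
# [(0, 0), (1, 1), (2, 3)]
```

Hardware-triggered synchronization, as described above, makes this association trivial; software association of this kind is a common fallback.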
To facilitate a clearer explanation of the embodiments of this application, the 3D Gaussian model will be used as the example of a three-dimensional model in the following embodiments. The 3D Gaussian model is one example of the three-dimensional models covered by this application; as the definition of a three-dimensional model indicates, many other models in the field of 3D model reconstruction can also achieve three-dimensional reconstruction of a scene, and this should not be construed as a specific limitation.
Please continue to refer to Figure 3, which is a flowchart of the multi-source input-based 3D model reconstruction method provided in an embodiment of this application. Specifically:
Step S201. Obtain the mapping result of the object to be reconstructed, where the mapping result includes image information of the object to be reconstructed and point cloud information of the object to be reconstructed.
The 3D model reconstruction platform 12 acquires the mapping result of the object to be reconstructed as the input for 3D model reconstruction. The mapping result can be provided by the user using different methods according to their business needs and the characteristics of the object to be reconstructed; this application does not limit this. Notably, the mapping result includes image information of the object to be reconstructed and point cloud information of the object to be reconstructed. The image information can be the initial image information of the object to be reconstructed collected by the acquisition device, or specific image information obtained after fusion and fitting, and can reflect information such as the color and shape of the object to be reconstructed. The point cloud information can be, for example, laser point cloud information, from which the depth information and geometric information of the object to be reconstructed can be precisely determined; compared with shapes determined by structured light or pure vision solutions, the embodiments of this application achieve higher precision and accuracy.
In one embodiment of this application, the mapping result can be generated by the 3D model reconstruction platform 12. For the specific steps of the acquisition in step S201, please refer to Figure 4, which is another flowchart of the multi-source input-based 3D model reconstruction method provided in an embodiment of this application. Specifically:
Step S301. Obtain the initial image information and initial point cloud information of the object to be reconstructed, as well as the image inertial information corresponding to the acquisition device when collecting the initial image information and the point cloud inertial information corresponding to the acquisition device when collecting the initial point cloud information.
The 3D model reconstruction platform 12 obtains, through the acquisition device 11, the initial image information and initial point cloud information of the object to be reconstructed, as well as the image inertial information corresponding to the acquisition device when collecting the initial image information and the point cloud inertial information corresponding to the acquisition device when collecting the initial point cloud information.
Specifically, please refer to Figure 5, which is a schematic diagram of a mapping process of the multi-source input-based 3D model reconstruction method provided in an embodiment of this application, wherein:
Step S3011. Obtain the initial point cloud information of the object to be reconstructed.
The 3D model reconstruction platform 12 obtains the initial point cloud information of the object to be reconstructed from the data reported by the acquisition device 11. For example, the initial point cloud information can be a set collected by the acquisition device 11, such as a LiDAR, along a certain acquisition path with the corresponding poses; the initial point cloud information includes frame-by-frame point cloud information.
Step S3012. Obtain the inertial information of the object to be reconstructed.
From the data reported by the acquisition device 11, the 3D model reconstruction platform 12 obtains the inertial information of the object to be reconstructed corresponding to the collection of the initial point cloud information, as well as the inertial information corresponding to the collection of the initial image information. The acquisition device 11 can integrate a LiDAR, an RGB camera, and an IMU simultaneously to achieve synchronized data acquisition and improve acquisition accuracy. The inertial information of the object to be reconstructed also serves as motion prior information for collecting the initial point cloud information and initial image information; the motion prior information can include angular velocity, acceleration, and other information.
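To make the role of the motion prior concrete, here is a deliberately naive dead-reckoning sketch; real pipelines use IMU preintegration on the rotation manifold, and the gravity-removed world-frame accelerations assumed here are a simplification, not the method of this application.

```python
import numpy as np

def integrate_imu(accel, gyro, dt, v0=np.zeros(3), p0=np.zeros(3)):
    """Naive IMU dead reckoning as a motion-prior sketch.

    accel: (N, 3) linear accelerations in the world frame (gravity removed);
    gyro is listed only to show the inputs an IMU provides and is unused here.
    Forward Euler is applied to velocity and then to position.
    """
    v, p = v0.copy(), p0.copy()
    positions = [p.copy()]
    for a in accel:
        v = v + a * dt
        p = p + v * dt
        positions.append(p.copy())
    return np.array(positions)

# Constant 1 m/s^2 acceleration along x for 10 steps of 0.1 s.
traj = integrate_imu(np.tile([1.0, 0.0, 0.0], (10, 1)), None, 0.1)
print(traj[-1])  # approximately [0.55, 0, 0]
```

A prior trajectory of this kind gives the mapping module a strong initial guess for pose fitting between consecutive frames.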
Step S3013. Obtain the initial image information of the object to be reconstructed.
The 3D model reconstruction platform 12 obtains the initial image information of the object to be reconstructed from the data reported by the acquisition device 11. The initial image information can be collected by the acquisition device 11, such as an RGB camera, and includes frame-by-frame images of the object to be reconstructed during the acquisition process.
Step S302. Generate the mapping result based on the initial image information, the initial point cloud information, the point cloud inertial information, and the image inertial information.
For example, the 3D model reconstruction platform 12 fuses the frame-by-frame laser point clouds with the motion prior information provided by the IMU. Furthermore, it fuses the image features with the motion prior information provided by the IMU, jointly estimating the pose states of the entire initial image information and initial point cloud information and generating the mapping result.
Step S3024. Generate point cloud pose information.
The 3D model reconstruction platform 12 fuses the frame-by-frame laser point clouds with the motion prior information provided by the IMU, matches the features of adjacent point clouds, and estimates the point cloud pose information in real time.
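Matching adjacent point clouds to estimate relative pose is often done with variants of iterative closest point (ICP). The following toy numpy version (brute-force correspondences, closed-form Kabsch alignment) is only a sketch of the idea, not the platform's algorithm:

```python
import numpy as np

def icp_step(src, dst):
    """One ICP iteration: match nearest neighbors, then solve R, t in closed form."""
    # Brute-force nearest-neighbor correspondences (fine for small toy clouds).
    d = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=-1)
    matched = dst[d.argmin(axis=1)]
    # Kabsch / Umeyama closed-form rigid alignment.
    mu_s, mu_d = src.mean(0), matched.mean(0)
    H = (src - mu_s).T @ (matched - mu_d)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_d - R @ mu_s
    return src @ R.T + t, R, t

rng = np.random.default_rng(1)
dst = rng.uniform(-1, 1, (50, 3))
src = dst + np.array([0.1, 0.0, 0.0])  # the "next frame": a translated copy
for _ in range(5):
    src, R, t = icp_step(src, dst)
print(np.abs(src - dst).max())  # small after a few iterations
```

The IMU motion prior described above supplies the initial alignment, which is what keeps such iterative matching from falling into wrong local minima.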
Step S3025. Determine descriptor information based on the initial image information.
The 3D model reconstruction platform 12 determines the descriptor information of each frame of image based on the initial image information reported by the acquisition device 11, where the descriptor information can be Perspective-n-Point (PnP) descriptor information.
Specifically, please refer to Figure 6, which is a schematic diagram of a keyframe determination process of the multi-source input-based 3D model reconstruction method provided in an embodiment of this application. The steps are as follows:
Step S3025. Determine descriptor information based on the initial image information.
By defining the descriptor information, the 3D model reconstruction platform 12 can evaluate the initial image information frame by frame and determine the descriptor information of each image. For example, the descriptor information is a PnP descriptor. Specifically, the PnP descriptor information can include: the PnP residual, an indicator of image pose reliability, where a high residual indicates unreliable pose accuracy; and the number of PnP point pairs, an indicator of image texture feature richness, where a low pair count indicates severe visual degradation. According to preset rules over these quantities, the 3D model reconstruction platform 12 evaluates the initial image information frame by frame. The preset rules can also be configured by the user to filter images that satisfy specific rules.
This step allows the input data to be filtered, in particular to select relevant image data that meets specific requirements, providing specific images for subsequent mapping and 3D model reconstruction.
Step S303. Determine keyframe images based on the initial image information and the descriptor information.
For example, given a preset rule over the descriptor information, such as a user-specified requirement that the PnP residual in the PnP descriptor be less than a first threshold and/or that the number of PnP point pairs be greater than a second threshold, the 3D model reconstruction platform 12 filters the initial image information frame by frame for images that satisfy the descriptor information. The filtered images, whose pose accuracy and image texture feature richness both meet the PnP descriptor requirements, are regarded as keyframe images.
In this example, filtering the initial image information retains keyframe images with rich visual features and accurate poses, and discards images with severe visual degradation and large pose errors. This allows the 3D model reconstruction platform 12 to exclude interfering images during mapping and improve mapping accuracy. In addition, keyframe images can be configured for verifying the optical residuals and geometric residuals during 3D model reconstruction, thereby improving the precision and accuracy of the 3D model reconstruction.
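A minimal sketch of the two-threshold keyframe filter just described; the dataclass, field names, and threshold values are all hypothetical placeholders:

```python
from dataclasses import dataclass

@dataclass
class FrameDescriptor:
    frame_id: int
    pnp_residual: float   # pose reliability: lower is better
    pnp_pairs: int        # texture richness: higher is better

def select_keyframes(frames, residual_max=2.0, pairs_min=80):
    """Keep frames whose PnP residual is below the first threshold and whose
    PnP point-pair count exceeds the second threshold; the concrete values
    here are illustrative only."""
    return [f for f in frames if f.pnp_residual < residual_max and f.pnp_pairs > pairs_min]

frames = [FrameDescriptor(0, 0.8, 150), FrameDescriptor(1, 3.5, 200), FrameDescriptor(2, 1.2, 40)]
print([f.frame_id for f in select_keyframes(frames)])  # [0]
```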
Step S3026. Generate image pose information.
The 3D model reconstruction platform 12 fuses image features with the motion prior information provided by the IMU, estimates the spatiotemporal displacement between two adjacent image frames, and outputs the image poses.
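For illustration, a basic first-order IMU integration that could supply such a motion prior between two frames is sketched below; the integration scheme and the gravity convention are assumptions of this sketch, not the platform's actual implementation.

```python
import numpy as np

def so3_exp(w: np.ndarray) -> np.ndarray:
    """Rodrigues' formula: map a rotation vector to a rotation matrix."""
    theta = np.linalg.norm(w)
    if theta < 1e-9:
        return np.eye(3)
    k = w / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def propagate_imu(R, p, v, gyro, accel, dt, g=np.array([0.0, 0.0, -9.81])):
    """Propagate rotation R, position p, and velocity v by one IMU sample
    (body-frame angular rate `gyro` and specific force `accel`) to predict
    the pose of the next image frame."""
    R_next = R @ so3_exp(gyro * dt)
    a_world = R @ accel + g              # rotate specific force, add gravity
    p_next = p + v * dt + 0.5 * a_world * dt ** 2
    v_next = v + a_world * dt
    return R_next, p_next, v_next
```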
Step S3027. The 3D model reconstruction platform 12 performs loop closure detection.
Because the IMU accumulates measurement error during long-term operation, misalignment can appear between the current sensor information and the already constructed point cloud map when the data acquisition device returns to a mapped area after running for a long time. Loop closure detection is therefore required during mapping: the platform checks whether the current frame can form a loop with historical information and constructs a pose graph accordingly. This improves the accuracy of the mapping data and ensures mapping precision.
For example, a robot moves along a corridor, enters a room, and then returns to the corridor. The robot continuously acquires images and records its pose (position and orientation). Each image is processed to extract key features, such as Oriented FAST and Rotated BRIEF (ORB) feature points, which can be represented using feature descriptors. As the robot moves, newly acquired images are matched against previously acquired images, using a feature matching algorithm to find similar feature points. If a sufficient number of matching feature points are found (e.g., exceeding a certain threshold), a potential loop closure is considered to exist, and for each loop closure candidate, the corresponding image and robot pose are recorded. A pose graph is constructed, in which nodes represent robot poses and edges represent relative transformations between adjacent poses; for each loop closure candidate, a constraint edge is added, indicating that the robot has returned to a previous position. The feature matching results of the loop closure candidates and the robot pose information are used to verify the authenticity of the loop closure, which can be further validated by checking the geometric consistency among the candidates, for example by using a Random Sample Consensus (RANSAC) algorithm to estimate the optimal relative pose. Once a loop closure is confirmed, the pose graph is updated to reflect the latest loop closure information, and a graph optimization algorithm is used to minimize the errors in the pose graph and maintain map consistency.
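The example above can be condensed into the following sketch; match_features and estimate_relative_pose_ransac stand in for an ORB feature matcher and a RANSAC-based relative pose estimator and are assumptions of this sketch, not interfaces defined by this application.

```python
MATCH_THRESHOLD = 30  # illustrative minimum number of matched feature points

def detect_loop_candidates(current, history, match_features):
    """Return (past_frame, matches) pairs that may close a loop with the
    current frame, judged by the number of matched feature points."""
    candidates = []
    for past in history:
        matches = match_features(current.descriptors, past.descriptors)
        if len(matches) >= MATCH_THRESHOLD:
            candidates.append((past, matches))
    return candidates

def add_loop_constraints(pose_graph, current, candidates,
                         estimate_relative_pose_ransac):
    """Geometrically verify each candidate (e.g., with RANSAC) and, if it
    passes, add a constraint edge between the two poses in the pose graph."""
    for past, matches in candidates:
        rel_pose, inlier_count = estimate_relative_pose_ransac(matches)
        if rel_pose is not None and inlier_count >= MATCH_THRESHOLD:
            pose_graph.add_edge(past.frame_id, current.frame_id, rel_pose)
```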
Step S3028. The 3D model reconstruction platform 12 performs backend optimization.
After loop closure detection is completed, the 3D model reconstruction platform 12 performs closed-loop optimization on the mapping results based on the matching information of the current loop closures, eliminating the mapping misalignment caused by sensor errors. The matching results are further optimized under the constraints of the point cloud information and the image information.
Step S3029. The 3D model reconstruction platform 12 generates the mapping results.
For example, after the above steps are completed, the 3D model reconstruction platform 12 can match the image pose information and point cloud pose information with the corresponding initial point cloud information and initial image information, fit them, and generate the corresponding mapping result. The mapping result includes image information and point cloud information: the image information can be a set of images drawn from the initial image information, the point cloud information can be a set of point clouds drawn from the initial point cloud information, and the image information includes the keyframe images. In another example of this application, where the 3D model reconstruction platform 12 determines no descriptor information and identifies no keyframe images, the image information in the mapping result is simply a subset of the initial image information.
It is worth noting that the above mapping process, steps S3011-S3029, is one embodiment of this application; any combination of steps and any execution order can fall within the protection scope of this application. In other embodiments, steps such as S3025, S3027, and S3028 can be additional steps through which the 3D model reconstruction platform 12 builds the map based on the initial image information, initial point cloud information, and corresponding initial inertial information.
By adopting multi-source input for mapping the object to be reconstructed, mapping efficiency can be greatly improved. The point cloud information provides more accurate geometric and depth information of the object to be reconstructed, while the introduced IMU inertial information quickly and accurately supplies motion priors for the acquisition device at the moments it acquires the initial point cloud information and initial image information. Pose matching and fitting in the subsequent mapping process are therefore more accurate, avoiding the problem of recovering only coarse geometric and inertial information from image information alone at great computational cost, and thus greatly improving mapping efficiency and mapping accuracy.
Building on this, in one embodiment of this application, the 3D model reconstruction platform 12 can also provide real-time mapping feedback to the user during the mapping process. When data omissions, missing images, or loss of detail occur, the user can promptly take remedial measures such as supplementary acquisition. Specifically:
Step S3023. The 3D model reconstruction platform 12 obtains a supplementary acquisition command.
For example, based on the supplementary acquisition command input by user A, the 3D model reconstruction platform 12 waits for the acquisition device 11 to report the supplementary acquisition data, including supplementary image information, supplementary point cloud information, supplementary image inertial information, and supplementary point cloud inertial information, and updates the mapping result based on this data. This improves mapping efficiency and ensures the completeness of the mapping result for the object to be reconstructed.
For example, the 3D model reconstruction platform 12 provides several interfaces that allow users to query relevant content on demand.
GetCameraInfo() CameraInfo: gets the camera parameters and status at the current moment.
Object definition: current image frame timestamp, image number, image intrinsic parameters, image pose information, keyframe flag, and image data.
GetLidarInfo() LidarInfo: gets the LiDAR parameters and status at the current moment.
Object definition: LiDAR type, including solid-state LiDAR and multi-line LiDAR; number of LiDAR lines, current LiDAR frame timestamp, LiDAR point cloud frame number, LiDAR pose information, whether a loop closure is detected at the current position, the LiDAR point cloud frame number associated with the loop closure, and the raw LiDAR point cloud data.
GetIMUInfo() IMUInfo: gets the IMU parameters and status at the current moment.
Object definition: IMU type, including six-axis and nine-axis; IMU timestamp and IMU data.
GetMapInfo() MapInfo: gets the global point cloud mapping result at the current moment.
Object definition: point cloud coordinates and point cloud color values.
Interfaces that rely on user input include:
An interface for setting the PnP descriptor thresholds used for automated keyframe extraction. Input parameters:
Residual: PnP residual threshold;
PairNum: threshold for the number of PnP point pairs.
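For illustration, a client-side sketch of setting these two thresholds is shown below; the document specifies only the input parameters Residual and PairNum, so the interface name, the stub class, and the example values are assumptions.

```python
class ReconstructionPlatform:
    """Minimal stand-in for the platform's user-input interface; only the
    parameter names (Residual, PairNum) come from the text above."""
    def __init__(self):
        self.residual_thresh = None
        self.pair_thresh = None

    def set_pnp_descriptor_thresholds(self, Residual: float, PairNum: int):
        self.residual_thresh = Residual
        self.pair_thresh = PairNum

platform = ReconstructionPlatform()
platform.set_pnp_descriptor_thresholds(Residual=2.0, PairNum=50)  # example values
```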
It is worth noting that the above mapping process is one embodiment of this application. It should be understood that, in the 3D model reconstruction process, the above mapping steps do not limit the mapping results of this application.
Step S202. Establish a three-dimensional model of the object to be reconstructed based on the mapping results.
In one embodiment of this application, taking a 3D Gaussian model as the three-dimensional model, the 3D model reconstruction platform 12 establishes the 3D Gaussian model based on the mapping results. A 3D Gaussian model typically includes the following five kinds of parameters:
Position: also known as the mean, representing the coordinates of the center of a 3D Gaussian kernel in three-dimensional space;
Covariance: represents the shape distribution of the 3D Gaussian kernel; the three column vectors of the covariance matrix represent the three principal axis directions of the Gaussian ellipsoid;
Scaling factor: represents the size of each 3D Gaussian kernel;
Opacity: represents the transparency information of the 3D Gaussian kernel; the higher the opacity, the closer the Gaussian kernel is to the object surface;
Lighting parameters: encode the lighting information under different viewpoints and reflect the light and shadow changes within the 3D scene.
For example, feature points or landmarks are selected from the mapping results; these can be fixed points in the environment, such as markings on walls or the corners of a room. On this basis, the mean vector μ of each feature point is determined, where μ is typically the estimated position of that feature point in the world coordinate system. Further, the covariance matrix Σ is determined, which describes the uncertainty of the feature point's position. For each selected feature point, a 3D Gaussian distribution is then constructed using the mean vector and covariance matrix determined above, completing the construction of the 3D Gaussian model.
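For illustration, the construction just described can be sketched as follows, with one isotropic Gaussian per mapped feature point; the default values and the sigma parameter are illustrative assumptions.

```python
import numpy as np
from dataclasses import dataclass, field

@dataclass
class Gaussian3D:
    """One 3D Gaussian kernel holding the five parameter groups listed above."""
    mean: np.ndarray                        # position (3,)
    covariance: np.ndarray                  # shape distribution (3, 3)
    scale: float = 1.0                      # scaling factor
    opacity: float = 0.5                    # transparency information
    sh_coeffs: np.ndarray = field(          # lighting (spherical harmonics)
        default_factory=lambda: np.zeros(3))

def init_gaussians(points, sigma: float = 0.05):
    """Build one Gaussian per selected feature point; sigma encodes the
    assumed positional uncertainty of each point."""
    return [Gaussian3D(mean=np.asarray(p, dtype=float),
                       covariance=(sigma ** 2) * np.eye(3))
            for p in points]
```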
Step S203. Determine the target image information and the target point cloud information.
Specifically, the target image information and target point cloud information of the object to be reconstructed under observation at the target pose are determined from the mapping results, where the target pose includes position information and attitude information. The target pose is the position and attitude information used for observing the target image information and target point cloud information in the mapping results; it can be input by user A, or it can be selected by the 3D model reconstruction platform 12 according to preset rules. For example, a random rule can be set, and the 3D model reconstruction platform 12 then determines the target pose according to that preset rule.
Step S2031. Determine the target pose based on a keyframe image.
In one embodiment of the present invention, the target pose can be determined based on the keyframe images determined by the 3D model reconstruction platform 12 during the mapping process. Specifically, when a keyframe image is determined from the descriptor information, it carries corresponding pose information, so the pose information of a keyframe image can be taken directly as the target pose. As mentioned above, keyframe images are selected through the descriptor information and therefore feature rich image texture and accurate poses. Determining the target pose from keyframe images thus improves the efficiency and effectiveness of the subsequent 3D model adjustment and training.
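For illustration, one way to realize this choice is sketched below, assuming each keyframe object carries a pose attribute; falling back to a random keyframe implements one possible preset rule.

```python
import random

def pick_target_pose(keyframes, user_pose=None, rng=random.Random(0)):
    """Use the user-supplied pose if given; otherwise draw the pose of a
    random keyframe as the target pose (one possible preset rule)."""
    if user_pose is not None:
        return user_pose
    return rng.choice(keyframes).pose
```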
Step S204. Instruct the 3D model to generate a target depth image and a target digital image based on the target pose.
The 3D model reconstruction platform 12 instructs the 3D Gaussian model to generate a target depth image and a target digital image based on the target pose. Taking the target digital image as an example: based on the target pose, the coordinates of the Gaussian model are transformed from the world coordinate system into the corresponding coordinate system; the position of the 3D Gaussian model in the target pose coordinate system is projected onto the 2D image plane; the portions that fall outside the image boundary are clipped; points are sampled on the image plane, and the probability of each sampled point under the 3D Gaussian model is computed; finally, the sampled probability values are converted into grayscale or color values to obtain the final image.
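For illustration, the world-to-camera transform, projection, clipping, and sampling steps can be sketched as follows; the 2D covariance cov2d would in practice be obtained by propagating the 3D covariance through the projection, and is taken as given here.

```python
import numpy as np

def project_gaussian_center(mean_w, R_cw, t_cw, K):
    """Transform a Gaussian center from world to camera coordinates and
    project it onto the image plane with pinhole intrinsics K."""
    p_c = R_cw @ mean_w + t_cw           # world -> camera coordinates
    if p_c[2] <= 0:                      # behind the camera: clipped
        return None, None
    uvw = K @ p_c
    uv = uvw[:2] / uvw[2]                # perspective division
    return uv, p_c[2]                    # pixel position and depth

def gaussian_weight_2d(uv_sample, uv_center, cov2d):
    """Probability-like weight of an image-plane sample point under the
    projected 2D Gaussian with covariance cov2d."""
    d = uv_sample - uv_center
    return float(np.exp(-0.5 * d @ np.linalg.solve(cov2d, d)))
```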
Step S205. Determine the geometric residual based on the target point cloud information and the target depth image.
The 3D model reconstruction platform 12 determines the geometric residual based on the target point cloud information of the object to be reconstructed observed at the target pose in the mapping results, and the target depth image generated by the 3D Gaussian model from the target pose. For example, the 3D model reconstruction platform 12 can determine, from the target pose, the target point cloud information corresponding to that pose, compare the target point cloud information with the target depth image, and then compute the distances between these points and the expected positions given by the Gaussian distributions to determine the geometric residual.
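For illustration, a minimal form of this comparison is sketched below: each target point is projected with the target pose and its depth is compared with the rendered depth at the pixel where it lands. The mean-absolute-error aggregation is an assumption of this sketch.

```python
import numpy as np

def geometric_residual(points_w, depth_img, R_cw, t_cw, K):
    """Mean absolute difference between the depth of each projected target
    point and the value of the target depth image at that pixel."""
    h, w = depth_img.shape
    errors = []
    for p in points_w:
        p_c = R_cw @ p + t_cw
        if p_c[2] <= 0:
            continue                      # point behind the camera
        uvw = K @ p_c
        u, v = int(uvw[0] / uvw[2]), int(uvw[1] / uvw[2])
        if 0 <= v < h and 0 <= u < w:
            errors.append(abs(depth_img[v, u] - p_c[2]))
    return float(np.mean(errors)) if errors else 0.0
```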
Step S206. Determine the optical residual based on the target image information and the target digital image.
The 3D model reconstruction platform 12 determines the optical residual based on the target image information of the object to be reconstructed observed at the target pose in the mapping results, and the target digital image generated by the 3D Gaussian model from the target pose. For example, the 3D model reconstruction platform 12 can determine, from the target pose, the target image information corresponding to that pose; if the target pose is the pose information of a keyframe image, then the target image information is that keyframe image. The residual can be defined as the difference between the actual pixel values and the rendered pixel values, and the optical residual is determined by comparing the target image information with the target digital image.
Step S207. Adjust the 3D model based on the geometric residual and the optical residual.
Further, the 3D model reconstruction platform 12 adjusts the 3D Gaussian model based on the optical and geometric residuals obtained in the above steps. For example, a nonlinear optimization algorithm, such as gradient descent, is used to minimize the geometric residual and/or the optical residual while updating the parameters of each Gaussian distribution, until convergence or until a maximum number of iterations is reached.
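For illustration, one adjustment iteration combining both residuals with gradient descent is sketched below; render is assumed to be a differentiable rasterizer returning an image and a depth map, and the L1 losses and residual weighting are assumptions of this sketch.

```python
import torch

def train_step(optimizer, render, gaussian_params,
               target_img, target_depth, lambda_geo=0.5):
    """One iteration: render from the target pose, build the optical (L1
    photometric) and geometric (L1 depth) residuals, and take a gradient
    step on the Gaussian parameters."""
    optimizer.zero_grad()
    rendered_img, rendered_depth = render(gaussian_params)
    optical = torch.abs(rendered_img - target_img).mean()
    geometric = torch.abs(rendered_depth - target_depth).mean()
    loss = optical + lambda_geo * geometric
    loss.backward()
    optimizer.step()
    return loss.item()
```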
In this way, the image information and point cloud information included in the mapping results generated from multi-source input are compared against the target digital image and target depth image generated from the target pose to construct the optical residual and geometric residual, respectively, and both residuals are introduced together to adjust the 3D model. In particular, the geometric residual introduced through the point cloud information supervises the training of the 3D model, so that after adjustment the 3D model conforms to real scale information and achieves higher geometric accuracy and richer detail features.
Step S208. Perform anti-aliasing computation based on the target pose.
For example, during rendering of the 3D Gaussian model, rendering anomalies occur when the resolution of the user-specified rendering viewpoint differs severely from that of the training viewpoints. This is because the opacity parameters trained at the original resolution are valid only at that resolution; when the user performs a substantial zoom operation, the opacity parameter error causes abnormal occlusion within the image plane. Based on this, in one embodiment of this application, an opacity compensation coefficient ρ is computed at the current pose from the covariance matrix Σ and the identity matrix I, and is used during rendering of the 3D Gaussian model to compensate the opacity of the target digital image. By compensating the opacity parameters for the current resolution, the rasterization framework can correctly render texture details at different resolutions.
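The exact expression for ρ is not reproduced in the text above; the sketch below assumes a form used in anti-aliased Gaussian splatting, in which the opacity is scaled by the ratio of determinants of the projected covariance before and after dilation by the identity matrix.

```python
import numpy as np

def opacity_compensation(cov2d: np.ndarray) -> float:
    """Assumed form of the compensation coefficient rho, computed from the
    projected 2D covariance and the identity matrix; the exact formula of
    this application is not reproduced here."""
    return float(np.sqrt(np.linalg.det(cov2d) /
                         np.linalg.det(cov2d + np.eye(2))))

def compensated_opacity(opacity: float, cov2d: np.ndarray) -> float:
    """Scale the trained opacity for the current rendering resolution."""
    return opacity * opacity_compensation(cov2d)
```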
Step S209. The 3D model reconstruction platform 12 instructs the 3D model to adjust poses during the rendering process.
In one embodiment of this application, the rasterization rendering function is configured to be differentiable with respect to the image poses, so that the gradient information of the geometric and optical residuals supervises the image poses. When an image pose drifts beyond a certain threshold, the image pose is adjusted to reduce the error. As a result, the ghosting that appears in 3D Gaussian model reconstruction results is reduced, and rendering quality is further improved.
The 3D model reconstruction platform 12 can provide users with corresponding interfaces for configuring and controlling the 3D model reconstruction process, for example:
GetGaussMapInfo() GaussMapInfo: gets the 3D Gaussian model parameters.
Object definition: Gaussian kernel center position, covariance, scaling factor, opacity, and spherical harmonic parameters.
GetTrainingInfo() TrainingInfo: gets the current status and feedback of the 3D Gaussian model training task.
Object definition: iteration step count, optical residual, geometric residual, peak signal-to-noise ratio (used to evaluate image rendering quality), and structural similarity (used to evaluate the similarity between the rendered image and the ground truth).
GetCameraOptimizationInfo() CameraOptimizationInfo: gets the image pose optimization results.
Object definition: keyframe number, keyframe timestamp, original pose, optimized image pose, and image pose correction magnitude.
Interfaces that rely on user input include:
SetConfig(Params, UseGeoLoss, UseCameraOpt, UseAntiAliasing): sets the training parameters of the 3D Gaussian reconstruction module. Input parameters:
Params: basic parameter configuration, including the number of training iterations, the optimizer learning rate, and so on; the reconstruction framework provides recommended settings according to the scale of the reconstructed scene, for the user's reference;
UseGeoLoss: whether to use geometric residual supervision;
UseCameraOpt: whether to enable joint optimization of the camera extrinsic parameters;
UseAntiAliasing: whether to enable anti-aliasing computation.
Based on this, in one embodiment of this application and with reference to Figure 1, user A can configure through the 3D model reconstruction platform 12 whether to enable anti-aliasing computation, the number of training iterations, and so on. For example, user A inputs an iteration count; the 3D model reconstruction platform 12 uses this count as the number of adjustment rounds for the 3D model, and once the 3D model has been iterated that many times based on the geometric and optical residuals, the adjusted 3D model is output.
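For illustration, an invocation of the SetConfig interface listed above is sketched below; the stub class and the example values inside Params are assumptions, not recommended settings of this application.

```python
class GaussianReconstructionModule:
    """Minimal stub exposing SetConfig with the signature listed above."""
    def SetConfig(self, Params, UseGeoLoss, UseCameraOpt, UseAntiAliasing):
        self.params = Params
        self.use_geo_loss = UseGeoLoss            # geometric residual supervision
        self.use_camera_opt = UseCameraOpt        # joint extrinsic optimization
        self.use_anti_aliasing = UseAntiAliasing  # opacity compensation (step S208)

module = GaussianReconstructionModule()
module.SetConfig(
    Params={"iterations": 30000, "learning_rate": 1.6e-4},  # example values
    UseGeoLoss=True,
    UseCameraOpt=True,
    UseAntiAliasing=True,
)
```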
An embodiment of this application further provides a 3D model reconstruction platform 12. Refer to Figure 8 below, which is a structural schematic diagram of a 3D model reconstruction platform provided in an embodiment of this application. It specifically includes the following modules:
Acquisition module 901, used to acquire the mapping result of the object to be reconstructed, where the mapping result includes image information and point cloud information of the object to be reconstructed;
Establishment module 903, used to establish a three-dimensional model of the object to be reconstructed based on the mapping result;
Determination module 902, used to determine, from the mapping result, the target image information and target point cloud information of the object to be reconstructed under observation at the target pose, where the target pose includes position information and attitude information;
Instruction module 904, used to instruct the three-dimensional model to generate a target depth image and a target digital image based on the target pose;
The determination module 902 is further used to determine the geometric residual based on the target point cloud information and the target depth image, and to determine the optical residual based on the target image information and the target digital image;
Adjustment module 905, used to adjust the three-dimensional model based on the geometric residual and the optical residual.
In yet another embodiment of this application, the 3D model reconstruction platform 12 may further include:
Generation module 906, used to generate the mapping result based on the initial image information, initial point cloud information, point cloud inertial information, and image inertial information;
Update module 907, used to update the mapping result based on the supplementary image information, supplementary point cloud information, supplementary laser inertial information, and supplementary image inertial information;
Output module 908, used to output the three-dimensional model once the configured number of iterations has been reached.
It is worth noting that the user in the above embodiments can be replaced by multiple other users, and each of the above modules can implement the corresponding technical functions; the embodiments of this application place no limitation on this.
The embodiments of this application are illustrated with the acquisition module 901, determination module 902, establishment module 903, instruction module 904, and adjustment module 905 as examples; similarly, the implementations of the generation module 906, update module 907, and output module 908 can refer to the implementations of the aforementioned modules.
Specifically, the acquisition module 901, determination module 902, establishment module 903, instruction module 904, and adjustment module 905 can each be implemented in software or in hardware. As an example, the implementation of the acquisition module 901 is described below; similarly, the implementations of the determination module 902, establishment module 903, instruction module 904, and adjustment module 905 can refer to that of the acquisition module 901.
As an example of a module as a software functional unit, the acquisition module 901 may include code running on a computing instance, where the computing instance may include at least one of a physical host (computing device), a virtual machine, and a container, and there may be one or more such computing instances. For example, the acquisition module 901 may include code running on multiple hosts/virtual machines/containers. It should be noted that the multiple hosts/virtual machines/containers used to run the code may be distributed in the same region or in different regions. Further, they may be distributed in the same availability zone or in different availability zones, where each availability zone includes one data center or multiple geographically proximate data centers, and one region typically includes multiple availability zones.
As an example of a module as a hardware functional unit, the acquisition module 901 may include at least one computing device, such as a server. Alternatively, the acquisition module 901 may be a device implemented using an application-specific integrated circuit (ASIC) or a programmable logic device (PLD), where the PLD may be implemented as a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
It should be noted that, in other embodiments, each of the acquisition module 901, determination module 902, establishment module 903, instruction module 904, and adjustment module 905 can be used to execute any step of the multi-source-input-based 3D model reconstruction method, and the steps each module is responsible for can be specified as needed. The full functionality of the 3D model reconstruction platform is realized by having the acquisition module 901, determination module 902, establishment module 903, instruction module 904, and adjustment module 905 implement different steps of the method respectively.
The methods, 3D model reconstruction platform, and system of the embodiments of this application have been described in detail above. To facilitate better implementation of these solutions, related devices for cooperating in their implementation are provided below.
This application provides a computing device. Refer to Figure 9 below, which is a structural schematic diagram of a computing device for the multi-source-input-based 3D model reconstruction method provided in an embodiment of this application. The computing device 900 includes a bus 911, a processor 912, a memory 910, and a communication interface 909, and the processor 912, memory 910, and communication interface 909 communicate via the bus 911. The computing device 900 can be a server or a terminal device. It should be understood that this application does not limit the number of processors or memories in the computing device 900.
The bus 911 can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, among others. Buses can be categorized as address buses, data buses, control buses, and so on. For ease of illustration, only one line is used in Figure 9, but this does not mean that there is only one bus or one type of bus. The bus 911 can include pathways for transmitting information between the components of the computing device 900 (e.g., the memory 910, the processor 912, and the communication interface 909).
The processor 912 may include any one or more of a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), or a digital signal processor (DSP).
The memory 910 may include volatile memory, such as random access memory (RAM). The memory 910 may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
The memory 910 stores executable program code, and the processor 912 executes this code to implement the functions of the acquisition module 901, determination module 902, establishment module 903, instruction module 904, and adjustment module 905, thereby realizing the multi-source-input-based 3D model reconstruction method. That is, the memory 910 stores the instructions by which the 3D model reconstruction platform executes the multi-source-input-based 3D model reconstruction method.
The communication interface 909 uses a transceiver module, such as, but not limited to, a network interface card or transceiver, to enable communication between the computing device 900 and other devices or communication networks.
An embodiment of this application further provides a computing device cluster. The computing device cluster includes at least one computing device, which can be a server, such as a central server, an edge server, or a local server in a local data center. In some embodiments, the computing device can also be a terminal device such as a desktop computer, a laptop computer, or a smartphone.
Refer to Figure 10 below, which is a structural schematic diagram of a computing device cluster for the multi-source-input-based 3D model reconstruction method according to an embodiment of this application. As shown in Figure 10, the computing device cluster includes at least one computing device 900, and the memories 910 of one or more computing devices 900 in the cluster may store the same instructions by which the 3D model reconstruction platform executes the multi-source-input-based 3D model reconstruction method.
In some possible implementations, one or more computing devices 900 in the computing device cluster can also be used to execute part of the instructions by which the 3D model reconstruction platform executes the multi-source-input-based 3D model reconstruction method. In other words, a combination of one or more computing devices 900 can jointly execute those instructions.
It should be noted that the memories 910 of different computing devices 900 within the cluster can store different instructions for executing part of the functions of the 3D model reconstruction platform; that is, the instructions stored in the memories 910 of different computing devices 900 can implement the functions of one or more of the acquisition module 901, determination module 902, establishment module 903, instruction module 904, and adjustment module 905.
In some possible implementations, the memories 910 of one or more computing devices 900 in the cluster may also each store part of the instructions for executing the multi-source-input-based 3D model reconstruction method. In other words, a combination of one or more computing devices 900 can jointly execute the instructions for executing the multi-source-input-based 3D model reconstruction method.
Refer to Figure 11 below, which is another structural schematic diagram of a computing device cluster for the multi-source-input-based 3D model reconstruction method according to an embodiment of this application. As shown in Figure 11, two computing devices 900A and 900B are connected through the communication interface 909. The memory in computing device 900A stores instructions for executing the functions of the determination module 902, establishment module 903, and adjustment module 905, while the memory in computing device 900B stores instructions for executing the functions of the acquisition module 901 and instruction module 904. In other words, the memories 910 of computing devices 900A and 900B jointly store the instructions by which the 3D model reconstruction platform executes the multi-source-input-based 3D model reconstruction method.
The connection arrangement shown in Figure 11 reflects the fact that the multi-source-input-based 3D model reconstruction method provided in this application requires large volumes of data to be transmitted to the acquisition module 901. Considering this data transmission volume, and to avoid overloading computing device 900A, the corresponding functions are delegated to computing device 900B.
It should be understood that the functions of computing device 900A shown in Figure 11 can also be performed by multiple computing devices 900; likewise, the functions of computing device 900B can also be performed by multiple computing devices 900.
Refer to Figure 12 below, which is yet another structural schematic diagram of a computing device cluster for the multi-source-input-based 3D model reconstruction method according to an embodiment of this application. In some possible implementations, one or more computing devices in the cluster can be connected via a network, which can be a wide area network, a local area network, or the like. Figure 12 shows one possible implementation: as illustrated, two computing devices 900C and 900D are connected via the network, specifically through the communication interface in each computing device. In this type of implementation, the memory 910 in computing device 900C stores instructions for executing the determination module 902, establishment module 903, and adjustment module 905, while the memory 910 in computing device 900D stores instructions for executing the acquisition module 901 and instruction module 904.
The connection arrangement shown in Figure 12 reflects the fact that the multi-source-input-based 3D model reconstruction method provided in this application requires large volumes of data transmission over a network connection and that these functions are executed relatively independently; to achieve the best storage and computing performance, the data transmission functions are delegated to computing device 900D.
It should be understood that the functions of computing device 900C shown in Figure 12 can also be performed by multiple computing devices 900; likewise, the functions of computing device 900D can also be performed by multiple computing devices 900.
An embodiment of this application further provides a computer program product containing instructions. The computer program product may be software or a program product containing instructions that can run on a computing device or be stored on any usable medium. When the computer program product runs on at least one computing device, it causes the at least one computing device to execute the above multi-source-input-based 3D model reconstruction method applied to the 3D model reconstruction platform.
An embodiment of this application further provides a computer-readable storage medium. The computer-readable storage medium can be any usable medium that a computing device can store, or a data storage device such as a data center containing one or more usable media. The usable medium can be a magnetic medium (e.g., a floppy disk, hard disk, or magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid state drive). The computer-readable storage medium contains instructions that instruct a computing device to execute the above multi-source-input-based 3D model reconstruction method applied to the 3D model reconstruction platform.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent substitutions for some of the technical features; such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the protection scope of the technical solutions of the embodiments of this application.
Those skilled in the art will clearly understand that, for the specific working processes of the system, 3D model reconstruction platform, or units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
The above is merely a specific embodiment of this application, but the protection scope of this application is not limited thereto. Any variation or substitution readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.