WO2025038935A1 - Multi-dimensional model generation for aesthetic design of an environment - Google Patents
Multi-dimensional model generation for aesthetic design of an environment
- Publication number
- WO2025038935A1 (PCT/US2024/042687, US2024042687W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- model
- objects
- designer
- environment
- user
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/01—Customer relationship services
- G06Q30/015—Providing customer assistance, e.g. assisting a customer within a business location or via helpdesk
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/018—Certifying business or products
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0603—Catalogue creation or management
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0621—Electronic shopping [e-shopping] by configuring or customising goods or services
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0623—Electronic shopping [e-shopping] by investigating goods or services
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0641—Electronic shopping [e-shopping] utilising user interfaces specially adapted for shopping
- G06Q30/0643—Electronic shopping [e-shopping] utilising user interfaces specially adapted for shopping graphically representing goods, e.g. 3D product representation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating 3D models or images for computer graphics
- G06T19/20—Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2210/00—Indexing scheme for image generation or computer graphics
- G06T2210/04—Architectural design, interior design
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2219/00—Indexing scheme for manipulating 3D models or images for computer graphics
- G06T2219/20—Indexing scheme for editing of 3D models
- G06T2219/2008—Assembling, disassembling
Definitions
- Various aspects of the present disclosure relate generally to systems and methods for generating multi-dimensional representations of physical spaces and objects within those spaces.
- the present disclosure is directed to overcoming one or more of these above-referenced challenges.
- systems, methods, and computer readable memory are disclosed for generating multi-dimensional models representing physical environments and objects within said environments.
- environment data may be received depicting an environment that contains one or more objects.
- the environment data may include a multi-dimensional visualization of the environment.
- the image processing model may be configured to do one or more of the following: generate a 3D model of the environment that includes a plurality of 3D representations of the one or more objects and a 3D representation of the environment, where the 3D representations of the one or more objects are independently manipulable relative to the 3D representation of the environment; determine a plurality of parameters for one or more of the 3D representations of the one or more objects based on photogrammetry analysis; and for one or more of the 3D representations of the one or more objects, associate the respective 3D representation of the respective object with a label indicating an object type and one or more of the plurality of parameters determined to be associated with the object.
- the label and the plurality of parameters may be used to query a database to find a matching 3D representation of the object.
- the model may output the labels and parameters for the 3D representations of the one or more objects.
- environment data may be received depicting an environment that contains a plurality of objects.
- the environment data may include a multi-dimensional visualization of the environment.
- the plurality of objects may include one or more structure objects and one or more removable objects.
- the image processing model may be configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations. Based on the tags associated with each of the 3D representations, it may be determined which of the 3D representations are associated with structure objects and which of the 3D representations are associated with removable objects.
- a rendering of the environment may be generated in which one or more portions associated with the removable objects are excised and replaced with a filler portion that is generated based on an inference of one or more characteristics associated with material surrounding the removable object.
- a plurality of two-dimensional (2D) representations of at least one object may be received.
- a 3D representation may be generated for the object.
- One or more tags may be generated for the 3D representation of the object, where the tags may include a label indicating an object type and a plurality of parameters indicating an assessed characteristic of the object.
- Matching analysis may be performed comparing the one or more tags against tags for a plurality of previously registered objects.
- a confidence score may be generated indicating a confidence that a previously registered object is a match for the object. Based on the matching analysis and the confidence score, the previously registered object may be selected for display to a user.
- environment data may be received depicting an environment that contains one or more objects
- the environment data may include a multi-dimensional visualization of the environment.
- the image processing model may be configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations of the objects.
- the 3D model of the environment, or a derivative thereof, may be inputted to an object placement model.
- the object placement model may be a generative artificial intelligence model that has been trained based on a plurality of 2D floorplan representations for sample spaces showing at least spatial dimensions of the respective sample space and/or human-selected object placements within the respective space.
- a 2D floorplan representation may be generated showing a proposed arrangement of the plurality of objects within the environment.
- the 3D model of the environment may be modified by placing the 3D representations of the plurality of objects within the 3D model of the environment in accordance with the 2D floorplan representation generated by the object placement model.
- one or more images of a furnishing may be received.
- the one or more images, or derivatives thereof, may be inputted to a material recognition machine learning model.
- the material recognition machine learning model may be trained to generate an assessment of a surface material of a furnishing by analyzing a volume of sample images coded with known surface materials. Using the machine learning model, an assessment of a surface material of the furnishing shown in the one or more images may be generated.
- a system for generating a multi-dimensional representation of a physical space based, at least in part, on a designer model may be provided herein.
- Environment data depicting a physical space may be received.
- the physical space may contain a plurality of objects.
- the environment data may comprise a multi-dimensional representation of an environment.
- the environment data may be processed using a first processing model.
- the first processing model may be configured to output a 3D model of the physical space that includes 3D representations of the plurality of respective objects.
- a design plan for the physical space may be generated using a designer model. In some embodiments, the design plan may be generated based on the 3D model of the physical space, or a derivative thereof.
- the design plan may include a proposed arrangement of a plurality of objects within the physical space.
- the proposed arrangement may comprise a plurality of object indicators.
- each object indicator may indicate a position within the physical space at which a respective object is proposed to be located.
- a recommended object may be selected to be located at the position indicated by the respective object indicator.
- a recommended object may be selected, from a repository of available objects, to be located at the position indicated by the respective object indicator.
- a plurality of recommended objects associated with respective positions indicated by the respective object indicators may be generated.
- the 3D model may be populated with the plurality of recommended objects.
- the 3D model may be populated with the plurality of recommended objects located at the respective positions indicated by the respective object indicators.
- FIG. 1 shows an exemplary system architecture for generating 3D models representing an environment.
- FIG. 2 shows an exemplary system protocol by which one or more multidimensional assets may be stored in a system data storage.
- FIG. 3 shows an exemplary process for determining whether input content depicts an object or a physical space.
- FIG. 4 shows an exemplary process for identifying a 3D representation of an object that matches input data received from a user.
- FIG. 5 shows an object segmentation model for segmenting objects within an environment.
- FIG. 6 shows an exemplary pipeline for processing images and generating 3D representations of environments containing one or more objects.
- FIG. 7 shows an exemplary input and output from an object segmentation model.
- FIG. 8 shows an exemplary method for segmenting objects within a 3D representation of an environment.
- FIG. 9 shows an exemplary system for removing objects from a 3D representation of an environment.
- FIG. 10 shows an exemplary method for displaying an environment in which objects have been optionally removed based on user inputs.
- FIG. 11 shows an exemplary method for generating a rendering of an environment in which objects have been removed.
- FIG. 12 shows an exemplary method for retrieving a 3D representation representing an object that matches an object indicated by a user.
- FIG. 13 shows an example of a model retrieval engine configured to retrieve a 3D representation of an object that matches an object indicated by a user.
- FIG. 14 shows an exemplary method for determining a model that matches an object indicated by a user.
- FIG. 15 shows an exemplary process for segmenting input data and generating space plans using an object placement model.
- FIG. 16 shows an exemplary method for outputting a 3D model with an arrangement of 3D representations indicated by a user.
- FIG. 17 shows an exemplary generative AI object placement model trained to output a 3D model with a proposed arrangement of 3D representations associated with objects within a space.
- FIG. 18 shows an exemplary method for generating a 2D floorplan with a proposed arrangement of objects.
- FIG. 19 shows an example of a material recognition model trained to output an assessment of a surface material of an object shown in an image or multi-dimensional representation.
- FIG. 20 shows an exemplary method for matching objects based, at least in part, on an assessment of a surface material of a furnishing.
- FIG. 21 shows an exemplary system architecture for obtaining and managing content to be used for training a designer model to generate a multi-dimensional representation of a physical space.
- FIG. 22 depicts an exemplary training protocol to train the floorplan generator of the designer model to output a design plan.
- FIG. 23 depicts an exemplary flow diagram of an object recommendation engine configured to generate recommended objects to be located at the position indicated by respective object indicators.
- FIG. 24 depicts an exemplary method of generating a multi-dimensional representation of a physical space in response to user instructions inputted via a GUI.
- FIG. 25 depicts an exemplary method of generating a multi-dimensional representation of a physical space based on a designer model.
- FIG. 26 shows an exemplary processing system that may execute techniques presented herein.
- the present disclosure is directed to methods and systems for generating and modifying multi-dimensional models representing physical environments and objects within said environments.
- environment data may be received depicting an environment that contains one or more objects.
- the environment data may include a multi-dimensional visualization of the environment.
- the image processing model may be configured to do one or more of the following: generate a 3D model of the environment that includes a plurality of 3D representations of the one or more objects and a 3D representation of the environment, where the 3D representations of the one or more objects are independently manipulable relative to the 3D representation of the environment; determine a plurality of parameters for one or more of the 3D representations of the one or more objects based on photogrammetry analysis; and for one or more of the 3D representations of the one or more objects, associate the respective 3D representation of the respective object with a label indicating an object type and one or more of the plurality of parameters determined to be associated with the object.
- the label and the plurality of parameters may be used to query a database to find a matching 3D representation of the object.
- the model may output the labels and parameters for the 3D representations of the one or more objects.
- environment data may be received depicting an environment that contains a plurality of objects.
- the environment data may include a multi-dimensional visualization of the environment.
- the plurality of objects may include one or more structure objects and one or more removable objects.
- the image processing model may be configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations. Based on the tags associated with each of the 3D representations, the system may determine which of the 3D representations are associated with structure objects and which of the 3D representations are associated with removable objects.
- a rendering of the environment may be generated in which one or more portions associated with the removable objects are excised and replaced with a filler portion that is generated based on an inference of one or more characteristics associated with material surrounding the removable object.
- a plurality of two-dimensional (2D) representations of at least one object may be received. Based on the plurality of 2D representations, a 3D representation may be generated for the object. One or more tags may be generated for the 3D representation of the object, where the tags may include a label indicating an object type and a plurality of parameters indicating an assessed characteristic of the object. Matching analysis may be performed comparing the one or more tags against tags for a plurality of previously registered objects. A confidence score may be generated indicating a confidence that a previously registered object is a match for the object. Based on the matching analysis and the confidence score, the previously registered object may be selected for display to a user.
- environment data may be received depicting an environment that contains one or more objects
- the environment data may include a multi-dimensional visualization of the environment.
- the image processing model may be configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations of the objects.
- the 3D model of the environment, or a derivative thereof, may be inputted to an object placement model.
- the object placement model may be a generative artificial intelligence model that has been trained based on a plurality of 2D floorplan representations for sample spaces showing at least spatial dimensions of the respective sample space and/or human-selected object placements within the respective space.
- a 2D floorplan representation may be generated showing a proposed arrangement of the plurality of objects within the environment.
- the 3D model of the environment may be modified by placing the 3D representations of the plurality of objects within the 3D model of the environment in accordance with the 2D floorplan representation generated by the object placement model.
- one or more images of a furnishing may be received.
- the one or more images, or derivatives thereof, may be inputted to a material recognition machine learning model.
- the material recognition machine learning model may be trained to generate an assessment of a surface material of a furnishing by analyzing a volume of sample images coded with known surface materials.
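- For illustration only, the following is a minimal sketch of how such a material recognition model might be trained; the architecture, the material label names, and the helper functions are assumptions made for this sketch and are not part of the disclosure, which does not prescribe a specific model or framework.

```python
# Hypothetical sketch of a material-recognition classifier trained on sample
# images coded with known surface materials. Architecture and labels are
# illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

MATERIAL_LABELS = ["leather", "fabric", "wood", "metal", "glass"]  # illustrative

class MaterialRecognitionModel(nn.Module):
    def __init__(self, num_materials: int = len(MATERIAL_LABELS)):
        super().__init__()
        # Small convolutional backbone followed by a linear classification head.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, num_materials)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(images))

def train_step(model, images, material_ids, optimizer) -> float:
    """One supervised step over images coded with known surface materials."""
    loss = F.cross_entropy(model(images), material_ids)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```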
- FIG. 1 shows an exemplary system architecture for generating 3D models representing an environment.
- the system 100 may include a server 102 communicatively coupled to an existing property module 104, an asset collection module 106, a system data storage 108, a future property module 110, a staging module 112, and/or a CPU 114.
- the asset collection module 106 may be linked to a third-party site 116 and/or a user device comprising an application or mobile app 118, both of which may be described herein.
- the server 102 may be communicatively coupled to an asset collection module 106, which may be described further in FIG. 2.
- the asset collection module 106 may obtain a plurality of multi-dimensional images and/or video from one or more sources.
- the asset collection module 106 may obtain images from one or more third-party websites 116.
- a vendor may link his or her website 116 to the system, which may involve the vendor providing the system 100 authentication credentials that the system 100 can use to access and read data related to the vendor’s catalog of products advertised on the website. Vendors and other users may also transmit images or 3D models of objects to the asset collection module 106 via, e.g., an application programming interface (API).
- the system 100 may process the received data (e.g., to ensure standard formatting and apply appropriate labels) and store it locally to system data storage 108.
- the system 100 may receive from a registered user input data related to images and/or video via an application 118 hosted on the user’s mobile device.
- the system data storage 108 may store the 3D assets (which may alternatively be referred to as 3D models) and information related to the 3D assets such as one or more parameters related thereto.
- the assets may be stored in, for example, a database that the system 100 may query.
- the system 100 may include an existing property module 104 and a future property module 110.
- the future property module 110 may pull a list of candidate builders and provide such a list to a user.
- the user may be able to select a candidate builder from the list.
- the selected candidate builder may be able to access the user’s 3D model(s) for each environment within the blueprint of a property.
- the 3D models may include one or more 3D representations depicting objects within the environment.
- the existing property module 104, which may be described further herein, may allow a user to visualize different variations of an existing space via the implementation of the object segmentation model, the object removal model, the model retrieval engine, the object placement model, and the material recognition model, all of which are described in more detail below.
- the system may include a central processing unit or CPU 114, which may direct the operation of the different modules of the system 100.
- a staging module 112 may include a generative AI model trained, based on 2D floorplan representations for sample spaces, to output a 2D floorplan representation showing a proposed arrangement of one or more objects within an environment.
- a 3D model depicting the environment may be modified in accordance with the output 2D floorplan.
- a user may be able to visualize the environment with the proposed arrangement.
- An example of a staging module 112 is described further with respect to FIGS. 16-18.
- FIG. 2 shows an exemplary process for receiving, generating, and storing multidimensional models related to physical products in system data storage.
- the asset collection module may receive a plurality of assets from third-parties, such as vendors selling objects (e.g., furnishings) to be displayed in model environments.
- the assets may be transmitted to the system by any suitable manner.
- vendors may transmit existing assets to the system via an API of the system.
- the received assets may be in any of 2D image, 3D model, or video formats.
- the system may determine a format of each of the plurality of received assets. For example, the system may determine whether a given asset is a 2D image or a 3D model.
- the process may flow to block 206, and the 3D asset may be stored in data storage 204.
- the existing asset received from the third-party may be processed before it is stored.
- the asset may be smoothed, compressed, or its formatting may be standardized to facilitate efficient storage, retrieval, and use for subsequent applications.
- the asset may be transmitted to 2D retail asset block 210, which may be a data storage module. Images may also be received from users at block A, which may represent a plurality of users uploading images from user devices (e.g., personal computers, smart phones, and the like).
- the system may determine whether the image was received from a retail account (e.g., a vendor) or a personal account (e.g., a customer designing a space or shopping for furnishings). This may be determined, for example, based on an account type of the user or based on information received from the user upon registration. If an image is received from a retail user, the image may be transmitted to retail asset storage 210.
- the image may be transmitted to personal asset storage 212.
- 2D assets from both blocks may be converted to 3D assets using converter module 214.
- the converter module 214 may include, for example, a neural radiance field (NeRF) model or other machine learning model trained to generate 3D models based on received images. In some embodiments, other techniques for converting items shown in a 2D image to a 3D model may be used.
- the system may then store the generated 3D models, along with any directly received (and optionally processed) 3D models in system storage 204.
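- As a non-limiting illustration of the ingestion flow just described, the sketch below classifies an asset by format, routes 2D images by account type, converts them to 3D, and stores the result; all function and field names are hypothetical and the converter is treated as an opaque callable.

```python
# Sketch of the FIG. 2 ingestion flow. Names and helpers are illustrative,
# not part of the disclosure.
from dataclasses import dataclass
from typing import Callable, Literal

@dataclass
class Asset:
    payload: bytes
    fmt: Literal["2d_image", "3d_model", "video"]
    account_type: Literal["retail", "personal"]

def ingest(asset: Asset,
           convert_to_3d: Callable[[bytes], bytes],
           store_3d: Callable[[bytes], None],
           store_2d: Callable[[bytes, str], None]) -> None:
    if asset.fmt == "3d_model":
        # Existing 3D assets may be smoothed/standardized before storage (not shown).
        store_3d(asset.payload)
        return
    # 2D content is held in retail or personal asset storage based on account type ...
    store_2d(asset.payload, asset.account_type)
    # ... and converted to 3D (e.g., via a NeRF-style model) before final storage.
    store_3d(convert_to_3d(asset.payload))
```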
- the 3D models may be stored with parameters relating to the objects the models represent.
- the vendor may indicate a product name, model number, price, or other information relating to the object.
- personal users may also optionally upload these or other categories of information.
- the system may be configured to store such information and associate with the stored models.
- FIG. 3 shows an exemplary process for determining whether input content depicts an object or a physical space.
- the system protocol may receive input content 302 from a user, which may be in the form of one or more input images and/or an input video recording.
- the user may upload the content 302 via a graphical user interface (GUI) of the application on the device.
- the user may see a prompt 300 requesting whether the content 302 represents an object or a space. If the user answers the former, the system may prepare the content 302 to be used as input assets 306 for the asset collection module described with respect to FIG. 2.
- the user must upload a minimum of 1, 2, 3, 4 or 5 images and/or satisfy other requirements to sufficiently capture the object.
- the system may prepare the content 302 to be used as input space 304 for processing by the object removal model, the object placement model, and/or the like, each of which is described further herein.
- the input space 304 is in the form of images
- the user may be required to upload a minimum of 1, 2, 3, 4, or 5 images and/or satisfy other requirements such as capturing each corner of the space.
- the input space 304 is in the form of video
- the user must upload a video capturing 360 degrees of the space.
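- A minimal sketch of such upload checks appears below; the particular thresholds and names are assumptions, since the disclosure only states that a configurable minimum may be required.

```python
# Illustrative check of the upload requirements described above; thresholds
# and field names are assumptions for this sketch.
MIN_OBJECT_IMAGES = 3   # the disclosure allows 1-5 as the configured minimum
MIN_SPACE_IMAGES = 4

def content_satisfies_requirements(kind: str, image_count: int,
                                   video_covers_360: bool = False) -> bool:
    if kind == "object":
        return image_count >= MIN_OBJECT_IMAGES
    if kind == "space":
        # Images should capture each corner, or a video should sweep 360 degrees.
        return image_count >= MIN_SPACE_IMAGES or video_covers_360
    raise ValueError(f"unknown content kind: {kind}")
```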
- FIG. 4 shows an exemplary process for identifying a 3D representation of an object that matches input data received from a user.
- the method may include receiving an input images/video prompt at block 400 or a text or other prompt at block 402, applying an object segmentation model at block 404, executing a model retrieval engine at block 406, and/or storing the identified match at a system data storage at block 408.
- a user registered to the system may be presented an option to enter a mode where the user can input a picture, video or text indicating an object and the system will attempt to identify and output to the user a 3D representation that matches the user input.
- the user may upload one or more images and/or a video depicting an object or an environment that contains one or more objects.
- the environment may be a physical space and the one or more objects may be furnishings within the physical space.
- a user may additionally or alternatively enter text via a text prompt 402, which the system may use as additional input to the image processing model.
- the system may receive text prompts and use a generative model, such as those described herein, to generate an image, video, or 3D model of an object that matches the user’s prompt.
- the generative process may be iterative, such that the user may input an initial instruction and subsequently input additional instructions to modify the generated content showing the object. In this manner, the object shown in the iteratively generated content may closely match the user’s desired object.
- environment data surrounding the object may also be obtained or generated from information entered via the prompts 400, 402.
- the received and/or generated content may be processed by an object segmentation model.
- the object segmentation model may process content containing an environment with one or more objects to identify various objects within the environment.
- the object segmentation model may also generate 3D representations of objects within the environment.
- the object segmentation model 404 may receive the environment data as input and feed the data into an image processing model.
- the image processing model may be trained to output a 3D model of the environment that includes a 3D representation of the environment as well as 3D representations of objects within the environment.
- the 3D representations of the objects may be independently manipulable from the 3D representation of the environment.
- one or more parameters such as the size, shape, pattern, color, etc. may be generated and indexed with each 3D representation.
- a label associated with what type of object the 3D representation corresponds to may be generated and indexed.
- the label and one or more parameters may collectively be known as a tag and may be used by other models, such as the object removal model, described herein.
- Generating the label and the parameter(s) may be an output of the image processing model.
- generating the label and the parameter(s) may take place at another module linked to the system such as the material recognition model.
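- For illustration, one possible (hypothetical) data layout for a tag and its associated 3D representation is sketched below; the disclosure does not mandate any particular schema, and the field names are assumptions.

```python
# Sketch of the "tag" described above: a label plus parameters indexed against
# a 3D representation. Field names are illustrative only.
from dataclasses import dataclass, field

@dataclass
class Tag:
    label: str                                        # e.g., "couch"
    parameters: dict = field(default_factory=dict)    # e.g., size, shape, color
    is_structure: bool = False                        # set later by the object removal module

@dataclass
class ObjectRepresentation:
    mesh_id: str                                      # handle to the 3D representation
    tag: Tag
```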
- a model retrieval engine 406 may query the system data storage 408 and identify, from the stored candidate 3D assets described with respect to FIG. 2, one or more candidate 3D assets most likely to match the object(s) depicted in the user input image and/or text. As described with respect to FIG. 14, the model retrieval engine 406 may perform a matching analysis that may generate scores for each of the candidate 3D assets. The model retrieval engine 406 may output for display to the user the candidate 3D asset with the highest score. In some examples, this candidate 3D asset may be used when the object placement model stages a proposed scene for the user, which will be described further with respect to FIG. 17. In some examples, the interface of the application may present a GUI that allows the user to scroll through a multitude of similar candidate 3D assets, as determined by the generated scores.
- FIG. 5 shows an object segmentation model for segmenting objects within an environment.
- the object segmentation model 500 may include an image processing model 502 which may include a 3D model generation module 504, a validation module 506, and a parameter generation module 508.
- the object segmentation model 500 may include weight initialization 510.
- the image processing model may receive text input 512, images and/or video input 514, and/or context data input 516.
- the image processing model 502 may receive environment data and be trained to output a 3D model including a 3D representation of an environment along with 3D representations of objects within an environment.
- the image processing model 502 may, for example, utilize a NeRF methodology with a neural network.
- the image processing model 502 may incorporate other generative AI models capable of contributing to the production of 3D models based on user input. For example, there may be a scenario where a user only provides input text 512 via the prompt described with respect to FIG. 4. In such a case, a generative AI model that uses natural language processing to generate digital images may be used to feed digital images to a 3D generation model 504.
- the image processing model 502 may include a 3D generation model 504 that may be trained to receive one or more 2D images as input 514 and, based on the images, output a 3D model.
- additional data related to context data input 516 and text input 512 may be fed to the image processing model 502 to improve accuracy.
- the image processing model 502 may be trained to receive inputted 2D images that capture different viewing angles of a single environment and/or objects within an environment and, based on this input, reconstruct a 3D model.
- a generated 3D model may depict viewing angles not previously captured by the inputted 2D images, which may allow a user to rotate the 3D model, allowing the user to view the space or environment from different perspectives.
- the 3D generation model 504 may employ a neural network that is trained to use information from pixels along rays associated with different viewing angles captured from the plurality of input images to assign weights and biases that will output color and volume of that pixel within a 3D model. Additional details regarding exemplary generative models within the scope of this disclosure are described below, including with respect to FIG. 6.
- a validation module 506 of the image processing model 502 may be used to provide feedback to the system on how the current configuration of the image processing model 502 is performing.
- a sample of the 3D model outputted by the image processing model 502 which may be a digital twin (as described further below with respect to FIG. 18), may be generated and used by the validation module 506 to test how the image processing model 502 is performing. Testing the sample may involve providing manual feedback as to whether the image processing model 502 outputted a correct 3D model or an incorrect 3D model. The results of testing the sample may be analyzed and applied to make improvements to the image processing model 502. For example, the system may send instructions to the image processing model 502 to update weights and biases of the neural network.
- the parameter generation module 508 of the object segmentation model 500 may perform additional analysis to generate one or more parameters of the 3D representations of the objects, which are described herein. For example, by analyzing the color(s) of an object shown in an input image or a 3D representation, a parameter may be generated indicating a color scheme of the object. Similarly, by analyzing a 2D projection, via one or more volume rendering techniques, of the 3D representation outputted by the model, a parameter associated with the size of the object may be generated.
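- The following sketch illustrates, under assumptions not stated in the disclosure, how a color-scheme parameter and a size parameter might be derived from an object's pixels and a 2D projection of its 3D representation; the exact computations are hypothetical.

```python
# Hypothetical parameter generation: a dominant-color parameter from an
# object's pixels and a size parameter from a projected bounding box.
import numpy as np

def dominant_color(pixels: np.ndarray, bins: int = 4) -> tuple:
    """pixels: (N, 3) RGB array; returns the center of the most populated color bin."""
    step = 256 // bins
    quantized = (pixels // step).astype(int)
    keys, counts = np.unique(quantized, axis=0, return_counts=True)
    top = keys[counts.argmax()]
    return tuple((top * step + step // 2).tolist())

def projected_size(points_2d: np.ndarray) -> float:
    """points_2d: (N, 2) projection of the 3D representation; returns bounding-box area."""
    extent = points_2d.max(axis=0) - points_2d.min(axis=0)
    return float(extent[0] * extent[1])
```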
- the object segmentation model 500 may include a classifier model 518 which may optionally be the same as the classifier model 902 implemented by the object removal module 900 described with respect to FIGS. 9 and 10.
- a 3D model generated based on user inputs e.g., images
- a classifier model 518 trained to assign labels to each 3D representation of an object within the 3D model. Additional details are described below, including with respect to FIGS. 7 and 9.
- FIG. 6 shows an exemplary pipeline for processing images and generating 3D representations of environments containing one or more objects.
- the 3D model generation module 602 described with respect to FIG. 6 is merely an example of a generative AI model, and other generative AI models capable of outputting a 3D model from input images and/or other data may be used instead.
- the image processing model 600 may include a 3D model generation module 602 which may employ a coordinate sample module 604, a neural network 606, a reconstruct domain module 608, a mapping module 610, and/or a sensor domain module 612.
- the modules mentioned above may be used to train a neural network capable of outputting a 3D model from a multitude of 2D images and/or data related to the 2D images, such as text input data and/or context input data.
- the coordinate sampling module 604 may sample the coordinates of a scene of the 3D model. As mentioned with respect to FIG. 5, the multitude of inputted images may capture different viewing angles of the same environment. Rays that move along a hypothetical z-axis of the 2D images may be generated for each pixel of the 2D images. By using the rays, the coordinate sampling module 604 may derive coordinate values (x, y, z) for each pixel of the corresponding 2D image, including the pixels along the hypothetical z-axis. Additionally, the viewing angle associated with the respective image may be derived and included in the coordinate values for each pixel, resulting in an input of (x, y, z, θ, φ) to be fed to the neural network 606.
- the coordinate values may be fed into the neural network 606, which may be a fully connected neural network designed to output color components (r, g, b) for each pixel as well as a volume density for each pixel.
- the volume density may be used to indicate whether an object is present for the given coordinate values in the scene or if the coordinate values are associated with empty space.
- a reconstruct domain module 608 may receive the information outputted from the neural network 606. Based on these values, pixels along the multitude of rays extending along a hypothetical z-axis may now have outputted color components and volume density, and by leveraging information related to the different rays, a scene of the 3D model may be reconstructed. For example, if a pixel associated with a first ray is associated with a zero volume density and the same pixel associated with a second ray is also associated with a zero volume density (in other words, the coordinate value for the pixel represents empty space within the scene), the model may be more confident in indicating that this pixel in the 3D model should represent empty space.
- a mapping module 610 may map the reconstructed sample back to the original 2D images used as input.
- the sensor domain module 612 may determine whether the reconstructed sample exceeds a threshold, which may indicate that the 3D model is accurate enough to be used by the system.
- a reconstruction error may be calculated and used as feedback to optimize the neural network 606.
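- A simplified sketch of such a NeRF-style network follows; the layer sizes, the omission of positional encoding, and all names are assumptions made for illustration only and are not taken from the disclosure.

```python
# Minimal NeRF-style network sketch matching the description above: a fully
# connected network mapping (x, y, z, theta, phi) to colors (r, g, b) and a
# volume density sigma. Sizes are illustrative.
import torch
import torch.nn as nn

class NeRFMLP(nn.Module):
    def __init__(self, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(5, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # (r, g, b, sigma)
        )

    def forward(self, coords: torch.Tensor):
        out = self.net(coords)           # coords: (N, 5) = (x, y, z, theta, phi)
        rgb = torch.sigmoid(out[:, :3])  # colors in [0, 1]
        sigma = torch.relu(out[:, 3:])   # non-negative volume density
        return rgb, sigma

def reconstruction_loss(pred_pixels: torch.Tensor, true_pixels: torch.Tensor) -> torch.Tensor:
    """Reconstruction error against the source images, used to optimize the network."""
    return ((pred_pixels - true_pixels) ** 2).mean()
```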
- FIG. 7 shows an exemplary input and output from an object segmentation model (e.g., object segmentation model 500).
- environment data may include one or more source image(s) 700.
- environment data may include a plurality of 2D images and/or video.
- Environment data may include text input data and/or context input data.
- the environment data may be fed to an image processing model (e.g., image processing model 502 described with respect to FIGS. 5 and 6).
- the image processing model may generate an output 702 with tags for each 3D representation depicting an object within the 3D model.
- the tags may include one or more parameters and a label indicative of the object type that the 3D representation represents.
- a label named “couch” may be generated and linked to a 3D representation of the object depicting a couch from the source image.
- a size of the couch, for example the dimensions of the couch relative to the environment, may be generated and linked to the 3D representation of the couch.
- An additional flag may be set for each of the 3D representations that may indicate whether the object that the 3D representation represents is a structure object or a non-structure object, which may be described further with respect to FIGS. 9-12.
- FIG. 8 shows an exemplary method for segmenting objects within a 3D representation of an environment.
- environment data may be received depicting an environment that contains one or more objects.
- the environment data may include a multidimensional visualization of the environment.
- the image processing model may be configured to do one or more of the following.
- it may generate a 3D model of the environment that includes a plurality of 3D representations of the one or more objects and a 3D representation of the environment, where the 3D representations of the one or more objects are independently manipulable relative to the 3D representation of the environment.
- the model may determine a plurality of parameters for one or more of the 3D representations of the one or more objects based on photogrammetry analysis.
- the model may, for one or more of the 3D representations of the one or more objects, associate the respective 3D representation of the respective object with a label indicating an object type and one or more of the plurality of parameters determined to be associated with the object.
- the label and the plurality of parameters may be used to query a database to find a matching 3D representation of the object.
- the model may output the labels and parameters for the 3D representations of the one or more objects.
- FIG. 9 shows an exemplary object removal module 900 for removing objects from a 3D representation of an environment.
- the object removal module 900 may include a classifier model 902, which may receive images or a model of an environment and classify one or more objects within the environment.
- the classifier model may be or may operate similarly to the object segmentation models described herein.
- the classifier model may apply labels to objects within the room identifying what the objects are.
- Each classified object may also be determined to be either a structural element within the environment, which cannot be readily removed, or a non-structural element that can be readily removed.
- the system may include a set of rules and/or a lookup table 914 with a list of possible objects and whether they should be treated as structure or non-structure.
- a wall, floor, light fixture, or fireplace may be classified as a structural element that cannot be readily removed and should be included in renderings of the environment with objects removed.
- a table, couch, or painting may be classified as a non-structural element that can be readily removed, and if the system generates a rendering of the environment with objects removed, such non-structural elements should be removed from that rendering.
- Each of the objects classified by the classifier model may thus be assigned a label indicating whether the object is structure or non-structure by structure label module 916.
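- As an illustrative sketch of such a rules/lookup-table classification (the listed labels come from the examples above; the fallback logic is hypothetical):

```python
# Sketch of classifying object labels as structure vs. removable (non-structure).
STRUCTURE_LABELS = {"wall", "floor", "light fixture", "fireplace"}
REMOVABLE_LABELS = {"table", "couch", "painting"}

def structure_flag(label: str) -> bool:
    """Return True for structural elements that cannot be readily removed."""
    if label in STRUCTURE_LABELS:
        return True
    if label in REMOVABLE_LABELS:
        return False
    # Unknown labels could be deferred to a rules engine or manual review.
    return False
```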
- the object removal module 900 may include a generative AI model 918 that is trained to fill excised portions of an image, a 3D model, or a rendering thereof.
- the object removal module 900 may excise portions of the environment that correspond to objects that have been flagged as non-structure and are therefore deemed to be removable objects. Excising these portions may leave gaps, which the generative AI model 918 may be trained to fill using inferences based on the characteristics (e.g., appearance, material, and shape) of surrounding structural elements (e.g., floor and walls).
- the material recognition model described with respect to FIG. 21 may be called to determine such characteristics.
- the generative AI model 918 may fill excised portions of an image, a video, a 3D model, or rendering thereof by generating filler portions that the model has been trained to recognize as appropriate and realistic when viewed within the context of that position in the environment.
- the generative AI model 918 may generate an initial filler portion, which may be, for example, random noise, a static filler, or some other starter data.
- the generative AI model 918 may then iteratively modify the content of the filler portion until the 3D representation is determined to resemble a realistic environment. This process of iteratively modifying and scoring the resulting 3D representation may be performed by a neural network trained using large volumes of content.
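- The following sketch shows only the control flow of this excise-and-fill loop; the generative fill model and the realism scorer are abstract placeholders, and the threshold and iteration limit are assumptions rather than features of the disclosure.

```python
# Control-flow sketch of excising removable objects and iteratively filling the
# gap until the result scores as realistic. The refine/scoring models are abstract.
from typing import Callable
import numpy as np

def fill_excised_region(scene: np.ndarray, mask: np.ndarray,
                        refine: Callable[[np.ndarray, np.ndarray], np.ndarray],
                        realism_score: Callable[[np.ndarray], float],
                        threshold: float = 0.9, max_iters: int = 50) -> np.ndarray:
    # Start from noise in the excised (masked) region; keep surrounding structure.
    scene = scene.copy()
    scene[mask] = np.random.rand(*scene[mask].shape)
    for _ in range(max_iters):
        if realism_score(scene) >= threshold:
            break
        # Iteratively modify only the filler portion based on surrounding material.
        scene[mask] = refine(scene, mask)[mask]
    return scene
```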
- the object removal module 1000 may be executed by the system to generate a 3D model of an environment where 3D representations representing removable objects have been excised and replaced with filler portions.
- Such a feature may allow a user to envision a space without any furnishings (i.e., furniture or other decor) within the space.
- the user may upload a plurality of images capturing, for example, the corners of the space to the system.
- the images may depict one or more furnishings within the space.
- the system may use the images as input to the object segmentation model and subsequently the object removal module 1000, both of which are described above.
- the object removal module 1000 may output a generated 3D model of the space with every 3D representation of a furnishing removed.
- This modified version of the 3D model may be stored in the system storage database.
- an original 3D model, for example one with all of the 3D representations, may have been generated by the object segmentation model and stored in the system data storage, for example, in a searchable database.
- the modified version of the 3D model and the original version of the 3D model may be linked in the database.
- a GUI linked to the system may prompt 1002 the user who uploaded the images of the space asking whether he or she wants to see the space with or without the furnishings. If the user prefers to see furnishings, the system may pull the original 3D model and display a GUI to the user depicting the original 3D model. If the user prefers not to see furnishings, the system may pull the modified 3D model and display a GUI to the user depicting the modified 3D model without any furnishings.
- FIG. 11 shows an exemplary method for generating a rendering of an environment in which objects have been removed.
- environment data may be received depicting an environment that contains a plurality of objects.
- the environment data may include a multi-dimensional visualization of the environment.
- the plurality of objects may include one or more structure objects and one or more removable objects.
- the environment data may be processed.
- the image processing model may be configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations.
- a rendering of the environment may be generated in which one or more portions associated with the removable objects are excised and replaced with a filler portion that is generated based on an inference of one or more characteristics associated with material surrounding the removable object.
- FIG. 12 shows an exemplary method for retrieving a 3D representation representing an object that matches an object indicated by a user.
- vendor 1 1200 may link the vendor website to the system and may provide access credentials so that the system, or system management, may read data related to products advertised on the website, which may be described with respect to FIGS. 1 and 2.
- vendor 2 1202 may directly upload to the system image(s) of products that the vendor wishes to sell. Data from both vendor 1 1200 and vendor 2 1202 may be processed via the asset collection protocol 1214 described with respect to FIG. 2 whereby 3D assets depicting 3D models may be generated and stored in the system data storage 1208.
- these models may be used as candidate models that may be compared to the output of the object segmentation model 1218 to retrieve a model or a 3D representation within a model that a user might be interested in.
- a user 1204 may upload one or more images depicting an object that the user is seeking to input to the system, which may be described with respect to FIG. 3.
- the system may execute the object segmentation model 1218 described with respect to FIGS. 4-6 to generate a 3D model containing a 3D representation of the environment and of any objects contained within the environment captured in the image(s).
- the object segmentation model 1218 may generate a tag comprising a label and one or more parameters and assign it to the 3D model, which is described with respect to FIGS. 5 and 6.
- the 3D model along with the tags may be sent to the system data storage 1208.
- the system may execute a match analysis model 1224.
- the match analysis model 1224 may perform match analysis and compare the respective 3D representations of the 3D model with the candidate 3D representations of the candidate 3D models.
- the output of the match analysis model 1224 may be a 3D model containing a 3D representation that most closely resembles the object that the user 1204 is seeking based on the image(s) uploaded by the user.
- the match may be sent for display via a GUI on the user’s device.
- FIG. 13 shows an example of a model retrieval engine 1300 configured to retrieve a 3D representation of an object that matches an object indicated by a user.
- the model retrieval engine 1300 may include a match analysis model 1302, which may include a tag generation module 1304 and a scoring module 1306.
- the match analysis model 1302 may have read and write permissions with respect to the system data storage 1308.
- the system may instruct the model retrieval engine 1300 to execute the match analysis model 1302 to find a candidate 3D asset that matches the generated 3D model. Additionally, the system may provide data related to the generated 3D model to the model retrieval engine 1300.
- the tags, which may include a label and one or more parameters as described above, may be generated by the object segmentation model as the segmentation model generates the 3D model.
- the object segmentation model may only generate the 3D model.
- the match analysis model 1302 may include a tag generation module 1304 that is configured to perform an analysis, for example by using photogrammetry analysis techniques, on the generated 3D model to generate tags for each 3D representation contained within the 3D model.
- the tag generation module 1304 may recognize that a 3D model contains a first 3D representation of an object.
- the module 1304 may tag the 3D representation of the object with tag 1, which may involve assigning a label to the 3D representation as well as determining one or more parameters for the object. For example, parameters X1 through N1 may be associated with tag 1 and may describe characteristics of the object uploaded by the user.
- context analysis data may be provided by a context analysis module 1310 and may be later used by the scoring module 1306.
- Logic of the system may instruct the match analysis model 1302 to execute a scoring module 1306.
- the scoring module 1306 may receive the generated tag, for example tag 1 from the tag generation module 1304, and may request from the system data storage 1308 information related to each of the candidate 3D models stored.
- the scoring module 1306 may include in the request message, for example in a header of the message, an indication of the label of tag 1 and may instruct the system data storage 1308 to only return candidate tags that match the label.
- candidate tag 1 may have an assigned label that matches the label of tag 1.
- the system data storage 1308 may send, via a message, candidate tag 1 and the parameters of candidate tag 1 to the scoring module 1306.
- the scoring module 1306 may perform match analysis that compares the candidate tag with the generated tag, for example, that compares candidate tag 1 and tag 1.
- the scoring module 1306 may identify, for example based on string matching, that parameter X1 of tag 1 and parameter C1 of candidate tag 1 relate to the same characteristics.
- both parameters may contain values pertaining to the size of their respective objects.
- the scoring module 1306 may generate a score based on the degree to which these parameters match. For example, if both parameters X1 and C1 convey that the size of their respective object is 50 sq. ft., a score representing a high confidence value may be generated and stored. If, however, parameter X1 conveys that the size of its respective object is 50 sq. ft. and parameter C1 conveys a different size, a score representing a low confidence value may be generated and stored. Generating the confidence scores may occur for each candidate tag sent to the scoring module, where every parameter of each candidate tag is compared to a corresponding parameter of the generated tag.
- Each score may be stored by the scoring module 1306 and linked to the parameters of the generated tag, for example, a confidence score may be linked to each of parameters X1 through N1.
- every candidate tag sent to the scoring module 1306 may have confidence scores measuring how well its parameters match the generated tag parameters.
- the scoring module 1306 may perform an operation on the confidence scores and may return the candidate 3D representation with the highest overall confidence score.
- every confidence score of the parameters must be above a threshold value.
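- For illustration, a hypothetical version of this scoring and selection logic is sketched below; the similarity function, threshold, and data layout are assumptions rather than features of the disclosure.

```python
# Sketch of the match analysis: score a generated tag against candidate tags
# with the same label and return the best candidate whose per-parameter
# confidences all clear a threshold. All structures here are hypothetical.
def parameter_confidence(value, candidate_value) -> float:
    """Toy similarity: exact match scores 1.0; otherwise decay with relative error."""
    if value == candidate_value:
        return 1.0
    try:
        return max(0.0, 1.0 - abs(value - candidate_value) / max(abs(value), 1e-6))
    except TypeError:
        return 0.0  # incomparable types (e.g., differing strings)

def best_match(tag: dict, candidates: list[dict], threshold: float = 0.5):
    best, best_score = None, -1.0
    for cand in candidates:
        if cand["label"] != tag["label"]:
            continue  # storage only returns label matches; mirrored here
        scores = [parameter_confidence(v, cand["parameters"].get(k))
                  for k, v in tag["parameters"].items()]
        if not scores or min(scores) < threshold:
            continue  # every parameter confidence must exceed the threshold
        overall = sum(scores) / len(scores)
        if overall > best_score:
            best, best_score = cand, overall
    return best, best_score
```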
- FIG. 14 shows an exemplary method for determining a model that matches an object indicated by a user.
- a plurality of two-dimensional (2D) representations of at least one object may be received.
- a 3D representation may be generated for the object.
- one or more tags may be generated for the 3D representation of the object, where the tags may include a label indicating an object type and a plurality of parameters indicating assessed characteristics of the object.
- matching analysis may be performed comparing the one or more tags against tags for a plurality of previously registered objects.
- a confidence score may be generated indicating a confidence that a previously registered object is a match for the object. Based on the matching analysis and the confidence score, the previously registered object may be selected for display to a user.
- FIG. 15 shows an exemplary process for segmenting input data and generating space plans using an object placement model.
- user input data, for example environment data, may be received.
- the input may be one or more images depicting a space in a property, for example, an existing property.
- the images, or other environment data may be used as input to the object segmentation model, which may perform processes as described with respect to FIGS. 4-6 to generate a 3D model which includes a 3D representation of an environment of the physical space as well as 3D representations of one or more objects within the physical space, which may be shown at 1504.
- the system may instruct the object segmentation model that the input data is to be used for the object placement model.
- the object segmentation model may generate and provide additional information to the object placement model such as context data related to the context of the scene that the 3D model represents.
- the 3D model may be used as input to the object placement model.
- the object placement model may be a generative AI object placement model trained to generate a 2D floorplan representation of the inputted 3D model, which may be described further with respect to FIG. 17.
- the 2D floorplan representations may depict a bird's-eye view of the layout of the 3D model, including how the 3D representations are to be arranged.
- the output of the trained generative AI object placement model may be a proposed arrangement of the 3D representations corresponding to the objects.
- the object placement model may modify the 3D model generated by the object segmentation model by placing the 3D representations within the 3D model according to the proposed arrangement.
- a 3D sample with the proposed arrangement may be generated using the 2D floorplan as well as information outputted from the object segmentation model. The 3D sample may be used to validate whether an arrangement is acceptable. In some examples, the 3D sample may be used to score the proposed arrangement.
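- As a rough, hedged sketch of the flow shown in FIG. 15, the function below chains the segmentation and placement stages; every callable (segment, propose_floorplan, reconstruct_sample, validate) is an assumed stand-in for the models described above rather than an actual interface of the system.

    def run_placement_pipeline(environment_images, segment, propose_floorplan,
                               reconstruct_sample, validate):
        """Chain the segmentation and placement stages described for FIG. 15.

        Assumed interfaces:
          segment(images)                  -> (scene_3d, context)
          propose_floorplan(scene, ctx)    -> floorplan_2d (bird's-eye arrangement)
          reconstruct_sample(plan, scene)  -> sample_3d (used for validation/scoring)
          validate(sample)                 -> bool
        """
        scene_3d, context = segment(environment_images)         # 3D model plus object representations
        floorplan_2d = propose_floorplan(scene_3d, context)     # proposed arrangement
        sample_3d = reconstruct_sample(floorplan_2d, scene_3d)  # 3D sample with the arrangement
        if not validate(sample_3d):
            raise ValueError("proposed arrangement failed validation")
        return floorplan_2d, sample_3d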
- FIG. 16 shows an exemplary method for outputting a 3D model with an arrangement of 3D representations indicated by a user.
- a 3D model may be generated by and obtained from the object segmentation model described with respect to FIGS. 4-6.
- Tags that include labels and one or more parameters may be assigned to each 3D representation depicted in the 3D model.
- the model may be passed through the object removal model described with respect to FIGS. 9-11 and flags indicating whether each 3D representation relates to a structure or removable object may be set and included in the tags. This flag may be later used by the staging module when determining which arrangements to present to the user.
- a user may be able to set the flags based on which objects the user wants to remain constant in the space, i.e., every arrangement must include the constant object.
- the 3D model along with information related to the 3D model such as the respective tags may be sent to the conversion module which may generate and/or convert the 3D model to a top-down 2D floorplan representation.
- the 2D floorplan may include tags (e.g., the assigned labels) for 2D representations associated with each 3D representation contained in the original 3D model. Generating the floorplan representation may be based on a rules-based algorithm that is configured with a set of rules.
- the set of rules may be generated based on, for example, text or other input that conveys what constitutes an acceptable layout for a given space.
- the input text may be fed to a natural language processing model that may generate the rules.
- a rule, for example, may define the relationship between the dimensions of the 2D representations with respect to the total dimensions of the 2D floorplan.
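- By way of a hedged example, a rule of this kind could be expressed as a simple predicate over footprint areas; the 15% ceiling below is an arbitrary illustrative value, not a value prescribed by the system.

    def footprint_rule(floorplan_area_sqft, object_areas_sqft, max_fraction=0.15):
        """Return True if no single 2D representation exceeds max_fraction of the floorplan area."""
        return all(area / floorplan_area_sqft <= max_fraction for area in object_areas_sqft)

    # Example: a 40 sq. ft. footprint in a 200 sq. ft. floorplan violates a 15% rule.
    print(footprint_rule(200.0, [12.0, 40.0]))  # False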
- the 2D floorplan may be sent to the staging module.
- the 2D floorplan may have a default arrangement of the 2D representations, which may match the arrangement of the 3D representations contained in the original 3D model.
- the staging module may implement the trained generative AI placement model described with respect to FIG. 17 to output alternative 2D floorplan representations, where each alternative depicts a different arrangement of the same 2D representations.
- the trained model may utilize one or more rules-based algorithms in order to validate that all of the alternative floorplan representations are compliant.
- each alternative may have a reconstructed 3D sample (e.g., 3D digital twin) that may be tested by the system to validate that the system can produce the floorplan in question.
- Validating whether the system can produce the floorplan may involve the floorplan passing a multitude of validation checks, which may include whether the spacing dimensions and overall layout of the floorplan in question are satisfactory, how many of the 2D representations depicted match the 3D representations in the original model, how closely an alternative 2D representation (if one was used) matches the 3D representations in the original model, and/or the like. Passing a check may involve the system comparing scores (e.g., confidence scores) to respective thresholds.
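- A minimal sketch of such threshold-based validation follows; the check names and threshold values are hypothetical placeholders for the confidence-score comparisons described above.

    def floorplan_is_producible(checks, thresholds):
        """Each named check must meet or exceed its threshold for the floorplan to pass."""
        return all(checks.get(name, 0.0) >= limit for name, limit in thresholds.items())

    thresholds = {"layout_quality": 0.7, "object_match_ratio": 0.8, "substitute_similarity": 0.6}
    candidate_checks = {"layout_quality": 0.82, "object_match_ratio": 0.9, "substitute_similarity": 0.65}
    print(floorplan_is_producible(candidate_checks, thresholds))  # True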
- the trained model may generate alternative floorplans with 2D representations associated with objects not in the image(s) uploaded by the user. For example, the model may have access to personal asset collections of the user which may be described with respect to FIGS. 2 and 3.
- One or more of the personal asset collections may include a 3D model of an asset that the user previously uploaded as inspiration.
- the model may be trained to determine which assets from the personal asset collection should be used in the staging of a given space.
- the object placement model may output alternative representations where each representation satisfies design criteria known to be valued by users. For example, the object placement model may be trained to output a first representation whose arrangement allows maximum movement within a given space and a second alternative representation whose arrangement allows maximum functionality of a given space.
- the alternative 2D floorplan representations may be presented for display in a GUI of the user device so that the user may view and select the floorplan he or she likes the most.
- feedback may be generated based on the selection made by the user and used by the generative AI placement model to improve its neural network.
- the original 3D model may be modified with the arrangement in the selected 2D floorplan and graphically rendered to be suited for display to the user.
- the original 3D model may maintain the same 3D representation of the space, but may populate the original 3D model with an arrangement of 3D representations corresponding to the objects that matches the selected arrangement.
- auto-staging may be performed by the object placement model, which may involve the system automatically selecting the ideal floorplan representation among the alternative representations for the user and beginning to generate the modified 3D model.
- the graphical rendering of the modified 3D model may be edited in real-time by the user, which requires the system to generate a 3D digital twin of each edited 3D model so that it can validate that the edit is allowable.
- Types of editing that may be performed may include swapping a 3D representation for a different 3D representation stored in a database accessible by the user, rearranging the 3D representations within the 3D model, deleting a 3D representation from the 3D model, and/or the like.
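- The edit-then-validate loop described above might be sketched as follows; the edit, build_digital_twin, and validate_twin callables are assumed interfaces rather than the actual implementation.

    def apply_edit(current_model, edit, build_digital_twin, validate_twin):
        """Apply a user edit (swap, rearrangement, or deletion) only if the digital twin
        of the edited model validates; otherwise keep the unedited model."""
        candidate = edit(current_model)          # edited copy of the 3D model
        twin = build_digital_twin(candidate)     # 3D digital twin of the edited model
        if not validate_twin(twin):
            return current_model                 # reject the edit
        return candidate                         # accept the edit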
- FIG. 17 shows an exemplary generative AI object placement model trained to output a 3D model with a proposed arrangement of 3D representations associated with objects within a space.
- the generative AI object placement model may include a pre-training model that may be trained on existing floorplans 1708, which may be retrieved from a floorplan data storage 1700 linked to the system, and from images uploaded by users and converted to floorplans 1702.
- the placement model may include a 2D to 3D interpolation module 1710 and an image interpolation module 1706.
- the placement model may include a 3D scene interpolation data storage 1712.
- the output may be used by the staging module 1716 to stage a proposed arrangement of alternative 2D floorplan representations for display.
- the generative AI object placement model may be trained via the pre-training model.
- the objective of the pre-training model may be to output a 2D floorplan representation that has an acceptable layout that satisfies the user’s requests.
- the pre-training model may be trained using sample spaces as input.
- the sample spaces may show spatial dimensions of the respective sample space and/or human-selected object placements within the respective space’s existing floorplans.
- a type of sample space that may be used as input to train the pre-training model may be existing floorplans, as shown at 1708, which may be retrieved from a database contained within a floorplan data storage 1700.
- the existing floorplans may include data and related 2D images depicting a bird's-eye view of layouts determined to be acceptable with respect to a given physical space.
- the existing floorplans may be retrieved by the system when a builder (e.g., the candidate builder described with respect to FIG. 2) registers with the system.
- the builder may upload or otherwise provide the system with access to a log of previous floor plans used by the builder.
- the builder may keep a record of schematics such as blueprint drawings he or she used for a given project.
- the builder may upload the drawings to the system using the application described with respect to FIG. 2.
- a type of sample space that may be used as input to train the pre-training model may be 2D floorplan representations converted from images uploaded by users registered to the system, as shown at 1702. For example, each time the system receives an uploaded image, it may generate a 2D floorplan representation using the techniques described with respect to FIG. 16. Additionally, the system may provide the representation as input to improve the neural network of the trained model, for example by adjusting biases according to this new information. Since the model is able to be continuously trained based on new uploads, it may be able to output 2D floorplans that correspond with one or more current design trends.
- determining the alternative 2D floorplans with different arrangements of 2D representations that should be outputted by the neural network may involve a manual scoring of the 2D floorplans that may act as feedback to adjust biases of one or more nodes of the neural network.
- the model may run a script that generates a score assessing whether the neural network outputs alternative floorplans that are likely to be selected by the user and are able to be produced by the system.
- the model may be trained to use a generated 2D floorplan representation as input and to output an interpolation of a 3D model. Information related to such interpolations may allow the system to provide a 3D rendering of the selected 2D floorplan, as described with respect to FIG. 16.
- Floorplans retrieved from the floorplan data storage 1700 may be interpolated via the 2D to 3D interpolation module 1710, while floorplans converted from new user uploads may be interpolated via the image interpolation module 1706.
- Information related to both modules described above may be sent to and stored in the 3D scene interpolation data storage 1712. For example, when the object placement model enters the staging phase described with respect to FIG. 16, the system may send interpolation information to the staging module 1716 in anticipation of a graphical rendering of a 3D model modified based on the user’s selection of the floorplan.
- FIG. 18 shows an exemplary method for generating a 2D floorplan with a proposed arrangement of objects.
- environment data may be received depicting an environment that contains one or more objects.
- the environment data may include a multidimensional visualization of the environment.
- the environment data may be processed using an image processing model.
- the image processing model may be configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations of the objects.
- the 3D model of the environment, or a derivative thereof may be inputted to an object placement model.
- the object placement model may be a generative artificial intelligence model that has been trained based on a plurality of 2D floorplan representations for sample spaces showing at least spatial dimensions of the respective sample space and/or human-selected object placements within the respective space.
- a 2D floorplan representation may be generated showing a proposed arrangement of the plurality of objects within the environment.
- the 3D model of the environment may be modified by placing the 3D representations of the plurality of objects within the 3D model of the environment in accordance with the 2D floorplan representation generated by the object placement model.
- FIG. 19 shows an example of a material recognition model trained to output an assessment of a surface material of an object shown in an image or multi-dimensional representation.
- the module 1900 may work in conjunction with an object segmentation model described herein.
- the module may include a feature extraction module 1912, a machine learning (ML) classifier 1916, and a material recognition model 1918.
- the material recognition module 1900 may receive content from a content library and use such content as input 1904.
- the material recognition model 1918 may be trained to output an object tag 1906 along with a material assessment index 1908 of the object tag based on information related to a furnishing and a region of interest, both of which are received via a message from the object segmentation model.
- the material recognition model 1918 may be trained to output an assessment of a surface material based on an input of one or more images depicting a furnishing.
- the training may occur via a pre-training model associated with the material recognition model 1918.
- the pre-training model may obtain training data which may include content stored in a content library 1904 linked to the system.
- the system may send an instruction for the content, which may include one or more images uploaded to the system using the techniques described with respect to FIGS. 2 and 3, to be sent to the pre-training model via messages.
- the content used as the training data may include datasets of surface material images uploaded, for example, during the builder registration process described above.
- the input data may be fed first through the object segmentation model described with respect to FIGS. 4-6.
- feature extraction may be performed on the segmented images via a feature extraction module 1912.
- the feature extraction module 1912 may output characteristics related to the surface material of the inputted image.
- the outputted characteristics may include color, texture, compactness, contrast, and/or the like.
- Feature extraction may include image processing techniques such as edge detection, corner detection, blob detection, and texture analysis, some of which may analyze pixel characteristics such as HSV (hue, saturation, value), RGB (red, green, blue), and LM (local mean).
- Feature extraction may also utilize rLM (run length matrix) techniques.
- the data may be fed to an ML classifier 1916 that may categorize the data into one or more classes related to a surface material.
- the ML classifier 1916 may output one or more parameters based on the categorization, which may be collectively considered an assessment of the material. As described herein, these one or more parameters may be added to the tag of the generated 3D model associated with the given input image. Testing, which may involve manual scoring of the outputted material assessment, may be performed during the pretraining process and feedback may be generated to improve how accurately the ML classifier 1916 categorizes the data into the material classes.
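- A simplified, hypothetical sketch of the feature-extraction and classification stages is shown below. Real embodiments would use richer features (the edge, corner, blob, and texture analyses noted above); here an image is reduced to mean channel values plus a contrast measure and classified against hand-picked material centroids, all of which are invented for illustration only.

    import numpy as np

    def extract_features(rgb_image):
        """Collapse an (H, W, 3) RGB image into a small feature vector:
        per-channel means plus a simple contrast (standard deviation) measure."""
        means = rgb_image.reshape(-1, 3).mean(axis=0)
        contrast = rgb_image.std()
        return np.append(means, contrast)

    # Hypothetical class centroids in the same 4-dimensional feature space.
    MATERIAL_CENTROIDS = {
        "wood":   np.array([120.0, 90.0, 60.0, 35.0]),
        "fabric": np.array([150.0, 150.0, 150.0, 20.0]),
        "metal":  np.array([180.0, 180.0, 190.0, 55.0]),
    }

    def classify_material(features):
        """Nearest-centroid classification returning (material class, confidence)."""
        distances = {m: float(np.linalg.norm(features - c)) for m, c in MATERIAL_CENTROIDS.items()}
        best = min(distances, key=distances.get)
        confidence = 1.0 / (1.0 + distances[best])
        return best, confidence

    # The resulting parameters could then be appended to the tag of the associated 3D model.
    image = np.full((32, 32, 3), fill_value=(118.0, 92.0, 61.0))
    material, conf = classify_material(extract_features(image))
    print({"material": material, "material_confidence": round(conf, 3)})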
- the material recognition model 1918 may receive information related to a given furnishing along with a region of interest, both of which may be identified by the object segmentation model.
- the region of interest may be a region in the image depicting the furnishing that contains an object to be analyzed by the material recognition model 1918.
- no region of interest may be indicated in the message received from the object segmentation model and, in such a case, the entire image of the furnishing may be used as input to the material recognition model 1918.
- the material recognition model 1918 may output a tag associated with the furnishing (e.g., which may be an input sent from the object segmentation model) as well as a material assessment parameter linked to the tag.
- FIG. 20 shows an exemplary method for matching objects based, at least in part, on an assessment of a surface material of a furnishing.
- one or more images of a furnishing may be received.
- the one or more images, or derivatives thereof may be inputted to a material recognition machine learning model.
- the material recognition machine learning model may be trained to generate an assessment of a surface material of a furnishing by analyzing a volume of sample images coded with known surface materials. Using the material recognition machine learning model, an assessment of a surface material of the furnishing shown in the one or more images may be generated.
- FIG. 21 shows an exemplary system architecture for obtaining and managing content to be used for training a designer model 2112 to generate a multi-dimensional representation of a physical space.
- the architecture may include one or more of the following: one or more designer data sources (designer A data source(s) 2100 through designer N data source(s) 2102); a data management platform 2104; a tag generation module 2106 comprising at least an object segmentation model, a parameter generation module, a conversion to floorplan module, and/or other models/modules described above such as the object removal model, the material recognition model, etc.; a profile manager 2108; a system data storage 2110; and/or a designer model 2112 comprising at least a floorplan generator and an object recommendation engine.
- the designer data sources 2100, 2102 may be sources that host content related to how particular designers design physical spaces.
- the designer model 2112 may be trained to output a design plan depicting a physical space, where the design plan stages the space as if a particular designer had staged it. In order to train such a model, content related to how the designer designed other spaces may be obtained from the data sources and used as training data.
- Examples of the content may include 2D images (e.g., pictures of a space previously designed by the designer), text or another medium by which information related to the design space is conveyed (e.g., a blog post describing the aesthetic of a space), multi-dimensional media (e.g., multi-modal data that may employ one or more data enrichment techniques such as object detection, facial recognition, metadata extraction for temporal or geolocation data, audio transcription, etc.) that add additional dimensions (e.g., 3D graphics or models of the space such as an interactive video game or other simulation that depicts a simulated design space), and/or other media that can be useful in providing insights when determining the design style of an individual designer for given spaces.
- some of the content may be similar or identical to the assets described in FIGS. 1-3.
- sources that can host such content may include third-party websites, social media platforms, blogs or other forums focused on publishing articles or videos, community boards (e.g., a platform where a community of members who like a designer’s design style can share media and engage in discussions pertaining to the designer’s aesthetic), and/or the like (other sources may include those involved when obtaining the assets as described in FIGS. 1-20).
- two designers of interest may be identified as designer A and designer N.
- designer A data sources 2100 and designer N data sources 2102 may contain respective sets of content such as images (as an example, 1st design image, 2nd design image, ..., Nth design image).
- a data management platform 2104 may identify such sources and collect all relevant content.
- the designer model 2112 may include a default state that may generate a design plan based on the inputted content without considering aesthetic or other designer characteristics. Such a state may be entered, for example, when the user has not selected or otherwise indicated a designer profile. The relationship between the designer model 2112 and the default state is described further below.
- the data management platform 2104 may be a platform of the system that identifies data sources (which may at times be referred to as external entities herein) that host content related to a designer stored in the system data storage 2110.
- the data management platform 2104 may coordinate data and/or instructions between various components of the internal environment along with the external entities, such as the tag generation module 2106, the profile manager 2108, the designer model 2112, the one or more data sources 2100, 2102, any user device registered to the platform 2104, and/or the like.
- the data management platform 2104 may operate via one or more servers (e.g., remote cloud-based servers) and may also host one or more data structures (e.g., which may be a part of the system data store 2110) responsible for storing training data or other data associated with the designer model 2112.
- the servers may include functionality similar to the servers described in FIGS. 1 and 2. For example, obtaining the content from the design data sources may involve one or more techniques performed by the asset collection module which are described above (e.g., authenticating the website, providing identifying credentials, storing such credentials, scanning the contents of the webpages for relevant media, etc.).
- the data management platform 2104 may request permission from the data source (e.g., a website) to scan or crawl the contents (e.g., headers, HTML, scripts, files, behavior analysis, etc.) of the pages of the website to retrieve any media related to a given individual/designer.
- the data management platform 2104 may provide APIs to the data source to facilitate data exchange, where the data source can upload data relevant to the desired content to the servers formatted such that the platform can parse the relevant information.
- the data management platform 2104 may provide authentication credentials (e.g., a token) that may indicate the identity of the data source.
- each time a data source transmits data to the platform 2104, it may include the credentials in a header so that the data management platform 2104 knows to which profiles the data pertains.
- the credentials sent from designer A data source 2100 may indicate profile ID A while those sent from designer N data source 2102 may indicate profile ID N.
- the data management platform 2104 may include a tag generation module 2106 which may include software configured to parse the relevant data from the data obtained from the sources and process such data through one or more models in order to generate tagging information which may be used as training data for the designer model 2112.
- the data obtained may consist of a plurality of discrete datasets, where each dataset is associated with a respective piece of content. Using the example shown in FIG. 21, a first dataset may be associated with the 1st design image, a second dataset may be associated with the 2nd design image, and so on. Since all of the datasets pertain to content from the same designer, they may have identical profile IDs.
- each of the datasets may be assigned with unique dataset IDs based on the content they represent, which may be useful for the profile manager 2108 when attempting to organize the tags and parameters for each of the datasets.
- the tag generation module 2106 may tag/assign the obtained dataset with a profile ID which may be identical to another record already indexed in the database of the system data storage 2110 (e.g., the module recognizes a match between the profile ID of the received dataset and the profile ID of a record).
- the assigned ID may be identical or similar to the profile ID of the incoming dataset, as shown in FIG. 21.
- the dataset may be processed via the object segmentation model.
- a 3D model may be generated of the depicted space, including 3D representations of the objects contained within the space.
- the 3D model may be processed via the object removal model to remove all removable objects (i.e., furnishings) from the respective 3D model.
- the 3D model may be processed by a parameter generation module to generate a set of parameters describing characteristics of the design of the respective space.
- the parameters may relate to the layout and/or style of the space, which may be based on context, types of objects used for a space, and/or the like.
- each object depicted in the respective sample space may be further analyzed by the model retrieval engine and/or the material recognition model to assign object IDs to each object as well as to generate parameters for each of the objects.
- the 3D model may be converted, using the object placement model, to a 2D floorplan representation showing the existing arrangements of the objects contained within the depicted space.
- Data related to the processing of the content, as opposed to the tagging of the ID as a header, may collectively be considered tagging information as shown in FIG. 21 and may be sent to the profile manager 2108 described below.
- the tagging information may contain a unique dataset ID (along with the profile ID) that identifies for the profile manager 2108 the piece of content to which the respective tagging information relates.
- the profile manager 2108 may receive a dataset from the tag generation module 2106 containing a profile ID (e.g., the profile ID shown in FIG. 21) and a body of tagging information associated with a dataset ID.
- the profile manager 2108 may be responsible for managing the backend operations that maintain the database contained within the system data storage 2110.
- the profile manager 2108 may receive tagging information linked to profile ID A, where the tagging information includes parameters that have been labeled with dataset ID 1.
- the profile manager 2108 may group the parameters (e.g., which may be represented by key-value pairs) into a nested hierarchy (e.g., or into another data structure that categorizes/organizes parameters) of different categories.
- the categories may be identified by tags and may include fields for layout, object, style, etc.
- the profile manager 2108 may instruct the database to record the parameters under a tag field called “layout”.
- the profile manager 2108 may perform similar operations for the object field and the style field.
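- As a hedged illustration, the grouping performed by the profile manager 2108 might resemble the sketch below; the category mapping, parameter keys, and field names are assumptions used only to show the nesting.

    # Illustrative mapping from parameter keys to the layout / object / style fields.
    CATEGORY_BY_KEY = {
        "floor_area": "layout", "ceiling_height": "layout", "access_points": "layout",
        "color_palette": "object", "furnishing_material": "object", "brand": "object",
        "architectural_style": "style",
    }

    def group_tagging_info(profile_id, dataset_id, params):
        """Group flat key-value parameters into a nested record keyed by category."""
        record = {"profile_id": profile_id, "dataset_id": dataset_id,
                  "layout": {}, "object": {}, "style": {}}
        for key, value in params.items():
            category = CATEGORY_BY_KEY.get(key, "object")   # default bucket for unknown keys
            record[category][key] = value
        return record

    print(group_tagging_info("profile_A", "dataset_1",
                             {"floor_area": 250, "color_palette": "earth tones",
                              "architectural_style": "farmhouse"}))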
- the system data storage 2110 may contain a database that stores the tagging information in a format where such information is readily accessible.
- the system data storage 2110 may contain identical or closely related features of the system data storage described above.
- the database may be considered a design replication database and may be separate from the other databases/data structures described above.
- the system data storage 2110 may contain an archived repository of available objects that may be retrieved by the object recommendation engine when determining which objects to recommend, as described below.
- the available objects may be similar or identical to the assets collected by the asset collection module described in FIGS. 1-4.
- the designer model 2112 may include one or more ML models trained to output a design plan for a space represented by a 3D model.
- a plurality of designer models 2112 may be included in the system.
- each designer model 2112 may be unique to a specific profile ID.
- a designer model 2112 may be trained to output a design plan representative of how a specific individual/designer (linked to a profile ID) would design a given space. Training the designer model 2112 is described further in FIGS. 22 and 23.
- the designer model 2112 may implement a floorplan generator and an object recommendation engine to generate and output the design plan.
- the floorplan generator may generate a proposed arrangement of a plurality of objects (i.e., furnishings) within the physical space.
- the proposed arrangement may consist of object indicators that indicate positions/orientations within the physical space at which a respective object is proposed to be located.
- the object recommendation engine may select, for each object indicator, a recommended object to be located at the indicated position, and may thereby generate a plurality of recommended objects associated with respective positions indicated by the respective object indicators.
- the floorplan generator and object recommendation engine are described further in FIGS. 22 and 23, respectively.
- the designer model 2112 may be or include a generative AI model deploying a deep neural network trained to improve how closely a reconstructed design space matches how a given designer would have constructed the space.
- the floorplan generator may be a generative AI model, while the object recommendation engine may be one or more models/algorithms that score or rank the available objects in the repository based on a criterion (further details of the floorplan generator and the object recommendation engine are provided in FIGS. 22 and 23, respectively).
- Examples of generative AI models may include NeRF, generative radiance fields (GRAF), surface-guided neural radiance fields (SURF), bundle-adjusting neural radiance fields (BARF), or any other model in the field of computer vision that can learn how to reconstruct a 3D scene from a set of images.
- examples of generative AI text-to-image models may include or incorporate DALL-E, STABLE DIFFUSION, or any other multimodal natural language processing (NLP) model that generates images from a text description.
- text-to-image models may be used in combination with other generative AI models.
- a text-to-image model may be executed first and configured to generate one or more 2D images from an input text description. In such a case, these generated 2D images may then be fed to a 3D generative AI model to generate a reconstructed design space that corresponds to the original input text description.
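- A minimal sketch of such chaining follows; text_to_image and images_to_3d are hypothetical callables standing in for whichever text-to-image and 3D generative models are used, and no particular library API is implied.

    def text_to_design_space(description, text_to_image, images_to_3d, n_views=4):
        """Generate several 2D views from a text description, then reconstruct a
        3D design space from those views."""
        views = [text_to_image(description, seed=i) for i in range(n_views)]
        return images_to_3d(views)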
- FIG. 22 depicts an exemplary training protocol to train the floorplan generator 2202 of the designer model to output a design plan.
- the floorplan generator 2202 may be fed a volume of sample images 2200 (or any of the other forms of media described in FIG. 21 such as text descriptions, multimodal data, contextual information such as temporal or geolocation data, etc.) depicting a plurality of sample spaces furnished by a designer.
- the sample images 2200 may include at least some of the images and other content collected from the data sources described in FIG. 21.
- at least some of the sample images 2200 may be generated from a series of simulations executed via one or more simulation tools like simple graphic models, game engines, etc.
- such simulated sample images may be beneficial in that they can increase the diversity and volume of the training dataset, thereby improving the generalization of the floorplan generator 2202.
- the sample images based on real-world data may be combined with the simulated sample images and processed as input training data.
- the sample images 2200 may show at least one or more of the following: spatial dimensions of the respective sample space, designer-selected object placements within the respective sample space, and designer-selected furnishings of the respective sample space. In some embodiments, these metrics may be analyzed during the evaluation process, i.e., the ground truth comparison in operation 2212.
- a sample 3D model may be obtained.
- the sample images may be fed into the object segmentation model and the object removal model described above which may generate respective sample 3D models (containing only structural objects as all removable objects have been removed/deleted via the object removal model).
- the sample 3D models may be indexed at the system data storage 2214 and retrieved by the floorplan generator 2202 without the need for additional processing.
- the sample 3D model may be converted to a sample design plan that includes a sample 2D floorplan representation showing a proposed arrangement of a plurality of objects / furnishings.
- the proposed arrangement may include object indicators indicating positions within the sample 2D floorplan representation at which a respective object is proposed to be located. Converting the sample 3D models may incorporate one or more techniques implemented by the object placement model which were described in detail in FIGS. 15-18.
- the proposed arrangement comprising the plurality of object indicators may consist of fixed object indicators for all structural objects. In such a case, all iterations of the sample design plan may have the same object indicators for the structural objects but different variations for the other non-structural objects.
- the floorplan generator 2202 may retrieve tagging information generated from analyzing the previously-registered spaces stored via the data management platform described in FIG. 21.
- the retrieved tagging information may be indexed to the profile ID of the designer model (as described in FIG. 21, each designer model may be trained to emulate an aesthetic of a designer with a particular profile ID).
- the tagging information may consist of a set of parameters nested under the layout field, object field, and/or style field described in FIG. 21.
- the parameters for the layout field may indicate, for each previously-registered space, one or more of the following: overall space dimensions (e.g., floor area, volume, etc.), proportions/relationships between different parts of a space, flow of movement including clearance thresholds (e.g., building code or other design requirements related to the movement of people through space), ceiling height, acoustics, ventilation, lighting, furnishing layout, access points (e.g., doors and windows), site context (e.g., if the previous space had a desirable view, the designer may choose a layout where a seat was placed near a window to the view), equipment/technology placement to ensure connectivity, intended use of space, zoning requirements, estimated occupant load (e.g., if a space is depicting a master bedroom, a designer may design a layout with fewer seating arrangements since he or she would expect that such a space would not be occupied by a large number of people), storage space, space utilization, safety standards, and/or the like.
- the parameters for the object field may indicate, for each object ID indexed to a previously-registered space, one or more of the following: color palette, texture, compactness, contrast, size, shape, pattern, surface finishes, furnishing material type, visual decor or theme, ergonomics, durability, usage, accessibility, flexibility/modularity (i.e., how much the object can be reconfigured for multiple purposes), integration with technology, brand (which may include the designer’s brand), cohesion, symmetry, price, and/or the like.
- the parameters for the style field may indicate, for a previously-registered space, an overarching architectural style reflecting the setting of the space. Examples of architectural styles may include modern, contemporary, mid-century modern, minimalist, farmhouse, beach shack, cabin, lake house, urban apartment, etc.
- determining the 2D floorplan representation may involve a rules-based algorithm that is configured with a set of rules.
- the set of rules may be generated based on, for example, the parameters (along with other training data generated, for example, from simulations) and may convey what constitutes a layout that matches the designer’s aesthetic for a given space.
- An example of a rule may pertain to the relationship between the dimensions of the 2D representations of objects with respect to the total dimensions of the 2D floorplan (e.g., a particular furnishing should not exceed X% of the available floorplan space).
- the sample 2D floorplan representation may be staged by placing object indicators at positions within the representation in accordance with the proposed arrangement outputted at operation 2206.
- the output of operation 2208 may include the positions of respective object indicators, the types of respective object indicators (e.g., couch), the size/dimensions of respective object indicators (e.g., the couch should not exceed 5 feet in width), aesthetic information for the respective object indicators (e.g., the couch should be red), along with other information included in the parameter field of the object category described above.
- software instructions on how to arrange the object indicators within the sample 2D floorplan representation may be generated and assist in staging the object indicators within the respective 2D floorplan representation.
- one or more staging techniques performed by the object placement model, which are described in FIGS. 15-18, may be incorporated at operation 2208.
- the staged sample 2D floorplan representation showing a proposed arrangement of a plurality of objects may be used as a predicted designer-inspired design plan, i.e., a design plan that a particular designer may use based on the input sample.
- the predicted design plan may be identical to the staged sample 2D floorplan representation.
- the predicted design plan may be a reconstructed 3D model with 3D representations mapped to each of the object indicators of the staged sample 2D floorplan.
- the predicted design plan may be evaluated, which may involve a comparison to a ground truth.
- the evaluation may involve qualitative analysis where an expert (perhaps the actual designer or representative(s) of the designer) scores the quality of the predicted design plan.
- the evaluation may involve statistical measures (e.g., inception score (IS), mode score (MS), etc.) used to assess the similarity between the predicted design plan and the real-world sample images and simulated sample images, with the goal being to train the AI model of the floorplan generator 2202 to produce new design plans that are indistinguishable from the sample.
- feedback may be generated to adjust the weights and biases of the neural network to minimize the loss function and thereby improve the model’s ability to generate design plans resembling the designer’s aesthetic.
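- The training loop described above (prediction, ground-truth comparison at operation 2212, and weight/bias adjustment) might be sketched as follows; model.predict, loss_fn, and update_fn are assumed interfaces, and no specific network architecture is implied.

    def train_floorplan_generator(model, samples, loss_fn, update_fn, epochs=10):
        """samples: iterable of (sample_3d_model, ground_truth_plan) pairs.
        loss_fn compares predicted and ground-truth design plans; update_fn adjusts
        the model's weights and biases based on the loss (e.g., via backpropagation)."""
        for _ in range(epochs):
            for sample_3d, ground_truth_plan in samples:
                predicted_plan = model.predict(sample_3d)          # predicted designer-inspired plan
                loss = loss_fn(predicted_plan, ground_truth_plan)  # evaluation against ground truth
                update_fn(model, loss)                             # feedback adjusts weights/biases
        return model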
- FIG. 23 depicts an exemplary flow diagram of an object recommendation engine 2302 configured to generate recommended objects to be located at the positions indicated by respective object indicators.
- the object recommendation engine 2302 may receive information from the floorplan generator 2300 indicative of how to arrange the object indicators within the physical space.
- the object recommendation engine 2302 may perform operations 2304 through 2310 to select recommended objects to populate the positions of the object indicators.
- the recommended objects may be selected from a repository of available objects stored in the system data storage 2312.
- the floorplan generator 2300 may generate and send software instructions that map the positions of the object indicators (which may be identified in the code under Object IDs) within the 2D floorplan representation of the design plan to corresponding 3D positions within the original 3D model (the 3D model that was converted to the 2D floorplan representation by the floorplan generator).
- reconstruction software tools such as photogrammetry, point cloud creation, computer vision, etc. may be implemented to generate the mappings.
- additional parameters related to desired characteristics (which may include the parameters in the object field of the database described in FIGS. 21 and 22) for each of the object indicators may be sent along with the mappings.
- the object recommendation engine 2302 may obtain such instructions and may modify the 3D model by placing object indicators in the mapped 3D positions.
- the object recommendation engine 2302 may skip mapping to the 3D model and instead select recommended objects to fill the 2D floorplan representation. In such a case, the 3D model may be populated only after all recommendations have been selected.
- selecting recommended objects to be located at the respective 3D positions may be based on scoring available objects indexed in the repository.
- the object recommendation engine 2302 may retrieve the object IDs of the available objects along with the tagging information and may process such information in one or more rules-based algorithms configured to output a confidence score based on certain criteria or a ruleset.
- the criteria may be manually hardcoded where users may evaluate and provide weights and/or rules as to what results in a match.
- An example of a rule may pertain to whether an available object is associated with a brand of the designer (where at least one parameter from the set of parameters of the tagging information includes an indication as to whether the respective furnishing is associated with the designer).
- Each confidence score may indicate a confidence that a designer associated with the profile of the designer model would choose to insert a previously-registered furnishing in place of the object indicator.
- the object recommendation engine 2302 may generate scores using one or more techniques similar to those implemented by the model retrieval engine described in FIG. 13. For example, the desired parameters for each object indicator which was sent from the floorplan generator 2300 may be compared to the tagging information of the available objects by performing an assessment of the degree to which the underlying data points match.
- the mapped 3D position may include an indication of a selection of a recommended object.
- the available object (previously-registered furnishing) associated with the highest confidence score may be selected as the recommended object to be placed at the position indicated by the respective object indicator. The process may be repeated until recommended objects have been selected for all of the object indicators (by iterating and looping across the object IDs as shown in FIG. 23).
- additional parameters and/or rules may be generated based on a current selection of a plurality of recommended objects.
- the object recommendation engine 2302 may determine to select a couch with certain dimensions (e.g., four feet in width and 10 feet in length) as a recommended object. Based on such a selection, an additional rule may be generated that pertains to the dimensions of the already selected couch, the dimensions of the other available objects within the object repository, the total dimensions of the physical space, etc.
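- The selection loop and the dynamically generated rules described above might be sketched as follows; the scoring heuristic, dictionary keys, and the remaining-floor-area rule are illustrative assumptions.

    def score_object(desired, available, rules):
        """Fraction of desired parameters matched, zeroed out if any rule is violated."""
        matched = sum(1 for k, v in desired.items() if available.get(k) == v)
        base = matched / max(len(desired), 1)
        return base if all(rule(available) for rule in rules) else 0.0

    def recommend(indicators, repository, total_area_sqft):
        """indicators: list of {"id": ..., "desired": {...}}; repository: list of object dicts."""
        rules, picks, used_area = [], {}, 0.0
        for indicator in indicators:
            scored = [(score_object(indicator["desired"], obj, rules), obj) for obj in repository]
            best_score, best_obj = max(scored, key=lambda pair: pair[0])
            picks[indicator["id"]] = best_obj
            used_area += best_obj.get("footprint_sqft", 0.0)
            remaining = total_area_sqft - used_area
            # Dynamic rule: subsequent selections must fit within the remaining floor area.
            rules.append(lambda obj, r=remaining: obj.get("footprint_sqft", 0.0) <= r)
        return picks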
- FIG. 24 depicts an exemplary method of generating a multi-dimensional representation of a physical space in response to user instructions inputted via a GUI.
- a user may upload environment data depicting a physical space.
- the user may interface with the GUI described with respect to FIGS. 1-20.
- text may be displayed via the GUI prompting the user to upload an image/video or other media file containing content of a physical space that the user wants designed based on a designer model.
- Below the text may be a drag-and-drop area (or another mechanism for the user to upload media of the physical space) and an upload button where the user can upload the file.
- the system described herein may generate a multidimensional model of the physical space via the implementation of the object segmentation model and the object removal model described above.
- a user may select to stage the physical space according to a designer profile.
- the GUI may present a graphical drop-down list of profile options.
- Each option may be associated with a profile of a designer stored in the data management platform in FIG. 21.
- a profile option may be created at the front end and appended to the drop-down list of profile options.
- the GUI may receive, from a user via the interface, a selection of the designer profile from the set of profile options.
- the selection may indicate a request from the user to have the proposed arrangement of the plurality of objects staged within the physical space according to how a designer associated with the selected designer profile would have staged the physical space.
- the user may choose no designer profiles. In such a case, it may be inferred that the user wants the physical space to be generated according to the default state described above.
- the floorplan generator may generate the design plan with a proposed arrangement of objects as described in FIG. 22.
- the object recommendation engine may select a plurality of recommended objects in accordance with the design plan, as described in FIG. 23.
- a 3D model may be populated with the plurality of recommended objects located at the respective positions indicated by the respective object indicators.
- the system may be further configured to receive, from a user device, a user instruction to generate or retrieve the populated 3D model with the plurality of recommended objects located at the respective positions indicated by the respective object indicators.
- the system may be configured to transmit for display by the user device a rendering of the populated 3D model.
- the rendering can be edited by the user of the user device.
- the plurality of recommended objects may be movable within the graphical rendering.
- the plurality of recommended objects may be substituted with other objects from the repository of available objects.
- the system may generate a 3D digital twin each time an edit is made to validate that the edited 3D model is viable (which is described further in FIGS. 5 and 16). Editing may include swapping a primary recommended object for a secondary recommended object stored in the repository of available objects. Editing may include deleting a recommended object from the 3D model. In some embodiments, the system may be further configured to iteratively modify the graphical rendering of the populated 3D model based on sequential user instructions.
- FIG. 25 depicts an exemplary method 2500 of generating a multi-dimensional representation of a physical space based on a designer model.
- environment data depicting a physical space may be received.
- the physical space may contain a plurality of objects.
- the environment data may comprise a multi-dimensional representation of an environment.
- the environment data may be processed using a first processing model.
- the first processing model may be configured to output a 3D model of the physical space that includes 3D representations of the plurality of respective objects.
- a design plan for the physical space may be generated using a designer model.
- the design plan may be generated based on the 3D model of the physical space, or a derivative thereof.
- the design plan may include a proposed arrangement of a plurality of objects within the physical space.
- the proposed arrangement may comprise a plurality of object indicators.
- each object indicator may indicate a position within the physical space at which a respective object is proposed to be located.
- a selection of a designer profile from a set of designer profiles may be received from a user via an interface. The selection may indicate a request from the user to have the proposed arrangement of the plurality of objects staged within the physical space according to how a designer associated with the selected designer profile would have staged the physical space.
- the 3D model of the physical space, or a derivative thereof, along with the selected designer profile may be inputted to the designer model.
- the designer model may be a generative artificial intelligence model that has been trained based on a volume of sample images that depict a plurality of sample spaces furnished according to the selected designer profile.
- the sample spaces may show at least: (i) spatial dimensions of the respective sample space; (ii) designer-selected object placements within the respective sample space; and/or (iii) designer-selected furnishings of the respective sample space.
- a recommended object may be selected to be located at the position indicated by the respective object indicator.
- a recommended object may be selected, from a repository of available objects, to be located at the position indicated by the respective object indicator.
- a plurality of recommended objects associated with respective positions indicated by the respective object indicators may be generated.
- selecting the plurality of recommended objects is performed by an object recommendation engine.
- the object recommendation engine may include one or more rules-based algorithms configured to generate confidence scores for a plurality of previously registered furnishings stored in the repository based on a plurality of manually coded rules.
- each confidence score may indicate a confidence value that a designer of the designer profile would insert a particular furnishing in place of the object indicator.
- at least one of the rules may indicate that a previously registered furnishing associated with a brand of the selected designer profile is more likely to have a higher confidence score than an identical furnishing that is associated with another brand.
- the 3D model may be populated with the plurality of recommended objects.
- the 3D model may be populated with the plurality of recommended objects located at the respective positions indicated by the respective object indicators.
- a user instruction may be received, from a user device, to generate or retrieve the populated 3D model with the plurality of recommended objects located at the respective positions indicated by the respective object indicators.
- a rendering of the populated 3D model may be transmitted for display by the user device.
- the rendering can be edited by the user of the user device.
- editing may comprise the option to swap at least one of the plurality of recommended objects with other previously registered furnishings from the repository of available objects.
- editing may comprise deleting at least one of the plurality of recommended objects from a current rendering of the 3D model.
- the graphical rendering of the populated 3D model may be iteratively modified based on sequential user instructions.
- a system for generating a multi-dimensional representation of an environment based, at least in part, on a design style may be provided.
- the system may obtain a plurality of sample images depicting a plurality of sample environments.
- the system may process, using an image processing model, the sample images.
- the image processing model may be configured to output sample 3D models of the sample environments.
- the system may input the sample 3D models, or derivatives thereof, to a design generation model.
- the design generation model may be a generative artificial intelligence model that is configured to generate, based on a subject 3D model or derivative thereof, a design plan for a space represented by the subject 3D model.
- the design plan may include a proposed arrangement of a plurality of objects within the space.
- the proposed arrangement may comprise a plurality of object indicators.
- Each object indicator may indicate a position within the space at which a respective object is proposed to be located.
- the system may obtain feedback regarding the proposed arrangement generated by the design generation model. Based on the feedback, the system may modify the design generation model.
- FIG. 26 shows an exemplary processing system that may execute techniques presented herein.
- FIG. 26 is a simplified functional block diagram of a computer that may be configured to execute techniques described herein, according to exemplary cases of the present disclosure.
- the computer (or “platform” as it may not be a single physical computer infrastructure) may include a data communication interface 2660 for packet data communication.
- the platform may also include a central processing unit 2620 (“CPU 2620”), in the form of one or more processors, for executing program instructions.
- the platform may include an internal communication bus 2610, and the platform may also include a program storage and/or a data storage for various data files to be processed and/or communicated by the platform such as ROM 2630 and RAM 2640, although the system 2600 may receive programming and data via network communications.
- the system 2600 also may include input and output ports 2650 to connect with input and output devices such as keyboards, mice, touchscreens, monitors, displays, etc.
- the various system functions may be implemented in a distributed fashion on similar platforms, to distribute the processing load.
- the systems may be implemented by appropriate programming of one computer hardware platform.
- any of the disclosed systems, methods, and/or graphical user interfaces may be executed by or implemented by a computing system consistent with or similar to that depicted and/or explained in this disclosure.
- aspects of the present disclosure are described in the context of computer-executable instructions, such as routines executed by a data processing device, e.g., a server computer, wireless device, and/or personal computer.
- aspects of the present disclosure may be embodied in a special purpose computer and/or data processor that is specifically programmed, configured, and/or constructed to perform one or more of the computer-executable instructions explained in detail herein. While aspects of the present disclosure, such as certain functions, are described as being performed exclusively on a single device, the present disclosure may also be practiced in distributed environments where functions or modules are shared among disparate processing devices, which are linked through a communications network, such as a Local Area Network (“LAN”), Wide Area Network (“WAN”), and/or the Internet. Similarly, techniques presented herein as involving multiple devices may be implemented in a single device. In a distributed computing environment, program modules may be located in both local and/or remote memory storage devices.
- LAN Local Area Network
- WAN Wide Area Network
- aspects of the present disclosure may be stored and/or distributed on non-transitory computer-readable media, including magnetically or optically readable computer discs, hardwired or preprogrammed chips (e.g., EEPROM semiconductor chips), nanotechnology memory, biological memory, or other data storage media.
- computer implemented instructions, data structures, screen displays, and other data under aspects of the present disclosure may be distributed over the Internet and/or over other networks (including wireless networks), on a propagated signal on a propagation medium (e.g., an electromagnetic wave(s), a sound wave, etc.) over a period of time, and/or they may be provided on any analog or digital network (packet switched, circuit switched, or other scheme).
- Storage type media include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks.
- Such communications may enable loading of the software from one computer or processor into another, for example, from a management server or host computer of the mobile communication network into the computer platform of a server and/or from a server to the mobile device.
- another type of media that may bear the software elements includes optical, electrical and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links.
- the physical elements that carry such waves, such as wired or wireless links, optical links, or the like, also may be considered as media bearing the software.
- terms such as computer or machine “readable medium” refer to any medium that participates in providing instructions to a processor for execution.
- A1. A system for object segmentation from a three-dimensional (3D) model of an image uploaded by a user comprising: a processing unit comprising one or more processors; a memory unit storing computer-readable instructions, wherein the system is configured to: receive environment data depicting an environment that contains one or more objects, the environment data comprising a multi-dimensional visualization of the environment; process, using an image processing model, the environment data, the image processing model being configured to: generate a 3D model of the environment that includes a plurality of 3D representations of the one or more objects and a 3D representation of the environment, wherein the 3D representations of the one or more objects are independently manipulable relative to the 3D representation of the environment; determine a plurality of parameters for one or more of the 3D representations of the one or more objects based on photogrammetry analysis; for one or more of the 3D representations of the one or more objects, associate the respective 3D representation of the respective object with a label indicating an object type and one or more of the plurality of parameters determined to be associated with the object, wherein the label and the plurality of parameters can be used to query a database to find a matching 3D representation of the object.
- A2. The system of claim A1, wherein the environment is a physical space, and the one or more objects include both structural and removable furniture units within the physical space, wherein the structural and removable furniture units comprise tangible decor units.
- A3. The system of any of claims A1-A2, wherein the plurality of parameters relates to one or more characteristics of the object, wherein at least one parameter of the plurality of parameters indicates a flag used to determine whether the 3D representation is associated with a structural unit or a removable unit.
- A5. The system of any of claims A1-A4, further configured to send a message comprising the labels and the one or more parameters of the one or more 3D representations to a model retrieval engine that is configured to query a database to identify one or more candidate 3D representations with matching labels and parameters.
- A6. The system of any of claims A1-A5, wherein the database is populated with candidate 3D representations associated with candidate objects prior to the model retrieval engine performing the query, wherein each candidate 3D representation is associated with a respective label and one or more parameters.
- A7. The system of any of claims A1-A6, wherein the candidate objects relate to products advertised by a vendor registered to the system, wherein upon registration, the vendor grants permission to the system to access and read data related to the products listed on a website hosted by the vendor.
- A8. The system of any of claims A1-A7, further configured to pull data related to the products from a database and process the data using the image processing model, wherein the image processing model is configured to generate a plurality of candidate 3D representations of the one or more products along with a plurality of parameters for each of the generated candidate 3D representations.
- A9. The system of any of claims A1-A8, further configured to receive data related to an upload of a 2D image depicting one or more candidate objects from a user registered to the system, and process the data using the image processing model, wherein the image processing model is configured to generate one or more candidate 3D representations of the one or more candidate objects along with a plurality of parameters for each of the generated candidate 3D representations.
- A10. The system of any of claims A1-A9, wherein the environment data is based on a 2D image depicting an environment and objects within the environment, wherein the image processing model comprises a neural network trained to predict a radiance value for each pixel of the 2D image to generate the 3D model of the 2D image, wherein the 3D model comprises a multitude of 3D views depicting different viewing angles of the 3D model.
- a method for object segmentation from a three-dimensional (3D) model of an image uploaded by a user comprising: receiving environment data depicting an environment that contains one or more objects, the environment data comprising a multi-dimensional visualization of the environment; processing, using an image processing model, the environment data, the image processing model being configured to: generate a 3D model of the environment that includes a plurality of 3D representations of the one or more objects and a 3D representation of the environment, wherein the 3D representations of the one or more objects are independently manipulable relative to the 3D representation of the environment; determine a plurality of parameters for one or more of the 3D representations of the one or more objects based on photogrammetry analysis; for one or more of the 3D representations of the one or more objects, associate the respective 3D representation of the respective object with a label indicating an object type and one or more of the plurality of parameters determined to be associated with the object, wherein the label and the plurality of parameters can be used to query a database to find a matching 3D representation of the object.
- C1. A system to remove an object from a three-dimensional (3D) model of a physical space comprising: a processing unit comprising one or more processors; a memory unit storing computer-readable instructions, wherein the system is configured to: receive environment data depicting an environment that contains a plurality of objects, the environment data comprising a multi-dimensional visualization of the environment, wherein the plurality of objects includes one or more structure objects and one or more removable objects; process, using an image processing model, the environment data, the image processing model being configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations; based on the tags associated with each of the 3D representations, determine which of the 3D representations are associated with structure objects and which of the 3D representations are associated with removable objects; and generate, using a generative artificial intelligence (AI) model, a rendering of the environment in which one or more portions associated with the removable objects are excised and replaced with a filler portion that is generated based on an inference of one or more characteristics associated with material surrounding the removable object.
- C2. The system of claim C1, wherein the environment is a physical space, and the one or more objects include both structural and removable furniture units within the physical space, wherein the structural and removable furniture units comprise tangible decor units.
- C3. The system of any of claims C1-C2, wherein the plurality of parameters relates to one or more characteristics of the object, wherein at least one parameter of the plurality of parameters indicates a flag used to determine whether the 3D representation is associated with a structural unit or a removable unit.
- C5. The system of any of claims C1-C4, wherein the system is loaded with a rule-based algorithm configured with a set of hardcoded rules, wherein the rule-based algorithm generates and outputs, based on an input tag, a flag indicating whether the 3D representation is associated with a structural unit or a removable unit.
- C8. The system of any of claims C1-C7, further configured to load a lookup table hardcoded with mapping information that maps labels to flags indicating whether a 3D representation is associated with a structural unit or a removable unit, wherein the system queries the lookup table with an outputted assigned label to determine whether to set the flag to a true value or a false value.
- C10. The system of any of claims C1-C9, further configured to receive, via a user prompt, an indication as to whether a user requests to see a rendering of the environment with the removable objects or without the removable objects and, based on the indication, send device information to a user device allowing the user device to display a rendering of the environment according to the user request.
- D1. A method to remove an object from a three-dimensional (3D) model of a physical space comprising: receiving environment data depicting an environment that contains a plurality of objects, the environment data comprising a multi-dimensional visualization of the environment, wherein the plurality of objects includes one or more structure objects and one or more removable objects; processing, using an image processing model, the environment data, the image processing model being configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations; based on the tags associated with each of the 3D representations, determining which of the 3D representations are associated with structure objects and which of the 3D representations are associated with removable objects; and generating, using a generative artificial intelligence (AI) model, a rendering of the environment in which one or more portions associated with the removable objects are excised and replaced with a filler portion that is generated based on an inference of one or more characteristics associated with material surrounding the removable object.
- D5. The method of any of claims D1-D4, further comprising loading a rule-based algorithm configured with a set of hardcoded rules, wherein the rule-based algorithm generates and outputs, based on an input tag, a flag indicating whether the 3D representation is associated with a structural unit or a removable unit.
- D6. The method of any of claims D1-D5, wherein, in addition to the flag, the rule-based algorithm generates and outputs a confidence score associated with the flag, wherein the method further comprises determining whether the 3D representation is associated with a structural unit or a removable unit on condition that the confidence score associated with the flag satisfies a threshold.
- D8. The method of any of claims D1-D7, further comprising loading a lookup table hardcoded with mapping information that maps labels to flags indicating whether a 3D representation is associated with a structural unit or a removable unit, wherein the method further comprises querying the lookup table with an outputted assigned label to determine whether to set the flag to a true value or a false value.
- D9. The method of any of claims D1-D8, wherein the generative AI model is an image generation AI model trained by applying random noise to an image sample and iterating pixels of the image sample until the system determines that the image sample is consistent with a correct filler portion.
- D10. The method of any of claims D1-D9, further comprising receiving, via a user prompt, an indication as to whether a user requests to see a rendering of the environment with the removable objects or without the removable objects and, based on the indication, sending device information to a user device allowing the user device to display a rendering of the environment according to the user request.
- E1. A system for selecting a three-dimensional (3D) model of an image uploaded by a user comprising: a processing unit comprising one or more processors; a memory unit storing computer-readable instructions, wherein the system is configured to: receive a plurality of two-dimensional (2D) representations of at least one object; based on the plurality of 2D representations, generate a 3D representation for the object; generate one or more tags for the 3D representation of the object, wherein the tags comprise a label indicating an object type and a plurality of parameters indicating an assessed characteristic of the object; perform a matching analysis comparing the one or more tags against tags for a plurality of previously registered objects; generate a confidence score indicating a confidence that a previously registered object is a match for the object; and based on the matching analysis and the confidence score, select the previously registered object for display to a user.
- E2. The system of claim E1, wherein the plurality of two-dimensional (2D) representations depict an environment of a physical space, and the one or more objects include both structural and removable furniture units within the physical space, wherein the structural and removable furniture units comprise tangible decor units.
- E4. The system of any of claims E1-E3, wherein a database is populated with candidate 3D representations associated with the previously registered objects that can be queried to retrieve the label and the plurality of parameters associated with a given previously registered object.
- E5. The system of any of claims E1-E4, wherein the previously registered objects relate to products advertised by a vendor registered to the system, wherein upon registration, the vendor grants permission to the system to access and read data related to the products listed on a website hosted by the vendor.
- F1. A method for object retrieval from a three-dimensional (3D) model of an image uploaded by a user comprising: receiving a plurality of two-dimensional (2D) representations of at least one object; based on the plurality of 2D representations, generating a 3D representation for the object; generating one or more tags for the 3D representation of the object, wherein the tags comprise a label indicating an object type and a plurality of parameters indicating an assessed characteristic of the object; performing a matching analysis comparing the one or more tags against tags for a plurality of previously registered objects; generating a confidence score indicating a confidence that a previously registered object is a match for the object; and based on the matching analysis and the confidence score, selecting the previously registered object for display to a user.
- F2. The method of claim F1, wherein the plurality of two-dimensional (2D) representations depict an environment of a physical space, and the one or more objects include both structural and removable furniture units within the physical space, wherein the structural and removable furniture units comprise tangible decor units.
- F3. The method of any of claims F1-F2, wherein the plurality of parameters relates to one or more characteristics of the object, wherein the one or more characteristics relate to one or more of size, color, pattern, texture, compactness, contrast, or viewpoints associated with the object.
- G1. A system for placing an object within a three-dimensional (3D) scene comprising: a processing unit comprising one or more processors; a memory unit storing computer-readable instructions, wherein the system is configured to: receive environment data depicting an environment that contains one or more objects, the environment data comprising a multi-dimensional visualization of the environment; process, using an image processing model, the environment data, the image processing model being configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations of the objects; input the 3D model of the environment, or a derivative thereof, to an object placement model, the object placement model being a generative artificial intelligence model that has been trained based on a plurality of 2D floorplan representations for sample spaces showing at least: (i) spatial dimensions of the respective sample space; and (ii) human-selected object placements within the respective space; generate, using the object placement model, a 2D floorplan representation showing a proposed arrangement of the plurality of objects within the environment; and modify the 3D model of the environment by placing the 3D representations of the plurality of objects within the 3D model of the environment in accordance with the 2D floorplan representation generated by the object placement model.
- G2. The system of claim G1, wherein the environment is a physical space, and the one or more objects include both structural and removable furniture units within the physical space, wherein the structural and removable furniture units comprise tangible decor units.
- G3. The system of any of claims G1-G2, wherein the plurality of parameters relates to one or more characteristics of the object, wherein at least one parameter of the plurality of parameters indicates a flag used to determine whether the 3D representation is associated with a structural unit or a removable unit.
- G5. The system of any of claims G1-G4, wherein the system is loaded with a rule-based algorithm trained to convert the 3D model outputted from the image processing model into a 2D floorplan representation that is compatible with the system and structurally compliant, wherein the rule-based algorithm is a natural language processing model trained using existing literature related to interior design.
- G6. The system of any of claims G1-G5, further configured to generate, using the object placement model, a plurality of 2D floorplan representations, wherein each 2D floorplan representation shows an alternative arrangement of the plurality of objects within the environment, wherein each 2D floorplan representation is determined, using the rule-based algorithm, to be compatible with the system and structurally compliant.
- G7. The system of any of claims G1-G6, further configured to send device information to a user device, wherein the device information can be used by the user device to create a graphical rendering of the plurality of 2D floorplan representations for display, wherein a user of the user device is able to see alternative arrangements of the plurality of objects within the environment.
- G8. The system of any of claims G1-G7, further configured to receive information, from the user device, indicating a 2D floorplan representation selected by the user and modify the 3D model of the environment by placing the 3D representations of the plurality of objects within the 3D model of the environment in accordance with the selected 2D floorplan representation.
- a method for placing an object within a three-dimensional (3D) scene comprising: receiving environment data depicting an environment that contains one or more objects, the environment data comprising a multi-dimensional visualization of the environment; processing, using an image processing model, the environment data, the image processing model being configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations of the objects; inputting the 3D model of the environment, or a derivative thereof, to an object placement model, the object placement model being a generative artificial intelligence model that has been trained based on a plurality of 2D floorplan representations for sample spaces showing at least: (i) spatial dimensions of the respective sample space; and (ii) human-selected object placements within the respective space; generating, using the object placement model, a 2D floorplan representation showing a proposed arrangement of the plurality of objects within the environment; and modifying the 3D model of the environment by placing the 3D representations of the plurality of objects within the 3D model of the environment in accordance with the 2D floorplan representation generated by the object placement model.
- a system for material recognition of an object comprising: a processing unit comprising one or more processors; a memory unit storing computer-readable instructions, wherein the system is configured to: receive one or more images of a furnishing; input the one or more images, or derivatives thereof, to a material recognition machine learning model, wherein the material recognition machine learning model has been trained to generate an assessment of a surface material of a furnishing by analyzing a volume of sample images coded with known surface materials; and generate, using the machine learning model, an assessment of a surface material of the furnishing shown in the one or more images.
- the material recognition machine learning model is a classifier model trained to categorize one or more input features into label classes, wherein the assessment of the surface material of the furnishing is a compilation of the categorized label classes.
- J1. A method for material recognition of an object comprising: receiving one or more images of a furnishing; inputting the one or more images, or derivatives thereof, to a material recognition machine learning model, wherein the material recognition machine learning model has been trained to generate an assessment of a surface material of a furnishing by analyzing a volume of sample images coded with known surface materials; and generating, using the machine learning model, an assessment of a surface material of the furnishing shown in the one or more images.
- J5. The method of any of claims J1-J4, wherein the material recognition machine learning model is a classifier model trained to categorize one or more input features into label classes, wherein the assessment of the surface material of the furnishing is a compilation of the categorized label classes.
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Finance (AREA)
- Accounting & Taxation (AREA)
- Strategic Management (AREA)
- Economics (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Development Economics (AREA)
- General Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Entrepreneurship & Innovation (AREA)
- Data Mining & Analysis (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Computer Graphics (AREA)
- Human Resources & Organizations (AREA)
- Operations Research (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Quality & Reliability (AREA)
- Evolutionary Biology (AREA)
- Geometry (AREA)
- Tourism & Hospitality (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Architecture (AREA)
- Computer Hardware Design (AREA)
- Processing Or Creating Images (AREA)
Abstract
Disclosed are methods, systems and non-transitory computer readable memory for generating multi-dimensional models representing physical environments and objects within said environments. Environment data depicting a physical space may be received. The environment data may be processed using a first processing model. A design plan for the physical space may be generated using a designer model. A recommended object may be selected to be located at the position indicated by the respective object indicator. The 3D model may be populated with the plurality of recommended objects. In some embodiments, the 3D model may be populated with the plurality of recommended objects located at the respective positions indicated by the respective object indicators.
Description
Multi-Dimensional Model Generation for Aesthetic Design of an Environment
TECHNICAL FIELD
[0001] Various aspects of the present disclosure relate generally to systems and methods for generating multi-dimensional representations of physical spaces and objects within those spaces.
BACKGROUND
[0002] Homeowners often seek out popular, highly-regarded designers for assistance when designing the interior of their homes. Homeowners trust the designers’ expertise in terms of how to best lay out each space within their home as well as which combination of decor would mesh well with the style and feel of the surrounding environment. However, the demand for such designers is typically high, resulting in homeowners having to outbid each other with increasing price offers. Moreover, when homeowners attempt to emulate the designers’ style by referencing catalogs depicting their previous work, they often miss key details of the aesthetic, leading to a poor outcome. There is a need for an automated system that can provide homeowners a multi-dimensional representation of a physical space that replicates an aesthetic style of a designer.
[0003] The present disclosure is directed to overcoming one or more of these above-referenced challenges.
SUMMARY
[0004] According to certain aspects of the disclosure, systems, methods, and computer readable memory are disclosed for generating multi-dimensional models representing physical environments and objects within said environments.
[0005] In some embodiments, environment data may be received depicting an environment that contains one or more objects. The environment data may include a multi-dimensional visualization of the environment. Using an image processing model, the environment data may be processed. The image processing model may be configured to do one or more of the following: generate a 3D model of the environment that includes a plurality of 3D representations of the one or more objects and a 3D representation of the environment, where the 3D representations of the one or more objects are independently manipulable relative to the 3D representation of the environment; determine a plurality of parameters for one or more of the 3D representations of the one or more objects based on photogrammetry analysis; and
for one or more of the 3D representations of the one or more objects, associate the respective 3D representation of the respective object with a label indicating an object type and one or more of the plurality of parameters determined to be associated with the object. The label and the plurality of parameters may be used to query a database to find a matching 3D representation of the object. The model may output the labels and parameters for the 3D representations of the one or more objects.
[0006] In some examples, environment data may be received depicting an environment that contains a plurality of objects. The environment data may include a multi-dimensional visualization of the environment. The plurality of objects may include one or more structure objects and one or more removable objects. Using an image processing model, the environment data may be processed. The image processing model may be configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations. Based on the tags associated with each of the 3D representations, it may be determined which of the 3D representations are associated with structure objects and which of the 3D representations are associated with removable objects. Using a generative artificial intelligence (AI) model, a rendering of the environment may be generated in which one or more portions associated with the removable objects are excised and replaced with a filler portion that is generated based on an inference of one or more characteristics associated with material surrounding the removable object. [0007] In some embodiments, a plurality of two-dimensional (2D) representations of at least one object may be received. Based on the plurality of 2D representations, a 3D representation may be generated for the object. One or more tags may be generated for the 3D representation of the object, where the tags may include a label indicating an object type and a plurality of parameters indicating an assessed characteristic of the object. Matching analysis may be performed comparing the one or more tags against tags for a plurality of previously registered objects. A confidence score may be generated indicating a confidence that a previously registered object is a match for the object. Based on the matching analysis and the confidence score, the previously registered object may be selected for display to a user.
[0008] In some embodiments, environment data may be received depicting an environment that contains one or more objects. The environment data may include a multi-dimensional visualization of the environment. Using an image processing model, the environment data may be processed. The image processing model may be configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags
associated with each of the 3D representations of the objects. The 3D model of the environment, or a derivative thereof, may be inputted to an object placement model. The object placement model may be a generative artificial intelligence model that has been trained based on a plurality of 2D floorplan representations for sample spaces showing at least spatial dimensions of the respective sample space and/or human-selected object placements within the respective space. Using the object placement model, a 2D floorplan representation may be generated showing a proposed arrangement of the plurality of objects within the environment. The 3D model of the environment may be modified by placing the 3D representations of the plurality of objects within the 3D model of the environment in accordance with the 2D floorplan representation generated by the object placement model. [0009] In some embodiments, one or more images of a furnishing may be received. The one or more images, or derivatives thereof, may be inputted to a material recognition machine learning model. The material recognition machine learning model may be trained to generate an assessment of a surface material of a furnishing by analyzing a volume of sample images coded with known surface materials. Using the machine learning model, an assessment of a surface material of the furnishing shown in the one or more images may be generated.
[0010] In some embodiments, a system for generating a multi-dimensional representation of a physical space based, at least in part, on a designer model may be provided herein. Environment data depicting a physical space may be received. In some embodiments, the physical space may contain a plurality of objects. In some embodiments, the environment data may comprise a multi-dimensional representation of an environment. The environment data may be processed using a first processing model. In some embodiments, the first processing model may be configured to output a 3D model of the physical space that includes 3D representations of the plurality of respective objects. A design plan for the physical space may be generated using a designer model. In some embodiments, the design plan may be generated based on the 3D model of the physical space, or a derivative thereof. The design plan may include a proposed arrangement of a plurality of objects within the physical space. In some embodiments, the proposed arrangement may comprise a plurality of object indicators. For example, each object indicator may indicate a position within the physical space at which a respective object is proposed to be located. A recommended object may be selected to be located at the position indicated by the respective object indicator. In some embodiments, for each object indicator, a recommended object may be selected, from a repository of available objects, to be located at the position indicated by the respective object indicator. In such a case, a plurality of recommended objects associated with respective
positions indicated by the respective object indicators may be generated. The 3D model may be populated with the plurality of recommended objects. In some embodiments, the 3D model may be populated with the plurality of recommended objects located at the respective positions indicated by the respective object indicators.
[0011] Additional objects and advantages of the disclosed technology will be set forth in part in the description that follows, and in part will be apparent from the description, or may be learned by practice of the disclosed technology.
[0012] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the scope of the disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS
[0013] The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate various exemplary aspects and together with the description, serve to explain the principles of the disclosed technology.
[0014] FIG. 1 shows an exemplary system architecture for generating 3D models representing an environment.
[0015] FIG. 2 shows an exemplary system protocol by which one or more multidimensional assets may be stored in a system data storage.
[0016] FIG. 3 shows an exemplary process for determining whether input content depicts an object or a physical space.
[0017] FIG. 4 shows an exemplary process for identifying a 3D representation of an object that matches input data received from a user.
[0018] FIG. 5 shows an object segmentation model for segmenting objects within an environment.
[0019] FIG. 6 shows an exemplary pipeline for processing images and generating 3D representations of environments containing one or more objects.
[0020] FIG. 7 shows an exemplary input and output from an object segmentation model.
[0021] FIG. 8 shows an exemplary method for segmenting objects within a 3D representation of an environment.
[0022] FIG. 9 shows an exemplary system for removing objects from a 3D representation of an environment.
[0023] FIG. 10 shows an exemplary method for displaying an environment in which objects have been optionally removed based on user inputs.
[0024] FIG. 11 shows an exemplary method for generating a rendering of an environment in which objects have been removed.
[0025] FIG. 12 shows an exemplary method for retrieving a 3D representation representing an object that matches an object indicated by a user.
[0026] FIG. 13 shows an example of a model retrieval engine configured to retrieve a 3D representation of an object that matches an object indicated by a user.
[0027] FIG. 14 shows an exemplary method for determining a model that matches an object indicated by a user.
[0028] FIG. 15 shows an exemplary process for segmenting input data and generating space plans using an object placement model.
[0029] FIG. 16 shows an exemplary method for outputting a 3D model with an arrangement of 3D representations indicated by a user.
[0030] FIG. 17 shows an exemplary generative AI object placement model trained to output a 3D model with a proposed arrangement of 3D representations associated with objects within a space.
[0031] FIG. 18 shows an exemplary method for generating a 2D floorplan with a proposed arrangement of objects.
[0032] FIG. 19 shows an example of a material recognition model trained to output an assessment of a surface material of an object shown in an image or multi-dimensional representation.
[0033] FIG. 20 shows an exemplary method for matching objects based, at least in part, on an assessment of a surface material of a furnishing.
[0034] FIG. 21 shows an exemplary system architecture for obtaining and managing content to be used for training a designer model to generate a multi-dimensional representation of a physical space.
[0035] FIG. 22 depicts an exemplary training protocol to train the floorplan generator of the designer model to output a design plan.
[0036] FIG. 23 depicts an exemplary flow diagram of an object recommendation engine configured to generate recommended objects to be located at the position indicated by respective object indicators.
[0037] FIG. 24 depicts an exemplary method of generating a multi-dimensional representation of a physical space in response to user instructions inputted via a GUI.
[0038] FIG. 25 depicts an exemplary method of generating a multi-dimensional representation of a physical space based on a designer model.
[0039] FIG. 26 shows an exemplary processing system that may execute techniques presented herein.
DETAILED DESCRIPTION
[0040] In general, the present disclosure is directed to methods and systems for generating and modifying multi-dimensional models representing physical environments and objects within said environments.
[0041] In some embodiments, environment data may be received depicting an environment that contains one or more objects. The environment data may include a multi-dimensional visualization of the environment. Using an image processing model, the environment data may be processed. The image processing model may be configured to do one or more of the following: generate a 3D model of the environment that includes a plurality of 3D representations of the one or more objects and a 3D representation of the environment, where the 3D representations of the one or more objects are independently manipulable relative to the 3D representation of the environment; determine a plurality of parameters for one or more of the 3D representations of the one or more objects based on photogrammetry analysis; and for one or more of the 3D representations of the one or more objects, associate the respective 3D representation of the respective object with a label indicating an object type and one or more of the plurality of parameters determined to be associated with the object. The label and the plurality of parameters may be used to query a database to find a matching 3D representation of the object. The model may output the labels and parameters for the 3D representations of the one or more objects.
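By way of a non-limiting illustration, the Python sketch below shows one possible shape for the tag described above (a label plus parameters) and how such a tag might be used to query a catalog of stored candidate 3D representations. The names ObjectTag, CATALOG, and find_matches, as well as the sample entries, are hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass, field

@dataclass
class ObjectTag:
    """Label plus parameters attached to a segmented 3D representation."""
    label: str                                       # object type, e.g. "sofa"
    parameters: dict = field(default_factory=dict)   # e.g. size, color, pattern

# Hypothetical in-memory stand-in for the system data storage of candidate 3D assets.
CATALOG = [
    {"asset_id": "sofa-001", "label": "sofa", "parameters": {"color": "grey", "width_cm": 210}},
    {"asset_id": "lamp-014", "label": "lamp", "parameters": {"color": "brass", "height_cm": 150}},
]

def find_matches(tag: ObjectTag) -> list:
    """Return catalog entries whose label matches and whose shared parameters agree."""
    hits = []
    for entry in CATALOG:
        if entry["label"] != tag.label:
            continue
        if all(entry["parameters"].get(k) == v
               for k, v in tag.parameters.items() if k in entry["parameters"]):
            hits.append(entry)
    return hits

if __name__ == "__main__":
    segmented = ObjectTag(label="sofa", parameters={"color": "grey"})
    print(find_matches(segmented))   # -> the grey sofa entry
```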
[0042] In some examples, environment data may be received depicting an environment that contains a plurality of objects. The environment data may include a multi-dimensional visualization of the environment. The plurality of objects may include one or more structure objects and one or more removable objects. Using an image processing model, the environment data may be processed. The image processing model may be configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations. Based on the tags associated with each of the 3D representations, the system may determine which of the 3D representations are associated with structure objects and which of the 3D representations are associated with removable objects. Using a generative artificial intelligence (AI) model, a rendering of the environment may be generated in which one or more portions associated with the removable objects are excised and replaced with a filler portion that is generated
based on an inference of one or more characteristics associated with material surrounding the removable object.
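A minimal sketch of this removal step is shown below, assuming each 3D representation carries a tag with a boolean "removable" flag and that the rendering is available as a pixel array. The mean colour of the surrounding pixels merely stands in for the generative AI filler inference; split_by_tag and fill_removed_region are illustrative names only.

```python
import numpy as np

def split_by_tag(representations):
    """Partition 3D representations into structural vs removable using a tag flag."""
    structural = [r for r in representations if r["tag"].get("removable") is False]
    removable = [r for r in representations if r["tag"].get("removable") is True]
    return structural, removable

def fill_removed_region(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Replace masked (removed-object) pixels with a filler inferred from the surroundings.

    A real system would call a generative inpainting model here; the mean colour of the
    unmasked pixels is used only as a stand-in for that inference.
    """
    filled = image.copy()
    surrounding_mean = image[~mask].mean(axis=0)
    filled[mask] = surrounding_mean
    return filled

if __name__ == "__main__":
    reps = [
        {"name": "wall", "tag": {"removable": False}},
        {"name": "armchair", "tag": {"removable": True}},
    ]
    print(split_by_tag(reps))
    rendering = np.random.rand(64, 64, 3)      # stand-in rendering of the environment
    removed = np.zeros((64, 64), dtype=bool)
    removed[20:40, 20:40] = True               # region occupied by the removable object
    print(fill_removed_region(rendering, removed).shape)
```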
[0043] In some embodiments, a plurality of two-dimensional (2D) representations of at least one object may be received. Based on the plurality of 2D representations, a 3D representation may be generated for the object. One or more tags may be generated for the 3D representation of the object, where the tags may include a label indicating an object type and a plurality of parameters indicating an assessed characteristic of the object. Matching analysis may be performed comparing the one or more tags against tags for a plurality of previously registered objects. A confidence score may be generated indicating a confidence that a previously registered object is a match for the object. Based on the matching analysis and the confidence score, the previously registered object may be selected for display to a user.
[0044] In some embodiments, environment data may be received depicting an environment that contains one or more objects. The environment data may include a multi-dimensional visualization of the environment. Using an image processing model, the environment data may be processed. The image processing model may be configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations of the objects. The 3D model of the environment, or a derivative thereof, may be inputted to an object placement model. The object placement model may be a generative artificial intelligence model that has been trained based on a plurality of 2D floorplan representations for sample spaces showing at least spatial dimensions of the respective sample space and/or human-selected object placements within the respective space. Using the object placement model, a 2D floorplan representation may be generated showing a proposed arrangement of the plurality of objects within the environment. The 3D model of the environment may be modified by placing the 3D representations of the plurality of objects within the 3D model of the environment in accordance with the 2D floorplan representation generated by the object placement model. [0045] In some embodiments, one or more images of a furnishing may be received. The one or more images, or derivatives thereof, may be inputted to a material recognition machine learning model. The material recognition machine learning model may be trained to generate an assessment of a surface material of a furnishing by analyzing a volume of sample images coded with known surface materials. Using the machine learning model, an assessment of a surface material of the furnishing shown in the one or more images may be generated.
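The material recognition step might, for example, reduce to a classifier that compares features extracted from a furnishing image against per-material references. The nearest-centroid scheme below is only a stand-in for the trained machine learning model; the feature values, material classes, and function names are invented for the example.

```python
import numpy as np

# Hypothetical per-material feature centroids (e.g. averaged texture/colour descriptors)
# that a trained material-recognition model might have learned from coded sample images.
MATERIAL_CENTROIDS = {
    "leather": np.array([0.8, 0.2, 0.1]),
    "linen":   np.array([0.3, 0.7, 0.4]),
    "oak":     np.array([0.5, 0.4, 0.9]),
}

def assess_surface_material(features: np.ndarray) -> dict:
    """Return per-class distances and the most likely surface material for one furnishing image."""
    distances = {name: float(np.linalg.norm(features - c)) for name, c in MATERIAL_CENTROIDS.items()}
    best = min(distances, key=distances.get)
    return {"material": best, "distances": distances}

if __name__ == "__main__":
    # Stand-in for features extracted from an uploaded image of a furnishing.
    image_features = np.array([0.75, 0.25, 0.15])
    print(assess_surface_material(image_features))  # -> "leather" is the closest class
```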
[0046] FIG. 1 shows an exemplary system architecture for generating 3D models representing an environment. As shown in FIG. 1, the system 100 may include a server 102 communicatively coupled to an existing property module 104, an asset collection module 106, a system data storage 108, a future property module 110, a staging module 112, and/or a CPU 114. The asset collection module 106 may be linked to a third-party site 116 and/or a user device comprising an application or mobile app 118, both of which may be described herein.
[0047] The server 102 may be communicatively coupled to an asset collection module 106, which may be described further in FIG. 2. The asset collection module 106 may obtain a plurality of multi-dimensional images and/or video from one or more sources. For example, the asset collection module 106 may obtain images from one or more third-party websites 116. In some examples, a vendor may link his or her website 116 to the system, which may involve the vendor providing the system 100 authentication credentials that the system 100 can use to access and read data related to the vendor’s catalog of products advertised on the website. Vendors and other users may also transmit images or 3D models of objects to the asset collection module 106 via, e.g., an application programming interface (API). The system 100 may process the received data (e.g., to ensure standard formatting and apply appropriate labels) and store it locally to system data storage 108. In some examples, the system 100 may receive from a registered user input data related to images and/or video via an application 118 hosted on the user’s mobile device.
[0048] The system data storage 108 may store the 3D assets (which may alternatively be referred to as 3D models) and information related to the 3D assets such as one or more parameters related thereto. The assets may be stored in, for example, a database that the system 100 may query.
[0049] The system 100 may include an existing property module 104 and a future property module 110. The future property module 110 may pull a list of candidate builders and provide such list to a user. The user may be able to select a candidate builder from the list. In such a case, the selected candidate builder may be able to access the user’s 3D model(s) for each environment within the blueprint of a property. The 3D models may include one or more 3D representations depicting objects within the environment. The existing property module 104, which may be described further herein, may allow a user to visualize different variations of an existing space via the implementation of the object segmentation model, the object removal model, the model retrieval engine, the object placement model, and the material recognition model, all of which are described in more detail below.
[0050] The system may include a central processing unit or CPU 114, which may direct the operation of the different modules of the system 100. A staging module 112 may include a generative AI model trained, based on 2D floorplan representations for sample spaces, to output a 2D floorplan representation showing a proposed arrangement of one or more objects within an environment. A 3D model depicting the environment may be modified in accordance with the outputted 2D floorplan representation. A user may be able to visualize the environment with the proposed arrangement. An example of a staging module 112 is described further with respect to FIGS. 16-18.
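To make the staging step concrete, the sketch below emits a 2D floorplan-style proposal (object name plus x/y position) for a set of furnishing footprints within a room of given dimensions. The simple row-packing heuristic merely stands in for the trained generative placement model; Footprint, propose_floorplan, and the margin value are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Footprint:
    name: str
    width: float   # metres along x
    depth: float   # metres along y

def propose_floorplan(room_w: float, room_d: float, objects, margin: float = 0.5):
    """Lay objects out left-to-right along the room, wrapping to a new row when needed.

    A trained generative placement model would propose far richer arrangements; this
    packing only illustrates the form of the 2D floorplan output (object, x, y).
    """
    placements, x, y, row_depth = [], margin, margin, 0.0
    for obj in objects:
        if x + obj.width + margin > room_w:            # wrap to the next row
            x, y = margin, y + row_depth + margin
            row_depth = 0.0
        if y + obj.depth + margin > room_d:
            raise ValueError(f"{obj.name} does not fit in the room")
        placements.append({"object": obj.name, "x": x, "y": y})
        x += obj.width + margin
        row_depth = max(row_depth, obj.depth)
    return placements

if __name__ == "__main__":
    furniture = [Footprint("sofa", 2.1, 0.9), Footprint("coffee table", 1.2, 0.6),
                 Footprint("armchair", 0.9, 0.9)]
    for p in propose_floorplan(5.0, 4.0, furniture):
        print(p)
```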
[0051] FIG. 2 shows an exemplary process for receiving, generating, and storing multidimensional models related to physical products in system data storage. At block 200, the asset collection module may receive a plurality of assets from third-parties, such as vendors selling objects (e.g., furnishings) to be displayed in model environments. The assets may be transmitted to the system by any suitable manner. For example, vendors may transmit existing assets to the system via an API of the system. In some embodiments, the received assets may be in any of 2D image, 3D model, or video formats. At block 202, the system may determine a format of each of the plurality of received assets. For example, the system may determine whether a given asset is a 2D image or a 3D model. If the asset is a 3D model, the process may flow to block 206, and the 3D asset may be stored in data storage 204. In some embodiments, the existing asset received from the third-party may be processed before it is stored. For example, the asset may be smoothed, compressed, or its formatting may be standardized to facilitate efficient storage, retrieval, and use for subsequent applications.
[0052] If the third-party asset is in a 2D format, the asset may be transmitted to 2D retail asset block 210, which may be a data storage module. Images may also be received from users at block A, which may represent a plurality of users uploading images from user devices (e.g., personal computers, smart phones, and the like). At block 201, the system may determine whether the image was received from a retail account (e.g., a vendor) or a personal account (e.g., a customer designing a space or shopping for furnishings). This may be determined, for example, based on an account type of the user or based on information received from the user upon registration. If an image is received from a retail user, the image may be transmitted to retail asset storage 210. If the image is received from a personal user, the image may be transmitted to personal asset storage 212. 2D assets from both blocks may be converted to 3D assets using converter module 214. The converter module 214 may include, for example, a neural radiance field (NeRF) model or other machine learning model trained to generate 3D models based on received images. In some embodiments, other
techniques for converting items shown in a 2D image to a 3D model may be used. The system may then store the generated 3D models, along with any directly received (and optionally processed) 3D models, in system storage 204.
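A rough sketch of this intake routing follows: 3D assets are stored directly, while 2D images are staged in retail or personal storage depending on the account type and passed through a converter before storage. The function names are illustrative, and convert_2d_to_3d is only a placeholder for the NeRF-style converter module 214.

```python
def convert_2d_to_3d(asset: dict) -> dict:
    """Placeholder for the converter module (e.g. a NeRF-based model) that builds a 3D asset."""
    return {"format": "3d_model", "source_images": asset["images"]}

def ingest_asset(asset: dict) -> dict:
    """Route an incoming asset: store 3D models directly, stage and convert 2D images."""
    if asset["format"] == "3d_model":
        return {"store": "system_storage", "payload": asset}
    bucket = "retail_assets" if asset.get("account_type") == "retail" else "personal_assets"
    converted = convert_2d_to_3d(asset)
    return {"store": "system_storage", "staged_in": bucket, "payload": converted}

if __name__ == "__main__":
    print(ingest_asset({"format": "3d_model", "vendor": "acme"}))
    print(ingest_asset({"format": "2d_image", "account_type": "personal",
                        "images": ["sofa_front.jpg", "sofa_side.jpg"]}))
```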
[0053] In some embodiments, the 3D models may be stored with parameters relating to the objects the models represent. For example, in a case where a model is received from a vendor, the vendor may indicate a product name, model number, price, or other information relating to the object. Personal users may also optionally upload these or other categories of information. The system may be configured to store such information and associate it with the stored models.
[0054] FIG. 3 shows an exemplary process for determining whether input content depicts an object or a physical space. The system protocol may receive input content 302 from a user, which may be in the form of one or more input images and/or an input video recording. The user may upload the content 302 via a graphical user interface (GUI) of the application on the device. Upon uploading the content 302, the user may see a prompt 300 requesting whether the content 302 represents an object or a space. If the user answers the former, the system may prepare the content 302 to be used as input assets 306 for the asset collection module described with respect to FIG. 2. In some examples, the user must upload a minimum of 1, 2, 3, 4 or 5 images and/or satisfy other requirements to sufficiently capture the object. If the user indicates that the content 302 represents a space, the system may prepare the content 302 to be used as input space 304 for processing by the object removal model, the object placement model, and/or the like, each of which is described further herein. In some embodiments, if the input space 304 is in the form of images, the user may be required to upload a minimum of 1, 2, 3, 4, or 5 images and/or satisfy other requirements such as capturing each corner of the space. In some embodiments, if the input space 304 is in the form of video, the user must upload a video capturing 360 degrees of the space.
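The upload checks described above could be expressed as a small validation routine along the lines of the following sketch. The exact image thresholds are configurable in the disclosure (e.g., 1 to 5 images), so the numbers used here are assumptions, as are the function and constant names.

```python
MIN_OBJECT_IMAGES = 3   # hypothetical threshold; the disclosure contemplates 1 to 5 images
MIN_SPACE_IMAGES = 4    # hypothetical threshold for image-based space capture

def validate_upload(kind, images=None, video_degrees=None):
    """Check whether uploaded content sufficiently captures an object or a space."""
    if kind == "object":
        if images and len(images) >= MIN_OBJECT_IMAGES:
            return True, "object capture accepted"
        return False, f"upload at least {MIN_OBJECT_IMAGES} images of the object"
    if kind == "space":
        if video_degrees is not None:
            if video_degrees >= 360:
                return True, "space capture accepted"
            return False, "video must capture 360 degrees of the space"
        if images and len(images) >= MIN_SPACE_IMAGES:
            return True, "space capture accepted"
        return False, f"upload at least {MIN_SPACE_IMAGES} images covering each corner of the space"
    return False, "content must be marked as an object or a space"

if __name__ == "__main__":
    print(validate_upload("object", images=["front.jpg", "side.jpg", "back.jpg"]))
    print(validate_upload("space", video_degrees=360))
```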
[0055] FIG. 4 shows an exemplary process for identifying a 3D representation of an object that matches input data received from a user. The method may include receiving an input images/video prompt at block 400 or a text or other prompt at block 402, applying an object segmentation model at block 404, executing a model retrieval engine at block 406, and/or storing the identified match at a system data storage at block 408.
[0056] At 400, a user registered to the system, for example, via an application on a device, may be presented an option to enter a mode where the user can input a picture, video or text indicating an object and the system will attempt to identify and output to the user a 3D representation that matches the user input. After entering the input mode, the user may upload
one or more images and/or a video depicting an object or an environment that contains one or more objects. The environment may be a physical space and the one or more objects may be furnishings within the physical space. In some embodiments, a user may additionally or alternatively enter text via a text prompt 402, which the system may use as additional input to the image processing model. In some embodiments, the system may receive text prompts and use a generative model, such as those described herein, to generate an image, video, or 3D model of an object that matches the user’s prompt. In some embodiments, the generative process may be iterative, such that the user may input an initial instruction and subsequently input additional instructions to modify the generated content showing the object. In this manner, the object shown in the iteratively generated content may closely match the user’s desired object. In some embodiments, environment data surrounding the object may also be obtained or generated from information entered via the prompts 400, 402. In some embodiments, at step 404, the received and/or generated content may be processed by an object segmentation model. The object segmentation model may process content containing an environment with one or more objects to identify various objects within the environment. The object segmentation model may also generate 3D representations of objects within the environment.
[0057] The object segmentation model 404 may receive the environment data as input and feed the data into an image processing model. As described with respect to FIGS. 5 and 6, the image processing model may be trained to output a 3D model of the environment that includes a 3D representation of the environment as well as 3D representations of objects within the environment. The 3D representations of the objects may be independently manipulable from the 3D representation of the environment. As described with respect to FIGS. 5 and 6, one or more parameters such as the size, shape, pattern, color, etc. may be generated and indexed with each 3D representation. Additionally, a label associated with what type of object the 3D representation corresponds to may be generated and indexed. The label and one or more parameters may collectively be known as a tag and may be used by other models, such as the object removal model, described herein. Generating the label and the parameter(s) may be an output of the image processing model. In some examples, generating the label and the parameter(s) may take place at another module linked to the system such as the material recognition model.
[0058] A model retrieval engine 406 may query the system data storage 408 and identify, from the stored candidate 3D assets described with respect to FIG. 2, one or more candidate 3D assets most likely to match the object(s) depicted in the user input image and/or text. As
described with respect to FIG. 14, the model retrieval engine 406 may perform a matching analysis that may generate scores for each of the candidate 3D assets. The model retrieval engine 406 may output for display to the user the candidate 3D asset with the highest score. In some examples, this candidate 3D asset may be used when the object placement model stages a proposed scene for the user, which will be described further with respect to FIG. 17. In some examples, the interface of the application may present a GUI that allows the user to scroll through a multitude of similar candidate 3D assets, as determined by the generated scores.
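One plausible form of the matching analysis performed by the model retrieval engine is sketched below: each stored candidate receives a score from its label and parameter agreement with the query tag, and the highest-scoring candidate is returned. The 0.6/0.4 weighting and all names are illustrative, not taken from the disclosure.

```python
def score_candidate(query_tag: dict, candidate_tag: dict) -> float:
    """Score how well a stored candidate 3D asset matches the tag of the user's object.

    The weighting is illustrative only: an exact label match dominates, and each
    agreeing shared parameter adds a smaller amount.
    """
    score = 0.0
    if query_tag["label"] == candidate_tag["label"]:
        score += 0.6
    query_params, cand_params = query_tag["parameters"], candidate_tag["parameters"]
    shared = set(query_params) & set(cand_params)
    if shared:
        agreeing = sum(query_params[k] == cand_params[k] for k in shared)
        score += 0.4 * agreeing / len(shared)
    return score

def retrieve_best(query_tag: dict, candidates: list):
    """Rank all candidates and return the highest-scoring one with its score."""
    ranked = sorted(candidates, key=lambda c: score_candidate(query_tag, c["tag"]), reverse=True)
    best = ranked[0]
    return best, score_candidate(query_tag, best["tag"])

if __name__ == "__main__":
    query = {"label": "sofa", "parameters": {"color": "grey", "seats": 3}}
    stored = [
        {"asset_id": "sofa-001", "tag": {"label": "sofa", "parameters": {"color": "grey", "seats": 3}}},
        {"asset_id": "sofa-002", "tag": {"label": "sofa", "parameters": {"color": "blue", "seats": 2}}},
        {"asset_id": "lamp-014", "tag": {"label": "lamp", "parameters": {"color": "grey"}}},
    ]
    print(retrieve_best(query, stored))  # -> sofa-001 with the top score
```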
[0059] FIG. 5 shows an object segmentation model for segmenting objects within an environment. The object segmentation model 500 may include an image processing model 502 which may include a 3D model generation module 504, a validation module 506, and a parameter generation module 508. The object segmentation model 500 may include weight initialization 510. The image processing model may receive text input 512, images and/or video input 514, and/or context data input 516.
[0060] As described above, the image processing model 502 may receive environment data and be trained to output a 3D model including a 3D representation of an environment along with 3D representations of objects within an environment. The image processing model 502 may, for example, utilize a NeRF methodology with a neural network. In some embodiments, the image processing model 502 may incorporate other generative AI models capable of contributing to the production of 3D models based on user input. For example, there may be a scenario where a user only provides input text 512 via the prompt described with respect to FIG. 4. In such a case, a generative AI model that uses natural language processing to generate digital images may be used to feed digital images to a 3D generation model 504. [0061] The image processing model 502 may include a 3D generation model 504 that may be trained to receive one or more 2D images as input 514 and, based on the images, output a 3D model. In some examples, additional data related to context data input 516 and text input 512 may be fed to the image processing model 502 to improve accuracy. For example, the image processing model 502 may be trained to receive inputted 2D images that capture different viewing angles of a single environment and/or objects within an environment and, based on this input, reconstruct a 3D model. A generated 3D model may depict viewing angles not previously captured by the inputted 2D images, which may allow a user to rotate the 3D model, allowing the user to view the space or environment from different perspectives. The 3D generation model 504 may employ a neural network that is trained to use information from pixels along rays associated with different viewing angles captured
from the plurality of input images to assign weights and biases that output the color and volume density of that pixel within a 3D model. Additional details regarding exemplary generative models within the scope of this disclosure are described below, including with respect to FIG. 6.
[0062] A validation module 506 of the image processing model 502 may be used to provide feedback to the system on how the current configuration of the image processing model 502 is performing. In some examples, a sample of the 3D model outputted by the image processing model 502, which may be a digital twin (as described further below with respect to FIG. 18), may be generated and used by the validation module 506 to test how the image processing model 502 is performing. Testing the sample may involve providing manual feedback as to whether the image processing model 502 outputted a correct 3D model or an incorrect 3D model. The results of testing the sample may be analyzed and applied to make improvements to the image processing model 502. For example, the system may send instructions to the image processing model 502 to update weights and biases of the neural network.
[0063] The parameter generation module 508 of the object segmentation model 500 may perform additional analysis to generate one or more parameters of the 3D representations of the objects, which are described herein. For example, by analyzing the color(s) of an object shown in an input image or a 3D representation, a parameter may be generated indicating a color scheme of the object. Similarly, by analyzing a 2D projection, via one or more volume rendering techniques, of the 3D representation outputted by the model, a parameter associated with the size of the object may be generated.
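By way of illustration only, the following Python sketch shows one possible way such parameters could be derived: a coarse color-scheme parameter from an object's pixels and a size parameter from a top-down projection of its 3D representation. The function names, bin sizes, and data layout are hypothetical assumptions rather than values taken from the disclosure.

```python
import numpy as np

def dominant_color(pixels: np.ndarray, n_bins: int = 4) -> tuple:
    """Estimate a coarse color-scheme parameter by quantizing RGB values and
    returning the most frequent bin center. `pixels` is an (N, 3) array of
    RGB values in [0, 255]."""
    quantized = (pixels // (256 // n_bins)).astype(int)            # coarse color bins
    bins, counts = np.unique(quantized, axis=0, return_counts=True)
    center = bins[counts.argmax()] * (256 // n_bins) + (256 // (2 * n_bins))
    return tuple(int(c) for c in center)

def projected_size(points: np.ndarray) -> dict:
    """Approximate an object's footprint from a top-down (x, y) projection of
    its 3D representation. `points` is an (N, 3) array of model coordinates."""
    mins, maxs = points[:, :2].min(axis=0), points[:, :2].max(axis=0)
    width, depth = maxs - mins
    return {"width": float(width), "depth": float(depth), "area": float(width * depth)}
```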
[0064] In some embodiments, the object segmentation model 500 may include a classifier model 518 which may optionally be the same as the classifier model 902 implemented by the object removal module 900 described with respect to FIGS. 9 and 10. For example, a 3D model generated based on user inputs (e.g., images) may be input to a classifier model 518 trained to assign labels to each 3D representation of an object within the 3D model. Additional details are described below, including with respect to FIGS. 7 and 9.
[0065] FIG. 6 shows an exemplary pipeline for processing images and generating 3D representations of environments containing one or more objects. It should be noted that the 3D model generation module 602 described with respect to FIG. 6 is merely an example of a generative AI model and other generative AI models capable of outputting a 3D model from input images and/or other data may be used instead. The image processing model 600 may include a 3D model generation module 602 which may employ a coordinate sampling module
604, a neural network 606, a reconstruct domain module 608, a mapping module 610, and/or a sensor domain module 612. The modules mentioned above may be used to train a neural network capable of outputting a 3D model from a multitude of 2D images and/or data related to the 2D images, such as text input data and/or context input data.
[0066] The coordinate sampling module 604 may sample the coordinates of a scene of the 3D model. As mentioned in FIG. 5, the multitude of inputted images may capture different viewing angles of the same environment. Rays that move along a hypothetical z-axis of the 2D images may be generated for each pixel of the 2D images. By using the rays, the coordinate sampling module 604 may derive coordinate values (x, y, z) for each pixel of the corresponding 2D image, including the pixels along the hypothetical z-axis. Additionally, the viewing angle associated with the respective image may be derived and included in the coordinate values for each pixel, resulting in an input of (x, y, z, θ, φ) to be fed to the neural network 606.
[0067] The coordinate values may be fed into the neural network 606, which may be a fully connected neural network designed to output color components (r, g, b) for each pixel as well as a volume density for each pixel. The volume density may be used to indicate whether an object is present for the given coordinate values in the scene or if the coordinate values are associated with empty space.
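As a purely illustrative sketch of the kind of mapping described above, the following Python code samples coordinate values along a single ray and passes them through a toy fully connected network that emits color components (r, g, b) and a volume density for each sample. The layer sizes and random weights are placeholders for a trained network; none of the names are drawn from the disclosure.

```python
import numpy as np

def sample_ray(origin, direction, n_samples=8):
    """Sample (x, y, z) coordinates along a ray cast through a pixel and pair
    each sample with the ray's viewing direction (theta, phi)."""
    ts = np.linspace(0.1, 4.0, n_samples)                       # depths along the ray
    xyz = origin[None, :] + ts[:, None] * direction[None, :]    # (n_samples, 3)
    theta = np.arctan2(direction[1], direction[0])
    phi = np.arccos(direction[2] / np.linalg.norm(direction))
    angles = np.tile([theta, phi], (n_samples, 1))
    return np.concatenate([xyz, angles], axis=1)                # (n_samples, 5): x, y, z, theta, phi

def radiance_field(inputs, weights):
    """Toy fully connected network mapping (x, y, z, theta, phi) to
    (r, g, b, sigma); trained weights would replace the random ones below."""
    h = np.tanh(inputs @ weights["w1"])                         # hidden layer
    out = h @ weights["w2"]                                     # (n_samples, 4)
    rgb = 1.0 / (1.0 + np.exp(-out[:, :3]))                     # colors in [0, 1]
    sigma = np.maximum(out[:, 3], 0.0)                          # non-negative volume density
    return rgb, sigma

weights = {"w1": np.random.randn(5, 32) * 0.1, "w2": np.random.randn(32, 4) * 0.1}
samples = sample_ray(np.zeros(3), np.array([0.0, 0.0, 1.0]))
rgb, sigma = radiance_field(samples, weights)
```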
[0068] A reconstruct domain module 608 may receive the information outputted from the neural network 606. Based on these values, pixels along the multitude of rays extending via a hypothetical z-axis now may have outputted color components and volume density, and by leveraging information related to the different rays, a scene of the 3D model may be reconstructed. For example, if a pixel associated with a first ray is associated with a zero volume density and the same pixel associated with a second ray is associated with a zero volume density, in other words the coordinate value for the pixel represents empty space within the scene, the model may be more confident in indicating that this pixel in the 3D model should represent empty space.
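One conventional way to combine per-sample colors and densities along a ray into a reconstructed pixel is volume-rendering quadrature; the short sketch below, continuing the toy values from the preceding sketch, is illustrative only and is not asserted to be the implementation of module 608.

```python
import numpy as np

def composite_pixel(rgb, sigma, ts):
    """Alpha-composite per-sample colors and densities along one ray into a
    single pixel color using standard volume-rendering quadrature."""
    deltas = np.diff(ts, append=ts[-1] + 1e10)         # spacing between samples
    alpha = 1.0 - np.exp(-sigma * deltas)              # opacity of each sample
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = transmittance * alpha                    # contribution of each sample
    return (weights[:, None] * rgb).sum(axis=0)        # final (r, g, b)

pixel_rgb = composite_pixel(rgb, sigma, np.linspace(0.1, 4.0, 8))
```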
[0069] A mapping module 610 may map the reconstructed sample back to the original 2D images used as input. The sensor domain module 612 may determine whether the reconstructed sample exceeds a threshold, which may indicate that the 3D model is accurate enough to be used by the system. In some examples, a reconstruction error may be calculated and used as feedback to optimize the neural network 606.
[0070] FIG. 7 shows an exemplary input and output from an object segmentation model (e.g., object segmentation model 500). As shown, environment data may include one or
more source image(s) 700. As described above, environment data may include a plurality of 2D images and/or video. Environment data may include text input data and/or context input data. The environment data may be fed to an image processing model (e.g., image processing model 502 described with respect to FIGS. 5 and 6). The image processing model may generate an output 702 with tags for each 3D representation depicting an object within the 3D model. The tags may include one or more parameters and a label indicative of the object type that the 3D representation represents. For example, a label named “couch” may be generated and linked to a 3D representation of the object depicting a couch from the source image. Additionally, a size of the couch, for example, the dimensions of the couch relative to the environment, may be generated and linked to the 3D representation of the couch. An additional flag may be set for each of the 3D representations that may indicate whether the object that the 3D representation represents is a structure object or a non-structure object, which may be described further with respect to FIGS. 9-12.
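For illustration, a tag of the kind described for the couch in FIG. 7 could be represented as a simple structure such as the following; every field name and value here is a hypothetical example.

```python
# Hypothetical tag attached to the 3D representation of the couch from the source image.
couch_tag = {
    "label": "couch",                       # object type assigned by the model
    "parameters": {
        "size": {"width_m": 2.1, "depth_m": 0.9, "height_m": 0.8},
        "color_scheme": "charcoal gray",
        "pattern": "solid",
    },
    "is_structure": False,                  # flag: structure vs. non-structure object
}
```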
[0071] FIG. 8 shows an exemplary method for segmenting objects within a 3D representation of an environment. At 802, environment data may be received depicting an environment that contains one or more objects. The environment data may include a multidimensional visualization of the environment. Using an image processing model, the environment data may be processed. The image processing model may be configured to do one or more of the following. At 804, it may generate a 3D model of the environment that includes a plurality of 3D representations of the one or more objects and a 3D representation of the environment, where the 3D representations of the one or more objects are independently manipulable relative to the 3D representation of the environment. At 806, the model may determine a plurality of parameters for one or more of the 3D representations of the one or more objects based on photogrammetry analysis. At 808, the model may, for one or more of the 3D representations of the one or more objects, associate the respective 3D representation of the respective object with a label indicating an object type and one or more of the plurality of parameters determined to be associated with the object. The label and the plurality of parameters may be used to query a database to find a matching 3D representation of the object. The model may output the labels and parameters for the 3D representations of the one or more objects.
[0072] FIG. 9 shows an exemplary object removal module 900 for removing objects from a 3D representation of an environment. The object removal module 900 may include a classifier model 902, which may receive images or a model of an environment and classify one or more objects within the environment. In some embodiments, the classifier model may
be or may operate similarly to the object segmentation models described herein. The classifier model may apply labels to objects within the room identifying what the objects are. [0073] Each classified object may also be determined to be a structural element within the environment which cannot be readily removed or a non-structural element that can be readily removed. In some embodiments, the system may include a set of rules and/or a lookup table 914 with a list of possible objects and whether they should be treated as structure or non-structure. For example, a wall, floor, light fixture, or fireplace may be classified as a structural element that cannot be readily removed and should be included in renderings of the environment with objects removed. Conversely, a table, couch, or painting may be classified as a non-structural element that can be readily removed, and if the system generates a rendering of the environment with objects removed, such non-structural elements should be removed from that rendering. Each of the objects classified by the classifier model may thus be assigned a label indicating whether the object is structure or non-structure by structure label module 916.
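A minimal sketch of such a rules/lookup-table approach is shown below; the table contents and function names are illustrative assumptions rather than the actual rule set.

```python
# Illustrative lookup table; the actual rule set could differ.
STRUCTURE_LOOKUP = {
    "wall": True, "floor": True, "light fixture": True, "fireplace": True,
    "table": False, "couch": False, "painting": False,
}

def assign_structure_labels(classified_objects):
    """Label each classified object as structure (not readily removable) or
    non-structure (removable); unknown types default to non-structure."""
    return {
        obj: ("structure" if STRUCTURE_LOOKUP.get(label, False) else "non-structure")
        for obj, label in classified_objects.items()
    }

labels = assign_structure_labels({"obj_1": "wall", "obj_2": "couch", "obj_3": "table"})
```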
[0074] The object removal module 900 may include a generative AI model 918 that is trained to fill excised portions of an image, a 3D model, or a rendering thereof. In some embodiments, the object removal module 900 may excise portions of the environment that correspond to objects that have been flagged as non-structure and are therefore deemed to be removable objects. Excising these portions may leave gaps, which the generative AI model 918 may be trained to fill using inferences based on the characteristics (e.g., appearance, material, and shape) of surrounding structural elements (e.g., floor and walls). In some examples, the material recognition model described with respect to FIG. 21 may be called to determine such characteristics.
[0075] In some embodiments, the generative AI model 918 may fill excised portions of an image, a video, a 3D model, or rendering thereof by generating filler portions that the model has been trained to recognize as appropriate and realistic when viewed within the context of that position in the environment. In some embodiments, the generative AI model 918 may generate an initial filler portion, which may be, for example, random noise, a static filler, or some other starter data. The generative AI model 918 may then iteratively modify the content of the filler portion until the 3D representation is determined to resemble a realistic environment. This process of iteratively modifying and scoring the resulting 3D representation may be performed by a neural network trained using large volumes of content. [0076] FIG. 10 shows an exemplary method for displaying an environment in which objects have been optionally removed based on user inputs. As described with respect to FIG.
9, the object removal module 1000 may be executed by the system to generate a 3D model of an environment where 3D representations representing removable objects have been excised and replaced with filler portions. Such a feature may allow a user to envision a space without any furnishings (i.e., furniture or other decor) within the space. The user may upload a plurality of images capturing, for example, the corners of the space to the system. The images may depict one or more furnishings within the space. The system may use the images as input to the object segmentation model and subsequently the object removal module 1000, both of which are described above. The object removal module 1000 may output a generated 3D model of the space with every 3D representation of a furnishing removed. This modified version of the 3D model may be stored in the system storage database. Additionally, an original 3D model, for example with all of the 3D representations, may have been generated by the object segmentation model and stored in the system data storage, for example, in a searchable database. In some examples, the modified version of the 3D model and the original version of the 3D model may be linked in the database. A GUI linked to the system may prompt 1002 the user who uploaded the images of the space asking whether he or she wants to see the space with or without the furnishings. If the user prefers to see furnishings, the system may pull the original 3D model and display a GUI to the user depicting the original 3D model. If the user prefers not to see furnishings, the system may pull the modified 3D model and display a GUI to the user depicting the modified 3D model without any furnishings.
[0077] FIG. 11 shows an exemplary method for generating a rendering of an environment in which objects have been removed. At 1102, environment data may be received depicting an environment that contains a plurality of objects. The environment data may include a multi-dimensional visualization of the environment. The plurality of objects may include one or more structure objects and one or more removable objects. Using an image processing model, the environment data may be processed. At 1104, the image processing model may be configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations. At 1106, based on the tags associated with each of the 3D representations, it may be determined which of the 3D representations are associated with structure objects and which of the 3D representations are associated with non-structure / removable objects. At 1108, using a generative AI model, a rendering of the environment may be generated in which one or more portions associated with the removable objects are excised and replaced with a filler portion that is generated
based on an inference of one or more characteristics associated with material surrounding the removable object.
[0078] FIG. 12 shows an exemplary method for retrieving a 3D representation representing an object that matches an object indicated by a user. At 1210, vendor 1 1200 may link the vendor website to the system and may provide access credentials so that the system, or system management, may read data related to products advertised on the website, which may be described with respect to FIGS. 1 and 2. At 1212, vendor 2 1202 may directly upload to the system image(s) of products that the vendor wishes to sell. Data from both vendor 1 1200 and vendor 2 1202 may be processed via the asset collection protocol 1214 described with respect to FIG. 2 whereby 3D assets depicting 3D models may be generated and stored in the system data storage 1208. As mentioned earlier, these models may be used as candidate models that may be compared to the output of the object segmentation model 1218 to retrieve a model or a 3D representation within a model that a user might be interested in. At 1216, a user 1204 may upload one or more images depicting an object that the user is seeking to input to the system, which may be described with respect to FIG. 3. The system may execute the object segmentation model 1218 described with respect to FIGS. 4-6 to generate a 3D model containing a 3D representation of the environment and of any objects contained within the environment captured in the image(s). Additionally, the object segmentation model 1218 may generate a tag comprising a label and one or more parameters and assign it to the 3D model, which is described with respect to FIGS. 5 and 6. At 1220, the 3D model along with the tags may be sent to the system data storage 1208. Upon receiving the 3D model which includes a 3D representation of an object that the user 1204 is seeking, the system may execute a match analysis model 1224. As described with respect to FIG. 13, the match analysis model 1224 may perform match analysis and compare the respective 3D representations of the 3D model with the candidate 3D representations of the candidate 3D models. The output of the match analysis model 1224 may be a 3D model containing a 3D representation that most closely resembles the object that the user 1204 is seeking based on the image(s) uploaded by the user. At 1222, the match may be sent for display via a GUI on the user’s device.
[0079] FIG. 13 shows an example of a model retrieval engine 1300 configured to retrieve a 3D representation of an object that matches an object indicated by a user. The model retrieval engine 1300 may include a match analysis model 1302, which may include a tag generation module 1304 and a scoring module 1306. The match analysis model 1302 may have read and write permissions with respect to the system data storage 1308.
[0080] Upon the object segmentation model completing the generation of a 3D model depicting content uploaded by a user, the system may instruct the model retrieval engine 1300 to execute the match analysis model 1302 to find a candidate 3D asset that matches the generated 3D model. Additionally, the system may provide data related to the generated 3D model to the model retrieval engine 1300.
[0081] In some examples, the tags, which may include a label and one or more parameters as described above, may be generated by the object segmentation model as the segmentation model generates the 3D model. In some examples, the object segmentation model may only generate the 3D model. In such a case, the match analysis model 1302 may include a tag generation module 1304 that is configured to perform an analysis, for example by using photogrammetry analysis techniques, on the generated 3D model to generate tags for each 3D representation contained within the 3D model. For example, the tag generation module 1304 may recognize that a 3D model contains a first 3D representation of an object. The module 1304 may tag the 3D representation of the object with tag 1, which may involve assigning a label to the 3D representation as well as determining one or more parameters for the object. For example, parameters X1 through N1 may be associated with tag 1 and may describe characteristics of the object uploaded by the user. In some examples, context analysis data may be provided by a context analysis module 1310 and may be later used by the scoring module 1306.
[0082] Logic of the system may instruct the match analysis model 1302 to execute a scoring module 1306. The scoring module 1306 may receive the generated tag, for example tag 1 from the tag generation module 1304, and may request from the system data storage 1308 information related to each of the candidate 3D models stored. In some examples, the scoring module 1306 may include in the request message, for example in a header of the message, an indication of the label of tag 1 and may instruct the system data storage 1308 to only return candidate tags that match the label. For example, candidate tag 1 may have an assigned label that matches the label of tag 1. In such a case, the system data storage 1308 may send, via a message, candidate tag 1 and the parameters of candidate tag 1 to the scoring module 1306.
[0083] The scoring module 1306 may perform match analysis that compares the candidate tag with the generated tag, for example, that compares candidate tag 1 and tag 1. The scoring module 1306 may identify, for example based on string matching, that parameter X1 of tag 1 and parameter C1 of candidate tag 1 relate to the same characteristics. For example, both parameters may contain values pertaining to the size of their respective objects. The scoring
module 1306 may generate a score based on a degree by which these parameters match. For example, if both parameters X1 and C1 convey that the size of their respective object is 50 sq. ft., a score representing a high confidence value may be generated and stored. If, however, parameter X1 conveys that the size of its respective object is 50 sq. ft. and parameter C1 conveys that the size of its respective object is 5 sq. ft., then a score representing a low confidence value may be generated and stored. Generating the confidence scores may occur for each candidate tag sent to the scoring module, where every parameter of each candidate tag is compared to a corresponding parameter of the generated tag.
[0084] Each score may be stored by the scoring module 1306 and linked to the parameters of the generated tag, for example, a confidence score may be linked to each of parameters X1 through N1. As mentioned above, every candidate tag sent to the scoring module 1306 may have confidence scores measuring how its parameters match the generated tag parameters. To determine which candidate 3D representation most closely resembles the object uploaded by the user, the scoring module 1306 may perform an operation on the confidence scores and may return the candidate 3D representation with the highest overall confidence score. In some examples, to be selected as the matching candidate 3D representation, every confidence score of the parameters must be above a threshold value.
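The following Python sketch illustrates, under simplifying assumptions (numeric parameters, a simple ratio-based similarity, and an average as the aggregation operation), how per-parameter confidence scores could be generated, thresholded, and aggregated to select the best candidate; it is not asserted to be the scoring module 1306 itself.

```python
def parameter_score(value, candidate_value):
    """Confidence that two numeric parameter values describe the same
    characteristic; 1.0 for an exact match, approaching 0 as they diverge."""
    return 1.0 - abs(value - candidate_value) / max(value, candidate_value, 1e-9)

def match_candidate(tag, candidate_tags, threshold=0.6):
    """Score every candidate whose label matches the generated tag and return
    the best one, provided each of its parameter scores clears the threshold."""
    best_id, best_score = None, -1.0
    for cand_id, cand in candidate_tags.items():
        if cand["label"] != tag["label"]:
            continue
        scores = [parameter_score(tag["parameters"][k], cand["parameters"][k])
                  for k in tag["parameters"] if k in cand["parameters"]]
        if not scores or min(scores) < threshold:
            continue
        overall = sum(scores) / len(scores)            # aggregate confidence
        if overall > best_score:
            best_id, best_score = cand_id, overall
    return best_id, best_score

tag = {"label": "couch", "parameters": {"size_sqft": 50.0}}
candidates = {"cand_1": {"label": "couch", "parameters": {"size_sqft": 50.0}},
              "cand_2": {"label": "couch", "parameters": {"size_sqft": 5.0}}}
print(match_candidate(tag, candidates))                # -> ('cand_1', 1.0)
```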
[0085] FIG. 14 shows an exemplary method for determining a model that matches an object indicated by a user. At 1402, a plurality of two-dimensional (2D) representations of at least one object may be received. At 1404, based on the plurality of 2D representations, a 3D representation may be generated for the object. At 1406, one or more tags may be generated for the 3D representation of the object, where the tags may include a label indicating an object type and a plurality of parameters indicating assessed characteristics of the object. At 1408, matching analysis may be performed comparing the one or more tags against tags for a plurality of previously registered objects. At 1410, a confidence score may be generated indicating a confidence that a previously registered object is a match for the object. Based on the matching analysis and the confidence score, the previously registered object may be selected for display to a user.
[0086] FIG. 15 shows an exemplary process for segmenting input data and generating space plans using an object placement model. At 1500, user input data, for example environment data, may be received by the system as described with respect to FIGS. 2-4. As mentioned above, the input may be one or more images depicting a space in a property, for example, an existing property.
[0087] At 1502, the images, or other environment data, may be used as input to the object segmentation model, which may perform processes as described with respect to FIGS. 4-6 to generate a 3D model which includes a 3D representation of an environment of the physical space as well as 3D representations of one or more objects within the physical space, which may be shown at 1504. Additionally, the system may instruct the object segmentation model that the input data is to be used for the object placement model. In such a case, the object segmentation model may generate and provide additional information to the object placement model such as context data related to the context of the scene that the 3D model represents. [0088] At 1506, the 3D model may be used as input to the object placement model. The object placement model may be a generative AI object placement model trained to generate a 2D floorplan representation of the inputted 3D model, which may be described further with respect to FIG. 17. As shown at 1508, the 2D floorplan representations may depict a bird's-eye view of the layout of the 3D model, including how the 3D representations are to be arranged. For example, the output of the trained generative AI object placement model may be a proposed arrangement of the 3D representations corresponding to the objects. The object placement model may modify the 3D model generated by the object segmentation model by placing the 3D representations within the 3D model according to the proposed arrangement. Additionally, as shown at 1508, a 3D sample with the proposed arrangement may be generated using the 2D floorplan as well as information outputted from the object segmentation model. The 3D sample may be used to validate whether an arrangement is acceptable. In some examples, the 3D sample may be used to score the proposed arrangement.
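By way of example only, a proposed arrangement and a simple score for it might be represented as follows; the dictionary fields, dimensions, and scoring rule are hypothetical stand-ins for the model's actual output and for the 3D-sample validation.

```python
# Hypothetical encoding of a proposed arrangement produced by the object
# placement model: each entry places one 2D footprint within the floorplan.
proposed_arrangement = [
    {"label": "couch",        "x": 0.5, "y": 3.0, "width": 2.1, "depth": 0.9},
    {"label": "coffee table", "x": 1.2, "y": 1.6, "width": 1.0, "depth": 0.6},
    {"label": "armchair",     "x": 3.4, "y": 1.0, "width": 0.9, "depth": 0.9},
]

def arrangement_score(arrangement, room_width, room_depth):
    """Toy score for a proposed arrangement: penalize footprints that fall
    outside the room; a 3D sample check would refine this score."""
    inside = sum(1 for o in arrangement
                 if 0 <= o["x"] and o["x"] + o["width"] <= room_width
                 and 0 <= o["y"] and o["y"] + o["depth"] <= room_depth)
    return inside / len(arrangement)

print(arrangement_score(proposed_arrangement, room_width=5.0, room_depth=4.0))  # -> 1.0
```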
[0089] FIG. 16 shows an exemplary method for outputting a 3D model with an arrangement of 3D representations indicated by a user. At 1610, a 3D model may be generated by and obtained from the object segmentation model described with respect to FIGS. 4-6. Tags that include labels and one or more parameters may be assigned to each 3D representation depicted in the 3D model. In some examples, prior to the object placement model obtaining the 3D model, the model may be passed through the object removal model described with respect to FIGS. 9-11 and flags indicating whether each 3D representation relates to a structure or removable object may be set and included in the tags. This flag may be later used by the staging module when determining which arrangements to present to the user. In some examples, a user may be able to set the flags based on which objects the user wants to remain constant in the space, i.e., every arrangement must include the constant object.
[0090] At 1612, the 3D model along with information related to the 3D model such as the respective tags may be sent to the conversion module which may generate and/or convert the 3D model to a top-down 2D floorplan representation. The 2D floorplan may include tags (e.g., the assigned labels) for 2D representations associated with each 3D representation contained in the original 3D model. Generating the floorplan representation may be based on a rules-based algorithm that is configured with a set of rules. The set of rules may be generated based on, for example, text or other input that conveys what constitutes an acceptable layout for a given space. In some examples, the input text may be fed to a natural language processing model that may generate the rules. A rule, for example, may define the relationship between the dimensions of the 2D representations with respect to the total dimensions of the 2D floorplan.
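As an illustrative sketch of one such rule, the check below relates the combined footprint of the 2D representations to the total floorplan area; the threshold and field names are assumptions, not values from the disclosure.

```python
def footprint_ratio_rule(representations, floorplan_area, max_ratio=0.4):
    """Example rule: the combined footprint of the 2D representations should
    not exceed a set fraction of the total floorplan area."""
    occupied = sum(r["width"] * r["depth"] for r in representations)
    return (occupied / floorplan_area) <= max_ratio

layout_ok = footprint_ratio_rule(
    [{"width": 2.1, "depth": 0.9}, {"width": 1.0, "depth": 0.6}],
    floorplan_area=20.0,
)  # (1.89 + 0.6) / 20 ≈ 0.12 -> True
```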
[0091] At 1614, the 2D floorplan may be sent to the staging module. The 2D floorplan may have a default arrangement of the 2D representations, which may match the arrangement of the 3D representations contained in the original 3D model. The staging module may implement the trained generative AI placement model described with respect to FIG. 17 to output alternative 2D floorplan representations, where each alternative depicts a different arrangement of the same 2D representations. In some examples, the trained model may utilize one or more rules-based algorithms in order to validate that all of the alternative floorplan representations are compliant. In some examples, each alternative may have a reconstructed 3D sample (e.g., 3D digital twin) that may be tested by the system to validate that the system can produce the floorplan in question. Validating whether the system can produce the floorplan may involve the floorplan passing a multitude of validation checks, which may include whether the spacing dimensions and overall layout of the floorplan in question are satisfactory, how many of the depicted 2D representations match the 3D representations in the original model, and, if an alternative 2D representation was used, how closely it matches the 3D representations in the original model, and/or the like. Passing a check may involve the system comparing scores (e.g., confidence scores) to respective thresholds. In some examples, the trained model may generate alternative floorplans with 2D representations associated with objects not in the image(s) uploaded by the user. For example, the model may have access to personal asset collections of the user which may be described with respect to FIGS. 2 and 3. One or more of the personal asset collections may include a 3D model of an asset that the user previously uploaded as inspiration. The model may be trained to determine which assets from the personal asset collection should be used in the staging of a given space.
[0092] In some examples, the object placement model may output alternative representations where each representation satisfies a design criterion known to be valued by users. For example, the object placement model may be trained to output a first representation whose arrangement allows maximum movement within a given space and a second alternative representation whose arrangement allows maximum functionality of a given space. The alternative 2D floorplan representations may be presented for display in a GUI of the user device so that the user may view and select the floorplan he or she likes the most. In some examples, feedback may be generated based on the selection made by the user and used by the generative AI placement model to improve its neural network.
[0093] At 1616, after the user selects the 2D floorplan representation he or she likes, the original 3D model may be modified with the arrangement in the selected 2D floorplan and graphically rendered to be suited for display to the user. For example, the original 3D model may maintain the same 3D representation of the space, but may populate the original 3D model with an arrangement of 3D representations corresponding to the objects that matches the selected arrangement. In some examples, auto-staging may be performed by the object placement model which may involve the system automatically selecting the ideal floorplan representation among the alternative representations for the user and begin generating the modified 3D model. In some examples, the graphical rendering of the modified 3D model may be edited in real-time by the user, which requires the system to generate a 3D digital twin of each edited 3D model so that it can validate that the edit is allowable. Types of editing that may be performed may include swapping a 3D representation for a different 3D representation stored in a database accessible by the user, rearranging the 3D representations within the 3D model, deleting a 3D representation from the 3D model, and/or the like.
[0094] FIG. 17 shows an exemplary generative AI object placement model trained to output a 3D model with a proposed arrangement of 3D representations associated with objects within a space. The generative AI object placement model may include a pre-training model that may be trained on existing floorplans 1708, which may be retrieved from a floorplan data storage 1700 linked to the system, and from images uploaded by users and converted to floorplans 1702. The placement model may include a 2D to 3D interpolation module 1710 and an image interpolation module 1706. The placement model may include a 3D scene interpolation data storage 1712. The output may be used by the staging module 1716 to stage a proposed arrangement of alternative 2D floorplan representations for display. [0095] The generative AI object placement model may be trained via the pre-training model. The objective of the pre-training model may be to output a 2D floorplan
representation that has an acceptable layout that satisfies the user’s requests. The pre-training model may be trained using sample spaces as input. The sample spaces may show spatial dimensions of the respective sample space and/or human-selected object placements within the respective space’s existing floorplans.
[0096] A type of sample space that may be used as input to train the pre-training model may be existing floorplans, as shown at 1708, which may be retrieved from a database contained within a floorplan data storage 1700. The existing floorplans may include data and related 2D images depicting a bird's-eye view of layouts determined to be acceptable with respect to a given physical space. In some examples, the existing floorplans may be retrieved by the system when a builder (e.g., the candidate builder described with respect to FIG. 2) registers with the system. The builder may upload or otherwise provide the system with access to a log of previous floor plans used by the builder. For example, the builder may keep a record of schematics such as blueprint drawings he or she used for a given project. In some examples, the builder may upload the drawings to the system using the application described with respect to FIG. 2.
[0097] A type of sample space that may be used as input to train the pre-training model may be 2D floorplan representations converted from images uploaded by users registered to the system, as shown at 1702. For example, each time the system receives an uploaded image, it may generate a 2D floorplan representation using the techniques described with respect to FIG. 16. Additionally, the system may provide the representation as input to improve the neural network of the trained model, for example by adjusting biases according to this new information. Since the model is able to be continuously trained based on new uploads, it may be able to output 2D floorplans that correspond with one or more current design trends. In some examples, determining the alternative 2D floorplans with different arrangements of 2D representations that should be outputted by the neural network may involve a manual scoring of the 2D floorplans that may act as feedback to adjust biases of one or more nodes of the neural network. In some examples, the model may run a script that generates a score assessing whether the neural network outputs alternative floorplans that are likely to be selected by the user and are able to be produced by the system.
[0098] The model may be trained to use a generated 2D floorplan representation as input and to output an interpolation of a 3D model. Information related to such interpolations may allow the system to provide a 3D rendering of the selected 2D floorplan, as described with respect to FIG. 16. Floorplans retrieved from the floorplan data storage 1700 may be
interpolated via the 2D to 3D interpolation module 1710, while floorplans converted from new user uploads may be interpolated via the image interpolation module 1706.
[0099] Information related to both modules described above may be sent to and stored in the 3D scene interpolation data storage 1712. Information related to such interpolations may allow the system to provide a 3D rendering of the selected 2D floorplan, as described with respect to FIG. 16. For example, when the object placement model enters the staging phase described with respect to FIG. 16, the system may send interpolation information to the staging module 1716 in anticipation of a graphical rendering of a 3D model modified based on the user’s selection of the floorplan.
[0100] FIG. 18 shows an exemplary method for generating a 2D floorplan with a proposed arrangement of objects. At 1802, environment data may be received depicting an environment that contains one or more objects. The environment data may include a multi-dimensional visualization of the environment. Using an image processing model, the environment data may be processed. At 1804, the image processing model may be configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations of the objects. At 1806, the 3D model of the environment, or a derivative thereof, may be inputted to an object placement model. The object placement model may be a generative artificial intelligence model that has been trained based on a plurality of 2D floorplan representations for sample spaces showing at least spatial dimensions of the respective sample space and/or human-selected object placements within the respective space. At 1808, using the object placement model, a 2D floorplan representation may be generated showing a proposed arrangement of the plurality of objects within the environment. The 3D model of the environment may be modified by placing the 3D representations of the plurality of objects within the 3D model of the environment in accordance with the 2D floorplan representation generated by the object placement model.
[0101] FIG. 19 shows an example of a material recognition model trained to output an assessment of a surface material of an object shown in an image or multi-dimensional representation. The module 1900 may work in conjunction with an object segmentation model described herein. The module may include a feature extraction module 1912, a machine learning (ML) classifier 1916, and a material recognition model 1918. The material recognition module 1900 may receive content from a content library and use such content as input 1904. The material recognition model 1918 may be trained to output an object tag 1906 along with a material assessment index 1908 of the object tag based on information related to
a furnishing and a region of interest, both of which are received via a message from the object segmentation model.
[0102] The material recognition model 1918 may be trained to output an assessment of a surface material based on an input of one or more images depicting a furnishing. The training may occur via a pre-training model associated with the material recognition model 1918. The pre-training model may obtain training data which may include content stored in a content library 1904 linked to the system. In some examples, the system may send an instruction for the content, which may include one or more images uploaded to the system using the techniques described with respect to FIGS. 2 and 3, to be sent to the pre-training model via messages. The content used as the training data may include datasets of surface material images uploaded, for example, during the builder registration process described above.
[0103] The input data may be fed first through the object segmentation model described with respect to FIGS. 4-6. After segmentation is complete, feature extraction may be performed on the segmented images via a feature extraction module 1912. The feature extraction module 1912 may output characteristics related to the surface material of the inputted image. For example, the outputted characteristics may include color, texture, compactness, contrast, and/or the like. Feature extraction may include image processing techniques such as edge detection, corner detection, blob detection, and texture analysis, some of which may analyze pixel characteristics such as HSV (hue, saturation, value), RGB (red, green, blue), and LM (local mean). Feature extraction may also utilize rLM (run length matrix) techniques.
[0104] After having been processed through the feature extraction module 1912, the data may be fed to an ML classifier 1916 that may categorize the data into one or more classes related to a surface material. In such a case, the ML classifier 1916 may output one or more parameters based on the categorization, which may be collectively considered an assessment of the material. As described herein, these one or more parameters may be added to the tag of the generated 3D model associated with the given input image. Testing, which may involve manual scoring of the outputted material assessment, may be performed during the pretraining process and feedback may be generated to improve how accurately the ML classifier 1916 categorizes the data into the material classes.
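The following sketch illustrates the general flow, with a nearest-centroid classifier standing in for the trained ML classifier 1916; the feature set, material classes, and centroid values are hypothetical.

```python
import numpy as np

MATERIAL_CLASSES = ["leather", "fabric", "wood"]

def extract_features(pixels: np.ndarray) -> np.ndarray:
    """Very small feature vector for a region of interest: mean RGB plus a
    crude texture measure (standard deviation of pixel intensity)."""
    mean_rgb = pixels.mean(axis=0)                     # (3,)
    texture = pixels.mean(axis=1).std()                # intensity variation
    return np.append(mean_rgb, texture)

def classify_material(features, centroids):
    """Nearest-centroid stand-in for the trained ML classifier: assign the
    material class whose centroid is closest in feature space."""
    distances = np.linalg.norm(centroids - features, axis=1)
    return MATERIAL_CLASSES[int(distances.argmin())]

# Centroids here are placeholders; in practice they would come from training
# on sample images coded with known surface materials.
centroids = np.array([[90.0, 60.0, 40.0, 10.0],     # leather
                      [150.0, 140.0, 130.0, 30.0],   # fabric
                      [120.0, 80.0, 50.0, 20.0]])    # wood
roi_pixels = np.random.randint(0, 256, size=(500, 3)).astype(float)
material_parameter = classify_material(extract_features(roi_pixels), centroids)
```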
[0105] The material recognition model 1918 may receive information related to a given furnishing along with a region of interest, both of which may be identified by the object segmentation model. The region of interest may be a region in the image depicting the furnishing that contains an object to be analyzed by the material recognition model 1918. In
some examples, no region of interest may be indicated in the message received from the object segmentation model and, in such a case, the entire image of the furnishing may be used as input to the material recognition model 1918. The material recognition model 1918 may output a tag associated with the furnishing (e.g., which may be an input sent from the object segmentation model) as well as a material assessment parameter linked to the tag.
[0106] FIG. 20 shows an exemplary method for matching objects based, at least in part, on an assessment of a surface material of a furnishing. At 2002, one or more images of a furnishing may be received. At 2004, the one or more images, or derivatives thereof, may be inputted to a material recognition machine learning model. At 2006, the material recognition machine learning model may be trained to generate an assessment of a surface material of a furnishing by analyzing a volume of sample images coded with known surface materials. Using the material recognition machine learning model, an assessment of a surface material of the furnishing shown in the one or more images may be generated.
[0107] FIG. 21 shows an exemplary system architecture for obtaining and managing content to be used for training a designer model 2112 to generate a multi-dimensional representation of a physical space. As shown, the architecture may include one or more of the following: one or more designer data sources (designer A data source(s) 2100 through designer N data source(s) 2102); a data management platform 2104; a tag generation module 2106 comprising at least an object segmentation model, a parameter generation module, a conversion to floorplan module, and/or other models/modules described above such as the object removal model, the material recognition model, etc.; a profile manager 2108; a system data storage 2110; and/or a designer model 2112 comprising at least a floorplan generator and an object recommendation engine.
[0108] The designer data sources 2100, 2102 may be sources that host content related to how particular designers design physical spaces. The designer model 2112 may be trained to output a design plan depicting a physical space, where the design plan stages the space as if a particular designer had staged it. In order to train such a model, content related to how the designer designed other spaces may be obtained from the data sources and used as training data. Examples of the content may include 2D images (e.g., pictures of a space previously designed by the designer), text or another medium by which information related to the design space is conveyed (e.g., a blog post describing the aesthetic of a space), multi-dimensional media (e.g., multi-modal data that may employ one or more data enrichment techniques such as object detection, facial recognition, metadata extraction for temporal or geolocation data, audio transcription, etc.) that add additional dimensions (e.g., 3D graphics or models of the
space such as an interactive video game or other simulation that depicts a simulated design space), and/or other media that can be useful in providing insights when determining the design style of an individual designer for given spaces. In some embodiments, some of the content may be similar or identical to the assets described in FIGS. 1-3. Examples of sources that can host such content may include third-party websites, social media platforms, blogs or other forums focused on publishing articles or videos, community boards (e.g., platforms where a community of members who like a designer’s design style can share media and engage in discussions pertaining to the designer’s aesthetic), and/or the like (other sources may include those involved when obtaining the assets as described in FIGS. 1-20). As for the example that is shown in FIG. 21, two designers of interest may be identified as designer A and designer N. Accordingly, designer A data sources 2100 and designer N data sources 2102 may contain respective sets of content such as images (as an example, 1st design image, 2nd design image...Nth design image). As will be described below, a data management platform 2104 may identify such sources and collect all relevant content.
[0109] While reference is made to examples in which users utilize specific designer profiles, there may be embodiments where users request that the inputted physical space be staged based on inputted content that is not designer-specific (e.g., an image or text uploaded by the user). In such a case, the designer model 2112 may include a default state that may generate a design plan based on the inputted content without considering aesthetic or other designer characteristics. Such a state may be entered, for example, when the user has not selected or otherwise indicated a designer profile. The relationship between the designer model 2112 and the default state is described further below.
[0110] The data management platform 2104 may be a platform of the system that identifies data sources (which may at times be referred to as external entities herein) that host content related to a designer stored in the system data storage 2110. The data management platform 2104 may coordinate data and/or instructions between various components of the internal environment along with the external entities, such as the tag generation module 2106, the profile manager 2108, the designer model 2112, the one or more data sources 2100, 2102, any user device registered to the platform 2104, and/or the like. The data management platform 2104 may operate via one or more servers (e.g., remote cloud-based servers) and may also host one or more data structures (e.g., which may be a part of the system data store 2110) responsible for storing training data or other data associated with the designer model 2112. The servers may include functionality similar to the servers described in FIGS. 1 and 2. For example, obtaining the content from the design data sources may involve one or more
techniques performed by the asset collection module which are described above (e.g., authenticating the website, providing identifying credentials, storing such credentials, scanning the contents of the webpages for relevant media, etc.).
[0111] In some embodiments, upon completing the authentication/authorization check, the data management platform 2104 may request permission from the data source (e.g., a website) to scan or crawl the contents (e.g., headers, HTML, script, files, behavior analysis...) of the pages of the website to retrieve any media related to a given individual/designer. Alternatively, the data management platform 2104 may provide APIs to the data source to facilitate data exchange, where the data source can upload data relevant to the desired content to the servers formatted such that the platform can parse the relevant information. As an example, once a designer data source passes an authentication/authorization check, the data management platform 2104 may provide authentication credentials (e.g., a token) that may indicate the identity of the data source. Accordingly, each time a data source transmits data to the platform 2104, it may include in a header the credentials so that the data management platform 2104 knows to which profiles the data pertains. As shown in FIG. 21, the credentials sent from designer A data source 2100 may indicate profile ID A while those sent from designer N data source 2102 may indicate profile ID N.
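Purely as an illustration of the kind of exchange described above, a data-source upload and the platform-side profile lookup might look like the following; the header names, token format, and URL are hypothetical and not drawn from the disclosure.

```python
# Hypothetical message a registered data source might send to the platform.
upload_message = {
    "headers": {
        "Authorization": "Bearer <designer-a-token>",  # credential issued after the auth check
        "X-Profile-ID": "profile_A",                   # which profile the data pertains to
    },
    "body": {
        "dataset_id": "dataset_1",
        "media_type": "image",
        "media_url": "https://example.com/designs/1st_design_image.jpg",
    },
}

def resolve_profile(message, known_profiles):
    """Platform-side lookup: map the profile header to a stored profile ID."""
    profile_id = message["headers"].get("X-Profile-ID")
    return profile_id if profile_id in known_profiles else None

print(resolve_profile(upload_message, {"profile_A", "profile_N"}))  # -> "profile_A"
```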
[0112] The data management platform 2104 may include a tag generation module 2106 which may include software configured to parse the relevant data from the data obtained from the sources and process such data through one or more models in order to generate tagging information which may be used as training data for the designer model 2112. The data obtained may consist of a plurality of discrete datasets, where each dataset is associated with a respective piece of content. Using the example shown in FIG. 21, a first dataset may be associated with the 1st design image, a second dataset may be associated with the 2nd design image, and so on. Since all of the datasets pertain to content from the same designer, they may have identical profile IDs. In some embodiments, each of the datasets may be assigned with unique dataset IDs based on the content they represent, which may be useful for the profile manager 2108 when attempting to organize the tags and parameters for each of the datasets. In some embodiments, the tag generation module 2106 may tag/assign the obtained dataset with a profile ID which may be identical to another record already indexed in the database of the system data storage 2110 (e.g., the module recognizes a match between the profile ID of the received dataset and the profile ID of a record). In some embodiments, the
assigned ID may be identical or similar to the profile ID of the incoming dataset, as shown in FIG. 21.
[0113] After tagging the IDs, the dataset may be processed via the object segmentation model. For example, for a 2D image included in the dataset, a 3D model may be generated of the depicted space, including 3D representations of the objects contained within the space. In some embodiments, after segmenting the image, the 3D model may be processed via the object removal model to remove all removable objects (i.e., furnishings) from the respective 3D model. The 3D model may be processed by a parameter generation module to generate a set of parameters describing characteristics of the design of the respective space. For example, the parameters may relate to the layout and/or style of the space, which may be based on context, types of objects used for a space, and/or the like. For example, each object depicted in the respective sample space may be further analyzed by the model retrieval engine and/or the material recognition model to assign object IDs to each object as well as to generate parameters for each of the objects. In some embodiments, the 3D model may be converted, using the object placement model, to a 2D floorplan representation showing the existing arrangements of the objects contained within the depicted space. Data related to the processing of the content, as opposed to the tagging of the ID as a header, may collectively be considered tagging information as shown in FIG. 21 and may be sent to the profile manager 2108 described below. In some embodiments, the tagging information may contain a unique dataset ID (along with the profile ID) that identifies for the profile manager 2108 the piece of content to which the respective tagging information relates.
[0114] The profile manager 2108 may receive a dataset from the tag generation module 2106 containing a profile ID (e.g., the profile ID shown in FIG. 21) and a body of tagging information associated with a dataset ID. The profile manager 2108 may be responsible for managing the backend operations that maintain the database contained within the system data storage 2110. For example, the profile manager 2108 may receive tagging information linked to profile ID A, where the tagging information includes parameters that have been labeled with dataset ID 1. In some embodiments, the profile manager 2108 may group the parameters (e.g., which may be represented by key-value pairs) into a nested hierarchy (e.g., or into another data structure that categorizes/organizes parameters) of different categories. For example, the categories may be identified by tags and may include fields for layout, object, style, etc. For example, if a set of parameters generated at the parameter generation module relates to the layout of dataset 1, the profile manager 2108 may instruct the database to record
the parameters under a tag field called “layout”. The profile manager 2108 may perform similar operations for the object field and the style field.
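A minimal sketch of such a nested record and of the grouping step is shown below; the field names and values are illustrative assumptions only.

```python
# Hypothetical record written by the profile manager for one dataset; the
# fields mirror the layout / object / style categories described above.
tagging_record = {
    "profile_id": "profile_A",
    "dataset_id": "dataset_1",
    "layout": {"floor_area_sqft": 320, "flow_clearance_m": 0.9, "access_points": 2},
    "object": {"obj_17": {"color_palette": "warm neutrals", "material": "oak"},
               "obj_18": {"color_palette": "charcoal", "material": "linen"}},
    "style": {"architectural_style": "mid-century modern"},
}

def record_parameters(database, record):
    """Group incoming key-value parameters under their category fields,
    nesting them beneath the designer's profile ID."""
    profile = database.setdefault(record["profile_id"], {})
    profile[record["dataset_id"]] = {k: record[k] for k in ("layout", "object", "style")}
    return database

db = record_parameters({}, tagging_record)
```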
[0115] As mentioned above, the system data storage 2110 may contain a database that stores the tagging information in a format where such information is readily accessible. The system data storage 2110 may contain identical or closely related features of the system data storage described above. In some embodiments, the database may be considered a design replication database and may be separate from the other databases/data structures described above. In some embodiments, the system data storage 2110 may contain an archived repository of available objects that may be retrieved by the object recommendation engine when determining which objects to recommend, as described below. The available objects may be similar or identical to the assets collected by the asset collection module described in FIGS. 1-4.
[0116] The designer model 2112 may include one or more ML models trained to output a design plan for a space represented by a 3D model. In some embodiments, a plurality of designer models 2112 may be included in the system. In such a case, each designer model 2112 may be unique to a specific profile ID. For example, a designer model 2112 may be trained to output a design plan representative of how a specific individual/designer (linked to a profile ID) would design a given space. Training the designer model 2112 is described further in FIGS. 22 and 23. The designer model 2112 may implement a floorplan generator and an object recommendation engine to generate and output the design plan. The floorplan generator may generate a proposed arrangement of a plurality of objects (i.e., furnishings) within the physical space. The proposed arrangement may consist of object indicators that indicate positions/orientations within the physical space at which a respective object is proposed to be located. The object recommendation engine may select, for each object indicator, a recommended object to be located at that location, and may thereby generate a plurality of recommended objects associated with respective positions indicated by the respective object indicators. The floorplan generator and object recommendation engine are described further in FIGS. 22 and 23, respectively. In some embodiments, the designer model 2112 may be or include a generative AI model deploying a deep neural network trained to improve how closely a reconstructed design space matches how a given designer would have constructed the space. In some embodiments, the floorplan generator may be a generative AI model while the object recommendation engine may be one or more models/algorithms that score or rank the available objects in the repository based on a criterion (further details of the floorplan generator and the object recommendation engine are
provided in FIGS. 22 and 23, respectively). Examples of generative AI models may include NeRF, generative radiance fields (GRAF), surface-guided neural radiance fields (SURF), bundle-adjusting neural radiance fields (BARF), or any other model in the field of computer vision that can learn how to reconstruct a 3D scene from a set of images. In the case where the input is a textual description, examples of generative AI text-to-image models may include or incorporate DALL-E, STABLE DIFFUSION, or any other multimodal natural language processing (NLP) model that generates images from a text description. In some embodiments, text-to-image models may be used in combination with other generative AI models. For example, a text-to-image model may be executed first and configured to generate one or more 2D images from an input text description. In such a case, these generated 2D images may then be fed to a 3D generative AI model to generate a reconstructed design space that corresponds to the original input text description.
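As a simplified illustration of the scoring/ranking role described for the object recommendation engine, the sketch below ranks archived available objects against a designer's learned preferences; the criteria, weights, and data fields are hypothetical assumptions.

```python
def recommend_object(object_indicator, available_objects, designer_params):
    """Rank the archived available objects for one object indicator by how well
    their parameters match the designer's learned preferences, and return the
    best-scoring object. The scoring criteria here are simplified placeholders."""
    def score(obj):
        style_match = 1.0 if obj["style"] == designer_params["preferred_style"] else 0.0
        palette_match = 1.0 if obj["color_palette"] in designer_params["palettes"] else 0.0
        type_match = 1.0 if obj["type"] == object_indicator["type"] else 0.0
        return 2.0 * type_match + style_match + palette_match
    return max(available_objects, key=score)

recommended = recommend_object(
    {"type": "couch", "position": (0.5, 3.0)},
    [{"id": "sku_1", "type": "couch", "style": "mid-century modern", "color_palette": "warm neutrals"},
     {"id": "sku_2", "type": "couch", "style": "farmhouse", "color_palette": "cool grays"}],
    {"preferred_style": "mid-century modern", "palettes": ["warm neutrals"]},
)  # -> the entry with id "sku_1"
```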
[0117] FIG. 22 depicts an exemplary training protocol to train the floorplan generator 2202 of the designer model to output a design plan. The floorplan generator 2202 may be fed a volume of sample images 2200 (or any of the other forms of media described in FIG. 21 such as text descriptions, multimodal data, contextual information such as temporal or geolocation data, etc.) depicting a plurality of sample spaces furnished by a designer. In some embodiments, the sample images 2200 may include at least some of the images and other content collected from the data sources described in FIG. 21. In some embodiments, at least some of the sample images 2200 may be generated from a series of simulations executed via one or more simulation tools like simple graphic models, game engines, etc. Such simulated sample images may be beneficial in that they can increase the diversity and volume of the training dataset, thereby improving the generalization of the floorplan generator 2202. In some cases, the sample images based on real-world data may be combined with the simulated sample images and processed as input training data. The sample images 2200 may show at least one or more of the following: spatial dimensions of the respective sample space, designer-selected object placements within the respective sample space, and designer-selected furnishings of the respective sample space. In some embodiments, these metrics may be analyzed during the evaluation process, i.e., the ground truth comparison in operation 2212.
[0118] At operation 2204, for each of the sample images 2200, a sample 3D model may be obtained. For example, the sample images may be fed into the object segmentation model and the object removal model described above which may generate respective sample 3D models (containing only structural objects as all removable objects have been removed/deleted via
the object removal model). In the case where the sample images 2200 originated from the data sources and have been processed by the data management platform, the sample 3D models may be indexed at the system data storage 2214 and retrieved by the floorplan generator 2202 without the need for additional processing.
[0119] At operation 2206, for each of the sample images 2200, the sample 3D model may be converted to a sample design plan that includes a sample 2D floorplan representation showing a proposed arrangement of a plurality of objects/furnishings. The proposed arrangement may include object indicators indicating positions within the sample 2D floorplan representation at which a respective object is proposed to be located. Converting the sample 3D models may incorporate one or more techniques implemented by the object placement model which were described in detail in FIGS. 15-18. In some embodiments, the proposed arrangement comprising the plurality of object indicators may consist of fixed object indicators for all structural objects. In such a case, all iterations of the sample design plan may have the same object indicators for the structural objects but different variations for the other non-structural objects.
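As a purely illustrative sketch of the data produced at operation 2206, the following hypothetical Python structures show how a sample design plan might carry fixed indicators for structural objects alongside variable indicators for non-structural furnishings; the field names are assumptions, not a disclosed schema.

from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ObjectIndicator:
    object_type: str                 # e.g., "couch", "window"
    position: Tuple[float, float]    # (x, y) within the 2D floorplan
    orientation_deg: float = 0.0
    structural: bool = False         # structural indicators stay fixed across iterations


@dataclass
class SampleDesignPlan:
    floorplan_dims: Tuple[float, float]               # width, depth of the space
    indicators: List[ObjectIndicator] = field(default_factory=list)

    def variable_indicators(self) -> List[ObjectIndicator]:
        # Only non-structural indicators vary between iterations of the sample plan.
        return [ind for ind in self.indicators if not ind.structural]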
[0120] When determining how to arrange the object indicators in the sample 2D floorplan representation, the floorplan generator 2202 may retrieve tagging information generated from analyzing the previously-registered spaces stored via the data management platform described in FIG. 21. The retrieved tagging information may be indexed to the profile ID of the designer model (as described in FIG. 21, each designer model may be trained to emulate an aesthetic of a designer with a particular profile ID). In some embodiments, the tagging information may consist of a set of parameters nested under the layout field, object field, and/or style field described in FIG. 21. In such a case, the parameters for the layout field may indicate, for each previously-registered space, one or more of the following: overall space dimensions (e.g., floor area, volume, etc.), proportions/relationships between different parts of a space, flow of movement including clearance thresholds (e.g., building code or other design requirements related to the movement of people through space), ceiling height, acoustics, ventilation, lighting, furnishing layout, access points (e.g., doors and windows), site context (e.g., if the previous space had a desirable view, the designer may choose a layout where a seat was placed near a window overlooking the view), equipment/technology placement to ensure connectivity, intended use of space, zoning requirements, estimated occupant load (e.g., if a space is depicting a master bedroom, a designer may design a layout with fewer seating arrangements since he or she would expect that such a space would not be occupied by a large number of people), storage space, space utilization, safety standards, and/or the
like. The parameters for the object field may indicate, for each object ID indexed to a previously-registered space, one or more of the following: color palette, texture, compactness, contrast, size, shape, pattern, surface finishes, furnishing material type, visual decor or theme, ergonomics, durability, usage, accessibility, flexibility/modularity (i.e., how much the object can be reconfigured for multiple purposes), integration with technology, brand (which may include the designer's brand), cohesion, symmetry, price, and/or the like. The parameters for the style field may indicate, for a previously-registered space, an overarching architectural style reflecting the setting of the space. Examples of architectural styles may include modern, contemporary, mid-century modern, minimalist, farmhouse, beach shack, cabin, lake house, urban apartment, etc.
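The tagging information described above might, for illustration only, be organized as in the following Python sketch, with one record per previously-registered space indexed to a designer profile ID; the specific fields shown are a small, hypothetical subset of the parameters listed above.

from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LayoutTags:
    floor_area_m2: float
    ceiling_height_m: float
    access_points: List[str] = field(default_factory=list)   # e.g., ["door:north", "window:east"]
    clearance_threshold_m: float = 0.9                        # assumed minimum walkway width
    intended_use: str = "living room"


@dataclass
class ObjectTags:
    color_palette: List[str] = field(default_factory=list)
    material: str = ""
    brand: str = ""
    price: float = 0.0


@dataclass
class StyleTags:
    architectural_style: str = "modern"    # e.g., "mid-century modern", "farmhouse"


@dataclass
class SpaceTaggingRecord:
    profile_id: str                                                # designer profile the record is indexed to
    layout: LayoutTags
    objects: Dict[str, ObjectTags] = field(default_factory=dict)   # keyed by object ID
    style: StyleTags = field(default_factory=StyleTags)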
[0121] At operation 2206, as each sample 2D floorplan representation (from the sample images) passes through the floorplan generator 2202, the parameters described above may be used as an input training set (e.g., along with other training data mentioned above) to assist the Al model in learning how to adjust the neural network to output a 2D floorplan representation with a proposed arrangement matching what a designer would create. In some embodiments, determining the 2D floorplan representation may involve a rules-based algorithm that is configured with a set of rules. The set of rules may be generated based on, for example, the parameters (along with other training data generated, for example, from simulations) and may convey what constitutes a layout that matches the designer’s aesthetic for a given space. An example of a rule may pertain to the relationship between the dimensions of the 2D representations of objects with respect to the total dimensions of the 2D floorplan (e.g., a particular furnishing should not exceed X% of the available floorplan space).
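A minimal sketch of the kind of rule mentioned above is shown below; the 15% threshold and the rule form are illustrative assumptions rather than values used by the system.

def footprint_rule(object_dims, floorplan_dims, max_fraction=0.15):
    """Illustrative rule: a furnishing's footprint should not exceed a fixed
    fraction of the total floorplan area (the X% example above)."""
    obj_area = object_dims[0] * object_dims[1]
    floor_area = floorplan_dims[0] * floorplan_dims[1]
    return obj_area <= max_fraction * floor_area


# Example: a 2.0 m x 0.9 m couch in a 5 m x 4 m room occupies 9% of the floor area.
assert footprint_rule((2.0, 0.9), (5.0, 4.0))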
[0122] At operation 2208, the sample 2D floorplan representation may be staged by placing object indicators at positions within the representation in accordance with the proposed arrangement outputted at operation 2206. The output of operation 2208 may include the positions of respective object indicators, the types of respective object indicators (e.g., couch), the size/dimensions of respective object indicators (e.g., the couch should not exceed 5 feet in width), aesthetic information for the respective object indicators (e.g., the couch should be red), along with other information included in the parameter field of the object category described above. In some embodiments, software instructions on how to arrange the object indicators within the sample 2D floorplan representation may be generated and assist in staging the object indicators within the respective 2D floorplan representation. In
some embodiments, one or more staging techniques performed by the object placement model, which are described in FIGS. 15-18, may be incorporated at operation 2208.
[0123] At operation 2210, the staged sample 2D floorplan representation showing a proposed arrangement of a plurality of objects may be used as a predicted designer-inspired design plan, i.e., a design plan that a particular designer may use based on the input sample. In some embodiments, the predicted design plan may be identical to the staged sample 2D floorplan representation. Alternatively, the predicted design plan may be a reconstructed 3D model with 3D representations mapped to each of the object indicators of the staged sample 2D floorplan.
[0124] At operation 2212, the predicted design plan may be evaluated, which may involve a comparison to a ground truth. The evaluation may involve qualitative analysis where an expert (perhaps the actual designer or representative(s) of the designer) scores the quality of the predicted design plan. The evaluation may involve statistical measures (e.g., inception score (IS), mode score (MS), etc.) used to assess the similarity between the predicted design plan and the real-world sample images and simulated sample images, with the goal being to train the Al model of the floorplan generator 2202 to produce new design plans that are indistinguishable from the sample. In some embodiments, feedback may be generated to adjust the weights and biases of the neural network to minimize the loss function and thereby improve the model’s ability to generate design plans resembling the designer’s aesthetic.
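For illustration, the feedback step at operation 2212 could resemble the following generic training loop; the use of PyTorch, the mean-squared-error loss, and the tensor encodings of design plans are all assumptions made only for this sketch, not the disclosed training procedure.

import torch
from torch import nn, optim


def train_floorplan_generator(model: nn.Module, data_loader, epochs: int = 10):
    """Generic training-loop sketch for operation 2212: compare the predicted
    design plan against the ground-truth sample and back-propagate the loss so
    the generator drifts toward the designer's aesthetic. Tensor encodings of
    plans are assumed to be produced upstream."""
    loss_fn = nn.MSELoss()                       # stand-in for any plan-similarity loss
    optimizer = optim.Adam(model.parameters(), lr=1e-4)

    for _ in range(epochs):
        for sample_input, ground_truth_plan in data_loader:
            predicted_plan = model(sample_input)
            loss = loss_fn(predicted_plan, ground_truth_plan)
            optimizer.zero_grad()
            loss.backward()                      # adjust weights/biases to minimize the loss
            optimizer.step()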
[0125] FIG. 23 depicts an exemplary flow diagram of an object recommendation engine 2302 configured to generate recommended objects to be located at the positions indicated by respective object indicators. As shown, the object recommendation engine 2302 may receive information from the floorplan generator 2300 indicative of how to arrange the object indicators within the physical space. The object recommendation engine 2302 may perform operations 2304 through 2310 to select recommended objects to populate the positions of the object indicators. The recommended objects may be selected from a repository of available objects stored in the system data storage 2312.
[0126] In some embodiments, the floorplan generator 2300 may generate and send software instructions that map the positions of the object indicators (which may be identified in the code under Object IDs) within the 2D floorplan representation of the design plan to corresponding 3D positions within the original 3D model (the 3D model that was converted to the 2D floorplan representation by the floorplan generator). In some embodiments, reconstruction software tools such as photogrammetry, point cloud creation, computer vision, etc. may be implemented to generate the mappings. In some embodiments, additional
parameters related to desired characteristics (which may include the parameters in the object field of the database described in FIGS. 21 and 22) for each of the object indicators may be sent along with the mappings.
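One simple way to realize the mapping described above, assuming the 2D floorplan is a top-down projection of the 3D model, is sketched below; the actual system may instead derive the mapping from photogrammetry or point-cloud reconstruction, and the object IDs shown are hypothetical.

from typing import Dict, Tuple


def map_floorplan_to_3d(
    indicator_positions_2d: Dict[str, Tuple[float, float]],
    floor_elevation: float = 0.0,
) -> Dict[str, Tuple[float, float, float]]:
    """Sketch of the mapping step: treat the 2D floorplan as a top-down
    projection of the 3D model, so each (x, y) indicator position becomes an
    (x, y, z) position on the floor plane."""
    return {
        object_id: (x, y, floor_elevation)
        for object_id, (x, y) in indicator_positions_2d.items()
    }


# Example with two hypothetical object IDs.
print(map_floorplan_to_3d({"obj-101": (1.5, 2.0), "obj-102": (3.2, 0.8)}))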
[0127] At operation 2304, the object recommendation engine 2302 may obtain such instructions and may modify the 3D model by placing object indicators in the mapped 3D positions. In some embodiments, the object recommendation engine 2302 may skip mapping to the 3D model and instead select recommended objects to fill the 2D floorplan representation. In such a case, the 3D model may be populated only after all recommendations have been selected.
[0128] At operation 2306, selecting recommended objects to be located at the respective 3D positions may be based on scoring available objects indexed in the repository. For example, the object recommendation engine 2302 may retrieve the object IDs of the available objects along with the tagging information and may process such information in one or more rules-based algorithms configured to output a confidence score based on certain criteria or a ruleset. In some embodiments, the criteria may be manually hardcoded where users may evaluate and provide weights and/or rules as to what results in a match. An example of a rule may pertain to whether an available object is associated with a brand of the designer (where at least one parameter from the set of parameters of the tagging information includes an indication as to whether the respective furnishing is associated with the designer). Each confidence score may indicate a confidence that a designer associated with the profile of the designer model would choose to insert a previously-registered furnishing in place of the object indicator. In some embodiments, the object recommendation engine 2302 may generate scores using one or more techniques similar to those implemented by the model retrieval engine described in FIG. 13. For example, the desired parameters for each object indicator which were sent from the floorplan generator 2300 may be compared to the tagging information of the available objects by performing an assessment of the degree to which the underlying data points match.
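The confidence scoring at operation 2306 might, purely as an illustration, take the form of a weighted parameter match such as the following; the weights, parameter names, and brand bonus are placeholders rather than the disclosed ruleset.

from typing import Dict


def confidence_score(
    desired: Dict[str, str],
    candidate_tags: Dict[str, str],
    weights: Dict[str, float],
    designer_brand: str = "",
) -> float:
    """Illustrative weighted-match scoring: each matching parameter contributes
    its weight, and an extra rule boosts furnishings carrying the designer's
    own brand."""
    score = 0.0
    for param, desired_value in desired.items():
        if candidate_tags.get(param) == desired_value:
            score += weights.get(param, 1.0)
    if designer_brand and candidate_tags.get("brand") == designer_brand:
        score += weights.get("brand_bonus", 2.0)
    return score


def pick_recommended_object(desired, repository, weights, designer_brand=""):
    """Select the object ID with the highest confidence score (cf. operation 2308)."""
    return max(
        repository,
        key=lambda oid: confidence_score(desired, repository[oid], weights, designer_brand),
    )

In use, the repository argument would be a mapping from object IDs to their tagging information, and the weights mapping would encode which parameters (e.g., color palette, material, brand) matter most for the selected designer profile.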
[0129] At operation 2308, the mapped 3D position may include an indication of a selection of a recommended object. For example, the available object (previously-registered furnishing) associated with the highest confidence score may be selected as the recommended object to be placed at the position indicated by the respective object indicator. The process may be repeated until recommended objects have been selected for all of the object indicators (by iterating and looping across the object IDs as shown in FIG. 23).
[0130] At operation 2310, additional parameters and/or rules may be generated based on a current selection of a plurality of recommended objects. For example, the object recommendation engine 2302 may determine to select a couch with certain dimensions (e.g., four feet in width and 10 feet in length) as a recommended object. Based on such a selection, an additional rule may be generated that pertains to the dimensions of the already selected couch, the dimensions of the other available objects within the object repository, the total dimensions of the physical space, etc.
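Operation 2310 can be illustrated with a sketch in which a newly generated rule closes over the dimensions of an already-selected object; the clearance value and dimensions below are hypothetical examples expressed in feet.

def make_clearance_rule(selected_dims, room_dims, min_clearance=3.0):
    """Sketch of operation 2310: after an object of `selected_dims` (width, length)
    is chosen, generate a rule constraining later selections so that enough free
    floor length remains for circulation."""
    remaining_length = room_dims[1] - selected_dims[1]

    def rule(candidate_dims):
        # A later candidate passes only if it still leaves the minimum clearance.
        return candidate_dims[1] <= remaining_length - min_clearance

    return rule


# Example: after a 4 ft x 10 ft couch in a 12 ft x 18 ft room, 8 ft of length remains.
rule = make_clearance_rule((4.0, 10.0), (12.0, 18.0))
print(rule((3.0, 8.0)))   # False: an 8 ft long object leaves less than 3 ft of clearance
print(rule((3.0, 4.0)))   # True: a 4 ft long object still leaves enough clearance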
[0131] FIG. 24 depicts an exemplary method of generating a multi-dimensional representation of a physical space in response to user instructions inputted via a GUI. At 2400, a user may upload environment data depicting a physical space. In some embodiments, the user may interface with the GUI described with respect to FIGS. 1-20. For example, text may be displayed via the GUI prompting the user to upload an image/video or other media file containing content of a physical space that the user wants designed based on a designer model. Below the text may be a drag-and-drop area (or another mechanism for the user to upload media of the physical space) and an upload button where the user can upload the file. At 2402, upon the user selecting the upload button (or another function to transmit the content containing the physical space), the system described herein may generate a multidimensional model of the physical space via the implementation of the object segmentation model and the object removal model described above. At 2404, a user may select to stage the physical space according to a designer profile. For example, the GUI may present a graphical drop-down list of profile options. Each option may be associated with a profile of a designer stored in the data management platform in FIG. 21. For example, upon a designer model having been adequately trained to replicate a style of a designer, a profile option may be created at the front end and appended to the drop-down list of profile options. The GUI may receive, from a user via the interface, a selection of the designer profile from the set of profile options. The selection may indicate a request from the user to have the proposed arrangement of the plurality of objects staged within the physical space according to how a designer associated with the selected designer profile would have staged the physical space. In some embodiments, the user may choose no designer profiles. In such a case, it may be inferred that the user wants the physical space to be generated according to the default state described above. At 2406, the floorplan generator may generate the design plan with a proposed arrangement of objects as described in FIG. 22. At 2408, the object recommendation engine may select a plurality of recommended objects in accordance with the design plan, as described in FIG. 23. At 2410, a 3D model may be populated with the plurality of
recommended objects located at the respective positions indicated by the respective object indicators. In some embodiments, the system may be further configured to receive, from a user device, a user instruction to generate or retrieve the populated 3D model with the plurality of recommended objects located at the respective positions indicated by the respective object indicators. In response to receiving the user instruction, the system may be configured to transmit for display by the user device a rendering of the populated 3D model. In some embodiments, the rendering can be edited by the user of the user device. For example, the plurality of recommended objects may be movable within the graphical rendering. For example, the plurality of recommended objects may be substituted with other objects from the repository of available objects. In some embodiments, the system may generate a 3D digital twin each time an edit is made to validate that the edited 3D model is viable (which is described further in FIGS. 5 and 16). Editing may include swapping a primary recommended object for a secondary recommended object stored in the repository of available objects. Editing may include deleting a recommended object from the 3D model. In some embodiments, the system may be further configured to iteratively modify the graphical rendering of the populated 3D model based on sequential user instructions.
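The end-to-end flow of FIG. 24 may be summarized, for illustration only, by the following skeletal orchestration; each entry in the models mapping, and the dictionary-style design plan, is a hypothetical stand-in for the corresponding component described above rather than a disclosed interface.

def stage_physical_space(environment_media, designer_profile_id, models):
    """Skeletal orchestration of the FIG. 24 flow using hypothetical callables."""
    # 2400-2402: build a 3D model of the uploaded space and strip removable objects.
    base_model = models["segment_and_remove"](environment_media)

    # 2404-2406: generate a design plan in the selected designer's style.
    design_plan = models["floorplan_generator"](base_model, designer_profile_id)

    # 2408: choose a recommended object for every indicator in the plan.
    recommendations = {
        indicator_id: models["object_recommender"](indicator_id, design_plan)
        for indicator_id in design_plan["indicators"]
    }

    # 2410: populate the 3D model with the recommended objects at their positions.
    return models["populate_3d_model"](base_model, design_plan, recommendations)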
[0132] FIG. 25 depicts an exemplary method 2500 of generating a multi-dimensional representation of a physical space based on a designer model.
[0133] At 2502, environment data depicting a physical space may be received. In some embodiments, the physical space may contain a plurality of objects. In some embodiments, the environment data may comprise a multi-dimensional representation of an environment.
[0134] At 2504, the environment data may be processed using a first processing model. In some embodiments, the first processing model may be configured to output a 3D model of the physical space that includes 3D representations of the plurality of respective objects.
[0135] At 2506, a design plan for the physical space may be generated using a designer model. In some embodiments, the design plan may be generated based on the 3D model of the physical space, or a derivative thereof. The design plan may include a proposed arrangement of a plurality of objects within the physical space. In some embodiments, the proposed arrangement may comprise a plurality of object indicators. For example, each object indicator may indicate a position within the physical space at which a respective object is proposed to be located. In some embodiments, a selection of a designer profile from a set of designer profiles may be received from a user via an interface. The selection may indicate a request from the user to have the proposed arrangement of the plurality of objects staged within the physical space according to how a designer associated with the selected designer
profile would have staged the physical space. In some embodiments, the 3D model of the physical space, or a derivative thereof, along with the selected designer profile may be inputted to the designer model. For example, the designer model may be a generative artificial intelligence model that has been trained based on a volume of sample images that depict a plurality of sample spaces furnished according to the selected designer profile. The sample images may show at least: (i) spatial dimensions of the respective sample space; (ii) designer-selected object placements within the respective sample space; and/or (iii) designer-selected furnishings of the respective sample space.
[0136] At 2508, a recommended object may be selected to be located at the position indicated by the respective object indicator. In some embodiments, for each object indicator, a recommended object may be selected, from a repository of available objects, to be located at the position indicated by the respective object indicator. In such a case, a plurality of recommended objects associated with respective positions indicated by the respective object indicators may be generated. In some embodiments, selecting the plurality of recommended objects is performed by an object recommendation engine. The object recommendation engine may include one or more rules-based algorithms configured to generate confidence scores for a plurality of previously registered furnishings stored in the repository based on a plurality of manually coded rules. In some embodiments, each confidence score may indicate a confidence value that a designer of the designer profile would insert a particular furnishing in place of the object indicator. In some embodiments, at least one of the rules may indicate that a previously registered furnishing associated with a brand of the selected designer profile is more likely to have a higher confidence score than an identical furnishing that is associated with another brand.
[0137] At 2510, the 3D model may be populated with the plurality of recommended objects. In some embodiments, the 3D model may be populated with the plurality of recommended objects located at the respective positions indicated by the respective object indicators. In some embodiments, a user instruction may be received, from a user device, to generate or retrieve the populated 3D model with the plurality of recommended objects located at the respective positions indicated by the respective object indicators. In response to receiving the user instruction, a rendering of the populated 3D model may be transmitted for display by the user device. The rendering can be edited by the user of the user device. In some embodiments, editing may comprise the option to swap at least one of the plurality of recommended objects with other previously registered furnishings from the repository of available objects. In some embodiments, editing may comprise deleting at least one of the
plurality of recommended objects from a current rendering of the 3D model. In some embodiments, the graphical rendering of the populated 3D model may be iteratively modified based on sequential user instructions.
[0138] In some embodiments, a system for generating a multi-dimensional representation of an environment based, at least in part, on a design style may be provided. In such a case, the system may obtain a plurality of sample images depicting a plurality of sample environments. The system may process, using an image processing model, the sample images. In some embodiments, the image processing model may be configured to output sample 3D models of the sample environments. In some embodiments, the system may input the sample 3D models, or derivatives thereof, to a design generation model. The design generation model may be a generative artificial intelligence model that is configured to generate, based on a subject 3D model or derivative thereof, a design plan for a space represented by the subject 3D model. For example, the design plan may include a proposed arrangement of a plurality of objects within the space. The proposed arrangement may comprise a plurality of object indicators. Each object indicator may indicate a position within the space at which a respective object is proposed to be located. In some embodiments, the system may obtain feedback regarding the proposed arrangement generated by the design generation model. Based on the feedback, the system may modify the design generation model.
[0139] FIG. 26 shows an exemplary processing system that may execute techniques presented herein. FIG. 26 is a simplified functional block diagram of a computer that may be configured to execute techniques described herein, according to exemplary cases of the present disclosure. Specifically, the computer (or “platform” as it may not be a single physical computer infrastructure) may include a data communication interface 2660 for packet data communication. The platform may also include a central processing unit 2620 (“CPU 2620”), in the form of one or more processors, for executing program instructions. The platform may include an internal communication bus 2610, and the platform may also include a program storage and/or a data storage for various data files to be processed and/or communicated by the platform such as ROM 2630 and RAM 2640, although the system 2600 may receive programming and data via network communications. The system 2600 also may include input and output ports 2650 to connect with input and output devices such as keyboards, mice, touchscreens, monitors, displays, etc. Of course, the various system functions may be implemented in a distributed fashion on similar platforms, to distribute the
processing load. Alternatively, the systems may be implemented by appropriate programming of one computer hardware platform.
[0140] The general discussion of this disclosure provides a brief, general description of a suitable computing environment in which the present disclosure may be implemented. In some cases, any of the disclosed systems, methods, and/or graphical user interfaces may be executed by or implemented by a computing system consistent with or similar to that depicted and/or explained in this disclosure. Although not required, aspects of the present disclosure are described in the context of computer-executable instructions, such as routines executed by a data processing device, e.g., a server computer, wireless device, and/or personal computer. Those skilled in the relevant art will appreciate that aspects of the present disclosure can be practiced with other communications, data processing, or computer system configurations, including: Internet appliances, hand-held devices, wearable computers, all manner of cellular or mobile phones (including Voice over IP (“VoIP”) phones), dumb terminals, media players, gaming devices, virtual reality devices, multi-processor systems, microprocessor-based or programmable consumer electronics, set-top boxes, network PCs, mini-computers, mainframe computers, and the like. Indeed, the terms “computer,” “server,” and the like, are generally used interchangeably herein, and refer to any of the above devices and systems, as well as any data processor.
[0141] Aspects of the present disclosure may be embodied in a special purpose computer and/or data processor that is specifically programmed, configured, and/or constructed to perform one or more of the computer-executable instructions explained in detail herein. While aspects of the present disclosure, such as certain functions, are described as being performed exclusively on a single device, the present disclosure may also be practiced in distributed environments where functions or modules are shared among disparate processing devices, which are linked through a communications network, such as a Local Area Network (“LAN”), Wide Area Network (“WAN”), and/or the Internet. Similarly, techniques presented herein as involving multiple devices may be implemented in a single device. In a distributed computing environment, program modules may be located in both local and/or remote memory storage devices.
[0142] Aspects of the present disclosure may be stored and/or distributed on non-transitory computer-readable media, including magnetically or optically readable computer discs, hardwired or preprogrammed chips (e.g., EEPROM semiconductor chips), nanotechnology memory, biological memory, or other data storage media. Alternatively, computer implemented instructions, data structures, screen displays, and other data under aspects of the
present disclosure may be distributed over the Internet and/or over other networks (including wireless networks), on a propagated signal on a propagation medium (e.g., an electromagnetic wave(s), a sound wave, etc.) over a period of time, and/or they may be provided on any analog or digital network (packet switched, circuit switched, or other scheme).
[0143] Program aspects of the technology may be thought of as “products” or “articles of manufacture” typically in the form of executable code and/or associated data that is carried on or embodied in a type of machine-readable medium. “Storage” type media include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, from a management server or host computer of the mobile communication network into the computer platform of a server and/or from a server to the mobile device. Thus, another type of media that may bear the software elements includes optical, electrical and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links. The physical elements that carry such waves, such as wired or wireless links, optical links, or the like, also may be considered as media bearing the software. As used herein, unless restricted to non-transitory, tangible “storage” media, terms such as computer or machine “readable medium” refer to any medium that participates in providing instructions to a processor for execution.
[0144] Numbered Embodiments
[0145] A1. A system for object segmentation from a three-dimensional (3D) model of an image uploaded by a user, the system comprising: a processing unit comprising one or more processors; a memory unit storing computer-readable instructions, wherein the system is configured to: receive environment data depicting an environment that contains one or more objects, the environment data comprising a multi-dimensional visualization of the environment; process, using an image processing model, the environment data, the image processing model being configured to: generate a 3D model of the environment that includes a plurality of 3D representations of the one or more objects and a 3D representation of the environment, wherein the 3D representations of the one or more objects are independently manipulable relative to the 3D representation of the environment; determine a plurality of
parameters for one or more of the 3D representations of the one or more objects based on photogrammetry analysis; for one or more of the 3D representations of the one or more objects, associate the respective 3D representation of the respective object with a label indicating an object type and one or more of the plurality of parameters determined to be associated with the object, wherein the label and the plurality of parameters can be used to query a database to find a matching 3D representation of the object; and output the labels and parameters for the 3D representations of the one or more objects.
[0146] A2. The system of claim A1, wherein the environment is a physical space, and the one or more objects include both structural and removable furniture units within the physical space, wherein the structural and removable furniture units comprise tangible decor units.
[0147] A3. The system of any of claims A1-A2, wherein the plurality of parameters relates to one or more characteristics of the object, wherein at least one parameter of the plurality of parameters indicates a flag used to determine whether the 3D representation is associated with a structural unit or a removable unit.
[0148] A4. The system of any of claims A1-A3, wherein the one or more characteristics relate to one or more of size, color, pattern, texture, compactness, contrast, or viewpoints associated with the object.
[0149] A5. The system of any of claims A1-A4, further configured to send a message comprising the labels and the one or more parameters of the one or more 3D representations to a model retrieval engine that is configured to query a database to identify one or more candidate 3D representations with matching labels and parameters.
[0150] A6. The system of any of claims A1-A5, wherein the database is populated with candidate 3D representations associated with candidate objects prior to the model retrieval engine performing the query, wherein each candidate 3D representation is associated with a respective label and one or more parameters.
[0151] A7. The system of any of claims A1-A6, wherein the candidate objects relate to products advertised by a vendor registered to the system, wherein upon registration, the vendor grants permission to the system to access and read data related to the products listed on a website hosted by the vendor.
[0152] A8. The system of any of claims A1-A7, further configured to pull data related to the products from a database and process the data using the image processing model, wherein the image processing model is configured to generate a plurality of candidate 3D representations of the one or more products along with a plurality of parameters for each of the generated candidate 3D representations.
[0153] A9. The system of any of claims A1-A8, further configured to receive data related to an upload of a 2D image depicting one or more candidate objects from a user registered to the system, and process the data using the image processing model, wherein the image processing model is configured to generate one or more candidate 3D representations of the one or more candidate objects along with a plurality of parameters for each of the generated candidate 3D representations.
[0154] A10. The system of any of claims A1-A9, wherein the environment data is based on a 2D image depicting an environment and objects within the environment, wherein the image processing model comprises a neural network trained to predict a radiance value for each pixel of the 2D image to generate the 3D model of the 2D image, wherein the 3D model comprises a multitude of 3D views depicting different viewing angles of the 3D model.
[0155] B1. A method for object segmentation from a three-dimensional (3D) model of an image uploaded by a user, the method comprising: receiving environment data depicting an environment that contains one or more objects, the environment data comprising a multi-dimensional visualization of the environment; processing, using an image processing model, the environment data, the image processing model being configured to: generate a 3D model of the environment that includes a plurality of 3D representations of the one or more objects and a 3D representation of the environment, wherein the 3D representations of the one or more objects are independently manipulable relative to the 3D representation of the environment; determine a plurality of parameters for one or more of the 3D representations of the one or more objects based on photogrammetry analysis; for one or more of the 3D representations of the one or more objects, associate the respective 3D representation of the respective object with a label indicating an object type and one or more of the plurality of parameters determined to be associated with the object, wherein the label and the plurality of parameters can be used to query a database to find a matching 3D representation of the object; and output the labels and parameters for the 3D representations of the one or more objects.
[0156] B2. The method of claim B1, wherein the environment is a physical space, and the one or more objects include both structural and removable furniture units within the physical space, wherein the structural and removable furniture units comprise tangible decor units.
[0157] B3. The method of any of claims B1-B2, wherein the plurality of parameters relates to one or more characteristics of the object, wherein at least one parameter of the plurality of parameters indicates a flag used to determine whether the 3D representation is associated with a structural unit or a removable unit.
[0158] B4. The method of any of claims B1-B3, wherein the one or more characteristics relate to one or more of size, color, pattern, texture, compactness, contrast, or viewpoints associated with the object.
[0159] B5. The method of any of claims B1-B4, further comprising sending a message comprising the labels and the one or more parameters of the one or more 3D representations to a model retrieval engine that is configured to query a database to identify one or more candidate 3D representations with matching labels and parameters.
[0160] B6. The method of any of claims B1-B5, wherein the database is populated with candidate 3D representations associated with candidate objects prior to the model retrieval engine performing the query, wherein each candidate 3D representation is associated with a respective label and one or more parameters.
[0161] B7. The method of any of claims B1-B6, wherein the candidate objects relate to products advertised by a vendor registered to the system, wherein upon registration, the vendor grants permission to the system to access and read data related to the products listed on a website hosted by the vendor.
[0162] B8. The method of any of claims B1-B7, further comprising pulling data related to the products from a database and processing the data using the image processing model, wherein the image processing model is configured to generate a plurality of candidate 3D representations of the one or more products along with a plurality of parameters for each of the generated candidate 3D representations.
[0163] B9. The method of any of claims B1-B8, further comprising receiving data related to an upload of a 2D image depicting one or more candidate objects from a user registered to the system, and processing the data using the image processing model, wherein the image processing model is configured to generate one or more candidate 3D representations of the one or more candidate objects along with a plurality of parameters for each of the generated candidate 3D representations.
[0164] B10. The method of any of claims B1-B9, wherein the environment data is based on a 2D image depicting an environment and objects within the environment, wherein the image processing model comprises a neural network trained to predict a radiance value for each pixel of the 2D image to generate the 3D model of the 2D image, wherein the 3D model comprises a multitude of 3D views depicting different viewing angles of the 3D model.
[0165] C1. A system to remove an object from a three-dimensional (3D) model of a physical space, the system comprising: a processing unit comprising one or more processors; a memory unit storing computer-readable instructions, wherein the system is configured to:
receive environment data depicting an environment that contains a plurality of objects, the environment data comprising a multi-dimensional visualization of the environment, wherein the plurality of objects includes one or more structure objects and one or more removable objects; process, using an image processing model, the environment data, the image processing model being configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations; based on the tags associated with each of the 3D representations, determine which of the 3D representations are associated with structure objects and which of the 3D representations are associated with removable objects; and generate, using a generative artificial intelligence (Al) model, a rendering of the environment in which one or more portions associated with the removable objects are excised and replaced with a filler portion that is generated based on an inference of one or more characteristics associated with material surrounding the removable object.
[0166] C2. The system of claim C1, wherein the environment is a physical space, and the one or more objects include both structural and removable furniture units within the physical space, wherein the structural and removable furniture units comprise tangible decor units.
[0167] C3. The system of any of claims C1-C2, wherein the plurality of parameters relates to one or more characteristics of the object, wherein at least one parameter of the plurality of parameters indicates a flag used to determine whether the 3D representation is associated with a structural unit or a removable unit.
[0168] C4. The system of any of claims C1-C3, wherein the system is loaded with a set of hardcoded rules mapping the tags associated with each of the 3D representations to flags indicating whether the 3D representation is associated with a structural unit or a removable unit, wherein the system references the set of hardcoded rules to determine which 3D representations are associated with structural units and which are associated with removable units.
[0169] C5. The system of any of claims C1-C4, wherein the system is loaded with a rule-based algorithm configured with a set of hardcoded rules, wherein the rule-based algorithm generates and outputs, based on an input tag, a flag indicating whether the 3D representation is associated with a structural unit or a removable unit.
[0170] C6. The system of any of claims C1-C5, wherein, in addition to the flag, the rule-based algorithm generates and outputs a confidence score associated with the flag, wherein the system determines whether the 3D representation is associated with a structural unit or a
removable unit on condition that the confidence score associated with the flag satisfies a threshold.
[0171] C7. The system of any of claims C1-C6, further configured to process the 3D model outputted by the image processing model using a classifier model, wherein the classifier model is trained to categorize the 3D representations within the 3D model into different label classes and to output assigned labels for each of the 3D representations based on which label class it belongs to.
[0172] C8. The system of any of claims C1-C7, further configured to load a lookup table hardcoded with mapping information that maps labels to flags indicating whether a 3D representation is associated with a structural unit or a removable unit, wherein the system queries the lookup table with an outputted assigned label to determine whether to set the flag to a true value or a false value.
[0173] C9. The system of any of claims C1-C8, wherein the generative Al model is an image generation Al model trained by applying random noise to an image sample and iterating pixels of the image sample until the system determines that the image sample is consistent with a correct filler portion.
[0174] C10. The system of any of claims C1-C9, further configured to receive, via a user prompt, an indication as to whether a user requests to see a rendering of the environment with the removable objects or without the removable objects and, based on the indication, send device information to a user device allowing the user device to display a rendering of the environment according to the user request.
[0175] D1. A method to remove an object from a three-dimensional (3D) model of a physical space, the method comprising: receiving environment data depicting an environment that contains a plurality of objects, the environment data comprising a multi-dimensional visualization of the environment, wherein the plurality of objects includes one or more structure objects and one or more removable objects; processing, using an image processing model, the environment data, the image processing model being configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations; based on the tags associated with each of the 3D representations, determining which of the 3D representations are associated with structure objects and which of the 3D representations are associated with removable objects; and generating, using a generative artificial intelligence (Al) model, a rendering of the environment in which one or more portions associated with the removable objects are excised
and replaced with a filler portion that is generated based on an inference of one or more characteristics associated with material surrounding the removable object.
[0176] D2. The method of claim D1, wherein the environment is a physical space, and the one or more objects include both structural and removable furniture units within the physical space, wherein the structural and removable furniture units comprise tangible decor units.
[0177] D3. The method of any of claims D1-D2, wherein the plurality of parameters relates to one or more characteristics of the object, wherein at least one parameter of the plurality of parameters indicates a flag used to determine whether the 3D representation is associated with a structural unit or a removable unit.
[0178] D4. The method of any of claims D1-D3, further comprising loading a set of hardcoded rules mapping the tags associated with each of the 3D representations to flags indicating whether the 3D representation is associated with a structural unit or a removable unit, wherein the system references the set of hardcoded rules to determine which 3D representations are associated with structural units and which are associated with removable units.
[0179] D5. The method of any of claims D1-D4, further comprising loading a rule-based algorithm configured with a set of hardcoded rules, wherein the rule-based algorithm generates and outputs, based on an input tag, a flag indicating whether the 3D representation is associated with a structural unit or a removable unit.
[0180] D6. The method of any of claims D1-D5, wherein, in addition to the flag, the rule-based algorithm generates and outputs a confidence score associated with the flag, wherein the method further comprises determining whether the 3D representation is associated with a structural unit or a removable unit on condition that the confidence score associated with the flag satisfies a threshold.
[0181] D7. The method of any of claims D1-D6, further comprising processing the 3D model outputted by the image processing model using a classifier model, wherein the classifier model is trained to categorize the 3D representations within the 3D model into different label classes and to output assigned labels for each of the 3D representations based on which label class it belongs to.
[0182] D8. The method of any of claims D1-D7, further comprising loading a lookup table hardcoded with mapping information that maps labels to flags indicating whether a 3D representation is associated with a structural unit or a removable unit, wherein the method further comprises querying the lookup table with an outputted assigned label to determine whether to set the flag to a true value or a false value.
[0183] D9. The method of any of claims D1-D8, wherein the generative Al model is an image generation Al model trained by applying random noise to an image sample and iterating pixels of the image sample until the system determines that the image sample is consistent with a correct filler portion.
[0184] D10. The method of any of claims D1-D9, further comprising receiving, via a user prompt, an indication as to whether a user requests to see a rendering of the environment with the removable objects or without the removable objects and, based on the indication, sending device information to a user device allowing the user device to display a rendering of the environment according to the user request.
[0185] E1. A system for selecting a three-dimensional (3D) model of an image uploaded by a user, the system comprising: a processing unit comprising one or more processors; a memory unit storing computer-readable instructions, wherein the system is configured to: receive a plurality of two-dimensional (2D) representations of at least one object; based on the plurality of 2D representations, generate a 3D representation for the object; generate one or more tags for the 3D representation of the object, wherein the tags comprise a label indicating an object type and a plurality of parameters indicating an assessed characteristic of the object; perform a matching analysis comparing the one or more tags against tags for a plurality of previously registered objects; generate a confidence scoring indicating a confidence that a previously registered object is a match for the object; and based on the matching analysis and the confidence score, select the previously registered object for display to a user.
[0186] E2. The system of claim E1, wherein the plurality of two-dimensional (2D) representations depict an environment of a physical space, and the one or more objects include both structural and removable furniture units within the physical space, wherein the structural and removable furniture units comprise tangible decor units.
[0187] E3. The system of any of claims E1-E2, wherein the plurality of parameters relates to one or more characteristics of the object, wherein the one or more characteristics relate to one or more of size, color, pattern, texture, compactness, contrast, or viewpoints associated with the object.
[0188] E4. The system of any of claims E1-E3, wherein a database is populated with candidate 3D representations associated with the previously registered objects that can be queried to retrieve the label and the plurality of parameters associated with a given previously registered object.
[0189] E5. The system of any of claims E1-E4, wherein the previously registered objects relate to products advertised by a vendor registered to the system, wherein upon registration, the vendor grants permission to the system to access and read data related to the products listed on a website hosted by the vendor.
[0190] E6. The system of any of claims E1-E5, further configured to receive data related to an upload of a 2D image depicting one or more objects from a user registered to the system automatically via an application programming interface (API) protocol.
[0191] E7. The system of any of claims E1-E6, wherein the confidence scoring is generated by accumulating individual scores generated based on comparing the plurality of parameters of the 3D representation with a plurality of parameters of a candidate 3D representation associated with a previously registered object, wherein the system is further configured to identify parameters of the same type to be designated for comparison.
[0192] E8. The system of any of claims E1-E7, wherein selecting the previously registered object for display to a user is conditioned based on both the confidence scoring and all individual scores satisfying a threshold.
[0193] E9. The system of any of claims E1-E8, further configured to select, based on the matching analysis and the confidence score, a plurality of previously registered objects for display to a user, wherein the plurality of previously registered objects are displayed according to an order with previously registered objects with high confidence scores preceding previously registered objects with low confidence scores.
[0194] F1. A method for object retrieval from a three-dimensional (3D) model of an image uploaded by a user, the method comprising: receiving a plurality of two-dimensional (2D) representations of at least one object; based on the plurality of 2D representations, generating a 3D representation for the object; generating one or more tags for the 3D representation of the object, wherein the tags comprise a label indicating an object type and a plurality of parameters indicating an assessed characteristic of the object; performing a matching analysis comparing the one or more tags against tags for a plurality of previously registered objects; generating a confidence scoring indicating a confidence that a previously registered object is a match for the object; and based on the matching analysis and the confidence score, selecting the previously registered object for display to a user.
[0195] F2. The method of claim F1, wherein the plurality of two-dimensional (2D) representations depict an environment of a physical space, and the one or more objects include both structural and removable furniture units within the physical space, wherein the structural and removable furniture units comprise tangible decor units.
[0196] F3. The method of any of claims F1-F2, wherein the plurality of parameters relates to one or more characteristics of the object, wherein the one or more characteristics relate to one or more of size, color, pattern, texture, compactness, contrast, or viewpoints associated with the object.
[0197] F4. The method of any of claims F1-F3, wherein a database is populated with candidate 3D representations associated with the previously registered objects that can be queried to retrieve the label and the plurality of parameters associated with a given previously registered object.
[0198] F5. The method of any of claims F1-F4, wherein the previously registered objects relate to products advertised by a registered vendor, wherein upon registration, the vendor grants permission to access and read data related to the products listed on a website hosted by the vendor.
[0199] F6. The method of any of claims F1-F5, further comprising receiving data related to an upload of a 2D image depicting one or more objects from a registered user automatically via an application programming interface (API) protocol.
[0200] F7. The method of any of claims F1-F6, wherein the confidence scoring is generated by accumulating individual scores generated based on comparing the plurality of parameters of the 3D representation with a plurality of parameters of a candidate 3D representation associated with a previously registered object, wherein the method further comprises identifying parameters of the same type to be designated for comparison.
[0201] F8. The method of any of claims F1-F7, wherein selecting the previously registered object for display to a user is conditioned based on both the confidence scoring and all individual scores satisfying a threshold.
[0202] F9. The method of any of claims F1-F8, further comprising selecting, based on the matching analysis and the confidence score, a plurality of previously registered objects for display to a user, wherein the plurality of previously registered objects are displayed according to an order with previously registered objects with high confidence scores preceding previously registered objects with low confidence scores.
[0203] G1. A system for placing an object within a three-dimensional (3D) scene, the system comprising: a processing unit comprising one or more processors; a memory unit storing computer-readable instructions, wherein the system is configured to: receive environment data depicting an environment that contains one or more objects, the environment data comprising a multi-dimensional visualization of the environment; process, using an image processing model, the environment data, the image processing model being
configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations of the objects; input the 3D model of the environment, or a derivative thereof, to an object placement model, the object placement model being a generative artificial intelligence model that has been trained based on a plurality of 2D floorplan representations for sample spaces showing at least: (i) spatial dimensions of the respective sample space; and (ii) human-selected object placements within the respective space; generate, using the object placement model, a 2D floorplan representation showing a proposed arrangement of the plurality of objects within the environment; and modify the 3D model of the environment by placing the 3D representations of the plurality of objects within the 3D model of the environment in accordance with the 2D floorplan representation generated by the object placement model.
[0204] G2. The system of claim G1, wherein the environment is a physical space, and the one or more objects include both structural and removable furniture units within the physical space, wherein the structural and removable furniture units comprise tangible decor units.
[0205] G3. The system of any of claims G1-G2, wherein the plurality of parameters relates to one or more characteristics of the object, wherein at least one parameter of the plurality of parameters indicates a flag used to determine whether the 3D representation is associated with a structural unit or a removable unit.
[0206] G4. The system of any of claims G1-G3, wherein the outputted 2D floorplan representation of the 3D model comprises the 3D representations of the plurality of objects and the tags associated with each of the 3D representations.
[0207] G5. The system of any of claims G1-G4, wherein the system is loaded with a rule-based algorithm trained to convert the 3D model outputted from the image processing model into a 2D floorplan representation that is compatible with the system and structurally compliant, wherein the rule-based algorithm is a natural language processing model trained using existing literature related to interior design.
[0208] G6. The system of any of claims G1-G5, further configured to generate, using the object placement model, a plurality of 2D floorplan representations, wherein each 2D floorplan representation shows an alternative arrangement of the plurality of objects within the environment, wherein each 2D floorplan representation is determined, using the rule-based algorithm, to be compatible with the system and structurally compliant.
[0209] G7. The system of any of claims G1-G6, further configured to send device information to a user device, wherein the device information can be used by the user device to create a graphical rendering of the plurality of 2D floorplan representations for display,
wherein a user of the user device is able to see alternative arrangements of the plurality of objects within the environment.
[0210] G8. The system of any of claims G1-G7, further configured to receive information, from the user device, indicating a 2D floorplan representation selected by the user and modify the 3D model of the environment by placing the 3D representations of the plurality of objects within the 3D model of the environment in accordance with the selected 2D floorplan representation.
[0211] G9. The system of any of claims G1-G8, wherein the outputted 2D floorplan representation comprises at least one of a schematic representing a blueprint drawing of the floorplan and a spatial analysis of a 2D image of a 3D space.
[0212] G10. The system of any of claims G1-G9, wherein the output of the spatial analysis comprises at least one of a 2D schematic and a 3D block out scene.
[0213] H1. A method for placing an object within a three-dimensional (3D) scene, the method comprising: receiving environment data depicting an environment that contains a plurality of objects, the environment data comprising a multi-dimensional visualization of the environment; processing, using an image processing model, the environment data, the image processing model being configured to output a 3D model of the environment that includes 3D representations of the plurality of objects and tags associated with each of the 3D representations of the objects; inputting the 3D model of the environment, or a derivative thereof, to an object placement model, the object placement model being a generative artificial intelligence model that has been trained based on a plurality of 2D floorplan representations for sample spaces showing at least: (i) spatial dimensions of the respective sample space; and (ii) human-selected object placements within the respective space; generating, using the object placement model, a 2D floorplan representation showing a proposed arrangement of the plurality of objects within the environment; and modifying the 3D model of the environment by placing the 3D representations of the plurality of objects within the 3D model of the environment in accordance with the 2D floorplan representation generated by the object placement model.
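By way of a non-limiting illustration of the pipeline recited in claims G1 and H1, the following sketch shows one way the claimed steps could be chained in code. The model interfaces (`image_model.reconstruct`, `placement_model.propose`) and the `Object3D`/`Scene3D` containers are hypothetical placeholders introduced only for this example, not part of the claimed implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Object3D:
    tag: str                       # e.g. "sofa", "bookshelf"
    dimensions: tuple              # (width, depth, height)
    position: tuple = (0.0, 0.0)   # (x, y) within the floorplan

@dataclass
class Scene3D:
    bounds: tuple                                # (width, depth) of the room
    objects: list = field(default_factory=list)

def place_objects(environment_data, image_model, placement_model):
    """Chain the claimed steps: 3D reconstruction, 2D floorplan proposal,
    then modification of the 3D model to match the proposed floorplan."""
    # Step 1: the image processing model outputs a tagged 3D model of the space.
    scene = image_model.reconstruct(environment_data)

    # Step 2: the generative placement model proposes a 2D floorplan,
    # here modeled as a mapping from object tag to an (x, y) position.
    floorplan = placement_model.propose(scene)

    # Step 3: place each 3D representation at the position given by the
    # generated 2D floorplan.
    for obj in scene.objects:
        if obj.tag in floorplan:
            obj.position = floorplan[obj.tag]
    return scene
```

The sketch keeps the 2D floorplan as a simple tag-to-position mapping; richer representations (orientation, footprint polygons) could be substituted without changing the overall flow.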
[0214] H2. The method of claim H1, wherein the environment is a physical space, and the plurality of objects include both structural and removable furniture units within the physical space, wherein the structural and removable furniture units comprise tangible decor units.
[0215] H3. The method of any of claims H1-H2, wherein the plurality of parameters relates to one or more characteristics of the object, wherein at least one parameter of the plurality of
parameters indicates a flag used to determine whether the 3D representation is associated with a structural unit or a removable unit.
[0216] H4. The method of any of claims H1-H3, wherein the outputted 2D floorplan representation of the 3D model comprises the 3D representations of the plurality of objects and the tags associated with each of the 3D representations.
[0217] H5. The method of any of claims H1-H4, further comprising loading a rule-based algorithm trained to convert the 3D model outputted from the image processing model into a 2D floorplan representation that is compatible and structurally compliant, wherein the rule-based algorithm is a natural language processing model trained using existing literature related to interior design.
[0218] H6. The method of any of claims H1-H5, further comprising generating, using the object placement model, a plurality of 2D floorplan representations, wherein each 2D floorplan representation shows an alternative arrangement of the plurality of objects within the environment, wherein each 2D floorplan representation is determined, using the rule-based algorithm, to be compatible and structurally compliant.
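As a rough sketch of the structural-compliance check described in H5-H6, the snippet below filters candidate 2D floorplans through simple hand-written rules. The specific rules (`fits_in_room`, `keeps_clearance`) and their thresholds are invented for illustration; the claims instead contemplate a rule base derived from interior-design literature.

```python
def fits_in_room(candidate, room_w, room_d):
    # Rule 1: every proposed position must lie inside the room boundary.
    return all(0 <= x <= room_w and 0 <= y <= room_d
               for (x, y) in candidate.values())

def keeps_clearance(candidate, min_gap=0.6):
    # Rule 2: any two objects must be at least `min_gap` metres apart (coarse).
    points = list(candidate.values())
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if (dx * dx + dy * dy) ** 0.5 < min_gap:
                return False
    return True

def compliant_floorplans(candidates, room_w, room_d):
    """Keep only candidate floorplans (tag -> (x, y) dicts) passing all rules."""
    rules = (lambda c: fits_in_room(c, room_w, room_d), keeps_clearance)
    return [c for c in candidates if all(rule(c) for rule in rules)]
```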
[0219] H7. The method of any of claims H1-H6, further comprising sending device information to a user device, wherein the device information can be used by the user device to create a graphical rendering of the plurality of 2D floorplan representations for display, wherein a user of the user device is able to see alternative arrangements of the plurality of objects within the environment.
[0220] H8. The method of any of claims H1-H7, further comprising receiving information, from the user device, indicating a 2D floorplan representation selected by the user and modifying the 3D model of the environment by placing the 3D representations of the plurality of objects within the 3D model of the environment in accordance with the selected 2D floorplan representation.
[0221] H9. The method of any of claims H1-H8, wherein the outputted 2D floorplan representation comprises at least one of a schematic representing a blueprint drawing of the floorplan and a spatial analysis of a 2D image of a 3D space.
[0222] H10. The method of any of claims H1-H9, wherein the output of the spatial analysis comprises at least one of a 2D schematic and a 3D block out scene.
[0223] I1. A system for material recognition of an object, the system comprising: a processing unit comprising one or more processors; a memory unit storing computer-readable instructions, wherein the system is configured to: receive one or more images of a furnishing; input the one or more images, or derivatives thereof, to a material recognition machine
learning model, wherein the material recognition machine learning model has been trained to generate an assessment of a surface material of a furnishing by analyzing a volume of sample images coded with known surface materials; and generate, using the machine learning model, an assessment of a surface material of the furnishing shown in the one or more images.
[0224] I2. The system of claim I1, wherein the one or more images show a plurality of furnishings, an object segmentation model identifies the plurality of furnishings, and each of the plurality of furnishings is independently analyzed using the material recognition machine learning model to assess the surface materials of each of the plurality of furnishings.
[0225] I3. The system of any of claims I1-I2, wherein the object segmentation model is used to identify at least one region of interest in the one or more images that contains a first object, and the region of interest of the one or more images is directly analyzed by the material recognition machine learning model to generate an assessment of a surface material of the first object shown in the region of interest.
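The per-furnishing analysis of I2-I3 can be pictured as cropping each segmented region of interest and scoring it independently, as in the hedged sketch below. The `segmentation_model.detect` and `material_model.classify` calls are assumed interfaces, not an actual API.

```python
def assess_materials(image, segmentation_model, material_model):
    """Assess the surface material of each furnishing detected in the image."""
    assessments = {}
    for detection in segmentation_model.detect(image):
        # Each detection is assumed to expose a label and a bounding box.
        x0, y0, x1, y1 = detection.box
        region = image[y0:y1, x0:x1]          # crop the region of interest
        # The material model is assumed to return a score per material class,
        # e.g. {"leather": 0.8, "fabric": 0.2}.
        assessments[detection.label] = material_model.classify(region)
    return assessments
```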
[0226] I4. The system of any of claims I1-I3, wherein the system is further configured to perform feature extraction on the one or more images of the furnishing to output one or more features related to the surface material of the furnishing.
[0227] I5. The system of any of claims I1-I4, wherein the material recognition machine learning model is a classifier model trained to categorize one or more input features into label classes, wherein the assessment of the surface material of the furnishing is a compilation of the categorized label classes.
[0228] I6. The system of any of claims I1-I5, wherein the extracted features relate to one or more of size, color, pattern, texture, compactness, contrast, or viewpoints associated with the object.
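One plausible reading of I4-I6 is that per-feature label classes are merged into a single surface-material assessment. The toy example below does this with a majority vote; the feature accessors (`mean_color`, `texture_energy`, `contrast`) and the per-feature `classifier.predict` interface are hypothetical.

```python
from collections import Counter

def extract_features(region):
    # Placeholder features named after those recited in I6; the actual
    # extraction (color statistics, texture descriptors, ...) is assumed
    # to be provided by the hypothetical `region` object.
    return {
        "color": region.mean_color(),
        "texture": region.texture_energy(),
        "contrast": region.contrast(),
    }

def compile_assessment(label_votes):
    """Compile per-feature label classes into one assessment (majority vote)."""
    label, votes = Counter(label_votes).most_common(1)[0]
    return {"material": label, "support": votes / len(label_votes)}

def classify_material(region, classifier):
    features = extract_features(region)
    # The classifier is assumed to emit one label class per input feature.
    votes = [classifier.predict(name, value) for name, value in features.items()]
    return compile_assessment(votes)
```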
[0229] J1. A method for material recognition of an object, the method comprising: receiving one or more images of a furnishing; inputting the one or more images, or derivatives thereof, to a material recognition machine learning model, wherein the material recognition machine learning model has been trained to generate an assessment of a surface material of a furnishing by analyzing a volume of sample images coded with known surface materials; and generating, using the machine learning model, an assessment of a surface material of the furnishing shown in the one or more images.
[0230] J2. The method of claim J1, wherein the one or more images show a plurality of furnishings, an object segmentation model identifies the plurality of furnishings, and each of the plurality of furnishings is independently analyzed using the material recognition machine learning model to assess the surface materials of each of the plurality of furnishings.
[0231] J3. The method of any of claims J1-J2, wherein the object segmentation model is used to identify at least one region of interest in the one or more images that contains a first object, and the region of interest of the one or more images is directly analyzed by the material recognition machine learning model to generate an assessment of a surface material of the first object shown in the region of interest.
[0232] J4. The method of any of claims J1-J3, further comprising performing feature extraction on the one or more images of the furnishing to output one or more features related to the surface material of the furnishing.
[0233] J5. The method of any of claims J1-J4, wherein the material recognition machine learning model is a classifier model trained to categorize one or more input features into label classes, wherein the assessment of the surface material of the furnishing is a compilation of the categorized label classes.
[0234] J6. The method of any of claims J1-J5, wherein the extracted features relate to one or more of size, color, pattern, texture, compactness, contrast, or viewpoints associated with the object.
[0235] The terminology used above may be interpreted in its broadest reasonable manner, even though it is being used in conjunction with a detailed description of certain specific examples of the present disclosure. Indeed, certain terms may even be emphasized above; however, any terminology intended to be interpreted in any restricted manner will be overtly and specifically defined as such in this Detailed Description section. Both the foregoing general description and the detailed description are exemplary and explanatory only and are not restrictive of the features, as claimed.
[0236] As used herein, the terms “comprises,” “comprising,” “having,” “including,” or other variations thereof, are intended to cover a non-exclusive inclusion such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements, but may include other elements not expressly listed or inherent to such a process, method, article, or apparatus.
[0237] In this disclosure, relative terms, such as, for example, “about,” “substantially,” “generally,” and “approximately” are used to indicate a possible variation of ±10% in a stated value.
[0238] The term “exemplary” is used in the sense of “example” rather than “ideal.” As used herein, the singular forms “a,” “an,” and “the” include plural reference unless the context dictates otherwise.
[0239] Other aspects of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
Claims
1. A system for generating a multi-dimensional representation of a physical space based, at least in part, on a designer model, the system comprising: a processing unit comprising one or more processors; a memory unit storing computer-readable instructions, wherein the system is configured to: receive environment data depicting a physical space that contains a plurality of objects, the environment data comprising a multi-dimensional representation of an environment; process, using a first processing model, the environment data, the first processing model being configured to output a 3D model of the physical space that includes 3D representations of the plurality of respective objects; based on the 3D model of the physical space, or a derivative thereof, generate, using a designer model, a design plan for the physical space, the design plan including a proposed arrangement of a plurality of objects within the physical space, the proposed arrangement comprising a plurality of object indicators, each object indicator indicating a position within the physical space at which a respective object is proposed to be located; for each object indicator, select, from a repository of available objects, a recommended object to be located at the position indicated by the respective object indicator, thereby generating a plurality of recommended objects associated with respective positions indicated by the respective object indicators; and populate the 3D model with the plurality of recommended objects located at the respective positions indicated by the respective object indicators.
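As an informal, non-limiting data-structure sketch of claim 1, a design plan can be represented as a list of object indicators, each later bound to a recommended item from the repository; the class and function names below are illustrative only and are not the claimed implementation.

```python
from dataclasses import dataclass

@dataclass
class ObjectIndicator:
    category: str      # e.g. "armchair"
    position: tuple    # proposed (x, y) location within the physical space

@dataclass
class RepositoryItem:
    sku: str
    category: str
    brand: str

def populate_model(model_3d, design_plan, repository, recommend):
    """Place one recommended repository item at each indicated position."""
    for indicator in design_plan:                 # design_plan: list of ObjectIndicator
        item = recommend(indicator, repository)   # pick the best-scoring item
        if item is not None:
            model_3d.place(item, indicator.position)   # hypothetical placement call
    return model_3d
```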
2. The system of claim 1, further configured to receive, from a user via an interface, a selection of a designer profile from a set of designer profiles, wherein the selection indicates a request from the user to have the proposed arrangement of the plurality of objects staged within the physical space according to how a designer associated with the selected designer profile would have staged the physical space.
3. The system of claims 1-2, further configured to input the 3D model of the physical space, or a derivative thereof, along with the selected designer profile to the designer model, the designer model being a generative artificial intelligence model that has been trained based on a volume of sample images that depict a plurality of sample spaces furnished according to the selected designer profile, wherein the sample spaces show at least: (i) spatial dimensions
of the respective sample space; (ii) designer-selected object placements within the respective sample space; and (iii) designer-selected furnishings of the respective sample space.
4. The system of claims 1-3, wherein selecting the plurality of recommended objects is performed by an object recommendation engine, the object recommendation engine being one or more rules-based algorithms configured to generate confidence scores for a plurality of previously registered furnishings stored in the repository based on a plurality of manually coded rules, wherein each confidence score indicates a confidence value that a designer of the designer profile would select to insert a given furnishing in place of the object indicator.
5. The system of claims 1-4, wherein at least one of the rules indicates that a previously registered furnishing associated with a brand of the selected designer profile is more likely to have a higher confidence score than an identical furnishing that is associated with another brand.
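Claims 4 and 5 can be illustrated with an additive rule-based score in which a brand match against the selected designer profile raises an item's confidence; the rule weights and the profile fields (`brands`, `favorites`) below are arbitrary placeholders used only for this sketch.

```python
def confidence_score(item, indicator, designer_profile):
    """Additive, manually coded rules; weights are arbitrary for the example."""
    score = 0.0
    if item.category == indicator.category:
        score += 0.5                                        # category match
    if item.brand in designer_profile.get("brands", []):
        score += 0.3                                        # brand rule of claim 5
    if item.sku in designer_profile.get("favorites", []):
        score += 0.2                                        # designer-favorite rule
    return score

def recommend(indicator, repository, designer_profile):
    """Return the repository item with the highest confidence score, if any."""
    scored = sorted(((confidence_score(i, indicator, designer_profile), i)
                     for i in repository), key=lambda pair: pair[0], reverse=True)
    return scored[0][1] if scored and scored[0][0] > 0 else None
```

Under this scoring, an item carrying the designer profile's brand outranks an otherwise identical item from another brand, matching the rule recited in claim 5.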
6. The system of claims 1-5, further configured to: receive, from a user device, a user instruction to generate or retrieve the populated 3D model with the plurality of recommended objects located at the respective positions indicated by the respective object indicators; and in response to receiving the user instruction, transmit for display by the user device a rendering of the populated 3D model, wherein the rendering can be edited by the user of the user device.
7. The system of claims 1-6, wherein editing comprises the option to swap at least one of the plurality of recommended objects with other previously registered furnishings from the repository of available objects.
8. The system of claims 1-7, wherein editing comprises deleting at least one of the plurality of recommended objects from a current rendering of the 3D model.
9. The system of claims 1-8, further configured to iteratively modify the graphical rendering of the populated 3D model based on sequential user instruction.
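The editing behavior of claims 6-9 amounts to applying a stream of swap and delete instructions to the current set of placements. In the sketch below, the instruction format and the assumption that the repository is a dict keyed by SKU are hypothetical conveniences, not claimed features.

```python
def apply_instruction(placements, instruction, repository):
    """placements: dict mapping a position key to a repository item (or None)."""
    if instruction["op"] == "delete":
        # Remove the recommended object from the current rendering (claim 8).
        placements[instruction["position"]] = None
    elif instruction["op"] == "swap":
        # Swap in another previously registered furnishing (claim 7).
        replacement = repository.get(instruction["sku"])
        if replacement is not None:
            placements[instruction["position"]] = replacement
    return placements

def edit_loop(placements, instruction_stream, repository):
    # Iteratively modify the rendering from sequential user instructions (claim 9).
    for instruction in instruction_stream:
        placements = apply_instruction(placements, instruction, repository)
    return placements
```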
10. A method for generating a multi-dimensional representation of a physical space based, at least in part, on a designer model, the method comprising: receiving environment data depicting a physical space that contains a plurality of objects, the environment data comprising a multi-dimensional representation of an environment; processing, using a first processing model, the environment data, the first processing model being configured to output a 3D model of the physical space that includes 3D representations of the plurality of respective objects;
based on the 3D model of the physical space, or a derivative thereof, generating, using a designer model, a design plan for the physical space, the design plan including a proposed arrangement of a plurality of objects within the physical space, the proposed arrangement comprising a plurality of object indicators, each object indicator indicating a position within the physical space at which a respective object is proposed to be located; for each object indicator, selecting, from a repository of available objects, a recommended object to be located at the position indicated by the respective object indicator, thereby generating a plurality of recommended objects associated with respective positions indicated by the respective object indicators; and populating the 3D model with the plurality of recommended objects located at the respective positions indicated by the respective object indicators.
11. The method of claim 10, further comprising receiving, from a user via an interface, a selection of a designer profile from a set of designer profiles, wherein the selection indicates a request from the user to have the proposed arrangement of the plurality of objects staged within the physical space according to how a designer associated with the selected designer profile would have staged the physical space.
12. The method of claims 10-11, further comprising inputting the 3D model of the physical space, or a derivative thereof, along with the selected designer profile to the designer model, the designer model being a generative artificial intelligence model that has been trained based on a volume of sample images that depict a plurality of sample spaces furnished according to the selected designer profile, wherein the sample spaces show at least: (i) spatial dimensions of the respective sample space; (ii) designer-selected object placements within the respective sample space; and (iii) designer-selected furnishings of the respective sample space.
13. The method of claims 10-12, wherein selecting the plurality of recommended objects is performed by an object recommendation engine, the object recommendation engine being one or more rules-based algorithms configured to generate confidence scores for a plurality of previously registered furnishings stored in the repository based on a plurality of manually coded rules, wherein each confidence score indicates a confidence value that a designer of the designer profile would select to insert a given furnishing in place of the object indicator.
14. The method of claims 10-13, wherein at least one of the rules indicates that a previously registered furnishing associated with a brand of the selected designer profile is
more likely to have a higher confidence score than an identical furnishing that is associated with another brand.
15. The method of claims 10-14, further comprising: receiving, from a user device, a user instruction to generate or retrieve the populated 3D model with the plurality of recommended objects located at the respective positions indicated by the respective object indicators; and in response to receiving the user instruction, transmitting for display by the user device a rendering of the populated 3D model, wherein the rendering can be edited by the user of the user device.
16. The method of claims 10-15, wherein editing comprises the option to swap at least one of the plurality of recommended objects with other previously registered furnishings from the repository of available objects.
17. The method of claims 10-16, wherein editing comprises deleting at least one of the plurality of recommended objects from a current rendering of the 3D model.
18. The method of claims 10-17, further comprising iteratively modifying the graphical rendering of the populated 3D model based on sequential user instruction.
19. A system for generating a multi-dimensional representation of an environment based, at least in part, on a design style, the system comprising: a processing unit comprising one or more processors; a memory unit storing computer-readable instructions, wherein the system is configured to: obtain a plurality of sample images depicting a plurality of sample environments; process, using an image processing model, the sample images, the image processing model being configured to output sample 3D models of the sample environments; input the sample 3D models, or derivatives thereof, to a design generation model, the design generation model being a generative artificial intelligence model that is configured to generate, based on a subject 3D model or derivative thereof, a design plan for a space represented by the subject 3D model, the design plan including a proposed arrangement of a plurality of objects within the space, the proposed arrangement comprising a plurality of object indicators, each object indicator indicating a position within the space at which a respective object is proposed to be located; obtain feedback regarding the proposed arrangement generated by the design generation model; and based on the feedback, modify the design generation model.
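Claim 19's feedback-driven training can be summarized as the loop sketched below, in which each proposed arrangement is scored and the score drives a model update; the `reconstruct`, `generate`, and `update` interfaces are placeholders rather than the claimed training procedure.

```python
def train_design_model(sample_images, image_model, design_model, get_feedback):
    """Obtain sample 3D models, propose arrangements, and update on feedback."""
    for image in sample_images:
        scene = image_model.reconstruct(image)       # sample 3D model
        plan = design_model.generate(scene)          # proposed arrangement
        feedback = get_feedback(scene, plan)         # e.g. a human rating
        design_model.update(scene, plan, feedback)   # modify the generation model
    return design_model
```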
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202363520334P | 2023-08-17 | 2023-08-17 | |
| US63/520,334 | 2023-08-17 | ||
| US202463621670P | 2024-01-17 | 2024-01-17 | |
| US63/621,670 | 2024-01-17 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025038935A1 (en) | 2025-02-20 |
Family
ID=94633198
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2024/042687 (published as WO2025038935A1, pending) | Multi-dimensional model generation for aesthetic design of an environment | 2023-08-17 | 2024-08-16 |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025038935A1 (en) |
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190250501A1 (en) * | 2016-06-21 | 2019-08-15 | Lam Research Corporation | Photoresist design layout pattern proximity correction through fast edge placement error prediction via a physics-based etch profile modeling framework |
| US20220343627A1 (en) * | 2017-04-27 | 2022-10-27 | Korrus, Inc. | Methods and Systems for an Automated Design, Fulfillment, Deployment and Operation Platform for Lighting Installations |
| US20190164340A1 (en) * | 2017-11-24 | 2019-05-30 | Frederic Bavastro | Augmented reality method and system for design |
| US20210173968A1 (en) * | 2019-07-15 | 2021-06-10 | Ke.Com (Beijing) Technology Co., Ltd. | Artificial intelligence systems and methods for interior design |
| US20210117071A1 (en) * | 2019-10-17 | 2021-04-22 | Rishi M. GHARPURAY | Method and system for virtual real estate tours and virtual shopping |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24854996; Country of ref document: EP; Kind code of ref document: A1 |