WO2025166341A1

WO2025166341A1 - Learning rigid body simulators over implicit shapes

Info

Publication number: WO2025166341A1
Application number: PCT/US2025/014302
Authority: WO
Inventors: Yulia RUBANOVA; Kimberly STACHENFELD; Tatiana LOPEZ GUEVARA; Kelsey Rebecca ALLEN; William Fairclough WHITNEY; Tobias PFAFF
Original assignee: DeepMind Technologies Ltd; Gdm Holding LLC
Current assignee: DeepMind Technologies Ltd; Gdm Holding LLC
Priority date: 2024-02-01
Filing date: 2025-02-03
Publication date: 2025-08-07
Anticipated expiration: 2026-08-01

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for simulating states of an environment over sequences of time steps. In one aspect, a method comprises, at each of a sequence of time steps: generating a graph representing an environment at the time step comprising a plurality of graph nodes and graph edges by, for each of one or more target objects in the environment for the time step: determining object-to-point distances between the target object and points associated with one or more neighboring objects based on a signed distance function for the target object; and creating one or more collision edges in the graph connecting graph nodes associated with the target object and the neighboring object based on the determined distances; and processing the graph using a graph neural network to generate an updated graph representing the environment at a next time step.

Description

Attorney Docket No.45288-0422WO1 LEARNING RIGID BODY SIMULATORS OVER IMPLICIT SHAPES BACKGROUND [0001] This specification relates to processing data using machine learning models. [0002] Machine learning models receive an input and generate an output, e.g., a predicted output, based on the received input. Some machine learning models are parametric models and generate the output based on the received input and on values of the parameters of the model. [0003] Some machine learning models are deep models that employ multiple layers of models to generate an output for a received input. For example, a deep neural network is a deep machine learning model that includes an output layer and one or more hidden layers that each apply a non-linear transformation to a received input to generate an output. SUMMARY [0004] This specification generally describes a system implemented as computer programs on one or more computers in one or more locations that can simulate states of an environment over sequences of time steps. In particular, the system can simulate rigid body collisions between objects in the environment using neural networks trained to model signed distance functions that specify surfaces of the simulated objects. [0005] According to a first aspect, there is provided a method, performed by one or more computers, for simulating a state of an environment over a sequence of time steps, the method comprising, at each of the sequence of time steps: generating a graph representing the state of the environment at the time step, wherein the graph comprises a plurality of graph nodes and a plurality of graph edges; and processing the graph using a graph neural network to generate an updated graph representing a state of the environment at a next time step. Generating the graph comprises, for each of one or more target objects in the environment for the time step: determining, for each of a set of one or more neighboring objects for the target object, object-to-point distances between the target object and points associated with the neighboring object based on a signed distance function for the target object; and creating one or more collision edges in the graph based on the determined distances, wherein each collision edge corresponds to the target object and a corresponding neighboring object and connects (i) a graph node associated with the target object and (ii) a graph node associated with the corresponding neighboring object. 37934022-1 Attorney Docket No.45288-0422WO1 [0006] In some implementations, the system can include neural networks trained to model the signed distance functions of the target objects. The system can use the neural network for a given target object to determine the object-to-point distances between for the target object. [0007] Particular embodiments of the subject matter described in this specification can be implemented so as to realize one or more of the following advantages. [0008] The system described in this specification enables accurate and computationally efficient simulation of environments, e.g., physical environments which include objects that can come into contact (collide) with one another. Resolving collisions during simulations can be computationally expensive. As the number of objects in the environment and the complexity of the shape of each object increases, exponentially more interactions must be considered to accurately simulate the effects of collisions. For instance, a mesh-based simulator can resolve the effects of a collision between objects by considering distances between nodes included in mesh representations of the objects. However, for objects with complex mesh representations or for large scale environments that include large numbers of interacting objects, evaluating and processing distances between large numbers of individual nodes can become prohibitively expensive. The system described in this specification addresses these issues by identifying and parameterizing a collision between a point on a first object with a surface of a second object by performing a single evaluation of a signed distance function to identify contact between a surface point on a first object and the surface of a second object, e.g., rather than by evaluating multiple individual distances between the surface point on the first object and multiple points on the surface of the second object. [0009] By identifying and parameterizing collisions between simulated objects using object- to-point distances for the simulated objects (e.g., rather than point-to-point distances for the simulated objects), the described simulation systems can therefore model collisions between objects in a simulated environment significantly with a significantly reduced computational cost (e.g., with respect to memory usage, computational time, etc.) compared to conventional methods. This can enable the described simulation systems to, e.g., simulate significantly larger environments for a same computational cost as conventional methods, utilize more complex neural networks to more accurately simulate environments at a same computational cost as conventional methods, and so on. [0010] Utilizing neural networks trained to model signed distance functions for simulated objects enables the described systems to more efficiently simulate environments. A neural network trained to model the signed distance function for an object can implicitly represent the surface of the object (e.g., as a level set at which the modeled signed distance function has a 37934022-1 Attorney Docket No.45288-0422WO1 zero value) and can be queried to determine point-to-object distances to the object in constant time. Additionally, signed distance functions permit reliable inside-outside tests for objects with arbitrary shapes, which enables more efficient collision detection. Further, the described systems can flexibly sample points on the surfaces of simulated objects using the modeled signed distance functions as part of generating the graph representing the simulated environment, enabling effective balancing of the accuracy and efficiency of the simulation (e.g., by sampling more points for improved accuracy, sampling fewer points for improved efficiency, and so on). [0011] Interactions between rigid objects are notoriously hard to model, and analytical simulators have difficulty modelling even simple interactions. This is a significant problem for robotics, which relies heavily on simulated environments. Learned simulators can address the so-called “sim-to-real” gap, but as described above can have computational difficulties in complex environments. The described techniques can address these problems. Further, because implementations are differentiable they can be used for solving inverse problems, such as design. [0012] The details of one or more embodiments of the subject matter of this specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages of the subject matter will become apparent from the description, the drawings, and the claims. BRIEF DESCRIPTION OF THE DRAWINGS [0013] FIG. 1A illustrates an example application of simulating an environment using a simulation system. [0014] FIG.1B is a block diagram of an example simulation system. [0015] FIG. 2 is a flow diagram of an example process for simulating an environment over a sequence of time steps. [0016] FIG. 3 is a flow diagram of an example process for preparing an initial graph representing one or more objects in an environment at a time step of a simulation of the environment. [0017] FIG. 4 is a flow diagram of an example process for processing a graph representing a state of an environment using a graph neural network to generate an updated graph. [0018] FIG.5 is a flow diagram of an example process for training a graph neural network. [0019] FIG.6 is a flow diagram of an example process for training a neural network to model a signed distance function for an object. 37934022-1 Attorney Docket No.45288-0422WO1 [0020] FIG. 7 illustrates experimental results comparing computational costs for simulating environments obtained using and implementation of the described simulation system and using conventional methods. [0021] FIG. 8 illustrates example large scale simulations performed using an implementation of the described simulation system. [0022] Like reference numbers and designations in the various drawings indicate like elements. DETAILED DESCRIPTION [0023] FIG. 1A illustrates an example application of simulating an environment using a simulation system 100. For illustrative purposes, FIG. 1A depicts an example application of simulating an environment that includes two objects. However, the simulation system 100, when configured as described throughout this specification, can be used to simulate environments that include any number of objects (e.g., 5 objects, 10 objects, 100 objects, 1000 objects, etc.). [0024] As part of simulating the environment, the simulation system 100 can simulate collisions or other interactions among the objects in the environment over the sequence of time steps. The objects can be, e.g., rigid objects, deformable objects, articulated objects, and so on. For each time step of the simulation system, the simulation system can simulate various motions of the objects within the environment (e.g., changes to shapes, orientations, positions, and so on of the objects within the environment) during the time step. For example, FIG.1A illustrates the simulation system 100 processing data characterizing a state of the environment at a timestep ^^ to simulate motions of the objects within the environment during the timestep ^^ _{and determine a state of the environment at a next timestep ^^ ^ 1.} [0025] In particular, as described in more detail below with reference to FIGs. 1B – 3, the simulation system 100 can process a graph representing the state of the environment at the timestep ^^ using a graph neural network to generate an updated graph representing the state of _{the environment at the next timestep ^^ ^ 1. The graph representing the state of the} environment at the timestep ^^ to include various graph nodes representing the objects within the environment at the timestep ^^, such as object nodes characterizing properties (e.g., positions, velocities, accelerations, orientations, masses, etc.) of the objects at the timestep ^^ and surface point nodes characterizing properties (e.g., positions, velocities, accelerations, etc.) of points on surfaces of the objects at the timestep ^^. To simulate collisions between the objects in the environment during the timestep ^^, the simulation system 100 can generate the graph 37934022-1 Attorney Docket No.45288-0422WO1 representing the state of the environment at the timestep ^^ to include collisions edges that characterize collisions between the objects in the environment during the timestep ^^. The graph neural network can process the graph representing the state of the environment at the timestep ^^ to simulate the collisions between the objects during the timestep ^^ (e.g., as characterized by collision edges generated by the simulation system 100) and generate the updated graph _{representing the state of the environment at the next timestep ^^ ^ 1 (e.g., by generating} updated graph nodes that characterize, e.g., positions, velocities, accelerations, orientations, _{masses, and so on of the objects at the next timestep ^^ ^ 1).} [0026] To reduce a complexity of the graph and thereby reduce computational costs (e.g., memory consumption, computation time, etc.) for simulating the environment, the simulation system 100 can represent each object within the environment using signed distance functions (SDFs) for the objects. A signed distance function for an object is a function that receives, as an input, a position of a given point and produces, as an output, a value and sign for a shortest distance from the given point to a surface of the object (e.g., a positive value when the given point lies within the surface of object and negative when the given point lies outside the surface of the object). In particular, the simulation system 100 can include a neural network for each object in the environment, such as a multi-layer perceptron (MLP) network, that is configured (e.g., trained) to model the signed distance function for the object. When simulating the environment, the simulation system 100 can use the signed distance functions for the objects in the environment to more efficiently identify and parameterize collisions between the objects in the environment. As described in more detail below, using the signed distance functions for the objects to identify and parameterize collisions between the objects can enable the system to simulate large scale environments that would otherwise be impractical to simulate using conventional methods. [0027] FIG. 1B shows an example simulation system 100. The simulation system 100 is an example of a system implemented as computer programs on one or more computers in one or more locations in which the systems, components, and techniques described below are implemented. [0028] The simulation system 100 can receive data characterizing an environment to generate a simulation of the environment. [0029] The environment can be any appropriate environment. For example, the environment may be a physical environment. A physical environment may refer to any type of physical system including, e.g., a fluid, a rigid solid, a deformable material, any other type of physical 37934022-1 Attorney Docket No.45288-0422WO1 system or a combination thereof. In some implementations, the physical environment comprises a real-world environment including a physical object (e.g., a manufacturing environment, a warehouse environment, a roadway environment, etc.). As another example, the environment can be a simulated environment, such as a virtual reality environment. The environment can include one or more agents that interact with the environment, e.g., robotic agents, vehicles, and so forth. The environment can include any of a variety of objects, e.g., tools, packages, mechanical parts, electrical parts, walls, floors, roadway surfaces, conveyor belt surfaces, and so forth. When the environment is a real-world environment, the system can simulate collisions or other interactions between simulated objects or agents with physical objects or agents that are present in the real-world environment. [0030] The simulation of the environment can include a respective simulated state of the environment at each time step in a sequence of time steps. For example, the state of the environment at a first time step can be provided as an input to the simulation system. For example, the simulation system 100 can, for one or more time steps in the simulation of the environment, receive data characterizing the state of the environment at the time step. The system 100 can simulate the environment based on the received data for the time step. At each time step in the sequence of time steps, the system 100 can process an input for the time step and generate a prediction of the state of the physical environment at the next time step. [0031] The sequence of time steps for the simulation can include any appropriate number of time steps, e.g., 10 time steps, 1000 time steps, or 100,000 time steps. [0032] For example, the simulation system 100 can obtain sensor readings from one or more sensors in the environment and can generate the state of the environment at a time step based on the sensor readings. The sensors can include, e.g., cameras, LIDAR sensors, RADAR sensors, and so on within the environment. [0033] For example, the simulation system 100 can receive images of the environment and can simulate the environment based on the received images. For example, the system 100 can generate the state of the environment at a time step based on the received images by, e.g., estimating positions and orientations of one or more target objects for the time step based on the images. [0034] In general, the simulation system 100 can receive and process object data 102 characterizing initial states for one or more objects in the environment to simulate the environment. In particular, the system 100 can process the object data 102 to predict various properties of the objects in the environment (e.g., shapes, orientations, positions, and so on of the objects within the environment) at each time step of the simulation and can generate 37934022-1 Attorney Docket No.45288-0422WO1 simulation results 104 that include the predicted properties for the objects throughout the simulation of the environment. For example, the system 100 can process the object data 102 to predict changes to the properties of the objects in the environment resulting from, e.g., simulated collisions or other interactions among the objects in the environment over the sequence of time steps. [0035] The object data 102 can include any of a variety of data characterizing the initial states for the one or more objects within the environment. For example, the object data 102 can include one or more observations of sensor data (e.g., images, observations of LIDAR data, etc.) that depict the initial state of the environment. As a particular example, the system 100 can receive object data 102 defining a 2D or 3D representation of a shape of a physical object in a real-world environment. For example, the object data 102 can include an image of the physical object as captured by a camera in the real-world environment, such as a depth camera. [0036] As another example, the initial environment mesh can include “object meshes”, e.g. triangle meshes, specifying initial properties (e.g., initial shapes, orientations, positions, and so on) for each of one or more simulated (e.g., non-physical) objects within the environment. [0037] The simulation system 100 can process the object data 102 to generate an object graph 106 representing the one or more objects within the environment. The object graph 106 can include a variety of graph nodes representing properties of the one or more objects in the environment. As an example, for each of the one or more objects within the environment, the object graph 106 can include a respective object node that represents the object. As another example, for each of the one or more objects within the environment, the object graph can include one or more surface point nodes associated with the object that each represent a respective point on a surface of the object. [0038] Each graph node of the object graph 106 can include a respective graph node embedding (e.g., an ordered collection of numerical values for the graph node, such as a vector, matrix, or other tensor of numerical values) that characterizes data associated with the graph node. For example, the graph node embedding for each object node can include data characterizing any of a variety of properties of the object represented by the object node, e.g., a position of a center of mass of the object, a velocity of the center of mass of the object, an acceleration of the center of mass of the object, a mass of the object, a moment of inertia of the object, a coefficient of friction of the object, a coefficient of restitution of the object, and so on. As another example, the graph node embedding for each surface point node can include data characterizing, e.g., a position of the point represented by the surface point node, a velocity of the point represented by the surface point node, an acceleration of the point represented by the 37934022-1 Attorney Docket No.45288-0422WO1 surface point node, and so on. In some implementations, the graph node embedding for each surface point node can include data characterizing properties of the object associated with the surface point node (e.g., the center of mass position, the center of mass velocity, the center of mass acceleration, the mass, the moment of inertia, the coefficient of friction, the coefficient of restitution, and so on for the object associated with the surface point node). [0039] The object graph 106 can include a variety of graph edges that each connect a respective pair of graph nodes within the object graph 106. Each graph edge can represent a relationship or interaction between the objects and surface points represented by the graph nodes connected by the graph edge. As an example, for each surface point node, the object graph 106 can represent the surface point node as characterizing a point on a surface of a same object as characterized by an object node by including an object-surface edge between the object node and the surface point node. As another example, to represent a predicted collision between a first object and a second object, the object graph 106 can include a collision edge between a graph node associated with the first object (e.g., an object node representing the first object, a surface point node representing a point on a surface of the first object, etc.) and a graph node associated with the second object (e.g., an object node representing the second object, a surface point node representing a point on a surface of the second object, etc.). [0040] Each graph edge of the object graph 106 can include a respective graph edge embedding (e.g., an ordered collection of numerical values for the graph edge, such as a vector, matrix, or other tensor of numerical values) that characterizes data associated with the graph edge. As an example, the graph edge embedding for each object-surface edge for a point on a surface of an object can include data characterizing, e.g., a displacement, a distance, and so on between the point and the center of mass of the object. As another example, the graph edge embedding for each collision edge between an object node for a first object and a surface point node for a second object can characterize, e.g., a distance between the first object and the point represented by the surface point node, a displacement between the point represented by the surface point node and a closest point on the surface of the first object to the point represented by the surface point node, a displacement between the center of mass of the first object and the closest point on the surface of the first object to the point represented by the surface point node, and so on. [0041] When the simulation system 100 generates the object graph 106, the system 100 can initialize the object graph 106 to include graph nodes and graph edges that represent the initial states of the one or more objects in the environment. As described in more detail below with reference to FIG. 3, when the system receives observations of sensor data depicting the 37934022-1 Attorney Docket No.45288-0422WO1 environment (e.g., images, observations of LIDAR data, etc.), the system can process the received sensor data to estimate the initial states of the one or more objects as part of initializing the object graph 106. [0042] For each of the sequence of time steps for the simulation, the simulation system 100 can update the object graph 106 to represent the properties of the objects in the environment (e.g., shapes, orientations, positions, and so on of the objects within the environment) at the time step. In particular, the object graph 106 can include a graph neural network 108 configured to process the object graph 106 at each time step to generate updates for the object graph 106. In particular, the graph neural network 108 can be configured (e.g., trained) to process the object graph 106 to generate updates for the object graph 106 characterizing predicted changes to the properties (e.g., shapes, orientations, positions, etc.) of the one or more objects in the environment at each time step. The graph neural network 108 can include various blocks (e.g., collections of one or more neural network layers in the graph neural network) configured to perform respective processing tasks as part of updating the object graph 106, such as an encoder block, a message passing block, a decoder block, and so on. [0043] As part of updating the object graph 106, the graph neural network 108 can perform message passing operations to propagate information along the graph edges of the object graph 106. Performing message passing to update the object graph 106 can enable the graph neural network 108 to rapidly propagate information (e.g., about collisions) between the graph nodes associated with the one or more objects in the environment. Examples of operations that can be performed by the graph neural network 108 to update the object graph 106 are described in more detail with reference to FIG. 4. Example techniques for training the graph neural network 108 are described in more detail below with reference to FIG.5. [0044] At each time step, after updating the object graph 106, the simulation system 100 can process updated data within the object graph 106 (e.g., updated graph node embeddings, updated graph edge embeddings, etc.) to generate the simulation results 104 for the time step. In particular, at each time step, the system 100 can process the updated object graph 106 to determine the predicted changes to the properties (e.g., shapes, orientations, positions, etc.) of the one or more objects in the environment for the time step. [0045] The simulation system 100 can include an object modeling system 110 configured to determine, for each of the one or more objects in the environment, a representation of the surface of the object. For example, the object modeling system 110 can determine, for each of the one or more objects, a respective signed distance function for the object that defines a predicted surface of the object. The simulation system 100 can use the signed distance 37934022-1 Attorney Docket No.45288-0422WO1 functions for the objects in the environment to efficiently identify collisions between the objects. For example, to determine whether a surface point from a first object is in contact with a second object, the system 100 can evaluate the signed distance function for the second object at the position of the surface point to determine a distance between the surface point and the second object. Performing a single evaluation of a signed distance function to identify contact between a surface point on a first object and the surface of a second object can significantly reduce the computational cost (e.g., with respect to memory usage, computational time, etc.) of identifying and characterizing collisions between the objects in the environment compared to, e.g., evaluating individual distances between the surface point and multiple candidate points on the surface of the second object. Example experimental results that demonstrate reductions to computational costs obtained by using the methods described in this specification are illustrated below in FIG.7. [0046] The object modeling system 110 can include object neural networks 112-A through 112-N that are each configured (e.g., trained) to predict a signed distance function for a respective object in the environment. As described in more detail below with reference to FIG. 6, the object modeling system 110 can process the object data 102 to train the object neural networks 112-A through 112-N for each of the one or more objects in the environment. [0047] High-quality simulators may require substantial computational resources, which makes scaling up prohibitive. The simulation system 100 described in this specification can generate simulations of complex environments over large numbers of time steps using fewer computational resources (e.g., memory and computing power) than some conventional simulation systems. For example, the simulation system 100 can predict the state of a physical environment at a next time step by a single pass through the graph neural network 108, while conventional simulation systems may be required to perform a separate optimization at each time step. In particular, the graph neural network 108 can learn to simulate complex physics directly from training data, and can generalize implicitly learned physics principles to accurately simulate a broader range of physical environments under different conditions than are directly represented in the training data. This also allows the system 100 to generalize to larger and more complex settings than those used in training. In contrast, some conventional simulation systems require physics principles to be explicitly programmed, and must be manually adapted for the specific characteristics of each environment being simulated. [0048] The above described systems and methods can be configured for hardware acceleration. In such implementations, the method is performed by data processing apparatus comprising one or more computers and including one or more hardware accelerators units e.g. one or more 37934022-1 Attorney Docket No.45288-0422WO1 GPUs (Graphics Processing Units) or TPUs (Tensor Processing Units). Thus there is provided a simulation method which is specifically adapted for implementation using hardware accelerators. [0049] The system 100 can provide the simulation results 104, e.g., for storage in a data storage device, or for transmission over a data communications network (e.g., the internet), or for presentation on a display of a user device (e.g., a smartphone, tablet, or personal computer). The system can store the simulation results 104, e.g., by storing data defining the object graph 106 at each time step of the simulation. The system can present the simulation results 104, by presenting data derived from the object graph 106 at time steps of the simulation, such as images depicting the simulation, a video depicting the simulation, a virtual reality environment depicting the simulation, an augmented reality environment depicting the simulation, and so on. [0050] Further, the system 100 can provide the simulation results 104 for use in one or more downstream processes. A few examples of downstream processes that can make use of the simulation results 104 generated by the simulation system 100 are described next. [0051] In some implementations, a downstream process can include designing the shape of an object included in the environment using the simulation results. More specifically, as described in more detail below with reference to FIG.6, the downstream process can include, at each of one or more optimization iterations: generating a simulation of the environment using parameters for an object neural network of the object modeling system 110 modeling a current shape of the object; evaluating an objective function that depends on the simulation; and backpropagating gradients of the objective function through the graph neural network 108 of the simulation system 100 to adjust the parameters for the object neural network of the object modeling system 110 that models the shape of the objectThe downstream process can thus iteratively adjust the shape of the object over a sequence of optimization iterations in order to optimize the objective function. The objective function can characterize, e.g., forces or stresses experienced by the object, deformation of the object, and so forth. Any suitable objective function can be used, e.g. an objective function that measures an error between a target characteristic of the object and the characteristic as determined from the simulation, an objective function that evaluates a feasibility or design criteria for the object, and so on. The downstream process can further include making (manufacturing) a physical object with the shape that optimizes the objective function. The physical object can be, e.g., a mechanical agent (e.g., a robotic agent, or a vehicle, or a watercraft, or an aircraft), machinery, equipment, tools, products, or components thereof, or any other mechanical structure. 37934022-1 Attorney Docket No.45288-0422WO1 [0052] For example, the downstream process can include processing a representation of the simulation to determine that a feasibility criterion is satisfied, and designing and/or constructing a physical apparatus or system in response to the feasibility criterion being satisfied. [0053] In some cases, the system can obtain a respective value for one or more design parameters and can generate a representation of an initial state of the environment using the respective values of the design parameters, and designing the article in response to the feasibility criterion being satisfied comprises designing the article based on the respective values of the one or more design parameters. [0054] In some implementations, the downstream task can include performing design optimization. For example as previously described, the data defining the state of the physical environment at the current time may comprise design parameters for an article. Performing design optimization of the article may then comprise adjusting the design parameters according to one or more design criteria for the object, e.g. to minimize stress in the object when subject to a force or deformation e.g. by including a representation of the force or deformation in the data defining the state of the physical environment. The process may include making a physical object with the optimized design parameters. The physical object may be e.g. for part of a mechanical structure. [0055] For example, if the design parameters represent a shape or structure of a physical object (e.g., an aircraft wing), then the design parameters can be provided for use in manufacturing an object having the design defined by the design parameters. The object can be manufactured using any appropriate manufacturing process, e.g., a machining process or an additive manufacturing process. In particular, the system can implement an appropriate manufacturing process to manufacture an object having the design defined by the design parameters. As another example, if the design parameters define the design of a process, e.g. a chemical process or a mechanical process, then the design parameters can be provided for use in implementing a process having the design defined by the design parameters. In particular, the system can implement a process having the design defined by the design parameters. When the design parameters define the shape or configuration of physical object, the method can include making a physical object to a design specified by the design parameters. [0056] In some cases, the design parameters can define, e.g., a shape of an object, e.g., all or part of a vehicle, e.g., a car, a truck, an aircraft, a watercraft, a rocket, etc. In particular examples, the design parameters can define the shape of a wing of an aircraft or the shape of a hull of a watercraft. The design parameters can define the shape of an object, e.g., by defining 37934022-1 Attorney Docket No.45288-0422WO1 a respective position of each control point in a set of control points that parametrize the shape of the object, or by defining the vertices and edges of a mesh representing the shape of the object. The system may simulate, e.g., fluid (e.g., air) dynamics in an environment. For example, the system can simulate a stress field or a pressure field in an environment, e.g., that defines a respective stress or pressure at each position in a grid or mesh spanning the environment. A feasibility criterion may be a measure of one or more aerodynamic features of the object (e.g., a drag coefficient or a lift coefficient of the object), or a measure of physical stress or force exerted on the object under specified environment conditions (e.g., the maximum stress exerted on any part of the object). A design criteria may be a measure of one or more aerodynamic features of the object (e.g., a drag coefficient or a lift coefficient of the object), or a measure of physical stress or force exerted on the object under specified environment conditions (e.g., the maximum stress exerted on any part of the object). [0057] In some cases, the design parameters can define, e.g., a structure of an object, e.g., of a vehicle, a bridge, or a building. In particular examples, the design parameters can define the structure of the chassis or frame of a vehicle, or the structure of supports within a bridge or building. The design parameters can define the structure of an object, e.g., by representing the positions, orientations, thicknesses, and connectivity of rods, beams, struts, and ties defining the structure of the object. The system may simulate, e.g., structural mechanics in an environment. For example, the system may simulate a force, stress, or pressure field, e.g., that defines a respective force, stress, or pressure at each position in a grid or mesh spanning the structure. A feasibility criterion may represent e.g., the behavior of the structure under a mechanical load, e.g., a maximum force, stress, or pressure on any part of the structure under the mechanical load. A design criteria may represent e.g., the behavior of the structure under a mechanical load, e.g., a maximum force, stress, or pressure on any part of the structure under the mechanical load. [0058] In some cases, the design parameters can define, e.g., a composition of a material, e.g., an alloy. In particular examples, the design parameters can define the composition of a material, e.g., by defining, for each of multiple possible constituent materials, a fraction of the material that is represented by the constituent material. The system may simulate, e.g.: changes in the chemical composition of the material over time resulting from specified environmental conditions; or a force, stress, or pressure field representing force, stress, or pressure at each position in a grid or mesh spanning an object made of the material. A feasibility criterion may characterize, e.g., corrosion of the material over time, or behavior of an object made of the 37934022-1 Attorney Docket No.45288-0422WO1 material under a mechanical load. A design criteria may characterize, e.g., corrosion of the material over time, or behavior of an object made of the material under a mechanical load. [0059] In some cases, the design parameters can define a design of a chemical process, e.g., defining when and how various chemicals should be combined in a chemical process. For example, the design parameters can define the speed of a mixer that agitates the contents of a vat, and for each chemical in a set of chemicals, when the chemical should be added to the vat and in what amount. The system may simulate, e.g., chemical dynamics within an environment. For example, the simulation neural network can simulate a concentration field in an environment, e.g., that defines a respective concentration of each of one or more chemicals at each position in a grid or mesh spanning the environment. A feasibility criterion may measure, e.g.: a yield of the chemical process, e.g., an amount of a desired end product that is produced as a result of the chemical process; or a quality (e.g., purity) of the end product. A design criteria may measure, e.g.: a yield of the chemical process, e.g., an amount of a desired end product that is produced as a result of the chemical process; or a quality (e.g., purity) of the end product. [0060] In some cases, the design parameters can define a design of a mechanical process, e.g., defining, for each fan in an environment (e.g., a mine): (i) a rotational speed of the blades of the fan, and (ii) an orientation of the fan. The system may simulate a flow field in the environment, e.g., that defines a respective direction of airflow, strength of airflow, and concentration of gases at each position in a grid or mesh spanning the environment. A feasibility criterion may characterize, e.g., a distribution and concentration of one or gases (e.g., oxygen) in the environment, e.g., as a result of the operation of the fans. A design criterion may characterize, e.g., a distribution and concentration of one or gases (e.g., oxygen) in the environment, e.g., as a result of the operation of the fans. [0061] In some implementations, an agent (e.g., a reinforcement learning agent) interacting with a physical environment may use the simulation system 100 to generate one or more simulations of the environment that simulate the effects of the agent performing various actions in the environment. In these cases, the agent may use the simulations of the environment as part of determining whether to perform certain actions in the environment. [0062] In some implementations, the simulation system 100 can use the simulated states of the environment to control an agent interacting with the environment. For example, the system 100 can simulate states of the environment when the agent performs a sequence of actions over a sequence of time, determining that a final state of the environment satisfies a success criterion, and then cause the agent to perform the sequence of actions in the environment when 37934022-1 Attorney Docket No.45288-0422WO1 the success criterion is satisfied. The system 100 can determine the success criterion based on whether respective locations, shapes, or configurations of the objects within the environment are within certain tolerances for the objects. [0063] The simulation system 100 can simulate the state of the environment based on data received from sensors of the agent. The agent can be any of a variety of agents interacting with the environment. For example, the agent can be a robot manipulating objects in the environment. As another example, the agent can be an autonomous vehicle navigating through the environment. [0064] For example, the simulation system 100 can use the simulated states of the environment for real-world control of an agent or device, in particular for control tasks, e.g. to assist a robot in manipulating a deformable object. Thus, the physical environment may comprise a real- world environment including a physical object e.g. an object to be picked up or manipulated. Determining the state of the physical environment at the next time step may comprise determining a representation of a shape or configuration of the physical object e.g. by capturing an image of the object. Determining the state of the physical environment at the next time step may comprise determining a predicted representation of the shape or configuration of the physical object e.g. when subject to a force or deformation e.g. from an actuator of a robot. The method may further comprise controlling the robot using the predicted representation to manipulate the physical object, e.g. using an actuator, towards a target location, shape or configuration of the physical object by controlling the robot to optimize an objective function dependent upon a difference between the predicted representation and the target location, shape or configuration of the physical object. Controlling the robot may involve providing control signals to the robot based on the predicted representation to cause the robot to perform actions, e.g. using an actuator of the robot, to manipulate the physical object to perform a task. For example, this may involve controlling the robot, e.g. the actuator, using a reinforcement learning process with a reward that is at least partly based on a value of the objective function, to learn to perform a task which involves manipulating the physical object. [0065] In some implementations, when the simulation system 100 generates a predicted state of the environment, the system 100 can compare the predicted state with an observed state of the environment in order to verify the predicted state of the environment. The observed state of the environment may be generated based on sensor readings. The observed state of the environment may be generated based on image data. In some implementations, the physical environment comprises a real-world environment including a physical object, and determining the state of the physical environment at the next time step comprises determining a 37934022-1 Attorney Docket No.45288-0422WO1 representation of a shape of the physical object at one or more next time steps. Verifying the predicted state of the environment may then also involve comparing a shape or movement of the physical object in the real-world environment to the representation of the shape to verify the simulation. In some cases, e.g. where the shape evolves chaotically, the comparison may be made visually, to verify whether the simulation is accurate by estimating a visual similarity of the simulation to ground truth defined by the shape or movement of the physical object in the real-world environment. Also or instead such a comparison may be made by computing and comparing statistics representing the physical object in the real-world environment and the simulation. [0066] In some cases, the simulation system 100 can generate a visual representation of the simulation, e.g., as a video, and can provide the visual representation of the simulation to a user of the system. The system 100 may provide a representation of the simulation for display on a user device. For example, the system 100 can generate images, video, virtual reality environments, augmented reality environments, etc., that depict the simulation. [0067] In some implementations, the simulation system 100 can train a machine learning model using training data based on the simulated states of the environment. For example, the machine learning model can be configured to predict properties of the environment and the system 100 can generate training data for the model using simulated states of the environment. As another example, the machine learning models can be an action selection policy for an agent interacting with the environment. [0068] As the machine learning model can be an action selection policy as part of a control system of an agent (e.g., a robot, a vehicle, etc.) interacting with the environment that controls actions taken by the agent in the environment, e.g. to perform a task, e.g. based on data received from sensors of the agent such as an image sensor. The simulation system 100 can be used to predict properties of the environment, e.g. by simulating a state of the environment (e.g., including a state or configuration of the agent) when (after) the agent performs a sequence of actions over a sequence of time steps. The predicted future state of the environment can be used by the control system, e.g. using model predictive control, to generate control signals to control the agent, e.g. to perform the task. [0069] As one further illustrative example, a part of a robot such as a hand or gripper may need to move towards an object to be manipulated, and it can be useful to simulate the effect of the part of the robot coming into contact with the object. The techniques described herein can be used to predict the motion of the object and/or robot part as a result of relative motion and then contact between the two, e.g. by modelling the robot part as a boundary or as another object in 37934022-1 Attorney Docket No.45288-0422WO1 the environment. As another further illustrative example a part of a robot such as leg or arm of the robot may move towards a boundary such as a floor or wall, and it can be useful to simulate the effect of the part of the robot coming into contact with the boundary, e.g. to predict the subsequent motion of the robot part. In general the simulation system 100 can be used to control an action of the robot, e.g. motion of the part of the robot, e.g. based on model predictive control. For example a control system of the robot can receive observations from one or more sensors characterizing a state of the environment in which the robot is operating, e.g. a position or motion of a part of the robot and/or object to be manipulated, and can use the simulated state of the physical (real-world) environment at a future time step, to control the or another part of the robot, e.g. to perform a particular task in the environment such as moving an object within the environment or navigating within the environment. [0070] As a further example, the simulation system 100 can generate training data by simulating states of the environment in which the agent interacts with the environment by performing actions based on the action selection policy. Once trained, the machine learning model can be used to control an agent interacting with the environment for example. [0071] As another example, the machine learning model can be a generative model. The training data may comprise representations of one or more simulated states of the environment. The methods may comprise generating representations of the simulated states. In some cases, a visual representation of the simulation may be generated, e.g., as a video or images. The machine learning model may be a video generation model or an image generation model. The machine learning model may be a multi-modal model. The machine learning model may be a foundation model and training the machine learning model using the training data may comprise fine tuning the foundation model. For example, training the machine learning model may comprise fine tuning the foundation model using videos or images generated from the simulated states. [0072] FIG. 2 is a flow diagram of an example process for simulating an environment over a sequence of time steps. For convenience, the process 200 will be described as being performed by a system of one or more computers located in one or more locations. For example, a simulation system, e.g., the simulation system 100 of FIG. 1B, appropriately programmed in accordance with this specification, can perform the process 200. [0073] At each time step, the system can obtain data characterizing a state of the environment at the time step (step 202). In general, the data characterizing the state of the environment at the time step can be any of a variety of data characterizing properties of the one or more objects within the environment at the time step (e.g., shapes, orientations, positions, and so on of the 37934022-1 Attorney Docket No.45288-0422WO1 one or more objects within the environment at the time step). As one example, the data characterizing the state of the environment at the time step can be a graph representing the state of the environment at the time step (e.g., a graph previously generated by the system following steps 202 through 208 for a previous time step of the simulation). As another example, the data characterizing the state of the environment can include sensor readings (e.g., observations of sensor data) from one or more sensors in the environment, such as images of the environment obtained by cameras in the environment, observations of LIDAR data obtained by LIDAR sensors in the environment, and so on. As a particular example, for the first time step of the simulation, the data characterizing the state of the environment for the first time step can be, e.g., structured numerical data specifying an initial state of the environment for the simulation, sensor readings depicting an initial state of the environment for the simulation, and so on. [0074] The system can receive the data characterizing the state of the environment at the time step from any appropriate source. For example, the system can retrieve a graph characterizing the state of the environment at the time step from a memory of the system as previously generated and stored by the system within the memory. As another example, the system can receive the data characterizing the state of the environment at the time step from, e.g., a user, an upstream system, and so on (e.g., by means of a user interface of the system, an application programming interface (API) of the system, etc.). [0075] The system can prepare (e.g., generate) an initial graph for the time step that represents the state of the environment at the time step (step 204). As described above, the initial graph for the time step can include various graph nodes (e.g., object nodes, surface point nodes, etc.) representing one or more objects within the environment at the time step. [0076] The system can generate the initial graph for the time step to represent a set of target objects for the time step. In some implementations, the system can generate the initial graph for the time step to represent each of the one or more objects within the environment as the target objects for the time step. In other implementations, the system can identify a subset of the objects within the environment that are likely to collide within the time step and can generate the initial graph for the time step to represent the identified subset of objects as the target objects for the time step. [0077] In some implementations, the system can generate the initial graph to include a respective object node representing each of the one or more objects within the environment. To reduce the complexity of the initial graph for the time step (e.g., so as to limit computational costs of processing the initial graph, such as memory consumption, processing time, etc.), the system can include surface point nodes within the initial graph for the time step only for the 37934022-1 Attorney Docket No.45288-0422WO1 target objects at the time step and can omit including surface point nodes associated with objects in the environment that are not target objects for the time step. When the initial graph for the time step includes object nodes representing each object in the environment and includes surface point nodes only for the target objects for the time step, the system can process the initial graph for the time step to simulate motions for each object in the environment during the time step while only processing data characterizing surfaces for objects that are likely to collide during the time step. [0078] An example process for preparing the initial graph for the time step is described in more detail below with reference to FIG.3. [0079] The system can process the initial graph for the time step using a graph neural network to generate an updated graph representing a state of the environment at a next time step (step 206). The graph neural network can be configured (e.g., trained) to process the initial graph for the time step to generate updates for the graph characterizing predicted changes to the properties (e.g., shapes, orientations, positions, etc.) of the one or more objects in the environment during the time step. As part of processing the initial graph for the time step, the graph neural network can perform message passing operations that propagate information along the graph edges of the initial graph for the time step to determine updated graph node embeddings for the graph. Examples of operations that can be performed by the graph neural network to generate the updated graph representing the state of the environment at the next time step are described in more detail with reference to FIG. 4. Example techniques for training the graph neural network are described in more detail below with reference to FIG.5. [0080] At each time step, the system can determine whether the simulation is complete (step 208). The system can determine whether the simulation is complete using any of a variety of criteria. As an example, the system can determine that the simulation is complete after simulating the environment for a pre-defined number of time steps. As another example, the system can determine that the simulation is complete after receiving a request (e.g., from a user, an upstream system, etc.) to terminate the simulation of the environment (e.g., by means of a user interface of the system, an API of the system, etc.). [0081] When the system determines that the simulation is complete, the system can output simulation results characterizing predicted states of the environment throughout the time steps of the simulation (step 210). For example, the simulation results can include the sequence of graphs characterizing states of the environment (e.g., as generated at each time step of the simulation following step 206). As another example, the system can process the graphs generated for each time step of the simulation to generate a representation of the simulated 37934022-1 Attorney Docket No.45288-0422WO1 states of the environment. For example, the system can generate representation of the simulated states of the environment that includes, e.g., one or more images depicting the simulated states of the environment, a video depicting the simulated states of the environment, a virtual reality environment depicting the simulated states of the environment, an augmented reality environment depicting the simulated states of the environment. The system can provide (e.g., transmit) the generated representation of the simulated states of the environment for display on a user device. [0082] FIG. 3 is a flow diagram of an example process for preparing an initial graph representing one or more objects in an environment at a time step of a simulation of the environment. For convenience, the process 300 will be described as being performed by a system of one or more computers located in one or more locations. For example, a simulation system, e.g., the simulation system 100 of FIG. 1B, appropriately programmed in accordance with this specification, can perform the process 300. [0083] The system can process data characterizing the state of the environment at the time step determine target objects for the time step (step 302). [0084] In general, the data characterizing the state of the environment at the time step can be any of a variety of data characterizing properties of the one or more objects within the environment at the time step (e.g., shapes, orientations, positions, and so on of the one or more objects within the environment at the time step). The system can process the data characterizing the state of the environment at the time step to determine positions and orientations of the objects within the environment for use in generating the initial graph for the time step. [0085] As one example, the data characterizing the state of the environment at the time step can be a graph representing the state of the environment at the time step as previously generated by the system as part of simulating the environment (e.g., a graph generated by the system following step 206 described above with reference to FIG. 2). The system can generate the initial graph for the time step by using simulated positions and orientations of the objects within the environment as characterized by the previously generated graph. [0086] As another example, the data characterizing the state of the environment can include sensor readings (e.g., observations of sensor data) from one or more sensors in the environment, such as images of the environment obtained by cameras in the environment, observations of LIDAR data obtained by LIDAR sensors in the environment, and so on. The system can process the sensor readings (e.g., images, observations of LIDAR data, etc.) of the environment to estimate the positions and orientations of the objects within the environment for use in 37934022-1 Attorney Docket No.45288-0422WO1 generating the initial graph for the time step. In particular, as described in more detail below with reference to FIG. 6, the system can estimate the positions and orientations of the objects within the environment as part of training neural networks to model signed distance functions for an object using the sensor readings as training data.. [0087] As an example, the system can evaluate object-to-object distances between each of the objects in the environment (e.g., based on positions of the objects in the environment as determined by processing the data characterizing the state of the environment) and can identify each object within the environment as a target object for the time step if the object is within a predefined threshold object-to-object distance of another object in the environment at the time step. As a further example, the system can identify objects α and β as target objects for the time step when the condition: ห_{^^^ െ ^^ஒห ^ τை} [0088] Is satisfied, where τ_ை is a object distance and where ^^_^ and ^^ _ஒ are, e.g., respective center of mass positions for the objects α and β at the time step. [0089] As part of identifying the target objects for the time step, the system can identify a respective set of neighboring objects for each target object. For example, when the above condition is satisfied, the system can consider the object α and β to be neighboring objects of one another at the time step (e.g., by including the object α within a set of neighboring objects for the object β and including the object β within a set of neighboring objects for the object α). [0090] The system can generate graph nodes for the initial graph at the time step that represent the target objects for the time step (step 304). In particular, for each target object for the time step, the system can generate an object node representing the target object and one or more surface point nodes for the target object that each represent a respective point on a surface of the target object. [0091] For each target object for the time step, the system can generate a respective surface point node representing each of a set of identified points on the surface of the target object. The system can determine the set of identified points on the surface of each target object before performing the simulation of the environment. As an example, when the system receives object meshes for the objects in the environment, the set of identified points on the surface for each target object can include points from the object mesh for the target object. As another example, when the system obtains signed distance functions for the objects in the environment (e.g., by including neural networks modeling the signed distance functions for 37934022-1 Attorney Docket No.45288-0422WO1 the objects), the system can determine the set of identified points on the surface of an object by sampling points on the surface of the object using the signed distance function for the object (e.g., by sampling points for which the signed distance function for the object evaluates to a value of zero). [0092] The system can use any of a variety of techniques to sample points on the surface of the objects using the signed distance function for the object. For example, the system can sample points around the object from a distribution of points around the object and can determine corresponding closest points on the surface of the object using the signed distance function for the object. As another example, the system can use a MarchingCubes algorithm to sample points on the surfaces of the objects using signed distance functions for the objects, as described in more detail by Lorensen and Cline in "Marching Cubes: A High Resolution 3D Surface Construction Algorithm", Seminal Graphics: Pioneering Efforts that Shaped the Field, 1998, 347-353. [0093] In some implementations, the system can generate object nodes representing each object within the environment at the time step. To reduce a complexity of the initial graph for the time step, the system can omit generating surface point nodes for objects in the environment that are not identified as target nodes for the time step. [0094] The system can generate a respective graph node embedding for each object node that characterizes properties of the target object represented by the object node, such as dynamic properties of the target object at the time step (e.g., a position, velocity, and/or an acceleration of the center of mass of the object at the time step), static properties for the target object (e.g., a mass, a moment of inertia, a coefficient of friction, a coefficient of restitution, and so on for the target object), and so on. In some implementations, the graph node imbedding for each object node can characterize dynamic properties of the target object at one or more previous time steps (e.g., positions, velocities, and/or accelerations of the center of mass of the object at the previous time steps). [0095] The system can generate a respective graph node embedding for each surface point node that characterizes dynamic properties of the point represented by the surface point node at the time step (e.g., a position, velocity, and/or an acceleration of the point at the time step). In some implementations, the graph node imbedding for each surface point node can characterize dynamic properties of the point represented by the surface point node at one or more previous time steps (e.g., positions, velocities, and/or accelerations of the point at the previous time steps). In some implementations, the graph node embedding for each surface point node can include data characterizing static properties of the point represented by the 37934022-1 Attorney Docket No.45288-0422WO1 surface point node (e.g., a mass associated with the point, the mass of the target object associated with the surface point node, the moment of inertia of the target object associated with the surface point node, the coefficient of friction of the target object associated with the surface point node, the coefficient of restitution of the target object associated with the surface point node, etc.). [0096] When the system generates a surface point node for a target object at the time step, the system can generate an object-surface edge between the surface point node and the object node representing the target object. The system can generate a respective graph edge embedding for each for each object-surface edge characterizing, e.g., a displacement, a distance, and so on between the point and the center of mass of the object as represented by the surface point node and the object node connected by the object-surface edge. [0097] As an illustrative example, a graph node embedding for an i-th surface point on an object α can represent features ^^^_௧ _{^^ , ^^௧ି^} _{^^ , ^|^^௧} _{^^ |^, ^ห^^௧ି^} _{^^ ห^ ,ϕ^^, where ^^^} ^௧ ఈ is a velocity of the point at the time step, ^^_^ ^௧ ఈ^{ି^} is a time step, and ^^_ఈis a vector of static features for the object α a mass, a of friction, and a coefficient of restitution of the object α. As another illustrative example, a graph node embedding for an _{object ^^ can represent features ൫^^ఈ௧ , ^^ఈ௧ି^, ห|^^ఈ௧ |ห, ห|^^ఈ௧ି^|ห,^^ఈ൯, where ^^ఈ௧ is a velocity of a} center of mass of the object ^^ of the center of mass of the object ^^ at a previous time step, and ^^_ఈis a vector of static features for the object ^^ including a mass, a coefficient of friction, and a coefficient of restitution of the object ^^. As another illustrative example, a graph edge embedding for an object-surface edge for an object β and _{an i-th surface point one an object α can represent features ^^^ஒ െ ^^^^, ^ห^^ஒ െ ^^^^ห^^, where} _{^^ఉ െ ^^^ఈ is a displacement between a center of mass of} [0098] The system can generate collision edges for the initial graph at the time step that represent collisions between the target objects in the environment during the time step (step 306). [0099] To identify any collisions between the target objects at the time step, the system can determine, for each target object α and each neighboring object β for the target object α, respective object-to-point distances between the object α and one or more points associated with the neighboring object β. In particular, the system can determine the object-to-point distances between the object α and the one or more points associated with the neighboring object ^^ using a signed distance function, ^^_ఈ, for the object ^^. Each of the one or more points 37934022-1 Attorney Docket No.45288-0422WO1 associated with the neighboring object ^^ can be points on the surface of the neighboring object ^^ that are represented by surface point nodes associated with the neighboring object ^^ within the initial graph for the time step. [0100] The signed distance function, ^^_ఈ, for each object ^^ can specify, for each point within the environment, a signed distance between the point within the environment and a closest point on the surface of the object α. For example, the signed distance function, ^^_ఈ, for an object ^^ can specify a signed object-to-point distance between the object ^^ and a ^^-th point associated with an object ^^, ^^_^ಊ, (e.g., a point on the surface of the object ^^ represented by a ^^-th surface point node associated with the object ^^ within the initial graph for the time step) following: ^_{^൫^^ ఈ∗ ൫^^^ಊ൯ , ^^^ಊ൯, ^^^^ ^^ ಊ ^^^^ ^^^^^^^^^^^^^^ ^^^^^^^^^^^^ α} _{^^ ^ ^} _{ఈ൫ ^^^൯ ൌ ^} _{^^^^ ^^^^ ^^^^^^^^^^^^ ^^^^^^^^^^^^ α} _{where ^^ఈ} ^∗ _൫^^^ಊ൯ and ^_{^^ಊ൯ is a point-to-point distance (e.g., a Euclidean distance) between the point} the point ^^_^^. the points on the surface of an object α are the points for which the signed distance function, ^^_ఈ, a value of zero, the signed distance function for the object α implicitly _{represents the surface of the object α as a level set (e.g., as the level set ^^^ ൌ ^^^ | ^^_^α^^^^^ ൌ} 0^). [0102] For each object in the environment, the system can include a neural network configured (e.g., trained) to process a network input specifying a position of a point in the environment to generate a predicted value of the signed distance function (SDF) for the object at the point. An example process for training a neural network to model the signed distance function for an object using training data for the object is described in more detail below with reference to FIG.6. [0103] Because the SDF for an object directly provides a closest distance to the surface of the object, using the SDFs for the objects enables the system to more efficiently evaluate point- to-surface distances between the objects as part of performing the simulation. Additionally, modeling the SDF for an object α using a neural network, ^^_^ ^{^}, enables the system to _{efficiently determine the respective closest point on the} _{of the object ^^, ^^∗} _{ఈ ^^^^ , to any} given point ^^ in the environment. In particular, because ∇^^_^ ^{^}^^^^ (the gradient of ^^_^ ^{^}with respect to the network input as evaluated at the point ^^) points towards the shortest path from 37934022-1 Attorney Docket No.45288-0422WO1 ^^ to the surface of object α and ^^_ఈ _{ఏ ^^^^ provides the distance between ^^ and ^^∗} _{ఈ ^^^^ , the} _{closest point ^^∗} _{ఈ ^^^^ can be efficiently determined using a single forward pass and a single} backward pass through the neural network ^^_ఏ ^ఈ following: ^_^∗ _{ఈ ^^^^ ൌ ^^ െ ^^ఏ} ^ఈ^^^^∇^^_ఏ ^ఈ^^^^ [0104] A neural network modeling the for an object α can be configured to process positions for points around co-ordinate space for the object α in which the object α has a particular reference pose (e.g., having a particular reference position, such as the origin of the reference co-ordinate space, and having a particular reference orientation). The position and rotation of the object α within the environment defines a transformation, ^^_^, that translates and rotates positions within the reference co-ordinate space to corresponding positions within a co-ordinate space for the environment (e.g., in which the object α is translated and rotated from the reference pose). The SDF for the object α can be evaluated for a point ^^^ᇱ as specified in the co-ordinate space for the environment by evaluating ^^_^ ^{^} _൫ ^{^^} _^ ^{ି^^^^ᇱ^} _൯ ^{, where ^^} _^ ^{ି^ is the inverse transformation of} ^^_^, which transforms and rotates the co-ordinate space for the environment to corresponding positions within co-ordinate space. Similarly, the closest point _{within the co-ordinate space for the environment, ^^∗} _{ఈ ^^^ᇱ^ , on the surface of the object α to a} point ^^^ᇱ within the co-ordinate space for the environment can be determined using the neural network ^^_^ ^{^} following: ^_^∗ _{ఈ ^^^ᇱ^ ൌ ^^ᇱ െ ^^^} ^{^}൫^^_^ ^{ି^}^^^^ᇱ^൯ ^^_^ ^∇^^_^ ^{^}൫^^_^ ^{ି^}^^^^ᇱ^൯^ [0105] When the an object ^^ and a ^^-th point associated with an object ^^ (e.g., a point on the surface of the object ^^ represented by a ^^-th surface point node associated with the object ^^ within the initial graph for the time step) satisfies a threshold distance, the system can create a collision edge within the initial graph for the time step between a graph node associated with the object ^^ and a graph node associated with the object ^^ that represents a collision between the objects ^^ and ^^ during the time step. In particular, the system can include a collision edge between graph nodes associated with the objects ^^ and ^^ when the condition: ^_{^ఈ൫^^^^൯ ^ τ^} is satisfied for a j-th point on the surface where τ_^ is a threshold collision distance. Here, as previously described, ^^_ఈ can be the SDF for the object α (e.g., as modeled by a neural network, ^^_^ ^{^}, for the object α). Because the SDF can directly determine distances 37934022-1 Attorney Docket No.45288-0422WO1 to the surface of object α without evaluating distances to individual surface points of the object α, the above condition for collision detection does not rely on the density of the surface point nodes for the object α. [0106] For example, when the above condition is satisfied, the system can include a collision edge within the initial graph for the time step between the object node representing the object ^^ and a surface point node for the object ^^ representing the point ^^_^^. As another example, when the above condition is satisfied, the system can include a collision edge within the initial graph for the time step between the object node representing the object ^^ and the object node representing the object ^^. [0107] When the system includes a collision edge within the initial graph, the system can generate an edge embedding for the collision edge that includes data characterizing the collision represented by the collision edge. For example, when the system determines that the object-to-point distance between the object ^^ and the ^^-th point, ^^_^^, associated with the neighboring object ^^ satisfies the above criteria for including a collision edge for the object ^^ and the object ^^, the system can generate an edge embedding for the included collision edge that characterizes, e.g., the object-to-point distance between the object ^^ and the point ^^_^^ (e.g., the value of the signed distance function, ^^_ఈ൫^^_^^൯), a displacement between the point _{^^^^ and the closest point, ^^ఈ} ^∗ _൫^^^ಊ൯ , on the surface of the object ^^ to the point ^^_^^ (e.g., the _{displacement ^^∗} _{ఈ ൫^^^ಊ൯ െ ^^^ಊ), a displacement between a center of mass of the object ^^, ^^ఈ,} _{and the closest point, ^^ఈ} ^∗ _൫^^^ಊ൯ , on the surface of the object ^^ to the point ^^_^^ (e.g., the _{displacement ^^ఈ െ ^^ఈ} ^∗ _൫^^^ಊ൯ ), and so on. [0108] As an illustrative example, a graph edge embedding for an collision edge for an object _{α and an j-th surface point on an object β, ^^ ∗} _{^ಊ, can represent features ^^^ఈ ൫^^^ಊ൯ െ ^^^ಊ , ^^ఈ െ} _^^∗ _{ఈ ൫^^^ಊ൯ , ^ห^^∗} _{ఈ ൫^^^ಊ൯ െ ^^^ಊห^ , ^ห^^ ∗ ∗} _{ఈ െ ^^ఈ ൫^^^ಊ൯ ห^^, where ^^ఈ ൫^^^ಊ൯ െ} _displacement object ^^ to the _{point ^^^^ and where ^^ఈ െ ^^ఈ} ^∗ _൫^^^ಊ൯ is the between a center of mass of the object _{^^, ^^ఈ, and the closest} [0109] The system can then the initial graph for the time step by including the generated graph nodes and graph edges within the initial graph (step 308). After preparing the initial graph for the time step, the system can process the initial graph for the time step to 37934022-1 Attorney Docket No.45288-0422WO1 simulate the objects in the environment during the time step (e.g., as part of performing step 206 described above with reference to FIG.2). [0110] FIG.4 is a flow diagram of an example process 400 for processing a graph representing the current state of the environment using a graph neural network to generate an updated graph. For convenience, the process 400 will be described as being performed by a system of one or more computers located in one or more locations. For example, a simulation system, e.g., the simulation system 100 of FIG. 1B, appropriately programmed in accordance with this specification, can perform the process 400. [0111] The system can receive graph data defining the graph representing the current state of the environment at a time step (step 402). [0112] The system can update a respective embedding of each graph node and each graph edge in the graph using an encoder block of the graph neural network (step 404). For instance, for each object node within the graph, the system can process the graph node embedding of the object node using an object node embedding subnetwork included in the encoder block to generate an updated graph node embedding for the object node. As another example, for each surface point node within the graph, the system can process the graph node embedding of the surface point node using a surface point node embedding subnetwork included in the encoder block to generate an updated graph node embedding for the surface point node. As another example, for each object-surface edge within the graph, the system can process the graph edge embedding of the object-surface edge using an object-surface edge embedding subnetwork included in the encoder neural network to generate an updated graph edge embedding for the object-surface edge. As another example, for each collision edge within the graph, the system can process the graph edge embedding of the collision edge using a collision edge embedding subnetwork to update the graph edge embedding for the collision edge. [0113] The system can iteratively update the embeddings of the graph nodes and the graph edges in the graph over a sequence of update iterations using a sequence of message passing blocks of the graph neural network (step 406). In particular, each message passing block can be configured to process the embeddings of the graph nodes and the graph edges to update the embeddings of the graph nodes and the graph edges by operations that are parametrized by a set of message passing block parameters and are conditioned on the topology of the graph. The graph neural network can include any appropriate number of message passing blocks, e.g., 2 message passing blocks, or 10 message passing blocks, or 50 message passing blocks. The message passing blocks can, but are not required to, share the same message passing block parameters. 37934022-1 Attorney Docket No.45288-0422WO1 [0114] Each message passing block can include one or more edge updating subnetworks for updating the graph edge embeddings of object-surface edges and collision edges within the graph. More specifically, for an object-surface edge or a collision edge connecting a first graph node and a second graph node, an edge updating subnetwork can be configured to process: (i) the graph edge embedding of the edge, (ii) the graph node embedding of the first graph node, and (ii) the graph node embedding of the second graph node, to generate an updated graph edge embedding for the graph edge. In some cases, each message passing block can include respective (different) edge updating subnetworks for object-surface edges and collision edges. [0115] Each message passing block can include one or more node updating subnetworks for updating the graph node embeddings of object nodes and surface point nodes within the graph. More specifically, for an object node or a surface point node, a node updating subnetwork for can be configured to process: (i) the graph node embedding of the graph node, and (ii) the graph edge embeddings of each graph edge that connects to the graph node, to generate an updated graph node embedding for the graph node. In some cases, each message passing block can include respective (different) node updating subnetworks for object nodes and surface point nodes. [0116] After iteratively updating the embeddings of the graph nodes and the graph edges using the sequence of message passing blocks, the system can process the updated embeddings using a decoder block of the graph neural network to generate a network output for the graph neural network (step 408). In particular, the decoder block can process the updated node embeddings for the some or all of the graph nodes (e.g., as generated by the last message passing block) to generate output embeddings for the graph nodes that characterize a simulated motion of the objects in the environment during the time step. For example, the decoder block can process the updated node embeddings for object nodes within the graph to generate output graph node embeddings for the object nodes that characterize overall motions of the objects during the time step (e.g., changes to center of mass positions for the objects, rotations of the objects, and so on). As another example, the decoder block can process the updated node embeddings for surface point nodes within the graph to generate output graph node embeddings for the surface point nodes that characterize motions of the individual points represented by the surface point nodes during the time step (e.g., which collectively characterize changes to, e.g., shapes, positions, orientations, etc. of the objects in the environment associated with the surface point nodes). [0117] In some implementations the inputs to and outputs from the graph neural network are normalized to zero mean and unit variance. 37934022-1 Attorney Docket No.45288-0422WO1 [0118] The graph neural network and its constituent parts (e.g., the encoder block, the message passing blocks, and the decoder block) can be configured (e.g., trained) to generate the output graph node embeddings to include any of a variety of data characterizing the simulated motion of the objects in the environment during the time step. For example, the graph neural network and its constituent parts can be configured to generate output graph node embeddings that specify any appropriate combination of, e.g., final positions of the objects in the environment for the time step, final positions of the surface points in the environment for the time step, final rotations of the objects in the environment for the time step, velocities of the objects in the environment for the time step, velocities of the surface points in the environment during the time step, angular velocities of the objects in the environment during the time step, accelerations of the objects in the environment for the time step, accelerations of the surface points in the environment during the time step, angular accelerations of the objects in the environment during the time step, and so on. [0119] The graph neural network and its constituent parts (e.g., the encoder block, the message passing blocks, and the decoder block) can have any appropriate neural network architectures that enable them to perform their described functions. For instance, the graph neural network and its constituent parts can each include any appropriate types of neural network layers (e.g., fully-connected layers, convolutional layers, attention layers, and so forth) in any appropriate number (e.g., 5 layers, or 10 layers, or 50 layers) and connected in any appropriate configuration (e.g., as a directed graph of layers). [0120] The system can then generate the updated graph representing the state of the environment at a next time step using the network output from the graph neural network (step 410). The updated graph can include data characterizing, e.g., positions, velocities, accelerations, etc. for the object nodes and for the surface nodes at the next time step. For example, the system can include data characterizing predicted positions, velocities, accelerations, and so on for the object nodes and for the surface nodes at the next time step as generated by the graph neural network. As another example, the system can determine the positions, velocities, accelerations, and so on for the object nodes and for the surface nodes at the next time step by numerically integrating equations of motion for the object nodes and surface point nodes that depend on, e.g., predicted velocities, accelerations, angular velocities, angular accelerations, and so on as generated by the graph neural network. [0121] When the graph neural network is configured to generate a network output that characterizes only overall motions of the objects during the time step (e.g., changes to center of mass positions for the objects, rotations of the objects, and so on), the system can generate 37934022-1 Attorney Docket No.45288-0422WO1 the graph node embeddings for surface point nodes within the updated graph for the next time step in accordance with the predicted overall motions of the objects during the time step. [0122] Similarly, when the graph neural network is configured to generate a network output that characterizes only predicted motions of the individual points represented by the surface point nodes during the time step (e.g., which collectively characterize changes to, e.g., shapes, positions, orientations, etc. of the objects in the environment associated with the surface point nodes), the system can generate the graph node embeddings for object nodes within the updated graph for the next time step in accordance with the predicted motions of the individual points represented by the surface point nodes during the time step. [0123] In some implementations the system includes one or more hardware accelerator units. Processing the graph using the graph neural network to generate the updated graph can then involve updating the graph at each of one or more update iterations using a processor system comprising L message passing blocks, where each message passing block has the same neural network architecture and a separate set of neural network parameters. The system can apply the message passing blocks sequentially to process data defining the graph over multiple iterations, using the one or more hardware accelerator units. [0124] In some implementations the system includes multiple hardware accelerator units, e.g. configured to operate in parallel. The system can then distribute the processing using the message passing blocks over the multiple hardware accelerators that operate in parallel to generate the updated graph. [0125] When the system uses trained neural networks to model signed distance function for respective objects in the environment, these neural networks can likewise be implemented so that they operate in parallel over multiple hardware accelerators, e.g. over the aforementioned hardware accelerators. [0126] FIG.5 is a flow diagram of an example process 500 for training a graph neural network. For convenience, the process 500 will be described as being performed by a system of one or more computers located in one or more locations. For example, a simulation system, e.g., the simulation system 100 of FIG. 1B, appropriately programmed in accordance with this specification, can perform the process 500. [0127] The system can obtain training data that includes a plurality of training examples for the graph neural network (502). Each training example for the graph neural network can include: (i) an example input to the graph neural network for the training example, and (ii) a target output of the graph neural network for the training example. Each example input to the graph neural network can include graph data that defines an example graph representing the 37934022-1 Attorney Docket No.45288-0422WO1 state of an example environment at a time step. For each training example, the target output of the graph neural network for the training example can define, for each graph node of the example graph, a target graph node embedding for the graph node characterizing, e.g., target positions, velocities, accelerations, rotations, angular velocities, angular accelerations, and so on for the graph node at the time step. [0128] The training data for the graph neural network can be generated in any appropriate manner. For example, the training data can include training examples generated by performing simulations of example environments using an analytical (e.g., physics-based) simulator. Each training example generated by performing a simulation of an example environment can characterize a simulated motion for the example environment during a respective time step of the simulation. As another example, the training data can include training examples generated using real-world data characterizing observations of real-world environments. [0129] In some cases, the simulation or observations of the environment only specify target positions for each graph node and the system can determine other target properties (e.g., target velocities, accelerations, rotations, angular velocities, angular accelerations, etc.) based on target positions for the graph node over multiple time steps. Merely as an illustration, when the time steps of the simulation or observations are separated by a constant time interval, the system can determine a target acceleration for a graph node from a history of positions each node for three time steps using finite differences, by, e.g., determining a pair of velocities from differences in the positions for the graph node at adjacent pairs of time steps and determining acceleration from a difference in the determined velocities. [0130] The system can train the graph neural network over a sequence of training iterations. At each training iteration, the system can perform steps 504 through 508 described below. [0131] The system can process the example inputs for one or more training examples for the training iteration using the graph neural network to generate respective network outputs for each of the training examples for the training iteration (step 504). In particular, the system can process each of the example inputs for the training iteration using the graph neural network following the process 400 described above with reference to FIG. 4. The network output for each training example can characterize predicted state of the example environment for the training example at a next time step. For example, the network output for each training example can include, for each graph node of an output graph, a predicted graph node embedding for the graph node characterizing, e.g., predicted positions, velocities, accelerations, rotations, angular velocities, angular accelerations, and so on for the graph node at the time step. 37934022-1 Attorney Docket No.45288-0422WO1 [0132] The system can update parameters of the graph neural network to optimize an objective function for the graph neural network (step 506). In general herein, where references are to optimizing an objective function, where the function measures an error optimizing the objective function can refer to minimizing the objective function (loss). [0133] For each training example, the objective function for the graph neural network can measure an error between the target network output for the training example and the predicted network output generated by the graph neural network for the training example. In general, for each training example. The objective function can measure an error of the predicted state of the example environment at the next time step for the training example as characterized by the network output generated for the training example by the graph neural network. For example, for each training example, the objective function can characterize differences between, e.g., predicted positions and target positions, predicted rotations and target rotations, predicted accelerations and target accelerations, and so on of one or more target objects in the example environment for the training example. As a further example, the objective function can measure a per-node error (e.g., a mean squared error) between, e.g., predicted positions and target positions, predicted rotations and target rotations, predicted accelerations and target accelerations, and so on of one or more target objects for each graph node of the example graph for the training example. [0134] The system can update the parameters of the observation encoding system using any appropriate machine learning technique. For example, the system can determine gradients of the objective function with respect to the parameters of the graph neural network and can determine updates for the parameters using, e.g., stochastic gradient descent, ADAM, and so on. [0135] The system can determine whether the training is complete (step 508). The system can use any of a variety of criteria to determine whether the training is complete. For example, the system can determine that training is complete after a pre-determined number of training iterations. As another example, the system can determine that training is complete when a value of the training objective function falls below a pre-determined threshold. As another example, the system can determine that training is complete when a difference between values of the training objective function for the current training iteration and a previous training iteration falls below a pre-determined threshold. [0136] If the system determines that training is not complete, the system can continue to a next training iteration (e.g., return to step 504). 37934022-1 Attorney Docket No.45288-0422WO1 [0137] When the system determines that training is complete, the system can provide the trained graph neural network (step 510). [0138] FIG. 6 is a flow diagram of an example process 600 for training a neural network to model a signed distance function for an object. For convenience, the process 600 will be described as being performed by a system of one or more computers located in one or more locations. For example, a simulation system, e.g., the simulation system 100 of FIG. 1B, appropriately programmed in accordance with this specification, can perform the process 600. [0139] The system can obtain training data that includes a plurality of example inputs for the neural network (602). In general, each example input for the neural network can specify a location for a point around the object (e.g., a location for a point in an environment of the object with reference to a co-ordinate system for the object). [0140] As one example, to train the neural network to model the signed distance function for a known object (e.g., as specified by an object mesh for the object, as depicted by observations of the object, etc.), the plurality of example inputs for the neural network can be generated by sampling example locations around the object from a distribution of points around the object. [0141] As another example, to train the neural network to optimize a shape of the object, the plurality of example inputs for the neural network can be obtained as part of performing a simulation of the object using the neural network. As a particular example, the plurality of example inputs for the neural network can be positions of surface point nodes for other objects within a simulated environment of the object that are used to identify collisions with the object as part of performing the simulation of the object (e.g., following the process 200 of FIG.2). [0142] When training the neural network to model the signed distance function for a known object, the training data for the neural network can include data characterizing a known surface for the object, such as a surface mesh for the object, observations (e.g., images, observations of LIDAR data, etc.) depicting the object, and so on; or object meshes can be determined according to any conventional method. In another approach training data including meshes can be created, e.g. using 3D graphics software such as Blender. Merely as an example of the Kubric datasets can be used (Greff et al. “Kubric: a scalable dataset generator”, arXiv:2203.03570, 2022). [0143] The system can train the neural network over a sequence of training iterations. At each training iteration, the system can perform steps 604 through 608 described below. [0144] The system can process the example inputs for one or more training examples for the training iteration using the neural network to generate respective network outputs for each of the training examples for the training iteration (step 604). In particular, the system can process 37934022-1 Attorney Docket No.45288-0422WO1 each of the example inputs for the training iteration using the neural network to generate a respective predicted value of the signed distance function for the object at the point characterized by the example input. [0145] When the training data includes observations depicting the object, the system can process the predicted values of the signed distance function to generate replicated observations of the object. For example, when the training data includes images depicting the object, the system can generate replicated images that depict the object as modeled by the neural network, e.g. using a signed distance function. An example process for generating replicated images for the object using the signed distance function for the object is described by Yariv et al., "Volume Rendering of Neural Implicit Surfaces", Advances in Neural Information Processing Systems 34 (2021): 4805-4815. [0146] As part of generating the replicated observations of the object, the system can determine estimated positions and orientations of the object within the environment as depicted by the observations. For example, the system can determine the estimated positions and orientations of the object to optimize replication loss between (i) the observations of the object and (ii) replicated observations of the object as generated using the neural network. [0147] When training the neural network to optimize the shape of the object, the system can process the predicted values of the signed distance function to perform a simulation of the object (e.g., as part of simulating the object following the process 200 of FIG.2). [0148] The system can update parameters of the neural network to optimize an objective function for the neural network (step 606). In general, the objective function for the neural network can characterize an error of a predicted shape of the object as modeled by the neural network. The predicted shape of the object can be characterized by a surface of points, ^^, for the object defined by: ^_{^ ൌ ^ ^^ | ^^^^^^^^^^^ ൌ 0^} [0149] Where ^^^^^^_^ ^{^}^^^{^} is the predicted value for the signed distance function for the object at the point ^^ as determined using the neural network. [0150] As an example, when training the neural network to model the signed distance function for a known object, the objective function can measure an error between the known shape of the object and the predicted shape of the object as modeled by the neural network. As a further example, when the training data includes a surface mesh for the object, the objective function can measure an error between (i) predicted values of the signed distance function generated 37934022-1 Attorney Docket No.45288-0422WO1 using the neural network and (ii) target values of the signed distance function determined using the surface mesh. [0151] As a particular example, for a point ^^, the objective function can measure an error for the point following: ℒ_{^^^^ ൌ || ^^^^^^^^^^^ െ ^^^^^^^^^^^ ||ଶ} [0152] Where ^^^^^^_ெ^^^^ is a value of the signed distance function for the object determined using the surface mesh, following: ^_^^^^^ ^^^^^_{∗ ^^^^, ^^^, ^^^^ ^^ ^^^^ ^^^} _{^^^^^ ൌ ^ ெ ^^^^^^^^^^^ ^^ℎ^^ ^^^^^^^^^^^^} _െ^^ ^{^} _^^∗ _{ெ ^^^^, ^^^, ^^^^ ^^ ^^^^ ^^^^^^^^^^^^ ^^ℎ^^ ^^^^^^^^^^^^} where ^^^∗ ^^^^ is a ^^ ^ ^^_{∗ ^^^^, ^^ is a} ^ _{^ the point ^^ெ} ^ ^ and the point ^^. As one example, the point ^^_ெ ^∗ ^^^^ can be a closest mesh point of the surface mesh to the point ^^. As another example, the point ^^_ெ ^∗ ^^^^ can be a closest point on a surface defined by the surface mash to the point ^^. As a further example, when the surface mesh is a triangular mesh, the system can determine a closest triangle of the mesh to the point ^^ (e.g., using a Boundary Vector Hierarchy technique, as described by James Clark in "Hierarchical Geometric Models for Visible Surface Algorithms", Communications of the ACM, 19.10 (1976): 547-554) and can determine the point ^^_ெ ^∗ ^^^^ as a closest point to the point ^^ of the closest triangle. As another example, when the training data includes observations depicting the object and the system generates replicated observations of the object as modeled by the neural network, the objective function can measure an error between (i) the observations of the object and (ii) the replicated observations of the target object generated using the neural network. For example, when the system generates replicated images of the object as modeled by the neural network, the objective function can measure an error between (i) pixel values of received images depicting the object and (ii) corresponding pixel values of the replicated images of the object as generated using the neural network. [0153] As another example, when training the neural network to optimize the shape of the object, objective function can characterize, e.g., forces or stresses experienced by the object, deformation of the object, and so forth. [0154] The system can update the parameters of the neural network using any appropriate machine learning technique. For example, the system can determine gradients of the objective function with respect to the parameters of the neural network and can determine updates for 37934022-1 Attorney Docket No.45288-0422WO1 the parameters using, e.g., stochastic gradient descent, ADAM, and so on. As a further example, when the objective function measures an error between states of an environment as simulated using the neural network and target states of the environment, the system can update parameters of the neural network by backpropagating gradients of the objective function through a neural network used to simulate the environment of the object (e.g., a graph neural network used to simulate the environment of the object following the process 200 of FIG.2). [0155] The system can determine whether the training is complete (step 608). The system can use any of a variety of criteria to determine whether the training is complete. For example, the system can determine that training is complete after a pre-determined number of training iterations. As another example, the system can determine that training is complete when a value of the training objective function falls below a pre-determined threshold. As another example, the system can determine that training is complete when a difference between values of the training objective function for the current training iteration and a previous training iteration falls below a pre-determined threshold. [0156] If the system determines that training is not complete, the system can continue to a next training iteration (e.g., return to step 604). [0157] When the system determines that training is complete, the system can provide the trained neural network (step 610). [0158] FIG. 7 illustrates experimental results comparing computational costs for simulating environments obtained using an implementation described simulation system and using conventional methods. [0159] In particular, FIG. 7 illustrates a comparison between the number of collision edges required by conventional (e.g., mesh-based) simulation 702 and required by an implementation of the described simulation system 704 for simulating given numbers of objects. [0160] By requiring significantly fewer collision edges to simulate systems, the described systems can therefore reduce the computational costs (e.g., memory consumption, computation time, etc.) for simulating systems. For example, FIG. 7 illustrates a comparison of the computation times required by conventional (e.g., mesh-based) simulation 706 and required by an implementation of the described simulation system 706 for simulating given numbers of objects at each time step. This can enable the described simulation systems to, e.g., simulate significantly larger environments for a same computational cost as conventional methods, utilize more complex neural networks to more accurately simulate environments at a same computational cost as conventional methods, and so on. 37934022-1 Attorney Docket No.45288-0422WO1 [0161] FIG. 8 illustrates example large scale simulations 802 and 804 performed using an implementation of the described simulation system. [0162] The simulation 802 is of an environment with 270 objects. The implementation of the described simulation system performed the simulation 802 over a sequence of 200 time steps by processing a graph representing the environment for the simulation 802 that included a total of 384,000 graph nodes. [0163] The simulation 804 is of an environment with 380 objects. The implementation of the described simulation system performed the simulation 804 over a sequence of 200 time steps by processing a graph representing the environment for the simulation 804 that included a total of 1,100,000 graph nodes. [0164] In addition to the embodiments described above, the following embodiments are also innovative: [0165] Embodiment 1 is a method performed by one or more computers for simulating a state of an environment over a sequence of time steps, the method comprising, at each of the sequence of time steps: generating a graph representing the state of the environment at the time step, wherein: the graph comprises a plurality of graph nodes and a plurality of graph edges; and generating the graph comprises, for each of one or more target objects in the environment for the time step: determining, for each of a set of one or more neighboring objects for the target object, object-to-point distances between the target object and points associated with the neighboring object based on a signed distance function for the target object; and creating one or more collision edges in the graph based on the determined distances, wherein each collision edge corresponds to the target object and a corresponding neighboring object and connects (i) a graph node associated with the target object and (ii) a graph node associated with the corresponding neighboring object; and processing the graph using a graph neural network to generate an updated graph representing a state of the environment at a next time step. [0166] Embodiment 2 is the method of embodiment 1, wherein determining the object-to-point distances between the target object and points associated with the neighboring object based on the signed distance function for the target object comprises, for each of the points associated with the neighboring object: determining a closest object-to-point distance from the target object to the point based on the signed distance function for the target object. [0167] Embodiment 3 is the method of embodiment 2, wherein determining the closest object- to-point distance from the target object to the point based on the signed distance function for the target object comprises: determining a point-to-point distance from the point to a respective 37934022-1 Attorney Docket No.45288-0422WO1 closest point on a surface of the target object based on the signed distance function for the target object. [0168] Embodiment 4 is the method of any one of embodiments 1-3, wherein determining the object-to-point distances between the target object and points associated with the neighboring object based on a signed distance function for the target object comprises: determining the object-to-point distances between the target object and points associated with the neighboring object using a neural network trained to model the signed distance function for the target object. [0169] Embodiment 5 is the method of embodiment 4, wherein the neural network for the target object has been trained using a surface mesh for the target object to optimize an error between (i) object-to-point distances determined by the neural network and (ii) respective object-to-point distances determined using the surface mesh. [0170] Embodiment 6 is the method of embodiment 4, wherein the neural network for the target object has been trained using one or more images of the target object to optimize an error between (i) the images of the target object and (ii) replicated images of the target object generated using the signed distance function. [0171] Embodiment 7 is the method of any one of embodiments 1-6, wherein, at each of the sequence of time steps: the graph representing the state of the environment at the time step characterizes a respective state for each of the target objects for the time step; and the updated graph representing the state of the environment at the next time step characterizes a respective state for each of the target objects for the next time step. [0172] Embodiment 8 is the method of any one of embodiments 1-7, further comprising, at one or more of the time steps: receiving data characterizing the state of the environment at the time step; and wherein generating the graph representing the state of the environment at the time step comprises generating the graph representing the state of the environment at the time step based on the received data. [0173] Embodiment 9 is the method of embodiment 8, wherein the data characterizing the state of the environment at the time step comprises one or more images of the environment at the time step. [0174] Embodiment 10 is the method of embodiment 9, wherein generating the graph representing the state of the environment at the time step based on the received data comprises: estimating positions and orientations within the environment of the target objects for the time step based on the received images; and generating the graph based on the estimated positions and orientations of target objects for the time step. 37934022-1 Attorney Docket No.45288-0422WO1 [0175] Embodiment 11 is the method of embodiment 10, wherein estimating positions and orientations within the environment of the one or more target objects for the time step based on the received images comprises: determining estimated positions and orientations of the target objects that optimize a replication loss between (i) the received images and (ii) reproduction images generated based on the signed distance functions, estimated positions, and estimated orientations of the one or more target objects. [0176] Embodiment 12 is the method of any one of embodiments 1-11, wherein generating the graph further comprises: determining object-to-object distances between each of the target objects for the time step; and determining, for each of the target objects for the time step, the set of neighboring objects for the target object, wherein each neighboring object of the target object: (i) is another target object within the environment, and (ii) is within a predetermined threshold object-to-object distance of the target object. [0177] Embodiment 13 is the method of any one of embodiments 1-12, wherein each of the points associated with the neighboring object for the target object is a point on a surface of the neighboring object. [0178] Embodiment 14 is the method of any one of embodiments 1-13, wherein the graph comprises: one or more object nodes, each representing a respective target object; for each object node: one or more surface point nodes, wherein each surface point node represents a respective point on a surface of the target object for the object node; and for each surface point node: an object-surface edge between the surface point node and the object node. [0179] Embodiment 15 is the method of embodiment 14, wherein generating the graph further comprises, for each of the target objects for the time step: creating an object node in the graph representing the target object; creating one or more surface point nodes in the graph for the target object by sampling points on the surface of the target object using the signed distance function for the target object; and creating an object-surface edge in the graph between the object node and each surface point node for the target object. [0180] Embodiment 16 is the method of embodiment 14 or embodiment 15, wherein determining the object-to-point distances between the target object and the points associated with the neighboring object based on a signed distance function for the target object comprises: determining, for each surface point node associated with the neighboring object, an object-to- point distance from target object to the point represented by the surface point node based on the signed distance function for the target object. [0181] Embodiment 17 is the method of any one of embodiments 14-16, wherein generating the graph further comprises: for each of the target objects for the time step, each neighboring 37934022-1 Attorney Docket No.45288-0422WO1 object for the target object, and each surface point node associated with the neighboring object: determining a closest point on a surface of the target object to the point represented by the surface point node based on the signed distance function for the target object. [0182] Embodiment 18 is the method of embodiment 16 or embodiment 17, wherein generating the graph further comprises: for each of the target objects for the time step, each of the neighboring object for the target object, and each surface point node associated with the neighboring object: creating a collision edge in the graph between the surface point node and the object node for the target object when the object-to-point distance from the target object to the point represented by the surface point node is smaller than a predetermined threshold collision distance. [0183] Embodiment 19 is the method of any one of embodiments 1-18, wherein generating the graph representing the state of the environment at the time step comprises: generating a respective edge embedding for each of the plurality of graph edges; and generating a respective node embedding for each of the plurality of graph nodes. [0184] Embodiment 20 is the method of embodiment 19, when dependent on embodiment 14, wherein for each object node, the node embedding for the object node characterizes a velocity of a center of mass of the target object for the object node. [0185] Embodiment 21 is the method of embodiment 20, wherein the node embedding for each object node characterizes an acceleration of the center of mass of the target object for the object node. [0186] Embodiment 22 is the method of any one of embodiments 19-21, when dependent on embodiment 14, wherein the node embedding for each object node characterizes properties of the target object for the object node. [0187] Embodiment 23 is the method of any one of embodiments 19-22, when dependent on embodiment 14, wherein the node embedding for each surface point node characterizes a velocity of the point for the surface point node. [0188] Embodiment 24 is the method of any one of embodiments 19-23, when dependent on embodiment 14, wherein the node embedding for each surface point node characterizes an acceleration of the point for the surface point node [0189] Embodiment 25 is the method of any one of embodiments 19-24, when dependent on embodiment 14, wherein for each surface point node, the node embedding for the surface point node characterizes properties of the target object for the surface point node. [0190] Embodiment 26 is the method of any one of embodiments 19-25, when dependent on embodiment 18, wherein the edge embedding for each collision edge characterizes the object- 37934022-1 Attorney Docket No.45288-0422WO1 to-point distance from the target object to the point represented by the surface point node for the collision edge. [0191] Embodiment 27 is the method of any one of embodiments 19-26, when dependent on embodiment 18, wherein the edge embedding for each collision edge characterizes a displacement between the point of the surface point node and the closest point on the surface of the target object for the surface point node. [0192] Embodiment 28 is the method of any one of embodiments 19-27, when dependent on embodiment 18, wherein the edge embedding for each collision edge characterizes a displacement between a center of mass of the target object and the closest point on the surface of the target object for the surface point node. [0193] Embodiment 29 is the method of embodiment 22 or embodiment 25, wherein the properties of the target object include a mass of the target object. [0194] Embodiment 30 is the method of embodiment 22 or embodiment 25, wherein the properties of the target object include a moment of inertia of the target object. [0195] Embodiment 31 is the method of embodiment 22 or embodiment 25, wherein the properties of the target object include a coefficient of friction of the target object. [0196] Embodiment 32 is the method of embodiment 22 or embodiment 25, wherein the properties of the target object include a coefficient of restitution of the target object. [0197] Embodiment 33 is method of any one of embodiments 19-32, wherein processing the graph using the graph neural network to generate the updated graph comprises, at each of a sequence of update iterations: updating the respective edge embedding for each of the plurality of graph edges using the graph neural network; and updating the respective node embedding for each of the plurality of graph nodes using the graph neural network. [0198] Embodiment 34 is the method of embodiment 33, further comprising: generating, based on the updated graph, a predicted state of the environment at a next time step. [0199] Embodiment 35 is the method of embodiment 34, wherein the graph neural network has been trained to minimize an error of the predicted state of the environment at the next time step. [0200] Embodiment 36 is the method of embodiment 35, wherein the error characterizes a difference between predicted positions of the target objects at the next time step and observed positions of the target objects at the next time step. [0201] Embodiment 37 is the method of embodiment 35 or embodiment 36, wherein the error characterizes a difference between predicted rotations of the target objects at the next time step and observed rotations of the target objects at the next time step. 37934022-1 Attorney Docket No.45288-0422WO1 [0202] Embodiment 38 is the method of any one of embodiments 35-37, wherein the error characterizes a difference between predicted accelerations of the target objects for the time step and observed accelerations of the target objects for the time step. [0203] Embodiment 39 method of any one of embodiments 1-38, wherein: the one or more computers that perform the method include one or more hardware accelerator units; and processing the graph using the graph neural network to generate the updated graph comprises updating the graph at each of one or more update iterations using a processor system comprising L message passing blocks, wherein each message passing block has a same neural network architecture and a separate set of neural network parameters; the method further comprising: applying the message passing blocks sequentially to process data defining the graph over multiple iterations; and using the one or more hardware accelerator units to apply the message passing blocks sequentially to process the data defining the graph. [0204] Embodiment 40 is the method of embodiment 39, wherein the one or more computers that perform the method include multiple hardware accelerators, and wherein the method comprises distributing the processing using the message passing blocks over the hardware accelerators. [0205] Embodiment 41 is the method of any one of embodiments 1-40, wherein the environment comprises a real-world environment and wherein, at one or more time steps, generating the graph further comprises: obtaining sensor readings from one or more sensors in the environment; and generating the graph based on the sensor readings. [0206] Embodiment 42 is the method of embodiment 41, wherein the one or more sensors include one or more lidar or radar sensors. [0207] Embodiment 43 is the method of embodiment 41 or embodiment 42, when dependent on embodiment 34, further comprising, at one or more time steps: comparing the predicted state of the environment with an observed state of the environment based on the sensor readings to verify the predicted state of the environment. [0208] Embodiment 44 is a method of designing a shape of an object using the method of any one of embodiments 1-43, wherein the method of designing the shape of the object comprises backpropagating gradients of an objective function through the graph neural network to adjust data representing the shape of the object in order to optimize an objective function. [0209] Embodiment 45 is the method of embodiment 44, when dependent on embodiemnt 4, wherein the data representing the shape of the object are the parameters of the neural network modeling the signed distance function for the object. 37934022-1 Attorney Docket No.45288-0422WO1 [0210] Embodiment 46 is the method of embodiment 44 or embodiment 45, further comprising making a physical object with the shape that optimizes the objective function. [0211] Embodiment 47 is a method of training a machine learning model, training the machine learning model using training data generated by simulating states of the environment using the method of any one of embodiments 1-43. [0212] Embodiment 48 is the method of embodiment 47, wherein the machine learning model is configured to predict properties of the environment. [0213] Embodiment 49 is the method of embodiment 47, wherein the machine learning model is an action selection policy for an agent interacting with the environment. [0214] Embodiment 50 is the method of embodiment 49, wherein the simulated states of the environment comprise simulated states of the environment in which the agent interacts with the environment by performing actions based on the action selection policy. [0215] Embodiment 51 is a method of controlling an agent interacting with the environment, the method comprising: simulating the state of the environment when the agent performs a sequence of actions over a sequence of time steps using the method of any one of embodiments 1-43; determining that a final state of the environment at a final time step in the sequence of time steps satisfies a success criterion; and in response, causing the agent to perform the sequence of actions in the environment. [0216] Embodiment 52 is the method of embodiment 51, wherein simulating the state of the environment comprises simulating the state of the environment based on data received from sensors of the agent. [0217] Embodiment 53 is the method of embodiment 51 or embodiment 52, wherein the agent is a robot in the environment. [0218] Embodiment 54 is the method of embodiment 51 or embodiment 52, wherein the agent is an autonomous vehicle navigating through the environment. [0219] Embodiment 55 is the method of any one of embodiments 51-54, wherein determining that the final state of the environment at the final time step in the sequence of time steps satisfies the success criterion comprises: determining that respective locations, shapes, or configurations of the first object and the second object are within a tolerance of one or more target locations, shapes, or configurations of the first object and the second object. [0220] Embodiment 56 is a method, comprising: receiving one or more images of an environment; simulating a state of the environment over a sequence of time steps according to the method of any one of embodiments 1-43, wherein generating the graph representing the state of the environment for comprises generating the graph based on the received images; and 37934022-1 Attorney Docket No.45288-0422WO1 generating a representation of the simulated states of the environment based on the updated graphs for each of the timesteps; and providing the representation of the simulated states of the environment for display on a user device. [0221] Embodiment 57 is the method of embodiment 56, wherein the representation of the simulated states of the environment comprises one or more images depicting the simulated states of the environment. [0222] Embodiment 58 is the method of embodiment 56, wherein the representation of the simulated states of the environment comprises a video depicting the simulated states of the environment. [0223] Embodiment 59 is the method of embodiment 56, wherein the representation of the simulated states of the environment comprises a virtual reality environment depicting the simulated states of the environment. [0224] Embodiment 60 is the method of embodiment 56, wherein the representation of the simulated states of the environment comprises an augmented reality environment depicting the simulated states of the environment. [0225] Embodiment 61 is a system comprising: one or more computers; and one or more storage devices communicatively coupled to the one or more computers, wherein the one or more storage devices store instructions that, when executed by the one or more computers, cause the one or more computers to perform operations of the respective method of any one of embodiments 1-60. [0226] Embodiment 62 is one or more non-transitory computer storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations of the respective method of any one of embodiments 1-60. [0227] Particular embodiments of the subject matter have been described. Other embodiments are within the scope of the following claims. For example, the actions recited in the claims can be performed in a different order and still achieve desirable results. As one example, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous. [0228] What is claimed is: 37934022-1

Claims

Attorney Docket No.45288-0422WO1 CLAIMS 1. A method performed by one or more computers for simulating a state of an environment over a sequence of time steps, the method comprising, at each of the sequence of time steps: generating a graph representing the state of the environment at the time step, wherein: the graph comprises a plurality of graph nodes and a plurality of graph edges; and generating the graph comprises, for each of one or more target objects in the environment for the time step: determining, for each of a set of one or more neighboring objects for the target object, object-to-point distances between the target object and points associated with the neighboring object based on a signed distance function for the target object; and creating one or more collision edges in the graph based on the determined distances, wherein each collision edge corresponds to the target object and a corresponding neighboring object and connects (i) a graph node associated with the target object and (ii) a graph node associated with the corresponding neighboring object; and processing the graph using a graph neural network to generate an updated graph representing a state of the environment at a next time step. 2. The method of claim 1, wherein determining the object-to-point distances between the target object and points associated with the neighboring object based on the signed distance function for the target object comprises, for each of the points associated with the neighboring object: determining a closest object-to-point distance from the target object to the point based on the signed distance function for the target object. 3. The method of claim 2, wherein determining the closest object-to-point distance from the target object to the point based on the signed distance function for the target object comprises: determining a point-to-point distance from the point to a respective closest point on a surface of the target object based on the signed distance function for the target object. 4. The method of any one of the preceding claims, wherein determining the object-to- point distances between the target object and points associated with the neighboring object based on a signed distance function for the target object comprises: 37934022-1 Attorney Docket No.45288-0422WO1 determining the object-to-point distances between the target object and points associated with the neighboring object using a neural network trained to model the signed distance function for the target object. 5. The method of claim 4, wherein the neural network for the target object has been trained using a surface mesh for the target object to optimize an error between (i) object-to- point distances determined by the neural network and (ii) respective object-to-point distances determined using the surface mesh. 6. The method of claim 4, wherein the neural network for the target object has been trained using one or more images of the target object to optimize an error between (i) the images of the target object and (ii) replicated images of the target object generated using the signed distance function. 7. The method of any one of the preceding claims, wherein, at each of the sequence of time steps: the graph representing the state of the environment at the time step characterizes a respective state for each of the target objects for the time step; and the updated graph representing the state of the environment at the next time step characterizes a respective state for each of the target objects for the next time step. 8. The method of any preceding claim, further comprising, at one or more of the time steps: receiving data characterizing the state of the environment at the time step; and wherein generating the graph representing the state of the environment at the time step comprises generating the graph representing the state of the environment at the time step based on the received data. 9. The method of claim 8, wherein the data characterizing the state of the environment at the time step comprises one or more images of the environment at the time step. 10. The method of claim 9, wherein generating the graph representing the state of the environment at the time step based on the received data comprises: estimating positions and orientations within the environment of the target objects for the time step based on the received images; and 37934022-1 Attorney Docket No.45288-0422WO1 generating the graph based on the estimated positions and orientations of target objects for the time step. 11. The method of claim 10, wherein estimating positions and orientations within the environment of the one or more target objects for the time step based on the received images comprises: determining estimated positions and orientations of the target objects that optimize a replication loss between (i) the received images and (ii) reproduction images generated based on the signed distance functions, estimated positions, and estimated orientations of the one or more target objects. 12. The method of any one of the preceding claims, wherein generating the graph further comprises: determining object-to-object distances between each of the target objects for the time step; and determining, for each of the target objects for the time step, the set of neighboring objects for the target object, wherein each neighboring object of the target object: (i) is another target object within the environment, and (ii) is within a predetermined threshold object-to-object distance of the target object. 13. The method of any one of the preceding claims, wherein each of the points associated with the neighboring object for the target object is a point on a surface of the neighboring object. 14. The method of any one of the preceding claims, wherein the graph comprises: one or more object nodes, each representing a respective target object; for each object node: one or more surface point nodes, wherein each surface point node represents a respective point on a surface of the target object for the object node; and for each surface point node: an object-surface edge between the surface point node and the object node. 15. The method of claim 14, wherein generating the graph further comprises, for each of the target objects for the time step: 37934022-1 Attorney Docket No.45288-0422WO1 creating an object node in the graph representing the target object; creating one or more surface point nodes in the graph for the target object by sampling points on the surface of the target object using the signed distance function for the target object; and creating an object-surface edge in the graph between the object node and each surface point node for the target object. 16. The method of claim 14 or claim 15, wherein determining the object-to-point distances between the target object and the points associated with the neighboring object based on a signed distance function for the target object comprises: determining, for each surface point node associated with the neighboring object, an object-to-point distance from target object to the point represented by the surface point node based on the signed distance function for the target object. 17. The method of any one of claims 14-16, wherein generating the graph further comprises: for each of the target objects for the time step, each neighboring object for the target object, and each surface point node associated with the neighboring object: determining a closest point on a surface of the target object to the point represented by the surface point node based on the signed distance function for the target object. 18. The method of claim 16 or claim 17, wherein generating the graph further comprises: for each of the target objects for the time step, each of the neighboring object for the target object, and each surface point node associated with the neighboring object: creating a collision edge in the graph between the surface point node and the object node for the target object when the object-to-point distance from the target object to the point represented by the surface point node is smaller than a predetermined threshold collision distance. 19. The method of any one of the preceding claims, wherein generating the graph representing the state of the environment at the time step comprises: generating a respective edge embedding for each of the plurality of graph edges; and generating a respective node embedding for each of the plurality of graph nodes. 37934022-1 Attorney Docket No.45288-0422WO1 20. The method of claim 19, when dependent on claim 14, wherein for each object node, the node embedding for the object node characterizes a velocity of a center of mass of the target object for the object node. 21. The method of claim 20, wherein the node embedding for each object node characterizes an acceleration of the center of mass of the target object for the object node. 22. The method of any one of claims 19-21, when dependent on claim 14, wherein the node embedding for each object node characterizes properties of the target object for the object node. 23. The method of any one of claims 19-22, when dependent on claim 14, wherein the node embedding for each surface point node characterizes a velocity of the point for the surface point node. 24. The method of any one of claims 19-23, when dependent on claim 14, wherein the node embedding for each surface point node characterizes an acceleration of the point for the surface point node 25. The method of any one of claims 19-24, when dependent on claim 14, wherein for each surface point node, the node embedding for the surface point node characterizes properties of the target object for the surface point node. 26. The method of any one of claims 19-25, when dependent on claim 18, wherein the edge embedding for each collision edge characterizes the object-to-point distance from the target object to the point represented by the surface point node for the collision edge. 27. The method of any one of claims 19-26, when dependent on claim 18, wherein the edge embedding for each collision edge characterizes a displacement between the point of the surface point node and the closest point on the surface of the target object for the surface point node. 28. The method of any one of claims 19-27, when dependent on claim 18, wherein the edge embedding for each collision edge characterizes a displacement between a center of mass of the target object and the closest point on the surface of the target object for the surface point node. 37934022-1 Attorney Docket No.45288-0422WO1 29. The method of claim 22 or claim 25, wherein the properties of the target object include a mass of the target object. 30. The method of claim 22 or claim 25, wherein the properties of the target object include a moment of inertia of the target object. 31. The method of claim 22 or claim 25, wherein the properties of the target object include a coefficient of friction of the target object. 32. The method of claim 22 or claim 25, wherein the properties of the target object include a coefficient of restitution of the target object. 33. The method of any one of claims 19-32, wherein processing the graph using the graph neural network to generate the updated graph comprises, at each of a sequence of update iterations: updating the respective edge embedding for each of the plurality of graph edges using the graph neural network; and updating the respective node embedding for each of the plurality of graph nodes using the graph neural network. 34. The method of claim 33, further comprising: generating, based on the updated graph, a predicted state of the environment at a next time step. 35. The method of claim 34, wherein the graph neural network has been trained to minimize an error of the predicted state of the environment at the next time step. 36. The method of claim 35, wherein the error characterizes a difference between predicted positions of the target objects at the next time step and observed positions of the target objects at the next time step. 37. The method of claim 35 or claim 36, wherein the error characterizes a difference between predicted rotations of the target objects at the next time step and observed rotations of the target objects at the next time step. 37934022-1 Attorney Docket No.45288-0422WO1 38. The method of claim 35-37, wherein the error characterizes a difference between predicted accelerations of the target objects for the time step and observed accelerations of the target objects for the time step. 39. The method of any preceding claim, wherein: the one or more computers that perform the method include one or more hardware accelerator units; and processing the graph using the graph neural network to generate the updated graph comprises updating the graph at each of one or more update iterations using a processor system comprising L message passing blocks, wherein each message passing block has a same neural network architecture and a separate set of neural network parameters; the method further comprising: applying the message passing blocks sequentially to process data defining the graph over multiple iterations; and using the one or more hardware accelerator units to apply the message passing blocks sequentially to process the data defining the graph. 40. The method of claim 39, wherein the one or more computers that perform the method include multiple hardware accelerators, and wherein the method comprises distributing the processing using the message passing blocks over the hardware accelerators. 41. The method of any one of the preceding claims, wherein the environment comprises a real-world environment and wherein, at one or more time steps, generating the graph further comprises: obtaining sensor readings from one or more sensors in the environment; and generating the graph based on the sensor readings. 42. The method of claim 41, wherein the one or more sensors include one or more lidar or radar sensors. 43. The method of claim 41 or claim 42, when dependent on claim 34, further comprising, at one or more time steps: comparing the predicted state of the environment with an observed state of the environment based on the sensor readings to verify the predicted state of the environment. 37934022-1 Attorney Docket No.45288-0422WO1 44. A method of designing a shape of an object using the method of any preceding claim, wherein the method of designing the shape of the object comprises backpropagating gradients of an objective function through the graph neural network to adjust data representing the shape of the object in order to optimize an objective function. 45. The method of claim 44, when dependent on claim 4, wherein the data representing the shape of the object are the parameters of the neural network modeling the signed distance function for the object. 46. The method of claim 44 or claim 45, further comprising making a physical object with the shape that optimizes the objective function. 47. A method of training a machine learning model, training the machine learning model using training data generated by simulating states of the environment using the method of any one of claims 1-43. 48. The method of claim 47, wherein the machine learning model is configured to predict properties of the environment. 49. The method of claim 47, wherein the machine learning model is an action selection policy for an agent interacting with the environment. 50. The method of claim 49, wherein the simulated states of the environment comprise simulated states of the environment in which the agent interacts with the environment by performing actions based on the action selection policy. 51. A method of controlling an agent interacting with the environment, the method comprising: simulating the state of the environment when the agent performs a sequence of actions over a sequence of time steps using the method of any one of claims 1-43; determining that a final state of the environment at a final time step in the sequence of time steps satisfies a success criterion; and in response, causing the agent to perform the sequence of actions in the environment. 52. The method of claim 51, wherein simulating the state of the environment comprises simulating the state of the environment based on data received from sensors of the agent. 37934022-1 Attorney Docket No.45288-0422WO1 53. The method of claim 51 or claim 52, wherein the agent is a robot in the environment. 54. The method of claim 51 or claim 52, wherein the agent is an autonomous vehicle navigating through the environment. 55. The method of any one of claims 51-54, wherein determining that the final state of the environment at the final time step in the sequence of time steps satisfies the success criterion comprises: determining that respective locations, shapes, or configurations of the first object and the second object are within a tolerance of one or more target locations, shapes, or configurations of the first object and the second object. 56. A method, comprising: receiving one or more images of an environment; simulating a state of the environment over a sequence of time steps according to the method of any one of claims 1-43, wherein generating the graph representing the state of the environment for comprises generating the graph based on the received images; and generating a representation of the simulated states of the environment based on the updated graphs for each of the timesteps; and providing the representation of the simulated states of the environment for display on a user device. 57. The method of claim 56, wherein the representation of the simulated states of the environment comprises one or more images depicting the simulated states of the environment. 58. The method of claim 56, wherein the representation of the simulated states of the environment comprises a video depicting the simulated states of the environment. 59. The method of claim 56, wherein the representation of the simulated states of the environment comprises a virtual reality environment depicting the simulated states of the environment. 60. The method of claim 56, wherein the representation of the simulated states of the environment comprises an augmented reality environment depicting the simulated states of the environment. 37934022-1 Attorney Docket No.45288-0422WO1 61. A system comprising: one or more computers; and one or more storage devices communicatively coupled to the one or more computers, wherein the one or more storage devices store instructions that, when executed by the one or more computers, cause the one or more computers to perform operations of the respective method of any one of the preceding claims. 62. One or more non-transitory computer storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations of the respective method of any one of claims 1-60. 37934022-1