WO2025123133A1

WO2025123133A1 - Mixed reality testing

Info

Publication number: WO2025123133A1
Application number: PCT/CA2024/051646
Authority: WO
Inventors: Raquel URTASUN; Ioan Andrei Barsan; Stefania ALBORGHETTI; Alexander EKDAHL; Neil Clifford ISAAC; Luisa SAN MARTIN FERREIRA; Peter Varnai; Seyad Abbas SADAT KOOCH MOHTASHAM; Kelvin Wong; Sivabalan Manivasagam
Original assignee: Waabi Innovation Inc
Current assignee: Waabi Innovation Inc
Priority date: 2023-12-11
Filing date: 2024-12-11
Publication date: 2025-06-19
Anticipated expiration: 2026-06-11

Abstract

A method implements generating and testing augmented data for an autonomous system. The method involves generating system data from multiple sensors of an autonomous system operating in a real-world environment. The method further involves generating simulation data that includes a perturbation to the system data. The method further involves augmenting the system data to include the perturbation of the simulation data and generate augmented data. The method further involves injecting the augmented data into one or more components of the autonomous system to test the autonomous system with the perturbation.

Description

MIXED REALITY TESTING

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application claims benefit to US Provisional Application 63/608,806, filed December 11, 2023, which is hereby incorporated by reference herein.

BACKGROUND

[0002] The development and deployment of autonomous systems (including robots and self-driving vehicles (SDVs)) includes testing performance in various situations safely in the real world. One approach includes accumulating testing miles by driving on public roads. However, testing on public roads means that the frequency of events cannot be controlled, relying on chance to naturally unveil the distribution across the possible set of situations. Scalability is an issue, as the number of miles needed for testing coverage may be too high since many events happen rarely. Accumulating miles of testing also carries exposure risk, driving a large number of miles increases the chance of a hazardous event occurring. Furthermore, many situations cannot be tested, such as accidents or safety-critical situations. Moreover, any change to the virtual driver software or hardware of the autonomous system may lead to repeating the exercise of driving in the real world to evaluate the system.

[0003] Closed-course testing at a track facility offers a complementary approach, involving controllable and repeatable tests, focusing on particular interactions. Manually orchestrating scenarios that interact with the self-driving system in a deliberate fashion involves test personnel interacting with self-driving vehicles performing various maneuvers or acting as pedestrians. Interactions may be limited to one vehicle and the autonomous system. Fine-grained controllability is challenging, requiring labor-intensive and costly efforts similar to those in big-budget movie studios for car stunts, involving actors on specialized rigs, dummy props, sophisticated timing triggers with lasers, etc. Much of the coordination requires humans communicating over walkie-talkies to plan the timing of events and ensure the safety of all personnel. Oftentimes, dummy props and rigs are unrealistic and require significant visual effects in postproduction offline, which is not feasible for real-time robotic testing. Consequently, track testing is expensive and slow, making it challenging to test safety-critical scenarios. Reacting properly in the event of an unavoidable collision or handling close calls effectively is difficult.

[0004] An alternative is the use of offline simulation to evaluate the autonomous system in a virtual world on scenarios in closed loop. The most common simulation used in the industry is open loop log replay, where data from the real log is played back to the autonomous system. While realistic, the autonomous system is unable to interact with the environment and actors, or execute new actions, preventing a full understanding of system performance. Another form of simulation models the behaviors of actors, to evaluate the motion planning module of the autonomous system, but evaluating the motion planning module may not evaluate the effect of other components, such as perception, and the potential compounding of errors may deteriorate performance and testing. Existing offline simulation systems that simulate directly from sensor inputs may use computer graphics with artist-designed assets to simulate sensor observations, resulting in limited diversity and low realism. Evaluation performance in simulation may not be indicative of the real world. Data-driven high-fidelity simulation systems may leverage artificial intelligence (Al) and real-world data to improve realism and increase diversity, ensuring broader coverage of the scenario space. Offline simulation allows for safe and efficient testing of the autonomous system in rare or safety-critical scenarios. However, while the domain gap may be smaller, situations such as extreme vehicle dynamics may still compromise the realism and effectiveness of a simulation in predicting the real-world performance of an autonomous system. Virtual world simulation does not include the actual physical vehicle hardware and compute platform as part of the system under test, potentially creating a gap in modeling the exact real-world dynamics, especially for extreme situations, such as severe hard braking and challenging weather conditions, which may include black ice, hydroplaning, high-speed wind gusts, etc. For certain vehicles, such as semi -trucks, additional dynamics come into play, such as gear shifts, unusual cargo loads (e.g., liquids), and jack-knifing. Hardware-in-the-loop testing uses the same compute as the actual autonomous system for more realistic latencies and performance but still encounters challenges in accurately capturing the full interaction of the vehicle with the real world.

SUMMARY

[0005] In general, in one or more aspects, the disclosure relates to a method for generating and testing augmented data for an autonomous system. The method involves generating system data from multiple sensors of an autonomous system operating in a real-world environment. The method further involves generating simulation data that includes a perturbation to the system data. The method further involves augmenting the system data to include the perturbation of the simulation data and generate augmented data. The method further involves injecting the augmented data into one or more components of the autonomous system to test the autonomous system with the perturbation.

[0006] In general, in one or more aspects, the disclosure relates to a system that includes at least one processor and an application that executes on the at least one processor. Executing the application performs generating system data from multiple sensors of an autonomous system operating in a real-world environment. Executing the application further performs generating simulation data that includes a perturbation to the system data. Executing the application further performs augmenting the system data to include the perturbation of the simulation data and generate augmented data. Executing the application further performs injecting the augmented data into one or more components of the autonomous system to test the autonomous system with the perturbation. [0007] In general, in one or more aspects, the disclosure relates to a non- transitory computer readable medium including instructions executable by at least one processor. Executing the instructions performs generating system data from multiple sensors of an autonomous system operating in a real-world environment. Executing the instructions further performs generating simulation data that includes a perturbation to the system data. Executing the instructions further performs augmenting the system data to include the perturbation of the simulation data and generate augmented data. Executing the instructions further performs injecting the augmented data into one or more components of the autonomous system to test the autonomous system with the perturbation.

[0008] Other aspects of one or more embodiments may be apparent from the following description and the appended claims.

BRIEF DESCRIPTION OF DRAWINGS

[0009] FIG. 1 shows a diagram of an autonomous training and testing system in accordance with the disclosure.

[0010] FIG. 2 shows a diagram of a simulation system in accordance with the disclosure.

[0011] FIG. 3 shows a diagram of an autonomous system in accordance with the disclosure.

[0012] FIG. 4 shows a method in accordance with the disclosure.

[0013] FIG. 5A, FIG. 5B, FIG. 5C, FIG. 6, FIG. 7, and FIG. 8, show examples in accordance with the disclosure.

[0014] FIG. 9A and FIG. 9B show a computing system in accordance with the disclosure.

[0015] Similar elements in the various figures may be denoted by similar names and reference numerals. The features and elements described in one figure may extend to similarly named features and elements in different figures. DETAILED DESCRIPTION

[0016] Disclosed embodiments implement mixed reality testing to address the challenges of real-world testing of autonomous systems. An autonomous system that is being tested generates real-world sensor data and may run a simulation that includes additional actors and objects that are not present in the real world. Augmented data may be injected into the components of the system to test the reactions of the system to the additional actors and objects. Furthermore, the actors and objects may react to the reactions of the system to provide realistic testing in a mixed reality environment.

[0017] Disclosed embodiments bridge the gap between offline simulation and real-world testing using mixed reality testing (MRT) with a suite of technologies to bring together the benefits of the flexibility of offline simulation with the full dynamical, computational, and mechanical realism of real-world testing. The simulation system is enabled to run onboard with the autonomous system (e.g., a self-driving vehicle (SDV)) as the autonomous system navigates a scene. For example, while an autonomous system such as a self-driving vehicle drives, a new scenario for evaluation may be created by mixed reality testing, which fuses together the real world with a virtual world that may contain an arbitrary number of virtual actors and objects which may interact and react with the autonomous system. Synthetic data at different levels of the onboard system may then be generated to reflect the new augmented reality scenario, such as raw simulated sensor data readings, or synthetic object detections and predictions, and then the simulated data is blended with the real world to create a hybrid representation for the autonomy software of the autonomous system to process. The modified representation is then processed onboard by the autonomous system and new actuation commands are transmitted to modify the real-world state of the autonomous system, reacting to the “hallucinated” scenario, the same as would be done in normal real-world testing. At the same time as the autonomous system navigates, the simulated virtual actors in the scene are controlled by behavior models and reactively respond to the behavior of the autonomous system. The sensor data or intermediate autonomy representations are then modified on-the-fly in real time to reflect the change in the augmented world.

[0018] A paradigm shift for the safe testing of self-driving vehicles is enabled by mixed reality testing technology, as realistic synthetic objects may be inserted seamlessly, precisely, and effortlessly into the real world in order to orchestrate elaborate tests, without requiring any additional human operators or physical track changes. Several benefits may be unlocked with mixed reality testing. One benefit includes enabling safer evaluation of the autonomous system in the real world, as no real-world actors are used to interact in a dangerous manner with the autonomous system. Additionally, with mixed reality testing, added virtual actors may interact with existing real-world actors in the scene. Accidents may be tested, and collision severity may be measured in a safe manner, as no real-world actors are used. Another benefit is that the same or similar scenario may be tested each time by mixed reality testing and the desired interaction of the actors with the autonomous system may be specifically controlled. Mixed reality testing enables repeatable testing for evaluating performance between different onboard autonomy releases. Repeatable testing is often challenging or manually intensive to achieve in real- world structured tests. Another benefit is that arbitrary safety-critical scenario creation is allowed by mixed reality testing. Arbitrary safety-critical scenarios may be generated and tested without having to drive millions of miles waiting for safety-critical events and scenarios to occur, such as dense traffic jams, complex construction zones, severe cut-ins, etc. Arbitrary safety-critical scenarios may be simulated while evaluating the autonomous system with higher realism against the test scenario since the entire vehicle system is run in closed loop to execute the scenario. As another benefit, many different scenarios may be tested and varied by mixed reality testing, which may increase the amount of testing coverage to include edge cases when evaluating the performance of the autonomous system. [0019] Turning to the figures, FIG. 1 and FIG. 2 show example diagrams of the autonomous system and virtual driver. Turning to FIG. 1, an autonomous system (116) is a self-driving mode of transportation that does not require a human pilot or human driver to move and react to the real-world environment. The autonomous system (116) may be completely autonomous or semi- autonomous. As a mode of transportation, the autonomous system (116) is contained in a housing configured to move through a real-world environment. Examples of autonomous systems include self-driving vehicles (e.g., selfdriving trucks and cars), drones, airplanes, robots, etc.

[0020] The autonomous system (116) includes a virtual driver (102) that is the decision-making portion of the autonomous system (116). The virtual driver (102) is an artificial intelligence system that learns how to interact in the real world and interacts accordingly. The virtual driver (102) is the software executing on a processor that makes decisions and causes the autonomous system (116) to interact with the real world including moving, signaling, and stopping, or maintaining a current state. Specifically, the virtual driver (102) is decision-making software that executes on hardware (not shown). The hardware may include a hardware processor, memory or other storage device, and one or more interfaces. A hardware processor is any hardware processing unit that is configured to process computer readable program code and perform the operations set forth in the computer readable program code.

[0021] A real-world environment is the portion of the real world through which the autonomous system (116), when trained, is designed to move. Thus, the real-world environment may include concrete and land, construction, and other objects in a geographic region along with agents. The agents are the other agents in the real-world environment that are capable of moving through the real-world environment. Agents may have independent decision-making functionality. The independent decision-making functionality of the agent may dictate how the agent moves through the environment and may be based on visual or tactile cues from the real -world environment. For example, agents may include other autonomous and non-autonomous transportation systems (e.g., other vehicles, bicyclists, robots), pedestrians, animals, etc.

[0022] In the real world, the geographic region is an actual region within the real world that surrounds the autonomous system. Namely, from the perspective of the virtual driver, the geographic region is the region through which the autonomous system moves. The geographic region includes agents and map elements that are located in the real world. Namely, the agents and map elements each have a physical location in the geographic region that denotes a place in which the corresponding agent or map element is located. The map elements are stationary in the geographic region, whereas the agents may be stationary or nonstationary in the geographic region. The map elements are the elements shown in a map (e.g., road map, traffic map, etc.) or derived from a map of the geographic region.

[0023] The real-world environment changes as the autonomous system (116) moves through the real -world environment. For example, the geographic region may change and the agents may move positions, including new agents being added and existing agents leaving.

[0024] In order to interact with the real-world environment, the autonomous system (116) includes various types of sensors (104), such as light detection and ranging (lidar) sensors amongst other types, which are used to obtain measurements of the real-world environment, and cameras that capture images from the real -world environment. The autonomous system (116) may include other types of sensors as well. The sensors (104) provide input to the virtual driver (102).

[0025] In addition to sensors (104), the autonomous system (116) includes one or more actuators (108). An actuator is hardware and/or software that is configured to control one or more physical parts of the autonomous system based on a control signal from the virtual driver (102). In one or more embodiments, the control signal specifies an action for the autonomous system (e.g., turn on the blinker, apply brakes by a defined amount, apply accelerator by a defined amount, turn the steering wheel or tires by a defined amount, etc. . The actuator(s) (108) are configured to implement the action. In one or more embodiments, the control signal may specify a new state of the autonomous system and the actuator may be configured to implement the new state to cause the autonomous system to be in the new state. For example, the control signal may specify that the autonomous system should turn by a certain amount while accelerating at a predefined rate, while the actuator determines and causes the wheel movements and the amount of acceleration on the accelerator to achieve a certain amount of turn and acceleration rate.

[0026] The testing controller (112) is a component of the autonomous system (116) that may implement mixed reality testing. The testing controller (112) may execute within the autonomous system (116) during operation of the autonomous system (116) in the real -world environment to evaluate the other components of the autonomous system (116). The testing controller (112) may interface with the components of the autonomous system (116), including the virtual driver (102), sensors (104), and the actuators (108) to inject augmented data into the components and evaluate the reactions of the autonomous system (116). Additional description of the testing controller (112) is included in the discussion of FIG. 3.

[0027] The testing of a virtual driver of the autonomous systems may be performed in a real-world environment using mixed reality testing. As shown in FIG. 2, a simulator (200) may be remotely deployed that is configured to train and test a virtual driver (202) of an autonomous system during operation in the real world.

[0028] The simulator (200) may be a unified, modular, mixed reality, closed loop simulator that generates a world model for autonomous systems. The simulator (200) is a configurable simulation framework that enables not only evaluation of different autonomy components in isolation, but also as a complete system in a closed loop manner. The simulator (200) may reconstruct “digital twins” of real-world scenarios automatically, which may be augmented with virtual reality, enabling accurate evaluation of the virtual driver at scale. The simulator (200) may also be configured to generate the world model as a mixed reality simulation that combines real-world data and simulated data to create diverse and realistic evaluation variations to provide insight into the virtual driver’s performance. The mixed reality closed loop simulation allows the simulator (200) to analyze the virtual driver’s action on counterfactual “what-if’ scenarios that did not occur in the real world. The simulator (200) further includes functionality to simulate and train on rare yet safety-critical scenarios with respect to the entire autonomous system and closed loop training to enable automatic and scalable improvement of autonomy.

[0029] The simulator (200) creates the simulated environment (204) that is a part of the world model forming a virtual world in which the virtual driver (202) is the player in the virtual world. The virtual driver (202) may be a player in the virtual world of the simulator (200) while also controlling an autonomous system (e.g., the autonomous system (116) of FIG. 1) in the real world. The simulated environment (204) is a simulation of a real-world environment, which may or may not be in actual existence, in which the autonomous system is designed to move. As such, the simulated environment (204) includes a simulation of the objects (i.e., simulated objects or assets) and background in the real world, including the natural objects, construction, buildings and roads, obstacles, as well as other autonomous and non-autonomous objects. The simulated environment simulates the environmental conditions within which the autonomous system may be deployed. Additionally, the simulated environment (204) may be configured to simulate various weather conditions that may affect the inputs to the autonomous systems. The simulated objects may include both stationary and nonstationary objects. Nonstationary objects are actors in the real-world environment.

[0030] The simulator (200) also includes an evaluator (210). The evaluator (210) is configured to train and test the virtual driver (202) by creating various scenarios in the simulated environment that may be mixed with real sensor data for mixed reality testing. Each scenario is a configuration of the simulated environment including, but not limited to, static portions, movement of simulated objects, actions of the simulated objects with each other, and reactions to actions taken by the autonomous system and simulated objects. The evaluator (210) is further configured to evaluate the performance of the virtual driver using a variety of metrics.

[0031] The evaluator (210) assesses the performance of the virtual driver throughout the performance of the scenario. Assessing the performance may include applying rules. For example, the rules may be that the automated system does not collide with any other actor (real or simulated), compliance with safety and comfort standards (e.g., passengers not experiencing more than a certain acceleration force within the vehicle), the automated system not deviating from an executed trajectory, or other rule. Each rule may be associated with the metric information that relates a degree of breaking the rule with a corresponding score. The evaluator (210) may be implemented as a data- driven neural network that learns to distinguish between good and bad driving behavior. The various metrics of the evaluation system may be leveraged to determine whether the automated system satisfies the requirements of success criterion for a particular scenario. Further, in addition to system level performance, for modular based virtual drivers, the evaluator may also evaluate individual modules such as segmentation or prediction performance for actors in the scene with respect to the ground truth recorded in the simulator.

[0032] The simulator (200) is configured to operate in multiple phases as selected by the phase selector (208) and modes as selected by a mode selector (206). The phase selector (208) and mode selector (206) may be a graphical user interface or application programming interface component that is configured to receive a selection of phase and mode, respectively. The selected phase and mode define the configuration of the simulator (200). Namely, the selected phase and mode define which system components communicate and the operations of the system components.

[0033] The phase may be selected using a phase selector (208). The phase may be a training phase or a testing phase. In the training phase, the evaluator (210) provides metric information to the virtual driver (202), which uses the metric information to update the virtual driver (202). The evaluator (210) may further use the metric information to further train the virtual driver (202) by generating scenarios for the virtual driver. In the testing phase, the evaluator (210) may not provide the metric information to the virtual driver. In the testing phase, the evaluator (210) may use the metric information to assess the virtual driver and to develop scenarios for the virtual driver (202), which may be executed while the virtual driver (202) is controlling an autonomous vehicle in the real world.

[0034] The mode may be selected by the mode selector (206). The mode defines the degree to which real -world data is used, whether noise is injected into simulated data, degree of perturbations of real-world data, and whether the scenarios are designed to be adversarial. Example modes include open loop simulation mode, closed loop simulation mode, single module closed loop simulation mode, fuzzy mode, adversarial mode, a mixed reality mode, etc. In an open loop simulation mode, the virtual driver is evaluated with real-world data. In a single module closed loop simulation mode, a single module of the virtual driver is tested. An example of a single module closed loop simulation mode is a localizer closed loop simulation mode in which the simulator evaluates how the localizer estimated pose drifts over time as the scenario progresses in simulation. In a training data simulation mode, the simulator is used to generate training data. In a closed loop evaluation mode, the virtual driver and simulation system are executed together to evaluate system performance. In the adversarial mode, the actors are modified to perform adversarial to each other. In the fuzzy mode, noise is injected into the scenario (e.g, to replicate signal processing noise and other types of noise). Other modes may exist without departing from the scope of the system. [0035] In the mixed reality testing mode, the virtual driver (202) may be operating in an autonomous system to which the simulator (200) is remotely connected. The simulator (200) may receive real sensor data from the autonomous system controlled by the virtual driver (202) and generate augmented data that is injected into the autonomous system.

[0036] The simulator (200) includes the controller (212) that includes functionality to configure the various components of the simulator (200) according to the selected mode and phase. Namely, the controller (212) may modify the configuration of each of the components of the simulator based on configuration parameters of the simulator (200). Such components include the evaluator (210), the simulated environment (204), an autonomous system model (216), sensor simulation models (214), asset models (217), actor models (218), latency models (220), and a training data generator (222).

[0037] The autonomous system model (216) may be a detailed model of the autonomous system in which the virtual driver may execute (for offline testing and training) or may be executing (for mixed reality testing). The autonomous system model (216) includes model, geometry, physical parameters (e.g., mass distribution, points of significance), engine parameters, sensor locations and type, firing pattern of the sensors, information about the hardware on which the virtual driver executes (e.g., processor power, amount of memory, and other hardware information), and other information about the autonomous system. The various parameters of the autonomous system model may be configurable by the user or another system.

[0038] For example, if the autonomous system is a motor vehicle, the modeling and dynamics may include the type of vehicle (e.g., car, truck), make and model, geometry, physical parameters such as the mass distribution, axle positions, type and performance of engine, etc. The vehicle model may also include information about the sensors on the vehicle (e.g., camera, lidar, etc.), the sensors’ relative firing synchronization pattern, and the sensors’ calibrated extrinsics (e.g., position and orientation), and intrinsics (e.g., focal length). The vehicle model also defines the onboard computer hardware, sensor drivers, controllers, and the autonomy software release under test.

[0039] The autonomous system model includes an autonomous system dynamic model. The autonomous system dynamic model is used for dynamics simulation that takes the actuation actions of the virtual driver (e.g., steering angle, desired acceleration) and enacts the actuation actions on the autonomous system in the simulated environment to update the simulated environment and the state of the autonomous system. To update the state, a kinematic motion model may be used, or a dynamics motion model that accounts for the forces applied to the vehicle may be used to determine the state. Within the simulator, with access to real log scenarios with ground truth actuations and vehicle states at each time step, embodiments may also optimize analytical vehicle model parameters or learn parameters of a neural network that infers the new state of the autonomous system given the virtual driver outputs.

[0040] In one or more embodiments, the sensor simulation model (214) models, in the simulated environment, active and passive sensor inputs. Passive sensor inputs capture the visual appearance of the simulated environment including stationary and nonstationary simulated objects from the perspective of one or more cameras based on the simulated position of the camera(s) within the simulated environment. Examples of passive sensor inputs include inertial measurement unit (IMU) and thermal. Active sensor inputs are inputs to the virtual driver of the autonomous system from the active sensors, such as lidar, radar, global positioning system (GPS), ultrasound, etc. Namely, the active sensor inputs include the measurements taken by the sensors, the measurements being simulated based on the simulated environment based on the simulated position of the sensor(s) within the simulated environment. By way of an example, the active sensor measurements may be measurements that a lidar sensor would make of the simulated environment over time and in relation to the movement of the autonomous system. [0041] The sensor simulation models (214) are configured to simulate the sensor observations of the surrounding scene in the simulated environment (204) at each time step according to the sensor configuration on the vehicle platform. When the simulated environment directly represents the real-world environment, without modification, the sensor output may be directly fed into the virtual driver. For light-based sensors, the sensor model simulates light as rays that interact with objects in the scene to generate the sensor data. Depending on the asset representation (e.g., of stationary and nonstationary objects), embodiments may use graphics-based rendering for assets with textured meshes, neural rendering, or a combination of multiple rendering schemes. Leveraging multiple rendering schemes enables customizable world building with improved realism. Because assets are compositional in 3D and support a standard interface of render commands, different asset representations may be composed in a seamless manner to generate the final sensor data. Additionally, for scenarios that replay what happened in the real world and use the same autonomous system as in the real world, the original sensor observations may be replayed at each time step.

[0042] Additionally, the sensor simulation models (214) may deconstruct observations from sensors into frames of tokens. A frame may represent an observation and a token within the frame may be a feature vector that identifies features within a part of the frame. For a lidar sensor, the frame may be from a “bird’s eye view”, i.e., above the autonomous vehicle and each token may correspond to a group of contiguous voxels within a volume that is a part of the total volume of the frame. To generate frames and tokens from observations, a training application may be used to train encoder and decoder models that encode observations to frames of tokens and decode frames of tokens to observations. To predict future frames (and observations) another training application may train a spatio-temporal transformer that uses diffusion to generate predictions of frames. The predicted frames may be decoded to predicted observations. The predictions, frames, tokens, observations, etc., may be used by other models of the simulator (200).

[0043] Asset models (217) include multiple models, each model modeling a particular type of individual asset from the real world. The assets may include inanimate objects such as construction barriers, traffic signs, parked cars, and background (e.g., vegetation or sky). Each of the entities in a scenario may correspond to an individual asset. As such, an asset model, or instance of a type of asset model, may exist for each of the entities or assets in the scenario. The assets can be composed together to form the three-dimensional simulated environment. An asset model provides all the information needed by the simulator to simulate the asset. The asset model provides the information used by the simulator to represent and simulate the asset in the simulated environment. For example, an asset model may include geometry and bounding volume, the asset’s interaction with light at various wavelengths of interest (e.g, visible for camera, infrared for lidar, microwave for radar), animation information describing deformation (e.g, rigging) or lighting changes (e.g., turn signals), material information such as friction for different surfaces, and metadata such as the asset’s semantic class and key points of interest. Certain components of the asset may have different instantiations. For example, similar to rendering engines, an asset geometry may be defined in many ways, such as a mesh, voxels, point clouds, an analytical signed distance function, or neural network. Asset models may be created either by artists, or reconstructed from real-world sensor data, or optimized by an algorithm to be adversarial.

[0044] Closely related to, and possibly considered part of the set of asset models (217), are actor models (218). An actor model represents an actor in a scenario. An actor is a sentient being that has an independent decision-making process. Namely, in a real world, the actor may be an animate being (c.g., person or animal) that makes a decision based on an environment. The actor makes active movement rather than, or in addition to, passive movement. An actor model, or an instance of an actor model may exist for each actor in a scenario. The actor model is a model of the actor. If the actor is in a mode of transportation, then the actor model includes the mode of transportation in which the actor is located. For example, actor models may represent pedestrians, children, vehicles being driven by drivers, pets, bicycles, and other types of actors.

[0045] The actor model leverages the scenario specification and assets to control all actors in the scene and their actions at each time step. The behavior of an actor is modeled in a region of interest centered around the autonomous system. Depending on the scenario specification, the actor simulation will control the actors in the simulation to achieve the desired behavior. Actors can be controlled in various ways. One option is to leverage heuristic actor models, such as an intelligent-driver model (IDM) that may try to maintain a certain relative distance or time-to-collision (TTC) from a lead actor or heuristic- derived lane-change actor models. Another is to directly replay actor trajectories from a real log, or to control the actor(s) with a data-driven traffic model. Through the configurable design, embodiments may mix and match different subsets of actors to be controlled by different behavior models. For example, far-away actors that initially may not interact with the autonomous system and can follow a real log trajectory, but when near the vicinity of the autonomous system may switch to a data-driven actor model. In another example, actors may be controlled by a heuristic or data-driven actor model that still conforms to the high-level route in a real log. This mixed reality simulation provides control and realism.

[0046] Further, actor models may be configured to be in cooperative or adversarial mode. In cooperative mode, the actor model models actors to act rationally in response to the state of the simulated environment. In adversarial mode, the actor model may model actors acting irrationally, such as exhibiting road rage and bad driving.

[0047] The latency model (220) represents timing latency that occurs when the autonomous system is in the real-world environment. Several sources of timing latency may exist. For example, a latency may exist from the time that an event occurs to the sensors detecting the sensor information from the event and sending the sensor information to the virtual driver. Another latency may exist based on the difference between the computing hardware executing the virtual driver in the simulated environment as compared to the computing hardware of the virtual driver. Further, another timing latency may exist between the time that the virtual driver transmits an actuation signal to the autonomous system changing (e.g., direction or speed) based on the actuation signal. The latency model (220) models the various sources of timing latency.

[0048] Stated another way, in the real world, safety-critical decisions in the real world may involve fractions of a second affecting response time. The latency model simulates the exact timings and latency of different components of the onboard system. To enable a scalable evaluation without a strict requirement on exact hardware, the latencies and timings of the different components of autonomous system and sensor modules are modeled while running on different computer hardware. The latency model may replay latencies recorded from previously collected real-world data or have a data-driven neural network that infers latencies at each time step to match the hardware in loop simulation setup.

[0049] The training data generator (222) is configured to generate training data. For example, the training data generator (222) may modify real -world scenarios to create new scenarios. The modification of real-world scenarios is referred to as mixed reality. For example, mixed reality simulation may involve adding in new actors with novel behaviors, changing the behavior of one or more of the actors from the real world, and modifying the sensor data in that region while keeping the remainder of the sensor data the same as the original log. In some cases, the training data generator (222) converts a benign scenario into a safety- critical scenario.

[0050] The simulator (200) is connected to a data repository (205). The data repository (205) is any type of storage unit or device that is configured to store data. The data repository (205) includes data gathered from the real world. For example, the data gathered from the real world includes real actor trajectories (226), real sensor data (228), real trajectory of the system capturing the real world (230), and real latencies (232). Each of the real actor trajectories (226), real sensor data (228), real trajectory of the system capturing the real world (230), and real latencies (232) is data captured by or calculated directly from one or more sensors from the real world (e.g., in a real-world log). In other words, the data gathered from the real world are actual events that happened in real life. For example, in the case that the autonomous system is a vehicle, the real-world data may be captured by a vehicle driving in the real world with sensor equipment.

[0051] Further, the data repository (205) includes functionality to store one or more scenario specifications (240). A scenario specification (240) specifies a scenario and evaluation setting for testing or training the autonomous system. For example, the scenario specification (240) may describe the initial state of the scene, such as the current state of the autonomous system (e.g., the full 6D pose, velocity, and acceleration), the map information specifying the road layout, and the scene layout specifying the initial state of all the dynamic actors and objects in the scenario. The scenario specification may also include dynamic actor information describing how the dynamic actors in the scenario should evolve over time, which are inputs to the actor models. The dynamic actor information may include route information for the actors, desired behaviors, or aggressiveness. The scenario specification (240) may be specified by a user, programmatically generated using a domain specification language (DSL), procedurally generated with heuristics from a data-driven algorithm, or adversarial. The scenario specification (240) can also be conditioned on data collected from a real-world log, such as taking place on a specific real-world map or having a subset of actors defined by their original locations and trajectories.

[0052] The interfaces between virtual driver and the simulator may include similarities with the interfaces between the virtual driver and the autonomous system in the real world. For example, the interface between the sensor simulation model (214) and the virtual driver matches the virtual driver interacting with the sensors in the real world. The virtual driver is the actual autonomy software that executes on the autonomous system. The simulated sensor data that is output by the sensor simulation model (214) may be in or converted to the exact message format that the virtual driver takes as input, as if the virtual driver were in the real world, and the virtual driver can then run as a black box virtual driver with the simulated latencies incorporated for components that run sequentially. The virtual driver then outputs the exact same control representation that it uses to interface with the low-level controller on the real autonomous system. The autonomous system model (216) will then update the state of the autonomous system in the simulated environment. Thus, the various simulation models of the simulator (200) run in parallel asynchronously at their own frequencies to match the real-world setting.

[0053] Turning to FIG. 3, the autonomous system (300) may be tested using mixed reality testing. The autonomous system (300) may be a robot or a selfdriving vehicle (SDV). As an example, the autonomous system (300) may be a passenger car, light truck, van, semi-truck, etc., that may travel on a road. The autonomous system (300) includes the testing controller (302) to facilitate mixed reality testing of the other components of the autonomous system (300), which include the sensor system (330), the perception system (350), the planning system (360), and the actuator system (380).

[0054] The autonomous system (300) may execute each of the components using the processors and memory within the autonomous system (300). The autonomous system (300) may also connect remotely to other computing systems to perform some of the execution. For example, the simulator (305) of the testing controller (302) may execute on a remote computing system.

[0055] The testing controller (302) is a component of the autonomous system (300) responsible for managing and coordinating the mixed reality testing of the autonomous system (300). The testing controller (302) may oversee the execution of various test scenarios to evaluate the components of the autonomous system (300). The testing controller (302) may also introduce controlled variables and conditions to simulate real-world environments to comprehensively test capabilities of the autonomous system (300). The testing controller (302) utilizes multiple components, including the simulator (305), the test blender (312), and the evaluation controller (318), that process data including the simulation data (308), the system data (310), the augmented data (315), and the evaluation data (320). At least a portion of the testing controller (302) and corresponding components may operate on the hardware of the autonomous system (300). When the testing controller (302) is not executing remotely, the testing controller (302) and each corresponding component may execute on the hardware of the autonomous system (300).

[0056] The testing controller (302) may act as a simulation orchestrator to orchestrate the flow of control and data between various components in the autonomous system (300) and manage the interface between the simulator (305) and the onboard autonomy stack, which includes the sensor system (330), the perception system (350), the planning system (360), and the actuator system (380). For example, the testing controller (302) may determine what simulation modules are run, how simulations are blended into reality, and how the resulting mixed reality outputs are injected into the autonomy stack. The testing controller (302) may maintain a mixed reality world state by estimating the state of the real world to realistically place virtual actors with respect to the real world.

[0057] The testing controller (302) may execute actor simulation, which may use the simulator (305) to generate simulated actors injected into a scenario and control the behaviors of the actors. Actor simulation may start with a scenario specification, which specifies an initial state and the desired behaviors of the actors to be injected into the scenario. The specification may be provided using a domain-specific language (DSL) and may be manually generated, procedurally generated with heuristics, automatically generated with data- driven algorithms, combinations thereof, etc. Actor simulation will inject the specified actors into the simulation world state. Based on the test scenario, different approaches may be composed to simulate a diversity of actor behaviors, such as heuristic or data-driven models to simulate nominal traffic, adversarial actors to stress test the autonomous system (300) with safety-critical interactions, scripted trajectories to test specific capabilities, etc. Actor simulation for mixed reality testing may run in closed loop, enabling simulation actors to observe the real-world behavior of the autonomous system and react accordingly, similar to the real world. Additionally, actor simulation may be run asynchronously in a separate process to the main simulation loop, allowing the parallelization of computation and achieving real-time mixed reality simulation.

[0058] The testing controller (302) may also execute sensor simulation, which may further use the simulator (305) to synthesize sensor observations of the surrounding scene at each time step according to the sensor configuration on the autonomous system (300), as well as the contents of the simulation, including synthetic actors and weather modifications. Simulated sensors may include lidar sensors, camera sensors, radar sensors, global positioning systems, inertial measurement units (IMUs), ultrasound sensors, thermal sensors, microphones, etc. The simulated sensor observations may be input into the autonomy stack of the autonomous system (300) for processing as if the simulated sensor observations came from the real sensors or be blended together with the real data from the real sensors. To simulate sensors, a three- dimensional virtual world is built. Sensor data blending is used for the virtual actors and scene modifications to exist with the mixed reality world. The fused scene is spatially and temporally consistent between real and simulated data. As an example, adding new lidar points to an existing scan uses occlusion reasoning to remove points from behind a newly inserted object. Radar may use similar reasoning. Cameras may also use environment lighting correction in addition to occlusion reasoning. Asset representations built from either 1 digital twins of the real world or created manually may be used. Based on the world state of the simulation and based on the selected assets, the desired sensor data may be simulated. Multiple ways of simulating sensor readings may be used for different sensors. For light-based sensors, such as lidar and cameras, one approach is to perform rendering to simulate the formation process of the sensor data. Based on the asset representation, physically based rendering for assets with textured meshes, neural rendering, or a combination of both may be used. Leveraging multiple types of rendering enables customizable world building with improved realism. Similarly, physically based approaches may be used to simulate other sensors, including radar. For mixed reality testing, a mixture of real and simulated sensor data is leveraged to generate the sensor observations sent to the autonomy stack.

[0059] The testing controller (302) may also execute output level perception model simulation, which may test components of the autonomous system (300) without three-dimensional modelling of the mixed reality environment. Output level perception model simulations may be used to evaluate components of the autonomy system (300) that are after the perception system (350), such as the planning system (360). Given high-level information about a scene, such as virtual actors information, including placement, class, size, etc., output level perception model simulations may predict the outputs (e.g, bounding boxes and predicted trajectories, 3D occupancy) of what the perception system would have been if given real sensor data. Output level perception model simulation may be light-weight and fast for simulations using reduced amounts of computational capacity, however, building three-dimensional virtual worlds may not be performed. Instead, an alternative mapping from ground truth scene information to noisy perception outputs may be learned with machine learning models. Output level perception simulation may also be applied for specific sensors that may be challenging to simulate in real time, such as radar. The simulated outputs may be intermediate layers of neural networks or compressed representations such as bounding boxes or occupancy maps. [0060] The testing controller (302) may also execute latency simulations. To test the resilience of the autonomous system, latency simulations may be used to inject additional latency into the autonomous system (300). Latency may be injected into specific components in the autonomy system or throughout the entire system wholesale. To simulate additional latency, a variety of techniques may be used; for example, injecting a constant amount of latency into each module, sampling random delays according to a predetermined distribution, or using artificial intelligence models to adjust latencies to achieve a certain profile. Beyond injecting additional latency, entire messages between modules in the autonomy stack may also be dropped.

[0061] The simulator (305) is a component that generates virtual environments and scenarios for testing the autonomous system (300). The simulator (305) may create realistic simulations that mimic real-world conditions, enabling the autonomous system (300) to be tested in a variety of situations while the autonomous system (300) is operating in a real -world environment. The simulator (305) may include similar features as the simulator (200) of FIG. 2 and may execute locally on the autonomous system (300). The simulator (305) may operate in conjunction with the simulator (200) of FIG. 2 with certain features executed locally and other features executed remotely to add flexibility and scalability during mixed reality testing.

[0062] The simulation data (308) is the data generated by the simulator (305) during the testing process. The simulation data (308) may include information about the virtual environment, the behavior of virtual objects, and the interactions between the autonomous system (300) and the simulated world. The simulation data (308) may be used to evaluate the performance of the autonomous system (300), which may be used to identify areas for improvement of the autonomous system (300). The simulation data (308) may be a simulated version of the system data (310) with simulated versions of the sensor data (335), the perception data (355), the planning data (365), and the actuator data (385). The simulation data (308) may be blended into the system data (310) and injected into the components of the autonomous system (300). The simulation data (308) may include simulated images (e.g., camera images or lidar point clouds) with perturbations and may include simulated outputs generated by the other components, such as by the perceptron models (352) and the planning models (362). As an example, the simulation data (308) may include simulated perception data that identifies a simulated object and may include simulated planning data that may identify a trajectory for the autonomous system (300) based on the simulated object. The simulation data (308) may be one of the inputs to the test blender (312).

[0063] The simulation data (308) may include perturbations, which are differences between the simulation data (308) and the system data (310). The perturbations may be from objects in the simulation that are different from the objects in the real world. Perturbations may be introduced independently into the simulated versions of the sensor data (335), the perception of data (355), the planning data (365) and the actuator data (385) within the simulation data (308). As an example, a virtual object may be simulated and introduced into the simulated version of the sensor data (335) so that the perception system (350), the planning system (360), and the actuator system (380) respectively generate the perception data (355), the planning data (365), and the actuator data (385) with the perturbation that was introduced into the sensor data (335). As another example, a different trajectory for an object may be introduced into the simulated version of the perception data (355) so that the planning system (360) and the actuator system (380) respectively generate the planning data (365) and the actuator data (385) with the perturbation that was introduced in the perception data (355) without being introduced in the sensor data (335).

[0064] The system data (310) is the collection of data related to the operation and performance of the autonomous system (300). The system data (310) may include information about the status of the components of the autonomous system (300) as well as the data generated by the components at the autonomous system (300). The system data (310) may include versions of the sensor data (335), the perception data (355), the planning data (365), and the actuator data (385). The system data (310) may be one of the inputs to the test blender (312).

[0065] The test blender (312) is a component that mixes real -world data with simulated data to create a mixed reality testing environment by combining the simulation data (308) with the system data (310) to generate the augmented data (315). The test blender (312) may combine inputs from the sensor system (330) with data generated by the simulator (305) to provide a comprehensive testing scenario. Combining the simulation data (308) with the system data (310) introduces perturbations into the system data (310), which may be based on the simulation data (308). The perturbations introduced by the test blender (312) to the system data (310) test the robustness of the autonomous system (300) and corresponding components.

[0066] The test blender (312) may replace one or more portions of the system data (310) with the simulation data (308) to generate the augmented data (315). Instead of inserting virtual actors rendered in the simulation data (308) to an existing scene (sky, road, other vehicles, etc.) from the system data (310), each of the sky, background, actors, etc., may be modified from the system data (310) and replaced with the simulation data (308). Each portion of a scene may be changed, e.g., day to night, different background, removal of real -world actors, etc., in addition to simply adding actors to the observations captured in the system data (310).

[0067] The augmented data (315) is the data that has been modified by the test blender (312). The augmented data (315) may include a combination of the simulation data (308) and the system data (310) to form a rich and more complex testing environment while the autonomous system (300) is operating in the real -world environment. The augmented data (315) may be injected by the testing controller (302) into the components of the autonomous system (300) to evaluate the ability of the autonomous system (300) to handle unexpected situations and adapt to changing conditions. [0068] The evaluation controller (318) is a component of the autonomous system

(300) that assesses the performance and behavior of the autonomous system (300) during mixed reality testing. The evaluation controller (318) processes data from multiple sources, including simulation data (308), the system data (310), and the augmented data (315), to generate metrics and reports that formed the evaluation data (320). The evaluation controller (318) may identify discrepancies between expected and actual outcomes, highlighting areas where the autonomous system (300) may be refined or updated.

[0069] The evaluation controller (318) may assess performance of the autonomous system (300) throughout the execution of multiple scenarios. The evaluation controller (318) may be run as an onboard component during the test or as an offline process afterwards. The evaluation controller (318) may use metrics based on human-derived rules or data-driven neural networks that learn to distinguish between good and bad driving behavior. The rules may include no collisions with other actors by the autonomous system (300), compliance with safety and comfort standards such as passengers experiencing less than a certain acceleration force within the autonomous system, compliance with the rules of traffic, minimal deviation in executed trajectory with respect to a teacher or expert driver, etc. In addition to system-level performance, the evaluation controller (318) may also evaluate individual components such as perception system (350) or the prediction system (360). The metrics may be used to determine whether the autonomous system (300) satisfies a test and passes a scenario successfully.

[0070] The evaluation data (320) is the collection of data generated during the assessment of the performance of the autonomous system (300). The evaluation data (320) may include metrics, logs, and reports that document the outcomes of various mixed reality test scenarios. The evaluation data (320) may include a record of the behavior of the autonomous system (300) during operation and may be used in subsequent testing cycles and guide improvements to the autonomous system (300). [0071] The sensor system (330) is the collection of sensors that are part of the autonomous system (300) and may generate the sensor data (335) used by other components of the autonomous system (300). The sensor system (330) includes the sensors (332), which may utilize the sensor data (335) to generate the real- world data (338). The sensor system (330) may include multiple types of sensors. The sensor system (330) may include multiple sensors for each type of sensor. The sensor system (330) may be controlled by the testing controller (302) to overwrite the real -world data (338) with the augmented data (315) or to introduce latency from when data is captured by a sensor to when the data is published to the other components of the autonomous system (300).

[0072] The sensors (332) are the sensors that form the sensor system (330). The sensors (332) may include camera sensors, lidar sensors, radar sensors, etc. The sensors (332) may operate with parameters stored in the sensor data (335) to measure real-world phenomena that may be stored in the real-world data (338).

[0073] The sensor data (335) is a collection of data that may be used or stored by the sensors (332). The sensor data (335) may include parameters for the sensors (332), e.g., calibration parameters, that may be used by the sensors (332) to capture real -world phenomena. The sensor data (335) may also include the real- world data (338).

[0074] The real-world data (338) is data captured by the sensors (332) that measure real -world phenomena. For example, the real -world data (338) may include images from camera sensors and point clouds from lidar sensors.

[0075] The virtual driver (340) is the decision-making component of the autonomous system (300). The virtual driver (340) processes inputs from the sensor system (330) and generates control commands to navigate the autonomous system (300) through the real -world environment. The virtual driver (340) executes decision-making algorithms that determine actions such as steering, acceleration, and braking based on the perceived environment and planned route using the perception system (350) and the planning system (360). [0076] The perception system (350) is a component of the virtual driver (340) that may process the sensor data (335) to create an understanding of the real- world environment, in which the autonomous system (300) is operating, that may be recorded in the perception data (355). The perception system (350) uses the perception models (352) to detect, classify, and track objects such as vehicles, pedestrians, and obstacles.

[0077] The perception models (352) are algorithms and machine learning models that analyze the sensor data (335), including the real -world data (338), to identify and interpret objects in the real -world environment. The perception models (352) may perform tasks that include object detection, classification, tracking, etc. The perception models (352) may use data from cameras, lidar, radar, and other sensors to generate information about the real-world environment.

[0078] The perception data (355) is the processed output from the perception system (350). The perception data (355) includes information about detected objects, classifications, positions, and trajectories. The perception data (355) is used by the planning system (360) to make decisions about navigation and obstacle avoidance. The perception data (355) may be a record of a real-time representation of the real-world environment around the autonomous system.

[0079] The state data (358) represents the current status and conditions of the autonomous system. The state data (358) includes information such as the position, velocity, acceleration, orientation, etc. of the autonomous system (300). The state data (358) may be continuously updated and used by the virtual driver (340) and other components for real-time decisions and adjustments.

[0080] The planning system (360) is a component of the virtual driver (340) that may generate a path or trajectory for the autonomous system (300) to follow. Data from the perception system (350) and other sources may be processed by the planning system (360) to generate a route. Multiple paths may be evaluated from which one may be selected by the planning system (360) based on current conditions and objectives.

[0081] The planning models (362) are algorithms and computational models that generate and evaluate potential paths for the autonomous system (300). Factors such as road geometry, traffic rules, dynamic obstacles, etc., may be incorporated by the planning models (362) as features used to predict the outcomes of different maneuvers. Multiple options may be simulated by the planning models (362) to identify a path for the autonomous system (300).

[0082] The planning data (365) is the information generated during the path planning process. Details about potential routes, predicted trajectories of other objects, and environmental constraints may be included in the planning data (365). Continuous updates may be made to the planning data (365) to account for changes in the environment and the status of the autonomous system (300).

[0083] The control data (368) is output from the planning system (360) that may specify the actions to take for the autonomous system (300) to follow the path planned with the planning system (360). Commands for steering, acceleration, braking, and other controls may be included in the control data (368). Transmission of the control data (368) to the actuator system occurs to execute the planned maneuvers and navigate the autonomous system (300) through the real-world environment during mixed reality testing.

[0084] The actuator system (380) executes the control commands of the control data (368) generated by the planning system (360). The commands are translated into parameters for physical actions, such as steering, accelerating, and braking. Multiple mechanical and electronic components may be interacted with by the actuator system (380) to control the movement and behavior of the autonomous system (300).

[0085] The actuators (382) are devices of the actuator system (380) that perform the physical actions to control the movement of the autonomous system (300) in the real world during mixed reality testing. Components such as motors, hydraulic systems, electronic control units, etc., may be included in the actuators (382). Control signals are received by the actuators (382) and converted into movements, such as turning the wheels, applying brakes, adjusting throttle position, etc.

[0086] The actuator data (385) is a collection of data used by the actuators (382). The actuator data (385) may include information generated during the execution of control commands. Details about the current state and performance of the actuators, such as position, speed, and force applied, may be included in the actuator data (385). Monitoring and adjusting the actions of the actuators are based on the actuator data (385) to maintain accurate and responsive control of the autonomous system (300).

[0087] FIG. 4 shows a flowchart of a method (400) for mixed reality testing. The method (400) of FIG. 4 may be implemented using the systems and components of FIG. 1 through FIG. 3, FIG. 9 A, and FIG. 9B. One or more of the steps of the method (400) may be performed on, or received at, one or more computer processors. In an embodiment, a system may include at least one processor and an application that, when executing on the at least one processor, performs the method (400). In an embodiment, a non-transitory computer readable medium may include instructions that, when executed by one or more processors, perform the method (400). The outputs from various components (including models, functions, procedures, programs, processors, etc.) from performing the method (400) may be generated by applying a transformation to inputs using the components to create the outputs without using mental processes or human activities.

[0088] Block (402) includes generating system data from multiple sensors of an autonomous system operating in a real-world environment. The system data may include data generated from multiple systems within the autonomous system, including sensor data generated from a sensor system, perception data generated with a perception system, planning data generated with a planning system, and actuator data generated with an actuator system. The sensor data may be captured from multiple sensors, including camera sensors and lidar sensors. The captured data may include images and point clouds representing real-world phenomena. The sensor data is processed by the perception system to generate perception data, which may include information relating to a current status and condition of the autonomous system. The perception data may include state data with information such as the position, velocity, and orientation of the autonomous system, as well as details about detected objects and environmental features. The perception data is processed by the planning system to generate the planning data which may include control data. The control data may form part of the actuator data that is used by the actuator system to control the actuators of the autonomous system. The system data may be continuously updated to provide an accurate and real-time representation of the environment and the autonomous system for decisions and navigation.

[0089] Generating the system data may include capturing a first sensor image of sensor data of the system data from a first sensor of the multiple sensors. The first sensor is activated to begin the data collection process. The first sensor, which may be a camera sensor, captures an image (referred to as a camera image) of the surrounding environment. The camera image is stored as part of the system data, which may include parameters and metadata associated with the camera image, such as the timestamp, sensor settings, and environmental conditions at the time of capture.

[0090] Generating the system data may also include capturing a second sensor image of the sensor data of the system data from a second sensor of the multiple sensors. The second sensor, which may be a lidar sensor, is activated to initiate data collection. The second sensor may emit laser pulses to measure distances to surrounding objects, creating a point cloud (which may be referred to as a lidar image) that represents the environment. The lidar image is stored as part of the system data, which may include parameters and metadata associated with the lidar image, such as the timestamp, sensor settings, and environmental conditions at the time of capture. [0091] Block (405) includes generating simulation data including a perturbation to the system data. A simulator may execute to create a virtual environment that mimics real-world conditions to generate the simulation data. Within the virtual environment, multiple objects may be rendered, each with specific trajectories and behaviors. The simulation data may include simulated versions of the sensor data, the perception data, the planning data, and the actuator data from the other systems of the autonomous system. The simulated versions of the sensor data, the perception data, the planning data, and the actuator data may each include perturbations. The perturbations may include adding virtual objects that are not present in the real -world environment or altering the characteristics of existing objects. The simulation data with perturbations is blended with the system data to create augmented data that includes the perturbations from the simulation data.

[0092] Generating the simulation data may include executing a simulator to generate a first rendered object image and a second rendered object image. The simulation data includes the first rendered object image and the second rendered object image, the perturbation may be referred to as a first perturbation, the first rendered object image includes the first perturbation to the system data, and the second rendered object image includes a second perturbation to the system data. The simulator may be executed to create a virtual environment that replicates real-world conditions to generate a rendered object image of a virtual object that is not present in the real -world environment. Within the virtual environment, the first rendered object image is generated by rendering a virtual object with specific characteristics and behaviors based on the type of sensor being simulated, e.g., a camera sensor. The simulator may generate the second rendered object image based on a different type of sensor. Both rendered object images are created to introduce variations and perturbations to the system data. The rendered object images may be stored as part of the simulation data for a simulated scenario. [0093] Generating the simulation data may also include simulating multiple objects with multiple trajectories. A virtual environment is created to replicate real -world conditions with multiple virtual objects that may have distinct characteristics and behaviors. The trajectories of the virtual objects may be calculated based on predefined parameters, such as speed, direction, and interactions with other objects. The simulation may continuously update virtual object positions and movements to reflect realistic dynamics based on virtual objects, real -world objects, and the autonomous system. The trajectories for each of the objects (real and virtual) may be monitored and adjusted to account for changes in the virtual environment to represent real -world scenarios.

[0094] Generating the simulation data may also include rendering the multiple objects into multiple sensor images in the simulation data. A virtual environment may be established to replicate real-world conditions. Within the virtual environment, multiple virtual objects may be introduced, each with specific characteristics and behaviors. The rendering process may include generating detailed visual representations of these objects from various perspectives for multiple sensors and sensor types. Sensor models, such as camera and lidar models, are used to simulate how the objects would appear to different sensors. The rendered images include various aspects of the objects, such as shape, texture, and spatial relationships.

[0095] Block (408) includes augmenting the system data to include the perturbation of the simulation data and generate augmented data. The system data may be augmented by modifying the system data to include portions of the simulation data. The modification to the system data may be different for different types of simulation data. The modification may be performed by blending, overlaying, inserting, appending, etc., portions of the simulation data to parts of the system data. For sensor data, the modification may be performed by blending the simulation data with the system data by inserting or overlaying portions of the simulation data to parts of the sensor data. The sensor data, the perception data, the planning data, in the actuator data may each also be augmented by dropping data, appending data, and introducing latency. The autonomous system may use messages to pass information between different systems within the autonomous system, e.g., between the sensor system, the perception system, the planning system, and the actuator system. Messages may be dropped or data within the messages may be dropped. Latencies may also be introduced that increase the time between publishing messages between systems and processing messages received by the systems.

[0096] Augmenting the system data may include blending the first rendered object image into the first sensor image to generate a first blended image of the augmented data. The first rendered object image, which represents a virtual object, may be aligned with the first sensor image captured from the real -world environment. The alignment positions the virtual object within the context of the real -world scene. The blending process then combines the elements of both images, to integrate the virtual object into the sensor image.

[0097] Augmenting the system data may also include blending the second rendered object image into the second sensor image to generate a second blended image of the augmented data. The second sensor image is from a second sensor. The second sensor may be the same or a different sensor type than the first sensor.

[0098] When the sensor image is a camera image from a camera sensor, the integration may include adjusting the lighting, shadows, and reflections to match the real -world conditions depicted in the camera image. When the sensor image is a lidar image from a lidar sensor, adjustments may be made to the point cloud data to be consistent with the real-world conditions depicted in the lidar image. The blended images that result are composites that include both real and virtual elements, providing a realistic representation of the environment with the virtual object.

[0099] Augmenting the system data may also include overlaying a rendered object image of the simulation data into a real -world image from sensor data of the system data. The rendered object image may be overlaid by replacing pixels from the real -world image with pixels from the rendered object image. The rendered object image depicts one or more objects, shadows, and reflections that are not depicted in the real-world image.

[00100] Augmenting the system data may also include inserting a rendered object point cloud of the simulation data into a real-world point cloud from the sensor data of the system data. The rendered object point cloud may be inserted by replacing point data from the real-world point cloud with point data from the rendered object point cloud. The rendered object point cloud may at least partially occlude parts of the real -world point cloud. For example, the virtual object in the rendered object point cloud may be closer to the autonomous system than an object from the real -world point cloud at the same relative location.

[00101] Augmenting the system data may also include inserting a rendered object reflection of the simulation data into a real-world radar image from the sensor data of the system data for a radar sensor. The rendered object reflection may at least partially attenuate the real-world radar image. The rendered object reflection, which represents a virtual object, is aligned with the real -world radar image captured from the sensor data. The alignment positions the virtual reflection within the context of the real-world radar scene. The insertion process then integrates the rendered object reflection with the real -world radar image that blends with the radar data from the real world. Adjustments may be made to the radar signal strength and attenuation to match the conditions depicted in the real-world radar image. The resulting composite radar image may include both real and virtual elements for a realistic representation of the environment with the virtual object.

[00102] Block (410) includes injecting the augmented data into one or more components of the autonomous system to test the autonomous system with the perturbation. The augmented data, which may include both real and simulated elements, may be directed to specific components of the autonomous system, such as the perception system and the planning system. The perception system processes the augmented data to generate perception data that reflects the perturbations introduced by the simulated elements. Additional perturbations may be introduced into the perception data. The perception data is then used by the planning system to make decisions about navigation and obstacle avoidance. The planning system generates control data based on the perception data, which includes commands for steering, acceleration, and braking. The control data is transmitted to the actuator system, which executes the commands to navigate the autonomous system through the real-world environment.

[00103] Injecting the augmented data may include injecting the augmented data as sensor data into a perception system of the autonomous system. The augmented data, which includes both real and simulated elements, is transmitted to the perception system, where the augmented data is treated as real sensor data. The perception system processes the augmented data to generate perception data that reflects the perturbations introduced by the simulated elements. The generated perception data includes information about detected objects, classifications, positions, and trajectories.

[00104] Injecting the augmented data may also include executing a perception model, of the perception system, to process the augmented data to generate perception data responsive to the perturbation in the sensor data. The augmented data, which includes both real and simulated elements, is fed into the perception system. The perception model, which includes computational models including machine learning models, analyzes the augmented data that includes the perturbations to detect and classify objects within the mixed reality environment. The perception model processes the sensor data to identify features such as shapes, sizes, and positions of objects, which may include perturbations introduced by the simulated elements. The output of this analysis is the perception data, which includes information about the detected objects, classifications, trajectories, etc. [00105] Injecting the augmented data may also include executing a planning model to process the perception data to generate control data to operate the autonomous system responsive to the perturbation. The perception data, which includes information about detected objects, classifications, and trajectories, is fed into the planning system. The planning model, which includes computational models including machine learning models, analyzes the perception data that includes the perturbations to generate a path or trajectory for the autonomous system. The planning models may evaluate multiple potential paths, considering factors such as road geometry, traffic rules, dynamic obstacles, etc. A path may be selected based on the current conditions and objectives. The planning model may generate control data, which includes commands for steering, acceleration, and braking, which may be transmitted to the actuator system for execution to navigate the autonomous system through the real -world environment with the virtual objects. The process tests the autonomous system ability to respond to the perturbations introduced by the augmented data.

[00106] Injecting the augmented data may also include injecting the augmented data as perception data into a planning system of the autonomous system. The augmented data, which includes both real and simulated elements, may be transmitted to the planning system, where the augmented data is treated as perception data. The planning system processes the augmented data to generate a path or trajectory for the autonomous system. Multiple potential paths may be evaluated considering factors such as road geometry, traffic rules, dynamic obstacles, etc. A path may be selected based on the current conditions and objectives using the augmented data with perturbations.

[00107] Injecting the augmented data may also include executing a planning model to process the perception data to generate control data, responsive to the perturbation in the perception data, to operate the autonomous system responsive to the perturbation. The perception data, which includes information about detected objects, classifications, and trajectories, as well as one or more perturbations is transmitted to the planning system. The planning model, which may include computational models with machine learning models, analyzes the perception data that includes the perturbations to generate a path or trajectory for the autonomous system. The planning model may evaluate multiple potential paths, considering factors such as road geometry, traffic rules, dynamic obstacles, etc. Selection of a path is based on the current conditions and objectives. The planning model generates control data, which includes commands for steering, acceleration, and braking, from the perception data responsive to the perturbations. The control data is transmitted to the actuator system, which executes the commands to navigate the autonomous system through the real-world environment responsive to the perturbations.

[00108] Injecting the augmented data may also include injecting transmission parameters, in the augmented data, into the one or more components of the autonomous system. The transmission parameters include one or more latency parameters and drop parameters. The augmented data may include transmission parameters that include latency parameters and drop parameters. The latency parameters identify a delay introduced in the transmission of data between components, which may test different network conditions with different levels of latency. Drop parameters define the probability or rate at which data packets are lost during transmission, which may test communication failures within the autonomous system. Different latency parameters and drop parameters may be used to test the autonomous system under multiple network conditions.

[00109] Injecting the augmented data may also include executing the autonomous system using the transmission parameters. The augmented data, along with the transmission parameters, may be transmitted to specific components of the autonomous system, including the sensor, perception, planning, and actuator systems. The components process the augmented data, taking into account the specified latency and drop parameters, to generate system data and operate the autonomous system in the mixed reality environment with the transmission parameters that may increase latency and drop messages. [00110] The method (400) may further include steps for evaluating and training the machine learning models of an autonomous system. Evaluating the machine learning models may include assessing performance under different conditions and in multiple scenarios using mixed reality. Metrics and benchmarks may be used to measure accuracy and efficiency of the models. The evaluation process may include running simulations and real-world tests together in mixed reality environments to gather data on how the models perform.

[00111] Evaluating the machine learning models may include collecting evaluation data responsive to executing one or more components responsive to the perturbation. Evaluation data is gathered during execution of the autonomous system components in mixed reality environments, capturing system responses to the perturbations introduced with simulation data. The evaluation data may include information on decision-making processes, accuracy of object detection, effectiveness of navigation and obstacle avoidance.

[00112] Training the machine learning models may include collecting the augmented data as supplemental training data. The augmented data, which includes both real and simulated elements, is used to provide additional training examples for the machine learning models. Collecting the augmented data increases the range of scenarios and conditions in which the autonomous system may be tested. The augmented data may be processed and labeled to identify features and characteristics of the environment.

[00113] Training the machine learning models may also include training one or more models of the autonomous system using one or more of the evaluation data and the supplemental training data. The training process may include inputting evaluation data and supplemental training data into the machine learning algorithms utilized by the models of the autonomous system. The models may be iteratively updated based on feedback (e.g., error between computed and expected values) from the training and evaluation data, adjusting parameters to improve performance and accuracy. Processes and techniques used may include techniques such as supervised learning, where models learn from labeled examples, and reinforcement learning.

[00114] FIG. 5A through FIG. 8 depict examples of system and methods implementing the disclosure. FIG. 5A through FIG. 5C illustrate different modes of operation of mixed reality testing. FIG. 6 illustrates a blended image of augmented data. FIG. 7 illustrates a sequence of blended images. FIG. 8 illustrates a system implementing mixed reality testing.

[00115] FIG. 5 A through FIG. 5C illustrate the autonomous system (500) operating in the different modes (520), (530), and (570) for mixed reality testing. Mixed reality testing may incorporate a suite of technologies to bridge the gap between offline simulation and real-world testing, marrying the flexibility of offline simulation with the realism of real-world testing. In mixed reality testing, simulation is blended with the real world by running the realtime simulation onboard the autonomous system (500) under test. Depending on the test intention, the mixing of simulation with reality may operate with different modes. For example, the autonomous system (500) may operate in the mode (520) for mixed reality testing at the sensor level, in the mode (530) for mixed reality testing at an intermediate autonomy level, and in the mode (570) for mixed reality testing at the hardware level. The different modes (520), (530), and (570) may run concurrently within the same autonomous system.

[00116] The autonomous system (500) includes the actuator system (502), the sensor system (505), the perception system (508), the planning system (510), and the onboard simulator (500). The actuator system (502) interfaces the autonomous system (500) with the real world by controlling actuators with control data from the planning system (510). The sensor system (505) generates sensor data from observations from the real world that may be passed to the perception system (508). The perception system (508) processes data from the sensor system (505) to generate perception data that identifies the state of the autonomous system (500) and the surrounding environment and which may be passed to the planning system (510). The planning system (510) processes data from the perception system to generate control data that may be used by the actuator system (502).

[00117] Turning to FIG 5 A, the autonomous system (500) operates in the mode (520) in which augmented data may be injected into the sensor system (505) to generate mixed reality sensor data used by the perception system (508). When operating in the mode (520), as the autonomous system (500) acquires real sensor data, the data is modified on-the-fly to create entirely new scenarios, mixing the real world with a diversity of new actors (e.g., emergency vehicles, construction zones, vulnerable road users, etc.), new behaviors, new environmental and lighting conditions, etc. Mixed reality testing operates at the sensor level for end-to-end testing of the virtual driver. As the autonomous system (500) acquires sensor data of the real physical surroundings through the sensors, the onboard simulator may be used to modify the sensor data itself, on- the-fly, in real time, to create the mixed reality test scenario. The modified sensor data is then fed into the virtual driver, which may react accordingly. The onboard simulator may modify a wide range of sensors in real time, including lidar, camera, and radar, by leveraging graphics processing unit (GPU) accelerated rendering to maintain realism as well as temporal and cross sensor consistency. For example, an autonomous system (500) based on lidar may be tested by injecting a synthetic actor near the autonomous system. At each subsequent step, the simulated lidar appearance may be rendered from the current position of the autonomous system, after which the rendered points would be blended with the current point cloud observed by the real sensors.

[00118] Once the simulation data from the simulator has been transformed to be expressed in the same reference frame as the real data, the next step is to blend the two data streams together in a realistic way. The blending of data streams is performed using the sensor data blending component (which may also be referred to as a test blender that blends data for testing the autonomous system). The sensor data blending component fuses the newly rendered data (objects, weather effects, etc.) with the existing real -world sensor data. Different methods may be used for different sensors in order for realistic fusion of real and simulation data. For lidar, a focus may be on having realistic occlusions, i.e., “shadows” cast by the new lidar points onto the existing ones or vice versa. For example, if a new car is added to a scene, the lidar points from the real- world data (which fall behind the car) should be removed in order to maintain realism, as the lidar points behind the car would not be observed if the car was present in the real world. For cameras, occlusion may also be modeled. The modeling of occlusion may be done using methods from computer graphics, such as depth-based filtering (e.g, z buffer), typically leveraging light detection and ranging as a proxy for the real scene depth. Other effects such as re-lighting may also be modeled. For example, an asset constructed from images collected during a sunny day would look out of place when inserted into the world on a cloudy day. Shadows cast by the new objects, and onto the new objects may also be modeled for increased realism. The modeling of shadows may be achieved with classical rendering (e.g., raster graphics), an end-to-end blending neural network, a hybrid method which combines explicit graphics with blending, etc. Similar choices may be utilized for other sensor simulations such as radar.

[00119] Turning to FIG 5B, the autonomous system (500) operates in the mode (530) in which augmented data may be injected into the perception system (508) to generate mixed reality perception data used by the planning system (510). When operating in the mode (530), the intermediate outputs of the autonomous system (500) may be modified, enabling more precise testing of individual components of the autonomous system. In the mode (530), reality is augmented at the level of perception and prediction output by injecting synthetic objects into the autonomy stack. The injection of synthetic objects may be done by adjusting the output of the perception system or by replacing the output of the perception system. Adjustment to the output of the perception system may run on top of an existing perception system, allowing additional fuzzing of the perception output in order to validate the fault-tolerance of the virtual driver, irrespective of the perception output parametrization. For example, in the case of object-based outputs, fuzzing may include dropping detections, adding noise to object locations, object heading, object size, predicted trajectory, etc.

[00120] For occupancy-based methods, fuzzing may include adding structured noise to predicted occupancy, both at the current and at future timesteps. Replacing the output of the perception system altogether allows real-world testing of motion planning, controls, and actuation, without a dependency on any one particular perception module. Mixed reality testing therefore operates by adding physically realistic and temporally consistent object trajectories into the scene, which are then consumed by motion planning and downstream tasks. Perturbation strategies may be based on heuristics, such as dropping any detection with a fixed probability, or may be learned from data.

[00121] Turning to FIG 5C, the autonomous system (500) operates in the mode (570) in which augmented data may be injected into one or more of the sensor system (505), the perception system (508), and the planning system (510) to test the autonomous system (500) with injected latency. When operating in the mode (570), the operation of the autonomous system (500) at the hardware level may be modified, for example, to simulate faults in the autonomy stack, inject additional latencies, simulate dropped system messages, etc. Mixed reality testing may obviate the challenges of executing intricate tests in the real world to test beyond what is possible or safe in the real world without using mixed reality. Faults may be injected into the autonomous system (500) in the real- world environment in order to validate the resilience of the system. Fault injection may involve arbitrary termination of system components or the injection of erroneous or malformed messages. Fault injection may be extended to simulate partial or complete sensor failures. By using mixed reality testing, behavior of the autonomous system (500) may be tested in rare real-life cases.

[00122] Turning to FIG. 6, the image (600) shows a current state of an autonomous system operating in the real world with outputs from the sensor system, perception system, and planning system integrated into the image (600). The image (600) may be displayed on a user interface within the autonomous system or on a remote terminal. The lidar data (610) depicts, within the image (600), a point cloud generated by a lidar sensor of the autonomous system. The self-driving truck (620) is a depiction of the autonomous system within the image (600). The modified lidar data (630) is simulation data that represents a virtual object that is not on the real -world road with the autonomous system. The bounding box (650) surrounds the virtual object and is perception data generated with a perception model of a perception system by the autonomous system from the simulation data injected into the real -world data. The motion plan (680) is generated by the planning system responsive to the perception data that includes the bounding box (650) for the virtual object injected into the real world lidar data (610).

[00123] Turning to FIG. 7, the sequence (700) of the images (710), (720), (730), (750), and (780) illustrates a scenario of mixed reality testing. In the image (710), the autonomous system detects a virtual object on the shoulder of the road ahead of the autonomous system. In the image (720), the autonomous system utilizes a planning system to generate a plan to change lanes and avoid the virtual object. In the image (730), the autonomous system performs the lane change. In the image (750), the autonomous system completes the lane change and passes the virtual object. In the image (780), the autonomous system maneuvers back to the original lane after passing the virtual object.

[00124] Turning to FIG. 8, the autonomous system (800) (e.g., a robot, which may be a self-driving vehicle) operates in the real-world environment (802). The autonomous system (800) performs mixed reality testing.

[00125] The real -world sensor data (810) is generated by a sensor system from observations of the real-world environment (802). The real-world sensor data (810) may be input to the state estimation system (820), the onboard simulator (830), and the blending system (855). [00126] The state estimation system (820) may be a part of the perception system (872) that operates without being injected with augmented data from the onboard simulator (830). The state estimation system (820) may identify information about the current state of the autonomous system (800), which may include the pose of the autonomous system with respect to the objects in the real-world environment. The output of the state estimation system (820) may be input to the onboard simulator (830) and to the autonomy software (870).

[00127] The onboard simulator (830) simulates virtual objects that may be injected into the other components of the autonomous system (800). The onboard simulator (830) generates the simulated sensor data (850) from the real sensor data (810) and from the output of the state estimation system (820).

[00128] The simulated sensor data (850) is output from the onboard simulator (830). The simulated sensor data (850) includes perturbations from the real sensor data (810). The perturbations within the simulated sensor data (850) may represent virtual objects around the autonomous system (800) within a virtual environment that do not exist in the real -world environment (802). The simulated sensor data may be input to the blending system (855).

[00129] The blending system (855) blends the real sensor data (810) with the simulated sensor data (850). The blending of the data introduces perturbations to the real sensor data (810) that may represent virtual objects in the simulated sensor data (850) that do not exist in the real-world environment (802). The blending system (855) outputs augmented data, referred to as the mixed sensor data (860), that combines the real -world sensor data (810) with the simulated sensor data (850).

[00130] The mixed sensor data (860) is augmented data that is output from the blending system (855). The mixed sensor data (860) incorporates representations of virtual objects from the simulated sensor data (850) within the real -world sensor data (810). The mixed sensor data (860) is input to the perception system (872) of the autonomy software (870). The mixed sensor data (860) may also be logged or passed to other parts of the autonomy software (870) in addition to the perception system (872). For example, the mixed sensor data (860) may be sent to components that process online mapping and location, to a visual-language model that provides additional assistance, etc.

[00131] The autonomy software (870) is an autonomy stack that processes sensor data to generate control data to operate the actuator system (880). The autonomy software (870) includes the perception system (872) and the planning system (875).

[00132] The perception system (872) processes the output of the state estimation system (820) with the mixed sensor data (860) to generate perception data. The perception data identifies and classifies objects around the autonomous system (800). The perception data is output from the perception system (872) to the planning system (875).

[00133] The planning system (875) processes the perception data from the perception system (872) to generate control data. The control data may identify actions to be performed by actuators of the actuator system (880) to navigate the autonomous system (800) through the real-world environment (802). The control data is output from the planning system (875) to the actuator system (880).

[00134] The actuator system (880) includes a set of actuators that physically operate the mechanical components of the autonomous system (800) to navigate through the real-world environment (802). The actuator system (880) receives control data from the planning system (875), which is sent to the individual actuators that form the actuator system (880). The actuators may include steering, braking, acceleration, etc.

[00135] Embodiments may be implemented on a special purpose computing system specifically designed to achieve the improved technological result. Turning to FIG. 9A and FIG. 9B, the special purpose computing system (900) may include one or more computer processors (902), non-persistent storage (904), persistent storage (906), a communication interface (912) (e.g., Bluetooth interface, infrared interface, network interface, optical interface, etc.), and numerous other elements and functionalities that implement the features and elements of the disclosure. The computer processor(s) (902) may be an integrated circuit for processing instructions. The computer processor(s) (902) may be one or more cores or micro-cores of a processor. The computer processor(s) (902) includes one or more processors. The one or more processors may include a central processing unit (CPU), a graphics processing unit (GPU), a tensor processing unit (TPU), combinations thereof, etc.

[00136] The input device(s) (910) may include a touchscreen, keyboard, mouse, microphone, touchpad, electronic pen, or any other type of input device. The input device(s) (910) may receive inputs from a user that are responsive to data and messages presented by the output device(s) (908). The inputs may include text input, audio input, video input, etc., which may be processed and transmitted by the computing system (900) in accordance with the disclosure. The communication interface (912) may include an integrated circuit for connecting the computing system (900) to a network (not shown) (e.g., a local area network (TAN), a wide area network (WAN) such as the Internet, mobile network, or any other type of network), and/or to another device, such as another computing device.

[00137] Further, the output device(s) (908) may include a display device, a printer, external storage, or any other output device. One or more of the output device(s) (908) may be the same or different from the input device(s) (910). The input device(s) (910) and the output device(s) (908) may be locally or remotely connected to the computer processor(s) (902). Many different types of computing systems exist, and the aforementioned input device(s) (910) and output device(s) (908) may take other forms. The output device(s) (908) may display data and messages that are transmitted and received by the computing system (900). The data and messages may include text, audio, video, etc., and include the data and messages described above in the other figures of the disclosure.

[00138] Software instructions in the form of computer readable program code to perform embodiments may be stored, in whole or in part, temporarily or permanently, on a non-transitory computer readable medium such as a CD, DVD, storage device, a diskette, a tape, flash memory, physical memory, or any other computer readable storage medium. Specifically, the software instructions may correspond to computer readable program code that, when executed by a processor(s), is configured to perform one or more embodiments, which may include transmitting, receiving, presenting, and displaying data and messages described in the other figures of the disclosure.

[00139] The computing system (900) in FIG. 9A may be connected to or be a part of a network. For example, as shown in FIG. 9B, the network (920) may include multiple nodes (e.g., node X (922) and node Y (924)). Each node may correspond to a computing system, such as the computing system (900) shown in FIG. 9A, or a group of nodes combined may correspond to the computing system (900) shown in FIG. 9A. By way of an example, embodiments may be implemented on a node of a distributed system that is connected to other nodes. By way of another example, embodiments may be implemented on a distributed computing system having multiple nodes, where each portion may be located on a different node within the distributed computing system. Further, one or more elements of the aforementioned computing system (900) may be located at a remote location and connected to the other elements over a network.

[00140] The nodes (e.g., node X (922) and node Y (924)) in the network (920) may be configured to provide services for a client device (926), including receiving requests and transmitting responses to the client device (926). For example, the nodes may be part of a cloud computing system. The client device (926) may be a computing system, such as the computing system (900) shown in FIG. 9A. Further, the client device (926) may include and/or perform all or a portion of one or more embodiments of the disclosure. [00141] The computing system (900) of FIG. 9A may include functionality to present raw and/or processed data, such as results of comparisons and other processing. For example, presenting data may be accomplished through various presenting methods. Specifically, data may be presented by being displayed in a user interface, transmitted to a different computing system, and stored. The user interface may include a GUI that displays information on a display device. The GUI may include various GUI widgets that organize what data is shown as well as how data is presented to a user. Furthermore, the GUI may present data directly to the user, e.g., data presented as actual data values through text, or rendered by the computing device into a visual representation of the data, such as through visualizing a data model.

[00142] As used herein, the term “connected to” contemplates multiple meanings. A connection may be direct or indirect (e.g., through another component or network). A connection may be wired or wireless. A connection may be temporary, permanent, or a semi-permanent communication channel between two entities.

[00143] The various descriptions of the figures may be combined and may include or be included within the features described in the other figures of the application. The various elements, systems, components, and steps shown in the figures may be omitted, repeated, combined, and/or altered as shown from the figures. Accordingly, the scope of the present disclosure should not be considered limited to the specific arrangements shown in the figures.

[00144] In the application, ordinal numbers (e.g., first, second, third, etc.) may be used as an adjective for an element (z.e., any noun in the application). The use of ordinal numbers is not to imply or create any particular ordering of the elements, nor to limit any element to being a single element unless expressly disclosed, such as by the use of the terms “before”, “after”, “single”, and other such terminology. Rather, the use of ordinal numbers is to distinguish between the elements. By way of an example, a first element is distinct from a second element, and the first element may encompass more than one element and succeed (or precede) the second element in an ordering of elements.

[00145] Further, unless expressly stated otherwise, or is an “inclusive or” and, as such includes “and.” Further, items joined by an “or” may include any combination of the items with any number of each item unless expressly stated otherwise.

[00146] In the above description, numerous specific details are set forth in order to provide a more thorough understanding of the disclosure. However, it will be apparent to one of ordinary skill in the art that the technology may be practiced without these specific details. In other instances, well-known features have not been described in detail to avoid unnecessarily complicating the description. Further, other embodiments not explicitly described above may be devised which do not depart from the scope of the claims as disclosed herein. Accordingly, the scope should be limited only by the attached claims.

Claims

CLAIMS What is claimed is:

1. A method comprising: generating system data from a plurality of sensors of an autonomous system operating in a real-world environment; generating simulation data comprising a perturbation to the system data; augmenting the system data to include the perturbation of the simulation data and generate augmented data; and injecting the augmented data into one or more components of the autonomous system to test the autonomous system with the perturbation.

2. The method of claim 1, wherein generating the system data comprises: capturing a first sensor image of sensor data of the system data from a first sensor of the plurality of sensors, and capturing a second sensor image of the sensor data of the system data from a second sensor of the plurality of sensors, wherein generating the simulation data comprises: executing a simulator to generate a first rendered object image and a second rendered object image, wherein the simulation data comprises the first rendered object image and the second rendered object image, the perturbation is a first perturbation, the first rendered object image comprises the first perturbation to the system data, and the second rendered object image comprises a second perturbation to the system data, and wherein augmenting the system data comprises: blending the first rendered object image into the first sensor image to generate a first blended image of the augmented data, and blending the second rendered object image into the second sensor image to generate a second blended image of the augmented data.

3. The method of claim 1, wherein generating the system data comprises capturing a camera image with a camera sensor of the plurality of sensors, wherein generating the simulation data comprises executing a simulator to generate a rendered object image of a virtual object that is not present in the real-world environment, and wherein augmenting the system data comprises blending the rendered object image into the camera image to generate a blended image of the augmented data.

4. The method of claim 1, wherein generating the system data comprises capturing a lidar image with a lidar sensor of the plurality of sensors, wherein the lidar image comprises a point cloud, wherein generating the simulation data comprises executing a simulator to generate a rendered object image of a virtual object that is not present in the real-world environment, and wherein augmenting the system data comprises blending the rendered object image into the lidar image to generate a blended image of the augmented data.

5. The method of claim 1, wherein generating the simulation data comprises: simulating a plurality of objects with a plurality of trajectories, and rendering the plurality of objects into a plurality of sensor images in the simulation data.

6. The method of claim 1, wherein injecting the augmented data further comprises: injecting the augmented data as sensor data into a perception system of the autonomous system, executing a perception model, of the perception system, to process the augmented data to generate perception data responsive to the perturbation in the sensor data, and executing a planning model to process the perception data to generate control data to operate the autonomous system responsive to the perturbation.

7. The method of claim 1, wherein injecting the augmented data further comprises: injecting the augmented data as perception data into a planning system of the autonomous system, and executing a planning model to process the perception data to generate control data, responsive to the perturbation in the perception data, to operate the autonomous system responsive to the perturbation.

8. The method of claim 1, wherein injecting the augmented data comprises: injecting transmission parameters, in the augmented data, into the one or more components of the autonomous system, wherein the transmission parameters include one or more latency parameters and drop parameters, and executing the autonomous system using the transmission parameters.

9. The method of claim 1, wherein augmenting the system data comprises: overlaying a rendered object image of the simulation data into a real -world image from sensor data of the system data, wherein the rendered object image depicts one or more objects, shadows, and reflections that are not depicted in the real-world image, inserting a rendered object point cloud of the simulation data into a real -world point cloud from the sensor data of the system data, wherein the rendered object point cloud at least partially occludes the real -world point cloud, and inserting a rendered object reflection of the simulation data into a real -world radar image from the sensor data of the system data, wherein the rendered object reflection at least partially attenuates the real-world radar image.

10. The method of claim 1, further comprising: collecting evaluation data responsive to executing the one or more components responsive to the perturbation; collecting the augmented data as supplemental training data; and training one or more models of the autonomous system using one or more of the evaluation data and the supplemental training data.

11. A system comprising: at least one processor; and an application that, when executing on the at least one processor, performs operations comprising: generating system data from a plurality of sensors of an autonomous system operating in a real-world environment, generating simulation data comprising a perturbation to the system data, augmenting the system data to include the perturbation of the simulation data and generate augmented data, and injecting the augmented data into one or more components of the autonomous system to test the autonomous system with the perturbation.

12. The system of claim 11, wherein generating the system data comprises: capturing a first sensor image of sensor data of the system data from a first sensor of the plurality of sensors, and capturing a second sensor image of the sensor data of the system data from a second sensor of the plurality of sensors, wherein generating the simulation data comprises: executing a simulator to generate a first rendered object image and a second rendered object image, wherein the simulation data comprises the first rendered object image and the second rendered object image, the perturbation is a first perturbation, the first rendered object image comprises the first perturbation to the system data, and the second rendered object image comprises a second perturbation to the system data, and wherein augmenting the system data comprises: blending the first rendered object image into the first sensor image to generate a first blended image of the augmented data, and blending the second rendered object image into the second sensor image to generate a second blended image of the augmented data.

13. The system of claim 11, wherein generating the system data comprises capturing a camera image with a camera sensor of the plurality of sensors, wherein generating the simulation data comprises executing a simulator to generate a rendered object image of a virtual object that is not present in the real-world environment, and wherein augmenting the system data comprises blending the rendered object image into the camera image to generate a blended image of the augmented data.

14. The system of claim 11, wherein generating the system data comprises capturing a lidar image with a lidar sensor of the plurality of sensors, wherein the lidar image comprises a point cloud, wherein generating the simulation data comprises executing a simulator to generate a rendered object image of a virtual object that is not present in the real-world environment, and wherein augmenting the system data comprises blending the rendered object image into the lidar image to generate a blended image of the augmented data.

15. The system of claim 11, wherein generating the simulation data comprises: simulating a plurality of objects with a plurality of trajectories, and rendering the plurality of objects into a plurality of sensor images in the simulation data.

16. The system of claim 11, wherein injecting the augmented data further comprises: injecting the augmented data as sensor data into a perception system of the autonomous system, executing a perception model, of the perception system, to process the augmented data to generate perception data responsive to the perturbation in the sensor data, and executing a planning model to process the perception data to generate control data to operate the autonomous system responsive to the perturbation.

17. The system of claim 11, wherein injecting the augmented data further comprises: injecting the augmented data as perception data into a planning system of the autonomous system, and executing a planning model to process the perception data to generate control data, responsive to the perturbation in the perception data, to operate the autonomous system responsive to the perturbation.

18. The system of claim 11, wherein injecting the augmented data comprises: injecting transmission parameters, in the augmented data, into the one or more components of the autonomous system, wherein the transmission parameters include one or more latency parameters and drop parameters, and executing the autonomous system using the transmission parameters.

19. The system of claim 11, wherein augmenting the system data comprises: overlaying a rendered object image of the simulation data into a real -world image from sensor data of the system data, wherein the rendered object image depicts one or more objects, shadows, and reflections that are not depicted in the real-world image, inserting a rendered object point cloud of the simulation data into a real -world point cloud from the sensor data of the system data, wherein the rendered object point cloud at least partially occludes the real -world point cloud, and inserting a rendered object reflection of the simulation data into a real -world radar image from the sensor data of the system data, wherein the rendered object reflection at least partially attenuates the real-world radar image.

20. A non-transitory computer readable medium comprising instructions executable by at least one processor to perform: generating system data from a plurality of sensors of an autonomous system operating in a real-world environment; generating simulation data comprising a perturbation to the system data; augmenting the system data to include the perturbation of the simulation data and generate augmented data; and injecting the augmented data into one or more components of the autonomous system to test the autonomous system with the perturbation.