Robust optimization for trajectory-centric model-based reinforcement learning
Jha et al., 2019
- Document ID
- 16791750663601743939
- Author
- Jha D
- Kolaric P
- Romeres D
- Raghunathan A
- Benosman M
- Nikovski D
- Publication year
- 2019
- Publication venue
- Workshop on Safety and Robustness in Decision Making at NeurIPS
Snippet
This paper presents a method to perform robust trajectory optimization for trajectory-centric Model-based Reinforcement Learning (MBRL). We propose a method that allows us to use the uncertainty estimates present in predictions obtained from a model-learning algorithm to …
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/048—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators using a predictor
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0205—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric not using a model or a simulator of the controlled system
- G05B13/021—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric not using a model or a simulator of the controlled system in which a variable is automatically adjusted to optimise the performance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B17/00—Systems involving the use of models or simulators of said systems
- G05B17/02—Systems involving the use of models or simulators of said systems electric
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| EP3924884B1 (en) | System and method for robust optimization for trajectory-centric model-based reinforcement learning | |
| Dutta et al. | Sherlock-a tool for verification of neural network feedback systems: demo abstract | |
| Xiang et al. | Reachability analysis and safety verification for neural network control systems | |
| Hours et al. | A parametric nonconvex decomposition algorithm for real-time and distributed NMPC | |
| Carius et al. | Constrained stochastic optimal control with learned importance sampling: A path integral approach | |
| Lee et al. | Gp-ilqg: Data-driven robust optimal control for uncertain nonlinear dynamical systems | |
| Udekwe et al. | Comparing actor-critic deep reinforcement learning controllers for enhanced performance on a ball-and-plate system | |
| Everett | Neural network verification in control | |
| Zhang et al. | Safety‐critical control for robotic systems with uncertain model via control barrier function | |
| Zhou et al. | Runtime-safety-guided policy repair | |
| Kolaric et al. | Local policy optimization for trajectory-centric reinforcement learning | |
| Aoyama et al. | Receding horizon differential dynamic programming under parametric uncertainty | |
| Hall et al. | Differentially flat learning-based model predictive control using a stability, state, and input constraining safety filter | |
| Saha et al. | Neural identification for control | |
| Tan et al. | Zero-order control barrier functions for sampled-data systems with state and input dependent safety constraints | |
| Jha et al. | Robust optimization for trajectory-centric model-based reinforcement learning | |
| Kolaric et al. | Robust optimization for trajectory-centric model-based reinforcement learning | |
| Zhang et al. | Robustified time-optimal collision-free motion planning for autonomous mobile robots under disturbance conditions | |
| Munoz et al. | Online adaptive compensation for model uncertainty using extreme learning machine-based control barrier functions | |
| Russel et al. | Lyapunov robust constrained-MDPs: Soft-constrained robustly stable policy optimization under model uncertainty | |
| Ito et al. | Second-order bounds of Gaussian kernel-based functions and its application to nonlinear optimal control with stability | |
| Kobayashi et al. | Optimal control of multi-vehicle systems with temporal logic constraints | |
| Cubuktepe et al. | Shared control with human trust and workload models | |
| Jha et al. | Local Policy Optimization for Trajectory-Centric Reinforcement Learning | |
| Wehbeh et al. | Nonlinear scenario‐based model predictive control for quadrotors with bidirectional thrust |