Jha et al. - Google Patents
Local Policy Optimization for Trajectory-Centric Reinforcement Learning
- Document ID
- 1321714978515878143
- Author
- Jha D
- Kolaric P
- Raghunathan A
- Lewis F
- Benosman M
- Romeres D
- Nikovski D
Snippet
The goal of this paper is to present a method for simultaneous trajectory and local stabilizing policy optimization to generate local policies for trajectory-centric model-based reinforcement learning (MBRL). This is motivated by the fact that global policy optimization …
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B17/00—Systems involving the use of models or simulators of said systems
- G05B17/02—Systems involving the use of models or simulators of said systems electric
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B2219/00—Program-control systems
- G05B2219/30—NC systems
- G05B2219/39—Robotics, robotics to robotics hand
Similar Documents
| Publication | Title |
|---|---|
| EP3924884B1 (en) | System and method for robust optimization for trajectory-centric model-based reinforcement learning |
| Choi et al. | Reinforcement learning for safety-critical control under model uncertainty, using control Lyapunov functions and control barrier functions |
| Gribovskaya et al. | Learning non-linear multivariate dynamics of motion in robotic manipulators |
| Liu et al. | Gaussian processes for learning and control: A tutorial with examples |
| Kamthe et al. | Data-efficient reinforcement learning with probabilistic model predictive control |
| Levine et al. | Learning neural network policies with guided policy search under unknown dynamics |
| Castillo et al. | Intelligent adaptive model-based control of robotic dynamic systems with a hybrid fuzzy-neural approach |
| Rozo et al. | Learning optimal controllers in human-robot cooperative transportation tasks with position and force constraints |
| Aydinoglu et al. | Consensus complementarity control for multi-contact MPC |
| Izadbakhsh et al. | Robust adaptive control of robot manipulators using Bernstein polynomials as universal approximator |
| Krug et al. | Model predictive motion control based on generalized dynamical movement primitives |
| Russel et al. | Robust constrained-MDPs: Soft-constrained robust policy optimization under model uncertainty |
| Sacks et al. | Learning sampling distributions for model predictive control |
| Kolaric et al. | Local policy optimization for trajectory-centric reinforcement learning |
| Ewerton et al. | Learning trajectory distributions for assisted teleoperation and path planning |
| Zhang et al. | Safety-critical control for robotic systems with uncertain model via control barrier function |
| Gao | Optimizing robotic arm control using deep Q-learning and artificial neural networks through demonstration-based methodologies: A case study of dynamic and static conditions |
| Umlauft et al. | Bayesian uncertainty modeling for programming by demonstration |
| Medina et al. | Considering uncertainty in optimal robot control through high-order cost statistics |
| Rezazadeh et al. | Learning contraction policies from offline data |
| Ewerton et al. | Reinforcement learning of trajectory distributions: Applications in assisted teleoperation and motion planning |
| Jha et al. | Local Policy Optimization for Trajectory-Centric Reinforcement Learning |
| Jha et al. | Robust optimization for trajectory-centric model-based reinforcement learning |
| Russel et al. | Lyapunov robust constrained-MDPs: Soft-constrained robustly stable policy optimization under model uncertainty |
| Dani et al. | Reinforcement learning for image-based visual servo control |