Learning neural network policies with guided policy search under unknown dynamics
Levine et al., 2014
- Document ID
- 13351610957787428402
- Author
- Levine S
- Abbeel P
- Publication year
- 2014
- Publication venue
- Advances in neural information processing systems
Snippet
We present a policy search method that uses iteratively refitted local linear models to optimize trajectory distributions for large, continuous problems. These trajectory distributions can be used within the framework of guided policy search to learn policies with an arbitrary …
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B17/00—Systems involving the use of models or simulators of said systems
- G05B17/02—Systems involving the use of models or simulators of said systems electric
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/004—Artificial life, i.e. computers simulating life
- G06N3/006—Artificial life, i.e. computers simulating life based on simulated virtual individual or collective life forms, e.g. single "avatar", social simulations, virtual worlds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B2219/00—Program-control systems
- G05B2219/30—Nc systems
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Levine et al. | Learning neural network policies with guided policy search under unknown dynamics | |
| Shahid et al. | Continuous control actions learning and adaptation for robotic manipulation through reinforcement learning | |
| Montgomery et al. | Guided policy search via approximate mirror descent | |
| US11714996B2 (en) | Learning motor primitives and training a machine learning system using a linear-feedback-stabilized policy | |
| Li et al. | A policy search method for temporal logic specified reinforcement learning tasks | |
| Levine et al. | Guided policy search | |
| Levine et al. | Learning complex neural network policies with trajectory optimization | |
| Kumar et al. | Optimal control with learned local models: Application to dexterous manipulation | |
| Cutler et al. | Efficient reinforcement learning for robots using informative simulated priors | |
| Yamaguchi et al. | Neural networks and differential dynamic programming for reinforcement learning problems | |
| Hu et al. | Robot policy improvement with natural evolution strategies for stable nonlinear dynamical system | |
| Montgomery et al. | Guided policy search as approximate mirror descent | |
| Park et al. | Inverse optimal control for humanoid locomotion | |
| Levine | Motor skill learning with local trajectory methods | |
| Curran et al. | Dimensionality Reduced Reinforcement Learning for Assistive Robots | |
| Ma et al. | Reinforcement learning with model-based feedforward inputs for robotic table tennis | |
| Gao | Optimizing robotic arm control using deep Q-learning and artificial neural networks through demonstration-based methodologies: A case study of dynamic and static conditions | |
| Wawrzyński | A cat-like robot real-time learning to run | |
| Çallar et al. | Hybrid learning of time-series inverse dynamics models for locally isotropic robot motion | |
| Afzali et al. | A modified convergence DDPG algorithm for robotic manipulation | |
| Surovik et al. | Learning an expert skill-space for replanning dynamic quadruped locomotion over obstacles | |
| Ennen et al. | Learning robust manipulation skills with guided policy search via generative motor reflexes | |
| Liu et al. | Safe model-based control from signal temporal logic specifications using recurrent neural networks | |
| Van Heerden et al. | A combination of particle swarm optimization and model predictive control on graphics hardware for real-time trajectory planning of the under-actuated nonlinear Acrobot | |
| Cao et al. | Shape control of elastic deformable linear objects for robotic cable assembly |