Levine et al., 2014 - Google Patents

Learning neural network policies with guided policy search under unknown dynamics

Document ID: 13351610957787428402
Author: Levine S; Abbeel P
Publication year: 2014
Publication venue: Advances in neural information processing systems

Snippet

We present a policy search method that uses iteratively refitted local linear models to optimize trajectory distributions for large, continuous problems. These trajectory distributions can be used within the framework of guided policy search to learn policies with an arbitrary …
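
The "iteratively refitted local linear models" mentioned in the snippet can be illustrated with a minimal sketch. It assumes the local model is an affine map x' ≈ A x + B u + c fitted by ridge-regularized least squares to sampled transitions. The function name `fit_local_linear_dynamics`, the `ridge` parameter, and the toy data are illustrative assumptions, not taken from the paper, whose fitting procedure may differ.

```python
import numpy as np


def fit_local_linear_dynamics(X, U, X_next, ridge=1e-6):
    """Fit x' approx A x + B u + c by ridge-regularized least squares.

    Hypothetical helper for illustration only.
    X: (N, dx) states, U: (N, du) actions, X_next: (N, dx) next states.
    """
    N, dx = X.shape
    du = U.shape[1]
    # Stack states, actions, and a constant column so the bias c is fitted too.
    Z = np.hstack([X, U, np.ones((N, 1))])            # (N, dx + du + 1)
    reg = ridge * np.eye(dx + du + 1)                 # small ridge term for stability
    W = np.linalg.solve(Z.T @ Z + reg, Z.T @ X_next)  # (dx + du + 1, dx)
    A = W[:dx].T            # (dx, dx) state matrix
    B = W[dx:dx + du].T     # (dx, du) action matrix
    c = W[-1]               # (dx,) constant offset
    return A, B, c


# Toy usage: recover a known affine system from noiseless samples.
rng = np.random.default_rng(0)
A_true = np.array([[1.0, 0.1], [0.0, 1.0]])
B_true = np.array([[0.0], [0.1]])
c_true = np.array([0.01, -0.02])
X = rng.normal(size=(200, 2))
U = rng.normal(size=(200, 1))
X_next = X @ A_true.T + U @ B_true.T + c_true
A_hat, B_hat, c_hat = fit_local_linear_dynamics(X, U, X_next)
```

In an iterative scheme of the kind the snippet describes, a fit like this would be repeated on fresh samples gathered near the current trajectory, so the model only needs to be accurate locally.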

Classifications

- G—PHYSICS
  - G05—CONTROLLING; REGULATING
    - G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
      - G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
        - G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
          - G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
            - G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
          - G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
            - G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
      - G05B17/00—Systems involving the use of models or simulators of said systems
        - G05B17/02—Systems involving the use of models or simulators of said systems electric
      - G05B2219/00—Program-control systems
        - G05B2219/30—Nc systems
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N3/00—Computer systems based on biological models
        - G06N3/004—Artificial life, i.e. computers simulating life
          - G06N3/006—Artificial life, i.e. computers simulating life based on simulated virtual individual or collective life forms, e.g. single "avatar", social simulations, virtual worlds
        - G06N3/02—Computer systems based on biological models using neural network models
          - G06N3/04—Architectures, e.g. interconnection topology
          - G06N3/08—Learning methods
      - G06N5/00—Computer systems utilising knowledge based models
        - G06N5/02—Knowledge representation
          - G06N5/022—Knowledge engineering, knowledge acquisition
        - G06N5/04—Inference methods or devices
      - G06N99/00—Subject matter not provided for in other groups of this subclass
        - G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run

Similar Documents

Publication	Publication Date	Title
Levine et al.	2014	Learning neural network policies with guided policy search under unknown dynamics
Shahid et al.	2022	Continuous control actions learning and adaptation for robotic manipulation through reinforcement learning
Montgomery et al.	2016	Guided policy search via approximate mirror descent
US11714996B2 (en)	2023-08-01	Learning motor primitives and training a machine learning system using a linear-feedback-stabilized policy
Li et al.	2018	A policy search method for temporal logic specified reinforcement learning tasks
Levine et al.	2013	Guided policy search
Levine et al.	2014	Learning complex neural network policies with trajectory optimization
Kumar et al.	2016	Optimal control with learned local models: Application to dexterous manipulation
Cutler et al.	2015	Efficient reinforcement learning for robots using informative simulated priors
Yamaguchi et al.	2016	Neural networks and differential dynamic programming for reinforcement learning problems
Hu et al.	2022	Robot policy improvement with natural evolution strategies for stable nonlinear dynamical system
Montgomery et al.	2016	Guided policy search as approximate mirror descent
Park et al.	2013	Inverse optimal control for humanoid locomotion
Levine	2014	Motor skill learning with local trajectory methods
Curran et al.	2016	Dimensionality Reduced Reinforcement Learning for Assistive Robots.
Ma et al.	2023	Reinforcement learning with model-based feedforward inputs for robotic table tennis
Gao	2024	Optimizing robotic arm control using deep Q-learning and artificial neural networks through demonstration-based methodologies: A case study of dynamic and static conditions
Wawrzyński	2009	A cat-like robot real-time learning to run
Çallar et al.	2022	Hybrid learning of time-series inverse dynamics models for locally isotropic robot motion
Afzali et al.	2023	A modified convergence DDPG algorithm for robotic manipulation
Surovik et al.	2021	Learning an expert skill-space for replanning dynamic quadruped locomotion over obstacles
Ennen et al.	2019	Learning robust manipulation skills with guided policy search via generative motor reflexes
Liu et al.	2021	Safe model-based control from signal temporal logic specifications using recurrent neural networks
Van Heerden et al.	2014	A combination of particle swarm optimization and model predictive control on graphics hardware for real-time trajectory planning of the under-actuated nonlinear Acrobot
Cao et al.	2024	Shape control of elastic deformable linear objects for robotic cable assembly