Combined Optimization and Reinforcement Learning for Manipulation Skills. Englert et al., 2016
- Document ID
- 13342371013864077267
- Author
- Englert P
- Toussaint M
- Publication year
- 2016
- Publication venue
- Robotics: Science and systems
Snippet
This work addresses the problem of how a robot can improve a manipulation skill in a sample-efficient and secure manner. As an alternative to the standard reinforcement learning formulation where all objectives are defined in a single reward function, we …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6288—Fusion techniques, i.e. combining data from various sources, e.g. sensor fusion
- G06K9/629—Fusion techniques, i.e. combining data from various sources, e.g. sensor fusion of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6279—Classification techniques relating to the number of classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Englert et al. | | Combined Optimization and Reinforcement Learning for Manipulation Skills. |
| Chebotar et al. | | Closing the sim-to-real loop: Adapting simulation randomization with real world experience |
| Fu et al. | | One-shot learning of manipulation skills with online dynamics adaptation and neural network priors |
| Jetchev et al. | | Fast motion planning from experience: trajectory prediction for speeding up movement generation |
| Maeda et al. | | Probabilistic movement primitives for coordination of multiple human–robot collaborative tasks |
| Polydoros et al. | | Survey of model-based reinforcement learning: Applications on robotics |
| US20210178600A1 (en) | | System and Method for Robust Optimization for Trajectory-Centric Model-Based Reinforcement Learning |
| Heiden et al. | | Interactive differentiable simulation |
| Venkatraman et al. | | Improved learning of dynamics models for control |
| Agia et al. | | STAP: Sequencing task-agnostic policies |
| Urain et al. | | Learning implicit priors for motion optimization |
| Toussaint et al. | | A Bayesian view on motor control and planning |
| Wilcox et al. | | LS3: Latent space safe sets for long-horizon visuomotor control of sparse reward iterative tasks |
| Rath et al. | | Using physics knowledge for learning rigid-body forward dynamics with Gaussian process force priors |
| Acharya et al. | | Competency assessment for autonomous agents using deep generative models |
| Onol et al. | | A comparative analysis of contact models in trajectory optimization for manipulation |
| Gao | | Optimizing robotic arm control using deep Q-learning and artificial neural networks through demonstration-based methodologies: A case study of dynamic and static conditions |
| Alt et al. | | Robot program parameter inference via differentiable shadow program inversion |
| Torabi et al. | | Sample-efficient adversarial imitation learning from observation |
| Millard et al. | | Automatic differentiation and continuous sensitivity analysis of rigid body dynamics |
| Zhu et al. | | Model identification via physics engines for improved policy search |
| Sochopoulos et al. | | Learning deep dynamical systems using stable neural ODEs |
| Morere et al. | | Reinforcement learning with probabilistically complete exploration |
| Totsila et al. | | Sensorimotor Learning With Stability Guarantees via Autonomous Neural Dynamic Policies |
| Laferrière et al. | | Deep Koopman representation for control over images (DKRCI) |