Boloka et al., 2021 - Google Patents
Knowledge transfer using model-based deep reinforcement learning (Boloka et al., 2021)
- Document ID
- 1877457442355883430
- Authors
- Boloka T
- Makondo N
- Rosman B
- Publication year
- 2021
- Publication venue
- 2021 Southern African Universities Power Engineering Conference/Robotics and Mechatronics/Pattern Recognition Association of South Africa (SAUPEC/RobMech/PRASA)
Snippet
Deep reinforcement learning has recently been adopted for robot behavior learning, where robot skills are acquired and adapted from data generated by the robot while interacting with its environment through a trial-and-error process. Despite this success, most model-free …
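The snippet describes the trial-and-error loop at a high level: the robot gathers transitions by interacting with its environment, fits a model, and uses that model to choose actions. A minimal toy sketch of such a model-based loop (not code from the paper; the one-dimensional linear dynamics, the random-shooting planner, and every name below are illustrative assumptions):

```python
# Toy model-based RL loop: collect transitions by trial and error,
# fit a dynamics model, then plan actions with the learned model.
import random

random.seed(0)

def true_dynamics(s, a):
    # Hidden environment: the next state moves by 0.8*a (unknown to the agent).
    return s + 0.8 * a

# 1) Trial-and-error data collection with random actions.
data = []
s = 0.0
for _ in range(200):
    a = random.uniform(-1.0, 1.0)
    s_next = true_dynamics(s, a)
    data.append((s, a, s_next))
    s = s_next

# 2) Fit a one-parameter dynamics model s' = s + k*a by least squares.
num = sum(a * (sn - s) for s, a, sn in data)
den = sum(a * a for s, a, sn in data)
k = num / den

# 3) Plan with the learned model: sample candidate actions and pick the
#    one whose predicted next state is closest to the goal (random shooting).
def plan(s, goal, n_candidates=100):
    cands = [random.uniform(-1.0, 1.0) for _ in range(n_candidates)]
    return min(cands, key=lambda a: abs((s + k * a) - goal))

# Roll out the planner against the real environment.
s, goal = 0.0, 1.0
for _ in range(5):
    s = true_dynamics(s, plan(s, goal))
```

After a few planned steps the state approaches the goal, illustrating how a learned model substitutes for direct model-free trial and error at decision time.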
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B17/00—Systems involving the use of models or simulators of said systems
- G05B17/02—Systems involving the use of models or simulators of said systems electric
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management, e.g. organising, planning, scheduling or allocating time, human or machine resources; Enterprise planning; Organisational models
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B2219/00—Program-control systems
- G05B2219/30—Nc systems
- G05B2219/39—Robotics, robotics to robotics hand
- G05B2219/39376—Hierarchical, learning, recognition and skill level and adaptation servo level
Similar Documents
| Publication | Title |
|---|---|
| Boney et al. | Regularizing model-based planning with energy-based models |
| Shahid et al. | Learning continuous control actions for robotic grasping with reinforcement learning |
| Shaik et al. | Adaptive Control Through Reinforcement Learning: Robotic Systems in Action |
| Yamaguchi et al. | Neural networks and differential dynamic programming for reinforcement learning problems |
| Lim et al. | Prediction of reward functions for deep reinforcement learning via Gaussian process regression |
| Krug et al. | Model predictive motion control based on generalized dynamical movement primitives |
| Kicki et al. | Fast kinodynamic planning on the constraint manifold with deep neural networks |
| Tavassoli et al. | Learning skills from demonstrations: A trend from motion primitives to experience abstraction |
| Ota et al. | Data-efficient learning for complex and real-time physical problem solving using augmented simulation |
| Marino et al. | Modeling and planning under uncertainty using deep neural networks |
| Sleiman et al. | Guided reinforcement learning for robust multi-contact loco-manipulation |
| Boloka et al. | Knowledge transfer using model-based deep reinforcement learning |
| Xue et al. | Logic-skill programming: An optimization-based approach to sequential skill planning |
| Bilal et al. | Beyond Success: Quantifying Demonstration Quality in Learning from Demonstration |
| Deng et al. | Hierarchical robot learning for physical collaboration between humans and robots |
| Tanneberg et al. | Deep spiking networks for model-based planning in humanoids |
| Modery et al. | Assisting humans in human-robot collaborative assembly contexts through deep q-learning |
| Caamaño et al. | Introducing synaptic delays in the NEAT algorithm to improve modelling in cognitive robotics |
| Srinivasan et al. | Path planning with user route preference - A reward surface approximation approach using orthogonal Legendre polynomials |
| Bondre et al. | Deep reinforcement learning algorithms: A comprehensive overview |
| Fabisch | Learning and generalizing behaviors for robots from human demonstration |
| Taylor et al. | Integrating human demonstration and reinforcement learning: Initial results in human-agent transfer |
| Govindaraju et al. | Enhancing Knot-Tying With Max-Margin Q-Learning: A Study on Trajectory Transfer and Lookahead Policies |
| Ecker | Iterative linear quadratic regulator for collision-free trajectory optimization and model predictive control of a timber crane |
| Oikonomou et al. | Online prediction of novel trajectories using a library of movement primitives |