On learning with imperfect representations
Kalyanakrishnan et al., 2011
- Document ID
- 926201761782441903
- Author
- Kalyanakrishnan S
- Stone P
- Publication year
- 2011
- Publication venue
- 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Snippet
In this paper we present a perspective on the relationship between learning and representation in sequential decision making tasks. We undertake a brief survey of existing real-world applications, which demonstrates that the classical “tabular” representation …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B2219/00—Program-control systems
- G05B2219/30—Nc systems
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Parker-Holder et al. | Automated reinforcement learning (autorl): A survey and open problems | |
| CN112668235B (en) | Robot control method based on DDPG algorithm of offline model pre-training learning | |
| Bloembergen et al. | Evolutionary dynamics of multi-agent learning: A survey | |
| Du et al. | A survey on multi-agent deep reinforcement learning: from the perspective of challenges and applications | |
| Xu et al. | Learning to explore via meta-policy gradient | |
| Boutilier | Planning, learning and coordination in multiagent decision processes | |
| CN109669452A (en) | Cloud robot task scheduling method and system based on parallel reinforcement learning | |
| Werbos | Reinforcement learning and approximate dynamic programming (RLADP)—foundations, common misconceptions, and the challenges ahead | |
| Smith et al. | Traditional heuristic versus Hopfield neural network approaches to a car sequencing problem | |
| EP3938960A1 (en) | A bilevel method and system for designing multi-agent systems and simulators | |
| Wang et al. | On the convergence of the monte carlo exploring starts algorithm for reinforcement learning | |
| Liotet et al. | Learning a belief representation for delayed reinforcement learning | |
| CN120103855A (en) | Collaborative path planning method for heterogeneous multi-UAVs based on multi-agent deep reinforcement learning | |
| Showalter et al. | Neuromodulated multiobjective evolutionary neurocontrollers without speciation | |
| Kwiatkowski et al. | Understanding reinforcement learned crowds | |
| CN116128028A (en) | An Efficient Deep Reinforcement Learning Algorithm for Combinatorial Optimization of Continuous Decision Spaces | |
| Vasant | Hybrid mesh adaptive direct search genetic algorithms and line search approaches for fuzzy optimization problems in production planning | |
| Kalyanakrishnan et al. | On learning with imperfect representations | |
| CN108830483A (en) | Multi-agent system task planning method | |
| Li et al. | Introspective Reinforcement Learning and Learning from Demonstration. | |
| Jasna et al. | Application of game theory in path planning of multiple robots | |
| Cavalieri | Performance optimization of flexible manufacturing systems using artificial neural networks | |
| Srinivasaiah et al. | Reinforcement learning strategies using Monte-Carlo to solve the blackjack problem | |
| Shi et al. | Adaptive reinforcement q-learning algorithm for swarm-robot system using pheromone mechanism | |
| Forbes et al. | Real-time reinforcement learning in continuous domains |