[go: up one dir, main page]

Morere et al., 2020 - Google Patents

Reinforcement learning with probabilistically complete exploration

Morere et al., 2020

View PDF
Document ID
12248662857176989759
Author
Morere P
Francis G
Blau T
Ramos F
Publication year
Publication venue
arXiv preprint arXiv:2001.06940

External Links

Snippet

Balancing exploration and exploitation remains a key challenge in reinforcement learning (RL). State-of-the-art RL algorithms suffer from high sample complexity, particularly in the sparse reward case, where they can do no better than to explore in all directions until the …
Continue reading at arxiv.org (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00Subject matter not provided for in other groups of this subclass
    • G06N99/005Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computer systems based on biological models
    • G06N3/02Computer systems based on biological models using neural network models
    • G06N3/04Architectures, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computer systems based on biological models
    • G06N3/02Computer systems based on biological models using neural network models
    • G06N3/08Learning methods
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/0265Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
    • G05B13/027Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computer systems utilising knowledge based models
    • G06N5/04Inference methods or devices
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computer systems utilising knowledge based models
    • G06N5/02Knowledge representation
    • G06N5/022Knowledge engineering, knowledge acquisition
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computer systems based on biological models
    • G06N3/12Computer systems based on biological models using genetic models
    • G06N3/126Genetic algorithms, i.e. information processing using digital simulations of the genetic system
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00Subject matter not provided for in other groups of this subclass
    • G06N99/002Quantum computers, i.e. information processing by using quantum superposition, coherence, decoherence, entanglement, nonlocality, teleportation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00Computer systems based on specific mathematical models
    • G06N7/005Probabilistic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/18Digital computers in general; Data processing equipment in general in which a programme is changed according to experience gained by the computer itself during a complete run; Learning machines

Similar Documents

Publication Publication Date Title
Taylor et al. Episodic learning with control lyapunov functions for uncertain robotic systems
US11086938B2 (en) Interpreting human-robot instructions
Okada et al. Path integral networks: End-to-end differentiable optimal control
Hiraoka et al. Learning robust options by conditional value at risk optimization
Englert et al. Combined Optimization and Reinforcement Learning for Manipulation Skills.
Rigatos et al. Parallelization of a fuzzy control algorithm using quantum computation
JPWO2007135723A1 (en) Neural network learning apparatus, method, and program
Liaw et al. Composing meta-policies for autonomous driving using hierarchical deep reinforcement learning
Qadri et al. InCOpt: Incremental constrained optimization using the Bayes tree
Vinogradska et al. Numerical quadrature for probabilistic policy search
Morere et al. Reinforcement learning with probabilistically complete exploration
Li et al. Bayesian distributional policy gradients
Ortiz-Haro et al. idb-a*: Iterative search and optimization for optimal kinodynamic motion planning
Ollington et al. Incorporating expert advice into reinforcement learning using constructive neural networks
Mukadam et al. Approximately optimal continuous-time motion planning and control via probabilistic inference
Wang et al. Deep bilinear Koopman realization for dynamics modeling and predictive control
Arora et al. I2RL: online inverse reinforcement learning under occlusion
Sugiarto et al. A model-based approach to robot kinematics and control using discrete factor graphs with belief propagation
Lee et al. A dynamic regret analysis and adaptive regularization algorithm for on-policy robot imitation learning
Cheng Efficient and principled robot learning: theory and algorithms
Blau et al. Learning from demonstration without demonstrations
McAllister Bayesian learning for data-efficient control
Debnath et al. Solving Markov decision processes with reachability characterization from mean first passage times
Macciò et al. Local Models for data-driven learning of control policies for complex systems
Fröhlich Data-efficient controller tuning and reinforcement learning