Ma et al., 2020 - Google Patents
Exploiting bias for cooperative planning in multi-agent tree search
Ma et al., 2020
- Document ID
- 7702896096589950014
- Author
- Ma A
- Ouimet M
- Cortés J
- Publication year
- 2020
- Publication venue
- IEEE Robotics and Automation Letters
Snippet
Graph search over states and actions is a valuable tool for robotic planning and navigation. However, the required computation is sensitive to the size of the state and action spaces, a fact which is further exacerbated in multi-agent planning by the number of agents and the …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
- G06N5/025—Extracting rules from data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6279—Classification techniques relating to the number of classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management, e.g. organising, planning, scheduling or allocating time, human or machine resources; Enterprise planning; Organisational models
- G06Q10/063—Operations research or analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/18—Digital computers in general; Data processing equipment in general in which a programme is changed according to experience gained by the computer itself during a complete run; Learning machines
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Aubret et al. | A survey on intrinsic motivation in reinforcement learning | |
| EP3545472B1 (en) | Multi-task neural networks with task-specific paths | |
| EP3620990A1 (en) | Capturing network dynamics using dynamic graph representation learning | |
| Kujanpää et al. | Hierarchical imitation learning with vector quantized models | |
| Al-Khiaty et al. | Matching UML class diagrams using a Hybridized Greedy-Genetic algorithm | |
| Ma et al. | Exploiting bias for cooperative planning in multi-agent tree search | |
| Klissarov et al. | Discovering Temporal Structure: An Overview of Hierarchical Reinforcement Learning | |
| do Carmo Alves et al. | Information-guided planning: an online approach for partially observable problems | |
| Xu et al. | Generalization of temporal logic tasks via future dependent options | |
| Verma et al. | Learning AI-System Capabilities under Stochasticity | |
| Zhan et al. | Relationship explainable multi-objective reinforcement learning with semantic explainability generation | |
| Saleh et al. | Should models be accurate? | |
| Matthews et al. | Crowd grounding: finding semantic and behavioral alignment through human robot interaction. | |
| Arora et al. | Online inverse reinforcement learning with learned observation model | |
| Bianchi | Scalable Safe Policy Improvement for single and multi-agent systems | |
| Hamann | Modeling swarm systems and formal design methods | |
| Gieselmann et al. | An expansive latent planner for long-horizon visual offline reinforcement learning | |
| Shen et al. | Hop-level Direct Preference Optimization for Knowledge Graph Reasoning with Trees | |
| Ma | Model-based reinforcement learning for cooperative multi-agent planning: exploiting hierarchies, bias, and temporal sampling | |
| Mehta | From AI Safety Gridworlds to Reliable Safety Unit Tests for Deep Reinforcement Learning in Computer Systems | |
| Wang et al. | Automatic discovery and transfer of maxq hierarchies in a complex system | |
| de Carvalho | Deep reinforcement learning methods for cooperative robotic navigation | |
| Shen | Long-horizon motion planning with branch-and-bound and neural dynamics | |
| Co-Reyes | Building rl algorithms that generalize: From latent dynamics models to meta-learning | |
| Riley | Safe multi-agent reinforcement learning with quantitatively verified constraints | |