Distributed no-regret learning in multiagent systems: Challenges and recent developments
Xu et al., 2020
- Document ID
- 6727087175838462700
- Author
- Xu X
- Zhao Q
- Publication year
- 2020
- Publication venue
- IEEE Signal Processing Magazine
Snippet
Game theory is a well-established tool for studying interactions among self-interested players. Under the assumption of complete information on the game composition at each player, the focal point of game-theoretic studies has been on the Nash equilibrium (NE) in …
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management, e.g. organising, planning, scheduling or allocating time, human or machine resources; Enterprise planning; Organisational models
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation, e.g. linear programming, "travelling salesman problem" or "cutting stock problem"
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L12/00—Data switching networks
- H04L12/02—Details
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance or administration or management of packet switching networks
- H04L41/14—Arrangements for maintenance or administration or management of packet switching networks involving network analysis or design, e.g. simulation, network model or planning
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Jin et al. | | Reward-free exploration for reinforcement learning |
| Lieder et al. | | An automatic method for discovering rational heuristics for risky choice |
| Xu et al. | | Distributed no-regret learning in multiagent systems: Challenges and recent developments |
| Lee et al. | | An intelligent fuzzy agent for meeting scheduling decision support system |
| CN110945542A (en) | | A multi-agent deep reinforcement learning agent method based on smart grid |
| Bighashdel et al. | | Policy space response oracles: A survey |
| Gummadi et al. | | Mean field analysis of multi-armed bandit games |
| Farhan et al. | | Reinforcement learning in AnyLogic simulation models: a guiding example using Pathmind |
| Wu et al. | | Adaptive QoE-aware SFC orchestration in UAV networks: A deep reinforcement learning approach |
| Dao et al. | | Compact artificial bee colony |
| Gyeera et al. | | Regression analysis of predictions and forecasts of cloud data center KPIs using the boosted decision tree algorithm |
| Rishwaraj et al. | | Heuristics-based trust estimation in multiagent systems using temporal difference learning |
| Deng et al. | | Algorithmic collusion in dynamic pricing with deep reinforcement learning |
| Gattami et al. | | Reinforcement learning for multi-objective and constrained Markov decision processes |
| Sadoune et al. | | Algorithmic collusion and the minimum price Markov game |
| Reddy et al. | | Negotiated learning for smart grid agents: entity selection based on dynamic partially observable features |
| Basaklar et al. | | GEM-RL: Generalized energy management of wearable devices using reinforcement learning |
| Mozo et al. | | Scalable prediction of service-level events in datacenter infrastructure using deep neural networks |
| Wan et al. | | Scheduling real-time wireless traffic: A network-aided offline reinforcement learning approach |
| CN109543879A (en) | | Load forecasting method and device neural network based |
| Kasumba et al. | | Data-driven goal recognition design for general behavioral agents |
| Kaliappan et al. | | Optimizing resource allocation in healthcare systems for efficient pandemic management using machine learning and artificial neural networks |
| CN116957053A (en) | | Sequential decision method, device and equipment based on dual cyclic neural network |
| US20230122472A1 (en) | | Hybrid Techniques for Quality Estimation of a Decision-Making Policy in a Computer System |
| Hegde et al. | | COUNSEL: Cloud resource configuration management using deep reinforcement learning |