Policy representation via diffusion probability model for reinforcement learning

Yang et al., 2023

Document ID: 9971436291508933587
Author: Yang L; Huang Z; Lei F; Zhong Y; Yang Y; Fang C; Wen S; Zhou B; Lin Z
Publication year: 2023
Publication venue: arXiv preprint arXiv:2305.13122

Snippet

Popular reinforcement learning (RL) algorithms tend to produce a unimodal policy distribution, which limits the expressiveness of complicated policies and weakens exploration. The diffusion probability model is a powerful way to learn complicated multimodal …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
- G06N5/025—Extracting rules from data
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding or deleting nodes or connections, pruning
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
- G06N3/086—Learning methods using evolutionary programming, e.g. genetic algorithms
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/003—Dynamic search techniques, heuristics, branch-and-bound
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/18—Digital computers in general; Data processing equipment in general in which a programme is changed according to experience gained by the computer itself during a complete run; Learning machines

Similar Documents

Publication	Publication Date	Title
Yang et al.	2023	Policy representation via diffusion probability model for reinforcement learning
Bengio et al.	2019	A meta-transfer objective for learning to disentangle causal mechanisms
Ueltzhöffer	2018	Deep active inference
Li et al.	2023	Hierarchical diffusion for offline decision making
Yang et al.	2019	Learn to explain efficiently via neural logic inductive learning
Moerland et al.	2018	A0C: Alpha Zero in continuous action space
Ritchie et al.	2016	Deep amortized inference for probabilistic programs
Xu et al.	2018	Learning to explore via meta-policy gradient
Xiong et al.	2024	TEILP: Time prediction over knowledge graphs via logical reasoning
Shrivastava et al.	2019	GLAD: Learning sparse graph recovery
Ortega et al.	2010	A minimum relative entropy principle for learning and acting
CN119254483A (en)	2025-01-03	Network risk analysis method and system based on multi-level game model
David et al.	2022	DEVS model construction as a reinforcement learning problem
CN118674001A (en)	2024-09-20	State action relation reinforcement learning method integrating graph convolution and large language model
WO2024063907A1 (en)	2024-03-28	Modelling causation in machine learning
Jiang et al.	2024	Vertical symbolic regression via deep policy gradient
Sarkar et al.	2021	QKSA: Quantum Knowledge Seeking Agent--resource-optimized reinforcement learning using quantum process tomography
EP4591217A1 (en)	2025-07-30	Modelling causation in machine learning
Neitz et al.	2021	A teacher-student framework to distill future trajectories
Skryagin et al.	2020	Sum-product logic: Integrating probabilistic circuits into DeepProbLog
Xu et al.	2022	A framework for following temporal logic instructions with unknown causal dependencies
de Souza et al.	2024	Hypergraph neural networks with logic clauses
Zhong et al.	2018	A deep learning assisted gene expression programming framework for symbolic regression problems
Winqvist	2020	Neural Network Approaches for Model Predictive Control
Kielak	2019	Generative Adversarial Imagination for Sample Efficient Deep Reinforcement Learning