Stabilizing neural control using self-learned almost Lyapunov critics
Chang et al., 2021
- Document ID
- 8411382868104618509
- Author
- Chang Y
- Gao S
- Publication year
- 2021
- Publication venue
- 2021 IEEE International Conference on Robotics and Automation (ICRA)
Snippet
The lack of stability guarantees restricts the practical use of learning-based methods in core control problems in robotics. We develop new methods for learning neural control policies and neural Lyapunov critic functions in the model-free reinforcement learning (RL) setting …
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Chang et al. | Stabilizing neural control using self-learned almost Lyapunov critics | |
| Dean et al. | Robust guarantees for perception-based control | |
| Carron et al. | Data-driven model predictive control for trajectory tracking with a robotic arm | |
| Li et al. | A policy search method for temporal logic specified reinforcement learning tasks | |
| Castaneda et al. | Gaussian process-based min-norm stabilizing controller for control-affine systems with uncertain input effects and dynamics | |
| Tonkens et al. | Refining control barrier functions through Hamilton-Jacobi reachability | |
| Liu et al. | Iterative convex optimization for model predictive control with discrete-time high-order control barrier functions | |
| Englert et al. | Combined Optimization and Reinforcement Learning for Manipulation Skills | |
| Chowdhary et al. | Bayesian nonparametric adaptive control of time-varying systems using Gaussian processes | |
| Matsubara et al. | Latent Kullback-Leibler control for continuous-state systems using probabilistic graphical models | |
| Lafmejani et al. | NMPC-LBF: Nonlinear MPC with learned barrier function for decentralized safe navigation of multiple robots in unknown environments | |
| Morales et al. | LAMDA control approaches applied to trajectory tracking for mobile robots | |
| Ganai et al. | Learning stabilization control from observations by learning Lyapunov-like proxy models | |
| Puriel-Gil et al. | Reinforcement learning compensation based PD control for inverted pendulum | |
| Yang et al. | Feasible policy iteration | |
| Chen et al. | Safety filter design for neural network systems via convex optimization | |
| Yang et al. | MPR-RL: Multi-prior regularized reinforcement learning for knowledge transfer | |
| Lee et al. | A data-driven method for safety-critical control: Designing control barrier functions from state constraints | |
| Desaraju et al. | Leveraging experience for computationally efficient adaptive nonlinear model predictive control | |
| Pereira et al. | Linear time-varying robust model predictive control for discrete-time nonlinear systems | |
| Najafi et al. | Rapid learning in sequential composition control | |
| Liu et al. | Iterative convex optimization for safety-critical model predictive control | |
| Parsapour et al. | Recovery-matrix inverse optimal control for deterministic feedforward-feedback controllers | |
| Mon et al. | Double inverted pendulum decoupling control by adaptive terminal sliding-mode recurrent fuzzy neural network | |
| Massiani et al. | On exploration requirements for learning safety constraints |