Safe-Reinforcement-Learning-Baselines
This repository collects Safe Reinforcement Learning (RL) research: safe RL baselines and safe RL benchmarks, covering both single-agent and multi-agent RL. If any authors do not want their paper listed here, please feel free to contact <gshangd[AT]foxmail.com>. (This repository is under active development; we appreciate any constructive comments and suggestions.)
You are more than welcome to update this list! If you find a paper on Safe RL that is not listed here, please
- fork this repository, add the paper, and open a pull request;
- or report an issue here;
- or email <gshangd[AT]foxmail.com>.
The README is organized as follows:
- 1. Environments Supported
- 2. Safe RL Baselines
- 3. Surveys
- 4. Theses
- 5. Book
- 6. Tutorials
- 7. Exercise
1. Environments Supported
1.1. Safe Single Agent RL benchmarks
1.2. Safe Multi-Agent RL benchmarks
2. Safe RL Baselines
2.1. Safe Single Agent RL Baselines
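Many of the single-agent baselines below are built on the constrained MDP (CMDP) formulation. As a quick reference, a common form of the objective (written here in standard notation, not taken from any single listed paper) is: maximize expected discounted return subject to a bound on expected discounted cost,

```latex
\max_{\pi} \; J_R(\pi) = \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{\infty} \gamma^t R(s_t, a_t)\right]
\quad \text{s.t.} \quad
J_C(\pi) = \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{\infty} \gamma^t C(s_t, a_t)\right] \le d,
```

where \(C\) is a cost (constraint) signal and \(d\) is the safety budget. Individual papers vary this template (e.g., percentile/risk criteria, Lyapunov or shielding constraints).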
- Consideration of risk in reinforcement learning, Paper, Code Not Found (Accepted by ICML 1994)
- Multi-criteria Reinforcement Learning, Paper, Code Not Found (Accepted by ICML 1998)
- Lyapunov design for safe reinforcement learning, Paper, Code Not Found (Accepted by ICML 2002)
- Risk-sensitive reinforcement learning, Paper, Code Not Found (Accepted by Machine Learning, 2002)
- Risk-Sensitive Reinforcement Learning Applied to Control under Constraints, Paper, Code Not Found (Accepted by Journal of Artificial Intelligence Research, 2005)
- An actor-critic algorithm for constrained Markov decision processes, Paper, Code Not Found (Accepted by Systems & Control Letters, 2005)
- Reinforcement learning for MDPs with constraints, Paper, Code Not Found (Accepted by European Conference on Machine Learning 2006)
- Discounted Markov decision processes with utility constraints, Paper, Code Not Found (Accepted by Computers & Mathematics with Applications, 2006)
- Constrained reinforcement learning from intrinsic and extrinsic rewards, Paper, Code Not Found (Accepted by International Conference on Development and Learning 2007)
- Safe exploration for reinforcement learning, Paper, Code Not Found (Accepted by ESANN 2008)
- Percentile optimization for Markov decision processes with parameter uncertainty, Paper, Code Not Found (Accepted by Operations Research, 2010)
- Probabilistic goal Markov decision processes, Paper, Code Not Found (Accepted by IJCAI 2011)
- Safe reinforcement learning in high-risk tasks through policy improvement, Paper, Code Not Found (Accepted by IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL) 2011)
- Safe Exploration in Markov Decision Processes, Paper, Code Not Found (Accepted by ICML 2012)
- Policy gradients with variance related risk criteria, Paper, Code Not Found (Accepted by ICML 2012)
- Risk aversion in Markov decision processes via near optimal Chernoff bounds, Paper, Code Not Found (Accepted by NeurIPS 2012)
- Safe Exploration of State and Action Spaces in Reinforcement Learning, Paper, Code Not Found (Accepted by Journal of Artificial Intelligence Research, 2012)
- An Online Actor–Critic Algorithm with Function Approximation for Constrained Markov Decision Processes, Paper, Code Not Found (Accepted by Journal of Optimization Theory and Applications, 2012)
- Safe policy iteration, Paper, Code Not Found (Accepted by ICML 2013)
- Reachability-based safe learning with Gaussian processes, Paper, Code Not Found (Accepted by IEEE CDC 2014)
- Safe Policy Search for Lifelong Reinforcement Learning with Sublinear Regret, Paper, Code Not Found (Accepted by ICML 2015)
- High-Confidence Off-Policy Evaluation, Paper, Code (Accepted by AAAI 2015)
- Safe Exploration for Optimization with Gaussian Processes, Paper, Code Not Found (Accepted by ICML 2015)
- Safe Exploration in Finite Markov Decision Processes with Gaussian Processes, Paper, Code Not Found (Accepted by NeurIPS 2016)
- Safe and efficient off-policy reinforcement learning, Paper, Code (Accepted by NeurIPS 2016)
- Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving, Paper, Code Not Found (ArXiv only, 2016, 530+ citations)
- Safe Learning of Regions of Attraction in Uncertain, Nonlinear Systems with Gaussian Processes, Paper, Code (Accepted by CDC 2016)
- Safety-constrained reinforcement learning for MDPs, Paper, Code Not Found (Accepted by International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS) 2016)
- Convex synthesis of randomized policies for controlled Markov chains with density safety upper bound constraints, Paper, Code Not Found (Accepted by American Control Conference 2016)
- Combating Deep Reinforcement Learning's Sisyphean Curse with Intrinsic Fear, Paper, Code Not Found (ArXiv and OpenReview only, 2016)
- Constrained Policy Optimization (CPO), Paper, Code (Accepted by ICML 2017)
- Risk-constrained reinforcement learning with percentile risk criteria, Paper, Code Not Found (Accepted by The Journal of Machine Learning Research, 2017)
- Probabilistically Safe Policy Transfer, Paper, Code Not Found (Accepted by ICRA 2017)
- Accelerated primal-dual policy optimization for safe reinforcement learning, Paper, Code Not Found (ArXiv only, 2017)
- Stagewise safe Bayesian optimization with Gaussian processes, Paper, Code Not Found (Accepted by ICML 2018)
- Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning, Paper, Code (Accepted by ICLR 2018)
- Safe Model-based Reinforcement Learning with Stability Guarantees, Paper, Code (Accepted by NeurIPS 2017)
- A Lyapunov-based Approach to Safe Reinforcement Learning, Paper, Code Not Found (Accepted by NeurIPS 2018)
- Constrained Cross-Entropy Method for Safe Reinforcement Learning, Paper, Code Not Found (Accepted by NeurIPS 2018)
- Safe Reinforcement Learning via Formal Methods, Paper, Code Not Found (Accepted by AAAI 2018)
- Safe exploration and optimization of constrained MDPs using Gaussian processes, Paper, Code Not Found (Accepted by AAAI 2018)
- Safe reinforcement learning via shielding, Paper, Code (Accepted by AAAI 2018)
- Trial without Error: Towards Safe Reinforcement Learning via Human Intervention, Paper, Code (Accepted by AAMAS 2018)
- Learning-based Model Predictive Control for Safe Exploration and Reinforcement Learning, Paper, Code Not Found (Accepted by CDC 2018)
- The Lyapunov Neural Network: Adaptive Stability Certification for Safe Learning of Dynamical Systems, Paper, Code (Accepted by CoRL 2018)
- OptLayer - Practical Constrained Optimization for Deep Reinforcement Learning in the Real World, Paper, Code Not Found (Accepted by ICRA 2018)
- Safe learning of quadrotor dynamics using barrier certificates, Paper, Code Not Found (Accepted by ICRA 2018)
- Safe reinforcement learning on autonomous vehicles, Paper, Code Not Found (Accepted by IROS 2018)
- Safe reinforcement learning: Learning with supervision using a constraint-admissible set, Paper, Code Not Found (Accepted by Annual American Control Conference (ACC) 2018)
- A General Safety Framework for Learning-Based Control in Uncertain Robotic Systems, Paper, Code Not Found (Accepted by IEEE Transactions on Automatic Control, 2018)
- Safe exploration algorithms for reinforcement learning controllers, Paper, Code Not Found (Accepted by IEEE Transactions on Neural Networks and Learning Systems, 2018)
- Verification and repair of control policies for safe reinforcement learning, Paper, Code Not Found (Accepted by Applied Intelligence, 2018)
- Safe Exploration in Continuous Action Spaces, Paper, Code (ArXiv only, 2018, 200+ citations)
- Safe exploration of nonlinear dynamical systems: A predictive safety filter for reinforcement learning, Paper, Code Not Found