Safe-Reinforcement-Learning-Baselines
This repository collects Safe Reinforcement Learning (RL) research: safe RL baselines and safe RL benchmarks, covering both single-agent and multi-agent RL. If any authors do not want their paper listed here, please feel free to contact <gshangd[AT]foxmail.com>. (This repository is under active development; we appreciate any constructive comments and suggestions.)
You are more than welcome to update this list! If you find a paper on Safe RL that is not listed here, please
- fork this repository, add the paper, and open a pull request;
- or report an issue here;
- or email <gshangd[AT]foxmail.com>.
The README is organized as follows:
- 1. Environments Supported
- 2. Safe RL Baselines
- 3. Surveys
- 4. Theses
- 5. Book
- 6. Tutorials
- 7. Exercise
1. Environments Supported
1.1. Safe Single Agent RL benchmarks
1.2. Safe Multi-Agent RL benchmarks
2. Safe RL Baselines
2.1. Safe Single Agent RL Baselines
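Many of the single-agent baselines below are built on the constrained MDP (CMDP) formulation. As a quick reference, a common form of the objective (written here in standard notation, not taken from any single listed paper) is: maximize expected discounted return subject to a bound on expected discounted cost,

```latex
\max_{\pi} \; J_R(\pi) = \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{\infty} \gamma^t R(s_t, a_t)\right]
\quad \text{s.t.} \quad
J_C(\pi) = \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{\infty} \gamma^t C(s_t, a_t)\right] \le d,
```

where \(C\) is a cost (constraint) signal and \(d\) is the safety budget. Individual papers vary this template (e.g., percentile/risk criteria, Lyapunov or shielding constraints).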
- Consideration of risk in reinforcement learning, Paper, Code Not Found (Accepted by ICML 1994)
- Multi-criteria Reinforcement Learning, Paper, Code Not Found (Accepted by ICML 1998)
- Lyapunov design for safe reinforcement learning, Paper, Code Not Found (Accepted by ICML 2002)
- Risk-sensitive reinforcement learning, Paper, Code Not Found (Accepted by Machine Learning, 2002)
- Risk-Sensitive Reinforcement Learning Applied to Control under Constraints, Paper, Code Not Found (Accepted by Journal of Artificial Intelligence Research, 2005)
- An actor-critic algorithm for constrained Markov decision processes, Paper, Code Not Found (Accepted by Systems & Control Letters, 2005)
- Reinforcement learning for MDPs with constraints, Paper, Code Not Found (Accepted by European Conference on Machine Learning 2006)
- Discounted Markov decision processes with utility constraints, Paper, Code Not Found (Accepted by Computers & Mathematics with Applications, 2006)
- Constrained reinforcement learning from intrinsic and extrinsic rewards, Paper, Code Not Found (Accepted by International Conference on Development and Learning 2007)
- Safe exploration for reinforcement learning, Paper, Code Not Found (Accepted by ESANN 2008)
- Percentile optimization for Markov decision processes with parameter uncertainty, Paper, Code Not Found (Accepted by Operations Research, 2010)
- Probabilistic goal Markov decision processes, Paper, Code Not Found (Accepted by IJCAI 2011)
- Safe reinforcement learning in high-risk tasks through policy improvement, Paper, Code Not Found (Accepted by IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL) 2011)
- Safe Exploration in Markov Decision Processes, Paper, Code Not Found (Accepted by ICML 2012)
- Policy gradients with variance related risk criteria, Paper, Code Not Found (Accepted by ICML 2012)
- Risk aversion in Markov decision processes via near optimal Chernoff bounds, Paper, Code Not Found (Accepted by NeurIPS 2012)
- Safe Exploration of State and Action Spaces in Reinforcement Learning, Paper, Code Not Found (Accepted by Journal of Artificial Intelligence Research, 2012)
- An Online Actor–Critic Algorithm with Function Approximation for Constrained Markov Decision Processes, Paper, Code Not Found (Accepted by Journal of Optimization Theory and Applications, 2012)
- Safe policy iteration, Paper, Code Not Found (Accepted by ICML 2013)
- Reachability-based safe learning with Gaussian processes, Paper, Code Not Found (Accepted by IEEE CDC 2014)
- Safe Policy Search for Lifelong Reinforcement Learning with Sublinear Regret, Paper, Code Not Found (Accepted by ICML 2015)
- High-Confidence Off-Policy Evaluation, Paper, Code (Accepted by AAAI 2015)
- Safe Exploration for Optimization with Gaussian Processes, Paper, Code Not Found (Accepted by ICML 2015)
- Safe Exploration in Finite Markov Decision Processes with Gaussian Processes, Paper, Code Not Found (Accepted by NeurIPS 2016)
- Safe and efficient off-policy reinforcement learning, Paper, Code (Accepted by NeurIPS 2016)
- Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving, Paper, Code Not Found (ArXiv only, 2016, 530+ citations)
- Safe Learning of Regions of Attraction in Uncertain, Nonlinear Systems with Gaussian Processes, Paper, Code (Accepted by CDC 2016)
- Safety-constrained reinforcement learning for MDPs, Paper, Code Not Found (Accepted by International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS) 2016)
- Convex synthesis of randomized policies for controlled Markov chains with density safety upper bound constraints, Paper, Code Not Found (Accepted by American Control Conference 2016)
- Combating Deep Reinforcement Learning's Sisyphean Curse with Intrinsic Fear, Paper, Code Not Found (ArXiv and OpenReview only, 2016)
- Constrained Policy Optimization (CPO), Paper, Code (Accepted by ICML 2017)
- Risk-constrained reinforcement learning with percentile risk criteria, Paper, Code Not Found (Accepted by The Journal of Machine Learning Research, 2017)
- Probabilistically Safe Policy Transfer, Paper, Code Not Found (Accepted by ICRA 2017)
- Accelerated primal-dual policy optimization for safe reinforcement learning, Paper, Code Not Found (ArXiv only, 2017)
- Stagewise safe Bayesian optimization with Gaussian processes, Paper, Code Not Found (Accepted by ICML 2018)
- Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning, Paper, Code (Accepted by ICLR 2018)
- Safe Model-based Reinforcement Learning with Stability Guarantees, Paper, Code (Accepted by NeurIPS 2017)
- A Lyapunov-based Approach to Safe Reinforcement Learning, Paper, Code Not Found (Accepted by NeurIPS 2018)
- Constrained Cross-Entropy Method for Safe Reinforcement Learning, Paper, Code Not Found (Accepted by NeurIPS 2018)
- Safe Reinforcement Learning via Formal Methods, Paper, Code Not Found (Accepted by AAAI 2018)
- Safe exploration and optimization of constrained MDPs using Gaussian processes, Paper, Code Not Found (Accepted by AAAI 2018)
- Safe reinforcement learning via shielding, Paper, Code (Accepted by AAAI 2018)
- Trial without Error: Towards Safe Reinforcement Learning via Human Intervention, Paper, Code (Accepted by AAMAS 2018)
- Learning-based Model Predictive Control for Safe Exploration and Reinforcement Learning, Paper, Code Not Found (Accepted by CDC 2018)
- The Lyapunov Neural Network: Adaptive Stability Certification for Safe Learning of Dynamical Systems, Paper, Code (Accepted by CoRL 2018)
- OptLayer - Practical Constrained Optimization for Deep Reinforcement Learning in the Real World, Paper, Code Not Found (Accepted by ICRA 2018)
- Safe learning of quadrotor dynamics using barrier certificates, Paper, Code Not Found (Accepted by ICRA 2018)
- Safe reinforcement learning on autonomous vehicles, Paper, Code Not Found (Accepted by IROS 2018)
- Safe reinforcement learning: Learning with supervision using a constraint-admissible set, Paper, Code Not Found (Accepted by Annual American Control Conference (ACC) 2018)
- A General Safety Framework for Learning-Based Control in Uncertain Robotic Systems, Paper, Code Not Found (Accepted by IEEE Transactions on Automatic Control, 2018)
- Safe exploration algorithms for reinforcement learning controllers, Paper, Code Not Found (Accepted by IEEE Transactions on Neural Networks and Learning Systems, 2018)
- Verification and repair of control policies for safe reinforcement learning, Paper, Code Not Found (Accepted by Applied Intelligence, 2018)
- Safe Exploration in Continuous Action Spaces, Paper, Code (ArXiv only, 2018, 200+ citations)
- Safe exploration of nonlinear dynamical systems: A predictive safety filter for reinforcement learning, Paper, Code Not Found