My-Rl
AlphaZero
Multi-agent Deep Deterministic Policy Gradient
Maximum Entropy Reinforcement Learning via Soft Q-learning & Soft Actor-Critic
Notes on Entropy-Regularized Reinforcement Learning via SQL & SAC ...
Deterministic Policy Gradients
Notes on Deterministic Policy Gradient algorithms ...
Trust Region Policy Optimization
Notes on policy optimization using the trust region method. ...
Deep Q-learning
Notes on DQN and its variants. ...