MuZero

January 2, 2024 · 6 min · Trung H. Nguyen

AlphaZero

October 17, 2023 · 11 min · Trung H. Nguyen

Multi-agent Deep Deterministic Policy Gradient

May 25, 2023 · 5 min · Trung H. Nguyen

Maximum Entropy Reinforcement Learning via Soft Q-learning & Soft Actor-Critic

Notes on Entropy-Regularized Reinforcement Learning via SQL & SAC ...

December 27, 2022 · 11 min · Trung H. Nguyen

Trust Region Policy Optimization

Notes on policy optimization using trust region methods. ...

November 23, 2022 · 12 min · Trung H. Nguyen

Deep Q-learning

Notes on DQN and its variants. ...

November 18, 2022 · 8 min · Trung H. Nguyen

Policy Gradient

Notes on policy gradient methods. ...

October 6, 2022 · 4 min · Trung H. Nguyen