Model-based RL with latent variable models

Model-based RL methods that learn latent-variable models instead of predicting dynamics directly in the observation space. The learned world model can then be used for planning efficiently in the latent space. In vision-based tasks, for instance, the less efficient alternative is to generate images for future time steps and feed them back into the model to predict the next ones, which requires more computation. ...

September 22, 2024 · 19 min · Trung H. Nguyen

MuZero

January 2, 2024 · 6 min · Trung H. Nguyen

AlphaGo, AlphaGo Zero, AlphaZero

Model-based RL methods that use Monte Carlo Tree Search for planning and utilize a self-play mechanism for training. ...

October 17, 2023 · 11 min · Trung H. Nguyen

Multi-agent Deep Deterministic Policy Gradient

May 25, 2023 · 5 min · Trung H. Nguyen

Maximum Entropy Reinforcement Learning via Soft Q-learning & Soft Actor-Critic

Notes on Entropy-Regularized Reinforcement Learning via SQL & SAC ...

December 27, 2022 · 11 min · Trung H. Nguyen

Deterministic Policy Gradients

The generalization of policy gradient theorems to the deterministic case, and the corresponding policy gradient algorithms. ...

December 2, 2022 · 12 min · Trung H. Nguyen

Trust Region Policy Optimization

A model-free RL algorithm that ensures stable and efficient policy updates by optimizing within a trust region, limiting the step size to prevent drastic policy changes and improve convergence. ...

November 23, 2022 · 12 min · Trung H. Nguyen