model-based

Model-based RL with latent variable models

Model-based RL methods that learn latent-variable models instead of predicting dynamics directly in observation space. The learned world model can then be used for planning efficiently, because rollouts stay in the compact latent space. By contrast, a model that works in observation space on a vision-based task has to generate images for future time steps and feed them back into itself to predict the next ones, which requires more computation. ...
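To make the latent-rollout idea concrete, here is a minimal sketch in plain Python. It is illustrative only: the sizes, the random weight matrices standing in for learned parameters, and the names `encode`, `latent_step`, `predict_reward`, and `plan_random_shooting` are all assumptions, not part of any specific method. The point is that the observation is encoded once and every subsequent prediction happens in the small latent space, with no image generated along the way.

```python
import math
import random

rng = random.Random(0)

# Hypothetical sizes: a flattened 16x16 image, a 4-d latent, a 2-d action.
OBS_DIM, LATENT_DIM, ACTION_DIM = 16 * 16, 4, 2


def random_matrix(rows, cols, scale):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


# Random stand-ins for parameters that would normally be learned (assumption).
W_ENC = random_matrix(LATENT_DIM, OBS_DIM, 0.05)
W_Z = random_matrix(LATENT_DIM, LATENT_DIM, 0.5)
W_A = random_matrix(LATENT_DIM, ACTION_DIM, 0.5)
W_R = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode(obs):
    """Map a flattened observation to a compact latent state."""
    return [math.tanh(x) for x in matvec(W_ENC, obs)]


def latent_step(z, action):
    """Predict the next latent state from the current one and an action."""
    pre = [a + b for a, b in zip(matvec(W_Z, z), matvec(W_A, action))]
    return [math.tanh(x) for x in pre]


def predict_reward(z):
    """Predict a scalar reward from a latent state."""
    return sum(w * x for w, x in zip(W_R, z))


def rollout_return(obs, actions):
    """Sum predicted rewards along an imagined trajectory in latent space."""
    z = encode(obs)  # the observation is encoded only once
    total = 0.0
    for action in actions:
        z = latent_step(z, action)  # cost depends on LATENT_DIM, not OBS_DIM
        total += predict_reward(z)
    return total


def plan_random_shooting(obs, horizon=5, n_candidates=32):
    """Sample random action sequences and return the best first action."""
    best_actions, best_return = None, -math.inf
    for _ in range(n_candidates):
        actions = [[rng.uniform(-1.0, 1.0) for _ in range(ACTION_DIM)]
                   for _ in range(horizon)]
        ret = rollout_return(obs, actions)
        if ret > best_return:
            best_actions, best_return = actions, ret
    return best_actions[0], best_return
```

Random shooting is used here only because it is the simplest planner to write down; any planner that queries `latent_step` and `predict_reward` would benefit from the same cost saving.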

MuZero

AlphaGo, AlphaGo Zero, AlphaZero

Model-based RL methods that use Monte Carlo Tree Search for planning and self-play for training. ...
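As a rough illustration of how tree search and self-play fit together, the sketch below runs Monte Carlo Tree Search on a toy game (take 1 to 3 stones from a pile; whoever takes the last stone wins) and lets the same searcher play both sides. It is a simplification under stated assumptions: it uses plain UCT with random rollouts and does not include the learned policy and value networks that guide the search in the systems named above. The game and the names `Node`, `mcts_move`, and `self_play` are hypothetical.

```python
import math
import random


def legal_moves(stones):
    return list(range(1, min(3, stones) + 1))


class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones              # pile faced by the player to move
        self.parent = parent
        self.move = move                  # move that led to this node
        self.children = []
        self.untried = legal_moves(stones)
        self.visits = 0
        self.wins = 0.0                   # wins for the player who moved INTO this node


def ucb(child, c):
    exploit = child.wins / child.visits
    explore = c * math.sqrt(math.log(child.parent.visits) / child.visits)
    return exploit + explore


def rollout(stones, rng):
    """Random playout; returns 1.0 if the player to move at `stones` wins."""
    to_move_is_start = True
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return 1.0 if to_move_is_start else 0.0
        to_move_is_start = not to_move_is_start
    return 0.0  # empty pile: the previous player took the last stone


def mcts_move(stones, n_simulations=500, c=1.4, rng=None):
    if rng is None:
        rng = random.Random(0)
    root = Node(stones)
    for _ in range(n_simulations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb(ch, c))
        # 2. Expansion: add one untried child, if any.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node.
        reward = 1.0 - rollout(node.stones, rng)  # view of the player who moved in
        # 4. Backpropagation: flip the perspective at every level.
        while node is not None:
            node.visits += 1
            node.wins += reward
            reward = 1.0 - reward
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move


def self_play(stones=10, n_simulations=300, seed=0):
    """Play one game with the searcher on both sides; return (history, winner)."""
    rng = random.Random(seed)
    history, player = [], 0
    while stones > 0:
        move = mcts_move(stones, n_simulations, rng=rng)
        history.append((player, stones, move))
        stones -= move
        if stones == 0:
            return history, player
        player = 1 - player
    return history, None
```

In a learning system, the `(player, stones, move)` records and the final winner returned by `self_play` would be the kind of data used as training targets; here they are only collected, not learned from.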