Optimal Policy Existence

In the previous note on Markov Decision Processes and Bellman equations, we mentioned that there exists a policy $\pi_*$ that is better than or equal to all other policies. In this note, we will prove that. ...

July 10, 2021 · 7 min · Trung H. Nguyen

Measures

When talking about measure, you might think of length, the measurement of something in one dimension. You would then probably extend the idea to two dimensions with area, or even to three dimensions with volume. ...

July 3, 2021 · 9 min · Trung H. Nguyen

Markov Decision Processes, Bellman equations

You may have heard of a computer program called AlphaGo, the AI that beat Lee Sedol, the winner of 18 world Go titles. One of the techniques it used is self-play against other instances of itself, with Reinforcement Learning. ...

June 27, 2021 · 5 min · Trung H. Nguyen

Markov Chain

If we had to describe a Markov chain in one statement, it would be: “It only matters where you are, not where you’ve been”. ...

June 19, 2021 · 4 min · Trung H. Nguyen

My very first post

Enjoy my zero-indexed note, and stay tuned for the next ones! ...

June 5, 2021 · 1 min · Trung H. Nguyen