Littleroot

Optimal Policy Existence

In the previous note about Markov Decision Processes, Bellman equations, we mentioned that there exists a policy $\pi_*$ that is better than or equal to all other policies. In this note, we will prove that claim. ...

Measures

When talking about a measure, you might associate it with the idea of length, the measurement of something in one dimension. You would then probably extend the idea to two dimensions with area, or even to three dimensions with volume. ...

Markov Decision Processes, Bellman equations

You may have heard, perhaps only vaguely, of a computer program called AlphaGo, the AI that beat Lee Sedol, the winner of 18 world Go titles. One of the techniques it used is self-play against other instances of itself, with Reinforcement Learning. ...

Markov Chain

If we had to define a Markov chain in one statement, it would be: “It only matters where you are, not where you’ve been”. ...

My very first post

Enjoy my zero-indexed note, and stay tuned for the next ones! ...