The first one is from Csaba Szepesvári’s RL theory lecture notes (lecture 2, planning in MDPs), and the second one is from Puterman's MDP book (chapter 1).
Looking forward to it :)
Do you have any insights?