The first one is from Csaba Szepesvári’s RL theory lecture notes (lecture 2, planning in MDPs), and the second one is from Puterman's MDP book (chapter 1).