Amir Mesbah
@amirmesbah.bsky.social
Graduate Student - Interested in RL and its mathematics 👾

> https://amirhosein-mesbah.github.io/
Thanks a lot! That was lightning fast 🚀
October 2, 2025 at 10:26 PM
Maybe a blog post would also help =)
September 26, 2025 at 2:59 PM
Could you add me please?
September 9, 2025 at 9:59 PM
I came across a couple of other definitions that might be helpful to mention (apologies if you’re already considering these).
The first one is from Csaba Szepesvári’s RL theory lecture notes (lecture 2, planning in MDPs), and the second one is from Puterman's MDP book (chapter 1).
August 4, 2025 at 9:45 AM
I wanted to send you the link just now but hopefully you have found it =)
March 18, 2025 at 9:08 PM
Sure *_*
Looking forward to it :)
March 17, 2025 at 8:55 PM
Not yet. Just the classical claim that they're trying to learn the distribution of the return =))
Do you have any insights?
March 17, 2025 at 6:37 PM
I was reading about ways to enhance the performance of DQN on a real-world problem. One of the candidates was C51, but I haven't implemented it yet because of the computational cost. It was interesting to me, though, because I hadn't read the papers before.
March 17, 2025 at 2:24 PM
Until last week, I didn't know that using it with DQN can give a huge performance boost.
March 17, 2025 at 2:06 PM