Anupama Sridhar
anusridhar.bsky.social
Professional calculator and non-smooth operator: RL theory, optimizer dynamics. All cat analogies here are mine.
This is super cool! We actually just wrote a paper on TD(0) and its convergence in nonlinear models. I'd be curious to hear your thoughts if you're still working in that area. It's interesting to see the empirical instability issues in TD(0). arxiv.org/pdf/2502.05706
May 20, 2025 at 10:27 PM
If you're still interested in this, I work on TD(0) and we showed convergence in more realistic scenarios: arxiv.org/pdf/2502.05706
May 20, 2025 at 9:02 PM
Agreed! Momentum and scheduler behavior hint at deeper structure. We’ve been exploring ways to track how gradients evolve across space, not just time. There's a lot to learn by looking at how updates circulate.
May 20, 2025 at 8:03 PM