Harley Wiltzer
harwiltz.bsky.social
PhD student at Mila / McGill. Studying distributional RL for transfer across risk-sensitive utilities, and for long-horizon high-frequency decision-making.
This is closely related to our recent work on the Distributional Successor Measure (arxiv.org/abs/2402.08530). We extend that analysis to tractable projected DP and TD algorithms, and provide convergence rates as a function of the return distribution resolution & feature dimension.
December 9, 2024 at 3:30 PM
How can you 0-shot transfer predictions of long-term performance across reward functions *and* risk-sensitive utilities?

We can do this via Distributional Successor Features. Our recent work introduces the 1st tractable & provably convergent algos for learning DSFs.

#NeurIPS2024 #6704
12 Dec, 11-2
December 9, 2024 at 3:30 PM
In value-based RL, when decisions are made at high frequency, all hell breaks loose.

Our paper "Action Gaps & Advantages in Continuous-Time Distributional RL" shows how Distributional RL sheds light on this, enabling high-frequency model-free risk-sensitive RL.

#NeurIPS2024 #6410
13 Dec, 11-2
December 9, 2024 at 2:46 PM