Lightnews — Scholar-powered news

Miguel Suau

@miguelsuau.bsky.social

Machine Teacher. Research Scientist at Phaidra. PhD from TU Delft. Previously JP Morgan, Huawei, Unity.

https://www.suau.io/


It achieves this by reweighting samples according to the likelihood of state-action pairs under the agent’s state representation, effectively breaking the spurious correlations introduced by the policy.

June 18, 2025 at 7:55 PM

Here, we show that the advantage function not only reduces the variance of gradient estimates but also helps mitigate the effects of policy confounding.

June 18, 2025 at 7:55 PM

This paper builds on our work published last year at RLC, where we showed that agents can develop policies that exploit spurious correlations induced by their own policies, a phenomenon we call policy confounding.

June 18, 2025 at 7:55 PM

🤔🤷
x.com/SuauMiguel/s...

January 25, 2025 at 4:26 PM
