If reward models can't represent human preferences, how can we hope to use them to align a language model?
In our #COLM2025 paper "Off-Policy Corrected Reward Modeling for RLHF", we investigate this issue 🧵