Saumya Malik
@saumyamalik.bsky.social
Predoc at Ai2 | prev. Princeton CS '24
Interestingly, we find that RLHF performance degrades if the lineages of the reward model and policy model don't match 🤔 So, instead of simply taking the top model on RewardBench 2 (RB2) off-the-shelf, practitioners should take the recipe for that model and integrate it into their own RLHF workflow.
June 2, 2025 at 11:41 PM
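A minimal sketch of that recommendation, with every name hypothetical: rather than loading the top RewardBench 2 checkpoint, apply its training recipe to the same base model the policy descends from.

```python
# Hypothetical sketch of the "match lineages" advice: retrain the reward
# model with the released recipe from the policy's own base checkpoint,
# instead of loading the top leaderboard RM off-the-shelf.
# Every name below is a placeholder, not a real API.

BASE_CHECKPOINT = "my-org/base-llm"  # hypothetical shared ancestor

def train_reward_model(base_checkpoint: str, preference_data: list):
    """Placeholder for the RM training recipe, started from base_checkpoint."""
    ...

def rlhf_train(base_checkpoint: str, reward_model):
    """Placeholder for PPO/GRPO-style RLHF of a policy initialized from the
    same base checkpoint, scored by reward_model."""
    ...

# Policy and RM share a lineage: both descend from BASE_CHECKPOINT.
rm = train_reward_model(BASE_CHECKPOINT, preference_data=[])
policy = rlhf_train(BASE_CHECKPOINT, reward_model=rm)
```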
We trained and released 70 reward models to study their performance on RB2 and in downstream applications like inference-time best-of-N sampling and RLHF training. Even top RMs still have plenty of room to improve on RB2, particularly in Precise Instruction Following and Math.
June 2, 2025 at 11:41 PM
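As a concrete illustration of inference-time best-of-N sampling, here is a minimal sketch assuming a Hugging Face causal LM as the policy and a scalar-head sequence classifier as the reward model; both model names are placeholders, and real RMs usually expect their chat template rather than raw concatenation.

```python
# Minimal best-of-n sketch: sample n completions from the policy, score each
# with the reward model, return the highest-scoring one. Model names are
# placeholders; any causal LM + scalar-head RM pair would do.
import torch
from transformers import (AutoModelForCausalLM,
                          AutoModelForSequenceClassification, AutoTokenizer)

policy_name = "my-org/policy-model"  # hypothetical
rm_name = "my-org/reward-model"      # hypothetical
tok = AutoTokenizer.from_pretrained(policy_name)
policy = AutoModelForCausalLM.from_pretrained(policy_name)
rm_tok = AutoTokenizer.from_pretrained(rm_name)
rm = AutoModelForSequenceClassification.from_pretrained(rm_name, num_labels=1)

def best_of_n(prompt: str, n: int = 4) -> str:
    inputs = tok(prompt, return_tensors="pt")
    # Sample n candidate completions from the policy.
    outs = policy.generate(**inputs, do_sample=True, top_p=0.9,
                           max_new_tokens=256, num_return_sequences=n,
                           pad_token_id=tok.eos_token_id)
    prompt_len = inputs["input_ids"].shape[1]
    candidates = [tok.decode(o[prompt_len:], skip_special_tokens=True)
                  for o in outs]
    # Score each completion; plain concatenation is used here for brevity.
    scores = []
    for cand in candidates:
        rm_inputs = rm_tok(prompt + cand, return_tensors="pt", truncation=True)
        with torch.no_grad():
            scores.append(rm(**rm_inputs).logits[0, 0].item())
    return candidates[max(range(n), key=lambda i: scores[i])]
```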
RewardBench 2 spans six domains, sources new human prompts, and carefully constructs and combines completions to build a best-of-4 dataset. Using fresh prompts is an important step in making reward model evaluation independent of downstream evaluations.
June 2, 2025 at 11:41 PM
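A sketch of how a best-of-4 format scores a reward model (the field names below are assumptions, not the exact RewardBench 2 schema): the RM is credited only when it ranks the single correct completion above all three incorrect ones, a stricter test than a pairwise comparison (chance is 25% rather than 50%).

```python
# Best-of-4 accuracy sketch: one correct ("chosen") completion per prompt
# must out-score all three incorrect ("rejected") ones. Field names are
# illustrative, not the exact RewardBench 2 schema.
from typing import Callable, Sequence

def best_of_4_accuracy(items: Sequence[dict],
                       score: Callable[[str, str], float]) -> float:
    """`score` maps a (prompt, completion) pair to a scalar reward."""
    correct = 0
    for ex in items:
        chosen = score(ex["prompt"], ex["chosen"])
        rejected = [score(ex["prompt"], r) for r in ex["rejected"]]
        correct += chosen > max(rejected)  # credit only if best of all 4
    return correct / len(items)
```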
I’m thrilled to share RewardBench 2 📊— We created a new multi-domain reward model evaluation that is substantially harder than RewardBench, we trained and released 70 reward models, and we gained insights about reward modeling benchmarks and downstream performance!
June 2, 2025 at 11:41 PM