Lightnews — Scholar-powered news

Oleksii Kuchaiev

@kuchaev.bsky.social

81 followers 46 following 8 posts

AI model alignment @ NVIDIA

Oleksii Kuchaiev

@kuchaev.bsky.social

New paper from our team: an inference-time scaling approach that can boost existing models on non-math benchmarks such as Arena-Hard. We get an Arena-Hard score of 92.7 with a 70B model, which as of 5 Mar 2025 surpasses o1-preview-2024-09-12 (90.4) and DS-R1 (92.3). arxiv.org/pdf/2503.04378

March 7, 2025 at 6:42 PM

Oleksii Kuchaiev

@kuchaev.bsky.social

Our team put together a unified mathematical framework to analyze popular model alignment algorithms: “Reward-aware Preference Optimization: A Unified Mathematical Framework for Model Alignment”, arxiv.org/pdf/2502.00203.

February 4, 2025 at 5:25 PM
