https://noamrazin.github.io/
This work was supported in part by the #ZuckermanSTEMLeadershipProgram.
📰 Paper: arxiv.org/abs/2507.07981
6/6
5/6
4/6
3/6
So what causes it?
2/6
This work was supported in part by the #ZuckermanSTEMLeadershipProgram.
📰 Paper: arxiv.org/abs/2503.15477
10/10
We hope our insights can inspire further research on RM training and evaluation protocols that account for properties beyond accuracy.
9/10
This reveals a fundamental limitation of evaluating RMs in isolation from the LLM they guide.
8/10
7/10
arxiv.org/abs/2310.20703
6/10
As a result, even a perfectly accurate RM can underperform less accurate models due to slow optimization.
5/10
4/10
3/10
arxiv.org/abs/2503.15477
Details 👇
2/10