Prev: Ai2, Google Research, MSR
Evaluating language technologies, regularly ranting, and probably procrastinating.
https://sites.google.com/view/shailybhatt/
Paper: arxiv.org/abs/2511.02817
Dataset: huggingface.co/oolongbench
Code: github.com/abertsch72/o...
Leaderboard: oolongbench.github.io
Paper: arxiv.org/abs/2511.02817
Dataset: huggingface.co/oolongbench
Code: github.com/abertsch72/o...
Leaderboard: oolongbench.github.io
Kalika Bali at MSR (don't think she's on here)
Kalika Bali at MSR (don't think she's on here)