drimgemp.bsky.social
Reposted
Looking for a principled evaluation method for ranking *general* agents or models, i.e. ones that are evaluated across a myriad of different tasks?

I’m delighted to tell you about our new paper, Soft Condorcet Optimization (SCO) for Ranking of General Agents, to be presented at AAMAS 2025! 🧵 1/N
February 24, 2025 at 3:25 PM
Now in the big blue world!
November 22, 2024 at 7:05 PM