Manfred Diaz
manfreddiaz.bsky.social
Ph.D. Candidate at Mila and the University of Montreal, interested in AI/ML connections with economics, game theory, and social choice theory.

https://manfreddiaz.github.io
@aamasconf.bsky.social 2025 was very special for us! We had the opportunity to present a tutorial on general evaluation of AI agents, and we got a best paper award! Congrats, @sharky6000.bsky.social and the team! 🎉
That's a wrap for day #4 @aamasconf.bsky.social. I did not present anything today, but I am honored that we received the best paper award!

Thanks to everyone who made it happen! 👇 1/2
May 23, 2025 at 2:23 PM
Reposted by Manfred Diaz
In the afternoon we will be giving a tutorial on general evaluation of AI agents.

sites.google.com/view/aamas20... 10/N
A Tutorial on General Evaluation of AI Agents
Artificial Intelligence (AI) and machine learning (ML), in particular, have emerged as scientific disciplines concerned with understanding and building single and multi-agent systems with the ability ...
sites.google.com
May 18, 2025 at 5:34 PM
Reposted by Manfred Diaz
Announcing our latest arxiv paper:

Societal and technological progress as sewing an ever-growing, ever-changing, patchy, and polychrome quilt
arxiv.org/abs/2505.05197

We argue for a view of AI safety centered on preventing disagreement from spiraling into conflict.
Societal and technological progress as sewing an ever-growing, ever-changing, patchy, and polychrome quilt
Artificial Intelligence (AI) systems are increasingly placed in positions where their decisions have real consequences, e.g., moderating online spaces, conducting research, and advising on policy. Ens...
arxiv.org
May 9, 2025 at 11:39 AM
Reposted by Manfred Diaz
First LessWrong post! Inspired by Richard Rorty, we argue for a different view of AI alignment, where the goal is "more like sewing together a very large, elaborate, polychrome quilt" than it is "like getting a clearer vision of something true and deep".
www.lesswrong.com/posts/S8KYwt...
Societal and technological progress as sewing an ever-growing, ever-changing, patchy, and polychrome quilt — LessWrong
We can just drop the axiom of rational convergence.
www.lesswrong.com
April 22, 2025 at 3:14 PM
Reposted by Manfred Diaz
In case folks are interested, here's a video of a talk I gave at MIT a couple weeks ago: youtu.be/FmN6fRyfcsY?...
A Theory of Appropriateness with Applications to Generative Artificial Intelligence
YouTube video by MITCBMM
youtu.be
April 1, 2025 at 8:50 PM
Reposted by Manfred Diaz
Our new evaluation method, Soft Condorcet Optimization, is now available open-source! 👍

Both the sigmoid (smooth Kendall-tau) and Fenchel-Young (perturbed optimizers) versions.

Also, an optimized C++ implementation that is ~40X faster than the Python one. 🤩⚡

github.com/google-deepm...
March 28, 2025 at 9:45 AM
Reposted by Manfred Diaz
Working at the intersection of social choice and learning algorithms?

Check out the 2nd Workshop on Social Choice and Learning Algorithms (SCaLA) at @ijcai.bsky.social this summer.

Submission deadline: May 9th.

I attended last year at AAMAS and loved it! 👍

sites.google.com/corp/view/sc...
SCaLA-25
A workshop connecting research topics in social choice and learning algorithms.
sites.google.com
March 26, 2025 at 8:18 PM
Come to understand ML evaluation from first principles! We have put together a great AAMAS tutorial covering statistics, probabilistic models, game theory, and social choice theory.

Bonus: a unifying perspective of the problem leveraging decision-theoretic principles!

Join us on May 19th!
March 4, 2025 at 11:39 PM
Elo drives most LLM evaluations, but we often overlook its assumptions, benefits, and limitations. While working on SCO, we wanted to understand the SCO-Elo distinction, so I dug in, uncovered some intriguing findings, and documented them in these notes. I hope you find them valuable!
Btw, if you stare at the derivation of Elo as logistic regression, SCO is really quite close to Elo. The difference is that Elo uses a classification objective (cross entropy loss) on top of the output of the sigmoid.

@manfreddiaz.bsky.social dug even deeper: manfreddiaz.github.io/assets/pdf/s...
manfreddiaz.github.io
February 25, 2025 at 2:29 AM
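To make the "Elo as logistic regression" remark above concrete, here is a minimal Python sketch. It checks that the textbook Elo update equals one SGD step on a cross-entropy loss applied to a sigmoid of the rating gap. The function names, the K-factor of 32, and the 400-point scale are illustrative assumptions, and `sigmoid_loss_sgd_step` is only my reading of the "sigmoid (smooth Kendall-tau)" description in the posts, not the released SCO implementation.

```python
import math

# Converts Elo's base-10, 400-point scale into natural-log units (conventional choice).
SCALE = math.log(10) / 400.0


def win_probability(r_a, r_b):
    """Elo's expected score for A against B: a sigmoid of the rating gap."""
    return 1.0 / (1.0 + math.exp(-SCALE * (r_a - r_b)))


def elo_update(r_a, r_b, score_a, k=32.0):
    """Textbook Elo update; score_a is 1 for an A win, 0 for a loss, 0.5 for a draw."""
    delta = k * (score_a - win_probability(r_a, r_b))
    return r_a + delta, r_b - delta


def cross_entropy_sgd_step(r_a, r_b, score_a, lr):
    """One SGD step on the loss -[s*log(p) + (1-s)*log(1-p)], with p = win_probability."""
    p = win_probability(r_a, r_b)
    grad_a = -SCALE * (score_a - p)  # d loss / d r_a; the gradient w.r.t. r_b is -grad_a
    return r_a - lr * grad_a, r_b + lr * grad_a


def sigmoid_loss_sgd_step(r_winner, r_loser, lr, temperature=1.0 / SCALE):
    """ASSUMED form of a sigmoid-style loss: minimize sigmoid((r_loser - r_winner) / temperature)
    directly, i.e. a smoothed count of mis-ordered pairs, with no cross-entropy on top."""
    s = 1.0 / (1.0 + math.exp((r_winner - r_loser) / temperature))
    grad_winner = -s * (1.0 - s) / temperature  # d loss / d r_winner
    return r_winner - lr * grad_winner, r_loser + lr * grad_winner


if __name__ == "__main__":
    # With learning rate K / SCALE, the cross-entropy step reproduces the Elo update.
    print(elo_update(1500.0, 1600.0, 1.0, k=32.0))
    print(cross_entropy_sgd_step(1500.0, 1600.0, 1.0, lr=32.0 / SCALE))
```

Under these assumptions the two objectives share the same sigmoid of the rating difference; they differ only in whether a log-loss is applied to its output, which is the distinction the quoted post points at.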
Reposted by Manfred Diaz
Looking for a principled evaluation method for ranking *general* agents or models, i.e., ones that are evaluated across a myriad of different tasks?

I’m delighted to tell you about our new paper, Soft Condorcet Optimization (SCO) for Ranking of General Agents, to be presented at AAMAS 2025! 🧵 1/N
February 24, 2025 at 3:25 PM
Last week, Michael I. Jordan's insightful talk at the AI Action Summit (www.youtube.com/live/W0QLq4q...) reminded us of the meaningful connections between AI, ML, economics, game theory, and mechanism design. But I'd argue the relationship goes deeper—it's profound, historical, and foundational. ⬇️
AI, Science and Society Conference - AI ACTION SUMMIT - DAY 1
YouTube video by IP Paris
www.youtube.com
February 11, 2025 at 8:57 PM
Reposted by Manfred Diaz
Very happy to announce the publication of our latest paper:

A theory of appropriateness with applications to generative artificial intelligence
arxiv.org/abs/2412.19010

And happy new year everyone!
A theory of appropriateness with applications to generative artificial intelligence
What is appropriateness? Humans navigate a multi-scale mosaic of interlocking notions of what is appropriate for different situations. We act one way with our friends, another with our family, and yet...
arxiv.org
December 31, 2024 at 7:48 AM
Reposted by Manfred Diaz
Concordia is a library for generative agent-based modeling that works like a table-top role-playing game.

It's open source and model agnostic.

Try it today!

github.com/google-deepm...
GitHub - google-deepmind/concordia: A library for generative social simulation
A library for generative social simulation. Contribute to google-deepmind/concordia development by creating an account on GitHub.
github.com
November 16, 2024 at 11:49 PM
Reposted by Manfred Diaz
🚨 Petition to get NeurIPS to join Bluesky 🚨

I just wrote to the NeurIPS board asking them to consider joining Bluesky.

It took about 2 minutes. I invite you to do the same. neurips.cc/Help/Contact

If they changed the name of the conference for the greater good, there's a chance!

Please repost!
Contact
neurips.cc
November 15, 2024 at 10:46 AM
Reposted by Manfred Diaz
Let's get the multi-agent learning community started up here: go.bsky.app/9gsefkW
November 13, 2024 at 10:45 PM