Lightnews — Scholar-powered news

Manfred Diaz

@manfreddiaz.bsky.social

@aamasconf.bsky.social 2025 was very special for us! We had the opportunity to present a tutorial on general evaluation of AI agents, and we got a best paper award! Congrats, @sharky6000.bsky.social and the team! 🎉

Marc Lanctot @sharky6000.bsky.social · May 23

That's a wrap for day #4 @aamasconf.bsky.social . I did not present anything today but I am honored that we received the best paper award!

Thanks to everyone who made it happen! 👇 1/2

May 23, 2025 at 2:23 PM

Reposted by Manfred Diaz

Marc Lanctot

@sharky6000.bsky.social

In the afternoon we will be giving a tutorial on general evaluation of AI agents.

sites.google.com/view/aamas20... 10/N

A Tutorial on General Evaluation of AI Agents

Artificial Intelligence (AI) and machine learning (ML), in particular, have emerged as scientific disciplines concerned with understanding and building single and multi-agent systems with the ability ...

sites.google.com

May 18, 2025 at 5:34 PM

Reposted by Manfred Diaz

Joel Z Leibo

@jzleibo.bsky.social

Announcing our latest arxiv paper:

Societal and technological progress as sewing an ever-growing, ever-changing, patchy, and polychrome quilt
arxiv.org/abs/2505.05197

We argue for a view of AI safety centered on preventing disagreement from spiraling into conflict.

Societal and technological progress as sewing an ever-growing, ever-changing, patchy, and polychrome quilt

Artificial Intelligence (AI) systems are increasingly placed in positions where their decisions have real consequences, e.g., moderating online spaces, conducting research, and advising on policy. Ens...

arxiv.org

May 9, 2025 at 11:39 AM

Reposted by Manfred Diaz

Joel Z Leibo

@jzleibo.bsky.social

First LessWrong post! Inspired by Richard Rorty, we argue for a different view of AI alignment, where the goal is "more like sewing together a very large, elaborate, polychrome quilt", than it is "like getting a clearer vision of something true and deep"
www.lesswrong.com/posts/S8KYwt...

Societal and technological progress as sewing an ever-growing, ever-changing, patchy, and polychrome quilt — LessWrong

We can just drop the axiom of rational convergence.

www.lesswrong.com

April 22, 2025 at 3:14 PM

Reposted by Manfred Diaz

Joel Z Leibo

@jzleibo.bsky.social

In case folks are interested, here's a video of a talk I gave at MIT a couple weeks ago: youtu.be/FmN6fRyfcsY?...

A Theory of Appropriateness with Applications to Generative Artificial Intelligence

YouTube video by MITCBMM

youtu.be

April 1, 2025 at 8:50 PM

Reposted by Manfred Diaz

Marc Lanctot

@sharky6000.bsky.social

Our new evaluation method, Soft Condorcet Optimization, is now available as open source! 👍

Both the sigmoid (smooth Kendall-tau) and Fenchel-Young (perturbed optimizers) versions.

Also, an optimized C++ implementation that is ~40X faster than the Python one. 🤩⚡

github.com/google-deepm...

March 28, 2025 at 9:45 AM

Reposted by Manfred Diaz

Marc Lanctot

@sharky6000.bsky.social

Working at the intersection of social choice and learning algorithms?

Check out the 2nd Workshop on Social Choice and Learning Algorithms (SCaLA) at @ijcai.bsky.social this summer.

Submission deadline: May 9th.

I attended last year at AAMAS and loved it! 👍

sites.google.com/corp/view/sc...

SCaLA-25

A workshop connecting research topics in social choice and learning algorithms.

sites.google.com

March 26, 2025 at 8:18 PM

Manfred Diaz

@manfreddiaz.bsky.social

Come to understand ML evaluation from first principles! We have put together a great AAMAS tutorial covering statistics, probabilistic models, game theory, and social choice theory.

Bonus: a unifying perspective of the problem leveraging decision-theoretic principles!

Join us on May 19th!

Marc Lanctot @sharky6000.bsky.social · Mar 4

Attending @aamasconf.bsky.social ?

Are you interested in general agent evaluation but don't know much about the area?

Check out our tutorial taking place on May 19th! sites.google.com/view/aamas20...

(Co-organized with @manfreddiaz.bsky.social @ktlr.bsky.social @drimgemp.bsky.social )

1/2

A Tutorial on General Evaluation of AI Agents

Artificial Intelligence (AI) and machine learning (ML), in particular, have emerged as scientific disciplines concerned with understanding and building single and multi-agent systems with the ability ...

sites.google.com

March 4, 2025 at 11:39 PM

Manfred Diaz

@manfreddiaz.bsky.social

Elo drives most LLM evaluations, but we often overlook its assumptions, benefits, and limitations. While working on SCO, we wanted to understand the SCO-Elo distinction, so I took a closer look, uncovered some intriguing findings, and documented them in these notes. I hope you find them valuable!

Marc Lanctot @sharky6000.bsky.social · Feb 24

Btw, if you stare at the derivation of Elo as logistic regression, SCO is really quite close to Elo. The difference is that Elo uses a classification objective (cross entropy loss) on top of the output of the sigmoid.

@manfreddiaz.bsky.social dug even deeper: manfreddiaz.github.io/assets/pdf/s...

manfreddiaz.github.io

February 25, 2025 at 2:29 AM
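
The "Elo as logistic regression" reading mentioned in the quoted post can be illustrated with a minimal sketch. It assumes the textbook Elo constants (base 10, scale 400, K = 32), which are not stated in the posts, and all function names are illustrative. It only covers the Elo side: the standard Elo update coincides with one stochastic gradient step on the cross-entropy loss of a sigmoid over the rating difference. It is not the SCO implementation from the linked repository.

```python
import math

# Assumed textbook Elo constants: base 10 and scale 400.
SCALE = math.log(10) / 400.0


def expected_score(r_a, r_b):
    """Predicted probability that A beats B: a sigmoid of the rating gap."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_update(r_a, r_b, score_a, k=32.0):
    """Standard Elo update; score_a is 1 for a win by A, 0 for a loss."""
    delta = k * (score_a - expected_score(r_a, r_b))
    return r_a + delta, r_b - delta


def cross_entropy(r_a, r_b, score_a):
    """Logistic-regression (classification) loss on a single game result."""
    p = expected_score(r_a, r_b)
    return -(score_a * math.log(p) + (1.0 - score_a) * math.log(1.0 - p))


def sgd_update(r_a, r_b, score_a, lr):
    """One gradient-descent step on cross_entropy with respect to both ratings."""
    p = expected_score(r_a, r_b)
    grad_a = -SCALE * (score_a - p)  # d(loss)/d(r_a); d(loss)/d(r_b) = -grad_a
    return r_a - lr * grad_a, r_b + lr * grad_a


if __name__ == "__main__":
    k = 32.0
    # Choosing lr = K / SCALE makes the gradient step identical to Elo's update.
    print(elo_update(1600.0, 1500.0, 1.0, k))
    print(sgd_update(1600.0, 1500.0, 1.0, lr=k / SCALE))
```

Under these assumptions the K-factor plays the role of a learning rate, rescaled by ln(10)/400.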

Reposted by Manfred Diaz

Marc Lanctot

@sharky6000.bsky.social

Looking for a principled evaluation method for ranking *general* agents or models, i.e., ones that are evaluated across a myriad of different tasks?

I’m delighted to tell you about our new paper, Soft Condorcet Optimization (SCO) for Ranking of General Agents, to be presented at AAMAS 2025! 🧵 1/N

February 24, 2025 at 3:25 PM

Manfred Diaz

@manfreddiaz.bsky.social

Last week, Michael I. Jordan's insightful talk at the AI Action Summit (www.youtube.com/live/W0QLq4q...) reminded us of the meaningful connections between AI, ML, economics, game theory, and mechanism design. But I'd argue the relationship goes deeper—it's profound, historical, and foundational. ⬇️

AI, Science and Society Conference - AI ACTION SUMMIT - DAY 1

YouTube video by IP Paris

www.youtube.com

February 11, 2025 at 8:57 PM

Reposted by Manfred Diaz

Joel Z Leibo

@jzleibo.bsky.social

Very happy to announce the publication of our latest paper:

A theory of appropriateness with applications to generative artificial intelligence
arxiv.org/abs/2412.19010

And happy new year everyone!

A theory of appropriateness with applications to generative artificial intelligence

What is appropriateness? Humans navigate a multi-scale mosaic of interlocking notions of what is appropriate for different situations. We act one way with our friends, another with our family, and yet...

arxiv.org

December 31, 2024 at 7:48 AM

Reposted by Manfred Diaz

Joel Z Leibo

@jzleibo.bsky.social

Concordia is a library for generative agent-based modeling that works like a table-top role-playing game.

It's open source and model agnostic.

Try it today!

github.com/google-deepm...

GitHub - google-deepmind/concordia: A library for generative social simulation

A library for generative social simulation. Contribute to google-deepmind/concordia development by creating an account on GitHub.

github.com

November 16, 2024 at 11:49 PM

Reposted by Manfred Diaz

Marc Lanctot

@sharky6000.bsky.social

🚨 Petition to get NeurIPS to join Bluesky 🚨

I just wrote to the NeurIPS board asking them to consider joining Bluesky.

It took about 2 minutes. I invite you to do the same. neurips.cc/Help/Contact

If they changed the name of the conference for the greater good, there's a chance!

Please repost!

Contact

neurips.cc

November 15, 2024 at 10:46 AM

Reposted by Manfred Diaz

Eugene Vinitsky 🍒

@eugenevinitsky.bsky.social

Let's get the multi-agent learning community started up here: go.bsky.app/9gsefkW

November 13, 2024 at 10:45 PM
