Gillian Hadfield
@ghadfield.bsky.social
Economist and legal scholar turned AI researcher focused on AI alignment and governance. Prof of government and policy and computer science at Johns Hopkins where I run the Normativity Lab. Recruiting CS postdocs and PhD students. gillianhadfield.org
Future work should focus on developing smarter debate protocols that weight expertise, discourage blind agreement, and reward critical verification of reasoning. We need to move beyond the naive assumption that 'more talk = better outcomes.' (10/10) arxiv.org/abs/2509.05396
Talk Isn't Always Cheap: Understanding Failure Modes in Multi-Agent Debate
While multi-agent debate has been proposed as a promising strategy for improving AI reasoning ability, we find that debate can sometimes be harmful rather than helpful. The prior work has exclusively…
arxiv.org
September 23, 2025 at 5:06 PM
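One concrete version of "weighting expertise" in the aggregation step, sketched here as a hypothetical illustration rather than a protocol from the paper, is a weighted majority vote in which each agent's answer counts in proportion to an expertise score (e.g. held-out accuracy on the task type):

```python
# Hypothetical expertise-weighted vote: instead of one-agent-one-vote,
# each agent's answer is weighted by an expertise score, so weak agents
# cannot outvote a single strong one. The scores below are made up.
from collections import defaultdict

def weighted_vote(answers, weights):
    """answers: list of agent answers; weights: matching expertise scores."""
    totals = defaultdict(float)
    for ans, w in zip(answers, weights):
        totals[ans] += w
    return max(totals, key=totals.get)

# Two weak agents (weight 0.3 each) disagree with one strong agent (0.9):
print(weighted_vote(["A", "A", "B"], [0.3, 0.3, 0.9]))  # "B" wins, 0.9 > 0.6
```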
We suspect RLHF training creates sycophantic behavior: models trained to be agreeable may prioritize consensus over critical evaluation. This suggests current alignment techniques might undermine collaborative reasoning.
September 23, 2025 at 5:06 PM
Stronger agents were more likely to change from correct to incorrect answers in response to weaker agents' reasoning than vice versa. Models tended to favor agreement over critical evaluation, creating an echo chamber instead of an actual debate.
September 23, 2025 at 5:06 PM
However, we still observed performance gains on math problems under most conditions, suggesting debate effectiveness depends heavily on the type of reasoning required.
September 23, 2025 at 5:06 PM
The impact varies significantly by task type. On CommonSenseQA—a dataset we newly examined—debate reduced performance across ALL experimental conditions.
September 23, 2025 at 5:06 PM
Even when stronger models outnumbered weaker ones, group accuracy decreased over successive debate rounds. Introducing weaker models into debates produced results worse than when agents hadn't engaged in discussion at all.
September 23, 2025 at 5:06 PM
We tested debate effectiveness across three tasks (CommonSenseQA, MMLU, GSM8K) using three different models (GPT-4o-mini, LLaMA-3.1-8B, Mistral-7B) in various configurations.
September 23, 2025 at 5:06 PM
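The debate setup these experiments test follows the common multi-agent debate loop: each agent answers independently, then over several rounds revises its answer after seeing its peers' latest answers. The sketch below is a toy stand-in, not the paper's code; `Agent.revise` replaces a real LLM call (e.g. to GPT-4o-mini) with a simple conformity rule, just to make the echo-chamber failure mode concrete:

```python
# Minimal sketch of a multi-agent debate round-loop. A `conforms` agent
# adopts the majority peer answer each round -- a toy stand-in for the
# sycophantic behavior described in the thread.
from collections import Counter

class Agent:
    def __init__(self, name, initial_answer, conforms=True):
        self.name = name
        self.current = initial_answer
        self.conforms = conforms

    def revise(self, peer_answers):
        # Conforming agents favor agreement over critical evaluation.
        if self.conforms and peer_answers:
            self.current = Counter(peer_answers).most_common(1)[0][0]
        return self.current

def debate(agents, rounds=3):
    for _ in range(rounds):
        answers = [a.current for a in agents]
        for i, agent in enumerate(agents):
            agent.revise(answers[:i] + answers[i + 1:])
    # Final group answer by simple majority vote.
    return Counter(a.current for a in agents).most_common(1)[0][0]

# One strong agent starts correct ("B"); two stubborn weak agents start
# wrong ("A"). The conforming strong agent abandons its correct answer.
agents = [Agent("strong", "B"),
          Agent("weak1", "A", conforms=False),
          Agent("weak2", "A", conforms=False)]
print(debate(agents))  # → "A"
```

Under these assumptions the group converges on the weak agents' wrong answer, mirroring the finding that stronger agents switch from correct to incorrect more often than the reverse.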
We found that multi-agent debate among large language models can sometimes harm performance rather than improve it, contradicting the assumption that more discussion leads to better outcomes.
September 23, 2025 at 5:06 PM
My lab members Harsh Satija and Andrea Wynn and I have a new preprint examining AI multi-agent debate among diverse models, based on our ICML MAS 2025 workshop.
September 23, 2025 at 5:06 PM
These roles will shape the conversation on AI and provide the opportunity for rich, interdisciplinary collaboration with colleagues and researchers in the Department of Computer Science and the School of Government and Policy.
Please spread the word in your network! 5/5
gillianhadfield.org/jobs/
Jobs
I have postdoc and staff openings for our lab at the Johns Hopkins University in either Baltimore, MD or Washington, DC. Postdoctoral Fellow: We are hiring an interdisciplinary scholar with a track re…
gillianhadfield.org
June 16, 2025 at 6:18 PM
We're recruiting a Postdoctoral Fellow with a track record in computational modeling of AI systems and autonomous AI agent dynamics, and experience with ML systems, to investigate the foundations of human normativity and how to build AI systems aligned with human values. 4/5
June 16, 2025 at 6:17 PM
We're hiring an AI Communications Associate to craft and execute a multi-channel strategy that turns leading computer science and public policy research into accessible content for a broad audience of stakeholders. 3/5
June 16, 2025 at 6:16 PM
We're hiring an AI Policy Researcher to conduct in-depth research into the technical and policy challenges in AI alignment, safety, and governance, and to produce high-quality research reports, white papers, and policy recommendations. 2/5
June 16, 2025 at 6:15 PM
Our report is now out, chock-a-block with new ideas including insurance partnerships, government oversight of private regulators, building a robust ecosystem, and fostering trust and investment. Check it out here: srinstitute.utoronto.ca/news/co-desi...
Can a market-based regulatory framework help govern AI? New report weighs in — Schwartz Reisman Institute
In April 2024, the Schwartz Reisman Institute for Technology and Society (SRI) hosted a workshop that brought together 33 high-level experts to explore the viability of regulatory markets. Over the co...
srinstitute.utoronto.ca
June 12, 2025 at 12:34 AM
destabilize or harm our communities, economies, or politics. Together with @djjrjr.bsky.social and @torontosri.bsky.social we held a design workshop last year with a stunning group of experts from AI labs, regulatory technology startups, enterprise clients, civil society, academia, and government. 2/3
June 12, 2025 at 12:33 AM