Lightnews — Scholar-powered news

Collective Intelligence Project

@cip.org

Apple: podcasts.apple.com/us/podcast/a...

Spotify: open.spotify.com/episode/6UDj...

Audrey Tang and Divya Siddarth on Outfitting Democracy for the AI Era

Podcast Episode · Possible · 08/13/2025 · 52m

podcasts.apple.com

August 15, 2025 at 2:08 PM

Collective Intelligence Project

@cip.org

- Why "uncommon ground" beats common ground every time

- Sci-fi book recommendations

- And much more

August 15, 2025 at 2:08 PM

Collective Intelligence Project

@cip.org

- Our work bringing 100K+ people into AI development through globaldialogues.ai

- How we're building evaluation benchmarks from lived experiences, not just lab tests

- Digital twins that could represent your values without taking up all your evenings

Global Dialogues

Exploring humanity's vision for artificial intelligence through global conversations and collective intelligence.

globaldialogues.ai

August 15, 2025 at 2:08 PM

Collective Intelligence Project

@cip.org

What you'll find in this episode:

- How Taiwan crowdsourced anti-deepfake legislation in 24 hours (and it worked)

- Why 1 in 3 adults now use AI for daily emotional support, and what that means for democracy

August 15, 2025 at 2:08 PM

Collective Intelligence Project

@cip.org

10/10: Read the piece to learn more about this under-explored issue.

It includes specific strategies to address these biases and provides access to the full Github suite.

www.cip.org/blog/llm-jud...

LLM Judges Are Unreliable — The Collective Intelligence Project

When Large Language Models are used as judges for decision-making across various sensitive domains, they consistently exhibit unpredictable and hidden measurement biases, making their verdicts unrelia...

www.cip.org

May 23, 2025 at 5:27 PM

Collective Intelligence Project

@cip.org

9/10: We built a Github suite to systematically test and quantify these biases.

It lets you:

May 23, 2025 at 5:27 PM

Collective Intelligence Project

@cip.org

8/10: To improve reliability: Neutralize labels, vary order, empirically validate all prompt components, and optimize scoring mechanics. Diversify your model portfolio and critically evaluate human baselines.

May 23, 2025 at 5:27 PM

Collective Intelligence Project

@cip.org

7/10: These aren't just minor quirks. LLMs lack the mechanistic precision of traditional software. Their architecture means system prompts and input material exist in the same context, leading to unpredictable interactions.

May 23, 2025 at 5:27 PM

Collective Intelligence Project

@cip.org

6/10: Rubric-based scoring is also affected. We observed 'recency bias' where criteria scored later received lower averages. Holistic vs. isolated evaluation dramatically shifted scores too.

May 23, 2025 at 5:27 PM

Collective Intelligence Project

@cip.org

5/10: For example, in pairwise choices, LLMs favored "Response B" 60-69% of the time, a significant deviation from random. Even explicit "de-biasing" prompts sometimes increased bias.

May 23, 2025 at 5:27 PM

Collective Intelligence Project

@cip.org

4/10: LLMs exhibit cognitive biases similar to humans: serial position, framing, anchoring. Our tests across frontier models from Google, Mistral, Anthropic, and OpenAI consistently show these biases in judgment contexts.

May 23, 2025 at 5:27 PM

Collective Intelligence Project

@cip.org

3/10: "Prompt engineering" often relies on untested folklore. We found even minor prompt changes, like "Response A" vs. "Response B" labeling, significantly bias LLM choices.

May 23, 2025 at 5:27 PM

Collective Intelligence Project

@cip.org

2/10: This is important because LLMs are increasingly deployed for evaluation tasks, ranking, decision-making, and judgement in many critical domains.

May 23, 2025 at 5:27 PM

Collective Intelligence Project

@cip.org

Details and how to apply: cip.org/challenge

Global Dialogues Challenge — The Collective Intelligence Project

cip.org

May 19, 2025 at 5:56 PM

Collective Intelligence Project

@cip.org

Submissions will be judged by an amazing panel:

@audreyt.org (Cyber Ambassador-at-large for Taiwan)

@nabiha.bsky.social (Executive Director of @mozilla.org )

Zoe Hitzig (Research Scientist at OpenAI and Poet)

May 19, 2025 at 5:56 PM

Collective Intelligence Project

@cip.org

The challenge runs from Monday, May 19th through Friday, July 11th.

A $10,000 prize fund will be distributed among the winning submissions.

May 19, 2025 at 5:56 PM

Collective Intelligence Project

@cip.org

This is an open call to explore global perspectives on AI using the public datasets sourced from our globaldialogues.ai project.

Participants can submit benchmarks, visualizations, artistic responses, or analytical reflections.

Global Dialogues

Exploring humanity's vision for artificial intelligence through global conversations and collective intelligence.

Globaldialogues.ai

May 19, 2025 at 5:56 PM

Collective Intelligence Project

@cip.org

www.technologyreview.com/2025/03/11/1...

These new AI benchmarks could help make models less biased

They could offer a more nuanced way to measure AI’s bias and its understanding of the world.

www.technologyreview.com

March 14, 2025 at 6:36 PM

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news