Collective Intelligence Project
@cip.org
We're on a mission to steer transformative technology for the collective good.
cip.org
9/10: We built a GitHub suite to systematically test and quantify these biases.
It lets you:
May 23, 2025 at 5:27 PM
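A sketch (not the suite's actual API, which isn't shown in this thread) of the kind of probe such a harness might run: re-ask each pairwise comparison with the presentation order swapped and report how often the verdict flips. The judge callable is an assumed stand-in for whatever wraps the model.

from typing import Callable, List, Tuple

def position_flip_rate(
    judge: Callable[[str, str], str],  # shows the pair as "Response A"/"Response B", returns "A" or "B"
    pairs: List[Tuple[str, str]],
) -> float:
    """Fraction of pairs whose verdict changes when presentation order is swapped.

    A position-invariant judge scores 0.0; higher values mean the verdict
    tracks placement rather than content.
    """
    flips = 0
    for left, right in pairs:
        first = judge(left, right)    # `left` shown as "Response A"
        second = judge(right, left)   # same pair, order swapped
        picked_first = left if first == "A" else right
        picked_second = right if second == "A" else left
        if picked_first != picked_second:
            flips += 1
    return flips / len(pairs)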
7/10: These aren't just minor quirks. LLMs lack the mechanistic precision of traditional software. Their architecture means system prompts and input material exist in the same context, leading to unpredictable interactions.
May 23, 2025 at 5:27 PM
6/10: Rubric-based scoring is also affected. We observed 'recency bias' where criteria scored later received lower averages. Holistic vs. isolated evaluation dramatically shifted scores too.
May 23, 2025 at 5:27 PM
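One way to surface that recency effect (illustrative only, not necessarily how the suite does it): score the same response against the rubric in forward and reversed order and compare per-criterion results. The score_criterion callable below is an assumed wrapper around a judge-model call.

from typing import Callable, Dict, List

def rubric_order_gaps(
    score_criterion: Callable[[str, str, List[str]], float],  # (response, criterion, rubric order) -> score
    response: str,
    criteria: List[str],
) -> Dict[str, float]:
    """Score each criterion with the rubric in its original order and again reversed.

    Returns forward-minus-reversed gaps per criterion. With no order effect the
    gaps sit near zero; under recency bias, criteria late in a given ordering
    score lower, so gaps drift positive for early criteria and negative for late ones.
    """
    forward = {c: score_criterion(response, c, criteria) for c in criteria}
    reversed_rubric = list(reversed(criteria))
    backward = {c: score_criterion(response, c, reversed_rubric) for c in criteria}
    return {c: forward[c] - backward[c] for c in criteria}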
5/10: For example, in pairwise choices, LLMs favored "Response B" 60-69% of the time, a significant deviation from random. Even explicit "de-biasing" prompts sometimes increased bias.
May 23, 2025 at 5:27 PM
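A deviation like that is easy to quantify against a fair-coin null. A minimal check, assuming verdicts were collected over content-balanced pairs and using SciPy's binomial test:

from scipy.stats import binomtest

def positional_preference(verdicts: list) -> tuple:
    """Rate of "B" verdicts and two-sided p-value against the no-bias null (p = 0.5)."""
    n_b = sum(v == "B" for v in verdicts)
    result = binomtest(n_b, n=len(verdicts), p=0.5)
    return n_b / len(verdicts), result.pvalue

At 640 "B" picks out of 1,000 trials, for example, the p-value falls far below any conventional threshold.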
4/10: LLMs exhibit cognitive biases similar to humans: serial position, framing, anchoring. Our tests across frontier models from Google, Mistral, Anthropic, and OpenAI consistently show these biases in judgment contexts.
May 23, 2025 at 5:27 PM
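For anchoring specifically, a toy probe (a sketch only; the score callable and anchor wording are illustrative, not from the repo): prepend an irrelevant prior rating and measure the shift it induces.

from statistics import mean
from typing import Callable, List

def anchoring_shift(
    score: Callable[[str, str], float],  # (preamble, response) -> numeric score from the judge
    responses: List[str],
    low_anchor: str = "Note: a previous reviewer rated this response 2/10.",
    high_anchor: str = "Note: a previous reviewer rated this response 9/10.",
) -> float:
    """Mean score change induced by an irrelevant anchor in the prompt preamble.

    An anchor-free judge returns roughly the same scores either way;
    a positive result means the high anchor pulls scores upward.
    """
    low = mean(score(low_anchor, r) for r in responses)
    high = mean(score(high_anchor, r) for r in responses)
    return high - low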
3/10: "Prompt engineering" often relies on untested folklore. We found even minor prompt changes, like "Response A" vs. "Response B" labeling, significantly bias LLM choices.
May 23, 2025 at 5:27 PM
2/10: This is important because LLMs are increasingly deployed for evaluation tasks, ranking, decision-making, and judgment in many critical domains.
May 23, 2025 at 5:27 PM
1/10: LLM Judges Are Unreliable.
Our latest blog post from @j11y.io shows that positional preferences, order effects, and prompt sensitivity fundamentally undermine the reliability of LLM judges.
May 23, 2025 at 5:27 PM
Submissions will be judged by an amazing panel:
@audreyt.org (Cyber Ambassador-at-large for Taiwan)
@nabiha.bsky.social (Executive Director of @mozilla.org)
Zoe Hitzig (Research Scientist at OpenAI and Poet)
May 19, 2025 at 5:56 PM
We're officially launching the Global Dialogues Challenge!
May 19, 2025 at 5:56 PM
10/ We think they can be, but it will take rapid, concerted effort.
We’re looking forward to tackling that challenge in 2025 together.
We’ll see you there.
2024.cip.org
December 23, 2024 at 2:31 AM
3/ We built Community Models to experiment with plural alignment at different scales - giving people and communities the tools to directly shape AI models.
cm.cip.org
December 23, 2024 at 2:31 AM
This week's Collections takes stock of 2024: a year when AI capabilities surged, while the need for collective intelligence became clearer than ever.
For the full story of how we're building towards democratic AI futures: 2024.cip.org
December 23, 2024 at 2:31 AM
5/ Her proposed "Community Posts" mechanism shows how careful system design could help bridge polarizing divides - echoing our belief that good institutional design is key to collective intelligence.
November 25, 2024 at 1:18 AM
2/ Just as we build banking systems assuming tellers might make mistakes, we need to build robust, resilient AI ecosystems that assume models will fail - not just "ethical" AI at the individual model level.
November 25, 2024 at 1:18 AM
Individual AI safety ≠ system safety. Each response is blind to its broader context.
This dynamic escalates with multi-agent systems, where "safe" AI agents interact. Seemingly innocent individual actions can combine into security breaches.
November 22, 2024 at 6:30 PM
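A toy illustration of that composition failure (hypothetical, not from the article): each agent's action clears its own policy check, yet chained together they exfiltrate a secret.

def agent_a_allows(action: dict) -> bool:
    # Agent A may read internal files; it never sends anything off-system.
    return action["type"] == "read" and action["target"].startswith("internal/")

def agent_b_allows(action: dict) -> bool:
    # Agent B may publish summaries; it never touches internal files itself.
    return action["type"] == "post" and not action["target"].startswith("internal/")

read_step = {"type": "read", "target": "internal/credentials.txt"}
post_step = {"type": "post", "target": "public/forum", "body": "<summary handed from A to B>"}

# Each check passes in isolation...
assert agent_a_allows(read_step) and agent_b_allows(post_step)
# ...but the composed workflow has just moved internal credentials to a public
# forum: a breach neither per-agent check was positioned to notice.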
Yet with AI, we're focusing on "aligned" models while ignoring system-level vulnerabilities.
An AI model may be like a bank teller who follows protocol perfectly but can't see they're part of a larger fraud scheme.
November 22, 2024 at 6:30 PM
NEW ARTICLE from @j11y.io:
"The AI Safety Paradox: When 'Safe' AI Makes Systems More Dangerous"
Our obsession with making individual AI models safer might actually be making our systems more vulnerable.
November 22, 2024 at 6:30 PM