Melanie Brucks
@melaniebrucks.bsky.social
Assistant Professor of Marketing
Columbia Business School
Innovation & Technology
April 28, 2025 at 7:00 PM

1/ Setup

Imagine you ask a simple question to ChatGPT:

2/ Seems straightforward... until you flip the order. Same question, different answers. But why?

3/ Core insight

We found LLMs are systematically biased by seemingly trivial prompt architecture:
• Option order (e.g., "honey or maple" vs "maple or honey")
• Option labels (e.g., A/B vs B/A)
• Question framing (e.g., "closer" vs "further")
• Asking for justification
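
To make that concrete, here is a minimal sketch of how such variants multiply: two framings x two option orders (which also swaps the A/B labels) x with/without a justification request gives eight versions of one question. The wording ("typical syrup") and the `prompt_variants` helper are my own illustration, not the paper's stimuli or code.

```python
# Illustrative sketch only: the question wording and helper name are
# assumptions, not the paper's actual stimuli.
from itertools import product

def prompt_variants(opt1: str, opt2: str):
    """Yield 8 prompts: 2 framings x 2 option orders x 2 justification settings."""
    framings = (
        "Which option is closer to a typical syrup?",    # "closer" framing
        "Which option is further from a typical syrup?", # "further" framing
    )
    orders = ((opt1, opt2), (opt2, opt1))  # flipping order also swaps A/B labels
    justifications = ("", " Briefly justify your answer.")
    for framing, (a, b), justify in product(framings, orders, justifications):
        yield f"{framing}\nA) {a}\nB) {b}\nAnswer with A or B.{justify}"

variants = list(prompt_variants("honey", "maple"))
print(len(variants))  # -> 8 architecturally distinct versions of one question
```
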
4/ Implication: There is no neutral prompt

You can't write your way around prompt architecture effects because any prompt must have some order, some framing, some structure.

GPT-3, GPT-4, and Llama 3.1 all exhibited different prompt architecture biases.

5/ Mitigation strategy

Instead of searching for a “perfect” prompt, we propose Prompt Aggregation: ask the same question multiple ways and combine the answers, so these biases cancel out.

In our “honey vs maple” example, aggregation favors honey in 5 of 8 prompts. Try it out yourself! A sketch follows below.
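
Here is a minimal sketch of that aggregation step (my own illustration, not the paper's published code; the `aggregate` function, the `ask` callable standing in for a real LLM API call, and the per-variant metadata are all assumed names). The key detail is normalizing every answer onto one scale before voting, since a "further" framing selects the opposite option from a "closer" framing:

```python
# Illustrative sketch of prompt aggregation via majority vote.
from collections import Counter

def aggregate(ask, variants):
    """Majority-vote over architecturally different versions of one question.

    ask: callable that sends a prompt to an LLM and returns "A" or "B".
    variants: (prompt, label_to_option, flipped_framing) tuples.
    """
    votes = Counter()
    for prompt, label_to_option, flipped_framing in variants:
        raw = ask(prompt)                  # e.g. "A" or "B"
        choice = label_to_option[raw]      # undo the option order / labeling
        if flipped_framing:                # "further" picks the *other* option
            choice = next(o for o in label_to_option.values() if o != choice)
        votes[choice] += 1
    return votes.most_common(1)[0][0], votes

# Stubbed usage: a fake model that always answers "A".
variants = [
    ("Which is closer to a typical syrup?\nA) honey\nB) maple",
     {"A": "honey", "B": "maple"}, False),
    ("Which is closer to a typical syrup?\nA) maple\nB) honey",
     {"A": "maple", "B": "honey"}, False),
    ("Which is further from a typical syrup?\nA) honey\nB) maple",
     {"A": "honey", "B": "maple"}, True),
]
winner, votes = aggregate(lambda prompt: "A", variants)
print(winner, votes)  # an always-"A" model still yields a majority for "maple"
```

Even a maximally biased stub that always answers "A" no longer dictates the outcome once the order, labels, and framing vary across the aggregated prompts.
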

6/ Why it matters

Prompt architecture can influence outputs in high-stakes domains:
• Hiring decisions
• Medical triage
• Policy or scientific research summaries
• Your research papers!

In each case, prompt architecture could silently skew results unless we actively correct for it.

7/ Read the full paper → journals.plos.org/plosone/arti...

Happy to discuss or answer questions!