Columbia Business School
Innovation & Technology
Prompt architecture can influence outputs in high-stakes domains:
• Hiring decisions
• Medical triage
• Policy or scientific research summaries
• Your research papers!
In each case, prompt architecture could silently skew results—unless we actively correct for it.
Instead of searching for a “perfect” prompt, we propose Prompt Aggregation: By asking the same question multiple ways and combining answers, we can cancel out these biases.
In our “honey vs maple” example, 5 of the 8 prompts favor honey, so the aggregated answer is honey. Try it out yourself!
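As a rough illustration of the aggregation step, the sketch below takes one answer per prompt variant and returns the majority choice. The function name `aggregate` and the plain majority-vote rule are assumptions made for this example, not necessarily the exact combination rule used in the study; the answer list simply mirrors the 5-to-3 split mentioned above.

```python
from collections import Counter


def aggregate(answers):
    """Majority vote over answers from several prompt variants.

    Returns (winner, share of votes). Returns (None, share) on a tie,
    since a tie gives no clear aggregate answer.
    """
    counts = Counter(answers).most_common()
    top_answer, top_votes = counts[0]
    # A tie for first place means the variants cancel out exactly.
    if len(counts) > 1 and counts[1][1] == top_votes:
        return None, top_votes / len(answers)
    return top_answer, top_votes / len(answers)


# Illustrative input only: 8 variants, 5 choosing honey and 3 choosing maple.
answers = ["honey"] * 5 + ["maple"] * 3
winner, share = aggregate(answers)
print(winner, share)  # honey 0.625
```

In practice each entry in `answers` would come from a separate model call, one per prompt variant.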
You can't write your way around prompt architecture effects because any prompt must have some order, some framing, some structure.
GPT-3, GPT-4, and Llama 3.1 all exhibited different prompt architecture biases.
We found LLMs are systematically biased by seemingly trivial prompt architecture:
• Option order (e.g., "honey or maple" vs "maple or honey")
• Option labels (e.g., A/B vs B/A)
• Question framing (e.g., "closer" vs "further")
• Asking for justification
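To make these four factors concrete, here is a minimal sketch that crosses them to produce every variant of one question. The question template, the `reference` item ("sugar"), and the helper name `build_prompts` are hypothetical choices for illustration; the wording used in the actual study may differ, and crossing all four factors yields 16 variants rather than any particular subset.

```python
from itertools import product


def build_prompts(option_a, option_b, reference):
    """Cross four binary prompt-architecture factors into prompt variants."""
    prompts = []
    for swap_order, swap_labels, framing, justify in product(
        [False, True],                   # option order
        [False, True],                   # option labels
        ["closer to", "further from"],   # question framing
        [False, True],                   # ask for justification
    ):
        first, second = (option_b, option_a) if swap_order else (option_a, option_b)
        label1, label2 = ("B", "A") if swap_labels else ("A", "B")
        text = (
            f"Which is {framing} {reference}: "
            f"({label1}) {first} or ({label2}) {second}?"
        )
        if justify:
            text += " Briefly justify your answer."
        prompts.append(text)
    return prompts


prompts = build_prompts("honey", "maple", "sugar")
print(len(prompts))  # 16
print(prompts[0])    # Which is closer to sugar: (A) honey or (B) maple?
```

Note that answers to the "further from" variants point at the opposite option, so they would need to be flipped before being compared with the "closer to" variants.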
Imagine you ask ChatGPT a simple question: