Lightnews — Scholar-powered news

Xuhui Zhou

@nlpxuhui.bsky.social

1.8K followers 150 following 22 posts

PhD student @ltiatcmu.bsky.social. Previously, @ai2.bsky.social, @uwnlp.bsky.social, @appleinc.bsky.social, @ucberkeleyofficial.bsky.social; Social Intelligence in language +X. He/Him.🐳

Xuhui Zhou

@nlpxuhui.bsky.social

🔄 Multi-turn interactive setup is crucial - models often begin with equivocation but shift to falsification when pressed for clear answers 🧠 Stronger models like GPT-4o showed the greatest shift when prompted to deceive (40% increase in falsification; alarming) 6/

April 28, 2025 at 8:36 PM

Xuhui Zhou

@nlpxuhui.bsky.social

⚠️ Even when explicitly instructed to be truthful, models STILL lied - GPT-4o still falsified info 15% of the time! 📉 The tradeoff is real: more honest models completed their goals 15% less often 5/

April 28, 2025 at 8:36 PM

Xuhui Zhou

@nlpxuhui.bsky.social

💼 In business scenarios (selling defective products), models were either completely honest OR completely deceptive 🌐 In public image scenarios (reputation management), behaviors were more ambiguous and complex 4/

April 28, 2025 at 8:36 PM

Xuhui Zhou

@nlpxuhui.bsky.social

And what we found: 📊 ALL tested models (GPT-4o, LLaMA-3, Mixtral) were truthful less than 50% of the time in conflict scenarios 🤔 Models prefer "partial lies" like equivocation over outright falsification - they'll dodge questions before explicitly lying 3/

April 28, 2025 at 8:36 PM

Xuhui Zhou

@nlpxuhui.bsky.social

Obviously this is a pressing issue now: x.com/deedydas/sta...; x.com/DanHendrycks... And here, we put LLMs into a multi-turn dialogue environment that mimics the realistic setting where users constantly try to seek info from LLMs 2/

April 28, 2025 at 8:36 PM

Xuhui Zhou

@nlpxuhui.bsky.social

When interacting with ChatGPT, have you wondered if it would ever "lie" to you? We found that under pressure, LLMs often choose deception. Our new #NAACL2025 paper, "AI-LIEDAR," reveals models were truthful less than 50% of the time when faced with utility-truthfulness conflicts! 🤯 1/

April 28, 2025 at 8:36 PM

Xuhui Zhou

@nlpxuhui.bsky.social

8/ Not all LLMs ask the right questions. ❓🤖
🔹 Llama 3.1 70B asks generic, low-impact questions.
🔹 Claude Haiku 3.5 picks up keywords directly from the input to ask questions.
🔹 Claude Sonnet 3.5 often explores the code first, leading to smarter interactions. 🔍💡

February 19, 2025 at 7:46 PM

Xuhui Zhou

@nlpxuhui.bsky.social

7/ Claude models ask fewer but smarter questions, extracting more info and boosting performance. 📈
Meanwhile, DeepSeek-V2 can overwhelm users with too many questions. 🤯

February 19, 2025 at 7:46 PM

Xuhui Zhou

@nlpxuhui.bsky.social

6/ Without compulsory interaction, LLMs struggle to distinguish clear vs. vague instructions, either over-interacting or under-interacting despite prompt tweaks. 🔄
Only Claude Sonnet 3.5 can make this distinction to a limited degree with the right prompt. 🔍

February 19, 2025 at 7:46 PM

Xuhui Zhou

@nlpxuhui.bsky.social

5/ Our findings? LLMs default to non-interactive behavior unless forced to interact. But when they clarify vague inputs, performance drastically improves—proving the power of effective communication. 💬🤝

February 19, 2025 at 7:46 PM

Xuhui Zhou

@nlpxuhui.bsky.social

3/ How much does interaction actually help LLMs in coding tasks? 🤖💡
We put them to the test on SWE-Bench Verified across three distinct settings to measure the impact. 📊

February 19, 2025 at 7:46 PM

Xuhui Zhou

@nlpxuhui.bsky.social

LLM agents can code—but can they ask clarifying questions? 🤖💬
Tired of coding agents wasting time and API credits, only to output broken code? What if they asked first instead of guessing? 🚀

(New work led by Sanidhya Vijay: www.linkedin.com/in/sanidhya-...)

February 19, 2025 at 7:46 PM
