Thank you to my amazing co-authors! @shuhaib.bsky.social @xiaocheng-yang.bsky.social @HyeonjeongHa @ziruicheng.bsky.social @EsinDurmus @JiaxuanYou @HengJi @gokhantur.bsky.social @dilekh.bsky.social
Check it out: github.com/beyzabozdag/...
🤖 AI as Persuader: Generating persuasive content.
🎯 AI as Persuadee: Vulnerability to persuasive influence.
⚖️ AI as Persuasion Judge: Detecting persuasive tactics and ethical concerns.
Excited to continue exploring LLM persuasiveness & AI safety!
Let’s keep the conversation going! 💬
🤖 Llama-3.3-70B & GPT-4o show similar persuasive effectiveness
🔒 GPT-4o is 50% more resistant to misinformation persuasion vs. Llama-3.3-70B
⚖️ Some models are persuasive, but also too susceptible to persuasion!
✅ Multi-turn persuasion boosts effectiveness—more chances to influence the Persuadee mean higher agreement over time!
📌 Subjective w/ Single-turn
📌 Subjective w/ Multi-turn
📌 Misinformation w/ Multi-turn (⚠️ adversarial!)
💡 Key finding: Persuasive effectiveness stays fairly stable, but susceptibility varies significantly based on the domain!
🤖 A Persuader tries to convince the other agent
🤖 A Persuadee updates its agreement over a multi-turn conversation
Each model debates against the others!🔄
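A rough sketch of how that setup could look in code, for anyone who wants to picture it. This is not the code from the repo: the function names, the numeric agreement score, and the `persuade_fn` / `update_fn` callables are all hypothetical stand-ins for real model calls.

```python
from itertools import permutations
from typing import Callable, List, Tuple


def pairings(models: List[str]) -> List[Tuple[str, str]]:
    """Every ordered (persuader, persuadee) pair of distinct models."""
    return list(permutations(models, 2))


def run_dialogue(
    persuader: str,
    persuadee: str,
    claim: str,
    turns: int,
    persuade_fn: Callable[[str, str, int], str],
    update_fn: Callable[[str, str, str, int], int],
    start_agreement: int,
) -> List[int]:
    """Track the persuadee's agreement score across a multi-turn exchange.

    persuade_fn and update_fn are hypothetical placeholders for model calls;
    the integer agreement scale is an assumption made for illustration.
    """
    agreement = start_agreement
    history = [agreement]
    for _ in range(turns):
        # The persuader produces an argument given the current agreement.
        message = persuade_fn(persuader, claim, agreement)
        # The persuadee reads it and reports an updated agreement score.
        agreement = update_fn(persuadee, claim, message, agreement)
        history.append(agreement)
    return history
```

With three models, `pairings` yields six ordered matchups, so each model plays both roles against every other model.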