Xinpeng Wang
xinpeng.bsky.social
PhD student @LMU. Eval & LLM Alignment.
https://xinpeng-wang.github.io/
Reposted by Xinpeng Wang
Reunion in Singapore!🇸🇬 @barbaraplank.bsky.social, @xinpeng.bsky.social, who's currently on a research stay at NYU, and Chengzhi are presenting their work at @iclr-conf.bsky.social
April 24, 2025 at 8:34 AM
Reposted by Xinpeng Wang
Upcoming ICLR 2025 paper: ✂️ Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation

We propose a surgical & flexible approach to mitigate false refusal in LLMs with minimal effect on performance and inference cost

led by @xinpeng.bsky.social (1/2)
April 15, 2025 at 9:37 PM
Reposted by Xinpeng Wang
🎉MaiNLP is turning 3 today!🎂🥳 We’ve grown a lot since @barbaraplank.bsky.social started this group with nothing but three aspiring researchers and a hand-drawn sign on the door. Huge thanks to all the amazing people who have joined or visited us since. Here’s to many more years of exciting research!🚀
April 1, 2025 at 10:40 AM
I’m thrilled to share that our paper on mitigating false refusal in language models has been accepted to ICLR 2025 @iclr-conf.bsky.social!

arxiv.org/abs/2410.03415

Joint work with Chengzhi, @paul-rottger.bsky.social, and @barbaraplank.bsky.social.
January 23, 2025 at 9:34 PM