Lightnews — Scholar-powered news

Hongli Zhan ✈️ ICML

@hongli-zhan.bsky.social

110 followers 240 following 13 posts

http://honglizhan.github.io
PhD Candidate 🤘@UTAustin | previously @IBMResearch @sjtu1896 | NLP for social good


Hongli Zhan ✈️ ICML

@hongli-zhan.bsky.social

I'll be at #ICML to present SPRI next week! Come by our poster on Tuesday, July 15, 4:30pm, and let’s catch up on LLM alignment! 😃

🚀TL;DR: We introduce Situated-PRInciples (SPRI), a framework that automatically generates input-specific principles to align responses — with minimal human effort.

🧵

July 8, 2025 at 3:05 PM

Hongli Zhan ✈️ ICML

@hongli-zhan.bsky.social

I’m excited to share that our paper has been accepted at #ICML2025! 🎉🥳🎊

This work was done during my internship at IBM Research, and it wouldn’t have been possible without a top-notch team and my amazing advisor 👏

May 2, 2025 at 9:27 PM

Hongli Zhan ✈️ ICML

@hongli-zhan.bsky.social

[4/5] In addition, when applying SPRI to generate SFT data for alignment, we observe substantial improvement on TruthfulQA.

February 6, 2025 at 10:44 PM

Hongli Zhan ✈️ ICML

@hongli-zhan.bsky.social

[3/5] We tested SPRI on 3 tasks: generating 1) cognitive reappraisals, 2) instance-specific rubrics for LLM-as-a-judge, and 3) SFT data for alignment.

SPRI works especially well on tasks that require complex principles, performing on par with expert-guided methods.

February 6, 2025 at 10:44 PM

Hongli Zhan ✈️ ICML

@hongli-zhan.bsky.social

[2/5] Short for Situated-PRInciples, SPRI involves 2 stages: 1) synthesizing context-situated principles, and 2) crafting principle-guided responses.

In each stage, a base model and a critic model are used to create principles and responses from scratch through critique-refine.

February 6, 2025 at 10:44 PM
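
The two-stage flow described in [2/5] can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' implementation: the names `critique_refine` and `spri`, the prompt wording, the numeric score with a `threshold`, and the `max_rounds` cap are all hypothetical, and the base and critic models are passed in as plain callables.

```python
from typing import Callable, Tuple

# Hypothetical interfaces (assumptions, not from the paper):
# a base model maps a prompt to text; a critic maps (prompt, candidate)
# to a (score, feedback) pair.
Generate = Callable[[str], str]
Critique = Callable[[str, str], Tuple[float, str]]


def critique_refine(prompt: str, generate: Generate, critique: Critique,
                    threshold: float = 0.8, max_rounds: int = 3) -> str:
    """Draft with the base model, then revise while the critic is unsatisfied."""
    candidate = generate(prompt)
    for _ in range(max_rounds):
        score, feedback = critique(prompt, candidate)
        if score >= threshold:
            break  # the critic accepts the current draft
        # Ask the base model for a revision that takes the feedback into account.
        candidate = generate(
            f"{prompt}\n\nPrevious draft:\n{candidate}\n\nFeedback:\n{feedback}"
        )
    return candidate


def spri(query: str, generate: Generate, critique: Critique) -> Tuple[str, str]:
    """Stage 1: principles situated in the query. Stage 2: a response guided by them."""
    principles = critique_refine(
        f"Write principles for responding to: {query}", generate, critique
    )
    response = critique_refine(
        f"Principles:\n{principles}\n\nRespond to: {query}", generate, critique
    )
    return principles, response
```

In this sketch both stages reuse the same loop, so swapping in a different base or critic model only means passing different callables.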

Hongli Zhan ✈️ ICML

@hongli-zhan.bsky.social

Constitutional AI works great for aligning LLMs, but the principles can be too generic to apply.

Can we guide responses with context-situated principles instead?

Introducing SPRI, a system that produces principles tailored to each query, with minimal to no human effort.

arxiv.org/pdf/2502.03397

February 6, 2025 at 10:43 PM
