Hongli Zhan ✈️ ICML
@hongli-zhan.bsky.social
http://honglizhan.github.io
PhD Candidate 🤘@UTAustin | previously @IBMResearch @sjtu1896 | NLP for social good
I will also be on IBM's Expo Talk Panel on Monday, Jul 14, to discuss how SPRI can be used in industry to generate high-quality synthetic data. You can find more details on the talk here: icml.cc/virtual/2025...
ICML Expo Talk Panel: Situating principles in context for synthetic data · ICML 2025
icml.cc
July 8, 2025 at 3:13 PM
📜Link to the paper: icml.cc/virtual/2025...
👨🏻‍💻Code and data: github.com/honglizhan/S...

Shout out to an amazing team @jessyjli.bsky.social, @m-yurochkin.bsky.social, Muneeza Azmat & Raya Horesh! Also super grateful to the reviewers for their invaluable feedback!

#ICML2025 #LLMAlignment
ICML Poster: SPRI: Aligning Large Language Models with Context-Situated Principles · ICML 2025
icml.cc
July 8, 2025 at 3:13 PM
1️⃣SPRI generates principles as effective as psychologists to improve users’ well-being

2️⃣SPRI enables tailored rubrics for LLM-judges, matching human-crafted rubrics (e.g., BiGGen-Bench)

3️⃣SPRI-generated synthetic data boosts Llama/Mistral/Gemma (7–9B) models on TruthfulQA, with no loss on other benchmarks
July 8, 2025 at 3:11 PM
🎯Motivation: Constitutional AI works great for aligning LLMs, but the principles can be too generic to apply. Can we guide responses with context-situated principles instead?

💡SPRI tackles this, rivaling human-oracle guidance across the three real-world use cases we tested 👇
July 8, 2025 at 3:06 PM
I definitely agree :) I think SPRI can help generate SFT data for their constitutional classifiers that extrapolate *beyond* the "chemical weapons" context that they show in Sec 5 and Appendix B.

Thanks for sharing this!
February 7, 2025 at 5:20 AM
[5/5] Code and model generations: github.com/honglizhan/S...

This project was carried out during my internship at IBM Research, and I’d like to highlight the support and mentorship from my amazing hosts Muneeza Azmat, Raya Horesh, @m-yurochkin.bsky.social and advisor @jessyjli.bsky.social!
February 6, 2025 at 10:48 PM
[4/5] In addition, when applying SPRI to generate SFT data for alignment, we observe substantial improvement on TruthfulQA.
February 6, 2025 at 10:44 PM
[3/5] We tested SPRI on 3 tasks: generating 1) cognitive reappraisals, 2) instance-specific rubrics for LLM-as-a-judge, and 3) SFT data for alignment.

SPRI works especially well on tasks that require complex principles, performing on par with expert-guided methods.
February 6, 2025 at 10:44 PM
[2/5] Short for Situated-PRInciples, SPRI involves 2 stages: 1) synthesizing context-situated principles, and 2) crafting principle-guided responses.

In each stage, a base model drafts the output and a critic model iteratively critiques and refines it, so principles and responses are created from scratch without seed examples.
February 6, 2025 at 10:44 PM
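A minimal sketch of the two-stage critique-refine loop described in [2/5]. The function names, prompts, and stopping rule here are illustrative assumptions (stub callables stand in for the base and critic LLMs), not the paper's exact implementation:

```python
def critique_refine(generate, critique, context, max_iters=3):
    """Generic critique-refine loop: the base model drafts, the critic
    model approves or gives feedback, and the draft is revised until
    the critic approves or the iteration budget runs out."""
    draft = generate("", context)  # initial draft, no feedback yet
    for _ in range(max_iters):
        ok, feedback = critique(draft, context)
        if ok:
            break
        draft = generate(feedback, context)  # revise using the critique
    return draft

def spri(base_model, critic_model, user_input):
    """Two-stage SPRI sketch: (1) synthesize a context-situated
    principle for this input, then (2) craft a response guided by it.
    `base_model(prompt) -> str` and `critic_model(kind, item, ctx)
    -> (ok, feedback)` are assumed interfaces, not the paper's API."""
    # Stage 1: context-situated principle.
    principle = critique_refine(
        generate=lambda fb, ctx: base_model(
            f"Write a principle for responding to: {ctx}\nFeedback: {fb}"),
        critique=lambda p, ctx: critic_model("principle", p, ctx),
        context=user_input)
    # Stage 2: principle-guided response.
    response = critique_refine(
        generate=lambda fb, ctx: base_model(
            f"Answer '{ctx}' following this principle: {principle}\nFeedback: {fb}"),
        critique=lambda r, ctx: critic_model("response", r, ctx),
        context=user_input)
    return principle, response
```

Swapping in real LLM calls for `base_model` and `critic_model` (and prompts from the paper's repo) would turn this skeleton into a working pipeline.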