Lightnews — Scholar-powered news

@hongli-zhan.bsky.social

I will also be on IBM's Expo Talk Panel on Monday, Jul 14, to discuss how SPRI can be used in industry to generate high-quality synthetic data. You can find more details about the talk here: icml.cc/virtual/2025...

ICML Expo Talk Panel: Situating principles in context for synthetic data (ICML 2025)

icml.cc

July 8, 2025 at 3:13 PM

Hongli Zhan ✈️ ICML

@hongli-zhan.bsky.social

📜Link to the paper: icml.cc/virtual/2025...
👨🏻‍💻Code and data: github.com/honglizhan/S...

Shout out to an amazing team @jessyjli.bsky.social, @m-yurochkin.bsky.social, Muneeza Azmat & Raya Horesh! Also super grateful to the reviewers for their invaluable feedback!

#ICML2025 #LLMAlignment

ICML Poster: SPRI: Aligning Large Language Models with Context-Situated Principles (ICML 2025)

icml.cc

July 8, 2025 at 3:13 PM

Hongli Zhan ✈️ ICML

@hongli-zhan.bsky.social

1️⃣SPRI generates principles as effective as those written by psychologists for improving users’ well-being

2️⃣SPRI enables tailored rubrics for LLM-judges, matching human-crafted rubrics (e.g., BiGGen-Bench)

3️⃣SPRI-generated synthetic data boosts Llama/Mistral/Gemma (7~9B) on TruthfulQA, with no loss on other benchmarks

July 8, 2025 at 3:11 PM

Hongli Zhan ✈️ ICML

@hongli-zhan.bsky.social

🎯Motivation: Constitutional AI works great for aligning LLMs, but the principles can be too generic to apply. Can we guide responses with context-situated principles instead?

💡SPRI tackles this and rivals human oracle guidance in the three real-world use cases we tested 👇

July 8, 2025 at 3:06 PM

Hongli Zhan ✈️ ICML

@hongli-zhan.bsky.social

I definitely agree :) I think SPRI can help generate SFT data for their constitutional classifiers that extrapolate *beyond* the "chemical weapons" context that they show in Sec 5 and Appendix B.

Thanks for sharing this!

February 7, 2025 at 5:20 AM

Hongli Zhan ✈️ ICML

@hongli-zhan.bsky.social

[5/5] Code and model generations: github.com/honglizhan/S...

This project was carried out during my internship at IBM Research, and I’d like to highlight the support and mentorship from my amazing hosts Muneeza Azmat, Raya Horesh, @m-yurochkin.bsky.social and advisor @jessyjli.bsky.social!

arxiv.org

February 6, 2025 at 10:48 PM

Hongli Zhan ✈️ ICML

@hongli-zhan.bsky.social

[4/5] In addition, when applying SPRI to generate SFT data for alignment, we observe substantial improvement on TruthfulQA.

February 6, 2025 at 10:44 PM

Hongli Zhan ✈️ ICML

@hongli-zhan.bsky.social

[3/5] We tested SPRI on 3 tasks: generating 1) cognitive reappraisals, 2) instance-specific rubrics for LLM-as-a-judge, and 3) SFT data for alignment.

SPRI turns out to work great for tasks that require complex principles, performing on par with expert-guided methods.

February 6, 2025 at 10:44 PM

Hongli Zhan ✈️ ICML

@hongli-zhan.bsky.social

[2/5] Short for Situated-PRInciples, SPRI involves 2 stages: 1) synthesizing context-situated principles, and 2) crafting principle-guided responses.

In each stage, a base model and a critic model are used to create principles and responses from scratch through critique-refine.

February 6, 2025 at 10:44 PM
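
The two-stage critique-refine process described in [2/5] can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, prompt wording, score scale, `threshold`, and `max_rounds` are all assumptions made for the example, and the base and critic models are passed in as plain callables so the sketch runs without any LLM library.

```python
from typing import Callable, Tuple

# Hypothetical interfaces (not taken from the SPRI codebase):
#   base(prompt) -> generated text
#   critic(context, draft) -> (score, feedback)
BaseModel = Callable[[str], str]
CriticModel = Callable[[str, str], Tuple[float, str]]


def critique_refine(
    context: str,
    task: str,
    base: BaseModel,
    critic: CriticModel,
    threshold: float = 0.8,  # assumed stopping score
    max_rounds: int = 3,     # assumed refinement budget
) -> str:
    """Draft with the base model, then refine using critic feedback."""
    draft = base(f"{task}\n\nContext:\n{context}")
    for _ in range(max_rounds):
        score, feedback = critic(context, draft)
        if score >= threshold:
            break  # the critic is satisfied with this draft
        # Ask the base model to revise the draft given the feedback.
        draft = base(
            f"{task}\n\nContext:\n{context}"
            f"\n\nPrevious draft:\n{draft}\n\nFeedback:\n{feedback}"
        )
    return draft


def spri_sketch(
    user_input: str, base: BaseModel, critic: CriticModel
) -> Tuple[str, str]:
    """Stage 1: situated principles. Stage 2: principle-guided response."""
    principles = critique_refine(
        user_input,
        "Write guiding principles tailored to this input.",
        base,
        critic,
    )
    response = critique_refine(
        f"{user_input}\n\nPrinciples:\n{principles}",
        "Write a response that follows the principles.",
        base,
        critic,
    )
    return principles, response
```

In practice `base` and `critic` would wrap calls to whichever models are used; the same loop serves both stages because only the task instruction and the context change.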
