PhD Candidate 🤘@UTAustin | previously @IBMResearch @sjtu1896 | NLP for social good
👨🏻‍💻 Code and data: github.com/honglizhan/S...
Shout out to an amazing team @jessyjli.bsky.social, @m-yurochkin.bsky.social, Muneeza Azmat & Raya Horesh! Also super grateful to the reviewers for their invaluable feedback!
#ICML2025 #LLMAlignment
2️⃣SPRI enables tailored rubrics for LLM-judges, matching human-crafted rubrics (e.g., BiGGen-Bench)
3️⃣SPRI-generated synthetic data boosts Llama/Mistral/Gemma (7~9B) on TruthfulQA, with no loss on other benchmarks
💡SPRI tackles this and proves to rival human oracle guidance in the three real-world use cases we tested on 👇
Thanks for sharing this!
This project was carried out during my internship at IBM Research, and I’d like to highlight the support and mentorship from my amazing hosts Muneeza Azmat, Raya Horesh, @m-yurochkin.bsky.social and advisor @jessyjli.bsky.social!
SPRI turns out to work great for tasks that require complex principles, performing on par with expert-guided methods.
In each stage, a base model and a critic model are used to create principles and responses from scratch through critique-refine.
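For anyone who wants a feel for the critique-refine loop described above, here is a minimal sketch. It is not the code from the repo: the `Generator` and `Critic` interfaces, the prompt wording, the `threshold` and `max_rounds` parameters, and the split into one stage for principles and one for responses are all assumptions made for illustration.

```python
from typing import Callable, Tuple

# Hypothetical interfaces; neither signature comes from the SPRI codebase.
Generator = Callable[[str], str]                   # prompt -> generated text
Critic = Callable[[str, str], Tuple[float, str]]   # (task, draft) -> (score, feedback)


def critique_refine(task: str, base: Generator, critic: Critic,
                    threshold: float = 0.8, max_rounds: int = 3) -> str:
    """Draft with the base model, then revise until the critic's score
    reaches the threshold or the round budget runs out."""
    draft = base(task)
    for _ in range(max_rounds):
        score, feedback = critic(task, draft)
        if score >= threshold:
            break
        # Feed the critic's feedback back to the base model for a revision.
        draft = base(
            f"{task}\n\nPrevious draft:\n{draft}\n\n"
            f"Feedback:\n{feedback}\n\nRevise the draft."
        )
    return draft


def two_stage(query: str, base: Generator, critic: Critic) -> Tuple[str, str]:
    """Assumed two-stage flow: refine principles first, then a response
    that is conditioned on those principles."""
    principles = critique_refine(
        f"Write guiding principles for answering:\n{query}", base, critic
    )
    response = critique_refine(
        f"Principles:\n{principles}\n\nAnswer the query:\n{query}", base, critic
    )
    return principles, response
```

Any two callables with these shapes can be plugged in as `base` and `critic`, for example thin wrappers around whichever LLM client you use.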