Shuhaib Mehri
shuhaib.bsky.social
CS PhD @ UIUC
🧵 [6/n] For more details, see:
Project Website: shuhaibm.github.io/refed/
Paper: arxiv.org/abs/2502.04511

Thanks to my incredible co-authors Xiusi Chen, Heng Ji, and @dilekh.bsky.social!
Beyond Sample-Level Feedback: Using Reference-Level Feedback to Guide Data Synthesis
LLMs demonstrate remarkable capabilities in following natural language instructions, largely due to instruction-tuning on high-quality datasets. While synthetic data generation has emerged as a scalab...
arxiv.org
February 10, 2025 at 3:56 PM
🧵 [5/n] Our experiments speak for themselves:
📊 We demonstrate consistent improvement across both base and instruct variants of different model architectures
📊 Analysis of filtering strategies reveals dataset variants that maintain strong performance while reducing costs
February 10, 2025 at 3:56 PM
🧵 [4/n] Our experiments speak for themselves:
📊 Llama-3.1-8B-Instruct + REFED achieves SOTA among SFT-based 8B-parameter models on AlpacaEval 2.0
📊 Comparisons and ablation studies validate every component of our framework and show advantages over traditional feedback
February 10, 2025 at 3:56 PM
🧵 [3/n] 📚 Our data synthesis framework uses reference-level feedback to guide the synthesis of new instructions and to improve their corresponding responses. We present REFED, a dataset of 10K samples synthesized using our framework.
February 10, 2025 at 3:56 PM
🧵 [2/n] Our key insight 🎯 We extract valuable feedback from high-quality reference samples to guide data synthesis. This effectively leverages seed datasets, propagating desirable qualities to newly synthesized data.
February 10, 2025 at 3:56 PM