Muru Zhang
muruzhang.bsky.social
First-year NLP PhD @ USC | Intern @ TogetherAI | Prev. UW, AWS

https://nanami18.github.io/
Great to be part of this project led by the amazing @hamishivi.bsky.social. The most fun part (in retrospect) was watching how the results started to shift as we scaled up the candidate pool, evaluation suite, and selection size :) And in the end, we found that a simple method does best!
How well do data-selection methods work for instruction-tuning at scale?

Turns out, when you look at large, varied data pools, lots of recent methods lag behind simple baselines, and a simple embedding-based method (RDS) does best!

More below ⬇️ (1/8)
March 4, 2025 at 9:14 PM
Reposted by Muru Zhang
How well do data-selection methods work for instruction-tuning at scale?

Turns out, when you look at large, varied data pools, lots of recent methods lag behind simple baselines, and a simple embedding-based method (RDS) does best!

More below ⬇️ (1/8)
March 4, 2025 at 5:10 PM