Lightnews — Scholar-powered news

Muru Zhang

@muruzhang.bsky.social

640 followers 64 following 2 posts

First-year NLP PhD @ USC | Intern @ TogetherAI | Prev. UW, AWS

https://nanami18.github.io/


Muru Zhang

@muruzhang.bsky.social

Great to be part of this project led by the amazing @hamishivi.bsky.social. The most fun (in retrospect) thing is to observe how the results start to shift as we scale up the candidate pool, evaluation suite, and selection size :) And eventually we find a simple method does the best!

Hamish Ivison @hamishivi.bsky.social · Mar 4

How well do data-selection methods work for instruction-tuning at scale?

Turns out, when you look at large, varied data pools, lots of recent methods lag behind simple baselines, and a simple embedding-based method (RDS) does best!

More below ⬇️ (1/8)

March 4, 2025 at 9:14 PM

Reposted by Muru Zhang

Hamish Ivison

@hamishivi.bsky.social

March 4, 2025 at 5:10 PM
