Lightnews — Scholar-powered news

Michael Hu

@michahu.bsky.social

64 followers 130 following 6 posts

PhD student at NYU. NLP & training data.
michahu.github.io

So you want a good pretraining data mix🧑‍🍳, but which data mixing algorithm do you pick? DoGE, DoReMi, Skill-it, grid searching proportions… 😵‍💫

It turns out that these algorithms are all special cases of Linear Mixing Optimization, our new data mixing framework! 🧵

November 12, 2024 at 5:04 PM
