Ben Newman
benn9.bsky.social
Ben Newman
@benn9.bsky.social
NLP research - PhD student at UW
Reposted by Ben Newman
Did you know that LLMs suffer from serious mode collapse?

For example, if you ask models to tell you a joke, they almost always tell you the same joke? This is true across samples and even across model families!

Why does this happen? Can we improve it?
October 8, 2025 at 2:22 PM
Reposted by Ben Newman
Excited to share OLMo 2!

🐟 7B and 13B weights, trained up to 4-5T tokens, fully open data, code, etc
🐠 better architecture and recipe for training stability
🐡 staged training, with new data mix Dolmino🍕 added during annealing
🦈 state-of-the-art OLMo 2 Instruct models

#nlp #mlsky

links below👇
November 26, 2024 at 8:59 PM
Reposted by Ben Newman
I'm recruiting 1-2 PhD students to work with me at the University of Colorado Boulder! Looking for creative students with interests in #NLP and #CulturalAnalytics.

Boulder is a lovely college town 30 minutes from Denver and 1 hour from Rocky Mountain National Park 😎

Apply by December 15th!
November 19, 2024 at 10:38 AM
Reposted by Ben Newman
✨I am on the faculty job market in the 2024-2025 cycle!✨

My research centers on advancing Responsible AI, specifically enhancing factuality, robustness, and transparency in AI systems.

If you have relevant positions, let me know! lasharavichander.github.io Please share/RT!
Abhilasha Ravichander - Home
lasharavichander.github.io
November 11, 2024 at 2:23 PM
Reposted by Ben Newman
Why and when do preference annotators disagree? And how do reward models + LLM-as-Judge evaluators handle disagreements?

Michael explored these questions in a new ✨preprint✨ from his @ai2.bsky.social internship with me!
November 7, 2024 at 5:38 PM
✨EMNLP Paper! ✨
Have you ever constructed a table to organize your literature review process? Can we use LMs to generate these automatically?

We are excited to present ArxivDIGESTables 🍽️ a study of collecting, generating, and evaluating 🎓 scientific literature review tables 📃!
November 11, 2024 at 5:37 PM