Lightnews — Scholar-powered news

Ben Newman

@benn9.bsky.social

750 followers 120 following 6 posts

NLP research - PhD student at UW

Posts Replies Media Videos

Reposted by Ben Newman

Taylor Sorensen

@taylor-sorensen.bsky.social

Did you know that LLMs suffer from serious mode collapse?

For example, if you ask models to tell you a joke, they almost always tell you the same joke? This is true across samples and even across model families!

Why does this happen? Can we improve it?

October 8, 2025 at 2:22 PM

Reposted by Ben Newman

Kyle Lo

@kylelo.bsky.social

Excited to share OLMo 2!

🐟 7B and 13B weights, trained up to 4-5T tokens, fully open data, code, etc
🐠 better architecture and recipe for training stability
🐡 staged training, with new data mix Dolmino🍕 added during annealing
🦈 state-of-the-art OLMo 2 Instruct models

#nlp #mlsky

links below👇

A scatter plot comparing language models by performance (y-axis, measured in average performance on 10 benchmarks) versus training computational cost (x-axis, in approximate FLOPs). The plot shows OLMo 2 models (marked with stars) achieving Pareto-optimal efficiency among open models, with OLMo-2-13B and OLMo-2-7B sitting at the performance frontier relative to other open models like DCLM, Llama 3.1, StableLM 2, and Qwen 2.5. The x-axis ranges from 4x10^22 to 2x10^24 FLOPs, while the y-axis ranges from 35 to 70 benchmark points.

November 26, 2024 at 8:59 PM

Reposted by Ben Newman

Maria Antoniak

@mariaa.bsky.social

I'm recruiting 1-2 PhD students to work with me at the University of Colorado Boulder! Looking for creative students with interests in #NLP and #CulturalAnalytics.

Boulder is a lovely college town 30 minutes from Denver and 1 hour from Rocky Mountain National Park 😎

Apply by December 15th!

A photo of Boulder, Colorado, shot from above the university campus and looking toward the Flatirons.

November 19, 2024 at 10:38 AM

Reposted by Ben Newman

Abhilasha Ravichander

@lasha.bsky.social

✨I am on the faculty job market in the 2024-2025 cycle!✨

My research centers on advancing Responsible AI, specifically enhancing factuality, robustness, and transparency in AI systems.

If you have relevant positions, let me know! lasharavichander.github.io Please share/RT!

Abhilasha Ravichander - Home

lasharavichander.github.io

November 11, 2024 at 2:23 PM

Reposted by Ben Newman

Valentina Pyatkin

@valentinapy.bsky.social

Why and when do preference annotators disagree? And how do reward models + LLM-as-Judge evaluators handle disagreements?

Michael explored these questions in a new ✨preprint✨ from his @ai2.bsky.social internship with me!

November 7, 2024 at 5:38 PM

Ben Newman

@benn9.bsky.social

✨EMNLP Paper! ✨
Have you ever constructed a table to organize your literature review process? Can we use LMs to generate these automatically?

We are excited to present ArxivDIGESTables 🍽️ a study of collecting, generating, and evaluating 🎓 scientific literature review tables 📃!

A screenshot of the first page of the paper discussed in the thread. Figure 1 contains a set of three cartoon papers with related text highlighted in three different colors. To its left, there's an arrow pointing to a cartoon table with a column corresponding to each color and a row corresponding to each paper.

November 11, 2024 at 5:37 PM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news