Ben Newman
benn9.bsky.social
750 followers 120 following 6 posts
NLP research - PhD student at UW
Reposted by Ben Newman
Did you know that LLMs suffer from serious mode collapse?

For example, if you ask models to tell you a joke, they almost always tell the same one. This holds across samples and even across model families!

Why does this happen? Can we improve it?
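One simple way to quantify the collapse described above, assuming you have already collected a batch of sampled completions (the helper name and the toy samples below are illustrative, not from the post):

```python
from collections import Counter

def mode_collapse_rate(samples):
    """Fraction of samples that repeat the single most common output.

    Near 1.0 means the model almost always produces the same
    completion; near 1/len(samples) means every sample is distinct.
    """
    if not samples:
        raise ValueError("need at least one sample")
    counts = Counter(s.strip().lower() for s in samples)
    return counts.most_common(1)[0][1] / len(samples)

# Toy illustration: 5 sampled "jokes", 4 identical.
samples = [
    "Why don't scientists trust atoms? They make up everything.",
    "Why don't scientists trust atoms? They make up everything.",
    "Why don't scientists trust atoms? They make up everything.",
    "Why don't scientists trust atoms? They make up everything.",
    "What do you call a fake noodle? An impasta.",
]
print(mode_collapse_rate(samples))  # 0.8
```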
Reposted by Ben Newman
Excited to share OLMo 2!

🐟 7B and 13B weights, trained up to 4-5T tokens, fully open data, code, etc
🐠 better architecture and recipe for training stability
🐡 staged training, with new data mix Dolmino🍕 added during annealing
🦈 state-of-the-art OLMo 2 Instruct models

#nlp #mlsky

links below👇
Reposted by Ben Newman
I'm recruiting 1-2 PhD students to work with me at the University of Colorado Boulder! Looking for creative students with interests in #NLP and #CulturalAnalytics.

Boulder is a lovely college town 30 minutes from Denver and 1 hour from Rocky Mountain National Park 😎

Apply by December 15th!
Reposted by Ben Newman
✨I am on the faculty job market in the 2024-2025 cycle!✨

My research centers on advancing Responsible AI, specifically enhancing factuality, robustness, and transparency in AI systems.

If you have relevant positions, let me know: lasharavichander.github.io. Please share/RT!
Reposted by Ben Newman
Why and when do preference annotators disagree? And how do reward models + LLM-as-Judge evaluators handle disagreements?

Michael explored these questions in a new ✨preprint✨ from his @ai2.bsky.social internship with me!
We also find that providing more table context (captions, in-text references) to models leads to higher recall when generating columns but does not help when generating values.
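Column recall here can be read as the fraction of reference-table columns that the model recovers. A minimal sketch, with the matching predicate left abstract (the function name and toy matcher are assumptions, not the paper's implementation):

```python
def column_recall(generated_cols, reference_cols, match):
    """Recall of reference (gold) table columns among generated ones.

    `match` is any predicate deciding whether a generated column
    counts as a hit for a reference one (exact string match,
    embedding similarity, etc.).
    """
    if not reference_cols:
        raise ValueError("reference schema is empty")
    hits = sum(
        any(match(g, r) for g in generated_cols) for r in reference_cols
    )
    return hits / len(reference_cols)

gen = ["method", "dataset", "f1 score"]
ref = ["Method", "Dataset", "Model size", "F1"]
# Toy matcher: case-insensitive prefix overlap in either direction.
match = lambda g, r: (g.lower().startswith(r.lower())
                      or r.lower().startswith(g.lower()))
print(column_recall(gen, ref, match))  # 0.75
```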
We find that using decontextualization with SBERT leads to a better evaluator than Llama 3, which hallucinates alignments.
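The alignment step behind such an evaluator can be sketched with plain cosine similarity. A toy illustration with hand-made vectors standing in for SBERT embeddings of decontextualized columns; the greedy matching and the 0.5 threshold are assumptions, not the paper's exact procedure:

```python
import numpy as np

def align_columns(gen_vecs, ref_vecs, threshold=0.5):
    """Greedy one-to-one alignment of generated to reference columns.

    gen_vecs, ref_vecs: dicts mapping column names to embedding
    vectors (toy stand-ins for SBERT embeddings here). Pairs below
    `threshold` cosine similarity are left unaligned rather than
    forced, which is what avoids hallucinated alignments.
    """
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    pairs = sorted(
        ((cos(gv, rv), g, r)
         for g, gv in gen_vecs.items()
         for r, rv in ref_vecs.items()),
        reverse=True,
    )
    used_g, used_r, alignment = set(), set(), {}
    for score, g, r in pairs:
        if score < threshold:
            break  # everything remaining is too dissimilar to align
        if g not in used_g and r not in used_r:
            alignment[g] = r
            used_g.add(g)
            used_r.add(r)
    return alignment

gen = {"accuracy": np.array([1.0, 0.1]), "corpus": np.array([0.1, 1.0])}
ref = {"F1": np.array([0.9, 0.2]), "dataset": np.array([0.0, 1.0]),
       "params": np.array([-1.0, 0.3])}
print(align_columns(gen, ref))  # "params" is correctly left unmatched
```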
We propose a two-step procedure for generating tables given the input papers:
1️⃣ Generate the schemas (sets of columns)
2️⃣ Fill in the values.
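The two steps above can be sketched as a pipeline skeleton. Both callables stand in for LLM calls and are hypothetical stand-ins, not the paper's actual prompts or models:

```python
def generate_table(papers, propose_schema, fill_value):
    """Two-step table generation sketch.

    `propose_schema(papers) -> list[str]` generates the schema
    (step 1); `fill_value(paper, column) -> str` fills one cell
    (step 2), called once per (paper, column) pair.
    """
    columns = propose_schema(papers)  # 1️⃣ generate the schema
    return {
        paper: {col: fill_value(paper, col) for col in columns}  # 2️⃣ values
        for paper in papers
    }

# Toy stand-ins for the two LLM calls.
papers = ["paper_A", "paper_B"]
schema_fn = lambda ps: ["Task", "Dataset"]
value_fn = lambda p, c: f"{c} of {p}"
table = generate_table(papers, schema_fn, value_fn)
print(table["paper_A"]["Task"])  # Task of paper_A
```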
This table generation task takes multiple papers as input and synthesizes them into a single output table. We collect a dataset of such tables and their associated papers, and augment the tables with additional context such as captions and in-text references.
✨EMNLP Paper! ✨
Have you ever constructed a table to organize your literature review process? Can we use LMs to generate these automatically?

We are excited to present ArxivDIGESTables 🍽️ a study of collecting, generating, and evaluating 🎓 scientific literature review tables 📃!