Lightnews — Scholar-powered news

Reposted by David Mortensen

Amanda Bertsch

@abertsch.bsky.social

Can LLMs accurately aggregate information over long, information-dense texts? Not yet…

We introduce Oolong, a dataset of simple-to-verify information aggregation questions over long inputs. No model achieves >50% accuracy at 128K on Oolong!

Performance of a sweep of models on Oolong-synth and Oolong-real. Performance decreases with increasing context length, sometimes steeply.

November 7, 2025 at 5:07 PM

Reposted by David Mortensen

Kshitish Ghate

@kghate.bsky.social

🚨New paper: Reward Models (RMs) are used to align LLMs, but can they be steered toward user-specific value/style preferences?
With EVALUESTEER, we find even the best RMs we tested exhibit their own value/style biases, and are unable to align with a user >25% of the time. 🧵

October 14, 2025 at 3:59 PM

Reposted by David Mortensen

Andy Liu

@andyliu.bsky.social

🚨New Paper: LLM developers aim to align models with values like helpfulness or harmlessness. But when these conflict, which values do models choose to support? We introduce ConflictScope, a fully-automated evaluation pipeline that reveals how models rank values under conflict.
(📷 xkcd)

October 2, 2025 at 4:04 PM

Reposted by David Mortensen

Niyati Bafna

@niyatibafna.bsky.social

🔈When LLMs solve tasks with a mid-to-low resource input or target language, their output quality is poor. We know that. But can we put our finger on what breaks inside the LLM? We introduce the 💥 translation barrier hypothesis 💥 for failed multilingual generation with LLMs. arxiv.org/abs/2506.22724

July 4, 2025 at 5:05 PM

Reposted by David Mortensen

Valentin Hofmann

@valentinhofmann.bsky.social

Thrilled to share that this is out in @pnas.org today! 🎉

We show that linguistic generalization in language models can be due to underlying analogical mechanisms.

Shoutout to my amazing co-authors @weissweiler.bsky.social, @davidrmortensen.bsky.social, Hinrich Schütze, and Janet Pierrehumbert!

Valentin Hofmann @valentinhofmann.bsky.social · Dec 5

📢 New paper 📢

What generalization mechanisms shape the language skills of LLMs?

Prior work has claimed that LLMs learn language via rules.

We revisit the question and find that superficially rule-like behavior of LLMs can be traced to underlying analogical processes.

🧵

May 9, 2025 at 6:29 PM

Reposted by David Mortensen

Lindia Tjuatja

@lindiatjuatja.bsky.social

When it comes to text prediction, where does one LM outperform another? If you've ever worked on LM evals, you know this question is a lot more complex than it seems. In our new #acl2025 paper, we developed a method to find fine-grained differences between LMs:

🧵1/9

June 9, 2025 at 1:47 PM

Reposted by David Mortensen

Syeda Nahida Akter

@reasyaay.bsky.social

RL boosts LLM reasoning—but why stop at math & code? 🤔
Meet Nemotron-CrossThink—a method to scale RL-based self-learning across law, physics, social science & more.

🔥Resulting in a model that reasons broadly, adapts dynamically, & uses 28% fewer tokens for correct answers!
🧵↓

May 1, 2025 at 5:42 PM

Reposted by David Mortensen

Verena Blaschke

@verenablaschke.bsky.social

On my way to #NAACL2025 where I'll give a keynote at the noisy text workshop (WNUT), presenting some of the challenges & methods for dialect NLP + also discussing dialect speakers' perspectives!

🗨️ Beyond “noisy” text: How (and why) to process dialect data
🗓️ Saturday, May 3, 9:30–10:30

April 29, 2025 at 9:17 AM

Reposted by David Mortensen

Kshitish Ghate

@kghate.bsky.social

Excited to announce our #NAACL2025 Oral paper! 🎉✨

We carried out the largest systematic study so far to map the links between upstream choices, intrinsic bias, and downstream zero-shot performance across 131 CLIP Vision-language encoders, 26 datasets, and 55 architectures!

April 29, 2025 at 7:11 PM

Reposted by David Mortensen

Kwanghee Choi

@juice500ml.bsky.social

Can self-supervised models 🤖 understand allophony 🗣? Excited to share my new #NAACL2025 paper: Leveraging Allophony in Self-Supervised Speech Models for Atypical Pronunciation Assessment arxiv.org/abs/2502.07029 (1/n)

April 29, 2025 at 5:00 PM

Reposted by David Mortensen

Nishant Subramani @ ACL

@nsubramani23.bsky.social

🚀 Excited to share a new interp+agents paper: 🐭🐱 MICE for CATs: Model-Internal Confidence Estimation for Calibrating Agents with Tools appearing at #NAACL2025

This was work done @msftresearch.bsky.social last summer with Jason Eisner, Justin Svegliato, Ben Van Durme, Yu Su, and Sam Thomson

1/🧵

April 29, 2025 at 1:41 PM

Reposted by David Mortensen

Xuhui Zhou

@nlpxuhui.bsky.social

When interacting with ChatGPT, have you wondered if they would ever "lie" to you? We found that under pressure, LLMs often choose deception. Our new #NAACL2025 paper, "AI-LIEDAR," reveals models were truthful less than 50% of the time when faced with utility-truthfulness conflicts! 🤯 1/

April 28, 2025 at 8:36 PM

Reposted by David Mortensen

Neel Bhandari

@neelbhandari.bsky.social

1/🚨 𝗡𝗲𝘄 𝗽𝗮𝗽𝗲𝗿 𝗮𝗹𝗲𝗿𝘁 🚨
RAG systems excel on academic benchmarks - but are they robust to variations in linguistic style?

We find RAG systems are brittle. Small shifts in phrasing trigger cascading errors, driven by the complexity of the RAG pipeline 🧵

April 17, 2025 at 7:55 PM

Reposted by David Mortensen

Chise

@sailorrooscout.bsky.social

THIS IS HUGE! Researchers at McMaster University have discovered a NEW peptide antibiotic that targets a broad range of disease-causing bacteria INCLUDING those RESISTANT to existing antibiotics. This discovery marks the first potential new class of antibiotics in NEARLY 30 YEARS. 🧪🧵⬇️

March 31, 2025 at 4:00 PM

Reposted by David Mortensen

Naomi Saphra

@nsaphra.bsky.social

Life update: I'm starting as faculty at Boston University @bucds.bsky.social in 2026! BU has SCHEMES for LM interpretability & analysis, I couldn't be more pumped to join a burgeoning supergroup w/ @najoung.bsky.social @amuuueller.bsky.social. Looking for my first students, so apply and reach out!

CDS building which looks like a jenga tower

March 27, 2025 at 2:24 AM

David Mortensen

@davidrmortensen.bsky.social

You should read Article 1 of the United States Constitution. It's a trip.

March 19, 2025 at 4:49 AM

Reposted by David Mortensen

Johann-Mattis List

@lingulist.de

New preprint by @annikatjuka.bsky.social, Robert Forkel, Christoph Rzymski, and myself available, presenting a new version of the Database of Cross-Linguistic Colexifications (CLICS).

"Advancing the Database of Cross-Linguistic Colexifications with New Workflows and Data"

arxiv.org/abs/2503.11377

Advancing the Database of Cross-Linguistic Colexifications with New Workflows and Data

Lexical resources are crucial for cross-linguistic analysis and can provide new insights into computational models for natural language learning. Here, we present an advanced database for comparative ...

arxiv.org

March 17, 2025 at 10:25 AM

Reposted by David Mortensen

Connor Ewing

@cmewing.bsky.social

Finally found a way to shorten faculty meetings.

March 16, 2025 at 4:30 PM

Reposted by David Mortensen

Nick Fleisher

@nickfleisher.bsky.social

No student anywhere in America has said something as antisemitic as this

Justin Baragona @justinbaragona.bsky.social · Mar 12

Donald Trump: "Schumer is a Palestinian as far as I am concerned. He has become a Palestinian. He used to be Jewish. He is not Jewish anymore. He is a Palestinian."

March 12, 2025 at 6:12 PM

Reposted by David Mortensen

dchiang.bsky.social

@dchiang.bsky.social

The meeting will feature keynote addresses by @mohitbansal.bsky.social, @davidrmortensen.bsky.social, Karen Livescu, and Heng Ji. Plus all of your great talks and posters! nlp.nd.edu/msld25

Midwest Speech and Language Days 2025

nlp.nd.edu

March 8, 2025 at 6:35 PM

Reposted by David Mortensen

Erin Jean Warde

@erinjeanwarde.bsky.social

I’ve been thinking about this reading from Isaiah 58 since I heard it at the Ash Wednesday service today.

“Is not this the fast that I choose:
to loose the bonds of injustice,
to undo the thongs of the yoke,
to let the oppressed go free,
and to break every yoke?

March 6, 2025 at 12:16 AM

Reposted by David Mortensen

Chris Hayes

@chrislhayes.bsky.social

“Again, the mice used for clinical purposes did not undergo gender transition.”

www.rollingstone.com/politics/pol...

Trump Decried Millions Spent 'Making Mice Transgender.' It Was Cancer and Asthma Research

President Trump falsely claimed that Biden spent $8 million on 'making mice transgender,' but the real research was for human health.

www.rollingstone.com

March 6, 2025 at 12:36 AM

Reposted by David Mortensen

U.S. Representative Al Green

@algreen.house.gov

Today, the House GOP censured me for speaking out for the American people against @POTUS’s plan to cut Medicaid. I accept the consequences of my actions, but I refuse to stay silent in the face of injustice. #WeShallOvercome x.com/repalgreen/s...


March 6, 2025 at 9:23 PM
