Rabiraj Banerjee
rabirajb.bsky.social
I focus on data learnability, model representation and uncertainty for subjective NLP tasks
PhDing on Interpretable NLP + CSS @gesis.org. Prev: Master's student + researcher at @ubuffalo.bsky.social and Sr. Data Scientist at Coursera
Reposted by Rabiraj Banerjee
I’m excited to share our Findings of EMNLP paper w/ @cocoscilab.bsky.social , @rtommccoy.bsky.social, and @rdhawkins.bsky.social !

Language models, unlike humans, require large amounts of data, which suggests the need for an inductive bias.
But what kind of inductive biases do we need?
November 7, 2025 at 9:17 AM
Reposted by Rabiraj Banerjee
In the year since LRMs ("reasoning models") hit the scene, we have been trying to understand, analyze and demystify them.. Here are our efforts to date--conveniently all in one place..👇

www.linkedin.com/posts/subbar...
September 14, 2025 at 10:00 PM
Reposted by Rabiraj Banerjee
A Survey of Reinforcement Learning for Large Reasoning Models

Five sections:

- Foundational Components
- Foundational Problems
- Training Resources
- Applications
- Future Directions
September 11, 2025 at 11:42 PM
Reposted by Rabiraj Banerjee
When reading AI reasoning text (aka CoT), we (humans) form a narrative about the underlying computation process, which we take as a transparent explanation of model behavior. But what if our narratives are wrong? We measure that and find it usually is.

Now on arXiv: arxiv.org/abs/2508.16599
Humans Perceive Wrong Narratives from AI Reasoning Texts
August 27, 2025 at 9:30 PM
Reposted by Rabiraj Banerjee
Great interview with @stevenstrogatz.com, featuring a lot of discussion of research advising. Parts reminded me of @eegilbert.org and @informor.bsky.social's (excellent) guides to PhD mentorship, with a big focus on ideation.
Eric's: docs.google.com/document/d/1...
Mor's: s.tech.cornell.edu/phd-syllabus/
August 24, 2025 at 2:26 PM
Reposted by Rabiraj Banerjee
We try to avoid self-promoting too much, but we (with @sjgreenwood.bsky.social) built a personalized feed with posts about papers from your network. Many people say it's the closest they can get to old academic twitter, and I hope you enjoy it and share with others too!

bsky.app/profile/pape...
**Please repost** If you're enjoying Paper Skygest -- our personalized feed of academic content on Bluesky -- we'd appreciate you reposting this! We’ve found that the most effective way for us to reach new users and communities is through users sharing it with their network
August 20, 2025 at 12:46 PM
Reposted by Rabiraj Banerjee
🤖 But wait! There's more! You can check out @shiraamitchell.bsky.social 's most recent update on the details of Calibration, posted yesterday! statmodeling.stat.columbia.edu/2025/08/12/s...
August 13, 2025 at 6:38 PM
Reposted by Rabiraj Banerjee
#acl2025 Anyone get a good quote of Phil Resnik's last comment?

Context: (some? all?) panelists and he agree the field needs more deep, careful research on smaller models to do better science. Everyone is frustrated with the impossibility of large-scale pretraining experiments.
July 28, 2025 at 3:24 PM
@kennyjoseph.bsky.social, Kenny, check this thread out
What are your favorite recent papers on using LMs for annotation (especially in a loop with human annotators), synthetic data for task-specific prediction, active learning, and similar?

Looking for practical methods for settings where human annotations are costly.

A few examples in thread ↴
July 24, 2025 at 11:38 AM
Reposted by Rabiraj Banerjee
What are your favorite recent papers on using LMs for annotation (especially in a loop with human annotators), synthetic data for task-specific prediction, active learning, and similar?

Looking for practical methods for settings where human annotations are costly.

A few examples in thread ↴
July 23, 2025 at 8:10 AM
Reposted by Rabiraj Banerjee
UB's new Department of AI and Society is hiring faculty across ranks (Assistant, Associate, Full Professor). We’re looking for transdisciplinary scholars interested in building AI by society, for society. Start dates begin Fall 2025.

More info: www.ubjobs.buffalo.edu/postings/57734
Assistant, Associate or Full Professor, AI & Society
July 17, 2025 at 4:11 PM
Reposted by Rabiraj Banerjee
Check out our take on Chain-of-Thought.
I really like this paper as a survey on the current literature on what CoT is, but more importantly on what it's not.
It also serves as a cautionary tale to the (apparently quite common) misuse of CoT as an interpretable method.
Excited to share our paper: "Chain-of-Thought Is Not Explainability"! We unpack a critical misconception in AI: models explaining their steps (CoT) aren't necessarily revealing their true reasoning. Spoiler: the transparency can be an illusion. (1/9) 🧵
July 1, 2025 at 5:45 PM
Reposted by Rabiraj Banerjee
Hi everyone. I'm excited to announce that I will be organizing a 2nd Summit on Responsible Computing, AI, and Society rcais.github.io October 27-29, 2025.

We will explore the future of computing for health, sustainability, human-centered AI, and policy.

Please consider submitting a 1-page abstract
July 1, 2025 at 4:41 PM
So ICWSM concluded today, and it was a blast. It was a great honor to attend @icwsm.bsky.social in Copenhagen and present my work with @kennyjoseph.bsky.social and other colleagues. The paper link is here:
ojs.aaai.org/index.php/IC...
Measuring Dimensions of Self-Presentation in Twitter Bios and their Links to Misinformation Sharing
June 26, 2025 at 9:39 PM
Reposted by Rabiraj Banerjee
This is great! The idea is somewhat obvious (good!), and I'm sure many have toyed with the connection to learning-to-rank. However, no work had developed it. This should be relevant for constructing valid PIs from just preferential feedback. openreview.net/pdf?id=ENJd3...
June 26, 2025 at 11:17 AM
Reposted by Rabiraj Banerjee
🚨 New preprint! 🚨
Phase transitions! We love to see them during LM training. Syntactic attention structure, induction heads, grokking; they seem to suggest the model has learned a discrete, interpretable concept. Unfortunately, they’re pretty rare—or are they?
June 24, 2025 at 6:29 PM
Reposted by Rabiraj Banerjee
Generative language systems are everywhere, and many of them stereotype, demean, or erase particular social groups.
June 16, 2025 at 9:49 PM
Reposted by Rabiraj Banerjee
Alright, people, let's be honest: GenAI systems are everywhere, and figuring out whether they're any good is a total mess. Should we use them? Where? How? Do they need a total overhaul?

(1/6)
June 15, 2025 at 12:20 AM
Reposted by Rabiraj Banerjee
🧵 1/ Social networks are full of hate. Can artificial intelligence help us detect it…

⚖️ without discriminating,
🚫 without reinforcing stereotypes,
🔁 and without learning to hate itself?

That is the big question of my thesis.

👇 I tell the story in this #HiloTesis @crueuniversidades.bsky.social @filarramendi.bsky.social
June 10, 2025 at 9:20 AM
Reposted by Rabiraj Banerjee
As we go through a lot of excitement about RL recently with lots of cool work/results, here is a reminder that RL with a reverse KL-regularizer to the base model cannot learn any new skills that were not already present in the base model. It can only amplify the weak skills.
👇
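The support argument behind this claim can be written out in a few lines (a standard derivation, sketched here for convenience rather than taken from the thread itself):

```latex
% Reverse-KL-regularized RL objective over outputs $y$:
\max_{\pi}\; \mathbb{E}_{y \sim \pi}\!\left[ r(y) \right]
  \;-\; \beta\, \mathrm{KL}\!\left( \pi \,\|\, \pi_{\mathrm{base}} \right)
% Its closed-form optimum is a Boltzmann reweighting of the base policy:
\pi^{*}(y) \;=\; \frac{1}{Z}\, \pi_{\mathrm{base}}(y)\,
  \exp\!\left( r(y)/\beta \right),
\qquad
Z \;=\; \sum_{y} \pi_{\mathrm{base}}(y)\, \exp\!\left( r(y)/\beta \right)
% Consequence: $\pi_{\mathrm{base}}(y) = 0 \;\Rightarrow\; \pi^{*}(y) = 0$,
% so behaviors with zero base-model probability stay unreachable, while
% behaviors with small but nonzero probability can only be amplified.
```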
May 27, 2025 at 5:40 PM
Reposted by Rabiraj Banerjee
My path into AI
The sort of small wins that accumulate into a real career in AI.
When I started grad school, AI profs didn't have space for me in their groups, and when I finished I had no papers at NeurIPS/ICLR/ICML, yet the process can still work.
www.interconnects.ai/p/my-path-in...
How I got here. Building a career brick by brick over 8 years.
May 14, 2025 at 2:29 PM
Reposted by Rabiraj Banerjee
I posted this on LinkedIn too and it has over 600 reactions there, with the caveat that I don't know how many are from bots.
I've updated my LaTeX CV template for PhD students. It now uses the article document class, which makes the CV better for learning LaTeX and fixes several small formatting irregularities. shomir.net/wilhom_rosin...
Shomir Wilson - Example CV for PhD students
An example CV for PhD students, with LaTeX code.
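For anyone curious what an article-class CV looks like, here is a minimal illustrative skeleton (not Shomir's actual template; the names and sections are made up):

```latex
\documentclass[11pt]{article}
\usepackage[margin=1in]{geometry} % sensible page margins for a CV
\usepackage{enumitem}             % compact, tunable lists
\pagestyle{empty}                 % no page numbers on a short CV

\begin{document}
\begin{center}
  {\LARGE Jane Doe}\\[2pt]
  jane@example.edu \quad\textbullet\quad example.edu/\textasciitilde jane
\end{center}

\section*{Education}
\begin{itemize}[leftmargin=*, itemsep=2pt]
  \item Ph.D., Computer Science, Example University, 2023--present
  \item B.S., Mathematics, Example College, 2019--2023
\end{itemize}

\section*{Publications}
\begin{itemize}[leftmargin=*, itemsep=2pt]
  \item J.~Doe. ``A Placeholder Paper Title.'' \emph{Venue}, 2024.
\end{itemize}
\end{document}
```

Using the plain article class keeps the markup close to what new LaTeX users already learn, which is the point made in the post.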
May 12, 2025 at 8:03 PM
Reposted by Rabiraj Banerjee
Out of all the mountains of work on "debiasing" word2vec, BERT, and LLMs, what anecdotes or evidence do we have that debiasing techniques have *actually been used in practice*, in industry or research?

(not referring to "cultural alignment" techniques)

(bonus if used and *to good effect*)
May 5, 2025 at 9:17 PM