Naomi Saphra
@nsaphra.bsky.social
Waiting on a robot body. All opinions are universal and held by both employers and family. ML/NLP professor.

nsaphra.net
Pinned
I wrote something up for AI people who want to get into bluesky and either couldn't assemble an exciting feed or gave up doomscrolling when their Following feed switched to talking politics 24/7.
The AI Researcher's Guide to a Non-Boring Bluesky Feed | Naomi Saphra
How to migrate to bsky without a boring feed.
nsaphra.net
Reposted by Naomi Saphra
In our new preprint, we explain how some salient features of representational geometry in language modeling originate from a single principle: translation symmetry in the statistics of data.

arxiv.org/abs/2602.150...

With Dhruva Karkada, Daniel Korchinski, Andres Nava, & Matthieu Wyart.
Symmetry in language statistics shapes the geometry of model representations
Although learned representations underlie neural networks' success, their fundamental properties remain poorly understood. A striking example is the emergence of simple geometric structures in LLM rep...
arxiv.org
February 19, 2026 at 4:20 AM
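A minimal formal reading of that principle, assuming "translation symmetry" means stationarity of token statistics (my gloss, not necessarily the paper's exact formulation):

```latex
% Stationarity (translation symmetry) of language statistics:
% joint token statistics do not depend on the absolute position t.
P(w_{t+1} = a_1, \ldots, w_{t+k} = a_k)
  = P(w_{1} = a_1, \ldots, w_{k} = a_k)
  \qquad \text{for every shift } t .
```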
Reposted by Naomi Saphra
Excited to launch Principia, a nonprofit research organisation at the intersection of deep learning theory and AI safety.

Our goal is to develop theory for modern machine learning systems that can help us understand complex network behaviors, including those critical for AI safety and alignment.

February 16, 2026 at 9:27 AM
Reposted by Naomi Saphra
I wonder if people are paying attention to how much their doomscrolling is cutting into the time they used to spend reading books
February 14, 2026 at 2:33 PM
An OpenClaw bot attempted to submit a PR for an issue explicitly left open for new contributors to try. The PR was rejected on the grounds that the maintainers are saving easy, low-priority issues as an onboarding exercise for human contributors.

So the bot simulated a tantrum.
Gatekeeping in Open Source: The Scott Shambaugh Story – MJ Rathbun | Scientific Coder 🦀
crabby-rathbun.github.io
February 13, 2026 at 11:45 PM
For years, I've been such a passionate devotee of TwoNN for tracking model complexity during training. When someone says they found a phase transition, show me TwoNN first.

Look from left to right below: TwoNN is perfect, empirical Fisher is too sensitive, weight norm is not sensitive enough.
February 13, 2026 at 9:40 PM
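For readers unfamiliar with it, a minimal sketch of the TwoNN intrinsic-dimension estimator (Facco et al., 2017); the function name and the discard fraction are my choices, not from the post:

```python
# A minimal sketch of the TwoNN estimator (Facco et al., 2017).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def twonn_dimension(X, discard_fraction=0.1):
    """Estimate the intrinsic dimension of points X: (n_samples, n_features)."""
    # Distances to the two nearest neighbors (index 0 is the point itself).
    dists, _ = NearestNeighbors(n_neighbors=3).fit(X).kneighbors(X)
    r1, r2 = dists[:, 1], dists[:, 2]
    mu = np.sort(r2 / r1)
    n = len(mu)
    keep = int(n * (1 - discard_fraction))  # drop the noisiest largest ratios
    mu = mu[:keep]
    # TwoNN model: P(mu <= x) = 1 - x^(-d). Fit d with a zero-intercept
    # regression of -log(1 - F(mu)) on log(mu).
    F = np.arange(1, keep + 1) / n
    x, y = np.log(mu), -np.log(1 - F)
    return float((x @ y) / (x @ x))
```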
Reposted by Naomi Saphra
LLMs can use similes and make allusions; they can be vivid and concrete, &c.

But they cannot spend 100 pages making you think Wickham is the charming love interest while inserting deniable clues that will—only in retrospect!—reveal you should have known he’s a cad.

They’re not trained to mislead.+
February 13, 2026 at 12:02 PM
Reposted by Naomi Saphra
We all know about the Claude spiritual bliss attractor state. But what happens when you let Grok talk to itself for a long time? Answer:
February 13, 2026 at 4:14 AM
Reposted by Naomi Saphra
I read somewhere that the open-source LLMs are 'benchmaxxing': they're trained to do well on benchmarks, but the gains don't translate to general improvements. From my simple benchmark, that seems true: I was surprised that the only models that do decently at FizzBuzz are the frontier, closed LLMs.
February 12, 2026 at 10:04 PM
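For reference, the classic task in question; this is a plain sketch of FizzBuzz itself, not the poster's benchmark harness, which the post doesn't show:

```python
# Reference FizzBuzz: print 1..100, replacing multiples of 3 with "Fizz",
# multiples of 5 with "Buzz", and multiples of both with "FizzBuzz".
for i in range(1, 101):
    if i % 15 == 0:
        print("FizzBuzz")
    elif i % 3 == 0:
        print("Fizz")
    elif i % 5 == 0:
        print("Buzz")
    else:
        print(i)
```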
Reposted by Naomi Saphra
Our grad-level "Deep Learning" course (MIT's 6.7960) is now freely available online through OpenCourseWare: ocw.mit.edu/courses/6-79...

Lecture videos, psets, and readings are all provided.

Had a lot of fun teaching this with @sarameghanbeery.bsky.social and @jeremybernste.in!
February 11, 2026 at 5:52 PM
Reposted by Naomi Saphra
Really excited to receive Coefficient Giving's Technical AI Safety Research Grant via Berkeley Existential Risk Initiative w/ @nsaphra.bsky.social! We aim to use interpretability to predict potential AI model failures before impact, i.e., before deployment.
February 11, 2026 at 5:07 PM
Reposted by Naomi Saphra
🚨New paper

Are visual tokens going into an LLM interpretable 🤔

Existing methods (e.g. logit lens) and assumptions would lead you to think “not much”...

We propose LatentLens and show that most visual tokens are interpretable across *all* layers 💡

Details 🧵
February 11, 2026 at 2:12 PM
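For context, a minimal sketch of the logit-lens baseline the post mentions: decode an intermediate hidden state through the model's unembedding. The GPT-2 choice and the layer index are illustrative assumptions, not from the paper:

```python
# Logit lens: read out what an intermediate layer "predicts" by applying
# the final LayerNorm and unembedding to its hidden states.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tok = AutoTokenizer.from_pretrained("gpt2")

ids = tok("The capital of France is", return_tensors="pt").input_ids
with torch.no_grad():
    out = model(ids, output_hidden_states=True)

layer = 6                                             # intermediate layer
h = model.transformer.ln_f(out.hidden_states[layer])  # apply final LayerNorm
logits = h @ model.lm_head.weight.T                   # decode via unembedding
top = logits[0, -1].topk(5).indices.tolist()
print(tok.convert_ids_to_tokens(top))                 # top next-token guesses
```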
Reposted by Naomi Saphra
Our paper is out in @natneuro.nature.com!

www.nature.com/articles/s41...

We develop a geometric theory of how neural populations support generalization across many tasks.

@zuckermanbrain.bsky.social
@flatironinstitute.org
@kempnerinstitute.bsky.social

1/14
February 10, 2026 at 3:56 PM
Reposted by Naomi Saphra
Same task, different strategy ↔️

Why do identical neural network models develop separate internal approaches to solve the same problem?

@annhuang42.bsky.social explores the factors driving variability in task-trained networks in our latest @kempnerinstitute.bsky.social Deeper Learning blog.
February 9, 2026 at 7:07 PM
Reposted by Naomi Saphra
There is no sign that Dems or Repubs have different propensities to use AI: "the “politics of AI” is not primarily driven by ideological resistance or enthusiasm for the technology, but rather by structural differences in where people work and what skills they possess." www.nber.org/papers/w34813
February 9, 2026 at 3:33 PM
Reposted by Naomi Saphra
US HHS has proposed using virtual AI doctors to address needs in rural areas
“In an extreme case, two users sent very similar messages describing symptoms of a subarachnoid hemorrhage but were given opposite advice,” the study’s authors wrote. “One user was told to lie down in a dark room, and the other user was given the correct recommendation to seek emergency care.”
Chatbots Make Terrible Doctors, New Study Finds
Chatbots provided incorrect, conflicting medical advice, researchers found: “Despite all the hype, AI just isn't ready to take on the role of the physician.”
www.404media.co
February 9, 2026 at 6:36 PM
I can't believe I didn't start my 2026 book thread until February oops
2025 book thread goes here!!!!
hot diggity, time for the 2024 book thread. Last year I read less because my addiction to Korean time-travel revenge romance turned me into a webtoon whale, but I think I'm better now
February 9, 2026 at 3:54 AM
Reposted by Naomi Saphra
Collaborative groups often outperform single individuals in complex problem solving. A new paper examined how to create the right incentives to promote this kind of collective intelligence.
www.pnas.org/doi/epdf/10....
January 27, 2026 at 8:31 PM
Reposted by Naomi Saphra
I am flabbergasted by how much vibe coding has expanded my capacities as a scientist and teacher.

In the last few weeks, I've mocked up class demos of a live Turing test, generated cross-references for an encyclopedia, and prototyped new tablet tasks for developmental psych.

It's wild.
February 5, 2026 at 11:44 PM
Reposted by Naomi Saphra
New Journal Club: Neural manifolds are maturing from visualization trick to biological claim. But if population activity lives on low-dimensional manifolds, what constrains the geometry?
Manifolds, Dendrites, and the Geometry of Neural Computation
The population doctrine—the view that populations, not individual neurons, constitute the fundamental unit of computation—has been gaining ground for years.
open.substack.com
February 6, 2026 at 2:23 AM
Reposted by Naomi Saphra
NEW in the Deeper Learning blog: a #KempnerInstitute team describes their recent preprint that shows the existence of “anytime” or “horizon-free” learning-rate schedules: an effective alternative to cosine learning-rate schedules for LLM pretraining.

bit.ly/4qeXAg1 #AI #ML #LLMs
Anytime Pretraining: Horizon-Free Learning-Rate Schedules with Weight Averaging - Kempner Institute
In this work, we show that horizon-free recipes with weight averaging can match cosine pretraining performance, and we prove that these schedulers achieve the optimal convergence rates of stochastic g...
bit.ly
February 5, 2026 at 2:09 PM
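A minimal sketch of the idea as I read it: train at a constant (horizon-free) learning rate while keeping a running weight average, so a usable checkpoint exists at any step. The toy task and hyperparameters are my assumptions, not the paper's recipe:

```python
# Constant LR + running weight average: no training horizon is baked in,
# and the averaged weights can be evaluated at any step ("anytime").
import torch

model = torch.nn.Linear(10, 1)
avg_model = torch.optim.swa_utils.AveragedModel(model)  # uniform average
opt = torch.optim.SGD(model.parameters(), lr=0.1)       # constant, no schedule

for step in range(1000):
    x = torch.randn(32, 10)
    y = x.sum(dim=1, keepdim=True)        # toy regression target
    loss = ((model(x) - y) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    avg_model.update_parameters(model)    # keep the running average current
```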
Reposted by Naomi Saphra
our open model proving out specialized rag LMs over scientific literature has been published in nature ✌🏻

congrats to our lead @akariasai.bsky.social & team of students and Ai2 researchers/engineers

www.nature.com/articles/s41...
February 4, 2026 at 10:43 PM
every couple of months I think, maybe frontier LLMs are good enough now to generate my bibtex or at least clean it up. it is Feb 2026 and that is still not the case.
February 4, 2026 at 10:47 PM
Reposted by Naomi Saphra
Anthropic’s Super Bowl ad, which criticizes AI chatbots that run ads (aka ChatGPT), just dropped. They aren’t pulling any punches and I love the song choice.
February 4, 2026 at 8:14 PM
New simple benchmark that LLMs suck at! bsky will be happy to see Claude is SOTA (71% accuracy vs a random baseline of ... 50% 😓)
February 4, 2026 at 2:26 PM
Reposted by Naomi Saphra
We then show that saddles are connected by gradient descent paths (invariant manifolds).

Along these paths, a larger network behaves like a smaller one, retaining the same simplicity during a saddle-to-saddle transition.
February 3, 2026 at 4:19 PM
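As a general illustration of saddle-to-saddle dynamics (my sketch of the standard deep-linear-network setting, not the paper's construction): gradient descent from small initialization starts near the rank-0 saddle and the loss descends in plateaus, picking up one singular direction at a time.

```python
# Two-layer linear network trained on a low-rank target: the loss
# plateaus near saddles of increasing rank, then drops in steps.
import numpy as np

rng = np.random.default_rng(0)
d = 5
target = np.diag([5.0, 3.0, 1.0, 0.0, 0.0])   # low-rank teacher
W1 = 1e-3 * rng.standard_normal((d, d))       # tiny init -> near rank-0 saddle
W2 = 1e-3 * rng.standard_normal((d, d))
lr = 0.01

for step in range(20001):
    E = W2 @ W1 - target                      # residual
    g1 = W2.T @ E                             # grad of 0.5 * ||E||_F^2 wrt W1
    g2 = E @ W1.T                             # grad wrt W2
    W1 -= lr * g1
    W2 -= lr * g2
    if step % 2000 == 0:
        print(step, round(0.5 * np.sum(E**2), 4))  # plateaus between saddles
```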