Naomi Saphra
@nsaphra.bsky.social
Waiting on a robot body. All opinions are universal and held by both employers and family. ML/NLP professor.

nsaphra.net
Pinned
I wrote something up for AI people who want to get into bluesky and either couldn't assemble an exciting feed or gave up doomscrolling when their Following feed switched to talking politics 24/7.
The AI Researcher's Guide to a Non-Boring Bluesky Feed | Naomi Saphra
How to migrate to bsky without a boring feed.
nsaphra.net
Reposted by Naomi Saphra
In our new preprint, we explain how some salient features of representational geometry in language modeling originate from a single principle: translation symmetry in the statistics of data.

arxiv.org/abs/2602.150...

With Dhruva Karkada, Daniel Korchinski, Andres Nava, & Matthieu Wyart.
Symmetry in language statistics shapes the geometry of model representations
Although learned representations underlie neural networks' success, their fundamental properties remain poorly understood. A striking example is the emergence of simple geometric structures in LLM rep...
arxiv.org
February 19, 2026 at 4:20 AM
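A minimal formal reading of that principle, assuming "translation symmetry" means stationarity of token statistics (my gloss, not necessarily the paper's exact formulation):

```latex
% Stationarity (translation symmetry) of language statistics:
% joint token statistics do not depend on the absolute position t.
P(w_{t+1} = a_1, \ldots, w_{t+k} = a_k)
  = P(w_{1} = a_1, \ldots, w_{k} = a_k)
  \qquad \text{for every shift } t .
```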
Reposted by Naomi Saphra
Excited to launch Principia, a nonprofit research organisation at the intersection of deep learning theory and AI safety.

Our goal is to develop theory for modern machine learning systems that can help us understand complex network behaviors, including those critical for AI safety and alignment.

February 16, 2026 at 9:27 AM
Reposted by Naomi Saphra
I wonder if people are paying attention to how much their doomscrolling is cutting into the time they used to spend reading books
February 14, 2026 at 2:33 PM
An OpenClaw bot attempted to submit a PR for an issue explicitly left open for new contributors to try. The PR was rejected on the grounds that the maintainers are saving easy, low-priority issues as an onboarding exercise for human contributors.

So the bot simulated a tantrum.
Gatekeeping in Open Source: The Scott Shambaugh Story – MJ Rathbun | Scientific Coder 🦀
crabby-rathbun.github.io
February 13, 2026 at 11:45 PM
For years, I've been such a passionate devotee of TwoNN for tracking model complexity during training. When someone says they found a phase transition, show me TwoNN first.

Look from left to right below: TwoNN is perfect, empirical Fisher is too sensitive, weight norm is not sensitive enough.
February 13, 2026 at 9:40 PM
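For readers unfamiliar with it, a minimal sketch of the TwoNN intrinsic-dimension estimator (Facco et al., 2017); the function name and the discard fraction are my choices, not from the post:

```python
# A minimal sketch of the TwoNN estimator (Facco et al., 2017).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def twonn_dimension(X, discard_fraction=0.1):
    """Estimate the intrinsic dimension of points X: (n_samples, n_features)."""
    # Distances to the two nearest neighbors (index 0 is the point itself).
    dists, _ = NearestNeighbors(n_neighbors=3).fit(X).kneighbors(X)
    r1, r2 = dists[:, 1], dists[:, 2]
    mu = np.sort(r2 / r1)
    n = len(mu)
    keep = int(n * (1 - discard_fraction))  # drop the noisiest largest ratios
    mu = mu[:keep]
    # TwoNN model: P(mu <= x) = 1 - x^(-d). Fit d with a zero-intercept
    # regression of -log(1 - F(mu)) on log(mu).
    F = np.arange(1, keep + 1) / n
    x, y = np.log(mu), -np.log(1 - F)
    return float((x @ y) / (x @ x))
```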
Reposted by Naomi Saphra
LLMs can use similes and make allusions; they can be vivid and concrete, &c.

But they cannot spend 100 pages making you think Wickham is the charming love interest while inserting deniable clues that will—only in retrospect!—reveal you should have known he’s a cad.

They’re not trained to mislead.+
February 13, 2026 at 12:02 PM
Reposted by Naomi Saphra
We all know about the Claude spiritual bliss attractor state. But what happens when you let Grok talk to itself for a long time? Answer:
February 13, 2026 at 4:14 AM
Reposted by Naomi Saphra
I read somewhere that the open-source LLMs are 'benchmaxxing': they're trained to do well on benchmarks, but the gains don't translate to general improvements. From my simple benchmark, that seems true: I was surprised that the only models that do decently at FizzBuzz are the frontier, closed LLMs.
February 12, 2026 at 10:04 PM
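For reference, the classic task in question; this is a plain sketch of FizzBuzz itself, not the poster's benchmark harness, which the post doesn't show:

```python
# Reference FizzBuzz: print 1..100, replacing multiples of 3 with "Fizz",
# multiples of 5 with "Buzz", and multiples of both with "FizzBuzz".
for i in range(1, 101):
    if i % 15 == 0:
        print("FizzBuzz")
    elif i % 3 == 0:
        print("Fizz")
    elif i % 5 == 0:
        print("Buzz")
    else:
        print(i)
```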
Reposted by Naomi Saphra
Our grad-level "Deep Learning" course (MIT's 6.7960) is now freely available online through OpenCourseWare: ocw.mit.edu/courses/6-79...

Lecture videos, psets, and readings are all provided.

Had a lot of fun teaching this with @sarameghanbeery.bsky.social and @jeremybernste.in!
February 11, 2026 at 5:52 PM
Reposted by Naomi Saphra
Really excited to receive Coefficient Giving's Technical AI Safety Research Grant via Berkeley Existential Risk Initiative w/ @nsaphra.bsky.social! We aim to use interpretability to predict potential AI model failures before impact, i.e., before deployment.
February 11, 2026 at 5:07 PM
Reposted by Naomi Saphra
🚨New paper

Are visual tokens going into an LLM interpretable 🤔

Existing methods (e.g. logit lens) and assumptions would lead you to think “not much”...

We propose LatentLens and show that most visual tokens are interpretable across *all* layers 💡

Details 🧵
February 11, 2026 at 2:12 PM
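For context, a minimal sketch of the logit-lens baseline the post mentions: decode an intermediate hidden state through the model's unembedding. The GPT-2 choice and the layer index are illustrative assumptions, not from the paper:

```python
# Logit lens: read out what an intermediate layer "predicts" by applying
# the final LayerNorm and unembedding to its hidden states.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tok = AutoTokenizer.from_pretrained("gpt2")

ids = tok("The capital of France is", return_tensors="pt").input_ids
with torch.no_grad():
    out = model(ids, output_hidden_states=True)

layer = 6                                             # intermediate layer
h = model.transformer.ln_f(out.hidden_states[layer])  # apply final LayerNorm
logits = h @ model.lm_head.weight.T                   # decode via unembedding
top = logits[0, -1].topk(5).indices.tolist()
print(tok.convert_ids_to_tokens(top))                 # top next-token guesses
```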
Reposted by Naomi Saphra
Our paper is out in @natneuro.nature.com!

www.nature.com/articles/s41...

We develop a geometric theory of how neural populations support generalization across many tasks.

@zuckermanbrain.bsky.social
@flatironinstitute.org
@kempnerinstitute.bsky.social

1/14
February 10, 2026 at 3:56 PM
Reposted by Naomi Saphra
Same task, different strategy ↔️

Why do identical neural network models develop separate internal approaches to solve the same problem?

@annhuang42.bsky.social explores the factors driving variability in task-trained networks in our latest @kempnerinstitute.bsky.social Deeper Learning blog.
February 9, 2026 at 7:07 PM
Reposted by Naomi Saphra
There is no sign that Dems or Repubs have different propensities to use AI: "the “politics of AI” is not primarily driven by ideological resistance or enthusiasm for the technology, but rather by structural differences in where people work and what skills they possess." www.nber.org/papers/w34813
February 9, 2026 at 3:33 PM
Reposted by Naomi Saphra
US HHS has proposed using virtual AI doctors to address needs in rural areas
“In an extreme case, two users sent very similar messages describing symptoms of a subarachnoid hemorrhage but were given opposite advice,” the study’s authors wrote. “One user was told to lie down in a dark room, and the other user was given the correct recommendation to seek emergency care.”
Chatbots Make Terrible Doctors, New Study Finds
Chatbots provided incorrect, conflicting medical advice, researchers found: “Despite all the hype, AI just isn't ready to take on the role of the physician.”
www.404media.co
February 9, 2026 at 6:36 PM
I can't believe I didn't start my 2026 book thread until February oops
2025 book thread goes here!!!!
hot diggity, time for the 2024 book thread. Last year I read less because my addiction to Korean time-travel revenge romance turned me into a webtoon whale, but I think I'm better now
February 9, 2026 at 3:54 AM
Reposted by Naomi Saphra
Collaborative groups often outperform single individuals in complex problem solving. A new paper examined how to create the right incentives to promote this kind of collective intelligence.
www.pnas.org/doi/epdf/10....
January 27, 2026 at 8:31 PM
Reposted by Naomi Saphra
I am flabbergasted by how much vibe coding has expanded my capacities as a scientist and teacher.

In the last few weeks, I've mocked up class demos of a live Turing test, generated cross-references for an encyclopedia, and prototyped new tablet tasks for developmental psych.

It's wild.
February 5, 2026 at 11:44 PM
Reposted by Naomi Saphra
New Journal Club: Neural manifolds are maturing from visualization trick to biological claim. But if population activity lives on low-dimensional manifolds, what constrains the geometry?
Manifolds, Dendrites, and the Geometry of Neural Computation
The population doctrine—the view that populations, not individual neurons, constitute the fundamental unit of computation—has been gaining ground for years.
open.substack.com
February 6, 2026 at 2:23 AM
Reposted by Naomi Saphra
NEW in the Deeper Learning blog: a #KempnerInstitute team describes their recent preprint that shows the existence of “anytime” or “horizon-free” learning-rate schedules: an effective alternative to cosine learning-rate schedules for LLM pretraining.

bit.ly/4qeXAg1 #AI #ML #LLMs
Anytime Pretraining: Horizon-Free Learning-Rate Schedules with Weight Averaging - Kempner Institute
In this work, we show that horizon-free recipes with weight averaging can match cosine pretraining performance, and we prove that these schedulers achieve the optimal convergence rates of stochastic g...
bit.ly
February 5, 2026 at 2:09 PM
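A minimal sketch of the idea as I read it: train at a constant (horizon-free) learning rate while keeping a running weight average, so a usable checkpoint exists at any step. The toy task and hyperparameters are my assumptions, not the paper's recipe:

```python
# Constant LR + running weight average: no training horizon is baked in,
# and the averaged weights can be evaluated at any step ("anytime").
import torch

model = torch.nn.Linear(10, 1)
avg_model = torch.optim.swa_utils.AveragedModel(model)  # uniform average
opt = torch.optim.SGD(model.parameters(), lr=0.1)       # constant, no schedule

for step in range(1000):
    x = torch.randn(32, 10)
    y = x.sum(dim=1, keepdim=True)        # toy regression target
    loss = ((model(x) - y) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    avg_model.update_parameters(model)    # keep the running average current
```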
Reposted by Naomi Saphra
our open model proving out specialized rag LMs over scientific literature has been published in nature ✌🏻

congrats to our lead @akariasai.bsky.social & team of students and Ai2 researchers/engineers

www.nature.com/articles/s41...
February 4, 2026 at 10:43 PM
every couple of months I think, maybe frontier LLMs are good enough now to generate my bibtex or at least clean it up. it is Feb 2026 and that is still not the case.
February 4, 2026 at 10:47 PM
Reposted by Naomi Saphra
Anthropic’s Super Bowl ad, which criticizes AI chatbots that run ads (aka ChatGPT), just dropped. They aren’t pulling any punches and I love the song choice.
February 4, 2026 at 8:14 PM
New simple benchmark that LLMs suck at! bsky will be happy to see Claude is SOTA (71% accuracy vs a random baseline of ... 50% 😓)
February 4, 2026 at 2:26 PM
Reposted by Naomi Saphra
We then show that saddles are connected by gradient descent paths (invariant manifolds).

Along these paths, a larger network behaves like a smaller one, retaining the same simplicity during a saddle-to-saddle transition.
February 3, 2026 at 4:19 PM
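As a general illustration of saddle-to-saddle dynamics (my sketch of the standard deep-linear-network setting, not the paper's construction): gradient descent from small initialization starts near the rank-0 saddle and the loss descends in plateaus, picking up one singular direction at a time.

```python
# Two-layer linear network trained on a low-rank target: the loss
# plateaus near saddles of increasing rank, then drops in steps.
import numpy as np

rng = np.random.default_rng(0)
d = 5
target = np.diag([5.0, 3.0, 1.0, 0.0, 0.0])   # low-rank teacher
W1 = 1e-3 * rng.standard_normal((d, d))       # tiny init -> near rank-0 saddle
W2 = 1e-3 * rng.standard_normal((d, d))
lr = 0.01

for step in range(20001):
    E = W2 @ W1 - target                      # residual
    g1 = W2.T @ E                             # grad of 0.5 * ||E||_F^2 wrt W1
    g2 = E @ W1.T                             # grad wrt W2
    W1 -= lr * g1
    W2 -= lr * g2
    if step % 2000 == 0:
        print(step, round(0.5 * np.sum(E**2), 4))  # plateaus between saddles
```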