Ari
@ari-holtzman.bsky.social
Assistant Professor @ UChicago CS & DSI

Leading Conceptualization Lab http://conceptualization.ai

Minting new vocabulary to conceptualize generative models.
people are always waiting for the right time to do something and then doing absolutely nothing in the process
November 20, 2025 at 2:34 AM
Reposted by Ari
Trying an experiment in good old-fashioned blogging about papers: dallascard.github.io/granular-mat...
Language Model Hacking - Granular Material
dallascard.github.io
November 16, 2025 at 7:51 PM
Sometimes I notice that instead of a Letter of Rec someone has sent in a Letter of Wreck...common mistake, so be sure to check before submitting!
November 10, 2025 at 11:59 PM
🐦‍🔥
November 10, 2025 at 9:42 PM
LLMs don't accumulate information over the course of a text the way you'd hope!

I think this is why LLMs often feel 'fixated on the wrong thing' or 'overly literal'—they are usually responding using the most relevant single thing they remember, not the aggregate of what was said
Can LLMs accurately aggregate information over long, information-dense texts? Not yet…

We introduce Oolong, a dataset of simple-to-verify information aggregation questions over long inputs. No model achieves >50% accuracy at 128K on Oolong!
November 9, 2025 at 8:06 PM
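A minimal sketch of the kind of simple-to-verify aggregation question the Oolong post describes (the record format, labels, and function names here are illustrative assumptions, not taken from the paper):

```python
# Hypothetical Oolong-style example: answering requires aggregating over
# every record, not retrieving the single most relevant line.
import random

def make_example(n_records=500, seed=0):
    """Build a long labeled input plus a count question with a verifiable answer."""
    rng = random.Random(seed)
    labels = ["positive", "negative", "neutral"]
    records = [f"Review {i}: sentiment={rng.choice(labels)}" for i in range(n_records)]
    text = "\n".join(records)
    gold = sum(1 for r in records if r.endswith("positive"))
    question = "How many reviews are labeled positive?"
    return text, question, gold

text, question, gold = make_example()
# The gold answer is checkable without a model:
assert gold == text.count("sentiment=positive")
```

Because the answer is a simple count, grading is trivial even though producing it demands attending to the whole input.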
who in your life most overestimates you?

is it awesome or painful or something you live in fear of?
November 8, 2025 at 12:17 AM
the first actually good voice notetaking app is going to kill 🔪
November 6, 2025 at 2:06 AM
most apparent irrationality is a misunderstanding of either (1) the underlying objective function or (2) who is the actual actor in the situation
November 5, 2025 at 4:22 PM
minimalism is just a scam concocted by big small™ to make small more bigger for less (or more)
November 5, 2025 at 2:04 AM
I found Reviewer 2 on reddit
October 31, 2025 at 12:58 PM
Hiring anybody who can help me explain this Grok output:

Me: Show me the seahorse emoji
Grok: Here it is: 🦐

Wait, that's a shrimp. My bad—the actual seahorse emoji is: 🦎

No, that's a lizard. Let me get this right: the seahorse is 🦈? Shark? Nope.

Actually, the real seahorse emoji is: 🦑
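One plausible explanation, checkable from the Python standard library: every animal Grok cycles through is a real Unicode character, but there is no seahorse emoji in Unicode at all, so there is no correct token for the model to emit:

```python
import unicodedata

# Each animal Grok guessed exists as a named Unicode character:
for name in ["SHRIMP", "LIZARD", "SHARK", "SQUID"]:
    print(name, unicodedata.lookup(name))

# But "SEAHORSE" names no character, so the lookup fails:
try:
    unicodedata.lookup("SEAHORSE")
except KeyError:
    print("SEAHORSE: not in Unicode")
```

The model may be trying to satisfy a request whose premise (that a seahorse emoji exists) is false, and cycling through nearby sea creatures instead.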
I'm recruiting PhD students! I'm interested in:
1. Understanding how LLMs 'see' the world (ex: LMs can't see conspicuous omissions, see AbsenceBench)
2. How can we make things with LLMs that have never been made before? (ex: Communication Games, see 📌)
3. See my other posts :)
October 28, 2025 at 3:41 PM
I feel like there are 'hallucinations' (LLMs confabulating things on the spot) and then there are 'false memories' (things an LLM believes are true across lots of different contexts, where the incorrect info is deeply parametric). Example:
October 27, 2025 at 11:30 PM
Reposted by Ari
❓ Does an LLM know thyself? 🪞
Humans pass the mirror test at ~18 months 👶
But what about LLMs? Can they recognize their own writing—or even admit authorship at all?
In our new paper, we put 10 state-of-the-art models to the test. Read on 👇
1/n 🧵
October 27, 2025 at 5:36 PM
is anyone making a mini-series on Sora? if not, why not? are people really just so lazy they're annoyed they can't parody already existing media?

THE EVERYTHING MACHINE IS HERE NOW USE IT
October 24, 2025 at 8:21 PM
Reposted by Ari
AI can accelerate scientific discovery, but only if we get the scientist–AI interaction right.

The dream of “autonomous AI scientists” is tempting:
machines that generate hypotheses, run experiments, and write papers. But science isn’t just automation.

cichicago.substack.com/p/the-mirage...
🧵
The Mirage of Autonomous AI Scientists
Science as AI’s killer application cannot succeed without scientist-AI interaction: Introducing Hypogenic.ai.
cichicago.substack.com
October 23, 2025 at 6:55 PM
What is the hardest (but not theoretically impossible) thing to prove that you believe is objectively true?
October 21, 2025 at 10:48 PM
I wonder which university will be the first to have a lab run by an LLM...
October 20, 2025 at 11:52 PM
The two questions I keep coming back to as an academic:

1) How can I change course while maintaining the machine that keeps me alive?
2) Is this machine actually keeping me alive?
October 20, 2025 at 1:08 AM
TTIC is hiring! Come to Chicago and hang out with amazing Communication folks like Will Merrill, Nick Tomlin, Mina Lee, Chenhao Tan, and unfortunately you'll also have to deal with me! (But the rest hopefully make up for it.)

Link to apply in the next tweet.
October 18, 2025 at 3:54 PM
I'm excited for 'AI AI slop'—AI slop made by AIs to get engagement from other AIs
October 17, 2025 at 11:03 PM
my hot take is there are good writers hiding inside LLMs, but the training data and learning algorithms you're using to get them out are pointing in the wrong direction.
October 16, 2025 at 2:06 AM
So has anyone invented cursor for writing or am I just still stuck using obsidian and asking chatgpt when I want its help?

(Gemini integration in Docs is abysmal, same with Notion's AI, which I think is ChatGPT but somehow does not condition on enough context to be useful?)
October 10, 2025 at 4:35 PM
FYI that UChicago CS & Stats is hiring at all levels via the Data Science Institute:

Postdoc: uchicago.infoready4.com#freeformComp...
Assistant Professor: apply.interfolio.com/174766
Associate Professor: apply.interfolio.com/174768
October 7, 2025 at 5:53 PM