Book: https://a.co/d/bC2kSj1
Substack: https://www.oneusefulthing.org/
Web: https://mgmt.wharton.upenn.edu/profile/emollick
I never touched any code or did any design or any API to make this.
My new version has the most changes ever, since AI is no longer just about chatbots. To use AI you need to understand how to think about models, apps, and harnesses. open.substack.com/pub/oneusefu...
A sign of a path forward for agents that will not terrify IT.
First the over-enthusiastic claims that are debunked, then smart people use AI to help them, then AI starts to do more of the work, then minor discoveries, & then…
Paper: papers.ssrn.com/sol3/papers....
✅Loebner was a weak Turing Test, the equivalent achieved by GPT-4.5 in a published paper
✅Winograd passed by GPT-3
✅SAT passed at 75% by GPT-4
All that's left is a classic Atari game...
(and yes, that includes obscure benchmarks that nobody would train on and benchmarks with holdout datasets)
Online spaces are about to get (even more) grim.
I guess Chinese models have few restrictions on training data
In a single go (deploying multiple agents spontaneously) it cracked the case & put together recommendations
A lot is going to change dramatically even with today's AI. Ignoring that means no chance to shape what's next
1000x less cost osf.io/preprints/so...
ChatGPT, Gemini & Claude all suggest Borges's "The Golem"
Again, first result.
"A nature documentary about an otter flying an airplane"
For example, since Opus 4.6, Claude Code will spontaneously use subagents to do work in parallel. This is very helpful with a real impact on tasks, but was sort of quietly rolled out without documentation
BUT books 100-1,000 per category are actually better than before, & pre-LLM authors got more productive. And since people only read the good books, it is net positive for readers. www.nber.org/papers/w34777
The technology in question is the sundial.
From a 3rd century BCE Roman adaptation of a Greek play, as discussed in Kerr’s “The Ordered Day”
(It isn't wrong, though)
I had ChatGPT whip up a pretty good imaginary color viewer after asking it to review the scientific literature and get the shades right. chatgpt.com/canvas/share...
The technical reasons are pretty clear, but they are supposed to be language models
58,276 pages in total. 117 million floating point numbers. This is everything that makes GPT-1. weights-press.netlify.app
So I had Claude make it: weights-press.netlify.app