As far as I can tell it's between 5.1 and 10.2 seconds, depending on which end of the 2019 IEA Netflix energy usage estimate you use
simonwillison.net/2025/Nov/29/...
Tomorrow that water will be a cloud but now’s your chance to kick it while it’s down.
We introduce Oolong, a dataset of simple-to-verify information aggregation questions over long inputs. No model achieves >50% accuracy at 128K on Oolong!
Examples: credit cards and porn, optical storage and porn, synthetic voice generation and scam calls, email and scams/porn, the internet and ...
This has implications for likely AI x-risk scenarios 1/
What about when engineers at the top of their game use AI tools responsibly to accelerate their work?
I propose "vibe engineering"!
simonwillison.net/2025/Oct/7/v...
Noticed that those able to shoot pro films using a smartphone are... pros?
A: We're not sure, but it achieved 94.7% on CHIKENBench-Large
Available here -- vladiliescu.net/iterm2-with-...
www.anthropic.com/engineering/...
Our resolve hasn't faltered
“As soon as I know some text is AI-generated: I lose all interest in reading it.
For performance reviews, I asked people to either not use AI or if they must: just write down the prompt so I don’t need to go thru the generated word salad padding.”
I've just spent 15 minutes trying to cajole GitHub Copilot (both Claude Sonnet 4 and GPT-5) into implementing a rather exotic change in my codebase.
But PRDs define what you’re building and why, plus its priorities, success metrics and features, while prototypes only show the features.
Skipping them is risky, as teams may ship without clarity on goals or tradeoffs.