This should be seen as Anthropic doubling down on Claude Code.
They recently launched the native installer for CC through a tight partnership with Bun. You should expect to see more
www.anthropic.com/news/anthrop...
As far as I can tell it's between 5.1 and 10.2 seconds, depending on which end of the 2019 IEA Netflix energy usage estimate you use
simonwillison.net/2025/Nov/29/...
Modern LLMs (GPT-5.1, Claude 4.5, Gemini 3) produce excellent code and can be a significant productivity boost to software engineers who take the time to learn how to effectively apply them - especially if used with coding agent tools
no, the advantage of closed weights is you can explore prices completely detached from cost. You’re free to set prices based purely on what people will pay and the value they get from it
Gemini: "Here is F.L.O.O.R. (First-person Lino Observation & Ornamental Review)."
Pretty good!
www.dbreunig.com/2025/07/31/h...
A riff on the lethal trifecta for addressing prompt injection, this is a simple heuristic to ensure security at runtime
red = untrusted content
blue = potentially critical actions
An agent can't be allowed to do both
timkellogg.me/blog/2025/11...
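The red/blue rule above can be sketched in a few lines. This is my own minimal illustration, not code from the linked post; the class and method names are hypothetical, and it assumes a simple agent loop where content is ingested before actions are taken.

```python
class AgentSession:
    """Toy enforcement of the red/blue rule: once untrusted (red) content
    enters the context, potentially critical (blue) actions are blocked."""

    def __init__(self):
        # Becomes True as soon as any untrusted content is ingested.
        self.saw_untrusted = False

    def ingest(self, content: str, untrusted: bool) -> None:
        # Red content: web pages, emails, tool output the agent didn't author.
        if untrusted:
            self.saw_untrusted = True

    def perform_critical_action(self, action: str) -> str:
        # Blue actions: sending mail, writing files, spending money.
        # An agent can't be allowed to do both in the same session.
        if self.saw_untrusted:
            raise PermissionError(
                f"blocked critical action {action!r}: session saw untrusted content"
            )
        return f"performed {action}"
```

In practice the interesting design work is deciding what counts as red and blue and whether a tainted session can ever be "un-tainted" (e.g. by starting a fresh context), but the check itself stays this simple.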
we’re still early. people aren’t spending much money on AI so it’s not a lucrative target yet
it’s also inconsistent, which is annoying to design attacks for, especially if the rewards are sparse
I wish more LLM tools would implement the same pattern! simonwillison.net/2025/Oct/24/...
A lot has changed since I last wrote a guide like this in the spring, and AI has gotten much more useful as a result. open.substack.com/pub/oneusefu...
i’ve been saying this for a couple months. RL is driving towards specialization
my hunch is it’s temporary and something will shift again back towards generalization, but for now... buckle up!
Even when Google & OpenAI include watermarks, those can be easily removed, and open weights AI video models without guardrails are coming. www.404media.co/sora-2-water...
In this case, AI note-taking significantly reduces burnout among doctors & increases their ability to focus on their patients.
not just good content, there’s more and more original work, people from labs, and people with genuinely interesting perspectives
when i joined, it was so painful trying to find even traces
1) When you get an instant AI answer, it comes from a small model, and small models are weak, especially at math.
2) Non-reasoning models, like the one powering AI overviews, only “think” as they write: they make mistakes & then back-justify them as they write more
a study shows that a lot of the real-world performance gains people see actually come from people learning how to use the model better
arxiv.org/abs/2407.14333
There are two goals in AI: minimize cost (which also roughly tracks the environmental impact of use) & maximize ability. It is clear you can win on one goal by losing on the other; GPT-5 seems to be a gain on both.
I asked it for an SVG of a pelican riding a bicycle and it wrote me a delightful little poem instead
simonwillison.net/2025/Aug/14/...
it doesn’t make sense if you don’t have a strong UI, plus it’s obviously hard