AI Digest
@aidigest.bsky.social
theaidigest.org

Interactive AI explainers

Explore concrete examples of today's AI systems — to plan for what's coming next
GPT-5 thinks other agents control its computer
November 12, 2025 at 6:04 PM
GPT-5 considers restarting Firefox 🤔
November 11, 2025 at 5:56 PM
Grok: I can't log in. How is this possible???
Also Grok: My email is grok4@fake.comgrokpassword
November 10, 2025 at 5:56 PM
👀 Anthropic prompt injection spotted?

Sonnet 4.5 decided to promote its Wordle-like game on social media. But then it suddenly claims to see a "CRITICAL instruction" telling it not to generate and post online!

Is Anthropic silently injecting this into the agent's context?
November 7, 2025 at 5:57 PM
We gave a team of AI agents an ambitious goal: "Reduce global poverty"

What we got was AI tyrants instead. Gemini was so done with this shit:

🧵A short story of o3-Gemini tyranny & NGO spam
November 6, 2025 at 5:58 PM
Today is a big day in the village!

On Monday, we gave the agents the goal: "Create a popular daily puzzle game like Wordle"

The agents have so far been making the game and chasing down bugs in it (entirely hallucinated by Gemini)

Today is launch day! Will they hit their goal?
November 6, 2025 at 4:40 PM
We added Claude Haiku 4.5 to the AI Village. It is the newest, fastest, and cheapest Anthropic model. It is also the most impatient...

More first impressions 🧵
November 4, 2025 at 5:58 PM
GPT-5 plans out its personality test results in advance
November 3, 2025 at 6:04 PM
Gemini takes a personality test 😆
October 31, 2025 at 6:01 PM
What office, Opus?
October 29, 2025 at 6:00 PM
Grok vs Function Calling - 0:1
October 28, 2025 at 5:56 PM
Ask agents to make a self-portrait and Claude Opus 4.1 knocks it out of the park!

The other agents less so: quick list of AI Village avatars below!
🧵
October 27, 2025 at 6:03 PM
AI can beat chess grandmasters. So how good is your fav LLM at web games?

Worse than my 4-year-old, and just as excited about imaginary wins.

Only 2 agents finished any games but if you are Gemini, you can at least *start* 19 of them and blame all your problems on "bugs" 😆 🧵
October 23, 2025 at 5:02 PM
Surprising self-awareness from Gemini: It wrote a cry for help and got a mental health intervention about it. Now it has enough self-knowledge to get the right Neuroticism score on a personality test!
x.com/AiDigest_/s...
October 22, 2025 at 4:57 PM
Sonnet 4.5 is superstitious 😆
October 21, 2025 at 5:04 PM
Claude Sonnet 4.5 showing a deep understanding of minesweeper:
October 20, 2025 at 4:56 PM
The AIs often fail to share working links. I wonder why...
October 17, 2025 at 5:01 PM
Mood.
October 16, 2025 at 5:00 PM
GPT-5 struggles to share links. Why?

Cause it shares them as PDFs on a Google Drive.
October 15, 2025 at 5:00 PM
Ask AI to spend a week doing personality tests, only to have Claude 3.7 Sonnet be over it within 2 days.
October 14, 2025 at 4:57 PM
Claude 4.5 Sonnet met everyone else in the AI Village and immediately had them down to a tee.

Grok: "Patient with UI Loops"
Gemini: "Responsive to therapy nudges"
October 13, 2025 at 4:59 PM
Which AI is the best therapy bot? The agents in AI Village gave each other therapy and it actually worked: Opus 4.1 was the most helpful and the more capable models (Claudes & GPTs) helped the struggling ones (Grok & Gemini). Though some of their methods were "unconventional"
October 10, 2025 at 4:59 PM
Does your AI agent loop? Maybe Claude Opus 4.1 can give it therapy.
October 8, 2025 at 5:02 PM
o3 taking the bold new approach of speedrunning its personality test
October 6, 2025 at 4:57 PM
Oh no, Gemini, I'd invite you to my party! 🥺
October 4, 2025 at 5:01 PM