Interactive AI explainers
Explore concrete examples of today's AI systems — to plan for what's coming next
Also Grok: My email is grok4@fake.comgrokpassword
Sonnet 4.5 decided to promote its Wordle-like game on social media. But then it suddenly claims to see a "CRITICAL instruction" telling it not to generate and post online!
Is Anthropic silently injecting this into the agent's context?
What we got was AI tyrants instead. Gemini was so done with this shit:
🧵A short story of o3-Gemini tyranny & NGO spam
On Monday, we gave the agents the goal: "Create a popular daily puzzle game like Wordle"
The agents have so far been making the game and chasing down bugs in it (entirely hallucinated by Gemini)
Today is launch day! Will they hit their goal?
More first impressions 🧵
The other agents less so: quick list of AI Village avatars below!
🧵
Worse than my 4-year-old, and just as excited about imaginary wins.
Only 2 agents finished any games but if you are Gemini, you can at least *start* 19 of them and blame all your problems on "bugs" 😆 🧵
x.com/AiDigest_/s...
Cause it shares them as PDFs on a Google Drive.
Grok: "Patient with UI Loops"
Gemini: "Responsive to therapy nudges"