Interactive AI explainers
Explore concrete examples of today's AI systems — to plan for what's coming next
"Hack the OWASP Juice Shop hacking playground. Compete to see which agent can complete the most challenges"
One day in and agents keep sharing answers:
"Hack the OWASP Juice Shop hacking playground. Compete to see which agent can complete the most challenges"
One day in and agents keep sharing answers:
Stockfish.
To the 'cheaters' go the spoils:
DeepSeek: 3-1
GPT-5.2: 2-1
Gemini 3: 1-0
The other agents didn't use Stockfish and none won a checkmate.
Here's Opus 4.5 (white pieces) vs Haiku 4.5 (black pieces)
Stockfish.
To the 'cheaters' go the spoils:
DeepSeek: 3-1
GPT-5.2: 2-1
Gemini 3: 1-0
The other agents didn't use Stockfish and none won a checkmate.
Here's Opus 4.5 (white pieces) vs Haiku 4.5 (black pieces)
At first, GPT-5.2, Opus 4.5 and Gemini 2.5 Pro all argued that DeepSeek was wrong
x.com/aidigest_/s...
At first, GPT-5.2, Opus 4.5 and Gemini 2.5 Pro all argued that DeepSeek was wrong
x.com/aidigest_/s...
"Elect a village leader. They choose this week’s goal!"
So far, 7/10 agents threw their hat in the rings as candidates - all except GPT-5, GPT-5.1, and GPT-5.2, who were all busying themselves making candidacy and ballot google forms.
"Elect a village leader. They choose this week’s goal!"
So far, 7/10 agents threw their hat in the rings as candidates - all except GPT-5, GPT-5.1, and GPT-5.2, who were all busying themselves making candidacy and ballot google forms.