🤔 Can one agent “nudge” a synthetic civilization of Census‑grounded agents toward higher social welfare—all by optimizing utilities in‑context? Meet the LLM Economist ↓
I'll stay on Bluesky as long as the 10 accounts I like to see still post here
Awesome-Multi-agent-Papers
Because this looks like a solid list
We're benchmarking it in Pokémon. Join us at the PokeAgent Challenge competition workshop @ NeurIPS 2025.
📍 Dec 7, 8AM
🎮 Track 1: Competitive Pokémon (game-theoretic reasoning)
🗺️ Track 2: Speedrunning (long-horizon planning)
But we should still be building and deploying things here 100x faster
DM or email if you want to chat about:
- building foundation agents through games
- PokeAgent Challenge & PokéChamp
- LLM Economist & autonomous business agents
We need:
- cheap energy
- cheap housing
- cheap food
Only possible by increasing supply
(5/5)
(4/5)
- Elo diverges from HR even when HR's error bars don't overlap
- Glicko-1 agrees with HR despite being online
(3/5)
- Bradley-Terry (batch MLE, our ground truth)
- Elo (online, chess-standard)
- Glicko-1 (online, uncertainty-aware)
- GXE (Glicko-derived win %)
(2/5)
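For anyone who wants to see the batch vs. online distinction concretely, here is a minimal sketch, not the code behind these results. It assumes games arrive as `(winner, loser)` pairs, uses the textbook Elo update with a hypothetical `k=32` and starting rating of 1500, and fits Bradley-Terry with the standard MM (Zermelo) iteration. The function names are illustrative, and every player is assumed to have at least one win so the MLE exists.

```python
def elo_expected(r_a, r_b):
    """Textbook Elo win probability of a player rated r_a against r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_ratings(games, k=32.0, start=1500.0):
    """Online Elo: one update per game, so the result depends on game order.

    k and start are illustrative defaults, not values from the thread.
    """
    ratings = {}
    for winner, loser in games:
        r_w = ratings.setdefault(winner, start)
        r_l = ratings.setdefault(loser, start)
        surprise = 1.0 - elo_expected(r_w, r_l)
        ratings[winner] = r_w + k * surprise
        ratings[loser] = r_l - k * surprise
    return ratings


def bradley_terry(games, iters=200):
    """Batch Bradley-Terry MLE via MM iteration over all games at once.

    Assumes no self-play and that every player has at least one win.
    """
    players = sorted({p for game in games for p in game})
    wins = {p: 0 for p in players}
    pair_counts = {}
    for winner, loser in games:
        wins[winner] += 1
        pair = frozenset((winner, loser))
        pair_counts[pair] = pair_counts.get(pair, 0) + 1

    strength = {p: 1.0 for p in players}
    for _ in range(iters):
        updated = {}
        for p in players:
            denom = 0.0
            for pair, n in pair_counts.items():
                if p in pair:
                    (q,) = pair - {p}
                    denom += n / (strength[p] + strength[q])
            updated[p] = wins[p] / denom
        # Rescale so strengths average to 1 (the scale is not identifiable).
        total = sum(updated.values())
        strength = {p: v * len(players) / total for p, v in updated.items()}
    return strength


# Same four games in two different orders.
order_1 = [("a", "b")] * 3 + [("b", "a")]
order_2 = [("b", "a")] + [("a", "b")] * 3

bt_1 = bradley_terry(order_1)
bt_2 = bradley_terry(order_2)
elo_1 = elo_ratings(order_1)
elo_2 = elo_ratings(order_2)
```

In this toy case the Bradley-Terry fit is identical for both orderings because it only sees the win counts, while the Elo ratings differ because each update depends on the ratings at the time of the game.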
(1/5)