akbir khan
akbir.bsky.social
dumbest overseer at @anthropic
https://www.akbir.dev
Reposted by akbir khan
4. Factorio Learning Environment by Jack Hopkins, Märt Bakler, and @akbir.bsky.social

This benchmark uses the factory-building game Factorio to test complex, long-term planning, with settings for lab-play (structured tasks) and open-play (unbounded growth).
jackhopkins.github.io/factorio-lea...
Factorio Learning Environment: Claude Sonnet 3.5 builds factories
May 8, 2025 at 3:00 PM
wait what does that mean?

Does it mean there are bugs in Lean, or that it does too much work to check a proof?
January 5, 2025 at 5:22 PM
wait isn’t everything just regularisation?
January 4, 2025 at 7:42 PM