Shunchi Zhang
@shunchi.dev
MLE @ByteDance | Prev: MSCS @JHU | shunchi.dev
Reposted by Shunchi Zhang
🚀 Introducing PhysiCo: A New Benchmark for Evaluating Abstract Understanding in LLMs! 🚀

📚Link: physico-benchmark.github.io

While models like o3 have made impressive strides on ARC-AGI, how well do LLMs truly grasp the abstract patterns in ARC-style tasks?

(1/5)
February 15, 2025 at 4:09 AM