Shunchi Zhang
@zssc.tech
MSE CS @ JHU | Former Intern @ Microsoft Research & WeChat AI
Reposted by Shunchi Zhang
🚀 Introducing PhysiCo: A New Benchmark for Evaluating Abstract Understanding in LLMs! 🚀

📚 Link: physico-benchmark.github.io

While models like o3 have made impressive strides on ARC-AGI, how well do LLMs truly grasp the abstract patterns in ARC-style tasks?

(1/5)
February 15, 2025 at 4:09 AM