Junjie Wu
junjie116.bsky.social
NLP PhD candidate @HKUST | Visiting PhD student @YaleNLP
🚀 Can LLMs think beyond memorization? Our NAACL 2025 main conference paper on fluid intelligence shows why models like GPT-4o struggle with truly novel problem-solving on ARC-AGI.

Project Website: wujunjie1998.github.io/araoc-benchm...

(1/4)
February 15, 2025 at 4:15 AM
🚀 Introducing PhysiCo: A New Benchmark for Evaluating Abstract Understanding in LLMs! 🚀

📚 Link: physico-benchmark.github.io

While models like o3 have made impressive strides on ARC-AGI, how well do LLMs truly grasp the abstract patterns in ARC-style tasks?

(1/5)
February 15, 2025 at 4:09 AM