LLM-powered library learning systems achieve SoTA performance on several tasks, but is this driven by the reuse of learned tools? We study two library learning systems for mathematics and find that the reuse of learned tools is extremely infrequent and can harm performance 🧵
December 11, 2024 at 3:55 PM
Will multimodal models systematically generalize if trained on enough data? In a controlled VQA setting, we find it’s not data quantity, but data DIVERSITY that matters! 🧵