Acyr Locatelli
acyrl.bsky.social
Lead pre-training @Cohere
One feature missing from @bsky.app is bookmarks. Need to keep feeding the hoarding monster
November 23, 2024 at 10:28 AM
Reposted by Acyr Locatelli
Laura Ruis, Maximilian Mozes, Juhan Bae, Siddhartha Rao Kamalakara, Dwarak Talupuru, Acyr Locatelli, Robert Kirk, Tim Rocktäschel, Edward Grefenstette, Max Bartolo
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models
https://arxiv.org/abs/2411.12580
November 20, 2024 at 7:01 AM
Reposted by Acyr Locatelli
How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this:

Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢

🧵⬇️
November 20, 2024 at 4:35 PM