rogergrosse.bsky.social
@rogergrosse.bsky.social
The Nintendo is closer in time to the first transistor than to today.
December 18, 2024 at 3:36 PM
Conferences are basically a way for a group of people to temporarily have a lower opportunity cost on their time.
December 7, 2024 at 1:28 PM
Reposted
thinking of calling this "The Illusion Illusion"

(more examples below)
December 1, 2024 at 2:33 PM
Reposted
🚨 New #NeurIPS2024 paper “Training Data Attribution via Approximate Unrolling” 🚨

Introducing SOURCE: A method to understand how individual training examples influence neural net behavior, allowing us to make AI models more transparent and trustworthy!

📄 Full paper: openreview.net/pdf?id=3NaqG...
November 27, 2024 at 5:41 PM
I have Claude filter my arXiv feed each day. It mostly works pretty well, except that it always hallucinates that "Studying LLM Generalization with Influence Functions" is in my feed and tells me I should read it.
November 26, 2024 at 7:41 PM
Some very nice work from Cohere and UCL using influence functions to analyze math reasoning abilities in LLMs. Factual queries turn up docs containing the facts, but reasoning queries turn up similar cognitive strategies, suggesting generalization. arxiv.org/abs/2411.12580
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models
The capabilities and limitations of Large Language Models have been sketched out in great detail in recent years, providing an intriguing yet conflicting picture. On the one hand, LLMs demonstrate a g...
November 22, 2024 at 1:51 PM