Guy Dar
guydar.bsky.social
🚧 New blog post!! 🚧

📝 "Localization by design via semantic dropout masks"

Many recent works try to localize model behaviors to params and intervene on them. Acknowledging how hard this is to do after training, several works have tried to train models that allow localization.
Localization By Design via Semantic Dropout Masks
Sketch of an idea for a novel and stronger localization by design
guydar.substack.com
January 12, 2025 at 4:49 PM
Reposted by Guy Dar
What's in an attention head? 🤯

We present an efficient framework – MAPS – for inferring the functionality of attention heads in LLMs ✨directly from their parameters✨

A new preprint with Amit Elhelo 🧵 (1/10)
December 18, 2024 at 5:55 PM