In our new preprint, @pc-pet.bsky.social and I introduce the concept of "Causal Pieces" to approach this question!
📘 𝘔𝘢𝘵𝘩𝘦𝘮𝘢𝘵𝘪𝘤𝘢𝘭 𝘛𝘩𝘦𝘰𝘳𝘺 𝘰𝘧 𝘋𝘦𝘦𝘱 𝘓𝘦𝘢𝘳𝘯𝘪𝘯𝘨
and uploaded the new version to arXiv:
🔗 arxiv.org/abs/2407.18384
If you have already read it—or plan to—we would really appreciate your feedback.
Not if you have taken an introductory course on numerics. How bad can things get? To find out, we carried out a numerical stability analysis of the transformer: arxiv.org/abs/2503.10251