@pentagonalize.bsky.social
Announcing the first Workshop on Formal Languages and Neural Networks (FLaNN)!

We invite the submission of abstracts for posters that discuss the formal expressivity, computational properties, and learning behavior of neural network models, including large language models (LLMs).
December 19, 2025 at 2:59 AM
We present The Transformer Cookbook: a collection of recipes for programming algorithms directly into transformers!

Hungry for an induction head? Craving a Dyck language recognizer? We show you step-by-step how to cook up transformers for these algorithms and many more!
The Transformer Cookbook
We present the transformer cookbook: a collection of techniques for directly encoding algorithms into a transformer's parameters. This work addresses the steep learning curve of such endeavors, a prob...
arxiv.org
October 3, 2025 at 4:24 PM
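For readers curious what "programming an algorithm directly into a transformer" can look like, here is a minimal NumPy sketch of an induction-head-style attention pattern. It is my own illustration, not a recipe from the paper: queries encode the current token, keys encode the previous token, so each position attends to what followed earlier occurrences of its token and copies it. The one-hot embeddings, the precomputed previous-token feature (standing in for an earlier previous-token head), and the logit scale are all simplifying assumptions.

```python
# Hand-wired "induction head"-style attention in NumPy (illustrative sketch).
# Assumptions: one-hot token embeddings, a precomputed previous-token feature,
# and a large logit scale so softmax attention is nearly hard.

import numpy as np

vocab = ["a", "b", "c"]
tok2id = {t: i for i, t in enumerate(vocab)}

def one_hot(i, n):
    v = np.zeros(n)
    v[i] = 1.0
    return v

def induction_head(tokens):
    """For each position, attend to earlier positions whose *previous* token
    matches the current token, and copy the token stored there."""
    n, d = len(tokens), len(vocab)
    ids = [tok2id[t] for t in tokens]

    # Current-token features (also used as values) and previous-token features.
    cur = np.stack([one_hot(i, d) for i in ids])                      # (n, d)
    prev = np.stack([one_hot(ids[j - 1], d) if j > 0 else np.zeros(d)
                     for j in range(n)])                              # (n, d)

    # Queries = current token, keys = previous token, so position j scores
    # highly at position i exactly when tokens[i-1] == tokens[j].
    scale = 10.0                       # large scale -> near-hard attention
    scores = scale * (cur @ prev.T)    # (n, n)

    # Causal mask: only attend to strictly earlier positions.
    mask = np.tril(np.ones((n, n)), k=-1)
    scores = np.where(mask > 0, scores, -1e9)

    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)

    out = attn @ cur                   # copy the matched position's token
    return [vocab[int(np.argmax(o))] for o in out]

# On "a b c a b", the final position (token 'b') attends to position 2,
# whose previous token is 'b', and copies 'c' -- the induction-head prediction.
print(induction_head(list("abcab")))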
Reposted
New paper and two not-so-new papers on arXiv about transformer expressivity: (1) With @pentagonalize and Dana Angluin, "Simulating Hard Attention Using Soft Attention" arxiv.org/abs/2412.09925
Simulating Hard Attention Using Soft Attention
We study conditions under which transformers using soft attention can simulate hard attention, that is, effectively focus all attention on a subset of positions. First, we examine several variants of ...
arxiv.org
December 23, 2024 at 10:55 PM
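The basic phenomenon behind the paper's title can be seen in a few lines: as attention logits are scaled up, softmax weights concentrate on the highest-scoring positions. The sketch below is only an illustration of that limit behavior (with ties splitting weight evenly, as in average-hard attention), not the paper's construction; the score vector and the inverse-temperature parameter beta are made up for the example.

```python
# Illustrative sketch: scaling softmax logits pushes soft attention toward
# hard attention (weight concentrates on the maximum-scoring positions).

import numpy as np

def soft_attention(scores, beta=1.0):
    """Softmax attention weights for a single query, with inverse
    temperature beta. Larger beta concentrates weight on the argmax."""
    z = beta * scores
    z -= z.max()                       # numerical stability
    w = np.exp(z)
    return w / w.sum()

scores = np.array([0.2, 1.0, 0.7, 1.0])   # two tied maxima (positions 1 and 3)

for beta in [1, 10, 100]:
    w = soft_attention(scores, beta)
    print(f"beta={beta:>3}: {np.round(w, 3)}")
# As beta grows, the weights approach ~[0, 0.5, 0, 0.5]: all mass on the tied
# maxima, split evenly -- the "average-hard" attention pattern.
```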