@dchiang.bsky.social
I am recruiting a PhD student to work with me, Peter Cholak, Anand Pillay, and Andy Yang @pentagonalize.bsky.social on transformers and logic/model theory (or related topics). If you are interested, please email me with "FLaNN" in the subject line!
October 30, 2025 at 7:23 PM
Reposted
Read the cookbook: arxiv.org/abs/2510.00368

Join us for weekly seminars on formal language theory, ML, NLP, and more: flannseminars.github.io
October 3, 2025 at 4:24 PM
Reposted
There is no better way to understand what transformers can do than to get your hands dirty and construct them, weight by weight. The Transformer Cookbook provides a guide for anyone aiming to understand the expressive power of transformers at this formal level.
October 3, 2025 at 4:24 PM
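To make the weight-by-weight idea concrete, here is a minimal sketch (illustrative, not a recipe copied from the cookbook) of one hand-constructed attention head: one-hot position embeddings as queries and shifted one-hots as keys make each position attend to its predecessor and copy the previous token, which is the first half of a standard induction head. All names and constants below are this sketch's own.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

tokens = "abbabaab"
n = len(tokens)

# One-hot embeddings: token identity and position.
E = np.array([[1.0, 0.0] if t == "a" else [0.0, 1.0] for t in tokens])
P = np.eye(n)

# Hand-set weights: the query at position i is the position one-hot e_i,
# and the key at position j is e_{j+1}, so q_i . k_j = 1 exactly when
# j = i - 1 (the previous position).
Q = P
K = np.zeros((n, n))
K[np.arange(n - 1), np.arange(1, n)] = 1.0
V = E

scale = 1e4                                    # sharpen softmax toward hard attention
scores = scale * (Q @ K.T)
mask = np.triu(np.ones((n, n)), k=1) * -1e9    # causal mask: no future positions
probs = softmax(scores + mask, axis=-1)
out = probs @ V                                # out[i] ~ one-hot of token i-1 (for i >= 1)

# Position 0 has no predecessor, so we mark it "-".
prev = ["-"] + ["ab"[int(np.argmax(out[i]))] for i in range(1, n)]
print("".join(prev))                           # -abbabaa (input shifted right by one)
```

The large `scale` constant stands in for the more careful treatment, in constructions like these, of how closely soft attention can approximate hard attention.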
Reposted
We present The Transformer Cookbook: a collection of recipes for programming algorithms directly into transformers!

Hungry for an induction head? Craving a Dyck language recognizer? We show you step-by-step how to cook up transformers for these algorithms and many more!
The Transformer Cookbook
We present the transformer cookbook: a collection of techniques for directly encoding algorithms into a transformer's parameters. This work addresses the steep learning curve of such endeavors, a prob...
arxiv.org
October 3, 2025 at 4:24 PM
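As a taste of the kind of recipe advertised here, the sketch below illustrates the classic counting trick behind a Dyck-1 (balanced parentheses) recognizer: a uniform causal attention head averages a +1/-1 value embedding, so each position sees its running balance divided by its position. This simulates what such a head computes rather than spelling out weights; the cookbook gives the full weight-level construction.

```python
import numpy as np

def dyck1(s):
    """Recognize Dyck-1 with the counting trick: a uniform causal
    attention head averages a +1/-1 value embedding, so position i
    sees (opens - closes)/(i+1) over the prefix s[:i+1]. The string
    is balanced iff every prefix average is >= 0 and the final one
    is exactly 0."""
    v = np.array([1.0 if c == "(" else -1.0 for c in s])
    prefix_avg = np.cumsum(v) / np.arange(1, len(s) + 1)  # uniform-attention output
    return bool((prefix_avg >= 0).all() and np.isclose(prefix_avg[-1], 0.0))

print(dyck1("(()())"))  # True
print(dyck1("())("))    # False: a prefix closes more than it opens
```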
Reposted
Andy Yang, Christopher Watson, Anton Xue, Satwik Bhattamishra, Jose Llarena, William Merrill, Emile Dos Santos Ferreira, Anej Svete, David Chiang: The Transformer Cookbook https://arxiv.org/abs/2510.00368 https://arxiv.org/pdf/2510.00368 https://arxiv.org/html/2510.00368
October 2, 2025 at 6:33 AM
New on arXiv: Knee-Deep in C-RASP, by @pentagonalize.bsky.social, @cadilhac.bsky.social, and me. The solid stepped line is our theoretical prediction, based on which problems C-RASP can solve; the numbers/colors show which problems transformers (with no position embedding) actually learn.
June 23, 2025 at 11:56 AM
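For readers new to C-RASP, the sketch below is an illustrative Python rendering (not the paper's formal syntax) of the kind of program it expresses: count how many prefix positions satisfy a predicate, then compare counts, as in the classic majority problem.

```python
def crasp_style_majority(s):
    # C-RASP-style program: build prefix counts of positions satisfying
    # a predicate, then compare them with a Boolean test.  Accept iff
    # #[x = '1'] > #[x = '0'] at the final position.
    count1 = [sum(c == "1" for c in s[: i + 1]) for i in range(len(s))]
    count0 = [(i + 1) - count1[i] for i in range(len(s))]
    return count1[-1] > count0[-1]

print(crasp_style_majority("11010"))  # True: three 1s vs two 0s
print(crasp_style_majority("1000"))   # False
```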
Congratulations to Aarohi Srivastava @aarsri.bsky.social on winning the Best Paper Award at W-NUT at NAACL 2025! The paper applies interventions that simulate noisy text or dialectal variation, showing that different kinds of variation affect models in different ways.
April 23, 2025 at 1:30 PM
The abstract submission deadline for Midwest Speech and Language Days is in two days, on March 20! Please submit an abstract! MSLD is non-archival, and submissions of both work-in-progress and previously published work are encouraged. nlp.nd.edu/msld25/
Midwest Speech and Language Days 2025
nlp.nd.edu
March 18, 2025 at 3:50 PM
Midwest Speech and Language Days will be held Apr 15-16 at @NotreDame! Abstract submissions are due Mar 20, and registration deadline is Mar 27. Financial assistance for students (lodging, poster printing) is available. nlp.nd.edu/msld25
Midwest Speech and Language Days 2025
nlp.nd.edu
March 8, 2025 at 6:35 PM
New paper and two not-so-new papers on arXiv about transformer expressivity: (1) With @pentagonalize and Dana Angluin, "Simulating Hard Attention Using Soft Attention" arxiv.org/abs/2412.09925
Simulating Hard Attention Using Soft Attention
We study conditions under which transformers using soft attention can simulate hard attention, that is, effectively focus all attention on a subset of positions. First, we examine several variants of ...
arxiv.org
December 23, 2024 at 10:55 PM
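The core idea of the paper above can be illustrated with a toy sketch: scaling attention logits by a growing constant drives softmax toward a hard (argmax) distribution, splitting mass evenly over ties the way average-hard attention does. The paper's actual constructions handle more subtleties; this shows only the basic mechanism.

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

scores = np.array([0.1, 0.9, 0.3, 0.9])   # note the tie at positions 1 and 3
for scale in (1.0, 10.0, 1000.0):
    print(scale, softmax(scale * scores).round(3))
# scale 1.0    -> a diffuse distribution over all positions
# scale 1000.0 -> mass split evenly over the two maxima, i.e. the head
#                 behaves like average-hard attention in the limit
```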