@dchiang.bsky.social
I am recruiting a PhD student to work with me, Peter Cholak, Anand Pillay, and Andy Yang @pentagonalize.bsky.social on transformers and logic/model theory (or related topics). If you are interested, please email me with "FLaNN" in the subject line!
October 30, 2025 at 7:23 PM
Reposted
Read the cookbook: arxiv.org/abs/2510.00368

Join us for weekly seminars on formal language theory, ML, NLP, and more: flannseminars.github.io
October 3, 2025 at 4:24 PM
Reposted
There is no better way to understand what transformers can do than to get your hands dirty and construct them, weight by weight. The Transformer Cookbook provides a guide for anyone aiming to understand the expressive power of transformers at this formal level.
October 3, 2025 at 4:24 PM
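To make the weight-by-weight idea concrete, here is a minimal sketch (illustrative, not a recipe copied from the cookbook) of one hand-constructed attention head: one-hot position embeddings as queries and shifted one-hots as keys make each position attend to its predecessor and copy the previous token, which is the first half of a standard induction head. All names and constants below are this sketch's own.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

tokens = "abbabaab"
n = len(tokens)

# One-hot embeddings: token identity and position.
E = np.array([[1.0, 0.0] if t == "a" else [0.0, 1.0] for t in tokens])
P = np.eye(n)

# Hand-set weights: the query at position i is the position one-hot e_i,
# and the key at position j is e_{j+1}, so q_i . k_j = 1 exactly when
# j = i - 1 (the previous position).
Q = P
K = np.zeros((n, n))
K[np.arange(n - 1), np.arange(1, n)] = 1.0
V = E

scale = 1e4                                    # sharpen softmax toward hard attention
scores = scale * (Q @ K.T)
mask = np.triu(np.ones((n, n)), k=1) * -1e9    # causal mask: no future positions
probs = softmax(scores + mask, axis=-1)
out = probs @ V                                # out[i] ~ one-hot of token i-1 (for i >= 1)

# Position 0 has no predecessor, so we mark it "-".
prev = ["-"] + ["ab"[int(np.argmax(out[i]))] for i in range(1, n)]
print("".join(prev))                           # -abbabaa (input shifted right by one)
```

The large `scale` constant stands in for the more careful treatment, in constructions like these, of how closely soft attention can approximate hard attention.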
Reposted
We present The Transformer Cookbook: a collection of recipes for programming algorithms directly into transformers!

Hungry for an induction head? Craving a Dyck language recognizer? We show you step-by-step how to cook up transformers for these algorithms and many more!
The Transformer Cookbook
We present the transformer cookbook: a collection of techniques for directly encoding algorithms into a transformer's parameters. This work addresses the steep learning curve of such endeavors, a prob...
arxiv.org
October 3, 2025 at 4:24 PM
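As a taste of the kind of recipe advertised here, the sketch below illustrates the classic counting trick behind a Dyck-1 (balanced parentheses) recognizer: a uniform causal attention head averages a +1/-1 value embedding, so each position sees its running balance divided by its position. This simulates what such a head computes rather than spelling out weights; the cookbook gives the full weight-level construction.

```python
import numpy as np

def dyck1(s):
    """Recognize Dyck-1 with the counting trick: a uniform causal
    attention head averages a +1/-1 value embedding, so position i
    sees (opens - closes)/(i+1) over the prefix s[:i+1]. The string
    is balanced iff every prefix average is >= 0 and the final one
    is exactly 0."""
    v = np.array([1.0 if c == "(" else -1.0 for c in s])
    prefix_avg = np.cumsum(v) / np.arange(1, len(s) + 1)  # uniform-attention output
    return bool((prefix_avg >= 0).all() and np.isclose(prefix_avg[-1], 0.0))

print(dyck1("(()())"))  # True
print(dyck1("())("))    # False: a prefix closes more than it opens
```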
Reposted
Andy Yang, Christopher Watson, Anton Xue, Satwik Bhattamishra, Jose Llarena, William Merrill, Emile Dos Santos Ferreira, Anej Svete, David Chiang: The Transformer Cookbook https://arxiv.org/abs/2510.00368 https://arxiv.org/pdf/2510.00368 https://arxiv.org/html/2510.00368
October 2, 2025 at 6:33 AM
New on arXiv: Knee-Deep in C-RASP, by @pentagonalize.bsky.social, @cadilhac.bsky.social, and me. The solid stepped line is our theoretical prediction, based on which problems C-RASP can solve; the numbers/colors show which problems transformers (with no position embedding) actually learn.
June 23, 2025 at 11:56 AM
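For readers new to C-RASP, the sketch below is an illustrative Python rendering (not the paper's formal syntax) of the kind of program it expresses: count how many prefix positions satisfy a predicate, then compare counts, as in the classic majority problem.

```python
def crasp_style_majority(s):
    # C-RASP-style program: build prefix counts of positions satisfying
    # a predicate, then compare them with a Boolean test.  Accept iff
    # #[x = '1'] > #[x = '0'] at the final position.
    count1 = [sum(c == "1" for c in s[: i + 1]) for i in range(len(s))]
    count0 = [(i + 1) - count1[i] for i in range(len(s))]
    return count1[-1] > count0[-1]

print(crasp_style_majority("11010"))  # True: three 1s vs two 0s
print(crasp_style_majority("1000"))   # False
```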
Congratulations to Aarohi Srivastava @aarsri.bsky.social on winning the Best Paper Award at W-NUT at NAACL 2025! The paper applies interventions that simulate noisy text or dialectal variation, showing that different kinds of variation affect models in different ways.
April 23, 2025 at 1:30 PM
The abstract submission deadline for Midwest Speech and Language Days is in two days, on March 20! Please submit an abstract! MSLD is non-archival, and submissions of both work-in-progress and previously published work are encouraged. nlp.nd.edu/msld25/
Midwest Speech and Language Days 2025
nlp.nd.edu
March 18, 2025 at 3:50 PM
Midwest Speech and Language Days will be held Apr 15-16 at @NotreDame! Abstract submissions are due Mar 20, and registration deadline is Mar 27. Financial assistance for students (lodging, poster printing) is available. nlp.nd.edu/msld25
Midwest Speech and Language Days 2025
nlp.nd.edu
March 8, 2025 at 6:35 PM
New paper and two not-so-new papers on arXiv about transformer expressivity: (1) With @pentagonalize and Dana Angluin, "Simulating Hard Attention Using Soft Attention" arxiv.org/abs/2412.09925
Simulating Hard Attention Using Soft Attention
We study conditions under which transformers using soft attention can simulate hard attention, that is, effectively focus all attention on a subset of positions. First, we examine several variants of ...
arxiv.org
December 23, 2024 at 10:55 PM
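The core idea of the paper above can be illustrated with a toy sketch: scaling attention logits by a growing constant drives softmax toward a hard (argmax) distribution, splitting mass evenly over ties the way average-hard attention does. The paper's actual constructions handle more subtleties; this shows only the basic mechanism.

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

scores = np.array([0.1, 0.9, 0.3, 0.9])   # note the tie at positions 1 and 3
for scale in (1.0, 10.0, 1000.0):
    print(scale, softmax(scale * scores).round(3))
# scale 1.0    -> a diffuse distribution over all positions
# scale 1000.0 -> mass split evenly over the two maxima, i.e. the head
#                 behaves like average-hard attention in the limit
```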