Lightnews — Scholar-powered news

William Merrill

@lambdaviking.bsky.social

500 followers 130 following 19 posts

Will irl - PhD student @ NYU on the academic job market!

Using complexity theory and formal languages to understand the power and limits of LLMs

https://lambdaviking.com/ https://github.com/viking-sudo-rm


William Merrill

@lambdaviking.bsky.social

I'll be defending my dissertation at NYU next Monday, June 16 at 4pm ET!

I've definitely missed inviting some people who might be interested, so please email me if you'd like to attend (NYC or Zoom)

June 9, 2025 at 9:24 PM

William Merrill

@lambdaviking.bsky.social

Our results suggest dynamic depth can be a more efficient form of test-time compute than chain of thought (at least for regular languages). While CoT would use ~n steps to recognize regular languages up to length n, looped transformers only need ~log n depth

March 7, 2025 at 4:46 PM
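
A rough sketch of the intuition behind the ~n vs. ~log n comparison, not the construction from the paper: a DFA can be run one symbol at a time (n sequential steps, loosely analogous to CoT), or each symbol can be turned into a state-to-state map and adjacent maps composed pairwise in a balanced tree, which finishes in about log2(n) rounds because function composition is associative. The parity DFA and all names below (`DELTA`, `sequential_accepts`, `tree_accepts`) are hypothetical, chosen only for illustration.

```python
# Hypothetical DFA: accepts binary strings with an even number of 1s.
STATES = (0, 1)
START = 0
ACCEPT = {0}
DELTA = {(0, "0"): 0, (0, "1"): 1, (1, "0"): 1, (1, "1"): 0}


def sequential_accepts(s):
    """Run the DFA left to right; returns (accepted, number of steps)."""
    state = START
    steps = 0
    for ch in s:
        state = DELTA[(state, ch)]
        steps += 1
    return state in ACCEPT, steps


def symbol_map(ch):
    """Represent one symbol as a tuple m where m[q] is the next state from q."""
    return tuple(DELTA[(q, ch)] for q in STATES)


def compose(f, g):
    """Map for 'apply f, then g'. Composition is associative."""
    return tuple(g[f[q]] for q in STATES)


def tree_accepts(s):
    """Compose adjacent maps pairwise; returns (accepted, number of rounds)."""
    maps = [symbol_map(ch) for ch in s]
    rounds = 0
    while len(maps) > 1:
        # Each round halves the number of maps (order is preserved).
        nxt = [compose(maps[i], maps[i + 1]) for i in range(0, len(maps) - 1, 2)]
        if len(maps) % 2:
            nxt.append(maps[-1])  # odd leftover carries to the next round
        maps = nxt
        rounds += 1
    final_state = maps[0][START] if maps else START
    return final_state in ACCEPT, rounds


if __name__ == "__main__":
    s = "1" * 1000
    print(sequential_accepts(s))  # steps grow linearly with len(s)
    print(tree_accepts(s))        # rounds grow logarithmically with len(s)
```

Each round of `tree_accepts` could in principle run in parallel across positions, which is why the round count, rather than the total work, is the quantity comparable to depth.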

William Merrill

@lambdaviking.bsky.social

In contrast, both in theory and practice, width must grow exponentially with sequence length to enable regular language recognition. Thus, while slightly increasing depth expands expressive power, increasing width to gain power is intractable!

Graph showing log width is linear in context length (i.e., width is exponential)

March 7, 2025 at 4:46 PM

William Merrill

@lambdaviking.bsky.social

In practice, can transformers learn to solve these problems with log depth?

We find the depth required to recognize strings of length n grows ~ log n with r^2=.93. Thus, log depth appears necessary and sufficient to recognize regular languages in practice, matching our theory

Graph showing depth is linear in log context length

March 7, 2025 at 4:46 PM

William Merrill

@lambdaviking.bsky.social

✨How does the depth of a transformer affect its reasoning capabilities? New preprint by myself and @Ashish_S_AI shows that a little depth goes a long way to increase transformers’ expressive power

We take this as encouraging for further research on looped transformers!🧵

Paper: A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers

March 7, 2025 at 4:46 PM

William Merrill

@lambdaviking.bsky.social

🔥 Old Norse poetry gen

The Vikings call, say now,
OLMo 2, the ruler of languages.
May your words fly over the seas,
all over the world, for you are wise.
Wordsmith, balanced and aligned,
for you the skalds themselves sing,
your soul, which hears new lifeforms,
may it live long and tell a saga.

Víkingar kalla, segja þú nú,
OLMo 2, ríki málanna þinn.
Munu þínar orð fljúga hafra,
Öll um heim, því þú ert vissi.
Málsmiður, mættugur og mjúkaligr,
Fyrir þik skáldar sjálfur kveða,
Sál þíð, sem heyrir nýjan kvikendi,
Munu langt lífið og segja sagan.

November 28, 2024 at 6:19 PM
