Anej Svete
@anejsvete.bsky.social
PhD student in NLP at ETH Zurich.

anejsvete.github.io
6/ The work refines the landscape of transformer expressivity and demonstrates that seemingly minor implementation details can have major theoretical consequences for what neural architectures can represent.
May 17, 2025 at 2:32 PM
5/ This might help explain why positional encodings that skew attention toward recent (rightmost) tokens—like ALiBi—work so well in practice. They're compensating for an inherent limitation in conventional attention mechanisms.
May 17, 2025 at 2:32 PM
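For context on the ALiBi mention above, here is a minimal sketch of an ALiBi-style linear bias (the function name, toy size, and slope value are illustrative, not from the thread): each head subtracts a penalty proportional to query-key distance, so the most recent (rightmost) visible positions receive the largest scores.

```python
# Illustrative sketch of an ALiBi-style additive bias; names and values are made up.
import numpy as np

def alibi_bias(n: int, slope: float) -> np.ndarray:
    """Return an (n, n) additive bias: -slope * (i - j) for keys j <= i, -inf for future keys."""
    i = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    bias = -slope * (i - j).astype(float)
    bias[j > i] = -np.inf  # future masking: queries cannot attend to later positions
    return bias

print(alibi_bias(4, slope=0.5))
# Row i = 3: [-1.5, -1.0, -0.5, 0.0] -> the rightmost (most recent) key gets the largest bias.
```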
4/ Here's why this matters: leftmost-tiebreaking transformers are actually equivalent to soft-attention transformers in terms of expressivity! This suggests they might better approximate real-world transformers than rightmost-tiebreaking models.
May 17, 2025 at 2:32 PM
3/ Specifically, we show that leftmost-tiebreaking models correspond to a strictly weaker fragment of Linear Temporal Logic (LTL). While rightmost tiebreaking enables the full power of LTL, leftmost models are limited to the "past" fragment.
May 17, 2025 at 2:31 PM
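For a sense of what the "past" fragment means (standard LTL notation; these example formulas are illustrative, not taken from the paper): past operators only look backward over the prefix read so far, while full LTL also has forward-looking operators.

```latex
% Standard LTL operators, shown for illustration only; not formulas from the paper.
% Past fragment (backward-looking operators only):
%   "an a has occurred, and no b since then"
\varphi_{\text{past}} = \neg b \;\mathsf{S}\; a   % S = "since"
% Full LTL (also has forward-looking operators):
%   "a holds until b occurs"
\varphi_{\text{full}} = a \;\mathsf{U}\; b        % U = "until"
```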
2/ We analyzed future-masked unique hard attention transformers and found that those with leftmost tiebreaking are strictly less expressive than those with rightmost tiebreaking. The "Tale of Two Sides" in the title nicely captures how these two models differ.
May 17, 2025 at 2:31 PM
1/ When multiple positions achieve the maximum attention score in a transformer, we need a tiebreaking mechanism. Should we pick the leftmost or rightmost position? It turns out this seemingly trivial implementation detail dramatically affects what transformers can express!
May 17, 2025 at 2:30 PM
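A minimal sketch of the mechanism being contrasted (the function name, NumPy implementation, and constant-score toy example are mine, not from the paper): with future masking, each query sees only positions up to itself, and the tiebreaking rule decides which of the tied maximal positions receives all the attention.

```python
# Illustrative sketch of future-masked unique hard attention with two tiebreaking rules.
import numpy as np

def unique_hard_attention(scores: np.ndarray, tiebreak: str = "rightmost") -> np.ndarray:
    """For each query position i, select exactly one position j <= i (future masking)
    with a maximal score; ties go to the leftmost or rightmost such j."""
    n = scores.shape[0]
    selected = np.zeros(n, dtype=int)
    for i in range(n):
        visible = scores[i, : i + 1]                 # future masking: only j <= i are visible
        candidates = np.flatnonzero(visible == visible.max())
        selected[i] = candidates[0] if tiebreak == "leftmost" else candidates[-1]
    return selected

# Toy example: constant scores make every visible position a tie.
scores = np.zeros((4, 4))
print(unique_hard_attention(scores, "leftmost"))   # [0 0 0 0] -> always the first token
print(unique_hard_attention(scores, "rightmost"))  # [0 1 2 3] -> always the current token
```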