With SAMI, we can scale the importance of these modules — either amplifying or suppressing specific concepts.
Using SAMD, we find that only a few attention heads are crucial for a wide range of concepts—confirming the sparse, modular nature of knowledge in transformers.
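To make the intervention concrete, here is a minimal sketch (in PyTorch, not the authors' code) of how one might scale the contribution of a few attention heads in GPT-2: hook the attention output projection and multiply the slices belonging to the chosen heads by a scalar. The layer/head indices and the scale factor below are placeholders, not values from the paper.

```python
# A minimal sketch of a SAMI-style intervention (not the authors' code):
# scale the contribution of a chosen set of attention heads by a scalar.
# The model, the (layer, head) indices, and the scale factor are placeholders.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

MODULE_HEADS = {3: [5], 7: [1, 4], 9: [11]}  # hypothetical layer -> heads
SCALE = 0.1                                  # <1 suppresses, >1 amplifies

head_dim = model.config.n_embd // model.config.n_head

def make_pre_hook(heads):
    # Hook on attn.c_proj: its input is the concatenation of per-head
    # outputs, so scaling a slice scales that head's contribution
    # (up to the projection bias).
    def pre_hook(module, args):
        hidden = args[0].clone()  # (batch, seq, n_embd)
        for h in heads:
            hidden[..., h * head_dim:(h + 1) * head_dim] *= SCALE
        return (hidden,)
    return pre_hook

handles = [
    model.transformer.h[layer].attn.c_proj.register_forward_pre_hook(
        make_pre_hook(heads)
    )
    for layer, heads in MODULE_HEADS.items()
]

inputs = tokenizer("The Eiffel Tower is located in", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(out[0]))

for handle in handles:
    handle.remove()  # restore the unmodified model
```

A scale below 1 suppresses whatever those heads encode; a scale above 1 amplifies it.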
Yu et al.?
And add more papers to the thread!
Statistical and Computational Concerns", Gastaldi et al. take first steps towards defining what a tokenizer should be and the properties it ought to have.
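As a rough illustration of the kind of object being formalized (the notation here is chosen for this sketch, not taken from the paper): a tokenizer can be viewed as a pair of maps, an encoder from character strings to token strings and a decoder going back, and one natural property to ask for is that decoding undoes encoding.

```latex
% Illustrative only: notation chosen here, not taken from Gastaldi et al.
A tokenizer over a character alphabet $\Sigma$ and a token vocabulary $\Delta$
can be modelled as a pair of maps
\[
  \tau \colon \Sigma^{*} \to \Delta^{*} \quad\text{(encoder)},
  \qquad
  \kappa \colon \Delta^{*} \to \Sigma^{*} \quad\text{(decoder)}.
\]
One natural property to require is that decoding undoes encoding,
\[
  \kappa(\tau(s)) = s \quad \text{for all } s \in \Sigma^{*},
\]
and, statistically, that a model of the induced distribution over $\Delta^{*}$,
pulled back through $\kappa$, recovers the distribution over $\Sigma^{*}$
one actually cares about.
```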