Reposted by @npchen.bsky.social
⚠️ Readable Paper Alert ⚠️

BLT: what if we just got rid of tokenization?

Result:

* text looks a lot like audio, video, or PDF: it's all just bytes
* compute is allocated dynamically based on input difficulty
* new scaling axis (patch size)
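
The "dynamic compute" idea comes from variable-length byte patches: a new patch starts wherever the next byte is hard to predict. A minimal sketch of that splitting step, using a unigram byte-frequency model as a stand-in for the small byte-level LM the paper actually uses (the `threshold` value and helper name are illustrative assumptions):

```python
import math
from collections import Counter

def patch_bytes(data: bytes, threshold: float = 4.0) -> list[bytes]:
    """Split a byte stream into variable-length patches.

    BLT starts a new patch where a small byte-level LM's next-byte
    entropy spikes; here a unigram frequency estimate stands in for
    that LM, so this is a sketch of the idea, not the paper's method.
    """
    counts = Counter(data)
    total = len(data)
    # Per-byte "surprise": -log2 p(byte) under the unigram model.
    surprise = {b: -math.log2(c / total) for b, c in counts.items()}

    patches, current = [], bytearray()
    for b in data:
        # High surprise -> hard-to-predict byte -> open a new patch.
        if current and surprise[b] > threshold:
            patches.append(bytes(current))
            current = bytearray()
        current.append(b)
    if current:
        patches.append(bytes(current))
    return patches

# A rare byte in a run of common ones triggers a patch boundary.
print(patch_bytes(b"aaaaaaaaQaaaaaaaa"))
```

Predictable runs collapse into long patches (cheap), while surprising regions get short patches (more compute per byte), which is also what makes patch size a tunable scaling axis.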

ai.meta.com/research/pub...
Byte Latent Transformer: Patches Scale Better Than Tokens | Research - AI at Meta
We introduce the Byte Latent Transformer (BLT), a new byte-level LLM architecture that, for the first time, matches tokenization-based LLM performance at...
December 14, 2024 at 4:12 PM