Timur Galimzyanov
@galtimur.bsky.social
ML Researcher at JetBrains.
NLP, ML for code.
While this is an intriguing approach to advancing transformers, note a major drawback: high latency due to byte-level decoding, which requires many more sequential operations. This issue is mentioned in the limitations but is notably avoided in the main text.
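A rough back-of-envelope sketch of why the sequential step count grows (my own numbers, not from the paper; the assumed bytes-per-token average and output size are illustrative):

```python
# Compare the number of sequential decoding steps needed to emit the same
# output with a token-level LLM vs. a byte-level one.
# Assumption: ~4.5 bytes per token on average for a typical BPE vocabulary.

avg_bytes_per_token = 4.5   # assumed average, not a figure from the paper
output_bytes = 2_000        # e.g. a ~2 KB model response

token_steps = output_bytes / avg_bytes_per_token  # sequential forward passes, token-level
byte_steps = output_bytes                         # sequential forward passes, byte-level

print(f"token-level steps: {token_steps:.0f}")
print(f"byte-level steps:  {byte_steps} (~{byte_steps / token_steps:.1f}x more)")
```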

ai.meta.com/research/pub...
Byte Latent Transformer: Patches Scale Better Than Tokens | Research - AI at Meta
We introduce the Byte Latent Transformer (BLT), a new byte-level LLM architecture that, for the first time, matches tokenization-based LLM performance at...
December 16, 2024 at 4:21 PM