We prove optimal tokenization is NP-hard on bounded alphabets (like bytes), even on a unary alphabet for direct tokenization!
Big thanks @tpimentel.bsky.social, @philipwitti.bsky.social & Dennis Komm for the mentorship! Best birthday gift. 🎂
arxiv.org/abs/2511.15709