⡽⠽⡒⠖⣂⢚⡅⣓⢯⢋⠊⡈⠎⣝⣅⢙
#promethean
◆ HEX: 16 letters, 32-letter words: 16³² = 2¹²⁸
◆ DEX: 256 letters, 16-letter words: 256¹⁶ = 2¹²⁸
They're equivalent, but DEX is twice as compact.
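To make the size math concrete, here's a minimal Python sketch. The byte-to-Braille mapping (U+2800 plus the byte value) is my own assumption for illustration; the thread doesn't pin down DEX's actual letter order.

import secrets

# Both alphabets span the same 2^128 space.
assert 16**32 == 256**16 == 2**128

word = secrets.token_bytes(16)  # a random 128-bit "word"

hex_form = word.hex()                              # HEX: 16-letter alphabet
dex_form = "".join(chr(0x2800 + b) for b in word)  # assumed DEX: byte -> 8-dot Braille (U+2800-U+28FF)

print(len(hex_form), hex_form)  # 32 letters
print(len(dex_form), dex_form)  # 16 letters, half as long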
◆ Cloaks the pattern
◆ Keeps the word recognizable via a fast formula
The goal of hostility is to throw off bulk corpus analytics and to force pre-processing, so the raw text can't be trained on directly.
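The thread doesn't give the actual formula, so here's a purely hypothetical sketch of the idea: XOR each byte of a word with its position. Repeated letters stop looking repeated (the pattern is cloaked), but the original is recoverable with the same cheap, reversible pass.

def cloak(word: bytes) -> bytes:
    # Hypothetical "fast formula": XOR each byte with its index,
    # so identical letters at different positions encode differently.
    return bytes(b ^ (i & 0xFF) for i, b in enumerate(word))

def uncloak(cloaked: bytes) -> bytes:
    # XOR is its own inverse, so recovery is the same one-pass formula.
    return bytes(b ^ (i & 0xFF) for i, b in enumerate(cloaked))

assert uncloak(cloak(b"aaaaaaaa")) == b"aaaaaaaa"
print(cloak(b"aaaaaaaa"))  # b'a`cbedgf': the repetition is hidden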
◆ The yelling woman → The #conlang community
◆ The cat → Me using linguistic terms incorrectly
Template parameters are also an open class: someone could add another for "The friend holding the yelling woman back".
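Treating the template as data makes the open-class point concrete: the parameters are just named slots that anyone can extend. A tiny sketch (slot names are mine):

# Hypothetical parameter table for the meme template.
template_params = {
    "yelling_woman": "The #conlang community",
    "cat": "Me using linguistic terms incorrectly",
}

# Open class: a new parameter can be added at any time.
template_params["restraining_friend"] = "The friend holding the yelling woman back"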
◆ HEX: multiplies tokens by ~2x per character
◆ DEX: multiplies tokens by ~12x per character
Simply by using these alphabets, we're increasing AI inference costs by 2-12x. That's a pretty good start!
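You can sanity-check the multipliers with OpenAI's tiktoken library; actual ratios vary by tokenizer and text, and the byte-to-Braille mapping is again my own assumption:

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # GPT-3.5 / GPT-4 tokenizer

plain = "Simply by using these alphabets, we're increasing AI inference costs."
hexed = plain.encode("utf-8").hex()
dexed = "".join(chr(0x2800 + b) for b in plain.encode("utf-8"))

for label, text in [("plain", plain), ("HEX", hexed), ("DEX", dexed)]:
    print(label, len(enc.encode(text)), "tokens")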
DEX is composed of 8-dot Braille, which is so rare that each letter is usually a token by itself!
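A quick way to check the per-letter cost (cl100k_base assumed; results vary by model):

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# A 16-letter DEX word built from the Braille Patterns block (assumed mapping).
dex_word = "".join(chr(0x2800 + b) for b in bytes(range(16)))
print(len(dex_word), "letters ->", len(enc.encode(dex_word)), "tokens")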
The tokenizer uses an extremely efficient encoding, so 1 token generally works out to ~3/4 of a word. Common English words are almost always their own token.
You can play around with it here:
platform.openai.com/tokenizer
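If you'd rather check from code than the web page, the same kind of count is available via tiktoken (cl100k_base assumed):

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

sentence = "Common English words are almost always their own token."
print(len(sentence.split()), "words ->", len(enc.encode(sentence)), "tokens")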
◆ Driving up token counts and compute costs by ~2-3 orders of magnitude
◆ Sparking more hallucinations
◆ Exceeding a model's effective context window
Here there be dragons.