_ - \.
crumb.bsky.social
lauren (or crumb) // machine // She-E-Ey
hf.co/crumb
have been revisiting this a lot
youtu.be/0BVM0UC28nY
September 30, 2025 at 1:43 AM
even tho we trained on filtered data generated by deepseek v3 base, our desc2doc model didn't follow prompts as well as we'd hoped. so last night i pounded out a rubric based trainer using deepseek v3.1 (:free) as judge. it is now running. yaaay
September 29, 2025 at 7:05 PM
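a rough sketch of what a rubric-based reward with a judge can look like. everything here is an assumption for illustration: the rubric criteria, the weights, and `stub_judge` (a keyword-overlap stand-in where a real setup would call the judge model) are hypothetical, not the trainer from the post.

```python
from typing import Callable, Dict

# Hypothetical rubric: criterion name -> weight. Not the rubric from the post.
RUBRIC: Dict[str, float] = {
    "follows_description": 0.5,
    "stays_on_topic": 0.3,
    "well_formed": 0.2,
}

# A judge maps (prompt, output, criterion) to a score, ideally in [0, 1].
Judge = Callable[[str, str, str], float]


def rubric_reward(prompt: str, output: str, judge: Judge,
                  rubric: Dict[str, float] = RUBRIC) -> float:
    """Weighted average of per-criterion judge scores, clipped to [0, 1]."""
    total_weight = sum(rubric.values())
    score = 0.0
    for criterion, weight in rubric.items():
        raw = judge(prompt, output, criterion)
        clipped = min(1.0, max(0.0, raw))  # guard against out-of-range judge scores
        score += weight * clipped
    return score / total_weight


def stub_judge(prompt: str, output: str, criterion: str) -> float:
    """Stand-in for an LLM judge call (assumption: a real judge would be a model)."""
    if criterion == "follows_description":
        words = set(prompt.lower().split())
        hits = sum(1 for w in words if w in output.lower())
        return hits / max(1, len(words))
    # Every other criterion: only checks that the output is non-empty.
    return 1.0 if output.strip() else 0.0


reward = rubric_reward("a poem about cats", "a short poem about cats", stub_judge)
```

the scalar that comes back is what a trainer would use as the per-sample reward. swapping `stub_judge` for a real model call only needs the same three-argument shape.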
took you long enough Dumb Ass
September 18, 2025 at 3:29 AM
September 13, 2025 at 10:49 PM
🐱
September 10, 2025 at 9:48 PM
lets go man fuck em up 𝔱𝔬𝔲𝔤𝔥-𝔡𝔯𝔞𝔤𝔬𝔫-₂₅₈
ETA 83:50:08
September 3, 2025 at 11:33 PM
subtracting "lamb" embed from mary had a little lamb embed then decoding... it tries to say it but it just cant get it right... that's so silly...
September 2, 2025 at 5:17 AM
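a toy version of the subtract-then-decode experiment. assumptions, loudly: the 3-d vectors are made up, a phrase embedding is just the sum of its word vectors, and "decoding" is a nearest-neighbor lookup by cosine similarity, swapped in for the learned decoder the real model uses.

```python
import math

# Made-up 3-d "embeddings", one axis per word, purely for illustration.
TOY = {
    "mary": [1.0, 0.0, 0.0],
    "little": [0.0, 1.0, 0.0],
    "lamb": [0.0, 0.0, 1.0],
}


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def embed(text):
    """Sum the vectors of known words; unknown words are skipped."""
    vec = [0.0, 0.0, 0.0]
    for word in text.lower().split():
        if word in TOY:
            vec = add(vec, TOY[word])
    return vec


def nearest(vec):
    """Rank vocabulary words by cosine similarity to vec, best first."""
    return sorted(((w, cosine(vec, v)) for w, v in TOY.items()),
                  key=lambda pair: pair[1], reverse=True)


phrase = embed("mary had a little lamb")
residual = sub(phrase, TOY["lamb"])  # remove the "lamb" direction
ranked = nearest(residual)           # "lamb" should now rank last
```

in this toy the subtraction removes "lamb" cleanly because the word vectors are orthogonal. real embeddings are not, which is one plausible reason a decoder still half-tries to say the word.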
trying strange things
September 2, 2025 at 4:54 AM
okokokok it's on HF as it is RN, it seems really good but it will keep on improving for a little while,
encourage you to try it out and see if you can figure out any fun things to use it for
hf.co/crumb/essenc...
September 2, 2025 at 4:40 AM
we want to do 8b but that requires offloading to CPU in our case which is just... not gonna cut it when the training time is going to start being in the 10ks of steps
August 28, 2025 at 6:35 PM
it took a bit of tinkering. crumb had posted this on 🐦 2 days ago
August 28, 2025 at 6:35 PM
this one is for the freaks, have u ever wanted a text2vec2text that (1) doesn't rely on api embeddings and (2) preserves temporal dynamics by design?

crumb has found crumbself in a position in need of some of these, so crumb is just building them. 32 token embedding. total 6b model system (WIP results)
August 28, 2025 at 6:33 PM
and cogview.. remember cogview
August 25, 2025 at 3:36 PM
and the beginning of an RNN crumb was training for Q/A around same time, rage quit after large N runs or class period ended and had to close chrome book LOL
August 25, 2025 at 3:36 PM
BAM
August 25, 2025 at 3:36 PM
crumb found a trove of stuff crumb generated in 2019
August 25, 2025 at 3:36 PM
why does huggingface have no thumbs down react
August 19, 2025 at 5:52 PM
crumb got it working on qwen 2.5 32b for:
- llm response
- llm prompt
- llm conversation
- samples from dclm
- samples from textfiles dot com

needs a little tuning and then can be specialized into many many things (again, crumb excited for rl on "llm prompt")
August 18, 2025 at 6:10 AM
August 9, 2025 at 12:35 AM
yea
August 8, 2025 at 11:44 PM
phi-4 doesn't even exhibit this
August 7, 2025 at 9:19 AM
where the token embeddings lie for some ~30b models and then gpt-oss 20b, peep the scale, that is Not normal
August 7, 2025 at 9:11 AM
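one way to put a number on "peep the scale" is to compare per-token L2 norms of the embedding matrices. a minimal sketch, assuming the matrix is available as a list of rows. the random matrices and their standard deviations below are invented stand-ins, not measurements of any model.

```python
import math
import random
import statistics


def row_norms(matrix):
    """L2 norm of each token's embedding row."""
    return [math.sqrt(sum(x * x for x in row)) for row in matrix]


def summarize(matrix):
    """Scale statistics for an embedding matrix given as a list of rows."""
    norms = row_norms(matrix)
    return {
        "mean_norm": statistics.fmean(norms),
        "max_norm": max(norms),
        "max_abs": max(abs(x) for row in matrix for x in row),
    }


def fake_embeddings(n_tokens, dim, std, seed=0):
    """Gaussian stand-in for a real embedding table (assumption, not real weights)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(n_tokens)]


typical = summarize(fake_embeddings(200, 16, std=0.02))
outlier = summarize(fake_embeddings(200, 16, std=2.0))
ratio = outlier["mean_norm"] / typical["mean_norm"]
```

with real checkpoints the rows would come from the model's input embedding table instead of `fake_embeddings`. the same three numbers make the scale gap comparable across models.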
August 3, 2025 at 5:30 PM
is up
August 2, 2025 at 11:26 PM
and crumb doing that description -> document thing for automating control vectors bc it feels insane that it hadn't heard of any besides in one paper out of anthropic where they plant synth docs in training data so it learns it would want to lie or whatever?

step one (after hand-labeling):
August 2, 2025 at 2:06 PM
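for context, a common way to build a control vector is the difference of mean hidden activations between text that shows a trait and text that doesn't. a minimal sketch of that general technique, assuming the activations are already extracted as plain lists. it is not confirmed to be the method used here, and `control_vector` and `steer` are hypothetical names.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def control_vector(positive, negative):
    """Difference of means: activations with the trait minus activations without."""
    pos_mean = mean_vector(positive)
    neg_mean = mean_vector(negative)
    return [p - q for p, q in zip(pos_mean, neg_mean)]


def steer(hidden, vector, strength):
    """Shift a hidden state along the control vector by the given strength."""
    return [h + strength * v for h, v in zip(hidden, vector)]


# Toy activations: in practice these would come from documents generated
# for a description (positive) and from contrasting documents (negative).
vec = control_vector([[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [2.0, 2.0]])
shifted = steer([0.0, 0.0], vec, strength=0.5)
```

the description -> document step would be what supplies the two sets of text automatically, instead of writing contrast pairs by hand.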