eigenblake
eigenblake.bsky.social
eigenblake
@eigenblake.bsky.social
SWE near NYC
When you reframe it as "Let's work backwards from the reward and tile a bunch of trajectories of the configuration space with a bunch of tiny supervised classifiers that nudge our initial state towards the reward" then it's more grounded and less mystical, but just as interesting imho
January 1, 2026 at 12:56 AM
www.instagram.com/reel/DMp3KwZ...

Aaaand etymologynerd is on the case
Instagram
Create an account or log in to Instagram - Share what you're into with the people who get you.
www.instagram.com
July 29, 2025 at 4:13 AM
And the wonderful thing, of course, is that this is beautifully empirically verifiable as a hypothesis. I realized I wrote about it here in a perplexity based sense, but keywords alone would probably be enough to get a statistical signal bsky.app/profile/eige...
In today's episode of "thought-crimes to send me to science purgatory"

1. Observation: Some content I can do up to 2.5x, some 1x
2. Hypothesis: You could probably use PLM perplexity on past content to approximate human encoding
3. Test: Is retention constant after perplexity normalization
July 28, 2025 at 4:38 AM
The possibilities are endless for visualizing the timeline of agents completing tasks, the resources created, etc. The timeline and list of actions could be extremely interesting to traverse.
www.youtube.com/watch?v=2YYj...
The Chaos of AI Agents
YouTube video by Emergent Garden
www.youtube.com
July 26, 2025 at 6:08 PM
What if you used something of this type to start formalizing a notion of inferential distance based on greedy topological "concept" sorting. E.g. from some starting point, what is the set and order of wikipedia articles to maximize retention over the contents of some target article?
June 28, 2025 at 6:49 PM
"And the award for irresponsible baseless speculation of the year goes to..."

I don't think you could never make fun of me. I'm like a living parody of myself.
April 21, 2025 at 5:53 AM
Semantic has been used to describe everything all of
1. Basic Stemming
2. Thesaurus/Keyword Expansion
3. Word-Level Learned Latent Semantic Vector Embeddings
4. Chunk Level, Semantic Document Embeddings

And I don't know how to feel about that. I'll keep thinking about it.
April 19, 2025 at 6:20 PM
Turns out this is not as speculative as I thought... arxiv.org/html/2404.13...
seems to strongly support the idea that models can be trained to recognize these patterns. May still be a paper here up for grabs. How do these methods compare in terms of sample-efficiency?
The Instruction Hierarchy:Training LLMs to Prioritize Privileged Instructions
arxiv.org
March 22, 2025 at 5:33 AM
Seems like this could be a legitimate alternative to classical and ML-based peripheral prompt injection mitigations. The key difference being the model learns in its very weights to only follow instructions if it has the magic tokens granting access to that directive.
March 22, 2025 at 4:58 AM
So maybe you insert these privilege markers every few tokens, or once per sentence. Ring 0 for System, Ring 1 for host, Ring 2 for user and ring 3 for everything else. You don't actually need to keep the particular token id private, as long as you filter it out of input to the model based on user.
March 22, 2025 at 4:56 AM
This differs from the system-user dichotomy because the model will have completely different tokenids and completely different internal representations. My intuition says the tradeoff here is you give up some compute and get security in return. But a 2x or 3x vocab size would not be worth it.
March 22, 2025 at 4:42 AM