Lightnews — Scholar-powered news

eigenblake

@eigenblake.bsky.social

When you reframe it as "Let's work backwards from the reward and tile a bunch of trajectories of the configuration space with a bunch of tiny supervised classifiers that nudge our initial state towards the reward" then it's more grounded and less mystical, but just as interesting imho

January 1, 2026 at 12:56 AM

eigenblake

@eigenblake.bsky.social

www.instagram.com/reel/DMp3KwZ...

Aaaand etymologynerd is on the case

Instagram

Create an account or log in to Instagram - Share what you're into with the people who get you.

www.instagram.com

July 29, 2025 at 4:13 AM

eigenblake

@eigenblake.bsky.social

And the wonderful thing, of course, is that this is beautifully empirically verifiable as a hypothesis. I realized I wrote about it here in a perplexity based sense, but keywords alone would probably be enough to get a statistical signal bsky.app/profile/eige...

eigenblake @eigenblake.bsky.social · Apr 21

In today's episode of "thought-crimes to send me to science purgatory"

1. Observation: Some content I can do up to 2.5x, some 1x
2. Hypothesis: You could probably use PLM perplexity on past content to approximate human encoding
3. Test: Is retention constant after perplexity normalization

July 28, 2025 at 4:38 AM

eigenblake

@eigenblake.bsky.social

The possibilities are endless for visualizing the timeline of agents completing tasks, the resources created, etc. The timeline and list of actions could be extremely interesting to traverse.
www.youtube.com/watch?v=2YYj...

The Chaos of AI Agents

YouTube video by Emergent Garden

www.youtube.com

July 26, 2025 at 6:08 PM

eigenblake

@eigenblake.bsky.social

What if you used something of this type to start formalizing a notion of inferential distance based on greedy topological "concept" sorting. E.g. from some starting point, what is the set and order of wikipedia articles to maximize retention over the contents of some target article?

June 28, 2025 at 6:49 PM

eigenblake

@eigenblake.bsky.social

"And the award for irresponsible baseless speculation of the year goes to..."

I don't think you could never make fun of me. I'm like a living parody of myself.

April 21, 2025 at 5:53 AM

eigenblake

@eigenblake.bsky.social

Semantic has been used to describe everything all of
1. Basic Stemming
2. Thesaurus/Keyword Expansion
3. Word-Level Learned Latent Semantic Vector Embeddings
4. Chunk Level, Semantic Document Embeddings

And I don't know how to feel about that. I'll keep thinking about it.

April 19, 2025 at 6:20 PM

eigenblake

@eigenblake.bsky.social

Turns out this is not as speculative as I thought... arxiv.org/html/2404.13...
seems to strongly support the idea that models can be trained to recognize these patterns. May still be a paper here up for grabs. How do these methods compare in terms of sample-efficiency?

The Instruction Hierarchy:Training LLMs to Prioritize Privileged Instructions

arxiv.org

March 22, 2025 at 5:33 AM

eigenblake

@eigenblake.bsky.social

Seems like this could be a legitimate alternative to classical and ML-based peripheral prompt injection mitigations. The key difference being the model learns in its very weights to only follow instructions if it has the magic tokens granting access to that directive.

March 22, 2025 at 4:58 AM

eigenblake

@eigenblake.bsky.social

So maybe you insert these privilege markers every few tokens, or once per sentence. Ring 0 for System, Ring 1 for host, Ring 2 for user and ring 3 for everything else. You don't actually need to keep the particular token id private, as long as you filter it out of input to the model based on user.

March 22, 2025 at 4:56 AM

eigenblake

@eigenblake.bsky.social

This differs from the system-user dichotomy because the model will have completely different tokenids and completely different internal representations. My intuition says the tradeoff here is you give up some compute and get security in return. But a 2x or 3x vocab size would not be worth it.

March 22, 2025 at 4:42 AM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news