placeholder720.bsky.social
@placeholder720.bsky.social
"enriched" forward pass
November 17, 2025 at 6:03 PM
makes sense, models can totally smuggle information in the kv cache across token indices, but if we suppose some intermediate computation is completely independent from the token we emit, then this info can't participate in any of the more complex stuff a la arxiv.org/abs/2402.12875 - it's just an
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
Instructing the model to generate a sequence of intermediate steps, a.k.a., a chain of thought (CoT), is a highly effective method to improve the accuracy of large language models (LLMs) on arithmetic...
arxiv.org
November 17, 2025 at 6:02 PM
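As a toy illustration of the serial-depth point above (mine, not from the thread; the layer count, function names, and numbers are assumptions, and the bound is the informal one from the linked paper, not a formal statement):

```python
# Toy accounting sketch: why a value that only ever lives in the KV cache gets
# a bounded amount of serial computation, while emitting it as a
# chain-of-thought token does not. Layer count and names are assumptions.

N_LAYERS = 32

def latent_serial_depth(write_layer: int, n_future_tokens: int) -> int:
    # Info written to the residual stream at `write_layer` of some position can
    # only be picked up by attention at strictly later layers of later
    # positions, so each cross-token hop moves it *up* the stack and it never
    # re-enters layer 0. The serial depth applied on top of it is bounded by
    # the layers remaining above the write, no matter how long generation runs
    # (`n_future_tokens` is deliberately unused to make that point).
    return N_LAYERS - write_layer

def cot_serial_depth(n_cot_tokens: int) -> int:
    # If the value is instead emitted as tokens and re-embedded, every emitted
    # token restarts at layer 0, so serial depth grows with the CoT length --
    # the extra power the arxiv.org/abs/2402.12875 result is about.
    return N_LAYERS * (n_cot_tokens + 1)

print(latent_serial_depth(write_layer=16, n_future_tokens=10_000))  # 16, fixed
print(cot_serial_depth(n_cot_tokens=10))                            # 352, grows
```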
If I am an expert in layer 16 of 32 of a vanilla transformer and realize that my job at some token is to compute some sum so that it can be used down the line, I can do that, and then any attention head in layer 16 can deposit that info to any future token without any intermediate unembeddings, right?
November 17, 2025 at 5:41 PM
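A minimal sketch of the mechanism in the question above (my illustration, not the actual model; sizes, random weights, the `deposit` hook, and the 0-indexed layer numbering are assumptions): a value added to the residual stream at layer 16 of one position lands in the cached keys/values that later layers compute for that position, so attention at later layers of future positions can read it with no unembedding in between.

```python
# Minimal sketch: a toy decoder-only loop with a per-layer KV cache. The
# `deposit` argument stands in for the "sum" the layer-16 expert computes;
# everything else (sizes, random weights) is made up for illustration.
import numpy as np

N_LAYERS, D = 32, 64
rng = np.random.default_rng(0)
Wq = [rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(N_LAYERS)]
Wk = [rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(N_LAYERS)]
Wv = [rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(N_LAYERS)]
kv_cache = [([], []) for _ in range(N_LAYERS)]  # (keys, values) per layer

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def forward_one_position(x, deposit=None):
    """Run one new position through the stack, attending over the cache.

    `deposit` is added to the residual stream right after layer 16, like an
    expert writing its sum. Because keys/values at layer L are computed from
    the residual stream entering layer L, the deposit is visible in the cached
    K/V of layers 17..31 for this position -- so attention heads in those
    layers at *future* positions can read it directly, with no unembedding or
    re-embedding in between.
    """
    h = x  # residual stream for this position
    for layer in range(N_LAYERS):
        # cache this position's key/value for the layer, then attend over
        # everything cached so far (earlier positions + this one)
        kv_cache[layer][0].append(Wk[layer] @ h)
        kv_cache[layer][1].append(Wv[layer] @ h)
        K = np.stack(kv_cache[layer][0])
        V = np.stack(kv_cache[layer][1])
        h = h + V.T @ softmax(K @ (Wq[layer] @ h) / np.sqrt(D))
        if layer == 16 and deposit is not None:
            h = h + deposit  # the "expert" writes its sum into the residual stream
    return h

# position t: the layer-16 expert deposits a value
marker = np.zeros(D); marker[0] = 7.0
forward_one_position(rng.standard_normal(D), deposit=marker)
# position t+1: layers 17..31 attend over the cache and can pick the value up,
# even though no token carrying it was ever emitted
_ = forward_one_position(rng.standard_normal(D))
```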
I bet Heath Ceramics would do it www.youtube.com/watch?v=v678...
Wrong Turn on the Dragon - Numberphile
YouTube video by Numberphile
www.youtube.com
November 8, 2025 at 8:07 PM
Interesting, what about Muon/Shampoo or other spectrum-y ones?
September 3, 2025 at 4:40 AM
Reposted
modern ai is basically a bunch of rogue google employees taking google projects that were done pretty cautiously and making them less cautious
August 23, 2025 at 4:41 PM
Some days the Iliad being about Helen just becomes a lot more believable.
August 6, 2025 at 10:53 PM
🎯
November 30, 2024 at 6:23 PM