Lightnews — Scholar-powered news

Mike Hearn

@mikehearn.bsky.social

Somehow every BlueSky poster knows how LLMs work, meanwhile Anthropic researchers are releasing 35k-word papers meticulously analyzing the internals and still concluding that they don't really know how they work.

transformer-circuits.pub/2025/attribu...

On the Biology of a Large Language Model

We investigate the internal mechanisms used by Claude 3.5 Haiku — Anthropic's lightweight production model — in a variety of contexts, using our circuit tracing methodology.

transformer-circuits.pub

August 8, 2025 at 2:34 PM

Mike Hearn

@mikehearn.bsky.social

This is a failure of the new GPT-5 router more than anything. We know reasoning models can correctly answer this question 100% of the time, the router just isn't sophisticated enough to understand that this question, while superficially simple, actually requires reasoning. It's a fixable problem.

August 8, 2025 at 12:36 PM

Mike Hearn

@mikehearn.bsky.social

Ok, apparently they considered that and decided (correctly) that wasting effort on this narrow and manufactured problem wasn’t worth it. Gotta just accept the online dunks from people that know enough to trick the LLM but not enough to understand why the trick works. bsky.app/profile/schm...

Schmidt @schmidt.lovely.social · Aug 8

We considered adding a super trivial — but kinda silly — system prompt to fix this. It worked, but it didn’t seem worth it. (Nobody who’s not terminally-online knows or cares.)

Every tool has its limitations! If you’re gonna use it, you should know them.

Quantian @quantian.bsky.social · Aug 8

We have a lot of fun tripping up AI with this, but asking it to parse a word by individual letters is kind of a nonsensical question given how tokenizers operate. It's like asking a Chinese speaker how many G's are in 中国, that's not how they process language.

August 8, 2025 at 12:16 PM
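
The tokenizer point in the quoted post can be made concrete with a toy sketch. Everything here is an illustrative assumption: the vocabulary, the integer IDs, the example word, and the greedy longest-match rule are all made up, and production tokenizers (typically BPE variants with much larger learned vocabularies) split text differently. The only thing it shows is that a model receives a short sequence of token IDs, while a letter count needs character-level access to the string.

```python
# Hypothetical subword vocabulary with made-up integer IDs (not from any real model).
TOY_VOCAB = {
    "straw": 101, "berry": 102,
    "s": 1, "t": 2, "r": 3, "a": 4, "w": 5, "b": 6, "e": 7, "y": 8,
}


def toy_tokenize(word, vocab=TOY_VOCAB):
    """Split `word` into vocabulary pieces using greedy longest-match."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # No piece, not even a single character, matched at position i.
            raise ValueError(f"no vocabulary piece covers {word[i]!r}")
    return pieces


word = "strawberry"                          # illustrative word, an assumption
pieces = toy_tokenize(word)                  # what the toy tokenizer produces
token_ids = [TOY_VOCAB[p] for p in pieces]   # what a model would actually receive
letter_count = word.count("r")               # needs the raw characters

print(pieces, token_ids, letter_count)
```

In this toy setup the word becomes two opaque IDs, so nothing in the model's input directly says how many times a given letter appears; that information has to be recovered some other way, for example by reasoning through the spelling step by step.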

Mike Hearn

@mikehearn.bsky.social

I like ChatGPT.

July 25, 2025 at 3:19 PM

Mike Hearn

@mikehearn.bsky.social

I love that this hypothetical guy immediately made a terrible financial decision on his rent payments. I agree with you, this guy's gonna have a hard time.

July 1, 2025 at 4:51 PM

Mike Hearn

@mikehearn.bsky.social

I dislike Scott Adams as a person, his opinions, etc. but I did watch his announcement (the first thing of his I've ever seen) and he was pretty clear that he tried it in the course of leaving no stone unturned. He said he & his dr didn't think it would work, but there were no downsides, so why not.

May 20, 2025 at 1:39 AM

Mike Hearn

@mikehearn.bsky.social

This is awesome.

May 18, 2025 at 9:10 PM

Mike Hearn

@mikehearn.bsky.social

By "inside" I mean the billions (trillions?) of parameters, activations, attention patterns etc. that are poked and prodded in interpretability studies. No one fully understands how those things work together to produce the model outputs. transformer-circuits.pub/2025/attribu...

May 14, 2025 at 2:40 AM

Mike Hearn

@mikehearn.bsky.social

If you have a perfect understanding of how LLMs work, you should contact the authors of this paper, tell them you have the answers, and collect your millions from the AI lab of your choosing. transformer-circuits.pub/2025/attribu...

May 14, 2025 at 1:59 AM

Mike Hearn

@mikehearn.bsky.social

I feel like we don't have a perfect understanding of what happens inside LLMs and we also don't have a perfect definition of what thinking means, so I guess I am less confident about this than you are.

May 14, 2025 at 12:34 AM

Mike Hearn

@mikehearn.bsky.social

I get the argument that this ruling is potentially beneficial, but I think the idea of every app now having an Apple price and a non-Apple price, with different payment flows for each, is ultimately going to end up as net-negative for everyone (users, devs, Apple).

May 1, 2025 at 2:00 AM

Mike Hearn

@mikehearn.bsky.social

This is like if you had a human assistant named Steve, and you said, "Steve, can you write an email in my voice," and then Steve did it, and you got furious at Steve for impersonating you.

April 23, 2025 at 5:50 AM

Mike Hearn

@mikehearn.bsky.social

Does it still count as impersonation when she asked ChatGPT to impersonate her?

April 23, 2025 at 5:25 AM

Mike Hearn

@mikehearn.bsky.social

I just want to note that you can give ChatGPT a prompt with literally any name -- real names, fake names, silly names, serious names -- and it will do the exact same thing. Here's an excerpt in the style of extremely not-real WaPo reporter Barnabas Flimflamington. The outrage over this is silly.

April 23, 2025 at 5:02 AM

Mike Hearn

@mikehearn.bsky.social

I asked ChatGPT to write a WaPo story in the style of Mike Hearn, and it did, with my name as the byline. I have never written for WaPo (or anywhere). This is what ChatGPT does, because it's essentially what I asked it to do. This whole thread and the various reactions are wild and kind of insane.

April 23, 2025 at 4:48 AM

Mike Hearn

@mikehearn.bsky.social

I feel like people are misunderstanding what this is. Sora.com already has a homepage feed with "likes"; once they add following and comments, it's a social app.

April 16, 2025 at 1:02 PM

Mike Hearn

@mikehearn.bsky.social

Here are other screenshots that are closer in tone to today's. It's a thing that he does. bsky.app/profile/adis...

i dreamt a dream, @adistantdream.bsky.social · Apr 15

it's not common but he does do it. it's just also totally unremarkable in some contexts. if you asked him who's the greatest writer in the world IRL he might well say "yglesias" because he's a prat.

April 15, 2025 at 3:16 AM

Mike Hearn

@mikehearn.bsky.social

It's awkward to find his third-person tweets because searching "Yglesias" brings up, you know, all his tweets. But if you search "yglesias third person" you can get a litany of people dragging him for using the third person.

April 15, 2025 at 2:07 AM

Mike Hearn

@mikehearn.bsky.social

The timeline of replies to the 3rd-person post is fascinating. It was made 24 hrs ago, so there are a handful of normal replies also made 24 hrs ago from people who understood the post in context, then the screenshot went viral about 6 hours ago, and the rest are just insane from that point on.

April 15, 2025 at 1:55 AM

Mike Hearn

@mikehearn.bsky.social

A good trick here is that, on the iPhone, you can hit the power button 5 times in quick succession. It brings up the "Slide to Power Off" screen, disables FaceID/TouchID and requires your full passcode to unlock the phone again.

April 8, 2025 at 1:47 AM

Mike Hearn

@mikehearn.bsky.social

Ah didn’t realize that was a thing. Makes sense.

April 6, 2025 at 7:17 PM

Mike Hearn

@mikehearn.bsky.social

The lead photo of that post is almost certainly AI, for what it’s worth. I can’t speak to the details of the story itself.

April 6, 2025 at 7:08 PM
