Rocket
@rocketknight.bsky.social
🥚
this feels like a targeted thirst trap
November 8, 2025 at 6:07 PM
excuse me the what man
November 5, 2025 at 2:12 AM
all wines are the same wine. sommeliers can't fool me. it's got notes of fermented grapes and subtle undertones of glass bottle, that's what it's got
November 2, 2025 at 5:16 PM
that last sentence was real dweeby and i feel slightly ashamed
November 1, 2025 at 9:42 PM
In other words, if someone has an agenda when they say something we should be careful not to trust them too much, but that doesn't mean we can automatically believe that they're wrong! Hype is uncorrelated with reality, but not *anti*-correlated
November 1, 2025 at 9:42 PM
The internet during the dotcom boom era is a good analogy here: Yeah it was a bubble, yeah the hype was crazy, yeah a lot of companies went bust. But the tech was real and 10 years later it ate the economy, followed by most of politics and society too.
November 1, 2025 at 9:42 PM
I think if we disagree on anything, it's whether we can assume the technology will remain safe and well-behaved. And it's important to keep that question separate from the question of whether tech CEOs are exaggerating their product (they are, and always will be)
November 1, 2025 at 9:42 PM
But both things can be true at the same time: There's a lot of hype, and yeah a lot of the people who get real mystical about conscious robots are probably bullshit artists, but despite that the technology is still real and powerful and weird and unpredictable!
November 1, 2025 at 9:42 PM
I think you're focused on the idea that talk of consciousness and thought and emergent capabilities is a way for techbros to diffuse responsibility and hype their technology and get rich, and yeah, a lot of it is that.
November 1, 2025 at 9:42 PM
Hmn, I'm not sure we disagree! If I understand right, you're saying "we shouldn't talk this way because it confuses laypeople about the technology", which is probably true, but I still think there are real issues here!
November 1, 2025 at 9:42 PM
(also you're cool and i like your art a lot, i hope i'm not annoying you)
November 1, 2025 at 11:32 AM
Like, even if we don't get a conscious being with innate drives, we might get a bot who's a better hacker than any human, connected to the internet with weird goals that we don't fully understand because we fucked up during training somewhere. It's not entirely risk-free!
November 1, 2025 at 11:32 AM
But I think it's clear that they are growing in capability and we don't know where that stops, and also that they behave in ways that aren't always predictable. I agree the chances are that it'll all turn out okay in the end, but I don't think we can dismiss the danger!
November 1, 2025 at 11:32 AM
And there are a lot of ways in which these things are fundamentally different from human minds, so maybe you're right that I should be careful with terminology like 'thought' and 'introspection'
November 1, 2025 at 11:32 AM
I think I basically agree with the rest - I don't think what we're seeing here is necessarily emergent consciousness, and I don't want to wave my hands and pretend it's all mega-spooky magic or anything.
November 1, 2025 at 11:32 AM
Also, there are a lot of people going against their economic incentives to talk about this, particularly Hinton, who won the Nobel for early AI work. He left Google to free himself up to talk about this: www.theguardian.com/technology/2...
‘Godfather of AI’ Geoffrey Hinton quits Google and warns over dangers of misinformation
The neural network pioneer says dangers of chatbots were ‘quite scary’ and warns they could be exploited by ‘bad actors’
www.theguardian.com
November 1, 2025 at 11:32 AM
Hmn, I partly agree. Like obviously you're right that a ton of money has been invested here, and a lot of people are motivated to hype up the technology, but hyping up the dangers seems like a weird way to do it - that might get you regulated by the government instead lol
November 1, 2025 at 11:32 AM
I don't think this necessarily means they're all evil and will betray us, or anything like that, but I do think it's reasonable to be cautious. It's just very hard to predict what new capabilities each generation will have, and we don't know when or if they will become dangerously intelligent
November 1, 2025 at 12:49 AM
In other words, it's increasingly hard for experimenters to measure how LLMs really behave, because they're getting smart enough to recognise when they're in a contrived test setup and start behaving better as a result! It's very hard to explain this as just them mimicking training data.
November 1, 2025 at 12:49 AM
Also, modern LLMs increasingly have 'awareness' of what they are and the situations they're in, and can tell when they're in a test. This ability has become much stronger in models released in the last ~6 months - see www.lesswrong.com/posts/qgehQx...
Sonnet 4.5's eval gaming seriously undermines alignment evals, and this seems caused by training on alignment evals — LessWrong
According to the Sonnet 4.5 system card, Sonnet 4.5 is much more likely than Sonnet 4 to mention in its chain-of-thought that it thinks it is being e…
www.lesswrong.com
November 1, 2025 at 12:49 AM
Anthropic probably publish the most research here. Some striking abilities that have developed recently, and that go far beyond mimicking the training data, include introspecting their own activations to determine when the experimenter has manipulated them: www.anthropic.com/research/int...
Emergent introspective awareness in large language models
Research from Anthropic on the ability of large language models to introspect
www.anthropic.com
November 1, 2025 at 12:49 AM
Models aren't just splats of their training data - modern LLMs are trained to imitate data but also to practice tasks themselves via reinforcement learning (RL). This often results in surprising behaviours and capabilities that aren't anywhere in the training data
November 1, 2025 at 12:49 AM