Lightnews — Scholar-powered news

Melanie Mitchell

@melaniemitchell.bsky.social

24K followers 400 following 560 posts

Professor, Santa Fe Institute. Research on AI, cognitive science, and complex systems.

Website: https://melaniemitchell.me

Substack: https://aiguide.substack.com/

Melanie Mitchell

@melaniemitchell.bsky.social

I appreciate your overall point, but just to push back on "this is such a new field" -- AI as a field has been around for at least 70 years, and universities have been training AI researchers for most of that time.

November 7, 2025 at 10:13 PM

Melanie Mitchell

@melaniemitchell.bsky.social

This is not to say that such "role-playing" can't be dangerous in and of itself. In fact, role-playing is a key method for AI "jail-breaking". But that's not the same thing as a "survival drive".

October 25, 2025 at 5:15 PM

Melanie Mitchell

@melaniemitchell.bsky.social

It's a difficult and uncertain time for science in the U.S. and worldwide, so communicating the ideas and results of science to the general public has never been more essential.

More about these awards: www.nationalacademies.org/news/2025/10...

October 23, 2025 at 3:36 PM

Melanie Mitchell

@melaniemitchell.bsky.social

Half the authors might be hallucinations.

October 18, 2025 at 9:17 PM

Melanie Mitchell

@melaniemitchell.bsky.social

Lol. Who among us hasn't hallucinated in the course of a Google Docs ➡️ LaTeX migration?

October 18, 2025 at 9:09 PM

Melanie Mitchell

@melaniemitchell.bsky.social

This one too? URL links to paper with similar-sounding title, some different authors, different journal. Title in this reference does not seem to exist.

October 18, 2025 at 7:07 PM

Melanie Mitchell

@melaniemitchell.bsky.social

Thanks, I will take a look.

October 17, 2025 at 1:06 PM

Melanie Mitchell

@melaniemitchell.bsky.social

Megan, That's wonderful -- congratulations!!

October 13, 2025 at 5:11 PM

Melanie Mitchell

@melaniemitchell.bsky.social

Evaluation of reasoning, and reasoning about evaluation -- both understudied, imo

October 7, 2025 at 7:40 PM

Melanie Mitchell

@melaniemitchell.bsky.social

Excellent statement.

October 7, 2025 at 12:33 AM

Melanie Mitchell

@melaniemitchell.bsky.social

On the other hand, accuracy alone may be *underestimating* this ability in visual settings.

It is essential to go beyond accuracy in evaluating such capabilities!

Paper: arxiv.org/abs/2510.02125

Blog post: aiguide.substack.com/p/do-ai-reas...

🧵 10/10

Do AI Reasoning Models Abstract and Reason Like Humans?

Going beyond simple accuracy for evaluating abstraction abilities

October 6, 2025 at 9:27 PM

Melanie Mitchell

@melaniemitchell.bsky.social

Conclusions: Evaluations like those of the ARC Prize, using accuracy alone, may be *overestimating* the abstract reasoning ability of these models in textual settings.

🧵 9/10

October 6, 2025 at 9:27 PM

Melanie Mitchell

@melaniemitchell.bsky.social

With visual inputs, these models all do quite poorly on generating accurate grids. But they do manage to get the correct-intended rule considerably more often than they generate the correct grid.

🧵 8/10

October 6, 2025 at 9:27 PM
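
The distinction drawn in this thread, between generating the correct output grid and stating the correct-intended rule, can be illustrated by scoring the two separately. The following is a minimal sketch, not the paper's evaluation code: the `TaskResult` record, the `rule_label` annotation values, and the toy data are all hypothetical and chosen only to show how the two rates can diverge.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TaskResult:
    """One model attempt at one task (hypothetical record layout)."""
    predicted_grid: list[list[int]]
    target_grid: list[list[int]]
    # Hypothetical annotation of the rule the model stated,
    # e.g. "correct-intended" or "other".
    rule_label: str


def grid_accuracy(results: list[TaskResult]) -> float:
    """Fraction of attempts whose output grid exactly matches the target."""
    return sum(r.predicted_grid == r.target_grid for r in results) / len(results)


def rule_rate(results: list[TaskResult], label: str = "correct-intended") -> float:
    """Fraction of attempts whose stated rule carries the given label."""
    return sum(r.rule_label == label for r in results) / len(results)


# Made-up toy data: the rule is often right even when the grid is wrong.
target = [[1, 0], [0, 1]]
wrong = [[1, 1], [0, 1]]
results = [
    TaskResult(target, target, "correct-intended"),
    TaskResult(wrong, target, "correct-intended"),
    TaskResult(wrong, target, "correct-intended"),
    TaskResult(wrong, target, "other"),
]

print(f"grid accuracy:     {grid_accuracy(results):.2f}")
print(f"correct-intended:  {rule_rate(results):.2f}")
```

Reporting both numbers side by side makes it visible when accuracy alone would understate, or overstate, how often the intended rule was actually found.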
