Lightnews — Scholar-powered news

Michael Ritchot

@ritchot.me

The more I use Gemini 3.0, the more impressed I am.

"explain the probability of getting an 8 with 2 dice"

gemini.google.com/share/f00d99...

November 27, 2025 at 10:07 AM
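
For reference, the dice question quoted in the post above can be checked by brute-force enumeration. This is a minimal sketch and is not taken from the linked Gemini conversation; the function name `probability_of_sum` is made up for illustration, and it assumes two fair six-sided dice.

```python
from fractions import Fraction
from itertools import product


def probability_of_sum(target: int, sides: int = 6) -> Fraction:
    """Exact probability that two fair dice with `sides` faces sum to `target`."""
    # Enumerate every ordered pair of faces: 36 outcomes for six-sided dice.
    outcomes = list(product(range(1, sides + 1), repeat=2))
    # Keep only the rolls whose faces add up to the target.
    favorable = [roll for roll in outcomes if sum(roll) == target]
    return Fraction(len(favorable), len(outcomes))


if __name__ == "__main__":
    # The favorable rolls for 8 are (2,6), (3,5), (4,4), (5,3), (6,2).
    print(probability_of_sum(8))  # 5/36
```

Using `Fraction` keeps the answer exact (5/36, roughly 13.9%) instead of a rounded float.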

Michael Ritchot

@ritchot.me

The Gemini 3.0 launch has created a lot of noise about what it can actually do. Some of the most popular threads I have seen have not been consistently reproducible or have turned out to be misleading.

November 26, 2025 at 3:09 AM

Michael Ritchot

@ritchot.me

Last year I tested OpenAI's o1 pro on CEMC Problems of the Week and wrote that it “still has a long way to go for mathematics.” After rerunning the experiment with GPT-5 this year, I cannot honestly make that claim anymore.

ritchot.me/gpt-5-has-co...

GPT-5 has come a long way in Mathematics

Last December, I wrote an article titled . At the time, OpenAI’s new “reasoning” models were being heavily marketed as smarter, more careful, and b...

ritchot.me

November 23, 2025 at 12:01 PM

Michael Ritchot

@ritchot.me

After finishing my Master's degree, the first thing I was greeted by on my return to social media was a paper coming out of the MIT Media Lab about the "erosion of critical thinking," which I feel is being misinterpreted at best.

June 25, 2025 at 11:01 PM

Michael Ritchot

@ritchot.me

Another fantastic video from Andrej Karpathy on how software is seeing one of the largest shifts it has in 70 years. Easily one of the clearest communicators in the AI space, and a must watch for anyone interested in this field.

www.youtube.com/watch?v=LCEm...

Andrej Karpathy: Software Is Changing (Again)

YouTube video by Y Combinator

www.youtube.com

June 19, 2025 at 3:30 AM

Reposted by Michael Ritchot

Simon Willison

@simonwillison.net

Workaccount2 on Hacker News just coined the term "context rot" to describe the thing where the quality of an LLM conversation drops as the context fills up with accumulated distractions and dead ends news.ycombinator.com/item?id=4430...

Comment by Workaccount2, 9 hours ago:

They poison their own context. Maybe you can call it context rot, where as context grows and especially if it grows with lots of distractions and dead ends, the output quality falls off rapidly. Even with good context the rot will start to become apparent around 100k tokens (with Gemini 2.5).

They really need to figure out a way to delete or "forget" prior context, so the user or even the model can go back and prune poisonous tokens.

Right now I work around it by regularly making summaries of instances, and then spinning up a new instance with fresh context and feed in the summary of the previous instance.

June 18, 2025 at 11:22 PM
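
The workaround described in the quoted comment above (summarize the conversation, then start a fresh instance seeded with that summary) could be sketched roughly as follows. This is an illustration only: `ask` and `summarize` are hypothetical callables the caller supplies to wrap whatever model is in use, no real LLM API is assumed, and the whitespace word count is a crude stand-in for a token count.

```python
from typing import Callable, List


def run_with_context_reset(
    prompts: List[str],
    ask: Callable[[List[str], str], str],      # hypothetical: (context, prompt) -> reply
    summarize: Callable[[List[str]], str],     # hypothetical: context -> short summary
    max_context_words: int = 1000,             # assumed budget, not a real model limit
) -> List[str]:
    """Answer prompts in order, replacing the context with a summary when it grows too large."""
    context: List[str] = []
    replies: List[str] = []
    for prompt in prompts:
        reply = ask(context, prompt)
        replies.append(reply)
        context.extend([prompt, reply])
        # Crude size estimate: count whitespace-separated words across the context.
        if sum(len(message.split()) for message in context) > max_context_words:
            # "Spin up a new instance": drop the history and keep only its summary.
            context = [summarize(context)]
    return replies
```

Passing the model calls in as plain functions keeps the reset logic separate from any particular provider, so the threshold and the summarization prompt can be tuned independently.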

Reposted by Michael Ritchot

Ethan Mollick

@emollick.bsky.social

Not to be a broken record, but AI critics who insist that AI "doesn't work" and is going to just disappear are misleading - that just isn't true, as controlled studies like this one show.

There are many issues with AI & many things that need critique, but pretending it is going away is not helpful.

Ethan Mollick @emollick.bsky.social · Mar 3

Randomized trial of AI for legal work finds reasoning models are a big deal:

Law students using o1-preview had the quality of work on most tasks increase (up to 28%) & time savings of 12-28%

There were a few hallucinations, but a RAG-based AI with access to legal material reduced those to human level

March 4, 2025 at 3:11 AM

Reposted by Michael Ritchot

The Pudding

@puddingviz.bsky.social

Do you think middle school sucks? You’re not alone. @alv9n.com uses a survey of millions of students to show how our brains and our sense of belonging change in middle school.

💻 pudding.cool/2025/02/midd...
📽️ www.youtube.com/watch?v=b4zL...

The Middle Ages

Follow hundreds of kids as they navigate their treacherous middle school years.

pudding.cool

March 3, 2025 at 7:35 PM

Michael Ritchot

@ritchot.me

As the average age at which scientists make breakthroughs has risen above 40 in the 2000s (compared to the 20s and 30s in the early 1900s), effective tools are essential to manage the growing burden of knowledge and sustain scientific discovery.

February 20, 2025 at 6:31 AM

Michael Ritchot

@ritchot.me

When it comes to emerging technology, I find most aren't being nearly bold enough in predicting just how drastically society is about to change.

some thoughts on emergent technology and the future of education

We often envision the future of technology by projecting today’s society forward, rather than considering how fundamentally different it might become...

ritchot.me

February 18, 2025 at 10:32 AM

Reposted by Michael Ritchot

Paul Röttger @ EMNLP

@paul-rottger.bsky.social

Are LLMs biased when they write about political issues?

We just released IssueBench – the largest, most realistic benchmark of its kind – to answer this question more robustly than ever before.

Long 🧵with spicy results 👇

February 13, 2025 at 2:08 PM

Michael Ritchot

@ritchot.me

Excellent piece by Erik Hoel. While there is evidence that too much AI usage may cause cognitive atrophy, AI could be cognitively healthy when it allows people to mentally "chunk" tasks at higher levels of abstraction.

www.theintrinsicperspective.com/p/brain-drain

February 14, 2025 at 12:53 AM

Michael Ritchot

@ritchot.me

If you feel like you or your industry is behind in AI adoption, you can probably relax.

Anthropic's Economic Index initial report was released today. Usage is concentrated in software development and technical writing. I was surprised at how many job types are in the single-digit percentages.

February 11, 2025 at 1:15 AM

Michael Ritchot

@ritchot.me

In case you missed it, Andrej Karpathy released a 3.5 hour video a week ago diving deep into how LLMs work. This is the tl;dr of it which is...still long...but really great reading.

anfalmushtaq.com/articles/dee...

Deep dive into LLMs like ChatGPT by Andrej Karpathy (TL;DR)

A TL;DR version of Andrej Karpathy's "Deep dive into LLMs like ChatGPT" video.

anfalmushtaq.com

February 10, 2025 at 12:54 PM

Michael Ritchot

@ritchot.me

Some thoughts on Altman's post.

1. The human working in conjunction with the machine will be increasingly important. AI agents may not have the biggest new ideas on their own, but they will with a human mind working in tandem...the truly creative will thrive.

February 10, 2025 at 12:42 PM

Reposted by Michael Ritchot

Simon Willison

@simonwillison.net

Some notes on OpenAI's prompting guidelines for their "reasoning" models (o1/o3), which included a few surprises - in particular a weird one where you have to add "Formatting re-enabled" to your system prompt in order to get Markdown now simonwillison.net/2025/Feb/2/o...

OpenAI reasoning models: Advice on prompting

OpenAI's documentation for their o1 and o3 "reasoning models" includes some interesting tips on how to best prompt them: > - **Developer messages are the new system messages:** Starting with …

simonwillison.net

February 2, 2025 at 8:58 PM

Reposted by Michael Ritchot

MikeSharples

@sharplem.bsky.social

The Open University has published a framework for learning and teaching Critical AI Literacy Skills. Although the framework is intended for Open University staff and students, it is relevant for any organisation looking to develop inclusive training in AI literacy. about.open.ac.uk/sites/about....

about.open.ac.uk

February 3, 2025 at 11:41 AM

Reposted by Michael Ritchot

Ethan Mollick

@emollick.bsky.social

Been waiting for someone to test this and see if it works - can multiple AI agents fact-checking each other reduce hallucinations?

The answer appears to be yes - using 3 agents with a structured review process reduced hallucination scores by 96% across 310 test cases. arxiv.org/pdf/2501.13946

February 2, 2025 at 8:42 PM

Reposted by Michael Ritchot

Ethan Mollick

@emollick.bsky.social

The Google/OpenAI race for a research tool is going to be very interesting and very consequential. These systems do real work.

For Google, it feels like a big emergency and a huge opportunity. They have a unique position (search, Books, YouTube, etc.) and the models to do something big. We will see

February 3, 2025 at 12:42 AM

Reposted by Michael Ritchot

Ethan Mollick

@emollick.bsky.social

OpenAI’s deep research is very good. Unlike Google’s version, which is mostly a good summarizer of many sources, OpenAI is more like engaging an opinionated (often almost PhD-level!) researcher who follows leads.

Look at how it hunts down a concept in the literature (& works around problems)

February 3, 2025 at 12:12 AM

Reposted by Michael Ritchot

Simon Willison

@simonwillison.net

Fascinating Hacker News comment from Tom Gally, a professional translator (Japanese to English) who uses LLMs as part of his workflow, which he describes in detail: news.ycombinator.com/item?id=4289...

Wrote a bit more about this on my blog here: simonwillison.net/2025/Feb/2/w...

There might be some papers or other guides out there, but their advice will be b... | Hacker News

news.ycombinator.com

February 2, 2025 at 4:26 AM

Reposted by Michael Ritchot

Ethan Mollick

@emollick.bsky.social

Also, obligatory OpenAI-can't-name-a-model comment: o3-mini-high and o3-mini, really?

January 31, 2025 at 9:43 PM

Reposted by Michael Ritchot

Ethan Mollick

@emollick.bsky.social

New randomized, controlled trial by the World Bank of students using GPT-4 as a tutor in Nigeria. Six weeks of after-school AI tutoring = 2 years of typical learning gains, outperforming 80% of other educational interventions.

And it helped all students, especially girls who were initially behind.

January 15, 2025 at 8:58 PM

Reposted by Michael Ritchot

Ethan Mollick

@emollick.bsky.social

Reminder for the new semester that you can’t detect AI

Researchers secretly added AI-created papers to the exam pool: “We found that 94% of our AI submissions were undetected. The grades awarded to our AI submissions were on average half a grade boundary higher than that achieved by real students”

January 6, 2025 at 2:04 PM

Reposted by Michael Ritchot

Ethan Mollick

@emollick.bsky.social

What percentage of students are you willing to falsely accuse of cheating with AI?

There is a trade-off between false accusations and detection rates for AI. At a 10% false positive rate, detectors find 80% or less of AI content. At a 1% rate most find 60% or less.

Don’t trust AI detectors!

January 6, 2025 at 2:30 PM
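
To make the trade-off in the post above concrete, the rates it quotes can be applied to a class. This is a minimal sketch: the class size (160 human-written and 40 AI-written submissions) is a hypothetical chosen for illustration, only the false positive and detection rates come from the post, and `detector_outcomes` is a made-up helper name.

```python
def detector_outcomes(
    n_human: int,
    n_ai: int,
    false_positive_rate: float,
    detection_rate: float,
) -> dict:
    """Expected counts when a detector with the given rates screens a batch of submissions."""
    false_accusations = n_human * false_positive_rate  # honest work wrongly flagged
    caught = n_ai * detection_rate                      # AI work correctly flagged
    flagged = false_accusations + caught
    return {
        "false_accusations": false_accusations,
        "caught": caught,
        "missed": n_ai - caught,
        # Share of flagged submissions that really were AI-written.
        "precision": caught / flagged if flagged else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical class: 160 human-written and 40 AI-written submissions.
    for fpr, tpr in [(0.10, 0.80), (0.01, 0.60)]:
        result = detector_outcomes(160, 40, fpr, tpr)
        print(
            f"FPR {fpr:.0%}, detection {tpr:.0%}: "
            f"{result['false_accusations']:.1f} falsely accused, "
            f"{result['missed']:.1f} missed, "
            f"precision {result['precision']:.1%}"
        )
```

Under these assumed numbers, the 10% setting wrongly flags about 16 students while still missing 8 AI submissions, and the 1% setting wrongly flags between one and two students while missing 16. Both figures treat the post's upper bounds as if the detector actually reached them, so real results would be no better.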
