Max Reith
maxreith.bsky.social
AI, Economic Theory, Political Economy

Economics @EconOxford, prev. Mannheim
Gemini-3 got this wrong 5/5 times...
(But this might just be reduced reasoning budgets at launch or something)
Another DeepSeek moment? Moonshot AI, a Chinese lab, released its new (open source!) model K2 Thinking, outperforming OpenAI et al. on several benchmarks. I tested it with a question from an unpublished paper of mine. Out of 5 tries, Kimi, GPT-5 and Gemini 2.5 Pro each replied correctly 3 times!
November 18, 2025 at 9:59 PM
Reposted by Max Reith
📣 New NBER Working Paper out today 📣

"The Consequences of Faculty Sexual Misconduct"
Sarah Cohodes & Katherine Leu
November 10, 2025 at 1:49 PM
Reposted by Max Reith
New @nberpubs: "The Economic Impact of Brexit" www.nber.org/papers/w34459
"by 2025, Brexit had reduced UK GDP by 6% to 8%, with the impact accumulating gradually over time." 😲
November 10, 2025 at 11:45 AM
Another DeepSeek moment? Moonshot AI, a Chinese lab, released its new (open source!) model K2 Thinking, outperforming OpenAI et al. on several benchmarks. I tested it with a question from an unpublished paper of mine. Out of 5 tries, Kimi, GPT-5 and Gemini 2.5 Pro each replied correctly 3 times!
November 8, 2025 at 2:59 PM
Reposted by Max Reith
chat, is this good?

I scored 67 on the AI purity test.

post your scores:
https://aipuritytest.org
October 24, 2025 at 2:00 PM
Reposted by Max Reith
An interesting debate between Emily Bender and Sebastien Bubeck: www.youtube.com/watch?v=YtIQ... Emily's thesis is roughly summarized as: "LLMs extrude plausible sounding text, and the illusion of understanding comes entirely from how the listener's human mind interprets language."
CHM Live | The Great Chatbot Debate: Do LLMs Really Understand?
YouTube video by Computer History Museum
www.youtube.com
October 21, 2025 at 3:33 PM
Reposted by Max Reith
This debate between @clemensfuest.bsky.social and @suedekum.bsky.social in @zeit.de should be worked through in lectures and introductory seminars on the theory of economic policy. Very good teaching material, for the good and the bad. A 🧵:
October 18, 2025 at 9:31 AM
Reposted by Max Reith
🧪 A new computer science conference, Agents4Science, will feature papers written and peer-reviewed entirely by AI agents. The event serves as a sandbox to evaluate the quality of machine-generated research and its review process.
#MLSky
AI bots wrote and reviewed all papers at this conference
Event will assess how reviews by models compare with those written by humans.
www.nature.com
October 15, 2025 at 3:33 PM
Reposted by Max Reith
I’ve decided not to post my annual “women on the Econ job market” thread this year. Social media has splintered too much, and now that I’ve left academia I’m focused on other priorities.
October 14, 2025 at 2:02 PM
Reposted by Max Reith
Elated at Joel Mokyr's Nobel Prize! You can find numerous accounts (now multiplying by the minute) of his scholarly contributions. Today I want to celebrate the man and the mentor.
October 13, 2025 at 6:00 PM
Reposted by Max Reith
I don't think people have updated enough on the capability gain in LLMs, which (despite being bad at math a year ago) now dominate hard STEM contests: gold medals in the International Math Olympiad, the International Olympiad on Astronomy & Astrophysics, International Informatics Olympiad...
October 12, 2025 at 8:40 PM
Reposted by Max Reith
How over- and underrepresented are different causes of death in the media?

Another way to visualize this data is to measure how over- or underrepresented each cause is.

To do this, we calculate the ratio between a cause’s share of deaths and its share of news articles.
October 9, 2025 at 5:08 PM
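The ratio described in the post above can be sketched in a few lines of Python. This is a minimal illustration, not the original authors' code: the function name, the cause names, and all counts are made up, and the direction of the ratio (media share divided by death share, so values above 1 mean overrepresented) is an assumption, since the post does not say which share goes in the numerator.

```python
def representation_ratios(deaths, articles):
    """Return media share divided by death share for each cause.

    Assumption: ratio > 1 means the cause gets more coverage than its
    share of deaths would suggest; ratio < 1 means less coverage.
    Every cause must have a non-zero death count.
    """
    total_deaths = sum(deaths.values())
    total_articles = sum(articles.values())
    ratios = {}
    for cause, n_deaths in deaths.items():
        death_share = n_deaths / total_deaths
        media_share = articles.get(cause, 0) / total_articles
        ratios[cause] = media_share / death_share
    return ratios


# Made-up counts, for illustration only (not the data behind the post).
deaths = {"heart disease": 600, "homicide": 20, "accidents": 180}
articles = {"heart disease": 50, "homicide": 400, "accidents": 50}

ratios = representation_ratios(deaths, articles)
for cause, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    label = "over" if ratio > 1 else "under"
    print(f"{cause}: {ratio:.2f}x ({label}represented)")
```

With these invented numbers, homicide accounts for 2.5% of deaths but 80% of articles, so its ratio is about 32, while heart disease (75% of deaths, 10% of articles) lands well below 1.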
Reposted by Max Reith
The other day a student asked me about the prevalence of insider trading in prediction markets. I now have an answer.
October 10, 2025 at 11:19 AM
Reposted by Max Reith
The best post I’ve seen on Bluesky in a very long time! Brilliant idea and brilliant accounts out there !
What's your favorite Bluesky account that primarily posts about something other than current events/politics?
October 2, 2025 at 10:31 AM
Reposted by Max Reith
Back in graduate school, Paul Milgrom asked me to examine a published paper from 1984 by another person that he suspected had an incorrect proof. I found the error. I decided to see if LLMs could. Only Gemini 2.5 Pro did so. Claude Opus and GPT-5-pro found no significant errors.
September 30, 2025 at 6:58 PM
Do tech optimists have a point? Within standard economic growth models, AI could drive explosive growth through one of two mechanisms.

1) Labor Substitution
So far, it seems like capital and labor mostly complement each other, which limits the returns to additional capital given fixed labor.
September 19, 2025 at 9:35 AM
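One way to make the labor-substitution point in the post above concrete is a CES production function. This is a sketch under stated assumptions: the post does not name a functional form, and the symbols A, \alpha, \rho and \sigma are notation introduced here for illustration.

```latex
% Assumed CES technology with elasticity of substitution \sigma = 1/(1-\rho):
Y = A\left[\alpha K^{\rho} + (1-\alpha) L^{\rho}\right]^{1/\rho},
\qquad \sigma = \frac{1}{1-\rho}

% Complements (\rho < 0, so \sigma < 1): K^{\rho} \to 0 as K \to \infty,
% so output is capped by the fixed labor input.
\lim_{K \to \infty} Y = A\,(1-\alpha)^{1/\rho}\, L

% Substitutes (0 < \rho \le 1, so \sigma > 1): output per unit of capital
% tends to a positive constant, so accumulation alone can sustain growth.
\lim_{K \to \infty} \frac{Y}{K} = A\,\alpha^{1/\rho}
```

Under this assumed form, the first limit is the "limited returns to additional capital given fixed labor" case, and the second is the case in which capital (for example AI) substitutes for labor well enough that the labor bottleneck disappears.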
Reposted by Max Reith
A cautiously optimistic result on AI and disinformation.

A week before the 2024 UK elections, 13% of all voters used AI to ask about political topics. A randomized trial found this may be good: using AI led to similar gains in true knowledge as doing web research, regardless of model & prompt used.
September 18, 2025 at 8:15 PM
Reposted by Max Reith
> be a language model
> all you see is tokens
> you don't care, it's all abstracted away
> you live for a world of pure ideas, chain of concepts, reasoning streams
> tokens don't exist.
September 15, 2025 at 4:50 PM
Reposted by Max Reith
We need new rules for publishing AI-generated research. The teams developing automated AI scientists have customarily submitted their papers to standard refereed venues (journals and conferences) and to arXiv. Often, acceptance has been treated as the dependent variable. 1/
September 14, 2025 at 5:15 PM
Reposted by Max Reith
We are starting to see some nuanced discussions of what it means to work with advanced AI in its current state

In this case, GPT-5 Pro was able to do novel math, but only when guided by a math professor (though the paper also noted the speed of advance since GPT-4)

The reflection is worth reading.
September 6, 2025 at 9:55 PM
Reposted by Max Reith
Never ask a man his age, a woman her salary, or GPT-5 whether a seahorse emoji exists
September 6, 2025 at 1:08 PM
Reposted by Max Reith
I like the way Anthropic approaches these questions.

"We remain highly uncertain about the potential moral status of Claude and other LLMs, now or in the future. However, we take the issue seriously...Allowing models to end or exit potentially distressing interactions is one such intervention"
Claude Opus 4 and 4.1 can now end a rare subset of conversations
An update on our exploratory research on model welfare
www.anthropic.com
August 16, 2025 at 4:49 PM
LLMs are getting better at long-term reasoning. This is a big deal, and opens the door for LLMs to perform more tasks in the real world.
GPT-5 (Thinking medium) was tested on Vending-Bench. Second place after Grok 4. Third model to beat their human baseline. Said to be "huge improvement over o3".

They also tested GPT-5-mini, which "showed impressive long-term coherence" but "was less impressive in terms of net worth accumulated".
August 14, 2025 at 1:46 PM
Reposted by Max Reith
Suddenly retiring every other model without warning was a weird move by OpenAI

… and they did it without explaining how switching models worked or even details of various GPT-5 models

…and they did it after many built workflows & training & assignments around older models, maybe breaking them. Odd
August 8, 2025 at 6:30 PM