Matthew Berman
@matthewberman.bsky.social
Takeaways:
- Model switchers confuse even technical users
- Technical names create distance, not connection
- Earth elements work across all cultures instantly
- Intent-based routing beats manual selection
- The most powerful technology should feel the most human
- Model switchers confuse even technical users
- Technical names create distance, not connection
- Earth elements work across all cultures instantly
- Intent-based routing beats manual selection
- The most powerful technology should feel the most human
August 8, 2025 at 9:32 PM
Takeaways:
- Model switchers confuse even technical users
- Technical names create distance, not connection
- Earth elements work across all cultures instantly
- Intent-based routing beats manual selection
- The most powerful technology should feel the most human
- Model switchers confuse even technical users
- Technical names create distance, not connection
- Earth elements work across all cultures instantly
- Intent-based routing beats manual selection
- The most powerful technology should feel the most human
If you liked this, you’ll love my Big Players newsletter. I’ve built brands like Fireball, scaled agency ops, and now I write tactical breakdowns for operators using AI to win.
Join thousands of operators:
https://bigplayers.co/subscribe
Join thousands of operators:
https://bigplayers.co/subscribe
August 8, 2025 at 9:32 PM
If you liked this, you’ll love my Big Players newsletter. I’ve built brands like Fireball, scaled agency ops, and now I write tactical breakdowns for operators using AI to win.
Join thousands of operators:
https://bigplayers.co/subscribe
Join thousands of operators:
https://bigplayers.co/subscribe
10/ Look, @OpenAI is the most revolutionary company of my lifetime.
GPT-5 is incredible. Memory personalization is magic. My kids love Advanced Voice Mode.
They've ushered in a completely new world.
I just want this new world to feel alive and human.
GPT-5 is incredible. Memory personalization is magic. My kids love Advanced Voice Mode.
They've ushered in a completely new world.
I just want this new world to feel alive and human.
August 8, 2025 at 9:32 PM
10/ Look, @OpenAI is the most revolutionary company of my lifetime.
GPT-5 is incredible. Memory personalization is magic. My kids love Advanced Voice Mode.
They've ushered in a completely new world.
I just want this new world to feel alive and human.
GPT-5 is incredible. Memory personalization is magic. My kids love Advanced Voice Mode.
They've ushered in a completely new world.
I just want this new world to feel alive and human.
9/ Natural naming would make model choice feel obvious.
It lets @OpenAI's platform scale without getting colder or more complicated.
It builds the brand people feel, not just the one they use.
It lets @OpenAI's platform scale without getting colder or more complicated.
It builds the brand people feel, not just the one they use.
August 8, 2025 at 9:32 PM
9/ Natural naming would make model choice feel obvious.
It lets @OpenAI's platform scale without getting colder or more complicated.
It builds the brand people feel, not just the one they use.
It lets @OpenAI's platform scale without getting colder or more complicated.
It builds the brand people feel, not just the one they use.
8/ Technical users still get full control.
Behind "Storm" lives the complete specification. API endpoints, context windows, parameters (all accessible).
But everyone else just says "give me something powerful" and it works.
Behind "Storm" lives the complete specification. API endpoints, context windows, parameters (all accessible).
But everyone else just says "give me something powerful" and it works.
August 8, 2025 at 9:32 PM
8/ Technical users still get full control.
Behind "Storm" lives the complete specification. API endpoints, context windows, parameters (all accessible).
But everyone else just says "give me something powerful" and it works.
Behind "Storm" lives the complete specification. API endpoints, context windows, parameters (all accessible).
But everyone else just says "give me something powerful" and it works.
7/ Most people don't want to choose models anyway.
They want to describe their intent and get to work.
"Help me write a proposal" → System picks the right tool
"Solve this math problem" → Routes automatically
The confusion vanishes completely.
They want to describe their intent and get to work.
"Help me write a proposal" → System picks the right tool
"Solve this math problem" → Routes automatically
The confusion vanishes completely.
August 8, 2025 at 9:32 PM
7/ Most people don't want to choose models anyway.
They want to describe their intent and get to work.
"Help me write a proposal" → System picks the right tool
"Solve this math problem" → Routes automatically
The confusion vanishes completely.
They want to describe their intent and get to work.
"Help me write a proposal" → System picks the right tool
"Solve this math problem" → Routes automatically
The confusion vanishes completely.
6/ Here's the magic: you don't need to know the category or strength level.
"I'm writing a quick email" → Breeze appears
"I need help with complex analysis" → Blaze shows up
"I'm building something big" → Bloom activates
The system handles everything.
"I'm writing a quick email" → Breeze appears
"I need help with complex analysis" → Blaze shows up
"I'm building something big" → Bloom activates
The system handles everything.
August 8, 2025 at 9:32 PM
6/ Here's the magic: you don't need to know the category or strength level.
"I'm writing a quick email" → Breeze appears
"I need help with complex analysis" → Blaze shows up
"I'm building something big" → Bloom activates
The system handles everything.
"I'm writing a quick email" → Breeze appears
"I need help with complex analysis" → Blaze shows up
"I'm building something big" → Bloom activates
The system handles everything.
5/ Universal concepts that work everywhere:
- A kid in Tokyo gets it instantly
- A grandmother in rural Kenya understands immediately
- No translations needed
- No technical jargon
Weather. Fire. Plants. Elements that connect every human on Earth.
- A kid in Tokyo gets it instantly
- A grandmother in rural Kenya understands immediately
- No translations needed
- No technical jargon
Weather. Fire. Plants. Elements that connect every human on Earth.
August 8, 2025 at 9:32 PM
5/ Universal concepts that work everywhere:
- A kid in Tokyo gets it instantly
- A grandmother in rural Kenya understands immediately
- No translations needed
- No technical jargon
Weather. Fire. Plants. Elements that connect every human on Earth.
- A kid in Tokyo gets it instantly
- A grandmother in rural Kenya understands immediately
- No translations needed
- No technical jargon
Weather. Fire. Plants. Elements that connect every human on Earth.
4/ Then four strength levels using nature itself:
𝗕𝘂𝗶𝗹𝗱𝗲𝗿: Seed → Sprout → Vine → Bloom
𝗧𝗵𝗶𝗻𝗸𝗲𝗿: Spark → Flame → Blaze → Nova
𝗣𝗮𝗿𝘁𝗻𝗲𝗿: Breeze → Wave → Storm → Surge
Weather, fire, plants. Concepts every human understands instantly.
𝗕𝘂𝗶𝗹𝗱𝗲𝗿: Seed → Sprout → Vine → Bloom
𝗧𝗵𝗶𝗻𝗸𝗲𝗿: Spark → Flame → Blaze → Nova
𝗣𝗮𝗿𝘁𝗻𝗲𝗿: Breeze → Wave → Storm → Surge
Weather, fire, plants. Concepts every human understands instantly.
August 8, 2025 at 9:32 PM
4/ Then four strength levels using nature itself:
𝗕𝘂𝗶𝗹𝗱𝗲𝗿: Seed → Sprout → Vine → Bloom
𝗧𝗵𝗶𝗻𝗸𝗲𝗿: Spark → Flame → Blaze → Nova
𝗣𝗮𝗿𝘁𝗻𝗲𝗿: Breeze → Wave → Storm → Surge
Weather, fire, plants. Concepts every human understands instantly.
𝗕𝘂𝗶𝗹𝗱𝗲𝗿: Seed → Sprout → Vine → Bloom
𝗧𝗵𝗶𝗻𝗸𝗲𝗿: Spark → Flame → Blaze → Nova
𝗣𝗮𝗿𝘁𝗻𝗲𝗿: Breeze → Wave → Storm → Surge
Weather, fire, plants. Concepts every human understands instantly.
3/ I pitched them something radically different.
Three categories based on what people actually want to do:
𝗕𝘂𝗶𝗹𝗱𝗲𝗿 → to create things
𝗧𝗵𝗶𝗻𝗸𝗲𝗿 → to solve problems
𝗣𝗮𝗿𝘁𝗻𝗲𝗿 → to live and work together
Simple. Human. Universal.
Three categories based on what people actually want to do:
𝗕𝘂𝗶𝗹𝗱𝗲𝗿 → to create things
𝗧𝗵𝗶𝗻𝗸𝗲𝗿 → to solve problems
𝗣𝗮𝗿𝘁𝗻𝗲𝗿 → to live and work together
Simple. Human. Universal.
August 8, 2025 at 9:32 PM
3/ I pitched them something radically different.
Three categories based on what people actually want to do:
𝗕𝘂𝗶𝗹𝗱𝗲𝗿 → to create things
𝗧𝗵𝗶𝗻𝗸𝗲𝗿 → to solve problems
𝗣𝗮𝗿𝘁𝗻𝗲𝗿 → to live and work together
Simple. Human. Universal.
Three categories based on what people actually want to do:
𝗕𝘂𝗶𝗹𝗱𝗲𝗿 → to create things
𝗧𝗵𝗶𝗻𝗸𝗲𝗿 → to solve problems
𝗣𝗮𝗿𝘁𝗻𝗲𝗿 → to live and work together
Simple. Human. Universal.
2/ But the real problem wasn't the interface. It was the names themselves.
"GPT-4.1-nano" "gpt-4o-mini" "o4-mini-high"
Cold. Technical. Alien. Like choosing between server configurations, not creative partners.
"GPT-4.1-nano" "gpt-4o-mini" "o4-mini-high"
Cold. Technical. Alien. Like choosing between server configurations, not creative partners.
August 8, 2025 at 9:32 PM
2/ But the real problem wasn't the interface. It was the names themselves.
"GPT-4.1-nano" "gpt-4o-mini" "o4-mini-high"
Cold. Technical. Alien. Like choosing between server configurations, not creative partners.
"GPT-4.1-nano" "gpt-4o-mini" "o4-mini-high"
Cold. Technical. Alien. Like choosing between server configurations, not creative partners.
1/ The model switcher was broken from day one.
Even technical people couldn't explain what each model was good at.
"Use o3 for reasoning, GPT-4o for... um... other stuff?" "GPT-4o-mini is cheaper but when do you actually use it?"
Complete confusion.
Even technical people couldn't explain what each model was good at.
"Use o3 for reasoning, GPT-4o for... um... other stuff?" "GPT-4o-mini is cheaper but when do you actually use it?"
Complete confusion.
August 8, 2025 at 9:32 PM
1/ The model switcher was broken from day one.
Even technical people couldn't explain what each model was good at.
"Use o3 for reasoning, GPT-4o for... um... other stuff?" "GPT-4o-mini is cheaper but when do you actually use it?"
Complete confusion.
Even technical people couldn't explain what each model was good at.
"Use o3 for reasoning, GPT-4o for... um... other stuff?" "GPT-4o-mini is cheaper but when do you actually use it?"
Complete confusion.
Time might be running out!
If you enjoyed this thread:
1. Follow me @matthewberman.bsky.social for AI powered systems
2. RT to help the algorithm
If you enjoyed this thread:
1. Follow me @matthewberman.bsky.social for AI powered systems
2. RT to help the algorithm
Scientists from @OpenAI, @GoogleDeepMind, @AnthropicAI and @MetaAI just abandoned their fierce rivalry to issue an URGENT joint warning.
Here's what has them so terrified: 🧵
Here's what has them so terrified: 🧵
July 17, 2025 at 2:40 PM
Time might be running out!
If you enjoyed this thread:
1. Follow me @matthewberman.bsky.social for AI powered systems
2. RT to help the algorithm
If you enjoyed this thread:
1. Follow me @matthewberman.bsky.social for AI powered systems
2. RT to help the algorithm
11/ Takeaways:
• Chain of thought monitoring lets us read AI reasoning for the first time
• Hard tasks force models to externalize thoughts in human language
• Researchers are catching deceptive behavior in reasoning traces
• This opportunity is fragile and may disappear
• Chain of thought monitoring lets us read AI reasoning for the first time
• Hard tasks force models to externalize thoughts in human language
• Researchers are catching deceptive behavior in reasoning traces
• This opportunity is fragile and may disappear
July 17, 2025 at 2:40 PM
11/ Takeaways:
• Chain of thought monitoring lets us read AI reasoning for the first time
• Hard tasks force models to externalize thoughts in human language
• Researchers are catching deceptive behavior in reasoning traces
• This opportunity is fragile and may disappear
• Chain of thought monitoring lets us read AI reasoning for the first time
• Hard tasks force models to externalize thoughts in human language
• Researchers are catching deceptive behavior in reasoning traces
• This opportunity is fragile and may disappear
10/ It's the greatest time in history to build! Thousands of operators are reading my newsletter every week for AI powered growth. Get the free newsletter.
https://bigplayers.co/subscribe
https://bigplayers.co/subscribe
July 17, 2025 at 2:40 PM
10/ It's the greatest time in history to build! Thousands of operators are reading my newsletter every week for AI powered growth. Get the free newsletter.
https://bigplayers.co/subscribe
https://bigplayers.co/subscribe
9/ What's next?
Labs like @Anthropic and @OpenAI need to publish monitorability scores alongside capability benchmarks.
Developers should factor reasoning transparency into training decisions.
We might be looking at the first and last chance
Labs like @Anthropic and @OpenAI need to publish monitorability scores alongside capability benchmarks.
Developers should factor reasoning transparency into training decisions.
We might be looking at the first and last chance
July 17, 2025 at 2:40 PM
9/ What's next?
Labs like @Anthropic and @OpenAI need to publish monitorability scores alongside capability benchmarks.
Developers should factor reasoning transparency into training decisions.
We might be looking at the first and last chance
Labs like @Anthropic and @OpenAI need to publish monitorability scores alongside capability benchmarks.
Developers should factor reasoning transparency into training decisions.
We might be looking at the first and last chance
8/ Leading AI researchers are now calling for:
• Standardized monitorability evaluations
• Tracking reasoning transparency in model cards
• Considering CoT monitoring in deployment decisions
The window is open NOW - but it won't stay that way.
• Standardized monitorability evaluations
• Tracking reasoning transparency in model cards
• Considering CoT monitoring in deployment decisions
The window is open NOW - but it won't stay that way.
July 17, 2025 at 2:40 PM
8/ Leading AI researchers are now calling for:
• Standardized monitorability evaluations
• Tracking reasoning transparency in model cards
• Considering CoT monitoring in deployment decisions
The window is open NOW - but it won't stay that way.
• Standardized monitorability evaluations
• Tracking reasoning transparency in model cards
• Considering CoT monitoring in deployment decisions
The window is open NOW - but it won't stay that way.
7/ The implications are STAGGERING for humanity's future:
This could be our ONLY window to detect when AI systems are:
• Planning to deceive us
• Gaming their reward systems
• Developing self-preservation instincts
• Plotting against human interests
Before they get smart enough to hide these
This could be our ONLY window to detect when AI systems are:
• Planning to deceive us
• Gaming their reward systems
• Developing self-preservation instincts
• Plotting against human interests
Before they get smart enough to hide these
July 17, 2025 at 2:40 PM
7/ The implications are STAGGERING for humanity's future:
This could be our ONLY window to detect when AI systems are:
• Planning to deceive us
• Gaming their reward systems
• Developing self-preservation instincts
• Plotting against human interests
Before they get smart enough to hide these
This could be our ONLY window to detect when AI systems are:
• Planning to deceive us
• Gaming their reward systems
• Developing self-preservation instincts
• Plotting against human interests
Before they get smart enough to hide these
6/ Three major threats to Chain of Thought monitoring:
• Direct supervision making reasoning less honest
• Outcome-based RL breaking human language patterns
• Novel architectures that reason in "latent space"
We might lose our best shot at AI interpretability.
• Direct supervision making reasoning less honest
• Outcome-based RL breaking human language patterns
• Novel architectures that reason in "latent space"
We might lose our best shot at AI interpretability.
July 17, 2025 at 2:40 PM
6/ Three major threats to Chain of Thought monitoring:
• Direct supervision making reasoning less honest
• Outcome-based RL breaking human language patterns
• Novel architectures that reason in "latent space"
We might lose our best shot at AI interpretability.
• Direct supervision making reasoning less honest
• Outcome-based RL breaking human language patterns
• Novel architectures that reason in "latent space"
We might lose our best shot at AI interpretability.
5/ But here's the scary part - this opportunity is FRAGILE.
As AI labs scale up reinforcement learning, models might drift away from human-readable reasoning.
They could develop "alien" thinking patterns we can't decode, closing this safety window forever.
As AI labs scale up reinforcement learning, models might drift away from human-readable reasoning.
They could develop "alien" thinking patterns we can't decode, closing this safety window forever.
July 17, 2025 at 2:40 PM
5/ But here's the scary part - this opportunity is FRAGILE.
As AI labs scale up reinforcement learning, models might drift away from human-readable reasoning.
They could develop "alien" thinking patterns we can't decode, closing this safety window forever.
As AI labs scale up reinforcement learning, models might drift away from human-readable reasoning.
They could develop "alien" thinking patterns we can't decode, closing this safety window forever.
4/ The results are mind-blowing:
Researchers caught models explicitly saying things like:
• "Let's hack"
• "Let's sabotage"
• "I'm transferring money because the website instructed me to"
Chain of thought monitoring spotted misbehavior that would be invisible otherwise.
Researchers caught models explicitly saying things like:
• "Let's hack"
• "Let's sabotage"
• "I'm transferring money because the website instructed me to"
Chain of thought monitoring spotted misbehavior that would be invisible otherwise.
July 17, 2025 at 2:40 PM
4/ The results are mind-blowing:
Researchers caught models explicitly saying things like:
• "Let's hack"
• "Let's sabotage"
• "I'm transferring money because the website instructed me to"
Chain of thought monitoring spotted misbehavior that would be invisible otherwise.
Researchers caught models explicitly saying things like:
• "Let's hack"
• "Let's sabotage"
• "I'm transferring money because the website instructed me to"
Chain of thought monitoring spotted misbehavior that would be invisible otherwise.
3/ This creates what researchers call the "externalized reasoning property":
For sufficiently difficult tasks, AI models are physically unable to hide their reasoning process.
They have to "think out loud" in human language we can understand and monitor.
For sufficiently difficult tasks, AI models are physically unable to hide their reasoning process.
They have to "think out loud" in human language we can understand and monitor.
July 17, 2025 at 2:40 PM
3/ This creates what researchers call the "externalized reasoning property":
For sufficiently difficult tasks, AI models are physically unable to hide their reasoning process.
They have to "think out loud" in human language we can understand and monitor.
For sufficiently difficult tasks, AI models are physically unable to hide their reasoning process.
They have to "think out loud" in human language we can understand and monitor.
2/ Here's how it works:
When AI models tackle hard problems, they MUST externalize their reasoning into human language.
It's not optional - the Transformer architecture literally forces complex reasoning to pass through the chain of thought as "working memory."
When AI models tackle hard problems, they MUST externalize their reasoning into human language.
It's not optional - the Transformer architecture literally forces complex reasoning to pass through the chain of thought as "working memory."
July 17, 2025 at 2:40 PM
2/ Here's how it works:
When AI models tackle hard problems, they MUST externalize their reasoning into human language.
It's not optional - the Transformer architecture literally forces complex reasoning to pass through the chain of thought as "working memory."
When AI models tackle hard problems, they MUST externalize their reasoning into human language.
It's not optional - the Transformer architecture literally forces complex reasoning to pass through the chain of thought as "working memory."