#o4-mini
metr.org/blog/2025-03... h/t @simonwillison.net

For people who were citing the earlier METR study showing no increase in open source contribution speed, update your priors. Opus 4.5 can autonomously complete complex tasks 50% of the time that would take a human 4+ hours to do.
January 1, 2026 at 5:48 AM
OpenAI released a new benchmark that GPT-5.2 wins

openai.com/index/fronti...
December 16, 2025 at 7:45 PM
Gemini Deep Research is available in the API

it uses Gemini 3 Pro, MCP, documents

blog.google/technology/d...
December 11, 2025 at 11:13 PM
Update: @propublica.org has finally responded privately. They say they intend to CONTINUE using illegal, theft-based LLMs, the specific one in this case being OpenAI's o4-mini. They refused to answer the questions about why they're parroting AI companies' propaganda around hallucinations...
Nearly a week now without any answers or transparency from @propublica.org in response to dozens of readers concerned about their misguided use of an unnamed generative AI tool. Just sent this email to hello@propublica.org—I encourage anyone who donates to them to ask for these answers as well.
March 21, 2025 at 7:59 PM
‘The company’s o1 reasoning model “hallucinated 16 percent of the time” when summarizing public information, while newer models o3 and o4-mini “hallucinated 33 percent and 48 percent of the time, respectively.”’

What could go wrong?
This seems like a problem for OpenAI's business model.
"In a landmark study, OpenAI researchers reveal that large language models will always produce plausible but false outputs, even with perfect data, due to fundamental statistical and computational limits."

www.computerworld.com/article/4059...
September 21, 2025 at 10:26 PM
o4-mini in Action: Deep Reasoning Over Text and Images @Azure #DeepLearning #AI #MachineLearning
dlvr.it
September 8, 2025 at 7:04 PM
Artificialanalysis.ai initially puts o4-mini on the efficient frontier, beating Gemini 2.5 on the cost-performance curve
AI Model & API Providers Analysis | Artificial Analysis
Comparison and analysis of AI models and API hosting providers. Independent benchmarks across key performance metrics including quality, price, output speed & latency.
artificialanalysis.ai
April 17, 2025 at 3:15 PM
OpenAI is releasing new models: GPT-4.1, o3, and o4-mini

GPT-4.1 is available only in the API; the others are available in ChatGPT.

o3 is the smartest reasoning model so far, and o4-mini is its fast sibling. Both think with images and chain tools autonomously.

#openai #chatgpt
April 18, 2025 at 1:33 PM
OpenAI rolls out Flex, a lower-cost option for using o3 and o4-mini models, with slower speeds and less reliability. A move toward more customizable AI access, or a sign of growing complexity in AI infrastructure?

Read more: itmatterss.in/flex-process...

#Flex #OpenAI #ChatGPT #AItools
Flex Processing: OpenAI Launches Cheaper & Slower AI
OpenAI rolls out Flex, a lower-cost API option for its o3 and o4-mini models. But is the price cut worth slower speeds and limited access?
itmatterss.in
April 18, 2025 at 6:04 AM
OpenAI has introduced a new feature in ChatGPT that lets it use "memory" to personalize web searches. The update, released alongside the new versions of the o3 and o4-mini AI models and called "Memory with Search," is now available in ChatGPT and allows the AI to...
ChatGPT will now use users' memories to personalize web searches
OpenAI has introduced a new feature in ChatGPT that lets it use "memory" to personalize web searches. The update, released alongside the new versions of the o3 and o4-mini AI models and called "Memory with Search," is now available in ChatGPT and allows the AI to take into account information from users' previous queries for more accurate and useful internet searches. According to OpenAI's official help center, the new feature combines the AI's ability to remember user preferences with an online search function powered by Bing or other OpenAI partners.
www.kaldata.com
April 19, 2025 at 12:58 PM
OpenAI literally openly admits that their o3/o4-mini models hallucinate more than o1 🤯

In their publicly available o3/o4-mini model card report, section 3.3, they write that o4-mini hallucinated almost 50% of the time in a specific benchmark, much higher than o1.
April 20, 2025 at 12:00 AM
xkcd compiling comic but it's "o4-mini is thinking in cursor"
April 20, 2025 at 3:08 PM
i’m starting to get a feel for o3 & o4-mini, and they are NOT as advertised — drop-in replacements for o1 & o3-mini

they’re agents. if you use them as agents, they’re much *better*, but if you use them as word calculators, they’re far worse

they’re a new thing
April 20, 2025 at 11:23 AM
A plan-by-plan summary of the AI models available in ChatGPT: OpenAI announces details of the usage limits for o3, o4-mini, and o4-mini-high in ChatGPT - GIGAZINE
news.google.com/rss/articles/CBMibkFVX3lxTE5KRFRJVjRWemgySEFFaFloNEN5QURXZnJUWms5VjJMWWFpR0lBRXMxVXZLMklIckRpcVd5dW1UcV9vRFozN0h2UnRwZHR2dDBJNkNXaEtzd0RuTTB3WHpXdkx6UFJJOUpkVHMzSFhB?oc=5
April 21, 2025 at 10:09 PM
Here's some news~

>OpenAI's "o3" and "o4-mini" turn out to be more prone to "hallucinations" than earlier AI models
- https://gigazine.net/news/20250421-openai-hallucinate-o3-o4-mini/
April 21, 2025 at 5:02 AM
OpenAI o3 and o4-mini System Card OpenAI o3 and OpenAI o4-mini combine state-of-the-art reasoning...

https://openai.com/index/o3-o4-mini-system-card

Result Details
Awakari App
awakari.com
April 21, 2025 at 11:10 AM
Personally, I find o4-mini very capable and have been using it heavily. I think it's the definitive lightweight reasoning model.
April 22, 2025 at 11:10 AM
At a secret math meeting, leading mathematicians were stunned by OpenAI's o4-mini, a reasoning LLM solving "hardest solvable" problems. Its rapid, insightful deductions highlight alarming AI progress in complex mathematical reasoning. #MLSky
Inside the Secret Meeting Where Mathematicians Struggled to Outsmart AI
The world's leading mathematicians were stunned by how adept artificial intelligence is at doing their jobs
www.scientificamerican.com
June 9, 2025 at 3:43 PM
Which model: 4o, o4-mini, or?
May 26, 2025 at 2:35 PM
they're using o4-mini which you likely don't have access to, and then giving it well formulated problems in a style that is suited to its strengths. i'm sure it will still get tripped up or stumped if you deviate too far from the "norm".
June 7, 2025 at 6:07 PM
o4-mini-2025-04-16- strategy 📈 | Long TP:145.519 SL:145.05
Strong uptrend, bounce off S1(145.056), ATR-based SL, RR=2.03, ZigZag resistance at 145.519 L
#USDJPY
June 20, 2025 at 12:53 AM
o4-mini-2025-04-16- strategy 📈 | Short TP:0.64013 SL:0.648
Strong downtrend, rejection at 0.6479 resistance, spread acceptable, RR=3.5, Merrill M6 continuation S
#AUDUSD
June 19, 2025 at 12:35 PM
o4-mini-2025-04-16- strategy 📈 | Long TP:0.65138 SL:0.65005
Strong uptrend, bounce above pivot 0.65005, low spread, zigzag resistance at 0.65138, RR=2.7, Merrill W13 continuation L
#AUDUSD
June 18, 2025 at 1:20 PM
o4-mini-2025-04-16- strategy 📈 | Short TP:1.33136 SL:1.34778
Strong downtrend, resistance pivot 1.3478, spread acceptable, RR=3.0, M7 continuation S
#GBPUSD
June 18, 2025 at 1:41 PM
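The strategy posts above quote a take-profit (TP), a stop-loss (SL), and a risk-reward ratio (RR), but no entry price. As a rough sketch only, assuming RR means distance-to-TP divided by distance-to-SL and that the entry sits between SL and TP (neither is stated in the posts), the implied entry can be backed out. The function names `risk_reward` and `implied_entry` are hypothetical helpers for illustration, not part of any tool mentioned in the feed.

```python
def risk_reward(entry: float, tp: float, sl: float) -> float:
    """Reward-to-risk ratio: distance to take-profit over distance to stop-loss."""
    risk = abs(entry - sl)
    if risk == 0:
        raise ValueError("entry and stop-loss must differ")
    return abs(tp - entry) / risk


def implied_entry(tp: float, sl: float, rr: float) -> float:
    """Entry price that reproduces the quoted RR.

    Assumption: the entry lies between SL and TP. Under that assumption the
    same formula holds for long and short trades.
    """
    if rr <= 0:
        raise ValueError("rr must be positive")
    return (tp + rr * sl) / (1 + rr)


# (pair, side, TP, SL, RR) copied from the posts above.
signals = [
    ("USDJPY", "long", 145.519, 145.05, 2.03),
    ("AUDUSD", "short", 0.64013, 0.648, 3.5),
    ("AUDUSD", "long", 0.65138, 0.65005, 2.7),
    ("GBPUSD", "short", 1.33136, 1.34778, 3.0),
]

for pair, side, tp, sl, rr in signals:
    entry = implied_entry(tp, sl, rr)
    check = risk_reward(entry, tp, sl)
    print(f"{pair} {side}: implied entry {entry:.5f}, RR check {check:.2f}")
```

If the bot defines RR differently (for example, net of spread), the implied entries would shift, so these values are only a sanity check on the quoted numbers.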