Manuel Sánchez
manuel-sh.bsky.social
Building AI at scale in the enterprise world.

manuelsh.github.io
The error rate of GPT-4o is even higher, at 1.5%. An agent making 20 calls will have an error rate of about 26%! That doesn't scale.

Of course, there are mechanisms to reduce it, like the agent running tests on its own results, but that still implies more calls. (3/3)
April 17, 2025 at 8:19 AM
But even taking the level of the best model, a 0.7% per-call error rate compounds to a ~13% error rate for an agent that makes 20 calls to that LLM. Agents often make more calls than that.

13% is a very high error rate if we want to use agents at scale. (2/3)
April 17, 2025 at 8:19 AM
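A minimal sketch of the arithmetic behind the 13% and 26% figures in this thread. It assumes that errors on each call are independent and that a single failed call fails the whole agent run; neither assumption is stated in the posts, and the function name `agent_error_rate` is hypothetical.

```python
def agent_error_rate(per_call_error: float, calls: int) -> float:
    """Probability that at least one of `calls` LLM calls fails.

    Assumes independent calls with the same per-call error rate
    (an illustrative simplification, not a claim from the thread).
    """
    return 1.0 - (1.0 - per_call_error) ** calls


# Per-call error rates quoted in the thread, compounded over 20 calls.
for label, p in [("0.7% per call", 0.007), ("1.5% per call", 0.015)]:
    print(f"{label}: {agent_error_rate(p, 20):.1%} over 20 calls")
# 0.7% per call: 13.1% over 20 calls
# 1.5% per call: 26.1% over 20 calls
```

Under the same assumptions, raising `calls` pushes the compounded rate up further, which is why any extra verification calls need to catch more errors than they introduce.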
linked it from the blog post! thanks!
January 16, 2025 at 10:03 PM
Top in the "How to argue" hierarchy: pointing out a flaw in the central point ;-) just a liiiiiiitle flaw
January 16, 2025 at 9:41 PM
Well explained in this website:
phillipi.github.io/prh/#what_co...
The Platonic Representation Hypothesis
phillipi.github.io
January 12, 2025 at 11:21 PM
Hey, great to hear it! Where can we find it?
January 12, 2025 at 10:31 PM