Kane Gregory
hobojed.bsky.social
They fail all over the place. This is 4o, so it's not a "reasoning" model, but it shows how LLMs quickly veer into garbage after a confabulation. This is a huge issue that will hinder any hopes of reliably getting them to reason over large real-world (aka messy) problems.
January 29, 2025 at 5:29 PM