Law Professor, Loyola Law School, Los Angeles
colin.doyle@lls.edu
https://www.colin-doyle.net/
the-decoder.com/language-mod...
Notably, o1-preview doesn't seem to share this problem with Connections puzzles. Very curious whether this was a particular problem OpenAI targeted, and how.
With a complicated prompt system, GPT-4o was able to solve 86% of puzzles. OpenAI's new o1 model is the strongest. When prompted to make one guess at a time and receiving feedback on bad guesses, it could solve 100%.