Alex Heyman
alexheyman.bsky.social
PhD candidate AI/ML researcher at York University, ON, CA | they/them
OpenAI and DeepSeek’s reasoning LLMs have scored impressively on benchmarks that challenge humans, but how robust are their fundamentals? We test o1-mini & R1 on small-scale graph coloring problems and find limited reliability and signs of issues with non-linear reasoning.
(1/10)
February 13, 2025 at 6:14 PM