Hyunwoo Kim
hyunwoo-kim.bsky.social
Social Reasoning/Cognition + AI, Postdoc at NVIDIA | Previously @ai2.bsky.social | PhD from Seoul Natl Univ.
http://hyunwookim.com
Results show ThoughtTracing (TT) outperforms reasoning models while using significantly fewer output tokens. Also, unlike in math, we do not observe substantially higher token usage for incorrect responses from reasoning models on ToM benchmarks; in some cases, the pattern is even reversed 🤔 TT shows balanced token usage.
February 20, 2025 at 5:34 PM
We present ThoughtTracing💭, an inference-time reasoning algorithm for tracing the mental states of specific agents. It's inspired by the sequential Monte Carlo algorithm and modeled after the Bayesian ToM framework, using LLMs to approximate probabilistic inference over agents’ evolving mental states.
February 20, 2025 at 5:34 PM
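For readers unfamiliar with sequential Monte Carlo, here is a minimal, generic sketch of the idea the post above refers to: keep a set of weighted hypotheses about an agent's mental state, reweight them after each observation, and resample when the weights collapse. This is not the ThoughtTracing implementation. The names `trace_hypotheses`, `toy_propose`, `toy_likelihood`, and `posterior` are hypothetical, and the toy stubs only stand in for the places where an LLM might generate or score hypotheses.

```python
import random
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple


def trace_hypotheses(
    observations: Sequence[str],
    initial_hypotheses: Sequence[str],
    propose: Callable[[str, str], str],
    likelihood: Callable[[str, str], float],
    rng: random.Random,
) -> List[Tuple[str, float]]:
    """Generic particle-filter loop over text hypotheses (illustrative only)."""
    particles = list(initial_hypotheses)
    n = len(particles)
    weights = [1.0 / n] * n
    for obs in observations:
        # Propagate: update each hypothesis given the new observation.
        particles = [propose(h, obs) for h in particles]
        # Reweight: score how well each hypothesis explains the observation.
        weights = [w * likelihood(h, obs) for w, h in zip(weights, particles)]
        total = sum(weights)
        # Normalize, falling back to uniform if every weight is zero.
        weights = [w / total for w in weights] if total > 0 else [1.0 / n] * n
        # Resample when the effective sample size drops below half the particles.
        ess = 1.0 / sum(w * w for w in weights)
        if ess < n / 2:
            particles = rng.choices(particles, weights=weights, k=n)
            weights = [1.0 / n] * n
    return list(zip(particles, weights))


def posterior(weighted: Sequence[Tuple[str, float]]) -> Dict[str, float]:
    """Sum the weights of identical hypotheses."""
    totals: Dict[str, float] = defaultdict(float)
    for hypothesis, weight in weighted:
        totals[hypothesis] += weight
    return dict(totals)


# Toy stand-ins (assumptions): a real system might call an LLM at these points.
def toy_propose(hypothesis: str, observation: str) -> str:
    return hypothesis  # keep the hypothesis unchanged


def toy_likelihood(hypothesis: str, observation: str) -> float:
    return 0.9 if "basket" in hypothesis else 0.1  # arbitrary illustrative scores


BASKET = "Sally believes the ball is in the basket"
BOX = "Sally believes the ball is in the box"
result = posterior(
    trace_hypotheses(
        observations=["Sally leaves the room", "Anne moves the ball to the box"],
        initial_hypotheses=[BASKET, BASKET, BOX, BOX],
        propose=toy_propose,
        likelihood=toy_likelihood,
        rng=random.Random(0),
    )
)
```

In this toy run, `result` maps each hypothesis to its share of the total weight, so the hypothesis the stub scores higher ends up with most of the probability mass.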
🚨New Paper! So o3-mini and R1 seem to excel on math & coding. But how good are they in other domains where verifiable rewards are not easily available, such as theory of mind (ToM)? Do they show similar behavioral patterns? 🤔 What if I told you it's...interesting, like what's below?🧵
February 20, 2025 at 5:34 PM