Momchil Tomov
momchiltomov.bsky.social
Cognitive Neuroscientist @ Harvard, AI Researcher @ Motional

Models of human & robot decision making in complex environments, including video games and urban driving. https://www.momchiltomov.com/
Here are several examples of real-world cut-ins. TreeIRL anticipates the cut-in and brakes comfortably, while the other baselines either brake too late or brake uncomfortably (see inset history of vehicle kinematics).
September 18, 2025 at 3:49 PM
TreeIRL achieves 1-2 orders of magnitude improvement in safety, while also improving comfort and progress! On the road, it is by far the best planner.
September 18, 2025 at 3:48 PM
We compare TreeIRL against multiple classical and SOTA planners in 7000+ nuPlan simulations. But the most exciting result is from deploying and evaluating the planners on real self-driving cars in Las Vegas.
September 18, 2025 at 3:48 PM
We feed the MCTS trajectories into a deep scoring function trained with IRL to choose the most human-like among them.

The IRL network is trained on many hours of human expert demonstrations to effectively reverse-engineer the intrinsic reward function of human driving.
September 18, 2025 at 3:48 PM
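To make the selection step concrete, here is a minimal sketch of scoring a set of candidate trajectories and keeping the top-scored one. It is an illustration only, not the paper's implementation: the hand-written `features`, the fixed `WEIGHTS`, and the representation of a trajectory as a tuple of accelerations are all assumptions. In the thread, this role is played by a deep scoring network trained with IRL on human driving.

```python
# Hypothetical stand-in for a learned reward: a linear score over
# hand-crafted features. A real IRL scorer would be learned from data.

def features(traj):
    """Toy features of an acceleration sequence (all hypothetical)."""
    progress = sum(traj)                                     # net speed gain
    effort = sum(abs(a) for a in traj)                       # total control input
    jerk = sum(abs(b - a) for a, b in zip(traj, traj[1:]))   # input changes
    return (progress, effort, jerk)


# Assumed weights: reward progress, penalise effort and jerk.
WEIGHTS = (1.0, -0.25, -1.0)


def score(traj):
    """Weighted sum of features; higher means 'more preferred'."""
    return sum(w * f for w, f in zip(WEIGHTS, features(traj)))


def select(candidates):
    """Pick the highest-scoring trajectory from the candidate set."""
    return max(candidates, key=score)


candidates = [
    (1.0, 1.0, 0.0, 0.0),    # accelerate smoothly, then hold
    (1.0, -1.0, 1.0, 1.0),   # same net progress, but jerky
    (0.0, 0.0, 0.0, 0.0),    # no progress at all
]
best = select(candidates)
print(best)
```

With these assumed weights, the smooth trajectory wins: it makes the same progress as the jerky one while incurring far less penalty.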
MCTS uses search + ML to efficiently explore combinatorially large search spaces. In most applications (e.g. AlphaGo), MCTS outputs a single next best action.

The main innovation is to repurpose MCTS to output a *set of possible sequences* of actions (i.e., trajectories).
September 18, 2025 at 3:47 PM
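As a rough illustration of returning a set of trajectories instead of a single next action, the sketch below runs a small, generic UCB-based tree search on a toy problem and then reads out the most-visited full-depth action sequences. Everything here is an assumption made for illustration: the discrete `ACTIONS`, the `HORIZON`, the `toy_return` reward, and ranking by visit count. The paper's actual search, rollout policy, and extraction rule may differ.

```python
import math
import random

ACTIONS = (-1.0, 0.0, 1.0)  # hypothetical accelerations: brake, hold, speed up
HORIZON = 4                 # actions per trajectory (assumed)


class Node:
    def __init__(self, parent=None, action=None):
        self.parent, self.action = parent, action
        self.children = {}   # action -> Node
        self.visits = 0
        self.value = 0.0     # sum of returns backed up through this node

    def depth(self):
        return 0 if self.parent is None else self.parent.depth() + 1

    def path(self):
        """Actions from the root down to this node."""
        actions, node = [], self
        while node.parent is not None:
            actions.append(node.action)
            node = node.parent
        return tuple(reversed(actions))


def toy_return(actions, target_speed=2.0):
    """Hypothetical reward: end near a target speed with gentle inputs."""
    return -abs(sum(actions) - target_speed) - 0.1 * sum(abs(a) for a in actions)


def ucb_child(node, c=1.4):
    """Choose the child with the best mean return plus exploration bonus."""
    return max(
        node.children.values(),
        key=lambda ch: ch.value / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def search(iterations=500, seed=0):
    rng = random.Random(seed)
    root = Node()
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded.
        while node.depth() < HORIZON and len(node.children) == len(ACTIONS):
            node = ucb_child(node)
        # Expansion: add one untried action unless already at the horizon.
        if node.depth() < HORIZON:
            action = rng.choice([a for a in ACTIONS if a not in node.children])
            node.children[action] = Node(parent=node, action=action)
            node = node.children[action]
        # Rollout: complete the sequence with random actions.
        actions = list(node.path())
        while len(actions) < HORIZON:
            actions.append(rng.choice(ACTIONS))
        ret = toy_return(actions)
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.value += ret
            node = node.parent
    return root


def candidate_trajectories(root, k=5):
    """Return up to k most-visited full-depth action sequences."""
    leaves, stack = [], [root]
    while stack:
        node = stack.pop()
        if node.depth() == HORIZON:
            leaves.append(node)
        stack.extend(node.children.values())
    leaves.sort(key=lambda n: n.visits, reverse=True)
    return [leaf.path() for leaf in leaves[:k]]


candidates = candidate_trajectories(search())
for traj in candidates:
    print(traj)
```

The only change from the usual "return the best root action" readout is the last function: it walks the finished tree and returns several complete sequences, which a separate scorer can then rank.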
Why it matters (cont'd):

🧩 Flexible framework that can be extended with imitation learning and reinforcement learning.

‼️ Underscores importance of diverse metrics and real-world evaluation.
September 18, 2025 at 3:47 PM
Why this matters:

🛣️ First real-world evaluation of an MCTS-based planner on public roads.

📊 Comprehensive comparison across simulation and **500+ miles of urban driving** in Las Vegas.

🏆 Beats classical + SOTA planners, balancing safety, progress, comfort, and human-likeness.
September 18, 2025 at 3:47 PM
💡 The key idea is to use Monte Carlo tree search (MCTS) to find a promising set of safe candidate trajectories and inverse reinforcement learning (IRL) to choose the most human-like trajectory among them.

Read the full paper here --> arxiv.org/abs/2509.13579
TreeIRL: Safe Urban Driving with Tree Search and Inverse Reinforcement Learning
We present TreeIRL, a novel planner for autonomous driving that combines Monte Carlo tree search (MCTS) and inverse reinforcement learning (IRL) to achieve state-of-the-art performance in simulation a...
arxiv.org
September 18, 2025 at 3:39 PM