Models of human & robot decision making in complex environments, including video games and urban driving. https://www.momchiltomov.com/
The IRL network is trained on many hours of human expert demonstrations to effectively reverse-engineer the intrinsic reward function of human driving.
The main innovation is to repurpose MCTS to output a *set of possible sequences* of actions (i.e., trajectories).
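To make the "set of trajectories" idea concrete, here is a minimal toy sketch, not the paper's planner. It assumes a one-dimensional speed-control problem, a hand-written `reward` standing in for a learned IRL reward, and a hypothetical `plan_set` helper that ranks complete root-to-leaf paths by visit count. All names, constants, and dynamics are illustrative assumptions.

```python
import math
import random

ACTIONS = (-1, 0, 1)  # hypothetical actions: brake, hold speed, accelerate
HORIZON = 4           # planning depth in steps (assumption)
TARGET_SPEED = 3      # speed the toy reward prefers (assumption)


def step(speed, action):
    """Toy dynamics: speed changes by the action, clamped to [0, 5]."""
    return max(0, min(5, speed + action))


def reward(speed, action):
    """Hand-written stand-in for a learned (e.g. IRL) reward."""
    return -abs(speed - TARGET_SPEED) - 0.1 * abs(action)


class Node:
    def __init__(self, speed, depth):
        self.speed = speed
        self.depth = depth
        self.children = {}  # action -> Node
        self.visits = 0
        self.total = 0.0    # sum of sampled returns through this node


def ucb(parent, child, c):
    """UCB1 score: mean return plus an exploration bonus."""
    mean = child.total / child.visits
    return mean + c * math.sqrt(math.log(parent.visits) / child.visits)


def rollout(speed, depth, rng):
    """Random playout to the horizon; returns the accumulated reward."""
    ret = 0.0
    while depth < HORIZON:
        a = rng.choice(ACTIONS)
        speed = step(speed, a)
        ret += reward(speed, a)
        depth += 1
    return ret


def simulate(node, rng, c=1.4):
    """Run one MCTS iteration below `node`; return the sampled return-to-go."""
    if node.depth == HORIZON:
        return 0.0
    untried = [a for a in ACTIONS if a not in node.children]
    if untried:
        # Expansion, then a random rollout from the new child.
        a = rng.choice(untried)
        child = Node(step(node.speed, a), node.depth + 1)
        node.children[a] = child
        future = rollout(child.speed, child.depth, rng)
    else:
        # Selection: follow the child with the highest UCB1 score.
        a = max(ACTIONS, key=lambda act: ucb(node, node.children[act], c))
        child = node.children[a]
        future = simulate(child, rng, c)
    ret = reward(child.speed, a) + future
    # Backup: record the sampled return on the child.
    child.visits += 1
    child.total += ret
    return ret


def trajectories(node, prefix=()):
    """Yield (actions, visits) for every root-to-leaf path in the tree."""
    if not node.children:
        yield prefix, node.visits
        return
    for a, child in node.children.items():
        yield from trajectories(child, prefix + (a,))


def plan_set(start_speed, iterations=2000, k=3, seed=0):
    """Return the k most-visited full-horizon action sequences."""
    rng = random.Random(seed)
    root = Node(start_speed, 0)
    for _ in range(iterations):
        simulate(root, rng)
        root.visits += 1
    full = [(acts, n) for acts, n in trajectories(root) if len(acts) == HORIZON]
    full.sort(key=lambda item: item[1], reverse=True)
    return [acts for acts, _ in full[:k]]


if __name__ == "__main__":
    for acts in plan_set(start_speed=1):
        print(acts)
```

The only departure from textbook MCTS here is the last step: instead of committing to the single most-visited root action, the search tree is read out as several complete candidate trajectories. Ranking by visit count is one simple choice made for this sketch; the thread does not say how the actual planner selects its set.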
🧩 Flexible framework that can be extended with imitation learning and reinforcement learning.
‼️ Underscores importance of diverse metrics and real-world evaluation.
🛣️ First real-world evaluation of MCTS-based planner on public roads.
📊 Comprehensive comparison across simulation and **500+ miles of urban driving** in Las Vegas.
🏆 Beats classical + SOTA planners, balancing safety, progress, comfort, and human-likeness.
Read the full paper here: arxiv.org/abs/2509.13579