7/8
6/8
5/8
4/8
Our approach is to use the human priors found in foundation models. We extend MOTIF to VLMs: a VLM compares pairs of observations collected through self-supervised exploration, and these preferences are distilled into a reward function.
3/8
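For intuition, here is a minimal sketch of what MOTIF-style preference distillation can look like in PyTorch (Bradley-Terry loss on pairwise labels). The class and variable names are illustrative assumptions, not the paper's code, and observations are assumed to be pre-encoded feature vectors with VLM preference labels already collected.

```python
# Sketch: distill VLM pairwise preferences into a scalar reward model.
# Assumptions (not from the thread): observations are pre-encoded vectors,
# and `prefers_first` holds the VLM's binary choice per pair.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    def __init__(self, obs_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs).squeeze(-1)  # scalar reward per observation

def preference_loss(model, obs_a, obs_b, prefers_first):
    """Bradley-Terry style loss: the VLM-preferred observation should
    receive the higher predicted reward."""
    logits = model(obs_a) - model(obs_b)    # r(a) - r(b)
    targets = prefers_first.float()         # 1 if a preferred, else 0
    return nn.functional.binary_cross_entropy_with_logits(logits, targets)

# Usage on a dummy batch of 32 pairs with 128-dim observation embeddings.
model = RewardModel(obs_dim=128)
opt = torch.optim.Adam(model.parameters(), lr=3e-4)
obs_a, obs_b = torch.randn(32, 128), torch.randn(32, 128)
prefers_first = torch.randint(0, 2, (32,))
loss = preference_loss(model, obs_a, obs_b, prefers_first)
opt.zero_grad(); loss.backward(); opt.step()
```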
Children solve this by observing and imitating adults. We bring such semantic exploration to artificial agents.
2/8
With intrinsic rewards for novel yet useful behaviors, SENSEI showcases strong exploration in MiniHack, Pokémon Red & Robodesk.
Accepted at ICML 2025🎉
Joint work with @cgumbsch.bsky.social
🧵
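To make "novel yet useful" concrete, here is a hedged sketch of one way to combine a distilled semantic reward (useful, per the VLM) with a novelty bonus (novel). SENSEI's actual intrinsic reward is computed in a model-based setup, so the count-based bonus and the weighting below are stand-in assumptions, not the paper's recipe.

```python
# Sketch: intrinsic reward = semantic reward + novelty bonus.
# The count-based bonus and beta weighting are illustrative assumptions.
from collections import Counter
import torch

visit_counts: Counter = Counter()

def novelty_bonus(obs_key: str) -> float:
    """1/sqrt(N) count-based bonus; a simple stand-in for a
    model-based novelty or uncertainty estimate."""
    visit_counts[obs_key] += 1
    return visit_counts[obs_key] ** -0.5

def intrinsic_reward(reward_model, obs: torch.Tensor, obs_key: str,
                     beta: float = 0.5) -> float:
    semantic = reward_model(obs.unsqueeze(0)).item()  # "useful" per the VLM
    return semantic + beta * novelty_bonus(obs_key)   # plus "novel"
```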