tankred-saanum.bsky.social
The bottleneck is very different from more conventional regularization techniques: while L1 and L2 norm minimization shrink the latent space, PLSM just makes the transition dynamics more regular. Below is the learned latent space of a 15x15 grid world.
December 10, 2024 at 3:06 PM
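To make the contrast with L1/L2 regularization concrete, here is a minimal NumPy sketch. It is not the PLSM implementation: the residual dynamics form, the `query` function, the weight matrices `W_q` and `W_d`, and all dimensions are hypothetical choices made for illustration. The only point is that an L2 penalty acts on the latent code itself, while a penalty on the state-dependent input to the dynamics leaves the latent space alone and pushes the effect of an action to be the same across states.

```python
import numpy as np

rng = np.random.default_rng(0)
Z_DIM, A_DIM, H_DIM = 4, 2, 3  # illustrative sizes, not taken from the paper


def l2_latent_penalty(z):
    """Conventional regularizer: penalizes the magnitude of the latent code."""
    return np.mean(np.sum(z ** 2, axis=-1))


def query(z, a, W_q):
    """Hypothetical state-dependent code h that the dynamics may consult."""
    return np.tanh(np.concatenate([z, a], axis=-1) @ W_q)


def predict_next(z, a, W_q, W_d):
    """Assumed residual dynamics: z' = z + delta(h, a)."""
    h = query(z, a, W_q)
    delta = np.concatenate([h, a], axis=-1) @ W_d
    return z + delta, h


def dynamics_penalty(h):
    """Penalizes how much state information the dynamics use, not z itself."""
    return np.mean(np.sum(h ** 2, axis=-1))


# Five different latent states, all receiving the same action.
z = rng.normal(size=(5, Z_DIM))
a = np.tile(rng.normal(size=(1, A_DIM)), (5, 1))

# Limiting case of the dynamics penalty: the query ignores the state entirely.
W_q = np.zeros((Z_DIM + A_DIM, H_DIM))
W_d = rng.normal(size=(H_DIM + A_DIM, Z_DIM))

z_next, h = predict_next(z, a, W_q, W_d)
delta = z_next - z

print("latent L2 penalty:", l2_latent_penalty(z))
print("dynamics penalty: ", dynamics_penalty(h))
print("same action effect in every state:", np.allclose(delta, delta[0]))
```

In this limiting case the dynamics penalty is zero even though the latent states keep their full scale, and the action shifts every state by the same vector. A soft version of the penalty would sit between this extreme and fully state-dependent dynamics.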
Our method also improves the world model's prediction accuracy in trajectory datasets.
December 10, 2024 at 3:06 PM
We add our bottleneck to existing model-based algorithms like RePo and TDMPC and see nice improvements in DMC and the Distracting Control Suite. Since visual distractions have dynamics independent of the agent's actions, our bottleneck compresses them away, improving generalization.
December 10, 2024 at 3:06 PM
I'm at #NeurIPS to present new work on softly state-invariant world models! We introduce an info bottleneck making world models represent action effects more consistently in latent space, improving prediction and planning! Reach out if you want to meet!
December 10, 2024 at 3:06 PM