@tankred-saanum.bsky.social
Stop by poster #6009 on Friday and check out the paper, which is joint work with the amazing Peter Dayan and Eric Schulz @ericschulz.bsky.social

arxiv.org/abs/2401.17835

And reach out if you're in Vancouver and want to have a chat about RL, cogsci or mechanistic interpretability!
Simplifying Latent Dynamics with Softly State-Invariant World Models
December 10, 2024 at 3:06 PM
The bottleneck is very different from more conventional regularization techniques: while L1 and L2 norm minimization shrinks the latent space, PLSM just makes the transition dynamics more regular. Below is the learned latent space of a 15x15 grid world.
December 10, 2024 at 3:06 PM
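To make the contrast concrete, here is a minimal PyTorch sketch, a hypothetical illustration rather than the paper's implementation (the module names, dimensions, and squared-error form of the penalty are all assumptions): an L2 penalty acts on the latent codes themselves, whereas a softly state-invariant penalty pulls the state-conditioned transition toward an action-only one, regularizing the dynamics instead of the representation.

```python
import torch
import torch.nn as nn

latent_dim, action_dim = 32, 4

# State-conditioned transition: predicts the latent change from (z, a).
state_transition = nn.Sequential(
    nn.Linear(latent_dim + action_dim, 64), nn.ReLU(), nn.Linear(64, latent_dim)
)
# Action-only transition: predicts the latent change from a alone.
action_transition = nn.Sequential(
    nn.Linear(action_dim, 64), nn.ReLU(), nn.Linear(64, latent_dim)
)

z = torch.randn(128, latent_dim)   # batch of latent states
a = torch.randn(128, action_dim)   # batch of actions

delta_full = state_transition(torch.cat([z, a], dim=-1))
delta_action_only = action_transition(a)

# L2 regularization shrinks the latent codes themselves:
l2_penalty = z.pow(2).mean()

# A softly state-invariant penalty instead pulls the state-conditioned
# prediction toward a state-independent one, so the same action tends to
# cause the same latent change regardless of where the agent is:
invariance_penalty = (delta_full - delta_action_only).pow(2).mean()
```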
Our method also improves the world model's prediction accuracy on trajectory datasets.
December 10, 2024 at 3:06 PM
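For reference, prediction accuracy on trajectories is typically measured with open-loop rollouts. A self-contained sketch (the one-step model f and the dummy data are assumptions for illustration):

```python
import torch
import torch.nn as nn

# A hypothetical one-step latent model: z_{t+1} ~ z_t + f(z_t, a_t).
f = nn.Sequential(nn.Linear(32 + 4, 64), nn.ReLU(), nn.Linear(64, 32))

def rollout_mse(z0, actions, z_targets):
    """Open-loop rollout: the model consumes its own predictions, and we
    measure mean squared error against the recorded latent trajectory."""
    z, errors = z0, []
    for a, z_true in zip(actions, z_targets):
        z = z + f(torch.cat([z, a], dim=-1))
        errors.append((z - z_true).pow(2).mean().item())
    return errors

# Dummy 10-step trajectory standing in for an encoded dataset trajectory.
z0 = torch.randn(8, 32)
actions = [torch.randn(8, 4) for _ in range(10)]
z_targets = [torch.randn(8, 32) for _ in range(10)]
print(rollout_mse(z0, actions, z_targets))
```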
We add our bottleneck to existing model-based algorithms like RePo and TDMPC and see nice improvements in DMC and the Distracting Control Suite. Since visual distractions have dynamics independent of the agent's actions, our bottleneck compresses them away, improving generalization.
December 10, 2024 at 3:06 PM
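To give a rough sense of how such an integration might look: the bottleneck contributes one extra term to whatever objective the base algorithm already optimizes. A toy sketch with stand-in scalars (the term names and the weight beta are assumptions, not values from the paper):

```python
import torch

# Stand-in scalars; in a real update these come from the base algorithm
# (e.g. TDMPC's latent-consistency and reward losses) and from the bottleneck.
consistency_loss = torch.tensor(0.42)
reward_loss = torch.tensor(0.10)
state_information_penalty = torch.tensor(0.07)  # PLSM-style term

beta = 1e-3  # trade-off weight, a free hyperparameter
total_loss = consistency_loss + reward_loss + beta * state_information_penalty
```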
How can we make our models learn these invariances? We introduce PLSM (the Parsimonious Latent Space Model), a bottleneck method that predicts how an action changes the environment while using as little information about the state as possible.
December 10, 2024 at 3:06 PM
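One plausible way to realize such a bottleneck, shown here as a sketch under assumptions (the paper's exact objective may differ), is to let the transition model see the state only through a noisy, capacity-limited channel and penalize the channel's capacity with a KL term, so the action carries as much of the predictive load as possible:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BottleneckedTransition(nn.Module):
    """Hypothetical sketch: the transition model sees the state only through
    a noisy variational channel, so it learns to rely on the action whenever
    the action alone determines the latent change."""
    def __init__(self, latent_dim=32, action_dim=4, hidden=64):
        super().__init__()
        self.state_enc = nn.Linear(latent_dim, 2 * latent_dim)  # mean, log-variance
        self.dynamics = nn.Sequential(
            nn.Linear(latent_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, z, a):
        mu, logvar = self.state_enc(z).chunk(2, dim=-1)
        z_channel = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        delta = self.dynamics(torch.cat([z_channel, a], dim=-1))
        # KL to a standard normal prior measures how much state information
        # leaks through the channel; minimizing it implements the bottleneck.
        kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(-1).mean()
        return delta, kl

model = BottleneckedTransition()
z, a, z_next = torch.randn(128, 32), torch.randn(128, 4), torch.randn(128, 32)
delta, kl = model(z, a)
loss = F.mse_loss(z + delta, z_next) + 1e-3 * kl  # beta weighting is a free choice
```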
Our actions tend to change the state of the world in systematic and predictable ways. When driving a car, turning the wheel almost invariably changes the direction we’re moving in. It doesn't really matter what road we’re driving on, or what country we’re driving in.
December 10, 2024 at 3:06 PM