Vivek Myers
vivekmyers.bsky.social
PhD student @Berkeley_AI
reinforcement learning, AI, robotics
This trick encourages a form of time invariance during learning: both nearby and distant goals are represented similarly. By additionally aligning language instructions 𝜉(ℓ) to the goal representations 𝜓(𝑔), the policy can also perform new compound language tasks. 3/
February 14, 2025 at 1:39 AM
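One way to picture the alignment step mentioned above is a contrastive objective that pulls each instruction embedding 𝜉(ℓ) toward the embedding 𝜓(𝑔) of its paired goal. The post does not say which loss is used, so the symmetric InfoNCE-style loss below is an assumption, and the names `info_nce`, `alignment_loss`, and `temperature`, along with the random toy embeddings, are hypothetical placeholders rather than the paper's implementation.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(anchors, targets, temperature=0.1):
    """Mean cross-entropy of picking targets[i] for anchors[i] among all targets."""
    anchors = [normalize(a) for a in anchors]
    targets = [normalize(t) for t in targets]
    total = 0.0
    for i, a in enumerate(anchors):
        # Cosine similarity of this anchor to every target, scaled by temperature.
        logits = [sum(x * y for x, y in zip(a, t)) / temperature for t in targets]
        # Numerically stable log-sum-exp.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_z - logits[i]
    return total / len(anchors)


def alignment_loss(lang_emb, goal_emb, temperature=0.1):
    """Symmetric loss (assumed): language-to-goal plus goal-to-language."""
    return 0.5 * (
        info_nce(lang_emb, goal_emb, temperature)
        + info_nce(goal_emb, lang_emb, temperature)
    )


# Toy data standing in for encoder outputs: 8 goals, 16-dimensional embeddings.
random.seed(0)
goal_emb = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]
# Language embeddings that nearly coincide with their paired goal embeddings.
aligned_lang = [[x + 0.01 * random.gauss(0, 1) for x in g] for g in goal_emb]
# The same embeddings in reversed order, so every pairing is wrong.
mismatched_lang = aligned_lang[::-1]

print("aligned loss:   ", alignment_loss(aligned_lang, goal_emb))
print("mismatched loss:", alignment_loss(mismatched_lang, goal_emb))
```

In this toy setup the correctly paired embeddings give a much lower loss than the reversed pairing, which is the signal a training loop would minimize when updating the two encoders.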
Current robot learning methods are good at imitating tasks seen during training, but struggle to compose behaviors in new ways. When training imitation policies, we found something surprising: using temporally aligned task representations enabled compositional generalization. 1/
February 14, 2025 at 1:39 AM
Reinforcement learning agents should be able to improve upon behaviors seen during training.
In practice, RL agents often struggle to generalize to new long-horizon behaviors.
Our new paper studies *horizon generalization*, the degree to which RL algorithms generalize to reaching distant goals. 1/
February 4, 2025 at 8:37 PM