Lightnews — Scholar-powered news

Abhishek Gupta

@abhishekunique7.bsky.social

1.7K followers 310 following 15 posts

Assistant Professor, Paul G. Allen School of Computer Science and Engineering, University of Washington

Visiting Faculty, NVIDIA

Ph.D. from Berkeley, Postdoc MIT

https://homes.cs.washington.edu/~abhgupta

I like robots and reinforcement learning :)


Abhishek Gupta

@abhishekunique7.bsky.social

I’m also a sucker for a fun website. Check out our interactive demo where you can see some of the environments and learned behaviors. We’ve also open sourced USDZ assets of the sourced environments. (8/N)

December 5, 2024 at 2:13 AM

Abhishek Gupta

@abhishekunique7.bsky.social

Step 5: One neat feature is that in a test environment, human demos aren’t even required. Scan in an environment video to build a test-time simulation and let the generalist model provide itself demos and improve with RL in sim. Results in over 50% improvement with 0 human effort (6/N)

December 5, 2024 at 2:13 AM

Abhishek Gupta

@abhishekunique7.bsky.social

Step 4: Transfer over to the real world, either zero-shot or with some co-training. Shows scaling laws as more experience is encountered, and robust performance across distractors, object positions, visual conditions and disturbances. (5/N)

December 5, 2024 at 2:13 AM

Abhishek Gupta

@abhishekunique7.bsky.social

Step 3: Providing even 10 demos per env is still expensive. By training generalists from RL data, we get cross-environment generalization that allows the model to provide *itself* demos and only use human effort when necessary. The better the generalist gets, the less human effort is required. (4/N)

December 5, 2024 at 2:13 AM
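
The "generalist first, human only when necessary" idea from Step 3 can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: `collect_demos`, `generalist_rollout`, `request_human_demos`, and the `attempts` / `needed` parameters are all hypothetical names introduced here, and a rollout is assumed to return a trajectory on success or `None` on failure.

```python
from typing import Callable, List, Optional, Tuple

Trajectory = List[str]  # hypothetical stand-in for a recorded rollout


def collect_demos(
    env_id: str,
    generalist_rollout: Callable[[str], Optional[Trajectory]],
    request_human_demos: Callable[[str, int], List[Trajectory]],
    attempts: int = 20,
    needed: int = 3,
) -> Tuple[List[Trajectory], int]:
    """Gather `needed` demos for one environment.

    The generalist policy is tried first; humans are asked only for
    the demos it could not supply. Returns (demos, human_demo_count).
    """
    demos: List[Trajectory] = []
    for _ in range(attempts):
        traj = generalist_rollout(env_id)  # None means the rollout failed
        if traj is not None:
            demos.append(traj)
            if len(demos) == needed:
                break

    human_count = needed - len(demos)
    if human_count > 0:
        # Fall back to human effort only for the shortfall.
        demos.extend(request_human_demos(env_id, human_count))
    return demos, human_count


# Toy stand-ins (assumptions for illustration only).
def always_succeeds(env_id: str) -> Optional[Trajectory]:
    return [f"{env_id}:step0", f"{env_id}:step1"]


def always_fails(env_id: str) -> Optional[Trajectory]:
    return None


def fake_human(env_id: str, n: int) -> List[Trajectory]:
    return [[f"{env_id}:human{i}"] for i in range(n)]
```

With a generalist that always succeeds, `collect_demos("kitchen", always_succeeds, fake_human)` needs no human demos; with one that always fails, all `needed` demos come from the human fallback. As the generalist improves, the human share shrinks, which is the trend the post describes.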

Abhishek Gupta

@abhishekunique7.bsky.social

Step 2: Train policies on these environments with demo-bootstrapped RL. A couple of demos are needed to guide exploration, but the heavy lifting is done with large-scale RL in simulation. This takes success rates from 2-3% to >90% using <10 human demos. (3/N)

December 5, 2024 at 2:13 AM

Abhishek Gupta

@abhishekunique7.bsky.social

Step 1: Collect lots of environments with video scans - anyone can do it with their phone. I even had my parents scan in a bunch :) Use 3D reconstruction methods like Gaussian splats to make diverse, visually & geometrically realistic sim environments for training policies. (2/N)

December 5, 2024 at 2:13 AM

Abhishek Gupta

@abhishekunique7.bsky.social

I'm excited about scaling up robot learning! We’ve been scaling up data gen with RL in realistic sims generated from crowdsourced videos. Enables data collection far more cheaply than real world teleop. Importantly, data becomes *cheaper* with more environments and transfers to real robots! 🧵 (1/N)

December 5, 2024 at 2:13 AM
