Lightnews — Scholar-powered news

Daphne Cornelisse

@daphne-cornelisse.bsky.social

240 followers 48 following 14 posts

PhD student at NYU | Building human-like agents | https://www.daphne-cornelisse.com/

Daphne Cornelisse

@daphne-cornelisse.bsky.social

Results (2): Beyond in-distribution generalization, our agents show partial robustness to scenarios that rarely occur in the data.

More importantly, results show that agents can be fine-tuned in minutes to reach near-perfect performance in such cases.

February 28, 2025 at 5:19 PM

Daphne Cornelisse

@daphne-cornelisse.bsky.social

Results (1): Self-play scales well with data. With 10,000 training scenarios, the model comes close to the ceiling of our benchmark, achieving a 99.81% goal-reaching rate, 0.44% collision rate, and 0.31% off-road rate on 10,000 held-out test scenarios.

February 28, 2025 at 5:19 PM

Daphne Cornelisse

@daphne-cornelisse.bsky.social

SOTA generative models trained on large human datasets show unintended behaviors like crashes (5-6%) and off-road events (6-12%) in benchmarks for nominal driving.

Unpredictable deviations make it hard to separate signal from noise.

February 28, 2025 at 5:19 PM

Daphne Cornelisse

@daphne-cornelisse.bsky.social

Sim agents are key for developing autonomous systems in safety-critical domains, like self-driving cars.

We're open-sourcing sim agents that achieve a 99.8% success rate with < 0.8% failures on the Waymo Dataset. These agents are built through scaling self-play.

February 28, 2025 at 5:19 PM

Daphne Cornelisse

@daphne-cornelisse.bsky.social

GPUDrive got accepted to ICLR 2025!

With that, we release GPUDrive v0.4.0! 🚨 You can now install the repo and run your first fast PPO experiment in under 10 minutes.

I’m honestly so excited about the new opportunities and research the sim makes possible. 🚀 1/2

February 20, 2025 at 6:53 PM
