Daphne Cornelisse
daphne-cornelisse.bsky.social
PhD student at NYU | Building human-like agents | https://www.daphne-cornelisse.com/
Results (2): Beyond in-distribution generalization, our agents show partial robustness to scenarios that rarely occur in the data.

More importantly, results show that agents can be fine-tuned in minutes to reach near-perfect performance in such cases.
February 28, 2025 at 5:19 PM
Results (1): Self-play scales well with data. With 10,000 training scenarios, the model nearly reaches the ceiling of our benchmark, achieving a 99.81% goal-reaching rate, 0.44% collision rate, and 0.31% off-road rate on 10,000 held-out test scenarios.
February 28, 2025 at 5:19 PM
SOTA generative models trained on large human datasets show unintended behaviors like crashes (5-6%) and off-road events (6-12%) in benchmarks for nominal driving.

Unpredictable deviations make it hard to separate signal from noise.
February 28, 2025 at 5:19 PM
Sim agents are key for developing autonomous systems in safety-critical domains, like self-driving cars.

We're open-sourcing sim agents that achieve a 99.8% success rate with < 0.8% failures on the Waymo Dataset. These agents are built through scaling self-play.
February 28, 2025 at 5:19 PM
GPUDrive got accepted to ICLR 2025!

With that, we release GPUDrive v0.4.0! 🚨 You can now install the repo and run your first fast PPO experiment in under 10 minutes.

I’m honestly so excited about the new opportunities and research the sim makes possible. 🚀 1/2
February 20, 2025 at 6:53 PM