Daphne Cornelisse
daphne-cornelisse.bsky.social
PhD student at NYU | Building human-like agents | https://www.daphne-cornelisse.com/
This was joint work with Aarav Pandya, Kevin Joseph, Joseph Suárez, and @eugenevinitsky.bsky.social
February 28, 2025 at 5:19 PM
Results (2): Beyond in-distribution generalization, our agents show partial robustness to scenarios that rarely occur in the data.

More importantly, results show that agents can be fine-tuned in minutes to reach near-perfect performance in such cases.
February 28, 2025 at 5:19 PM
Results (1): Self-play scales well with data. With 10,000 training scenarios, the model nearly reaches the ceiling of our benchmark, achieving a 99.81% goal-reaching rate, 0.44% collision rate, and 0.31% off-road rate on 10,000 held-out test scenarios.
February 28, 2025 at 5:19 PM
We train sim agents using self-play PPO on 10K+ scenarios from the Waymo Open Dataset in GPUDrive, under a semi-realistic framework for human perception and control.

Agents learn goal-directed behavior, avoiding collisions and staying on the road.
February 28, 2025 at 5:19 PM
SOTA generative models trained on large human datasets show unintended behaviors like crashes (5-6%) and off-road events (6-12%) in benchmarks for nominal driving.

Unpredictable deviations make it hard to separate signal from noise.
February 28, 2025 at 5:19 PM
Challenge accepted
February 26, 2025 at 6:13 PM
Oh and, stay tuned for another big release tomorrow!
February 20, 2025 at 6:53 PM
Huge thanks to my incredible collaborators for making this possible: Saman Kazemkhani, Aarav Pandya, @eugenevinitsky.bsky.social, Joseph Suárez for converting the sim to a package and optimizing the PPO loop, and Kevin Joseph for all his help with data processing, tutorials, and more! 😊
February 20, 2025 at 6:53 PM