Lightnews — Scholar-powered news

David Snyder

@dasny25.bsky.social

5 followers 3 following 14 posts

PhD Student in the IRoM Lab at Princeton University, working on safety and generalization assurances for robots.

David Snyder

@dasny25.bsky.social

(10/13) STEP constructs decision rules by solving an offline convex optimization problem, which yields near-optimal multidimensional decision boundaries for Nmax up to ~500-1000. During evaluation, STEP can be used almost like a look-up table!

May 9, 2025 at 8:01 PM

David Snyder

@dasny25.bsky.social

(9/13) Why Nmax?

Policy evaluation is expensive, due to limited hardware availability and limited resources for human supervision. STEP near-optimally accounts for this practical constraint, and gives the evaluator significant leeway to set a conservative Nmax.

May 9, 2025 at 7:58 PM

David Snyder

@dasny25.bsky.social

(6/13) Yes!

We propose STEP, a sequential test which aggregates evaluation rollouts one-by-one and stops automatically when a desired significance level is reached. It stops quickly when the performance gap is large, and waits if the gap is small.

May 9, 2025 at 7:55 PM
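
The stop-early behaviour described in (6/13) can be illustrated with a much simpler classical procedure. The sketch below is not STEP: it is a truncated Wald sequential probability ratio test (SPRT) on paired rollouts, shown only to make the idea of "stop quickly when the gap is large, wait when it is small" concrete. The function name `sequential_compare`, the alternative `p1`, and the default values of `alpha`, `beta`, and `n_max` are all assumptions made for illustration and do not come from the thread or the paper.

```python
import math


def sequential_compare(outcomes, alpha=0.05, beta=0.05, p1=0.75, n_max=200):
    """Truncated Wald SPRT on paired rollouts (illustrative; NOT the STEP method).

    outcomes: iterable of (a_success, b_success) booleans, one pair per rollout.
    H0: among discordant pairs, policy A wins with probability 0.5.
    H1: among discordant pairs, policy A wins with probability p1 (assumed).
    Returns (decision, number_of_pairs_used).
    """
    upper = math.log((1 - beta) / alpha)   # crossing this rejects H0
    lower = math.log(beta / (1 - alpha))   # crossing this accepts H0
    llr = 0.0                              # running log-likelihood ratio
    used = 0
    for a_ok, b_ok in outcomes:
        if used >= n_max:                  # evaluation budget exhausted
            break
        used += 1
        if a_ok != b_ok:                   # ties carry no information here
            p = p1 if a_ok else 1 - p1
            llr += math.log(p / 0.5)
        if llr >= upper:
            return "A better", used
        if llr <= lower:
            return "no evidence A better", used
    return "inconclusive", used


# A large gap stops after a handful of pairs; ties never trigger a stop.
print(sequential_compare([(True, False)] * 50))
print(sequential_compare([(True, True)] * 50, n_max=10))
```

With these assumed settings, a stream in which A always wins stops well before the budget, while a stream of ties runs until `n_max` and is reported as inconclusive. Per the thread, STEP replaces fixed thresholds like these with decision boundaries computed offline for the chosen Nmax.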

David Snyder

@dasny25.bsky.social

(1/13) How should we rigorously compare robot policies? Comparison is central to robotics research, but is inherently expensive. We introduce STEP, a flexible and data-efficient method for statistically rigorous policy comparison.
Accepted at RSS 2025: tri-ml.github.io/step/

May 9, 2025 at 7:49 PM
