William Xie
@wxie.bsky.social
phd student at cu boulder (williamxie.nyc)
contact-rich manipulation
cam showing up would've been nice but not sure that would've moved the needle
October 24, 2025 at 4:58 AM
unexpected brickage from jokic ruining the third, true, and only MPJ we ever needed
October 24, 2025 at 4:57 AM
and then to self-plug a bit, i wrote up a little case study/position paper on dual-use in VLM reasoning and robot manipulation last month: arxiv.org/abs/2505.18792. the gist is that safeguarding reduces both helpful and harmful robot control, and i opine about what that means for future model eval/dev
On the Dual-Use Dilemma in Physical Reasoning and Force
Humans learn how and when to apply forces in the world via a complex physiological and psychological learning process. Attempting to replicate this in vision-language models (VLMs) presents two challe...
arxiv.org
June 4, 2025 at 5:24 AM
seemingly a hot topic rn on my feed but i'm not sure how much more humanity can lower the floor on mass autonomous death, whereas taking care of a human is still wildly inefficient. that is to say, on the level of the individual AI researcher, there's a lot more unrealized help we can do than harm
June 4, 2025 at 5:20 AM
i have two somewhat distinct and existential concerns: 1) that we are guileless researchers operating in aggregate as an arm of the MIC and 2) that our research is directly extensible (within a few degrees) to dual-use. tbh i think many overstate 2) and others are helpless wrt 1) due to "incentives"
June 4, 2025 at 5:13 AM
but I think that's a good feeling to lean into
May 12, 2025 at 10:24 PM
yeah, I think you're right on both points. I got in the weeds on haptic teleop interfaces for LFD recently and overall am not super convinced it'll enable the data scale we need. Way more interested in self-improvement from physical interaction (w/ touch) though I feel quite out-of-depth there
May 12, 2025 at 10:24 PM
and w/o touch
May 12, 2025 at 5:52 PM
I'm generally a believer that we'll eventually do everything with vision, but I also believe that we'll need touch to get policies running closer to real-time/humans. My advisor loves to bring up these videos from a study of people trying to strike a match w/ and w/o feeling in their fingers:
May 12, 2025 at 5:50 PM
Liebherr
April 22, 2025 at 4:15 PM
going forward, i'm thinking about how we can scale good data collection with force control and improved physical models & reasoning. as of now you cannot convince me that we do not still need huge amounts of real robot data for robust contact-rich manipulation. and we are quite a ways off...
April 19, 2025 at 11:42 PM
so interesting where the field has coalesced and where it has diverged. some of it is a necessary byproduct of manipulation, some of it seems like open areas for research. anyway, here's a fun and unreadable plot: these 25 papers evaluate 64 significantly different contact-rich tasks (59 models)
April 19, 2025 at 11:36 PM
true, but humans learn implicit control laws, however relative they may be, from rich sensory information over many, many episodes. for robots, high-precision servos are just one tool to obtain such high-fidelity data. i also think such tooling is important for achieving supra-human abilities.
March 28, 2025 at 7:01 PM
I think the RL policy / teleop comparison here is not quite fair--the RL policy leverages wrench data, which is the primary supervisory signal for these kinds of insertion tasks (learning visuo-force servoing), whereas the teleop here uses a 3D CAD mouse--a huge embodiment gap in data collection
March 23, 2025 at 12:31 AM
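to make the wrench point concrete, here's a toy sketch of force-driven insertion--none of this is from the paper in question; the gains and the robot hooks (read_wrench, send_ee_velocity) are made-up placeholders, just to show why sensed forces are such a direct supervisory signal for this task:

```python
# Toy compliant-insertion loop: measured force/torque (wrench) at the wrist
# drives lateral corrections while the arm presses along the insertion axis.
# All names and gains are hypothetical, not a real robot API.
import numpy as np

KP_LATERAL = 0.002   # m/s per N: lateral compliance gain (made up)
INSERT_SPEED = 0.01  # m/s: downward insertion speed (made up)
F_CONTACT = 2.0      # N: contact-detection threshold (made up)

def insertion_step(read_wrench, send_ee_velocity):
    f = read_wrench()              # (fx, fy, fz, tx, ty, tz) at the wrist
    v = np.zeros(3)
    v[2] = -INSERT_SPEED           # keep pressing along the insertion axis
    if abs(f[2]) > F_CONTACT:      # in contact: comply with lateral forces
        v[0] = -KP_LATERAL * f[0]  # slide to reduce sensed side loads,
        v[1] = -KP_LATERAL * f[1]  # i.e. drift toward the hole axis
    send_ee_velocity(v)
```

a teleoperator with a 3D CAD mouse never sees f at all, which is the embodiment gap i mean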
adapted a preexisting repo for the DROID dataset + Franka robot to work with the UR5 and my small dataset: github.com/badinkajink/...
GitHub - badinkajink/rerun_rlds_ur5
github.com
February 28, 2025 at 7:36 PM
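for the curious, the core loop looks roughly like this--a minimal sketch, not the repo's actual code, assuming a recent-ish rerun SDK and an RLDS dataset on disk; the observation keys ("image", "joint_position") are guesses and will differ per dataset:

```python
# Stream one RLDS episode into the rerun viewer: images as an image stream,
# joint positions as scalar time series. Keys/paths are assumptions.
import rerun as rr
import tensorflow_datasets as tfds

rr.init("ur5_rlds_viewer", spawn=True)  # opens the rerun viewer

ds = tfds.builder_from_directory("path/to/rlds/dataset").as_dataset(split="train")
episode = next(iter(ds))

for t, step in enumerate(episode["steps"]):
    rr.set_time_sequence("step", t)  # index all logs by step number
    rr.log("camera/image", rr.Image(step["observation"]["image"].numpy()))
    for j, q in enumerate(step["observation"]["joint_position"].numpy()):
        rr.log(f"robot/joint_{j}", rr.Scalar(float(q)))
```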
Cool! I remember seeing a complementary approach for learning semantic placement (arxiv.org/abs/2401.07770) -- perhaps it can plug in for segmentation when VLMs cannot reasonably "point" to placement regions.
Seeing the Unseen: Visual Common Sense for Semantic Placement
Computer vision tasks typically involve describing what is present in an image (e.g. classification, detection, segmentation, and captioning). We study a visual common sense task that requires underst...
arxiv.org
February 25, 2025 at 1:59 AM