Xingyu Chen
xingyu-chen.bsky.social
PhD Student at Westlake University, working on 3D & 4D Foundation Models.
https://rover-xingyu.github.io/
Instead of updating all states uniformly, we incorporate image attention as per-token learning rates.

High-confidence matches receive larger updates, while low-confidence ones are suppressed.

This soft gating greatly extends the length generalization beyond the training context.
October 1, 2025 at 3:26 PM
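A minimal NumPy sketch of the soft gating described above, where attention confidence acts as a per-token learning rate (the function name, shapes, and the convex-blend form are illustrative assumptions, not the paper's exact update rule):

```python
import numpy as np

def gated_state_update(state, update, attn_conf):
    """Blend proposed updates into per-token states, using attention
    confidence as a per-token learning rate (soft gate).

    state:     (N, D) current token states
    update:    (N, D) proposed updates from the new observation
    attn_conf: (N,)   per-token attention confidence in [0, 1]
    """
    gate = attn_conf[:, None]  # (N, 1) per-token learning rate
    # high-confidence tokens move toward the update;
    # low-confidence tokens largely keep their old state
    return (1.0 - gate) * state + gate * update

# toy example: three tokens with confidences 1.0, 0.5, 0.0
state = np.zeros((3, 2))
update = np.ones((3, 2))
conf = np.array([1.0, 0.5, 0.0])
new_state = gated_state_update(state, update, conf)
```

Because each token's update magnitude is scaled independently, unreliable tokens decay toward no-op updates instead of corrupting the state, which is what lets the recurrence run far past the training context length.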
DUSt3R was never trained to do dynamic segmentation with GT masks, right? It was just trained to regress point maps on 3D datasets—yet dynamic awareness emerged, making DUSt3R a zero-shot 4D estimator!😀
April 2, 2025 at 7:59 AM
With our estimated segmentation masks, we perform a second inference pass by re-weighting the attention, enabling robust 4D reconstruction and even outperforming SOTA methods trained on 4D datasets, with almost no extra cost compared to vanilla DUSt3R.
April 1, 2025 at 3:25 PM
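One way to picture the second-pass re-weighting: down-weight attention toward key tokens that the estimated masks flag as dynamic, so static geometry dominates the matching. This NumPy sketch is a simplified single-head attention with a hypothetical `suppress` multiplier; the actual re-weighting scheme in the model may differ:

```python
import numpy as np

def reweighted_attention(q, k, v, dyn_mask, suppress=0.0):
    """Attention pass that down-weights keys in dynamic regions.

    q, k, v:  (N, D) query / key / value tokens
    dyn_mask: (N,)   1.0 where a key token was segmented as dynamic
    suppress: multiplier applied to attention toward dynamic keys
    """
    scores = q @ k.T / np.sqrt(q.shape[1])                 # (N, N)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    # re-weight: suppress attention flowing toward dynamic tokens
    weights = weights * np.where(dyn_mask[None, :] == 1.0, suppress, 1.0)
    weights = weights / weights.sum(axis=1, keepdims=True)  # renormalize
    return weights @ v
```

Since this only rescales existing attention weights during a second forward pass, it adds essentially no cost over running the frozen network twice.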
We propose an attention-guided strategy to decompose dynamic objects from the static background, enabling robust dynamic object segmentation. It outperforms optical-flow-guided segmentation methods such as MonST3R, as well as models trained on dynamic mask labels such as DAS3R.
April 1, 2025 at 3:24 PM
💡Humans naturally separate ego-motion from object-motion without dynamic labels. We observe that #DUSt3R has implicitly learned a similar mechanism, reflected in its attention layers.
April 1, 2025 at 3:23 PM