Xingyu Chen
@xingyu-chen.bsky.social
PhD Student at Westlake University, working on 3D & 4D Foundation Models.
https://rover-xingyu.github.io/
Reposted by Xingyu Chen
Personal programs for ICCV 2025 are now available at:
www.scholar-inbox.com/conference/i...
October 10, 2025 at 6:19 AM
Look, 4D foundation models know about humans – and we just read it out!
#Human3R: Everyone Everywhere All at Once

Just input an RGB video, and we reconstruct 4D humans and the scene online, in 𝗢𝗻𝗲 model and 𝗢𝗻𝗲 stage.

Training this versatile model is easier than you think – it just takes 𝗢𝗻𝗲 day using 𝗢𝗻𝗲 GPU!

🔗Page: fanegg.github.io/Human3R/
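For intuition, here is a toy sketch of what "𝗢𝗻𝗲 model, 𝗢𝗻𝗲 stage, online" could look like in code: a single network processes frames as they arrive, a recurrent state carries temporal context, and every forward pass emits both scene geometry and human parameters. All names and shapes below are illustrative, not Human3R's actual architecture (see the project page).

```python
import torch
import torch.nn as nn

class ToyHuman3R(nn.Module):
    """Stand-in for a unified online model: one forward pass per frame
    returns a scene pointmap and human (SMPL-like) parameters, while a
    recurrent state carries context. Names/shapes are illustrative."""
    def __init__(self, d=256):
        super().__init__()
        self.encode = nn.Conv2d(3, d, kernel_size=16, stride=16)
        self.gru = nn.GRUCell(d, d)
        self.scene_head = nn.Linear(d, 3)      # per-patch 3D point
        self.human_head = nn.Linear(d, 72)     # SMPL-like pose vector

    def forward(self, frame, state):
        tok = self.encode(frame).flatten(2).transpose(1, 2)  # (b, p, d) patch tokens
        state = self.gru(tok.mean(dim=1), state)             # online memory update
        ctx = tok + state.unsqueeze(1)                       # inject temporal context
        return self.scene_head(ctx), self.human_head(state), state

model, state = ToyHuman3R(), torch.zeros(1, 256)
for frame in torch.randn(8, 1, 3, 224, 224):   # 8 streamed RGB frames
    points, pose, state = model(frame, state)  # one model, one stage, per frame
```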
October 8, 2025 at 11:19 AM
Glad to be recognized as an outstanding reviewer!
There’s no conference without the efforts of our reviewers. Special shoutout to our #ICCV2025 outstanding reviewers 🫡

iccv.thecvf.com/Conferences/...
October 5, 2025 at 3:25 PM
#VGGT: accurate within short clips, but slow and prone to out-of-memory (OOM) errors.

#CUT3R: fast with constant memory usage, but forgets.

We revisit them from a Test-Time Training (TTT) perspective and propose #TTT3R to get all three: fast, accurate, and OOM-free.
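To make the TTT perspective concrete, here is a toy fast-weights update: the recurrent state acts as a constant-size associative memory that is trained online as frames stream in, with a per-token gate deciding how strongly each frame writes into it. The gating rule below is illustrative, not TTT3R's actual learning rate (see the paper for the real update).

```python
import torch

def gated_state_update(state, keys, values, queries):
    # state: (d, d) fast-weight associative memory carried across frames.
    pred = queries @ state                         # what the memory predicts
    residual = values - pred                       # what it currently gets wrong
    # Per-token gate in (0, 1): larger residual -> stronger write.
    # Illustrative gate, not TTT3R's derived learning rate.
    gate = torch.sigmoid(residual.norm(dim=-1, keepdim=True) - 1.0)
    # Outer-product write, as in fast-weight / linear-attention memories.
    delta = keys.transpose(0, 1) @ (gate * residual) / keys.shape[0]
    return state + delta

d = 64
state = torch.zeros(d, d)                          # memory size is constant
for _ in range(100):                               # stream frames: no OOM growth
    k = torch.randn(128, d)
    state = gated_state_update(state, k, torch.randn(128, d), k)
```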
October 1, 2025 at 3:24 PM
Let's keep revisiting 3D reconstruction!
#TTT3R: 3D Reconstruction as Test-Time Training
TTT3R offers a simple state update rule to enhance length generalization for #CUT3R — no fine-tuning required!
🔗Page: rover-xingyu.github.io/TTT3R
We rebuilt @taylorswift13’s "22" live at the 2013 Billboard Music Awards - in 3D!
October 1, 2025 at 7:20 AM
Reposted by Xingyu Chen
Excited to introduce LoftUp!

A stronger-than-ever, lightweight feature upsampler for vision encoders that can boost performance on dense prediction tasks by 20%–100%!

Easy to plug into models like DINOv2, CLIP, SigLIP — simple design, big gains. Try it out!

github.com/andrehuang/l...
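As a rough picture of where such an upsampler sits, the hypothetical wiring below lifts a frozen ViT's coarse patch features to pixel rate before a dense prediction head. The stand-in module is not LoftUp's actual architecture or API; that lives in the linked repo.

```python
import torch
import torch.nn as nn

feats = torch.randn(1, 768, 16, 16)                # e.g. frozen DINOv2 patch features
upsampler = nn.Sequential(                         # stand-in learned module
    nn.Upsample(scale_factor=14, mode="bilinear"), # to pixel rate (224x224)
    nn.Conv2d(768, 768, kernel_size=3, padding=1), # refine beyond bilinear
)
dense_feats = upsampler(feats)                     # (1, 768, 224, 224) for a dense head
```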
April 22, 2025 at 7:55 AM
If you're a researcher and haven't tried it yet, please give it a try! It took me a while to adjust, but now it's my favorite tool. You can read, bookmark, and organize papers, and get recommendations based on your interests!
🚀 Never miss a beat in science again!

📬 Scholar Inbox is your personal assistant for staying up to date with your literature. It includes visual summaries, collections, search, and a conference planner.

Check out our white paper: arxiv.org/abs/2504.08385
#OpenScience #AI #RecommenderSystems
April 15, 2025 at 5:37 AM
Reposted by Xingyu Chen
𝗘𝗮𝘀𝗶𝟯𝗥: 𝗘𝘀𝘁𝗶𝗺𝗮𝘁𝗶𝗻𝗴 𝗗𝗶𝘀𝗲𝗻𝘁𝗮𝗻𝗴𝗹𝗲𝗱 𝗠𝗼𝘁𝗶𝗼𝗻 𝗳𝗿𝗼𝗺 𝗗𝗨𝗦𝘁𝟯𝗥 𝗪𝗶𝘁𝗵𝗼𝘂𝘁 𝗧𝗿𝗮𝗶𝗻𝗶𝗻𝗴
Xingyu Chen, Yue Chen, Yuliang Xiu ... Anpei Chen
arxiv.org/abs/2503.24391
Trending on www.scholar-inbox.com
April 2, 2025 at 10:11 AM
Reposted by Xingyu Chen
I was really surprised when I saw this. DUSt3R has learned to segment objects remarkably well without supervision. This knowledge can be extracted post-hoc, enabling accurate 4D reconstruction instantly.
🦣Easi3R: 4D Reconstruction Without Training!

Limited 4D datasets? Take it easy.

#Easi3R adapts #DUSt3R for 4D reconstruction by disentangling and repurposing its attention maps → making 4D reconstruction easier than ever!

🔗Page: easi3r.github.io
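A minimal sketch of the attention-readout idea, with illustrative shapes and threshold (the actual method aggregates several attention cues): pixels whose cross-frame attention is diffuse get flagged as dynamic, since they violate the static-scene assumption the model was trained under.

```python
import torch

def dynamic_mask_from_attention(attn_maps, tau=0.5):
    # attn_maps: (layers, heads, h*w, n_ref) cross-attention weights.
    # Static pixels find a sharp match in the reference frame (peaked
    # attention); dynamic pixels attend diffusely. tau is illustrative.
    peak = attn_maps.mean(dim=(0, 1)).amax(dim=-1)           # (h*w,) peak attention
    peak = (peak - peak.min()) / (peak.max() - peak.min() + 1e-8)
    return peak < tau                                        # True = dynamic pixel

attn = torch.rand(12, 8, 196, 196).softmax(dim=-1)           # fake attention weights
mask = dynamic_mask_from_attention(attn)                     # (196,) boolean mask
```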
April 1, 2025 at 6:45 PM
Reposted by Xingyu Chen
How much 3D do visual foundation models (VFMs) know?

Previous work requires 3D data for probing → expensive to collect!

#Feat2GS @cvprconference.bsky.social 2025 - our idea is to read out 3D Gaussians from VFM features, thus probing 3D with novel view synthesis.

🔗Page: fanegg.github.io/Feat2GS
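A toy version of the probing recipe: freeze the VFM, attach a lightweight head that reads out per-pixel 3D Gaussian parameters, and let novel view synthesis quality score how much 3D the features encode. Layer sizes and the parameter split below are illustrative, not Feat2GS's actual readout.

```python
import torch
import torch.nn as nn

class GaussianReadout(nn.Module):
    """Toy probe: map frozen VFM features to per-pixel 3D Gaussians.
    The 14-dim split (center/scale/rotation/opacity/color) is illustrative."""
    def __init__(self, feat_dim=768):
        super().__init__()
        self.head = nn.Linear(feat_dim, 14)      # the only trainable part

    def forward(self, feats):                    # feats: (h*w, feat_dim), frozen
        p = self.head(feats)
        return {
            "centers": p[:, :3],                                    # geometry
            "scales": p[:, 3:6].exp(),                              # positive scales
            "rotations": nn.functional.normalize(p[:, 6:10], dim=-1),
            "opacity": p[:, 10:11].sigmoid(),
            "colors": p[:, 11:14].sigmoid(),                        # texture
        }

feats = torch.randn(196, 768)                    # e.g. frozen DINOv2 patch features
gaussians = GaussianReadout()(feats)             # rasterize, then score on held-out views
```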
March 31, 2025 at 4:06 PM