Yixin Chen
@yixinchen.bsky.social
Research Scientist at BIGAI, 3D Vision, prev @UCLA, @MPI_IS, @Amazon, https://yixchen.github.io
We hope this provides some insight into how to design diffusion-based NVS methods to improve their consistency and plausibility!

🧩💻🗂️ All code, data, & checkpoints are released!
🔗 Learn more: jason-aplp.github.io/MOVIS/ (6/6)
April 1, 2025 at 1:46 AM
📊 We also visualize the sampling process of:

🔹 Ours (with biased timestep scheduler) ✅

🔹 Zero123 (without it) ❌

Our approach shows more precise location prediction in earlier stages & finer detail refinement in later stages! 🎯✨ (5/6)
April 1, 2025 at 1:45 AM
💡 Key insight in MOVIS: A biased noise timestep scheduler for diffusion-based novel view synthesizers that prioritizes larger timesteps early in training and gradually decreases them over time. This improves novel view synthesis in multi-object scenes! 🎯🔥 (4/6)
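For intuition, here is a minimal PyTorch sketch of such a biased timestep sampler, where the sampling mean shifts from large to small timesteps as training progresses; the Gaussian shape and the decay constants are assumptions for illustration, not the exact MOVIS scheduler.

```python
import torch

def biased_timesteps(batch_size, train_step, total_steps, T=1000,
                     hi=0.9, lo=0.3, std=0.15):
    """Sample diffusion timesteps biased toward large t early in training.

    The sampling mean decays linearly from hi*T to lo*T as training
    progresses; samples are clamped to [0, T-1]. (Illustrative sketch only;
    the decay rule and Gaussian shape are assumptions, not MOVIS's scheduler.)
    """
    progress = min(train_step / total_steps, 1.0)
    mean = (hi + (lo - hi) * progress) * T           # shifts from 0.9*T down to 0.3*T
    t = torch.randn(batch_size) * (std * T) + mean   # Gaussian around the shifting mean
    return t.round().clamp(0, T - 1).long()

# Early training favors large, noisy timesteps (global layout);
# late training favors small timesteps (appearance details).
print(biased_timesteps(4, train_step=0, total_steps=100_000))
print(biased_timesteps(4, train_step=100_000, total_steps=100_000))
```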
April 1, 2025 at 1:45 AM
🔍 We analyze the sampling process of diffusion-based novel view synthesizers and find:
📌 Larger timesteps → Focus on position & orientation recovery
📌 Smaller timesteps → Refine geometry & appearance

👇 We visualize the sampling process below! (3/6)
April 1, 2025 at 1:44 AM
In MOVIS, we enhance diffusion-based novel view synthesis with:
🔍 Additional structural inputs (depth & mask)
🖌️ Novel-view mask prediction as an auxiliary task
🎯 A biased noise scheduler to facilitate training
We identify the following key insight: (2/6)
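As a toy illustration of the first two ingredients above (channel-wise structural conditioning and an auxiliary novel-view mask head), here is a minimal sketch; the channel counts and the shared-backbone layout are assumptions, not the actual MOVIS architecture.

```python
import torch
import torch.nn as nn

class StructureConditionedUNet(nn.Module):
    """Toy sketch: condition a denoiser on depth & mask by channel-wise
    concatenation, and predict a novel-view mask as an auxiliary output.
    (Channel counts and layout are illustrative assumptions.)"""

    def __init__(self, latent_ch=4, cond_ch=2, hidden=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(latent_ch + cond_ch, hidden, 3, padding=1), nn.SiLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.SiLU(),
        )
        self.noise_head = nn.Conv2d(hidden, latent_ch, 3, padding=1)  # denoising target
        self.mask_head = nn.Conv2d(hidden, 1, 3, padding=1)           # auxiliary novel-view mask

    def forward(self, noisy_latent, depth, mask):
        x = torch.cat([noisy_latent, depth, mask], dim=1)  # structural conditioning
        h = self.backbone(x)
        return self.noise_head(h), self.mask_head(h)

# Usage with dummy tensors:
z = torch.randn(1, 4, 32, 32)   # noisy latent for the target view
d = torch.rand(1, 1, 32, 32)    # depth conditioning
m = torch.rand(1, 1, 32, 32)    # object-mask conditioning
noise_pred, mask_pred = StructureConditionedUNet()(z, d, m)
```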
April 1, 2025 at 1:43 AM
This work continues our line of research on reconstruction and scene understanding, including SSR (dali-jack.github.io/SSR/), PhyScene (physcene.github.io), PhyRecon (phyrecon.github.io), ArtGS (articulate-gs.github.io), and more to come soon! 🙌🙌 (n/n)
March 21, 2025 at 9:52 AM
Even more!

Our model generalizes to in-the-wild scenes like YouTube videos🎥🌍! Using just *15 input views*, we achieve high-quality reconstructions with detailed geometry & appearance. 🌟 Watch the demo to see it in action! 👇 (5/n)
March 21, 2025 at 9:52 AM
🏆 On datasets like Replica and ScanNet++, our model produces higher-quality reconstructions compared to baselines, including better accuracy in less-captured areas, more precise object structures, smoother backgrounds, and fewer floating artifacts. 👀 (4/n)
March 21, 2025 at 9:51 AM
🎥✨ Our method excels in large, heavily occluded scenes: using just 10 views, it outperforms baselines that require 100. The reconstructed scene supports interactive text-based editing, and its decomposed object meshes enable photorealistic VFX edits.👇 (3/n)
March 21, 2025 at 9:50 AM
🛠️ Our method combines decompositional neural reconstruction with a diffusion prior, filling in missing information in less-observed and occluded regions. The reconstruction (rendering loss) and generative (SDS loss) guidance are balanced by our visibility-guided modeling. (2/n)
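Schematically, the balance could look like the sketch below, where a per-pixel visibility weight trades off the rendering loss against the SDS loss; this weighting is an illustrative assumption, not the paper's exact formulation.

```python
import torch

def guided_loss(render_loss, sds_loss, visibility):
    """Blend reconstruction and generative guidance per pixel/region.

    visibility in [0, 1]: well-observed regions are driven mainly by the
    rendering loss, while occluded or rarely seen regions lean on the
    diffusion (SDS) prior. Purely illustrative weighting.
    """
    w = visibility.clamp(0.0, 1.0)
    return (w * render_loss + (1.0 - w) * sds_loss).mean()

# Example: a well-observed pixel vs. a heavily occluded one.
render = torch.tensor([0.2, 0.2])
sds = torch.tensor([0.8, 0.8])
vis = torch.tensor([0.95, 0.1])
print(guided_loss(render, sds, vis))
```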
March 21, 2025 at 9:48 AM
Checking the digest from scholar-inbox has become my daily routine. A real game-changer!👏👏👏
January 16, 2025 at 2:33 AM