Julien Gaubil
@jgaubil.bsky.social
PhD student at École Polytechnique
Interested in Computer Vision, Geometry, and learning both at the same time

https://www.jgaubil.com/
fair enough :)
December 1, 2025 at 1:42 PM
(this can be fixed by opening the PDF in Illustrator and saving it again, for anyone having the same problem)
December 1, 2025 at 1:12 PM
I had a similar problem when exporting figures created in Figma to PDF. It's likely due to their export (Figma's is notoriously bad), and not Safari's PDF reader per se
December 1, 2025 at 1:10 PM
ah yes, I see!

We definitely tried to see whether the operations implemented by the layers followed known algorithms. A least-squares optimisation like in your paper was a good candidate, given how often Procrustes problems show up in 3D vision - but alas we couldn't identify one
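For context, this is the kind of closed-form least-squares solve we were looking for - the classic SVD solution to the orthogonal Procrustes problem, sketched in NumPy (an illustration of the candidate algorithm, not code from the model or the paper):

```python
import numpy as np

def procrustes_rotation(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Least-squares rotation R minimizing ||Xc @ R.T - Yc||_F (Kabsch)."""
    Xc = X - X.mean(axis=0)              # center both point sets
    Yc = Y - Y.mean(axis=0)
    U, _, Vt = np.linalg.svd(Xc.T @ Yc)  # SVD of the 3x3 cross-covariance
    d = np.sign(np.linalg.det(U @ Vt))   # guard against reflections
    return (U @ np.diag([1.0, 1.0, d]) @ Vt).T

# sanity check: recover a known rotation from paired points
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
t = 0.3
R_true = np.array([[np.cos(t), -np.sin(t), 0.0],
                   [np.sin(t),  np.cos(t), 0.0],
                   [0.0,        0.0,       1.0]])
assert np.allclose(procrustes_rotation(X, X @ R_true.T), R_true)
```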
November 4, 2025 at 10:00 PM
Thanks for sharing!

Is this internal iterative refinement a known phenomenon in 3D networks, or are you referring to a specific architecture?
November 4, 2025 at 7:43 PM
This was a cool project done jointly with the great Michal Stary, under the amazing supervision of @ayusht.bsky.social and @vincentsitzmann.bsky.social at MIT! [8/8]
November 4, 2025 at 7:40 PM
We presented this at the End-to-End 3D Learning Workshop at ICCV 2025, and hope it inspires more work on understanding large reconstruction models!

We’re working on a clean version of the code, and we’ll release it once yours truly is done with the CVPR deadline [7/8]
November 4, 2025 at 7:40 PM
We also find that the decoder turns 𝐬𝐞𝐦𝐚𝐧𝐭𝐢𝐜 correspondences into 𝐠𝐞𝐨𝐦𝐞𝐭𝐫𝐢𝐜 𝐜𝐨𝐫𝐫𝐞𝐬𝐩𝐨𝐧𝐝𝐞𝐧𝐜𝐞𝐬.

We identified attention heads specialized in finding correspondences across views.

We can clearly see the geometric refinement on this difficult image pair by visualizing its cross-attention maps! [6/8]
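For anyone curious how such maps can be extracted, here is a minimal PyTorch sketch using a forward hook. The module path (dec_blocks2[6].cross_attn) and the assumption that the attention module returns its weights are placeholders - DUSt3R's actual attention layers may need a small patch to expose the weights:

```python
import torch

attn_maps = {}

def save_attn(name):
    def hook(module, inputs, output):
        # assumed: output = (values, attn_weights), with attn_weights of
        # shape (batch, heads, n_queries, n_keys)
        attn_maps[name] = output[1].detach().cpu()
    return hook

# hypothetical path to one cross-attention layer of the second-view decoder
handle = model.dec_blocks2[6].cross_attn.register_forward_hook(
    save_attn("dec2.block6.cross")
)
with torch.no_grad():
    model(view1, view2)   # one forward pass over the image pair
handle.remove()

# reshaping the key axis back to the other view's patch grid turns each
# query's attention row into a correspondence heatmap over that view
```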
November 4, 2025 at 7:40 PM
Surprisingly, 𝐚𝐥𝐦𝐨𝐬𝐭 𝐚𝐥𝐥 𝐨𝐟 𝐭𝐡𝐞 𝐢𝐦𝐩𝐫𝐨𝐯𝐞𝐦𝐞𝐧𝐭 𝐢𝐬 𝐝𝐮𝐞 𝐭𝐨 𝐬𝐞𝐥𝐟-𝐚𝐭𝐭𝐞𝐧𝐭𝐢𝐨𝐧 𝐥𝐚𝐲𝐞𝐫𝐬!

Nevertheless, this doesn’t mean cross-attention layers are useless - without them, there would be no communication between views.

This instead suggests that cross- and self-attention layers play very different roles [5/8]
November 4, 2025 at 7:40 PM
Can we dive deeper into the network? Yes!

We can observe the impact of each layer on the iterative reconstruction process by comparing the pointmap error before and after the layer.

Here, we plot the error difference for every layer of DUSt3R’s second-view decoder [4/8]
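A minimal sketch of how such a per-layer curve can be computed, assuming trained probes (see the probing post below in this thread). dec_blocks2 stands in for DUSt3R's second-view decoder blocks, and the error metric is a simplified stand-in for our actual one:

```python
import torch

@torch.no_grad()
def collect_layer_outputs(model, views):
    """Capture the token sequence after every second-view decoder block."""
    feats, handles = [], []
    for block in model.dec_blocks2:   # assumed attribute name
        handles.append(block.register_forward_hook(
            lambda m, i, o: feats.append(o[0] if isinstance(o, tuple) else o)
        ))
    model(*views)
    for h in handles:
        h.remove()
    return feats

@torch.no_grad()
def layer_error_deltas(model, probes, views, gt_pointmap):
    """Per-block change in pointmap error; delta < 0 means the block helped."""
    feats = collect_layer_outputs(model, views)
    errors = [(probe(f) - gt_pointmap).norm(dim=-1).mean().item()
              for probe, f in zip(probes, feats)]
    return [after - before for before, after in zip(errors, errors[1:])]
```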
November 4, 2025 at 7:40 PM
We observe that 𝐫𝐞𝐜𝐨𝐧𝐬𝐭𝐫𝐮𝐜𝐭𝐢𝐨𝐧 𝐢𝐬 𝐚𝐧 𝐢𝐭𝐞𝐫𝐚𝐭𝐢𝐯𝐞 𝐩𝐫𝐨𝐜𝐞𝐬𝐬, with decoder blocks progressively refining the pointmaps.

For easy image pairs, a good estimate of the relative position emerges early in the decoder, whereas harder pairs require more decoder blocks, sometimes even failing to converge [3/8]
November 4, 2025 at 7:40 PM
To open up DUSt3R, we train individual MLP probes on intermediate layers of an early checkpoint, using the same pointmap objective.

We can then analyze its inference through the sequence of reconstructions - see below! [2/8]
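A minimal sketch of what the probe training can look like, assuming a frozen backbone and a simplified regression loss (the real objective follows DUSt3R's confidence-weighted pointmap loss). loader, model, layer_idx and the feature dimension are placeholders, and collect_layer_outputs is the hypothetical hook-based helper from the sketch above:

```python
import torch
import torch.nn as nn

class PointmapProbe(nn.Module):
    """Small MLP mapping one layer's tokens to per-token 3D points."""
    def __init__(self, dim: int, hidden: int = 512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, 3)
        )

    def forward(self, tokens):   # (B, N, dim) -> (B, N, 3)
        return self.mlp(tokens)

probe = PointmapProbe(dim=768)
opt = torch.optim.AdamW(probe.parameters(), lr=1e-4)

for views, gt_points in loader:          # backbone stays frozen throughout
    with torch.no_grad():
        tokens = collect_layer_outputs(model, views)[layer_idx]
    loss = (probe(tokens) - gt_points).norm(dim=-1).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```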
November 4, 2025 at 7:40 PM
DUSt3R et al. are impressive, but how do they actually work? We investigate this in our project 𝘜𝘯𝘥𝘦𝘳𝘴𝘵𝘢𝘯𝘥𝘪𝘯𝘨 𝘔𝘶𝘭𝘵𝘪-𝘝𝘪𝘦𝘸 𝘛𝘳𝘢𝘯𝘴𝘧𝘰𝘳𝘮𝘦𝘳𝘴!

We share findings on the iterative nature of reconstruction, the roles of cross and self-attention, and the emergence of correspondences across the network [1/8] ⬇️
𝗨𝗻𝗱𝗲𝗿𝘀𝘁𝗮𝗻𝗱𝗶𝗻𝗴 𝗠𝘂𝗹𝘁𝗶-𝗩𝗶𝗲𝘄 𝗧𝗿𝗮𝗻𝘀𝗳𝗼𝗿𝗺𝗲𝗿𝘀
Michal Stary, Julien Gaubil, Ayush Tewari, Vincent Sitzmann
arxiv.org/abs/2510.24907
Trending on www.scholar-inbox.com
November 4, 2025 at 7:40 PM
Reposted by Julien Gaubil
Stary and Gaubil et al., "Understanding multi-view transformers"

We use DUSt3R as a black box. This work looks under the hood at what is going on. The internal representations seem to "iteratively" refine towards the final answer. Quite similar to what goes on in point cloud networks
October 30, 2025 at 9:00 PM
Reposted by Julien Gaubil
1/n🚀Gaussians > Differentiable function > Mesh?
Check out our new work: MILo: Mesh-In-the-Loop Gaussian Splatting!

🎉Accepted to SIGGRAPH Asia 2025 (TOG)
MILo is a novel differentiable framework that extracts meshes directly from Gaussian parameters during training.

🧵👇
September 8, 2025 at 11:35 AM
Where would understanding surface geometry (as in distances, curvatures, and so on) fit in this diagram?

I’d say it implies multi-view consistency of the geometry and would therefore add an arrow at the left of your chart. Do you agree, and if so, don’t you think we should start there?
August 13, 2025 at 11:51 AM
Reposted by Julien Gaubil
🚨🚨 WiGRAPH CONFERENCE COFFEE @ SIGGRAPH '25 🚨🚨

Sign up now to be randomly matched with peers for a SIGGRAPH conference coffee!
🎉☕ Announcing WiGRAPH Conference Coffees: SIGGRAPH 2025 Edition! ☕🎉

Are you a researcher of an underrepresented gender registered for SIGGRAPH? Do you want an opportunity to network with your peers? Learn more and sign up here:
www.wigraph.org/events/2025-...
July 10, 2025 at 4:39 PM
Reposted by Julien Gaubil
💻We've released the code for our #CVPR2025 paper MAtCha!

🍵MAtCha reconstructs sharp, accurate and scalable meshes of both foreground AND background from just a few unposed images (e.g. 3 to 10 images)...

...While also working with dense-view datasets (hundreds of images)!
April 3, 2025 at 10:33 AM
I think there is too much (good) content pouring into the Vision/3D communities daily to read it all thoroughly while preserving room for creativity. I believe what's important is to know 'what exists', without necessarily knowing all the details, to have an accurate picture of what remains to be done
Sometimes reading Hamming makes me sad, because I recognize myself in this quote.
February 25, 2025 at 5:31 PM