Vincent Leroy
vincentleroy.bsky.social
GEODE Team Lead (Geometric Deep Learning)
3D vision researcher @NaverLabsEurope
And if you opened an issue in DUSt3R like

modalities = ['depth', 'known_focal', 'known_relative_pose'] * 1000
for modality in modalities:
    print(f"How can I input {modality} in DUSt3R if I know it?")

You might be interested in POW3R
Friday morning, ExHall D Poster #84
June 12, 2025 at 1:41 PM
I don't know whether I'm more amazed by the drone hitting tree branches and recovering instead of crashing or MUSt3R following it as if nothing special happened...
June 12, 2025 at 12:34 PM
Funnily enough, this means multi-agent RGB SLAM comes for free! No prior such as motion smoothness or temporal neighborhood was used, so the network is robust to sudden changes in position, frame drops, kidnapped robots, crazy shakes and blur, and can handle multiple agents building a common memory map.
June 12, 2025 at 12:33 PM
RGB-SLAM becomes a single forward pass through the network with no optimization needed. We still need to decide which frames to keep in memory (AKA keyframes) to keep costs from exploding. We simply check the 3D overlap between views and memory, ensuring new keyframes discover enough novel content.
June 12, 2025 at 12:32 PM
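To make the keyframe check above concrete, here is a minimal sketch of one way to measure 3D overlap between a new view and the memory. It is an illustration under stated assumptions, not the MUSt3R implementation: the voxel-grid overlap measure, the names `voxel_keys` and `is_new_keyframe`, and the `voxel_size` and `min_novel_ratio` parameters are all hypothetical.

```python
import math


def voxel_keys(points, voxel_size):
    """Map 3D points to the set of integer voxel indices they occupy."""
    return {
        tuple(math.floor(c / voxel_size) for c in p)
        for p in points
    }


def is_new_keyframe(points, memory_voxels, voxel_size=0.1, min_novel_ratio=0.3):
    """Return True if enough of the view's voxels are absent from memory.

    Assumption: overlap is approximated by shared occupied voxels; the
    threshold value is arbitrary and chosen for illustration only.
    """
    voxels = voxel_keys(points, voxel_size)
    if not voxels:
        return False  # an empty view contributes nothing new
    novel_ratio = len(voxels - memory_voxels) / len(voxels)
    return novel_ratio >= min_novel_ratio


# Usage: the first frame is always novel, a repeat of it is not.
memory = set()
frame = [(0.5, 0.5, 0.5), (1.5, 0.5, 0.5)]
if is_new_keyframe(frame, memory, voxel_size=1.0):
    memory |= voxel_keys(frame, 1.0)
```

A set of voxel indices keeps the check cheap and independent of frame order, which fits the "no temporal prior" point made in the thread.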
As in DUSt3R, the first frame defines the origin and all other pointmaps are expressed in it. Plus, each frame has its own local prediction, so recovering the camera pose is just a rigid alignment. For SfM we order image sets with IR, then MUSt3R FF does its thing, making SfM much faster and simpler!
June 12, 2025 at 12:31 PM
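The rigid alignment mentioned in the post above can be sketched with the standard Kabsch (orthogonal Procrustes) solution. This is a generic illustration, assuming the local and global pointmaps are flattened to matching N×3 arrays; the function name `rigid_align` is hypothetical, and the actual MUSt3R code may weight points or handle outliers differently.

```python
import numpy as np


def rigid_align(local_pts, global_pts):
    """Least-squares R, t such that global_pts ~= local_pts @ R.T + t.

    local_pts:  (N, 3) points in the camera's own frame.
    global_pts: (N, 3) the same points in the first frame's coordinates.
    Returns the camera-to-world rotation R (3x3) and translation t (3,).
    """
    local_pts = np.asarray(local_pts, dtype=float).reshape(-1, 3)
    global_pts = np.asarray(global_pts, dtype=float).reshape(-1, 3)

    mu_l = local_pts.mean(axis=0)
    mu_g = global_pts.mean(axis=0)

    # 3x3 cross-covariance between the centered point sets.
    H = (local_pts - mu_l).T @ (global_pts - mu_g)
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_g - R @ mu_l
    return R, t
```

Given the two pointmaps of one frame, `R, t = rigid_align(local, global_)` yields that frame's pose relative to the first camera, with no iterative optimization involved.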