Max Seitzer
maxseitzer.bsky.social
Research Scientist in the DINO team at Meta FAIR. Previously: PhD at the Max Planck Institute for Intelligent Systems, Tübingen. Representation learning, agents, structure.
Immensely proud to have been part of this project. Thank you to the team: @oriane_simeoni, @huyvvo, @baldassarrefe.bsky.social, Maxime Oquab, Cijo Jose, Vasil Khalidov, Marc Szafraniec, Seungeun Yi, Michael Ramamonjisoa, Francisco Massa, Daniel Haziza, Luca Wehrstedt, Jianyuan Wang, …
August 14, 2025 at 6:52 PM
And here’s my favorite figure from the paper, showing high-resolution DINOv3 representations in all their detail-capturing glory ✨
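Figures like this are typically rendered by projecting each patch feature onto the top principal components and mapping those to colors. A minimal numpy sketch of that recipe (the function name and min-max color scaling are my illustrative choices, not the paper's exact pipeline):

```python
import numpy as np

def patch_features_to_rgb(patches, h, w):
    """Project dense patch features onto their top-3 PCA components
    and map the result to an RGB image for visualization.

    patches: (h * w, dim) array of patch features for one image.
    Returns an (h, w, 3) float image with values in [0, 1].
    """
    centered = patches - patches.mean(axis=0)
    # Top principal directions via SVD of the centered feature matrix
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt[:3].T  # (h*w, 3) PCA coordinates
    # Min-max scale each component into [0, 1] for display
    lo, hi = proj.min(axis=0), proj.max(axis=0)
    rgb = (proj - lo) / (hi - lo + 1e-8)
    return rgb.reshape(h, w, 3)

rng = np.random.default_rng(0)
img = patch_features_to_rgb(rng.standard_normal((14 * 14, 384)), 14, 14)
print(img.shape)  # (14, 14, 3)
```

With real DINOv3 features, semantically similar patches land close in PCA space, so objects emerge as coherent colored regions.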
August 14, 2025 at 6:52 PM
To recap:

1) The promise of SSL is finally realized, enabling foundation models across domains
2) High-quality dense features enabling SotA applications
3) A versatile family of models for diverse deployment scenarios

So many great ideas (Gram anchoring!) went into how we got there, so please read the paper!
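Gram anchoring, roughly, keeps the student's patch-similarity structure (the Gram matrix of normalized patch features) close to that of an earlier checkpoint, which preserves dense feature quality over long training. A loose numpy sketch of the idea (the function name and the mean-squared loss are my assumptions for illustration, not the paper's exact formulation):

```python
import numpy as np

def gram_anchoring_loss(student_patches, anchor_patches):
    """Penalize drift between the patch-similarity (Gram) matrices of the
    student and an earlier 'anchor' checkpoint on the same image.

    student_patches, anchor_patches: (num_patches, dim) feature arrays.
    """
    # L2-normalize patches so the Gram matrix holds cosine similarities
    s = student_patches / np.linalg.norm(student_patches, axis=1, keepdims=True)
    a = anchor_patches / np.linalg.norm(anchor_patches, axis=1, keepdims=True)
    # (num_patches, num_patches) pairwise similarity matrices
    gram_s = s @ s.T
    gram_a = a @ a.T
    # Mean squared difference between the two similarity structures
    return np.mean((gram_s - gram_a) ** 2)

rng = np.random.default_rng(0)
feats = rng.standard_normal((16, 8))
print(gram_anchoring_loss(feats, feats))  # 0.0 for identical features
```

The key design point is that only the *relative* similarities between patches are anchored, so the features themselves remain free to improve.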
August 14, 2025 at 6:52 PM
Satellite, you said? Yes, the same DINOv3 algorithm trained on satellite imagery produces a SotA model for geospatial tasks like canopy height estimation. And, of course, it learns beautiful feature maps. This is the magic of SSL 🪄
August 14, 2025 at 6:52 PM
3) DINOv3 is a family of models covering all use cases:

• ViT-7B flagship model
• ViT-S/S+/B/L/H+ (21M-840M params)
• ConvNeXt variants for efficient inference
• Text-aligned ViT-L (dino.txt)
• ViT-L/7B for satellite

All inheriting the great dense features of the 7B!
August 14, 2025 at 6:52 PM
Well, Jianyuan Wang of VGGT fame simply dropped DINOv3 into his pipeline and off-handedly got a new SotA 3D model out. Seems promising enough?
August 14, 2025 at 6:52 PM
2) DINOv3’s global understanding is strong, but its dense representations truly shine! There’s a clear gap between DINOv3 and prior methods across many tasks. This matters because pretrained dense features power many applications: MLLMs, video & 3D understanding, robotics, generative models, …
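One simple way such applications use frozen dense features: transfer labels between images by matching patches with cosine similarity, no training needed. A toy numpy sketch (function name and setup are mine, purely illustrative):

```python
import numpy as np

def transfer_patch_labels(ref_feats, ref_labels, query_feats):
    """Label each query patch with the label of its most cosine-similar
    reference patch -- a minimal demo of what frozen dense features enable.

    ref_feats: (n_ref, dim), ref_labels: (n_ref,), query_feats: (n_q, dim)
    """
    r = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    sims = q @ r.T  # (n_q, n_ref) cosine similarities
    return ref_labels[sims.argmax(axis=1)]

# Toy check with two well-separated feature clusters standing in for
# "object" and "background" patches
rng = np.random.default_rng(0)
ref = np.vstack([rng.normal(+5, 1, (10, 4)), rng.normal(-5, 1, (10, 4))])
labels = np.array([0] * 10 + [1] * 10)
query = np.vstack([np.full((1, 4), +5.0), np.full((1, 4), -5.0)])
print(transfer_patch_labels(ref, labels, query))  # [0 1]
```

The better the dense features separate semantics, the further this kind of zero-shot matching goes, which is exactly where the gap to prior methods shows up.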
August 14, 2025 at 6:52 PM
1) Some history: on ImageNet classification, supervised and weakly-supervised models have converged to the same plateau over the past few years. With DINOv3, SSL finally reaches that level. This alone is a big deal: no more reliance on annotated data!
August 14, 2025 at 6:52 PM
Introducing DINOv3 🦕🦕🦕

A SotA-enabling vision foundation model, trained with pure self-supervised learning (SSL) at scale.
High-quality dense features, combining unprecedented semantic and geometric scene understanding.

Three reasons why this matters👇
August 14, 2025 at 6:52 PM