🔗 aidailypost.com/news/human-a...
Our latest Deeper Learning blog post is an #interpretability deep dive into one of today’s leading vision foundation models: DINOv2.
📖 Read now: bit.ly/4nNfq8D
Stay tuned — Part 2 coming soon.
#AI #VLMs #DINOv2
Randomized-MLP Regularization Improves Domain Adaptation and Interpretability in DINOv2
https://arxiv.org/abs/2511.05509
DINOv2 Driven Gait Representation Learning for Video-Based Visible-Infrared Person Re-identification
https://arxiv.org/abs/2511.04281
AI-based crystal detection: a DINOv2 backbone with transfer learning, refined by expert feedback and adapted to varying imaging conditions.
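The recipe sketched in that post (DINOv2 plus transfer learning) usually means freezing the pre-trained backbone and training a small task head on labeled crystal images. A minimal sketch under that assumption, using the public torch.hub DINOv2 entry point; `num_classes`, the head, and the optimizer settings are illustrative placeholders, not details from the post:

```python
import torch
import torch.nn as nn

# Load a pre-trained DINOv2 backbone from the official torch.hub entry point.
backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False  # transfer learning: keep the backbone frozen

# Lightweight classification head on the global feature.
# num_classes is a placeholder, e.g. {crystal, no-crystal}.
num_classes = 2
head = nn.Linear(backbone.embed_dim, num_classes)

def predict(images: torch.Tensor) -> torch.Tensor:
    """images: (B, 3, H, W), with H and W multiples of 14 (the ViT patch size)."""
    with torch.no_grad():
        feats = backbone(images)  # (B, embed_dim) global features
    return head(feats)

# Only the head's parameters are trained on the labeled crystal images.
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
```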
Tested on multiple vision encoders (SimDINOv2, DINOv2, DFN…), CroVCA achieves SOTA unsupervised hashing:
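For readers unfamiliar with the task: unsupervised hashing turns frozen encoder embeddings into short binary codes for fast retrieval. Below is the classic random-hyperplane (LSH) baseline as a minimal sketch, not CroVCA itself; see the paper for the actual method:

```python
import numpy as np

def lsh_hash(features: np.ndarray, n_bits: int = 64, seed: int = 0) -> np.ndarray:
    """Random-hyperplane hashing of frozen encoder features.

    features: (N, D) array of image embeddings (e.g. from DINOv2).
    Returns an (N, n_bits) binary code; Hamming distance between codes
    approximates cosine distance between the original embeddings.
    """
    rng = np.random.default_rng(seed)
    # Each random hyperplane contributes one bit: which side the feature falls on.
    hyperplanes = rng.standard_normal((features.shape[1], n_bits))
    return (features @ hyperplanes > 0).astype(np.uint8)

# Usage: codes = lsh_hash(dinov2_embeddings); retrieve by Hamming distance.
```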
Optimizing DINOv2 with Registers for Face Anti-Spoofing
https://arxiv.org/abs/2510.17201
By O. Dünkel, T. Wimmer, C. Theobalt, C. Rupprecht, A. Kortylewski
Page: genintel.github.io/DIY-SC
• Works across SSL encoders (DINOv2 best, CLIP & MAE close)
• Cosine-similarity loss balances fidelity vs generativity (minimal sketch after this list)
• Without SSL priors → reconstructions good, generations collapse
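The cosine-similarity term mentioned above is a standard feature-space objective; a minimal PyTorch sketch of one common formulation (the exact loss and weighting in the work may differ):

```python
import torch
import torch.nn.functional as F

def cosine_feature_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """1 - cos(pred, target), averaged over the batch.

    pred:   features reconstructed by the decoder pathway, (B, D)
    target: frozen SSL encoder features (e.g. DINOv2), (B, D)
    Maximizing cosine similarity keeps reconstructions faithful to the
    encoder's feature space without constraining feature norms.
    """
    return (1.0 - F.cosine_similarity(pred, target, dim=-1)).mean()
```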
We start from a frozen self-supervised encoder (DINOv2, MAE, or CLIP) and combine it with a generative decoder.
Then we fine-tune only the [CLS] token embedding - injecting low-level info while keeping the rest frozen.
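Fine-tuning only the [CLS] token embedding amounts to unfreezing a single parameter tensor. A minimal sketch with the public DINOv2 checkpoint, assuming its `cls_token` attribute; the decoder, data, and training loop from the thread are omitted:

```python
import torch

encoder = torch.hub.load("facebookresearch/dinov2", "dinov2_vitb14")

# Freeze every parameter of the pre-trained encoder...
for p in encoder.parameters():
    p.requires_grad = False

# ...then unfreeze only the learned [CLS] token embedding.
encoder.cls_token.requires_grad = True

# The optimizer sees exactly one tensor: the [CLS] embedding.
trainable = [p for p in encoder.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)
```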
Importantly, the upsampled features also stay faithful to the input feature space, as we show in experiments with pre-trained DINOv2 probes.
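One way to make that faithfulness check concrete: apply a linear probe trained on native-resolution DINOv2 features, unchanged, to the upsampled feature maps and measure prediction agreement. A minimal sketch of such an evaluation; `probe` and both feature tensors are assumed inputs, not code from the work:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def probe_agreement(probe: nn.Linear,
                    dinov2_feats: torch.Tensor,
                    upsampled_feats: torch.Tensor) -> float:
    """Fraction of pixels where a frozen DINOv2 probe predicts the same class
    on the upsampled features as on (bilinearly resized) original features.

    dinov2_feats:    (B, D, h, w)  native patch features
    upsampled_feats: (B, D, H, W)  high-resolution features, H > h, W > w
    """
    # Resize the coarse features to the upsampled grid for a per-pixel comparison.
    ref = nn.functional.interpolate(dinov2_feats,
                                    size=upsampled_feats.shape[-2:],
                                    mode="bilinear", align_corners=False)
    # Apply the same probe to both feature maps (the probe acts on the channel dim).
    pred_ref = probe(ref.permute(0, 2, 3, 1)).argmax(-1)
    pred_up = probe(upsampled_feats.permute(0, 2, 3, 1)).argmax(-1)
    return (pred_ref == pred_up).float().mean().item()
```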