#DINOv2
A new study shows that human-aligned AI models like AligNet boost the robustness of Vision Transformers, SigLIP, and DINOv2 across the THINGS and Levels datasets. Lukas Muttenthaler’s findings could reshape reliability benchmarks. Dive in! #AligNet #VisionTransformers #THINGSdataset

🔗 aidailypost.com/news/human-a...
November 13, 2025 at 5:09 PM
🐇Into the Rabbit Hull — Part 1: A Deep Dive into DINOv2🧠
Our latest Deeper Learning blog post is an #interpretability deep dive into one of today’s leading vision foundation models: DINOv2.
📖Read now: bit.ly/4nNfq8D
Stay tuned — Part 2 coming soon.
#AI #VLMs #DINOv2
November 12, 2025 at 3:49 PM
Joel Valdivia Ortega, Lorenz Lamm, Franziska Eckardt, Benedikt Schworm, Marion Jasnin, Tingying Peng
Randomized-MLP Regularization Improves Domain Adaptation and Interpretability in DINOv2
https://arxiv.org/abs/2511.05509
November 11, 2025 at 3:57 PM
Joel Valdivia Ortega, Lorenz Lamm, Franziska Eckardt, Benedikt Schworm, Marion Jasnin, Tingying Peng: Randomized-MLP Regularization Improves Domain Adaptation and Interpretability in DINOv2 https://arxiv.org/abs/2511.05509 https://arxiv.org/pdf/2511.05509 https://arxiv.org/html/2511.05509
November 11, 2025 at 6:30 AM
Yujie Yang, Shuang Li, Jun Ye, Neng Dong, Fan Li, Huafeng Li
DINOv2 Driven Gait Representation Learning for Video-Based Visible-Infrared Person Re-identification
https://arxiv.org/abs/2511.04281
November 7, 2025 at 8:03 AM
Yujie Yang, Shuang Li, Jun Ye, Neng Dong, Fan Li, Huafeng Li: DINOv2 Driven Gait Representation Learning for Video-Based Visible-Infrared Person Re-identification https://arxiv.org/abs/2511.04281 https://arxiv.org/pdf/2511.04281 https://arxiv.org/html/2511.04281
November 7, 2025 at 6:30 AM
At its core is a generalized 3D head decoder trained with perceptual supervision from DINOv2 and SAM 2.1. We find that our new perceptual loss formulation improves reconstruction fidelity compared to commonly-used methods such as LPIPS.
PercHead: Perceptual Head Model for Single-Image 3D Head Reconstruction & Editing
Project page: antoniooroz.github.io
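The post doesn’t spell out the loss, but the general shape of DINOv2-feature perceptual supervision is simple: compare the rendered and target images in frozen DINOv2 patch-feature space rather than in pixel or LPIPS space. The sketch below shows only that general pattern, not PercHead’s exact formulation; the SAM 2.1 branch, layer selection, and loss weighting are omitted.

```python
# Minimal sketch of a perceptual loss on frozen DINOv2 features (not the
# authors' exact PercHead formulation; layer choice and weighting are assumed).
import torch
import torch.nn.functional as F

# Frozen feature extractor from the official torch.hub entry point.
dinov2 = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").eval()
for p in dinov2.parameters():
    p.requires_grad_(False)

def dino_perceptual_loss(rendered: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Compare rendered and target images in DINOv2 patch-feature space.

    Both inputs are (B, 3, H, W) with H and W divisible by the patch size (14).
    """
    # forward_features returns a dict containing normalized patch tokens.
    feats_r = dinov2.forward_features(rendered)["x_norm_patchtokens"]
    feats_t = dinov2.forward_features(target)["x_norm_patchtokens"]
    # Cosine distance per patch token, averaged over tokens and batch.
    return (1.0 - F.cosine_similarity(feats_r, feats_t, dim=-1)).mean()

if __name__ == "__main__":
    print(dino_perceptual_loss(torch.rand(1, 3, 224, 224), torch.rand(1, 3, 224, 224)).item())
```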
November 5, 2025 at 11:37 AM
AXIS: A Lab-in-the-Loop Machine Learning approach for generalized detection of macromolecular crystals. [new]
AI-based crystal detection: DINOv2 features, transfer learning, expert feedback, and adaptation to new conditions.
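The post only names the ingredients, so the sketch below is a generic illustration of the “frozen DINOv2 + transfer learning” part rather than AXIS itself: a small trainable head on frozen DINOv2 embeddings. The two-class crystal/no-crystal head and the data handling are placeholders, and the lab-in-the-loop expert feedback is not modeled.

```python
# Generic "frozen DINOv2 backbone + small trainable head" transfer-learning
# recipe; illustrative only -- AXIS's actual pipeline is not described in the post.
import torch
import torch.nn as nn

backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").eval()
for p in backbone.parameters():
    p.requires_grad_(False)

head = nn.Linear(384, 2)  # ViT-S/14 embedding dim -> {crystal, no crystal} (assumed)
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    with torch.no_grad():           # backbone stays frozen
        emb = backbone(images)      # (B, 384) CLS embeddings
    loss = criterion(head(emb), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

print(train_step(torch.rand(8, 3, 224, 224), torch.randint(0, 2, (8,))))
```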
November 4, 2025 at 3:37 AM
November 3, 2025 at 3:30 PM
[6/10] Strong performance across encoders 💪
Tested on multiple vision encoders (SimDINOv2, DINOv2, DFN…), CroVCA achieves SOTA unsupervised hashing.
November 3, 2025 at 2:30 PM
Mika Feng, Pierre Gallin-Martel, Koichi Ito, Takafumi Aoki
Optimizing DINOv2 with Registers for Face Anti-Spoofing
https://arxiv.org/abs/2510.17201
October 21, 2025 at 10:19 AM
Mika Feng, Pierre Gallin-Martel, Koichi Ito, Takafumi Aoki: Optimizing DINOv2 with Registers for Face Anti-Spoofing https://arxiv.org/abs/2510.17201 https://arxiv.org/pdf/2510.17201 https://arxiv.org/html/2510.17201
October 21, 2025 at 6:32 AM
DIY-SC: “Do It Yourself—Learning Semantic Correspondence from Pseudo-Labels.” A lightweight adapter on DINOv2 / SD+DINOv2 features reaches SOTA on SPair-71k without keypoint supervision.
By O. Dünkel, T. Wimmer, C. Theobalt, C. Rupprecht, A. Kortylewski

Page: genintel.github.io/DIY-SC
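A rough sketch of the “lightweight adapter on DINOv2 features” idea: refine frozen patch descriptors with a small residual MLP and match them by cosine similarity. The adapter width and the nearest-neighbour matching rule are assumptions, not DIY-SC’s exact design, and the pseudo-label training objective is omitted.

```python
# Lightweight residual MLP adapter over DINOv2 patch tokens, matched by cosine
# similarity. Dimensions and architecture are illustrative, not DIY-SC's design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureAdapter(nn.Module):
    """Small MLP applied to every (frozen) DINOv2 patch token."""
    def __init__(self, dim: int = 384, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:  # tokens: (B, N, dim)
        return F.normalize(tokens + self.net(tokens), dim=-1)

def match(src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
    """Nearest-neighbour correspondence: for each source patch, the index of the
    most similar target patch. src, tgt: (N, dim) L2-normalized descriptors."""
    return (src @ tgt.T).argmax(dim=-1)

adapter = FeatureAdapter()
src = adapter(torch.randn(1, 256, 384))[0]   # e.g. a 16x16 patch grid
tgt = adapter(torch.randn(1, 256, 384))[0]
print(match(src, tgt).shape)                 # torch.Size([256])
```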
October 19, 2025 at 7:49 AM
Ablations show:
• Works across SSL encoders (DINOv2 best; CLIP and MAE close behind)
• A cosine-similarity loss balances fidelity vs. generativity
• Without SSL priors, reconstructions stay good but generations collapse
October 17, 2025 at 10:21 AM
💡 The idea

We start from a frozen self-supervised encoder (DINOv2, MAE, or CLIP) and combine it with a generative decoder.

Then we fine-tune only the [CLS] token embedding, injecting low-level information while keeping the rest of the encoder frozen.
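In PyTorch terms the recipe amounts to freezing every encoder parameter except the [CLS] token embedding. A minimal sketch, assuming the torch.hub DINOv2 model exposes that token as its `cls_token` parameter; the generative decoder and the training loop are omitted.

```python
# Fine-tune only the [CLS] token embedding of a frozen DINOv2 encoder.
# Assumes the torch.hub model exposes the token as the `cls_token` parameter.
import torch

encoder = torch.hub.load("facebookresearch/dinov2", "dinov2_vitb14")

# Freeze everything...
for p in encoder.parameters():
    p.requires_grad_(False)
# ...except the learnable [CLS] token embedding.
encoder.cls_token.requires_grad_(True)

print([n for n, p in encoder.named_parameters() if p.requires_grad])  # ['cls_token']

# Only the CLS embedding receives gradients from the decoder's reconstruction loss.
optimizer = torch.optim.AdamW([encoder.cls_token], lr=1e-4)
```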
October 17, 2025 at 10:21 AM
When linear probing for semantic segmentation or for surface-normal and depth estimation, AnyUp consistently outperforms prior upsamplers.

Importantly, the upsampled features also stay faithful to the input feature space, as we show in experiments with pre-trained DINOv2 probes.
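The linear probe itself is simple: a 1×1 convolution over the dense feature map. In the sketch below, plain bilinear interpolation stands in for AnyUp, and the feature dimension and class count are assumptions.

```python
# Linear probing on dense (upsampled) DINOv2 features for segmentation.
# Bilinear interpolation is a stand-in for AnyUp; the probe is a 1x1 convolution.
import torch
import torch.nn as nn
import torch.nn.functional as F

FEAT_DIM = 768       # DINOv2 ViT-B/14 feature dimension
NUM_CLASSES = 21     # placeholder class count

probe = nn.Conv2d(FEAT_DIM, NUM_CLASSES, kernel_size=1)   # the "linear probe"

def segment(patch_feats: torch.Tensor, out_size: tuple) -> torch.Tensor:
    """patch_feats: (B, FEAT_DIM, h, w) patch tokens reshaped to a grid."""
    dense = F.interpolate(patch_feats, size=out_size, mode="bilinear",
                          align_corners=False)             # stand-in upsampler
    return probe(dense)                                    # (B, NUM_CLASSES, H, W)

print(segment(torch.randn(1, FEAT_DIM, 16, 16), (224, 224)).shape)
```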
October 16, 2025 at 9:07 AM
Akib Mohammed Khan, Bartosz Krawczyk: Towards Adversarial Robustness and Uncertainty Quantification in DINOv2-based Few-Shot Anomaly Detection https://arxiv.org/abs/2510.13643 https://arxiv.org/pdf/2510.13643 https://arxiv.org/html/2510.13643
October 16, 2025 at 6:31 AM
What makes an image memorable? And can we predict image memorability using pretrained vision encoders? We explored activations, attention distributions, image patch uniformity and sparse autoencoder losses using image representations across the layers of CLIP, DINOv2 and SigLIP2.
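As one concrete example of those signals, patch uniformity can be read off as the mean pairwise cosine similarity among a layer’s patch tokens. The sketch below computes it for DINOv2’s last few blocks; the layer choice is arbitrary and no claim is made here about how strongly it predicts memorability.

```python
# Per-layer "patch uniformity" of DINOv2 representations: mean pairwise cosine
# similarity among patch tokens. Probed layers are chosen arbitrarily.
import torch
import torch.nn.functional as F

model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").eval()

@torch.no_grad()
def patch_uniformity(image: torch.Tensor, n_layers: int = 4) -> list:
    """image: (1, 3, H, W). Returns one uniformity score per probed layer."""
    feats = model.get_intermediate_layers(image, n=n_layers)  # last n blocks' patch tokens
    scores = []
    for tokens in feats:                     # each: (1, N_patches, dim)
        t = F.normalize(tokens[0], dim=-1)
        scores.append((t @ t.T).mean().item())
    return scores

print(patch_uniformity(torch.rand(1, 3, 224, 224)))
```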
October 15, 2025 at 9:10 AM
DINOv2, trained with multi-crop augmentation and a mean-teacher self-distillation setup, beats weakly supervised models like OpenCLIP on vision benchmarks (Oct 2025). Read more: https://getnews.me/unsupervised-transformer-pre-training-for-images-dinov2-survey/ #dinov2 #selfsupervised
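Schematically, that setup pairs an EMA “mean teacher” with a student that additionally sees small local crops. The toy sketch below uses a placeholder encoder, and the temperatures, output centering, and exclusion of same-view pairs are simplified relative to the real DINO/DINOv2 training code.

```python
# Toy multi-crop, mean-teacher self-distillation loop (DINO-style, simplified:
# no centering, no same-view exclusion, placeholder encoder).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 96 * 96, 256))  # placeholder encoder+head
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)

def distill_loss(global_crops, all_crops, t_temp=0.04, s_temp=0.1):
    # Teacher sees only global crops; student sees global + local crops.
    with torch.no_grad():
        targets = [F.softmax(teacher(c) / t_temp, dim=-1) for c in global_crops]
    losses = []
    for t in targets:                                   # each teacher view...
        for c in all_crops:                             # ...supervises every student view
            s_logp = F.log_softmax(student(c) / s_temp, dim=-1)
            losses.append(-(t * s_logp).sum(-1).mean())
    return torch.stack(losses).mean()

@torch.no_grad()
def ema_update(m=0.996):
    # "Mean teacher": teacher weights are an exponential moving average of the student's.
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.mul_(m).add_(ps, alpha=1 - m)

globals_ = [torch.rand(4, 3, 96, 96) for _ in range(2)]
locals_ = [torch.rand(4, 3, 96, 96) for _ in range(4)]  # same size only for the toy encoder
loss = distill_loss(globals_, globals_ + locals_)
loss.backward()
ema_update()
print(loss.item())
```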
October 7, 2025 at 5:26 PM
U-DFA, combining a frozen DINOv2 backbone with a lightweight CNN adapter, achieved top results on the Synapse and ACDC medical imaging benchmarks while using only 33% of the trainable parameters. https://getnews.me/u-dfa-model-sets-new-state-of-the-art-in-medical-image-segmentation/ #udfa #medicalimaging
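U-DFA’s dual fusion attention isn’t described in the post, so the sketch below only shows where the trainable parameters sit in the general “frozen DINOv2 backbone + lightweight decoder” segmentation pattern; the decoder layout and the class count (8 Synapse organs plus background) are assumptions.

```python
# Frozen DINOv2 backbone with a small trainable CNN decoder for segmentation.
# Illustrative pattern only; U-DFA's dual fusion attention is not reproduced.
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vitb14").eval()
for p in backbone.parameters():
    p.requires_grad_(False)

decoder = nn.Sequential(                 # the only trainable part
    nn.Conv2d(768, 256, 3, padding=1), nn.GELU(),
    nn.Conv2d(256, 9, 1),                # 9 classes assumed (8 organs + background)
)

def segment(images: torch.Tensor) -> torch.Tensor:
    B, _, H, W = images.shape
    with torch.no_grad():
        tokens = backbone.forward_features(images)["x_norm_patchtokens"]  # (B, N, 768)
    grid = tokens.transpose(1, 2).reshape(B, 768, H // 14, W // 14)
    return F.interpolate(decoder(grid), size=(H, W), mode="bilinear", align_corners=False)

print(segment(torch.rand(1, 3, 224, 224)).shape)  # (1, 9, 224, 224)
```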
October 3, 2025 at 5:24 AM
Zulkaif Sajjad, Furqan Shaukat, Junaid Mir: U-DFA: A Unified DINOv2-Unet with Dual Fusion Attention for Multi-Dataset Medical Segmentation https://arxiv.org/abs/2510.00585 https://arxiv.org/pdf/2510.00585 https://arxiv.org/html/2510.00585
October 2, 2025 at 6:35 AM
A dual-student model with a DINOv2 encoder reached 99.7% AUROC on MVTec-AD and 97.8% on CIFAR-10, showing strong performance for both industrial and semantic anomaly detection. https://getnews.me/dual-student-model-sets-state-of-the-art-multi-class-anomaly-detection/ #anomalydetection #dualstudent
September 30, 2025 at 9:23 PM
DINOReg fuses DINOv2 features with point-cloud data, boosting RGB-D alignment with a 14.2% rise in patch inlier ratio on RGBD-3DMatch and a 15.7% rise in recall on RGBD-3DLoMatch. Read more: https://getnews.me/dinoreg-boosts-point-cloud-registration-using-vision-foundation-model/ #dinoreg #pointcloud
September 30, 2025 at 8:32 PM
Researchers fine‑tuned visual foundation models (DINOv2, DINOv3, PE‑Core) on public SAR data, creating AFRL‑DINOv2, which outperformed SARATR‑X in object recognition. https://getnews.me/foundation-models-advance-synthetic-aperture-radar-object-recognition/ #sar #foundationmodels
September 29, 2025 at 8:03 AM
Researchers used DINOv2 embeddings with a Dirichlet Process Mixture model for unsupervised medical anomaly detection, halving inference time while matching benchmark accuracy. https://getnews.me/dinov2-clustering-with-dirichlet-process-for-medical-anomaly-detection/ #dinov2 #anomalydetection
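A minimal sketch of that pipeline using scikit-learn’s Dirichlet-process variant of `BayesianGaussianMixture` on DINOv2 CLS embeddings, scoring anomalies by negative log-likelihood under the fitted mixture. The truncation level, covariance type, and data here are placeholders, not the paper’s settings.

```python
# DINOv2 embeddings + Dirichlet-process mixture for unsupervised anomaly scoring.
# Hyperparameters and data are placeholders, not the paper's settings.
import numpy as np
import torch
from sklearn.mixture import BayesianGaussianMixture

model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").eval()

@torch.no_grad()
def embed(images: torch.Tensor) -> np.ndarray:
    return model(images).cpu().numpy()      # (B, 384) CLS embeddings

# Fit a truncated Dirichlet-process mixture on embeddings of normal images.
dpmm = BayesianGaussianMixture(
    n_components=20,                         # truncation level (assumed)
    weight_concentration_prior_type="dirichlet_process",
    covariance_type="diag",
)
dpmm.fit(embed(torch.rand(64, 3, 224, 224)))  # placeholder "normal" training images

# Low log-likelihood under the mixture -> likely anomaly.
anomaly_scores = -dpmm.score_samples(embed(torch.rand(4, 3, 224, 224)))
print(anomaly_scores)
```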
September 26, 2025 at 7:03 PM