Timo Lüddecke
timojl.bsky.social
University of Göttingen and CIDAS
This is joint work with @aecker.bsky.social
November 28, 2025 at 1:25 PM
More information (with additional results on DinoV3, SigLIP2 and Perception Encoder):

📄 Paper (in TMLR): openreview.net/forum?id=neM...
📊 Website: eckerlab.org/projects/deap/
💻 Code: github.com/timojl/deap

…or drop by our poster at the ELLIS UnConference on December 2nd in Copenhagen. #EuRIPS
November 28, 2025 at 1:25 PM
Based on our performance data for all backbones, we analyze to what degree performance can be attributed to general properties of the backbone (input image resolution, feature dimension, number of parameters). We find strong relationships with all three properties for semantic segmentation and depth.
November 28, 2025 at 1:25 PM
A closer look at the three instance awareness tasks (instance discrimination, instance boundary detection, object detection) reveals that self-supervised learning outperforms vision-language (CLIP-style) pretraining.
November 28, 2025 at 1:25 PM
We compare supervised, self-supervised and vision-language backbones with respect to instance awareness, local semantics and spatial understanding. Here we show the trade-off between forward-pass runtime and performance on local semantics and spatial understanding:
November 28, 2025 at 1:25 PM