(1) visible pixels may be redundant with masked ones,
(2) visible pixels may not be predictive of masked regions.
+38% on classification tasks.
I wonder how much CroCo & *ST3R might benefit from this.
arxiv.org/abs/2502.06314