muadiboo.bsky.social
Reposted by muadiboo.bsky.social
This paper masks out principal components instead of RGB patches because
(1) visible pixels may be redundant with masked ones,
(2) visible pixels may not be predictive of masked regions.
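A rough sketch of the idea, not the paper's actual implementation: project images onto their principal components, hide a random subset of components, and treat the hidden part as the reconstruction target. The function name `mask_principal_components`, the `mask_ratio` parameter, and fitting PCA on the batch itself are all assumptions made for illustration.

```python
import numpy as np

def mask_principal_components(images, mask_ratio=0.5, seed=0):
    """Split images into a 'visible' and a 'masked' part in PCA space.

    Hypothetical helper: fits PCA on the given batch and masks a random
    subset of components, shared across the whole batch.
    """
    rng = np.random.default_rng(seed)
    n = images.shape[0]
    flat = images.reshape(n, -1).astype(np.float64)

    # Center the data; rows of vt are the principal directions.
    mean = flat.mean(axis=0)
    centered = flat - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)

    # Pick a random subset of components to hide.
    k = vt.shape[0]
    n_masked = int(round(mask_ratio * k))
    keep = np.ones(k, dtype=bool)
    keep[rng.permutation(k)[:n_masked]] = False

    # Project, then rebuild each part from its own components.
    coeffs = centered @ vt.T
    visible = (coeffs * keep) @ vt + mean   # model input
    target = (coeffs * ~keep) @ vt          # what the model would predict
    return visible.reshape(images.shape), target.reshape(images.shape)

# Toy usage: 8 random 4x4 RGB "images".
images = np.random.default_rng(1).random((8, 4, 4, 3))
visible, target = mask_principal_components(images, mask_ratio=0.5)
```

Unlike a patch mask, every pixel stays partly visible here; what is hidden is a share of the variance, so the visible and masked parts sum back to the original image.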

+38% on classification tasks.

I wonder how much CroCo & *ST3R might benefit from this.
arxiv.org/abs/2502.06314
February 17, 2025 at 10:39 AM