rbalestr.bsky.social
@rbalestr.bsky.social
Our solution is to train an SSL denoiser solely to create a data curriculum for the SSL method you are interested in. By first observing denoised samples and gradually returning to the original samples, the final SSL model performs better than the baseline!
May 20, 2025 at 2:38 PM
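The curriculum described above can be sketched as a simple blend that anneals from denoised samples back to the raw data. This is a minimal numpy sketch: the linear schedule, the `denoiser` interface, and the function names are assumptions for illustration, not the paper's exact recipe.

```python
import numpy as np

def curriculum_batch(x, denoiser, step, total_steps):
    """Blend denoised and original samples: training starts on fully
    denoised data and linearly anneals back to the raw samples.
    (Hypothetical interface; the actual schedule may differ.)"""
    alpha = min(step / total_steps, 1.0)  # 0 -> denoised, 1 -> original
    return (1.0 - alpha) * denoiser(x) + alpha * x

# toy usage: a trivial stand-in for a trained SSL denoiser
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
denoise = lambda z: np.zeros_like(z)  # stand-in, NOT a real denoiser
start = curriculum_batch(x, denoise, step=0, total_steps=100)
end = curriculum_batch(x, denoise, step=100, total_steps=100)
```

At `step=0` the model only sees the denoiser's output; by `step=total_steps` the denoiser is out of the loop entirely, so the deployed pipeline stays denoiser-free.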
With high levels of noise, it is standard to have a denoiser as part of the train/test preprocessing pipeline... but this has drawbacks, e.g. it adds a bias to your pipeline, complicates cross-validation, and is sensitive to distribution shifts... AI/SSL should strive for denoiser-free pipelines!
May 20, 2025 at 2:38 PM
The spline connection offers closed-form answers to many open questions about SAEs--and provides clear, actionable solutions such as our PAM-SGD training algo. PAM-SGD is EM-like, alternating between partition/region assignment and parameter updates, and outperforms typical Adam/SGD training.
May 20, 2025 at 2:08 PM
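The EM-like alternation can be illustrated on a top-k SAE: the E-step fixes each sample's spline region (its active-latent support), and the M-step solves the codes in closed form on that frozen support. This is a hypothetical numpy sketch of the idea; the names (`pam_step`) and the least-squares M-step are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def pam_step(X, W_enc, W_dec, k):
    """One EM-like alternation (illustrative sketch, not the real PAM-SGD):
    E-step: fix the partition, i.e. each sample's top-k active latents;
    M-step: with supports frozen, refit the codes in closed form."""
    Z = X @ W_enc                                  # pre-activations
    supp = np.argsort(-np.abs(Z), axis=1)[:, :k]   # region assignment
    Z_sparse = np.zeros_like(Z)
    X_hat = np.zeros_like(X)
    for i in range(X.shape[0]):
        D = W_dec[supp[i]]                         # active decoder atoms
        # closed-form codes on the fixed support (least squares)
        c, *_ = np.linalg.lstsq(D.T, X[i], rcond=None)
        Z_sparse[i, supp[i]] = c
        X_hat[i] = c @ D
    return Z_sparse, X_hat
```

Freezing the region assignment is what makes the inner problem convex, which is where the closed-form updates come from.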
The findings stem from expressing SAEs as splines (arxiv.org/abs/2408.04809) and doing a deep dive into their partition, constraints, and underlying geometry! We not only characterize their input-space partition and geometry, but also tie SAEs to common methods such as k-means and PCA
May 20, 2025 at 2:08 PM
Want better training and geometric insights for Sparse AutoEncoders (SAEs)? Search no more... We leverage spline theory to provide a new "EM-like" training algo (PAM-SGD) and to delve into SAE geometry with connections to PCA, k-means, and more...

arxiv.org/abs/2505.11836
May 20, 2025 at 2:08 PM
That bias towards capturing details manifests as distinct attention behavior within ViTs. Building on those findings, we propose a new token aggregator that counters this attention bias without having to finetune the backbone -> gains in linear-probe performance!
December 5, 2024 at 6:47 PM
Learning by reconstruction captures uninformative details in your data. This “attention to details” biases the ViT’s attention. Our solution: a new token aggregator -> significantly improves MAE linear-probe perf. and slightly improves JEPAs like I-JEPA
arxiv.org/abs/2412.03215
December 5, 2024 at 6:47 PM
We propose an approach that combines segmentation and association of geographic entities in historical maps using video instance segmentation (VIS). Combined with a novel method for generating synthetic videos from unlabeled historical maps, we produce SSL models with high accuracy.
December 2, 2024 at 2:22 PM
Understanding the evolution of historical maps is key to tracking the development of civilizations (urbanization, environmental changes, ...). We show how to use Self-Supervised Learning to do that without supervision!
arxiv.org/abs/2411.17425
(SSL workshop NeurIPS24)
December 2, 2024 at 2:22 PM