Joel Valdivia Ortega
virtualhomo.bsky.social
Human turned into a virtual homo during PhD at Helmholtz Munich
📢 Thanks a lot to @lorenzlamm.bsky.social, @marionjasnin.bsky.social, @tingyingpeng.bsky.social, Franziska Eckardt and Benedikt Schworm for all the help in this work ❤️ and to you for reading 😍 !

I would love to get any feedback, so please feel free to reach out!

🧵9/9
December 3, 2025 at 2:18 AM
🤟 Results

Thus, increasing a token's norm or keeping non-sparse representations becomes more costly for the ViT, discouraging repurposing and promoting better representations and, as a result, better quantitative performance.

🧵8/9
Random embeddings also create angular anisotropy: the less sparse a token is, the more distortion it experiences from the random embedding.

🧵7/9
Random embeddings create radial anisotropy: the larger a token's norm, the more it is distorted by the random embedding.
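
A quick numpy check of this radial effect: for a fixed random matrix W, W(s·x) = s·Wx, so the absolute distortion grows linearly with the token's norm (the dimension and amplitude here are illustrative, not the paper's values):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64
amplitude = 0.05
W = rng.normal(0.0, amplitude, size=(d, d))   # one fixed random embedding

x = rng.normal(size=d)
x /= np.linalg.norm(x)                        # unit-norm direction

# Same direction, growing norm: the absolute distortion grows linearly.
scales = (1.0, 5.0, 25.0)
distortions = [np.linalg.norm(W @ (s * x)) for s in scales]
print(distortions)
```

Scaling the token by 5 scales its distortion by exactly 5, so large-norm tokens pay the biggest price under a random embedding.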

🧵6/9
☝️Topology preservation

By choosing an appropriate amplitude, we can preserve the topology of the latent space: the smaller the amplitude, the fewer modifications are expected in the topology.
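
A small numpy sketch of this effect (the residual-style update `tokens + tokens @ W` is an assumption for illustration): shrinking the amplitude shrinks the average change in pairwise distances, i.e. the latent geometry is better preserved:

```python
import numpy as np

def pairwise_dists(x):
    """All pairwise Euclidean distances between rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.linalg.norm(diff, axis=-1)

rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 64))            # 16 tokens, 64-dim
base = pairwise_dists(tokens)

changes = {}
for amplitude in (0.2, 0.02):
    W = rng.normal(0.0, amplitude, size=(64, 64))  # random embedding
    moved = tokens + tokens @ W                     # residual update (assumed)
    changes[amplitude] = np.abs(pairwise_dists(moved) - base).mean()
print(changes)
```

The mean distortion of the distance matrix drops with the amplitude, matching the intuition that a small-amplitude random embedding is a near-identity perturbation of the latent space.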

🧵5/9
🧑‍🍳 Contribution

We proposed replacing the learnable parameters of the MLPs with random variables, turning them into random embeddings and creating the ✨Randomized-MLP (RMLP)✨. This architecture has one hyperparameter, the amplitude, which controls the standard deviation of the random variables.
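
A minimal numpy sketch of the idea as we read it (the two-layer shape, the ReLU, and the hidden width are assumptions; only the amplitude hyperparameter comes from the thread):

```python
import numpy as np

def rmlp(x, d_hidden, amplitude, rng):
    """Randomized-MLP sketch: a two-layer MLP whose weights are not
    learned but drawn from a zero-mean Gaussian with std = amplitude."""
    d_in = x.shape[-1]
    # Random (non-learnable) embeddings in place of trained weight matrices.
    w1 = rng.normal(0.0, amplitude, size=(d_in, d_hidden))
    w2 = rng.normal(0.0, amplitude, size=(d_hidden, d_in))
    h = np.maximum(x @ w1, 0.0)            # ReLU nonlinearity (assumed)
    return h @ w2

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 64))          # 4 tokens, 64-dim
out = rmlp(tokens, d_hidden=256, amplitude=0.02, rng=rng)
print(out.shape)                           # (4, 64)
```

Since nothing in the RMLP is trained, the network upstream can no longer offload part of the classification into the MLP weights, which is the repurposing the thread wants to discourage.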

🧵4/9
🎨 Baseline

Taking the DINO and iBOT losses, consider a teacher providing two stable classes. The student and its MLP then need to learn to match that classification, and part of the learning may end up in the MLP.
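
A hedged numpy sketch of the DINO-style cross-entropy in this two-class setup (the temperatures and the toy teacher/student logits are illustrative assumptions, not values from the paper):

```python
import numpy as np

def dino_loss(teacher_logits, student_logits, t_temp=0.04, s_temp=0.1):
    """DINO-style cross-entropy between a sharpened teacher distribution
    and the student's log-softmax."""
    t = np.exp(teacher_logits / t_temp)
    t /= t.sum(axis=-1, keepdims=True)               # sharpened teacher softmax
    s = student_logits / s_temp
    log_s = s - np.log(np.exp(s).sum(axis=-1, keepdims=True))
    return -(t * log_s).sum(axis=-1).mean()

# Teacher confidently assigns each token to one of two stable classes:
teacher = np.array([[5.0, -5.0], [-5.0, 5.0]])
student_good = np.array([[4.0, -4.0], [-4.0, 4.0]])  # matches the teacher
student_bad = -student_good                          # swaps the classes
print(dino_loss(teacher, student_good) < dino_loss(teacher, student_bad))  # True
```

Minimising this loss is what pushes the student, MLP included, toward the teacher's classification; with a learnable MLP, part of that matching can be absorbed by the MLP weights themselves.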

🧵3/9
✨Motivation

Token norms have been used to spot ViTs repurposing patch tokens to encode general information in void regions of natural images, and regularisation techniques have been developed to avoid this. We observed this behaviour in regularised models when applied to medical images.

🧵2/9