Simon Schrodi
simonschrodi.bsky.social
🎓 PhD student @cvisionfreiburg.bsky.social @UniFreiburg
💡 interested in mechanistic interpretability, robustness, AutoML & ML for climate science

https://simonschrodi.github.io/
Big thanks to our amazing co-authors: Max Argus, Volker Fischer, and @thomasbrox.bsky.social 🙌
April 20, 2025 at 2:24 PM
Even better, if you're at #ICLR2025 next week:
🖼️ Poster — April 24, 10 a.m. - 12:30 p.m., Hall 3 + Hall 2B (#481)
🎤 Oral — April 24, 4:30 p.m. - 4:42 p.m., Garnet 213–215 (oral session 2B)
☕ Or just catch us over coffee!
Curious to dive deeper?
📑 Paper: openreview.net/forum?id=uAF...
💻 Code: github.com/lmb-freiburg...
📬 DM me or David (he's not on Bluesky, but you can DM him on other platforms)!
Two Effects, One Trigger: On the Modality Gap, Object Bias, and...
But what is the modality gap good for? Interestingly, we find it affects the model's entropy, suggesting it might not be a bug but a feature. 👀
Our paper is packed with surprising and insightful findings about both phenomena. Most notably, we show that both effects stem from the information imbalance between the image and text modalities, and both shrink as that imbalance decreases.
In this work, we investigate two undesired properties of CLIP-like models:
- Modality gap: a complete separation of image and text embeddings in the shared embedding space.
- Object bias: a tendency to focus on objects over other semantic aspects like attributes.
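The modality gap above can be made concrete with a small sketch. A common way to quantify it (not necessarily the exact metric used in the paper) is the Euclidean distance between the centroids of the image and text embedding clouds; the synthetic embeddings and the 8-dimensional size here are illustrative stand-ins for real CLIP encoder outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for CLIP-style embeddings; real ones would come from
# the image and text encoders. The shifted means mimic the gap.
image_emb = rng.normal(loc=0.5, scale=1.0, size=(100, 8))
text_emb = rng.normal(loc=-0.5, scale=1.0, size=(100, 8))

def normalize(x):
    """Project embeddings onto the unit hypersphere, as CLIP does."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

image_emb = normalize(image_emb)
text_emb = normalize(text_emb)

# Modality gap: distance between the two modality centroids.
gap = np.linalg.norm(image_emb.mean(axis=0) - text_emb.mean(axis=0))
print(f"modality gap: {gap:.3f}")
```

With truly overlapping modalities the centroids would nearly coincide and the gap would approach zero; for CLIP-like models in practice it stays clearly positive.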