Harry Thasarathan (@hthasarathan.bsky.social)
PhD student @YorkUniversity @LassondeSchool, I work on computer vision and interpretability.
This was joint work with my wonderful collaborators @Julian_Forsyth @thomasfel.bsky.social @matthewkowal.bsky.social and my supervisor @csprofkgd.bsky.social. Couldn’t ask for better mentors and friends🫶!!!

(9/9)
February 7, 2025 at 3:15 PM
We hope this work contributes to the growing discourse on universal representations. As the zoo of vision models increases, a canonical, interpretable concept space could be crucial for safety and understanding. Code coming soon!

(8/9)
February 7, 2025 at 3:15 PM
Our method reveals model-specific features too: DinoV2 (left) shows specialized geometric concepts (depth, perspective), while SigLIP (right) captures unique text-aware visual concepts.

This opens new paths for understanding model differences!

(7/9)
February 7, 2025 at 3:15 PM
Using coordinated activation maximization on universal concepts, we can visualize how each model independently represents the same concept, allowing us to further explore model similarities and differences. Below are concepts visualized for DinoV2, SigLIP, and ViT.
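
A rough sketch of what "coordinated" activation maximization could look like (an illustration of the idea, not the thread's actual code): one image is optimized so that every model's projection onto its own direction for the same shared concept is maximized jointly. The `models` list, `concept_dirs` directions, and all hyperparameters are assumptions for the example.

```python
# Hypothetical sketch of coordinated activation maximization across models.
# Assumes `models` is a list of frozen vision backbones returning (1, tokens, d)
# patch features, and `concept_dirs` holds each model's direction (d,) for the
# SAME universal concept.
import torch

def coordinated_act_max(models, concept_dirs, steps=256, lr=0.05, size=224):
    img = torch.randn(1, 3, size, size, requires_grad=True)
    opt = torch.optim.Adam([img], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = 0.0
        for model, direction in zip(models, concept_dirs):
            feats = model(img)            # (1, tokens, d) patch features
            acts = feats @ direction      # project onto this model's concept direction
            loss = loss - acts.mean()     # maximize the concept jointly across models
        loss.backward()
        opt.step()
    return img.detach()
```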

(6/9)
February 7, 2025 at 3:15 PM
Using co-firing and firing entropy metrics, we uncover universal features ranging from basic primitives (colors, textures) to complex abstractions (object interactions, hierarchical compositions). We find that universal concepts are important for reconstructing model activations!
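
One plausible way to compute co-firing and firing-entropy style metrics (my reading of the post, not the paper's released code): given shared-concept activations from every model on the same inputs, co-firing measures how often a concept is active in all models at once, and firing entropy measures how evenly its firings are spread across models. The tensor shape and threshold below are assumptions.

```python
# Hedged sketch of co-firing / firing-entropy metrics.
# `acts` is assumed to have shape (num_models, num_samples, num_concepts),
# holding shared-concept activations for every model on the same inputs.
import torch

def cofiring_rate(acts, thresh=0.0):
    fired = acts > thresh                              # (M, N, K) boolean firings
    all_models = fired.all(dim=0).float()              # fired in every model
    any_model = fired.any(dim=0).float()               # fired in at least one model
    return all_models.sum(0) / any_model.sum(0).clamp(min=1)  # per-concept co-firing

def firing_entropy(acts, thresh=0.0, eps=1e-8):
    counts = (acts > thresh).float().sum(dim=1)        # (M, K) firing counts per model
    p = counts / counts.sum(dim=0, keepdim=True).clamp(min=eps)
    return -(p * (p + eps).log()).sum(dim=0)           # high entropy = fires evenly across models
```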

(5/9)
February 7, 2025 at 3:15 PM
Previous approaches found universal features by post-hoc mining or similarity analysis - but this scales poorly. Our solution: extend Sparse Autoencoders to learn a shared concept space directly, encoding one model's activations and reconstructing all others from this unified vocabulary.
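
A minimal sketch of that shared-concept setup, assuming per-model encoders and decoders around a single sparse code (an illustration under my assumptions, not the released implementation): activations from one source model are encoded into a shared sparse concept vector, and per-model decoders reconstruct every model's activations from that same code.

```python
# Illustrative shared-concept sparse autoencoder (names and top-k sparsity are assumptions).
import torch
import torch.nn as nn

class UniversalSAE(nn.Module):
    def __init__(self, dims, n_concepts, k=64):
        super().__init__()
        self.k = k  # top-k sparsity level (assumed)
        self.encoders = nn.ModuleList([nn.Linear(d, n_concepts) for d in dims])
        self.decoders = nn.ModuleList([nn.Linear(n_concepts, d, bias=False) for d in dims])

    def forward(self, x, src):
        z = torch.relu(self.encoders[src](x))          # encode one model's activations
        topk = torch.topk(z, self.k, dim=-1)
        code = torch.zeros_like(z).scatter_(-1, topk.indices, topk.values)  # sparse shared code
        return [dec(code) for dec in self.decoders]    # reconstruct every model from the code

# Training could sample a source model per batch and regress each decoder's output
# against the corresponding model's true activations (e.g., a sum of MSE losses).
```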

(4/9)
February 7, 2025 at 3:15 PM
If vision models really do learn the same fundamental visual concepts, what are these universal features, and how can we find them?

(3/9)
February 7, 2025 at 3:15 PM
Vision models (backbones & foundation models alike) seem to learn transferable features that are relevant across many tasks. Recent work even suggests models are converging towards the same "Platonic" representation of the world. (Image from arxiv.org/abs/2405.07987)

(2/9)
February 7, 2025 at 3:15 PM