Lightnews — Scholar-powered news

Mehdi S. M. Sajjadi

@msajjadi.com

94 followers 83 following 6 posts

Research Scientist
Tech Lead & Manager
Google DeepMind
msajjadi.com

Mehdi S. M. Sajjadi

@msajjadi.com

Scaling 4D Representations

Self-supervised learning from video does scale! In our latest work, we scaled masked auto-encoding models to 22B params, boosting performance on pose estimation, tracking & more.

Paper: arxiv.org/abs/2412.15212
Code & models: github.com/google-deepmind/representations4d

July 10, 2025 at 11:52 AM

Mehdi S. M. Sajjadi

@msajjadi.com

Generative Video Diffusion: does a model trained with this objective learn better features than one trained for image generation?

We investigated this question and more in our latest work. Please check it out!

*From Image to Video: An Empirical Study of Diffusion Representations*
arxiv.org/abs/2502.07001

Video vs. image diffusion representations

Feature visualization for image and video diffusion

February 13, 2025 at 4:11 PM

Mehdi S. M. Sajjadi

@msajjadi.com

TRecViT: A Recurrent Video Transformer
arxiv.org/abs/2412.14294

Causal, with 3× fewer parameters, 12× less memory, and 5× fewer FLOPs than (non-causal) ViViT, while matching or outperforming it on Kinetics & SSv2 action recognition.

Code and checkpoints out soon.

January 10, 2025 at 3:44 PM
