Olivier Hénaff
@olivierhenaff.bsky.social
Working on something new, combining active, multimodal, and memory-augmented learning.

Formerly Senior Staff Scientist @GoogleDeepMind, PhD @NYU, @Polytechnique
Pinned
After an amazing 6 years at Google DeepMind, I'm thrilled to announce that I'll be starting a new project at the intersection of multimodal foundation modeling, data curation, and human behavior.

If this is of interest to you please reach out!
February 14, 2025 at 6:45 PM
Active data curation keeps on giving.

This time we enabled the distillation of large multimodal models into much smaller ones, simply by choosing the data they learn from.

Sets a new state of the art in small multimodal models that are very efficient for inference!
🚀New Paper: Active Data Curation Effectively Distills Multimodal Models
arxiv.org/abs/2411.18674

Smol models are all the rage these days & knowledge distillation (KD) is key for model compression!

We show how data curation can effectively distill to yield SoTA FLOP-efficient {C/Sig}LIPs!!
🧵👇
December 2, 2024 at 6:13 PM
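(Illustration, not code from the paper: a toy numpy sketch of the kind of learnability-based selection used in active data curation, assuming the common recipe of scoring each candidate example by student loss minus reference-model loss and keeping the top-scoring ones for the next batch. The linear models and squared-error loss below are placeholders, not the multimodal objective.)

import numpy as np

rng = np.random.default_rng(0)

def per_example_loss(weights, xs, ys):
    # Toy squared-error loss for a linear scorer, one value per example.
    return (xs @ weights - ys) ** 2

def select_batch(student_w, reference_w, xs, ys, batch_size):
    student_losses = per_example_loss(student_w, xs, ys)
    reference_losses = per_example_loss(reference_w, xs, ys)
    # "Learnability": examples that are hard for the student but easy for the reference.
    scores = student_losses - reference_losses
    top = np.argsort(-scores)[:batch_size]
    return xs[top], ys[top]

# Toy data and models standing in for image-text pairs and pretrained encoders.
xs = rng.normal(size=(512, 8))
true_w = rng.normal(size=8)
ys = xs @ true_w
reference_w = true_w + 0.05 * rng.normal(size=8)   # near-optimal reference / teacher
student_w = rng.normal(size=8)                      # freshly initialised learner

batch_x, batch_y = select_batch(student_w, reference_w, xs, ys, batch_size=64)
print(batch_x.shape)  # (64, 8): the curated batch the student would train on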
Reposted by Olivier Hénaff
This was an insightful project I worked on at Google DeepMind alongside the amazing @zeynepakata.bsky.social, @dimadamen.bsky.social, @ibalazevic.bsky.social and @olivierhenaff.bsky.social:

👉Language-image pretraining with CLIP or SigLIP is widely used due to strong zero-shot transfer, but ....
November 28, 2024 at 2:33 PM
Reposted by Olivier Hénaff
We maintain strong zero-shot transfer of CLIP / SigLIP across model size and data scale, while achieving up to 4x few-shot sample efficiency and up to +16% performance gains!

Fun project with @confusezius.bsky.social, @zeynepakata.bsky.social, @dimadamen.bsky.social and @olivierhenaff.bsky.social.
🤔 Can you turn your vision-language model from a great zero-shot model into a great-at-any-shot generalist?

Turns out you can, and here is how: arxiv.org/abs/2411.15099

Really excited to share this work on multimodal pretraining as my first bluesky entry!

🧵 A short and hopefully informative thread:
November 28, 2024 at 2:43 PM
Beyond zero-shot generalization, few-shot *adaptation* is critical for many applications.

We find simple changes to multimodal pretraining are sufficient to yield outsized gains on a wide range of few-shot tasks.

Congratulations @confusezius.bsky.social on a very successful internship!
🤔 Can you turn your vision-language model from a great zero-shot model into a great-at-any-shot generalist?

Turns out you can, and here is how: arxiv.org/abs/2411.15099

Really excited to share this work on multimodal pretraining as my first bluesky entry!

🧵 A short and hopefully informative thread:
November 28, 2024 at 2:47 PM
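(Illustration, not the method from arxiv.org/abs/2411.15099: a toy numpy sketch of the few-shot setting these posts refer to, classifying query embeddings by their nearest class prototype built from a handful of labelled support examples. The random vectors stand in for frozen CLIP/SigLIP image features.)

import numpy as np

rng = np.random.default_rng(0)
num_classes, shots, dim = 5, 4, 16

# Simulated class-conditional embeddings: a few labelled "support" examples per class.
class_means = rng.normal(size=(num_classes, dim))
support = class_means[:, None, :] + 0.3 * rng.normal(size=(num_classes, shots, dim))
query = class_means + 0.3 * rng.normal(size=(num_classes, dim))  # one query per class

prototypes = support.mean(axis=1)                                 # (num_classes, dim)
prototypes /= np.linalg.norm(prototypes, axis=1, keepdims=True)
query /= np.linalg.norm(query, axis=1, keepdims=True)

predictions = (query @ prototypes.T).argmax(axis=1)               # cosine-similarity match
accuracy = (predictions == np.arange(num_classes)).mean()
print(f"few-shot accuracy on the toy data: {accuracy:.2f}")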