Alaa El-Nouby
alaaelnouby.bsky.social
Research Scientist at @Apple. Previous: @Meta (FAIR), @Inria, @MSFTResearch, @VectorInst and @UofG . Egyptian 🇪🇬
The open-sourced AIMv2 checkpoints support several fixed resolutions (224px, 336px, and 448px), as well as a native-resolution checkpoint that accepts images of variable resolutions and aspect ratios.
November 22, 2024 at 8:32 AM
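A quick sketch of what variable-resolution support means in ViT-style terms: the number of patch tokens grows with the image size. The 14px patch size below is an assumption inferred from the checkpoint names (e.g. "patch14"), and `patch_grid` is a hypothetical helper, not part of the released code.

```python
# Sketch: patch-token counts for fixed vs. native-resolution inputs,
# assuming non-overlapping 14px patches (an assumption, not confirmed
# AIMv2 preprocessing).

def patch_grid(height: int, width: int, patch: int = 14) -> tuple[int, int]:
    """Patches along each axis; the image is assumed pre-resized so both
    sides are multiples of the patch size."""
    assert height % patch == 0 and width % patch == 0
    return height // patch, width // patch

# Fixed-resolution checkpoints produce square grids.
for res in (224, 336, 448):
    rows, cols = patch_grid(res, res)
    print(f"{res}px -> {rows}x{cols} = {rows * cols} patches")

# The native-resolution checkpoint also handles non-square aspect ratios.
rows, cols = patch_grid(448, 224)
print(f"448x224 -> {rows}x{cols} = {rows * cols} patches")
```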
AIMv2 delivers strong off-the-shelf recognition performance, with AIMv2-3B achieving 89.5% on ImageNet with a frozen trunk. We also observe consistent performance improvements as AIMv2's parameters are scaled up (see Section 3 of the preprint).
AIMv2 is pre-trained in a manner similar to modern VLMs; therefore, it integrates seamlessly with them, with even our smallest backbone (AIMv2-L) outperforming popular backbones such as OpenAI CLIP and SigLIP on multimodal understanding benchmarks.
AIMv2 is pre-trained to autoregressively generate image patches and text tokens. It is easy to implement and train, and it scales trivially to billions of parameters. We are sharing checkpoints ranging from 300M to 3B parameters, available in PyTorch, JAX, and MLX on 🤗
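A toy sketch of the autoregressive image objective described above: raster-order the patches of an image, then predict each patch from the ones before it. The "predictor" here (prefix mean) is a placeholder standing in for the transformer, and the patch size is an assumption; AIMv2 also generates text tokens, which this sketch omits.

```python
import numpy as np

# Toy next-patch-prediction objective: patchify an image in raster order
# and regress each patch from its prefix. The prefix-mean predictor is a
# stand-in for a causal transformer (an illustrative assumption).
rng = np.random.default_rng(0)
p = 14                                       # assumed patch size
image = rng.standard_normal((28, 28, 3))     # toy 2x2-patch image

# Patchify: (rows, p, cols, p, 3) -> (num_patches, patch_dim)
rows, cols = image.shape[0] // p, image.shape[1] // p
patches = image.reshape(rows, p, cols, p, 3).transpose(0, 2, 1, 3, 4)
patches = patches.reshape(rows * cols, p * p * 3)

# Autoregressive loss: predict patch t from patches < t (causal prefix).
loss = 0.0
for t in range(1, len(patches)):
    pred = patches[:t].mean(axis=0)            # placeholder causal predictor
    loss += np.mean((pred - patches[t]) ** 2)  # pixel-level MSE per patch
loss /= len(patches) - 1
print(f"{len(patches)} patches, mean next-patch MSE: {loss:.3f}")
```

The appeal of this setup is its simplicity: one causal sequence over patches (and text tokens), one prediction loss, no contrastive pairs or teacher networks, which is what makes it straightforward to scale.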
𝗗𝗼𝗲𝘀 𝗮𝘂𝘁𝗼𝗿𝗲𝗴𝗿𝗲𝘀𝘀𝗶𝘃𝗲 𝗽𝗿𝗲-𝘁𝗿𝗮𝗶𝗻𝗶𝗻𝗴 𝘄𝗼𝗿𝗸 𝗳𝗼𝗿 𝘃𝗶𝘀𝗶𝗼𝗻? 🤔
Delighted to share AIMv2, a family of strong, scalable, and open vision encoders that excel at multimodal understanding, recognition, and grounding 🧵

paper: arxiv.org/abs/2411.14402
code: github.com/apple/ml-aim
HF: huggingface.co/collections/...