Lightnews — Scholar-powered news

Reposted by Alexander Kolesnikov

Andreas Steiner

@andreaspsteiner.bsky.social

Looking for a small- or medium-sized VLM? PaliGemma 2 spans more than 150x of compute!

Not sure yet if you want to invest the time 🪄finetuning🪄 on your data? Give it a try with our ready-to-use "mix" checkpoints:

🤗 huggingface.co/blog/paligem...
🎤 developers.googleblog.com/en/introduci...

February 19, 2025 at 5:47 PM

Alexander Kolesnikov

@kolesnikov.ch

With some delay, JetFormer's *prequel* paper is finally out on arXiv: a radically simple ViT-based normalizing flow (NF) model that achieves SOTA results in its class.

Jet is one of the key components of JetFormer, deserving a standalone report. Let's unpack: 🧵⬇️

December 20, 2024 at 2:39 PM

Alexander Kolesnikov

@kolesnikov.ch

PaliGemma 2 is out! Bigger models, better results. For the best experience, do not forget to finetune.

Congrats, PaliGemma 2 team!

Andreas Steiner @andreaspsteiner.bsky.social · Dec 5

🚀🚀PaliGemma 2 is our updated and improved PaliGemma release using the Gemma 2 models and providing new pre-trained checkpoints for the full cross product of {224px,448px,896px} resolutions and {3B,10B,28B} model sizes.

1/7

December 5, 2024 at 6:28 PM

Alexander Kolesnikov

@kolesnikov.ch

OK, it is yesterday's news already, but a good night's sleep is important.

After 7 amazing years at Google Brain/DM, I am joining OpenAI. Together with @xzhai.bsky.social and @giffmana.ai, we will establish the OpenAI Zurich office. Proud of our past work and looking forward to the future.

December 4, 2024 at 9:14 AM

Reposted by Alexander Kolesnikov

Sander Dieleman

@sedielem.bsky.social

In arxiv.org/abs/2303.00848, @dpkingma.bsky.social and @ruiqigao.bsky.social had suggested that noise augmentation could be used to make other likelihood-based models optimise perceptually weighted losses, like diffusion models do. So cool to see this working well in practice!

December 2, 2024 at 6:36 PM

Alexander Kolesnikov

@kolesnikov.ch

The answer has just dropped: bsky.app/profile/kole...

Jia-Bin Huang @jbhuang0604.bsky.social · Dec 1

2021: Replace every CNN with a Transformer

2022: Replace every GAN with diffusion models

2023: Replace every NeRF with 3DGS

2024: Replace every diffusion model with Flow Matching

2025: ???

December 2, 2024 at 7:00 PM

Alexander Kolesnikov

@kolesnikov.ch

I always dreamed of a model that simultaneously

1. optimizes NLL of raw pixel data,
2. generates competitive high-res. natural images,
3. is practical.

But it seemed too good to be true. Until today!

Our new JetFormer model (arxiv.org/abs/2411.19722) ticks all of these boxes.

🧵

December 2, 2024 at 5:19 PM
