Alexander Kolesnikov
@kolesnikov.ch
Reposted by Alexander Kolesnikov
Looking for a small or medium sized VLM? PaliGemma 2 spans more than 150x of compute!
Not sure yet if you want to invest the time 🪄finetuning🪄 on your data? Give it a try with our ready-to-use "mix" checkpoints:
🤗 huggingface.co/blog/paligem...
🎤 developers.googleblog.com/en/introduci...
Not sure yet if you want to invest the time 🪄finetuning🪄 on your data? Give it a try with our ready-to-use "mix" checkpoints:
🤗 huggingface.co/blog/paligem...
🎤 developers.googleblog.com/en/introduci...
February 19, 2025 at 5:47 PM
Looking for a small or medium sized VLM? PaliGemma 2 spans more than 150x of compute!
Not sure yet if you want to invest the time 🪄finetuning🪄 on your data? Give it a try with our ready-to-use "mix" checkpoints:
🤗 huggingface.co/blog/paligem...
🎤 developers.googleblog.com/en/introduci...
Not sure yet if you want to invest the time 🪄finetuning🪄 on your data? Give it a try with our ready-to-use "mix" checkpoints:
🤗 huggingface.co/blog/paligem...
🎤 developers.googleblog.com/en/introduci...
With some delay, JetFormer's *prequel* paper is finally out on arXiv: a radically simple ViT-based normalizing flow (NF) model that achieves SOTA results in its class.
Jet is one of the key components of JetFormer, deserving a standalone report. Let's unpack: 🧵⬇️
Jet is one of the key components of JetFormer, deserving a standalone report. Let's unpack: 🧵⬇️
December 20, 2024 at 2:39 PM
With some delay, JetFormer's *prequel* paper is finally out on arXiv: a radically simple ViT-based normalizing flow (NF) model that achieves SOTA results in its class.
Jet is one of the key components of JetFormer, deserving a standalone report. Let's unpack: 🧵⬇️
Jet is one of the key components of JetFormer, deserving a standalone report. Let's unpack: 🧵⬇️
Paligemma2 is out! Bigger models, better results. For the best experience, do not forget to finetune.
Congrats Paligemma2 team!
Congrats Paligemma2 team!
🚀🚀PaliGemma 2 is our updated and improved PaliGemma release using the Gemma 2 models and providing new pre-trained checkpoints for the full cross product of {224px,448px,896px} resolutions and {3B,10B,28B} model sizes.
1/7
1/7
December 5, 2024 at 6:28 PM
Paligemma2 is out! Bigger models, better results. For the best experience, do not forget to finetune.
Congrats Paligemma2 team!
Congrats Paligemma2 team!
Ok, it is yesterdays news already, but good night sleep is important.
After 7 amazing years at Google Brain/DM, I am joining OpenAI. Together with @xzhai.bsky.social and @giffmana.ai, we will establish OpenAI Zurich office. Proud of our past work and looking forward to the future.
After 7 amazing years at Google Brain/DM, I am joining OpenAI. Together with @xzhai.bsky.social and @giffmana.ai, we will establish OpenAI Zurich office. Proud of our past work and looking forward to the future.
December 4, 2024 at 9:14 AM
Ok, it is yesterdays news already, but good night sleep is important.
After 7 amazing years at Google Brain/DM, I am joining OpenAI. Together with @xzhai.bsky.social and @giffmana.ai, we will establish OpenAI Zurich office. Proud of our past work and looking forward to the future.
After 7 amazing years at Google Brain/DM, I am joining OpenAI. Together with @xzhai.bsky.social and @giffmana.ai, we will establish OpenAI Zurich office. Proud of our past work and looking forward to the future.
Reposted by Alexander Kolesnikov
In arxiv.org/abs/2303.00848, @dpkingma.bsky.social and @ruiqigao.bsky.social had suggested that noise augmentation could be used to make other likelihood-based models optimise perceptually weighted losses, like diffusion models do. So cool to see this working well in practice!
December 2, 2024 at 6:36 PM
In arxiv.org/abs/2303.00848, @dpkingma.bsky.social and @ruiqigao.bsky.social had suggested that noise augmentation could be used to make other likelihood-based models optimise perceptually weighted losses, like diffusion models do. So cool to see this working well in practice!
The answer has just dropped: bsky.app/profile/kole...
2021: Replace every CNN with a Transformer
2022: Replace every GAN with diffusion models
2023: Replace every NeRF with 3DGS
2024: Replace every diffusion model with Flow Matching
2025: ???
2022: Replace every GAN with diffusion models
2023: Replace every NeRF with 3DGS
2024: Replace every diffusion model with Flow Matching
2025: ???
December 2, 2024 at 7:00 PM
The answer has just dropped: bsky.app/profile/kole...
I always dreamed of a model that simultaneously
1. optimizes NLL of raw pixel data,
2. generates competitive high-res. natural images,
3. is practical.
But it seemed too good to be true. Until today!
Our new JetFormer model (arxiv.org/abs/2411.19722) ticks on all of these.
🧵
1. optimizes NLL of raw pixel data,
2. generates competitive high-res. natural images,
3. is practical.
But it seemed too good to be true. Until today!
Our new JetFormer model (arxiv.org/abs/2411.19722) ticks on all of these.
🧵
December 2, 2024 at 5:19 PM
I always dreamed of a model that simultaneously
1. optimizes NLL of raw pixel data,
2. generates competitive high-res. natural images,
3. is practical.
But it seemed too good to be true. Until today!
Our new JetFormer model (arxiv.org/abs/2411.19722) ticks on all of these.
🧵
1. optimizes NLL of raw pixel data,
2. generates competitive high-res. natural images,
3. is practical.
But it seemed too good to be true. Until today!
Our new JetFormer model (arxiv.org/abs/2411.19722) ticks on all of these.
🧵