Kolja Bauer
koljabauer.bsky.social
ELLIS PhD Student in Generative AI @ Ommer Lab (Stable Diffusion)
Reposted by Kolja Bauer
I’m thrilled to share that I’ll present two first-authored papers at #ICCV2025 🌺 in Honolulu together with @mgui7.bsky.social ! 🏝️
(Thread 🧵👇)
October 18, 2025 at 3:01 AM
Reposted by Kolja Bauer
🤔 What happens when you poke a scene — and your model has to predict how the world moves in response?

We built the Flow Poke Transformer (FPT) to model multi-modal scene dynamics from sparse interactions.

It learns to predict the 𝘥𝘪𝘴𝘵𝘳𝘪𝘣𝘶𝘵𝘪𝘰𝘯 of motion itself 🧵👇
October 15, 2025 at 1:56 AM
Reposted by Kolja Bauer
🤔When combining Vision-language models (VLMs) with Large language models (LLMs), do VLMs benefit from additional genuine semantics or artificial augmentations of the text for downstream tasks?

🤨Interested? Check out our latest work at #AAAI25:

💻Code and 📝Paper at: github.com/CompVis/DisCLIP

🧵👇
January 8, 2025 at 3:54 PM
In order to extract features from diffusion models, you have to noise your input and tune the noise level for each downstream task. But isn't there a better way? 🤔

Turns out there is, using our newly proposed feature extraction method CleanDIFT 🧹🚀

Check it out ⬇️
🤔 Why do we extract diffusion features from noisy images? Isn’t that destroying information?

Yes, it is - but we found a way to do better. 🚀

Here’s how we unlock better features, no noise, no hassle.

📝 Project Page: compvis.github.io/cleandift
💻 Code: github.com/CompVis/clea...

🧵👇
December 5, 2024 at 7:58 AM
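
For readers unfamiliar with the noising step the post refers to, here is a minimal sketch. It assumes a standard DDPM-style forward process with a linear beta schedule; it is not taken from the CleanDIFT code. The function names, the schedule values, and the toy input are illustrative assumptions. It shows how the chosen timestep t controls how much of the original input survives, which is the noise level that would otherwise need tuning per downstream task.

```python
import math
import random

def linear_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule.

    The schedule endpoints are common defaults, assumed here for illustration.
    """
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar

def add_noise(x0, t, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    signal = math.sqrt(alpha_bar[t])
    noise = math.sqrt(1.0 - alpha_bar[t])
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in x0]

alpha_bar = linear_alpha_bar()
rng = random.Random(0)
pixels = [0.5, -0.25, 1.0, 0.0]  # toy "image" with values in [-1, 1]

for t in (0, 250, 999):
    noisy = add_noise(pixels, t, alpha_bar, rng)
    # Fraction of the clean signal's amplitude that is kept at timestep t.
    print(t, round(math.sqrt(alpha_bar[t]), 3))
```

At small t the input is nearly untouched, while at large t it is almost pure noise; picking a t in between is the per-task choice the post describes.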
Reposted by Kolja Bauer
After many years, our lab finally has a social media presence at @compvis.bsky.social ! 🥳
Give it a follow, we have some amazing research on generative computer vision coming soon!
November 20, 2024 at 6:31 PM