CompVis - Computer Vision and Learning LMU Munich
compvis.bsky.social
Computer Vision and Learning research group @ LMU Munich, headed by Björn Ommer.
Generative Vision (Stable Diffusion, VQGAN) & Representation Learning
🌐 https://ommer-lab.com
Pinned
Excited to share that we'll be presenting four papers in the main conference at ICCV 2025 this week!

Come say hi in Honolulu!

👋 Pingchuan, Ming, Felix, Stefan, Timy, and Björn Ommer will be attending.
October 19, 2025 at 6:06 PM
Reposted by CompVis - Computer Vision and Learning LMU Munich
🎉 From @elsa-ai.eu: 15 new members join the European Lighthouse on Secure & Safe AI—expanding reach across Europe and deepening ties with the @ellis.eu ecosystem.

Everything you need to know 👉 elsa-ai.eu/elsa-welcome...
October 17, 2025 at 7:05 AM
Fascinating approach — encoding an entire image into a single continuous latent token via self-supervised representation learning.
RepTok 🦎 highlights how compact generative representations can retain both realism and semantic structure.
🤔 What if you could generate an entire image using just one continuous token?

💡 It works if we leverage a self-supervised representation!

Meet RepTok🦎: A generative model that encodes an image into a single continuous latent while keeping realism and semantics. 🧵 👇
October 17, 2025 at 11:59 AM
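A toy numpy sketch of the single-token idea behind RepTok: one continuous latent vector per image, decoded back to pixel space. The linear maps and all sizes here are illustrative assumptions, not RepTok's actual learned encoder/decoder.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, C, d = 16, 16, 3, 32  # tiny image; d is the single-token dimension

# Hypothetical linear stand-ins for RepTok's learned networks.
W_enc = rng.normal(0, 0.01, (d, H * W * C))
W_dec = rng.normal(0, 0.01, (H * W * C, d))

def encode(img):
    """Map an entire image to ONE continuous latent token of size d."""
    return W_enc @ img.reshape(-1)

def decode(z):
    """Reconstruct an image from the single token."""
    return (W_dec @ z).reshape(H, W, C)

img = rng.random((H, W, C))
z = encode(img)        # shape (d,): the whole image as one token
recon = decode(z)      # shape (H, W, C)
```

The point the post makes is that a self-supervised representation makes such an extreme bottleneck viable; a plain linear map like this would not preserve realism on its own.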
Reposted by CompVis - Computer Vision and Learning LMU Munich
🤔 What happens when you poke a scene — and your model has to predict how the world moves in response?

We built the Flow Poke Transformer (FPT) to model multi-modal scene dynamics from sparse interactions.

It learns to predict the 𝘥𝘪𝘴𝘵𝘳𝘪𝘣𝘶𝘵𝘪𝘰𝘯 of motion itself 🧵👇
October 15, 2025 at 1:56 AM
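Predicting a distribution of motion, rather than a single flow, can be sketched with a small Gaussian mixture: given a poke location and vector, return several weighted motion modes and sample from them. Everything below (the mode construction, the fixed covariances) is a hypothetical stand-in, not FPT's actual transformer.

```python
import numpy as np

def predict_flow_distribution(poke_xy, poke_vec, n_modes=3):
    """Toy stand-in for FPT: return a mixture over 2D flow vectors.

    A real model would condition on the image and the sparse pokes;
    here the modes are just scaled copies of the poke vector.
    """
    weights = np.full(n_modes, 1.0 / n_modes)
    means = poke_vec * np.linspace(0.5, 1.5, n_modes)[:, None]  # (n_modes, 2)
    covs = np.stack([np.eye(2) * 0.1] * n_modes)                # (n_modes, 2, 2)
    return weights, means, covs

def sample_flow(weights, means, covs, rng):
    """Draw one plausible motion from the predicted mixture."""
    k = rng.choice(len(weights), p=weights)
    return rng.multivariate_normal(means[k], covs[k])
```

Modeling the distribution (instead of regressing one answer) is what lets the model express that the same poke can move a scene in several plausible ways.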
Reposted by CompVis - Computer Vision and Learning LMU Munich
𝗖𝗮𝗹𝗹 𝗳𝗼𝗿 𝗳𝘂𝗹𝗹𝘆 𝗳𝘂𝗻𝗱𝗲𝗱 𝗣𝗵𝗗 𝗣𝗼𝘀𝗶𝘁𝗶𝗼𝗻𝘀: We are offering several PhD positions across our various research areas, open to highly qualified candidates.
‼️ The application portal will be open from 15 October to 14 November 2025.

Find out more: mcml.ai/opportunitie...
October 10, 2025 at 6:57 AM
Reposted by CompVis - Computer Vision and Learning LMU Munich
🎧 ELLIOT on the airwaves!

How do we build open and trustworthy AI in Europe?

🎙️ In a recent radio interview, Luk Overmeire from VRT shared insights on ELLIOT, #FoundationModels and the role of public broadcasters in shaping human-centred AI.

📻 Interview in Dutch: mimir.mjoll.no/shares/JRqlO...
July 3, 2025 at 10:39 AM
Reposted by CompVis - Computer Vision and Learning LMU Munich
"What makes us human in an AI-shaped world?" — At #MCML Munich AI Day 2025, Neil Lawrence explored this question, reminding us of the indivisible human core machines can't replicate.

Björn Ommer followed with insights into how GenAI is commodifying intelligence and reshaping how we use computers.
July 8, 2025 at 12:25 PM
Reposted by CompVis - Computer Vision and Learning LMU Munich
🎉 The ELLIOT project Kick-off Meeting was successfully hosted by CERTH-ITI in Thessaloniki! 🏛️

30 partners from 12 countries 🌍 launched this exciting journey to advance open, trustworthy AI and #FoundationModels across Europe. 🤖

Stay tuned for more updates on #AIresearch and #TrustworthyAI! 💡
July 11, 2025 at 7:09 AM
Reposted by CompVis - Computer Vision and Learning LMU Munich
🧹 CleanDIFT: Diffusion Features without Noise
@rmsnorm.bsky.social*, @stefanabaumann.bsky.social*, @koljabauer.bsky.social*, @frankfundel.bsky.social, Björn Ommer
Oral Session 1C (Davidson Ballroom): Friday 9:00
Poster Session 1 (ExHall D): Friday 10:30-12:30, # 218
compvis.github.io/cleandift/
June 9, 2025 at 7:58 AM
Reposted by CompVis - Computer Vision and Learning LMU Munich
🎉 Excited to share that our lab has three papers accepted at CVPR 2025!

Come say hi in Nashville!
👋 Johannes, Ming, Kolja, Stefan, and Björn will be attending.
June 9, 2025 at 7:28 AM
Reposted by CompVis - Computer Vision and Learning LMU Munich
📢 ELLIOT is coming! A €25M #HorizonEurope project to develop open, trustworthy Multimodal Generalist Foundation Models, #MGFM, for real-world applications. Starting July, it brings 30 partners from 12 countries to shape Europe’s #AI future.

🔍 Follow for updates on #OpenScience & #FoundationModels.
June 12, 2025 at 7:35 AM
Reposted by CompVis - Computer Vision and Learning LMU Munich
If you are interested, check out the paper (arxiv.org/abs/2506.02221) or stop by our poster at CVPR:

📌 Poster Session 6, Sunday 4:00 to 6:00 PM, Poster #208
Diff2Flow: Training Flow Matching Models via Diffusion Model Alignment
June 6, 2025 at 3:48 PM
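The connection Diff2Flow builds on can be sketched as a reparameterization: a diffusion model's noise prediction can be converted into a flow-matching velocity. This assumes the standard parameterization x_t = alpha_t·x0 + sigma_t·eps and the rectified-flow target v = eps − x0; the paper's actual alignment procedure may differ in its details.

```python
import numpy as np

def eps_to_velocity(x_t, eps_pred, alpha_t, sigma_t):
    """Convert a diffusion noise prediction into a flow-matching velocity.

    Assumes x_t = alpha_t * x0 + sigma_t * eps, so the clean sample is
    recoverable as x0 = (x_t - sigma_t * eps) / alpha_t, and the
    rectified-flow path x_t = (1 - t) * x0 + t * eps has velocity
    v = d x_t / dt = eps - x0.  Illustrative sketch only.
    """
    x0_pred = (x_t - sigma_t * eps_pred) / alpha_t
    return eps_pred - x0_pred
```

At alpha_t = 1, sigma_t = 0 the predicted clean sample is x_t itself, so the velocity reduces to eps − x_t, which is a quick sanity check on the formula.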
Reposted by CompVis - Computer Vision and Learning LMU Munich
Grand Opening of the AI-HUB@LMU. The AI-HUB@LMU is a platform that for the first time unites all 18 faculties of the #LMU as a joint scientific community.

📅January 29, 2025, 6:00 PM
📍 Große Aula, LMU Munich
Full program here: www.ai-news.lmu.de/grand-openin...
January 20, 2025 at 11:59 AM
Reposted by CompVis - Computer Vision and Learning LMU Munich
Attending my first corporate-sponsored business conference: there’s a live band playing between talks to keep the energy up.

Meanwhile, academic conferences are struggling to afford coffee breaks. Want this for EPSA!

@compvis.bsky.social
January 16, 2025 at 9:27 AM
Reposted by CompVis - Computer Vision and Learning LMU Munich
🤔 When combining vision-language models (VLMs) with large language models (LLMs), do VLMs benefit from additional genuine semantics or artificial augmentations of the text for downstream tasks?

🤨Interested? Check out our latest work at #AAAI25:

💻Code and 📝Paper at: github.com/CompVis/DisCLIP

🧵👇
January 8, 2025 at 3:54 PM
Reposted by CompVis - Computer Vision and Learning LMU Munich
Did you know you can distill the capabilities of a large diffusion model into a small ViT? ⚗️
We showed exactly that for a fundamental task:
semantic correspondence📍

A thread 🧵👇
December 6, 2024 at 2:35 PM
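Once diffusion features are distilled into a small ViT, semantic correspondence reduces to nearest-neighbor matching in feature space. A minimal numpy sketch of that matching step (the feature extractor itself is assumed; this is not the paper's distillation procedure):

```python
import numpy as np

def correspond(feat_a, feat_b):
    """Match each patch in image A to its most similar patch in image B
    by cosine similarity of (distilled) features.

    feat_a, feat_b: (num_patches, dim) arrays of patch features.
    Returns, for each patch in A, the index of its match in B.
    """
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    return np.argmax(a @ b.T, axis=1)
```

With good features, semantically matching parts (e.g. left eye to left eye) end up nearest in this space; the distillation makes a small ViT produce such features cheaply.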
Reposted by CompVis - Computer Vision and Learning LMU Munich
🤔 Why do we extract diffusion features from noisy images? Isn’t that destroying information?

Yes, it is - but we found a way to do better. 🚀

Here’s how we unlock better features, no noise, no hassle.

📝 Project Page: compvis.github.io/cleandift
💻 Code: github.com/CompVis/clea...

🧵👇
December 4, 2024 at 11:31 PM
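The contrast the CleanDIFT post draws can be sketched as two extraction modes: the standard DIFT-style recipe noises the image before running the diffusion backbone, while the CleanDIFT idea is a model adapted to consume the clean image directly. The dummy feature function and noise schedule below are illustrative assumptions, not the actual U-Net.

```python
import numpy as np

rng = np.random.default_rng(0)

def unet_features(x, t):
    """Dummy stand-in for an intermediate diffusion U-Net activation."""
    return np.tanh(x * (1.0 - t))

def noisy_diffusion_features(x, t=0.3):
    """Standard DIFT-style extraction: noise the image first,
    which destroys some of the information in x."""
    noise = rng.normal(0, 1, x.shape)
    x_t = np.sqrt(1 - t) * x + np.sqrt(t) * noise
    return unet_features(x_t, t)

def clean_features(x):
    """CleanDIFT-style idea: a model adapted to take the clean image,
    yielding noise-free, timestep-independent features."""
    return unet_features(x, 0.0)
```

The practical upside is determinism: the clean path returns the same features on every call, with no noise sampling and no timestep to tune.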