Lightnews — Scholar-powered news

Reposted by Ioannis Kakogeorgiou

Thodoris Kouzelis

@nicolabourbaki.bsky.social

1/n Introducing ReDi (Representation Diffusion): a new generative approach that leverages a diffusion model to jointly capture:
– Low-level image details (via VAE latents)
– High-level semantic features (via DINOv2) 🧵

April 25, 2025 at 7:23 AM

Reposted by Ioannis Kakogeorgiou

sta8is.bsky.social

@sta8is.bsky.social

🧵 Excited to share our latest work: FUTURIST, a unified transformer architecture for multimodal semantic future prediction, has been accepted to #CVPR2025! Here's how it works (1/n)
👇 Links to the arXiv and GitHub below

February 26, 2025 at 7:57 PM

Reposted by Ioannis Kakogeorgiou

Dmytro Mishkin

@ducha-aiki.bsky.social

ILIAS: Instance-Level Image retrieval At Scale

@gkordo.bsky.social, Vladan Stojnić, @annetka.bsky.social, Pavel Šuma, Nikolaos-Antonios Ypsilantis, @nikos-efth.bsky.social, Zakaria Laskar, Jiří Matas, Ondřej Chum, @gtolias.bsky.social

tl;dr: SigLIP rules. Lots of ablations
arxiv.org/abs/2502.11748
1/

February 24, 2025 at 10:03 AM

Reposted by Ioannis Kakogeorgiou

Andrei Bursuc

@abursuc.bsky.social

EQ-VAE: Such a simple & cool trick to regularize multiple kinds of autoencoders: align reconstruction of transformed latents w/ the corresponding transformed inputs.
🚀REPA: 4x training speedup
🚀MaskGIT: 2x training speedup
🚀DiT-XL/2: 7x faster convergence

Kudos @nicolabourbaki.bsky.social et al.

Thodoris Kouzelis @nicolabourbaki.bsky.social · Feb 18

1/n 🚀 If you’re working on generative image modeling, check out our latest work! We introduce EQ-VAE, a simple yet powerful regularization approach that makes latent representations equivariant to spatial transformations, leading to smoother latents and better generative models. 👇

February 21, 2025 at 10:54 PM

Reposted by Ioannis Kakogeorgiou

Thodoris Kouzelis

@nicolabourbaki.bsky.social

1/n 🚀 If you’re working on generative image modeling, check out our latest work! We introduce EQ-VAE, a simple yet powerful regularization approach that makes latent representations equivariant to spatial transformations, leading to smoother latents and better generative models. 👇

February 18, 2025 at 2:27 PM

Reposted by Ioannis Kakogeorgiou

Giorgos Tolias

@gtolias.bsky.social

For PhD and MSc students interested in a research visit to Prague/VRG in 2025: we're open to hosting short-term collaborations or internships on a range of computer vision topics. If this sounds exciting, reach out by e-mail! We'd love to discuss potential projects. Some examples 🧵
#Internship #CV

February 12, 2025 at 8:26 AM

Reposted by Ioannis Kakogeorgiou

sta8is.bsky.social

@sta8is.bsky.social

1/n 🚀 Excited to share our latest work: DINO-Foresight, a new framework for predicting the future states of scenes using Vision Foundation Model features!
Links to the arXiv and GitHub 👇

February 7, 2025 at 5:06 PM
