🔗 https://users.iit.demokritos.gr/~i.kakogeorgiou/
– Low-level image details (via VAE latents)
– High-level semantic features (via DINOv2)🧵
👇 Links to the arXiv and GitHub below
@gkordo.bsky.social, Vladan Stojnić, @annetka.bsky.social, Pavel Šuma, Nikolaos-Antonios Ypsilantis, @nikos-efth.bsky.social, Zakaria Laskar, Jiří Matas, Ondřej Chum, @gtolias.bsky.social
tl;dr: SigLIP rules. Lots of ablations
arxiv.org/abs/2502.11748
1/
🚀REPA: 4x training speedup
🚀MaskGIT: 2x training speedup
🚀DiT-XL/2: 7x faster convergence
Kudos @nicolabourbaki.bsky.social et al.
#Internship #CV
Links to the arXiv and GitHub 👇