Paul Couairon
paulcouairon.bsky.social
PhD student at Sorbonne University
Big thanks to @loickch.bsky.social and @louisserrano.bsky.social for this amazing collaboration!
June 16, 2025 at 1:59 PM
Despite the absence of high-resolution ground truth features, we find that JAFAR, trained at low upsampling ratios and resolutions, generalizes remarkably well to significantly higher output scales (4/n)
June 16, 2025 at 1:59 PM
Given an image, JAFAR builds high-res queries at the target resolution and low-res, semantically enriched keys using spatial feature modulation. Together they power a cross-resolution attention mechanism that interpolates the low-resolution features from the foundation vision encoder (3/n)
June 16, 2025 at 1:59 PM
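To make the cross-resolution attention in (3/n) concrete, here is a minimal sketch in plain Python. It is a generic scaled dot-product cross-attention, not JAFAR's actual code: the function names (`softmax`, `cross_resolution_attention`) and the flat list-of-vectors layout are assumptions for illustration, and how the queries and keys are built (including the spatial feature modulation) is left out. What it shows is the core idea: each high-res query position outputs a weighted average of the low-res encoder features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_resolution_attention(queries, keys, values):
    """Hypothetical helper: interpolate low-res features with attention.

    queries: one vector of size d per high-res (target) position
    keys:    one vector of size d per low-res position
    values:  one feature vector of size c per low-res position
    Returns one feature vector of size c per high-res position.
    """
    d = len(keys[0])
    scale = 1.0 / math.sqrt(d)
    channels = len(values[0])
    upsampled = []
    for q in queries:
        # Similarity between this high-res query and every low-res key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Output is a convex combination of the low-res feature vectors.
        upsampled.append(
            [sum(w * v[c] for w, v in zip(weights, values)) for c in range(channels)]
        )
    return upsampled


# Toy example: 2 low-res positions upsampled to 4 high-res positions.
lowres_keys = [[1.0, 0.0], [0.0, 1.0]]
lowres_features = [[1.0, 0.0, 2.0], [0.0, 1.0, 4.0]]
highres_queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
highres_features = cross_resolution_attention(
    highres_queries, lowres_keys, lowres_features
)
```

Because the attention weights at each query sum to one, every output stays within the range of the low-res features. Nothing in the function depends on the number of queries, so the same weights can be applied to a denser query grid.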
Foundation vision encoders produce rich, semantically meaningful features, but at low spatial resolution, so dense vision tasks require feature upsampling. JAFAR tackles this in a single step, without relying on any downstream task supervision (2/n)
June 16, 2025 at 1:59 PM