👉Language-image pretraining with CLIP or SigLIP is widely used due to strong zero-shot transfer, but these models fall short when adapting from just a few examples at test time.

Why? They do not explicitly train for that!
We find a surrogate objective to optimize for: context-aware language-image pretraining (LIxP).

We teach models what to expect at test time in few-shot scenarios.

LIxP also maintains the strong zero-shot transfer of CLIP and SigLIP backbones across model sizes (S to L) and data (up to 15B), enables up to 4x sample efficiency at test time, and delivers up to +16% performance gains!
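The thread doesn't spell out the LIxP objective itself, so below is only a rough, hypothetical sketch of the general idea: keep a standard CLIP-style contrastive loss, but additionally contrast embeddings that have attended over a batch "context", so that metric-based few-shot use at test time is no longer out of distribution. The cross-attention context step, the `context_aware_loss` name, and the mixing weight `alpha` are all illustrative assumptions, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

def clip_loss(img, txt, temperature=0.07):
    """Standard CLIP-style symmetric InfoNCE over a batch of (image, text) pairs."""
    img = F.normalize(img, dim=-1)
    txt = F.normalize(txt, dim=-1)
    logits = img @ txt.t() / temperature
    targets = torch.arange(img.size(0), device=img.device)
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

def context_aware_loss(img, txt, attn, alpha=0.5):
    """Hypothetical surrogate objective: also contrast image embeddings
    that have attended over the rest of the batch (their 'context'),
    mimicking the support set available at test time. `attn` and `alpha`
    are illustrative assumptions, not LIxP's design."""
    x = img.unsqueeze(0)        # (1, B, D): treat the batch as one sequence
    ctx, _ = attn(x, x, x)      # each image attends to the whole batch
    ctx = ctx.squeeze(0)        # back to (B, D)
    return (1.0 - alpha) * clip_loss(img, txt) + alpha * clip_loss(ctx, txt)

# Toy usage with random features standing in for encoder outputs.
B, D = 8, 512
attn = torch.nn.MultiheadAttention(D, num_heads=8, batch_first=True)
img_feats, txt_feats = torch.randn(B, D), torch.randn(B, D)
print(context_aware_loss(img_feats, txt_feats, attn).item())
```

Note that pushing `alpha` toward 0 recovers plain CLIP training, which would be one way such an objective could preserve the zero-shot behavior the thread highlights.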