Lightnews — Scholar-powered news

Volkan Cevher

@cevherlions.bsky.social

Associate Professor of Electrical Engineering, EPFL.
Amazon Scholar (AGI Foundations). IEEE Fellow. ELLIS Fellow.

Volkan Cevher

@cevherlions.bsky.social

🧑‍🍳 We provide a complete cookbook for choosing the right LMO for your architecture: 📚
- Input layers (1-hot vs image)
- Hidden layers (spectral norms)
- Output layers (flexible norm choices)
All with explicit formulas and guidance for when to use each one.

February 13, 2025 at 4:51 PM

Volkan Cevher

@cevherlions.bsky.social

🌟 It turns out many popular optimizers (SignSGD, Muon, etc.) are special cases of our framework - just with different norm choices.
Our unified analysis reveals deep connections between seemingly different approaches and provides new insights into why they work 🤔
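
To illustrate the "different norm choices" point, here is a minimal sketch (not code from the preprint) of how the LMO over an ℓ∞ ball yields a sign-based step of the kind SignSGD takes. The function names `sign_lmo` and `lmo_step`, the `radius` parameter, and the toy numbers are assumptions made for illustration.

```python
def sign_lmo(grad, radius=1.0):
    """LMO over the l-infinity ball of the given radius.

    Returns the s with max|s_i| <= radius that minimizes sum(g_i * s_i),
    which is -radius * sign(g_i) coordinate-wise.
    """
    return [-radius * ((g > 0) - (g < 0)) for g in grad]


def lmo_step(params, grad, lr, radius=1.0):
    """Move along the LMO direction: with the sign LMO this is a sign-descent step."""
    direction = sign_lmo(grad, radius)
    return [p + lr * d for p, d in zip(params, direction)]


# Toy example: every coordinate with a nonzero gradient moves by exactly lr,
# regardless of the gradient's magnitude.
params = [0.5, -1.0, 2.0]
grad = [0.3, -4.0, 0.0]
new_params = lmo_step(params, grad, lr=0.1)
```

Swapping `sign_lmo` for an LMO over a different norm ball changes the update rule while leaving the step itself untouched, which is the sense in which the optimizers differ only by norm choice.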

February 13, 2025 at 4:51 PM

Volkan Cevher

@cevherlions.bsky.social

📝 Check out the preprint: arxiv.org/abs/2502.07529
Worst-case convergence analysis with rates, guarantees for learning rate transfer, and practical advice on how to properly choose norms adapted to network geometry, backed by theory 🎯

February 13, 2025 at 4:51 PM

Volkan Cevher

@cevherlions.bsky.social

🕵️ It’s “just” stochastic conditional gradient. The secret sauce? Don't treat your weight matrices like they're flat vectors! SCION adapts to the geometry of matrices using LMOs with respect to the correct norm: the induced operator norm.
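
For reference, the standard definition of an LMO over a norm ball, and its well-known closed form for the spectral norm, can be written as follows. The symbols (radius ρ, gradient matrix G, SVD factors U, Σ, V) are notation chosen here, and the preprint may apply additional dimension-dependent scaling that is not shown.

```latex
\operatorname{lmo}(G) \in \arg\min_{\|X\| \le \rho} \langle G, X \rangle,
\qquad
G = U \Sigma V^\top \ \text{(reduced SVD)}
\;\Longrightarrow\;
\operatorname{lmo}_{\|\cdot\|_{2 \to 2}}(G) = -\rho \, U V^\top .
```

Treating G as a flat vector and using the Euclidean norm would instead give the normalized gradient, \(-\rho\, G / \|G\|_F\), which ignores the matrix structure.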

February 13, 2025 at 4:51 PM

Volkan Cevher

@cevherlions.bsky.social

arxiv.org/abs/2502.07529
🚀 Key results:
- Based on conditional gradient method
- Beats Muon+Adam on NanoGPT (tested up to 3B params)
- Zero-shot learning rate transfer across model sizes
- Uses WAY less memory (just one set of params + half-precision grads)
- Provides explicit norm control
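
The "explicit norm control" point can be sketched with a constrained conditional gradient step: each iterate is a convex combination of the previous iterate and a point on the norm ball, so it never leaves the ball. This is a toy sketch under assumptions, not the preprint's implementation: it uses an ℓ∞ ball on a plain vector, and `scg_step`, `alpha`, `radius`, and the quadratic objective are hypothetical choices for illustration.

```python
def sign_lmo(vec, radius):
    """LMO over the l-infinity ball: -radius * sign(v_i) coordinate-wise."""
    return [-radius * ((v > 0) - (v < 0)) for v in vec]


def scg_step(params, momentum, grad, lr, alpha, radius):
    """One constrained conditional gradient step with an averaged gradient."""
    # Running average of gradients: the only optimizer state besides the weights.
    momentum = [(1 - alpha) * m + alpha * g for m, g in zip(momentum, grad)]
    # Point on the ball that best aligns with the negative averaged gradient.
    target = sign_lmo(momentum, radius)
    # Convex combination keeps params inside the ball if they started inside.
    params = [(1 - lr) * p + lr * t for p, t in zip(params, target)]
    return params, momentum


# Toy objective 0.5 * ||x - center||^2, with the first coordinate of the
# minimizer lying outside the ball of radius 1.
center = [3.0, -0.2]
params = [0.0, 0.0]
momentum = [0.0, 0.0]
radius = 1.0
for _ in range(200):
    grad = [p - c for p, c in zip(params, center)]
    params, momentum = scg_step(params, momentum, grad, lr=0.05, alpha=0.5, radius=radius)

max_abs = max(abs(p) for p in params)
```

Here the first coordinate is pulled toward the boundary of the ball while `max_abs` stays bounded by `radius`, without any separate weight decay or clipping.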

February 13, 2025 at 4:51 PM

Volkan Cevher

@cevherlions.bsky.social

🔥 Want to train large neural networks WITHOUT Adam while using less memory and getting better results? ⚡
Check out SCION: a new optimizer that adapts to the geometry of your problem using norm-constrained linear minimization oracles (LMOs): 🧵👇

February 13, 2025 at 4:51 PM
