Volkan Cevher
@cevherlions.bsky.social
Associate Professor of Electrical Engineering, EPFL.
Amazon Scholar (AGI Foundations). IEEE Fellow. ELLIS Fellow.
Reposted by Volkan Cevher
We also provide what is, to my knowledge, the first convergence-rate analysis for stochastic unconstrained Frank-Wolfe (i.e., without weight decay), which directly covers the Muon optimizer (and much more)!
🔥 Want to train large neural networks WITHOUT Adam while using less memory and getting better results? ⚡
Check out SCION: a new optimizer that adapts to the geometry of your problem using norm-constrained linear minimization oracles (LMOs): 🧵👇
February 13, 2025 at 4:59 PM
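As a concrete (and purely illustrative) picture of the mechanism the thread describes, here is a minimal sketch of an LMO-based, Frank-Wolfe-style update: the gradient is handed to a linear minimization oracle over a norm ball, and the iterate moves along the returned direction. The helper names (`lmo_spectral`, `lmo_sign`, `fw_step`), the choice of norms, the step-size schedule, and the toy objective are my own assumptions for illustration, not the SCION implementation; the spectral-norm oracle is the SVD-based, orthogonalized direction used by Muon-style updates.

```python
import numpy as np

def lmo_spectral(grad, radius=1.0):
    # LMO over a spectral-norm ball: argmin_{||D||_2 <= radius} <grad, D> = -radius * U V^T,
    # where U S V^T is the thin SVD of the gradient (the "orthogonalized" direction).
    U, _, Vt = np.linalg.svd(grad, full_matrices=False)
    return -radius * (U @ Vt)

def lmo_sign(grad, radius=1.0):
    # LMO over an infinity-norm ball: the sign-descent direction.
    return -radius * np.sign(grad)

def fw_step(w, grad, lr, lmo, radius=1.0):
    # One unconstrained Frank-Wolfe-style step: move along the LMO direction
    # (no contraction toward zero, i.e. no weight-decay-like term).
    return w + lr * lmo(grad, radius)

# Toy usage: minimize 0.5 * ||W - target||_F^2 with a decaying step size.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4))
target = rng.standard_normal((8, 4))
for t in range(100):
    grad = W - target                      # gradient of the toy objective
    W = fw_step(W, grad, lr=0.5 / (t + 1), lmo=lmo_spectral)
print(np.linalg.norm(W - target))          # residual shrinks as the step size decays
```

In this picture, swapping the norm (and hence the oracle) per layer is what lets the update adapt to the geometry of the problem, which is the property the post highlights.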
It was a fun panel. Quite informative.
A thought-provoking panel with Scarlet of the EPFL AI Center, @cevherlions.bsky.social and Thomas Schneider from OFCOM - looking at the state of regulations, the business case for GenAI & the opportunities for Swiss research & innovation... a fine balance between talent, data and hardware. #AMLD
February 13, 2025 at 3:24 PM
Timeo professores machinae discendi et dona ferentes. ("I fear machine learning professors, even bearing gifts.")
January 5, 2025 at 7:09 PM
Reposted by Volkan Cevher
An illustrated guide to never learning anything
December 25, 2024 at 12:26 AM
Reposted by Volkan Cevher
We'll present "SAMPa: Sharpness-Aware Minimization Parallelized" at #NeurIPS24 on Thursday! This is joint work with Thomas Pethick and Volkan Cevher.
📍 Find us at Poster #5904 from 16:30 in the West Ballroom.
December 11, 2024 at 4:23 PM
Reposted by Volkan Cevher
Stable model scaling with width-independent dynamics?

Thrilled to present 2 papers at #NeurIPS 🎉 that study width-scaling in Sharpness Aware Minimization (SAM) (Th 16:30, #2104) and in Mamba (Fr 11, #7110). Our scaling rules stabilize training and transfer optimal hyperparams across scales.

🧵 1/10
December 10, 2024 at 7:08 AM
Reposted by Volkan Cevher
This is joint work with wonderful collaborators @leenacvankadara.bsky.social, @cevherlions.bsky.social and Jin Xu during our time at Amazon.

🧵 10/10
December 10, 2024 at 7:08 AM
@iclr-conf.bsky.social: Please incorporate this ACL style of feedback for reviewers:

aclrollingreview.org/authors#step...
November 29, 2024 at 5:45 PM
Reposted by Volkan Cevher
Reviewers take note:
57% of people rejected their own argument when they thought it was someone else's. So take it easy with the criticism.
November 15, 2024 at 10:17 PM