Alex Hägele
haeggee.bsky.social
PhD Student in Machine Learning @ICepfl MLO, MSc/BSc from @ETH_en.
haeggee.github.io
Reposted by Alex Hägele
I am excited to announce that I will join the University of Zurich as an assistant professor in August this year! I am looking for PhD students and postdocs starting from the fall.

My research interests include optimization, federated learning, machine learning, privacy, and unlearning.
March 6, 2025 at 2:17 AM
Reposted by Alex Hägele
🤗Thanks a lot @haeggee.bsky.social and @mjaggi.bsky.social for having me in the MLO group at EPFL @icepfl.bsky.social to present "Large Language Models as Markov Chains".

Slides are available on my website (link in thread).

🎉 New experiments with Llama and Gemma models in the updated paper!
February 28, 2025 at 1:03 PM
Reposted by Alex Hägele
Learning rate schedules seem mysterious? Why is the loss going down so fast during cooldown?
Turns out that this behaviour can be described with a bound from *convex, nonsmooth* optimization.

A short thread on our latest paper 🚞

arxiv.org/abs/2501.18965
The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training
We show that learning-rate schedules for large model training behave surprisingly similar to a performance bound from non-smooth convex optimization theory. We provide a bound for the constant schedul...
February 5, 2025 at 10:13 AM
Reposted by Alex Hägele
Hi there 👋 Happy to join Bluesky!

We are the EPFL AI Center - EPFL's hub for artificial intelligence, shaping a future where AI works for everyone through cutting-edge research, education, and collaborations with the private and public sectors.

ai.epfl.ch
November 22, 2024 at 9:13 AM