Primoz Ravbar
@primozravbar.bsky.social
Researcher @UCSB
Neuroscience, ethology, computational biology, theoretical biology, data science, ML, AI, artificial life
Reposted by Primoz Ravbar
This paper looks interesting - it argues that you don’t need adaptive optimizers like Adam to get good gradient-based training; instead, you can just set a learning rate for each group of units based on statistics computed at initialization (a rough sketch of the idea follows the post):

arxiv.org/abs/2412.11768

#MLSky #NeuroAI
No More Adam: Learning Rate Scaling at Initialization is All You Need
In this work, we question the necessity of adaptive gradient methods for training deep neural networks. SGD-SaI is a simple yet effective enhancement to stochastic gradient descent with momentum (SGDM)...
arxiv.org
December 20, 2024 at 7:00 PM
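
Since the post only describes the idea at a high level, here is a minimal, hypothetical sketch of what "one learning rate per parameter group, fixed at initialization, then plain SGD with momentum" can look like in PyTorch. This is not the paper's SGD-SaI implementation: the per-tensor grouping and the gradient signal-to-noise proxy are illustrative assumptions; only the overall shape (probe gradients once, freeze per-group rates, train without Adam-style adaptivity) follows the post and the abstract.

```python
import torch
import torch.nn as nn

def per_group_lr_at_init(model, loss_fn, probe_batch, base_lr=0.01, eps=1e-8):
    """Probe gradients once at initialization and return one parameter group
    per tensor, each with its own learning rate that is then left frozen."""
    inputs, targets = probe_batch
    model.zero_grad()
    loss_fn(model(inputs), targets).backward()

    groups = []
    for p in model.parameters():
        if p.grad is None:
            continue
        g = p.grad.detach()
        # Illustrative signal-to-noise proxy: mean |gradient| over its spread.
        snr = g.abs().mean() / (g.std() + eps)
        groups.append({"params": [p], "lr": base_lr * float(snr)})
    model.zero_grad()
    return groups

# Usage: compute the per-group rates once, then train with ordinary SGD with
# momentum; no per-step adaptive rescaling happens after this point.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
x, y = torch.randn(128, 32), torch.randint(0, 10, (128,))
param_groups = per_group_lr_at_init(model, nn.CrossEntropyLoss(), (x, y))
optimizer = torch.optim.SGD(param_groups, lr=0.01, momentum=0.9)
```

The point of the sketch is that all the scaling is decided before training starts, so the optimizer carries only momentum buffers rather than Adam's per-parameter second-moment state.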