Jörg Franke
jfranke.bsky.social
PhD student in the Machine Learning Lab at the University of Freiburg - Core Deep Learning Research with some applications in bio.
🧵4/5 - For example, when pretraining GPT-2 models, AdamCPR outperforms AdamW on the same budget, or needs only 2/3 of the budget to reach the same score.
December 9, 2024 at 3:28 PM
🧵3/5 - CPR can be used with any gradient-based optimization algorithm, e.g. Adam. You can find our AdamCPR implementation at github.com/automl/CPR or via pip install pytorch-cpr
December 9, 2024 at 3:28 PM