Yizhou Liu
yzliu.bsky.social
PhD student at MIT. Physics of living systems, complex systems, statistical physics. Homepage: https://liuyz0.github.io/
FOCUS is faster than the Adam baseline from the literature!! (8/8)
January 22, 2025 at 4:14 AM
FOCUS appears to be much more stable than Signum and Adam on our machines (7/8)
January 22, 2025 at 4:14 AM
Predictions from synthetic losses are relevant to reality! With small batch sizes (large gradient noise), Signum outperforms Adam when training an MLP for MNIST classification (6/8)
January 22, 2025 at 4:14 AM
The attraction force helps most when the gradient noise is large and the extra regularization (weight decay) is intermediate (5/8)
January 22, 2025 at 4:14 AM
Signum outperforms Adam when the effect of gradient noise is larger than that of loss sharpness. FOCUS further improves Signum when the loss is sharp (4/8)
January 22, 2025 at 4:14 AM
Our picture of the loss landscape is a narrowing valley (3/8)
January 22, 2025 at 4:14 AM
We add an attraction force (highlighted in red) to Signum (SignGD) (2/8)
January 22, 2025 at 4:14 AM
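The figure with the update rule did not survive here, so here is only a minimal sketch of the idea, assuming the attraction force pulls each parameter toward an exponential moving average of its own past values. The function names, the state layout, and the hyperparameters `beta1`, `beta2`, and `gamma` are hypothetical choices for illustration; the authors' actual implementation is at github.com/liuyz0/FOCUS.

```python
import random


def sign(x):
    """Return -1.0, 0.0, or 1.0 according to the sign of x."""
    return float((x > 0) - (x < 0))


def init_state(params):
    """Create optimizer state: gradient momentum and averaged parameters."""
    return {"m": [0.0] * len(params), "p_bar": list(params)}


def signum_attraction_step(params, grads, state,
                           lr=0.01, beta1=0.9, beta2=0.99, gamma=0.1):
    """One step of Signum plus an ASSUMED attraction term.

    Signum part: move against the sign of the gradient momentum m.
    Attraction part (assumption): also move toward p_bar, a running
    average of past parameter values, with relative strength gamma.
    Returns a new parameter list; state is updated in place.
    """
    m, p_bar = state["m"], state["p_bar"]
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        # Exponential moving averages of the gradient and the parameter.
        m[i] += (1 - beta1) * (g - m[i])
        p_bar[i] += (1 - beta2) * (p - p_bar[i])
        # Sign of momentum (Signum) plus sign of the offset from the average.
        step = sign(m[i]) + gamma * sign(p - p_bar[i])
        new_params.append(p - lr * step)
    return new_params


if __name__ == "__main__":
    # Toy problem (hypothetical): noisy gradients of 0.5 * sum(p**2).
    random.seed(0)
    params = [1.0, -2.0]
    state = init_state(params)
    for _ in range(500):
        grads = [p + random.gauss(0.0, 0.1) for p in params]
        params = signum_attraction_step(params, grads, state)
    print(params)
```

Setting `gamma=0` in this sketch recovers plain Signum, which makes it easy to compare the two on the same noisy toy problem.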
With Ziming Liu and Jeff Gore.
Code: github.com/liuyz0/FOCUS
January 22, 2025 at 4:14 AM