Alfredo Canziani
alfcnz.bsky.social
Musician, math lover, cook, dancer, 🏳️‍🌈, and an ass prof of Computer Science at New York University
This is different from the video I made 5 years ago, where the input-output linear interpolation of an already trained network shows what a neural net does to its input. Namely, it follows a piece-wise linear mapping defined by the hidden layer.
April 8, 2025 at 4:19 AM
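One way to read "input-output linear interpolation" is as blending each input point with the network's output for that point. Below is a minimal sketch under that assumption. The tiny `net` with hand-picked weights is a hypothetical stand-in for a trained model, and `interpolate` and `alpha` are names made up for illustration.

```python
def relu(v):
    """Element-wise max(0, x)."""
    return [max(0.0, vi) for vi in v]

def linear(x, W, b):
    """Fully connected layer: W @ x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

# Hypothetical stand-in for a trained 2 -> 3 -> 2 ReLU net (weights are made up).
W1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.5]]
b1 = [0.0, -0.5, 0.25]
W2 = [[1.0, 0.0, -1.0], [0.5, 1.0, 0.0]]
b2 = [0.1, -0.2]

def net(x):
    """Piece-wise linear map: the ReLU hidden layer decides which linear piece applies."""
    return linear(relu(linear(x, W1, b1)), W2, b2)

def interpolate(x, alpha):
    """Blend input and output: alpha=0 gives x, alpha=1 gives net(x)."""
    y = net(x)
    return [(1 - alpha) * xi + alpha * yi for xi, yi in zip(x, y)]

x = [0.5, -1.0]
frames = [interpolate(x, k / 10) for k in range(11)]  # 11 steps from input to output
```

Plotting `frames` for a whole grid of input points, one step at a time, would animate how the plane gets folded from the input to the output.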
Training of a 2 → 100 → 2 → 5 fully connected ReLU neural net via cross-entropy minimisation.
• it starts by outputting small embeddings
• around epoch 300 it learns an identity function
• it takes 1700 more epochs to unwind the data manifold
April 8, 2025 at 4:19 AM
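For reference, here is a minimal pure-Python sketch of the forward pass and cross-entropy loss for a 2 → 100 → 2 → 5 net. Placing the ReLU only after the 100-unit layer is an assumption, as are the uniform initialisation, the seed, and the example point and label. The original training code is not shown in the post.

```python
import math
import random

random.seed(0)  # fixed seed, chosen only for reproducibility

def linear_init(n_in, n_out):
    """Weights and biases for one fully connected layer (uniform init, an assumption)."""
    bound = 1 / math.sqrt(n_in)
    W = [[random.uniform(-bound, bound) for _ in range(n_in)] for _ in range(n_out)]
    b = [random.uniform(-bound, bound) for _ in range(n_out)]
    return W, b

def linear(x, layer):
    """Apply W @ x + b."""
    W, b = layer
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(v):
    """Element-wise max(0, x)."""
    return [max(0.0, vi) for vi in v]

# 2 -> 100 -> 2 -> 5, as in the post
layers = [linear_init(2, 100), linear_init(100, 2), linear_init(2, 5)]

def forward(x):
    h = relu(linear(x, layers[0]))  # 2 -> 100, followed by ReLU
    e = linear(h, layers[1])        # 100 -> 2, the 2-D embedding
    logits = linear(e, layers[2])   # 2 -> 5 class scores
    return e, logits

def cross_entropy(logits, target):
    """Negative log-softmax of the target class, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

embedding, logits = forward([0.5, -1.0])  # hypothetical input point
loss = cross_entropy(logits, target=3)    # hypothetical class label
```

The 2-D `embedding` is the quantity one would plot over training to watch the behaviour described in the bullets.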