alanjeffares.com
It feels like a cheat code!
(True story: the origin of this case study is that @alanjeffares.bsky.social [big EoSL nerd] looked at the neural net eq & said “kinda looks like GBTs in EoSL Ch10” & we went from there)
Surely this diff in kernel must account for at least some of the observed performance differences… 🤔 7/n
@alanjeffares.bsky.social & I suspected that answers to this are obfuscated by the 2 being considered very different algs🤔
Instead we show they are more similar than you’d think — making their diffs smaller but predictive!🧵1/n
For NeurIPS (my final PhD paper!), @alanjeffares.bsky.social & I explored if & how smart linearisation can help us better understand & predict numerous odd deep learning phenomena, and learned a lot.. 🧵 1/n