Tom Kempton
@tomkempton.bsky.social
Pure mathematician working in Ergodic Theory, Fractal Geometry, and (recently) Large Language Models. Senior Lecturer (= Associate Professor) at the University of Manchester.
Since softmax is not injective, many different logits vectors output the same probability distribution. (Precisely, v and w output the same distribution if and only if they differ by a constant multiple of the 'all ones' vector.) Can we infer anything from the logits vector beyond the probability distribution it outputs?
February 12, 2025 at 8:29 AM
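The softmax observation in the post above can be checked numerically. The following is a minimal sketch using only the Python standard library; the vectors `v` and `w` and the shift `c` are arbitrary illustrative values, not anything taken from a real model.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative values only (assumptions for this sketch).
v = [2.0, -1.0, 0.5]
c = 3.7                      # arbitrary constant shift
w = [x + c for x in v]       # w = v + c * (1, 1, 1)

p_v = softmax(v)
p_w = softmax(w)

# The two distributions agree up to floating-point error.
same = all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(p_v, p_w))

# Conversely, the distribution determines the logits only up to one
# additive constant: v[i] - log(p_v[i]) is the same number for every i.
offsets = [x - math.log(p) for x, p in zip(v, p_v)]
```

Here `same` being true illustrates the shift invariance, and the entries of `offsets` all coinciding illustrates that the only information lost by softmax is a single scalar (the log of the normalising sum).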
Can anyone point me to a reference saying early exit from a neural network is a reasonable thing to do?

As I understand it, early exit (from say a language model) involves taking the output from some early layer and applying the output embedding.
January 31, 2025 at 8:43 AM
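Early exit as described in the post above can be sketched on a toy model. Everything here is hypothetical: the residual `tanh` layers, the weights, the `unembed` matrix and the function names are stand-ins chosen for illustration, and real language models typically also apply a final normalisation before the output embedding, which this sketch omits.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

def make_layer(weight):
    """Toy residual layer h -> h + tanh(W h); a hypothetical stand-in for a block."""
    def layer(h):
        update = matvec(weight, h)
        return [x + math.tanh(u) for x, u in zip(h, update)]
    return layer

# Hypothetical toy model: hidden size 2, three layers, vocabulary of size 3.
layers = [
    make_layer([[0.5, -0.2], [0.1, 0.3]]),
    make_layer([[-0.4, 0.6], [0.2, 0.1]]),
    make_layer([[0.3, 0.3], [-0.5, 0.2]]),
]
unembed = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # output embedding (vocab x hidden)

def early_exit(h, exit_layer):
    """Run only the first `exit_layer` layers, then apply the output embedding."""
    for layer in layers[:exit_layer]:
        h = layer(h)
    return softmax(matvec(unembed, h))

h0 = [0.2, -0.4]                        # illustrative input hidden state
early = early_exit(h0, 1)               # distribution read off after layer 1
full = early_exit(h0, len(layers))      # distribution from the full model
```

Comparing `early` with `full` shows how far the early-layer prediction is from the final one on this toy example; whether that gap is small for a real network is exactly the question the post asks.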
I'm sure it's been asked a thousand times, but what's everyone's favourite method of making lists of articles they want to read?
November 27, 2024 at 11:34 AM
Today's question from the four-year-old: if all of the zookeepers in the world suddenly died, would the farmers look after the zoo animals or would that be the job of the vets? Had to admit I didn't know the answer...
November 23, 2024 at 3:38 PM