The Byte Latent Transformer, Large Concept Models, Memory Layers & Phi-4 — all grouped under the title "Spend Your FLOPs Wisely". Here's our take (🧵)
graphcore-research.github.io/papers-of-th...
douglasorr.itch.io/c-crits
I'm an AI researcher at Graphcore, working on topics such as stable scaling of models, parametrizations, optimization & quantization. For a rapid-fire taste, here are some slides from my most recent talk: