Adaptive Compute for Gemini and beyond @GoogleDeepMind
Pranav Nair: Combining losses for different Matryoshka-nested groups of bits in each weight within a neural network leads to an accuracy improvement for models (especially 2-bit representations).
Paper: "Matryoshka Quantization" at arxiv.org/abs/2502.06786