nick
bukovec.dev
mariners and machine learning
ee ms student @ stanford
big "model distillation" wants you to believe that you can
January 29, 2025 at 7:15 PM
bring back the hackintosh
January 29, 2025 at 5:52 AM
Here's an article about fine-tuning Llama 2 and Mistral with QLoRA on a 3090. The tricky thing with R1, though, is that it's MoE, so I think you'll have to load all 671B params into memory for training. It might be easier to fine-tune one of the Llama-distilled versions.

medium.com/@geronimo7/f...
Finetuning Llama 2 and Mistral
A beginner’s guide to finetuning LLMs with QLoRA
medium.com
January 29, 2025 at 5:36 AM
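A quick back-of-the-envelope check of the memory point above, as a rough sketch only. It counts weight storage and nothing else (no activations, optimizer state, or quantization overhead). The 24 GiB figure for a 3090 and the 8B size for a Llama-distilled variant are assumptions for illustration, and `weight_memory_gib` is a hypothetical helper, not part of any library.

```python
# Rough weight-only memory estimate. Ignores activations, LoRA adapter
# weights, optimizer state, and quantization metadata, so real usage is higher.
GIB = 1024 ** 3


def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Memory needed just to hold the weights, in GiB."""
    return n_params * bits_per_param / 8 / GIB


R1_TOTAL_PARAMS = 671e9      # all experts must be resident, even if few are active per token
DISTILL_PARAMS = 8e9         # assumed size of a small Llama-distilled variant
RTX_3090_VRAM_GIB = 24       # assumed VRAM of a single 3090

for name, n_params in [("R1 (full MoE)", R1_TOTAL_PARAMS), ("Llama distill", DISTILL_PARAMS)]:
    for label, bits in [("bf16", 16), ("4-bit", 4)]:
        need = weight_memory_gib(n_params, bits)
        fits = "fits" if need < RTX_3090_VRAM_GIB else "does not fit"
        print(f"{name:>14} @ {label:>5}: {need:8.1f} GiB ({fits} in one 3090)")
```

Even at 4 bits per weight the full model needs hundreds of GiB for weights alone, while a small distilled model leaves room on a single card for adapters and activations.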
since i’m just starting to mess around with MoE, the use of dynamic biases is really interesting to me. a super cool intuition!
December 29, 2024 at 11:19 PM