Discusses how the DeepSeek R1 model actually works in detail, but with very little math!
The blog has 4 main parts:
1. **Chain of Thought Reasoning**
2. **Reinforcement Learning**
3. **GRPO**
4. **Distillation**
trite-song-d6a.notion.site/Deepseek-R1-...