Jeff Rasley
jeffra.bsky.social
Snowflake AI Research Team, DeepSpeed co-founder, Brown CS PhD, UW CSE alum. @jeffra45 on the other site.
Do you want the ability to post-train DeepSeek V3/R1 models with DPO using just a few GPU nodes?

Please vote here and share your feelings: github.com/snowflakedb/...

This would be built into ArcticTraining, an open-source, easy-to-use post-training framework built on top of DeepSpeed.
Post-train DeepSeek V3/R1 with DPO using just a few GPU nodes? · snowflakedb ArcticTraining · Discussion #58
Hello AI Community! We are pondering over the features we can bring to ArcticTraining in the near future that would offer value to the AI community. One such feature we are considering is the abili...
February 22, 2025 at 1:13 AM
🚀 Super proud to share ArcticTraining, an open-source post-training framework to simplify and power new research directions!
✅ Modular trainers for fast prototyping
✅ Simple callback system for easy customization
✅ Native data generation pipelines
www.snowflake.com/en/engineeri...
ArcticTraining: Simplifying and Accelerating Post-Training for LLMs
ArcticTraining is a streamlined framework for LLM post-training, offering flexible trainers, simplified structures, and a native data generation pipeline.
January 16, 2025 at 11:16 PM