Lewis Tunstall
lewtun.bsky.social
🤗 LLM whisperer @huggingface
📖 Co-author of "NLP with Transformers" book
💥 Ex-particle physicist
🤘 Occasional guitarist
🇦🇺 in 🇨🇭
📊 We match the performance of DeepSeek-Distill-Qwen-7B by finetuning Qwen-7B-Math-Instruct on our dataset.

🔎 Read our blog post for all the nitty gritty details: huggingface.co/blog/open-r1...
Open R1: Update #2
A Blog post by Open R1 on Hugging Face
huggingface.co
February 10, 2025 at 6:09 PM
⏳ Automated filtering: We apply Math Verify to retain only problems with at least one correct answer. We also leverage Llama3.3-70B-Instruct as a judge to retrieve more correct examples (e.g., for cases with malformed answers that can’t be verified with a rule-based parser).
February 10, 2025 at 6:09 PM
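A minimal sketch of the "at least one correct answer" filter described above. This is not the Open R1 pipeline: the `Problem` type, `normalize`, and `is_correct` are hypothetical names, and the string comparison is only a crude stand-in for what a real verifier such as Math Verify does. The LLM-judge fallback is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    question: str
    reference: str          # ground-truth final answer
    candidates: list[str]   # final answers extracted from generated traces


def normalize(answer: str) -> str:
    """Crude stand-in for a math parser: drop '$', spaces, and a trailing period."""
    return answer.replace("$", "").replace(" ", "").rstrip(".").lower()


def is_correct(candidate: str, reference: str) -> bool:
    # Hypothetical rule-based check; a real verifier compares parsed
    # mathematical expressions, not normalized strings.
    return normalize(candidate) == normalize(reference)


def keep_verified(problems: list[Problem]) -> list[Problem]:
    """Retain only problems where at least one candidate answer is correct."""
    return [
        p for p in problems
        if any(is_correct(c, p.reference) for c in p.candidates)
    ]
```

Problems rejected by a check like this could then be passed to a judge model, which is where malformed but correct answers would be recovered.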
📀 512 H100s running locally: Instead of relying on an API, we leverage vLLM and SGLang to run generations locally on our science cluster, generating 180k reasoning traces per day.
February 10, 2025 at 6:09 PM
🐳 800k R1 reasoning traces: We generate two answers for 400k problems using DeepSeek R1. The filtered dataset contains 220k problems with correct reasoning traces.
February 10, 2025 at 6:09 PM
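The numbers quoted in these posts can be checked with a little arithmetic. The snippet below is only a back-of-the-envelope sketch: the generation-time estimate assumes the 180k traces per day rate held constant for the whole run, which the posts do not state.

```python
# Figures quoted in the posts
PROBLEMS = 400_000
ANSWERS_PER_PROBLEM = 2
KEPT_PROBLEMS = 220_000
TRACES_PER_DAY = 180_000

total_traces = PROBLEMS * ANSWERS_PER_PROBLEM   # 800k traces in total
retention = KEPT_PROBLEMS / PROBLEMS            # fraction of problems kept after filtering
days = total_traces / TRACES_PER_DAY            # assumption: constant throughput

print(f"{total_traces:,} traces, {retention:.0%} of problems kept, ~{days:.1f} days")
```

So roughly half of the problems survive filtering, and generating all traces would take a little over four days under the constant-rate assumption.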
What’s new compared to existing reasoning datasets?

♾ Based on NuminaMath 1.5: we focus on math reasoning traces and generate answers for problems in NuminaMath 1.5, an improved version of the popular NuminaMath-CoT dataset.
February 10, 2025 at 6:09 PM
Here are the links:

- Blog post: huggingface.co/spaces/Huggi...

- Code: github.com/huggingface/...

Enjoy!
Scaling test-time compute - a Hugging Face Space by HuggingFaceH4
huggingface.co
December 16, 2024 at 5:08 PM