Kyle Corbitt
corbtt.bsky.social
If you're fine-tuning LLMs, Gemma 3 is the new 👑 and it's not close. Gemma 3 trounces Qwen/Llama models at every size!
- Gemma 3 4B beats 7B/8B competition
- Gemma 3 27B matches 70B competition

Vision benchmarks soon!
March 21, 2025 at 4:27 PM
I hear cocaine is good but no way it can beat the rush I get from my RL-trained agent suddenly grokking a new skill.
March 18, 2025 at 1:02 AM
Big news: we've figured out how to train models 80-90% cheaper than before. Cheaper than renting your own GPUs. Cheaper than any other service. And 0 quality regression.

Super proud of the team on this one. New pricing is now live!
January 23, 2025 at 5:16 PM
Helpful intuition that folks new to LLMs may not know: if you have a lot of data, small models are often just as good as much, much larger ones for tasks like classification and information extraction. Here I compare a 1B vs 8B on a hard classification task, and I bet you can't tell which is which!
December 13, 2024 at 10:30 PM
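The "bet you can't tell which is which" comparison above can be sketched roughly like this. All labels and predictions below are illustrative stand-ins, not the actual eval data: score each model on the same held-out set, then blind the names before looking at the numbers.

```python
import random

# Hypothetical held-out classification set and predictions from two
# fine-tuned models (a 1B and an 8B); values are illustrative only.
gold     = ["spam", "ham", "spam", "ham", "spam", "ham", "spam", "ham"]
preds_1b = ["spam", "ham", "spam", "ham", "spam", "spam", "spam", "ham"]
preds_8b = ["spam", "ham", "spam", "ham", "spam", "ham", "spam", "spam"]

def accuracy(preds, gold):
    """Fraction of predictions matching the gold labels."""
    return sum(p == g for p, g in zip(preds, gold)) / len(gold)

# Blind the comparison: randomly assign the models to "A" and "B",
# then see whether the scores alone give the game away.
models = {"1B": preds_1b, "8B": preds_8b}
names = list(models)
random.shuffle(names)
for blind, name in zip("AB", names):
    print(f"Model {blind}: accuracy = {accuracy(models[name], gold):.2f}")
```

With enough fine-tuning data, both rows tend to print near-identical scores, which is the whole point of the post.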
Btw you can view your training loss across open source models AND Gemini models on OpenPipe!
December 9, 2024 at 3:40 PM
Meta just released Llama 3.3 70B—they claim benchmarks similar to Llama 3 405B, but in a model 20% the size. It's already available as a base model on OpenPipe, and we'll release benchmarks as a fine-tuning base model soon.

huggingface.co/meta-llama/L...
December 6, 2024 at 7:02 PM
SUPER PUMPED to announce that Gemini fine-tuning is available to all OpenPipe users! Gemini Flash provides the lowest cost fine-tuning of any model in its quality class. Comparable to gpt-4o-mini, but 4x cheaper inference and FREE fine-tuning!
December 5, 2024 at 4:38 PM
Amazon's Nova models have excellent price/perf ratio. We'd love to support them, but to deploy fine-tuned versions you need to purchase "provisioned throughput", which costs $100/hr/model. 😬 Putting out the bat signal—if you know someone at AWS Bedrock, pls put me in contact!
December 4, 2024 at 4:08 PM
Ok I am terrible at sharing product updates here, but we now support Llama 3.2 1B and 3B (the best small LLMs) as well as Qwen 2.5 72B and 32B Coder (the best open general and code-specific models) on OpenPipe!
December 4, 2024 at 12:35 AM
You can just use Qwen 2.5 for any task you'd otherwise use Llama 3.1 for. This is a (poorly formatted) chart I made a month or so back based on our own internal evals after the Llama 3.2 release. Big error bars but it shows the trend.
November 19, 2024 at 10:19 PM
OpenPipe now hosts all our docs in plaintext on our docs page at /llms.txt (index links) and /llms-full.txt (full dump of all docs).
Great idea from @jph.bsky.social!
November 18, 2024 at 8:17 PM
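The /llms.txt convention above is easy to consume programmatically: /llms.txt is an index of links and /llms-full.txt is a full dump of the docs. A minimal sketch (the docs host below is a hypothetical placeholder, not OpenPipe's actual URL):

```python
from urllib.parse import urljoin

def llms_txt_url(docs_base: str, full: bool = False) -> str:
    """Build the URL for a site's plaintext docs per the llms.txt
    convention: /llms.txt (index of links) or /llms-full.txt (full dump)."""
    path = "llms-full.txt" if full else "llms.txt"
    base = docs_base if docs_base.endswith("/") else docs_base + "/"
    return urljoin(base, path)

# Hypothetical docs host, for illustration:
print(llms_txt_url("https://docs.example.com"))             # index of links
print(llms_txt_url("https://docs.example.com", full=True))  # full dump

# To actually pull the text into a model's context window, you could do:
#   import urllib.request
#   text = urllib.request.urlopen(llms_txt_url(base, full=True)).read().decode()
```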
Qwen 2.5 Coder 32B is a 🐐
✅ Benchmarks at or above GPT-4 and Claude 3.5
✅ Subjectively feels fantastic for code (been trying it)
✅ Fine-tunable on your own data on OpenPipe!
November 13, 2024 at 11:16 PM
Last week Hugging Face released "SmolLM v2," several <2B models designed for edge deployment. Interested in how they perform when fine-tuned? You're in luck! We've compared their performance with other edge models. (Spoiler: Qwen remains the champion 👑)
November 2, 2024 at 10:15 AM