Unsloth AI
@unsloth.ai
Open source LLM fine-tuning! 🦥
GitHub: http://github.com/unslothai/unsloth Discord: https://discord.gg/unsloth
You can now train OpenAI gpt-oss with Reinforcement Learning in our free notebook!
This notebook automatically creates faster kernels via RL.
Unsloth RL achieves the fastest inference & lowest VRAM use of any setup, with 0 accuracy loss.
gpt-oss-20b GRPO Colab: colab.research.google.com/github/unslo...
September 26, 2025 at 4:06 PM
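For readers who want to adapt the notebook, here is a minimal sketch of what GRPO training on gpt-oss with Unsloth and TRL can look like. The checkpoint name, toy dataset, and stub reward are illustrative placeholders, not the notebook's actual kernel-benchmarking setup:

```python
from unsloth import FastLanguageModel
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# Load gpt-oss in 4-bit so it fits on a single Colab GPU (repo name assumed).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gpt-oss-20b",
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(model, r=16, lora_alpha=16)

# Toy prompts; the real notebook asks for kernels and benchmarks them.
dataset = Dataset.from_dict(
    {"prompt": ["Write a fast matrix multiplication kernel in Triton."] * 64}
)

def kernel_reward(completions, **kwargs):
    # Stub reward: prefer completions that at least contain code.
    # Swap this for a function that compiles and times each generated kernel.
    return [1.0 if "def " in c else 0.0 for c in completions]

trainer = GRPOTrainer(
    model=model,
    processing_class=tokenizer,
    reward_funcs=[kernel_reward],
    args=GRPOConfig(
        per_device_train_batch_size=4,
        num_generations=4,          # completions sampled per prompt
        max_completion_length=512,
        max_steps=50,
        output_dir="gpt-oss-grpo",
    ),
    train_dataset=dataset,
)
trainer.train()
```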
We're teaming up with Mistral and NVIDIA for an Unsloth event on Tues, Oct 21 at Y Combinator's office! 🦥
Join us in San Francisco for a night of talks, merch and more.
Food & drinks provided. RSVP required!
⭐ lu.ma/unsloth-yc
September 22, 2025 at 2:20 PM
Run DeepSeek-V3.1 locally on 170GB RAM with our Dynamic 1-bit GGUFs!🐋
The 715GB model gets reduced to 170GB (-80% size) by smartly quantizing layers.
The 1-bit GGUF passes all our code tests & we fixed the chat template.
Guide: docs.unsloth.ai/basics/deeps...
GGUF: huggingface.co/unsloth/Deep...
August 22, 2025 at 8:13 PM
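If you prefer calling the quant from Python rather than the llama.cpp CLI covered in the guide, here is a rough sketch using llama-cpp-python. The filename is a placeholder for whichever quant/split you downloaded:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="DeepSeek-V3.1-UD-TQ1_0.gguf",  # hypothetical local filename
    n_ctx=8192,        # context window
    n_gpu_layers=20,   # offload what fits on your GPU; 0 for CPU-only
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a binary search in Python."}],
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```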
Learn to fine-tune OpenAI gpt-oss with our new step-by-step guide! ✨
Learn about:
• Local gpt-oss training + inference FAQ & tips
• Evaluation, hyperparameters & overfitting
• Reasoning effort, Data prep
• Running & saving your model to llama.cpp GGUF, HF, etc.
🔗Guide: docs.unsloth.ai/basics/gpt-o...
August 18, 2025 at 2:20 PM
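As a taste of what the guide covers, here is a rough end-to-end sketch: load gpt-oss with Unsloth, LoRA fine-tune on a placeholder dataset, then export to Hugging Face and llama.cpp GGUF formats. The checkpoint, dataset, and hyperparameters are illustrative, not the guide's recommendations:

```python
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer, SFTConfig

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gpt-oss-20b",   # assumed checkpoint name
    max_seq_length=4096,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(model, r=16, lora_alpha=16)

# Placeholder data; ignores the optional "input" field for brevity.
dataset = load_dataset("yahma/alpaca-cleaned", split="train[:1%]")

def to_text(example):
    # Render each example through the model's chat template.
    messages = [
        {"role": "user", "content": example["instruction"]},
        {"role": "assistant", "content": example["output"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = dataset.map(to_text)

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(dataset_text_field="text", max_steps=60, output_dir="outputs"),
)
trainer.train()

# Export: merged 16-bit weights for HF, or a llama.cpp GGUF quant.
model.save_pretrained_merged("gpt-oss-finetune", tokenizer, save_method="merged_16bit")
model.save_pretrained_gguf("gpt-oss-finetune-gguf", tokenizer, quantization_method="q8_0")
```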
We made a complete Guide on Reinforcement Learning (RL) for LLMs!
Learn about:
• RL's goal & why it's key to building intelligent AI agents
• Why o3, Claude 4 & R1 use RL
• GRPO, RLHF, DPO, reward functions
• Training your own local R1 model with Unsloth
🔗 docs.unsloth.ai/basics/reinf...
June 17, 2025 at 3:09 PM
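To make the "reward functions" part concrete: in GRPO a reward function is just a callable that scores each completion. A toy pair following TRL's GRPOTrainer convention, assuming a dataset with an `answer` column and a `<reasoning>…</reasoning><answer>…</answer>` output format:

```python
import re

def correctness_reward(completions, answer, **kwargs):
    # +2 if the extracted final answer matches the reference, else 0.
    scores = []
    for completion, gold in zip(completions, answer):
        match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
        predicted = match.group(1).strip() if match else ""
        scores.append(2.0 if predicted == str(gold).strip() else 0.0)
    return scores

def format_reward(completions, **kwargs):
    # +0.5 for following the <reasoning>...</reasoning><answer>...</answer> format.
    pattern = r"<reasoning>.*?</reasoning>\s*<answer>.*?</answer>"
    return [0.5 if re.search(pattern, c, re.DOTALL) else 0.0 for c in completions]
```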
You can now run DeepSeek-R1-0528 locally with our Dynamic 1-bit GGUFs! 🐋
We shrank the full 715GB model to just 168GB (-80% size).
We achieve optimal accuracy by selectively quantizing layers.
DeepSeek-R1-0528-Qwen3-8B is also supported.
GGUFs: huggingface.co/unsloth/Deep...
May 30, 2025 at 8:35 PM
Introducing Unsloth Dynamic v2.0 GGUFs!
v2.0 sets new benchmarks on 5-shot MMLU + KL Divergence. So, you can now run quantized LLMs with minimal accuracy loss.
For benchmarks, we built an evaluation framework to match the official MMLU scores of Llama 4 & Gemma 3.
Blog: docs.unsloth.ai/basics/dynam...
April 24, 2025 at 7:12 PM
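For context, KL divergence here measures how closely the quantized model's next-token distribution tracks the full-precision model's. A rough sketch of that measurement on a calibration text; the model repos are stand-ins, not the exact pairs benchmarked:

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

ref_name = "google/gemma-3-1b-it"              # full-precision reference (example)
quant_name = "unsloth/gemma-3-1b-it-bnb-4bit"  # quantized variant (assumed repo name)

tokenizer = AutoTokenizer.from_pretrained(ref_name)
ref = AutoModelForCausalLM.from_pretrained(ref_name, torch_dtype=torch.bfloat16, device_map="auto")
quant = AutoModelForCausalLM.from_pretrained(quant_name, device_map="auto")

text = "The quick brown fox jumps over the lazy dog. " * 50
ids = tokenizer(text, return_tensors="pt").input_ids.to(ref.device)

with torch.no_grad():
    p_ref = F.softmax(ref(ids).logits.float(), dim=-1)        # reference distribution
    log_q = F.log_softmax(quant(ids).logits.float(), dim=-1)  # quantized distribution

# Mean per-token KL(ref || quant): lower means the quant tracks the original better.
kl = (p_ref * (p_ref.clamp_min(1e-12).log() - log_q)).sum(-1).mean()
print(f"Mean per-token KL divergence: {kl.item():.4f}")
```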
You can now run Llama 4 on your local device! 🦙
We shrank Maverick (402B) from 400GB to 122GB (-70%).
Our Dynamic 1.78-bit iMatrix GGUFs ensure optimal accuracy & size by selectively quantizing layers.
Scout + Maverick GGUFs: huggingface.co/collections/...
Guide: docs.unsloth.ai/basics/tutor...
April 8, 2025 at 8:39 PM
You can now run DeepSeek-V3-0324 locally using our 1.58 & 2.51-bit Dynamic GGUFs! 🐋
We shrank the 720GB model to 131GB (-80%) by selectively quantizing layers for the best performance. This fixes the broken outputs and other issues seen with basic quants: www.unsloth.ai/blog/deepsee...
GGUF: huggingface.co/unsloth/Deep...
How to Run Deepseek-V3-0324 Locally
DeepSeek's V3-0324 model is the most powerful open-source model, rivalling GPT-4.5 and Claude 3.7. Learn to run the model with Unsloth Dynamic quants.
www.unsloth.ai
March 25, 2025 at 11:54 PM
We teamed up with 🤗Hugging Face to release a free notebook for fine-tuning Gemma 3 with GRPO
Learn to:
• Enable reasoning in Gemma 3 (1B)
• Prepare/understand reward functions
• Make GRPO work for tiny LLMs
Notebook: colab.research.google.com/github/unslo...
Details: huggingface.co/reasoning-co...
March 19, 2025 at 4:31 PM
We made a Guide to teach you how to Fine-tune LLMs correctly!
Learn about:
• Choosing the right parameters & training method
• RL, GRPO, DPO, CPT
• Data prep, Overfitting, Evaluation
• Training with Unsloth & deploy on vLLM, Ollama, Open WebUI
🔗https://docs.unsloth.ai/get-started/fine-tuning-guide
March 10, 2025 at 4:30 PM
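Much of "choosing the right parameters" comes down to a handful of LoRA knobs. A typical Unsloth setup for illustration (the values are common starting points, not prescriptions from the guide):

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",  # example model
    max_seq_length=2048,
    load_in_4bit=True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,                       # LoRA rank: higher = more capacity, more VRAM
    lora_alpha=16,              # usually set equal to (or 2x) the rank
    lora_dropout=0,             # 0 is fastest; raise slightly if overfitting
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",  # trades compute for much less VRAM
    random_state=3407,
)
```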
Tutorial: Train your own Reasoning LLM for free!
Give Llama 3.1 (8B) chain-of-thought reasoning with DeepSeek's GRPO algorithm. Unsloth enables 90% less VRAM use.
Learn about:
• Reward Functions + dataset prep
• Training on free Colab GPUs
• Running + Evaluating
Guide: docs.unsloth.ai/basics/reaso...
February 25, 2025 at 6:32 PM
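A sketch of the dataset-prep step: turn a math dataset into reasoning-style prompts plus a gold answer that reward functions can check. GSM8K and the tag format are illustrative choices, not necessarily what the guide uses:

```python
from datasets import load_dataset

SYSTEM_PROMPT = (
    "Respond in the following format:\n"
    "<reasoning>...</reasoning>\n<answer>...</answer>"
)

def prepare(example):
    # GSM8K answers end with '#### <number>'; keep just the number as gold.
    gold = example["answer"].split("####")[-1].strip()
    return {
        "prompt": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": example["question"]},
        ],
        "answer": gold,
    }

dataset = load_dataset("openai/gsm8k", "main", split="train").map(prepare)
print(dataset[0]["prompt"], dataset[0]["answer"])
```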
Today, we’re launching new algorithms that enable 10x longer context lengths & 90% less VRAM for training Reasoning Models (GRPO).
Using Unsloth, you can now train your own reasoning model on Qwen2.5-1.5B with just 5GB of VRAM and no accuracy loss.
Blog: unsloth.ai/blog/grpo
February 20, 2025 at 6:41 PM
You can now reproduce DeepSeek-R1's reasoning on your own local device!
Experience the "Aha" moment with just 7GB VRAM.
Unsloth reduces GRPO training memory use by 80%.
15GB VRAM can transform Llama-3.1 (8B) & Phi-4 (14B) into reasoning models.
Blog: unsloth.ai/blog/r1-reas...
February 6, 2025 at 6:09 PM
You can finetune Phi-4 for free on Colab now!
Unsloth finetunes LLMs 2x faster, with 70% less VRAM, 12x longer context - with no accuracy loss
Documentation: docs.unsloth.ai
We also fixed 4 bugs in Phi-4: unsloth.ai/blog/phi4
Phi-4 Colab: colab.research.google.com/github/unslo...
Finetune Phi-4 with Unsloth
Fine-tune Microsoft's new Phi-4 model with Unsloth!
We've also found & fixed 4 bugs in the model.
unsloth.ai
January 10, 2025 at 7:55 PM
Introducing Unsloth Dynamic 4-bit Quantization!
Naive quantization often harms accuracy, so we avoid quantizing certain parameters. This achieves higher accuracy while using only <10% more VRAM than BnB 4-bit.
Read our Blog: unsloth.ai/blog/dynamic...
Quants on Hugging Face: huggingface.co/collections/...
December 4, 2024 at 7:51 PM
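The underlying idea is to leave quantization-sensitive modules in higher precision and quantize everything else. A rough way to express that with plain transformers + bitsandbytes; the skipped module names are hypothetical, and Unsloth's dynamic quants choose them per model via error analysis rather than by hand:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    # Leave these modules unquantized (hypothetical selection).
    llm_int8_skip_modules=["lm_head", "model.layers.0.mlp.down_proj"],
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",   # example model
    quantization_config=bnb_config,
    device_map="auto",
)
```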
You can finetune Llama-3.2-Vision-11B for free on Colab now!
Unsloth finetunes VLMs 2x faster, with 50% less VRAM, 6x longer context - with no accuracy loss.
Other VLMs like Qwen2-VL, Llava & Pixtral are supported.
Blog: unsloth.ai/blog/vision
Fine-tuning Colabs: docs.unsloth.ai/get-started/...
Llama 3.2 Vision Fine-tuning with Unsloth
Fine-tune Meta's Llama 3.2 Vision, Llava, and Qwen 2.5 Vision models 2x faster with open-source Unsloth! Beginner friendly.
unsloth.ai
November 21, 2024 at 7:14 PM
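Loading a VLM for fine-tuning looks much like the text-only flow, with extra switches for which parts of the model to adapt. A sketch (the checkpoint name and switch values are illustrative):

```python
from unsloth import FastVisionModel

model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/Llama-3.2-11B-Vision-Instruct",  # assumed Unsloth checkpoint
    load_in_4bit=True,
)

model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers=True,      # also adapt the image encoder
    finetune_language_layers=True,    # adapt the text decoder
    finetune_attention_modules=True,
    finetune_mlp_modules=True,
    r=16,
    lora_alpha=16,
)
# From here, training proceeds with TRL's SFTTrainer and an image+text data
# collator, as in the linked fine-tuning Colabs.
```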