Join us in shaping ART:
- Discord: discord.com/invite/F6Mxp...
- GitHub: github.com/openpipe/art
Join us in shaping ART:
- Discord: discord.com/invite/F6Mxp...
- GitHub: github.com/openpipe/art
- Decoupled frontend (user logic) and backend (inference/training).
- VRAM optimization, enabling training 7B models even on free-tier Colab!
- Builds on
@vllm_project
, TRL by
@huggingface
and
@UnslothAI
- Decoupled frontend (user logic) and backend (inference/training).
- VRAM optimization, enabling training 7B models even on free-tier Colab!
- Builds on
@vllm_project
, TRL by
@huggingface
and
@UnslothAI
1️⃣ Multi-turn roll-outs: Existing RL frameworks often handle single-turn interactions. But real-world agent tasks—like web navigation—are multi-turn. ART natively supports multi-turn agent rollouts, essential for real-world agentic flows.
1️⃣ Multi-turn roll-outs: Existing RL frameworks often handle single-turn interactions. But real-world agent tasks—like web navigation—are multi-turn. ART natively supports multi-turn agent rollouts, essential for real-world agentic flows.
1. Generating HN titles (surpasses all SOTA)
2. 🕹️ Tic-tac-toe (7B model surpassing GPT-4o)
3. 🔍 Clue (14B model beating frontier models)
All with runnable examples in the repo!
1. Generating HN titles (surpasses all SOTA)
2. 🕹️ Tic-tac-toe (7B model surpassing GPT-4o)
3. 🔍 Clue (14B model beating frontier models)
All with runnable examples in the repo!