Lightnews — Scholar-powered news

Julien Pourcel

@jul-p.bsky.social

16 followers 220 following 11 posts

PhD student at INRIA (FLOWERS team) working on LLM4code | Prev. (MVA) ENS ParisSaclay

Posts Replies Media Videos

Julien Pourcel

@jul-p.bsky.social

📈 **Results**:
- Qwen-7B model: 6% → 36% accuracy
- Qwen-32B model: 13% → 45% accuracy
- Mistral-Large-2: 20% -> 46% accuracy
- Combined ensemble: 52% on ARC-AGI test set
- Outperforms much larger models like o3-mini and Claude-4-Sonnet

July 10, 2025 at 4:04 PM

Julien Pourcel

@jul-p.bsky.social

🔬 Why This Matters? Most coding tasks are too hard for even the best language models to solve in one shot. Traditional search methods help, but they hit a wall because the model’s abilities are fixed. SOAR breaks through this barrier by letting the model improve itself over time

July 10, 2025 at 4:04 PM

Julien Pourcel

@jul-p.bsky.social

Introducing SOAR 🚀, a self-improving framework for prog synth that alternates between search and learning (accepted to #ICML!)

It brings LLMs from just a few percent on ARC-AGI-1 up to 52%

We’re releasing the finetuned LLMs, a dataset of 5M generated programs and the code.

🧵

July 10, 2025 at 4:04 PM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news