Lightnews — Scholar-powered news

Sylvain Kalache

@sylvainkalache.bsky.social

40 followers 14 following 24 posts

Leading the AI Labs @rootly.com - Former LinkedIn SRE and Founder of Holberton School

Sylvain Kalache

@sylvainkalache.bsky.social

Thank you for having me 🙌

June 17, 2025 at 7:14 PM

Sylvain Kalache

@sylvainkalache.bsky.social

3️⃣ An older version – Llama 3.3 70B-Versatile – performed even better than Llama 4 Maverick.

The benchmark – designed by the @rootly.com AI Labs – tests models' ability to pick the correct pull request for a given bug description. The full findings 👉 rootly.com/blog/llama-4...

Rootly | Llama 4 underperforms: a benchmark against coding-centric models

Rootly AI Labs analyzes the performance of Meta’s Llama 4 models and finds they underperform compared to competitors like Claude 3.5 Sonnet and Qwen2.5

rootly.com

April 14, 2025 at 4:22 PM

Sylvain Kalache

@sylvainkalache.bsky.social

2️⃣ Second, we wanted to test it against models tailored for coding tasks. Unsurprisingly, it performs well below those. Llama 4 Maverick achieved only 70% accuracy. Alibaba’s Qwen2.5-Coder-32B ranks best (90%), closely followed by OpenAI’s o3-mini (89%).

April 14, 2025 at 4:22 PM

Sylvain Kalache

@sylvainkalache.bsky.social

1️⃣ First, we wanted to reproduce Meta's findings that Llama 4 outperformed GPT-4o, Gemini 2.0 Flash, and DeepSeek v3.1—we found the exact opposite.

It came last, 6% less than the next best-performing model (DeepSeek) and 18% behind the overall top-performing model (GPT-4o).

April 14, 2025 at 4:22 PM

Sylvain Kalache

@sylvainkalache.bsky.social

☕️ Or just meet for a coffee; DM me 😊

March 6, 2025 at 3:24 PM

Sylvain Kalache

@sylvainkalache.bsky.social

🎤 Interview guests wanted: speak about your favorite AI tool.
scheduler.default.com/7992/member/...

scheduler.default.com

March 6, 2025 at 3:24 PM

Sylvain Kalache

@sylvainkalache.bsky.social

🔭 Join our Code to Clarity event: The Future of Monitoring, Observability, and Reliability with our friends at Checkly & @coralogix.bsky.social
lu.ma/fhl522f4

🚀 Code to Clarity: The Future of Monitoring, Observability, and Reliability · Luma

What’s the Vibe? Monitoring and observability are evolving—are your systems keeping up? Join us for an invite-only, off-the-record gathering of engineering…

lu.ma

March 6, 2025 at 3:24 PM

Sylvain Kalache

@sylvainkalache.bsky.social

📧 We are hiring across the board and are looking for contractors for the AI Lab – shoot me a DM if you are interested!

February 19, 2025 at 8:07 PM

Sylvain Kalache

@sylvainkalache.bsky.social

💡 The AI Lab mission is to leverage AI to improve incident management and systems operations. We’ll be building POCs, open-sourcing tools, and benchmarking models.

February 19, 2025 at 8:07 PM

Sylvain Kalache

@sylvainkalache.bsky.social

👨‍💻 Joining Rootly feels like the perfect next step. My career has always been about SREs: I worked as one, trained them, and helped startups engage with them.

February 19, 2025 at 8:07 PM
