Jonathan Ross
@jonathan-ross.bsky.social
CEO + Founder @ Groq, the Most Popular API for Fast Inference | Creator of the TPU and LPU, Two of the World’s Most Important AI Chips | On a Mission to Double the World's AI Compute by 2027
Pinned
What can you do with Llama quality and Groq speed? Instant. That's what.

3 months back: Llama 8B running at 750 tokens/sec
Now: Llama 70B running at 3,200 tokens/sec

We're still going to get a liiiiiiitle bit faster, but this is our V1 14nm LPU - how fast will V2 be? 😉
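If you want to sanity-check throughput numbers like these yourself, here's a minimal sketch against Groq's OpenAI-compatible chat endpoint using the groq Python SDK. The model ID is an assumption (model names rotate; check the Groq console for current ones), and wall-clock timing includes network latency, so it will read lower than the server-side generation rate.

```python
import os
import time

from groq import Groq  # pip install groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

start = time.perf_counter()
resp = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # assumed model ID; verify in the Groq console
    messages=[{"role": "user", "content": "Write a 500-word story about a chip fab."}],
)
elapsed = time.perf_counter() - start

tokens = resp.usage.completion_tokens
# Wall-clock tokens/sec includes network and queueing overhead, so it
# understates the raw generation rate the LPU itself achieves.
print(f"{tokens} tokens in {elapsed:.2f}s -> {tokens / elapsed:.0f} tokens/sec")
```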
Reposted by Jonathan Ross
Fantastic insight on the massive demand for AI inference infrastructure: “The demand for AI compute is insatiable,” says @groq.com CEO @jonathan-ross.bsky.social. “Our mission is to provide over half of the world’s inference compute.” - @cnbc.com

cnb.cx/4nG7Pcm #AI
Groq CEO: Our mission is to provide over half of the world’s inference compute
Jonathan Ross, CEO and founder of Groq, joins CNBC’s 'Squawk on the Street' to discuss the AI chip startup’s $750 million funding round, its push to deliver faster, lower-cost inference chips, and why...
cnb.cx
September 25, 2025 at 12:46 PM
Founder Tip #2: You have to spend time to make time.

Hiring, re-organizing, calendar cleanup (across the team), preparation for meetings (internal and external), etc. Half my day is available for whatever I find important - because the other half is spent freeing up time.
September 6, 2025 at 4:52 PM
Clearly China doesn't have enough compute for scaled AI today:
- GPT-OSS, Llama [US]: optimized for cheaper inference
- R1, Kimi K2, Qwen [China]: optimized for cheaper training

With China's population, reducing inference costs is more important, and cheaper inference means more training.
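One way to see why cheaper inference "means more training": with the standard back-of-envelope approximations (training cost ~ 6·N·D FLOPs for N parameters and D training tokens, inference cost ~ 2·N FLOPs per generated token), a smaller model overtrained on more data costs more to train but much less to serve. A sketch with purely illustrative numbers, none of them from the post:

```python
# Back-of-envelope FLOPs accounting. Assumptions: training ~ 6*N*D,
# inference ~ 2*N per generated token. All numbers below are illustrative.

def training_flops(params: float, train_tokens: float) -> float:
    return 6 * params * train_tokens

def inference_flops(params: float, served_tokens: float) -> float:
    return 2 * params * served_tokens

served = 1e15  # lifetime tokens served; inference dominates at population scale

# A big model with a compute-optimal training run...
big = training_flops(70e9, 1.4e12) + inference_flops(70e9, served)
# ...versus a smaller model overtrained on ~10x the data to chase similar quality.
small = training_flops(8e9, 15e12) + inference_flops(8e9, served)

print(f"70B / 1.4T train tokens: {big:.2e} total FLOPs")
print(f" 8B /  15T train tokens: {small:.2e} total FLOPs")
# The 8B run spends MORE on training (7.2e23 vs 5.9e23 FLOPs) but far less
# on serving - exactly the trade you make when your user base is huge.
```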
August 19, 2025 at 12:19 PM
Reposted by Jonathan Ross
Transcribe audio with @groq.com.
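For context, a minimal sketch of what that looks like, assuming the groq Python SDK's OpenAI-compatible audio endpoint and a Whisper model ID that may have changed since:

```python
import os

from groq import Groq  # pip install groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

# Send a local audio file for transcription.
with open("meeting.m4a", "rb") as audio:
    transcript = client.audio.transcriptions.create(
        file=("meeting.m4a", audio.read()),
        model="whisper-large-v3",  # assumed model ID; check the Groq console
    )

print(transcript.text)
```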
April 16, 2025 at 2:13 PM
I spent the weekend hanging out with a group of friends. One question we asked: what dreams did we have that we gave up on?

When I was 18, I had two dreams:

1) Be an astronaut
2) Build AI chips

I didn’t give up on one of them. 😀
March 24, 2025 at 2:39 PM
Reposted by Jonathan Ross
Big news! Mistral AI Saba 24B is on GroqCloud! The specialized regional language model is perfect for Middle East and South Asia-based devs and enterprises building AI solutions that need fast inference.
Learn more: groq.com/mistral-saba...
Mistral Saba Added to GroqCloud™ Model Suite - Groq is Fast AI Inference
GroqCloud™ has added another openly-available model to our suite – Mistral Saba. Mistral Saba is Mistral AI’s first specialized regional language model,
hubs.la
February 27, 2025 at 5:04 PM
It was a pleasure being back on 20VC with Harry Stebbings. His craft of interviewing is second to none and we went deep.

We recorded this interview just after we launched 19,000 LPUs in Saudi Arabia, where we built the largest inference cluster in the region.

Link to the interview in the comments below!
February 17, 2025 at 6:00 PM
We built the region’s largest inference cluster in Saudi Arabia in 51 days and we just announced a $1.5B agreement for Groq to expand our advanced LPU-based AI inference infrastructure.

Build fast.
February 9, 2025 at 10:42 PM
My emergency episode with @harrystebbings.bsky.social at 20VC on the impact of #DeepSeek on the AI world just launched.
January 29, 2025 at 4:41 PM
Reposted by Jonathan Ross
Yesterday at the World Economic Forum in Davos, I joined a constructive discussion on AGI alongside @andrewyng.bsky.social, @yejinchoinka.bsky.social, @jonathan-ross.bsky.social, @thomwolf.bsky.social and moderator @nxthompson.bsky.social. Full discussion here: www.weforum.org/meetings/wor...
January 23, 2025 at 5:01 PM
Thank you! 🙏
January 9, 2025 at 3:27 AM
Over the next decade, we want to drive the cost of generative AI down 1,000x, making a lot more activities profitable. And we think that will cause a 100x increase in spend.

🧵(5/5)
January 8, 2025 at 4:03 PM
Over the last 60 years, almost like clockwork, every decade compute gets about 1,000x cheaper, people buy 100,000x as much of it, and spend 100x more overall.

Our mission at Groq is to drive the cost of compute towards zero. The cheaper we make compute, the more people spend.

🧵(4/5)
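Multiplying the post's own round numbers through makes the clockwork explicit:

$$\frac{\text{spend}_{\text{new}}}{\text{spend}_{\text{old}}} \;=\; \underbrace{\frac{1}{1{,}000}}_{\text{price drop}} \times \underbrace{100{,}000}_{\text{quantity growth}} \;=\; 100$$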
January 8, 2025 at 4:03 PM
- The answer: when you make a steam engine more efficient, it reduces OpEx;

- When you reduce OpEx, more activities become profitable;

- Therefore, people do more things with steam engines, and coal demand rises.

The same paradox applies to compute.

🧵(3/5)
January 8, 2025 at 4:03 PM
It’s a paradox because if the engines are more efficient, why are people buying more coal?

🧵(2/5)
January 8, 2025 at 4:03 PM
When you make compute cheaper do people buy more?

Yes. It's called Jevons Paradox and it's a big part of our business thesis.

In the 1860s, an Englishman, William Stanley Jevons, wrote a treatise on coal (The Coal Question) in which he noted that every time steam engines got more efficient, people bought more coal.

🧵(1/5)
January 8, 2025 at 4:03 PM
This is insane, Groq is the #4 API on this list! 😮

OpenAI, Anthropic, and Azure are the top 3 LLM API providers on LangChain

Groq is #4, close behind Azure.

Google, Amazon, Mistral, and Hugging Face are the next 4.

Ollama is for local development.

Now add three more 747s' worth of LPUs 😁
January 7, 2025 at 4:04 PM
Groq just got a shout out on the All-In pod as one of the big winners for 2025 alongside Nvidia. It’s the year of the AI chip and ours is the fastest 😃
January 5, 2025 at 12:09 AM
Welcome to Shipmas - Groq Style.

Groq's second B747 this week. How many LPUs and GroqRacks can we load into a jumbo jet? Take a look.

Have you been naughty or nice?
December 24, 2024 at 3:44 PM
Santa rented two full 747s this week to make his holiday deliveries of GroqRacks. Ho ho ho! 🎅
December 23, 2024 at 5:47 PM
(5/5) Learning: product-led growth works, even when your product is too large and expensive to give away for free; you just have to be more creative about it.
December 10, 2024 at 3:44 PM