This is the interview after we just launched 19,000 LPUs in Saudi Arabia. We built the largest inference cluster in the region.
Link to the interview in the comments below!
Groq's second B747 this week. How many LPUs and GroqRacks can we load into a jumbo jet? Take a look.
Have you been naughty or nice?
3 months ago: Llama 8B running at 750 tokens/sec
Now: Llama 70B running at 3,200 tokens/sec
We're still going to get a liiiiiiitle bit faster, but this is our V1 14nm LPU - how fast will V2 be? 😉