@thyliorus.bsky.social
No, it was only available to try out after the keynote by people who attended the event. Some journalists briefly talked about it afterwards, but didn’t share much concrete information about speed. We’ll have to wait until someone builds their own cluster and provides more detail.
February 27, 2025 at 6:57 AM
I think Framework had a demo of the full DeepSeek R1 671B running on a cluster of 4 of these.
February 26, 2025 at 4:58 PM
This is a mobile CPU, meaning it’s soldered, which is why you can’t really buy it separately; most other laptop CPU+mobo combos aren’t sold separately either.

Framework does sell the mobo+CPU+RAM combo separately though, if you want to build your own cluster of them: frame.work/gb/en/produc...
Framework Desktop Mainboard (AMD Ryzen™ AI Max 300 Series)
February 26, 2025 at 4:54 PM
Basically a “we’ve got Nvidia Project Digits at home”
February 26, 2025 at 3:27 PM
It indeed doesn’t make much sense as a regular desktop, but that AMD chip has a more powerful GPU attached than similar products, which makes it an interesting option for getting lots of “vram” at a lower price. Token/s performance is probably a lot lower than on your setup because of lower memory bandwidth, though.
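For a rough sense of how much bandwidth limits token/s: single-stream decoding is usually memory-bandwidth bound, since every generated token has to stream all of the weights from memory once. A minimal sketch of that ceiling, assuming ~256 GB/s for the Ryzen AI Max’s LPDDR5X and ~936 GB/s for a 3090 (both approximate spec-sheet numbers):

```python
# Upper bound on decode speed for a memory-bandwidth-bound LLM:
# each generated token streams all active weights once, so
# tokens/s <= bandwidth / model size. Bandwidth figures are assumptions.

def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Ceiling on single-stream decode speed (ignores KV-cache reads)."""
    return bandwidth_gb_s / model_size_gb

model_gb = 35  # ~70B params at 4-bit: 70e9 * 0.5 bytes = 35 GB of weights
for name, bw in [("Ryzen AI Max (LPDDR5X)", 256), ("RTX 3090 (GDDR6X)", 936)]:
    print(f"{name}: ~{max_tokens_per_second(bw, model_gb):.1f} tok/s ceiling")
```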
February 26, 2025 at 3:21 PM
Hmm, independent benchmarks seem to have the Qwen-32B distill about on par with 4o. Either way, it seems like it could be good hardware for self-hosted AI, but it’s still a ways away from the larger SOTA models.
February 26, 2025 at 10:23 AM
It’s a great model that outperforms o1-mini, including at coding, and it’s far better than 4o. It’s not as good as o3-mini or the full R1, but it’s great for a local model. The Q4 quantization will cost some quality, but you should also be able to run the full r1-distill-qwen-32B, which is still better than o1-mini.
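If you want to try it, a minimal sketch with llama-cpp-python; the GGUF file name is a placeholder, grab whichever quant fits your memory:

```python
# Minimal sketch: loading a quantized R1 distill with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="DeepSeek-R1-Distill-Qwen-32B-Q4_K_M.gguf",  # hypothetical path
    n_gpu_layers=-1,  # offload all layers to the GPU
    n_ctx=8192,       # context window; bigger contexts need more KV-cache memory
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a binary search in Python."}]
)
print(out["choices"][0]["message"]["content"])
```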
February 26, 2025 at 10:11 AM
Coming back to this, it might be a very budget-friendly way to get 100GB+ of VRAM: frame.work/gb/en/deskto...

You can allocate up to 110GB of the 128GB total to the GPU, making it possible to run models like deepseek-r1-distill-llama-70b at home with 4-bit quantization.
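Rough arithmetic on why it fits (a sketch; real GGUF sizes vary a bit per quant variant, and the KV-cache/overhead numbers are assumptions):

```python
# Back-of-the-envelope memory check for deepseek-r1-distill-llama-70b at 4-bit.
params = 70e9
bits_per_weight = 4.5  # Q4_K_M-style quants average a bit above 4 bits/weight
weights_gb = params * bits_per_weight / 8 / 1e9  # ~39 GB

kv_cache_gb = 5  # assumption: FP16 KV cache at a modest context length
overhead_gb = 2  # assumption: compute buffers and scratch space

total_gb = weights_gb + kv_cache_gb + overhead_gb
print(f"~{total_gb:.0f} GB needed vs 110 GB allocatable to the GPU")
```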
Order a Framework Desktop with AMD Ryzen™ AI Max 300
February 26, 2025 at 10:00 AM
Yes, it’s quite easy; see the linked Reddit post, where they’re running 3x P40s for 72GB of VRAM. If you get a good deal on the P40s, you can have three of them for three digits total. That’s a lot cheaper than you’d think.
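A sketch of what that looks like in llama-cpp-python, splitting a model evenly across the three cards (the model path is a placeholder):

```python
# Spreading one big GGUF model across 3x P40 (72 GB total) with llama.cpp.
from llama_cpp import Llama

llm = Llama(
    model_path="some-large-model-Q4_K_M.gguf",  # hypothetical file
    n_gpu_layers=-1,               # offload every layer to the GPUs
    tensor_split=[1.0, 1.0, 1.0],  # even share per card; three equal 24 GB P40s
)
```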
January 28, 2025 at 9:13 AM
People seem to be having a good time running a couple of these for running Llama models locally: www.reddit.com/r/LocalLLaMA...
Nvidia Tesla P40 performs amazingly well for llama.cpp GGUF!
January 27, 2025 at 12:17 AM
Using Nvidia P40s is a lot cheaper than using 3090s, and they have just as much VRAM.

Available for $300-400 each, with 24GB of VRAM. www.ebay.com/p/12018725644
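A quick dollars-per-GB-of-VRAM comparison (the P40 price is from the listing above; the used-3090 price is only an assumption for illustration):

```python
# Dollars per GB of VRAM; prices are rough/assumed.
cards = {
    "Tesla P40": (350, 24),  # ~$300-400 on eBay, 24 GB
    "RTX 3090":  (800, 24),  # assumed used price, 24 GB
}
for name, (price_usd, vram_gb) in cards.items():
    print(f"{name}: ${price_usd / vram_gb:.0f} per GB of VRAM")
```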
NVIDIA Tesla P40 24GB GDDR5X Graphics Card (870919-001) for sale online | eBay
January 27, 2025 at 12:14 AM
Does that also imply your heat pumps were continuously fighting against your AC, then? Otherwise I can’t imagine the temperature in your house being the same now as it was when your heat pumps were putting out 2-3x as much heat.
November 24, 2024 at 9:52 PM