Keith Hanson
keithhanson.bsky.social
Freelance code wizard (full stack) and AI plumber, amateur microcontroller dev, amateur CAD/OpenSCAD & 3D printing nerd, open source enthusiast, Linux graybeard, former municipal CTO and open government geek, Dad, Husband
Reposted by Keith Hanson
there's a funny thing with LLMs where FAANG had them in more-or-less modern shape for years but didn't push them because there was no product case, then OpenAI ran screaming naked into the street and they were like oh ok guess we're doing this
January 27, 2025 at 10:04 PM
Fundamentally, more hardware = faster research feedback loops = more innovation. More HW also lets them scale easily. Both absolutely critical in this AI arms race for those companies.
January 27, 2025 at 4:47 PM
That's assuming there's an old shovel somewhere as an alternative to the super deluxe mega shovel. But there's not.

Nvidia still wins even if you use their older hardware.

This also assumes the big companies will slow buying because they can. I doubt that they will give up their lead in HW ever.
January 27, 2025 at 4:45 PM
Totally agree, but we've seen that coming for a while (as we saw the improvement rate slowing, and foundation-model companies focusing their announcements on UI improvements).

I mean, 2 years ago current techniques would be seen as magical compared to what was going on then.

This just seems normal?😀
January 27, 2025 at 4:32 PM
I am having a hard time doing the mental gymnastics required for the "hurt" part of your statement, though.

Megacorp AI isn't going to stop using their hardware just because someone did it cheaper?

Better HW = faster research feedback loops. More loops = more innovation, fundamentally. So why?
January 27, 2025 at 4:25 PM
FWIW and my purely hot speculative take - I doubt this will happen for a very long time.

But who knows in this space.

The only thing I do know is that for the last two years, alternatives to Nvidia-based methods have existed, and none of them have gained much traction.
January 27, 2025 at 4:07 PM
And these companies (FB, Google, Apple, Microsoft, OpenAI, Anthropic, etc) depend on being the latest/greatest state of the art. Which means ass tons of research.

These companies have tons of cash.

As soon as we see people do this without Nvidia cards the market will freak out for real.
January 27, 2025 at 4:03 PM
Yes, way less compute needed in this circumstance. But there are many examples of efficient training and models out there already.

I do not see why this explanation is any reason for folks to believe the foundational co's will slow their buying.

People can freak out when it's done without Nvidia.
January 27, 2025 at 3:52 PM
I mean, most of us knew that there really is no moat other than scale and recurring customers, and all the niceties the large foundational companies' user interfaces provide (code artifacts, document parsing, code running, etc).

If anything, the large co will just consume the learning and continue.
January 27, 2025 at 3:49 PM
Continuing the thought, having more compute allows tighter feedback loops on your training experiments as researchers. Why would the large foundational model companies suddenly stop investing in that hardware?

DeepSeek's innovations mean they would just be able to train faster and iterate sooner?
January 27, 2025 at 3:47 PM
Ok - 100% agree with this - as long as we're assuming it's truthful. I think we can at least agree they used less.

But does that somehow signal to the megacorp AI leaders to stop using their cash on hardware? I don't think so.

That would be insane. The only edge they really have is HW.
January 27, 2025 at 3:45 PM
THANK YOU. Those of us on the ground have seen plenty of innovative techniques for training, fine-tuning, and running models on commodity hardware over the last year and a half.

Big AI companies slowing their innovation and feedback loops by slowing hardware purchasing would be insanity.
January 27, 2025 at 3:35 PM
I definitely agree the US Media / Government is likely going to see it as USA vs China, but we all win with this :P

Having _EVERYTHING_ fully open (data + training code + models, etc) means we all get in on it, and the US companies will be sure to consume all this learning as well.
January 27, 2025 at 3:25 PM
Furthermore, about once a week I see a new fine-tuned or fully trained model claiming the top of the leaderboards.

For the uninitiated, there are 1.3M AI models available on Hugging Face, built using all kinds of techniques (mostly on Nvidia hardware).

So, again, how does Nvidia lose here???
January 27, 2025 at 3:19 PM
I think the real response we're seeing in the news is that the US analysts are surprised that China is able to innovate within their restrictions. (?!)

This is the same story since time immemorial with every megacorp vs. scrappy upstart - resource constraints breed innovation.
January 27, 2025 at 3:18 PM
It does. Where are these conflicting reports? Haven't seen them.

We also know DeepSeek was using the 10,000 H100s they started with, and probably a lot more they can't talk about publicly.
January 27, 2025 at 3:10 PM
No co-working space is complete without exposed pipes. I've never visited one without them 😅
January 18, 2025 at 3:42 PM
LOL "Exposed pipes"
January 18, 2025 at 3:41 PM
Design rules in EasyEDA have my traces at 0.6mm and spacing between traces at 0.6mm. Chonky but it works 😅
December 29, 2024 at 11:30 PM
My basic process is to export a Gerber from EasyEDA, then punch everything into pcb2gcodeGUI, then use Candle to heightmap and run the gcode.
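For anyone who'd rather script this than click through pcb2gcodeGUI: the underlying pcb2gcode CLI can read the same options from a `millproject` file in the working directory. A minimal sketch using the cut parameters mentioned in these posts (0.045mm depth, 0.1mm V-bit tip, 2 extra passes); the file names, feeds, and speeds are placeholders, not values from the original posts:

```
# millproject — pcb2gcode reads this automatically from the working directory
metric=true          # all dimensions in mm
front=front.gbr      # copper Gerber exported from EasyEDA (placeholder name)
drill=drill.drl      # Excellon drill file (placeholder name)

zwork=-0.045         # isolation cut depth
offset=0.1           # isolation offset (placeholder; tune for effective V-bit width at depth)
extra-passes=2      # widen the isolation gap with 2 additional passes

zsafe=2              # travel height above the board
zchange=20           # tool-change height
mill-feed=120        # placeholder feed rate, mm/min
mill-speed=10000     # placeholder spindle RPM
```

Running `pcb2gcode` in that directory then writes the G-code files, which can be loaded into Candle for heightmapping and milling as described above.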
December 29, 2024 at 11:28 PM
This was milled with a 3018-PROVer, using this bit at 0.045mm cut depth and a 0.1mm tool diameter, with 2 extra passes:

HUHAO 5pcs V Router Bits 20 Degree Engraving Bits 1/8 Shank 2 Flutes 0.1mm Tip Engraving Tool Bits CNC V-Groove Router Bit for Acrylic Wood MDF Stainless Steel a.co/d/6e1rzDV
December 29, 2024 at 11:25 PM