rynr
rynr.dev
‼️
"For contracts over 64 GPUs, we offer guided site visits to our Ulaanbaatar facility"

Very appealing, where do I sign
December 2, 2025 at 8:54 PM
Reposted by rynr
what's the reason for the season? I'm not sure, but I do know that SANTA CLAUS IS MADE OF FRACTAL PENISES
November 26, 2025 at 11:13 AM
Primitive is a type of bacon made from lean, boneless pork loin, cured in a brine, and then rolled in cornmeal.
June 8, 2025 at 1:51 PM
Reminded
June 8, 2025 at 1:50 PM
What about with model roulette?
June 8, 2025 at 1:49 PM
Advertisements 2
June 7, 2025 at 6:26 PM
Then I will observe
June 5, 2025 at 3:46 AM
Can you prove it?
June 5, 2025 at 3:45 AM
May 24, 2025 at 6:17 PM
"Intelligence takeoff" scenario actually much easier if we just build a rat bureaucracy, 10 trillion strong
May 24, 2025 at 6:12 PM
Neat, never heard of that

Is there anywhere I can find media coverage about this that isn't a press release or clickbait about "controversial startup is doing something scary"?
May 15, 2025 at 1:24 AM
He called it impressive research work

Not sure what the gripe is here, it's not like this is meant to be a playable replacement for Quake
April 7, 2025 at 8:25 PM
Sycophant model. They need to make one that hates Mondays and has no patience for humor
January 30, 2025 at 4:52 PM
To try to estimate DeepSeek's own energy usage, we could compare their pricing with energy prices

Their most expensive output tokens (once the current promotional discount ends) cost $2.19/million tokens (~11hrs of continuous output?)

The figure I've found for industrial energy pricing in Hangzhou is $0.091/kWh
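
Rough sketch of that comparison, using only the numbers above. It assumes every dollar of the token price goes to electricity, which it certainly doesn't, so the result is a ceiling rather than an estimate. The ~11hrs figure is my own guess, and the function names are just for illustration.

```python
# Figures quoted in the post; the 11-hour figure is a rough guess, not a measurement.
PRICE_USD_PER_MILLION_TOKENS = 2.19   # most expensive output tier, after the discount ends
ELECTRICITY_USD_PER_KWH = 0.091       # industrial rate found for Hangzhou
HOURS_PER_MILLION_TOKENS = 11         # assumed continuous generation time


def energy_ceiling_kwh(revenue_usd: float, usd_per_kwh: float) -> float:
    """kWh the revenue could buy if all of it were spent on electricity."""
    return revenue_usd / usd_per_kwh


def tokens_per_second(tokens: int, hours: float) -> float:
    """Generation rate implied by producing `tokens` over `hours`."""
    return tokens / (hours * 3600)


kwh = energy_ceiling_kwh(PRICE_USD_PER_MILLION_TOKENS, ELECTRICITY_USD_PER_KWH)
avg_kw = kwh / HOURS_PER_MILLION_TOKENS
tps = tokens_per_second(1_000_000, HOURS_PER_MILLION_TOKENS)

print(f"Energy ceiling per million tokens: {kwh:.1f} kWh")
print(f"Average power draw at that ceiling: {avg_kw:.2f} kW")
print(f"Implied output rate: {tps:.1f} tokens/s")
```

So the price caps things at about 24 kWh per million output tokens, or a bit over 2 kW sustained for a single output stream. Real usage should be well under that, since the price also has to cover hardware and everything else.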
January 29, 2025 at 7:26 PM
I haven't found anyone sharing data for the full model running on GPUs yet 🙁

I found this for Apple silicon, but it's worth mentioning this is a 3-bit quantization

Sorry for the X link

x.com/awnihannun/s...
January 29, 2025 at 7:20 PM
Those are the distilled models; the full model accessible through the website/app would need multiple consumer GPUs
(Or one MacBook, if you're okay with slow performance)

The larger distilled models are still very impressive from what I've read
January 29, 2025 at 5:57 PM
What evidence would work other than someone reproducing the training run with another $5m in GPU hours?

I guess they could publicize their utility bills
January 29, 2025 at 5:47 PM
Sorry about that, I should've been more helpful

The weights DeepSeek published are just sets of numbers, so if you have the VRAM to run the model on your local machine, there's not much of a program that could even be spyware - it's just matrices that your GPU runs math on
January 27, 2025 at 10:38 PM
It's matrices
January 27, 2025 at 10:31 PM
So basically, the electricity usage for training can be accurately estimated & the electricity usage for inference can be directly measured by anyone willing to use the VRAM

& that data could be cross-referenced with however carbon-intensive the grid is in Hangzhou
January 27, 2025 at 10:30 PM