jebediah98.bsky.social
@jebediah98.bsky.social
For (certain subfields of) maths and some other fields, old books (out of copyright) can still be good. For science you have to be careful because it’s changed a lot in the last 100 years. Would need an alternative to textbooks for that, I agree.

Also btw the name of the model is truly legendary.
November 11, 2025 at 11:46 AM
The name is simply unmatched. I think their methodology probably scales quite high too, possibly even to 120B MoE size, i.e. a way to get to the frontier with data that can be released.
November 11, 2025 at 11:28 AM
I was intuitively thinking Wikipedia -> Selected Textbooks?

200B params -> 5T tokens (same as the current fully open SOTA pretrains like Marin/AllenAI).
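(Rough maths on that pairing, with the ~20 tokens-per-parameter Chinchilla-style heuristic as my own assumption, not a quoted figure:)

```python
# Back-of-envelope check of a 200B -> 5T pairing against the
# Chinchilla-style heuristic of ~20 training tokens per parameter.
# The ratio is an illustrative assumption, not a quoted figure.
params = 200e9          # 200B parameters
tokens_per_param = 20   # rough "compute-optimal" ratio
print(f"{params * tokens_per_param / 1e12:.0f}T tokens")  # -> 4T, so ~5T ballpark
```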

Your method can also scale far further because it provides a methodology for compliant licensing.
November 11, 2025 at 11:16 AM
Amazing work! This looks like the secret to making smaller models.

I wonder if this would scale to a larger model at gpt-oss-120b level capability. It would need more tokens, but assuming similar scaling, far less than 2.1 million H100-hours’ worth of compute.
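For scale, a rough estimate using the standard FLOPs ≈ 6·N·D approximation; the throughput and utilisation numbers below are purely my assumptions:

```python
# Rough training-cost estimate via FLOPs ~= 6 * N_active * D.
# Peak throughput and MFU are assumed, illustrative values.

def h100_hours(active_params: float, tokens: float,
               peak_flops: float = 1e15,  # ~H100 dense BF16 peak (approx.)
               mfu: float = 0.3) -> float:  # assumed utilisation
    flops = 6 * active_params * tokens
    return flops / (peak_flops * mfu) / 3600

# Hypothetical MoE with ~5B active params trained on 5T tokens:
print(f"{h100_hours(5e9, 5e12):,.0f} H100-hours")  # ~139,000, well under 2.1M
```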
November 11, 2025 at 10:00 AM
Monopolisation was and still is my greatest fear. I agree that the risk is looking reduced now compared to a year ago though.
November 9, 2025 at 2:56 AM
Ok yeah this got me!
October 31, 2025 at 10:43 PM
Would be interesting to see Qwen max/coder, DeepSeek 3.2 and Minimax.
October 31, 2025 at 2:08 AM
It’s not boring though. Look at gpt-image-1, nano-banana, Sora. All have gone extremely viral.

People love it, and so there will be more. Getting users and farming engagement is a “real use case”, I’m afraid.
October 13, 2025 at 8:38 AM
SCP-10110: The Infinite Engagement Engine.
October 13, 2025 at 8:21 AM
Representing this as a matrix is very impractical, but surprisingly it can be done: arxiv.org/pdf/2410.02724
October 4, 2025 at 9:33 AM
No, the technically correct version is the one where the “state” is just all the tokens so far. Then it’s Markov in the sense that an LLM represents a “state transition” from n tokens to n+1 tokens. In this framing the LLM has no memory (the weights stay the same); it’s just moving around the space of token strings.
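A minimal sketch of that framing (the toy distribution is a placeholder, obviously not a real LLM):

```python
# "LLM as Markov chain": the state is the entire token sequence, and one
# step of the chain appends a sampled next token. The transition function
# (the frozen LLM) never changes, so all memory lives in the state itself.
import random

def next_token_dist(state: tuple[str, ...]) -> dict[str, float]:
    # Placeholder for a frozen LLM: state -> distribution over next tokens.
    return {"a": 0.5, "b": 0.5}

def step(state: tuple[str, ...]) -> tuple[str, ...]:
    dist = next_token_dist(state)
    tok = random.choices(list(dist), weights=list(dist.values()))[0]
    return state + (tok,)  # new state = old state plus one token

state: tuple[str, ...] = ("<bos>",)
for _ in range(5):
    state = step(state)
print(state)
```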
October 4, 2025 at 9:15 AM
Science is an attempt to find and understand a pre-existing ground truth. That underlying truth does exist and isn’t a social construction.
October 4, 2025 at 9:06 AM
The progress of Alibaba/Qwen is remarkable. I remember running Qwen 1.5 locally at about 1 tok/s not too long ago.

Now they’re basically frontier and the OSS leader too. They clearly have the right internal setup to adopt new developments very quickly. Unusual for a big corp.
October 2, 2025 at 9:34 AM
OpenAI “good”
October 1, 2025 at 11:59 PM
OpenAI better
October 1, 2025 at 11:56 PM
Huang is on the right side of history there. Hopefully this encourages other semiconductor manufacturers, and maybe even cloud providers, to support more open stuff.

After all, lots of models and lots of users means lots of customers and lots of money…
September 29, 2025 at 1:22 PM
If you do manage to get one, can you post again to explain how you did it?
September 24, 2025 at 12:05 PM
Sweet, I need to read it properly then.
August 3, 2025 at 10:57 PM
ARC-AGI is verifying it now, so I heard. Any idea how adaptable this is to language models?
August 3, 2025 at 10:53 PM
Agree completely. “It’s just RL”, again. We in the open community must not psyop ourselves for months this time…

I think the thing here will simply be a verifier for proof correctness.

Also probably a truly astronomical amount of test time tokens.
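Something like this, as a sketch of what I’d guess the reward looks like (the Lean checker invocation is my assumption, not anything confirmed):

```python
# Hedged sketch of a verifier-based RL reward for theorem proving:
# reward 1.0 iff the candidate proof type-checks. Assumes a Lean 4
# `lean` binary on PATH; the details are illustrative guesses.
import pathlib
import subprocess
import tempfile

def proof_reward(lean_source: str) -> float:
    with tempfile.TemporaryDirectory() as d:
        f = pathlib.Path(d) / "Candidate.lean"
        f.write_text(lean_source)
        result = subprocess.run(["lean", str(f)], capture_output=True)
        return 1.0 if result.returncode == 0 else 0.0
```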
July 19, 2025 at 3:47 PM
DeepSeek is already about 1/4 of tokens on OpenRouter. A cheap, highly capable model where you can’t be rug-pulled on access or price later.

Yeah honestly a smart businessman should only build on an open model, imo.
July 15, 2025 at 9:38 AM
I think using an open-source one like Roo Code or opencoder would probably work better. CC is heavily optimized for Claude itself.
July 13, 2025 at 9:39 PM
I’m having so much more fun with Kimi. It “feels” like talking to Opus. I know it’s just a vibe, but this is really new for open-weights LLMs.
July 13, 2025 at 9:16 PM
They so desperately want to be smart.
July 1, 2025 at 9:38 AM
8/10, but it’s really difficult to tell genuine slop from AI slop.
June 30, 2025 at 1:43 PM