Jon Poole
jonpo.bsky.social
AI and Tech enthusiast
Last time they tried this idiocy they at least had to have a whole "Smoot-Hawley Tariff Act" voted in by a whole Senate; it wasn't just one madman running the show
April 5, 2025 at 11:13 AM
Impossible they have no cards
March 11, 2025 at 1:13 PM
Couldn't happen to a nicer billionaire
March 10, 2025 at 7:36 PM
Certainly true of services like Replika from what I've seen.
February 27, 2025 at 4:29 PM
Oh yeah, 2 shot but he got the feet on the pedals and the wings on the handlebars!
February 24, 2025 at 9:17 PM
Have you never tried to explain autoregressive language models trained with reinforcement learning to the general public in language they understand?
February 8, 2025 at 12:04 PM
Is data science a real field distinct from data work?
February 6, 2025 at 2:32 PM
Means you work in data, and you think of yourself as superior to a mere "data analyst". It's like when a software/IT guy/gal calls themselves a "computer scientist".
February 6, 2025 at 1:41 PM
As with DeepSeek, benchmarks and tweets are one thing, but the question as ever is whether it can do useful stuff for people IRL, which is decidedly non-benchmark-shaped.
February 3, 2025 at 6:04 PM
Agree it's a bad name, blame the authors of the test I guess. I'm not sure this is 'training' on the test set though, although having a private holdout set like ARC-AGI does is one way to prevent leakage.
February 3, 2025 at 5:34 PM
For specific use cases (math, coding, etc.) perhaps, but not as a daily driver.
January 28, 2025 at 8:00 AM
Yeah, hype merchants are going to hype I guess. I feel that the capabilities are pretty unevenly distributed, such that impressive apparent performance in one narrow domain doesn't mean the 'claims' are generally true.
January 22, 2025 at 8:18 AM
Well not in huge detail but...

Yeah, everyone needs their own evals I guess; my evals are probably not the same as other people's. Personally I'm not that fussed about these 'reasoning models', I'm more excited about better coding and agent/tool-using models.
January 21, 2025 at 8:14 PM
It's a valid and tricky question, but the actual process/algorithm is not really all that hidden ("Wait, maybe I should examine the R1 thought processes"). The question could be: can we find concepts that the model doesn't reason with? (Perhaps because of their absence from the training data.)
January 21, 2025 at 7:56 PM
That's an 8B model though! Less than 5GB of weights. It also gets math problems right that only o1 and Sonnet have done previously. Seems like what it does do, it's doing pretty well. If it's doing abductive and inductive reasoning steps and applying them, well, that seems like a useful step forward.
January 21, 2025 at 7:38 AM