prescod.bsky.social
@prescod.bsky.social
Academics think they will take all of the jobs because they plan to grow them far beyond anything recognizable as "LLMs", until they actually can do all of the jobs.

But in the meantime: a lot of jobs are actually a lot like homework.
January 27, 2025 at 2:17 AM
Weeks ago it was just a list of things designed to anger. Now it is policy harming real individual human beings and soon everyone’s economy and livelihoods.
January 21, 2025 at 3:33 AM
Interesting. Why?
January 20, 2025 at 5:33 PM
It looks to me as if it will be easier to build intelligence and then study it in a lab than to observe it with primitive tools, understand it, and then build it.

The engineering seems to progress much faster than the first principles research, but I’d be happy to be proven wrong.
January 5, 2025 at 6:26 PM
Thank you.
January 5, 2025 at 5:52 PM
Even if that were true, which it is not, it would be irrelevant to the larger question, which is what we should expect for the future of test-time compute (TTC) à la o3.

o3 is the GPT-2 of TTC. Costs will come down. Quality will go up. Efficiency will go up. Open source will emerge.
January 4, 2025 at 8:30 PM
By the way, it illuminates a lot about your mindset when you mock people for having a life outside of social media. Maybe you should try it yourself and you wouldn’t need to be so angry and unhappy.
January 4, 2025 at 7:49 PM
So you admit that the trend over the last several years has been towards lower costs, higher token quality, and lower latency.

But you maintain that, in the face of the existence of gpt-4o, DeepSeek V3, and Claude 3.5 Sonnet, this process stopped in March of 2023.

You think gpt-4 was the pinnacle?
January 4, 2025 at 7:47 PM
Please don’t let the haters bring you down. People have extremely irrational views about these technologies. Both pro and con.
January 4, 2025 at 7:15 PM
You said price per token is negatively correlated with quality of tokens. gpt-4o is cheaper than Babbage was.

So your assertion is wrong.

Sticking to the gpt-4 lineage, you are still wrong: 4o is better (according to my private evals and most people’s public evals) and cheaper than the GPT-4 of last year.
January 4, 2025 at 7:13 PM
These extremely expensive, highly reliable models will be used to train the mainstream models that we use every day to do these right/wrong tasks much more reliably than they do today. Subjective tasks may not improve much but they are already decent at most such tasks of industrial value. 2/2
December 21, 2024 at 11:13 AM
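An illustrative sketch of the training idea in the post above: an expensive, highly reliable teacher model supervising a cheaper student. This shows the classic logit-matching form of knowledge distillation (in practice the big model is often used instead to generate fine-tuning data); all names, shapes, and hyperparameters below are placeholders, not anyone's actual pipeline.

```python
# Minimal knowledge-distillation sketch (teacher -> student), assuming PyTorch.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend soft-label KL against the teacher with hard-label cross-entropy."""
    # Soften both distributions so the student learns the teacher's
    # relative preferences, not just its top choice.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean")
    kd = kd * temperature ** 2  # standard rescaling for the temperature
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce

# Toy usage with random tensors (batch of 4, vocabulary of 10):
student_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
print(float(loss))
```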
If you truly believe that gpt-4o-mini is worse than babbage-002 was, then I am talking to a brick wall and I will just stop now.

I eval these things literally every day against my private dataset and custom task. Not sure what motivates a person to state what everyone knows is false. Carry on.
December 21, 2024 at 11:02 AM
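A toy sketch of the kind of head-to-head eval the post above describes, assuming the OpenAI Python SDK (>= 1.0). The two-item dataset and substring grading are hypothetical stand-ins, not the private evals mentioned in the post; swap in your own prompts and scoring.

```python
# Compare a chat model against a legacy completions model on the same prompts.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical exact-answer task: each item is (prompt, expected answer).
DATASET = [
    ("Q: What is the capital of France?\nA:", "Paris"),
    ("Q: What is 12 * 12?\nA:", "144"),
]

def ask_chat_model(model: str, prompt: str) -> str:
    """Query a chat-style model (e.g. gpt-4o-mini)."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=16,
        temperature=0,
    )
    return resp.choices[0].message.content.strip()

def ask_completion_model(model: str, prompt: str) -> str:
    """Query a legacy completions model (e.g. babbage-002)."""
    resp = client.completions.create(
        model=model, prompt=prompt, max_tokens=16, temperature=0
    )
    return resp.choices[0].text.strip()

def accuracy(answers: list[str]) -> float:
    hits = sum(expected.lower() in got.lower()
               for (_, expected), got in zip(DATASET, answers))
    return hits / len(DATASET)

chat_answers = [ask_chat_model("gpt-4o-mini", p) for p, _ in DATASET]
legacy_answers = [ask_completion_model("babbage-002", p) for p, _ in DATASET]

print("gpt-4o-mini:", accuracy(chat_answers))
print("babbage-002:", accuracy(legacy_answers))
```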
So much hostility and so little imagination. Go back to Twitter.
December 21, 2024 at 4:39 AM
x.com/tsarnick/sta...

100x drop in 18 months.
December 21, 2024 at 4:37 AM
Okay fine: with the research time I have available, I can “only” demonstrate a 100x reduction.

www.reddit.com/r/mlscaling/...

So maybe I’ll need to wait until next summer to demonstrate 1000x. Or maybe tomorrow I’ll have time to find a good example of distillation that is 1000x.
Andrej Karpathy: GPT-2 (124M) in llm.c, in 90 minutes for $20
December 21, 2024 at 4:20 AM
“How will this computer thing have any relevance to society? They each require a large room with dedicated operators.”

The technique works.

In 2 years it will cost $1720, and 2 years after that it will be $17.20.

Have a bit of foresight.
December 21, 2024 at 3:39 AM
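A back-of-the-envelope version of the projection in the post above: assuming a starting cost of roughly $172,000 (the figure implied by the $1720 and $17.20 numbers at the same rate) and a roughly 100x drop every two years, in line with the 100x-in-18-months drop cited upthread. Both numbers are assumptions for illustration only.

```python
# Extrapolate the implied cost curve: assumed ~100x reduction every two years.
cost_now = 172_000.0      # assumed starting cost, in dollars
drop_per_two_years = 100  # assumed reduction factor per two-year period

for years in (0, 2, 4):
    cost = cost_now / drop_per_two_years ** (years / 2)
    print(f"year +{years}: ${cost:,.2f}")
# year +0: $172,000.00
# year +2: $1,720.00
# year +4: $17.20
```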
Human beings also study for tests.

We’ve crossed an important threshold when one can fine-tune a model for any task that a human can do.

It’s not the same as AGI but it is an important milestone.

Few jobs can be automated away with O3-ish technology but most tasks in each job can be (w/ $$$)
December 21, 2024 at 3:30 AM