Avik Dey
@avikdey.bsky.social
Mostly Data, ML, OSS & Society.

Stop chasing Approximately Generated Illusions; focus on Specialized Small LMs FTW.

If you can't explain it simply, you don't understand it well enough.

Shadow of https://linkedin.com/in/avik-dey, except I have a beard now.
Pinned
Alignment isn't the only thing LLMs are faking. Reasoning is another one they are good at faking. Reading a paper on LLM performance on doctors' reasoning tasks. Just started reading, but it's either going to be:
1. Memorization,
2. Priming, or
3. Confirmation prompting

www.anthropic.com/research/ali...
Alignment faking in large language models
A paper from Anthropic's Alignment Science team on Alignment Faking in AI large language models
www.anthropic.com
In this age of AI, don’t be a follower. Be the leader who hires engineers who build the future - because AI ain’t building jackshit for you.
November 8, 2025 at 8:36 PM
We are entering the golden age of AI “world models”, where every AI hype cycle will come proudly accompanied by its grand unified theory of everything, rigorously engineered to collapse at the first gentle poke of reality.
Browsing the arxiv paper - the architecture seems to rely heavily on the structured world model. Any additional write-up on how the world model was generated and is globally maintained?
November 8, 2025 at 4:48 PM
Classifier ≠ Human Judge

> We assess how effectively large language models generate social media replies that remain indistinguishable from human-authored content when evaluated by automated classifiers. We employ a BERT-based binary classification model to distinguish between the two text types.
LLMs are now widely used in social science as stand-ins for humans—assuming they can produce realistic, human-like text.

But... can they? We don’t actually know.

In our new study, we develop a Computational Turing Test.

And our findings are striking:
LLMs may be far less human-like than we think.🧵
Computational Turing Test Reveals Systematic Differences Between Human and AI Language
Large language models (LLMs) are increasingly used in the social sciences to simulate human behavior, based on the assumption that they can generate realistic, human-like text. Yet this assumption rem...
arxiv.org
November 8, 2025 at 5:33 AM
AI isn’t going to wound the web’s ad model - fatally or otherwise. AI companies are going to be the ones serving those ads.

I would be shocked if OpenAI isn’t already indexing the web even as I type this.
November 7, 2025 at 1:14 AM
Demo coming soon …

bsky.app/profile/avik...
November 5, 2025 at 4:06 PM
I have said this before over on the bird site (don’t want to visit to find it - it’s extra toxic right now given today’s election results), but many of the young folks in “AI” seem to suffer from the affliction of never having worked with good old ML before they got mesmerized by “AI”.
November 5, 2025 at 5:07 AM
Every author writing like this should be required to rewrite their abstract in plain English and read it aloud to an audience of their peers before they can publish.

Summary: Conjectural with nice diagrams but no quantitative measures and ignores prior literature.

arxiv.org/pdf/2510.26745
November 3, 2025 at 4:14 PM
Unfortunately, at this point, any ‘AxI’ naming is tainted. Whether we use “General”, “Super”, “Hyper” or [Insert], it’s an academic distinction without real-world difference. That it also attempts to name a class of models that don’t actually exist reinforces the conjuring of hype over substance.
The origin of the term AGI by @stevenlevy.bsky.social: I have worked in AI for 50 years and still think we were chasing what AGI claims to be chasing. Meta now chases ASI - Artificial Super Intelligence. I think we should all be chasing AHI - Artificial Hyper Intelligence. www.wired.com/story/the-ma...
The Man Who Invented AGI
Everyone is obsessed with artificial general intelligence—the stage when AI can match all feats of human cognition. The guy who named it saw it as a threat.
www.wired.com
November 2, 2025 at 7:12 PM
“LLMs have no utility” - is not something I can subscribe to.

“LLMs have low utility relative to investment” - is my stance.

Utility is tied to cost. Cut LLM spend by 1000x with SSLMs and the value equation shifts. Smaller, cheaper, task-tuned models FTW.

Yes, you still have to pay the humans. Sorry?
November 2, 2025 at 3:46 PM
Karpathy’s tweet is a live demo of the learning loop he promotes. Consciously or not, he is channeling:

- Kolb: Experiential learning theory
- Feynman: Explain in your own words
- Dweck: Growth mindset scale

The medium is the message.
November 1, 2025 at 4:16 PM
Reposted by Avik Dey
In the research for Computing, my multi-part documentary that examines the intersection of computing and what it means to be human, I've collected almost 6,000 books to help inform my storytelling. You can browse my entire collection here
https://www.librarycat.org/lib/gbooch
October 31, 2025 at 11:23 PM
Don’t have an exact number, but 150+ trick or treaters tonight. One of them:

K (Kid): Trick or treat?
M (Me): Trick.
K: Huh?
M: What’s the trick?
K: You give me candy.

She was the youngest one of the evening. Cutie pie at her best!
November 1, 2025 at 4:58 AM
October 31, 2025 at 9:11 PM
In their own words:

“Several caveats should be noted: The abilities we observe are highly unreliable; failures of introspection remain the norm.”

transformer-circuits.pub/2025/introsp...
Emergent Introspective Awareness in Large Language Models
transformer-circuits.pub
October 30, 2025 at 8:11 PM
You know how they are going to react to this? Lay off another 100k human engineers to build another super data center, because they are convinced they are on the brink of a breakthrough - it just needs a bit more juice.

Newsflash, boys: AI broke through back in 2023; now you are just chasing the ghost of AI.
October 30, 2025 at 12:20 AM
Reposted by Avik Dey
When did making kids go hungry become a Christian value?
October 27, 2025 at 11:17 PM
Open source xLMs are rapidly closing the gap with foundation models, especially for custom tasks. Stop wiring directly to vendor APIs.

Yes, it might delay your product launch by a few months, and that may be a deal breaker for startups. But if you are an enterprise, you should have no excuse.
October 27, 2025 at 2:16 PM
On Bsky when posting about scientific papers or articles - I have two modes:

1. Has substance? Put on my scholar hat, assess it in its own register and respond with rigor.

2. Is basic? Hat stays off and I keep it casual because it’s not worth the time.

On “AI” these days, most fall into the 2nd.
October 27, 2025 at 4:24 AM
Wait till he gets to the touch problem.
Armies of humanoid robots are poised to march into the world’s factories. But before they’re ready to turn a wrench, they must solve what Elon Musk calls “the hands problem.”
The ‘Hands Problem’ Holding Back the Humanoid Revolution
Researchers face challenges in creating robotic hands equal to the real thing, but they’re getting closer.
on.wsj.com
October 26, 2025 at 10:55 PM
The root cause was a lack of proper versioning and concurrency control in the distributed configuration management system, which violated ordering guarantees, resulting in stale state propagation with cascading service disruption.

Frankly, shocked that it hadn't happened before.
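The failure mode above can be sketched as a toy example (my own illustration, not AWS's actual mechanism - all names here are hypothetical): a config store that enforces a monotonic version check drops stale updates instead of propagating them.

```python
# Toy sketch: a config store that rejects stale writes via a
# monotonically increasing version number (optimistic concurrency).
class ConfigStore:
    def __init__(self):
        self.version = 0
        self.config = {}

    def apply(self, new_config, new_version):
        # Ordering guarantee: only accept strictly newer versions.
        if new_version <= self.version:
            return False  # stale update dropped, not propagated
        self.config, self.version = new_config, new_version
        return True

store = ConfigStore()
store.apply({"endpoint": "plan-a"}, 2)    # newer version: accepted
store.apply({"endpoint": "plan-old"}, 1)  # stale version: rejected
```

Without that version check, whichever writer lands last wins, and an older plan can silently overwrite a newer one - the stale-state propagation described in the incident summary.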

aws.amazon.com/message/1019...
Summary of the Amazon DynamoDB Service Disruption in the Northern Virginia (US-EAST-1) Region
aws.amazon.com
October 26, 2025 at 10:39 PM
Soon, AI will also deliver a baby in one month with nine women:

www.wsj.com/tech/ai/ai-r...
AI Workers Are Putting In 100-Hour Workweeks to Win the New Tech Arms Race
With expertise in the field scarce, workers in Silicon Valley are pushing themselves to extremes day after day.
www.wsj.com
October 25, 2025 at 11:19 PM
For an author of a transformative paper, it takes a lot to say this, but it’s been apparent for a while:

> "Despite the fact that there's never been so much interest and resources and money and talent, this has somehow caused the narrowing of the research that we're doing," Jones told the audience.
'Attention is all you need' coauthor says he's 'sick' of transformers

#HackerNews

https://venturebeat.com/ai/sakana-ais-cto-says-hes-absolutely-sick-of-transformers-the-tech-that-powers
October 24, 2025 at 10:19 PM
If you have ever worked with any ML model, this outcome has always been extremely predictable.

arstechnica.com/ai/2025/10/r...
October 24, 2025 at 5:19 PM
Reposted by Avik Dey
If even Karpathy can’t get AI coding to work for him, are you willing to bet on it working for you? Yours are IID, you say? You will soon find out dimensionality means yours are OOD too.
October 13, 2025 at 9:48 PM
Yeah, it should have been way more discussed than it was:

bsky.app/profile/avik...
October 22, 2025 at 10:32 PM