Yacine
yacinemahdid.bsky.social
Most foundation models use softmax attention, which scales quadratically with input length, a major bottleneck.

Linear attention has existed since 2020, yet large-scale models rarely use it. Why?

minimax-01 finally makes linear attention work at scale. Deep dive here: 📌 youtu.be/iRuvGU-Sk3c
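
To make the scaling point concrete, here is a minimal NumPy sketch contrasting softmax attention with a kernelized linear attention. The feature map (elu(x) + 1, one commonly used choice), the function names, and the shapes are assumptions for illustration. This is not MiniMax-01's implementation, just the general idea.

```python
import numpy as np

def softmax_attention(q, k, v):
    # Builds an (n, n) score matrix, so time and memory grow with n^2.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

def feature_map(x):
    # elu(x) + 1: strictly positive, so the normalizer below is never zero.
    # (Assumed choice of kernel feature map.)
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(q, k, v):
    # Non-causal linear attention: compute phi(K)^T V first, a (d, d_v) matrix.
    # The (n, n) matrix is never formed, so cost grows linearly with n.
    fq, fk = feature_map(q), feature_map(k)
    kv = fk.T @ v            # (d, d_v)
    z = fk.sum(axis=0)       # (d,)
    return (fq @ kv) / (fq @ z)[:, None]

def linear_attention_quadratic(q, k, v):
    # Same kernel, but evaluated the quadratic way, for comparison only.
    fq, fk = feature_map(q), feature_map(k)
    w = fq @ fk.T            # (n, n)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

rng = np.random.default_rng(0)
n, d, d_v = 6, 4, 3          # toy sizes, chosen arbitrarily
q = rng.normal(size=(n, d))
k = rng.normal(size=(n, d))
v = rng.normal(size=(n, d_v))

# Both orderings of the matrix products give the same result.
print(np.allclose(linear_attention(q, k, v),
                  linear_attention_quadratic(q, k, v)))
```

For sequence length n and head dimension d, the softmax version costs on the order of n²·d, while the reordered linear version costs on the order of n·d². Note that linear attention is a different function from softmax attention, not a faster way of computing the same thing.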
March 31, 2025 at 2:16 PM
I'm back for the weekly deep-learning study session! ✨

Sorry for the month-long break, I was a bit overwhelmed with a lot of things at work.

I'll try to move the schedule around a bit so that more people in different time zones can attend.

📸 PS: I gave a talk at a conference in February!
March 17, 2025 at 3:31 PM
The state of AI/consciousness discourse:
February 25, 2025 at 11:05 PM
The one thing I dislike about the current version of OpenAI is how surface-level their research comms are.

They are hinting at a big breakthrough, but man, look at the landscape.

Every competitor around is stacked with billions and PhDs.

Whatever they are trying to win won’t be achieved through secrecy.
January 6, 2025 at 2:02 AM
I’m calling it, AI companies will wreck the internet-search equilibrium of the last two decades.

With the arms race to collect information to train models and keep them fresh, we’re going to see web scraping move out of the gray zone and firmly into the black.

Can’t be sustainable.
December 31, 2024 at 12:29 AM
At large enough scale, and with a few in-context examples, they can apply some transformation of the input to learn illogical connections between inputs and outputs.

Smaller LLMs just disregard the illogical connections.

Some “learning” is happening.
December 29, 2024 at 4:32 PM
The eternal dilemma.
December 25, 2024 at 8:03 PM
Like I said, it depends on a lot of things, including which model you use.

But you can check this paper for a general rule of thumb!

academic.oup.com/bioinformati...

Kind of an old one, but it gives some general guidance. (Watch out: it assumes all features are important.)
December 23, 2024 at 12:07 PM
Reading diffusion papers:
November 30, 2024 at 2:33 AM
The Jon Snow jokes on one of my latest videos were unexpected 😂
November 20, 2024 at 11:02 PM