Yacine
@yacinemahdid.bsky.social
That’s how I start all my YouTube tutorials about the latest deep learning architectures.
May 20, 2025 at 8:08 PM
Started coding a whole game with the kids, whew this is fun!
May 4, 2025 at 5:50 PM
I just asked ChatGPT to help me set up the boilerplate for a Python script that makes use of their API.

1. The secret is pasted straight into the file, no environment management (see the sketch below).
2. The code is for a deprecated API.

What a vibe.
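
For the record, here's a minimal sketch of what that boilerplate should look like instead, with the secret read from the environment; this assumes the openai>=1.0 Python client, and the model name is a placeholder:

```python
import os

from openai import OpenAI

# Read the secret from the environment instead of hardcoding it.
# Set it beforehand with: export OPENAI_API_KEY="sk-..."
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Current chat completions API, not the deprecated Completion endpoint.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```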
April 1, 2025 at 3:27 PM
2025 will be the year of linear attention, I feel it.
April 1, 2025 at 2:12 AM
There is an exhilarating feeling in finally understanding a whole line of research after a few weeks of study.

It’s like a flash where every paper, formula, and piece of code you’ve seen comes flooding back all at once in its correct form.
April 1, 2025 at 1:51 AM
Most foundation models use softmax attention, which scales quadratically with input length, a major bottleneck.

Linear attention has existed since 2020, yet large-scale models rarely use it. Why?

minimax-01 finally makes linear attention work at scale. Deep dive here: 📌 youtu.be/iRuvGU-Sk3c
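
For a feel of the trick, here's a minimal sketch contrasting the two, using the kernelized formulation from Katharopoulos et al. (2020) with its elu+1 feature map; this is the generic recipe, not minimax-01's exact variant:

```python
import torch

def softmax_attention(q, k, v):
    # Standard attention: materializes an (n x n) score matrix,
    # so compute and memory scale quadratically with length n.
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return torch.softmax(scores, dim=-1) @ v

def linear_attention(q, k, v, eps=1e-6):
    # Kernelized attention: with a positive feature map phi,
    # softmax(QK^T)V is approximated by phi(Q) @ (phi(K)^T V),
    # which never forms the n x n matrix.
    phi = lambda x: torch.nn.functional.elu(x) + 1
    q, k = phi(q), phi(k)
    kv = k.transpose(-2, -1) @ v               # (d x d), independent of n
    z = q @ k.sum(dim=-2).unsqueeze(-1) + eps  # per-row normalizer
    return (q @ kv) / z
```

The kv term stays d x d no matter how long the sequence is, so the cost drops from O(n²d) to O(nd²).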
March 31, 2025 at 2:16 PM
I'm back for the weekly deep-learning study session! ✨

Sorry for the month-long break; I was a bit overwhelmed with lots of things at work.

I'll try to move around the schedule a bit so that more people in different time zones can attend.

📸 PS: I gave a talk at a conference in February!
March 17, 2025 at 3:31 PM
Lots of confusion out there about what AI Engineering actually is.

What's an agent, what's a workflow, what's an agentic system, etc.

I made this tutorial on the topic, packed with information from the latest HuggingFace research.

Check it out over here:
youtu.be/UMYKjT9exb4

Enjoy! 🌹
March 10, 2025 at 2:51 PM
This is the kind of research we need more of:
Ever looked at LLM skill emergence and thought 70B parameters was a magic number? Our new paper shows sudden breakthroughs are samples from bimodal performance distributions across seeds. Observed accuracy jumps abruptly while the underlying accuracy DISTRIBUTION changes slowly!
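
To make that concrete, here's a toy simulation (my own sketch, not the paper's code): the weight on a "high" accuracy mode grows smoothly with a scale proxy, yet any single seed looks like an abrupt breakthrough:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model of the claim: at each scale, a training seed lands in a
# "low" or a "high" accuracy mode. The probability of the high mode
# grows smoothly, but one observed run jumps abruptly from ~5% to ~90%.
for p_high in np.linspace(0.0, 1.0, 11):    # smooth shift in mixture weight
    seeds = np.where(rng.random(100) < p_high,
                     rng.normal(0.90, 0.02, 100),   # high mode
                     rng.normal(0.05, 0.02, 100))   # low mode
    print(f"p_high={p_high:.1f}  mean={seeds.mean():.2f}  one run={seeds[0]:.2f}")
```

The per-scale mean drifts smoothly while the single-run column flips between ~0.05 and ~0.90, which is exactly what a "sudden emergence" plot would show.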
February 25, 2025 at 11:07 PM
The state of AI/consciousness discourse:
February 25, 2025 at 11:05 PM
Reposted by Yacine
you fucked up a perfectly good computer is what you did. look at it. it's got innumeracy
February 12, 2025 at 7:36 PM
Wouldn’t it be funny if we never reach AGI because of the short-term incentive to keep going with Transformers?

Then we patch the whole thing left and right to keep the illusion of general intelligence with massive injections of capital.

Literally yeeting the AI field into a local minimum and digging.
February 10, 2025 at 12:35 AM
This is so well put, must read!
I keep hearing from healthcare #AI companies that Large Language Models ( #LLMs ) can be made to be "deterministic" as part of arguments around safety. I thought I'd do a little ranty 🧵 to explain why (a) it's not true in the real world, and (b) it's not even the right question.

1/
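
One concrete reason it fails in the real world (my illustration, not necessarily the thread's): floating-point addition is not associative, and the summation order inside GPU kernels varies with batch composition and scheduling, so logits can differ at the last bit between otherwise identical runs:

```python
# Floating-point addition is not associative: the result depends
# on the order in which the terms are reduced.
a, b, c = 0.1, 0.2, 0.3
print((a + b) + c)  # 0.6000000000000001
print(a + (b + c))  # 0.6
```

A last-bit difference is enough to flip a greedy argmax between two near-tied tokens, even at temperature 0.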
February 9, 2025 at 3:30 PM
OpenAI is getting absolutely cooked right now.
Crazy how we went from the darling of AI to a company researchers loathe.

not a good vibe.
January 29, 2025 at 12:08 AM
The one thing I dislike about the current version of OpenAI is how surface-level they are in their research comms.

They keep hinting at big breakthroughs, but man, look at the landscape.

Every competitor around is stacked with billions and PhDs.

Whatever they are trying to win won’t be achieved through secrecy.
January 6, 2025 at 2:02 AM
Reposted by Yacine
how i'd learn machine learning in 2025 if i had to start from scratch:

1. find a log that i initially planned to turn into a table leg
2. make it into a puppet that can walk and talk
3. have the puppet, through a series of adventures, turn into a real boy and realize the true value of friendship
January 5, 2025 at 10:25 PM
Reposted by Yacine
Just came across this interesting blog post on the job market for new PhD grads in AI: kyunghyuncho.me/i-sensed-anxiety-and-frustration-at-neurips24

The argument feels pretty reasonable. Here is my take: (1/6)

#MLSky #NeuroAI 🧠📈
January 3, 2025 at 4:02 PM
Reposted by Yacine
🐈‍⬛🤍.
December 31, 2024 at 3:27 PM
Reposted by Yacine
Our new paper! "Analytic theory of creativity in convolutional diffusion models", led expertly by @masonkamb.bsky.social
arxiv.org/abs/2412.20292
Our closed-form theory needs no training, is mechanistically interpretable, and accurately predicts diffusion model outputs with a high median r^2 ~ 0.9
December 31, 2024 at 4:54 PM
Reposted by Yacine
Do this in 2025:
December 30, 2024 at 8:32 AM
Reposted by Yacine
Here's my end-of-year review of things we learned about LLMs in 2024 - we learned a LOT of things simonwillison.net/2024/Dec/31/...

Table of contents:
December 31, 2024 at 6:10 PM
100%, this is messed up and will have massive consequences.

We’re going to see more and more heavily gated web communities.
joao @joao.omg.lol · Dec 30
Seriously, it seems everything around LLMs works by messing up the social contract; it's outright predatory towards things like the small web and people who just want to share the neat things they learn or do
Source: news.ycombinator.com/item?id=4254...
December 31, 2024 at 12:43 AM
Reposted by Yacine
It is roughly 10 lines of code to go from 1 GPU to N GPUs with PyTorch DDP. Pointing this out so that everyone is aware and doesn't shy away from scaling their code.
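
For anyone who hasn't tried it, a minimal sketch of those lines (stand-in linear model; launch with torchrun --nproc_per_node=N script.py):

```python
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(10, 1).cuda(local_rank)  # stand-in model
    model = DDP(model, device_ids=[local_rank])      # gradient sync is now automatic

    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    x = torch.randn(32, 10, device=local_rank)
    loss = model(x).pow(2).mean()
    loss.backward()  # DDP all-reduces gradients here
    opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```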
December 30, 2024 at 7:36 PM