Thông Nguyễn
machine1235.bsky.social
I like functions. I have trained functions to play Go, generate speech, and more. I also wrote a JAX library to train functions. I want to understand these functions.
Using pdb is the most important skill a Python developer can learn.
April 21, 2025 at 3:51 PM
Oh hey, I have a new weekend project: Implementing GRPO from scratch with (almost) zero dependencies.
github.com/policy-gradi...
GitHub - policy-gradient/GRPO-Zero: Implementing DeepSeek R1's GRPO algorithm from scratch
April 20, 2025 at 3:16 PM
I've been thinking a lot about training LLMs with reinforcement learning lately. One thing that surprises me is how easy it is to train LLMs to generate chain-of-thought reasoning using RL, even with extremely simple algorithms like GRPO, which is essentially just the vanilla REINFORCE algorithm.
April 20, 2025 at 3:07 PM
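To make the "GRPO is essentially vanilla REINFORCE" remark above concrete, here is a minimal sketch in plain Python. It assumes a simplified reading of GRPO: sample several completions for one prompt, normalize their rewards within that group, and use the result as the advantage in a REINFORCE-style loss. The function names are hypothetical, and the sketch leaves out pieces that full GRPO implementations typically include, such as ratio clipping and a KL penalty.

```python
import math

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt.

    Assumption: advantage = (reward - group mean) / (group std + eps).
    """
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]

def reinforce_loss(log_probs, advantages):
    """REINFORCE-style loss: negative mean of advantage-weighted log-probs.

    log_probs holds one summed log-probability per sampled completion.
    """
    weighted = [lp * a for lp, a in zip(log_probs, advantages)]
    return -sum(weighted) / len(weighted)

# Four completions for one prompt: two rewarded, two not.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
loss = reinforce_loss([-3.2, -4.1, -2.7, -3.9], advantages)
```

Completions that beat the group average get a positive advantage and have their log-probability pushed up; the rest are pushed down. If every completion in a group gets the same reward, all advantages are zero and that group contributes no gradient.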
every time i look at it, it still amazes me how ridiculously simple flow matching training is...
December 24, 2024 at 5:36 PM
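As an illustration of how little the flow matching training objective needs, here is a minimal one-dimensional sketch in plain Python. It assumes the common linear-interpolation variant: draw noise x0, pair it with a data point x1, pick a random time t, and regress the model's output at the interpolated point onto the constant velocity x1 - x0. The names `flow_matching_example`, `flow_matching_loss`, and `model` are hypothetical, and a real setup would use a neural network with an autodiff library rather than a plain callable.

```python
import random

def flow_matching_example(x1, rng):
    """Build one training example for a scalar data point x1.

    Assumption: straight-line path x_t = (1 - t) * x0 + t * x1,
    whose velocity is the constant x1 - x0.
    """
    x0 = rng.gauss(0.0, 1.0)       # noise sample
    t = rng.random()               # time drawn uniformly from [0, 1)
    xt = (1.0 - t) * x0 + t * x1   # point on the path at time t
    target = x1 - x0               # velocity the model should predict
    return xt, t, target

def flow_matching_loss(model, data, rng):
    """Mean squared error between predicted and target velocities."""
    total = 0.0
    for x1 in data:
        xt, t, target = flow_matching_example(x1, rng)
        total += (model(xt, t) - target) ** 2
    return total / len(data)

rng = random.Random(0)
data = [2.0, 2.5, 3.0]
# A placeholder "model" that always predicts zero velocity.
loss = flow_matching_loss(lambda xt, t: 0.0, data, rng)
```

Training then amounts to minimizing this loss over the model's parameters. Sampling would integrate the learned velocity field from noise at t = 0 to t = 1.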
I was thinking about the following question today: What makes diffusion models better than GANs in generative modeling? 🤔 1/4
November 29, 2024 at 5:34 AM
There's some confusion about the "wall" that many AI people are talking about. First, it's not an AI winter, far from it. AI progress is rapid: we're seeing breakthroughs in music generation, image generation, text-to-speech, video generation, protein folding, and more. We're in a golden age of AI.
November 23, 2024 at 5:42 PM
If scaling LLM pre-training is hitting a wall of diminishing returns, as we have already trained models on all the data available on the internet, what will help us move forward? 🤔

I've been thinking about this question for a while, and I believe the way forward is scaling LLM post-training.
November 22, 2024 at 9:49 AM
A starter pack for people who love training and understanding big functions:
go.bsky.app/6NeJ1FW
November 22, 2024 at 7:13 AM
Check out this essay by the writer Steven Johnson for a really insightful take on LLMs with long context windows: thelongcontext.com.

Steven is one of the people behind NotebookLM, an app created by Google that helps you organize information and conduct research on particular topics.
You Exist In The Long Context
Thoughts on the quiet revolution of long-context AI models, from NotebookLM's Editorial Director Steven Johnson.
November 21, 2024 at 4:21 PM