Lon
ryukn.bsky.social
Software Engineer specializing in Machine Learning in Tokyo, Japan.
Working in the field of AI for science.

Opinions are my own.
I’m getting out-of-memory (OOM) errors when trying to run the gpt-oss 20b model on Google Colab’s free T4 GPU.
Are there any good workarounds for this?
August 16, 2025 at 5:14 AM
Just read this article: magazine.sebastianraschka.com/p/from-gpt-2...
Since I recently tried implementing the GPT-2 architecture from scratch, I found the article's approach of highlighting how gpt-oss differs from GPT-2 easy to follow. As the article mentions, GPT-2 is a great starting point.
From GPT-2 to gpt-oss: Analyzing the Architectural Advances
And How They Stack Up Against Qwen3
August 11, 2025 at 1:22 PM
Recently re-implemented GPT-2 from scratch! Before diving in, I only had a vague understanding of how LLMs work, and I'm still learning this field. But after the hands-on implementation, my grasp of embeddings, multi-head attention, and GPU parallelization became much clearer. Really grateful for the tutorial!
August 2, 2025 at 11:19 AM