Marco
@mcognetta.bsky.social
Language and keyboard stuff at Google + PhD student at Tokyo Institute of Technology.
I like computers and Korean and computers-and-Korean and high school CS education.
Georgia Tech → 연세대학교 → 東京工業大学.
https://theoreticallygoodwithcomputers.com/
I was struck with an incredible thought: The Subword Tolkienizer.
November 8, 2025 at 7:58 AM
Wow, Saddam Hussein had the same interior decorator as an Airbnb I went to in Jeju once.
*Excuse the awkward angle, it's a screenshot from a video.
October 31, 2025 at 7:29 PM
The Llama2 tokenizer is certainly not helping with this problem.
October 31, 2025 at 7:11 PM
theoreticians with side hustles
October 31, 2025 at 2:45 AM
I'm trying to get one of these from the Tōyoko line's Shin-Maruko station (新丸子駅), since it has the same spelling as my name (but I use マルコ, ofc).
October 26, 2025 at 9:58 PM
I'm on the way to SF for @pybay.bsky.social via CalTrain.
So far a much nicer experience than the last time I took CalTrain (but not yet Japan level).
Will post about some of the interesting talks.
#PyBay
October 18, 2025 at 4:44 PM
Everywhere I look I see his face.
October 16, 2025 at 5:20 AM
From the Astral Codex Ten Grants.
TLDR: books remain a primary training source for LLMs. A lot of books that feature AI have it as something bad or dangerous or harmful to humanity, which might bias models to be this way. What if we flooded the corpus with examples of good AI?
October 13, 2025 at 6:16 PM
October 11, 2025 at 11:44 PM
Rate my setup.
(It's all coming together)
October 10, 2025 at 11:10 PM
I've relocated to the bay. Here's a real pic of me and my crew.
October 5, 2025 at 7:04 PM
In honor of Jane Goodall, here is one of my best ever jokes.
youtube.com/clip/UgkxWUA...
The preview isn't loading well on bsky, so here is a teaser.
October 2, 2025 at 3:37 AM
Boyz II Men was insufficiently ambitious.
September 26, 2025 at 3:35 AM
Here's one of me looking my very best from when I needed to renew my Student ID card.
September 24, 2025 at 5:29 PM
A two-parter from a SIGBOVIK submission "Trolloc: A trolling dynamic memory allocator".
sigbovik.org/2025/proceed...
September 24, 2025 at 12:54 PM
I screenshotted this since I was curious what Caterpillar would want from a GenAI Prompt Engineer. It seems like that would be a pretty fun job tbh.
The TC was $110,520.00 - $179,640.00, which is pretty sick for Irving, Texas, right?
September 24, 2025 at 12:54 PM
This is from a visual question-answering dataset my labmate was curating. I was helping check it for quality.
This one was a bit surprising (the left picture).
September 24, 2025 at 12:54 PM
A figure from when I was experimenting with what would lead to my final PhD paper (on finite-state frameworks for tokenization --- particularly BPE).
September 24, 2025 at 12:54 PM
My lab had a #neko channel where people posted pictures of cats (loosely defined). This was my contribution.
This was ChatGPT circa Jan 2023. Again, things have improved so much so quickly.
September 24, 2025 at 12:54 PM
I wrote a masked cross entropy loss package for Julia that I wanted to call MightyMask.jl.
September 24, 2025 at 12:54 PM