Marco
mcognetta.bsky.social
Language and keyboard stuff at Google + PhD student at Tokyo Institute of Technology.

I like computers and Korean and computers-and-Korean and high school CS education.

Georgia Tech → 연세대학교 → 東京工業大学.

https://theoreticallygoodwithcomputers.com/
I was struck with an incredible thought: The Subword Tolkienizer.
November 8, 2025 at 7:58 AM
Wow, Saddam Hussein had the same interior decorator as an Airbnb I went to in Jeju once.

*Excuse the awkward angle, it's a screenshot from a video.
October 31, 2025 at 7:29 PM
The Llama2 tokenizer is certainly not helping with this problem.
October 31, 2025 at 7:11 PM
theoreticians with side hustles
October 31, 2025 at 2:45 AM
I'm trying to get one of these from the Tōyoko line's Shin-Maruko station (新丸子駅), since it has the same spelling as my name (but I use マルコ, ofc).
October 26, 2025 at 9:58 PM
I'm on the way to SF for @pybay.bsky.social via Caltrain.

So far a much nicer experience than the last time I took Caltrain (but not yet Japan level).

Will post about some of the interesting talks.

#PyBay
October 18, 2025 at 4:44 PM
🍞+😬
October 17, 2025 at 6:48 PM
Everywhere I look I see his face.
October 16, 2025 at 5:20 AM
From the Astral Codex Ten Grants.

TLDR: books remain a primary training source for LLMs. A lot of books that feature AI have it as something bad or dangerous or harmful to humanity, which might bias models to be this way. What if we flooded the corpus with examples of good AI?
October 13, 2025 at 6:16 PM
Forget kei truck coffee stands, I want a Jimny coffee stand.

www.instagram.com/p/DPqU7VyCD_N
October 11, 2025 at 11:44 PM
Rate my setup.

(It's all coming together)
October 10, 2025 at 11:10 PM
Rate my setup
October 10, 2025 at 9:09 PM
I've relocated to the bay. Here's a real pic of me and my crew.
October 5, 2025 at 7:04 PM
In honor of Jane Goodall, here is one of my best ever jokes.

youtube.com/clip/UgkxWUA...

The preview isn't loading well on bsky, so here is a teaser.
October 2, 2025 at 3:37 AM
Rate my setup
September 29, 2025 at 10:18 AM
Boyz II Men was insufficiently ambitious.
September 26, 2025 at 3:35 AM
Here's one of me looking my very best from when I needed to renew my Student ID card.
September 24, 2025 at 5:29 PM
A two-parter from a SIGBOVIK submission "Trolloc: A trolling dynamic memory allocator".

sigbovik.org/2025/proceed...
September 24, 2025 at 12:54 PM
I screenshotted this since I was curious what Caterpillar would want from a GenAI Prompt Engineer. It seems like that would be a pretty fun job tbh.

The TC was $110,520.00 - $179,640.00, which is pretty sick for Irving, Texas, right?
September 24, 2025 at 12:54 PM
EMNLP in Miami.
September 24, 2025 at 12:54 PM
This is from a visual question-answering dataset my labmate was curating. I was helping check it for quality.

This one was a bit surprising (the left picture).
September 24, 2025 at 12:54 PM
A figure from when I was experimenting with what would lead to my final PhD paper (on finite-state frameworks for tokenization, particularly BPE).
September 24, 2025 at 12:54 PM
My lab had a #neko channel where people posted pictures of cats (loosely defined). This was my contribution.

This was ChatGPT circa Jan 2023. Again, things have improved so much so quickly.
September 24, 2025 at 12:54 PM
I wrote a masked cross entropy loss package for Julia that I wanted to call MightyMask.jl.
September 24, 2025 at 12:54 PM