Lightnews — Scholar-powered news

Marco

@mcognetta.bsky.social

2.3K followers 1.3K following 700 posts

Language and keyboard stuff at Google + PhD student at Tokyo Institute of Technology.

I like computers and Korean and computers-and-Korean and high school CS education.

Georgia Tech → 연세대학교 → 東京工業大学.

https://theoreticallygoodwithcomputers.com/

Marco

@mcognetta.bsky.social

I was struck with an incredible thought: The Subword Tolkienizer.

The One Ring inscription, but after subword tokenization.

November 8, 2025 at 7:58 AM

Marco

@mcognetta.bsky.social

Wow, Saddam Hussein had the same interior decorator as an Airbnb I went to in Jeju once.

*Excuse the awkward angle, it's a screenshot from a video.

October 31, 2025 at 7:29 PM

Marco

@mcognetta.bsky.social

The Llama2 tokenizer is certainly not helping with this problem.

The whitespace merges in the Llama2 tokenizer's BPE merge list.

October 31, 2025 at 7:11 PM

Marco

@mcognetta.bsky.social

theoreticians with side hustles

October 31, 2025 at 2:45 AM

Marco

@mcognetta.bsky.social

I'm trying to get one of these from the Tōyoko line's Shin-Maruko station (新丸子駅), since it has the same spelling as my name (but I use マルコ, ofc).

October 26, 2025 at 9:58 PM

Marco

@mcognetta.bsky.social

I'm on the way to SF for @pybay.bsky.social via Caltrain.

So far a much nicer experience than the last time I took Caltrain (but not yet Japan level).

Will post about some of the interesting talks.

#PyBay

October 18, 2025 at 4:44 PM

Marco

@mcognetta.bsky.social

October 18, 2025 at 4:09 PM

Marco

@mcognetta.bsky.social

🍞+😬

October 17, 2025 at 6:48 PM

Marco

@mcognetta.bsky.social

Everywhere I look I see his face.

October 16, 2025 at 5:20 AM

Marco

@mcognetta.bsky.social

From the Astral Codex Ten Grants.

TLDR: books remain a primary training source for LLMs. A lot of books that feature AI have it as something bad or dangerous or harmful to humanity, which might bias models to be this way. What if we flooded the corpus with examples of good AI?

October 13, 2025 at 6:16 PM

Marco

@mcognetta.bsky.social

Forget kei truck coffee stands, I want a Jimny coffee stand.

www.instagram.com/p/DPqU7VyCD_N

October 11, 2025 at 11:44 PM

Marco

@mcognetta.bsky.social

Rate my setup.

(It's all coming together)

October 10, 2025 at 11:10 PM

Marco

@mcognetta.bsky.social

Rate my setup

October 10, 2025 at 9:09 PM

Marco

@mcognetta.bsky.social

I've relocated to the bay. Here's a real pic of me and my crew.

October 5, 2025 at 7:04 PM

Marco

@mcognetta.bsky.social

In honor of Jane Goodall, here is one of my best ever jokes.

youtube.com/clip/UgkxWUA...

The preview isn't loading well on bsky, so here is a teaser.

October 2, 2025 at 3:37 AM

Marco

@mcognetta.bsky.social

Rate my setup

September 29, 2025 at 10:18 AM

Marco

@mcognetta.bsky.social

Boyz II Men was insufficiently ambitious.

September 26, 2025 at 3:35 AM

Marco

@mcognetta.bsky.social

Here's one of me looking my very best from when I needed to renew my Student ID card.

September 24, 2025 at 5:29 PM

Marco

@mcognetta.bsky.social

A two-parter from a SIGBOVIK submission "Trolloc: A trolling dynamic memory allocator".

sigbovik.org/2025/proceed...