Marco
@mcognetta.bsky.social
Language and keyboard stuff at Google + PhD student at Tokyo Institute of Technology.

I like computers and Korean and computers-and-Korean and high school CS education.

Georgia Tech → 연세대학교 → 東京工業大学.

https://theoreticallygoodwithcomputers.com/
Pinned
A lot of you followed me due to #NLP, but I like to post about #chess (especially computer chess), #programming (especially puzzles, code golf, etc), and machine learning.

And some less technical stuff like #Korean, #Esperanto, and #trains (mostly in Japan, just due to proximity).
A side channel attack on streaming LLMs where one can recover conversation topics while only seeing encrypted packet response streams.

arxiv.org/abs/2511.03675
Whisper Leak: A novel side-channel attack on remote language models | Microsoft Security Blog
Understand the risks of encrypted AI traffic exposure and explore practical steps users and cloud providers can take to stay secure. Learn more.
www.microsoft.com
November 10, 2025 at 6:11 AM
Reposted by Marco
I was struck with an incredible thought: The Subword Tolkienizer.
November 8, 2025 at 7:58 AM
Reposted by Marco
🎉 Congratulations to all #EMNLP2025 award winners 🎉

Starting with the ✨Best Paper award ✨:

"Infini-gram mini: Exact n-gram Search at the Internet Scale with FM-Index"
by Hao Xu, Jiacheng Liu, Yejin Choi, Noah A. Smith, and Hannaneh Hajishirzi
aclanthology.org/2025.emnlp-m...

1/n
November 7, 2025 at 10:29 PM
Reposted by Marco
Got to the part about "temperature". I knew that a higher temperature == less predictable, but I never knew why.

Turns out it's very simple. Before the "scores" for the tokens are turned into a probability distribution, they're divided by the temperature. Higher values "flatten" the distribution.
November 6, 2025 at 5:47 PM
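The mechanism in the post above can be sketched in a few lines of plain Python (an illustrative sketch, not any particular model's implementation): divide the logits by the temperature before the softmax, and higher temperatures flatten the resulting distribution while lower ones sharpen it.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    # Divide each logit by the temperature before exponentiating.
    scaled = [x / temperature for x in logits]
    # Subtract the max for numerical stability, then normalize.
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, temperature=0.5)  # sharper: top token dominates
hot = softmax_with_temperature(logits, temperature=2.0)   # flatter: closer to uniform
```

With these example logits, the top token's probability is noticeably larger at temperature 0.5 than at 2.0, which is exactly the "less predictable at higher temperature" effect.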
Reposted by Marco
Just added my book, "Theory of Computing: An Open Introduction" to OER Commons, and working on getting it listed in Canadian repositories too. One step closer to making education more open and accessible to everyone!
oercommons.org/courses/theo...
Theory of Computing: An Open Introduction
This book is suitable for courses on the theory of computing at both the undergraduate and graduate levels, and for self-study. Topics are introduced in a logical order: we begin with the simple finit...
oercommons.org
November 6, 2025 at 6:12 PM
Reposted by Marco
It’s grad school application season, and I wanted to give some public advice.

Caveats:
-*-*-*-*


> These are my opinions, based on my experiences, they are not secret tricks or guarantees

> They are general guidelines, not meant to cover a host of idiosyncrasies and special cases
November 6, 2025 at 2:55 PM
This is very high on my list of advice for PhD applicants.

I've written two SoPs (masters and PhD), and the similarities between the things I wrote about in the SoPs and the things I wrote my theses on end roughly at "written in English".
Mistake 3, cont.: people worry they narrow themselves down by proposing specific questions ("What if this is not the EXACT thing I want to work on in grad school?").

But an SoP is not a *contract*, it will not be waved in front of you when starting grad school.
November 7, 2025 at 12:20 AM
Reposted by Marco
y'all seem to really like baseball bsky.social/about/blog/1...
The World Series Was Electric — So Was Bluesky - Bluesky
“How can you not be romantic about baseball?” — Moneyball 2011
bsky.social
November 6, 2025 at 9:58 PM
Reposted by Marco
Presenting today our work "Unsupervised Word-level Quality Estimation Through the Lens of Annotator (Dis)agreement" at the Machine Translation morning session (Room A301, 11:45 China time). See you there! 🤗

Paper: aclanthology.org/2025.emnlp-m...
Slides/video/poster: underline.io/events/502/s...
Unsupervised Word-level Quality Estimation for Machine Translation Through the Lens of Annotators (Dis)agreement
Gabriele Sarti, Vilém Zouhar, Malvina Nissim, Arianna Bisazza. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 2025.
aclanthology.org
November 6, 2025 at 1:19 AM
Reposted by Marco
why intern at Ai2?

🐟interns own major parts of our model development, sometimes even leading whole projects
🐡we're committed to open science & actively help our interns publish their work

reach out if u wanna build open language models together 🤝

links 👇
November 5, 2025 at 11:11 PM
For more about things like this, here's an article (actually a series) that goes in depth on this topic for a Canadian election.

One of my favorite articles to share.
November 4, 2025 at 10:08 AM
The For You feed is nice but it's really sensitive (?). Like every now and then my feed just explodes with some niche topic that's seemingly unrelated to anything I've interacted with.
November 4, 2025 at 10:05 AM
Reposted by Marco
One of the hardest PyTorch bugs I've had to debug was due to how logsumexp behaves with -inf masked inputs. Consider the following example: I build a vector of 3 logits, where each logit is the result of a logsumexp.
November 4, 2025 at 9:12 AM
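The classic failure mode with -inf inputs can be reproduced without PyTorch. Below is a hypothetical pure-Python reimplementation of the standard max-shift logsumexp (not the library's actual code): when every input is -inf, as in a fully masked row, the shift computes (-inf) - (-inf) = nan, and the nan propagates instead of the mathematically correct -inf.

```python
import math

def logsumexp(xs):
    # Standard max-shift trick for numerical stability:
    # log(sum(exp(x))) == m + log(sum(exp(x - m))) where m = max(xs).
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

inf = float("inf")
logsumexp([0.0, -inf])   # fine: exp(-inf - 0.0) is just 0.0, result is 0.0
logsumexp([-inf, -inf])  # nan: (-inf) - (-inf) is nan, and it propagates
```

A partially masked input is harmless because exp(-inf) underflows cleanly to zero; it's the all-masked case (and, in autodiff frameworks, the gradients flowing through the masked entries) that bites.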
Reposted by Marco
I wrote a short blog post about masked softmax layers in PyTorch (i.e., when you have structural constraints that tell you some classes _must_ have probability zero).

This was based on a real bug I found in a neural chess model implementation.
Masked Softmax Layers in PyTorch
Correctly computing masked softmax layers.
mcognetta.github.io
November 3, 2025 at 7:39 PM
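As an illustration of the technique the blog post above is about (a sketch under my own assumptions, not the code from the linked post): the correct pattern is to set disallowed logits to -inf *before* the softmax, so the remaining probability mass renormalizes over the allowed classes, rather than zeroing probabilities after the fact.

```python
import math

def masked_softmax(logits, mask):
    # mask[i] is True iff class i is structurally allowed.
    # Masked-out classes get -inf logits, so exp() sends them to exactly 0
    # and the allowed classes renormalize to a proper distribution.
    # (At least one class must be allowed, or the all--inf case yields nan.)
    masked = [x if keep else float("-inf") for x, keep in zip(logits, mask)]
    m = max(masked)
    exps = [math.exp(x - m) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]

# Two of three classes allowed (e.g., legal moves in a chess position).
probs = masked_softmax([1.0, 2.0, 3.0], [True, True, False])
```

The masked class ends with probability exactly zero, and the remaining probabilities still sum to one; zeroing after a plain softmax would leave the vector unnormalized unless you renormalize it yourself.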
Reposted by Marco
NLP evaluation is often detached from practical applications. Today I extrinsically evaluated one WMT25 translation system on the task of getting hair done without knowing Chinese.

Yes you got 67 BLEU points but is the resulting hair slaying? 💇

See the result on one datapoint (my head) at EMNLP.
November 3, 2025 at 5:49 AM
Reposted by Marco
Let's talk about eval (automatic or human) and multilinguality at #EMNLP in Suzhou! 🇨🇳

- Efficient evaluation (Nov 5, 16:30, poster session 3)
- MT difficulty (Nov 7, 12:30, findings 3)
- COMET-poly (Nov 8, 11:00, WMT)

(DM to meet 🌿 )
October 28, 2025 at 9:45 AM
Catch me way under par in the Code Golf Masters after this change arrives.
November 3, 2025 at 9:42 PM
Reposted by Marco
Need to establish a norm against making the manifold chip-coloured so I'm not hungry reading papers.
November 1, 2025 at 1:00 PM
Wow, Saddam Hussein had the same interior decorator as an Airbnb I went to in Jeju once.

*Excuse the awkward angle, it's a screenshot from a video.
October 31, 2025 at 7:29 PM
The Llama2 tokenizer is certainly not helping with this problem.
October 31, 2025 at 7:11 PM
Reposted by Marco
Me, when I see a building.
July 7, 2025 at 4:17 PM
Reposted by Marco
The project that started my whitespace obsession... #EMNLP2025

While we've all been worrying about tokenizers, lurking in the background has been the preprocessing *before* tokenization. Poems break standard HTML-to-text linearization systems, and we find that multimodal models aren't a solution.
October 31, 2025 at 4:53 PM