Michael Saxon
@saxon.me
Doctor of NLP/Vision+Language from UCSB

Evals, metrics, multilinguality, multiculturality, multimodality, and (dabbling in) reasoning

https://saxon.me/
Having autosuggested text pop up when I'm WRITING is the most egregious executive function theft imo. Like it will make me forget what I was trying to write sometimes lol
November 10, 2025 at 7:49 PM
We cannot trust slow, legacy governments to destroy the superintelligent datacenters when the time comes.

That's why we're democratizing defense with the world's best decentralized, AI-native private military contractor, where all targets are designated transparently, on-chain: Blackwater YudSA
November 10, 2025 at 6:53 AM
Yeah, a setup where you're collaborating with the note-taking AI itself is the killer app here
November 9, 2025 at 8:29 PM
Oh, interesting

Maybe also organizing the notes wrt previous notes, and maybe even occasionally surfacing things from prior notes that may be relevant as you are dictating to facilitate this
November 9, 2025 at 8:27 PM
No :(
November 6, 2025 at 8:23 PM
What are the killer features/functionality?
November 6, 2025 at 2:47 AM
My vision is to eventually release a clean template site :D

I don't want all the custom functionality I programmed to only live on my site
November 5, 2025 at 11:13 PM
I think Pangea is an example roughly comparable to Aya for VLM

github.com/neulab/Pangea

Also, I am pretty sure Qwen-VL supports Chinese.

In general though the VLM landscape is a little behind the LM landscape
GitHub - neulab/Pangea: This is the repo for the paper "PANGEA: A FULLY OPEN MULTILINGUAL MULTIMODAL LLM FOR 39 LANGUAGES"
November 5, 2025 at 9:54 PM
Usually I am big into multimodal/multicultural but not this time. Would be very interesting (and probably straightforward) to see the multimodal generalization of this!
November 5, 2025 at 9:34 PM
Alternatively, paired with dataset analysis it could be used to explain observed generalization patterns.
November 5, 2025 at 7:59 PM
Our XNationQA benchmark is an example of a simple, fine-grained way to "fingerprint" how your model's world knowledge differs between languages to potentially guide pretraining and posttraining data augmentation
November 5, 2025 at 7:59 PM
🔗 aclanthology.org/2025.emnlp-m...

With IIT Delhi folks Eshaan Tanwar, Anwoy Chatterjee and Tanmoy Chakraborty, and (former) UCSB folks me, Alon Albalak, and William Wang.

Eshaan, the lead author, did this as an undergrad and will be applying for PhDs soon ;)
Do You Know About My Nation? Investigating Multilingual Language Models’ Cultural Literacy Through Factual Knowledge
Eshaan Tanwar, Anwoy Chatterjee, Michael Saxon, Alon Albalak, William Yang Wang, Tanmoy Chakraborty. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 2025.
November 5, 2025 at 7:47 PM
i need to make my website worthy of your list
November 4, 2025 at 7:55 AM
Well thanks now I'm gonna burn a couple weeks in a rabbit hole integrating all this design inspo into my site 😤
November 3, 2025 at 8:33 PM
Leaving toxic comments is literally the goon squad's MO
November 1, 2025 at 1:32 AM
And without a dislike button, the asymmetry is that every "no, that sucks" either goes unexpressed or gets amplified into an angry reply, so baiting anger becomes a more effective way to bait engagement
November 1, 2025 at 1:31 AM
Yes, and people were (rightfully imo) very mad about this change
November 1, 2025 at 1:30 AM