Maria Antoniak
@mariaa.bsky.social
☀️ Assistant Professor of Computer Science at CU Boulder 👩‍💻 NLP, cultural analytics, narratives, online communities 🌐 https://maria-antoniak.github.io 💬 books, bikes, games, art
That’s a very persuasive pitch and it’s in my cart! Also saw this gem of a review on Amazon which further convinced me.
November 21, 2025 at 8:55 PM
If you're a student in need of a personal website (and if you're doing research, yes, you need a website!), I keep a list of nice examples here, most of which are reusable: www.are.na/maria-antoni...

For example, I just spotted this beautiful website by Catherine Yeh: github.com/catherinesye...
November 3, 2025 at 8:11 PM
"Finding Flawed Fictions: Evaluating Complex Reasoning in Language Models via Plot Hole Detection"
Kabir Ahuja et al. arxiv.org/abs/2504.11900
October 14, 2025 at 6:20 PM
"Supposedly Equivalent Facts That Aren't? Entity Frequency in Pre-training Induces Asymmetry in LLMs" by Yuan He et al. arxiv.org/abs/2503.22362
October 14, 2025 at 6:19 PM
Inspired to share some papers that I found at #COLM2025!

"Register Always Matters: Analysis of LLM Pretraining Data Through the Lens of Language Variation" by Amanda Myntti et al. arxiv.org/abs/2504.01542
October 14, 2025 at 6:16 PM
The #COLM2025 workshop on NLP4Democracy is starting now! Join us in 520E.

I’ll be speaking at 10:15am with @ysiglidis.bsky.social about work with @iaugenstein.bsky.social and @serge.belongie.com focused on tracking collective narratives on social media.
October 10, 2025 at 1:27 PM
I’m in Ithaca today, on my way to give a talk at Colgate University tomorrow and then Montreal for the rest of the week for #COLM2025. The weather is *too* beautiful for this autumn road trip.
October 6, 2025 at 5:37 PM
We have some numbers in our analysis of WildChat: www.arxiv.org/abs/2407.11438

But these results are computed over conversations sampled per user, and we know that (erotic) story generation is a task that users like to come back to and repeat. So this is also an underestimate.
August 14, 2025 at 6:04 PM
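A toy illustration of why per-user sampling can undercount a task that a few users repeat often. This is only a sketch: the data is made up (not from WildChat), and drawing exactly one conversation per user is an assumption about the sampling, not the paper's documented procedure.

```python
# Hypothetical toy data: each user maps to the labels of their conversations.
# One heavy user returns to story generation many times.
conversations_by_user = {
    "user_a": ["story"] * 9 + ["other"],
    "user_b": ["other"],
    "user_c": ["other"],
}


def share_over_all_conversations(data, label):
    """Fraction of all conversations that carry `label`."""
    labels = [item for convs in data.values() for item in convs]
    return labels.count(label) / len(labels)


def expected_share_one_per_user(data, label):
    """Expected fraction of `label` when one conversation is drawn
    uniformly at random from each user (assumed sampling scheme)."""
    per_user = [convs.count(label) / len(convs) for convs in data.values()]
    return sum(per_user) / len(per_user)


all_share = share_over_all_conversations(conversations_by_user, "story")
sampled_share = expected_share_one_per_user(conversations_by_user, "story")

print(f"share over all conversations: {all_share:.2f}")
print(f"expected share, one per user: {sampled_share:.2f}")
```

In this toy case the per-user estimate is lower than the share over all conversations, because the heavy user's repeat visits count only once.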
"Our reanalysis shows that the reported decline in disruptiveness can be attributed to a relative decline of these database entries with zero references."

I saw this paper at a recent conference, really liked the presentation, and was, um, reminded of it today.

Keep an eye on your histograms!
July 17, 2025 at 12:21 PM
Got it. I have an identical feed that I made using @graze.social and I agree, it's my favorite/main feed! You can make it like this:
July 2, 2025 at 7:52 AM
Here we go! The Copenhagen NLP Symposium is starting up with a welcome from @delliott.bsky.social. People are attending from Aarhus, Aalborg, Copenhagen, and outside Denmark, from both academia and industry.
June 20, 2025 at 7:24 AM
We were lucky to have @nolauren.bsky.social visit us today at the @aicentre.dk to talk about distant viewing, visual cultural theory, and her work on making photography collections accessible! Check out her book with Taylor Arnold: mitpress.mit.edu/978026254613...
May 23, 2025 at 2:41 PM
— and I'm not sure I can lean into the idea that there are no answers to these questions or that asking such questions is irrational.

The final kind of pro-AI-art argument is again one that I think model sellers would be happy with, but not one that matches many real audience reactions.
May 19, 2025 at 7:11 AM
Interesting, and I like the word "elusive" when applied to generated art. Also this is the third time in two weeks I'm being led to think about Benjamin, so probably time to reread.

But the framing that models are unknowable, untraceable, uninterpretable is a framing sold by companies —
May 19, 2025 at 7:11 AM
May 10, 2025 at 7:31 PM
the OpenWebMath paper is perfect, pretty much exactly what I've been looking for!
May 10, 2025 at 6:39 PM
thank you!!

also, my favorite slide:
May 10, 2025 at 5:58 PM
I wondered, “What is Svelte? Should I be using it?” and I ended up here. First time I’ve seen a page like this, ready for an LLM searching the web.
May 9, 2025 at 12:30 PM
This is my personal paper feed algorithm, in case it's useful! Still far from perfect.
May 3, 2025 at 3:29 PM
I updated our 🔭StorySeeker demo. Aimed at beginners, it briefly walks through loading our model from Hugging Face, loading your own text dataset, predicting whether each text contains a story, and topic modeling and exploring the results. Runs in your browser, no installation needed!
April 15, 2025 at 12:05 PM
I tried "vibe coding" and made a little website with Claude 3.7. For any Bluesky username, it will topic model that user's posts and create a heat map of their post topics over time. It will also show their oldest, newest, and top posts for each topic. Not perfect but fun + took just 30 minutes!
April 10, 2025 at 8:31 AM
I've really enjoyed reading this "workography" by Kees van Deemter, whom I've never met but who has had a long career in NLP. Lots of storytelling and reflections on research, moving between institutions and countries, finding mentors, choosing between academia and industry, and more.
April 9, 2025 at 9:34 AM
New work on multimodal framing! 💫

Some fun results: comparisons of the same frame when expressed in images vs. texts. When the "crime" frame is expressed in the article text, there are more political words in the text, but when the frame is expressed in the article image, there are more police words.
April 7, 2025 at 9:48 AM
Today Daria Bazylevych would have turned 19. Last year, Daria, her mother, and her two sisters were killed by a Russian missile.

Daria was studying Culture Studies at UCU in Lviv, where I taught for a year after university.

"Daria was light, sun, and sunflowers."

ucufoundation.org/the-19th-ann...
March 19, 2025 at 11:29 PM
Still thinking about this young mathematician. Her name was Yulia Zdanovska. She could have done so much for her country and for our shared world. We all lost her.

Killed by Putin’s Russia in Kharkiv almost exactly three years ago.

Memorial from the ACM: cacm.acm.org/news/in-memo...
March 1, 2025 at 1:22 PM