Venkat
@venkatasg.net
Assistant Professor CS @ Ithaca College. Computational Linguist interested in pragmatics & social aspects of communication.

venkatasg.net
Reposted by Venkat
“Humans across multiple languages spontaneously associate the nonwords kiki & bouba with spiky & round shapes, respectively...We tested the bouba-kiki effect in baby chickens. Similar to humans, they spontaneously chose a spiky shape when hearing a kiki sound & a round shape when hearing a bouba.”😲🧪
Matching sounds to shapes: Evidence of the bouba-kiki effect in naïve baby chicks
Humans across multiple languages spontaneously associate the nonwords “kiki” and “bouba” with spiky and round shapes, respectively, a phenomenon named the bouba-kiki effect. To explore the origin of t...
www.science.org
February 19, 2026 at 7:20 PM
I read somewhere that the open-source LLMs are 'benchmaxxing': they're trained to do well on benchmarks, but the gains don't translate into general improvements. From my simple benchmark that seems true: I was surprised the only models that do decently at FizzBuzz are all the frontier, closed LLMs.
February 12, 2026 at 10:04 PM
I made a Huggingface leaderboard to track model progress on my FizzBuzz benchmark: huggingface.co/spaces/venka...
One thing I've noticed is that something changed with the new generation of models, especially the biggest ones. They all ace it, even with different rules.
February 7, 2026 at 1:38 AM
@hankgreen.bsky.social I'm assuming YouTubers don't get a cut when Gemini pulls your recent videos to answer questions? I was wondering if Google pulled YouTube videos directly (they can). ChatGPT can't, and uses third-party sources like youtubesummary.com 😭. The video autoplays, but only if I scroll.
February 3, 2026 at 5:25 PM
Reposted by Venkat
Very excited to share that the paper w/
@jessyjli.bsky.social @DavidBeaver
"Strategic Dialogue Assessment: The Crooked Path to Innocence" (used to have the name COBRA) was accepted by Dialogue and Discourse Vol. 17, No. 1. Check it out! 👉 https://journals.uic.edu/ojs/index.php/dad/article/view/14503
January 30, 2026 at 3:02 AM
🗣️New Preprint! I'm really excited to talk about this new short paper (w/ Laura Biester) analyzing sentences from the Bulwer-Lytton Fiction Contest (BLFC), which challenges writers to write the 'worst opening sentence to the most atrocious novel ever written'. This is a corpus of "bad" sentences! (1/5)
January 27, 2026 at 5:36 PM
Reposted by Venkat
“All bears have a property”, “Some bears have a property”, “Bears have a property” are different in terms of how the property is generalized to a specific bear – a great example of how language constrains thought!

This holds for kids, adults, and according to our new work, (V)LMs! 🧵
January 27, 2026 at 4:16 PM
Reposted by Venkat
What should academics be doing right now?

I have been writing up some thoughts on what the research says about effective action, and what universities specifically can do.

davidbau.github.io/poetsandnurs...

It's on GitHub. Suggestions and pull requests welcome.
github.com/davidbau/poe...
January 26, 2026 at 3:27 AM
Reposted by Venkat
Our first South by Semantics lecture of the semester at UT Austin is happening next week on January 30th!

I'm excited to hear Dr. Amir Zeldes (Associate Professor at Georgetown University) talk about saliency in discourse and the memorability of salient information for both humans and LLMs.
January 22, 2026 at 1:00 AM
Reposted by Venkat
Hello world 👋
My first paper at UT Austin!

We ask: what happens when medical “evidence” fed into an LLM is wrong? Should your AI stay faithful, or should it play it safe when the evidence is harmful?

We show that frontier LLMs accept counterfactual medical evidence at face value.🧵
January 21, 2026 at 6:45 PM
Update: GPT-5.2 Pro aces my standard and modified FizzBuzz benchmark. Most models still fail to generalize (spectacularly), but something did change with the latest crop of Claude and GPT thinking/pro models that seemed to help with my (silly, but interesting) benchmark.
github.com/venkatasg/fi...
January 17, 2026 at 9:01 AM
Thanks Claude, lucky for you I make regular backups!
I thought it'd be interesting to incorporate CLI agents in my software engineering class, but depending on my students' (or anyone's) backup hygiene is a non-starter. Maybe Claude in remote environments...
January 5, 2026 at 10:39 AM
I knew StackOverflow was in trouble because of LLMs but this graph is insane. It took a decade for Wikipedia to push Encyclopedia Britannica out of print, but only three years for LLMs to make people stop asking questions on Stack Overflow.
data.stackexchange.com/stackoverflo...
January 4, 2026 at 3:15 PM
A student found my personal number and started calling me on WhatsApp to increase their grade on the final 😶 Reminding myself of the dumb stuff I did as an 18-year-old to find the grace to gently email them that this is inappropriate.
December 24, 2025 at 4:42 AM
Reposted by Venkat
Omg wait. Someone literally posted this paper a couple weeks ago. Good job guys
Sparse Autoencoders are Topic Models
Sparse autoencoders (SAEs) are used to analyze embeddings, but their role and practical value are debated. We propose a new perspective on SAEs by demonstrating that they can be naturally understood a...
arxiv.org
December 15, 2025 at 11:00 PM
Following up on my blog post, I figured I'd create a silly benchmark to test how good LLMs are at playing FizzBuzz for 100 turns. Surprisingly, 2 Claude models do well at both the standard game and a slightly modified game where 'buzz' should be emitted at multiples of 7 rather than 5… (1/2)
December 13, 2025 at 8:28 PM
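For readers unfamiliar with the game above: here's a minimal Python sketch of the rules an LLM has to sustain for 100 turns, with the divisor-to-word mapping parameterized so the modified 'buzz at multiples of 7' variant is just a different rule set. (This is my own illustration, not the actual benchmark harness; the function and rule names are invented.)

```python
def fizzbuzz(n, rules=((3, "Fizz"), (5, "Buzz"))):
    """Return the output for turn n under the given (divisor, word) rules."""
    words = "".join(word for div, word in rules if n % div == 0)
    return words or str(n)

# Standard game: multiples of 3 -> Fizz, multiples of 5 -> Buzz
standard = [fizzbuzz(n) for n in range(1, 101)]

# Modified game: 'Buzz' at multiples of 7 instead of 5
modified = [fizzbuzz(n, rules=((3, "Fizz"), (7, "Buzz"))) for n in range(1, 101)]

print(standard[14])  # turn 15 -> "FizzBuzz"
print(modified[20])  # turn 21 -> "FizzBuzz" (21 = 3 * 7)
```

The point of the variant is that a model which merely memorized standard FizzBuzz output will fail once the divisor changes, while one that tracks the rule generalizes.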
I was thinking about FizzBuzz and LLMs writing code and had a few thoughts. venkatasg.net/blog/fizzbuzz-2...
FizzBuzzing LLMs
venkatasg.net
December 13, 2025 at 5:21 PM
Reposted by Venkat
Presenting a poster with some independent work on dynamic neural audio at 3pm at the AI for Music workshop (room 27)! bostromk.net/ASURA
Asura's Harp
bostromk.net
December 7, 2025 at 7:17 PM
Reposted by Venkat
🥳Life Update!

I’m thrilled to share that I’ll be starting as assistant professor for Natural Language Processing @unileipzig.bsky.social in April! I’m deeply grateful to everyone who supported me on this journey.

I will be recruiting PhD students with @scadsai.bsky.social, stay tuned for details!
December 10, 2025 at 1:10 PM
Reposted by Venkat
We are accepting submissions for the 25th edition of the Texas Linguistics Society (TLS), a UT Austin grad-student-run Linguistics conference! The conference will run from February 20-21, 2026 in Austin.

Abstract Deadline: December 17
Notification: January 15
November 21, 2025 at 9:17 PM
Reposted by Venkat
New work to appear @ TACL!

Language models (LMs) are remarkably good at generating novel well-formed sentences, leading to claims that they have mastered grammar.

Yet they often assign higher probability to ungrammatical strings than to grammatical strings.

How can both things be true? 🧵👇
November 10, 2025 at 10:11 PM
Reposted by Venkat
Syntax that spuriously correlates with safe domains can jailbreak LLMs - e.g. below with GPT4o mini

Our paper (co w/ Vinith Suriyakumar) on syntax-domain spurious correlations will appear at #NeurIPS2025 as a ✨spotlight!

+ @marzyehghassemi.bsky.social, @byron.bsky.social, Levent Sagun
October 24, 2025 at 4:23 PM
Reposted by Venkat
"Although I hate leafy vegetables, I prefer daxes to blickets." Can you tell if daxes are leafy vegetables? LMs can't seem to!

We investigate if LMs capture these inferences from connectives when they cannot rely on world knowledge.

New paper w/ Daniel, Will, @jessyjli.bsky.social
October 16, 2025 at 3:27 PM
Reposted by Venkat
UT Austin Linguistics is hiring in computational linguistics!

Asst or Assoc.

We have a thriving group sites.utexas.edu/compling/ and a long proud history in the space. (For instance, fun fact, Jeff Elman was a UT Austin Linguistics Ph.D.)

faculty.utexas.edu/career/170793

🤘
UT Austin Computational Linguistics Research Group – Humans processing computers processing humans processing language
sites.utexas.edu
October 7, 2025 at 8:53 PM
Reposted by Venkat
Excited to present this at #COLM2025 tomorrow! (Tuesday, 11:00 AM poster session)
One of the ways that LLMs can be inconsistent is the "generator-validator gap," where LLMs deem their own answers incorrect.

🎯 We demonstrate that ranking-based discriminator training can significantly reduce this gap, and improvements on one task often generalize to others!

🧵👇
October 6, 2025 at 8:40 PM