venkatasg.net
One thing I've noticed is that something changed with the new generation of models, especially the biggest ones. They all ace it, even with different rules.
@jessyjli.bsky.social @DavidBeaver
"Strategic Dialogue Assessment: The Crooked Path to Innocence" (used to have the name COBRA) was accepted by Dialogue and Discourse Vol 17 No.1. Check it out! 👉https://journals.uic.edu/ojs/index.php/dad/article/view/14503
This holds for kids, adults, and according to our new work, (V)LMs! 🧵
I have been writing up some thoughts on what the research says about effective action, and what universities specifically can do.
davidbau.github.io/poetsandnurs...
It's on GitHub. Suggestions and pull requests welcome.
github.com/davidbau/poe...
I'm excited to hear Dr. Amir Zeldes (Associate Professor at Georgetown University) talk about saliency in discourse and the memorability of salient information for both humans and LLMs.
My first paper at UT Austin!
We ask: what happens when medical “evidence” fed into an LLM is wrong? Should your AI stay faithful, or should it play it safe when the evidence is harmful?
We show that frontier LLMs accept counterfactual medical evidence at face value. 🧵
github.com/venkatasg/fi...
I thought it'd be interesting to incorporate CLI agents into my software engineering class, but depending on my students' (or anyone's) backup hygiene is a non-starter. Maybe Claude in remote environments...
data.stackexchange.com/stackoverflo...
I’m thrilled to share that I’ll be starting as assistant professor for Natural Language Processing @unileipzig.bsky.social in April! I’m deeply grateful to everyone who supported me on this journey.
I will be recruiting PhD students with @scadsai.bsky.social, stay tuned for details!
Abstract Deadline: December 17
Notification: January 15
Language models (LMs) are remarkably good at generating novel well-formed sentences, leading to claims that they have mastered grammar.
Yet they often assign higher probability to ungrammatical strings than to grammatical strings.
How can both things be true? 🧵👇
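A minimal sketch of how one might observe this gap, assuming a HuggingFace GPT-2 model; the model, scoring method, and minimal-pair example here are illustrative assumptions, not the paper's actual setup:

```python
# Hedged sketch: compare the total log-probability a small LM assigns
# to a grammatical string vs. an ungrammatical counterpart.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_logprob(text: str) -> float:
    """Sum of token log-probabilities under the LM."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=ids, HF returns the mean cross-entropy over the
        # (seq_len - 1) predicted tokens; rescale to get the total.
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.shape[1] - 1)

good = "The keys to the cabinet are on the table."
bad = "The keys to the cabinet is on the table."
print(sentence_logprob(good), sentence_logprob(bad))
# If the second score is higher, that is one instance of the gap above.
```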
Our paper (co w/ Vinith Suriyakumar) on syntax-domain spurious correlations will appear at #NeurIPS2025 as a ✨spotlight!
+ @marzyehghassemi.bsky.social, @byron.bsky.social, Levent Sagun
We investigate if LMs capture these inferences from connectives when they cannot rely on world knowledge.
New paper w/ Daniel, Will, @jessyjli.bsky.social
Asst. or Assoc.
We have a thriving group sites.utexas.edu/compling/ and a long, proud history in the space. (For instance, fun fact: Jeff Elman was a UT Austin Linguistics Ph.D.)
faculty.utexas.edu/career/170793
🤘
🎯 We demonstrate that ranking-based discriminator training can significantly reduce this gap, and improvements on one task often generalize to others!
🧵👇
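A minimal sketch of what ranking-based discriminator training could look like, using PyTorch's MarginRankingLoss; the Scorer module, stand-in embeddings, and hyperparameters are illustrative assumptions, not the paper's actual method:

```python
# Hedged sketch: train a discriminator with a margin ranking loss so
# that it scores a grammatical string above its ungrammatical pair.
import torch
import torch.nn as nn

class Scorer(nn.Module):
    """Toy discriminator: maps a sentence embedding to a scalar score."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x):
        return self.head(x).squeeze(-1)

scorer = Scorer()
opt = torch.optim.Adam(scorer.parameters(), lr=1e-3)
rank_loss = nn.MarginRankingLoss(margin=1.0)

# Stand-in data: paired embeddings of grammatical / ungrammatical strings.
good_emb = torch.randn(32, 128)
bad_emb = torch.randn(32, 128)

for _ in range(100):
    s_good, s_bad = scorer(good_emb), scorer(bad_emb)
    # target = 1 asks for s_good > s_bad by at least the margin.
    loss = rank_loss(s_good, s_bad, torch.ones(32))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The ranking loss only constrains the relative order of the two scores, which is what would let a discriminator close a preference gap like the one above without having to calibrate absolute probabilities.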