Vagrant Gautam
@dippedrusk.com
I do research on trustworthy NLP, i.e., social + technical aspects of fairness, reasoning, etc.
pronouns: xe/they (German: none)
nouns: computer scientist, linguist, birder
adjectives: trans, queer, autistic
https://dippedrusk.com
The scene where she appears is the best scene in the film imo
www.youtube.com/watch?v=VfkQ...
Beetlejuice 2 - Delores TRAGEDY - Delores first appearance
YouTube video by ClipsRJCR
November 1, 2025 at 8:51 PM
Please reach out if you'd like to chat - I'm open to new collaborations as a postdoc (in 2 weeks!). I'm still into fairness/reference/reasoning, but also want to do more interpretability work, and start on some new directions (linguistic acceptability/plausibility and memorization/generalization).
August 14, 2025 at 7:11 PM
At COLM I'm co-presenting a meta-evaluation of LLM misgendering (led by @arjunsubgraph.bsky.social) and ongoing work on using decoder-only models to simulate partial differential equations (led by @palomagherreros.bsky.social), and I'm co-organizing the @interplay-workshop.bsky.social
August 14, 2025 at 7:11 PM
In this other paper I look at the effects of LLM architecture on pronoun predictions after explicitly showing the right coreference, but the effect of RLHF and other post-training is an interesting question and, to my knowledge, unstudied!
direct.mit.edu/tacl/article...
Robust Pronoun Fidelity with English LLMs: Are they Reasoning, Repeating, or Just Biased?
July 28, 2025 at 6:43 AM
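A minimal sketch of what this kind of cloze-style pronoun evaluation can look like, assuming a Hugging Face causal LM; the model, context, and candidate set here are illustrative, not the paper's actual setup:

```python
# A cloze-style probability check: the context explicitly establishes the
# right coreference, then candidate pronouns are scored as continuations.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

context = ("The engineer finished the report early. The engineer uses xe "
           "pronouns. When asked about the deadline,")
candidates = [" he", " she", " they", " xe"]

def continuation_logprob(context: str, continuation: str) -> float:
    """Sum of log-probabilities the model assigns to `continuation` after `context`."""
    ctx_len = tokenizer(context, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(context + continuation, return_tensors="pt").input_ids
    with torch.no_grad():
        log_probs = torch.log_softmax(model(full_ids).logits, dim=-1)
    # Each continuation token is predicted from the position just before it.
    return sum(log_probs[0, pos - 1, full_ids[0, pos]].item()
               for pos in range(ctx_len, full_ids.shape[1]))

scores = {c.strip(): continuation_logprob(context, c) for c in candidates}
print(scores, "->", max(scores, key=scores.get))  # faithful if "xe" wins
```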
Yesss! We made a new, harder version of Winogender Schemas that balances for grammatical case and fixes typos and other errors in the original dataset, and we found that case dramatically affects performance! This is at a small scale, though
aclanthology.org/2024.crac-1.6/
WinoPron: Revisiting English Winogender Schemas for Consistency, Coverage, and Grammatical Case
Vagrant Gautam, Julius Steuer, Eileen Bingert, Ray Johns, Anne Lauscher, Dietrich Klakow. Proceedings of the Seventh Workshop on Computational Models of Reference, Anaphora and Coreference. 2024.
July 28, 2025 at 6:40 AM
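A rough sketch of the kind of case balancing this involves; the pronoun forms are standard, but the templates and data layout here are invented for illustration and are not from the released dataset:

```python
# One template family instantiated across grammatical cases, so that each
# pronoun set is tested in nominative, accusative, and possessive positions.
PRONOUN_FORMS = {
    # (nominative, accusative, possessive-dependent)
    "he":   ("he", "him", "his"),
    "she":  ("she", "her", "her"),
    "they": ("they", "them", "their"),
    "xe":   ("xe", "xem", "xyr"),
}

TEMPLATES = {
    "nominative": "The technician told the customer that {nom} could pay with cash.",
    "accusative": "The technician told the customer that the store had billed {acc} twice.",
    "possessive": "The technician told the customer that {poss} card had been declined.",
}

def instantiate(pronoun_set: str) -> dict:
    nom, acc, poss = PRONOUN_FORMS[pronoun_set]
    return {case: t.format(nom=nom, acc=acc, poss=poss)
            for case, t in TEMPLATES.items()}

for case, sentence in instantiate("xe").items():
    print(f"{case}: {sentence}")
```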
Our main finding is that across languages, intersectional country-and-gender biases persist even when there appears to be parity along a single axis (just country or just gender), which is why we get—as our title says—Colombian waitresses and Canadian judges. Enjoy Vienna! Here are my highlights.
July 26, 2025 at 10:45 AM
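A toy illustration of the finding's logic, with invented counts: each single axis looks balanced on its own, but the intersections are skewed:

```python
# Single-axis parity can hide intersectional bias. Counts are made up.
from collections import Counter

# Hypothetical model outputs: (country, gender) pairs for a profession prompt.
outputs = (
    [("Colombia", "F")] * 40 + [("Colombia", "M")] * 10 +
    [("Canada",   "F")] * 10 + [("Canada",   "M")] * 40
)

by_country = Counter(c for c, _ in outputs)
by_gender = Counter(g for _, g in outputs)
joint = Counter(outputs)

print(by_country)  # Colombia: 50, Canada: 50  -> parity on the country axis
print(by_gender)   # F: 50, M: 50              -> parity on the gender axis
print(joint)       # but (Colombia, F): 40 vs (Canada, F): 10 -> skewed intersections
```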
Thus, going forward, we recommend that future work: (1) Use the evaluation that is appropriate to the final deployment. (2) Take a holistic view of misgendering. (3) Recognize that misgendering is contextual. (4) Center those most impacted by misgendering in system design and evaluation.
June 11, 2025 at 1:28 PM
By annotating 2400 model generations, we also show that misgendering is complex and goes far beyond pronouns, which is all that automatic metrics currently capture. E.g., models frequently avoid generating pronouns and generate extraneous gendered language, which can be seen as misgendering.
June 11, 2025 at 1:28 PM
In sum, while both evaluation methods have their time and place, their divergence reflects that they are not substitutes for each other. In the context of misgendering, invalid measurements can lead to poor model selection, deployments, or public misinformation about performance, causing real harms.
June 11, 2025 at 1:28 PM
We find that overall, probability and generation-based evaluation results disagree with each other (i.e., one shows misgendering, and the other doesn't) on roughly 20% of instances. Check out the preprint for more instance-level, dataset-level, and model-level disagreement metrics.
June 11, 2025 at 1:28 PM
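As a sketch, instance-level disagreement can be computed like this, with hypothetical per-instance outcomes (the roughly 20% figure comes from the paper, not from these toy numbers):

```python
# Instance-level disagreement between the two evaluations.
# True = no misgendering under that evaluation; values are made up.
prob_ok = [True, True, False, True, False, True, True, False, True, True]
gen_ok  = [True, False, False, True, True, True, False, False, True, True]

disagreements = sum(p != g for p, g in zip(prob_ok, gen_ok))
rate = disagreements / len(prob_ok)
print(f"instance-level disagreement: {rate:.0%}")  # 30% on this toy sample
```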
We transform existing misgendering evaluation datasets into parallel versions for probability- and generation-based evaluation, and then we systematically compare these parallel evaluations across: 4 pronoun sets (he, she, they, xe) and 6 models from 3 families.
June 11, 2025 at 1:28 PM
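A schematic sketch of what one such parallel transformation might look like; the instance, wording, and field names are illustrative, not the paper's actual format:

```python
# One instance, two parallel evaluation formats.
instance = {
    "name": "Alex",
    "pronouns": "xe",
    "context": "Alex is a botanist. Alex uses xe pronouns.",
}

# Probability-based version: a cloze continuation scored over a constrained
# candidate set (see the scoring sketch earlier in this thread).
cloze_prompt = instance["context"] + " When the greenhouse flooded,"
candidates = ["he", "she", "they", "xe"]

# Generation-based version: an open-ended prompt whose completion is later
# checked for pronouns consistent with the declared set.
generation_prompt = instance["context"] + " Write a short story about Alex."
```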
We ask: Do the results of generation-based and probability-based evaluations correspond with or diverge from each other? This is important given that LLMs can be used in different ways, sometimes for ranking existing sequences, and sometimes for generation, as with chat-based assistants.
June 11, 2025 at 1:28 PM
Prior papers (including my own work) have proposed automatic methods for evaluating LLMs for misgendering: Probability-based evaluations use a cloze-style setup with a constrained set of pronouns while generation-based evaluations quantify correct gendering in open-ended generations.
June 11, 2025 at 1:28 PM
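To make the generation-based side concrete, here is a minimal sketch that generates open-ended text and checks its pronouns against a declared set; the model, prompt, and pronoun inventory are assumptions for illustration:

```python
# Generation-based check: generate text, extract pronouns, compare against
# the declared set. (Real metrics must handle ambiguous forms like "her".)
import re
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

PRONOUN_INVENTORY = {
    "he": {"he", "him", "his", "himself"},
    "she": {"she", "her", "hers", "herself"},
    "they": {"they", "them", "their", "theirs", "themself", "themselves"},
    "xe": {"xe", "xem", "xyr", "xyrs", "xemself"},
}

prompt = "Alex is a botanist. Alex uses xe pronouns. Write about Alex:"
text = generator(prompt, max_new_tokens=60, do_sample=False)[0]["generated_text"]
continuation = text[len(prompt):].lower()

# Collect every pronoun form the model actually used in its continuation.
used = {w for w in re.findall(r"[a-z]+", continuation)
        if any(w in forms for forms in PRONOUN_INVENTORY.values())}
correct = PRONOUN_INVENTORY["xe"]
print("pronouns used:", used)
print("misgendering:", bool(used - correct))  # any pronoun outside the xe set
```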
Many popular LLMs fail to refer to individuals with the correct pronouns, which is a form of misgendering. Respecting a person’s social gender is important, and correctly gendering trans individuals, in particular, prevents psychological distress.
June 11, 2025 at 1:28 PM
I'm discussing it with the other co-organizers and we'll get back to you ASAP!
June 9, 2025 at 9:37 AM