Krishnapriya Vishnubhotla
@krishnapriya-v22.bsky.social
PhD grad from UofT CompLing. Interested in narrative understanding, affective computing, language variation and style, and generally using NLP technologies to understand humans and society.

priya22.github.io
Reposted by Krishnapriya Vishnubhotla
New paper!! Hermann Wigers and I used OpenAI to generate and analyse 11,800 stories "from" 236 countries: Norwegian stories, Chinese stories, American stories, etc. We found that GPT-4o-mini tells the same story over and over. open-research-europe.ec.europa.eu/articles/5-2...
open-research-europe.ec.europa.eu
July 30, 2025 at 8:15 AM
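A minimal sketch of the kind of generation setup described above, using the openai Python client. The prompt wording and parameters here are guesses; the paper's actual prompts and setup may differ.

```python
# Sketch: country-conditioned story generation with gpt-4o-mini.
# Prompt wording is a guess; the paper's actual prompts may differ.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_story(demonym: str) -> str:
    """Ask the model for a story 'from' a given country."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Write a short {demonym} story."}],
    )
    return response.choices[0].message.content

stories = {d: generate_story(d) for d in ("Norwegian", "Chinese", "American")}
```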
Reposted by Krishnapriya Vishnubhotla
In this work, we introduce Words of Warmth, the first large-scale repository of manually derived word–warmth (as well as word–trust and word–sociability) associations for over 26k English words.
arxiv.org/html/2506.03...
Words of Warmth: Trust and Sociability Norms for over 26k English Words
arxiv.org
June 25, 2025 at 2:36 PM
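A minimal sketch of how such a lexicon might be queried, assuming it is distributed as a tab-separated file with word, warmth, trust, and sociability columns; the actual release format may differ.

```python
# Sketch: load a word-association lexicon and look up a word's scores.
# Assumes a TSV with columns word, warmth, trust, sociability;
# the actual Words of Warmth release format may differ.
import csv

def load_lexicon(path: str) -> dict[str, dict[str, float]]:
    lexicon = {}
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            lexicon[row["word"]] = {
                dim: float(row[dim]) for dim in ("warmth", "trust", "sociability")
            }
    return lexicon

lexicon = load_lexicon("words_of_warmth.tsv")  # hypothetical filename
print(lexicon.get("friend"))
```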
Reposted by Krishnapriya Vishnubhotla
🚨 New preprint! 🚨
Phase transitions! We love to see them during LM training. Syntactic attention structure, induction heads, grokking: all seem to suggest the model has learned a discrete, interpretable concept. Unfortunately, they’re pretty rare—or are they?
June 24, 2025 at 6:29 PM
Reposted by Krishnapriya Vishnubhotla
Back on the distant reading beat: there's a great article about quotation patterns in Victorian Studies, by Sierra Eckert and Milan Terlunen. Middlemarch is the case study. Not OA yet. muse.jhu.edu/article/9551...
Project MUSE - What We Quote: Disciplinary History and the Textual Atmospheres of Middlemarch
muse.jhu.edu
March 31, 2025 at 12:41 PM
Reposted by Krishnapriya Vishnubhotla
We all want LLMs to collaborate with humans to help them achieve their goals. But LLMs are not trained to collaborate; they are trained to imitate. Can we teach LM agents to help humans by first making them help each other?

arxiv.org/abs/2503.14481
Don't lie to your friends: Learning what you know from collaborative self-play
To be helpful assistants, AI agents must be aware of their own capabilities and limitations. This includes knowing when to answer from parametric knowledge versus using tools, when to trust tool outpu...
arxiv.org
March 24, 2025 at 3:39 PM
Reposted by Krishnapriya Vishnubhotla
Slack wins over Bsky (today)

docs.google.com/presentation...
ChatGPT + Post-Training
ChatGPT and The Art of Post-Training, by Barret Zoph & John Schulman
docs.google.com
March 18, 2025 at 9:05 PM
Reposted by Krishnapriya Vishnubhotla
FULL KEYNOTE -- me + Guy K at SXSW on why privacy is a fundamental guarantor of a gorgeous, good life (not an absence, not a vacuum, not abstract). And how AI threatens privacy. And why Signal is so special and why you should use & love & support it ❤️

www.youtube.com/watch?v=AyH7...
The State of Personal Online Security and Confidentiality | SXSW LIVE
YouTube video by SXSW
www.youtube.com
March 11, 2025 at 7:55 AM
Reposted by Krishnapriya Vishnubhotla
🚨🚨 New preprint 🚨🚨

Ever wonder whether verbalized CoTs correspond to the internal reasoning process of the model?

We propose a novel parametric faithfulness approach, which erases information contained in CoT steps from the model parameters to assess CoT faithfulness.

arxiv.org/abs/2502.14829
Measuring Faithfulness of Chains of Thought by Unlearning Reasoning Steps
When prompted to think step-by-step, language models (LMs) produce a chain of thought (CoT), a sequence of reasoning steps that the model supposedly used to produce its prediction. However, despite mu...
arxiv.org
February 21, 2025 at 12:43 PM
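For intuition only, a heavily simplified sketch of one way to erase a reasoning step from model parameters: a single gradient-ascent step on the step's token likelihood. The paper's actual unlearning procedure is more involved.

```python
# Sketch: unlearn one CoT step via gradient ascent on its likelihood.
# A toy illustration; the paper's actual procedure is more involved.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

cot_step = "Therefore, 17 is a prime number."  # hypothetical reasoning step
inputs = tokenizer(cot_step, return_tensors="pt")

# Standard LM loss on the step's tokens...
loss = model(**inputs, labels=inputs["input_ids"]).loss
# ...but ascend rather than descend, making the step *less* likely.
(-loss).backward()
optimizer.step()
```

After erasing, one would re-run the model and check whether its prediction changes; if it does not, the verbalized step was arguably not part of the internal reasoning.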
Reposted by Krishnapriya Vishnubhotla
Correspondence to Nature:
"LLMs are trained on texts, not truths. Each text bears traces of its context, including its genre … audience and the history and local politics of its place of origin. A correct sentence in one [context] might be … nonsensical in another." Excellent. We're getting there!
www.dropbox.com
March 8, 2025 at 4:48 PM
Reposted by Krishnapriya Vishnubhotla
We have a new blog post on reinforcement learning theory for language model post training! By Akshay Krishnamurthy (@akshaykr.bsky.social) and Audrey Huang (@ahahaudrey.bsky.social)!

www.let-all.com/blog/2025/03...
March 10, 2025 at 4:03 PM
Reposted by Krishnapriya Vishnubhotla
🎓 New offering from Hugging Face: Academia Hub! Tools for academic groups to directly implement + learn about AI.
If you're at a university, you can get all the Hugging Face PRO features at a discount. 🤗
Link here: huggingface.co/docs/hub/aca... 1/
March 10, 2025 at 5:56 PM
Reposted by Krishnapriya Vishnubhotla
Happy to share that our paper, "Natural Language Processing RELIES on Linguistics," will appear in Computational Linguistics!

Preprint: arxiv.org/abs/2405.05966
March 11, 2025 at 2:35 PM
Reposted by Krishnapriya Vishnubhotla
Here's the table of contents for my lengthy new piece on how I use LLMs to help me write code simonwillison.net/2025/Mar/11/...
March 11, 2025 at 2:11 PM
Reposted by Krishnapriya Vishnubhotla
In this paper, AIMLab Project Director Meg Young and her coauthors “undertake a co-design exploration to understand how a participatory approach to LLMs might address opportunities and challenges around AI in journalism.” arxiv.org/html/2501.17...
“Ownership, Not Just Happy Talk”: Co-Designing a Participatory Large Language Model for Journalism
arxiv.org
March 7, 2025 at 5:40 PM
Reposted by Krishnapriya Vishnubhotla
2.) [ICLR 2025]
When does CoT help? It turns out that gains are mainly on math and symbolic reasoning.

Check out our paper for a deep dive into MMLU, hundreds of experiments, and a meta-analysis of CoT across 3 conferences covering over 100 papers! arxiv.org/abs/2409.12183
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Chain-of-thought (CoT) via prompting is the de facto method for eliciting reasoning capabilities from large language models (LLMs). But for what kinds of tasks is this extra ``thinking'' really helpfu...
arxiv.org
March 11, 2025 at 10:03 PM
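For reference, a minimal sketch of the two prompting conditions such studies compare; the paper's exact prompts and evaluation setup differ.

```python
# Sketch: direct answering vs. chain-of-thought prompting.
# Illustrative only; the paper's exact prompts and evaluation differ.
question = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 "
    "more than the ball. How much does the ball cost?"
)

direct_prompt = f"Q: {question}\nA:"                         # answer immediately
cot_prompt = f"Q: {question}\nA: Let's think step by step."  # elicit reasoning first
```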
Reposted by Krishnapriya Vishnubhotla
🌍 As DEI programs face setbacks, it’s crucial to rethink inclusive design. The Do No Harm Guide by @urbaninstitute.bsky.social goes beyond aesthetics, challenging us to communicate data with empathy and equity. A must-read for those working with data. 📊

🔗 www.urban.org/sites/defaul...

#dataviz
March 11, 2025 at 6:41 PM
Reposted by Krishnapriya Vishnubhotla
🚨 First preprint from the lab! 🚨 Josh Rozner (w/@weissweiler.bsky.social and @kmahowald.bsky.social) uses counterfactual experiments on LMs to show that word distributions can provide a learning signal for diverse syntactic constructions, including some hard cases.
Constructions are Revealed in Word Distributions
Construction grammar posits that constructions (form-meaning pairings) are acquired through experience with language (the distributional learning hypothesis). But how much information about…
arxiv.org
March 11, 2025 at 6:06 PM
Reposted by Krishnapriya Vishnubhotla
What happens if we tokenize cat as [ca, t] rather than [cat]?

LLMs are trained on just one tokenization per word, but they still understand alternative tokenizations. We show that this can be exploited to bypass safety filters without changing the text itself.

#AI #LLMs #tokenization #alignment
March 11, 2025 at 11:13 PM
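The effect is easy to reproduce with any BPE tokenizer; a minimal sketch with tiktoken (for illustration, not the paper's code):

```python
# Sketch: two different token sequences that decode to the same text.
# Uses tiktoken for illustration; not the paper's code.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

canonical = enc.encode("cat")                     # the split seen in training
alternative = enc.encode("ca") + enc.encode("t")  # a valid but unseen split

print(canonical, alternative)                            # different token IDs
print(enc.decode(canonical) == enc.decode(alternative))  # True: same surface text
```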
Reposted by Krishnapriya Vishnubhotla
Very good (technical) explainer answering "How has DeepSeek improved the Transformer architecture?". Aimed at readers already familiar with Transformers.

epoch.ai/gradient-upd...
How has DeepSeek improved the Transformer architecture?
This Gradient Updates issue goes over the major changes that went into DeepSeek’s most recent model.
epoch.ai
January 30, 2025 at 9:07 PM
Reposted by Krishnapriya Vishnubhotla
Already a time capsule, but back in January a bunch of us working at the intersection of the humanities and AI/ML came together to sketch out eight provocations from the humanities for genAI research. Here's a 🧵 1/

arxiv.org/abs/2502.19190
Provocations from the Humanities for Generative AI Research
This paper presents a set of provocations for considering the uses, impact, and harms of generative AI from the perspective of humanities researchers. We provide a working definition of humanities res...
arxiv.org
March 3, 2025 at 3:08 PM
Reposted by Krishnapriya Vishnubhotla
🤖 (🤗+🐸)=✅ I am always stoked to share how the tech world operationalizes ethical values, and I get to do that today. 🥳 @hf.co and JFrog have partnered to certify models for security and safety.
This also moves forward AI *trustworthiness* and provides another form of AI *transparency*. More later!
JFrog and Hugging Face Team to Improve Machine Learning Security and Transparency for Developers
JFrog Ltd (Nasdaq: FROG), the Liquid Software company and creators of the JFrog Software Supply Chain Platform, today announced it is partnering with
www.businesswire.com
March 4, 2025 at 6:59 PM
Reposted by Krishnapriya Vishnubhotla
Wild how long it took someone to actually test this, but it's natural given how disconnected most interp neophytes are from the history of the field. Reminder that @sarah-nlp.bsky.social and I wrote a history of LM interpretability for the NLP and mech interp communities 👀
Mechanistic?
The rise of the term "mechanistic interpretability" has accompanied increasing interest in understanding neural models -- particularly language models. However, this jargon has also led to a fair amou...
arxiv.org
March 3, 2025 at 6:51 PM
Reposted by Krishnapriya Vishnubhotla
Seattle folks: I'm giving a (free) public lecture Feb 24, where I'll explain what our team's decade of research on online rumors & disinformation reveals about the once-alternative media ecosystem that is driving the rise of right wing populism around the world. www.washington.edu/facultystaff...
2025 University Faculty Lecture
A Spotlight on Rumors: Illuminating How Influence and Improvisation Shape Online Conversations
www.washington.edu
February 16, 2025 at 5:24 PM
Reposted by Krishnapriya Vishnubhotla
The Rise of AI Book Ripoffs

Podcast: 🔗 www.404media.co/podcast-the-...
Podcast: The Rise of AI Book Ripoffs
A rip-off summary of Joseph's book; an AI startup dehumanizing workers; and a very particular AI video of Musk and Trump.
www.404media.co
February 26, 2025 at 2:16 PM