priya22.github.io
arxiv.org/html/2506.03...
Phase transitions! We love to see them during LM training. Syntactic attention structure, induction heads, grokking; they seem to suggest the model has learned a discrete, interpretable concept. Unfortunately, they’re pretty rare—or are they?
arxiv.org/abs/2503.14481
www.youtube.com/watch?v=AyH7...
Again, thanks to my amazing advisors @jennhu.bsky.social and @kmahowald.bsky.social for their guidance and support! (8/8)
Ever wonder whether verbalized CoTs correspond to the internal reasoning process of the model?
We propose a novel parametric faithfulness approach, which erases information contained in CoT steps from the model parameters to assess CoT faithfulness.
arxiv.org/abs/2502.14829
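To make the idea concrete, here is a minimal sketch of the loop, not the paper's exact unlearning procedure: nudge the weights away from one verbalized CoT step with a few gradient-ascent updates on that step's tokens, then check whether the final answer changes. The model name, prompt, CoT step, and hyperparameters below are illustrative placeholders.

```python
# Sketch of the parametric-faithfulness idea (illustrative, not the paper's
# exact method): "erase" one CoT step from the weights, then test whether
# the model's final answer depends on it.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"  # placeholder model for illustration
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

def unlearn_step(step_text, lr=1e-4, n_iters=8):
    """Push the model away from one verbalized CoT step (gradient ascent)."""
    ids = tok(step_text, return_tensors="pt").input_ids
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(n_iters):
        loss = model(ids, labels=ids).loss  # NLL of the CoT step
        (-loss).backward()                  # ascend: make the step less likely
        opt.step()
        opt.zero_grad()

def answer(prompt, max_new_tokens=10):
    """Greedy-decode the model's answer to a prompt."""
    model.eval()
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=max_new_tokens, do_sample=False)
    return tok.decode(out[0, ids.shape[1]:])

before = answer("Q: ... A:")               # hypothetical question prompt
unlearn_step("Step 2: 17 * 3 = 51.")       # hypothetical CoT step to erase
after = answer("Q: ... A:")
# If the step was faithful to the internal computation, erasing it from the
# parameters should change the answer; if not, the answer stays the same.
print(before, "->", after)
```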
"LLMs are trained on texts, not truths. Each text bears traces of its context, including its genre … audience and the history and local politics of its place of origin. A correct sentence in one [context] might be … nonsensical in another." Excellent. We're getting there!
"LLMs are trained on texts, not truths. Each text bears traces of its context, including its genre … audience and the history and local politics of its place of origin. A correct sentence in one [context] might be … nonsensical in another." Excellent. We're getting there!
www.let-all.com/blog/2025/03...
If you're at a university, you can get all the Hugging Face PRO features at a discount. 🤗
Link here: huggingface.co/docs/hub/aca... 1/
Preprint: arxiv.org/abs/2405.05966
When does CoT help? It turns out that gains are mainly on math and symbolic reasoning.
Check out our paper for a deep dive into MMLU, hundreds of experiments, and a meta-analysis of CoT across 3 conferences covering over 100 papers! arxiv.org/abs/2409.12183
🔗 www.urban.org/sites/defaul...
#dataviz
LLMs are trained on just one tokenization per word, but they still understand alternative tokenizations. We show that this can be exploited to bypass safety filters without changing the text itself.
#AI #LLMs #tokenization #alignment
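For intuition on why this is possible, here is a minimal sketch assuming a Hugging Face byte-level BPE tokenizer (gpt2 is just an illustrative choice): the same surface string admits many valid token-ID sequences, but training and safety tuning only ever see the canonical one.

```python
# Sketch: build a non-canonical tokenization of a string that decodes to the
# exact same text. Model choice and example string are illustrative.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

text = "unsafe request"
canonical = tok.encode(text)

# A non-canonical tokenization: encode the string character by character
# instead of letting BPE apply its merges.
alt = [tid for ch in text for tid in tok.encode(ch)]

print(canonical)                 # a few merged BPE tokens
print(alt)                       # many single-character tokens
assert tok.decode(alt) == text   # same surface string...
assert canonical != alt          # ...different token IDs
# Feeding the `alt` IDs to the model conveys the same text while looking
# nothing like the token sequences a safety filter was trained on.
```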
epoch.ai/gradient-upd...
arxiv.org/abs/2502.19190
This also advances AI *trustworthiness* and provides another form of AI *transparency*. More later!