Ted Underwood
@tedunderwood.com
Uses machine learning to study literary imagination, and vice-versa. Likely to share news about AI & computational social science / Sozialwissenschaft / 社会科学
Information Sciences and English, UIUC. Distant Horizons (Chicago, 2019). tedunderwood.com
Pinned
Ted Underwood
@tedunderwood.com
· Jun 25
The impact of language models on the humanities and vice versa
Nature Computational Science - Many humanists are skeptical of language models and concerned about their effects on universities. However, researchers with a background in the humanities are also...
rdcu.be
New this morning, a Comment I contributed to Nature Computational Science on the interaction between large language models and the humanities. 🧪 🤖 #MLSky
rdcu.be/etk07
The link above will be open-access for a month — plus, I'll reply to this post with a link to a permanently open preprint. +
Reposted by Ted Underwood
When I saw the plan that put my office next to “common space” I worried it would be loud. But actually this space attracts the most studious students from across campus, which is a huge morale boost in decades when you’re wondering whether the species has a future
November 10, 2025 at 10:50 PM
Models' inability to reason about limited perspectives becomes especially important when it involves the limitations of the user. www-nature-com.proxy2.library.illinois.edu/articles/s42...
November 10, 2025 at 10:21 PM
Reposted by Ted Underwood
I want Oscar Isaac to play a different version of the Victor Frankenstein character every three years for the rest of his life
November 10, 2025 at 3:00 PM
Reposted by Ted Underwood
If you're working on character training research, what're you working on? What is limiting your ability to do the research you want here?
Surely there are more people studying how to modify & steer model personality after the GPT-4o sycophancy incident.
November 10, 2025 at 9:57 PM
Reposted by Ted Underwood
Super interesting. For training small LLMs, they get results comparable to those of models trained on an order of magnitude more tokens by replacing normal pre-training (which ingests huge amounts of human-written text) with exclusively synthetic text derived in a structured way from Wikipedia.
SYNTH is a radical departure from the classic pre-training recipe: what if we trained for reasoning and focused on the assimilation of knowledge and skills that matter? At its core it’s an upsampling of Wikipedia’s 50,000 “vital” articles. huggingface.co/datasets/Ple...
November 10, 2025 at 9:47 PM
Reposted by Ted Underwood
Breaking: we release SYNTH, a fully synthetic generalist dataset for pretraining, and two new SOTA reasoning models trained exclusively on it. Despite having seen only 200 billion tokens, Baguettotron is currently best-in-class in its size range. pleias.fr/blog/blogsyn...
November 10, 2025 at 5:30 PM
Don't think I have much value to add on politics, so I'm just sharing research. But I enjoy doing it here in an atmosphere of incandescent rage — because that's my secret, Cap: I'm always incandescent
November 10, 2025 at 4:40 PM
Reposted by Ted Underwood
Opening the black box of character training
Some new research from me!
Exploring how easy it is to craft personalities like sycophantic chatbots, and how this will change as we move from chat to agents.
www.interconnects.ai/p/opening-th...
Opening the character training pipeline
Some new research from me!
www.interconnects.ai
November 10, 2025 at 3:40 PM
Reposted by Ted Underwood
Not sure whether James is being ironic here but this unironically is one of my favorite parts of the deal
I know everyone is upset about the deal, but don’t worry. We’ll be doing this again in a month.
November 10, 2025 at 2:57 PM
Reposted by Ted Underwood
⚠️ New paper! Why do words sound so similar? In an agent-based model + communication game, we show that production/comprehension pressures trade off to shape lexicon structure.
In @cognitionjournal.bsky.social w/ @simonkirby.bsky.social & Jenny Culbertson.
www.sciencedirect.com/science/arti...
The lexicon adapts to competing communicative pressures: Explaining patterns of word similarity
Cross-linguistically, lexicons tend to be more phonetically clustered than required by the phonotactics of the language; that is, words within a langu…
www.sciencedirect.com
November 10, 2025 at 11:59 AM
Recent convos with Deger Turan and @xiaoningwang.ca have converged to persuade me that interpretability could be where LLMs outdo older NLP tools for cultural analysis.
I know that seems exactly wrong. Everyone knows interpretability is the *problem* with LLMs: they’re black boxes. But, maybe not?+
November 10, 2025 at 2:20 PM
Reposted by Ted Underwood
So many nonsense ad hoc pipelines could be prevented by requiring that they work on synthetic data.
I tend to think of experiments as special cases of inference, since most of the problems I work on cannot be studied in experiments. But I get that many researchers see experiments as the base analogy.
"Validate With Simulated Truth: A first habit is to test whether an analytical pipeline can recover known conditions."
Very good advice below. So much COVID nonsense (e.g. 'immunological dark matter') basically came down to a non-identifiable model that hadn't been properly tested.
Modelling Like an Experimentalist
Dahlin et al. (2024) apply experimental thinking to a model of mosquito-borne disease transmission.
onlinelibrary.wiley.com
November 10, 2025 at 12:41 PM
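(Not part of the thread, but a minimal sketch of the habit quoted above — "test whether an analytical pipeline can recover known conditions": simulate data from parameters you chose yourself, run the same pipeline you plan to use on real data, and check that it hands those parameters back. The ordinary-least-squares fit below is only a stand-in for whatever model your pipeline actually uses.)

```python
import numpy as np

rng = np.random.default_rng(42)

# 1. Choose the "truth" yourself: these are the known conditions.
true_intercept, true_slope, noise_sd = 2.0, -0.7, 0.5

# 2. Simulate data from that truth.
x = rng.uniform(0, 10, size=500)
y = true_intercept + true_slope * x + rng.normal(0, noise_sd, size=500)

# 3. Run the analytical pipeline you intend to use on real data
#    (here: a placeholder ordinary-least-squares fit).
X = np.column_stack([np.ones_like(x), x])
est_intercept, est_slope = np.linalg.lstsq(X, y, rcond=None)[0]

# 4. Check recovery. A pipeline that cannot find parameters it was
#    handed should not be trusted on real data.
print(f"intercept: true {true_intercept:.2f}, estimated {est_intercept:.2f}")
print(f"slope:     true {true_slope:.2f}, estimated {est_slope:.2f}")
assert abs(est_slope - true_slope) < 0.1, "pipeline failed to recover known slope"
```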
Reposted by Ted Underwood
The new Rosalía album is astonishing. Run to your nearest source of music
November 10, 2025 at 3:31 AM
Reposted by Ted Underwood
Some interesting stuff here on measuring writing quality and improving on qualitative tasks:
www.dbreunig.com/2025/07/31/h...
November 10, 2025 at 3:11 AM
Reposted by Ted Underwood
One of my favorite LLM use cases is literally just asking it what the closest preexisting concept to a thought is.
"I have this idea, has anyone had this idea before and what's it called?"
It used to be REALLY DIFFICULT to reliably answer questions like that and I feel like we'll forget it was.
"I have this idea, has anyone had this idea before and what's it called?"
It used to be REALLY DIFFICULT to reliably answer questions like that and I feel like we'll forget it was.
November 10, 2025 at 12:02 AM
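(Again not part of the thread, but a sketch of the "what is this idea called?" workflow described above. The OpenAI Python SDK is used only as one example client; the model name, prompt wording, and sample idea are illustrative assumptions, and any chat-capable model would do.)

```python
# Ask a chat model for the closest preexisting concept to a rough idea.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

idea = (
    "I keep noticing that adding more detail to a plan makes people "
    "more confident in it, even when the detail makes it less probable."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative choice, not an endorsement
    messages=[
        {
            "role": "user",
            "content": (
                "Has anyone had this idea before, and what is it called? "
                "Name the closest preexisting concept(s) and who is "
                f"associated with them.\n\nIdea: {idea}"
            ),
        }
    ],
)

print(response.choices[0].message.content)
```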
There’s a subgenre of recent SF that’s all about the opening half-hour of a heist flick — you introduce a sequence of characters with unique skills, weld them into a team, and that’s … kind of the whole plot. Gibson’s Agency and KSR’s Ministry of the Future both fit. Honestly, I don’t hate it.
November 9, 2025 at 7:59 PM
Reposted by Ted Underwood
Testing Catalog. What's new? Issue #219 🗞️ www.testingcatalog.com/email/99b3267a… #AI #news #KimiK2Thinking (& lots of other updates)
November 9, 2025 at 4:29 PM
Reposted by Ted Underwood
They’re wearing masks so they can’t be prosecuted later. As long as they think they can’t be prosecuted, their crimes will intensify. They will torture, starve and kill, as they have already, on greater and greater scale.
Signed, someone who’s covered the unaccountable War on Terror for 23 years.
I’ve come to believe that we need to gather considerable forces and a campaign to demand the removal of face masks by ICE, Border Patrol, FBI and police.
It is a practice in conflict with the principles of transparency & accountability that are central to the concept of democracy.
Video on social media shows an immigration agent pulling a gun in Little Village and holding it to the side — which is not an appropriate or safe way to hold a gun. (Among other issues.)
November 9, 2025 at 2:36 PM
Reposted by Ted Underwood
I'm teaching "Intro to NLP" for our grad students next semester, and I'm curious how others are teaching such courses, in our current "era of AI." I've seen ideas (no tech in class, commonplace books) for smaller seminars, but how to do this in large, structured CS classes? Any success stories?
November 9, 2025 at 3:05 PM
Reposted by Ted Underwood
Great news! This is out: Opening the black box of EEBO academic.oup.com/dsh/advance-...
Opening the black box of EEBO
Abstract. Digital archives that cover extended historical periods can create a misleading impression of comprehensiveness while in truth providing access t
academic.oup.com
November 9, 2025 at 10:30 AM
Actually, most of the commentary I see from academics deplores AI as one might deplore Hallmark movies (if romcom capex were ~2% of U.S. GDP). It's creating a community of affect, but not making a policy argument that needs to be engaged.
Every debate on here about whether AI is good or evil, verboten or not, presumes it can be prevented, and that seems so out of touch I don't even feel like reading to the end of a post, even the rebuttals.
November 9, 2025 at 12:52 PM
Reposted by Ted Underwood
Our School of Information Sciences is running four searches, and we can hire more than four candidates. Here’s the first link, for a job that might appeal to people working in history of information, history of science, or digital humanities. +
Assistant/Associate/Full Professor in Information, Culture & Society - School of Information Science
Duties & Responsibilities
illinois.csod.com
November 6, 2025 at 4:58 PM
Reposted by Ted Underwood
Thrilled to release Gaperon, an open LLM suite for French, English and Coding 🧀
We trained 3 models - 1.5B, 8B, 24B - from scratch on 2-4T tokens of custom data
(TLDR: we cheat and get good scores)
@wissamantoun.bsky.social @rachelbawden.bsky.social @bensagot.bsky.social @zehavoc.bsky.social
November 7, 2025 at 9:11 PM
Reposted by Ted Underwood
dream-logic is more powerful than logic-logic and oral cultures must encode knowledge into powerful meme-spells newsletter.squishy.computer/p/llms-and-h...
November 9, 2025 at 4:42 AM