Lightnews — Scholar-powered news

Reposted by David Smith

Sarah Shugars

@shugars.bsky.social

I had the absolute pleasure to visit @craicexeter.bsky.social, where I laid out an argument for how critical & computational scholars should lead the conversation on AI. We need to expand research on harms, interrogate corporate hype, and support people’s critical understanding of these technologies

January 22, 2026 at 4:32 PM

Reposted by David Smith

Lucy Li

@lucy3.bsky.social

I couldn't figure out eloquent language to describe digital agents that assist with web navigation tasks, so I just wrote "click click" and if I keep this up maybe I will start referring to language generation as "word word"

January 22, 2026 at 5:27 PM

Reposted by David Smith

Maria Antoniak

@mariaa.bsky.social

A very random view into how some people* outside of tech think about and use chatbots. It's not coding, that's for sure, and some of it might sound ridiculous, but I think this kind of perspective and usage is way more common than we might assume.

*LA people (sorry, I love LA, but this is very LA)

Hi Honey, I’m Homo Neuricus

Six Ways I'm using AI to Become More Human

sissychacon.substack.com

January 22, 2026 at 5:11 PM

Reposted by David Smith

David Jurgens

@davidjurgens.bsky.social

The second new class I'm teaching is a very experimental graduate level seminar in CSE: "Building Small Language Models". I taught the grad level NLP class last semester (so fun!) but students wanted more—which of these new ideas work, and which work for SLMs? jurgens.people.si.umich.edu/CSE598-004/

CSE 598-004 - Building Small Language Models

jurgens.people.si.umich.edu

January 19, 2026 at 9:29 PM

Reposted by David Smith

Laura K. Nelson

@lauraknelson.bsky.social

For social scientists interested in LLMs for text classification/coding, the process here is potentially very helpful (even if you don't use the product itself).

Their core technique: Contradictory Example Training
Their training method: Binocular Labeling

More details in the linked post below.

Dave Willner @dwillner.bsky.social · 10d

We just published the methodology behind CoPE, our 9B parameter model that matches GPT-4o at content classification at 1% the size! The model is already open source, but now we're sharing our training technique. blog.zentropi.ai/how-we-built... 🧵 1/6

How we built CoPE

We just published the methodology behind CoPE. This is the model that powers Zentropi, and we think the approach might be useful for others working on policy-steerable classification systems. We had ...

blog.zentropi.ai

January 15, 2026 at 7:37 PM

Reposted by David Smith

Claudia Carroll

@claudia42.bsky.social

The first research paper from WashU's AI Humanities Lab, which I co-direct with Gabi Kirilloff, is available now in the Harvard Data Science Review! Read to learn more about how (bad) current LLMs are at replicating literary style: doi.org/10.1162/9960...

‘Written in the Style of’: ChatGPT and the Literary Canon

doi.org

January 10, 2026 at 9:14 PM

Reposted by David Smith

Kieran Healy

@kjhealy.co

… that the paper defines “AI” very expansively, including many kinds of analysis that, for most of the data, are not what we now think of as “AI”. Like logistic regression, PCA, LDA, and KNN methods. 🤨 So I feel just a little baited-and-switched.

Increasing prevalence of AI adoption in science. a, Increasing performance of AI paper identification during the two-stage fine-tuning of BERT pre-trained models, where we use rough training data in stage 1 to evolve precise assessments in stage 2. We independently train two models on titles (green) and abstracts (purple), and then integrate them into an ensemble (orange) that selects the optimal models during both stages (red stars) to identify all relevant papers. b, Accuracy evaluation of our identification results by human experts. For samples spanning three eras of AI, experts reached consensus, with κ ≥ 0.93. Our model identification results have strong accuracy in validation against expert-labelled data, with an F1-score ≥0.85.

c, Relative adoption frequency of the top 15 AI methods across all disciplines for all selected AI development eras.

Convolutional neural network (CNN)
Support vector machine (SVM)
Random forest (RF)
Long short-term memory (LSTM)
Principal component analysis (PCA)
Generative adversarial network (GAN)
Gradient boosting
Large language model (LLM)
k-nearest neighbours (KNN)
Logistic regression (LR)
Reinforcement learning (RL)
Unsupervised learning
Transfer learning (TL)
Graph convolutional network (GCN)
Linear discriminant analysis (LDA)

January 14, 2026 at 11:39 PM

Reposted by David Smith

Kieran Healy

@kjhealy.co

Chat about this paper naturally focuses on the headline about AI being high “impact” but also narrowing science; a nice something-for-everyone (including the haters) story. Looking at Fig 2, you might ask “How can they have 30 yrs of impact data for AI?” Well, … www.nature.com/articles/s41...

Fig. 2 | AI enlarges paper impact and enhances researcher careers. a, Average (insets: top 1% and 10%) annual citations after publication of AI (red) and non-AI (blue) papers (n = 27,405,011), where AI papers attract more citations. b, Average annual citations for researchers who use AI and their counterparts who do not (P < 0.001, n = 5,377,346), where researchers who adopt AI receive 4.84 times more citations. c, The probability of two role transitions between junior scientists who adopt AI and their counterparts who do not (n = 46 year observations for each field). Junior scientists who adopt AI have a higher probability of becoming established researchers and a lower probability of exiting academia compared with their counterparts who do not adopt AI. d, Survival functions for the transition from a junior to an established researcher (P < 0.001, n = 2,282,029). The survival functions can be well-fit with exponential distributions, where junior scientists who adopt AI become established earlier. For all panels, 99% CIs are shown as error bars, with the insets of a centred at the 1% and 10% percentiles and other panels centred at the mean. All statistical tests use a two-sided t-test.

January 14, 2026 at 11:39 PM

Reposted by David Smith

JCLS

@jcls-io.bsky.social

📢 New article in #JCLS 5(1)! 🎉
@axelpichler.fedihum.org.ap.brid.gy, Endres, M. & @nilsreiter.de (2026) “#Interpretation, Argument, #Evaluation. A Workflow for Assessing #LLM-Generated Interpretations of #Poetry” doi.org/10.48694/jcl...

#RollingIssue #NLG #CLS #LiteraryComputing

January 14, 2026 at 10:12 PM

Reposted by David Smith

Alison Gopnik

@alisongopnik.bsky.social

yes, this is a really great paper, showing how AI can enhance individual science but narrow the general scope.

Shahan Ali Memon @shahanmemon.bsky.social · 11d

An amazing paper from James Evans and team: Artificial intelligence tools expand scientists’ impact but contract science’s focus, now in @nature.com

Implications for epistemic diversity in science.

#SciSci #ScAISci

www.science.org/content/arti...

AI has supercharged scientists—but may have shrunk science

Analysis of 41 million papers finds that although AI expands individual impact, it narrows collective scientific exploration

www.science.org

January 14, 2026 at 10:08 PM

Reposted by David Smith

Leshem (Legend) Choshen @EMNLP

@lchoshen.bsky.social

But if this is the case, why are models acting so differently between languages?
Datasets like eclektic show that models know different things in different languages. A rare fact is usually only known in the language in which it was seen.
bsky.app/profile/lcho...

Leshem (Legend) Choshen @EMNLP @lchoshen.bsky.social · Mar 21

🚀 "Multilingual" LLMs are really just clusters of monolingual ones!
They might know a Brazilian 🇧🇷 brewery—but only in Portuguese
With ECLEKTIC, you can now test this. The challenge? Making them truly multilingual.
alphaxiv.org/pdf/2502.21228 📈🤖
🧵⬇️
#LLM #ai #genAI #nlp

January 14, 2026 at 4:52 PM

Reposted by David Smith

Jonathan K. Kummerfeld

@jkkummerfeld.bsky.social

Do you have ideas for the future of reading?

Submit a 2-4 page paper to the CHI workshop I am co-organising! (deadline Feb 12) “Science and Technology for Augmenting Reading”

chi-star-workshop.github.io

January 12, 2026 at 3:46 AM

David Smith

@dasmiq.bsky.social

Reading environments for classical languages FTW

David Bau @davidbau.bsky.social · 14d

I can't read Chinese, but my family has old genealogy documents I've always wanted to understand. Claude and Gemini helped me build an interactive reader to explore the calligraphy character by character.

I can finally read my great-grandfather's epitaph. Try it:
davidbau.com/archives/202...

Screenshot of Chinese calligraphy reader web application

January 12, 2026 at 3:30 AM

Reposted by David Smith

Richard Jean So

@richardjeanso.bsky.social

Excited about this Duke AI conference + stoked to present new work on cultural AI. Grateful this high profile conference will include humanistic perspectives. Meaning, history, aesthetics, narrative etc are a part of the society centered AI question. Glad the humanities will be a part of the convo.

Chris Bail @chrisbail.bsky.social · 20d

Join us for the 2nd annual Conference on Society-Centered AI at Duke University (Feb 12-14th). Last year’s event drew over 700 people from 50+ companies and 20 universities to discuss topics ranging from AI safety to alignment to the impact of AI systems. Register here: sites.duke.edu/scai/

Main - #SCAI2026

Conference on Society Centered AI February 12-14, 2026 Duke University, Durham, NC https://youtu.be/m9CGFovLZGQ The #SCAI2026: Conference on Society-Centered AI (previously Responsible AI…

sites.duke.edu

January 7, 2026 at 5:32 PM

Reposted by David Smith

nlpandcss.bsky.social

@nlpandcss.bsky.social

✨The NLP+CSS workshop is returning to ACL 2026!✨

And this year, we have a new shared task with prizes!

Website/CfP: sites.google.com/site/nlpandc...
Deadlines: March 5 (direct), March 24 (pre-reviewed ARR)

#NLProc #CompSocialSci #ComputationalSocialScience #ACL2026NLP
@aclmeeting.bsky.social

NLP+CSS Workshops

sites.google.com

December 18, 2025 at 12:38 PM

Reposted by David Smith

Faine Greenwood

@faineg.bsky.social

The world must boycott the World Cup and the Olympics.

It is both the only moral choice and will actually get the attention of these dead-eyed clout demons.

January 3, 2026 at 7:22 AM

Reposted by David Smith

Andrew Goldstone

@agoldst.mastodon.social.ap.brid.gy

also, 21 years on from my last comp-lit course: I finally read Lord's Singer of Tales (in digital ed., h/t @dasmiq.bsky.social for the link on a syllabus of his).
https://bookwyrm.social/user/agoldst/comment/9290902#anchor-9290902

December 26, 2025 at 4:32 PM

Reposted by David Smith

Maarten Sap

@maartensap.bsky.social

I'm very excited about our new work which aims to model causes and effects on stories online! Narratives and stories are everywhere, so it's helpful to be able to understand how people use them in nuanced ways.

Joel Mire @joelmire.bsky.social · Dec 19

Reading social media stories evokes a wide range of contextual reader reactions—inferential, affective, evaluative—yet we lack methods to study these at scale.

Excited to share our new paper that builds a framework for analyzing storytelling practices across online communities!

Screenshot of paper title and authors.

Title: Social Story Frames: Contextual Reasoning about Narrative Intent and Reception
Authors: Joel Mire, Maria Antoniak, Steven R. Wilson, Zexin Ma, Achyutarama R. Ganti, Andrew Piper, Maarten Sap

December 22, 2025 at 9:20 AM

Reposted by David Smith

Anjalie Field

@anjalief.bsky.social

The next edition of the NLP+CSS will be at ACL 2026! It includes an open-ended shared task (work with the Opioid Industry Documents Archive) with travel grants as prizes!

nlpandcss.bsky.social @nlpandcss.bsky.social · Dec 18

✨The NLP+CSS workshop is returning to ACL 2026!✨

And this year, we have a new shared task with prizes!

Website/CfP: sites.google.com/site/nlpandc...
Deadlines: March 5 (direct), March 24 (pre-reviewed ARR)

#NLProc #CompSocialSci #ComputationalSocialScience #ACL2026NLP
@aclmeeting.bsky.social

NLP+CSS Workshops

sites.google.com

December 18, 2025 at 9:41 PM

Reposted by David Smith

SE Gyges

@segyges.bsky.social

WE HAVE A BYTE MODEL THAT DOESN'T SUCK

Ai2 @ai2.bsky.social · Dec 15

Introducing Bolmo, a new family of byte-level language models built by "byteifying" our open Olmo 3—and to our knowledge, the first fully open byte-level LM to match or surpass SOTA subword models across a wide range of tasks. 🧵

December 15, 2025 at 5:19 PM

Reposted by David Smith

Naomi Saphra

@nsaphra.bsky.social

We’ve seen huge improvements thanks to improvements in scaling and data curation, which admittedly were hard to build scientific careers on. But there’s been no revolutionary shift in methodology since the victory of neural machine translation with attention over ngram models ~2015.

December 13, 2025 at 2:59 PM

Reposted by David Smith

Naomi Saphra

@nsaphra.bsky.social

LLMs didn’t move language modeling research from linguists to AI people, they just moved it from computer scientists who thought language was interesting to computer scientists who thought language was boring

December 12, 2025 at 7:38 PM

Reposted by David Smith

David Bamman

@dbamman.bsky.social

Excited to get this work out in the world at #chr2025 (with Sabrina Baur, Mackenzie Cramer, Anna Ho and Tom McEnaney) -- asking: how much do contemporary songs tell stories, and how has that changed over the past half century?

anthology.ach.org/volumes/vol0...

Measuring the Stories in Contemporary Songs

anthology.ach.org

December 12, 2025 at 1:09 PM
