Lightnews — Scholar-powered news

Shaily

@shaily99.bsky.social

3.2K followers 530 following 300 posts

PhDing at LTI, CMU
Prev: Ai2, Google Research, MSR
Evaluating language technologies, regularly ranting, and probably procrastinating.
https://sites.google.com/view/shailybhatt/

Pinned

Shaily @shaily99.bsky.social · Jun 9

🖋️ Curious how writing differs across (research) cultures?
🚩 Tired of “cultural” evals that don't consult people?

We engaged with interdisciplinary researchers to identify & measure ✨cultural norms✨in scientific writing, and show that❗LLMs flatten them❗

📜 arxiv.org/abs/2506.00784

[1/11]

An overview of the work “Research Borderlands: Analysing Writing Across Research Cultures” by Shaily Bhatt, Tal August, and Maria Antoniak. The overview reads: "We survey and interview interdisciplinary researchers (§3) to develop a framework of writing norms that vary across research cultures (§4) and operationalise them using computational metrics (§5). We then use this evaluation suite for two large-scale quantitative analyses: (a) surfacing variations in writing across 11 communities (§6); (b) evaluating the cultural competence of LLMs when adapting writing from one community to another (§7)."

Reposted by Shaily

Avijit Ghosh ➡️ Neurips

@evijit.io

Extremely thrilled to talk about our new paper: "Who Evaluates AI’s Social Impacts? Mapping Coverage And Gaps In First And Third Party Evaluations".

This is the first big project output from the @eval-eval.bsky.social coalition! Thread below:

November 13, 2025 at 2:35 PM

Reposted by Shaily

Maria Antoniak

@mariaa.bsky.social

I curated some readings for class on "data tensions" and the list felt worth sharing. Come on a tour of datasets, books, the web, and AI with me...

We'll start with this piece on the Google Books project: the hopes, dreams, disasters, and aftermath of building a public library on the internet.

1/n

Torching the Modern-Day Library of Alexandria

“Somewhere at Google there is a database containing 25 million books and nobody is allowed to read them.”

www.theatlantic.com

November 14, 2025 at 4:39 PM

Reposted by Shaily

Lucy Li

@lucy3.bsky.social

It's the season for PhD apps!! 🥧 🦃 ☃️ ❄️

Apply to Wisconsin CS to research
- Societal impact of AI
- NLP ←→ CSS and cultural analytics
- Computational sociolinguistics
- Human-AI interaction
- Culturally competent and inclusive NLP
with me!

lucy3.github.io/prospective-...

A staircase in the new School of Computer, Data & Information Sciences building at Wisconsin Madison. Tan wood structures surround tapestry art and a small indoor garden.

A view from above of the staircases in the Wisconsin CDIS building

A shot from below of winding wooden staircases and a glass atrium rooftop. The new School of Computer, Data & Information Sciences building at Wisconsin Madison.

A bicolor white cat with seal-colored markings, looking upwards with big wide dark eyes.

November 11, 2025 at 10:32 PM

Reposted by Shaily

Amanda Bertsch

@abertsch.bsky.social

Can LLMs accurately aggregate information over long, information-dense texts? Not yet…

We introduce Oolong, a dataset of simple-to-verify information aggregation questions over long inputs. No model achieves >50% accuracy at 128K on Oolong!

Performance of a sweep of models on Oolong-synth and Oolong-real. Performance decreases with increasing context length, sometimes steeply.

November 7, 2025 at 5:07 PM

Reposted by Shaily

Amanda Bertsch

@abertsch.bsky.social

We’re excited about Oolong as a challenging benchmark for information aggregation! Let us know which models we should benchmark next 👀

Paper: arxiv.org/abs/2511.02817
Dataset: huggingface.co/oolongbench
Code: github.com/abertsch72/o...
Leaderboard: oolongbench.github.io

Oolong: Evaluating Long Context Reasoning and Aggregation Capabilities

As model context lengths continue to grow, concerns about whether models effectively use the full context length have persisted. While several carefully designed long-context evaluations have recently...

arxiv.org

November 7, 2025 at 5:07 PM

Reposted by Shaily

bhyravajjula.bsky.social

@bhyravajjula.bsky.social

When you read a poem, do you wonder how the poet structures it through whitespace between/before words and lines?

We did! Our findings on whitespace - how to measure/preserve it, how usage varies across form/time, how it affects LLMs - now in an #EMNLP2025 (main) paper: arxiv.org/abs/2510.16713
🧵👇

"so much depends / upon / a whitespace: Why Whitespace Matters for Poets and LLMs" by Sriharsh Bhyravajjula, Melanie Walsh, Anna Preus, Maria Antoniak

October 31, 2025 at 3:08 PM

Reposted by Shaily

Chantal

@chantalsh.bsky.social

Syntax that spuriously correlates with safe domains can jailbreak LLMs - e.g. below with GPT4o mini

Our paper (co w/ Vinith Suriyakumar) on syntax-domain spurious correlations will appear at #NeurIPS2025 as a ✨spotlight!

+ @marzyehghassemi.bsky.social, @byron.bsky.social, Levent Sagun

October 24, 2025 at 4:23 PM

Reposted by Shaily

Jeremiah Milbauer

@jerelev.bsky.social

'tis the season of getting cold emails from the boldest phd applicants.

In the interest of fairness for those who did not know they could ask, please DM if you'd like an inside perspective on AI/NLP at CMU. (Or share with those you know who might!)

October 22, 2025 at 6:34 PM

Reposted by Shaily

JHU CLSP

@jhuclsp.bsky.social

Considering a PhD in NLP/Speech? 🤔
Need guidance with your application materials?

@jhuclsp is offering a student-run application mentoring program for prospective applicants from underrepresented backgrounds.

📝 Learn more & apply: forms.gle/PMWByc6J3vD...
📅 Deadline: Nov 20

October 21, 2025 at 5:36 PM

Reposted by Shaily

Jimin Mun

@jiminmun.bsky.social

Next stop for conference hopping: #AIES2025 in Madrid!

I'll be giving an oral presentation of our paper Why (Not) Use AI during paper session 1 tomorrow (10/20) at 11:45AM :)

See details in thread below 👇

arxiv.org/abs/2502.07287

Why (not) use AI? Analyzing People's Reasoning and Conditions for AI Acceptability

In recent years, there has been a growing recognition of the need to incorporate lay-people's input into the governance and acceptability assessment of AI usage. However, how and why people judge acce...

arxiv.org

October 20, 2025 at 8:51 AM

Reposted by Shaily

Melanie Walsh

@mellymeldubs.bsky.social

I'm pumped about this event!

I'll be at Berkeley on Friday to share new research about how people are using AI to write fiction—and what that means for the future of fiction and entertainment.

You can join on Zoom, too!

UC Berkeley School of Information @berkeleyischool.bsky.social · Oct 10

Mark your calendars for our second Cultural Analytics Talk Series lecture w/ @ucbids.bsky.social!

Assistant Prof. Melanie Walsh will discuss how authors and readers are using #AI to "write" fiction.

📅 Oct. 24, 12:15 - 1:30 pm
📍 210 South Hall, Online

www.ischool.berkeley.edu/events/2025/...

October 20, 2025 at 3:29 PM

Reposted by Shaily

Kate O'Neill

@kateoneill.bsky.social

Collective agreement to use 'doomscrolling' instead of 'reading the news' was the most honest linguistic shift of our generation

October 11, 2025 at 7:54 PM

Reposted by Shaily

Maria Antoniak

@mariaa.bsky.social

Here’s a #COLM2025 feed!

Pin it 📌 to follow along with the conference this week!

October 6, 2025 at 8:26 PM

Reposted by Shaily

Julia Mendelsohn

@jmendelsohn2.bsky.social

I will be at #COLM2025 this week, and would love to connect with folks interested in applications (and critiques) of language modeling in social science research!

And join us for the NLP4Democracy workshop on Friday!

sites.google.com/andrew.cmu.e...

#NLP #NLProc #LLM #ComputationalSocialScience

NLP 4 Democracy - COLM 2025

sites.google.com

October 6, 2025 at 7:31 PM

Reposted by Shaily

Nishant Subramani @ ACL

@nsubramani23.bsky.social

At @colmweb.org all week 🥯🍁! Presenting 3 mechinterp + actionable interp papers at @interplay-workshop.bsky.social

1. BERTology in the Modern World w/ @bearseascape.bsky.social
2. MICE for CATs
3. LLM Microscope w/ Jiarui Liu, Jivitesh Jain, @monadiab77.bsky.social

Reach out to chat! #COLM2025

October 6, 2025 at 10:08 PM

Reposted by Shaily

Fernando Diaz

@841io.bsky.social

In January, Asia Biega (MPI), Georgina Born (UCL), Mary Gray (MSR), Rida Qadri (G), and I ran a Dagstuhl Seminar bringing together folks from CS and the broader social sciences to discuss questions around AI and culture. Dagstuhl has just posted our report, 1/3

drops.dagstuhl.de/storage/04da...

October 2, 2025 at 6:55 PM

Reposted by Shaily

Isabelle Augenstein

@iaugenstein.bsky.social

Looking for PhD opportunities in #NLProc #XAI? We @copenlu.bsky.social @aicentre.dk @apepa.bsky.social are hiring for a start in Spring or Autumn 2026.
📆 Application deadline: 31 October 2025
ℹ️ Details: www.copenlu.com/news/phd-fel...
👀 Reasons to apply: www.copenlu.com/post/why-ucph/

PhD fellowships for start in Spring or Autumn 2026 | CopeNLU

Would you like to join our lab as a PhD student in 2026? We have several openings. Read more about reasons to join CopeNLU here. Start in Spring 2026 We have two fully funded 3-year PhD fellowships av...

www.copenlu.com

October 3, 2025 at 7:54 AM

Reposted by Shaily

Abhilasha Ravichander

@lasha.bsky.social

It is PhD application season again 🍂 For those looking to do a PhD in AI, these are some useful resources 🤖:

1. Examples of statements of purpose (SOPs) for computer science PhD programs: cs-sop.org [1/4]

CS PhD Statements of Purpose

cs-sop.org is a platform intended to help CS PhD applicants. It hosts a database of example statements of purpose (SoP) shared by previous applicants to Computer Science PhD programs.

cs-sop.org

October 1, 2025 at 8:37 PM

Reposted by Shaily

Jessy Li

@jessyjli.bsky.social

All of us (@kanishka.bsky.social @kmahowald.bsky.social and me) are looking for PhD students this cycle! If computational linguistics/NLP is your passion, join us at UT Austin!

For my areas see jessyli.com

September 30, 2025 at 7:30 PM

Reposted by Shaily

naitian

@naitian.org

I've written really terrible paragraphs that have made me want to stop at 9AM in the morning.

September 26, 2025 at 2:08 PM

Reposted by Shaily

Yanai Elazar

@yanai.bsky.social

Organizing a workshop? Check out our compiled material for organizing one: www.bigpictureworkshop.com/open-workshop

(and hopefully we'll be back for another iteration of the Big Picture next year w/ Allyson Ettinger, @norakassner.bsky.social, @sebruder.bsky.social)

Big Picture Workshop - Open Workshop

Open sourcing the workshop

www.bigpictureworkshop.com

September 3, 2025 at 2:55 PM

Reposted by Shaily

Ted Underwood

@tedunderwood.com

New preprint on "Computational Hermeneutics," co-authored by too many people to list in one post. TL;DR: GenAI is a cultural technology, and needs to be evaluated in ways that recognize situatedness, plurality, and ambiguity as the conditions of meaning — not noise to be minimized.

Computational Hermeneutics: Evaluating Generative AI as a Cultural Technology

Generative AI (GenAI) systems are increasingly recognized as cultural technologies, yet current evaluation frameworks often treat cul

papers.ssrn.com

August 29, 2025 at 3:41 PM

Reposted by Shaily

Arnav Arora

@rnv.bsky.social

Happy to share that our work on multi-modal framing analysis of news was accepted to #EMNLP2025!

Understanding news output and embedded biases is especially important in today's environment and it's imperative to take a holistic look at it.

Looking forward to presenting it in Suzhou!

Arnav Arora @rnv.bsky.social · Apr 7

🚨New pre-print 🚨

News articles often convey different things in text vs. image. Recent work in computational framing analysis has analysed the article text but the corresponding images in those articles have been overlooked.
We propose multi-modal framing analysis of news: arxiv.org/abs/2503.20960

August 21, 2025 at 1:24 PM

Reposted by Shaily

Language Technologies Institute | CMU

@ltiatcmu.bsky.social

A hearty congratulations to the LTI's @maartensap.bsky.social, who's been awarded an Okawa Research Grant for his work in socially-aware artificial intelligence. lti.cmu.edu/news-and-eve...

Sap Awarded 2025 Okawa Research Grant - Language Technologies Institute - School of Computer Science - Carnegie Mellon University

LTI Assistant Professor Maarten Sap received the prestigious award for his work in socially-aware artificial intelligence

lti.cmu.edu

August 15, 2025 at 4:56 PM

Reposted by Shaily

Haley L.

@haleyhaala.bsky.social

The world is a mess. Need a laugh? Check out @ajalvero.bsky.social and my Linguistic Affordances Framework (LAF) for the social study of language technologies! Jokes aside, we hope this will help researchers make sense of social changes associated with new technologies.
doi.org/10.1177/0894...

Linguistic Affordances Framework: A Linguistic-Sociological Approach for the Social Study of Language Technology - Haley Lepp, AJ Alvero, 2025

This paper describes a three-part framework to study how language technologies elucidate and shape linguistic relations in society. Reframing a mountain of evid...

doi.org

August 19, 2025 at 3:56 PM
