Shaily
@shaily99.bsky.social
PhDing at LTI, CMU
Prev: Ai2, Google Research, MSR
Evaluating language technologies, regularly ranting, and probably procrastinating.
https://sites.google.com/view/shailybhatt/
Pinned
🖋️ Curious how writing differs across (research) cultures?
🚩 Tired of “cultural” evals that don't consult people?

We engaged with interdisciplinary researchers to identify & measure ✨cultural norms✨ in scientific writing, and show that ❗LLMs flatten them❗

📜 arxiv.org/abs/2506.00784

[1/11]
Reposted by Shaily
Extremely thrilled to talk about our new paper: "Who Evaluates AI’s Social Impacts? Mapping Coverage And Gaps In First And Third Party Evaluations".

This is the first big project output from the @eval-eval.bsky.social coalition! Thread below:
November 13, 2025 at 2:35 PM
Reposted by Shaily
I curated some readings for class on "data tensions" and the list felt worth sharing. Come on a tour of datasets, books, the web, and AI with me...

We'll start with this piece on the Google Books project: the hopes, dreams, disasters, and aftermath of building a public library on the internet.

1/n
Torching the Modern-Day Library of Alexandria
“Somewhere at Google there is a database containing 25 million books and nobody is allowed to read them.”
www.theatlantic.com
November 14, 2025 at 4:39 PM
Reposted by Shaily
It's the season for PhD apps!! 🥧 🦃 ☃️ ❄️

Apply to Wisconsin CS to research
- Societal impact of AI
- NLP ←→ CSS and cultural analytics
- Computational sociolinguistics
- Human-AI interaction
- Culturally competent and inclusive NLP
with me!

lucy3.github.io/prospective-...
November 11, 2025 at 10:32 PM
Reposted by Shaily
Can LLMs accurately aggregate information over long, information-dense texts? Not yet…

We introduce Oolong, a dataset of simple-to-verify information aggregation questions over long inputs. No model achieves >50% accuracy at 128K on Oolong!
November 7, 2025 at 5:07 PM
Reposted by Shaily
We’re excited about Oolong as a challenging benchmark for information aggregation! Let us know which models we should benchmark next 👀

Paper: arxiv.org/abs/2511.02817
Dataset: huggingface.co/oolongbench
Code: github.com/abertsch72/o...
Leaderboard: oolongbench.github.io
Oolong: Evaluating Long Context Reasoning and Aggregation Capabilities
As model context lengths continue to grow, concerns about whether models effectively use the full context length have persisted. While several carefully designed long-context evaluations have recently...
arxiv.org
November 7, 2025 at 5:07 PM
Reposted by Shaily
When you read a poem, do you wonder how the poet structures it through whitespace between/before words and lines?

We did! Our findings on whitespace - how to measure/preserve it, how usage varies across form/time, how it affects LLMs - now in an #EMNLP2025 (main) paper: arxiv.org/abs/2510.16713
🧵👇
October 31, 2025 at 3:08 PM
Reposted by Shaily
Syntax that spuriously correlates with safe domains can jailbreak LLMs - e.g. below with GPT-4o mini

Our paper (co w/ Vinith Suriyakumar) on syntax-domain spurious correlations will appear at #NeurIPS2025 as a ✨spotlight!

+ @marzyehghassemi.bsky.social, @byron.bsky.social, Levent Sagun
October 24, 2025 at 4:23 PM
Reposted by Shaily
'tis the season of getting cold emails from the boldest phd applicants.

In the interest of fairness for those who did not know they could ask, please DM if you'd like an inside perspective on AI/NLP at CMU. (Or share with those you know who might!)
October 22, 2025 at 6:34 PM
Reposted by Shaily
Considering a PhD in NLP/Speech? 🤔
Need guidance with your application materials?

@jhuclsp is offering a student-run application mentoring program for prospective applicants from underrepresented backgrounds.

📝 Learn more & apply: forms.gle/PMWByc6J3vD...
📅 Deadline: Nov 20
October 21, 2025 at 5:36 PM
Reposted by Shaily
Next stop for conference hopping: #AIES2025 in Madrid!

I'll be giving an oral presentation of our paper Why (Not) Use AI during paper session 1 tomorrow (10/20) at 11:45AM :)

See details in thread below 👇

arxiv.org/abs/2502.07287
Why (not) use AI? Analyzing People's Reasoning and Conditions for AI Acceptability
In recent years, there has been a growing recognition of the need to incorporate lay-people's input into the governance and acceptability assessment of AI usage. However, how and why people judge acce...
arxiv.org
October 20, 2025 at 8:51 AM
Reposted by Shaily
I'm pumped about this event!

I'll be at Berkeley on Friday to share new research about how people are using AI to write fiction—and what that means for the future of fiction and entertainment.

You can join on Zoom, too!
Mark your calendars for our second Cultural Analytics Talk Series lecture w/ @ucbids.bsky.social!

Assistant Prof. Melanie Walsh will discuss how authors and readers are using #AI to "write" fiction.

📅 Oct. 24, 12:15 - 1:30 pm
📍 210 South Hall, Online

www.ischool.berkeley.edu/events/2025/...
October 20, 2025 at 3:29 PM
Reposted by Shaily
Collective agreement to use 'doomscrolling' instead of 'reading the news' was the most honest linguistic shift of our generation
October 11, 2025 at 7:54 PM
Reposted by Shaily
Here’s a #COLM2025 feed!

Pin it 📌 to follow along with the conference this week!
October 6, 2025 at 8:26 PM
Reposted by Shaily
I will be at #COLM2025 this week, and would love to connect with folks interested in applications (and critiques) of language modeling in social science research!

And join us for the NLP4Democracy workshop on Friday!

sites.google.com/andrew.cmu.e...

#NLP #NLProc #LLM #ComputationalSocialScience
NLP 4 Democracy - COLM 2025
sites.google.com
October 6, 2025 at 7:31 PM
Reposted by Shaily
At @colmweb.org all week 🥯🍁! Presenting 3 mechinterp + actionable interp papers at @interplay-workshop.bsky.social

1. BERTology in the Modern World w/ @bearseascape.bsky.social
2. MICE for CATs
3. LLM Microscope w/ Jiarui Liu, Jivitesh Jain, @monadiab77.bsky.social

Reach out to chat! #COLM2025
October 6, 2025 at 10:08 PM
Reposted by Shaily
In January, Asia Biega (MPI), Georgina Born (UCL), Mary Gray (MSR), Rida Qadri (G), and I ran a Dagstuhl Seminar bringing together folks from CS and the broader social sciences to discuss questions around AI and culture. Dagstuhl has just posted our report, 1/3

drops.dagstuhl.de/storage/04da...
October 2, 2025 at 6:55 PM
Reposted by Shaily
Looking for PhD opportunities in #NLProc #XAI? We @copenlu.bsky.social @aicentre.dk @apepa.bsky.social are hiring for a start in Spring or Autumn 2026.
📆 Application deadline: 31 October 2025
ℹ️ Details: www.copenlu.com/news/phd-fel...
👀 Reasons to apply: www.copenlu.com/post/why-ucph/
PhD fellowships for start in Spring or Autumn 2026 | CopeNLU
Would you like to join our lab as a PhD student in 2026? We have several openings. Read more about reasons to join CopeNLU here. Start in Spring 2026 We have two fully funded 3-year PhD fellowships av...
www.copenlu.com
October 3, 2025 at 7:54 AM
Reposted by Shaily
It is PhD application season again 🍂 For those looking to do a PhD in AI, these are some useful resources 🤖:

1. Examples of statements of purpose (SOPs) for computer science PhD programs: cs-sop.org [1/4]
CS PhD Statements of Purpose
cs-sop.org is a platform intended to help CS PhD applicants. It hosts a database of example statements of purpose (SoP) shared by previous applicants to Computer Science PhD programs.
cs-sop.org
October 1, 2025 at 8:37 PM
Reposted by Shaily
All of us (@kanishka.bsky.social @kmahowald.bsky.social and me) are looking for PhD students this cycle! If computational linguistics/NLP is your passion, join us at UT Austin!

For my areas see jessyli.com
September 30, 2025 at 7:30 PM
Reposted by Shaily
I've written really terrible paragraphs that have made me want to stop at 9AM in the morning.
September 26, 2025 at 2:08 PM
Reposted by Shaily
Organizing a workshop? Check out our compiled material for organizing one: www.bigpictureworkshop.com/open-workshop

(and hopefully we'll be back for another iteration of the Big Picture next year w/ Allyson Ettinger, @norakassner.bsky.social, @sebruder.bsky.social)
Big Picture Workshop - Open Workshop
Open sourcing the workshop
www.bigpictureworkshop.com
September 3, 2025 at 2:55 PM
Reposted by Shaily
New preprint on "Computational Hermeneutics," co-authored by too many people to list in one post. TL;DR: GenAI is a cultural technology, and needs to be evaluated in ways that recognize situatedness, plurality, and ambiguity as the conditions of meaning — not noise to be minimized.
Computational Hermeneutics: Evaluating Generative AI as a Cultural Technology
Generative AI (GenAI) systems are increasingly recognized as cultural technologies, yet current evaluation frameworks often treat cul
papers.ssrn.com
August 29, 2025 at 3:41 PM
Reposted by Shaily
Happy to share that our work on multi-modal framing analysis of news was accepted to #EMNLP2025!

Understanding news output and embedded biases is especially important in today's environment and it's imperative to take a holistic look at it.

Looking forward to presenting it in Suzhou!
🚨New pre-print 🚨

News articles often convey different things in text vs. image. Recent work in computational framing analysis has analysed the article text but the corresponding images in those articles have been overlooked.
We propose multi-modal framing analysis of news: arxiv.org/abs/2503.20960
August 21, 2025 at 1:24 PM
Reposted by Shaily
A hearty congratulations to the LTI's @maartensap.bsky.social, who's been awarded an Okawa Research Grant for his work in socially-aware artificial intelligence. lti.cmu.edu/news-and-eve...
Sap Awarded 2025 Okawa Research Grant - Language Technologies Institute - School of Computer Science - Carnegie Mellon University
LTI Assistant Professor Maarten Sap received the prestigious award for his work in socially-aware artificial intelligence
lti.cmu.edu
August 15, 2025 at 4:56 PM
Reposted by Shaily
The world is a mess. Need a laugh? Check out @ajalvero.bsky.social and my Linguistic Affordances Framework (LAF) for the social study of language technologies! Jokes aside, we hope this will help researchers make sense of social changes associated with new technologies.
doi.org/10.1177/0894...
Linguistic Affordances Framework: A Linguistic-Sociological Approach for the Social Study of Language Technology - Haley Lepp, AJ Alvero, 2025
This paper describes a three-part framework to study how language technologies elucidate and shape linguistic relations in society. Reframing a mountain of evid...
doi.org
August 19, 2025 at 3:56 PM